Workshop #2
Architecture and Technology of Memory-centric Computing
"TBC"
Prof. Chi Ying TSUI (HKUST)
Abstract
Details will be provided soon. Stay tuned!
"Architectural Exploration and Memory Management for Emerging CPU-FPGA Systems"
Prof. Wei ZHANG (HKUST)
Abstract
Heterogeneous computing is a promising direction for addressing the performance and power walls in today’s high-performance computing. CPU-FPGA systems are particularly attractive because the high flexibility of the FPGA enables customization for various computing tasks to boost performance and energy efficiency. However, how best to interface the CPU with the FPGA accelerator is not straightforward. Traditionally, the CPU and FPGA communicate through direct memory access (DMA). Recently, tightly coupled CPU-FPGA systems with a shared cache hierarchy (such as Intel HARP and IBM POWER with CAPI) have also been proposed to enhance the communication efficiency between the CPU and FPGA and to simplify the programming model. Such emerging architectures bring new challenges to the design of collaborative CPU-FPGA systems. In this talk, we will introduce some of our work on memory architecture exploration for CPU-FPGA systems and on cache management approaches that enhance FPGA cache utilization in emerging shared-cache systems.
"Towards Scalable Near-Memory Computing with Reconfigurable Computing Data"
Prof. Hayden SO (HKU)
Abstract
Recent advances in memory technologies are promising a new paradigm of highly efficient near-data computing. Instead of repeatedly bringing data through the memory hierarchy into the central computational pipeline for processing, these novel systems allow computations to be migrated to locations near where the data reside and to operate directly on the memory-storage system in parallel with the main CPU. The reduced data movement improves system performance and energy efficiency, making these systems attractive for applications that process large volumes of data, especially sparse and random structures such as graphs, which suffer from low spatial and temporal locality in a traditional cache-memory hierarchy. In this talk, a novel computational model that couples computation and data as atomic units will be presented. The proposed system operates with smart memories that are enhanced by attached FPGA-based reconfigurable accelerators. The main CPU shares access to the smart memory following a non-uniform memory access (NUMA) paradigm, while novel hardware-software codesign methodologies are being developed to facilitate migration of computational tasks between the CPU and the accelerators. Early results have demonstrated the feasibility, performance benefits, and flexibility of the proposed approach when deployed on different heterogeneous systems. Ongoing development efforts on the system will also be discussed.