
Enabling the Adoption of Data-Centric Systems: Hardware/Software Support for Processing-Using-Memory Architectures

Computer Architecture Seminar

Location: EER 0.806/808
Speaker: Geraldo F. Oliveira, ETH Zürich

The increasing prevalence and growing size of data in modern applications have led to high performance and energy costs for computation in traditional processor-centric computing systems. To mitigate these costs, the processing-in-memory (PIM) paradigm moves computation closer to where the data resides, reducing (and sometimes eliminating) the need to move data between memory and the processor. There are two main approaches to PIM: (1) processing-near-memory (PNM), where PIM logic is added to the same die as memory or to the logic layer of 3D-stacked memory, and (2) processing-using-memory (PUM), which uses the operational principles of memory cells to perform computation. Due to a push from the application domain and recent developments in memory manufacturing and packaging, memory manufacturers (and startups) have finally introduced the first real-world PNM architectures into the market. However, fully adopting PUM in today’s systems is still very challenging because tools and system support for such architectures are lacking across the computer architecture stack. The missing pieces include (i) frameworks that can facilitate the implementation of complex operations and algorithms using the underlying PUM primitives; (ii) execution models that can take advantage of the available application parallelism to maximize hardware utilization and throughput; (iii) compiler support and compiler optimizations targeting PUM architectures; and (iv) operating system support for PUM-aware virtual memory and memory management.

In this talk, we will discuss our major recent research results on tools and system support for PUM architectures (with a focus on DRAM-based solutions), which aim to ease the adoption of such architectures in current and future systems. Our work builds on prior work [1, 2] showing that current DRAM chips can be modified slightly to execute simple data movement and Boolean operations, unleashing the PUM capabilities of current memory technologies. Building on these results, we will first describe our efforts to extend the capabilities of PUM solutions so that they apply to a wider variety of workloads. To do so, we implement complex PUM operations using (i) SIMDRAM [3], an end-to-end framework that composes PUM primitives to implement complex arithmetic operations entirely within DRAM in a single-instruction multiple-data (SIMD) manner; and (ii) pLUTo [4], a PUM architecture that leverages the high storage density of DRAM to enable the massively parallel storing and querying of lookup tables (LUTs) instead of relying on complex extra in-DRAM logic. Second, we propose system solutions that expose the newly added PUM capabilities to the application stack, focusing on programmer-friendly approaches. Concretely, we will discuss MIMDRAM [5], a hardware/software co-designed PUM system that introduces the ability to allocate and control only the required amount of computing resources inside the DRAM subarray for PUM computation. MIMDRAM implements compiler passes and system support to guarantee high utilization of the PUM substrate. Third, we extensively analyze current commodity off-the-shelf (COTS) DRAM chips to characterize their capability to perform PUM operations with modifications only to the DRAM controller and not to the DRAM chip or interface [6].
We demonstrate that (i) PUM architectures are a promising solution, leading to significant (e.g., more than an order of magnitude) performance and energy gains compared to processor-centric systems for various real-world applications, and (ii) COTS DRAM chips are capable of performing a range of PUM operations with high success rates.
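
To make the idea of composing simple Boolean PUM primitives into SIMD arithmetic more concrete, here is a minimal software sketch. It is not the SIMDRAM implementation. It assumes, for illustration only, that the available in-memory primitives are a three-input bitwise majority and a bitwise NOT, and that operands are stored "vertically" so that one row holds the same bit position of every lane. All function names are hypothetical, and each Python integer stands in for one DRAM row.

```python
def maj(a, b, c):
    # Bitwise 3-input majority across all lanes (assumed PUM primitive).
    return (a & b) | (a & c) | (b & c)

def inv(a, lanes):
    # Bitwise NOT restricted to the active lanes (assumed PUM primitive).
    return ~a & ((1 << lanes) - 1)

def to_bit_planes(values, width):
    # Vertical layout: plane i packs bit i of every lane's operand into one row.
    planes = []
    for i in range(width):
        row = 0
        for lane, v in enumerate(values):
            row |= ((v >> i) & 1) << lane
        planes.append(row)
    return planes

def from_bit_planes(planes, lanes):
    # Inverse of to_bit_planes: rebuild one integer per lane.
    return [
        sum(((row >> lane) & 1) << i for i, row in enumerate(planes))
        for lane in range(lanes)
    ]

def bit_serial_add(a_planes, b_planes, lanes):
    # Ripple-carry addition, one bit position per step, all lanes in parallel.
    # Each full adder uses only MAJ and NOT:
    #   carry_out = MAJ(a, b, carry_in)
    #   sum       = MAJ(NOT(carry_out), MAJ(a, b, NOT(carry_in)), carry_in)
    carry = 0
    out = []
    for a, b in zip(a_planes, b_planes):
        new_carry = maj(a, b, carry)
        s = maj(inv(new_carry, lanes), maj(a, b, inv(carry, lanes)), carry)
        out.append(s)
        carry = new_carry
    out.append(carry)  # keep the final carry as an extra result bit
    return out

def simd_add(xs, ys, width):
    # Element-wise addition of two equal-length lists of width-bit integers.
    lanes = len(xs)
    result = bit_serial_add(
        to_bit_planes(xs, width), to_bit_planes(ys, width), lanes
    )
    return from_bit_planes(result, lanes)
```

For example, `simd_add([1, 2, 3, 255], [4, 5, 6, 1], 8)` adds four 8-bit lanes at once. The number of steps depends on the bit width and not on the number of lanes, which is why wide memory rows are attractive for this style of computation.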

Geraldo F. Oliveira (https://geraldofojunior.github.io/) is a Ph.D. candidate in the SAFARI Research Group at ETH Zürich, working with Prof. Onur Mutlu. His research interests are broadly in computer architecture and systems, with a focus on memory-centric architectures for high-performance and energy-efficient systems. In particular, his Ph.D. research focuses on taking advantage of new memory technologies to accelerate distinct classes of applications and on providing system support for novel memory-centric systems. Geraldo has published several works on this topic in major conferences and journals such as HPCA, ASPLOS, ISCA, MICRO, and IEEE Micro.

