Final Report Summary - NOVOSOFT (Software management of non-volatile memory hierarchies)
Because of the non-optimal properties of NVM, a hybrid memory design consisting of both a conventional (fast and power-hungry) DRAM component and a novel (slower but less power-hungry) NVM component is considered a viable approach. The intuition is that the DRAM component is small and should be used to service frequent memory accesses, while the NVM component is much larger and is used to store the majority of data, which is typically accessed infrequently. To obtain this behavior, it is necessary to place data in the appropriate type of memory and sometimes also to migrate it between memory types.
The state of the art in hybrid DRAM/NVM memories assumes that data placement and migration are orchestrated by the hardware. Typically this is done at the granularity of virtual memory pages (e.g. 4 or 8 KB) in order to reduce the overhead of the state information that tracks the access pattern of each page.
We take a radically different approach in the NovoSoft project: we propose to manage data placement and migration in software rather than hardware. Any layer of software may define data placement and migration: the operating system, language runtime systems and/or the application software. A good place to implement this support is the language runtime system, as the operating system is limited to collecting the same kind of per-page statistics as the hardware, while the application level is perhaps too generic to allow efficient reuse of the technique across applications. We have investigated and developed this approach within the NovoSoft project. In particular, we have studied and analyzed the energy and performance parameters of potential future NVM technologies. We have incorporated these in an analytic model that predicts whether a piece of data is best stored in DRAM or NVM, given how it is accessed by a specific program. Using this model, we have been able to evaluate the performance potential and energy efficiency of hybrid memory systems.
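The kind of decision such an analytic model makes can be illustrated with a minimal sketch. This is not the NovoSoft model itself: the `MemoryTech` fields, the energy formula (dynamic access energy plus static energy over the lifetime of the data) and all parameter values are illustrative assumptions chosen only to show the trade-off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryTech:
    """Hypothetical per-technology energy parameters (assumed, not measured)."""
    read_nj: float                # energy per read access, in nJ
    write_nj: float               # energy per write access, in nJ
    static_nj_per_byte_s: float   # static/refresh energy per byte per second, in nJ


# Placeholder values: DRAM has cheap accesses but pays static energy,
# NVM has costlier accesses (especially writes) but no static energy.
DRAM = MemoryTech(read_nj=1.0, write_nj=1.0, static_nj_per_byte_s=0.01)
NVM = MemoryTech(read_nj=2.0, write_nj=10.0, static_nj_per_byte_s=0.0)


def energy_nj(tech, reads, writes, size_bytes, lifetime_s):
    """Estimated energy of keeping one data object in the given memory."""
    dynamic = reads * tech.read_nj + writes * tech.write_nj
    static = size_bytes * lifetime_s * tech.static_nj_per_byte_s
    return dynamic + static


def best_placement(reads, writes, size_bytes, lifetime_s):
    """Return "DRAM" or "NVM", whichever has the lower estimated energy."""
    dram = energy_nj(DRAM, reads, writes, size_bytes, lifetime_s)
    nvm = energy_nj(NVM, reads, writes, size_bytes, lifetime_s)
    return "DRAM" if dram < nvm else "NVM"
```

Under these assumed parameters, a small, heavily written object is steered to DRAM, while a large, rarely touched object is steered to NVM because its static energy in DRAM dominates.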
We furthermore developed a programming interface (API) to direct the placement of data in a hybrid memory system. The API is attached to the standard memory allocation APIs, namely the malloc() and mmap() families of functions. In particular, we allow these functions to distinguish between "hot" data, i.e. data that will be accessed frequently in main memory and should be stored in DRAM, and the (default) "cold" data. Further facilities are provided to deal with data held in stack-allocated variables and global variables.
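The shape of such an interface can be sketched as follows. The names `Placement`, `HybridBuffer` and `hybrid_alloc` are hypothetical and do not reproduce the NovoSoft API; the sketch only records the hot/cold hint alongside an ordinary anonymous mapping and performs no actual DRAM/NVM placement.

```python
import mmap
from enum import Enum


class Placement(Enum):
    COLD = "cold"  # default: data expected to live in NVM
    HOT = "hot"    # frequently accessed data that should live in DRAM


class HybridBuffer:
    """An allocation tagged with a placement hint (hypothetical type)."""

    def __init__(self, size, placement):
        self.placement = placement
        # Anonymous memory mapping; a real system would choose the
        # backing memory type here based on the hint.
        self.data = mmap.mmap(-1, size)


def hybrid_alloc(size, placement=Placement.COLD):
    """Allocate `size` bytes; data is treated as cold unless marked hot."""
    return HybridBuffer(size, placement)
```

A caller would leave most allocations untouched, e.g. `hybrid_alloc(1 << 20)`, and mark only the frequently accessed ones, e.g. `hybrid_alloc(4096, Placement.HOT)`, which mirrors the cold-by-default behavior described above.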
NovoSoft further proposes to hide the hybrid memory allocation API from the programmer by utilizing annotations of variables already present in a programming language, in this case the Swan task dataflow parallel programming language. Swan facilitates parallel programming by annotating every task with its memory footprint, i.e. the memory locations or variables that are read from or written to. Swan executes tasks in parallel whenever their memory footprints are non-overlapping. NovoSoft has extended Swan and investigated the option of leveraging these annotations also to determine the placement and migration of data in a hybrid DRAM/NVM memory system.
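The scheduling rule described above can be sketched in a few lines. This is not Swan code (Swan is a separate task dataflow language); the `Task` type and `can_run_in_parallel` function are hypothetical names, and the rule implemented is the one stated in the text: two tasks may run in parallel when their footprints do not overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A task annotated with the variables it reads and writes."""
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()

    @property
    def footprint(self):
        # The memory footprint is everything the task reads or writes.
        return self.reads | self.writes


def can_run_in_parallel(a, b):
    """True when the two tasks' memory footprints are disjoint."""
    return a.footprint.isdisjoint(b.footprint)
```

The same annotations tell a runtime which variables each task is about to touch, which is the information NovoSoft investigates reusing for placement and migration decisions.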
We have performed extensive experimental evaluation on a variety of embedded, compute-intensive and data-intensive workloads. We have demonstrated energy savings in the range of 50% to 90%, depending on the application, memory organization and assumptions made on the NVM technology parameters. We have further demonstrated that software-driven placement of data in a hybrid memory hierarchy outperforms RaPP, a state-of-the-art hardware technique for page migration, on both slowdown and energy savings. In our experiments, RaPP introduced slowdowns in the range of 5.4% to 21.2%, while software placement was capped by design at a maximum estimated slowdown of 5%, which in practice never exceeded 3.9% due to latency hiding in the processor and memory system. The energy savings of RaPP and software placement are comparable in most cases, but for some workloads RaPP performs much worse than software-driven placement.
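One way to cap the estimated slowdown of software placement is sketched below. The greedy strategy, the function name `place_under_slowdown_cap`, and the cost estimate (a fixed extra latency per NVM access, with no latency hiding) are assumptions made for illustration; the report does not specify how the cap is enforced.

```python
def place_under_slowdown_cap(objects, baseline_ns, extra_ns_per_access,
                             max_slowdown=0.05):
    """Greedily move objects to NVM while the estimated slowdown stays capped.

    objects: dict mapping name -> (access_count, size_bytes), with size_bytes > 0.
    baseline_ns: estimated run time with everything in DRAM.
    extra_ns_per_access: assumed extra latency of an NVM access over DRAM.
    Returns (set of names placed in NVM, estimated slowdown as a fraction).
    """
    budget_ns = max_slowdown * baseline_ns
    in_nvm = set()
    spent_ns = 0.0
    # Visit objects with the fewest accesses per byte first: they free the
    # most DRAM capacity for the least added latency.
    by_density = sorted(objects.items(), key=lambda kv: kv[1][0] / kv[1][1])
    for name, (accesses, _size) in by_density:
        cost_ns = accesses * extra_ns_per_access
        if spent_ns + cost_ns <= budget_ns:
            in_nvm.add(name)
            spent_ns += cost_ns
    return in_nvm, spent_ns / baseline_ns
```

Because this estimate charges the full extra latency for every NVM access, it is conservative: it ignores latency hiding in the processor and memory system, so the measured slowdown can fall below the estimate.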
Energy efficiency of computing systems is an increasing concern, both from a technological viewpoint and a societal one. From a technological viewpoint, energy efficiency determines the capabilities of battery-powered devices such as cell phones and tablets. It also limits the feasibility of building exascale supercomputers, the next generation of high-performance computing infrastructures that are the driving force behind modern manufacturing and design. Data centers, the workhorses behind cloud computing, online presence and digital retail, are likewise constrained by their power demands.
From a societal point of view, computing is taking up an increasing portion of the world's ability to deliver power. It is estimated that all data centers around the world currently consume more power than the whole of Italy. With the amount of data going around the world predicted to increase year on year, it is clear that we do not have the power supply to sustain the current growth of the digital society.
While the solutions to these energy and power problems are multi-faceted, the efficient and effective application of energy-efficient memory technologies will definitely play its role. This way NovoSoft can make