On Databases and Non-Volatile Memory technologies. Interview with Joy Arulraj and Andrew Pavlo
“When we started this project in 2013, it was a moonshot. We were not sure if NVM technologies would ever see the light of day, but Intel has finally started shipping NVM devices in 2019. We are excited about the impact of NVM on next-generation database systems.” — Joy Arulraj and Andrew Pavlo.
I have interviewed Joy Arulraj, Assistant Professor of Computer Science at Georgia Institute of Technology, and Andrew Pavlo, Assistant Professor of Computer Science at Carnegie Mellon University. They just published a new book, “Non-Volatile Memory Database Management Systems”. We talked about non-volatile memory (NVM) technologies and how NVM is going to impact next-generation database systems.
Q1. What are emerging non-volatile memory technologies?
Arulraj, Pavlo: Non-volatile memory (NVM) is a broad class of technologies, including phase-change memory and memristors, that provide low-latency reads and writes on the same order of magnitude as DRAM, but with persistent writes and large storage capacity like an SSD. For instance, Intel recently started shipping its Optane DC NVM modules based on 3D XPoint technology.
Q2. How do they potentially change the dichotomy between volatile memory and durable storage in database management systems?
Arulraj, Pavlo: Existing database management systems (DBMSs) can be classified into two types based on the primary storage location of the database: (1) disk-oriented and (2) memory-oriented DBMSs. Disk-oriented DBMSs are based on the same hardware assumptions that were made in the first relational DBMSs from the 1970s, such as IBM’s System R. The design of these systems targets a two-level storage hierarchy comprising a fast but volatile byte-addressable memory for caching (i.e., DRAM) and a slow, non-volatile block-addressable device for permanent storage (i.e., SSD). These systems make the pessimistic assumption that a transaction could access data that is not in memory, and thus will incur a long delay to retrieve the needed data from disk. They employ legacy techniques, such as heavyweight concurrency-control schemes, to overcome these limitations.
Recent advances in manufacturing technologies have greatly increased the capacity of DRAM available on a single computer.
But disk-oriented systems were not designed for the case where most, if not all, of the data resides entirely in memory.
The result is that many of their legacy components have been shown to impede their scalability for transaction processing workloads. In contrast, the architecture of memory-oriented DBMSs assumes that all data fits in main memory, and it therefore removes the slower, disk-oriented components from the system. As such, these memory-oriented DBMSs have been shown to outperform disk-oriented DBMSs. But they still have to employ heavyweight components that can recover the database after a system crash because DRAM is volatile. The design assumptions underlying both disk-oriented and memory-oriented DBMSs are poised to be upended by the advent of NVM technologies.
Q3. Why are existing DBMSs unable to take full advantage of NVM technology?
Arulraj, Pavlo: NVM differs from other storage technologies in the following ways:
- Byte-Addressability: NVM supports byte-addressable loads and stores, unlike other non-volatile devices that only support slow, bulk data transfers as blocks.
- High Write Throughput: NVM delivers more than an order of magnitude higher write throughput compared to SSD. More importantly, the gap between sequential and random write throughput of NVM is much smaller than that of other durable storage technologies.
- Read-Write Asymmetry: In certain NVM technologies, writes take longer to complete compared to reads. Further, excessive writes to a single memory cell can destroy it.
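The practical effect of byte-addressability can be illustrated with a little arithmetic. The sketch below is not from the book; the helper name `bytes_transferred` is hypothetical, and the 4096-byte block size and 64-byte transfer unit are illustrative assumptions rather than properties of any specific device.

```python
def bytes_transferred(offset, length, unit):
    """Bytes moved when a device can only transfer whole `unit`-sized chunks.

    `offset` and `length` describe the logical update; `unit` is the smallest
    transfer size (assumed here: a storage block or a CPU cache line).
    """
    first_chunk = offset // unit
    last_chunk = (offset + length - 1) // unit
    return (last_chunk - first_chunk + 1) * unit


# Updating one 8-byte field at byte offset 100:
block_device = bytes_transferred(100, 8, unit=4096)  # assumed 4 KB block
byte_addressable = bytes_transferred(100, 8, unit=64)  # assumed 64-byte line
```

Under these assumptions, the block device rewrites a whole block to change eight bytes, while the byte-addressable device touches a single small unit. This gap is part of why techniques designed to batch and amortize block writes matter less on NVM.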
Although the advantages of NVM are obvious, making full use of them in a DBMS is non-trivial. Our evaluation of state-of-the-art disk-oriented and memory-oriented DBMSs on NVM shows that the two architectures achieve almost the same performance when using NVM. This is because current DBMSs assume that memory is volatile, and thus their architectures are predicated on making redundant copies of changes on durable storage. This illustrates the need for a complete rewrite of the database system to leverage the unique properties of NVM.
Q4. With NVM, which components of legacy DBMSs are unnecessary?
Arulraj, Pavlo: NVM requires us to revisit the design of several key components of the DBMS, including that of the (1) logging and recovery protocol, (2) storage and buffer management, and (3) indexing data structures.
We will illustrate this using the logging and recovery protocol. A DBMS must guarantee the integrity of a database against application, operating system, and device failures. It ensures the durability of updates made by a transaction by writing them out to durable storage, such as SSD, before returning an acknowledgment to the application. Such storage devices, however, are much slower than DRAM, especially for random writes, and only support bulk data transfers as blocks.
During transaction processing, if the DBMS were to overwrite the contents of the database before committing the transaction, then it must perform random writes to the database at multiple locations on disk. DBMSs try to minimize random writes to disk by flushing the transaction’s changes to a separate log on disk with only sequential writes on the critical path of the transaction. This method is referred to as write-ahead logging (WAL).
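To make the WAL idea concrete, here is a minimal sketch in Python. It is an illustration only, not the authors' code: Python lists and dicts stand in for the on-disk log and the in-memory database, and the names `WalEngine`, `commit`, `crash`, and `recover` are hypothetical.

```python
class WalEngine:
    """Toy write-ahead logging: the log is durable, the table is volatile."""

    def __init__(self):
        self.log = []    # stands in for the sequential, durable log on disk
        self.table = {}  # stands in for the in-memory copy of the database

    def commit(self, txn_id, updates):
        # Append every change sequentially to the log first...
        for key, value in updates.items():
            self.log.append(("update", txn_id, key, value))
        self.log.append(("commit", txn_id))
        # ...and only then apply the changes to the database copy.
        self.table.update(updates)

    def crash(self):
        # Volatile state is lost; the log survives.
        self.table = {}

    def recover(self):
        # Redo pass: replay the changes of committed transactions in log order.
        committed = {rec[1] for rec in self.log if rec[0] == "commit"}
        for rec in self.log:
            if rec[0] == "update" and rec[1] in committed:
                _, _, key, value = rec
                self.table[key] = value
```

Note that every changed value is written twice in this scheme, once to the log and once to the database, and that recovery time grows with the length of the log that must be replayed.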
NVM upends the key design assumption underlying the WAL protocol since it supports fast random writes. Thus, we need to tailor the protocol for NVM. We designed such a protocol, which we call write-behind logging (WBL). WBL not only improves the runtime performance of the DBMS, but also enables it to recover nearly instantaneously from failures. WBL achieves this by tracking what parts of the database have changed rather than how they were changed. Using this logging method, the DBMS can directly flush the changes made by transactions to the database instead of recording them in the log. By ordering writes to NVM correctly, the DBMS can guarantee that all transactions are durable and atomic. This allows the DBMS to write less data per transaction, thereby improving an NVM device’s lifetime.
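The following is a heavily simplified sketch of the write-behind idea, not the authors' implementation. It assumes a multi-versioned table in which each version carries its transaction's timestamp, and a log that stores only commit timestamps. The class `WblEngine`, its methods, and the persistent timestamp counter are all hypothetical simplifications made for illustration.

```python
class WblEngine:
    """Toy write-behind logging over a multi-versioned table kept on 'NVM'."""

    def __init__(self):
        self.versions = {}      # persistent: key -> [(commit_ts, value), ...]
        self.log = []           # persistent: commit timestamps only, no tuple data
        self.next_ts = 1        # assumed persistent so timestamps are never reused
        self.committed = set()  # volatile: rebuilt from the log after a restart

    def write(self, updates):
        # Changes go straight to the database on NVM, before anything is logged.
        ts = self.next_ts
        self.next_ts += 1
        for key, value in updates.items():
            self.versions.setdefault(key, []).append((ts, value))
        return ts

    def log_commit(self, ts):
        # The log record is written behind the data and stays tiny.
        self.log.append(ts)
        self.committed.add(ts)

    def crash(self):
        # Only volatile state is lost.
        self.committed = set()

    def recover(self):
        # No redo pass and no table scan: reload which timestamps committed.
        self.committed = set(self.log)

    def read(self, key):
        # Return the newest committed version; uncommitted versions are skipped.
        for ts, value in reversed(self.versions.get(key, [])):
            if ts in self.committed:
                return value
        return None
```

In this sketch the log never contains tuple contents, so each change is written once, and recovery only re-reads a short list of timestamps. Versions left behind by transactions that never committed are simply invisible to readers.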
Q5. You have designed and implemented DBMS storage engine architectures that are explicitly tailored for NVM. What are the key elements?
Arulraj, Pavlo: The design of all of the storage engines in existing DBMSs is predicated on a two-tier storage hierarchy composed of volatile DRAM and a non-volatile SSD. These devices have distinct hardware constraints and performance properties. The traditional engines were designed to account for and reduce the impact of these differences.
For example, they maintain two layouts of tuples depending on the storage device. Tuples stored in memory can contain non-inlined fields because DRAM is byte-addressable and handles random accesses efficiently. In contrast, fields in tuples stored on durable storage are inlined to avoid random accesses because they are more expensive. To amortize the overhead for accessing durable storage, these engines batch writes and flush them in a deferred manner. Many of these techniques, however, are unnecessary in a system with an NVM-only storage hierarchy. We adapted the storage and recovery mechanisms of these traditional engines to exploit NVM’s characteristics.
For instance, consider an NVM-aware storage engine that performs in-place updates. When a transaction inserts a tuple, rather than copying the tuple to the WAL, the engine only records a non-volatile pointer to the tuple in the WAL. This is sufficient because both the pointer and the tuple referred to by the pointer are stored on NVM. Thus, the engine can use the pointer to access the tuple after the system restarts without needing to re-apply changes in the WAL. It also stores indexes as non-volatile B+trees that can be accessed immediately when the system restarts without rebuilding.
The effects of committed transactions are durable after the system restarts because the engine immediately persists the changes made by a transaction when it commits. So, the engine does not need to replay the log during recovery. But the changes of uncommitted transactions may be present in the database because the memory controller can evict cache lines containing those changes to NVM at any time. The engine therefore needs to undo those transactions using the WAL. As this recovery protocol does not include a redo process, the engine has a much shorter recovery latency compared to a traditional engine.
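The pointer-based logging and undo-only recovery described in this answer can be sketched as follows. This is an illustrative toy, not the authors' engine: dicts and lists stand in for NVM-resident structures, integer addresses stand in for non-volatile pointers, and truncating a transaction's log entries at commit is an assumption of the sketch. The name `NvmInPlaceEngine` is hypothetical.

```python
class NvmInPlaceEngine:
    """Toy NVM-aware engine: in-place tuples, pointer-only WAL, undo-only recovery."""

    def __init__(self):
        self.heap = {}      # "NVM" tuple storage: address -> tuple
        self.index = {}     # stands in for a non-volatile index: key -> address
        self.wal = []       # "NVM" log holding pointers, never tuple copies
        self.next_addr = 0

    def insert(self, txn_id, key, tuple_data):
        addr = self.next_addr
        self.next_addr += 1
        self.heap[addr] = tuple_data                    # tuple written in place
        self.wal.append((txn_id, "insert", key, addr))  # log only the pointer
        self.index[key] = addr

    def commit(self, txn_id):
        # The changes are already persistent, so this transaction's
        # log entries are no longer needed (assumed truncation policy).
        self.wal = [rec for rec in self.wal if rec[0] != txn_id]

    def recover(self):
        # Undo-only: anything still in the WAL belongs to an uncommitted
        # transaction, so follow each pointer and remove its effects.
        for txn_id, op, key, addr in reversed(self.wal):
            if op == "insert":
                self.index.pop(key, None)
                self.heap.pop(addr, None)
        self.wal.clear()
```

For example, if transaction 1 inserts a tuple and commits, and transaction 2 inserts a tuple but the system fails before it commits, calling `recover()` removes only the second tuple. The work done at recovery is proportional to the number of uncommitted changes, not to the size of the database or the full log.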
Q6. What is the key takeaway from the book?
Arulraj, Pavlo: Altogether, the work described in this book illustrates that rethinking the key algorithms and data structures employed in a DBMS for NVM not only improves performance and operational cost, but also simplifies development and enables the DBMS to support near-instantaneous recovery from DBMS failures. When we started this project in 2013, it was a moonshot. We were not sure if NVM technologies would ever see the light of day, but Intel has finally started shipping NVM devices in 2019. We are excited about the impact of NVM on next-generation database systems.
Joy Arulraj is an Assistant Professor of Computer Science at Georgia Institute of Technology. He received his Ph.D. from Carnegie Mellon University in 2018, advised by Andy Pavlo. His doctoral research focused on the design and implementation of non-volatile memory database management systems. This work was conducted in collaboration with the Intel Science & Technology Center for Big Data, Microsoft Research, and Samsung Research.
Andrew Pavlo is an Assistant Professor of Databaseology in the Computer Science Department at Carnegie Mellon University. At CMU, he is a member of the Database Group and the Parallel Data Laboratory. His work is also in collaboration with the Intel Science and Technology Center for Big Data.
– Non-Volatile Memory Database Management Systems, by Joy Arulraj, Georgia Institute of Technology, and Andrew Pavlo, Carnegie Mellon University. Book, Morgan & Claypool Publishers, Copyright © 2019, 191 pages.
ISBN: 9781681734842 | PDF ISBN: 9781681734859 | Hardcover ISBN: 9781681734866
– How to Build a Non-Volatile Memory Database Management System (.PDF), Joy Arulraj and Andrew Pavlo