Efficient Crash Consistency for NVMe over PCIe and RDMA

Research output: Contribution to journal › Article › peer-review

Abstract

This article presents crash-consistent Non-Volatile Memory Express (ccNVMe), a novel extension of NVMe that defines how host software communicates with non-volatile memory (e.g., a solid-state drive) across a PCI Express bus and RDMA-capable networks with both crash consistency and performance efficiency. Existing storage systems pay a heavy tax for crash consistency and thus cannot fully exploit the multi-queue parallelism and low latency of the NVMe and RDMA interfaces. ccNVMe alleviates this major bottleneck by coupling crash consistency to data dissemination. This idea lets the storage system achieve crash consistency by taking a free ride on NVMe's data dissemination mechanism, using only two lightweight memory-mapped I/Os (MMIOs), unlike traditional systems that use complex update protocols and synchronized block I/Os. ccNVMe introduces a series of techniques, including transaction-aware MMIO/doorbell and I/O command coalescing, to reduce PCIe traffic and to provide atomicity. We present how to build a high-performance, crash-consistent file system named MQFS atop ccNVMe. We experimentally show that MQFS increases the IOPS of RocksDB by 36% and 28% compared to a state-of-the-art file system and Ext4 without journaling, respectively.
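
The intuition behind ringing a doorbell once per transaction rather than once per command can be illustrated with a toy model. The sketch below is not the ccNVMe design or any real NVMe driver API: the `ToyQueue` class, its fields, and the two submit functions are hypothetical names invented for illustration, and the model only counts simulated doorbell writes under the assumption that one doorbell write publishes every command queued so far.

```python
from dataclasses import dataclass, field


@dataclass
class ToyQueue:
    """Hypothetical submission queue model; it only counts doorbell writes."""
    entries: list = field(default_factory=list)  # commands placed in the queue
    tail: int = 0              # tail value last published to the device
    doorbell_writes: int = 0   # number of simulated MMIO doorbell writes

    def enqueue(self, command):
        # Placing a command in the queue does not notify the device.
        self.entries.append(command)

    def ring_doorbell(self):
        # One simulated MMIO write publishes every entry queued so far.
        self.tail = len(self.entries)
        self.doorbell_writes += 1


def submit_per_command(queue, commands):
    """Baseline (assumed): ring the doorbell after every command."""
    for command in commands:
        queue.enqueue(command)
        queue.ring_doorbell()


def submit_as_transaction(queue, commands):
    """Transaction-aware variant (assumed): queue everything, ring once."""
    for command in commands:
        queue.enqueue(command)
    queue.ring_doorbell()


# Illustrative command names only; they are not taken from the article.
transaction = ["write data block", "write inode", "write bitmap", "commit"]

baseline = ToyQueue()
submit_per_command(baseline, transaction)

batched = ToyQueue()
submit_as_transaction(batched, transaction)

print(baseline.doorbell_writes, batched.doorbell_writes)
```

In this toy model the batched queue's device-visible tail moves from 0 to 4 in a single step, so the device sees either none or all of the transaction's commands. That is one way to picture how a doorbell write could double as a commit point while also reducing MMIO traffic; the actual mechanism is described in the article itself.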

Original language: English
Article number: 7
Journal: ACM Transactions on Storage
Volume: 19
Issue number: 1
State: Published - 11 Jan 2023
Externally published: Yes

Keywords

  • NVMe
  • SSD
  • Storage protocol
  • crash consistency
  • file system
