
An Optimal Recovery Approach for Liberation Codes in Distributed Storage Systems

  • Ningjing Liang
  • Xingjun Zhang*
  • Hailong Yang
  • Xiaoshe Dong
  • Changjiang Zhang

*Corresponding author for this work

Research output: Contribution to journal (article, peer-reviewed)

Abstract

To reduce storage cost, distributed storage systems are increasingly adopting erasure codes to ensure data reliability. Liberation codes, which satisfy the maximum distance separable (MDS) property and provide optimal modification overhead, are a popular class of two-fault-tolerant erasure codes. However, when recovering from a single node failure, erasure codes must read large amounts of data from the surviving nodes and transfer it across the network. Existing single-node failure recovery approaches for Liberation codes are either time-consuming or suboptimal. In this article, we first prove the minimum number of symbols required to recover one failed node in a Liberation-coded system. We then derive the conditions that optimal recovery solutions must satisfy. Finally, we propose an algorithm, called Disk Read Optimal Recovery (DROR), which determines an optimal recovery solution in linear time and recovers the failed node while reading the minimum amount of data. We have implemented DROR in Ceph, a real-world storage system, and evaluated it on a cluster of Amazon EC2 instances. Our results show that DROR reduces reconstruction time by up to 23.6% compared with the recovery approach in Ceph.

Original language: English
Article number: 9149899
Pages (from-to): 137631-137645
Number of pages: 15
Journal: IEEE Access
Volume: 8
State: Published - 2020

Keywords

  • Liberation codes
  • minimum amount of data
  • optimal recovery approach
  • single node failures
