Convertible codes: New class of codes for efficient conversion of coded data in distributed storage by Prof. Rashmi Vinayak


Erasure codes are typically used in large-scale distributed storage systems to keep data durable in the face of failures. In this setting, a set of k blocks to be stored is encoded using an [n, k] code to generate n blocks, which are then stored on different storage nodes. Recent work by Kadekodi et al. (2019) shows that the failure rates of storage devices vary significantly over time, and that changing the rate of the code (via a change in the parameters n and k) in response to such variations significantly reduces the storage space required. However, the resource overhead of changing the code rate of already encoded data is prohibitively high with traditional codes.
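
To make the storage argument concrete, the short sketch below computes the storage overhead n/k of an [n, k] code and the relative space saved by moving to a higher-rate code. The parameter pairs [14, 10] and [24, 20] are hypothetical values chosen for illustration and are not taken from the talk or from Kadekodi et al.; the failure-tolerance figure n - k assumes an MDS code.

```python
from fractions import Fraction


def storage_overhead(n: int, k: int) -> Fraction:
    """Blocks stored per block of user data for an [n, k] code."""
    if not 0 < k <= n:
        raise ValueError("require 0 < k <= n")
    return Fraction(n, k)


def tolerated_failures(n: int, k: int) -> int:
    """Node failures an MDS [n, k] code can tolerate (assumes MDS)."""
    return n - k


def relative_saving(initial: tuple, final: tuple) -> Fraction:
    """Fraction of storage space saved by switching from initial to final."""
    return 1 - storage_overhead(*final) / storage_overhead(*initial)


# Hypothetical parameters, for illustration only.
initial = (14, 10)  # overhead 7/5, tolerates 4 failures
final = (24, 20)    # overhead 6/5, tolerates 4 failures

print("initial overhead:", storage_overhead(*initial))
print("final overhead:", storage_overhead(*final))
print("relative saving:", relative_saving(initial, final))
```

In this illustrative case both codes tolerate four node failures, but the second stores 1.2 blocks per data block instead of 1.4, which is why moving already encoded data between such codes cheaply is attractive.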

Motivated by this application, in this work, we first present a new framework to formalize the notion of code conversion: the process of converting data encoded with an [nI, kI] code into data encoded with an [nF, kF] code while maintaining desired decodability properties, such as the maximum-distance-separable (MDS) property. We then introduce convertible codes, a new class of code pairs that allow for code conversions in a resource-efficient manner. For an important parameter regime, and under the widely used linearity and MDS decodability constraints, we prove tight bounds on the number of nodes accessed during code conversion. We also present explicit low-field-size constructions of access-optimal MDS convertible codes for a broad range of parameters.