When Codes for Storage Systems Meet Storage Systems

14/12/2017 - 12:00

Large-scale storage systems lie at the heart of the big data revolution. As these systems grow in scale and capacity, their complexity grows accordingly, building on new storage media, hybrid memory hierarchies, and distributed architectures. Numerous layers of abstraction hide this complexity from the applications, but also hide valuable information that could improve the system's performance considerably.

I will demonstrate how to bridge this semantic gap in the context of erasure codes, which are used to guarantee data availability and durability. Current theoretical research focuses on codes that reduce the storage, network, and compute overheads of the systems that use them, without sacrificing reliability. However, the semantic gap makes it difficult to observe the theoretical benefits of these codes in real implementations. Using regenerating and locally recoverable codes as examples, I will show the key challenges in applying optimal erasure codes to real systems and how they can be addressed. This part is based on joint work with Matan Liram, Oleg Kolosov, Eitan Yaakobi, Itzhak Tamo and Alexander Barg.

I will then briefly describe the challenges introduced by the semantic gap in other layers of the "storage stack", and my experience in addressing them. I will refer to the memory hierarchy, flash-based solid-state drives, workload analysis, and aspects of data security.