Sorrento: A Self-Organizing Storage Cluster for Parallel Data-Intensive Applications

Report ID: 2003-30
Authors: Hong Tang, Aziz Golbeden, Jingyu Zhou, Lingkun Chu, and Tao Yang
Date: 2003-10-01 05:00:00

Abstract

This paper describes the design and implementation of Sorrento, a self-organizing storage cluster built upon commodity components. Sorrento complements previous research on distributed file/storage systems by focusing on incremental expandability and manageability of the system and on design choices for optimizing the performance of parallel data-intensive applications with low write-sharing patterns. Sorrento virtualizes distributed storage devices as incrementally expandable volumes and automatically manages storage node additions and failures. Its consistency model uses a version-based scheme for data updating and replica management, which is especially suitable for data-intensive applications where distributed processes access disjoint datasets most of the time. To further facilitate parallel I/O, Sorrento provides load-aware or locality-driven data placement and an adaptive migration strategy. This paper presents experimental results to demonstrate the features and performance of Sorrento using both microbenchmarks and trace replay of real applications from several domains, including scientific computing, data mining, and offline processing for web search.

Document

2003-30.pdf