CNS: NSF EAGER: From a Virtualized Computing Nucleus to a Cloud Computing Universe
The research plan of this project consists of two main tasks:
In the first task, the PIs are developing a formal proof-theoretic framework to ensure that the partition management protocols (which give cloud computing systems their elasticity) are correct. The PIs will then use this framework to analyze the partition management functionality of two representative systems: BigTable (equivalently HBase) and Yahoo's PNUTS. The PIs will also validate the storage subsystems of Google GFS and Hadoop HDFS to ensure that read and write operations at the file system level indeed satisfy the stated invariants. The PIs will then extend this study to ensure that operations on a single key-value pair are indeed atomic and durable irrespective of the size of the value attribute.
In the second task, the PIs will investigate more stringent levels of data and object consistency in cloud computing systems. In particular, the PIs will develop a distributed implementation of snapshot isolation and evaluate its effectiveness in cloud computing environments, primarily with respect to elasticity and scalability. The PIs will also put all the pieces together (with their associated invariants) and evaluate whether the resulting overall system is correct. Moreover, the PIs will formalize the characteristics of different replica management protocols to identify the limits of data availability in various cloud computing environments.