ElasTraS: Elastic Transaction Management in the Cloud

ElasTraS targets the design space of scalable, elastic, fault-tolerant, self-managing, transactional relational database for the cloud. ElasTraS is designed to scale out using a cluster of commodity machines while being fault-tolerant and self-managing. ElasTraS is designed to support both classes of database needs for the cloud: (i) large databases partitioned across a set of nodes, and (ii) a large number of small and independent databases common in multi-tenant databases. ElasTraS borrows from the design philosophy of scalable Key-Value stores to minimize distributed synchronization and remove scalability bottlenecks, while leveraging decades of research on transaction processing, concurrency control, and recovery to support rich functionality and transactional guarantees.

Cloud computing has emerged as a pervasive paradigm for hosting Internet scale applications in large computing infrastructures. Major enabling features of the cloud include elasticity of resources, pay per use, low time to market, and the perception of unlimited resources and infinite scalability. As a result, there has been a widespread migration of IT services from enterprise computing infrastructures (i.e., networked cluster of expensive servers) to cloud infrastructures (i.e., large data-centers with hundreds of thousands of commodity servers). Since one of the primary uses of the cloud is for hosting a wide range of web applications, scalable data management systems that drive these applications are a crucial technology component in the cloud.

The ElasTraS project aims to fill the gap in the context of data management for applications in the cloud. ElasTraS is designed to support a relational data model and foreign key relationships amongst tables in a database. Given a set of partitions of the same or different databases, ElasTraS can:

  • deal with workload changes resulting in the growing and shrinking of the cluster size,
  • load balance the partitions,
  • recover from node failures,
  • if configured for dynamic partitioning, then dynamically split hot or very big partitions, and merge consecutive small partitions using a partitioning scheme specified in advance, and
  • provide transactional access to the database partitions, while scaling to large amounts of data and large numbers of concurrent transactions.
  • Transaction Management in the Cloud: We also develop novel techniques for transactional management in the cloud. Most transactional cloud databases use some form of locking for concurrency control, either dynamic locking as the case in Spanner and MySQL Cluster, or static locking as the case in Calvin. In fact, aside from Megastore, all major production database management systems, whether designed for the cloud or not, use lock-based concurrency control. However, dealing with locks left by transactions initiated by failed machines, and determining a multi-programming level that avoids thrashing without under-utilizing available resources, are some of the challenges that arise when using lock-based transaction processing mechanisms in the cloud context. Even in the case of optimistic concurrency control, most proposals in the literature deal with distributed validation but still require the database to acquire locks during two-phase commit when installing updates of a single transaction on multiple machines. Very little theoretical work has been done to entirely eliminate the need for locking in distributed transactions, including locks acquired during two-phase commit.

    We tackle the practicality issues associated with OCC when implementing transactions at the database layer in a distributed setting. We present a novel re-design of OCC that (1) eliminates the need for any locking during two-phase commit when executing distributed transactions, (2) reduces the abort rate of OCC to a rate much less than that incurred by deadlock avoidance mechanisms in lock-based concurrency control, such as wait-die and wound-wait, (3) preserves the no-thrashing property that characterizes OCC, and (4) preserves the higher throughput that OCC demonstrates compared to lock-based concurrency control. Thus, our novel re-design of OCC overcomes the disadvantages of existing OCC mechanisms, while preserving the advantages of OCC over lock-based concurrency control. We implement our proposed approach as part of a transaction processing
    system that we develop; we refer to this transaction processing system as MaaT, which is an abbreviation of "Multi-access as a Transaction."

Project Status: