Data management systems and technologies have historically played a pivotal role in computing environments that involve large volumes of data and information. In fact, database management systems (DBMSs) are critical components of any data-intensive application infrastructure. Furthermore, the underlying technologies, both the language and query models and the system architectures, have reached a level of maturity that has enabled their use as plug-and-play components without the need for detailed knowledge of their internals.
Recently, however, the entire area of data management, especially as it pertains to large-scale data arising from Internet and Web-based applications, has reached a crossroads. The main question in the resulting debate is the effectiveness of the established DBMS paradigms: declarative query languages, independence of the logical and physical data models, and the computational framework based on the transaction concept. Several large Internet companies, such as Google, Yahoo, and Amazon, have put forth competing solutions both for building scalable data-intensive applications over the Internet/Web and for large-scale data analytics.
During this quarter, we will jointly explore these questions to gain the deeper understanding needed to participate in this debate. The topics to be covered and the detailed lecture organization for the course appear below.
Pre-requisites: CMPSC 170.
Required Textbook: Transactional Information Systems by Gerhard Weikum and Gottfried Vossen
Instructor: Divy Agrawal, agrawal AT cs.ucsb.edu
Office hours: MW 14:30 - 15:30, 3117 Engineering I, and by appointment.
Teaching Assistant:
Date | Topic | Related Reading | Comments |
Th: 9/24/2009 | Data Management Issues in Data-intensive Computing | Lecture #1 Notes | Historical Overview & Motivation |
Tu: 9/29/2009 | Data Management for Enterprise Applications | Lecture #2: Overview; Lecture #2: Correctness | Database Correctness |
Th: 10/1/2009 | Data Management for Enterprise Applications | Lecture #3: Serializability | Correctness models for Transaction Execution; Homework #1 Assigned |
Tu: 10/6/2009 | Data Management for Enterprise Applications | Lecture #4: Two Phase Locking; Lecture #4: Variants of 2PL | Concurrency Control Protocols |
Th: 10/8/2009 | Data Management for Enterprise Applications | Lecture #5: Non-locking Protocols | Concurrency Control Protocols; Homework #1 due |
Tu: 10/13/2009 | Data Management for Enterprise Applications | Lecture #6: Multiversion Data | Multiversion Synchronization; Homework #2 Assigned |
Th: 10/15/2009 | Data Management for Enterprise Applications | Lecture #7: Transaction Failures | Transaction Failures and Recoverability |
Tu: 10/20/2009 | Data Management for Enterprise Applications | Lecture #8: Crash Failures | Database Recovery from Crash Failures |
Th: 10/22/2009 | Data Management for Enterprise Applications | Lecture #9: Recovery Protocols | Database Recovery from Crash Failures; Homework #2 Due |
Tu: 10/27/2009 | Data Management for Enterprise Applications | Lecture #10: Distributed Recovery | Data Distribution & Data Replication |
Th: 10/29/2009 | Project Discussion | | |
Tu: 11/3/2009 | Data Management for Internet Applications | Powerpoint Slides | Yahoo's PNUTS & Amazon's Dynamo |
Th: 11/5/2009 | Data Management for Internet Applications | Powerpoint Slides | Google Solution Stack: Chubby & BigTable |
Tu: 11/10/2009 | Data Management for Internet Applications | Powerpoint Slides | Google's BigTable & Chubby Lock Service |
Th: 11/12/2009 | Data Management for Internet Applications | Powerpoint Slides | Correctness Semantics & Future Outlook |
Tu: 11/17/2009 | Large-scale Data Analysis in the Enterprise Context | Data Warehousing | Data Warehousing Fundamentals |
Th: 11/19/2009 | Large-scale Data Analysis in the Enterprise Context | OLAP & Data Cube | Online Analytical Processing and the Data Cube Model |
Tu: 11/24/2009 | Large-scale Data Analysis in the Internet Context | MapReduce | The MapReduce Paradigm |
Th: 11/26/2009 | NO CLASS | Thanksgiving Holiday | |
Tu: 12/1/2009 | Macro-trends in Computing Infrastructures | Cloud Computing | Cloud Computing, SaaS, PaaS, and IaaS |
Th: 12/3/2009 | Micro-trends in Computing Infrastructures | Transactional Memory | Parallel Computing Paradigms |
Th: 12/10/2009 | Project Demonstrations | By Appointment |