Course Syllabus

CMPSC 274: Advanced Topics on Databases

Data Management issues for Data-intensive Computing

SPRING 2011: MW 3:00 - 4:50  932 101

Class Website: http://www.cs.ucsb.edu/~agrawal

Course Description

Data management systems and technologies have historically played a pivotal role in computing environments that involve large volumes of data and information. In fact, database management systems (DBMS) are critical components of most data-intensive application infrastructures. Furthermore, the underlying technologies, both in terms of language and query models and with respect to system architectures, have reached a level of maturity that has enabled their use as plug-and-play components without the need for detailed knowledge of their internals.

Recently, however, the entire area of data management, especially as it pertains to large-scale data arising from Internet and Web-based applications, has reached a crossroads. The central question in the resulting debate is the effectiveness of the old DBMS paradigms: declarative query languages, independence of the logical and physical data models, and the computational framework based on the transaction concept. Several large Internet companies, such as Google, Yahoo, and Amazon, have put forth competing solutions both for building data-intensive scalable applications over the Internet/Web and for large-scale data analytics.

During this quarter, we will begin a joint exploration to gain the deeper understanding needed to participate in this debate. In particular, the following topics will be covered:

The detailed lecture organization for the course appears below.

Pre-requisites: CMPSC 170.

Required Textbook: Transactional Information Systems by Gerhard WEIKUM and Gottfried VOSSEN

Instructor: Divy Agrawal, agrawal AT cs.ucsb.edu

Office hours: TuTh 1:00PM - 2:00PM, 3117 Harold Frank Hall, and by appointment.

Teaching Assistant:

Grading:

CMPSC 274 Course Outline (approximate):
Date | Topic | Related Reading | Comments
M: 3/28/2011 | Data Management Issues in Data-intensive Computing | Lecture #1 Notes | Historical Overview & Motivation
W: 3/30/2011 | Data Management for Enterprise Applications | Lecture #2: Database Computation Model | Database Correctness
M: 4/4/2011 | Data Management for Enterprise Applications | Lecture #3: Equivalence of Executions | Correctness Models for Transaction Execution; Homework #1 Assigned
W: 4/6/2011 | Cloud Computing: Data Analytics | Lecture #4: Data in the Cloud | Cloud Computing
M: 4/11/2011 | Scalable Data Management in the Cloud | Lecture #5: Scalable Data in the Cloud | Cloud Computing
W: 4/13/2011 | Data Management for Enterprise Applications | Lecture #6: Transaction Correctness | Conflict Serializability and Serialization Graph
M: 4/18/2011 | Data Management for Enterprise Applications | Lecture #7: Concurrency Control Protocols | Two-phase Locking; Homework #2 Assigned
W: 4/20/2011 | Data Management for Enterprise Applications | Lecture #8: Non-locking Protocols | Timestamp Ordering & Optimistic Protocols
M: 4/25/2011 | Data Management for Enterprise Applications | Lecture #9: Recovery Protocols | Database Recovery from Crash Failures; Homework #2 Due
W: 4/27/2011 | Data Management for Enterprise Applications | Lecture #10: Distributed Recovery | Data Distribution & Data Replication
M: 5/2/2011 | Data Management for Internet Applications | PowerPoint Slides | Yahoo's PNUTS & Amazon's Dynamo
W: 5/4/2011 | Data Management for Internet Applications | PowerPoint Slides | Google Solution Stack: Chubby & BigTable
M: 5/9/2011 | Data Management for Internet Applications | PowerPoint Slides | Google's BigTable & Chubby Lock Service
W: 5/11/2011 | Data Management for Internet Applications | PowerPoint Slides | Correctness Semantics & Future Outlook
M: 5/16/2011 | Large-scale Data Analysis in the Enterprise Context | Data Warehousing | Data Warehousing Fundamentals
W: 5/18/2011 | Large-scale Data Analysis in the Enterprise Context | OLAP & Data Cube | Online Analytical Processing and the Data Cube Model
M: 5/23/2011 | Large-scale Data Analysis in the Internet Context | MapReduce | The MapReduce Paradigm
W: 5/25/2011 | Macro-trends in Computing Infrastructures | Cloud Computing | Cloud Computing, SaaS, PaaS, and IaaS
M: 5/30/2011 | NO CLASS | Memorial Day Holiday |
W: 6/1/2011 | Micro-trends in Computing Infrastructures | Transactional Memory | Parallel Computing Paradigms