CX project - Overview

[CX homepage]

The problem
The opportunity
Project goals
Some fundamental issues

Heterogeneity of machines and OS
Communication latency
Scalability
Robustness
Ease of use

The problem
The ocean contains many tons of gold. But, the gold atoms are too diffuse to extract usefully. Idle cycles on the Internet, like gold atoms in the ocean, seem too diffuse to extract usefully. If we could harness effectively the vast quantities of idle cycles, we could greatly accelerate our acquisition of scientific knowledge, successfully undertake grand challenge computations, and reap the rewards in physics, chemistry, bioinformatics, and medicine, among other fields of knowledge.
The opportunity
Several trends, when combined, point to an opportunity:

The number of networked computing devices is increasing: Computation is getting faster and cheaper: The number of unused cycles per second is growing rapidly

Bandwidth is increasing and getting cheaper

Communication latency is not decreasing

Humans are getting neither faster nor cheaper

These trends and other technological advances lead to opportunities whose surface we have barely scratched. It now is technically feasible to undertake "Internet computations" that are technically infeasible for a network of supercomputers in the same time frame. The maximum feasible problem size for ``Internet computations" is growing more rapidly than that for supercomputer networks. The SETI@home project discloses an emerging global computational organism, bringing "life" to Sun Microsystem's phrase "The network is the computer". The underlying concept holds the promise of a huge computational capacity, in which users pay only for the computational capacity actually used, increasing the utilization of existing computers.
Project Goals
In the CX project, we are designing an open, extensible Computation eXchange that can be instantiated privately, within a single organization (e.g., a university, distributed set of researchers, or corporation), or publicly as part of a market in computation, including charitable computations (e.g., AIDS or cancer research, SETI). Application-specific computation services constitute one kind of extension, in which computational consumers directly
contact specialized computational producers, which provide computational support for particular applications.
The system must enable application programmers to design, implement, and deploy large computations, using computers on the Internet. It must reduce human administrative costs, such as costs associated with:

downloading and executing a program on heterogeneous sets of machines and operating systems

distributing software component upgrades.

It should reduce application design costs by:

giving the application programmer a simple but general programming abstraction

freeing the application programmer from the non-application concerns of interprocessor communication and fault tolerance.

System performance must scale both up and down, despite communication latency, to a set of computation producers whose size varies widely even within the execution of a single computation. It must serve several consumers concurrently, associating different consumers with different priorities. It should support computations of widely varying lifetimes, from a few minutes to several months. Producers must be secure from the code they execute. Discriminating among consumers is supported, both for security and privacy, and for prioritizing the allocation of resources, such as compute producers.
After initial installation of system software, no human intervention is required to upgrade those components. The computational model must enable general task [de]composition with a restrictive shared state that is appropriate to the medium. The API must be simple but general. Communication and fault tolerance must be transparent to the user. Producers' interests must be aligned with their consumer's interests: computations are completed according to how highly they are valued.
Some Fundamental Issues
It is a challenge to achieve the goals of this system with respect to performance, correctness, ease of use, incentive to participate, security, and privacy. Although in the current phase we are not focusing on security and privacy, the Java security model {Gong} and, in particular, the Java 2 RMI API for network security {Scheifler} (covering authentication, confidentiality, and integrity) clearly are intended to support such concerns. Our choice of the Java programming system reflects these benefits implicitly.
Application programming complexity is managed by presenting the programmer with a simple, compact, general API, briefly presented in the next section. Administrative complexity is managed by using the Java programming system: its virtual machine provides a homogeneous platform on top of otherwise heterogeneous sets of machines and operating systems. We use a small set of interrelated Jini clients and services to further
simplify the administration of system components, such as the distribution of software component upgrades. The Production Network is a Jini service that interfaces with every other CX Jini client and service.
Performance issues can be decomposed into several sub-issues.
Heterogeneity of machines & OS
The goal is to overcome the administrative complexity associated with multiple hardware platforms and operating systems, incurring an acceptable loss of execution performance. The tradeoff is between the efficiency of native machine code vs. the universality of virtual machine code. For the applications targetted (not, e.g., real-time applications) the benefits of Java JITs reduce the benefits of native machine code: Java wins by reducing application programming complexity and administrative complexity, whose costs are not declining as fast as execution times.
Communication latency
There is little reason to believe that technological advances will significantly decrease communication latency. Hiding latency, to the extent that it is possible, thus is central to our design.
Scalability
The architecture must scale to a higher degree than existing multiprocessor architectures, such as workstation clusters.
Robustness
An architecture that scales to thousands of computational producers must tolerate faults, particularly when participating machines, in addition
to failing, can dynamically disassociate from an ongoing computation.
Ease of use
The computation consumer distributes code/data to a heterogeneous set of machines/OSs. This motivates using a {\em virtual} machine, in particular,
the JVM. Computational producers must download/install/upgrade system software (not just application code). Use of a screensaver/daemon obviates the need for human administration beyond the one-time installation of producer software. The screensaver/daemon is a wrapper or a Jini client, which downloads a ``task server" service proxy every time it starts, automatically distributing system software upgrades.

[return to the top]

For questions and comments about CX project:cappello@cs.ucsb.edu
For questions and comments about this site:mourlouk@cs.ucsb.com
site last updated: 03/22/2001.