Introduction

The ocean contains many tons of gold, but the gold atoms are too diffuse to extract usefully. Idle cycles on the Internet, like gold atoms in the ocean, seem too diffuse to extract usefully. If we could effectively harness the vast quantity of idle cycles, we could greatly accelerate our acquisition of scientific knowledge, successfully undertake grand challenge computations, and reap the rewards in physics, chemistry, bioinformatics, and medicine, among other fields. An opportunity is suggested by the following trends, taken as a whole:
  • The number of networked computing devices is increasing and computation is getting faster and cheaper, so the number of unused cycles per second is growing rapidly
  • Bandwidth is increasing and getting cheaper
  • Communication latency is not decreasing 
  • Humans are getting neither faster nor cheaper. 
These trends and other technological advances create opportunities whose surface we have barely scratched. It is now technically feasible to undertake "Internet computations" that are infeasible for a network of supercomputers in the same time frame. The maximum feasible problem size for "Internet computations" is growing more rapidly than that for supercomputer networks. The SETI@home project reveals an emerging global computational organism, bringing "life" to Sun Microsystems' phrase "The network is the computer". The underlying concept holds the promise of a huge computational capacity in which users pay only for the capacity actually used, increasing the utilization of existing computers.

Project Goals

In the Jicos project, we are designing an open, extensible computation exchange that can be instantiated privately, within a single organization (e.g., a university, a distributed set of researchers, or a corporation), or publicly, as part of a market in computation that includes charitable computations (e.g., AIDS or cancer research, SETI). Application-specific computation services constitute one kind of extension, in which computational consumers directly contact specialized computational producers that provide computational support for particular applications. The system must enable application programmers to design, implement, and deploy large computations using computers on the Internet. It must reduce human administrative costs, such as costs associated with:
  • downloading and executing a program on heterogeneous sets of machines and operating systems
  • distributing software component upgrades.
It should reduce application design costs by:
  • giving the application programmer a simple but general programming abstraction
  • freeing the application programmer from concerns of interprocessor communication and fault tolerance.
System performance must scale both up and down, despite communication latency, to a set of computation producers whose size varies widely even within the execution of a single computation. The system must serve several clients concurrently, associating different clients with different priorities. It should support computations of widely varying lifetimes, from a few minutes to several months. Hosts must be secure from the code they execute. The system must be able to discriminate among clients, both for security and privacy and for prioritizing the allocation of resources, such as compute hosts. After initial installation of system software, no human intervention should be required to upgrade its components. The computational model must enable general task decomposition and composition with a restrictive shared state that is appropriate to the medium. The API must be simple but general. Communication and fault tolerance must be transparent to the user. Hosts' interests must be aligned with their clients' interests: computations are completed according to how highly they are valued.
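The requirement for general task decomposition and composition can be illustrated with a minimal Java sketch. Everything named here (the Task interface, SumTask, run, and the threshold) is a hypothetical illustration and is not the Jicos API. The run method is a sequential stand-in: a real system would presumably ship the subtasks to hosts rather than recurse locally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these names are illustrative and are not the Jicos API.
class DivideAndConquerSketch {

    /** A task either solves itself directly or splits into subtasks. */
    interface Task<R> {
        boolean isAtomic();              // small enough to solve directly?
        R solve();                       // used when the task is atomic
        List<Task<R>> decompose();       // used when it is not
        R compose(List<R> subResults);   // combines the subtask results
    }

    /** Sums the integers in [lo, hi) by repeatedly halving the range. */
    static final class SumTask implements Task<Long> {
        private static final int THRESHOLD = 1_000; // assumed grain size
        private final int lo;
        private final int hi;

        SumTask(int lo, int hi) { this.lo = lo; this.hi = hi; }

        @Override public boolean isAtomic() { return hi - lo <= THRESHOLD; }

        @Override public Long solve() {
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += i;
            return sum;
        }

        @Override public List<Task<Long>> decompose() {
            int mid = lo + (hi - lo) / 2;
            List<Task<Long>> parts = new ArrayList<>();
            parts.add(new SumTask(lo, mid));
            parts.add(new SumTask(mid, hi));
            return parts;
        }

        @Override public Long compose(List<Long> subResults) {
            long sum = 0;
            for (long part : subResults) sum += part;
            return sum;
        }
    }

    /** Sequential stand-in for the system that would distribute subtasks. */
    static <R> R run(Task<R> task) {
        if (task.isAtomic()) return task.solve();
        List<R> results = new ArrayList<>();
        for (Task<R> sub : task.decompose()) results.add(run(sub));
        return task.compose(results);
    }

    public static void main(String[] args) {
        System.out.println(run(new SumTask(0, 10_000))); // prints 49995000
    }
}
```

In this sketch the application programmer writes only the four Task methods. Where and when subtasks execute is left to the runner, which is the separation the goals above ask for.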

Some Fundamental Issues

It is a challenge to achieve the goals of this system with respect to performance, correctness, ease of use, incentive to participate, security, and privacy. Although this introduction does not focus on security and privacy, the Java security model {Gong} and the "Davis" release of Jini, which addresses network security {Scheifler} (covering authentication, confidentiality, and integrity), clearly are intended to support such concerns. Our choice of the Java programming system and Jini implicitly reflects these benefits. In this introduction, we present the HostingServiceProvider (HSP) subsystem of Jicos, focusing on its design with respect to application programming complexity, administrative complexity, and performance. Application programming complexity is managed by presenting the programmer with a simple, compact, general API, briefly presented in the API section. Administrative complexity is managed by using the Java programming system: its virtual machine provides a homogeneous platform on top of otherwise heterogeneous sets of machines and operating systems. We use a small set of interrelated RMI (soon to be Jini) clients and services to further simplify the administration of system components, such as the distribution of software component upgrades. The HSP is a service that interfaces with every other Jicos client and service. In this introduction, however, we focus on the Task Server and the Host. Performance issues can be decomposed into several sub-issues.
Heterogeneity of machines/OS
The goal is to overcome the administrative complexity associated with multiple hardware platforms and operating systems while incurring only an acceptable loss of execution performance. The tradeoff is between the efficiency of native machine code and the universality of virtual machine code. For the applications targeted (which exclude, e.g., real-time applications), Java JIT compilers narrow the performance advantage of native machine code. Java wins by reducing application programming complexity and administrative complexity, whose costs are not declining as fast as execution times.
Communication latency
There is little reason to believe that technological advances will significantly decrease communication latency. Hiding latency, to the extent that it is possible, thus is central to our design.
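One common way to hide latency is to overlap communication with computation by prefetching. The following is a minimal sketch of that general technique, not a description of how Jicos implements it. The names fetchTask, compute, runHost, and prefetchDepth are hypothetical, and the remote task server is simulated with a sleep.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of task prefetching; not the Jicos implementation.
class PrefetchingHostSketch {
    private static final int NO_MORE_TASKS = -1; // sentinel marking end of work

    /** Simulated remote call: the sleep stands in for network latency. */
    static int fetchTask(int id, long latencyMillis) throws InterruptedException {
        Thread.sleep(latencyMillis);
        return id;
    }

    /** Stand-in for real work on a task. */
    static long compute(int task) {
        return (long) task * task;
    }

    /**
     * Runs a host that keeps up to prefetchDepth tasks buffered locally, so
     * the next fetch proceeds while the current task is being computed.
     */
    static long runHost(int taskCount, int prefetchDepth, long latencyMillis) {
        BlockingQueue<Integer> local = new ArrayBlockingQueue<>(prefetchDepth);

        // The fetcher thread pays the latency; put() blocks when the buffer is full.
        Thread fetcher = new Thread(() -> {
            try {
                for (int id = 0; id < taskCount; id++) {
                    local.put(fetchTask(id, latencyMillis));
                }
                local.put(NO_MORE_TASKS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        fetcher.setDaemon(true);
        fetcher.start();

        long total = 0;
        try {
            // The compute loop waits only when the local buffer is empty.
            for (int task = local.take(); task != NO_MORE_TASKS; task = local.take()) {
                total += compute(task);
            }
            fetcher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("host interrupted", e);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(runHost(10, 2, 5)); // prints 285
    }
}
```

The buffer depth is the tuning knob in this sketch: a deeper buffer tolerates longer latency, at the cost of holding tasks that another host might have started sooner.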
Scalability
The architecture must scale to a higher degree than existing multiprocessor architectures, such as workstation clusters. Login privileges must not be required for the consumer to use a machine; such an administrative requirement limits scalability.
Robustness
An architecture that scales to thousands of computational producers must tolerate faults, particularly when participating machines, in addition to failing, can disengage from an ongoing computation.
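One simple way to tolerate a host that fails or disengages is for the server to remember which task each host holds and to return that task to the ready queue when the host is lost. The sketch below illustrates this idea under stated assumptions: all names are hypothetical, each host holds at most one task, and the detection of a lost host (for example, by a timeout) is assumed to happen elsewhere. It is not presented as the Jicos fault tolerance mechanism.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of task reassignment; not the Jicos implementation.
class ReassigningTaskServerSketch {
    private final Deque<Integer> ready = new ArrayDeque<>();          // unassigned task ids
    private final Map<String, Integer> inProgress = new HashMap<>();  // host -> task id
    private final Set<Integer> completed = new HashSet<>();

    ReassigningTaskServerSketch(int taskCount) {
        for (int id = 0; id < taskCount; id++) ready.add(id);
    }

    /** Gives the host the next ready task, or null if none is ready. */
    Integer assign(String host) {
        Integer task = ready.poll();
        if (task != null) inProgress.put(host, task);
        return task;
    }

    /** Records that the host finished the task it was holding. */
    void complete(String host) {
        Integer task = inProgress.remove(host);
        if (task != null) completed.add(task);
    }

    /** Called when a host fails or disengages: its task becomes ready again. */
    void hostLost(String host) {
        Integer task = inProgress.remove(host);
        if (task != null) ready.addFirst(task);
    }

    int completedCount() {
        return completed.size();
    }

    /** Small scenario: hostA disappears mid-task and hostB finishes all work. */
    static int simulate() {
        ReassigningTaskServerSketch server = new ReassigningTaskServerSketch(3);
        server.assign("hostA");    // hostA takes task 0
        server.assign("hostB");    // hostB takes task 1
        server.hostLost("hostA");  // task 0 returns to the ready queue
        server.complete("hostB");  // task 1 is done
        while (server.assign("hostB") != null) {
            server.complete("hostB");
        }
        return server.completedCount();
    }

    public static void main(String[] args) {
        System.out.println(simulate()); // prints 3
    }
}
```

Because a lost host's task is simply recomputed elsewhere, this approach assumes tasks can safely be executed more than once.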

Ease of use

The computation consumer distributes code and data to a heterogeneous set of machines and operating systems. This motivates using a virtual machine, in particular, the JVM. Computational producers must download, install, and upgrade system software (not just application code). Use of a screensaver/daemon obviates the need for human administration beyond the one-time installation of host software. The screensaver/daemon is a wrapper for a client (soon to be a Jini client) that will download a "task server" service proxy every time it starts, thereby automatically distributing system software upgrades.