
Assignment 2: A Basic Compute Farm

Purpose

  • Expand your experience working with Java RMI.
  • Take another step in the process of building a Java-centric network computing infrastructure.
  • Build limited fault tolerance into your infrastructure.
Specification

    Each large Internet computing project, such as SETI@home, tackles some problem that has a simple parallel decomposition. We will call such "embarrassingly parallel" problems piecework-parallel, indicating that a problem in this class has a piecework decomposition: The problem decomposes into objects that implement Task, and whose execute methods return values that can be composed into a solution to the original problem.

    Piecework Decomposition
    Fig. 1: Piecework task decomposition topology

    In this assignment, you build a basic compute farm infrastructure for hosting piecework-parallel problems. The client decomposes the problem, constructing a set of Task objects. These tasks are passed to a Space, which makes them available to compute servers, called Computer objects, which function much like those in your first assignment. The results computed by Computers are returned to the Space. The client retrieves results from the Space, composing them into a solution to the original problem.

    The API

    Result interface

    Result objects are minimally mutable.

    package api;
    
    public interface Result<T> extends java.io.Serializable
    {
        T getValue();
        
        void setValue( T value );
        
        long getRunTime(); // of the Task, as seen by the Computer that executes it
    }
    

    Task interface

    package api;
    
    public interface Task<T> extends java.io.Serializable
    {
        Result<T> execute();
    }
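
To make the two interfaces concrete, here is a minimal sketch of one possible Task/Result pair. The class names SimpleResult and SquareTask are hypothetical, not part of the required API, and the interfaces are restated without package declarations only so that the sketch compiles as a single file.

```java
import java.io.Serializable;

// Restated from the API above (package declarations omitted for a one-file sketch).
interface Result<T> extends Serializable {
    T getValue();
    void setValue(T value);
    long getRunTime();
}

interface Task<T> extends Serializable {
    Result<T> execute();
}

// Hypothetical Result implementation: holds a value and a measured run time.
class SimpleResult<T> implements Result<T> {
    private T value;
    private final long runTime; // milliseconds

    SimpleResult(T value, long runTime) {
        this.value = value;
        this.runTime = runTime;
    }

    @Override public T getValue() { return value; }
    @Override public void setValue(T value) { this.value = value; }
    @Override public long getRunTime() { return runTime; }
}

// Hypothetical Task: squares one number and reports how long that took.
class SquareTask implements Task<Long> {
    private final long n;

    SquareTask(long n) { this.n = n; }

    @Override public Result<Long> execute() {
        long start = System.nanoTime();
        long square = n * n;
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new SimpleResult<>(square, elapsedMillis);
    }
}
```

In this sketch the task times itself for simplicity. Since the run time is meant to be the time seen by the Computer, an implementation could just as well have the Computer measure the call to execute and record that value instead.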
    

    The Computer interface

    package system;
    
    public interface Computer extends java.rmi.Remote 
    {
        Result execute( Task task ) throws java.rmi.RemoteException;
    }
    

    The Client-to-Space interface

    package api;
    
    public interface Client2Space extends java.rmi.Remote
    {
        public static String SERVICE_NAME = "CLIENT_2_SPACE";
        
        void put( Task task ) throws java.rmi.RemoteException;
        
        Result take() throws java.rmi.RemoteException;
    }
    

    The client decomposes the problem into a set of Task objects, and passes them to the Space via the put method. In principle, these task objects can be processed in parallel by Computers. After passing all the task objects to the Space, the client retrieves the associated Result objects via the take method. This method blocks until a Result is available to return to the client. Thus, if the client sent 10 Task objects to the Space, it could execute:

    // space is the client's Client2Space remote reference
    Result[] results = new Result[10];
    for ( int i = 0; i < results.length; i++ )
    {
        results[i] = space.take(); // blocks until a result is available
    }

    If a particular Result needs to be associated with a particular Task (e.g., a Mandelbrot Result), this information is passed via the Result object. Based on this association, if it matters, the client composes the result values into a solution to the original problem.
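
One way to carry that association is sketched below, under stated assumptions: IndexedResult, SquareTask, LocalSpace, and SquaresClient are hypothetical names, and LocalSpace is only an in-memory stand-in for a real remote Space (it runs each task inside put and returns results in reverse order, to show that arrival order cannot be relied on). Interfaces are restated without packages so the sketch compiles on its own.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.LinkedBlockingDeque;

interface Result<T> extends Serializable { T getValue(); void setValue(T value); long getRunTime(); }
interface Task<T> extends Serializable { Result<T> execute(); }
interface Client2Space extends Remote {
    void put(Task task) throws RemoteException;
    Result take() throws RemoteException;
}

// Hypothetical Result that remembers which piece of the problem it belongs to.
class IndexedResult implements Result<Integer> {
    final int index; // identifies the task that produced this result
    private Integer value;

    IndexedResult(int index, Integer value) { this.index = index; this.value = value; }

    @Override public Integer getValue() { return value; }
    @Override public void setValue(Integer value) { this.value = value; }
    @Override public long getRunTime() { return 0L; } // timing omitted in this sketch
}

// Hypothetical Task: squares its own index and tags the result with it.
class SquareTask implements Task<Integer> {
    private final int index;
    SquareTask(int index) { this.index = index; }
    @Override public Result<Integer> execute() { return new IndexedResult(index, index * index); }
}

// In-memory stand-in for the Space, for illustration only.
class LocalSpace implements Client2Space {
    private final BlockingDeque<Result> results = new LinkedBlockingDeque<>();

    @Override public void put(Task task) { results.addFirst(task.execute()); }

    @Override public Result take() throws RemoteException {
        try {
            return results.takeFirst(); // blocks until a result is available
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RemoteException("interrupted while waiting for a result", e);
        }
    }
}

class SquaresClient {
    static int[] solve(Client2Space space, int n) throws RemoteException {
        for (int i = 0; i < n; i++) {
            space.put(new SquareTask(i));
        }
        int[] squares = new int[n];
        for (int i = 0; i < n; i++) {
            IndexedResult r = (IndexedResult) space.take();
            squares[r.index] = r.getValue(); // placed by index, not by arrival order
        }
        return squares;
    }
}
```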

    The Space

    package system;
    
    import api.*;
    
    public interface Computer2Space extends java.rmi.Remote 
    {
        public static String SERVICE_NAME = "COMPUTER_2_SPACE";
        
        void register( Computer computer ) throws java.rmi.RemoteException;
    }
    

    Faulty Computers

    For the purposes of this assignment, a computer is defined to be faulty when invoking a remote method on it throws a RemoteException. The Space accommodates faulty computers: if a computer that is running a task throws a RemoteException, the task is assigned to another computer.

    The Space implementation's main method instantiates a Space object and binds it into its rmiregistry.

    The Space's implementation of register should instantiate a ComputerProxy, which runs as a separate thread. This thread's run method loops forever, taking an available task, invoking its associated Computer's execute method on the task, and putting the returned Result object in a data structure for retrieval by the client. Java's LinkedBlockingQueue may be useful.
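
The proxy loop and the handling of faulty computers might be sketched as follows. This is only one possible design, not a reference implementation: SpaceSketch and its two queue fields are hypothetical, RMI export and registry binding are left out, and retiring the proxy after a RemoteException (rather than retrying the same computer) is an assumption. Interfaces are restated without packages so the sketch compiles as one file.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

interface Result<T> extends Serializable { T getValue(); void setValue(T value); long getRunTime(); }
interface Task<T> extends Serializable { Result<T> execute(); }
interface Computer extends Remote {
    Result execute(Task task) throws RemoteException;
}

// Hypothetical Space internals: two shared queues plus one proxy thread per Computer.
class SpaceSketch {
    final BlockingQueue<Task> tasks = new LinkedBlockingQueue<>();
    final BlockingQueue<Result> results = new LinkedBlockingQueue<>();

    public void register(Computer computer) {
        ComputerProxy proxy = new ComputerProxy(computer);
        proxy.setDaemon(true); // assumption: lets a small demo JVM exit on its own
        proxy.start();
    }

    class ComputerProxy extends Thread {
        private final Computer computer;

        ComputerProxy(Computer computer) { this.computer = computer; }

        @Override public void run() {
            while (true) {
                Task task = null;
                try {
                    task = tasks.take();                 // blocks until a task is available
                    results.put(computer.execute(task)); // a remote call in the real system
                } catch (RemoteException e) {
                    tasks.add(task); // faulty computer: hand the task back for another proxy
                    return;          // and stop using this computer
                } catch (InterruptedException e) {
                    return;
                }
            }
        }
    }
}
```

Because the failed task goes back on the shared task queue, any other registered computer's proxy can pick it up, which is what gives the limited fault tolerance described under Faulty Computers.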

    The Computer Implementation


    Fig. 2: The client-Space-computer architecture.

    Task classes

    For each of the Task classes that you defined in the first assignment, define a corresponding Task class that solves part of the original problem. The decompositions need not be masterpieces of efficiency. For the Traveling Salesman Problem, partition the set of all possible tours into p parts. For example, if there are n cities, you can partition the set of tours into n - 1 parts: those that begin with cities
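
The sentence above is cut off in this copy, so the following is only one way to realize an n - 1 part partition, not necessarily the one intended: every tour is assumed to start at city 0, and each part fixes a different second city. TourPart is a hypothetical name, and it is written as a plain class for brevity; in the assignment the same search would sit inside a Task whose execute method wraps the answer in a Result.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical piece of a TSP instance: all tours of the form 0 -> second -> ... -> 0.
class TourPart {
    private final double[][] dist; // dist[i][j] = distance from city i to city j
    private final int second;      // every tour in this part visits this city right after city 0

    TourPart(double[][] dist, int second) {
        this.dist = dist;
        this.second = second;
    }

    // Length of the shortest tour in this part, found by exhaustive search.
    double shortest() {
        boolean[] used = new boolean[dist.length];
        used[0] = true;
        used[second] = true;
        return dist[0][second] + extend(second, used, 2);
    }

    // Cheapest way to visit every unused city starting from 'last', then return to city 0.
    private double extend(int last, boolean[] used, int visited) {
        if (visited == dist.length) {
            return dist[last][0];
        }
        double best = Double.POSITIVE_INFINITY;
        for (int next = 1; next < dist.length; next++) {
            if (used[next]) continue;
            used[next] = true;
            best = Math.min(best, dist[last][next] + extend(next, used, visited + 1));
            used[next] = false;
        }
        return best;
    }

    // One part per choice of second city: n - 1 parts for n cities.
    static List<TourPart> decompose(double[][] dist) {
        List<TourPart> parts = new ArrayList<>();
        for (int second = 1; second < dist.length; second++) {
            parts.add(new TourPart(dist, second));
        }
        return parts;
    }
}
```

The client would compose the partial answers by taking the minimum over all parts, since every tour belongs to exactly one part.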

    The clients

    Define a client that:

    Deployment

    Repeat the above steps for c = 1, 2, 4, and 8. Graph the completion times over c, for each problem type. For the case of c = 1, get completion time for 3 deployment scenarios:

    Analysis

    Paper Summary

    Submit a 1-page summary, entirely in your own words, of the paper titled, "How to Build a ComputeFarm."

    Deliverable

    Mail <cappello@cs.ucsb.edu> a jar file, named <name>.jar, where <name> is the CS computer account username of 1 member of the pair. It should include the following directories and files:

    Directories

    Files



     cappello@cs.ucsb.edu 2009.04.21