Assignment 3: A Divide-and-Conquer API
Purpose
- Expand your experience working with Java RMI
- Augment the functionality of your Java-centric cluster computing system to support a divide-and-conquer (DAC) API.
Specification
The DAC API
Design a DAC API. For design ideas, please see the Cilk paper. Cilk allows any acyclic directed graph of threads. Your api needs to support only "fully strict" (i.e., purely DAC) programs. You may require that a task either:
- Produces a single result object as its output
-
Constructs:
- subtasks
- a "compose" task whose inputs are the single output from each of the subtasks and whose output is some function of its inputs (i.e., subtask outputs).
The implementation of this API involves the cooperation of the Task, the Computer, and the Space. Please look at the specification of a Closure in the Cilk paper. Relate this to the infomation requirements of a Task in your system. The Space needs to distinguish between tasks that have all their arguments (and thus are ready to be executed by Computers) from those that do not. The scheduling algorithm you use to distribute ready tasks to Computers is unspecified: You can do whatever you want.
If task t computes an output o (as opposed to computing subtasks), it must communicate o to its successor task s, which needs o as an input: When t produces o, the Space needs to know which task is s. Thus, you may need some form of task identifier. The Space also needs to know which of s's inputs o is, so the Space can set the appropriate input element of s.
If your DAC api is different for different DAC problems, then it is not truly an "api."
Fault tolerance
Although your system needs to tolerate faulty Computers in the same sense as the previous programming assignment, it does not need to be robust in the presence of faulty application programs.
Application development
- Define the nth Fibonacci number, F(n), recursively:
- For 0 ≤ n < 2, F(n) = n.
- For 2 ≤ n, F(n) = F(n - 1) + F(n - 2).
- Refactor TSP to use your DAC API.
The clients
By client, I mean the class that uses the Space to solve a problem. You may wish to refactor your work from previous assignments to serve this purpose.
For each application, define a client that:
- gets the SpaceImpl's machine domain name, as done in previous programming assignments;
- gets a remote reference to the Space service from the rmiregistry on the Space's physical machine.
- instantiates a problem instance;
- F(16)
- EuclideanTsp problem instance paramenter values: Same as the last assignment.
- logs the arguments.
- sends the root task T to the Space;
- retrieves the root result from the Space
- displays the solution to the root task (i.e., the problem)
- logs the root task's elapsed time as seen by the client.
Experiments
Perform the same experiments as you did in the last programming assignment for TSP, but adding the Fibonacci application, and for c = 1, 2, and 4 (Computers) for both applications. Although it is fine for the client and Space to run in separate JVMs on the same physical machine, the Space and each computer run in different physical machines.
Let P be a problem instance (e.g., an instance of the TSP). Let TC be the elapsed time to complete P using C Computers. For each problem, P, plot P's parallel efficiency: EP{C) = T1 / (C TC).
Design
Provide a design document that is at most 2 pages in which you:
- Identify the key deisgn issues;
- Provide an architecture and high level object-oriented design - diagrams welcome;
- Explain how your architecture & design address the key issues that you identified.
Deliverables
The deliverables are the same as those for the previous programming assignment. Make sure that you provide your:
- source and any supporting library;
- efficiency graphs described in the Experiments section above;
- design document;
- build.xml with Ant targets for all operations needed in your demonstration.