CX project - Quick start

So, what is the CX environment?
What are the CX tasks?
How do I break a problem into smaller tasks using the CX environment?
What are the basic components?
What do I need to know in order to write code for the CX environment?

The CX environment

The basic idea behind the CX project, as with every other metacomputing environment, is this: "If we could combine the computational resources of devices (mainly PCs) that would otherwise be wasted, we would gain something valuable, not only because we take advantage of unused cycles but, more importantly, because we could create a system whose computational power exceeds that of every existing system and makes feasible computations that would otherwise be infeasible."

In order to do so, we must break our problem into smaller problems (divide-and-conquer) and, after running these smaller problems on a potentially huge number of different nodes, collect the intermediate results and combine them in a meaningful way. How to decompose the initial problem into smaller problems is up to the application programmer, who must use the CX API. Since the most important concept in the CX project is the notion of a task, the following section briefly describes what a task is and how it resembles a function.

The CX task

If you have already written programs in a traditional language (for instance C), you are familiar with the notion of a function. Each function does something different (it has different functionality), but all functions have some things in common:

All functions have zero or more input arguments
All functions have an output (or return) value, unless the return type is "void"
All functions have some properties/attributes. For instance, in Java a method might be static, synchronized, private or public.

Similar observations apply to CX tasks. Each task has some input arguments, some output arguments and some properties. Some of them might be null, and their number is not always the same, but space is always reserved for them, and when they are present, the procedure to get and/or set their values is always the same.
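To make the analogy concrete, here is a minimal sketch of a task as a record with reserved input slots, reserved output slots and a set of properties. The class name TaskSketch and all of its methods are hypothetical and chosen only for illustration; they are not the CX API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical task record, for illustration only; not the CX API.
final class TaskSketch {
    private final Object[] inputs;   // reserved input slots; null = not yet set
    private final Object[] outputs;  // reserved output slots; null = not yet set
    private final Map<String, String> properties = new HashMap<>();

    TaskSketch(int inputCount, int outputCount) {
        inputs = new Object[inputCount];
        outputs = new Object[outputCount];
    }

    // The same get/set procedure applies whatever the task computes.
    void setInput(int slot, Object value) { inputs[slot] = value; }
    Object getInput(int slot) { return inputs[slot]; }

    void setOutput(int slot, Object value) { outputs[slot] = value; }
    Object getOutput(int slot) { return outputs[slot]; }

    void setProperty(String name, String value) { properties.put(name, value); }
    String getProperty(String name) { return properties.get(name); }

    int inputCount() { return inputs.length; }
}
```

A task with two input slots keeps both slots reserved even while only one of them has been filled, which mirrors the "some of them might be null" remark above.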

Divide and conquer

The Fibonacci numbers, which are used as a demo application, are a good example for demonstrating these ideas.

In the above figure, instead of evaluating Fibonacci(2) directly, which would take centuries (OK, I’m exaggerating a little bit!), we evaluate Fibonacci(1) and Fibonacci(0), and then combine their intermediate results to produce Fibonacci(2), using the definition of the function:

Fibonacci(2) = Fibonacci(1) + Fibonacci(0)

Talking in terms of tasks, this means that instead of executing the task "compute Fib(2)", we break this task into three other tasks:

Fib(1), which will compute Fibonacci(1)
Fib(0), which will compute Fibonacci(0)
Fib(1,0), which will combine the intermediate results of Fib(1) and Fib(0) in order to produce Fib(2). In this particular case, the intermediate results are simply added.
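The same decomposition can be sketched in plain Java, with ordinary method calls standing in for tasks. This is only an illustration of the divide-and-conquer idea, not CX code: the class FibSketch and its methods are hypothetical names, and in a real CX application each recursive call would be a separate task running on some node.

```java
// Illustrative only: plain Java methods standing in for CX tasks.
final class FibSketch {
    // "Compose" step: combine two intermediate results by adding them.
    static long compose(long a, long b) {
        return a + b;
    }

    // "Decompose" step: Fib(n) is split into Fib(n-1) and Fib(n-2),
    // and the two intermediate results are then composed.
    static long fib(int n) {
        if (n < 2) {
            return n;               // base cases: Fib(0) = 0, Fib(1) = 1
        }
        long left = fib(n - 1);     // would be a separate task, e.g. Fib(1)
        long right = fib(n - 2);    // would be a separate task, e.g. Fib(0)
        return compose(left, right);
    }
}
```

For n = 2, the call fib(2) evaluates fib(1) and fib(0) and passes both results to compose, matching the three tasks listed above.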

The input/output arguments for those tasks, as well as their properties are given in the table below:

The sole input argument for task T2, which corresponds to Fib(1), is the number 1, while the sole output argument is T4#1, which basically means “send your output results to task T4 as input argument #1”. The attributes for T2 are STATUS:READY, which means that the task is ready to be executed, and ID=T2, which gives the (unique) ID of this task. In reality, there are many more attributes, and their format is not exactly like that, but the basic idea remains the same. Note, for instance, task T4: when it is created by task T1, it has two missing input arguments, which will be provided by tasks T2 and T3, and therefore it cannot be executed immediately after its creation (you probably also noticed that its status is marked as waiting).
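The way a waiting task becomes ready can be sketched as follows. This is a rough model under stated assumptions, not the actual CX mechanism: the names RoutingSketch, PendingTask, register and deliver are hypothetical, the status is reduced to the two values mentioned above, and argument numbers in a destination such as "T4#1" are assumed to start at 1.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "send your result to task T4 as input argument #1".
final class RoutingSketch {
    static final class PendingTask {
        final String id;
        final Long[] inputs;            // a null slot means "still missing"

        PendingTask(String id, int inputCount) {
            this.id = id;
            this.inputs = new Long[inputCount];
        }

        // WAITING until every input slot has been filled, then READY.
        String status() {
            for (Long in : inputs) {
                if (in == null) {
                    return "WAITING";
                }
            }
            return "READY";
        }
    }

    private final Map<String, PendingTask> tasks = new HashMap<>();

    void register(PendingTask task) {
        tasks.put(task.id, task);
    }

    // destination has the form "<taskId>#<argumentNumber>", e.g. "T4#1";
    // argument numbers are assumed (not confirmed) to start at 1.
    void deliver(String destination, long value) {
        String[] parts = destination.split("#");
        PendingTask target = tasks.get(parts[0]);
        int slot = Integer.parseInt(parts[1]) - 1;
        target.inputs[slot] = value;
    }
}
```

In this model, T4 is registered with two empty slots and stays in the waiting state after T2 delivers to "T4#1"; it becomes ready only once T3 has delivered its result to the second slot.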

Basic components

As we saw in the previous discussion, tasks can create (or spawn) new tasks, just as functions can call other functions. And just as we use the function main() as our entry point in C, in the CX environment there must be an initial task that is submitted to the production network. The process responsible for that is called the consumer, since by submitting the task it “consumes” computation power, and its main responsibilities are:

To submit the initial task
To process the returned results

The initial task is sent to the Task Server, which stands between the consumer and the producers that do the actual computation. Since Task Servers isolate the consumer from the producers, the application programmer does not have to know anything about the producers.
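The division of labor between the three roles can be sketched in a single Java process. Everything here is a hypothetical stand-in (NetworkSketch, TaskServer, produce and runOnce are illustrative names, not CX classes), and a real production network runs these roles as separate processes rather than as method calls.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Supplier;

// Hypothetical, single-process stand-ins for the three CX roles.
final class NetworkSketch {
    // Task server: holds submitted tasks until a producer asks for one,
    // and holds results until the consumer collects them.
    static final class TaskServer {
        private final Queue<Supplier<Long>> pending = new ArrayDeque<>();
        private final Queue<Long> results = new ArrayDeque<>();

        void submit(Supplier<Long> task) { pending.add(task); }
        Supplier<Long> nextTask() { return pending.poll(); }
        void report(long result) { results.add(result); }
        Long nextResult() { return results.poll(); }
    }

    // Producer: takes tasks from the server, executes them, reports results.
    static void produce(TaskServer server) {
        Supplier<Long> task;
        while ((task = server.nextTask()) != null) {
            server.report(task.get());
        }
    }

    // Consumer side: submit the initial task, then process the returned result.
    // The consumer only ever talks to the task server, never to a producer.
    static long runOnce(Supplier<Long> initialTask) {
        TaskServer server = new TaskServer();
        server.submit(initialTask);   // consumer -> task server
        produce(server);              // a producer does the actual work
        return server.nextResult();   // task server -> consumer
    }
}
```

Note that runOnce never touches the producer's internals: it only submits to and reads from the task server, which is the isolation described above.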

Writing CX code

From the application programmer's point of view, all tasks run within a container application, which can be either the producer or the task server. It is therefore important to understand the CX API before going any further. Assuming you are already familiar with it, the next step is to decide how to split your initial problem into smaller ones. In general, you will end up with two types of tasks:

the “decompose” type, which spawns new tasks when executed (with a smaller problem size than their parent task)
the “compose” type, which combines the intermediate results (of other tasks) in a meaningful way.

For instance, in the example given above T1 is of type “decompose”, while T4 is of type “compose” because it combines the intermediate results of T2 and T3 by adding them.

After you have defined your tasks, the rest of the process is pretty straightforward. The only thing you have to do is to create the consumer application, which will submit the initial task and process the returned results. The task servers, as well as the producers, are provided to you, so you don’t have to implement them.

In order to test your code, you will need a production network. The simplest production network consists of one consumer, one task server and one producer. How to start them is described in another section of the CX tutorial. Good luck!