CS290I -- Project 2: You are Invited to a Party


The Search for Bounds on R(6,6)

There are two pedagogical goals for this project (three if you are truly ambitious):
  1. to gain further Grid programming experience by building an adaptively scheduled program that can migrate state after it begins running,
  2. to explore scheduling algorithms that attempt to optimize more than just execution turn-around time as a performance metric, and
  3. to improve the known bounds for the 6th symmetric Ramsey number, R(6,6).
To do so, you will build a distributed Ramsey Number search program that attempts to identify counter-example graphs using the machines here in the department. As always, you will need to use some Grid infrastructure pieces we provide for you and you must make sure that you use the correct file formats so that the TAs can check your results.

PLEASE read everything listed in this project description CAREFULLY (with care, slowly, several times, thoroughly) before losing control and attempting to move in with the TA. It is fairly straight-forward to build this program, but you have to understand all of the components thoroughly.


Step 1 -- Read the Lecture Notes

Read and understand each and every word written in the Ramsey Number lecture notes. Even if you did attend the lecture, you were awake, and you believe that you know everything there is to know about Ramsey numbers, you should read those lecture notes.

Step 2 -- Representing Ramsey Graph Colorings in a File

For this project, you should represent your two-colored, fully interconnected graphs as two-dimensional arrays of integers. Let the number zero (0) represent one color, and the number one (1) represent the other. The indices of the arrays represent nodes and the graph arrays must be in row major order. The first line of a graph file should be the size of the array dimensions, in number of elements. Since the graphs are undirected, the graph arrays will be square. It is sufficient just to list only one dimension size, although you can list both. The graph reading function in the file test_clique_count.c will take the first number on the first line as the dimensionality of the array.

Next, you should put each row of your array on a separate line in the file that is to contain the array. Note that all line breaks in the file are significant and must be present.

For example

10
0 0 1 1 1 0 1 1 0 0
0 0 0 1 0 1 1 1 0 1
0 0 0 0 1 1 0 1 1 0
0 0 0 0 1 0 1 0 1 1
0 0 0 0 0 0 1 1 1 1
0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 1 1 0
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0
is a valid representation of a two-color graph on 10 nodes. If we label the indices from 0, then a "1" in element [0,2] indicates that the edge linking node 0 and node 2 in the graph has color "1" (call it green). A "0" in element [2,6] indicates that the edge linking node 2 and node 6 has color "0" (call it red).

Notice that only elements on one side of the diagonal matter. That is because [2,6] and [6,2] mean the same thing: the edge from node 2 to node 6 is the same edge that goes from node 6 to node 2. Technically, you could make the graph symmetric across the diagonal ([i,j] == [j,i] for all i and j) but the clique counter in clique_count.c will only consider the numbers above the diagonal as valid (it will ignore the others). Notice also that the numbers on the diagonal don't matter since there are no edges [i,i] in the graph.

Requirement 1: You must turn in your best counter example in this format. The TAs will be using a counter-example checker that recognizes this file format and if they cannot verify your answer, they will not be able to give you credit.

Helpful Hint 1: Acquaint yourself with the functions ReadGraph() and PrintGraph() in test_clique_count.c. If you use these functions for your Graph file you will be assured of the correct formatting.
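
If it helps to see the format spelled out in code, here is a minimal sketch of a writer. It is not the PrintGraph() from test_clique_count.c; the name write_graph_file() is made up for illustration, and it assumes the graph is a row-major int array as described above.

```c
#include <stdio.h>

/*
 * Hypothetical helper (NOT part of test_clique_count.c): write a
 * gsize x gsize row-major graph in the file format described above.
 * Returns 0 on success, -1 on error.
 */
int write_graph_file(const char *path, const int *g, int gsize)
{
	FILE *fp = fopen(path, "w");
	int i;
	int j;

	if (fp == NULL)
		return -1;

	/* first line: the array dimension */
	fprintf(fp, "%d\n", gsize);

	/* one row per line, entries separated by single spaces */
	for (i = 0; i < gsize; i++) {
		for (j = 0; j < gsize; j++) {
			fprintf(fp, "%d%s", g[i*gsize + j],
				(j == gsize - 1) ? "\n" : " ");
		}
	}

	return (fclose(fp) == 0) ? 0 : -1;
}
```

Whatever writer you end up using, run the resulting file through test_clique_count before you turn it in, since that is the format the checker understands.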


Step 3 -- Counting Monochromatic 6-Cliques

The next step in the process is to realize that a two-colored graph on K nodes is a Ramsey counter-example for R(6,6) of size K if and only if the number of monochromatic subcliques of size 6 is equal to 0 (zero). The code in clique_count.c takes a one-dimensional array of integers and a dimension size as arguments (representing the two-dimensional clique array discussed above) and will count up the number of monochromatic subcliques of size 6 that graph contains. The code in test_clique_count.c will perform this mundane task on a clique file passed as the "-f" argument.

For example, assume that

10
0 1 1 1 1 1 1 1 0 0 
0 0 0 1 1 1 1 1 1 1 
0 0 0 0 1 1 1 1 1 0 
0 0 0 0 1 1 1 1 1 1 
0 0 0 0 0 0 1 1 1 1 
0 0 0 0 0 0 0 1 1 1 
0 0 0 0 0 0 0 1 1 1 
0 0 0 0 0 0 0 0 0 1 
0 0 0 0 0 0 0 0 0 1 
0 0 0 0 0 0 0 0 0 0 
is stored in a file called "clique.10". The output from the program test_clique_count -f clique.10 is
bash$ test_clique_count -f clique.10
file clique.10 contains a graph of size 10 with 3 monochromatic cliques
bash$
That means, there are three subcliques of size 6, all of the same color (probably green) in this graph.
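
To make the definition concrete, here is a brute-force sketch of such a count. It is not the code in clique_count.c (which may well be organized differently and run faster); the name count_mono6() is hypothetical. It simply enumerates every 6-node subset and checks whether all 15 edges among those nodes share one color, reading only entries above the diagonal.

```c
/*
 * Hypothetical brute-force counter (NOT the code in clique_count.c).
 * g is a row-major gsize x gsize array; only entries above the
 * diagonal are read.  Returns the number of 6-node subsets whose
 * 15 edges all have the same color.
 */
long count_mono6(const int *g, int gsize)
{
	int v[6];
	int a;
	int b;
	int color;
	int mono;
	long count = 0;

	/* v[0] < v[1] < ... < v[5] walks every 6-node subset once */
	for (v[0] = 0; v[0] < gsize; v[0]++)
	for (v[1] = v[0]+1; v[1] < gsize; v[1]++)
	for (v[2] = v[1]+1; v[2] < gsize; v[2]++)
	for (v[3] = v[2]+1; v[3] < gsize; v[3]++)
	for (v[4] = v[3]+1; v[4] < gsize; v[4]++)
	for (v[5] = v[4]+1; v[5] < gsize; v[5]++) {
		/* color of the first edge is the color all others must match */
		color = g[v[0]*gsize + v[1]];
		mono = 1;
		for (a = 0; a < 6 && mono; a++) {
			for (b = a+1; b < 6; b++) {
				if (g[v[a]*gsize + v[b]] != color) {
					mono = 0;
					break;
				}
			}
		}
		count += mono;
	}

	return count;
}
```

On the 10 x 10 example above this sketch also arrives at 3. It does far more work than necessary on large graphs, so treat it as a way to check your understanding, not as a replacement for CliqueCount().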

Step 4 -- Looking for the Biggest Possible Counter Example

The goal of your program is to find the largest counter example (a two-color graph with zero monochromatic subcliques of size 6) that you can using the minimum amount of CPU to do it. If you can do that for any size of 102 or higher, you will have improved on a mathematical result that was derived in 1965.

You are free to practice whatever methodology suits your whim or esthetic taste in this pursuit (including just searching randomly) but you should be warned: for large dimensions, the number of monochromatic cliques can be huge. The routine CliqueCount() returns an integer. If the count gets bigger than about 2 billion, you will start to get negative numbers and 4 billion will cause an overflow. Further, the speed with which CliqueCount() runs is a function of how many subcliques there are: the greater the number of subcliques, the slower it runs.
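
To get a feel for how big the count can get, note that a graph colored entirely with one color has a monochromatic 6-clique for every 6-node subset, so the count can never exceed "n choose 6". The sketch below computes that bound in 64 bits. The function name subsets_of_six() is made up for illustration, and the comparison against 2 billion assumes a 32-bit int.

```c
#include <stdint.h>

/*
 * Number of 6-node subsets of an n-node graph ("n choose 6"),
 * computed in 64 bits.  This is the largest value a monochromatic
 * 6-clique count can take on a graph of size n.
 */
int64_t subsets_of_six(int n)
{
	int64_t r = 1;
	int k;

	if (n < 6)
		return 0;

	/*
	 * After step k, r holds "(n-6+k) choose k", so each division
	 * is exact and no fractions are lost along the way.
	 */
	for (k = 1; k <= 6; k++)
		r = r * (n - 6 + k) / k;

	return r;
}
```

For n = 102 this gives 1,346,548,665, which still fits in a signed 32-bit integer, but the bound grows quickly with n. If you keep your own running totals, a 64-bit type is the safer choice.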

Helpful Hint 2: If you think about things a bit, you'll see that buried in every counter-example of size K is a counter-example of size K-1. Think of it this way. If you had a counter example of size K and you removed one node and all of the edges that had it as one end-point, you would not have introduced any new colors or new edges into the resulting smaller graph. About all you could have done was remove one or more bi-chromatic subcliques leaving a graph with a smaller number of nothing but bi-chromatic subcliques.

Therefore, if you have discovered a counter-example of size K you can hypothesize that it must be embedded in a counter-example of size K+1.
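
The node-removal argument in Hint 2 can be sketched in a few lines. The helper below is hypothetical (it is not in the provided files) and assumes the same row-major int layout used throughout. It builds the smaller graph by dropping one row and the matching column, leaving every surviving edge with its original color.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: return a newly allocated (gsize-1) x (gsize-1)
 * graph equal to g with node `victim` and all of its edges removed.
 * The relative order of the remaining nodes is preserved, so entries
 * above the diagonal stay above the diagonal.
 * Returns NULL on bad arguments or allocation failure; the caller
 * frees the result.
 */
int *remove_node(const int *g, int gsize, int victim)
{
	int nsize = gsize - 1;
	int *h;
	int i;
	int j;
	int ni;
	int nj;

	if (gsize < 2 || victim < 0 || victim >= gsize)
		return NULL;

	h = (int *)malloc((size_t)nsize * nsize * sizeof(int));
	if (h == NULL)
		return NULL;

	ni = 0;
	for (i = 0; i < gsize; i++) {
		if (i == victim)
			continue;	/* skip the victim's row */
		nj = 0;
		for (j = 0; j < gsize; j++) {
			if (j == victim)
				continue;	/* skip the victim's column */
			h[ni*nsize + nj] = g[i*gsize + j];
			nj++;
		}
		ni++;
	}

	return h;
}
```

If g has a clique count of zero, the graph returned here does too, for any choice of victim. That is one way to sanity-check a counter example you think you have found.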

Helpful Hint 3: Another realization that you might wish to have is that for a graph of size K the closer the subclique count is to 0, the closer (in some sense) the graph is to a counter-example of size K. To see why, imagine what happens if you had a counter-example of size K and you flipped one edge from red to green. Chances are that the number of monochromatic subcliques you would introduce would be small (compared to the total number of monochromatic subcliques there would be in, say, a random coloring).

Developing Re-coloring Heuristics

If you take Hints 2 and 3 to heart, you can develop a set of search heuristics that all have the same general form outlined in the following pseudo-code:

let K = some small number bigger than 6
initialize all elements of a K x K graph g randomly (or to zero)
oldcount = CliqueCount(g,K)
while(K < 102)
	pick an edge that you wish to re-color according to some heuristic
	flip that edge color
	count = CliqueCount(g,K)
	if(count == 0)
		counter-example of size K found
		add a row and column to g making graph one size bigger
		initialize the new row and column in some way
		K = K + 1
		oldcount = CliqueCount(g,K)
		continue
	else if(count < oldcount)
		keep the edge that color
		oldcount = count
	else
		flip it back to the original color
end while

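When the loop terminates, you have a new mathematical result. To use this form, the parts you have to fill in are
  1. how to initialize the very first graph,
  2. how to pick the edge to re-color at each iteration, and
  3. how to initialize the new row and column each time the graph grows.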

Helpful Hint 4: For small graphs (under size 10) many different colorings will produce counter examples. You can probably initialize your very first graph (if it is small enough) with all 0s or randomly and a very few edge flips will probably produce a counter example. The first 10 x 10 example in this project description I produced randomly and it is a counter-example. To get the second one I had to add enough 1s to introduce a monochromatic subclique.

In the file simple_search.c you can see a really simple example of using this approach. Here is the main body of the code.


int
main(int argc,char *argv[])
{
	int *g;
	int *new_g;
	int gsize;
	int count;
	int i;
	int j;
	int best_count;
	int best_i;
	int best_j;

	/*
	 * start with graph of size 8
	 */
	gsize = 8;
	g = (int *)malloc(gsize*gsize*sizeof(int));
	if(g == NULL)
		exit(1);

	/*
	 * start out with all zeros
	 */
	memset(g,0,gsize*gsize*sizeof(int));

	/*
	 * while we do not have a publishable result
	 */
	while(gsize < 102)
	{
		/*
		 * find out how we are doing
		 */
		count = CliqueCount(g,gsize);

		/*
		 * if we have a counter example
		 */
		if(count == 0)
		{
			printf("Eureka!  Counter-example found!\n");
			PrintGraph(g,gsize);
			/*
			 * make a new graph one size bigger
			 */
			new_g = (int *)malloc((gsize+1)*(gsize+1)*sizeof(int));
			if(new_g == NULL)
				exit(1);
			/*
			 * copy the old graph into the new graph leaving the
			 * last row and last column alone
			 */
			CopyGraph(g,gsize,new_g,gsize+1);

			/*
			 * zero out the last column and last row
			 */
			for(i=0; i < (gsize+1); i++)
			{
				new_g[i*(gsize+1) + gsize] = 0; // last column
				new_g[gsize*(gsize+1) + i] = 0; // last row
			}

			/*
			 * throw away the old graph and make new one the
			 * graph
			 */
			free(g);
			g = new_g;
			gsize = gsize+1;

			/*
			 * keep going
			 */
			continue;
		}

		/*
		 * otherwise, we need to consider flipping an edge
		 *
		 * let's speculatively flip each edge, record the new count,
		 * and unflip the edge.  We'll then remember the best flip and
		 * keep it next time around
		 *
		 * only need to work with upper triangle of matrix =>
		 * notice the indices
		 */
		best_count = 9999999;
		for(i=0; i < gsize; i++)
		{
			for(j=i+1; j < gsize; j++)
			{
				/*
				 * flip it
				 */
				g[i*gsize+j] = 1 - g[i*gsize+j];
				count = CliqueCount(g,gsize);

				/*
				 * is it better?
				 */
				if(count < best_count)
				{
					best_count = count;
					best_i = i;
					best_j = j;
				}

				/*
				 * flip it back
				 */
				g[i*gsize+j] = 1 - g[i*gsize+j];
			}
		}
		
		/*
		 * keep the best flip we saw
		 */
		g[best_i*gsize+best_j] = 1 - g[best_i*gsize+best_j];

		printf("best_count: %d, best edge: (%d,%d), new color: %d\n",
			best_count,
			best_i,
			best_j,
			g[best_i*gsize+best_j]);
		/*
		 * rinse and repeat
		 */
	}


	return(0);

}
It initializes the very first graph of size 8 to all zeros. When it picks an edge, it picks the one that makes the count smallest at each iteration. Finally, when it adds a row and a column, it fills them in with zeros. This code will find counter examples up to size 25 before it stops making progress. Why? Think about it a bit and you'll see that it can end up picking the same edge flip every time. Look at the last part of the output generated by the print statement right at the bottom of the main while loop:
best_count: 2500, best edge: (0,25), new color: 1
best_count: 1875, best edge: (1,25), new color: 1
best_count: 1250, best edge: (10,25), new color: 1
best_count: 625, best edge: (15,25), new color: 1
best_count: 1, best edge: (20,25), new color: 1
best_count: 1, best edge: (0,2), new color: 1
best_count: 1, best edge: (0,2), new color: 0
best_count: 1, best edge: (0,2), new color: 1
best_count: 1, best edge: (0,2), new color: 0
best_count: 1, best edge: (0,2), new color: 1
best_count: 1, best edge: (0,2), new color: 0
best_count: 1, best edge: (0,2), new color: 1
It is stuck flipping edge (0,2) back and forth between colors 0 and 1. What this means is that the graph has the same monochromatic clique count (1 in this case) regardless of what color this edge is. What should you do about it? Remember that you just flipped (0,2) and pick another edge. And there you have it. You are on your way. All of the search routines you come up with will have this flavor. Your goal is to keep them from going into loops like this.
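
One simple way to "remember that you just flipped (0,2)" is a short list of recently flipped edges that the search refuses to touch again for a while (often called a tabu list). The sketch below is an assumption about how you might do it, not something in simple_search.c; the names and the length TABU_LEN are illustrative.

```c
#define TABU_LEN 16	/* illustrative length; tune it for your search */

/* ring buffer holding the most recently flipped edges */
struct tabu {
	int i[TABU_LEN];
	int j[TABU_LEN];
	int next;	/* slot that the next add will overwrite */
	int used;	/* number of slots currently holding an edge */
};

void tabu_init(struct tabu *t)
{
	t->next = 0;
	t->used = 0;
}

/* record edge (i,j), evicting the oldest entry once the list is full */
void tabu_add(struct tabu *t, int i, int j)
{
	t->i[t->next] = i;
	t->j[t->next] = j;
	t->next = (t->next + 1) % TABU_LEN;
	if (t->used < TABU_LEN)
		t->used++;
}

/* return 1 if edge (i,j) was flipped recently, 0 otherwise */
int tabu_contains(const struct tabu *t, int i, int j)
{
	int k;

	for (k = 0; k < t->used; k++) {
		if (t->i[k] == i && t->j[k] == j)
			return 1;
	}
	return 0;
}
```

In the inner loop of simple_search.c you would skip any (i,j) for which tabu_contains() returns 1, and call tabu_add(&t, best_i, best_j) after keeping the best flip. A list like this only breaks cycles shorter than its length, so it is a starting point and not a cure.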

Helpful Hint 5: This remembering business is really the rub. We'll try to illustrate with another example. Say you've just flipped edge [4,7] from red to green and the clique count drops from 25 to 23. Next, you look for an edge to flip that will make the count go below 23 and you don't find one. What do you do? You have several choices, including un-flipping [4,7] but that produced an improvement so you might want to keep that flip. Let's say you choose to flip the edge that makes the count worse by the least amount. Say, flipping edge [9,11] changes the count to 24. Again you search and you can't make the count go any lower so you choose the least damaging flip and that turns out to be [4,7]. Say you flip it and the count goes to 25. You search again and you find that flipping [9,11] is the least damaging flip. This time flipping it leaves the count at 25. You search and you find that, in fact, flipping [4,7] will drop the count to 23, so you choose it. If your algorithm is deterministic (meaning you make the decisions the same way at each point) you will continue this cycle indefinitely.

"Of course," you say, "all you have to do is remember that you flipped [4,7] originally and you avoid reflipping it." Okay, but if you take this literally, you will get to make at most K(K-1)/2 flips (one for each edge) and then you will have to stop. It may be, though, that for the given coloring of g at the moment, flipping [4,7] made the count go lower, but in an actual counter example, you would want it flipped back. Really, what you want to do is say "Given that the entire graph has this coloring, I never want to flip [4,7] back from green to red." If your graph is in some state, and you flip an edge in one direction, any time you return to that state, if you make the same flip, you will be in a loop because your deterministic algorithm will keep making exactly the same decisions that returned you to this state.

"Ah ha!", you say, "I'll just remember the graph configuration that goes with each flip." Yes, but that takes memory -- potentially lots of memory, and now you have another search problem. First, you must remember each configuration you've seen. If you are using a K x K matrix for each one, you have a space problem. You might be able to hash (compress) the representation in some way, but now you have to search the representation list every time you consider a move to see if this move is legal given the current configuration. The number of moves could be absolutely huge. Tough problem, no? This particular issue, I believe, is why no one has improved on the bounds for R(6,6) since 1965.

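As one illustration of the "hash (compress)" idea, the sketch below folds the upper triangle of a graph into a single 64-bit value using the FNV-1a hash. This is just one possible choice, picked here because it is short; the function name graph_hash() is hypothetical. Two different colorings can collide, so a matching hash means "probably seen before", not "certainly seen before".

```c
#include <stdint.h>

/*
 * Hypothetical helper: 64-bit FNV-1a style hash of the edge colors
 * above the diagonal of a row-major gsize x gsize graph.  Entries on
 * or below the diagonal are ignored, so two arrays that describe the
 * same coloring hash to the same value.
 */
uint64_t graph_hash(const int *g, int gsize)
{
	uint64_t h = 14695981039346656037ULL;	/* FNV-1a offset basis */
	int i;
	int j;

	for (i = 0; i < gsize; i++) {
		for (j = i+1; j < gsize; j++) {
			h ^= (uint64_t)(g[i*gsize + j] & 1);
			h *= 1099511628211ULL;	/* FNV 64-bit prime */
		}
	}

	return h;
}
```

Storing one 8-byte hash per visited configuration is much cheaper than storing a K x K matrix, but you still have to decide how many hashes to keep and how to look them up quickly. That part of the problem does not go away.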

Step 5 -- Distributing Your Search Program

There are several properties of this problem that make it possible to use a distributed approach to solving it. First, the size of the state that you need to move around is pretty small. If you find something at 102, your matrix will be 102 x 102. Also, you can move things as single bytes (containing either a 1 or a 0) or worse -- as bit masks -- so you can really cut down on the communication costs.
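
To give a sense of how small the state can get, here is a sketch that packs only the edges above the diagonal, one bit per edge. The function names are made up for illustration, and the bit order (row by row, least significant bit first) is an arbitrary choice that sender and receiver would simply have to agree on.

```c
#include <string.h>

/* bytes needed to hold one bit per edge of a gsize-node graph */
size_t packed_size(int gsize)
{
	return ((size_t)gsize * (gsize - 1) / 2 + 7) / 8;
}

/* pack the entries above the diagonal of g into buf, one bit each */
void pack_graph(const int *g, int gsize, unsigned char *buf)
{
	size_t bit = 0;
	int i;
	int j;

	memset(buf, 0, packed_size(gsize));
	for (i = 0; i < gsize; i++) {
		for (j = i+1; j < gsize; j++) {
			if (g[i*gsize + j])
				buf[bit >> 3] |= (unsigned char)(1u << (bit & 7));
			bit++;
		}
	}
}

/* inverse of pack_graph(); entries on or below the diagonal become 0 */
void unpack_graph(const unsigned char *buf, int gsize, int *g)
{
	size_t bit = 0;
	int i;
	int j;

	memset(g, 0, (size_t)gsize * gsize * sizeof(int));
	for (i = 0; i < gsize; i++) {
		for (j = i+1; j < gsize; j++) {
			g[i*gsize + j] = (buf[bit >> 3] >> (bit & 7)) & 1;
			bit++;
		}
	}
}
```

A 102-node graph has 5151 edges, so packed this way it fits in 644 bytes, compared with 10404 bytes when every matrix element is sent as a single byte.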

Next, the graph representation is canonical, meaning that two different heuristics can be working on the same graph in two completely different ways. Or, you can use one heuristic on one machine for some time and then pass its intermediate results off to another "finishing" machine that uses a different heuristic. Perhaps more importantly, if you find a graph that you think is close to being a counter example, it is easy to devote more resources to just that graph by sending the "nearly there" graph to a set of processors for a more thorough search.

In short, it is possible to build a very fluid program, consisting of multiple search techniques, and a scheduler to control them to attack the Ramsey search problem. Unlike the previous projects, in which there was only one algorithm to run and the state was large, you have much more freedom to adjust this kind of program dynamically.


Requirements for Project 2

Your assignment is to find the biggest counter example you can for R(6,6) by Dec. 1st at 1:00 PM. That's it. The biggest one wins. You must turn in all of your code and the best counter example you were able to find in the appropriate format. You can use any machine you like, any time you like, between now and Dec. 1st at 1:00 PM. If your counter-example is of size 102 or bigger, I'll be happy to help you write your ground-breaking paper in combinatorics. Starting at 11:00 AM on Wednesday, Dec. 1st you should find a machine in the CSIL or GSL and begin running your code. Wahid and I will circulate through the lab and visit each one of you to see how each of your codes works. Think of it as a small presentation of your new-found appreciation for Grid programming. You should be prepared to demonstrate what the code is doing, show some logging output -- whatever is necessary -- to give us both the warm and comfortable feeling that you have something that is truly searching for R(6,6).

The biggest requirement is that you enjoy this process. You are now doing science -- hard science. If no one has been able to discover a counter-example bigger than 101 since 1965, it is a hard problem. Your programs are treading on the edge of what is known. It should be fun.

You can do this.