PLEASE read everything listed in this project description CAREFULLY (with care, slowly, several times, thoroughly) before losing control and attempting to move in with the TA. It is fairly straight-forward to build this program, but you have to understand all of the components thoroughly.
Next, you should put each row of your array on a separate line in the file that is to contain the array. Note that all line breaks in the file are significant and must be present.
For example

10
0 0 1 1 1 0 1 1 0 0
0 0 0 1 0 1 1 1 0 1
0 0 0 0 1 1 0 1 1 0
0 0 0 0 1 0 1 0 1 1
0 0 0 0 0 0 1 1 1 1
0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 1 1 0
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0

is a valid representation of a two-color graph on 10 nodes. If we label the indices from 0, then a "1" in element [0,2] indicates that the edge linking node 0 and node 2 in the graph has color "1" (call it green). A "0" in element [2,6] indicates that the edge linking node 2 and node 6 has color "0" (call it red).
Notice that only elements on one side of the diagonal matter. That is because [2,6] and [6,2] mean the same thing: the edge from node 2 to node 6 is the same edge that goes from node 6 to node 2. Technically, you could make the graph symmetric across the diagonal ([i,j] == [j,i] for all i and j) but the clique counter in clique_count.c will only consider the numbers above the diagonal as valid (it will ignore the others). Notice also that the numbers on the diagonal don't matter since there are no edges [i,i] in the graph.
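One way to keep the "only above the diagonal" rule straight in your own code is to route every edge access through a small helper. The sketch below is only an illustration: edge_index(), get_edge() and set_edge() are hypothetical names that do not appear in clique_count.c, and the row-major layout is assumed from the way simple_search.c indexes its array.

```c
#include <assert.h>

/* Hypothetical helpers (not part of clique_count.c): read or set the color
 * of edge {i,j} in a row-major n x n array, always touching the entry
 * above the diagonal so that [i,j] and [j,i] name the same edge. */
static int edge_index(int n, int i, int j)
{
    assert(i != j);                              /* there are no edges [i,i] */
    assert(i >= 0 && i < n && j >= 0 && j < n);
    if (i > j) {                                 /* normalize: row < column */
        int t = i;
        i = j;
        j = t;
    }
    return i * n + j;
}

int get_edge(const int *g, int n, int i, int j)
{
    return g[edge_index(n, i, j)];
}

void set_edge(int *g, int n, int i, int j, int color)
{
    g[edge_index(n, i, j)] = color;
}
```

With these, set_edge(g, n, 6, 2, 1) and set_edge(g, n, 2, 6, 1) write the same array element, so the lower triangle never gets out of step with the part the clique counter reads.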
Requirement 1: You must turn in your best counter example in this format. The TAs will be using a counter-example checker that recognizes this file format and if they cannot verify your answer, they will not be able to give you credit.
Helpful Hint 1: Acquaint yourself with the functions ReadGraph() and PrintGraph() in test_clique_count.c. If you use these functions for your graph file you will be assured of the correct formatting.
For example, assume that

10
0 1 1 1 1 1 1 1 0 0
0 0 0 1 1 1 1 1 1 1
0 0 0 0 1 1 1 1 1 0
0 0 0 0 1 1 1 1 1 1
0 0 0 0 0 0 1 1 1 1
0 0 0 0 0 0 0 1 1 1
0 0 0 0 0 0 0 1 1 1
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0

is stored in a file called "clique.10". The output from the program test_clique_count -f clique.10 is
bash$ test_clique_count -f clique.10
file clique.10 contains a graph of size 10 with 3 monochromatic cliques
bash$

That means there are three subcliques of size 6, each of a single color (probably green), in this graph.
You are free to practice whatever methodology suits your whim or esthetic taste in this pursuit (including just searching randomly) but you should be warned: for large dimensions, the number of monochromatic cliques can be huge. The routine CliqueCount() returns an integer. If the count gets bigger than about 2 billion (the largest value a 32-bit signed integer can hold), you will start to get negative numbers, and at about 4 billion the count wraps around completely. Further, the speed with which CliqueCount() runs is a function of how many subcliques there are: the greater the number of subcliques, the slower it runs.
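If you want to sanity-check counts on small graphs without worrying about wrap-around, a brute-force counter that accumulates into a long long is easy to write. The sketch below is an assumption-laden illustration, not the algorithm used by CliqueCount(): count_mono_k6(), grow() and extends() are hypothetical names, and the code reads only the entries above the diagonal of a row-major array.

```c
/* Hypothetical brute-force counter: number of monochromatic 6-cliques in a
 * row-major n x n graph, reading only entries above the diagonal.
 * The total is kept in a long long so it does not wrap near 2 billion. */

/* Does vertex v have color c on its edges to all of the first cnt
 * vertices in set[]? Every set[k] is smaller than v. */
static int extends(const int *g, int n, const int *set, int cnt, int v, int c)
{
    int k;
    for (k = 0; k < cnt; k++) {
        if (g[set[k] * n + v] != c)
            return 0;
    }
    return 1;
}

/* Count the ways to grow set[0..cnt-1] into a 6-clique of color c using
 * only vertices numbered start or higher. */
static long long grow(const int *g, int n, int *set, int cnt, int start, int c)
{
    long long total = 0;
    int v;
    if (cnt == 6)
        return 1;
    for (v = start; v < n; v++) {
        if (extends(g, n, set, cnt, v, c)) {
            set[cnt] = v;
            total += grow(g, n, set, cnt + 1, v + 1, c);
        }
    }
    return total;
}

/* Monochromatic 6-cliques of color 0 plus those of color 1. */
long long count_mono_k6(const int *g, int n)
{
    int set[6];
    return grow(g, n, set, 0, 0, 0) + grow(g, n, set, 0, 0, 1);
}
```

On an all-zero 6 x 6 graph this returns 1 (the whole graph is one red clique), and on an all-zero 8 x 8 graph it returns 28, the number of ways to choose 6 nodes out of 8. It is far too slow for large graphs, but it is handy for checking your own bookkeeping.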
Helpful Hint 2: If you think about things a bit, you'll see that buried in every counter-example of size K is a counter-example of size K-1. Think of it this way. If you had a counter-example of size K and you removed one node and all of the edges that had it as one end-point, you would not have introduced any new colors or new edges into the resulting smaller graph. About all you could have done was remove one or more bi-chromatic subcliques, leaving a graph with a smaller number of nothing but bi-chromatic subcliques.
Therefore, if you have discovered a counter-example of size K you can hypothesize that it must be embedded in a counter-example of size K+1.
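The node-removal argument above can be made concrete with a few lines of C. This is a minimal sketch under the assumption that graphs are stored as row-major int arrays, as in simple_search.c; remove_node() is a hypothetical helper and is not part of the provided code.

```c
/* Hypothetical helper: copy the n x n graph g into the (n-1) x (n-1)
 * graph out, skipping row v and column v. Every surviving edge keeps
 * its color, so no new monochromatic subclique can appear. */
void remove_node(const int *g, int n, int v, int *out)
{
    int i, j;
    int oi = 0;                      /* row index in the smaller graph */
    for (i = 0; i < n; i++) {
        int oj = 0;                  /* column index in the smaller graph */
        if (i == v)
            continue;
        for (j = 0; j < n; j++) {
            if (j == v)
                continue;
            out[oi * (n - 1) + oj] = g[i * n + j];
            oj++;
        }
        oi++;
    }
}
```

The caller supplies out with room for (n-1)*(n-1) ints. Running the clique counter on out after removing any single node from a counter-example should again report zero.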
Helpful Hint 3: Another realization that you might wish to have is that for a graph of size K the closer the subclique count is to 0, the closer (in some sense) the graph is to a counter-example of size K. To see why, imagine what happens if you had a counter-example of size K and you flipped one edge from red to green. Chances are that the number of monochromatic subcliques you would introduce would be small (compared to the total number of monochromatic subcliques there would be in, say, a random coloring).
let K = some small number bigger than 6
initialize all elements of a K x K graph g randomly (or to zero)
oldcount = CliqueCount(g,K)
while(K < 102)
    pick an edge that you wish to re-color according to some heuristic
    flip that edge color
    count = CliqueCount(g,K)
    if(count == 0)
        counter-example of size K found
        add a row and column to g making graph one size bigger
        initialize the new row and column in some way
        K = K + 1
        oldcount = CliqueCount(g,K)
        continue
    else if(count < oldcount)
        keep the edge that color
        oldcount = count
    else
        flip it back to the original color
end while

When the loop terminates, you have a new mathematical result. To use this form, the parts you have to fill in are how to initialize the first graph, the heuristic for picking the edge to re-color, and how to initialize the new row and column each time the graph grows.
Helpful Hint 4: For small graphs (under size 10) many different colorings will produce counter-examples. You can probably initialize your very first graph (if it is small enough) with all 0s or randomly and a very few edge flips will probably produce a counter-example. The first 10 x 10 example in this project description I produced randomly and it is a counter-example. To get the second one I had to add enough 1s to introduce a monochromatic subclique.
In the file simple_search.c you can see a really simple example of using this approach. Here is the main body of the code.
int
main(int argc,char *argv[])
{
    int *g;
    int *new_g;
    int gsize;
    int count;
    int i;
    int j;
    int best_count;
    int best_i;
    int best_j;

    /*
     * start with graph of size 8
     */
    gsize = 8;
    g = (int *)malloc(gsize*gsize*sizeof(int));
    if(g == NULL)
        exit(1);

    /*
     * start out with all zeros
     */
    memset(g,0,gsize*gsize*sizeof(int));

    /*
     * while we do not have a publishable result
     */
    while(gsize < 102)
    {
        /*
         * find out how we are doing
         */
        count = CliqueCount(g,gsize);

        /*
         * if we have a counter example
         */
        if(count == 0)
        {
            printf("Eureka! Counter-example found!\n");
            PrintGraph(g,gsize);
            /*
             * make a new graph one size bigger
             */
            new_g = (int *)malloc((gsize+1)*(gsize+1)*sizeof(int));
            if(new_g == NULL)
                exit(1);
            /*
             * copy the old graph into the new graph leaving the
             * last row and last column alone
             */
            CopyGraph(g,gsize,new_g,gsize+1);
            /*
             * zero out the last column and last row
             */
            for(i=0; i < (gsize+1); i++)
            {
                new_g[i*(gsize+1) + gsize] = 0; // last column
                new_g[gsize*(gsize+1) + i] = 0; // last row
            }
            /*
             * throw away the old graph and make new one the
             * graph
             */
            free(g);
            g = new_g;
            gsize = gsize+1;
            /*
             * keep going
             */
            continue;
        }

        /*
         * otherwise, we need to consider flipping an edge
         *
         * let's speculatively flip each edge, record the new count,
         * and unflip the edge. We'll then remember the best flip and
         * keep it next time around
         *
         * only need to work with upper triangle of matrix =>
         * notice the indices
         */
        best_count = 9999999;
        for(i=0; i < gsize; i++)
        {
            for(j=i+1; j < gsize; j++)
            {
                /*
                 * flip it
                 */
                g[i*gsize+j] = 1 - g[i*gsize+j];
                count = CliqueCount(g,gsize);
                /*
                 * is it better?
                 */
                if(count < best_count)
                {
                    best_count = count;
                    best_i = i;
                    best_j = j;
                }
                /*
                 * flip it back
                 */
                g[i*gsize+j] = 1 - g[i*gsize+j];
            }
        }

        /*
         * keep the best flip we saw
         */
        g[best_i*gsize+best_j] = 1 - g[best_i*gsize+best_j];
        printf("best_count: %d, best edge: (%d,%d), new color: %d\n",
            best_count,
            best_i,
            best_j,
            g[best_i*gsize+best_j]);

        /*
         * rinse and repeat
         */
    }

    return(0);
}
It initializes the very first graph of size 8 to all zeros. When it picks an
edge, it picks the one that makes the count smallest at each iteration.
Finally, when it adds a row and a column, it fills them in with zeros. This
code will find counter-examples up to size 25 before getting stuck. Why? Think
about it a bit and you'll see that you might always pick the same edge
flip each time. Look at the last part of the output generated by the
print statement right at the bottom of the main while loop:

best_count: 2500, best edge: (0,25), new color: 1
best_count: 1875, best edge: (1,25), new color: 1
best_count: 1250, best edge: (10,25), new color: 1
best_count: 625, best edge: (15,25), new color: 1
best_count: 1, best edge: (20,25), new color: 1
best_count: 1, best edge: (0,2), new color: 1
best_count: 1, best edge: (0,2), new color: 0
best_count: 1, best edge: (0,2), new color: 1
best_count: 1, best edge: (0,2), new color: 0
best_count: 1, best edge: (0,2), new color: 1
best_count: 1, best edge: (0,2), new color: 0
best_count: 1, best edge: (0,2), new color: 1

It is stuck flipping edge (0,2) back and forth between colors 0 and 1. What this means is that the graph has the same clique count (1 in this case) regardless of what color this edge is. What should you do about it? Remember that you just flipped (0,2) and pick another edge. And there you have it. You are on your way. All of the search routines you come up with will have this flavor. Your goal is to keep them from going into loops like this.
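The "remember that you just flipped (0,2) and pick another edge" idea can be sketched as a variation of the inner search loop. Everything named here is an assumption made for illustration: best_flip_except() is a hypothetical function, the count_fn signature is inferred from the call CliqueCount(g,gsize), and count_ones() is only a stand-in scoring function so the sketch compiles and runs by itself.

```c
#include <limits.h>

/* Signature assumed from the call CliqueCount(g,gsize) in the text. */
typedef int (*count_fn)(int *g, int gsize);

/* Stand-in scoring function so the sketch runs by itself: it just counts
 * the 1s above the diagonal. Pass CliqueCount in a real search. */
int count_ones(int *g, int gsize)
{
    int i, j, total = 0;
    for (i = 0; i < gsize; i++)
        for (j = i + 1; j < gsize; j++)
            total += g[i * gsize + j];
    return total;
}

/* Try every edge except (skip_i,skip_j), the one flipped last time, and
 * report the flip giving the smallest count. g is left unchanged.
 * Pass skip_i = -1 to consider every edge. */
int best_flip_except(int *g, int gsize, count_fn count,
                     int skip_i, int skip_j, int *best_i, int *best_j)
{
    int i, j, c;
    int best_count = INT_MAX;

    *best_i = -1;
    *best_j = -1;
    for (i = 0; i < gsize; i++) {
        for (j = i + 1; j < gsize; j++) {
            if (i == skip_i && j == skip_j)
                continue;                            /* do not undo the last move */
            g[i * gsize + j] = 1 - g[i * gsize + j]; /* flip it */
            c = count(g, gsize);
            if (c < best_count) {
                best_count = c;
                *best_i = i;
                *best_j = j;
            }
            g[i * gsize + j] = 1 - g[i * gsize + j]; /* flip it back */
        }
    }
    return best_count;
}
```

A caller would apply the returned flip and then pass that same edge as (skip_i,skip_j) on the next iteration. This only prevents the two-step loop shown in the output above; longer cycles need a longer memory.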
Helpful Hint 5: This remembering business is really the rub. We'll try to illustrate with another example. Say you've just flipped edge [4,7] from red to green and the clique count drops from 25 to 23. Next, you look for an edge to flip that will make the count go below 23 and you don't find one. What do you do? You have several choices, including un-flipping [4,7] but that produced an improvement so you might want to keep that flip. Let's say you choose to flip the edge that makes the count worse by the least amount. Say, flipping edge [9,11] changes the count to 24. Again you search and you can't make the count go any lower so you choose the least damaging flip and that turns out to be [4,7]. Say you flip it and the count goes to 25. You search again and you find that flipping [9,11] is the least damaging flip. This time flipping it leaves the count at 25. You search and you find that, in fact, flipping [4,7] will drop the count to 23, so you choose it. If your algorithm is deterministic (meaning you make the decisions the same way at each point) you will continue this cycle indefinitely.
"Of course," you say, "all you have to do is remember that you flipped [4,7] originally and you avoid reflipping it." Okay, but if you take this literally, you will get to make exactly K^2 flips (one for each edge) and then you will have to stop. It may be, though, that for the given coloring of g at the moment, flipping [4,7] made the count go lower, but in an actual counter example, you would want it flipped back. Really, what you want to do is say "Given that the entire graph has this coloring, I never want to flip [4,7] back from green to red." If your graph is in some state, and you flip an edge in one direction, any time you return to that state, if you make the same flip, you will be in a loop because your deterministic algorithm will keep making exactly the same decisions that returned you to this state.
"Ah ha!", you say, "I'll just remember the graph configuration that goes with each flip." Yes, but that takes memory -- potentially lots of memory, and now you have another search problem. First, you must remember each configuration you've seen. If you are using a K x K matrix for each one, you have a space problem. You might be able to hash (compress) the representation in some way, but now you have to search the representation list every time you consider a move to see if this move is legal given the current configuration. The number of moves could be absolutely huge. Tough problem, no? This particular issue, I believe, is why no one has improved on the bounds for R(6,6) since 1965.
Next, the graph representation is canonical, meaning that two different heuristics can be working on the same graph in two completely different ways. Or, you can use one heuristic on one machine for some time and then pass its intermediate results off to another "finishing" machine that uses a different heuristic. Perhaps more importantly, if you find a graph that you think is close to being a counter-example, it is easy to devote more resources to just that graph by sending the "nearly there" graph to a set of processors for a more thorough search.
In short, it is possible to build a very fluid program, consisting of multiple search techniques, and a scheduler to control them to attack the Ramsey search problem. Unlike the previous projects, in which there was only one algorithm to run and the state was large, you have much more freedom to adjust this kind of program dynamically.
The biggest requirement is that you enjoy this process. You are now doing science -- hard science. If no one has been able to discover a counter-example bigger than 101 since 1965, it is a hard problem. Your programs are treading on the edge of what is known. It should be fun.