FAQ on Project 1

Introduction to Distributed Systems, Fall 2008


There seems to be a lot of confusion regarding the project. It has been hard to express the project in words, so I will make another attempt.
I have received a number of emails about the protocol for deciding on an interview. Please note that you are not supposed to design a protocol in this project. Rather, you are supposed to use a protocol that was published about two decades ago (refer to the Wuu and Bernstein paper). Every node has a log. The calendar can be viewed as analogous to a dictionary that sits on top of the log. If you want to insert an event in the calendar, you insert a corresponding entry in the log and you are done. The application semantics do require communication between nodes, and whenever you communicate with other nodes, you pass on your knowledge about the calendars. It is as simple as that. Consider the following cases:
* You are at Node 1, and User 1 wants to go to the dentist on Monday from 9 to 11. He/She inserts this event into the calendar (log) and is done. No communication is needed. Since there is no synchronous communication, the calendars at other nodes can be stale. Bear with it; this is what the protocol provides. Don't try to design protocols to keep the calendars fully in sync.
* You are at Node 1, and User 1 and User 2 want to meet. You look at both calendars as stored locally and pick a time. You insert an event corresponding to that, and you are done. But since User 2 needs to be informed about this, a message is sent to Node 2, and along with the message you ship your log (truncated as required). You DO NOT WAIT FOR ANY ACKNOWLEDGMENT. As far as User 1 is concerned, he/she is done. When the log reaches Node 2 and Node 2 processes it, if there is no conflict, this is the end. There is no further ACK saying Yes/No. If there is a conflict, then Node 2 decides which event should be prioritized (the resolution rule is your choice: maybe a coin toss, maybe some priority). Once the event to be deleted is selected, a delete event is inserted into the log. Now the parties involved in the deleted event need to be notified, so you ship the log again. What happens if Node 2 is down at this point? It doesn't matter: your TCP connection will fail, you will figure out that Node 2 is down, and your time table entry for Node 2 will not advance. So, the next time Node 2 comes up, since your time table reflects that Node 2 has not received the latest message, you will ship the older parts of the log as well with the next message to Node 2. Again, no acknowledgments, no timeouts, no complications introduced by the programmer.
Remember, this protocol is a lazy way of propagating updates and reaching eventual consistency. So, don't try to come up with a protocol that will give you the most consistent view of the calendars. The protocol allows stale views and conflicts to be present, so bear with it.
* You are at Node 1, and User 1, User 2 and User 3 want to meet. You insert entries in the log and ship the log to Nodes 2 and 3. User 1 is done. If both Users 2 and 3 are free, this is the end. If one of them is not available, there is again a conflict. Suppose you decide that the new event needs to be deleted: put a delete entry for the new event in the log and pass the log to the parties involved in the event. Once these parties receive the log, they will remove the event from their calendars. Again, failures are handled as in the previous case.
* Other cases can be enumerated as above.
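
The cases above can be sketched roughly as follows. This is only a minimal, in-memory illustration of the log, the time table and the "ship what the receiver may be missing" rule, not a reference solution. The names `Entry`, `Node`, `has_rec`, `local_event`, `make_message` and `receive` are made up for this sketch, appointments are identified by a unique name, and real code would send the message over TCP instead of passing it between objects.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entry:
    op: str      # "insert" or "delete"
    name: str    # appointment name, assumed unique (hypothetical key)
    node: int    # node that created the entry
    clock: int   # creator's local clock when the entry was created

class Node:
    def __init__(self, node_id, n):
        self.id = node_id
        self.n = n
        self.clock = 0
        # T[k][j]: what this node believes node k knows about node j's clock
        self.T = [[0] * n for _ in range(n)]
        self.log = []          # entries that some node may still be missing
        self.calendar = set()  # dictionary view built on top of the log

    def has_rec(self, entry, k):
        # True if, as far as this node knows, node k already has the entry
        return self.T[k][entry.node] >= entry.clock

    def local_event(self, op, name):
        # Insert or delete locally: append to the log, update the view, done
        self.clock += 1
        self.T[self.id][self.id] = self.clock
        entry = Entry(op, name, self.id, self.clock)
        self.log.append(entry)
        if op == "insert":
            self.calendar.add(name)
        else:
            self.calendar.discard(name)
        return entry

    def make_message(self, k):
        # Ship only the entries node k may be missing, plus the time table.
        # If the send fails, T is unchanged, so the next message to k
        # automatically carries the same (older) entries again.
        partial = [e for e in self.log if not self.has_rec(e, k)]
        return partial, [row[:] for row in self.T], self.id

    def receive(self, message):
        partial, T_k, k = message
        new = [e for e in partial if not self.has_rec(e, self.id)]
        inserted = {e.name for e in new if e.op == "insert"}
        deleted = {e.name for e in new if e.op == "delete"}
        self.calendar = (self.calendar | inserted) - deleted
        # Merge the sender's knowledge into the local time table
        for r in range(self.n):
            self.T[self.id][r] = max(self.T[self.id][r], T_k[k][r])
        for r in range(self.n):
            for s in range(self.n):
                self.T[r][s] = max(self.T[r][s], T_k[r][s])
        # Truncate: keep only entries that some node may still be missing
        self.log = [e for e in self.log + new
                    if not all(self.has_rec(e, s) for s in range(self.n))]
        return new
```

Note that nothing here waits for a reply: `make_message` is called whenever the application has a reason to contact another node, and the time table alone decides how much of the log goes along.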

Please note:
* You are told to use the concept of logs to build a distributed application. Please carefully read the paper, and see how they build the dictionary application on top of the log.
* There can be conflicts, and the protocol for log replication does not prevent that. For the log, these are simply entries; conflicts exist only from the application's perspective. Once you resolve the conflict, put the corresponding events in the log, and the protocol will take care of propagating them.
* You need not build a great UI for a calendar where you can insert events for the entire year (or your entire life span). A simple day-long or week-long calendar UI would do. But please allow enough granularity to insert 20-30 events without causing conflicts. For example, if you decide on building only a day-long calendar, then do not say that events can only be inserted at hour-long granularity. In that case you should support minute or at least 15-minute granularity of insertion (for example, with minute granularity, an event from 4:20 to 4:30).
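
To make the last two points concrete, here is a small sketch of minute-granularity events and application-level conflict detection. It is only one possible way to do it: the `Event` fields, the `minutes`, `overlaps`, `find_conflicts` and `pick_loser` helpers, and the "delete the event whose name sorts last" rule are all assumptions made for illustration, since the resolution rule is left up to you.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    name: str
    day: str                 # e.g. "Mon"
    start: int               # minutes since midnight, inclusive
    end: int                 # minutes since midnight, exclusive
    participants: frozenset  # user ids taking part in the event

def minutes(hhmm):
    # Convert "H:MM" to minutes since midnight
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)

def overlaps(a, b):
    # Two events conflict only if they share a day, a time range and a user
    return (a.day == b.day
            and a.start < b.end and b.start < a.end
            and bool(a.participants & b.participants))

def find_conflicts(calendar, new_event):
    # Events in the local calendar view that clash with new_event
    return [e for e in calendar if overlaps(e, new_event)]

def pick_loser(a, b):
    # Hypothetical deterministic rule: the event whose name sorts last loses.
    # A delete entry for the loser would then be appended to the log.
    return max((a, b), key=lambda e: e.name)
```

For instance, a dentist visit for User 1 on Monday from 9:00 to 11:00 conflicts with a meeting of Users 1 and 2 on Monday from 10:30 to 11:30, whereas events from 4:20 to 4:30 and from 4:30 to 4:45 do not conflict because the end time is exclusive.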
I hope this post and the discussions in class clear up a lot of the confusion. And please note that the 12th is the deadline for the project itself, not for submitting a plan of what you want to do in this project.
Email me if you have any questions.