CS276: Fall '13

Advanced Topics in Networking

Homework 2

Objective

PlanetLab is a global networking testbed founded to give academic researchers a way to evaluate and deploy true Internet-scale services and applications.  It currently supports over 900 individual machines spread across over 470 locations on 5 continents.  The objective of this homework is to experiment with PlanetLab and gather first-hand measurements of the Internet from a global perspective, as it is today.  Specifically, we are looking to evaluate (at a very rough granularity) the types of unexpected and abnormal behavior first observed in [Pax96].

Assignment

In this assignment, you will implement a simple measurement platform in which 20 PlanetLab (PL) nodes of your choice periodically perform pair-wise path measurements using ping and traceroute.   Your service should run as a daemon on each of the PlanetLab hosts you choose, gathering data over a period of at least 2 weeks. Your measurement nodes should maintain and store logs on the local disk, and move them periodically to your CSIL account.  Once all of your measurements have completed, you should analyze the measurement data to answer the questions below. Leave yourself some time at the end in case you need to go back and perform additional measurements to support your answers.

Please read all of the questions below before writing your measurement daemons, since you will likely need to tailor your daemons to the answers you need.  Also, beware the flakiness of PlanetLab nodes: pull data files off of them frequently, lest a machine fail and take your experimental logs with it.
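
As a starting point, one measurement round of the kind described above might look like the following Python sketch. It is only an illustration, not a required design: it assumes Python 3.7+ and the standard ping and traceroute binaries are available on the node, and the function names, the JSON-lines log format, and the constants (PING_COUNT, PROBE_TIMEOUT_S) are all hypothetical choices.

```python
import json
import subprocess
import time

PING_COUNT = 4         # assumption: number of echo requests per ping run
PROBE_TIMEOUT_S = 120  # assumption: hard upper bound on one command


def build_commands(peer):
    """Return the argv lists for the two probes sent to one peer."""
    return {
        "ping": ["ping", "-c", str(PING_COUNT), peer],
        "traceroute": ["traceroute", "-n", peer],  # -n: skip reverse DNS
    }


def run_probe(argv, timeout=PROBE_TIMEOUT_S):
    """Run one command; return (exit_code, stdout).

    exit_code is None if the command timed out or the binary is missing.
    """
    try:
        done = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout)
        return done.returncode, done.stdout
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return None, ""


def make_record(src, dst, tool, exit_code, output, ts):
    """Serialize one probe result as a single JSON line."""
    return json.dumps({"ts": ts, "src": src, "dst": dst, "tool": tool,
                       "exit": exit_code, "out": output}, sort_keys=True)


def measure_round(me, peers, log_path):
    """Probe every peer once and append the raw results to a local log."""
    with open(log_path, "a") as log:
        for peer in peers:
            if peer == me:
                continue
            for tool, argv in build_commands(peer).items():
                code, out = run_probe(argv)
                log.write(make_record(me, peer, tool, code, out,
                                      time.time()) + "\n")
                log.flush()  # keep the on-disk log current if the node dies
```

Storing the raw command output, rather than a parsed summary, keeps the logs usable if you later discover that a question needs a field you did not think to extract.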

Questions
  1. How often did you detect outages in pair-wise paths between your nodes (as a portion of all paths monitored)?  
    1. How many of them were temporary?  For each temporary failure, can you determine the likely cause from your measurements?  If so, what is it? If not, what missing data would allow you to determine the real cause?
    2. How many failures were long-term outages? Remember that PL machines often go down for maintenance and repair.  You should read the PL-user mailing list for announcements of planned downtime.  Can you determine the root cause of each of your long-term outages? What are they?
  2. Of the outages in pair-wise paths you detected, what percentage of them are failures in the "core" versus local ISP failures? 
    1. Of these failures, how many could be successfully circumvented in "real time" by an SOSR-like system?  (Hint: this might require you to instrument your measurement daemons with additional code for triggering tests at a random third PL node).
    2. Did you detect a significant difference in the reliability of continental links versus inter-continental links that crossed the Pacific or Atlantic oceans?
  3. Did you detect route "fluttering"? If so, where and how often? 
  4. Did you detect "abnormal" triangle routing, where the traceroute path traversed a network hop out of the "expected" general direction of the destination?  Describe these routes, if any.
  5. Plot your average pair-wise latency distributions in a CDF. Based on your measurements, can you detect which nodes in your network are Internet2 hosts?  (Hint: some Internet2 hosts are reachable only by other Internet2 hosts, and not by Internet1 hosts.)
  6. Based on your small sample of measurement data, do you think routing abnormalities have increased or decreased in number from the days when Paxson first performed these measurements?  Why?
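
For the outage questions above, one possible way to turn per-path probe results into outage episodes is sketched below in Python. The input format (a time-sorted list of (timestamp, ok) pairs for a single path) and the LONG_TERM_S threshold separating temporary from long-term outages are assumptions made for illustration; you should pick and justify your own definitions.

```python
LONG_TERM_S = 6 * 3600  # assumption: outages of 6 hours or more are long-term


def outage_episodes(samples):
    """Group consecutive failed probes on one path into episodes.

    samples: list of (timestamp, ok) pairs sorted by timestamp.
    Returns a list of (start, end) pairs, where end is the timestamp of
    the first successful probe after the failures, or None if the path
    was still down at the last sample.
    """
    episodes = []
    start = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts                    # first failure of a new episode
        elif ok and start is not None:
            episodes.append((start, ts))  # path recovered
            start = None
    if start is not None:
        episodes.append((start, None))    # never recovered in the data
    return episodes


def classify(episode, last_ts, long_term_s=LONG_TERM_S):
    """Label an episode 'temporary' or 'long-term' by its duration."""
    start, end = episode
    duration = (end if end is not None else last_ts) - start
    return "long-term" if duration >= long_term_s else "temporary"
```

Note that with one probe every 10 minutes, the measured duration of an episode is only accurate to within one measurement period on either side.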
Choosing PlanetLab nodes
Beware that a number of PlanetLab hosts are "Alpha" machines.  I suggest logging into each of your desired nodes first to see if they are designated Alpha nodes. Alpha nodes will report their status to you every time you log in.  In general, you want to choose nodes that are spread out geographically. You can log on to planet-lab.org and choose your own nodes in the slice management section on the right side of the page.  I suggest choosing nodes such that you include the following subsets:
  • at least 2-3 nodes from continents other than N. America
  • at least 1 transatlantic and 1 transpacific link
  • at least 1 group of 2-3 hosts from the same site, so you can include LAN results in your measurements
Choosing measurement periodicity
Remember that some node-pairs can be very far apart (a la Estonia to New Zealand). You should measure at a frequency that allows each prior traceroute to finish before the new traceroute starts.  1 measurement every 10 minutes is probably safe.
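
One way to satisfy this constraint is to run each round to completion in a single loop and only then compute how long to sleep, as in the Python sketch below. The function names are hypothetical, and the 600-second period simply reflects the 10-minute suggestion above. If a round overruns its slot, the sketch skips the missed slot rather than starting a second round on top of the first.

```python
import time

PERIOD_S = 600  # one round every 10 minutes, per the suggestion above


def seconds_until_next_round(round_started, now, period=PERIOD_S):
    """Return how long to sleep so rounds start on multiples of the period.

    If the round took longer than one period, wait for the next slot
    boundary instead of starting immediately and falling further behind.
    """
    elapsed = now - round_started
    if elapsed <= 0:
        return float(period)
    return (-elapsed) % period


def run_forever(do_round):
    """Call do_round() once per period; rounds never overlap."""
    while True:
        started = time.monotonic()
        do_round()  # blocks until every ping/traceroute has finished
        time.sleep(seconds_until_next_round(started, time.monotonic()))
```

Because the loop is sequential, a new traceroute can never start before the previous round has finished, regardless of how slow a distant pair turns out to be.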

Project Workload
Hi all. I will assign students in the class randomly into slices, with roughly 2 students per slice. It is fine if you collaborate with the other student you are sharing a slice with, but only with him/her.  In the table below, I have listed the slice names and users in each slice. I will continue to update this table as more users sign up for accounts.  Some of the slices have not had their nodes assigned yet. You can do it yourself by going to your slice and then choosing manage nodes.

Submission

This homework assignment is due by 11:59pm on November 21. Late homework will not be accepted. Submit your homework as two files via email attachment to linzhou@cs.  The email MUST have as its subject "276HW2" (without quotes).  The attachments should be two files: one PDF file called LASTNAME.pdf, and one ZIP file called LASTNAME.zip (substitute your own last name).

Your pdf file should include: 
  • A write-up that describes your methodology, observations and answers to the questions above. Avoid verbose discussion of the results. Additional results, insight, and analysis of the results, however, are strongly encouraged.
  • Any graphs that are relevant to your analysis, including a CDF of the 400 path latencies observed.  Use your judgment. Be concise but make any arguments convincingly. Graphs are to be plotted using xgraph or gnuplot only. Avoid printing one graph per page. Logical organization of content (text and graphs) is expected!
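
For the latency CDF, a minimal Python sketch for producing a data file that gnuplot can read is shown below. It assumes you have already reduced your logs to one average latency per path; the function names and the two-column file layout are illustrative choices, not a required format.

```python
def empirical_cdf(values):
    """Return (x, P[X <= x]) points for the empirical CDF of values."""
    xs = sorted(values)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


def write_gnuplot_data(points, path):
    """Write one 'x probability' pair per line, whitespace-separated."""
    with open(path, "w") as f:
        for x, p in points:
            f.write(f"{x} {p}\n")
```

A file written this way, for example latency_cdf.dat (a hypothetical name), can then be drawn in gnuplot with a command such as: plot "latency_cdf.dat" using 1:2 with steps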
Your zip file should include: 
  • The names of the hosts you used in your experiments. The zip file should also contain all of your code and all scripts you used to parse, process and analyze the results.
Finally, when you are all done submitting the homework,  REMEMBER to KILL your daemons.

Grading Guidelines

You will be graded on the consistency of your measurements and the quality of your analysis, including your findings and your explanation and presentation of them.

Cheating Policy

Cheating will not be tolerated. Please read the UCSB Academic Code of Conduct to find out more about Student Conduct and Discipline.