Stefan Karpinski
I am a PhD Candidate in the Computer Science Department at the University of California, Santa Barbara. I'm a member of the Moment Research Lab. My research focuses on understanding and being able to reproduce realistic workloads in wireless local area networks (WLANs). This research is foundational in the sense that it does not directly improve network performance, but rather informs how to conduct better research in wireless networks.
Realism of Wireless Workload
A better understanding of realistic workload in wireless networks is necessary because all experimental research in such networks ultimately needs some workload to test the effectiveness of new techniques. Unfortunately, performance predictions from experiments with synthetic workload may not accurately reflect performance once technologies are deployed and experience real usage conditions. Some of our research has shown that an unrealistic traffic model can distort important metrics (e.g., end-to-end delay, received throughput, jitter, network- or link-layer overhead) by as much as a factor of ten on average. Everyone “knows” that constant bit-rate (CBR) traffic between randomly chosen nodes does not really resemble real-world usage patterns. However, the impact of this lack of realism has not previously been quantified. It turns out to be more dramatic than expected.
Other surprises have cropped up in the course of this research. Among them is the fact that which nodes communicate with each other, how often, and how much has a far more significant effect on performance metrics than low-level flow behavior does. So long as the high-level behavior is accurately modeled, realistic low-level behavior can be achieved simply by repeatedly and independently sampling packet sizes and inter-packet intervals from the appropriate empirical distribution functions for each quantity. Note that, in contrast with common variable bit-rate (VBR) schemes, each flow of traffic must have its own pair of realistic distributions; otherwise performance metrics are distorted nearly as much as they are when simply using CBR traffic.
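The per-flow sampling described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the papers: the function name `synthesize_flow`, the `trace` dictionary and all of the numbers in it are made up for the example. Drawing uniformly with replacement from a flow's observed values is equivalent to sampling from that flow's empirical distribution function.

```python
import random

def synthesize_flow(observed_sizes, observed_gaps, n_packets, rng):
    """Resample one flow: each packet size and inter-packet interval is drawn
    independently from that flow's own observed values (hypothetical helper)."""
    t = 0.0
    packets = []
    for _ in range(n_packets):
        t += rng.choice(observed_gaps)          # inter-packet interval, seconds
        packets.append((t, rng.choice(observed_sizes)))  # (send time, bytes)
    return packets

# Made-up observations for two flows; a real trace would supply these.
trace = {
    "flow_a": {"sizes": [40, 40, 1500, 1500, 1500], "gaps": [0.001, 0.002, 0.050]},
    "flow_b": {"sizes": [64, 128, 64], "gaps": [0.5, 1.0]},
}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
synthetic = {
    name: synthesize_flow(flow["sizes"], flow["gaps"], 10, rng)
    for name, flow in trace.items()
}
```

The important design point is that each flow keeps its own pair of distributions; pooling the sizes and intervals of all flows into one shared pair would correspond to the VBR-style schemes that the paragraph above warns against.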
A much harder problem lies in understanding the collection of flows associated with each node, and in turn the collections of nodes that comprise a realistic network scenario. However, understanding and being able to reproduce these complex patterns is essential to being able to create realistic synthetic workload. My current research is focused on using data-mining techniques to extract order from the chaos that traffic traces typically entail.
Research Papers
- WinMee 2007: “Towards Realistic Models of Wireless Workload.” This workshop paper presents a new approach to evaluating the realism of traffic models. The fundamental idea is to use paired simulations: one using real traffic from a network trace, the other using synthetic traffic that approximates the real traffic as closely as the model will allow. If the synthetic traffic produces metric values that are consistently similar to those produced by the original traffic, then the model preserves the essential characteristics of the original trace; otherwise the model must distort some important aspect of workload behavior. Using this technique, we explore the space of synthetic models and conclude that packet behavior can be realistically modeled by sampling packet sizes and inter-packet intervals from empirical distributions for each flow, while higher levels of traffic behavior will require more complex models. [WinMee Presentation [PDF]]
- Broadnets 2007: “Wireless Traffic: The Failure of CBR Modeling.” This conference paper uses the same paired simulation approach, but focuses more on exploring how different levels of traffic behavior (flow end-point topology, flow behavior & packet behavior) affect the accuracy of performance metrics. This paper provides far more in-depth theoretical analysis, and a more complete exploration of CBR traffic and related partially synthetic models. It shows beyond reasonable doubt that the commonly used random uniform CBR is not effective as a predictor of performance under real-world traffic conditions. It also shows that by switching traffic patterns from real traffic to random uniform CBR traffic, the relative performance of two protocols (in this case AODV and OLSR) can be completely inverted.
- Current Work: “Deconstructing Wireless Workload” (working title). In this work, we use a series of data-mining techniques to extract order from the chaos presented in traffic traces. Our WinMee 2007 paper demonstrated that the behavior of a flow is sufficiently characterized by its duration and the empirical distributions of its packet sizes and inter-packet intervals. Unfortunately, these characteristics are unique to each flow and provide little understanding of how flows are related to each other. In this work we improve this understanding by characterizing each flow's behavior as a linear combination of “fundamental behaviors.” The analysis process requires a series of data-mining techniques:
- non-parametric change-point detection,
- non-parametric probabilistic distribution clustering,
- principal component analysis,
- monotonic smoothing of approximate CDFs.
However complex the process, the end result is simple to understand: the distributions of packet sizes and inter-packet intervals for each flow can be encoded as just a few dozen parameter values, which can be naturally and meaningfully compared across all the flows in a trace and even across separate traces.
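To give a flavor of the last step in the list above, here is a minimal Python sketch of one standard way to make an approximate CDF monotone: isotonic regression via the pool-adjacent-violators algorithm, followed by clipping to [0, 1]. This is an assumption chosen for illustration, not necessarily the smoothing method used in this work, and the names `pava` and `monotone_cdf` and the sample values are hypothetical.

```python
def pava(values):
    """Pool Adjacent Violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block is [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)  # every point in a block takes the block mean
    return fitted

def monotone_cdf(approx):
    """Make an approximate CDF non-decreasing and keep it within [0, 1]."""
    return [min(1.0, max(0.0, v)) for v in pava(approx)]

# Made-up approximate CDF values with two small monotonicity violations.
approx = [0.1, 0.3, 0.25, 0.6, 1.05, 0.98]
smoothed = monotone_cdf(approx)
```

Each run of values that violates monotonicity is replaced by its average, so the repaired curve stays as close as possible (in the least-squares sense) to the approximation while remaining a valid CDF shape.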
Resume & Contact
You can find my resume online here. Feel free to email me at Stefan Karpinski <sgk@cs.ucsb.edu>.
© Stefan Karpinski, 2007.