The High Level Picture
Software bugs are so damaging and widespread that they cost the U.S. economy an estimated \$59.5 billion annually (more than half a percent of the GNP). Although it is certainly not possible to remove all errors, it is estimated that more than a third of the cost associated with bugs could be eliminated through an improved testing and analysis infrastructure. An increased reliance on sophisticated software analysis and testing tools seems inevitable, yet Complex pointer errors, memory leaks, race conditions, and performance anomalies requires sifting through a sea of runtime data.
The end goal of the Mimir project is to enable a new breed of hardware/software analysis tools, for researchers and system builders, that can sift through on-line profile data at unprecedented speeds, yielding a highly accurate and timely image of system execution. A system is required that will be useful in a variety of profiling situations, incur very little overhead even during complex on-line profile analysis, and provide guarantees of bounded error. The problem is that these three requirements in turn necessitate the transfer of profile data at hundreds of gigabytes per second, the ability to perform significant streaming computation over irregular data structures, and non-trivial data aggregation and query methods. We believe that the key to success is a cross-layer approach that combines the raw computational ability of custom architectures with the formal guarantees provided by carefully crafted stream algorithms.
Streams as a Model for Introspection
There are the three major intellectual challenges to realizing a stream algorithm based profiling model: The first challenge is in finding ways by which long slices of data can be extracted efficiently from an executing machine. We propose that promising experimental interconnect technologies, such as 3d-integration, offer a novel ability to include "snap-on" profiling functionality yielding access to processor signals at a rate of terabits/sec. The second challenge is managing this immense fountain of data; online methods will be required as we simply cannot afford to store all of this data for post processing. At a high level, our algorithmic approach to profiling is grounded in geometry, implicitly motivated by the belief that many profiling patterns, trends, or anomalies have natural geometric representations that become discernible under a geometric lens. The third challenge is in building a system that can implement a broad class of online program analysis algorithms, including the above streaming algorithms, in a high throughput and programmable way. We propose ways of leveraging ideas from high throughput network processing to develop such an architecture.
Our algorithmic approach to profiling builds upon concepts in computational geometry, driven by the understanding that many online program analysis problems have natural representation in geometric terms. For example, value profiling could be considered a streaming query over all memory accesses in the one dimension of all possible values. Counting and tracking edges is a problem in the two dimensional space of code address x code address. Even race conditions are queries in the space of threadsets and virtual clocks. While this is not the way such problems are normally framed, in doing so we can bring to bear the significant power of existing and developing streaming geometry algorithms. Our preliminary work on Range Adaptive Profiling (see publications) is a geometric approach to tracking simple profile types (such as basic block execution frequencies, or the value profiling example above) and it describes a new and general purpose profiling method capable of hierarchically classifying streams of data efficiently in hardware. In RAP, input stream events are dynamically classified into increasingly precise categories based on the frequency with which they occur. The more important a class, or range of events, the more precisely it is quantified. Extending this work to capture more useful traces, for example profile data in a multi-dimensional space, however, presents considerable technical and conceptual difficulties that require novel ideas from both computational geometry and computer architecture.
- Shashidhar Mysore, Banit Agrawal, Sheng-Chih Lin, Navin Srivastava, Kaustav Banerjee and Timothy Sherwood. 3D-Integration for Introspection, IEEE Micro: Micro's Top Picks from Computer Architecture Conferences (IEEE Micro - top pick), January-February 2007.
- Shashidhar Mysore, Banit Agrawal, Sheng-Chih Lin, Navin Srivastava, Kaustav Banerjee and Timothy Sherwood. Introspective 3D Chips, Proceedings of the Twelfth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 2006. San Jose, CA
- John Hershberger, Nisheeth Shrivastava, Subhash Suri and Csaba Toth. Adaptive Spatial Partitioning for Multidimensional Data Streams. Algorithmica, Volume 46(1), pages 97-117, Sept 2006.
- Priya Nagpurkar, Hussam Mousa, Chandra Krintz, and Timothy Sherwood. Efficient Remote Profiling for Resource-Constrained Devices Transactions on Architecture and Code Optimization (TACO) Vol 3 No 1, June 2006.
- Shashidhar Mysore, Banit Agrawal, Timothy Sherwood, Nisheeth Shrivastava, and Subhash Suri. Profiling over Adaptive Ranges. (best paper award) Proceedings of the International Symposium on Code Generation and Optimization (CGO'06) March 2006. New York, New York.
- Subhash Suri, Csaba Toth and Yunhong Zhou. Range Counting over Multidimensional Data Streams. Invited paper to Special Issue of Discrete and Computational Geometry, accepted 2005.