Quantifying Machine Availability in Networked and Desktop Grid Systems

Report ID

2003-37

Report Authors

John Brevik, Daniel Nurmi and Rich Wolski

Report Date

2003-11-01

Abstract

In this paper, we examine the problem of predicting machine availability in desktop and enterprise computing environments. Predicting the duration that a machine will run until it restarts (availability duration) is critically useful to application scheduling and resource characterization in federated systems. We describe two non-parametric prediction techniques (that can be applied automatically as part of a scheduling infrastructure) and we detail their accuracy in predicting the availability durations empirically gathered from three separate computing environments. We describe each method analytically and evaluate its precision using a synthetic trace of machine availability constructed from a known distribution. To detail their practical efficacy, we apply them to machine availability traces from three separate desktop and enterprise computing environments, and evaluate each method in terms of the accuracy with which it predicts availability in a trace-driven simulation. Our results indicate that availability duration can be predicted with quantifiable confidence bounds and that these bounds can be used as conservative bounds on lifetime predictions.

Document

2003-37.pdf (182.79 KB)