Report ID
2003-28
Report Authors
Daniel Nurmi, John Brevik, and Rich Wolski
Report Date
Abstract
In this paper, we consider the problem of modeling machine availability in enterprise-area and wide-area distributed computing settings. Using availability data gathered from three different environments, we detail the suitability of four potential statistical distributions for each data set: exponential, Pareto, Weibull, and hyperexponential. In each case, we use software we have developed to determine the necessary parameters automatically from each data collection. To gauge suitability, we present both graphical and statistical evaluations of the accuracy with which each distribution fits each data set. For all three data sets, we find that a hyperexponential model fits slightly more accurately than a Weibull, but that both are substantially better choices than either an exponential or Pareto. We also test the independence of individual machine measurements and the stationarity of the underlying statistical process model for each data set. These results indicate that either a hyperexponential or Weibull model effectively represents machine availability in enterprise and Internet computing environments.
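As an illustration of the kind of comparison the abstract describes, the following minimal sketch fits an exponential and a Weibull distribution to a list of availability durations by maximum likelihood and compares their log-likelihoods. It is not the authors' software: the function names, the bisection bounds, and the choice of log-likelihood as the comparison measure are all assumptions made for this example, and the Pareto and hyperexponential fits are omitted for brevity.

```python
import math


def fit_exponential(durations):
    """MLE of the exponential rate: 1 / sample mean."""
    return len(durations) / sum(durations)


def fit_weibull(durations, lo=0.01, hi=50.0, iters=200):
    """MLE of Weibull (shape, scale) by bisection on the shape equation.

    The bounds lo/hi are illustrative assumptions; all durations must be > 0.
    """
    n = len(durations)
    xmax = max(durations)
    logs = [math.log(x) for x in durations]
    mean_log = sum(logs) / n

    def g(k):
        # Weights are rescaled by xmax so that x**k cannot overflow.
        w = [(x / xmax) ** k for x in durations]
        weighted = sum(wi * li for wi, li in zip(w, logs)) / sum(w)
        return weighted - 1.0 / k - mean_log

    # g is increasing in k, so bisection finds its single root.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = xmax * (sum((x / xmax) ** shape for x in durations) / n) ** (1.0 / shape)
    return shape, scale


def loglik_exponential(durations, rate):
    """Log-likelihood of the data under an exponential with the given rate."""
    return len(durations) * math.log(rate) - rate * sum(durations)


def loglik_weibull(durations, shape, scale):
    """Log-likelihood of the data under a Weibull with the given parameters."""
    return sum(
        math.log(shape) - shape * math.log(scale)
        + (shape - 1.0) * math.log(x) - (x / scale) ** shape
        for x in durations
    )
```

Because the exponential is the special case of the Weibull with shape 1, the fitted Weibull log-likelihood is never lower than the exponential one; how much higher it is gives a rough sense of how far the data depart from exponential behavior.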
Document
2003-28.pdf (911.39 KB)