Cloud computing is a highly customizable, pay-per-use, service-oriented methodology that offers many attractive features. It enables arbitrary users to employ potentially vast numbers of multicore cluster resources that the users themselves do not necessarily own, manage, or control. By lowering the barrier to entry for using such distributed systems, cloud technologies encourage innovation and the implementation of applications and systems by a broad and diverse developer base, one that might not otherwise have access to resources at such scale. To date, however, cloud computing has been used primarily to implement commercial information technologies and to support web services. Users interested in other domains (e.g., scientific simulation and large-scale data analysis) are left to devise their own toolsets. For clouds, such tooling requires that significant expertise, experience, time, and labor be devoted to the customization, configuration, deployment, and management of virtual machines (VMs).
To address this challenge, we have developed CloudRunner, a framework that extracts arbitrary programs from a source code repository (e.g., GitHub), wraps them in a web service and tasking system, and deploys them over disparate cloud infrastructures and local clusters, thereby automating their portability. CloudRunner automatically creates and configures virtual machines so that they can execute the applications, provides a web UI with which users parameterize their applications, deploys instances of the program as cloud-based background tasks, and collects the results for easy access via a browser. CloudRunner is well suited to deploying scientific simulation applications portably, and we use it to implement StochSS (Stochastic Simulation as-a-service, now available at http://www.stochss.org). We use StochSS to evaluate CloudRunner overheads and find that they are small, consistent, and amortized even for short-running applications.
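The tasking pattern described above (parameterize a program, run instances of it as background tasks, collect the results) can be illustrated with a minimal local sketch. This is not CloudRunner's implementation or interface: the names `TaskRunner`, `submit`, `result`, and `PROGRAM` are hypothetical, a thread pool stands in for cloud workers, and a one-line script stands in for a program extracted from a repository.

```python
import subprocess
import sys
import uuid
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

# Hypothetical stand-in for a program fetched from a repository: it reads
# one numeric parameter from the command line and prints its square.
PROGRAM = "import sys; print(float(sys.argv[1]) ** 2)"


class TaskRunner:
    """Illustrative tasking layer (assumed design, not CloudRunner's API)."""

    def __init__(self, max_workers: int = 2) -> None:
        # The thread pool plays the role of the background workers.
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._tasks: Dict[str, object] = {}

    def submit(self, params: List[str]) -> str:
        """Queue one program instance as a background task; return its ID."""
        task_id = uuid.uuid4().hex
        self._tasks[task_id] = self._pool.submit(self._run, params)
        return task_id

    @staticmethod
    def _run(params: List[str]) -> str:
        # Run the program in a separate process and capture its stdout.
        done = subprocess.run(
            [sys.executable, "-c", PROGRAM, *params],
            capture_output=True,
            text=True,
            check=True,
        )
        return done.stdout.strip()

    def result(self, task_id: str) -> str:
        """Block until the task finishes and return its collected output."""
        return self._tasks[task_id].result()

    def shutdown(self) -> None:
        self._pool.shutdown(wait=True)


# Usage: submit two parameterized runs, then collect their results.
runner = TaskRunner()
ids = [runner.submit([str(x)]) for x in (2, 3)]
outputs = [runner.result(task_id) for task_id in ids]
runner.shutdown()
print(outputs)
```

In a real deployment the workers would be VM-hosted processes and the parameters would arrive from a web form; the sketch only shows the submit-then-collect shape of the interaction.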