Execution of Workflows on Cloud Environments

Scientific applications are increasingly looking to cloud environments, such as those provided by Amazon, Google, and others, as a means of obtaining on-demand computational capabilities. So far, cloud environments have been used mostly in the business domain, and it is not clear whether they can provide the capabilities that science applications need to run efficiently and reliably.

In this work, we performed an initial evaluation of running scientific workflows, such as those used in astronomy, on cloud-like resources. We showed that the overheads stemming from wide-area communications and data transfers are not negligible. However, cloud environments can still benefit scientific workflows by offering virtualized resources that give applications a customized execution environment.

To support further investigation, we also started characterizing a number of scientific workflows in terms of their data dependency patterns, data usage, and computational requirements. We captured the structure and performance characteristics of applications in earthquake science, epigenomics, and biology, among others.

This work is reported in the following publications:

* "On the Use of Cloud Computing for Scientific Workflows," Christina Hoffa, Gaurang
Mehta, Timothy Freeman, Ewa Deelman, Kate Keahey, Bruce Berriman, John Good, 3rd
International Workshop on Scientific Workflows and Business Workflow Standards in
e-Science (SWBES), in conjunction with the Fourth IEEE International Conference on
e-Science (e-Science 2008), Indianapolis, Indiana, USA, 10 December 2008.

* "Characterization of Scientific Workflows," Shishir Bharathi, Ann Chervenak,
Ewa Deelman, Gaurang Mehta, Mei-Hui Su, Karan Vahi, 3rd Workshop on Workflows in
Support of Large-Scale Science (WORKS08), Austin, TX, November 2008.
