Comparison with Other Programming Paradigms
To articulate the tasks involved in designing workflows in terms of paradigms that scientists are traditionally more familiar with, we developed a comparison of workflow design with distributed and parallel programming. This comparison describes several challenges facing programmers of the heterogeneous distributed multi-core systems that are becoming more common in scientific computing.
This work is reported in the following journal article:
* "Self-Configuring Applications for Heterogeneous Systems: Program Composition and Optimization Using Cognitive Techniques". Mary Hall, Yolanda Gil, and Robert Lucas. Proceedings of the IEEE, Special Issue on Cutting-Edge Computing: Using New Commodity Architectures, Volume 96, Issue 5, May 2008. Available from the publisher and as a preprint.
Summary of findings: A comparison of multi-core computing and distributed computing:
| |Multi-Core Computing|Distributed Computing|
|---|---|---|
|Programming Model Diversity|Parallel languages: pthreads, MPI, OpenMP, transactions; GPU languages: Ng, CUDA; FPGA: VHDL or Verilog; Stream processing: StreamIt, Napa-C|Sequential, OpenMP, MPI, Global Address Space Languages, mixed languages (C, Fortran, C++, Java)|
|Libraries and Components|Sources: Device-specific libraries; domain-specific libraries; application libraries.|Sources: Target-specific code; domain-specific libraries; large-scale scientific simulation. Interface.|
|Computation Partitioning|Stream processing: DSP, GPU, Multimedia extension; Control-intensive: General-purpose; Configurable: FPGA|Coarse-grain MPI code: large number of nodes; Data sharing: SMP; Sequential: single node|
|Data Movement and Synchronization|Communication and copying: To/from shared memory; To/from buffers; Between functional units. Impact on schedule.|Data Products in Files: To/from different nodes or clusters; Data in catalogs; Intermediate data; Impact on schedule.|
Programming Multi-Core Systems
Heterogeneous systems comprise a variety of special-purpose computing engines, complex memory hierarchies, and the interconnects that link these resources together. Technology advances such as exponentially increasing chip densities have pushed hardware designers toward devices with multiple processing cores to better manage design costs and energy consumption. Heterogeneous devices can further exploit specialized functional units to increase performance and manage power for different phases of a computation. Systems with heterogeneous processing capability are now proliferating at various scales: systems-on-a-chip FPGAs such as the Xilinx Virtex 4, standard PCs with graphics processors, heterogeneous chip architectures such as the IBM Cell, domain accelerators such as Clearspeed, high-end systems that incorporate co-processors such as the Cray XD1, and distributed systems composed of clusters of diverse resources.
Heterogeneous platforms can accelerate many applications that mix compute-intensive and control-intensive phases of computation, which are best targeted to different processing elements. These applications include large-scale scientific computations, complex simulations of physical phenomena, and visualizations.
While heterogeneous systems are promising from a performance and power perspective, additional programming challenges arise, including:
1. partitioning of the application across functional units
2. managing data movement between functional units
3. differences in programming models and tools across functional units
4. managing reuse of code developed by others.
Further, these applications must be highly optimized for performance and power consumption. To put these challenges in perspective, challenges (1), (2), and (4) also arise when porting across different homogeneous platforms. In the absence of tool support for this set of challenges, programmers of heterogeneous platforms must manage these details explicitly, and doing so can dominate all other aspects of application programming for heterogeneous systems. The net result is that developing and debugging programs on such systems can be quite tedious and is approachable only by highly skilled individuals. A tremendous need exists for new approaches that can both increase the productivity of these highly skilled developers and make these powerful systems accessible to a broader group of users.
Programming Distributed Applications through Workflows
Programmers of distributed applications face similar challenges. Here, the focus is on collections of codes that are submitted for execution on distributed, heterogeneous resources, where the available resources may not be known prior to execution. These applications are broken down into coarse-grain components, and each is assigned an aggregate resource (possibly a cluster) for execution. The flow of data across components, usually as external files, must be managed appropriately, along with any dependencies. Programmers of these distributed applications face a very diverse set of resources and requirements. They also must manage software components developed by others, much as programmers of heterogeneous architectures incorporate software libraries developed by others.
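As a rough illustration of managing file-based data flow and dependencies between coarse-grain components, the following Python sketch derives component dependencies from the files each component reads and writes, then computes a valid execution order. The component names, file names, and the `WORKFLOW` layout are hypothetical and are not taken from any particular workflow system; real systems must also handle resource assignment and data transfer, which this sketch leaves out.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each component lists the files it reads and writes.
WORKFLOW = {
    "preprocess": {"inputs": ["raw.dat"], "outputs": ["clean.dat"]},
    "simulate": {"inputs": ["clean.dat"], "outputs": ["sim.out"]},
    "visualize": {"inputs": ["sim.out", "clean.dat"], "outputs": ["plot.png"]},
}


def derive_dependencies(workflow):
    """Map each component to the set of components that produce its input files."""
    producer = {}
    for name, spec in workflow.items():
        for path in spec["outputs"]:
            producer[path] = name
    deps = {}
    for name, spec in workflow.items():
        # Files with no producer (e.g. raw.dat) are treated as external inputs.
        deps[name] = {producer[p] for p in spec["inputs"] if p in producer}
    return deps


def execution_order(workflow):
    """Return one order in which every component runs after its producers."""
    return list(TopologicalSorter(derive_dependencies(workflow)).static_order())


if __name__ == "__main__":
    print(execution_order(WORKFLOW))
```

Deriving dependencies from file names, rather than declaring them by hand, mirrors the point that data products in files drive the schedule; it is one possible design, not the only one.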
We observe that many of these challenges also arise in programming applications for large-scale, heterogeneous distributed computing environments, and that solutions used in practice, as well as future research directions in workflow systems for distributed computing, can be adapted to support programmers who develop code for multi-core systems. Further, optimization decisions are inherently complex because of the large search spaces of possible solutions and the difficulty of predicting performance on increasingly complex architectures.
We introduce the concept of self-configuring applications, whereby the programmer expresses an application as a high-level workflow composed of tunable software components that are abstractions of implemented codes. The high-level workflow is instantiated and optimized for the edge computing platform, using training data that is representative of real execution environments. The optimization process relies on empirical search: it executes and evaluates portions of a collection of equivalent alternative implementations of the workflow in order to identify the most suitable one. Machine learning, a rich knowledge representation, and an experience base aid in pruning and navigating the search space. Thus, through a systematic and principled strategy for formulating application optimization for heterogeneous platforms, the programmer's partial specification of a high-level workflow is realized as an edge computing application.
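The empirical search idea can be sketched in a few lines of Python: run each equivalent implementation of a component on representative training data, check that the results agree, and keep the fastest. This is a minimal, exhaustive sketch under stated assumptions. The three variants, `measure`, and `select_variant` are hypothetical names introduced for illustration, and the machine learning, knowledge representation, and experience base that the approach uses to prune the search space are not modeled here.

```python
import time


# Three hypothetical, functionally equivalent implementations of one component.
def sum_squares_loop(xs):
    total = 0
    for x in xs:
        total += x * x
    return total


def sum_squares_genexpr(xs):
    return sum(x * x for x in xs)


def sum_squares_map(xs):
    return sum(map(lambda x: x * x, xs))


VARIANTS = {
    "loop": sum_squares_loop,
    "genexpr": sum_squares_genexpr,
    "map": sum_squares_map,
}


def measure(fn, data, repeats=3):
    """Return the best wall-clock time of fn(data) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(data)
        best = min(best, time.perf_counter() - start)
    return best


def select_variant(variants, training_data):
    """Time every variant on the training data and return the fastest name."""
    reference = None
    timings = {}
    for name, fn in variants.items():
        result = fn(training_data)
        if reference is None:
            reference = result
        elif result != reference:
            raise ValueError(f"variant {name!r} is not equivalent to the others")
        timings[name] = measure(fn, training_data)
    best = min(timings, key=timings.get)
    return best, timings


if __name__ == "__main__":
    best, timings = select_variant(VARIANTS, list(range(100_000)))
    print("selected:", best)
```

Which variant wins depends on the machine and the training data, which is why the selection is made by measurement rather than fixed in advance.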
Cognitive techniques are well suited to managing systems of such complexity. We investigated how recent trends in using cognitive techniques for code mapping and optimization support this point, and how cognitive techniques could provide a fundamentally new programming paradigm for complex heterogeneous systems, in which programmers design self-configuring applications and the system automates optimization decisions and manages the allocation of heterogeneous resources to codes.