Decoupled Execution Paradigm Design

The decoupled execution paradigm (DEP) consists of three components: system architecture, programming model, and runtime system. The architecture view of DEP is shown in the below Figure.

dep-arch-web.jpg

  • System architecture. DEP decouples the nodes into data processing (data) nodes and compute nodes. Data nodes are further decoupled into compute-side data nodes and storage-side data notes. Compute-side data nodes are compute nodes that are dedicated for data processing. Storage-side data nodes are specially designed nodes that are connected to file servers with fast network. Compute-side data nodes reduce the size of computing generated data before sending it to storage nodes. Storage-side data nodes reduce the size of data retrieved from storage before sending it to compute-side data nodes. Writes will go through compute-side data nodes, whereas reads will go through the storage-side data nodes. Data nodes can provide simple data forwarding without any data size reduction, but the idea behind data nodes is to let the data nodes conduct the decoupled data-intensive operations and optimizations to reduce the data size and movement.
  • Programming model. What operations should be passed to the data nodes are determined by users and supported by the decoupled execution programming model (DEPM). The DEPM component is an MPI extension, allowing users to specify operations conducted on data nodes, instead of on compute nodes as the normal MPI library does. The purpose of the MPI extension and the DEPM component are similar to the netCDF Operators 1) in some sense, allowing data-intensive operations to be decoupled and processed on data nodes, and the results being sent back to compute nodes for further processing. For instance, an ncwa operator in netCDF computes the weighted average on specified data and returns the result for further computations, reducing the unnecessary data movement. Different from netCDF Operators, however, the DEPM is extended and much more powerful. It allows operations to be decoupled not only operators, which essentially allows general piece of code to be executed on data nodes, beyond operators. In addition, the DEPM allows optimizations across operations, which is impossible in the netCDF Operators.
  • Runtime system. At runtime, the DEP relies on two libraries, message passing library and data processing library, to support computation-intensive operations and data-intensive operations respectively. The message passing library focuses on the memory abstraction of massively parallel processes and provides the runtime support for computation-intensive operations to be run on massive compute nodes. We leverage the existing MPI library for this purpose. The data processing library focuses on the I/O abstraction and provides runtime support for data-intensive operations to be run on data nodes. These two libraries are tightly coupled, and the message passing library manages the interaction between these two libraries as well. The runtime system can optimize user-defined data-intensive operations and other I/O optimization operations on data nodes as well. The proposed decoupled execution paradigm changes the current execution paradigm by balancing the computation and data-access capabilities. This new paradigm separates computation-intensive operations and data-intensive operations and handles them concurrently and in a coordinated manner, but on different hardware and software environments for best performance.
1) C. S. Zender and D. L. Wang. High performance distributed data reduction and analysis with the netCDF Operators (NCO). Presented to the 23rd AMS Conference on Interactive Information and Processing Systems (IIPS) for Meteorology, Oceanography, and Hydrology, January 14–18, San Antonio, TX, January 14–18, 2007.
Recent changes RSS feed Debian Powered by PHP Valid XHTML 1.0 Valid CSS Driven by DokuWiki