Utilizing Dataflow-Based Execution for Coupled Cluster Methods

Heike McCraw, University of Tennessee - Knoxville
Anthony Danalis, University of Tennessee - Knoxville
Thomas Herault, University of Tennessee - Knoxville
George Bosilca, University of Tennessee - Knoxville
Jack Dongarra, University of Tennessee - Knoxville
Karol Kowalski, Pacific Northwest National Laboratory
Theresa Lynn Windus, Iowa State University

This poster is from the 2014 IEEE International Conference on Cluster Computing (CLUSTER) (2014): 296, doi:10.1109/CLUSTER.2014.6968738.


Computational chemistry is one of the driving forces of High Performance Computing. Many-body methods, such as the Coupled Cluster (CC) methods (Bartlett and Musial, 2007) of the quantum chemistry package NWCHEM (Valiev et al., 2010), are of particular interest to the applied chemistry community. With the increase in scale, complexity, and heterogeneity of modern platforms, traditional programming models fail to deliver the expected performance scalability. On our way to Exascale, we believe that dataflow-based programming models - in contrast to the control flow model (e.g., as implemented in languages such as C) - may be the only viable way of achieving and maintaining computation at scale. In this work, we discuss a dataflow-based programming model and its applicability to NWCHEM's CC methods. Our dataflow version of the CC kernels breaks the algorithm down into finer-grained tasks with explicitly defined data dependencies. As a result, the serialization imposed by the traditional, linear algorithms can be transformed into parallelism, allowing the overall computation to scale to much larger computational resources. We build this experiment using the Parallel Runtime Scheduling and Execution Control (PARSEC) framework (Bosilca et al., 2012) - a task-based, dataflow-driven execution engine - that enables efficient task scheduling on distributed systems.
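To make the idea of fine-grained tasks with explicit data dependencies concrete, the following is a minimal sketch in Python. It is not PARSEC's API and not NWCHEM code: the function `run_dataflow`, the helper `make_product`, and the task names are hypothetical, and a dot product stands in for a CC tensor contraction. Each task declares the tasks whose outputs it consumes, and a task is launched as soon as all of its inputs exist, rather than in a fixed program order.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def run_dataflow(tasks, workers=4):
    """Run tasks given as {name: (function, [names of input tasks])}.

    Hypothetical toy scheduler: a task starts once all of its inputs
    have been produced. Returns {name: result}.
    """
    results = {}           # outputs of finished tasks
    pending = dict(tasks)  # tasks not yet launched
    running = {}           # future -> task name
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending or running:
            # Launch every pending task whose inputs are all available.
            ready = [name for name, (_, deps) in pending.items()
                     if all(d in results for d in deps)]
            for name in ready:
                func, deps = pending.pop(name)
                future = pool.submit(func, *(results[d] for d in deps))
                running[future] = name
            if not running:
                # Nothing is running and nothing can start.
                raise ValueError("cyclic or unsatisfiable dependencies")
            # Block until at least one running task completes.
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                results[running.pop(future)] = future.result()
    return results


def make_product(x, y):
    """Build an input-free task that computes one partial product."""
    return lambda: x * y


# Stand-in for a contraction: the dot product of a and b, split into
# four independent partial products followed by one reduction task.
a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
tasks = {f"p{i}": (make_product(a[i], b[i]), []) for i in range(4)}
tasks["sum"] = (lambda *parts: sum(parts), [f"p{i}" for i in range(4)])

results = run_dataflow(tasks)
print(results["sum"])  # 1*5 + 2*6 + 3*7 + 4*8 = 70
```

In this sketch the four partial products have no inputs, so they may run concurrently, while the reduction waits only on the data it actually needs. A real dataflow runtime additionally handles data placement and communication across nodes, which this single-process example does not attempt.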