Load Balancing of Dynamical Nucleation Theory Monte Carlo Simulations Through Resource Sharing Barriers
This paper appears in the proceedings of the 2012 IEEE 26th International Parallel & Distributed Processing Symposium (IPDPS), 2012: 285, doi:10.1109/IPDPS.2012.35.
The dynamical nucleation theory Monte Carlo (DNTMC) application from the NWChem computational chemistry suite uses a two-level parallel Markov chain Monte Carlo structure, with periodic synchronization points that assemble the results of independent finer-grained calculations. Like many such applications, the existing code statically partitions processes into groups and assigns each group a piece of the finer-grained parallel calculation. A significant cause of performance degradation is load imbalance among groups, since the time requirements of the inner parallel calculation vary widely with the input problem and with the course of the Monte Carlo simulation. We present a novel approach to load balancing such calculations with minimal changes to the application. We introduce the concept of a resource sharing barrier (RSB): a barrier that allows process groups waiting on other groups' work to actively contribute to its completion. The RSB load balancing technique is applied to the production DNTMC application code, resulting in a small code change of 200 lines and a reduction in execution time of up to 37%.
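The idea of a resource sharing barrier can be illustrated with a minimal sketch. The code below is not the DNTMC or NWChem implementation: it assumes Python threads stand in for process groups, a shared list of task queues stands in for each group's share of the inner calculation, and the names `ResourceSharingBarrier`, `run_and_wait`, and `run_groups` are hypothetical. A group that exhausts its own queue takes pending tasks from other groups instead of idling, and only then waits at an ordinary barrier.

```python
import threading
from collections import deque


class ResourceSharingBarrier:
    """Toy RSB (illustrative only): early arrivals help drain other groups' work."""

    def __init__(self, task_queues):
        self._queues = task_queues                  # one deque of callables per group
        self._lock = threading.Lock()               # guards the queues and the counter
        self._barrier = threading.Barrier(len(task_queues))
        self.helped = 0                             # tasks taken from another group's queue

    def _take(self, own):
        # Prefer the caller's own queue, then fall back to any other group's queue.
        with self._lock:
            order = [own] + [g for g in range(len(self._queues)) if g != own]
            for g in order:
                if self._queues[g]:
                    if g != own:
                        self.helped += 1
                    return self._queues[g].popleft()
        return None

    def run_and_wait(self, own):
        # Keep working while any group still has pending tasks.
        while (task := self._take(own)) is not None:
            task()
        # All queues are empty; wait until every group has finished its last task.
        self._barrier.wait()


def run_groups(workloads):
    """Square every integer in `workloads` (one list per group) using the toy RSB."""
    results = {}
    results_lock = threading.Lock()

    def make_task(x):
        def task():
            y = x * x                               # stand-in for an inner calculation
            with results_lock:
                results[x] = y
        return task

    queues = [deque(make_task(x) for x in work) for work in workloads]
    rsb = ResourceSharingBarrier(queues)
    threads = [
        threading.Thread(target=rsb.run_and_wait, args=(g,))
        for g in range(len(queues))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, rsb.helped
```

Calling `run_groups([[1, 2, 3, 4, 5, 6], [7], []])` models an imbalanced partition: the second and third groups may finish early and pick up tasks from the first. How many tasks are actually shared depends on thread scheduling, so `helped` varies from run to run, while the set of completed results does not.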