The previous article in this series discussed the scatter operation for moving particles to various processes. In this second part of the series, we discuss a commonly used method of communicating information between processes, in which each process is logically mapped to a “patch”.
In the animation below, particles are generated in the red patch “P0” and then scattered to the other eight patches. During a particle-based simulation, some information has to be transferred between adjacent processes. The amount of information that has to be communicated depends on a characteristic length scale that is determined by the particle algorithm. In the animation, this length is shown by the “ghost” regions outlined in a darker shade with dashed borders.
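To make the idea of a ghost region concrete, here is a minimal 2D sketch that classifies a point as owned by a patch, lying in its ghost layer, or outside both. The names `Box2D`, `Region`, `classify`, and `ghostWidth` are hypothetical and are not the structs used later in this article; the sketch also assumes axis-aligned rectangular patches and a single ghost width in every direction.

```cpp
#include <cassert>

// Hypothetical 2D axis-aligned box; not the Patch struct from this article.
struct Box2D {
  double xmin, ymin, xmax, ymax;

  // Half-open containment, so a point on a shared edge belongs to one patch only.
  bool contains(double x, double y) const {
    return x >= xmin && x < xmax && y >= ymin && y < ymax;
  }

  // The same box grown by `width` on every side.
  Box2D expanded(double width) const {
    assert(width >= 0.0);
    return {xmin - width, ymin - width, xmax + width, ymax + width};
  }
};

enum class Region { Owned, Ghost, Outside };

// Classify a point relative to a patch and its surrounding ghost layer.
// `ghostWidth` stands in for the characteristic length of the particle algorithm.
Region classify(const Box2D& patch, double ghostWidth, double x, double y) {
  if (patch.contains(x, y)) return Region::Owned;
  if (patch.expanded(ghostWidth).contains(x, y)) return Region::Ghost;
  return Region::Outside;
}
```

For a unit patch with a ghost width of 0.1, the point (0.5, 0.5) is owned, (1.05, 0.5) falls in the ghost layer, and (1.2, 0.5) is outside. A particle that is a ghost for one patch is owned by one of its neighbors, which is why the exchange described next is needed.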
Exchanging particles between processes
After the particles have been scattered and the ghost regions identified, the particles in the ghost regions are exchanged as depicted in the animation below. Notice that, in addition to the exchange between the left-right and top-bottom patches, the information at the corners of each patch also has to be communicated to the three patches adjacent to that corner, for a total of 8 communication steps per patch. For three-dimensional simulations, 26 such communication steps are needed for each patch. Also notice that all we are doing is increasing the size of each patch and including regions of overlap between patches.
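The neighbor counts above (8 in 2D, 26 in 3D) and the three-way sharing of corner particles can be checked with a short sketch. It is only an illustration under stated assumptions: the names `Offset`, `Rect`, `neighborOffsets`, `nearEdge`, and `destinations2D` are hypothetical, the patches are assumed to form a uniform grid, and no actual message passing is performed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical neighbor offset in units of patches: each component is -1, 0, or +1.
struct Offset { int dx, dy, dz; };

// Hypothetical 2D patch extent, half-open: [xmin, xmax) x [ymin, ymax).
struct Rect { double xmin, ymin, xmax, ymax; };

// All neighbors of a patch in `dim` dimensions (2 or 3), excluding the patch itself.
std::vector<Offset> neighborOffsets(int dim) {
  assert(dim == 2 || dim == 3);
  std::vector<Offset> offsets;
  const int zlo = (dim == 3) ? -1 : 0;
  const int zhi = (dim == 3) ? 1 : 0;
  for (int dz = zlo; dz <= zhi; ++dz)
    for (int dy = -1; dy <= 1; ++dy)
      for (int dx = -1; dx <= 1; ++dx)
        if (dx != 0 || dy != 0 || dz != 0) offsets.push_back({dx, dy, dz});
  return offsets;
}

// Along one axis: is coordinate `v` within `width` of the edge that faces
// the neighbor in direction `d`? A zero offset imposes no constraint.
bool nearEdge(int d, double lo, double hi, double width, double v) {
  if (d < 0) return v < lo + width;
  if (d > 0) return v >= hi - width;
  return true;
}

// Offsets of the 2D neighbors whose ghost layer contains a particle at (x, y)
// that is owned by `patch`.
std::vector<Offset> destinations2D(const Rect& patch, double ghostWidth,
                                   double x, double y) {
  std::vector<Offset> result;
  for (const Offset& o : neighborOffsets(2)) {
    if (nearEdge(o.dx, patch.xmin, patch.xmax, ghostWidth, x) &&
        nearEdge(o.dy, patch.ymin, patch.ymax, ghostWidth, y)) {
      result.push_back(o);
    }
  }
  return result;
}
```

With a unit patch and a ghost width of 0.1, a particle at (0.5, 0.5) is sent nowhere, a particle at (0.95, 0.5) is sent only to the right-hand neighbor, and a particle at (0.95, 0.95) is sent to the right, top, and top-right neighbors. In a parallel code, each non-empty destination list would become one send to the process that owns the corresponding patch.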
To keep things manageable, we create a PatchNeighborComm struct for communications between neighbor patches. We also define a Patch struct that takes care of the details for each patch.
The neighbor communication methods are defined as:
An implementation of the functions in this struct is shown below:
The Patch struct takes care of all the communication needs of each patch. The definition I cobbled together is listed below.
The implementation of the Patch struct that I came up with is summarized below. The design can definitely be improved, but recall that our goal is to do a quick parallelization of an existing serial code.
The particle exchange function
The particle exchange function in the main simulation code can then be written as follows. Note that this design follows the approach taken by Dr. B. Yan for his parallel DEM code developed at UC Boulder.
Clearly, a lot of communication and book-keeping is needed if we follow this approach. An alternative approach that uses fewer communication steps is the procedure developed by Steve Plimpton (“Fast parallel algorithms for short-range molecular dynamics”, Sandia Report SAND91-1144.UC-405, 1993).
In the next part of this series, we will discuss Plimpton’s approach for domain decomposition.
If you have questions/comments/corrections, please contact banerjee at parresianz dot com dot zen (without the dot zen).