Grid indexing / local cell re-ordering for cache reuse

marcimb

New member
Hello,

after browsing through some of the older MPAS presentation slides in https://www2.mmm.ucar.edu/people/duda/files/mpas/old_talks/mpas_leeds.pdf, I found a slide about local cell re-ordering (slide 10), which has the potential to improve cache reuse. While the presentation mentions that no local cell re-ordering is performed for MPAS meshes, the graph to the right suggests that there is a possible way of obtaining this kind of local cell re-ordering. Have there been further developments in this direction, and are there any plans to include such a local cell re-ordering in the official MPAS workflow?

Best regards,
Marc
 
Apologies for the long delay in replying! While we don't perform any online cell reordering, we do have some experimental offline code that orders the cells according to a modified version of the reverse Cuthill-McKee algorithm given a mesh file and a Metis partition file. I don't remember specific numbers, but I think we were seeing at most a ~10% performance improvement when each MPI rank was assigned O(10^3) to O(10^4) grid columns, and less improvement when scaling out to higher numbers of MPI tasks (perhaps because the smaller blocks of cells fit better into cache?).
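
For anyone wanting to experiment with the idea, here is a minimal sketch of plain reverse Cuthill-McKee on a cell adjacency list. It is not the modified variant or the experimental offline code mentioned above, and it ignores the Metis partition file entirely. The function names and the tiny six-cell chain are made up for illustration. The "bandwidth" here is the largest index distance between neighbouring cells, used as a rough proxy for how far apart neighbours sit in memory.

```python
from collections import deque


def reverse_cuthill_mckee(adjacency):
    """Return a cell ordering (list of old cell indices) from plain RCM.

    adjacency[i] is the list of cells that share an edge with cell i.
    """
    n = len(adjacency)
    degree = [len(nbrs) for nbrs in adjacency]
    visited = [False] * n
    order = []
    # Seed each connected component from its lowest-degree unvisited cell.
    for start in sorted(range(n), key=lambda c: (degree[c], c)):
        if visited[start]:
            continue
        visited[start] = True
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            order.append(cell)
            # Breadth-first search, visiting low-degree neighbours first.
            for nbr in sorted(adjacency[cell], key=lambda c: (degree[c], c)):
                if not visited[nbr]:
                    visited[nbr] = True
                    queue.append(nbr)
    order.reverse()  # the "reverse" in reverse Cuthill-McKee
    return order


def bandwidth(adjacency, order):
    """Largest distance in the new ordering between any two neighbours."""
    position = {old: new for new, old in enumerate(order)}
    return max(
        (abs(position[i] - position[j])
         for i, nbrs in enumerate(adjacency) for j in nbrs),
        default=0,
    )


# Hypothetical mesh: a chain of six cells with scrambled labels,
# connected as 0-3-1-5-2-4.
adjacency = [[3], [3, 5], [5, 4], [0, 1], [2], [1, 2]]
order = reverse_cuthill_mckee(adjacency)
print("new ordering:", order)
print("bandwidth before:", bandwidth(adjacency, list(range(len(adjacency)))))
print("bandwidth after: ", bandwidth(adjacency, order))
```

On a real mesh one would then permute every cell-indexed field and remap the connectivity arrays with this ordering; that bookkeeping is left out here.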

Generally, I think we've found that placing the vertical dimension innermost helps to alleviate the impact of the indirect addressing that's necessitated by our use of a horizontally unstructured grid. See, e.g., MacDonald et al. (IJHPCA 2010).
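
To illustrate the point about the vertical dimension, here is a small sketch of a neighbour average on a flat array where the vertical index varies fastest. The names (neighbour_mean, cells_on_cell, n_levels) and the three-cell mesh are assumptions for this example, not taken from the MPAS source. The indirect neighbour lookup happens once per column, and the inner loop then walks contiguous memory. Pure Python will not show the cache effect itself; the sketch only shows the access pattern.

```python
def neighbour_mean(field, cells_on_cell, n_levels):
    """Average each cell's neighbours, level by level.

    field is a flat list laid out as field[cell * n_levels + k], so the
    vertical index k is innermost (fastest varying).
    Assumes every cell has at least one neighbour.
    """
    n_cells = len(cells_on_cell)
    out = [0.0] * (n_cells * n_levels)
    for cell in range(n_cells):
        base_out = cell * n_levels
        for nbr in cells_on_cell[cell]:   # indirect address: once per neighbour
            base_in = nbr * n_levels
            for k in range(n_levels):     # contiguous sweep over the column
                out[base_out + k] += field[base_in + k]
        scale = 1.0 / len(cells_on_cell[cell])
        for k in range(n_levels):
            out[base_out + k] *= scale
    return out


# Hypothetical mesh: three mutually adjacent cells, two vertical levels.
cells_on_cell = [[1, 2], [0, 2], [0, 1]]
n_levels = 2
field = [0.0, 1.0, 10.0, 11.0, 20.0, 21.0]  # value = 10 * cell + level
result = neighbour_mean(field, cells_on_cell, n_levels)
print(result)
```

With the horizontal index innermost instead, every single element access in the inner loop would go through the neighbour list, which is the cost the layout above spreads over a whole column.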
 