
(RESOLVED) serial vs. dmpar


Xun_Wang

New member
Hi,

I have a question regarding the configure options of WPS. According to http://www2.mmm.ucar.edu/wrf/OnLineTutorial/compilation_tutorial.php, it is recommended to compile WPS in serial. Why is that? For long-term runs, running metgrid.exe in parallel does save some computational time. Are there any drawbacks to building WPS in dmpar?

Thanks for your help!
 
kwerner
Hi,
That is a good question. We have found that running WPS in parallel isn't much faster than a simple serial run, since the WPS programs finish so quickly anyway. It does become necessary for very large domains (thousands by thousands of grid cells). That said, if you find that parallel processing speeds things up for your runs, it is perfectly fine to use it. As I'm sure you already know, geogrid and metgrid are the only programs that can be run in parallel; ungrib must still be run serially.
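For reference, a minimal sketch of what a dmpar build and parallel run might look like. The configure option numbers vary by platform and compiler, and the MPI launcher (mpirun) and task count (8) here are only examples:

    # Build WPS with distributed-memory parallelism: pick a "dmpar" option
    # from the interactive configure menu (option numbers differ per system)
    cd WPS
    ./configure
    ./compile >& compile.log

    # geogrid and metgrid can run under MPI; ungrib must stay serial
    mpirun -np 8 ./geogrid.exe
    ./ungrib.exe
    mpirun -np 8 ./metgrid.exe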
 
kwthomas
Writing large NETCDF files can take a *LONG* time. That's why I use PNETCDF for WRF forecasts.

Are you writing to a Lustre filesystem that supports striping? Most do, though I know of one that doesn't and scrambles files if you attempt it. Striping can get you an I/O improvement.

You'll need to do testing to see what stripe counts work best for you. Based on my experience (TACC, PSC), even numbers perform much better than odd. Why, I don't know. Currently, I use 4 on both STAMPEDE2 (TACC) and BRIDGES (PSC). Use "lfs getstripe" to see the current value and "lfs setstripe" to set it.
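As a rough sketch, something along these lines. The directory path and stripe count are placeholders, and io_form_history = 11 selects pnetcdf output only if WRF was built with pnetcdf support:

    # Check the current stripe settings on the run directory
    lfs getstripe /path/to/wrf_run_dir

    # Stripe new files in the run directory across 4 OSTs
    lfs setstripe -c 4 /path/to/wrf_run_dir

    # In namelist.input (&time_control), switch history output to pnetcdf
    # (requires WRF compiled against pnetcdf):
    #   io_form_history = 11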
 
kwerner said:
Hi,
That is a good question. We have found that running WPS in parallel isn't much faster than a simple serial run, since the WPS programs finish so quickly anyway. It does become necessary for very large domains (thousands by thousands of grid cells). That said, if you find that parallel processing speeds things up for your runs, it is perfectly fine to use it. As I'm sure you already know, geogrid and metgrid are the only programs that can be run in parallel; ungrib must still be run serially.

Hi,
Thanks for your reply. My domains are not that big. I have two nested domains with 200*130 and 382*253 grid points, but I am using metgrid.exe to generate hourly met_em.d0* files, which is why it makes a difference. For example, when generating 36 hours of met_em.d0* files, serial takes 30 min, while parallel with 8 nodes takes only 13 min.
 
kwthomas said:
Writing large NETCDF files can take a *LONG* time. That's why I use PNETCDF for WRF forecasts.

Are you writing to a Lustre filesystem that supports striping? Most do, though I know of one that doesn't and scrambles files if you attempt it. Striping can get you an I/O improvement.

You'll need to do testing to see what stripe counts work best for you. Based on my experience (TACC, PSC), even numbers perform much better than odd. Why, I don't know. Currently, I use 4 on both STAMPEDE2 (TACC) and BRIDGES (PSC). Use "lfs getstripe" to see the current value and "lfs setstripe" to set it.

Hi Kevin,
Thanks for your reply and suggestion. However, my met_em.d0* files are not large at all, only 152 MB each. I am using hourly forcing data to produce hourly met_em.d0* files, which is why it takes a long time.
 