
How to parallelize a part of reading an aerosol file in module_initialize_real.F in dyn_em directory

This post was from a previous version of the WRF & MPAS-A Support Forum. New replies have been disabled; if you have follow-up questions related to this post, please start a new thread from the forum home page.


New member
I want to parallelize a section of code that I inserted into "module_initialize_real.F". The inserted code is as follows:
OPEN (61, FILE='aron_2001030100_d3.txt', STATUS='OLD')
do j = jts, jde-1
   do i = its, ide-1
      read (61,*) iii, jjj, grid%aron(i,kts,j)
   end do
end do
close (61)

Also, the aerosol file "aron_2001030100_d3.txt" is structured as follows:
1 1 500.00000
2 1 500.00000
3 1 500.00000
4 1 500.00000

3347 2281 500.00000
3348 2281 500.00000
3349 2281 500.00000
3350 2281 500.00000
3351 2281 500.00000
3352 2281 500.00000

The inserted code is designed for a single CPU, and it works well in that case. When multiple CPUs are used, the code does not behave as intended, since it was not written for parallel computation. Could you let me know how to modify the code so that it works with multiple CPUs?
Thanks for your kind attention, and I hope to hear from you.
I would like to make three points:

First: How do I broadcast non-decomposed data to all processors?

There are a number of examples of reading a data file with one processor and then broadcasting that information to all of the processors. This is typically what is done with look-up tables for physics schemes. Every MPI process gets exactly the same information (such as the thermal conductivity for land category 17). A typical place to find one of these examples is in one of the land surface models; look for the input file string "VEGPARM.TBL".

You will see a block that starts with
IF ( wrf_dm_on_monitor() ) THEN
This is followed by the usual OPEN, READ, and CLOSE statements, and then by the closing ENDIF for the wrf_dm_on_monitor test.

Outside of the wrf_dm_on_monitor block, there is a sequence of WRF-ized MPI calls:
CALL wrf_dm_bcast_real    ( ALBTBL  , NLUS )
Each call broadcasts one variable (in this case a real array) of the given length (in units of the type).
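Putting those pieces together, the pattern looks roughly like the sketch below. This is illustrative only: ALBTBL and NLUS are borrowed from the VEGPARM.TBL example above, the single list-directed READ is a stand-in for the real entry-by-entry table read, and wrf_dm_on_monitor / wrf_dm_bcast_real are the WRF utilities named earlier.

```fortran
! Sketch of the monitor-read / broadcast pattern for non-decomposed data.
! Only the monitor (rank 0) task touches the file.
IF ( wrf_dm_on_monitor() ) THEN
   OPEN  (61, FILE='VEGPARM.TBL', STATUS='OLD')
   READ  (61,*) ALBTBL          ! stand-in for the real table-reading code
   CLOSE (61)
ENDIF

! Outside the monitor test: every MPI task receives an identical copy.
CALL wrf_dm_bcast_real ( ALBTBL , NLUS )
```

The key point is that the broadcast call is executed by every task, not just the monitor, which is why it sits outside the IF block.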

Second: Are decomposed fields different?

Yes. We typically do not have a domain-sized 3d array (for example, the aron field looks to be a registry variable) allocated on each MPI process, so the above "solution" will not work for you.
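If you nonetheless want to keep reading the text file directly, one common (if inefficient) workaround is to let every MPI task read the entire file and store only the points that fall inside its own patch. A rough sketch, assuming the file covers the full unstaggered domain with i varying fastest and that iii/jjj in each record are the global indexes:

```fortran
! Hedged sketch: every task reads the whole file, keeps only its own patch.
REAL :: value
OPEN (61, FILE='aron_2001030100_d3.txt', STATUS='OLD')
do j = jds, jde-1
   do i = ids, ide-1
      read (61,*) iii, jjj, value
      ! Keep the value only if the global point lies in this task's tile.
      if ( iii >= its .and. iii <= MIN(ite,ide-1) .and. &
           jjj >= jts .and. jjj <= MIN(jte,jde-1) ) then
         grid%aron(iii,kts,jjj) = value
      end if
   end do
end do
close (61)
```

Every task opens and scans the same file, so this does not scale, but it is a simple way to get a decomposed field filled correctly.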

To reduce the number of back-and-forth iterations, here is a one-off way of doing this. Run the real program with one processor. Change the Registry so that this input variable's I/O string is "irh"; that way it will automatically be written to the wrfinput_d0x file (as well as to the restart file and the model history file, when you get to those steps). Be careful if you are running nests. Then you can run the WRF model with as many MPI processes as you want, since the decomposed data is in a standard file set up to handle exactly this situation.
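For reference, a Registry state entry with that I/O string might look roughly like the line below. The dimension string, use, description, and units fields here are placeholder guesses for an aron-like field; check the column layout of existing entries in Registry/Registry.EM_COMMON before copying this.

```
state  real  aron  ikj  misc  1  -  irh  "ARON"  "inserted aerosol field"  "units"
```

The "irh" field is the part that matters: i marks the variable for input, r for the restart file, and h for the history output.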

Side comment

The code snippet that you included may just be a pseudo-code example. If not ...
  • Is there a reason to allocate the array as 3d but read in only the surface level?
  • The horizontal indexing might need to be adjusted for the staggering.
  • The loop limits ide-1 and jde-1 should usually not appear directly; use the tile limits (for example, MIN(ite,ide-1) and MIN(jte,jde-1)) so that the indexing remains within the space allocated on each MPI process.