
error when running wrf (Invalid communicator)

acast

New member
Hi,
I am trying to run wrf.exe and I get the following error in my .err file:
"Fatal error in PMPI_Cart_create: Invalid communicator, error stack:
PMPI_Cart_create(340): MPI_Cart_create(comm=0xffff8002, ndims=2, dims=0x372e570, periods=0x7f7b640, reorder=0, comm_cart=0x7fffffff02cc) failed"

Following the guidelines in this forum, I am using 72 processors on 2 nodes, since each node on my HPC system (Poseidon) has 36 processors. My namelist.input has nproc_x = 9, nproc_y = 8, e_we = 242, and e_sn = 92. I attach my namelist.input and .err files.
I have run other WRF simulations with different grid sizes without issues, and I have tried several combinations of nproc_x and nproc_y without success.
Thanks,
Alma
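
For reference, the numbers quoted in this post can be sanity-checked with a few lines of Python. This is only a sketch under stated assumptions: the helper names `check_decomposition` and `patch_size` are hypothetical (they are not part of WRF), and the integer division only approximates how WRF splits the grid into patches.

```python
# Hypothetical helpers, not part of WRF: a quick consistency check of the
# values quoted in the post (72 MPI tasks, nproc_x = 9, nproc_y = 8,
# e_we = 242, e_sn = 92).

def check_decomposition(ntasks, nproc_x, nproc_y):
    """Return True if nproc_x * nproc_y equals the number of MPI tasks."""
    return nproc_x * nproc_y == ntasks


def patch_size(e_we, e_sn, nproc_x, nproc_y):
    """Approximate grid points per patch in x and y (integer division)."""
    return e_we // nproc_x, e_sn // nproc_y


ntasks = 72
nproc_x, nproc_y = 9, 8
e_we, e_sn = 242, 92

print("decomposition matches task count:",
      check_decomposition(ntasks, nproc_x, nproc_y))
print("approximate patch size (x, y):",
      patch_size(e_we, e_sn, nproc_x, nproc_y))
```

With these values the product matches the 72 requested tasks, and each patch is roughly 26 x 11 grid points, so the patches are quite small in the south-north direction.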
 

Attachments

  • namelist.input (5.3 KB)
  • nes_err.txt (7 KB)
Hi Ming,
I am using dmpar (option 15). I should add that if I try to run wrf.exe with just one processor (as a test), it does work. The error only appears when I run the model in parallel. I checked my sbatch script and it requests 72 processors, the same as in my namelist.input.
Thanks,
Alma
 
Alma,
Option 15 indicates that WRF was compiled in dmpar mode using the Intel compiler. Is this correct? Please let me know if I am wrong.
I have a few concerns about your namelist:
(1) dx = 27786.20,
dy = 28351.26,
What map projection did you use in this case? Why are dx and dy different?
(2) For dx = 27 km, time_step can be up to 162. In your case, I suppose 150 should be fine.
(3) Please delete the options below
nproc_x = 9
nproc_y = 8
You can then run the case with the command "mpirun -np 36 ./wrf.exe", and the model will choose the decomposition automatically.
Note that the command might be different on your machine. Please consult your system administrator to make sure the correct command is used.
(4) Since your domain size is 242 x 92, I would suggest that you use a smaller number of processors. For example, 36 is better than 72. You can also try 12.
(5) Please set radt = 27.
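
The values in points (2) and (5) are consistent with the usual WRF rules of thumb: a time_step of up to about 6 seconds per kilometre of dx, and radt of about 1 minute per kilometre of dx. A minimal Python sketch of that arithmetic follows. The function names are hypothetical, and the nominal 27 km grid spacing is the rounded value used in the reply above, not the exact dx from the namelist.

```python
# Hypothetical helpers illustrating the rule-of-thumb arithmetic only;
# they are not part of WRF.

def max_time_step(dx_km, seconds_per_km=6):
    """Upper bound on time_step (seconds), assuming about 6 s per km of dx."""
    return seconds_per_km * dx_km


def suggested_radt(dx_km):
    """Radiation call interval (minutes), assuming about 1 minute per km of dx."""
    return dx_km


dx_km = 27  # nominal grid spacing used in the reply (assumption: rounded down)

print("time_step upper bound (s):", max_time_step(dx_km))
print("suggested radt (min):", suggested_radt(dx_km))
print("is time_step = 150 within the bound?", 150 <= max_time_step(dx_km))
```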
 
Hi Ming,
Yes, my compiler is Intel.
I use a lat-lon projection, and dx and dy are different because I plan to couple WRF with an ocean model (ROMS), so I had to "force" the WRF grid corners to be similar to those of ROMS.
I deleted the options nproc_x and nproc_y, ran mpirun with 12 processors, and changed radt to 27. I attach my error file. I get the same error as before, but the model seems to have run a little longer.
Thanks,
Alma
 

Attachments

  • nes_err.txt (2.2 KB)