
WRF Nested Run

sksahk

Hi All,

I am getting the following error when submitting a WRF simulation with a nested domain.
em_real]$ tail -f rsl.error.0192
pe 89 1225 1
ms -4 1122 1
me 99 1236 1
d02 2014-12-25_00:00:00 module_io.F: in wrf_write_field
Fatal error in PMPI_Gather: A process has failed, error stack:
PMPI_Gather(856)..........: MPI_Gather(sbuf=0x2e93f04, scount=1, MPI_INT, rbuf=0xa2f5b80, rcount=1, MPI_INT, root=0, comm=0xc400000b) failed
MPIR_Gather_impl(681).....:
MPIR_Gather(641)..........:
MPIR_Gather_intra(256)....:
dequeue_and_set_error(888): Communication error with rank 224
^C
em_real]$ tail -f rsl.error.0000
ds 1 1 1
de 1419 1509 1
ps 1 1 1
pe 89 95 1
ms -4 -4 1
me 99 105 1
d02 2014-12-25_00:00:00 module_io.F: in wrf_write_field
FATAL ERROR: collect_on_comm: noutbuf_loc (1776967900) > noutbuf (1536)
WILL NOT perform the collection operation
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0

I have attached my namelist.input for your reference.
The simulation works well with only one domain (d01).

Kindly suggest a solution.

Regards

Saurabh Kumar
 

Attachments

  • namelist.input (4.8 KB)
  • rsl.error.0000 (598.9 KB)
Hi,
It's possible that you need more processors to run this. If you have more available, can you try using more, perhaps a total greater than 600?
 
Okay, then as a test to see whether the number of processors is the problem, can you first try running with only a single domain, to see whether you're able to run the 823x681 domain with 256 processors? If that works, then add the second domain back, but make it much smaller, around the same size as domain 01 or even smaller, and see whether you can run with that size of domain. You'll need to re-run geogrid and metgrid for the new domain.
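
To put the processor counts mentioned in this thread in perspective, here is a rough Python sketch of how large each process's patch of d02 would be. It assumes the processes are arranged in a near-square grid and that the d02 extents are the ones printed in the "de 1419 1509" line of rsl.error.0000. The function names are made up for illustration, and this is only an estimate, not WRF's actual decomposition code (WRF spreads the remainder over tiles, so some patches can be one point smaller).

```python
import math


def closest_factor_pair(nprocs):
    """Split nprocs into (px, py), px <= py, as close to square as possible."""
    px = math.isqrt(nprocs)
    while nprocs % px != 0:
        px -= 1
    return px, nprocs // px


def max_tile_size(nx, ny, nprocs):
    """Largest per-process patch (in grid points) for an nx-by-ny domain."""
    px, py = closest_factor_pair(nprocs)
    return math.ceil(nx / px), math.ceil(ny / py)


# d02 extents taken from the "de 1419 1509" line in rsl.error.0000
for nprocs in (256, 600):
    grid = closest_factor_pair(nprocs)
    tile = max_tile_size(1419, 1509, nprocs)
    print(nprocs, grid, tile)
```

With 256 processes this gives a 16x16 grid and patches of up to 89x95 points, which agrees with the "pe 89 95" line for rank 0 in the log above. With 600 processes the estimate drops to roughly 60x61 points per patch, so each process handles less than half as many grid points.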
 