Hi WRF Forum folks,
I've encountered a rather strange error while moving some WRF ensemble code from Cheyenne to Derecho here at NCAR. When I try to run an ensemble, a random subset of members fails each time with this error:
taskid: 0 hostname: dec0460
module_io_quilt_old.F 2931 F
MPASPECT: UNABLE TO GENERATE PROCESSOR MESH. STOPPING.
PROCMIN_M 1
PROCMIN_N 1
P -71
MINM 1
MINN -71
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 125
module_dm: mpaspect
-------------------------------------------
Abort(1) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
I suspected this might be related to my nio_tasks_per_group or nio_groups settings in namelist.input, but I've tried several combinations of these (nio_tasks_per_group/nio_groups = 1/4, 4/4, 6/6, 6/12, and 12/12), along with several different processor and node counts, and haven't seen any improvement. The most confusing part is that the ensemble members that fail are different on every run, even when the exact same settings are submitted twice in a row. Have you encountered anything like this?
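For reference, the quilting section of my namelist.input looks roughly like this (shown here with the 6/6 combination; the other attempts only changed these two values):

&namelist_quilt
 nio_tasks_per_group = 6,
 nio_groups = 6,
/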
Thanks,
Matt Wilson