Hi All,
I'm attempting to run geogrid using precompiled WRFV4.4 and WPSV4.4 on Cheyenne, and I keep receiving a timeout error after URB_PARAM begins to process:
Processing URB_PARAM
MPT: Launcher network accept (MPI_LAUNCH_TIMEOUT) timed out
MPT: Launcher on r7i7n7 failed to receive connection(s) from: r7i7n7.ib0.cheyenne.ucar.edu r7i6n26.ib0.cheyenne.ucar.edu
MPT: MPT ERROR: Check network connectivity between hosts.
Retry after increasing value of MPI_LAUNCH_TIMEOUT.
See MPI(1) for details.
MPT ERROR: could not launch executable
(HPE MPT 2.25 08/14/21 03:06:24)
Killed
I reached out to UCAR support for assistance and, on their recommendation, added the following to my submission script, to no avail:
### Select 2 nodes with 36 CPUs each, for 72 MPI processes in total
#PBS -l select=2:ncpus=36:mpiprocs=36:nodetype=largemem
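The MPT message itself suggests retrying with a larger MPI_LAUNCH_TIMEOUT, so a retry along those lines might look like the sketch below. This is only a rough outline under assumptions: the value 60 is an arbitrary illustration rather than a recommended setting, and the launch line assumes geogrid.exe is started with mpiexec_mpt from the WPS directory, which may not match the actual script.

```shell
#!/bin/bash
### Same resource request as above
#PBS -l select=2:ncpus=36:mpiprocs=36:nodetype=largemem

# Assumed value for illustration only: raise the time the MPT launcher
# waits for connections from the other hosts before giving up.
export MPI_LAUNCH_TIMEOUT=60

# Record the setting in the job output so it is visible when debugging.
echo "MPI_LAUNCH_TIMEOUT=${MPI_LAUNCH_TIMEOUT}"

# Assumed launch line; adjust the path to geogrid.exe as needed.
mpiexec_mpt ./geogrid.exe
```

Whether a longer timeout actually addresses the failure is not something I can confirm; it only follows the hint printed in the error output.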
I'm hoping somebody else has run into this issue and might have a fix. I've attached my output file and namelist for reference.
Thank you in advance!