Error running geogrid.exe during URB_Param processing

ccrossett

Hi All,

I'm attempting to run geogrid using the precompiled WRFV4.4 and WPSV4.4 on Cheyenne, and I keep receiving a timeout error after URB_PARAM begins processing:

Processing URB_PARAM
MPT: Launcher network accept (MPI_LAUNCH_TIMEOUT) timed out
MPT: Launcher on r7i7n7 failed to receive connection(s) from: r7i7n7.ib0.cheyenne.ucar.edu r7i6n26.ib0.cheyenne.ucar.edu
MPT: MPT ERROR: Check network connectivity between hosts.
Retry after increasing value of MPI_LAUNCH_TIMEOUT.
See MPI(1) for details.
MPT ERROR: could not launch executable
(HPE MPT 2.25 08/14/21 03:06:24)
Killed


I reached out to UCAR support for assistance and, on their recommendation, included the following in my submission script, to no avail:

### Select 2 nodes with 36 CPUs, for 72 MPI processes
#PBS -l select=2:ncpus=36:mpiprocs=36:nodetype=largemem


Therefore, I'm hoping somebody else has run into this issue and might have a fix. I've attached my output file and namelist for reference.

Thank you in advance!
 

Attachments

  • geogrid_output.txt
    1.5 KB · Views: 6
  • namelist.wps.txt
    1.2 KB · Views: 7
Hi,
Your namelist.wps looks fine.
Based on your geogrid_output.txt, it looks like you are running geogrid.exe in parallel mode. Please let me know if I am wrong.
Note that the precompiled WPS executables on Cheyenne are all built in serial mode, so you need to run geogrid.exe directly with ./geogrid.exe.
 
Hi Ming,

I'm very new to running WRF (and even newer to doing so on Cheyenne), so yes, I think I am running in parallel mode.

A potentially very basic question: I store my namelist and batch script (attached) in a directory in my scratch space and then just tell the batch script to go find geogrid.exe within the directory that houses the precompiled WPS code that I've also copied to my scratch space. I don't have permissions to move the executable to the directory with my batch script, so is there another way I should be linking to that WPS directory? Or am I doing this completely incorrectly?

Thanks again for your help!
 

Attachments

  • rungeogrid.txt
    456 bytes · Views: 8
Hi,
I would suggest that you copy (not link) the precompiled WPS/WRF code to your scratch space.
Then you can simply run the command ./geogrid.exe in your WPS directory.
Please don't use the job script "rungeogrid.txt" you attached: it runs in parallel mode using 72 processors on Cheyenne, whereas the precompiled WPS is built in serial mode.
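
If you would still rather submit geogrid through PBS than run it at the command line, a serial job script could look roughly like the sketch below. This is only an illustration, not an official recipe. The job name, the PROJECT_CODE placeholder, the queue name, the walltime, the WPS_DIR variable, and the run_geogrid_serial helper are all assumptions that you would need to adapt. The key points are that it requests a single CPU, does not use an MPI launcher, and runs ./geogrid.exe from inside the WPS directory, because geogrid.exe reads namelist.wps from the directory it is run in.

```shell
#!/bin/bash
#PBS -N geogrid_serial
#PBS -A PROJECT_CODE
#PBS -q regular
#PBS -l walltime=01:00:00
### One node, one CPU: the precompiled geogrid.exe is a serial build
#PBS -l select=1:ncpus=1

# Hypothetical helper: run geogrid.exe serially from inside a WPS directory.
# $1 is the path to your own copy of the WPS directory.
run_geogrid_serial() {
    local wps_dir="$1"
    if [ ! -x "$wps_dir/geogrid.exe" ]; then
        echo "geogrid.exe not found or not executable in: $wps_dir" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    # No mpiexec/mpirun here: the executable is started directly.
    ( cd "$wps_dir" && ./geogrid.exe )
}

# WPS_DIR is an assumed variable name; set it to your copied WPS directory,
# for example with: qsub -v WPS_DIR=/path/to/your/WPS this_script.sh
if [ -n "${WPS_DIR:-}" ]; then
    run_geogrid_serial "$WPS_DIR"
fi
```

With this layout, namelist.wps needs to sit in the copied WPS directory next to geogrid.exe, not in a separate directory with the batch script.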
 