WRF issue with Cheyenne

This post is from a previous version of the WRF & MPAS-A Support Forum. New replies have been disabled; if you have follow-up questions related to this post, please start a new thread from the forum home page.

Jade666

Hello all,

I was trying to run WRF on Cheyenne, but the submitted job kept being terminated with an error like the one below:

MPT ERROR: MPI_COMM_WORLD rank 233 has terminated without calling MPI_Finalize()
aborting job

I have checked some of the rsl files, and it seems to be failing on my input file, with the message shown below:
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 115
program wrf: error opening wrfinput_d01 for reading ierr= -1021
-------------------------------------------

However, my input file worked properly in one of my previous runs. So I am wondering whether some issue other than the input file could cause the above error. Also, is there a way to check why the input file is not working?



Thanks.
 
Please recompile WRF and see whether this case works.
If you still get the same error, look at rsl.out.0233 and rsl.error.0233 and find the error message in these rsl files. This will give us more information about what could be wrong.
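When a job uses several hundred MPI ranks, opening rsl files one by one is slow. The following is a minimal Python sketch, not part of WRF, that scans every rsl.* file in a run directory and collects the message lines that follow a "FATAL CALLED FROM FILE" banner. The function name find_fatal_messages and the run_dir parameter are hypothetical, and the sketch assumes the fatal banner has the layout shown in this thread (message lines followed by a closing line of dashes).

```python
from pathlib import Path


def find_fatal_messages(run_dir="."):
    """Return {file name: [message lines]} for rsl.* files containing a fatal banner.

    Assumes the banner layout seen in this thread: a 'FATAL CALLED FROM FILE'
    line, then one or more message lines, then a line made only of dashes.
    """
    hits = {}
    for path in sorted(Path(run_dir).glob("rsl.*")):
        lines = path.read_text(errors="replace").splitlines()
        for i, line in enumerate(lines):
            if "FATAL CALLED FROM FILE" not in line:
                continue
            messages = []
            for follow in lines[i + 1:]:
                stripped = follow.strip()
                # Stop at the closing line of the banner (dashes only).
                if stripped and set(stripped) == {"-"}:
                    break
                if stripped:
                    messages.append(stripped)
            hits.setdefault(path.name, []).extend(messages)
    return hits
```

Calling find_fatal_messages("/path/to/run") from the run directory returns a dictionary keyed by rsl file name, so the ranks that failed and their messages can be printed in one pass.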
 
Thank you so much for your reply. I have recompiled WRF following some of the suggestions in this post: https://forum.mmm.ucar.edu/phpBB3/viewtopic.php?f=47&t=208.

This time the error message is:
MPT ERROR: MPI_COMM_WORLD rank 224 has terminated without calling MPI_Finalize()
aborting job

So I assume that this time I should report the error message in rsl.out.0224, which is shown below:

taskid: 224 hostname: r5i1n0
Error reading namelist &logging from namelist.input. Using default logging config.
Ntasks in X 18 , ntasks in Y 20
--- NOTE: sst_update is 0, setting io_form_auxinput4 = 0 and auxinput4_interval = 0 for all domains
--- NOTE: sst_update is 0, setting io_form_auxinput4 = 0 and auxinput4_interval = 0 for all domains
--- NOTE: sst_update is 0, setting io_form_auxinput4 = 0 and auxinput4_interval = 0 for all domains
--- NOTE: both grid_sfdda and pxlsm_soil_nudge are 0 for domain 1, setting sgfdda interval and ending time to 0 for that domain.
--- NOTE: obs_nudge_opt is 0 for domain 1, setting obs nudging interval and ending time to 0 for that domain.
--- NOTE: obs_nudge_opt is 0 for domain 2, setting obs nudging interval and ending time to 0 for that domain.
--- NOTE: grid_fdda is 0 for domain 3, setting gfdda interval and ending time to 0 for that domain.
--- NOTE: both grid_sfdda and pxlsm_soil_nudge are 0 for domain 3, setting sgfdda interval and ending time to 0 for that domain.
--- NOTE: obs_nudge_opt is 0 for domain 3, setting obs nudging interval and ending time to 0 for that domain.
bl_pbl_physics /= 4, implies mfshconv must be 0, resetting
--- NOTE: num_soil_layers has been set to 4
WRF V3.4.1 MODEL
wrf: calling alloc_and_configure_domain
*************************************
Parent domain
ids,ide,jds,jde 1 481 1 361
ims,ime,jms,jme 208 247 210 241
ips,ipe,jps,jpe 215 240 217 234
*************************************
DYNAMICS OPTION: Eulerian Mass Coordinate
alloc_space_field: domain 1 , 84863352 bytes allocated
wrf: calling model_to_grid_config_rec
wrf: calling set_scalar_indices_from_config
wrf: calling init_wrfio
module_io.F: in wrf_ioinit
Entering ext_gr1_ioinit
DEBUG wrf_timetoa(): returning with str = [2005-08-20_12:00:00]
DEBUG wrf_timetoa(): returning with str = [2005-08-20_12:00:00]
DEBUG wrf_timetoa(): returning with str = [2005-08-28_12:00:00]
DEBUG wrf_timeinttoa(): returning with str = [0000000000_000:000:030]
DEBUG setup_timekeeping(): clock after creation, clock start time = 2005-08-20_12:00:00
DEBUG setup_timekeeping(): clock after creation, clock current time = 2005-08-20_12:00:00
DEBUG setup_timekeeping(): clock after creation, clock stop time = 2005-08-28_12:00:00
DEBUG setup_timekeeping(): clock after creation, clock time step = 0000000000_000:000:030
setup_timekeeping: set xtime to 0.0000000E+00
setup_timekeeping: set julian to 231.5000
setup_timekeeping: returning...
wrf main: calling open_r_dataset for wrfinput
DEBUG wrf_timetoa(): returning with str = [2005-08-20_12:00:00]
module_io.F: in wrf_open_for_read
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 115
program wrf: error opening wrfinput_d01 for reading ierr= -1021
-------------------------------------------

Please let me know if there is any other information I can provide.
 
Yes. That is why I was a little confused and made the initial post. I have all the required input data in that folder, and ncdump shows the proper structure. However, this message really took me by surprise, which is why I asked whether something other than a missing input file might lead to this fatal error. Also, is there a way to check whether the input file is corrupted, other than using ncdump? Since I am running WRF based on someone else's results, I can't really recreate the file. Also, this file worked fine on Yellowstone and ran successfully at least once on Cheyenne before.
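One quick check that does not rely on ncdump is to look at the file from the operating system's point of view: whether the path resolves, whether it is readable and non-empty, and whether its first bytes match a known netCDF signature. The sketch below is a first-pass diagnostic under those assumptions and does not validate the file's contents; the function name inspect_input is hypothetical. It only recognizes netCDF and HDF5 signatures, so a file written in another format is reported as unrecognized.

```python
import os

# Leading bytes ("magic numbers") of the netCDF variants.
SIGNATURES = {
    b"CDF\x01": "netCDF classic",
    b"CDF\x02": "netCDF 64-bit offset",
    b"CDF\x05": "netCDF 64-bit data (CDF-5)",
    b"\x89HDF\r\n\x1a\n": "netCDF-4 / HDF5",
}


def inspect_input(path):
    """Return a short description of what is found at `path`."""
    if not os.path.exists(path):
        # os.path.exists() is False for a symlink whose target is gone.
        if os.path.islink(path):
            return "broken symlink -> " + os.readlink(path)
        return "missing"
    if not os.access(path, os.R_OK):
        return "not readable by this user"
    if os.path.getsize(path) == 0:
        return "empty file"
    with open(path, "rb") as f:
        head = f.read(8)
    for magic, name in SIGNATURES.items():
        if head.startswith(magic):
            return name
    return "unrecognized format"
```

For example, inspect_input("wrfinput_d01") run inside the job's working directory reports a symlink that points to a path that no longer exists, a permissions problem, or a truncated or non-netCDF file, each of which would make the file impossible to open even though it opened in an earlier run.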
 
If this file was created on Yellowstone, then I suppose it is really "old" and was made for an old version of WRF. It is hard to debug possible problems related to such old data. I would suggest you recreate this file so the case can run on Cheyenne.
 