
error running real.exe

psagar

New member
Hello,
I am having a strange problem running real.exe in WRF v4.4. The program stops after running for a fixed number of time steps. The error is:
FATAL CALLED FROM FILE: <stdin> LINE: 409
error opening met_em.d01.2004-04-22_12:00:00.nc for input; bad date in namelist or file not in directory

If I change the start date, the error shifts by the same amount of time, so I know the problem is not in the met_em* files themselves; the date above processes successfully when the start date is changed. My guess is that the real program is unable to write to the wrfbdy* or wrfinput* file after it exceeds a certain size. I had set export WRFIO_NCD_LARGE_FILE_SUPPORT=1, so that should not be the cause, but somehow I still get this error.

This problem was discussed in two earlier posts, but no solution was proposed, only a workaround of running real.exe multiple times:

Running real.exe multiple times is tedious for long runs, so any help with this error is highly appreciated.
 
How many time periods of data do you want to include in a single wrfbdy file?

Please take a look at the file WRF/external/io_netcdf/wrf_io.F90. Near the top of the file, you should see:

integer, parameter :: MaxTimes = 60000

You can increase the number and see whether it fixes your issue.
 
Thank you, Ming, for the idea, but I just realized that this may not be the issue. For a 20-year run with 6-hourly data, my maximum number of time levels would be ~4*365*20 = 29,200, which is well within the 60,000 limit above. It could be related to the environment; I have reached out to our technical team, but here are my environment settings for your information in case you have more ideas:

module load cpu
module load gcc/10.2.0
module load openmpi/4.1.3
module load wrf/4.2/5hf6hqj
export NETCDF=$NETCDF_FORTRANHOME
export NETCDF_classic=1
export WRFIO_NCD_LARGE_FILE_SUPPORT=1
export JASPERLIB=/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen2/gcc-10.2.0/jasper-2.0.32-4uwbmdvw2nrmj225qyhlo5sllrhcx3fs
export JASPERFINC=/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen2/gcc-10.2.0/jasper-2.0.32-4uwbmdvw2nrmj225qyhlo5sllrhcx3fs/include/jasper
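
The time-level estimate above can be reproduced as a quick shell sanity check (numbers match the 6-hourly, 20-year case described in this post):

```shell
# Estimate boundary time levels for a 20-year run with 6-hourly
# (interval_seconds = 21600) data: 4 times/day * 365 days/yr * 20 yr.
YEARS=20
TIMES_PER_DAY=4
ntimes=$(( YEARS * 365 * TIMES_PER_DAY ))
echo "time levels: $ntimes (MaxTimes limit: 60000)"
```

This prints 29200, so the MaxTimes ceiling is indeed not being hit here.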
 
I was running my program on a supercomputer, and I have now also tried another local machine; the same error persists on our local cluster. So I think this is a WRF v4.4 issue. I have not tried other WRF versions and am not sure whether that would help.
 
I don't see any issues in your environment settings.

WPS/WRF V4.4 has been well tested and I don't think there is any issue in the code.

I suspect this is an issue related to data file size.

The large-file support setting allows files larger than 2 GB to be written, but individual variables and records are still limited to roughly 4 GB.

In your case, can you check how large wrfbdy is before REAL crashes?
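
One way to watch the sizes while real.exe runs is a small shell loop (filenames are illustrative for this case; GNU coreutils stat is assumed):

```shell
# Report the size of real.exe output files in whole GiB so they can be
# compared against the 2 GB / 4 GB netCDF limits discussed above.
for f in wrfbdy_d01 wrfinput_d01 wrffdda_d01; do
    [ -f "$f" ] || continue
    size=$(stat -c %s "$f")                 # size in bytes (GNU stat)
    gib=$(( size / 1024 / 1024 / 1024 ))
    echo "$f: ~${gib} GiB ($size bytes)"
done
```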
 
Thank you, Ming. I checked, and the size of wrffdda_d01 at the time of the crash is ~2.8 GB. But I would like to know: if the size were larger than 4 GB, what would be the option?
 
If the size of wrffdda_d01 at the time of the crash is ~2.8 GB, I suppose it shouldn't be the reason REAL is failing. How about the size of wrfbdy? We would like to know exactly why the REAL program failed when processing multi-time data.

It is rare for the data files involved in a WRF run to be larger than 4 GB. In that case, you can set

io_form_'xx' = 102, where 'xx' can be input, restart, or history. The corresponding files will then be split into several small files, one per MPI task. However, the trouble is that you need to run a joiner program afterwards to merge these small files, which is not easy to do. This is why we don't recommend that users activate the option io_form_'xx' = 102.
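
As an illustration, the split option described above would be set in the &time_control section of namelist.input like this (a sketch; only the io_form lines shown change, and which streams you split is up to you):

```
&time_control
 io_form_history = 102,   ! history output split into one file per MPI task
 io_form_restart = 102,   ! restart files likewise split per task
/
```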
 
Thank you, and sorry for getting back to you late, Ming. In my case, the wrfbdy* files are automatically split into small 6-hourly files for some reason, like this:

......................................................
wrfbdy_d01_2004-04-21_12:00:00
wrfbdy_d01_2004-04-21_18:00:00
wrfbdy_d01_2004-04-22_00:00:00

I didn't have to merge them myself; WRF was reading them fine, and they were not causing any problem before. But I think the problem lies somewhere here. My relevant settings are below:

interval_seconds = 21600
input_from_file = .true.,.true.,.true.,
history_interval = 60, 999999999, 60,
frames_per_outfile = 1, 1, 1,
restart = .false.,
restart_interval = 11520
write_hist_at_0h_rst = .true.,
io_form_history = 2
io_form_restart = 2
io_form_input = 2
io_form_boundary = 2
bdy_inname = "wrfbdy_d<domain>_<date>",
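
With bdy_inname templated on <date> as above, there should be one wrfbdy file per 6-hour boundary time. A quick shell check of that expectation (run here against scratch files with the same naming pattern, since the real paths are specific to this case):

```shell
# Create scratch files mimicking the split wrfbdy naming, then count them:
# with interval_seconds = 21600 there is one file per 6-hour boundary time.
tmp=$(mktemp -d)
touch "$tmp/wrfbdy_d01_2004-04-21_12:00:00" \
      "$tmp/wrfbdy_d01_2004-04-21_18:00:00" \
      "$tmp/wrfbdy_d01_2004-04-22_00:00:00"
count=$(ls "$tmp"/wrfbdy_d01_* | wc -l)
echo "split wrfbdy files found: $count"
rm -rf "$tmp"
```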
 
Hi,
Your namelist options are fine.
I explored this issue further and talked to our software engineer. It is possible that an individual variable or record that is too large violates certain constraints of the netCDF file format. We are not sure whether this is the reason in your case; at present it is just our guess.
Please stay with the option of splitting the wrfbdy file. I am sorry for the inconvenience, but this is the workaround at present.
 