
forrtl: severe (66): output statement overflows record, unit -5, file Internal List-Directed Write

gossart_al

New member
Hi,

I am running a WRF simulation and the run works fine with these diagnostics enabled in the namelist:

mean_diag = 1,
mean_diag_interval = 60,
auxhist3_outname = "wrf_mean_d<domain>_<date>.nc",
io_form_auxhist3 = 2,
frames_per_auxhist3 = 24


However, when the model ends and I want to do a restart, the model crashes with this error message:

forrtl: severe (66): output statement overflows record, unit -5, file Internal List-Directed Write

Commenting out the mean output entries in the namelist resolves the issue, but I would like to be able to restart the model and keep writing mean outputs.
I am using a cluster with these settings:

Serial Fortran compiler (mostly for tool generation):
which SFC
/opt/cray/pe/craype/2.7.15/bin/ftn

Serial C compiler (mostly for tool generation):
which SCC
/opt/intel/compilers_and_libraries_2020.4.304/linux/bin/intel64/icc

Fortran compiler for the model source code:
which FC
/opt/cray/pe/craype/2.7.15/bin/ftn
Will use 'time' to report timing information

C compiler for the model source code:
which CC
/opt/cray/pe/craype/2.7.15/bin/cc


Could you please help?

Many thanks,

Alexandra
 
Hi Alexandra,
Can you attach your full namelist.input file and package your output files (e.g., rsl.out.* and rsl.error.*) together in a single *.tar file and attach that as well? Can you also clarify specifically which namelist parameter(s) you must remove to get past the issue? Thanks!
 
Hi,
Please find the full namelist and the output files attached.

Commenting/uncommenting this block makes the model run or crash with the error message:
mean_diag = 1,
mean_diag_interval = 60,
auxhist5_outname = "wrf_mean_d<domain>_<date>.nc",
io_form_auxhist5 = 2,
frames_per_auxhist5 = 24


many thanks!
 

Attachments

  • wrf.tar.gz (12.4 KB)
Thanks for sending that. I think I'm confused about when this is stopping. In your initial post, you mention that the model runs fine to completion and only stops at the end when you want to do a restart. But the rsl* files you sent seem to indicate that the model is stopping immediately after the initial time step. Per your namelist, you are trying to run for 396 days. When you say "I want to do a restart," do you mean when it's trying to write out a restart file (wrfrst*), or do you mean when you are trying to do a restart run? I ask because, again, the rsl files do not indicate that it's even trying to write out a restart file yet, and your namelist has restart = .false., which means it's not a restart simulation. I apologize for the confusion, and I can definitely see there is a problem, but I just need to understand better what is happening and when, to make sure I'm addressing this correctly. Thanks!
 
Hi, I am sorry for the confusion.
I am trying to run a long simulation: the first chunk goes well and the mean outputs are created. Our system has a walltime limit of 24 hours, which means the model stops when it hits the time limit. When I then launch a new job starting from a restart file, the simulation crashes right at the beginning with the error message mentioned above.
Apologies that the namelist did not reflect the simulation; it should have read restart = .true.
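For context, the restart chunk's &time_control section looks roughly like this (the restart_interval value here is only an assumed example, not copied from my actual namelist):

restart = .true.,
restart_interval = 1440,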

Thanks!
 
Thanks for that explanation. That makes things a lot clearer. I am actually able to reproduce your issue, but I haven't figured out why it's happening yet. I'll continue to look into this and get back to you, hopefully with a suggestion.
 
Hi,
Okay, I was able to find a couple of things to change in your namelist that may help. I believe you need to remove "auxhist5_interval." You already have "mean_diag_interval," which is essentially the same thing, and when I had "auxhist5_interval" in my namelist, wrf failed immediately. I also found that if you add "write_hist_at_0h_rst = .true." to the &time_control section, it no longer fails when doing a restart. I'm not sure why this is, but it worked. This does mean that a new history file will be written at the initial time of the restart simulation, so if you don't want it to overwrite the file from the previous run, you should move or rename that file first. The changes are summarized in the sketch below. Can you give those a try and let me know if that changes anything for your simulation?
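To summarize, here is a rough sketch of the relevant entries after both changes, with auxhist5_interval removed entirely (the mean-output lines are just carried over from the earlier posts, not a definitive configuration):

restart = .true.,
write_hist_at_0h_rst = .true.,
mean_diag = 1,
mean_diag_interval = 60,
auxhist5_outname = "wrf_mean_d<domain>_<date>.nc",
io_form_auxhist5 = 2,
frames_per_auxhist5 = 24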
 