WRF crashes sporadically


tpartrid

Hello,

I'm running a year-long simulation over the central United States using WRF 4.0.1 with Noah-MP. The simulation crashes without writing any error messages to either the .log or rsl.error files. I don't believe this is a time step issue, because when I start restart runs the simulation proceeds past the initial crash date. Any help or insight for possible troubleshooting would be greatly appreciated! My namelist is below:

&time_control
run_days = 0,
run_hours = 00,
run_minutes = 0,
run_seconds = 0,
start_year = 2010, 2010, 2010,
start_month = 02, 02, 02,
start_day = 26, 26, 26,
start_hour = 00, 00, 00,
end_year = 2010, 2010, 2010,
end_month = 12, 12, 12,
end_day = 31, 31, 31,
end_hour = 18, 18, 18,
interval_seconds = 21600
input_from_file = .true.,.true.,.true.,
history_interval = 1440, 1440, 1440,
frames_per_outfile = 1000, 1000, 1000,
restart = .true.,
restart_interval = 80640,
io_form_history = 2
io_form_restart = 2
io_form_input = 2
io_form_boundary = 2
debug_level = 0
history_outname = "Output/wrfout_d_<domain>_<date>"
auxinput4_inname = "wrflowinp_d<domain>",
auxinput4_interval = 360, 360, 360,
io_form_auxinput4 = 2
output_diagnostics = 1
auxhist3_outname = "Output/wrfxterm_d_<domain>_<date>"
auxhist3_interval = 1440, 1440, 1440,
frames_per_auxhist3 = 400, 400, 400,
io_form_auxhist3 = 2
io_form_auxhist24 = 2
auxhist24_interval = 1440, 60, 60,
frames_per_auxhist24 = 1000, 24, 24,
auxhist24_outname = "output24/SFC_d_<domain>_<date>"
/

&domains
time_step = 150,
time_step_fract_num = 0,
time_step_fract_den = 1,
max_dom = 3,
e_we = 100, 141, 121,
e_sn = 100, 101, 111,
e_vert = 33, 33, 33,
p_top_requested = 5000,
num_metgrid_levels = 38,
num_metgrid_soil_levels = 4,
dx = 25000, 5000, 1000,
dy = 25000, 5000, 1000,
grid_id = 1, 2, 3,
parent_id = 0, 1, 2,
i_parent_start = 1, 37, 60,
j_parent_start = 1, 41, 50,
parent_grid_ratio = 1, 5, 5,
parent_time_step_ratio = 1, 5, 5,
feedback = 0,
smooth_option = 1,
/

&physics
physics_suite = 'CONUS'
mp_physics = -1, -1, -1,
cu_physics = -1, -1, 0,
ra_lw_physics = -1, -1, -1,
ra_sw_physics = -1, -1, -1,
bl_pbl_physics = -1, -1, -1,
sf_sfclay_physics = -1, -1, -1,
sf_surface_physics = 4, 4, 4,
radt = 30, 30, 30,
bldt = 0, 0, 0,
cudt = 5, 5, 5,
icloud = 1,
num_land_cat = 24,
sf_urban_physics = 0, 0, 0,
sst_update = 1,
prec_acc_dt = 1440,
/

&fdda
/

&noah_mp
dveg = 2,
opt_crop = 1,
/

&dynamics
hybrid_opt = 2,
w_damping = 0,
diff_opt = 1, 1, 1,
km_opt = 4, 4, 4,
diff_6th_opt = 0, 0, 0,
diff_6th_factor = 0.12, 0.12, 0.12,
base_temp = 290.
damp_opt = 3,
zdamp = 5000., 5000., 5000.,
dampcoef = 0.2, 0.2, 0.2
khdif = 0, 0, 0,
kvdif = 0, 0, 0,
non_hydrostatic = .true., .true., .true.,
moist_adv_opt = 1, 1, 1,
scalar_adv_opt = 1, 1, 1,
gwd_opt = 1,
/

&bdy_control
spec_bdy_width = 5,
spec_zone = 1,
relax_zone = 4,
specified = .true., .false.,.false.,
spec_exp = 0.33
nested = .false., .true., .true.,
/

&grib2
/

&namelist_quilt
nio_tasks_per_group = 0,
nio_groups = 1,
/


Many thanks!
 
Hi,
Unfortunately, when runs crash at seemingly random times and the errors cannot be reproduced at the same times, it typically means there is a system or environment problem. If, however, the runs always crash after a certain number of output hours, or when your files reach a certain size, the cause may be a WRF code setting or perhaps disk space. I would recommend checking the output file sizes to see whether any are reaching the 4GB limit, and also checking your disk space (though I assume that when you run the restarts you are still writing to the same disk, so that is unlikely to be the issue). If the stops don't seem to be consistent with output hours or file size, you may unfortunately need to reach out to a systems administrator at your institution to see if they can help troubleshoot.
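In case it helps, here is a minimal Python sketch of the two checks suggested above: looking for output files that are approaching the 4GB limit and checking free disk space. The directory names `Output` and `output24` and the file prefixes are taken from the `history_outname`, `auxhist3_outname`, and `auxhist24_outname` entries in the namelist. The function names, the 90% warning threshold, and the reading of "4GB" as 4 GiB are assumptions made for illustration, not anything WRF itself defines.

```python
import os
import shutil

# Assumption: "4GB" is treated as 4 GiB; adjust if your file format differs.
FOUR_GIB = 4 * 1024**3


def find_large_files(directory, prefix="wrfout", limit=FOUR_GIB, warn_fraction=0.9):
    """Return sorted (name, size_in_bytes) pairs for files whose name starts
    with `prefix` and whose size is at least warn_fraction * limit."""
    hits = []
    for entry in os.scandir(directory):
        if entry.is_file() and entry.name.startswith(prefix):
            size = entry.stat().st_size
            if size >= warn_fraction * limit:
                hits.append((entry.name, size))
    return sorted(hits)


def free_fraction(path):
    """Return the fraction of the filesystem holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


if __name__ == "__main__":
    # Output directories and prefixes as they appear in the namelist above.
    prefixes = ("wrfout", "wrfxterm", "SFC")
    for directory in ("Output", "output24"):
        if not os.path.isdir(directory):
            print(f"{directory}: not found, skipping")
            continue
        for name, size in find_large_files(directory, prefix=prefixes):
            print(f"{directory}/{name}: {size / 1024**3:.2f} GiB (near the limit)")
        print(f"{directory}: {free_fraction(directory):.1%} of the disk is free")
```

Running this from the WRF run directory shortly after a crash shows whether any file was close to the limit, or the disk close to full, at the time the run stopped. If neither is the case, file size and disk space are unlikely to be the cause.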
 