WRF crashes sporadically

tpartrid · Oct 14, 2019

Hello,

I'm running a year-long simulation over the central United States using WRF 4.0.1 with Noah-MP. The simulation crashes without writing any error message to either the .log or the rsl.error files. I don't believe this is a time step issue, because when I start a restart run the simulation proceeds past the original crash date. Any help or ideas for troubleshooting would be greatly appreciated! My namelist is below:

&time_control
run_days = 0,
run_hours = 00,
run_minutes = 0,
run_seconds = 0,
start_year = 2010, 2010, 2010,
start_month = 02, 02, 02,
start_day = 26, 26, 26,
start_hour = 00, 00, 00,
end_year = 2010, 2010, 2010,
end_month = 12, 12, 12,
end_day = 31, 31, 31,
end_hour = 18, 18, 18,
interval_seconds = 21600
input_from_file = .true.,.true.,.true.,
history_interval = 1440, 1440, 1440,
frames_per_outfile = 1000, 1000, 1000,
restart = .true.,
restart_interval = 80640,
io_form_history = 2
io_form_restart = 2
io_form_input = 2
io_form_boundary = 2
debug_level = 0
history_outname = "Output/wrfout_d_<domain>_<date>"
auxinput4_inname = "wrflowinp_d<domain>",
auxinput4_interval = 360, 360, 360,
io_form_auxinput4 = 2
output_diagnostics = 1
auxhist3_outname = "Output/wrfxterm_d_<domain>_<date>"
auxhist3_interval = 1440, 1440, 1440,
frames_per_auxhist3 = 400, 400, 400,
io_form_auxhist3 = 2
io_form_auxhist24 = 2
auxhist24_interval = 1440, 60, 60,
frames_per_auxhist24 = 1000, 24, 24,
auxhist24_outname = "output24/SFC_d_<domain>_<date>"
/

&domains
time_step = 150,
time_step_fract_num = 0,
time_step_fract_den = 1,
max_dom = 3,
e_we = 100, 141, 121,
e_sn = 100, 101, 111,
e_vert = 33, 33, 33,
p_top_requested = 5000,
num_metgrid_levels = 38,
num_metgrid_soil_levels = 4,
dx = 25000, 5000, 1000,
dy = 25000, 5000, 1000,
grid_id = 1, 2, 3,
parent_id = 0, 1, 2,
i_parent_start = 1, 37, 60,
j_parent_start = 1, 41, 50,
parent_grid_ratio = 1, 5, 5,
parent_time_step_ratio = 1, 5, 5,
feedback = 0,
smooth_option = 1,
/

&physics
physics_suite = 'CONUS'
mp_physics = -1, -1, -1,
cu_physics = -1, -1, 0,
ra_lw_physics = -1, -1, -1,
ra_sw_physics = -1, -1, -1,
bl_pbl_physics = -1, -1, -1,
sf_sfclay_physics = -1, -1, -1,
sf_surface_physics = 4, 4, 4,
radt = 30, 30, 30,
bldt = 0, 0, 0,
cudt = 5, 5, 5,
icloud = 1,
num_land_cat = 24,
sf_urban_physics = 0, 0, 0,
sst_update = 1,
prec_acc_dt = 1440,
/

&fdda
/

&noah_mp
dveg = 2,
opt_crop = 1,
/

&dynamics
hybrid_opt = 2,
w_damping = 0,
diff_opt = 1, 1, 1,
km_opt = 4, 4, 4,
diff_6th_opt = 0, 0, 0,
diff_6th_factor = 0.12, 0.12, 0.12,
base_temp = 290.
damp_opt = 3,
zdamp = 5000., 5000., 5000.,
dampcoef = 0.2, 0.2, 0.2
khdif = 0, 0, 0,
kvdif = 0, 0, 0,
non_hydrostatic = .true., .true., .true.,
moist_adv_opt = 1, 1, 1,
scalar_adv_opt = 1, 1, 1,
gwd_opt = 1,
/

&bdy_control
spec_bdy_width = 5,
spec_zone = 1,
relax_zone = 4,
specified = .true., .false.,.false.,
spec_exp = 0.33
nested = .false., .true., .true.,
/

&grib2
/

&namelist_quilt
nio_tasks_per_group = 0,
nio_groups = 1,
/

Many thanks!

kwerner · Oct 18, 2019

Hi,
Unfortunately, when runs crash at seemingly random times and the errors cannot be reproduced at the same times, it typically means there is a system or environment problem. If, however, the runs always crash after a certain number of output hours, or when your files reach a certain size, that may be related to a WRF code setting, or perhaps disk space. I would recommend checking the output file sizes to see whether any are reaching the 4GB limit, and also checking your disk space (though I assume your restart runs still write to the same disk, so that is unlikely to be the issue). If the stops don't seem to be consistent with hours or size, unfortunately you may need to reach out to a systems administrator at your institution to see if they can help troubleshoot.
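
The file-size and disk-space checks suggested above could be scripted roughly as follows. This is only a sketch, not an official WRF tool: the directory names `Output` and `output24` are taken from the `history_outname` and `auxhist24_outname` entries in the posted namelist, the 4GB limit is assumed to mean 4 * 1024**3 bytes, and the helper names and the 90% warning fraction are made up for illustration.

```python
import shutil
from pathlib import Path

# Assumption: the "4GB limit" from the reply is taken as 4 GiB (4 * 1024**3 bytes).
SIZE_LIMIT_BYTES = 4 * 1024**3


def find_large_files(directory, limit=SIZE_LIMIT_BYTES, warn_fraction=0.9):
    """Return (path, size_in_bytes) for files at or above warn_fraction * limit.

    Only the top level of `directory` is scanned (no recursion).
    """
    threshold = limit * warn_fraction
    hits = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            size = path.stat().st_size
            if size >= threshold:
                hits.append((path, size))
    return hits


def free_space_gib(directory):
    """Return the free space, in GiB, on the filesystem that holds `directory`."""
    return shutil.disk_usage(directory).free / 1024**3


if __name__ == "__main__":
    # Hypothetical layout: run from the WRF run directory used in the post.
    for out_dir in ("Output", "output24"):
        if not Path(out_dir).is_dir():
            print(f"{out_dir}: directory not found, skipping")
            continue
        print(f"{out_dir}: {free_space_gib(out_dir):.1f} GiB free")
        for path, size in find_large_files(out_dir):
            print(f"  near or over limit: {path} ({size / 1024**3:.2f} GiB)")
```

If nothing is flagged and free space is ample at the time of each crash, that would point back toward the system or environment explanation rather than file size or disk space.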
