wrf stops at cldfra1

peter_
I am using WRF 4.2 to run a high-resolution simulation with an adaptive time step. The run always stops about 30 minutes after the initial time, coinciding with WRF's first radiation calculation (radt=30), when it calls cldfra1. Boundary and initial conditions were prepared from ERA5 model levels up to 0.01 hPa. I tried a wide variety of combinations of shortwave and longwave radiation and microphysics options in the namelist, and the simulation always fails at the same point. I also tested radt=6, following the suggestion to select a value close to dx in km, and as expected the program crashed at minute 6 instead of minute 30. I have seen a similar unresolved case involving cldfra1 (wrfforum.com/viewtopic.php?f=21&t=11692&sid=b67aa07976ae9bf0c452061ad852ec78). I could not find any clear description of how cldfra1 works. Below are my namelist and the last lines of one of the rsl.error files (debug_level=1000), in case someone can offer a suggestion. Thank you in advance.
Code:
&time_control
run_days                 = 0,
run_hours                = 31,
run_minutes              = 0,
run_seconds              = 0,
start_year               = 2019,     2019,
start_month              = 9,        9,
start_day                = 11,       11,
start_hour               = 0,        0,
start_minute             = 00,       00,
start_second             = 00,       00,
end_year                 = 2019,     2019,
end_month                = 9,        9,
end_day                  = 12,       12,
end_hour                 = 7,        7,
end_minute               = 00,       00,
end_second               = 00,       00,
interval_seconds         = 3600,
input_from_file          = .true.,   .true.,
history_interval         = 180,       6,
history_outname          = "/scratch/peter/wrfout_d<domain>_<date>"
frames_per_outfile       = 1000,     110,
restart                  = .false.,
restart_interval         = 5000,
io_form_history          = 2,
io_form_restart          = 2,
io_form_input            = 2,
io_form_boundary         = 2,
debug_level              = 1000,
/

&domains
time_step                = 9,
time_step_fract_num      = 0,
time_step_fract_den      = 1,
max_dom                  = 2,
e_we                     = 364,      697,
e_sn                     = 382,      574,
e_vert                   = 150,       150,
p_top_requested          = 1.0002,
num_metgrid_levels       = 138,
num_metgrid_soil_levels  = 4,
dx                       = 9000,     3000,
dy                       = 9000,     3000,
grid_id                  = 1,        2,
parent_id                = 1,        1,
i_parent_start           = 1,       46,
j_parent_start           = 1,       82,
parent_grid_ratio        = 1,        3,
parent_time_step_ratio   = 1,        3,
feedback                 = 1,
smooth_option            = 0,
use_adaptive_time_step   = .true.,
step_to_output_time      = .true.,
target_cfl               = 1.2,    1.2,
max_step_increase_pct    = 25,      50,
starting_time_step       = 3,      1,
starting_time_step_den   = 10,     10,
max_time_step            = 45,     15,
min_time_step            = 3,      1,
min_time_step_den        = 100,    100,
adaptation_domain        = 1,
/

&physics
mp_physics               = 5,        5,
ra_lw_physics            = 99,       99,
ra_sw_physics            = 99,       99,
radt                     = 30,       30,
co2tf                    = 1,
sf_sfclay_physics        = 91,       91,
sf_surface_physics       = 2,        2,
bl_pbl_physics           = 1,        1,
bldt                     = 0,        0,
cu_physics               = 0,        0,
cudt                     = 5,        5,
isfflx                   = 1,
ifsnow                   = 0,
icloud                   = 1,
surface_input_source     = 1,
num_soil_layers          = 4,
sf_urban_physics         = 0,        0,
maxiens                  = 1,
maxens                   = 3,
maxens2                  = 3,
maxens3                  = 16,
ensdim                   = 144,
/

&fdda
/

&dynamics
w_damping                = 0,
diff_opt                 = 1,
km_opt                   = 4,
diff_6th_opt             = 0,        0,
diff_6th_factor          = 0.12,     0.12,
base_temp                = 290.,
damp_opt                 = 0,
zdamp                    = 5000.,    5000.,
dampcoef                 = 0.2,      0.2,
khdif                    = 0,        0,
kvdif                    = 0,        0,
non_hydrostatic          = .true.,   .true.,
moist_adv_opt            = 1,        1,
scalar_adv_opt           = 1,        1,
/

&bdy_control
spec_bdy_width           = 5,
spec_zone                = 1,
relax_zone               = 4,
specified                = .true.,  .false.,
nested                   = .false.,   .true.,
/

&grib2
/

&namelist_quilt
nio_tasks_per_group      = 0,
nio_groups               = 1,
/
Timing for wrf_patch_to_global_generic: 0.00000 elapsed seconds
<stdin> writing 2d real sst_input Status = 0
Timing for wrf_ext_write_field: 0.24428 elapsed seconds
d02 2019-09-11_00:30:10+07/** output_wrf: calling wrf_iosync
d02 2019-09-11_00:30:10+07/** module_io.F: in wrf_iosync
d02 2019-09-11_00:30:10+07/** output_wrf: back from wrf_iosync
d02 2019-09-11_00:30:10+07/** output_wrf: end, fid = 3
d02 2019-09-11_00:30:10+07/** med_hist_out: opened /scratch/peter/wrfout_d02_2019-09-11_00:30:10 as DATASET=HISTORY
d02 2019-09-11_00:30:10+07/** in med_latbound_in
d02 2019-09-11_00:30:10+07/** module_integrate: calling solve interface
d02 2019-09-11_00:30:10+07/** grid spacing, dt, time_step_sound= 3000.00000 15.0000000 4
d02 2019-09-11_00:30:10+07/** calling inc/HALO_EM_MOIST_OLD_E_7_inline.inc
d02 2019-09-11_00:30:10+07/** calling inc/PERIOD_BDY_EM_MOIST_OLD_inline.inc
d02 2019-09-11_00:30:10+07/** call rk_step_prep
d02 2019-09-11_00:30:10+07/** calling inc/HALO_EM_A_inline.inc
d02 2019-09-11_00:30:10+07/** calling inc/PERIOD_BDY_EM_A_inline.inc
d02 2019-09-11_00:30:10+07/** call rk_phys_bc_dry_1
d02 2019-09-11_00:30:10+07/** call init_zero_tendency
d02 2019-09-11_00:30:10+07/** calling inc/HALO_EM_PHYS_A_inline.inc
d02 2019-09-11_00:30:10+07/** call phy_prep
d02 2019-09-11_00:30:10+07/** DEBUG wrf_timetoa(): returning with str = [2019-09-11_00:30:10]
d02 2019-09-11_00:30:10+07/** call radiation_driver
d02 2019-09-11_00:30:10+07/** Top of Radiation Driver
d02 2019-09-11_00:30:10+07/** CALL cldfra1

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0 0x2af4596bd33f in ???
#1 0x16d9d23 in ???
#2 0x17b279f in ???
#3 0x12f9901 in ???
#4 0x11b7ae4 in ???
#5 0x4735fa in ???
#6 0x473bda in ???
#7 0x406213 in ???
#8 0x405bcc in ???
#9 0x2af4596a9494 in ???
#10 0x405c03 in ???
#11 0xffffffffffffffff in ???
 
Peter,
(1) Would you please look through your rsl files for error messages? If something goes wrong in the physics, an error message is usually printed. It is not necessarily in rsl.out.0000; it may be in any of the rsl files.
(2) This case crashed soon after initialization, which often indicates a problem with the input data. I understand that this case is unusual because its model top is so high. Could you run a test with a lower model top, such as 30 hPa, and fewer vertical levels, and with the adaptive time step turned off? If it runs successfully, that will give us some clues about what is wrong.
(3) For a 9 km run, the time step can be as large as 54 s. Let's try a conservative value of 36 s.
(4) Please turn on w_damping, and also turn on cu_physics for D01.
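
The 54 s figure in point (3) matches the commonly quoted WRF rule of thumb that the time step in seconds should not exceed about 6 times the grid spacing in kilometres. A minimal sketch of that arithmetic follows. The function name is hypothetical, and dividing by parent_time_step_ratio for the nest is an illustrative assumption based on the namelist above, not code taken from WRF.

```python
def max_time_step_seconds(dx_m, factor=6.0):
    """Upper-bound time step (s) from the ~6 x dx(km) rule of thumb."""
    return factor * dx_m / 1000.0


# Values taken from the namelist in the original post.
dx_parent_m = 9000
parent_time_step_ratio = 3

parent_limit = max_time_step_seconds(dx_parent_m)
nest_limit = parent_limit / parent_time_step_ratio

print(f"d01 upper bound: {parent_limit:.0f} s")   # 54 s
print(f"d02 upper bound: {nest_limit:.0f} s")     # 18 s

# The conservative value suggested in the reply is two thirds of the bound.
conservative = 36
print(f"36 s is {conservative / parent_limit:.0%} of the d01 bound")
```

The rule gives an upper bound only; a run with a very high model top and strong vertical motion may need a smaller value.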
 
Dear Ming:
Thank you for your reply. The error messages in the rsl.* files are
grep -i error rsl.*
rsl.error.0000: NetCDF error: NetCDF: Variable not found
rsl.error.0000: NetCDF error in wrf_io.F90, line 2883 Varname QICE
rsl.error.0000: NetCDF error: NetCDF: Variable not found
rsl.error.0000: NetCDF error in wrf_io.F90, line 2883 Varname CWM
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE_BXS
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE_BXE
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE_BYS
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE_BYE
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE_BTXS
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE_BTXE
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE_BTYS
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE_BTYE
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM_BXS
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM_BXE
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM_BYS
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM_BYE
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM_BTXS
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM_BTXE
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM_BTYS
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM_BTYE
rsl.error.0000:Backtrace for this error:
rsl.error.0001:Backtrace for this error:
rsl.error.0002:Backtrace for this error:
rsl.error.0003:Backtrace for this error:
rsl.error.0004:Backtrace for this error:
rsl.error.0005:Backtrace for this error:
rsl.error.0006:Backtrace for this error:
rsl.error.0007:Backtrace for this error:
rsl.error.0008:Backtrace for this error:
rsl.error.0009:Backtrace for this error:
rsl.error.0010:Backtrace for this error:
rsl.error.0011:Backtrace for this error:
rsl.error.0012:Backtrace for this error:
rsl.error.0013:Backtrace for this error:
rsl.error.0014:Backtrace for this error:
rsl.error.0015:Backtrace for this error:
rsl.error.0016:Backtrace for this error:
rsl.error.0017:Backtrace for this error:
rsl.error.0018:Backtrace for this error:
rsl.error.0019:Backtrace for this error:
rsl.error.0020:Backtrace for this error:
rsl.error.0021:Backtrace for this error:
rsl.error.0022:Backtrace for this error:
rsl.error.0023:Backtrace for this error:
rsl.error.0024:Backtrace for this error:
rsl.error.0025:Backtrace for this error:
rsl.error.0026:Backtrace for this error:
rsl.error.0027:Backtrace for this error:
rsl.error.0028:Backtrace for this error:
rsl.error.0029:Backtrace for this error:
rsl.error.0030:Backtrace for this error:
rsl.error.0031:Backtrace for this error:
rsl.error.0032:Backtrace for this error:
rsl.error.0033:Backtrace for this error:
rsl.error.0034:Backtrace for this error:
rsl.error.0035:Backtrace for this error:
rsl.error.0036:Backtrace for this error:
rsl.error.0037:Backtrace for this error:
rsl.error.0038:Backtrace for this error:
rsl.error.0039:Backtrace for this error:
rsl.error.0040:Backtrace for this error:
rsl.error.0041:Backtrace for this error:
rsl.error.0042:Backtrace for this error:
rsl.error.0043:Backtrace for this error:
rsl.error.0044:Backtrace for this error:
rsl.error.0045:Backtrace for this error:
rsl.error.0046:Backtrace for this error:
rsl.error.0047:Backtrace for this error:
rsl.error.0048:Backtrace for this error:
rsl.error.0049:Backtrace for this error:
rsl.error.0050:Backtrace for this error:
rsl.error.0051:Backtrace for this error:
rsl.error.0052:Backtrace for this error:
rsl.error.0053:Backtrace for this error:
rsl.error.0054:Backtrace for this error:
rsl.error.0055:Backtrace for this error:
rsl.error.0056:Backtrace for this error:
rsl.error.0057:Backtrace for this error:
rsl.error.0058:Backtrace for this error:
rsl.error.0059:Backtrace for this error:
rsl.error.0060:Backtrace for this error:
rsl.error.0061:Backtrace for this error:
rsl.error.0062:Backtrace for this error:
rsl.error.0063:Backtrace for this error:
rsl.error.0064:Backtrace for this error:
rsl.error.0065:Backtrace for this error:
rsl.error.0066:Backtrace for this error:
rsl.error.0067:Backtrace for this error:
rsl.error.0068:Backtrace for this error:
rsl.error.0069:Backtrace for this error:
rsl.error.0070:Backtrace for this error:
rsl.error.0071:Backtrace for this error:
rsl.error.0072:Backtrace for this error:
rsl.error.0073:Backtrace for this error:
rsl.error.0074:Backtrace for this error:
rsl.error.0075:Backtrace for this error:
rsl.error.0076:Backtrace for this error:
rsl.error.0077:Backtrace for this error:
rsl.error.0078:Backtrace for this error:
rsl.error.0079:Backtrace for this error:
rsl.error.0080:Backtrace for this error:
rsl.error.0081:Backtrace for this error:
rsl.error.0082:Backtrace for this error:
rsl.error.0083:Backtrace for this error:
rsl.error.0084:Backtrace for this error:
rsl.error.0085:Backtrace for this error:
rsl.error.0086:Backtrace for this error:
rsl.error.0087:Backtrace for this error:
rsl.error.0088:Backtrace for this error:
rsl.error.0089:Backtrace for this error:
rsl.error.0090:Backtrace for this error:
rsl.error.0091:Backtrace for this error:
rsl.error.0092:Backtrace for this error:
rsl.error.0093:Backtrace for this error:
rsl.error.0094:Backtrace for this error:
rsl.error.0095:Backtrace for this error:
rsl.error.0096:Backtrace for this error:
rsl.error.0097:Backtrace for this error:
rsl.error.0098:Backtrace for this error:
rsl.error.0099:Backtrace for this error:
rsl.out.0000: NetCDF error: NetCDF: Variable not found
rsl.out.0000: NetCDF error in wrf_io.F90, line 2883 Varname QICE
rsl.out.0000: NetCDF error: NetCDF: Variable not found
rsl.out.0000: NetCDF error in wrf_io.F90, line 2883 Varname CWM
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE_BXS
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE_BXE
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE_BYS
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE_BYE
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE_BTXS
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE_BTXE
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE_BTYS
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname QICE_BTYE
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM_BXS
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM_BXE
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM_BYS
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM_BYE
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM_BTXS
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM_BTXE
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM_BTYS
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found
rsl.out.0000:d01 2019-09-11_00:00:00 NetCDF error in wrf_io.F90, line 2883 Varname CWM_BTYE
Also
grep -i floating rsl.*
rsl.error.0000:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0001:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0002:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0003:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0004:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0005:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0006:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0007:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0008:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0009:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0010:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0011:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0012:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0013:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0014:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0015:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0016:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0017:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0018:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0019:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0020:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0021:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0022:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0023:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0024:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0025:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0026:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0027:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0028:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0029:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0030:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0031:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0032:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0033:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0034:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0035:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0036:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0037:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0038:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0039:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0040:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0041:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0042:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0043:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0044:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0045:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0046:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0047:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0048:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0049:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0050:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0051:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0052:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0053:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0054:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0055:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0056:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0057:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0058:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0059:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0060:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0061:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0062:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0063:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0064:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0065:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0066:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0067:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0068:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0069:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0070:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0071:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0072:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0073:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0074:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0075:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0076:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0077:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0078:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0079:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0080:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0081:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0082:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0083:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0084:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0085:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0086:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0087:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0088:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0089:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0090:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0091:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0092:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0093:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0094:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0095:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0096:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0097:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0098:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
rsl.error.0099:program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
In addition
tail slurm.out
srun: error: compute-5: tasks 97-99: Floating point exception
srun: error: compute-5: task 96: Floating point exception
srun: error: compute-2: tasks 0-22,24-31: Floating point exception
srun: error: compute-2: task 23: Floating point exception
srun: error: compute-3: tasks 32-63: Floating point exception
srun: error: compute-4: tasks 64-95: Floating point exception
There were no CFL messages. I would not say that WRF crashed soon after initialization, but rather exactly at the first call to the radiation package. In the simulation that crashed, the adaptive time step was necessary at the start because very intense vertical velocities develop near the model top at the beginning. In addition, I must set w_damping=0 because I need to study the true vertical velocity, which may be very high. I have now followed your suggestions: WRF did not crash at 30 min in your suggested test case and ran successfully to the end, so the cldfra1 problem disappeared. I guess that subroutine has a problem at high altitudes, maybe a division by zero. Do you have any suggestion for avoiding that problem?
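
The grep output above repeats a handful of messages across many rsl files, which makes the distinct messages hard to spot. Below is a minimal sketch that collapses such output into one entry per distinct message, with the number of files reporting it. The function name and the prefix pattern are assumptions made for illustration; the input is assumed to be `grep` output in the `filename:message` form shown above.

```python
import re
from collections import defaultdict

# Optional "d01 2019-09-11_00:00:00 " prefix that WRF adds to some messages.
PREFIX = re.compile(r"^d\d{2} \d{4}-\d{2}-\d{2}_\d{2}:\d{2}:\d{2}\s+")


def summarize_grep(lines):
    """Map each distinct message to the number of rsl files containing it."""
    files_by_message = defaultdict(set)
    for line in lines:
        name, sep, message = line.partition(":")
        if not sep or not name.startswith("rsl."):
            continue  # not a "filename:message" line from grep
        message = PREFIX.sub("", message.strip())
        files_by_message[message].add(name)
    return {msg: len(files) for msg, files in files_by_message.items()}


sample = [
    "rsl.error.0000: NetCDF error: NetCDF: Variable not found",
    "rsl.error.0000:d01 2019-09-11_00:00:00 NetCDF error: NetCDF: Variable not found",
    "rsl.error.0000:Backtrace for this error:",
    "rsl.error.0001:Backtrace for this error:",
]

for message, count in summarize_grep(sample).items():
    print(f"{count:4d} file(s): {message}")
```

On real output, the lines could be read from a saved file, for example the result of `grep -i error rsl.* > errors.txt`.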
 
Peter,
Those NetCDF errors are not really errors, and they will not cause the model to crash. In this case, apparently something is going wrong in the model physics.
(1) As I suggested in my previous answer, would you please run this case with a lower model top and fewer vertical levels? This test will tell us whether the very high model top and the large number of vertical levels are possible reasons for the crash. Please also turn off the adaptive time step.
(2) Some of your physics options are obsolete, i.e.,
ra_lw_physics = 99, 99,
ra_sw_physics = 99, 99,
sf_sfclay_physics = 91, 91,
Could you try newer options such as RRTMG radiation and the revised MM5 Monin-Obukhov scheme?
 
Dear Ming: I already followed your suggestion, as mentioned at the end of my previous message. WRF did not crash at 30 min in your suggested test case and ran successfully to the end, so the cldfra1 problem disappeared. I guess that subroutine has a problem at high altitudes, maybe a division by zero. Do you have any suggestion for avoiding that problem?
 
Peter,
In this case at least we know for sure that the problem is induced by the extremely high model top. To detect when and where the error occurs, please save a wrfrst file at the nearest time before the crash, then restart the model and write wrfout at every time step. Look through these wrfout files to find the grid point where the error first appears, then trace back to the possible cause.
 
OK, thank you. How do I find the grid point where the error first appears? By searching for the position of the first NaN? And would you suggest a procedure for tracing back the cause? Even debug_level = 1000 does not give enough clues.
 
Please stay with debug_level = 0, because higher debug levels don't always provide helpful information.
You are right that you need to find the first time and grid point where a NaN occurs. From there you can trace back to the other variables related to this NaN and possibly figure out what is wrong.
Another option is to compile WRF with the -D option and then rerun it. This option sometimes tells you the source file and line where the model crashed.
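
The search for the first NaN described above can be sketched as a plain scan over the per-time-step fields. This is only an illustration: the function name is hypothetical, the fields are assumed to be already loaded as nested lists indexed [k][j][i], and reading them from the wrfout files (for example with a NetCDF library) is deliberately left out.

```python
import math


def first_nonfinite(frames):
    """Return (label, k, j, i) of the first NaN/Inf, or None if all finite.

    frames: iterable of (label, field) in time order, where field is a
    nested list indexed [k][j][i] (vertical, south-north, west-east).
    """
    for label, field in frames:
        for k, level in enumerate(field):
            for j, row in enumerate(level):
                for i, value in enumerate(row):
                    if not math.isfinite(value):
                        return label, k, j, i
    return None


# Two tiny made-up frames: the second one contains a NaN at k=1, j=0, i=1.
frames = [
    ("step_1", [[[1.0, 2.0]], [[3.0, 4.0]]]),
    ("step_2", [[[1.0, 2.0]], [[3.0, float("nan")]]]),
]

print(first_nonfinite(frames))
```

Running the same scan on several variables shows which one goes non-finite first, which is the starting point for tracing back the cause.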
 