Model crashing at radiation time step

neel14

Member
WRF-4.1.3
The model crashes at the radiation time step with the BouLac PBL scheme, but runs successfully with other PBL schemes. I tried changing radt to other values, but the model still crashes with a similar error. Any help would be appreciated.

Code:
 &time_control
 run_days                            = 0,
 run_hours                           = 00,
 run_minutes                         = 0,
 run_seconds                         = 0,
 start_year                          = 2018, 2018,2018,
 start_month                         = 03,   03,   03,
 start_day                           = 31,   31,   31,
 start_hour                          = 00,   00,   00,
 end_year                            = 2018, 2018,2018,
 end_month                           = 05,   05,   05,
 end_day                             = 01,   01,   01,
 end_hour                            = 03,   03,   03,
 interval_seconds                    = 10800,
 input_from_file                     = .true.,.true.,.true.,
 history_interval                    = 1440,  1440, 60,
 frames_per_outfile                  = 100000, 100000, 100000,
 restart                             = .false.,
 restart_interval                    = 1440,
 rst_outname                         = '/scratch/neeldip/OUTPUT/wrfrst_d<domain>_<date>',
 io_form_history                     = 2
 io_form_restart                     = 2
 io_form_input                       = 2
 io_form_boundary                    = 2
 history_outname                     = '/scratch/neeldip/OUTPUT/wrfout_d<domain>_<date>',
 debug_level                         = 0,
 output_diagnostics                  = 1,
 auxhist3_outname                    = '/scratch/neeldip/OUTPUT/wrfxtrm_d<domain>_<date>', 
 auxhist3_interval                   = 180, 180,60,     
 io_form_auxhist3                    = 2,
 frames_per_auxhist3                 = 100000, 100000, 100000,
 frames_per_auxhist23                = 100000, 100000, 100000,
 io_form_auxinput4                   = 2,
 auxinput4_interval                  = 180,
 auxinput4_inname                    = "wrflowinp_d<domain>"
 io_form_auxhist1                    = 2
 auxhist1_interval                   = 1440,  1440,   60,
 frames_per_auxhist1                 = 100000, 100000, 100000,
 auxhist1_outname                    = '/scratch/neeldip/OUTPUT/wrf_trad_fields_d<domain>_<date>',
 iofields_filename                   = "d01.txt","d01.txt","d03.txt",
 ignore_iofields_warning             = .true.,
 /

 &domains
 time_step                           = 120,
 time_step_fract_num                 = 0,
 time_step_fract_den                 = 1,
 max_dom                             = 3,
 e_we                                = 231,247,349,
 e_sn                                = 197,211,313,
 e_vert                              = 35,    35,    35,
 p_top_requested                     = 5000,
 num_metgrid_levels                  = 138,
 num_metgrid_soil_levels             = 4,
 dx                                  = 27000, 9000,3000, 
 dy                                  = 27000, 9000,3000,
 grid_id                             = 1,2,3,   
 parent_id                           = 1,1,2,   
 i_parent_start                      = 1,75,70,
 j_parent_start                      = 1,64,55,    
 parent_grid_ratio                   = 1,3,3,    
 parent_time_step_ratio              = 1,3,4,    
 feedback                            = 1,     
 max_ts_locs                         = 50, 
 ts_buf_size                         = 200,
 max_ts_level                        = 50,
 tslist_unstagger_winds              = .false.,
 use_adaptive_time_step              = .false.,  
 smooth_option                       = 2,
 smooth_cg_topo                      = .true.,
 /

 &physics
 mp_physics                          = 10,   10, 10,   
 progn                               = 1,     1,  1,  
 ra_lw_physics                       = 4,     4,  4,  
 ra_sw_physics                       = 4,     4,  4, 
 radt                                = 27,   27, 27,    
 sf_sfclay_physics                   = 1,     1,  1,  
 sf_surface_physics                  = 2,     2,  2,  
 bl_pbl_physics                      = 9,     9,  9,  
 bldt                                = 0,     0,  0, 
 cu_physics                          = 3,     3,  0,
 shcu_physics                        = 1,     1,  0,  
 cudt                                = 0,     0,  0,  
 cugd_avedx                          = 3,
 ishallow                            = 1,    
 cu_diag                             = 1,     
 shcu_aerosols_opt                   = 2,     2,  2,    
 isfflx                              = 1,
 ifsnow                              = 1,
 icloud                              = 1,
 surface_input_source                = 3,
 sf_urban_physics                    = 0,     0, 0,  
 maxiens                             = 1,
 maxens                              = 3,
 maxens2                             = 3,
 maxens3                             = 16,
 ensdim                              = 144,
 cu_rad_feedback                     = .true.,.true.,.false.,
 sst_update                          = 1,
 num_land_cat                        = 21,
 usemonalb                           = .true.,
 rdmaxalb                            = .true.,
 rdlai2d                             = .true.,
 num_land_cat                        = 17,
 /

 &fdda
 grid_fdda                           = 2, 2, 2,
 gfdda_inname                        = "wrffdda_d<domain>"
 gfdda_interval_m                    = 180,180,180,
 gfdda_end_h                         = 10000, 10000, 10000,
 io_form_gfdda                       = 2,
 fgdt                                = 0, 0, 0,
 fgdtzero                            = 0, 0, 0,
 if_no_pbl_nudging_uv                = 1, 1, 1,
 if_no_pbl_nudging_t                 = 1, 1, 1,
 if_no_pbl_nudging_ph                = 1, 1, 1,
 if_no_pbl_nudging_q                 = 1, 1, 1,
 guv                                 = 0.0003, 0.0003, 0.0003,
 gt                                  = 0.0003, 0.0003, 0.0003, 
 gq                                  = 0.00001, 0.00001, 0.00001,
 gph                                 = 0.0003, 0.0003, 0.0003, 
 ktrop                               = 0,
 xwavenum                            = 2, 1, 1,
 ywavenum                            = 2, 1, 1,
 /

 &dynamics
 hybrid_opt                          = 2,
 w_damping                           = 1,
 diff_opt                            = 2,      2,  2,    
 km_opt                              = 4,      4,  4,    
 diff_6th_opt                        = 2,      2,  2,   
 diff_6th_factor                     = 0.12,   0.12,  0.12, 
 base_temp                           = 290,
 damp_opt                            = 3,
 zdamp                               = 5000.,  5000., 5000.,
 dampcoef                            = 0.2,    0.2,   0.2, 
 khdif                               = 2700,    900,  300,   
 kvdif                               = 100,      100,   100,   
 non_hydrostatic                     = .true., .true., .true.,
 moist_adv_opt                       = 1,      1,   1,    
 scalar_adv_opt                      = 1,      1,   1,       
 chem_adv_opt                        = 1,      1,   1,
 gwd_opt                             = 1,
 etac                                = 0.1,
 epssm                               = 0.5,0.5,0.5,  
 /

 &bdy_control
 spec_bdy_width                      = 9,
 spec_zone                           = 1,
 relax_zone                          = 8, 
 specified                           = .true.
 /

 &grib2
 /

 &namelist_quilt
 nio_tasks_per_group = 0,
 nio_groups = 1,
 /

 &diags
 diag_nwp2 = 1
 /
 

kwerner

Administrator
Staff member
Hi,
Can you package your wrf output error files (e.g., rsl.error.*) together into a single *.TAR file (not a *.rar file - we cannot open that format) and attach it so I can take a look? Thanks!
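
As a rough illustration of the packaging step, here is a minimal Python sketch. It assumes the rsl.error.* and rsl.out.* files sit in the WRF run directory; the function name and the archive name rsl_files.tar are placeholders chosen for this example, not anything WRF provides.

```python
import glob
import os
import tarfile


def package_rsl_files(run_dir=".", archive_name="rsl_files.tar"):
    """Bundle all rsl.error.* and rsl.out.* files in run_dir into one tar file."""
    paths = sorted(
        glob.glob(os.path.join(run_dir, "rsl.error.*"))
        + glob.glob(os.path.join(run_dir, "rsl.out.*"))
    )
    archive_path = os.path.join(run_dir, archive_name)
    # Mode "w" writes a plain, uncompressed tar archive.
    with tarfile.open(archive_path, "w") as tar:
        for path in paths:
            # Store only the file name so the archive unpacks flat.
            tar.add(path, arcname=os.path.basename(path))
    return archive_path, [os.path.basename(p) for p in paths]


if __name__ == "__main__":
    archive, members = package_rsl_files()
    print(f"Wrote {archive} with {len(members)} files")
```

Running the script from the run directory should leave a single tar file there that can be attached to a post.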
 

neel14

Member
Hi, I have attached the files.
The model seems to be crashing at the d03 time step.
 

Attachments

  • rsl.tar.gz
    49.9 KB

kwerner

Administrator
Staff member
Thanks for sending those. Essentially, the model is crashing immediately, before any real integration happens. The innermost domain is the first domain to integrate forward, which is why you are seeing this on d03. Take a look at this FAQ that discusses the most common reasons for segmentation faults (which is the error that shows up in some of the rsl.error.* files). I don't see any CFL errors, so you can ignore that section, but pay attention to the part that discusses the model stopping immediately.
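
To check quickly which rsl.error.* files report a segmentation fault and whether any mention CFL, something like the following Python sketch may help. The search patterns are assumptions about typical message wording, and scan_rsl is a hypothetical helper written for this example; adjust the patterns to whatever your files actually contain.

```python
import glob
import os
import re

# Assumed message patterns; adjust to match what your rsl files actually contain.
PATTERNS = {
    "segfault": re.compile(r"segmentation fault|sigsegv", re.IGNORECASE),
    "cfl": re.compile(r"\bcfl\b", re.IGNORECASE),
}


def scan_rsl(run_dir="."):
    """Return {label: [(file name, line number, line text), ...]} per pattern."""
    hits = {label: [] for label in PATTERNS}
    for path in sorted(glob.glob(os.path.join(run_dir, "rsl.error.*"))):
        # errors="replace" keeps the scan going if a file has stray bytes.
        with open(path, errors="replace") as handle:
            for lineno, line in enumerate(handle, start=1):
                for label, pattern in PATTERNS.items():
                    if pattern.search(line):
                        hits[label].append(
                            (os.path.basename(path), lineno, line.strip())
                        )
    return hits


if __name__ == "__main__":
    for label, found in scan_rsl().items():
        print(f"{label}: {len(found)} matching lines")
        for name, lineno, text in found[:5]:
            print(f"  {name}:{lineno}: {text}")
```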

I also notice that you are running many different options (several different diagnostics outputs, sst_update, etc.). If you determine that you do not have any problems with your input data, I would suggest trying to run this with the default namelist.input file that comes with the model code (I'll attach it in case you no longer have the original), modifying only the dates, times, and domain size/position specs (and nothing else), to see if that runs any further. If so, then you can slowly add in the other options you want to use, one at a time, to figure out which one is causing the issue.
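
To see which settings would need to be added back, it can help to list how your namelist differs from the default one. The Python sketch below is a deliberately simple illustration, not a full Fortran namelist parser: it assumes one "key = value" setting per line, no "!" or "," inside quoted strings, and it keeps the last value if a key is repeated. Both function names are hypothetical.

```python
def parse_namelist(text):
    """Parse 'key = value' lines into {(group, key): value}.

    Minimal sketch: one setting per line; a repeated key keeps the last value.
    """
    settings = {}
    group = None
    for raw in text.splitlines():
        line = raw.split("!", 1)[0].strip()  # drop Fortran-style comments
        if not line:
            continue
        if line.startswith("&"):
            group = line[1:].strip().lower()
        elif line == "/":
            group = None
        elif "=" in line and group is not None:
            key, value = line.split("=", 1)
            # Normalize spacing and trailing commas so '9, 9,' equals '9,9'.
            items = [item.strip() for item in value.split(",") if item.strip()]
            settings[(group, key.strip().lower())] = ",".join(items)
    return settings


def diff_namelists(default_text, custom_text):
    """List settings added or changed in custom_text relative to default_text."""
    default = parse_namelist(default_text)
    custom = parse_namelist(custom_text)
    changes = []
    for name, value in sorted(custom.items()):
        if name not in default:
            changes.append((name, None, value))  # option not in the default
        elif default[name] != value:
            changes.append((name, default[name], value))
    return changes


if __name__ == "__main__":
    # File names are examples; point these at your own copies.
    with open("namelist.input.413.orig.txt") as fh:
        default_text = fh.read()
    with open("namelist.input") as fh:
        custom_text = fh.read()
    for (group, key), old, new in diff_namelists(default_text, custom_text):
        print(f"&{group} {key}: default={old!r} yours={new!r}")
```

Each reported line is a candidate to re-enable on its own in a short test run.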
 

Attachments

  • namelist.input.413.orig.txt
    3.9 KB

neel14

Member
Hi,
Is there any special static input data requirement with BouLac scheme? I have no problems running other schemes with the same data.
 

kwerner

Administrator
Staff member
No, there shouldn't be any specific requirement for the BouLac scheme, and at this time we aren't aware of any issues with it, so perhaps it's not an issue with the input data. In that case, I'd recommend first trying to run this with the latest version of WRF (v4.3.3). If that doesn't work, then try running with the default namelist, modifying only the date, time, and domain configuration, without changing any other physics or adding any additional options, to see if it runs. If it completes without errors, then we know the BouLac scheme is capable of running, and that perhaps one of the other options, combined with BouLac, is causing the issue. You can follow the same method I mentioned above: add one new namelist option at a time to see if each one runs. Since your simulation is stopping immediately, these should be very quick tests. Please let me know what you discover.
 

neel14

Member
Hi,
Sorry for responding so late. There were issues with the HPC.
I haven't been able to test much, but I tried running with v4.3.3 and it failed as well. The rsl files are attached. So I guess there is some issue with the input data, even though I have been able to run 8 other schemes with the same data.

Will try further if possible.
 

Attachments

  • rsl.zip
    360.7 KB

kwerner

Administrator
Staff member
Thanks for following up. I'd recommend trying the basic namelist test I mentioned in my April 11 post: keep everything very basic, modifying only the dates and domain dimensions and turning on the BouLac scheme, to see if that runs. I don't think it's a data issue if you were able to run other schemes with your data (even in previous WRF versions).
 