
Segmentation Fault when running wrf.exe [WRF 4.0.3]


johnphua

I have tried running WRF for a small region of the world with MERRA2 plus GLDAS data, and also with Copernicus ERA5 pressure-level and single-level data. With both datasets, WPS and real.exe run successfully, but wrf.exe fails with a segmentation fault and similar error messages:
Code:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x7F880B24BE08
#1  0x7F880B24AF90
#2  0x7F880A97C4AF
#3  0x195BE9C in taugb3.7070 at module_ra_rrtmg_lw.f90:?
#4  0x197D31D in __rrtmg_lw_taumol_MOD_taumol
#5  0x19A1331 in __rrtmg_lw_rad_MOD_rrtmg_lw
#6  0x19B25D5 in __module_ra_rrtmg_lw_MOD_rrtmg_lwrad
#7  0x15143DD in __module_radiation_driver_MOD_radiation_driver
#8  0x15D35A4 in __module_first_rk_step_part1_MOD_first_rk_step_part1
#9  0x11336F4 in solve_em_
#10  0x102AA22 in solve_interface_
#11  0x46E872 in __module_integrate_MOD_integrate
#12  0x407863 in __module_wrf_top_MOD_wrf_run
How do I solve this problem? Did I miss any steps? What should I look out for?
I would really appreciate any help with this.

I am running WPS 4.0.3 and WRF 4.0.2.
Data used for MERRA and GLDAS:
MERRA2
const_2d_asm_Nx
inst6_3d_ana_Np
inst6_3d_ana_Nv
tavg1_2d_ocn_Nx
tavg1_2d_slv_Nx
GLDAS
noah025_3h_v2.1

namelist.wps (WPS_GEOG from “WRF Preprocessing System (WPS) Geographical Input Data Mandatory Fields Downloads” at http://www2.mmm.ucar.edu/wrf/users/download/get_sources_wps_geog.html#optional)
MERRA2 data prepared for metgrid with MERRA2WRF
Code:
&share
 wrf_core = 'ARW',
 max_dom = 1,
 start_date = '2017-01-01_00:00:00',
 end_date   = '2017-12-31_18:00:00',
 interval_seconds = 21600,
 io_form_geogrid = 2,
/

&geogrid
 parent_id         =   1,
 parent_grid_ratio =   1,
 i_parent_start    =   1,
 j_parent_start    =   1,
 e_we              =   600,
 e_sn              =   200,
 geog_data_res = 'default',
 map_proj = 'lat-lon',
 ref_lat = -0.79,
 ref_lon = 113.92,
 dx = 0.1,
 dy = 0.1,
 stand_lon = 0,
 geog_data_path = '<dir_to_wps>/WPS_GEOG/',
/

&ungrib
 out_format = 'WPS',
 prefix = 'FILE',
/

&metgrid
 fg_name = 'MERRA', 'SOILTEMP1', 'SOILTEMP2', 'SOILTEMP3', 'SOILTEMP4', 'SOILMOIST1', 'SOILMOIST2', 'SOILMOIST3', 'SOILMOIST4',
 io_form_metgrid = 2, 
 opt_output_from_metgrid_path = '/mnt/hdd/metgrid_output_merra'
/

namelist.input
Code:
 &time_control
 run_days                            = 365,
 run_hours                           = 0,
 run_minutes                         = 0,
 run_seconds                         = 0,
 start_year                          = 2017,
 start_month                         = 01,
 start_day                           = 01,
 start_hour                          = 00,
 end_year                            = 2017,
 end_month                           = 12,
 end_day                             = 31,
 end_hour                            = 23,
 interval_seconds                    = 21600
 input_from_file                     = .true.,.true.,.true.,
 history_interval                    = 180,  60,   60,
 frames_per_outfile                  = 1000, 1000, 1000,
 restart                             = .false.,
 restart_interval                    = 7200,
 io_form_history                     = 2
 io_form_restart                     = 2
 io_form_input                       = 2
 io_form_boundary                    = 2
 /

 &domains
 time_step                           = 10,
 time_step_fract_num                 = 0,
 time_step_fract_den                 = 1,
 max_dom                             = 1,
 e_we                                = 600,
 e_sn                                = 200,
 e_vert                              = 73,    33,    33,
 p_top_requested                     = 5000,
 num_metgrid_levels                  = 73,
 num_metgrid_soil_levels             = 4,
 dx                                  = 11117.7,
 dy                                  = 11117.7,
 grid_id                             = 1,
 parent_id                           = 0,
 i_parent_start                      = 1,
 j_parent_start                      = 1,
 parent_grid_ratio                   = 1,
 parent_time_step_ratio              = 1,
 feedback                            = 1,
 smooth_option                       = 0
 /

 &physics
 physics_suite                       = 'CONUS'
 mp_physics                          = -1,    -1,    -1,
 cu_physics                          = -1,    -1,     0,
 ra_lw_physics                       = -1,    -1,    -1,
 ra_sw_physics                       = -1,    -1,    -1,
 bl_pbl_physics                      = -1,    -1,    -1,
 sf_sfclay_physics                   = -1,    -1,    -1,
 sf_surface_physics                  = 0,    -1,    -1,
 radt                                = 15,    30,    30,
 bldt                                = 0,     0,     0,
 cudt                                = 5,     5,     5,
 icloud                              = 1,
 num_land_cat                        = 21,
 sf_urban_physics                    = 0,     0,     0,
 surface_input_source                   = 1
 /

 &fdda
 /

 &dynamics
 hybrid_opt                          = 2, 
 w_damping                           = 0,
 diff_opt                            = 1,      1,      1,
 km_opt                              = 4,      4,      4,
 diff_6th_opt                        = 0,      0,      0,
 diff_6th_factor                     = 0.12,   0.12,   0.12,
 base_temp                           = 290.
 damp_opt                            = 3,
 zdamp                               = 5000.,  5000.,  5000.,
 dampcoef                            = 0.2,    0.2,    0.2
 khdif                               = 0,      0,      0,
 kvdif                               = 0,      0,      0,
 non_hydrostatic                     = .true., .true., .true.,
 moist_adv_opt                       = 1,      1,      1,     
 scalar_adv_opt                      = 1,      1,      1,     
 gwd_opt                             = 1,
 /

 &bdy_control
 spec_bdy_width                      = 5,
 specified                           = .true.
 /

 &grib2
 /

 &namelist_quilt
 nio_tasks_per_group = 0,
 nio_groups = 0,
 /


Data used for ERA5:
pl
129: Geopotential
130: Temperature
131: U component of wind
132: V component of wind
157: Relative humidity

sfc
165: 10 metre U wind component
166: 10 metre V wind component
167: 2 metre temperature
168: 2 metre dewpoint temperature
172: Land-sea mask
134: Surface Pressure
151: Mean sea level pressure
235: Skin temperature
33: Snow density
141: Snow depth
139: Soil temperature level 1
170: Soil temperature level 2
183: Soil temperature level 3
236: Soil temperature level 4
39: Volumetric soil water layer 1
40: Volumetric soil water layer 2
41: Volumetric soil water layer 3
42: Volumetric soil water layer 4
(SST and ice frac)
31: Sea ice area fraction
34: Sea surface temperature
namelist.wps (WPS_GEOG from “WRF Preprocessing System (WPS) Geographical Input Data Mandatory Fields Downloads” at http://www2.mmm.ucar.edu/wrf/users/download/get_sources_wps_geog.html#optional)
ERA5 data prepared for metgrid with ungrib using Vtable.ERA-interim.pl
Code:
&share
 wrf_core = 'ARW',
 max_dom = 1,
 start_date = '2017-01-01_00:00:00',
 end_date   = '2017-01-01_23:00:00',
 interval_seconds = 3600,
 io_form_geogrid = 2,
/

&geogrid
 parent_id         =   1,
 parent_grid_ratio =   1,
 i_parent_start    =   1,
 j_parent_start    =   1,
 e_we              =   600,
 e_sn              =   200,
 geog_data_res = 'default',
 map_proj = 'lat-lon',
 ref_lat = -0.79,
 ref_lon = 113.92,
 dx = 0.1,
 dy = 0.1,
 stand_lon = 0,
 geog_data_path = '<dir_to_files>/WPS_GEOG/',
/

&ungrib
 out_format = 'WPS',
 prefix = 'ERA5',
/

&metgrid
 fg_name = 'ERA5'
 io_form_metgrid = 2, 
 opt_output_from_metgrid_path = '/mnt/hdd/metgrid_output_era'
/
namelist.input
Code:
  &time_control
 run_days                            = 1,
 run_hours                           = 0,
 run_minutes                         = 0,
 run_seconds                         = 0,
 start_year                          = 2017,
 start_month                         = 01,
 start_day                           = 01,
 start_hour                          = 00,
 end_year                            = 2017,
 end_month                           = 01,
 end_day                             = 01,
 end_hour                            = 23,
 interval_seconds                    = 3600
 input_from_file                     = .true.,.true.,.true.,
 history_interval                    = 180,  60,   60,
 frames_per_outfile                  = 1000, 1000, 1000,
 restart                             = .false.,
 restart_interval                    = 7200,
 io_form_history                     = 2
 io_form_restart                     = 2
 io_form_input                       = 2
 io_form_boundary                    = 2
 /

 &domains
 time_step                           = 50,
 time_step_fract_num                 = 0,
 time_step_fract_den                 = 1,
 max_dom                             = 1,
 e_we                                = 600,
 e_sn                                = 200,
 e_vert                              = 38,    33,    33,
 p_top_requested                     = 5000,
 num_metgrid_levels                  = 38,
 num_metgrid_soil_levels             = 4,
 dx                                  = 11117.7,
 dy                                  = 11117.7,
 grid_id                             = 1,
 parent_id                           = 0,
 i_parent_start                      = 1,
 j_parent_start                      = 1,
 parent_grid_ratio                   = 1,
 parent_time_step_ratio              = 1,
 feedback                            = 1,
 smooth_option                       = 0
 /

 &physics
 physics_suite                       = 'CONUS'
 mp_physics                          = -1,    -1,    -1,
 cu_physics                          = -1,    -1,     0,
 ra_lw_physics                       = -1,    -1,    -1,
 ra_sw_physics                       = -1,    -1,    -1,
 bl_pbl_physics                      = -1,    -1,    -1,
 sf_sfclay_physics                   = -1,    -1,    -1,
 sf_surface_physics                  = 0,    -1,    -1,
 radt                                = 15,    30,    30,
 bldt                                = 0,     0,     0,
 cudt                                = 5,     5,     5,
 icloud                              = 1,
 num_land_cat                        = 21,
 sf_urban_physics                    = 0,     0,     0,
 surface_input_source                = 1,
 use_mp_re                           = 0
 /

 &fdda
 /

 &dynamics
 hybrid_opt                          = 2, 
 w_damping                           = 0,
 diff_opt                            = 1,      1,      1,
 km_opt                              = 4,      4,      4,
 diff_6th_opt                        = 0,      0,      0,
 diff_6th_factor                     = 0.12,   0.12,   0.12,
 base_temp                           = 290.
 damp_opt                            = 3,
 zdamp                               = 5000.,  5000.,  5000.,
 dampcoef                            = 0.2,    0.2,    0.2
 khdif                               = 0,      0,      0,
 kvdif                               = 0,      0,      0,
 non_hydrostatic                     = .true., .true., .true.,
 moist_adv_opt                       = 1,      1,      1,     
 scalar_adv_opt                      = 1,      1,      1,     
 gwd_opt                             = 1,
 /

 &bdy_control
 spec_bdy_width                      = 5,
 specified                           = .true.
 /

 &grib2
 /

 &namelist_quilt
 nio_tasks_per_group = 0,
 nio_groups = 0,
 /
 
I suspect that the segmentation fault is caused by some inconsistency in the input data. You may need to recompile WRF in debug mode (./configure -D) and then rerun this case. The backtrace will then show the exact source file and line number where the model crashed, which is a good starting point for figuring out what is wrong.
 
Thank you for the advice. I have recompiled WRF in debug mode, and rerunning the case with the full debug level gives:

Code:
d01 2017-01-01_00:01:40 module_integrate: back from solve interface
d01 2017-01-01_00:01:40 in med_latbound_in
d01 2017-01-01_00:01:40 module_integrate: calling solve interface
d01 2017-01-01_00:01:40  grid spacing, dt, time_step_sound=   11117.7002       50.0000000               4
d01 2017-01-01_00:01:40 calling inc/HALO_EM_MOIST_OLD_E_7_inline.inc
d01 2017-01-01_00:01:40 calling inc/PERIOD_BDY_EM_MOIST_OLD_inline.inc
d01 2017-01-01_00:01:40  call rk_step_prep
d01 2017-01-01_00:01:40 calling inc/HALO_EM_A_inline.inc
d01 2017-01-01_00:01:40 calling inc/PERIOD_BDY_EM_A_inline.inc
d01 2017-01-01_00:01:40  call rk_phys_bc_dry_1
d01 2017-01-01_00:01:40  call init_zero_tendency
d01 2017-01-01_00:01:40 calling inc/HALO_EM_PHYS_A_inline.inc
d01 2017-01-01_00:01:40  call phy_prep
d01 2017-01-01_00:01:40  DEBUG wrf_timetoa():  returning with str = [2017-01-01_00:01:40]
d01 2017-01-01_00:01:40  call radiation_driver
d01 2017-01-01_00:01:40 Top of Radiation Driver
d01 2017-01-01_00:01:40 calling inc/HALO_PWP_inline.inc
d01 2017-01-01_00:01:40  call surface_driver
d01 2017-01-01_00:01:40 in MYJSFC

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x7FD8C7E4DE08
#1  0x7FD8C7E4CF90
#2  0x7FD8C757E4AF
#3  0x492E008 in __module_sf_myjsfc_MOD_sfcdif at module_sf_myjsfc.f90:773
#4  0x4938F78 in __module_sf_myjsfc_MOD_myjsfc at module_sf_myjsfc.f90:337 (discriminator 4)
#5  0x32292B4 in __module_surface_driver_MOD_surface_driver at module_surface_driver.f90:1889 (discriminator 16)
#6  0x1F262CC in __module_first_rk_step_part1_MOD_first_rk_step_part1 at module_first_rk_step_part1.f90:497
#7  0x17330E3 in solve_em_ at solve_em.f90:920
#8  0x1535CC4 in solve_interface_ at solve_interface.f90:139
#9  0x46EA57 in __module_integrate_MOD_integrate at module_integrate.f90:325
#10  0x406430 in __module_wrf_top_MOD_wrf_run at module_wrf_top.f90:324
#11  0x4055BD in MAIN__ at wrf.f90:29
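
For anyone scanning long debug logs, here is a minimal Python sketch (not part of WRF) that pulls out the innermost backtrace frame carrying a source file and line number. The function name `first_located_frame` and the regular expression are my own assumptions, modelled only on the gfortran backtrace format shown above.

```python
import re

# Matches backtrace frames that carry a source location, e.g.
# "#3  0x492E008 in __module_sf_myjsfc_MOD_sfcdif at module_sf_myjsfc.f90:773"
# (pattern is an assumption based on the log in this thread).
FRAME_RE = re.compile(
    r"^#(?P<index>\d+)\s+0x[0-9A-Fa-f]+\s+in\s+(?P<symbol>\S+)"
    r"\s+at\s+(?P<file>[^\s:]+):(?P<line>\d+)"
)


def first_located_frame(backtrace_text):
    """Return (symbol, file, line) for the innermost frame with a numeric line, or None."""
    for raw in backtrace_text.splitlines():
        match = FRAME_RE.match(raw.strip())
        if match:
            return match.group("symbol"), match.group("file"), int(match.group("line"))
    return None


SAMPLE = """\
#0  0x7FD8C7E4DE08
#1  0x7FD8C7E4CF90
#2  0x7FD8C757E4AF
#3  0x492E008 in __module_sf_myjsfc_MOD_sfcdif at module_sf_myjsfc.f90:773
#4  0x4938F78 in __module_sf_myjsfc_MOD_myjsfc at module_sf_myjsfc.f90:337 (discriminator 4)
"""

print(first_located_frame(SAMPLE))
```

For the backtrace above this points at line 773 of module_sf_myjsfc.f90, in sfcdif. Frames without a numeric line, such as `module_ra_rrtmg_lw.f90:?` in the non-debug build, are skipped.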

Looking at the source code, I realised that `call_surface_driver` can be called from different modules.
I removed the following lines from my namelist.input under `&physics` for the ERA5 run to use the 'CONUS' suite defaults:
Code:
 mp_physics                          = -1,    -1,    -1,
 cu_physics                          = -1,    -1,     0,
 ra_lw_physics                       = -1,    -1,    -1,
 ra_sw_physics                       = -1,    -1,    -1,
 bl_pbl_physics                      = -1,    -1,    -1,
 sf_sfclay_physics                   = -1,    -1,    -1,
 sf_surface_physics                  = 0,    -1,    -1,
 sf_urban_physics                    = 0,     0,     0,
 /

This change got wrf.exe to run. It is still running, and I will update this post when it is done.
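
A quick way to spot entries that override the suite before removing lines by hand is sketched below in Python. It is only an illustration: `explicit_overrides` and `SUITE_KEYS` are hypothetical names, the list of keys is simply the seven options that my namelist sets to -1 for at least one domain, and the parsing assumes one `name = v1, v2, ...` entry per line.

```python
# Options set to -1 (defer to the suite) somewhere in the &physics block
# from this thread. Treating exactly these as suite-controlled is an assumption.
SUITE_KEYS = (
    "mp_physics", "cu_physics", "ra_lw_physics", "ra_sw_physics",
    "bl_pbl_physics", "sf_sfclay_physics", "sf_surface_physics",
)


def explicit_overrides(physics_block):
    """Return {name: value} for suite options whose domain-1 value is not -1."""
    overrides = {}
    for raw in physics_block.splitlines():
        if "=" not in raw:
            continue
        name, _, values = raw.partition("=")
        name = name.strip()
        if name not in SUITE_KEYS:
            continue
        first = values.split(",")[0].strip()
        try:
            number = int(first)
        except ValueError:
            continue  # not an integer option, ignore
        if number != -1:
            overrides[name] = number
    return overrides


PHYSICS = """\
 mp_physics                          = -1,    -1,    -1,
 cu_physics                          = -1,    -1,     0,
 ra_lw_physics                       = -1,    -1,    -1,
 ra_sw_physics                       = -1,    -1,    -1,
 bl_pbl_physics                      = -1,    -1,    -1,
 sf_sfclay_physics                   = -1,    -1,    -1,
 sf_surface_physics                  = 0,    -1,    -1,
 sf_urban_physics                    = 0,     0,     0,
"""

print(explicit_overrides(PHYSICS))
```

On the lines I removed, it reports only sf_surface_physics = 0 for domain 1. I have not confirmed that this setting alone caused the SIGFPE in MYJSFC, but it is the one entry that did not defer to the suite.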

EDIT: wrf.exe ran successfully. I also had to change run_days/run_hours in namelist.input from 1 day to 23 hours to match the input data (0000h to 2300h).
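
To avoid this kind of mismatch, the run length can be computed from the start and end dates instead of being typed in. This is a small Python sketch of that arithmetic; `run_length` is a hypothetical helper, and it assumes timestamps in the `YYYY-MM-DD_HH:MM:SS` form used in namelist.wps.

```python
from datetime import datetime


def run_length(start, end):
    """Split the span between two WPS-style timestamps into (days, hours)."""
    fmt = "%Y-%m-%d_%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    total_hours = int(delta.total_seconds() // 3600)
    return divmod(total_hours, 24)


# ERA5 case from this thread: input data cover 0000h to 2300h on one day.
print(run_length("2017-01-01_00:00:00", "2017-01-01_23:00:00"))

# MERRA2 case from this thread: namelist.wps start_date and end_date.
print(run_length("2017-01-01_00:00:00", "2017-12-31_18:00:00"))
```

The ERA5 dates give 0 days and 23 hours, which is the value I ended up using. The MERRA2 dates in namelist.wps give 364 days and 18 hours, while that namelist.input asks for run_days = 365, so the same adjustment would probably be needed there.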
 