Segmentation fault - invalid memory reference

Yan LIU

New member
I am currently using WRF 4.5 to perform regional climate simulations over the Tibetan Plateau. However, I consistently encounter an error when running wrf.exe:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x151d5f527ad0 in ???
#1 0x151d5f526c35 in ???
#2 0x151d5f11851f in ???
#3 0x56100fda0090 in __module_ra_rrtm_MOD_taugb3
#4 0x56100fda1be7 in __module_ra_rrtm_MOD_gasabs
#5 0x56100fdb3e0f in __module_ra_rrtm_MOD_rrtm
#6 0x56100fdb685c in __module_ra_rrtm_MOD_rrtmlwrad
#7 0x56100f453e32 in __module_radiation_driver_MOD_radiation_driver
#8 0x56100f58bd62 in __module_first_rk_step_part1_MOD_first_rk_step_part1
#9 0x56100ede549e in solve_em_
#10 0x56100ebd461a in solve_interface_
#11 0x56100dc06478 in __module_integrate_MOD_integrate
#12 0x56100db87b37 in __module_wrf_top_MOD_wrf_run
#13 0x56100db86fbe in main

Below is my namelist.input file:
&time_control
run_days = 0,
run_hours = 2,
run_minutes = 0,
run_seconds = 0,
start_year = 2023, 2023,
start_month = 05, 05,
start_day = 18, 18,
start_hour = 00, 00,
end_year = 2023, 2023,
end_month = 05, 05,
end_day = 18, 18,
end_hour = 02, 02,
interval_seconds = 3600
input_from_file = .true.,.true.,
history_interval = 360, 60,
frames_per_outfile = 1000, 1000,
restart = .false.,
restart_interval = 7200,
auxinput4_inname = "wrflowinp_d<domain>"
auxinput4_interval = 60
io_form_auxinput4 = 2
io_form_history = 2
io_form_restart = 2
io_form_input = 2
io_form_boundary = 2
/

&domains
time_step = 30,
time_step_fract_num = 0,
time_step_fract_den = 1,
max_dom = 2,
e_we = 100, 133,
e_sn = 100, 136,
e_vert = 45, 45,
dzstretch_s = 1.1
p_top_requested = 5000,
num_metgrid_levels = 38,
num_metgrid_soil_levels = 4,
dx = 9000, 3000
dy = 9000, 3000
grid_id = 1, 2,
parent_id = 0, 1,
i_parent_start = 1, 29,
j_parent_start = 1, 28,
parent_grid_ratio = 1, 3,
parent_time_step_ratio = 1, 3,
feedback = 1,
smooth_option = 0
!epssm = 0.5
/

&physics
!physics_suite = 'CONUS'
!set_update = 1
mp_physics = 8, 8,
cu_physics = 0, 0,
ra_lw_physics = 1, 1,
ra_sw_physics = 1, 1,
bl_pbl_physics = 2, 2,
sf_sfclay_physics = 2, 2,
sf_surface_physics = 2, 2,
radt = 15, 15,
bldt = 0, 0,
cudt = 0, 0,
icloud = 1,
num_land_cat = 21,
sf_urban_physics = 0, 0,
fractional_seaice = 1,
/

&fdda
/

&dynamics
etac = 0.1
hybrid_opt = 2,
w_damping = 1,
diff_opt = 2, 2,
km_opt = 4, 4,
diff_6th_opt = 0, 0,
diff_6th_factor = 0.12, 0.12,
base_temp = 290.
damp_opt = 3,
zdamp = 5000., 5000.,
dampcoef = 0.2, 0.2,
khdif = 0, 0,
kvdif = 0, 0,
non_hydrostatic = .true., .true.,
moist_adv_opt = 1, 1,
scalar_adv_opt = 1, 1,
gwd_opt = 3, 3,
/

&bdy_control
spec_bdy_width = 5,
specified = .true.
/

&grib2
/

&namelist_quilt
nio_tasks_per_group = 0,
nio_groups = 1,
/


Could anyone please advise on what I should do to avoid this issue?
 
Good morning,

Can you please attach any rsl errors or rsl output files you have from your model run? That will help us diagnose your issue better.
 
Thanks for your reply, sir! Here are my rsl.error file and my namelist.wps file. The rsl.out file is too large to upload.
 

Attachments

  • rsl.error.0000
    19.6 KB · Views: 8
  • namelist.wps
    736 bytes · Views: 3
@Yan LIU
The problem is likely that you are trying to run this with a single processor. Take a look at this FAQ to see recommendations on the number of processors to use based on the size of your domains.
 
Dear kwerner,

I have followed your advice and adjusted the size of my simulation domain as well as the number of parallel cores I'm using. However, I'm still encountering the same problem. I've attached my current namelist.wps, namelist.input, rsl.error, and rsl.output files.

It's important to note that I'm running WRF on the Linux subsystem of my own Windows 11 system. I've also attached the error message from my WSL window.

I would greatly appreciate any further assistance you can provide. Thank you!
 

Attachments

  • rsl.error.0000
    19.6 KB · Views: 1
  • rsl.out.0000
    537.6 KB · Views: 0
  • namelist.wps
    732 bytes · Views: 0
  • namelist.input
    3.9 KB · Views: 4
  • WSL-error.png
    31.4 KB · Views: 8
Please set epssm = 0.9, 0.9, then try again. This case is located over the Tibetan Plateau, where the topography gradient is large, which can easily lead to numerical instability. A larger value of epssm may help overcome this problem.

Also, you may reduce the etac value, e.g., set it to 0.02.

Please let me know whether the above options work.
 
Dear Mr. Chen,
Following your advice, I've adjusted the parameters to set epssm to 0.9, 0.9 and etac to 0.02. However, this appears to have led to a new issue. There seems to be a discrepancy or conflict between the smooth_option and epssm settings. Could you please provide further guidance on how to address this matter? I appreciate your continued assistance.
 

Attachments

  • rsl.error.0000
    541 bytes · Views: 1
  • rsl.out.0000
    405 bytes · Views: 0
  • namelist.input
    3.9 KB · Views: 3
Yan,
The model crashed because you added epssm to &domains.
Please add epssm to &dynamics instead, then try again. Let me know if you still have problems.
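
For reference, a minimal sketch of where these settings would sit, assuming the rest of the &dynamics block from the posted namelist.input stays unchanged. Only the two entries discussed in this thread are shown, and the values are the ones suggested earlier, not tuned recommendations:

```
&dynamics
 epssm = 0.9, 0.9,   ! one value per domain; belongs here, not in &domains
 etac  = 0.02,       ! reduced from the posted value of 0.1
 ! ... remaining &dynamics entries unchanged ...
/
```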
 
Dear Mr. Chen,
I added epssm to &dynamics, and I hit the same segmentation fault again. Should I keep shrinking my simulation domains?
 

Attachments

  • rsl.error.0000
    19.4 KB · Views: 2
  • rsl.out.0000
    3.8 MB · Views: 1
  • namelist.input
    3.9 KB · Views: 4
  • namelist.wps
    732 bytes · Views: 1
  • WSL-error-4-cores.png
    24.5 KB · Views: 7
  • WSL-error-9-cores.png
    30.9 KB · Views: 7
Good morning
I have the same error as @Yan LIU but under different conditions. My run finished successfully, but I still get a segmentation fault message in rsl.error.0000.
I have attached my files. Can someone please help me with this issue?
 

Attachments

  • namelist.input
    9.3 KB · Views: 2
  • namelist.wps
    1.4 KB · Views: 0
  • rsl.error.0000
    350.3 KB · Views: 2
  • rsl.out.0000
    349.1 KB · Views: 0
  • test_20230612_05.log
    32.7 KB · Views: 1
@Yan LIU

Your rsl file indicates that this case crashed after about 15 minutes of integration. That at least tells us this is not a memory issue, so I don't think reducing the domain size would help.

Please run with just a single domain (max_dom = 1) and see whether the case finishes successfully. We need to confirm that the parent domain works before narrowing down possible issues in the child domain.

We know that high-resolution WRF can easily blow up over complex terrain like Tibet, but a larger value of epssm often helps stabilize the integration.
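
A minimal sketch of the single-domain test, assuming nothing else in namelist.input changes. With max_dom = 1, WRF reads only the first column of the per-domain entries, so the second-column values can stay in place:

```
&domains
 max_dom = 1,          ! run the 9 km parent domain only
 e_we    = 100, 133,   ! second column is ignored while max_dom = 1
 e_sn    = 100, 136,
 ! ... remaining &domains entries unchanged ...
/
```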
 
Dear Mr. Chen,
I now run only a single domain, but the model still integrates for only 15 minutes before crashing.
 

Attachments

  • rsl.error.0000
    6.7 KB · Views: 1
  • rsl.out.0000
    275.1 KB · Views: 0
  • namelist.input
    3.5 KB · Views: 0
  • namelist.wps
    671 bytes · Views: 0
I am sorry for the trouble. I guess the complex terrain in Tibet caused some issue.

In this case, I would suggest that you recompile WRF in debug mode, i.e., ./configure -D, then rerun this case. The log file will show when and where something first went wrong.
 
Dear Mr. Chen,
I'm attempting to avoid domain boundaries on complex terrain by enlarging the domain while retaining a single domain. However, this approach also fails. Then I noticed that I had set radt=15. This could explain why I'm only able to run each simulation for 15 minutes. After adjusting radt to 5, the model can only run for 5 minutes. This leads me to believe that the model encounters an issue whenever it begins the second radiation calculation. So far, I have tested both ra_lw_physics = 4, ra_sw_physics = 1 and ra_lw_physics = 1, ra_sw_physics = 1 as radiation parameterization schemes. But unfortunately, neither has worked. Could you provide guidance on how to configure the radiation scheme to prevent errors when the model calculates the radiation for the second time?
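
The timing test described here amounts to changing one entry in &physics. A sketch, assuming the single-domain setup and leaving every other entry as posted; if the crash moves from 15 minutes to 5 minutes of model time, it follows the radiation call interval rather than the elapsed integration time:

```
&physics
 ra_lw_physics = 1,   ! RRTM longwave, the scheme named in the backtrace (module_ra_rrtm)
 ra_sw_physics = 1,
 radt          = 5,   ! minutes between radiation calls; previously 15
 ! ... remaining &physics entries unchanged ...
/
```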
 
Dear Sir, I attempted to compile WRF in debug mode, but this led to new errors. Previously, I only encountered issues while running wrf.exe; in debug mode, problems surfaced when running real.exe. Consequently, I had no choice but to recompile WRF in its normal mode. Today, I also tried coarsening the grid spacing to 30 km, but I was met with the same error as before. I tried the physics schemes used in other research papers, but to no avail. Tomorrow, I plan to try other versions of WRF on different devices. Wish me luck!
 

Attachments

  • rsl.error.0000
    19 KB · Views: 0
  • rsl.out.0000
    18.2 KB · Views: 0
  • namelist.wps
    675 bytes · Views: 3
  • namelist.input
    3.5 KB · Views: 0
@Ming Chen
Dear sir, I changed the position of the center point, lowered the resolution of the static data (set geog_data_res = '5m'), and extended the domain in the east-west direction. With these changes, the 9 km single-domain case finally ran successfully on WRF 4.5. Next, I will try two nested domains.
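
The static-data change mentioned here is made in namelist.wps rather than namelist.input. A minimal sketch, assuming a single domain; the new center point and domain dimensions are in the attached file and are not reproduced here:

```
&geogrid
 geog_data_res = '5m',   ! 5 arc-minute static data; coarser input generally gives smoother terrain
 ! ... remaining &geogrid entries as in the attached namelist.wps ...
/
```

After a change like this, geogrid.exe, metgrid.exe, and real.exe need to be rerun before wrf.exe.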
 

Attachments

  • rsl.error.0000
    23.6 KB · Views: 0
  • rsl.out.0000
    23.6 KB · Views: 1
  • namelist.wps
    697 bytes · Views: 0
  • namelist.input
    3.5 KB · Views: 2