
Segmentation fault - invalid memory reference

This post is from a previous version of the WRF & MPAS-A Support Forum. New replies have been disabled; if you have follow-up questions related to this post, please start a new thread from the forum home page.

Dear all,

My goal is to run the WRF model ver. 4.0 on 3 nested domains with grid spacings of 9000, 3000, and 1000 m respectively (a 1:3:3 nesting ratio), as reported below:

Code:
&domains
time_step = 54,
time_step_fract_num = 0,
time_step_fract_den = 1,
use_adaptive_time_step = .true.,
step_to_output_time = .true.,
target_cfl = 1.1,1.1,1.1,1.1,1.1,
max_step_increase_pct = 5, 30, 30, 71, 71,
starting_time_step = -1, -1, -1, 4, -1,
max_time_step = 100, 24, 8, 6, -1,
min_time_step = -1, -1, -1, -1, -1,
max_dom = 3,
s_we = 1, 1, 1,
e_we = 130, 151, 301,
s_sn = 1, 1, 1,
e_sn = 130, 151, 301,
s_vert = 1, 1, 1,
e_vert = 41, 41, 41,
num_metgrid_levels = 34,
num_metgrid_soil_levels = 4,
dx = 9000, 3000, 1000,
dy = 9000, 3000, 1000,


For this run I used 80 CPUs and 200 GB of RAM.

wrf.exe exits with the error:

Code:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x7F8941A496D7
#1 0x7F8941A49D1E
#2 0x7F8940F443FF
#3 0x2F5223D in __module_sf_sfclayrev_MOD_psim_stable
#4 0x2F5688C in __module_sf_sfclayrev_MOD_sfclayrev1d
#5 0x2F5B76D in __module_sf_sfclayrev_MOD_sfclayrev
#6 0x25FF232 in __module_surface_driver_MOD_surface_driver
#7 0x1D4EF19 in __module_first_rk_step_part1_MOD_first_rk_step_part1
#8 0x124B3F0 in solve_em_
#9 0x1119D75 in solve_interface_
#10 0x475F0C in __module_integrate_MOD_integrate
#11 0x4764ED in __module_integrate_MOD_integrate
#12 0x4764ED in __module_integrate_MOD_integrate
#13 0x408123 in __module_wrf_top_MOD_wrf_run


Following the WRF recommendation on the number of CPUs to use, I should use a minimum of 2 and a maximum of 154 CPUs.
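
For reference, the rule of thumb quoted in the forum guidance is roughly (e_we/100) x (e_sn/100) processors at the fewest and (e_we/25) x (e_sn/25) at the most, per domain; a minimal sketch that evaluates it for the three domains above (the exact figures quoted in posts vary slightly with rounding and with which domain is taken as limiting):

Code:
# Rule-of-thumb processor range per domain, per the forum guidance
# (~25 to ~100 grid points per processor in each direction):
for dims in "130 130" "151 151" "301 301"; do
  set -- $dims
  awk -v we="$1" -v sn="$2" 'BEGIN {
      printf "%d x %d: fewest ~ %.0f, most ~ %.0f processors\n",
             we, sn, (we/100) * (sn/100), (we/25) * (sn/25)
  }'
done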

I'm able to run the simulation only if the third nested domain has a size of 211 x 211; when I use a bigger domain (even 241 x 241), wrf.exe exits with a segmentation fault.

Do you have any suggestions on how I can modify the execution to complete the wrf.exe run and eliminate the segmentation fault?

Could it be that the problem lies in the difference in size between the domains? If so, do you think increasing the size of domains d01 and d02 would solve it?

Thank you so much for your support,
Andrew
 

Attachments

  • namelist.input (5.9 KB)
  • rsl_out.tar.gz (28.2 KB)
  • rsl_error.tar.gz (30 KB)
Hi Andrew,
I just want to check - did you happen to compile your code for the moving nested option?
Can you also check to make sure you have enough disk space to write large files?
Thanks!
 
Hi Kwerner,
thanks for your reply.

"I just want to check - did you happen to compile your code for the moving nested option?"

--> I don't know what that is. When you say "compile," do you mean the execution of the specific simulation, or a step during the WRF installation?

"Can you also check to make sure you have enough disk space to write large files?"
--> Yes, I'm sure I have plenty of disk space.

Thanks
 
Hi,
"Compile" refers to building the code. When you build the WRF code, you must first configure the code. This is the step where you choose the compiler and infrastructure you're using, and then you choose the nesting option (0 = no nesting, 1 = basic, 3 = vortex following). If you are uncertain, can you send your configure.wrf file so I can take a look? Thanks!
 
Hi,
Thanks for sending that, and I apologize for the delay. It looks like you are using the vortex-following option. At the top of your configure.wrf file, I see:
Code:
# configure.wrf
#
# Original configure options used:
# ./configure 
# Compiler choice: 3
# Nesting option: 3
Nesting option 3 is the vortex-following option. If you are not intending to use a moving nest to track a tropical cyclone, you should recompile and choose the basic nesting option (1). The steps to take are:

1) Go back to the main WRF directory
2) ./clean -a
3) ./configure
This is where you'll choose your compiler option, and then option "1" for basic nesting
4) ./compile em_real >& compile.log

Then you can try to run your case again and see if that helps.
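
Put together, a minimal sketch of the rebuild (as far as I know, a successful em_real build leaves wrf.exe, real.exe, ndown.exe, and tc.exe in the main/ directory, so listing them is a quick way to verify it):

Code:
cd WRF                            # top-level WRF directory
./clean -a
./configure                       # choose your compiler option, then "1" for basic nesting
./compile em_real >& compile.log
ls -l main/*.exe                  # should list wrf.exe, real.exe, ndown.exe, tc.exe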
 
Hi Kwerner,

When I configure, I choose the dmpar option 34 with gfortran/gcc.
Choosing option 1 would be serial with pgf90/gcc.

I need a distributed-memory parallel configuration!
 
Andrea,
When you configure, you should first choose option 34 for a dmpar compile, and then you should have the option to choose your nesting option:
1 = basic
2 = preset moves
3 = vortex following

The nesting option is what I'm referring to - in the configure.wrf you sent me, you've chosen the vortex-following option.
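
Roughly, the two prompts look like this (the exact wording and menu numbers vary by WRF version and platform):

Code:
./configure
#   ... compiler/architecture menu ...
# Enter selection [1-75] : 34      <-- compiler choice (e.g., GNU dmpar)
# Compile for nesting? (1=basic, 2=preset moves, 3=vortex following) [default 1] : 1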
 
Hi,

thanks for your reply.
In my previous WRF installation, I did the configuration manually.

Now I'm trying to install WRF in a Docker container, so I need to modify the arch/Config.pl file automatically.
When I wrote the response "34" into this file, I accidentally also changed the response for the nesting option.

I will find an alternative procedure.
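
(One option is to pipe the two answers to ./configure instead of editing arch/Config.pl; a minimal sketch, assuming the same menu numbers as above:)

Code:
# Answer both configure prompts non-interactively, e.g. in a Dockerfile RUN step:
# "34" = gfortran/gcc dmpar on this platform, "1" = basic nesting
printf '34\n1\n' | ./configure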

Thanks for your help.

Andrea
 
Andrea,
I'm so glad to hear that! Do you mind letting me know what resolved the issue? This could potentially help another user in the future. Thanks!
 
The error was due to having set option 3 in the nesting settings. This choice was unintentional, because I carry out the WRF configuration during the build of a Docker image. After selecting option 1 in the nesting settings, as you suggested, the Docker build seems to work correctly.
Thanks for your invaluable support.
 