
Segmentation Fault vs time_step

This post was from a previous version of the WRF&MPAS-A Support Forum. New replies have been disabled and if you have follow up questions related to this post, then please start a new thread from the forum home page.

mauricio_soares

I am running WRF 3.9.1.1 on an i7 desktop with 8 cores. However, it only works when I set time_step to less than dx * 2. With any time_step greater than dx*2 I get a "segmentation fault" or some other seemingly random error that varies with the time_step value (but never a CFL error). Could there be a problem with my simulation? I've exported the environment variable OMP_STACKSIZE=200M and set ulimit -s unlimited, but nothing changes. The metgrid files seem OK. My namelist is attached.

I would appreciate any comments.
Attachment: namelist.input
 
Hi,

Can you run this again with time_step > 2xDX, so that it fails, but with 'debug_level = 0' set to keep the output file(s) small? If you are running in dmpar mode, can you package all of your rsl.* files into a single *.tar file and send those for the failed run (time_step > 2xDX)? If smpar, will you send the output log for the failed run? Please also send the namelist.input file you use for this run.
Thanks!
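
For anyone following along, the packaging step could be sketched in Python roughly as below. This is only an illustration: the function name pack_rsl_files and the archive name rsl_files.tar are made up for this example, and it assumes the rsl.* files sit in the current run directory.

```python
import glob
import tarfile


def pack_rsl_files(archive="rsl_files.tar", pattern="rsl.*"):
    """Bundle every file matching `pattern` into one uncompressed tar archive.

    Both default names are hypothetical; adjust them to your run directory.
    Returns the sorted list of files that were added.
    """
    paths = sorted(glob.glob(pattern))
    with tarfile.open(archive, "w") as tar:
        for path in paths:
            tar.add(path)
    return paths
```

Calling pack_rsl_files() from the directory where wrf.exe ran should leave a single rsl_files.tar holding both the rsl.error.* and rsl.out.* files.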
 
Hi,

Thank you for sending all of that. It is very odd that you get different errors for different time_steps. I ran a test with your namelist set-up (using my own input data). I was able to run okay with both 36 processors and 1 processor, so it seems to point to your particular input.

1) Did you make any modifications to your code, or is it the default V3.9.1.1 code?
2) If the answer to #1 is no, can you send me the files I would need to try this run with your input (wrfinput_d0* files, wrfbdy_d01, wrflowinp_d0*)? See the Introduction section on the forum homepage for instructions on uploading files.
 
Hi Kwerner,

1) Did you make any modifications to your code, or is it the default V3.9.1.1 code?
No. The code is default. (I compiled with gfortran)

2) If the answer to #1 is no, can you send me the files I would need to try this run with your input (wrfinput_d0* files, wrfbdy_d01, wrflowinp_d0*)? See the Introduction section on the forum homepage for instructions on uploading files.
The files are uploaded as mauricio_segm_x_timestep.tar

I'm using a GFS 0.5-degree analysis. The only special input is the MUR SST, supplied through the intermediate metgrid files. This seems to be OK; I've plotted a few fields for verification.

I'm doing several tests. One test was to redistribute the vertical levels close to the surface (within the lowest 1 km). In this particular test I was able to run with time_step = dx*3, but with dx*4 the message "critical problem: ZLVL = ZPD; model stops" appeared. With dx*5 the segmentation fault occurred again.
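
To make the dx multiples above concrete, here is a rough Python sketch of how a d01 time_step translates into the step on each nest, plus a simple Courant number. The 27 km spacing and the 1, 3, 3, 3 ratios are hypothetical and not taken from the attached namelist; it also assumes a linear chain of nests (each domain's parent is the previous one), and both function names are invented for this example.

```python
def nest_time_steps(time_step, parent_time_step_ratio):
    """Return the time step (s) on each domain.

    `time_step` is the d01 value. Each nest divides its parent's step by
    its own ratio; the first ratio (for d01) is 1. Assumes each domain's
    parent is the previous domain in the list.
    """
    steps = []
    dt = float(time_step)
    for ratio in parent_time_step_ratio:
        dt = dt / ratio
        steps.append(dt)
    return steps


def courant_number(speed, dt, spacing):
    """Courant number for a speed (m/s), time step (s), and grid spacing (m)."""
    return speed * dt / spacing


# Hypothetical four-domain setup: d01 at 27 km, nest ratio 3 throughout.
dx_km = 27.0
ratios = [1, 3, 3, 3]
for factor in (2, 3, 4, 5):
    steps = nest_time_steps(factor * dx_km, ratios)
    print(f"time_step = {factor} x dx:", [round(s, 2) for s in steps])
```

With courant_number, a vertical estimate would use the vertical velocity and the layer thickness, so thinner layers near the surface raise the number for the same time step.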
 
Hi Mauricio,

When I ran in dmpar mode (with 18 processors), I was able to get a bit more information from the individual processors, and unfortunately it looks like there are CFL errors on d04 unless you decrease time_step to about 2xDX. CFL errors typically occur when there is either very strong convection or complex/steep terrain in your domain. If you're okay with the 2xDX time_step, then you can carry on, but if you need to run several simulations for long periods of time, you may want to experiment with a few other namelist options:

&dynamics
epssm = 0.1, 0.1, 0.1, 0.1 (you can try this up to about 0.5/domain)
This option slightly forward-centers the vertical pressure gradient (or sound waves) in time, in an effort to damp 3-d divergence.

&domains
smooth_cg_topo = .true.
You'll have to do this prior to running real. This smooths the model topography to match the low resolution topography that comes with the driving data. This is useful if the CFL error occurs along the boundary zone.

&dynamics
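w_damping = 1
This turns on damping of the vertical velocity, which limits very strong updrafts that can trigger CFL violations.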

I'm going to attach one of the rsl.error files where I'm seeing the CFL errors so that you can try to determine exactly where in your domain it's happening.
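
If it helps, a short Python sketch like the one below could tally the CFL messages across the rsl files. The two line formats in the comments are assumptions based on what WRF typically prints, so please check them against your own rsl.error.* files before relying on the patterns; the function names are made up for this example.

```python
import glob
import re
from collections import Counter

# Assumed line formats (verify against your own rsl files):
#   "   3 points exceeded cfl=2 in domain d04 at time 2017-01-01_00:10:00 hours"
#   " MAX AT i,j,k:   120   45   6 vert_cfl,w,d(eta)=  2.3  15.1  0.004"
EXCEEDED = re.compile(r"(\d+)\s+points exceeded cfl=\S+\s+in domain\s+(d\d+)")
MAX_AT = re.compile(r"MAX AT i,j,k:\s+(\d+)\s+(\d+)\s+(\d+)")


def summarize_cfl(lines):
    """Count CFL violations per domain and per (i, j, k) location of the maximum."""
    per_domain = Counter()
    per_point = Counter()
    for line in lines:
        m = EXCEEDED.search(line)
        if m:
            per_domain[m.group(2)] += int(m.group(1))
            continue
        m = MAX_AT.search(line)
        if m:
            per_point[tuple(int(g) for g in m.groups())] += 1
    return per_domain, per_point


def scan_rsl_files(pattern="rsl.error.*"):
    """Apply summarize_cfl to every file matching `pattern`; keep files with hits."""
    results = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as fh:
            domains, points = summarize_cfl(fh)
        if domains:
            results[path] = (domains, points)
    return results
```

The (i, j, k) values are grid indices on the tile's domain, so they would still need to be matched to the terrain, for example by looking up the same indices in the HGT field of the corresponding wrfinput file.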
 

Attachment: rsl.error.0017.txt (14.4 KB)

Indeed, I have checked that the points that violate the CFL condition are on steep topography.
OK, I'm more confident proceeding now that I know it's a CFL error.
It's just intriguing that the CFL error doesn't show up when I run on my computer.

Thanks Kwerner!
 