Segmentation fault occurred when running WRF-LES -> mpirun -np 3 ./wrf.exe

shanyuzhou

Good morning,
I am a beginner trying to learn how to run the WRF LES model to simulate the transport of tracer particles (named 'plume') at a given wind speed, using 2 domains. When I set the simulation time to less than 1 hour, it runs fine, but when I set it to 5 hours, the program always stops at around 1 hour and 30 minutes with a segmentation fault. What could be the reason for this?

>>> mpirun -np 3 ./wrf.exe
starting wrf task 0 of 3
starting wrf task 1 of 3
starting wrf task 2 of 3
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 2 with PID 0 on node lars-ProLiant-DL380-Gen9 exited on signal 11 (Segmentation fault).

The last part of the rsl error file:
------------------------- rsl --------------------------------------------

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x7a2c40623960 in ???
#1 0x7a2c40622ac5 in ???
#2 0x7a2c4004251f in ???
at ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
#3 0x5e2533153c43 in ???
#4 0x5e2533157396 in ???
#5 0x5e253315c14d in ???
#6 0x5e25328fb68c in ???
#7 0x5e2531ec625c in ???
#8 0x5e25317ca952 in ???
#9 0x5e25315cd174 in ???
#10 0x5e2530787cf9 in ???
#11 0x5e2530788389 in ???
#12 0x5e2530713d17 in ???
#13 0x5e253071319e in ???
#14 0x7a2c40029d8f in __libc_start_call_main
at ../sysdeps/nptl/libc_start_call_main.h:58
#15 0x7a2c40029e3f in __libc_start_main_impl
at ../csu/libc-start.c:392
#16 0x5e25307131d4 in ???
#17 0xffffffffffffffff in ???
 

Attachments

  • rsl.error.0000 (553.9 KB)
  • namelist.input (5.6 KB)
Hi,
In your rsl file, you have several CFL errors, such as:

Code:
d01 0001-01-01_01:32:18           34  points exceeded cfl=2 in domain d01 at time 0001-01-01_01:32:18 hours
d01 0001-01-01_01:32:18  MAX AT i,j,k:           39           3          73  vert_cfl,w,d(eta)=   2.20772076       5.41727877       6.28930330E-03
d01 0001-01-01_01:32:24           34  points exceeded cfl=2 in domain d01 at time 0001-01-01_01:32:24 hours
d01 0001-01-01_01:32:24  MAX AT i,j,k:           39           3          73  vert_cfl,w,d(eta)=   2.20878172       5.38269329       6.28930330E-03
d01 0001-01-01_01:32:24           34  points exceeded cfl=2 in domain d01 at time 0001-01-01_01:32:24 hours
d01 0001-01-01_01:32:24  MAX AT i,j,k:           39           3          74  vert_cfl,w,d(eta)=   2.20971894       5.28638983       6.28930330E-03
d01 0001-01-01_01:32:24           34  points exceeded cfl=2 in domain d01 at time 0001-01-01_01:32:24 hours
d01 0001-01-01_01:32:24  MAX AT i,j,k:           39           3          74  vert_cfl,w,d(eta)=   2.20922780       5.31041050       6.28930330E-03

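These errors indicate that the model has become unstable: the vertical CFL value (vert_cfl) exceeds 2 at 34 grid points in d01 at around 01:32 of simulation time. Try decreasing your time_step to 4 (seconds) and see if that helps. If not, see "Segmentation Faults - Helpful Information", which discusses CFL errors.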
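
If the rsl file is long, it may help to pull out just the CFL diagnostics and see where the largest value occurs. The following is a minimal Python sketch, not an official WRF tool: the function name summarize_cfl is hypothetical, and the regular expression assumes the "MAX AT i,j,k" line layout exactly as quoted above, so it may need adjusting for other WRF versions.

```python
import re
from collections import Counter

# Pattern for the "MAX AT i,j,k" lines, assumed from the rsl excerpt in this thread.
MAX_LINE = re.compile(
    r"^(d\d+)\s+(\S+)\s+MAX AT i,j,k:\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"vert_cfl,w,d\(eta\)=\s+(\S+)\s+(\S+)\s+(\S+)"
)


def summarize_cfl(lines):
    """Return (worst, hits): the entry with the largest vert_cfl and a
    count of how often each (domain, i, j, k) location is reported."""
    worst = None
    hits = Counter()
    for line in lines:
        match = MAX_LINE.match(line.strip())
        if match is None:
            continue  # not a CFL "MAX AT" line
        domain, time = match.group(1), match.group(2)
        i, j, k = (int(match.group(n)) for n in (3, 4, 5))
        vert_cfl, w = float(match.group(6)), float(match.group(7))
        hits[(domain, i, j, k)] += 1
        if worst is None or vert_cfl > worst["vert_cfl"]:
            worst = {"domain": domain, "time": time, "i": i, "j": j,
                     "k": k, "vert_cfl": vert_cfl, "w": w}
    return worst, hits


# Sample input: the four "MAX AT" lines quoted in the reply above.
sample = [
    "d01 0001-01-01_01:32:18  MAX AT i,j,k:           39           3          73  vert_cfl,w,d(eta)=   2.20772076       5.41727877       6.28930330E-03",
    "d01 0001-01-01_01:32:24  MAX AT i,j,k:           39           3          73  vert_cfl,w,d(eta)=   2.20878172       5.38269329       6.28930330E-03",
    "d01 0001-01-01_01:32:24  MAX AT i,j,k:           39           3          74  vert_cfl,w,d(eta)=   2.20971894       5.28638983       6.28930330E-03",
    "d01 0001-01-01_01:32:24  MAX AT i,j,k:           39           3          74  vert_cfl,w,d(eta)=   2.20922780       5.31041050       6.28930330E-03",
]

worst, hits = summarize_cfl(sample)
print("Largest vert_cfl:", worst)
print("Reports per location:", dict(hits))
```

To scan a real file, pass the open file object instead of the sample list, for example summarize_cfl(open("rsl.error.0000")). If the reported locations cluster at the same i, j, k, that can point to where the instability starts.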
 