Segmentation fault - only ??? in backtrace for error

This post was from a previous version of the WRF&MPAS-A Support Forum. New replies have been disabled and if you have follow up questions related to this post, then please start a new thread from the forum home page.

Klara_353

Hello,

I am running simulations with WRF 3.9.1 on 8 processors, driven by ERA5 data, and they keep crashing.
Each run ends with:

=========================================================
BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
EXIT CODE:139
CLEANING UP REMAINING PROCESSES
YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=========================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions.

I checked rsl.error.0000 (the largest of the eight rsl.error files), and it ends with just:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x7fc35998c2da in ???
#1 0x7fc35998b503 in ???
#2 0x7fc359008f1f in ???
#3 0x56082a24d0f5 in ???
#4 0x56082a26be67 in ???
#5 0x56082a28ea4b in ???
#6 0x56082a29ddc4 in ???
#7 0x560829e56128 in ???
#8 0x560829f0d2e5 in ???
#9 0x560829a81519 in ???
#10 0x56082997e5e0 in ???
#11 0x560828da4cb8 in ???
#12 0x560828da52ad in ???
#13 0x560828da52ad in ???
#14 0x560828d3dba9 in ???
#15 0x560828d3d4fe in ???
#16 0x7fc358febb96 in ???
#17 0x560828d3d539 in ???
#18 0xffffffffffffffff in ???

I am enclosing the namelist.input, namelist.wps and rsl.error.0000 files.

Based on previous discussions of segmentation faults, I have also tried the following:
1) Reducing the time step, and later switching to an adaptive time step so that the run does not take extremely long. The backtrace was similar with and without the adaptive time step.
2) Running ulimit -s unlimited.
3) Modifying epssm as well as the cfl options.
4) Setting smooth_cg_topo=.true., which did not help either.
5) Shortening the study period by moving its start forward by two days. It crashes anyway.

I should also note that I changed the negative moisture values in SM028100 (created by metgrid.exe) to positive, but the simulation crashes both with and without this change.

I really do not know what else to try, and any help would be appreciated.
Best regards,
Klara

Attachments

  • namelist (1).input (6.6 KB)
  • namelist (1).wps (1.5 KB)
  • rsl.error0000.txt (532 KB)

Did the model crash immediately? If so, this often indicates that the input data is wrong or that there is not enough memory.
Can you run the same case with GFS as input? I would first like to rule out possible problems in the ERA5 data. ERA5 is relatively new, and we don't have much experience running WRF driven by it.
Also, can you run the case with more processors, for example 16? This would show whether memory is the problem in the failed case.
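
To answer the "did it crash immediately?" question, it can help to count how many model steps each rsl file recorded before the segmentation fault. Below is a minimal Python sketch, not an official WRF tool. It assumes the logs use the usual "Timing for main: time ... on domain N" lines and that CFL warnings contain the string "cfl". The function name summarize_rsl is made up for this example, and the script expects to be run in the directory holding the rsl.error.* files.

```python
import glob
import re

# Assumed log format, e.g.:
# "Timing for main: time 2018-07-01_00:00:30 on domain   1:    2.5 elapsed seconds"
TIMING = re.compile(r"Timing for main: time (\S+) on domain\s+(\d+)")


def summarize_rsl(path):
    """Return (timing_line_count, last_model_time, cfl_line_count) for one rsl file."""
    steps, last_time, cfl = 0, None, 0
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, errors="replace") as fh:
        for line in fh:
            match = TIMING.search(line)
            if match:
                steps += 1
                last_time = match.group(1)
            elif "cfl" in line.lower():
                cfl += 1
    return steps, last_time, cfl


if __name__ == "__main__":
    # Hypothetical usage: run from the WRF run directory.
    for path in sorted(glob.glob("rsl.error.*")):
        steps, last_time, cfl = summarize_rsl(path)
        print(f"{path}: {steps} timing lines, last time {last_time}, {cfl} CFL lines")
```

If rsl.error.0000 shows zero timing lines, the crash happened before the first step completed, which points toward input data or memory. A run that completes many steps and logs CFL lines shortly before the end looks more like a numerical stability problem.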