
Segmentation fault when running real.exe

acast

Dear forum,

I am unable to run real.exe, and the only error I get is the following:
"Segmentation fault (core dumped)"
I compiled WRF and WPS and the .exe files were created, although I do see some errors in the compile logs. I compiled both with a Cray compiler and used ungrib and metgrid to create the input files from ERA5. Everything was working great until the HPC center applied some updates to the system. I have contacted the help desk, but they have not been able to help, so I thought someone here with more experience might. I honestly do not know where to start debugging this segmentation fault in real.exe.

Thanks,

Alma
 
Alma,
How large is your case (e.g., number of grid points and vertical levels)? A segmentation fault is often caused by insufficient memory or by errors in physics. Since you are running real, I suspect this is more likely a memory issue.
 
Hi Ming,

I ran exactly the same configuration on a different HPC system and did not get this error, so I would like to know how to check whether it is a memory issue. I have attached my namelist.input, configure.wrf and configure.wps files, and here is part of my namelist.input file:

&domains
time_step = 45,
time_step_fract_num = 0,
time_step_fract_den = 1,
max_dom = 1,
e_we = 235,
e_sn = 122,
e_vert = 33,
p_top_requested = 5000,
num_metgrid_levels = 38,
num_metgrid_soil_levels = 4,
dx = 25000,
dy = 25000,

Thanks,

Alma
 

Attachments

  • configure.wps (3.4 KB)
  • namelist.input (5.9 KB)
  • configure.wrf.log (20.6 KB)

Hi Alma,

Thank you for the update. When the model fails with a 'segmentation fault' and there is no other error related to physics or dynamics, we often guess that the failure is caused by insufficient memory or by a machine issue. I don't know how to detect insufficient memory from a software engineering perspective. Note that insufficient memory only shows up when the model runs; it has nothing to do with how the code was compiled. A case with more grid points requires more memory.
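
To give a feel for how memory scales with grid size, here is a rough back-of-the-envelope sketch in Python using the e_we, e_sn and e_vert values from the namelist above. It assumes 4-byte (single-precision) values and treats every field as a full 3D array; the count of 100 fields is a made-up illustrative number, not something taken from WRF. The function name grid_memory_mib is hypothetical, and the real footprint of real.exe will differ.

```python
def grid_memory_mib(e_we, e_sn, e_vert, bytes_per_value=4, n_fields=1):
    """Approximate size in MiB of n_fields full 3D arrays on the grid.

    bytes_per_value=4 assumes single-precision reals (an assumption).
    """
    points = e_we * e_sn * e_vert
    return points * bytes_per_value * n_fields / 2**20


# Grid dimensions from the &domains section posted above.
e_we, e_sn, e_vert = 235, 122, 33

print("grid points:", e_we * e_sn * e_vert)
print("one 3D field (MiB):", round(grid_memory_mib(e_we, e_sn, e_vert), 2))

# 100 fields is a hypothetical count, used only to show the order of magnitude.
print("100 3D fields (MiB):",
      round(grid_memory_mib(e_we, e_sn, e_vert, n_fields=100), 1))

# Doubling both horizontal dimensions quadruples the estimate.
print("same with 2x e_we and 2x e_sn (MiB):",
      round(grid_memory_mib(2 * e_we, 2 * e_sn, e_vert, n_fields=100), 1))
```

The estimate grows linearly with each dimension, so comparing it against the memory available per node gives a quick sanity check before blaming memory.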
 
Hi Ming,

Thanks for your response. From reading other posts, I wonder whether this error is related to the number of processors used when running real.exe. I have tried running real.exe with more than one processor and I still get this segmentation fault. If you have any other suggestions on what to try, please let me know. The HPC helpdesk has been unable to solve this issue.

Thanks,
Alma
 
Alma,

Below is the command I use to check memory usage on NCAR's Cheyenne HPC:

qhist -d 14 -j JobID#

Note that "JobID#" is the number of my job running on Cheyenne.

I guess there is a similar command on your HPC system to check how much memory is used.
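
If no scheduler tool is available, one generic alternative is to wrap the run in a small script that reports the peak memory of the child process. The sketch below uses only the Python standard library on a Unix-like system; run_and_report and describe_exit are hypothetical helper names, and the demo command is a harmless placeholder rather than real.exe. On Linux ru_maxrss is reported in kilobytes. This assumes a serial run started directly; under an MPI launcher it would measure the launcher, not the individual ranks.

```python
import resource
import signal
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd and return (returncode, peak resident set size of children).

    The peak value comes from getrusage(RUSAGE_CHILDREN) and covers the
    largest child process waited for so far (kilobytes on Linux).
    """
    proc = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss


def describe_exit(returncode):
    """Turn a subprocess return code into a readable message."""
    if returncode < 0:
        # A negative code means the process was killed by that signal number.
        return f"killed by signal {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


# Placeholder command; replace with e.g. ["./real.exe"] for an actual run.
demo_cmd = [sys.executable, "-c", "pass"]
rc, peak = run_and_report(demo_cmd)
print(describe_exit(rc), "| peak RSS:", peak)
```

A segmentation fault would show up here as "killed by signal SIGSEGV", and a peak value far below the memory available on the node would point away from insufficient memory.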

Your case, with a grid of 235 x 122, is not a big case, so I don't think it is a memory issue. Could there be a machine issue?
 
Hi Ming,

I agree that it should not be a memory issue. I have contacted the HPC helpdesk again to see if they can help. If you have any more suggestions on what else to try or debug, let me know.

Thanks

Alma
 