
Error wrf.exe while running large domain forrtl: severe (174): SIGSEGV, segmentation fault occurred

XHalo

Dear Users,
I am running WRF ARW V4.4 on an HPC system using FNL data.
While running wrf.exe, I get this error:
forrtl: severe (174): SIGSEGV, segmentation fault occurred
I have already set "ulimit -s unlimited" in my .bashrc, and there is no CFL error.
However, when I set e_we = 200,301 and e_sn = 200,301, the run completes successfully.

Thank you in advance.
XHalo
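
Editor's note: setting "ulimit -s unlimited" in .bashrc only helps if the shell that launches wrf.exe actually reads that file, which is not guaranteed for batch jobs. As a minimal sketch (the function name stack_limit_report is hypothetical), the following Python snippet reports the stack limit in effect for the process that runs it. Running it from inside the job script, just before wrf.exe, shows what the job really inherits.

```python
import resource


def stack_limit_report():
    """Return the soft and hard stack limits as human-readable strings."""
    # RLIMIT_STACK values are reported in bytes (Unix only).
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)

    def fmt(value):
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return f"{value // 1024} KiB"

    return fmt(soft), fmt(hard)


if __name__ == "__main__":
    soft, hard = stack_limit_report()
    print(f"stack soft limit: {soft}, hard limit: {hard}")
```

If the soft limit is not "unlimited" inside the job, the ulimit command probably needs to go into the job script itself.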
 

Attachments

  • namelist.input
    3.9 KB · Views: 5
  • rsl.error.0005.txt
    7 KB · Views: 3
Hi,
The fact that it runs okay when your domains are smaller is likely an indicator that you just need to use more processors. Try increasing the number and see if that helps at all.
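
Editor's note: a commonly quoted rule of thumb in the WRF community (treated here as an assumption, not a hard limit) is that each processor should handle a tile of between roughly 25 x 25 and 100 x 100 grid points. The sketch below turns that into a rough processor range; the function name processor_range is hypothetical, and the lower bound is taken from the largest domain and the upper bound from the smallest.

```python
def processor_range(e_we, e_sn):
    """Rough (min, max) processor counts from per-domain grid sizes.

    e_we, e_sn: lists with one entry per domain, as in namelist.input.
    """
    domains = list(zip(e_we, e_sn))

    # Largest domain sets the lower bound: tiles of about 100 x 100 points.
    big_we, big_sn = max(domains, key=lambda d: d[0] * d[1])
    min_procs = max(1, (big_we // 100) * (big_sn // 100))

    # Smallest domain sets the upper bound: tiles of about 25 x 25 points.
    small_we, small_sn = min(domains, key=lambda d: d[0] * d[1])
    max_procs = max(1, (small_we // 25) * (small_sn // 25))

    return min_procs, max_procs


# Grid sizes quoted in this thread for the configuration that runs.
print(processor_range([200, 301], [200, 301]))
```

For e_we = 200,301 and e_sn = 200,301 this returns (9, 64). Treat the numbers as guidance only: runs outside the range can still work, and runs inside it can still fail for unrelated reasons.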
 
Thanks for your advice.
I tried 96 and 192 processors, but got the same error.
 
Thanks for trying that. Can you check your input data and make sure everything looks okay? I notice the simulation stops immediately and many times the reason for that is the input data. Make sure you have all required variables, and that they all look okay (aren't missing in any locations in the domain). If you still aren't finding anything wrong, please package all of your rsl* files into a single *.tar file (not .rar) and attach that so I can see if there are other errors in any other files. Thanks!
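
Editor's note: the two requests above (look through every rsl file for errors, then bundle them into one .tar) can be sketched in Python with only the standard library. The helper names and the keyword list are assumptions for illustration; they are not part of WRF.

```python
import glob
import os
import tarfile

# Assumed keywords worth flagging; extend the tuple as needed.
KEYWORDS = ("sigsegv", "forrtl", "fatal", "cfl")


def rsl_files(run_dir="."):
    """Return all rsl.error.* and rsl.out.* files in run_dir, sorted."""
    patterns = ("rsl.error.*", "rsl.out.*")
    found = []
    for pattern in patterns:
        found.extend(glob.glob(os.path.join(run_dir, pattern)))
    return sorted(found)


def scan_rsl(run_dir="."):
    """Map each rsl file name to the lines that contain a keyword."""
    hits = {}
    for path in rsl_files(run_dir):
        with open(path, errors="replace") as fh:
            matched = [line.rstrip("\n") for line in fh
                       if any(k in line.lower() for k in KEYWORDS)]
        if matched:
            hits[os.path.basename(path)] = matched
    return hits


def pack_rsl(run_dir=".", archive="rsl.tar"):
    """Bundle the rsl files into one uncompressed tar; return its members."""
    members = []
    with tarfile.open(archive, "w") as tar:
        for path in rsl_files(run_dir):
            name = os.path.basename(path)
            tar.add(path, arcname=name)
            members.append(name)
    return members
```

Calling scan_rsl() in the run directory shows which ranks logged a problem, and pack_rsl() writes rsl.tar in the current directory, ready to attach.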
 
Thanks for your reply.
I don't think the FNL input data is the problem, because the run completes successfully with e_we = 200,301 and e_sn = 200,301 and no other changes. One more thing to mention: I also tried WRF ARW V4.2.2 with 96 processors, so I don't know whether the version is the key issue. I am attaching all of my rsl* files along with the wrfbdy and wrfinput header information.
Thanks again for your suggestion.
 

Attachments

  • rsl.tar
    930 KB · Views: 1
  • wrfbdy_d01.txt
    28.7 KB · Views: 1
  • wrfinput_d01.txt
    43.3 KB · Views: 1
  • wrfinput_d02.txt
    43.3 KB · Views: 1
I have attached some information that I did not mention before. I hope it is useful.
 

Attachments

  • wrf_error.png
    146.4 KB · Views: 15
Hi,
Can you send me a couple of time periods of met_em* files for each domain? I'd like to try to repeat this test. The files will likely be too large to attach, so if you don't have any other way to get the files to me, take a look at the home page of this forum for information on sharing large files. Thanks!
 
I found the problem: I had used the wrong Vtable. WRF now runs successfully.
Thanks for your suggestion.
 