wrf.exe segmentation fault error

acast

Dear forum,

I am getting a confusing segmentation fault when I run wrf.exe. ungrib, metgrid and real.exe all appeared to run successfully: their log files showed no errors and each ended with a "successful" message. wrf.exe writes the wrfout files for the first time step, but then the segmentation fault occurs and the job stops.

These are the last lines of my rsl.error.0000 file:

Timing for Writing wrfout_d01_2004-01-01_12:00:00 for domain 1: 16.67673 elapsed seconds
Timing for Writing afwa_d01_2004-01-01_12:00:00 for domain 1: 0.48017 elapsed seconds
Timing for Writing wrfzlev_d01_2004-01-01_12:00:00 for domain 1: 0.37146 elapsed seconds
Timing for Writing wrfprs_d01_2004-01-01_12:00:00 for domain 1: 1.70725 elapsed seconds
d01 2004-01-01_12:00:00 Input data is acceptable to use: wrflowinp_d01
d01 2004-01-01_12:00:00 Input data processed for aux input 4 for domain 1
d01 2004-01-01_12:00:00 Input data is acceptable to use: wrffdda_d01
d01 2004-01-01_12:00:00 Input data processed for aux input 10 for domain 1
d01 2004-01-01_12:00:00 Input data is acceptable to use: wrfbdy_d01
Timing for processing lateral boundary for domain 1: 2.11732 elapsed seconds
Tile Strategy is not specified. Assuming 1D-Y
WRF TILE 1 IS 1 IE 30 JS 1 JE 16
WRF NUMBER OF TILES = 1
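
For anyone reading the log excerpt above: the "WRF TILE" line reports the i and j index range of the tile handled by that task. A minimal Python sketch for turning such a line into a tile size is shown here; the helper name tile_extent is made up for illustration and is not part of WRF.

```python
import re

# Matches lines such as "WRF TILE 1 IS 1 IE 30 JS 1 JE 16",
# tolerating the variable spacing used in rsl files.
TILE_RE = re.compile(
    r"WRF TILE\s+(\d+)\s+IS\s+(\d+)\s+IE\s+(\d+)\s+JS\s+(\d+)\s+JE\s+(\d+)"
)

def tile_extent(line):
    """Return (points_in_i, points_in_j) for a WRF TILE line, or None."""
    match = TILE_RE.search(line)
    if match is None:
        return None
    _tile, i_start, i_end, j_start, j_end = map(int, match.groups())
    return (i_end - i_start + 1, j_end - j_start + 1)

# The line quoted from rsl.error.0000 in this post.
extent = tile_extent("WRF TILE 1 IS 1 IE 30 JS 1 JE 16")
print(extent)  # (30, 16)
```

This only reads what the log already states; it does not by itself say whether that tile size is the cause of the crash.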

I attach my namelist.input file, the rsl.error.0000 file and the log file from that job. I am running wrf.exe from an sbatch script, using 64 processors, as follows:
srun --ntasks=64 wrf.exe > wrf.log

Thanks a lot,

Alma
 

Attachments

  • serial_job_206559273.log (2.2 KB)
  • rsl.error.0000 (4.3 KB)
  • namelist.input (5.9 KB)

Please delete the following 2 lines from your namelist.input, then try again with a smaller number of processors, for example, run with 4 processors.

nproc_x = 8,
nproc_y = 8,

If it still doesn't work, please let me know how you compiled WRF and what your command is.
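
If editing by hand is inconvenient, the two lines can also be removed with a short script. The following is only a sketch in Python: the function name strip_decomposition and the sample namelist text are made up for illustration, and it assumes each setting sits on its own line, as in the lines quoted above.

```python
import re

# Lines that assign nproc_x or nproc_y (any spacing, any letter case).
NPROC_RE = re.compile(r"^\s*nproc_[xy]\s*=", re.IGNORECASE)

def strip_decomposition(namelist_text):
    """Return the namelist text without nproc_x / nproc_y lines."""
    kept = [
        line
        for line in namelist_text.splitlines(keepends=True)
        if not NPROC_RE.match(line)
    ]
    return "".join(kept)

# Hypothetical fragment, not the attached namelist.input.
sample = """&domains
 max_dom = 1,
 nproc_x = 8,
 nproc_y = 8,
/
"""

cleaned = strip_decomposition(sample)
print(cleaned)
```

To apply it to a real file, read namelist.input into a string, pass it through strip_decomposition, and write the result back after keeping a backup copy.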
 
Hi Ming,

I deleted the nproc_x and nproc_y lines and used 4 processors (srun --ntasks=4 wrf.exe). The job still stops after creating the files for the first time step. The output from that run is below:
starting wrf task 0 of 4
starting wrf task 1 of 4
starting wrf task 2 of 4
starting wrf task 3 of 4
srun: error: nid01351: tasks 0-3: Segmentation fault
srun: Terminating StepId=269302572.0

I also attach the rsl.error.0000 file and the compile.log file, which contains the output from compiling WRF. I used configure option 46 (dmpar, Cray) and set debug_level = 1000.
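
Since WRF writes one rsl.error.NNNN file per MPI task, it can help to compare the last lines of all of them rather than only rsl.error.0000. Below is a minimal Python sketch of that; the function name tail_of_each is made up, and the demo writes invented log contents into a temporary directory rather than using the attached files.

```python
import tempfile
from pathlib import Path

def tail_of_each(run_dir, n=3, pattern="rsl.error.*"):
    """Map each matching file name to its last n lines."""
    tails = {}
    for path in sorted(Path(run_dir).glob(pattern)):
        lines = path.read_text(errors="replace").splitlines()
        tails[path.name] = lines[-n:] if n > 0 else []
    return tails

# Demo on made-up contents for two tasks.
with tempfile.TemporaryDirectory() as run_dir:
    for rank in range(2):
        text = f"starting wrf task {rank} of 2\nWRF NUMBER OF TILES = 1\n"
        Path(run_dir, f"rsl.error.{rank:04d}").write_text(text)
    tails = tail_of_each(run_dir, n=1)

for name, lines in tails.items():
    print(name, "->", lines)
```

In a real run directory, calling tail_of_each(".") shows where each task stopped, which can reveal whether one task ended at a different point than the others.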
 

Attachments

  • compile.log (1.6 MB)
  • rsl.error.0000 (454.1 KB)