kunaldayal
Member
Hi,
I am running WRF v3.9.1.1 on an HPC for wind resource mapping over a tropical island in the SW Pacific, using three two-way nested domains with grid spacings of 20 km, 4 km, and 1 km.
When I try to simulate the whole group of islands, an area of 401 km x 401 km at 1 km grid resolution, the simulation runs for an hour or so and then crashes with a segmentation fault (core dumped). I am getting the following errors:
WRF rsl.error.0000:
"
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x2AAAAB10C6F7
#1 0x2AAAAB10CD3E
#2 0x2AAAAB74C26F
#3 0x1B66897 in taugb3.5950 at module_ra_rrtmg_lw.f90:?
#4 0x1B877C9 in __rrtmg_lw_taumol_MOD_taumol
#5 0x1B9F51B in __rrtmg_lw_rad_MOD_rrtmg_lw
#6 0x1BB2E7C in __module_ra_rrtmg_lw_MOD_rrtmg_lwrad
#7 0x16C5E49 in __module_radiation_driver_MOD_radiation_driver
#8 0x17B087C in __module_first_rk_step_part1_MOD_first_rk_step_part1
#9 0x11B1B79 in solve_em_
#10 0x1088EAA in solve_interface_
#11 0x47289A in __module_integrate_MOD_integrate
#12 0x4081C3 in __module_wrf_top_MOD_wrf_run
"
Slurm-9892195.out:
"
#!/bin/bash -e
#SBATCH --job-name=WRF5MPIJob # job name (shows up in the queue)
#SBATCH --account=uoa02450 # Nesi project
#SBATCH --time=72:00:00 # Walltime (HH:MM:SS)
#SBATCH --mem-per-cpu=6300 # memory/cpu in MB (half the actual required memory)
#SBATCH --partition=bigmem
#SBATCH --ntasks=72 # number of tasks (e.g. MPI)
#SBATCH --nodes=2
#SBATCH --hint=nomultithread # please try also without hyperthreading
#SBATCH --profile=all
cat $0
srun /scale_wlg_persistent/filesets/project/uoa02450/Build_WRF5/WRFV3/WRFV3/run/wrf.exe
### EOF
starting wrf task 40 of 72
starting wrf task 43 of 72
starting wrf task 47 of 72
starting wrf task 53 of 72
starting wrf task 54 of 72
starting wrf task 55 of 72
starting wrf task 59 of 72
starting wrf task 36 of 72
starting wrf task 38 of 72
starting wrf task 41 of 72
starting wrf task 42 of 72
starting wrf task 45 of 72
starting wrf task 46 of 72
starting wrf task 51 of 72
starting wrf task 56 of 72
starting wrf task 57 of 72
starting wrf task 60 of 72
starting wrf task 61 of 72
starting wrf task 63 of 72
starting wrf task 64 of 72
starting wrf task 65 of 72
starting wrf task 66 of 72
starting wrf task 67 of 72
starting wrf task 69 of 72
starting wrf task 70 of 72
starting wrf task 37 of 72
starting wrf task 39 of 72
starting wrf task 49 of 72
starting wrf task 50 of 72
starting wrf task 58 of 72
starting wrf task 68 of 72
starting wrf task 71 of 72
starting wrf task 52 of 72
starting wrf task 44 of 72
starting wrf task 48 of 72
starting wrf task 62 of 72
starting wrf task 6 of 72
starting wrf task 32 of 72
starting wrf task 7 of 72
starting wrf task 26 of 72
starting wrf task 16 of 72
starting wrf task 11 of 72
starting wrf task 20 of 72
starting wrf task 19 of 72
starting wrf task 29 of 72
starting wrf task 10 of 72
starting wrf task 21 of 72
starting wrf task 4 of 72
starting wrf task 23 of 72
starting wrf task 2 of 72
starting wrf task 17 of 72
starting wrf task 3 of 72
starting wrf task 18 of 72
starting wrf task 1 of 72
starting wrf task 27 of 72
starting wrf task 5 of 72
starting wrf task 33 of 72
starting wrf task 8 of 72
starting wrf task 24 of 72
starting wrf task 12 of 72
starting wrf task 25 of 72
starting wrf task 30 of 72
starting wrf task 31 of 72
starting wrf task 9 of 72
starting wrf task 22 of 72
starting wrf task 34 of 72
starting wrf task 0 of 72
starting wrf task 13 of 72
starting wrf task 15 of 72
starting wrf task 28 of 72
starting wrf task 14 of 72
starting wrf task 35 of 72
srun: error: wbl008: tasks 36-38,40-41,43-65,67-71: Segmentation fault (core dumped)
srun: error: wbl004: tasks 0-35: Segmentation fault (core dumped)
srun: error: wbl008: task 66: Segmentation fault (core dumped)
srun: error: wbl008: tasks 39,42: Segmentation fault (core dumped)
"
I have tried increasing the memory per CPU from 1500 MB to 2000 MB, 3000 MB, and 6300 MB, but the problem remains.
Could this be related to parent_time_step_ratio in namelist.input? My time steps are 120 s, 40 s, and 8 s (ratios 1:3:5).
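For reference, the arithmetic behind those time steps can be sketched in a few lines of Python. The grid spacings, base time step, and ratios are the ones quoted above; the 6 x dx (km) upper bound is the rule of thumb commonly quoted for WRF time steps and is an assumption here, not something taken from my namelist. The helper name `nest_time_steps` is made up for illustration. Exceeding the guideline does not by itself prove that the time step causes the crash; it only shows which domains sit above it.

```python
# Values quoted in the post above.
dx_km = [20.0, 4.0, 1.0]            # grid spacing of d01, d02, d03
time_step_s = 120.0                 # time step of the outermost domain
parent_time_step_ratio = [1, 3, 5]  # each nest relative to its parent

def nest_time_steps(base_dt, ratios):
    """Return the time step of each domain by dividing successively by the ratios."""
    steps = []
    dt = base_dt
    for ratio in ratios:
        dt = dt / ratio
        steps.append(dt)
    return steps

dts = nest_time_steps(time_step_s, parent_time_step_ratio)

# Assumed guideline: time step (s) should not exceed about 6 * dx (km).
for i, (dx, dt) in enumerate(zip(dx_km, dts), start=1):
    limit = 6.0 * dx
    status = "within" if dt <= limit else "above"
    print(f"d{i:02d}: dx = {dx:g} km, dt = {dt:g} s, 6*dx = {limit:g} s ({status} guideline)")
```

With these numbers the inner two domains (40 s at 4 km, 8 s at 1 km) come out above the assumed 6 x dx bound, while the outer domain sits exactly on it.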
Initially, I ran simulations for a single island covering 201 km x 201 km at 1 km grid resolution, and those worked fine.
Appreciate your kind assistance and advice.
Regards
Kunal