
Segmentation Fault with SAPRC99_MOSAIC_8BIN_VBS2_AQ_KPP chemistry option (chem_opt=203)

zhaneden

Hello forum,

I am having a problem running WRF-Chem version 4.6.1 on the Perlmutter machine. I have contacted NERSC support and they are helping us, but I would appreciate it if anyone here has a solution or relevant experience to share.

ERROR in <rsl.error.0000>:
MPICH ERROR [Rank 0] [job id 36469101.0] [Mon Mar 3 23:58:18 2025] [nid004172] - Abort(1) (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0

aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0

The two sbatch configurations I tried:
#SBATCH -N 2
srun -n 64 -c 8 --cpu_bind=cores ${wrfexe}
export OMP_NUM_THREADS= 4

#SBATCH -N 1
srun -n 64 -c 4 --cpu_bind=cores ${wrfexe}
export OMP_NUM_THREADS= 2

Neither of them worked.
ERROR in <slurm.out>:
libgomp: Invalid value for environment variable OMP_NUM_THREADS:
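
A note on the libgomp message: it is probably explained by the space after `=` in `export OMP_NUM_THREADS= 4`. The shell reads that as "set OMP_NUM_THREADS to an empty string, then export a second name called 4", so the OpenMP runtime sees an empty value. The snippet below is a minimal sketch of how the two spellings are parsed; `broken_value` and `fixed_value` are illustrative names only, and this addresses only the libgomp message, which is not necessarily the cause of the MPI_Abort.

```shell
#!/bin/sh
# Sketch: compare how the shell parses the two spellings.
# broken_value / fixed_value are hypothetical names used only for this demo.

# With a space after "=", OMP_NUM_THREADS is assigned an empty string and "4"
# is treated as a separate (invalid) variable name. The command is run in a
# subshell so the resulting shell error cannot stop this script.
broken_value=$(export OMP_NUM_THREADS= 4 2>/dev/null; printf '%s' "${OMP_NUM_THREADS}")

# Without the space, the variable receives the intended value.
export OMP_NUM_THREADS=4
fixed_value="${OMP_NUM_THREADS}"

echo "with space: '${broken_value}'   without space: '${fixed_value}'"

# In the batch script itself, the export would also need to come before srun
# for the launched tasks to inherit it (assuming the lines run in the order
# they are listed in the post):
#   export OMP_NUM_THREADS=4
#   srun -n 64 -c 8 --cpu_bind=cores ${wrfexe}
```

If the libgomp message disappears after this change but the abort remains, the two problems are likely unrelated.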


We ran several tests on other machines with the same settings and they all worked properly. We also tested WRF without chemistry on Perlmutter and it works well. With chemistry, it runs properly with chem_opt=195 but not with 198 or 203, and I need to run with chem_opt=203. I am wondering what could be causing this error.
I have also attached my namelist and rsl.error files.

Thanks for your help.
 

Attachments

  • test2.tar
    34.5 KB
Just an update: the segmentation fault no longer occurs, but now the problem is CFL errors. The only way to make the run work is to reduce my time step from 45 s to around 9 s, which costs too much computer time (when I request 30 minutes, the model only runs 5 minutes).

I tried several tests:
  1. Turned off chemistry and did a meteorology-only run.
  2. Tried chem_opt=202 and chem_opt=201.

These all worked well with a time step of 45 s or 30 s (running for 2 hours, or even finishing 1 day). Once I changed to chem_opt=203 (SAPRC99 + MOSAIC 8 bins, aq), I got CFL errors unless I reduced my time step to a very small value.
 