kosakaguchi
Thanks to the information and comments shared in a recent post, I was able to compile MPAS-Atmosphere v8.2.0 and v8.2.1 with the OpenACC option on the NERSC Perlmutter system.
However, in a test run using the 240km mesh, I quickly get the following error:
Code:
Accelerator Fatal Error: call to cuMemcpyHtoDAsync returned error 700: Illegal address during kernel execution
File: /global/cfs/cdirs/wcm_code/MPAS-Atmosphere/models/ksa/MPAS-Model-v8.2.1/src/core_atmosphere/dynamics/mpas_atm_time_integration.F
Function: atm_advance_scalars_work:3049
Line: 3240
The same 240km simulation runs fine with the CPU version of the model executable (compiled without the OpenACC option).
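In case it helps with pinpointing which array the failing transfer belongs to, this is roughly how I try to get more detail out of the OpenACC runtime; the notification level and the sanitizer wrapper are generic NVHPC/CUDA tools rather than anything MPAS-specific, so the exact output may vary:
Code:
# ask the NVHPC OpenACC runtime to report kernel launches and host<->device transfers
export NV_ACC_NOTIFY=3
# optionally wrap the executable in the CUDA memory checker to localize the illegal address
srun -n 4 -c 32 --cpu_bind=cores -G 4 --gpu-bind=none \
    compute-sanitizer --tool memcheck ./atmosphere_model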
The relevant source code section looks like this:
Code:
#ifndef DO_PHYSICS
!$acc enter data create(scalar_tend_save)
#else
!$acc enter data copyin(scalar_tend_save)
#endif
Line 3240 is the fourth line:
Code:
!$acc enter data copyin(scalar_tend_save)
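The -Minfo=accel messages from the build can show what data movement the compiler generated for scalar_tend_save in this routine. A quick way to check, assuming the make output was saved to a file (build.log is just the name I use; the build command itself is shown further below):
Code:
# compiler feedback (-Minfo=accel) lines that mention this routine and this array
grep -n "atm_advance_scalars_work" build.log
grep -n "scalar_tend_save" build.log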
I am testing the MPAS executable compiled with the debug option. To compile, I copied the entry for "nvhpc" in the top-level Makefile to create the following target:
Code:
nvhpc-pm-gpu: # BUILDTARGET Nvidia compilers on NERSC Perlmutter GPU node following nvhpc
( $(MAKE) all \
"FC_PARALLEL = ftn" \
"CC_PARALLEL = cc" \
"CXX_PARALLEL = CC" \
"FC_SERIAL = nvfortran" \
"CC_SERIAL = nvc" \
"CXX_SERIAL = nvc++" \
"FFLAGS_PROMOTION = -r8" \
"FFLAGS_OPT = -gopt -O4 -byteswapio -Mfree" \
"CFLAGS_OPT = -gopt -O3" \
"CXXFLAGS_OPT = -gopt -O3" \
"LDFLAGS_OPT = -gopt -O3" \
"FFLAGS_DEBUG = -O0 -g -Mbounds -Mchkptr -byteswapio -Mfree -Ktrap=divz,fp,inv,ovf -traceback" \
"CFLAGS_DEBUG = -O0 -g -traceback" \
"CXXFLAGS_DEBUG = -O0 -g -traceback" \
"LDFLAGS_DEBUG = -O0 -g -Mbounds -Ktrap=divz,fp,inv,ovf -traceback" \
"FFLAGS_OMP = -mp" \
"CFLAGS_OMP = -mp" \
"FFLAGS_ACC = -Mnofma -acc -gpu=cc70,cc80 -Minfo=accel" \
"CFLAGS_ACC =" \
"PICFLAG = -fpic" \
"BUILD_TARGET = $(@)" \
"CORE = $(CORE)" \
"DEBUG = $(DEBUG)" \
"USE_PAPI = $(USE_PAPI)" \
"OPENMP = $(OPENMP)" \
"OPENACC = $(OPENACC)" \
"CPPFLAGS = $(MODEL_FORMULATION) -D_MPI -DCPRPGI" )
I also loaded the following modules before building:
Code:
module load PrgEnv-nvidia/8.5.0
module load gpu/1.0
module load cray-hdf5/1.12.2.3
module load cray-netcdf/4.9.0.9
module load cray-parallel-netcdf/1.12.3.9
module load cmake/3.24.3
This results in the following modules at compile and run time:
Code:
 1) craype-x86-milan
 2) libfabric/1.15.2.0
 3) craype-network-ofi
 4) xpmem/2.6.2-2.5_2.38__gd067c3f.shasta
 5) perftools-base/23.12.0
 6) cpe/23.12
 7) conda/Miniconda3-py311_23.11.0-2
 8) evp-patch
 9) python/3.11 (dev)
10) nvidia/23.9 (g,c)
11) craype/2.7.30 (c)
12) cray-dsmml/0.2.2
13) cray-mpich/8.1.28 (mpi)
14) cray-libsci/23.12.5 (math)
15) PrgEnv-nvidia/8.5.0 (cpe)
16) cudatoolkit/12.2 (g)
17) craype-accel-nvidia80
18) gpu/1.0
19) cray-hdf5/1.12.2.3 (io)
20) cray-netcdf/4.9.0.9 (io)
21) cray-parallel-netcdf/1.12.3.9 (io)
22) cmake/3.24.3 (buildtools)
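With these modules loaded, the build itself is invoked roughly as follows (a sketch; CORE, OPENACC, and DEBUG are the standard MPAS build variables, and the tee into build.log is only there to keep the compiler feedback for later inspection):
Code:
# clean any previous build, then build the atmosphere core with OpenACC and the debug flags
make clean CORE=atmosphere
make nvhpc-pm-gpu CORE=atmosphere OPENACC=true DEBUG=true 2>&1 | tee build.log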
For the batch script, I followed examples from the NERSC documentation and their online batch script generator:
Code:
#SBATCH -q debug
#SBATCH -t 00:30:00
...
#SBATCH -C gpu&hbm40g
#SBATCH -G 4
export SLURM_CPU_BIND="cores"
srun -n 4 -c 32 --cpu_bind=cores -G 4 --gpu-bind=none ./atmosphere_model
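For reference, the NERSC documentation also shows a per-rank GPU binding wrapper as an alternative to --gpu-bind=none; below is my rough sketch of that pattern (select_gpu.sh is just a placeholder name, and I have not confirmed whether it changes the behavior here):
Code:
#!/bin/bash
# select_gpu.sh: give each local MPI rank its own GPU instead of exposing all four
export CUDA_VISIBLE_DEVICES=$SLURM_LOCALID
exec "$@"

# usage, replacing --gpu-bind=none in the srun line above:
#   srun -n 4 -c 32 --cpu_bind=cores -G 4 ./select_gpu.sh ./atmosphere_model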
I'd appreciate any suggestions for solving the problem.
Best regards,
Koichi