Failure in model integration of GPU-enabled MPAS

Questions about and discussion of the GPU-enabled MPAS-Atmosphere branch.

Post by louistse0305 » Wed May 26, 2021 9:13 am

Hi everyone,

I compiled the GPU-enabled MPAS following the documentation at ... index.html,
using OpenMPI v3.1.3 and PGI Compiler 19.10.

Compilation went fine, and the static and init steps also ran without problems, but a segmentation fault occurs during model integration (see attached files).
From the log files, it seems that the model crashed at the very beginning, right after the radiation call. I suspect that the tendencies produced by radiation could not be transferred to the GPUs in the first dynamics time step.
In addition, I am running on a Supermicro SuperServer 1029GQ-TVRT with 4 Tesla V100 GPUs: ... Q-TVRT.cfm
I am wondering whether this is a software problem (say, an improper installation of MPI or other libraries) or a hardware problem (say, the two CPUs on the server being unable to communicate with the 4 GPUs).

Here are the commands for running the case:

Code: Select all

gpmetis -minconn -contig -niter=200 ${MPAS_DYNAMICS_RANKS_PER_NODE}
gpmetis -minconn -contig -niter=200 ${MPAS_RADIATION_RANKS_PER_NODE}
mpirun -np 40 ./atmosphere_model &
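
A quick sanity check on the rank split used above can be sketched as follows. This is only a minimal sketch, assuming a single node and assuming that the dynamics and radiation ranks per node are meant to add up to the total passed to mpirun -np; check_rank_split is a hypothetical helper, not part of MPAS.

```shell
# Hypothetical helper (not part of MPAS): checks that the dynamics and
# radiation rank counts add up to the total number of MPI ranks.
# Assumption: single node, and the two counts should sum to the -np value.
check_rank_split() {
    dyn=$1     # dynamics ranks per node
    rad=$2     # radiation ranks per node
    total=$3   # total ranks passed to mpirun -np
    if [ $((dyn + rad)) -eq "$total" ]; then
        echo "ok: $dyn + $rad = $total"
    else
        echo "mismatch: $dyn + $rad != $total"
        return 1
    fi
}

# Example call with the variables from the commands above:
# check_rank_split "$MPAS_DYNAMICS_RANKS_PER_NODE" "$MPAS_RADIATION_RANKS_PER_NODE" 40
```

If the check reports a mismatch, the partition files produced by gpmetis would not correspond to the ranks actually launched, which may be worth ruling out before suspecting the hardware.
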
Any comments are welcome. Thank you!
