MyAtmosphere
New member
I am trying to run MPAS-A with GPUs on a system where each node has 2x 64 CPU cores, 4 A100 GPUs (80 GB each), and 512 GB of RAM.
The model was compiled with PGI and OpenACC enabled.
I request 2 nodes and set:
export MPAS_DYNAMICS_RANKS_PER_NODE=24
export MPAS_RADIATION_RANKS_PER_NODE=16
I run using the command
srun -n 80 ./atmosphere_model
I get an empty history file, then the model crashes. The error I'm getting is:
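For reference, the 80 in the command above comes from the per-node settings. A minimal sketch of the arithmetic, assuming the two per-node rank counts add up to the total ranks on each node (NODES is only an illustrative variable here, not an MPAS setting):

```shell
# Values taken from the settings above; NODES is a hypothetical helper variable.
NODES=2
MPAS_DYNAMICS_RANKS_PER_NODE=24
MPAS_RADIATION_RANKS_PER_NODE=16

# Assumption: ranks per node = dynamics ranks + radiation ranks.
RANKS_PER_NODE=$(( MPAS_DYNAMICS_RANKS_PER_NODE + MPAS_RADIATION_RANKS_PER_NODE ))
TOTAL_RANKS=$(( NODES * RANKS_PER_NODE ))

echo "ranks per node: ${RANKS_PER_NODE}"
echo "total MPI ranks: ${TOTAL_RANKS}"
```

So each node carries 40 ranks, and the failing task 53 would be on the second node if ranks are placed in blocks, which I have not verified.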
Role leader is 0
My role is 1
Role leader is 1
slurmstepd: error: Detected 1 oom-kill event(s) in StepId=730788.0. Some of your processes may have been killed by the cgroup out-of-memory handler.
srun: error: system0034: task 53: Out Of Memory
The log file says:
ERROR: MPAS IO Error: Bad return value from PIO
ERROR: ********************************************************************************
ERROR: Error writing one or more output streams
CRITICAL ERROR: ********************************************************************************
Does anyone know what might be causing this error? I'm running a 15 km case.