I keep getting a strange segmentation fault when generating BCs with MPAS V8.3.1. I have tried increasing the number of MPI processes (as seen in the attached .sh file) as well as undersubscribing the nodes, but it still happens. It also occurs at a random point in the run: judging by log.init_atmosphere.0000.out, sometimes it stops where the attached example does, and sometimes it is able to output a few LBC files before crashing.
Note: I am using a regional 60-3 km mesh. I ran "gpmetis -minconn -contig -niter=200 ${name}.graph.info 256" on the mesh's graph.info file to get a .graph.part.256 file. I then used the new mesh_scaling tool "scale_region" to scale the mesh by a factor of 3, down to a 20-1 km mesh.
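One sanity check that may be worth running on the partition file before digging further is sketched below. It assumes the usual METIS output layout (one integer partition ID per line, one line per cell) and that the first number on the first line of graph.info is the cell count. The function name summarize_partition and its return keys are my own, not part of MPAS or METIS.

```python
from collections import Counter
from pathlib import Path


def summarize_partition(part_path, expected_parts, graph_info_path=None):
    """Summarize a METIS partition file and list any inconsistencies found."""
    # Assumed layout: one integer partition ID per line, one line per cell.
    ids = [int(tok) for tok in Path(part_path).read_text().split()]
    counts = Counter(ids)
    problems = []

    # IDs should cover exactly 0 .. expected_parts-1, with no part left empty.
    wanted = set(range(expected_parts))
    missing = sorted(wanted - set(counts))
    extra = sorted(set(counts) - wanted)
    if missing or extra:
        problems.append(
            f"partition IDs do not match 0..{expected_parts - 1} "
            f"({len(missing)} missing, {len(extra)} out of range)"
        )

    # Optionally compare the number of lines with the cell count in graph.info
    # (assumed to be the first number on its first line).
    if graph_info_path is not None:
        with open(graph_info_path) as f:
            n_cells = int(f.readline().split()[0])
        if n_cells != len(ids):
            problems.append(
                f"graph.info lists {n_cells} cells, partition file has {len(ids)}"
            )

    return {
        "n_cells": len(ids),
        "n_parts": len(counts),
        "min_cells": min(counts.values()) if counts else 0,
        "max_cells": max(counts.values()) if counts else 0,
        "problems": problems,
    }
```

For the setup above it would be called with the .graph.part.256 file and expected_parts=256, where expected_parts should equal the number of MPI tasks actually launched. An empty "problems" list only rules out a mismatched or malformed partition file; it does not by itself explain the segmentation fault.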
Checking memory:
qhist -j 3277247
==>
Job ID        User        Queue     Nodes  NCPUs   NGPUs  End      Mem       CPU     Elap
------------  ----------  --------  -----  ------  -----  -------  --------  ------  ------
3277247       aroseman    cpu       4      512     0      01-1644  66.87     37.37   0.01
What should I try to figure out the cause and avoid the error?