bartbrashers
I can run ungrib, ghrsst-to-intermediate (writing SST:* files a bit bigger than my WRF domain), and avg_tsfc, just fine.
When I get to the metgrid stage, it gets through processing d01 (for a 5.5-day run I have 23 files), then writes 15 met_em.d02.* files, then stops with this message in both metgrid.out and metgrid.log:
ERROR: get_min(): No items left in the heap.
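To pin down which time period metgrid was working on when it stopped, a rough sketch like the following lists the met_em file names expected for a domain. It assumes 6-hourly output, which is consistent with 23 files over 5.5 days (132 / 6 + 1 = 23), and the default netCDF naming. The start date and the helper name `expected_met_em` are hypothetical, chosen only for illustration. With 15 d02 files written, the 16th entry would be the period that failed.

```python
from datetime import datetime, timedelta

def expected_met_em(start, run_hours, interval_hours, domain):
    """List the met_em file names metgrid should write for one domain."""
    # Both the first and last time are written, hence the +1.
    n_files = int(run_hours // interval_hours) + 1
    names = []
    for i in range(n_files):
        stamp = start + timedelta(hours=i * interval_hours)
        names.append("met_em.d%02d.%s.nc" % (domain, stamp.strftime("%Y-%m-%d_%H:%M:%S")))
    return names

# Hypothetical start date; 5.5 days = 132 hours; the 6-hour interval is an assumption.
files = expected_met_em(datetime(2019, 1, 1, 0), 132, 6, 2)
print(len(files))   # number of files expected for d02
print(files[15])    # the 16th file, i.e. the first one that was not written
```

Comparing this list against the files actually on disk shows which time stamp to inspect in the corresponding SST:* intermediate file.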
I'm not finding many clues using Google.
I'm using the 2019 version of PGI-CE:
# pgf90 -V
pgf90 19.10-0 64-bit target on x86-64 Linux -tp piledriver
PGI Compilers and Tools
Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
I've attached my configure.wrf and configure.wps.
The compute nodes I'm running on have MemTotal: 65932648 kB according to /proc/meminfo. I was running 8 copies (8 consecutive 5.5-day periods) at the time, but the failure happens at the same stage regardless of how many other copies of metgrid are running at once.
I had accidentally set $OPAL_PREFIX to the PGI-supplied version, not the openmpi-3.1.2 version, when I compiled. But that shouldn't matter for metgrid, right? That's only a run-time thing.
Any ideas?