atmosphere_model issue with pnetcdf: Bad return value from PIO

This post is from a previous version of the WRF & MPAS-A Support Forum. New replies have been disabled; if you have follow-up questions related to this post, please start a new thread from the forum home page.



I am trying to run a regional MPAS case. When I specify io_type='netcdf' for all of the inputs/outputs in the MPAS-init and MPAS-model streams* files, everything works fine.

Now I'm testing with io_type='netcdf' for all of the MPAS-init portion, since init_atmosphere_model must be run serially, and with io_type='pnetcdf' in the streams.atmosphere_model file for the different portions of the MPAS-model run. The history files are only about 7 MB in size, so I don't think I need the 'pnetcdf,cdf5' format. When I run 'mpiexec -np 6 atmosphere_model', the specific error I see in the last part of the log*out file is:

--- initialize NOAH LSM tables
skipping over lutype = USGS
landuse type = MODIFIED_IGBP_MODIS_NOAH found 20 categories
end read VEGPARM.TBL
input soil texture classification = STAS
soil texture classification = STAS found 19 categories
end read GENPARM.TBL
--- end initialize NOAH LSM tables

min/max of meshScalingDel2 = 1.00000000000000 1.18920711500272
min/max of meshScalingDel4 = 1.00000000000000 1.68179283050743
ERROR: MPAS IO Error: Bad return value from PIO

For reference, the relevant portions of my streams.atmosphere_model file are:
<stream name="output"
        output_interval="3:00:00" >
    <file name="stream_list.atmosphere.output"/>
</stream>

<stream name="diagnostics"
        output_interval="3:00:00" >
    <file name="stream_list.atmosphere.diagnostics"/>
</stream>
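For comparison, here is a sketch of how io_type can be set explicitly on a stream. The io_type attribute accepts "pnetcdf", "pnetcdf,cdf5", "netcdf", and "netcdf4"; the placement below is illustrative and not a copy of my exact file:

```xml
<stream name="output"
        io_type="pnetcdf"
        output_interval="3:00:00" >
    <file name="stream_list.atmosphere.output"/>
</stream>
```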

Please let me know what a starting point would be to solve this. I am using pio-2.5.4.
Since the model itself (atmosphere_model) runs successfully with io_type="netcdf" but not with io_type="pnetcdf" specified for the output streams, perhaps the problem is specific to Parallel-NetCDF support in PIO. Are any partial history or diagnostics files created? What 'cmake' command was used to build PIO (or, alternatively, is it certain that PIO was compiled with Parallel-NetCDF support)?
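One quick way to confirm whether PIO was built with Parallel-NetCDF support is to inspect the libpio.settings file that the PIO build generates. The sketch below fabricates a sample libpio.settings (hypothetical contents) purely to show what to grep for; in practice, point grep at the file in your actual PIO build or install directory:

```shell
# Create a sample libpio.settings to illustrate the check
# (contents are hypothetical; use the file from your own PIO build).
cat > libpio.settings <<'EOF'
PIO Version:            2.5.4
NetCDF Support:         yes
PNETCDF Support:        no
EOF

# The line to look for; "no" here means PIO cannot service io_type="pnetcdf".
grep 'PNETCDF Support' libpio.settings
```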
Thanks for that fast reply. I checked my PIO build directory and, indeed, the libpio.settings file showed "PNETCDF Support: no". I set the PNETCDF_DIR and NETCDF_DIR environment variables as prescribed online, but still did not see PNETCDF support in libpio.settings. I then reran my cmake command, "CC=mpicc FC=mpif90 cmake -DPIO_ENABLE_TIMING=OFF /home/skirby/Downloads/ParallelIO-pio2_5_4", and checked the output. It said I needed to define PnetCDF_C_INCLUDE_DIR and PnetCDF_C_LIBRARY, which I did in CMakeCache.txt. After that, libpio.settings showed PNETCDF Support was on.

At this point, I recompiled atmosphere_model and restarted my regional simulation: a 24-h run with output every 3 h. All went well for the first two outputs of hist* and diag* files, at 1200 GMT and 1500 GMT. But at 1800 GMT, while I did get a complete hist* file, no diag* file was output, and atmosphere_model abruptly died with:

ERROR: MPAS IO Error: Bad return value from PIO
ERROR: Error writing one or more output streams

For reference, these are the file sizes at each history* and diag* file output time:
7291804 Dec 15 11:39  (history*)
 316768 Dec 15 11:40  (diag*)
Upon further testing, the model is completing just fine. In an earlier test, it failed when overwriting existing history* and diag* files, even though I had clobber_mode set to overwrite. I deleted the hist* and diag* files before the next run, and of course the model finished fine. But then I ran the model again with the hist* and diag* files still resident, and it overwrote them without trouble. I'm not sure what the issue was, but to sum up, all is running fine now. Thanks for pointing out the need to verify that PIO had Parallel-NetCDF support; that was the key.
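For anyone hitting the same problem, the PIO reconfiguration described above might look like the following. This is a sketch: the PnetCDF paths are assumptions and must point at your own install, and the variables can be passed on the cmake command line instead of edited into CMakeCache.txt:

```shell
# Reconfigure and rebuild PIO with Parallel-NetCDF support
# (PnetCDF paths below are placeholders for your own install).
CC=mpicc FC=mpif90 cmake \
    -DPIO_ENABLE_TIMING=OFF \
    -DPnetCDF_C_INCLUDE_DIR=/usr/local/pnetcdf/include \
    -DPnetCDF_C_LIBRARY=/usr/local/pnetcdf/lib/libpnetcdf.a \
    /home/skirby/Downloads/ParallelIO-pio2_5_4
make && make install
# Then confirm libpio.settings now reports "PNETCDF Support: yes"
# before recompiling atmosphere_model against the new PIO.
```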
Thanks very much for following up. I had been wondering whether this might be a "clobber_mode" issue, but thought that I should see if any other ideas came to mind before posting here.

If you're writing a single time period to each output file, an alternative to clobber_mode="overwrite" would be clobber_mode="replace_files", which might be more reliable in that it completely overwrites existing files, rather than attempting to overwrite records in existing files. This is something of a subtle distinction for files with only a single time period.
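A sketch of how that alternative might look in the streams file (the stream name and interval mirror the ones quoted earlier in this thread; treat this as illustrative, not a verified fix):

```xml
<stream name="diagnostics"
        clobber_mode="replace_files"
        output_interval="3:00:00" >
    <file name="stream_list.atmosphere.diagnostics"/>
</stream>
```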