Issue in creating init and lbc files from some regional meshes

pouwereou Nimon

New member
Hello everyone,
I’m encountering an issue when trying to create the initial and boundary condition files with init_atmosphere_model for a larger regional MPAS-A uniform mesh.
I am running MPAS-A version 8.2.2 on our cluster.
  • I successfully ran both the initial condition and model integration steps for the global 240 km resolution mesh.
  • I also successfully ran the initial condition and model integration steps once for a 12 km regional uniform mesh covering a relatively small area of West Africa (a few countries), without any problem.
  • However, after expanding the mesh domain to cover a larger portion of West Africa (all the countries), the initialization process hangs, and several files named
    waf.static.nc.locktest.0
    waf.static.nc.locktest.1
    ...
    appear in the run directory.
  • Then, when I go back and re-run the 12 km regional uniform mesh covering the smaller area, the same locktest files now appear there as well. In other words, sometimes it runs and sometimes it does not.
    This occurs both when running in parallel and in serial.
    The log file (log.init_atmosphere.0000.out) stops with the following lines:
    ----- I/O task configuration: -----

    I/O task count = 32
    I/O task stride = 1
    Initializing MPAS_streamInfo from file streams.init_atmosphere
    Reading streams configuration from file streams.init_atmosphere
    Found mesh stream with filename template waf.static.nc
    Using io_type Serial NetCDF for mesh stream
    ** Attempting to bootstrap MPAS framework using stream: input
After this point, nothing happens: the model hangs indefinitely and no init.nc file is produced.

I created waf.static.nc from the "Download the 12-km static file (2447 MB)" dataset using the create_region script. I am attaching the namelist.init_atmosphere and streams.init_atmosphere files I used.

Has anyone experienced this issue when using pnetcdf/cdf5 I/O on large regional meshes?
Could it be related to parallel I/O settings, filesystem locking on Lustre, or mesh partitioning?
Any suggestions or guidance would be greatly appreciated.
Thanks in advance,

Pouwereou
 

Attachments

  • namelist.init_atmosphere.txt
  • streams.init_atmosphere.txt
I notice that in your namelist.init_atmosphere you use GFS as input (config_met_prefix = 'GFS'), but you set config_nfglevels = 38. GFS data has 34 vertical levels, while ERA5 has 38 vertical levels. Please double-check that you have the correct settings for this case.
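For reference, a hypothetical excerpt of the &dimensions group in namelist.init_atmosphere showing the setting in question; the other values here are illustrative defaults, not taken from the attached file, and should match your own setup:

```fortran
&dimensions
    config_nvertlevels   = 55
    config_nsoillevels   = 4
    config_nfglevels     = 34   ! must match the first-guess data: 34 for GFS, 38 for ERA5
    config_nfgsoillevels = 4
/
```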

Also, can you set io_type="pnetcdf,cdf5" for your input and output streams?
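As a sketch of where that setting goes, the io_type attribute can be added to the stream definitions in streams.init_atmosphere. The stream names below follow the default file shipped with MPAS; the filename templates are placeholders matching your case and should be adjusted to your actual files:

```xml
<immutable_stream name="input"
                  type="input"
                  io_type="pnetcdf,cdf5"
                  filename_template="waf.static.nc"
                  input_interval="initial_only" />

<immutable_stream name="output"
                  type="output"
                  io_type="pnetcdf,cdf5"
                  filename_template="waf.init.nc"
                  packages="initial_conds"
                  output_interval="initial_only" />
```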

Finally, how many processors did you use to run the large-domain case? I suspect the issue is related to insufficient memory. However, I have no explanation for why you cannot repeat the previously successful small-domain run. Can you double-check that everything is the same as when the small-domain run succeeded?

Please keep us updated about this case.
 
Hi,
Thank you for your reply. I hadn't noticed that before.
I will take your remarks into account. For the large-domain case (nCells = 233,830), I used 64 processors. I’m still working on it, and I’ll keep you updated.
 