
MPAS-A: init_atmosphere_model stops running but doesn't show any errors

gpriftis

New member
I am new to running MPAS-A. I successfully ran idealized simulations of waves over a mountain and am now attempting to run the model on a global 60-km mesh, using the x1.40962.graph.info.part.64 partition file.
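
For reference, I'm launching the job with 64 MPI tasks to match the partition file; the exact launcher command is system-dependent, so this is just a sketch of my invocation:

    # Launch init_atmosphere_model with 64 tasks, matching x1.40962.graph.info.part.64
    mpiexec -n 64 ./init_atmosphere_model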

When running init_atmosphere_model, the last message written to the log is:
"Number of sea ice cells converted to land cells = 5"

After this, the model keeps running indefinitely without writing anything further to the log file and without failing. It does generate the x1.40962.init.nc file, but its size is only 280 MB.
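
As a sanity check on that file (assuming the NetCDF utilities are on my path), I can at least inspect the header to see which fields and dimensions were defined before the write stalled:

    # Dump just the header of the generated initial-conditions file
    # to check which variables and dimensions it actually contains
    ncdump -h x1.40962.init.nc | head -n 40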

Any thoughts or suggestions on what might be causing this issue would be great! I have attached the namelists and log files for reference.

Thank you!
 

Attachments

  • mpas_files_init.zip (3.1 KB)
Hello gpriftis, I'll take a look at this. Could you also attach that log.init_atmosphere.0000.out file and a single log.init_atmosphere.####.err file (if there is one with an error)?
 
Yes, of course, thank you. I have attached only the log.init_atmosphere.0000.out file, since the run did not generate a log.init_atmosphere.####.err file.
 

Attachments

  • log.init_atmosphere.0000.out.txt (13.1 KB)
Thanks for the files!

The last print in your log.init_atmosphere.0000.out.txt comes from the physics_init_seaice routine. As near as I can tell, that should be one of the last prints for config_init_case = 7 before the output is written to the output file. Since your file is smaller than expected, it's likely your job died while writing to the file.

Could you try re-building the init_atmosphere core with DEBUG=true and re-running the job? Please send back the log.init_atmosphere* files along with the same files you attached in your first message.
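
If it helps, the rebuild would look something like the following; I'm assuming a gfortran build target here, so substitute whichever target you used originally:

    # Clean the init_atmosphere core, then rebuild it with debugging enabled
    make clean CORE=init_atmosphere
    make gfortran CORE=init_atmosphere DEBUG=true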

This debug run will either give us enough information in those log files, or produce some sort of "core dump" file that you can examine to figure out what caused the MPI_Alltoall to fail.
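
If a core file does appear, something like the following should print a backtrace from it (assuming gdb is available; core dumps may also need to be enabled in your shell before the run):

    # Allow core files to be written (run this before re-submitting the job)
    ulimit -c unlimited
    # After the crash, print a backtrace from the core file non-interactively
    gdb -batch -ex bt ./init_atmosphere_model core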
 