Integration step takes too long

dashline

When I run MPAS at global 120-km or 48-km resolution, each integration step takes far too long, which is not consistent with the timings I have seen reported elsewhere on the forums. Could anyone advise on the likely cause? I have attached the log file from my 48-km run: on 288 cores, each integration step currently takes over 80 seconds.
The CPUs I am using are 2 × Intel(R) Xeon(R) Gold 6240 @ 2.60GHz, 18 cores each.
 

Attachments

  • 48km_log.atmosphere.0000.out.txt
    116.8 KB · Views: 3
  • 120km_log.atmosphere.0000.out.txt
    226.4 KB · Views: 3
Just for reference, I've attached log files for 48-km and 120-km simulations on NCAR's Cheyenne system using 288 MPI tasks and 128 MPI tasks, respectively. I'd definitely agree that the timing you're seeing for each timestep is much too long. Is it possible that there's some problem in launching MPI jobs across nodes, or is it possible that multiple MPI ranks on the same node are being pinned to the same hardware cores?
 

Attachments

  • log.48km_288_cheyenne.txt
    219.2 KB · Views: 1
  • log.120km_128_cheyenne.txt
    99.8 KB · Views: 1
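One way to look into the pinning question above is to have each rank report the cores it is allowed to run on and then check for overlap. The following is only a minimal sketch: `describe_this_process` and `find_shared_cores` are hypothetical helper names, `os.sched_getaffinity` is Linux-only, and the rank labels and core IDs in the example are made up for illustration.

```python
import os
import socket
from collections import defaultdict


def describe_this_process():
    """Report host, PID, and allowed cores for the current process (Linux only)."""
    cores = sorted(os.sched_getaffinity(0))
    return f"{socket.gethostname()} pid={os.getpid()} cores={cores}"


def find_shared_cores(affinities):
    """Return {core: [ranks]} for every core that more than one rank may use.

    `affinities` maps a rank label to the set of core IDs it is allowed to
    run on, all gathered from the same node.
    """
    ranks_by_core = defaultdict(list)
    for rank, cores in affinities.items():
        for core in cores:
            ranks_by_core[core].append(rank)
    return {core: sorted(ranks)
            for core, ranks in ranks_by_core.items()
            if len(ranks) > 1}


if __name__ == "__main__":
    # Launch this script with the same MPI launcher and options as the model
    # so that every rank prints its own line.
    if hasattr(os, "sched_getaffinity"):
        print(describe_this_process())

    # Illustrative values only: ranks 0 and 1 are both bound to core 0.
    print(find_shared_cores({0: {0}, 1: {0}, 2: {1}}))
```

If the ranks are not bound at all, every rank reports every core and the overlap check flags all of them, which does not by itself indicate a problem. The check is only meaningful when each rank is bound to a small set of cores.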
It might be illuminating to start by running a short 120-km simulation with a single MPI task, then scaling to 2, 4, 8, etc. MPI tasks to find the point at which scaling breaks down (or whether the integration time decreases at all as the MPI task count increases).

In case it helps in verifying that your timing with 1 MPI task is reasonable, I've attached another log file from Cheyenne using a single MPI task for a 3-hour 120-km simulation.
 

Attachments

  • log.120km_1_cheyenne.txt
    26.7 KB · Views: 0
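The scaling test suggested above can be summarized by computing speedup and parallel efficiency from the mean time per integration step at each MPI task count. This is a rough sketch rather than an MPAS tool: `scaling_table` is a hypothetical helper, and the timings in the example are placeholders, not measurements from any of the attached logs.

```python
def scaling_table(step_seconds):
    """Compute speedup and parallel efficiency relative to the smallest run.

    `step_seconds` maps an MPI task count to the mean wall-clock seconds
    per integration step measured at that task count.
    Returns a list of (tasks, speedup, efficiency) tuples sorted by tasks.
    """
    base_tasks = min(step_seconds)
    base_time = step_seconds[base_tasks]
    rows = []
    for tasks in sorted(step_seconds):
        speedup = base_time / step_seconds[tasks]
        # Ideal speedup equals the ratio of task counts.
        efficiency = speedup / (tasks / base_tasks)
        rows.append((tasks, speedup, efficiency))
    return rows


if __name__ == "__main__":
    # Placeholder timings for illustration only.
    example = {1: 8.0, 2: 4.0, 4: 4.0}
    for tasks, speedup, efficiency in scaling_table(example):
        print(f"{tasks:4d} tasks  speedup {speedup:5.2f}  efficiency {efficiency:5.2f}")
```

An efficiency that stays near 1.0 means the run is scaling well. A sharp drop at a particular task count (for example, when the job first spans more than one node) points to where the problem starts.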
Thank you, mgduda, for your answer. From my tests, the problem appears to be that I had compiled MPAS with the DEBUG option set to TRUE. Now that I am running MPAS 8.0, the time per integration step is normal. I also see "Compiler flags: optimize" in the output log file. Having this line in the log output really helped me. Thanks again!
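For anyone checking their own build the same way, the "Compiler flags:" line mentioned above can be pulled out of a log with a few lines of Python. This is a small sketch: `compiler_flags` is a hypothetical helper, it assumes the log contains a line with the text "Compiler flags:" followed by the value (as in the "Compiler flags: optimize" line quoted above), and the log path is whatever you pass on the command line.

```python
import sys


def compiler_flags(log_text):
    """Return the text after 'Compiler flags:' in the log, or None if absent."""
    marker = "Compiler flags:"
    for line in log_text.splitlines():
        if marker in line:
            return line.split(marker, 1)[1].strip()
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_flags.py <path to the atmosphere log file>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        flags = compiler_flags(handle.read())
    if flags is None:
        print("No 'Compiler flags:' line found in this log.")
    else:
        print(f"Compiler flags: {flags}")
```

A reported value other than "optimize" is a hint to check how the model was compiled before investigating MPI launch or pinning issues.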
 
Thanks for following up, and it's good to hear that the problem wasn't something more complicated than the DEBUG=true build option!
 