Minimum computer requirements for running MPAS

This post is from a previous version of the WRF&MPAS-A Support Forum. New replies have been disabled; if you have follow-up questions related to this post, please start a new thread from the forum home page.

dosach

New member
Dear all,
I'm new here! My name is Christian, and I'm interested in using MPAS to run some climate simulations. However, I have no previous experience with this model. I would highly appreciate it if you could provide some information about the following questions:

1) What are the minimum computer requirements to run MPAS? I mean in terms of the number of CPU cores, memory size and storage system

2) What kind of Linux system is most suitable for running the model?

I'm thinking of running a nested subdomain at 30 km, and I need to know the computer requirements so that I can buy a machine with the necessary characteristics.

Hope you can help me. :)

Have a good day,
Christian
 
Regarding memory requirements, we've found that a reasonable way to estimate the amount of memory needed to run a particular simulation is to multiply the number of horizontal cells in the mesh by 0.175 MB/cell; this estimate is based on a single-precision model run with around 55 vertical levels. So, for example, a global, quasi-uniform 30-km simulation might require around 655362 cells * 0.175 MB/cell = 115,000 MB of memory. Note that this is the aggregate amount of memory across all nodes of a computing cluster, and not the amount of memory on each node. Also, this is a rough estimate only, and the amount of memory required for a simulation can vary a little based on, for example, your choice of physics parameterizations.
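
As a back-of-the-envelope illustration of the estimate above (only the 0.175 MB/cell figure and the 655362-cell count come from this post; the little helper function is purely hypothetical), a quick calculation might look like this:

# Rough memory estimate for an MPAS-Atmosphere run, following the 0.175 MB/cell
# rule of thumb quoted above (single precision, ~55 vertical levels).
def estimate_memory_mb(num_horizontal_cells, mb_per_cell=0.175):
    # Aggregate memory across all nodes/MPI tasks, not memory per node.
    return num_horizontal_cells * mb_per_cell

# Global, quasi-uniform 30-km mesh: 655362 horizontal cells
print(estimate_memory_mb(655362))   # ~114,688 MB, i.e. roughly 115,000 MB in aggregate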

In principle, there is no minimum requirement for the number of processors (as long as there is at least one processor), since using fewer processors simply means that a simulation will take longer to run. If one were patient enough, and if enough memory were available, a single processor could be used for any simulation. For a few notes on estimating simulation throughput with MPAS-Atmosphere with different numbers of processors, this forum thread may be helpful.
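
To make the "it will just take longer" point a bit more concrete, here is a purely illustrative sketch of projecting throughput from a short benchmark run; the time step and per-step timing below are placeholders, not MPAS measurements:

# Project simulation throughput from a measured wall-clock cost per model time step.
# Both example numbers below are placeholders for illustration only.
def simulated_days_per_wallclock_day(model_timestep_s, wallclock_s_per_step):
    steps_per_sim_day = 86400.0 / model_timestep_s
    wallclock_s_per_sim_day = steps_per_sim_day * wallclock_s_per_step
    return 86400.0 / wallclock_s_per_sim_day

# e.g. a 180 s model time step and 2 s of wall-clock time per step on some core count
print(simulated_days_per_wallclock_day(180.0, 2.0))   # ~90 simulated days per wall-clock day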

Providing advice on storage requirements can be difficult, since this depends very heavily on how many variables you would like to save in your simulations, and how frequently you would like to save them. In my opinion, the best approach here might be to try some shorter (and perhaps lower-resolution) simulations that otherwise reflect your intended work, and to estimate how much output you will generate from those test simulations.
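
If it helps, a rough sketch of that kind of output-volume estimate might look like the following; the field counts, output frequency, and 4-byte values here are assumptions for illustration, not MPAS defaults:

# Rough output-volume estimate for a test simulation. All inputs below are
# hypothetical; adjust them to match your actual output streams.
def output_size_gb(num_cells, num_levels, num_3d_fields, num_2d_fields,
                   num_output_times, bytes_per_value=4):
    values_per_output_time = num_cells * (num_levels * num_3d_fields + num_2d_fields)
    return values_per_output_time * num_output_times * bytes_per_value / 1024**3

# e.g. a 655362-cell mesh, 55 levels, 6-hourly output over a 30-day test run
print(output_size_gb(655362, 55, 10, 20, 4 * 30))   # ~167 GB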

Regarding the kind of system to use, I can offer that we have run MPAS-Atmosphere on a wide variety of computing systems: Raspberry Pi single-board computers, macOS laptops, linux desktops with 4 or 8 cores, and some large clusters like Cheyenne and Summit. So there isn't necessarily a specific system that I could recommend -- all come with pros and cons (cost, reliability, performance). One point worth mentioning, however, is that a higher-performance interconnect between nodes would generally be beneficial, as MPI communication is performed extensively throughout each time step in the model.

I realize the above information may be rather general, so if you have any specific follow-on questions, please feel free to post them in this thread!
 
mgduda said:
Regarding memory requirements, we've found that a reasonable way to estimate the amount of memory needed to run a particular simulation is to multiply the number of horizontal cells in the mesh by 0.175 MB/cell; this estimate is based on a single-precision model run with around 55 vertical levels. So, for example, a global, quasi-uniform 30-km simulation might require around 655362 cells * 0.175 MB/cell = 115,000 MB of memory. Note that this is the aggregate amount of memory across all nodes of a computing cluster, and not the amount of memory on each node. Also, this is a rough estimate only, and the amount of memory required for a simulation can vary a little based on, for example, your choice of physics parameterizations.

That is to say, if I were to run MPAS on a personal desktop, the main limits would be the amount of RAM and the number of cells? With 12 GB of RAM (leaving 4 of 16 GB for the main OS), that is roughly 68,571 cells with the default configuration and physics.
 
zemega said:
That is to say, if I were to run MPAS on a personal desktop, the main limits would be the amount of RAM and the number of cells? With 12 GB of RAM (leaving 4 of 16 GB for the main OS), that is roughly 68,571 cells with the default configuration and physics.
That's correct. Again, the 0.175 MB/cell figure is a rough estimate. Using more cores will generally increase the model simulation rate, but if time is not an issue, a single MPI task running on a single hardware core will eventually finish any integration that could be done on more cores, as long as there is enough memory to hold the complete model state (plus, e.g., temporary buffers used during I/O that in principle increase the maximum memory usage).
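
To spell out the arithmetic in this exchange (only the 0.175 MB/cell figure comes from the thread; the helper below is hypothetical):

# Invert the rough 0.175 MB/cell rule of thumb to bound the mesh size that fits in
# a given amount of memory. Actual usage also depends on physics choices, I/O
# buffers, etc., so treat the result as an upper-bound estimate only.
def max_cells_for_memory(available_mb, mb_per_cell=0.175):
    return int(available_mb / mb_per_cell)

# e.g. ~12,000 MB left for the model on a 16 GB desktop
print(max_cells_for_memory(12000))   # ~68,571 cells, matching the estimate above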
 
mgduda said:
That's correct. Again, the 0.175 MB/cell figure is a rough estimate. Using more cores will generally increase the model simulation rate, but if time is not an issue, a single MPI task running on a single hardware core will eventually finish any integration that could be done on more cores, as long as there is enough memory to hold the complete model state (plus, e.g., temporary buffers used during I/O that in principle increase the maximum memory usage).
I was wondering, is it possible to calculate the expected memory needed? Perhaps during init_atmosphere_model, it could display something like "This process expects roughly 8 GB of RAM to be used". I feel that being able to see the expected memory usage would allow me to scale the regional domain precisely down to the limits of my personal desktop. On a much bigger cluster, being able to read the log and see the expected memory usage (perhaps expected versus actual memory used) would help both users and administrators determine whether it's a memory limit problem.
 
It may be a bit difficult to accurately calculate the memory required by a simulation at the init_atmosphere stage, since we don't yet know, e.g., which physics schemes will be used in the simulation or which diagnostics will be requested in output streams. In principle this could be done when the atmosphere_model program starts, but gauging memory requirements of physics and diagnostics would still be quite a challenge, I think.

I do agree that it would be nice to give better indications of out-of-memory errors. One tractable step in this direction might be to check the status of all major memory allocations in the model; for example, just checking allocations of model fields defined in the Registry.xml file might help. We'll give this some serious consideration, and if you have any other suggestions, we'd be glad to hear them (perhaps in a new discussion topic in the Code development section of the forum).
 