Limited-Area run crashes


We've been running regional MPAS simulations using the northern hemisphere of the 30 km grid, trimmed with the MPAS-Limited-Area utility (~337k cells), so that the only boundary runs nearly along the equator. Most runs complete fine with a 180 s timestep, but after some crashes we dropped to 150 s for additional stability, and we're still seeing occasional crashes even at the lower timestep. The crashes happen near the boundary around the equator: once over terrain in Sumatra, Indonesia, and once over an unremarkable patch in the middle of the Atlantic, but both close to the domain boundary. Looking at maps, we can see that u at the upper levels (especially the topmost level, #55 at 30 km for us) steadily blows up due to some instability. There are signs of something going wrong approximately 8 hours into the run, they're very evident 13 hours in, and the run crashes about 15 hours in (due to a radiation computation issue, likely unrelated). Sometimes u recovers to reasonable values an hour before the crash, but by then the damage is done. The points where this happens are 2 cells away from the boundary, suggesting it probably has something to do with the boundary treatment.
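For reference, a domain like ours can be described to MPAS-Limited-Area with a points-file along these lines (a sketch assuming the utility's circle region type; the name and radius here are illustrative, not our exact specification — a circle of roughly a quarter of Earth's circumference centered on the pole puts the boundary near the equator):

```
Name: northern_hemisphere
Type: circle
Point: 90.0, 0.0
radius: 10000000
```
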

We have tried a number of things with the timestep and with smoothing along the boundary, to no avail. We have config_blend_bdy_terrain set to true throughout; we have tried changing the blending parameters (nRelaxZone in core_atmosphere/dynamics/mpas_atm_boundaries.F) to 2 and 10 (instead of the default 5), switching physics from mesoscale_reference to convection_permitting, running more sub-steps, and using config_rayleigh_damp_u to damp the upper-level winds; all of these led to crashes at about the same time. Before continuing to debug, we're looking to see whether others have had similar problems with limited-area runs and, if so, how you've solved them.

We've been working with another group that has encountered similar stability issues with regional MPAS, although on a 3 km regional mesh. The configuration change we have settled on to address these kinds of instabilities is to enable the upper-level absorbing layer that uses a 2nd-order horizontal filter. The filter exists in the MPAS release, but it is hardwired to apply only to the top three levels of the model. We have generalized it to apply to a user-specified number of levels, with a linear scaling of the constant eddy viscosity from 0 to its maximum value, from the lowest application level up to the top level of the model. This generalized code is based on MPAS Version 7.3, and it can be found on GitHub in mgduda/MPAS-Model, in a branch named 'atmosphere/cam_damping_nlevels'.
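In other words, writing k_base for the lowest level where the filter is applied and k_top for the model top (symbol names mine, not from the code), the level-dependent eddy viscosity ramps roughly as:

```
nu(k) = nu_max * (k - k_base) / (k_top - k_base),   k_base <= k <= k_top
```

so the filter is off (nu = 0) at the base of the absorbing layer and reaches its full strength nu_max at the model top.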

You will have to set two namelist.atmosphere variables to enable the absorbing layer. In the &damping section of the namelist, set config_mpas_cam_coef to a real value greater than 0. This sets the maximum scaling value for the eddy viscosity; I would suggest trying a value of 1.0 to start. We are using 2.0 in our convective application, so you have some headroom here. The default is zero, which means that no filtering is applied; setting it to anything greater than zero activates the filter. The second parameter, config_nlevels_cam_damp, is the number of layers over which the absorbing layer is active. The default is 3 (this comes from the 32-level MPAS-CESM-CAM climate configuration). I would suggest trying more levels, perhaps 8, and seeing how things go.
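Putting that together, the &damping entries would look something like this (a sketch using the parameter names from the 'atmosphere/cam_damping_nlevels' branch, with the suggested starting values rather than tuned ones):

```
&damping
    config_mpas_cam_coef = 1.0
    config_nlevels_cam_damp = 8
/
```
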

The behavior we observed that led to regional MPAS instability was a mismatch in the horizontal velocities between the prescribed boundary values specifying inflow and the interior solution specifying outflow, all of it happening in the stratosphere. The instability manifests itself in the relaxation zone where the algorithm blends the two solutions, and this is very similar to what you described. Regarding the options you have tried: we have seen that increasing the size of the relaxation zone can sometimes help, but it appears that was not the case in your application. The default Rayleigh damping timescale (5 days) is likely too large to damp the horizontal velocity, but I do not think it is advisable to lower it much below its current value.
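For completeness, the Rayleigh damping options sit in the same &damping namelist section; something like the following reflects the defaults discussed above (the timescale variable name here is from memory — check the Registry for your MPAS version before relying on it):

```
&damping
    config_rayleigh_damp_u = true
    config_rayleigh_damp_u_timescale_days = 5.0  ! default; probably best not lowered much
/
```
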

I also think you may be able to run with an even larger timestep in your application, perhaps as high as 240 seconds (I'm assuming you're using config_number_of_sub_steps = 2 and config_dynamics_split_steps = 3). You may need to increase config_number_of_sub_steps to 4 for stability, but the acoustic steps are inexpensive relative to transport, and I think this may drop your run time significantly if it proves sufficiently robust.
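Concretely, the timestep experiment I'm suggesting would be something like the following &nhyd_model settings (values to try, not a guarantee — back off on config_dt if it proves unstable):

```
&nhyd_model
    config_dt = 240.0
    config_number_of_sub_steps = 4
    config_dynamics_split_steps = 3
/
```
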
Thank you! This does seem very similar, and we'll keep you updated as we try this. We'll also optimize the timestep after resolving the crashes.

I just wanted to update this thread. We have had success using the updated code to resolve the crashes of regional MPAS that we saw. The current settings we are using are:
config_nlevels_cam_damp = 8
config_mpas_cam_coef = 1.0
We've seen the crashes resolved in both 15 km and 24 km regional runs with these settings.
Thanks for the follow-up, and it's great to hear that the modified absorbing layer was effective! I believe we're planning to release the modifications to the absorbing layer as part of MPAS-A 8.0.0 before the WRF/MPAS Users' Workshop in June.