
WRFV4.0.1 model crash with YSU and MYNN PBL options

This post was from a previous version of the WRF & MPAS-A Support Forum. New replies have been disabled; if you have follow-up questions related to this post, please start a new thread from the forum home page.


Dear all,

Since upgrading to WRFV4.0.1 (from WRF3.9.1), I have noticed crashes of the run with particular boundary-layer schemes.
In particular, with the Yonsei University (YSU) PBL option and the MYNN2(3) options, the model tends to crash after a few hours of simulation, due to a violation of the vertical CFL condition at some points with complex orography (always in the first two or three levels above the ground). I have no problems with other boundary-layer schemes, such as Mellor-Yamada-Janjic or QNSE.

The simulation I'm running is centered over Italy, with 4 km horizontal resolution, 37 vertical levels, and an adaptive time step. I also tried decreasing the target_cfl and target_hcfl values to extremely small numbers, but the problem remains. Deactivating the hybrid vertical coordinate does not produce any change.
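For reference, the adaptive-time-step knobs mentioned above live in the &domains record of namelist.input. A minimal sketch follows; the values shown are illustrative defaults, not the poster's actual settings:

```
&domains
 use_adaptive_time_step = .true.,
 target_cfl             = 1.2,   ! lower this to force smaller vertical-CFL-limited steps
 target_hcfl            = 0.84,  ! horizontal CFL target; lower it likewise
 max_step_increase_pct  = 5,
 starting_time_step     = -1,    ! -1 lets WRF choose a value based on dx
 max_time_step          = -1,
 min_time_step          = -1,
/
```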
The namelist is essentially the same as the WRFV4 template namelist, apart from some small changes in the domain and physics options:

mp_physics = 5,
cu_physics = 1,
ra_lw_physics = 1,
ra_sw_physics = 1,
bl_pbl_physics = 1,
sf_sfclay_physics = 1,
sf_surface_physics = 1,
radt = 5,
bldt = 0,
cudt = 5,

The same configuration with the WRFV3.9.1 model does not crash, and I have no problems even with very high target CFL values.

Do you have any advice on the origin of the problem? Could there be a bug in the WRFV4 code when using the YSU and MYNN boundary-layer schemes?

Thanks for the help

Enrico Maggioni
Can you post your namelist.wps and namelist.input for me to take a look? What input data do you use to run this case? If you run the same case with the same options but an older version of WRF (e.g., WRFV3.8.1), does it run successfully?

Any chance you are over mountains?

Here is a posting from today by a user having a mountain problem:

>The domain had the Himalayan region with a grid size of 3 km; it seems wrf.exe was unable to resolve the orography over the Himalayan region. We had to make the following updates to GEOGRID.TBL:
># smooth_option = smth-desmth_special; smooth_passes=1 (line no. 7)
>smooth_option = 1-2-1; smooth_passes=3
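For context, the smooth_option change quoted above belongs to the HGT_M (terrain height) entry of WPS's GEOGRID.TBL. A hedged sketch of what the edited entry might look like (the surrounding fields are typical for this entry and may differ in your WPS version):

```
name = HGT_M
  priority = 1
  dest_type = continuous
  # previously: smooth_option = smth-desmth_special ; smooth_passes = 1
  smooth_option = 1-2-1
  smooth_passes = 3
  ...
```

After editing GEOGRID.TBL, geogrid.exe must be rerun so the smoothed terrain is written into the geo_em files.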
Dear all,

Sorry for the very long delay in answering you.
I have attached the namelist I'm using with WRFV4.

I've tried the solution of smoothing the orography in geogrid, and it is partially working, in the sense that the same simulation now crashes only after more hours of simulation.
Combining the new geo_em file with a reduction in the target_cfl and target_hcfl values (which translates into a smaller time step), I get good results and the runs are more stable, with only a few crashes in the last two weeks.

Nonetheless, it seems WRFV3.9 was more stable, because the same configuration (with higher values of target_cfl and target_hcfl) almost never crashed. The only difference, apart from the WRF version, was the vertical coordinate, which for WRFV4 is the hybrid coordinate.

Anyway, even at the cost of a reduced time step, it seems I can run the model in this configuration for now.

Thank you for the help



  • Attachment: namelist.input (4.2 KB)
I have a few suggestions for the namelist options:
(1) For a 4 km case, it is better to turn off the cumulus scheme (i.e., cu_physics = 0).
(2) Please set w_damping = 1.
(3) It is also helpful to increase the value of epssm (e.g., to 0.5 or higher).
(4) starting_time_step = 40 and max_time_step = 150 seem too large for 4 km resolution. You should probably set them to the default values.
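Taken together, suggestions (1)-(4) would amount to namelist changes along these lines (a sketch, not a verified configuration; w_damping and epssm live in &dynamics, cu_physics in &physics, and the time-step options in &domains):

```
&physics
 cu_physics = 0,            ! (1) no cumulus parameterization at 4 km
/
&dynamics
 w_damping  = 1,            ! (2) damp excessive vertical velocities
 epssm      = 0.5,          ! (3) stronger off-centering of vertical sound waves
/
&domains
 starting_time_step = -1,   ! (4) -1 = let WRF pick a default based on dx
 max_time_step      = -1,
/
```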
Thank you, Ming Chen, for the valuable advice.

I usually set the w_damping option to 1; probably during the new WRFV4 setup I didn't notice that it was set to zero.
As for the epssm value, I've never explored changing that variable; I'll try it as you suggested.

Thank you

Hi all,

I just want to give you an update on the boundary-layer problem.

Most of the runs I made in the last two weeks went fine. The key to the problem, in my opinion, was the time-step settings I used. In some cases the adaptive time step, even with low values of target_cfl and target_hcfl, is not sufficient to avoid a model crash. Reducing max_time_step to around 45-50 seconds has brought the percentage of model crashes down to nearly zero.
It seems that in some cases we have to "manually" limit the time-step values, even in an adaptive-time-step regime.
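In namelist terms, the workaround described above would look roughly like this (the cap of 48 s is just one value in the 45-50 s range mentioned; the CFL targets are illustrative):

```
&domains
 use_adaptive_time_step = .true.,
 target_cfl             = 1.1,
 target_hcfl            = 0.84,
 max_time_step          = 48,   ! manual cap, even though the step is adaptive
/
```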

I've also noticed that the configurations most susceptible to crashes use the YSU and MYNN boundary-layer schemes in very windy conditions (e.g., sustained 850/700 hPa winds impacting the Alps).

Thank you all for the valuable advice.


Thank you for the information. We will pay attention to the YSU and MYNN schemes as you suggested. So far we are not aware of any bugs related to these two schemes.

I would like to add my findings on the topic here.

As the OP said, he suspected particular physics schemes in v4.0 of causing the crashes. But it most probably has nothing to do with the schemes; it is a simple CFL violation.

I got the same issue after upgrading from v3.9.1.1 to v4.0.3 and v4.1. In my setup I experienced crashes with completely unrelated NOAH-MP messages in the error logs, which cost me two days of searching for the problem in the wrong place. It turned out that NOAH-MP intercepted an impossible model state near the nest boundaries and called wrf_error_fatal before WRF had a chance to print out the CFL errors, so they were masked by NOAH-MP errors like "Energy budget problem in NOAHMP LSM", "emitted longwave <0; skin T may be wrong due to inconsistent", "STOP in Noah-MP", and so on. I'm intentionally pasting the error messages here for Google to index, because it could help someone in the future.

So as I said, it turned out the problem had nothing to do with NOAH-MP. I found that these were CFL violations only after I disabled all wrf_error_fatal calls in NOAH-MP; then WRF finally printed the CFL errors and crashed right after NOAH-MP's messages.

Now some findings about the root cause of the problem. In v3.x it does not happen nearly as often as in 4.x. It turns out this is due to the different placement of eta levels by real.exe. From v4.0 onward, real by default creates significantly denser eta levels near the ground than v3.x did. This means the model time step has to be decreased, compared to v3.x, to keep the model numerically stable (the closer the eta levels are to each other, the smaller the time step has to be).
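The dependence of the stable time step on layer thickness can be sketched with a toy vertical-CFL calculation. All numbers here are hypothetical, chosen only to show the scaling; real WRF levels and vertical velocities vary in space and time:

```python
def max_stable_dt(dz_m: float, w_ms: float, cfl_target: float = 1.0) -> float:
    """Largest time step (s) keeping the vertical CFL number w*dt/dz below cfl_target."""
    return cfl_target * dz_m / w_ms

# Hypothetical lowest-layer thicknesses: a coarser v3.x-style spacing
# vs. a denser v4.x-style spacing near the ground, same updraft speed.
dt_coarse = max_stable_dt(dz_m=60.0, w_ms=1.5)  # 40.0 s
dt_dense  = max_stable_dt(dz_m=30.0, w_ms=1.5)  # 20.0 s
print(dt_coarse, dt_dense)
```

Halving the lowest-layer thickness halves the stable time step, which is why the denser v4 eta levels demand a smaller (or more tightly capped adaptive) time step over steep terrain.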

There are other possible solutions if one does not want to reduce the time step, in order to maintain computation speed:

1) use the v3.x method for calculating eta levels in real by setting auto_levels_opt = 1, and/or
2) use more aggressive terrain smoothing in geogrid
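Option (1) is a single switch in the &domains record (a sketch; as the post above says, 1 selects the v3.x-style level placement, while the v4 default produces denser levels near the ground):

```
&domains
 auto_levels_opt = 1,   ! 1 = v3.x-style eta-level spacing
/
```

Note that real.exe must be rerun after changing this option, since the eta levels are baked into the wrfinput files.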

I really hope this helps someone running into the same problems.