WRF4.X more unstable than WRF3.9 (SIGSEGV)

toshio

Hi,

I have been using the new version of WRF, but in my experience it is much more unstable than versions prior to 3.9.

Using WRF4.1, I ran a case study:

analysis date: 2019071500
boundary: GFS (FV3)
number of processors: 160

experiment 1:
nproc_x = 16,
nproc_y = 10,
target_cfl = 1.2,
target_hcfl = .84,

result: severe (174): SIGSEGV, segmentation fault occurred

experiment 2:
nproc_x = 16,
nproc_y = 10,
target_cfl = 0.8,
target_hcfl = .5,

result: SUCCESS

experiment 3:
nproc_x = -1,
nproc_y = -1,
target_cfl = 1.0,
target_hcfl = .67,

result: model keeps running forever

I have made other runs with the experiment 2 setup that also resulted in SIGSEGV.
I put some files at https://drive.google.com/drive/folders/1N3YHyIh87o_AA0mQNoIVIdhaZDOdzP3H?usp=sharing in case you want to try to reproduce the problem.

Any help is welcome,
 
I suppose this is a real-data case. Please let me know if I am wrong.
I looked at your namelist.input. I have a few suggestions about it:
(1) If you turn off adaptive time step, does the model still tend to be more unstable than the older version?
(2) If you turn on adaptive time step, please set starting_time_step = 20 (i.e., 4 x DX), which should help stabilize the numerical integration.
(3) Please set km_opt = 4 for a real-data case.
(4) Since epssm = 0.8, I guess you are running over an area of complex terrain, right?
We are not aware that WRFV4 tends to be more unstable than WRFV3.9. Please keep us updated if you have more concerns.
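
The arithmetic behind suggestion (2) can be sketched as follows. This is only an illustration: the function name is hypothetical, the 5 km grid spacing is assumed from starting_time_step = 20 being 4 x DX, and the 6 x DX ceiling is the usual rule of thumb for a fixed time_step rather than something stated in this post.

```python
def suggested_time_steps(dx_km: float) -> dict:
    """Rule-of-thumb time steps (seconds) for a grid spacing given in km."""
    return {
        # Starting value suggested above for the adaptive time step: 4 x DX
        "starting_time_step": 4 * dx_km,
        # Common upper bound for a fixed time_step: 6 x DX (assumption, see lead-in)
        "max_fixed_time_step": 6 * dx_km,
    }


# Assumed grid spacing of 5 km
steps = suggested_time_steps(5.0)
print(steps)
```

These values are starting points only; a run that still fails may need a smaller step.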
 
Hi,

Thanks for your response, and sorry for not replying sooner. I have been testing runs with some of your suggestions.

Answering your questions:

0 - yes, it is a real data case.
1 - Yes, with a fixed time step I have to use smaller values in version 4. (In some cases I had to set time_step to about 10 s.)
2 - It makes no difference, since within a few steps it reaches the maximum dt.
3 - km_opt = 4 helps, but not much.
4 - The epssm setting is probably a leftover from an old grid. My area has some mountains, but I don't think the terrain is very complex (see namelist.wps).

I can make further tests if you want.
 

Attachments

  • namelist.wps (1 KB)
  • wrf3.9.1.1.rsl.out.0000.txt (1.3 MB)
  • namelist.input (2.9 KB)
  • rsl.out.0000.wrf4.0.txt (248.4 KB)
  • rsl.error.0158.wrf4.0.txt (3 KB)
In your namelist.input, many dynamics options are commented out, which means that you are using the default options specified in the Registry.
However, note that some of these defaults may change from version to version. In addition, the default options may not be appropriate for your case. Can you explicitly specify these options in your namelist?
 
Hi Chen,

Actually, those options were commented out in an attempt to track down what was causing the problem; I reverted to the defaults to make it easier to understand what is going on.

Please let me know if you want more tests.

Best regards,
 
If you still have the same instability issue, I would suggest turning off the adaptive time step and running in debug mode (reconfigure with ./configure -D and recompile). You will then be able to see exactly when and where the model crashed. This information will help you figure out the possible reasons for the crash.
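
Once a debug run has crashed, the rsl.* files from every MPI rank need to be searched for the failure. A minimal sketch in Python, assuming the logs sit in one run directory: the function name and the default keywords are illustrative choices, with "SIGSEGV" taken from the error quoted in this thread and "cfl" assumed to be a useful marker of CFL-related messages.

```python
import glob
import os


def find_crash_lines(run_dir, keywords=("SIGSEGV", "cfl")):
    """Return (file name, line number, text) for rsl.* lines containing a keyword."""
    hits = []
    # Sort so that ranks are reported in a stable order
    for path in sorted(glob.glob(os.path.join(run_dir, "rsl.*"))):
        with open(path, errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                lowered = line.lower()
                # Case-insensitive substring match against each keyword
                if any(k.lower() in lowered for k in keywords):
                    hits.append((os.path.basename(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # Scan the current directory and print every matching line
    for name, lineno, text in find_crash_lines("."):
        print(f"{name}:{lineno}: {text}")
```

The first rank and time at which a match appears is usually the most informative place to start reading.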

Since your grid interval is 5 km, which lies in the so-called grey zone, you may also turn off the cumulus scheme and see whether the case runs.

I am sorry I have no immediate solution to the problem you have encountered. It is hard for us to debug each individual user's case.
 
Thank you, I know it is impossible to test everything. I will try with ./configure -D (debug = 1000 didn't help much).

Regards
 