
Segmentation fault with adaptive time step.

Manuarii

Hi everyone,

I am experiencing an issue while running WRF. Although both WPS and real.exe completed successfully, running wrf.exe with adaptive time stepping leads to a segmentation fault. To work around the problem, I have to set a constant time step of 10 seconds for the larger domain. I am currently using WRF version 4.4.2 with 81 processors. Attached are the rsl.error.0001 and namelist.input files for your reference. Could this be caused by the time step or by the size of my inner domain?

Thanks a lot in advance,

Vazquez Ballesta Manuarii
 

Attachments

  • namelist.input
    6.3 KB · Views: 6
  • rsl.error.0001.txt
    26.9 KB · Views: 3
Hi,
Can you try a couple things?

1) First, set debug_level = 0. I know you were probably trying to get some additional output information, but this option usually just adds a lot of useless junk to the rsl files, making them more difficult to read.

2) Try running just a single domain with adaptive time stepping on. If that works, try 2 domains. I'd like to know which nest (if any) causes the issue.

3) Try removing the eta_levels from your namelist and running that way - with however many domains it takes to make it fail.

Let me know the results and then package all of your rsl* files into a single *.tar file and attach that, along with the updated namelist.input file so I can take a look. Thanks!
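For anyone following along, the packaging step requested above can be sketched in Python with the standard tarfile module. This is only a minimal sketch: the function name pack_rsl_files and the archive name rsl_files.tar are hypothetical, and it assumes the rsl.error.* and rsl.out.* files sit directly in the run directory.

```python
import glob
import os
import tarfile


def pack_rsl_files(run_dir, archive_name="rsl_files.tar"):
    """Bundle every rsl.* file found in run_dir into one uncompressed tar archive.

    Both the function name and the default archive name are illustrative only.
    """
    rsl_paths = sorted(glob.glob(os.path.join(run_dir, "rsl.*")))
    archive_path = os.path.join(run_dir, archive_name)
    with tarfile.open(archive_path, "w") as tar:
        for path in rsl_paths:
            # Store only the file name so the archive unpacks flat.
            tar.add(path, arcname=os.path.basename(path))
    return archive_path, len(rsl_paths)
```

Calling pack_rsl_files(".") from the run directory would write rsl_files.tar next to the logs; the updated namelist.input still has to be attached separately.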
 
Hi, Thanks for your reply.

I attempted to run the simulation by setting max_dom = 2 while using the same adaptive time step options, and it worked without issues. This indicates that the third domain is likely the cause of the problem. With this in mind, I experimented with different configurations in the namelist.input file to prevent the simulation from failing. However, even when using the smallest possible time step with the adaptive time step, nothing changed.

What’s puzzling is that the simulation consistently stops at the exact same point in time: "time 2017-02-20_00:10:00 on domain 3". In some cases, I don’t even encounter any CFL errors, yet the issue persists when running with three domains.
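One way to confirm that every run stops at the same model time is to pull the last reported timestamp for each domain out of an rsl file. The sketch below is a rough helper, not part of WRF: the function name last_time_per_domain is hypothetical, and the pattern simply assumes that the log lines contain text shaped like the message quoted above ("time 2017-02-20_00:10:00 on domain 3").

```python
import re

# Pattern modelled on the quoted message: "time <timestamp> on domain <n>".
# The exact surrounding text in a real rsl file is not assumed.
TIME_ON_DOMAIN = re.compile(
    r"time (\d{4}-\d{2}-\d{2}_\d{2}:\d{2}:\d{2}) on domain\s+(\d+)"
)


def last_time_per_domain(log_text):
    """Return {domain number: last timestamp reported} for one rsl file's text."""
    last_seen = {}
    for match in TIME_ON_DOMAIN.finditer(log_text):
        timestamp, domain = match.group(1), int(match.group(2))
        # Later matches overwrite earlier ones, leaving the final timestamp.
        last_seen[domain] = timestamp
    return last_seen
```

Running this on the same rsl.error file from several failed runs and comparing the entry for domain 3 shows whether the stopping time really is identical across configurations.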

Attached is a zip file that contains the error logs for both the two-domain and three-domain configurations.

Manuarii
 

Attachments

  • rsl_error_3do.zip
    395.4 KB · Views: 1
  • rsl_error_2do.zip
    1.8 MB · Views: 1
Update: I’ve identified that the error is specifically related to the long-wave radiation scheme. In the namelist.input, the radt parameter is set to 10 minutes, and the model crashes exactly when the simulation reaches that 10-minute mark. I tried adjusting this value to both 1 minute and 20 minutes, but the model still fails once it hits the corresponding time.
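The reasoning behind this observation is simple arithmetic: if the crash time minus the simulation start time is a whole multiple of radt, the failure coincides with a radiation call. A small sketch of that check follows. The function name falls_on_radiation_call is hypothetical, and the start time 2017-02-20_00:00:00 used in the usage note is an assumption inferred from the reported crash at 00:10:00 with radt = 10; it is not stated in the posts.

```python
from datetime import datetime

# Timestamp layout as it appears in the quoted WRF log message.
WRF_TIME_FORMAT = "%Y-%m-%d_%H:%M:%S"


def falls_on_radiation_call(start, crash, radt_minutes):
    """True if the crash time is a whole, non-zero multiple of radt after start."""
    elapsed = (
        datetime.strptime(crash, WRF_TIME_FORMAT)
        - datetime.strptime(start, WRF_TIME_FORMAT)
    )
    elapsed_seconds = elapsed.total_seconds()
    interval_seconds = radt_minutes * 60
    return elapsed_seconds > 0 and elapsed_seconds % interval_seconds == 0
```

With the assumed start time, falls_on_radiation_call("2017-02-20_00:00:00", "2017-02-20_00:10:00", 10) is true, which matches the behaviour described above for radt = 10.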

What steps can I take to properly use the adaptive time step with this configuration?

Manuarii
 
Manuarii said: "In some cases, I don’t even encounter any CFL errors, yet the issue persists when running with three domains."
Are you changing anything with your setup when you don't receive any CFL errors? If you're not changing anything, the CFL errors should appear every time. The rsl files you shared with me show multiple CFL errors, and unfortunately you will have to overcome that issue before anything else can be resolved. See Segmentation Faults - Helpful Information, which includes information about getting past CFL errors. There are options beyond just decreasing the time_step. It may be that you can solve the CFL issue without having to use adaptive time step. If you can get rid of the CFL errors and it's still failing, send your modified namelist.input file, as well as the packaged-up rsl files, again. Thanks!
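To check quickly whether a given run still produces CFL messages, the rsl.error.* files can be scanned for lines that mention "cfl". The sketch below is only an approximation: the function name count_cfl_lines is hypothetical, and matching on the bare substring "cfl" (case-insensitive) is an assumption about how the messages are worded, so it may also catch unrelated lines.

```python
import glob
import os


def count_cfl_lines(run_dir):
    """Return {file name: number of lines mentioning 'cfl'} for rsl.error.* files.

    Files with no matching lines are left out of the result.
    """
    counts = {}
    for path in sorted(glob.glob(os.path.join(run_dir, "rsl.error.*"))):
        # errors="replace" keeps the scan going if a log has odd bytes in it.
        with open(path, errors="replace") as handle:
            hits = sum(1 for line in handle if "cfl" in line.lower())
        if hits:
            counts[os.path.basename(path)] = hits
    return counts
```

An empty result from count_cfl_lines(".") suggests the run is free of CFL messages, in which case a remaining segmentation fault has to come from something else.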
 
Thanks for your response. Yes, I adjust my setup to avoid CFL errors. In the attached example (rsl.error and namelist.input), where I don't encounter CFL errors, I can increase the time step up to 1 s for the inner domain, and the model simulates up to 10 minutes before failing again.

That said, I am still encountering a segmentation fault, even with these adjustments.

Manuarii
 

Attachments

  • namelist.input
    7.8 KB · Views: 1
  • rsl_error_no_cfl_error.zip
    713.1 KB · Views: 1