WRF crashes when generating restart files

htan2013

Member
Dear all experts,

I would like to know whether any settings in my namelist are wrong.
The issue I am facing is that when I set a restart interval shorter than my simulation duration, the model stops.
For example, for my simulation to complete, I have to set the restart interval to something like 14400 (10 days) in this namelist. If I use 60, 360, or 720, it always stops when the rst file is generated.

Although it's OK to set the restart interval to a large number, that causes problems on Derecho, where the wall time is only 12 hrs. I have attached RSLs, namelist, wrfbdy & wrfinputs (Google Drive: WRF Files - Google Drive).

Any advice is much appreciated.

Thanks,
HT
 

Attachments

  • namelist.input (5.1 KB)
  • rsl.error.0000 (359.5 KB)
  • rsl.out.0000 (224.5 KB)

Hi,
WRF restart files are generally much larger than wrfout* files, so the model can sometimes struggle to write them out. I'm not sure if this is causing the issue, but you may need to use more processors for this simulation. Given the size of your domains and the fact that you're running on Derecho, you could use up to 10 nodes (1280 processors) and still be within the limits. That doesn't mean you need to use that many, but you could try using several more nodes to see if it makes any difference.

Another thing to check is just that you have enough disk space wherever you're running this.
 
Thank you, Kelly!
I will test it using more processors. I thought it was more likely related to a NetCDF output issue.
 
Hi,

I am facing a similar issue. If I don't generate restart files (by setting a larger restart interval), it works, but when I try to generate a restart file, the model stops while writing the rst files. I need restart files because I have to run the model for a longer duration.

Did you find a solution?

Thanks,
Sushmita
 
Hi Sushmita,

Basically what Kelly suggested: set io_form_restart to 102 and it helps. Sometimes the restart files are just too large.
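
For anyone finding this later, here is a minimal sketch of the relevant &time_control entries. It assumes a 12-hour restart interval (720 minutes, one of the values mentioned earlier in the thread); that value is only an illustration, and every other entry in your own namelist stays as it is. With io_form_restart = 102, WRF writes the restart as split netCDF files, one per MPI task, instead of a single large file. As far as I know, a run restarted from split files needs the same number of processors as the run that wrote them, so check that before changing your job size.

```
&time_control
 restart             = .false.,   ! set to .true. only when resuming from wrfrst files
 restart_interval    = 720,       ! minutes between restart writes (720 = 12 h)
 io_form_restart     = 102,       ! 102 = split netCDF, one file per MPI task (2 = single netCDF file)
/
```

This is a fragment, not a complete &time_control block, so merge these entries into your existing namelist.input rather than replacing the section.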
 