WRF running time limit in HPC system

sd16121

Dear Office,

When running a high-resolution simulation, I found that the HPC system I use imposes a limit on running time. Could you please let me know how to handle this so that I can run the full job to completion?

Best regards,
Stella
 
Hi Stella,
I recommend using the "restart" option. You set the model to write a restart file at a simulation time before the wallclock limit is reached, or at a regular interval (e.g., every 24 hours of simulation time), and then use that restart file to start the model up again from that point.
 

Dear Office,

Thank you so much for your kind reply. The linked page says "To initiate the restart run, edit the namelist.input file, so that your start_* time is set to the restart time (which is the <date> of the restart file). You must also set restart=.true." When should I modify the namelist? After the run has stopped because it exceeded the available wallclock time?

Best regards,
Stella
 
Stella,
The first time you run a simulation, set restart = .false. and set restart_interval to something like 1440, which is the number of minutes in a 24-hour period. This gives you a restart file after every 24 hours of simulation time. Then, let's say your wallclock limit allowed 6 and 1/2 days of simulation time before the job stopped. For your next run, you would go back in and set restart = .true. in the namelist, set the start time to the time of the last restart file, and you could keep restart_interval set to 1440. The new (restart) simulation would start from the last restart file written (6 full days into the simulation) and continue to run until it stopped again.
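The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function name and the start date are made up, and it assumes restart files are written at exact multiples of restart_interval counted from the simulation start. With 6.5 days completed and an interval of 1440 minutes, the last complete restart file is the one at 6 days of simulation time.

```python
from datetime import datetime, timedelta

def last_restart_time(start, restart_interval_min, simulated_min):
    """Return the time of the last restart file written before the run stopped.

    Assumes a restart file is written every restart_interval_min minutes of
    simulation time, counted from the simulation start.
    """
    completed_intervals = int(simulated_min // restart_interval_min)
    if completed_intervals == 0:
        return None  # the run stopped before any restart file was written
    return start + timedelta(minutes=completed_intervals * restart_interval_min)

# Hypothetical start date; 6.5 days of simulation time finished before the limit.
start = datetime(2023, 4, 1, 0, 0, 0)
simulated = 6.5 * 1440  # 9360 minutes of simulation time
resume = last_restart_time(start, 1440, simulated)

# WRF restart files are named wrfrst_d<domain>_<date>; domain 01 is assumed here.
print(resume.strftime("wrfrst_d01_%Y-%m-%d_%H:%M:%S"))
```

The printed name is the restart file whose date would go into the start_* entries of the next run.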
 
Hi,

If the time limit allows 6.5 days of simulation in total, should I set restart_interval to 9360 minutes so that a restart file is written just before the run stops? Do I need to run real.exe and wrf.exe again myself, or do I only need to submit my job again?
I would also like to confirm: whether the restart interval is 1440 or 7200, will the restart simulation start from the point where the run stopped?

Best wishes,
Stella
 
Stella,
You can set restart_interval to whatever time interval you like, as long as a full restart file is written out before the job stops. You do not need to re-run real.exe. You will need to modify your namelist to set restart = .true. and the new start date/time, then re-run wrf.exe. The restart simulation starts from the date/time you set in your namelist.input file, which should correspond to the time of an available restart file. You may find this page helpful - this is a practice exercise given to the tutorial students, but you can use it as a guide. You may want to refer back to the single domain case to see how the namelist was set up for that simulation, and then how it was set up for the restart.
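As a rough sketch of the namelist step, the Python below picks the newest restart file from a list of file names and prints matching start_* and restart entries for the &time_control section of namelist.input. The helper name and the file list are hypothetical, it handles a single domain only, and it does not check that the newest file was written completely before the job was killed, which is worth verifying by hand.

```python
import re
from datetime import datetime

# WRF restart files are named wrfrst_d<domain>_<YYYY-MM-DD_HH:MM:SS>.
RST_PATTERN = re.compile(r"^wrfrst_d(\d{2})_(\d{4}-\d{2}-\d{2}_\d{2}:\d{2}:\d{2})$")

def restart_settings(filenames, domain=1):
    """Return namelist lines for restarting from the newest restart file."""
    times = []
    for name in filenames:
        match = RST_PATTERN.match(name)
        if match and int(match.group(1)) == domain:
            times.append(datetime.strptime(match.group(2), "%Y-%m-%d_%H:%M:%S"))
    if not times:
        raise ValueError("no restart file found for this domain")
    latest = max(times)
    values = [
        ("start_year", f"{latest.year}"),
        ("start_month", f"{latest.month:02d}"),
        ("start_day", f"{latest.day:02d}"),
        ("start_hour", f"{latest.hour:02d}"),
        ("start_minute", f"{latest.minute:02d}"),
        ("start_second", f"{latest.second:02d}"),
        ("restart", ".true."),
    ]
    return [f"{key} = {value}," for key, value in values]

# Hypothetical directory listing; in practice this could come from os.listdir().
files = [
    "wrfrst_d01_2023-04-06_00:00:00",
    "wrfrst_d01_2023-04-07_00:00:00",
    "wrfout_d01_2023-04-01_00:00:00",
]
for line in restart_settings(files):
    print(line)
```

The end_* entries and restart_interval can stay as they were in the first run; only the start time and the restart flag change.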
 
Hi Kwerner,

Is the restart interval measured in actual running (wallclock) time on my system, or in simulation time, e.g. the first 24 hours of my simulation period?

Best,
Stella
 
The restart_interval in the namelist is measured in simulation time, not wallclock time: a restart file is written each time the declared amount of simulation time (in minutes) has passed.
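To make the distinction concrete, here is a small Python sketch that converts a wallclock limit into simulation time and counts how many restart files would be written. The throughput figure (simulation minutes completed per wallclock hour) is an invented number you would have to measure on your own system, and the estimate ignores startup and I/O overhead.

```python
def restart_files_before_limit(wallclock_limit_hr, sim_min_per_wallclock_hr,
                               restart_interval_min):
    """Estimate how many restart files are written before the wallclock limit."""
    # Simulation time reached when the job is stopped by the scheduler.
    simulated_min = wallclock_limit_hr * sim_min_per_wallclock_hr
    return int(simulated_min // restart_interval_min)

# Hypothetical numbers: a 48-hour limit and 195 simulation minutes per
# wallclock hour give 9360 simulation minutes (6.5 days) per job.
for interval in (1440, 7200, 9360, 10080):
    count = restart_files_before_limit(48, 195, interval)
    print(f"restart_interval = {interval}: {count} restart file(s)")
```

An interval equal to the full expected simulation time (9360 here) leaves no margin: if the run is slightly slower than estimated, no restart file is written at all, so a shorter interval such as 1440 is the safer choice.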
 