fail to run a job completely


Dear Office,

I ran a winter case, but the run stopped at 2021-01-21_04:00:00, although its output should cover 2021-01-18_00:00:00 to 2021-01-21_23:00:00. Could you please let me know what might be wrong?

FYI: another similar job (a summer case with the same configuration) runs to completion.

Best wishes,


  • namelist.input
    4.4 KB
  • namelist.wps
    1.7 KB
Hi Stella,
Can you please attach the output/error files from the job (e.g., rsl.error.*)? If there are multiple rsl files, please package them in a single .tar file (not .rar - we cannot open that format type) and attach it as a single file. Thanks!
Dear Colleagues,

Please see the attached files.

Best wishes,


  • rsl.error.tar
    3.2 MB
  • rsl.out.tar
    2.8 MB
Hi Stella,
The rsl files do not give any real indication of why your run stopped; however, I see a few problems with your simulation.
1) Your domains are way too small. We advise that your domains (e_we x e_sn) not be any smaller than 100x100. Otherwise the simulation is likely going to be unreasonable.
2) I notice your d01 resolution is very coarse. What type of input data are you using? Most input data are fairly fine in resolution these days, usually no coarser than about 1 degree. The d01 grid should be finer than the input data, and the ratio of input-data resolution to dx should be no more than about 7:1. So if you had 1 degree input data (which is about 110 km), it would be reasonable to have dx = 18000, and since you're using a 3:1 grid ratio, the full setting would be dx = 18000, 6000, 2000.
3) Your time_step (in seconds) should be no more than 6xDX (with DX in km). You are already within that limit, but your time_step is incredibly small compared to your current dx. I suppose this may be because you were unable to get it to run otherwise, given the small number of grid points in each direction, and a large number of vertical levels?
4) You're using 16 processors, which is too many for the small size of your domain. Take a look at this FAQ that discusses how to choose a reasonable number of processors.
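
The first three rules of thumb above can be turned into a quick sanity check. This is only a minimal sketch in Python, not part of WRF: the function name `check_domain` and the example grid sizes and time steps are hypothetical, and the thresholds (100x100 points, 7:1 ratio, 6xDX with DX in km) are simply the ones quoted in this reply.

```python
def check_domain(e_we, e_sn, dx_m, input_res_m, time_step_s):
    """Return warnings for one outer domain, using the rules of thumb in this thread.

    dx_m and input_res_m are in metres, time_step_s is in seconds.
    """
    warnings = []

    # Rule 1: domains should not be smaller than about 100 x 100 grid points.
    if e_we < 100 or e_sn < 100:
        warnings.append(f"domain {e_we}x{e_sn} is smaller than 100x100")

    # Rule 2: d01 should be finer than the input data, by no more than about 7:1.
    if dx_m > input_res_m:
        warnings.append("d01 is coarser than the input data")
    elif input_res_m / dx_m > 7:
        warnings.append("input data to dx ratio exceeds about 7:1")

    # Rule 3: time_step (s) should be no more than 6 x DX (km).
    if time_step_s > 6 * dx_m / 1000:
        warnings.append("time_step is larger than 6 x DX (km)")

    return warnings


# Hypothetical example: 1 degree (~110 km) input data with dx = 18000 m.
print(check_domain(e_we=120, e_sn=120, dx_m=18000, input_res_m=110000, time_step_s=108))

# Hypothetical example: a small, coarse d01 driven by finer input data.
print(check_domain(e_we=60, e_sn=60, dx_m=40500, input_res_m=31000, time_step_s=30))
```

The first call should produce no warnings, while the second flags both the small domain and the d01 grid being coarser than its input data.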

I would advise taking a look at this web page, which provides best practice suggestions for setting up your domain. You may also later want to refer to the similar page for the WRF namelist. Additionally, you should make your domains larger, check the resolution of your input data against the dx/dy in your namelist, and start with a time_step of 6xDX. You likely don't need to use adaptive_time_step unless you are forced to run with a smaller time_step and it's taking too long to run. And make sure you're using an appropriate number of processors.
Dear Office,

Many thanks for your detailed reply. I have two questions now:

1. My input data is the European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis ERA5 dataset, with high temporal (hourly) and spatial (31 km) resolution. But the resolution of my domain1 is 40.5 km, which is coarser than the input data. How can I adjust the settings so that I can simulate dx = 40500, 13500, 4500? (domain3 must be 4500 m and the ratio must be 3:1)

2. You mentioned that "I suppose this may be because you were unable to get it to run otherwise, given the small number of grid points in each direction, and a large number of vertical levels?" Could you please kindly further explain the meaning of "given the small number of grid points in each direction, and a large number of vertical levels"?

Looking forward to hearing from you.

Best wishes,
Hi Stella,

To answer your questions,
1) Since your input data is 31 km, you don't need the outer domain you're using. You can simply run this simulation with 2 domains: dx = 13500, 4500.
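
To see why dropping the outer domain works, here is a rough sketch of the arithmetic in Python. The helper `nest_dx` is hypothetical (it is not a WRF tool); it only derives each nest's dx from a fixed parent grid ratio and compares the outermost dx with the 31 km input data.

```python
def nest_dx(outer_dx_m, parent_grid_ratio, n_domains):
    """Return dx (in metres) for each domain, given a fixed nesting ratio."""
    dx = [outer_dx_m]
    for _ in range(n_domains - 1):
        if dx[-1] % parent_grid_ratio != 0:
            raise ValueError("dx is not evenly divisible by the grid ratio")
        dx.append(dx[-1] // parent_grid_ratio)
    return dx


input_res_m = 31000  # ERA5 resolution quoted in this thread (31 km)

three_domains = nest_dx(40500, 3, 3)  # the original setup
two_domains = nest_dx(13500, 3, 2)    # the setup suggested above

for name, dx in (("3 domains", three_domains), ("2 domains", two_domains)):
    finer_than_input = dx[0] < input_res_m
    print(name, dx, "d01 finer than input data:", finer_than_input)

# Upper limit for time_step (s) from the 6 x DX (km) rule, for the new d01.
max_time_step = 6 * two_domains[0] / 1000
print("max time_step for d01:", max_time_step)
```

With two domains the innermost grid is still 4500 m at a 3:1 ratio, the new d01 (13.5 km) is finer than the 31 km input, and the input-to-dx ratio of roughly 2.3:1 stays well inside the 7:1 guideline.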

2) I may have misspoken regarding this. Upon further thought, I'm not certain that having too many vertical levels can actually make the model not run. However, I do know that when dz > dx, if the levels are all equal, the results can be worse, based on previous testing. Honestly, your number of vertical levels is not the problem. The small domains are a much bigger issue.