WRF takes hours to write large wrfrst files (needed every hour for data assimilation)

DCC364

New member
Hello everyone,

I’m trying to generate restart files every hour because I need them for hourly data assimilation. Unfortunately, WRF takes an extremely long time to write the wrfrst files: it sometimes appears to hang for hours when the restart output is due.

My configuration:

High-resolution domains:
  • d01 = 828 × 772 × 90
  • d02 = 1501 × 1309 × 90 (1 km, nested in the 3 km parent)
  • WRF version: 4.5
  • Run: dmpar, 1024 MPI ranks (8 nodes × 128 ranks per node)
  • Namelist: restart_interval = 60, io_form_restart = 2 (classic netCDF; io_form_restart = 11, i.e. pnetcdf, produces no output at all)

More details from namelist.input:
&time_control
run_days = 0,
run_hours = 1,
run_minutes = 0,
run_seconds = 0,
start_year = 2014, 2014,
start_month = 11, 11,
start_day = 07, 07,
start_hour = 00, 00,
start_minute = 00, 00,
start_second = 00, 00,
end_year = 2014, 2014,
end_month = 11, 11,
end_day = 07, 07,
end_hour = 01, 01,
end_minute = 00, 00,
end_second = 00, 00,
interval_seconds = 10800,
input_from_file = .true., .true.,
fine_input_stream = 0, 0,
history_interval = 60, 60,
frames_per_outfile = 1, 1,
write_hist_at_0h_rst = .true.,
restart = .false.,
restart_interval = 60,
iofields_filename = "fields_NO.txt", "fields_NO.txt",
io_form_history = 2,
io_form_restart = 2,
io_form_input = 2,
io_form_boundary = 2,
debug_level = 0,
nwp_diagnostics = 0,
/


&domains
time_step = 9,
time_step_fract_num = 0,
time_step_fract_den = 1,
max_dom = 2,
e_we = 828, 1501,
e_sn = 772, 1309,
e_vert = 90, 90,
p_top_requested = 1000,
num_metgrid_levels = 10,
num_metgrid_soil_levels = 4,
dx = 3000,
dy = 3000,
grid_id = 1, 2,
parent_id = 1, 1,
i_parent_start = 1, 164,
j_parent_start = 1, 151,
parent_grid_ratio = 1, 3,
parent_time_step_ratio = 1, 3,
feedback = 1,
smooth_option = 0,
/


&dynamics
w_damping = 1,
diff_opt = 1, 1,
km_opt = 4, 4,
use_theta_m = 0,
diff_6th_opt = 0, 0,
diff_6th_factor = 0.12, 0.12,
base_temp = 290.,
damp_opt = 3,
zdamp = 5000., 5000.,
dampcoef = 0.33, 0.4,
khdif = 0, 0,
kvdif = 0, 0,
smdiv = 0.1,
emdiv = 0.01,
epssm = 0.5, 0.5,
non_hydrostatic = .true., .true.,
moist_adv_opt = 1, 1,
scalar_adv_opt = 1, 1,
/

&bdy_control
spec_bdy_width = 5,
spec_zone = 1,
relax_zone = 4,
specified = .true., .false.,
nested = .false., .true.,
/

&grib2
/

&namelist_quilt
/

When the model reaches the restart time, the file wrfrst_d02_* appears on disk and quickly grows past 10–15 GB, but then WRF stalls for a very long time (sometimes hours) before the write completes.

I need a single restart file per domain (not one per processor) because of the data assimilation system I’m using, so I cannot switch to distributed restart files.

Has anyone experienced the same issue, i.e. very large or extremely slow restart writes in high-resolution runs?
Any suggestions for making the write faster or more stable (e.g., NetCDF-4, quilting, or other I/O options) would be very helpful.
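
For reference, my understanding is that quilting would be configured through the &namelist_quilt section, which is currently empty in my setup. A sketch of what I would try is below; the task counts are an untested guess on my part:

&namelist_quilt
nio_tasks_per_group = 16,   ! dedicated I/O server tasks per group (guess, untested)
nio_groups = 2,
/

If I understand the documentation correctly, these I/O tasks are added on top of the compute decomposition, so the total MPI task count would have to grow from 1024 to 1024 + 2 × 16 = 1056.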

Thank you very much,
Diego
 
With parent and child grids this large, restart output will indeed take a long time.

You could try the option io_form_restart = 102, which splits the restart output into one file per MPI task and makes the write much faster. However, you will then need to run the JOINER program to combine these split wrfrst files back into a single file per domain. Please download the JOINER program from the WRF Download page.
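
For reference, this is a one-line change in the namelist; only the restart stream needs to switch, and history output can stay on classic netCDF:

&time_control
io_form_restart = 102,   ! split restart: one wrfrst piece per MPI task
io_form_history = 2,     ! history output unchanged
/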

Note that the JOINER program was created years ago, and I am not sure whether it works correctly with output from newer versions of WRF, but I think it is worth trying.
 