Hello,
I started using WRF a few months ago and have been focusing on two-way nesting. Preprocessing works without issues up to real.exe, and I have wrfinput files and met files for both domains. The coarse grid is 24 km with a parent grid ratio of 3. My parent domain is 320x360, whereas my nested domain is 352x364. Following posts from multiple forums, I have followed the guidelines for nested runs by keeping a thick buffer zone of around 1/3 of the distance in both directions.
Using SLURM with OpenMPI, I found that WRF runs absolutely fine with 64 processes, but on the MPI cluster it hangs with a higher number of processes without showing any errors. Many people have resolved this issue by changing either the number of processes or increasing the inner domain size. I have tried multiple configurations, but WRF hangs whenever I use more than 64 processes. It would be really helpful to get a hint on why this happens. I have tried setting the debug level to 9999, but still no errors show up.
rsl.error.0000:
Ntasks in X 8 , ntasks in Y 16
Setting blank km_opt entries to domain #1 values.
--> The km_opt entry in the namelist.input is now max_domains.
Setting blank diff_opt entries to domain #1 values.
--> The diff_opt entry in the namelist.input is now max_domains.
--- WARNING: traj_opt is zero, but num_traj is not zero; setting num_traj to zero.
--- NOTE: grid_fdda is 0 for domain 1, setting gfdda interval and ending time to 0 for that domain.
--- NOTE: both grid_sfdda and pxlsm_soil_nudge are 0 for domain 1, setting sgfdda interval and ending time to 0 for that domain.
--- NOTE: obs_nudge_opt is 0 for domain 1, setting obs nudging interval and ending time to 0 for that domain.
--- NOTE: grid_fdda is 0 for domain 2, setting gfdda interval and ending time to 0 for that domain.
--- NOTE: both grid_sfdda and pxlsm_soil_nudge are 0 for domain 2, setting sgfdda interval and ending time to 0 for that domain.
--- NOTE: obs_nudge_opt is 0 for domain 2, setting obs nudging interval and ending time to 0 for that domain.
--- NOTE: bl_pbl_physics /= 4, implies mfshconv must be 0, resetting
Need MYNN PBL for icloud_bl = 1, resetting to 0
--- NOTE: RRTMG radiation is not used, setting: o3input=0 to avoid data pre-processing
--- NOTE: num_soil_layers has been set to 4
WRF V3.8.1 MODEL
*************************************
Parent domain
ids,ide,jds,jde 1 320 1 360
ims,ime,jms,jme 274 325 331 365
ips,ipe,jps,jpe 281 320 338 360
*************************************
DYNAMICS OPTION: Eulerian Mass Coordinate
alloc_space_field: domain 1 , 71040788 bytes allocated
med_initialdata_input: calling input_input
Max map factor in domain 1 = 1.57. Scale the dt in the model accordingly.
INPUT LandUse = "MODIFIED_IGBP_MODIS_NOAH"
LANDUSE TYPE = "MODIFIED_IGBP_MODIS_NOAH" FOUND 33 CATEGORIES 2 SEASONS WATER CATEGORY = 17 SNOW CATEGORY = 15
Climatological albedo is used instead of table values
*************************************
Nesting domain
ids,ide,jds,jde 1 352 1 364
ims,ime,jms,jme 299 357 332 369
ips,ipe,jps,jpe 309 352 342 364
INTERMEDIATE domain
ids,ide,jds,jde 103 225 113 239
ims,ime,jms,jme 198 230 219 244
ips,ipe,jps,jpe 208 227 229 241
*************************************
alloc_space_field: domain 2 , 7886736 bytes allocated
alloc_space_field: domain 2 , 85629804 bytes allocated
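As a rough cross-check of the decomposition reported in the log above, the per-task tile size can be estimated by dividing each domain dimension by the task counts. This is only a sketch: it assumes an even split, which is not necessarily how WRF assigns patches (the function names are hypothetical, and the 8 x 8 layout for 64 processes is an assumption, since the log only shows 8 x 16).

```python
def tile_sizes(n_points, n_tasks):
    """Split n_points grid points across n_tasks as evenly as possible.

    Assumption: an even split. WRF's own patch assignment may differ.
    """
    base, extra = divmod(n_points, n_tasks)
    return [base + 1 if t < extra else base for t in range(n_tasks)]


def smallest_tile(e_we, e_sn, ntasks_x, ntasks_y):
    """Return the smallest (x, y) tile extent for a domain and task layout."""
    return min(tile_sizes(e_we, ntasks_x)), min(tile_sizes(e_sn, ntasks_y))


# Domain sizes from namelist.input (e_we, e_sn).
domains = {"d01": (320, 360), "d02": (352, 364)}

# 8 x 16 comes from the log; 8 x 8 is an assumed layout for 64 processes.
layouts = [(8, 8), (8, 16)]

for ntasks_x, ntasks_y in layouts:
    for name, (e_we, e_sn) in domains.items():
        nx, ny = smallest_tile(e_we, e_sn, ntasks_x, ntasks_y)
        print(f"{ntasks_x}x{ntasks_y} tasks, {name}: smallest tile ~{nx} x {ny} points")
```

With the 8 x 16 layout the estimated tiles are roughly half as tall in Y as with 8 x 8, which is the quantity that changes when the process count goes from 64 to 128.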
namelist.input:
&time_control
run_days = 10,
run_hours = 0,
run_minutes = 0,
run_seconds = 0,
start_year = 2011, 2011,
start_month = 07, 07,
start_day = 01, 01,
start_hour = 12, 12,
start_minute = 00, 00,
start_second = 00, 00,
end_year = 2011, 2011,
end_month = 07, 07,
end_day = 11, 11,
end_hour = 12, 12,
end_minute = 00, 00,
end_second = 00, 00,
auxinput4_inname = "wrflowinp_d<domain>",
auxinput4_interval = 360, 360,
io_form_auxinput4 = 2,
interval_seconds = 21600,
input_from_file = .true.,.true.,
history_interval = 180, 180,
frames_per_outfile = 1, 1,
restart = .false.,
restart_interval = 720,
io_form_history = 2,
io_form_restart = 2,
io_form_input = 2,
io_form_boundary = 2,
debug_level = 0
/
&domains
time_step = 144,
time_step_fract_num = 0,
time_step_fract_den = 1,
max_dom = 2,
use_adaptive_time_step = .true.,
step_to_output_time = .true.,
target_cfl = 1.2, 1.2,
max_step_increase_pct = 5, 51,
starting_time_step = -1, -1
max_time_step = -1, -1
min_time_step = -1, -1
adaptation_domain = 1,
e_we = 320, 352,
e_sn = 360, 364,
e_vert = 50, 50,
p_top_requested = 1000,
num_metgrid_levels = 38,
num_metgrid_soil_levels = 4,
dx = 24000, 8000,
dy = 24000, 8000,
grid_id = 1, 2,
parent_id = 1, 1,
i_parent_start = 1, 105,
j_parent_start = 1, 115,
parent_grid_ratio = 1, 3,
parent_time_step_ratio = 1, 3,
feedback = 1,
smooth_option = 1,
/
&physics
mp_physics = 6, 6,
ra_lw_physics = 1, 1,
ra_sw_physics = 1, 1,
radt = 10, 10,
sf_sfclay_physics = 1, 1,
sf_surface_physics = 2, 2,
bl_pbl_physics = 1, 1,
bldt = 0, 0,
cu_physics = 1, 0,
cudt = 5, 0,
isfflx = 1,
ifsnow = 1,
icloud = 1,
surface_input_source = 1,
num_soil_layers = 4,
sf_urban_physics = 0, 0,
num_land_cat = 21,
/
&fdda
/
&dynamics
w_damping = 0,
diff_opt = 1,
km_opt = 4,
diff_6th_opt = 2, 2,
diff_6th_factor = 0.12, 0.12,
base_temp = 290.,
damp_opt = 1,
epssm = 0.5,
zdamp = 5000.,5000.,
dampcoef = 0.2, 0.2,
khdif = 0, 0,
kvdif = 0, 0,
non_hydrostatic = .true.,.true.,
moist_adv_opt = 1, 1,
scalar_adv_opt = 1, 1,
!time_step_sound = 4, 4,
/
&bdy_control
spec_bdy_width = 10,
spec_zone = 1,
relax_zone = 4,
spec_exp = 0.33,
specified = .true.,.false.,
nested = .false.,.true.,
/
&grib2
/
&namelist_quilt
nio_tasks_per_group = 0,
nio_groups = 1,
/
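For reference, the nest placement implied by the namelist above can be checked with a little arithmetic. The sketch below (function name hypothetical) verifies that (e_we - 1) and (e_sn - 1) of the nest are whole multiples of parent_grid_ratio and computes how many parent cells remain as a buffer on each side of the nest.

```python
def nest_extent(parent_start, e_nest, ratio):
    """Return (last parent index covered by the nest, remainder).

    The nest spans (e_nest - 1) / ratio parent cells starting at parent_start.
    A non-zero remainder means the nest size does not fit the grid ratio.
    """
    cells, remainder = divmod(e_nest - 1, ratio)
    return parent_start + cells, remainder


# Values copied from the &domains section of namelist.input.
parent_e_we, parent_e_sn = 320, 360
nest_e_we, nest_e_sn = 352, 364
i_parent_start, j_parent_start = 105, 115
ratio = 3

i_end, i_rem = nest_extent(i_parent_start, nest_e_we, ratio)
j_end, j_rem = nest_extent(j_parent_start, nest_e_sn, ratio)

# Buffer widths in parent grid cells on each side of the nest.
buffers = {
    "west": i_parent_start - 1,
    "east": parent_e_we - i_end,
    "south": j_parent_start - 1,
    "north": parent_e_sn - j_end,
}

print("nest covers parent i =", i_parent_start, "to", i_end, "remainder", i_rem)
print("nest covers parent j =", j_parent_start, "to", j_end, "remainder", j_rem)
print("buffer widths (parent cells):", buffers)
```

Both remainders come out as zero for these values, and the buffer on every side is close to a third of the parent domain, consistent with the description at the top of the post.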