
Array out of bounds in WRF-4.2.2 using OBS nudging

This post was from a previous version of the WRF&MPAS-A Support Forum. New replies have been disabled and if you have follow up questions related to this post, then please start a new thread from the forum home page.

bartbrashers

I'm using WRF-4.2.2 configured with the -D option (debug compilation), along with output from OBSGRID-3.8 (the OBSDOMAIN* files) for OBS nudging. I'm also using the metoa_em* files, FWIW. I've used a similar setup for years (without the debugging part). I am currently using debug_level = 300, and running on 16 cores (dmpar).

WRF crashes with an array out-of-bounds error:

Code:
% tail rsl.error.0006
d03 2019-12-09_12:00:00 calling inc/HALO_EM_PHYS_PBL_inline.inc
d03 2019-12-09_12:00:00 calling inc/HALO_EM_PHYS_DIFFUSION_inline.inc
d03 2019-12-09_12:00:00 calling inc/HALO_EM_TKE_5_inline.inc
d03 2019-12-09_12:00:00  call update_phy_ten
d03 2019-12-09_12:00:00 calling inc/HALO_OBS_NUDGE_inline.inc
OBS NUDGING is requested on a total of  1 domain(s).
d03 2019-12-09_12:00:00 in PSU FDDA scheme
++++++CALL ERROB AT KTAU =     0 AND INEST =  3:  NSTA =   200 ++++++
0: Subscript out of range for array ub (module_fddaobs_rtfdda.f90: 1069)
    subscript=-99, lower bound=1, upper bound=39, dimension=2

The chunk of code around line 1069 of phys/module_fddaobs_rtfdda.F was not formatted with human readers in mind:

Code:
!           U COMPONENT WIND ERROR
            ERRF(1,N)=ERRF(1,N)+uratiob*VAROBS(1,N)-((1.-DZOB)*        &
                      ((1.-DyOB)*((1.-                                 &
                      DxOB)*UB(IOB,KOB,JOB)+DxOB*UB(IOB+1,KOB,JOB)     &
                      )+DyOB*((1.-DxOB)*UB(IOB,KOB,JOB+1)+DxOB*        &
                      UB(IOB+1,KOB,JOB+1)))+DZOB*((1.-DyOB)*((1.-DxOB) &
                      *UB(IOB,KOBP,JOB)+DxOB*UB(IOB+1,KOBP,JOB))+      &
                      DyOB*((1.-DxOB)*UB(IOB,KOBP,JOB+1)+DxOB*         &
                      UB(IOB+1,KOBP,JOB+1))))

I don't think Fortran would compile or run any slower if the formatting took a few more lines and was easier to read. ;-)

I'm guessing the -99 is an initialization default value.
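To make the structure of that expression easier to follow, here is a minimal Python sketch of the same trilinear interpolation, split into two horizontal interpolations that are then blended vertically. It is not WRF code: the function names `bilinear` and `trilinear` are hypothetical, the indices are 0-based rather than Fortran's 1-based, and the explicit bounds check only stands in for what the debug build's run-time check does. If KOB or KOBP really is left at a default such as -99 (only a guess at this point), the check fails in the same way.

```python
def bilinear(field, i, k, j, dx, dy):
    """Interpolate horizontally on model level k.

    field is indexed field[i][k][j], mirroring UB(IOB,KOB,JOB).
    dx and dy are the fractional offsets (DxOB, DyOB) in the range 0..1.
    """
    row_j = (1.0 - dx) * field[i][k][j] + dx * field[i + 1][k][j]
    row_jp1 = (1.0 - dx) * field[i][k][j + 1] + dx * field[i + 1][k][j + 1]
    return (1.0 - dy) * row_j + dy * row_jp1


def trilinear(field, i, k, kp, j, dx, dy, dz):
    """Blend the horizontal interpolations on levels k and kp (KOB, KOBP)."""
    nk = len(field[0])
    for name, level in (("k", k), ("kp", kp)):
        # Python would silently wrap a negative index, so check explicitly.
        if not 0 <= level < nk:
            raise IndexError(
                f"vertical index {name}={level} outside 0..{nk - 1}"
            )
    lower = bilinear(field, i, k, j, dx, dy)
    upper = bilinear(field, i, kp, j, dx, dy)
    return (1.0 - dz) * lower + dz * upper
```

Written this way, the U-component error is just the existing ERRF term plus uratiob*VAROBS(1,N) minus the interpolated value, and a bad vertical index is reported before any array is touched.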

I would think that the OBSGRID run would exclude any bad observation values, using this chunk in namelist.oa:

Code:
&record4
 qc_test_error_max           = .TRUE.
 qc_test_buddy               = .TRUE.
 qc_test_vert_consistency    = .FALSE.
 qc_test_convective_adj      = .FALSE.
 qc_psfc                     = .TRUE.
 max_error_t                 = 5
 max_error_uv                = 7
 max_error_z                 = 4
 max_error_rh                = 25
 max_error_dewpoint          = 10
 max_error_p                 = 300
 max_buddy_t                 = 4
 max_buddy_uv                = 4
 max_buddy_z                 = 4
 max_buddy_rh                = 20
 max_buddy_dewpoint          = 10
 max_buddy_p                 = 400
 buddy_weight                = 1.0
 max_p_extend_t              = 1300
 max_p_extend_w              = 1300
/

Any hints on where to look for issues? I can't think of what to look at next.
 
Hi,
The vertical coordinate changed in V4.0. We haven't checked whether OBSGRID works correctly with WRF 4.0 and later versions.
I wonder whether you can run an older version of WRF (before V4.0) with the OBSDOMAIN* file. If so, this issue might be related to the vertical coordinate change in WRF. Otherwise, something else might be wrong.
Please keep me updated about this issue. Thanks.
 
I'm using both of these settings, so it should be backward compatible with the changes to the defaults in WRF-4.0.

Code:
 &time_control
 force_use_old_data                  = .true.,
 &dynamics
 hybrid_opt                          = 0,

Good thought, though. I will try with WRF-3.8.1 (configured with the -D debug switch) and let you know.
 
We did see failed cases with the option "force_use_old_data = true". It is not our top priority to debug this option and make it work perfectly. Instead, we always recommend that users run the same version of REAL and WRF to avoid possible problems.
 
I am using WPS-4.2; the force_use_old_data = true setting is just a habit.

I have confirmed the out-of-bounds error happens exactly the same (same thread, same time, etc.) with that option removed from the namelist.

Any more hints about how to solve this?
 
Ping! Does anyone have other suggestions for how to debug this?

Has anyone else hit an out-of-bounds error in phys/module_fddaobs_rtfdda.F?
 
I have traced this to what appears to be a bad sounding. I have debug_level = 500, so it's quite verbose. Many "pages" back, it was giving me a hint:

Code:
# grep -B5 problem rsl.error.0000
rsl.error.0000-d03 2019-12-09_12:00:00  input_wrf: end, fid =             9
rsl.error.0000-d03 2019-12-09_12:00:00 Input data processed for aux input   4 for domain   3
rsl.error.0000-OBS NUDGING: Reading new obs for time window TBACK =   -0.667 TFORWD =    0.667 for grid =  3
rsl.error.0000- opening first fdda obs file, fonc=01 inest=            3
rsl.error.0000- ifon=            1
rsl.error.0000: *** PROBLEM: sounding, p and ht undefined    64.82000       -147.8800
rsl.error.0000: *** PROBLEM: sounding, p and ht undefined    64.82000       -147.8800
rsl.error.0000: *** PROBLEM: sounding, p and ht undefined    64.82000       -147.8800
rsl.error.0000: *** PROBLEM: sounding, p and ht undefined    64.82000       -147.8800
rsl.error.0000: *** PROBLEM: sounding, p and ht undefined    64.82000       -147.8800
rsl.error.0000: *** PROBLEM: sounding, p and ht undefined    64.82000       -147.8800
rsl.error.0000: *** PROBLEM: sounding, p and ht undefined    64.82000       -147.8800
rsl.error.0000: *** PROBLEM: sounding, p and ht undefined    64.82000       -147.8800
rsl.error.0000: *** PROBLEM: sounding, p and ht undefined    64.82000       -147.8800
rsl.error.0000: *** PROBLEM: sounding, p and ht undefined    64.82000       -147.8800
rsl.error.0000: *** PROBLEM: sounding, p and ht undefined    64.82000       -147.8800

It occurs for each 5-day WRF run, for the first sounding in each.
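Since the warning is buried many pages back in very verbose logs, a small script can summarize it across all rsl files. The following is a rough Python sketch, assuming the warning text always matches the lines shown above. The function name `count_problem_soundings` is hypothetical and nothing here comes from WRF itself.

```python
import glob
import re
from collections import Counter

# Matches the warning text seen in rsl.error.0000 and captures lat and lon.
PROBLEM_RE = re.compile(
    r"\*\*\* PROBLEM: sounding, p and ht undefined"
    r"\s+(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)"
)


def count_problem_soundings(lines):
    """Count warnings per (lat, lon) pair in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = PROBLEM_RE.search(line)
        if match:
            lat, lon = float(match.group(1)), float(match.group(2))
            counts[(lat, lon)] += 1
    return counts


if __name__ == "__main__":
    # Assumes the rsl.error.* files are in the current directory.
    for path in sorted(glob.glob("rsl.error.*")):
        with open(path, errors="replace") as handle:
            for (lat, lon), n in count_problem_soundings(handle).items():
                print(f"{path}: {n} warning(s) at lat={lat} lon={lon}")
```

If every warning points at the same lat-lon, that narrows the search to a single station in the OBSDOMAIN* file.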

I looked in the OBSDOMAIN301 file and see a sounding at that lat-lon. Here are the top few lines:

Code:
 20191209120000
    64.8200 -147.8800
                                             MADIS
  FM-35 TEMP                              134.     T     F    194
   99200.000       0.000     134.000       0.000     261.050       0.000 -888888.000 -888888.000 -888888.000 -888888.000      89.413       0.000
   98900.000 -888888.000     157.176 -888888.000     263.050 -888888.000 -888888.000 -888888.000 -888888.000 -888888.000      94.652 -888888.000
   98800.000 -888888.000     164.978 -888888.000     265.250 -888888.000 -888888.000 -888888.000 -888888.000 -888888.000      91.846 -888888.000
   98500.000 -888888.000     188.594 -888888.000     266.650 -888888.000 -888888.000 -888888.000 -888888.000 -888888.000      85.104 -888888.000
   98300.000 -888888.000     204.438 -888888.000     267.250 -888888.000 -888888.000 -888888.000 -888888.000 -888888.000      83.862 -888888.000
   98000.000 -888888.000     228.281 -888888.000     267.050 -888888.000 -888888.000 -888888.000 -888888.000 -888888.000      85.805 -888888.000
   97500.000     256.000     268.228     256.000     267.716   16640.000 -888888.000 -888888.000 -888888.000 -888888.000      84.781   16640.000
   97400.000 -888888.000     276.241 -888888.000     267.850 -888888.000 -888888.000 -888888.000 -888888.000 -888888.000      84.576 -888888.000
   97046.359 -888888.000     304.796 -888888.000 -888888.000 -888888.000      -9.156 -888888.000      -1.632 -888888.000 -888888.000 -888888.000
   96900.000 -888888.000     316.645 -888888.000     269.850 -888888.000 -888888.000 -888888.000 -888888.000 -888888.000      76.810 -888888.000
   96500.000 -888888.000     349.250 -888888.000     270.050 -888888.000 -888888.000 -888888.000 -888888.000 -888888.000      74.528 -888888.000
   95800.000 -888888.000     406.787 -888888.000     271.250 -888888.000 -888888.000 -888888.000 -888888.000 -888888.000      68.721 -888888.000
   95400.000 -888888.000     440.025 -888888.000     272.850 -888888.000 -888888.000 -888888.000 -888888.000 -888888.000      59.282 -888888.000
   95000.000       0.000     473.512       0.000     273.050   16384.000     -13.791   16640.000      -1.590   16640.000      54.940   16384.000
   93900.000 -888888.000     566.370 -888888.000     273.050 -888888.000 -888888.000 -888888.000 -888888.000 -888888.000      54.940 -888888.000
   93393.367 -888888.000     609.592 -888888.000 -888888.000 -888888.000     -17.431 -888888.000      -1.557 -888888.000 -888888.000 -888888.000
   93200.000 -888888.000     626.151 -888888.000     274.150 -888888.000 -888888.000 -888888.000 -888888.000 -888888.000      43.774 -888888.000
   92500.000       0.000     692.959       0.000     273.750   16384.000     -16.440   16384.000       1.408   16384.000      43.666   16384.000

I don't see a problem with this sounding. It looks a lot like soundings from other projects in other OBSDOMAIN* files, where I have also used MYNN2.5.
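The excerpt only shows the first 18 of the 194 levels, so one way to test the "p and ht undefined" message against the whole sounding is to scan every level line. Here is a minimal Python sketch. It assumes each level line holds 12 numbers in the order pressure, QC, height, QC, temperature, QC, u, QC, v, QC, RH, QC (inferred from the excerpt above, not checked against the WRF source) and that -888888 marks a missing value. The function names are hypothetical.

```python
MISSING = -888888.0  # missing-value flag as it appears in the excerpt
FIELDS = ("pressure", "height", "temperature", "u", "v", "rh")


def parse_level(line):
    """Split one level line into its six data values, skipping QC flags."""
    values = [float(token) for token in line.split()]
    if len(values) != 12:
        raise ValueError(f"expected 12 numbers, got {len(values)}")
    # Data values sit at even positions; QC flags at odd positions.
    return {name: values[2 * n] for n, name in enumerate(FIELDS)}


def levels_missing_p_and_ht(level_lines):
    """Return 1-based numbers of levels where pressure and height are both missing."""
    bad = []
    for number, line in enumerate(level_lines, start=1):
        level = parse_level(line)
        if level["pressure"] == MISSING and level["height"] == MISSING:
            bad.append(number)
    return bad
```

Passing the 194 level lines of this sounding to `levels_missing_p_and_ht` would show whether any level beyond the ones pasted above lacks both pressure and height.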

Any hints as to where I should look next?

Bart
 
Hello bartbrashers, and anyone else who has an idea about this issue,
I got a similar error message, shown below, when I ran WRFDA 4DVAR (WRF4.3) (da_wrfvar.exe) using the test data at https://www2.mmm.ucar.edu/wrf/users/wrfda/download/testdata.html.
I used the -D option to configure -D WRFPLUS... I am wondering whether you have solved the problem or made any progress on fixing it.
Thank you.
-----
d01 2008-02-05_12:00:00 CAM-CLWRF co2vmr: 3.7900000000000000E-004 n2ovmr: 3.1899999999999998E-007 ch4vmr: 1.7740000000000001E-006
0: Subscript out of range for array totplnk (module_ra_rrtm.f90: 6381)
subscript=0, lower bound=1, upper bound=181, dimension=1