
WRF simulation automatically quit when using NARR data from 2021-08-18 to 2021-08-28

zoujw

New member
Hi,

Thanks for taking the time to read about my problem. I am writing to ask about an issue we have come across when using NARR data for WRF simulations.

We are extracting NARR input for a WRF simulation of Montreal from NCAR RDA dataset ds608.0. We found that NARR input for one period, July to August 2021, cannot be processed by WRF. We previously ran WRF simulations for Montreal using NARR data for other years, and those cases worked well.

For this period, when we run real.exe, it stops automatically after about 30 s with no error message reported. The output from rsl.error.0000 is shown below for your reference.

We would greatly appreciate it if anyone could let us know the possible cause and what we should do.

Thank you so much!

Kind regards,

Jiwei

rsl.error.0000
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
taskid: 0 hostname: cdr783.int.cedar.computecanada.ca

module_io_quilt_old.F 2931 T

Ntasks in X 2 , ntasks in Y 2

Domain # 1: dx = 9000.000 m

Domain # 2: dx = 3000.000 m

Domain # 3: dx = 1000.000 m

REAL_EM V4.3.3 PREPROCESSOR

git commit acb21e7b6800a9db3928bfeceac98aff1e94dd82 21 files changed, 2312 deletions(-)

*************************************

Parent domain

ids,ide,jds,jde 1 276 1 296

ims,ime,jms,jme -4 145 -4 155

ips,ipe,jps,jpe 1 138 1 148

*************************************

DYNAMICS OPTION: Eulerian Mass Coordinate

alloc_space_field: domain 1 , 2857172452 bytes allocated

d01 2021-06-25_00:00:00 Yes, this special data is acceptable to use: OUTPUT FROM METGRID V4.2

d01 2021-06-25_00:00:00 Input data is acceptable to use: met_em.d01.2021-06-25_00:00:00.nc

metgrid input_wrf.F first_date_input = 2021-06-25_00:00:00

metgrid input_wrf.F first_date_nml = 2021-06-25_00:00:00

d01 2021-06-25_00:00:00 Timing for input 2 s.

d01 2021-06-25_00:00:00 flag_soil_layers read from met_em file is 1

Max map factor in domain 1 = 1.02. Scale the dt in the model accordingly.

Using sfcprs to compute psfc

d01 2021-06-25_00:00:00 No average surface temperature for use with inland lakes

Assume Noah LSM input

d01 2021-06-25_00:00:00 forcing artificial silty clay loam at 173 points, out of 20424
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 

Attachments

  • namelist.wps
    1.1 KB · Views: 6
  • namelist.input
    5 KB · Views: 3
Hi,
For the dates/times you've been able to use NARR data successfully, was everything else about the simulation the same as in this failing one? I.e., are you using the exact same namelist, apart from the dates/times (same domain resolution and sizes, same static data fields, same physics, same version of WPS/WRF, same compiler, libraries, etc.)? If there are differences between the two, can you also attach a namelist.input file for a case that runs without problems? Will you also please package all of your error/output files (e.g., rsl.*) for the failed real.exe process into a single *.tar file and attach that as well? Thanks!
 
Hi Kwerner,

Thanks for your attention and reply!

Yes, we are using exactly the same settings in WPS and WRF for both the failing and successful cases; the only difference is the time period selected in namelist.wps and namelist.input.

I have attached a zip file containing all rsl.error and rsl.out files for the failing case. Please have a look!

Thank you so much for your kind help!

Kind regards,
Jiwei
 

Attachments

  • rsl.files.zip
    9.6 KB · Views: 2
Hi Jiwei,
Thank you for that information. I tested this, using a May 2021 case, and then an August 2021 case, and I get the same issue you are seeing - the May case has no issues in real.exe, while the August case does. The error message I find in your rsl.error.0002 is

Code:
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE:  <stdin>  LINE:    3012
grid%tsk unreasonable
-------------------------------------------

I would advise reaching out to RDA support to see if there were any changes to the data during that time, or if they are aware of any known issues, since this seems to be specific to the data time period. If you find an answer or solution, and if you don't mind, will you post it here for anyone else who may run into the same problem? Thanks, and good luck!
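Because rsl.error.0000 showed no error while the fatal message was in rsl.error.0002, it can help to scan the log of every MPI task rather than only task 0. A minimal sketch, assuming all rsl.error.* files sit in one run directory; the function name and its defaults are illustrative and not part of WRF:

```python
from pathlib import Path


def find_fatal_messages(run_dir, marker="FATAL CALLED FROM FILE", context=1):
    """Return {file name: [marker line, following lines]} for each rsl.error.* file
    in run_dir that contains the marker. Only the first hit per file is kept."""
    hits = {}
    for path in sorted(Path(run_dir).glob("rsl.error.*")):
        lines = path.read_text(errors="replace").splitlines()
        for idx, line in enumerate(lines):
            if marker in line:
                # Keep the marker line plus `context` lines after it,
                # which is where the reason (e.g. "grid%tsk unreasonable") appears.
                hits[path.name] = lines[idx: idx + 1 + context]
                break
    return hits


if __name__ == "__main__":
    for name, message in find_fatal_messages(".").items():
        print(name)
        for line in message:
            print("   ", line)
```

Run from the directory where real.exe was executed, this lists only the tasks that hit a fatal error, together with the line that states the reason.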
 
Hi Kwerner,

Thanks for letting me know that the May case works. Would you mind telling me the exact time period you tested in 2021? For my current research, it is possible for me to use the May data instead.

In the meantime, I have reached out to RDA support and reported the same issue, but they did not find any differences in how the NARR input was generated. They also told me I am the only person who has reported this issue... I will keep them updated and see if they can solve it.

Thank you so much for your kind help!

Kind regards,
Jiwei
 
Hi Kwerner,

Hope you are doing well!

I just received a reply from the UCAR service desk, which is in charge of the NARR data. Please see their reply below:
_________________________________________________________________________________________________
Chi-Fan Shih commented:

Without having some ideas of what variables to check, I don't know where to start with.
The input data apparently passed 'ungrib' and 'metgrid' steps. The fatal error reported by real.exe points the cause to the unreasonable data values. Are there documentation about the 'reasonable' values of variables?
_________________________________________________________________________________________________

Do you have any experience with, or ideas about, the 'documentation about the reasonable values of variables' mentioned above?

Looking forward to your reply!

Kind regards,
Jiwei
 
Jiwei,

In the dates you are using (that are failing), I get this message:

Code:
 error in the grid%tsk
 i,j=         144         215
 grid%landmask=  0.0000000E+00
 grid%tsk, grid%sst, grid%tmn=   160.0573      0.0000000E+00   160.0573
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE:  <stdin>  LINE:    3110
grid%tsk unreasonable
-------------------------------------------

So it looks like skin temperature is the issue.

The date I tested that worked okay for me was 2021-05-01.

Just so you know, I will be out of the office until Jan 8th, so if you need help during that time, you may need to start a new thread so that my colleague(s) will see it. You can point to this thread if you don't want to explain everything again. Otherwise, I'll look at this again when I return.
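For anyone who wants to locate such points before running real.exe, here is a minimal sketch that flags skin-temperature values outside a plausible range in a 2-D array. The 170 K and 400 K bounds are an assumption based on my recollection of the check in real.exe's source (module_initialize_real.F); please confirm them against your WRF version. The function name is illustrative only.

```python
import numpy as np

# Assumed bounds (K); verify against module_initialize_real.F in your WRF version.
TSK_MIN_K = 170.0
TSK_MAX_K = 400.0


def find_unreasonable_points(field, lower=TSK_MIN_K, upper=TSK_MAX_K):
    """Return a list of (j, i, value) for every unmasked point of a 2-D field
    whose value lies outside [lower, upper]. Indices are zero-based."""
    values = np.ma.asarray(field, dtype=float)
    # Masked (missing) points are treated as "not bad" so only real values are flagged.
    bad = np.ma.filled((values < lower) | (values > upper), False)
    return [(int(j), int(i), float(values[j, i])) for j, i in np.argwhere(bad)]


if __name__ == "__main__":
    demo = np.full((4, 5), 290.0)
    demo[2, 3] = 160.0573  # the value reported in the message above
    for j, i, value in find_unreasonable_points(demo):
        print(f"j={j} i={i} value={value}")
```

The array could come from the SKINTEMP variable of a met_em file (for example read with the netCDF4 package, which is an assumption about your setup). Note that the indices returned here are zero-based, while the i,j printed by real.exe are one-based.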
 
Good afternoon,

Thank you for your attention to this problem. I am encountering this for runs in 2021 and in 2022. I have produced four test runs of real.exe and these are the results:

2021-05-01: success,
2021-09-01: failed with grid%tsk unreasonable error
2022-05-01: success,
2022-09-01: failed with grid%tsk unreasonable error

For the record, I am also using the NARR ds608.0 dataset for my initial and boundary conditions, but I am running WRF-GC (WRF coupled with GEOS-Chem), which uses WRF v3.9.1. I did not change anything about my model configuration between these four runs except for the date. Has there been any update regarding this problem?

For your reference, I've attached my namelist and the rsl files for each of the four runs. I would appreciate any advice!

Thanks so much,
Lucas
 

Attachments

  • namelist.input
    8.2 KB · Views: 2
  • 202105.zip
    164.5 KB · Views: 0
  • 202109.zip
    90.1 KB · Views: 0
  • 202205.zip
    164 KB · Views: 0
  • 202209.zip
    91.8 KB · Views: 1
Hi Lucas,

Sadly, you are running into exactly the same problem I had. If possible, I suggest you switch your focus to a year before 2019. In my case, I tried 2018 and 2013, and both of those years work. However, as far as I have tested, the problem appears from 2019 to 2024.

I emailed NARR support, but they said I was the first user to report the issue... You may want to ask them again; I hope they have fixed the problem by now.

Hope my words will help.

Kind regards,
Jiwei
 
Hello,

I've reached out to NARR and they said they will report the issue to NCEP, but for the time being we have found a workaround.
The problem is that the grid point (161,243), located at 50.933°N, 73.727°W, has a bad skin temperature for these dates. To get around this, you can fill that point in the GRIB files with the dataset's fill value (9999). I'll attach the Python code I used to do this at the end of the message.

You will also need to go to WPS/metgrid/METGRID.TBL and add "missing_value=-1.E30" to the SKINTEMP block, like so,

Code:
========================================
name=SKINTEMP
mpas_name=skintemp
        interp_option=sixteen_pt+four_pt+wt_average_4pt+wt_average_16pt+search
        masked=both
        interp_land_mask  = LANDSEA(1)
        interp_water_mask = LANDSEA(0)
        fill_missing=0.
        missing_value=-1.E30
========================================


I believe that the WPS programs interpret missing values as -1.E30, since the fix did not work before I added this line.
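If you need to apply the METGRID.TBL change on several machines, the edit can be scripted. This is only a sketch under the assumption that the table uses the block layout shown above (blocks delimited by lines of "=" characters, one name= line per block); the function name is made up for illustration and is not part of WPS.

```python
SEPARATOR_PREFIX = "====="  # block delimiter lines in METGRID.TBL start with "="


def add_missing_value(table_text, field="SKINTEMP", value="-1.E30"):
    """Return table_text with a missing_value entry added to the block for
    `field`, unless that block already has one."""
    out = []
    in_target = False    # currently inside the block for `field`
    has_missing = False  # that block already defines missing_value
    for line in table_text.splitlines():
        compact = line.strip().replace(" ", "")
        if compact.startswith(SEPARATOR_PREFIX):
            # A separator closes the current block: add the entry just before it.
            if in_target and not has_missing:
                out.append(f"        missing_value={value}")
            in_target = False
            has_missing = False
        elif compact == f"name={field}":
            in_target = True
        elif in_target and compact.startswith("missing_value="):
            has_missing = True
        out.append(line)
    # Handle a final block that is not closed by a separator line.
    if in_target and not has_missing:
        out.append(f"        missing_value={value}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    path = "METGRID.TBL"  # adjust to your WPS/metgrid directory
    with open(path) as handle:
        patched = add_missing_value(handle.read())
    with open(path + ".patched", "w") as handle:
        handle.write(patched)
```

The script writes a separate .patched file rather than overwriting the table, so you can diff the two before replacing the original.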

After taking these steps, real.exe executed without errors. I can't run WRF at this time due to a tight schedule, but I don't think there will be any problems. Let me know if this works for you!

Here is the code I used to edit the GRIB files:
Python:
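import pygrib as pg

old = '../narr_tar/'  # directory holding the original NARR files
new = './'            # directory for the patched copies

fnames = ['merged_AWIP32.20210901.tar',
          'merged_AWIP32.20210902.tar']

for fname in fnames:
    grbs = pg.open(old + fname)
    with open(new + fname, 'wb') as new_grbs:
        for grb in grbs:
            # Temperature (parameter 11) at the surface, i.e. the skin temperature field
            if grb.parameterName == '11' and grb.typeOfLevel == 'surface':
                print(grb)
                new_values = grb.values
                # Replace the bad grid point with the message's missing value
                new_values[161, 243] = grb.missingValue
                grb['values'] = new_values
            # Write every message, patched or not, to the new file
            new_grbs.write(grb.tostring())
    grbs.close()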

Best,
Lucas
 
Hi Lucas,

This is really great news! Thanks for your message and for the code showing how to solve the problem!

Kind regards,
Jiwei
 