era5_to_int failed to run

GiVu

Hello everyone,

I'm trying to run the Python script era5_to_int to convert ERA5 model level and surface netCDF data into WPS intermediate file format, but the script fails.
I'm working with RDA datasets d633006 - 'ERA5 Reanalysis Model Level Data' and d633000 - 'ERA5 Reanalysis (0.25 Degree Latitude-Longitude Grid)'.
Each file has this name format:

SP.e5.oper.an.ml.128_134_sp.regn320sc.2020092500_2020092505.nc
SSTK.e5.oper.an.sfc.128_034_sstk.ll025sc.2020100100_2020103123.nc

I located all the relevant scripts (WPSUtil.py and fortran_io.py) in my WPS folder and checked that all the required modules were properly installed and loaded.
From the WPS folder, I call the command:

python era5_to_int.py --path 2020_ERA5complete/model_levels/era5 2020-04-01_00 2020-10-31_23

The result is the following:

datetime = 2020-04-01 00:00:00
until_datetime = 2020-10-31 23:00:00
interval_hours = 6:00:00
Processing time record 2020-04-01_00
Traceback (most recent call last):
  File "/onyx/clim/users/gvourro/AMEN/WPS-4.5/ERA5/era5_to_int.py", line 449, in <module>
    idx = find_time_index(e5filename, initdate)
  File "/onyx/clim/users/gvourro/AMEN/WPS-4.5/ERA5/era5_to_int.py", line 300, in find_time_index
    with Dataset(ncfilename) as f:
  File "src/netCDF4/_netCDF4.pyx", line 2463, in netCDF4._netCDF4.Dataset.__init__
  File "src/netCDF4/_netCDF4.pyx", line 2026, in netCDF4._netCDF4._ensure_nc_success
OSError: [Errno -74] NetCDF: Malformed URL: b''

What am I doing wrong or missing?
Can you please help me to fix the issue?

Thanks in advance for any help you can provide.
 
Hi,
Just letting you know we aren't ignoring you. I've notified the developer of this script and am waiting for a response. We will keep you posted. Thank you for your patience.
 
Hi,
Can you clone the most recent version of the era5_to_int program and try again? The developers have added new code that gives more specific error messages, and maybe those messages will help you. Please keep us posted. Thanks!
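
For context, the b'' at the end of the first traceback suggests that an empty file name reached netCDF4's Dataset, which would happen if no input file was matched. Below is a minimal sketch of the kind of check that turns this into a readable message. The helper name locate_first and its arguments are hypothetical and are not taken from era5_to_int.py.

```python
import glob
import os


def locate_first(directory, pattern):
    """Return the first file in `directory` matching `pattern`.

    Hypothetical helper: raises a descriptive error instead of handing an
    empty file name to the netCDF reader.
    """
    matches = sorted(glob.glob(os.path.join(directory, pattern)))
    if not matches:
        raise FileNotFoundError(
            f"no file matching {pattern!r} found in {directory!r}"
        )
    return matches[0]
```

With a check like this, a missing or misnamed input file is reported by name rather than as a "Malformed URL" error.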
 
Hi @kwerner,

I cloned the latest version of era5_to_int from GitHub and ran it. This time, the error suggests that the utility can't find this specific file:

file e5.oper.an.ml.0_5_0_1_0_q.regn320sc.2020040100_2020040105.nc needed for ERA5 variable Q

However, I downloaded and saved all the required variables for both model levels and surface in the same directory. In addition, if I do

ls -lh ERA5/RDA/ | grep e5.oper.an.ml.0_5_0_1_0_q

I get the list of files that the utility is supposed to read:

Q.e5.oper.an.ml.0_5_0_1_0_q.regn320sc.2020040100_2020040105.nc
Q.e5.oper.an.ml.0_5_0_1_0_q.regn320sc.2020040106_2020040111.nc
Q.e5.oper.an.ml.0_5_0_1_0_q.regn320sc.2020040112_2020040117.nc
Q.e5.oper.an.ml.0_5_0_1_0_q.regn320sc.2020040118_2020040123.nc
Q.e5.oper.an.ml.0_5_0_1_0_q.regn320sc.2020040200_2020040205.nc
 
I'm glad it's at least showing you the file it's looking for now! In this list of files, each one has a "Q." at the beginning. Can you rename those files exactly as stated in the error message (i.e., removing the leading "Q." from each one), and then see if that works better?
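
If there are many files, the rename can be scripted. Here is a minimal sketch, assuming every affected file has a short variable prefix directly in front of "e5.oper." (as in "Q.e5.oper..."). The function name strip_variable_prefix is hypothetical, and it defaults to a dry run so nothing is renamed until you ask for it.

```python
from pathlib import Path


def strip_variable_prefix(directory, dry_run=True):
    """Drop the leading '<VAR>.' from files named like 'Q.e5.oper....nc'.

    Returns the list of (old_name, new_name) pairs. With dry_run=True the
    pairs are only reported and no file is touched.
    """
    renamed = []
    # Only names with something in front of ".e5.oper." match this pattern,
    # so files that already start with "e5.oper." are left alone.
    for path in sorted(Path(directory).glob("*.e5.oper.*.nc")):
        new_name = path.name[path.name.index("e5.oper."):]
        target = path.with_name(new_name)
        if target.exists():
            continue  # never overwrite an existing file
        if not dry_run:
            path.rename(target)
        renamed.append((path.name, new_name))
    return renamed
```

Calling strip_variable_prefix("ERA5/RDA") returns the planned renames for review; calling it again with dry_run=False applies them.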
 
Hi Kelly,

As you suggested, I renamed the files and removed the "Q." at the beginning. That led to different issues related to utc_date, as reported below:

python era5_to_int.py --path ../ERA5/RDA/ 2020-04-01_00 2020-10-31_23
datetime = 2020-04-01 00:00:00
until_datetime = 2020-10-31 23:00:00
interval_hours = 6:00:00
Processing time record 2020-04-01_00
Traceback (most recent call last):
  File "/onyx/clim/users/gvourro/AMEN/WPS-4.5/era5_to_int/era5_to_int.py", line 464, in <module>
    idx = find_time_index(e5filename, initdate)
  File "/onyx/clim/users/gvourro/AMEN/WPS-4.5/era5_to_int/era5_to_int.py", line 300, in find_time_index
    utc_date = f.variables['utc_date'][:]
KeyError: 'utc_date'

So I modified the script, specifically the function find_time_index (around line 300), so that it uses the standard "time" variable instead of utc_date. I also changed the hdate variable at line 496, because it was trying to access utc_date as well, which caused the error.

After all these changes, I managed to run the script, but right now, I'm dealing with missing files 😅
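
The modified code is not shown in this thread, so the following is only a sketch of one way to look up a time index from a CF-style "time" variable instead of utc_date. It assumes the units attribute has the form "hours since 1900-01-01 00:00:00" and that the times are timezone-naive. The function names are hypothetical and are not the ones used in era5_to_int.py.

```python
from datetime import datetime, timedelta

# Length of one unit for the CF time units this sketch understands.
_STEP = {
    "seconds": timedelta(seconds=1),
    "minutes": timedelta(minutes=1),
    "hours": timedelta(hours=1),
    "days": timedelta(days=1),
}


def decode_times(values, units):
    """Convert numeric offsets to datetimes for '<unit> since <epoch>' units."""
    unit, sep, epoch_text = units.partition(" since ")
    unit = unit.strip()
    if not sep or unit not in _STEP:
        raise ValueError(f"unsupported time units: {units!r}")
    epoch = datetime.fromisoformat(epoch_text.strip())
    return [epoch + _STEP[unit] * float(v) for v in values]


def time_index_from_axis(values, units, wanted):
    """Return the position of `wanted` on the decoded time axis."""
    times = decode_times(values, units)
    try:
        return times.index(wanted)
    except ValueError:
        raise ValueError(f"{wanted} not found on the time axis") from None


def time_index_in_file(ncfilename, wanted):
    """Look up `wanted` in the 'time' variable of a netCDF file."""
    # Imported here so the helpers above work without netCDF4 installed.
    from netCDF4 import Dataset

    with Dataset(ncfilename) as f:
        time_var = f.variables["time"]
        return time_index_from_axis(time_var[:], time_var.units, wanted)
```

Keeping the decoding separate from the file access makes it easy to check the index logic on a plain list of offsets before pointing it at real ERA5 files.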
