
WPS ungrib ERROR: Grib2 file or date problem, stopping in edition_num.

scapps

Please note: This is the same issue as this post:
However, I was not allowed to reply to that thread, so I am starting a new one. The issue reported in the other thread was never resolved, because the poster decided to compile with the Intel compilers instead. I have been working with AMD to get WRF/WPS compiled. All other executables work fine; only ungrib.exe has this issue.

I have compiled WRFV452 and WPSV450 using the AOCC 4.1 compilers on an Ubuntu 22.04.2 LTS system and am getting the following error when running ungrib.exe on GFS grib2 files:

ls -lrt GRIBFILE* > log
lrwxrwxrwx 1 wrfuser wrfuser 41 Feb 9 07:09 GRIBFILE.AAA -> ../../../gfs/12z/gfs.t12z.pgrb2.0p25.f003

cp /opt/software/WRF/WPSV450/ungrib/Variable_Tables/Vtable.GFS ./Vtable

./ungrib.exe

*** Starting program ungrib.exe ***
Start_date = 2024-02-03_12:00:00 , End_date = 2024-02-08_12:00:00
output format is WPS
Path to intermediate files is ./
ERROR: Grib2 file or date problem, stopping in edition_num.
Warning: ieee_inexact is signaling
FORTRAN STOP
I have confirmed that the GFS GRIB2 file itself is fine, since it can be read on another system. I have also tried multiple GRIB2 files, with the same result.

g2print.exe gives the same error:
./g2print.exe GRIBFILE.AAA
junit = 12 gribflnm = GRIBFILE.AAA
ios = 0
There is a problem with the input file.
Perhaps it is not a Grib2 file?
Warning: ieee_inexact is signaling
Grib2 file or date problem, stopping in edition_num.

g1print recognizes it as a GRIB2 file:

./g1print.exe GRIBFILE.AAA
Copen: File = GRIBFILE.AAA
Fortran Unit = 0
UNIX File descriptor: 3
----------------------------------------------------
rec GRIB GRIB Lvl Lvl Lvl Time Fcst
Num Code name Code one two hour
----------------------------------------------------
*** stopping in gribcode ***
I was expecting a Grib1 file, but this is a Grib2 file.
Use g2print on Grib2 files
gribsize in gribcode


I have compiled the following without any issues:
zlib 1.3.1
libpng 1.6.40
Jasper 1.900.1

Could this be caused by the versions of the above libraries? If so, what versions do you recommend, keeping in mind that I am using the latest AOCC compilers?

Compilation initially fails:

make[1]: Entering directory '/opt/software/WRF/WPSV450/ungrib/src'
Makefile:90: warning: overriding recipe for target '.F.o'
../../configure.wps:106: warning: ignoring old recipe for target '.F.o'
Makefile:95: warning: overriding recipe for target '.c.o'
../../configure.wps:98: warning: ignoring old recipe for target '.c.o'
/bin/rm -f ungrib.exe
if [ -z ] ; then \
flang -o ungrib.exe misc_definitions_module.o debug_cio.o module_debug.o module_stringutil.o table.o module_datarray.o gridinfo.o new_storage.o filelist.o ungrib.o output.o rrpr.o rd_grib1.o file_delete.o datint.o rd_grib2.o \
-L./ngl -lw3 -lg2_4 \
-Wl,-rpath=/opt/software/grib2/lib -L/opt/software/grib2/lib -l:libjasper.a -l:libpng.a -l:libz.a \
-L. -lpgu ; \
else \
flang -o ungrib.exe misc_definitions_module.o debug_cio.o module_debug.o module_stringutil.o table.o module_datarray.o gridinfo.o new_storage.o filelist.o ungrib.o output.o rrpr.o rd_grib1.o file_delete.o datint.o rd_grib2.o \
./ngl/w3/libw3.a ./ngl/g2/libg2_4.a \
-Wl,-rpath=/opt/software/grib2/lib -L/opt/software/grib2/lib -l:libjasper.a -l:libpng.a -l:libz.a \
libpgu.a ; \
fi
ld.lld: error: duplicate symbol: gbytes_
>>> defined at gbytes.f:1
>>> gbytes.o:(gbytes_) in archive ./ngl/libw3.a
>>> defined at gbytesc.f:15
>>> gbytesc.o:(.text+0x40) in archive ./ngl/libg2_4.a
clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: [Makefile:19: ungrib.exe] Error 1 (ignored)
make[1]: Leaving directory '/opt/software/WRF/WPSV450/ungrib/src'


When I add "-Wl,-allow-multiple-definition" to the LDFLAGS, it compiles. From my configure.wps file:

COMPRESSION_LIBS = -Wl,-rpath=/opt/software/grib2/lib -L/opt/software/grib2/lib -l:libjasper.a -l:libpng.a -l:libz.a
COMPRESSION_INC = -I/opt/software/grib2/include
FDEFS = -DUSE_JPEG2000 -DUSE_PNG
SFC = flang
SCC = clang
DM_FC = mpif90
DM_CC = mpicc
FC = $(DM_FC)
CC = $(DM_CC)
LD = $(FC)
FFLAGS = -Mfree -Mbyteswapio -O
F77FLAGS = -Mfixed -Mbyteswapio -O
FCCOMPAT =
FCSUFFIX =
FNGFLAGS = $(FFLAGS)
LDFLAGS = -Wl,-allow-multiple-definition
CFLAGS =
CPP = /usr/bin/cpp -P -traditional
CPPFLAGS = -D_UNDERSCORE -DBYTESWAP -DLINUX -DIO_NETCDF -DBIT32 -D_MPI

I believe the problem is with the duplicate subroutines here:
WPSV450/ungrib/src/ngl/g2/gbytesc.f
gbytesc.f: SUBROUTINE GBYTES(IN,IOUT,ISKIP,NBYTE,NSKIP,N)

WPSV450/ungrib/src/ngl/w3/gbytes.f
gbytes.f: SUBROUTINE GBYTES(IPACKD,IUNPKD,NOFF,NBITS,ISKIP,ITER)

So, as in the previous post, ungrib.exe builds but appears to have been linked incorrectly. How can this be resolved?



I ran strace ./ungrib.exe, and it never reaches the "7777" end-of-message marker in the file:
write(3, "Path to intermediate files is ", 30) = 30
write(3, "./", 2) = 2
write(3, "\n", 1) = 1
openat(AT_FDCWD, "GRIBFILE.AAA", O_RDONLY) = 4
lseek(4, 0, SEEK_SET) = 0
read(4, "GRIB\377\377\0\2\0\0\0\0\17\350D\270\0\0\0\25\1\0\7\0\0\4\0\1\7\350\2\7"..., 512) = 512
lseek(4, 505, SEEK_SET) = 505
read(4, "t\0\240t\0\240t\0\240t\0\240t\0\240t\0\240t\0\240t\0\240t\0\240t\0\240t\0"..., 512) = 512
lseek(4, 1010, SEEK_SET) = 1010
read(4, "\0\240t\0\240t\0\240t\0\240t\0\240t\0\240t\0\240t\0\240t\0\240t\0\240t\0\240"..., 512) = 512
lseek(4, 1515, SEEK_SET) = 1515
read(4, "\240t\0\240t\0\240t\0\240t\0\240t\0\240t\0\240t\0\240t\0\240t\0\240t\0\240t"..., 512) = 512
lseek(4, 2020, SEEK_SET) = 2020

For comparison, here is the strace output from a version of ungrib.exe that does work on another system:
write(3, "Path to intermediate files is ", 30) = 30
write(3, "./", 2) = 2
write(3, "\n", 1) = 1
openat(AT_FDCWD, "GRIBFILE.AAA", O_RDONLY) = 4
lseek(4, 0, SEEK_SET) = 0
read(4, "GRIB\377\377\0\2\0\0\0\0\17\350D\270\0\0\0\25\1\0\7\0\0\4\0\1\7\350\2\7"..., 512) = 512
lseek(4, 266880180, SEEK_SET) = 266880180
read(4, "7777", 4) = 4
mmap(NULL, 266883072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x14dde9e72000
lseek(4, 0, SEEK_SET) = 0
read(4, "GRIB\377\377\0\2\0\0\0\0\17\350D\270\0\0\0\25\1\0\7\0\0\4\0\1\7\350\2\7"..., 266880184) = 266880184
close(4) = 0
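
To make the difference between the two traces easier to see, here is a small Python sketch that decodes the 16 header bytes shown in the read() call above. It assumes the standard GRIB2 Section 0 layout (bytes 1-4 "GRIB", bytes 5-6 reserved, byte 7 discipline, byte 8 edition, bytes 9-16 total message length as a big-endian integer). The function names parse_grib2_indicator and first_message_is_complete are only illustrative; they are not part of WPS.

```python
import struct

# First 16 bytes of GRIBFILE.AAA, copied from the read() call in the strace
# output (strace prints non-printable bytes as octal escapes).
header = b"GRIB\377\377\0\2\0\0\0\0\17\350D\270"


def parse_grib2_indicator(section0: bytes):
    """Decode GRIB2 Section 0; return (discipline, edition, total_length)."""
    if len(section0) < 16 or section0[:4] != b"GRIB":
        raise ValueError("not a GRIB message")
    discipline = section0[6]   # byte 7 (1-based) of the section
    edition = section0[7]      # byte 8: GRIB edition number
    (total_length,) = struct.unpack(">Q", section0[8:16])  # bytes 9-16
    return discipline, edition, total_length


def first_message_is_complete(path: str) -> bool:
    """Check that the first message in a file ends with the '7777' marker."""
    with open(path, "rb") as f:
        _, edition, total_length = parse_grib2_indicator(f.read(16))
        if edition != 2:
            return False
        f.seek(total_length - 4)   # last 4 bytes of the message
        return f.read(4) == b"7777"


discipline, edition, total_length = parse_grib2_indicator(header)
print("edition:", edition)
print("total length:", total_length)
print("offset of end marker:", total_length - 4)
```

The decoded length is 266880184 bytes, so the end marker sits at offset 266880180. That is exactly where the working ungrib.exe seeks before reading "7777". The failing build never seeks there, which suggests it is not decoding the header the same way.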


Thank you for your assistance.
 
UPDATE: In order to get ungrib.exe to compile with the AOCC compilers, I had to change the following code:

WPSV450/ungrib/src/ngl/w3/pdseup.f:
Lines 54, 57, 104 and 106: rename GBYTES() to GBYTES2()

WPSV450/ungrib/src/ngl/w3/gbytes.f:
Line 1: Rename to SUBROUTINE GBYTES2(IPACKD,IUNPKD,NOFF,NBITS,ISKIP,ITER)

This is a hack and may break other functionality.

Please suggest a better fix for this issue in the F77 code for grib2 support.

Thank you.
 
Another fix is to change the library order on lines 21 and 54 of ungrib/src/Makefile, so that g2_4 comes before w3:
-L./ngl -lg2_4 -lw3 \ # New change
-L./ngl -lw3 -lg2_4 \ # Original
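
For anyone wondering why the order matters, here is a toy Python model of how a linker pulls members out of static archives. It is only a sketch under assumptions. Which objects actually reference which symbols is hypothetical (I have assumed the ungrib objects need pdseup_ and gbyte_, that pdseup.o calls gbytes_, and that gbytesc.o defines gbyte_, sbyte_, gbytes_ and sbytes_), and real linkers such as ld.lld differ in detail.

```python
class DuplicateSymbol(Exception):
    pass


def link(archives, undefined, allow_multiple=False):
    """Toy single-pass archive resolution.

    archives:  list of (archive_name, {member: (defines, needs)})
    undefined: symbols referenced by the objects on the command line
    Returns a dict mapping each resolved symbol to "archive(member)".
    """
    resolved = {}
    undefined = set(undefined)
    for arch_name, members in archives:
        pulled = set()
        progress = True
        while progress:                      # rescan until nothing new is pulled
            progress = False
            for member, (defines, needs) in members.items():
                # A member is only pulled if it satisfies an undefined symbol.
                if member in pulled or not (defines & undefined):
                    continue
                pulled.add(member)
                progress = True
                for sym in sorted(defines):
                    if sym in resolved:
                        if not allow_multiple:
                            raise DuplicateSymbol(
                                f"{sym}: {resolved[sym]} vs {arch_name}({member})")
                        continue             # first definition wins
                    resolved[sym] = f"{arch_name}({member})"
                undefined -= defines
                undefined |= {s for s in needs if s not in resolved}
    return resolved


# Hypothetical archive contents, for illustration only.
LIBW3 = ("libw3.a", {
    "pdseup.o": ({"pdseup_"}, {"gbytes_"}),
    "gbytes.o": ({"gbytes_"}, set()),
})
LIBG2 = ("libg2_4.a", {
    "gbytesc.o": ({"gbyte_", "sbyte_", "gbytes_", "sbytes_"}, set()),
})
REFS = {"pdseup_", "gbyte_"}   # assumed references from the ungrib objects

try:
    link([LIBW3, LIBG2], REFS)               # original order: -lw3 -lg2_4
except DuplicateSymbol as err:
    print("original order:", err)

print("allow-multiple:", link([LIBW3, LIBG2], REFS, allow_multiple=True)["gbytes_"])
print("swapped order: ", link([LIBG2, LIBW3], REFS)["gbytes_"])
```

In this model the original order pulls gbytes.o from libw3.a first and then hits the second gbytes_ definition in gbytesc.o, which matches the duplicate symbol error. With -allow-multiple-definition the w3 version silently wins, so any g2 code calling GBYTES would get the w3 routine. That would be consistent with ungrib.exe linking but failing on GRIB2 files, although I have not confirmed that this is the actual cause. With g2_4 first, gbytes_ is already defined by the time libw3.a is scanned, so gbytes.o is never pulled in and there is no conflict.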
 
Yes, as you can see, my issue was very different. I have since resolved it and posted the Makefile fix.
 
I suspect this is a library issue. I have talked to our software engineer and hope they can find a solution soon.
 