
Build of WRF stuck on external/io_netcdf/diffwrf.F90

giovdav

New member
Dear all,
I'm trying to build the latest version of the model on an HPC cluster that uses environment modules.

Here are the steps I have tried:
Bash:
git clone --recurse-submodules https://github.com/wrf-model/WRF
cd /home/greenriot/WRF/

module purge
module load gnu/11.4.0
module load netcdf-all/4.7.2/gnu/8.4.0
module load mpich/3.0.4/gnu/8.4.0

export FC=/APPLICATIONS/gnu/gcc/11.4.0/bin/gfortran
export F77=/APPLICATIONS/gnu/gcc/11.4.0/bin/gfortran
export FCFLAGS=-m64
export FFLAGS=-m64
export JASPERLIB=/APPLICATIONS/jasper/1.900.1/gnu/8.4.0/lib/
export JASPERINC=/APPLICATIONS/jasper/1.900.1/gnu/8.4.0/include/
export NETCDF=/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0/
export NETCDF_INC=/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0//include
export NETCDFPATH=/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0/
export HDF5=/APPLICATIONS/hdf5/1.12.0/openmpi/4.1.1/gnu/8.4.0/

export LDFLAGS=-L/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0/lib/
export LIBS=-l/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0/lib/
export CPPFLAGS=-I/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0/include/

./clean -a
./configure # option 34 and 1
./compile em_real >& log.compile

LD_LIBRARY_PATH seems to be already configured for NetCDF by the module load command, but the compile stops at diffwrf.F90. Here is the error:

[Attached image: 1730219166364.png]

The same command works fine on the Fortran code when I run it in my shell.

Is it a bug?

Thanks in advance.
G.
 
Sure, here are the files!
The configure.wrf file is untouched!

Thanks!
G.
 

Attachments

  • configure.wrf.txt
  • log.compile.txt
When we look for errors in the compile log, we are only interested in those with a capital 'E' ("Error"), and typically the first one we come to in the file. This is the one that appears first in your log.

Code:
/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0//lib/libnetcdff.a(nf_nc4.o): In function `nf_def_var_chunking_':
nf_nc4.f90:(.text+0x5ea2): undefined reference to `nc_def_var_chunking_ints'
/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0//lib/libnetcdff.a(nf_nc4.o): In function `nf_inq_var_chunking_':
nf_nc4.f90:(.text+0x61bf): undefined reference to `nc_inq_var_chunking_ints'
/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0//lib/libnetcdff.a(nf_nc4.o): In function `nf_set_chunk_cache_':
nf_nc4.f90:(.text+0x6c0f): undefined reference to `nc_set_chunk_cache_ints'
/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0//lib/libnetcdff.a(nf_nc4.o): In function `nf_get_chunk_cache_':
nf_nc4.f90:(.text+0x6c48): undefined reference to `nc_get_chunk_cache_ints'
/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0//lib/libnetcdff.a(nf_nc4.o): In function `nf_set_var_chunk_cache_':
nf_nc4.f90:(.text+0x6cdc): undefined reference to `nc_set_var_chunk_cache_ints'
/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0//lib/libnetcdff.a(nf_nc4.o): In function `nf_get_var_chunk_cache_':
nf_nc4.f90:(.text+0x6d37): undefined reference to `nc_get_var_chunk_cache_ints'
collect2: error: ld returned 1 exit status
 
real    0m0.759s
user    0m0.125s
sys     0m0.222s
make[2]: [diffwrf] Error 1 (ignored)

Typically any errors that mention different "nf_*" functions indicate that there is an issue with the netCDF installation or path setting. If you haven't already, can you go through the tests on this compiling tutorial page to make sure you have everything installed correctly, and that the netcdf library is compatible with the rest of your environment?
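
As a rough sketch of that log-reading habit, the snippet below searches a compile log for the first line containing a capital-E "Error" and counts the linker's "undefined reference" lines. The helper names `first_error` and `count_undefined` are made up for illustration, and the small sample log only reuses a few lines from the excerpt above; on a real build you would point the helpers at log.compile instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first line (prefixed with its line number)
# that contains "Error" with a capital E.
first_error() {
    grep -n -m 1 'Error' "$1"
}

# Hypothetical helper: count linker "undefined reference" lines.
count_undefined() {
    grep -c 'undefined reference' "$1"
}

# Tiny sample log for demonstration only (lines taken from the excerpt above).
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
nf_nc4.f90:(.text+0x5ea2): undefined reference to `nc_def_var_chunking_ints'
nf_nc4.f90:(.text+0x61bf): undefined reference to `nc_inq_var_chunking_ints'
collect2: error: ld returned 1 exit status
make[2]: [diffwrf] Error 1 (ignored)
EOF

first_error "$sample_log"      # the lowercase "error:" line is skipped
count_undefined "$sample_log"
```

The lines just above the first hit usually carry the real diagnostic, so in practice it can help to ask grep for some leading context as well, for example with its `-B` option.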
 
Dear all, here are the results of the tests, run in the same shell used for the build!

Bash:
# 0
[GreenRiot@HPC WRF]$ which gfortran
/APPLICATIONS/gnu/gcc/11.4.0/bin/gfortran

[GreenRiot@HPC WRF]$ which cpp
/APPLICATIONS/gnu/gcc/11.4.0/bin/cpp

[GreenRiot@HPC WRF]$ which gcc
/APPLICATIONS/gnu/gcc/11.4.0/bin/gcc

[GreenRiot@HPC WRF]$ gcc --version
gcc (GCC) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


# 1
[GreenRiot@HPC TESTS]$ gfortran TEST_1_fortran_only_fixed.f
[GreenRiot@HPC TESTS]$ ./a.out
SUCCESS test 1 fortran only fixed format
 

# 2
[GreenRiot@HPC TESTS]$ gfortran TEST_2_fortran_only_free.f90
[GreenRiot@HPC TESTS]$ ./a.out
Assume Fortran 2003: has FLUSH, ALLOCATABLE derived type, and ISO C Binding
SUCCESS test 2 fortran only free format


# 3
[GreenRiot@HPC TESTS]$ gcc TEST_3_c_only.c
[GreenRiot@HPC TESTS]$ ./a.out
SUCCESS test 3 C only


# 4
[GreenRiot@HPC TESTS]$ gcc -c -m64 TEST_4_fortran+c_c.c
[GreenRiot@HPC TESTS]$ gfortran -c -m64 TEST_4_fortran+c_f.f90
[GreenRiot@HPC TESTS]$ gfortran -m64 TEST_4_fortran+c_f.o TEST_4_fortran+c_c.o
[GreenRiot@HPC TESTS]$ ./a.out
C function called by Fortran
Values are xx =  2.00 and ii = 1
SUCCESS test 4 fortran calling c
 

# 5
[GreenRiot@HPC TESTS]$ ./TEST_csh.csh
SUCCESS csh test


# 6
[GreenRiot@HPC TESTS]$ ./TEST_perl.pl
SUCCESS perl test


# 7
[GreenRiot@HPC TESTS]$ ./TEST_sh.sh
SUCCESS sh test
 
Thanks for doing that. Can you issue
Code:
echo $PATH >& path.txt

and attach the path.txt file? Thanks!
 
Code:
/APPLICATIONS/mpich/3.0.4/gnu/8.4.0/bin:/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0/bin:/APPLICATIONS/gnu/gcc/11.4.0/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/pbs/bin:/home/GreenRiot/.local/bin:/home/GreenRiot/bin
 
Apologies for the delay. It can sometimes take us a few days to get to each forum post because we have so many other tasks we are working on each day, and we don't work on the weekends.

Thanks for sending the PATH information. I do see that the netcdf path seems to be set correctly. I don't believe this is a bug because many others are able to build this code without issues, but there is something going on with your environment/libraries. I'm going to reach out to one of our software engineers to see if they have any thoughts.
 
Dear @kwerner, thanks for your reply!

It may be useful to know that we are trying to build the code on CentOS 7.9. In the next few days I will try the same on AlmaLinux 9.4 as well.

Thanks for your help.
G.
 
Quoting @giovdav: "The same command in my shell work fine with the Fortran code."
Due to the way the Makefiles are written, the actual failing command is not printed out.

I was able to partly recreate the error by deliberately installing the netCDF libraries in a way that leaves that symbol missing. I compiled two netCDF C libraries, one with HDF5 and one without, then built netCDF Fortran against the HDF5-enabled netCDF C but installed it to the non-HDF5 location. This does not, nor should it, pass configuration: it fails on the nc4_test portion of configuration (my output from nc4_test.log is below):
Code:
if [ -z "0" ] || [ 0 -eq 0 ] ; then \
 ( cd tools ; /bin/rm -f nc4_test.{exe,nc,o} ; gcc -o nc4_test.exe nc4_test.c           -I/home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5//include -L/home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5/lib -lnetcdf -L/home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5/lib -lnetcdff -L/home/aislas/wrf-model/forum_help/tmp_test_forum/forum_libs//lib -L/home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/simple//lib -L/home/aislas/wrf-model/forum_help/tmp_test_forum/forum_libs//grib2/lib -lnetcdf -lnetcdf -lm ; cd .. ) ; \
else \
 ( cd tools ; /bin/rm -f nc4_test.{exe,nc,o} ; mpicc -cc=gcc -o nc4_test.exe nc4_test.c -I/home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5//include -L/home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5/lib -lnetcdf -L/home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5/lib -lnetcdff -L/home/aislas/wrf-model/forum_help/tmp_test_forum/forum_libs//lib -L/home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/simple//lib -L/home/aislas/wrf-model/forum_help/tmp_test_forum/forum_libs//grib2/lib -lnetcdf -lnetcdf -lm ; cd ..  ) ; \
fi
/usr/bin/ld: /home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5/lib/libnetcdff.so: undefined reference to `nc_get_chunk_cache_ints'
/usr/bin/ld: /home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5/lib/libnetcdff.so: undefined reference to `nc_set_var_chunk_cache_ints'
/usr/bin/ld: /home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5/lib/libnetcdff.so: undefined reference to `nc_def_var_chunking_ints'
/usr/bin/ld: /home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5/lib/libnetcdff.so: undefined reference to `nc_set_chunk_cache_ints'
collect2: error: ld returned 1 exit status

Note that it is using the shared library libnetcdff.so, which causes symbol checking to be done. If we remove the .so and force usage of the .a archive, no undefined-symbol checking is done for this archive, and since nc4_test is a C-only test, it passes as long as libnetcdf.* works. Using the same malformed install method described above, if I force the usage of the libnetcdff.a archive, configuration passes, believing that these two libraries (libnetcdff.a and libnetcdf.*, which were not compiled together) can be used together. I end up with a configure.wrf very similar to @giovdav's with regard to the NETCDF* variables.

When continuing with this configuration, the first error I get is:
Code:
ar: creating ../main/libwrflib.a
ranlib ../main/libwrflib.a
make[2]: Leaving directory '/home/aislas/wrf-model/wrf/frame'
make[2]: Entering directory '/home/aislas/wrf-model/wrf/external/io_netcdf'
x=`echo "time gfortran -w -ffree-form -ffree-line-length-none -fconvert=big-endian -frecord-marker=4 -fallow-argument-mismatch -fallow-invalid-boz    " | awk '{print $1}'` ; export x ; \
if [ $x = "gfortran" ] ; then \
           echo removing external declaration of iargc for gfortran ; \
   /lib/cpp -P -nostdinc -P -traditional-cpp -DUSE_NETCDF4_FEATURES -DWRFIO_NCD_LARGE_FILE_SUPPORT -I/home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5//include -I../ioapi_share diffwrf.F90 | sed '/integer *, *external.*iargc/d' > diffwrf.f ;\
        else \
   /lib/cpp -P -nostdinc -P -traditional-cpp -DUSE_NETCDF4_FEATURES -DWRFIO_NCD_LARGE_FILE_SUPPORT -I/home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5//include -I../ioapi_share diffwrf.F90 > diffwrf.f ; \
        fi
time gfortran -w -ffree-form -ffree-line-length-none -fconvert=big-endian -frecord-marker=4 -fallow-argument-mismatch -fallow-invalid-boz     -c  -I/home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5//include -I../ioapi_share diffwrf.f
0.11user 0.02system 0:00.14elapsed 98%CPU (0avgtext+0avgdata 39136maxresident)k
0inputs+336outputs (0major+5172minor)pagefaults 0swaps
diffwrf io_netcdf is being built now. 
/usr/bin/ld: /home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5//lib/libnetcdff.a(nf_nc4.o): in function `nf_def_var_chunking_':
nf_nc4.f90:(.text+0x5d4a): undefined reference to `nc_def_var_chunking_ints'
/usr/bin/ld: /home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5//lib/libnetcdff.a(nf_nc4.o): in function `nf_set_chunk_cache_':
nf_nc4.f90:(.text+0x6b28): undefined reference to `nc_set_chunk_cache_ints'
/usr/bin/ld: /home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5//lib/libnetcdff.a(nf_nc4.o): in function `nf_get_chunk_cache_':
nf_nc4.f90:(.text+0x6b61): undefined reference to `nc_get_chunk_cache_ints'
/usr/bin/ld: /home/aislas/wrf-model/forum_help/tmp_test_forum/netcdf-c-4.7.2/no_hdf5//lib/libnetcdff.a(nf_nc4.o): in function `nf_set_var_chunk_cache_':
nf_nc4.f90:(.text+0x6bf5): undefined reference to `nc_set_var_chunk_cache_ints'
collect2: error: ld returned 1 exit status
real  0m0.115s
user  0m0.089s
sys 0m0.025s
make[2]: [makefile:45: diffwrf] Error 1 (ignored)

This doesn't include nc_inq_var_chunking_ints, and I'm not entirely sure why. This post (GCC version for WRF 4.4) also seems to confirm that, to get into this state, the netCDF install must have been done incorrectly. I likewise couldn't find a way to force the installs into a bad state with just (compile netCDF) => (compile netCDF Fortran) with any combination of flags; all the built-in checks catch this before compilation continues. I tentatively suspect that one or more extra netCDF installs were in the environment when that netCDF Fortran libnetcdff.a was built and installed, resulting in the use of incompatible netCDF C and Fortran libraries.

As this is using v4.6.0+, you could try the CMake build. I suspect it will fail as well, which would be good information. A failure would not rule out a WRF build issue, but the two build systems resolve dependencies very differently within the same environment, so a failure in both would point to a problem with the environment. If it does work, then the make system might actually be the root cause of this issue.

Another piece of information that would be useful to fully diagnose this is the output of:
Code:
<netcdf path>/bin/nc-config --all
<netcdf path>/bin/nf-config --all

nm -C <netcdf path>/lib/libnetcdf.<so or a, whichever is in there> | grep nc_get_chunk
nm -C <netcdf path>/lib/libnetcdff.a | grep nf_get_chunk
 
Dear @islas, here is the output:

Code:
[GreenRiot@hpc TESTS]$ nc-config --all

This netCDF 4.7.2 has been built with the following features:

  --cc            -> gcc
  --cflags        -> -I/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0/include
  --libs          -> -L/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0/lib  -lnetcdf -lm
  --static        -> -lm

  --has-c++       -> no
  --cxx           ->

  --has-c++4      -> no
  --cxx4          ->

  --has-fortran   -> yes
  --fc            -> gfortran
  --fflags        -> -I/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0/include
  --flibs         -> -L/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0/lib -lnetcdff -lnetcdf -lnetcdf -ldl -lm
  --has-f90       ->
  --has-f03       -> yes

  --has-dap       -> no
  --has-dap2      -> no
  --has-dap4      -> no
  --has-nc2       -> yes
  --has-nc4       -> no
  --has-hdf5      -> no
  --has-hdf4      -> no
  --has-logging   -> no
  --has-pnetcdf   -> no
  --has-szlib     ->
  --has-cdf5      -> yes
  --has-parallel4 -> no
  --has-parallel  -> no

  --prefix        -> /APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0
  --includedir    -> /APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0/include
  --libdir        -> /APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0/lib
  --version       -> netCDF 4.7.2

Here are the results of the nm command:

Code:
[GreenRiot@hpc TESTS]$ nm -C /APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0/lib/libnetcdf.a | grep nf_get_chunk
[GreenRiot@hpc TESTS]$

No result.

Code:
[GreenRiot@hpc TESTS]$ nm -C /APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0/lib/libnetcdff.a | grep nf_get_chunk
0000000000006c21 T nf_get_chunk_cache_
                 U nf_get_chunk_cache_

So to solve the issue, do I have to rebuild NetCDF with HDF5?

Thanks!
G.
 
@giovdav Thanks!
Could you do (note that it is nc_get_chunk):
Code:
nm -C /APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0/lib/libnetcdf.a | grep nc_get_chunk

and
Code:
nf-config --all

Interestingly, the output of the `nc-config --all` says that it has no nc4 or hdf5 support, so the Fortran library should not have even been looking for these symbols. I expect the `nf-config --all` command will return `--has-nc4 -> yes`.

You can use either HDF5 or no HDF5 when building netCDF, but the netCDF Fortran library must be built against the respective netCDF C library.
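
A minimal sketch of that consistency check: compare the `--has-nc4` feature as reported by `nc-config --all` and `nf-config --all`. The helper names `feature_value` and `check_nc4_match` are hypothetical, and the two sample files stand in for saved command output. The nf-config sample only reflects the value I expect above, not real output from this cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pull the value out of a "--flag -> value" line in
# saved `nc-config --all` or `nf-config --all` output.
feature_value() {   # usage: feature_value <file> <flag>
    awk -v flag="$2" '$1 == flag && $2 == "->" { print $3; exit }' "$1"
}

# Hypothetical helper: succeed only if both libraries agree on netCDF-4 support.
check_nc4_match() { # usage: check_nc4_match <nc-config output> <nf-config output>
    local c f
    c=$(feature_value "$1" --has-nc4)
    f=$(feature_value "$2" --has-nc4)
    if [ "$c" = "$f" ]; then
        echo "consistent: netCDF C and Fortran both report --has-nc4 -> $c"
    else
        echo "MISMATCH: netCDF C has-nc4=$c, netCDF Fortran has-nc4=$f"
        return 1
    fi
}

# Sample inputs for demonstration only.
workdir=$(mktemp -d)
printf '%s\n' '  --has-nc4       -> no' '  --has-hdf5      -> no' > "$workdir/nc.txt"
printf '%s\n' '  --has-nc4       -> yes' > "$workdir/nf.txt"

check_nc4_match "$workdir/nc.txt" "$workdir/nf.txt" \
    || echo "netCDF Fortran was likely built against a different netCDF C"
```

On a real system the two input files would come from redirecting `nc-config --all` and `nf-config --all` into them.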
 
Dear @islas, thanks for your reply!

Assuming that the Fortran library has been built against the respective C library, what do I have to do to build the WRF model correctly?

Is there a way to fix the problem, perhaps by hard-coding the paths in the Makefile?

Thanks in advance!
 
If the Fortran library is built and installed correctly with its respective C library the normal process of building WRF will work.

Unfortunately, this problem is with netCDF Fortran and not with WRF finding the correct libraries. There is no way to fix this in WRF with or without pathing because the error is upstream.

As an analogy, when using libraries you can think of it as if we are referencing pages in books. Sometimes books refer to other books for the final source of information. In this case, WRF requests netCDF Fortran which says it requires page N from netCDF C to be complete. Then we go to netCDF C and page N is missing, meaning we can't finish making our program because we don't have all the information necessary. This is effectively what the `undefined reference` error is.
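
The "missing page" idea can be sketched with saved `nm` listings: collect the symbols one library needs (type `U`) and report those the other library does not define (type `T`). The function name `missing_symbols` and the two tiny listings are illustrative assumptions. Apart from the first Fortran line, which is copied from the nm output earlier in this thread, the addresses and symbol sets are made up and only loosely modelled on the names in the linker error.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list symbols starting with <prefix> that the first
# listing references (U) but the second listing does not define (T).
missing_symbols() { # usage: missing_symbols <nm of needing lib> <nm of providing lib> <prefix>
    awk -v prefix="$3" '
        NF < 2 { next }                      # skip blank lines and "member.o:" headers
        FNR == NR {                          # first file: remember what is needed
            if ($(NF-1) == "U" && index($NF, prefix) == 1) need[$NF] = 1
            next
        }
        $(NF-1) == "T" { delete need[$NF] }  # second file: cross off what is provided
        END { for (s in need) print s }
    ' "$1" "$2" | sort
}

# Made-up sample listings for demonstration only.
d=$(mktemp -d)
cat > "$d/libnetcdff.nm" <<'EOF'
0000000000006c21 T nf_get_chunk_cache_
                 U nc_get_chunk_cache_ints
                 U nc_inq_libvers
EOF
cat > "$d/libnetcdf.nm" <<'EOF'
0000000000000a10 T nc_inq_libvers
0000000000000b20 T nc_open
EOF

missing_symbols "$d/libnetcdff.nm" "$d/libnetcdf.nm" nc_
```

Each name printed is a "page" that netCDF Fortran asks for and netCDF C cannot supply, which is what the linker reports as an `undefined reference`.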

A potential alternative may be to build netCDF from source using the forum instructions here : Full WRF and WPS Installation Example (Intel)

If using gcc, some flags will need to change, but the remainder of the instructions should be fine. Also, you would not need to build everything from source to solve this issue, just netCDF C and Fortran.
 
Quoting @islas: "If the Fortran library is built and installed correctly with its respective C library the normal process of building WRF will work."
Yes, it has been installed correctly, and I don't understand why it doesn't work! If I compile the file diffwrf.F90 manually with the same command as in the Makefile, it works without any problem! 😅
 
Code:
time mpif90 -f90=gfortran -w -ffree-form -ffree-line-length-none -fconvert=big-endian -frecord-marker=4 -fallow-argument-mismatch -fallow-invalid-boz     -c -O2 -ftree-vectorize -funroll-loops -w -ffree-form -ffree-line-length-none -fconvert=big-endian -frecord-marker=4 -fallow-argument-mismatch -fallow-invalid-boz  -I/APPLICATIONS/netcdf-all/4.7.2/gnu/8.4.0//include -I../ioapi_share diffwrf.f
 
OK, I have finally solved the problem! :)

I made some changes:

- used GNU GCC 8.4.0
- built the latest versions, netcdf-c-4.9.2 and netcdf-fortran-4.6.1

Here are the commands used:


Bash:
#-------------------------------------------------------------------------------
# HOW TO BUILD THE MODEL WRF VERSION 4
#-------------------------------------------------------------------------------

# Clone the latest version of the WRF repository
git clone --recurse-submodules https://github.com/wrf-model/WRF
cd /home/GreenRiot/WRF/

# Load modules
module purge
module load gnu/8.4.0
module load netcdf-all/4.9.2/gnu/8.4.0
module load mpich/3.0.4/gnu/8.4.0

# Set the environment variables before building
export FC=$(which gfortran)
export F77=$(which gfortran)
export FCFLAGS=-m64
export FFLAGS=-m64
export JASPERLIB=/APPLICATIONS/jasper/1.900.1/gnu/8.4.0/lib/
export JASPERINC=/APPLICATIONS/jasper/1.900.1/gnu/8.4.0/include/
export NETCDF=$(nc-config --all | grep prefix | awk '{print $3}')
export NETCDF_INC=$(nc-config --all | grep prefix | awk '{print $3}')/include/
export NETCDFPATH=$(nc-config --all | grep prefix | awk '{print $3}')
export HDF5=/APPLICATIONS/hdf5/1.12.0/openmpi/4.1.1/gnu/8.4.0/
export LDFLAGS=-L$(nc-config --all | grep prefix | awk '{print $3}')/lib/
export LIBS=-l$(nc-config --all | grep prefix | awk '{print $3}')/lib/
export CPPFLAGS=-I$(nc-config --all | grep prefix | awk '{print $3}')/include/

# Check the export paths
echo $FC
echo $F77
echo $FCFLAGS
echo $FFLAGS
echo $JASPERLIB
echo $JASPERINC
echo $NETCDF
echo $NETCDF_INC
echo $NETCDFPATH
echo $HDF5
echo $LDFLAGS
echo $LIBS
echo $CPPFLAGS

# Configure
./clean -a
./configure # option 34 and 1

# Building time!
./compile em_real >& log.compile
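
One difference between the first attempt and this working setup is that the compiler module now carries the same `gnu/8.4.0` tag as the library modules. The thread does not establish which of the two changes fixed the build, so the following is only a sketch of a pre-flight check. It assumes that the trailing `gnu/<version>` in a module name names the compiler the module was built with, and the helper names `compiler_of` and `check_modules` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the trailing "gnu/<version>" part of a module name.
compiler_of() {
    printf '%s\n' "$1" | sed -n 's|.*\(gnu/[0-9][0-9.]*\)$|\1|p'
}

# Hypothetical helper: compare every library module against the compiler module.
check_modules() {   # usage: check_modules <compiler module> <library module>...
    want=$(compiler_of "$1"); shift
    status=0
    for m in "$@"; do
        got=$(compiler_of "$m")
        if [ "$got" != "$want" ]; then
            echo "mismatch: $m is tagged $got, loaded compiler is $want"
            status=1
        fi
    done
    return $status
}

# Module names from the first attempt in this thread.
check_modules gnu/11.4.0 netcdf-all/4.7.2/gnu/8.4.0 mpich/3.0.4/gnu/8.4.0 \
    || echo "first attempt: compiler and library modules disagree"

# Module names from the working setup above.
check_modules gnu/8.4.0 netcdf-all/4.9.2/gnu/8.4.0 mpich/3.0.4/gnu/8.4.0 \
    && echo "working setup: modules agree"
```

A check like this only compares names, so it cannot prove that the libraries themselves are compatible; it just flags the kind of mix that is worth a second look before running ./configure.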
 