
AMD AOCC compiler support for WPS?

AOCC has been an option in WRF for a few releases now. Thanks.
Mike
I concur with @leuthold that the WRF `configure.defaults` already includes support for the AMD compilers. However, WPS appears to lack a dedicated configuration stanza for them. Because WPS does heavy data processing, a configuration that targets the AMD compilers, with optimizations tailored to AMD CPUs, would help performance and compatibility.

 
I have been running WRF operationally on AMD processors for five consecutive years. If there is any way to use the AMD compiler to speed up WRF simulations, I would be very interested. At current prices, AMD processors are a better purchase than Intel processors, so having an AMD compiler option like the one that exists for Intel would be very important.
 
Code:
########################################################################################################################
#ARCH   AMD Linux x86_64, AOCC compilers   # serial serial_NO_GRIB2 dmpar dmpar_NO_GRIB2
#
COMPRESSION_LIBS    = CONFIGURE_COMP_L
COMPRESSION_INC     = CONFIGURE_COMP_I
FDEFS               = CONFIGURE_FDEFS
NCARG_LIBS          =
NCARG_LIBS2         =
FC                  = mpif90
SFC                 = flang
CC                  = mpicc
SCC                 = clang
LD                  = $(FC)
FFLAGS              = -Mfreeform -ffree-line-length-none -fopenmp
F77FLAGS            = -Mfixed -ffixed-line-length-none -fopenmp
FNGFLAGS            = $(FFLAGS)
LDFLAGS             = -fopenmp
CFLAGS              = -O3 -fopenmp
CPP                 = /lib/cpp -P -traditional
CPPFLAGS            = -D_UNDERSCORE -DBYTESWAP -DLINUX -DIO_NETCDF -DIO_BINARY -DIO_GRIB1 -DBIT32 CONFIGURE_MPI
ARFLAGS             =
RANLIB              = ranlib
CC_TOOLS            = $(SCC)
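
Before trying a stanza like the one above, it may help to confirm that the tools it names are actually installed. The following is a minimal sketch in Python, not part of WPS; the helper name `missing_tools` is made up for illustration, and the tool list is simply copied from the FC, SFC, CC, SCC and RANLIB lines of the stanza.

```python
import shutil

# Executable names copied from the stanza above (FC, SFC, CC, SCC, RANLIB).
STANZA_TOOLS = ["mpif90", "flang", "mpicc", "clang", "ranlib"]


def missing_tools(tools, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without the
    real compilers being installed.
    """
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(STANZA_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools named in the stanza were found on PATH.")
```

Finding `flang` and `clang` on PATH does not prove they are the AOCC builds, so checking their version output by hand is still worthwhile.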

I came up with this with the help of ChatGPT and some coding forums. I have no way to test it, but perhaps it will give someone a starting point.
@Ming Chen @Gabriel Cassol

@leuthold did you ever get it to work on WPS?
 

Almost. geogrid.exe and metgrid.exe compiled, but ungrib.exe did not. I used a configuration almost identical to the one above, with AOCC 4.1.

The ungrib.exe link error:
ld.lld: error: duplicate symbol: gbytes_
>>> defined at gbytes.f:1
>>> gbytes.o : (gbytes_) in archive ./ngl/libw3.a
>>> defined at gbytesc.f:15
>>> gbytesc.o : (.text+0x40) in archive ./ngl/libg2_4.a
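
The linker message says `gbytes_` is defined in both `libw3.a` and `libg2_4.a`. To see whether other symbols overlap as well, the two archives can be compared with `nm`. This is a rough sketch under assumptions: the function names are invented for illustration, `nm` (GNU or LLVM) is assumed to be on PATH, and the archive paths are taken from the error text, relative to the directory where the link step ran. It only reports the overlap; it does not resolve it.

```python
import subprocess


def defined_symbols(nm_output):
    """Collect symbol names from nm lines of the form 'address type name'."""
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Member headers ("gbytes.o:") and undefined symbols ("U name")
        # have fewer than three fields and are skipped.
        if len(fields) == 3 and fields[1] != "U":
            symbols.add(fields[2])
    return symbols


def run_nm(archive):
    """Run nm on a static archive, listing only defined external symbols."""
    result = subprocess.run(
        ["nm", "--defined-only", "--extern-only", archive],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def shared_symbols(nm_output_a, nm_output_b):
    """Return the sorted symbol names defined in both nm listings."""
    return sorted(defined_symbols(nm_output_a) & defined_symbols(nm_output_b))
```

Usage, from the directory where the link failed: `print(shared_symbols(run_nm("./ngl/libw3.a"), run_nm("./ngl/libg2_4.a")))`.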
 

@leuthold

Did that error come from my configuration or from yours?
 
For version 4.1.0, no. The compiler threw an error about "-vectorize-noncontigous-memory-aggressively". It told me to change it to "--vectorize-non-contiguous-memory-aggressively", which allowed the code to compile.

I then started running into unusual crashes in the MYNN and Thompson schemes. I had to lower the optimization level from -Ofast to -O3 and remove -ffast-math. Below is the relevant part of configure.wrf:

# Settings for AMD Linux x86_64, AOCC flang compiler with AOCC clang (dm+sm)
# Supported AMDARCH are znver1, znver2 and znver3 for ZEN1, ZEN2 and ZEN3 respectively
# For optimized AMDFCFLAGS and AMDLDFLAGS, please reach out to toolchainsupport@amd.com
#

DESCRIPTION = AMD ($SFC/$SCC) : AMD ZEN1/ ZEN2/ ZEN3 Architectures
DMPARALLEL = 1
OMPCPP = -D_OPENMP
OMP = -fopenmp
OMPCC = -fopenmp
SFC = flang
SCC = clang
CCOMP = clang
DM_FC = mpif90
DM_CC = mpicc
FC = time $(DM_FC)
CC = $(DM_CC) -DFSEEKO64_OK
LD = $(FC)
RWORDSIZE = $(NATIVE_RWORDSIZE)

AMDARCH = -march=znver3
AMDMATHLIB = -fveclib=AMDLIBM
AMDLDFLAGS = -Wl,-mllvm -Wl,-enable-loop-reversal -Wl,-mllvm -Wl,-enable-gather -Wl,-mllvm -Wl,--vectorize-non-contiguous-memory-aggressively
AMDFCFLAGS =

PROMOTION = #-fdefault-real-8
ARCH_LOCAL = -DNONSTANDARD_SYSTEM_SUBR -DWRF_USE_CLM -DRPC_TYPES=2 $(NETCDF4_IO_OPTS)
CFLAGS_LOCAL = -w -c -m64 -O3 $(AMDARCH)
LDFLAGS_LOCAL = -m64 -O3 -Mstack_arrays $(AMDARCH) $(AMDLDFLAGS) $(AMDMATHLIB) -lamdlibm -lomp
CPLUSPLUSLIB =
ESMF_LDFLAG = $(CPLUSPLUSLIB)
FCOPTIM = -O3 $(AMDARCH) -Mbyteswapio -Mstack_arrays -ftree-vectorize -Mbyteswapio -funroll-loops -finline-aggressive -finline-hint-functions $(AMDMATHLIB) $(AMDFCFLAGS)
FCREDUCEDOPT = -O3 $(AMDARCH) -Mstack_arrays -DFCREDUCEDOPT
FCNOOPT = -O0
FCDEBUG = # -g $(FCNOOPT)
FORMAT_FIXED = -Mfixed
FORMAT_FREE = -Mfreeform
FCSUFFIX =
BYTESWAPIO = -Mbyteswapio
FCBASEOPTS_NO_G = -w $(FORMAT_FREE) $(BYTESWAPIO)
FCBASEOPTS = -O3 $(FCBASEOPTS_NO_G) $(FCDEBUG)
MODULE_SRCH_FLAG=
TRADFLAG = -traditional $(NETCDF4_IO_OPTS)
CPP = /lib/cpp -P
AR = llvm-ar
ARFLAGS = ru
M4 = m4
RANLIB = llvm-ranlib
RLFLAGS =
CC_TOOLS = $(SCC)
NETCDFPAR_BUILD = echo SKIPPING
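
For anyone who wants to apply the same three changes to their own generated configure.wrf, here is a small Python sketch. It is an illustration, not a WRF tool: the function name `adjust_flags` is made up, and it edits the file text as a whole because the post does not say which variables originally held -Ofast and -ffast-math.

```python
import re

# Flag spellings quoted in the post: the one AOCC 4.1.0 rejected and
# the one the compiler suggested instead.
OLD_FLAG = "-vectorize-noncontigous-memory-aggressively"
NEW_FLAG = "--vectorize-non-contiguous-memory-aggressively"


def adjust_flags(configure_text):
    """Apply the three edits described in the post to configure.wrf text."""
    # Rename the vectorization option. The new spelling does not contain
    # the old one, so running this twice is harmless.
    text = configure_text.replace(OLD_FLAG, NEW_FLAG)
    # Lower the optimization level wherever -Ofast is a standalone flag.
    text = re.sub(r"(?<!\S)-Ofast(?!\S)", "-O3", text)
    # Drop -ffast-math together with the spaces or tabs before it.
    text = re.sub(r"[ \t]+-ffast-math(?!\S)", "", text)
    return text
```

A possible use is to read configure.wrf, pass its contents through `adjust_flags`, and write the result to a new file so the original can be diffed against it before rebuilding.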

I'd like to install and try out AMD's math libraries, but my time working with AMD compilers and clusters has come to an end as I'm retiring.
Mike
 
Thank you, Mike. Once I get an AMD CPU to test on, I'll let you know how it goes. Can I reach out to you for help if needed? @leuthold @Gabriel Cassol
 