AMD AOCC compiler support for WPS?

leuthold · Nov 7, 2023

AOCC has been an option in WRF for a few releases now. Thanks.
Mike

William.Hatheway · Dec 28, 2023

leuthold said:
AOCC has been an option in WRF for a few releases now. Thanks.
Mike

I agree with @leuthold that WRF's `configure.defaults` already supports the AMD compilers. However, WPS appears to lack a corresponding configuration for them. Because WPS does heavy data processing, a configuration that matches the AMD compiler toolchain and enables optimizations tailored to AMD CPUs would help both performance and compatibility.

AMD Compiler option for WPS · Issue #242 · wrf-model/WPS

https://forum.mmm.ucar.edu/threads/amd-aocc-compiler-support-for-wps.14534/ As stated in the WRF Forum WPS appears to be missing the AOCC compilers in it's configure.defaults options. I think it sh...

github.com

Gabriel Cassol · Dec 28, 2023

I have been running WRF operationally on AMD processors for five consecutive years. If there is any way to use the AMD compiler to speed up WRF simulations, it would be very interesting. At current prices, AMD processors are a better buy than Intel processors, so having an AMD compiler option like the one that exists for Intel would be very important.

William.Hatheway · Jan 2, 2024

Code:

########################################################################################################################
#ARCH   AMD Linux x86_64, AOCC compilers   # serial serial_NO_GRIB2 dmpar dmpar_NO_GRIB2
#
COMPRESSION_LIBS    = CONFIGURE_COMP_L
COMPRESSION_INC     = CONFIGURE_COMP_I
FDEFS               = CONFIGURE_FDEFS
NCARG_LIBS          =
NCARG_LIBS2         =
FC                  = mpif90
SFC                 = flang
CC                  = mpicc
SCC                 = clang
LD                  = $(FC)
FFLAGS              = -Mfreeform -ffree-line-length-none -fopenmp
F77FLAGS            = -Mfixed -ffixed-line-length-none -fopenmp
FNGFLAGS            = $(FFLAGS)
LDFLAGS             = -fopenmp
CFLAGS              = -O3 -fopenmp
CPP                 = /lib/cpp -P -traditional
CPPFLAGS            = -D_UNDERSCORE -DBYTESWAP -DLINUX -DIO_NETCDF -DIO_BINARY -DIO_GRIB1 -DBIT32 CONFIGURE_MPI
ARFLAGS             =
RANLIB              = ranlib
CC_TOOLS            = $(SCC)

I put this together with help from ChatGPT and some coding forums. I have no way to test it, but perhaps it will give someone a starting point.
@Ming Chen @Gabriel Cassol

@leuthold did you ever get it to work on WPS?

Ming Chen · Jan 3, 2024

AOCC has not been added to the official WPS release. Sorry.

William.Hatheway · Jan 3, 2024

Ming Chen said:
AOCC has not been added to the official WPS release. Sorry.

Can the one I came up with be tested somehow?

leuthold · Jan 10, 2024

William.Hatheway said:
@leuthold did you ever get it to work on WPS?

Almost. geogrid.exe and metgrid.exe compiled, but ungrib.exe did not. I used a configuration almost identical to the one above, with AOCC 4.1.

ungrib.exe error:
ld.lld: error: duplicate symbol: gbytes_
>>> defined at gbytes.f:1
>>> gbytes.o : (gbytes_) in archive ./ngl/libw3.a
>>> defined at gbytesc.f:15
>>> gbytesc.o : (.text+0x40) in archive ./ngl/libg2_4.a
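
Reading the log above: the linker reports that `gbytes_` is defined twice, once by `gbytes.o` in `./ngl/libw3.a` and once by `gbytesc.o` in `./ngl/libg2_4.a`. As a minimal sketch (the function name `duplicate_symbols` is hypothetical and not part of WPS or lld), a short Python script can pull that symbol-to-archive mapping out of an `ld.lld` log, which may help when the log is longer than this one:

```python
import re

# The error text exactly as posted above.
LOG = """\
ld.lld: error: duplicate symbol: gbytes_
>>> defined at gbytes.f:1
>>> gbytes.o : (gbytes_) in archive ./ngl/libw3.a
>>> defined at gbytesc.f:15
>>> gbytesc.o : (.text+0x40) in archive ./ngl/libg2_4.a
"""

def duplicate_symbols(log):
    """Map each duplicated symbol to the archives that define it."""
    result = {}
    current = None
    for line in log.splitlines():
        header = re.match(r"ld\.lld: error: duplicate symbol: (\S+)", line)
        if header:
            current = header.group(1)
            result.setdefault(current, [])
            continue
        archive = re.search(r"in archive (\S+)", line)
        if archive and current is not None:
            result[current].append(archive.group(1))
    return result

for symbol, archives in duplicate_symbols(LOG).items():
    print(symbol, "is defined in:", ", ".join(archives))
```

This only summarizes the log; it does not say which of the two definitions should be dropped or renamed.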

William.Hatheway · Jan 10, 2024

@leuthold

Did that error come from my configuration or from yours?

William.Hatheway · Jan 10, 2024

@leuthold

If you want, we can work together to solve this problem and then open a PR.

William.Hatheway · Jan 11, 2024

leuthold said:
AOCC has been an option in WRF for a few releases now. Thanks.
Mike

Were you able to use the AOCC compilers for WRF without having to change anything?

leuthold · Jan 17, 2024

For version 4.1.0, no. The compiler threw an error about "-vectorize-noncontigous-memory-aggressively". It told me to change it to "--vectorize-non-contiguous-memory-aggressively", which allowed the code to compile.
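
The flag rename described above can be applied to a configure.wrf mechanically. This is a minimal Python sketch under assumptions: the function name `fix_vectorize_flag` is hypothetical, and the sample "before" line is reconstructed from the error message quoted in this post rather than copied from a real file.

```python
# Spelling the compiler rejected, and the spelling it suggested instead
# (both taken from the error described in this post).
OLD_FLAG = "-vectorize-noncontigous-memory-aggressively"
NEW_FLAG = "--vectorize-non-contiguous-memory-aggressively"

def fix_vectorize_flag(text):
    """Replace the rejected flag spelling with the accepted one."""
    return text.replace(OLD_FLAG, NEW_FLAG)

# Assumed "before" line, for illustration only.
sample = "AMDLDFLAGS = -Wl,-mllvm -Wl,-enable-gather -Wl,-mllvm -Wl," + OLD_FLAG
fixed = fix_vectorize_flag(sample)
print(fixed)
```

Running the function a second time leaves the text unchanged, because the old spelling is not a substring of the new one.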

I also started running into unusual crashes in the MYNN and Thompson schemes, and had to change the optimization level from -Ofast to -O3 and remove -ffast-math. Below is the relevant part of configure.wrf:

# Settings for AMD Linux x86_64, AOCC flang compiler with AOCC clang (dm+sm)
# Supported AMDARCH are znver1, znver2 and znver3 for ZEN1, ZEN2 and ZEN3 respectively
# For optimized AMDFCFLAGS and AMDLDFLAGS, please reach out to toolchainsupport@amd.com
#

DESCRIPTION = AMD ($SFC/$SCC) : AMD ZEN1/ ZEN2/ ZEN3 Architectures
DMPARALLEL = 1
OMPCPP = -D_OPENMP
OMP = -fopenmp
OMPCC = -fopenmp
SFC = flang
SCC = clang
CCOMP = clang
DM_FC = mpif90
DM_CC = mpicc
FC = time $(DM_FC)
CC = $(DM_CC) -DFSEEKO64_OK
LD = $(FC)
RWORDSIZE = $(NATIVE_RWORDSIZE)

AMDARCH = -march=znver3
AMDMATHLIB = -fveclib=AMDLIBM
AMDLDFLAGS = -Wl,-mllvm -Wl,-enable-loop-reversal -Wl,-mllvm -Wl,-enable-gather -Wl,-mllvm -Wl,--vectorize-non-contiguous-memory-aggressively
AMDFCFLAGS =

PROMOTION = #-fdefault-real-8
ARCH_LOCAL = -DNONSTANDARD_SYSTEM_SUBR -DWRF_USE_CLM -DRPC_TYPES=2 $(NETCDF4_IO_OPTS)
CFLAGS_LOCAL = -w -c -m64 -O3 $(AMDARCH)
LDFLAGS_LOCAL = -m64 -O3 -Mstack_arrays $(AMDARCH) $(AMDLDFLAGS) $(AMDMATHLIB) -lamdlibm -lomp
CPLUSPLUSLIB =
ESMF_LDFLAG = $(CPLUSPLUSLIB)
FCOPTIM = -O3 $(AMDARCH) -Mbyteswapio -Mstack_arrays -ftree-vectorize -Mbyteswapio -funroll-loops -finline-aggressive -finline-hint-functions $(AMDMATHLIB) $(AMDFCFLAGS)
FCREDUCEDOPT = -O3 $(AMDARCH) -Mstack_arrays -DFCREDUCEDOPT
FCNOOPT = -O0
FCDEBUG = # -g $(FCNOOPT)
FORMAT_FIXED = -Mfixed
FORMAT_FREE = -Mfreeform
FCSUFFIX =
BYTESWAPIO = -Mbyteswapio
FCBASEOPTS_NO_G = -w $(FORMAT_FREE) $(BYTESWAPIO)
FCBASEOPTS = -O3 $(FCBASEOPTS_NO_G) $(FCDEBUG)
MODULE_SRCH_FLAG=
TRADFLAG = -traditional $(NETCDF4_IO_OPTS)
CPP = /lib/cpp -P
AR = llvm-ar
ARFLAGS = ru
M4 = m4
RANLIB = llvm-ranlib
RLFLAGS =
CC_TOOLS = $(SCC)
NETCDFPAR_BUILD = echo SKIPPING
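
The optimization change described in this post (-Ofast replaced by -O3, -ffast-math removed) can also be sketched in Python for anyone who wants to apply it to their own configure.wrf. The function name `relax_optimization` and the sample input line are hypothetical; the sketch only rewrites the flags and makes no claim about which variables a given WRF release sets them in.

```python
def relax_optimization(line):
    """Rewrite one 'NAME = flags' line: -Ofast becomes -O3, -ffast-math is dropped."""
    key, sep, value = line.partition("=")
    if not sep:
        # Not an assignment (comment or blank line): leave it untouched.
        return line
    tokens = [
        "-O3" if token == "-Ofast" else token
        for token in value.split()
        if token != "-ffast-math"
    ]
    return f"{key.rstrip()} = {' '.join(tokens)}".rstrip()

# Assumed "before" line, for illustration only.
before = "FCOPTIM = -Ofast -ffast-math $(AMDARCH) -Mbyteswapio"
print(relax_optimization(before))
```

Only the first `=` on a line is treated as the assignment, so values such as `-march=znver3` pass through unchanged.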

I'd like to install and try out AMD's math libraries, but my time working with AMD compilers and clusters has come to an end as I'm retiring.
Mike

William.Hatheway · Jan 17, 2024

Thank you, Mike. Once I get an AMD CPU to test on, I'll let you know how it goes. Can I reach out to you for help if needed? @leuthold @Gabriel Cassol
