I concur with @leuthold that the WRF `configure.defaults` already includes support for the AMD compilers. However, WPS appears to lack a dedicated configuration for them. Since WPS involves heavy data processing, aligning its configuration with the AMD compiler toolchain, including optimizations tailored to AMD CPUs, would help maximize performance and compatibility.
AOCC has been an option in WRF for a few releases now. Thanks.
Mike
########################################################################################################################
#ARCH AMD Linux x86_64, AOCC compilers # serial serial_NO_GRIB2 dmpar dmpar_NO_GRIB2
#
COMPRESSION_LIBS = CONFIGURE_COMP_L
COMPRESSION_INC = CONFIGURE_COMP_I
FDEFS = CONFIGURE_FDEFS
NCARG_LIBS =
NCARG_LIBS2 =
FC = mpif90
SFC = flang
CC = mpicc
SCC = clang
LD = $(FC)
FFLAGS = -Mfreeform -ffree-line-length-none -fopenmp
F77FLAGS = -Mfixed -ffixed-line-length-none -fopenmp
FNGFLAGS = $(FFLAGS)
LDFLAGS = -fopenmp
CFLAGS = -O3 -fopenmp
CPP = /lib/cpp -P -traditional
CPPFLAGS = -D_UNDERSCORE -DBYTESWAP -DLINUX -DIO_NETCDF -DIO_BINARY -DIO_GRIB1 -DBIT32 CONFIGURE_MPI
ARFLAGS =
RANLIB = ranlib
CC_TOOLS = $(SCC)
Can the one I came up with be tested somehow?
AOCC has not been added to the official WPS release. Sorry.
I came up with this with the help of ChatGPT and some coding forums. I have no way to test it, but perhaps it will give someone a starting point.
@Ming Chen @Gabriel Cassol
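For anyone who wants to try the untested stanza above, one possible route is to append it to WPS's `arch/configure.defaults` and rerun `./configure`. The sketch below assumes the stanza has been saved to a file (the name `aocc_stanza.txt` is hypothetical) and that the configure script picks up any `#ARCH` entry matching the local OS and machine type. It has not been run against a real WPS tree with AOCC.

```shell
# Append a candidate stanza to a configure.defaults file unless an
# identical "#ARCH" header line is already present.
add_stanza() {
  stanza=$1      # file holding the stanza text (hypothetical: aocc_stanza.txt)
  defaults=$2    # normally arch/configure.defaults in the WPS source tree
  header=$(grep -m1 '^#ARCH' "$stanza") || return 1
  if grep -qxF -- "$header" "$defaults"; then
    echo "stanza already present in $defaults"
  else
    # Keep a copy of the original file before modifying it.
    cp "$defaults" "$defaults.orig" && cat "$stanza" >> "$defaults"
  fi
}

# Usage, from the top of the WPS source tree:
#   add_stanza aocc_stanza.txt arch/configure.defaults
#   ./configure                      # look for the AOCC entry in the menu
#   ./compile 2>&1 | tee compile.log
```

Keeping `configure.defaults.orig` makes it easy to revert if the new entry misbehaves.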
@leuthold did you ever get it to work on WPS?
Almost. geogrid.exe and metgrid.exe compiled, but ungrib.exe did not. I used a configuration almost identical to the one above, with AOCC 4.1.
The ungrib.exe error:
ld.lld: error: duplicate symbol: gbytes_
>>> defined at gbytes.f:1
>>> gbytes.o : (gbytes_) in archive ./ngl/libw3.a
>>> defined at gbytesc.f:15
>>> gbytesc.o : (.text+0x40) in archive ./ngl/libg2_4.a
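To see which symbols the two archives named in the error both define, the output of `nm` can be filtered. This is only a diagnostic sketch, not a fix: the helper name `dup_defs` is made up here, and the archive paths are simply the ones printed in the message above.

```shell
# Print global text symbols that are defined in more than one archive.
# Expects `nm -A --defined-only` style lines on stdin, e.g.
#   libw3.a:gbytes.o:0000000000000000 T gbytes_
dup_defs() {
  awk '
    $(NF-1) == "T" {
      split($1, parts, ":")            # parts[1] is the archive name
      key = $NF SUBSEP parts[1]
      if (!(key in seen)) {
        seen[key] = 1
        count[$NF]++
        where[$NF] = where[$NF] " " parts[1]
      }
    }
    END {
      for (s in count)
        if (count[s] > 1) print s ":" where[s]
    }
  '
}

# Usage, from the directory where the link step ran:
#   nm -A --defined-only ./ngl/libw3.a ./ngl/libg2_4.a | dup_defs
```

If `gbytes_` is the only name reported, the clash is limited to that one routine. A longer list would suggest the two libraries overlap more broadly.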
Were you able to use the AOCC compilers for WRF without having to change anything?
AOCC has been an option in WRF for a few releases now. Thanks.
Mike
Thank you, Mike. Once I get an AMD CPU to test on, I'll let you know how it goes. Can I reach out to you for help if needed? @leuthold @Gabriel Cassol
For version 4.1.0, no. The compiler threw an error about "-vectorize-noncontigous-memory-aggressively". It told me to change it to "--vectorize-non-contiguous-memory-aggressively", which allowed the code to compile.
I started running into unusual crashes in the MYNN and Thompson schemes. I had to change the optimization level from -Ofast to -O3 and remove -ffast-math. Below is the relevant part of configure.wrf:
# Settings for AMD Linux x86_64, AOCC flang compiler with AOCC clang (dm+sm)
# Supported AMDARCH are znver1, znver2 and znver3 for ZEN1, ZEN2 and ZEN3 respectively
# For optimized AMDFCFLAGS and AMDLDFLAGS, please reach out to toolchainsupport@amd.com
#
DESCRIPTION = AMD ($SFC/$SCC) : AMD ZEN1/ ZEN2/ ZEN3 Architectures
DMPARALLEL = 1
OMPCPP = -D_OPENMP
OMP = -fopenmp
OMPCC = -fopenmp
SFC = flang
SCC = clang
CCOMP = clang
DM_FC = mpif90
DM_CC = mpicc
FC = time $(DM_FC)
CC = $(DM_CC) -DFSEEKO64_OK
LD = $(FC)
RWORDSIZE = $(NATIVE_RWORDSIZE)
AMDARCH = -march=znver3
AMDMATHLIB = -fveclib=AMDLIBM
AMDLDFLAGS = -Wl,-mllvm -Wl,-enable-loop-reversal -Wl,-mllvm -Wl,-enable-gather -Wl,-mllvm -Wl,--vectorize-non-contiguous-memory-aggressively
AMDFCFLAGS =
PROMOTION = #-fdefault-real-8
ARCH_LOCAL = -DNONSTANDARD_SYSTEM_SUBR -DWRF_USE_CLM -DRPC_TYPES=2 $(NETCDF4_IO_OPTS)
CFLAGS_LOCAL = -w -c -m64 -O3 $(AMDARCH)
LDFLAGS_LOCAL = -m64 -O3 -Mstack_arrays $(AMDARCH) $(AMDLDFLAGS) $(AMDMATHLIB) -lamdlibm -lomp
CPLUSPLUSLIB =
ESMF_LDFLAG = $(CPLUSPLUSLIB)
FCOPTIM = -O3 $(AMDARCH) -Mbyteswapio -Mstack_arrays -ftree-vectorize -Mbyteswapio -funroll-loops -finline-aggressive -finline-hint-functions $(AMDMATHLIB) $(AMDFCFLAGS)
FCREDUCEDOPT = -O3 $(AMDARCH) -Mstack_arrays -DFCREDUCEDOPT
FCNOOPT = -O0
FCDEBUG = # -g $(FCNOOPT)
FORMAT_FIXED = -Mfixed
FORMAT_FREE = -Mfreeform
FCSUFFIX =
BYTESWAPIO = -Mbyteswapio
FCBASEOPTS_NO_G = -w $(FORMAT_FREE) $(BYTESWAPIO)
FCBASEOPTS = -O3 $(FCBASEOPTS_NO_G) $(FCDEBUG)
MODULE_SRCH_FLAG=
TRADFLAG = -traditional $(NETCDF4_IO_OPTS)
CPP = /lib/cpp -P
AR = llvm-ar
ARFLAGS = ru
M4 = m4
RANLIB = llvm-ranlib
RLFLAGS =
CC_TOOLS = $(SCC)
NETCDFPAR_BUILD = echo SKIPPING
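The three changes described above (the renamed linker flag, -Ofast to -O3, and dropping -ffast-math) could be applied to a freshly generated configure.wrf with a small `sed` wrapper. This is a sketch that assumes the file contains those exact strings. The function name `patch_aocc_flags` is made up for illustration, and whether these edits are needed may depend on the AOCC release.

```shell
# Apply the flag changes described in this post to a configure.wrf file.
# A backup of the original is written next to it with a .bak suffix.
patch_aocc_flags() {
  sed -i.bak \
    -e 's/-vectorize-noncontigous-memory-aggressively/--vectorize-non-contiguous-memory-aggressively/g' \
    -e 's/-Ofast/-O3/g' \
    -e 's/ *-ffast-math//g' \
    "$1"
}

# Usage, from the top of the WRF source tree after running ./configure:
#   patch_aocc_flags configure.wrf
#   diff configure.wrf.bak configure.wrf   # review the changes before compiling
```

Running it a second time changes nothing further, since none of the patterns match the already-corrected flags.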
I'd like to install and try out AMD's math libraries, but my time working with AMD compilers and clusters has come to an end as I'm retiring.
Mike