Running WRF on different machines

Xun_Wang · May 3, 2019

Hi,

I am running WRFv4.1 on our HPC system. Right now we have nodes with two different architectures (different CPUs and different mainboards).

I compiled WRFv4.1 on each architecture using the same versions of the libraries and the Intel compiler (the libraries and the Intel compiler were built separately on each architecture).

These two builds of WRFv4.1 gave different results (same namelist, same forcing data, same number of processors...). For U and V, the maximum difference can reach 10 m/s, and for T2 the maximum difference can reach 4 K.

Of course, some small differences in the output can be expected, but I am not sure if the difference should be this large.

I would like to ask whether there is anything I need to take care of during compilation so that the two builds produce the "same" result.

Best regards,
Xun

kwerner · May 3, 2019

Hi Xun,

In general, you cannot expect identical results when running on two different computers, because too many variables in the process can make a difference. A single-digit change anywhere in the non-linear model code will produce different output, and the runs will diverge further as the integration proceeds. Unfortunately, even when you ensure that everything else is the same between the two compiles/runs, there isn't really anything that can be done to force bit-for-bit identical outcomes. Other users have reported pretty significant differences in the past as well.
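To illustrate how a last-digit difference can grow, here is a minimal Python sketch. It is not WRF code: the logistic map stands in for a non-linear model, and the names `logistic_trajectory` and `divergence`, the parameter `r = 3.9`, and the perturbation size `1e-15` are all illustrative assumptions. The first function also shows that floating-point addition is not associative, which is one possible way a different instruction ordering on another CPU could change the last digit.

```python
def addition_order_matters():
    """Return True if regrouping a floating-point sum changes the result."""
    left_first = (0.1 + 0.2) + 0.3
    right_first = 0.1 + (0.2 + 0.3)
    return left_first != right_first


def logistic_trajectory(x0, r=3.9, steps=200):
    """Iterate the logistic map x -> r * x * (1 - x), a toy non-linear model."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def divergence(x0=0.4, eps=1e-15, r=3.9, steps=200):
    """Absolute difference, step by step, between two runs whose
    initial values differ only by eps (a rounding-level perturbation)."""
    a = logistic_trajectory(x0, r, steps)
    b = logistic_trajectory(x0 + eps, r, steps)
    return [abs(p - q) for p, q in zip(a, b)]


if __name__ == "__main__":
    print("regrouping changes the sum:", addition_order_matters())
    diffs = divergence()
    for step in (0, 25, 50, 75, 100):
        print(step, diffs[step])
```

Printing `diffs` at a few steps should show the gap growing from rounding level toward order one as the iterations proceed, which is the same qualitative behaviour described above for two WRF builds.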

meteoadriatic · May 3, 2019

Xun,

No guarantees it will do anything, but you can try removing -fp-model fast=1, -no-prec-div and -no-prec-sqrt from the FCBASEOPTS_NO_G line in configure.wrf and adding -fp-model=precise or even -fp-model=strict in their place. Keep in mind that this might come with some performance penalty.

Please read more details here:
https://software.intel.com/en-us/fortran-compiler-developer-guide-and-reference-fp-model-fp
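The edit described above could be scripted roughly as follows. This is only a sketch: `tighten_fp_flags` is a hypothetical helper, the sample `FCBASEOPTS_NO_G` line is made up for illustration (a real configure.wrf will contain other flags), and the flag spelling `-fp-model=precise` is copied from this post, so check it against the documentation for your compiler version. WRF would need to be rebuilt after changing configure.wrf.

```python
import re

# Flags to drop, as named in the post. Each must be followed by
# whitespace or end of line so longer flag names are not clipped.
FAST_FLAGS = re.compile(
    r"\s*(?:-fp-model\s+fast=1|-no-prec-div|-no-prec-sqrt)(?=\s|$)"
)


def tighten_fp_flags(configure_text, new_flag="-fp-model=precise"):
    """Return configure.wrf text with the fast floating-point flags removed
    from the FCBASEOPTS_NO_G line and new_flag appended to that line."""
    out = []
    for line in configure_text.splitlines():
        if re.match(r"\s*FCBASEOPTS_NO_G\s*=", line):
            line = FAST_FLAGS.sub("", line).rstrip()
            if new_flag not in line:
                line = line + " " + new_flag
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical excerpt, not taken from a real configure.wrf.
    sample = (
        "FCBASEOPTS_NO_G = -ip -fp-model fast=1 -no-prec-div -no-prec-sqrt -w\n"
        "FCBASEOPTS = $(FCBASEOPTS_NO_G) $(FCDEBUG)\n"
    )
    print(tighten_fp_flags(sample), end="")
```

The function works on text rather than on the file directly, so the result can be inspected (or diffed against the original) before anything is written back to configure.wrf.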

Ivan

Xun_Wang · May 6, 2019

kwerner said:
Hi Xun,

In general, you cannot expect identical results when running on two different computers, because too many variables in the process can make a difference. A single-digit change anywhere in the non-linear model code will produce different output, and the runs will diverge further as the integration proceeds. Unfortunately, even when you ensure that everything else is the same between the two compiles/runs, there isn't really anything that can be done to force bit-for-bit identical outcomes. Other users have reported pretty significant differences in the past as well.

Hi,

Thank you for the reply and explanation!

Xun_Wang · May 6, 2019

meteoadriatic said:
Xun,

No guarantees it will do anything, but you can try removing -fp-model fast=1, -no-prec-div and -no-prec-sqrt from the FCBASEOPTS_NO_G line in configure.wrf and adding -fp-model=precise or even -fp-model=strict in their place. Keep in mind that this might come with some performance penalty.

Please read more details here:
https://software.intel.com/en-us/fortran-compiler-developer-guide-and-reference-fp-model-fp

Ivan

Hi Ivan,

Thanks for your suggestion! I'll take a look at it.
