Why do different compilers give different results for the same case?

Jing_iap
Posts: 2
Joined: Wed Dec 23, 2020 10:44 am

Post by Jing_iap » Wed Dec 23, 2020 11:14 am

Hi,
I'm using WRF version 4.0.2 to run a high-resolution simulation (9 km/3 km/1 km). However, when I built the model with different compilers (Intel version 2017 and GCC version 7.3.1) on an Intel CPU 6142, the results differed; the maximum difference in T2 reached 9 K.

Both runs used the same wrfinput_d01, wrfbdy_d01, and namelist.input. When compiling with the two compilers, I didn't change anything in configure.wrf except the DM_FC and DM_CC settings (mpif90 and mpicc for GCC, mpiifort and mpiicc for Intel).

I also checked other variables, such as Q2, RAINNC, and U10; all of them show obvious differences, just like T2.

Could you please tell me why different compilers give different results for the same case? Is this reasonable? How can I make them produce the same output?

Thanks!

kwerner
Posts: 2287
Joined: Wed Feb 14, 2018 9:21 pm

Post by kwerner » Mon Dec 28, 2020 6:38 pm

Hi,
As I'm not a computer science person, I can't tell you "why" this happens, except that different compilers process the code differently. This is completely expected, though, and we are aware that results will never be identical when using different compilers or machines.
NCAR/MMM

Jing_iap

Post by Jing_iap » Thu Dec 31, 2020 6:33 am

Thanks for the reply.
I understand that differences exist when using different compilers or machines, but is it reasonable for the differences to be that large?
Do you have any idea which result is the "best" one? Based on your experience, do you have any recommendations about which machine or compiler to use?
Thanks a lot!

mgduda
Posts: 466
Joined: Mon Feb 26, 2018 7:35 pm

Post by mgduda » Thu Dec 31, 2020 6:53 pm

Often, the differences arise from round-off error in floating-point operations, which may be handled differently by different compilers and machines. For example, compilers may use optimized math libraries or convert a division into a multiplication by the inverse, and different processors may provide different representations of floating-point numbers (e.g., extended precision) or use different rounding modes by default. Regardless of the source, the differences, which may initially be very small (around "machine epsilon"), can grow quickly over time in chaotic systems, leading to qualitative differences in the model results. Judt (JAS, 2018) nicely illustrates this error growth.
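As a rough sketch of the division-versus-inverse point (a toy example, not anything taken from WRF; the program name `recip` and its variables are made up for illustration), the following compares x / 3.0 with x * (1.0/3.0) in default real precision and counts how often the two disagree:

```fortran
program recip

    implicit none

    integer :: i, ndiff
    real :: x, third, q, p

    third = 1.0 / 3.0      ! rounded once, then reused as a multiplier
    ndiff = 0

    do i = 1, 1000
        x = real(i)
        q = x / 3.0        ! one rounding: the division itself
        p = x * third      ! two roundings: the inverse, then the product
        if (q /= p) ndiff = ndiff + 1
    end do

    write(6,*) 'inputs where the two forms differ: ', ndiff, ' of 1000'

end program recip
```

Assuming 32-bit default reals and strict IEEE arithmetic, x = 5.0 is one input for which the two forms round to different values. Aggressive optimization settings may compile both expressions to the same instructions, in which case the count would be zero, which is itself an example of a compiler changing the answer.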

Below is a simple Fortran program that illustrates how the order of summation can change the result.

Code:

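program assoc

    implicit none

    real :: x, y, z, w1, w2

    x = 1.0
    y = 2.0**(-24)   ! half of machine epsilon for 32-bit reals
    z = 2.0**(-24)

    ! Same three terms, grouped two different ways
    w1 = (x + y) + z
    w2 = x + (y + z)

    ! Print how far each sum is from 1.0
    write(6,*) (w1 - 1.0), (w2 - 1.0)

    stop

end program assoc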
In exact arithmetic, (x + y) + z is identical to x + (y + z), yet this is not true in general in floating-point arithmetic. Here, assuming 32-bit reals with IEEE round-to-nearest, each addition of 2**(-24) to 1.0 rounds back to 1.0, so w1 - 1.0 is 0, whereas y + z = 2**(-23) survives the addition to x, so w2 - 1.0 is about 1.19e-7. Compounding matters, numerical models like WRF contain parameterizations of complex physical processes, and these often include conditional statements that depend on floating-point values. It's easy to imagine how such conditionals can amplify differences by causing entirely different code to be executed, depending on the compiler; for example (building on the Fortran above):

Code:

   if (w1 > 1.0) then
      ! ... call some subroutine to handle the case where w1 > 1.0 ...
   else
      ! ... call a different subroutine to handle the case where w1 <= 1.0 ...
   end if
If, instead, w1 were computed in the same way as w2, then an entirely different subroutine would be called.
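Putting the two snippets together, a self-contained sketch might look like the following. The program name `branch`, the helper `pick`, and the printed messages are placeholders invented here to stand in for the two subroutines:

```fortran
program branch

    implicit none

    real :: x, y, z, w1, w2

    x = 1.0
    y = 2.0**(-24)
    z = 2.0**(-24)

    ! Same three terms, grouped two different ways
    w1 = (x + y) + z
    w2 = x + (y + z)

    call pick('w1', w1)
    call pick('w2', w2)

contains

    ! Report which side of the conditional a given sum falls on
    subroutine pick(name, w)
        character(len=*), intent(in) :: name
        real, intent(in) :: w

        if (w > 1.0) then
            write(6,*) name, ': took the w > 1.0 branch'
        else
            write(6,*) name, ': took the w <= 1.0 branch'
        end if
    end subroutine pick

end program branch
```

With 32-bit reals and round-to-nearest, I would expect w1 to land in the else branch and w2 in the if branch, but a compiler that reassociates the sums or keeps intermediates in extended precision could report otherwise.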
NCAR/MMM

mgduda

Post by mgduda » Thu Dec 31, 2020 7:26 pm

I think my previous post may at least partially address the question of whether "large" differences in results are reasonable -- essentially, given a round-off-level difference and a long enough integration time, we can get qualitatively different "weather".
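To make that concrete with something far simpler than WRF, here is a toy sketch using the logistic map, a standard textbook chaotic system chosen purely for illustration (the program name `growth` is made up). Two runs start one machine epsilon apart, and the gap between them is printed every 10 steps:

```fortran
program growth

    implicit none

    integer, parameter :: dp = kind(1.0d0)
    integer :: n
    real(dp) :: a, b

    a = 0.4_dp
    b = a + epsilon(a)    ! perturb the start by machine epsilon

    do n = 1, 80
        ! Logistic map with r = 4, which is in the chaotic regime
        a = 4.0_dp * a * (1.0_dp - a)
        b = 4.0_dp * b * (1.0_dp - b)
        if (mod(n, 10) == 0) write(6,*) n, abs(a - b)
    end do

end program growth
```

The gap should climb from round-off level to order one within several dozen iterations. A weather model has vastly more degrees of freedom, but the same sensitivity to tiny initial differences applies.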

On the question of which compiler and machine is "best", I think that, absent any compiler bugs, there is no "best" choice. Some compilers are available for free, but they tend to produce slower executables than commercial compilers; so whether speed or price is more important is up to you. Some compilers also provide better compile-time diagnostics and better run-time checks, so if you intend to do any significant model development, there might be some reason in this regard to choose one compiler over another; however, we've often found that having multiple compilers can be helpful in development, as each compiler seems to have its own strengths when it comes to error checking.
NCAR/MMM

neel14
Posts: 97
Joined: Fri Mar 15, 2019 6:52 am

Post by neel14 » Thu Apr 08, 2021 3:03 am

Hi,
Should we also expect different results for different versions of the same compiler (e.g., Intel 18 and 19)?

Regards

kwerner

Post by kwerner » Mon Apr 12, 2021 5:37 pm

Hi,

Yes, it is also possible for results to differ when the compiler version is different.
NCAR/MMM
