Platform: Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz
Configure wrf:
Please select from among the following Linux x86_64 options:
1. (serial) 2. (smpar) 3. (dmpar) 4. (dm+sm) PGI (pgf90/gcc)
5. (serial) 6. (smpar) 7. (dmpar) 8. (dm+sm) PGI (pgf90/pgcc): SGI MPT
9. (serial) 10. (smpar) 11. (dmpar) 12. (dm+sm) PGI (pgf90/gcc): PGI accelerator
13. (serial) 14. (smpar) 15. (dmpar) 16. (dm+sm) INTEL (ifort/icc)
17. (dm+sm) INTEL (ifort/icc): Xeon Phi (MIC architecture)
18. (serial) 19. (smpar) 20. (dmpar) 21. (dm+sm) INTEL (ifort/icc): Xeon (SNB with AVX mods)
22. (serial) 23. (smpar) 24. (dmpar) 25. (dm+sm) INTEL (ifort/icc): SGI MPT
26. (serial) 27. (smpar) 28. (dmpar) 29. (dm+sm) INTEL (ifort/icc): IBM POE
30. (serial) 31. (dmpar) PATHSCALE (pathf90/pathcc)
32. (serial) 33. (smpar) 34. (dmpar) 35. (dm+sm) GNU (gfortran/gcc)
36. (serial) 37. (smpar) 38. (dmpar) 39. (dm+sm) IBM (xlf90_r/cc_r)
40. (serial) 41. (smpar) 42. (dmpar) 43. (dm+sm) PGI (ftn/gcc): Cray XC CLE
44. (serial) 45. (smpar) 46. (dmpar) 47. (dm+sm) CRAY CCE (ftn $(NOOMP)/cc): Cray XE and XC
48. (serial) 49. (smpar) 50. (dmpar) 51. (dm+sm) INTEL (ftn/icc): Cray XC
52. (serial) 53. (smpar) 54. (dmpar) 55. (dm+sm) PGI (pgf90/pgcc)
56. (serial) 57. (smpar) 58. (dmpar) 59. (dm+sm) PGI (pgf90/gcc): -f90=pgf90
60. (serial) 61. (smpar) 62. (dmpar) 63. (dm+sm) PGI (pgf90/pgcc): -f90=pgf90
64. (serial) 65. (smpar) 66. (dmpar) 67. (dm+sm) INTEL (ifort/icc): HSW/BDW
68. (serial) 69. (smpar) 70. (dmpar) 71. (dm+sm) INTEL (ifort/icc): KNL MIC
72. (serial) 73. (smpar) 74. (dmpar) 75. (dm+sm) FUJITSU (frtpx/fccpx): FX10/FX100 SPARC64 IXfx/Xlfx
Enter selection [1-75] : 20
------------------------------------------------------------------------
Compile for nesting? (1=basic, 2=preset moves, 3=vortex following) [default 1]:
Configuration successful!
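One way to see which instruction set the build was asked to target is to look at the compiler options that configure wrote out. The sketch below assumes the generated file is named configure.wrf and sits in the current directory; the helper name find_isa_flags and the list of flag patterns (-x..., -ax..., -march=..., -mavx...) are illustrative choices, not an exhaustive or official list.

```python
import re
from pathlib import Path

# Options that commonly select an instruction set for Intel or GNU compilers.
# This pattern list is an assumption made for illustration, not a complete list.
ISA_FLAG = re.compile(
    r"(?<!\S)(-x[A-Za-z][\w-]*|-ax[A-Za-z][\w,-]*|-march=\S+|-mavx\w*)"
)

def find_isa_flags(configure_text):
    """Return the distinct instruction-set style options, in order of appearance."""
    seen = []
    for flag in ISA_FLAG.findall(configure_text):
        if flag not in seen:
            seen.append(flag)
    return seen

if __name__ == "__main__":
    cfg = Path("configure.wrf")  # assumed location of the generated settings file
    if cfg.exists():
        for flag in find_isa_flags(cfg.read_text(errors="replace")):
            print(flag)
    else:
        print("configure.wrf not found in the current directory")
```

If the only flag printed names AVX or AVX2, the compiler was not asked to generate AVX-512 code, whatever the CPU supports.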
I then ran WRF with GFS1 data.
However, when I run WRF, I cannot tell whether the computation actually uses the AVX-512 instruction set.
perf stat:
starting wrf task 0 of 1
Performance counter stats for './wrf.exe':
94,261.71 msec task-clock # 0.994 CPUs utilized
453 context-switches # 0.005 K/sec
261 cpu-migrations # 0.003 K/sec
679,928 page-faults # 0.007 M/sec
322,807,987,394 cycles # 3.425 GHz (21.60%)
38,714,733,105 branches # 410.715 M/sec (21.61%)
580,459,095,247 instructions # 1.80 insn per cycle (26.95%)
649,679,401,942 uops_issued.any # 6892.294 M/sec (26.95%)
737,740,304,391 uops_executed.thread # 7826.511 M/sec (26.95%)
0 fp_assist.any # 0.000 K/sec (21.60%)
3,151,436,597 fp_arith_inst_retired.128b_packed_single # 33.433 M/sec (16.27%)
2,450,946,819 fp_arith_inst_retired.256b_packed_double # 26.002 M/sec (16.27%)
41,968,447,288 fp_arith_inst_retired.256b_packed_single # 445.233 M/sec (16.27%)
0 fp_arith_inst_retired.512b_packed_double # 0.000 K/sec (16.26%)
0 fp_arith_inst_retired.512b_packed_single # 0.000 K/sec (16.26%)
5,433,145,836 fp_arith_inst_retired.scalar_double # 57.639 M/sec (16.26%)
71,385,325,655 fp_arith_inst_retired.scalar_single # 757.310 M/sec (16.26%)
2,206,027 fp_arith_inst_retired.128b_packed_double # 0.023 M/sec (16.26%)
16,796,943,635 core_power.lvl0_turbo_license # 178.195 M/sec (16.26%)
304,529,500,056 core_power.lvl1_turbo_license # 3230.681 M/sec (16.26%)
1,656,906,525 core_power.lvl2_turbo_license # 17.578 M/sec (16.26%)
94.834498394 seconds time elapsed
93.185961000 seconds user
1.003576000 seconds sys
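To make the counters above easier to read, the following sketch groups the fp_arith_inst_retired.* events by vector width and prints each width's share. The counts are copied from the perf stat output in this post; the function name fp_counts_by_width is made up for illustration. Because perf multiplexed the events (each was measured only about 16% of the time), the values are scaled estimates rather than exact counts.

```python
import re

# Counts copied from the perf stat output above (multiplexed, so approximate).
PERF_LINES = """\
3,151,436,597 fp_arith_inst_retired.128b_packed_single
2,450,946,819 fp_arith_inst_retired.256b_packed_double
41,968,447,288 fp_arith_inst_retired.256b_packed_single
0 fp_arith_inst_retired.512b_packed_double
0 fp_arith_inst_retired.512b_packed_single
5,433,145,836 fp_arith_inst_retired.scalar_double
71,385,325,655 fp_arith_inst_retired.scalar_single
2,206,027 fp_arith_inst_retired.128b_packed_double
"""

EVENT = re.compile(
    r"^\s*([\d,]+)\s+fp_arith_inst_retired\.(scalar|\d+b)_", re.MULTILINE
)

def fp_counts_by_width(text):
    """Sum fp_arith_inst_retired.* counts per width label (scalar, 128b, 256b, 512b)."""
    totals = {}
    for count, width in EVENT.findall(text):
        totals[width] = totals.get(width, 0) + int(count.replace(",", ""))
    return totals

totals = fp_counts_by_width(PERF_LINES)
grand_total = sum(totals.values())
for width in sorted(totals):
    share = 100.0 * totals[width] / grand_total
    print(f"{width:>6}: {totals[width]:>16,d}  ({share:5.1f}%)")
```

In this run both 512b counters are zero, so perf recorded no 512-bit packed floating-point arithmetic; the packed work it did record is 256-bit and 128-bit. These events count floating-point arithmetic only, so they say nothing about integer or data-movement instructions.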