
FFT test cases in WRF

This post was from a previous version of the WRF&MPAS-A Support Forum. New replies have been disabled and if you have follow up questions related to this post, then please start a new thread from the forum home page.

biplab

New member
Hello,
I have been using WRF recently and want to benchmark it for FFT runtime.
I know that FFTs are used in WRF for: 1) the polar filter, 2) spectral nudging, and 3) stochastic backscatter.

I tried the CONUS 12 km benchmark test case, but it did not show any FFT calls. I realized that, at least for the polar filter, a global domain is expected.

My questions:
1) To benchmark WRF for FFT runtime, which test cases or benchmark inputs should be used? Please provide reference links if possible.
2) How should WRF be configured and run so that the three modules listed above (polar filter, spectral nudging, stochastic backscatter) are invoked?
Detailed steps to configure and run the suggested test case or benchmark input would be very helpful.

I look forward to your urgent help and support.

Thank you.
S. Biplab Raut
 
Hi,

1) The 3 options you mention are the only ones that use it in the WRF model.
The stochastic option probably only calls it at the beginning and maybe sometimes during the run, and would not use FFT much.
The polar filter needs a global domain and spectral nudging needs a large area and both would use FFTs every timestep.

2)
a. To use the polar filter, you will need to set fft_filter_lat in the &dynamics section of the namelist.input file, and you will need to use a global domain. There is an example namelist for global domains in both the WPS/ (namelist.wps.global) and test/em_real/ (namelist.input.global) directories. FFTs should be used every timestep for this option. You can read a little more about the namelist settings here: http://www2.mmm.ucar.edu/wrf/users/docs/user_guide_v4/v4.0/users_guide_chap5.html#Namelist

b. To run spectral nudging, you need a large area, and FFTs should be used every timestep for this option. For more information, take a look at this presentation (given at our WRF Tutorial): http://www2.mmm.ucar.edu/wrf/users/tutorial/201901/dudhia_fdda.pdf
Additionally, you can loosely follow an example for running this here (though this uses a small domain for simplicity):
http://www2.mmm.ucar.edu/wrf/OnLineTutorial/Class_Jan2019/cases/fdda.php

c. The stochastic option probably only calls FFT at the beginning (and possibly sometimes during the run), and would not use FFT much. For information on running with the stochastic option, see this section of our Users' Guide:
http://www2.mmm.ucar.edu/wrf/users/docs/user_guide_v4/v4.0/users_guide_chap5.html#stochastic
in addition to the namelist section regarding this (search for 'skebs'):
http://www2.mmm.ucar.edu/wrf/users/docs/user_guide_v4/v4.0/users_guide_chap5.html#Namelist
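
As a rough cross-check before a run, the sketch below reads a namelist.input text and reports which of the three FFT-using options appear to be switched on. It is only a sketch under stated assumptions: the tiny parser (`parse_namelist`) and the `fft_options` helper are hypothetical, and the switches it looks for (`polar` in &bdy_control alongside `fft_filter_lat` in &dynamics, `grid_fdda = 2` in &fdda, `skebs = 1` in &stoch) are my reading of the Users' Guide namelist section and should be verified there for your WRF version.

```python
def parse_namelist(text):
    """Minimal namelist reader (hypothetical helper): {group: {key: [values]}}."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("!", 1)[0].strip()  # drop trailing '!' comments
        if not line:
            continue
        if line.startswith("&"):
            current = line[1:].strip().lower()
            groups[current] = {}
        elif line == "/":
            current = None
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            values = [v.strip() for v in value.split(",") if v.strip()]
            groups[current][key.strip().lower()] = values
    return groups


def fft_options(groups):
    """Report which FFT-using options look enabled (first domain only).

    The switch names checked here are assumptions to confirm against the
    Users' Guide for the WRF version in use.
    """
    def first(group, key, default=None):
        values = groups.get(group, {}).get(key)
        return values[0] if values else default

    polar = first("bdy_control", "polar", ".false.").lower() in (".true.", ".t.", "t")
    has_filter_lat = first("dynamics", "fft_filter_lat") is not None
    return {
        "polar_filter": polar and has_filter_lat,
        "spectral_nudging": first("fdda", "grid_fdda", "0") == "2",
        "skebs": first("stoch", "skebs", "0") == "1",
    }


if __name__ == "__main__":
    with open("namelist.input") as fh:
        print(fft_options(parse_namelist(fh.read())))
```

If an option is reported as off, the corresponding FFT routines would not be expected to show up in a profile, whatever input data is used.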
 
Hello,
Thank you very much for your response. I will definitely try the points you mentioned.
However, could you tell me where to find benchmark test inputs for running with the polar filter or spectral nudging?

Further requests and questions are inline below:

1) The 3 options you mention are the only ones that use it in the WRF model.
The stochastic option probably only calls it at the beginning and maybe sometimes during the run, and would not use FFT much.
The polar filter needs a global domain and spectral nudging needs a large area and both would use FFTs every timestep.

Thank you for the info.

2)
a. To use the polar filter, you will need to set fft_filter_lat in the namelist.input file in the &dynamics section, and will need to use a global domain. There is an example namelist for global domains in both the WPS/ (namelist.wps.global) and test/em_real/ (namelist.input.global) directories. FFT's should be used every timestep for this option. You can read a little bit more about the namelist settings here: http://www2.mmm.ucar.edu/wrf/users/docs ... l#Namelist
I had actually set fft_filter_lat to 5 in namelist.input and copied various other parameters from namelist.input.global, then ran the CONUS 12 km input. It did not show any polar filter calls. After I changed namelist.input further, the run failed with a runtime assertion.
So, could you tell me which benchmark test input to run for a global domain, and where I can download it?
Thank you.

b. To run spectral nudging, you need a large area, and FFT's should be used every timestep for this option. For more information, take a look at this presentation (given at our WRF Tutorial): http://www2.mmm.ucar.edu/wrf/users/tuto ... a_fdda.pdf
Additionally, you can loosely follow an example for running this here (though this uses a small domain for simplicity):
http://www2.mmm.ucar.edu/wrf/OnLineTuto ... s/fdda.php

Thank you for your reply. I will try as you mentioned.
However, can I use the CONUS 12 km regional-domain benchmark input for this?

c. The stochastic option probably only calls FFT at the beginning (and possibly sometimes during the run), and would not use FFT much. For information on running with the stochastic option, see this section of our Users' Guide:
http://www2.mmm.ucar.edu/wrf/users/docs ... stochastic
in addition to the namelist section regarding this (search for 'skebs'):
http://www2.mmm.ucar.edu/wrf/users/docs ... l#Namelist

Thank you very much. Please tell me which test input to use for this.

Thank you,
S. Biplab Raut
 
Hi,
Unfortunately we don't have any benchmarks for these particular tests.

2a) To run the global test for verification, you should run the domain set-up we have created for the default namelist.wps.global and namelist.input.global namelists. These have been previously tested, and should work. If you are only using CONUS input, then you only have data over the continental U.S., and not for the entire globe. Setting fft_filter_lat = 5 means that the filtering occurs at 5 degrees, and extends toward the poles, which again is not inside the CONUS.

As a side note on global runs with WRF: if you are planning to do a lot of future work with a global model, you may want to consider our MPAS model. It was originally built as a global model and scales much better, so it runs much faster without needing a very small time step. WRF was originally built as a regional model; some developers later added a global component, but it is not highly tested and is no longer developed, so we cannot always guarantee that you will get good results, depending on your application. If you are interested in taking a look at MPAS, you can find information here: https://mpas-dev.github.io

2b) The 12km CONUS case should be okay to use for spectral nudging.

2c) We do not have any benchmark test input to use with the stochastic option. However, following the advice given in the links I sent (you may also want to take a look at the literature mentioned in the Users' Guide, from the developers of this scheme), you should be able to use any generic input.
 
Thank you for all your valuable suggestions.

I tried to configure and run spectral nudging with CONUS test input, but could not find any FFT calls.
I followed the link http://www2.mmm.ucar.edu/wrf/OnLineTuto ... s/fdda.php and used the config params mentioned.

I am not sure how to proceed in order to profile WRF for FFT usage and then go on to integrate different FFT libraries.
I would be grateful for any help or references that could solve my problem.

Thank you,
S. Biplab Raut
 
Hi,
Inside the file phys/module_fdda_spnudging.F, and inside the subroutine spectralnudgingfilterfft2dncar, there are calls to
rfftmi
rfftmf
rfftmb

If you put timing statements around those, you should see printouts confirming that these routines are actually being called. I'm attaching a file that I have already modified for you with these statements; look for the comments 'kkw' to find the additions. I'm not sure which version of the code you're using, but this particular file is for V4.0.3. I don't think much (if anything) has changed in this file over the past few versions of the code, but if you're using an older version of WRF, make sure there are no other differences between this file and the one you're using. Place this file in the phys/ directory, and then recompile the code. Since you're not modifying the configuration or the Registry, you don't need to issue a 'clean -a' or reconfigure; simply recompile, and it should be quicker than the initial compile. After that, run the case again and look for the "Time rfftm*" prints in your rsl.out* files.
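
Once those prints exist, a small script can total them across all rsl.out* files. The following is a minimal sketch, not part of WRF: it assumes each timing line looks roughly like `Time rfftmf: 0.00123` (a routine name starting with `rfftm` followed by a number of seconds). The actual format depends on the print statements in the modified file, so the regular expression may need adjusting.

```python
import glob
import re
from collections import defaultdict

# Assumed line shape: "Time rfftmf: 0.00123". Adjust to the real print format.
TIMING = re.compile(
    r"Time\s+(rfftm\w*)\D*?([0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)"
)


def sum_fft_times(lines):
    """Return (total seconds, call count) per routine name found in lines."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in lines:
        match = TIMING.search(line)
        if match:
            totals[match.group(1)] += float(match.group(2))
            counts[match.group(1)] += 1
    return dict(totals), dict(counts)


def read_rsl(pattern="rsl.out.*"):
    """Yield every line from all files matching the glob pattern."""
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as handle:
            yield from handle


if __name__ == "__main__":
    totals, counts = sum_fft_times(read_rsl())
    for name in sorted(totals):
        print(f"{name}: {counts[name]} calls, {totals[name]:.6f} s total")
```

Run it from the directory that holds the rsl.out* files. No matches at all would suggest the nudging code path was never entered, rather than that the FFTs are merely fast.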

I was discussing this forum topic with a colleague who gave me some additional information to pass along to you. You may or may not find this useful, but I just thought I'd pass it along.
1) We tend to see poor performance when the length of a vector is not a product of small primes.
2) There is a library available that is supposed to help with the problem above. It's called FFTW.
3) Our FFTs don't scale well with large processor counts. Essentially, the performance decreases as the number of processors increases.
4) If you find any solutions for us in your work, we would love for you to contribute them to our code. If you get to that point, I can provide further instructions to do so.
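
To make point 1 concrete, the sketch below factors a transform length and checks whether it is built only from small primes. It is an illustration under assumptions: the radix set (2, 3, 5) is a common choice for mixed-radix FFT routines, not something confirmed for WRF's FFT code, and the lengths used are hypothetical examples rather than the dimensions of any particular WRF domain.

```python
def prime_factors(n):
    """Return the prime factors of n (n >= 1) in ascending order."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors


def is_smooth(n, radices=(2, 3, 5)):
    """True if every prime factor of n is in radices (an assumed radix set)."""
    return all(p in radices for p in prime_factors(n))


def next_smooth(n, radices=(2, 3, 5)):
    """Smallest length >= n whose prime factors all lie in radices."""
    while not is_smooth(n, radices):
        n += 1
    return n


if __name__ == "__main__":
    for length in (300, 425):  # hypothetical vector lengths
        print(length, prime_factors(length), is_smooth(length), next_smooth(length))
```

A length such as 425 = 5 x 5 x 17 carries a factor of 17, which is the kind of case where a mixed-radix FFT tends to slow down; a nearby length like 432 = 2^4 x 3^3 avoids it.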
 

Attachment: module_fdda_spnudging.F (44.4 KB)

Hi,
Thank you for your suggestions, and sorry for the late reply; I was off work.

kwerner said:
Hi,
Inside the file phys/module_fdda_spnudging.F, and inside the subroutine spectralnudgingfilterfft2dncar, there are calls to
rfftmi
rfftmf
rfftmb

If you put timing statements around those, you should be able to see the print outs to show that it's actually doing these things. I'm attaching a file that I modified for you already, with these statements. If you look for the comments 'kkw' then you'll see the additions. I'm not sure what version of the code you're using, but this particular file is for V4.0.3. I don't think much (if anything) has changed with this file over the past few versions of the code, but you'll just want to make sure there aren't other diffs in the this file, and the one you're using (if you're using an older version of WRF). Place this file in the phys/ directory, and then you'll need to recompile the code. Since you're not modifying the configuration or Registry, you don't need to issue a 'clean -a' or reconfigure. Just simply recompile and it should be quicker than the initial compile. After that, run this again and look for the "Time rfftm*" prints in your rsl.out* files.
Thank you very much for your help. I really appreciate your effort in modifying the source file and sharing it. I will try this file with my WRF build.

By the way, I have been using WRF version 3.9.1.1. I used input data files from http://www2.mmm.ucar.edu/wrf/users/download/get_sources_wps_geog_V3.html to generate met_em.d01.* by running WPS.
I followed your suggested link http://www2.mmm.ucar.edu/wrf/OnLineTuto ... s/fdda.php to configure and run WRF.
Using the 'perf' tool, I found that the spectral nudging functions account for only about 1% of the total run time. So I am unsure: is FFT use in the spectral nudging case really this minimal? Please share your views on this.
If FFT usage is this low in spectral nudging and WRF is not normally used with the polar filter, I will have to evaluate how much an FFT performance improvement would contribute to WRF's overall run time.
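
That evaluation can be bounded with Amdahl's law: if a fraction f of the run time is sped up by a factor s, the overall speedup is 1 / ((1 - f) + f / s). The sketch below applies it with f = 0.01, taking the roughly 1% perf figure above as an upper bound on the FFT share (the FFT calls are only part of the spectral nudging functions). The function name is hypothetical and the numbers are illustrative, not new measurements.

```python
def overall_speedup(fraction, factor):
    """Amdahl's law: whole-run speedup when `fraction` of the time is made
    `factor` times faster. `factor` may be float('inf') for the limiting case.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)


if __name__ == "__main__":
    fft_share = 0.01  # assumed upper bound, from the ~1% perf measurement
    for factor in (2.0, 10.0, float("inf")):
        gain = overall_speedup(fft_share, factor)
        print(f"FFT {factor}x faster -> whole run {gain:.4f}x faster")
```

Even an infinitely fast FFT would give at most about a 1.01x overall gain at that share, so a case where FFTs take a much larger fraction of the run time (for example a global domain with the polar filter) would be needed for library changes to show up.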

I was discussing this forum topic with a colleague who gave me some additional information to pass along to you. You may or may not find this useful, but I just thought I'd pass it along.
1) We tend to see poor performance with a link of a vector that is not a small prime.
2) There is a library available that is supposed to help with the problem above. It's called 'fftw.'
3) Our FFT's don't scale well with large processor counts. Essentially the performance decreases as the number of processors increases.
4) If you find any solutions for us in your work, we would love for you to contribute them to our code. If you get to that point, I can provide further instructions to do so.
This coincides with my own efforts; my primary objective is to improve FFT run time in WRF.
However, my main question (and my confusion) is this: if the FFT share of run time is so low in spectral nudging, and the polar filter is only for the less commonly used global domain, then why do FFTs affect WRF performance as described above?
Please also let me know which test input and test case were run when FFT performance was found to affect WRF performance as stated above.

Thanks,
S. Biplab Raut
 
Hi,
Unfortunately I don't think we know the answers to those questions at this time. There hasn't been a lot of work done regarding this in our department, as we don't have the resources available to contribute much time to it. I don't believe there are any test cases available anywhere - it's just something that perhaps was tested along the way, and reported back to us over time.
 