There are a number of components to the WRF modeling system. The pre-processors in the WPS package tend not to use much memory unless the domain has a large number of grid cells in each direction. Even for a 1000x1000 domain, we usually find that the geogrid and metgrid programs are able to run on a single processor, though those codes have been designed to run with MPI in a distributed memory fashion when needed. The ungrib program is independent of the proposed WRF domain, as it just decodes and uncompresses GRIB files; even for a quarter-degree global data set, the 2d arrays are in the 1440x720 range, which fits easily in even laptop-sized memory. The real program, the eta-level pre-processor to the WRF model, uses more memory than the WRF model itself. We also often find that the amount of time required for the WRF model to complete a simulation is about the same as the amount of time required for post-processing. The bottom line is to make sure that you are considering all of the system requirements, not just those of the WRF model itself.
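To make the array sizes above concrete, here is a minimal back-of-envelope sketch (not part of WRF itself); the 4-byte real size and the field counts are assumptions chosen purely for illustration.

    # Back-of-envelope memory estimate for gridded fields (illustrative only).
    BYTES_PER_REAL = 4  # assuming single-precision (4-byte) reals

    def field_memory_mb(nx, ny, nz=1, nfields=1):
        """Approximate memory for nfields arrays of size nx x ny x nz, in MB."""
        return nx * ny * nz * nfields * BYTES_PER_REAL / 1024.0**2

    # A quarter-degree global 2d field, as mentioned above: 1440 x 720 points.
    print(field_memory_mb(1440, 720))                        # ~4 MB per 2d field

    # A 1000 x 1000 domain with 50 vertical levels and, say, 100 3d fields
    # (the field count is a placeholder; the real number depends on physics options).
    print(field_memory_mb(1000, 1000, nz=50, nfields=100))   # ~19,000 MB in total

Numbers like the second one are why large domains are run in distributed memory: no single node needs to hold the whole thing.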
When trying to determine an appropriate machine, please consider a few items.
1) If this is expected to be a machine used largely for production tasks with distributed memory jobs, then you can reduce the amount of memory per node. If you will likely run a mixture of distributed memory jobs (which can aggregate memory across multiple nodes) and single-processor jobs (such as post-processors and visualization), then you will likely need more memory per node.
2) The decision on the number of cores per node, total nodes per machine, etc., again depends on the requirements for the machine. For example, if you are running semi-operationally and you need a forecast to complete within a specified period of time, then you need to increase the size of your single machine, but it will likely not be fully utilized for the entire day. However, if you are anticipating running a mixture of small and large jobs in production (for example, needing to run a 1-year simulation over a certain domain), you can probably get away with a smaller machine that runs 24 h per day. If you are expecting to run multiple jobs that are largely independent of each other (for example, ensembles), several n-processor machines would be more productive than a single machine with the same number of cores. A rough sizing sketch for the time-critical case is given after this list.
3) The WRF model does not take advantage of GPU, Xeon Phi, or any other accelerator technology at this time. If the machine is being purchased mostly for WRF, there is no need to include accelerators in your purchase. If your machine will be a multi-purpose machine, and graphics and visualization will be part of the mix, then having the login nodes populated with GPUs might be a real benefit.
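As mentioned in item 2, here is a rough sizing sketch for the time-critical case; the core-hour figure, the 3-hour deadline, and the assumption of roughly linear scaling at 80% efficiency are all placeholders that should be replaced with numbers from your own benchmark runs.

    # Rough capacity-planning sketch for a time-critical forecast (illustrative only).
    # The timings and the near-linear scaling assumption are placeholders; measure
    # your own case with a short benchmark run before sizing a purchase.

    def cores_needed(single_core_hours, wallclock_limit_hours, parallel_efficiency=0.8):
        """Estimate cores required to finish within a wall-clock limit."""
        return int(round(single_core_hours / (wallclock_limit_hours * parallel_efficiency)))

    # Example: a forecast estimated at 600 core-hours that must finish within 3 hours.
    print(cores_needed(600, 3))   # ~250 cores under these assumptions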
We apologize for not providing hard and fast numbers. Without discussing domain sizes, numbers of vertical levels, choices of physical parameterizations, and output frequency, general recommendations are all we can provide.
Recommendation #1
We do not test every possible combination of compilers, chips, OS, etc. We have quite a bit of experience at NCAR with Intel chips (in our desktop boxes and laptops, and in our supercomputers). The operating system must be a Unix/Linux flavor. We have found that users running in virtualized environments tend to have more trouble. We regularly test GNU and Intel compilers. If you would eventually like some assistance from us with the machine running WRF, staying within the bounds of these "norms" allows us to bring credible experience to your problems. Note that there are supercomputer vendors (Fujitsu, Cray, etc.) whose architectures we do not have personal access to, but who provide us with direct user support.
Recommendation #2
For a distributed memory machine, lean toward spending more of your money on additional processors rather than on additional memory per node. Memory can be aggregated across nodes.
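A minimal sketch of what memory aggregation means in practice; it ignores halo regions and arrays replicated on every task, so treat the numbers as lower bounds.

    # Illustration of memory aggregation in a distributed-memory (MPI) run.
    # Halo regions and replicated arrays are ignored, so this is only a rough guide.

    def per_task_memory_gb(total_domain_gb, mpi_tasks):
        """Approximate memory each MPI task must hold for a decomposed domain."""
        return total_domain_gb / mpi_tasks

    # A domain needing ~19 GB in total (see the earlier estimate) spread over
    # varying task counts:
    for tasks in (1, 16, 64, 256):
        print(tasks, round(per_task_memory_gb(19.0, tasks), 2), "GB per task")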
Recommendation #3
If your machine will be heterogeneous, load the login nodes (master node) up with plenty of memory.
Recommendation #4
The WRF model has the capability to output data from each processor, rather than funneling all output through a single task. The amount of communication that the WRF model requires is fairly large. Bandwidth between processors and bandwidth to the I/O system are both critical.
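To see why I/O bandwidth matters, here is a rough, illustrative estimate of the size of a single history write and the time to commit it to disk; the field counts, the 4-byte precision, and the 0.5 GB/s sustained write rate are assumptions, not WRF defaults.

    # Rough estimate of history-output volume and write time (illustrative only).
    # Field counts, precision, and the sustained write bandwidth are placeholders.

    def output_gb_per_write(nx, ny, nz, n3d_fields=50, n2d_fields=100, bytes_per_value=4):
        three_d = nx * ny * nz * n3d_fields
        two_d = nx * ny * n2d_fields
        return (three_d + two_d) * bytes_per_value / 1024.0**3

    size_gb = output_gb_per_write(1000, 1000, 50)
    print(round(size_gb, 1), "GB per history write")          # ~9.7 GB
    print(round(size_gb / 0.5, 1), "s at 0.5 GB/s sustained") # ~19 s per write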
Recommendation #5
A few desktop machines hooked together with ethernet cables will not make a good cluster. There is merit in purchasing proper networking infrastructure.
Recommendation #6
Disk space is relatively cheap. If your machine will be utilized for analysis, a few TB of disk will be insufficient.
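A quick, illustrative multiplication shows why a few TB disappears quickly on an analysis machine; the per-write size, output frequency, and simulation length below are placeholders.

    # Rough archive-size estimate for a long simulation (illustrative only).
    # The per-write size comes from an estimate like the one above; the output
    # frequency and simulation length are placeholders.
    gb_per_write = 9.7      # GB per history file (placeholder)
    writes_per_day = 8      # 3-hourly output (placeholder)
    days = 365              # a 1-year simulation
    print(round(gb_per_write * writes_per_day * days / 1024.0, 1), "TB of history output")  # ~28 TB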
Recommendation #7
All of these considerations may be worth raising with the larger user community, to see what others have done. After a few months of using a new machine they have purchased, users are usually very clear about what they like and dislike about their purchase.