
WRF simulation is very slow on multicore processors

hhhyd

New member
Hi,

I am using a Linux cluster with GNU compilers to run WRF simulations using mpirun. Currently, I am using 9 nodes with a total of 36 cores, and the CPU usage appears efficient.

I have four domains, each with a grid size of 250 × 150. However, after running for an entire day of wall-clock time, the model has only advanced 11 seconds of simulated time. Do you have any suggestions for improving the computational efficiency?

I've also attached my namelist here, as I suspect some settings might be contributing to the increased simulation time.

Thank you!
 

Attachments

  • namelist.input (5 KB)
Something is obviously wrong if it takes a whole day to simulate 11 seconds. In a cluster, data is shared between nodes over the network, and if the network is not functioning properly the CPUs will sit idle waiting for data to arrive, even though in top they will appear fully utilized. So the first thing to do is to test the network thoroughly: the physical links and the quality of the connections, but also routing tables and hostnames. Verify that throughput and latency are what you expect and that nothing else is saturating the network, switches, or routers.
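For what it's worth, the core of the latency check described above is a ping-pong measurement. On a real cluster you would run it between ranks on different nodes with a proper benchmark (e.g. osu_latency from the OSU micro-benchmarks, or iperf3 for raw TCP throughput); the following self-contained Python sketch only illustrates the measurement idea, using two local processes instead of two nodes:

```python
import time
from multiprocessing import Pipe, Process


def echo(conn, n_msgs):
    # Play the remote side of the ping-pong: echo every message straight back.
    for _ in range(n_msgs):
        conn.send_bytes(conn.recv_bytes())
    conn.close()


def pingpong_latency(n_msgs=1000, size=8):
    """Mean round-trip time, in seconds, for small messages between two
    local processes. On a cluster the same pattern is run between nodes."""
    parent, child = Pipe()
    worker = Process(target=echo, args=(child, n_msgs))
    worker.start()
    payload = b"x" * size
    t0 = time.perf_counter()
    for _ in range(n_msgs):
        parent.send_bytes(payload)   # ping
        parent.recv_bytes()          # pong
    elapsed = time.perf_counter() - t0
    worker.join()
    return elapsed / n_msgs


if __name__ == "__main__":
    rtt = pingpong_latency()
    print(f"mean round-trip: {rtt * 1e6:.1f} us")
```

If the measured latency between nodes is orders of magnitude above what the interconnect should deliver, that alone can explain ranks spending the day waiting rather than computing.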
 
Thanks for your reply. I am going to check the network, but it appears to be functioning well for MPI: I can use MPI to run a very simple test code. Additionally, I am wondering whether compiling WRF with both dmpar and smpar (option 35, (dm+sm) GNU (gfortran/gcc)) could affect WRF's MPI performance. The MPI library WRF is built against is:
HYDRA build details:
Version: 4.2.2
Release Date: Wed Jul 3 09:16:22 AM CDT 2024
CC: gcc
Configure options: '--disable-option-checking' '--prefix=/home/houyidi/usr/local/mpich-4.2.2' '--with-hwloc=embedded' '--enable-fast=O3' '--enable-cxx' '--with-device=ch4:ofi' '--cache-file=/dev/null' '--srcdir=../../../../src/pm/hydra' 'CC=gcc' 'CFLAGS= -O3' 'LDFLAGS=' 'LIBS=' 'CPPFLAGS= -DNETMOD_INLINE=__netmod_inline_ofi__ -I/home/houyidi/TEMP/mpich-4.2.2/build/src/mpl/include -I/home/houyidi/TEMP/mpich-4.2.2/src/mpl/include -I/home/houyidi/TEMP/mpich-4.2.2/modules/json-c -I/home/houyidi/TEMP/mpich-4.2.2/build/modules/json-c -I/home/houyidi/TEMP/mpich-4.2.2/modules/hwloc/include -I/home/houyidi/TEMP/mpich-4.2.2/build/modules/hwloc/include -D_REENTRANT -I/home/houyidi/TEMP/mpich-4.2.2/build/src/mpi/romio/include -I/home/houyidi/TEMP/mpich-4.2.2/src/pmi/include -I/home/houyidi/TEMP/mpich-4.2.2/build/src/pmi/include -I/home/houyidi/TEMP/mpich-4.2.2/build/modules/yaksa/src/frontend/include -I/home/houyidi/TEMP/mpich-4.2.2/modules/yaksa/src/frontend/include -I/home/houyidi/TEMP/mpich-4.2.2/build/modules/libfabric/include -I/home/houyidi/TEMP/mpich-4.2.2/modules/libfabric/include'
Process Manager: pmi
Launchers available: ssh rsh fork slurm ll lsf sge manual persist
Topology libraries available: hwloc
Resource management kernels available: user slurm ll lsf sge pbs cobalt
Demux engines available: poll select
I really appreciate your help and suggestions.
 
I am wondering whether compiling WRF with both dmpar and smpar (option 35, (dm+sm) GNU (gfortran/gcc)) could affect WRF's MPI performance?
Hi, Does this mean that you installed WRF with the dm+sm option? If so, please try dmpar instead. We typically don't suggest using dm+sm because results have been unfavorable.

I am using 9 nodes with a total of 36 cores
Per node, are there only 4 cores available? If there are more, try to use all the cores available for each node. You can also increase the number of processors you're using. 36 is not very many for the size of your domains. You should be able to use 100+.
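To see why 36 ranks is not many for these domains, consider how WRF splits each domain into per-rank patches. The sketch below is a simplification (WRF's actual decomposition logic, controllable via nproc_x/nproc_y in the namelist, is more involved): for an illustrative 250 × 150 domain it picks the process grid whose patches are closest to square and reports the per-rank patch size at 36 vs. 144 ranks.

```python
import math


def decompose(nx, ny, nprocs):
    """Pick a process grid (px, py) with px * py == nprocs whose per-rank
    patches are closest to square. Returns (px, py, patch_x, patch_y).
    Illustrative only; WRF's own decomposition differs in detail."""
    best = None
    for px in range(1, nprocs + 1):
        if nprocs % px:
            continue
        py = nprocs // px
        # Per-rank patch dimensions (ceiling division).
        patch_x = math.ceil(nx / px)
        patch_y = math.ceil(ny / py)
        # Prefer the most nearly square patch.
        score = abs(patch_x - patch_y)
        if best is None or score < best[0]:
            best = (score, px, py, patch_x, patch_y)
    return best[1:]


if __name__ == "__main__":
    for nprocs in (36, 144):
        px, py, pxs, pys = decompose(250, 150, nprocs)
        print(f"{nprocs} ranks -> {px} x {py} process grid, "
              f"patch ~{pxs} x {pys} points per rank")
```

At 36 ranks each rank still carries a patch of roughly a thousand grid points per level, so there is plenty of room to scale out; the usual rule of thumb is to stop adding ranks only once patches shrink toward ~10 × 10 points, where halo-exchange overhead starts to dominate.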
 
Hi, Does this mean that you installed WRF with the dm+sm option? If so, please try dmpar instead. We typically don't suggest using dm+sm because results have been unfavorable.


Per node, are there only 4 cores available? If there are more, try to use all the cores available for each node. You can also increase the number of processors you're using. 36 is not very many for the size of your domains. You should be able to use 100+.
Thanks for your reply, I really appreciate it. Yes, I installed WRF with the dm+sm option; I will reinstall it with dmpar and try again. We have access to 9 nodes, each with 36 cores, but using only 4 cores per node gave the best CPU efficiency. I will increase the number of cores used for the computations.
 