
Very long calculation time for WRF

fis_clima

New member
I'm running WRF on a cluster with 12 nodes, each with 80 threads.
The rsl.error.0000 file shows very large computation times for each step.
I've attached my WRF run script and the namelist.

What could be causing the process to take so long?

I already used
#SBATCH --nodes=4
#SBATCH --nodelist=node007,node008,node009,node010
#SBATCH --ntasks-per-node=80
 

Attachments

  • rsl.error.0000 (19.2 KB)
  • namelist.input (2.9 KB)
  • run_wrf.slurm.txt (589 bytes)
First, your namelist.input indicates that the time step is only 10 s for the 9 km mesh. I would suggest that you increase the time step to 45 s, which will greatly reduce the computation cost.
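For reference, the commonly cited WRF guidance is a time step of roughly 6 × dx in km, i.e. up to about 54 s for a 9 km parent domain, so 45 s is comfortably within that. A minimal, illustrative &domains excerpt (only the time-step entries are shown; the values are a suggestion, not taken from your attached namelist):

&domains
 time_step           = 45,   ! roughly 6 x dx (km) for the 9 km parent domain
 time_step_fract_num = 0,    ! no fractional part of the time step
 time_step_fract_den = 1,
 ! remaining &domains entries unchanged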

Second, your d01 has only 210 x 220 grid points. With a total of 320 processors, each patch becomes very small and communication between processors costs a lot of time.
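To put numbers on this: the commonly cited WRF guidance is to keep at least about 25 x 25 grid points per processor on the coarsest domain, which for a 210 x 220 d01 means at most roughly (210/25) x (220/25) ≈ 8 x 8 ≈ 70 processors. With 320 ranks, each d01 patch shrinks to only about 13 x 11 points (assuming a 16 x 20 decomposition), so the halo exchange starts to dominate the useful computation.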

Third, because your case is a 3-domain nested case, d01 can only move on after d02 is done, and d02 can only move on after d03 is done. Even though d01 has a much smaller number of grid points than d03, it has to wait for its child domains to finish their integration.
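As a concrete illustration, if the nests use the usual 1:3:3 parent_time_step_ratio (an assumption here, since the actual ratios are in your attached namelist), a 45 s d01 step contains three 15 s d02 steps, each of which contains three 5 s d03 steps, so d01 waits for nine d03 steps per one of its own.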

Some other factors may also affect the integration speed; for example, frequent I/O takes some time.
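If the attached namelist also writes output very frequently, relaxing the history interval is an easy way to cut I/O cost. An illustrative &time_control excerpt (the values are placeholders, not read from your namelist):

&time_control
 history_interval   = 180, 60, 60,   ! minutes between history writes for d01, d02, d03
 frames_per_outfile = 1, 1, 1,       ! one output time per wrfout file
 ! remaining &time_control entries unchanged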

Other than the time step, your namelist options look fine to me.
 
Hello,

Ming Chen has already covered the namelist.input settings, so I'd like to say something about your job submission. Your rsl.error.0000 shows that you only used one core to run the job, and of course the job will be very slow. When you run wrf.exe on an HPC system, you should know how many cores there are in each node.
For example, if each node has 32 cores and you want to run wrf.exe on 4 nodes, you have 128 cores in total. Since WRF can run in hybrid MPI+OpenMP mode, we usually set the number of OpenMP threads to 4, so the number of MPI ranks is 32 (128 cores / 4 OpenMP threads = 32 MPI ranks). With 4 nodes, that is 8 MPI ranks per node. The Slurm script would look like this:

#!/bin/sh
#SBATCH -J wrf # job name
#SBATCH -o ./wrf_job.o%j # output and error file name (%j expands to jobID)
#SBATCH -N 4 # number of nodes requested
#SBATCH --ntasks-per-node=8 # MPI ranks per node
#SBATCH --cpus-per-task=4 # OpenMP threads per MPI task

export OMP_NUM_THREADS=4 # environment variable setting the number of OpenMP threads per MPI task

mpirun -np 32 -ppn 8 ./wrf.exe #-np means MPI ranks used in total, -ppn means MPI ranks per node
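
Once the job is running, you can check which decomposition wrf.exe actually got by looking at the top of rsl.error.0000; the exact wording varies a little between WRF versions, but something like the following should work:

grep -i ntasks rsl.error.0000
# expect something like: Ntasks in X 4 , ntasks in Y 8   (for 32 MPI ranks)
# if it reports 1 and 1, wrf.exe was again launched with only a single MPI rank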
 