WRF job hangs with no output at a specific core count

xiaxx+666 · Nov 2, 2022

Has anyone run into this situation: with certain core counts, a WRF job hangs and produces no output. No matter how I change the combination of nproc_x and nproc_y, the job will not run.

I'm running an experiment: a nest of 951 * 951 grid points, grid spacing = 600 m, time step = 3 s, run time = 6 h.

The core count (600) is not an awkward number that is hard to decompose. It is also not beyond the range of processors this case can use, because I later tested jobs with 800 cores and they ran.
I have tested 600 cores many times and also tried setting nproc_x and nproc_y manually. The automatic decomposition is 24 * 25; I also tried 20 * 30 and 15 * 40, but the jobs still hang and produce no output.
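For reference, the arithmetic behind these decompositions can be sketched in a few lines of Python. This is only an illustration: it assumes each dimension of the 951 * 951 nest is split as evenly as possible across the ranks, which may not match the exact patch boundaries WRF computes, and the function names are made up for this example.

```python
def factor_pairs(n):
    """Return every (nproc_x, nproc_y) pair with nproc_x * nproc_y == n."""
    return [(x, n // x) for x in range(1, n + 1) if n % x == 0]


def tile_extent(points, nproc):
    """Smallest and largest tile width when `points` grid points are
    split as evenly as possible over `nproc` ranks (an assumption,
    not necessarily WRF's exact partitioning)."""
    base, extra = divmod(points, nproc)
    return base, base + (1 if extra else 0)


def describe(e_we, e_sn, nproc_x, nproc_y):
    """Summarise approximate tile sizes for one decomposition."""
    return {
        "nproc_x": nproc_x,
        "nproc_y": nproc_y,
        "x_tile": tile_extent(e_we, nproc_x),
        "y_tile": tile_extent(e_sn, nproc_y),
    }


if __name__ == "__main__":
    # The three 600-core layouts mentioned in this post.
    for nx, ny in [(24, 25), (20, 30), (15, 40)]:
        print(describe(951, 951, nx, ny))
```

Calling `factor_pairs(600)` lists the other layouts that could be tried for the same core count.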

We wonder whether this is related to the small grid spacing and time step of this case. Is there any WRF documentation on choosing the optimal number of cores for a job that users can refer to?

I will attach the namelist.input and job logs. Looking forward to your reply.

kwerner · Nov 3, 2022

Hi,
Can you tell me the resolution of the input data for this simulation?
