
Quilting problem

suleiman

New member
Dear Community,
I have a domain of 451x451 grid points and want to make the output process parallel using quilting.
To run my experiment on a supercomputer, I used 32 nodes with 1552 cores, of which 64 were for quilting:

srun -N 33 -n 1552 --hint=nomultithread --distribution=block:block --cpus-per-task=4 $prg2

However, after the first output step, it crashed.

The quilting section of the namelist is:
&namelist_quilt
nio_tasks_per_group = 4, !16, !0, !4
nio_groups = 8, !1, !1, !8
/
As you can see from the commented-out values, I have tried different combinations of nio_tasks_per_group and nio_groups. All attempts were unsuccessful.

Could you please advise on suitable values for nio_tasks_per_group and nio_groups, or should I change the number of cores instead?
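
For reference, the task budget implied by the settings above can be checked with a few lines of Python. This is only a minimal sketch, assuming the usual WRF convention that the quilt servers number nio_tasks_per_group × nio_groups and are taken out of the total MPI task count. The function name quilt_budget is hypothetical, and pairing the commented-out values column by column is also an assumption.

```python
def quilt_budget(total_tasks, nio_tasks_per_group, nio_groups):
    """Split total MPI tasks into (I/O tasks, compute tasks).

    Assumption: each quilt server is one MPI task taken from the total.
    """
    io_tasks = nio_tasks_per_group * nio_groups
    compute_tasks = total_tasks - io_tasks
    if compute_tasks <= 0:
        raise ValueError("no MPI tasks left for computation")
    return io_tasks, compute_tasks


# 1552 is the -n value from the srun line; the pairs are the active and
# commented-out namelist values, read column by column (an assumption).
for per_group, groups in [(4, 8), (16, 1), (0, 1)]:
    io_tasks, compute_tasks = quilt_budget(1552, per_group, groups)
    print(f"{per_group} x {groups}: {io_tasks} I/O tasks, {compute_tasks} compute tasks")
```

Under that assumption, the active settings (4 tasks per group, 8 groups) give 32 I/O tasks and 1520 compute tasks, rather than the 64 quilting tasks mentioned above.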

Thanks.

P.S. The namelist is attached.
 

Attachments

  • namelist.input
    8.8 KB
Hi,
Many apologies for the delay in response. Does the simulation work if you do not use quilting? Unfortunately not many people have had much success with the quilting option in WRF, although occasionally some do. Take a look at this thread, which discusses some tactics. You may also find other helpful threads by searching in this forum.
 
Dear Kwener,
Thanks for your reply. Yes, it works fine when quilting is turned off. Moreover, it works very well on our old HPC, which has 32 cores per node. On the new HPC, each node has 192 cores, and this is probably the problem, as I cannot find the correct number of quilting I/O tasks and groups.
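
To make the node layout concrete, here is a rough Python check of how the srun request from my first post maps onto 192-core nodes. It is a sketch only, assuming each MPI task is given --cpus-per-task physical cores (hyperthreads excluded by --hint=nomultithread) and that a task cannot span nodes. The function name nodes_needed is hypothetical.

```python
import math


def nodes_needed(n_tasks, cpus_per_task, cores_per_node):
    """Return (tasks per node, nodes required) for a simple block layout.

    Assumption: every task gets cpus_per_task whole cores on one node.
    """
    tasks_per_node = cores_per_node // cpus_per_task
    if tasks_per_node == 0:
        raise ValueError("cpus_per_task exceeds the cores available on a node")
    return tasks_per_node, math.ceil(n_tasks / tasks_per_node)


# Values from the srun line (-n 1552, --cpus-per-task=4) and the new HPC.
tasks_per_node, nodes = nodes_needed(1552, 4, 192)
print(f"{tasks_per_node} tasks per node, {nodes} nodes")
```

Under these assumptions, a node holds 48 tasks and the 1552 tasks need 33 nodes, with the last node only partly filled.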

Best regards,
Suleiman
 