
(RESOLVED) IO quilt crash - cheyenne




I am running WRF on Cheyenne with this configuration: two grids, 8 km (536 x 481) and 1.6 km (796 x 796); Intel dm (distributed-memory) compilation; netCDF4; hourly output on the 1.6 km grid. The model runs fine using 60 nodes (2160 CPUs) without quilting.

However, I wanted to see whether performance could be improved somewhat with I/O quilting, so I used these settings:

nproc_x = 45,
nproc_y = 47,

nio_tasks_per_group = 9,
nio_groups = 5,

This combination totals 2160 CPUs (45 x 47 + 9 x 5). The model integrates about 12 hours and then exits with no apparent messages in the rsl.error logs.
Also, the wrfout files have reasonable sizes, but they cannot be viewed with ncview and the Times variable is missing.

Any suggestions as to what the problem might be are greatly appreciated.

I figured out what the problem was: the number of nodes/CPUs was configured incorrectly in my Cheyenne job submission script.
I am not sure whether I can mark the ticket as resolved myself.
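
For anyone hitting the same thing, here is a minimal sketch of the sanity check I could have run before submitting. It only does the arithmetic: the MPI ranks requested by the job script should equal the compute tasks (nproc_x x nproc_y) plus the quilt I/O tasks (nio_tasks_per_group x nio_groups). The function names are hypothetical helpers, not part of WRF, and the 36 ranks per node is an assumption taken from the 60 nodes / 2160 CPUs figure above.

```python
def wrf_required_ranks(nproc_x, nproc_y, nio_tasks_per_group, nio_groups):
    """Total MPI ranks implied by the namelist: compute tasks plus quilt I/O tasks."""
    compute_tasks = nproc_x * nproc_y
    io_tasks = nio_tasks_per_group * nio_groups
    return compute_tasks + io_tasks


def check_job(nodes, ranks_per_node, **namelist):
    """Compare ranks requested by the job script with ranks the namelist needs.

    Returns (matches, requested, required).
    """
    requested = nodes * ranks_per_node
    required = wrf_required_ranks(**namelist)
    return requested == required, requested, required


# Values from this thread; ranks_per_node=36 is assumed (2160 CPUs / 60 nodes).
settings = dict(nproc_x=45, nproc_y=47, nio_tasks_per_group=9, nio_groups=5)
matches, requested, required = check_job(nodes=60, ranks_per_node=36, **settings)
print(matches, requested, required)  # True 2160 2160
```

If the two numbers differ, the node/CPU request in the submission script is the first thing to look at.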