Came across this thread while searching for some answers. Hoping to piggyback off previous comments.
I have been running WRF on a dm build for quite some time, using 256 cores. It runs fine, with no problems.
I have started exploring a dm+sm build. When I run with 128 MPI tasks and 2 OpenMP threads each (256 cores total), the model runs, but the run time increases by a factor of 2-3. I expected an improvement over the dm baseline, but that was not the case. Is there an optimal configuration you would suggest so I could potentially see an improvement? 64 tasks and 4 threads? 32 tasks and 8 threads? Or does sticking with my dm build seem to be the better option here? Thanks.
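
For reference, the splits I am considering all keep the total at 256 cores. Here is a minimal sketch in plain Python (not part of WRF) that lists them. The function name `hybrid_layouts` is made up for illustration, and it assumes one thread per core, ignoring node boundaries and any tiling settings.

```python
def hybrid_layouts(total_cores, max_threads=8):
    """Return (mpi_ranks, omp_threads) pairs where ranks * threads == total_cores.

    Hypothetical helper for illustration only; assumes one thread per core.
    """
    layouts = []
    for threads in range(1, max_threads + 1):
        # Keep only thread counts that divide the core count evenly.
        if total_cores % threads == 0:
            layouts.append((total_cores // threads, threads))
    return layouts


if __name__ == "__main__":
    for ranks, threads in hybrid_layouts(256):
        print(f"{ranks} MPI ranks x {threads} OpenMP threads = {ranks * threads} cores")
```

With 256 cores and up to 8 threads, this gives the pure dm case (256 x 1) plus the three hybrid cases from my question (128 x 2, 64 x 4, 32 x 8).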