wrf.exe failure on cluster

This post was from a previous version of the WRF&MPAS-A Support Forum. New replies have been disabled and if you have follow up questions related to this post, then please start a new thread from the forum home page.

This case failed immediately, which often indicates either a data issue or a machine issue.
The rsl.error file shows "Failed RDMA write request (status 12 : transport retry counter exceeded). Connection broken!", which looks more like a machine issue. Please ask your system administrator to confirm that you have permission and enough disk space to write the output, and that communication between the processors is working properly.
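Before contacting the administrator, it may help to gather a few facts yourself. The following is a minimal Python sketch, not an official WRF tool. It assumes the run directory contains per-rank log files matching rsl.error.*, and the function names and the 10 GB free-space threshold are hypothetical choices for illustration. It lists which log files contain the RDMA message quoted above and reports whether the directory is writable and how much free space it has.

```python
import glob
import os
import shutil

# Message text taken from the rsl.error excerpt quoted above.
RDMA_MARKER = "transport retry counter exceeded"


def find_rdma_errors(run_dir):
    """Return the sorted rsl.error.* files in run_dir that contain the RDMA marker."""
    hits = []
    # Assumption: per-rank logs are named rsl.error.0000, rsl.error.0001, ...
    for path in sorted(glob.glob(os.path.join(run_dir, "rsl.error.*"))):
        with open(path, errors="replace") as handle:
            if any(RDMA_MARKER in line for line in handle):
                hits.append(path)
    return hits


def check_output_dir(run_dir, min_free_gb=10.0):
    """Report whether run_dir is writable and has at least min_free_gb free.

    The 10 GB default is an arbitrary placeholder; choose a value that
    matches the expected size of your own output.
    """
    free_gb = shutil.disk_usage(run_dir).free / 1024**3
    return {
        "writable": os.access(run_dir, os.W_OK),
        "free_gb": free_gb,
        "enough_space": free_gb >= min_free_gb,
    }


if __name__ == "__main__":
    run_dir = "."  # assumed: the directory where wrf.exe was launched
    print("Files with RDMA errors:", find_rdma_errors(run_dir))
    print("Output directory check:", check_output_dir(run_dir))
```

If only some ranks report the RDMA error, passing that list of files to the administrator may help narrow the problem to particular nodes. This sketch does not test the interconnect itself; that check still has to come from whoever manages the cluster.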
 