
rsl.out.???? files truncated by segmentation fault

andythewxman

Hello! I am currently running WRFV4.5.2 under Slurm. When I run on a given number of processors on a single node, everything works as expected. However, when I use the same number of processors divided evenly over two nodes, the model still completes, but I get a segmentation fault message, and the rsl.out.???? files are truncated and do not show the "SUCCESS" message at the end. I did not observe this behavior when running WRFV3.8. I am a bit stumped at this point. Do you have any idea where I can start tracking this down? Thanks.
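For anyone chasing the same symptom, a quick way to see which ranks were affected is to scan the rsl.out.???? files for the missing "SUCCESS" line. The following is only a minimal sketch: the function name `find_incomplete_logs` is made up for illustration, and it assumes the logs are in a single run directory and that a complete log contains the text "SUCCESS" somewhere.

```python
from pathlib import Path


def find_incomplete_logs(run_dir, marker="SUCCESS"):
    """Return the names of rsl.out.NNNN files that do not contain `marker`.

    Hypothetical helper: `run_dir` is the directory holding the WRF logs,
    and `marker` is the text assumed to appear in a complete log.
    """
    incomplete = []
    # Match only the four-digit per-rank logs, e.g. rsl.out.0000
    pattern = "rsl.out.[0-9][0-9][0-9][0-9]"
    for log in sorted(Path(run_dir).glob(pattern)):
        # errors="replace" keeps the scan going if a truncated file
        # ends in the middle of a multi-byte character
        text = log.read_text(errors="replace")
        if marker not in text:
            incomplete.append(log.name)
    return incomplete
```

Calling `find_incomplete_logs(".")` from the run directory returns a list of the log files without the marker; an empty list means every rank logged a "SUCCESS" line.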
 
Hi,
Apologies for the long delay in response while our team tended to time-sensitive obligations. Thank you for your patience. This issue is likely specific to your computing environment. I would suggest discussing the issue with a systems administrator at your institution for solutions.
 
Thank you - we were able to resolve the issue and confirm it was not WRF-related. A co-worker ran into the same problem with another HPC program we were running. Apparently it is a known issue on Cray systems involving an environment variable. Setting FI_VERBS_PREFER_XRC=0 solved this for us.
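As a rough illustration of the fix, the variable has to be present in the environment of the launched job. The sketch below shows one way to do that from a Python wrapper; the helper names `env_with_xrc_disabled` and `launch` are hypothetical, and the launch command you pass in (for example, whatever line your Slurm script already uses to start wrf.exe) is an assumption about your setup, not something taken from this thread.

```python
import os
import subprocess


def env_with_xrc_disabled(base=None):
    """Return a copy of the environment with FI_VERBS_PREFER_XRC set to "0".

    Hypothetical helper: `base` defaults to the current process environment
    and is never modified in place.
    """
    env = dict(os.environ if base is None else base)
    env["FI_VERBS_PREFER_XRC"] = "0"
    return env


def launch(cmd):
    """Run `cmd` (a list of arguments) with the adjusted environment.

    Returns the exit code of the command. `cmd` is whatever launch line
    your site uses; it is an assumption here.
    """
    result = subprocess.run(cmd, env=env_with_xrc_disabled(), check=False)
    return result.returncode
```

In a shell job script, the same effect is usually achieved by exporting FI_VERBS_PREFER_XRC=0 before the line that starts the model, provided your site's Slurm configuration passes the submitting environment through to the job steps.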
 