Hi all,
When running long simulations (on the order of 10 h or more), I encountered random aborts due to insufficient memory. I was finally able to get a dedicated compute node with enough compute time to reproduce and monitor the problem (shoutout to DKRZ Support), and it looks to me like there is a memory leak somewhere in the codebase (cf. mem_leak.png, mem_usage.csv).
Before I start debugging, I wanted to ask whether this is a known issue - I couldn't find anything on the forums or in the GitHub issues. Personally, I can't think of anything that should accumulate in memory over the course of a simulation, but it might also be expected behaviour. Also - if this is a bug - do you have any idea where to start looking?
Thanks in advance