ahmedbably
New member
Hi everyone,
I am having an issue running my WRF model on two clusters with 35 nodes each. I set the maximum time limit to 24 hours, but my simulation period is 152 days, or about 5 months. As a result, I've only been able to run and output about 9 of the 152 days.
I had to increase my domain size because, on a smaller domain, the model refused to run with more processors due to a minimum decomposition limit. I kept reducing the number of processors until I could no longer make use of the large number of compute nodes, so I decided to enlarge the domain instead. However, I am still unable to go beyond 35 processors on the two clusters, and I get an error when I try.
Do you have any suggestions on how to solve this problem? I was thinking of dividing the simulation into several shorter runs, but I'm not sure how to go about doing that. I've also heard of the WRF restart option, but I have no idea how it works or how it could be used to complete the 152-day run successfully. Can anyone provide some information on this?
Thanks in advance for any advice or suggestions.
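
To make the "dividing the runs" idea concrete, here is a minimal Python sketch of how the 152-day period could be cut into consecutive segments that each fit within the time limit. The start date (2020-01-01) and the helper name `split_run` are made up for illustration, and the 9-day segment length is simply the amount that completed in one 24-hour job. The sketch assumes each segment after the first would be launched as a restart run, using the `restart` and `restart_interval` settings in the `&time_control` section of `namelist.input`; please check the WRF Users' Guide for the exact settings.

```python
from datetime import datetime, timedelta


def split_run(start, total_days, segment_days):
    """Split a long simulation period into consecutive segments.

    Hypothetical helper: returns a list of dicts with the start and end
    time of each segment and whether it would be run as a restart.
    """
    end = start + timedelta(days=total_days)
    segments = []
    seg_start = start
    while seg_start < end:
        # The last segment is clipped so it does not run past the end date.
        seg_end = min(seg_start + timedelta(days=segment_days), end)
        segments.append({
            "start": seg_start,
            "end": seg_end,
            # Only the very first segment is a cold start.
            "restart": seg_start != start,
        })
        seg_start = seg_end
    return segments


# Assumed values: the start date is invented; 152 and 9 come from the post.
segments = split_run(datetime(2020, 1, 1), total_days=152, segment_days=9)

for seg in segments:
    flag = ".true." if seg["restart"] else ".false."
    print(f"{seg['start']:%Y-%m-%d_%H:%M:%S} -> "
          f"{seg['end']:%Y-%m-%d_%H:%M:%S}  restart = {flag}")
```

Each printed line would correspond to one job submission: the start and end dates go into `namelist.input`, and every segment except the first assumes that a restart file was written at the end of the previous segment.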