forrtl: error (78): process killed (SIGTERM)

bashaman

New member
Hello,

When I run WRF, I get 24 hours of output for my 3-day simulation before the run fails with errors. I suspect a problem with the physics or dynamics parameters I have set in the namelist, but I am not spotting the issue. The namelist and rsl* files are included below.

Thanks!
 

Attachments

  • bashaman_WRF.zip
    722.5 KB
Hi,
Apologies for the delay. Per the wrf.o* file you shared, it looks like you're exceeding your wallclock time.

Code:
=>> PBS: job killed: walltime 28826 exceeded limit 28800

I was able to find your running directory on Derecho (I hope you don't mind). Based on the latest wrf.o* file, it looks like you've resolved that issue since this post, but the run is still stopping. First, in your runwrf.csh script, increase the wallclock limit to the maximum of 12 hours. The run probably still won't finish in that time, so can you also try increasing the number of processors you're using? Each node on Derecho has 128 processors. Given the size of your domains, you should be able to run with up to 384 processors without problems.
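
For reference, the PBS message above reports times in seconds, so the limit of 28800 corresponds to 8 hours and the job ran 26 seconds over it. Here is a minimal sketch of that arithmetic and of the directive for a 12-hour limit. The variable names are hypothetical and are not taken from runwrf.csh.

```shell
#!/bin/sh
# Illustrative only: variable names are hypothetical, not from runwrf.csh.
limit_s=28800      # limit reported in the PBS message, in seconds
used_s=28826       # walltime used when the job was killed, in seconds
max_hours=12       # maximum wallclock hours suggested in the reply

limit_h=$((limit_s / 3600))     # current limit in whole hours
over_s=$((used_s - limit_s))    # seconds past the limit

# Format the new limit as an HH:MM:SS walltime request.
walltime_line=$(printf '#PBS -l walltime=%02d:00:00' "$max_hours")

echo "current limit: ${limit_h} h, overrun: ${over_s} s"
echo "$walltime_line"
```

PBS reads the `#PBS` directive lines itself, so the resulting walltime line should be usable as-is in a csh batch script, assuming the script already sets its limit that way.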
 
Hi, thanks for the suggestion. Since then I have tried a number of runs with an increasing number of processors, but my simulation still fails at approximately the same point in the simulation. I am no longer running into any wallclock issues.
 
Hi,
I just ran a test on Derecho, using your input and namelist. The only difference is that I used V4.5.2 to run it (as it's the latest version). Since your simulation seems to be stopping within the first day, I ran a 30-hour simulation and it completed. For my case, I used the following settings in my batch script:

#PBS -l select=3:ncpus=128:mpiprocs=128

If you'd like to take a look, the directory is /glade/derecho/scratch/kkeene/bashaman/wrfv4.5.2/test/em_real. Perhaps you could try with V4.5.2, as well, and see if that makes any difference?
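
The select line above requests 3 nodes with 128 cores and 128 MPI ranks each, which gives 3 x 128 = 384 processors in total. Here is a minimal sketch of how that line follows from a target processor count. The variable names are hypothetical, and the node count is rounded up so that a partial node is still requested in full.

```shell
#!/bin/sh
# Illustrative only: variable names are hypothetical.
cores_per_node=128    # cores per Derecho node, as stated in this thread
total_cores=384       # total MPI ranks wanted for the run

# Round up to whole nodes.
nodes=$(( (total_cores + cores_per_node - 1) / cores_per_node ))

# Build the resource request: one chunk per node, all cores used as MPI ranks.
select_line="#PBS -l select=${nodes}:ncpus=${cores_per_node}:mpiprocs=${cores_per_node}"

echo "$select_line"
```

With the values above this prints the same directive used in the test run. Changing `total_cores` shows what to request for a different processor count.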
 