wrf.exe always stops at the same point

JSY

Dear all,

Hi, I'm trying to run WRF v4.0.3 with FNL GRIB2 data, but a few seconds after I start wrf.exe, it stops without any error message.

I tried running it in serial and also with 16 MPI processes, but it always stops at the same point.
With mpirun, it stops with this message:

starting wrf task 0 of 1 (it repeats 16 times)
Exit code -5 signaled from feedback6(probably mpi host name?)
Killing remote processes...MPI process terminated unexpectedly
Done
signal 15 received.
signal 15 received.
signal 15 received.
signal 15 received.

I would really appreciate it if you could help me solve this problem. I have attached my namelist.input and rsl.error.0000 files.

Thank you so much.
 

Attachments

  • rsl.error.0000 (4).txt
Hi,
You will probably need to use more than 1 processor to run this domain (especially since d02 is ~200x200). Your rsl file shows that you're only using a single processor here.

Code:
Ntasks in X             1 , ntasks in Y             1

If you have the rsl.* files from the run with 16 processors, can you package them all together as a single *.tar file and attach it? Thanks!
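In case it helps, here is a minimal sketch of bundling the rsl files, assuming they sit in the WRF run directory. The function name pack_rsl and the archive name rsl_files.tar are placeholders, not anything WRF provides.

```shell
# Hypothetical helper: bundle every rsl.* file in a run directory into one tar archive.
# $1 = run directory (defaults to the current directory)
# $2 = archive name, created inside the run directory (defaults to rsl_files.tar)
pack_rsl() {
    run_dir="${1:-.}"
    out="${2:-rsl_files.tar}"
    # Run in a subshell so the caller's working directory is left unchanged.
    ( cd "$run_dir" && tar -cf "$out" rsl.* )
}
```

Called with no arguments from the run directory, it writes rsl_files.tar there; `tar -tf rsl_files.tar` lists what went in.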
 
Thank you for your reply.

I've run the wrf executable with
Code:
mpirun -np 16 -machinefile mpi.hosts ./wrf.exe

and I still get only one rsl file, rsl.error.0000.

I guess you're right: there's a problem with running on multiple processors. So this is not a WRF problem, right? I think I need to sort out MPI.
If you have any ideas for solving the problem, please let me know. Thank you for your help!
 
The only suggestion I can make is that, if you are on a system where you need to load modules, adding something like "module load...." for the specific compiler and MPI may help. I was dealing with something similar, and that is how mine was resolved. Your systems administrator would know more specifically what should be used on your system.
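As a rough way to check this, the sketch below looks at two things: how many processes reported themselves as "task 0 of 1" in the captured mpirun output, and which mpirun and MPI libraries are actually in use. When every process prints "task 0 of 1", the processes are usually not seeing each other, which often (not always) points to an mpirun that belongs to a different MPI installation than the one wrf.exe was built with, or to a serial build. The function names and the idea of saving the mpirun output to a file are assumptions for illustration only.

```shell
# Hypothetical helper: count lines where a process reports itself as task 0 of 1.
# $1 = file holding the captured mpirun output
count_singleton_ranks() {
    grep -c 'starting wrf task *0 *of *1 *$' "$1"
}

# Hypothetical helper: show which launcher and MPI libraries are in play.
# $1 = path to the executable (defaults to ./wrf.exe)
check_mpi_env() {
    exe="${1:-./wrf.exe}"
    # Which mpirun comes first on PATH (after any "module load ..." lines)?
    if command -v mpirun >/dev/null 2>&1; then
        echo "mpirun: $(command -v mpirun)"
    else
        echo "mpirun: not found on PATH"
    fi
    # Which MPI shared libraries, if any, is the executable linked against?
    if [ -x "$exe" ] && command -v ldd >/dev/null 2>&1; then
        ldd "$exe" 2>/dev/null | grep -i mpi || echo "no MPI libraries listed for $exe"
    else
        echo "cannot inspect $exe"
    fi
}
```

If the mpirun path and the libraries listed for wrf.exe come from different MPI installations, that mismatch is worth raising with the systems administrator.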
 