wrf.exe always stops at the same point

JSY

Dear all,

Hi, I'm trying to run WRF v4.0.3 with FNL GRIB2 data, but after running for a few seconds, wrf.exe suddenly stops without any error message.

I tried it in serial and also with 16 MPI processes, but it always stops at the same point.
With mpirun, it stops with the following message:

starting wrf task 0 of 1 (it repeats 16 times)
Exit code -5 signaled from feedback6(probably mpi host name?)
Killing remote processes...MPI process terminated unexpectedly
Done
signal 15 received.
signal 15 received.
signal 15 received.
signal 15 received.

I would really appreciate it if you could help me solve this problem. I have attached my namelist.input and rsl.error.0000 files.

Thank you so much.
 

Hi,
You will probably need to use more than 1 processor to run this domain (especially since d02 is ~200x200). Your rsl file shows that you're only using a single processor here.

Code:
Ntasks in X             1 , ntasks in Y             1

If you have the rsl.* file from when you use 16 processors, can you package all those rsl files together as a single *.tar file, and attach that? Thanks!
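As a rough sketch of the packaging step, something like the following could be used. The function name `pack_rsl` and the default archive name `rsl_files.tar` are placeholders made up for illustration, and it assumes the rsl.out.* and rsl.error.* files sit directly in the run directory.

```shell
# Hypothetical helper: bundle all rsl.* files from a WRF run directory
# into a single tar archive.
#   $1: run directory (defaults to the current directory)
#   $2: archive path  (defaults to the placeholder name rsl_files.tar)
pack_rsl() {
    run_dir="${1:-.}"
    archive="${2:-rsl_files.tar}"
    # cd inside a subshell so the stored paths are just the file names;
    # the archive is written to stdout and redirected by the calling shell.
    ( cd "$run_dir" && tar -cf - rsl.* ) > "$archive"
}
```

Calling `pack_rsl` with no arguments from the run directory should leave `rsl_files.tar` there, and `tar -tf rsl_files.tar` lists what went into it.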
 
Thank you for your reply.

I've run the wrf executable with
Code:
mpirun -np 16 -machinefile mpi.hosts ./wrf.exe

and I still get only one rsl file, like rsl.error.0000.

I guess you're right: there's a problem with running on multiple processors. So this is not a WRF problem, right? I think I need to sort out MPI.
If you have any ideas for solving the problem, please let me know. Thank you for your help!
 
The only suggestion I can make is this: if you are on a system where you need to load modules, adding something like "module load...." for the specific compiler and MPI may help. I ran into something similar, and that is how mine was resolved. Your systems administrator will know which modules should be used on your system.
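One way to check this yourself is to compare the mpirun found on your PATH with the MPI libraries that wrf.exe is linked against. A message like "starting wrf task 0 of 1" repeated once per process is consistent with each process starting as its own single-task job, which can happen when the launcher and the MPI library used at build time do not match. The sketch below is only a rough diagnostic: `check_mpi_env` is a made-up helper name, and it assumes a Linux system with `ldd` available and a dynamically linked executable.

```shell
# Hypothetical helper: show which MPI launcher is on PATH and which MPI
# shared libraries an executable (default: ./wrf.exe) is linked against.
check_mpi_env() {
    exe="${1:-./wrf.exe}"
    launcher=$(command -v mpirun || true)
    echo "mpirun on PATH: ${launcher:-not found}"
    echo "MPI libraries linked into $exe:"
    # ldd lists shared-library dependencies; for a static binary (or when no
    # MPI library is linked) nothing matches and the fallback line is printed.
    ldd "$exe" 2>/dev/null | grep -i 'mpi' || echo "  none found (static build or no MPI library?)"
}
```

Run `check_mpi_env` in the WRF run directory, after loading your modules, and see whether the library paths and the mpirun path point to the same MPI installation. If they do not, that mismatch is a likely place to start looking.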
 