WRF4.1.5, em_les, wrf.exe hangs

Dear everyone,

I tried to run the em_les test case of WRF 4.1.5 on a new cluster
and found that wrf.exe hangs.
Does anyone have an idea what could cause this?
On another machine of mine, the same test run finishes in 1 minute.

Here are the Intel compiler and the modules I am using:
module load intel/19.0.3
module load mvapich2/2.3.2
module load netcdf/
export NETCDF=/share/apps/netcdf/

$ ./configure
select "16" and "1"
$ ./compile em_les > log
$ sbatch

The content of the submitted batch script is:
#!/bin/bash -l

#SBATCH -A activate
#SBATCH -J wrf
#SBATCH --nodes=1
#SBATCH -n 8
#SBATCH -t 00:59:58
#SBATCH -o wrf.out
#SBATCH -e wrf.err

ulimit -s unlimited
srun ./wrf.exe

Best regards,

Hi, Xiang,
Did you run ideal.exe first? I assume so, but I just want to make sure. Are there any error messages in your rsl files?
The compiler and libs look fine.
If the same code runs on one machine but hangs on another, this often indicates a problem in either the libraries or the environment settings. Please consult your system administrator. It is hard to identify the cause when we cannot reproduce the error.
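A quick way to scan the rsl files for error messages is sketched below. It is only an illustration: the helper name `summarize_rsl` and the keywords `error`, `fatal` and `killed` are assumptions, not part of WRF. The one string taken from WRF itself is `SUCCESS COMPLETE`, which wrf.exe writes to the rsl files when a run finishes normally.

```shell
# summarize_rsl DIR: print one status line per rsl.error.* file in DIR
# (hypothetical helper; defaults to the current directory).
summarize_rsl() {
  dir="${1:-.}"
  for f in "$dir"/rsl.error.*; do
    [ -e "$f" ] || continue                 # no rsl files found: report nothing
    name=$(basename "$f")
    if grep -q "SUCCESS COMPLETE" "$f"; then
      echo "$name: completed"
    elif grep -qiE "error|fatal|killed" "$f"; then
      # keyword list is an assumption; adjust it to the messages you see
      echo "$name: error reported"
    else
      echo "$name: no completion marker (possible hang)"
    fi
  done
}
```

Run `summarize_rsl .` in the directory that contains wrf.exe, then look at the end of any flagged file, for example with `tail -n 20 rsl.error.0000`.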
Hi Ming,

Thanks a lot for your reply.
I contacted my computing support center, but I have not heard back from them yet.

Yes, I ran ideal.exe first. Please find the error report attached.
The simulation had hung for about an hour before it was killed.
That is why the report shows an error saying that the process was killed.




Attachment: rsl.txt (97.5 KB)
I believe this is a system issue. It is possibly related to your MPI installation or to the command used to launch the MPI job. The communication between your processors may also be the problem. Please ask your system administrators for help.
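One way to tell a launcher problem from a WRF problem is to run a trivial command through the same launcher under a time limit. The sketch below assumes GNU `timeout` is available on the cluster. The helper name `probe` is hypothetical, and the `srun` lines are left as comments because they only work inside a Slurm allocation.

```shell
# probe LIMIT_SECONDS COMMAND...: run COMMAND under a time limit and classify the result
# (hypothetical helper; relies on GNU coreutils `timeout`).
probe() {
  limit="$1"; shift
  timeout "$limit" "$@" >/dev/null 2>&1
  status=$?
  case "$status" in
    0)   echo "OK" ;;
    124) echo "HANG" ;;            # GNU timeout exits with 124 when the limit is reached
    *)   echo "FAIL($status)" ;;
  esac
}

# Example use inside the batch script (assumes 8 tasks, as in the original job):
#   probe 60  srun -n 8 hostname     # plain launcher test, no MPI calls involved
#   probe 300 srun -n 8 ./wrf.exe    # the em_les test itself
```

If the `hostname` probe already reports `HANG` or `FAIL`, the problem is likely in the launcher or the allocation rather than in WRF. If only the wrf.exe probe hangs, the MPI library that the executable was linked against is a more likely suspect.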
Hi Xiang-Yu,

Since this was posted in June, you may have already solved the problem. If not, have you tried increasing the walltime of the MPI run? The system kills the job when its run time exceeds the preset walltime.
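To compare the run time with the walltime limit, it can help to convert both to seconds. The sketch below is a minimal helper under the assumption that times are written as HH:MM:SS, as in the `#SBATCH -t 00:59:58` line of the original script. Other Slurm time formats, such as D-HH:MM:SS, are not handled, and the name `hms_to_seconds` is hypothetical.

```shell
# hms_to_seconds HH:MM:SS: print the duration in seconds
# (hypothetical helper; accepts only the HH:MM:SS form).
hms_to_seconds() {
  # awk reads fields such as "08" as decimal numbers, so leading zeros are safe
  echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Walltime requested in the original batch script
limit=$(hms_to_seconds 00:59:58)
echo "walltime limit: ${limit} s"
```

The elapsed time and the time limit of a finished job can be read from Slurm accounting, for example with `sacct -j <jobid> --format=JobID,State,Elapsed,Timelimit`, where `<jobid>` is the ID printed by sbatch. A state of TIMEOUT means the job was killed for exceeding its limit.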