I'm trying to run ./wrf.exe on an HPC cluster using 13 nodes to reduce my computation time. To do that, I'm using the following command:
Code:
mpirun -machinefile hostfile.txt -np 150 ./wrf.exe
The hostfile.txt file simply contains the names of the 12 nodes, one per line.
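To sanity-check the hostfile against the -np value, here is a minimal sketch. It assumes the hostfile lists one host per line and that blank lines and lines starting with # should be ignored. The helper names count_hosts and ranks_per_node are hypothetical and are not part of WRF or MPICH.

```shell
#!/bin/sh
# Hypothetical helpers for checking a hostfile against -np.
# Assumption: one host per line; blank lines and '#' comment lines are skipped.

# Print the number of host entries in the file given as $1.
count_hosts() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | wc -l | tr -d ' '
}

# Print ceil($1 / $2): the most ranks any node would get if $1 ranks
# were spread as evenly as possible over $2 nodes.
ranks_per_node() {
    echo $(( ($1 + $2 - 1) / $2 ))
}
```

For example, `count_hosts hostfile.txt` should print the number of nodes that are actually listed, and `ranks_per_node 150 12` gives 13, which can be compared with the number of cores available on each node.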
When I try to run the previous command, I get the following error:
Code:
control_cb (./pm/pmiserv/pmiserv_cb.c:202): assert (!closed) failed
[mpiexec@bright90] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@bright90] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:197): error waiting for event
[mpiexec@bright90] main (./ui/mpich/mpiexec.c:331): process manager error waiting for completion
It's really weird, since another team running the RegCM model on the same HPC is using a similar command.
I would really appreciate any insights to solve this problem!