MPI Error while running wrf.exe

Vaishnavi_198

i) I ran wrf.exe and got an error (file attached) that I do not understand. I run WRF on an HPC system, and this seems to be an MPI issue. I have attached the Slurm script too.

ii) I expected the error to appear in the last rsl error file, but I only found it when I randomly checked the rsl files and came across rsl.error.0250. The last error file was rsl.error.0383. How am I supposed to know which file contains the error?

The error:

DYNAMICS OPTION: Eulerian Mass Coordinate
alloc_space_field: domain 1 , 43597500 bytes allocated
med_initialdata_input: calling input_input
Abort(608265743) on node 250 (rank 250 in comm 0): Fatal error in PMPI_Gatherv: Other MPI error, error stack:
PMPI_Gatherv(398)..........................: MPI_Gatherv failed(sbuf=0x467b820, scount=24, MPI_CHAR, rbuf=0x7ffdc1b34880, rcnts=0xf86ee70, displs=0xf86f480, datatype=MPI_CHAR, root=0, comm=MPI_COMM_WORLD) failed

MPIDI_Gatherv_intra_composition_alpha(1491):
MPIDI_NM_mpi_gatherv(523)..................:
MPIR_Gatherv_allcomm_linear_ssend(113).....:
MPIC_Ssend(249)............................:
MPID_Ssend(720)............................:
MPIDI_ssend_unsafe(311)....................:
MPIDI_OFI_send_normal(392).................:
(unknown)(): Other MPI error
 

Attachments

  • sandeepwrf.txt (1,009 bytes)
  • rsl.error.0250.txt (3.4 KB)

Go to the directory where your rsl files are located on your machine.

Run each of these commands individually in a terminal window.


grep -i FATAL rsl.*

grep -i error rsl.*

grep -i SIGSEGV rsl.*

grep -i cfl rsl.*

grep prefixes each matching line with the name of the file it came from, so the output shows you which rsl file to open for the errors.
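
If you only want the file names rather than every matching line, the four searches can be combined into one pass. This is a minimal sketch, not an official WRF tool: the function name `find_rsl_errors` is made up for illustration, and it assumes the rsl files sit in the directory you pass in (or the current directory if you pass nothing).

```shell
# Hypothetical helper (not part of WRF): print the names of rsl files
# that contain any of the usual failure keywords.
find_rsl_errors() {
    dir="${1:-.}"   # directory holding rsl.out.* / rsl.error.*; defaults to "."
    # -l  print only the names of files with a match
    # -i  ignore case
    # -E  extended regex, so "a|b" means "a or b"
    grep -liE 'fatal|error|sigsegv|cfl' "$dir"/rsl.* 2>/dev/null | sort
}
```

Calling `find_rsl_errors` from the run directory (or `find_rsl_errors /path/to/run`) should list files such as rsl.error.0250 without you having to open them one by one. Because "error" and "cfl" are broad keywords, a listed file is only a candidate, so open it to confirm what it actually reports.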
 
Thank you, that was helpful.