Vaishnavi_198
New member
i) I ran wrf.exe and got an error (file attached) that I do not understand. I run WRF on an HPC cluster, and this looks like an MPI issue. I have attached the Slurm script too.
ii) I expected the error to appear in the last rsl.error file, but I only found it by checking the rsl files at random. It was in rsl.error.0250, while the last file was rsl.error.0383. How am I supposed to know which file contains the error?
The error:
DYNAMICS OPTION: Eulerian Mass Coordinate
alloc_space_field: domain 1 , 43597500 bytes allocated
med_initialdata_input: calling input_input
Abort(608265743) on node 250 (rank 250 in comm 0): Fatal error in PMPI_Gatherv: Other MPI error, error stack:
PMPI_Gatherv(398)..........................: MPI_Gatherv failed(sbuf=0x467b820, scount=24, MPI_CHAR, rbuf=0x7ffdc1b34880, rcnts=0xf86ee70, displs=0xf86f480, datatype=MPI_CHAR, root=0, comm=MPI_COMM_WORLD) failed
MPIDI_Gatherv_intra_composition_alpha(1491):
MPIDI_NM_mpi_gatherv(523)..................:
MPIR_Gatherv_allcomm_linear_ssend(113).....:
MPIC_Ssend(249)............................:
MPID_Ssend(720)............................:
MPIDI_ssend_unsafe(311)....................:
MPIDI_OFI_send_normal(392).................:
(unknown)(): Other MPI error
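On question ii), the abort message names rank 250 and it turned up in rsl.error.0250, so the file number appears to follow the MPI rank that failed rather than the last rank. One way to avoid checking files at random is to scan all of them for failure strings. The following is only a minimal sketch in Python, assuming the rsl.error.* files sit in the run directory. The function name `find_error_files` is hypothetical, and the search patterns are assumptions: the first two come from the log above, and "FATAL CALLED" is an extra guess at what a WRF-side failure might print.

```python
import re
from pathlib import Path

# Assumed failure markers: "Abort(" and "Fatal error" appear in the log above;
# "FATAL CALLED" is an additional guess and may need adjusting.
PATTERNS = re.compile(r"Abort\(|Fatal error|FATAL CALLED")


def find_error_files(run_dir="."):
    """Return (file name, line number, line) for the first match in each rsl.error.* file."""
    hits = []
    for path in sorted(Path(run_dir).glob("rsl.error.*")):
        with open(path, errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if PATTERNS.search(line):
                    hits.append((path.name, lineno, line.rstrip()))
                    break  # one hit per file is enough to locate the failing rank
    return hits


if __name__ == "__main__":
    for name, lineno, line in find_error_files():
        print(f"{name}:{lineno}: {line}")
```

Run from the directory that holds the rsl files, this prints one line per file that contains a match, which should point at rsl.error.0250 in the case above.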