MPI error while running real.exe

bashaman

Hi,

I'm running into an error while running real.exe. The real.o file shows the error pasted below. Knowing that the default ncarenv module environment was just changed, I reverted to the old environment using:
module load ncarenv/24.12

This did not fix the error. I'd appreciate any pointers to get this fixed.

Thanks!

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

MPICH ERROR [Rank 0] [job id f1f53e03-5ecb-4603-bf06-36904721e0c3] [Wed Mar 4 12:34:17 2026] [dec2401] - Abort(1616271) (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(170).......:
MPID_Init(501)..............:
MPIDI_OFI_mpi_init_hook(573):
open_fabric(1521)...........: OFI fi_getinfo() failed (ofi_init.c:1521:open_fabric:No data available)
dec2401.hsn.de.hpc.ucar.edu: rank 92 exited with code 255
forrtl: error (78): process killed (SIGTERM)
Image              PC                Routine            Line     Source
libc.so.6          0000146BAFE42900  Unknown            Unknown  Unknown
libc.so.6          0000146BAFF1482B  ioctl              Unknown  Unknown
libfabric.so.1.25  0000146BB371B457  Unknown            Unknown  Unknown
libfabric.so.1.25  0000146BB36F6A52  Unknown            Unknown  Unknown
libfabric.so.1.25  0000146BB370CA2A  Unknown            Unknown  Unknown
libfabric.so.1.25  0000146BB379C020  Unknown            Unknown  Unknown
libfabric.so.1.25  0000146BB36E4171  fi_getinfo         Unknown  Unknown
libfabric.so.1.25  0000146BB36FFDC6  Unknown            Unknown  Unknown
libfabric.so.1.25  0000146BB3703D80  Unknown            Unknown  Unknown
libfabric.so.1.25  0000146BB375E57E  Unknown            Unknown  Unknown
libfabric.so.1.25  0000146BB36E4171  fi_getinfo         Unknown  Unknown
libfabric.so.1.25  0000146BB36EAFFD  fi_getinfo         Unknown  Unknown
libmpi_intel.so.1  0000146BB219CF99  Unknown            Unknown  Unknown
libmpi_intel.so.1  0000146BB219E3CE  Unknown            Unknown  Unknown
libmpi_intel.so.1  0000146BB2027136  Unknown            Unknown  Unknown
libmpi_intel.so.1  0000146BB0732B55  Unknown            Unknown  Unknown
libmpi_intel.so.1  0000146BB0732924  PMPI_Init          Unknown  Unknown
libmpifort_intel.  0000146BB2C4AABF  PMPI_INIT          Unknown  Unknown
real.exe           0000000000861A5B  Unknown            Unknown  Unknown
real.exe           0000000001116079  Unknown            Unknown  Unknown
real.exe           00000000004186DB  Unknown            Unknown  Unknown
real.exe           00000000004185ED  Unknown            Unknown  Unknown
libc.so.6          0000146BAFE2BE6C  Unknown            Unknown  Unknown
libc.so.6          0000146BAFE2BF35  __libc_start_main  Unknown  Unknown
real.exe           00000000004184F1  Unknown            Unknown  Unknown
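
For anyone scanning a log like this one: the line that names the failure is the open_fabric entry reporting that OFI fi_getinfo() failed, not the Fortran traceback beneath it. Below is a minimal sketch of pulling that line out with grep. It assumes a small sample log written to a temporary file; the variable names (log, first_fabric_error) are made up for illustration, and the three sample lines are copied from the output above.

```shell
# Hypothetical sample log: three lines copied from the output above.
log=$(mktemp)
cat > "$log" <<'EOF'
MPIDI_OFI_mpi_init_hook(573):
open_fabric(1521)...........: OFI fi_getinfo() failed (ofi_init.c:1521:open_fabric:No data available)
forrtl: error (78): process killed (SIGTERM)
EOF

# -n prefixes each match with its line number, -F treats the pattern as a
# fixed string, and head keeps only the first match.
first_fabric_error=$(grep -nF 'fi_getinfo() failed' "$log" | head -n 1)
echo "$first_fabric_error"

rm -f "$log"
```

On a real run, the same grep could be pointed at the real.o file or the rsl error files instead of the sample.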
 
Based on the "module load" command, it looks like you may be running on the NCAR HPC. If so, do you mind sharing your Derecho directory so we can access your namelist and rsl files? Thanks!
 
Yes, sorry for the omission of that detail. I am using Derecho.

Here's my working directory:
/glade/work/bashaman/ms_thesis/qvten_idealized_pgw/run

Also, I should note that I compiled this WRF run myself rather than using a pre-compiled copy.
 
I just tried running real.exe in a copy of my old experiments that had used pre-compiled code and I get the same error there as well.

The directory for that one is: /glade/work/bashaman/ms_thesis/wrf_new_grid/idealized_pgw_realSSTa/run
 
Thanks for sharing those. Can you try recompiling with the following loaded modules:

1) ncarenv/25.10 (S)
2) craype/2.7.34
3) intel/2025.2.1
4) ncarcompilers/1.1.0
5) libfabric/1.22.0
6) cray-mpich/8.1.32
7) hdf5/1.14.6
8) netcdf/4.9.3

Then, when you configure, choose option 78 (INTEL (ifx/icx) : oneAPI LLVM). Once the compile finishes, run real.exe again to see if that makes any difference. If it still fails, will you point me to the rsl error files, as well as your batch script? Thanks!
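
In case it helps, here is a rough sketch of those steps as a shell snippet. The module names and versions are the ones listed in this post. The rest is an assumption: that the module command is available in the shell, that the snippet is run from the top of the WRF source directory, and that em_real is the case being built (it is the one that produces real.exe). The configure step is interactive, so option 78 still has to be entered by hand.

```shell
# Modules from the list above, in load order.
modules="ncarenv/25.10 craype/2.7.34 intel/2025.2.1 ncarcompilers/1.1.0 libfabric/1.22.0 cray-mpich/8.1.32 hdf5/1.14.6 netcdf/4.9.3"

if command -v module >/dev/null 2>&1 && [ -f ./configure ] && [ -f ./compile ]; then
    for m in $modules; do
        module load "$m"
    done
    module list                           # confirm what is actually loaded

    ./clean -a                            # remove objects built under the old environment
    ./configure                           # interactive: enter 78 when prompted
    ./compile em_real > compile.log 2>&1  # em_real is an assumption, see above
else
    # Outside a WRF source tree, or without the module command, only report
    # what would be loaded.
    echo "Would load:"
    for m in $modules; do
        echo "  $m"
    done
fi
```

Keeping the list in one variable makes it easy to paste the same modules into the batch script, so real.exe runs under the environment it was built with.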
 
Great! Thanks for the update.
Do you have any guidance for legacy versions of WRF (necessary for WRF-WVT)? I've tried opts 34 and 50, and it keeps failing. My working directory is /glade/work/sstrey/WRF_WVT/WRF-4.3.3, and the failed compile logs are opt34.log.failed.compile and opt50.log.failed.compile. Should I deviate from the module list provided for later WRF releases?
 
I'm not familiar with WRF-WVT, so I can't say for sure, but it's possible you may need to use different modules than the default.
 
What options are being used to recompile the "precompiled code" in wrfhelp? Any idea of where to start with module changes? I've tried a few other options since posting, and all have failed. I had a compiled version of WRF 4.3.3 using option 34 before.
 