memory allocation error when running wrf.exe

AlexCrawford

I am trying to run WRF on a domain with a parent and one nested domain using the real case with ERA5 data as the input for boundary data. I have worked through every step with WPS and ran real.exe without errors. When I try to run wrf.exe, I am receiving the following error:

MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 9.

Looking at the rsl.error file, the trigger seems to be related to this:
rsl_malloc failed allocating -1933456896 bytes, called rsl_bcast.c, line 290, try 1
: Cannot allocate memory
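
A negative byte count like this usually hints at a size that overflowed a signed 32-bit integer. The sketch below assumes that is what happened here (it is not confirmed against the WRF source) and recovers the smallest request that would wrap around to the logged value. The function names are made up for illustration.

```python
def unwrap_int32(logged: int) -> int:
    """Smallest non-negative size that wraps to `logged` as a signed 32-bit int."""
    return logged % 2**32


def wrap_int32(size: int) -> int:
    """Interpret `size` as a signed 32-bit integer (two's complement)."""
    wrapped = size % 2**32
    return wrapped - 2**32 if wrapped >= 2**31 else wrapped


# Value taken from the rsl.error line above.
requested = unwrap_int32(-1933456896)
print(requested)                       # 2361510400
print(wrap_int32(requested))           # -1933456896
print(f"{requested / 2**30:.2f} GiB")  # size in GiB, just over the 2 GiB int32 limit
```

If that assumption holds, the failing request was about 2.2 GiB in a single buffer, which would point to the domain being too large for one process rather than to the 64 GB machine running out of memory.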

Searching the web, I found two examples of similar cases, both on this forum, both unresolved:
The first is the thread "error wrf.exe rsl_malloc failed allocating -1372408080 bytes, called rsl_bcast.c, line 270, try 2", where the original poster never followed up with the files the respondent asked for.
The second is the thread "rsl_malloc failed allocating -2076874848 bytes, called rsl_bcast.c, line 290, try 3", where the response was to recompile WRF with OpenMP instead of dm+sm and come back if that didn't solve the problem. Again, there was no follow-up from the original poster.

I have attached a zipped file containing the rsl.error, rsl.out, and all three namelist files (wps, input, and output).

Other pieces of information that may or may not have any use:
  • My version is 4.5.2.
  • I originally installed WRF last year using https://github.com/bakamotokatas/WRF-Install-Script.
  • Since it mentions memory allocation, the memory on the machine I'm using is 64 GB.
  • No parallel processing was attempted here, and I'm not trying to use dm+sm mode, but if I'm being honest with myself, I cannot be 100% sure that I'm doing what I think I'm doing.
  • I have run wrf.exe successfully with this installation of WRF before, but never with ERA5 as the boundary data and never with such a large nested domain.
Does anyone have familiarity with this issue? Or any guesses on what to check?

Much obliged.

Update on 13 Feb 2025: I tested the same setup today with GFS data instead of ERA5, and I received exactly the same error as before. Therefore, I am confident that the error is not specific to the ERA5 data I was trying to use.
 

Hi,
I'd like to apologize for the very long delay in response while we've had to tend to other obligations. Thank you for your patience. Since it's been so long, I first want to ask if you're still experiencing this issue. If so, I think you likely need more than a single processor for this simulation. See Choosing an Appropriate Number of Processors.
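
As a rough illustration, the rule of thumb on that page, as I understand it, is: use at most about (e_we/25) * (e_sn/25) processors based on the smallest domain, and at least about (e_we/100) * (e_sn/100) based on the largest domain. The sketch below applies that rule. The function name and the grid sizes are hypothetical placeholders, not values from the attached namelist, so please check the page itself for the exact guidance.

```python
import math


def processor_bounds(domains):
    """Estimate (least, most) processors from a list of (e_we, e_sn) pairs.

    Assumed rule of thumb: the largest domain sets the minimum
    (about 100x100 grid points per processor) and the smallest domain
    sets the maximum (about 25x25 grid points per processor).
    """
    smallest = min(domains, key=lambda d: d[0] * d[1])
    largest = max(domains, key=lambda d: d[0] * d[1])
    most = max(1, math.floor((smallest[0] / 25) * (smallest[1] / 25)))
    least = max(1, math.ceil((largest[0] / 100) * (largest[1] / 100)))
    return least, most


# Hypothetical parent and nest sizes, for illustration only.
least, most = processor_bounds([(200, 200), (400, 400)])
print(least, most)  # 16 64
```

Substitute the e_we and e_sn values from namelist.input to get a range for your own domains.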
 
Hello! Thank you for the response. I had figured out that my domain was probably too big for my processor, but I had not yet figured out how to properly calculate the number of processors. Therefore, that link you shared was very helpful. I have the simulation running on 24 processors now, and that is working. (My first output netcdf file just came through.) So I know that the error I posted about is now solved!

For anyone reading in the future who has the same problem, since I have 2 threads per core, the final line for running wrf.exe was:
mpirun -np 24 --use-hwthread-cpus ./wrf.exe

Thanks again.
 