memory allocation error when running wrf.exe

AlexCrawford

I am trying to run a real-case WRF simulation with a parent domain and one nested domain, using ERA5 data as the input for the boundary data. I have worked through every WPS step and ran real.exe without errors. When I try to run wrf.exe, I get the following error:

MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 9.

Looking at the rsl.error file, the trigger seems to be related to this:
rsl_malloc failed allocating -1933456896 bytes, called rsl_bcast.c, line 290, try 1
: Cannot allocate memory
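
A side note on the negative byte count: one plausible reading (an assumption, not something confirmed in this thread) is that the requested size was stored in a signed 32-bit integer and wrapped around. The sketch below shows only that reinterpretation. The helper name `as_unsigned_32` is hypothetical, and the real request could be larger by any multiple of 2^32.

```python
# Reinterpret a negative "bytes" value from rsl.error as an unsigned 32-bit
# number. ASSUMPTION: the size overflowed a signed 32-bit integer exactly once.
INT32_MODULUS = 2**32


def as_unsigned_32(reported_bytes: int) -> int:
    """Return the value modulo 2**32, i.e. the unsigned 32-bit interpretation."""
    return reported_bytes % INT32_MODULUS


reported = -1933456896              # value copied from the error message above
requested = as_unsigned_32(reported)

print(requested)                    # 2361510400
print(f"{requested / 2**30:.2f} GiB")
```

If that reading is right, a single process was asking for a buffer of a bit over 2 GiB in one call, which would point at the domain size rather than at the 64 GB of total memory.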

Searching the web, I found two similar cases, both on this forum and both unresolved:
One is this thread: error wrf.exe rsl_malloc failed allocating -1372408080 bytes, called rsl_bcast.c, line 270, try 2. The original poster there never followed up with the files that the respondent asked for.
The other is this thread: rsl_malloc failed allocating -2076874848 bytes, called rsl_bcast.c, line 290, try 3. The response was to recompile WRF with OpenMP instead of dm+sm and to come back if that didn't solve the problem. Again, there was no follow-up from the original poster.

I have attached a zipped file containing the rsl.error, rsl.out, and all three namelist files (wps, input, and output).

Other pieces of information that may or may not have any use:
  • My version is 4.5.2.
  • I originally installed WRF last year using https://github.com/bakamotokatas/WRF-Install-Script.
  • Since it mentions memory allocation, the memory on the machine I'm using is 64 GB.
  • I did not attempt any parallel processing here, and I'm not trying to use dm+sm mode, but if I'm being honest with myself, I cannot be 100% sure that I'm doing what I think I'm doing.
  • I have run wrf.exe successfully with this installation of WRF before, but never with ERA5 as the boundary data and never with such a large nested domain.
Does anyone have familiarity with this issue? Or any guesses on what to check?

Much obliged.

Update on 13 Feb 2025: Today I tested the same setup with GFS data instead of ERA5, and I received exactly the same error as before. Therefore, I am confident that the error is not specific to the ERA5 data I was trying to use.
 

Attachments

  • WRF_ErrorCase_20250123_AlexCrawford.zip
    14.5 KB · Views: 1
  • WRF_ErrorCase_20250213_AlexCrawford.zip
    14.5 KB · Views: 1
Hi,
I'd like to apologize for the very long delay in response while we've had to tend to other obligations. Thank you for your patience. Since it's been so long, I first want to ask if you're still experiencing this issue. If so, I think you likely need more than a single processor for this simulation. See Choosing an Appropriate Number of Processors.
 
Hello! Thank you for the response. I had figured out that my domain was probably too big for a single processor, but I had not yet figured out how to properly calculate the number of processors. Therefore, the link you shared was very helpful. I have the simulation running on 24 processors now, and that is working. (My first output NetCDF file just came through.) So I know that the error I posted about is now solved!

For anyone reading in the future who has the same problem, since I have 2 threads per core, the final line for running wrf.exe was:
mpirun -np 24 --use-hwthread-cpus ./wrf.exe
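
For readers who want to estimate a process count before launching, here is a minimal sketch of the rule of thumb as I understand it from the linked guidance: at least (e_we/100) * (e_sn/100) processes for the largest domain, and at most (e_we/25) * (e_sn/25) for the smallest. The function name `processor_range` and the grid sizes are hypothetical; they are not the values from the attached namelists. Check the linked page before relying on the numbers.

```python
# Rough bounds on the number of MPI processes for a WRF run.
# ASSUMPTION: rule of thumb of 100x100 grid points per process at the coarse
# end and 25x25 at the fine end, recalled from the forum guidance.

def processor_range(domains):
    """Return (min_procs, max_procs) for a list of (e_we, e_sn) grid sizes."""
    areas = [e_we * e_sn for e_we, e_sn in domains]
    # Lower bound comes from the largest domain: about 100 x 100 points each.
    min_procs = max(1, max(areas) // (100 * 100))
    # Upper bound comes from the smallest domain: about 25 x 25 points each.
    max_procs = max(1, min(areas) // (25 * 25))
    return min_procs, max_procs


# Hypothetical parent and nest sizes, for illustration only.
example_domains = [(200, 200), (400, 400)]
low, high = processor_range(example_domains)
print(f"Use between {low} and {high} processes")  # Use between 16 and 64 processes
```

The number passed to `mpirun -np` should fall inside that range. The `--use-hwthread-cpus` option in the command above is an Open MPI flag that counts hardware threads as available slots, so it is only needed when the process count exceeds the number of physical cores.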

Thanks again.
 