Memory Error when running idealized WRF test case on Cheyenne

astansfield
Hello,

I am attempting to run the idealized tropical cyclone test case (em_tropical_cyclone) on Cheyenne, using the default namelist settings. I have successfully configured WRF (1st option = 15 and 2nd = 1) and compiled the test case, and ideal.exe seems to have run successfully. The problem comes when I try to run wrf.exe using a batch script. WRF starts running, but I immediately get this error in the output file:

MPT ERROR: MPI_COMM_WORLD rank 29 has terminated without calling MPI_Finalize()
aborting job
MPT: Received signal 11


When I look in the error output file for that processor, the first error I can find is this:
MPT ERROR: Rank 29(g:29) received signal SIGSEGV(11).
Process ID: 60679, Host: r14i1n0, Program: /glade/scratch/alyssas/WRF/WRFV4.3.3_intel_dmpar/main/wrf.exe

When I look up this cryptic error, it seems to be some sort of memory error, but I can't find a way around it. I have searched the error files and there don't seem to be any CFL errors or warnings.

Thank you,
Alyssa
 
Hi Alyssa,
When you run ideal.exe, are you using multiple processors? If so, can you try running it with only a single processor? Even if you've built the idealized code for distributed-memory processing, ideal.exe must still be run with a single processor. If that's not the problem, let me know your running directory and I'll take a look. Thanks!
 
Hello, thank you for your reply.

I tried running ideal.exe in the command line, but I got this error message: MPT ERROR: mpiexec_mpt must be used to launch all MPI applications

That made me think I had to use mpirun to run ideal.exe. Is that right?
 
Hi,
Sorry for the confusion. You do actually need to use mpirun, but you just need to use a single processor. So wherever you are declaring the number of processors (e.g., mpirun -np 1 ./ideal.exe), make sure you set it to "1."
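As a rough illustration of the point above, here is a minimal shell sketch that launches ideal.exe on one rank and wrf.exe on several. The names LAUNCHER, NPROCS_WRF, DRY_RUN, and run_step are hypothetical and used only for this example, and the rank count of 36 is an assumption. The error message quoted earlier in this thread suggests that mpiexec_mpt may be the required launcher on this system; whether it takes the same -np flag is something to check against the site documentation.

```shell
#!/bin/bash
# Sketch: run ideal.exe on exactly one MPI rank, then wrf.exe on many.
# LAUNCHER, NPROCS_WRF, DRY_RUN, and run_step are hypothetical names.
LAUNCHER="${LAUNCHER:-mpirun}"      # assumption: your site may need another launcher
NPROCS_WRF="${NPROCS_WRF:-36}"      # assumption: rank count for wrf.exe
DRY_RUN="${DRY_RUN:-1}"             # 1 = only print the commands, 0 = run them

run_step() {
  # $1 = number of MPI ranks, $2 = executable to launch
  local cmd="$LAUNCHER -np $1 $2"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"
  else
    $cmd
  fi
}

# ideal.exe always gets a single rank; wrf.exe only starts if ideal.exe succeeded.
run_step 1 ./ideal.exe && run_step "$NPROCS_WRF" ./wrf.exe
```

With the default DRY_RUN=1 the script only prints the two launch commands, so the rank counts can be checked before submitting the job. Setting DRY_RUN=0 runs them.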
 
Okay, got it. So I discussed this issue with someone at CISL and we discovered that the ultimate problem was that my environment modules were older, so they were incompatible with the newer version of WRF that I was trying to run. So once I updated my environment modules, the idealized test case ran successfully!
 