
unable to run ideal case em_quarter_ss in parallel (serial works)

acast

New member
Dear forum,

I am getting an error when I try to run ideal.exe for "em_quarter_ss" in parallel. I do not have a problem when I compile and run any of the ideal cases in serial. The reason I am trying the parallel build is that I get errors when running wrf.exe in parallel with my northwest Atlantic model, which has been tested successfully, without errors, on a different HPC machine.

This is the error I get in the rsl.error file when running ideal.exe for em_quarter_ss in parallel:

-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 175
ideal: error opening wrfinput for writing -1021
-------------------------------------------
Rank 0 [Tue Jul 25 16:49:10 2023] [c5-0c2s14n1] application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0

And this is what I get in the log file from my batch script (it is the same error I get when I try to run wrf.exe with my northwest Atlantic model):
srun: error: nid02294: tasks 4-7: Exited with exit code 255
srun: Terminating StepId=269304181.0
srun: error: nid02293: tasks 0: Exited with exit code 255

This is how I run ideal.exe:
srun --ntasks=1 ideal.exe

I attach the rsl.error file and the configure.wrf file. The .exe files were built successfully when I compiled, but I am wondering if this has something to do with compiling on a Cray machine. I also attach my log.compile file, which includes some warnings.

Thanks,

Alma
 

Attachments

  • rsl.error.0000 (1.2 KB)
  • log.txt (1.6 MB)
  • configure.wrf.txt (20.6 KB)

Alma,
We always run ideal cases in serial mode, so I have no immediate answer to your question.
I will run a test of em_quarter_ss with the code built in dmpar mode. It may take some time because our HPC system is under maintenance right now. I will update you on the result once it is done. Thanks for your patience.
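
In the meantime, since a dmpar build writes one rsl.error.NNNN file per MPI rank, it may help to check whether every rank reports the same fatal message or only rank 0. Below is a minimal sketch of a helper that does this; it is not part of WRF. The function name collect_fatal_messages and the assumption that each fatal block is framed by dashed lines (as in the rsl.error excerpt above) are illustrative.

```python
from pathlib import Path


def collect_fatal_messages(run_dir="."):
    """Return {file name: [lines inside FATAL blocks]} for rsl.error.* files.

    Assumes (as in the excerpt posted above) that a fatal block opens with a
    dashed banner containing 'FATAL CALLED' and closes with a line of dashes.
    """
    results = {}
    for path in sorted(Path(run_dir).glob("rsl.error.*")):
        block, inside = [], False
        for line in path.read_text(errors="replace").splitlines():
            stripped = line.strip()
            if not inside and stripped.startswith("-") and "FATAL CALLED" in stripped:
                inside = True            # opening banner of a fatal block
            elif inside and stripped and set(stripped) == {"-"}:
                inside = False           # closing line made only of dashes
            elif inside:
                block.append(stripped)   # message lines between the banners
        if block:
            results[path.name] = block
    return results


if __name__ == "__main__":
    # Scan the current run directory and print what each rank reported.
    for name, messages in collect_fatal_messages(".").items():
        print(name)
        for message in messages:
            print("   ", message)
```

Run it from the directory that holds the rsl.error.* files. If only rsl.error.0000 contains the "error opening wrfinput for writing" message, that would suggest the failure happens where rank 0 tries to create the output file, rather than on every rank.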
 