MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 1.

naveen_ven
Hi everyone, I'm getting this error when I run real.exe. Here is the complete error message.

starting wrf task 0 of 1
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
 
Can you attach your rsl.error, rsl.out, and namelist files here so we can look at them?
 
Hi Will,

I have attached the rsl.error, rsl.out, and namelist.input files.
Thanks for the help. Let me know if you need anything else.

Regards,
Naveen Venkat.
 

Attachments

  • rsl.out.0000 (2.6 KB)
  • rsl.error.0000 (2.2 KB)
  • namelist.input (3.6 KB)
Can you also include your namelist.wps file? I think I know what the issue is, but I need that file to confirm.
 
Okay, I know what the problem is.

Your two namelists have different domains of interest in them.

In the WRF namelist.input you have two domains:

e_we = 150, 220,
e_sn = 130, 214,
e_vert = 45, 45,

But in the WPS namelist.wps you have only one domain listed, and it doesn't match the WRF namelist:

e_we = 220,
e_sn = 175,

That is why you are getting this error in the rsl.out:

d01 2022-10-09_00:00:00 input_wrf.F: SIZE MISMATCH: namelist e_we = 150
d01 2022-10-09_00:00:00 input_wrf.F: SIZE MISMATCH: input file WEST-EAST_GRID_DIMENSION = 220
d01 2022-10-09_00:00:00 ---- ERROR: Mismatch between namelist and input file dimensions
d01 2022-10-09_00:00:00 input_wrf.F: SIZE MISMATCH: namelist e_sn = 130
d01 2022-10-09_00:00:00 input_wrf.F: SIZE MISMATCH: input file SOUTH-NORTH_GRID_DIMENSION = 175
d01 2022-10-09_00:00:00 ---- ERROR: Mismatch between namelist and input file dimensions
NOTE: 2 namelist vs input data inconsistencies found.
-------------- FATAL CALLED ---------------
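
A quick way to spot this kind of mismatch before running real.exe is to pull the first-domain e_we and e_sn values out of both namelists and compare them. This is only a minimal sketch: the function names are made up for illustration, and it assumes each parameter sits on its own "name = values" line, as in the excerpts above.

```shell
# Print the first (domain 1) value of a namelist parameter.
# $1 = parameter name (e.g. e_we), $2 = namelist file
# Assumption: one "name = v1, v2, ..." entry per line.
first_value() {
    grep -E "^[[:space:]]*$1[[:space:]]*=" "$2" \
        | head -n 1 | cut -d= -f2 | cut -d, -f1 | tr -d '[:space:]'
}

# Succeed only if e_we and e_sn for domain 1 agree between the two files.
# $1 = namelist.input, $2 = namelist.wps
same_first_domain() {
    [ "$(first_value e_we "$1")" = "$(first_value e_we "$2")" ] &&
    [ "$(first_value e_sn "$1")" = "$(first_value e_sn "$2")" ]
}
```

For example, `same_first_domain namelist.input namelist.wps || echo "domain 1 dimensions differ"` would flag the 150 vs 220 case shown above. It only checks the first domain; nested domains would need the same comparison on the later comma-separated values.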
 
Hi Mr Hatheway,
I'm getting a similar error again. I have attached the required files. Is there any manual or tutorial for running WRF?

Thanks
Naveen Venkat
 

Attachments

  • namelist.wps (950 bytes)
  • rsl.error.0000 (425 bytes)
  • rsl.out.0000 (388 bytes)
  • namelist.input (2.7 KB)
Seems to be a similar problem.

Namelist.input shows three domains listed but only a max_dom of 1.
max_dom = 1,
e_we = 74, 112, 94
e_sn = 61, 97, 91,
e_vert = 30, 30, 30,

Namelist.wps shows only two domains listed.
&geogrid
parent_id = 1, 1,
parent_grid_ratio = 1, 3,
i_parent_start = 1, 31,
j_parent_start = 1, 17,
e_we = 74, 112,
e_sn = 61, 97,

I would recommend taking a look at the official tutorials located here:

WRF tutorials youtube

 
Hi,
It's okay to run WPS with more domains than you're using to run real/wrf - just not the other way around. So that shouldn't be your current issue. The error message shows:
Code:
  ------ ERROR while reading namelist dynamics ------
Maybe here?:             damp_opt = 0,
Maybe here?:             z_damp = 5000., 5000., 5000.,
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE:  <stdin>  LINE:   11540
ERRORS while reading one or more namelists from namelist.input.

It's saying there is a problem with one of the lines in the dynamics section of the namelist, and provides hints about where you can start to look. The issue you're having is that there is no namelist parameter called "z_damp." You need to remove that line from the namelist.input file. If you're ever curious about the parameters, go to the WRF/Registry directory and do a 'grep' command to search for the variable. If none is found, it doesn't exist.
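
The Registry check described above can be sketched as a small helper. The function name is hypothetical, and the path to the Registry directory depends on where WRF was unpacked. As far as I know, the damping-depth parameter in the dynamics section is spelled `zdamp` (no underscore), so that may be the entry that was intended.

```shell
# Succeed if a namelist parameter name appears anywhere in the Registry files.
# $1 = parameter name, $2 = path to the Registry directory (e.g. WRF/Registry)
# -w matches whole words only, so "z_damp" will not match a longer name.
in_registry() {
    grep -rqw -- "$1" "$2"
}
```

Usage would look like `in_registry z_damp WRF/Registry || echo "no such parameter"`. To see the matching Registry lines instead of just a yes/no answer, run `grep -rnw` with the same arguments.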
 
Good to know
 
I just encountered the same error message when I tried running real.exe, and I really need help figuring out a troubleshooting strategy for this. Thank you very much in advance.

Here are the relevant attachments:
 

Attachments

  • rsl.out.0000 (2.3 KB)
  • rsl.error.0000 (1.2 KB)
  • namelist.input (2.9 KB)
  • namelist.wps (568 bytes)
I just figured out the source of the error. I had failed to link the met_em files properly because of a typo in the directory path. real.exe now works fine.
 
Hi,
Can you tell me how to properly link the met_em files? I have the same issue here.

FYI, I link the met_em files using this:
ln -sf /mnt/g/Models/WRF-Chem/WPS-4.5/met_em.d01*.

and it's not working.
Thanks in advance.
 

Attachments

  • rsl.error.0000 (1.3 KB)
  • rsl.out.0000 (2 KB)
  • namelist.input (3.6 KB)
  • namelist.wps (664 bytes)
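
One possible cause, assuming the command was typed exactly as posted: there is no space between the wildcard and the final ".", so the shell looks for files whose names end in a dot and ln never receives a destination directory. A minimal sketch of a safer version follows; the function name is made up for illustration, and the source directory should be an absolute path so the links resolve from the run directory.

```shell
# Link every met_em file from a WPS directory into a run directory.
# $1 = absolute path of the directory holding met_em.d0* files
# $2 = run directory (defaults to the current directory)
link_met_em() {
    src=$1
    dest=${2:-.}
    # Expand the wildcard first and stop if nothing matched,
    # otherwise ln would create a dangling link named "met_em.d0*".
    set -- "$src"/met_em.d0*
    [ -e "$1" ] || { echo "no met_em files found in $src" >&2; return 1; }
    # The destination is a separate argument from the file list.
    ln -sf "$@" "$dest"/
}
```

For the path in the post, that would be `link_met_em /mnt/g/Models/WRF-Chem/WPS-4.5 .` run from the directory containing real.exe. Afterwards, `ls -l met_em*` should show links that point at existing files.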

Hello everyone,
I get the same error and I don't know how to solve it:

MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.

I have attached my namelist.wps and the other files. Can someone please help me?
 

Attachments

  • namelist.wps (1.2 KB)
  • rsl.error.0000 (1.5 KB)
  • rsl.out.0000 (5.8 KB)
  • namelist.input (2.9 KB)
Hi,
Just like the user above, you have a similar message in your rsl.error.0000 file. If you open the file, you can see this message at the bottom:

Code:
d01 2024-01-15_00:00:00  input_wrf.F: SIZE MISMATCH:  namelist num_metgrid_levels           =           27
d01 2024-01-15_00:00:00  input_wrf.F: SIZE MISMATCH:  input file BOTTOM-TOP_GRID_DIMENSION  =           50
d01 2024-01-15_00:00:00 ---- ERROR: Mismatch between namelist and input file dimensions
NOTE:       1 namelist vs input data inconsistencies found.
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE:  <stdin>  LINE:    1298
NOTE:  Please check and reset these options
-------------------------------------------

This is telling you that in your namelist.input file, you have the variable "num_metgrid_levels = 27," but in your input files (i.e., your met_em* files), you have 50 levels. You simply need to change the value to 50 in your namelist.input.
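
If you want to confirm the number of levels directly from the input files rather than from the error message, the netCDF header can be inspected. The helper below is only a sketch with a made-up name; it assumes the met_em header lists a dimension called num_metgrid_levels, which is what I would expect from metgrid output.

```shell
# Read `ncdump -h` output on stdin and print the num_metgrid_levels dimension.
metgrid_levels() {
    sed -n 's/^[[:space:]]*num_metgrid_levels[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' \
        | head -n 1
}
```

Usage, with an assumed file name: `ncdump -h met_em.d01.2024-01-15_00:00:00.nc | metgrid_levels`. The number printed is the value to put in namelist.input.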
 
Hi,
I also have a similar MPI_ABORT problem, but with a different error in the rsl file, when I run wrf.exe. real.exe runs successfully and produces the wrfinput files, but wrf.exe fails with the error " program wrf: error opening wrfinput_d01 for reading ierr= -1021", even though the wrfinput_d01 file exists. How can I solve this problem?

I have attached the files here.
Thank you in advance for your time.
 

Attachments

  • namelist.input (11.7 KB)
  • namelist.wps (920 bytes)
  • rsl.error (real-exe).0000.txt (39.5 KB)
  • rsl.error.0000 (1.6 KB)
  • rsl.out(real-exe).0000.txt (57.8 KB)
Hi,
I just went through all your files. I see that the start and end dates for domains 2, 3, and 4 are mismatched in both namelist files.

Also, I think the time_step you chose in your namelist.input file is too small; try setting it to 3 to 6 times DX (I may be wrong).


Hope that it helps.
With regards,
Tanmoy
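
As a rough illustration of the time_step rule of thumb mentioned in this post: the commonly cited WRF guidance is that time_step, in seconds, should not exceed about 6 times DX with DX expressed in kilometres. A minimal sketch, assuming dx is a whole number of metres as in namelist.input; the function name is made up for illustration.

```shell
# Upper bound on time_step (seconds) from the 6 x DX(km) rule of thumb.
# $1 = dx in metres, as a whole number (e.g. 9000, not 9000.)
max_time_step() {
    echo $(( 6 * $1 / 1000 ))
}
```

For a 9 km parent domain, `max_time_step 9000` prints 54, so a time_step of roughly 27 to 54 seconds would fall within the 3 to 6 times DX range suggested above.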
 