
(RESOLVED) ERRORS while reading one or more namelists


sao698

Good morning,

I keep getting an MPI error when I try to run real.exe. After going through the forum, I tried increasing the number of nodes to get more memory, and my memory limit is set to unlimited, but I am still running into the problem. I have attached my rsl.error file and the namelist.input.

Thank you so much!
 

Attachments

  • namelist.input (3.9 KB)
  • rsl.error copy.txt (1.1 KB)
Hi,
How many processors are you using to run real.exe? For a domain of this size, using only 1 (or a small handful) should probably be okay. Sometimes it can be possible to use too many processors.
 
I was using 8 originally and then tried with 16, but when I just tried with 1, it still failed.

Thank you again for your help!
 
This was the output from the REALrun.err:

starting wrf task 1 of 3
starting wrf task 2 of 3
starting wrf task 0 of 3
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 8627 on
node keeling-f09 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
And this was the output from the rsl.error file:

taskid: 15 hostname: keeling-g04
~
 
Edited: Can you package your rsl.* files together and attach them as a single *.TAR file? Thanks.
 
Thanks. It would be helpful if you could remove all the rsl* files, re-run real.exe, and then send only the rsl* files that go with that particular run. I'm seeing different errors in these, and as they all carry the same date and time (probably the time you packaged them today), it's hard to tell which run each one came from.
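
The clean re-run requested above could look roughly like the sketch below. The function names and the archive name rsl_files.tar are made up for illustration, and mpirun is only an assumed launcher; some clusters use mpiexec, srun, or a batch script instead.

```shell
# Sketch only: helper names and the archive name are illustrative, not part of WRF.

rerun_real_clean() {
    # $1 = number of MPI processes (defaults to 1, which is often enough for real.exe)
    rm -f rsl.*                        # drop rsl.error.* / rsl.out.* left from earlier attempts
    mpirun -np "${1:-1}" ./real.exe    # assumed launcher; replace with your site's equivalent
}

package_rsl_files() {
    # Bundle the logs of the latest run into a single archive for attaching to a post.
    # The archive name starts with "rsl_" so it is not matched by the rsl.* pattern itself.
    tar -cf rsl_files.tar rsl.*
}
```

Run from the directory that holds real.exe, this would be used as `rerun_real_clean 1` followed by `package_rsl_files`, so the archive contains only files written by that one run.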
 
Hi,
Thanks for sending those. In the rsl.error.0000 file, this error is shown:
Code:
  ------ ERROR while reading namelist dynamics ------
Maybe here?:      scalar_adv_opt                      = 1,      1,      1,
Maybe here?:      gwd_opt                             = 1,      1,      0,
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE:  <stdin>  LINE:   10851
ERRORS while reading one or more namelists from namelist.input.
-------------------------------------------
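
When a run writes one rsl.error file per MPI task, a message like the one above can be found without opening each file. The following is a minimal sketch using plain grep; the function name is hypothetical, and the search strings are simply the ones visible in this error message.

```shell
# Sketch only: find_fatal_errors is an illustrative helper, not a WRF utility.

find_fatal_errors() {
    # $1 = directory holding the rsl files (defaults to the current directory)
    dir="${1:-.}"
    # Names of the files in which the run aborted
    grep -l "FATAL CALLED" "$dir"/rsl.error.*
    # The namelist-related lines, with line numbers, that point at the likely cause
    grep -n -e "ERROR" -e "Maybe here" "$dir"/rsl.error.*
}
```

Calling `find_fatal_errors` in the run directory lists the affected files first and then the "ERROR" and "Maybe here?" lines that name the suspect namelist entries.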

What version of the code are you using? If you're using a version older than V4.2, gwd_opt was not domain-dependent at that time, so it could only be set once (i.e., gwd_opt = 1) rather than with one value per domain column.
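
If the installed version really does expect a single value, the entry has to be reduced to its first column. The sketch below shows one way to do that with awk; the function name is hypothetical, and it assumes each namelist setting sits on one line in the form "name = v1, v2, v3,". It writes the edited text to standard output and leaves the original file untouched. Whether this applies depends on the version in use, and in this thread the cause turned out to be the build rather than the namelist.

```shell
# Sketch only: collapse_to_single_value is an illustrative helper, not a WRF utility.

collapse_to_single_value() {
    # $1 = namelist variable name, $2 = namelist file; edited text goes to stdout
    awk -v name="$1" '
        $0 ~ ("^[ \t]*" name "[ \t]*=") {
            eq = index($0, "=")                      # position of the equals sign
            split(substr($0, eq + 1), vals, ",")     # per-domain values after the "="
            gsub(/[ \t]/, "", vals[1])               # strip padding from the first value
            print substr($0, 1, eq) " " vals[1] ","  # keep name and indentation, one value
            next
        }
        { print }                                    # every other line is passed through
    ' "$2"
}
```

For example, `collapse_to_single_value gwd_opt namelist.input > namelist.input.single` would produce a copy to review before replacing the original.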
 
Sorry, it turned out there was an issue with how WRF was downloaded and compiled on our server, so the error was related to that. Thank you for all of your help! :)
 