'BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES' during wrf.exe

kinguT

Member
I successfully ran real.exe for four domains for the simulation period start_date = '2012-06-27_00:00:00' to end_date = '2012-09-30_18:00:00'. However, when I run 'mpirun -np 8 ./wrf.exe', the following error message appears:


===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 84908 RUNNING AT negusu-OptiPlex-3060
= EXIT CODE: 9
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Killed (signal 9)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions



Can anyone suggest how to solve this?

Thanks
 

Attachments

I have had the exact same issue for a few days now, and I've searched the forum in multiple ways but have not found a solution. All other executables have run successfully on my system using 32 cores. My system is a Linux Ubuntu server (x86_64 GNU/Linux), CPU(s): 72, Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz. Following the guidance on choosing an appropriate number of processors, I found that 32 is a suitable count for my case, and both metgrid and real run successfully with "mpirun -np 32". I only get the above issue with wrf.exe. I have installed the latest available version of the model (4.5) for both WRF and WPS, and I use input and boundary data from GFS, with SST_FIXED also from GFS. I also used the Domain Wizard web tool to create my domains for WPS.
I've also noticed this
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

in 4 of my rsl.error.00* files (rsl.error.0007, rsl.error.0008, rsl.error.0016, and rsl.error.0023).

I had been running version 4.2 of the model before, but I'm new to running version 4.5 without help, so any advice would be greatly appreciated.
Best,
Zoi
 


Hi,

I'm encountering the same issue and am currently working on optimizing my time step. I recommend decreasing your time step to see if that resolves the problem; this adjustment worked for me. For example, try 90 instead of 108. Same for kingu: maybe try something less than 162, e.g. 135? Here I was using 5*dx.

Also, when fine-tuning the time step for each domain, don't hesitate to use the time_step_ratio.

Best,

Vazquez Ballesta Manuarii
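To make the rule of thumb above concrete, here is a small sketch (not part of WRF; the function names are illustrative) of Manuarii's 5*dx guideline, with the child-domain steps derived from parent_time_step_ratio:

```python
# Sketch of the "time_step ~ 5 * dx (km)" rule of thumb from the post
# above, and how each nest's effective step follows from
# parent_time_step_ratio. Function names are illustrative, not WRF API.

def suggested_parent_step(dx_km, factor=5):
    """Suggested parent-domain time step in seconds (factor * dx rule)."""
    return factor * dx_km

def per_domain_steps(parent_step, parent_time_step_ratio):
    """Effective time step for each domain, walking down the nest."""
    steps = []
    step = parent_step
    for ratio in parent_time_step_ratio:
        step = step / ratio  # ratio is 1 for the parent domain itself
        steps.append(step)
    return steps

# Example: a 27 km parent with a 1/3/3/3 nest
parent = suggested_parent_step(27)               # 135 s
print(per_domain_steps(parent, [1, 3, 3, 3]))    # [135.0, 45.0, 15.0, 5.0]
```

The usual WRF guidance is factor = 6 with dx in km; Manuarii's 5*dx is simply a more conservative choice.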
 
Thank you so much for your recommendation,
I changed the time step to 90 and the same error occurs; the only difference is that the error I mentioned above now appears in only 3 rsl.error.* files. I will try running with different time steps to see if that fixes it, as you suggested.
Kindly,
Zoi
 
OK, if you want, as an example from my configuration: I found that a time step of 10 s (with a slightly different time-step ratio, because inner domains under 1 km require a smaller time step) helps, and I have the following in my namelist:

&domains
time_step = 10,
time_step_fract_num = 0,
time_step_fract_den = 1,
max_dom = 5,
e_we = 106, 100, 100, 175, 169,
e_sn = 100, 100, 121, 178, 154,
e_vert = 46, 46, 46, 46, 46,
vert_refine_method = 0, 0, 0, 0, 0,

eta_levels(1:46) = 1.0000, 0.9987, 0.9974, 0.9962, 0.9949,
0.9924, 0.9899, 0.9859, 0.9809, 0.9759,
0.9709, 0.9659, 0.9606, 0.9520, 0.9427,
0.9326, 0.9219, 0.9077, 0.8932, 0.8769,
0.8656, 0.8574, 0.8462, 0.8351, 0.8235,
0.8113, 0.7958, 0.7756, 0.7494, 0.7133,
0.6742, 0.6323, 0.5876, 0.5406, 0.4915,
0.4409, 0.3895, 0.3379, 0.2871, 0.2378,
0.1907, 0.1465, 0.1056, 0.0682, 0.0332,
0.0000,


p_top_requested = 5000,
num_metgrid_levels = 38,
num_metgrid_soil_levels = 4,
dx = 9000, 3000, 1000, 333.333, 111.111,
dy = 9000, 3000, 1000, 333.333, 111.111,
grid_id = 1, 2, 3, 4, 5,
parent_id = 0, 1, 2, 3, 4,
i_parent_start = 1, 50, 30, 11, 75,
j_parent_start = 1, 35, 30, 27, 47,
parent_grid_ratio = 1, 3, 3, 3, 3,
parent_time_step_ratio = 1, 3, 3, 4, 3,
feedback = 1,
smooth_option = 0,
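One thing that is easy to get wrong when specifying eta_levels explicitly, as in the namelist above, is a mismatch with e_vert or a non-monotonic value. A few lines of Python (just a convenience sketch, not a WRF tool) can validate the list:

```python
# Sanity check (illustrative, not part of WRF) for an explicit
# eta_levels list: the count must equal e_vert, and the values must
# decrease strictly from 1.0 at the surface to 0.0 at the model top.

def check_eta_levels(eta, e_vert):
    assert len(eta) == e_vert, f"expected {e_vert} levels, got {len(eta)}"
    assert eta[0] == 1.0 and eta[-1] == 0.0, "eta must span 1.0 .. 0.0"
    assert all(a > b for a, b in zip(eta, eta[1:])), "eta must strictly decrease"
    return True

# The 46 values from the namelist above
eta = [1.0000, 0.9987, 0.9974, 0.9962, 0.9949,
       0.9924, 0.9899, 0.9859, 0.9809, 0.9759,
       0.9709, 0.9659, 0.9606, 0.9520, 0.9427,
       0.9326, 0.9219, 0.9077, 0.8932, 0.8769,
       0.8656, 0.8574, 0.8462, 0.8351, 0.8235,
       0.8113, 0.7958, 0.7756, 0.7494, 0.7133,
       0.6742, 0.6323, 0.5876, 0.5406, 0.4915,
       0.4409, 0.3895, 0.3379, 0.2871, 0.2378,
       0.1907, 0.1465, 0.1056, 0.0682, 0.0332,
       0.0000]

print(check_eta_levels(eta, 46))  # True
```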
 
I have tried time steps of 108, 90, 72, 54, 36, 18, and even 10 s, and I keep getting the same error. So unfortunately I don't think it is a time-step issue.
 
Hi @zoidimitriadou and @Manuarii
The error message "BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES" simply means the simulation failed for some reason. Even if the rsl files don't seem to reveal any specific errors, your reasons for failure are likely all very different; therefore it would be best if you each post a new thread to discuss your issue if you are still experiencing it. Please make sure to include your namelist.input file and all of your rsl files in that post as well. Thank you, and I apologize for the inconvenience.
 
Thanks, Manuarii, for your recommendation.
However, I found that reducing the time_step does not resolve my problem.
 
Unfortunately, there isn't an alternative to that other than adding processors. In the rsl* files you sent at the beginning, the model seems to stop almost immediately, so I don't see where it ran for 6 hours. Even so, this can still be a lack of processors. Since your d01 is smaller than d03, you could try running a single-domain simulation to see whether that fails. Although 8 processors is very few, I think that would still be able to handle d01's size. If that works, you could then try d02, and then d03, until you find which domain causes the failure. You could also try using smaller versions of all 4 domains to see if you're able to run that; if so, it would point even more strongly to an issue with the number of processors.

Another thing I notice is that your d01 is using a resolution of 27km, which is probably too coarse, depending on the resolution of the input data you're using. What is the resolution of your input data?
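The processor-count guidance referenced in this thread can be sketched numerically. Treat the 25 and 100 patch sizes below as the commonly cited forum rule of thumb, not official limits:

```python
# Rough bounds on MPI task counts for a WRF domain, following the
# commonly cited heuristic: at most about one task per 25x25 patch of
# grid points, and at least one per 100x100 patch. The exact patch
# sizes are rules of thumb, not hard limits.

def proc_bounds(e_we, e_sn):
    """Return (fewest, most) suggested MPI tasks for one domain."""
    fewest = max(1, (e_we // 100) * (e_sn // 100))
    most = max(1, (e_we // 25) * (e_sn // 25))
    return fewest, most

# Example: a 100 x 100 domain
print(proc_bounds(100, 100))  # (1, 16)
```

By this measure, 8 tasks is within range for a small d01, but a large inner domain may want far more.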
 
Thank you, kwerner.
I use the NCEP Final Analysis (GFS-FNL) with 1-degree spatial resolution.
 
Thanks. I suppose then it makes sense to use a 27km parent domain; however, you may want to consider using a higher-resolution input (e.g., GFS 0.25 degree data). It may not make much of a difference, but we typically advise to use the highest resolution option available.
 
Thanks, kwerner.
 
I am trying to run the coupled WRF/WRF-Hydro model with the Crocus option enabled. I have completed the ./real.exe step and generated three files: wrfinput_d01, wrfinput_d02, and wrfbdy_d01. But now I am encountering a segmentation fault (core dumped) error while executing ./wrf.exe. I have generated all the required WRF-Hydro input files using the WRF-Hydro GIS Preprocessor with geo_em.d02.nc. The files I created include: fulldom_hires.nc, GEOGRID_LDASOUT_Spatial_Metadata.nc, GWBASINS.nc, GWBUCKPARM.nc, hydro2dtbl.nc, Route_Link.nc, soil_properties.nc.
I placed all these files in the WRF run directory along with the Domain folder. However, when I execute mpirun -np 1 ./wrf.exe, I get the error:
(ncl_env) sagar@sagar-OptiPlex-Tower-Plus-7010:~/data/coupled_wrfhydro_Copy/WRF/run$ mpirun -np 1 ./wrf.exe
starting wrf task 0 of 1

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 18119 RUNNING AT sagar-OptiPlex-Tower-Plus-7010
= EXIT CODE: 139
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
I have attached the error log files along with my namelist.input and hydro.namelist.
Could you please help me identify what might be causing this issue, and what the namelist.input and hydro.namelist should look like for two domains? I would greatly appreciate your guidance.

Regards
Sagar Lamichhane



This is probably happening because you are running with only a single processor (mpirun -np 1).
To make sure your WRF-Hydro input files are correct, I suggest running WRF-Hydro in standalone mode first before moving to the fully coupled WRF/WRF-Hydro run. That way you can confirm the hydro domain and routing inputs work properly and isolate whether the issue is coming from coupling.


Also, to get more feedback from people who work specifically on WRF-Hydro, I recommend posting this on the WRF-Hydro user group as well: https://groups.google.com/a/ucar.edu/g/wrf-hydro_users
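Before relaunching, it may also help to confirm that the WRF-Hydro inputs listed above are actually visible from the run directory. A minimal sketch (the file names are taken from the post above; the directory layout is an assumption about your setup):

```python
# Check that the WRF-Hydro input files named in the post are present
# in the run directory before launching wrf.exe.
import os

REQUIRED = [
    "fulldom_hires.nc",
    "GEOGRID_LDASOUT_Spatial_Metadata.nc",
    "GWBASINS.nc",
    "GWBUCKPARM.nc",
    "hydro2dtbl.nc",
    "Route_Link.nc",
    "soil_properties.nc",
]

def missing_inputs(directory="."):
    """Return the required WRF-Hydro files not found in `directory`."""
    return [f for f in REQUIRED
            if not os.path.exists(os.path.join(directory, f))]

missing = missing_inputs()
if missing:
    print("Missing before wrf.exe:", ", ".join(missing))
```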
 