
Segmentation fault for MPI + OpenMP implementation

This post is from a previous version of the WRF & MPAS-A Support Forum. New replies have been disabled; if you have follow-up questions related to this post, please start a new thread from the forum home page.

Anna_1986

I am new to WRF-Chem and would like to run a hybrid (MPI + OpenMP) build of WRF-Chem V4.0.3. The MPI-only build runs without errors on 10 nodes (12 MPI processes per node). However, when I configure WRF-Chem with options "51" and "1" (dm+sm mode, Intel compiler, basic) and run on 10 nodes (12 MPI processes per node, 2 OpenMP threads per MPI process), I get a segmentation fault:

--------------------------------------------------------------------------------------------------------------------
_pmiu_daemon(SIGCHLD): [NID 03466] [c6-1c0s2n2] [Thu Jan 30 15:17:03 2020] PE RANK 80 exit signal Segmentation fault
[NID 03466] 2020-01-30 15:17:03 Apid 4463210: initiated application termination
--------------------------------------------------------------------------------------------------------------------

The output file ends with the following lines:
--------------------------------------------------------------------------------------------------------------------
Timing for processing lateral boundary for domain 1: 158.71330 elapsed seconds
WRF NUMBER OF TILES FROM OMP_GET_MAX_THREADS = 2
Tile Strategy is not specified. Assuming 1D-Y
WRF TILE 1 IS 1 IE 80 JS 1 JE 34
WRF TILE 2 IS 1 IE 80 JS 35 JE 67
WRF NUMBER OF TILES = 2
Top of Radiation Driver
CALL cldfra1
--------------------------------------------------------------------------------------------------------------------
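For readers following the tile lines in the log above: with 2 OpenMP threads and the default 1D-Y strategy, the patch owned by this MPI rank (j = 1 to 67) is cut into two bands along Y, one per thread. The following is only a rough sketch of that kind of split, not WRF's actual tiling code. The function name split_1d_y and the rule that leftover rows go to the first tiles are assumptions, chosen because they reproduce the JS/JE values printed in the log.

```python
def split_1d_y(js, je, num_tiles):
    """Split the inclusive j-range [js, je] into num_tiles contiguous bands.

    Hypothetical helper for illustration only; WRF's own tile computation
    may distribute remainder rows differently.
    """
    total = je - js + 1
    base, extra = divmod(total, num_tiles)
    tiles = []
    start = js
    for t in range(num_tiles):
        # Assumption: the first `extra` tiles each take one additional row.
        size = base + (1 if t < extra else 0)
        tiles.append((start, start + size - 1))
        start += size
    return tiles


# Patch extent taken from the log: IS 1, IE 80, JS 1, JE 67, two tiles.
for n, (tjs, tje) in enumerate(split_1d_y(1, 67, 2), start=1):
    print(f"TILE {n} IS 1 IE 80 JS {tjs} JE {tje}")
```

Each band is processed by its own thread inside the OpenMP loop, so the last log line before the crash ("CALL cldfra1") comes from code that runs once per tile.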

I suspect the problem is in the subroutine radiation_driver() in the module radiation_driver.f90. There I found the OpenMP loop (lines 992 to 2377) and a call to the subroutine cal_cldfra1(), but I do not know exactly what causes the error.

Please help me solve this problem.
 
Hi,
We typically don't recommend compiling with the dm+sm option because so many users run into similar problems. We don't even test our code in this mode anymore. Either dmpar or smpar should be fine to use, and since your MPI-only (dmpar) build already runs without errors, it's probably best to stick with that type of compile.
 