WRF4.3.2 crashes while writing output


Hi all,

I am having some issues running WRF V4.3.2 with MOZART-MOSAIC chemistry (chem_opt = 201). The simulation starts correctly and runs fine until the model attempts to write wrfout: when it reaches the history_interval, it crashes without any message. After enabling some debugging flags, I was able to find the lines where the model stops:

<stdin> writing 2d real aod2d_out Status = 0
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
wrf.exe 0000000013AF4254 Unknown Unknown Unknown
libpthread-2.17.s 00002B4F1EC8F370 Unknown Unknown Unknown
wrf.exe 0000000005A8F76E output_wrf_ 1046 output_wrf.f90
wrf.exe 000000000581835C module_io_domain_ 392 module_io_domain.f90
wrf.exe 0000000005C5C89D med_hist_out_ 943 mediation_integrate.f90
wrf.exe 0000000005C35FD7 med_before_solve_ 65 mediation_integrate.f90
wrf.exe 000000000058391B module_integrate_ 319 module_integrate.f90
wrf.exe 000000000041158D module_wrf_top_mp 326 module_wrf_top.f90
wrf.exe 0000000000410D55 MAIN__ 29 wrf.f90
wrf.exe 0000000000410D0E Unknown Unknown Unknown
libc-2.17.so 00002B4F1EEBDB35 __libc_start_main Unknown Unknown
wrf.exe 0000000000410C29 Unknown Unknown Unknown

Looking at the rsl.error file, it seems to me that WRF fails to save the variable called aod2d_out. Do you have any idea why this happens? I do not think it is a memory issue, as I am running a test simulation on a small domain. Also, if NaNs appeared during the computation of AOD or other chemical species, I would expect the model to crash in some chemistry-related subroutine, whereas the simulation runs without any problem until it reaches the history_interval.

Can you please help me find the issue? I attach the namelist and rsl.error. I compiled the model in debug mode with these flags: -g $(FCNOOPT) -traceback -fpe0 -ftrapuv -check bounds

Thanks a lot!
Alessandro
 


Hi Alessandro,

Thank you for your report. It seems that AOD2D_OUT is only calculated with the Goddard shortwave scheme (goddardswscheme, ra_sw_physics == 5), so unless you are using that scheme, AOD2D_OUT will not be output to history. This should not crash the model, however, so I believe you have indeed discovered a bug. We will get to work fixing it.

In the meantime, you can try one of two things:

1a) In the Registry, remove the "h" and replace it with a "-" for these quantities (AOD_OUT, AOD2D_OUT, ATOP2D_OUT, ICN_DIAG, NC_DIAG)
1b) Run "./clean -a" and recompile

or

2) Add a my_iofields file (see the WRF-Chem User's Guide) and remove those variables from the history stream.
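
As a rough illustration of option 2, the sketch below builds the single line that such a file would contain, assuming the runtime I/O syntax from the WRF Users' Guide (operation:stream type:stream id:variable list, where "-" removes fields and "h" with id 0 is the standard history stream). The helper name iofields_line is hypothetical, and the file would still need to be referenced through iofields_filename in &time_control; please check the exact variable names against your Registry.

```python
def iofields_line(variables, op="-", stream_type="h", stream_id=0):
    """Build one runtime I/O line, e.g. '-:h:0:VAR1,VAR2'.

    op          '+' adds fields to a stream, '-' removes them
    stream_type 'h' for a history stream, 'i' for an input stream
    stream_id   0 is the standard history stream (wrfout)
    """
    return f"{op}:{stream_type}:{stream_id}:{','.join(variables)}"


# Variables named in this thread; remove them from the wrfout stream.
names = ["AOD_OUT", "AOD2D_OUT", "ATOP2D_OUT", "ICN_DIAG", "NC_DIAG"]
line = iofields_line(names)
print(line)  # -:h:0:AOD_OUT,AOD2D_OUT,ATOP2D_OUT,ICN_DIAG,NC_DIAG
```

The printed line can be saved as the my_iofields file. No recompilation should be needed for this route, since the file is read at run time.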

If you want to calculate AOD for your simulations, you can set opt_pars_out = 1 and use the extinction coefficient "extcof55", i.e., AOD = sum( extcof55(:) * dz(:) ), summed over the vertical levels of each column.
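
A minimal sketch of that column sum in Python follows. It assumes extcof55 is given in km^-1 and the layer thickness dz in metres, so dz is converted to km before multiplying; please verify the units attribute in your own wrfout file. The function name column_aod and the sample values are made up for illustration.

```python
def column_aod(extcof55_per_km, dz_m):
    """Sum extinction (assumed km^-1) times layer thickness over one column."""
    if len(extcof55_per_km) != len(dz_m):
        raise ValueError("extcof55 and dz must have the same number of levels")
    # Convert each layer thickness from m to km so the product is dimensionless.
    return sum(ext * (dz / 1000.0) for ext, dz in zip(extcof55_per_km, dz_m))


# Hypothetical three-layer column (illustrative values, not model output).
extcof55 = [0.10, 0.05, 0.01]  # extinction at 550 nm, km^-1 (assumed units)
dz = [100.0, 200.0, 500.0]     # layer thickness, m
aod = column_aod(extcof55, dz)
print(round(aod, 6))  # 0.025
```

In practice dz would have to be derived from the model geopotential on the staggered levels, which is left out here.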

Best,

Jordan
 
Dear Jordan,

Thanks for the quick reply and suggestions. I added a my_iofields file that removes the variables you suggested from the history stream, but the simulation still crashes, now at a different point (line 1060 of module_domain.f90). These are the last lines of the standard output:

newnucbb mins 2.33E+02 5.00E-02 1.32E-15 1.22E-22 1.01E-22 0.00E+00
newnucbb maxs 2.90E+02 9.80E-01 2.03E-08 5.31E-13 3.33E-08 0.00E+00
newnucbb avgs 2.53E+02 3.61E-01 3.30E-10 2.14E-15 5.47E-11 0.00E+00
newnucbb hinuc 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
newnucbb dtnuc 1.20E+02
newnucbb ncnt 56000 0 0 0 0 0 0
coagbb ncntaa 56000 0 30 1816 0 0 0 29 0 0
coagbb ncntbb1 0 0 1455 0 0 0 0 0 0 0
coagbb ncntbb2 0 0 0 0 0 0 0 0
d01 2019-01-01_00:58:00 sum_pm_driver: calling sum_pm_mosaic_vbs0
d01 2019-01-01_00:58:00 sum_pm_driver: calling sum_vbs0
d01 2019-01-01_00:58:00 done tileloop in chem_driver
d01 2019-01-01_00:58:00 DEBUG wrf_timetoa(): returning with str = [2019-01-01_00:58:00]
d01 2019-01-01_00:58:00 DEBUG wrf_timetoa(): returning with str = [2019-01-01_00:00:00]
d01 2019-01-01_00:58:00 DEBUG wrf_timetoa(): returning with str = [2019-01-02_00:00:00]
d01 2019-01-01_00:58:00 DEBUG wrf_timeinttoa(): returning with str = [0000000000_000:002:000]
DEBUG domain_clockadvance(): before WRFU_ClockAdvance, clock start time = 2019-01-01_00:00:00
DEBUG domain_clockadvance(): before WRFU_ClockAdvance, clock current time = 2019-01-01_00:58:00
DEBUG domain_clockadvance(): before WRFU_ClockAdvance, clock stop time = 2019-01-02_00:00:00
DEBUG domain_clockadvance(): before WRFU_ClockAdvance, clock time step = 0000000000_000:002:000
d01 2019-01-01_01:00:00 DEBUG wrf_timetoa(): returning with str = [2019-01-01_01:00:00]
d01 2019-01-01_01:00:00 DEBUG wrf_timetoa(): returning with str = [2019-01-01_00:00:00]
d01 2019-01-01_01:00:00 DEBUG wrf_timetoa(): returning with str = [2019-01-02_00:00:00]
d01 2019-01-01_01:00:00 DEBUG wrf_timeinttoa(): returning with str = [0000000000_000:002:000]
DEBUG domain_clockadvance(): after WRFU_ClockAdvance, clock start time = 2019-01-01_00:00:00
DEBUG domain_clockadvance(): after WRFU_ClockAdvance, clock current time = 2019-01-01_01:00:00
DEBUG domain_clockadvance(): after WRFU_ClockAdvance, clock stop time = 2019-01-02_00:00:00
DEBUG domain_clockadvance(): after WRFU_ClockAdvance, clock time step = 0000000000_000:002:000
d01 2019-01-01_01:00:00 module_integrate: back from solve interface
d01 2019-01-01_01:00:00 DEBUG wrf_timetoa(): returning with str = [2019-01-01_01:00:00]
Timing for main: time 2019-01-01_01:00:00 on domain 1: 106.93501 elapsed seconds
d01 2019-01-01_01:00:00 DEBUG wrf_timetoa(): returning with str = [2019-01-01_01:00:00]
d01 2019-01-01_01:00:00 output_wrf: begin, fid = 1
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
wrf.exe 0000000006F89264 Unknown Unknown Unknown
libpthread-2.17.s 00002AEB45472370 Unknown Unknown Unknown
wrf.exe 0000000000417EB7 module_domain_mp_ 1060 module_domain.f90
wrf.exe 000000000057E315 modify_io_masks_ 25013 module_domain.f90
wrf.exe 0000000001F23433 output_wrf_ 252 output_wrf.f90
wrf.exe 0000000001E4012A module_io_domain_ 392 module_io_domain.f90
wrf.exe 0000000001FD372F med_hist_out_ 943 mediation_integrate.f90
wrf.exe 0000000001FC2DC7 med_before_solve_ 65 mediation_integrate.f90
wrf.exe 000000000057E633 module_integrate_ 319 module_integrate.f90
wrf.exe 000000000041225E module_wrf_top_mp 326 module_wrf_top.f90
wrf.exe 0000000000411A55 MAIN__ 29 wrf.f90
wrf.exe 0000000000411A0E Unknown Unknown Unknown
libc-2.17.so 00002AEB456A0B35 __libc_start_main Unknown Unknown
wrf.exe 0000000000411929 Unknown Unknown Unknown

Oddly, if I double the history_interval (i.e., output every two hours), the simulation runs without any problem past the first hour and then crashes when it reaches the new history_interval:

Timing for main: time 2019-01-01_01:58:00 on domain 1: 13.71457 elapsed seconds
calculate MEGAN emissions at ktau, gmtp, tmidh = 60 1.000000 1.983333
photolysis_driver: called for domain 1
newnucbb mins 2.33E+02 5.00E-02 1.61E-15 1.25E-22 1.79E-22 0.00E+00
newnucbb maxs 2.90E+02 9.80E-01 1.97E-08 6.48E-13 3.12E-08 0.00E+00
newnucbb avgs 2.53E+02 3.62E-01 3.20E-10 3.36E-15 4.67E-11 0.00E+00
newnucbb hinuc 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
newnucbb dtnuc 1.20E+02
newnucbb ncnt 56000 0 0 0 0 0 0
coagbb ncntaa 56000 0 22 1739 0 0 0 22 0 0
coagbb ncntbb1 0 0 1494 0 0 0 0 0 0 0
coagbb ncntbb2 0 0 0 0 0 0 0 0
Timing for main: time 2019-01-01_02:00:00 on domain 1: 16.48836 elapsed seconds
forrtl: severe (174): SIGSEGV, segmentation fault occurred


I also tested other schemes and found the same issue with CBMZ-MOSAIC, while with RADM2/MADE/SORGAM the simulation completed successfully. Finally, I ran the same simulation (MOZART-MOSAIC) with v3.9 and everything worked fine.

I hope this helps to find the issue; any suggestion is welcome.

Best,
Alessandro
 