FATAL CALLED FROM FILE: <stdin> LINE: 1973 QG, QG(NK).LT.0

Arty

Member
Hello,

I'm running a new WRF simulation where I've changed the cumulus physics option from BMJ (cu_physics = 2) to KF (cu_physics = 1), using 64 vertical levels and two-way double-nested domains. While previous simulations using BMJ ran smoothly, the current run crashes, and based on the rsl.error.* files and some forum searches, the issue seems related to the convection scheme. However, I'm unsure of the exact root cause.

Notably, the first simulation day completes successfully—the output file is complete and uncorrupted.

I've attached one of the rsl.error.* files (snippet below), as well as my namelist.input, for further investigation.

Code:
taskid: 51 hostname: r1i6n8
 Ntasks in X           12 , ntasks in Y           12
--- WARNING: traj_opt is zero, but num_traj is not zero; setting num_traj to zero.
--- NOTE: grid_fdda is 0 for domain      1, setting gfdda interval and ending time to 0 for that domain.
--- NOTE: both grid_sfdda and pxlsm_soil_nudge are 0 for domain      1, setting sgfdda interval and ending time to 0 for that domain.
--- NOTE: obs_nudge_opt is 0 for domain      1, setting obs nudging interval and ending time to 0 for that domain.
--- NOTE: grid_fdda is 0 for domain      2, setting gfdda interval and ending time to 0 for that domain.
--- NOTE: both grid_sfdda and pxlsm_soil_nudge are 0 for domain      2, setting sgfdda interval and ending time to 0 for that domain.
--- NOTE: obs_nudge_opt is 0 for domain      2, setting obs nudging interval and ending time to 0 for that domain.
bl_pbl_physics /= 4, implies mfshconv must be 0, resetting
--- NOTE: RRTMG radiation is in use, setting:  levsiz=59, alevsiz=12, no_src_types=6
--- NOTE: num_soil_layers has been set to      5
WRF V3.6.1 MODEL
 *************************************
 Parent domain
 ids,ide,jds,jde            1         121           1         121
 ims,ime,jms,jme           24          47          34          57
 ips,ipe,jps,jpe           31          40          41          50
 *************************************
DYNAMICS OPTION: Eulerian Mass Coordinate
   alloc_space_field: domain            1 ,               33711148  bytes allocated
 RESTART run: opening wrfrst_d01_2016-05-01_00:00:00 for reading
 *************************************
 Nesting domain
 ids,ide,jds,jde            1         121           1         121
 ims,ime,jms,jme           21          51          31          60
 ips,ipe,jps,jpe           31          40          41          50
 INTERMEDIATE domain
 ids,ide,jds,jde           38          83          39          84
 ims,ime,jms,jme           40          62          44          67
 ips,ipe,jps,jpe           50          52          54          57
 *************************************
d01 2016-05-01_00:00:00  alloc_space_field: domain            2 ,               52517484  bytes allocated
 RESTART: nest, opening wrfrst_d02_2016-05-01_00:00:00 for reading
d01 2016-05-01_00:00:00 Input data processed for aux input   4 for domain   1
d01 2016-05-01_00:00:00 WRF restart, LBC starts at 2016-05-01_00:00:00 and restart starts at 2016-05-01_00:00:00
d01 2016-05-01_00:00:00 Found correct time, LBC matches the restart interval.
 Tile Strategy is not specified. Assuming 1D-Y
WRF TILE   1 IS     31 IE     40 JS     41 JE     50
WRF NUMBER OF TILES =   1
d02 2016-05-01_00:00:00  alloc_space_field: domain            2 ,                8187264  bytes allocated
d02 2016-05-01_00:00:00 Input data processed for aux input   4 for domain   2
 Tile Strategy is not specified. Assuming 1D-Y
WRF TILE   1 IS     31 IE     40 JS     41 JE     50
WRF NUMBER OF TILES =   1
d01 2016-05-01_06:00:00 Input data processed for aux input   4 for domain   1
d02 2016-05-01_06:00:00 Input data processed for aux input   4 for domain   2
d01 2016-05-01_12:00:00 Input data processed for aux input   4 for domain   1
d02 2016-05-01_12:00:00 Input data processed for aux input   4 for domain   2
d01 2016-05-01_18:00:00 Input data processed for aux input   4 for domain   1
d02 2016-05-01_18:00:00 Input data processed for aux input   4 for domain   2
d01 2016-05-02_00:00:00 Input data processed for aux input   4 for domain   1
d02 2016-05-02_00:00:00 Input data processed for aux input   4 for domain   2
d01 2016-05-02_06:00:00 Input data processed for aux input   4 for domain   1
d02 2016-05-02_06:00:00 Input data processed for aux input   4 for domain   2
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE:  <stdin>  LINE:    1973
QG, QG(NK).LT.0
-------------------------------------------
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 51

There are only three months left to complete the full simulation (which has already run successfully for 2.5 years), so I'd greatly appreciate any insights to help resolve this issue.

Thanks in advance!
 

Attachments

  • rsl.error.txt (3.9 KB)
  • namelist.input.txt (9 KB)

Although the error message is easy to find in module_cu_kfeta.F (see the extract below), the origin of the problem remains unclear to me, since the model had run fine for several months before crashing. I re-ran the previous month, but the run crashed again at the same point in the following month.

Code:
!...CHECK TO SEE IF MIXING RATIO DIPS BELOW ZERO ANYWHERE;  IF SO, BORROW
!...MOISTURE FROM ADJACENT LAYERS TO BRING IT BACK UP ABOVE ZERO...
!     
        DO NK=1,LTOP
          IF(QG(NK).LT.0.)THEN
            IF(NK.EQ.1)THEN                             ! JSK MODS
!              PRINT *,' PROBLEM WITH KF SCHEME:  ' ! JSK MODS
!              PRINT *,'QG = 0 AT THE SURFACE!!!!!!!'    ! JSK MODS
              CALL wrf_error_fatal ( 'QG, QG(NK).LT.0') ! JSK MODS
            ENDIF                                       ! JSK MODS
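
Read literally, the extract aborts only when the mixing ratio is negative at the lowest level (NK = 1); for any other level, the comment says moisture is borrowed from adjacent layers, but that part is not shown here. A minimal Python sketch of just this control flow follows. The function name `levels_needing_moisture` and the sample profile are hypothetical, and the borrowing step is deliberately left out because it is not part of the extract, so unlike the Fortran loop this sketch never modifies QG.

```python
def levels_needing_moisture(qg, ltop):
    """Return the 1-based levels NK (2..LTOP) where QG is negative.

    Raises RuntimeError when the lowest level (NK = 1) is negative, which is
    the case that reaches wrf_error_fatal in the Fortran extract.
    """
    flagged = []
    for nk in range(1, ltop + 1):      # DO NK=1,LTOP
        if qg[nk - 1] < 0.0:           # IF(QG(NK).LT.0.)THEN
            if nk == 1:                # IF(NK.EQ.1)THEN
                raise RuntimeError("QG, QG(NK).LT.0")
            # The real scheme borrows moisture from adjacent layers here;
            # that code is not in the extract, so the level is only recorded.
            flagged.append(nk)
    return flagged


# Hypothetical profile (kg/kg), slightly negative at NK = 2 only.
profile = [4.0e-3, -1.0e-6, 2.0e-3, 1.0e-3]
print(levels_needing_moisture(profile, ltop=4))
```

With a negative value at the first element instead, the function raises, which corresponds to the fatal message in the rsl.error file.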
 
Would you please recompile WRF in debug mode, then restart the run from the latest wrfrst file? In debug mode, the log file will tell exactly when and where the model crashes, which may give us some hints about what is wrong.
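
Once the run has crashed again, it can help to compare all rsl.error.* files and see which tasks report the fatal error and what model time each one had reached. The Python script below is only a sketch of that idea: the function names `summarize_rsl` and `scan_directory` are made up for illustration, and the regular expressions assume the log layout shown in the snippet posted above (lines starting with `d01 2016-05-02_06:00:00` and a `FATAL CALLED FROM FILE:` line followed by the message).

```python
import re
from pathlib import Path

# Assumed layouts, taken from the rsl.error snippet in this thread.
TIMESTAMP = re.compile(r"^(d\d{2}) (\d{4}-\d{2}-\d{2}_\d{2}:\d{2}:\d{2})")
FATAL = re.compile(r"FATAL CALLED FROM FILE:\s*(\S+)\s+LINE:\s*(\d+)")


def summarize_rsl(text):
    """Extract the last timestamp per domain and the fatal error, if any."""
    last_time = {}   # e.g. {"d01": "2016-05-02_06:00:00"}
    fatal = None     # (file, line) from the FATAL CALLED FROM FILE line
    message = None   # the line printed right after it
    lines = text.splitlines()
    for i, line in enumerate(lines):
        stamp = TIMESTAMP.match(line.strip())
        if stamp:
            last_time[stamp.group(1)] = stamp.group(2)
        hit = FATAL.search(line)
        if hit:
            fatal = (hit.group(1), int(hit.group(2)))
            if i + 1 < len(lines):
                message = lines[i + 1].strip()
    return {"last_time": last_time, "fatal": fatal, "message": message}


def scan_directory(run_dir="."):
    """Summarize every rsl.error.* file found in run_dir."""
    return {
        path.name: summarize_rsl(path.read_text(errors="replace"))
        for path in sorted(Path(run_dir).glob("rsl.error.*"))
    }


if __name__ == "__main__":
    for name, info in scan_directory().items():
        if info["fatal"]:
            print(name, info["fatal"], info["message"], info["last_time"])
```

Run from the WRF run directory, it lists only the tasks that hit a fatal error, which narrows down the tile, and therefore the part of the domain, where the problem occurs.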
 