
Segmentation Fault in WRF REAL.EXE Program

This post was from a previous version of the WRF&MPAS-A Support Forum. New replies have been disabled and if you have follow up questions related to this post, then please start a new thread from the forum home page.

jboustead

We are using WRFV3.9 compiled with Intel compilers and running on a Red Hat Linux Server with multiple cores.

We are using GRIB1 output from a previous WRF run (a 1 km resolution domain initialized with the NCEP HRRR model) to initialize the next hour's WRF run over the same geographical area covered by the first run. The run completes geogrid, ungrib, and metgrid (the entire WPS process) without any errors. When we then use the met_em files in real.exe, it fails with a segmentation fault, and the rsl.error file contains the error shown below.

From Google searches, it appears likely to be a skin temperature issue. When I view the skin temperature field in the met_em file, the top row is set to zero while the rest of the values look reasonable.

-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: LINE: 2946
grid%tsk unreasonable
Abort(1) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
real.exe 0000000002708724 for__signal_handl Unknown Unknown
libpthread-2.17.s 00002AE637D96630 Unknown Unknown Unknown
libpthread-2.17.s 00002AE637D9575B read Unknown Unknown
libmpi.so.12.0.0 00002AE6374CE76F Unknown Unknown Unknown
libmpi.so.12.0.0 00002AE6374CDB02 Unknown Unknown Unknown
libmpi.so.12.0.0 00002AE636DE8382 Unknown Unknown Unknown
libmpi.so.12.0.0 00002AE636ABA3C5 MPI_Abort Unknown Unknown
libmpifort.so.12. 00002AE63671A12D mpi_abort Unknown Unknown
real.exe 0000000000A57E46 Unknown Unknown Unknown
real.exe 00000000007E97AD Unknown Unknown Unknown
real.exe 00000000004792EF Unknown Unknown Unknown
real.exe 0000000000497785 Unknown Unknown Unknown
real.exe 0000000000411775 Unknown Unknown Unknown
real.exe 000000000040FF22 Unknown Unknown Unknown
libc-2.17.so 00002AE6384CB545 __libc_start_main Unknown Unknown
real.exe 000000000040FE29 Unknown Unknown Unknown

Any ideas why this single row in our met_em skin temperature field is zero, or whether another issue could be causing the failure?

Thanks in advance!
 
The issue may be in how the metgrid program is interpolating the skin temperature field from a grid that is essentially identical to the WRF model domain defined by geogrid. In the v4.1 METGRID.TBL file, the entry for SKINTEMP is as follows:
Code:
========================================
name=SKINTEMP
mpas_name=skintemp
        interp_option=sixteen_pt+four_pt+wt_average_4pt+wt_average_16pt+search
        masked=both
        interp_land_mask  = LANDSEA(1)
        interp_water_mask = LANDSEA(0)
        fill_missing=0.
========================================
I'd need to look further into the metgrid code to see exactly how the interpolation methods handle a grid point that is not surrounded by valid data in the input (intermediate) dataset, but as a quick test, could you try modifying the SKINTEMP entry so that it looks like the following?
Code:
========================================
name=SKINTEMP
mpas_name=skintemp
        interp_option=nearest_neighbor
        fill_missing=-999.
========================================
You'll only need to re-run metgrid and check whether there are any -999 values in the interpolated skin temperature field in your met_em files. If there are no -999 values, that would tell us that it is the interpolation methods that are the cause of the zero values in the top row of the interpolated field, in which case we may be able to make the interpolation methods more robust (or just use a nearest-neighbor interpolation, since we know the source and target grids are identical).
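For what it's worth, the check described above could be scripted along these lines. This is only a rough sketch: the function name `check_skintemp` is made up for illustration, and it assumes the SKINTEMP field has already been read from a met_em file into a plain 2-D list of floats (for example with a netCDF reader of your choice; that step is not shown). It reports any points equal to the fill value and any rows that are entirely zero.

```python
def check_skintemp(field, fill_value=-999.0, tol=1e-6):
    """Scan a 2-D field (list of rows of floats) for suspect values.

    Returns a tuple (fill_points, zero_rows):
      fill_points: list of (row, col) indices equal to fill_value
      zero_rows:   list of row indices whose values are all zero
    """
    fill_points = []
    zero_rows = []
    for j, row in enumerate(field):
        for i, value in enumerate(row):
            if abs(value - fill_value) < tol:
                fill_points.append((j, i))
        # A non-empty row made up only of zeros is flagged as suspect.
        if row and all(abs(v) < tol for v in row):
            zero_rows.append(j)
    return fill_points, zero_rows


if __name__ == "__main__":
    # Made-up 3 x 4 field in kelvin; the last row stands in for the
    # all-zero row reported in the original post.
    demo = [
        [285.1, 285.3, 284.9, 285.0],
        [284.8, 285.0, 284.7, 284.9],
        [0.0, 0.0, 0.0, 0.0],
    ]
    fills, zeros = check_skintemp(demo)
    print("fill points:", fills)
    print("all-zero rows:", zeros)
```

If neither list contains anything after re-running metgrid with the modified entry, that would be consistent with the interpolation methods being the source of the zero row.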
 
I made that change to METGRID.TBL and reran the simulation, and it worked just fine. So it appears, as you speculated, that the problem was the interpolation method used for skin temperature.

Is there any harm in just leaving the interpolation method for the skin temperature field set to nearest_neighbor?

Thank you very much for your help! I really appreciate it.
 