
WPS geogrid.exe

b17jps

geogrid.exe runs to completion, but the following IEEE floating-point exception note is printed after the success banner. Platform: x86-64 Linux, built with gfortran and gcc.

>geogrid.exe
Parsed 50 entries in GEOGRID.TBL
Processing domain 1 of 3
INFORM: Using default interpolator sequence for HGT_M.
INFORM: Using default data source for HGT_M.
INFORM: Using default interpolator sequence for LANDUSEF.
...
Processing OL2SS
Processing OL3SS
Processing OL4SS
Processing BATHYMETRY
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Successful completion of geogrid. !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG

I see the expected geogrid NetCDF output files, but could anyone please explain this message? Can it be ignored? I enabled debugging with the GNU toolset and found that a SIGFPE occurs.
Why would HDF5/NetCDF trigger this floating-point exception during initialization, in the file-create phase?


Stack backtrace from gdb for geogrid.exe:
(gdb) where
#0 0x00007ffff51de7c5 in H5T__init_native_float_types () at H5Tinit_float.c:531
#1 0x00007ffff514176f in H5T_init () at H5T.c:752
#2 0x00007ffff520eff8 in H5VL_init_phase2 () at H5VLint.c:198
#3 0x00007ffff4df3b2f in H5_init_library () at H5.c:268
#4 0x00007ffff4ef2d1e in H5Eset_auto2 (estack_id=0, func=0x0, client_data=0x0) at H5E.c:1508
#5 0x00007ffff7788269 in set_auto (func=0x0, client_data=0x0) at /shared/data1/Projects/SESD-04777-08/Build_WRF/LIBRARIES/netcdf-c/libhdf5/hdf5internal.c:65
#6 0x00007ffff778827e in nc4_hdf5_initialize () at /shared/data1/Projects/SESD-04777-08/Build_WRF/LIBRARIES/netcdf-c/libhdf5/hdf5internal.c:76
#7 0x00007ffff7793bf4 in NC_HDF5_initialize () at /shared/data1/Projects/SESD-04777-08/Build_WRF/LIBRARIES/netcdf-c/libhdf5/hdf5dispatch.c:128
#8 0x00007ffff7711880 in nc_initialize () at /shared/data1/Projects/SESD-04777-08/Build_WRF/LIBRARIES/netcdf-c/liblib/nc_initialize.c:107
#9 0x00007ffff7713c1e in NC_create (path0=0x7fffffff6a00 "./geo_em.d01.nc", cmode=4096, initialsz=0, basepe=0, chunksizehintp=0x0, useparallel=0, parameters=0x0, ncidp=0x7fffffff6a44)
at /shared/data1/Projects/SESD-04777-08/Build_WRF/LIBRARIES/netcdf-c/libdispatch/dfile.c:1857
#10 0x00007ffff7713237 in nc__create (path=0x7fffffff6a00 "./geo_em.d01.nc", cmode=4096, initialsz=0, chunksizehintp=0x0, ncidp=0x7fffffff6a44)
at /shared/data1/Projects/SESD-04777-08/Build_WRF/LIBRARIES/netcdf-c/libdispatch/dfile.c:475
#11 0x00007ffff77131f0 in nc_create (path=0x7fffffff6a00 "./geo_em.d01.nc", cmode=4096, ncidp=0x7fffffff6a44)
at /shared/data1/Projects/SESD-04777-08/Build_WRF/LIBRARIES/netcdf-c/libdispatch/dfile.c:402
#12 0x00007ffff7b298e8 in nf_create (path='./geo_em.d01.nc', cmode=4096, ncid=0, _path=15)
at /shared/data1/Projects/SESD-04777-08/Build_WRF/LIBRARIES/netcdf-fortran-4.5.2/fortran/nf_control.F90:64
#13 0x00000000004ba088 in ext_ncd_open_for_write_begin_ ()
#14 0x0000000000435347 in output_module::output_init (nest_number=1, title=<error reading variable: Asked for position 0 of stack, stack only has 0 elements on it.>,
datestr='0000-00-00_00:00:00', grid_type='C', dynopt=2, corner_lats=..., corner_lons=..., start_dom_1=<optimized out>, end_dom_1=<optimized out>, start_dom_2=<optimized out>,
end_dom_2=<optimized out>, start_patch_1=<optimized out>, end_patch_1=<optimized out>, start_patch_2=<optimized out>, end_patch_2=<optimized out>, start_mem_1=<optimized out>,
end_mem_1=<optimized out>, start_mem_2=<optimized out>, end_mem_2=<optimized out>, extra_col=<optimized out>, extra_row=<optimized out>, _title=<optimized out>, _datestr=<optimized out>,
_grid_type=<optimized out>) at output_module.f90:319
#15 0x000000000045a486 in process_tile_module::process_tile (which_domain=1, grid_type='C', dynopt=2, dummy_start_dom_i=1, dummy_end_dom_i=<optimized out>, dummy_start_dom_j=1,
dummy_end_dom_j=118, dummy_start_patch_i=1, dummy_end_patch_i=121, dummy_start_patch_j=1, dummy_end_patch_j=118, extra_col=.TRUE., extra_row=.TRUE., _grid_type=1)
at process_tile_module.f90:320
#16 0x0000000000404dc3 in geogrid () at geogrid.f90:83
#17 main (argc=<optimized out>, argv=<optimized out>) at geogrid.f90:19
#18 0x00007ffff5993505 in __libc_start_main () from /lib64/libc.so.6
#19 0x0000000000403239 in _start ()



DETAILS:
[l1057678@xxxxxx WPS]$ uname -a
Linux xxxxxx 3.10.0-1062.el7.x86_64 #1 SMP Wed Aug 7 18:08:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
 

Attachments

  • namelist.wps (2 KB)
Hi,
Yes, you can ignore that message. I'm not certain why it is printed sometimes; I believe it is a long-standing quirk in the code going back several years. I do know that it's okay to proceed and to ignore the message.
 
Please note, I fixed the problem, detailed here:

#WPS heap bug FIX
#Solution: modify the F90 allocate() call to initialize the heap memory correctly
```
$ git diff geogrid/src/proc_point_module.F
diff --git a/geogrid/src/proc_point_module.F b/geogrid/src/proc_point_module.F
index e4b44f9..d7275ea 100644
--- a/geogrid/src/proc_point_module.F
+++ b/geogrid/src/proc_point_module.F
@@ -149,12 +149,13 @@ module proc_point_module

if (istatus /= 0) return

- allocate(where_maps_to(src_min_x:src_max_x,src_min_y:src_max_y,2))
- do i=src_min_x,src_max_x
- do j=src_min_y,src_max_y
- where_maps_to(i,j,1) = NOT_PROCESSED
- end do
- end do
+ !allocate(where_maps_to(src_min_x:src_max_x,src_min_y:src_max_y,2))
+ allocate(where_maps_to(src_min_x:src_max_x,src_min_y:src_max_y,2), SOURCE=NOT_PROCESSED) !JPS DEBUGGING
+ !do i=src_min_x,src_max_x
+ ! do j=src_min_y,src_max_y
+ ! where_maps_to(i,j,1) = NOT_PROCESSED
+ ! end do
+ !end do

call process_categorical_block(src_array, istagger, where_maps_to, &
src_min_x+src_npts_bdr, src_min_y+src_npts_bdr, src_min_z, &
```
 
It took some digging, but the problem is related to F90/C heap allocation and how the compiler in use handles it (or fails to). With the GNU tools, heap memory is not automatically initialized. This can cause pointer and memory problems, so it is best to resolve it. The two do loops that set the where_maps_to array to NOT_PROCESSED were commented out because allocate(array, source=xxx) now replaces them and sets the values in the newly allocated array automatically. Note that the original loops only assigned where_maps_to(i,j,1), so they left where_maps_to(i,j,2) unset, whereas SOURCE= fills every element.
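
The same idea can be sketched in C, the language of the HDF5/NetCDF layers in the backtrace: `malloc`, like a plain Fortran `allocate`, returns memory with indeterminate contents, so every element has to be written before it is read. This is an analogy under stated assumptions, not WPS code; `alloc_filled`, `count_not_equal`, and the fill value passed by the caller are hypothetical (the actual value of NOT_PROCESSED is not shown in this thread).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate nx*ny*nplanes ints and set every element to `fill`,
 * roughly what allocate(..., SOURCE=fill) does in Fortran.
 * Returns NULL on a zero extent, size overflow, or allocation failure. */
int *alloc_filled(size_t nx, size_t ny, size_t nplanes, int fill)
{
    if (nx == 0 || ny == 0 || nplanes == 0) return NULL;
    if (ny > SIZE_MAX / nx) return NULL;
    if (nplanes > SIZE_MAX / (nx * ny)) return NULL;

    size_t n = nx * ny * nplanes;
    if (n > SIZE_MAX / sizeof(int)) return NULL;

    int *a = malloc(n * sizeof *a);   /* contents are indeterminate here */
    if (a == NULL) return NULL;

    /* Fill all planes, not just the first one. */
    for (size_t k = 0; k < n; ++k) {
        a[k] = fill;
    }
    return a;
}

/* Count the elements of a[0..n-1] that differ from `value`. */
size_t count_not_equal(const int *a, size_t n, int value)
{
    size_t count = 0;
    for (size_t k = 0; k < n; ++k) {
        if (a[k] != value) ++count;
    }
    return count;
}
```

A caller would allocate with a sentinel, for example `alloc_filled(nx, ny, 2, -1)`, and release the array with `free` when done. Filling at allocation time means no plane can be read before it has a defined value, which is the property the SOURCE= change provides.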
 