Hi,
I'm trying to run a specific case study but I can't get this simulation to finish writing the initialization files. I've tried a few different model versions, compilers, and CPUs. The model doesn't integrate through any timestep, it just crashes as it tries to write the 1st wrfout.
With 16 processors from 2 CPUs (8+8) I get this error:
Code:
forrtl: severe (408): fort: (2): Subscript #4 of the array SCALAR has value 2 which is greater than the upper bound of 1
Image PC Routine Line Source
wrf.exe 000000000CF3DBCF Unknown Unknown Unknown
wrf.exe 0000000001097AE9 force_domain_em_p 11088 module_dm.f90
With 12 processors from 1 CPU, I get this:
Code:
forrtl: error (73): floating divide by zero
Image PC Routine Line Source
wrf.exe 000000000CF465FB Unknown Unknown Unknown
libpthread.so.0 00002B33D88A4B00 Unknown Unknown Unknown
wrf.exe 00000000091E72C3 module_diffusion_ 6963 module_diffusion_em.f90
The specified line number 6963 in module_diffusion refers to this statement:
Code:
rdzw(i,k,j) = 1.0 / ( z_at_w(i,k+1,j) - z_at_w(i,k,j) )
I calculated z_at_w and rdzw from wrfinput_d01 and wrfinput_d02 and they look ok. You'll find the netcdf files with these variables in the tarball for the real.exe files.
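A minimal sketch of how the layer-thickness check on z_at_w could be done for a single column, since the division in the rdzw statement fails whenever two adjacent w levels have the same height. It assumes z_at_w is obtained as (PH + PHB) / g with g = 9.81 m s^-2; the function names and the sample column are hypothetical and only illustrate the test, they are not taken from the WRF source or from the attached files.

```python
G = 9.81  # m s^-2; assumed value of gravitational acceleration


def height_at_w(ph, phb, g=G):
    """Height (m) of each w level from perturbation (ph) and
    base-state (phb) geopotential, assuming z = (ph + phb) / g."""
    return [(p + pb) / g for p, pb in zip(ph, phb)]


def degenerate_layers(z_at_w, tol=0.0):
    """Return (k, dz) for every layer whose thickness
    z_at_w[k+1] - z_at_w[k] is <= tol. k is 0-based here,
    whereas the Fortran index in the traceback is 1-based."""
    bad = []
    for k in range(len(z_at_w) - 1):
        dz = z_at_w[k + 1] - z_at_w[k]
        if dz <= tol:
            bad.append((k, dz))
    return bad


# Hypothetical column: levels 2 and 3 share the same geopotential,
# so the layer between them has zero thickness.
phb = [0.0, 981.0, 1962.0, 1962.0, 3924.0]
ph = [0.0, 0.0, 0.0, 0.0, 0.0]

z = height_at_w(ph, phb)
print(degenerate_layers(z))  # -> [(2, 0.0)]
```

Running the same test over every (i, j) column of both domains, rather than a sample of points, matters because a single zero-thickness layer is enough to trigger the floating divide by zero.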
The source code for the attached test runs is release-v4.2.2 (1e93b7e3), compiled with intel/19.1.1 and impi on Cheyenne.
I'm including the relevant files from real.exe, and from wrf.exe for the tests with 12 and 16 CPUs. The tarball names with "-D" correspond to executables compiled in debug mode with -D, and "noD" corresponds to the standard compilation.
I also tried release-v4.0.1 and release-v4.0.3, compiled with gnu and intel, but I'm not including the test files for these.
The input data to WPS is from HRRR v3 at pressure levels, downloaded from http://hrrr.chpc.utah.edu/. Let me know if you'd like to see the WPS files.
I'd really appreciate any tip on how to identify the issue. Thank you so much!