
Real.exe quits without error code

rpkamakura

I am trying to run a quick (1-day) simulation with real data, including SST. I am just testing the setup to prepare for longer (~1-3 week) simulations where I am specifically interested in understanding the influence of different SST scenarios, so I am running with sst_update on. [Edit for more context: I am also running it with urban data, since the domain covers the Houston/Galveston Bay area; not sure if that is relevant.]

I managed to run WPS (geogrid, ungrib, and metgrid) successfully. However, when I try to run real.exe it terminates after ~30 seconds without any informative error codes. I have attached my rsl.error and rsl.out files along with the namelist.input file as well as a screenshot of the output in the terminal. I will detail more about the run set-up below.

  • Running in Windows PowerShell (I have a Windows machine)
  • WRF compiled with dmpar GNU (gfortran/gcc) [option 34]
  • WPS compiled dmpar with gfortran [option 3]
  • I converted the NetCDF file of the high-resolution SST to the intermediate file format
I added the text below to METGRID.TBL to deal with the added SST file (from GHRSST), and from looking at the met_em files it seems to have worked correctly.

========================================
name=SST
interp_option=wt_average_4pt
fill_missing=0.
interp_mask=mask(1)
masked=land
missing_value=-1e+30
flag_in_output=FLAG_SST
========================================

I am not sure where to even start debugging, would love any guidance y'all might have!
 

Attachments

  • rsl.error.0000 (5.9 KB)
  • rsl.out.0000 (541.8 KB)
  • namelist.input (4.2 KB)
  • Termination_message.png (47.3 KB)
Please take a look at all the RSL files you have for this case. The error messages could be in any of your RSL files, not necessarily in rsl.out.0000 or rsl.error.0000. We need some helpful error messages to figure out what is wrong.
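Checking every RSL file by hand gets tedious on multi-processor runs, so here is a minimal Python sketch that scans all rsl.* files in a run directory for suspicious lines. The function name `scan_rsl` and the keyword list are assumptions made for illustration, not part of WRF; adjust the pattern to whatever your build actually prints.

```python
import re
from pathlib import Path

# Keyword list is an assumption; extend it to match the messages your build emits.
PATTERN = re.compile(r"error|fatal|sigsegv|segmentation|killed", re.IGNORECASE)


def scan_rsl(run_dir="."):
    """Return (file name, line number, text) for each suspicious line in rsl.* files."""
    hits = []
    for path in sorted(Path(run_dir).glob("rsl.*")):
        # errors="replace" keeps the scan going if a file contains odd bytes
        with open(path, errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if PATTERN.search(line):
                    hits.append((path.name, lineno, line.rstrip()))
    return hits


if __name__ == "__main__":
    for name, lineno, text in scan_rsl("."):
        print(f"{name}:{lineno}: {text}")
```

If this prints nothing, the RSL files really contain none of those keywords, and the cause is more likely outside the model output (for example, the process being stopped by the operating system).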

By the way, what data did you ungrib to run this case?
 
Thanks for the suggestion! I looked at all the files (it wouldn't let me attach them here) and still didn't see an error message anywhere. I also ran it again just with 1 processor to make the output easier to see and didn't see an error message in those rsl files either (attached), and it just prints out "killed" in the terminal.

As for the ungrib data, I am using the NAM data. I am also using the high resolution static data from the static data download site.
 

Attachments

  • rsl.error.0000 (5.9 KB)
  • rsl.out.0000 (6.7 MB)
There is no error message in your rsl files. I suspect this issue is related to the Windows machine you are using. WPS/WRF are designed to run on Linux systems and are not supported on Windows.

Are you able to run the model on a Linux cluster or HPC system?
 
Yeah, that is entirely possible. I have run WRF successfully on this computer (with Windows Subsystem for Linux), so I can run a version of this model without sst_update on, but perhaps with sst_update it is too much?

I do have access to an HPC cluster and can run it there. I was just trying to make sure everything worked first since I have limited compute resources, but since the code otherwise seems fine I can just transfer it over now. Thanks!
 
Sorry for being a bit slow, but I have finally gotten around to running it on the computing cluster and have a few updates:

1) There was definitely an issue with the SST mask. I am not sure I have completely fixed it, but it seems that instead of providing a separate mask, the GHRSST data just gives "masked out" regions the SST value -32768 (the most negative value a 16-bit signed integer can hold). Taking that into account, I have been able to run real.exe both on the computing cluster and on my personal machine.
2) I still cannot get wrf.exe to run correctly due to a segmentation fault that I suspect is related to the SST data.
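
For reference, here is a minimal Python sketch of one way to handle the fill value from point 1 before writing the intermediate file. It is an illustration under stated assumptions, not the exact script used in this thread: the function name `unpack_sst` is hypothetical, the data are assumed to arrive as packed 16-bit integers in a 2D list, and `scale_factor` and `add_offset` must be read from the attributes of your own file. The replacement value matches the missing_value=-1e+30 in the METGRID.TBL entry above.

```python
GHRSST_FILL = -32768   # fill value reported in this thread
WPS_MISSING = -1.0e30  # same value as missing_value in the METGRID.TBL entry


def unpack_sst(raw, scale_factor, add_offset):
    """Convert packed integer SST to physical values, mapping fill cells to WPS_MISSING."""
    return [
        [
            WPS_MISSING if value == GHRSST_FILL else value * scale_factor + add_offset
            for value in row
        ]
        for row in raw
    ]


if __name__ == "__main__":
    # Illustrative numbers only; take scale_factor/add_offset from your file.
    packed = [[GHRSST_FILL, 0], [100, GHRSST_FILL]]
    print(unpack_sst(packed, scale_factor=0.01, add_offset=273.15))
```

The point is to test for the fill value before applying scale and offset, so that masked cells never turn into large negative temperatures that metgrid would treat as valid data.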

I am going back to re-run metgrid, and then real.exe and wrf.exe, with a slightly different approach to the SST masking, but I have had unrelated challenges with the computing cluster. I will update this thread if I figure it out.
 