Model output vs Observation (Gauged stations)

kcmonaka

I have successfully run WRF using the tropical suite and now want to evaluate its performance by comparing the model output with gauge station data. The challenge is the common one of comparing point-based observations (from weather stations) with gridded model data.

To address this, I wrote the Python script below, which extracts the WRF-simulated precipitation from the grid point nearest to each station. The code runs without error, but I feel I am missing something in terms of accuracy or methodology, particularly regarding spatial representation and grid-point selection.


import xarray as xr
import pandas as pd
import numpy as np

# --- Define the station information ---
stations = [
    {"name": "Platjan Border Post", "lat": -22.48642, "lon": 28.83516},
    {"name": "Sechele Primary School", "lat": -20.7541944444444, "lon": 27.357277777777778},
    {"name": "Gobojango Primary School", "lat": -21.8301388888889, "lon": 28.723111111111113},
    {"name": "Senyawe Primary School", "lat": -20.7665833333333, "lon": 27.6896388888889},
    {"name": "Semolale Primary School", "lat": -21.8761666666667, "lon": 28.830222222222222},
    {"name": "Tobane Crops", "lat": -21.9110285374283, "lon": 28.0416840644143},
    {"name": "Bainsdrift Police Station", "lat": -22.483333333333334, "lon": 28.733333333333334},
    {"name": "Motshegaletau Primary School", "lat": -22.34, "lon": 26.33},
    {"name": "Agosi Primary School (Bobonong)", "lat": -21.99825, "lon": 28.41638888888889},
    {"name": "L.O Dialwa`s Farm (Lerala)", "lat": -22.78897, "lon": 27.76186},
    {"name": "Moss Farm (Molalatau)", "lat": -22.07346, "lon": 28.59792},
    {"name": "Paje Primary School", "lat": -22.15, "lon": 26.47},
    {"name": "Reuben Motswakae Farm (Tonota)", "lat": -21.43783, "lon": 27.46338},
    {"name": "Mandunyane Primary School", "lat": -21.3841388888889, "lon": 27.415305555555555},
    {"name": "Selibe Phikwe Met Station", "lat": -22.0549444444444, "lon": 27.8185},
    {"name": "Senete Primary School", "lat": -20.3224722222222, "lon": 27.131222222222224},
    {"name": "Letlhakane Prison", "lat": -21.4066285969262, "lon": 25.5898117598271},
    {"name": "Thabala Primary School", "lat": -22.29, "lon": 26.25},
    {"name": "Ramokgwebana Border Gate", "lat": -20.5500833333333, "lon": 27.720861111111113},
    {"name": "Tshesebe Primary School", "lat": -20.7260833333333, "lon": 27.581666666666667},
    {"name": "Sepako Wildlife Camp", "lat": -19.86488889, "lon": 26.49155556},
    {"name": "Letlhakane Met Station", "lat": -21.42, "lon": 25.62},
    {"name": "Moreomabele Primary School", "lat": -22.02, "lon": 27.16},
    # Latitude and longitude were swapped for this station; corrected here
    {"name": "Pandamatenga Met Station", "lat": -18.5445, "lon": 25.63577777777778},
    {"name": "Masunga Primary School", "lat": -20.6203055555556, "lon": 27.445972222222224},
    {"name": "Swaneng Hill Snr School (Serowe)", "lat": -22.4141, "lon": 26.7474},
    {"name": "Serowe Police Station", "lat": -22.3871944444444, "lon": 26.696916666666667},
    {"name": "Ditladi Primary School", "lat": -21.4618611111111, "lon": 27.53322222222222},
    {"name": "Sua Junior Secondary School (Nata)", "lat": -20.216666666666665, "lon": 26.166666666666668},
    {"name": "P.G Matante Airport Met Office", "lat": -21.15, "lon": 27.483333333333334},
    {"name": "Xanagas Research", "lat": -22.1625833333333, "lon": 20.28811111111111},
    {"name": "Rakops Junior Secondary School", "lat": -21.0273888888889, "lon": 24.412222222222223},
]

# --- Load the WRF NetCDF file ---
nc_file = "wrfout_d02_2023-12-16_daily_precip.nc"
ds = xr.open_dataset(nc_file)

# --- Checking required variables ---
if "XLAT" not in ds.variables or "XLONG" not in ds.variables:
    raise KeyError("Variables 'XLAT' or 'XLONG' are missing in the NetCDF file.")

# Extract 2D latitude and longitude arrays
lats = ds["XLAT"].values.squeeze()
lons = ds["XLONG"].values.squeeze()

# --- Load precipitation ---
precip_var = "TOTAL_PRECIP" # Adjust if your variable name differs
if precip_var not in ds.variables:
    raise KeyError(f"Variable '{precip_var}' not found in the NetCDF file.")

precip = ds[precip_var].values.squeeze()

# --- Prepare list for storing extracted station data ---
station_data = []

# --- Process each station ---
for station in stations:
    name = station["name"]
    lat = station["lat"]
    lon = station["lon"]

    # Calculating Euclidean distance in degrees
    distance = np.sqrt((lats - lat)**2 + (lons - lon)**2)

    # Finding the nearest grid point
    nearest_lat_idx, nearest_lon_idx = np.unravel_index(np.argmin(distance), distance.shape)

    # Extract precipitation at the nearest grid point
    precip_value = precip[nearest_lat_idx, nearest_lon_idx]

    # Append station data
    station_data.append({
        "Station": name,
        "Latitude": lat,
        "Longitude": lon,
        "Precipitation (mm)": precip_value
    })

# --- Convert to DataFrame and save ---
df = pd.DataFrame(station_data)

# Save to CSV
output_csv = "station_precip_extracted.csv"
df.to_csv(output_csv, index=False)

print(f"\n✅ Station precipitation values saved to '{output_csv}'")
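One variation on the script above that I am considering is to measure distance in kilometres along the great circle rather than in degrees, and to report that distance so stations far from any grid point (or outside the domain) can be flagged. The sketch below is only illustrative: `haversine_km`, `nearest_grid_point` and `idw_value` are hypothetical helper names of my own, not functions from WRF or any package, and it assumes 2D latitude/longitude arrays like `XLAT` and `XLONG` and a single 2D precipitation field. The inverse-distance weighting over the `k` nearest grid points is one simple alternative to the single nearest point, not necessarily the recommended one.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; inputs in degrees, scalars or arrays."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def nearest_grid_point(lats2d, lons2d, lat, lon):
    """Return (row, col, distance_km) of the grid point closest to a station."""
    dist = haversine_km(lats2d, lons2d, lat, lon)
    j, i = np.unravel_index(np.argmin(dist), dist.shape)
    return int(j), int(i), float(dist[j, i])


def idw_value(field2d, lats2d, lons2d, lat, lon, k=4, power=2.0):
    """Inverse-distance-weighted estimate from the k nearest grid points."""
    dist = haversine_km(lats2d, lons2d, lat, lon).ravel()
    values = np.asarray(field2d, dtype=float).ravel()
    idx = np.argsort(dist)[:k]          # indices of the k closest points
    d = dist[idx]
    if d[0] < 1e-6:                     # station coincides with a grid point
        return float(values[idx[0]])
    w = 1.0 / d ** power                # closer points get larger weights
    return float(np.sum(w * values[idx]) / np.sum(w))
```

With the arrays from the script, this would be called as `nearest_grid_point(lats, lons, lat, lon)` inside the station loop; a large returned distance compared with the grid spacing suggests the station lies outside the domain or has wrong coordinates.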



My specific questions are:

  1. Is extracting the nearest grid point sufficient for comparing WRF output with point observations, or should I be interpolating the gridded data to the exact station location (e.g., bilinear or inverse distance weighting)?
  2. Are there better practices or recommended approaches in the literature or your experience for handling point vs. grid data comparison in WRF validation?
  3. Are there any diagnostic tools or packages you’d recommend to quantify the accuracy (e.g., bias, RMSE, correlation) of the model against station data once the extraction is done?
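To make question 3 concrete, these are the scores I have in mind, computed directly with NumPy. This is a minimal sketch under my own assumptions: `verification_stats` is a hypothetical helper (not from any WRF or verification package), the two inputs are matched model and gauge values for the same stations and accumulation period, and pairs with missing values are simply dropped.

```python
import numpy as np


def verification_stats(model, obs):
    """Basic scores for paired model/observation values (NaN pairs dropped)."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    ok = np.isfinite(model) & np.isfinite(obs)   # keep only complete pairs
    m, o = model[ok], obs[ok]
    err = m - o                                  # positive = model too wet
    return {
        "n": int(m.size),
        "bias": float(err.mean()),               # mean error
        "mae": float(np.abs(err).mean()),        # mean absolute error
        "rmse": float(np.sqrt((err ** 2).mean())),
        "corr": float(np.corrcoef(m, o)[0, 1]),  # Pearson correlation
    }
```

For example, `verification_stats(df["Precipitation (mm)"], gauge_values)` would score the extracted values against a hypothetical `gauge_values` sequence listed in the same station order.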

Any feedback or suggestions to improve the methodology would be greatly appreciated.
 