CONUS benchmark problems

This post was from a previous version of the WRF&MPAS-A Support Forum. New replies have been disabled and if you have follow up questions related to this post, then please start a new thread from the forum home page.

steveb

Hi all,

I'm trying to run the 12km CONUS benchmark.

I'm getting a segfault, preceded by a warning that points are exceeding cfl=2. So it looks like the run is simply unstable, but I'd have assumed the benchmark was set up so that it would work?

This is using WRF 3.9.1.1. To run the benchmark I'm using the files from test/em_real, with data files from http://www2.mmm.ucar.edu/WG2bench/conus12km_data_v3/ over the top (e.g. replacing namelist.input). I had to modify the namelist to add
Code:
use_baseparam_fr_nml = .t.
to the &dynamics section, but it is otherwise unchanged.

I'm then using:

Code:
mpirun ./wrf.exe

This is under slurm, so you can't see the process control, but it's on 8 nodes with 64 tasks per node = 512 cores/processes total. I did try smaller runs first with the same result, but I can see there are benchmark results for 512 cores, so it seemed a reasonable place to start. Nodes have 64 cores with 128GB RAM.

Example of rsl.error.* file with segfault:
 
The forum software rejected the full rsl.error.* listing ("Forbidden. Contains contacts. Message seems to be spam."), so I can't post the whole thing.

The end of it was:

LBC for restart: Starting valid date = 2001-10-25_00:00:00, Ending valid date = 2001-10-25_06:00:00
LBC for restart: Found the correct bounding LBC time periods for restart time = 2001-10-25_00:00:00
Tile Strategy is not specified. Assuming 1D-Y
WRF TILE 1 IS 55 IE 81 JS 291 JE 300
WRF NUMBER OF TILES = 1
d01 2001-10-25_00:00:00 1997 points exceeded cfl=2 in domain d01 at time 2001-10-25_00:00:00 hours
d01 2001-10-25_00:00:00 MAX AT i,j,k: 67 295 34 vert_cfl,w,d(eta)= 2166.30884 -326.380646 1.25799999E-02

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
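
For anyone triaging similar output across many rsl.error.* files, here is a minimal Python sketch that pulls the indices and values out of a "MAX AT i,j,k" line. The helper name `parse_cfl_max` is hypothetical, and the pattern assumes the line layout matches the single example quoted above; other WRF versions may format it differently.

```python
import re

# Pattern modelled on the one CFL line quoted in this thread (an assumption,
# not a documented WRF format).
CFL_MAX = re.compile(
    r"MAX AT i,j,k:\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"vert_cfl,w,d\(eta\)=\s+(\S+)\s+(\S+)\s+(\S+)"
)

def parse_cfl_max(line):
    """Return the grid indices and CFL diagnostics from one log line, or None."""
    m = CFL_MAX.search(line)
    if m is None:
        return None
    i, j, k = (int(g) for g in m.groups()[:3])
    vert_cfl, w, deta = (float(g) for g in m.groups()[3:])
    return {"i": i, "j": j, "k": k, "vert_cfl": vert_cfl, "w": w, "deta": deta}

# The line from the rsl.error.* excerpt above.
line = ("d01 2001-10-25_00:00:00 MAX AT i,j,k: 67 295 34 "
        "vert_cfl,w,d(eta)= 2166.30884 -326.380646 1.25799999E-02")
print(parse_cfl_max(line))
```

In the excerpt, the reported point (i=67, j=295) falls inside the tile bounds printed just before it (IS 55, IE 81, JS 291, JE 300), and the vertical CFL value is far above the cfl=2 threshold at the very first time step.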
 
Hi,
First, I would like to apologize for the long delay in responding to this inquiry. It seems it was overlooked. Thank you for your patience.
Over the years there have been code updates that may have caused these particular input files, and this particular namelist, to become unstable with CFL errors. The benchmark files you are using are more than a decade old and were created with a version of WRF much older than V3.9.1.1. I would advise decreasing your time step (time_step in the namelist, in seconds) to something more like 4xDX (or maybe even 3xDX), with DX in km, to see if that helps to overcome the CFL error.
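
As a quick illustration of that arithmetic, here is a minimal Python sketch that turns an "N x DX" factor into a time step in seconds. The function name `suggested_time_step` is hypothetical, and the 12000 m grid spacing is assumed from the benchmark's name (12km CONUS); check dx in your own namelist.input before relying on it.

```python
def suggested_time_step(dx_m, factor):
    """Return a time step in whole seconds as factor * DX, with DX in km."""
    dx_km = dx_m / 1000.0
    return int(factor * dx_km)

dx_m = 12000.0  # assumed grid spacing for the 12 km CONUS case, in metres

# Compare the factors mentioned in the reply (4 and 3) with a factor of 6.
for factor in (6, 4, 3):
    print(f"{factor} x DX -> time_step = {suggested_time_step(dx_m, factor)} s")
```

For a 12 km grid this gives 48 s for 4xDX and 36 s for 3xDX. The resulting value would go in the time_step entry of the &domains section of namelist.input.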
 