Segmentation Fault in wrf.exe Run (WRF v4.6.1, ERA5 Data, 4 Domains at High Resolution)

Dear WRF Community,

Greetings of the day.

I am currently working with WRF version 4.6.1, running a high-resolution nested simulation driven by ERA5 reanalysis data. The simulation uses four domains with a 1:3 nesting ratio, giving grid spacings of 9, 3, 1, and 0.333 km.

The preprocessing steps (geogrid, ungrib, metgrid, and real.exe) completed without any issues. However, wrf.exe fails with a segmentation fault. I tried several time steps (90, 60, and 54 s), and the segmentation fault persists. The error message is as follows:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffe2e5d8b20)
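
As a rough cross-check of the numbers above, the sketch below derives the grid spacing of each nest from the 9 km parent and the 1:3 ratio, and applies the commonly cited WRF rule of thumb that time_step (in seconds) should not exceed about 6 times the grid spacing in km. The helper names are illustrative and not part of WRF, and the per-nest limits only indicate what the rule would imply if it were applied to each domain separately.

```python
# Sketch: nest grid spacings and the rule-of-thumb time-step ceiling.
# Function names are hypothetical helpers, not WRF utilities.

def nest_spacings(dx_parent_km, ratio, n_domains):
    """Grid spacing (km) of each domain for a constant nesting ratio."""
    return [dx_parent_km / ratio**n for n in range(n_domains)]

def max_time_step_s(dx_km, factor=6.0):
    """Rule-of-thumb ceiling: time_step (s) <= factor * dx (km)."""
    return factor * dx_km

spacings = nest_spacings(9.0, 3, 4)
for domain, dx in enumerate(spacings, start=1):
    limit = max_time_step_s(dx)
    print(f"d{domain:02d}: dx = {dx:.3f} km, time step <= {limit:.1f} s")
```

Under this rule the ceiling for the 9 km parent domain is 54 s, so of the three values tried only 54 s falls within it. Unstable runs often need a value well below the ceiling.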

I have attached the following files for your reference:
  • namelist.input and namelist.wps
  • The relevant excerpt from rsl.error.0158:
    • WRF TILE 1 IS 686 IE 771 JS 830 JE 893
      WRF NUMBER OF TILES = 1
      Tile Strategy is not specified. Assuming 1D-Y
      WRF TILE 1 IS 206 IE 231 JS 269 JE 289
      WRF NUMBER OF TILES = 1
      d03 2003-08-05_00:00:48 7 points exceeded v_cfl = 2 in domain d03 at time 2003-08-05_00:00:48 hours
      d03 2003-08-05_00:00:48 Max W: 732 846 3 W: -49.45 w-cfl: 2.93 dETA: 0.01
      d03 2003-08-05_00:00:48 22 points exceeded v_cfl = 2 in domain d03 at time 2003-08-05_00:00:48 hours
      d03 2003-08-05_00:00:48 Max W: 733 846 3 W: 62.23 w-cfl: 5.44 dETA: 0.01
      d03 2003-08-05_00:00:48 66 points exceeded v_cfl = 2 in domain d03 at time 2003-08-05_00:00:48 hours
      d03 2003-08-05_00:00:48 Max W: 732 846 2 W: -131.04 w-cfl: 10.45 dETA: 0.01
      [mtx24:1204522:0:1204522] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffe2e5d7b60)
      ==== backtrace (tid:1204522) ====
      0 /home/apps/ucx/1.13.1/lib/libucs.so.0(ucs_handle_error+0x294) [0x154442be36e4]
      1 /home/apps/ucx/1.13.1/lib/libucs.so.0(+0x2f89c) [0x154442be389c]
      2 /home/apps/ucx/1.13.1/lib/libucs.so.0(+0x2fb48) [0x154442be3b48]
      3 /lib64/libpthread.so.0(+0x12ce0) [0x154457390ce0]
      4 ./wrf.exe() [0x1d842a2]
      5 ./wrf.exe() [0x1d8457d]
      6 ./wrf.exe() [0x1d84847]
      7 ./wrf.exe() [0x1d8b938]
      8 ./wrf.exe() [0x26bf87f]
      9 ./wrf.exe() [0x1c46e38]
      10 ./wrf.exe() [0x1c66ca8]
      11 ./wrf.exe() [0x126e36d]
      12 ./wrf.exe() [0xf05f2b]
      13 ./wrf.exe() [0xdbf2d4]
      14 ./wrf.exe() [0x4153fa]
      15 ./wrf.exe() [0x415a52]
      16 ./wrf.exe() [0x415a52]
      17 ./wrf.exe() [0x405102]
      18 ./wrf.exe() [0x404bbd]
      19 /lib64/libc.so.6(__libc_start_main+0xf3) [0x154456ff3ca3]
      20 ./wrf.exe() [0x404bfe]
      =================================

      Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

      Backtrace for this error:
      #0 0x154457d98171 in ???
      #1 0x154457d97313 in ???
      #2 0x154457390cdf in ???
      #3 0x1d842a2 in ???
      #4 0x1d8457c in ???
      #5 0x1d84846 in ???
      #6 0x1d8b937 in ???
      #7 0x26bf87e in ???
      #8 0x1c46e37 in ???
      #9 0x1c66ca7 in ???
      #10 0x126e36c in ???
      #11 0xf05f2a in ???
      #12 0xdbf2d3 in ???
      #13 0x4153f9 in ???
      #14 0x415a51 in ???
      #15 0x415a51 in ???
      #16 0x405101 in ???
      #17 0x404bbc in ???
      #18 0x154456ff3ca2 in ???
      #19 0x404bfd in ???
      #20 0xffffffffffffffff in ???
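
To see where the instability is concentrated, the "Max W" lines in an rsl.error file can be collected and ranked by w-cfl. The following is a minimal sketch that assumes the line layout shown in the excerpt above and assumes the three integers after "Max W:" are the i, j, k grid indices (which is consistent with the tile bounds printed for this rank). The regular expression and function names are illustrative and may need adjusting for other WRF versions.

```python
import re

# Matches lines such as:
# d03 2003-08-05_00:00:48 Max W: 732 846 3 W: -49.45 w-cfl: 2.93 dETA: 0.01
CFL_LINE = re.compile(
    r"(?P<domain>d\d+)\s+(?P<time>\S+)\s+Max\s+W:\s+"
    r"(?P<i>\d+)\s+(?P<j>\d+)\s+(?P<k>\d+)\s+"
    r"W:\s+(?P<w>-?\d+(?:\.\d+)?)\s+w-cfl:\s+(?P<cfl>\d+(?:\.\d+)?)"
)

def parse_cfl_lines(lines):
    """Return one record per 'Max W' diagnostic line; other lines are skipped."""
    records = []
    for line in lines:
        m = CFL_LINE.search(line)
        if m:
            records.append({
                "domain": m["domain"],
                "time": m["time"],
                "i": int(m["i"]),
                "j": int(m["j"]),
                "k": int(m["k"]),
                "w": float(m["w"]),
                "cfl": float(m["cfl"]),
            })
    return records

def worst_cfl(records):
    """Record with the largest w-cfl, or None if nothing was parsed."""
    return max(records, key=lambda r: r["cfl"]) if records else None

# Lines copied from the rsl.error.0158 excerpt above.
sample = [
    "d03 2003-08-05_00:00:48 7 points exceeded v_cfl = 2 in domain d03 at time 2003-08-05_00:00:48 hours",
    "d03 2003-08-05_00:00:48 Max W: 732 846 3 W: -49.45 w-cfl: 2.93 dETA: 0.01",
    "d03 2003-08-05_00:00:48 Max W: 733 846 3 W: 62.23 w-cfl: 5.44 dETA: 0.01",
    "d03 2003-08-05_00:00:48 Max W: 732 846 2 W: -131.04 w-cfl: 10.45 dETA: 0.01",
]
records = parse_cfl_lines(sample)
worst = worst_cfl(records)
print(worst)
```

For the excerpt above, the largest value is w-cfl = 10.45 (W = -131.04) in d03 at indices 732, 846, 2, only 48 seconds into the run. To scan a real file, pass an open file handle instead of the sample list.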

I would be grateful for any assistance or guidance to help identify and resolve the cause of this issue. Please let me know if any additional information or logs would be helpful for troubleshooting.

Thank you very much for your time and support.

Warm regards,
Nags
 

Attachments

  • namelist.input
    4.1 KB · Views: 2
  • namelist.wps
    1.3 KB · Views: 1
Hi,
I believe the issue here is possibly related to the domain setup you're using. I see that you chose the Mercator map projection, but since your domain is centered at a latitude of 44.453, it probably makes more sense to use the Lambert projection instead. If you do, I'd recommend setting truelat1 = 60 and truelat2 = 30, and setting stand_lon to the same value as ref_lon.

Additionally, you probably should make d03 a bit smaller. It's a bit close to the boundaries of d02, and since d04 is so small, there's plenty of room for you to decrease the size of that domain.
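
One way to quantify "too close to the boundaries" is to count how many parent grid cells lie between each nest edge and the corresponding parent edge. The sketch below is a simplified calculation, assuming the usual WPS conventions that the nest's lower-left corner sits at (i_parent_start, j_parent_start) and that the nest's e_we and e_sn equal n * parent_grid_ratio + 1. The namelist values and the 10-cell threshold are hypothetical placeholders, not values taken from the attached namelists and not an official WRF limit.

```python
def nest_margins(parent_e_we, parent_e_sn, nest_e_we, nest_e_sn,
                 i_parent_start, j_parent_start, ratio):
    """Parent-grid cells between each nest edge and the matching parent edge."""
    if (nest_e_we - 1) % ratio or (nest_e_sn - 1) % ratio:
        raise ValueError("nest e_we and e_sn must equal n * ratio + 1")
    nest_cells_x = (nest_e_we - 1) // ratio   # nest width in parent cells
    nest_cells_y = (nest_e_sn - 1) // ratio   # nest height in parent cells
    west = i_parent_start - 1
    south = j_parent_start - 1
    east = (parent_e_we - 1) - west - nest_cells_x
    north = (parent_e_sn - 1) - south - nest_cells_y
    return {"west": west, "east": east, "south": south, "north": north}

# Hypothetical values for illustration only.
MIN_MARGIN = 10  # illustrative threshold, not an official WRF limit
margins = nest_margins(parent_e_we=301, parent_e_sn=301,
                       nest_e_we=451, nest_e_sn=451,
                       i_parent_start=140, j_parent_start=8, ratio=3)
tight = [side for side, cells in margins.items() if cells < MIN_MARGIN]
print(margins, "tight sides:", tight)
```

With these placeholder numbers the nest sits 7 parent cells from the southern edge, so that side would be flagged. Substituting the real values for d02 and d03 shows how much the nest would need to shrink or shift.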

If you modify the domains and still get CFL errors, see "Segmentation Faults and CFL Errors" for additional options for dealing with the instability. You may also need more vertical levels (e_vert) to account for the large size of your domains. You can try setting e_vert to 60 for each domain and see if that makes a difference.
 
Dear @kwerner,

Thank you so much for your kind response and helpful suggestions.

As per your advice, I followed the guidance in the link you shared, and I’m happy to inform you that the model is now running smoothly! I haven’t changed truelat1 and truelat2 yet, but I’m currently running some simulations and will compare the results with observations to evaluate the performance.


I’ll keep you updated once I have more insights from the output.

Sincerely,
Nagaraju Gaddam
 