
Crash in HALO_EM_PHYS_HCW_inline.inc after 2 days in nested idealized tropical cyclone case (WRF v4.5)

babyasleep

Hi all,

I am running an idealized tropical cyclone simulation using WRF v4.5, with three nested domains. The model runs fine for about 2 simulated days, but then it crashes with an error related to the file HALO_EM_PHYS_HCW_inline.inc.
This include file seems to be associated with halo exchange, where WRF passes data across the boundaries between processor patches for the physics routines.
I have double-checked that the tropical cyclone remains well-centered in all three domains throughout the simulation. The same experiment completes successfully when run with a different nesting setting.

Any suggestions, similar experiences, or advice would be greatly appreciated!
I’ve attached the namelist.input files for both the successful and unsuccessful runs for reference.

Below is part of the log output leading to the crash:

d03 0001-01-02_20:23:13+01/03 calling inc/HALO_EM_PHYS_HCW_inline.inc
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
libpthread-2.31.s 000015083EC1F8C0 Unknown Unknown Unknown
libmpi_intel.so.1 000015083D90F24F Unknown Unknown Unknown
...
wrf.exe 000000000041733A Unknown Unknown Unknown

Best regards,
Jiaxin
 

Attachments

  • ERORR_namelist.input (5.2 KB)
  • NO_ERROR_namelist.input (5.2 KB)
Apologies for the long delay in response while our team tended to time-sensitive obligations. Thank you for your patience.

Are you still experiencing this issue? If so, can you package all of your rsl* files from the failed simulation into a single *.tar or zipped file and attach that? Thanks!
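For anyone unsure how to bundle the files, here is a minimal Python sketch that packs every rsl.* file from a run directory into one gzipped tar. The function name `bundle_rsl_files` and the archive name `rsl_files.tar.gz` are placeholders chosen for illustration, not anything WRF provides.

```python
import tarfile
from pathlib import Path


def bundle_rsl_files(run_dir=".", archive_name="rsl_files.tar.gz"):
    """Pack every rsl.* file found in run_dir into a single gzipped tar.

    Returns the archive path and the number of files that were added.
    """
    run_path = Path(run_dir)
    # Matches rsl.out.NNNN and rsl.error.NNNN; the archive itself starts
    # with "rsl_" and is therefore not picked up by this pattern.
    rsl_files = sorted(run_path.glob("rsl.*"))
    archive_path = run_path / archive_name
    with tarfile.open(archive_path, "w:gz") as tar:
        for rsl_file in rsl_files:
            # Store only the bare file name, not the full directory path.
            tar.add(rsl_file, arcname=rsl_file.name)
    return archive_path, len(rsl_files)
```

Calling `bundle_rsl_files("/path/to/run")` would leave `rsl_files.tar.gz` in that directory, ready to attach.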
 
Hi kwerner,

Thanks for your response. I’m still working on this issue and have tried the following steps separately:
1. The experiment ran for a longer time after I set the same history_interval for all domains, but it eventually failed. (New namelist attached.)
2. I changed the mp_physics setting, and it works. However, I really want to use mp_physics = 8 since it is one of the best schemes for tropical cyclones.
3. I have checked that there are no CFL errors in my rsl files.
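For reference, the CFL check in step 3 can be repeated with a short Python sketch like the one below. It assumes that WRF's CFL warnings contain the string "cfl" somewhere in the rsl text (for example "points exceeded cfl=2"), so it does a simple case-insensitive search. The function name `find_cfl_lines` is hypothetical.

```python
from pathlib import Path


def find_cfl_lines(run_dir="."):
    """Return (file name, line number, text) for every rsl line mentioning 'cfl'."""
    hits = []
    for path in sorted(Path(run_dir).glob("rsl.*")):
        # errors="replace" keeps the scan going if a file has odd bytes in it.
        with path.open(errors="replace") as handle:
            for lineno, line in enumerate(handle, start=1):
                if "cfl" in line.lower():
                    hits.append((path.name, lineno, line.rstrip()))
    return hits
```

An empty list from `find_cfl_lines("/path/to/run")` would be consistent with the statement that no CFL errors were reported. With debug_level = 200 the files are large, so reading them line by line keeps memory use small.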

I set debug_level = 200, so the full set of rsl files is too large to attach. Instead, I have uploaded the most important rsl files. My case is in /glade/derecho/scratch/cjiaxin/wrfv4.5_level_nest_feedback_initial_profile_mpphyscis1 on NCAR Derecho. I really appreciate your help!

Thanks,
Jiaxin
 

Attachments

  • rsl.out.0000_error.PNG (82.1 KB)
  • rsl.out.0228_error.PNG (41.5 KB)
Jiaxin,
I want to let you know I'm not ignoring you. I've just been trying to figure out what's going on, and to do that, I'm having to run your case several times, which takes quite a long time. So far, I haven't found a successful solution, but I'll keep you posted.
 
Hi,
Okay, I was finally able to get this to run. These are the changes I made:

  • I changed the time_step to 20
  • I set e_vert to 60 for each domain
  • I set w_damping=1 in the &dynamics namelist record
  • I ran using 480 processors (4 nodes, 120 processors per node)
  • I did have to restart the model because it didn't finish before reaching my 6-hour wallclock limit
If you'd like to see my exact namelist, it's in
/glade/derecho/scratch/kkeene/babyasleep/wrfv471/test/em_tropical_cyclone
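As a rough sketch, the three namelist changes listed above would look something like the fragment below in namelist.input. Only the values stated in this post are shown and every other entry is omitted. The three e_vert columns assume the case still uses three domains, and the exact file in the directory above remains the authoritative version.

```
&domains
 time_step = 20,
 e_vert    = 60, 60, 60,
/

&dynamics
 w_damping = 1,
/
```

The processor count (480) and the restart are set through the job script and the restart options rather than these entries, so they are not shown here.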
 