Wrong behaviour on auxiliary output files

This post was from a previous version of the WRF&MPAS-A Support Forum. New replies have been disabled and if you have follow up questions related to this post, then please start a new thread from the forum home page.

LluisFB

Dear WRF-developing team,

It seems that the auxiliary output framework does not work properly: it does not use the output frequency set in the namelist.

My namelist contains something like:
auxhist2_outname = "wrfafwa_d<domain>_<date>",
io_form_auxhist2 = 2,
auxhist2_interval = 30,1440,
frames_per_auxhist2 = 1000,1000,
output_diagnostics = 1,
auxhist3_outname = "wrfxtrm_d<domain>_<date>",
io_form_auxhist3 = 2,
auxhist3_interval = 30,60,
frames_per_auxhist3 = 1000,1000,
auxhist9_outname = "wrfcdx_d<domain>_<date>",
io_form_auxhist9 = 2,
auxhist9_interval = 30, 60,
auxhist9_interval_m = 30, 60,
auxhist9_interval_s = 1800, 60,
frames_per_auxhist9 = 1000, 1000,
auxhist23_outname = "wrfpress_d<domain>_<date>",
io_form_auxhist23 = 2
auxhist23_interval = 30, 60,
frames_per_auxhist23 = 1000, 1000,

But only the wrfafwa file is written at the right frequency.

To check this (sorry if there is a more efficient way to do it), I added the following to phys/module_diagnostics_driver.F:
CALL WRFU_ALARMGET( grid%alarms(AUXHIST2_ALARM), prevringtime=aux_time, ringinterval=auxint)
PRINT *,'aux2int', auxint
CALL WRFU_ALARMGET( grid%alarms(AUXHIST3_ALARM), prevringtime=aux_time, ringinterval=auxint)
PRINT *,'aux3int', auxint
CALL WRFU_ALARMGET( grid%alarms(AUXHIST9_ALARM), prevringtime=aux_time, ringinterval=auxint)
PRINT *,'aux9int', auxint
CALL WRFU_ALARMGET( grid%alarms(AUXHIST23_ALARM), prevringtime=aux_time, ringinterval=auxint)
PRINT *,'aux23int', auxint

In rsl.out.0000 I then get:
aux2int 1800 0 0 0
aux3int 10800 0 0 0
aux9int 10800 0 0 0
aux23int 10800 0 0 0

I would expect auxint = 1800 for all the streams. Also, if I leave auxhistN_interval unset and use any of the other available forms, auxhistN_interval_[y/m/d/h/m/s], the model crashes with a segmentation fault.
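
As a quick check on the units, the sketch below converts the domain-1 namelist intervals (given in minutes) to seconds and compares them with the values printed by the diagnostic. It assumes that the first number on each printed line is the alarm ring interval in seconds, which matches the 1800 reported for auxhist2; the dictionary and function names are illustrative only.

```python
# Domain-1 values of auxhistN_interval from the namelist above (minutes).
namelist_minutes = {2: 30, 3: 30, 9: 30, 23: 30}

# First number printed for each stream in rsl.out.0000
# (assumed here to be the alarm ring interval in seconds).
reported_seconds = {2: 1800, 3: 10800, 9: 10800, 23: 10800}


def mismatched_streams(expected_min, reported_s):
    """Return {stream: (expected_s, reported_s)} for streams that disagree."""
    bad = {}
    for stream, minutes in expected_min.items():
        expected_s = minutes * 60
        if reported_s[stream] != expected_s:
            bad[stream] = (expected_s, reported_s[stream])
    return bad


if __name__ == "__main__":
    bad = mismatched_streams(namelist_minutes, reported_seconds)
    for stream, (want, got) in sorted(bad.items()):
        print(f"auxhist{stream}: expected {want} s, alarm reports {got} s ({got // 60} min)")
```

With the values from this post, streams 3, 9 and 23 report 10800 s (180 min) where 1800 s (30 min) was requested.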

Many thanks in advance,

Lluís
 
Hi,
Can you send the namelist.input file you are using, as well as your rsl.out.0000 file? Can you also let me know which version of WRF you are running?
Thanks!
 
Hi Kwerner,

Sorry, I forgot to include that.

I am using WRF 3.9.1.1; the namelist is attached. It relies on a series of modifications I recently made for the CORDEX output (auxhist #9 and the &cordex section of the namelist). You can remove them; they should not interfere with the output.

I do not know if it is related, but I sometimes have problems getting the restart file at the end of the simulation.

Many thanks,

Lluís
 

Attachments

  • namelist.input
Thanks for sending that. Can you set debug_level = 0, remove any other options that are not used for your run, and then re-run? (debug_level was put in the namelist many years ago but was recently removed because it typically provides no useful information; it simply makes the rsl files large and tough to read.) After that, please send the new namelist.input file along with the rsl.out.0000 and rsl.error.0000 files.

Thanks,
Kelly
 
Sorry for the late reply; we had air-conditioning issues in our HPC room.

Attached are the namelist and the rsl.out.0000 and rsl.error.0000 files.

Thanks,

Lluís
 

Attachments

  • namelist.input
    7.1 KB · Views: 88
  • rsl_error_0000.txt
    24.2 KB · Views: 72
  • rsl_out_0000.txt
    24.3 KB · Views: 72
Hi Lluis,

I have tried to reproduce your problem but have been unsuccessful. I ran V3.9.1.1 with your namelist but my own data, so I only modified the dates, grid size/resolution, and time_step. I get the correct output times for all of the streams EXCEPT auxhist9. You mentioned earlier that you put in some modifications to the namelist that helped you get the right output for auxhist9, but these give me the wrong output intervals. If I have these settings in the namelist:
Code:
 auxhist2_interval                    = 30
 auxhist3_interval                    = 30
 auxhist23_interval                   = 30
 auxhist9_interval                    = 30 
 auxhist9_interval_m                = 30
 auxhist9_interval_s                 = 1800

then I get output in these intervals, like this:
Code:
Timing for Writing wrfafwa_d01_2016-03-23_00:00:00 for domain        1:    0.03824 elapsed seconds
Timing for Writing wrfxtrm_d01_2016-03-23_00:00:00 for domain        1:    0.03012 elapsed seconds
Timing for Writing wrfcdx_d01_2016-03-23_00:00:00 for domain        1:    0.00900 elapsed seconds
Timing for Writing wrfpress_d01_2016-03-23_00:00:00 for domain        1:    0.02616 elapsed seconds
......
......
Timing for Writing wrfafwa_d01_2016-03-23_00:30:00 for domain        1:    0.00618 elapsed seconds
Timing for Writing wrfxtrm_d01_2016-03-23_00:30:00 for domain        1:    0.00395 elapsed seconds
Timing for Writing wrfpress_d01_2016-03-23_00:30:00 for domain        1:    0.00321 elapsed seconds
......
......
Timing for Writing wrfafwa_d01_2016-03-23_01:00:00 for domain        1:    0.00519 elapsed seconds
Timing for Writing wrfxtrm_d01_2016-03-23_01:00:00 for domain        1:    0.00378 elapsed seconds
Timing for Writing wrfcdx_d01_2016-03-23_01:00:00 for domain        1:    0.00720 elapsed seconds
Timing for Writing wrfpress_d01_2016-03-23_01:00:00 for domain        1:    0.00363 elapsed seconds

So you can see that all of them wrote out at 30 min intervals, except for the wrfcdx* files (auxhist9). If I remove these from the namelist:
Code:
 auxhist9_interval_m                = 30
 auxhist9_interval_s                 = 1800

then I get the correct output times for all of them in the rsl file; however, either way I don't get any actual output in the wrfcdx* file because there is nothing to be put in that file (nothing telling the model what to write into the auxiliary stream 9). So I'm not really sure why this isn't working for you. Did you make any other modifications to any of your code? One thing I would suggest is to download a fresh/pristine tar file of v3.9.1.1 of the code and recompile in a clean directory, and try to run this there.
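
One way to confirm the cadence of each stream is to read it back from the "Timing for Writing" lines in the rsl file. The following is a minimal sketch, assuming the log lines have exactly the form shown in this thread; the function name and the abbreviated sample text are illustrative, not part of WRF.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches lines such as:
# Timing for Writing wrfcdx_d01_2016-03-23_01:00:00 for domain        1:    0.00720 elapsed seconds
WRITE_LINE = re.compile(
    r"Timing for Writing (\w+?)_d(\d+)_(\d{4}-\d{2}-\d{2}_\d{2}:\d{2}:\d{2}) for domain"
)


def write_intervals(log_text):
    """Map (stream, domain) to the list of gaps, in minutes, between writes."""
    times = defaultdict(list)
    for match in WRITE_LINE.finditer(log_text):
        stream, domain, stamp = match.groups()
        times[(stream, int(domain))].append(
            datetime.strptime(stamp, "%Y-%m-%d_%H:%M:%S")
        )
    gaps = {}
    for key, stamps in times.items():
        stamps.sort()
        gaps[key] = [
            int((later - earlier).total_seconds() // 60)
            for earlier, later in zip(stamps, stamps[1:])
        ]
    return gaps


# Abbreviated excerpt of the log quoted in this post (two streams only).
SAMPLE = """\
Timing for Writing wrfafwa_d01_2016-03-23_00:00:00 for domain        1:    0.03824 elapsed seconds
Timing for Writing wrfcdx_d01_2016-03-23_00:00:00 for domain        1:    0.00900 elapsed seconds
Timing for Writing wrfafwa_d01_2016-03-23_00:30:00 for domain        1:    0.00618 elapsed seconds
Timing for Writing wrfafwa_d01_2016-03-23_01:00:00 for domain        1:    0.00519 elapsed seconds
Timing for Writing wrfcdx_d01_2016-03-23_01:00:00 for domain        1:    0.00720 elapsed seconds
"""

if __name__ == "__main__":
    for (stream, domain), minutes in sorted(write_intervals(SAMPLE).items()):
        print(f"{stream} d{domain:02d}: gaps in minutes = {minutes}")
```

On the excerpt, wrfafwa shows 30-minute gaps and wrfcdx a 60-minute gap. That would be consistent with the minute and second components being added together (30 min + 1800 s), but this thread does not confirm how WRF combines them, so treat that reading as an assumption.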

Kelly
 
Many thanks Kelly,

I ran the tests you suggested and it worked, even with the modifications that I introduced. My IT team told me that someone was misusing our cluster (heavy I/O on the main node), which might have interfered with the model.

However, I still do not get the wrfrst file when I should. I have monthly runs and, for example, my namelist has:
Code:
 &time_control
 run_days                            = 0,
 run_hours                           = 0,
 run_minutes                         = 0,
 run_seconds                         = 0,
 start_year                          = 2001,
 start_month                         = 03,
 start_day                           = 01,
 start_hour                          = 00,
 start_minute                        = 00,
 start_second                        = 00,
 end_year                            = 2001,
 end_month                           = 04,
 end_day                             = 01,
 end_hour                            = 00,
 end_minute                          = 00,
 end_second                          = 00,
 interval_seconds                    = 21600
 input_from_file                     = .true.
 history_interval                    = 1440
 frames_per_outfile                  = 5
 restart                             = .true.
 restart_interval                    = 44640
 io_form_history                     = 2
(...)

But in the run directory I get:
Code:
$ ls wrfrst*
wrfrst_d01_2001-03-01_00:00:00  wrfrst_d01_2001-03-29_00:00:00

$ ncdump -v Times wrfrst_d01_2001-03-29_00\:00\:00 
(...)
 Times =
  "2001-03-29_00:00:00" ;
}

And in rsl.error.0000:
Code:
(...)
Timing for main: time 2001-03-28_23:58:20 on domain   1:    5.01393 elapsed seconds
d01 2001-03-28_23:58:20  CLWRFdiag - T2; tile:            1 T2clmin:   276.2490     T2clmax:   276.3187     TT2clmin:   125173.3     TT2clmax:   124793.3     T2clmean:   276.2661     T2clstd:  1.5655488E-02
Timing for main: time 2001-03-29_00:00:00 on domain   1:    5.11592 elapsed seconds
Timing for Writing wrfout_d01_2001-03-29_00:00:00 for domain        1:   29.93662 elapsed seconds
Timing for Writing wrfxtrm_d01_2001-03-29_00:00:00 for domain        1:    0.70058 elapsed seconds
Timing for Writing wrfcdx_d01_2001-03-29_00:00:00 for domain        1:    1.51330 elapsed seconds
Timing for Writing wrfpress_d01_2001-03-29_00:00:00 for domain        1:    9.06156 elapsed seconds
d01 2001-03-29_00:00:00 Input data processed for aux input   4 for domain   1
(....)
Timing for main: time 2001-03-31_23:58:20 on domain   1:    5.15768 elapsed seconds
d01 2001-03-31_23:58:20  CLWRFdiag - T2; tile:            1 T2clmin:   276.2603     T2clmax:   277.0431     TT2clmin:   129600.0     TT2clmax:   129478.3     T2clmean:   276.7323     T2clstd:  0.2882564
Timing for main: time 2001-04-01_00:00:00 on domain   1:    5.29839 elapsed seconds
Timing for Writing wrfout_d01_2001-04-01_00:00:00 for domain        1:   35.20950 elapsed seconds
Timing for Writing wrfxtrm_d01_2001-04-01_00:00:00 for domain        1:    0.66797 elapsed seconds
Timing for Writing wrfcdx_d01_2001-04-01_00:00:00 for domain        1:    1.44985 elapsed seconds
Timing for Writing wrfpress_d01_2001-04-01_00:00:00 for domain        1:    9.02107 elapsed seconds
d01 2001-04-01_00:00:00 wrf: SUCCESS COMPLETE WRF

As you can see, wrfrst is written on 29 March, three days before the end of the run on 1 April, and the write is not even logged in the rsl file. Looking into namelist.output:
Code:
(...)
 RESTART_INTERVAL        =       44640,
(...)
 RESTART_INTERVAL_D      =           0,
 RESTART_INTERVAL_H      =           0,
 RESTART_INTERVAL_M      =           0,
 RESTART_INTERVAL_S      =           0,

I am completely puzzled!
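
The arithmetic behind the expected restart time can be checked with a short sketch. It only uses the dates and the restart_interval value quoted above, and assumes restart_interval is counted in minutes from the start time of the run; the variable names are illustrative.

```python
from datetime import datetime, timedelta

start = datetime(2001, 3, 1)               # start_* entries in &time_control
end = datetime(2001, 4, 1)                 # end_* entries in &time_control
restart_interval_min = 44640               # restart_interval (minutes)
observed_restart = datetime(2001, 3, 29)   # from wrfrst_d01_2001-03-29_00:00:00

# Where the restart file would be expected if the interval
# is measured from the start of this run.
expected_restart = start + timedelta(minutes=restart_interval_min)

# How far into the run the restart file actually appeared.
observed_offset_min = int((observed_restart - start).total_seconds() // 60)

print("expected restart:", expected_restart)
print("observed offset :", observed_offset_min, "minutes")
print("days early      :", (expected_restart - observed_restart).days)
```

44640 minutes is exactly 31 days, so the expected restart time coincides with the end of the run, whereas the file actually appeared after 40320 minutes (28 days).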
 
Hi Lluís,
First, I'm so glad that you were finally able to get past the original problem. That is great!
As for the second problem with the restart output, try setting (in the &time_control section)

override_restart_timers = .true.

and see if that helps.

Kelly
 
Dear Kelly,

It works now with the parameter you suggested.

I missed the model version in which this parameter apparently became necessary for long simulations (multiple restarts). I should have read the web page that explains it more carefully:

http://www2.mmm.ucar.edu/wrf/users/docs/user_guide_V3/users_guide_chap5.htm#restart

Sorry about that, :oops:

Anyway, many thanks,

Lluís
 
No worries! I'm very glad that you were able to resolve this. Thank you for updating the post!

Kelly
 