Error in running WRF-Chem Version 3.9.1.1. using MOZART-MOSAIC option (chem_opt=202)

Ankan

Dear all,
I am running WRF-Chem Version 3.9.1.1 on a Linux server with 32 processors (RAM: 378 GB) and the gfortran compiler. I used the EDGARV5 MOZART dataset for generating anthropogenic emission input files ('wrfchemi_d<domain>_<date>.nc'), FINN Version 1.5 dataset for generating fire emission input files ('wrffirechemi_d<domain>_<date>.nc'), MEGAN for generating biogenic emission input files ('wrfbiochemi_d<domain>.nc') and CAM-chem model output for running the mozbc utility to include chemical lateral boundary conditions in the wrfinput files. After successfully completing these steps, I ran wrf.exe with the following command:
mpirun -np 28 ./wrf.exe >& wrfrun.log &
wrf.exe ran successfully for some time, but after 30 minutes the simulation stopped without reporting any errors. I checked the 'rsl.error.0000' file and found the following lines:
Timing for main: time 2015-12-22_00:30:00 on domain 2: 493.95236 elapsed seconds
Timing for main: time 2015-12-22_00:30:00 on domain 1: 2437.71338 elapsed seconds
-------------------------
WARNING: Large total lw optical depth of ******** at point i,j,nb= 1 1 1
Diagnostics 1: k, tauaerlw1, tauaerlw16
1****************
2****************
3****************
4****************
5****************
6****************
7****************
8****************
9****************
10****************
11****************
12****************
13****************
14****************
15****************
16****************
17****************
18 0.00 0.00
19 0.00 0.00
20 0.00 0.00
21 0.00 0.00
22 0.00 0.00
23 0.00 0.00
24 0.00 0.00
25 0.00 0.00
26 0.00 0.00
27 0.00 0.00
28 0.00 0.00
29 0.00 0.00
30 0.00 0.00
31 0.00 0.00
32 4.10 26.75
33****************
34 44.19 266.64
-------------------------
-------------------------

I also checked the 'rsl.out.0000' file and found the following lines:
Timing for main: time 2015-12-22_00:00:18 on domain 2: 288.50095 elapsed seconds
d02 Domain average of dpsdt, dmudt (mb/3h): 0.300000012 91.5495071 18.9523468
d02 Max mu change time step: 399 393 2.25554081E-03
d02 Domain average of dardt, drcdt, drndt (mm/sec): 0.300000012 1.55115742E-06 1.55083853E-06 3.18821164E-10
d02 Domain average of rt_sum, rc_sum, rnc_sum (mm): 0.300000012 3.25822002E-05 3.25729452E-05 9.25278609E-09
d02 Max Accum Resolved Precip, I,J (mm): 1.04091785E-04 187 323
d02 Max Accum Convective Precip, I,J (mm): 6.21847808E-03 213 261
d02 Domain average of sfcevp, hfx, lh: 0.300000012 7.40115764E-04 5.41384125 59.9057388
calculate MEGAN emissions at ktau, gmtp, tmidh = 2 0.00000000 7.50000030E-03
photolysis_driver: called for domain 2
entering mosaic_cloudchem_driver - ktau = 2
leaving mosaic_cloudchem_driver - ktau = 2 0
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200
ASTEM internal steps exceeded 200

A 'fort.67' file is also generated; it contains only the line "ASTEM internal steps exceeded 200", repeated many times. I checked the 'wrfrun.log' file and found the following lines printed at the end of the file:
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 5440 RUNNING AT durga
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
Intel(R) MPI Library troubleshooting guide:
Documentation Library
===================================================================================

Can anyone guide me as to why I am getting these errors and how to solve them? I am attaching my namelist.input file and wrfrun.log file for your convenience. I am very new to this model. So, any help on this will be greatly appreciated. Thank you.
With regards,
Ankan
 

Attachments

  • namelist.input.txt
  • wrfrun.log
Hi Ankan,

I'm honestly surprised you are able to run this given the multiple domains and complex chemistry. I would prepare for this to take a long time to run. That said, it looks like the thermodynamics module is not converging (see chem/module_mosaic_therm.F):

! astem parameters
nmax_astem = 200 ! max number of time steps in astem
alpha_astem = 0.05 ! choose a value between 0.01 and 1.0
! Changed alpha_astem from 0.5 to 0.05 by Manish Shrivastava on 01/08/2010
rtol_eqb_astem = 0.01 ! equilibrium tolerance in astem
ptol_mol_astem = 0.01 ! mol percent tolerance in astem

My suggestion would be to first decrease the model time step, at least for the first day of the simulation, in case your chemistry is not well spun up.
If that doesn't work, you should thoroughly check all of your input files to make sure there are not any spurious values.
If that doesn't work, you can modify the code: increase the nmax_astem value and recompile the model.

Jordan
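Before changing the time step or the code, it may help to see how often the non-convergence message appears and in which rsl files. The following is a minimal sketch, not part of WRF-Chem: it assumes the rsl.* files are in the run directory and uses only the message fragments quoted earlier in this thread. The function names are hypothetical.

```python
from collections import Counter
from pathlib import Path

# Message fragments copied from the log excerpts in this thread.
PATTERNS = {
    "astem": "ASTEM internal steps exceeded",
    "lw_optical_depth": "Large total lw optical depth",
    "sw_optical_depth": "Large total sw optical depth",
}


def count_messages(text):
    """Count how many lines of `text` contain each known message fragment."""
    counts = Counter()
    for line in text.splitlines():
        for name, fragment in PATTERNS.items():
            if fragment in line:
                counts[name] += 1
    return counts


def scan_run_directory(run_dir="."):
    """Return {file name: Counter} for every rsl.* file with at least one hit."""
    results = {}
    for path in sorted(Path(run_dir).glob("rsl.*")):
        counts = count_messages(path.read_text(errors="replace"))
        if counts:
            results[path.name] = counts
    return results


if __name__ == "__main__":
    for name, counts in scan_run_directory().items():
        print(name, dict(counts))
```

If the messages are concentrated in a few rsl files, that may point to a particular part of the domain worth inspecting in the input files.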
 
Hello Jordan,
Thank you for such a quick response. After going through a lot of discussions available online, I learned that time_step should be 6*dx, where dx is the spatial resolution of the parent domain along the x-axis, and that the 'chemdt' option should be set for the nested domain according to the 'parent_grid_ratio' and 'parent_time_step_ratio' (here, 5:1). Since I am running simulations on two domains, d01 (15 km) and d02 (3 km), I set the following options in the 'namelist.input' file:

&domains
time_step = 90,
time_step_fract_num = 0,
time_step_fract_den = 1,

&chem
chem_opt = 202, 202,
bioemdt = 1.5, 0.3,
photdt = 1.5, 0.3,
chemdt = 1.5, 0.3,


You suggested first decreasing the model time step. How should I set time_step then? I found the following lines about time_step on the WRF USERS PAGE (WRF Namelist.input Best Prac):
Best Practice
It is recommended to use a value of 5-6xDX (in km) for a typical case. If you are using many vertical levels and/or with map-scale-factors much larger than 1, you will need to use a smaller time_step. If you are getting CFL errors that stop your run, this means your run has become unstable and you may need to decrease this value to about 4xDX (or perhaps even 3xDX). It makes things easier if you use a time step that evenly divides into your history_interval, so that your output times will be evenly spaced.
So, should I set time_step =4*15=60? And I am not sure about the 'bioemdt' and 'photdt' options. Should these be the same as 'chemdt'? So, should the following namelist options be used?

&time_control
history_interval = 180, 180,

&domains

time_step = 60,
time_step_fract_num = 0,
time_step_fract_den = 1,

&chem
chem_opt = 202, 202,
bioemdt = 1, 0.2,
photdt = 1, 0.2,
chemdt = 1., 0.2,

Could you please advise me on this? Thank you again for your time.
With regards,
Ankan
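The arithmetic in the quoted best-practice text can be sketched as below. This is only an illustration: it assumes time_step is given in seconds and history_interval in minutes, takes the multipliers 6, 5, 4 and 3 from the quoted guidance, and the function name is hypothetical.

```python
def candidate_time_steps(dx_km, history_interval_min, factors=(6, 5, 4, 3)):
    """List (factor, time_step in s, divides history interval?) for each factor.

    dx_km is the parent-domain grid spacing in km; the time step is taken
    as factor * dx_km seconds, following the quoted rule of thumb.
    """
    history_s = history_interval_min * 60
    candidates = []
    for factor in factors:
        step_s = factor * dx_km
        candidates.append((factor, step_s, history_s % step_s == 0))
    return candidates


if __name__ == "__main__":
    # 15 km parent domain and history_interval = 180, as in this post.
    for factor, step_s, divides in candidate_time_steps(15, 180):
        print(f"{factor} x dx -> time_step = {step_s} s, "
              f"divides history interval: {divides}")
```

For a 15 km parent domain with a 180-minute history interval, all four candidates (90, 75, 60 and 45 s) divide the interval evenly, so the choice comes down to stability rather than output spacing.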
 
Hi Ankan,

Yes, 6*dx is the recommendation, but it doesn't guarantee success.

Generally you want chemdt = 0, so that chemistry is called at each WRF time step. The model will adjust accordingly, based on the namelist option 'parent_time_step_ratio'. So for example, if your WRF time step is 60 seconds, then d01 met and chemistry will run every 60 seconds, and d02 will run met and chemistry calculations every 12 seconds. For bioemdt and photdt, these are generally called every ~30 minutes. If you call photolysis every minute, or even 5X a minute for d02, your simulation is going to increase in cost dramatically. That said, your specific experiment will determine your namelist options.

Jordan
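To make the cost argument concrete, here is a small sketch of the numbers involved. It assumes that chemdt, photdt and bioemdt are given in minutes, that a value of 0 means "every model time step", and that a process cannot be called more often than once per step; the function names are hypothetical and this is not WRF-Chem code.

```python
def nest_time_step(parent_step_s, parent_time_step_ratio):
    """Time step of a nest in seconds, given the parent step and the ratio."""
    return parent_step_s / parent_time_step_ratio


def calls_per_hour(interval_min, model_step_s):
    """Approximate number of calls per simulated hour.

    interval_min = 0 is treated as "call every model time step".
    An interval shorter than the model step is clamped to the step.
    """
    interval_s = max(interval_min * 60.0, model_step_s)
    return 3600.0 / interval_s


if __name__ == "__main__":
    d01_step = 60.0
    d02_step = nest_time_step(d01_step, 5)
    print("d02 time step (s):", d02_step)
    for photdt in (0.2, 1.0, 30.0):
        print(f"photdt = {photdt} min on d02 ->",
              calls_per_hour(photdt, d02_step), "calls per hour")
```

With a 60-second parent step and a ratio of 5, photdt = 0.2 on d02 means roughly 300 photolysis calls per simulated hour, against 2 calls with photdt = 30.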
 
Hello @jordanschnell,
Thank you so much for the clarification. I will modify my namelist.input file as per your suggestion, run WRF-Chem again, and let you know the results. Thanks a lot.
Regards,
Ankan
 
Hello @jordanschnell,
I am running WRF-Chem with the MOZART-MOSAIC option (chem_opt=202) by setting the following namelist options for time_step=60:
&chem
chem_opt = 202, 202,
bioemdt = 30, 6,
photdt = 30, 6,
chemdt = 0., 0,

The thing is, wrf.exe is still running, but I checked the rsl files in which the following warning messages are printed:
-------------------------
-------------------------
WARNING: Large total sw optical depth of ******** at point i,j,nb= 27 42 5
Diagnostics 1: k, tauaer300, tauaer400, tauaer600, tauaer999, tauaer
1****************************************
2****************************************
3****************************************
4****************************************
5****************************************
6****************************************
7****************************************
8****************************************
9****************************************
10****************************************
11****************************************
12****************************************
13****************************************
14****************************************
15****************************************
16****************************************
17****************************************
18****************************************
19****************************************
20****************************************
21****************************************
22****************************************
23****************************************
24****************************************
25****************************************
26****************************************
27****************************************
28****************************************
29****************************************
30****************************************
31****************************************
32****************************************
33****************************************
34****************************************
Diagnostics 2: k, gaer300, gaer400, gaer600, gaer999
1 0.82 0.82 0.81 0.80
2 0.82 0.82 0.81 0.80
3 0.82 0.82 0.81 0.80
4 0.82 0.82 0.81 0.80
5 0.82 0.82 0.81 0.80
6 0.82 0.82 0.81 0.80
7 0.82 0.82 0.81 0.80
8 0.82 0.82 0.81 0.80
9 0.82 0.82 0.81 0.80
10 0.82 0.82 0.81 0.80
11 0.82 0.82 0.81 0.80
12 0.82 0.82 0.81 0.80
13 0.82 0.82 0.81 0.80
14 0.82 0.82 0.81 0.80
15 0.82 0.82 0.81 0.80
16 0.81 0.82 0.80 0.77
17 0.81 0.82 0.80 0.77
18 0.81 0.82 0.80 0.77
19 0.81 0.81 0.79 0.75
20 0.83 0.81 0.81 0.78
21 0.82 0.81 0.79 0.76
22 0.82 0.81 0.81 0.77
23 0.82 0.82 0.81 0.80
24 0.83 0.81 0.81 0.78
25 0.82 0.82 0.81 0.80
26 0.82 0.82 0.81 0.80
27 0.82 0.82 0.81 0.80
28 0.82 0.82 0.81 0.80
29 0.82 0.82 0.81 0.80
30 0.82 0.82 0.81 0.80
31 0.82 0.82 0.81 0.80
32 0.82 0.82 0.81 0.80
33 0.82 0.82 0.81 0.80
34 0.82 0.82 0.81 0.80
Diagnostics 3: k, waer300, waer400, waer600, waer999
1 1.00 1.00 1.00 1.00
2 1.00 1.00 1.00 1.00
3 1.00 1.00 1.00 1.00
4 1.00 1.00 1.00 1.00
5 1.00 1.00 1.00 1.00
6 1.00 1.00 1.00 1.00
7 1.00 1.00 1.00 1.00
8 1.00 1.00 1.00 1.00
9 1.00 1.00 1.00 1.00
10 1.00 1.00 1.00 1.00
11 1.00 1.00 1.00 1.00
12 1.00 1.00 1.00 1.00
13 1.00 1.00 1.00 1.00
14 1.00 1.00 1.00 1.00
15 1.00 1.00 1.00 1.00
16 1.00 1.00 1.00 1.00
17 1.00 1.00 1.00 1.00
18 1.00 1.00 1.00 1.00
19 1.00 1.00 1.00 1.00
20 1.00 1.00 1.00 1.00
21 1.00 1.00 1.00 1.00
22 1.00 1.00 1.00 1.00
23 1.00 1.00 1.00 1.00
24 1.00 1.00 1.00 1.00
25 1.00 1.00 1.00 1.00
26 1.00 1.00 1.00 1.00
27 1.00 1.00 1.00 1.00
28 1.00 1.00 1.00 1.00
29 1.00 1.00 1.00 1.00
30 1.00 1.00 1.00 1.00
31 1.00 1.00 1.00 1.00
32 1.00 1.00 1.00 1.00
33 1.00 1.00 1.00 1.00
34 1.00 1.00 1.00 1.00
Diagnostics 4: k, ssaal, asyal, taual
0 1.00 0.75 0.00
1 1.00 0.75 0.00
2 1.00 0.75 0.02
3 1.00 0.75 2.99
4 1.00 0.75 2.90
5 1.00 0.75 0.08
6 1.00 0.75 0.00
7 1.00 0.75 0.00
8 1.00 0.75 0.00
9 1.00 0.00 0.00
10 1.00 0.77********
11 1.00 0.77********
12 1.00 0.77********
13 1.00 0.77********
14 1.00 0.77********
15 1.00 0.77********
16 1.00 0.77********
17 1.00 0.77********
18 1.00 0.77********
19 1.00 0.77********
20 1.00 0.77********
21 1.00 0.77********
22 1.00 0.77********
23 1.00 0.77********
24 1.00 0.77********
25 1.00 0.74********
26 1.00 0.74********
27 1.00 0.75********
28 1.00 0.71********
29 1.00 0.80********
30 1.00 0.72********
31 1.00 0.78********
32 1.00 0.77********
33 1.00 0.80********
34 1.00 0.77********
-------------------------
-------------------------

I think these warning messages are making the rsl files grow too large. After checking the rsl.error files, I learned that the warnings are printed every 30 minutes, after which the regular output continues. My guess is that this warning may be due to the radiation options, since I set 'radt' to 30 minutes for each domain.
I set the following radiation options in my namelist.input file:
&physics
ra_lw_physics = 4, 4,
ra_sw_physics = 4, 4,
radt = 30, 30,

I have gone through namelist.input: Best Practices from WRF USERS PAGE (WRF Namelist.input Best Prac), and found that it is recommended to use 1 minute per km of dx (e.g., set to 10 for a 10km parent domain). However, I set here 30, which is 2*dx (dx = 15 km for the parent domain, in my case). So, is it because of 'radt' option? Or is anything else wrong? And can these warnings cause any trouble with the output? If so, then can you please help me to resolve this issue? That will be really helpful for me. Thank you for your time.
With regards,
Ankan
 
Hi Ankan,

Yes, you should decrease your radt to be ~dx. You can also set debug_level = 0 to see if that will reduce the messages. Warnings are not always indications of problems/errors. Is the simulation advancing? As stated before, with the size and complexity of your simulations and only using 28 processors, it may take an extremely long time to run.

Jordan
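One way to check whether the simulation is advancing is to read the most recent "Timing for main" lines from an rsl file and compare them a few minutes apart. The sketch below is an illustration only; the regular expression is based on the timing lines quoted earlier in this thread, and the function name is hypothetical.

```python
import re
from datetime import datetime

# Matches lines such as:
# Timing for main: time 2015-12-22_00:30:00 on domain 2: 493.95236 elapsed seconds
TIMING = re.compile(
    r"Timing for main: time\s+(\S+)\s+on domain\s+(\d+):\s*([\d.]+)\s+elapsed seconds"
)


def latest_model_times(text):
    """Return {domain number: latest model time seen} from rsl file contents."""
    latest = {}
    for match in TIMING.finditer(text):
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d_%H:%M:%S")
        domain = int(match.group(2))
        if domain not in latest or stamp > latest[domain]:
            latest[domain] = stamp
    return latest


if __name__ == "__main__":
    try:
        with open("rsl.error.0000", errors="replace") as handle:
            for domain, stamp in sorted(latest_model_times(handle.read()).items()):
                print(f"domain {domain}: latest model time {stamp}")
    except FileNotFoundError:
        print("rsl.error.0000 not found in the current directory")
```

If the reported model times move forward between two checks, the run is still advancing even though warnings are being printed.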
 
Hello @jordanschnell,
Earlier, I used debug_level=0, but the messages were still long. My simulation stopped today without reporting any errors after running for four days. Before the simulation ended, I saw that storage use on the /home/ directory was at 100%. I suspect that the simulation stopped due to lack of space, as storage use on /home/ dropped back to 39% after I deleted the rsl files. So, as you said, the simulation was taking a long time to run due to its size and complexity. I am now trying a test run of just a couple of days after decreasing the domain size and increasing the grid spacing (d01: 20 km and d02: 4 km). I considered MOZART-MOSAIC after going through the tutorial presentation 'Best Practices for Applying WRF-Chem 3.9.1.1', where the following lines are mentioned:
MOZART appropriate for simulations of pure gas-phase
MOZART-GOCART appropriate for simulations of months-years, simulations focused on trace gas chemistry
MOZART-MOSAIC appropriate for short-term simulations or aerosol-climate studies detailed analysis of trace gas and aerosol processes

Since I want to perform a short (15-day) simulation to study aerosol-cloud-climate interaction, I selected this chemistry option over the MOZCART option (MOZART-GOCART). What would you advise given my configuration (a Linux server with 32 processors and 378 GB of RAM)? Should I go for MOZART-MOSAIC, MOZCART, or another chemistry option? I also have a small query about the generation of biogenic emission input files, to make sure that I am doing it right. The following lines are mentioned in the README file of MEGAN:
NOTE: The MEGAN biogenic emission option requires input data sets for both the month of the simulation and the previous month.
If you are simulating in January then you will have to do all months;
start_lai_month = 1,
end_lai_month = 12,

Since my simulation period runs from December 2015 into January 2016, which of the following is correct?
start_lai_month = 11,
end_lai_month = 1,
or,

start_lai_month = 1,
end_lai_month = 12,
?

And my last question is: what should be the spin-up time for a simulation of 8 days to study aerosol-cloud-climate interaction?
Can you please guide me in this regard? Thank you for your time.
With regards,
Ankan
 
Hi Ankan,

Yes, it seems you filled up your directory space, that's probably why it stopped. Biogenic emission namelist looks fine, but you should check your output (e.g., isoprene) to make sure it is working.
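Since the rsl files were what filled the disk, a quick check of their total size and of the file system usage can be run alongside the simulation. This is a minimal sketch using only the Python standard library; the function names are hypothetical and the run directory is assumed to be the current directory.

```python
import shutil
from pathlib import Path


def rsl_bytes(run_dir="."):
    """Total size in bytes of all rsl.* files directly inside run_dir."""
    return sum(
        path.stat().st_size
        for path in Path(run_dir).glob("rsl.*")
        if path.is_file()
    )


def percent_used(path="."):
    """Percentage of the file system holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    print(f"rsl files: {rsl_bytes() / 1e9:.2f} GB")
    print(f"file system usage: {percent_used():.1f}%")
```

Running this periodically would show whether the log files are growing fast enough to fill the disk before the simulation finishes.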

Regarding your other questions, there is no "right" answer to many of them. MOZCART is cheaper than MOZART-MOSAIC and interacts with radiation, but there is no cloud chemistry. It will all depend on the level of complexity that you desire, and what questions you are trying to answer. With 32 processors, your simulation is going to take a while.

The same applies to your spin-up period: look at papers to see what others have used for similar domains and configurations. If you are initializing from another simulation you may not need much of a spin-up period, but if you are starting from default 'clean' profiles, it will depend on other factors like the size of your domain.

Jordan
 
Thank you, Jordan, for your guidance. It helps me a lot in understanding this model. Just to confirm: are the following input ('megan_bio_emiss.inp') settings correct for a simulation period running from December 2015 into January 2016?
start_lai_month = 1,
end_lai_month = 12,

With regards,
Ankan
 
Hello, I would like to ask a question.
How can I use the EDGARV5 MOZART dataset to generate anthropogenic emission input files ('wrfchemi_d<domain>_<date>.nc')?
Hi,
You can generate anthropogenic emission input files using the anthro_emis utility from NCAR or using prep-chem.
Best wishes,
Ankan
 
When using the anthro_emis tool, I was told that the data does not have a time dimension, so I switched to the prep_chem tool. How can I set use_edgar in prep_chem_src.inp?
 
Hi,
I don't have experience in generating emission files using prep-chem, so I can't help you. However, there are a few tutorial exercises available on the WRF-Chem webpage; maybe these can be helpful for you.
Best wishes,
Ankan
 
Thank you for your reply. Is there a link to the EDGARV5 MOZART dataset? Does this dataset have a time dimension? I want to see if I have found the wrong dataset.
 
Hello, Jordan
After watching your conversation with others, I have a question to ask you.
I installed WRF-Chem on the server and it runs successfully, but the MEBIO_ISOP variable in wrfout is 0. Do you know what could cause this?
Best wishes,
Haopeng
 
Yes, if you prepare the inputs for all months, there should not be an issue.
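The README note quoted earlier (inputs are needed for each simulation month and the month before it) can be turned into a small helper. This is a sketch of one conservative reading of that note, not the MEGAN preprocessor's own logic: when the needed months wrap around the year end, it falls back to preparing all twelve months, as the README suggests for January. The function names are hypothetical.

```python
def required_lai_months(simulation_months):
    """Months (1-12) needed: each simulated month plus the month before it."""
    needed = set()
    for month in simulation_months:
        needed.add(month)
        needed.add(12 if month == 1 else month - 1)
    return sorted(needed)


def lai_month_range(simulation_months):
    """Return (start_lai_month, end_lai_month) covering the needed months.

    If the needed months do not form one unbroken run within a calendar
    year (for example November, December, January), use 1 to 12.
    """
    needed = required_lai_months(simulation_months)
    if needed == list(range(needed[0], needed[-1] + 1)):
        return needed[0], needed[-1]
    return 1, 12


if __name__ == "__main__":
    # December into January, as in this thread.
    print(required_lai_months([12, 1]))
    print(lai_month_range([12, 1]))
```

For a run spanning December and January, the needed months are November, December and January, which wrap around the year end, so the helper returns the full 1 to 12 range.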