CLWRF: 'CAMtr_volume_mixing_ratio' does not exist

jhegarty

New member
Hi All,

I am trying to run WRF 4.5.1 and get the following error.

-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 203
CLWRF: 'CAMtr_volume_mixing_ratio' does not exist
-------------------------------------------

I started by cloning the latest WRF repository late last month (August 2023). The executable was built with the Intel compiler.

I found another thread on this topic, but it was closed, and a message at the top instructed users to start a new thread. In the old thread the solution was to have CAMtr_volume_mixing_ratio.RCP8.5 available in the WRF run directory. That file is present in my run directory with all read and execute permission bits set.

Please let me know if there is another solution.

Thanks,

Jen
 
Jen,

CAMtr_volume_mixing_ratio data files for the different scenarios can be found in the WRF/run directory.

When you run WRF under one specific scenario, you need to link the corresponding CAMtr file into your working directory under the name "CAMtr_volume_mixing_ratio".
For example, if you are running wrf.exe in WRF/test/em_real for a simulation under the RCP8.5 scenario, you need to:

cd WRF/test/em_real
ln -sf ../../run/CAMtr_volume_mixing_ratio.RCP8.5 CAMtr_volume_mixing_ratio

Then you can run wrf.exe.
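
Before launching, it may help to confirm that the link actually resolves. The following is a minimal sketch, not part of WRF: check_camtr_link is a hypothetical helper name, and it only assumes that the file is expected under the plain name CAMtr_volume_mixing_ratio in the run directory, as described above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of WRF): check that CAMtr_volume_mixing_ratio
# in a run directory resolves to a readable, non-empty file.
check_camtr_link() {
    camtr_dir="${1:-.}"
    camtr_file="$camtr_dir/CAMtr_volume_mixing_ratio"
    # -e follows symlinks, so a dangling link is reported here as well
    if [ ! -e "$camtr_file" ]; then
        echo "missing or dangling: $camtr_file"
        return 1
    fi
    # -r: readable by the current user, -s: size greater than zero
    if [ ! -r "$camtr_file" ] || [ ! -s "$camtr_file" ]; then
        echo "unreadable or empty: $camtr_file"
        return 1
    fi
    echo "ok: $camtr_file"
}
```

For example, check_camtr_link WRF/test/em_real prints an "ok:" line when the link is usable and returns a non-zero status otherwise.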
 
Hello Ming,

I recently hit this error, and it terminates my restart run every time. I made sure that the file is properly linked, and it does exist in the run directory. The run starting from the initial time worked fine, but when I try to continue it from the restart files, the model prints this error and the simulation is killed. Could you please help me? If you have any suggestions on what I could try or how to troubleshoot further, I'd really appreciate it.

Thanks,
Zhan
 
Hi Zhan,

This error message comes from SUBROUTINE read_CAMgases in phys/module_ra_clWRF_support.F.

Can you add some print statements right before the following line to check the model and dates for this case?

CALL wrf_error_fatal("CLWRF: 'CAMtr_volume_mixing_ratio' does not exist")

Specifically, I would like to know values of the following variables:

model, yr, julian, max_years

Thanks.
 
Hi, my name is Andrea. I have been running WRF 4.5.2 on my cluster, and I get the same error, with the following output:
a.7116@tlaloc ~/Build_WRF/WPS-4.5 $ srun -n 4 -p golden wrf.exe
starting wrf task 1 of 4
starting wrf task 2 of 4
starting wrf task 3 of 4
starting wrf task 0 of 4
srun: Job step aborted: Waiting up to 62 seconds for job step to finish.
slurmstepd: error: *** STEP 26072.0 ON node1 CANCELLED AT 2026-02-23T14:45:00 ***
srun: error: node1: tasks 0-3: Killed
a.7116@tlaloc ~/Build_WRF/WPS-4.5 $ cat rsl.out.0000
taskid: 0 hostname: node1
Quilting with 1 groups of 0 I/O tasks.
Ntasks in X 2 , ntasks in Y 2
Domain # 1: dx = 9000.000 m
WRF V4.5.2 MODEL
No git found or not a git repository, git commit version not available.
*************************************
Parent domain
ids,ide,jds,jde 1 120 1 120
ims,ime,jms,jme -4 67 -4 67
ips,ipe,jps,jpe 1 60 1 60
*************************************
DYNAMICS OPTION: Eulerian Mass Coordinate
alloc_space_field: domain 1 , 170995636 bytes allocated
med_initialdata_input: calling input_input
Input data is acceptable to use: wrfinput_d01
CURRENT DATE = 2023-10-23_00:00:00
SIMULATION START DATE = 2023-10-23_00:00:00
Timing for processing wrfinput file (stream 0) for domain 1: 0.48414 elapsed seconds
Max map factor in domain 1 = 1.00. Scale the dt in the model accordingly.
D01: Time step = 54.00000 (s)
D01: Grid Distance = 9.000000 (km)
D01: Grid Distance Ratio dt/dx = 6.000000 (s/km)
D01: Ratio Including Maximum Map Factor = 5.976624 (s/km)
D01: NML defined reasonable_time_step_ratio = 6.000000
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 203
CLWRF: 'CAMtr_volume_mixing_ratio' does not exist
I don't know what to do. Could someone help me? I am simulating Hurricane Otis.
 
@andysant18

CAMtr_volume_mixing_ratio data files for the different scenarios can be found in the WRF/run directory.

When you run WRF under one specific scenario, please link the corresponding CAMtr file into your working directory under the name "CAMtr_volume_mixing_ratio".

For example, if you are running wrf.exe in WRF/test/em_real for a simulation under the RCP8.5 scenario, you need to:

cd WRF/test/em_real
ln -sf ../../run/CAMtr_volume_mixing_ratio.RCP8.5 CAMtr_volume_mixing_ratio

Please try and let me know whether it works for you.
 
Hello Ming,

Sorry about the late reply. We were able to avoid a restart last time, and later the restart worked, so I did not pursue this error further.
However, this time the restart failed again, and I am trying everything I can to figure it out.
I printed the following variables: model, yr, julian, max_years, and also the absolute and relative paths of the file. One of the ranks reported that the CAMtr file does not exist, while the others confirmed that it does. It is very interesting to me that the ranks behave differently like this. Do you have any thoughts? Thank you.

rsl.out.0224:
Checking file: [CAMtr_volume_mixing_ratio]
/pscratch/sd/z/zhanshi/ensemble_18h/run
relative exists = F
absolute exists = F
model=RRTMG yr= 2023 julian= 159.0000 max_years= 233
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 215
CLWRF: 'CAMtr_volume_mixing_ratio' does not exist
-------------------------------------------
taskid: 224 hostname: nid004800

rsl.out.0220:
D01: NML defined reasonable_time_step_ratio = 6.000000
Checking file: [CAMtr_volume_mixing_ratio]
/pscratch/sd/z/zhanshi/ensemble_18h/run
relative exists = T
absolute exists = T
Normal ending of CAMtr_volume_mixing_ratio file
GHG annual values from CAM trace gas file
Year = 2023 , Julian day = 159
CO2 = 4.230768799241307E-004 volume mixing ratio
N2O = 3.343851326344848E-007 volume mixing ratio
CH4 = 1.941778516043003E-006 volume mixing ratio
CFC11 = 2.100264328202455E-010 volume mixing ratio
CFC12 = 4.810798593491203E-010 volume mixing ratio
taskid: 220 hostname: nid004581
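
Since only some ranks report the file as missing, it may be worth listing which rsl.out.* files contain the fatal message and which host each of those ranks ran on. This is a rough sketch under stated assumptions: failed_camtr_hosts is a hypothetical helper name, and it assumes each rsl.out.* file contains a "taskid: N hostname: NAME" line as in the excerpts above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of WRF): print "<rsl file> <hostname>" for
# every rsl.out.* log in a directory that contains the CLWRF fatal message.
failed_camtr_hosts() {
    rsl_dir="${1:-.}"
    # grep -l prints only the names of the files that contain the message
    grep -l "CAMtr_volume_mixing_ratio' does not exist" "$rsl_dir"/rsl.out.* 2>/dev/null |
    while IFS= read -r rsl_log; do
        # Pull the host name out of a line such as "taskid: 224 hostname: nid004800"
        rsl_host=$(sed -n 's/.*hostname: *\([^ ]*\).*/\1/p' "$rsl_log" | head -n 1)
        echo "$(basename "$rsl_log") ${rsl_host:-unknown}"
    done
}
```

Running failed_camtr_hosts . | awk '{print $2}' | sort | uniq -c in the run directory then counts failures per node. If the failures cluster on a few nodes, that could point toward how those nodes see the file system rather than toward WRF itself, which is a question for HPC support.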

Besides the old problem, a new one has appeared.
I also suspect that this problem is sensitive to the domain decomposition. In the previous successful case I used 900 ranks; when that run stopped, it restarted successfully with 900 ranks. This time I used 1024 ranks for the initial run. For the restart, when I use a smaller number of processors, such as 256, I get "CLWRF: 'CAMtr_volume_mixing_ratio' does not exist". When I use 1024 ranks, as in the initial run, I get "Normal ending of CAMtr_volume_mixing_ratio file" but then a different error message: "cxil_map: write error". The restart run never reaches wrfrst_d03 and crashes after reading wrfrst_d02.

*** subr move_sections - method = 20
*** subr move_sections - idiag = 0
dep_init: initializing for 3 domains
start_domain_em: numgas = 141
*************************************
Nesting domain
ids,ide,jds,jde 1 301 1 268
ims,ime,jms,jme -4 21 93 120
ips,ipe,jps,jpe 1 10 103 110
INTERMEDIATE domain
ids,ide,jds,jde 28 133 29 123
ims,ime,jms,jme 23 42 55 77
ips,ipe,jps,jpe 26 32 65 67
*************************************
d01 2023-06-08_21:00:00 alloc_space_field: domain 2 , 51352560 bytes allocated
d01 2023-06-08_21:00:00 alloc_space_field: domain 2 , 301273920 bytes allocated
RESTART: nest, opening wrfrst_d02_2023-06-08_21:00:00 for reading
d01 2023-06-08_21:00:00 Input data is acceptable to use:
cxil_map: write error
cxil_map: write error
cxil_map: write error

MPICH ERROR [Rank 384] [job id 54199333.0] [Tue Jun 9 00:25:52 2026] [nid004876] - Abort(405397519) (rank 384 in comm 0): Fatal error in PMPI_Scatterv: Other MPI error, error stack:
PMPI_Scatterv(416).....: MPI_Scatterv(sbuf=0x4aec2340, scnts=0x4af77c40, displs=0x4af78c50, dtype=0x4c000427, rbuf=0x7fff9e475900, rcount=28800, dtype=0x4c000427, root=0, comm=comm=0xc4000000) failed
MPIR_CRAY_Scatterv(502):
MPIC_Recv(194).........:
MPID_Recv(380).........:
MPIDI_recv_unsafe(87)..:
MPIDI_OFI_do_irecv(356): OFI tagged recv failed (ofi_recv.h:356:MPIDI_OFI_do_irecv:Bad address)

aborting job:
Fatal error in PMPI_Scatterv: Other MPI error, error stack:
PMPI_Scatterv(416).....: MPI_Scatterv(sbuf=0x4aec2340, scnts=0x4af77c40, displs=0x4af78c50, dtype=0x4c000427, rbuf=0x7fff9e475900, rcount=28800, dtype=0x4c000427, root=0, comm=comm=0xc4000000) failed
MPIR_CRAY_Scatterv(502):
MPIC_Recv(194).........:
MPID_Recv(380).........:
MPIDI_recv_unsafe(87)..:
MPIDI_OFI_do_irecv(356): OFI tagged recv failed (ofi_recv.h:356:MPIDI_OFI_do_irecv:Bad address)

I then realized this might be related to the bug reported and fixed in the following issue (I have copied the description below in case you cannot open the link):

MPI_Gatherv/MPI_Scatterv displacements overflow in frame/collect_on_comm.c

https://github.com/wrf-model/WRF/issues/2156
Determine MPI Data Types in col_on_comm() & dst_on_comm() to prevent displacements overflow.

TYPE: bug fix

KEYWORDS: prevent displacements overflow in MPI_Gatherv() and MPI_Scatterv() operations

SOURCE: Benjamin Kirk & Negin Sobhani (NSF NCAR / CISL)

DESCRIPTION OF CHANGES:
Problem:
The MPI_Gatherv() and MPI_Scatterv() operations require integer displacements into the communications buffers. Historically everything is passed as an MPI_CHAR, causing these displacements to be larger than otherwise necessary. For large domain sizes this can cause the displace[] offsets to exceed the maximum int, wrapping to negative values.

Solution:
This change introduces additional error checking and then uses the function MPI_Type_match_size() (available since MPI-2.0) to determine a suitable MPI_Datatype given the input *typesize. The result then is that the displace[] offsets are in terms of data type extents, rather than bytes, and less likely to overflow.

ISSUE: Fixes #2156

LIST OF MODIFIED FILES:
M frame/collect_on_comm.c

TESTS CONDUCTED:
Failed cases run now.

RELEASE NOTE:
Determine MPI Data Types in col_on_comm() & dst_on_comm() to prevent displacements overflow.
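
To make the arithmetic in the description above concrete, here is a small illustration of why counting displacements in bytes overflows sooner than counting them in data-type elements. It is only a sketch: fits_in_int is a hypothetical helper, the count of 600000000 four-byte values is an invented example and not taken from this case, and it relies on the shell doing 64-bit integer arithmetic.

```shell
#!/usr/bin/env bash
# Largest value a signed 32-bit int displacement can hold.
INT_MAX=2147483647

# fits_in_int COUNT TYPESIZE UNIT   (hypothetical helper, for illustration only)
# UNIT is "bytes" (displacement counted in bytes, as when everything is sent
# as MPI_CHAR) or "elements" (displacement counted in units of the data type).
fits_in_int() {
    fi_count=$1
    fi_typesize=$2
    fi_unit=$3
    if [ "$fi_unit" = "bytes" ]; then
        fi_disp=$((fi_count * fi_typesize))
    else
        fi_disp=$fi_count
    fi
    if [ "$fi_disp" -le "$INT_MAX" ]; then
        echo "fits ($fi_disp)"
    else
        echo "overflows ($fi_disp)"
    fi
}

# Invented example: an offset of 600000000 four-byte values.
fits_in_int 600000000 4 bytes      # prints "overflows (2400000000)"
fits_in_int 600000000 4 elements   # prints "fits (600000000)"
```

The same offset that exceeds the int range when measured in bytes stays well inside it when measured in elements, which is the effect the fix is described as relying on.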


I applied the code change and recompiled, but unfortunately it did not fix the problem for me.
I have little experience in debugging the code. Please let me know if you have any advice or experience in solving this kind of problem. I will also work with HPC support, because this problem might not be purely related to WRF-Chem itself.

Thank you very much for your time and help.
 