(resolved) troubles, could not find trapping x locations

mengronglu · Aug 12, 2025

Previously, I had run multiple simulations using the same dataset, model, and parameters. Later, I modified p_top_requested = 10000 and successfully completed a full simulation, obtaining the wrfout output.

When I tried to run it again afterwards, the error started to occur:

Error info 1:
===========================================================
DYNAMICS OPTION: Eulerian Mass Coordinate
alloc_space_field: domain 1 , 330337484 bytes allocated
med_initialdata_input: calling input_input
Input data is acceptable to use:
CURRENT DATE = 2022-09-01_00:00:00
SIMULATION START DATE = 2022-09-01_00:00:00
Max map factor in domain 1 = 1.01. Scale the dt in the model accordingly.
D01: Time step = 120.0000 (s)
D01: Grid Distance = 30.00000 (km)
D01: Grid Distance Ratio dt/dx = 4.000000 (s/km)
D01: Ratio Including Maximum Map Factor = 4.041906 (s/km)
D01: NML defined reasonable_time_step_ratio = 6.000000
---- WARNING : Older v3 input data detected
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 684
---- Error : Cannot use moist theta option with old data
-------------------------------------------
Abort(1) on node 22 (rank 22 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 22
===========================================================

Afterwards I tried many times, but it still didn’t work. I checked the data and confirmed that the TITLE attribute containing V4.* is not missing; I have always included it. I then prepared to rerun real.exe to regenerate the data, but a new issue occurred:

Error info 2:
===========================================================
Using sfcprs3 to compute psfc
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 6506
troubles, could not find trapping x locations
-------------------------------------------
Abort(1) on node 1 (rank 1 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1
===========================================================

I then found on the forum that in some cases errors occurred after setting p_top_requested = 10000. So I reset p_top_requested = 5000 and regenerated new met_em.d0* data starting from WPS. However, the same error still occurred: “troubles, could not find trapping x locations,” as shown in the attached file.
This happened even after I changed the input data (I originally used ERA5_pl together with surface reanalysis data and Vtable.ECMWF, and then tried GFS-FNL data with Vtable.GFS) and even after recompiling WPS (v4.1) and WRF (v4.5.2, the same version as before). The problem persisted. (rsl.error.0014 is attached.)

Therefore, my question is: in my sample file, which variables are incorrect? Is it the PRES variable, where the data at k=0 comes from PSFC? However, I have previously run large-domain simulations in cases where PRES at k=0 was also taken from PSFC, and I did not encounter similar issues.

Given that all these problems appeared after I once modified p_top_requested, I am not sure whether that change could have had a lasting effect on the system environment.
I ask because, after error info 1 occurred, I reused the same data without rerunning WPS, modified p_top_requested, and ran the model with the original met_em.d0* files. The error message I got was:

Error info 3:
===========================================================
d01 2022-09-01_09:00:00 t(i,j,k) was 0 at 95 30 38 , setting Qv to 0
d01 2022-09-01_09:00:00 t(i,j,k) was 0 at 96 30 38 , setting Qv to 0
d01 2022-09-01_09:00:00 t(i,j,k) was 0 at 97 30 38 , setting Qv to 0
d01 2022-09-01_09:00:00 t(i,j,k) was 0 at 98 30 38 , setting Qv to 0
d01 2022-09-01_09:00:00 t(i,j,k) was 0 at 99 30 38 , setting Qv to 0
d01 2022-09-01_09:00:00 t(i,j,k) was 0 at 100 30 38 , setting Qv to 0
d01 2022-09-01_09:00:00 t(i,j,k) was 0 at 101 30 38 , setting Qv to 0
d01 2022-09-01_09:00:00 t(i,j,k) was 0 at 102 30 38 , setting Qv to 0
d01 2022-09-01_09:00:00 t(i,j,k) was 0 at 103 30 38 , setting Qv to 0
d01 2022-09-01_09:00:00 t(i,j,k) was 0 at 104 30 38 , setting Qv to 0
d01 2022-09-01_09:00:00 t(i,j,k) was 0 at 105 30 38 , setting Qv to 0
d01 2022-09-01_09:00:00 t(i,j,k) was 0 at 106 30 38 , setting Qv to 0
d01 2022-09-01_09:00:00 t(i,j,k) was 0 at 107 30 38 , setting Qv to 0
d01 2022-09-01_09:00:00 t(i,j,k) was 0 at 108 30 38 , setting Qv to 0
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 1339
grid%p_top > previous value
-------------------------------------------
Abort(1) on node 2 (rank 2 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 2
===========================================================
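With many MPI ranks, the fatal message can sit in any of the rsl.error.* files, so it can help to collect them all at once. The following is a minimal sketch, not part of WRF: the function names `find_fatal_messages` and `scan_run_directory` are hypothetical, and the parser assumes the banner layout seen in the logs above (a `FATAL CALLED FROM FILE: ... LINE: ...` line followed by the message).

```python
import re
from pathlib import Path

# Matches the header line that precedes a fatal message in the logs quoted above.
FATAL_HEADER = re.compile(r"FATAL CALLED FROM FILE:\s*(\S+)\s+LINE:\s*(\d+)")


def find_fatal_messages(log_text):
    """Return a list of (source_line_number, message) pairs found in one rsl log."""
    results = []
    lines = log_text.splitlines()
    for index, line in enumerate(lines):
        match = FATAL_HEADER.search(line)
        if match is None:
            continue
        # The message is taken to be the first following line that is
        # neither blank nor a pure dashed separator.
        message = ""
        for following in lines[index + 1:]:
            stripped = following.strip()
            if stripped and set(stripped) != {"-"}:
                message = stripped
                break
        results.append((int(match.group(2)), message))
    return results


def scan_run_directory(run_dir="."):
    """Map each rsl.error.* file name to the fatal messages it contains."""
    report = {}
    for path in sorted(Path(run_dir).glob("rsl.error.*")):
        found = find_fatal_messages(path.read_text(errors="replace"))
        if found:
            report[path.name] = found
    return report
```

Calling `scan_run_directory(".")` in the run directory returns only the ranks that hit a fatal error, which makes it easier to see whether every rank fails with the same message or only some of them do.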

Ming Chen · Aug 12, 2025

The error message seems to indicate that you are running a new version of WRF (newer than v4.0) but your input data were produced by an older WPS.

Can you run WPSv4.5 to create the met_em files, then rerun real.exe?

WPSv4.5 should be consistent with the WRF version you are using (WRFv4.5.2). Please let me know if you still have issues.

mengronglu · Aug 15, 2025

Hi Ming,

I have updated my WPS to version 4.5, but issues remain.

I ran the following tests:
test 1:
I followed this thread:
https://forum.mmm.ucar.edu/threads/era5-landmask-error.21837/
and downloaded the ERA5 pressure-level data from RDA:
https://rda.ucar.edu/datasets/d633000/dataaccess/#
Invariant files:

e5.oper.invariant.128_129_z.regn320sc.2016010100_2016010100.nc
e5.oper.invariant.128_172_lsm.ll025sc.1979010100_1979010100.nc
I then ran era5_to_int.py; the log file (era5.log) is attached.

Error when running real.exe:
Using sfcprs3 to compute psfc
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 6506
troubles, could not find trapping x locations
----------------------------------------------------

test_1_met_em.d0* : met_em_files.
test_1_namelist.wps , test_1_namelist.input, and the test_rsl.* files are attached.

test 2:
ERA5 pressure-level and surface-level data from CDS (cds.climate.copernicus.eu):
ERA5 hourly data on pressure levels from 1940 to present
ERA5 hourly data on single levels from 1940 to present

Error when running real.exe:
Using sfcprs3 to compute psfc
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 6506
troubles, could not find trapping x locations
-----------------------------------------------------
test_2_met_em.d0* : met_em_files.

My workflow is as follows:
1. Link the data and test_2_Vtable (based on Vtable.ECMWF).
2. Run ./ungrib.exe and ./metgrid.exe (test_2_namelist.wps is attached).

My questions:

Which part is most likely causing the problem — WPS or WRF?
Why does the vertical distribution of the variable PRES look like this? Could this be the cause?
(My earlier input fields looked similar, yet I was able to generate the wrfinput_d0* files, the PB field was normal, and the simulation ran without problems. That run used the same data and process as test_2, only with a 100×100 grid domain; met_em.d0*: met_em_files.)
If not, what other reasons could cause such a result?

[Figure: PRES, test_2, layer=0]

[Figure: PRES, test_2, layer=1]

Ming Chen · Aug 15, 2025

Hi,

Let's set aside test 2, which processed data from CDS. I am not familiar with the CDS data, and we always use ERA5 from NCAR RDA.

For test 1, I do believe that it is a data issue. You may not have downloaded all the required ERA5 data. I will take a look and get back to you.

Ming Chen · Aug 15, 2025

In your namelist.input, the option "interval_seconds = 10800" indicates that your ERA5 data should be at a 3-hr interval.

However, in your era5.log, the data are at a 6-hr interval (interval_hours = 6:00:00).

Can you make these two options consistent?
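One way to check this before running real.exe is to compare the spacing of the met_em files against interval_seconds. This is a minimal sketch under the assumption that the files follow the usual naming pattern met_em.d01.YYYY-MM-DD_HH:MM:SS.nc; the function names `met_em_times` and `mismatched_intervals` are hypothetical helpers, not WPS or WRF utilities.

```python
from datetime import datetime


def met_em_times(filenames, domain="d01"):
    """Extract sorted valid times from names like met_em.d01.2022-09-01_00:00:00.nc."""
    prefix = f"met_em.{domain}."
    suffix = ".nc"
    times = []
    for name in filenames:
        if name.startswith(prefix) and name.endswith(suffix):
            stamp = name[len(prefix):-len(suffix)]
            times.append(datetime.strptime(stamp, "%Y-%m-%d_%H:%M:%S"))
    return sorted(times)


def mismatched_intervals(filenames, interval_seconds, domain="d01"):
    """Return (earlier, later, gap_seconds) for every gap that differs from interval_seconds."""
    times = met_em_times(filenames, domain)
    problems = []
    for earlier, later in zip(times, times[1:]):
        gap = int((later - earlier).total_seconds())
        if gap != interval_seconds:
            problems.append((earlier, later, gap))
    return problems
```

For example, `mismatched_intervals(os.listdir("."), 10800)` (after `import os`) returns an empty list when every pair of consecutive met_em files is 3 hours apart, and lists the offending pairs otherwise.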

Also, please remove the options below:

! nproc_x = 6
! nproc_y = 9
nproc_x = -1,
nproc_y = -1,
numtiles = 1

mengronglu · Aug 17, 2025

Hi Ming,
Thanks for your reply!
I have updated the parameters in namelist.input and run the whole process again, but the issue remains the same.
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 6506
troubles, could not find trapping x locations
-------------------------------------------
test_6h_met_em.d0* : met_em_files.
May I ask if an environment-variable issue could cause this error? It appeared suddenly after running normally for a long time, and I haven’t been able to find or fix the cause by changing parameters or input data. I don’t think there’s anything wrong with my met_em.d0* files. I’m going to try downloading the official namelist.input again to see if that helps.

mengronglu · Aug 17, 2025

Hi Ming,
I ran more test cases (2022-09-01 to 2022-09-05, ref_lat=30, ref_lon=120) and found the following:
test_a:
e_we × e_sn = 100 × 100, dx=dy=30km, e_vert=50
Failed, with the same error message as shown for test_6h above.
test_b:
I deleted the SPECHUMD variable.
e_we × e_sn = 100 × 100, dx=dy=30km, e_vert=50
Succeeded.
test_c:
I deleted the SPECHUMD variable and changed the domain grid to 120 × 120; the other parameters remained the same (dx=dy=30km, e_vert=50).
Failed. The error and output files are attached as test_c_rsl*:
d01 2022-09-01_18:00:00 Timing for input 0 s.
d01 2022-09-01_18:00:00 flag_soil_layers read from met_em file is 1
grid%p_top from last time period = 5000.000
grid%p_top from this time period = 9660.516
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 1359
grid%p_top > previous value
-------------------------------------------
Abort(1) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
test_d:
The same as test_c, but with p_top_requested changed to 10000.
Failed. Error (test_d_rsl*):
-------------------------------------------------------
d01 2022-09-01_18:00:00 Timing for input 0 s.
d01 2022-09-01_18:00:00 flag_soil_layers read from met_em file is 1
Using sfcprs3 to compute psfc
i,j = 1 31
target pressure and value = NaN 1.4012985E-45
column of pressure and value = NaN 0.0000000E+00
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = NaN NaN
column of pressure and value = -Infinity NaN
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 6567
troubles, could not find trapping x locations
-------------------------------------------
test_e:
I deleted the SPECHUMD variable and changed dx=dy to 15km, with a 120 × 120 grid and e_vert=50.
Failed. Error (test_e_rsl*):
----------------------------------------
d01 2022-09-01_18:00:00 Timing for input 0 s.
d01 2022-09-01_18:00:00 flag_soil_layers read from met_em file is 1
grid%p_top from last time period = 5000.000
grid%p_top from this time period = 9481.984
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 1359
grid%p_top > previous value
-------------------------------------------

I'm really confused by these results. Do you have any advice?
Please let me know if you need any additional files. Thank you very much for your help.
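To make the test_d log above easier to read: the pressure column printed before the fatal message is entirely NaN or -Infinity, so no pair of levels can enclose the target pressure. The sketch below illustrates that idea only. It assumes that "trapping" means finding two adjacent column values that bracket the target; it is not the WRF routine, and the names `find_trapping_index` and `count_non_finite` are hypothetical.

```python
import math


def find_trapping_index(column, target):
    """Return i such that target lies between column[i] and column[i + 1], else None.

    Illustrative only: assumes "trapping" means bracketing the target
    between two adjacent, finite column values.
    """
    if not math.isfinite(target):
        return None
    for i in range(len(column) - 1):
        lower, upper = column[i], column[i + 1]
        if not (math.isfinite(lower) and math.isfinite(upper)):
            continue  # NaN or infinite levels can never bracket anything
        if min(lower, upper) <= target <= max(lower, upper):
            return i
    return None


def count_non_finite(column):
    """Count NaN and infinite entries in a column of values."""
    return sum(1 for value in column if not math.isfinite(value))
```

A healthy column such as `[100000.0, 92500.0, 85000.0]` brackets a target of 90000.0 at index 1, whereas a column of NaN values returns None for any target. Under that assumption, the fatal message in test_d is a symptom of corrupted input reaching the interpolation, not necessarily of a bad namelist value.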

Ming Chen · Aug 18, 2025

Are you running WRF-Chem? How did you compile the code?

The error message indicates that some namelist options changed while REAL was running. This is not allowed when running WRF-ARW.

I suspect that data communication during the parallel run is somehow not working correctly. This looks more like a compilation or machine issue.

mengronglu · Aug 19, 2025

You were absolutely right.
I tested with a single core via srun yesterday, and all my previous failed input fields ran successfully on my side. It turns out the failures were due to an MPI issue in the parallel runs.
Many thanks for your kind and professional help.

Ming Chen · Aug 20, 2025

Would you please clarify what MPI issue led to the weird model behavior? I suppose this information will be helpful for other users who may experience the same problem. Thanks in advance.

mengronglu · Aug 21, 2025

Hi Ming,

I don’t yet know why this is happening. I suspect an environment configuration issue, but I haven’t been able to identify the root cause. I’m still investigating. I normally use the Intel compilers and Intel MPI via the iimpi toolchain (icc, ifort, mpiicc, mpiifort), but when loading modules some GCC-built software is inevitably pulled in; this might be part of the problem. However, the same setup used to work, and, as I mentioned, everything broke suddenly. I built multiple WRF stacks and none of them run in parallel, which makes me suspect the system environment was changed (though I haven’t confirmed this with the admins yet).

Do you know effective debugging methods for this kind of issue? I’m keen to find the cause. I’ve checked the runtime links with ldd, and both local and system libraries appear fine. I’m now trying to compile WRF with GFortran + OpenMPI and to avoid using local libraries as much as possible. Hopefully this will resolve the issue.
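As a rough aid for the ldd check mentioned above, the output can be grouped by compiler runtime to see whether GNU and Intel libraries are both linked into the same executable. This is only a sketch: the `RUNTIME_FAMILIES` table is a hypothetical starting list of library-name substrings, the function names are made up for illustration, and finding more than one family is a hint to investigate, not proof of a fault.

```python
import subprocess

# Hypothetical substring-to-family table; extend it for the libraries on your system.
RUNTIME_FAMILIES = {
    "libgfortran": "GNU Fortran runtime",
    "libgomp": "GNU OpenMP runtime",
    "libifcore": "Intel Fortran runtime",
    "libifport": "Intel Fortran runtime",
    "libiomp5": "Intel OpenMP runtime",
}


def runtime_families(ldd_output):
    """Group resolved shared libraries from ldd output by apparent runtime family."""
    families = {}
    for line in ldd_output.splitlines():
        fields = line.split()
        if not fields or "not found" in line:
            continue
        library = fields[0]
        # ldd prints "name => /path (address)"; keep the resolved path when present.
        resolved = fields[2] if len(fields) >= 3 and fields[1] == "=>" else library
        for marker, family in RUNTIME_FAMILIES.items():
            if marker in library:
                families.setdefault(family, []).append(resolved)
    return families


def unresolved_libraries(ldd_output):
    """Return the names of libraries that ldd reports as 'not found'."""
    return [line.split()[0] for line in ldd_output.splitlines() if "not found" in line]


def inspect_executable(path):
    """Run ldd on an executable and summarise its runtime families and missing libraries."""
    output = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    ).stdout
    return runtime_families(output), unresolved_libraries(output)
```

Running `inspect_executable("./real.exe")` inside the same job environment that launches the parallel run (for example, within the srun step) shows what the executable actually resolves there, which may differ from what a login shell reports.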

Ming Chen · Aug 21, 2025

Hi,
Thank you for the detailed explanation. I understand that the various compilers and related libraries used to build WRF can be a big issue, and the environment settings add extra complexity. Sorry, but there are no specific approaches we can use to debug problems introduced by wrong environment settings. We usually install all the libraries and make sure they are consistent before we move on to build WRF.
