[SOLVED] Running WRF v4.5 with the new hybrid 100-m global land cover dataset with Local Climate Zones and MLUCM BEP

GiVu

New member
Hello.
I'm trying to run the latest version of the WRF model with the hybrid 100-m global land cover coupled with the BEP urban parameterization. I have three nested domains with 12 km, 4 km, and 1 km horizontal resolution, respectively.
Unfortunately, during the simulation of the first day, after the first hour, the model stops, and I get these errors:

[ne05:13804:0:13804] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffe07c1a8c0)
==== backtrace (tid: 13804) ====

0 0x0000000000012ce0 __funlockfile() :0
1 0x0000000002d52221 module_sf_sfclayrev_mp_psim_stable_() ???:0
2 0x0000000002d4d624 module_sf_sfclayrev_mp_sfclayrev1d_() ???:0
3 0x0000000002d4b1f6 module_sf_sfclayrev_mp_sfclayrev_() ???:0
4 0x00000000025da4c6 module_surface_driver_mp_surface_driver_() ???:0
5 0x0000000001e6be9c module_first_rk_step_part1_mp_first_rk_step_part1_() ???:0
6 0x00000000016e4fec solve_em_() ???:0
7 0x0000000001502ff8 solve_interface_() ???:0
8 0x00000000005b7803 module_integrate_mp_integrate_() ???:0
9 0x00000000004152d1 module_wrf_top_mp_wrf_run_() ???:0
10 0x000000000041528f MAIN__() ???:0
11 0x0000000000415222 main() ???:0
12 0x000000000003aca3 __libc_start_main() ???:0
13 0x000000000041512e _start() ???:0
=================================
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
wrf.exe 00000000031D4CCA for__signal_handl Unknown Unknown
libpthread-2.28.s 000014EED1F5FCE0 Unknown Unknown Unknown
wrf.exe 0000000002D52221 Unknown Unknown Unknown
wrf.exe 0000000002D4D624 Unknown Unknown Unknown
wrf.exe 0000000002D4B1F6 Unknown Unknown Unknown
wrf.exe 00000000025DA4C6 Unknown Unknown Unknown
wrf.exe 0000000001E6BE9C Unknown Unknown Unknown
wrf.exe 00000000016E4FEC Unknown Unknown Unknown
wrf.exe 0000000001502FF8 Unknown Unknown Unknown
wrf.exe 00000000005B7803 Unknown Unknown Unknown
wrf.exe 00000000004152D1 Unknown Unknown Unknown
wrf.exe 000000000041528F Unknown Unknown Unknown
wrf.exe 0000000000415222 Unknown Unknown Unknown
libc-2.28.so 000014EED1BC2CA3 __libc_start_main Unknown Unknown
wrf.exe 000000000041512E Unknown Unknown Unknown



The weird thing is that I tried running the model with the same setup but with BULK and didn't get any errors.
The namelist.input, rsl.out*, and rsl.error* files are attached.
Thank you in advance for any help or suggestions.
 

Attachments

  • namelist.input (6.3 KB)
  • rsl files.zip (392.4 KB)
UPDATE:
I increased the innermost domain to have it over the whole of Cyprus. This way, I could also increase the number of cores I used to run the simulations. Unfortunately, I'm still getting the same error.

Any suggestion is more than welcome.
 
Did you modify your GEOGRID.TBL.ARW according to WPS/geogrid/GEOGRID.TBL.ARW_LCZ?
 
Dear Haiqingsong, thanks for your reply.
I created a symbolic link from GEOGRID.TBL to GEOGRID.TBL.ARW_LCZ.
Is that also correct, or do I need to edit GEOGRID.TBL.ARW itself?
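For reference, the link was created roughly like this (a sketch, run from the top-level WPS directory; adjust paths to your installation):

cd geogrid
ln -sf GEOGRID.TBL.ARW_LCZ GEOGRID.TBL   # GEOGRID.TBL is normally a symlink to one of the GEOGRID.TBL.* variants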
 

Attachments

  • Screenshot 2023-06-28 at 19.48.23.png (94.8 KB)
I tried what you suggested, but I got the same error:

:0:6969] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffe07c1a8c0)
==== backtrace (tid: 6969) ====
0 0x0000000000012ce0 __funlockfile() :0
1 0x0000000002d52221 module_sf_sfclayrev_mp_psim_stable_() ???:0
2 0x0000000002d4d624 module_sf_sfclayrev_mp_sfclayrev1d_() ???:0
3 0x0000000002d4b1f6 module_sf_sfclayrev_mp_sfclayrev_() ???:0
4 0x00000000025da4c6 module_surface_driver_mp_surface_driver_() ???:0
5 0x0000000001e6be9c module_first_rk_step_part1_mp_first_rk_step_part1_() ???:0
6 0x00000000016e4fec solve_em_() ???:0
7 0x0000000001502ff8 solve_interface_() ???:0
8 0x00000000005b7803 module_integrate_mp_integrate_() ???:0
9 0x00000000004152d1 module_wrf_top_mp_wrf_run_() ???:0
10 0x000000000041528f MAIN__() ???:0
11 0x0000000000415222 main() ???:0
12 0x000000000003aca3 __libc_start_main() ???:0
13 0x000000000041512e _start() ???:0
================================
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
wrf.exe 00000000031D4CCA for__signal_handl Unknown Unknown
libpthread-2.28.s 000014EB6CCAACE0 Unknown Unknown Unknown
wrf.exe 0000000002D52221 Unknown Unknown Unknown
wrf.exe 0000000002D4D624 Unknown Unknown Unknown
wrf.exe 0000000002D4B1F6 Unknown Unknown Unknown
wrf.exe 00000000025DA4C6 Unknown Unknown Unknown
wrf.exe 0000000001E6BE9C Unknown Unknown Unknown
wrf.exe 00000000016E4FEC Unknown Unknown Unknown
wrf.exe 0000000001502FF8 Unknown Unknown Unknown
wrf.exe 00000000005B7803 Unknown Unknown Unknown
wrf.exe 00000000004152D1 Unknown Unknown Unknown
wrf.exe 000000000041528F Unknown Unknown Unknown
wrf.exe 0000000000415222 Unknown Unknown Unknown
libc-2.28.so 000014EB6C90DCA3 __libc_start_main Unknown Unknown
wrf.exe 000000000041512E Unknown Unknown Unknown
 
@GiVu,

1) When you mention running with 'BULK,' what exactly is BULK?

2) When you were able to run it successfully with BULK, were you using the exact same dates, domain size, number of processors, etc.?

There are a couple of things I can see that could be causing the issue.
  1. You are only using 4 processors to process some decently-sized domains. I would suggest using several more - perhaps something closer to 50 processors. Even if you were able to run with 4 processors with a different option, some options can require more processors. Can you give this a try and let me know if it makes a difference? (See the rule-of-thumb sketch after this list.)
  2. Your domain 03 is too small. e_we and e_sn should never be any smaller than 100x100.
  3. What happens if you use a 3:1 ratio for domain 03? It's not recommended to use even-numbered ratios, especially when you have feedback turned on.
  4. radt should be set to ~1 minute per km of grid spacing (of the innermost domain), meaning it should be something like 1 instead of 50.
As a side note, set debug_level = 0. Turning it on is rarely useful in helping to figure out the issue and ends up just printing junk to the rsl files, making them more difficult to read.
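For reference, the rule of thumb often quoted on this forum is that each MPI patch should end up somewhere between roughly 25x25 and 100x100 grid points. A minimal Python sketch of that arithmetic (the 25/100 patch bounds are the assumption here, not a hard limit):

import math

def processor_bounds(e_we, e_sn):
    # Rule-of-thumb MPI task counts for one WRF domain, assuming each
    # decomposed patch should be no smaller than ~25x25 and no larger
    # than ~100x100 grid points.
    most = (e_we // 25) * (e_sn // 25)                      # upper bound, set by the smallest domain
    fewest = math.ceil(e_we / 100) * math.ceil(e_sn / 100)  # lower bound, set by the largest domain
    return fewest, most

# e.g. a 150x150 innermost domain supports roughly 4 to 36 tasks
print(processor_bounds(150, 150))  # -> (4, 36)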
 
@kwerner, thank you for your reply and the suggestions!

1) With BULK, I'm referring to sf_urban_physics = 0. Yes, when I ran with BULK, I used the same setup as with BEP.

2) I first ran with BULK using the setup I attached in the first message. Then I implemented the LCZ dataset by linking GEOGRID.TBL to GEOGRID.TBL.ARW_LCZ and by adding 'cglc_modis_lcz' (the 100 m land cover dataset) to geog_data_res = 'default' in namelist.wps. Geogrid, ungrib, and metgrid all worked properly. In namelist.input, I changed sf_urban_physics from 0 to 2 and ran real.exe, which completed without errors. Finally, I ran wrf.exe and got the error above.
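For completeness, the &geogrid change amounts to something like the excerpt below (a sketch; the 'cglc_modis_lcz+default' form, which takes the 100 m LCZ land cover first and falls back to the default fields elsewhere, is the syntax I understand the LCZ instructions to use):

&geogrid
 ...
 ! one entry per domain
 geog_data_res = 'cglc_modis_lcz+default', 'cglc_modis_lcz+default', 'cglc_modis_lcz+default',
/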

1. Regarding the processors, I was using only 4 because of the small dimensions of the innermost domain. Every time I tried to increase the number of processors, I got errors. For this reason, I used the script provided in a post on this forum to estimate the number of processors from the dimensions of the smallest domain, but the suggestion was still 4 processors.

2. Because of the small number of processors, and because I read on this forum that domains shouldn't be any smaller than 100x100, I increased the size of d03, but it still didn't work. I'm now using 126 processors.

3 & 4. I'll try to set a 3:1 ratio for domain 03 and change the radt accordingly. Also, I'll keep in mind to set debug_level = 0.

While I'm replying, may I ask whether you think this error could be caused by a bug related to the LCZ classification over a small area like Cyprus or Nicosia? I've read that when this kind of error happens at the very beginning of a simulation, it can be caused by bad input data.

I'll keep you posted on anything that happens after changing the namelist. Thanks a lot for your time.
 
Hello everyone. Following the previous messages, I fixed the namelist and reran the model.
Unfortunately, I only managed to delay the error by 10 minutes. Now instead of happening at 2021-07-21_01:41:00, it happens at 2021-07-21_01:51:00.
I'm running a one-week test from 2021-07-21_00:00:00 until 2021-07-28_00:00:00.
I'm attaching the new namelist.input, plus the rsl* files folder, the namelist.wps, and the domain configuration.
Thank you in advance for any further help.
 

Attachments

  • wps_show_dom.pdf (66.5 KB)
  • rsl.tar (6 MB)
  • namelist.wps (1.2 KB)
  • namelist.input (5.9 KB)
Hi,
First, I'd like to apologize for the long delay in responding. I have been bogged down preparing for the WRF tutorial taking place right now and have gotten behind on forum responses.

Since you were able to run with urban option 2 without problems, and the failure appeared only when you tried the LCZ dataset with everything else the same, it's quite possible the issue is with either that dataset or the process you used to implement it. If you haven't already, since you've now modified your domains and number of processors a bit, can you try rerunning the original test (default static data during geogrid, with sf_urban_physics = 2, plus the updated namelist, domain size, and number of processors) to confirm that it still runs without issues?

Before you do that, please save the geogrid.log file from the problematic run (the one using the LCZ dataset) somewhere else (or under a different name) so it doesn't get overwritten, and then please send me the two geogrid.log files. Thanks!
 
Thank you, @kwerner, for your reply.

Unfortunately, I haven't been able to run the model with urban option 2; I've only managed to run it with urban option 0 (the bulk one).
I'm uploading a folder named bulk_static, which includes geogrid.log, ungrib.log, and metgrid.log for urban option 0.

Anyway, I tried to run WRF with the same setup but using the default static data instead of the CGLC dataset, and this time I got a different error.
Because of its parameterizations, urban option 2 requires LCZ information, which I retrieved using the w2w script. Again, everything works fine up to the wrf.exe execution, and then:

CFC11 = 2.152171322463335E-010 volume mixing ratio
CFC12 = 4.898791989458283E-010 volume mixing ratio
INPUT LandUse = "MODIFIED_IGBP_MODIS_NOAH"
LANDUSE TYPE = "MODIFIED_IGBP_MODIS_NOAH" FOUND 61 CATEGORIES 2 SEASONS WATER CATEGORY = 17 SNOW CATEGORY = 15
Climatological albedo is used instead of table values
SOIL TEXTURE CLASSIFICATION = STAS FOUND 19 CATEGORIES
-------------- FATAL CALLED ---------------
USING URBPARM_LCZ.TBL WITH OLD 3 URBAN CLASSES. SET USE_WUDAPT_LCZ=0

(Regarding this error: it is related to this bug, but I haven't started working on it yet.)
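For context, the FATAL message points at a mismatch between URBPARM_LCZ.TBL and a 3-class urban configuration. A hedged sketch of the &physics entries involved (the values shown are what an LCZ-based run would typically use, judging from the message and the 61-category line in the log above; not a verified fix):

&physics
 ...
 num_land_cat     = 61,  ! 61 land-use categories when the LCZ classes are present
 use_wudapt_lcz   = 1,   ! 1 = LCZ urban classes; the FATAL message asks for 0 with the old 3 classes
 sf_urban_physics = 2,   ! BEP
/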

I'm uploading a folder named bep_w2w, which includes the geo_em.d0* files created with the script, the rsl.out.* and rsl.error.* files, plus geogrid.log, ungrib.log, and metgrid.log for urban option 2.

Finally, I tried to run the model with the CGLC map, and, as mentioned above, it stopped after a short while.
I'm also uploading a folder named bep_cglc containing the rsl.out.* and rsl.error.* files, plus geogrid.log, ungrib.log, and metgrid.log for urban option 2.

For the record, everything I described above also happens with urban option 3.

Thank you in advance for your help.
 

Attachments

  • bulk_static.tar (1.5 KB)
  • bep_w2w.tar (1.5 KB)
  • bep_cglc.tar (1.5 KB)
Hi GiVu,

I have come across the same problem using urban option 1 as well. I checked the variables, and it seems that at a specific grid point the perturbation geopotential (PH) becomes NaN, which makes dz8w NaN and then crashes the run when sfclayrev is called. I am now worried that the LCZ static data itself may have some bugs...
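In case it helps anyone reproduce this kind of check, a minimal Python sketch for scanning a wrfout file for NaNs in PH (the file name is hypothetical; point it at the history file written just before the crash):

import numpy as np
from netCDF4 import Dataset

# Scan the perturbation geopotential (PH) for NaNs and report the first hit.
with Dataset("wrfout_d03_2021-07-21_01:50:00") as nc:  # hypothetical file name
    nc.set_auto_mask(False)             # return plain arrays instead of masked arrays
    ph = nc.variables["PH"][:]          # dims: (Time, bottom_top_stag, south_north, west_east)
    bad = np.argwhere(np.isnan(ph))
    if bad.size:
        t, k, j, i = bad[0]
        print(f"{len(bad)} NaNs in PH; first at time={t}, k={k}, j={j}, i={i}")
    else:
        print("no NaNs found in PH")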
 
Hi @winstonwu91,
Thank you for your reply!
I saw your post yesterday and followed it, because your issue did look similar to mine. Did you manage to fix the sfclayrev issue in the end?
By the way, I have also started to think that the LCZ data may have some bugs.
 
UPDATE: I solved the issue by changing the LSM used in the simulations. I was using Noah-MP, but I had to switch to Noah (sf_surface_physics = 2), as suggested by the user manual.
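For anyone landing here from a search, the fix amounts to this single &physics switch (a sketch, assuming the standard option numbering in which 4 is Noah-MP):

&physics
 ...
 sf_surface_physics = 2,  ! Noah LSM; I had previously been running 4 (Noah-MP)
/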
 

Attachments

  • Screenshot 2023-08-06 at 11.27.43.png (93.9 KB)
Hi, GiVu! I wonder whether the new LCZ data is suitable for SLUCM (sf_urban_physics = 1). I tried using BEP with the LCZ data, and WRF ran well; however, when using SLUCM, WRF crashed at the first time step. Have you ever tried SLUCM with the LCZ data?
 
Hi there! Sorry for the late reply. Unfortunately, I haven't tried the SLUCM.
 