
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

kermarrec

I am getting a segmentation fault after 1 hour of simulation, even though I set the limit to unlimited. It is a real case with LES and 5 domains:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x7fed117fbd11 in ???
#1 0x7fed117faee5 in ???
#2 0x7fed112e708f in ???
at /build/glibc-SzIz7B/glibc-2.31/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
#3 0x5608f5537d3f in ???
#4 0x5608f553d304 in ???
#5 0x5608f5541382 in ???
#6 0x5608f4d1f995 in ???
#7 0x5608f4548525 in ???
#8 0x5608f3f8f498 in ???
#9 0x5608f3e248f3 in ???
#10 0x5608f2e666ad in ???
#11 0x5608f2e66d1a in ???
#12 0x5608f2e66d1a in ???
#13 0x5608f2ded667 in ???
#14 0x5608f2ded09e in ???
#15 0x7fed112c8082 in __libc_start_main
at ../csu/libc-start.c:308
#16 0x5608f2ded0dd in ???
#17 0xffffffffffffffff in ???


Could the problem come from my settings? I attach the namelist and the error file. Thanks for your help.
 

Attachments

  • rsl.out.0000
    214.5 KB
  • namelist.input
    6.8 KB
I have changed the namelist (attached) by setting
nproc_x =2,
nproc_y =2,
/
and
&namelist_quilt
nio_tasks_per_group = 1,
nio_groups = 2,

but now real.exe fails with this error:
taskid: 0 hostname: kornog
module_io_quilt_old.F 2931 T
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 5874
check comm_start, nest_pes_x, nest_pes_y settings in namelist for comm 1
-------------------------------------------

If I do not add nproc_x and nproc_y, real.exe runs successfully, but wrf.exe then fails with the following error:


taskid: 0 hostname: kornog
module_io_quilt_old.F 2931 F
MPASPECT: UNABLE TO GENERATE PROCESSOR MESH. STOPPING.
PROCMIN_M 1
PROCMIN_N 1
P 0
MINM 1
MINN 0
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 124
module_dm: mpaspect
-------------------------------------------
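Both fatal errors above involve how MPI tasks are split between the model grid and the I/O (quilt) servers. As a minimal sketch, assuming the usual WRF convention that the tasks requested at launch must cover nproc_x * nproc_y compute tasks plus nio_groups * nio_tasks_per_group I/O tasks, the following checks a configuration before launching. The function names are hypothetical and not part of WRF.

```python
def compute_tasks(total_mpi_tasks, nio_groups=0, nio_tasks_per_group=0):
    """Tasks left for the model grid once the I/O (quilt) servers are set aside.

    Assumption: I/O servers take nio_groups * nio_tasks_per_group tasks.
    """
    return total_mpi_tasks - nio_groups * nio_tasks_per_group


def decomposition_is_consistent(total_mpi_tasks, nproc_x, nproc_y,
                                nio_groups=0, nio_tasks_per_group=0):
    """True if nproc_x * nproc_y matches the tasks remaining for computation."""
    remaining = compute_tasks(total_mpi_tasks, nio_groups, nio_tasks_per_group)
    return remaining > 0 and nproc_x * nproc_y == remaining


if __name__ == "__main__":
    # Settings from the post: nproc_x = nproc_y = 2, nio_groups = 2,
    # nio_tasks_per_group = 1. The launch size of 4 is taken from a later post.
    print(decomposition_is_consistent(4, 2, 2, 2, 1))  # 4 - 2 = 2 tasks, but 2 x 2 = 4 needed
    print(decomposition_is_consistent(6, 2, 2, 2, 1))  # 6 - 2 = 4 tasks, matches 2 x 2
```

Under that assumption, a 2 x 2 decomposition with two I/O tasks would need 6 MPI tasks in total. One possible reading of `P 0` in the log is that no tasks were left for computation, but that is a guess from the output, not something the log states.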

I don't know what suddenly happened, as nothing works now. There is probably a mistake somewhere. Should I recompile WRF? Where does the problem come from?
Thanks for your help!
Best
 

Attachments

  • rsl.out.0000
    248 bytes
  • namelist.input
    6.9 KB
I managed to get further, but it still does not work. I recompiled WRF and used the command mpirun -np 4 ./wrf.exe instead of setting nproc_x, which did not work.
Now I am still getting a segmentation fault. I attach the rsl file and the namelist.
I ran grep cfl rsl* and got no matches. I have also tried several things: w_damping = 1, smooth_cg_topo = .true., and decreasing the time step.
Maybe you have some idea how I can get further. Many thanks in advance; I would appreciate any help in getting this simulation to launch.
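For reference, the `grep cfl rsl*` check can be sketched in Python, which also reports the file and line number of each match. This is only an illustration: `find_cfl_lines` is a hypothetical helper, and unlike plain `grep cfl` it matches case-insensitively.

```python
import glob


def find_cfl_lines(pattern="rsl.*"):
    """Return (path, line_number, text) for every line mentioning 'cfl'."""
    hits = []
    for path in sorted(glob.glob(pattern)):
        # errors="replace" keeps the scan going if a log contains odd bytes
        with open(path, errors="replace") as fh:
            for lineno, line in enumerate(fh, 1):
                if "cfl" in line.lower():
                    hits.append((path, lineno, line.rstrip()))
    return hits


if __name__ == "__main__":
    matches = find_cfl_lines()
    if not matches:
        print("no cfl messages found in rsl.* files")
    for path, lineno, text in matches:
        print(f"{path}:{lineno}: {text}")
```

An empty result only means no rsl file mentions CFL; it does not rule out other causes of the crash.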
 

Attachments

  • namelist.input
    6.9 KB
  • rsl.error.0000
    10.5 KB
Can you let me know the resolution of your input data (the data you use in the ungrib process)?
 
I used the default settings of WPS (for HGT_M, default: topo_gmted2010_30s, and for LANDUSEF, default: modis_landuse_20class_30s_with_lakes).

Is that not fine enough for the 1000 m domain? That is the point at which I get this error.
 
I changed to modis_15s_lake for LANDUSEF and topo_gmted2010_30s for HGT_M and unfortunately, it still did not work and stopped at the same place. I attach the error file.
 

Attachments

  • rsl.out.0000
    25.9 KB
  • rsl.error.0000
    7.1 KB
Hi,
I was referring to the resolution of the meteorological input data that you use for ungrib (e.g., ERA5 or GFS, etc) - not the static fields during geogrid. Can you let me know the resolution of the meteorological input? Thanks!
 
Apologies for the delay. The data you're using is fine, but 0.25-degree data has a grid distance of ~28 km, and your parent domain (d01) uses a grid distance of 3 km. This is roughly a 9:1 ratio. We recommend keeping the ratio no larger than about 5:1. This could be the reason for your issue. You may need an outer domain around the 3 km domain (perhaps something like 9 km) so there's not so much of a difference between the resolution of the input data and your parent domain.
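The arithmetic behind that recommendation can be sketched as follows. It assumes roughly 111 km per degree of latitude, which is an approximation; the function names and the 5:1 default are illustrative only.

```python
KM_PER_DEGREE = 111.0  # approximate length of one degree of latitude (assumption)


def degrees_to_km(resolution_deg):
    """Approximate grid distance in km for data on a regular lat/lon grid."""
    return resolution_deg * KM_PER_DEGREE


def resolution_ratio(coarse_km, fine_km):
    """Ratio between a coarser and a finer grid distance."""
    return coarse_km / fine_km


def ratio_ok(coarse_km, fine_km, max_ratio=5.0):
    """True if the jump in resolution stays within the suggested limit."""
    return resolution_ratio(coarse_km, fine_km) <= max_ratio


if __name__ == "__main__":
    input_km = degrees_to_km(0.25)
    print(f"input data: ~{input_km:.1f} km")
    # Direct jump from the input data to the 3 km parent domain
    print(f"input to 3 km: {resolution_ratio(input_km, 3.0):.1f}:1, ok={ratio_ok(input_km, 3.0)}")
    # With a 9 km outer domain in between
    print(f"input to 9 km: {resolution_ratio(input_km, 9.0):.1f}:1, ok={ratio_ok(input_km, 9.0)}")
    print(f"9 km to 3 km: {resolution_ratio(9.0, 3.0):.1f}:1, ok={ratio_ok(9.0, 3.0)}")
```

With a 9 km outer domain, both steps come out at about 3:1, which is within the suggested range.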
 
Hi, the problem was with my meteorological data, which did not match the domain. Sometimes the time step also needs to be adapted; I would check that. Good luck gael.
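As a rough starting point for checking the time step, a commonly cited WRF guideline is a time step in seconds of at most about 6 times the grid distance in km. The sketch below applies that rule of thumb; it is an assumption brought in for illustration, not something stated in this thread, and `max_time_step_s` is a hypothetical helper.

```python
def max_time_step_s(dx_m, factor=6.0):
    """Upper bound on the time step (s) from the rule of thumb factor * dx (km)."""
    return factor * dx_m / 1000.0


if __name__ == "__main__":
    # Grid distances mentioned in the thread: 3 km parent domain, 1000 m nest
    for dx_m in (3000.0, 1000.0):
        print(f"dx = {dx_m:.0f} m: time step <= about {max_time_step_s(dx_m):.0f} s")
```

Steep terrain or LES configurations may need a noticeably smaller value than this bound.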
 