Scheduled Downtime
On Friday 21 April 2023 @ 5pm MT, this website will be down for maintenance and expected to return online the morning of 24 April 2023 at the latest

Segmentation Fault when using YSU PBL scheme and ETA/Ferrier for mp scheme

luizfcs

New member
I succefully run WRF using WRF Double-moment 6-class (WDM6) as a microphisic scheme and Mellor-Yamada-Janjic (MYJ) as PBL scheme.
But when i try to use ETA Ferrier microphisic scheme it crashes, due to segmentation fault.
Also when i try to uso YSU as PBL scheme, even with WDM6, it crashes due to segmentation fault. I hve no idea what coul possibly be causing this, any help?

Runing on EC2, 32 processors, mpirun np4 -bind to core:8
 

Attachments

  • namelist.input.txt
    3.3 KB · Views: 4
It crashes imediately, the error message: is segmentation fault (11)

After some more tests i narrow it down the problem:

I was able to run with a smaller domain 100x100 with the same mpi passing ( 4 process, 8 cores per process)

I tryed again with a larger domain 160x160, but i was only able to succefully run the simullation with 16 process and 1 core per process.

On my last try, i expanded my domain to 240x240, but with this gridsize i was unabble even to run previous parametrizations MYJ and Goddar, wich was running with a 200x200 grid)

My question now is if its only a siple matter to use more processors, or if my parallell strategy isn't right.

I am running the WRF on an AWS EC2 wrf-arm c7g.8xlarge with 32 VCPU, 64 GB ram memory. And i am usin dmshare strategy (don't know if i should use SM ou DM + SM with VCPU from AWS)
 
Based on the information you posted, I believe this is a memory issue. I wonder whether you can compile WRF in dmpar mode, then run the case with more number of processors, which will give you larger memory.
I am not quite familiar with AWS. If this issue persists, please contact AWS manager to get larger memory.
 
Thank you for the support!

It turns out it was really a memory issue. I tryied some more tilying strategyies and mannaged to run thw 240x240 grid with 32 process. To max perform the elapsed time rate of each time step i am using 1 processor for each process. I'm on the limit of the AWS machine now, but at least it works.
 
Top