
WRFDA memory leaks


hanlung

Hi,

I recently found some bugs in WRFDA that cause memory leaks. I am wondering how I should report them. Could you give me some advice?

Thanks,

Han
 
I have moved this topic to the WRFDA Bug Report section, since this is a WRFDA concern.
 
Fujitsu and Taiwan's Central Weather Bureau (CWB) recently found some bugs in WRFDA that cause memory access errors. After investigation, we fixed the problem and decided to report the issue to this forum and the development group.

These tests were conducted on WRFDA release 3.9, but the issue is still present in the latest release (3.9.1) and also in WRF-ARW (release 4.0.3), since much of the code is shared between the two systems. The test case assimilated only radar data and, as far as is known, the memory leak affects only radar data.

If the Resident Set Size (RSS) is measured at each iteration of the minimization process while assimilating radar data, memory use increases at every iteration, a clear indication of a potential memory leak. Within each loop of the minimization procedure there are calls to the subroutines "da_calculate_j" and "da_calculate_gradj". Both of these call subroutine "da_allocate_y" to allocate a type variable that holds observation data, and subroutine "da_deallocate_y" to deallocate that type variable once processing has completed for the iteration.
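The per-iteration RSS measurement described above could be instrumented along the following lines. This is only a sketch, not the method used in the original tests: it assumes a Linux system where `/proc/self/status` contains a `VmRSS:` line, and the names `rss_probe`, `current_rss_kb` and `rss_demo` are hypothetical.

```fortran
module rss_probe
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
contains

  ! Return the resident set size of this process in kB,
  ! or -1 if it cannot be determined (e.g. not on Linux).
  function current_rss_kb() result(rss_kb)
    integer(int64) :: rss_kb
    integer :: unit, ios, i
    character(len=256) :: line

    rss_kb = -1_int64
    open (newunit=unit, file='/proc/self/status', status='old', &
          action='read', iostat=ios)
    if (ios /= 0) return

    do
       read (unit, '(A)', iostat=ios) line
       if (ios /= 0) exit
       if (line(1:6) == 'VmRSS:') then
          ! The value is separated by tabs; turn them into blanks so a
          ! list-directed read can parse the number portably.
          do i = 1, len(line)
             if (line(i:i) == achar(9)) line(i:i) = ' '
          end do
          read (line(7:), *, iostat=ios) rss_kb
          if (ios /= 0) rss_kb = -1_int64
          exit
       end if
    end do
    close (unit)
  end function current_rss_kb

end module rss_probe

program rss_demo
  use, intrinsic :: iso_fortran_env, only: int64
  use rss_probe, only: current_rss_kb
  implicit none
  integer :: iter
  integer(int64) :: rss_kb

  do iter = 1, 3
     ! One iteration of the minimization would run here.
     rss_kb = current_rss_kb()
     print '(A,I0,A,I0,A)', 'iteration ', iter, ': RSS = ', rss_kb, ' kB'
  end do
end program rss_demo
```

The RSS is read into a variable before printing so that the function's own file I/O does not run inside another I/O statement. A value that keeps growing from one iteration to the next would be consistent with the leak described here.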

The code to allocate the type variable to hold radar data is
====
if (y % nlocal(radar) > 0) then
   allocate (y % radar(1:y % nlocal(radar)))
   do n = 1, y % nlocal(radar)
      nlevels = iv%info(radar)%levels(n)
      allocate (y % radar(n)%rv(1:nlevels))
      allocate (y % radar(n)%rf(1:nlevels))
      allocate (y % radar(n)%rrn(1:nlevels))
      allocate (y % radar(n)%rsn(1:nlevels))
      allocate (y % radar(n)%rgr(1:nlevels))
      allocate (y % radar(n)%rcl(1:nlevels))
      allocate (y % radar(n)%rci(1:nlevels))
      allocate (y % radar(n)%rqv(1:nlevels))

      y % radar(n) % rv(1:nlevels) = 0.0
      y % radar(n) % rf(1:nlevels) = 0.0
      y % radar(n) % rrn(1:nlevels) = 0.0
      y % radar(n) % rsn(1:nlevels) = 0.0
      y % radar(n) % rgr(1:nlevels) = 0.0
      y % radar(n) % rcl(1:nlevels) = 0.0
      y % radar(n) % rci(1:nlevels) = 0.0
      y % radar(n) % rqv(1:nlevels) = 0.0
   end do
end if
====

However, the code to deallocate the type is
====
if (y % nlocal(radar) > 0) then
   do n = 1, y % nlocal(radar)
      deallocate (y % radar(n)%rv)
      deallocate (y % radar(n)%rf)
   end do
   deallocate (y % radar)
end if
====

Only two of the child arrays are explicitly deallocated before the parent array y % radar is deallocated. Deallocating the parent is not sufficient to free the memory used by the remaining child arrays; the memory occupied by those child arrays simply becomes inaccessible. Therefore, at each iteration more memory is acquired and then made inaccessible, and in some circumstances this eventually causes the program to crash.

The solution is simply to explicitly deallocate all the child arrays.
====
if (y % nlocal(radar) > 0) then
   do n = 1, y % nlocal(radar)
      deallocate (y % radar(n)%rv)
      deallocate (y % radar(n)%rf)
      deallocate (y % radar(n)%rrn)
      deallocate (y % radar(n)%rsn)
      deallocate (y % radar(n)%rgr)
      deallocate (y % radar(n)%rcl)
      deallocate (y % radar(n)%rci)
      deallocate (y % radar(n)%rqv)
   end do
   deallocate (y % radar)
end if
====

Further notes: Only the Fujitsu Fortran compiler has been used to run the full WRFDA system. However, the problem has been reproduced in a test program, and experiments with this program show that it also occurs with the Intel compiler (2019 v3 tested) and therefore presumably with other Fortran compilers.
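The test program mentioned above was not posted. The following is a minimal sketch of what such a reproduction might look like, assuming the child arrays are POINTER components (which is the case where deallocating the parent does not release them). The type `radar_obs_t` and the variable `witness` are hypothetical and exist only for illustration.

```fortran
program leak_demo
  implicit none

  ! Hypothetical stand-in for the radar residual type; the components
  ! are assumed to be POINTER arrays.
  type :: radar_obs_t
     real, pointer :: rv(:) => null()
     real, pointer :: rf(:) => null()
  end type radar_obs_t

  integer, parameter :: nlevels = 4
  type(radar_obs_t), pointer :: obs(:) => null()
  real, pointer :: witness(:) => null()

  allocate (obs(1))
  allocate (obs(1)%rv(nlevels))
  allocate (obs(1)%rf(nlevels))
  obs(1)%rv = 1.0
  obs(1)%rf = 2.0

  ! Keep a second handle on one child array so we can inspect it later.
  witness => obs(1)%rf

  ! Deallocate only one child and then the parent, leaving rf behind.
  deallocate (obs(1)%rv)
  deallocate (obs)

  ! The rf block was never freed: it is still associated and readable.
  print *, 'rf block still allocated: ', associated(witness)
  print *, 'sum of its contents:      ', sum(witness)

  ! Without the extra handle this memory would now be unreachable.
  deallocate (witness)
end program leak_demo
```

In the real code there is no second handle like `witness`, so each iteration would strand the child arrays that were not explicitly deallocated.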

The modified "da_deallocate_y.inc" routine is attached to this report (da_deallocate_y.doc). This routine is found in the "var/da/da_define_structures" directory.

In addition to fixing the memory leak, the routine has been restructured so that all arrays holding observation data are deallocated in the reverse of the order in which they were allocated. This is in line with best practice guidelines and may assist in better memory management overall.
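The reverse-order idea can be illustrated with a small paired allocate/deallocate sketch. This is not the attached routine: the module `obs_store`, the type `radar_obs_t` and the three components shown are hypothetical, and the `associated` guards are an added precaution that relies on the components being initialized to null().

```fortran
module obs_store
  implicit none

  ! Hypothetical observation type with POINTER components.
  type :: radar_obs_t
     real, pointer :: rv(:)  => null()
     real, pointer :: rf(:)  => null()
     real, pointer :: rrn(:) => null()
  end type radar_obs_t

contains

  ! Allocate the parent first, then each child in a fixed order.
  subroutine allocate_obs(obs, nobs, nlevels)
    type(radar_obs_t), pointer :: obs(:)
    integer, intent(in) :: nobs, nlevels
    integer :: n

    allocate (obs(nobs))
    do n = 1, nobs
       allocate (obs(n)%rv(nlevels))
       allocate (obs(n)%rf(nlevels))
       allocate (obs(n)%rrn(nlevels))
       obs(n)%rv  = 0.0
       obs(n)%rf  = 0.0
       obs(n)%rrn = 0.0
    end do
  end subroutine allocate_obs

  ! Mirror image of allocate_obs: children in reverse order, parent last.
  subroutine deallocate_obs(obs)
    type(radar_obs_t), pointer :: obs(:)
    integer :: n

    if (.not. associated(obs)) return
    do n = size(obs), 1, -1
       if (associated(obs(n)%rrn)) deallocate (obs(n)%rrn)
       if (associated(obs(n)%rf))  deallocate (obs(n)%rf)
       if (associated(obs(n)%rv))  deallocate (obs(n)%rv)
    end do
    deallocate (obs)
  end subroutine deallocate_obs

end module obs_store

program obs_store_demo
  use obs_store
  implicit none
  type(radar_obs_t), pointer :: obs(:) => null()

  call allocate_obs(obs, nobs=3, nlevels=5)
  call deallocate_obs(obs)
  print *, 'obs still associated after cleanup: ', associated(obs)
end program obs_store_demo
```

Keeping the two routines side by side and symmetric makes it easier to spot a child array that is allocated but never released.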

This directory also contains a routine "da_allocate_radar.inc", which reproduces the allocation code in "da_allocate_y". No call to this routine can be found in the code, and there is no corresponding "da_deallocate_radar" routine, so it seems superfluous.

We hope this fix can be included in a future release.

Thanks,

Han Lung
Fujitsu America, Inc.
 
The WRFDA memory leak also occurs with precipitation data assimilation. Any idea how to resolve this issue?
 