Good day!
I am having trouble running my WRFDA 4D-VAR case,
and I'm looking for someone with more experience running this module.
For information:
I use CentOS 7.
I have successfully compiled 3D-VAR (4.1.3), WRFPLUS and 4D-VAR using the GNU dmpar configuration. I built them against netcdf-c-4.9.2, netcdf-fortran-4.6.1, hdf5-1.10.5, zlib-1.2.13, jasper-1.900, libpng-1.6.37 and mpich-3.3.1.
I ran the 3D-VAR test successfully, and there were no errors in the compile log files.
When I run my case on 8 processors, I don't get any direct error message, but the da_wrfvar.exe process is terminated after some time.
rsl.error ends with:
...
Timing for main: time 2023-07-30_17:58:30 on domain 1: 1.68950 elapsed seconds
Timing for main: time 2023-07-30_18:00:00 on domain 1: 1.80228 elapsed seconds
Swap time: <2023-07-30_12:00:00>and: <2023-07-30_18:00:00>
Swap time: <2023-07-30_13:00:00>and: <2023-07-30_17:00:00>
Swap time: <2023-07-30_14:00:00>and: <2023-07-30_16:00:00>
wrf: calling adjoint integrate
and rsl.out ends with:
...
Timing for main: time 2023-07-30_17:58:30 on domain 1: 1.68950 elapsed seconds
Timing for main: time 2023-07-30_18:00:00 on domain 1: 1.80228 elapsed seconds
Calculate innovation vector(iv)
..
Minimize cost function using CG method
..
Swap time: <2023-07-30_12:00:00>and: <2023-07-30_18:00:00>
Swap time: <2023-07-30_13:00:00>and: <2023-07-30_17:00:00>
Swap time: <2023-07-30_14:00:00>and: <2023-07-30_16:00:00>
wrf: calling adjoint integrate
The termination looks like:
starting wrf task 0 of 8
starting wrf task 1 of 8
starting wrf task 2 of 8
starting wrf task 3 of 8
starting wrf task 4 of 8
starting wrf task 6 of 8
starting wrf task 5 of 8
starting wrf task 7 of 8
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 68254 RUNNING AT servicenew
= EXIT CODE: 9
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Killed (signal 9)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
Could it be related to the size and resolution of my domain, or to the input files (sound temp reports)? Or maybe I should use more processors?
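Signal 9 is SIGKILL, and one common source of it on Linux is the kernel's out-of-memory killer, though I have not confirmed that this is what happened here. Below is a minimal sketch of how the kernel log could be checked for such a kill. It assumes Python 3.7+ and that the kernel reports kills with lines of the form "Killed process <pid> (<name>) ..."; the function names `find_oom_kills` and `read_dmesg` are my own, not part of WRFDA or MPICH.

```python
import re
import subprocess

# Assumed format of the kernel's OOM-killer report line:
#   "Killed process <pid> (<name>) total-vm:..., anon-rss:..."
OOM_LINE = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(kernel_log_text):
    """Return (pid, process_name) for every OOM kill reported in the text."""
    return [(int(m.group(1)), m.group(2))
            for m in OOM_LINE.finditer(kernel_log_text)]


def read_dmesg():
    """Return the kernel ring buffer as text (may need root on some systems)."""
    return subprocess.run(["dmesg"], capture_output=True, text=True).stdout


# Usage on the machine where da_wrfvar.exe was killed:
#   for pid, name in find_oom_kills(read_dmesg()):
#       print(pid, name)
```

If the PID from the MPICH message (68254 in my case) shows up in the result, the process was killed for using too much memory, which would point at the domain size or at the number of processors per node.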
I have attached my configure requirements file.
I would be very grateful for any ideas on how to solve this problem.