MPAS-8.2.2 results are not bit-reproducible when running with multiple OpenMP threads

PSH

I have found that MPAS-8.2.2 results are not bit-reproducible when a run is repeated with the same number of OpenMP threads. It looks like there is a race condition in the 'driver_microphysics' subroutine. If the OpenMP directive before 'call driver_microphysics' in mpas_atm_time_integration.F is disabled, the results are bit-identical.

I have not seen this non-reproducible behavior with older versions of MPAS.
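
For anyone who wants to repeat the check, here is a minimal sketch of a byte-level comparison between the outputs of two runs. The helper name compare_runs and the file paths are hypothetical and not taken from my setup. It also assumes the output files carry no run-specific metadata (for example a creation timestamp); if they do, a field-by-field comparison would be needed instead.

```shell
#!/bin/sh
# Hypothetical helper: compare two model output files byte for byte.
# The paths passed in are placeholders, not files from the original runs.
compare_runs() {
    # Report a missing input instead of letting cmp fail silently.
    if [ ! -f "$1" ] || [ ! -f "$2" ]; then
        echo "MISSING"
        return 2
    fi
    # cmp -s prints nothing and returns 0 only if the files are identical.
    if cmp -s "$1" "$2"; then
        echo "IDENTICAL"
    else
        echo "DIFFERENT"
        return 1
    fi
}
```

Example use, with placeholder paths: compare_runs run1/output.nc run2/output.nc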

mpas_atm_time_integration.F:
   if (trim(config_microp_scheme) /= 'off') then
      call mpas_timer_start('microphysics')
      ! The extra '!' turns the OpenMP sentinel into a plain comment, so the loop runs serially.
      !!$OMP PARALLEL DO
      do thread=1,nThreads
         call driver_microphysics ( block % configs, mesh, state, 2, diag, diag_physics, tend_physics, tend, itimestep, &
                                    cellSolveThreadStart(thread), cellSolveThreadEnd(thread))
      end do
      !!$OMP END PARALLEL DO
      call mpas_timer_stop('microphysics')
   end if

Thanks,
 
Hi @PSH, thanks for highlighting this issue, and apologies for the delay in support. I will try to confirm this soon.

Would you please share the namelist.atmosphere file and the commands (or script) to execute the atmosphere_model for your run? Specific software versions (esp. compiler used) may help me replicate this as well.
 
Thanks. namelist.atmosphere is attached.

Launch commands on a cluster (256 cores per node) with intel mpi/2024/v2.1/mpi/2021.13:
export OMP_NUM_THREADS=2
export OMP_STACKSIZE=512M
export KMP_AFFINITY="granularity=fine,compact"
mpiexec -ppn 128 -np 2560 ./${BIN_FILE}

Launch commands on a Cray EX system with cray-mpich:
export OMP_NUM_THREADS=2
export OMP_STACKSIZE=1G
export OMP_PROC_BIND=true
export OMP_PLACES=cores

export FI_CXI_RX_MATCH_MODE=software
srun --ntasks=2560 --ntasks-per-node=64 --cpus-per-task=2 --distribution=block:block ./${BIN_FILE}

Compiler versions: intel-ifort 2023/2.0, intel-ifx 2025

The behavior is the same across various architectures and Intel compiler versions.

As a workaround, I commented out the OpenMP directives around the 'call driver_microphysics' loop.
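
To check a change like this over more than two repeats, a rough sketch is to hash the output of each repeated run and count the distinct checksums. The helper name count_distinct and the paths are hypothetical, and this assumes sha256sum (GNU coreutils) is available on the system. A result of 1 means all files are byte-identical.

```shell
#!/bin/sh
# Hypothetical helper: print how many distinct SHA-256 checksums occur
# among the given files. 1 means every file is byte-identical.
count_distinct() {
    # Without arguments sha256sum would wait on stdin, so bail out early.
    if [ "$#" -eq 0 ]; then
        echo 0
        return 2
    fi
    # Keep only the checksum column, drop duplicates, count what is left.
    sha256sum "$@" | awk '{ print $1 }' | sort -u | wc -l | tr -d ' '
}
```

Example use, with placeholder paths: count_distinct run1/output.nc run2/output.nc run3/output.nc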

Many Thanks,
 

Attachments

  • namelist.atmosphere.txt