srun error related to PMK_KVS_Barrier

This post was from a previous version of the WRF&MPAS-A Support Forum. New replies have been disabled and if you have follow up questions related to this post, then please start a new thread from the forum home page.

I tried to submit wrf.exe using the Slurm script below:

#!/bin/bash
#SBATCH --job-name wrf_PurpleAir
#SBATCH --qos long+
#SBATCH --time 120:00:00
#SBATCH --output wrf.log
#SBATCH --nodes 3
#SBATCH --ntasks 12
#SBATCH --partition high_mem
##SBATCH --mem=max
#SBATCH --account=pi_zzbatmos

unset I_MPI_PMI_LIBRARY
##export I_MPI_JOB_RESPECT_PROCESS_PLACEMENT=0
/home/vy57456/zzbatmos_user/application/intel/2018b/mpich-3.4.2/bin/mpirun -np ${SLURM_NTASKS} ./wrf.exe > index_000 2>&1

The job status shows that the job is running, but there is no output. After running for a while, it fails with the following error:
srun: error: PMK_KVS_Barrier duplicate request from task 0

I tried compiling MPICH with both the gfortran and Intel compilers, and both give the same error. Do you have any idea what causes this error? Thank you.
 
Hi,
Are you getting any rsl.out* or rsl.error* files? If not, then it sounds like it's a problem with your system. Unfortunately you'll need to discuss the issue with a systems administrator at your institution.

If there are rsl* files, take a look at the rsl.error.0000 file and see if there is an error message at the end. If not, then package those files up and attach them, along with your namelist.input file so that I can take a look.
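The check described above could be scripted roughly as follows. This is only a minimal sketch: it assumes it is run from the WRF run directory, and the function name `collect_rsl_files` and the archive name `rsl_files.tar.gz` are made up for illustration.

```shell
#!/bin/bash
# Sketch: gather WRF rsl logs for a forum post. Run from the WRF run directory.
# The function name and the archive name rsl_files.tar.gz are hypothetical.
collect_rsl_files() {
    # nullglob makes unmatched patterns expand to nothing instead of themselves.
    shopt -s nullglob
    local files=(rsl.out.* rsl.error.*)
    shopt -u nullglob

    if [ "${#files[@]}" -eq 0 ]; then
        echo "No rsl.out.* or rsl.error.* files found in $(pwd)"
        return 1
    fi

    # Any error message is normally at the end of the first task's log.
    if [ -f rsl.error.0000 ]; then
        echo "Last lines of rsl.error.0000:"
        tail -n 20 rsl.error.0000
    fi

    # Include namelist.input only if it is present.
    if [ -f namelist.input ]; then
        files+=(namelist.input)
    fi
    tar -czf rsl_files.tar.gz "${files[@]}"
    echo "Wrote rsl_files.tar.gz"
}
```

After sourcing the file, calling `collect_rsl_files` either reports that no rsl files exist (the case that points to a system problem) or prints the tail of rsl.error.0000 and writes an archive that can be attached to a reply.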
 
No, I did not get any rsl.* files. Previously I compiled WRF using dmpar and the Intel compiler. After recompiling WRF using smpar and the Intel compiler, wrf.exe appears to run, but it stops with the following error:

-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 2270
Warning: too many input landuse types
-------------------------------------------

Right now I am trying to figure out this error. At least wrf.exe can run now. Do you have any idea what causes it? Thanks.
 
Hi,
Will you first take a look at this post to see if there is any helpful information there for you? I would suggest following the instructions issued by Ming on Jan 3, 2020. Let me know what you can figure out. Please let me know what you are doing during the WPS process that may be different from the default. Thanks!
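When comparing against the WPS defaults, one thing that may be worth checking is whether the number of land-use categories in the met_em files matches num_land_cat in namelist.input. This is an assumption about a commonly reported cause of this message, not a diagnosis of this case. The sketch below uses hypothetical helper names and requires ncdump for the met_em check.

```shell
#!/bin/bash
# Sketch with hypothetical helper names; not part of WRF or WPS.

# Print the num_land_cat value set in a namelist file (default: namelist.input).
# Prints nothing if the variable is not set explicitly.
namelist_land_cat() {
    local file="${1:-namelist.input}"
    # Match lines such as " num_land_cat = 21," and keep only the digits.
    sed -n 's/^[[:space:]]*num_land_cat[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$file"
}

# Print the NUM_LAND_CAT global attribute of a met_em file (needs ncdump).
met_em_land_cat() {
    command -v ncdump > /dev/null || { echo "ncdump not found" >&2; return 1; }
    ncdump -h "$1" | sed -n 's/.*:NUM_LAND_CAT = \([0-9][0-9]*\).*/\1/p'
}
```

For example, `namelist_land_cat namelist.input` and `met_em_land_cat` applied to one of your met_em.d01.* files should print the same number. If they differ, that difference is worth mentioning when you reply.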
 