Difference between revisions of "Wien2k"

From IT Service Wiki
 
(6 intermediate revisions by the same user not shown)
Line 9: Line 9:
 
<br>  
 
<br>  
  
== Parallel runs in the Grid Engine  ==
+
== Parallel runs with Slurm ==
  
To access the compute nodes it is necessary that you use the batch system. Please make sure that the [[Sun Grid Engine|'''SUN grid engine''']] is configured correctly for your account. See the [[Sun Grid Engine|wiki page]] for further instruction. Furthermore you need the [[Intel Compiler Temp|'''Intel&nbsp;MKL libraries''']] in your LD_LIBRARY_PATH. <br>
+
This script is currently untested. Please test and report.
  
You need one slot for the the master process which start and controles the slaves. If you allocate 8 Slots you will get 7 worker.
+
<pre>
 
+
#! /bin/bash
Up to now, only the k-point parallelization is working. Here is a annotated example script:
+
#
<pre>#! /bin/bash
+
# Running wien2k job with slurm using k-point parallelization.
#  
+
#
# Sample wien2k script for use with sge in ITP
+
# Create an apropriate .machine file for wien2k
# adopted from the tcsh version in wien2k/qsub-job0-sge
+
#
 +
# Allocated resources are encode like this:
 +
#  SLURM_JOB_CPUS_PER_NODE="4,2(2x)"
 +
#  SLURM_NODELIST="bree,carla,mike"
 +
#
 +
# = Run 4 tasks on bree and 2 on carla and mike.
 
#
 
#
#   $NSLOTS
+
# Slurm paramters (read 'man sbatch')
#      the number of tasks to be used
 
#  $TMPDIR/machines
 
#      a valid machine file to be passed to mpirun
 
#
 
# Options passed to qsub (denoted by #$)&nbsp;:
 
 
#
 
#
# Pass your environment to job  
+
#SBATCH --partition=dfg
#$ -V
+
#SBATCH --mem-per-cpu=4000
 +
#SBATCH --ntasks=8
 +
#SBATCH --output=job.out
 +
# (--output includes STDERR)
 
#
 
#
# Run in current working directory (in most cases a good idea)
 
#$ -cwd
 
# Rename the STDOUT und STDERR Stream to an friendly name
 
#$ -o job.out
 
#$ -e job.err
 
 
# select a queue
 
#$ -q dwarfs
 
 
#
 
#
# How many resources do I need (per slot)
 
# Lightly overcommit the memory, that it runs on 8GB&nbsp;machine
 
#$ -l h_vmem=2G,virtual_free=1800G
 
 
#
 
#
# Selected parallel environment and number of slots/processes
+
# Set internal parallelization code in mkl to only use
# mpi is needed, although we do not start a mpi job
+
# one thread per process.
#$ -pe mpi 6
+
export OMP_NUM_THREADS=1
 +
 
 +
# Use , as list seperator
 +
IFS=','
 +
# Convert string to array
 +
hcpus=($SLURM_JOB_CPUS_PER_NODE)
 +
unset IFS
  
# define the environment, eventually not needed
+
declare -a conv
export WIENROOT="/home/wien2k/wien2k"
 
export PATH="$WIENROOT:$PATH"
 
export SCRATCH="/tmp"
 
  
# Set internal parallelization code in mkl to only use
+
# Expand compressed slurm array
# on thread per process.
+
for cpu in ${hcpus[@]}; do
export OMP_NUM_THREADS=1
+
    if [[ $cpu =~ (.*)\((.*)x\) ]]; then
 +
# found compressed value
 +
value=${BASH_REMATCH[1]}
 +
factor=${BASH_REMATCH[2]}
 +
for j in $(seq 1 $factor); do
 +
    conv=( ${conv[*]} $value )
 +
done
 +
    else
 +
conv=( ${conv[*]} $cpu )
 +
    fi
 +
done
  
# some information
+
# Build .machines file
echo "Got $NSLOTS slots." &gt;&gt; job.out
+
rm -f .machines
echo "Got $NSLOTS slots." &gt;&gt; job.err
 
  
# read the mpi machines files (generated by the sge)
+
nhost=0
proclist=(`cat $TMPDIR/machines`)
 
nproc=$NSLOTS
 
echo $nproc nodes for this job: $proclist
 
  
rm .machines
+
echo ${conv[@]};
  
# Convert proclist to one line per slot/k-point.
+
IFS=','
# In a single queue all nodes have equal performance.
+
for node in $SLURM_NODELIST
for a in ${proclist[*]}; do
+
do
    echo 1:$a &gt;&gt; .machines
+
    declare -i cpuspernode=${conv[$nhost]};
done
+
    for ((i=0; i<${cpuspernode}; i++))
 +
    do
 +
echo 1:$node >> .machines
 +
    done
 +
    let nhost+=1
 +
done  
  
#This line would force the mpi version
+
echo 'granularity:1' >>.machines
#echo 1:$proclist  &gt;&gt; .machines
+
echo 'extrafine:1' >>.machines
  
echo 'granularity:1' &gt;&gt;.machines
+
# .machines file complete
echo 'extrafine:1' &gt;&gt;.machines
 
  
 
# Run your caclulation
 
# Run your caclulation
 
x lapw1 -p
 
x lapw1 -p
 
 
</pre>
 
</pre>

Latest revision as of 12:44, 3 December 2019

This page is intended to give some instructions on how to run the wien2k package on our cluster.

Access

The official build is installed in the user account "wien2k". Access to this account is restricted to users who are members of the Unix group "wien2k". The latest version is always linked to the directory /home/wien2k/wien2k. The setup in your .bashrc could look like:

export WIENROOT="/home/wien2k/wien2k"
export PATH="$WIENROOT:$PATH"
export SCRATCH="/tmp"
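
Before submitting a job, it can help to confirm that the account really has access to the shared build. The following is a minimal sketch, not part of wien2k: the function name check_wien2k_access is hypothetical, and the default group and path are simply the ones named above.

```shell
#!/bin/bash
# Hypothetical helper (not part of wien2k): succeed only if the current
# user is in the given Unix group and the given directory is accessible.
check_wien2k_access() {
    local group="${1:-wien2k}"
    local root="${2:-/home/wien2k/wien2k}"
    # id -nG prints all group names of the current user on one line
    if ! id -nG | tr ' ' '\n' | grep -qxF -- "$group"; then
        echo "not a member of group '$group'" >&2
        return 1
    fi
    if [ ! -d "$root" ] || [ ! -x "$root" ]; then
        echo "cannot access $root" >&2
        return 1
    fi
    echo "access to $root looks fine"
}
```

Called without arguments, the function checks the group "wien2k" and the directory /home/wien2k/wien2k. Both can be overridden, for example check_wien2k_access mygroup /some/dir.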


Parallel runs with Slurm

This script is currently untested. Please test and report.

#! /bin/bash
#
# Running wien2k job with slurm using k-point parallelization.
#
# Create an appropriate .machines file for wien2k
#
# Allocated resources are encoded like this:
#  SLURM_JOB_CPUS_PER_NODE="4,2(x2)"
#  SLURM_NODELIST="bree,carla,mike"
#
# = Run 4 tasks on bree and 2 each on carla and mike.
#
# Slurm parameters (read 'man sbatch')
#
#SBATCH --partition=dfg
#SBATCH --mem-per-cpu=4000
#SBATCH --ntasks=8
#SBATCH --output=job.out
# (--output includes STDERR)
#
# Set internal parallelization code in mkl to only use
# one thread per process.
export OMP_NUM_THREADS=1

# Use , as list separator
IFS=','
# Convert string to array
hcpus=($SLURM_JOB_CPUS_PER_NODE)
unset IFS

declare -a conv

# Slurm compresses repeated counts as count(xN), e.g. 2(x2) = 2,2
re='^([0-9]+)\(x([0-9]+)\)$'

# Expand compressed slurm array
for cpu in "${hcpus[@]}"; do
    if [[ $cpu =~ $re ]]; then
        # found compressed value
        value=${BASH_REMATCH[1]}
        factor=${BASH_REMATCH[2]}
        for ((j = 0; j < factor; j++)); do
            conv+=("$value")
        done
    else
        conv+=("$cpu")
    fi
done

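# Build .machines file
rm -f .machines

nhost=0

echo "${conv[@]}"

# SLURM_NODELIST may be compressed (e.g. node[01-03]);
# scontrol expands it to one host name per line.
for node in $(scontrol show hostnames "$SLURM_NODELIST")
do
    declare -i cpuspernode=${conv[$nhost]}
    for ((i=0; i<cpuspernode; i++))
    do
        echo "1:$node" >> .machines
    done
    nhost=$((nhost + 1))
done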

echo 'granularity:1' >>.machines
echo 'extrafine:1' >>.machines

# .machines file complete

# Run your calculation
x lapw1 -p
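
Since the script is still untested, the .machines logic can be tried on a login node without submitting a job. The sketch below is a dry run under two assumptions: that Slurm writes SLURM_JOB_CPUS_PER_NODE in the count(xN) notation described in the sbatch documentation, and that the host names are passed as an already expanded, space-separated list. The function names expand_cpus and machines_lines are hypothetical and exist only for this illustration.

```shell
#!/bin/bash
# Expand Slurm's compressed CPU list, e.g. "4,2(x2)" -> "4 2 2".
expand_cpus() {
    local re='^([0-9]+)\(x([0-9]+)\)$'
    local -a parts out=()
    local part count reps i
    IFS=',' read -r -a parts <<< "$1"
    for part in "${parts[@]}"; do
        if [[ $part =~ $re ]]; then
            count=${BASH_REMATCH[1]}
            reps=${BASH_REMATCH[2]}
            for ((i = 0; i < reps; i++)); do
                out+=("$count")
            done
        else
            out+=("$part")
        fi
    done
    echo "${out[*]}"
}

# Print the .machines content for a CPU list and a space-separated
# list of host names: one "1:host" line per allocated CPU.
machines_lines() {
    local -a cpus hosts
    local h i
    read -r -a cpus <<< "$(expand_cpus "$1")"
    read -r -a hosts <<< "$2"
    for h in "${!hosts[@]}"; do
        for ((i = 0; i < ${cpus[$h]:-0}; i++)); do
            echo "1:${hosts[$h]}"
        done
    done
    echo 'granularity:1'
    echo 'extrafine:1'
}

# Dry run with the example allocation from the script header.
machines_lines "4,2(x2)" "bree carla mike"
```

For the example allocation this prints four lines for bree, two each for carla and mike, and then the granularity and extrafine lines, so the eight host lines match --ntasks=8.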