Wien2k

This page gives instructions for running the wien2k package on our cluster.

Access

The official build is installed in the user account "wien2k". Access to this account is restricted to members of the Unix group "wien2k". The latest version is always linked to the directory /home/wien2k/wien2k. Your setup in your .bashrc could look like this:

export WIENROOT="/home/wien2k/wien2k"
export PATH="$WIENROOT:$PATH"
export SCRATCH="/tmp"
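
After editing .bashrc, a quick sanity check can confirm that the group membership and the installation path are in place. The following is only a minimal sketch: the helper name check_wien2k_setup is hypothetical and not part of wien2k, and it takes the group list and the path as arguments so that it can be tried on any machine.

```shell
#!/bin/bash
# Hypothetical helper (not part of wien2k): report common setup problems.
# Arguments: 1 = space-separated list of the user's groups, 2 = WIENROOT path.
check_wien2k_setup() {
    local status=0
    case " $1 " in
        *" wien2k "*) ;;                       # group membership is fine
        *) echo "not a member of group wien2k"
           status=1 ;;
    esac
    if [ ! -d "$2" ]; then
        echo "WIENROOT directory not found: $2"
        status=1
    fi
    return "$status"
}

# Typical call in a login shell on the cluster:
check_wien2k_setup "$(id -nG)" "${WIENROOT:-/home/wien2k/wien2k}" || true
```

If the function prints nothing, both checks passed.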


Parallel runs with Slurm

This script is currently untested. Please test and report.

#! /bin/bash
#
# Running wien2k job with slurm using k-point parallelization.
#
# Create an appropriate .machines file for wien2k
#
# Allocated resources are encoded like this:
#  SLURM_JOB_CPUS_PER_NODE="4,2(2x)"
#  SLURM_NODELIST="bree,carla,mike"
#
# This means: run 4 tasks on bree and 2 each on carla and mike.
#
# Slurm parameters (read 'man sbatch')
#
#SBATCH --partition=dfg
#SBATCH --mem-per-cpu=4000
#SBATCH --ntasks=8
#SBATCH --output=job.out
# (--output includes STDERR)
#
#
#
# Set internal parallelization code in mkl to only use 
# one thread per process. 
export OMP_NUM_THREADS=1

# Use , as list separator
IFS=','
# Convert string to array
hcpus=($SLURM_JOB_CPUS_PER_NODE)
unset IFS

declare -a conv

# Expand compressed slurm array
for cpu in "${hcpus[@]}"; do
    if [[ $cpu =~ (.*)\((.*)x\) ]]; then
        # Found a compressed value such as "2(2x)"
        value=${BASH_REMATCH[1]}
        factor=${BASH_REMATCH[2]}
        for ((j = 0; j < factor; j++)); do
            conv+=("$value")
        done
    else
        conv+=("$cpu")
    fi
done

# Build .machines file
rm -f .machines

nhost=0

# Log the expanded CPU counts (one entry per node)
echo "${conv[@]}"

IFS=','
for node in $SLURM_NODELIST
do
    declare -i cpuspernode=${conv[$nhost]}
    for ((i = 0; i < cpuspernode; i++))
    do
        echo "1:$node" >> .machines
    done
    let nhost+=1
done
# Restore default word splitting
unset IFS

echo 'granularity:1' >>.machines
echo 'extrafine:1' >>.machines

# .machines file complete

# Run your calculation
x lapw1 -p
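
The expansion step can be tried without a running job. The sketch below reimplements it as two functions; expand_cpus and machines_lines are hypothetical names, and the values are the ones from the comment in the script. It assumes the node list is already available as plain host names. On clusters where Slurm reports a compressed list such as node[01-03], the command scontrol show hostnames "$SLURM_JOB_NODELIST" prints one host name per line and could supply the arguments.

```shell
#!/bin/bash
# Sketch with hypothetical helper names; not part of wien2k or Slurm.

# Expand Slurm's compressed CPU list, e.g. "4,2(2x)" -> "4 2 2".
expand_cpus() {
    local item value factor i out=""
    local IFS=','
    for item in $1; do
        case $item in
            *"("*"x)")
                value=${item%%"("*}      # part before "("
                factor=${item#*"("}      # part after "("
                factor=${factor%"x)"}    # drop the trailing "x)"
                ;;
            *)
                value=$item
                factor=1
                ;;
        esac
        i=0
        while [ "$i" -lt "$factor" ]; do
            out="$out $value"
            i=$((i + 1))
        done
    done
    echo "${out# }"
}

# Print the contents of a .machines file: one "1:host" line per CPU.
# Arguments: 1 = expanded CPU counts, 2... = host names in the same order.
machines_lines() {
    local counts="$1" host n i
    shift
    for host in "$@"; do
        n=${counts%% *}                  # first remaining count
        counts=${counts#* }              # drop it from the list
        i=0
        while [ "$i" -lt "$n" ]; do
            echo "1:$host"
            i=$((i + 1))
        done
    done
    echo 'granularity:1'
    echo 'extrafine:1'
}

machines_lines "$(expand_cpus '4,2(2x)')" bree carla mike
```

For the example values this prints four 1:bree lines, two 1:carla lines and two 1:mike lines, followed by the granularity and extrafine settings.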

Parallel runs in the Grid Engine

Access to the compute nodes requires the batch system. Please make sure that the Sun Grid Engine is configured correctly for your account; see the wiki page for further instructions. You also need the Intel MKL libraries in your LD_LIBRARY_PATH.
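
Whether an MKL directory is on the library path can be checked before submitting a job. This is a minimal sketch, assuming that the MKL installation directory contains the string "mkl" in its name (common for Intel installations, but an assumption here); has_mkl_dir is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given colon-separated path list
# contains "mkl" (in any letter case) somewhere.
# Assumption: the MKL install directory is named accordingly.
has_mkl_dir() {
    case $1 in
        *[Mm][Kk][Ll]*) return 0 ;;
    esac
    return 1
}

if has_mkl_dir "${LD_LIBRARY_PATH:-}"; then
    echo "MKL directory found in LD_LIBRARY_PATH"
else
    echo "no MKL directory in LD_LIBRARY_PATH"
fi
```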

You need one slot for the master process, which starts and controls the workers. If you allocate 8 slots, you get 7 workers.

So far, only the k-point parallelization works. Here is an annotated example script:

#! /bin/bash
# 
# Sample wien2k script for use with SGE at ITP,
# adapted from the tcsh version in wien2k/qsub-job0-sge
#
#   $NSLOTS
#       the number of tasks to be used
#   $TMPDIR/machines
#       a valid machine file to be passed to mpirun
# 
# Options passed to qsub (denoted by #$) :
#
# Pass your environment to job 
#$ -V
#
# Run in current working directory (in most cases a good idea)
#$ -cwd
# Give the STDOUT and STDERR streams friendly names
#$ -o job.out
#$ -e job.err

# select a queue
#$ -q dwarfs
#
# How many resources do I need (per slot)
# Slightly overcommit the memory so that the job runs on an 8 GB machine
#$ -l h_vmem=2G,virtual_free=1800M
#
# Select the parallel environment and the number of slots/processes.
# An MPI environment is needed, although we do not start an MPI job.
#$ -pe mpil 6 

# Define the environment (may not be needed because of -V)
export WIENROOT="/home/wien2k/wien2k"
export PATH="$WIENROOT:$PATH"
export SCRATCH="/tmp"

# Set internal parallelization code in mkl to only use
# one thread per process.
export OMP_NUM_THREADS=1

# some information
echo "Got $NSLOTS slots." >> job.out
echo "Got $NSLOTS slots." >> job.err

# Read the MPI machine file (generated by SGE)
proclist=($(cat "$TMPDIR/machines"))
nproc=$NSLOTS
echo "$nproc slots for this job: ${proclist[*]}"

rm -f .machines

# Convert proclist to one line per slot/k-point.
# In a single queue all nodes have equal performance.
for a in ${proclist[*]}; do
    echo 1:$a >> .machines
done

# This line would force the MPI version
#echo "1:${proclist[*]}" >> .machines

echo 'granularity:1' >>.machines
echo 'extrafine:1' >>.machines

# Run your calculation
x lapw1 -p
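
The conversion from the SGE machine file to the wien2k .machines format can be tried outside a job. The sketch below assumes, like the script above, that the machine file lists one host name per slot; sge_to_machines is a hypothetical helper, and the host names nodeA and nodeB are made up for the dry run.

```shell
#!/bin/bash
# Hypothetical helper: convert an SGE machine file (one host per slot)
# into the contents of a wien2k .machines file on stdout.
sge_to_machines() {
    local host
    # The "|| [ -n ... ]" part keeps a last line without trailing newline.
    while read -r host || [ -n "$host" ]; do
        [ -n "$host" ] && echo "1:$host"   # skip empty lines
    done < "$1"
    echo 'granularity:1'
    echo 'extrafine:1'
}

# Dry run with a made-up host list instead of $TMPDIR/machines.
demo=$(mktemp)
printf '%s\n' nodeA nodeA nodeB > "$demo"
sge_to_machines "$demo"
rm -f "$demo"
```

Inside a real job the call would be sge_to_machines "$TMPDIR/machines" > .machines.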