Generic Templates

This page is a collection of generic job script templates. The latest official template can be found on the Neumann-Cluster-Homepage.

In Variant 2, the line mpirun $debug_mpi ./mpi_app has to be uncommented when you want to run an MPI process. If you run special software such as StarCCM+, you cannot use mpirun directly; replace the mpirun line with the launch command of your software.
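
For illustration, a hedged sketch of what such a replacement could look like for StarCCM+ (the starccm+ launcher name, its options and the .sim file path are assumptions that depend on your installation, version and licensing; check the vendor documentation):

# Hypothetical replacement for the mpirun line in Variant 2:
# starccm+ is assumed to be available via a module or in PATH;
# option names and the simulation file below are placeholders.
starccm+ -batch -np $SLURM_NPROCS /scratch/tmp/$USER/mysimulation.sim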

If you want to start parallel processes in your job, such as running a program on every CPU, use srun.
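
A minimal sketch of such an srun call (my_app is a placeholder for your own executable; srun starts one copy per task granted by the #SBATCH settings):

# one instance of my_app per allocated task, e.g. one per core
# when --ntasks-per-node is set to the number of cores of a node
srun ./my_app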

Scripts on this page

Variant 1

A simple example to get familiar with the queue system and the job submission syntax:

justatestjob_20170822.sh
#!/bin/bash
# An example batch script to launch a dummy test script on the Neumann cluster
# From command line on Neumann, log into a node, e.g. type "ssh c002"
# then submit this job to the queue system by typing "sbatch justatestjob_20170822.sh"
# If this works correctly, you should see your job in the short queue for about half a minute
# And then it will create a tiny file called "hello" in your scratch directory
 
 
###################################################################################################
# Queue system requests
 
#SBATCH --job-name doobdoob		# job name displayed by squeue
#SBATCH --partition sw01_short		# queue in which this is going
#SBATCH --nodes 2	 		# number of nodes
#SBATCH --time 001:00:00		# time budget [HHH:MM:SS] 
#SBATCH --mem 1G			# RAM memory allocated to each node
 
#SBATCH --dependency singleton	# singleton dependency: do not start this job before any other job with the same job name has finished
 
 
 
###################################################################################################
# Now, let’s run this script
 
# Define where the working directory is, based on the username of the user submitting
WORKINGDIRECTORY="/scratch/tmp/$USER"
 
# Create the scratch folder if it does not exist
mkdir -p $WORKINGDIRECTORY
 
# Do nothing for 30 seconds
sleep 30s
 
# Create a tiny file called "hello" in your working directory and write some text inside it
echo "hello world!" >> $WORKINGDIRECTORY/hello
 
# Write the date at the end
date +%Y-%m-%d_%H:%M:%S_%s_%Z >> $WORKINGDIRECTORY/hello
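
To try it out, submit the script from a login node and watch the queue; the session below only uses standard SLURM commands (sbatch, squeue) and the scratch path defined in the script:

sbatch justatestjob_20170822.sh   # submit; prints "Submitted batch job <jobid>"
squeue -u $USER                   # the job should sit in sw01_short for about half a minute
cat /scratch/tmp/$USER/hello      # after the job ends: "hello world!" plus a time stamp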

Variant 2

job-default.sh
#!/bin/bash
# UPDATE: 23.06.2016 adding --mem
# WARNING: this minimum script is not in its final version
#          (acceptable longrun-job-handling is missing)
#  please check http://www-e.uni-magdeburg.de/urzs/t100/ periodically 2016-11
#
# lines beginning with "#SBATCH" are instructions for the jobsystem (man slurm).
# lines beginning with "##SBATCH" are comments
#
#SBATCH -J job-01             # jobname displayed by squeue
#SBATCH -N 4                  # minimum number of nodes needed or minN-maxN
#  do not waste nodes (check scaling of your app), other users may need them
#SBATCH --ntasks-per-node 1   # 1 for multi-thread-codes (using 16 cores)
##SBATCH --ntasks-per-node 2  # 2 for mixed code, 2 tasks * 8 cores/task
##SBATCH --ntasks-per-node 16 # 16 for pure MPI-code or 16 single-core-apps
#SBATCH --time 01:00:00       # set 1h walltime (=maximum runtime), see sinfo
#SBATCH --mem 80000           # [MB/node], please do not use more than 120000
# please use all cores of a node (especially small jobs fitting to one node)
# nodes will not be shared between jobs (avoiding problems) (added 2017.06)
#
#
# most output is for more simple debugging (better support):
date +%Y-%m-%d_%H:%M:%S_%s_%Z # date as YYYY-MM-DD_HH:MM:SS_unixseconds_timezone
echo "DEBUG: SLURM_JOB_NODELIST=$SLURM_JOB_NODELIST"
echo "DEBUG: SLURM_NNODES=$SLURM_NNODES"
echo "DEBUG: SLURM_TASKS_PER_NODE=$SLURM_TASKS_PER_NODE"
env | grep -e MPI -e SLURM
echo "DEBUG: host=$(hostname) pwd=$(pwd) ulimit=$(ulimit -v) \$1=$1 \$2=$2"
exec 2>&1      # send errors into stdout stream
#
# load modulefiles which set paths to mpirun and libs (see website)
echo "DEBUG: LOADEDMODULES=$LOADEDMODULES" # module list
#module load gcc/4.8.2                # if you need gcc or gcc-libs on nodes
#module load openblas/gcc/64/0.2.15   # multithread basic linear algebra
#module load openmpi/gcc/64/1.10.1    # message passing interface
echo "DEBUG: LOADEDMODULES=$LOADEDMODULES" # module list
#
echo "check for free space on /dev/shm (needed for MPI jobs)" # 2017-09-25
srun -l -c6 bash -c "df -k /dev/shm | tail -1" | sort -k 3n | tail -2
#
## please use /scratch (200TB 8GB/s), /home is for job preparation only
## do not start jobscript in /scratch but change to it to use massive disk-I/O
##   (conflicting link@master vs. mount@nodes), see website for more info
#mkdir -p /scratch/tmp/${USER}_01    # create directory if not existing
#cd /scratch/tmp/${USER}_01;echo new_pwd=$(pwd) # change to scratch-dir
#
export OMP_WAIT_POLICY="PASSIVE"
export OMP_NUM_THREADS=$((16/((SLURM_NPROCS+SLURM_NNODES-1)/SLURM_NNODES)))
# optimize multi-thread-speed for 16 threads:
[ $OMP_NUM_THREADS == 16 ] && export GOMP_CPU_AFFINITY="0-15:1" 
export OMP_PROC_BIND=TRUE
echo OMP_NUM_THREADS=$OMP_NUM_THREADS
#
# --- please comment out and modify the part you will need! ---
# --- for MPI-Jobs and hybrid MPI/OpenMP-Jobs only ---
# prepare debug options for small test jobs
## set debug-output for small test jobs only:
[ "$SLURM_NNODES" ] && [ $SLURM_NNODES -lt 4 ]\
  && debug_mpi="--report-bindings"
#mpirun $debug_mpi ./mpi_app  # start mpi-application
#
# --- for multiple Single-Jobs and multiple OpenMP-Jobs only ---
# for multiple single-core-apps, ntasks-per-node should be set to maximum=16
## small test-job:
#
# start all tasks, but only let output tasks 0..19:
srun bash -c "[ \$SLURM_PROCID -lt 20 ] && echo task \$SLURM_PROCID \
 of \$SLURM_NPROCS runs on \$SLURMD_NODENAME"
#
#
# parallelize serial loop of single-core application app1:
# for ((i=0;i<$SLURM_NPROCS;i++));do ./app1 $i;done  # serial version
srun bash -c "./app1 \$SLURM_PROCID"                 # parallel version
#
# start 4 different tasks (app1..4) in background and wait:
[ "$SLURM_NNODES" == "1" ] && ( ./app1 & ./app2 & ./app3 & ./app4 & wait )
#
# -------------------------- post-processing -----------
# If MPI jobs abort in an unexpected way /dev/shm gets filled with stale
# files. So we try to clean tmp-dirs after the job (added 2017-09-25):
find /dev/shm /tmp -xdev -maxdepth 1 -user $USER \
  -exec rm -rf --one-file-system {} \;
# don't use this if you have suspended jobs
# if you know a better way, please tell me (Joerg S.)
 
## Final time stamp
date +%Y-%m-%d_%H:%M:%S_%s_%Z # date as YYYY-MM-DD_HH:MM:SS_unixseconds_timezone
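
Variant 2 writes all of its DEBUG output to SLURM's standard output file (by default slurm-<jobid>.out in the directory you submitted from, unless --output is set). A minimal way to submit and inspect it:

sbatch job-default.sh   # submit; SLURM prints the job id
squeue -u $USER         # check the state (PD = pending, R = running)
less slurm-*.out        # combined stdout/stderr of the job (default file name)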