Generic Templates
This page is a collection of generic job script templates. The latest official template can be found on the Neumann-Cluster-Homepage.
In Variant 2, the line mpirun $debug_mpi ./mpi_app
is commented out by default; remove the leading # when you want to run an MPI process. If you run special software, such as StarCCM+, you cannot use mpirun; replace the mpirun line with the start command of that software.
If you want to start parallel processes in your job, such as running a program on every CPU, use srun.
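As a minimal sketch of these three cases (the executable names ./mpi_app and ./my_app are placeholders, and the resource requests are only examples), the launch part of a job script could look like this:

#!/bin/bash
#SBATCH -J launch-example         # example job name
#SBATCH -N 2                      # example: two nodes
#SBATCH --ntasks-per-node 16      # one task per core, as for pure MPI code
#SBATCH --time 00:10:00
#
# Case 1: MPI application -- uncomment the mpirun line of the template
# (./mpi_app is a placeholder for your own executable):
#mpirun ./mpi_app
#
# Case 2: software with its own launcher (e.g. StarCCM+) -- replace the
# mpirun line with the start command of that software instead.
#
# Case 3: start one process per allocated CPU with srun
# (./my_app is a placeholder; srun launches one task per allocated core):
srun ./my_app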
Scripts on this page
Variant 1
A simple example to get familiar with the queue system and the job submission syntax:
- justatestjob_20170822.sh
#!/bin/bash
# An example batch script to launch a dummy test script on the Neumann cluster
# From command line on Neumann, log into a node, e.g. type "ssh c002"
# then submit this job to the queue system by typing "sbatch justatestjob_20170814.sh"
# If this works correctly, you should see your job in the short queue for about half a minute
# And then it will create a tiny file called "hello" in your scratch directory
###################################################################################################
# Queue system requests
#SBATCH --job-name doobdoob      # job name displayed by squeue
#SBATCH --partition sw01_short   # queue in which this is going
#SBATCH --nodes 2                # number of nodes
#SBATCH --time 001:00:00         # time budget [HHH:MM:SS]
#SBATCH --mem 1G                 # RAM memory allocated to each node
#SBATCH --dependency singleton   # singleton dependency: do not start this job before any other job with the same job name has finished
###################################################################################################
# Now, let's run this script
# Define where the working directory is, based on the username of the user submitting
WORKINGDIRECTORY="/scratch/tmp/$USER"
# Create the scratch folder if it does not exist
mkdir -p $WORKINGDIRECTORY
# Do nothing for 30 seconds
sleep 30s
# Create a tiny file called "hello" in your working directory and write some text inside it
echo "hello world!" >> $WORKINGDIRECTORY/hello
# Write the date at the end
date +%Y-%m-%d_%H:%M:%S_%s_%Z >> $WORKINGDIRECTORY/hello
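A possible way to try this template (the node name c002 and the queue sw01_short are the examples used in the script itself; adjust the file name to the version you saved):

ssh c002                            # log into a node on Neumann
sbatch justatestjob_20170822.sh     # submit the job to the queue system
squeue -u $USER                     # the job should sit in the short queue for about half a minute
cat /scratch/tmp/$USER/hello        # afterwards: the tiny result file in your scratch directory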
Variant 2
- job-default.sh
#!/bin/bash
# UPDATE: 23.06.2016 adding --mem
# WARNING: this minimum script is not in its final version
#   (acceptable longrun-job-handling is missing)
#   please check http://www-e.uni-magdeburg.de/urzs/t100/ periodically 2016-11
#
# lines beginning with "#SBATCH" are instructions for the jobsystem (man slurm).
# lines beginning with "##SBATCH" are comments
#
#SBATCH -J job-01                # jobname displayed by squeue
#SBATCH -N 4                     # minimum number of nodes needed or minN-maxN
# do not waste nodes (check scaling of your app), other users may need them
#SBATCH --ntasks-per-node 1      # 1 for multi-thread-codes (using 16 cores)
##SBATCH --ntasks-per-node 2     # 2 for mixed code, 2 tasks * 8 cores/task
##SBATCH --ntasks-per-node 16    # 16 for pure MPI-code or 16 single-core-apps
#SBATCH --time 01:00:00          # set 1h walltime (=maximum runtime), see sinfo
#SBATCH --mem 80000              # [MB/node], please do not use more than 120000
# please use all cores of a node (especially small jobs fitting to one node)
# nodes will not be shared between jobs (avoiding problems) (added 2017.06)
#
#
# most output is for more simple debugging (better support):
date +%Y-%m-%d_%H:%M:%S_%s_%Z    # date as YYYY-MM-DD_HH:MM:SS_Ww_ZZZ
echo "DEBUG: SLURM_JOB_NODELIST=$SLURM_JOB_NODELIST"
echo "DEBUG: SLURM_NNODES=$SLURM_NNODES"
echo "DEBUG: SLURM_TASKS_PER_NODE=$SLURM_TASKS_PER_NODE"
env | grep -e MPI -e SLURM
echo "DEBUG: host=$(hostname) pwd=$(pwd) ulimit=$(ulimit -v) \$1=$1 \$2=$2"
exec 2>&1                        # send errors into stdout stream
#
# load modulefiles which set paths to mpirun and libs (see website)
echo "DEBUG: LOADEDMODULES=$LOADEDMODULES"   # module list
#module load gcc/4.8.2                 # if you need gcc or gcc-libs on nodes
#module load openblas/gcc/64/0.2.15    # multithread basic linear algebra
#module load openmpi/gcc/64/1.10.1     # message passing interface
echo "DEBUG: LOADEDMODULES=$LOADEDMODULES"   # module list
#
echo "check for free space on /dev/shm (needed for MPI jobs)"   # 2017-09-25
srun -l -c6 bash -c "df -k /dev/shm | tail -1" | sort -k 3n | tail -2
#
## please use /scratch (200TB 8GB/s), /home is for job preparation only
## do not start jobscript in /scratch but change to it to use massive disk-I/O
## (conflicting link@master vs. mount@nodes), see website for more info
#mkdir -p /scratch/tmp/${USER}_01                 # create directory if not existing
#cd /scratch/tmp/${USER}_01;echo new_pwd=$(pwd)   # change to scratch-dir
#
export OMP_WAIT_POLICY="PASSIVE"
export OMP_NUM_THREADS=$((16/((SLURM_NPROCS+SLURM_NNODES-1)/SLURM_NNODES)))
# optimize multi-thread-speed for 16 threads:
[ $OMP_NUM_THREADS == 16 ] && export GOMP_CPU_AFFINITY="0-15:1"
export OMP_PROC_BIND=TRUE
echo OMP_NUM_THREADS=$OMP_NUM_THREADS
#
# --- please comment out and modify the part you will need! ---
#
# --- for MPI-Jobs and hybrid MPI/OpenMP-Jobs only ---
# prepare debug options for small test jobs
## set debug-output for small test jobs only:
[ "$SLURM_NNODES" ] && [ $SLURM_NNODES -lt 4 ] \
  && debug_mpi="--report-bindings"
#mpirun $debug_mpi ./mpi_app     # start mpi-application
#
# --- for multiple Single-Jobs and multiple OpenMP-Jobs only ---
# for multiple single-core-apps, ntasks-per-node should be set to maximum=16
## small test-job:
# # start all tasks, but only let output tasks 0..19:
srun bash -c "[ \$SLURM_PROCID -lt 20 ] && echo task \$SLURM_PROCID \
  of \$SLURM_NPROCS runs on \$SLURMD_NODENAME"
#
# # parallize serial loop of single_core_application app1:
# for ((i=0;i<$SLURM_NPROCS;i++));do ./app1 $i;done   # serial version
srun bash -c "./app1 \$SLURM_PROCID"                   # parallel version
#
# start 4 different tasks (app1..4) in background and wait:
[ "$SLURM_NNODES" == "1" ] && ( ./app1 & ./app2 & ./app3 & ./app4 & wait )
#
# -------------------------- post-processing -----------
# If MPI jobs abort in an unexpected way /dev/shm gets filled with stale
# files. So we try to clean tmp-dirs after the job (added 2017-09-25):
find /dev/shm /tmp -xdev -maxdepth 1 -user $USER \
  -exec rm -rf --one-file-system {} \;   # dont use this if you have suspended jobs
# if you know a better way, please tell me (Joerg S.)
## Final time stamp
date +%Y-%m-%d_%H:%M:%S_%s_%Z    # date as YYYY-MM-DD_HH:MM:SS_Ww_ZZZ
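For example, to turn this template into a pure MPI run, one possible set of changes (module version taken from the commented lines above; ./mpi_app and the scratch directory name are placeholders) would be:

#SBATCH --ntasks-per-node 16            # pure MPI: one rank per core

module load openmpi/gcc/64/1.10.1       # module version as listed in the template

mkdir -p /scratch/tmp/${USER}_01        # work on /scratch, not in /home
cd /scratch/tmp/${USER}_01

mpirun $debug_mpi ./mpi_app             # mpirun line uncommented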