StarCCM+ sbatch script to run a SLURM job array


Imagine a case where you want to run many similar StarCCM+ simulations.

Normally, you would have to create an equal number of job scripts, such as the single simulation template already provided, and then use sed or another editor to replace the changing names or directories in each copy. As you can guess, this becomes tedious for a larger number of simulations.

Another option is a specialized macro that runs multiple sim-files sequentially within one job. You can find an explanation in the Steve Portal.

SLURM also offers the possibility to run a sequence of similar jobs, a so-called Job Array.

A job array is requested with the additional argument --array followed by a list of numbers. For example, you can run a sequence of 5 jobs by adding --array=1-5.

This makes additional environment variables available within the SLURM job. The most useful one is SLURM_ARRAY_TASK_ID, which distinguishes the individual runs within a job array: it is set to 1 for the first job in the array, 2 for the second, and so forth. You use this variable to switch between your simulations. If your files are named star1.sim, star2.sim, star3.sim, …, this is straightforward: set SIMULATIONFILE="star${SLURM_ARRAY_TASK_ID}.sim", and the SLURM variable is replaced with the respective number at runtime.
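As a minimal sketch of this pattern: in a real job, SLURM sets SLURM_ARRAY_TASK_ID automatically; here it is set by hand purely for illustration.

```shell
# In a real array job, SLURM exports this variable; we fake it for the demo.
SLURM_ARRAY_TASK_ID=2
# Build the file name from the task ID, as described above.
SIMULATIONFILE="star${SLURM_ARRAY_TASK_ID}.sim"
echo "$SIMULATIONFILE"    # star2.sim
```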

But what if your simulations live in different directories? You would need to incorporate the task ID into the path, too. This is inconvenient when you want to keep directory names that carry information about the simulation setup, such as coupled_SST_1mioMesh_steady/star.sim. For such a case, you can list the simulation paths in a text file and use the line numbers as the sequence for the job array. In the following script, SLURM_ARRAY_TASK_ID is used to read the n-th line of an ASCII file called simdirs.csv; for example, task ID 3 reads the third line of simdirs.csv.
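The line-selection trick can be tried in any shell. This sketch builds a throwaway path list with mktemp (in the real job, simdirs.csv already exists in ROOTDIR) and fakes the task ID:

```shell
# Throwaway stand-in for simdirs.csv; the real file lives in ROOTDIR.
TMPLIST=$(mktemp)
printf '%s\n' "steady/mesh1/star.sim" "steady/mesh2/star.sim" "transient/mesh1/star.sim" > "$TMPLIST"

SLURM_ARRAY_TASK_ID=3                               # set by SLURM in a real job
# 'sed -n "Np"' prints only line N of the file.
LINE=$(sed -n "${SLURM_ARRAY_TASK_ID}p" "$TMPLIST")
echo "$LINE"                                        # transient/mesh1/star.sim
```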

--array sequences

The argument --array accepts a number of different formats.
For example:

    --array=1-5       ## creates the sequence 1,2,3,4,5
    --array=1,4,15    ## creates the list 1,4,15; especially useful if you need to rerun a few cases
    --array=1-123:3   ## creates a sequence with a step size of three: 1,4,7,...
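If you are unsure which task IDs a stepped specification produces, seq with a step argument generates the same numbers, so you can preview them before submitting (the range 1-13:3 below is just an example):

```shell
# seq START STEP END mirrors the expansion of --array=START-END:STEP.
IDS=$(seq 1 3 13 | tr '\n' ' ')
echo "$IDS"    # 1 4 7 10 13
```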

Simulation list

This is an example of simdirs.csv. It contains the relative paths to the sim-files.

simdirs.csv
steady/mesh1/starSteady.sim
steady/mesh2/starSteady.sim
transient/mesh1/starTrans1.sim
transient/mesh2/starTrans2.sim

Job Array script

This script is mainly based on the single simulation template to be found here.

It is extended by the argument --array and by the selection of the sim file per task.

The following variables deserve a closer look:

  • ROOTDIR: the top-level directory of the job array. simdirs.csv must be located here, and all paths listed in it are relative to this directory.
  • SIMULATIONFILE=$ROOTDIR/$(sed -n "${SLURM_ARRAY_TASK_ID}p" $ROOTDIR/simdirs.csv): sed -n "Np" prints only the N-th line of a file, so each array task reads its own sim-file path from simdirs.csv; prepending $ROOTDIR turns it into an absolute path.
  • WORKDIR=${SIMULATIONFILE%/*}: the parameter expansion %/* strips the last path component (the file name), leaving the directory that contains the sim file.
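The WORKDIR derivation can be tried out in any shell; the path below is made up for illustration:

```shell
# ${VAR%/*} removes the shortest trailing "/..." match, i.e. the file name,
# turning the sim-file path into its containing directory.
SIMULATIONFILE=/scratch/tmp/myusername/steady/mesh1/starSteady.sim
WORKDIR=${SIMULATIONFILE%/*}
echo "$WORKDIR"    # /scratch/tmp/myusername/steady/mesh1
```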

Windows users: please make sure to convert the script with dos2unix on the Linux machine, and read the article on Linebreaks.

jobArray.sh
#!/bin/bash
## Version 01/2020
## by Sebastian Engel
##
## Runs an array of jobs = one submission for x jobs
## Tailored to run StarCCM+ simulations
## expects a simdirs.csv containing relative paths starting from ROOTDIR to locate the sim files,
## one line per task.
## The sequence of tasks (which lines in simdirs.csv shall be run) has to be created manually.
## 
#################### Job Settings #################################################################
#SBATCH -J myJobArray         # Setting the display name for the submission
#SBATCH -N 1                  # Number of nodes to reserve, -N 2-5  for variable number of requested node count
#SBATCH --ntasks-per-node 16  # typically 16, range: 1..16 (max 16 cores per node)
#SBATCH -t 30:00              # set walltime in hours, format:    hhh:mm:ss, days-hh, days-hhh:mm:ss
#SBATCH -p short              # Desired Partition
#SBATCH --mem 100G            # Requested Memory. Neumann gives priority bonus under 120G 
#SBATCH --signal=B:USR1@180   # Sends a signal 180 seconds before the end of the job to this script,
                              # to write a stop file for StarCCM
#SBATCH --array=1-4           # Sequence of task IDs
 
 
#################### Simulation Settings ##########################################################
## Root directory for the job array to be run. No "/" at the end.
## It should be the directory where simdirs.csv is located.
ROOTDIR=/scratch/tmp/myusername
 
## Simulation file, selected by TASK ID given by SLURM. 
SIMULATIONFILE=$ROOTDIR/$(sed -n "${SLURM_ARRAY_TASK_ID}p" $ROOTDIR/simdirs.csv)
 
## Work directory. Filtered from the sim-file path
WORKDIR=${SIMULATIONFILE%/*}
 
## Macro file. Must be located in WORKDIR. Leave empty if no macro is used.
MACROFILE="macro.java"
 
## Personal POD key
PERSONAL_PODKEY="XXXXXXX"
 
## Decide which version by commenting out the desired version. 
#module load starCCM/11.06.011
#module load starCCM/12.02.011
#module load starCCM/13.02.013
module load starCCM/14.04.013
 
## Application. Can be kept constant if modules are used.
APPLICATION="starccm+"
 
## Select which options you need. Leave only the required options uncommented.
##
## you are using a macro and a sim file
#USROPT="$SIMULATIONFILE -batch $WORKDIR/$MACROFILE"
## you are using a macro and are creating a new sim file
#USROPT="-new -batch $WORKDIR/$MACROFILE"
## you want to just run the simulation
USROPT="$SIMULATIONFILE -batch run"
 
#################### Printing some Debug Information ##############################################
## Debug information
/cluster/apps/utils/bin/slurmProlog.sh 
 
#################### Signal Trap ##################################################################
## Catches signal from slurm to write an ABORT file in the WORKDIR.
## This ABORT file will satisfy the stop file criterion in StarCCM.
## Change ABORTFILENAME if you changed the stop file Criterion.
ABORTFILENAME="ABORT"
## Location where Starccm is looking for the abort file
ABORTFILELOCATION=$WORKDIR/$ABORTFILENAME
 
# remove old abort file
rm -rf $ABORTFILELOCATION
# Signal handler
write_abort_file()
{
        echo "$(date +%Y-%m-%d_%H:%M:%S) The End-of-Job signal has been trapped."
        echo "Writing abort file..."
        touch $ABORTFILELOCATION
}
# Trapping signal handler
echo "Trapping handler for End-of-Job signal"
trap 'write_abort_file' USR1
 
#################### Preparing the Simulation #####################################################
## creating machinefile 
MACHINEFILE="machinefile.$SLURM_JOBID.txt"
scontrol show hostnames $SLURM_JOB_NODELIST > $WORKDIR/$MACHINEFILE
 
## Default options plus user options
OPTIONS="$USROPT -mpi openmpi -licpath 1999@flex.cd-adapco.com -power -podkey $PERSONAL_PODKEY -collab -time -rsh /usr/bin/ssh"
 
## Let StarCCM+ wait for licenses on startup
export STARWAIT=1
 
 
#################### Running the simulation #######################################################
## Run application (StarCCM+) in background to allow signal trapping
echo "$(date +%Y-%m-%d_%H:%M:%S) Now, running the simulation ...."
 
## Command to run application (StarCCM+)
$APPLICATION $OPTIONS -np $SLURM_NPROCS -machinefile $WORKDIR/$MACHINEFILE > $SIMULATIONFILE.$SLURM_JOBID.output.log 2>&1 &
wait
 
## Final time stamp
echo "Simulation finalized at: $(date +%Y-%m-%d_%H:%M:%S_%s_%Z)" 
 
## Waiting briefly, to give starccm server processes time to quit gracefully.
sleep 120
 
## Clean-Up
/cluster/apps/utils/bin/slurmEpilog.sh 
 
echo "done."
guide/neumann/jobscript_starccm_array.txt · Last modified: 2020/01/13 10:56 by seengel
CC Attribution-Share Alike 3.0 Unported