====== Main Slurm Commands ======

**sbatch**

The sbatch command submits a batch-processing job to the Slurm queue manager. The submitted scripts typically contain one or more srun commands that launch the job's steps.

**srun**

The srun command is used to submit jobs for execution, or to initiate job steps in real time. For the full range of options that can be passed to srun, see the UNIX man page (type man srun at the command prompt).

**scancel**

The scancel command terminates pending and running jobs or job steps. It can also be used to send a UNIX signal to all processes associated with a running job or job step.

**squeue**

The squeue command reports the state of running and pending jobs.

**sinfo**

The sinfo command reports the status of the available partitions and nodes.

**smap**

The smap command is similar to sinfo, except that it displays the information in a pseudo-graphical ncurses terminal interface.

**sview**

The sview command is a graphical user interface for viewing and modifying Slurm state.

**Example Scripts**

  * Script 1

The following snippet runs a program asking for four (4) tasks. See http://osirim.irit.fr/site/en/articles/sbatch-options for a description of the sbatch options.

<code>
#!/bin/bash
srun -n 4 my_program
</code>

  * Script 2

This script is the same as Script 1, except that it uses Slurm directives instead of passing the arguments on the srun command line.

<code>
#!/bin/bash
#SBATCH -n 4
#SBATCH --ntasks-per-node=2
#SBATCH --time=00:30:00

srun ./my_program
</code>

To submit the script, just run

<code>
$> sbatch jobscript
</code>

  * Script 3

Running two jobs per node:

<code>
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 2
#SBATCH --time=00:30:00

# Use '&' to move the first job to the background
srun -n 1 ./job1.batch &
srun -n 1 ./job2.batch

# Use 'wait' as a barrier to collect both executables when they are done
wait
</code>

To submit the script, just run

<code>
$> sbatch jobscript
</code>

  * Script 4

Naming output and error files:

<code>
#!/bin/bash
#SBATCH -n 2
#SBATCH --time=00:05:00
#SBATCH --error=job.%J.err
#SBATCH --output=job.%J.out

srun ./my_program
</code>

To submit the script, just run

<code>
$> sbatch jobscript
</code>

  * Script 5

Running an R script:

<code>
#!/bin/bash
#SBATCH --nodes=1                    # request one node
#SBATCH --cpus-per-task=2            # e.g. ask for 2 CPUs
#SBATCH --time=02:00:00              # ask that the job be allowed to run for 2 hours
#SBATCH --error=/PATH/<NAME>.%J.err  # tell it to store the error messages in a file
#SBATCH --output=/PATH/<NAME>.%J.out # tell it to store the output console text in a file

module load R                        # load the most recent version of R available
Rscript --vanilla myRscript.R        # run an R script using R
</code>

To submit the script, just run

<code>
$> sbatch jobscript
</code>

  * Script 6

Running a job that needs a GPU:

<code>
#!/bin/bash
#SBATCH --nodes=1           # request one node
#SBATCH --cpus-per-task=8   # ask for 8 CPUs
#SBATCH --time=02:00:00     # ask that the job be allowed to run for 2 hours
#SBATCH --error=job.%J.err  # tell it to store the error messages in a file
#SBATCH --output=job.%J.out # tell it to store the output console text in a file
#SBATCH --gres=gpu:1        # ask for 1 GPU; the scheduler will ensure the job runs on a node with a GPU available

module load R               # load the most recent version of R available
Rscript --vanilla myRScript.R # run an R script using R
</code>

To submit the script, just run

<code>
$> sbatch jobscript
</code>

**Interactive session example**

To get an interactive session for an hour:

<code>
salloc --time=01:00:00 --nodes=1
</code>
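Once the allocation is granted, job steps can be launched inside it with srun, and the allocation is released with exit. The snippet below is a minimal sketch of such a session; whether salloc drops you into a shell depends on the cluster's Slurm configuration, and the job ID 12345 is only a placeholder.

<code>
$> salloc --time=01:00:00 --nodes=1
salloc: Granted job allocation 12345
$> srun hostname   # runs on the allocated node, not the login node
$> exit            # release the allocation when done
</code>

For an interactive session with a GPU, the same --gres=gpu:1 option shown in Script 6 can be added to the salloc line.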
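**Quick usage examples**

When a batch script is submitted, sbatch prints the ID of the new job; this is the number that the %J pattern in the --error and --output directives above expands to. The following is an illustrative sketch with 12345 as a placeholder job ID:

<code>
$> sbatch jobscript
Submitted batch job 12345
$> ls job.12345.*
job.12345.err  job.12345.out
</code>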
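Once a job is in the queue, the monitoring commands described at the top of this page can be used to follow or cancel it. Again, 12345 is a placeholder job ID:

<code>
squeue -u $USER              # show only your own pending and running jobs
squeue -j 12345              # show the state of one specific job
scancel 12345                # cancel that job
scancel --signal=USR1 12345  # send SIGUSR1 to the job's processes instead of killing it
sinfo                        # list the partitions and the state of their nodes
</code>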

back to [[ace-gpu-1_slurm|ace-gpu-1 slurm]]