How to use SLURM Workload Manager
The Bishop Cluster uses the SLURM Workload Manager. SLURM, which stands for Simple Linux Utility for Resource Management, is a queue management system.
Documentation
Documentation on SLURM usage and commands can be found at the SLURM site.
Basic Slurm Commands
View the state of partitions (queues) and nodes:
sinfo
Submit a Job:
sbatch myscript.sh
Submit a Job to a Specific Queue:
sbatch --partition=quickq myscript.sh
List all current jobs for a user:
squeue -u <username>
List all running jobs for a user:
squeue -u <username> -t RUNNING
List all pending jobs for a user:
squeue -u <username> -t PENDING
List detailed information for a job (for troubleshooting):
scontrol show jobid -dd <jobid>
List status info for a currently running job:
sstat --format=AveCPU,AvePages,AveRSS,AveVMSize,JobID -j <jobid> --allsteps
To get statistics on completed jobs by job ID:
sacct -j <jobid> --format=JobID,JobName,MaxRSS,Elapsed
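Most of the commands above need a job ID. The following is a minimal sketch of capturing that ID at submission time, assuming the --parsable option of sbatch, which prints only the job ID (optionally followed by ";" and a cluster name). The function names parse_job_id and submit_job are hypothetical helpers for illustration, not part of Slurm.

```shell
#!/bin/bash
# Sketch only: parse_job_id and submit_job are illustrative helper names.

# Extract the job ID from `sbatch --parsable` output,
# which has the form "jobid" or "jobid;cluster".
parse_job_id() {
    local raw="$1"
    printf '%s\n' "${raw%%;*}"
}

# Submit a batch script and print its job ID.
# Returns non-zero if sbatch itself fails.
submit_job() {
    local script="$1"
    local raw
    raw=$(sbatch --parsable "$script") || return 1
    parse_job_id "$raw"
}
```

On a login node this could be used as: jobid=$(submit_job myscript.sh) followed by squeue -j "$jobid" or scontrol show jobid -dd "$jobid".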
Controlling Jobs
To cancel one job:
scancel <jobid>
To cancel all the jobs for a user:
scancel -u <username>
To cancel all the pending jobs for a user:
scancel -t PENDING -u <username>
To cancel one or more jobs by name:
scancel --name myJobName
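To hold a particular pending job so that it is not started:
scontrol hold <jobid>
To release a held job:
scontrol release <jobid>
To resume a particular job that was suspended with scontrol suspend:
scontrol resume <jobid>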
To requeue (cancel and rerun) a particular job:
scontrol requeue <jobid>
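Because cancelling jobs cannot be undone, it can help to preview what would be cancelled first. The sketch below assumes job IDs arrive one per line on standard input, for example from squeue with the -h (no header) and -o %i (job ID only) options. The function name preview_cancel is hypothetical; it only prints commands and never calls scancel itself.

```shell
#!/bin/bash
# Sketch only: preview_cancel is an illustrative helper name.

# Read job IDs (one per line) on stdin and print one scancel command per ID.
# Nothing is cancelled here; the output is just text to review.
preview_cancel() {
    local id
    while IFS= read -r id; do
        # Accept plain IDs and array-task IDs such as 1234_5;
        # skip empty lines and anything containing other characters.
        case "$id" in
            ''|*[!0-9_]*) continue ;;
        esac
        printf 'scancel %s\n' "$id"
    done
}
```

A possible use is: squeue -u "$USER" -t PENDING -h -o %i | preview_cancel. Once the list looks right, the same output can be piped to sh to run it.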
Sbatch Parameters (the full list is in the sbatch man page)
#!/bin/bash
#SBATCH -J jobname             # Specify a job name
#SBATCH -n 1                   # Number of tasks (one core per task by default)
#SBATCH -N 1                   # Number of nodes
#SBATCH --begin=15:00          # Time of day to start (here 3 PM); now+<interval> delays the run
#SBATCH -t 0-00:00:00          # Runtime in D-HH:MM:SS
#SBATCH -p queuename           # Submit to a specific queue (partition)
#SBATCH --mem=1G               # Memory per node (--mem-per-cpu can be used instead)
#SBATCH -o hostname_%j.out     # File to which output will be written (%j is the job ID)
#SBATCH -e hostname_%j.err     # File to which errors will be written
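The -t option takes its runtime in D-HH:MM:SS form, which is easy to get wrong when a limit is known only in seconds. A minimal sketch of the conversion, assuming a non-negative whole number of seconds; the function name to_slurm_time is hypothetical.

```shell
#!/bin/bash
# Sketch only: to_slurm_time is an illustrative helper name.

# Convert a number of seconds into the D-HH:MM:SS form used by -t.
to_slurm_time() {
    local total="$1"
    local days=$(( total / 86400 ))
    local hours=$(( (total % 86400) / 3600 ))
    local mins=$(( (total % 3600) / 60 ))
    local secs=$(( total % 60 ))
    printf '%d-%02d:%02d:%02d\n' "$days" "$hours" "$mins" "$secs"
}
```

For example, to_slurm_time 43200 prints 0-12:00:00, a twelve-hour limit.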
Sbatch Example
#!/bin/bash
#SBATCH -N 2                         # 2 nodes
#SBATCH --begin=now+2hours           # Run in 2 hours
#SBATCH -t 0-12:00:00                # 12 hours runtime
#SBATCH -p quickq                    # Submit to quickq
#SBATCH --mem=4G                     # 4 GB memory per node, total 8 GB
#SBATCH -o hostname_%j.out           # Output file
#SBATCH -e hostname_%j.err           # Error file

module load matlab                   # Load the Matlab module
matlab -nodisplay < matlab_test.m    # Launch the Matlab script
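In the output and error file names, Slurm replaces %j with the job ID, so the files for a given job can be located once the ID is known. A minimal sketch of that substitution, assuming a purely numeric job ID and handling only the %j pattern; the function name expand_output_name is hypothetical.

```shell
#!/bin/bash
# Sketch only: expand_output_name is an illustrative helper name.

# Replace every %j in a file name pattern with the given job ID.
# Assumes the job ID is numeric, so it is safe inside the sed expression.
expand_output_name() {
    local pattern="$1"
    local jobid="$2"
    printf '%s\n' "$pattern" | sed "s/%j/${jobid}/g"
}
```

For example, if the job ID reported at submission were 12345, expand_output_name 'hostname_%j.out' 12345 would give hostname_12345.out, the file to inspect for the job's output.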