# KISLURM KI-Cluster quickstart
## Logging in to the cluster
The cluster is not directly accessible from outside the university network. To access it from a private machine, first log in to either the "lmblogin" server (see SSH access above) or the TF login server `login.informatik.uni-freiburg.de`, and then log in to the cluster from there.
- login nodes:
  - kislogin1.rz.ki.privat
  - kislogin2.rz.ki.privat
  - kislogin3.rz.ki.privat
- account:
  - use your TF account (**NOT** the LMB account)
  - if your TF account works (see TF account access above) but you still don't have access, create a ticket here: https://osticket.informatik.uni-freiburg.de/open.php
## Get cluster and job info
- get info about nodes and partitions
```bash
sfree
```
- get info about partitions you are allowed to access
```bash
sinfo
```
- get info about **all** jobs (queued or running)
```bash
squeue
```
- get info about jobs of specific **user**
```bash
squeue -u username
```
- get info about **your** jobs
```bash
squeue --me
```
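The default `squeue` output can be customized with the `-o`/`--format` option. For example, a compact view of your own jobs using standard format codes (`%i` job ID, `%j` job name, `%T` state, `%M` elapsed time, `%R` nodelist/reason):

```shell
# compact custom view of your jobs: id, name, state, elapsed time, node/reason
squeue --me -o "%.10i %.20j %.8T %.10M %R"
```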
## Create and use storage (workspaces)
- Create a workspace
```bash
ws_allocate workspacename 100
```
  - allocates a workspace named `workspacename`
  - requests a duration of 100 days (the maximum on dlclarge)
  - the duration can be extended (up to 5 times)
- Info about your workspaces
```bash
ws_list
```
- Configure an e-mail reminder before the workspace expires
  - notify 14 days before expiration
  - notify email@cs.uni-freiburg.de
- example: while creating the workspace
```bash
ws_allocate -r 14 -m email@cs.uni-freiburg.de workspacename 100
```
- example: add a reminder to an existing workspace (use the extend option `-x` with duration 0)
```bash
ws_allocate -x -r 14 -m email@cs.uni-freiburg.de workspacename 0
```
- Extend a workspace (resets the remaining duration to 100 days)
```bash
ws_allocate -x workspacename 100
```
- Use the extend option (`-x`) with a duration of 0 to update an existing workspace without resetting its duration **TODO test this**
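When a workspace is no longer needed, the workspace tool suite that provides `ws_allocate` and `ws_list` typically also includes a release command. A sketch, assuming the standard workspace tools are installed on this cluster:

```shell
# release a workspace you no longer need
# (the data is typically removed after a site-configured grace period)
ws_release workspacename
```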
### How to access your workspaces
- mounted on
```bash
/work/dlclarge1/username-workspacename/
```
- or
```bash
/work/dlclarge2/username-workspacename/
```
### Example: Copy data from LMB storage
```bash
rsync -avz -e "ssh -p 2122" dienertj@lmblogin.informatik.uni-freiburg.de:/misc/lmbraid21/dienertj/data/cfg-gan-cifar10 /work/dlclarge1/dienertj-cfggan/data/
```
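Before copying large datasets, it can help to preview what `rsync` would transfer. Adding the `-n`/`--dry-run` flag lists the files without actually copying them:

```shell
# preview the transfer first: -n performs a dry run, no files are copied
rsync -avzn -e "ssh -p 2122" dienertj@lmblogin.informatik.uni-freiburg.de:/misc/lmbraid21/dienertj/data/cfg-gan-cifar10 /work/dlclarge1/dienertj-cfggan/data/
```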
### Access your TF/LMB home
- your LMB home directory will be mounted at /ihome/username
- **you can cd there even if /ihome/ appears empty!**
```bash
cd /ihome/username
```
- **warning**: the connection to ihome is slow! Do not train directly on data stored on ihome.
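If your data currently lives on ihome, copy it to a fast workspace once and train from there. A minimal sketch (all paths are illustrative):

```shell
# one-time copy from the slow ihome mount to a fast workspace
rsync -av /ihome/username/data/mydataset /work/dlclarge1/username-workspacename/data/
```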
## Running jobs
- for every job you should
  - specify the partition (called a queue in Torque)
  - specify a time limit
  - specify the memory requirement
  - specify the CPU core requirements
### Job submission examples
**Note**: Run `sinfo` to find out which partitions you are allowed to access. If you are not permitted to use the partition in the commands below, they will fail with
`srun: error: Unable to allocate resources: User's group not permitted to use this partition`
- interactive session with one GPU on a node from partition lmb_gpu-rtx2080
```bash
srun -p lmb_gpu-rtx2080 --pty bash
```
- interactive session with 2 GPUs, 10 GB memory, a 23:59:59 time limit, and your home as the working directory
```bash
srun -p lmb_gpu-rtx2080 --gpus=2 --mem=10G --time=23:59:59 --chdir=/home/dienertj --pty bash
```
- submit a job script, requesting 8 GPUs and allowing 8 MPI tasks
```bash
sbatch -p lmb_gpu-rtx2080 --nodes=1 --gpus=8 --mem=40G --time=23:59:59 --ntasks-per-node=8 nanoscopic-dynamics-diffusion/bash_scripts/myocardials-slurm-continue.sh
```
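Instead of passing all options on the command line, the same flags can be written into the job script itself as `#SBATCH` directives, so the job can be submitted with a plain `sbatch jobscript.sh`. A minimal sketch (partition, resource values, and script contents are illustrative):

```shell
#!/bin/bash
#SBATCH -p lmb_gpu-rtx2080      # partition
#SBATCH --gpus=1                # number of GPUs
#SBATCH --mem=10G               # memory limit
#SBATCH --time=23:59:59         # time limit
#SBATCH -o %x.%j.out            # stdout file: jobname.jobid.out

echo "Running on $(hostname)"
nvidia-smi
```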
- **TODO** array-jobs
### interact with running jobs
- with SLURM you can execute commands inside running jobs
- example: we want to check the GPU utilization of a job we submitted
- use `squeue` to get the job ID
```bash
> squeue --me
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
3999690 lmb_gpu-r myocardi dienertj R 2:59:32 1 dagobert
```
- execute nvidia-smi within the job
```bash
srun --jobid=3999690 nvidia-smi
```
- we can even get an interactive bash session:
```bash
srun --jobid=3999690 --pty bash
```
### Connect to a running job
Useful e.g. for connecting VSCode, running a debugger, building an SSH tunnel to connect jupyter notebooks.
- Start an interactive job
- Then log in to the assigned node via ssh
For example:
```bash
# start a job:
srun -p lmb_gpu-1080ti --nodes=1 --gpus=1 --mem=10G --time=23:59:59 --chdir=/home/myusername --ntasks-per-node=1 --pty bash
# then (for example from your local lmb workstation):
ssh -p 22 USERNAME@mario
# (assuming that mario is the node where your job was started)
```
So, after starting a job, students can connect their VSCode via SSH to the node and edit and debug code there. Port forwarding is also possible, e.g. to start tensorboard/jupyter/etc. on the node and access it from the local workstation.
The advantage of this setup: no GPU conflicts, as the GPU is assigned to a specific user via the job.
The disadvantages:
- slightly more effort than just connecting via ssh
- only one user per GPU, even if that user spends 90% of the time editing code and not using the GPU
- the development session ends when the walltime expires (e.g. after 24 hours)
## See also
You can find additional info about the KISLURM cluster in our [ticket system](https://osticket.informatik.uni-freiburg.de/open.php) under knowledgebase.
# LMB cluster quickstart
*Removed link to [queue status page](https://lmb.informatik.uni-freiburg.de/interna/web-qstat/) as it requires the "lmb" password*
If you are an MSc student, you should work only on the KISLURM cluster unless you have a good reason not to. If you do work on the LMB cluster, do not use the 3090 machines.
## Usage guidelines
- Do not waste resources.
Measure or estimate the resources your program needs in advance on a local machine or on a development server (**dacky**).
- Do not use more than two GPUs at the same time! Use the **student** queue to automatically enforce this.
- See [serverload page](https://lmb.informatik.uni-freiburg.de/interna/serverload/)
## Example job submission
We will submit the following example job script `myjob.sh`:
```bash
#!/bin/bash
echo `hostname`
sleep 30 # wait 30 seconds
```
1. Log in to ```lmbtorque``` with ```ssh```
2. Use the ```qsub``` command to submit the job script
```bash
qsub -l nodes=1:ppn=1,mem=100mb -q student myjob.sh
```
The command above reserves 1 CPU core and 100 MB of memory.
On success, a job identifier is written to stdout.
3. Check the state of your job with the ```qstat``` command.
```qstat``` returns information about the state of all submitted jobs.
The second-to-last column shows the current job state, which is one of:
- **C**: job is completed after having run.
- **H**: job is held.
- **Q**: job is queued, eligible to run or routed.
- **R**: job is running.
4. Check the program output after the job is completed.
Torque will create two files named ```myjob.sh.oXXXXXX``` and ```myjob.sh.eXXXXXX``` containing the stdout and stderr output of your program.
For more information see manuals and help texts of ```qsub``` and ```qstat``` and the manual of torque http://docs.adaptivecomputing.com/8-1-0/basic/help.htm
## Example job script
All parameters to ```qsub``` can also be set within the script file, as shown below:
```bash
#PBS -N Jobname
#PBS -S /bin/bash
#PBS -l nodes=1:ppn=1:gpus=1,mem=1gb,walltime=24:00:00
#PBS -m ae
#PBS -j oe
#PBS -q student
echo `hostname`
sleep 30 # wait 30 seconds
```
The script above can then be submitted with `qsub SCRIPT`.
## Getting around the 24h walltime limit
The maximum walltime for all jobs is 24 hours.
To get around this, a job must save its intermediate results to the hard disk such that a follow up job can continue.
There are two techniques for submitting multiple jobs that run one after the other: job arrays and job chaining.
#### Job arrays
```bash
qsub -t V-W[%S]
```
V: Startnumber, W: End number, S: Number of simultaneous jobs
Arrays are used to start W-V+1 jobs that use the same script. Torque will set the environment variable `PBS_ARRAYID` to provide the array index to the instance of the script. This index can then be translated to parameter sets, input datasets, etc.
Example (HelloWorldArray.sh):
```bash
#!/bin/bash
#PBS -N HelloWorldArray
#PBS -S /bin/bash
#PBS -l nodes=1:ppn=1,mem=10mb,nice=10,walltime=00:01:00
#PBS -j oe
#PBS -q student
echo "Hello World! from array index $PBS_ARRAYID"
exit 0
```
When submitting the script with
```bash
qsub -t 1-10%3 HelloWorldArray.sh
```
ten instances (jobid[1]@lmbtorque through jobid[10]@lmbtorque) will be enqueued, of which at most three run simultaneously at any time. Ten output files (HelloWorldArray.ojobid-1 through HelloWorldArray.ojobid-10) will be generated, containing
```bash
Hello World! from array index 1
.
.
.
Hello World! from array index 10
```
To run only one job at a time, use S=1.
#### Job chaining
The job identifier returned by `qsub` can be used to create a chain of jobs, where each job depends on its predecessor.
```bash
job1=`qsub chain_ok.sh`
job2=`qsub -W depend=afterok:$job1 chain_ok.sh`
job3=`qsub -W depend=afterok:$job2 chain_not_ok.sh`
job4=`qsub -W depend=afterok:$job3 chain_ok.sh`
```
`chain_ok.sh` returns 0 as exit code.
`chain_not_ok.sh` returns 1 as exit code.
job4 will be deleted by the scheduler because job3 ends with a non-zero exit code.
```bash
#!/bin/bash
#PBS -N chain_ok
#PBS -S /bin/bash
#PBS -l nodes=1:ppn=1,mem=10mb,walltime=1:00:00
#PBS -q student
echo `hostname`
sleep 60
echo success
exit 0
```
```bash
#!/bin/bash
#PBS -N chain_not_ok
#PBS -S /bin/bash
#PBS -l nodes=1:ppn=1,mem=10mb,walltime=1:00:00
#PBS -q student
echo `hostname`
sleep 60
echo failure
exit 1
```
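Torque also supports other dependency types besides `afterok`; for example, `afterany` starts the follow-up job regardless of the predecessor's exit code, which is useful for cleanup jobs. A sketch (the script names are illustrative):

```shell
job1=`qsub train.sh`
# cleanup.sh runs whether or not train.sh succeeded
qsub -W depend=afterany:$job1 cleanup.sh
```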
## How can I measure the memory peak of my program?
You can measure the required memory with `/usr/bin/time -v` (not to be confused with bash's built-in `time` command).
The "Maximum resident set size (kbytes)" line gives the peak amount of memory your program used.
## How can I measure the gpu resources of my program?
Add the following two lines to your script to print the resource usage measured on the gpu.
```bash
echo "pid, gpu_utilization [%], mem_utilization [%], max_memory_usage [MiB], time [ms]"
nvidia-smi --query-accounted-apps="pid,gpu_util,mem_util,max_memory_usage,time" --format=csv | tail -n1
```
In case the output is empty, check if accounting mode is enabled on the used gpu.
```bash
nvidia-smi -q | grep Accounting
```
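If accounting mode is disabled, it can be enabled by an administrator (this requires root privileges, so ask the cluster admins rather than running it yourself):

```shell
# enable per-process accounting on the GPUs (admin only)
sudo nvidia-smi --accounting-mode=1
```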
## Connecting to servers that are running in a job
### Jupyter notebook
Since you cannot build an SSH tunnel, access it like this instead:
```bash
jupyter notebook --generate-config
# the config file is in ~/.jupyter/jupyter_notebook_config.py
# edit it and set the following option (around line 351):
#   c.NotebookApp.ip = '*'
# then start an interactive job and run the notebook
jupyter notebook
# now you should be able to access it via browser as servername:port
```
### Other server tools
```bash
# start interactive job, assuming you get server chip
# ping it from e.g. lmbtorque to get the local ip
ping chip
# start your server and tell it to host on the local ip instead of localhost
start-my-server --host IP --port PORT
# now you should be able to access it via browser as IP:PORT
```
## See also
You can find additional info about our cluster and internal servers in the [LMB Wiki](https://lmb.informatik.uni-freiburg.de/interna/lmbwiki/)