Introduction and general information
We have our own computational cluster (10 nodes, each with 96 CPU cores and ~1 TB of RAM), located at:
hclm.ifp.tuwien.ac.at
To monitor resource usage, see this page.
SSH setup
For easier access, you may want to add an entry to your SSH config file, ~/.ssh/config:
Host hclm
    User username
    Hostname hclmbck.ifp.tuwien.ac.at
Then log in with:
ssh hclm
For more information about the SSH config file, see this page.
To keep the SSH session open for a prolonged time, add the following keyword to the config file (for example, under the Host hclm entry):
ServerAliveInterval 60
SLURM
The Slurm system manages the queue of computational jobs. Use sinfo to see the current setup and the available computational resources.
Here is a template for your run.sh file:
#!/bin/bash
#SBATCH -J JOB_NAME
#SBATCH -N 1
#SBATCH --tasks-per-node=96
#SBATCH --partition=compute
#SBATCH -t 3-0:00:00 # 3 days walltime, the format is MM:SS, or HH:MM:SS, or D-HH:MM:SS

# environment variables to set
export OMP_NUM_THREADS=1

# modules to load
module purge
module load --auto w2dynamics/1.1.4-gcc-11.4.0-4e7xlay

# commands to run
mpirun -n $SLURM_TASKS_PER_NODE DMFT.py input_file.in
This script requests one node with 96 tasks (one per CPU core, each single-threaded because OMP_NUM_THREADS=1) and sets a walltime limit of three days; Slurm terminates the job if it runs longer.
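The template passes $SLURM_TASKS_PER_NODE directly to mpirun, which works for a single node because the variable is then a plain number such as 96. For multi-node jobs, Slurm documents a compressed form for this variable, for example 96(x2) or 2(x3),1, which is not a valid rank count. The following is a minimal sketch of how such a string could be summed into a total; total_tasks is a hypothetical helper introduced here for illustration, not part of Slurm or of the cluster setup.

```shell
# Sketch: sum the task counts encoded in a SLURM_TASKS_PER_NODE-style string.
# total_tasks is a hypothetical helper, not a Slurm command.
total_tasks() {
    spec=$1
    total=0
    old_ifs=$IFS
    IFS=','                               # entries are comma separated
    for entry in $spec; do
        count=${entry%%\(*}               # part before "(", e.g. 96
        case $entry in
            *"(x"*)
                reps=${entry##*\(x}       # part after "(x", e.g. "2)"
                reps=${reps%\)}           # drop the trailing ")"
                ;;
            *)
                reps=1                    # no repetition suffix
                ;;
        esac
        total=$(( total + count * reps ))
    done
    IFS=$old_ifs
    echo "$total"
}
```

In a multi-node variant of run.sh, the last line could then read mpirun -n "$(total_tasks "$SLURM_TASKS_PER_NODE")" DMFT.py input_file.in. For the single-node template above, the helper is not needed.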
To submit your job use:
sbatch run.sh
To check the status of your jobs:
squeue -u $USER
To cancel a job, pass its ID (shown by squeue) to scancel:
scancel job_id
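The three commands above are often chained, which requires the job ID that sbatch prints in its default message, "Submitted batch job <id>". The following is a minimal sketch of extracting that ID; job_id_from_sbatch is a hypothetical helper name, and the cluster commands are left commented out so that the snippet also runs on a machine without Slurm.

```shell
# Sketch: extract the numeric job ID from sbatch's default output line,
# "Submitted batch job <id>". job_id_from_sbatch is a hypothetical helper.
job_id_from_sbatch() {
    # Keep only the last space-separated field of the message.
    echo "${1##* }"
}

# Possible use on the cluster (commented out so the snippet runs anywhere):
# job_id=$(job_id_from_sbatch "$(sbatch run.sh)")
# squeue -j "$job_id"    # show the status of this job only
# scancel "$job_id"      # cancel it if needed
```

Alternatively, sbatch --parsable prints the job ID without the surrounding text, which avoids the parsing step.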
Julia
Please install Julia and set up the proper environment yourself, following the instructions on the official page.
If you're new to Julia and want an introduction to the current workflow, check this link.