Computer Services Centre

Indian Institute of Technology Delhi


HPC@IITD: Padum


Request Access

More details at http://supercomputing.iitd.ac.in

How to Use

You will need an SSH client to connect to the Padum cluster. CPU login is available via ssh to hpc.iitd.ac.in (use your IITD credentials). To copy data, use scp to hpc.iitd.ac.in. GPU and MIC (Xeon Phi) nodes can be accessed directly through gpu.hpc.iitd.ac.in and mic.hpc.iitd.ac.in respectively. Please avoid using the gpu and mic login nodes for large data transfers.
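
For example (here username is a placeholder for your IITD user id, and mydata.tar.gz for a file you want to copy):

$ ssh username@hpc.iitd.ac.in
$ scp mydata.tar.gz username@hpc.iitd.ac.in:~/
$ ssh username@gpu.hpc.iitd.ac.in
$ ssh username@mic.hpc.iitd.ac.in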

Once logged in to the system, you have access to your home (backed up) and scratch (not backed up) directories. Please generate an ssh key pair in your .ssh directory to start using PBS. (Please see https://help.github.com/articles/generating-ssh-keys for help.)
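
For example, to generate a key pair with the default settings (press Enter at the prompts to keep the default location in your .ssh directory):

$ ssh-keygen -t rsa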

Please report issues to hpchelp@iitd.ac.in

PBS Queue Manager

PBS Pro v13

  1. Commands:
    • qstat is used to view job status information for jobs managed by PBS. The default view shows only minimal information. A more detailed table is available using qstat -a; for still more detail, use qstat -as.
    • qsub is used to submit a batch script to PBS.
    • qsub -I is used to obtain a PBS job allocation (a set of nodes), run commands interactively, and then release the allocation when the session is finished.
    • qdel is used to cancel jobs.
    • For more info: PBS Works Documentation
    • UPDATE: You must submit your job to a project that you are part of. Everyone belongs to a default project with the same name as their department's dc name: the string that appears after /home/ in your home directory's path. (See here for all dc names.) For example, a user whose home path starts with /home/cc/ will use qsub -P cc to specify her default project cc. (See the example after this list.)
  2. Job limits:
    • Upper time limit: 168 hours (7 days)
    • The entire cluster can be subscribed for a maximum of 24 hours
    • A fraction of the cluster can be subscribed for a maximum of (24 / fraction) hours, e.g. half the cluster can be requested for 48 hours, a quarter of the cluster for 96 hours, etc.
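
For example, a user whose default project is cc might submit, monitor and, if necessary, cancel a job as follows (pbsbatch.sh and the job id 1234 are illustrative placeholders):

$ qsub -P cc pbsbatch.sh
$ qstat -a
$ qdel 1234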

 

Examples of resource selection:

  1. 24 CPU cores per node : -l select=2:ncpus=24
  2. 2 GPUs per node : -l select=2:ngpus=2
  3. 2 Xeon Phi Per node : -l select=2:nmics=2
  4. High memory (512GB): -l select=1:highmem=1
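
These selectors can be passed to qsub on the command line or written as #PBS -l lines in a submission script. For example (the project name cc and script name pbsbatch.sh are placeholders):

$ qsub -P cc -l select=2:ncpus=24 -l walltime=24:00:00 pbsbatch.sh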

 

Typical submission script

In a blank text file (e.g. pbsbatch.sh) add the following:


#!/bin/sh
### Set the job name
#PBS -N somejob
### Set the project name, your department dc by default
#PBS -P cc
### Request email when job begins and ends
#PBS -m bea
### Specify email address to use for notification.
#PBS -M $USER@iitd.ac.in
### Request n nodes with m CPU cores each
#PBS -l select=n:ncpus=m
### Specify "wallclock time" required for this job, hhh:mm:ss
#PBS -l walltime=01:00:00
#PBS -o stdout_file
#PBS -e stderr_file
#### Get environment variables from submitting shell
#PBS -V
#PBS -l software=
# After the job starts, change to the working directory:
# $PBS_O_WORKDIR is the directory from where the job was submitted.
cd $PBS_O_WORKDIR
# Job command
time -p mpirun -n {n*m} executable
# NOTE:
# The job line above is only an example: users need to adapt it to their applications.
# The PBS select statement picks n nodes, each having m free processors.
# OpenMPI needs more options, such as $PBS_NODEFILE.

submit using:
$ qsub pbsbatch.sh

 

 


 

Interactive job

2 nodes, 24 procs each:
$ qsub -V -I -l select=2:ncpus=24

2 nodes, 2 gpus each, 2 cpus each (default) for 6 hours:
$ qsub -V -I -l select=2:ngpus=2 -l walltime=6:00:00

2 nodes, 2 gpus each, 12 cpus each (explicit):
$ qsub -V -I -l select=2:ngpus=2:ncpus=12
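
1 high memory node (512GB) for 2 hours, combining the highmem selector listed above (the walltime shown is illustrative):
$ qsub -V -I -l select=1:highmem=1 -l walltime=2:00:00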


Currently released resources:

236 CPU Compute nodes | 160 GPU Compute Nodes | 22 Xeon Phi Compute nodes
RAM: 64GB
Some high memory nodes (512GB) are also available:
7 CPU nodes, 7 GPU nodes, 4 Xeon Phi nodes.

Storage

500TB "home" space (note: different from your institute home directory): will be backed up
1000TB "scratch" space, for temporary data: will not be backed up

 

Compilers, Software and Applications

Please see /home/apps and /home/soft for centrally installed applications.
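
To browse the centrally installed packages, list these directories, e.g.:

$ ls /home/apps
$ ls /home/soft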

 

NOTE: The old cluster "hpca" has been integrated into the HPC cluster and no longer exists on its own.

Last Updated on Thursday, 08 December 2016 22:15