Old IRMACS Cluster Documentation

Software

The cluster supports the C, C++, Fortran, Python, and Java languages. All software is installed in /hpc/software on the compute nodes. Commonly used packages include:

Maple, Matlab, MRBayes, R, Blast, Boost, ESCM

To determine what software is available on the cluster, use the "module" command. Type "module avail" on the command line to get a list of the packages that are available and supported. To use one of the packages, issue the command "module load PACKAGENAME", where PACKAGENAME is the name of a module in that list. For more information on which packages are available on the cluster and how to use modules, please refer to the modules support page.
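The module commands can be combined into a short shell session. The following is a minimal sketch: find_module is a hypothetical helper name, and it assumes that "module avail" writes its listing to stderr, as the Environment Modules tool traditionally does. The module name COMMERCIAL/MATLAB/2012a is the one used in the Matlab job example in this document.

```shell
#!/bin/bash
# find_module PATTERN (hypothetical helper): list available modules whose
# names contain PATTERN, ignoring case. "module avail" is assumed to print to
# stderr, so stderr is merged into stdout before filtering.
find_module() {
    module avail 2>&1 | grep -i -- "$1"
}

# Run the example only where the module command exists (i.e. on the cluster).
if type module >/dev/null 2>&1; then
    find_module matlab                     # search the list of available modules
    module load COMMERCIAL/MATLAB/2012a    # load one of the listed modules
    module list                            # show the modules currently loaded
fi
```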

Queues

The IRMACS cluster uses a single batch queue that schedules jobs based on their characteristics. Short jobs are given a high priority. For those users of the cluster in its old form, this emulates the old "live" and "batch" queues. Users should not specify a queue name in their submission scripts.

Submitting Jobs

To use the multiprocessor queueing system, a user must create a PBS script, and then submit the script to the cluster. To submit a job, a user uses the following command:

qsub "script_file"

To monitor a job a user uses the following command:

qstat -nl

To monitor all the queued and running jobs:

qstat -a

To delete a job in the queue:

qdel jobid (where jobid is the ID given to you when the job is submitted with qsub)
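Taken together, these commands form a simple submit, monitor, delete cycle. The sketch below is illustrative only: submit_job is a hypothetical helper and myjob.pbs is a placeholder script name. It relies on qsub printing the ID of the new job on standard output.

```shell
#!/bin/bash
# submit_job SCRIPT (hypothetical helper): submit a PBS script and print the
# job ID that qsub writes to standard output.
submit_job() {
    local script="$1"
    if [ ! -f "$script" ]; then
        echo "submit_job: no such script: $script" >&2
        return 1
    fi
    qsub "$script"
}

# Run the example only where qsub exists and the placeholder script is present.
if command -v qsub >/dev/null 2>&1 && [ -f myjob.pbs ]; then
    jobid=$(submit_job myjob.pbs)   # keep the ID for later commands
    echo "submitted job $jobid"
    qstat -a                        # list all queued and running jobs
    # qdel "$jobid"                 # uncomment to remove the job from the queue
fi
```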

PBS Script Elements

Note: Not all #PBS directives are needed for each script type.

#!/bin/bash "Shell used to run the script"
#PBS -N job_name "Name of the submitted job"
#PBS -q queue_name "Name of the queue to run job on (should not be specified on the IRMACS cluster)"
#PBS -M user_email "Email address to send notifications to"
#PBS -m bae "Send email when the job (b)egins, (a)borts, and (e)nds"
#PBS -l nodes=1:ppn=6 "How many nodes (1) and CPUs (6) needed for job"
#PBS -l procs=4 "How many processors/CPUs/cores required (default 1)"
#PBS -l mem=1g "How much memory job requires (1 GB)"
#PBS -l walltime=1:00:00 "How long we expect the job to run (1 hour)"
#PBS -j oe "Join stdout and stderr"
#PBS -o path_to_job_log "Send job log to file"
#PBS -e path_to_error_log "Send error log to file"

Important PBS Elements

In order for the scheduler to make intelligent decisions on when and how to schedule jobs, it is important that you specify resources accurately. The critical elements are the number of processors your job will use (e.g. -l nodes=1:ppn=6 or -l procs=4), the amount of memory the job will use (e.g. -l mem=600mb), and the expected run time of the job (e.g. -l walltime=hh:mm:ss). If this information is not specified accurately, the scheduler cannot schedule jobs properly, and the performance of your jobs and those of other users may decrease significantly.

The number of processors should be specified exactly. Expected memory usage and run time can be estimated, but should be as accurate as possible. Jobs that run longer than their estimated run time will be terminated, so it is better to overestimate run time and memory usage than to underestimate them. It is also better to give an overestimated run time than none at all: a job with no run time specified is allowed to run indefinitely, which makes it hard to schedule.
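One way to follow this advice is to measure a representative run and pad the result before writing the walltime directive. The helper below is a hypothetical sketch, not part of the cluster software. It assumes walltime is written as hh:mm:ss, the form in which qstat -f reports it, and the 50% margin is an arbitrary illustration.

```shell
#!/bin/bash
# walltime_with_margin SECONDS PERCENT (hypothetical helper): print an
# hh:mm:ss walltime that is PERCENT percent longer than SECONDS.
walltime_with_margin() {
    local secs="$1" pct="$2"
    local total=$(( secs * (100 + pct) / 100 ))
    printf '%02d:%02d:%02d\n' \
        $(( total / 3600 )) $(( total % 3600 / 60 )) $(( total % 60 ))
}

# A run measured at 1 hour (3600 seconds), padded by 50%:
walltime_with_margin 3600 50    # prints 01:30:00
```

The result would then go into the job script as "#PBS -l walltime=01:30:00".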

In order to see the resources consumed by a running job use the "qstat -f jobid" command. Refer to the lines:

resources_used.cput = 14:59:56
resources_used.mem = 35328kb
resources_used.vmem = 71308kb
resources_used.walltime = 16:44:55

The values reported for existing jobs can guide your choice of parameters for future job submissions.
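As a rough illustration of turning these lines into a memory request, the sketch below reads "qstat -f" output and prints the memory used in whole megabytes, rounded up. used_mem_mb is a hypothetical helper, and the sample input is copied from the output shown above. How much headroom to add on top of the result is left to the user.

```shell
#!/bin/bash
# used_mem_mb (hypothetical helper): read "qstat -f" output on stdin and print
# resources_used.mem converted from kb to whole MB (1 MB = 1024 kb), rounded up.
used_mem_mb() {
    awk -F' = ' '/resources_used\.mem/ {
        kb = $2 + 0                       # "35328kb" -> 35328
        printf "%d\n", int((kb + 1023) / 1024)
    }'
}

# Sample lines copied from the qstat -f output shown above.
sample='resources_used.mem = 35328kb
resources_used.vmem = 71308kb'

printf '%s\n' "$sample" | used_mem_mb    # prints 35
```

On the cluster the same function can be fed directly with "qstat -f jobid | used_mem_mb".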

Single Job Example

#! /bin/bash
#PBS -N Myprogram
#PBS -M user@irmacs.sfu.ca
#PBS -m bae
./myprogram param1 param2

Matlab Job Example

#! /bin/bash
# Note - this relies on the Matlab module loaded by the "module load" line below. Without it,
# it would be necessary to put the explicit path to matlab in the PBS script.
#PBS -N Myprogram
#PBS -M user@irmacs.sfu.ca
#PBS -m bae

module load COMMERCIAL/MATLAB/2012a
matlab -nosplash -nodesktop < my_matlab_program.m > outputfile.txt

Threaded Parallel Job Example

#! /bin/bash
#PBS -N Myprogram
#PBS -l nodes=1:ppn=6
#PBS -M user@irmacs.sfu.ca
#PBS -m bae
./myprogram -np 6 < inputfile.txt > outputfile.txt

Note: The above example assumes that myprogram is a multi-threaded program that uses the number of processors given to it by the command line argument -np (in this case 6 processors). Because this is a multi-threaded application, it MUST run on a single node, since the inter-processor communication is through threads and shared memory. Thus the -l nodes=1 part of the PBS directive is essential. The ppn=6 part of the PBS directive tells the scheduler that the job requires 6 processors. If it is omitted, the scheduler assigns the job only one processor by default. Five of the six threads could then end up running on processors that are also running other jobs, slowing down both your job and those of other cluster users.
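To keep the thread count and the ppn request from drifting apart, the job script can read the count from the environment instead of repeating it. The sketch below assumes that the scheduler exports PBS_NUM_PPN to the job, as recent TORQUE releases do. This has not been confirmed for the IRMACS cluster, so the helper (thread_count, a hypothetical name) falls back to 1 when the variable is unset.

```shell
#!/bin/bash
#PBS -N Myprogram
#PBS -l nodes=1:ppn=6

# thread_count (hypothetical helper): print the number of processors per node
# granted to the job, or 1 if PBS_NUM_PPN is not set (assumed variable name).
thread_count() {
    echo "${PBS_NUM_PPN:-1}"
}

NP=$(thread_count)
echo "using $NP thread(s)"

# Start the program only if it is present in the current directory.
if [ -x ./myprogram ]; then
    ./myprogram -np "$NP" < inputfile.txt > outputfile.txt
fi
```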

MPI Parallel Job Example

#! /bin/bash
#PBS -N Myprogram
#PBS -l nodes=1:ppn=8
#PBS -M user@domain.com
#PBS -m bae
mpirun ~/myprogram

Note: On the IRMACS cluster the scheduling system and MPI are closely coupled, so it is not necessary to specify -np or -machinefile to the mpirun command. The number of processors used by MPI is specified by the PBS directive -l nodes=1:ppn=8, which requests a single node with 8 CPUs on that node. This is the optimal configuration for MPI jobs, as the communication between processors will occur through shared memory rather than across the network between nodes (the IRMACS cluster consists of 10 nodes, each with eight processors).
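To confirm how many processors the scheduler has handed to a job, the script can count the entries in the node file before calling mpirun. This is a sketch under the assumption that PBS_NODEFILE points to a file with one line per allocated processor, which is the usual PBS/TORQUE convention. allocated_procs is a hypothetical helper name.

```shell
#!/bin/bash
#PBS -N Myprogram
#PBS -l nodes=1:ppn=8

# allocated_procs (hypothetical helper): count the lines in $PBS_NODEFILE,
# one per allocated processor; print 0 when not running under the scheduler.
allocated_procs() {
    if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
        wc -l < "$PBS_NODEFILE" | tr -d ' '
    else
        echo 0
    fi
}

echo "processors allocated: $(allocated_procs)"

# mpirun is called without -np or -machinefile, as described in the note above.
if command -v mpirun >/dev/null 2>&1 && [ -x ~/myprogram ]; then
    mpirun ~/myprogram
fi
```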

Technical Support

The IRMACS Centre employs a professional technical team to support researchers' use of the cluster.

If you have any other questions about the computational cluster, contact the IRMACS Centre.