IRMACS Cluster to Colony Cluster Migration

IRMACS Cluster to Colony Cluster - What does it mean?

As of June 16, 2014 the IRMACS compute nodes have been integrated with the RCG Colony Cluster. As with the IRMACS Resource Allocation on Compute Canada, IRMACS users will have priority access to 80 cores on the Colony Cluster. This migration is necessary for two main reasons:

  1. It helps us (IRMACS and RCG) minimize support time by not maintaining two completely separate clusters.
  2. It allows IRMACS to retire some of its aging hardware that was required to keep the IRMACS cluster running.

The Colony Cluster is run in much the same way as the IRMACS cluster was, so not much should need to change in how you use the system. That said, you will notice some changes in the environment, and you will need to make some changes to your job submission scripts. These differences are documented below.

If you have any questions about this migration process or about using the IRMACS allocation on the Colony Cluster, please email help@irmacs.sfu.ca.

If you have any software installation requests, general usage questions, or urgent support needs for the Colony Cluster, please email research_support@sfu.ca and CC help@irmacs.sfu.ca.

Login Node

The login node for the Colony Cluster is queen.rcg.sfu.ca. To log in to the new head node, simply ssh to queen.rcg.sfu.ca. The old login node, head.irmacs.sfu.ca, will no longer be available.
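
For example, from a terminal (replacing USERNAME with your SFU username):

ssh USERNAME@queen.rcg.sfu.ca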

Username and Password

Your username and password on queen.rcg.sfu.ca are your SFU username and password. These are the same credentials that you use to log in to the IRMACS lab machines as well as to your SFU email account. Note that for most of you this will be the same as what you used on head.irmacs.sfu.ca, but for some of you it will be different. Please ensure you use your SFU credentials on queen.rcg.sfu.ca.

Your files and home directory

Your IRMACS files from ~USERNAME on head.irmacs.sfu.ca can be found in /rcg/irmacs/cluster-personal/USERNAME, where USERNAME is your SFU username. Please do not confuse this space with your personal network folder on its-fs2.sfu.ca (i.e. username-irmacs). If you had any project files, they will be stored in /rcg/irmacs/cluster-projects/PROJECTNAME. The /rcg/irmacs file system is backed up.

This file system is 1 TB in size and can be made larger as required. Please let us know if you think you will need more disk space.

Note: Your home directory on queen.rcg.sfu.ca (~USERNAME) has a 3 GB quota on it and should only be used to store small amounts of data. Please use /rcg/irmacs/cluster-personal/USERNAME for your research data.
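
For example, to move into your IRMACS storage and check how much space your files are using (again, USERNAME is your SFU username):

cd /rcg/irmacs/cluster-personal/USERNAME
du -sh .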

Modules

Like the IRMACS cluster, the Colony Cluster uses modules to manage the software that you are using. The module system is almost identical in functionality, but module names and versions may differ slightly. For example, to use Python on head.irmacs.sfu.ca you would use:

module load LANG/PYTHON/2.7.2

On queen.rcg.sfu.ca you would use:

module load LANG/PYTHON/2.7.6-SYSTEM

Note that the version and the naming convention are slightly different. To see the modules available on queen.rcg.sfu.ca, type "module avail". If there is software that you would like installed on the cluster, please email help@irmacs.sfu.ca.
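
For example, a typical session on queen.rcg.sfu.ca is to list the available modules, load the one you need, and confirm that it is loaded (using the Python module shown above):

module avail
module load LANG/PYTHON/2.7.6-SYSTEM
module list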

More information on using modules on the Colony Cluster is available here.

Submitting Jobs

Submitting jobs is done in almost exactly the same way as on the IRMACS cluster, with the exception of a couple of details described below.

You should always specify three parameters in your job submission script: the amount of memory the job will use, the number of processors you will use, and the expected wall time of the job. These can be specified with the PBS directives

  • -l mem=1gb
  • -l nodes=1:ppn=1
  • -l walltime=1:00:00

to request 1 GB of memory, one processor on one node, and 1 hour of wall time for your job to complete.

To take advantage of the IRMACS allocation on the Colony Cluster, you should specify the accounting group that you are submitting to. The accounting group in this case is rcg-irmacs, so you should use the following PBS directive:

  • -W group_list=rcg-irmacs

A typical batch submission script on queen.rcg.sfu.ca would therefore look like this:

#!/bin/bash
# Run under the IRMACS allocation: 1 core on 1 node, 1 GB of memory per process, 1 hour of wall time
#PBS -W group_list=rcg-irmacs
#PBS -l nodes=1:ppn=1,pmem=1g,walltime=1:00:00
# Email job status updates (abort, begin, end) to the address below
#PBS -M USERNAME@sfu.ca
#PBS -m abe
matlab -nosplash -nodisplay < test.m
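
If, for example, the script above were saved as test_job.pbs (the file name is arbitrary), it would be submitted with:

qsub test_job.pbs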

More information on job submission on the Colony Cluster is available here.

Monitoring Jobs

Monitoring jobs is done using the qstat and showq commands, as on head.irmacs.sfu.ca. The main difference is that qstat reports jobs with a "C" (complete) status when they are done, whereas on head.irmacs.sfu.ca and bugaboo.westgrid.ca finished jobs are removed from the queue.
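
For example, to list only your own jobs (replacing USERNAME with your SFU username):

qstat -u USERNAME
showq -u USERNAME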

More information on job monitoring on the Colony Cluster is available here.