SFU Big Data Infrastructure

As the world's problems become more nuanced and complex, tap into a dedicated resource to help accelerate your research

SFU researchers can now access a set of free, dedicated services that provide a transformative way to solve complex problems and analyze data at blazing-fast speeds. With 30TB of free, high-performance storage that is integrated with Supercomputer Cedar, you can prototype and scale up your research seamlessly.

Expand your data-intensive research and find a faster and more powerful way to uncover new insights. Email research-support@sfu.ca to accelerate your research with this free, dedicated resource.

Benefits

30TB Free, High-Performance Storage

Faculty members get 30TB of free, high-performance storage, which can be shared with their research group. Storage is directly accessible from either the interactive cluster or Supercomputer Cedar, which means you can seamlessly access Cedar as your project grows. Faculty members can also access this storage without using the interactive cluster or Cedar.

Seamlessly Expand Your Work

Seamless integration with Supercomputer Cedar gives you the ability to effortlessly scale and expand your research or application. The interactive cluster and Supercomputer Cedar use the same familiar interface, which means you can dive straight in and expand your research or launch your application with ease.

Designed For SFU Researchers

Easily test applications on a dedicated, interactive big data cluster only accessible to SFU researchers. This allows for fast prototyping and agile development on infrastructure designed for research.

Hands-On, Dedicated Support

Get dedicated support and guidance from the Research Computing Group, who will help you set up access and achieve your research goals.

Storage Allocation Model

SFU faculty members will receive 30TB of high-performance storage with a high-throughput connection to computation, including a backup copy. This storage can be shared with the faculty member's research group. Faculty members can access this storage without using the interactive cluster or Cedar.

Additional storage may be purchased for $200/TB. Purchased storage is provided for five years and includes a backup copy.

Dedicated Big Data Cluster Allocation Model

The dedicated big data cluster is designed to support rapid prototyping of data-intensive work at a smaller scale than Supercomputer Cedar.

It features standard login and processing, and offers popular interfaces such as Hadoop and Spark. The allocation model for the resource is based on interactive use and will be adjusted according to demand.

Benefits of the interactive big data cluster

  • The MapReduce programming model offers a simplified approach to storing, accessing and distributing large datasets across multiple servers
  • Spark lets you process large datasets and is well suited to machine learning and graph algorithms. Spark also includes Spark SQL, which helps extract valuable information from large datasets by executing SQL queries on distributed datasets
  • Run machine learning, natural language processing and graph workloads
  • The interactive cluster is powered by Hadoop, an open-source distributed processing framework that supports big data tools
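
For readers new to the MapReduce model mentioned above, the sketch below walks through its three steps (map, shuffle, reduce) on a word count. It is a minimal, single-process illustration in plain Python, not Hadoop's or Spark's API and not code specific to the SFU cluster. The function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are hypothetical, chosen only for illustration. On a real cluster, the framework runs the map and reduce steps in parallel across servers and handles the shuffle for you.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence (illustrative, single process)."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input; a real job would read files from cluster storage.
    sample = ["big data big insights", "data at scale"]
    print(word_count(sample))
```

Because each line is mapped independently and each key is reduced independently, both steps can be split across many machines, which is what makes the model a good fit for large datasets.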