The dedicated big data cluster is designed to support rapid prototyping of data-intensive work at a smaller scale than the Cedar supercomputer.

Benefits of the interactive big data cluster include:

  • The MapReduce programming model simplifies storing, accessing and distributing large datasets across multiple servers
  • Spark lets you process large datasets and is well suited to machine learning and graph algorithms. It also includes Spark SQL, which extracts information from large datasets by running SQL queries on distributed data
  • Run machine learning, natural language processing and graph workloads
  • Interactive cluster powered by Hadoop, an open-source distributed processing framework that supports big data tools
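
As a rough illustration of the map-reduce model mentioned above, the sketch below counts words in plain Python. It does not use the Hadoop or Spark APIs, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical, chosen only for this example. On a real cluster the framework would run the map and reduce steps in parallel across nodes and handle the grouping step itself.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    sample = ["big data on a small cluster", "big ideas need big data"]
    print(word_count(sample))
```

Because each line is mapped independently and each key is reduced independently, both steps can be spread across many servers, which is the property the model relies on.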

Learn how you can accelerate your research with this free, dedicated resource.


Hadoop Cluster Specs

  • 168 cores
  • 1.4 TB memory
  • 96 TB disk space across 7 nodes, connected via a 100 Gbit network
  • 4 shelves with 84 drives, for a total of 2.3 PB of usable storage

GPU Node Specs

  • 1 large GPU node with 4 Pascal P100 16 GB graphics cards
  • 24 cores
  • 256 GB memory
  • C6420 base node with 48 cores and 192 GB memory

General Compute Node Specs

  • 48 cores
  • 192 GB memory