Spring 2020 - CMPT 733 G100

Programming for Big Data 2 (6)

Class Number: 6802

Delivery Method: In Person

Overview

  • Course Times + Location:

    Jan 6 – Apr 9, 2020: Mon, 12:30–2:20 p.m.
    Burnaby

  • Prerequisites:

    CMPT 732: Programming for Big Data 1

Description

CALENDAR DESCRIPTION:

This course is one of two lab courses that are part of the Professional Masters Program in Big Data in the School of Computing Science. This lab course aims to provide students with the hands-on experience needed for a successful career in Big Data in the information technology industry. Many of the assignments will be completed on massive publicly available data sets, giving students appropriate experience with cloud computing and the algorithms and software tools needed to master programming for Big Data. Over 13 weeks of lab work and 12 hours per week of lab time, and building on the previous lab course CMPT 732, the students will obtain a solid background in programming for Big Data.

COURSE DETAILS:

In CMPT 726 and CMPT 732, students learnt machine learning algorithms and big data programming tools. When facing a real-world data problem, however, they will find that there is still a gap between what they learnt in class and what they must do in practice. The goal of this course is to close that gap, enabling students to apply what they have learnt to real-world problems. To that end, the course covers a set of important topics that a data scientist should know and teaches state-of-the-art approaches. After taking this course, students should feel confident when asked to extract value from real-world datasets, and should know how to ask interesting questions about data, choose proper tools, design data-processing pipelines, and present final data products.

Topics

  • Data Preparation
  • Data Analytics
  • Applied Statistics
  • Practical Machine Learning
  • Visualization and Communication
  • Active Learning and Crowdsourcing
  • Deep Learning
  • Large-scale Machine Learning

Grading

NOTES:

Each one of the 8 assignments will count for 9% of the final grade. The final project will count for the remaining 28%.
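
As a quick check on the arithmetic, the weights above sum to 8 × 9% + 28% = 100%. The following is a minimal Python sketch of that weighting, assuming every score is expressed as a percentage from 0 to 100. The function name and the 0 to 100 scale are illustrative assumptions, not part of the official grading scheme.

```python
# Weights taken from the grading note: 8 assignments at 9% each, final project at 28%.
NUM_ASSIGNMENTS = 8
ASSIGNMENT_WEIGHT = 0.09
PROJECT_WEIGHT = 0.28


def final_grade(assignment_scores, project_score):
    """Return the weighted final grade in percent.

    Hypothetical helper: assumes each score is a percentage (0-100).
    """
    if len(assignment_scores) != NUM_ASSIGNMENTS:
        raise ValueError(f"expected {NUM_ASSIGNMENTS} assignment scores")
    assignments_part = ASSIGNMENT_WEIGHT * sum(assignment_scores)
    project_part = PROJECT_WEIGHT * project_score
    return assignments_part + project_part


if __name__ == "__main__":
    # Example: 80% on every assignment and 90% on the project.
    print(round(final_grade([80] * 8, 90), 1))
```

Under these assumptions, the assignments together contribute up to 72 points and the project up to 28 points of the final grade.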

Materials

MATERIALS + SUPPLIES:

  • Data Science from Scratch, Joel Grus, O'Reilly Media, 9781491901427
  • Python for Data Analysis: Data Wrangling with Pandas, NumPy and IPython, 2nd Edition, Wes McKinney, O'Reilly, 2017, 9781491957660

Graduate Studies Notes:

Important dates and deadlines for graduate students are found here: http://www.sfu.ca/dean-gradstudies/current/important_dates/guidelines.html. The deadline to drop a course with a 100% refund is the end of week 2. The deadline to drop with no notation on your transcript is the end of week 3.

Registrar Notes:

SFU’s Academic Integrity web site http://www.sfu.ca/students/academicintegrity.html is filled with information on what is meant by academic dishonesty, where you can find resources to help with your studies and the consequences of cheating.  Check out the site for more information and videos that help explain the issues in plain English.

Each student is responsible for his or her conduct as it affects the University community.  Academic dishonesty, in whatever form, is ultimately destructive of the values of the University. Furthermore, it is unfair and discouraging to the majority of students who pursue their studies honestly. Scholarly integrity is required of all members of the University. http://www.sfu.ca/policies/gazette/student/s10-01.html
