Fall 2025 - STAT 440 D100

Learning from Big Data (3)

Class Number: 7100

Delivery Method: In Person

Overview

  • Course Times + Location:

    Sep 3 – Dec 2, 2025: Mon, 2:30–4:20 p.m.
    Burnaby

    Sep 3 – Dec 2, 2025: Wed, 2:30–3:20 p.m.
    Burnaby

  • Prerequisites:

    90 units including STAT 350 with a minimum grade of C- and one of STAT 341, STAT 260, or CMPT 225, with a minimum grade of C- (STAT 240 is also recommended); OR data science majors with 90 units including STAT 302 or STAT 305, CMPT 225, STAT 260, and STAT 240, all with a minimum grade of C-.

Description

CALENDAR DESCRIPTION:

A data-first discovery of advanced statistical methods. Focus will be on a series of forecasting and prediction competitions, each based on a large real-world dataset. Additionally, practical tools for statistical modeling in real-world environments will be explored.

COURSE DETAILS:

STAT 440 is suitable for senior students who have a minimum of 90 units.

Course Outline

The course will be split into two modules. Each module will focus on a real-world dataset. At the start of each module, students will form teams, and a subset of the dataset will be given to all teams (the training data). The rest of the dataset will be withheld (the test data). Students will learn modern machine learning methods for predicting aspects of the test data, using the training data. This train/test paradigm is common in both academic and industrial settings, where it parallels the split between development and production. Students will learn the steps of real-world data science, including pre-processing and feature engineering.
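As a rough illustration of the train/test paradigm described above, the sketch below withholds part of a dataset and scores a model only on the withheld rows. It assumes scikit-learn is available; the synthetic data, the 70/30 split, and the linear model are placeholders chosen for the example, not the course's actual datasets or methods.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a module dataset (hypothetical, not course data).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
true_coef = np.array([1.5, -2.0, 0.0, 0.5, 3.0])
y = X @ true_coef + rng.normal(scale=0.5, size=500)

# Withhold 30% of the rows as test data; teams would only see the rest.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# The pipeline fits the scaler on the training data only, so no
# information from the test rows leaks into pre-processing.
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X_train, y_train)

# Accuracy is judged on the withheld rows, never on the training rows.
test_mse = mean_squared_error(y_test, model.predict(X_test))
print(f"held-out MSE: {test_mse:.3f}")
```

Wrapping the pre-processing and the model in one pipeline is a design choice that keeps the test data untouched until the final scoring step.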

The methods will include bagging, boosting, deep learning, model blending, and cross-validation. Students will learn how to implement these methods using standard software packages such as scikit-learn and TensorFlow, and how to use large language model (LLM) APIs. Students can use these methods (or any other method they have learned) to make their predictions. Marks for the modules will be awarded based on the accuracy of their predictions.
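A minimal sketch of how some of these methods fit together in scikit-learn follows. It compares a bagging model, a boosting model, and a simple blend of the two using 5-fold cross-validation. The synthetic data, the estimator settings, and the use of `VotingRegressor` (which averages predictions) as the blending step are assumptions made for illustration; the course may use different models and blending schemes.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    BaggingRegressor,
    GradientBoostingRegressor,
    VotingRegressor,
)
from sklearn.model_selection import cross_val_score

# Synthetic regression data (hypothetical stand-in for training data).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)

# Bagging: many decision trees fitted on bootstrap resamples, then averaged.
bagging = BaggingRegressor(n_estimators=50, random_state=0)
# Boosting: trees fitted sequentially, each correcting the previous errors.
boosting = GradientBoostingRegressor(random_state=0)
# Blending (simplest form): average the predictions of both models.
blend = VotingRegressor([("bag", bagging), ("boost", boosting)])

# Cross-validation estimates held-out error using the training data alone.
scores = {}
for name, estimator in [("bagging", bagging), ("boosting", boosting), ("blend", blend)]:
    neg_mse = cross_val_score(
        estimator, X, y, cv=5, scoring="neg_mean_squared_error"
    )
    scores[name] = -neg_mse.mean()
    print(f"{name}: cross-validated MSE = {scores[name]:.1f}")
```

Because the test data are withheld, cross-validated scores like these are the usual way to choose among candidate models before submitting predictions.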

Grading

  • Assignments 10%
  • Reading 20%
  • Projects 70%

NOTES:


Assignments and Grading Procedures

  • Four assignments, worth 2.5% each, will be given, with problem sets following the methods taught in lectures.
  • Four articles or papers will be assigned for reading, with in-class responses worth 5% each.
  • Students will work in teams of three or four on two projects (35% each). A dataset will be provided for each project. Marks for predictions of held-out target variables will be awarded based on performance relative to classmates and to objective baselines. Each project will also involve a written report covering insights and methods.

The above grading is subject to change.

Materials

RECOMMENDED READING:

  • Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 3rd Edition (A. Géron, 2023, O'Reilly)
  • Linear Algebra, 5th Edition (S. Friedberg, A. Insel, L. Spence, 2022, Pearson)
  • Probabilistic Machine Learning: An Introduction (K. Murphy, 2022, MIT Press)
  • The Kaggle Book (K. Banachewicz, L. Massaron, 2022, Packt Publishing) 
  • AWS Cookbook (J. Culkin, M. Zazon, 2023, O'Reilly)
  • Learning R (R. Cotton, 2013, O'Reilly)

REQUIRED READING NOTES:

Your personalized Course Material list, including digital and physical textbooks, is available through the SFU Bookstore website by simply entering your Computing ID at: shop.sfu.ca/course-materials/my-personalized-course-materials.

Department Undergraduate Notes:

Students with Disabilities:
Students requiring accommodations as a result of disability must contact the Centre for Accessible Learning at 778-782-3112 or caladmin@sfu.ca.


Tutor Requests:
Students looking for a tutor should visit https://www.sfu.ca/stat-actsci/all-students/other-resources/tutoring.html. We accept no responsibility for the consequences of any actions taken related to tutors.

Registrar Notes:

ACADEMIC INTEGRITY: YOUR WORK, YOUR SUCCESS

At SFU, you are expected to act honestly and responsibly in all your academic work. Cheating, plagiarism, or any other form of academic dishonesty harms your own learning, undermines the efforts of your classmates who pursue their studies honestly, and goes against the core values of the university.

To learn more about the academic disciplinary process and relevant academic supports, visit: 


RELIGIOUS ACCOMMODATION

Students with a faith background who may need accommodations during the term are encouraged to assess their needs as soon as possible and review the Multifaith religious accommodations website. The page outlines ways they can begin working toward an accommodation and ensure solutions can be reached in a timely fashion.