Fall 2022 - CMPT 459 D100

Special Topics in Database Systems (3)

Data Mining

Class Number: 5355

Delivery Method: In Person

Overview

  • Course Times + Location:

    Sep 7 – Dec 6, 2022: Tue, 2:30–4:20 p.m.
    Burnaby

    Sep 7 – Dec 6, 2022: Fri, 2:30–3:20 p.m.
    Burnaby

  • Prerequisites:

    CMPT 354 with a minimum grade of C-.

Description

CALENDAR DESCRIPTION:

Current topics in database and information systems depending on faculty and student interest.

COURSE DETAILS:

This course introduces Data Mining, an area that plays a key role in Big Data analytics. The goal of data mining is the efficient discovery of useful patterns in large datasets. This course focuses on fundamental data mining tasks and algorithms as well as key applications. It will prepare you both to develop your own data mining applications and to begin data mining research. Students are expected to have completed an algorithms course and to understand basic statistics at the level of an introductory course. The programming assignments and the course project require programming in Python, and students are expected to be proficient in the language.

Topics

  • Introduction
  • Data preprocessing: data cleaning, completion, transformation, normalization
  • Classification: evaluation, decision trees, Bayesian classification, NN, SVM, ensemble methods
  • Cluster analysis: partitioning, hierarchical, density-based methods, subspace clustering
  • Outlier detection: probabilistic and distance-based methods, LOF, non-parametric methods
  • Frequent pattern mining: association rules, Apriori, FP-growth, pattern summarization
  • Impact of data mining
  • Research issues: causal discovery, explainability, transfer learning

Grading

NOTES:

Evaluation will be based on programming assignments, a course project, and exams (midterm and/or final). Details will be discussed and finalized in the first week of classes.

Students must attain an overall passing grade on the weighted average of exams in the course in order to obtain a clear pass (C- or better).

Materials

REQUIRED READING:

  • Data Mining: The Textbook, Charu Aggarwal, Springer, 2015

  • The book is available as an e-book through the SFU Library.

ISBN: 9783319141411

REQUIRED READING NOTES:

Your personalized Course Material list, including digital and physical textbooks, is available through the SFU Bookstore website by entering your Computing ID at: shop.sfu.ca/course-materials/my-personalized-course-materials.

Registrar Notes:

ACADEMIC INTEGRITY: YOUR WORK, YOUR SUCCESS

SFU’s Academic Integrity website http://www.sfu.ca/students/academicintegrity.html is filled with information on what is meant by academic dishonesty, where you can find resources to help with your studies, and the consequences of cheating. Check out the site for more information and videos that help explain the issues in plain English.

Each student is responsible for his or her conduct as it affects the university community. Academic dishonesty, in whatever form, is ultimately destructive of the values of the university. Furthermore, it is unfair and discouraging to the majority of students who pursue their studies honestly. Scholarly integrity is required of all members of the university. http://www.sfu.ca/policies/gazette/student/s10-01.html