Spring 2025 - CMPT 713 G100

Natural Language Processing (3)

Class Number: 5476

Delivery Method: In Person

Overview

  • Course Times + Location:

    Jan 6 – Apr 9, 2025: Mon, Wed, 3:30–4:50 p.m.
    Burnaby

Description

CALENDAR DESCRIPTION:

Natural Language Processing (NLP) is the automatic analysis of human languages, such as English, Korean, and thousands of others, by computer algorithms. Unlike artificially created programming languages, where the structure and meaning of programs are easy to encode, human languages pose an interesting challenge, both in their analysis and in the learning of language from observations. Covers NLP tasks such as language modeling, machine translation, multilingual processing, information extraction, question answering, and other topics relevant to modern NLP. Students with credit for CMPT 825 or CMPT 413 may not take this course for further credit.

COURSE DETAILS:

Imagine a world where you can pick up a phone and talk in English, while at the other end of the line your words are spoken in Chinese. Imagine a computer-animated representation of yourself fluently speaking what you have written in an email. Imagine instructing a robot to prepare your backpack for you. Imagine automatically uncovering protein/drug interactions in petabytes of medical abstracts. Imagine feeding a computer an ancient script that no living person can read, then listening as the computer reads aloud in this dead language. Natural Language Processing (NLP) is the automatic analysis of human languages, such as English, Korean, and thousands of others, by computer algorithms, and it can make these applications possible. Unlike artificially created programming languages, where the structure and meaning of programs are easy to encode, human languages pose an interesting challenge, both in their analysis and in the learning of language from observations.

This course is an introduction to NLP and will cover algorithms and techniques for processing text (using probabilistic models and neural networks) as well as basic linguistic concepts.  

Topics

  • Text classification
  • Language models
  • Word representations and embeddings
  • Supervised machine learning for NLP
  • Neural models for NLP
  • Sequence modeling
  • Machine translation
  • Parsing and semantics
  • NLP applications

COURSE-LEVEL EDUCATIONAL GOALS:

At the conclusion of the course, students are expected to understand the algorithms and techniques used in NLP, including how computers represent words, how text is modeled as sequences, and the mathematics of the machine learning models commonly used in NLP (e.g. language models, RNNs, transformers) and how they are trained. In addition, students will gain knowledge of basic NLP tasks such as part-of-speech tagging and parsing, and of how NLP models are used in applications such as machine translation.

Grading

NOTES:

Grading will be based on assignments, exams, a final project, and class participation. Details will be announced during the first week of classes.

Students must attain an overall passing grade on the weighted average of exams in the course in order to obtain a clear pass (C- or better).

REQUIREMENTS:

Students are expected to have a strong background in math (probability, linear algebra, and calculus). Students must also be comfortable with programming and implementing algorithms (coding assignments will be in Python and PyTorch). In addition, familiarity with basic machine learning and deep learning concepts is highly recommended (e.g. having taken either CMPT 726 or CMPT 728).

Materials

MATERIALS + SUPPLIES:

Reference Books

  • Speech and Language Processing (3rd ed), Dan Jurafsky and James H. Martin, 2024, https://web.stanford.edu/~jurafsky/slp3/
  • Neural Network Methods for Natural Language Processing, Yoav Goldberg, Morgan and Claypool, 2017, http://www.morganclaypool.com/doi/10.2200/S00762ED1V01Y201703HLT037
  • Natural Language Processing, Jacob Eisenstein, The MIT Press, 2018, https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf

REQUIRED READING:

Speech and Language Processing (3rd ed)
Dan Jurafsky and James H. Martin, 2024
https://web.stanford.edu/~jurafsky/slp3/


RECOMMENDED READING:

Deep Learning
Ian Goodfellow and Yoshua Bengio and Aaron Courville
MIT Press
2016

https://www.deeplearningbook.org/
ISBN: 9780262035613

REQUIRED READING NOTES:

Your personalized Course Material list, including digital and physical textbooks, is available through the SFU Bookstore website by simply entering your Computing ID at: shop.sfu.ca/course-materials/my-personalized-course-materials.

Graduate Studies Notes:

Important dates and deadlines for graduate students are found here: http://www.sfu.ca/dean-gradstudies/current/important_dates/guidelines.html. The deadline to drop a course with a 100% refund is the end of week 2. The deadline to drop with no notation on your transcript is the end of week 3.

Registrar Notes:

ACADEMIC INTEGRITY: YOUR WORK, YOUR SUCCESS

SFU’s Academic Integrity website http://www.sfu.ca/students/academicintegrity.html is filled with information on what is meant by academic dishonesty, where you can find resources to help with your studies and the consequences of cheating. Check out the site for more information and videos that help explain the issues in plain English.

Each student is responsible for his or her conduct as it affects the university community. Academic dishonesty, in whatever form, is ultimately destructive of the values of the university. Furthermore, it is unfair and discouraging to the majority of students who pursue their studies honestly. Scholarly integrity is required of all members of the university. http://www.sfu.ca/policies/gazette/student/s10-01.html

RELIGIOUS ACCOMMODATION

Students with a faith background who may need accommodations during the term are encouraged to assess their needs as soon as possible and review the Multifaith religious accommodations website. The page outlines ways they can begin working toward an accommodation and ensure that solutions are reached in a timely fashion.