This marks the 18th year of the SFU/UBC Joint Statistics Seminar. The goal of the seminar is to bring graduate students from SFU and UBC together to socialize and present their new research. Breakfast will be provided prior to the student presentations, which consist of six talks by graduate students from both universities. The event concludes with a talk by a faculty member from one of the two universities, and lunch is provided at the end of the seminar.

The discrete nonlinear filter (DNF) of Kitagawa (1987) is a fast, accurate, and deterministic algorithm for estimating the filtering distribution and the likelihood of nonlinear non-Gaussian state-space models. Many stochastic volatility models in finance fall into this category of state-space models, and several researchers have applied the DNF to their estimation (see, e.g., Watanabe, 1999; Bartolucci and De Luca, 2001; Langrock et al., 2012; Bégin and Boudreault, 2021). In this talk, I outline how to apply the DNF to a simple stochastic volatility model before introducing the SVDNF R package. This package allows users to apply the DNF to a wide class of stochastic volatility models to estimate filtering distributions, evaluate likelihoods, and obtain maximum likelihood parameter estimates. I will demonstrate these capabilities with a simple built-in model and show how users can create custom models to work with the SVDNF package functions.
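To illustrate the idea behind grid-based filtering, here is a minimal NumPy sketch of a DNF-style recursion for a basic stochastic volatility model. The model, grid choices, and function name are illustrative assumptions for this abstract, not the SVDNF implementation:

```python
import numpy as np

def dnf_loglik(y, phi, sigma, K=200):
    """Grid-based filtering for a toy stochastic volatility model:
        h_t = phi * h_{t-1} + sigma * eta_t,   y_t = exp(h_t / 2) * eps_t,
    with eta_t, eps_t iid standard normal. Returns the log-likelihood."""
    sd_h = sigma / np.sqrt(1.0 - phi ** 2)        # stationary sd of h_t
    grid = np.linspace(-6 * sd_h, 6 * sd_h, K)    # discretized state space
    dh = grid[1] - grid[0]

    # Transition densities: P[i, j] = p(h_t = grid[j] | h_{t-1} = grid[i])
    P = np.exp(-0.5 * ((grid[None, :] - phi * grid[:, None]) / sigma) ** 2)
    P /= sigma * np.sqrt(2.0 * np.pi)

    # Start the filter at the stationary distribution of h_t
    p_filt = np.exp(-0.5 * (grid / sd_h) ** 2)
    p_filt /= p_filt.sum() * dh

    loglik = 0.0
    for yt in y:
        p_pred = dh * (p_filt @ P)                # prediction step
        # Observation density p(y_t | h_t) = N(0, exp(h_t)) on the grid
        obs = np.exp(-0.5 * yt ** 2 * np.exp(-grid) - 0.5 * grid)
        obs /= np.sqrt(2.0 * np.pi)
        joint = obs * p_pred
        lik_t = joint.sum() * dh                  # p(y_t | y_{1:t-1})
        loglik += np.log(lik_t)
        p_filt = joint / lik_t                    # update step
    return loglik
```

The whole recursion is deterministic, so `dnf_loglik` can be passed directly to a numerical optimizer for maximum likelihood estimation.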

JAX is an open-source Python library that combines the strengths of XLA and Autograd, facilitating high-performance machine learning research. JAX is gaining popularity among machine learning researchers in academia and industry. In this talk, I will introduce our recent work on Dynamax, a Python library built on JAX that performs efficient inference and learning for state-space models. I will demonstrate how Dynamax enables fast inference for various state-space models. I will also introduce sts-jax, a JAX package for structural time series (STS) models, which is built on Dynamax and offers a flexible approach to fitting STS models with JAX.
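The kind of state-space inference Dynamax automates can be illustrated with its simplest instance, the Kalman filter for a linear-Gaussian model. A minimal scalar sketch in plain NumPy (for illustration only; this is not the Dynamax API, and the function name is an assumption):

```python
import numpy as np

def kalman_filter(y, a, q, c, r, m0, p0):
    """Scalar linear-Gaussian state-space model:
        x_t = a * x_{t-1} + N(0, q),   y_t = c * x_t + N(0, r).
    Returns filtered means, filtered variances, and the log-likelihood."""
    m, p = m0, p0
    ms, ps, loglik = [], [], 0.0
    for yt in y:
        # Prediction step
        m_pred = a * m
        p_pred = a * a * p + q
        # Innovation and its variance
        v = yt - c * m_pred
        s = c * c * p_pred + r
        loglik += -0.5 * (np.log(2.0 * np.pi * s) + v * v / s)
        # Update step
        k = p_pred * c / s
        m = m_pred + k * v
        p = (1.0 - k * c) * p_pred
        ms.append(m)
        ps.append(p)
    return np.array(ms), np.array(ps), loglik
```

In JAX, the same recursion would typically be written with `jax.lax.scan` so the filter can be JIT-compiled and differentiated, which is what makes libraries built on JAX fast.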

Reconstructing person-to-person transmission events using genomic and epidemiological data can be valuable in designing strategies to control and prevent the spread of infectious diseases. Sequenced genetic data from strains in an outbreak of a slowly evolving pathogen, such as Mycobacterium tuberculosis (which has a low mutation rate), can be partially informative about who infected whom; however, it should be combined with epidemiological data (such as infection and sampling times) to better understand the transmission network. This proposal will apply Bayesian inference to incorporate four unobserved processes: mutation, between-host transmission (epidemiology), within-host evolution, and unsampled cases. Our approach will simultaneously infer phylogenetic and transmission trees. Markov chain Monte Carlo (MCMC) will be used to sample from the posterior tree space. Different transition kernels (proposals) will be utilized to alter the tree configuration (topology and branch lengths) and the assigned transmission network (who infected whom, and when).
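As a toy illustration of mixing transition kernels in a Metropolis-Hastings sampler, here is a hedged sketch on a simple continuous target. Actual tree and transmission-network proposals are far more involved; the target, kernels, and function name here are illustrative assumptions:

```python
import math
import random

def mh_sample(logpost, x0, n_iter=20000, seed=1):
    """Random-walk Metropolis-Hastings that mixes two symmetric proposal
    kernels: a frequent small local move and an occasional large jump
    (a continuous analogue of alternating small and bold tree updates)."""
    rng = random.Random(seed)
    x, lp = x0, logpost(x0)
    samples = []
    for _ in range(n_iter):
        # Choose a transition kernel at random
        step = 0.5 if rng.random() < 0.8 else 3.0
        prop = x + rng.gauss(0.0, step)
        lp_prop = logpost(prop)
        # Both kernels are symmetric, so the standard MH ratio applies
        if math.log(rng.random()) < lp_prop - lp:
            x, lp = prop, lp_prop
        samples.append(x)
    return samples
```

In the tree setting, each kernel would instead modify the topology, a branch length, or a transmission assignment, and asymmetric proposals would require a Hastings correction in the acceptance ratio.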

In this talk, I will introduce my ongoing work on density ratio estimation, with a focus on its application to intractable-likelihood inference. Density ratio estimation is a useful tool in machine learning research, especially in transfer learning and contrastive learning; the latter is commonly used for intractable-likelihood inference. I will discuss one approach in this framework for statistical inference on unnormalized models, a case of intractable-likelihood inference: noise contrastive estimation (NCE). First, I will discuss how NCE can be connected to density ratio estimation based on a probabilistic classifier (DRE). Given that NCE is in fact estimating a density ratio, I will analyse the efficiency of NCE and introduce several ways to improve it from the DRE perspective. More specifically, I will discuss the disadvantages of classifier-based DRE and introduce several alternative approaches that overcome them. I will also review a recently proposed approach, telescoping density ratio estimation (TDRE), which is likewise designed to overcome the shortcomings of the original DRE mentioned above. I will discuss the limitations of existing TDRE from a theoretical perspective and provide a theoretically supported framework that extends the range of situations where TDRE can be applied.
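A minimal sketch of classifier-based DRE: fit a logistic regression to distinguish samples from p and q, and read the log density ratio off the fitted logit (exact for balanced classes). The Gaussian example, polynomial features, and function name are illustrative assumptions:

```python
import numpy as np

def classifier_dre(x_p, x_q, lr=0.1, epochs=3000):
    """Estimate log r(x) = log p(x)/q(x) via logistic regression on
    features (1, x, x^2): label 1 for draws from p, 0 for draws from q.
    With balanced samples, the classifier's logit approximates log r(x)."""
    X = np.concatenate([x_p, x_q])
    y = np.concatenate([np.ones(len(x_p)), np.zeros(len(x_q))])
    F = np.stack([np.ones_like(X), X, X ** 2], axis=1)   # feature map
    w = np.zeros(3)
    for _ in range(epochs):
        prob = 1.0 / (1.0 + np.exp(-F @ w))
        w += lr * F.T @ (y - prob) / len(y)  # gradient ascent on log-likelihood

    def log_ratio(x):
        f = np.stack([np.ones_like(x), x, x ** 2], axis=1)
        return f @ w                          # fitted logit = estimated log r(x)
    return log_ratio
```

When p and q are well separated, the classifier saturates and this estimator degrades badly, which is the shortcoming that motivates telescoping the ratio through intermediate distributions in TDRE.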

In many real-world applications, the data of interest are discretely measured over a continuum, e.g., time. A common pipeline for functional data analysis (FDA) is to first convert the discretely observed data into smoothly varying functions, and then represent the functional data by a finite-dimensional vector of coefficients that summarizes the information carried by the functions. Existing methods for data smoothing and dimension reduction include basis expansion and functional principal component analysis (FPCA). Both approaches learn linear mappings from the data space to the representation space; however, linear representations alone may not be sufficient. In this study, we propose to learn nonlinear representations of functional data using neural network autoencoders. The encoders employ a projection layer that takes weighted sums, over the observed timestamps, approximating the inner products of the functional data with functional weights, and the decoders apply a recovery layer that maps the finite-dimensional vector extracted from the functional data back to function space using a set of pre-selected basis functions. The developed architecture compresses the discretely observed functional data into a set of representations and then outputs smooth functions. The proposed method suits both regularly and irregularly spaced data, and the smoothness of the recovered curves is controlled through a roughness penalty added to the objective function used for network training.
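A rough sketch of the forward pass of such an architecture, with random untrained weights, a Fourier basis standing in for the pre-selected basis, and a tanh nonlinearity; the actual architecture, layer sizes, and training details are assumptions here:

```python
import numpy as np

def make_fourier_basis(t, n_basis):
    """Fourier basis evaluated at timestamps t (a stand-in for the
    pre-selected decoder basis). Returns an array of shape (len(t), n_basis)."""
    B = [np.ones_like(t)]
    for k in range(1, (n_basis + 1) // 2 + 1):
        B.append(np.sin(2 * np.pi * k * t))
        B.append(np.cos(2 * np.pi * k * t))
    return np.stack(B[:n_basis], axis=1)

def autoencode(X, t, W, C, t_out, n_basis):
    """Forward pass of the functional autoencoder sketch.
    Encoder: weighted sums over observed timestamps approximating the
    inner products <x_i, w_j> (works for irregular t via quadrature weights).
    Decoder: map the code to basis coefficients and output smooth curves."""
    dt = np.gradient(t)                 # quadrature weights for irregular grids
    scores = (X * dt) @ W               # projection layer: (n_curves, n_code)
    coefs = np.tanh(scores) @ C         # nonlinearity + recovery layer
    return coefs @ make_fourier_basis(t_out, n_basis).T
```

Because the decoder outputs live in the span of smooth basis functions, the reconstructed curves are smooth by construction; a roughness penalty on the coefficients would further control their wiggliness during training.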

Empirical likelihood is a popular platform for non-parametric inference. In the empirical likelihood context, the consistency result established by Qin and Lawless has been widely accepted. Their consistency result states that under some moment and smoothness conditions on the estimating function, there exists a consistent local maximum within an n^{-1/3} neighbourhood of the true parameter. Because the true parameter is unknown, their consistency result is not completely satisfactory. When the empirical likelihood function has multiple local maxima, Qin and Lawless' consistency result does not indicate which local maximum is consistent. For this reason, we establish a global consistency result by showing that the global maximum of the empirical likelihood function is consistent under some conditions. Furthermore, we can test whether any given local maximum of the empirical likelihood function is the global maximum through an empirical likelihood ratio test.
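For a concrete special case, the empirical log-likelihood ratio for a population mean can be computed by profiling out the observation weights with a Lagrange multiplier; a small sketch (the bisection solver and function name are illustrative assumptions):

```python
import numpy as np

def el_logratio(x, mu):
    """Empirical log-likelihood ratio log R(mu) for the mean:
        log R(mu) = -sum_i log(1 + lam * (x_i - mu)),
    where lam solves sum_i (x_i - mu) / (1 + lam * (x_i - mu)) = 0,
    giving weights w_i = 1 / (n * (1 + lam * (x_i - mu)))."""
    z = x - mu
    if z.min() >= 0 or z.max() <= 0:
        return -np.inf                  # mu outside the convex hull of the data

    def score(lam):
        return np.sum(z / (1.0 + lam * z))

    # Feasible interval keeps all 1 + lam * z_i > 0; score is decreasing in lam
    lo = -(1.0 - 1e-9) / z.max()
    hi = -(1.0 - 1e-9) / z.min()
    for _ in range(200):                # bisection on the score equation
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return -np.sum(np.log1p(lam * z))
```

The ratio is maximized (at zero) exactly at the sample mean, which is the global maximum in this simple case; the talk's question is which local maximum to trust when the estimating function makes the EL surface multimodal.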

Modern life is saturated with the ideas of probability. They appear prominently in science, engineering, sociology, politics, medicine, sports, economics, and so on. Of course, probability underlies statistical models, methodology, and analysis. But the enthusiasm for probability masks questions about how and why probability can be used as a model in so many situations. These questions are especially important in risky situations such as predicting disease, forecasting hurricanes, and evaluating the reliability of nuclear reactors.

Unfortunately, it is difficult to find careful descriptions of the construction of probability models. In this talk, I will describe a series of probability models derived from real applications. The goal is to illustrate various ways probability is used to model complex phenomena. I want to provide evidence for the observation that probability is often a good description in situations in which aggregation of many similar processes leads to coherent behavior.