Fall 2014 UBC/SFU Joint Statistical Seminar
Sunday October 26th, 2014
Room 7000 in the Harbour Centre
Presented by PIMS and the Simon Fraser Graduate Student Society


The SFU-UBC Joint Graduate Student Workshop in Statistics is going into its 10th year. This is the first of two seminars to take place this school year: the Fall seminar is organized by graduate students from SFU and the Spring seminar by graduate students from UBC. The event offers graduate students in Statistics and Actuarial Science an opportunity to attend accessible talks introducing active areas of research in the field. For three students from each university, the seminar is also a chance to present their own work and to develop their presentation skills in front of their peers.

Continuing the usual format of past years, this event will consist of talks given by six students (three from UBC and three from SFU) and two professors (one from each university). The seminar also has important social components, namely the morning coffee and the lunch, where students get a chance to network with each other and foster a mutually beneficial relationship between the departments.

Information on previous seminars can be found on the UBC statistics department website (here).


This seminar could not take place without the generous help of our sponsors: The Pacific Institute for the Mathematical Sciences (PIMS) and the Graduate Student Society at Simon Fraser University (GSS).

Agenda For Sunday October 26th
8:30 - 9:00
Coffee and Pastries at Blenz Coffee (508 West Hastings Street)
Across the street from the Harbour Centre

9:00 - 9:30
Student Talk: Jack Davis
SQL - A 25-Minute Introduction Abstract

9:30 - 10:00
Student Talk: Seagle Liu
Bias Correction and Uncertainty Characterization of High-Resolution Dead-Reckoned Paths of Marine Mammals Abstract

10:00 - 10:30
Student Talk: Biljana Stojkova
Simulated tempering via contour optimized sampling Abstract

10:30 - 10:45
Short Break

10:45 - 11:45
Faculty Talk: Dr. Dave Campbell
Finding Statistics Jobs in Industry

12:00 - 2:00
Lunch at Rogue Wetbar in Gastown (Website)
Across the street from the Harbour Centre

2:15 - 3:15
Faculty Talk: Dr. Matías Salibián-Barrera
Running to Stand Still: Some Thoughts on Doing Statistics Today and Tomorrow

3:15 - 3:45
Student Talk: Andy Leung
Three-Step Robust Regression for Handling Cell-wise and Case-wise Contamination Abstract

3:45 - 4:00
Short Break

4:00 - 4:30
Student Talk: Xiaoqing Liang
Maximizing the Probability of Reaching a Goal Before Ruin with Transaction Costs Abstract

4:30 - 5:00
Student Talk: MD Mahsin
Determination of Sample Size for Phase II Clinical Trials in Multiple Sclerosis using Lesional Recovery as an Outcome Measure Abstract

Directions and Accessibility
The seminar conveniently takes place in room 7000 of the SFU downtown campus in the Harbour Centre in downtown Vancouver (map). From SFU, the 135 bus will take you directly to the seminar location. From UBC, the 044 and 14 buses provide direct access. It is also near Waterfront Station, which allows access from all SkyTrain lines: the Canada Line, Expo Line, and Millennium Line.


SQL - A 25-Minute Introduction

SQL, or Structured Query Language, is a portable and powerful tool for managing large datasets, collections of related datasets, or data stored on a remote server. SQL is easy to learn, and it can be used within R, SAS, Access, and many other analytic software packages. This talk will explain some of the features of SQL and give a brief tutorial on writing some of the most common types of queries (mini programs). Links will be provided to set-up kits and libraries of further tutorial material.
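As a small taste of the kind of query the tutorial covers, the sketch below runs a common group-and-aggregate query from within Python via the standard `sqlite3` module; the table and column names are invented for illustration and are not from the talk.

```python
import sqlite3

# Toy in-memory database; "measurements", "site", and "value" are
# illustrative names, not from the talk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (site TEXT, value REAL)")
conn.executemany("INSERT INTO measurements VALUES (?, ?)",
                 [("A", 1.0), ("A", 3.0), ("B", 2.0)])

# A very common query type: group rows and aggregate within each group.
rows = conn.execute(
    "SELECT site, AVG(value) AS mean_value "
    "FROM measurements GROUP BY site ORDER BY site"
).fetchall()
print(rows)  # [('A', 2.0), ('B', 2.0)]
```

The same `SELECT ... GROUP BY` syntax carries over essentially unchanged to the other environments mentioned in the abstract, which is much of SQL's appeal.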

Bias Correction and Uncertainty Characterization of High-Resolution Dead-Reckoned Paths of Marine Mammals

With recent developments in electrical engineering, biologists today are able to track their animals of interest with high-resolution tags and reconstruct an animal's path via the Dead-Reckoning Algorithm (DRA). However, the DRA path can be seriously biased and lacks an uncertainty measure. We therefore develop a Bayesian Melding (BM) approach, built on a Brownian Bridge process, to efficiently combine the fine-resolution but seriously biased DRA results with the precise but sparse GPS measurements, providing an estimate of the animal's path at high spatio-temporal resolution. Our method also provides uncertainty statements about this path estimate via Bayesian credible intervals (CI). By exploiting properties of the underlying stochastic processes and some approximations to the likelihood, our method scales easily to very large data sets.
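For intuition about the building block of the approach, the sketch below simulates a one-dimensional Brownian bridge pinned at two endpoints (stand-ins for consecutive GPS fixes); the grid, the variance parameter, and the function itself are illustrative choices, and the actual model in the talk is considerably richer.

```python
import math
import random

random.seed(0)

def brownian_bridge(x0, x1, n, sigma=1.0):
    """Simulate a 1-D Brownian bridge from x0 at time 0 to x1 at time 1
    on an n-step grid, by sequentially drawing each point from its
    conditional distribution given the previous point and the endpoint."""
    path = [x0]
    for i in range(1, n + 1):
        s, t = (i - 1) / n, i / n        # previous and current time
        x = path[-1]
        # Conditional mean and variance of X(t) given X(s) = x, X(1) = x1.
        mean = x + (x1 - x) * (t - s) / (1.0 - s)
        var = sigma ** 2 * (t - s) * (1.0 - t) / (1.0 - s)
        path.append(random.gauss(mean, math.sqrt(var)))
    return path

# The simulated path is pinned at both endpoints (e.g. two GPS fixes),
# with the most positional uncertainty in the middle of the interval.
path = brownian_bridge(0.0, 2.0, n=50)
print(len(path), path[0], path[-1])  # 51 0.0 2.0
```

In the talk's setting, a process like this lets the sparse, precise GPS fixes anchor the fine-resolution path between them, with the bridge variance quantifying uncertainty away from the fixes.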

Simulated tempering via contour optimized sampling

In this presentation, I will describe a new simulated tempering algorithm, followed by the implementation challenges that came along the way. When it comes to sampling from multi-modal distributions, it is well known that standard MCMC methods fail to explore the posterior surface efficiently. Simulated and parallel tempering have been proposed to deal with multimodality in high-dimensional target distributions. In simulated tempering, the temperature is a dynamic variable, and the sampling distribution is the joint posterior distribution of the state variables and the temperature parameter. To ensure proper sampling of the temperature parameter, simulated tempering requires the normalizing constants of the tempered distributions, which contributes in large part to its unpopularity in practice. We developed a new simulated tempering algorithm that does not require computation of the normalizing constants. The algorithm is demonstrated on a mixture of Gaussians.
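To make the setup concrete, here is a minimal sketch of classical simulated tempering on a two-mode Gaussian mixture; the temperature ladder and tuning constants are ad hoc choices, and the temperature-move acceptance ratio below simply drops the unknown normalizing constants Z_k, which is precisely the practical difficulty the proposed algorithm is designed to remove.

```python
import math
import random

random.seed(1)

def log_target(x):
    # Two well-separated Gaussian modes at -3 and +3 (illustrative target).
    return math.log(0.5 * math.exp(-0.5 * (x + 3.0) ** 2)
                    + 0.5 * math.exp(-0.5 * (x - 3.0) ** 2))

betas = [1.0, 0.5, 0.2]  # temperature ladder (beta = 1/T)
x, k = 0.0, 0
samples = []
for _ in range(20000):
    # Metropolis move in x at the current temperature.
    y = x + random.gauss(0.0, 1.0)
    if math.log(random.random()) < betas[k] * (log_target(y) - log_target(x)):
        x = y
    # Metropolis move in the temperature index k. A proper acceptance
    # ratio would involve the unknown normalizing constants Z_k; they
    # are simply dropped here (uniform pseudo-priors).
    j = min(max(k + random.choice([-1, 1]), 0), len(betas) - 1)
    if math.log(random.random()) < (betas[j] - betas[k]) * log_target(x):
        k = j
    if k == 0:  # keep only samples drawn at the true temperature
        samples.append(x)

# Unlike a plain Metropolis chain stuck in one mode, the tempered
# chain visits both modes.
print(min(samples) < -1.0 < 1.0 < max(samples))
```

The hot levels (small beta) flatten the modes so the chain can cross between them, while only the beta = 1 samples are retained as draws from the target.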




Three-Step Robust Regression for Handling Cell-wise and Case-wise Contamination

Traditional robust regression methods may fail when the data contain cell-wise outliers, which are likely to occur together with case-wise outliers in modern datasets. The proposed method, called 3S-regression, proceeds as follows: first, it uses a univariate filter to detect and eliminate extreme cell-wise outliers; second, it applies a robust estimator of multivariate location and scatter to the filtered data to down-weight case-wise outliers; third, it computes robust regression coefficients from the estimates obtained in the second step. The estimator is Fisher-consistent and asymptotically normal at the central model under mild assumptions on the tail distributions of the predictors. Extensive simulation results show that 3S-regression is resilient to cell-wise outliers. It also performs well under case-wise contamination compared with traditional high-breakdown-point estimators.
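As a rough illustration of the first step only, the sketch below implements a simple column-wise median/MAD filter that flags extreme cells and sets them to NaN; the cutoff and the filter itself are simplified stand-ins for the actual filter used in 3S-regression.

```python
import numpy as np

def univariate_filter(X, cutoff=3.5):
    """Sketch of the first of the three steps: flag extreme cells
    column by column with a robust z-score (median/MAD) and set them
    to NaN, so a single wild cell cannot spoil the whole case."""
    X = np.asarray(X, dtype=float).copy()
    med = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - med), axis=0)  # consistent at the normal
    X[np.abs(X - med) / mad > cutoff] = np.nan
    return X

# One wild cell-wise outlier in an otherwise clean 5 x 2 data matrix.
X = np.array([[1.0, 2.1], [1.2, 2.0], [0.9, 1.9], [1.1, 2.2], [1.0, 500.0]])
Xf = univariate_filter(X)
print(np.isnan(Xf).sum(), np.isnan(Xf[4, 1]))  # 1 True
```

In the full method, the filtered matrix is then passed to a robust multivariate location/scatter estimator, from whose components the regression coefficients are computed.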

Maximizing the Probability of Reaching a Goal Before Ruin with Transaction Costs

In this talk, I will investigate an optimal reinsurance-investment problem for an insurer who invests the surplus in a simplified financial market consisting of a risk-free asset and a risky asset. Wealth can be transferred between the two assets, but the insurer must pay fees on each transaction equal to a fixed percentage of the amount transacted. Besides investing in the financial market, the insurer can also control the risk exposure due to insurance claims via the purchase of proportional reinsurance, with the proportion reinsured constrained to the interval [0, 1]. The objective of the insurer is to choose an optimal reinsurance-investment strategy that maximizes the probability of reaching a total wealth k before bankruptcy. We adopt the Hamilton-Jacobi-Bellman (HJB) dynamic programming approach to analyze this problem and show the existence of five types of solutions, depending on the free parameters of the model.

Determination of Sample Size for Phase II Clinical Trials in Multiple Sclerosis using Lesional Recovery as an Outcome Measure

Multiple sclerosis (MS) is an inflammatory demyelinating disease of the central nervous system. The hallmark feature of the disease is the formation of focal demyelinating lesions accompanied by myelin destruction in the white matter (WM). Magnetic resonance imaging (MRI) is used to identify and visualize these lesions. Repeated MRI scanning of subjects (most often monthly) over a period of months has become a standard protocol for Phase II trials of experimental treatments in MS. The formation of WM lesions in MS is characterized by inflammatory demyelination, with remyelination then usually occurring over several months after lesion formation. Hence, a measure reflecting lesional recovery is a promising outcome for Phase II clinical trials that assess the effect of therapies intended to induce remyelination. Our objective is to develop an approach to determine the sample size required to detect the effect of such an experimental treatment with specified statistical power. We consider a parallel-group design with two arms of equal size. The study design leads to a three-level hierarchical data structure in which lesions are nested within subjects and are assessed repeatedly over the study period. A three-level mixed-effects linear model is used for our sample size determination. Required sample sizes to achieve specified statistical powers are determined for different numbers of follow-up scans and different magnitudes of the treatment effect. These required sample sizes are determined based on two estimators of the treatment effect, for five different proton density (PD) summaries, and for analyses based on all, only enhancing, and only large new lesions. The greatest statistical sensitivity is observed for the PD 25th percentile summary measure.
For this summary measure and based on use of the large new lesions in a 6-month phase II MS clinical trial, only 4 subjects/arm are required to detect a 20% treatment effect on lack of recovery with a statistical power of 90%. Trials with such small sample sizes can be conducted quickly and efficiently.