# Interview with Dr. Richard Lockhart

### Professor, Department of Statistics & Actuarial Science

#### Goodness-of-fit Testing, Inference on Stochastic Processes, Large Sample Theory

Analyzing a data set and getting answers that you like is gratifying, but how can you be sure your analysis is actually working? Statisticians tackle this question; they want evidence of how well an analysis works, and to obtain answers, they analyze methods of analyzing data. Dr. Lockhart is not fundamentally a data analyst; he does analyze data, but the focus of his research is to describe how well various methods are expected to work in different circumstances.

What is the most satisfying aspect of the work you do?
I like mathematical problem-solving. As my Ph.D. supervisor used to say, “I am not interested in research, I am interested in understanding, which is different.” It is very satisfying when I finally feel that I understand how things work.

What research do you do in describing how well a group of data fits a statistical model?
Lots of analytical methods hinge on specific and technical mathematical assumptions about the way the data were generated. Those assumptions are sometimes checkable and sometimes not. ‘Goodness of fit’ is the process of developing statistical methods to ask whether the assumptions you rely on are reasonable. My job is to develop rules that will test those assumptions. Any statistics problem will have competing suggestions for methods and I compare them to highlight their strengths and weaknesses.
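As an illustration of what a goodness-of-fit check can look like in practice, here is a minimal sketch of one classical test, the Kolmogorov-Smirnov test. It is chosen here only as an example and is not named in the interview. It measures the largest gap between the empirical distribution of the data and a fully specified hypothesized distribution. The function names and the large-sample 5% cut-off of roughly 1.36/√n are assumptions of this sketch.

```python
import math
import random


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def rejects_at_5_percent(sample, cdf):
    """Large-sample rule of thumb: reject when D exceeds about 1.36/sqrt(n).

    Assumes the hypothesized distribution is fully specified in advance
    (no parameters estimated from the same data).
    """
    return ks_statistic(sample, cdf) > 1.36 / math.sqrt(len(sample))


if __name__ == "__main__":
    rng = random.Random(0)
    normal_data = [rng.gauss(0.0, 1.0) for _ in range(500)]
    skewed_data = [rng.expovariate(1.0) for _ in range(500)]
    print("normal data, D =", ks_statistic(normal_data, normal_cdf))
    print("skewed data, D =", ks_statistic(skewed_data, normal_cdf))
    print("reject normality for skewed data:",
          rejects_at_5_percent(skewed_data, normal_cdf))
```

The cut-off used here only applies when the model is completely specified beforehand. If parameters are estimated from the same data, the reference distribution changes and a different threshold is needed.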

Why do you work on ‘inference for regression analysis methods’?
Inference, in this context, refers to reasoning from effects to causes: you see what happens and you infer why it happened. Statisticians are particularly interested in quantifying the uncertainty in that reasoning.

The particular problem I have been concerned with lately is called “high-dimensional” regression. We were always told that you couldn't use more variables than you had data points; i.e., you cannot find a unique solution with three equations and five unknowns or variables. It turns out that it's not a hopeless case if you are willing to make genuinely strong assumptions.

We now have automated regression analysis methods to improve the prediction accuracy and interpretability of the statistical model that is being generated. The modern strategy is not to consider all possibilities, but to whittle away the possibilities by generating a sensible list of variables to consider. I've been involved in providing specific methods to measure uncertainty when you use these modern techniques.

The new methods are not free of assumptions. It is always hazardous to have too little data or too few responses for those predictors. But if you have good theoretical reasons for supposing that the nature of an effect is limited to a small number of predictors, then you are a lot safer in using methods that rely on those assumptions.
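To make the "more variables than data points" setting concrete, the sketch below fits a regression with 50 predictors and only 20 observations using the lasso, a widely used sparsity-based method picked here as an illustration; the interview does not name a specific technique. The simulated data, the penalty value `lam`, and all function names are assumptions of this example. It also does not attempt the harder problem discussed above, measuring the uncertainty of the selected variables.

```python
import random


def soft_threshold(z, lam):
    """Shrink z toward zero by lam, setting small values exactly to zero."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 one coordinate at a time."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, kept up to date
    col_scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_scale[j] == 0.0:
                continue
            # Covariance of column j with the residual that ignores predictor j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_scale[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


if __name__ == "__main__":
    rng = random.Random(0)
    n, p = 20, 50  # fewer observations than predictors
    true_beta = [0.0] * p
    true_beta[3], true_beta[17] = 2.0, -3.0  # sparsity assumption: two real effects
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [sum(X[i][j] * true_beta[j] for j in range(p)) + rng.gauss(0.0, 0.1)
         for i in range(n)]
    fitted = lasso_coordinate_descent(X, y, lam=0.2)
    print("selected predictors:", [j for j, b in enumerate(fitted) if b != 0.0])
```

The penalty is what does the "whittling": most coefficients are set exactly to zero, which only makes sense if the true effect really is confined to a few predictors.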

What aspects of large sample theory capture your interest?
Most problems do not have exact solutions and as a result, approximations are needed. More data points can improve the behaviour of many statistical methods. Thinking about this improvement mathematically allows us to study what we call the limiting behaviour of a procedure – what would happen if you had infinite amounts of data. That limiting behaviour provides an approximation when you don’t have an infinite amount of data, and we use that approximation for the amount of data we do have. Of course, the quality of that approximation varies depending on the context and the statistical model being applied.

I make approximations to discuss the behaviour of something by computing a mathematical limit. I'm trying to understand which approximations are good and what is the minimum amount of data needed to make something a reasonable approximation.
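A small numerical sketch can show how a limiting result is used as an approximation and how its quality depends on the amount of data. The example below is an illustration chosen here, not one from the interview: it compares the exact distribution of a binomial count with its normal limit and reports the worst error for several sample sizes. The function names and the choice p = 0.3 are assumptions of the sketch.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binomial_pmf(k, n, p):
    """Exact binomial probability, computed on the log scale for stability."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def max_normal_error(n, p):
    """Largest gap, over k = 0..n, between P(X <= k) and its normal approximation."""
    mean = n * p
    sd = math.sqrt(n * p * (1.0 - p))
    cumulative = 0.0
    worst = 0.0
    for k in range(n + 1):
        cumulative += binomial_pmf(k, n, p)   # exact P(X <= k)
        approx = normal_cdf((k - mean) / sd)  # limiting (normal) approximation
        worst = max(worst, abs(cumulative - approx))
    return worst


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, max_normal_error(n, 0.3))
```

The worst-case error shrinks as n grows, but how fast it shrinks depends on the model (here, on p), which is one reason the minimum amount of data for a reasonable approximation has to be studied case by case.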

Is there a particular research experience that had a big impact on the way that you think?
When I was a student at the University of British Columbia (UBC), I had an opportunity to do a summer research project with James Zidek, from UBC, who was consulting with an engineering firm on a Lions Gate Bridge project. The government wanted answers to questions like “Can we add another lane to the bridge to accommodate heavy trucks and buses?”

The answer depends on the load on the bridge and the aspect I was involved in was the question of what the heaviest load on the bridge would be. I spent the summer thinking about assumptions you need to make to compute a conservative upper bound on this worst-case load value. You don't want to underestimate it or the bridge will fall down, and you don't want the upper bound to be too high either. You cannot include every detail in a model; I learned about the importance of disregarding the unimportant.

What specific educational backgrounds or personal traits do you look for in prospective trainees?
I have a fairly mathematical attitude toward problems, so students will find it easier to work with me if they are comfortable with concepts in real analysis in mathematics, and if they're genuinely interested in science.

My current students mostly have statistics backgrounds. I tend to do a lot of co-supervision, which exposes the student to specialists in statistics and the scientific area of application; not to mention, it opens the door to additional funding opportunities for my program.

You could have just as easily gone into physics or math. Is there a fundamental difference between you and physicists or mathematicians?
Mathematics is fundamentally extraordinarily precise, whereas physics has precision but it also has lots of approximations. I find that physicists focus on what they're trying to achieve using a complex process in which they need to cope with approximations; they have a relentless focus on practicality or at least realism and are not so interested in questions that involve defining the scope of a method or assessing how useful the method is in general.

At the other end of the spectrum, many pure mathematicians find most theorems of statistics unattractive because they lack the elegance that pure mathematicians look for.

I like elegance, and I like realism; for me, Statistics is a Department that lets me have enough of both.

______________________________________

Read more: Dr. Lockhart’s profile on the Department of Statistics & Actuarial Science website and the Featured Researchers page

Interview by Jacqueline Watson with Theresa Kitos