While evolution is not always tree-like, even over the relatively short time frames of observable pathogen evolution, branching trees remain a cornerstone of how we understand genetic variation through time and how we quantify evolution. Developing mathematical tools for comparing trees and exploring tree space is a fun and exciting area that links combinatorics, probability, stochastic processes, graph theory and biomathematics. We develop new metrics on trees with different flavours of labelling, we design and study tree features, and we explore tree space, with a view to ultimately improving inference of epidemiological and/or evolutionary processes from trees and to building interesting mathematics along the way.
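As a small illustration of what a metric on labelled trees looks like, the sketch below computes the classical Robinson-Foulds (symmetric difference) distance between two rooted trees on the same set of leaf labels. This is a standard textbook distance, not one of the new metrics mentioned above. The nested-tuple tree encoding and the function names `clades` and `rf_distance` are assumptions made purely for this example.

```python
def clades(tree):
    """Collect the leaf set under every internal node of a rooted tree.

    A tree is encoded (hypothetically, for this sketch) as nested tuples
    whose leaves are strings, e.g. (("A", "B"), "C").
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        # Union of the leaf sets of all children of this internal node.
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    leaves(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: clades present in exactly one tree."""
    return len(clades(tree_a) ^ clades(tree_b))


# Two toy trees on the leaves A, B, C, D that disagree on one cherry.
t1 = ((("A", "B"), "C"), "D")
t2 = ((("A", "C"), "B"), "D")
print(rf_distance(t1, t2))
```

The two toy trees share the clades {A, B, C} and {A, B, C, D} and differ only in whether A pairs with B or with C. Distances of this kind depend only on tree shape and leaf labels; other metrics can also take branch lengths or other kinds of labelling into account.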
In many systems, multiple distinct variants of a pathogen circulate in the same population. Now that sequencing is readily available, the resolution with which we can detect diversity is greatly increased. Understanding the dynamics, competition and maintenance of diversity among circulating pathogens is increasingly urgent: organisms continue to develop and spread antimicrobial resistance, and their populations also respond to vaccination and other interventions. We model competition, resistance and diversity in infections, with a focus on tuberculosis and Streptococcus pneumoniae, aiming to incorporate genomic data reflecting their diversity in our models.
New molecular sequencing technologies can offer an unprecedented view of biological diversity and evolution. In principle, this should give us the opportunity to understand population dynamics, ecology and even individual events like disease transmission in much more depth than is possible with conventional data. However, sequence data do not directly reveal population dynamics, ecology or individual-level events. Accordingly, there are exciting opportunities for mathematical research to play a key role in many applications.
We work on methods such as TransPhylo, which reconstruct outbreaks of an infectious disease with the aid of DNA or RNA sequence data. Sequence data contain information about person-to-person transmission events: if two individuals have similar pathogen genomes, it is possible that one infected the other. But the relationship between sequence similarity and who infected whom is complicated by in-host diversity, timing and the shared ancestry patterns in a set of sequences. We are building improved methods that integrate diverse sources of data to refine our predictions of who infected whom, and we are analysing data from tuberculosis outbreaks in many countries around the world, aiming to use genomic data to develop improved TB interventions.
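To make the idea of "similar pathogen genomes" concrete, here is a minimal sketch that counts pairwise nucleotide differences between aligned sequences and lists the pairs falling under a chosen cutoff. It is deliberately naive and is not how TransPhylo works: it ignores in-host diversity, timing and shared ancestry, which is exactly why a simple cutoff cannot settle who infected whom. The function names, the toy sequences and the threshold value are all invented for illustration.

```python
from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count sites where two aligned sequences differ.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x in valid and y in valid and x != y
    )


def candidate_links(seqs: dict[str, str], threshold: int) -> list[tuple[str, str, int]]:
    """Return (id_a, id_b, distance) for every pair within the threshold."""
    links = []
    for (id_a, seq_a), (id_b, seq_b) in combinations(seqs.items(), 2):
        d = snp_distance(seq_a, seq_b)
        if d <= threshold:
            links.append((id_a, id_b, d))
    return links


# Toy, made-up alignment; real genomes are millions of bases long.
toy = {
    "case1": "ACGTACGTAC",
    "case2": "ACGTACGTAT",
    "case3": "TCGAACGGAT",
}
print(candidate_links(toy, threshold=2))
```

A pair listed here is only a candidate: the output is unordered, so it says nothing about the direction of transmission, and two cases can be genetically close because both were infected by a third, unsampled person.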