2011 Distinguished Lecture Series

February 24, 2011

Laks V.S. Lakshmanan, University of British Columbia
Databases/social networks
Title: Musings on Next Generation Recommender Systems

Abstract: A recommender system (RS) tracks users' experience with items and uses the available information to recommend items it predicts are most likely to match each user's taste. Recommender systems have become very popular both in academic research and in industry.

On the research side, impressive strides have been made in improving the accuracy of predictions. On the business side, recommender systems are one of the core technologies powering the business of companies like Amazon and Netflix. It's a good time to ask the question "Is that it?" about this technology.

This question can be instantiated in a number of ways, and each instantiation exposes a limitation of current RS. For instance, over-emphasis on accuracy can make a RS predictable, turning off users. How do we make recommendations diverse? Explaining why an item is recommended can improve the chance that the user makes the right decision about adopting it. How do we endow RS with an explanation facility? We could also broaden the scope of what we call "items". For instance, in online exchange markets, users trade objects they no longer need with other users who possess items they wish to have. Can RS recommend promising trade partners to users? In applications like trip planning, users look for recommendations of packages, i.e., collections of points of interest (POIs). In addition to containing POIs predicted to appeal to the user, the standard metric of good recommendations, these packages must satisfy user constraints on time and money. Can we push the boundaries of RS so they are able to deal with collections and constraints? An overarching concern, both in classic RS and in the extensions we seek, is scalability. Often, users only care about the top-k recommendations. Can we design algorithms and systems that are "optimal" with respect to the work they do to generate top-k recommendations?
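
To make the package-recommendation question concrete, here is a minimal sketch, not taken from the talk, of one naive approach: greedily assembling a package of points of interest under a cost budget from predicted ratings. The POI names, ratings, and costs below are invented for illustration, and a real system would use a proper recommendation model and a stronger optimization method.

```python
# Hypothetical sketch: greedy package recommendation under a budget constraint.
# Each POI carries a predicted rating (how much the user is expected to like it)
# and a cost (time or money); we pick POIs by rating-per-cost until the budget
# runs out. This is only a toy heuristic for the constrained setting described above.

def recommend_package(pois, budget):
    """pois: list of (name, predicted_rating, cost); budget: total cost allowed."""
    package, remaining = [], budget
    # Consider candidates in order of predicted rating per unit of cost, best first.
    for name, rating, cost in sorted(pois, key=lambda p: p[1] / p[2], reverse=True):
        if cost <= remaining:
            package.append(name)
            remaining -= cost
    return package

if __name__ == "__main__":
    candidate_pois = [                 # (POI, predicted rating, cost in hours) -- made-up data
        ("Aquarium", 4.5, 3.0),
        ("Art gallery", 4.0, 2.0),
        ("Harbour tour", 3.5, 4.0),
        ("Lookout tower", 3.0, 1.0),
    ]
    print(recommend_package(candidate_pois, budget=6.0))
```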

In this talk, I will describe work being conducted in our group that aims to close the gap between classic RS and the applications anticipated by these questions. I will conclude with promising directions for further research.  This is joint work with a long list of colleagues and grad students.


March 17, 2011

Margo Seltzer, Harvard
Systems (databases/file systems)
Title: Provenance Everywhere

Abstract:  Digital provenance describes the ancestry or history of a digital object. Computer science research in provenance has addressed issues in provenance capture in operating systems, command shells, languages, workflow systems and applications. However, it's time to begin thinking seriously about provenance interoperability, what it means, and how we can achieve it.

We have undertaken several projects that integrate provenance across multiple platforms. Doing so introduces many challenging research opportunities.

In this talk, Margo Seltzer will present Provenance-Aware Storage Systems, focusing on her experiences integrating provenance across different layers of abstraction. She will present some of her use cases and discuss important issues for further research.
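
To make the notion of provenance concrete, the toy sketch below (an editorial illustration, not the speaker's system) records, for each digital object, the process and ancestor objects it was derived from, giving the kind of ancestry graph that a provenance-aware storage system maintains automatically and across layers of abstraction.

```python
# Toy sketch of digital provenance: each object records which process produced it
# and which objects it was derived from, forming a directed acyclic ancestry graph.
# Real provenance-aware storage captures these records automatically at the system level.

class ProvRecord:
    def __init__(self, name, process=None, inputs=()):
        self.name = name            # identifier of the digital object
        self.process = process      # how it was produced (e.g., a program or workflow step)
        self.inputs = list(inputs)  # ancestor objects it was derived from

    def ancestry(self, depth=0):
        """Yield (depth, record) pairs for this object and all of its ancestors."""
        yield depth, self
        for parent in self.inputs:
            yield from parent.ancestry(depth + 1)

if __name__ == "__main__":
    raw = ProvRecord("sensor_readings.csv")
    cleaned = ProvRecord("cleaned.csv", process="clean.py", inputs=[raw])
    figure = ProvRecord("figure1.png", process="plot.py", inputs=[cleaned])
    for depth, rec in figure.ancestry():
        print("  " * depth + f"{rec.name} <- {rec.process or 'primary source'}")
```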


April 7, 2011

Helmut Pottmann, King Abdullah University of Science and Technology
Graphics (geometric modeling)
Title: Geometric Computing for Freeform Architecture

Abstract: Freeform surfaces play an increasingly important role in contemporary architecture. While digital models are easily created, the actual fabrication and construction of architectural freeform structures remains a challenge. To make a freeform design realizable, an optimization process known as rationalization has to be applied: the design is decomposed into smaller parts while meeting two competing objectives, feasibility and consistency with the designer's intentions. Most often, rationalization replaces smooth surfaces (possibly carrying an additional curve network) with other structures such as meshes with special properties. The guiding thought throughout is the efficient manufacturing of the surface parts and of the supporting and connecting elements they require; both simple geometry and repetition of elements contribute to this goal. We report on recent advances in this new direction of Geometric Computing and discuss ongoing research on fabrication-aware design through shape space exploration, which unifies shape design and rationalization. While motivated by architecture, fabrication-aware design, or design exploration, is expected to be of great interest in other applications as well.
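
One common rationalization target in this area is a mesh whose faces are planar quadrilaterals, so that each face can be fabricated as a flat panel. The sketch below is a simple illustration under that assumption, not code from the talk: it measures how far a quad face deviates from planarity, the kind of quantity an optimization-based rationalization would drive toward zero.

```python
# Sketch: planarity of a quad face, a typical constraint when a smooth freeform
# surface is rationalized into a panelled quad mesh. A quad is planar when all
# four vertices lie in one plane; we report the distance of the fourth vertex
# from the plane spanned by the first three.

def plane_deviation(quad):
    """quad: four 3D points as (x, y, z) tuples; returns distance of p3 from plane(p0, p1, p2)."""
    p0, p1, p2, p3 = quad
    u = [p1[i] - p0[i] for i in range(3)]
    v = [p2[i] - p0[i] for i in range(3)]
    # Face normal as the cross product u x v.
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    norm = sum(c * c for c in n) ** 0.5
    if norm == 0.0:
        return 0.0  # degenerate face: first three points are collinear
    w = [p3[i] - p0[i] for i in range(3)]
    return abs(sum(n[i] * w[i] for i in range(3))) / norm

if __name__ == "__main__":
    flat = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
    warped = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0.2)]
    print(plane_deviation(flat), plane_deviation(warped))   # 0.0 vs. 0.2
```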


June 2, 2011

Mark Moir, Oracle
Theory/systems (transactional memory)
Title: Hardware Transactional Memory: Potential and Progress

Abstract:  There is a growing body of practical evidence that hardware transactional memory (HTM) has the potential to significantly ease the development of concurrent algorithms that are scalable, efficient and correct.  However, this potential depends in many cases on details of the HTM's design.  This talk will introduce HTM, explore several examples illustrating its potential, discuss requirements for achieving that potential, and provide an update on (publicly known) potential directions for future HTM support.


August 22, 2011

Serafim Batzoglou, Stanford University
Title: Algorithms for Population Genomics

Abstract: Population genomics, the study of (large) groups of genomes, is a scientific discipline truly enabled by the increasing availability of inexpensive whole-genome sequencing. It encompasses questions that directly relate to our health, population background, family history, and many other key aspects of our lives, so its study will be central to applying genomics technologies for the benefit of society. Algorithmic and statistical advances will be key to enabling this goal, because by its nature population genomics involves handling enormous datasets with complex statistical properties. In this talk, I will cover our algorithmic work on three different population genomics topics: relationship inference, population ancestry inference, and functional SNP identification. First, I will describe a new machine-learning-based method that we developed for inferring the genealogical relationship between two (or more) individuals. CARROT, a system we built on this method, can identify relationships up to the fourth degree. Second, I will highlight some of our recent results on ancestry inference, namely the construction of an accurate and efficient ancestry inference framework and its application to genotype data sampled from human populations worldwide. Finally, I will turn to genome-wide association studies (GWAS); such studies have been successful in identifying SNPs associated with disease, but the associated SNPs are usually part of a larger region of linkage disequilibrium (LD), making it hard to identify the SNPs that have a biological link to the disease. I will describe our methodology for using ENCODE data to narrow down the list of SNPs associated with a disease to a significantly shorter list of "functional SNPs" that are more likely to have a functional effect.
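
As a back-of-the-envelope illustration of the first topic, and emphatically not the CARROT method itself, the sketch below computes a crude identity-by-state similarity between two genotype vectors; actual relationship inference models identity-by-descent segments and allele frequencies. All genotype data here are made up.

```python
# Toy illustration of relatedness from genotype data (not CARROT).
# Genotypes are coded 0/1/2 = copies of the alternate allele at each site.
# We compute a crude identity-by-state (IBS) similarity; related individuals
# tend to agree at more sites than unrelated ones.

def ibs_similarity(g1, g2):
    """Fraction of sites at which the two genotype vectors agree, in [0, 1]."""
    assert len(g1) == len(g2), "genotype vectors must cover the same sites"
    matches = sum(1 for a, b in zip(g1, g2) if a == b)
    return matches / len(g1)

if __name__ == "__main__":
    # Made-up genotype vectors over 12 sites.
    alice    = [0, 1, 2, 1, 0, 0, 1, 2, 1, 0, 1, 2]
    sibling  = [0, 1, 2, 1, 1, 0, 1, 2, 0, 0, 1, 2]
    stranger = [2, 0, 0, 2, 1, 1, 0, 0, 1, 2, 0, 1]
    print("alice vs sibling :", ibs_similarity(alice, sibling))
    print("alice vs stranger:", ibs_similarity(alice, stranger))
```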


September 15, 2011

Christos Faloutsos, Carnegie Mellon University
Databases
Title: Mining Billion-node Graphs: Patterns, Generators and Tools

Biography: Christos Faloutsos is a Professor at Carnegie Mellon University. He has received the Presidential Young Investigator Award from the National Science Foundation (1989), the Research Contributions Award at ICDM 2006, the SIGKDD Innovations Award (2010), seven "best paper" awards (including two "test of time" awards), and four teaching awards. He has served as a member of the executive committee of SIGKDD; he has published over 200 refereed articles, 11 book chapters, and one monograph. He holds six patents and has given over 30 tutorials and over 10 invited distinguished lectures. He is an ACM Fellow. His research interests include data mining for graphs and streams, fractals, database performance, and indexing for multimedia and bioinformatics data.

Abstract: What do graphs look like? How do they evolve over time? How do we handle a graph with a billion nodes? We present a comprehensive list of static and temporal laws, along with some recent observations on real graphs (e.g., "eigenSpokes"). For generators, we describe some recent ones which naturally match all of the known properties of real graphs. Finally, for tools, we present "oddBall" for discovering anomalies and patterns, as well as an overview of the PEGASUS system, which is designed for handling billion-node graphs and runs on top of Hadoop.
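
For flavour, one of the simplest static patterns, the degree distribution, can be tabulated from an edge list in a few lines. The sketch below is an in-memory illustration with an invented toy graph; PEGASUS runs analogous computations as Hadoop jobs so that they scale to billion-node graphs.

```python
# Minimal sketch: degree distribution of an undirected graph from an edge list.
# Heavy-tailed (power-law-like) degree distributions are one of the classic
# "static laws" observed on real graphs.

from collections import Counter

def degree_distribution(edges):
    """edges: iterable of (u, v) pairs. Returns {degree: number of nodes with that degree}."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())

if __name__ == "__main__":
    toy_edges = [(1, 2), (1, 3), (1, 4), (2, 3), (4, 5)]   # made-up graph
    for d, count in sorted(degree_distribution(toy_edges).items()):
        print(f"{count} node(s) with degree {d}")
```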


October 13, 2011

NOTE: DIFFERENT LOCATION AND TIME!

Location:  IRMACS Centre, SFU Burnaby, Applied Science Building, Level 3, Room 10905
Time: 4:45 p.m.

Leslie Valiant, Harvard
Theory
Title: Biological Evolution as a Form of Learning

Abstract: Living organisms function according to protein circuits. Darwin's theory of evolution suggests that these circuits have evolved through variation guided by natural selection. However, the question of which circuits can so evolve in realistic population sizes and within realistic numbers of generations has remained essentially unaddressed.

We suggest that computational learning theory offers the framework for investigating this question, of how circuits can come into being adaptively from experience, without a designer. We formulate evolution as a form of learning from examples. The targets of the learning process are the functions of highest fitness. The examples are the experiences. The learning process is constrained so that the feedback from the experiences is Darwinian. We formulate a notion of evolvability that distinguishes function classes that are evolvable with polynomially bounded resources from those that are not. The dilemma is that if the function class, say for the expression levels of proteins in terms of each other, is too restrictive, then it will not support biology, while if it is too expressive then no evolution algorithm will exist to navigate it. We shall review current work in this area.
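
To give a flavour of the evolution-as-learning framing, the toy sketch below is a caricature for intuition, not Valiant's formal model: it evolves a conjunction over boolean variables, where each generation proposes a random mutation and keeps it only when fitness, measured as agreement with a hidden target on freshly sampled experiences, does not decrease.

```python
# Toy caricature of evolution as learning (not Valiant's formal model).
# The hidden target is a conjunction of a few boolean variables. A candidate
# "circuit" is also a conjunction; each generation toggles one variable and the
# mutation survives only if fitness -- agreement with the target on random
# examples (the Darwinian feedback) -- does not decrease.

import random

N_VARS = 10
TARGET = {0, 3, 7}                      # hidden target: x0 AND x3 AND x7

def evaluate(conj, example):
    return all(example[i] for i in conj)

def fitness(conj, examples):
    """Fraction of sampled experiences on which the candidate agrees with the target."""
    return sum(evaluate(conj, e) == evaluate(TARGET, e) for e in examples) / len(examples)

def evolve(generations=2000, sample_size=200, seed=0):
    rng = random.Random(seed)
    current = set(range(N_VARS))        # start from the conjunction of all variables
    for _ in range(generations):
        examples = [[rng.random() < 0.5 for _ in range(N_VARS)] for _ in range(sample_size)]
        candidate = set(current)
        candidate.symmetric_difference_update({rng.randrange(N_VARS)})  # mutate: toggle one variable
        if fitness(candidate, examples) >= fitness(current, examples):
            current = candidate         # selection keeps beneficial and neutral mutations
    return current

if __name__ == "__main__":
    # After enough generations the surviving conjunction typically ends up at or near TARGET.
    print("evolved conjunction over variables:", sorted(evolve()))
```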
