March 15:  Jamie Shotton, Microsoft Research Cambridge  

2012 Distinguished Lecture Series

March 10, 2012

Meet world-renowned researchers at lectures hosted by SFU Computing Science. These lectures are open to students, researchers and those working in industry and education who want to hear about the latest leading-edge research. Admission is free. For more information, call 778-782-8923.

March 15, 2012
Jamie Shotton, Microsoft Research Cambridge 
1:30-2:30 p.m.
IRMACS Theatre, Room 10900 (Applied Sciences Building)
SFU Burnaby Campus
8888 University Drive
Burnaby, B.C. 

Title: Body Part Recognition: Making Kinect Robust


In late 2010, Microsoft launched Xbox Kinect, a revolution in gaming where your whole body becomes the controller: you need not hold any device or wear anything special. Human pose estimation has long been a grand challenge of computer vision, and Kinect is the first product to meet the speed, cost, accuracy, and robustness requirements to take pose estimation out of the lab and into the living room.

In this talk we will discuss some of the challenges of pose estimation and the technology behind Kinect, detailing a new algorithm, body part recognition, which drives Kinect’s skeletal tracking pipeline. Building on earlier work that uses machine learning to recognize categories of objects in photographs, body part recognition uses a classifier to assign each pixel from the Kinect depth-sensing camera to a part of the body: head, left hand, right knee, etc. Estimating this pixel-wise classification is extremely efficient, as each pixel can be processed in parallel on the GPU. The classifications can then be pooled across pixels to produce hypotheses of 3D body joint positions for use by a skeletal tracking algorithm. Our method is designed to be robust in two ways. Firstly, we train the system with a vast and highly varied training set of synthetic images to ensure the system works for all ages, body shapes and sizes, clothing and hairstyles. Secondly, the recognition does not rely on any temporal information; this ensures that the system can initialize from arbitrary poses and prevents catastrophic loss of track, enabling extended gameplay for the first time.
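
The pooling step described in the abstract can be illustrated with a small sketch. It assumes the per-pixel classifier has already produced a probability for each body part at each pixel, and that each pixel has been back-projected to a 3D point. The abstract does not say how the pooling is actually done, so the probability-weighted centroid used here is only a simple stand-in, and the function name `pool_joint_hypotheses` and its parameters are hypothetical.

```python
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]


def pool_joint_hypotheses(
    points: List[Point3D],
    part_probs: List[Dict[str, float]],
    min_weight: float = 1e-6,
) -> Dict[str, Tuple[Point3D, float]]:
    """Pool per-pixel body part probabilities into one 3D hypothesis per part.

    points      -- one 3D point per foreground pixel (assumed already
                   back-projected from the depth image)
    part_probs  -- for each pixel, a dict mapping part label -> probability
                   (assumed output of some per-pixel classifier)
    Returns a dict mapping part label -> (weighted centroid, total weight).
    The weighted centroid is an illustrative assumption, not the method
    used in Kinect.
    """
    # Accumulate probability-weighted coordinate sums per body part.
    sums: Dict[str, List[float]] = {}
    for (x, y, z), probs in zip(points, part_probs):
        for label, p in probs.items():
            acc = sums.setdefault(label, [0.0, 0.0, 0.0, 0.0])
            acc[0] += p * x
            acc[1] += p * y
            acc[2] += p * z
            acc[3] += p

    # Normalize to a centroid; the total weight serves as a rough confidence.
    hypotheses: Dict[str, Tuple[Point3D, float]] = {}
    for label, (sx, sy, sz, w) in sums.items():
        if w > min_weight:
            hypotheses[label] = ((sx / w, sy / w, sz / w), w)
    return hypotheses


# Toy input: two pixels labelled "head", one labelled "left_hand".
example_points = [(0.0, 1.0, 2.0), (0.0, 1.5, 2.0), (0.5, 0.0, 2.0)]
example_probs = [{"head": 1.0}, {"head": 1.0}, {"left_hand": 1.0}]
example_hypotheses = pool_joint_hypotheses(example_points, example_probs)
```

Because each pixel contributes independently to the sums, the accumulation does not depend on previous frames, which matches the abstract's point that recognition uses no temporal information.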


Jamie Shotton studied Computer Science at the University of Cambridge, and remained at Cambridge for his PhD in Computer Vision and Visual Object Recognition, graduating in 2007.  He was awarded the Toshiba Fellowship and travelled to Japan to continue his research at the Toshiba Corporate Research & Development Center in Kawasaki.  In 2008 he returned to the UK and started work at Microsoft Research Cambridge where he is a Researcher in the Machine Learning & Perception group.

His research interests include Human Pose Estimation, Object Recognition, Machine Learning, Gesture and Action Recognition, and Medical Imaging. He has published papers in all the major computer vision conferences and journals, with a focus on object detection by modelling contours, semantic scene segmentation exploiting both appearance and semantic context, and dense object part layout constraints. His demo on real-time semantic scene segmentation won the best demo award at CVPR 2008. More recently, he has investigated how many of the ideas from visual object recognition and machine learning can be applied in new ways. In human pose estimation, he architected the human body part recognition algorithm used by Microsoft Kinect’s skeletal tracking algorithm. This recognition algorithm was awarded the best paper prize at CVPR 2011 and the 2011 Royal Academy of Engineering’s MacRobert Award gold medal, and was honored as part of the 2012 Outstanding Technical Achievement award from Microsoft’s Technical Community Network. In the sphere of medical imaging, he has published papers on the automatic recognition of organs and other anatomical structures from CT data, with a view to simplifying and speeding up the radiologist’s workflow.

More information about Jamie Shotton's research: