Yueh-Min Huang
IEEE Circuits and Systems Society joint Chapter of the Vancouver/Victoria Sections

Speaker: Dr. Gene Cheung
National Institute of Informatics, Japan

Title: Sparse Representation of Depth Maps for Efficient Transform Coding

(Presentation is available in PDF format.)
Monday, February 14, 2011, 11:00 am to 12:00 noon
ASB 9705, Simon Fraser University, Burnaby, BC


Compression of depth maps is important for the "image plus depth" representation of multiview images, which enables synthesis of novel intermediate views via depth-image-based rendering (DIBR) at the decoder. Since a depth map is only a means to the end of view synthesis and is not itself viewed, we propose to explicitly manipulate depth values, without causing severe synthesized view distortion, in order to maximize representation sparsity in the transform domain for compression gain.

More specifically, we present two methods to find sparse representations in the compressed domain. In the first method, we define a "don't care region" (DCR) for each depth pixel, where a depth value outside the DCR will lead to a synthesized distortion larger than a threshold value T during DIBR. We then find a sparse representation in the compressed domain by iteratively solving a weighted l_1 minimization via linear programming, so that the sought depth signal remains inside the defined per-pixel DCRs.
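The first method can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the speaker's implementation: it works on a 1-D signal rather than a 2-D code block, uses an orthonormal DCT as the transform, takes the per-pixel DCRs as given intervals [lo, hi] (how they are derived from the threshold T is not shown), and uses the common reweighting rule w_i = 1 / (|alpha_i| + eps). The names dct_matrix, sparsify_in_dcr, n_iter and eps are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th basis vector."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    phi = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    phi[0, :] /= np.sqrt(2.0)
    return phi


def sparsify_in_dcr(lo, hi, n_iter=4, eps=1e-3):
    """Iteratively reweighted l_1 minimization, each pass solved as an LP.

    Assumption: lo[i] <= s[i] <= hi[i] is the don't care region of pixel i.
    LP variables are x = [s, t], where t[i] bounds |(phi @ s)[i]| from above.
    """
    lo = np.asarray(lo, dtype=float)
    hi = np.asarray(hi, dtype=float)
    n = lo.size
    phi = dct_matrix(n)
    eye = np.eye(n)
    # Encode -t <= phi @ s <= t as two blocks of "<= 0" inequalities.
    a_ub = np.block([[phi, -eye], [-phi, -eye]])
    b_ub = np.zeros(2 * n)
    bounds = list(zip(lo, hi)) + [(0, None)] * n
    w = np.ones(n)  # first pass is plain (unweighted) l_1
    for _ in range(n_iter):
        c = np.concatenate([np.zeros(n), w])  # minimize sum_i w[i] * t[i]
        res = linprog(c, A_ub=a_ub, b_ub=b_ub, bounds=bounds, method="highs")
        if not res.success:
            raise RuntimeError(res.message)
        s = res.x[:n]
        # Small coefficients get large weights and are pushed toward zero.
        w = 1.0 / (np.abs(phi @ s) + eps)
    # Guard against solver round-off at the DCR boundaries.
    return np.clip(s, lo, hi)
```

For example, with made-up depth values and a DCR of plus or minus 3 around each one, sparsify_in_dcr(depth - 3, depth + 3) returns a signal that stays inside every interval while its DCT coefficients are driven toward zero.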

In the second method, for each pixel in the depth map, we first define a quadratic penalty function, with its minimum at the ground truth depth value and the parabola's sharpness determined by the sensitivity of the synthesized view distortion to the pixel's depth value during DIBR. We then define an objective for a depth signal in a code block as a weighted sum of: i) the signal's sparsity in the compressed domain, and ii) the per-pixel synthesized view distortion penalties for the chosen signal. We subsequently replace the difficult-to-solve l_0-norm (sparsity) in the objective with a computationally inexpensive weighted l_2-norm. For the weighted l_2-norm to promote sparsity, we solve the optimization iteratively, where at each iteration the weights are readjusted to mimic the sparsity-promoting l_{tau}-norm, 0 < tau <= 1. Using JPEG as an example transform codec, we show that our methods exhibit significant compression gain for the interpolated view over compression of unaltered depth maps.
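The second method can be illustrated in the same spirit. The sketch below is a generic iteratively reweighted least-squares loop, not the speaker's implementation: it again assumes a 1-D signal and an orthonormal DCT, takes the per-pixel parabola sharpness values as given positive numbers, and readjusts the weights as w_i = (alpha_i^2 + eps)^(tau/2 - 1), so that w_i * alpha_i^2 approximates |alpha_i|^tau. The names irls_sparsify, sharpness, lam and eps are hypothetical.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th basis vector."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    phi = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    phi[0, :] /= np.sqrt(2.0)
    return phi


def irls_sparsify(depth, sharpness, lam=1.0, tau=1.0, n_iter=10, eps=1e-6):
    """Iteratively reweighted l_2 minimization (illustrative sketch).

    Each pass minimizes, in closed form,
        lam * sum_i w[i] * alpha[i]**2 + sum_i sharpness[i] * (s[i] - depth[i])**2
    with alpha = phi @ s, then readjusts w to mimic the l_tau norm.
    Assumption: all sharpness values are strictly positive.
    """
    g = np.asarray(depth, dtype=float)
    a = np.asarray(sharpness, dtype=float)
    phi = dct_matrix(g.size)
    s = g.copy()  # start from the ground truth depth values
    for _ in range(n_iter):
        alpha = phi @ s
        # w[i] * alpha[i]**2 is approximately |alpha[i]|**tau
        w = (alpha ** 2 + eps) ** (tau / 2.0 - 1.0)
        # Normal equations of the weighted quadratic objective.
        lhs = lam * phi.T @ (w[:, None] * phi) + np.diag(a)
        s = np.linalg.solve(lhs, a * g)
    return s
```

A small sharpness value lets a pixel drift far from its ground truth depth in exchange for sparsity, while a large value pins it in place; lam sets the overall trade-off between the two terms.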


Gene Cheung received the B.S. degree in Electrical Engineering from Cornell University in 1995, and the M.S. and Ph.D. degrees in Electrical Engineering and Computer Science from the University of California, Berkeley, in 1998 and 2000, respectively. He was a senior researcher at Hewlett-Packard Laboratories Japan, Tokyo, from 2000 until 2009. He is currently an assistant professor at the National Institute of Informatics in Tokyo, Japan.

His research interests include media representation & network transport, single-/multiple-view video coding & streaming, and immersive communication & interaction. He has published over 15 international journal papers and 50 conference papers. He has served as associate editor of IEEE Transactions on Multimedia since 2007, served as area chair for the IEEE International Conference on Image Processing (ICIP) 2010, and serves as technical program co-chair of the International Packet Video Workshop (PV) 2010. He serves as track co-chair for the Multimedia Signal Processing track of the IEEE International Conference on Multimedia and Expo (ICME) 2011. He is a co-recipient of the Top 10% Paper Award at the IEEE International Workshop on Multimedia Signal Processing (MMSP) 2009.

Last updated Tuesday, February 22, 2011 5:56:15 PM PST.