Ivan V. Bajić

School of Engineering Science
Simon Fraser University
8888 University Drive
Burnaby, BC, V5A 1S6, Canada
T: +1-778-782-7159
F: +1-778-782-4951
E: ibajic@ensc.sfu.ca
@IvanBajic


I am a Professor of Engineering Science at Simon Fraser University. My professional interests revolve around signal processing, machine learning, and their applications in image and video processing, coding, communications, and collaborative intelligence. My group's work has received several international recognitions, including the 2023 IEEE TCSVT Best Paper Award; conference paper awards at ISCAS 2023, MMSP 2022, ICIP 2019, and ICME 2012; and other recognitions (e.g., paper award finalist, top n%) at Asilomar, ICIP, ICME, ISBI, and CVPR. This is in large part due to my outstanding students, among whom are recipients of the Vanier Scholarship, multiple Governor General's Gold Medalists, NSERC Doctoral and Master's Scholars, and winners of competitive conference travel grants.

I have served on the organizing and program committees of the main conferences in my field, and have received several awards in these roles, including the Outstanding Reviewer Award (six times), the Outstanding Area Chair Award, and the Outstanding Service Award. I was the Chair of the Vancouver Chapter of the IEEE Signal Processing Society from 2013 to 2019, during which time the Chapter received the Chapter of the Year Award from the IEEE SPS. I was the Chair of the IEEE Multimedia Signal Processing Technical Committee during 2022-2023, and I am currently serving as a Senior Area Editor of IEEE Signal Processing Letters. I have previously served on the editorial boards of several journals, including IEEE Signal Processing Magazine, IEEE Transactions on Multimedia, and Signal Processing: Image Communication.

I was born in Belgrade, Serbia, in 1976. I received the B.Sc.Eng. degree (summa cum laude) in Electronic Engineering from the University of Natal, South Africa, in 1998, and the M.S. degree in Electrical Engineering, the M.S. degree in Mathematics, and the Ph.D. degree in Electrical Engineering from Rensselaer Polytechnic Institute, Troy, NY, USA, in 2000, 2002, and 2003, respectively.

Outside of professional life, I am a wine enthusiast. Over the years, I have been fortunate to visit some of the world's top wine-producing regions, including the Napa and Sonoma valleys in California, the Cape Town/Stellenbosch region in South Africa, La Rioja in Spain, the Douro valley in Portugal, Chianti in Italy, and, closer to home, the Okanagan and Cowichan valleys in British Columbia, as well as the Olympic Peninsula wineries in Washington State.


At SFU (2005 - present):

At UM (2003-2005):


My research interests are in the field of signal processing and its many applications. Signal processing is the science behind our digital life. In our work, my students, collaborators, and I use tools from signal processing, machine/deep learning, probability, statistics, and optimization to analyze and solve problems involving all kinds of signals: images, video, audio, power, and others. Most of this work takes place in the Multimedia Lab (ASB 8803.1) and the Computational Sustainability Lab. Some of my ongoing projects are listed below.

Collaborative intelligence

Collaborative intelligence is a way to distribute computation of an Artificial Intelligence (AI) model across multiple devices. It has been shown to be an efficient way to deploy AI on mobile devices and is one of the promising avenues to bring AI "to the edge." We want to understand the various trade-offs in designing collaborative intelligence systems, and to develop efficient tools for their deployment, including model splitting, feature compression, feature transmission, error control, feature error concealment, and others.
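The split-computation idea can be sketched in a few lines. The two-layer model, the quantizer step size, and all dimensions below are hypothetical, chosen only to illustrate the edge/cloud split and the feature compression step, not any actual deployment:

```python
import numpy as np

# Hypothetical two-layer network split between an edge device and a server.
rng = np.random.default_rng(0)
W_edge = rng.standard_normal((16, 8))   # runs on the mobile device
W_cloud = rng.standard_normal((8, 4))   # runs on the server

def edge_forward(x):
    """Device-side half of the model: produce an intermediate feature."""
    return np.maximum(x @ W_edge, 0.0)  # ReLU activation

def compress(feature, step=0.5):
    """Coarsely quantize the feature before transmission (fewer bits)."""
    return np.round(feature / step).astype(np.int32)

def decompress(q, step=0.5):
    return q.astype(np.float64) * step

def cloud_forward(feature):
    """Server-side half of the model: finish the computation."""
    return feature @ W_cloud

x = rng.standard_normal((1, 16))
f = edge_forward(x)                     # computed at the edge
f_hat = decompress(compress(f))         # sent over the channel, quantized
y = cloud_forward(f_hat)                # computed in the cloud
err = np.abs(f - f_hat).max()           # quantization error, at most step/2
```

The split point and the quantizer are exactly the design knobs mentioned above: moving the split changes the edge/cloud compute balance, and the quantizer trades transmitted bits against feature fidelity.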

Coding for machines

Much of the sensory information captured today is intended for automated machine-based analysis rather than human use. This calls for rethinking traditional compression and pre-/post-processing methods to facilitate efficient machine-based analysis. We are interested in understanding the fundamental limits of, as well as creating practical solutions for, signal compression targeted at machine use. Our group is among the pioneers in this field, having been the first to demonstrate significant gains in coding for object detection (ICIP 2018) and point cloud classification (MMSP 2023), and to derive the first rate-distortion results in this area (arXiv 2021, TIP 2022).
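As a toy illustration of the rate-distortion trade-off involved (not our actual codecs), one can quantize a stand-in feature tensor at several step sizes and use empirical entropy as a crude proxy for coded rate:

```python
import numpy as np

rng = np.random.default_rng(42)
feature = rng.standard_normal(10_000)   # stand-in for a deep feature tensor

def entropy_bits(symbols):
    """Empirical entropy in bits/symbol: a simple proxy for coded rate."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

rates, dists = [], []
for step in (0.1, 0.5, 2.0):
    q = np.round(feature / step)        # uniform quantization
    rec = q * step                      # reconstruction
    rates.append(entropy_bits(q.astype(np.int64)))
    dists.append(float(np.mean((feature - rec) ** 2)))
# A larger step yields a lower rate but higher distortion.
```

For machine use, the distortion that actually matters is not this mean squared error but the degradation of the downstream task (detection, classification), which is what makes the rate-distortion analysis for machines different from the classical, human-oriented one.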

Compressed vision

In the cult movie The Matrix, the character Cypher is able to spot objects and people in endless streams of coded data. "I don't even see the code, all I see is blonde, brunette, redhead," he says. Is this really possible? For humans, unlikely. But we have succeeded in training machines to do something like that. Being able to analyze compressed streams without decoding them is one of the key technologies needed to handle massive amounts of video in the era of Big Data.

Point clouds

Point clouds are sets of points in 3D space that describe the surface or shape of an object, and may carry additional attributes such as color. They have long been used in the computer graphics and animation industries, then in 3D printing, and are now entering the mainstream through virtual/augmented reality and immersive media. Their irregular sampling makes them more challenging to process than regularly sampled signals such as images.
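A minimal sketch of what this irregularity means in practice; the point count and attributes below are made up, and the neighbor search is done by brute force:

```python
import numpy as np

# A point cloud: N points with xyz positions and per-point RGB color.
rng = np.random.default_rng(1)
xyz = rng.uniform(-1.0, 1.0, size=(500, 3))   # irregular 3D samples
rgb = rng.integers(0, 256, size=(500, 3))     # color attributes

# Unlike images, there is no regular grid: even a basic operation such as
# "find the neighbors of a point" requires an explicit search.
query = xyz[0]
d2 = np.sum((xyz - query) ** 2, axis=1)       # squared distances to query
nearest = np.argsort(d2)[1:6]                 # 5 nearest neighbors (skip self)
```

In real systems the brute-force search is replaced by spatial structures such as octrees or k-d trees, but the need for such structures is itself a consequence of the missing grid.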

Computational sustainability

Computational sustainability applies computational tools, models, and algorithms to sustainability problems, balancing the needs of the environment, the economy, and society. We are particularly interested in addressing climate-change and energy challenges through conservation and habit change. To this end, we work with both academic and industry partners to develop and commercialize our discoveries and solutions.

Current group members:

Research Associates:


Former group members:

In industry:

In academia:

Former visitors:

Students' success:

Selected talks

Selected preprints

Selected publications


























The software below is provided as is, without any warranty, expressed or implied. It is free for academic and non-commercial use. If you use the software in your research, please cite the corresponding references.

Adaptive hole filling for 3D point clouds - Implementation of exemplar-based hole filling from the SPL 2018 paper. Download.
Exemplar-based hole filling for 3D point clouds - Implementation of exemplar-based hole filling from the VCIP 2017 paper. Download.
Color Gaussian jet features for image quality assessment - Implementation of Color Gaussian jet features for quality assessment of multiply-distorted images. Download.
No-reference image quality assessment - A tool to evaluate image quality without a reference image. Download. (For the password, please send an e-mail with your name and affiliation to hadi.sfu@gmail.com.)
How many bits does it take for a stimulus to be salient? - Implementation of saliency estimation in video based on Operational Block Description Length (OBDL). Download.
Attention retargeting by color manipulation - Implementation of an attention retargeting method based on color manipulation. Download.
Subliminal flicker - Experiments with subliminal flicker to guide attention in natural images. Download.
Compressed-domain correlates of fixations in video - Implementation of visual saliency estimation methods for compressed video from the following two papers.
Download PIVP code.
Download MTAP code.
Saliency-aware video compression - Implementation of saliency-aware video compression from the following paper. Download.
Motion visualization in compressed video - Matlab code to reproduce the results from the following paper. Download.
NDLT-based compressed-domain GME - Matlab code for compressed-domain Global Motion Estimation (GME) based on the Normalized Direct Linear Transform (NDLT) algorithm. Download.
Compressed-domain tracking - Matlab code to reproduce the results from the following paper. Download.
Joint global motion estimation and motion segmentation - Matlab code to reproduce the results from the following paper. Download.
Outlier removal for global motion estimation - Matlab code for removing motion vector (MV) outliers from the MV field prior to global motion estimation. Download.
NAL-SIM - An interactive simulator of H.264/AVC video coding and transmission. Allows the user to encode a raw YUV video into an H.264/AVC bitstream using a variety of options, analyze the bitstream structure (NAL units), simulate the loss of NAL units, and see the effects of loss on the decoded video quality. Download.
mcl.jit - A library of external objects for video coding, processing, and communication in Max/MSP/Jitter developed under the New Media Initiative grant funded by NSERC and CCA. A separate web page is maintained for it. Web.

Region-based predictive decoding of video - A Windows executable implementing Xvid MPEG-4 video encoding, and Region-Based Predictive Decoding (RBPD) of the resulting MPEG-4 video bitstreams. Download.
Error concealment for MC-EZBC - Microsoft Visual C/C++ code for motion-compensated error concealment for MC-EZBC. It includes an early version of MC-EZBC submitted to MPEG in 2002. Current versions of MC-EZBC are available on the CIPR website. Download.
NXSensor - Nucleosome eXclusion Sequence sensor is a tool for finding regions of DNA sequences that are likely to be nucleosome-free. The basic idea behind NXSensor is that the DNA sequence which wraps around the nucleosome needs to have a certain degree of flexibility. DNA flexibility is a necessary (though not sufficient) condition for nucleosome formation. It is known that the intrinsic curvature of a piece of DNA depends on its sequence, and we use that knowledge to find DNA sequences that are fairly rigid. Regions of DNA that have several rigid sequences close to each other are likely to be nucleosome-free. Web.
Maximum minimal distance lattice partitioning (MMDLP) - Matlab code for generating a partition matrix that solves the constrained sphere packing problem on the Z2 lattice. Download.
Dispersive Packetization (DP) for images - Microsoft Visual C/C++ code for dispersive packetization of subband/wavelet coded images. Baseline coder is based on Geoff Davis' Kit, with the packetization and error concealment modules added. Download.

The datasets below are provided without any warranty, expressed or implied. They are free for academic and non-commercial use. If you use the data in your research, please cite the corresponding references.

SFU-HW-Tracks-v1 - This dataset is an extension of SFU-HW-Objects-v1, and contains unique object IDs for each annotated object in 13 raw HEVC v1 CTC sequences. This allows for benchmarking tracking algorithms and studying the relationship between video compression and tracking.


SFU-HW-Objects - This dataset contains object annotations (bounding boxes and object classes) for raw HEVC v1 CTC sequences.


VOC-360 - This dataset includes a collection of images from the VOC 2012 dataset that have been processed to look like fisheye images coming from a typical 360-degree camera. The dataset allows one to train models for object detection and segmentation on fisheye-looking images. The dataset includes images, object annotations, and segmentation masks.


Wider-360 - This dataset includes a collection of images from the Wider Face Detection Benchmark that have been processed to look like fisheye images coming from a typical 360-degree camera. The dataset allows one to train face detectors on fisheye-looking images. The dataset includes images and face annotations.


FDDB-360 - This dataset includes a collection of images from the Face Detection Data Set and Benchmark (FDDB) that have been processed to look like fisheye images coming from a typical 360-degree camera. The dataset allows one to train face detectors on fisheye-looking images. The dataset includes images, face annotations, sample code to train and test a face detector, as well as a sample face detection model described in the paper below.


Eye tracking database for standard video sequences - This dataset includes a database of gaze locations by 15 independent viewers on a set of 12 standard CIF video sequences: Foreman, Bus, City, Crew, Flower Garden, Mother and Daughter, Soccer, Stefan, Mobile Calendar, Harbor, and Tempete. Included are the gaze locations for the first and second viewing of each sequence, their visualizations, heat maps, and sample MATLAB demo files that show how to use the data.


Segmented foreground objects - This dataset includes manually segmented foreground objects that we used as the ground truth in our moving region segmentation. Each set contains segmentation masks, segmented object(s), and original frames.

No openings at the moment.

Future openings will be announced here and on Twitter @IvanBajic

Design based on http://getskeleton.com/