Pixels & Predictions

Learning Active Learning from Data

Here are some slides I made to present this NIPS 2017 paper in our reading group:

CNNs for Brain Tumor Segmentation in MRI Scans (BraTS)

Drawing inspiration from the popular VGG networks, the paper proposes a deep convolutional neural network architecture with small convolutional kernels for segmenting gliomas in MRI images. The authors discuss the relative advantages of small kernels and also explore intensity normalization as a pre-processing step, which was unconventional in CNN-based segmentation methods. The proposed method ranked first on the Dice Similarity Coefficient metric for the complete, core, and enhancing regions on the Brain Tumor Segmentation Challenge 2013 database (BraTS 2013). ...
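For readers unfamiliar with the Dice Similarity Coefficient mentioned above, here is a minimal sketch of how it is commonly computed for a pair of binary masks, $\text{DSC} = 2|P \cap T| / (|P| + |T|)$. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are my own assumptions for illustration, not necessarily what the BraTS evaluation uses.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice overlap between two binary masks of the same shape.

    Assumption: two empty masks count as a perfect match (1.0).
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")

    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0
    return float(2.0 * intersection / total)
```

For a multi-region problem like BraTS, one would presumably binarize the label map once per region (complete, core, enhancing) and call this function on each pair of masks.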

Reinforcement Learning for Landmark Detection in 3D CT Volumes

The authors propose to reformulate the problem of anatomical detection as a cognitive learning task for an artificial agent. Given a volumetric image $\boldsymbol{I}: \mathbb{Z}^3 \to \mathbb{R}$ and the location of an anatomical structure of interest $p_{GT} \in \mathbb{R}^3$ within $\boldsymbol{I}$, the task can be formulated as learning a navigation strategy to $p_{GT}$ in the voxel grid. This can also be interpreted as finding voxel-based navigation trajectories from an arbitrary starting point $p_0$ to a destination point $p_k$ within the image $\boldsymbol{I}$, such that $\lVert p_k - p_{GT} \rVert$ is minimized. In the domain of reinforcement learning, this problem can be modelled as a Markov Decision Process (MDP): $$ \mathcal{M} := \left(\mathcal{S}, \mathcal{A}, \mathcal{T}, \mathcal{R}, \gamma\right) $$ where $\mathcal{S}$ represents the finite set of states, and $s_t \in \mathcal{S}$ denotes the state of the agent at time $t$, defined as $s_t = \boldsymbol{I}(p_t)$. ...
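To make the navigation formulation a little more concrete, here is a minimal sketch of one transition in the voxel grid. Everything beyond the excerpt above is an assumption on my part: the six unit-step actions in `ACTIONS`, the distance-reduction reward, and the helper names `step` and `greedy_trajectory`. The greedy policy is an oracle that reads $p_{GT}$ directly; in the reinforcement learning setting the agent would instead have to learn its policy from the image content.

```python
import numpy as np

# Assumed action set: one-voxel moves along each axis of the grid.
ACTIONS = np.array([
    [1, 0, 0], [-1, 0, 0],
    [0, 1, 0], [0, -1, 0],
    [0, 0, 1], [0, 0, -1],
])


def step(position, action_index, target, shape):
    """Apply one move, clipped to the volume; return (new_position, reward).

    Assumed reward: how much the move reduces the distance to the target.
    """
    position = np.asarray(position)
    target = np.asarray(target)
    upper = np.asarray(shape) - 1
    new_position = np.clip(position + ACTIONS[action_index], 0, upper)
    reward = (np.linalg.norm(position - target)
              - np.linalg.norm(new_position - target))
    return new_position, float(reward)


def greedy_trajectory(start, target, shape, max_steps=100):
    """Oracle policy: take the highest-reward move until none helps."""
    position = np.asarray(start)
    path = [tuple(int(v) for v in position)]
    for _ in range(max_steps):
        candidates = [step(position, a, target, shape)
                      for a in range(len(ACTIONS))]
        best_position, best_reward = max(candidates, key=lambda c: c[1])
        if best_reward <= 0:
            break  # no move brings the agent closer
        position = best_position
        path.append(tuple(int(v) for v in position))
    return path
```

Calling `greedy_trajectory((0, 0, 0), (2, 1, 0), (4, 4, 4))` traces a path $p_0, \dots, p_k$ that ends at the target, which is the kind of trajectory the learned agent is meant to recover without access to $p_{GT}$.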

Unsupervised Learning for Deformable Registration

The authors highlight several shortcomings of contemporary learning-based image registration methods: the inaccuracy of the correspondences provided for training (especially when the deformed subject image is significantly different from the template image), the difficulty of incorporating new image features without repeating the whole training procedure, and the lack of variation in the training image features, primarily because of the prohibitive computational cost associated with it. Moreover, the authors note that the best features are "often learnt only at the template space", meaning that if the template image is changed, the whole training procedure has to be redone. ...

Achieving Dermatologist-level Classification Performance of Skin Lesion Images

The Dataset

The paper uses a new dermatologist-labelled dataset of 129,450 clinical images, which includes 3,374 dermoscopic images. These images come from 18 different clinician-curated, open-access online repositories, as well as from clinical data from Stanford University Medical Center, and cover 2,032 diseases. This data is split into 127,463 training and validation images, and 1,942 biopsy-labelled test images. ...