Reinforcement Learning for Landmark Detection in 3D CT Volumes
The authors propose to reformulate the problem of anatomical landmark detection as a cognitive learning task for an artificial agent. Given a volumetric image $\boldsymbol{I}: \mathbb{Z}^3 \to \mathbb{R}$ and the location of an anatomical structure of interest $p_{GT} \in \mathbb{R}^3$ within $\boldsymbol{I}$, the task can be formulated as learning a navigation strategy to $p_{GT}$ in the voxel grid. Equivalently, this can be interpreted as finding voxel-based navigation trajectories from an arbitrary starting point $p_0$ to a destination point $p_k$ within the image $\boldsymbol{I}$, such that $\lVert p_k - p_{GT} \rVert$ is minimized. In the domain of reinforcement learning, this problem can be modelled as a Markov Decision Process (MDP):
$$ \mathcal{M} := \left(\mathcal{S}, \mathcal{A}, \mathcal{T}, \mathcal{R}, \gamma\right) $$
where $\mathcal{S}$ represents the finite set of states; $s_t \in \mathcal{S}$ denotes the state of the agent at time $t$ and is defined as $s_t = \boldsymbol{I}(p_t)$. ...
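To make the formulation concrete, the following is a minimal sketch of such a voxel-grid navigation MDP. The six unit-step actions, the distance-based reward, and the termination criterion are illustrative assumptions (the excerpt above does not yet specify $\mathcal{A}$ or $\mathcal{R}$); the state follows the definition $s_t = \boldsymbol{I}(p_t)$, although in practice a local patch around $p_t$ is often used instead of a single intensity value.

```python
# Minimal sketch of the landmark-navigation MDP, assuming six unit-voxel
# actions (+/-x, +/-y, +/-z) and a distance-based reward. The reward and
# termination rule are assumptions for illustration, not taken from the text.
import numpy as np

class LandmarkMDP:
    # Action set A: unit steps along each axis of the voxel grid (assumed).
    ACTIONS = np.array([[ 1, 0, 0], [-1, 0, 0],
                        [ 0, 1, 0], [ 0, -1, 0],
                        [ 0, 0, 1], [ 0, 0, -1]])

    def __init__(self, volume, p_gt, gamma=0.9):
        self.I = volume                # 3D CT volume, I: Z^3 -> R
        self.p_gt = np.asarray(p_gt)   # ground-truth landmark location p_GT
        self.gamma = gamma             # discount factor
        self.p = None                  # current agent position p_t

    def reset(self, p0):
        """Start an episode at an arbitrary voxel p0."""
        self.p = np.asarray(p0)
        return self._state()

    def _state(self):
        # State s_t = I(p_t): the intensity at the agent's current position.
        return self.I[tuple(self.p)]

    def step(self, a):
        """Apply action index a and return (next_state, reward, done)."""
        d_prev = np.linalg.norm(self.p - self.p_gt)
        # Move one voxel and clamp to the volume boundaries.
        self.p = np.clip(self.p + self.ACTIONS[a], 0,
                         np.array(self.I.shape) - 1)
        d_new = np.linalg.norm(self.p - self.p_gt)
        reward = d_prev - d_new        # positive when moving closer (assumed)
        done = d_new < 1.0             # stop within one voxel of p_GT (assumed)
        return self._state(), reward, done
```

With this environment, an episode corresponds to one navigation trajectory $p_0, p_1, \ldots, p_k$, and the discounted return under the assumed reward is maximized by paths that reduce the distance to $p_{GT}$ as quickly as possible.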