Deep Learning in
Medical Image Analysis
Dinggang Shen
Methods and Limitations
O PCA: linear; not optimal for non-Gaussian data
O Gaussian Mixture Models / k-Means: require knowledge of the number of clusters; challenging when applied to high-dimensional data
O ICA: linear model
O Sparse Coding / Non-Linear Embedding: shallow models (e.g., single-layer representations)
Performance increases with more layers.
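The linearity limitation of PCA can be made concrete. The sketch below (illustrative, not from the slides) implements PCA as a linear projection via the SVD: it recovers data lying near a line perfectly, precisely because that structure is linear; curved (non-Gaussian) structure would not be captured this way.

```python
import numpy as np

# Illustrative sketch: PCA as a linear projection (computed via SVD).
# PCA finds orthogonal directions of maximal variance, so it can only
# model linear structure -- one of the limitations listed above.
def pca_fit_transform(X, n_components):
    """Project X onto its top principal components."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Economy SVD: rows of Vt are the principal directions.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    return Xc @ components.T, components, mean

rng = np.random.default_rng(0)
# Data lying near a 1-D line in 3-D: one component suffices.
t = rng.normal(size=(200, 1))
X = t @ np.array([[1.0, 2.0, -1.0]]) + 0.01 * rng.normal(size=(200, 3))
Z, components, mean = pca_fit_transform(X, 1)
X_rec = Z @ components + mean  # reconstruction from 1 component
print("reconstruction error:", np.abs(X - X_rec).max())
```

The reconstruction error stays at the noise level here only because the data truly lie on a linear subspace.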
Department of Radiology and BRIC, UNC-Chapel Hill
Deep Learning: Why Hot?
n 3 main reasons:
1) New layer-wise training algorithm (Science, 2006)
n Each layer is trained on a simple task
2) Big data, compared to 20 years ago
3) Powerful computers
n Previous algorithms may have been theoretically sound, but in practice they did not converge to good local minima on the less powerful computers of the past.
Restricted Boltzmann Machine
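The RBM is the building block of the layer-wise training algorithm mentioned above. Below is a minimal sketch of a binary RBM trained with one-step contrastive divergence (CD-1); the sizes, learning rate, and toy data are illustrative, not from the slides.

```python
import numpy as np

# Minimal sketch of a binary Restricted Boltzmann Machine trained with
# one-step contrastive divergence (CD-1). All hyperparameters and the
# toy data below are illustrative assumptions.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_visible, n_hidden, lr = 6, 3, 0.1
W = 0.01 * rng.normal(size=(n_visible, n_hidden))
b_v = np.zeros(n_visible)   # visible biases
b_h = np.zeros(n_hidden)    # hidden biases

# Toy binary data: two repeated patterns.
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]] * 20, dtype=float)

for epoch in range(200):
    v0 = data
    # Positive phase: hidden probabilities given the data.
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: one Gibbs step (reconstruction).
    p_v1 = sigmoid(h0 @ W.T + b_v)
    p_h1 = sigmoid(p_v1 @ W + b_h)
    # CD-1 gradient approximation.
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / len(v0)
    b_v += lr * (v0 - p_v1).mean(axis=0)
    b_h += lr * (p_h0 - p_h1).mean(axis=0)

recon = sigmoid(sigmoid(data @ W + b_h) @ W.T + b_v)
print("mean reconstruction error:", np.abs(data - recon).mean())
```

In the 2006 recipe, several such RBMs are trained greedily, each on the hidden activities of the previous one, before fine-tuning the whole stack.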
O M. Kim, G. Wu, D. Shen, "Unsupervised Deep Learning for Hippocampus Segmentation in 7.0 Tesla MR Images," MICCAI Workshop on Machine Learning in Medical Imaging (MLMI), 2013.
O M. Kim, G. Wu, W. Li, L. Wang, Y.-D. Son, Z.-H. Cho, D. Shen, "Automatic Hippocampus Segmentation of 7.0 Tesla MR Images by Combining Multiple Atlases and Auto-Context Models," NeuroImage, 83:335-345, 2013.
n Challenges: Hippocampus
n The hippocampus is small (roughly 35 x 15 x 7 mm³)
n The hippocampus is surrounded by complex structures
n Low imaging resolution (1 x 1 x 1 mm³) of 1.5T or 3T MRI scanners
[Figure: (a) image patches X sampled at voxels in the image; (b) Haar filters; (c) basis filters W in the 1st layer, learned by the first ISA layer.]
[Figure: Training stage — aligned training images in each atlas space (1..N) provide image patches; learned 2-layer ISA features train one classifier per ISA sequence. Testing stage — subject image patches pass through each ISA sequence to produce N 2-layer ISA classification maps, which are combined by adaptively weighted fusion into a probability map; a level set then gives the segmentation result in the subject image space.]
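The fusion step of this pipeline combines per-atlas probability maps into one. The sketch below uses a fixed weighted average for illustration; the paper's adaptive weighting scheme is not reproduced here.

```python
import numpy as np

# Sketch of the fusion step: classification maps from N atlas spaces
# are combined by a weighted average into a single probability map.
# The weights below are illustrative, not the paper's adaptive weights.
def weighted_fusion(maps, weights):
    """maps: (N, H, W) per-atlas probability maps; weights: (N,)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()           # normalize to sum to 1
    return np.tensordot(weights, maps, axes=1)  # (H, W) fused map

maps = np.stack([np.full((4, 4), 0.9),   # atlas 1: confident foreground
                 np.full((4, 4), 0.6),
                 np.full((4, 4), 0.3)])
fused = weighted_fusion(maps, [0.5, 0.3, 0.2])
print(fused[0, 0])  # 0.5*0.9 + 0.3*0.6 + 0.2*0.3 ≈ 0.69
```

The fused map would then be thresholded or refined (here, by a level set) to obtain the final binary segmentation.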
[Figure: segmentation performance measured by recall, relative overlap, and similarity index.]
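The three evaluation measures named above can be computed for binary segmentation masks as follows. The slides give no formulas, so these definitions (recall; relative overlap as the Jaccard index; similarity index as the Dice coefficient) are a common reading, stated here as an assumption.

```python
import numpy as np

# Assumed definitions for the three measures (B = ground truth):
#   recall           = |A ∩ B| / |B|
#   relative overlap = |A ∩ B| / |A ∪ B|     (Jaccard index)
#   similarity index = 2|A ∩ B| / (|A|+|B|)  (Dice coefficient)
def overlap_metrics(seg, truth):
    seg, truth = seg.astype(bool), truth.astype(bool)
    inter = np.logical_and(seg, truth).sum()
    union = np.logical_or(seg, truth).sum()
    recall = inter / truth.sum()
    jaccard = inter / union
    dice = 2 * inter / (seg.sum() + truth.sum())
    return recall, jaccard, dice

truth = np.array([1, 1, 1, 1, 0, 0, 0, 0])
seg   = np.array([1, 1, 1, 0, 1, 0, 0, 0])
print(overlap_metrics(seg, truth))  # recall 0.75, Jaccard 0.6, Dice 0.75
```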
[Figure: input T1, T2, and FA images of an infant brain, with the corresponding CSF, GM, and WM segmentation maps.]
Input modalities consist of T1, T2, and fractional anisotropy (FA) images of infant brains
Each pixel is segmented into cerebrospinal fluid (CSF), gray matter, and white matter
Isointense stage (~6-8 months of age) brains are very difficult to segment
# of trainable parameters is ~5.3 million
W. Zhang, R. Li, H. Deng, L. Wang, W. Lin, S. Ji, D. Shen, "Deep Convolutional Neural Networks for Multi-Modality Isointense Infant Brain Image Segmentation," NeuroImage, 2015.
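Parameter counts like the ~5.3 million quoted above are tallied layer by layer. The sketch below shows the bookkeeping on a small hypothetical multi-modality CNN (the architecture is illustrative and is not the network of Zhang et al., so its total differs from 5.3 million).

```python
# Sketch of counting a CNN's trainable parameters:
#   conv layer:            (kh * kw * in_ch + 1) * out_ch
#   fully connected layer: (n_in + 1) * n_out
# The "+1" terms are biases. The architecture below is hypothetical.
def conv_params(kh, kw, in_ch, out_ch):
    return (kh * kw * in_ch + 1) * out_ch

def fc_params(n_in, n_out):
    return (n_in + 1) * n_out

# Three input channels: T1, T2, and FA patches.
layers = [
    conv_params(5, 5, 3, 64),     # conv1 over the 3 modalities
    conv_params(5, 5, 64, 128),   # conv2
    fc_params(128 * 5 * 5, 256),  # fc1 on flattened 5x5 feature maps
    fc_params(256, 3),            # softmax over CSF / GM / WM
]
print("total trainable parameters:", sum(layers))
```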
O G. Wu, M. Kim, Q. Wang, B.C. Munsell, D. Shen, Scalable High Performance Image Registration
Framework by Unsupervised Deep Feature Representations Learning, Revised for IEEE TBME, 2015.
O G. Wu, M. Kim, Q. Wang, S. Liao, Y. Gao, D. Shen, "Unsupervised Deep Feature Learning for
Deformable Image Registration of MR Brains," MICCAI 2013.
[Figure: individual model — morphological signatures for image registration.]
Dice ratio (%)
Methods   Ventricle   Gray Matter   White Matter   Hippocampus
Demons      90.2        76.0          85.7           72.2
M+PCA       90.5        76.6          85.5           72.3
M+DP        90.9        76.5          85.8           72.5
HAMMER      91.5        75.5          85.4           75.5
H+PCA       91.7        76.9          86.5           75.6
H+DP        95.0        78.6          88.1           76.8

Methods   Average
Demons      68.9
M+PCA       68.9
M+DP        69.2
HAMMER      70.2
H+PCA       70.6
H+DP        72.7
STG: Superior temporal gyrus; PCG: Precentral gyrus; SFG: Superior frontal gyrus; M&ITG: Middle and inferior temporal gyrus; AOG: Anterior orbital gyrus; POG: Posterior gyrus; MFG: Middle frontal gyrus; IFG: Inferior frontal gyrus; LG: Lingual gyrus
H.-I. Suk, S.-W. Lee, and D. Shen, "Latent Feature Representation with Stacked Auto-Encoder for AD/MCI Diagnosis," Brain Structure and Function, 2014.
H.-I. Suk and D. Shen, "Deep Learning-based Feature Representation for AD/MCI Classification," MICCAI 2013.
H.-I. Suk, S.-W. Lee, D. Shen, "Hierarchical Feature Representation and Multimodal Fusion with Deep Learning for AD/MCI Diagnosis," NeuroImage, 101:569-582, 2014.
n Treatments
n Small symptomatic benefits for mild-to-moderate AD
n Cannot delay or halt the progression of AD
[Figure: a simple auto-encoder (input, encoding) and a stacked auto-encoder.]
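The stacking idea can be sketched compactly: each layer is an auto-encoder trained to reconstruct its input, and its codes become the next layer's input (greedy layer-wise pre-training). For brevity the sketch uses *linear* auto-encoders, whose optimal tied weights are given in closed form by the SVD; a real stacked auto-encoder uses nonlinear layers trained by backpropagation. All sizes are illustrative.

```python
import numpy as np

# Greedy layer-wise sketch of a stacked auto-encoder. Each layer here
# is a linear auto-encoder solved in closed form (its PCA subspace)
# instead of being trained by backpropagation -- a simplification.
def fit_linear_autoencoder(X, n_hidden):
    """Return encoder weights W minimizing ||X W W^T - X||_F^2."""
    _, _, Vt = np.linalg.svd(X - X.mean(0), full_matrices=False)
    return Vt[:n_hidden].T                 # shape (n_in, n_hidden)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 8))  # rank-3 data in 8-D

# Layer 1: 8 -> 4 codes; layer 2: 4 -> 2 codes, trained on layer-1 codes.
W1 = fit_linear_autoencoder(X, 4)
H1 = (X - X.mean(0)) @ W1
W2 = fit_linear_autoencoder(H1, 2)
H2 = (H1 - H1.mean(0)) @ W2
print("stacked code shape:", H2.shape)  # (200, 2)
```

After such pre-training, the whole stack is fine-tuned end-to-end, as in the framework below.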
Proposed Framework
[Figure: MRI, PET, and CSF features are extracted in a template space; a stacked auto-encoder (pre-training followed by fine-tuning) learns a latent feature representation; feature selection and multi-task learning then predict the clinical label and scores (MMSE, ADAS-Cog) for AD/MCI diagnosis.]
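The multi-task element of this framework means one set of learned features predicts several targets at once (diagnostic label, MMSE, ADAS-Cog). As a toy stand-in, the sketch below fits a single ridge-regression weight matrix with one output column per task; the actual framework uses sparse multi-task feature selection, which is not reproduced here.

```python
import numpy as np

# Toy multi-task regression: one weight matrix, several target columns
# (e.g., label, MMSE, ADAS-Cog), solved jointly with a ridge penalty.
# This stands in for -- but is not -- the paper's sparse multi-task model.
def multi_task_ridge(X, Y, lam=1.0):
    """Solve min_W ||X W - Y||^2 + lam ||W||^2 for all tasks jointly."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))          # latent features per subject
W_true = rng.normal(size=(10, 3))       # 3 tasks share the same features
Y = X @ W_true + 0.01 * rng.normal(size=(100, 3))
W = multi_task_ridge(X, Y, lam=0.1)
print("max weight error:", np.abs(W - W_true).max())
```

Coupling the tasks through shared features (and, in the paper, a shared sparsity pattern) is what lets the clinical scores act as extra supervision for the diagnosis task.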
[Figure: bar charts of accuracy, sensitivity, and specificity for AD vs. MCI and MCI-converter (MCIC) vs. MCI-non-converter (MCINC) classification.]
H.-I. Suk, S.-W. Lee, D. Shen, Hierarchical Feature Representation and Multimodal Fusion with Deep
Learning for AD/MCI Diagnosis, NeuroImage, 101:569-582, 2014.
[Figure: spatially distributed patches {v_k^m} (m = MRI, PET; k = 1..K) are extracted from the preprocessed MRI and PET volumes and assembled into mega-patches; a multi-modal DBM learns features {f_k}, which feed weighted-ensemble SVM classifier learning.]
Each volume contains 64 x 64 x 64 voxels; ~50,000 patches were extracted from each volume, leading to a total of 19.9 million training patches. The network contains 37,761 parameters.
R. Li, W. Zhang, H. Suk, L. Wang, J. Li, D. Shen, and S. Ji, "Deep Learning Based Imaging Data Completion for Improved Brain Disease Diagnosis," MICCAI, 2014.
S. Liao, Y. Gao, and D. Shen, "Representation Learning: A Unified Deep Learning Framework for
Automatic Prostate MR Segmentation," MICCAI 2013.
n Inhomogeneity
n Large inter-subject shape variability
Simple units correspond to image filters, and pool units group similar image filters together
to increase the robustness of learned features. The goal of ISA is to learn a feature
representation that is: 1) sparse, and 2) diverse. Thus, the objective function is defined as:
\arg\min_{W,V} \sum_{i=1}^{N} \sum_{j=1}^{m} R_j(x_i; W, V) \quad \text{s.t. } W W^T = I,
\qquad \text{where } R_j(x_i; W, V) = \sqrt{\sum_{l=1}^{k} V_{jl} \Big( \sum_{p=1}^{d} W_{lp}\, x_{pi} \Big)^2}
d, k, and m denote the dimension of each input x_i, the number of simple units in the first layer, and the number of pooling units in the second layer, respectively.
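The two-layer ISA response defined above is straightforward to compute: simple units take linear filter responses W x, and each pooling unit takes the square root of a V-weighted sum of the squared simple-unit responses. The sketch below (with illustrative dimensions) evaluates R_j for a batch of patches, using a QR factorization to satisfy the W Wᵀ = I constraint.

```python
import numpy as np

# Sketch of the two-layer ISA response defined above:
#   R_j(x_i) = sqrt( sum_l V_jl * ( (W x_i)_l )^2 )
# Dimensions (d, k, m, N) below are illustrative.
def isa_responses(X, W, V):
    """X: (N, d) patches; W: (k, d) filters; V: (m, k) pooling weights.
    Returns the (N, m) matrix of pooled responses R_j(x_i)."""
    S = (X @ W.T) ** 2        # squared simple-unit responses, (N, k)
    return np.sqrt(S @ V.T)   # pooled (second-layer) responses, (N, m)

rng = np.random.default_rng(0)
d, k, m, N = 16, 8, 4, 5
# Orthonormal rows for W (the W W^T = I constraint), via QR.
W = np.linalg.qr(rng.normal(size=(d, k)))[0].T
V = (rng.random(size=(m, k)) > 0.5).astype(float)  # binary pooling groups
X = rng.normal(size=(N, d))
R = isa_responses(X, W, V)
print("pooled response shape:", R.shape)  # (5, 4)
```

The square-root pooling over groups of squared filter responses is what makes pooled units invariant to sign and phase changes within a group, yielding the robustness mentioned above.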
Feature difference maps: comparing the green-cross voxel with all other image voxels using different features. Blue indicates a low difference, and red indicates a high difference.