
This Lecture

Introduction

Mid-Level Vision
Gestalt Theory & Grouping

Clustering & Segmentation

K-means & EM
Mean Shift
Normalized Cuts

Slide Credits:
A. Efros, S. Palmer, B. Leibe, S. Lazebnik, K. Grauman, S. Seitz, C. Bishop, I. Kokkinos

Mid-level vision
Half-way between the image and the objects

Something more informative than pixels


Superpixels,
Ren & Malik

But not necessarily object-centered: generic

"Blue Segment", V. Kandinsky

Why not go directly from image to objects?

Pattern recognition task

Mid-level representations suffice for recognition

Scalability: objects and their geons

Hypothesis: there is a small number of geometric components (geons) that constitute the primitive elements of the object recognition system
Analogy: using letters to form words (compare with ideograms)

Scalability: Recognition-by-components

1) We can tell that this object is unlike anything we know
2) We can split this object into parts that everybody will agree on
3) We can see how it resembles something familiar: a hot dog cart

Mid-level vision
How can we abstract from the image observations?
Too many pixels, edgels, blobs, junctions
Replace with representative, higher-level structures
Fewer and amenable to subsequent processing
Core problem: Grouping
Region grouping (Segmentation)

The Gestalt School


The whole is greater than the sum of its parts

Properties result from relationships

Illusory/subjective contours

Occlusion

Familiar configuration

Relationships are recovered using a few generic cues

Similarity

Common Fate

Proximity

Symmetry

Parallelism

Continuity

Gestalt theory and computer vision

Gestalt heritage: mostly conceptual

Turning Gestalt cues into numerical quantities:

Common fate: motion estimation
Parallelism: texture analysis
Symmetry: ridge detection
Similarity: region-based segmentation
Closure, continuity: boundary-based segmentation

Main problem: how do we choose when to rely on each of these cues?

Cue combination problem


Different cues lead to different segmentations

Symmetry

Continuity

Color Similarity

Image segmentation is an ill-posed problem

No 'optimal' segmentation exists

'Good' segmentation is highly task-dependent, e.g. for:

Depth Ordering

Motion Estimation

This Lecture
Introduction

Mid-Level Vision
Gestalt Theory & Grouping

Clustering & Segmentation

K-means & EM
Mean Shift
Normalized Cuts


Segmentation Problem
Task: Partition image into homogeneous regions

Homogeneity: based on intensity, color, texture, motion, depth, shading, ...

Example: intensity-based segmentation

Feature Space
At each pixel, form a vector of measurements describing image properties: image features
Map observations into feature space

Group pixels based on color similarity

Feature space: color value (3D), e.g. one (R, G, B) vector per pixel

Feature Space
At each pixel, form a vector of measurements describing image properties: image features
Perform segmentation by clustering the feature space

Group pixels based on texture similarity

Feature space: filter bank responses (e.g., a bank of 24 filters gives a 24D feature vector per pixel)
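As a concrete (illustrative) sketch of these two feature spaces, not from the original slides: minimal NumPy/SciPy code, where `filters` is a hypothetical stand-in for the 24-filter bank mentioned above.

```python
import numpy as np
from scipy.ndimage import convolve

def color_features(image):
    """Map an H x W x 3 image to an N x 3 color feature space:
    one (R, G, B) vector per pixel."""
    return image.reshape(-1, 3).astype(np.float64)

def texture_features(gray, filters):
    """Map an H x W grayscale image to an N x D texture feature space:
    one vector of D filter responses per pixel (D = 24 on the slide)."""
    responses = [convolve(gray.astype(np.float64), f) for f in filters]
    return np.stack(responses, axis=-1).reshape(-1, len(filters))
```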

Grouping: Inventing underlying models

What if these points lie on lines?

[Figure: points grouped into Line 1 and Line 2]

As we don't know the lines, we alternate between estimating the point-to-line assignments and the lines themselves:

[Plot: line weights for line 1 and line 2 over the iterations]

Application to segmentation
Invented 'models': image segments
Modeling the features separately within each segment is substantially easier than modeling the whole image.

[Figure: separate models for the brown, blue, and yellow regions]

Intensity-based segmentation: toy example

[Figure: input image and its intensity histogram (pixel count vs. intensity), with peaks for the black, gray, and white pixels]

1D feature vector: intensity measurement

These intensities define the three groups.

We could label every pixel in the image according to which of these primary intensities it is closest to,

i.e., segment the image based on the intensity feature.

[Histogram: intensity axis (0-255) with the three chosen centers marked]

Goal: choose three centers

Label every pixel according to the closest center
But how can we find the centers?
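A minimal NumPy sketch (not from the slides) of the labeling step, assuming the three centers are already known; the center values below are hypothetical.

```python
import numpy as np

def label_by_nearest_center(gray, centers):
    """Assign each pixel the index of its closest intensity center.

    gray:    H x W array of intensities in [0, 255]
    centers: representative intensities, e.g. [15, 120, 230] (hypothetical)
    """
    # H x W x K distances from every pixel to every center
    dists = np.abs(gray[..., None] - np.asarray(centers))
    return dists.argmin(axis=-1)  # H x W label map
```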

Clustering
A chicken-and-egg problem:

Known centers: group points by allocating each to its closest center.

Known group memberships: get centers by computing the group means.

K-Means algorithm
Input: N feature vectors x_i of dimensionality d (N: number of pixels)
Output: cluster centers c_1, ..., c_k and point-to-cluster assignments

K-Means Clustering
Randomly initialize the k cluster centers.
Iterate:
1. Given the cluster centers, determine the points in each cluster:
   for each point i, find the closest center c_j and put i into cluster j.
2. Given the points in each cluster, solve for the centers:
   set each center c_j to the mean of the points in cluster j.
3. If any c_j has changed, go to 1.

Guaranteed convergence to a local minimum of the objective
\sum_{i=1}^{N} \min_j \| x_i - c_j \|^2
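A minimal NumPy sketch of the loop above (illustrative, not the original course code):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Cluster the N x d feature matrix X into k clusters.

    Returns (centers, labels): a k x d array of centers and a
    length-N array of cluster assignments.
    """
    rng = np.random.default_rng(seed)
    # Randomly initialize the centers from the data points.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.full(len(X), -1)
    for _ in range(n_iters):
        # Assignment step: put each point in the cluster of its closest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments unchanged -> local minimum reached
        labels = new_labels
        # Update step: set each center to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels
```

For the results on the next slide, X would hold per-pixel intensities or colors; appending the (x, y) coordinates to each feature vector encourages spatially coherent clusters.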

K-Means Clustering Results

K-means clustering based on intensity or color

[Figure: image; intensity-based clusters; color-based clusters]

Clustering (r,g,b,x,y) values enforces spatial coherence

Limitations of k-means
Euclidean distance-based criterion

No justification for the Euclidean metric in an arbitrary feature space

[Figure: desired clustering vs. the K-means result]

Remedy: introduce more flexible models for the observations within each group

K-means allows only spherical clusters; consider ellipsoidal ones instead

d-dimensional Gaussian distribution

Determined by the mean \mu and covariance matrix \Sigma:
N(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left( -\tfrac{1}{2} (x-\mu)^T \Sigma^{-1} (x-\mu) \right)

Maximum likelihood parameter estimates:
\hat\mu = \frac{1}{N} \sum_{i=1}^{N} x_i, \qquad \hat\Sigma = \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat\mu)(x_i - \hat\mu)^T

[Figure: contours of a 2D Gaussian at P(x) = .1, .2, .5]

Mixture of Gaussians

K Gaussian blobs with parameters \mu_k, \Sigma_k

Blob k is selected with probability \pi_k

The likelihood of observing x is a weighted mixture of Gaussians:
p(x) = \sum_{k=1}^{K} \pi_k \, N(x \mid \mu_k, \Sigma_k)

[Figure: a 1D mixture of Gaussians]

Parameter estimation: maximize the log-likelihood \sum_{i=1}^{N} \log p(x_i)

Expectation Maximization algorithm

E-step (Bayes' rule): compute the responsibility of blob k for each point x_i,
r_{ik} = \frac{\pi_k \, N(x_i \mid \mu_k, \Sigma_k)}{\sum_j \pi_j \, N(x_i \mid \mu_j, \Sigma_j)}

M-step: re-estimate the parameters with the responsibilities as soft weights,
\mu_k = \frac{\sum_i r_{ik} x_i}{\sum_i r_{ik}}, \qquad
\Sigma_k = \frac{\sum_i r_{ik} (x_i - \mu_k)(x_i - \mu_k)^T}{\sum_i r_{ik}}, \qquad
\pi_k = \frac{1}{N} \sum_i r_{ik}
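A compact NumPy/SciPy sketch of these two steps (illustrative; a real implementation would also monitor the log-likelihood for convergence):

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iters=50, seed=0):
    """Fit a K-component Gaussian mixture to N x d data X with EM."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    # Initialize: random means, identity covariances, uniform weights
    # (in practice, initialize with k-means as the slides suggest).
    mu = X[rng.choice(N, size=K, replace=False)].astype(float)
    sigma = np.stack([np.eye(d)] * K)
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iters):
        # E-step: responsibilities r[i, k] via Bayes' rule.
        r = np.stack([pi[k] * multivariate_normal.pdf(X, mu[k], sigma[k])
                      for k in range(K)], axis=1)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weighted mean, covariance, and mixing weight per blob.
        Nk = r.sum(axis=0)
        for k in range(K):
            mu[k] = (r[:, k, None] * X).sum(axis=0) / Nk[k]
            diff = X - mu[k]
            sigma[k] = (r[:, k, None, None]
                        * np.einsum('ni,nj->nij', diff, diff)).sum(axis=0) / Nk[k]
            # Small ridge keeps the covariance well-conditioned.
            sigma[k] += 1e-6 * np.eye(d)
        pi = Nk / N
    return pi, mu, sigma
```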

K-means vs. EM

k-means:
Hard assignment: index of the closest center
Isotropic distance (Euclidean)
Fast (e.g., kd-trees)
More robust to initialization

EM:
Soft assignment (responsibilities r_ik)
Anisotropic likelihood (covariance-based, 'Mahalanobis')
Accurate & more flexible
Prone to local minima

Typical usage: initialize EM with the k-means results

Problems of K-Means/EM
Number of clusters

Initialization/local minima
Mismatch with data distribution

This Lecture
Introduction

Mid-Level Vision
Gestalt Theory & Grouping

Clustering & Segmentation

K-means & EM
Mean Shift
Normalized Cuts


Finding Modes in a Histogram

How many modes are there?

Mode = local maximum of the density of a given distribution


Easy to see, hard to compute
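Hard to compute in general, but in 1D a smoothed histogram makes the idea concrete. A small illustrative sketch (the bin count and smoothing width are arbitrary choices):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def histogram_modes(samples, bins=64, smooth=2.0):
    """Find modes as local maxima of a smoothed 1D histogram."""
    counts, edges = np.histogram(samples, bins=bins)
    density = gaussian_filter1d(counts.astype(float), sigma=smooth)
    # A bin is a mode if it is higher than both of its neighbors.
    is_peak = (density[1:-1] > density[:-2]) & (density[1:-1] > density[2:])
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers[1:-1][is_peak]
```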

Mean Shift Algorithm

Consider a nonparametric density estimate,
p(x) = \frac{1}{N} \sum_{i=1}^{N} K\!\left( \frac{x - x_i}{h} \right),
e.g. with a Gaussian kernel K and bandwidth h

Update: set each point to the weighted average of its neighborhood,
x \leftarrow \frac{\sum_i K\!\left( \frac{x - x_i}{h} \right) x_i}{\sum_i K\!\left( \frac{x - x_i}{h} \right)}

The change is in the direction of maximal increase in likelihood
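A minimal NumPy sketch of this procedure with a Gaussian kernel (illustrative; the mode-merging rule at the end is a deliberately simple stand-in for more careful implementations):

```python
import numpy as np

def mean_shift_point(x, X, h, n_iters=100, tol=1e-5):
    """Shift one point x toward a mode of the density of X.

    Uses a Gaussian kernel with bandwidth h (the single parameter
    highlighted on the summary slide).
    """
    for _ in range(n_iters):
        w = np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * h ** 2))
        x_new = (w[:, None] * X).sum(axis=0) / w.sum()  # weighted mean
        if np.linalg.norm(x_new - x) < tol:
            break  # converged to (near) a mode
        x = x_new
    return x

def mean_shift_cluster(X, h):
    """Cluster by merging points whose trajectories reach the same mode."""
    modes = np.array([mean_shift_point(x, X, h) for x in X])
    # Points whose modes lie within one bandwidth share a cluster.
    labels, reps = np.full(len(X), -1), []
    for i, m in enumerate(modes):
        for j, r in enumerate(reps):
            if np.linalg.norm(m - r) < h:
                labels[i] = j
                break
        if labels[i] == -1:
            reps.append(m)
            labels[i] = len(reps) - 1
    return labels
```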

Mean-Shift

[Animation: a window (region of interest) is repeatedly shifted by the mean shift vector toward the center of mass of the points inside it, until it settles on a mode.]

Slide by Y. Ukrainitz & B. Sarel

Real Modality Analysis

Tessellate the space with windows

Run the procedure in parallel

The blue data points were traversed by the windows toward the mode.

Slide by Y. Ukrainitz & B. Sarel

Mean-Shift Clustering
Cluster: all data points in the attraction basin of a mode
Attraction basin: the region for which all trajectories lead to the same mode

Mean Shift for image segmentation

Mean-Shift Segmentation Results


More Results


Summary Mean-Shift
Pros

General, application-independent tool
Model-free: does not assume any prior shape (spherical, elliptical, etc.) for the data clusters
Just a single parameter (window size h), and h has a physical meaning (unlike k-means)
Finds a variable number of modes
Robust to outliers

Cons

Output depends on window size; bandwidth selection is not trivial
Computationally (relatively) expensive
Does not scale well with the dimension of the feature space

This Lecture
Introduction

Mid-Level Vision
Gestalt Theory & Grouping

Clustering & Segmentation

K-means & EM
Mean Shift
Normalized Cuts


Images as Graphs

[Figure: pixels p and q joined by an edge with weight wpq]

Fully-connected graph

Node (vertex) for every pixel
Link between every pair of pixels, (p, q)
Affinity weight wpq for each link (edge)
wpq measures similarity; similarity is inversely proportional to the difference (in color and position)

Segmentation by Graph Cuts

Break the graph into segments

Delete links that cross between segments

Easiest to break links that have low similarity (low weight):
similar pixels should be in the same segment,
dissimilar pixels should be in different segments

Measuring Affinity
A common choice is an exponentiated negative squared distance,
w_{pq} = \exp\left( -\frac{\operatorname{dist}(p, q)^2}{2\sigma^2} \right),
where dist can measure:

Distance: difference in pixel position
Intensity: difference in intensity values
Color: some suitable color space distance
Texture: distance between vectors of filter outputs
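An illustrative dense-affinity sketch combining a feature term and a spatial term; the sigma values are hypothetical, and dense N x N matrices are only feasible for small images:

```python
import numpy as np

def affinity_matrix(features, positions, sigma_f=0.1, sigma_x=10.0):
    """Dense affinity matrix over N pixels (illustrative sketch).

    features:  N x d per-pixel feature vectors (intensity, color, texture)
    positions: N x 2 pixel coordinates
    Affinity decays with both feature difference and spatial distance.
    """
    df = np.linalg.norm(features[:, None] - features[None, :], axis=2)
    dx = np.linalg.norm(positions[:, None] - positions[None, :], axis=2)
    return np.exp(-df**2 / (2 * sigma_f**2)) * np.exp(-dx**2 / (2 * sigma_x**2))
```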

Graph Cut

Set of edges whose removal makes the graph disconnected

Cost of a cut

Sum of the weights of the cut edges:
cut(A, B) = \sum_{p \in A,\, q \in B} w_{pq}

A graph cut gives us a segmentation

What is a good graph cut, and how do we find one?
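Given the affinity matrix, the cut cost from the formula above is a one-line computation; a small sketch:

```python
import numpy as np

def cut_cost(W, in_A):
    """Cost of the cut separating A from the rest of the graph.

    W:    N x N affinity matrix
    in_A: boolean mask of length N selecting the nodes in A
    """
    # Sum of weights of edges with one endpoint in A and one outside A.
    return W[np.ix_(in_A, ~in_A)].sum()
```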

Graph cut

[Figure: example cuts through the affinity graph]

Here, the cut is nicely defined by the block-diagonal structure of the affinity matrix.
How can this be generalized?

Multi-way graph cut

[Figure: affinity matrix with several blocks; multi-way segmentation as block detection]

Minimum Cut
We can do segmentation by finding the minimum cut in a graph (next lecture)
Drawback:

The weight of a cut is proportional to the number of edges in the cut,
so the minimum cut tends to cut off very small, isolated components

[Figure: cuts with lower weight than the ideal cut isolate single nodes]

Normalized Cut (NCut)

A minimum cut penalizes large segments
This can be fixed by normalizing for the size of the segments
The normalized cut cost is:
NCut(A, B) = \frac{cut(A, B)}{assoc(A, V)} + \frac{cut(A, B)}{assoc(B, V)}

assoc(A, V) = sum of the weights of all edges in V that touch A

Optimization
Original problem: partition the similarity graph by minimizing NCut
Mathematically equivalent to partitioning with a discrete indicator vector y whose entries are restricted to two values (e.g., 1 and -1)

Relaxation: allow y to take continuous values

This yields a generalized eigenvector problem:
(D - W) y = \lambda D y, \quad \text{with diagonal degree matrix } D_{pp} = \sum_q w_{pq}

Embedding

Compute the embedding (the eigenvectors) and then cluster in the new space

Interpretation as a Dynamical System

Treat the links as springs and shake the system

Elasticity is a decreasing function of affinity

Vibration modes correspond to segments

NCuts Example

[Figure: the smallest eigenvectors and the resulting NCuts segments]

Discretization
Problem: eigenvectors take on continuous values

How to choose the splitting point to binarize the eigenvector?

[Figure: image, eigenvector, NCut scores]

Possible procedures:
a) Pick a constant value (0, or 0.5).
b) Pick the median value as the splitting point.
c) Look for the splitting point that has the minimum NCut value:
1. Choose n possible splitting points.
2. Compute the NCut value for each.
3. Pick the minimum.

CVPR 2006, Tolliver & Miller: Spectral Rounding
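Procedure (c) as a short illustrative sketch over a dense affinity matrix W (the number of candidates is an arbitrary choice):

```python
import numpy as np

def ncut_value(W, in_A):
    """NCut cost of the bipartition (A, B) of the graph with affinities W."""
    cut = W[np.ix_(in_A, ~in_A)].sum()
    assoc_A = W[in_A].sum()   # weights of all edges touching A
    assoc_B = W[~in_A].sum()  # weights of all edges touching B
    return cut / assoc_A + cut / assoc_B

def best_split(W, eigvec, n_candidates=20):
    """Scan candidate thresholds of the eigenvector; keep the minimum NCut."""
    # Interior thresholds only, so neither side of the split is empty.
    thresholds = np.linspace(eigvec.min(), eigvec.max(), n_candidates + 2)[1:-1]
    scores = [ncut_value(W, eigvec > t) for t in thresholds]
    return thresholds[int(np.argmin(scores))]
```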

NCuts: Overall Procedure

1. Construct a weighted graph G = (V, E) from the image:
connect each pair of pixels, and compute the affinity weight for each edge.

2. Solve (D - W) y = \lambda D y for the eigenvectors with the smallest eigenvalues.

3. Form an approximate solution to the NCut problem:

Threshold the eigenvectors to get a discrete cut, and recursively subdivide if the NCut value is below a pre-specified value.
Or: cluster the eigenvector values using k-means
Or: Spectral Rounding
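A dense, small-graph sketch of one NCuts bipartition (illustrative, not the reference implementation):

```python
import numpy as np
from scipy.linalg import eigh

def ncuts_bipartition(W):
    """One NCuts split via the generalized eigenproblem (D - W) y = lambda D y."""
    D = np.diag(W.sum(axis=1))
    # eigh solves the symmetric generalized problem; eigenvalues ascend.
    eigvals, eigvecs = eigh(D - W, D)
    # The smallest eigenvalue is 0 (constant vector); the second-smallest
    # eigenvector carries the partition information.
    y = eigvecs[:, 1]
    return y > np.median(y)  # median split, procedure (b) above
```

For real images the graph is large, so practical implementations restrict affinities to nearby pixels and use a sparse eigensolver such as scipy.sparse.linalg.eigsh.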

Color Image Segmentation with NCuts


Results with Color & Texture


Normalized Cuts results


Berkeley Segmentation Engine

Summary: Normalized Cuts


Pros:

Generic framework, flexible choice of affinity function


Does not require any model of the data distribution

Cons:

Time and memory complexity can be high

Dense, highly connected graphs: many affinity computations
Solving the eigenvalue problem is expensive

Preference for balanced partitions

If a region is uniform, NCuts will find the modes of vibration of the image dimensions

Lecture Summary
Introduction

Mid-Level Vision
Gestalt Theory & Grouping

Clustering & Segmentation

K-means & EM
Mean Shift
Normalized Cuts

