
Robot Vision

Peter Corke
CyberPhysical Systems Lab.
CVPR Summer School
Kioloa 2012

petercorke.com/kioloa.pdf
CRICOS No. 00213J

Queensland University of Technology

Why would it be useful for a robot to see?

a university for the real world

What is a robot?


1956

1977

Robot: the word


Karel Čapek

1921

2004

Robot: a definition

where am I?
where are you?
A goal-oriented machine that can
sense, plan and act.

how do I get there?



What about GPS?


GPS is not perfect
Satellites are obscured in urban areas
Multi-pathing in industrial sites
Not available in many important work domains such as
  underwater
  underground
  deep forest

Only tells where I am


What about vision?


Eyes are useful/essential for
the critical life tasks of all
animals:
  finding food
  avoiding being food
  finding mates

Can sense shape, color, motion
A long-range sensor: beyond our fingertips


Evolution of the eye

eye invented 540 million years ago
10 different eye designs
lensed eye invented 7 times

Vision does need a light source, but we evolved next to a bright star

Consider the bee

brain
1 g
10⁶ neurons


Anterior Median and Anterior Lateral Eyes of an Adult Female


Phidippus putnami Jumping Spider


Compound Eyes of a Holocephala fusca Robber Fly


Face of a Southern Yellowjacket Queen- Vespula squamosa



Pupil

Sclera
Iris

Pupil

Iris

Sclera


brain
1.5 kg
10¹¹ neurons
~1/3 for vision

bee brain
1 g
10⁶ neurons

Seeing is an active process


Robots and vision


A great sensor for robots


Vision is the process of discovering from images what is present in the world and where it is.
    David Marr


Robots that read


Dual-camera image-based visual servo


Watching whales with UAVs


Robots underwater


Image geometry


Reflection of light

Specular reflection
- angle of incidence θi equals angle of reflection θr


Reflection of light

Lambertian reflection
- diffuse/matte surface
- reflected luminance I cos θ
- brightness invariant to observer's angle of view

Johann Heinrich Lambert, 1728-1777

Extramission theory


Image formation

points in the world



Image formation

points in the world


image plane

Image formation

points in the world


image plane

The pin hole camera


Pin hole images


The world's largest pin-hole camera

http://www.legacyphotoproject.com

Image formation


Use a lens to gather more light

George R. Lawrence 1900



Image formation

[figure: lens of focal length f; a bigger lens area gathers more light onto the image plane]

    F = f / d

The f-number F is the focal length f divided by the diameter d of the aperture.

Pin-hole image geometry

[figure: world point at height Y and distance Z, imaged at height y on the image plane at distance f]

    y / f = Y / Z

Image formation is the mapping of scene points to the image plane.

No unique inverse


Thin lens model

The pin-hole camera produces dim images; the way to create brighter images is to collect more light. A convex lens behaves like an equivalent pin hole but, with its larger aperture, allows more light.

[figure: object at distance zo imaged, inverted, on the image plane at distance zi behind an ideal thin lens; f marks the focal points]

The object and its image are related by

    1/zo + 1/zi = 1/f

where zo is the distance to the object, zi the distance to the image, and f the focal length of the lens.
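As a quick numerical check of the thin lens equation, a minimal Python sketch (the function name is our own):

```python
def image_distance(zo, f):
    """Solve the thin lens equation 1/zo + 1/zi = 1/f for zi."""
    return 1.0 / (1.0 / f - 1.0 / zo)

# a 15 mm lens focused on an object 3 m away
zi = image_distance(3.0, 0.015)        # slightly more than f
zi_far = image_distance(1e9, 0.015)    # very distant object: zi -> f
```

As zo grows, zi approaches the focal length f.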

Thin lens model

[figure: object at distance zo, inverted image at distance zi on the image plane, focal points at f of an ideal thin lens]

Focussing on distant objects: as zo → ∞, zi → f


Pin-hole image geometry (3D)

    x / f = X / Z,    y / f = Y / Z

    x = f X / Z,    y = f Y / Z

The mapping (X, Y, Z) → (x, y) takes R³ to R².
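A direct transcription of this mapping into Python (the function name is our own):

```python
def project(P, f):
    """Central projection: scene point (X, Y, Z) -> image-plane point
    (x, y) = (f X / Z, f Y / Z); all quantities in metres."""
    X, Y, Z = P
    return (f * X / Z, f * Y / Z)
```

A point at (0.3, 0.4, 3.0), 3 m in front of a 15 mm lens, projects to (0.0015, 0.002) m on the image plane.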

Perspective transform


Perspective projection

Lines → lines
  parallel lines → not necessarily parallel
Conics → conics


Ideal city (1470)

Piero della Francesca (1415-1492)


Homogeneous coordinates

Cartesian → homogeneous:

    P = (x, y), P ∈ R²    →    P̃ = (x, y, 1)

homogeneous → Cartesian:

    P̃ = (x̃, ỹ, z̃)    →    P = (x, y),  x = x̃/z̃,  y = ỹ/z̃
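The two conversions are one-liners; a Python sketch (the names e2h and h2e follow the Toolbox naming):

```python
def e2h(p):
    """Euclidean -> homogeneous: append a unit element."""
    return tuple(p) + (1,)

def h2e(pt):
    """Homogeneous -> Euclidean: divide through by the last element."""
    *coords, w = pt
    return tuple(c / w for c in coords)
```

Any non-zero scaling of a homogeneous vector represents the same point: h2e((6, 4, 2)) gives (3.0, 2.0).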

Pin-hole model in homogeneous form

    [x̃]   [f 0 0 0] [X]
    [ỹ] = [0 f 0 0] [Y]
    [z̃]   [0 0 1 0] [Z]
                     [1]

    x̃ = f X,  ỹ = f Y,  z̃ = Z

    x = x̃/z̃,  y = ỹ/z̃    →    x = f X / Z,  y = f Y / Z

Perspective transformation, with the pesky divide by Z, is linear in homogeneous coordinate form.

The projection matrix factors as

    [f 0 0 0]   [f 0 0] [1 0 0 0]
    [0 f 0 0] = [0 f 0] [0 1 0 0]
    [0 0 1 0]   [0 0 1] [0 0 1 0]
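The linearity is easy to demonstrate: the only non-linear step is the final normalisation. A Python sketch with f = 0.015 (the matvec helper is our own):

```python
f = 0.015
C = [[f, 0, 0, 0],
     [0, f, 0, 0],
     [0, 0, 1, 0]]                 # 3x4 projection matrix

def matvec(M, v):
    """Matrix-vector product over plain lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

Pt = [0.3, 0.4, 3.0, 1.0]          # homogeneous world point
xt, yt, zt = matvec(C, Pt)         # purely linear step
x, y = xt / zt, yt / zt            # the "pesky divide" happens last
```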

Central projection model

    [x̃]   [f 0 0 0] [X]
    [ỹ] = [0 f 0 0] [Y]
    [z̃]   [0 0 1 0] [Z]
                     [1]

[figure: world point P = (X, Y, Z) projected along a ray through the camera origin {C} (axes xC, yC, zC) to point p on the image plane at z = f; the optical axis meets the image plane at the principal point]

Change of coordinates

Image-plane coordinates (metres) are converted to pixel coordinates (u, v) using the pixel dimensions ρu × ρv and the principal point (u0, v0):

    [ũ]   [1/ρu   0    u0] [x̃]
    [ṽ] = [ 0    1/ρv  v0] [ỹ]
    [w̃]   [ 0     0     1] [z̃]

[figure: camera {C} with image plane at z = f; the principal point (u0, v0) is where the optical axis pierces the image plane]

Complete camera model

World point P = (X, Y, Z) expressed in frame {0}, camera pose {C}:

    [ũ]   [1/ρu   0    u0] [f 0 0 0] [ R       t ] [X]
    [ṽ] = [ 0    1/ρv  v0] [0 f 0 0] [           ] [Y]
    [w̃]   [ 0     0     1] [0 0 1 0] [ 0_1x3   1 ] [Z]
                                                    [1]

          └──── intrinsic parameters ────┘ └ extrinsic parameters ┘

The product of the three matrices is the camera matrix.

MATLAB example

>> cam = CentralCamera('focal', 0.015, 'pixel', 10e-6, ...
       'resolution', [1280 1024], 'centre', [640 512], ...
       'name', 'mycamera')
cam =
name: mycamera [central-perspective]
  focal length:   0.015
  pixel size:     (1e-05, 1e-05)
  principal pt:   (640, 512)
  number pixels:  1280 x 1024
  T:
    1 0 0 0
    0 1 0 0
    0 0 1 0
    0 0 0 1
>> whos cam
  Name      Size      Bytes  Class          Attributes
  cam       1x1       112    CentralCamera

>> P = [0.3, 0.4, 3.0]';
>> cam.project(P)
ans =
   790
   712
>> cam.C
ans =
   1.0e+03 *
    1.5000         0    0.6400         0
         0    1.5000    0.5120         0
         0         0    0.0010         0
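The projected pixel coordinates above can be reproduced without the Toolbox; a minimal Python sketch using the same camera parameters (the helper name is ours):

```python
f, rho, u0, v0 = 0.015, 10e-6, 640, 512   # focal length, pixel size, centre

def project_pixels(P):
    """Pin-hole projection straight to pixel coordinates:
    u = (f/rho)(X/Z) + u0,  v = (f/rho)(Y/Z) + v0."""
    X, Y, Z = P
    return (f / rho * X / Z + u0, f / rho * Y / Z + v0)

u, v = project_pixels((0.3, 0.4, 3.0))    # matches cam.project(P): (790, 712)
```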

MATLAB example

>> [X,Y,Z] = mkcube(0.2, 'centre', [0.2, 0, 0.3], 'edge');
>> cam.mesh(X, Y, Z)
>> T = transl(-1,0,0.5)*troty(0.8);
>> cam.mesh(X, Y, Z, 'Tcam', T)

Fish-eye lens


Fish-eye imaging model


>> cam = FishEyeCamera('name', 'fisheye', ...


'projection', 'equiangular', ...
'pixel', 10e-6, ...
'resolution', [1280 1024])
>> [X,Y,Z] = mkcube(0.2, 'centre', [0.2, 0, 0.3], 'edge');
>> cam.mesh(X, Y, Z)

Imaging by reflection

From Opticks, Newton, 1704.

An Accompt of a New Catadioptrical Telescope invented by Mr. Newton, by Isaac Newton, Philosophical Transactions of the Royal Society, No. 81 (25 March 1672)


Catadioptric imaging model

>> cam = CatadioptricCamera('name', 'panocam', ...


'projection', 'equiangular', ...
'maxangle', pi/4, ...
'pixel', 10e-6, ...
'resolution', [1280 1024])
>> [X,Y,Z] = mkcube(0.2, 'centre', [0.2, 0, 0.3], 'edge');
>> cam.mesh(X, Y, Z)

Multi-view correspondence

The problem of finding the same point in different views of the same scene

The correspondence problem


The correspondence problem


The correspondence problem

[figure: two 9 × 9 windows of pixel grey values, one with values around 143-148, the other around 128-133]

Corner detector


Corner detector


[book pages: Sect. 13.3 Point Features, 13.3.1 Classical Corner Detectors]

Corner detector

More general approach: the structure tensor, a symmetric 2 × 2 matrix of Gaussian-weighted products of the image gradients, summarises the gradient distribution around each point. Its eigenvalues describe the local image structure and are invariant to rotation.

[book pages: Sect. 13.3 Point Features]

Convolution (Sect. 12.4.1)

[book pages: convolution of an image with a kernel; each output pixel is a weighted sum of the input pixels under the kernel window]

12.4.1.1 Smoothing

>> K = ones(21,21) / 21^2;
>> lena = iread('lena.pgm', 'double');
>> idisp( iconv(K, lena) );

Defocus involves a kernel which is a 2-dimensional Airy pattern or sinc function. The Gaussian function is similar in shape, but is always positive whereas the Airy pattern has low-amplitude negative-going rings.
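The box-filter smoothing above can be sketched in pure Python; a 'valid'-size 2-D convolution standing in for iconv (function names are ours):

```python
def conv2(image, K):
    """'Valid' 2-D convolution: each output pixel is the kernel-weighted
    sum of the image window it covers."""
    h, w = len(image), len(image[0])
    kh, kw = len(K), len(K[0])
    return [[sum(K[i][j] * image[r + i][c + j]
                 for i in range(kh) for j in range(kw))
             for c in range(w - kw + 1)]
            for r in range(h - kh + 1)]

K = [[1 / 9] * 3 for _ in range(3)]   # 3x3 box filter, like ones(3,3)/3^2
```

Convolving a constant image with a unit-sum kernel leaves it unchanged, a quick sanity check on the weights.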

[figure: surface plot of a convolution kernel over a ±15 pixel support, amplitude ±0.02]


Corner detector

[book pages: Sect. 13.3 Point Features]

Corner detectors should be applied to images that have not been compressed and decompressed: compression removes detail from the image, and detail is what defines a corner. The Harris detector is referred to in the literature as the Plessey corner detector.

Corner detector

The eigenvalues λ1 ≥ λ2 of the structure tensor classify the local image structure:

  λ1 small, λ2 small → constant region
  λ1 large, λ2 small → edge
  λ1 large, λ2 large → peak (corner)

Harris corner value

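A Harris-style corner strength folds the eigenvalue test into one score, det(A) - k trace(A)^2 of the structure tensor A. A minimal pure-Python sketch over a plain 3 × 3 window (no Gaussian weighting; all names are ours):

```python
def harris_response(I, k=0.04):
    """Harris-style corner strength: accumulate the structure tensor
    A = [[sum Iu^2, sum Iu*Iv], [sum Iu*Iv, sum Iv^2]] over a 3x3 window
    of gradient values and score det(A) - k * trace(A)**2."""
    h, w = len(I), len(I[0])
    # central-difference gradients on the interior grid
    Iu = [[(I[r][c + 1] - I[r][c - 1]) / 2 for c in range(1, w - 1)]
          for r in range(1, h - 1)]
    Iv = [[(I[r + 1][c] - I[r - 1][c]) / 2 for c in range(1, w - 1)]
          for r in range(1, h - 1)]
    R = {}
    for r in range(1, len(Iu) - 1):          # keys index the gradient grid
        for c in range(1, len(Iu[0]) - 1):   # (image pixel = r+1, c+1)
            a = b = d = 0.0
            for i in (-1, 0, 1):
                for j in (-1, 0, 1):
                    gu, gv = Iu[r + i][c + j], Iv[r + i][c + j]
                    a += gu * gu
                    b += gu * gv
                    d += gv * gv
            R[(r, c)] = (a * d - b * b) - k * (a + d) ** 2
    return R
```

On a synthetic image with one bright quadrant the response is positive at the corner, zero in flat regions, and negative along edges, matching the eigenvalue classification above.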

Harris corner features

>> b1 = iread('building2-1.png', 'grey', 'double');
>> idisp(b1)
>> C1 = icorner(b1, 'nfeat', 200, 'patch', 5);
7497 corners found (0.7%), 200 corner features saved
>> C1(1:4)
ans =
(600,662), strength=0.0054555, descrip= ..
(24,277), strength=0.0039721, descrip= ..
(54,407), strength=0.00393328, descrip= ..
(116,515), strength=0.00382975, descrip= ..
>> C1.plot()

Harris corner features


Image motion sequence

>> im = iread('bridge-l/*.png', 'roi', [20 750; 20 480]);


>> c = icorner(im, 'nfeat', 200, 'patch', 7);
>> ianimate(im, c, 'fps', 10)


Comparing features

We've found the coordinates of some interesting points in each image:

    { ¹p_i, i = 1 … N1 }        { ²p_j, j = 1 … N2 }

Now we need to determine the correspondence: which ¹p_i ↔ ²p_j?

The pixel values I[u, v] themselves are not sufficiently unique.

Feature matching

We use a W × W window of pixels centred on each corner point (W is odd)
We use a similarity metric to compare the windows
For each point ¹p_i we test the similarity against all the points { ²p_j, j = 1 … N2 } in the other image
This is an N1 × N2 search problem
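The brute-force N1 × N2 search can be written directly; a Python sketch using sum-of-squared-differences as the similarity metric (function names are ours):

```python
def ssd(A, B):
    """Sum of squared differences between two equal-size windows."""
    return sum((a - b) ** 2
               for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def match(windows1, windows2):
    """N1 x N2 search: for each window from image 1, return the index of
    the most similar (lowest-SSD) window from image 2."""
    return [min(range(len(windows2)), key=lambda j: ssd(w1, windows2[j]))
            for w1 in windows1]
```

Note that this always returns some best match, even for a window with no true correspondence, which is exactly the problem addressed on the following slides.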

MATLAB example

>> m = C1.match(C2)
m =
39 corresponding points (listing suppressed)
>> whos m
  Name    Size    Bytes   Class          Attributes
  m       1x39    416     FeatureMatch
>> m(1:5)
ans =
(999, 602) <-> (770, 624), dist=0.125548
(885, 570) <-> (659, 588), dist=0.131761
(251, 599) <-> (26, 588), dist=0.148539
(272, 647) <-> (42, 638), dist=0.161652
(591, 314) <-> (448, 290), dist=0.172520
>> idisp({b1, b2});
>> m.plot()

We need to separate good and bad matches

For tracking

>> t = Tracker(im, c)
200 continuing tracks, 41 new tracks, 0 retired
241 continuing tracks, 46 new tracks, 0 retired
287 continuing tracks, 34 new tracks, 0 retired
.
.

Tracking results


Problems with feature matching

Best match is not necessarily a good match
  obscuration

Non-unique matches
  many to one
  visual similarity

[figure: left and right views of the scene]

More problems with feature matching!

Large changes in viewpoint will distort the pattern of pixels:

  view direction
  perspective
  scale change
  rotation

We need a descriptor that is invariant to scale and rotation

Harris feature recap

Concise:
  hundreds of features instead of millions of pixels
  a description that is useful for the viewer and not cluttered with irrelevant information (Marr)
Finds points that are distinct and easily located in a different view of the same scene
Computationally efficient (good for real-time tracking)

Problems:
  finds only small scale features
  simple neighbourhood window matching is problematic
    with changes in scale and rotation
    with missing parts
  leading to erroneous matches

Epipolar
geometry
The geometry underlying
different views of the same
scene

Epipolar geometry

[figure: world point P (and a second point P′) viewed by two cameras {1} and {2}; P and the two camera centres define the epipolar plane, which cuts each image plane I1, I2 in an epipolar line through the projections ¹p and ²p]

Homogeneous coordinates

Cartesian → homogeneous:

    P = (x, y), P ∈ R²    →    P̃ = (x, y, 1)

homogeneous → Cartesian:

    P̃ = (x̃, ỹ, z̃)    →    P = (x, y),  x = x̃/z̃,  y = ỹ/z̃

lines and points are duals

A line in homogeneous form

A line is ℓ = (l1, l2, l3) such that

    ℓᵀ p̃ = 0

for every point p̃ = (x̃, ỹ, z̃) on it: the point equation of a line (compare y = mx + c).

Line joining two points

    p̃1 = (a, b, c),  p̃2 = (d, e, f)

    ℓ = p̃1 × p̃2

Point joining two lines

    ℓ1 = (a, b, c),  ℓ2 = (d, e, f)

    p̃ = ℓ1 × ℓ2

The line equation of a point. This handles the case of non-intersecting (parallel) lines automatically: the result is a point at infinity.
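Both constructions use the same cross product; a small Python sketch:

```python
def cross(a, b):
    """Cross product of two 3-vectors (works for points and lines alike)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

l1 = cross((0, 0, 1), (1, 1, 1))   # line through (0,0) and (1,1)
l2 = cross((0, 1, 1), (1, 0, 1))   # line through (0,1) and (1,0)
p = cross(l1, l2)                  # their intersection, homogeneous
x, y = p[0] / p[2], p[1] / p[2]    # -> (0.5, 0.5)
```

For parallel lines the third element of the result is zero, a point at infinity, so no special case is needed.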

Fundamental matrix

The epipolar line in image 2 is ²ℓ = F ¹p̃, and ²p̃ lies on it, ²p̃ᵀ ²ℓ = 0, so

    ²p̃ᵀ F ¹p̃ = 0

where F is the fundamental matrix.

Testing match validity

If a pair of points are genuinely corresponding then

    ²p̃ᵀ F ¹p̃ = 0

Now we just have to figure out F...

Computing F

F has special structure
  rank 2
  null vector is the epipole coordinate

Can be computed from 8 pairs of corresponding points

>> F = fmatrix(p1, p2);

but we don't know the correspondences...

The RANSAC algorithm

1. Take 8 random possible pairs
2. Compute F
3. Test all other pairs with this F and score how well they fit
4. Repeat N times and choose the F that best explains the most pairs
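The same loop works for any model; a Python sketch with a 2-point line fit standing in for the 8-point F estimate (all names are ours):

```python
import random

def ransac(points, trials=100, tol=0.1):
    """RANSAC sketch: fit a line a*x + b*y + c = 0 to a random minimal
    sample (2 points here, 8 pairs for F), count the points that fit it
    within tol, and keep the model that explains the most points."""
    best_model, best_inliers = None, []
    for _ in range(trials):
        (x1, y1), (x2, y2) = random.sample(points, 2)
        a, b, c = y2 - y1, x1 - x2, x2 * y1 - x1 * y2
        norm = (a * a + b * b) ** 0.5
        if norm == 0:
            continue                      # degenerate sample
        inliers = [p for p in points
                   if abs(a * p[0] + b * p[1] + c) / norm < tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (a, b, c), inliers
    return best_model, best_inliers
```

Because the score is an inlier count, a few wildly wrong correspondences cannot drag the estimate away from the consensus model.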

>> F = m.ransac(@fmatrix, 1e-4, 'verbose')
62 trials
6 outliers
5.94368e-05 final residual
F =
   -0.0000   -0.0000    0.0052
    0.0000   -0.0000    0.0010
   -0.0053   -0.0023    1.0682
>> idisp({b1,b2})
>> m.outlier.plot('r');

How did the camera move?

The essential matrix E relates normalised image coordinates:

    ²x̃ᵀ E ¹x̃ = 0

With p̃ ≃ K x̃ this becomes ²p̃ᵀ K2⁻ᵀ E K1⁻¹ ¹p̃ = 0, and the middle term must be F, so

    E = K2ᵀ F K1,    E ≃ S(t) R

Dealing with
scale


More problems with feature matching!

Large changes in viewpoint will distort the pattern of pixels:

  view direction
  perspective
  scale change
  rotation

We need a descriptor that is invariant to scale and rotation

Gaussian sequence


Laplacian of Gaussian sequence

The Laplacian is the sum of the second spatial derivatives in the horizontal and vertical directions. For a discrete image it is computed by convolution with the Laplacian kernel:

>> L = klaplace()
L =
     0     1     0
     1    -4     1
     0     1     0

which is isotropic: it responds equally to edges in any direction. The second derivative is even more sensitive to noise than the first derivative, so it is used in conjunction with a Gaussian smoothed image, which we combine into the Laplacian of Gaussian (LoG) kernel, known as the Mexican hat function.
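The LoG kernel has a standard closed form, LoG(u,v) = ((u² + v² - 2σ²)/σ⁴) G(u,v); a Python sketch of sampling it on a grid (the function name is ours):

```python
import math

def log_kernel(sigma, h):
    """Sample the Laplacian of Gaussian (Mexican hat) on a
    (2h+1) x (2h+1) grid:
    LoG(u,v) = (u^2 + v^2 - 2*sigma^2) / sigma^4 * G(u,v)."""
    def G(u, v):
        return math.exp(-(u * u + v * v) / (2 * sigma * sigma)) \
               / (2 * math.pi * sigma * sigma)
    return [[(u * u + v * v - 2 * sigma * sigma) / sigma ** 4 * G(u, v)
             for v in range(-h, h + 1)]
            for u in range(-h, h + 1)]
```

The centre is negative and the surrounding ring positive, the inverted "Mexican hat" shape; like the klaplace kernel it is isotropic.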


Stack the images

Find local maxima with respect to (u, v, σ)
The (u, v) coordinate is the position of a feature
The σ coordinate is the scale of the feature

Simple scale-space example


Laplacian of Gaussian sequence


Magnitude of LoG function


this peak is the
characteristic scale of the
feature


Characteristic scale


Scale Invariant Feature Transform

SIFT detector (Lowe, 2004)

>> s = isift(b1, 'nfeat', 200);
>> s.plot('clock');

Compare Harris and SIFT

[figure: the same image shown twice, axes u (pixels) and v (pixels): Harris corner features (left) and SIFT features (right)]

Epipolar magic example

[figure: image 1 and image 2]

Using multiple images

[figure: images 1 and 2 side by side, axes u (pixels) and v (pixels), with matched features joined]

Figure 14.3: Feature matching. Subset (100 out of 1664) of matches based on SURF descriptor similarity, which results in two vectors of thousands of SurfPointFeature objects. We note (low in the image) that some matches are clearly incorrect.


Epipolar magic

[figure: image 2 with epipolar lines overlaid, beside the two-camera epipolar geometry diagram with frames {1} and {2}, the epipolar plane and epipolar lines]

Figure 14.10: Image from Figure 14.1(a) showing epipolar lines converging on the projection of the second camera's centre. In this case the second camera is clearly visible in the bottom of the image.

3-dimensional vision

How big is it?


The Ames room


The Ames room


How do we estimate distance?

1. Occlusion
2. Height in visual field
3. Relative size
4. Texture density
5. Aerial perspective
6. Binocular disparity
7. Accommodation
8. Convergence
9. Motion perspective


Figure 1. Just-discriminable ordinal depth thresholds as a function of the logarithm of distance from the observer, from 0.5 to 10,000 m, for nine sources of information about layout. I assume that more potent sources of
information are associated with smaller depth-discrimination thresholds; and that these thresholds reflect
suprathreshold utility. This array of functions is idealized for the assumptions given in Table 1. From Perceiving
Layout and Knowing Distances: The Integration, Relative Potency, and Contextual Use of Different Information
About Depth, by J. E. Cutting and P. M. Vishton, 1995, in W. Epstein and S. Rogers (Eds.), Perception of Space
and Motion (p. 80), San Diego: Academic Press, Copyright 1995 by Academic Press. Reprinted with permission.

How the eye measures reality and virtual reality.
JAMES E. CUTTING, Cornell University, Ithaca, New York.
Behavior Research Methods, Instruments, & Computers, 1997, 29 (1), 27-36.

... (Chauvet et al., 1995; Hobbs, 1991) and Egyptian art (see Hagen, 1986; Hobbs, 1991), where it is often used alone, with no other information to convey depth. Thus, one can make a reasonable claim that occlusion was the first source of information discovered and used to depict spatial relations in depth. Because occlusion can never be more than ordinal information (one can only know that one object is in front of another, but not by how much) it may not seem im-...

... to the top, and assuming the presence of a ground plane, of gravity, and the absence of a ceiling (see Dunn, Gray, & Thompson, 1965). Across the scope of many different traditions in art, a pattern is clear: If one source of information about layout is present in a picture beyond occlusion, that source is almost always height in the visual field. The conjunction of occlusion and height, with no other sources, can be seen in the paintings at Chauvet; in classical Greek art and in Roman wall paintings;





No unique inverse


Binocular disparity



Stereo disparity
[figure: image-plane points, axes u (pixels) × v (pixels)]
points in the right image are shifted to the left


the shift is less for distant points

1954-59

Disparity
The horizontal displacement of an image point due to horizontal translation of the camera:

    d = f b / Z

where f is the focal length, b is the baseline and Z is the depth.
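As a minimal sketch (not Toolbox code), the relation d = f b / Z can be inverted to recover depth from a measured disparity; the focal length, baseline and disparity values below are illustrative assumptions, not numbers from the slides.

```python
# Depth from stereo disparity: d = f*b/Z  =>  Z = f*b/d.
# f (pixels), b (metres) and the disparities below are made-up example values.

def depth_from_disparity(d, f=1000.0, b=0.1):
    """Depth Z (m) from disparity d (pixels), focal length f (pixels), baseline b (m)."""
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return f * b / d

# nearer points have larger disparity
print(depth_from_disparity(50))   # 2.0 (m)
print(depth_from_disparity(10))   # 10.0 (m)
```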


2010


Prague



Anaglyph images



Anaglyph image




left


right


Shutter glasses


Computational stereo
Depth from 2 images


Disparity
The horizontal displacement of an image point due to horizontal translation of the camera (§14.3 Stereo Vision, Fig. 14.21):

    d = f b / Z

[figure: stereo matching: a template T centred at (uL, vL) in the left image is searched along the same row of the right image, over disparities up to dmax]
Computational stereo
[figure: left image and computed disparity map, axes u (pixels) × v (pixels); colourbar spans disparities of 40 to 85 pixels]
>> d = istereo(L, R, [40 90], 3);
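A call like istereo(L, R, [40 90], 3) searches a disparity range with a window half-width. The toy matcher below sketches the same idea with a sum-of-absolute-differences score; the Toolbox implementation is vectorised and uses a better similarity measure, and the images and parameters here are illustrative only.

```python
# A minimal sum-of-absolute-differences (SAD) block matcher: for a left-image
# pixel, pick the disparity whose right-image window matches best.
# Toy-sized and pure Python; real stereo code operates on whole images.

def sad(L, R, u, v, d, h):
    """SAD between (2h+1)^2 windows at (u, v) in L and (u - d, v) in R."""
    s = 0
    for dv in range(-h, h + 1):
        for du in range(-h, h + 1):
            s += abs(L[v + dv][u + du] - R[v + dv][u - d + du])
    return s

def disparity(L, R, u, v, d_range, h=1):
    """Disparity at (u, v): the candidate with the smallest SAD score."""
    return min(d_range, key=lambda d: sad(L, R, u, v, d, h))
```

Because the search is along a single row, this implicitly assumes rectified images, where corresponding points lie on the same scanline.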



[figure: disparity map, axes u (pixels) × v (pixels); colourbar spans disparities of 40 to 85 pixels]
Planar homography
The problem of finding the
same point in different views of
the same scene

Homography
[figure: camera frames {1} and {2} with image planes I1 and I2 viewing a point p on the object plane; the epipolar plane is shown, with plane normal n]

    p' = H p

Homography
The matrix H is called a homography.
The 3×3 matrix contains 9 elements, but the overall scale factor is arbitrary, that is, kH is the same as H.
So there are only 8 unique numbers to determine.
It can be computed from 4 or more corresponding point pairs.

    p' = H p


Corresponding points

pi = (ui, vi) in the first view, qi = (u'i, v'i) in the second view:

    P = | u1 u2 u3 u4 |      Q = | u'1 u'2 u'3 u'4 |
        | v1 v2 v3 v4 |          | v'1 v'2 v'3 v'4 |

>> H = homography(P, Q)
>> Q = homtrans(H, P)
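As a sketch of what a call like homography(P, Q) must do internally, the minimal 4-point case can be solved linearly by fixing the arbitrary scale with H[2,2] = 1; the function names here are hypothetical, and the Toolbox version handles more points and degenerate configurations.

```python
# Direct linear estimation of a homography from 4 point correspondences,
# fixing the scale by setting H[2,2] = 1 (8 equations, 8 unknowns).
import numpy as np

def homography_4pt(P, Q):
    """P, Q: 2x4 arrays of corresponding points; returns 3x3 H with Q ~ H P."""
    A, b = [], []
    for (u, v), (up, vp) in zip(P.T, Q.T):
        A.append([u, v, 1, 0, 0, 0, -up * u, -up * v]); b.append(up)
        A.append([0, 0, 0, u, v, 1, -vp * u, -vp * v]); b.append(vp)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def homtrans_like(H, P):
    """Apply H to 2xN points, including the perspective normalisation step."""
    Ph = np.vstack([P, np.ones(P.shape[1])])
    Qh = H @ Ph
    return Qh[:2] / Qh[2]
```

A quick self-check is to transform four points by a known H and confirm the estimate recovers it.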

Perspective rectification

    p' = H p

>> H = homography(p1, p2)
H =
    1.4003    0.3827  -136.5900
   -0.0785    1.8049   -83.1054
   -0.0003    0.0016     1.0000

Perspective rectification

[figure: perspective-rectified image, axes u (pixels) × v (pixels)]

    p' = H p

>> homwarp(H, im, 'full')
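A call like homwarp(H, im, 'full') warps an image by a homography. The sketch below shows the standard inverse-mapping idea for the pixels that land inside a same-sized output; the 'full' option (computing the enlarged output bounds) and proper interpolation are omitted, and the function name is hypothetical.

```python
# Inverse-mapping image warp: each output pixel (u', v') is mapped back through
# inv(H) to find where it came from in the input (nearest-neighbour sampling).
import numpy as np

def homwarp_like(H, im):
    """Warp 2-D array im by homography H (toy version: output same size as input)."""
    Hinv = np.linalg.inv(H)
    rows, cols = im.shape
    out = np.zeros_like(im)
    for vp in range(rows):
        for up in range(cols):
            x, y, w = Hinv @ (up, vp, 1.0)   # back-project and normalise
            u, v = int(round(x / w)), int(round(y / w))
            if 0 <= u < cols and 0 <= v < rows:
                out[vp, up] = im[v, u]
    return out
```

Inverse mapping is preferred over forward mapping because every output pixel gets exactly one value, leaving no holes.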

Virtual camera

The homography can be decomposed to recover the relative camera pose:

>> tr2rpy(sol(2).T, 'deg')
ans =
  -76.6202    9.4210  -13.8081

Optical flow
How points in an image move
as the camera moves


Optical flow patterns

[figure: flow fields for camera translations tx and tz; axes u (pixels) × v (pixels)]

>> cam.flowfield([1 0 0 0 0 0])

Optical flow equation

    udot = -(f/Z) tx + (u/Z) tz + (u v/f) wx - ((f^2 + u^2)/f) wy + v wz
    vdot = -(f/Z) ty + (v/Z) tz + ((f^2 + v^2)/f) wx - (u v/f) wy - u wz

(u, v) are distances from the principal point
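The flow equation is linear in the camera velocity, so it can be written as a 2×6 image Jacobian. A small sketch, assuming f is in pixel units and (u, v) are measured from the principal point as the slide notes; the function names are illustrative.

```python
# Image Jacobian (interaction matrix) for one point feature:
# (udot, vdot) = J(u, v, Z) @ nu, with nu = (tx, ty, tz, wx, wy, wz).

def image_jacobian(u, v, Z, f):
    return [
        [-f / Z, 0.0,    u / Z, u * v / f,        -(f**2 + u**2) / f,  v],
        [0.0,    -f / Z, v / Z, (f**2 + v**2) / f, -u * v / f,        -u],
    ]

def flow(u, v, Z, f, nu):
    """Image-plane velocity (udot, vdot) of the point for camera velocity nu."""
    J = image_jacobian(u, v, Z, f)
    return [sum(Jij * nj for Jij, nj in zip(row, nu)) for row in J]
```

Note how depth Z appears only in the translational columns: rotation produces flow that is independent of depth, which underlies the translation/rotation ambiguity shown next.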


Ambiguity between translation and rotation

[figure: flow fields for translation tx and for a rotation about the camera y-axis are nearly indistinguishable; axes u (pixels) × v (pixels)]

[figure: flow fields for translation tx with small f and large f; axes u (pixels) × v (pixels)]

Motion of multiple points

Consider the case of three points; stacking the optical flow equation for each point gives, in matrix form,

    pdot = Jp nu

where pdot stacks the three image-point velocities and Jp is the 6×6 stacked image Jacobian (§15.2.2, Controlling Feature Motion).

Inverting the problem

Given a feature velocity we can compute the required camera velocity. To determine the feature velocity, the simplest strategy is a controller that moves the features toward their desired values: the required point velocity to move from p to p* is

    pdot* = lambda (p* - p)

and the camera velocity that achieves it is

    nu = Jp^-1 pdot*
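One step of this image-based control law can be sketched as follows; a pseudo-inverse is used so the same code covers more (or fewer) than three points, and the depth, focal length and gain values in any use are assumptions, not slide values.

```python
# One step of image-based visual servoing: nu = lambda * pinv(Jp) @ (p* - p).
# Self-contained toy: one shared depth Z for all features.
import numpy as np

def image_jacobian(u, v, Z, f):
    """Interaction matrix for one point feature ((u, v) from the principal point)."""
    return np.array([
        [-f / Z, 0, u / Z, u * v / f, -(f**2 + u**2) / f, v],
        [0, -f / Z, v / Z, (f**2 + v**2) / f, -u * v / f, -u],
    ])

def ibvs_step(p, p_star, Z, f, lam=0.5):
    """Camera velocity nu driving stacked features p (2N,) toward p_star (2N,)."""
    J = np.vstack([image_jacobian(u, v, Z, f)
                   for u, v in p.reshape(-1, 2)])
    return lam * np.linalg.pinv(J) @ (p_star - p)
```

With exactly three points the stacked Jacobian is square and the pseudo-inverse reduces to the ordinary inverse, as in the slide; with more points it gives the least-squares velocity.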

Desired view


Current view


Image plane motion


Image plane motion

[figure: feature point coordinates (u, v) on the image plane]


IBVS simulation


New concepts
Image convolution
Gaussian, Laplacian operators

Homogeneous coordinates
Feature detectors
Scale-space
Geometry of image formation
Fundamental and essential matrix
Homography
Image correspondence
RANSAC


Further reading

Peter Corke
Robotics, Vision and Control

The practice of robotics and computer vision each involve the application of computational algorithms to data. The research community has developed a very large body of algorithms but for a newcomer to the field this can be quite daunting.

isbn 978-3-642-20143-1

Corke

Robotics, Vision and Control

For more than 10 years the author has maintained two open-source MATLAB Toolboxes, one for robotics and one for vision.
They provide implementations of many important algorithms and
allow users to work with real problems, not just trivial examples.
This new book makes the fundamental algorithms of robotics,
vision and control accessible to all. It weaves together theory, algorithms and examples in a narrative that covers robotics and computer vision separately and together. Using the latest versions
of the Toolboxes the author shows how complex problems can be
decomposed and solved using just a few simple lines of code.
The topics covered are guided by real problems observed by the
author over many years as a practitioner of both robotics and
computer vision. It is written in a light but informative style, it is
easy to read and absorb, and includes over 1000 matlab and
Simulink examples and figures. The book is a real walk through
the fundamentals of mobile robots, navigation, localization, arm-robot kinematics, dynamics and joint level control, then camera
models, image processing, feature extraction and multi-view
geometry, and finally bringing it all together with an extensive
discussion of visual servo systems.

Robotics, Vision and Control: Fundamental Algorithms in MATLAB

springer.com


Homework
How could a robot vision system exploit other (non
binocular) depth cues to determine distance?
How could a robot recognize a place (kitchen,
bathroom, garden) in a way that is invariant to:
lighting levels
position of the robot
small changes in the environment

