You are on page 1of 55

Computer Vision

Basic Info
Instructor: Paolo Favaro (paolo.favaro@iam.unibe.ch)
Teaching Assistant: Thoma Papadhimitri
(thoma.papadhimitri@iam.unibe.ch)
Course webpage: http://cvg.unibe.ch/teaching.html
Course material acknowledgements:
Svetlana Lazebnik University of Illinois
Steve Seitz University of Washington
Jan Koenderink University of Leuven
Kristen Grauman University of Texas at Austin
Andrea Vedaldi University of Oxford
Srinivas Narasimhan Carnegie Mellon University
Textbooks
Forsyth & Ponce, Computer
Vision: A Modern Approach

Richard Szeliski, Computer
Vision: Algorithms and
Applications (available online)


Kristen Grauman and Bastian Leibe, Visual Object
Recognition (pdf available online)
Course requirements
3 Assignments
Deadlines specified in the next slide
Each assignment has a maximum score of 100 and a pass is 60
To register for the exam one needs at least a pass on each assignment
Each assignment to be made available in Ilias
Assignments will require use of MATLAB (tutorial will be provided)

Exercises
Weekly
Aim is exam preparation
Exam
There will be a final exam on 11 February 2014 (duration 120 mins)
Final Mark
70% Exam and 30% Assignments
Assignments
First assignment: Photometric Stereo
Available on 8/10
Due on 29/10

Second assignment: Uncalibrated Stereo
Available on 29/10
Due on 19/11

Third assignment: Object detection via Bag-of-Words
Available on 19/11
Due on 10/12
Academic integrity policy
Feel free to discuss assignments with each other, but coding
must be done individually
Feel free to incorporate code or tips you find on the Web,
provided this doesnt make the assignment trivial and you
explicitly acknowledge your sources

Remember: we can Google too!
The goal of computer vision
To extract meaning from pixels
What we see What a computer sees
Source: S. Narasimhan
The goal of computer vision
To extract meaning from pixels
Source: 80 million tiny images by Torralba et al.
Humans are remarkably good at this!
WhaL klnd of lnformauon can be
exLracLed from an lmage?
!"#$"%&'( lnformauon )"$*+,( lnformauon
bulldlng
person
Lrashcan
car car
ground
Lree
Lree
sky
door
wlndow
bulldlng
roof
chlmney
!"#$%%& ()*+*
,-#. /"&%0*1+
2
Why study computer vision?
Vision is useful
Vision is interesting
Vision is difficult
Half of primate cerebral cortex is devoted to visual
processing
Achieving human-level visual perception is probably
AI-complete
Successes of computer vision to date
Optical character recognition (OCR)
Source: S. Seitz, N. Snavely
Digit recognition
yann.lecun.com
Llcense plaLe readers
hup://en.wlklpedla.org/wlkl/AuLomauc_number_plaLe_recognluon

Sudoku grabber
hup://sudokugrab.blogspoL.com/
AuLomauc check processlng
Biometrics
Fingerprint scanners on
many new laptops,
other devices
Face recognition systems now
beginning to appear more widely
http://www.sensiblevision.com/

Source: S. Seitz
Biometrics
How the Afghan Girl was Identified by Her Iris Patterns
Source: S. Seitz
Face detection
Many consumer digital cameras now detect faces
Source: S. Seitz
Smile detection
Sony Cyber-shot T70 Digital Still Camera
Source: S. Seitz
Face recognition: Apple iPhoto software
http://www.apple.com/ilife/iphoto/
Mobile visual search: Google Goggles
Automotive safety
Mobileye: Vision systems in high-end BMW, GM, Volvo models
Pedestrian collision warning
Forward collision warning
Lane departure warning
Headway monitoring and warning
Source: A. Shashua, S. Seitz
Google self-driving cars
Vision-based interaction: Xbox Kinect
3D Reconstruction: Kinect Fusion
YouTube Video
3D Reconstruction: Multi-View Stereo
YouTube Video
Google Maps Photo Tours
http://maps.google.com/phototours
Special effects: shape and motion capture
Source: S. Seitz
Vision for robotics, space exploration
NASA'S Curiosity Rover has a system consisting of 17 cameras
Why is computer vision difficult?
Challenges: viewpoint variation
Michelangelo 1475-1564 slide credit: Fei-Fei, Fergus & Torralba
Challenges: illumination
image credit: J. Koenderink
Challenges: scale
slide credit: Fei-Fei, Fergus & Torralba
Challenges: deformation
Xu, Beihong 1943
slide credit: Fei-Fei, Fergus & Torralba
Challenges: occlusion, clutter
Image source: National Geographic
Challenges: Motion
Challenges: object intra-class
variation
slide credit: Fei-Fei, Fergus & Torralba
Challenges: local ambiguity
slide credit: Fei-Fei, Fergus & Torralba
Challenges: local ambiguity
Source: Rob Fergus and Antonio Torralba
Challenges: local ambiguity
Source: Rob Fergus and Antonio Torralba
Challenges: Inherent ambiguity
Many different 3D scenes could have given rise to a
particular 2D picture












Challenges or opportunities?
Images are confusing, but they also reveal the structure of
the world through numerous cues
Our job is to interpret the cues!
Image source: J. Koenderink
Depth cues: Linear perspective
Depth cues: Aerial perspective
Depth ordering cues: Occlusion
Source: J. Koenderink
Shape cues: Texture gradient
Shape and lighting cues: Shading
Position and lighting cues: Cast shadows
Source: J. Koenderink
Grouping cues: Similarity (color, texture,
proximity)
Grouping cues: Common fate
Image credit: Arthus-Bertrand (via F. Durand)
Origins of computer vision
L. G. Roberts,
Machine Perception of Three
Dimensional Solids, Ph.D. thesis,
MIT Department of Electrical
Engineering, 1963.
Connections to other disciplines
Computer Vision
Image Processing
Machine Learning
Artificial Intelligence
Robotics
Cognitive science
Neuroscience
Computer Graphics
The computer vision industry
A list of companies here:

http://www.cs.ubc.ca/spider/lowe/vision.html
Course overview
I. Image formation: Camera and projection models
II. Radiometry and shading
I. Calibrated/Uncalibrated photometric stereo
III. Early vision: Image filtering, edges, interest points,
denoising and deblurring
IV. Epipolar geometry, stereo and structure from motion
V. Tracking, optical flow and registration
VI. Mid-level vision: Clustering and segmentation
VII. Recognition
I. Bag of Features and Support Vector Machines
VIII. Deformable parts models
IX. Special topics
Early vision
Cameras and sensors
Light and color
Linear filtering
Edge detection
*
=
Feature extraction: corner and blob detection
Basic image formation and processing
Mid-level vision
Fitting: Least squares
RANSAC
Alignment
Fitting and grouping
Multi-view geometry
Structure from motion
Stereo Epipolar geometry
3D Photography
Recognition
Instance recognition, large-scale alignment
Image classification
Sliding window detection
Part-based models

You might also like