
8/20/2014

Computer Vision
Mubarak Shah
shah@crcv.ucf.edu

Computer Vision

The ability of computers to see.

Image Understanding
Machine Vision
Robot Vision
Image Analysis
Video Understanding


A picture is worth a thousand words.

A word is worth a thousand pictures.

A HUNT


Image

2-D array of numbers (intensity values, gray levels)
Gray levels range from 0 (black) to 255 (white)
A color image is three 2-D arrays of numbers

Red
Green
Blue

Resolution (number of rows and columns)

128X128
256X256
512X512
640X480
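
The representation above can be sketched in a few lines. This is only an illustration, assuming NumPy (which the slides do not mention) and made-up pixel values:

```python
import numpy as np

# A gray-level image: one 2-D array with values from 0 (black) to 255 (white).
rows, cols = 128, 128
gray = np.zeros((rows, cols), dtype=np.uint8)
gray[32:96, 32:96] = 255  # white square on a black background

# A color image: three 2-D arrays (red, green, blue) of the same size.
red = np.full((rows, cols), 255, dtype=np.uint8)
green = np.zeros((rows, cols), dtype=np.uint8)
blue = np.zeros((rows, cols), dtype=np.uint8)
color = np.stack([red, green, blue], axis=-1)  # shape: rows x cols x 3

print(gray.shape, gray.min(), gray.max())
print(color.shape)
```

The resolution is simply the shape of the array; a 640X480 image would be stored here as 480 rows by 640 columns.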


Image Formats
TIF
PGM
PBM
GIF
JPEG
RAW

Video

Sequence of frames
30 frames per second

Formats

AVI
MPEG
QuickTime


Video Clip

Sequence of Images


Image Formation

Light Source
Camera (extrinsic and intrinsic parameters)
Scene (surface reflectance, surface shape)

Perspective Projection (Pinhole)

Figure: a world point (X, Y, Z) is projected through the lens onto the image plane at distance f, giving image coordinate y.
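
The pinhole geometry can be illustrated with a small sketch. It assumes camera-centered coordinates with Z along the optical axis and a virtual image plane in front of the pinhole, so the image is not inverted; with the physical image plane behind the pinhole the coordinates change sign. The function name and sample values are hypothetical.

```python
def perspective_project(X, Y, Z, f):
    """Project world point (X, Y, Z) onto an image plane at distance f.

    Assumes Z is measured along the optical axis and the (virtual) image
    plane lies in front of the pinhole, so x = f*X/Z and y = f*Y/Z.
    """
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return f * X / Z, f * Y / Z


# The same point moved twice as far away projects at half the image size.
print(perspective_project(2.0, 1.0, 4.0, f=1.0))  # (0.5, 0.25)
print(perspective_project(2.0, 1.0, 8.0, f=1.0))  # (0.25, 0.125)
```

Image coordinates shrink in proportion to depth, which is why distant objects appear smaller.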


Orthographic Projection

Figure: a world point (X, Y, Z) is projected onto the image plane along rays parallel to the optical axis, giving image coordinate y.
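
As a minimal sketch of orthographic projection, assuming the image plane is perpendicular to the Z axis (the function name and sample values are hypothetical):

```python
def orthographic_project(X, Y, Z):
    """Project along the optical axis: the depth Z is simply dropped."""
    return X, Y


# Unlike a pinhole camera, moving the point in depth does not change its image.
print(orthographic_project(2.0, 1.0, 4.0))  # (2.0, 1.0)
print(orthographic_project(2.0, 1.0, 8.0))  # (2.0, 1.0)
```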

Shape from X

Recover 3-D shape from 2-D image(s)

Stereo
Motion
Shading
Texture
Contours


Stereo

http://www.vision3d.com/stereo.html


Renault Stereo Pair

Depth Map
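
The slides do not give a formula for going from a stereo pair to a depth map. One common relation, assuming two rectified, parallel cameras, is depth = focal length x baseline / disparity. The sketch below uses that assumption; the function name and all numbers are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth (meters) of a point seen by two rectified, parallel cameras.

    disparity_px: horizontal shift of the point between the two images (pixels)
    focal_px:     focal length expressed in pixels
    baseline_m:   distance between the two camera centers (meters)
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Larger disparity means the point is closer to the cameras.
print(depth_from_disparity(20.0, focal_px=800.0, baseline_m=0.5))  # 20.0
print(depth_from_disparity(40.0, focal_px=800.0, baseline_m=0.5))  # 10.0
```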


Shape from Shading

Lambertian Model

S = L, light source direction
I = S · N (N: surface normal)
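
The Lambertian relation I = S · N can be tried out numerically. This sketch assumes NumPy, unit albedo, normalized vectors, and clamping to zero for surfaces facing away from the light; none of these details are stated on the slide.

```python
import numpy as np


def lambertian_intensity(normal, light):
    """Return I = S . N for unit vectors S (light) and N (surface normal).

    Assumptions: unit albedo, and negative values (surface facing away
    from the light) are clamped to 0.
    """
    n = np.asarray(normal, dtype=float)
    s = np.asarray(light, dtype=float)
    n = n / np.linalg.norm(n)
    s = s / np.linalg.norm(s)
    return max(0.0, float(np.dot(s, n)))


# A surface patch facing the camera, lit from three directions.
normal = (0.0, 0.0, 1.0)
for light in [(0, 0, 1), (1, 0, 1), (1, 0, 0)]:
    print(light, round(lambertian_intensity(normal, light), 3))
```

The patch is brightest when the light is aligned with the normal and goes dark as the light moves to a grazing direction. Shape from shading tries to invert this relation and recover N from the observed I.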


Vase

(1, 0, 1)

(-1, 1, 1)

(-1,-1, 1)

Shape from Texture


Visual Motion

Shape from Motion: Moving Light Display


Shape from Motion

Photosynth


Sequence

Raw Optical flow


Video Clip & Mosaic


Applications of Computer Vision

Face Recognition
Object Recognition
Video Surveillance and Monitoring

Object detection, tracking and behavior analysis

Remote Sensing: UAVs


Robotics
Computer Graphics

Object Recognition
Finding People in images
Problem 1: Given an image I
Question: Does I contain an image of a person?


Yes Instances

No Instances


Localize People (Human Detection)

Human Detection


Airplanes

Motor Cycles


Face Recognition

University of Central Florida

CRCV | Center for Research in Computer Vision

What are wild faces?

Taming Wild Faces: Web-Scale, Open-Universe Face Recognition


Three Tasks of Face Recognition

Most Research Focus

Pair-Matching Task

Most Realistic Scenario


Open-Universe Face Identification


Bob
Barack Obama

Alice

News Article: Label Important Figures

Social Network: Tag Facebook Friends


Open-Universe Face Identification


Find Angelina Jolie and George Clooney


Open-Universe Face Identification

Gwyneth Paltrow
Robert Downey Jr.


Face Recognition in Movie Trailers via
Mean Sequence Sparse Representation-based Classification

CVPR 2013


Recognition Qualitative
Known:
Paul Rudd
Steve Carell


Recognition Qualitative
Known:
Owen Wilson
Reese Witherspoon
Paul Rudd

Recognition Qualitative
Known:
Bruce Willis
Morgan Freeman


Recognition Qualitative
Known:
Tom Cruise
Cameron Diaz

FACIAL EXPRESSIONS

RAISE EYEBROWS

SMILE


Detecting Driver Alertness

Lipreading


Video Surveillance and Monitoring

Object detection

Object tracking

Object categorization and classification

Event or activity recognition

Automated Surveillance System (Detection & Tracking)

A.I Person Tracking

A.II Part Tracking


NONA: Project Overview

Part of the WAS (Wide Area Surveillance) project executed by the
Homeland Security Advanced Research Projects Agency (HSARPA)
Current Sensor
8 high-resolution cameras
Provides a 100-megapixel, 360° field of view
Frame rate: 5 frames per second
Next Generation
48 cameras with significantly higher resolution
Smaller size

Current Sensor

Next Generation

NONA System: Airport
Sequence 1


UAV: Unmanned Aerial Vehicle

UAVs: Unmanned Aerial Vehicles (Drones)

Predator
Global Hawk

Microdrone


KINGFISHER AEROSTAT BALLOON


Computer Vision Lab, UCF


COCOA System Flow

Aerial Video + Telemetry* -> Ego Motion Compensation -> Motion Detection -> Object Tracking -> Event Detection & Indexing

Ego Motion Compensation: Feature Based + Gradient Based (output: Registered Images)
Motion Detection: Accumulative Frame Differencing + Background Modeling + Object Segmentation
Object Tracking: Kernel Tracking + Blob Tracking + Occlusion Handling (output: Tracks)
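
To give a feel for the accumulative frame differencing step named in the motion detection stage, here is a minimal sketch. It is not COCOA's implementation: it assumes NumPy, frames that are already registered (ego motion removed), and a hand-picked threshold, and it uses a synthetic sequence.

```python
import numpy as np


def accumulative_frame_difference(frames, threshold):
    """Sum |frame_t - frame_(t-1)| over registered frames, then threshold.

    Returns a boolean mask that is True where accumulated change is large.
    """
    frames = [np.asarray(f, dtype=np.int32) for f in frames]
    accumulated = np.zeros_like(frames[0])
    for prev, curr in zip(frames, frames[1:]):
        accumulated += np.abs(curr - prev)
    return accumulated > threshold


# Synthetic sequence: a bright 2x2 object moving right over a static background.
frames = []
for t in range(4):
    frame = np.full((6, 8), 10, dtype=np.uint8)
    frame[2:4, t:t + 2] = 200
    frames.append(frame)

mask = accumulative_frame_difference(frames, threshold=50)
print(mask.astype(int))
```

The mask marks the strip swept by the moving object while the static background stays zero. A real system would add background modeling and segmentation, as the slide lists.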


Registration Result I

Aerial Video - EO

Alignment

Mosaic

Mask

Registration Result II

Aerial Video - IR

Alignment

Mosaic

Mask


Detection Results

Tracking Results


Wide Area Surveillance


Tracking Results

Robot Vision (Unmanned Ground Vehicle)

UGV


UGV

Human Action Recognition


Events, Actions, Activities, ...

Action
Event
Movement
Activity
Interaction
Verb
...

Weizmann Action Dataset
10 actions
9 actors per action

Bend

Jack

Sidestep

Hop

Wave 2 hands

Walk

Wave 1 hand

Skip

Jump in place

Run


High Density Crowded Scenes

Political Rallies

Religious Festivals

Marathons

High Density Moving Objects

Tracking in Crowds

Average chip size 14 x 22 pixels


492 Frames
Selected 199 athletes for tracking
Successfully tracked 143 athletes
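
The two counts above imply a success rate, which a short sketch makes explicit (the function name is hypothetical; the numbers are from the slide):

```python
def success_rate(tracked, selected):
    """Fraction of the selected targets that were tracked successfully."""
    return tracked / selected


# 143 of the 199 selected athletes were tracked successfully.
print(f"{100 * success_rate(143, 199):.1f}%")  # 71.9%
```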


Results

Experiment 1


Experiment 3

Average chip size 14 x 17 pixels


453 Frames
Selected 50 athletes for tracking

Experiment 3


Behaviors in Crowded Scenes

Bottleneck

Departure

Lanes

Arch/Rings

Blockings


Ground truth = 634, Proposed Method = 640

Ground truth = 1567, Proposed Method = 1590

Ground truth = 1428, Proposed Method = 1468

Ground truth = 653, Proposed Method = 673

Ground truth = 2322, Proposed Method = 2203

Ground truth = 2319, Proposed Method = 2496
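
One way to compare these six result pairs is by absolute and relative error. The slides do not say which metric was used, nor what quantity is being counted, so the choice of relative error below is an assumption; the numbers themselves are copied from the slides.

```python
# (ground truth, proposed method) pairs as reported on the slides.
results = [
    (634, 640),
    (1567, 1590),
    (1428, 1468),
    (653, 673),
    (2322, 2203),
    (2319, 2496),
]


def relative_error(truth, estimate):
    """Absolute difference as a fraction of the ground truth."""
    return abs(estimate - truth) / truth


for truth, estimate in results:
    print(
        f"truth={truth} estimate={estimate} "
        f"abs_err={abs(estimate - truth)} "
        f"rel_err={100 * relative_error(truth, estimate):.1f}%"
    )

mean_rel = sum(relative_error(t, e) for t, e in results) / len(results)
print(f"mean relative error: {100 * mean_rel:.1f}%")
```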


Where Am I?

Problem: Accurate Image Localization

Input: Mere Visual Information (Images)

Output: Location in Terms of Longitude (Lon.) and Latitude (Lat.)

Lat. = 40.4419, Lon. = -79.9986
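
A localization error in meters can be obtained from two latitude/longitude pairs with the haversine formula. The slides do not say how errors were measured, so this is only a sketch under that assumption. It also assumes that 40.4419 is the latitude and -79.9986 the longitude, that the Earth is a sphere with a mean radius of 6,371 km, and the estimated location is hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


true_location = (40.4419, -79.9986)  # from the slide (lat/lon order assumed)
estimate = (40.4420, -79.9986)       # hypothetical estimate, 0.0001 deg north
print(f"{haversine_m(*true_location, *estimate):.1f} m")
```

A shift of 0.0001 degrees in latitude is on the order of ten meters.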


Qualitative Image Localization Results:

68.3%: The best match found:
Query / Match pairs with errors of 9.01 m, 7.5 m, 6.4 m, 62.7 m

13.2%: A correct match found, but not the best one:
Query / Match pairs with errors of 244.7 m, 307.7 m, 157.3 m, 71.3 m


Geospatial Trajectory Extraction

Visual Business Recognition
ACM Multimedia 2013


Computer Vision for Computer Graphics

Video Completion


Layer Based Video Composition

Results of Doll


Results of Mom-Daughter

Multimedia: Segmentation of Moving-Sounding Objects

Accepted in IEEE Transactions on Multimedia
