
Compression

Prof. Dr.-Ing. Lars Wolf
TU Braunschweig
Institut für Betriebssysteme und Rechnerverbund
Mühlenpfordtstraße 23, 38106 Braunschweig, Germany
Email: wolf@ibr.cs.tu-bs.de

Scope

[Figure: layered overview of the multimedia field.
 Usage: Applications, Learning & Teaching, Design, User Interfaces.
 Services: Content Processing, Documents, Security, Synchronization, Group Communications, ...
 Systems: Databases, Programming, Media-Server, Operating Systems, Communications, Opt. Memories, Quality of Service, Networks.
 Basics: Compression, Computer Architectures, Image & Graphics, Animation, Video, Audio.]

Contents

1. Motivation
2. Requirements - General
3. Fundamentals - Categories
4. Source Coding
5. Entropy Coding
6. Hybrid Coding
7. JPEG
8. H.261 and related ITU Standards
9. MPEG-1
10. MPEG-2
11. MPEG-4
14. Conclusion

1. Motivation

Storage and data rate demands of uncompressed digital media in computing:

Text:
  1 page with 80 char/line, 64 lines/page, and 2 Byte/char
  80 x 64 x 2 x 8 = 80 kBit/page
Image:
  24 Bit/pixel, 512 x 512 pixel/image
  512 x 512 x 24 = 6 MBit/image
Audio:
  CD quality, sample rate 44.1 kHz, 16 Bit/sample
  Mono: 44.1 x 16 = 706 kBit/s
  Stereo: 1.412 MBit/s
Video:
  full frames with 1024 x 1024 pixel/frame, 24 Bit/pixel, 30 frames/s
  1024 x 1024 x 24 x 30 = 720 MBit/s
  more realistic: 360 x 240 pixel/frame = 60 MBit/s

(The text, image, and video figures use binary prefixes: 1 kBit = 1024 Bit, 1 MBit = 1024 x 1024 Bit.)

Hence compression is necessary
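The motivation figures can be re-computed with a few lines of Python. This is only a sketch for checking the arithmetic; the function names and default parameters are chosen here for illustration, and binary prefixes (1 kBit = 1024 Bit) are assumed for the text, image, and video values because that is what reproduces the round numbers on the slide.

```python
KIBI = 1024          # assumed binary prefix for "kBit"
MEBI = 1024 * 1024   # assumed binary prefix for "MBit"


def text_page_bits(chars_per_line=80, lines_per_page=64, bytes_per_char=2):
    """Bits needed for one uncompressed page of text."""
    return chars_per_line * lines_per_page * bytes_per_char * 8


def image_bits(width=512, height=512, bits_per_pixel=24):
    """Bits needed for one uncompressed still image."""
    return width * height * bits_per_pixel


def audio_bits_per_second(sample_rate_hz=44_100, bits_per_sample=16, channels=1):
    """Data rate of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels


def video_bits_per_second(width, height, bits_per_pixel=24, frames_per_second=30):
    """Data rate of uncompressed video."""
    return width * height * bits_per_pixel * frames_per_second


print(text_page_bits() / KIBI)                   # 80.0 kBit per page
print(image_bits() / MEBI)                       # 6.0 MBit per image
print(audio_bits_per_second() / 1000)            # 705.6 kBit/s mono
print(video_bits_per_second(1024, 1024) / MEBI)  # 720.0 MBit/s
```

The "more realistic" 360 x 240 video comes out at roughly 59 MBit/s with the same function, which the slide rounds to 60 MBit/s.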
2. Requirements - General

[Figure: compression surrounded by its general requirements]
  low delay
  intrinsic scalability
  high quality
  low complexity (e.g., ease of decoding)
  efficient implementation (e.g., memory req.)

Requirements

Dialogue and retrieval mode requirements:
  Independence of frame size and video frame rate
  Synchronization of audio, video, and other media
Dialogue mode requirements:
  Compression and decompression in real-time (e.g. 25 frames/s)
  End-to-end delay < 150 ms
  Symmetric: compression and decompression take the same time
Retrieval mode requirements:
  Fast forward and backward data retrieval
  Random access within 1/2 s
  Asymmetric: compression takes longer than decompression
Software and/or hardware-assisted implementation requirements

3. Fundamentals - Categories

entropy coding
  - ignoring semantics of the data
  - lossless
source coding
  - based on semantic of the data
  - often lossy
channel coding
  - adaptation to communication channel
  - introduction of redundancy
hybrid coding
  - source coding combined with entropy encoding

Categories and Techniques

  Entropy Coding    Run-Length Coding
                    Huffman Coding
                    Arithmetic Coding

  Source Coding     Prediction            DPCM
                                          DM
                    Transformation        FFT
                                          DCT
                    Layered Coding        Bit Position
                                          Subsampling
                                          Sub-Band Coding
                    Vector Quantization

  Hybrid Coding     JPEG
                    MPEG
                    H.261, H.263
                    proprietary: Quicktime, ...
Categories & Techniques, Cont.

Two principal possibilities
  1. Entropy Coding: Eliminate Redundancy (thus, lossless)
  2. Reduction Coding: Eliminate Irrelevance / Low-Relevance (lossy)
Preparatory Step: Decorrelation - Eliminate Interdependencies
  this is the essence of source coding
  changes "representation" of media
  goal usually: reduce dependencies between data
  as such, is a preparatory step!! and usually, does not compress
Hybrid coding steps (often): decorrelation - reduction - entropy cod.
  often: reduction by quantization
  last step: additional compression without harm
note: literature usually uses terms as in last slide!!
note: reduction coding is "smart deletion", not really "compression"

Categories & Techniques: Symmetric / Asymmetric

Major distinction: Symmetric / Asymmetric
Asym. (usually): more effort for compression
  o.k. if compression non real-time, "only once" (movie!)
  may involve number-crunchers (...owned by content provider)
Symmetric: "required" for real-time, e.g., videoconferencing
  in reality, often not 100% symmetric

Categories & Techniques: Further Considerations

Further considerations include, e.g.,
  Adjustable compression rate? ...quality?
  "smooth" bit stream ("isochronous")?
    terms: CBR (const. bit rate) vs. VBR (variable bit rate)
    may be "over time": e.g., packet size Big Small Small Big Small Small ...
    may be simulated w/ loop-back filter plus buffer
  "progressive" (mainly: non-continuous media): display-while-download
  "streaming": ~ same for video (here, rather an issue of software)
  more subtle issues
    "open" standard?
    good "performance" (ratio, speed) for all kinds of media?
    bullet-proof, well-understood?
    ...

4. Source Coding

DPCM

DPCM = Differential Pulse-Code Modulation
Assumptions:
  Consecutive samples or frames have similar values
  Prediction is possible due to existing correlation
Fundamental steps:
  Predict next data based on previously processed data
  Determine difference between actual next data and prediction
  Code difference only
  [Figure: previous data 1000, actual next data 1005; the prediction is 1000, so only the difference 5 is coded]
Challenge: optimal predictor
Delta modulation (DM): 1 bit as difference signal
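The three DPCM steps can be sketched in Python as follows. This is a minimal, lossless illustration under stated assumptions: the predictor is simply "next value equals previous value" (the slide leaves the predictor open), the first prediction is 0, and the differences are not quantized. The function names are made up for this example.

```python
def dpcm_encode(samples):
    """Code each sample as the difference to a prediction.

    Assumed predictor: the previously processed sample (0 before the first one).
    """
    prediction = 0
    differences = []
    for sample in samples:
        differences.append(sample - prediction)  # code the difference only
        prediction = sample                      # prediction for the next sample
    return differences


def dpcm_decode(differences):
    """Rebuild the samples by adding each difference to the running prediction."""
    prediction = 0
    samples = []
    for difference in differences:
        value = prediction + difference
        samples.append(value)
        prediction = value
    return samples


print(dpcm_encode([1000, 1005, 1003, 1003]))  # [1000, 5, -2, 0]
```

The small differences after the first value are what a subsequent entropy coder can represent with few bits.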
Source Coding: Transformation

Assumptions:
  Data in the transformed domain is easier to compress
  Related processing is feasible
Example:
  [Figure: the Fourier Transformation maps the time domain to the frequency domain; the Inverse Fourier Transformation maps back]
FFT: Fast Fourier Transformation
DCT: Discrete Cosine Transformation

Source Coding: Sub-Band

Assumption:
  Some frequency ranges are more important than others
Example:
  [Figure: frequency spectrum of the signal over frequency, with the region of transformation / coding marked]
Application:
  Telephone system: 300 - 3400 Hz only
  MPEG audio

5. Entropy Coding

Entropy Coding: Principle

Entropy (in information theory): information content / "density"
  symbols/words equally likely: high entropy (full of information)
  otherwise: lower entropy (suboptimal representation of info, less dense)
[Figure: probability over grey levels, high entropy (flat distribution)]
  note: entropy does not consider arrangement (cf. run length encoding!)
  note: seems "little information" to us since it is very regular; this is not covered by the entropy formula, yet may be used for compression (e.g. run length)
[Figure: probability over grey levels, low entropy (peaked distribution)]
  here: "little info" because "most of picture is in same gray"

Entropy Coding: Principle

Entropy formula:

$$H(P) = -\sum_{s} p(s) \log_B p(s)$$

  (sum over all symbols s of the source; B = base of the logarithm)
example: given 4 possible symbols (words) in source code
  i) IF all equal p = 1/4: H(P) = 2; ii) IF p = 1/2, 1/4, 1/8, 1/8 --> H(P) = 1 6/8
"Entropy coding" means:
  mean code length per symbol (almost) equals the entropy
  in ii) above, with B = 2 (binary):
    p = 1/2: code length -log2(1/2) = -(-1) = 1 bit; p = 1/4: 2 bits, etc.
  GOAL: find code w/ symbol length as close as possible to -log_B p(s)
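The two example values of the entropy formula can be checked numerically. The sketch below assumes the formula in its usual form (negative sum of p times log p); the function names are illustrative.

```python
import math


def entropy(probabilities, base=2):
    """H(P) = -sum p * log_base(p); symbols with p = 0 contribute nothing."""
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)


def ideal_code_length(p, base=2):
    """Code length an ideal entropy coder would spend on a symbol of probability p."""
    return -math.log(p, base)


print(entropy([0.25, 0.25, 0.25, 0.25]))   # case i): 2 bits per symbol
print(entropy([0.5, 0.25, 0.125, 0.125]))  # case ii): 1.75 bits per symbol
```

In case ii) the ideal code lengths are 1, 2, 3, and 3 bits, so a code with exactly these lengths reaches the entropy of 1.75 bits per symbol on average.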
Run-Length

(only marginal relation to entropy)

Assumption:
  Long sequences of identical symbols
Example:
  ... A B C E E E E E E D A C B ...
        compression
  ... A B C E ! 6 D A C B ...
  (E = symbol, ! = special flag, 6 = number of occurrences)
Special variant: zero-length encoding
  only repetition of zeroes count
  in the "E ! 6" part above, "symbol" not needed (i.e. "pays" for > 2 repetitions)

Entropy Coding: Huffman

Assumption:
  Some symbols occur more often than others
  E.g., character frequencies of the English language
Fundamental principle:
  Frequently occurring symbols are coded with shorter bit strings
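The run-length scheme with a special flag (symbol, flag, number of occurrences) can be sketched as below. The threshold of four repetitions and the requirement that the flag never occurs in the data are assumptions of this example, not something the slide specifies.

```python
def rle_encode(symbols, flag="!", min_run=4):
    """Replace runs of at least min_run identical symbols by: symbol, flag, count.

    Assumes the flag itself never occurs in the input.
    """
    encoded = []
    i = 0
    while i < len(symbols):
        j = i
        while j < len(symbols) and symbols[j] == symbols[i]:
            j += 1
        run = j - i
        if run >= min_run:
            encoded += [symbols[i], flag, str(run)]  # three tokens replace the run
        else:
            encoded += [symbols[i]] * run            # short runs are copied as-is
        i = j
    return encoded


def rle_decode(encoded, flag="!"):
    """Expand every (symbol, flag, count) triple back into a run."""
    decoded = []
    i = 0
    while i < len(encoded):
        if i + 2 < len(encoded) and encoded[i + 1] == flag:
            decoded += [encoded[i]] * int(encoded[i + 2])
            i += 3
        else:
            decoded.append(encoded[i])
            i += 1
    return decoded


print(" ".join(rle_encode(list("ABCEEEEEEDACB"))))  # A B C E ! 6 D A C B
```

With three tokens per coded run, coding only pays off from four repetitions on; in the zero variant the symbol is implicit, so two tokens suffice.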

Entropy Coding: Huffman

Example:
  Symbols to be encoded: A, B, C, D, E
  Given probabilities of occurrence:
  p(A)=0.3, p(B)=0.3, p(C)=0.1, p(D)=0.15, p(E)=0.15

  symbol   probability   code
  A        30%           11
  B        30%           10
  C        10%           011
  D        15%           010
  E        15%           00

  [Figure: coding tree. C (10%, branch 1) and D (15%, branch 0) merge to 25%; this node (branch 1) and E (15%, branch 0) merge to 40%; A (30%, branch 1) and B (30%, branch 0) merge to 60%; 60% (branch 1) and 40% (branch 0) merge to 100%.]

Entropy Coding: Huffman

Table and example of application to data stream

  symbol   code
  A        11
  B        10
  C        011
  D        010
  E        00

  B  A  C   D   A  B  E  B  A  E
  10 11 011 010 11 10 00 10 11 00
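A Huffman code for the five symbols of the example can be built with a priority queue. This is a sketch, not the construction used for the slide: the 0/1 labels of the branches are assigned differently here, so the individual bit patterns differ from the table, while the code lengths (2 bits for A, B, E and 3 bits for C, D) come out the same. All function names are illustrative.

```python
import heapq
from itertools import count


def huffman_code(probabilities):
    """Build a prefix code by repeatedly merging the two least probable subtrees."""
    tiebreak = count()  # unique counter so that equal probabilities never compare dicts
    heap = [(p, next(tiebreak), {symbol: ""}) for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        merged = {symbol: "0" + code for symbol, code in codes0.items()}
        merged.update({symbol: "1" + code for symbol, code in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


def encode(symbols, codes):
    """Concatenate the code words of all symbols."""
    return "".join(codes[symbol] for symbol in symbols)


def decode(bits, codes):
    """Read bits until a complete code word is recognized (works for prefix codes)."""
    by_code = {code: symbol for symbol, code in codes.items()}
    symbols, word = [], ""
    for bit in bits:
        word += bit
        if word in by_code:
            symbols.append(by_code[word])
            word = ""
    return "".join(symbols)


codes = huffman_code({"A": 0.3, "B": 0.3, "C": 0.1, "D": 0.15, "E": 0.15})
bits = encode("BACDABEBAE", codes)
print(codes, len(bits))  # the data stream of the example needs 22 bits
```

The average code length is 0.3 x 2 + 0.3 x 2 + 0.15 x 2 + 0.1 x 3 + 0.15 x 3 = 2.25 bits per symbol.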
6. Hybrid Coding

Basic Encoding Steps

[Figure: source data --> data preparation --> data processing --> quantization --> entropy encoding --> compressed data; the figure marks the individual steps as lossy or lossless for video and audio]
  data preparation: e.g. resolution, frame rate
  data processing: e.g. DCT, sub-band coding
  quantization: e.g. linear, DC, AC values
  entropy encoding: e.g. runlength, Huffman

7. JPEG

JPEG: Joint Photographic Expert Group
International Standard:
  For digital compression and coding of continuous-tone still images:
    Gray-scale
    Color
  Since 1992
Joint effort of:
  ISO/IEC JTC1/SC2/WG10
  Commission Q.16 of CCITT SGVIII
Compression rate of 1:10 yields reasonable results

JPEG

Very general compression scheme
Independence of:
  Image resolution
  Image and pixel aspect ratio
  Color representation
  Image complexity and statistical characteristics
Well-defined interchange format of encoded data
Implementation in:
  Software only
  Software and hardware
MOTION JPEG for video compression
  Sequence of JPEG-encoded images

JPEG Compression Steps

[Figure: source image --> image preparation (pixel, block, MCU) --> image processing (predictor, FDCT) --> quantization --> entropy encoding (runlength, Huffman, Arithm.) --> compressed image]
MCU: Minimum Coded Unit
FDCT: Forward Discrete Cosine Transformation
JPEG: Image Preparation

[Figure: planes C1, C2, C3, ...; plane i has the resolution Xi x Yi; data units: pixels or 8*8-blocks]
Planes:
  1 <= N <= 255 components Ci (e.g., one plane per color)
  Contain data units
  Pixels in lossless mode, 8*8-blocks in lossy mode
  Different planes may have different resolutions
Number of bits per pixel:
  8 or 12 bit per pixel in lossy modes
  2 to 16 bit per pixel in lossless mode

JPEG Image Preparation

Example 4:2:2 YUV, 4:1:1 YUV, and YUV9 Coding
Luminance (Y):
  brightness
  sampling frequency 13.5 MHz
Chrominance (U, V):
  color differences
  sampling frequency 6.75 MHz

JPEG Image Preparation

Non-interleaved encoding:
  [Figure: the data units of one component are processed from left to right and top to bottom]
Interleaved encoding:
  [Figure: data units of the components C1, C2, and C3 are combined into one MCU]
Minimum Coded Unit (MCU):
  Combination of interleaved data units of different components
  Data of an MCU are stored and transmitted together

JPEG: 4 Modes of Compression

Lossy sequential DCT-based mode
  Baseline Mode
Expanded lossy DCT-based mode
  Progressive image display
  I.e. from coarse to fine resolution
Lossless mode
  Lossless compression, error-free decompression
Hierarchical mode
  Compression with multiple resolutions
JPEG Baseline Mode

[Figure: source image --> 1. image preparation (8x8 blocks) --> 2. image processing (FDCT) --> 3. quantization (tables) --> 4. entropy encoding (tables) --> compressed image]
Baseline mode is mandatory for all JPEG implementations:
  Often restricted to certain resolution
  Often only three planes with predefined color set-up
Image preparation:
  Step 1a: pixel resolution p = 8 bit; the image is cut into 8x8 pixel blocks (data units)
  Step 1b: unsigned --> signed integer (prepare for "oscillation" --> sin/cos)
  ... other steps see below
  Step 4a: zigzag linearization (see below)
  Steps 4b, c, ...: several entropy coding algorithms applied

Intuitive Understanding of DCT

Fourier-Transform (& FFT "fast" algorithm) known from 1-dimensional:
  cut waveform into pieces (blocks of samples)
  for each block:
    interpret as periodic (infinite) oscillating waveform
    represent as sum of sin/cos waves a_i sin(i ω t); i = 0...(N-1); same for cos
    a_i coefficients; a_0 = DC (direct current = shift wrt. 0-axis),
      others: how much of the respective sin or cos wave is part of waveform
    i increasing frequencies (usually N = no. of samples in block)
DCT in JPEG etc.:
  same idea, but 2-dimensional cos-waves
  cut out square blocks from picture (NxN)
  cos waves all have independent frequencies in horizontal/vertical direction
    comparable to smooth hills, # of valleys may differ horiz/vert.
  again: interpret sample as periodic (2D) waveform
    --> represent as sum of (2D) cos wave "hill areas"
  why only cos??
    trick: picture swapped around axes
    --> 4fold size --> picture symmetric to axes --> sin parts become zero
    4fold size no problem: 3 parts redundant
    axes have double "weight" (pix. row/col. "0") --> factor Cu/Cv in formula

JPEG Baseline Mode: Image Processing

Forward Discrete Cosine Transformation (FDCT):

$$S_{vu} = \frac{1}{4} C_u C_v \sum_{x=0}^{7} \sum_{y=0}^{7} s_{yx} \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$$

with: $C_u, C_v = \frac{1}{\sqrt{2}}$ for $u, v = 0$; else $C_u, C_v = 1$

Formula applied to each block for all 0 <= u, v <= 7:
  Blocks with 8x8 pixel result in 64 DCT coefficients:
    1 DC-coefficient S00: basic color of the block
    63 AC-coefficients: (likely) zero or near-by zero values
Different significance of the coefficients:
  DC: most important
  AC: less important

JPEG Baseline Mode: Image Processing

FDCT transforms:
  blocks into blocks
  not pixels into pixels
  [Figure: the FDCT maps an 8x8 block of pixels (P) to an 8x8 block with one DC-coefficient (D) and AC-coefficients (A)]
Example:
  Calculation of S00
  [Figure: every pixel of the 8x8 block contributes to S00]
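The FDCT formula can be evaluated directly, as in the following sketch. It is a deliberately naive implementation of the double sum (real codecs use fast factorizations), it expects samples that have already been shifted to signed values (step 1b), and the function name is chosen for this example only.

```python
import math


def fdct_8x8(block):
    """Direct evaluation of the FDCT formula.

    block[y][x] holds the signed samples s_yx; the result is indexed as S[v][u].
    """
    def c(k):
        # normalization factor C_u / C_v of the formula
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coefficients = [[0.0] * 8 for _ in range(8)]
    for v in range(8):
        for u in range(8):
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coefficients[v][u] = 0.25 * c(u) * c(v) * total
    return coefficients


# A block of uniform value 10: only the DC coefficient S00 is non-zero (about 80).
flat = [[10] * 8 for _ in range(8)]
print(round(fdct_8x8(flat)[0][0], 6))
```

For S00 both cosines are 1 and C_u C_v = 1/2, so the DC coefficient is one eighth of the sum of all 64 samples, i.e. eight times the block average.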
JPEG Baseline Mode: Quantization

Quantization of DCT-coefficients:
  Map interval of real numbers to one integer number
  Especially: small values are mapped to 0, yielding long zero sequences
Using quantization tables:
  Different coefficients may have different granularities

JPEG Quantization Effect

[Figure: two images, (a) and (b), showing the effect of quantization]

JPEG Baseline Mode: Entropy Encoding

DC-coefficients:
  Compute the differences:
    DIFF = DC_i - DC_(i-1)
    [Figure: DC coefficients DC_(i-1) and DC_i of two adjacent blocks]
  Encode differences instead of the DC_i values
  Reason: DC values of adjacent blocks are often similar
Huffman coding of all coefficients:
  Transformation into a code where amount of bits depends on frequency of respective value
Subsequent runlength coding of zeros

JPEG Baseline Mode: Entropy Coding

63 AC coefficients:
  Ordering in zig-zag form
  [Figure: zig-zag scan through the 8x8 block, starting at DC and AC01, passing AC07 and AC70, ending at AC77]
  reason: coefficients in lower right corner are likely to be zero
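The zig-zag ordering and the DC differences can be sketched as follows. The code walks the anti-diagonals of the block and alternates the direction; function names are illustrative, and the previous DC value of the first block is assumed to be 0.

```python
def zigzag_order(n=8):
    """Return the (row, column) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + column selects one anti-diagonal
        diagonal = [(row, s - row) for row in range(n) if 0 <= s - row < n]
        # even diagonals run from bottom-left to top-right, odd ones the other way
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order


def zigzag_scan(block):
    """Linearize a square block: DC first, then the AC values in zig-zag order."""
    return [block[row][col] for row, col in zigzag_order(len(block))]


def dc_differences(dc_values, previous=0):
    """DIFF = DC_i - DC_(i-1) for a sequence of blocks."""
    differences = []
    for dc in dc_values:
        differences.append(dc - previous)
        previous = dc
    return differences


print(zigzag_order()[:6])  # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```

Because the scan ends in the lower right corner, the (likely) zero coefficients collect at the end of the sequence, where a zero run-length code handles them cheaply.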
JPEG: Details of (one possible) Entropy coding

Treatment of "zig-zag sequence":
  differential coding of DC: DC_i stored as "change wrt. DC_(i-1)"
  assumption: there will rarely be two non-zero AC values in sequence
    --> regard seq. as iteration of non-zero AC-values and zero-runlengths
    --> sometimes, the zero-runlength will have "length zero"
  code non-zero AC-values as VLIs (variable length integers)
    --> need to transmit VLI-lengths
    (difference to Huffman: end of code not found by decoder)
  create pairs (zero-runlength, VLI-length-of-following-non-zero-AC-value)
  these pairs are Huffman encoded
  the very first "pair" is not a pair, but the VLI-length of the (diff.) DC-value
  the block is finally represented as iteration
    Huffman-encoded pair / VLI-encoded non-zero-AC / Huffman-.... / VLI... / ...
    preceded by "Huffman-encoded VLI-length / VLI-encoded diff.-DC"
Next two slides give an example of the DCT coding of a 8x8 block

JPEG: Sample Compression of 1 Block: 8x8 Matrices

1. Typical Pixel Block:
  139 144 149 153 155 155 155 155
  144 151 153 156 159 156 156 156
  150 155 160 163 158 156 156 156
  159 161 162 160 160 159 159 159
  159 160 161 162 162 155 155 155
  161 161 161 161 160 157 157 157
  162 162 161 163 162 157 157 157
  162 162 161 161 163 158 158 158

2. DCT Coefficients:
  235.6   1.0 -12.1  -5.2   2.1  -1.7  -2.7   1.3
  -22.6 -17.5  -6.2  -3.2  -2.9  -0.1   0.4  -1.2
  -10.9  -9.3  -1.6   1.5   0.2  -0.9  -0.6  -0.1
   -7.1  -1.9   0.2   1.5   0.9  -0.1   0.0   0.3
   -0.6  -0.8   1.5   1.6  -0.1  -0.7   0.6   1.3
    1.8  -0.2   1.6  -0.3  -0.8   1.5   1.0  -1.0
   -1.3  -0.4  -0.3  -1.5  -0.5   1.7   1.1  -0.8
   -2.6   1.6  -3.8  -1.8   1.9   1.2  -0.6  -0.4

3. Quantization Matrix:
  16  11  10  16  24  40  51  61
  12  12  14  19  26  58  60  55
  14  13  16  24  40  57  69  56
  16  17  22  29  51  87  80  62
  18  22  37  56  68 109 103  77
  24  35  55  64  81 104 113  92
  49  64  78  87 103 121 120 101
  72  92  95  98 112 100 103  99

4. Quantized Result:
  15   0  -1   0   0   0   0   0
  -2  -1   0   0   0   0   0   0
  -1  -1   0   0   0   0   0   0
   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0
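The step from the DCT coefficients to the quantized result of the sample block can be reproduced with a short sketch. It assumes that quantization means dividing each coefficient by its table entry and rounding to the nearest integer, and that the decoder multiplies back; the function names are illustrative. Only the top-left 3x3 corner of the matrices is used in the example call to keep it short.

```python
def quantize(coefficients, table):
    """Divide every coefficient by its quantization table entry and round."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coefficients, table)]


def dequantize(levels, table):
    """Decoder side: multiply the transmitted integers by the table entries."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]


# Top-left corner of the sample block (DCT coefficients and quantization matrix).
dct_corner = [[235.6, 1.0, -12.1],
              [-22.6, -17.5, -6.2],
              [-10.9, -9.3, -1.6]]
table_corner = [[16, 11, 10],
                [12, 12, 14],
                [14, 13, 16]]

levels = quantize(dct_corner, table_corner)
print(levels)                            # [[15, 0, -1], [-2, -1, 0], [-1, -1, 0]]
print(dequantize(levels, table_corner))  # [[240, 0, -10], [-24, -12, 0], [-14, -13, 0]]
```

The dequantized values differ from the original coefficients (240 instead of 235.6, for instance); this rounding error is where the loss of the lossy modes is introduced.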

JPEG: Sample Compression (contd.)

assume: last DC value was 12 --> encoded difference is 3 (15 - 12)
--> only 3, -2, -1 occur as non-zero values.
Their VLI-encoding is as follows:
   3   11
  -2   01
  -1   0
This makes the iteration look as follows (VLIs still represented as integers):
  (2)(3), (1,2)(-2), (0,1)(-1), (0,1)(-1), (0,1)(-1), (2,1)(-1), (0,0)
  ((0,0) is the abbreviation for "til end")
The following Huffman encoding is defined:
  (2)     011
  (0,0)   1010
  (0,1)   00
  (1,2)   11011
  (2,1)   11100
...so that the bitstream finally consists of the following 31 bits (for 64 coefficients!):
  0111111011010000000001110001010
...btw., the decoded matrix looks like this:
  240   0 -10   0   0   0   0   0
  -24 -12   0   0   0   0   0   0
  -14 -13   0   0   0   0   0   0
    0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0

JPEG: Example

On the following slides: picture of Yosemite Valley
  Source: pico.phys.chemie.tu-muenchen.de/people/krempl/JMT
with various degrees of compression:
  Bitmap: no compression
    1024 * 671 pixels
    3 bytes / pixel
    2014 KByte file size
  1:31    63 KByte
  1:50    40 KByte
  1:100   21 KByte
  1:155   13 KByte
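The 31-bit stream of the sample block can be reproduced with the sketch below. It uses only the Huffman code words listed for this example (a real JPEG coder carries complete tables), assumes that negative VLI values are written as the one's complement of their magnitude, and always appends the (0,0) end marker. Long zero runs, which real JPEG splits into chunks, are not handled. All names are illustrative.

```python
HUFFMAN_DC = {2: "011"}  # VLI-length of the DC difference --> code word (example only)
HUFFMAN_AC = {(0, 0): "1010", (0, 1): "00", (1, 2): "11011", (2, 1): "11100"}


def vli_bits(value):
    """Variable length integer: magnitude bits; negative values as one's complement."""
    if value == 0:
        return ""
    size = abs(value).bit_length()
    if value < 0:
        value += (1 << size) - 1
    return format(value, "0{}b".format(size))


def run_size_items(zigzag_ac):
    """Turn the AC values (zig-zag order) into ((zero run, VLI-length), value) items."""
    items = []
    run = 0
    for value in zigzag_ac:
        if value == 0:
            run += 1
        else:
            items.append(((run, abs(value).bit_length()), value))
            run = 0
    items.append(((0, 0), 0))  # "til end" marker; carries no VLI bits
    return items


def encode_block(dc_difference, zigzag_ac):
    """Huffman-encoded VLI-length / VLI of the DC difference, then the AC items."""
    bits = HUFFMAN_DC[abs(dc_difference).bit_length()] + vli_bits(dc_difference)
    for pair, value in run_size_items(zigzag_ac):
        bits += HUFFMAN_AC[pair] + vli_bits(value)
    return bits


ac_values = [0, -2, -1, -1, -1, 0, 0, -1] + [0] * 55  # zig-zag sequence of the sample
stream = encode_block(3, ac_values)
print(stream, len(stream))  # 0111111011010000000001110001010 31
```

Compared with 64 coefficients of 8 bits each (512 bits), the 31 bits correspond to a compression of roughly 1:16 for this block.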
Bitmap: No Compression

[Figure: picture of Yosemite Valley, uncompressed bitmap]

JPEG: 1:31

[Figure: picture of Yosemite Valley, JPEG compression 1:31]

JPEG: 1:50

[Figure: picture of Yosemite Valley, JPEG compression 1:50]

JPEG: 1:100

[Figure: picture of Yosemite Valley, JPEG compression 1:100]

JPEG: 1:155

[Figure: picture of Yosemite Valley, JPEG compression 1:155]

JPEG 4 Modes of Compression

lossy sequential DCT-based mode (baseline mode)
expanded lossy DCT-based mode
lossless mode
hierarchical mode

JPEG Extended Lossy DCT-Based Mode

Pixel resolution 8 to 12 bit
Sequential image display:
  Top to bottom
  Good for small images and fast processing

JPEG Extended Lossy DCT-Based Mode

Progressive image display:
  Coarse to fine
  Good for large and complicated images
JPEG Extended Lossy DCT-Based Mode

Principle:
  Coefficients stored in buffer after quantization
  Order of pixel/block processing changed
By spectral selection:
  Selection according to importance of DC, AC value
  All DC values of whole image first
  All AC values in order of importance subsequently
By successive approximation:
  Selection according to position of bits
  First the most significant bit of all blocks
  Then the second significant bit of all blocks
  Until the least significant bit of all blocks

JPEG Lossless Mode

Image preparation:
  On pixel basis (2-16 bit/pixel)
Image processing:
  Selection of a predictor for each pixel
  Neighbourhood of the pixel x (a = left, b = above, c = above left):
      c b
      a x

  code   prediction
  0      no prediction
  1      x=A
  2      x=B
  3      x=C
  4      x=A+B-C
  5      x=A+((B-C)/2)
  6      x=B+((A-C)/2)
  7      x=(A+B)/2
Entropy coding:
  Same as lossy mode
  Code of chosen predictor and its difference to the actual value
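The predictor table of the lossless mode translates into a small function. The sketch assumes integer arithmetic with the division by 2 rounded down (the slide does not state the rounding rule) and treats "no prediction" as a prediction of 0; names are illustrative.

```python
def predict(code, a, b, c):
    """Prediction for pixel x from its neighbours a (left), b (above), c (above left)."""
    predictors = {
        0: lambda: 0,                   # no prediction
        1: lambda: a,
        2: lambda: b,
        3: lambda: c,
        4: lambda: a + b - c,
        5: lambda: a + (b - c) // 2,    # assumption: division rounds down
        6: lambda: b + (a - c) // 2,
        7: lambda: (a + b) // 2,
    }
    return predictors[code]()


def residual(code, x, a, b, c):
    """Value handed to the entropy coder: actual pixel minus its prediction."""
    return x - predict(code, a, b, c)


# Pixel x = 106 with neighbours a = 100, b = 110, c = 105:
print([residual(code, 106, 100, 110, 105) for code in range(8)])
```

In smooth image regions a well-chosen predictor keeps the residuals close to zero, which is what makes the subsequent entropy coding effective.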

JPEG Hierarchical Mode

Coding of each image with several resolutions:
  Image scaling
  Differential encoding
    First, coded with lowest resolution: image A
    Coded with increasing horizontal & vertical resolution: image A'
    Difference between both images is computed: B = A' - A (*)
    Iteration for higher resolutions
Features:
  Requires more storage and higher data rate
  Fast decoding process
  Used for scalable video
  Similar to Photo-CD (Kodak, proprietary)
(*) note for all scalable approaches:
  relate the higher-res version to the receiver's de-coded lower-res version A (to avoid accumulation of quantization errors)

JPEG 2000

Goal: Establish a follow-on standard to JPEG
  Started Feb. 1996
  Call for Proposals March 1997
  Standardization Dec. 2000 (target date)
Features:
  Compression based on Wavelet technology
    See Section 11
  Image resolution controlled by viewer, e.g.
    Low resolution for thumbnail
    High resolution for full format
  Lossless for storage
  Increased capacity for color information
    256 color channels
  Increased capacity for metadata
    I.e. information about the image
8. H.261 and related ITU Standards

Video codec for audiovisual services at p x 64 kbit/s:
  CCITT standard from 1990
  For ISDN
  With p = 1, ..., 30
Technical issues:
  Real-time encoding/decoding
  Max. signal delay of 150 ms
  Constant data rate
  Implementation in hardware (main goal) and software

H.261 Image Preparation

Fixed source image format
Image components:
  Luminance signal (Y)
  Two color difference signals (Cb, Cr)
  Subsampling according to CCIR 601 (4:1:1)
Quarter Common Intermediate Format (QCIF) resolution:
  Mandatory
  Y: 176 x 144 pixel ("pruning" 180 --> 176)
  At 29.97 frames/s appr. 9.115 Mbit/s (uncompressed)
  but: encoder may leave out up to 3 frames (--> ~8 fps)
Common Intermediate Format (CIF) resolution:
  Optional
  Y: 352 x 288 pixel
  At 29.97 frames/s appr. 36.46 Mbit/s (uncompressed), i.e. ~ 570 * 64 kbps
[Figure: QCIF shown inside CIF; CIF: 360*288]
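The uncompressed data rates quoted for QCIF and CIF can be checked with the following sketch. It assumes 8 bit per sample and that each of the two color difference signals has half the luminance resolution in both directions (a quarter of the samples); this assumption is what reproduces the slide's numbers. The function name is illustrative.

```python
def uncompressed_bitrate(y_width, y_height, frames_per_second=29.97, bits_per_sample=8):
    """Bits per second of the uncompressed Y, Cb, Cr signal."""
    luminance_samples = y_width * y_height
    # assumption: Cb and Cr each have half the resolution horizontally and vertically
    chrominance_samples = 2 * (y_width // 2) * (y_height // 2)
    return (luminance_samples + chrominance_samples) * bits_per_sample * frames_per_second


qcif = uncompressed_bitrate(176, 144)
cif = uncompressed_bitrate(352, 288)
print(round(qcif / 1e6, 3))   # about 9.115 Mbit/s
print(round(cif / 1e6, 2))    # about 36.46 Mbit/s
print(round(cif / 64_000))    # about 570 channels of 64 kbit/s
```

Since H.261 targets p x 64 kbit/s with p at most 30, even QCIF has to be compressed by a factor of several to fit, and CIF at p = 1 by the factor of about 570 mentioned above.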

H.261 Image Preparation

Layered structure:
  Block of 8 x 8 pixels
  Macroblock of: 4 Y blocks, 1 Cr block, 1 Cb block
  Group of blocks (GOBs) of 3 x 11 macroblocks
  Picture:
    QCIF picture: 3 GOBs
    CIF picture: 12 GOBs

H.261: Image Compression Intraframe

Intraframe Coding: Independent coding of individual frames
  yields "reference frame" f0
  basically DCT as in JPEG baseline mode
  DCT w/ same quantization factor for all AC values
  this factor may be adjusted by loopback filter (see below)
H.261: Image Compression Interframe

Interframe Coding: Coding dependent on previous frame(s)
Based on motion estimation:
  [Figure: a macroblock in Frame 1 and its position in Frame 2, connected by the motion vector]
  interframes: f1, f2, f3, ... relative to f0 (differential encoding)
  in H.261: intraframes rare (bandwidth!, main application videophone)
Search for similar macroblock (16x16) in previous image
  Position of this macroblock defines motion vector
  Search range for similar block is implementation-dependent:
    max. +/- 15 pixel
    but: motion vector may also always be 0 ("bad" software encoder)

H.261: Image Compression

Interframe coding of a frame:
  Find for each macroblock similar macroblock in previous frame
  Encode:
    Motion vectors between macroblock pairs
      Components are encoded yielding code words of variable length
    Differences between macroblock pairs
      DCT if value higher than a specific threshold
      No further processing if value less than this threshold
Quantization:
  Linear
  Adaptation of step size (loopback filter) --> constant data rate
    Coarse quantization if many values to be transmitted
    Fine quantization if few values to be transmitted
  ("leaky bucket": constant 64 kbps "drop out"; loopback filter: adjust quantization factor if bucket filled above threshold 1 or below threshold 2, respectively)
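Since the search for a similar macroblock is implementation-dependent, the following is only one possible sketch: an exhaustive search that compares the current block against every candidate position in the previous frame and picks the one with the smallest sum of absolute differences (SAD). The SAD criterion, the function names, and the frame representation as nested lists are assumptions of this example.

```python
def sad(current, previous, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block and a shifted candidate."""
    return sum(abs(current[by + y][bx + x] - previous[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))


def find_motion_vector(current, previous, bx, by, size=16, search=15):
    """Full search in a +/- search window; returns ((dx, dy), cost) of the best match."""
    height, width = len(previous), len(previous[0])
    best_vector = (0, 0)
    best_cost = sad(current, previous, bx, by, 0, 0, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= bx + dx and bx + dx + size <= width
                      and 0 <= by + dy and by + dy + size <= height)
            if not inside:
                continue  # candidate block would leave the previous frame
            cost = sad(current, previous, bx, by, dx, dy, size)
            if cost < best_cost:
                best_vector, best_cost = (dx, dy), cost
    return best_vector, best_cost
```

A "bad" encoder in the sense of the slide corresponds to skipping the two loops and always returning (0, 0); the full search is the other extreme and costs (2 x 15 + 1)^2 = 961 block comparisons per macroblock.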

Further ITU Video Schemes (H.263, H.3xx)

H.263
  extension to H.261
  max. bitrate: H.263 approx. 2.5 x H.261; lowest bitrates suitable f. modem

Source Image Formats

  Format   Pixels        H.261 (Encoder / Decoder)   H.263 (Encoder / Decoder)
  SQCIF    128 x 96      optional                    required
  QCIF     176 x 144     required                    required
  CIF      352 x 288     optional                    optional
  4CIF     704 x 576     not defined                 optional
  16CIF    1408 x 1152   not defined                 optional

H.263

Differences of H.263 compared to H.261
  motion vector may point forward in time (future interframe), cf. MPEG, for video
  optional PB-frames (2 combined pictures: 1 B- & 1 P-Frame)
  optional overlapped block motion compensation
  optional motion vector pointing outside image
  half pel motion compensation (instead of full pel)
  JPEG is the still picture mode
  no included error detection and correction
  unlimited search space for motion vector
    --> fast encoder can do better
  ..
H.320, H.32x Family

H.320
  specifies (as overview) videophone for ISDN
H.310
  adapt MPEG 2 for communication over B-ISDN (ATM)
H.321
  define videoconferencing terminal for B-ISDN (instead of N-ISDN)
H.322
  adapts H.320 for guaranteed QoS LANs (like ISO-Ethernet)
H.323
  videoconferencing over non-guaranteed LANs
H.324
  Terminal for low bit rate communication (over V.34 Modems)

9. MPEG-1

Motion Picture Expert Group (MPEG)
  ISO/IEC working group(s)
  ISO/IEC JTC1/SC29/WG11
  ISO IS 11172 since 3/93
Starting point: MPEG-1
  Audio/video at about 1.5 Mbit/s
  Based on experiences with JPEG and H.261
Follow-up standards
  MPEG-2: choice of quality levels and compression factors
  MPEG-4: content-based encoding, high compression factor
  MPEG-7: support for content-based search and retrieval
  MPEG-21: future framework

MPEG Features

[Figure: MPEG comprises audio, video, and system parts; audio coding and video coding each produce a data stream; the system part provides the combined stream and common buffer management]
Consideration of other standards:
  JPEG
  H.261
Symmetric and asymmetric compression
Constant data rate, should be < 1856 kbit/s
Original target rate ~ 1.2 Mbps incl. audio (= 1x CD-ROM: 150 KByte/s)

MPEG Video: Preparation Step

Color model: Y Cb Cr
  4:2:0 subsampling
  Y value for each pixel
  Cb and Cr in every fourth pixel only
Resolution:
  At most 768 x 576 pixel / image
  8 bit/pixel in each layer (i.e., for Y, Cr, Cb)
  14 pixel aspect ratios
    horizontal : vertical = 1:1 or 16:9 or 4:3 or ...
  8 frame rates
    23.976 Hz, 24 Hz, 25 Hz, 29.97 Hz, 30 Hz, 50 Hz, 59.94 Hz, 60 Hz
    Lower rates not allowed!
No user defined MCU like JPEG
No progressive mode like JPEG
MPEG Video: Processing Step

I-frames (intra-coded frames):
  Like JPEG but real-time decoding demands
  Coding independent of other frames
P-frames (predictive coded frames):
  Coding depends on previous I- or P-frames
  Based on motion vector
B-frames (bi-directional predictive coded frames):
  Coding depends on previous and subsequent I- and P-frames
  Based on motion vector
D-frames (DC-coded frames):
  Only DC-coefficients are DCT coded, AC values are dropped
  For fast forward and rewind
[Figure: frame sequence I B B P B B P I over time t]

MPEG: Video - Processing Step

Motion vectors:
  [Figure: a macroblock in Frame 1 and Frame 2, connected by the motion vector]
  MPEG does not define how to determine the motion vectors
    I.e. specifies only the format to describe them but no algorithm to find them
    Programmer is free to implement any algorithm
Difference of similar macroblocks is DCT coded
  Macroblock = 4 blocks with 8*8 pixels each
  DC and AC coefficients are runlength coded

MPEG: Video - Processing Step

Sequence of I-, P-, and B-frames:
  Position / frequency of I-, P- and B-frames can be defined by the encoder
    I1 B1 B2 P1 B3 B4 P2 I2
  Must consider the structure of the movie:
    An I-frame should occur at least after each cut
Order of transmission differs from order of display
    I1 P1 B1 B2 P2 B3 B4 I2
  Reason: Receiver must know I- and P-frames before it can display B-frames
  Problem: Additional delay

MPEG Video: Implications

Random access
  at I-frames
  at P-frames: i.e. decode previous I-frame first
  at B-frame: i.e. decode I and P-frames first
Editing
  decoded data
    loss of quality (encode -> decode -> encode -> ...)
    application of all video editing functions
  encoded data (previous to entropy encoding)
    preservation of quality
    transition effects as function in the DCT domain
    morphing, non-block conform overlay very difficult
  encoded data
    preservation of quality
    today: too complex, if possible, i.e. need for entropy decoding
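The difference between display order and transmission order of I-, P-, and B-frames can be sketched as a simple reordering: every B-frame is held back until the next I- or P-frame it depends on has been sent. The frame labels and the function name are illustrative, and the sketch assumes each B-frame refers to the nearest reference frames before and after it.

```python
def transmission_order(display_order):
    """Send each I-/P-frame before the B-frames that precede it in display order."""
    transmitted = []
    waiting_b_frames = []
    for frame in display_order:
        if frame.startswith("B"):
            waiting_b_frames.append(frame)        # needs a later reference frame first
        else:
            transmitted.append(frame)             # I- or P-frame goes out immediately
            transmitted.extend(waiting_b_frames)  # now the held-back B-frames can follow
            waiting_b_frames = []
    transmitted.extend(waiting_b_frames)          # trailing B-frames without a later reference
    return transmitted


display = "I1 B1 B2 P1 B3 B4 P2 I2".split()
print(" ".join(transmission_order(display)))  # I1 P1 B1 B2 P2 B3 B4 I2
```

The additional delay is visible in the example: B1 cannot be sent before P1 has been captured and coded, although it is displayed two frames earlier.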
MPEG Audio Coding: Fundamentals

[Figure: sound pressure level (dB) over frequency (kHz, 0.02 to 20); absolute threshold of hearing and masking patterns of maskers at fm = 0.25, 1, and 4 kHz]
Masking threshold in the frequency domain
  narrowband random noise
  depends on frequency

MPEG Audio Coding: Fundamentals

[Figure: masking level (dB) over time (ms) around a masker: pre-, simultaneous-, and post-masking]
Masking in Time Domain
  after and before the event
  depends on (to some extent) amplitude

MPEG Audio Coding

[Figure: sub-band coding (32 sub-bands) --> quantization --> entropy coder & frame packing; the psychoacoustical model controls how many bits are reserved for which sub-band]
Audio channel:
  Between 32 and 448 kbit/s
  In steps of 16 kbit/s
Definition of 3 layers of quality
  Layer 1: max. 448 Kbit/s (approx. 1:4 compression)
  Layer 2: max. 384 Kbit/s (approx. 1:6-1:8, common, e.g. as MUSICAM in DAB)
  Layer 3: max. 320 Kbit/s
MP3 files: compression up to 1:12 / 1:14 with no hearable losses

MPEG Audio Coding

Sampling compatible to encoding of CD-DA and DAT:
  Sampling rates: 32 kHz, 44.1 kHz, 48 kHz
  Sampling precision: 16 bit/sample
Audio channels:
  Mono (single, 1 channel)
  Stereo (2 channels)
  dual channel mode (independent, e.g., bilingual)
  optional: joint stereo (exploits redundancy and irrelevancy)
Application Example: DAB Digital Audio Broadcasting
  uses MPEG layer 2 (compression also known as MUSICAM = Masking pattern adapted Universal Subband Integrated Coding And Multiplexing)
  delays, for VLSI implementation:
    max. 30 ms encoding
    max. 10 ms decoding
SW codec delays vary for different layers, implementations, computers (rule-of-thumb may be 50/100/150 ms for layer 1/2/3, which makes MP3 rather inappropriate for real-time conversation)
MPEG Audio and Video Data Streams

Audio Data Stream Layers:
  1. Frames
  2. Audio access units
  3. Slots
Video Data Stream Layers:
  1. Video sequence layer
  2. Group of pictures layer
  3. Single picture layer
  4. Slice layer
  5. Macroblock layer
  6. Block layer

Follow-Up MPEG Standards

MPEG-2:
  Higher data rates for high-quality audio/video
  Multiple layers and profiles with different degrees of compression and quality
MPEG-3
  Initially HDTV, but MPEG-2 scaled up to subsume MPEG-3
MPEG-4:
  Initially, lower data rates for e.g. mobile communication
  Then, coding and additional functionalities based on image contents
MPEG-7:
  Content description
  Basis for search and retrieval
MPEG-21 (upcoming):
  Framework for multimedia business, delivery... what's missing?
  maybe eCommerce focus --> e.g., security, watermarking?

10. MPEG-2

From MPEG-1 to MPEG-2
  Improvement in quality
    from VCR to TV to HDTV
  No CD-ROM based constraints
    higher data rates
    MPEG-1: about 1.5 Mbit/s
    MPEG-2: 2-100 Mbit/s

Evolution
  1994: International Standard
  Also later known as H.262
  Prominent role for digital TV in DVB (digital video broadcasting)
  commercial MPEG-2 realizations available

MPEG-2 Video: Scaling

Motivation
  analog: continuous decrease in quality if errors occur
  digital: need for tolerance whenever errors occur, i.e. scaling

Option: Spatial scaling
  reduction of resolution
  approach
    Base layer: image sampled with half resolution, then MPEG algorithms applied,
    output processed with better FEC
    Enhanced layer: base image decoded and subtracted from original, MPEG algorithms
    applied to the difference, output processed with worse FEC

Option: Signal to Noise (SNR) scaling
  noise introduced by quantization errors and visible block structures
  approach
    Base layer: DCT output, more significant bits encoded with better FEC
    Enhanced layer: DCT output, less significant bits encoded with worse FEC
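
The spatial scaling approach above can be illustrated with a small sketch. It only shows how an image is split into a half-resolution base layer and a difference (enhanced) layer; the MPEG coding and the FEC mentioned on the slide are left out. The function names and the 2x2 averaging / pixel repetition filters are assumptions chosen for illustration, not taken from the standard.

```python
# Sketch of spatial scaling: base layer = half-resolution image,
# enhanced layer = original minus the upsampled base image.
# MPEG coding and FEC are intentionally omitted (assumption: lossless layers).

def downsample(img):
    """Halve the resolution by averaging each 2x2 block (sides must be even)."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) // 4
             for x in range(0, len(img[0]), 2)]
            for y in range(0, len(img), 2)]

def upsample(img):
    """Double the resolution by repeating every pixel horizontally and vertically."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def split_layers(img):
    """Return (base layer, enhanced layer) for an image given as a list of rows."""
    base = downsample(img)
    predicted = upsample(base)
    enhanced = [[o - p for o, p in zip(orow, prow)]
                for orow, prow in zip(img, predicted)]
    return base, enhanced

def reconstruct(base, enhanced=None):
    """Decode with the base layer only, or with both layers if available."""
    predicted = upsample(base)
    if enhanced is None:
        return predicted  # reduced quality, but still a usable picture
    return [[p + e for p, e in zip(prow, erow)]
            for prow, erow in zip(predicted, enhanced)]
```

A receiver that loses the enhanced layer can still call `reconstruct(base)` and obtain a coarser picture, which is the kind of tolerance the slide asks for.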
MPEG-2 Video Profiles and Levels

LEVELS and PROFILES (maximum bit rate; "-" marks combinations without an entry)

Level       Pixels/line x lines   Simple      Main        SNR Scalable   Spatial Scalable   High
                                  Profile     Profile     Profile        Profile            Profile
High        1920 x 1152           -           80 Mbit/s   -              -                  100 Mbit/s
High-1440   1440 x 1152           -           60 Mbit/s   -              60 Mbit/s          80 Mbit/s
Main        720 x 576             15 Mbit/s   15 Mbit/s   15 Mbit/s      -                  20 Mbit/s
Low         352 x 288             -           4 Mbit/s    4 Mbit/s       -                  -

Profile            B-frames   Chroma format    Scalability
Simple             no         4:2:0            not scalable
Main               yes        4:2:0            not scalable
SNR Scalable       yes        4:2:0            SNR scalable
Spatial Scalable   yes        4:2:0            SNR or spatial scalable
High               yes        4:2:0 or 4:2:2   SNR or spatial scalable

MPEG-2 Audio

(two modest) extensions to MPEG-1 audio:

1) "low sample rate extension" LSE:
  1/2 of all MPEG-1 rates: 16, 22.05, 24 kHz
  quantization down to 8 bits/sample

2) "multichannel extension": more channels, i.e. up to
  5 full bandwidth channels (surround system)
    left and right front
    center (in front)
    left and right back
  "matrixing": rule for backward compatible conversion --> stereo (x, y = 0.71)
    Left for Stereo = Left_f + x * Center + y * Left_b
    Right for Stereo = Right_f + x * Center + y * Right_b
  option: +1 "low freq. extension" (LFE) channel for subwoofer
  "multilingual extension": 7 more, i.e. up to 12 channels (multiple languages,
  commentary)
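
The matrixing rule above can be written out directly. The following is a minimal sketch that applies the two formulas sample by sample; the function name, the list-of-floats signal representation and the absence of any clipping or normalization are assumptions made for illustration.

```python
# Sketch of the "matrixing" rule: five surround channels are folded
# into a backward-compatible stereo pair with weights x = y = 0.71.

def downmix_to_stereo(left_f, right_f, center, left_b, right_b, x=0.71, y=0.71):
    """Return (left, right) stereo sample lists from five channel sample lists."""
    left = [lf + x * c + y * lb
            for lf, c, lb in zip(left_f, center, left_b)]
    right = [rf + x * c + y * rb
             for rf, c, rb in zip(right_f, center, right_b)]
    return left, right
```

Note that the sum can exceed the range of a single channel, so a real implementation would have to scale or limit the result; the slides do not say how.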

MPEG-2 Audio (2)

Improved quality at or below 64 kbit/s

Compatible to MPEG-1
  all MPEG-1 audio formats can be processed by MPEG-2
  only 3 MPEG-2 audio codecs will not provide backward compatibility
  (in the range between 256 and 448 Kbit/s)

MPEG-2 System

Steps
  1. Audio and video combined to Packetized Elementary Stream (PES)
  2. PES(es) combined to Program Stream or Transport Stream

Program stream:
  Error-free environment
  Packets of variable length
  One single stream with one timing reference

Transport stream:
  Designed for noisy (lossy) media channels
  Multiplex of various programs with one or more time bases
  Packets of 188 byte length

Conversion between Program and Transport Streams possible
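
To make the fixed 188-byte packet size of the Transport Stream concrete, here is a simplified sketch that cuts a PES byte string into such packets. The slides only state the packet length; the 4-byte header layout used here (sync byte 0x47, payload unit start flag, 13-bit PID, 4-bit continuity counter) follows the commonly documented MPEG-2 TS header and should be checked against the standard. Padding the last packet with 0xFF bytes is a simplification; an actual multiplexer uses adaptation field stuffing instead.

```python
# Simplified sketch: split one PES byte string into 188-byte TS packets.
# Header fields beyond the packet length are assumptions (see lead-in).

TS_PACKET_SIZE = 188
HEADER_SIZE = 4
PAYLOAD_SIZE = TS_PACKET_SIZE - HEADER_SIZE  # 184 bytes of payload per packet
SYNC_BYTE = 0x47

def packetize(pes, pid):
    """Return a list of 188-byte packets carrying the bytes of `pes`."""
    if not 0 <= pid < 0x2000:
        raise ValueError("PID must fit into 13 bits")
    packets = []
    for counter, offset in enumerate(range(0, len(pes), PAYLOAD_SIZE)):
        chunk = pes[offset:offset + PAYLOAD_SIZE]
        start_flag = 0x40 if offset == 0 else 0x00  # marks the first packet of the PES
        header = bytes([
            SYNC_BYTE,
            start_flag | (pid >> 8),   # flag bit plus upper 5 PID bits
            pid & 0xFF,                # lower 8 PID bits
            0x10 | (counter & 0x0F),   # "payload only" plus continuity counter
        ])
        # Simplification: pad the final short chunk with 0xFF bytes.
        packets.append(header + chunk.ljust(PAYLOAD_SIZE, b"\xff"))
    return packets
```

The short, constant packet length means that a transmission error damages at most a small, well-delimited part of the stream, which fits the "noisy channel" design goal named above.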
11. MPEG-4

Goals

MPEG-4 (ISO 14496) originally:
  Targeted at systems with very scarce resources
  To support applications like
    Mobile communication
    Videophone and E-mail
  Max. data rates and dimensions (roughly):
    Between 4800 and 64000 bits/s
    176 columns x 144 lines x 10 frames/s

Largely covered by H.263, therefore re-orientation:
  Goal to provide enhanced functionality
  to allow for analysis and manipulation of image contents

MPEG-4: Timeline

Schedule for Standardization
  1993: Work started
  1997: Committee Draft
  1998: Final Committee Draft
  1998: Draft International Standard
  1999-2000: International Standard

MPEG-4: Goals (cont.)

1: support composite multimedia, i.e. find standardized ways to
  Represent units of aural, visual or audiovisual content
    "audio/visual objects" or AVOs
    object coding independent of other objects, surroundings and background
    natural and synthetic objects
    [Figure: example scene with video objects 1-3 and audio objects 1-2]
  Compose these objects together
    i.e. creation of compound objects that form audiovisual scenes
  Multiplex and synchronize the data associated with AVOs
    for transportation over network channels providing a QoS (Quality-of-Service)

2: support synthetic objects
  computer-gen. (VR), synthesized (txt2speech), model-based ("face")

3: support truly interactive applications (more than play/pause/rewind..)
  Interact with the audiovisual scene generated at the decoder's site

MPEG-4: Scope

Definition of
  System Decoder Model
    specification for decoder implementations
  Description language
    binary syntax of an AV object's bitstream representation
    scene description information
  Corresponding concepts, tools and algorithms, especially for
    content-based compression of simple and compound audiovisual objects
    manipulation of objects
    transmission of objects
    random access to objects
    animation
    scaling
    error robustness
MPEG-4: Scope (cont.)

Targeted bit rates for video and audio:
  VLBV core
    Very Low Bit-rate Video
    5 - 64 Kbit/s
    image sequences with CIF resolution and up to 15 frames/s
  Higher-quality video
    64 Kbit/s - 4 Mbit/s
    quality like digital TV
  Natural audio coding
    2 - 64 Kbit/s

Encoder
  Must generate timing information
    speed of the encoder clock = time base
    desired decoding times and/or expiration times
    by using time stamps attached to the stream
  Can specify the minimum buffer resources needed for decoding

MPEG-4: Video and Image Encoding

Encoding / decoding of
  Rectangular images and video
    coding similar to MPEG-1/2
    motion prediction
    texture coding
  Images and video of arbitrary shape
    as done in conventional approach
    8x8 DCT or shape-adaptive DCT
    plus coding of shape and transparency information

MPEG-4: Composition of Scenes

Scene description includes:
  Tree to define hierarchical relationships between objects
  Objects' positions in space and time
    by converting the object's local coordinate system into a global coordinate system
  Attribute value selection
    e.g. pitch of sound, color, texture, animation parameters

Description based on some VRML concepts
  VRML = Virtual Reality Modelling Language

Interaction with scenes
  e.g. change viewing point, drag object, start/stop streams, select language

MPEG-4: Example of a Composition

[Figure: example scene in which primitive AVOs are grouped into compound objects]
MPEG-4: Scaling

Three approaches:
  Spatial scalability
    decoder displays textures and visual objects at a reduced spatial resolution
    by decoding only a subset of the total bit stream
    32 levels max. for textures and still images
    3 levels max. for video sequences
  Temporal scalability
    decoder displays video at a reduced temporal resolution
    by decoding only a subset of the total bit stream
    3 levels max.
  Quality scalability
    bitstream is parsed into a number of bit stream layers of different bit-rates
    either during transmission or in the decoder
    subset of the layers still yields a meaningful signal

Spatial and temporal scaling both for
  Conventional rectangular display and
  Objects with arbitrary shape

MPEG-4: Synthetic Objects

Visual objects:
  Human face
    start object: neutral-expression face
    animated via FDPs and/or FAPs
    FAP (facial animation parameter): animate current display
    FDP (facial definition parameter): alternative shape/texture
  Mesh + texture mapping: for 2D & 3D meshes
    2D mesh may also be used for human face animation, see above
    only triangular 2D meshes, vertices may be moved (motion vectors!), texture is warped
    e.g. virtual background
  Texture coding for view-dependent applications
    texture, e.g. virtual background; decoder/encoder loop for "minimal" transmission

MPEG-4: Synthetic Objects

Audio objects:
  Text-to-speech
    speech generation from given text and prosodic parameters
    face animation control
  Score driven synthesis
    music generation from a score or scene description commands
    more general than MIDI
  Special effects

MPEG-4: Layered Networking Architecture

Layers, from top to bottom (data units exchanged between layers are indented):

Media                       Display / Recording
CoDec                       Coding / Decoding
  Access Units              e.g. video or audio frames
Adaptation Layer            A/V object data + stream type info, sync. info, QoS req., ...
  Elementary Streams
FlexMux Layer               Flexible Multiplexing
                            e.g. multiple elementary streams with similar QoS requirements
  Multiplexed Streams
TransMux Layer              Transport Multiplexing
                            - only interface specified
                            - layer itself can be any network, e.g. RTP/UDP/IP, AAL5/ATM
Network or Local Storage
MPEG-4: Layered Networking Architecture (cont.)

DMIF: Delivery Multimedia Integration Framework
  Allows to establish multiple party sessions
    interaction with
      remote interactive peers
      broadcast systems
      storage systems
    establishment of channels with specific QoSs and bandwidths
  Controls
    FlexMux layer
    TransMux layer

MPEG-4: Error Handling

Mobile communication:
  Low bit-rate (< 64 Kbps)
  Error-prone

MPEG-4 concepts for error handling:
  Resynchronization
    enables receiver to tune in again
    based on markers within bitstream
  Data recovery
    enables receiver to reconstruct lost data
    encode data in an error-resilient manner
  Error concealment
    enables receiver to bridge gaps in data
    e.g. by repeating parts of old frames
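
Error concealment by repeating parts of old frames can be sketched in a few lines. The representation is an assumption made for illustration: a frame is a list of blocks, and a block that was lost in transmission is marked as `None`.

```python
# Sketch of error concealment: lost blocks (None) of the current frame
# are replaced by the co-located blocks of the previous frame.

def conceal(previous_frame, current_frame):
    """Return the current frame with lost blocks filled from the previous frame."""
    return [cur if cur is not None else prev
            for prev, cur in zip(previous_frame, current_frame)]
```

This bridges the gap visually but does not recover the lost data; that is the task of the data recovery mechanisms listed above.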

12. Wavelets

Motivation

JPEG / DCT problems:
  DCT not applicable to whole image, but only to small blocks
    block structure becomes visible at high compression ratios
  Scaling as add-on: additional effort
  DCT function is fixed: cannot be adapted to source data

Improvements by using Wavelets:
  Transformation of the whole image
    overcomes visible block structures and introduces inherent scaling
  Better identification of which data is relevant to human perception
    higher compression ratio

Wavelets: Compression / Decompression

Compressor:    Forward Wavelet Transformation --> Quantizer --> Encoder
Decompressor:  Decoder --> DeQuantizer --> Inverse Wavelet Transformation

The same overall structure as for DCT-based algorithms
But: important differences in the transformation step
Wavelets: Fundamental Idea

Image is transformed into the frequency domain (as in JPEG)
But: based on Wavelet functions instead of cosine functions
  [Figure: a cosine function compared with an example Wavelet]

Advantage: Wavelets yield zero value outside a limited interval
  Wavelet is confined to a part of the image
  Image need not be split into blocks

Use Wavelet family: $\{2^{-j/2}\,\psi(2^{-j}x - k)\},\ j,k \in \mathbb{Z}$, $\psi$ being a Wavelet

Wavelets: Transformation Steps

"Discrete Wavelet Transformation" (Mallat, 1989)
Split image recursively by using high and low pass filters
  [Figure: the image is read by line and filtered with L (low pass) and H (high pass);
  both results are read by column and filtered with L and H again. This yields
  c^1 (lower frequencies, transformed image with reduced size, input to the next step)
  and d_1^1, d_2^1, d_3^1 (higher frequencies)]

Wavelets: Transformation Steps (cont.)

In each step i:
  Three images d_x^i (x=1,2,3):
    containing the high frequency parts of the image
    representing "details" of the image
    submitted to Wavelet transformation
    or thrown away in case of scaling
  One image c^i:
    containing the lower frequency parts of the image
    representing the original image with less details / at a lower resolution
    submitted to step i+1

Up to here: 4 images with 1/4 resolution each --> no compression!
  but again: decorrelation: many coefficients in d-images (close to) 0

Afterwards:
  Quantization
  Entropy encoding
  as with DCT

Wavelets: DWT compared with DCT

Advantages of DWT over DCT:
  No block artefacts
  Inherent scaling
    based on the d_x^i for i=1,2,3,...
  Lower time complexity for the transformation
    DCT: O(n log n),
    DWT: O(n) (n = number of values to be transformed)
  Higher flexibility: Wavelet function can be freely chosen
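
The recursive splitting described in the transformation steps can be sketched as follows. The slides do not name a particular filter pair; the sketch assumes the Haar wavelet (low pass = pairwise average, high pass = pairwise half-difference) because it is the simplest choice. All function names are illustrative, and image sides are assumed to be powers of two.

```python
# Sketch of a 2D discrete wavelet transformation with Haar filters (assumed).
# Each step yields one low-frequency image c and three detail images d.

def filter_lines(img):
    """Apply low and high pass along each line, halving the line length."""
    low = [[(r[i] + r[i + 1]) / 2 for i in range(0, len(r), 2)] for r in img]
    high = [[(r[i] - r[i + 1]) / 2 for i in range(0, len(r), 2)] for r in img]
    return low, high

def transpose(img):
    """Swap lines and columns so that columns can be filtered like lines."""
    return [list(col) for col in zip(*img)]

def dwt_step(img):
    """One transformation step: returns (c, (d_a, d_b, d_c))."""
    low, high = filter_lines(img)                  # image read by line
    ll, lh = filter_lines(transpose(low))          # low part read by column
    hl, hh = filter_lines(transpose(high))         # high part read by column
    return transpose(ll), (transpose(lh), transpose(hl), transpose(hh))

def dwt(img, steps):
    """Apply `steps` transformation steps; c of step i is the input of step i+1."""
    details = []
    c = img
    for _ in range(steps):
        c, d = dwt_step(c)
        details.append(d)
    return c, details
```

For smooth image regions the detail images consist mostly of values at or near zero, which is what makes the subsequent quantization and entropy encoding effective. Dropping the detail images of the first steps gives the reduced-resolution picture used for scaling.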
Wavelets: Further Issues

Edge detection reduces high frequencies:
  First extract detected edges
  Then apply wavelets to such a filtered image

Application to video:
  [Figure: images ..., I_(n-2), I_(n-1), I_n over time t; the differences
  I_n - I_(n-1), I_(n-1) - I_(n-2), ... are computed and passed to the Wavelet compressor]

13. Fractal Image Compression

Image Generation

Mandelbrot
  recursive construction of images
  infinite granularity
  self-similarities in images
  Z_i = RealConst. * Z_(i-1) + ComplexConst

Use of Fractals for Compression??? Overview (1)

observation: self-similarities in natural images
  (clouds, dunes, beaches: zoom-in reveals similar forms as large image)
idea: can natural images be described w/ fractal geometry??
  first published by Barnsley & Sloan (88), first impl. 89 by Arnaud Jacquin

Key #1: Iterated Function Systems (IFS):
  input (sub-)picture subject to math. transform. of type
  $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e \\ f \end{pmatrix}$
  picture moved, rotated / mirrored, and contracted
  --> all transformations are "contractions"

Key #2: Banach's Fixed Point Theorem:
  apply a set Wimg={Wi} of contractions to an image
  after infinitely many applications, a specific image appears
  ... called "attractor" or "fractal"
  this process is independent of initial "start" image!!
  human perception: iteration can stop "pretty soon" (finite no. of iterations)

Q: how to find Wimg such that attractor is image-to-be-compressed?

Use of Fractals for Compression??? Overview (2)

Key #3: Collage Theorem:
  in order to find Wimg as above: search Wimg such that image is
  (almost) transformed into itself!

First algorithm published (Jacquin):
  partition image into (small, non-overlapping) "range blocks"
  search (larger, overlapping) "domain blocks" which can be
  "contracted" into range blocks
  for each range block, find domain block and contraction
  (lots of possibilities!!)
  details / simplifications of Jacquin approach see below
To apply self-similarity: Image Generation

Examples (from TUD + Univ. Bochum) for recursive construction of images
  to produce self-similar structures
  [Figure: Sierpinski triangle]
  infinite steps applied to different source images lead to same result,
  known as Sierpinski triangle
  limit ("Grenzwert") also known as attractor

To Find Self-Similarities

affine function allows for
  translation
  rotation
  scaling
  brightness adaptation

IFS: Iterated Function System
  ideally completely self-similar
  example: see the Sierpinski triangle

PIFS: Partitioned Iterated Function System
  real images are not completely self-similar
  Wimg?
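
The statement that different source images lead to the same result can be checked with a small IFS sketch. It assumes the usual construction of the Sierpinski triangle from three contractions, each scaling by 1/2 towards one corner of a triangle; the corner coordinates and function names are chosen for illustration. An "image" is represented as a list of points.

```python
# Sketch of an Iterated Function System for the Sierpinski triangle.
# Assumption: three maps, each contracting by factor 1/2 towards one corner.

def sierpinski_maps():
    """Return the three contractions of the IFS."""
    corners = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]
    return [lambda p, c=c: ((p[0] + c[0]) / 2, (p[1] + c[1]) / 2)
            for c in corners]

def apply_ifs(points, maps, steps):
    """Apply all maps to all points, `steps` times (point count triples per step)."""
    for _ in range(steps):
        points = [w(p) for p in points for w in maps]
    return points

if __name__ == "__main__":
    maps = sierpinski_maps()
    a = apply_ifs([(0.0, 0.0)], maps, 8)
    b = apply_ifs([(5.0, -3.0)], maps, 8)
    # Largest distance between corresponding points of the two results.
    gap = max(((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
              for p, q in zip(a, b))
    print(len(a), gap)
```

With every step the distance between the two point sets is halved, so after a few steps the results are visually indistinguishable regardless of the start point.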

Theoretical Basis

Banach's Fixed Point Theorem:
  Let $F$ be a (complete) metric space
  Let $W: F \to F$ be a contractive mapping
    i.e. there exists an $s$, $0 < s < 1$, with $|W(x) - W(y)| \le s \cdot |x - y|$ for all $x, y \in F$
  Then $W$ has exactly one fixed point $x_f$
    i.e. $W(x_f) = x_f$
  $x_f$ can be computed as $x_f = \lim_{n \to \infty} W^n(x)$ with any $x \in F$

Application to image compression:
  Let img be the image to be compressed
  Regard the set of all possible images as a metric space
    metric e.g.: maximum difference between the pixels of two pictures
  Goal: construct Wimg such that img is the fixed point of Wimg

Fractal Image Compression and Decompression

Compression: Find appropriate Wimg (difficult)

Decompression: Apply Wimg iteratively to any image (easy)
  Stop when error falls below some bound
  Error can be calculated by "Collage Theorem"
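
The decompression idea (apply a contraction repeatedly to an arbitrary start value and stop when the change falls below a bound) can be tried out on plain numbers instead of images. The mapping `w(x) = 0.5 * x + 1` is a made-up example with contraction factor s = 0.5 and fixed point 2; the function name and the stopping rule based on the last change are assumptions of this sketch.

```python
# Sketch of the fixed point iteration behind fractal decompression,
# shown on real numbers instead of images.

def iterate_to_fixed_point(w, x, bound=1e-9, max_steps=1000):
    """Apply w repeatedly until two successive values differ by less than bound."""
    for step in range(1, max_steps + 1):
        nxt = w(x)
        if abs(nxt - x) < bound:
            return nxt, step
        x = nxt
    raise RuntimeError("no convergence within max_steps")

def w(x):
    """Example contraction with s = 0.5; its only fixed point is x = 2."""
    return 0.5 * x + 1.0

if __name__ == "__main__":
    for start in (0.0, 100.0, -7.0):
        print(start, iterate_to_fixed_point(w, start))
```

Every start value ends up at the same fixed point, which mirrors the claim that decompression may begin with any image.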
How to Find Wimg?

Systematic search based on
"Partitioned Iterated Function System (PIFS)"
  Partition image into "range blocks" R_i
    8*8 pixel blocks
    non-overlapping
  Consider all "domain blocks" D_j of double size
    16*16 pixel blocks
    overlapping
  Find for each R_i the most similar D_j
    consider rotations (0°/90°/180°/270°) and mirroring
    adapt brightness and contrast of D_j to that of R_i
  translation, rotation, mirroring, brightness adaptation
  define a (partial) affine function
  Combine partial functions to Wimg

Compression rate?

Example: for each (8*8) range block:
  contraction factor fixed
  3 bit for transformation
  16 bit for domain block coordinates
  12 bit for brightness/contrast adaptation
  --> factor is 8 x 8 x 8 : 31 = 512 : 31 (cf. JPEG example)
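
The compression rate example can be recomputed step by step. The sketch below assumes 8 bits per pixel, which is what the product 8 x 8 x 8 on the slide implies; the function and parameter names are illustrative.

```python
# Sketch: bit budget per range block in the PIFS example.
# Assumption: uncompressed blocks use 8 bits per pixel.

def pifs_compression_ratio(block_size=8, bits_per_pixel=8,
                           transform_bits=3, coord_bits=16, adapt_bits=12):
    """Return (raw bits, coded bits, ratio) for one range block."""
    raw_bits = block_size * block_size * bits_per_pixel
    coded_bits = transform_bits + coord_bits + adapt_bits
    return raw_bits, coded_bits, raw_bits / coded_bits

if __name__ == "__main__":
    raw, coded, ratio = pifs_compression_ratio()
    print(f"{raw} : {coded} = {ratio:.1f} : 1")
```

The 3 bits for the transformation match the search described above: four rotations combined with optional mirroring give 4 x 2 = 8 cases.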

Further Improvements

Quadtree partitioning:
  Problem:
    fixed 8*8 blocks do not reflect image properties
  Solution:
    flexible partition of image into larger or smaller squares
    driven by image structure

Partitioning into rectangles and triangles

Advantages & Drawbacks

+ High quality at high compression rates
    At least for images with self-similarities
    Here: better than JPEG ("cross-over point" at about 1:10 to 1:30)

+ Zooming into image supported
    detailed view possible, interpolation instead of "pixelization"

+ Scalability
    decompression steps yield iteratively improving image

- Long compression times
    asymmetric mechanisms
    improving search techniques for range & domain block pairs

- Information losses, with blockwise artifacts
    Wimg is only approximative

- Not well applicable to images of non-fractal nature
    E.g. texts, sharp lines & no quality guarantee possible

- Lower quality than JPEG at low compression rates

- Error (error propagation)
14. Conclusion

JPEG:
  Very general format with high compression ratio
  SW and HW for baseline mode available

H.261 / H.263:
  Established standard by telecom world
  Preferable hardware realization

MPEG family of standards:
  Video and audio compression for different data rates
  Asymmetric (focus) and symmetric

Next steps: wavelets, fractals, models of objects