3 Compression

Prof. Dr.-Ing. Lars Wolf
TU Braunschweig
Institut für Betriebssysteme und Rechnerverbund
Mühlenpfordtstraße 23, 38106 Braunschweig, Germany
Email: wolf@ibr.cs.tu-bs.de

Scope
  [Figure: course overview in layers, from top to bottom:
    Multimedia Applications
    Usage: Learning & Teaching, Design, User Interfaces
    Services: Content Processing, Documents, Security, ..., Synchronization, Group Communications
    Systems: Databases, Programming, Media-Server, Operating Systems, Communications, Opt. Memories, Quality of Service, Networks
    Basics: Compression, Computer Architectures, Image & Graphics, Animation, Video, Audio
  This section covers Compression.]

Contents
  1. Motivation
  2. Requirements: General
  3. Fundamentals: Categories
  4. Source Coding
  5. Entropy Coding
  6. Hybrid Coding
  7. JPEG
  8. H.261 and related ITU Standards
  9. MPEG-1
  10. MPEG-2
  11. MPEG-4
  14. Conclusion

1. Motivation
  Uncompressed digital media in computing require, for
  Text:
    1 page with 80 char/line and 64 lines/page and 2 Byte/char
    80 x 64 x 2 x 8 = 80 kBit/page
  Image:
    24 Bit/pixel, 512 x 512 pixel/image
    512 x 512 x 24 = 6 MBit/image
  Audio:
    CD quality, sample rate 44.1 kHz, 16 Bit/sample
    Mono: 44.1 x 16 = 706 kBit/s
    Stereo: 1.411 MBit/s
  Video:
    full frames with 1024 x 1024 pixel/frame, 24 Bit/pixel, 30 frames/s
    1024 x 1024 x 24 x 30 = 720 MBit/s
    more realistic: 360 x 240 pixel/frame = 60 MBit/s
  Hence compression is necessary
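
The motivation figures can be re-derived with a few lines of Python. This is only a sketch: the helper name `raw_bits` is made up for illustration, and the 360 x 240 case assumes the same 24 Bit/pixel and 30 frames/s as the full-frame case, which the slide does not state explicitly. Note that the slide rounds text, image and video with binary prefixes (1 k = 1024) and audio with decimal ones.

```python
from math import prod

def raw_bits(*factors):
    """Multiply the given factors to get an uncompressed size in bits."""
    return prod(factors)

text_page    = raw_bits(80, 64, 2, 8)       # chars/line, lines/page, Byte/char, Bit/Byte
image        = raw_bits(512, 512, 24)       # pixels x pixels x Bit/pixel
audio_mono   = raw_bits(44_100, 16)         # samples/s x Bit/sample -> Bit/s
audio_stereo = 2 * audio_mono
video_full   = raw_bits(1024, 1024, 24, 30) # Bit/s
video_small  = raw_bits(360, 240, 24, 30)   # assumption: 24 Bit/pixel, 30 frames/s

for name, bits in [("text page", text_page), ("image", image),
                   ("audio mono", audio_mono), ("audio stereo", audio_stereo),
                   ("video full", video_full), ("video small", video_small)]:
    print(f"{name:12s} {bits:>13,d} bit")
```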

2. Requirements: General
  [Figure: compression surrounded by its general requirements]
    low delay
    intrinsic scalability
    high quality
    low complexity (e.g., ease of decoding)
    efficient implementation (e.g., memory req.)

Requirements
  Dialogue and retrieval mode requirements:
    Independence of frame size and video frame rate
    Synchronization of audio, video, and other media
  Dialogue mode requirements:
    Compression and decompression in real-time (e.g. 25 frames/s)
    End-to-end delay < 150 ms
    Symmetric: compression and decompression take the same time
  Retrieval mode requirements:
    Fast forward and backward data retrieval
    Random access within 1/2 s
    Asymmetric: compression takes longer than decompression
  Software and/or hardware-assisted implementation requirements

3. Fundamentals: Categories
  entropy coding
    - ignoring semantics of the data
    - lossless
  source coding
    - based on semantics of the data
    - often lossy
  channel coding
    - adaptation to communication channel
    - introduction of redundancy
  hybrid coding
    - source coding combined with entropy coding

Categories and Techniques
  Category         Technique
  Entropy Coding   Run-Length Coding
                   Huffman Coding
                   Arithmetic Coding
  Source Coding    Prediction:      DPCM, DM
                   Transformation:  FFT, DCT
                   Layered Coding:  Bit Position, Subsampling, Sub-Band Coding
                   Vector Quantization
  Hybrid Coding    JPEG
                   MPEG
                   H.261, H.263
                   proprietary: Quicktime, ...

Categories & Techniques, Cont.
  Two principal possibilities
    1. Entropy Coding: Eliminate Redundancy (thus, lossless)
    2. Reduction Coding: Eliminate Irrelevance / Low-Relevance (lossy)
  Preparatory Step: Decorrelation - Eliminate Interdependencies
    this is the essence of source coding
    changes "representation" of media
    goal usually: reduce dependencies between data
    as such, is a preparatory step!! and usually, does not compress
  Hybrid coding steps (often): decorrelation - reduction - entropy cod.
    often: reduction by quantization
    last step: additional compression without harm
  note: literature usually uses terms as in last slide!!
  note: reduction coding is "smart deletion", not really "compression"

Categories & Techniques: Symmetric / Asymmetric
  Major distinction: Symmetric / Asymmetric
  Asym. (usually): more effort for compression
    o.k. if compression non real-time, "only once" (movie!)
    may involve number-crunchers (...owned by content provider)
  Symmetric: "required" for real-time, e.g., videoconferencing
    in reality, often not 100% symmetric

Categories & Techniques: Further Considerations
  Further considerations include, e.g.,
  Adjustable compression rate? ...quality?
  "smooth" bit stream ("isochronous")?
    terms: CBR (const. bit rate) vs. VBR (variable bit rate)
    may be "over time": e.g., packet size Big Small Small Big Small Small ...
    may be simulated w/ loop-back filter plus buffer
  "progressive" (mainly: non-continuous media): display-while-download
  "streaming": ~ same for video (here, rather an issue of software)
  more subtle issues
    "open" standard?
    good "performance" (ratio, speed) for all kinds of media?
    bullet-proof, well-understood?
    ...

4. Source Coding: DPCM
  DPCM = Differential Pulse-Code Modulation
  Assumptions:
    Consecutive samples or frames have similar values
    Prediction is possible due to existing correlation
  Fundamental steps:
    Predict next data based on previously processed data
    Determine difference between actual next data and prediction
    Code difference only
  [Figure: previous data 1000, actual next data 1005; prediction 1000, coded difference 5]
  Challenge: optimal predictor
  Delta modulation (DM): 1 bit as difference signal
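
The three DPCM steps (predict, take the difference, code the difference only) can be sketched as follows. This is a minimal lossless variant under an assumption that is not on the slide: the predictor is simply "the previous sample", starting from 0. The function names are illustrative.

```python
def dpcm_encode(samples):
    """Return the difference between each sample and its prediction."""
    prediction = 0            # assumption: predictor starts at 0
    differences = []
    for sample in samples:
        differences.append(sample - prediction)
        prediction = sample   # predictor: the previous sample
    return differences

def dpcm_decode(differences):
    """Rebuild the samples by adding each difference to the prediction."""
    prediction = 0
    samples = []
    for diff in differences:
        sample = prediction + diff
        samples.append(sample)
        prediction = sample
    return samples

encoded = dpcm_encode([1000, 1005, 1003])
print(encoded)                # [1000, 5, -2]
print(dpcm_decode(encoded))   # [1000, 1005, 1003]
```

With correlated input, the differences are small numbers that need fewer bits than the samples themselves; a better predictor makes them smaller still.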

Source Coding: Transformation
  Assumptions:
    Data in the transformed domain is easier to compress
    Related processing is feasible
  Example:
    [Figure: time domain --Fourier Transformation--> frequency domain; frequency domain --Inverse Fourier Transformation--> time domain]
  FFT: Fast Fourier Transformation
  DCT: Discrete Cosine Transformation

Source Coding: Sub-Band
  Assumption:
    Some frequency ranges are more important than others
  Example:
    [Figure: frequency spectrum of the signal over frequency, with the region of transformation / coding marked]
  Application:
    Telephone system: 300 - 3400 Hz only
    MPEG audio

5. Entropy Coding
Entropy Coding: Principle
  Entropy (in information theory): information content / "density"
    symbols/words equally likely: high entropy (full of information)
    otherwise: lower entropy (suboptimal representation of info, less dense)
  [Figure: two histograms of probability over grey levels.
    Flat histogram: entropy high. Note: this seems "little information" to us since it is very regular; this is not covered by the entropy formula, yet may be used for compression (e.g. run length).
    Peaked histogram: entropy low, "little info" because "most of picture is in same gray".]
  note: entropy does not consider arrangement (cf. run length encoding!)

Entropy Coding: Principle
  Entropy formula:
    $H(P) = -\sum_{\sigma} p(\sigma) \log_B p(\sigma)$
  example: given 4 possible symbols (words) in source code
    i) IF all equal, p = 1/4: H(P) = 2
    ii) IF p = 1/2, 1/4, 1/8, 1/8: H(P) = 1 6/8 = 1.75
  "Entropy coding" means:
    mean code length per symbol (almost) equals entropy
    in ii) above, with B = 2 (binary):
      p = 1/2: code length $-\log_2(1/2) = -(-1) = 1$ bit; p = 1/4: 2 bits, etc.
  GOAL: find code w/ symbol length as close as possible to $-\log_B p(\sigma)$
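
As a quick check of the two entropy examples (four symbols, B = 2), the formula can be evaluated directly. The function names are illustrative, not part of the slides.

```python
from math import log2

def entropy(probabilities):
    """H(P) = -sum p * log2(p), in bits per symbol (B = 2)."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

def ideal_code_lengths(probabilities):
    """Ideal code length -log2(p) for every symbol."""
    return [-log2(p) for p in probabilities]

print(entropy([1/4, 1/4, 1/4, 1/4]))             # 2.0
print(entropy([1/2, 1/4, 1/8, 1/8]))             # 1.75
print(ideal_code_lengths([1/2, 1/4, 1/8, 1/8]))  # [1.0, 2.0, 3.0, 3.0]
```

In case ii) a code with lengths 1, 2, 3, 3 bits reaches the entropy exactly, because all probabilities are powers of 1/2.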

Run-Length
  (only marginal relation to entropy)
  Assumption:
    Long sequences of identical symbols
  Example:
    ... A B C E E E E E E D A C B ...
      compression -->
    ... A B C E ! 6 D A C B ...
    (E = symbol, ! = special flag, 6 = number of occurrences)
  Special variant: zero-length encoding
    only repetitions of zeroes count
    in the flagged part above, "symbol" not needed (i.e. "pays" for > 2 repetitions)

Entropy Coding: Huffman
  Assumption:
    Some symbols occur more often than others
    E.g., character frequencies of the English language
  Fundamental principle:
    Frequently occurring symbols are coded with shorter bit strings
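
The run-length example can be sketched as below. Two assumptions are made for illustration: the flag symbol `!` never occurs in the data, and a run is only replaced when it has at least 4 symbols, since the triple (symbol, flag, count) costs 3 tokens. The output is a token list rather than a byte stream.

```python
from itertools import groupby

def rle_encode(data, flag="!", min_run=4):
    """Replace runs of >= min_run identical symbols by [symbol, flag, count]."""
    tokens = []
    for symbol, group in groupby(data):
        run = len(list(group))
        if run >= min_run:
            tokens += [symbol, flag, run]
        else:
            tokens += [symbol] * run
    return tokens

def rle_decode(tokens, flag="!"):
    """Expand [symbol, flag, count] triples back into runs."""
    out = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and tokens[i + 1] == flag:
            out.extend([tokens[i]] * tokens[i + 2])
            i += 3
        else:
            out.append(tokens[i])
            i += 1
    return "".join(out)

tokens = rle_encode("ABCEEEEEEDACB")
print(tokens)               # ['A', 'B', 'C', 'E', '!', 6, 'D', 'A', 'C', 'B']
print(rle_decode(tokens))   # ABCEEEEEEDACB
```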

Entropy Coding: Huffman
  Example:
    Symbols to be encoded: A, B, C, D, E
    Given probabilities of occurrence:
      p(A)=0.3, p(B)=0.3, p(C)=0.1, p(D)=0.15, p(E)=0.15
  Coding tree (branch bits in parentheses):
    100%
      60% (1)
        A 30% (1)
        B 30% (0)
      40% (0)
        25% (1)
          C 10% (1)
          D 15% (0)
        E 15% (0)
  symbol  probability  code
  A       30%          11
  B       30%          10
  C       10%          011
  D       15%          010
  E       15%          00

Entropy Coding: Huffman
  Table and example of application to data stream
  symbol  code
  A       11
  B       10
  C       011
  D       010
  E       00

  data stream:  B  A  C   D   A  B  E  B  A  E
  encoded:      10 11 011 010 11 10 00 10 11 00
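
A compact way to build such a code is the usual greedy merge of the two least probable subtrees. The sketch below uses Python's `heapq`; weights are given in percent to avoid floating-point ties, and the function names are illustrative. The 0/1 labels on the branches are arbitrary, so the bit patterns may differ from the table above, while the code lengths (2 bits for A, B, E and 3 bits for C, D) are the same.

```python
import heapq
from itertools import count

def huffman_code(weights):
    """Build a prefix code {symbol: bitstring} from {symbol: weight}."""
    tie = count()  # tie-breaker so the heap never compares subtrees
    heap = [(w, next(tie), sym) for sym, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w0, _, t0 = heapq.heappop(heap)   # two least probable subtrees
        w1, _, t1 = heapq.heappop(heap)
        heapq.heappush(heap, (w0 + w1, next(tie), (t0, t1)))
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):       # inner node: (left, right)
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                             # leaf: a symbol
            codes[tree] = prefix or "0"
    walk(heap[0][2], "")
    return codes

def encode(text, codes):
    return "".join(codes[s] for s in text)

def decode(bits, codes):
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:            # prefix property: first match is the symbol
            out.append(inverse[current])
            current = ""
    return "".join(out)

weights = {"A": 30, "B": 30, "C": 10, "D": 15, "E": 15}
codes = huffman_code(weights)
bits = encode("BACDABEBAE", codes)
print(codes, len(bits))
```

The mean code length is (30*2 + 30*2 + 15*2 + 10*3 + 15*3) / 100 = 2.25 bits per symbol, compared with 3 bits for a fixed-length code over 5 symbols.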

6. Hybrid Coding
  Basic Encoding Steps
  [Figure: source data --> data preparation (e.g. resolution, frame rate) --> data processing (e.g. DCT, sub-band coding) --> quantization (e.g. linear; DC, AC values) --> entropy encoding (e.g. runlength, Huffman) --> compressed data.
   The steps are annotated as lossy or lossless; video: lossy, audio: lossy (sometimes lossless).]

7. JPEG
  JPEG: Joint Photographic Experts Group
  International Standard:
    For digital compression and coding of continuous-tone still images:
      Gray-scale
      Color
    Since 1992
  Joint effort of:
    ISO/IEC JTC1/SC2/WG10
    Commission Q.16 of CCITT SGVIII
  Compression rate of 1:10 yields reasonable results

JPEG
  Very general compression scheme
  Independence of:
    Image resolution
    Image and pixel aspect ratio
    Color representation
    Image complexity and statistical characteristics
  Well-defined interchange format of encoded data
  Implementation in:
    Software only
    Software and hardware
  MOTION JPEG for video compression
    Sequence of JPEG-encoded images

JPEG Compression Steps
  [Figure: source image --> image preparation (pixel, block, MCU) --> image processing (predictor, FDCT) --> quantization --> entropy encoding (runlength, Huffman, Arithm.) --> compressed image]
  MCU: Minimum Coded Unit
  FDCT: Forward Discrete Cosine Transformation

JPEG: Image Preparation
  [Figure: planes C1, C2, C3, ...; plane i has resolution Xi x Yi; data units are pixels or 8*8-blocks]
  Planes:
    1 <= N <= 255 components Ci (e.g., one plane per color)
    Contain data units
      Pixels in lossless mode, 8*8-blocks in lossy mode
    Different planes may have different resolutions
  Number of bits per pixel:
    8 or 12 bit per pixel in lossy modes
    2 to 16 bit per pixel in lossless mode

JPEG Image Preparation
  Example: 4:2:2 YUV, 4:1:1 YUV, and YUV9 Coding
  Luminance (Y):
    brightness
    sampling frequency 13.5 MHz
  Chrominance (U, V):
    color differences
    sampling frequency 6.75 MHz

JPEG Image Preparation
  Non-interleaved encoding:
    [Figure: data units of one plane processed row by row, from left to right and top to bottom]
  Interleaved encoding:
    [Figure: data units of components C1 + C2 + C3 combined = MCU]
  Minimum Coded Unit (MCU):
    Combination of interleaved data units of different components
    Data of an MCU are stored and transmitted together

JPEG: 4 Modes of Compression
  Lossy sequential DCT-based mode
    Baseline Mode
  Expanded lossy DCT-based mode
    Progressive image display
    I.e. from coarse to fine resolution
  Lossless mode
    Lossless compression, error-free decompression
  Hierarchical mode
    Compression with multiple resolutions

JPEG Baseline Mode
  [Figure: source image --> 1. image preparation (8x8 blocks) --> 2. image processing (FDCT) --> 3. quantization (tables) --> 4. entropy encoding (tables) --> compressed image]
  Baseline mode is mandatory for all JPEG implementations:
    Often restricted to certain resolution
    Often only three planes with predefined color set-up
  Image preparation:
    Step 1a: pixel resolution p = 8 bit; image is split into 8x8 pixel blocks (data units)
    Step 1b: unsigned --> signed integer (prepare for "oscillation" --> sin/cos)
    ... other steps see below
    Step 4a: zigzag linearization (see below)
    Steps 4b, c, ...: several entropy coding algorithms applied

Intuitive Understanding of DCT
  Fourier Transform (& FFT "fast" algorithm) known from 1-dimensional:
    cut waveform into pieces (blocks of samples)
    for each block:
      interpret as periodic (infinite) oscillating waveform
      represent as sum of sin/cos waves $a_i \sin(i \omega t)$; i = 0...(N-1); same for cos
      a_i coefficients; a_0 = DC (direct current = shift wrt. 0-axis),
        others: how much of the respective sin or cos wave is part of waveform
      i increasing frequencies (usually N = no. of samples in block)
  DCT in JPEG etc.:
    same idea, but 2-dimensional cos-waves
    cut out square blocks from picture (NxN)
    cos waves all have independent frequencies in horizontal/vertical direction
      comparable to smooth hills, # of valleys may differ horiz/vert.
    again: interpret sample as periodic (2D) waveform
      --> represent as sum of (2D) cos wave "hill areas"
    why only cos??
      trick: picture mirrored about the axes
      --> 4fold size --> picture symmetric to axes --> sin parts become zero
      4fold size no problem: 3 parts redundant
      axes have double "weight" (pix. row/col. "0") --> factor Cu/Cv in formula

JPEG Baseline Mode: Image Processing
  Forward Discrete Cosine Transformation (FDCT):

    $S_{vu} = \frac{1}{4} C_u C_v \sum_{x=0}^{7} \sum_{y=0}^{7} s_{yx} \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$

    with: $C_u, C_v = \frac{1}{\sqrt{2}}$ for u, v = 0; else $C_u, C_v = 1$
  Formula applied to each block for all 0 <= u, v <= 7:
    Blocks with 8x8 pixel
    result in 64 DCT coefficients:
      1 DC-coefficient S00:
        basic color of the block
      63 AC-coefficients:
        (likely) zero or near-by zero values
  Different significance of the coefficients:
    DC: most important
    AC: less important

JPEG Baseline Mode: Image Processing
  FDCT transforms:
    blocks into blocks
    not pixels into pixels
  [Figure: block of pixels (P) --FDCT--> block of coefficients with the DC coefficient (D) in the upper left corner and AC coefficients (A) elsewhere; P = Pixel, D = DC- / A = AC-coeff.]
  Example:
    Calculation of S00
    [Figure: 8x8 block of samples combined with an 8x8 block of weights]
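
The FDCT formula translates almost literally into code. The sketch below is a direct, unoptimized evaluation (real codecs use fast factorizations); the function name is illustrative. As input it uses the sample pixel block that appears later in this section, level-shifted by 128 as in Step 1b (unsigned --> signed).

```python
from math import cos, pi, sqrt

def fdct_8x8(block):
    """block[y][x]: level-shifted samples s_yx. Returns S with S[v][u] = S_vu."""
    def c(k):
        return 1 / sqrt(2) if k == 0 else 1.0
    S = [[0.0] * 8 for _ in range(8)]
    for v in range(8):
        for u in range(8):
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (block[y][x]
                              * cos((2 * x + 1) * u * pi / 16)
                              * cos((2 * y + 1) * v * pi / 16))
            S[v][u] = 0.25 * c(u) * c(v) * total
    return S

pixels = [
    [139, 144, 149, 153, 155, 155, 155, 155],
    [144, 151, 153, 156, 159, 156, 156, 156],
    [150, 155, 160, 163, 158, 156, 156, 156],
    [159, 161, 162, 160, 160, 159, 159, 159],
    [159, 160, 161, 162, 162, 155, 155, 155],
    [161, 161, 161, 161, 160, 157, 157, 157],
    [162, 162, 161, 163, 162, 157, 157, 157],
    [162, 162, 161, 161, 163, 158, 158, 158],
]
shifted = [[p - 128 for p in row] for row in pixels]  # unsigned --> signed
S = fdct_8x8(shifted)
print(round(S[0][0], 1))   # DC coefficient: 235.6
```

For u = v = 0 both cosines are 1, so S00 is just the sum of all 64 samples divided by 8, i.e. 8 times the block mean. That is why the DC coefficient represents the "basic color" of the block.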

JPEG Baseline Mode: Quantization
  Quantization of DCT-coefficients:
    Map interval of real numbers to one integer number
    Especially: small values are mapped to 0, yielding long zero sequences
  Using quantization tables:
    Different coefficients may have different granularities

JPEG Quantization Effect
  [Figure: two images, (a) and (b), showing the effect of quantization]
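
Quantization with a table amounts to dividing each coefficient by its table entry and rounding to the nearest integer. The sketch below applies this to the first row of the DCT coefficients and of the quantization matrix from the sample block shown later in this section. The function names and the choice of rounding half up are assumptions made for illustration.

```python
from math import floor

def quantize(coefficient, step):
    """Map a real DCT coefficient to the nearest integer multiple of step."""
    return floor(coefficient / step + 0.5)

def dequantize(level, step):
    """Decoder side: only the multiple of step can be recovered."""
    return level * step

coefficients = [235.6, 1.0, -12.1, -5.2, 2.1, -1.7, -2.7, 1.3]
steps        = [16, 11, 10, 16, 24, 40, 51, 61]

levels = [quantize(s, q) for s, q in zip(coefficients, steps)]
print(levels)                                               # [15, 0, -1, 0, 0, 0, 0, 0]
print([dequantize(l, q) for l, q in zip(levels, steps)])    # [240, 0, -10, 0, 0, 0, 0, 0]
```

The larger table entries towards the high frequencies turn small coefficients into zeros; this is where information is lost and where the long zero sequences come from.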

JPEG Baseline Mode: Entropy Encoding
  DC-coefficients:
    Compute the differences:
      [Figure: ... block i-1 (DCi-1), block i (DCi) ...]
      DIFF = DCi - DCi-1
    Encode differences instead of the DCi values
    Reason: DC values of adjacent blocks are often similar
  Huffman coding of all coefficients:
    Transformation into a code
    where amount of bits depends on frequency of respective value
  Subsequent runlength coding of zeros

JPEG Baseline Mode: Entropy Coding
  63 AC coefficients:
    Ordering in zig-zag form
    [Figure: 8x8 coefficient block with DC in the upper left corner, AC01 ... AC07 in the first row, AC70 in the lower left and AC77 in the lower right corner, traversed in zig-zag order]
    reason: coefficients in lower right corner are likely to be zero

JPEG: Details of (one possible) Entropy Coding
  Treatment of "zig-zag sequence":
    differential coding of DC: DCi stored as "change wrt. DCi-1"
    assumption: there will rarely be two non-zero AC values in sequence
      --> regard seq. as iteration of non-zero AC-values and zero-runlengths
      --> sometimes, the zero-runlength will have "length zero"
    code non-zero AC-values as VLIs (variable length integers)
      --> need to transmit VLI-lengths
      (difference to Huffman: end of code not found by decoder)
    create pairs (zero-runlength, VLI-length-of-following-non-zero-AC-value)
      these pairs are Huffman encoded
      the very first "pair" is not a pair, but the VLI-length of the (diff.) DC-value
    the block is finally represented as iteration
      Huffman-encoded pair / VLI-encoded non-zero-AC / Huffman-.... / VLI... / ...
      preceded by "Huffman-encoded VLI-length / VLI-encoded diff.-DC"
  Next two slides give an example of the DCT coding of an 8x8 block

JPEG: Sample Compression of 1 Block: 8x8 Matrices
  1. Typical Pixel Block:
    139 144 149 153 155 155 155 155
    144 151 153 156 159 156 156 156
    150 155 160 163 158 156 156 156
    159 161 162 160 160 159 159 159
    159 160 161 162 162 155 155 155
    161 161 161 161 160 157 157 157
    162 162 161 163 162 157 157 157
    162 162 161 161 163 158 158 158
  2. DCT Coefficients:
    235.6   1.0 -12.1  -5.2   2.1  -1.7  -2.7   1.3
    -22.6 -17.5  -6.2  -3.2  -2.9  -0.1   0.4  -1.2
    -10.9  -9.3  -1.6   1.5   0.2  -0.9  -0.6  -0.1
     -7.1  -1.9   0.2   1.5   0.9  -0.1   0.0   0.3
     -0.6  -0.8   1.5   1.6  -0.1  -0.7   0.6   1.3
      1.8  -0.2   1.6  -0.3  -0.8   1.5   1.0  -1.0
     -1.3  -0.4  -0.3  -1.5  -0.5   1.7   1.1  -0.8
     -2.6   1.6  -3.8  -1.8   1.9   1.2  -0.6  -0.4
  3. Quantization Matrix:
    16  11  10  16  24  40  51  61
    12  12  14  19  26  58  60  55
    14  13  16  24  40  57  69  56
    16  17  22  29  51  87  80  62
    18  22  37  56  68 109 103  77
    24  35  55  64  81 104 113  92
    49  64  78  87 103 121 120 101
    72  92  95  98 112 100 103  99
  4. Quantized Result:
    15   0  -1   0   0   0   0   0
    -2  -1   0   0   0   0   0   0
    -1  -1   0   0   0   0   0   0
     0   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0

JPEG: Sample Compression (contd.)
  assume: last DC value was 12 --> encoded difference is 15 - 12 = 3
  --> only 3, -2, -1 occur as non-zero values.
  Their VLI-encoding is as follows:
     3   11
    -2   01
    -1   0
  This makes the iteration look as follows (VLIs still represented as integers):
    (2)(3), (1,2)(-2), (0,1)(-1), (0,1)(-1), (0,1)(-1), (2,1)(-1), (0,0)
    ((0,0) is the abbreviation for "til end")
  The following Huffman encoding is defined:
    (2)     011
    (0,0)   1010
    (0,1)   00
    (1,2)   11011
    (2,1)   11100
  ...so that the bitstream finally consists of
  the following 31 bits (for 64 coefficients!):
    0111111011010000000001110001010
  ...btw., the decoded matrix looks like this:
    240   0 -10   0   0   0   0   0
    -24 -12   0   0   0   0   0   0
    -14 -13   0   0   0   0   0   0
      0   0   0   0   0   0   0   0
      0   0   0   0   0   0   0   0
      0   0   0   0   0   0   0   0
      0   0   0   0   0   0   0   0
      0   0   0   0   0   0   0   0

JPEG: Example
  On the following slides: picture of Yosemite Valley
    Source: pico.phys.chemie.tu-muenchen.de/people/krempl/JMT
  with various degrees of compression:
    Bitmap: no compression
      1024 * 671 pixels
      3 bytes / pixel
      2014 KByte file size
    1:31    63 KByte
    1:50    40 KByte
    1:100   21 KByte
    1:155   13 KByte
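
The whole entropy coding step for the sample block can be traced in code. This is a sketch under clearly limited assumptions: the two Huffman tables contain only the entries listed on the slide (real JPEG tables are much larger), zero runs longer than 15 are not handled, the previous DC value is taken as 12, and all function names are illustrative.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zig-zag order."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))

def vli(value):
    """Variable length integer: magnitude bits, negatives as one's complement."""
    if value == 0:
        return ""
    size = abs(value).bit_length()
    if value < 0:
        value += (1 << size) - 1
    return format(value, f"0{size}b")

def encode_block(quantized, previous_dc, dc_table, ac_table):
    seq = [quantized[r][c] for r, c in zigzag_order()]
    diff = seq[0] - previous_dc                      # differential DC coding
    bits = dc_table[len(vli(diff))] + vli(diff)
    last = max((i for i, v in enumerate(seq) if i > 0 and v != 0), default=0)
    run = 0
    for value in seq[1:last + 1]:
        if value == 0:
            run += 1                                 # count zeros before next value
        else:
            bits += ac_table[(run, len(vli(value)))] + vli(value)
            run = 0
    if last < 63:
        bits += ac_table[(0, 0)]                     # "til end" marker
    return bits

quantized = [[15, 0, -1, 0, 0, 0, 0, 0],
             [-2, -1, 0, 0, 0, 0, 0, 0],
             [-1, -1, 0, 0, 0, 0, 0, 0]] + [[0] * 8 for _ in range(5)]
dc_table = {2: "011"}                                # only the slide's entry
ac_table = {(0, 0): "1010", (0, 1): "00", (1, 2): "11011", (2, 1): "11100"}

bits = encode_block(quantized, 12, dc_table, ac_table)
print(bits, len(bits))   # 0111111011010000000001110001010 31
```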

Bitmap: No Compression
  [Figure: picture of Yosemite Valley, uncompressed bitmap]

JPEG: 1:31
  [Figure: picture of Yosemite Valley, JPEG compressed 1:31]

JPEG: 1:50
  [Figure: picture of Yosemite Valley, JPEG compressed 1:50]

JPEG: 1:100
  [Figure: picture of Yosemite Valley, JPEG compressed 1:100]

JPEG: 1:155
  [Figure: picture of Yosemite Valley, JPEG compressed 1:155]

JPEG: 4 Modes of Compression
  lossy sequential DCT-based mode (baseline mode)
  expanded lossy DCT-based mode
  lossless mode
  hierarchical mode

JPEG Extended Lossy DCT-Based Mode
  Pixel resolution 8 to 12 bit
  Sequential image display:
    Top to bottom
    Good for small images and fast processing

JPEG Extended Lossy DCT-Based Mode
  Progressive image display:
    Coarse to fine
    Good for large and complicated images

JPEG Extended Lossy DCT-Based Mode
  Principle:
    Coefficients stored in buffer after quantization
    Order of pixel/block processing changed
  By spectral selection:
    Selection according to importance of DC, AC value
    All DC values of whole image first
    All AC values in order of importance subsequently
  By successive approximation:
    Selection according to position of bits
    First the most significant bit of all blocks
    Then the second significant bit of all blocks
    Until the least significant bit of all blocks

JPEG Lossless Mode
  Image preparation:
    On pixel basis (2-16 bit/pixel)
  Image processing:
    Selection of a predictor for each pixel
    Neighbourhood of the pixel X to be predicted:
      C B
      A X
    code  prediction
    0     no prediction
    1     X = A
    2     X = B
    3     X = C
    4     X = A + B - C
    5     X = A + ((B - C) / 2)
    6     X = B + ((A - C) / 2)
    7     X = (A + B) / 2
  Entropy coding:
    Same as lossy mode
    Code of chosen predictor and its difference to the actual value
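
The predictor table of the lossless mode can be written down directly. The sketch below assumes integer samples and implements "/ 2" as an arithmetic shift right by one bit; the function names are illustrative. Note that predictor 4 is A + B - C (a plane through the three neighbours).

```python
def predict(code, a, b, c):
    """Predict pixel X from its left (A), upper (B) and upper-left (C) neighbour."""
    if code == 0:
        return 0                      # no prediction
    if code == 1:
        return a
    if code == 2:
        return b
    if code == 3:
        return c
    if code == 4:
        return a + b - c
    if code == 5:
        return a + ((b - c) >> 1)
    if code == 6:
        return b + ((a - c) >> 1)
    if code == 7:
        return (a + b) >> 1
    raise ValueError("predictor code must be 0..7")

def residual(code, x, a, b, c):
    """Value that is entropy coded: actual pixel minus prediction."""
    return x - predict(code, a, b, c)

# Neighbours A=100, B=104, C=101 and actual pixel X=102:
for code in range(8):
    print(code, predict(code, 100, 104, 101), residual(code, 102, 100, 104, 101))
```

Since the decoder knows A, B, C and the predictor code, adding the residual to the same prediction reproduces X exactly; nothing is quantized, hence the mode is lossless.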

JPEG Hierarchical Mode
  Coding of each image with several resolutions:
    Image scaling
    Differential encoding
      First, coded with lowest resolution: image A
      Coded with increased horizontal & vertical resolution: image A'
      Difference between both images is computed: B = A - A' (*)
      Iteration for higher resolutions
  Features:
    Requires more storage and higher data rate
    Fast decoding process
    Used for scalable video
    Similar to Photo-CD (Kodak, proprietary)
  (*) note for all scalable approaches:
    relate higher-res version B (or B') to receiver's de-coded
    lower-res version A' (to avoid accumulation of quantization errors)

JPEG 2000
  Goal: Establish a follow-on standard to JPEG
    Started Feb. 1996
    Call for Proposals March 1997
    Standardization Dec. 2000 (target date)
  Features:
    Compression based on Wavelet technology
      See Section 11
    Image resolution controlled by viewer, e.g.
      Low resolution for thumbnail
      High resolution for full format
      Lossless for storage
    Increased capacity for color information
      256 color channels
    Increased capacity for metadata
      I.e. information about the image

8. H.261 and related ITU Standards
  Video codec for audiovisual services at p x 64 kbit/s:
    CCITT standard from 1990
    For ISDN
    With p = 1, ..., 30
  Technical issues:
    Real-time encoding/decoding
    Max. signal delay of 150 ms
    Constant data rate
    Implementation in hardware (main goal) and software

H.261 Image Preparation
  Fixed source image format
  Image components:
    Luminance signal (Y)
    Two color difference signals (Cb, Cr)
    Subsampling according to CCIR 601 (4:1:1)
  Quarter Common Intermediate Format (QCIF) resolution:
    Mandatory
    Y: 176 x 144 pixel ("pruning" 180 --> 176)
    At 29.97 frames/s appr. 9.115 Mbit/s (uncompressed)
    but: encoder may leave out up to 3 frames (--> ~8 fps)
  Common Intermediate Format (CIF) resolution:
    Optional
    Y: 352 x 288 pixel
    At 29.97 frames/s appr. 36.46 Mbit/s (uncompressed), i.e. ~ 570 * 64 kbps
  [Figure: size comparison of CIF (360*288) and QCIF]
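
The uncompressed QCIF and CIF rates follow from the picture size and the chrominance subsampling. The sketch below assumes 8 bit per sample and that the two color difference planes together carry half as many samples as the luminance plane; the function name is illustrative.

```python
def uncompressed_rate(width, height, fps=29.97, bits_per_sample=8):
    """Uncompressed bit rate in bit/s for Y plus subsampled Cb and Cr."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)   # assumption: Cb + Cr = luma / 2
    return (luma + chroma) * bits_per_sample * fps

qcif = uncompressed_rate(176, 144)
cif = uncompressed_rate(352, 288)
print(f"QCIF: {qcif / 1e6:.3f} Mbit/s")        # QCIF: 9.115 Mbit/s
print(f"CIF:  {cif / 1e6:.2f} Mbit/s")         # CIF:  36.46 Mbit/s
print(f"CIF / 64 kbit/s: {cif / 64000:.0f}")   # CIF / 64 kbit/s: 570
```

Compared with the p x 64 kbit/s target (p at most 30, i.e. 1.92 Mbit/s), even CIF at the highest channel rate needs a compression factor of roughly 19, and QCIF over a single 64 kbit/s channel needs about 142.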

H.261 Image Preparation
  Layered structure:
    Block of 8 x 8 pixels
    Macroblock of: 4 Y blocks, 1 Cr block, 1 Cb block
    Group of blocks (GOBs) of 3 x 11 macroblocks
    Picture:
      QCIF picture: 3 GOBs
      CIF picture: 12 GOBs

H.261: Image Compression Intraframe
  Intraframe Coding: Independent coding of individual frames
    yields "reference frame" f0
    basically DCT as in JPEG baseline mode
    DCT w/ same quantization factor for all AC values
    this factor may be adjusted by loopback filter (see below)

H.261: Image Compression Interframe
  Interframe Coding: Coding dependent on previous frame(s)
  Based on motion estimation:
    [Figure: a macroblock in Frame 1 and its displaced position in Frame 2, connected by the motion vector]
    interframes: f1, f2, f3, ... relative to f0 (differential encoding)
    in H.261: intraframes rare (bandwidth!, main application videophone)
  Search for similar macroblock (16x16) in previous image
  Position of this macroblock defines motion vector
  Search range for similar block is implementation-dependent:
    max. 15 pixel
    but: motion vector may also always be 0 ("bad" software encoder)

H.261: Image Compression
  Interframe coding of a frame:
    Find for each macroblock similar macroblock in previous frame
    Encode:
      Motion vectors between macroblock pairs
        Components are encoded yielding code words of variable length
      Differences between macroblock pairs
        DCT if value higher than a specific threshold
        No further processing if value less than this threshold
  Quantization:
    Linear
    Adaptation of step size (loopback filter) --> constant data rate
      Coarse quantization if many values to be transmitted
      Fine quantization if few values to be transmitted
    ("leaky bucket": constant 64 kbps "drop out";
     loopback filter: adjust quantization factor if bucket filled
     above threshold 1 or below threshold 2, respectively)
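
H.261 leaves the search strategy to the encoder. One simple possibility, shown here purely as an illustration and not as the method prescribed by the standard, is an exhaustive search that minimizes the sum of absolute differences (SAD) within the search range. Frames are plain lists of rows, and the function names are made up for this sketch.

```python
def sad(prev, cur, top, left, dy, dx, size):
    """Sum of absolute differences between a block in cur and a shifted block in prev."""
    return sum(abs(cur[top + r][left + c] - prev[top + dy + r][left + dx + c])
               for r in range(size) for c in range(size))

def find_motion_vector(prev, cur, top, left, size=16, search=15):
    """Return (dy, dx, cost) of the best matching block in the previous frame."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= top + dy and top + dy + size <= height and
                      0 <= left + dx and left + dx + size <= width)
            if not inside:
                continue                      # candidate block leaves the frame
            cost = sad(prev, cur, top, left, dy, dx, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2], best[0]

# Tiny test frames: cur shows the content of prev displaced by (1, 2).
prev = [[r * 16 + c for c in range(12)] for r in range(12)]
cur = [[prev[min(r + 1, 11)][min(c + 2, 11)] for c in range(12)] for r in range(12)]
print(find_motion_vector(prev, cur, top=4, left=4, size=4, search=3))   # (1, 2, 0)
```

A cost of 0 means the difference block is empty and only the motion vector has to be transmitted. An encoder that always returns (0, 0) is still valid, it just produces larger differences.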

Further ITU Video Schemes (H.263, H.3xx)
  H.263
    extension to H.261
    max. bitrate: H.263 approx. 2.5 x H.261; lowest bitrates suitable f. modem
  Source Image Formats (entries apply to encoder and decoder)
    Format   Pixels        H.261        H.263
    SQCIF    128 x 96      optional     required
    QCIF     176 x 144     required     required
    CIF      352 x 288     optional     optional
    4CIF     704 x 576     not defined  optional
    16CIF    1408 x 1152   not defined  optional

H.263
  Differences of H.263 compared to H.261
    motion vector may point forward in time (future interframe), cf. MPEG, for video
    optional PB-frames (2 combined pictures: 1 B- & 1 P-Frame)
    optional overlapped block motion compensation
    optional motion vector pointing outside image
    half pel motion compensation (instead of full pel)
    JPEG is the still picture mode
    no included error detection and correction
    unlimited search space for motion vector
      --> fast encoder can do better
    ..

H.320, H.32x Family
  H.320 specifies (as overview) videophone for ISDN
  H.310
    adapt MPEG 2 for communication over B-ISDN (ATM)
  H.321
    define videoconferencing terminal for B-ISDN (instead of N-ISDN)
  H.322
    adapts H.320 for guaranteed QoS LANs (like ISO-Ethernet)
  H.323
    videoconferencing over non-guaranteed LANs
  H.324
    Terminal for low bit rate communication (over V.34 Modems)

9. MPEG-1
  Moving Picture Experts Group (MPEG)
    ISO/IEC working group(s)
    ISO/IEC JTC1/SC29/WG11
    ISO IS 11172 since 3/93
  Starting point: MPEG-1
    Audio/video at about 1.5 Mbit/s
    Based on experiences with JPEG and H.261
  Follow-up standards
    MPEG-2: choice of quality levels and compression factors
    MPEG-4: content-based encoding, high compression factor
    MPEG-7: support for content-based search and retrieval
    MPEG-21: future framework

MPEG Features
  [Figure: MPEG consists of audio, video and system parts; the audio coding data stream and the video coding data stream form a combined stream with common buffer management]
  Consideration of other standards:
    JPEG
    H.261
  Symmetric and asymmetric compression
  Constant data rate, should be < 1856 kbit/s
  Original target rate ~ 1.2 Mbit/s incl. audio (= 1x CD-ROM: 150 KByte/s)

MPEG Video: Preparation Step
  Color model: Y Cb Cr
    4:2:0 subsampling
    Y value for each pixel
    Cb and Cr in every fourth pixel only
  Resolution:
    At most 768 x 576 pixel / image
    8 bit/pixel in each layer (i.e., for Y, Cr, Cb)
  14 pixel aspect ratios
    horizontal : vertical = 1:1 or 16:9 or 4:3 or ...
  8 frame rates
    23.976 Hz, 24 Hz, 25 Hz, 29.97 Hz, 30 Hz, 50 Hz, 59.94 Hz, 60 Hz
    Lower rates not allowed!
  No user defined MCU like JPEG
  No progressive mode like JPEG

MPEG Video: Processing Step
  [Figure: frame sequence I B B P B B P I along the time axis t]
  I-frames (intra-coded frames):
    Like JPEG but real-time decoding demands
    Coding independent of other frames
  P-frames (predictive coded frames):
    Coding depends on previous I- or P-frames
    Based on motion vector
  B-frames (bi-directional predictive coded frames):
    Coding depends on previous and subsequent I- and P-frames
    Based on motion vector
  D-frames (DC-coded frames):
    Only DC-coefficients are coded, AC values are dropped
    For fast forward and rewind

MPEG: Video - Processing Step
  Motion vectors:
    [Figure: a macroblock in Frame 1 and its displaced position in Frame 2, connected by the motion vector]
  MPEG does not define how to determine the motion vectors
    I.e. specifies only the format to describe them
    but no algorithm to find them
    Programmer is free to implement any algorithm
  Difference of similar macroblocks is DCT coded
    Macroblock = 4 blocks with 8*8 pixels each
    DC and AC coefficients are runlength coded

MPEG: Video - Processing Step
  Sequence of I-, P-, and B-frames:
    Position / frequency of I-, P- and B-frames can be defined by the encoder
      I1 B1 B2 P1 B3 B4 P2 I2
    Must consider the structure of the movie:
      An I-frame should occur at least after each cut
  Order of transmission differs from order of display
      I1 P1 B1 B2 P2 B3 B4 I2
    Reason: Receiver must know I- and P-frames
      before it can display B-frames
    Problem: Additional delay

MPEG Video: Implications
  Random access
    at I-frames
    at P-frames: i.e. decode previous I-frame first
    at B-frame: i.e. decode I and P-frames first
  Editing
    decoded data
      loss of quality (encode -> decode -> encode -> ...)
      application of all video editing functions
    encoded data (previous to entropy encoding)
      preservation of quality
      transition effects as function in the DCT domain
      morphing, non-block conform overlay very difficult
    encoded data
      preservation of quality
      today: too complex, if possible, i.e. need for entropy decoding
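
The step from display order to transmission order can be sketched in a few lines: every B-frame is held back until the next I- or P-frame (its forward reference) has been sent. The function name and the convention that the first letter of a label gives the frame type are assumptions made for illustration.

```python
def transmission_order(display_order):
    """Reorder frames so that each B-frame follows both of its reference frames."""
    sent = []
    waiting_b = []                     # B-frames waiting for their next reference
    for frame in display_order:
        if frame[0] == "B":
            waiting_b.append(frame)
        else:                          # I- or P-frame: send it, then the held B-frames
            sent.append(frame)
            sent.extend(waiting_b)
            waiting_b = []
    sent.extend(waiting_b)             # trailing B-frames without a later reference
    return sent

display = ["I1", "B1", "B2", "P1", "B3", "B4", "P2", "I2"]
print(transmission_order(display))
# ['I1', 'P1', 'B1', 'B2', 'P2', 'B3', 'B4', 'I2']
```

The additional delay mentioned above is visible here: B1 cannot be sent before P1 has been captured and encoded, so the encoder has to buffer at least as many frames as there are consecutive B-frames.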

MPEG Audio Coding: Fundamentals
  [Figure: sound pressure level (dB, 0 to 80) over frequency (kHz, 0.02 to 20); shows the absolute threshold of hearing and the masking patterns of narrowband random noise at fm = 0.25, 1 and 4 kHz]
  Masking threshold in the frequency domain
    narrowband random noise
    depends on frequency

MPEG Audio Coding: Fundamentals
  [Figure: pre-, simultaneous- and post-masking around a masker; level (dB, 0 to 60) over time (ms)]
  Masking in Time Domain
    after and before the event
    depends on (to some extent) amplitude

MPEG Audio Coding
  [Figure: sub-band coding (32 sub-bands) --> quantization --> entropy coder & frame packing; a psychoacoustical model controls how many bits are reserved for which sub-band]
  Audio channel:
    Between 32 and 448 kbit/s
    In steps of 16 kbit/s
  Definition of 3 layers of quality
    Layer 1: max. 448 Kbit/s (approx. 1:4 compression)
    Layer 2: max. 384 Kbit/s (approx. 1:6-1:8, common, e.g. as MUSICAM in DAB)
    Layer 3: max. 320 Kbit/s
      MP3 files: compression up to 1:12 / 1:14 with no hearable losses

MPEG Audio Coding
  Sampling compatible to encoding of CD-DA and DAT:
    Sampling rates: 32 kHz, 44.1 kHz, 48 kHz
    Sampling precision: 16 bit/sample
  Audio channels:
    Mono (single, 1 channel)
    Stereo (2 channels)
    dual channel mode (independent, e.g., bilingual)
    optional: joint stereo (exploits redundancy and irrelevancy)
  Application Example: DAB Digital Audio Broadcasting
    uses MPEG layer 2 (compression also known as MUSICAM =
    Masking pattern adapted Universal Subband Integrated Coding And Multiplexing)
    delays, for VLSI implementation:
      max. 30 ms encoding
      max. 10 ms decoding
  SW codec delays vary for different layers, implementations, computers (rule-of-thumb
  may be 50/100/150 ms for layer 1/2/3, which makes MP3 rather inappropriate for
  real-time conversation)

MPEG Audio and Video Data Streams
  Audio Data Stream Layers:
    1. Frames
    2. Audio access units
    3. Slots
  Video Data Stream Layers:
    1. Video sequence layer
    2. Group of pictures layer
    3. Single picture layer
    4. Slice layer
    5. Macroblock layer
    6. Block layer

Follow-Up MPEG Standards
  MPEG-2:
    Higher data rates for high-quality audio/video
    Multiple layers and profiles with different degrees of compression and quality
  MPEG-3
    Initially HDTV, but MPEG-2 scaled up to subsume MPEG-3
  MPEG-4:
    Initially, lower data rates for e.g. mobile communication
    Then, coding and additional functionalities based on image contents
  MPEG-7:
    Content description
    Basis for search and retrieval
  MPEG-21 (upcoming):
    Framework for multimedia business, delivery... what's missing?
    maybe eCommerce focus --> e.g., security, watermarking?

10. MPEG-2
  From MPEG-1 to MPEG-2
    Improvement in quality
      from VCR to TV to HDTV
    No CD-ROM based constraints
      higher data rates
        MPEG-1: about 1.5 Mbit/s
        MPEG-2: 2-100 Mbit/s
  Evolution
    1994: International Standard
    Also later known as H.262
    Prominent role for digital TV in DVB (digital video broadcasting)
    commercial MPEG-2 realizations available

MPEG-2 Video: Scaling
  Motivation
    analog: continuous decrease in quality if errors occur
    digital: need for tolerance whenever errors occur, i.e. scaling
  Option: Spatial scaling
    reduction of resolution
    approach
      image sampled with half resolution, then MPEG algorithms applied,
      output processed with better FEC (base layer)
      image decoded, subtracted from original, MPEG algorithms applied to the difference,
      output processed with worse FEC (enhanced layer)
  Option: Signal to Noise (SNR) scaling
    noise introduced by
      quantization errors and visible block structures
    approach
      Base layer: DCT output, more significant bits encoded with better FEC
      Enhanced layer: DCT output, less significant bits encoded with worse FEC
MPEG-2 Video Profiles and Levels
LEVELS and PROFILES (maximum bit rate per combination; "-" = combination not defined)

Level                                        Simple       Main         SNR Scalable  Spatial Scalable  High
                                             Profile      Profile      Profile       Profile           Profile
High Level (1920 pixels/line, 1152 lines)    -            80 Mbit/s    -             -                 100 Mbit/s
High-1440 Level (1440 pixels/line, 1152 l.)  -            60 Mbit/s    -             60 Mbit/s         80 Mbit/s
Main Level (720 pixels/line, 576 lines)      15 Mbit/s    15 Mbit/s    15 Mbit/s     -                 20 Mbit/s
Low Level (352 pixels/line, 288 lines)       -            4 Mbit/s     4 Mbit/s      -                 -

Frame types                                  No B-frames  B-frames     B-frames      B-frames          B-frames
Chroma format                                4:2:0        4:2:0        4:2:0         4:2:0             4:2:0 or 4:2:2
Scalability                                  Not Scalable Not Scalable SNR Scalable  SNR or Spatial    SNR or Spatial
                                                                                     Scalable          Scalable

MPEG-2 Audio
Two modest extensions to MPEG-1 audio:
1) "low sample rate extension" (LSE):
  1/2 of all MPEG-1 rates: 16, 22.05, 24 kHz
  quantization down to 8 bits/sample
2) "multichannel extension": more channels, i.e. up to 5 full bandwidth channels (surround system)
  left and right front
  center (in front)
  left and right back
  "matrixing": rule for backward compatible conversion --> stereo (x, y = 0.71)
    Left for Stereo = Left_f + x*Center + y*Left_b
    Right for Stereo = Right_f + x*Center + y*Right_b
  option: +1 "low freq. extension" (LFE) channel for subwoofer
  "multilingual extension": 7 more, i.e. up to 12 channels (multiple languages, commentary)
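The matrixing rule can be written down directly. The sketch below applies the two formulas from the slide to one sample per channel, with x = y = 0.71 as stated there; the function name and the sample values are illustrative assumptions, and no clipping or level normalization is modelled.

```python
def downmix_to_stereo(left_f, right_f, center, left_b, right_b, x=0.71, y=0.71):
    """Backward compatible stereo pair from five full bandwidth channels.

    Implements: Left  = Left_f  + x*Center + y*Left_b
                Right = Right_f + x*Center + y*Right_b
    """
    left = left_f + x * center + y * left_b
    right = right_f + x * center + y * right_b
    return left, right


# One hypothetical sample per channel (arbitrary amplitude units).
stereo_left, stereo_right = downmix_to_stereo(
    left_f=1.0, right_f=0.5, center=0.2, left_b=0.1, right_b=0.0
)
```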
MPEG-2 Audio (2)
Improved quality at or below 64 kbit/s
Compatible to MPEG-1
  all MPEG-1 audio formats can be processed by MPEG-2
  only 3 MPEG-2 audio codecs will not provide backward compatibility
  (in the range between 256 - 448 Kbit/s)

MPEG-2 System
Steps
1. Audio and video combined to Packetized Elementary Stream (PES)
2. PES(es) combined to Program Stream or Transport Stream
Program stream:
  Error-free environment
  Packets of variable length
  One single stream with one timing reference
Transport stream:
  Designed for noisy (lossy) media channels
  Multiplex of various programs with one or more time bases
  Packets of 188 byte length
Conversion between Program and Transport Streams possible
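The fixed packet length makes a Transport Stream easy to cut into packets. The following sketch goes slightly beyond the slide: the sync byte 0x47 and the 4-byte header layout (13-bit PID, 4-bit continuity counter) come from the MPEG-2 Systems specification, not from the text above, and the helper names are invented for illustration. Adaptation fields and PES reassembly are not handled.

```python
TS_PACKET_SIZE = 188   # from the slide: packets of 188 byte length
SYNC_BYTE = 0x47       # from the MPEG-2 Systems specification


def split_packets(data):
    """Cut a byte string into 188-byte transport packets."""
    if len(data) % TS_PACKET_SIZE != 0:
        raise ValueError("stream length is not a multiple of 188 bytes")
    return [data[i:i + TS_PACKET_SIZE] for i in range(0, len(data), TS_PACKET_SIZE)]


def parse_header(packet):
    """Extract a few fields from the 4-byte packet header."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid transport packet")
    return {
        "payload_unit_start": bool(packet[1] & 0x40),
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],   # identifies the sub-stream
        "continuity_counter": packet[3] & 0x0F,         # detects lost packets
    }


def make_packet(pid, counter, payload=b""):
    """Build a payload-only test packet (hypothetical helper for this example)."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10 | (counter & 0x0F)])
    return header + payload[:184].ljust(184, b"\xff")


stream = make_packet(0x100, 0, b"video") + make_packet(0x101, 5, b"audio")
headers = [parse_header(p) for p in split_packets(stream)]
```

Because every packet starts with the same sync byte at a known distance, a receiver on a lossy channel can regain packet alignment after an error, which fits the design goal named on the slide.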
11. MPEG-4
Goals
MPEG-4 (ISO 14496) originally:
  Targeted at systems with very scarce resources
  To support applications like
    Mobile communication
    Videophone and E-mail
  Max. data rates and dimensions (roughly):
    Between 4800 and 64000 bits/s
    176 columns x 144 lines x 10 frames/s
Largely covered by H.263, therefore re-orientation:
  Goal to provide enhanced functionality
  to allow for analysis and manipulation of image contents

MPEG-4: Timeline
Schedule for Standardization
  1993: Work started
  1997: Committee Draft
  1998: Final Committee Draft
  1998: Draft International Standard
  1999-2000: International Standard
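A short calculation shows how demanding the original target was. It uses the dimensions and data rates from the slide; the 12 bit/pixel figure (4:2:0 sampling with 8 bit per sample) is an assumption that the slide does not state.

```python
def raw_bit_rate(columns, lines, frames_per_s, bits_per_pixel):
    """Uncompressed video bit rate in bit/s."""
    return columns * lines * frames_per_s * bits_per_pixel


# 176 x 144 x 10 frames/s as on the slide; 12 bit/pixel is an assumption.
raw = raw_bit_rate(176, 144, 10, 12)

ratio_at_64000 = raw / 64000   # required compression factor at 64000 bit/s
ratio_at_4800 = raw / 4800     # required compression factor at 4800 bit/s
```

Under this assumption the raw rate is about 3 Mbit/s, so the codec has to compress by a factor of roughly 48 at the upper end of the range and by more than 600 at the lower end.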
MPEG-4: Goals (cont.)
1: support composite multimedia, i.e. find standardized ways to
  Represent units of aural, visual or audiovisual content
    "audio/visual objects" or AVOs
    object coding independent of other objects, surroundings and background
    natural and synthetic objects
  Compose these objects together
    i.e. creation of compound objects that form audiovisual scenes
  Multiplex and synchronize the data associated with AVOs
    for transportation over network channels providing a QoS (Quality-of-Service)
2: support synthetic objects
  computer-gen. (VR), synthesized (txt2speech), model-based ("face")
3: support truly interactive applications (more than play/pause/rewind..)
  Interact with the audiovisual scene generated at the decoder's site
[Figure: scene with video objects 1-3 ("Rhubarb") and audio objects 1-2]

MPEG-4: Scope
Definition of
  System Decoder Model
    specification for decoder implementations
  Description language
    binary syntax of an AV object's bitstream representation
    scene description information
  Corresponding concepts, tools and algorithms, especially for
    content-based compression of simple and compound audiovisual objects
    manipulation of objects
    transmission of objects
    random access to objects
    animation
    scaling
    error robustness
MPEG-4: Scope (cont.)
Targeted bit rates for video and audio:
  VLBV core
    Very Low Bit-rate Video
    5 - 64 Kbit/s
    image sequences with CIF resolution and up to 15 frames/s
  Higher-quality video
    64 Kbit/s - 4 Mbit/s
    quality like digital TV
  Natural audio coding
    2 - 64 Kbit/s
Encoder
  Must generate timing information
    speed of the encoder clock = time base
    desired decoding times and/or expiration times
    by using time stamps attached to the stream
  Can specify the minimum buffer resources needed for decoding

MPEG-4: Video and Image Encoding
Encoding / decoding of
  Rectangular images and video
    coding similar to MPEG-1/2
    motion prediction
    texture coding
  Images and video of arbitrary shape
    as done in conventional approach
    8x8 DCT or shape-adaptive DCT
    plus coding of shape and transparency information
MPEG-4: Composition of Scenes
Scene description includes:
  Tree to define hierarchical relationships between objects
  Objects' positions in space and time
    by converting the object's local coordinate system into a global coordinate system
  Attribute value selection
    e.g. pitch of sound, color, texture, animation parameters
Description based on some VRML concepts
  VRML = Virtual Reality Modelling Language
Interaction with scenes
  e.g. change viewing point, drag object, start/stop streams, select language

MPEG-4: Example of a Composition
[Figure: scene tree in which primitive AVOs ("Rhubarb") are combined into compound objects]
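The tree-based scene description and the conversion from local to global coordinates can be sketched as follows. This is a strongly simplified illustration, not the MPEG-4 scene description format: the class, the object names, and the restriction to 2D translations are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SceneNode:
    """One object in the scene tree; offset is relative to the parent."""
    name: str
    offset: tuple = (0.0, 0.0)
    children: list = field(default_factory=list)


def global_positions(node, parent_origin=(0.0, 0.0)):
    """Convert every node's local offset into global scene coordinates."""
    origin = (parent_origin[0] + node.offset[0], parent_origin[1] + node.offset[1])
    positions = {node.name: origin}
    for child in node.children:
        positions.update(global_positions(child, origin))
    return positions


# Hypothetical scene: a compound object "person" made of two primitive AVOs.
scene = SceneNode("scene", children=[
    SceneNode("person", (100.0, 50.0), [
        SceneNode("sprite", (0.0, 0.0)),
        SceneNode("voice", (5.0, -10.0)),
    ]),
    SceneNode("background", (0.0, 0.0)),
])

positions = global_positions(scene)
```

Dragging the compound object then only means changing the offset of "person"; the global positions of its children follow from the tree.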
MPEG-4: Scaling
Three approaches:
  Spatial scalability
    decoder displays textures and visual objects at a reduced spatial resolution
    by decoding only a subset of the total bit stream
    32 levels max. for textures and still images
    3 levels max. for video sequences
  Temporal scalability
    decoder displays video at a reduced temporal resolution
    by decoding only a subset of the total bit stream
    3 levels max.
  Quality scalability
    bitstream is parsed into a number of bit stream layers of different bit-rates
    either during transmission or in the decoder
    subset of the layers still yields a meaningful signal
Spatial and temporal scaling both for
  Conventional rectangular display and
  Objects with arbitrary shape

MPEG-4: Synthetic Objects
Visual objects:
  Human face
    start object: neutral-expression face
    animated via FDPs and/or FAPs
    FAP (facial anim param): animate current display
    FDP (facial def. param): alternative shape/texture
  Mesh + texture mapping: for 2D & 3D meshes
    2D mesh may also be used for human face anim., see above
    only triangular 2D meshes, vertices may be moved (mv!), texture is warped
    e.g. virtual background
  Texture coding for view-dependent applications
    texture, e.g. virt. background; decoder/encoder loop for "minimal" Xmission
MPEG-4: Synthetic Objects
Audio objects:
  Text-to-speech
    speech generation from given text and prosodic parameters
    face animation control
  Score driven synthesis
    music generation from a score or scene description commands
    more general than MIDI
  Special effects

MPEG-4: Layered Networking Architecture
[Figure: layer stack, from top to bottom]
  Media: Display / Recording
  CoDec (several in parallel): Coding / Decoding
    exchanges Access Units, e.g. video or audio frames
  Adaptation Layer
    A/V object data + stream type info, sync. info, QoS req.,...
    exchanges Elementary Streams
  FlexMux Layer: Flexible Multiplexing
    e.g. multiple elementary streams with similar QoS requirements
    exchanges Multiplexed Streams
  TransMux Layer: Transport Multiplexing
    - only interface specified
    - layer itself can be any network, e.g. RTP/UDP/IP, AAL5/ATM
  Network or Local Storage
MPEG-4: Layered Networking Architecture (cont.)
DMIF: Delivery Multimedia Integration Framework
  Allows to establish multiple party sessions
    interaction with
      remote interactive peers
      broadcast systems
      storage systems
    establishment of channels with specific QoSs and bandwidths
  Controls
    FlexMux layer
    TransMux layer

MPEG-4: Error Handling
Mobile communication:
  Low bit-rate (< 64 Kbps)
  Error-prone
MPEG-4 concepts for error handling:
  Resynchronization
    enables receiver to tune in again
    based on markers within bitstream
  Data recovery
    enables receiver to reconstruct lost data
    encode data in an error-resilient manner
  Error concealment
    enables receiver to bridge gaps in data
    e.g. by repeating parts of old frames
12. Wavelets
Motivation
JPEG / DCT problems:
  DCT not applicable to whole image, but only to small blocks
    block structure becomes visible at high compression ratios
  Scaling as add-on: additional effort
  DCT function is fixed: cannot be adapted to source data
Improvements by using Wavelets:
  Transformation of the whole image
    overcomes visible block structures and introduces inherent scaling
  Better identification of which data is relevant to human perception
    higher compression ratio

Wavelets: Compression / Decompression
[Figure: processing chains]
  Compressor: Forward Wavelet Transformation -> Quantizer -> Encoder
  Decompressor: Decoder -> DeQuantizer -> Inverse Wavelet Transformation
The same overall structure as for DCT-based algorithms
But: important differences in the transformation step
Wavelets: Fundamental Idea
Image is transformed into the frequency domain (as in JPEG)
But: based on Wavelet functions instead of cosine functions
[Figure: cosine function compared with an example Wavelet]
Advantage: Wavelets yield zero value outside a limited interval
  Wavelet is confined to a part of the image
  Image need not be split into blocks
Use Wavelet family: $\{2^{-j/2}\,\psi(2^{-j}x - k)\}$, $j,k \in \mathbb{Z}$, $\psi$ being a Wavelet

Wavelets: Transformation Steps
"Discrete Wavelet Transformation" (Mallat, 1989)
Split image recursively by using high and low pass filters
[Figure: image is read by line and then by column, each time through a low pass (L) and a high pass (H) filter; result: $c^1$ (lower frequencies, transformed image with reduced size) and $d_1^1$, $d_2^1$, $d_3^1$ (higher frequencies)]
Wavelets: Transformation Steps (cont.)
In each step i:
  Three images $d_x^i$ (x=1,2,3):
    containing the high frequency parts of the image
    representing "details" of the image
    submitted to Wavelet transformation
    or thrown away in case of scaling
  One image $c^i$:
    containing the lower frequency parts of the image
    representing the original image with less details / at a lower resolution
    submitted to step i+1
Up to here: 4 images with 1/4 resolution each --> no compression!
  but again: decorrelation: many coefficients in d-images (close to) 0
Afterwards:
  Quantization
  Entropy encoding
  as with DCT

Wavelets: DWT compared with DCT
Advantages of DWT over DCT:
  No block artefacts
  Inherent scaling
    based on the $d_x^i$ for i=1,2,3,...
  Lower time complexity for the transformation
    DCT: O(n*log n),
    DWT: O(n) (n = number of values to be transformed)
  Higher flexibility: Wavelet function can be freely chosen
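One transformation step can be sketched with the simplest possible filter pair. The example assumes the Haar wavelet in its unnormalized form (average as low pass, half difference as high pass); the slides leave the choice of wavelet open, so this is only one possible instance, and the function names are made up. The image is a list of rows with even width and height.

```python
def filter_lines(img):
    """Apply low pass and high pass along each line, halving the width."""
    low = [[(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)] for row in img]
    high = [[(row[i] - row[i + 1]) / 2 for i in range(0, len(row), 2)] for row in img]
    return low, high


def transpose(m):
    return [list(col) for col in zip(*m)]


def dwt_step(img):
    """One step: returns (c, d1, d2, d3), each with 1/4 of the pixels."""
    low, high = filter_lines(img)                 # read by line
    c, d1 = filter_lines(transpose(low))          # read by column, low band
    d2, d3 = filter_lines(transpose(high))        # read by column, high band
    return transpose(c), transpose(d1), transpose(d2), transpose(d3)


# Smooth test image: four constant 2x2 areas.
image = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
c, d1, d2, d3 = dwt_step(image)
```

For this smooth image all three detail images are zero and c is the image at half resolution, which illustrates the decorrelation effect. Every input value is touched a constant number of times, in line with the O(n) complexity quoted above; the next step would apply dwt_step to c again.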
Wavelets: Further Issues
Edge detection reduces high frequencies:
  First extract detected edges
  Then apply wavelets to such a filtered image
Application to video:
[Figure: images $I_{n-2}$, $I_{n-1}$, $I_n$ along the time axis t; "Compute differences" yields $I_{n-1} - I_{n-2}$, $I_n - I_{n-1}$, ..., which are fed to the Wavelet compressor]

13. Fractal Image Compression
Image Generation
[Figure: Mandelbrot image]
Mandelbrot
  recursive construction of images
  infinite granularity
  self-similarities in images
  $Z_i = \mathrm{RealConst} \cdot Z_{i-1} + \mathrm{ComplexConst}$
Use of Fractals for Compression??? Overview (1)
observation: self-similarities in natural images
  (clouds, dunes, beaches: zoom-in reveals similar forms as large image)
idea: can natural images be described w/ fractal geometry??
  first published by Barnsley & Sloan (88), first impl. 89 by Arnaud Jacquin
Key #1: Iterated Function Systems (IFS):
  input (sub-)picture subject to math. transform. of type
    $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e \\ f \end{pmatrix}$
  picture moved, rotated / mirrored, and contracted
  --> all transformations are "contractions"
Key #2: Banach's Fixed Point Theorem:
  apply a set Wimg={Wi} of contractions to an image
  after infinitely many applications, a specific image appears
  ... called "attractor" or "fractal"
  this process is independent of initial "start" image!!
  human perception: iteration can stop "pretty soon" (finite no. of iterations)
Q: how to find Wimg such that attractor is image-to-be-compressed?

Use of Fractals for Compression??? Overview (2)
Key #3: Collage Theorem:
  in order to find Wimg as above: search Wimg such that image is
  (almost) transformed into itself!
First algorithm published (Jacquin):
  partition image into (small, non-overlapping) "range blocks"
  search (larger, overlapping) "domain blocks" which can be
  "contracted" into range blocks
  for each range block, find domain block and contraction
  (lots of possibilities!!)
details / simplifications of Jacquin approach see below
To apply self-similarity: Image Generation
Examples
  (from TUD + Univ. Bochum) for recursive construction of images
Sierpinski triangle
  to produce self-similar structures
  infinite steps applied to different source images lead to same result
  known as Sierpinski triangle
  "Grenzwert" (limit), also known as attractor

To Find Self-Similarities
affine function allows for
  translation
  rotation
  scaling
  brightness adaptation
IFS:
  Iterated Function System
  ideally completely self-similar
  example: Sierpinski triangle
PIFS:
  Partitioned Iterated Function System
  real images are not completely self-similar
  Wimg?
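The claim that different source images lead to the same result can be checked with a tiny IFS. The sketch below uses three contractions of the form p/2 + offset, which is one common way to generate a Sierpinski triangle; the concrete offsets, the function names, and the use of single start points instead of whole images are assumptions made to keep the example short.

```python
# Each map halves the coordinates and shifts towards one triangle corner.
OFFSETS = [(0.0, 0.0), (0.5, 0.0), (0.25, 0.5)]


def apply_ifs(points):
    """Apply all three contractions to every point (one IFS step)."""
    return [(0.5 * x + ex, 0.5 * y + fy) for (x, y) in points for (ex, fy) in OFFSETS]


def iterate(start, steps):
    """Run the IFS for a number of steps, starting from a single point."""
    points = [start]
    for _ in range(steps):
        points = apply_ifs(points)
    return points


STEPS = 6
from_origin = iterate((0.0, 0.0), STEPS)
from_corner = iterate((1.0, 1.0), STEPS)

# Largest coordinate difference between corresponding points of both runs.
max_gap = max(
    max(abs(ax - bx), abs(ay - by))
    for (ax, ay), (bx, by) in zip(from_origin, from_corner)
)
```

Each step halves the distance between the two runs, so after 6 steps the point sets differ by at most 1/64 although the start points were a full unit apart. With more steps both runs approach the same attractor.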
Theoretical Basis
Banach's Fixed Point Theorem:
  Let F be a complete metric space
  Let $W: F \to F$ be a contractive mapping
    i.e. there exists an s, $0 < s < 1$, with $|W(x) - W(y)| \le s\,|x - y|$ for all $x, y \in F$
  Then W has exactly one fixed point $x_f$
    i.e. $W(x_f) = x_f$
  $x_f$ can be computed as $x_f = \lim_{n \to \infty} W^n(x)$ with any $x \in F$
Application to image compression:
  Let img be the image to be compressed
  Regard the set of all possible images as a metric space
    metric e.g.: maximum difference between the pixels of two pictures
  Goal: construct Wimg such that img is the fixed point of Wimg
  Stop when error falls below some bound
  Error can be calculated by "Collage Theorem"

Fractal Image Compression and Decompression
Compression: Find appropriate Wimg (difficult)
Decompression: Apply Wimg iteratively to any image (easy)
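Both theorems can be tried out on the real numbers, which stand in for the space of images here. The mapping W(x) = s*x + c with s = 0.5 and c = 1 is an arbitrary illustrative choice, and the function names are made up. The collage bound used below is the usual statement of the Collage Theorem, |x - x_f| <= |x - W(x)| / (1 - s).

```python
S, C = 0.5, 1.0          # contraction factor s and shift, chosen for illustration


def w(x):
    """A contractive mapping on the reals: |w(x) - w(y)| = S * |x - y|."""
    return S * x + C


def fixed_point(mapping, start, tol=1e-12, max_steps=1000):
    """Iterate the mapping until successive values differ by less than tol."""
    x = start
    for _ in range(max_steps):
        nxt = mapping(x)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    raise RuntimeError("no convergence within max_steps")


xf_a = fixed_point(w, 0.0)       # start "image" A
xf_b = fixed_point(w, 100.0)     # start "image" B, same limit expected

# Collage Theorem: how far is a candidate from the attractor of w?
candidate = 2.3
collage_bound = abs(candidate - w(candidate)) / (1 - S)
true_distance = abs(candidate - xf_a)
```

Both starts converge to the same fixed point 2, and the collage bound estimates the distance of the candidate from the attractor without running the iteration. For compression this is the useful direction: a Wimg that maps img almost onto itself has an attractor close to img.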
How to Find Wimg?
Systematic search based on "Partitioned Iterated Function System (PIFS)"
  Partition image into "range blocks" Ri
    8*8 pixel blocks
    non-overlapping
  Consider all "domain blocks" Dj of double size
    16*16 pixel blocks
    overlapping
  Find for each Ri the most similar Dj
    consider rotations (0°/90°/180°/270°) and mirroring
    adapt brightness and contrast of Dj to that of Ri
    translation, rotation, mirroring, brightness adaptation
    define a (partial) affine function
  Combine partial functions to Wimg

Compression rate? Example: for each (8*8) range block:
  contraction factor fixed
  3 bit for transformation
  16 bit for domain block coordinates
  12 bit for brightness/contrast adaptation
  --> factor is 8x8x8 : 31 = 512:31 (cf. JPEG example)
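The search described above can be sketched as a brute-force encoder with a matching decoder. This is a toy version under several assumptions: 2*2 range blocks and 4*4 domain blocks instead of 8*8 and 16*16 (to keep the search small), a fixed contrast factor of 0.75, brightness matched via block means, and a squared-error similarity measure. All function names are made up for illustration; this is not Jacquin's original algorithm.

```python
def block(img, top, left, size):
    return [row[left:left + size] for row in img[top:top + size]]


def shrink(b):
    """Contract a block to half its size by averaging 2x2 pixel groups."""
    n = len(b) // 2
    return [[(b[2*i][2*j] + b[2*i][2*j+1] + b[2*i+1][2*j] + b[2*i+1][2*j+1]) / 4
             for j in range(n)] for i in range(n)]


def isometries(b):
    """8 variants: 4 rotations, each plain and mirrored (fits into 3 bit)."""
    variants = []
    for _ in range(4):
        variants.append(b)
        variants.append([row[::-1] for row in b])
        b = [list(r) for r in zip(*b[::-1])]      # rotate by 90 degrees
    return variants


def mean(b):
    return sum(map(sum, b)) / (len(b) * len(b))


def encode(img, r=2, contrast=0.75):
    """For each range block: (domain top, domain left, isometry, brightness)."""
    size = len(img)
    domains = [(t, l, k, d)
               for t in range(size - 2 * r + 1)   # overlapping domain blocks
               for l in range(size - 2 * r + 1)
               for k, d in enumerate(isometries(shrink(block(img, t, l, 2 * r))))]
    code = []
    for top in range(0, size, r):                 # non-overlapping range blocks
        for left in range(0, size, r):
            rng, best = block(img, top, left, r), None
            for t, l, k, d in domains:
                offset = mean(rng) - contrast * mean(d)
                err = sum((contrast * d[i][j] + offset - rng[i][j]) ** 2
                          for i in range(r) for j in range(r))
                if best is None or err < best[0]:
                    best = (err, t, l, k, offset)
            code.append(best[1:])
    return code


def decode(code, size, r=2, contrast=0.75, steps=40):
    """Apply the stored partial functions repeatedly to a black start image."""
    img = [[0.0] * size for _ in range(size)]
    for _ in range(steps):
        new, entries = [[0.0] * size for _ in range(size)], iter(code)
        for top in range(0, size, r):
            for left in range(0, size, r):
                t, l, k, offset = next(entries)
                d = isometries(shrink(block(img, t, l, 2 * r)))[k]
                for i in range(r):
                    for j in range(r):
                        new[top + i][left + j] = contrast * d[i][j] + offset
        img = new
    return img


flat = [[100] * 8 for _ in range(8)]              # trivial test image
code = encode(flat)
restored = decode(code, 8)
ratio = (8 * 8 * 8) / (3 + 16 + 12)               # slide example: 512 : 31
```

The flat test image only demonstrates the mechanics: the decoder starts from a black image and still converges to the encoded one. The last line repeats the compression factor from the slide, about 16.5 for 8*8 range blocks at 8 bit per pixel.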
Further Improvements
Quadtree partitioning:
  Problem:
    fixed 8*8 blocks do not reflect image properties
  Solution:
    flexible partition of image into larger or smaller squares
    driven by image structure
Partitioning into rectangles and triangles

Advantages & Drawbacks
+ High quality at high compression rates
    At least for images with self-similarities
    Here: better than JPEG ("cross-over point" at about 1:10 to 1:30)
+ Zooming into image supported
    detailed view possible, interpolation instead of "pixelization"
+ Scalability
    decompression steps yield iteratively improving image
- Long compression times
    asymmetric mechanisms
    improving search techniques for range & domain block pairs
- Blockwise artifacts with information losses
    Wimg is only approximative
- Not well applicable to images of non-fractal nature
    E.g. texts, sharp lines & no quality guarantee possible
- Lower quality than JPEG at low compression rates
- Error (error propagation)
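Quadtree partitioning can be sketched as a recursive split. The example assumes that "image structure" is measured by the pixel variance inside a square and uses an arbitrary threshold; a real fractal coder would more likely split when no domain block matches well enough. Function names and the test image are made up, and the image side is assumed to be a power of two.

```python
def variance(img, top, left, size):
    values = [img[top + i][left + j] for i in range(size) for j in range(size)]
    avg = sum(values) / len(values)
    return sum((v - avg) ** 2 for v in values) / len(values)


def quadtree(img, top=0, left=0, size=None, min_size=2, threshold=10.0):
    """Return squares (top, left, size); busy areas get smaller squares."""
    if size is None:
        size = len(img)
    if size <= min_size or variance(img, top, left, size) <= threshold:
        return [(top, left, size)]
    half = size // 2
    squares = []
    for dt in (0, half):
        for dl in (0, half):
            squares += quadtree(img, top + dt, left + dl, half, min_size, threshold)
    return squares


# 8x8 test image: checkerboard detail in the top-left quadrant, flat elsewhere.
image = [[100 * ((i + j) % 2) if i < 4 and j < 4 else 0 for j in range(8)]
         for i in range(8)]
squares = quadtree(image)
```

The flat quadrants stay as single 4*4 squares while the detailed quadrant is split down to 2*2 squares, so small range blocks are spent only where the image needs them.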
14. Conclusion
JPEG:
Very general format with high compression ratio

SW and HW for baseline mode available
H.261 / H.263:
Established standard by telecom world
Preferable hardware realization
MPEG family of standards:

Video and audio compression for different data rates
Asymmetric (focus) and symmetric
Next steps: wavelets, fractals, models of objects