[Block diagram: source symbols → source encoder (binary sequence) → channel encoder → modulator → channel]
Source coding reduces redundancy and hence reduces the bandwidth requirement.
Variable-length code (e.g. Morse code)
Shannon’s source coding theorem
Given a discrete memoryless source of entropy H(X), the
average codeword length Lav for any distortionless source
coding scheme is bounded as
L_av ≥ H(X)
Source Efficiency
η = H(X) / L_av ≤ 1
At the receiver,
[Block diagram: channel → channel decoder → source (binary) decoder → symbol sequence]
Example 1
A discrete memoryless source (DMS) emits symbols A, B, C, D with
P(A) = 1/2, P(B) = 1/4, P(C) = 1/8, P(D) = 1/8.
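As a numerical check (a minimal Python sketch, not part of the original slides), the entropy of this source is 1.75 bits/symbol:

```python
import math

# Probabilities of the Example 1 source
probs = {"A": 1/2, "B": 1/4, "C": 1/8, "D": 1/8}

# H(X) = -sum_i P_i log2 P_i
H = -sum(p * math.log2(p) for p in probs.values())
print(H)  # 1.75 bits/symbol
```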
Example 1 (continued)
[Table: three candidate codes I, II and III for the symbols A–D; for instance, B (P = 1/4) is assigned 00 under Code I, 10 under Code II and 01 under Code III.]
Code I
There is a problem in this decoding process: a received bit stream can be parsed into more than one symbol sequence, so Code I is not uniquely decodable.
Code II
Uniquely decodable and instantaneous: each codeword can be decoded as soon as its last bit arrives.
[Figure: decoding of a received bit stream into the symbols A, B, C, ...]
Code III
Uniquely decodable but not instantaneously decodable: the decoder may have to wait for later bits before it can tell where a codeword ends.
[Figure: decoding trace of a bit stream ending in the symbol D.]
Instantaneously decodable
Question: How can we have codes that are instantaneously
decodable?
Answer: No codeword in the code is a prefix of another codeword; such a code is called a prefix (prefix-free) code.
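A small sketch of this prefix condition (the two codeword sets below are hypothetical illustrations, not the exact codes of Example 1):

```python
def is_prefix_code(codewords):
    """Return True if no codeword is a prefix of another codeword."""
    for c1 in codewords:
        for c2 in codewords:
            if c1 != c2 and c2.startswith(c1):
                return False
    return True

# Hypothetical codeword sets for illustration:
print(is_prefix_code(["0", "10", "110", "111"]))   # True  -> instantaneously decodable
print(is_prefix_code(["0", "01", "011", "0111"]))  # False -> not instantaneous
```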
Kraft inequality
A prefix code always satisfies the Kraft inequality

∑_{k=1}^{L} 2^{-n_k} ≤ 1

where n_k is the length of the k-th codeword and L is the number of codewords.
Example
Code I violates the Kraft inequality (its codeword lengths give ∑ 2^{-n_k} > 1), so it cannot be a prefix code.
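A one-line check of the inequality (a sketch; the two length sets are hypothetical illustrations):

```python
def kraft_sum(lengths):
    """Sum of 2^{-n_k} over the codeword lengths n_k."""
    return sum(2 ** -n for n in lengths)

# Hypothetical codeword-length sets:
print(kraft_sum([1, 2, 3, 3]))  # 1.0  -> can be realized as a prefix code
print(kraft_sum([1, 2, 2, 2]))  # 1.25 -> violates the Kraft inequality
```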
Average codeword length
For a suitably chosen code, H(X) ≤ L_av ≤ H(X) + 1. Applying this bound to the extended source X^n (blocks of n symbols, for which H(X^n) = nH(X) when the source is memoryless):

H(X) ≤ L_av ≤ H(X) + 1
⇒ H(X^n) ≤ L_av^n ≤ H(X^n) + 1
⇒ nH(X) ≤ L_av^n ≤ nH(X) + 1
⇒ H(X) ≤ L_av^n / n ≤ H(X) + 1/n

so the per-symbol average length can be driven arbitrarily close to the entropy by increasing n.
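A numeric illustration of the last line (a sketch, using the Example 1 entropy H(X) = 1.75 bits/symbol):

```python
H = 1.75  # entropy of the Example 1 source, bits/symbol
for n in (1, 2, 4, 8, 16):
    # per-symbol average length is bounded: H <= L_av^n / n <= H + 1/n
    print(f"n={n}: {H} <= L/n <= {H + 1/n}")
```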
Shannon-Fano Encoding
A variable-length encoding scheme proposed by Shannon and Fano; the algorithm is stated below.

G_N = -(1/N) ∑_i P(m_i) log₂ P(m_i), as before.
In other words,

L_av = (1/N) ∑_{i=1}^{M} l_i P_i → G_N

η = H / L_av, where H is the source entropy.
Coding: Step 1
Arrange the N messages in order of decreasing probability and let

F_i = ∑_{k=1}^{i-1} P_k, with F_1 = 0,

i.e. (P_1, P_2, ..., P_N) → (F_1, F_2, ..., F_N).
Step 2
Calculate the number of bits l_i required for message i from

-log₂ P_i ≤ l_i < 1 - log₂ P_i

For example, if P_i = 1/5 then -log₂ P_i ≈ 2.32, so l_i = 3.
Steps 3-4
Step 3: Convert F_i into a binary fraction. The binary fraction satisfies

0.b₁b₂b₃···b_k = b₁/2 + b₂/2² + ... + b_k/2^k

For example,

27/128 = 0/2 + 0/4 + 1/8 + 1/16 + 0/32 + 1/64 + 1/128 → 0011011

Step 4: The codeword for message i is the binary fraction F_i truncated to length l_i.
Example 2
For instance, the message CAA has P = 9/128, so l = 4; its cumulative probability is F = 54/128 = (0.0110110)₂, giving the codeword 0110.
Example 3
Consider a Markov source encoder with N = 2 symbols per message. The encoding operation is

i  Symbols  Pi    li  Fi     Binary Fi  Codeword
1  AA       9/32  2   0      .00000     00
2  BB       9/32  2   9/32   .01001     01
3  AC       3/32  4   18/32  .10010     1001
4  CB       3/32  4   21/32  .10101     1010
5  BC       3/32  4   24/32  .11000     1100
6  CA       3/32  4   27/32  .11011     1101
7  CC       2/32  4   30/32  .11110     1111
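The table above can be reproduced with a short sketch of steps 1-4 (a minimal implementation, not from the original slides; exact when the probabilities have finite binary expansions):

```python
import math

def shannon_fano(probs):
    """Shannon-Fano codewords via steps 1-4: sort by decreasing
    probability, take l_i with -log2 P_i <= l_i < 1 - log2 P_i,
    and emit the first l_i bits of the binary fraction F_i."""
    probs = sorted(probs, reverse=True)            # Step 1
    F, result = 0.0, []
    for p in probs:
        l = math.ceil(-math.log2(p))               # Step 2
        bits, frac = [], F                         # Steps 3-4: expand F to l bits
        for _ in range(l):
            frac *= 2
            bit = int(frac)
            bits.append(str(bit))
            frac -= bit
        result.append((p, "".join(bits)))
        F += p
    return result

print(shannon_fano([9/32, 9/32, 3/32, 3/32, 3/32, 3/32, 2/32]))
# codewords 00, 01, 1001, 1010, 1100, 1101, 1111 as in the table
```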
Example 3 (continued)
H_N = 1.44 bits/symbol (the table gives an average of 2.875 bits per two-symbol message, i.e. about 1.44 bits per symbol).
[Figure: the source encoder maps the input sequence CA, AA, BB, BC to 1101, 00, 01, 1100.]
Huffman encoding
A variable-length scheme for the encoding of symbols:
– Arrange the source symbols in descending order of probability.
– Create a new source with one less symbol by combining (adding) the two symbols having the lowest probabilities.
– Repeat steps 1 and 2 until a single-symbol source is achieved.
– Associate a '0' and a '1' with each pair of probabilities so combined.
– Encode each original source symbol into the binary sequence generated by the various combinations, with the first combination as the least significant digit in the sequence.
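A minimal heapq-based sketch of this procedure, applied to the source of Example 4 below (tie-breaking among equal probabilities may yield different codeword lengths than the slides, but the same average length):

```python
import heapq
import itertools

def huffman(probs):
    """Huffman code for {symbol: probability}: repeatedly merge the
    two least-probable entries, prefixing '0'/'1' to their codewords."""
    tie = itertools.count()  # breaks probability ties deterministically
    heap = [(p, next(tie), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # lowest probability
        p2, _, c2 = heapq.heappop(heap)   # second lowest
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

p = {"A": 0.4, "B": 0.2, "C": 0.2, "D": 0.1, "E": 0.1}
code = huffman(p)
L_av = sum(p[s] * len(w) for s, w in code.items())
print(code, L_av)  # L_av ≈ 2.2 bits/symbol, as computed in Example 4
```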
Example 4
Source symbols A, B, C, D, E with probabilities P(A) = 0.4, P(B) = 0.2, P(C) = 0.2, P(D) = 0.1, P(E) = 0.1.
[Figure: Huffman combination stages. D (0.1) and E (0.1) combine to 0.2; this 0.2 combines with C (0.2) to give 0.4; that 0.4 combines with B (0.2) to give 0.6; finally 0.6 and A (0.4) combine to 1.0, with a '0'/'1' label on each combination.]
Example 4 (continued)
The encoded sequence for symbol C is not a prefix of any other valid codeword sequence.
Example 4 (continued)
[Figure: code tree from the initial state, with branches labelled 0/1 leading to the symbols A, B, C, D, E.]

Source entropy:
H(X) = -∑_{i=1}^{m} P_i log₂(P_i)
     = -0.4 log₂ 0.4 - 2(0.2 log₂ 0.2) - 2(0.1 log₂ 0.1)
     = 2.12 bits/symbol
Example 4 (continued)
If P(A) = P(B) = P(C) = P(D) = P(E), then H(X) = H(X)_max = log₂ 5 ≈ 2.32 bits/symbol.

Source efficiency before encoding = H(X) / H(X)_max = 2.12 / 2.32 = 0.913
Example 4 (continued)
The average number L_av of digits used to encode each source symbol after encoding is

L_av = 0.4(1) + 0.2(2) + 0.2(3) + 0.1(4) + 0.1(4) = 2.2 bits/symbol

The source efficiency after encoding is η = H(X) / L_av = 2.12 / 2.2 = 0.964.
Run-length Coding
The run-length coding technique is particularly suitable for binary sources where one of the symbols (say the '1') occurs much less often than the other, so that there are long runs of successive '0's (e.g. a scanned and digitized line drawing). In this case, it is more efficient to encode the counts of consecutive '0's between '1's.
Example 5
The source encoder maps each run of '0's (terminated by a '1') to a 3-bit codeword:

Input run  Codeword
1          000
01         001
001        010
0001       011
00001      100
000001     101
0000001    110
0000000    111
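A minimal sketch of this encoder (implementing the 3-bit scheme of the table; a trailing partial run is ignored in this sketch):

```python
def rle_encode(bits, n=3):
    """Encode runs of '0's as n-bit counts, per the Example 5 table:
    k zeros followed by a '1' (k < 2**n - 1) map to the n-bit value k;
    a full run of 2**n - 1 zeros maps to all-ones with no '1' consumed."""
    out, run = [], 0
    for b in bits:
        if b == "0":
            run += 1
            if run == 2 ** n - 1:       # e.g. 0000000 -> 111
                out.append(format(run, f"0{n}b"))
                run = 0
        else:                           # a '1' terminates the current run
            out.append(format(run, f"0{n}b"))
            run = 0
    return "".join(out)

print(rle_encode("00011000000001"))  # 011 000 111 001 -> '011000111001'
```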
Example 5 (continued)
The compression ratio in this particular case is 15/20 = 0.75.
There are schemes where both the input block length and the output block length are variable (e.g. facsimile/FAX coding).
Lempel-Ziv Coding
In practice, source statistics are not always known a priori. Lempel-Ziv coding does not require them: it adapts to the source by building a codebook of previously seen subsequences.
Example 6
Consider 000101 110010 100101...
Assume that the binary symbols 0 and 1 are already stored in that order in the codebook. We can write
Subsequences stored: 0, 1
Data to be parsed: 000101 110010 100101...
Example 6 (continued)
Similarly, we have
Subsequences stored: 0, 1, 00, 01
Data to be parsed: 01 110010 100101...
Example 6
Numerical Subsequences Numerical Binary
Position Representation encoded
blocks
1 0
2 1
3 00 11 0010
4 01 12 0011
5 011 42 1001
6 10 21 0100
7 010 41 1000
8 100 61 1100
9 101 62 1101 D.35
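A minimal sketch of the parsing stage (it reproduces the stored subsequences above; the numerical representation and binary encoding of each phrase are not shown):

```python
def lz_parse(data):
    """Parse a binary string into the shortest subsequences not yet in
    the codebook, which is seeded with '0' and '1' as in Example 6."""
    book = ["0", "1"]
    i = 0
    while i < len(data):
        j = i + 1
        while j <= len(data) and data[i:j] in book:
            j += 1                      # extend until the phrase is new
        if j > len(data):               # trailing phrase is already known
            break
        book.append(data[i:j])
        i = j
    return book

print(lz_parse("000101110010100101"))
# ['0', '1', '00', '01', '011', '10', '010', '100', '101']
```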
Lempel-Ziv coding
The Lempel-Ziv algorithm uses fixed-length codes to represent a variable number of source symbols; this feature makes the Lempel-Ziv code suitable for synchronous transmission.