Information Theory
Section 3.5 :
Ex. 3.5.3 :
Consider a telegraph source having two symbols, dot and dash. The dot duration is
0.2 second, and the dash duration is 3 times the dot duration. The probability of the dot
occurring is twice that of the dash, and the time between symbols is 0.2 second.
Calculate the information rate of the telegraph source.
Soln. :
Given that :
1. Dot duration = 0.2 sec
2. Dash duration = 3 × 0.2 = 0.6 sec
3. P (dot) = 2 P (dash)
4. Space between symbols = 0.2 sec
1. Probabilities of dot and dash :
Let the probability of a dash be P. Therefore the probability of a dot will be 2P. The total
probability of transmitting dots and dashes is equal to 1.
∴ P (dot) + P (dash) = 1
∴ 2P + P = 1
∴ P = 1/3
∴ Probability of dash = 1/3 and probability of dot = 2/3    ...(1)
2. Entropy of the source :
H (X) = (2/3) log2 [ 3/2 ] + (1/3) log2 [ 3 ] = 0.3899 + 0.5283 = 0.9182 bits/symbol.
3. Average symbol duration :
T = P (dot) × (dot duration) + P (dash) × (dash duration) + space between symbols
  = (2/3 × 0.2) + (1/3 × 0.6) + 0.2 = 0.5333 sec/symbol
∴ Symbol rate r = 1 / 0.5333 = 1.875 symbols/sec    ...(2)
4. Information rate :
R = r H (X) = 1.875 × 0.9182
∴ R = 1.722 bits/sec    ...Ans.
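The arithmetic above can be cross-checked with a minimal Python sketch (illustrative only; the
variable names are mine, not from the text):

```python
from math import log2

p = {"dot": 2/3, "dash": 1/3}          # symbol probabilities
t = {"dot": 0.2, "dash": 0.6}          # symbol durations in seconds
t_space = 0.2                          # gap between symbols

# Entropy in bits/symbol
H = sum(pk * log2(1 / pk) for pk in p.values())

# Average time per symbol, including the inter-symbol gap
T_avg = sum(p[s] * t[s] for s in p) + t_space
r = 1 / T_avg                          # symbols/sec

print(f"H = {H:.4f} bits/symbol")      # 0.9183
print(f"r = {r:.4f} symbols/sec")      # 1.8750
print(f"R = {r * H:.3f} bits/sec")     # 1.722
```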
Ex. 3.5.4 : The voice signal in a PCM system is quantised into 16 levels with the following
probabilities :
P1 = P2 = P3 = P4 = 0.1
P5 = P6 = P7 = P8 = 0.05
P9 = P10 = P11 = P12 = 0.075
P13 = P14 = P15 = P16 = 0.025
Calculate the entropy and the rate of information, given that the message bandwidth is
fm = 3 kHz and the signal is sampled at the Nyquist rate.
Soln. :
It is given that,
1. Number of levels (messages) = 16
2. Message bandwidth fm = 3 kHz
(a) Entropy of the source :
H = Σ (k = 1 to 16) pk log2 ( 1/pk )    ...(1)
Grouping the equal probabilities we get,
H = 0.4 log2 (10) + 0.2 log2 (20) + 0.3 log2 (13.33) + 0.1 log2 (40)
  = 1.3288 + 0.8644 + 1.1211 + 0.5322
∴ H = 3.85 bits/message    ...(2) ...Ans.
(b) Message rate :
The sampling frequency is,
fs = 2 fm = 2 × 3 kHz = 6 kHz    ...(3)
Hence there are 6000 samples/sec. As each sample is converted to one of the 16 levels, there are
6000 messages/sec.
∴ Message rate r = 6000 messages/sec
(c) Information rate :
R = r H = 6000 × 3.85
∴ R = 23,100 bits/sec    ...(4) ...Ans.
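A quick numerical check of this example (a sketch assuming the reconstructed probability groups
above):

```python
from math import log2

# 16 quantisation levels grouped by probability
probs = [0.1] * 4 + [0.05] * 4 + [0.075] * 4 + [0.025] * 4
assert abs(sum(probs) - 1.0) < 1e-12

H = sum(p * log2(1 / p) for p in probs)   # bits/message
r = 2 * 3000                              # Nyquist sampling rate for fm = 3 kHz

print(f"H = {H:.3f} bits/message")        # 3.846
print(f"R = {r * H:.0f} bits/sec")        # 23079 (book: 6000 x 3.85 = 23100)
```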
Ex. 3.5.5 : A message source generates one of four messages randomly every microsecond. The
probabilities of these messages are 0.4, 0.3, 0.2 and 0.1. Each emitted message is
independent of the other messages in the sequence.
1. What is the source entropy ?
2. What is the rate of information generated by this source in bits per second ?
Soln. :
It is given that,
1. Number of messages M = 4
2. Probabilities : p1 = 0.4, p2 = 0.3, p3 = 0.2, p4 = 0.1
3. One message is generated every microsecond, i.e. r = 10^6 messages/sec
(a) Source entropy :
H = Σ (k = 1 to 4) pk log2 ( 1/pk )
  = 0.4 log2 (2.5) + 0.3 log2 (3.33) + 0.2 log2 (5) + 0.1 log2 (10)
∴ H = 1.846 bits/message    ...Ans.
(b) Information rate :
R = r H = 10^6 × 1.846
∴ R = 1.846 × 10^6 bits/sec = 1.846 Mbits/sec    ...Ans.
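The same result in a few lines of Python (illustrative sketch):

```python
from math import log2

probs = [0.4, 0.3, 0.2, 0.1]
H = sum(p * log2(1 / p) for p in probs)  # bits/message
r = 1e6                                  # one message per microsecond

print(f"H = {H:.3f} bits/message")       # 1.846
print(f"R = {r * H:.3e} bits/sec")       # 1.846e+06
```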
A source consists of 4 letters A, B, C and D. For transmission each letter is coded into a
sequence of two binary pulses. A is represented by 00, B by 01, C by 10 and D by 11.
The probability of occurrence of each letter is P (A) = 1/5, P (B) = 1/4, P (C) = 1/4 and
P (D) = 3/10. Determine the entropy of the source and the average rate of transmission of
information.
Soln. : The given data can be summarised as shown in the following table :

| Message     | A   | B   | C   | D    |
| Probability | 1/5 | 1/4 | 1/4 | 3/10 |
| Code        | 00  | 01  | 10  | 11   |
Assumption : Let us assume that the message transmission rate be r = 4000 messages/sec.
(a) Entropy of the source :
H = (1/5) log2 (5) + (1/4) log2 (4) + (1/4) log2 (4) + (3/10) log2 (10/3)
  = 0.4644 + 0.5 + 0.5 + 0.5211
∴ H = 1.9855 bits/message    ...Ans.
(b) Average rate of transmission of information :
R = r H = 4000 × 1.9855 ≈ 7942 bits/sec    ...Ans.
(c) Average information per binary pulse (binit) :
Each message consists of two binits, hence the binit rate is 2 × 4000 = 8000 binits/sec.
∴ Information per binit = R / binit rate = 7942 / 8000 = 0.9927 ≈ 1 bit/binit    ...Ans.
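All three answers can be verified with a short Python sketch (variable names are illustrative):

```python
from math import log2

probs = {"A": 1/5, "B": 1/4, "C": 1/4, "D": 3/10}
r = 4000                                  # assumed messages/sec

H = sum(p * log2(1 / p) for p in probs.values())
R = r * H                                 # bits/sec
binit_rate = 2 * r                        # two binary pulses per letter

print(f"H = {H:.4f} bits/message")        # 1.9855
print(f"R = {R:.0f} bits/sec")            # 7942
print(f"{R / binit_rate:.4f} bits/binit") # 0.9927 ~ 1
```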
Section 3.6 :
Ex. 3.6.3 :
A discrete memoryless source has five symbols x1, x2, x3, x4 and x5 with probabilities
p ( x1 ) = 0.4, p ( x2 ) = 0.19, p ( x3 ) = 0.16, p ( x4 ) = 0.14 and p ( x5 ) = 0.11. Construct the
Shannon-Fano code for this source. Calculate the average code word length and coding
efficiency of the source.
Soln. : Follow the steps given below to obtain the Shannon-Fano code.
Step 1 : List the source symbols in the order of decreasing probability.
Step 2 : Partition the set into two sets that are as close to being equiprobable as possible and assign
0 to the upper set and 1 to the lower set.
Step 3 : Continue this process, each time partitioning the sets with as nearly equal probabilities as
possible until further partitioning is not possible.
(a) Construction of the Shannon-Fano code :

| Symbol | Probability | Step 1 | Step 2 | Step 3 | Code word |
|--------|-------------|--------|--------|--------|-----------|
| x1     | 0.4         | 0      | 0      | -      | 00        |
| x2     | 0.19        | 0      | 1      | -      | 01        |
| x3     | 0.16        | 1      | 0      | -      | 10        |
| x4     | 0.14        | 1      | 1      | 0      | 110       |
| x5     | 0.11        | 1      | 1      | 1      | 111       |

(b) Average code word length :
L = Σ (k = 1 to 5) pk × (length of mk in bits)
  = ( 0.4 × 2 ) + ( 0.19 × 2 ) + ( 0.16 × 2 ) + ( 0.14 × 3 ) + ( 0.11 × 3 )
∴ L = 2.25 bits/message
(c) Entropy and coding efficiency :
H = Σ (k = 1 to 5) pk log2 ( 1 / pk )
  = 0.4 log2 ( 1 / 0.4 ) + 0.19 log2 ( 1 / 0.19 ) + 0.16 log2 ( 1 / 0.16 )
    + 0.14 log2 ( 1 / 0.14 ) + 0.11 log2 ( 1 / 0.11 )
∴ H = 2.15 bits/message
∴ Coding efficiency η = H / L = 2.15 / 2.25 = 95.56 %    ...Ans.
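The partitioning rule of Steps 1-3 can also be sketched in Python. This is one plausible
implementation (the function shannon_fano and its tie-breaking are my own choices, not from the
text):

```python
from math import log2

def shannon_fano(symbols):
    """Assign Shannon-Fano codes to (symbol, prob) pairs already
    sorted in decreasing order of probability."""
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total = sum(p for _, p in symbols)
    # choose the split that makes the two groups closest to equiprobable
    split, best = 1, None
    for i in range(1, len(symbols)):
        upper = sum(p for _, p in symbols[:i])
        diff = abs(2 * upper - total)
        if best is None or diff < best:
            split, best = i, diff
    codes = {s: "0" + c for s, c in shannon_fano(symbols[:split]).items()}
    codes.update({s: "1" + c for s, c in shannon_fano(symbols[split:]).items()})
    return codes

source = [("x1", 0.4), ("x2", 0.19), ("x3", 0.16), ("x4", 0.14), ("x5", 0.11)]
codes = shannon_fano(source)
L = sum(p * len(codes[s]) for s, p in source)
H = sum(p * log2(1 / p) for _, p in source)
print(codes)   # {'x1': '00', 'x2': '01', 'x3': '10', 'x4': '110', 'x5': '111'}
print(f"L = {L:.2f} bits, H = {H:.4f} bits, efficiency = {H / L:.2%}")
# L = 2.25, H = 2.1545, efficiency ~95.8 % (the book rounds H to 2.15 -> 95.56 %)
```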
Ex. 3.6.6 : A discrete memoryless source has an alphabet of seven symbols with probabilities for its
output as described in Table P. 3.6.6(a).

Table P. 3.6.6(a)

| Symbol      | S0   | S1   | S2    | S3    | S4    | S5     | S6     |
| Probability | 0.25 | 0.25 | 0.125 | 0.125 | 0.125 | 0.0625 | 0.0625 |
Compute the Huffman code for this source moving the combined symbol as high as
possible. Explain why the computed source code has an efficiency of 100 percent.
Soln. : The Huffman code for the source alphabet, constructed as shown in Fig. P. 3.6.6, is
summarised below :

| Symbol | Probability | Code word | Code word length |
|--------|-------------|-----------|------------------|
| S0     | 0.25        | 10        | 2 bits           |
| S1     | 0.25        | 11        | 2 bits           |
| S2     | 0.125       | 001       | 3 bits           |
| S3     | 0.125       | 010       | 3 bits           |
| S4     | 0.125       | 011       | 3 bits           |
| S5     | 0.0625      | 0000      | 4 bits           |
| S6     | 0.0625      | 0001      | 4 bits           |
1. Average code word length :
From the table above,
L = Σ (k = 0 to 6) pk (length of code word for Sk)
  = ( 0.25 × 2 ) + ( 0.25 × 2 ) + ( 0.125 × 3 ) × 3 + ( 0.0625 × 4 ) × 2
∴ L = 2.625 bits/symbol
2. Entropy of the source :
H = Σ (i = 0 to 6) p ( xi ) log2 [ 1 / p ( xi ) ]
  = [ 0.25 log2 ( 4 ) ] × 2 + [ 0.125 log2 ( 8 ) ] × 3 + [ 0.0625 log2 ( 16 ) ] × 2
∴ H = 2.625 bits/symbol
3. Code efficiency :
η = H / L = 2.625 / 2.625 = 100 %
Note : As the average information per symbol (H) is equal to the average code length (L), the code
efficiency is 100 %. This happens because every symbol probability here is an integer power of 1/2
(a dyadic distribution), so each code word length exactly equals log2 ( 1/pk ) and the Huffman code
meets the entropy bound with no loss.
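A hedged Python sketch of Huffman's algorithm for this alphabet (the exact bit patterns depend on
tie-breaking, but the code word lengths, and hence L, are the same as in the table):

```python
import heapq
from math import log2

def huffman_lengths(probs):
    """Return Huffman code word lengths for a probability dict."""
    heap = [(p, [sym]) for sym, p in probs.items()]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in probs}
    while len(heap) > 1:
        p1, syms1 = heapq.heappop(heap)
        p2, syms2 = heapq.heappop(heap)
        for sym in syms1 + syms2:       # every merge adds one bit
            lengths[sym] += 1
        heapq.heappush(heap, (p1 + p2, syms1 + syms2))
    return lengths

probs = {"S0": 0.25, "S1": 0.25, "S2": 0.125, "S3": 0.125,
         "S4": 0.125, "S5": 0.0625, "S6": 0.0625}
lengths = huffman_lengths(probs)
L = sum(probs[s] * lengths[s] for s in probs)
H = sum(p * log2(1 / p) for p in probs.values())
print(lengths)            # S0,S1 -> 2; S2..S4 -> 3; S5,S6 -> 4
print(L, H, f"{H / L:.0%}")  # 2.625 2.625 100%
```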
Section 3.11 :
Ex. 3.11.5 : Calculate the differential entropy H (X) of the uniformly distributed random variable X
with probability density function
fX (x) = 1/a    for 0 ≤ x ≤ a
       = 0     elsewhere
for 1. a = 1, 2. a = 2, 3. a = 1/2.
Soln. :
The uniform PDF of the random variable X is as shown in Fig. P. 3.11.5.
Fig. P. 3.11.5 : Uniform PDF of X
1. The average amount of information per sample value of x (t) is measured by,
H (X) = ∫ fX (x) log2 [ 1/fX (x) ] dx bits/sample    ...(1)
2. The entropy H (X) defined by the expression above is called the differential entropy of X.
(a) For a = 1 : fX (x) = 1, hence
H (X) = ∫ (0 to 1) 1 × log2 (1) dx = 0    ...Ans.
(b) For a = 2 : fX (x) = 1/2, hence
H (X) = ∫ (0 to 2) (1/2) log2 (2) dx = (1/2) × 1 × 2 = 1 bit/sample    ...Ans.
(c) Substitute a = 1/2, i.e. fX (x) = 2, to get
H (X) = ∫ (0 to 1/2) 2 log2 (1/2) dx = 2 × (−1) × (1/2) = −1 bit/sample    ...Ans.
Note that for a < 1 the differential entropy comes out negative, which is possible because it is a
relative (not absolute) measure of information.
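In closed form, the integral gives H (X) = log2 (a) for a uniform density over [0, a], which
reproduces all three answers; a two-line Python check (illustrative):

```python
from math import log2

# For X ~ Uniform(0, a), fX(x) = 1/a on [0, a], so the defining integral
# collapses to H(X) = (1/a) * log2(a) * a = log2(a).
for a in (1, 2, 0.5):
    print(f"a = {a}: H(X) = {log2(a):+.1f} bits/sample")
# a = 1 -> +0.0, a = 2 -> +1.0, a = 0.5 -> -1.0
```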
Ex. 3.11.6 : The source probabilities of a channel are p ( x1 ) = 0.3, p ( x2 ) = 0.25, p ( x3 ) = 0.45
and the channel matrix is,

                 y1     y2     y3
          x1 [  0.9    0.1    0    ]
P (Y/X) = x2 [  0      0.8    0.2  ]
          x3 [  0      0.3    0.7  ]

Calculate all the entropies and the mutual information for this channel.
Soln. :
Steps to be followed :
Step 1 : Obtain the joint probability matrix P (X, Y).
Step 2 : Obtain the probabilities p (y1), p (y2), p (y3).
Step 3 : Obtain the conditional probability matrix P (X/Y)
Step 4 : Obtain the marginal densities H (X) and H (Y).
Step 5 : Calculate the conditional entropy H (X/Y).
Step 6 : Calculate the joint entropy H (X , Y).
Step 7 : Calculate the mutual information I (X , Y).
Step 1 : Obtain the joint probability matrix P (X, Y) :
The given matrix P (Y/X) is the conditional probability matrix. We can obtain the joint
probability matrix P (X , Y) as :
P (X, Y) = P [ Y/X ] P (X)
Multiplying each row of P (Y/X) by the corresponding source probability,

             [ 0.9 × 0.3    0.1 × 0.3     0          ]
P (X, Y) =   [ 0            0.8 × 0.25    0.2 × 0.25 ]
             [ 0            0.3 × 0.45    0.7 × 0.45 ]

                 y1      y2       y3
          x1 [  0.27    0.03     0     ]
P (X, Y) = x2 [  0       0.2      0.05  ]
          x3 [  0       0.135    0.315 ]    ...(1)
Step 2 : Obtain the probabilities p (y1), p (y2), p (y3) :
These are the column sums of the joint probability matrix P (X, Y) :
p ( y1 ) = 0.27 + 0 + 0 = 0.27
p ( y2 ) = 0.03 + 0.2 + 0.135 = 0.365
p ( y3 ) = 0 + 0.05 + 0.315 = 0.365
Step 3 : Obtain the conditional probability matrix P (X/Y) :
The conditional probability matrix P (X/Y) can be obtained by dividing the columns of the joint
probability matrix P (X , Y) of Equation (1) by p (y1), p (y2) and p (y3) respectively.
             [ 0.27/0.27    0.03/0.365     0          ]
P (X/Y) =    [ 0/0.27       0.2/0.365      0.05/0.365 ]
             [ 0/0.27       0.135/0.365    0.315/0.365 ]

                 y1     y2        y3
          x1 [  1      0.0821    0      ]
P (X/Y) = x2 [  0      0.5479    0.1369 ]
          x3 [  0      0.3698    0.863  ]    ...(2)
Step 4 : Obtain the marginal entropies H (X) and H (Y) :
H (X) = Σ (i = 1 to 3) p ( xi ) log2 [ 1/p ( xi ) ]
      = p ( x1 ) log2 [ 1/p ( x1 ) ] + p ( x2 ) log2 [ 1/p ( x2 ) ] + p ( x3 ) log2 [ 1/p ( x3 ) ]
Substituting the values of p ( x1 ), p ( x2 ) and p ( x3 ) we get,
H (X) = 0.3 log2 (1/0.3) + 0.25 log2 (1/0.25) + 0.45 log2 (1/0.45)
      = (0.3 × 1.7369) + (0.25 × 2) + (0.45 × 1.152)
∴ H (X) = 1.5395 bits/message    ...Ans.
Similarly, using p ( y1 ) = 0.27 and p ( y2 ) = p ( y3 ) = 0.365,
H (Y) = 0.27 log2 (1/0.27) + 2 × 0.365 log2 (1/0.365) = 0.51 + 1.0615
∴ H (Y) = 1.5715 bits/message    ...Ans.
Step 5 : Calculate the conditional entropy H (X/Y) :
H (X/Y) = − Σ (i = 1 to 3) Σ (j = 1 to 3) p ( xi , yj ) log2 p ( xi/yj )
Fig. P. 3.11.6 : The matrices P (X, Y) and P (X/Y) of Equations (1) and (2), placed side by side for
substitution in the formula above.
Substituting various values from these two matrices we get,
H (X/Y) = − 0.27 log2 (1) − 0.03 log2 (0.0821) − 0 − 0 − 0.2 log2 (0.5479)
          − 0.05 log2 (0.1369) − 0 − 0.135 log2 (0.3698) − 0.315 log2 (0.863)
        = 0 + 0.108 + 0.1736 + 0.1434 + 0.1937 + 0.0669
∴ H (X/Y) = 0.6856 bits/message    ...Ans.
Step 6 : Calculate the joint entropy H (X, Y) :
H (X, Y) = − Σ (i = 1 to 3) Σ (j = 1 to 3) p ( xi , yj ) log2 p ( xi , yj )
         = − [ 0.27 log2 0.27 + 0.03 log2 0.03 + 0 + 0 + 0.2 log2 0.2 + 0.05 log2 0.05 + 0
              + 0.135 log2 0.135 + 0.315 log2 0.315 ]
         = 0.51 + 0.1517 + 0.4643 + 0.216 + 0.39 + 0.5249
∴ H (X, Y) = 2.2569 bits/message    ...Ans.
Step 7 : Calculate the mutual information :
I (X ; Y) = H (X) − H (X/Y) = 1.5395 − 0.6856
∴ I (X ; Y) = 0.8539 bits/message    ...Ans.
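All seven steps can be verified numerically. The sketch below (numpy-based, names illustrative)
uses the chain rule H (X, Y) = H (Y) + H (X/Y) instead of term-by-term substitution:

```python
import numpy as np

px = np.array([0.3, 0.25, 0.45])
P_y_given_x = np.array([[0.9, 0.1, 0.0],
                        [0.0, 0.8, 0.2],
                        [0.0, 0.3, 0.7]])

Pxy = px[:, None] * P_y_given_x          # joint matrix: rows scaled by p(x)
py = Pxy.sum(axis=0)                     # p(y) = column sums

def H(p):
    """Entropy in bits; 0 log 0 treated as 0."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

Hx, Hy, Hxy = H(px), H(py), H(Pxy.ravel())
H_x_given_y = Hxy - Hy                   # chain rule
I = Hx - H_x_given_y                     # mutual information

print(Hx, Hy, Hxy, H_x_given_y, I)
# ~1.5395, 1.5716, 2.2573 -> H(X/Y) ~ 0.6857, I ~ 0.8538 (matches up to rounding)
```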
Ex. 3.11.7 : For the given channel matrix, find out the mutual information. Given that
p ( x1 ) = 0.6, p ( x2 ) = 0.3 and p ( x3 ) = 0.1.

                 y1     y2     y3
          x1 [  1/2    1/2    0   ]
P (Y/X) = x2 [  1/2    0      1/2 ]
          x3 [  0      1/2    1/2 ]
Soln. :
Steps to be followed :
Step 1 : Obtain the joint probability matrix P (X , Y).
Step 2 : Calculate the probabilities p ( y1 ), p ( y2 ), p ( y3 ).
Step 3 : Obtain the conditional probability matrix P (X/Y).
Step 4 : Calculate the marginal densities H (X) and H (Y).
Step 5 : Calculate the conditional entropy H (X/Y).
Step 6 : Find the mutual information.
Step 1 : Obtain the joint probability matrix P (X , Y) :
We can obtain the joint probability matrix P (X , Y) as
P (X , Y) = P (Y/X) P (X)
So multiply rows of the P (Y / X) matrix by p ( x1 ), p ( x2 ) and p ( x3 ) to get,
             [ 0.5 × 0.6    0.5 × 0.6    0         ]
P (X, Y) =   [ 0.5 × 0.3    0            0.5 × 0.3 ]
             [ 0            0.5 × 0.1    0.5 × 0.1 ]

                 y1      y2      y3
          x1 [  0.3     0.3     0    ]
P (X, Y) = x2 [  0.15    0       0.15 ]
          x3 [  0       0.05    0.05 ]    ...(1)

Step 2 : Calculate the probabilities p ( y1 ), p ( y2 ), p ( y3 ) :
Adding the column entries of P (X, Y),
p ( y1 ) = 0.3 + 0.15 + 0 = 0.45
p ( y2 ) = 0.3 + 0 + 0.05 = 0.35
p ( y3 ) = 0 + 0.15 + 0.05 = 0.2
Step 3 : Obtain the conditional probability matrix P (X/Y) :
Dividing the columns of P (X, Y) by p ( y1 ), p ( y2 ) and p ( y3 ) respectively,

             [ 0.3 / 0.45     0.3 / 0.35     0          ]
P (X/Y) =    [ 0.15 / 0.45    0              0.15 / 0.2 ]
             [ 0              0.05 / 0.35    0.05 / 0.2 ]
                 y1      y2      y3
          x1 [  0.667   0.857   0    ]
P (X/Y) = x2 [  0.333   0       0.75 ]
          x3 [  0       0.143   0.25 ]    ...(2)
Step 4 : Calculate the marginal entropy H (X) :
H (X) = − Σ (i = 1 to 3) p ( xi ) log2 p ( xi )
      = − p ( x1 ) log2 p ( x1 ) − p ( x2 ) log2 p ( x2 ) − p ( x3 ) log2 p ( x3 )
      = − 0.6 log2 (0.6) − 0.3 log2 (0.3) − 0.1 log2 (0.1)
      = 0.4421 + 0.5210 + 0.3321
∴ H (X) = 1.2952 bits/message
Step 5 : Calculate the conditional entropy H (X/Y) :
H (X/Y) = − Σ (i = 1 to 3) Σ (j = 1 to 3) p ( xi , yj ) log2 p ( xi/yj )
        = − p ( x1 , y1 ) log2 p ( x1/y1 ) − p ( x1 , y2 ) log2 p ( x1/y2 ) − p ( x1 , y3 ) log2 p ( x1/y3 )
          − p ( x2 , y1 ) log2 p ( x2/y1 ) − p ( x2 , y2 ) log2 p ( x2/y2 ) − p ( x2 , y3 ) log2 p ( x2/y3 )
          − p ( x3 , y1 ) log2 p ( x3/y1 ) − p ( x3 , y2 ) log2 p ( x3/y2 ) − p ( x3 , y3 ) log2 p ( x3/y3 )
Fig. P. 3.11.7 : The matrices P (X, Y) and P (X/Y) of Equations (1) and (2), placed side by side for
substitution in the formula above.
Substituting various values from these two matrices we get,
H (X/Y) = − 0.3 log2 (0.667) − 0.3 log2 (0.857) − 0
          − 0.15 log2 (0.333) − 0 − 0.15 log2 (0.75)
          − 0 − 0.05 log2 (0.143) − 0.05 log2 (0.25)
        = 0.1753 + 0.0668 + 0.2380 + 0.0623 + 0.1403 + 0.1
∴ H (X/Y) = 0.7827 bits/message    ...Ans.
Step 6 : Find the mutual information :
I (X ; Y) = H (X) − H (X/Y) = 1.2952 − 0.7827
∴ I (X ; Y) = 0.5125 bits/message    ...Ans.
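The same numpy cross-check as in the previous example applies here, using the identity
I (X ; Y) = H (X) + H (Y) − H (X, Y) (a sketch; names are illustrative):

```python
import numpy as np

px = np.array([0.6, 0.3, 0.1])
P_y_given_x = np.array([[0.5, 0.5, 0.0],
                        [0.5, 0.0, 0.5],
                        [0.0, 0.5, 0.5]])

Pxy = px[:, None] * P_y_given_x
py = Pxy.sum(axis=0)                      # [0.45, 0.35, 0.20]

def H(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

I = H(px) + H(py) - H(Pxy.ravel())        # I = H(X) + H(Y) - H(X,Y)
print(f"I(X;Y) = {I:.4f} bits")           # ~0.513 (matches up to rounding)
```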
Ex. 3.11.8 : State the joint and conditional entropy. For a signal which is known to have a uniform
density function in the range 0 ≤ x ≤ 5, find the entropy H (X). If the same signal is amplified
eight times, determine the new H (X).
Soln. :
1. The joint entropy is defined as H (X, Y) = − Σ Σ p ( xi , yj ) log2 p ( xi , yj ) and the conditional
entropy as H (X/Y) = − Σ Σ p ( xi , yj ) log2 p ( xi/yj ).
2. The differential entropy of the continuous random variable X is,
H (X) = ∫ fX (x) log2 [ 1/fX (x) ] dx bits/sample
3. Let us define the PDF fX (x). It is given that fX (x) is uniform in the range 0 ≤ x ≤ 5, as shown
in Fig. P. 3.11.8.
Fig. P. 3.11.8 : Uniform PDF of X over 0 ≤ x ≤ 5
Let fX (x) = k    .... 0 ≤ x ≤ 5
          = 0    .... elsewhere
As ∫ fX (x) dx = 1, we get ∫ (0 to 5) k dx = 1, i.e. 5k = 1
∴ fX (x) = 1/5    .... 0 ≤ x ≤ 5
         = 0     .... elsewhere
4. Entropy H (X) :
H (X) = ∫ (0 to 5) (1/5) log2 (5) dx = log2 (5) = 2.32 bits/sample    ...Ans.
5. If the signal is amplified eight times, it becomes uniform over 0 ≤ x ≤ 40, so fX (x) = 1/40 and
H (X) = ∫ (0 to 40) (1/40) log2 (40) dx = log2 (40) = 5.32 bits/sample    ...Ans.
Ex. 3.11.9 :
Two binary symmetrical channels are connected in cascade as shown in Fig. P. 3.11.9.
1. Find the overall channel matrix P (Z/X) of the cascaded (resultant) channel.
2. Find P ( z1 ) and P ( z2 ), given that P ( x1 ) = 0.6 and P ( x2 ) = 0.4.
The channel matrix of a BSC consists of the transition probabilities of the channel. That means
the channel matrix for channel 1 is given by,
P [ Y/X ] = [ P ( y1/x1 )   P ( y2/x1 ) ]
            [ P ( y1/x2 )   P ( y2/x2 ) ]    ...(1)
Substituting the values we get,
P [ Y/X ] = [ 0.8   0.2 ]
            [ 0.2   0.8 ]    ...(2)
Similarly, the channel matrix of channel 2 is,
P [ Z/Y ] = [ P ( z1/y1 )   P ( z2/y1 ) ]
            [ P ( z1/y2 )   P ( z2/y2 ) ]    ...(3)
P [ Z/Y ] = [ 0.7   0.3 ]
            [ 0.3   0.7 ]    ...(4)
For the cascaded channel, the first element of the resultant channel matrix is,
P ( z1/x1 ) = P ( z1/y1 ) P ( y1/x1 ) + P ( z1/y2 ) P ( y2/x1 )    ...(5)
            = (0.7 × 0.8) + (0.3 × 0.2) = 0.62    ...(6)
Similarly we can obtain the expressions for the remaining terms in the channel matrix of the
resultant channel :
P [ Z/X ] = [ P ( z1/y1 ) P ( y1/x1 ) + P ( z1/y2 ) P ( y2/x1 )    P ( z2/y1 ) P ( y1/x1 ) + P ( z2/y2 ) P ( y2/x1 ) ]
            [ P ( z1/y1 ) P ( y1/x2 ) + P ( z1/y2 ) P ( y2/x2 )    P ( z2/y1 ) P ( y1/x2 ) + P ( z2/y2 ) P ( y2/x2 ) ]    ...(7)
The elements of the channel matrix of Equation (7) can be obtained by multiplying the individual
channel matrices.
P (Z/X) = P (Y/X) P (Z/Y)
P (Z/X) = [ 0.8   0.2 ] [ 0.7   0.3 ]
          [ 0.2   0.8 ] [ 0.3   0.7 ]

        = [ 0.62   0.38 ]
          [ 0.38   0.62 ]    ...(8) ...Ans.
To calculate P ( z1 ) and P ( z2 ) :
From Fig. P. 3.11.9 we can write the following expression,
P ( z1 ) = P ( z1/y1 ) P ( y1 ) + P ( z1/y2 ) P ( y2 )
Substituting
P ( y1 ) = P ( x1 ) P ( y1/x1 ) + P ( x2 ) P ( y1/x2 ) = (0.6 × 0.8) + (0.4 × 0.2) = 0.56
and P ( y2 ) = P ( x1 ) P ( y2/x1 ) + P ( x2 ) P ( y2/x2 ) = (0.6 × 0.2) + (0.4 × 0.8) = 0.44
We get,
P ( z1 ) = (0.7 × 0.56) + (0.3 × 0.44)
∴ P ( z1 ) = 0.524    ...(9) ...Ans.
Similarly,
P ( z2 ) = P ( z2/y1 ) P ( y1 ) + P ( z2/y2 ) P ( y2 )
         = (0.3 × 0.56) + (0.7 × 0.44)
∴ P ( z2 ) = 0.476    ...Ans.
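The cascade is just a matrix product, which a short Python sketch makes explicit (names are
illustrative):

```python
import numpy as np

P_yx = np.array([[0.8, 0.2],
                 [0.2, 0.8]])          # channel 1
P_zy = np.array([[0.7, 0.3],
                 [0.3, 0.7]])          # channel 2
px = np.array([0.6, 0.4])              # input probabilities

P_zx = P_yx @ P_zy                     # resultant channel matrix
pz = px @ P_zx                         # output probabilities

print(P_zx)                            # [[0.62 0.38] [0.38 0.62]]
print(pz)                              # [0.524 0.476]
```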
Ex. 3.11.10 : A binary channel with inputs x1, x2 and outputs y1, y2 has the transition probabilities
shown in Fig. P. 3.11.10 :
P ( y1/x1 ) = 2/3, P ( y2/x1 ) = 1/3, P ( y1/x2 ) = 1/10, P ( y2/x2 ) = 9/10
Ex. 3.11.11 : A channel is described by the following channel matrix (with y2 = e denoting an
erasure) :

                 y1       y2 = e    y3
P (Y/X) = x1 [  1 − P    P         0     ]
          x2 [  0        P         1 − P ]

1. Draw the channel diagram and identify the type of the channel.
2. If the source has equally likely outputs, compute the probabilities associated with
the channel outputs for P = 0.2.
Soln. :
Part I :
1. The rows of the given matrix are the transition probabilities :
P ( y1/x1 ) = 1 − P, P ( e/x1 ) = P, P ( y3/x1 ) = 0
P ( y1/x2 ) = 0, P ( e/x2 ) = P, P ( y3/x2 ) = 1 − P
2. The channel diagram is as shown in Fig. P. 3.11.11. This type of channel is called a binary
erasure channel. The output y2 = e indicates an erasure; this output is in doubt and should be
discarded.
Part II :
Given p ( x1 ) = p ( x2 ) = 0.5 and P = 0.2. The output probabilities are,
p ( y1 ) = p ( x1 ) ( 1 − P ) = 0.5 × 0.8 = 0.4
p ( e ) = p ( x1 ) P + p ( x2 ) P = (0.5 × 0.2) + (0.5 × 0.2) = 0.2
p ( y3 ) = p ( x2 ) ( 1 − P ) = 0.5 × 0.8 = 0.4
These are the required values of the probabilities associated with the channel outputs for P = 0.2.
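A quick check of Part II (a sketch; the column order y1, e, y3 is assumed from the matrix above):

```python
import numpy as np

P = 0.2
px = np.array([0.5, 0.5])                      # equally likely inputs
P_yx = np.array([[1 - P, P, 0.0],              # rows: x1, x2
                 [0.0,   P, 1 - P]])           # columns: y1, e, y3

py = px @ P_yx
print(dict(zip(["y1", "e", "y3"], py)))        # {'y1': 0.4, 'e': 0.2, 'y3': 0.4}
```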
Ex. 3.11.13 :
Find the mutual information and channel capacity of the channel as shown in
Fig. P. 3.11.13(a). Given that P ( x1 ) = 0.6 and P ( x2 ) = 0.4.
Fig. P. 3.11.13(a)
Soln. :
Given that : p ( x1 ) = 0.6, p ( x2 ) = 0.4
The conditional probabilities are,
p ( y1/x1 ) = 0.8, p ( y2/x1 ) = 0.2
p ( y1/x2 ) = 0.3 and p ( y2/x2 ) = 0.7
The mutual information can be obtained by
referring to Fig. P. 3.11.13(b).
Fig. P. 3.11.13(b)
As already derived, the mutual information of a binary channel can be expressed in terms of the
function Ω as,
I (X ; Y) = Ω [ p ( y1 ) ] − [ p ( x1 ) Ω ( p ( y2/x1 ) ) + p ( x2 ) Ω ( p ( y1/x2 ) ) ]    ...(1)
where Ω is called the horseshoe (binary entropy) function, given by,
Ω (p) = p log2 (1/p) + (1 − p) log2 [ 1/(1 − p) ]    ...(2)
Here p ( y1 ) = p ( x1 ) p ( y1/x1 ) + p ( x2 ) p ( y1/x2 ) = (0.6 × 0.8) + (0.4 × 0.3) = 0.6
∴ I (X ; Y) = Ω (0.6) − [ 0.6 Ω (0.2) + 0.4 Ω (0.3) ]    ...(3)
∴ I (X ; Y) = 0.1868 bits.    ...Ans.
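The Ω-based formula, i.e. I (X ; Y) = H (Y) − H (Y/X), is easy to evaluate in Python (a sketch;
note the small rounding difference against the quoted answer):

```python
from math import log2

def omega(p):
    """Binary entropy function in bits."""
    return p * log2(1 / p) + (1 - p) * log2(1 / (1 - p))

px1, px2 = 0.6, 0.4
p_y1_x1, p_y2_x1 = 0.8, 0.2
p_y1_x2, p_y2_x2 = 0.3, 0.7

py1 = px1 * p_y1_x1 + px2 * p_y1_x2             # = 0.6
I = omega(py1) - (px1 * omega(p_y2_x1) + px2 * omega(p_y1_x2))
print(f"I(X;Y) = {I:.4f} bits")                 # ~0.1853 (the text quotes 0.1868)
```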
Section 3.12 :
Ex. 3.12.3 : In a facsimile transmission of a picture, there are about 2.25 × 10^6 picture elements per
frame. For good reproduction, twelve brightness levels are necessary. Assuming all these
levels to be equiprobable, calculate the channel bandwidth required to transmit one
picture in every three minutes for a signal to noise power ratio of 30 dB. If the SNR
requirement increases to 40 dB, calculate the new bandwidth. Explain the trade-off
between bandwidth and SNR by comparing the two results.
Soln. :
Given : The number of picture elements per frame is 2.25 × 10^6, and each element can take any
one of the 12 equiprobable brightness levels.
1. Information rate :
The information rate (R) = No. of messages/sec × Average information per message.
∴ R = r H    ...(1)
where r = 2.25 × 10^6 / 3 minutes = 2.25 × 10^6 / 180 sec = 12500 elements/sec    ...(2)
and H = log2 M = log2 12    ...as all brightness levels are equiprobable    ...(3)
∴ R = 12,500 × log2 12
∴ R = 44.812 k bits/sec    ...(4)
2. Bandwidth for S/N = 30 dB :
The channel capacity must satisfy C ≥ R, where C = B log2 [ 1 + (S/N) ].
Substituting S/N = 30 dB = 1000 we get,
44.812 × 10^3 ≤ B log2 [ 1 + 1000 ]
∴ B ≥ 44.812 × 10^3 / 9.967
∴ B ≥ 4.4959 kHz    ...(5) ...Ans.
Bandwidth for S/N = 40 dB :
For a signal to noise ratio of 40 dB, i.e. 10,000, the new bandwidth is given by,
44.812 × 10^3 ≤ B log2 [ 1 + 10000 ]
∴ B ≥ 44.812 × 10^3 / 13.287
∴ B ≥ 3.372 kHz    ...Ans.
Trade-off between bandwidth and SNR : As the signal to noise ratio is increased from 30 dB to
40 dB, the required bandwidth decreases from about 4.5 kHz to about 3.37 kHz. Thus bandwidth
and signal power can be exchanged for one another while maintaining the same information rate.
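Both bandwidth figures follow from inverting the Shannon-Hartley formula, as this sketch shows
(variable names are illustrative):

```python
from math import log2

n_elements = 2.25e6          # picture elements per frame
levels = 12                  # equiprobable brightness levels
t_frame = 180                # seconds per picture

R = (n_elements / t_frame) * log2(levels)       # required bit rate
for snr_db in (30, 40):
    snr = 10 ** (snr_db / 10)
    B = R / log2(1 + snr)                        # from C = B log2(1 + S/N)
    print(f"SNR = {snr_db} dB: B >= {B / 1e3:.3f} kHz")
# 30 dB -> ~4.496 kHz, 40 dB -> ~3.372 kHz
```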
Ex. 3.12.4 : An analog signal having a bandwidth of 4 kHz is sampled at 1.25 times the Nyquist rate,
with each sample quantised into one of 256 equally likely levels.
1. Find the information rate of this source.
2. Can the output of this source be transmitted without error over an AWGN channel
with a bandwidth of 10 kHz and an SNR of 20 dB ?
3. Find the SNR required for error-free transmission for part (2).
4. Find the bandwidth required for an AWGN channel for error-free transmission of this
source if the SNR happens to be 20 dB.
Soln. :
Given : fm = 4 kHz, sampling rate = 1.25 × Nyquist rate, number of levels M = 256.
1. Information rate :
r = Number of samples/sec = 1.25 × 2 × 4000 = 10 × 10^3 samples/sec    ...(1)
and H = log2 256 = 8 bits/sample ...as all levels are equally likely.
∴ R = r H = 10 × 10^3 × log2 256 = 10 × 10^3 × 8
∴ R = 80 k bits/sec    ...Ans.
2. Transmission over the 10 kHz, 20 dB AWGN channel :
Given : B = 10 kHz and S/N = 20 dB = 100
C = B log2 [ 1 + (S/N) ] = 10 × 10^3 × log2 [ 101 ]
∴ C = 66.582 k bits/sec
For error-free transmission, it is necessary that R ≤ C. But here R = 80 kb/s and
C = 66.582 kb/s, i.e. R > C; hence error-free transmission is not possible.
3. SNR required for error-free transmission :
Setting C = R = 80 kb/s with B = 10 kHz we get,
80 × 10^3 = 10 × 10^3 × log2 [ 1 + (S/N) ]
∴ 2^8 = 1 + (S/N)
∴ 256 = 1 + (S/N)
∴ S/N = 255 (i.e. about 24.1 dB)    ...Ans.
This is the required value of the signal to noise ratio to ensure error-free transmission.
4. Bandwidth required for error-free transmission :
Given : C = 80 kb/s, S/N = 20 dB = 100
C = B log2 [ 1 + (S/N) ]
80 × 10^3 = B log2 [ 1 + 100 ]
∴ B = 80 × 10^3 / 6.6582
∴ B ≈ 12 kHz    ...Ans.
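Parts 2-4 amount to using the capacity formula forwards and backwards, e.g. (a sketch):

```python
from math import log2

R = 80e3                        # source information rate (bits/sec), part 1
B = 10e3                        # channel bandwidth (Hz)
snr = 100                       # 20 dB

C = B * log2(1 + snr)           # part 2: channel capacity
snr_req = 2 ** (R / B) - 1      # part 3: invert C = B log2(1 + S/N) with C = R
B_req = R / log2(1 + snr)       # part 4: bandwidth needed at 20 dB

print(f"C = {C/1e3:.3f} kb/s, error-free possible: {R <= C}")  # 66.582, False
print(f"required S/N = {snr_req:.0f} (~24.1 dB)")              # 255
print(f"required B = {B_req/1e3:.2f} kHz")                     # 12.02
```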
Ex. 3.12.5 :
A channel has a bandwidth of 5 kHz and a signal to noise power ratio of 63. Determine the
bandwidth needed if the S/N power ratio is reduced to 31. What will be the signal power
required if the channel bandwidth is reduced to 3 kHz ?
Soln. :
1. Channel capacity :
C = B log2 [ 1 + (S/N) ] = 5 × 10^3 × log2 ( 64 )
∴ C = 30 × 10^3 bits/sec    ...(1)
2. Bandwidth needed for S/N = 31 :
To maintain the same capacity,
30 × 10^3 = B log2 ( 1 + 31 ) = 5B
∴ B = 30 × 10^3 / 5 = 6 kHz    ...(2)
3. Signal power for B = 3 kHz :
Let the noise power corresponding to a bandwidth of 6 kHz be N1 = 6 N0 and the noise power
corresponding to the new bandwidth of 3 kHz be N2 = 3 N0.
∴ N1 / N2 = 6 N0 / 3 N0 = 2    ...(3)
The old signal to noise ratio is S1 / N1 = 31, i.e. S1 = 31 N1.
The new signal to noise ratio is S2 / N2. We do not know its value, hence let us find it out.
For the same capacity,
30 × 10^3 = 3 × 10^3 × log2 [ 1 + ( S2 / N2 ) ]    ...(4)
∴ log2 [ 1 + ( S2 / N2 ) ] = 10
∴ S2 / N2 = 2^10 − 1 = 1023
∴ S2 = 1023 N2
But from Equation (3), N2 = N1 / 2, substituting we get,
S2 = 1023 × N1 / 2 = 511.5 N1    ...(5)
∴ S2 / S1 = 511.5 N1 / 31 N1    ...(6)
∴ S2 = 16.5 S1    ...Ans.
Thus if the bandwidth is reduced by 50 %, the signal power must be increased 16.5 times
(i.e. to 1650 %) to maintain the same capacity.
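The power trade-off can be reproduced numerically (a sketch; it assumes, as the solution does,
that noise power scales linearly with bandwidth):

```python
from math import log2

B1, snr_old = 5e3, 63
C = B1 * log2(1 + snr_old)              # 30 kb/s

B_new = C / log2(1 + 31)                # bandwidth needed at S/N = 31 -> 6 kHz
snr_3k = 2 ** (C / 3e3) - 1             # S/N needed at B = 3 kHz -> 1023

# Noise power N = N0*B, so with S1 = 31*N0*6k and S2 = 1023*N0*3k
# the required signal-power increase is:
power_ratio = (snr_3k * 3e3) / (31 * 6e3)
print(f"B_new = {B_new/1e3:.0f} kHz, S/N = {snr_3k:.0f}, S2/S1 = {power_ratio:.1f}")
# -> 6 kHz, 1023, 16.5
```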
Ex. 3.12.6 : A channel has a bandwidth of 2 kHz and a signal to noise power ratio of 251.
(a) Find the channel capacity.
(b) Find the channel capacity if the bandwidth is reduced to one half and to one quarter of its
original value, the signal power remaining constant.
Soln. :
Data : B = 2 kHz and S/N = 251    ...(1)
(a) Channel capacity :
C = B log2 [ 1 + (S/N) ] = 2 × 10^3 × log2 [ 1 + 251 ] = 2 × 10^3 × ( log10 252 / log10 2 )
∴ C = 15.95 × 10^3 bits/sec    ...Ans.
(b) Capacity with reduced bandwidth :
The noise power is proportional to the bandwidth, N = N0 B.
For the original bandwidth B1 = 2 kHz : N1 = N0 B1    ...(2)
For the halved bandwidth B2 = 1 kHz : N2 = N0 B2    ...(3)
∴ N2 / N1 = N0 B2 / N0 B1 = B2 / B1 = 1/2    ...(4)
As the signal power remains constant, the SNR with the new bandwidth is,
S / N2 = S / ( N1 / 2 ) = 2 ( S / N1 )
But we know that S / N1 = 251
∴ S / N2 = 2 × 251 = 502    ...(5)
∴ C = 1 × 10^3 × log2 ( 1 + 502 ) = 1 × 10^3 × ( log10 503 / log10 2 )
∴ C = 8.97 × 10^3 bits/sec    ...(6) ...Ans.
Similarly, for the quartered bandwidth B3 = 500 Hz, N3 = N1 / 4 and
S / N3 = 4 ( S / N1 ) = 4 × 251 = 1004    ...(7)
∴ C = 500 × log2 ( 1 + 1004 )
∴ C = 4.99 × 10^3 bits/sec    ...Ans.
Thus, with the signal power held constant, halving the bandwidth does not halve the capacity,
because the smaller noise bandwidth raises the SNR.
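All three capacities follow from one loop (a sketch under the same noise-proportional-to-bandwidth
assumption):

```python
from math import log2

B1, snr1 = 2e3, 251
for divisor in (1, 2, 4):
    B = B1 / divisor
    snr = snr1 * divisor          # noise ~ bandwidth, signal power fixed
    C = B * log2(1 + snr)
    print(f"B = {B/1e3:.1f} kHz: S/N = {snr:.0f}, C = {C/1e3:.2f} kb/s")
# 2.0 kHz -> 15.95, 1.0 kHz -> 8.97, 0.5 kHz -> 4.99
```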