
Communication Systems

Raphael Seebacher
raphasee@ee.ethz.ch

Contents

1 Random Processes
  1.1 Mathematical Definition
  1.2 Stationary Processes
  1.3 Mean, Correlation and Covariance
  1.4 Ergodic Processes
  1.5 Transmission of a Random Process Through an LTI Filter
  1.6 Power Spectral Density
  1.7 Gaussian Processes
  1.8 Noise
    1.8.1 White Noise

2 Baseband Pulse Transmission
  2.1 Matched Filter
    2.1.1 Properties of Matched Filters
  2.2 Error Rate due to Noise
  2.3 Intersymbol Interference
  2.4 Nyquist's Criterion
    2.4.1 Ideal Nyquist Channel
    2.4.2 Raised Cosine Spectrum
  2.5 Correlative-Level Coding
    2.5.1 Duobinary Signaling
    2.5.2 Modified Duobinary Signaling
  2.6 Baseband M-ary PAM Transmission

3 Signal-Space Analysis
  3.1 Geometric Representation of Signals
    3.1.1 Gram-Schmidt Orthogonalization Procedure
  3.2 Conversion of the Continuous AWGN Channel into a Vector Channel
    3.2.1 Statistical Characterization of the Correlator Outputs
  3.3 Likelihood Functions
  3.4 Coherent Detection of Signals in Noise: Maximum Likelihood Decoding
    3.4.1 Maximum a Posteriori Decoding
    3.4.2 Maximum Likelihood Decoding
  3.5 Correlation Receiver
  3.6 Probability of Error
    3.6.1 Invariance of the Probability of Error to Rotation and Translation
    3.6.2 Minimum Energy Signals
    3.6.3 Union Bound on the Probability of Error
    3.6.4 Bit versus Symbol Error Probabilities

4 Passband Data Transmission
  4.1 Introduction
    4.1.1 Hierarchy of Digital Modulation Techniques
    4.1.2 Probability of Error
    4.1.3 Power Spectra
    4.1.4 Bandwidth Efficiency
  4.2 Passband Transmission Model
  4.3 Coherent Phase-Shift Keying
    4.3.1 Binary Phase-Shift Keying
    4.3.2 Quadriphase-Shift Keying
    4.3.3 Offset QPSK
    4.3.4 π/4-Shifted QPSK
    4.3.5 M-ary PSK
    4.3.6 Bandwidth Efficiency of M-ary PSK Signals
  4.4 Hybrid A/P Modulation Schemes
    4.4.1 M-ary Quadrature Amplitude Modulation
    4.4.2 Carrierless Amplitude/Phase Modulation
  4.5 Coherent Frequency-Shift Keying
    4.5.1 Binary FSK
    4.5.2 Minimum Shift Keying
    4.5.3 Gaussian-Filtered MSK
    4.5.4 M-ary FSK
  4.6 Detection of Signals with Unknown Phase
    4.6.1 Optimum Quadratic Receiver
  4.7 Noncoherent Orthogonal Modulation
    4.7.1 Noncoherent Binary FSK
    4.7.2 Differential PSK
  4.8 Comparison of Digital Modulation Schemes Using a Single Carrier

5 Multiuser Radio Communications
  5.1 Multiple-Access Techniques
  5.2 Satellite Communications
  5.3 Radio Link Analysis
    5.3.1 Free-Space Propagation Model
    5.3.2 Noise Figure
  5.4 Wireless Communications
    5.4.1 Propagation Effects
  5.5 Binary Signaling over a Rayleigh Fading Channel
    5.5.1 Diversity Techniques

6 Fundamental Limits in Information Theory
  6.1 Uncertainty, Information and Entropy
  6.2 Source-Coding Theorem
  6.3 Data Compaction
    6.3.1 Prefix Coding
    6.3.2 Huffman Coding
    6.3.3 Lempel-Ziv Coding
  6.4 Discrete Memoryless Channels
  6.5 Mutual Information
    6.5.1 Properties of Mutual Information
  6.6 Channel Capacity
  6.7 Channel-Coding Theorem
    6.7.1 Application of the Channel Coding Theorem to Binary Symmetric Channels
  6.8 Differential Entropy and Mutual Information for Continuous Ensembles
    6.8.1 Mutual Information
  6.9 Information Capacity Theorem
    6.9.1 Implications of the Information Capacity Theorem

7 Error-Control Coding
  7.1 Discrete Memoryless Channels
    7.1.1 Notation
  7.2 Linear Block Codes
    7.2.1 Syndrome: Definition and Properties
    7.2.2 Minimum Distance Considerations
    7.2.3 Syndrome Decoding
    7.2.4 Dual Code
  7.3 Cyclic Codes
    7.3.1 Generator Polynomial
    7.3.2 Parity-Check Polynomial
    7.3.3 Calculation of the Syndrome

8 Multiple Access Protocols
  8.1 ALOHA Throughput
    8.1.1 Assumptions
    8.1.2 Throughput of Pure ALOHA
    8.1.3 Throughput of Slotted ALOHA
    8.1.4 Mean Delay
  8.2 Carrier Sense Multiple Access
    8.2.1 1-persistent CSMA
    8.2.2 Nonpersistent CSMA
    8.2.3 P-persistent CSMA
    8.2.4 Comparison

A Line Codes

B Hilbert Transform

C Complex Representation of Signals and Systems
  C.0.5 Canonical Representation: Band-Pass Signal

1 Random Processes

1.1 Mathematical Definition

Def 1.1 Sample Space S
The totality of sample points $s_i \in S$ corresponding to the aggregate of all possible outcomes of the experiment is called the sample space $S$.

Def 1.2 Random Process
We formally define a random process $X(t)$ as an ensemble of time functions together with a probability rule that assigns a probability to any meaningful event associated with an observation of one of the sample functions of the random process.

For a fixed sample point $s_i \in S$, the function $x_{s_i}(t)$ is called a realization or sample function of the random process.

A random process observed at a fixed time instant $t_k$ is the random variable
$\{x_1(t_k), x_2(t_k), \ldots, x_n(t_k)\}$

Def 1.3 Joint Distribution Function
Let $X(t)$ denote a random process initiated at $t = -\infty$. Let $X(t_1), \ldots, X(t_k)$ denote the random variables obtained by observing the random process at times $t_1, \ldots, t_k$. The joint distribution function is then denoted by
$F_{X(t_1),\ldots,X(t_k)}(x_1, \ldots, x_k)$

1.2 Stationary Processes

Def 1.4 Strict Stationarity
The random process $X(t)$ is said to be strictly stationary if the following condition holds:
$F_{X(t_1+\tau),\ldots,X(t_k+\tau)}(x_1, \ldots, x_k) = F_{X(t_1),\ldots,X(t_k)}(x_1, \ldots, x_k)$
for all time shifts $\tau$, all $k$, and all possible observation times $t_1, \ldots, t_k$.

Special Cases
- For $k = 1$ we have $F_{X(t)}(x) = F_{X(t+\tau)}(x) = F_X(x)$ for all $t$ and $\tau$; hence the first-order distribution function of a stationary random process is independent of time.
- For $k = 2$ and $\tau = -t_1$ we have $F_{X(t_1),X(t_2)}(x_1, x_2) = F_{X(0),X(t_2-t_1)}(x_1, x_2)$ for all $t_1, t_2$; hence the second-order distribution function of a stationary random process depends only on the time difference between the observation times.

1.3 Mean, Correlation and Covariance

Def 1.5 Mean
We define the mean of the process $X(t)$ as the expectation of the random variable obtained by observing the process at some time $t$:
$\mu_X(t) = E[X(t)] = \int_{-\infty}^{\infty} x\, f_{X(t)}(x)\,dx$
where $f_{X(t)}(x)$ is the first-order probability density function of the process.

If $X(t)$ is strictly stationary, the mean is a constant:
$\mu_X(t) = \mu_X$ for all $t$.

Def 1.6 Autocorrelation Function
$R_X(t_1, t_2) = E[X(t_1)X(t_2)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1 x_2\, f_{X(t_1),X(t_2)}(x_1, x_2)\,dx_1\,dx_2$
where $f_{X(t_1),X(t_2)}(x_1, x_2)$ is the second-order probability density function of the process.

If $X(t)$ is strictly stationary, the autocorrelation function depends only on the time difference $t_2 - t_1$:
$R_X(t_1, t_2) = R_X(t_2 - t_1)$ for all $t_1, t_2$
Hence
$R_X(\tau) = E[X(t + \tau)X(t)]$

Def 1.7 Autocovariance Function
The autocovariance function of a strictly stationary process is written as
$C_X(t_1, t_2) = E\big[(X(t_1) - \mu_X)(X(t_2) - \mu_X)\big] = R_X(t_2 - t_1) - \mu_X^2$

The mean and the autocorrelation function are sufficient to describe the first two moments of the process. However, they only provide a partial description, and the conditions
$\mu_X(t) = \mu_X$ and $R_X(t_1, t_2) = R_X(t_2 - t_1)$
are not sufficient to guarantee that the random process $X(t)$ is strictly stationary. Processes satisfying only these two conditions are referred to as wide-sense stationary or weakly stationary. The class of strictly stationary processes with finite second-order moments forms a subclass of the class of all weakly stationary processes.

Properties of the Autocorrelation Function
- $R_X(0) = E[X^2(t)]$
- $R_X(\tau) = R_X(-\tau)$
- $|R_X(\tau)| \le R_X(0)$

Def 1.8 Cross-Correlation Function
Consider two random processes $X(t)$ and $Y(t)$ with autocorrelation functions $R_X(t, u)$ and $R_Y(t, u)$. The two cross-correlation functions of $X(t)$ and $Y(t)$ are defined by
$R_{XY}(t, u) = E[X(t)Y(u)]$ and $R_{YX}(t, u) = E[Y(t)X(u)]$
where $t$ and $u$ denote the two values of time at which the processes are observed.

Def 1.9 Correlation Matrix
Let $X(t)$ and $Y(t)$ denote two random processes. The correlation matrix is given by
$R(t, u) = \begin{bmatrix} R_X(t, u) & R_{XY}(t, u) \\ R_{YX}(t, u) & R_Y(t, u) \end{bmatrix}$
If the random processes $X(t)$ and $Y(t)$ are each stationary and jointly stationary, the correlation matrix simplifies to
$R(\tau) = \begin{bmatrix} R_X(\tau) & R_{XY}(\tau) \\ R_{YX}(\tau) & R_Y(\tau) \end{bmatrix}$
where $\tau = t - u$. Furthermore,
$R_{XY}(\tau) = R_{YX}(-\tau)$

1.4 Ergodic Processes
Consider the sample function $x(t)$ of a stationary process $X(t)$. The DC value of $x(t)$ is defined by the time average
$\mu_x(T) = \frac{1}{2T}\int_{-T}^{T} x(t)\,dt$
Since the process $X(t)$ is assumed to be stationary, the mean of the time average $\mu_x(T)$ is given by
$E[\mu_x(T)] = \mu_X$
where $\mu_X$ is the mean of the process $X(t)$.

Def 1.10 Ergodic Process
We say that the process $X(t)$ is ergodic in the mean if two conditions are satisfied:
- $\lim_{T\to\infty} \mu_x(T) = \mu_X$
- $\lim_{T\to\infty} \operatorname{Var}[\mu_x(T)] = 0$

We may formally define the time-averaged autocorrelation function of a sample function $x(t)$ as
$R_x(\tau, T) = \frac{1}{2T}\int_{-T}^{T} x(t + \tau)\,x(t)\,dt$
The process $X(t)$ is ergodic in the autocorrelation function if
- $\lim_{T\to\infty} R_x(\tau, T) = R_X(\tau)$
- $\lim_{T\to\infty} \operatorname{Var}[R_x(\tau, T)] = 0$

For a random process to be ergodic, it has to be stationary.

1.5 Transmission of a Random Process Through an LTI Filter
A random process $X(t)$ is applied as input to a linear time-invariant filter of impulse response $h(t)$, producing a new random process $Y(t)$. Assume $X(t)$ is a stationary process:
$Y(t) = \int_{-\infty}^{\infty} h(\tau_1)\,X(t - \tau_1)\,d\tau_1$
For the mean of $Y(t)$ we get, provided that $E[X(t)] < \infty$ for all $t$ and the system is stable,
$\mu_Y(t) = E[Y(t)] = \int_{-\infty}^{\infty} h(\tau_1)\,\mu_X(t - \tau_1)\,d\tau_1$
If $X(t)$ is stationary, the mean is constant, and hence
$\mu_Y = \mu_X \int_{-\infty}^{\infty} h(\tau_1)\,d\tau_1 = \mu_X H(0)$
The autocorrelation function of $Y(t)$ is, provided that $E[X^2(t)] < \infty$ for all $t$ and the system is stable,
$R_Y(t, u) = E[Y(t)Y(u)] = \int_{-\infty}^{\infty} d\tau_1\, h(\tau_1) \int_{-\infty}^{\infty} d\tau_2\, h(\tau_2)\, R_X(t - \tau_1, u - \tau_2)$
If $X(t)$ is stationary, let $\tau = t - u$:
$R_Y(\tau) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(\tau_1)\,h(\tau_2)\,R_X(\tau - \tau_1 + \tau_2)\,d\tau_1\,d\tau_2$
If the input to a stable LTI filter is a stationary process, then the output of the filter is also a stationary process.

1.6 Power Spectral Density
Let $H(f)$ denote the frequency response of the LTI filter:
$h(\tau_1) = \int_{-\infty}^{\infty} H(f)\exp(j 2\pi f \tau_1)\,df$

Def 1.11 Power Spectral Density
The function $S_X(f)$ is called the power spectral density, or power spectrum, of the stationary process $X(t)$:
$S_X(f) = \int_{-\infty}^{\infty} R_X(\tau)\exp(-j 2\pi f \tau)\,d\tau$
$R_X(\tau) = \int_{-\infty}^{\infty} S_X(f)\exp(j 2\pi f \tau)\,df$
The above equations are known as the Einstein-Wiener-Khintchine relations.

Using the power spectral density we obtain
$R_Y(0) = E[Y^2(t)] = \int_{-\infty}^{\infty} |H(f)|^2 S_X(f)\,df$

Properties of the PSD
- $S_X(0) = \int_{-\infty}^{\infty} R_X(\tau)\,d\tau$
- $E[X^2(t)] = \int_{-\infty}^{\infty} S_X(f)\,df = R_X(0)$
- $S_X(f) \ge 0$ for all $f$
- $S_X(-f) = S_X(f)$
- The power spectral density, appropriately normalized, has the properties usually associated with a probability density function:
  $p_X(f) = \frac{S_X(f)}{\int_{-\infty}^{\infty} S_X(f)\,df}$

The power spectral density of the output process $Y(t)$ equals the power spectral density of the input process $X(t)$ multiplied by the squared magnitude response of the filter:
$S_Y(f) = |H(f)|^2 S_X(f)$
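As a quick numerical illustration, the following Python sketch checks the relation $S_Y(f) = |H(f)|^2 S_X(f)$ by averaging periodograms of a filtered white Gaussian process; the moving-average filter and all parameter values are assumptions chosen only for the example.

import numpy as np

# Sketch: pass a white Gaussian process through an assumed LTI filter and compare
# the averaged output periodogram with |H(f)|^2 * S_X(f).
rng = np.random.default_rng(0)
N, n_blocks = 1024, 200                 # block length and number of averaged blocks
h = np.ones(8) / 8.0                    # assumed impulse response h(t): moving average
Sx = 1.0                                # white input: S_X(f) = 1 (unit sample spacing)

psd_est = np.zeros(N)
for _ in range(n_blocks):
    x = rng.normal(0.0, np.sqrt(Sx), N)         # sample function of the input X(t)
    y = np.convolve(x, h, mode="same")          # sample function of the output Y(t)
    psd_est += np.abs(np.fft.fft(y)) ** 2 / N   # periodogram of this realization
psd_est /= n_blocks                             # average over realizations

H = np.fft.fft(h, N)                            # frequency response H(f)
psd_theory = np.abs(H) ** 2 * Sx                # S_Y(f) = |H(f)|^2 * S_X(f)
print(np.round(psd_est[:5], 3))                 # low-frequency bins, estimated
print(np.round(psd_theory[:5], 3))              # low-frequency bins, predicted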


We obtain the relation between the power spectral density $S_X(f)$ of an ergodic process and the squared magnitude spectrum $|X(f, T)|^2$ of a truncated sample function of the process:
$S_X(f) = \lim_{T\to\infty} \frac{1}{2T} E\big[|X(f, T)|^2\big] = \lim_{T\to\infty} \frac{1}{2T} E\left[\left|\int_{-T}^{T} x(t)\exp(-j 2\pi f t)\,dt\right|^2\right]$

1.7 Gaussian Processes

Def 1.12 Linear Functional
Let $X(t)$ denote a random process and $g(t)$ some weighting function. We refer to
$Y = \int_0^T g(t)\,X(t)\,dt$
as a linear functional of $X(t)$.

Def 1.13 Gaussian Distribution
$f_Y(y) = \frac{1}{\sqrt{2\pi}\,\sigma_Y}\exp\left(-\frac{(y - \mu_Y)^2}{2\sigma_Y^2}\right)$
where $\mu_Y$ is the mean and $\sigma_Y^2$ is the variance of $Y$.

Def 1.14 Gaussian Process
If the weighting function $g(t)$ is such that the mean-square value of the random variable $Y$ is finite, and if the random variable $Y$ is a Gaussian-distributed random variable for every $g(t)$, then the process $X(t)$ is said to be a Gaussian process.

Theorem 1.1 Central Limit Theorem
Let $X_i$, $i = 1, \ldots, N$ be a set of independently and identically distributed (i.i.d.) random variables. They are normalized as
$Y_i = \frac{1}{\sigma_X}(X_i - \mu_X), \quad i = 1, \ldots, N$
so that $E[Y_i] = 0$ and $\operatorname{Var}[Y_i] = 1$. Define furthermore the random variable
$V_N = \frac{1}{\sqrt{N}}\sum_{i=1}^{N} Y_i$
The central limit theorem states that the probability distribution of $V_N$ approaches a normalized Gaussian distribution $N(0, 1)$ in the limit as the number of random variables $N$ approaches infinity.

Properties of a Gaussian Process
1. If a Gaussian process $X(t)$ is applied to a stable linear filter, then the random process $Y(t)$ developed at the output of the filter is also Gaussian.
2. Consider the set of random variables $X(t_1), \ldots, X(t_n)$, obtained by observing a random process $X(t)$ at times $t_1, \ldots, t_n$. If the process $X(t)$ is Gaussian, then this set of random variables is jointly Gaussian for any $n$, with their $n$-fold joint probability density function being completely determined by specifying the set of means
$\mu_{X(t_i)} = E[X(t_i)], \quad i = 1, \ldots, n$
and the set of covariance functions
$C_X(t_k, t_i) = E\big[(X(t_k) - \mu_{X(t_k)})(X(t_i) - \mu_{X(t_i)})\big], \quad k, i = 1, \ldots, n$
The random vector $\mathbf{X} = (X(t_1), \ldots, X(t_n))$ has a multivariate Gaussian distribution defined as
$f_{X(t_1),\ldots,X(t_n)}(x_1, \ldots, x_n) = \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}}\exp\left(-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^T\Sigma^{-1}(\mathbf{x} - \boldsymbol{\mu})\right)$
where $\boldsymbol{\mu} = [\mu_1, \ldots, \mu_n]^T$ is the mean vector, $\Sigma = \{C_X(t_k, t_i)\}_{k,i=1}^{n}$ is the covariance matrix and $|\Sigma|$ is the determinant of the covariance matrix $\Sigma$.
3. If a Gaussian process is stationary, then the process is also strictly stationary.
4. If the random variables $X(t_1), \ldots, X(t_n)$, obtained by sampling a Gaussian process $X(t)$ at times $t_1, \ldots, t_n$, are uncorrelated, that is
$E\big[(X(t_k) - \mu_{X(t_k)})(X(t_i) - \mu_{X(t_i)})\big] = 0, \quad i \ne k$
then these random variables are statistically independent.

1.8 Noise
The term noise is used customarily to designate unwanted signals that tend to disturb the transmission and processing of signals in communication systems and over which we have incomplete control.

1.8.1 White Noise
The noise analysis of communication systems is customarily based on an idealized form of noise called white noise:
$S_W(f) = \frac{N_0}{2}, \qquad R_W(\tau) = \frac{N_0}{2}\,\delta(\tau)$
where the dimension of $N_0$ is watts per hertz, and
$N_0 = k_B T_e$
where $k_B$ is Boltzmann's constant and $T_e$ is the equivalent noise temperature of the receiver.

2 Baseband Pulse Transmission

In its most general form, linear modulation is defined by
$s(t) = s_I(t)\cos(2\pi f_c t) - s_Q(t)\sin(2\pi f_c t)$
where $s_I(t)$ is the in-phase component of the modulated wave $s(t)$, and $s_Q(t)$ is its quadrature component.

2.1 Matched Filter
We assume that the major source of system limitation is the channel noise. Consider a linear receiver involving an LTI filter of impulse response $h(t)$. The input signal is given by
$x(t) = g(t) + w(t), \quad 0 \le t \le T$
where $T$ is an arbitrary observation interval and $w(t)$ is the sample function of a white noise process of zero mean and power spectral density $N_0/2$. The output of the linear receiver may be expressed as
$y(t) = h(t) * \big(g(t) + w(t)\big) = h(t) * g(t) + h(t) * w(t) = g_o(t) + n(t)$

Def 2.1 Peak Pulse Signal-to-Noise Ratio
$\eta = \frac{|g_o(T)|^2}{E[n^2(t)]}$
The requirement now is to specify the impulse response $h(t)$ of the filter such that the output signal-to-noise ratio is maximized. $g_o(t)$ is given by the inverse Fourier transform
$g_o(t) = \int_{-\infty}^{\infty} H(f)\,G(f)\exp(j 2\pi f t)\,df$
and the average power of the output noise $n(t)$ is given by
$E[n^2(t)] = \int_{-\infty}^{\infty} S_N(f)\,df = \frac{N_0}{2}\int_{-\infty}^{\infty} |H(f)|^2\,df$
Hence we get for the peak pulse signal-to-noise ratio
$\eta = \frac{\left|\int_{-\infty}^{\infty} H(f)\,G(f)\exp(j 2\pi f T)\,df\right|^2}{\frac{N_0}{2}\int_{-\infty}^{\infty} |H(f)|^2\,df}$
By applying Schwarz's inequality, we get for the above expression
$\eta \le \frac{2}{N_0}\int_{-\infty}^{\infty} |G(f)|^2\,df$
For the optimum value of $H(f)$ we get (as a result of Schwarz's inequality)
$H_{\mathrm{opt}}(f) = k\,G^*(f)\exp(-j 2\pi f T)$
where $G^*(f)$ is the complex conjugate of the Fourier transform of the input signal $g(t)$ and $k$ is a scaling factor. Since for a real signal $g(t)$ we have $G^*(f) = G(-f)$, we get
$h_{\mathrm{opt}}(t) = k\int_{-\infty}^{\infty} G^*(f)\exp(-j 2\pi f (T - t))\,df = k\,g(T - t)$
The impulse response of the optimum filter (except for the scaling factor $k$) is a time-reversed and delayed version of the input signal $g(t)$. The only assumption we have made about the input noise $w(t)$ is that it is stationary and white with zero mean and power spectral density $N_0/2$.

2.1.1 Properties of Matched Filters
- The impulse response $h_{\mathrm{opt}}(t)$ is uniquely defined, except for the delay $T$ and the scaling factor $k$.
- The peak pulse signal-to-noise ratio of a matched filter depends only on the ratio of the signal energy to the power spectral density of the white noise at the filter input:
  $\eta_{\max} = \frac{g_o(T)^2}{E[n^2(t)]} = \frac{(kE)^2}{k^2 N_0 E / 2} = \frac{2E}{N_0}$

Def 2.2 Signal Energy-to-Noise Spectral Density Ratio
We refer to $E/N_0$ as the signal energy-to-noise spectral density ratio.
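A short Python sketch of the matched-filter result $h_{\mathrm{opt}}(t) = k\,g(T - t)$ and the peak SNR $\eta_{\max} = 2E/N_0$; the rectangular pulse, $N_0$ and the unit sample spacing are assumptions chosen for the example.

import numpy as np

rng = np.random.default_rng(1)
n, N0 = 100, 0.1                      # samples per pulse (unit spacing), noise spectral density
g = np.ones(n)                        # assumed rectangular pulse g(t) of duration T
E = np.sum(g ** 2)                    # pulse energy E

# The matched filter is h_opt(t) = g(T - t) (k = 1); its output sampled at t = T
# reduces to a correlation of the filter input with g.
signal_out = np.dot(g, g)             # noiseless output g_o(T) = k*E
noise_out = np.array([np.dot(rng.normal(0.0, np.sqrt(N0 / 2), n), g)
                      for _ in range(20000)])   # n(T) for many noise realizations

print("simulated peak SNR:", signal_out ** 2 / np.mean(noise_out ** 2))
print("theoretical 2E/N0 :", 2 * E / N0)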


2.2 Error Rate due to Noise
Consider a binary PCM (pulse code modulation) system based on polar non-return-to-zero (NRZ) signaling. The channel noise is modeled as additive white Gaussian noise $w(t)$ of zero mean and power spectral density $N_0/2$. In the signaling interval $0 \le t \le T_b$, the received signal is written as
$x(t) = \begin{cases} +A + w(t) & \text{symbol 1 was sent} \\ -A + w(t) & \text{symbol 0 was sent} \end{cases}$
where $T_b$ is the bit duration and $A$ is the transmitted pulse amplitude.

(Receiver block diagram: the received signal $s(t) + w(t)$ is passed through a matched filter, the filter output is sampled at time $t = T_b$, and the sample $y$ is compared with a threshold $\lambda$ in the decision device: decide 1 if $y > \lambda$, decide 0 if $y < \lambda$.)

Let $y$ denote the sample value obtained at the end of a signaling interval. The sample value $y$ is compared to a preset threshold $\lambda$ in the decision device. There are two possible kinds of error:
1. Symbol 1 is chosen when a 0 was actually transmitted (error of the first kind).
2. Symbol 0 is chosen when a 1 was actually transmitted (error of the second kind).

Suppose that symbol 0 was sent. The matched filter output, sampled at time $t = T_b$, is then given by
$y = \frac{1}{T_b}\int_0^{T_b} x(t)\,dt = -A + \frac{1}{T_b}\int_0^{T_b} w(t)\,dt$
which represents the sample value of a random variable $Y$. We may characterize the random variable $Y$ as follows:
- $Y$ is Gaussian distributed with $E[Y] = -A$.
- The variance of $Y$ is
  $\sigma_Y^2 = E[(Y + A)^2] = \frac{1}{T_b^2}\int_0^{T_b}\!\!\int_0^{T_b} R_W(t, u)\,dt\,du$
  With $R_W(t, u) = \frac{N_0}{2}\delta(t - u)$ we get
  $\sigma_Y^2 = \frac{N_0}{2 T_b}$

We then have for $p_{10}$, the conditional probability of error given that symbol 0 was sent,
$p_{10} = P(y > \lambda \mid \text{symbol 0 was sent}) = \int_{\lambda}^{\infty} f_Y(y|0)\,dy = \frac{1}{\sqrt{\pi N_0 / T_b}}\int_{\lambda}^{\infty}\exp\left(-\frac{(y + A)^2}{N_0 / T_b}\right)dy$

Def 2.3 Complementary Error Function
$\operatorname{erfc}(u) = \frac{2}{\sqrt{\pi}}\int_u^{\infty}\exp(-z^2)\,dz$

By defining the new variable
$z = \frac{y + A}{\sqrt{N_0 / T_b}}$
we can rewrite the equation for $p_{10}$:
$p_{10} = \frac{1}{2}\operatorname{erfc}\left(\frac{A + \lambda}{\sqrt{N_0 / T_b}}\right)$
Analogously, $p_{01}$ denotes the conditional probability of error given that symbol 1 was sent:
$p_{01} = \frac{1}{2}\operatorname{erfc}\left(\frac{A - \lambda}{\sqrt{N_0 / T_b}}\right)$
Hence, the average probability of symbol error $P_e$ in the receiver is given by
$P_e = p_0\,p_{10} + p_1\,p_{01} = \frac{p_0}{2}\operatorname{erfc}\left(\frac{A + \lambda}{\sqrt{N_0 / T_b}}\right) + \frac{p_1}{2}\operatorname{erfc}\left(\frac{A - \lambda}{\sqrt{N_0 / T_b}}\right)$
The optimum threshold that minimizes $P_e$ is
$\lambda_{\mathrm{opt}} = \frac{N_0}{4 A T_b}\ln\left(\frac{p_0}{p_1}\right)$
For the special case when both symbols are equiprobable, we have
$p_0 = p_1 = \frac{1}{2} \;\Rightarrow\; \lambda_{\mathrm{opt}} = 0 \;\Rightarrow\; p_{01} = p_{10}$

Def 2.4 Binary Symmetric Channel
A channel for which the conditional probabilities of error $p_{01}$ and $p_{10}$ are equal is said to be binary symmetric.

Def 2.5 Transmitted Signal Energy per Bit
$E_b = A^2 T_b$

Def 2.6 Average Probability of Symbol Error in a Binary Symmetric Channel
$P_e = \frac{1}{2}\operatorname{erfc}\left(\sqrt{\frac{E_b}{N_0}}\right)$
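A Monte-Carlo sketch of $P_e = \frac{1}{2}\operatorname{erfc}(\sqrt{E_b/N_0})$ for polar NRZ with equiprobable symbols and the integrate-and-dump receiver described above; the amplitude, bit duration, $E_b/N_0$ grid and number of bits are assumed values for the example.

import numpy as np
from scipy.special import erfc

rng = np.random.default_rng(2)
A, Tb, m = 1.0, 1.0, 32          # pulse amplitude, bit duration, samples per bit (assumed)
dt = Tb / m
Eb = A ** 2 * Tb
n_bits = 200_000

for ebn0_db in (2, 4, 6, 8):
    N0 = Eb / 10 ** (ebn0_db / 10)
    bits = rng.integers(0, 2, n_bits)
    levels = np.where(bits == 1, A, -A)                       # polar NRZ levels
    noise = rng.normal(0.0, np.sqrt(N0 / (2 * dt)), (n_bits, m))  # white noise, PSD N0/2
    y = levels + noise.mean(axis=1)                           # y = (1/Tb) * integral over the bit
    ber = np.mean((y > 0) != (bits == 1))                     # threshold lambda = 0
    print(f"{ebn0_db} dB  sim: {ber:.4f}   theory: {0.5 * erfc(np.sqrt(Eb / N0)):.4f}")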


2.3 Intersymbol Interference
Intersymbol interference (ISI) is another source of bit errors, which arises when the communication channel is dispersive. For the baseband transmission of digital data, discrete pulse-amplitude modulation (PAM) is one of the most efficient schemes in terms of power and bandwidth utilization.

Consider a baseband binary PAM system. The incoming binary sequence $\{b_k\}$ consists of symbols 1 and 0, each of duration $T_b$. The pulse-amplitude modulator modifies this binary sequence into
$a_k = \begin{cases} +1 & \text{if } b_k = 1 \\ -1 & \text{if } b_k = 0 \end{cases}$
This sequence of short pulses is applied to a transmit filter of impulse response $g(t)$, producing the transmitted signal
$s(t) = \sum_k a_k\,g(t - k T_b)$
The signal $s(t)$ is modified as a result of transmission through the channel of impulse response $h(t)$. The channel adds random noise to the signal at the receiver input. The noisy signal $x(t)$ is then passed through a receive filter of impulse response $c(t)$. The resulting filter output $y(t)$ is sampled synchronously with the transmitter. The receive filter output is written as
$y(t) = \mu\sum_k a_k\,p(t - k T_b) + n(t)$
where $\mu$ is a scaling factor and the pulse $p(t)$ is to be defined. The receive filter output $y(t)$ is sampled at time $t_i = i T_b$, yielding
$y(t_i) = \mu a_i + \mu\!\!\sum_{k \in \mathbb{Z},\,k \ne i}\!\! a_k\,p\big((i - k)T_b\big) + n(t_i)$
The first term $\mu a_i$ represents the contribution of the $i$th transmitted bit. The second term represents the residual effect of all other transmitted bits on the decoding of the $i$th bit, which is called intersymbol interference (ISI).

2.4 Nyquist's Criterion
By controlling the overall pulse $p(t)$, we can ensure perfect reception in the absence of noise:
$p(i T_b - k T_b) = \begin{cases} 1 & i = k \\ 0 & i \ne k \end{cases}$
where $p(0) = 1$ by normalization.

Def 2.7 Bit Rate
The bit rate in bits per second is defined as
$R_b = \frac{1}{T_b}$

Def 2.8 Nyquist Criterion for Distortionless Baseband Transmission
The frequency function $P(f)$ eliminates intersymbol interference for samples taken at intervals $T_b$ provided that it satisfies
$\sum_{n=-\infty}^{\infty} P(f - n R_b) = T_b$

2.4.1 Ideal Nyquist Channel
The simplest way of satisfying the above Nyquist criterion is to specify $P(f)$ as follows:
$P(f) = \begin{cases} \frac{1}{2W} & |f| < W \\ 0 & |f| > W \end{cases}$
where $W$ is the overall system bandwidth defined by
$W = \frac{R_b}{2} = \frac{1}{2 T_b}$

Def 2.9 Ideal Nyquist Channel
By applying the inverse Fourier transform we get the ideal Nyquist channel
$p(t) = \frac{\sin(2\pi W t)}{2\pi W t} = \operatorname{sinc}(2 W t)$

Def 2.10 Nyquist Rate
The bit rate $R_b = 2W$ is called the Nyquist rate.

Def 2.11 Nyquist Bandwidth
$W$ itself is called the Nyquist bandwidth.

However, there are two practical difficulties:
- The magnitude characteristic $P(f)$ is physically unrealizable because of the abrupt transitions.
- The function $p(t)$ decreases as $1/|t|$ for large $|t|$, resulting in a slow rate of decay.

2.4.2 Raised Cosine Spectrum

Def 2.12 Raised Cosine Spectrum
We extend the bandwidth from the minimum value $W = R_b/2$ to an adjustable value between $W$ and $2W$. The frequency response of the raised cosine spectrum consists of a flat portion and a rolloff portion that has a sinusoidal form:
$P(f) = \begin{cases} \frac{1}{2W} & 0 \le |f| < f_1 \\ \frac{1}{4W}\left[1 - \sin\left(\frac{\pi(|f| - W)}{2W - 2 f_1}\right)\right] & f_1 \le |f| < 2W - f_1 \\ 0 & |f| \ge 2W - f_1 \end{cases}$

Def 2.13 Rolloff Factor
The parameter $\alpha$ is called the rolloff factor:
$\alpha = 1 - \frac{f_1}{W}$

Def 2.14 Transmission Bandwidth
The transmission bandwidth $B_T$ is defined by
$B_T = 2W - f_1 = W(1 + \alpha)$

The inverse Fourier transform of $P(f)$ yields
$p(t) = \operatorname{sinc}(2 W t)\,\frac{\cos(2\pi\alpha W t)}{1 - 16\alpha^2 W^2 t^2}$
The amount of intersymbol interference resulting from timing error decreases as the rolloff factor $\alpha$ is increased from zero to unity.

Def 2.15 Full-Cosine Rolloff
The special case $\alpha = 1$ (i.e. $f_1 = 0$) is known as the full-cosine rolloff characteristic with
$P(f) = \begin{cases} \frac{1}{4W}\left[1 + \cos\left(\frac{\pi f}{2W}\right)\right] & 0 < |f| < 2W \\ 0 & |f| \ge 2W \end{cases}$
and in the time domain
$p(t) = \frac{\operatorname{sinc}(4 W t)}{1 - 16 W^2 t^2}$
This time response has two extremely useful properties:
- At $t = \pm T_b/2 = \pm 1/(4W)$ we have $p(t) = 0.5$.
- There are additional zero crossings at $t = \pm 3T_b/2, \pm 5T_b/2, \ldots$
However, the price paid for these desirable properties is the use of a channel bandwidth double that required for the ideal Nyquist channel ($\alpha = 0$).
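The raised-cosine pulse above is easy to evaluate numerically; the following sketch tabulates $p(t)$ for a few rolloff factors. The bit rate, the time grid and the small offset used to sidestep the removable $0/0$ points are assumptions of the example.

import numpy as np

Rb = 1.0                                  # assumed bit rate, so Tb = 1
W = Rb / 2.0                              # Nyquist bandwidth W = Rb/2
t = np.arange(-4.0, 4.0, 0.01) + 1e-9     # tiny offset avoids the removable 0/0 points

def raised_cosine(t, alpha, W):
    # p(t) = sinc(2Wt) * cos(2*pi*alpha*W*t) / (1 - 16*alpha^2*W^2*t^2)
    denom = 1.0 - 16.0 * alpha ** 2 * W ** 2 * t ** 2
    return np.sinc(2 * W * t) * np.cos(2 * np.pi * alpha * W * t) / denom

for alpha in (0.0, 0.5, 1.0):
    p = raised_cosine(t, alpha, W)
    i0 = np.argmin(np.abs(t))             # sample closest to t = 0
    i1 = np.argmin(np.abs(t - 1.0))       # sample closest to t = Tb (a Nyquist zero crossing)
    print(f"alpha={alpha}: p(0)={p[i0]:.3f}, p(Tb)={p[i1]:.3f}")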


2.5 Correlative-Level Coding
By adding intersymbol interference to the transmitted signal in a controlled manner, it is possible to achieve a signaling rate equal to the Nyquist rate of $2W$ symbols per second in a channel of bandwidth $W$ hertz.

2.5.1 Duobinary Signaling
Duobinary signaling (also called class I partial response) implies doubling of the transmission capacity of a straight binary system.

Consider again a binary sequence $\{b_k\}$ applied to a pulse-amplitude modulator producing
$a_k = \begin{cases} +1 & \text{if } b_k = 1 \\ -1 & \text{if } b_k = 0 \end{cases}$
When this sequence is applied to a duobinary encoder, it is converted into a three-level output, namely $-2$, $0$, and $+2$. The duobinary coder output $c_k$ may be expressed as
$c_k = a_k + a_{k-1}$
For the overall frequency response of the duobinary signaling scheme we have
$H_I(f) = \begin{cases} 2\cos(\pi f T_b)\exp(-j\pi f T_b) & |f| \le \frac{1}{2 T_b} \\ 0 & \text{otherwise} \end{cases}$
The corresponding impulse response
$h_I(t) = \frac{T_b^2\sin(\pi t / T_b)}{\pi t (T_b - t)}$
has only two distinguishable values at the sampling instants.

The original two-level sequence $\{a_k\}$ may be detected by invoking the use of
$\hat{a}_k = c_k - \hat{a}_{k-1}$
where $\hat{a}_k$ denotes the estimate of the original pulse $a_k$. The technique of using a stored estimate of the previous symbol is called decision feedback. However, a major drawback of this detection procedure is that once errors are made, they tend to propagate through the output. A practical means of avoiding this error-propagation phenomenon is to use precoding before the duobinary coding:
$d_k = b_k \oplus d_{k-1}$
where $\oplus$ denotes modulo-two addition. This nonlinear precoding leads to
$c_k = \begin{cases} 0 & \text{if data symbol } b_k = 1 \\ \pm 2 & \text{if data symbol } b_k = 0 \end{cases}$

2.5.2 Modified Duobinary Signaling
The fact that $H_I(f)$ is nonzero at the origin is considered to be an undesirable feature in some applications, since many communication channels cannot transmit a DC component. We may correct this by using the class IV partial response or modified duobinary technique, where the precoder involves a delay of $2 T_b$ seconds:
$c_k = a_k - a_{k-2}$
We get an overall frequency response of
$H_{IV}(f) = \begin{cases} 2 j\sin(2\pi f T_b)\exp(-j 2\pi f T_b) & |f| \le \frac{1}{2 T_b} \\ 0 & \text{elsewhere} \end{cases}$
The impulse response of the modified duobinary coder consists of two sinc pulses that are time-displaced by $2 T_b$ seconds with respect to each other:
$h_{IV}(t) = \frac{2 T_b^2\sin(\pi t / T_b)}{\pi t (2 T_b - t)}$
To eliminate the possibility of error propagation we precode as follows:
$d_k = b_k \oplus d_{k-2}$
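The following sketch runs a random bit sequence through the duobinary precoder $d_k = b_k \oplus d_{k-1}$, the coder $c_k = a_k + a_{k-1}$, and the memoryless decision rule implied by the table above ($|c_k| < 1 \Rightarrow \hat{b}_k = 1$), confirming error-free recovery in the noiseless case. The initial reference bit and the bit pattern are assumptions of the example.

import numpy as np

rng = np.random.default_rng(3)
b = rng.integers(0, 2, 20)                 # binary source sequence {b_k}

d = np.zeros(len(b), dtype=int)            # precoded sequence {d_k}
d_prev = 0                                 # assumed initial reference bit
for k, bk in enumerate(b):
    d[k] = bk ^ d_prev                     # d_k = b_k XOR d_{k-1} (modulo-two addition)
    d_prev = d[k]

a = 2 * d - 1                              # two-level sequence {a_k}: 0 -> -1, 1 -> +1
c = a + np.concatenate(([-1], a[:-1]))     # duobinary coder output c_k = a_k + a_{k-1}

b_hat = (np.abs(c) < 1).astype(int)        # decision rule: c_k = 0 -> symbol 1, c_k = +-2 -> symbol 0
print("decoding errors:", np.count_nonzero(b_hat != b))   # 0 in the absence of noise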


2.6 Baseband M-ary PAM Transmission
In a baseband M-ary PAM system, the pulse-amplitude modulator produces one of $M$ possible amplitude levels with $M > 2$. We refer to $1/T$ as the signaling rate of the system, which is expressed in symbols per second, or bauds. In an M-ary PAM system, 1 baud is equal to $\log_2 M$ bits per second, and the symbol duration $T$ of the M-ary PAM system is related to the bit duration $T_b$ of the equivalent binary PAM system as
$T = T_b\log_2 M$
Therefore, in a given channel bandwidth, by using an M-ary PAM system we are able to transmit information at a rate that is $\log_2 M$ times faster than with the corresponding binary PAM system. However, to realize the same average probability of error, an M-ary PAM system requires more transmitted power.

3 Signal-Space Analysis
A message source emits one symbol every $T$ seconds, with the symbols belonging to an alphabet of $M$ symbols denoted by $m_1, m_2, \ldots, m_M$. It is customary to assume that the $M$ symbols of the alphabet are equally likely:
$p_i = P(m_i) = \frac{1}{M}, \quad i = 1, \ldots, M$
The transmitter takes the message source output $m_i$ and codes it into a distinct signal $s_i(t)$, which occupies the full duration $T$ allotted to symbol $m_i$. $s_i(t)$ is a real-valued energy signal:
$E_i = \int_0^T s_i^2(t)\,dt, \quad i = 1, \ldots, M$

Def 3.1 AWGN Channel
A channel having the following two characteristics is referred to as an additive white Gaussian noise (AWGN) channel:
- The channel is linear, with a bandwidth that is wide enough.
- The channel noise $w(t)$ is the sample function of a zero-mean white Gaussian noise process.
The received signal $x(t)$ may then be expressed as
$x(t) = s_i(t) + w(t), \quad 0 \le t \le T, \quad i = 1, \ldots, M$

The receiver has the task of observing the received signal $x(t)$ for a duration of $T$ seconds and making a best estimate of the transmitted signal $s_i(t)$. The requirement is therefore to design the receiver so as to minimize the average probability of symbol error, defined as
$P_e = \sum_{i=1}^{M} p_i\,P(\hat{m} \ne m_i \mid m_i)$
where $m_i$ is the transmitted symbol and $\hat{m}$ is the estimate.

3.1 Geometric Representation of Signals
The essence of the geometric representation of signals is to represent any set of $M$ energy signals $\{s_i(t)\}$ as linear combinations of $N$ orthonormal basis functions, where $N \le M$:
$s_i(t) = \sum_{j=1}^{N} s_{ij}\,\phi_j(t), \quad 0 \le t \le T, \quad i = 1, \ldots, M$
where the coefficients of the expansion are defined by
$s_{ij} = \int_0^T s_i(t)\,\phi_j(t)\,dt, \quad i = 1, \ldots, M, \quad j = 1, \ldots, N$
The real-valued basis functions $\phi_1(t), \ldots, \phi_N(t)$ are orthonormal:
$\int_0^T \phi_i(t)\,\phi_j(t)\,dt = \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \ne j \end{cases}$
where $\delta_{ij}$ is the Kronecker delta.

The set of coefficients $\{s_{ij}\}_{j=1}^{N}$ may naturally be viewed as an $N$-dimensional vector, denoted by $\mathbf{s}_i$. Hence, every signal in the set $\{s_i(t)\}$ is completely characterized by the vector of its coefficients, i.e. its signal vector
$\mathbf{s}_i = [s_{i1}\;\; s_{i2}\;\; \ldots\;\; s_{iN}]^T, \quad i = 1, \ldots, M$
We may visualize the set of signal vectors $\{\mathbf{s}_i \mid i = 1, \ldots, M\}$ as defining a corresponding set of $M$ points in an $N$-dimensional Euclidean space, with $N$ mutually perpendicular axes labeled $\phi_1, \ldots, \phi_N$. This $N$-dimensional Euclidean space is called the signal space.

Def 3.2 Inner Product
The squared length of any signal vector $\mathbf{s}_i$ is defined to be the inner product or dot product of $\mathbf{s}_i$ with itself:
$\|\mathbf{s}_i\|^2 = \mathbf{s}_i^T\mathbf{s}_i = \sum_{j=1}^{N} s_{ij}^2, \quad i = 1, \ldots, M$

Def 3.3 Signal Energy
$E_i = \int_0^T s_i^2(t)\,dt = \sum_{j=1}^{N} s_{ij}^2 = \|\mathbf{s}_i\|^2$

Def 3.4 Euclidean Distance between Signal Vectors
$d_{ik}^2 = \|\mathbf{s}_i - \mathbf{s}_k\|^2 = \sum_{j=1}^{N}(s_{ij} - s_{kj})^2 = \int_0^T\big(s_i(t) - s_k(t)\big)^2\,dt$

Def 3.5 Angle between Signal Vectors
$\cos(\theta_{ik}) = \frac{\mathbf{s}_i^T\mathbf{s}_k}{\|\mathbf{s}_i\|\,\|\mathbf{s}_k\|}$

3.1.1 Gram-Schmidt Orthogonalization Procedure
1. All contributions from $\phi_k(t)$, $k < i$, are eliminated, with the remainder forming $g_i(t)$:
$g_i(t) = s_i(t) - \sum_{j=1}^{i-1} s_{ij}\,\phi_j(t)$
where the coefficients $s_{ij}$ are defined by
$s_{ij} = \int_0^T s_i(t)\,\phi_j(t)\,dt, \quad j = 1, \ldots, i - 1$
2. Given the $g_i(t)$, we may define the set of basis functions
$\phi_i(t) = \frac{g_i(t)}{\sqrt{\int_0^T g_i^2(t)\,dt}}, \quad i = 1, \ldots, N$
which form an orthonormal set.
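A compact Python sketch of the Gram-Schmidt procedure applied to sampled signals, with the integrals approximated by sums; the example signals and the sampling grid are assumptions chosen for the illustration.

import numpy as np

T, n = 1.0, 1000
t = np.linspace(0.0, T, n, endpoint=False)
dt = T / n

signals = [
    np.ones(n),                              # s_1(t): rectangular pulse
    np.where(t < T / 2, 1.0, -1.0),          # s_2(t): split-phase pulse
    t,                                       # s_3(t): ramp
]

basis = []
for s in signals:
    g = s.copy()
    for phi in basis:                        # remove projections onto phi_1 .. phi_{i-1}
        g -= (np.sum(s * phi) * dt) * phi    # s_ij = integral of s_i(t)*phi_j(t) dt
    energy = np.sum(g ** 2) * dt
    if energy > 1e-12:                       # keep only linearly independent directions
        basis.append(g / np.sqrt(energy))    # phi_i(t) = g_i(t) / sqrt(integral of g_i^2 dt)

# Orthonormality check: the Gram matrix should be (close to) the identity.
gram = np.array([[np.sum(p * q) * dt for q in basis] for p in basis])
print(np.round(gram, 3))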


3.2 Conversion of the Continuous AWGN Channel into a Vector Channel
Suppose that the input to the bank of $N$ product integrators or correlators is not the transmitted signal $s_i(t)$ but rather the received signal $x(t)$:
$x(t) = s_i(t) + w(t), \quad 0 \le t \le T, \quad i = 1, \ldots, M$
where $w(t)$ is a sample function of a white Gaussian noise process $W(t)$ of zero mean and power spectral density $N_0/2$. At the output of correlator $j$ we have a sample of a random variable $X_j$:
$x_j = \int_0^T x(t)\,\phi_j(t)\,dt = s_{ij} + w_j, \quad j = 1, \ldots, N$
We may thus express the received signal as
$x(t) = \sum_{j=1}^{N} x_j\,\phi_j(t) + w'(t)$
where $w'(t)$ may be viewed as a sort of remainder term.

3.2.1 Statistical Characterization of the Correlator Outputs
Let $X(t)$ denote the random process a sample function of which is represented by the received signal $x(t)$. Correspondingly, let $X_j$ denote the random variable whose sample value is represented by the correlator output $x_j$, $j = 1, \ldots, N$. $X_j$ is a Gaussian random variable and hence is completely characterized by its mean and variance:
$\mu_{X_j} = E[X_j] = E[s_{ij} + W_j] = s_{ij} + E[W_j] = s_{ij}$
$\sigma_{X_j}^2 = \operatorname{Var}[X_j] = E[(X_j - s_{ij})^2] = E[W_j^2] = \int_0^T\!\!\int_0^T \phi_j(t)\,\phi_j(u)\,R_W(t, u)\,du\,dt$
With $R_W(t, u) = \frac{N_0}{2}\delta(t - u)$ we get
$\sigma_{X_j}^2 = \frac{N_0}{2}\int_0^T \phi_j^2(t)\,dt = \frac{N_0}{2}$
The $X_j$ are mutually uncorrelated, since
$\operatorname{cov}[X_j X_k] = 0, \quad j \ne k$
and because the $X_j$ are Gaussian random variables, they are also statistically independent.

Define the vector of $N$ random variables
$\mathbf{X} = [X_1\;\; X_2\;\; \ldots\;\; X_N]^T$
whose elements are independent Gaussian random variables with mean values equal to $s_{ij}$ and variances equal to $N_0/2$. We may therefore express the conditional probability density function of the vector $\mathbf{X}$, given the transmitted symbol $m_i$, as the product of the conditional probability density functions of its individual elements:
$f_{\mathbf{X}}(\mathbf{x}|m_i) = \prod_{j=1}^{N} f_{X_j}(x_j|m_i), \quad i = 1, \ldots, M$
where the vector $\mathbf{x}$ is called the observation vector and $x_j$ is called an observable element. By substituting the Gaussian probability density function in the above equation, we get
$f_{\mathbf{X}}(\mathbf{x}|m_i) = (\pi N_0)^{-N/2}\exp\left(-\frac{1}{N_0}\sum_{j=1}^{N}(x_j - s_{ij})^2\right), \quad i = 1, \ldots, M$

Finally, we note that any random variable $W'(t_k)$, derived from the noise process $W'(t)$ by sampling it at time $t_k$, is in fact statistically independent of the set of random variables $\{X_j\}$. The random variable $W'(t_k)$ is irrelevant to the decision as to which particular signal was actually transmitted.

Def 3.6 Theorem of Irrelevance
Insofar as signal detection in additive white Gaussian noise is concerned, only the projections of the noise onto the basis functions of the signal set $\{s_i(t)\}_{i=1}^{M}$ affect the sufficient statistics of the detection problem; the remainder of the noise is irrelevant.

We may state that the AWGN channel is equivalent to an $N$-dimensional vector channel described by the observation vector
$\mathbf{x} = \mathbf{s}_i + \mathbf{w}, \quad i = 1, \ldots, M$

3.3 Likelihood Functions
At the receiver we are given the observation vector $\mathbf{x}$, and the requirement is to estimate the message symbol $m_i$.

Def 3.7 Likelihood Function
$L(m_i) = f_{\mathbf{X}}(\mathbf{x}|m_i), \quad i = 1, \ldots, M$

Def 3.8 Log-Likelihood Function
$l(m_i) = \log L(m_i), \quad i = 1, \ldots, M$

Def 3.9 Log-Likelihood Function for an AWGN Channel
$l(m_i) = -\frac{1}{N_0}\sum_{j=1}^{N}(x_j - s_{ij})^2, \quad i = 1, \ldots, M$
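The vector-channel view $\mathbf{x} = \mathbf{s}_i + \mathbf{w}$ and the AWGN log-likelihood can be illustrated numerically; in the sketch below, the basis functions, the four-point constellation and $N_0$ are assumptions chosen for the example.

import numpy as np

rng = np.random.default_rng(4)
T, n = 1.0, 1000
t = np.linspace(0.0, T, n, endpoint=False)
dt = T / n
N0 = 0.1                                                     # assumed noise spectral density

phi = np.array([np.sqrt(2 / T) * np.cos(2 * np.pi * 5 * t / T),
                np.sqrt(2 / T) * np.sin(2 * np.pi * 5 * t / T)])   # orthonormal basis pair
s_vec = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])  # assumed message points

i_true = 2
s_t = s_vec[i_true] @ phi                                    # transmitted waveform s_i(t)
w_t = rng.normal(0.0, np.sqrt(N0 / (2 * dt)), n)             # white noise with PSD N0/2
x_t = s_t + w_t                                              # received signal x(t)

x = np.array([np.sum(x_t * p) * dt for p in phi])            # correlator outputs x_j
loglik = -np.sum((x - s_vec) ** 2, axis=1) / N0              # l(m_i) for every candidate symbol
print("observation vector:", np.round(x, 3))
print("decision:", np.argmax(loglik), " (sent:", i_true, ")")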


3.4 Coherent Detection of Signals in Noise: Maximum Likelihood Decoding
Suppose that in each time slot of duration $T$ seconds, one of the $M$ possible signals $s_1(t), \ldots, s_M(t)$ is transmitted with equal probability $1/M$. The corresponding point of $s_i(t)$ in the Euclidean space of dimension $N \le M$ is referred to as the transmitted signal point or message point. The set of message points is called a signal constellation. Based on the observation vector $\mathbf{x}$, we may represent the received signal $x(t)$ by a point in the same Euclidean space. This point is referred to as the received signal point.

Def 3.10 Signal Detection Problem
Given the observation vector $\mathbf{x}$, perform a mapping from $\mathbf{x}$ to an estimate $\hat{m}$ of the transmitted symbol $m_i$ in a way that would minimize the probability of error in the decision-making process:
$P_e(m_i|\mathbf{x}) = P(m_i \text{ not sent} \mid \mathbf{x}) = 1 - P(m_i \text{ sent} \mid \mathbf{x})$

3.4.1 Maximum a Posteriori Decoding

Def 3.11 Maximum a Posteriori Probability Rule
Set $\hat{m} = m_i$ if
$P(m_i \text{ sent} \mid \mathbf{x}) \ge P(m_k \text{ sent} \mid \mathbf{x})$ for all $k \ne i$
Using Bayes' rule we get for the MAP rule: set $\hat{m} = m_i$ if
$\frac{p_k\,f_{\mathbf{X}}(\mathbf{x}|m_k)}{f_{\mathbf{X}}(\mathbf{x})}$ is maximum for $k = i$
where $p_k$ is the a priori probability of transmitting symbol $m_k$.

3.4.2 Maximum Likelihood Decoding

Def 3.12 Maximum Likelihood Rule
Set $\hat{m} = m_i$ if
$l(m_k)$ is maximum for $k = i$
The maximum likelihood decoder differs from the maximum a posteriori decoder in that it assumes equally likely message symbols.

Let $Z$ denote the $N$-dimensional space of all possible observation vectors $\mathbf{x}$, the observation space. The total observation space $Z$ is correspondingly partitioned into $M$ decision regions, denoted by $Z_1, \ldots, Z_M$. Hence the maximum likelihood rule can be stated as follows:
Observation vector $\mathbf{x}$ lies in region $Z_i$ if $l(m_k)$ is maximum for $k = i$.
The maximum likelihood decision rule is of a generic kind, with the channel noise $w(t)$ being additive as the only restriction imposed on it.

Consider now an AWGN channel. The maximum likelihood decision rule for the AWGN channel may be formulated as:
Observation vector $\mathbf{x}$ lies in region $Z_i$ if $\sum_{j=1}^{N}(x_j - s_{kj})^2$ is minimum for $k = i$,
or equivalently:
Observation vector $\mathbf{x}$ lies in region $Z_i$ if the Euclidean distance $\|\mathbf{x} - \mathbf{s}_k\|$ is minimum for $k = i$.
The maximum likelihood decision rule hence is simply to choose the message point closest to the received signal point. Equivalently:
Observation vector $\mathbf{x}$ lies in region $Z_i$ if $\sum_{j=1}^{N} x_j s_{kj} - \frac{1}{2}E_k$ is maximum for $k = i$.

3.5 Correlation Receiver
For an AWGN channel and for the case when the transmitted signals $s_1(t), \ldots, s_M(t)$ are equally likely, the optimum receiver consists of two subsystems:
1. The detector part consists of a bank of M product-integrators or correlators, which operates on the received signal $x(t)$, $0 \le t \le T$, in order to produce the observation vector $\mathbf{x}$.
2. The signal transmission decoder is implemented in the form of a maximum-likelihood decoder.
The optimum receiver is commonly referred to as a correlation receiver.

Def 3.13 Equivalence of Correlation and Matched Filter Receivers
The correlation receiver involves a set of correlators. Alternatively, we may use a corresponding set of matched filters.
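A Monte-Carlo sketch of maximum-likelihood detection, written with the correlation metric $\sum_j x_j s_{kj} - \frac{1}{2}E_k$ from Section 3.4.2; the QPSK-like constellation, $N_0$ and the number of trials are assumed values for the example.

import numpy as np

rng = np.random.default_rng(5)
N0 = 0.5                                                      # assumed noise spectral density
s = np.array([[1.0, 1.0], [-1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]]) / np.sqrt(2)
E = np.sum(s ** 2, axis=1)                                    # symbol energies E_k

n_sym = 200_000
idx = rng.integers(0, len(s), n_sym)                          # equally likely transmitted symbols
x = s[idx] + rng.normal(0.0, np.sqrt(N0 / 2), (n_sym, 2))     # vector channel x = s_i + w

metric = x @ s.T - E / 2                                      # correlation-receiver metric
decision = np.argmax(metric, axis=1)                          # same as minimum Euclidean distance
print("simulated symbol error probability:", np.mean(decision != idx))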


3.6 Probability of Error
Suppose that the observation space $Z$ is partitioned into a set of $M$ regions $\{Z_i\}_{i=1}^{M}$. Suppose also that symbol $m_i$ is transmitted and an observation vector $\mathbf{x}$ is received. The average probability of symbol error is given by
$P_e = 1 - \frac{1}{M}\sum_{i=1}^{M} P(\mathbf{x} \text{ lies in } Z_i \mid m_i \text{ sent}) = 1 - \frac{1}{M}\sum_{i=1}^{M}\int_{Z_i} f_{\mathbf{X}}(\mathbf{x}|m_i)\,d\mathbf{x}$

3.6.1 Invariance of the Probability of Error to Rotation and Translation
The way in which the observation space $Z$ is partitioned is uniquely defined by the signal constellation under study. Accordingly, changes in the orientation of the signal constellation with respect to both the coordinate axes and the origin of the signal space do not affect the probability of symbol error $P_e$:
1. In maximum likelihood detection, the probability of symbol error $P_e$ depends solely on the relative Euclidean distances between the message points in the constellation.
2. The additive white Gaussian noise is spherically symmetric in all directions in the signal space.

Def 3.14 Principle of Rotational Invariance
If a signal constellation is rotated by an orthonormal transformation, that is
$\mathbf{s}_{i,\mathrm{rotate}} = \mathbf{Q}\mathbf{s}_i, \quad i = 1, \ldots, M$
where $\mathbf{Q}$ is an orthonormal matrix, then the probability of symbol error $P_e$ incurred in maximum likelihood signal detection over an AWGN channel is completely unchanged.

Def 3.15 Principle of Translational Invariance
If a signal constellation is translated by a constant vector amount, then the probability of symbol error $P_e$ incurred in maximum likelihood signal detection over an AWGN channel is completely unchanged.

3.6.2 Minimum Energy Signals
A useful application of the principle of translational invariance is the translation of a given signal constellation in such a way that the average energy is minimized. Given a signal constellation $\{\mathbf{s}_i\}_{i=1}^{M}$, the corresponding signal constellation with minimum average energy is obtained by subtracting from each signal vector $\mathbf{s}_i$ in the given constellation an amount equal to the constant vector $E[\mathbf{s}]$ with
$E[\mathbf{s}] = \sum_{i=1}^{M}\mathbf{s}_i\,p_i$
where $p_i$ is the probability that symbol $m_i$ was emitted. The minimum average energy of the translated signal constellation is then
$E_{\mathrm{translate,min}} = E - \|E[\mathbf{s}]\|^2$
where $E$ is the average energy of the original signal constellation.

3.6.3 Union Bound on the Probability of Error
For AWGN channels, the formulation of the average probability of symbol error $P_e$ is conceptually straightforward. However, the numerical computation of the integral is impractical. By simplifying the region of integration, we get a simple and useful upper bound, called the union bound, as an approximation to the average probability of symbol error for a set of $M$ equally likely signals (symbols) in an AWGN channel.

Let $A_{ik}$, with $i, k = 1, \ldots, M$, denote the event that the observation vector $\mathbf{x}$ is closer to the signal vector $\mathbf{s}_k$ than to $\mathbf{s}_i$ when the symbol $m_i$ (vector $\mathbf{s}_i$) is sent. We know that the probability of a finite union of events is overbounded by the sum of the probabilities of the constituent events:
$P_e(m_i) \le \sum_{k=1,\,k\ne i}^{M} P(A_{ik}), \quad i = 1, \ldots, M$

Def 3.16 Pairwise Error Probability
The probability $P_2(\mathbf{s}_i, \mathbf{s}_k)$ is called the pairwise error probability: if a data transmission system uses only a pair of signals, $\mathbf{s}_i$ and $\mathbf{s}_k$, then $P_2(\mathbf{s}_i, \mathbf{s}_k)$ is the probability of the receiver mistaking $\mathbf{s}_k$ for $\mathbf{s}_i$:
$P_2(\mathbf{s}_i, \mathbf{s}_k) = \frac{1}{2}\operatorname{erfc}\left(\frac{d_{ik}}{2\sqrt{N_0}}\right)$
Using the pairwise error probability we can simplify the probability of symbol error, given that $m_i$ has been emitted, further to
$P_e(m_i) \le \sum_{k=1,\,k\ne i}^{M} P_2(\mathbf{s}_i, \mathbf{s}_k) = \frac{1}{2}\sum_{k=1,\,k\ne i}^{M}\operatorname{erfc}\left(\frac{d_{ik}}{2\sqrt{N_0}}\right)$
The probability of symbol error, averaged over all $M$ symbols, is therefore overbounded as follows:
$P_e = \sum_{i=1}^{M} p_i\,P_e(m_i) \le \frac{1}{2}\sum_{i=1}^{M} p_i\sum_{k=1,\,k\ne i}^{M}\operatorname{erfc}\left(\frac{d_{ik}}{2\sqrt{N_0}}\right)$
where $p_i$ is the probability of transmitting symbol $m_i$.
1. Suppose that the signal constellation is circularly symmetric about the origin. Then
$P_e \le \frac{1}{2}\sum_{k=1,\,k\ne i}^{M}\operatorname{erfc}\left(\frac{d_{ik}}{2\sqrt{N_0}}\right)$ for all $i$
2. Define the minimum distance of a signal constellation, $d_{\min}$, as the smallest Euclidean distance between any two transmitted signal points in the constellation. Then
$P_e \le \frac{M - 1}{2}\operatorname{erfc}\left(\frac{d_{\min}}{2\sqrt{N_0}}\right) \le \frac{M - 1}{2}\exp\left(-\frac{d_{\min}^2}{4 N_0}\right)$

3.6.4 Bit versus Symbol Error Probabilities
Thus far, the only figure of merit we have used to assess the noise performance of a digital passband transmission system has been the average probability of symbol error. Another figure of merit is the bit error rate (BER).

Case 1
We assume that it is possible to perform the mapping from binary to M-ary symbols in such a way that the two binary M-tuples corresponding to any pair of adjacent symbols in the M-ary modulation scheme differ in only one bit position (e.g. Gray code). Then
$\frac{P_e}{\log_2 M} \le \mathrm{BER} \le P_e$

Case 2
Let $M = 2^K$. We assume that all symbol errors are equally likely and occur with probability
$\frac{P_e}{M - 1} = \frac{P_e}{2^K - 1}$
where $P_e$ is the average probability of symbol error. Hence, the bit error rate is
$\mathrm{BER} = \frac{2^{K-1}}{2^K - 1}\,P_e$
or equivalently
$\mathrm{BER} = \frac{M/2}{M - 1}\,P_e$


4 Passband Data Transmission

4.1 Introduction
In baseband pulse transmission, a data stream represented in the form of a discrete pulse-amplitude modulated (PAM) signal is transmitted directly over a low-pass channel. In digital passband transmission, the incoming data stream is modulated onto a carrier (usually sinusoidal) with fixed frequency limits imposed by a band-pass channel of interest. The modulation process making the transmission possible involves switching (keying) the amplitude, frequency, or phase of a sinusoidal carrier in some fashion in accordance with the incoming data. Thus there are three basic signaling schemes, known as amplitude-shift keying (ASK), frequency-shift keying (FSK), and phase-shift keying (PSK).

4.1.1 Hierarchy of Digital Modulation Techniques
Digital modulation techniques may be classified into coherent and noncoherent techniques, depending on whether the receiver is equipped with a phase-recovery circuit or not, i.e. whether the receiver is synchronized (in both frequency and phase).

In an M-ary signaling scheme, we may send any one of $M$ possible signals $s_1(t), \ldots, s_M(t)$ during each signaling interval of duration $T$. For almost all applications, the number of possible signals is $M = 2^n$, where $n$ is an integer. The symbol duration is $T = n T_b$, where $T_b$ is the bit duration. In passband data transmission we have M-ary ASK, M-ary PSK, and M-ary FSK as well as hybrid forms (e.g. M-ary amplitude-phase keying (APK) or M-ary quadrature amplitude modulation (QAM)).

M-ary signaling schemes are preferred over binary signaling schemes for transmitting digital information over band-pass channels when the requirement is to conserve bandwidth at the expense of increased power. The use of M-ary PSK enables a reduction in transmission bandwidth over binary PSK by the factor
$n = \log_2 M$

4.1.2 Probability of Error
A major goal of passband data transmission systems is the optimum design of the receiver so as to minimize the average probability of symbol error in the presence of AWGN. For each system the signal constellation is formulated, and the decision regions are constructed in accordance with maximum likelihood signal detection over an AWGN channel.
- For simple methods (coherent binary PSK, coherent binary FSK), exact formulas for $P_e$ are derived.
- For more elaborate methods (coherent M-ary PSK, coherent M-ary FSK), an approximate formula for $P_e$ is derived.

4.1.3 Power Spectra
The power spectra of the resulting modulated signals are particularly important in two contexts: occupancy of the channel bandwidth and co-channel interference in multiplexed systems.

Given a modulated signal $s(t)$, we may describe it in terms of its in-phase and quadrature components as
$s(t) = s_I(t)\cos(2\pi f_c t) - s_Q(t)\sin(2\pi f_c t) = \operatorname{Re}\big(\tilde{s}(t)\exp(j 2\pi f_c t)\big)$
where the signal $\tilde{s}(t) = s_I(t) + j s_Q(t)$ is called the complex envelope (i.e. the baseband version) of the modulated (band-pass) signal $s(t)$. The power spectral density can be written as
$S_S(f) = \frac{1}{4}\big[S_B(f - f_c) + S_B(f + f_c)\big]$
where $S_B(f)$ denotes the baseband power spectral density. It is therefore sufficient to evaluate the baseband power spectral density $S_B(f)$.

4.1.4 Bandwidth Efficiency
The primary objective of spectrally efficient modulation is to maximize the bandwidth efficiency, defined as the ratio of the data rate in bits per second to the effectively utilized channel bandwidth:
$\rho = \frac{R_b}{B} \quad \left[\frac{\mathrm{bit/s}}{\mathrm{Hz}}\right]$
where $R_b$ denotes the data rate and $B$ the effectively utilized channel bandwidth. Bandwidth efficiency is the product of two independent factors: one due to the possible use of multilevel encoding and the other due to spectral shaping.
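The relation $S_S(f) = \frac{1}{4}[S_B(f - f_c) + S_B(f + f_c)]$ can be checked numerically. In the sketch below the carrier frequency, the NRZ quadrature data and the periodogram averaging are all assumptions of the example, and the two printed values agree only roughly because of the finite averaging.

import numpy as np

rng = np.random.default_rng(6)
fs, fc, n, blocks = 64.0, 8.0, 4096, 100        # sample rate, carrier, block length, averages
t = np.arange(n) / fs

S_pass = np.zeros(n)
S_base = np.zeros(n)
for _ in range(blocks):
    sI = np.repeat(rng.choice([-1.0, 1.0], n // 64), 64)   # random NRZ in-phase data
    sQ = np.repeat(rng.choice([-1.0, 1.0], n // 64), 64)   # random NRZ quadrature data
    s = sI * np.cos(2 * np.pi * fc * t) - sQ * np.sin(2 * np.pi * fc * t)   # passband signal
    S_pass += np.abs(np.fft.fft(s)) ** 2 / n               # passband periodogram
    S_base += np.abs(np.fft.fft(sI + 1j * sQ)) ** 2 / n    # complex-envelope periodogram
S_pass /= blocks
S_base /= blocks

f = np.fft.fftfreq(n, 1 / fs)
k = np.argmin(np.abs(f - fc))                   # passband PSD evaluated at f = fc
print("S_S(fc)             :", S_pass[k])
print("S_B(0)/4 (predicted) :", S_base[np.argmin(np.abs(f))] / 4)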


4.2 Passband Transmission Model
First, there is assumed to exist a message source that emits one symbol every $T$ seconds, with the symbols belonging to an alphabet of $M$ symbols, which we denote by $m_1, \ldots, m_M$. The a priori probabilities $P(m_1), \ldots, P(m_M)$ specify the message source output. The M-ary output of the message source is presented to a signal transmission encoder, producing a corresponding vector $\mathbf{s}_i$ made up of $N$ real elements, one such set for each of the $M$ symbols of the source alphabet ($N \le M$). The modulator then constructs a distinct signal $s_i(t)$ of duration $T$ seconds as the representation of the symbol $m_i$. The signal $s_i(t)$ is real valued and an energy signal:
$E_i = \int_0^T s_i^2(t)\,dt, \quad i = 1, \ldots, M$
The band-pass communication channel, coupling the transmitter to the receiver, is assumed to have two characteristics:
1. The channel is linear, with a bandwidth that is wide enough.
2. The channel noise $w(t)$ is the sample function of a white Gaussian noise process of zero mean and power spectral density $N_0/2$.
The receiver, which consists of a detector followed by a signal transmission decoder, performs two functions:
1. It reverses the operations performed in the transmitter.
2. It minimizes the effect of channel noise on the estimate $\hat{m}$ computed for the transmitted symbol $m_i$.

4.3 Coherent Phase-Shift Keying

4.3.1 Binary Phase-Shift Keying
In a coherent BPSK system, the pair of signals below is used to represent binary symbols 1 and 0:
$s_1(t) = \sqrt{\frac{2 E_b}{T_b}}\cos(2\pi f_c t)$
$s_2(t) = \sqrt{\frac{2 E_b}{T_b}}\cos(2\pi f_c t + \pi) = -\sqrt{\frac{2 E_b}{T_b}}\cos(2\pi f_c t)$
where $0 \le t < T_b$ and $E_b$ is the transmitted signal energy per bit. The carrier frequency $f_c$ is chosen equal to $n_c/T_b$ for some fixed integer $n_c$.

Def 4.1 Antipodal Signals
A pair of sinusoidal waves that differ only in a relative phase shift of 180 degrees are referred to as antipodal signals.

Signal Space
In BPSK there is only one basis function:
$\phi_1(t) = \sqrt{\frac{2}{T_b}}\cos(2\pi f_c t), \quad 0 \le t < T_b$
For the signals $s_1(t)$ and $s_2(t)$ we then have
$s_1(t) = \sqrt{E_b}\,\phi_1(t), \qquad s_2(t) = -\sqrt{E_b}\,\phi_1(t), \quad 0 \le t < T_b$
The one-dimensional signal space consists of two message points.

Error Probability
To realize a rule for making a decision, we partition the signal space into two regions $Z_1$ and $Z_2$:
- $Z_1$: the set of points closest to message point 1 at $+\sqrt{E_b}$.
- $Z_2$: the set of points closest to message point 2 at $-\sqrt{E_b}$.
To calculate the probability of making an error of the first kind, we note that the decision region associated with symbol 1 or signal $s_1(t)$ is described by
$Z_1 : 0 < x_1 < \infty$
where the observable element $x_1$ is defined by
$x_1 = \int_0^{T_b} x(t)\,\phi_1(t)\,dt$
The conditional probability density function of the random variable $X_1$, given that symbol 0 was transmitted, is defined by
$f_{X_1}(x_1|0) = \frac{1}{\sqrt{\pi N_0}}\exp\left(-\frac{1}{N_0}(x_1 - s_{21})^2\right) = \frac{1}{\sqrt{\pi N_0}}\exp\left(-\frac{1}{N_0}\big(x_1 + \sqrt{E_b}\big)^2\right)$
The conditional probability of the receiver deciding in favor of symbol 1, given that symbol 0 was transmitted, is
$p_{10} = \int_0^{\infty} f_{X_1}(x_1|0)\,dx_1 = \frac{1}{2}\operatorname{erfc}\left(\sqrt{\frac{E_b}{N_0}}\right)$
Consider now an error of the second kind. We note that the signal space is symmetric with respect to the origin. Thus, averaging the conditional error probabilities $p_{10}$ and $p_{01}$, we find that the average probability of symbol error, or equivalently the bit error rate, for coherent BPSK is (assuming equiprobable symbols)
$P_e = \frac{1}{2}\operatorname{erfc}\left(\sqrt{\frac{E_b}{N_0}}\right)$

Power Spectral Density
The complex envelope of a BPSK wave consists of an in-phase component only. The symbol shaping function $g(t)$ is defined by
$g(t) = \begin{cases} \sqrt{\frac{2 E_b}{T_b}} & 0 \le t \le T_b \\ 0 & \text{otherwise} \end{cases}$
We assume that the input binary wave is random, with symbols 1 and 0 equally likely and the symbols transmitted during the different time slots being statistically independent. The baseband power spectral density of a BPSK signal equals
$S_B(f) = \frac{2 E_b\sin^2(\pi T_b f)}{(\pi T_b f)^2} = 2 E_b\operatorname{sinc}^2(T_b f)$
2E  
sin (2i 1) sin(2fc t)
Z1 The set of points closest to message point 1 at + Eb . T 4

Z2 The set of points closest to message point 2 at Eb . where i = 1, 2, 3, 4; E is the transmitted signal energy per
symbol, and T is the symbol duration. The carrier frequency
To calculate the probability of making an error of the first fc equals nc /T for some fixed integer nc .
kind, we note that the decision region associated with symbol
1 or signal s1 (t) is described by Signal Space
1 (t) and 2 (t) are defined by a pair of quadrature carriers
Z1 : 0 < x1 < r
2
where the observable element x1 is defined by 1 (t) = cos(2fc t) 0tT
T
r
Z Tb 2
x1 = x(t)1 (t) dt 2 (t) = sin(2fc t) 0tT
T
0

There are four message points, and the associated signal vectors are defined by

si = [ √E cos((2i − 1)π/4) ,  −√E sin((2i − 1)π/4) ]ᵀ    i = 1, 2, 3, 4

Accordingly, a QPSK signal has a two-dimensional signal constellation and four message points whose phase angles increase in a counterclockwise direction. As with BPSK, the QPSK signal has minimum average energy.

Error Probability
In a coherent QPSK system, the received signal x(t) is defined by

x(t) = si(t) + w(t)    0 ≤ t ≤ T,  i = 1, 2, 3, 4

where w(t) is the sample function of a white Gaussian noise process of zero mean and power spectral density N0/2.
The observation vector x has two elements, defined by

x1 = ∫_0^T x(t) φ1(t) dt = ±√(E/2) + w1
x2 = ∫_0^T x(t) φ2(t) dt = ∓√(E/2) + w2

Thus the observable elements x1 and x2 are sample values of independent Gaussian random variables with mean values equal to ±√(E/2) and ∓√(E/2), respectively, and with a common variance equal to N0/2.
To calculate the average probability of symbol error, we note that a coherent QPSK system is in fact equivalent to two coherent BPSK systems working in parallel and using two carriers that are in phase quadrature. The two elements of the observation vector x may be viewed as the individual outputs of the two coherent BPSK systems.
Since the signal energy per bit is E/2 and the noise spectral density equals N0/2, we have for the average probability of bit error in each channel of the coherent QPSK system:

P' = (1/2) erfc(√(E/2N0))

Important to note is that the bit errors in the in-phase and quadrature channels of the coherent QPSK system are statistically independent.
The average probability of a correct decision resulting from the combined action of the two channels working together is

Pc = (1 − P')² = 1 − erfc(√(E/2N0)) + (1/4) erfc²(√(E/2N0))

The average probability of symbol error for coherent QPSK is therefore

Pe = 1 − Pc = erfc(√(E/2N0)) − (1/4) erfc²(√(E/2N0))

In the region where E/2N0 ≫ 1 we have

Pe ≈ erfc(√(E/2N0)) = erfc(√(Eb/N0))    (with E = 2Eb)

With Gray encoding used for the incoming symbols, we find that the bit error rate of QPSK is exactly

BER = (1/2) erfc(√(Eb/N0))

We may therefore state that a coherent QPSK system achieves the same average probability of bit error as a coherent BPSK system for the same bit rate and the same Eb/N0, but uses only half the channel bandwidth.

Power Spectral Density
Assume that the binary wave at the modulator input is random, with symbols 1 and 0 being equally likely, and with the symbols transmitted during adjacent time slots being statistically independent.

1. Depending on the dibit sent during the signaling interval −Tb ≤ t ≤ Tb, the in-phase component equals +g(t) or −g(t), where
   g(t) = √(E/T) for 0 ≤ t ≤ T, and 0 otherwise.

2. The in-phase and quadrature components are statistically independent.

The baseband power spectral density of a QPSK signal therefore equals

SB(f) = 2E sinc²(T f) = 4Eb sinc²(2Tb f)
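As a quick numerical cross-check of the BPSK/QPSK error-rate expressions above, the following minimal Python sketch compares Pe = (1/2) erfc(√(Eb/N0)) with a small Monte-Carlo simulation of antipodal signaling in AWGN; the sample size and the Eb/N0 grid are arbitrary illustrative choices, not values from the text.

# Sketch: compare the analytical BPSK/QPSK bit error rate 0.5*erfc(sqrt(Eb/N0))
# with a Monte-Carlo simulation of antipodal (+/- sqrt(Eb)) signaling in AWGN.
import numpy as np
from scipy.special import erfc

rng = np.random.default_rng(0)
Eb = 1.0
n_bits = 200_000                                   # arbitrary simulation size

for ebn0_db in [0, 2, 4, 6, 8]:
    N0 = Eb / 10 ** (ebn0_db / 10)
    bits = rng.integers(0, 2, n_bits)
    x = np.where(bits == 1, +np.sqrt(Eb), -np.sqrt(Eb))       # message points +/- sqrt(Eb)
    r = x + rng.normal(0.0, np.sqrt(N0 / 2), n_bits)          # correlator output, variance N0/2
    ber_sim = np.mean((r > 0) != (bits == 1))                  # ML decision: sign of r
    ber_theory = 0.5 * erfc(np.sqrt(Eb / N0))
    print(f"Eb/N0 = {ebn0_db} dB: simulated {ber_sim:.4f}, theory {ber_theory:.4f}")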

4.3.3 Offset QPSK

Examining the QPSK waveform we may make the following observations:

1. The carrier phase changes by 180 degrees whenever both the in-phase and quadrature components of the QPSK signal change sign.
2. The carrier phase changes by 90 degrees whenever the in-phase or quadrature component changes sign.
3. The carrier phase is unchanged when neither the in-phase nor the quadrature component changes sign.

These shifts in carrier phase can result in changes in the carrier amplitude, thereby causing additional symbol errors on detection.
The extent of amplitude fluctuations exhibited by QPSK signals may be reduced by using offset QPSK, where the bit stream responsible for generating the quadrature component is delayed by half a symbol interval with respect to the bit stream responsible for generating the in-phase component. The two basis functions are defined as

φ1(t) = √(2/T) cos(2π fc t)    0 ≤ t ≤ T
φ2(t) = √(2/T) sin(2π fc t)    T/2 ≤ t ≤ 3T/2

However, 90 degree phase transitions in offset QPSK occur twice as frequently, but with half the intensity encountered in QPSK. Offset QPSK has exactly the same probability of symbol error in an AWGN channel as QPSK.

4.3.4 π/4-Shifted QPSK

In another variant of QPSK known as π/4-shifted QPSK, the carrier phase used for the transmission of successive symbols (i.e. dibits) is alternately picked from one of the two QPSK constellations shown below. It follows therefore that a π/4-shifted QPSK signal may reside in any one of eight possible phase states.

[Figure: the two QPSK constellations, offset by π/4 with respect to each other]

Attractive features of the π/4-shifted QPSK scheme include the following:

- The phase transitions from one symbol to the next are restricted to ±π/4 and ±3π/4. Consequently, envelope variations of π/4-shifted QPSK signals due to filtering are significantly reduced, compared to those in QPSK.
- Unlike offset QPSK signals, π/4-shifted QPSK signals can be noncoherently detected.
- Like QPSK signals, π/4-shifted QPSK signals can be differentially encoded, in which case we should really speak of π/4-shifted DQPSK.

The generation of π/4-shifted DQPSK symbols, represented by the symbol pair (I, Q), is described by

Ik = cos(θ_{k−1} + Δθk) = cos(θk)
Qk = sin(θ_{k−1} + Δθk) = sin(θk)

where θ_{k−1} is the absolute phase angle of symbol k − 1, and Δθk is the differentially encoded phase change.

Detection of π/4-Shifted DQPSK Signals
Given the noisy channel output x(t), the receiver first computes the projections of x(t) onto the basis functions φ1(t) and φ2(t). The resulting outputs, denoted by I and Q, are applied to a differential detector that consists of

- an arctangent computer
- a phase-difference computer
- modulo-2π correction logic

The tangent-type differential detector for the demodulation of π/4-shifted DQPSK signals is relatively simple to implement. It offers satisfactory performance in a Rayleigh fading channel as well as in a static multipath environment.

4.3.5 M-ary PSK

QPSK is a special case of M-ary PSK, where the phase of the carrier takes on one of M possible values, namely θi = 2π(i − 1)/M, where i = 1, . . . , M. Accordingly, during each signaling interval of duration T, one of the M possible signals

si(t) = √(2E/T) cos(2π fc t + 2π(i − 1)/M)    i = 1, . . . , M

is sent, where E is the signal energy per symbol. The carrier frequency fc = nc/T for some fixed integer nc.

Signal Space
Each si(t) may be expanded in terms of the same two basis functions φ1(t) and φ2(t). The signal constellation of M-ary PSK is therefore two-dimensional. The M message points are equally spaced on a circle of radius √E centered at the origin.

Error Probability
Suppose that the transmitted signal corresponds to message point m1, and that the ratio E/N0 is large enough to consider only the nearest two message points as potential candidates for being mistaken for m1 due to channel noise. The average probability of symbol error for coherent M-ary PSK is then

Pe ≈ erfc(√(E/N0) sin(π/M))

where it is assumed that M ≥ 4. The approximation becomes extremely tight, for fixed M, as E/N0 is increased.

Power Spectral Density
The symbol duration of M-ary PSK is defined by

T = Tb log2 M

where Tb is the bit duration.
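Since E = Eb log2 M at a fixed bit rate, the symbol-error approximation Pe ≈ erfc(√(E/N0) sin(π/M)) can be tabulated directly; the short Python sketch below does this for a single, arbitrarily chosen Eb/N0.

# Sketch: evaluate the M-ary PSK symbol-error approximation
# Pe ~ erfc( sqrt(E/N0) * sin(pi/M) ) with E = Eb * log2(M).
import numpy as np
from scipy.special import erfc

ebn0_db = 10.0                           # arbitrary operating point
ebn0 = 10 ** (ebn0_db / 10)
for M in [4, 8, 16, 32]:
    E_over_N0 = ebn0 * np.log2(M)        # energy per symbol relative to N0
    pe = erfc(np.sqrt(E_over_N0) * np.sin(np.pi / M))
    print(f"M = {M:2d}: Pe ~ {pe:.3e}")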

The baseband power spectral density of an M-ary PSK signal is given by

SB(f) = 2E sinc²(T f) = 2Eb log2(M) sinc²(Tb f log2 M)

4.3.6 Bandwidth Efficiency of M-ary PSK Signals

The power spectra of M-ary PSK signals possess a main lobe bounded by well-defined spectral nulls.

Def 4.2 Null-To-Null Bandwidth
The null-to-null bandwidth is defined as the spectral width of the main lobe, which provides a simple and popular measure for the bandwidth of M-ary PSK signals.

The channel bandwidth required to pass M-ary PSK signals (more precisely, the main spectral lobe of M-ary signals) is given by

B = 2/T = 2Rb / log2 M

where T is the symbol duration. The bandwidth efficiency of M-ary PSK signals is then given by

ρ = Rb/B = (log2 M) / 2

Observation
As the number of states M is increased, the bandwidth efficiency is improved at the expense of error performance. To ensure that there is no degradation in error performance, we have to increase Eb/N0 to compensate for the increase in M.

4.4 Hybrid A/P Modulation Schemes

In an M-ary PSK system, the in-phase and quadrature components of the modulated signal are interrelated in such a way that the envelope is constrained to remain constant. If this constraint is removed, and the in-phase and quadrature components are thereby permitted to be independent, we get a new modulation scheme called M-ary quadrature amplitude modulation (QAM).
The passband basis functions in M-ary QAM may not be periodic for an arbitrary choice of the carrier frequency fc with respect to the symbol rate 1/T. Ordinarily, this aperiodicity is of no real concern.

4.4.1 M-ary Quadrature Amplitude Modulation

M-ary QAM is a two-dimensional generalization of M-ary PAM in that its formulation involves two orthogonal passband basis functions

φ1(t) = √(2/T) cos(2π fc t)    0 ≤ t ≤ T
φ2(t) = √(2/T) sin(2π fc t)    0 ≤ t ≤ T

Let the message point si in the (φ1, φ2) plane be denoted by (ai dmin/2, bi dmin/2), where dmin is the minimum distance between any two message points in the constellation, ai and bi are integers, and i = 1, . . . , M. Let

dmin/2 = √E0

where E0 denotes the energy of the signal with the lowest amplitude. The transmitted M-ary QAM signal for symbol k is then defined by

sk(t) = √(2E0/T) ak cos(2π fc t) − √(2E0/T) bk sin(2π fc t)

with 0 ≤ t ≤ T, k ∈ Z.
Depending on the number of possible symbols M, we may distinguish two distinct QAM constellations:

QAM Square Constellations
With an even number of bits per symbol, we may write

L = √M

where L is a positive integer. Under this condition, an M-ary QAM square constellation can always be viewed as the Cartesian product of a one-dimensional L-ary PAM constellation with itself. In the case of a QAM square constellation, the ordered pairs of coordinates naturally form a square matrix

{ai, bi} =
  (−L+1, L−1)    (−L+3, L−1)    . . .   (L−1, L−1)
  (−L+1, L−3)    (−L+3, L−3)    . . .   (L−1, L−3)
      ...             ...                   ...
  (−L+1, −L+1)   (−L+3, −L+1)   . . .   (L−1, −L+1)

To calculate the probability of symbol error for M-ary QAM, we exploit the property that a QAM square constellation can be factored into the product of the corresponding PAM constellation with itself. The probability of correct detection for M-ary QAM may be written as

Pc = (1 − Pe')²

where

Pe' = (1 − 1/√M) erfc(√(E0/N0))

Assuming that Pe' is small enough compared to unity, we find that the probability of symbol error for M-ary QAM is approximately given by

Pe ≈ 2 (1 − 1/√M) erfc(√(E0/N0))

The transmitted energy in M-ary QAM is variable in that its instantaneous value depends on the particular symbol transmitted. It is therefore more logical to express Pe in terms of the average value of the transmitted energy rather than E0. Assuming that the L amplitude levels of the in-phase or quadrature component are equally likely, we have

Eav = 2 (2E0/L) Σ_{i=1}^{L/2} (2i − 1)²

Accordingly we may rewrite the equation for Pe using Eav as

Pe ≈ 2 (1 − 1/√M) erfc(√(3Eav / (2(M−1)N0)))

QAM Cross Constellations
To generate an M-ary QAM signal with an odd number of bits per symbol, we require the use of a cross constellation.
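The square-constellation expressions above are straightforward to evaluate; the following Python sketch tabulates Pe ≈ 2(1 − 1/√M) erfc(√(3Eav/(2(M−1)N0))) for a few constellation sizes and Eav/N0 values (all example numbers are arbitrary).

# Sketch: square-constellation M-ary QAM symbol-error estimate
# Pe ~ 2*(1 - 1/sqrt(M)) * erfc( sqrt( 3*Eav / (2*(M-1)*N0) ) ).
import numpy as np
from scipy.special import erfc

def qam_pe(M, eav_over_n0):
    return 2 * (1 - 1 / np.sqrt(M)) * erfc(np.sqrt(3 * eav_over_n0 / (2 * (M - 1))))

for M in [4, 16, 64]:
    for snr_db in [10, 15, 20]:
        pe = qam_pe(M, 10 ** (snr_db / 10))
        print(f"M = {M:2d}, Eav/N0 = {snr_db} dB: Pe ~ {pe:.3e}")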

The probability of symbol error Pe can approximately be 3. The transmitted signal s(t) represents a symbol-time-
written as invariant realization of this hybrid modulation process.

1
 r !
E0 The transmitted signal
Pe ' 2 1 erfc
M N 0 X 
s(t) = ak p(t kT ) bk p(t kT )
k=
for high E0 /N0 . Note that it is not possible to perfectly Gray
code a QAM cross constellation. is known as carrierless amplitude/phase modulation (CAP).

4.4.2 Carrierless Amplitude/Phase Modulation Properties of the Passband In-phase and Quadrature
Pulses
So far a rectangular pulse has been used for the pulse-shaping
function. 1. The passband in-phase pulse p(t) and quadrature pulse
We redefine the transmitted M-ary QAM signal in terms of p(t) are even and odd functions of time t, given that the
a general pulse-shaping function g(t) as baseband pulse g(t) is an even function of time t.
2. The passband pulses p(t) and p(t) form an orthogonal set
sk (t) = ak g(t kT ) cos(2fc t) bk g(t kT ) sin(2fc t) over the entire interval (, ) as shown by
Z
0 t T , k Z. The transmitted signal s(t) can then be
p(t)p(t) dt = 0
written as

X
s(t) = sk (t)
3. The passband pulses u(t) = p(t)h(t) and u(t) = p(t)h(t)
k=
where h(t) is the impulse response of a LTI channel, form
This equation shows that for an arbitrary fc , the passband a Hilbert-transform pair and are therefore orthogonal over
functions g(t kT ) cos(2fc t) and g(t kT ) sin(2fc t) are the entire interval (, ) for any h(t).
aperiodic in that they vary from one symbol to another.
We can further write s(t) in a complex notation 4.5 Coherent Frequency-Shift Keying

!
X M-ary PSK as well as QAM are examples of linear mod-
s(t) = Re Ak g(t kT ) exp(j2fc t) ulation, whereas coherent frequency-shift keying (FSK) is a
k= nonlinear method of passband data transmission.
where Ak is a complex number defined by Ak = ak + jbk .
Define 4.5.1 Binary FSK
In a binary FSK system, symbols 1 and 0 are distinguished
Ak = Ak exp(j2fc kT ) and g+ (t) = g(t) exp(j2fc t) from each other by transmitting one of two sinusoidal waves
that differ in frequency by a fixed amount.
The scalar Ak is a rotated version of the complex represen-
tation of the coordinates of the kth transmitted symbol in
The FSK signal
the (1 , 2 )-plane. r
2Eb
cos(2fi t) 0 t Tb
We assume that the pulse-shaping function g(t) is a low-pass si (t) = Tb
signal whose highest frequency component is smaller than
0 elsewhere
the carrier frequency fc . We then recognize g+ (t) as the an-
alytic signal, or pre-envelope, representation of the band-pass where i = 1, 2, and Eb is the transmitted signal energy per bit
signal g(t) cos(2fc t). is known as Sundes FSK, which is a continuous-phase sig-
nal in the sense that phase continuity is always maintained.
g+ (t) = g(t) cos(2fc t) + jg(t) sin(2fc t) = p(t) + j p(t) It is, hence, an example of cintinuous-phase frequency-shift
keying (CPFSK).
We may then say that the quadrature (imaginary) compo- The transmitted frequency is
nent p(t) of the analytic signal g+ (t) is the Hilbert transform nc + i
of the in-phase (real) component p(t). fi =
Tb
g(t) is referred to as the baseband pulse, and p(t) and p(t)
are referred to as passband in-phase and passband quadrature for some fixed integer nc and i = 1, 2.
pulses.
Signal Space
!
X
s(t) = Re Ak g+ (t kT ) The set of orthonormal basis functions is given by
k= (q
2
Important Observations Tb cos(2fi t) 0 t Tb
i (t) = i = 1, 2
0 elsewhere
1. The transmitted signal s(t) appears to be carrierless.
Correspondingly the coefficient sij is defined by
2. The new formulation of s(t) fully retains the hybridized (
amplitude and phase modulation characterizing the orig- Eb i = j
sij = i, j {1, 2}
inal M-ary QAM signal 0 i 6= j

The two message points are defined by symbols transmitted in adjacent time slots are statistically
" # " # independent.
Eb 0 The baseband power spectral density of Sundes FSK signal
s1 = and s2 =
0 Eb equals the sum of the power spectral densities of the in-phase
and the quadrature component.

with the Euclidean distance between them equal to 2Eb .  8E cos2 (T f )
Eb    
b b
SB (f ) = f 2T1 b + f + 2T1 b + 2
Error Probability 2Tb (4Tb 2 f 2 1)2
The observation vector x has two elements The presence of the two discrete frequency components ((.))
Z Tb Z Tb provides a means of synchronizing the receiver with the
x1 = x(t)1 (t) dt and x2 = x(t)2 (t) dt transmitter.
0 0
Since the baseband PSD falls of as the inverse fourth power
where x(t) denotes the received signal. of frequency, an FSK signal with continuous phase does not
produce as much interference outsite the signal band of in-
Given that symbol 1 was transmitted, x(t) equals s1 (t)+w(t), terest as an FSK signal with discontinuous phase.
where w(t) is the sample function of a white Gaussian noise
process of zero mean and power spectral density N0 /2. If 4.5.2 Minimum Shift Keying
symbol 0 was transmitted, x(t) equals s2 (t) + w(t).
In the coherent detection of binary FSK signal, the phase
information contained in the received signal is not fully ex-
Define a new Gaussian random variable Y whose sample
ploited, other than to provide for synchronization of the re-
value is defined by
ceiver to the transmitter.
y = x1 x2
Another useful way of representing the CPFSK signal s(t) is
The mean value of the random variable Y depends on which to express it in the conventional form of an angle-modulated
binary symbol was transmitted. The conditional means of Y signal: r
are 2Eb 
s(t) = cos 2fc t + (t)
p p Tb
E[Y |1] = + Eb and E[Y |0] = Eb
where (t) is the phase of s(t).
The variance of random variable Y is given by The phase (t) of a CPFSK signal increases or decreases lin-
early with time during each bit duration Tb seconds:
Var[Y ] = Var[X1 ] + Var[X2 ] = N0
h
(t) = (0) t 0 t Tb
We get for the conditional probability of error, given that Tb
symbol 0 was transmitted
where the plus sign corresponds to sending symbol 1, and
r ! the minus sign corresponds to sending symbol 0; the dimen-
1 Eb
p10 = erfc sionless parameter h is referred to as the deviation ratio.
2 2Tb ) (
f1 = fc + 2Thb fc = 12 (f1 + f2 )
Similarly, we may show that p01 has the same value as p10 .
f2 = fc 2Thb h = Tb (f1 f2 )
We hence find that the average probability of bit error, or the
bit error rate for coherent binary FSK is (assuming equiprob- Phase Trellis
able symbols) ! The sending of symbol 1 increases the phase of a CPFSK
r
1 Eb signal s(t) by h radians, whereas the sending of symbol 0
Pe = erfc
2 2Tb reduces it by an equal amount
(
In a coherent binary FSK system, we have to double the bit h for symbol 1
energy-to-noise density ratio Eb /N0 , to maintain the same (Tb ) (0) =
h for symbol 0
bit error rate as in a coherent BPSK system.
The phase of a CPFSK signal is an odd or even multiple of
Power Spectral Density h radiands at odd or even multiples of the bit duration Tb ,
Consider the case of Sundes FSK. We may express this spe- respectively.
cial BFSK signal as follows: For Sundes FSK the deviation ratio h is exactly unity.
r   Hence, the phase change over one bit interval is radians.
2Eb t But, a change of + radians is exactly the same as a change
s(t) = cos 2fc t 0 t Tb
Tb Tb of radians, modulo 2. It follows therefore that in the
case of Sundes FSK there is no memory.
r      
2Eb t t
= cos cos(2fc t) sin sin(2fc t)
Tb Tb Tb
In contrast, we have a completely different situation when
where, in the last line, the plus sign corresponds to trans- the deviation ratio h is assigned the special value of 1/2,
mitting symbol 0, and the minus sign corresponds to trans- where the difference between the frequencies is half the bi-
mitting symbol 1. rate. This is, hence, the minimum difference, for which
We assume that the symbols 1 and 0 in the random binary s1 1 (t) and s2 2 (t) are orthogonal.
wave at the modulator input are equally likely, and that the

Signal Space
With the deviation ratio defined as above (h = 1/2), we have

(t) = (0) t 0 t Tb
2Tb

where the plus sign corresponds to symbol 1 and the minus


sign to symbol 0.

The orthonormal basis function forMSK are defined by a pair


of sinusoidally modulated quadrature carriers:
r  
2
1 (t) = cos t cos(2fc t) 0 t Tb
Tb 2Tb
r  
2
2 (t) = sin t sin(2fc t) 0 t Tb
Tb 2Tb

Correspondingly, we may express the MSK signal in the ex-


panded form
Error Probability
s(t) = s1 1 (t) + s2 2 (t) 0 t Tb In the case of an AWGN channel, the received signal is
where x(t) = s(t) + w(t)
Z Tb p  where s(t) is the transmitted MSK signal, and w(t) is the
s1 = s(t)1 (t) dt = Eb cos (0) Tb t Tb sample function of a white Gaussian noise process of zero
Tb mean and power spectral density N0 /2.
Z2Tb p  We have to establish a procedure for the use of x(t) to detect
s2 = s(t)2 (t) dt = Eb sin (Tb ) 0 t 2Tb
0 the phase states (0) and (Tb . For the optimum detection
of (0) we obtain
We note that in the above equations for s1 and s2 Z Tb
x1 = x(t)1 (t) dt = s1 + w1 Tb t Tb
Both integrals are evaluated for a time interval equal Tb
to twice the bit duration
We observe that if x1 > 0, the receiver chooses the estimate
Both the lower and the upper limits of the product in- (0) = 0. If x1 < 0, it chooses the estimate (0) = .
tegration used to evaluate the coefficient s1 are shifted Similarly, for the optimum detection of (Tb ) we obtain
by the bit duration Tb with respect to those used to Z 2Tb
evaluate the coefficient s2 . x2 = x(t)2 (t) dt = s2 + w2 0 t 2Tb
0

The time interval 0 t Tb , for which the phase states If x2 > 0, the receiver chooses the estimate (Tb ) = /2.
(0 and (Tb are defined, is common to both signals. If x2 < 0, it chooses the estimate (Tb ) = /2.
Accordingly, the signal constellation for an MSK signal is
The bit error rate for coherent MSK is given by
two-dimensional with four possible message points with co- r !
ordinates 1 Eb
( Pe = erfc
( 2 N0
( Eb , Eb ) ( Eb , Eb )
symbol 0 symbol 1 which is exactly the same as that for BPSK and QPSK.
( Eb , Eb ) ( Eb , Eb )

Note that the coordinates of the message points s1 and s2 have opposite signs when symbol 1 is sent in this interval, but the same sign when symbol 0 is sent. In QPSK the transmitted symbol is represented by any one of the four message points, whereas in MSK one of two message points is used to represent the transmitted symbol at any one time, depending on the value of θ(0).

Transmitted Binary Symbol    θ(0)    θ(Tb)    s1       s2
0                            0       −π/2     +√Eb     +√Eb
1                            π       −π/2     −√Eb     +√Eb
0                            π       +π/2     −√Eb     −√Eb
1                            0       +π/2     +√Eb     −√Eb

Power Spectral Density
As with the BFSK signal, we assume that the input binary wave is random, with symbols 1 and 0 equally likely, and the symbols transmitted during different time slots being statistically independent. The baseband power spectral density of the MSK signal is given by

SB(f) = (32 Eb / π²) [ cos(2π Tb f) / (16 Tb² f² − 1) ]²

For f ≫ 1/Tb, the baseband PSD of the MSK signal falls off as the inverse fourth power of frequency, whereas in the case of the QPSK signal it falls off as the inverse square of frequency. Hence, MSK does not produce as much interference outside the signal band of interest as QPSK.
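A minimal Python sketch comparing the MSK and QPSK baseband spectra well outside the main lobe illustrates the f⁻⁴ versus f⁻² roll-off just described; the sampled frequencies are arbitrary example values.

# Sketch: compare the baseband PSDs of MSK and QPSK far from the carrier,
# illustrating the faster (inverse fourth power) roll-off of MSK.
import numpy as np

Eb, Tb = 1.0, 1.0
f = np.array([2.1, 5.1, 10.1, 20.1]) / Tb            # frequencies well outside the main lobe

sb_msk = (32 * Eb / np.pi**2) * (np.cos(2 * np.pi * Tb * f) / (16 * Tb**2 * f**2 - 1)) ** 2
sb_qpsk = 4 * Eb * np.sinc(2 * Tb * f) ** 2            # np.sinc is the normalized sinc used here

for fi, m, q in zip(f * Tb, sb_msk, sb_qpsk):
    print(f"f*Tb = {fi:5.1f}: MSK PSD = {m:.2e}, QPSK PSD = {q:.2e}")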

The pulse response

g(t) = (1/2) [ erfc( π √(2/log 2) · W Tb (t/Tb − 1/2) ) − erfc( π √(2/log 2) · W Tb (t/Tb + 1/2) ) ]

constitutes the frequency shaping pulse of the GMSK mod-


ulator, with the dimensionless time-bandwidth product W Tb
playing the role of a design parameter.
Note that g(t) is noncausal and has to be truncated and
shifted in time in order to obtain a causal version.
The curve for the limiting condition W Tb = corre-
sponds to the case of ordinary MSK.
When W Tb is less than unity, increasingly more of the
transmit power is concentrated inside the passband of
the GMSK signal.
The tails of the Gaussian impulse response of the pulse-
shaping filter cause the modulating signal to spread out to
adjacent symbol intervals, resulting in intersymbol interfer-
ence, which increases with decreasing W Tb .
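A minimal Python sketch of the Gaussian frequency-shaping pulse g(t) defined above, evaluated for a few time-bandwidth products W·Tb (the particular values are arbitrary), shows how the pulse spreads into neighbouring bit intervals as W·Tb decreases:

# Sketch: the GMSK frequency-shaping pulse g(t) for a few W*Tb values,
# showing the spreading (intersymbol interference) as W*Tb is reduced.
import numpy as np
from scipy.special import erfc

Tb = 1.0
t = np.arange(-2.0, 2.0 + 1e-9, 0.5) * Tb

def g(t, WTb):
    k = np.pi * np.sqrt(2 / np.log(2)) * WTb
    return 0.5 * (erfc(k * (t / Tb - 0.5)) - erfc(k * (t / Tb + 0.5)))

for WTb in [0.2, 0.3, 0.5, 5.0]:            # 5.0 is close to the ordinary-MSK limit
    vals = ", ".join(f"{v:.3f}" for v in g(t, WTb))
    print(f"W*Tb = {WTb}: g(t/Tb = -2..2) = [{vals}]")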
4.5.3 Gaussian-Filtered MSK
Error Probability
We may summarize the desirable properties of the MSK sig-
The Probability of error of GMSK using coherent detection
nal as follows:
in the presence of AWGN is
Constant envelope r !
1 Eb
Pe = erfc
Relatively narrow bandwidth 2 2N0

Coherent detection performance equialent to QPSK where Eb is the signal energy per bit and N0 /2 is the noise
spectral density. The constant depends on the time-
Bringing the power spectrum of the MSK signal in a compact bandwidth product W Tb .
form, while maintaining the constant-envelope property, can
be achieved through the use of a premodulation low-pass fil- We may view 10 log10 (/2), expressed in decibels, as a mea-
ter, hereafter referred to as a baseband pulse-shaping filter. sure of performance degradation of GMSK compared to or-
It should satisfy the follwing properties dinary MSK.
1. Frequency response with narrow bandwidth and sharp
4.5.4 M-ary FSK
cutoff characteristics.
Consider the M-ary version of FSK, for which the transmit-
2. Impulse response with relatively low overshoot. ted signals are defined by
3. Evolution of a phase trellis where the carrier phase of the
r
2E  
modulated signal assumes the two values /2 at odd si (t) = cos (nc + i)t 0tT
T T
multiples of Tb and the two values 0 and at even multi-
ples of Tb as in MSK. where i = 1, . . . , M , and the carrier frequency fc = nc /2T
for some fixed integer nc . The transmitted symbols are of
These desirable properties can be achieved by passing a equal duration T and have equal energy E.
nonreturn-to-zero (NRZ) binary data stream through a base-
band pulse-shaping filter whose impulse response is defined Signal Space
by a Gaussian function. The resulting method of binary Since the individual signals in M-ary FSK are orthogonal,
frequency modulation is naturally referred to as Gaussian- we may use the transmitted signals in normalized form as a
filtered MSK or just GMSK. complete orthonormal set of basis functions
Let W denote the 3dB baseband bandwidth of the pulse-shaping filter. Then

H(f) = exp( −(log 2 / 2) (f/W)² )

and

h(t) = √(2π / log 2) · W exp( −(2π² / log 2) W² t² )

φi(t) = (1/√E) si(t)    0 ≤ t ≤ T,  i = 1, . . . , M

Accordingly, M-ary FSK is described by an M-dimensional signal-space diagram.

Error Probability
For coherent M-ary FSK, the optimum receiver consists of a bank of M correlators or matched filters, with the φi(t) providing the pertinent reference signals.

With the minimum distance dmin in M-ary FSK being 2E In order to design an optimum receiver for detecting sym-
and equiprobable symbols, we get bol si , we may formulate the conditional likelihood function,
r ! given the carrier phase
1 E r Z T !
Pe (M 1) erfc  E
2 2N0 L si () = exp x(t) cos(2fi t + ) dt
N0 T 0
Power Spectral Density By integrating the conditional likelihood function over all
The spectral analysis of M-ary FSK signals is much more possible values of , we remove the phase dependence of
complicated than that of M-ary PSK signals. L(si ()) and obtain
Z

Bandwidth Efficiency L(si ) = L si () f () d
We may define the channel bandwidth required to transmit
Z r Z T !
M-ary FSK signals as 1 E
= exp x(t) cos(2fi t + ) dt d
2 N0 T 0
M Rb M
B= =
2T 2 log2 M Using trigonometric formulas, and defining
v
u T 2 T 2
The bandwidth efficiency of M-ary signals is therefore u Z
u
Z
li = t x(t) cos(2fi t)dt + x(t) sin(2fi t)dt
Rb 2 log2 M
= = 0 0
B M RT !
0
x(t) sin(2fi t)dt
We see that increasing the number of levels M tends to in- i = arctan RT
0
x(t) cos(2fi t)dt
crease the bandwidth efficiency of M-ary PSK signals, but
it also tends to decrease the bandwidth efficiency of M-ary we obtain
FSK signals. Z T
x(t) cos(2fi t + ) dt = li cos( + i )
0
4.6 Detection of Signals with Unknown and can simplify the above likelihood function to
Phase Z r !
1 E
L(si ) = exp li cos d
Up to this point, we have assumed that the receiver is per- 2 N0 T
fectly synchronized to the transmitter, and the only channel
impairment is noise. Perhaps the most common random sig- which we recognize as modified Bessel function of zero order.
nal parameter is the carrier phase, which is especially true Hence r !
for narrowband signals. E
L(si ) = I0 li
A digital communication receiver with no provision made for N0 T
carrier phase recovery is said to be noncoherent. The binary hypothesis test (i.e. the hypothesis that signal
s1 (t) or signal s2 (t) was transmitted) can now be written as
4.6.1 Optimum Quadratic Receiver r ! r !
E H1 E
I0 l1 I 0 l2
Consider a binary digital communication system in which N0 T H2 N0 T
the transmitted signal is
where hypothesis H1 and H2 correspond to signals s1 (t) and
r s2 (t), respectively.
2E
si (t) = cos(2fi t) 0 t T, i = 1, 2 Since the modified Bessel function is a monotonically increas-
T ing function, we can carry out the test in terms of li 2
where E is the signal energy, T is the duration of the signal- H1
ing interval, and the carrier frequency fi for symbol i is an l1 2 l2 2
H2
integral multiple of T2 .
The received signal, assuming an AWGN channel and a non- 4.7 Noncoherent Orthogonal Modulation
coherent system, may be written as
r The noise performance of noncoherent orthogonal modulation
2E is now studied for two noncoherent receivers as special cases:
x(t) = cos(2fi t + ) + w(t) 0 t T, i = 1, 2 noncoherent BFSK and DPSK.
T
Consider a binary signaling scheme that involves the use
where is the unknown carrier phase, and w(t) is the sample of two orthogonal signals s1 (t) and s2 (t) of equal energy.
function of a white Gaussian noise process of zero mean and During the interval 0 t T , one of these two signals is
PSD N0 /2. sent over an imperfect AWGN channel that shifts the carrier
We furthermore assume uniform distribution of phase by an unknown amount.
We may thus express the received signal as
1 < (
f () = 2 g1 (t) + w(t) s1 (t) sent, 0 t T
x(t) =
0 otherwise g2 (t) + w(t) s2 (t) sent, 0 t T

where w(t) is the sample function of AWGN of zero mean We have


and PSD N0 /2 and g1 (t) and g2 (t) denote the phase-shifted q
versions of s1 (t) and s2 (t), respectively. It is assumed that 2Eb cos(2fc t) 0 t Tb
T
g1 (t) and g2 (t) remain orthogonal. s1 (t) = q b
2Eb cos(2fc t) Tb t 2Tb
A quadrature receiver (equivalent to a noncoherent matched Tb
filter) consists of two channels. In the in-phase channel the
received signal x(t) is correlated with the function i (t), and
which represents a scaled version of the transmittted sig- q
nal si (t) with zero carrier phase. In the quadrature channel 2Eb cos(2fc t) 0 t Tb
T
x(t) is correlated with another function i (t), which repre- s2 (t) = q b
2Eb cos(2fc t + ) Tb t 2Tb
sents the versoin of i (t) that results from shifting the carrier Tb

phase by -90 degrees.


The signal i (t) is the Hilbert transform of i (t). Let The bit error rate for DPSK is given by
i (t) = m(t) cos(2fi t) 1

Eb

Pe = exp
where m(t) is a band-limited message signal. Typically, the 2 N0
carrier frequency fi is greater than the highest frequency
component of m(t). The Hilbert transform is in this case which provides a gain of 3dB over noncoherent FSK for the
defined by same Eb /N0 .
i (t) = m(t) sin(2fi t)
An important property of Hilbert transformation is that a
signal and its Hilbert transform are orthogonal to each other. 4.8 Comparison of Digital Modulation
Schemes Using a Single Carrier
The average probability of error for the noncoherent receiver
is given by  
1 E
Pe = exp
2 2N0
where E is the signal energy per symbol, and N0 /2 is the
noise spectral density.
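The quadrature receiver described above is easy to prototype in discrete time: correlate the received waveform against cosine and sine references of each candidate tone and compare the resulting envelopes. The Python sketch below illustrates this for two orthogonal tones; the sampling rate, tone frequencies and noise level are arbitrary example values.

# Sketch: a discrete-time quadrature (envelope) detector for one signaling
# interval: correlate with cos and sin references, then form the envelope.
import numpy as np

rng = np.random.default_rng(1)
fs, T = 1000.0, 1.0                        # sample rate and symbol duration (arbitrary)
t = np.arange(0, T, 1 / fs)
f1, f2 = 10.0, 12.0                        # two orthogonal tone frequencies
theta = rng.uniform(-np.pi, np.pi)         # unknown carrier phase

x = np.cos(2 * np.pi * f1 * t + theta) + rng.normal(0, 0.5, t.size)   # tone 1 was sent

def envelope(x, f):
    i_part = np.trapz(x * np.cos(2 * np.pi * f * t), t)
    q_part = np.trapz(x * np.sin(2 * np.pi * f * t), t)
    return np.hypot(i_part, q_part)

l1, l2 = envelope(x, f1), envelope(x, f2)
print(f"l1 = {l1:.3f}, l2 = {l2:.3f} -> decide s{'1' if l1 > l2 else '2'}(t)")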

4.7.1 Noncoherent Binary FSK


In the binary FSK case, the transmitted signal is defined by
r
2Eb
cos(2fi t) 0 t Tb
si (t) = Tb
0 elsewhere

where fi = ni /Tb for a fixed integer ni . f1 represents symbol


1, whereas f2 represents symbol 0.
The noncoherent BFSK described herein is a special case
of noncoherent orthogonal modulation with T = Tb and
E = Eb . Hence the bit error rate for noncoherent binary
FSK is  
1 Eb
Pe = exp
2 2N0

4.7.2 Differential PSK


We may view differential phase-shift keying (DPSK) as the noncoherent version of PSK. It eliminates the need for a coherent reference signal at the receiver by combining two basic operations at the transmitter:

1. differential encoding of the input binary wave
2. phase shift keying

DPSK is another example of noncoherent orthogonal modulation, when it is considered over two bit intervals.

Signaling Scheme        Bit Error Rate
Coherent BPSK           (1/2) erfc(√(Eb/N0))
Coherent QPSK           (1/2) erfc(√(Eb/N0))
Coherent MSK            (1/2) erfc(√(Eb/N0))
Coherent BFSK           (1/2) erfc(√(Eb/2N0))
DPSK                    (1/2) exp(−Eb/N0)
Noncoherent BFSK        (1/2) exp(−Eb/2N0)
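Evaluating the expressions in the comparison table at one operating point makes the ranking of the schemes concrete; a minimal Python sketch (the 8 dB operating point is an arbitrary example):

# Sketch: evaluate the bit-error-rate expressions from the comparison table
# at one Eb/N0 value, to show the relative ranking of the schemes.
import numpy as np
from scipy.special import erfc

ebn0 = 10 ** (8 / 10)                      # Eb/N0 = 8 dB, arbitrary choice
schemes = {
    "coherent BPSK/QPSK/MSK": 0.5 * erfc(np.sqrt(ebn0)),
    "coherent BFSK":          0.5 * erfc(np.sqrt(ebn0 / 2)),
    "DPSK":                   0.5 * np.exp(-ebn0),
    "noncoherent BFSK":       0.5 * np.exp(-ebn0 / 2),
}
for name, ber in schemes.items():
    print(f"{name:24s}: BER = {ber:.3e}")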

5 Multiuser Radio Communications


The communication channel model considered so far, i.e. a Relatively inexpensive microwave equipment.
channel model limited in bandwidth and corrupted by additive
Low attenuation due to rainfall.
white Gaussian noise, is mathematically elegant. An exam-
ple of a physical channel that is well represented by such a Insignificant sky background noise (lowest level between 1
model is the satellite communications channel. and 10 GHz).
A satellite communication system in geostationary orbit re-
However, radio interference limits the application in the C-
lies on line-of-sight radio propagation for the operation of its
Band, because the transmission frequencies of this band co-
uplink from an earth terminal to the transponder and the
incide with those used for terrestrial microwave systems. The
downlink from the transponder to another earth terminal.
use of second-generation communication satellites that op-
erate in the 14/12 GHz Band (i.e., Ku-band) eliminates the
The radio propagation channel characterizing wireless com-
problem.
munications deviates from the idealized AWGN channel
Moreover the use of these higher frequencies makes it possi-
model due to the presence of multipath, which is a non-
ble to build smaller and therefore less expensive antennas.
Gaussian form of signal-dependent phenomenon that arises
The block diagram of the transponder consists of the follow-
because of reflections of the transmitted signal from fixed
ing parts
and moving objects.
Band-pass filter, designed to separate the received signal
5.1 Multiple-Access Techniques from among the different radio channels.

Multiple access is a technique whereby many subscribers or Low-noise amplifier.


local stations can share the use of a communication channel Frequency down-converter, the purpose of which is to con-
at the same time or nearly so. vert the received radio frequency (RF) signal to the desired
There are subtle differences between multiple access and mul- downlink frequency.
tiplexing that should be noted
Traveling-wave tube amplifier, which provides high gain
Multiple access refers to the remote sharing of a commu-
nication channel by users in highly dispersed locations.
Propagation time delay becomes particularly pronounced in
Multiplexing refers to the sharing of a channel such as a a satellite channel because of the large distances involved.
telephone channel by users confined to a local site. Speech signals incur a transmission delay of approximately
270 ms. Any impedance mismatch results in an echo of the
We may identify four basic types of multiple access: speakers voice. By using an echo canceller we may overcome
1. Frequency-division multiple access (FDMA) this problem.
Disjoint subbands of frequencies are allocated to the dif- The satellite channel, the uplink as well as the downlink,
ferent users on a continuous-time basis. Guard bands are is closely represented by an additive white Gaussian noise
used to act as buffer zones. (AWGN) model. A satellite transponder is an example where
multiple access techniques are used
2. Time-division multiple access (TDMA)
FDMA
Each user is allocated the full spectral occupancy of the
In a satellite channel, nonlinearity of the transponder is
channel, but only for a short duration of time called a time
the primary cause of interference between users. Hence
slot. Buffer zones in the form of guard times are inserted
in an FDMA system the power efficiency of the system
between the assigned time slots.
is reduced, because of the necessary power backoff of the
3. Code-division multiple access (CDMA) traveling-wave tube amplifier.
Another technique for sharing the channel resources is
TDMA
using a hybrid combination of FDMA and TDMA, which
In a TDMA system, the users access the satellite transpon-
represents a specific form of code-division multiple access
der one at a time. Accordingly, the satellite transponder
(CDMA) (e.g. frequency hopping). An important advan-
is now able to operate at full power efficiency and hence
tage of CDMA over both FDMA and TDMA is that it
its wide use.
can provide for secure communications.
SDMA
4. Space-division multiple access (SDMA) The transponder is equipped with multiple anntennas, in
In this multiple access technique, resource allocation is order to exploit the spatial locations of earth stations.
achieved by exploiting the spatial separation of the indi-
vidual users. In particular, multibeam antennas are used
to separate radio signals by pointing them along different
directions.

5.2 Satellite Communications


The most popular frequency band for satellite communica-
tions is 6 GHz (C-Band) for the uplink and 4 GHz for the
downlink. This has the following advantages:

5.3 Radio Link Analysis Def 5.2 Power Density


Consider then an isotropic source radiating a total power de-
A link budget, or more precisely link power budget, is the noted by Pt , measured in watts. The power density, denoted
totaling of all gains and losses incurred in operating a com- by (d), at any point on the surface of the sphere is given by
munication link. We end up with an estimation procedure
for evaluating the performance of a radio link (uplink or Pt 2
(d) = watts/m
downlink). Note that the essence of the communication link 4d2
analysis described here also applies to other radio links that
rely on line of sight for their operation. Def 5.3 Radiation Intensity

The performance of a digital communication system, in the = d2 (d) watts per steradian
presence of channel noise modeled as additive white Gaussian
noise, is defined by a formula having the shape of a water- We may express the radiation intensity as (, ), and so
fall curve. The probability of symbol error Pe is usually speak of a radiation-intensity pattern. The power radiated
plotted versus the bit energy-to-noise spectral density ratio inside an infinitesimal solid angle d is given by
Eb /N0 . The first design task is to specify two particular
(, )d d = sin d d
values of Eb /N0 :

1. Required Eb /N0 Def 5.4 Total Power Radiated


The prescribed Pe and the calculated (Eb /N0 )req define a Z
point on the waterfall curve, which is designated as oper- P = (, ) d watts
ating point 1.

2. Received Eb /N0 Def 5.5 Average Power


To assure reliable operation of the communication link, The average power radiated per unit solid angle is
the link budget includes a safety measure called the link 1
Z
P
margin. The link margin provides protection against Pav = (, ) d = watts/steradian
4 4
change and the unexpected. Thus the Eb /N0 actually re-
ceived by the system is somewhat larger than (Eb /N0 )req .
For a direction specified by the angle pair (, ), the directive
Let (Eb /N0 )rec denote the actual or received Eb /N0 ,
gain of an antenna, denoted by g(, ), is defined as the ra-
which deifnes a second point on the waterfall curve des-
tio of the radiation intensity in that direction to the average
ignated as operating point 2.
radiated power

Def 5.1 Link Margin (, ) (, )


g(, ) = =
    Pav P/4
Eb Eb
M (dB) = (dB) (dB)
N0 rec N0 req Def 5.6 Directivity
The directivity of an antenna, denoted by D, is defined as the
The larger we make the link margin M , the more reliable is ratio of the maximum radiation intensity from the antenna
the communication link. However, the increased reliability to the radiation intensity from an isotropic source.
of the link is attained at the cost of higher Eb /N0 .
Def 5.7 Power Gain
5.3.1 Free-Space Propagation Model The power gain of an antenna, denoted by G, is defined as the
The next step in formulating the link budget is to calculate ratio of the maximum radiation intensity from the antenna to
the received signal power. In a radio communication system, the radiation intensity from a lossless isotropic source, under
the propagation of the modulated signal is accomplished by the constraint that the same input power is applied to both
means of a transmitting antenna, whose function is twofold: antennas.
Using radiation to denote the radiation efficiency factor of
To convert the electrical modulated signal into an electro- the antenna, we have
magnetic field.
G = radiation D
To radiate the electromagnetic energy into the desired di-
rections. Henceforth, we assume that the antenna is 100 percent effi-
cient (G = D).
At the receiver, we have a receiving antenna whose function An antenna is said to be reciprocal if the transmission
is the opposite of the transmitting antenna. Typically the medium is linear, passive and isotropic. For a given antenna
receiver is located in the farfield of the transmitting antenna. structure, the power gains of transmitting and receiving an-
tennas are then identical.
The Poynting vector or power density is the rate of energy
flow per unit area; it has the dimensions watts per square me- Def 5.8 EIRP
ter. The customary practice (for the reference antenna) is to The effective radiated power referenced to an isotropic source
assume that the reference antenna is an isotropic source, de- (EIRP) is defined as
fined as an omnidirectional (i.e., completely nondirectional)
antenna that radiates uniformly in all directions.

Def 5.9 Antenna Beamwidth We define


The antenna beamwidth, in degrees or radians, is defined as
the available output noise power in a band of width f cen-
the angle that subtends the two points on the mainlobe of the
tered at frequency f as the maximum average noise power
field-power pattern at which the peak field power is reduced
in this band, obtainable at the output of the device and
by 3dBs.
the noise figure of the two-port device as the ratio of the
Def 5.10 Effective Aperture total available output noise power (due to the device and
The effective aperture of the antenna is defined as the ratio the source) per unit bandwidth to the portion thereof due
of the power available at the antenna terminals to the ower solely to the source.
per unit area of the appropriately polarized incident electro-
Let the spectral density of the available noise power of the
magnetic wave.
device output be SN O (f ), and the spectral densitiy of the
2 c available noise power due to the source at the device input
A= G = be SN S (f ).
4 f
Also let G(f ) denote the available power gain of the two-port
where c is the speed of light (approx. 3 108 m/s). device, defined as the ratio of the available signal power at
the ouput of the device to the available signal power of the
The ratio of the antennas effective aperture to its physical source when the signal is a sinusoidal wave of frequency f .
aperture is a direct measure of the antennas aperture effi- We then get for the noise figure
ciency aperture , in radiating power to a desired direction or SN O (f )
absorbing power from that direction. F =
G(f )SN S (f )
Def 5.11 Friis Free-Space Equation The noise figure is commonly expressed in decible, that is,
The power Pr absorbed by the receiving antenna is the prod- as 10 log10 F .
uct of this power density and the antennas effective area
denoted by Ar Let PS (f ) denote the available signal power from the source.
  Under the condition that Z (f ) = R(f ) jX(f ), we find
EIRP Pt Gt Ar that
Pr = Ar = watts
4d 2 4d2 V0 2
PS (f ) =
4R(f )
where d is the distance between the receiving and transmit-
ting antennas. The available signal power at the output of the device is
According to the reciprocity principle we can express the
PO (f ) = G(f )PS (f )
effective area of the receiving antenna as
We can then write
2
Ar = Gr S (f )
4 F =
O (f )
and get the Friis Free-Space Equation
PS (f )
2 S (f ) =
SN S (f )f


Pr = Pt Gt Gr
4d PO (f )
O (f ) =
SN O (f )f
Def 5.12 Path Loss
The path loss representing signal attenuation in decibels We refer to S (f ) as the available signal-to-noise ratio of the
across the entire communication link, is defined as the dif- source and to O (f ) as the available signal-to-noise ratio at
ference (in dBs) between the transmitted signal power and the device output, both measured in a narrow band of width
the received signal power f centered at f .
   2 The noise figure F is a function of the operating frequency
Pt 4d
PL = 10 log10 = 10 log10 (Gt Gr ) + 10 log10 f ; it is therefore referred to as the spot noise figure. We may
Pr
define an average noise figure F0 :
The minus sign associated with the first term signifies the R
fact that this term represents a gain. The second term is SN O (f ) df
F0 = R
called the free-space loss, denoted by Lfree space .
G(f )SN S (f ) df
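A minimal numerical illustration of the free-space loss term and the Friis equation, using placeholder values for the carrier frequency, path length, transmit power and antenna gains (none of these numbers come from the text):

# Sketch: free-space loss and received power from the Friis equation,
# Pr = Pt*Gt*Gr*(lambda/(4*pi*d))^2, worked in decibels.
import numpy as np

c = 3e8                                    # speed of light, m/s
f = 4e9                                    # example downlink carrier frequency, Hz
d = 36_000e3                               # example path length, m
Pt_dBW, Gt_dB, Gr_dB = 10.0, 20.0, 40.0    # placeholder transmit power and antenna gains

lam = c / f
L_free_space_dB = 20 * np.log10(4 * np.pi * d / lam)
Pr_dBW = Pt_dBW + Gt_dB + Gr_dB - L_free_space_dB
print(f"free-space loss = {L_free_space_dB:.1f} dB, received power = {Pr_dBW:.1f} dBW")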

When comparing low-noise devices (all with F 1) it is


5.3.2 Noise Figure
preferable to use the equivalent noise temperature.
To perform noise analysis at the receiver of a communication
T + Te
system, we need convenient measure of the noise performance F = Te = T (F 1)
of a linear two-port device. One such measure is furnished T
by the so-called noise-figure. The noise figure F is measured under matched input condi-
Consider a linear two-port device connected to a signal tions, and with the noise source at temperature T . By con-
source of internal impedance Z(f ) = R(f ) + jX(f ) at the vention the temperature T is taken as room temperature,
input. namely 290 K.

It is often necessary to evaluate the noise figure of a cascade In North America, the band of radio frequencies assigned to
connection of two-port networks whose individual noise fig- the cellular system is 800-900 MHz. The subband 824-849
ures are known. It is assumed that the devices are matched. MHz is used to receive signals from the mobile units, and
The overall noise figure of a cascade connection is the subband 869-894 MHz is used to transmit signals to the
F2 1 F3 1 F4 1 mobile units. In Europe and elsewhere, the base-mobile and
F = F1 + + + + ... mobile-base subbands are reversed.
G1 G1 G2 G1 G2 G3
where F1 , F2 , . . . are the individual noise figures, and
G1 , G2 , . . . are the available power gains. 5.4.1 Propagation Effects
The overall equivalent noise temperature of the cascade con-
The major propagation problems are due to the fact that the
nection is (Friis formula)
antenna of a mobile unit may lie well below the surround-
T2 T3 T4 ing buildings. Simply put, there is no line-of-sight path
Te = T1 + + + + ...
G1 G1 G2 G1 G2 G3 to the base station. As a result of scattering, reflection and
where T1 , T2 , . . . are the equivalent noise temperatures. diffraction, we speak of a multipath phenomenon in that the
various incoming waves reach their destination from different
Def 5.13 Equivalent Noise Temperature of an directions and with differnet time delays.
Earth-Terminal Receiver The effect of the differential time delay is to introduce a rela-
tive phase shift between the two components of the received
Tmixer + TIF signal, with two extreme cases:
Te = Tantenna + TRF +
GRF
1. The relative phase shift is zero, in which case the two
5.4 Wireless Communications components add constructively.

A second type of multiuser radio communication system is 2. The relative phase shift is 180 degrees, in which case the
wireless communications, which is synonymous with mobile two components add destructively.
radio. In such a system the radio transmitter or receiver
is capable of being moved, regardless of whether it actually Due to motion of the receiver, there is a continuous change
moves or not. in the length of each propagation path. Hence, the rela-
The aim of a characterization of the mobile radio channel is tive phase shift between the two components of the received
to quantify two factors of primary concern: signal is a function of spatial location of the receiver. The
received amplitude (envelope) is no longer constant as was
1. median signal strength
the case in a static environment.
2. signal variability, which characterizes the fading nature of
the channel. Def 5.14 Signal Fading
An idealized model of the cellular radio system consists of There is constructive addition at some locations, and almost
an array of hexagonal cells with a base station located at the complete cancellation at some other locations. This phe-
center of each cell. A typical cell has a radius of 1 to 12 miles. nomenon is referred to as signal fading.
The function of the base stations is to act as an interface be-
tween mobile subscribers and the cellular radio system. The The net result is that the envelope of the received signal
base stations are themselves connected to a switching center varies with location in a complicated fashion.
by dedicated wirelines.
The mobile switching center has two important roles:
5.5 Binary Signaling over a Rayleigh Fad-
1. It acts as an interface between the cellular radio system
and the public switched telephone network. ing Channel
2. It performs overall supervision and control of the mobile Consider the transmission of binary data over a Rayleigh
communications. fading channel, for which the (low-pass) complex envelope of
the received signal is modeled as follows
The switching process, called a handover or handoff, is de-
signed to move a mobile subscriber from one base station x(t) = exp(j)s(t) + w(t)
to another during a call in a transparent fashion, without
interruption of service. where s(t) is the complex envelope of the transmitted (band-
This cellular concept relies on two essential features pass) signal, is a Rayleigh-distributed random variable de-
1. Frequency reuse refers to the use of radio channels on the scribing the attenuation in transmission, is a uniformly dis-
same carrier frequency to cover different areas, which are tributed random variable describing the phase-shift in trans-
physically sepearted from each other sufficiently ot ensure mission, and w(t) is a complex-valued white Gaussian noise
that co-channel interference is not objectionable. process. It is assumed that the channel is flat in both time
and frequency, so that we can estimate the phase-shift from
2. Cell splitting. When demand for service exceeds the num-
the received signal without error.
ber of channels allocated to a particular cell, cell splitting
Suppose then that coherent binary phase-shift keying is used
is used to handle the additional growth in traffic within
to do the data transmission. Under the condition that is
that particular cell. The new cells, which have a smaller
fixed or constant over a bit interval, we may express the av-
radius than the original cell, are called microcells.
erage probability of symbol error (i.e. bit error rate) due to

the additive white Gaussian noise acting alone as follows: 5.5.1 Diversity Techniques

1 Diversity may be viewed as a form of redundancy. THe fol-


Pe () = erfc( ) lowing diversity techniques are of particular interest:
2
Frequency Diversity
where is an attenuated version of the transmitted signal
In frequency diversity, the message signal is transmitted
energy per bit-to-noise spectral density ratio Eb /N0 as shown
using several carriers that are spaced sufficiently apart
by
from each other to provide independently fading versions
2 Eb of the signal. This may be accomplished by choosing a
=
N0 frequency spacing equal to or larger than the coherence
bandwidth of the channel.
We may view Pe () as a conditional probability given that
is fixed. Thus, to evaluate the average probability of symbol Time (signal-repetition) Diversity
error in the combined presence of fading and noise, we must In time diversity, the same message signal is transmitted
average Pe () over all possible values of : in different time slots, with the spacint between successive
time slots being equal to or greater than the coherence
Z
Pe = Pe ()f () d time of the channel.
0
Space Diversity
where f () is the probability density function of . In space diversity, multiple transmitting or receiving an-
Since is Rayleigh distributed, we find that has a chi- tennas (or both) are used, with the spacing between ad-
square distribution with two degrees of freedom jacent antennas being chosen so as to assure the indepen-
dence of fading events; this may be satisfied by spacing
1


 the adjacent antennas by at least seven times the radio
f () = exp >0 wavelength.
0 0
Given that by one of these means we create L independently
The term 0 is the mean value of the received signal energy fading channels, we may then use a linear diversity combin-
per bit-to-noise spectral density ratio, which is defined by ing structure involving L spearate receivers. The system is
designed to compensate only for short-term effects of a fading
Eb
0 = E[] = E[2 ] channel. Moreover, it is assumed that noise-free stimates of
N0 the channel attenuation factors {l } and the channel phase-
shifts {l } are available.
where E[2 ] is the mean-square value of the Rayleigh- In the linear combiner, the instantaneous output signal-
distributed random variable . Having all this we get to-noise ratio (SNR) is the sum of the instantaneous SNRs
 r  on the individual diversity branches (channels). This opti-
1 0 mum form of a linear combiner is therefore referred to as a
Pe = 1
2 1 + 0 maximal-ratio combiner.

which defines the bit error rate for coherent binary phase-
shift keying (PSK) over a flat-flat Rayleigh fading channel.
Following a similar approach, we may derive the correspond-
ing bit error rates for coherent binary frequency-shift keying,
binary differential phase-shift keying and noncoherent binary
frequency-shift keying.

Type of Signaling    Bit Error Rate Pe               Approximate BER for large γ0
coherent BPSK        (1/2)(1 − √(γ0/(1 + γ0)))       1/(4γ0)
coherent BFSK        (1/2)(1 − √(γ0/(2 + γ0)))       1/(2γ0)
binary DPSK          1/(2(1 + γ0))                    1/(2γ0)
noncoherent FSK      1/(2 + γ0)                       1/γ0
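A minimal Python sketch contrasting the coherent-BPSK entry of this table with the corresponding non-fading AWGN result, at a few arbitrarily chosen mean SNRs, makes the size of the fading penalty explicit:

# Sketch: coherent BPSK bit error rate over a flat Rayleigh fading channel,
# Pe = 0.5*(1 - sqrt(g0/(1+g0))), versus the non-fading AWGN result
# 0.5*erfc(sqrt(g0)) at the same mean Eb/N0 (denoted g0 here).
import numpy as np
from scipy.special import erfc

for g0_db in [10, 20, 30]:
    g0 = 10 ** (g0_db / 10)
    pe_rayleigh = 0.5 * (1 - np.sqrt(g0 / (1 + g0)))
    pe_awgn = 0.5 * erfc(np.sqrt(g0))
    print(f"mean Eb/N0 = {g0_db} dB: Rayleigh {pe_rayleigh:.2e}  vs  AWGN {pe_awgn:.2e}")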

We see that Rayleigh fading results in a severe degradation in the


noise performance of a digital passband transmission system,
the degradation being measured in tens of decibels of addi-
tional mean signal-to-noise ratio compared to a nonfading
channel for the same bit error rate.
The practical implication of this difference is that in a mo-
bile radio environment, we have to provide a large increase
in mean signal-to-noise ratio, so as to ensure a bit error rate
that is low enough for practical use.

29
Communication Systems Summary CONFIDENTIAL Raphael Seebacher

6 Fundamental Limits in Information Theory


The purpose of a communication system is to carry 4. I(sk sl ) = I(sk ) + I(sl ) if sk and sl are statistically inde-
information-bearing baseband signals from one place to an- pendent.
other over a communication channel. Information theory
deals with mathematical models and analysis of a commu- It is the standard practice to use a logarithm to base 2. The
nication system and provides answers to two fundamental resulting unit of information is called the bit (a contraction
questions: of binary digit).
1. What is the irreducible complexity below which a signal
cannot be compressed?
I(sk ) = log2( 1/pk ) = −log2(pk ),   k = 0, . . . , K−1
2. What is the ultimate transmission rate for reliable com-
munication over a noisy channel? Def 6.3 Entropy
I(sk ) is a discrete random variable that takes on the val-
6.1 Uncertainty, Information and Entropy ues I(s0 ), . . . , I(sK1 ) with probabilities p0 , . . . , pK1 re-
spectively. The mean of I(sk over the source alphabet S
Def 6.1 Discrete Memoryless Source
is given by
Suppose that a probabilistic experiment involves the observa-
tion of the output emitted by a discrete source during every
unit of time (signaling interval). The source output is mod-
eled as a discrete random variable S, which takes on symbols
from a fixed finite alphabet
H(S) = E[I(sk )] = Σ_{k=0}^{K−1} pk I(sk ) = Σ_{k=0}^{K−1} pk log2( 1/pk )
from a fixed finite alphabet
S = {s0 , . . . , sK1 } The important quantity H(S) is called the entropy of a dis-
crete memoryless source with source alphabet S. It is a mea-
with probabilities sure of the average information content per source symbol.
P(S = sk ) = pk ,   k = 0, . . . , K−1,   with Σ_{k=0}^{K−1} pk = 1        Properties of Entropy
Consider a discrete memoryless source. The entropy H(S)
We assume that the symbols emitted by the source during of such a source is bounded as follows:
successive signaling intervals are statistically independent. A
source having the properties just described is called a discrete 0 H(S) log2 K
memoryless source. 0 ≤ H(S) ≤ log2 K
where K is the radix (number of symbols) of the alphabet S
If the source symbols occur with different probabilities, and of the source.
the probability pk is low, then there is more surprise, and
therefore information, when symbol sk is emitted by the 1. H(S) = 0, if and only if the probability pk = 1 for some k,
source than when symbol si , i 6= k with higher probability and the remaining probabilities in the set are all zero; this
is emitted. lower bound on entropy corresponds to no uncertainty.

Def 6.2 Information 2. H(S) = log2 K, if and only if pk = 1/K for all k (i.e., all
The amount of information is related to the inverse of the symbols in the alphabet S are equiprobable); this upper
probability of occurrence. We define the amount of informa- bound on entropy corresponds to the maximum uncer-
tion gained after observing the event S = sk , which occurs tainty.
with probability pk , as the logarithmic function
  In discussing information-theoretic concepts, we often find
it useful to consider blocks rather than individual symbols,
with each block consisting of n successive source symbols.
I(sk ) = log( 1/pk )
with the following properties We may view each such block as being produced by an ex-
tended source with a source alphabet S n that has K n dis-
1. If we are absolutely certain of the outcome of an event, tinct blocks, where K is the number of distinct symbols in
even before it occurs, there is no information gained. the source alphabet S of the original source.
I(sk ) = 0 for pk = 1
H(S n ) = n H(S)
2. The occurence of an event S = sk either provides some or
no information, but never brings about a loss of informa-
tion. 6.2 Source-Coding Theorem
I(sk ) ≥ 0   for 0 ≤ pk ≤ 1
An important problem in communications is the efficient rep-
3. The less probable an event is, the more information we resentation of data generated by a discrete source. The pro-
gain when it occurs. cess by which this representation is accomplished is called
source encoding. The device that performs the representa-
I(sk ) > I(si )   for pk < pi        tion is called a source encoder.
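As a quick numerical illustration (the four-symbol source below is hypothetical, not from the text), the sketch computes I(sk) and H(S) and confirms that the entropy lies between 0 and log2 K:

import numpy as np

# Hypothetical four-symbol source; probabilities must sum to one.
p = np.array([0.5, 0.25, 0.125, 0.125])

info = -np.log2(p)              # amount of information I(sk) in bits
H = np.sum(p * info)            # entropy H(S) = E[I(sk)]

print(info)                     # [1. 2. 3. 3.]
print(H)                        # 1.75 bits/symbol, between 0 and log2(4) = 2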


Def 6.4 Variable-Length Code 6.3.1 Prefix Coding


If some source symbols are known to be more probable than
Consider a discrete memoryless source of alphabet
others, then we may exploit this feature in the generation
{s0 , . . . , sK1 } and statistics {p0 , . . . , pK1 }.
of a source code by assigning short code words to frequent
symbols, and long code words to rare symbols. We refer to
such a source code as a variable-length code. Def 6.5 Prefix Condition
Let the code word assigned to source symbol sk be de-
noted by (mk1 , . . . , mkn ), where the individual elements
An efficient source encoder has to satisfy the following two
mk1 , . . . , mkn are 0s and 1s, and n is the code-word length.
functional requirements
The initial part of the code word is represented by the ele-
1. The code words produced by the encoder are in binary ments mk1 , . . . , mki for some i ≤ n. Any sequence made up
form. of the initial part of the code word is called a prefix of the
code word.
2. The source code is uniquely decodable, so that the original
source sequence can be reconstructed perfectly from the Def 6.6 Prefix Code
encoded binary sequence. A prefix code is defined as a code in which no code word is
the prefix of any other code word.
Consider then a scheme, where the output sk of a discrete
memoryless source is converted by the encoder into a block To decode a sequence of code words generated from a prefix
of 0s and 1s, denoted by bk . We assume that the source source code, the source decoder simply starts at the begin-
has an alphabet with K different symbols, and that the kth ning of the sequence and decodes one code word at a time.
symbol sk occurs with probability pk , (k = 0, . . . , K 1). Specifically, it sets up what is equivalent to a decision tree,
Let the binary code word assigned to symbol sk by the en- which is a graphical portrayal of the code words in the par-
coder have length lk , measured in bits. We define the average ticular source code.
code-word length or average number of bits per source symbol A prefix code has the important property that it is always
L as uniquely decodable.
L = Σ_{k=0}^{K−1} pk lk
Def 6.7 Kraft-McMillan Inequality
If a prefix code has been constructed for a discrete mem-
Let Lmin denote the minimum possible value of L. We then oryless source with alphabet {s0 , . . . , sK1 } and statistics
define the coding efficiency of the source encoder as {p0 , . . . , pK1 } and the code word for symbol sk has length
lk , k = 0, . . . , K−1, then the code-word lengths of the code
η = Lmin / L        always satisfy the Kraft-McMillan Inequality
With L ≥ Lmin we clearly have η ≤ 1.        Σ_{k=0}^{K−1} 2^(−lk) ≤ 1

Theorem 6.1 Source-Coding Theorem where the factor 2 refers to the radix (number of symbols)
Given a discrete memoryless source of entropy H(S), the in the binary alphabet.
average code-word length L for any distortionless source en-
coding scheme is bounded as Def 6.8 Instantaneous Codes
Prefix codes are distinguished from other uniquely decodable
L ≥ H(S) codes by the fact that the end of a code word is always rec-
ognizable. For this reason, prefix codes are also referred to
Thus with Lmin = H(S) we may rewrite the efficiency of a as instantaneous codes.
source encoder in terms of the entropy H(S) as
Given a discrete memoryless source of entropy H(S), a prefix
η = H(S) / L        code can be constructed with an average code-word length
L, which is bounded as follows

6.3 Data Compaction        H(S) ≤ L < H(S) + 1

A common characteristic of signals generated by physical        For the case that pk = 2^(−lk), the average code-word length is
sources is that, in their natural form, they contain a signifi-
cant amount of information that is redundant, the transmis-        L = Σ_{k=0}^{K−1} lk / 2^(lk)
sion of which is therefore wasteful of primary communica-
tion resources. For efficient signal transmission, the redun-        and the corresponding entropy of the source is
dant information should be removed from the signal prior to
transmission. This operation, with no loss of information, is        H(S) = Σ_{k=0}^{K−1} (1/2^(lk)) log2(2^(lk)) = Σ_{k=0}^{K−1} lk / 2^(lk)
ordinarily performed on a signal in digital form, in which case
we refer to it as data compaction or lossless data compres-
sion. The entropy of the source establishes the fundamental Hence we find that the prefix code is matched to the source
limit on the removal of redundancy from the data. in that L = H(S).
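The sketch below (illustrative only; the code-word lengths are assumed, e.g. those of the prefix code {0, 10, 110, 111}) checks the Kraft-McMillan inequality and the matched case L = H(S) for a dyadic source:

import numpy as np

# Hypothetical prefix-code lengths lk for a four-symbol alphabet
lengths = np.array([1, 2, 3, 3])

kraft = np.sum(2.0 ** -lengths)
print(kraft)                      # 1.0 <= 1, so a prefix code with these lengths exists

# Dyadic source matched to the code: pk = 2^(-lk)
p = 2.0 ** -lengths
L_avg = np.sum(p * lengths)       # average code-word length
H = -np.sum(p * np.log2(p))       # source entropy
print(L_avg, H)                   # both 1.75: the code is matched, L = H(S)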


Let Ln denote the average code-word length of the extended where p0 , . . . , pK1 are the source statistics, and lk is the
prefix code. For a uniquely decodable code, Ln is the small- length of the code word assigned to source symbol sk .
est possible.
It is usually found that when a combined symbol is moved
H(S^n) ≤ Ln < H(S^n) + 1        as high as possible, the resulting Huffman code has a signif-
nH(S) ≤ Ln < nH(S) + 1        icantly smaller variance σ² than when it is moved as low as
possible. On this basis, it is reasonable to choose the former
In the limit, as n approaches infinity, the lower and upper Huffman code over the latter.
bounds converge. We may therefore state that by making the
order n of an extended prefix source encoder large enough, Drawbacks of Huffman Coding
we can make the code faithfully represent the discrete mem-
oryless source S as closely as desired. The Huffman code requires knowledge of a probabilistic
However, the price we have to pay for decreasing the average model of the source; unfortunately, in practice, source
code-word length is increased decoding complexity, which is statistics are not always known a priori.
brought about by the high order of the extended prefix code.
In modeling text we find that storage requirements pre-
vent the Huffman code from capturing the higher-order
6.3.2 Huffman Coding
relationships between words and phrases, thereby compro-
An important class of prefix codes is known as Huffman mising the efficiency of the code.
codes. The basic idea behind Huffman coding is to assign
to each symbol of an alphabet a sequence of bits roughly 6.3.3 Lempel-Ziv Coding
equal in length to the amount of information conveyed by
the symbol in question. To overcome the practical limitations of the Huffman code,
The essence of the algorithm used to synthesize Huffman we may use the Lempel-Ziv algorithm, which is intrinsically
code is to replace the prescribed set of source statistics of a adaptive and simpler to implement than Huffman coding.
discrete memoryless source with a simpler one.
The Huffman encoding algorithm proceeds as follows: Basically, encoding in the Lempel-Ziv algorithm is accom-
plished by parsing the source data stream into segments that
1. The source symbols are listed in order of decreasing prob- are the shortest subsequences not encountered previously.
ability. The two source symbols of lowest probability are
assigned a 0 and a 1. This part of the step is referred to The last symbol of each subsequence in the code book is an
as a splitting stage. innovation symbol, which is so called in recognition of the
fact that its appendage to a particular subsequence distin-
2. These two symbols are regarded as being combined into a
guishes it from all previous subsequences stored in the code
new source symbol with probability equal to the sum of
book. Correspondingly, the last bit of each uniform block of
the two original probabilities. (The list of source symbols,
bits in the binary encoded representation of the data stream
and therefore source statistics, is thereby reduced in size
represents the innovation symbol for the particular subse-
by one.) The probability of the new symbol is placed in
quence under consideration.
the list in accordance with its value.
The remaining bits provide the equivalent binary represen-
3. The procedure is repeated until we are left with a final tation of the pointer to the root subsequence that matches
list of source statistics (symbols) of only two for which a the one in question except for the innovation symbol.
0 and a 1 are assigned.
In contrast to Huffman coding, the Lempel-Ziv algorithm
The code for each (original) source symbol is found by work- uses fixed-length codes to represent a variable number of
ing backward and tracing the sequence of 0s and 1s assigned source symbols; this feature makes the Lempel-Ziv code suit-
to that symbol as well as its successors. able for synchronous transmission.
It is noteworthy that the Huffman encoding process (i.e., the For a long time, Huffman coding was unchallenged as the al-
Huffman tree) is not unique: gorithm of choice for data compaction. However, the Lempel-
1. At each splitting stage in the construction of a Huffman Ziv algorithm has taken over almost completely and is now
code, there is arbitrariness in the way a 0 and a 1 are the standard algorithm for file compression.
assigned to the last two source symbols.
6.4 Discrete Memoryless Channels
2. Ambiguity arises when the probability of a combined sym-
bol is found to equal another probability in the list. A discrete memoryless channel is a statistical model with
an input X and an output Y that is a noisy version of X;
Note that the average code-word length remains the same. both X and Y are random variables. Every unit of time, the
channel accepts an input symbol X selected from an alpha-
Def 6.9 bet X and, in response, it emits an output symbol Y from
As a measure of the variability in code-word lengths of a an alphabet Y.
source code, we define the variance of the average code-word The channel is said to be discrete when both of the alpha-
length L over the ensemble of source symbols as bets have finite sizes. It is said to be memoryless when the
current output symbol depends only on the current input
symbol and not any of the previous ones.
σ² = Σ_{k=0}^{K−1} pk (lk − L)²
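A compact sketch of the Huffman construction described above is given below; the symbol probabilities are hypothetical, and the tie-breaking (and hence the exact code words) is one of several valid choices:

import heapq

def huffman_code(probs):
    """Build a binary Huffman code for symbol probabilities given as a dict."""
    # Each heap entry: (probability, tie-breaker, {symbol: partial code word})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)    # two least probable entries
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, count, merged))
        count += 1
    return heap[0][2]

code = huffman_code({"s0": 0.4, "s1": 0.2, "s2": 0.2, "s3": 0.1, "s4": 0.1})
print(code)   # prints one valid prefix code; the average code-word length is 2.2 bits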


The channel is described in terms of an input alphabet a measure of the prior uncertainty about X, how can we
measure the uncertainty about X after observing Y ?
X = {x0 , . . . , xJ1 }
and output alphabet
H(X |Y = yk ) = Σ_{j=0}^{J−1} p(xj |yk ) log2( 1 / p(xj |yk ) )
Y = {y0 , . . . , yK1 }
The mean of entropy H(X |Y = yk ) over the output alphabet
and a set of transition probabilities Y is given by

p(yk |xj ) = P(Y = yk |X = xj )   for all j, k
with 0 ≤ p(yk |xj ) ≤ 1 for all j, k.
A convenient way of describing a discrete memoryless chan-
nel is to arrange the various transition probabilities of the
channel in the form of a matrix as follows:
P = [ p(y0 |x0 )     p(y1 |x0 )     . . .   p(yK−1 |x0 )
      p(y0 |x1 )     p(y1 |x1 )     . . .   p(yK−1 |x1 )
      . . .
      p(y0 |xJ−1 )   p(y1 |xJ−1 )   . . .   p(yK−1 |xJ−1 ) ]
H(X |Y) = Σ_{k=0}^{K−1} H(X |Y = yk ) p(yk )
        = Σ_{k=0}^{K−1} Σ_{j=0}^{J−1} p(xj |yk ) p(yk ) log2( 1 / p(xj |yk ) )
        = Σ_{k=0}^{K−1} Σ_{j=0}^{J−1} p(xj , yk ) log2( 1 / p(xj |yk ) )
The quantity H(X |Y) is called a conditional entropy. It
represents the amount of uncertainty remaining about the
channel input after the channel output has been observed.

The J-by-K matrix P is called the channel matrix, or tran- Another important quantity is called the mutual information
sition matrix. Note that each row of the channel matrix of the channel, and given by
corresponds to a fixed channel input, whereas each column
of the matrix corresponds to a fixed channel output. I(X ; Y) = H(X ) − H(X |Y)

Suppose now that the inputs to a discrete memoryless chan- Similarly, we may write
nel are selected according to the probability distribution p(xj )
I(Y; X ) = H(Y) − H(Y|X )
p(xj ) = P (X = xj ) j = 0, . . . , J 1
where H(Y) is the entropy of the channel output and H(Y|X )
The joint probability distribution of the random variables X is the conditional entropy of the channel output given the
and Y is given by channel input.
p(xj , yk ) = P (X = xj , Y = yk )
= P (Y = yk |X = xj )P (X = xj ) 6.5.1 Properties of Mutual Information
= p(yk |xj )p(xj ) 1. The mutual information of a channel is symmetric; that
is
The marginal probability distribution of the output random
I(X ; Y) = I(Y; X )
variable Y is obtained by
p(yk ) = Σ_{j=0}^{J−1} p(yk |xj ) p(xj ),   k = 0, . . . , K−1
where the mutual information I(X ; Y) is a measure of the
uncertainty about the channel input that is resolved by
observing the channel output, and the mutual information
I(Y; X ) is a measure of the uncertainty about the channel
The probabilities p(xj for j = 0, . . . , J 1, are known as the output that is resolved by sending the channel input.
a priori probabilities of the various input symbols.
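As an illustration of these quantities, the sketch below (with an assumed crossover probability p = 0.1) computes I(X;Y) = H(Y) − H(Y|X) for the binary symmetric channel shown next; with equiprobable inputs this equals the channel capacity C = 1 − H(p) treated in Section 6.6:

import numpy as np

def binary_entropy(p):
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def mutual_information_bsc(p, q=0.5):
    """I(X;Y) for a BSC with crossover probability p and P(X = 1) = q."""
    py1 = q * (1 - p) + (1 - q) * p          # output distribution P(Y = 1)
    return binary_entropy(py1) - binary_entropy(p)   # H(Y) - H(Y|X)

p = 0.1
print(mutual_information_bsc(p, 0.5))   # about 0.531 bits per channel use
print(1 - binary_entropy(p))            # the same value: the BSC capacity 1 - H(p)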
2. The mutual information is always nonnegative; that is
Binary Symmetric Channel
I(X ; Y) ≥ 0
[Transition diagram: x0 = 0 is received as y0 = 0 and x1 = 1 as y1 = 1 with probability 1 − p; the crossed transitions x0 → y1 and x1 → y0 occur with crossover probability p.]
3. The mutual information of a channel is related to the joint
entropy of the channel input and channel output by
I(X ; Y) = H(X ) + H(Y) − H(X , Y)
where the joint entropy H(X , Y) is defined by
H(X , Y) = Σ_{k=0}^{K−1} Σ_{j=0}^{J−1} p(xj , yk ) log2( 1 / p(xj , yk ) )
6.5 Mutual Information
Given that we think of the channel output Y as a noisy ver-
sion of the channel input X, and that the entropy H(X ) is


6.6 Channel Capacity the destination from the source alphabet S and at the same
source rate of one symbol every Ts seconds.
Consider a discrete memoryless channel with input alphabet
We assume that the channel is capable of being used once
X , output alphabet Y, and transition probabilities p(yk |xj ),
every Tc seconds. Hence, the channel capacity per unit time
where j = 0, . . . , J 1 and k = 0, . . . , K 1.
is C/Tc bits per second, which represents the maximum rate
The mutual information of a channel depends not only on
of information transfer over the channel.
the channel but also on the way in which the channel is used
(i.e., the input probability distribution).
Theorem 6.2 Channel Coding Theorem
The channel coding theorem for a discrete memoryless chan-
Def 6.10 Channel Capacity
nel is state in two parts:
We define the channel capacity of a discrete memoryless
channel as the maximum mutual information I(X ; Y) in any 1. let a discrete memoryless source with an alphabet S have
single use of the channel (i.e., signaling interval), where the entropy H(S) and produce source symbols once every Ts
maximization is over all possible input probability distribu- seconds. Let a discrete memoryless channel have capacity
tions {p(xj )} on X . C and be used once every Tc seconds. Then, if
C = max_{p(xj )} I(X ; Y)        H(S)/Ts ≤ C/Tc
The channel capacity C is measured in bits per channel use,
or bits per transmission. there exists a coding scheme for which the source output
can be transmitted over the channel and be reconstructed
Note that the channel capacity C is a function only of the with an arbitrarily small probability of error. When the
transition probabilities p(yk |xj ), which define the channel. above equation is satisfied with the equality sign, the sys-
The calculation of C involves maximization subject to two tem is said to be signaling at the critical rate.
constraints:
2. Conversely, if
p(xj ) ≥ 0 for all j   and   Σ_{j=0}^{J−1} p(xj ) = 1
H(S)/Ts < C/Tc
it is not possible to transmit information over the channel
and reconstruct it with an arbitrarily small probability of
6.7 Channel-Coding Theorem error.
The inevitable presence of noise in a channel causes discrep-
ancies (errors) between the output and input data sequences The theorem specifies the channel capacity C as a funda-
of a digital communication system. For many applications, mental limit on the rate at which the transmission of reliable
a probability of error equal to 10^(−6) or even lower is often error-free messages can take place over a discrete memoryless
a necessary requirement. To achieve such a high level of channel. It is important to note the following
performance, we resort to the use of channel coding.
Specifically, channel coding consists of mapping the incom- The theorem should be viewed as an existence proof.
ing data sequence into a channel input sequence, and inverse
mapping the channel output sequence into an output data It tells us that the probability of symbol error tends to
sequence in such a way that the overall effect of channel zero as the length of the code increases.
noise on the system is minimized. Note also that power and bandwidth constraints were hidden
in the discussion.
The channel encoder and channel decoder are both under
the designers control and should be designed to optimize
the overall reliability of the communication system. The 6.7.1 Application of the Channel Coding Theorem
approach taken is to introduce redundancy in the channel to Binary Symmetric Channels
encoder so as to reconstruct the original source sequence as Consider a discrete memoryless source that emits equally
accurately as possible. likely binary symbols once every Ts seconds. With the source
It suffices to confine our attention to block codes. In this class entropy equal to one bit per souce symbol, the information
of codes, the message sequence is subdivided into sequential rate of the source is (1/Ts ) bits per second. The source se-
blocks each k bits long, and each k-bit block is mapped into quence is applied to a channel encoder with code rate r. The
an n-bit block, where n > k. The number of redundant bits channel encoder produces a symbol once every Tc seconds.
added by the encoder to each transmitted block is n k bits. Hence, the encoded symbol transmission rate is (1/Tc ) sym-
The ratio bols per second.
r = k/n < 1
The channel encoder engages a binary symmetric channel
once every Tc seconds. Hence, the channel capacity per unit
is called the code rate.
time is (C/Tc ) bits per second, where C is determined by
Suppose then the discrete memoryless source has the source
the prescribed channel transition probability p. The channel
alphabet S and entropy H(S) bits per source symbol. We
coding theorem implies that if
assume that the source emits symbols once every Ts seconds.
Hence, the average information rate of the source is H(S)/Ts
bits per second. The decoder delivers decoded symbols to
1/Ts ≤ C/Tc

the probability of error can be made arbitrarily low by the 6.8.1 Mutual Information
use of a suitable channel encoding scheme. With
We define the mutual information between random variables
r = Tc / Ts
we may restate the condition simply as
r ≤ C
I(X; Y ) = ∫∫ fX,Y (x, y) log2( fX (x|y) / fX (x) ) dx dy
where fX,Y (x, y) is the joint probability density function of
X and Y , and fX (x|y) is the conditional probability density
function of X, given that Y = y.
function of X, given that Y = y.
That is, for r ≤ C there exists a code capable of achieving We find that the mutual information has the properties
an arbitrarily low probability of error.
1. I(X; Y ) = I(Y ; X)
6.8 Differential Entropy and Mutual Infor- 2. I(X; Y ) 0
mation for Continuous Ensembles
3. I(X; Y ) = h(X) h(X|Y ) = h(Y ) h(Y |X)
The sources and channels considered thus far have involved
The parameter h(x) is the differential entropy of X; likewise
ensembles of random variables that are discrete in amplitude.
for h(Y ). The parameter h(X|Y ) is the conditional differen-
We extend some of these concepts to continuous random vari-
tial entropy of X, given Y ; it is defined by
ables and random vectors.
h(X|Y ) = ∫∫ fX,Y (x, y) log2( 1 / fX (x|y) ) dx dy
Def 6.11 Differential Entropy
Consider a continuous random variable X with probability
density function fX (x). 6.9 Information Capacity Theorem
We define h(X) as the differential entropy of X
Consider a zero-mean stationary process X(t) that is band-
h(X) = ∫ fX (x) log2( 1 / fX (x) ) dx
limited to B hertz. Let Xk , k = 1, . . . , K, denote the con-
tinuous random variables obtained by uniform sampling of
the process X(t) at the Nyquist rate of 2B samples per sec-
When we have a continuous random vector X consisting of ond. These samples are transmitted in T seconds over a
n random variables X1 , . . . , Xn , we define the differential en- noisy channel, also band-limited to B hertz. The number of
tropy of X as the n-fold integral samples K is given by
h(X) = ∫ fX (x) log2( 1 / fX (x) ) dx        K = 2BT
We refer to Xk as a sample of the transmitted signal. The
channel output is perturbed by additive white Gaussian noise
where fX (x) is the joint probability density function of X. (AWGN) of zero mean and power spectral density N0 /2. The
noise is band-limited to B hertz. Let the continuous random
Consider a random variable X uniformly distributed over the variables Yk , k = 1, . . . , K denote samples of the received
interval (0, a). We then get signal
Yk = Xk + Nk k = 1, . . . , K
h(X) = log a
The noise sample Nk is Gaussian with zero mean and vari-
Note that the differential entropy of a continuous random ance given by
variable can be negative. 2 = N0 B
Consider two random variables X and Y that have the same We assume that the samples Yk are statistically independent
mean and the same variance 2 . Furthermore X is Gaus- for all k. Such a channel is called a discrete-time, memory-
sian distributed. For the differential entropy of the Gaussian less Gaussian channel.
random variable X we get
h(X) = (1/2) log2(2πeσ²)        To assign a cost to each channel input, it is reasonable to
define
E[Xk²] = P,   k = 1, . . . , K
and finally as the transmitter is typically limited in power; with P de-
noting the average transmitted power.
h(Y ) ≤ h(X)   for X a Gaussian random variable and Y another random variable
Let I(Xk ; Yk ) denote the mutual information between Xk
and Yk . We may then define the information capacity of the
For a finite variance σ², the Gaussian random variable has channel as
the largest differential entropy attainable by any random
C = max_{fXk (x)} { I(Xk ; Yk ) : E[Xk²] = P }
variable.

The entropy of a Gaussian random variable X is uniquely where the maximization is performed with respect to fXk (x),
determined by the variance of X (i.e., it is independent of the probability density function of Xk .
the mean of X).
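A small numeric check of this property, using base-2 logarithms throughout and an assumed unit variance: the closed-form differential entropies of a Gaussian and of a uniform random variable (the example given above) with the same variance can be compared directly:

import numpy as np

sigma2 = 1.0                                          # common variance

h_gauss = 0.5 * np.log2(2 * np.pi * np.e * sigma2)    # differential entropy of N(0, sigma2)

a = np.sqrt(12 * sigma2)                              # uniform on (0, a) has variance a^2 / 12
h_uniform = np.log2(a)                                # differential entropy of the uniform variable

print(h_gauss, h_uniform)                             # ~2.05 vs ~1.79 bits: the Gaussian is larger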


By using I(Xk ; Yk ) = h(Yk ) h(Nk ) we may reformulate the A plot of bandwidth efficiency Rb /B versus Eb /N0 is called
channel capacity as the bandwidth-efficiency diagram.
C = I(Xk ; Yk ),   Xk Gaussian with E[Xk²] = P
The variance of sample Yk of the received signal equals
P + σ². Hence
h(Yk ) = (1/2) log2( 2πe(P + σ²) )
The variance of the noise sample Nk equals σ².
h(Nk ) = (1/2) log2(2πeσ²)
For the desired channel capacity C we then get
 
C = (1/2) log2( 1 + P/σ² )   bits per transmission
With the channel used K times for the transmission of K
samples of the process X(t) in T seconds, we find that the
information capacity per unit time is (K/T ) times the result
given above. The number K equals 2BT .

Def 6.12 Information Capacity Theorem


The information capacity of a continuous channel of band-
width B hertz, perturbed by additive white Gaussian noise
of power spectral density N0 /2 and limited in bandwidth to
B, is given by
 
C = B log2( 1 + P/(N0 B) )   bits per second        We can make the following observations, based on the above
figure:
where P is the average transmitted power.
1. For infinite bandwidth, the ratio Eb /N0 approaches the
The information capacity theorem highlights most vividly limiting value
the interplay among three key system parameters: chan-
nel bandwidth, average transmitted power, and noise power 
Eb

spectral density at the channel output.
lim_{B→∞} (Eb /N0 ) = ln 2 = 0.693 = −1.6 dB
We note that it is easier to increase the information capac-
ity of a communication channel by expanding its bandwidth
than increasing the transmitted power for a prescribed noise This value is called the Shannon limit for an AWGN chan-
variance. nel, assuming a code rate of zero.

The corresponding limiting value of the channel capacity


6.9.1 Implications of the Information Capacity is found to be
Theorem
lim_{B→∞} C = (P/N0 ) log2 e
We introduce the notion of an ideal system defined as one
that transmits data at a bit rate Rb equal to the information
capacity C. We may then express the average transmitted 2. The capacity boundary, defined by the curve for the crit-
power as ical bit rate Rb = C, separates combinations of system
P = Eb C parameters that have the potential for supporting error-
where Eb is the transmitted energy per bit. Accordingly, the free transmission (Rb < C) from those for which error-free
ideal system is defined by the equation transmission is not possible (Rb > C).
 
C/B = log2( 1 + (Eb /N0 )(C/B) )
3. The diagram highlights potential trade-offs among Eb /N0 ,
Rb /B, and probability of symbol error Pe . In particular,
Equivalently, we may define the signal energy-per-bit to noise
we may view movement of the operating point along a hor-
power spectral density ratio Eb /N0 interms of the ratio C/B
izontal line as trading Pe versus Eb /N0 for a fixed Rb /B.
for the ideal system as
On the other hand, we may view movement of the operat-
Eb /N0 = ( 2^(C/B) − 1 ) / (C/B)
ing point along a vertical line as trading Pe versus Rb /B
for a fixed Eb /N0 .
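The sketch below (with hypothetical bandwidth, noise density and power) evaluates the information capacity and the Eb/N0 required by the ideal system, which approaches the −1.6 dB Shannon limit as C/B → 0:

import numpy as np

B, N0 = 1e6, 1e-9            # hypothetical bandwidth (Hz) and noise PSD (W/Hz)
P = 1e-3                     # hypothetical average transmitted power (W)

C = B * np.log2(1 + P / (N0 * B))        # information capacity in bit/s
print(C)                                  # about 9.97e6 bit/s

# Eb/N0 required by the ideal system as a function of spectral efficiency C/B
for rho in (4.0, 1.0, 0.01):
    ebn0 = (2 ** rho - 1) / rho
    print(rho, 10 * np.log10(ebn0))       # tends to -1.59 dB as C/B -> 0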


7 Error-Control Coding
For a fixed Eb /N0 , the only practical option available for the mathematical structure of the code. Accordingly, these
changing data quality from problematic to acceptable is to n k bits are referred to as parity bits.
use error-control coding. Block codes in which the message bits are transmitted in
Error control for data integrity may be exercised by means of unaltered form are called systematic codes. For applications
forward error correction (FEC). The discrete source gener- requiring both error detection and error correction, the use
ates information in the form of binary symbols. The channel of systematic block codes simplifies implementation of the
encoder in the transmitter accepts message bits and adds re- decoder.
dundancy according to a prescribed rule, thereby producing
encoded data at a higher bit rate. The channel decoder in Let m0 , . . . , mk1 constitute a block of k arbitrary message
the receiver exploits the redundancy to decide which message bits. Thus we have 2k distinct message blocks. Let this se-
bits were actually transmitted. quence of message bits be applied to a linear block encoder,
For a fixed modulation scheme, the addition of redundancy in
the coded messages implies the need for increased transmis- by c0 , . . . , cn1 . Let b0 , . . . , bnk1 denote the (n k) parity
sion bandwidth. Moreover, the use of error-control coding bits in the code word.
adds complexity to the system. We may then write
ci = bi for i = 0, . . . , n−k−1,   and   ci = m(i+k−n) for i = n−k, . . . , n−1
Historically, the different error-correcting codes have been
classified into block codes and convolutional codes. The dis-
tinguishing feature for this particular classification is the The (n k) parity bits are linear sums of the k message bits,
presence or absence of memory in the encoders of the two as shown by the generalized relation
codes.
bi = Σ_{j=0}^{k−1} pji mj
7.1 Discrete-Memoryless Channels

where the coefficients are defined as follows


We may model the combination of the modulator, the wave-
form channel, and the detector as a discrete memoryless
channel. The simplest discrete memoryless channel results
pji = 1 if bi depends on mj , and 0 otherwise
from the use of binary input and binary output symbols.
The majority of coded digital communication systems em- The coefficients pij are chosen in such a way that the rows
ploy binary coding with hard-decision decoding, due to sim- of the generator matrix are linearly independent and the
plicity of implementation offered by such an approach. parity equations are unique.

We may rewrite the above equations in a compact form using


The use of hard decisions prior to decoding causes an irre-
matrix notation. We therefore define the 1 k message vec-
versible loss of information in the receiver. To reduce this
tor (or information vector) m, the 1 (n k) parity vector
loss, soft-decision coding is used. This is achieved by includ-
b, and the 1 n code vector c as follows
ing a multilevel quantizer at the demodulator output. The
modulator has only binary symbols 0 and 1 as inputs, but m = [m0 , . . . , mk1 ]
the demodulator output now has an alphabet with Q sym- b = [b0 , . . . , bnk1 ]
bols. Such a channel is called a binary input Q-ary output
c = [c0 , . . . , cn1 ]
discrete memoryless channel.
Soft-decision decoding offers significant improvement in per- We may thus rewrite the set of simultaneous equations defin-
formance over hard-decision decoding by taking a probabilis- ing the parity bits in the compact form
tic rather than an algebraic approach.
b = mP
where P is the k (n k) coefficient matrix
7.1.1 Notation
The encoding and decoding functions involve the binary
arithmetic operations of modulo-2 addition and modulo-2
multiplication.
P = [ p00        p01        . . .   p0,n−k−1
      p10        p11        . . .   p1,n−k−1
      . . .
      pk−1,0     pk−1,1     . . .   pk−1,n−k−1 ]
logic, and modulo-2 multiplication is the AND operation.
where pij is 0 or 1.
The vector c may be expressed as a partitioned row vector
7.2 Linear Block Codes in terms of the vectors m and b as follows
c = [ b   m ]
A code is said to be linear if any two code words in the code
can be added in modulo-2 arithmetic to produce a third code
word in the code. factoring out the common message vector, we get
c = m [ P   Ik ]
Consider then an (n, k) linear block code, in which k bits in
the remaining portion are computed from the message bits in
accordance with a prescribed encoding rule that determines where Ik is the k k identity matrix.


We define the k × n generator matrix as        7.2.2 Minimum Distance Considerations
Def 7.1 Hamming Distance
G = [ P   Ik ]
Consider a pair of code vectors c1 and c2 that have the same
number of elements. The Hamming distance d(c1 , c2 ) be-
which is said to be in the canonical form in that its k rows tween such a pair of code vectors is defined as the number
are linearly independent. of locations in which their respective elements differ.
The sum of any two code words is another code word. This
basic property of linear block codes is called closure. Def 7.2 Hamming Weight
The Hamming weight w(c) of a code vector c is defined as
Let H denote an (n − k) × n matrix, defined as        the number of nonzero elements in the code vector.
H = [ In−k   P^T ]
Def 7.3 Minimum Distance
The minimum distance dmin of a linear block code is defined
where PT is the transpose of the coefficient matrix. as the smallest Hamming distance between any pair of code
In modulo-2 arithmetic we have vectors in the code.
Alternatively, we may state that the minimum distance of
HGT = PT + PT = 0 a linear block code is the smallest Hamming weight of the
nonzero code vectors in the code.
and hence
cHT = mGHT = 0 The minimum distance dmin is related to the structure of the
parity-check matrix of the code in a fundamental way. Let
The matrix H is called the parity-check matrix of the code, the matrix H be expressed in terms of its columns as follows
and the set of equations specified by the above equation are h i
called parity-check equations. H = h1 h2 . . . hn

The minimum distance of a linear block code is defined by


7.2.1 Syndrome: Definition and Properties the minimum number of rows of the matrix HT whose sum
Let r denote the 1 n received vector that results from send- is equal to the zero vector.
ing the code vector c over a noisy channel. The minimum distance of a linear block code determines the
error-correcting capability of the code. The best strategy for
r=c+e the decoder is to pick the code vector closest to the received
vector r, that is, the one for which the Hamming distance
The vector e is called the error vector or error pattern. d(ci , r) is the smallest. With such a strategy, the decoder
For i = 1, . . . , n, we have will be able to detect and correct all error patterns of Ham-
ming weight w(e) t, provided that the minimum distance
ming weight w(e) ≤ t, provided that the minimum distance
ei = 1 if an error has occurred in the ith location, and 0 otherwise
of the code is equal to or greater than 2t + 1.
power to correct all error patterns of weight t or less if, and
The algorithm commonly used to perform this decoding op- only if,
eration starts with the computation of a 1 × (n − k) vector        d(ci , cj ) ≥ 2t + 1 for all ci , cj with i ≠ j
called the error-syndrome vector or simply the syndrome.
Alternatively we may state that an (n, k) linear block code
s = rHT of minimum distance dmin can correct up to t errors if, and
only if,
t ≤ ⌊(1/2)(dmin − 1)⌋
Properties of the Error-Syndrome
1. The syndrome depends only on the error pattern, and not        where ⌊·⌋ denotes the largest integer less than or equal to
on the transmitted code word. the enclosed quantity.
s = (c + e)HT = cHT + eHT = eHT
7.2.3 Syndrome Detecting
Hence, the parity-check matrix H of a code permit us to We are now ready to describe a syndrome-based decoding
compute the syndrome s, which depends only upon the scheme for linear block codes. Let c1 , . . . , c2k denote the 2k
error pattern e. code vectors of an (n, k) linear block code. Let r denote the
received vector, which may have one of 2n possible values.
2. All error patterns that differ by a code word have the The corresponding 2k subsets constitute a standard array of
same syndrome. the linear block code. To construct it, we may exploit the
linear structure of the code by proceeding as follows:
It should be noted that the set of equations is underdetermined the linear block code. To construct it, we may exploit the
in that we have more unknowns than equations. Accordingly, 1. The 2k vectors are placed in a row with the all-zero code
there is no unique solution for the error pattern. vector c1 as the left-most element.


2. An error pattern e2 is picked and placed under c1 , and a 7.3.1 Generator Polynomial
second row is formed by adding e2 to each of the remain-
Let g(X) be a polynomial of degree n k that is a factor of
ing code vectors in the first row; it is important that the
X n + 1; as such, g(X) is the polynomial of least degree in the
error pattern chosen as the first element in a row not have
code. In general, g(X) may be expanded as follows:
previously appeared in the standard array.
3. Step 2 is repeated until all the possible error patterns have
been accounted for.
g(X) = 1 + Σ_{i=1}^{n−k−1} gi X^i + X^(n−k)
For a given channel, the probability of decoding error is min- where the coefficient gi is equal to 0 or 1.
imized when the most likely error patterns (i.e., those with The polynomial g(X) is called the generator polynomial of a
the largest probability of occurrence) are chosen as the coset cyclic code. A cyclic code is uniquely determined by the gen-
leaders. In the case of a binary symmetric channel, the erator polynomial g(X) in that each code polynomial in the
smaller the Hamming weight of an error pattern the more code can be expressed in the form of a polynomial product
likely it is to occur. as follows:
We may now describe a decoding procedure for a linear block c(X) = a(X)g(X)
code:
where a(X) is a polynomial in X with degree k 1.
1. For the received vector r, compute the syndrome s =
rHT . Suppose we are given the generator polynomial g(X)
and the requirement is to encode the message sequence
2. Within the coset characterized by the syndrome s, iden-
(m0 , . . . , mk1 ) into an (n, k) systematic cyclic code.
tify the coset leader (i.e., the error pattern with the largest
Let the message polynomial be defined by
probability of occurrence); call it e0 .
m(X) = m0 + m1 X + . . . + mk−1 X^(k−1)
3. Compute the code vector
and let
c = r + e0
b(X) = b0 + b1 X + . . . + bn−k−1 X^(n−k−1)
as the decoded version of the received vector r.
We want the code polynomial to be in the form
This procedure is called syndrome decoding.        c(X) = b(X) + X^(n−k) m(X) = a(X)g(X)
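To illustrate the encoding and syndrome-decoding steps above, the sketch below uses an assumed (7,4) systematic code (a Hamming code with dmin = 3, so single errors are correctable); the matrices follow the G = [P Ik], H = [In−k P^T] construction:

import numpy as np

# Hypothetical (7,4) Hamming code in systematic form c = m [P  Ik] (mod 2)
P = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 1, 1],
              [1, 0, 1]])                       # k x (n-k) coefficient matrix
Ik, Ink = np.eye(4, dtype=int), np.eye(3, dtype=int)
G = np.hstack([P, Ik]) % 2                      # generator matrix [P  Ik]
H = np.hstack([Ink, P.T]) % 2                   # parity-check matrix [I(n-k)  P^T]

m = np.array([1, 0, 1, 1])
c = m @ G % 2                                   # transmitted code word

e = np.array([0, 0, 0, 0, 1, 0, 0])             # single-bit error pattern
r = (c + e) % 2                                 # received vector

s = r @ H.T % 2                                 # syndrome; depends only on e
# Coset-leader lookup: for a single error, s matches one column of H
err_pos = int(np.where((H.T == s).all(axis=1))[0][0])
decoded = r.copy(); decoded[err_pos] ^= 1
print(np.array_equal(decoded, c))               # True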
c(X) = b(X) + X nk m(X) = a(X)g(X)
7.2.4 Dual Code
In light of modulo-2 addition, we may write
Given a linear block code, we may define its dual as follows:        X^(n−k) m(X) / g(X) = a(X) + b(X) / g(X)
GH^T = 0
We may now summarize the steps involved in the encoding
where H^T is the transpose of the parity-check matrix of the procedure for an (n, k) cyclic code assured of a systematic
code, and 0 is a new zero matrix. This equation suggests procedure for an (n, k) cyclic code assured of a systematic
that every (n, k) linear block code with generator matrix G structure.
and parity-check matrix H has a dual code with parameters 1. Multiply the message polynomial m(X) by X^(n−k).
(n, n − k), generator matrix H and parity-check matrix G.
2. Divide X^(n−k) m(X) by the generator polynomial g(X), ob-
taining the remainder b(X).
7.3 Cyclic Codes
3. Add b(X) to X^(n−k) m(X), obtaining the code polynomial
Cyclic codes form a subclass of linear block codes. An ad-
c(X).
vantage of cyclic codes over most other types of codes is that
they are easy to encode.
A binary code is said to be a cyclic code if it exhibits two 7.3.2 Parity-Check Polynomial
fundamental properties: An (n, k) cyclic code is uniquely specified by its generator
1. Linearity property: The sum of any two code words in the polynomial g(X) of order (nk). Such a code is also uniquely
code is also a code word. specified by another polynomial of degree k, which is called
the parity-check polynomial, defined by
2. Cyclic property: Any cyclic shift of a code word in the
code is also a code word.
h(X) = 1 + Σ_{i=1}^{k−1} hi X^i + X^k
We define the code polynomial
c(X) = c0 + c1 X + c2 X + . . . + cn1 X n1 where the coefficients hi are 0 or 1.
We find that the matrix relation HGT = 0 for linear block
where X is an indeterminante. Naturally, for binary codes, codes corresponds to the relationship
the coefficients ci are 1s and 0s. Each power of X in the
polynomial c(X) represents a one-bit shift in time. g(X)h(X) mod (X n + 1) = 0
Accordingly, we may state that the generator polynomial
If c(X) is a code polynomial, then the polynomial g(X) and the parity-check polynomial h(X) are factors of
c(i) (X) = X i c(X) mod (X n + 1) the polynomial X n + 1, as shown by

is also a code polynomial for any cyclic shift i.        g(X)h(X) = X^n + 1
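The three-step systematic encoding procedure above can be sketched as follows; the generator polynomial g(X) = 1 + X + X^3 and the message bits are assumed for illustration (they correspond to a (7,4) cyclic Hamming code):

def poly_mod2(dividend, divisor):
    """Remainder of binary polynomial division over GF(2); bits are LSB first."""
    rem = dividend[:]
    for i in range(len(dividend) - 1, len(divisor) - 2, -1):
        if rem[i]:
            for j, d in enumerate(divisor):
                rem[i - (len(divisor) - 1) + j] ^= d
    return rem[:len(divisor) - 1]

# Hypothetical (7,4) cyclic code with generator polynomial g(X) = 1 + X + X^3
g = [1, 1, 0, 1]                    # coefficients g0..g3
m = [1, 0, 1, 1]                    # message bits m0..m3, i.e. m(X) = 1 + X^2 + X^3
n, k = 7, 4

shifted = [0] * (n - k) + m         # X^(n-k) m(X)
b = poly_mod2(shifted, g)           # remainder b(X) of X^(n-k) m(X) / g(X)
c = b + m                           # systematic code word: parity bits then message bits
print(c)                            # [1, 0, 0, 1, 0, 1, 1]; c(X) is divisible by g(X)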


7.3.3 Calculation of the Syndrome


Suppose the code word (c0 , . . . , cn1 ) is transmitted over a
noisy channel, resulting in the received word (r0 , . . . , rn1 ).
In the case of a cyclic code in systematic form, the syndrome
can be calculated easily.
r(X) = r0 + r1 X + . . . + rn1 X n1
Let q(X) denote the quotient and s(X) denote the remain-
der, which are the results of dividing r(X) by the generator
polynomial g(X).
r(X) = q(X)g(X) + s(X)
The remainder s(X) is a polynomial of degree n k 1 or
less, which is the result of interest. It is called the syndrome
polynomial because its coefficients make up the (n k) 1
syndrome s.
The syndrome polynomial s(X) has the following useful
properties
1. The syndrome of a received word polynomial is also the
syndrome of the corresponding error polynomial.
2. Let s(X) be the syndrome of a received word polynomial
r(X). Then, the syndrome of Xr(X), a cyclic shift of
r(X) is given by Xs(X).
3. The syndrome polynomial s(X) is identical to the error
polynomial e(X), assuming that the errors are confined
to the (n k) parity-check bits of the received word poly-
nomial r(X).


8 Multiple Access Protocols


8.1 ALOHA Throughput 8.2 Carrier Sense Multiple Access
8.1.1 Assumptions 8.2.1 1-persistent CSMA
The arrival process for new packets is a Poisson process with
A station waits until the channel becomes idle and
an arrival rate λ:
transmits the frame.
Pr[k | T ] = (λT)^k e^(−λT) / k!
If a collision occurs, the station waits a random
Offered load (new packets and retransmitted packets to- amount of time and starts over again.
gether) is Poisson distributed with channel access rate of g.

The probability of generating k frames in a given time inter- 8.2.2 Nonpersistent CSMA
val D is:
Pr[k | D] = (gD)^k e^(−gD) / k! = G^k e^(−G) / k!
If the channel is busy, the station waits a random
amount of time and then checks the channel again.
G = g D represents the average number of generated frames
per frame duration. Throughput:

8.1.2 Throughput pure eG(k)


S(k) = s(k)D = 1
S = G P0 = Ge2G G(k)+1+
1
maximum throughput of 2e = 0.184 is achived at G = 0.5.
8.2.3 P-persistent CSMA
8.1.3 Throughput slotted
This protocol only applies to slotted channels.
S = G · P0 = G e^(−G)
1
maximum throughput of e = 0.368 is achived at G = 1. If a station which has data to send detects an idle chan-
nel it trasmits with a probability p and with a proba-
If the station initially senses the channel busy, it waits
until the next slot and applies the above algorithm.
[Figure: throughput S versus offered load G (attempts per packet time) for pure ALOHA, S = G e^(−2G), and slotted ALOHA, S = G e^(−G); the stable operating points lie where the arrival rate equals the throughput S.]
8.2.4 Comparison
8.1.4 Mean Delay
Number of backlogged nodes N :
N = E_n[n] = Σ_{n=0}^{m} pn n
Average delay of a frame until successful transmission:
T = N/λ = N / ((m − N) pa )
[Figure: throughput S (per packet time) versus G (attempts per packet time), comparing pure and slotted ALOHA with 1-persistent, 0.5-persistent, 0.1-persistent, 0.01-persistent, and nonpersistent CSMA.]
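A short sketch (illustrative only) that evaluates the two ALOHA throughput expressions over a grid of offered loads and locates their maxima:

import numpy as np

G = np.linspace(0.0, 3.0, 301)            # offered load (attempts per packet time)

S_pure = G * np.exp(-2 * G)                # pure ALOHA throughput
S_slotted = G * np.exp(-G)                 # slotted ALOHA throughput

print(G[np.argmax(S_pure)], S_pure.max())         # ~0.5 and ~0.184 (= 1/(2e))
print(G[np.argmax(S_slotted)], S_slotted.max())   # ~1.0 and ~0.368 (= 1/e)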


A Line Codes B Hilbert Transform


Several line codes can be used for the electrical representa- Consider a signal g(t) with Fourier transform G(f ).
tion of a binary data stream.
Def B.1 Hilbert Transform
Unipolar nonreturn-to-zero (NRZ) signaling The Hilbert transform of g(t), which we shall denote by g(t),
is defined by
Symbol 1 is represented by transmitting a pulse of amplitude
A for the duration of the symbol, and symbol 0 is represented
ĝ(t) = (1/π) ∫_{−∞}^{∞} g(τ) / (t − τ) dτ
by switching off the pulse (also on-off signaling).
- Waste of power due to transmitted DC level. Clearly, the Hilbert transformation of g(t) is a linear opera-
- Power spectrum does not approach zero at zero tion.
frequency.
Def B.2 Inverse Hilbert Transform
Polar nonreturn-to-zero (NRZ) signaling        The inverse Hilbert transform, by means of which the original
Symbols 1 and 0 are represented by transmitting pulses of        signal g(t) is recovered from ĝ(t), is defined by
amplitude +A and −A respectively.
+ Relatively easy to generate.
g(t) = −(1/π) ∫_{−∞}^{∞} ĝ(τ) / (t − τ) dτ
- Power spectrum of the signal is large near zero
frequency.
The functions g(t) and g(t) are said to constitute a Hilbert-
transform pair.
Unipolar return-to-zero (RZ) signaling
Symbol 1 is represented by a rectangular pulse of amplitude We note from the definition of the Hilbert transform that
A and half-symbol width, and symbol 0 is represented by        ĝ(t) may be interpreted as the convolution of g(t) with the
transmitting no pulse.        time function 1/(πt). For the time function 1/(πt) we have
+ Presence of delta functions at f = 0, ±1/Tb , useful for
bit-timing recovery at the receiver.
- Requires 3 dB more power than polar return-to-zero
1/(πt)  ⇌  −j sgn(f),   with sgn(f) = 1 for f > 0, 0 for f = 0, −1 for f < 0

signaling for the same probability of symbol error.

Bipolar return-to-zero (BRZ) signaling        It follows therefore that the Fourier transform Ĝ(f ) of ĝ(t)
is given by
Positive and negative pulses of equal amplitude A are used        Ĝ(f ) = −j sgn(f ) G(f )
alternately for symbol 1, with each pulse having half-symbol
width; no pulse is always used for symbol 0.
+ Power spectrum has no DC component.
+ Relatively insignificant low-frequency components
if 0 and 1 are equiprobable.
Also called alternate mark inversion (AMI) signaling.

Split-phase (Manchester code)


Symbol 1 is represented by a positive pulse of amplitude A
followed by a negative pulse of amplitude A, with both
pulses being half-symbol wide. For symbol 0, the polarities
of these two pulses are reversed. The Manchester code sup-
presses the DC component and has relatively insignificant
low-frequency components, regardless of the signal statis-
tics.
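Relating to Appendix B, the sketch below uses scipy.signal.hilbert (which returns the analytic signal g(t) + j ĝ(t)) to verify numerically that the Hilbert transform of a cosine is the corresponding sine; the tone frequency and sample rate are arbitrary choices:

import numpy as np
from scipy.signal import hilbert

fs, fc = 1000, 50                       # sample rate and tone frequency in Hz
t = np.arange(0, 1, 1 / fs)
g = np.cos(2 * np.pi * fc * t)

g_plus = hilbert(g)                      # analytic signal g(t) + j*g_hat(t)
g_hat = np.imag(g_plus)                  # numerical Hilbert transform of g(t)

# For g(t) = cos(2*pi*fc*t) the Hilbert transform is sin(2*pi*fc*t)
print(np.max(np.abs(g_hat - np.sin(2 * np.pi * fc * t))))   # very small (edge effects aside)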


C Complex Representation of Signals and Systems


Def C.1 Pre-Envelope        2W is small compared with fc , is referred to as a narrowband
Consider a real-valued signal g(t). We define the pre- signal.
envelope, or analytical signal, of the signal g(t) as the Let the pre-envelope of a narrowband signal g(t) be expressed
complex-valued function in the form
g+ (t) = g̃(t) exp(j2πfc t)
g+ (t) = g(t) + j ĝ(t)
We refer to g̃(t) as the complex envelope of the signal.
where ĝ(t) is the Hilbert transform of g(t). We note that the spectrum of g+ (t) is limited to the fre-
quency band fc − W ≤ f ≤ fc + W .
We find that the pre-envelope is particularly useful in han- Applying the frequency-shifting property of the Fourier
dling band-pass signals and systems. transform, we find that the spectrum of the complex enve-
One of the important features of the pre-envelope g+ (t) is lope g(t) is limited to the band W f W . That is, the
the behavior of its Fourier transform. Let G+ (f ) denote the complex envelope g(t) of a band-pass signal g(t) is a low-pass
Fourier transform of g+ (t). signal.
We may express the original band-pass signal g(t) in terms
2G(f ) f > 0
of the complex envelope g(t) as follows:
G+ (f ) = G(f ) + sgn(f )G(f ) = G(0) f =0


0 f <0 g(t) = Re (g(t) exp(j2fc t))

where G(0) is the value of G(f ) at frequency f = 0. In general, g(t) is a complex-valued quantity

For a given signal g(t) we may determine its pre-envelope g(t) = gI (t) + jgQ (t)
g+ (t) in one of two equivalent ways:
where gI (t) and gQ (t) are both real-valued low-pass func-
1. We determine the Hilbert ransform g(t) of the signal g(t), tions; their low-pass property is inherited from the complex
and then use g+ (t) = g(t) + jg(t) to compute the pre envelope g(t).
envelope. The original band-pass signal may then be expressed in the
canonical, or standard, form:
2. We determine the Fourier transform G(f ) of the signal
g(t), and use G+ (f ) = G(f ) + sgn(f )G(f ) to determine        g(t) = gI (t) cos(2πfc t) − gQ (t) sin(2πfc t)
G+ (f ) and then evaluate the inverse Fourier transform of
G+ (f ) to obtain We refer to gI (t) as the in-phase component of the band-pass
Z signal g(t) and to gQ (t) as the quadrature component of the
g+ (t) = 2 G(f ) exp(j2f t) df signal.
0 Since both gI (t) and gQ (t) are low-pass signals limited to
the band W f W , they may be derived from the
Symmetrically we may define the pre-envelope for negative
band-pass signal g(t) (cf Figure below).
frequencies as
g− (t) = g(t) − j ĝ(t)
The complex envelope g(t) can be expressed in the polar
The two pre-envelopes g+ (t) and g (t) are simply the com- form 
plex conjugate of each other, as shown by        g̃(t) = a(t) exp( jφ(t) )
where a(t) and φ(t) are both real-valued low-pass functions.
g− (t) = g+*(t)
Based on this polar representation, the original band-pass
C.0.5 Canonical Representation: Band-Pass Signal signal g(t) is defined by

Consider a band-pass signal g(t) whose Fourier transform



g(t) = a(t) cos( 2πfc t + φ(t) )
G(f ) is nonnegligible only in a band of frequencies of total
extent 2W , say, centered about some frequency fc . We re- We refer to a(t) as the natural envelope of the band-pass
fer to fc as the carrier frequency. A signal, whose bandwidth        signal g(t) and to φ(t) as its phase.
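A numerical check of the canonical representation (the in-phase and quadrature components below are arbitrary low-pass tones): building g(t) = gI(t) cos(2πfc t) − gQ(t) sin(2πfc t), forming the pre-envelope with scipy.signal.hilbert and shifting by exp(−j2πfc t) recovers the complex envelope gI(t) + j gQ(t):

import numpy as np
from scipy.signal import hilbert

fs, fc = 10_000, 1_000                       # sample rate and carrier frequency (Hz)
t = np.arange(0, 0.1, 1 / fs)

gI = np.cos(2 * np.pi * 20 * t)              # hypothetical low-pass in-phase component
gQ = 0.5 * np.sin(2 * np.pi * 30 * t)        # hypothetical low-pass quadrature component

g = gI * np.cos(2 * np.pi * fc * t) - gQ * np.sin(2 * np.pi * fc * t)   # canonical form

g_plus = hilbert(g)                               # pre-envelope g(t) + j*g_hat(t)
g_tilde = g_plus * np.exp(-2j * np.pi * fc * t)   # complex envelope gI(t) + j*gQ(t)

print(np.max(np.abs(g_tilde.real - gI)), np.max(np.abs(g_tilde.imag - gQ)))  # both small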

Index
A Linear Functional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
linear receiver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Additive White Gaussian Noise Channel . . . . . . . . . . . . . . . . 9 Link Margin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Antipodal Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 Log-Likelihood Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Autocorrelation Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 AWGN Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Autocovariance Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 M
Average Probability of Symbol Error . . . . . . . . . . . . . . . . . . . . 6
Matched Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Binary Symmetric Channel . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Matched Filters
AWGN Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
B Mean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Modified Duobinary Signaling . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Baseband M-ary PAM Transmission . . . . . . . . . . . . . . . . . . . . 8
Baseband Pulse Transmission . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 N
Binary PCM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Binary Symmetric Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 White . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Bit Duration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 Nyquist Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Bit Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Nyquist Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Nyquists Criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
C
Distortionless Baseband Transmission . . . . . . . . . . . . . . . . 7
Central Limit Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
P
Class I Partial Response . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Class IV Partial Response . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 Peak Pulse Signal-to-Noise Ratio . . . . . . . . . . . . . . . . . . . . . . . 5
Complementary Error Function . . . . . . . . . . . . . . . . . . . . . . . . . 6 Power Spectral Density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Correlation Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Correlative-Level Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Pulse Code Modulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Cross-Correlation Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Q
D
quadrature component . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Discrete Memoryless Source . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
R
Discrete Pulse-Amplitude Modulation . . . . . . . . . . . . . . . . . . . 6
Duobinary Signaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Raised Cosine Spectrum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Full-Cosine Rolloff . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
E
Rolloff Factor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Einstein-Wiener-Khintchine relations . . . . . . . . . . . . . . . . . . . 2 Transmission Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 Random Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Equivalent Noise Temperature . . . . . . . . . . . . . . . . . . . . . . . . . . 3 S
Ergodic Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Sample Point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
G Sample Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Signal Energy-to-Noise Spectral Density Ratio . . . . . . . . . . 5
Gaussian Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Source-Coding Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Gaussian Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Strict Stationarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Geometric Representation of Signals . . . . . . . . . . . . . . . . . . . . 9 T
I Theorem of Irrelevance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Threshold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Ideal Nyquist Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Optimum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
in-phase component . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 Transmitted Signal Energy per Bit . . . . . . . . . . . . . . . . . . . . . . 6
Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Intersymbol Interference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 V

J Variable-Length Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

Joint Distribution Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 W

L Weakly Stationary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
White Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Likelihood Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 Wide-Sense Stationary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

