You are on page 1of 608

Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Telecommunications
Engineering II
Jorma Kekalainen

Contents

Topics include
Signals and noise
Fourier analysis
Digital transmission
Statistical and linear algebra tools for advanced
telecommunications
Multi-carrier modulation
Coding and information theory
Diversity techniques
Wireless overview

1
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Information Sources
1. Usually, the most precise sources are the original
sources, i.e. standards, recommendations or other
specifications.
You can pull them from the Internet e.g.
- ITU-T www.itu.int/ITU-T/
- IETF www.ietf.org
- 3GPP www.3gpp.org
or from elsewhere.

2. You can look for material from corresponding


courses in the Internet
3. Some may find that the books are easier to read.
4. Many slides are adapted from the following books or
lecture notes based on those books

Books
Carlson et al.: Communication Systems: An Introduction to Signals and
Noise in Electrical Communication
Haykin: Communication Systems
Haykin & van Veen: Signals & Systems
Freeman: Radio System Design for Telecommunications
Goldsmith: Wireless Communications
Murthy et al.: Ad Hoc Wireless Networks: Architectures and Protocols
Pahlavan et al.: Principles of Wireless Networks: A Unified Approach
Rappaport: Wireless Communications: Principles and Practice
Roddy: Satellite Communications
Skolnik: Introduction to Radar Systems
Stallings: Wireless Communication and Networks
Tse: Fundamentals of Wireless Communication

2
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Introduction

Communication, message and signal

3
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Basic model for telecommunication

Measure of information

4
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Transmission channels

Cables
wire pairs (e.g., ordinary telephone line)
coaxial cable
waveguide (metallic waveguide and optical fiber)
More or less free space radio transmission
broadcasting
point-to-point microwave transmission
satellite position transmission
cell networks
(Portable magnetic/electronic/optical memory equipment)

Mode of communication

5
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Telecommunication and EM spectrum

Analogue vs. digital communication

6
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Constraints

Note: Latency (one-way) is the time from


start of packet transmission to the start
of packet reception.

Reliability of communication

7
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Performance of coding

Digital communication system: Structure

Noise
Transmitted Received Received
Info. signal signal info.
SOURCE
Source Transmitter Channel Receiver User

Transmitter

Source Channel
Formatter Modulator
encoder encoder

Receiver

Source Channel
Formatter Demodulator
decoder decoder

8
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Formatting and transmission of baseband


signal
Digital info.

Textual Format
source info.
Pulse
Analog Transmit
Sample Quantize Encode modulate
info.

Pulse
Bit stream waveforms Channel
Format
Analog
info. Low-pass
Decode Demodulate/
filter Receive
Textual Detect
sink
info.

Digital info.

Digital transmission system

9
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Signals of communication system

Signals and Systems

Deterministic signals and


LTI systems

10
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Time functions

21

Continuous-time vs. discrete-time

22

11
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Continuous-valued vs. discrete-valued


signal

23

Deterministic vs. random (stochastic)

24

12
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Causal vs. anticausal vs. non-causal

25

Even and odd signals

26

13
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Continuous-time signals

27

Sine signal

28

14
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Sinusoids

29

Cosine signal

30

15
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Equivalence of sinusoidals

31

Some trigonometric identities

32

16
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Exponentials

33

Other classification

34

17
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Period

35

Complex exponentials

36

18
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Complex exponential

37

Complex exponential

38

19
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Complex exponential vs. real sinusoids

39

Complex exponential vs. real sinusoids

40

20
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Signal power and energy

41

Energy signals

42

21
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Power signals

43

Impulse function

Note: For analogue impulse


Lim {x(t)|t0 } = 44

22
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Impulse

45

Dirac-delta function

46

23
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Application

47

Advanced definition

48

24
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Step function

49

Another definition

50

25
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Step function

51

Application

52

26
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Step response and masking

53

Synthesis

54

27
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Signum function

55

Rectangular pulse

56

28
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Rectangular pulse

Also notation:
57

Basic operations on signals

58

29
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Amplitude scaling

59

Examples

60

30
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Time scaling

61
Note: Downsampling is sometimes called decimation

Examples

62

31
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Time shifting

63

Examples

64

32
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Reflection

65

LTI systems

66

33
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

System

System (e.g. electric network) is specified by:


the functional description of the system blocks,
the interconnection rules between system blocks,
and
the topology.

67

Example: Basic blocks

68

34
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Basic blocks

69

System: Definition

70

35
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Basic system concepts

71

Memoryless components and systems


The output of the system vo(t0) at time t0 depends only on the input at the
same time vi(t0).

Voltage divider: Discrete gain block:

72

36
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Systems and components with


memory
The system output at t0 depends on past values of the input t t0.

Capacitor: Inductor: Unit delay:

73

Basic system concepts

74

37
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Linear, time-invariant systems

75

Linearity and superposition


Analog signals and systems

a1 aN are any arbitrary constants, and x1(t) xN(t) are any arbitrary continuous-
time signals.

Discrete signals and systems

a1 aN are any arbitrary constants, and x1(n) xN(n) are any arbitrary discrete-
time signals

76

38
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

LTI system

77

Linearity condition

78

39
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Linearity condition

79

Note

80

40
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Principle of superposition

81

Time-invariance

82

41
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Impulse response

83

84

42
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Impulse response

85

Example

86

43
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Response of LTI system

87

Convolution

88

44
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Convolution

89

Convolution

90

45
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Impulse response

91

Causal system

92

46
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Causal LTI system

93

Causal LTI system

94

47
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Time-invariance and causality

95

Cascade LTI systems

96

48
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Parallel LTI systems

97

Examples: Interconnections of
systems/components
Series: e.g. transmitter-channel-receiver

Parallel: e.g. multiple antennas Feedback: e.g. control systems (phase-


locked loop)

98

49
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Properties of LTI systems

99

Step response

100

50
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

System models: Impulse response

101

System models: Frequency response

102

51
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

System models: Differential equation


Example:

103

System models: Difference equation


Example:

104

52
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Fourier Analysis

Eigenfunctions of LTI system

106

53
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Complex exponential input

107

Complex exponential as an
eigenfunction

108
Note: Only one frequency, namely f0.

54
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Frequency function = eigenvalue

Note: Of course, actually H(f0) is the value of the corresponding function


109
H(f) at some fixed frequency f0.

Sinusoids are eigenfunctions of LTI


systems

110

55
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Frequency content

111

Fourier transform

112

56
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Fourier transform pair

Note:

113

Frequency content of sinusoid

114

57
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Frequency content of sinusoid

115

Frequency content of sinusoid

116

58
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Spectrum of sinusoid

117

Positive and negative frequency

118

59
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Frequency content of the rectangular


pulse

119

Frequency content of the rectangular


pulse

120

60
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Frequency content of the rectangular


pulse

121

Frequency content of the rectangular


pulse

122

61
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Spectrum function value at DC

Note: 123

Discrete-Time Fourier Series (DTFS =


DFT)

124

62
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Fourier Series (FS)

125

Discrete-Time Fourier Transform


(DTFT)

126

63
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Fourier Transform (FT)

127

F-representations

128

64
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Existence of F-representations

129

Gibbs effect

130

65
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Gibbs effect

131

Properties of F-representations

132

66
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Properties of F-representations

133

Properties of F-representations

134

67
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Note

135

Major properties of the F-transform

136

68
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Major properties of the F-transform

137

Major properties of the F-transform

138

69
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Major properties of the F-transform

139

F{convolution}

140

70
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

y=xh XH=Y

141

Power and energy

142

71
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Energy

143

Parsevals formula

144

72
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Power spectrum

145

Some useful F-transform pairs

146

73
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Some useful F-transform pairs

147

Some useful F-transform pairs

148

74
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Some useful F-transform pairs

149

Some useful F-transform pairs

150

75
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Modulation

151

Modulation

152

76
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Periodic extension

153

Transform of the periodic extension

154

77
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Properties of the discrete-time F-


series

Note: The DTFS is also called the Discrete Fourier Transform (DFT)
155

Properties of the discrete-time F-


transform

156

78
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Properties of the F-series

157

Properties of the F-transform

158

79
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Frequency response and filtering

159

Impulse response Frequency response

(y=xh XH=Y)

160

80
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Frequency response

161

Frequency function impulse function

Note: Previously presented A(f0)=|H(f0)| and (f0)=arg{H(f0)} are the162


values of the corresponding functions at the frequency f0.

81
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Distortionless system

163

Distortionless system

164

82
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Distortion and dispersive system

165

Ideal filters

166

83
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Ideal filters
Ideal filters are physically unrealizable, in the sense that their
characteristics cannot be achieved with a finite number of
elements.

For example. the impulse response of an ideal filter is a sinc-


function, which is infinite long and noncausal.

The filters used in practical real time applications must be causal, that is

h(t)=0 for t<0


167

Realizable filters

168

84
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Real filters and their ideal


counterparts

169

Equalizers
If we know the frequency response of a system then, given the
output, we can discover the input using the equation

H(j)0

To build an inverse system, we must design it so that its frequency


response is H(j)-1
Inverse systems are required in communications, where they are
known as equalizers.
The system to be inverted is the channel.
Ideal equalizers, like ideal filters, cannot be realized.
A communications channel introduces a delay that cannot be
undone!
Even distortionless equalizers are generally unrealizable.

Example: Loading coils on telephone lines. Inductors are placed in shunt across the line
every km or so to improve frequency flatness over voice frequencies.

85
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Equalization
If the transfer function of the channel that causes linear
distortion is known, it can be compensated by the inverse
transfer function.

If the transfer functions of the channel and equalizer are


Hc(f) and Heq(f), respectively, then the transfer function of
the overall transmission system is

H(f) = Hc(f)Heq(f)

Channel Equalizer

171

Quadrature filter and Hilbert transform

172

86
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Quadrature filter and Hilbert transform

173

Simple example of Hilbert transform


The simplest and most obvious Hilbert transform
pair follows directly from the phase shift property
of the quadrature filter.
Specifically, if

then a phase shift of -90 produces

This calculation can be generalized to any signal that


consists of a sum of sinusoids.
Most other Hilbert transforms involve performing the
174
convolution operation.

87
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Correlation functions and spectral


densities

175

Introduction to correlation and spectral


density
Here we study signals using the time average and signal power or
energy.

Taking the Fourier transform of a correlation function leads to


frequency-domain representations in terms of spectral density
functions.

Spectral densities allow us to deal with a broader range of signal


models, not necessarily Fourier transformable (e.g., random
signals).

So far and here we consider deterministic signals whose


behaviour is known for all possible times.

Note: Random signals occur in communication systems both as


unwanted noise and as desired information-bearing signals
176

88
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Correlation of energy signals

177

Energy cross-spectral density and


autocorrelation

178

89
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: PRN (Pseudo Random Number)


code correlation
Correlation between PRN1 and PRN1

Correlation between PRN1 and PRN2 PRN1 PRN2

Energy spectral density

Note:

180

90
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Correlation of power signals

Note:

181

Power spectral density

182
Spectral density function G(f) represents the distribution of the power or energy in the
frequency domain. The area under G(f) equals the average power or total energy.

91
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

W-K theorem and power cross-spectral


density

183

Properties of correlation and spectral


density

Note: Spectral Function of frequency 184


Power Density Integration gives power

92
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Correlation and spectral density


properties of I/O-signals

185

Periodic and discrete-time signals

186

93
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Inner product
The inner product for two m 1 complex vectors

is given by

Similarly, we define the inner product of two (possibly


complex-valued) signals s(t) and r(t) as follows:

The inner product obeys the following linearity property

where a1, a2 are complex-valued constants, and s1, s2, r are


signals (or vectors).

The complex conjugate of a vector or row vector x, denoted as x*, is obtained by taking the
complex conjugate of each element of x. The Hermitian of a vector x, denoted as xH, is187
its
conjugate transpose: xH = (x)T.

Energy and norm

The energy Es of a signal s is defined as its


inner product with itself:

where s denotes the norm of s.

If the energy of s is zero, then s must be


zero.

188

94
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Matched filter


For a complex-valued signal s(t), the matched filter is defined as a
filter with impulse response sMF(t) = s(t).
Note that SMF(f) = S(f).
If the input to the matched filter is x(t), then the output is given by

The matched filter, therefore, computes the inner product between


the input x and all possible time translates of the waveform s.
In particular, the inner product <x, s> equals the output of the matched
filter at time 0.
For example, if x(t)=s(tt0) (i.e., the input is a time translate of s), then
the magnitude of the matched filter output is maximum at t=t0.
We can, then, intuitively see how the matched filter would be useful,
for example, in delay estimation using peak picking.

189

Matched filter for a complex-valued


signal

190

95
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Autocorrelation function
The inverse Fourier transform of the energy spectral density Es(f) is
termed the autocorrelation function Rs(), since it measures how closely
the signal s matches delayed versions of itself.
Since |S(f)|2 =S(f)S(f)=S(f)SMF(f), where sMF(t)=s(t) is the matched
filter for s introduced earlier.
We therefore have that

Thus, Rs() is the outcome of passing the signal s through its matched
filter, and sampling the output at time , or equivalently, correlating the
signal s with a complex conjugated version of itself, delayed by .
While the preceding definitions are for finite energy deterministic
signals, we revisit these concepts in the context of finite power random
processes later.

191

Bandpass signals and complex


baseband representation

192

96
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Real signals

Many signals in communication systems are


real bandpass signals with a frequency
response that occupies a narrow bandwidth
2B centered around a carrier frequency fc
with 2B << fc, as illustrated in figure.

Bandpass signal S(f)

193

Real signals

The bandwidth 2B of a bandpass signal is roughly


equal to the range of frequencies around fc where the
signal has non-negligible amplitude.
Bandpass signals are commonly used to model
transmitted and received signals in communication
systems.
These are real signals since the transmitter circuitry
can only generate real sinusoids (not complex
exponentials) and the channel just introduces an
amplitude and phase change at each frequency of the
real transmitted signal.

194

97
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Conjugate symmetry
Since bandpass signals are real, their frequency response has
conjugate symmetry, i.e. a bandpass signal s(t) has

|S(f)| = |S(f)| and S(f) = S(f).

However, bandpass signals are not necessarily conjugate


symmetric within the signal bandwidth about the carrier
frequency fc, i.e. we may have

|S(fc+f)| |S(fcf)| or S(fc+f) S(fcf)

for some f B.
This asymmetry in |S(f)| is illustrated in the previous figure.
Bandpass signals result from modulation of a baseband signal by
a carrier, or from filtering a deterministic or random signal with
a bandpass filter.
195

Bandpass signal

196

98
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Bandpass signal

197

Canonical representation

198

99
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Direct-conversion modem

199

Polar decomposition

200

100
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Bandpass and baseband equivalent system

201
A baseband communication refers to a system that does not include modulation.

Bandpass and lowpass (baseband)


equivalent

202

101
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Baseband vs. passband signals


A signal s(t) is said to be baseband if

for some W>0.


That is, the signal energy is concentrated in a band around DC.
Similarly, a channel modeled as a linear time-invariant system is
said to be baseband if its transfer function H(f) satisfies the
previous equation.
A signal s(t) is said to be passband if

where fc>W>0.
A channel modeled as a linear time-invariant system is said to be
passband if its transfer function H(f) satisfies the previous
equation
203

Baseband vs. passband signals

The spectrum S(f) for a real- The spectrum S(f) for a real-valued passband
valued baseband signal. The signal. The bandwidth of the signal is B.
bandwidth of the signal is B. The figure shows a frequency fc within the
band in which S(f) is nonzero. Typically, fc is
much larger than the signal bandwidth B.204

102
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Complex baseband representation


We often employ passband channels, which means that we must
be able to transmit and receive passband signals.

However, all the information carried in a real-valued passband


signal is contained in a corresponding complex-valued baseband
signal.

This baseband signal is called the complex baseband


representation or complex envelope of the passband signal.

This equivalence between passband and complex baseband has


profound practical significance, since the complex envelope can
be represented accurately in discrete time using a much smaller
sampling rate than the corresponding passband signal sp(t),
205

Complex baseband representation


Modern communication transceivers can implement complicated signal
processing algorithms digitally on complex baseband signals, keeping the
analog processing of passband signals to a minimum.

Thus, the transmitter encodes information into the complex baseband


waveform using encoding, modulation and filtering performed using
digital signal processing (DSP).

The complex baseband waveform is then upconverted to the


corresponding passband signal to be sent on the channel.

Similarly, the passband received waveform is downconverted to complex


baseband by the receiver, followed by DSP operations for
synchronization, demodulation, and decoding.

This leads to a modular framework for transceiver design, in which


sophisticated algorithms can be developed in complex baseband,
independent of the physical frequency band that is ultimately employed
for communication.
206

103
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Time domain representation of a passband


signal
Any passband signal sp(t) can be written as
(1)

where sc(t) (c for cosine) and ss(t) (s for sine)


are real-valued signals, and fc is a frequency
reference typically chosen in the band occupied by
Sp(f).

The factor of 2 is included only for convenience in


normalization, and is often omitted as previously in
the canonical or standard representation of the
passband signal in terms of baseband signals.
207

I- and Q-components
The waveforms sc(t) and ss(t) are also referred to as the in-phase (or I)
component and the quadrature (or Q) component of the passband signal
sp(t), respectively.

The complex envelope, or complex baseband representation of sp(t) is


now defined as

(2)

We can rewrite (1) as

(3)

Note: To check (3), plug in (2) and Eulers identity on the right-hand side (3) 208
to obtain the expression (1).

104
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Envelope and phase of a passband


signal
The complex envelope s(t) can also be represented in polar form,
defining the envelope e(t) and phase (t) as

(4)

Plugging

into (3), we obtain yet another formula for the passband signal s:

(5)

The equations (1), (3) and (5) are three different ways of expressing
the same relationship between passband and complex baseband in the
time domain.
209

Orthogonality of I and Q channels

The passband waveform

corresponding to the I component, and the passband


waveform

corresponding to the Q component, are orthogonal.

That is,

210

105
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Upconversion and downconversion


Equation (1) immediately tells us how to upconvert from baseband
to passband.

To downconvert from passband to baseband, consider

The first term on the right-hand side is the I component, a baseband


signal.
The second and third terms are passband signals at 2fc, which we can
get rid of by lowpass filtering.
Similarly, we can obtain the Q component by lowpass filtering

211

Upconversion and downconversion

212

106
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Sampling and PCM

Digital communication system: Structure

Noise
Transmitted Received Received
Info. signal signal info.
SOURCE
Source Transmitter Channel Receiver User

Transmitter

Source Channel
Formatter Modulator
encoder encoder

Receiver

Source Channel
Formatter Demodulator
decoder decoder

107
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Formatting and transmission of baseband


signal
Digital info.

Textual Format
source info.
Pulse
Analog Transmit
Sample Quantize Encode modulate
info.

Pulse
Bit stream waveforms Channel
Format
Analog
info. Low-pass
Decode Demodulate/
filter Receive
Textual Detect
sink
info.

Digital info.

Introduction

216

108
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Sampling and pulse modulation


Mathematical functions and electric signals are frequently displayed as
continuous curves.
A smooth curve drawn can be displayed by using samples that have
sufficiently close spacing.
When the samples are represented as, e.g., voltage pulses, we obtain a
discrete-time signal.
In information transmission, we can use samples instead of a continuous
time signal.
Instead of CW-modulation methods, we can use pulse modulation methods.
A digital signal is obtained by representing the discrete samples as discrete
number of amplitude values (quantization), usually by using the binary
system.
Number of bits determines the number of discrete values.
In digital transmission systems are used digital modulation methods.

217

Sampled signals

218

109
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Analog and digital amplitude modulations

219

Why digital communications?

220

110
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Why digital communications?

221

Regenerative repeater in digital


communications

222

111
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Digital vs. analog

223

Digital transmission system

224

112
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Schematic diagram of a PCM coder


decoder

225
A paralleltoserial (P/S) converter

Sampling concepts

Continuous-time signal

Discrete-time signal obtained


by uniform sampling

Digital signal represented


as discrete sample values
226

113
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Periodic signals and F-transform

227

Impulse train

228

114
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Periodic sampling

229

Reconstruction of signals from samples

230

115
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Perfect reconstruction

231

Example: Aliased reconstruction


(undersampled)

232

116
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Sampling theorem

233

Reconstruction as interpolation

234

117
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Reconstruction as interpolation

235

Bandlimited interpolation

236

118
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Nyquist sampling theorem

237

Time domain interpretation

238

119
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Frequency domain interpretation

239

Example: Sampling of sinusoid

240

120
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Determination of sampling frequency


from signal waveform

241

Sampling with pre-filtering

242

121
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Reconstruction of continuous signal from


samples

243

Bandwidth of signal
Baseband versus bandpass:

Baseband Bandpass
signal signal
Local oscillator

Bandwidth dilemma:
Bandlimited signals are not realizable!
Realizable signals have infinite bandwidth!
244

122
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Bandwidth of signal: Approximations


Different definition of bandwidth:
a) Half-power bandwidth d) Fractional power bandwidth
b) Noise equivalent bandwidth e) Bounded power spectral density
c) Null-to-null bandwidth f) Absolute bandwidth

(a)
(b)
(c)
(d)
245
(e)50dB

Sampling of analog signals


Time domain Frequency domain
xs (t ) = x (t ) x(t ) X s ( f ) = X ( f ) X ( f )
x(t )
| X( f )|

x (t ) | X ( f ) |

xs (t )
| Xs( f )|

246

123
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Aliasing effect & Nyquist rate

LP filter

Nyquist rate

aliasing

247

Undersampling & aliasing in time domain

248

124
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Sampling theorem
Analog Sampling Pulse amplitude
signal process modulated (PAM) signal

Sampling theorem: A bandlimited signal with no spectral components


beyond , can be uniquely determined by values sampled at uniform
intervals of

The sampling rate (Nyquist rate) is

In practice, it is need to sample faster than this because the receiving filter
will not be sharp.

249

Sampling theorem
Statement: Any signal with a bandwidth of W can be completely reconstructed if it is
sampled at a rate of 2W.

Original
waveform
samples
Capacitor
discharges

Capacitor
charges

Thus by sampling first at the transmitter and then passing the samples through a ideal
LPF the original waveform can be completely reconstructed 250

125
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Effect of under sampling

Original Incorrectly
signal at 3 Hz Samples at less reconstructed
sampling rate at 2 Hz signal at 1 Hz

Thus when any wave is sampled at a frequency that is less than double
the maximum signal frequency, the recovered wave will not be of the same
frequency as the input waveform. This distortion is called aliasing .

The sampling frequency has to be adjusted such that fs > 2fm 251

Pulse modulation

252

126
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Pulse modulation

Signal

PAM

PWM

PPM
253

Encoding (PCM)

Pulse code modulation (PCM): Encoding the


quantized signals into a digital word (PCM word or
codeword).
Each quantized sample is digitally encoded into an l bits
codeword where L in the number of quantization levels and

254

127
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Digitizing analog speech


PAM & PCM
PAM (Pulse Amplitude Modulation).
The amplitude of a train of pulse is varied
according to the amplitude of the analog signal
(modulating signal)

PCM (Pulse Code Modulation).


The analog signal modulates a train of pulse (PAM).
In effect the analog signal is sampled and the
samples are coded to a binary value which is a
function of the amplitude of the sampled analog
signal

PCM = PAM + QUANTIZATION + COMPANDING 255

Schematic diagram of a PCM coder


decoder

256
A paralleltoserial (P/S) converter

128
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Pulse code modulation


PCM was developed by Reeves in 1937
PCM (ADPCM) is the preferred method of
communication within the PSTN

PCM is a type of coding that is called waveform coding


because it creates a coded form of the original voice
waveform.

PCM is a waveform coding method defined in the ITU-T


G.711 specification.

257

Example: Quantization and Pulse Code


Modulation (PCM)
Quantization
Quantizing error or noise
Approximations mean it is impossible to recover
original exactly
4 bit system divides amplitude range to 16 levels
8 bit sample gives 256 levels
8-bit quality comparable with analog speech
transmission in PSTN
8000 samples per second of 8 bits each gives
64kbps = digital speech channel
258

129
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Diagrammatic representation of 3 bit PCM


4
CODE No. 7 3.5
3
CODE No. 6 2.5
2
CODE No. 5 1.5
1
CODE No. 4 0.5
0
CODE No. 3 -0.5
-1
CODE No. 2 -1.5
-2
CODE No. 1 -2.5
-3
CODE No. 0 -3.5
-4

SAMPLE VALUE 0.0 3.35 1.75 - 0.25 -1.4 -2.3 -3.5


NEAREST Q LEVEL 0.5 3.5 1.5 -0.5 -1.5 -2.5 -3.5
QUANT ERROR +0.5 +0.15 -0.25 +0.25 +0.1 +0.2 0.0
CODE NUMBER 4 7 5 3 2 1 0 259
ENCODED BITS 100 111 101 011 010 001 000

Pulse Code Modulation (PCM)

260

130
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Single channel simplex PCM transmission


system
Quantization
Companding

261

PCM block diagram

pulse pulse code


amplitude modulated
modulated signal
sampling
(pam) signal (pcm)
clock

quantizer digitized
sampling voice
and
circuit signal
compander

analog
voice voice band
signal filter

262

131
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Quantization

263

Memoryless quantization

264

132
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Uniform quantizer

265

Example: Uniform quantizer

266

133
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Uniform quantization
Amplitude quantizing: Mapping samples of a continuous
amplitude waveform to a finite set of amplitudes.
Out

In
Average quantization noise power
Quantized

Signal peak power


values

Signal power to average


quantization noise power

267

Example
Derive quantization noise for uniform quantization in case of
signal x[-1,1] and the number of quantization levels is M.

Quantization error is

where x is the exact value of the sample and xj is corresponding


quantized value.

The mean square value of the quantized error is

where j is quantization interval and p(x) pdf of x


268

134
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Quantization levels xj and non-uniform


quantization intervals j

0-level

269

Example

If Max{j}<< then

In whole range

270

135
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example
Note:

If Max{j} << then

So

P(xj) is the probability of


the level xj

271

Example

If j= = constant j, then

Note x[-1,1], so =2/M


and

where M is the number of levels and n is the number of codeword bits.

Note that when M is doubled the mean square of quantization error or


quantization noise power is reduced to one quarter. 272

136
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: SNRqdB
Derive SNRqdB for uniform quantization in case of uniformly distributed
and normalized signal |x(t)|1 and the number of quantization levels is M.

The average signal power is

For uniformly distributed signal |x(t)|1

273

Example: SNRqdB
So

and

Because B=nW, for binary system and uniform signal

where B is channel bandwidth and W is signal bandwidth.

In PCM SNR increases exponentially as a function of channel bandwidth.


274

137
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Compressor + expander = compander

275

Quantization error
Quantizing error: The difference between the input and
output of a quantizer e(t ) = x (t ) x (t )

Process of quantizing noise


Quantizer
Model of quantizing noise
y = q ( x)
AGC x(t ) x (t )
x(t ) x (t )
x
e(t )

+
e(t ) =
x (t ) x (t )
276

138
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Non-linear quantization

Use of equal quantization intervals throughout the entire


dynamic range of an input analog signal (for low & high
energy signals) results in:
Low level signals have a low SNRq.
High level signals have a high SNRq.

Most voice signals are of low levels.


Thus, efficient way to improve voice signal quality at
lower signal levels is use a non-uniform (non-linear)
quantization process

277

Non-linear PCM and companding


Speech signals consist predominantly of small amplitude signals and the large
amplitude signals occur with much smaller probability.
Hence it is logical that the smaller amplitudes are quantized with more precision.
This means that the step size is maintained small for the region where the signal
amplitude is small.
Correspondingly, the step size for the large signals are made large.
This will of course result in large quantization error in case of large signals.
But this is tolerable since the large signal amplitudes do not occur very often.
This process of varying the step size during the encoding process is called
compressing and the corresponding receiver will do expanding to reverse the
distortion introduced at the encoder.
This process is called companding.
This type PCM is known as non-linear or logarithmic PCM.

278

139
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Compressor and expander

279

Non-uniform quantization
It is done by uniformly quantizing the compressed signal.
At the receiver, an inverse compression characteristic, called
expansion is employed to avoid signal distortion.

compression+expansion companding

y = C (x) x
x(t ) y (t ) y (t ) x (t )

x y
Compress Quantize Expand
Transmitter Channel Receiver
280

140
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Non-linear quantization
The voltage range between the lowest level and the
highest level is divided into segments in a non-linear
manner logarithmic

The lower the voltage levels, the smaller the range of


a segment.

The range of a segment gets larger for higher voltage


levels

The number of steps for each segment is the same

281

Companding
During the companding process, input analog signal
samples are compressed into logarithmic segments and
then each segment is quantized and coded using uniform
quantization.

Companding (compression and expansion) increases SNR


performance (minimize quantization noise) while keeping
the number of bits used for quantization constant.

282

141
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Speech compression

=255

Log.
Lin.

A=87.6

283

Piecewise linearized A-curve


Number of the
Equation of the segment Segment
segment

284

142
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Piecewise linearized A-curve


The logarithmic part is replaced with piecewise linear segments

285

Transfer characteristics of a compander


Vo
Compression
CODE 1111

CODE 1101

1.2
Vi
1.2

CODE 0010

CODE 0000

286
expanding

143
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Differential quantizier

287

1-point DPCM

288

144
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Baseband transmission
To transmit information through physical channels, PCM
sequences (codewords) are transformed to pulses (waveforms).
Each waveform carries a symbol from a set of size M.
Each transmit symbol represents k =log2 M bits of the PCM words.
PCM waveforms (line codes) are used for binary symbols (M=2).

M-ary pulse modulation are used for non-binary symbols


(M>2). Eg: M-ary PAM.
For a given data rate, M-ary PAM (M>2) requires less bandwidth than
binary PCM.
For a given average pulse power, binary PCM is easier to detect than M-
ary PAM (M>2).

289

Example: 8-ary PAM vs. binary


PAM

290

145
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Binary PAM and 4-ary PAM

Binary PAM 4-ary PAM


(rectangular pulse) (rectangular pulse)

3B
A.
11
1 B
T
T T 01
T -B 00 T T

0 10
-A. -3B

291

Other PCM waveforms: Examples

PCM waveforms category:

Nonreturn-to-zero (NRZ)
Return-to-zero (RZ)

+V 1 0 1 1 0 +V 1 0 1 1 0
NRZ -V Manchester -V

Unipolar-RZ +V Miller +V
0 -V
+V +V
Bipolar-RZ 0 Dicode NRZ 0
-V -V
0 T 2T 3T 4T 5T 0 T 2T 3T 4T 5T

292

146
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

PCM waveforms: Selection criteria

Criteria for comparing and selecting PCM


waveforms:
Spectral characteristics (power spectral
density and bandwidth efficiency)
Bit synchronization capability
Error detection capability
Interference and noise immunity
Implementation cost and complexity

293

Summary: Baseband formatting and


transmission
Digital info. Bit stream Pulse waveforms
(Data bits) (baseband signals)
Textual Format
source info.
Pulse
Analog Sample Quantize Encode modulate
info.

Sampling at rate Encoding each q. value to


f s = 1 / Ts l = log 2 L bits
(sampling time=Ts) (Data bit duration Tb=Ts/l)

Quantizing each sampled Mapping every m = log 2 M data bits to a


value to one of the symbol out of M symbols and transmitting
L levels in quantizer. a baseband waveform with duration T

Information (data- or bit-) rate: Rb = 1 / Tb [bits/sec]


Symbol rate : R = 1 / T [symbols/sec]
Rb = mR 294

147
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Codec

sampling
quantizing
encoding

Analog Digital
signal signal
Sampler Quantizer Encoder

295

Typical digital passband transmitting


system

M U
Analog Digital O P
signal C signal D C
O U O RF signal
D TDM L N
E Digital A V
C Base Band T E
O R
R T

296

148
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Appendix: Sampling and


Quantization

A more detailed review

297

Ideal (or impulse) sampling

298

149
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Illustration of ideal sampling

299

Spectrum of the sampled waveform

300

150
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Reconstruction of m(t)

301

Bandlimited interpolation

302

151
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Sampling theorem
A signal having no frequency components above W Hertz is
completely described by specifying the values of the signal at
periodic time instants that are separated by at most 1/(2W)
seconds.

303

Natural sampling

304

152
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Illustration of natural sampling

305

Signal reconstruction in natural sampling

306

153
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Flat-top sampling

307

Spectrum of ms(t) in flat-top sampling

308

154
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Equalization

309

Pulse modulation

310

155
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

PWM & PPM waveforms with a sinusoidal


message

311

Quantization

312

156
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Memoryless quantization

313

Uniform quantizer

314

157
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Input and output of a midrise uniform


quantizer

315

Signal-to-quantization Noise Ratio (SNRq)

316

158
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Signal-to-quantization Noise Ratio (SNRq)

317

Signal-to-quantization Noise Ratio (SNRq)

318

159
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Optimal quantizer

319

Optimal quantizer

320

160
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example of optimal quantizer design

321

Lloyd-Max conditions and iterative


algorithm

322

161
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Robust quantizers

323

-law and A-law companders

324

162
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

SNRq of non-uniform quantizers

325

SNRq of non-uniform quantizers

326

163
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

SNRq of -law compander

327

SNRq of -law compander

328

164
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

8-bit quantizer for the Gaussian-


distributed message

One sacrifices performance for larger input power levels to obtain a 329
performance that remains robust over a wide range of input levels.

SNRq with 8-bit -law quantizer (L = 256,


= 255)

330

165
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Differential quantizers

331

Linear predictor

332

166
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Normal equations (or the Yule-Walker


Equations)

333

Reconstruction of m[n] from the


differential samples

334

167
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Reconstruction of m[n] from the


differential samples

335

Intersymbol Interference
(ISI)
Pulse shaping and
equalization

336

168
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Symbols and signals

337

Digital pulse amplitude modulation

338

169
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Digital pulse amplitude modulation

339

Symbols and bits

340

170
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Bandwith constraint

341

Inter-Symbol Interference (ISI)

342

171
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Attenuation and dispersion effects: ISI

Inter-symbol interference (ISI)

343

Inter-Symbol Interference (ISI)

Inter-Symbol interference (ISI) seems to be an


unavoidable phenomenon of both wired and wireless
communication systems.
Sent

Received

344

172
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Ideal and sent shape


Next figure shows a data sequence, 1,0,1,1,0, which we wish to send.
This sequence is in form of square pulses.
Square pulses are nice as an abstraction but in practice they are hard to
create and also require far too much bandwidth.
So we shape them as shown in the dotted line.
The shaped version looks essentially like a square pulse and we can quickly
tell what was sent even visually.
Advantage of (an arbitrary) shaping at this point is that it reduces
bandwidth requirements and can actually be created by the hardware.

345

Symbols are spread by the medium


Next figure shows each symbol as it is received.
We can see what the transmission medium creates a tail of energy that lasts
much longer than intended.
The energy from symbols 1 and 2 goes all the way into symbol 3.
Each symbol interferes with one or more of the subsequent symbols.
The circled areas show areas of large interference.

346

173
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Received vs. transmitted signal


Next figure shows the actual signal seen by the receiver.
It is the sum of all these distorted symbols.
Compared to the dashed line that was the transmitted signal, the received
signal looks quite different.
The receiver actually sees the value of the amplitude at the timing instant
only (the little yellow dot in the picture).
Notice that for symbol 3, this value is approximately half of the transmitted
value, which makes this particular symbol more susceptible to noise and
incorrect interpretation and this phenomena is the result of this symbol
delay and smearing.

347

ISI
This spreading and smearing of symbols such that the energy
from one symbol effects the next ones in such a way that the
received signal has a higher probability of being interpreted
incorrectly is called Inter-Symbol-Interference or ISI.
ISI can be caused by many different reasons.
It can be caused by filtering effects from hardware or frequency
selective fading, and from non-linearity effects.
Communication system designs for both wired and wireless
nearly always need to incorporate some way of controlling ISI.

348

174
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

ISI effects: Band-limited filtering of


channel
ISI due to filtering effect of the
communications channel (e.g. wireless
channels)
Channels behave like band-limited filters

H c ( f ) = H c ( f ) e j c ( f )

Non-constant amplitude Non-linear phase

Amplitude distortion Phase distortion


349

Inter-Symbol Interference (ISI)


ISI appears in the detection process due to the filtering
effects of the system
Overall equivalent system transfer function

H ( f ) = Ht ( f )H c ( f )H r ( f )
creates echoes and hence time dispersion
causes ISI at sampling time
ISI effect

z k = sk + nk + i si
ik

350

175
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Inter-symbol interference (ISI): Model


Baseband system model
x1 x2
{xk } Tx filter Channel r (t ) Rx. filter
zk
{xk }
ht (t ) hc (t ) hr (t ) Detector
t = kT
T Ht ( f ) Hc ( f ) Hr ( f )
x3 T n(t )
Equivalent model

x1 x2
{xk } Equivalent system
h(t )
z (t ) zk
{xk }
Detector
t = kT
T H( f )
x3 T n (t )
filtered noise
H ( f ) = Ht ( f )H c ( f )H r ( f )
351

What can we do about ISI?


The main problem is that energy, which we wish to confine to one symbol, leaks
into others.
So one of the simplest things we can do to reduce ISI is to just slow down the
signal.
Transmit the next pulse of information only after allowing the received signal has
damped down.
The time it takes for the signal to die down is called delay spread, whereas the
original time of the pulse is called the symbol time.
If delay spread is less than or equal to the symbol time then no ISI will result.
Slowing down the bit rate was the main way ISI was controlled on the old days.
Of course, in our march for ever higher bit rates, slowing down the data rate is an
easy but an unacceptable solution.
Nowadays digital electronic allows us to do signal processing controlling ISI and
transmission speeds increase accordingly.

352

176
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Pulse shaping

353

Pulse shaping to reduce ISI

Goals and trade-off in pulse-shaping


Reduce ISI
Efficient bandwidth utilization
Robustness to timing error (small side lobes)

354

177
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Why not sinc pulse

355

Vestigial-symmetry theorem and raised


cosine pulse

356

178
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Pulse shaping
The main tool used to counter ISI is pulse shaping.
How can pulse shaping help control ISI?
The secret lies in the digital demodulation process used.
When the timing pulse slices the signal to determine the value
of the signal at that instant, it does not care what the signal
looked like before or after it.
So if there was some way we could keep the symbols from
interfering in such a way that they do not affect the amplitude
at the slicing instant, we can counter ISI successfully.

357

Only sampling moments are important

Look at the wildly bouncing signal below.


However, the receiver only sees the points at the timing pulses (shown
below the signal) and rest of the variation has no effect.
So as long at these points, we can reduce the effect of adjacent symbols,
thats all we need to do to mitigate the effect.

We only care about


what the signal does
at the moment of
sampling. What it
does in between is
unimportant.
358

179
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Nyquist bandwidth constraint


Nyquist bandwidth constraint (on equivalent system):
The theoretical minimum required system bandwidth to
detect Rs [symbols/s] without ISI is Rs/2 [Hz].
Equivalently, a system with bandwidth W=1/2T=Rs/2 [Hz]
can support a maximum transmission rate of 2W=1/T=Rs
[symbols/s] without ISI.

1 R R
= s W s 2 [symbol/s/Hz]
2T 2 W
Bandwidth efficiency, R/W [bits/s/Hz] :
An important measure in DCs representing data
throughput per Hz of bandwidth.
Showing how efficiently the bandwidth resources are used
by signaling techniques. 359

Equivalent system: Ideal Nyquist pulse


(filter)
Ideal Nyquist filter Ideal Nyquist pulse
H( f ) h(t ) = sinc(t / T )
T 1

0 f 2T T 0 T 2T t
1 1
2T 2T
1
W= 360
2T

180
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Starting case: Square pulse shape


We will start by looking at the use of a square pulse.
It is an intuitive shape and we want to see what if anything is wrong with
using it.
Lets define some terms
Ts = symbol time, 1 second in the example below.
Rs, the symbol rate is inverse of symbol time, Rs = 1/ Ts.
R is directly related to bandwidth such that larger the symbol rate, the more
bandwidth is required.

The square pulse in time-domain 361

Spectrum of the square pulse

Since the symbol time is 1 second, the symbol rate is


1 symbol per second.
The frequency response of this square pulse (its
Fourier transform) is given by the equation

where Ts = symbol time (1 sec)

362

181
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Square pulse sinc function

The frequency
response of the
square pulse is a
sinc function.

Lowpass
bandwidth is one
half of the
bandpass case.

363

Lowpass and bandpass bandwidth

In the previous case, the symbol time is 1 second.


The symbol rate hence is also equal to 1.
The frequency response of the square pulse is in the shape of a
sinc function (sinx/x).
It has a maximum value of AT and it crosses the zero at
integer multiples of R.
The lowpass bandwidth which is defined as the distance from
origin to the first zero crossing, is equal to twice the symbol
rate 1 Hz.
The bandpass case is twice that.
This lowpass, bandpass business may be confusing but we can
understand it better if we realize that bandwidth is always
measured on the positive axis.

364

182
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Square pulses and corresponding


frequency functions

The effect of square


pulse symbol times
and their frequency
response

365

Pulse bandwidth

A narrow pulse has a wide frequency response.


A wide pulse has smaller bandwidth.
For each pulse, the bandwidth which we measure is
only on the positive half and is equal and its symbol
rate in Hz.
The important thing to note at this point is that a
square pulse of symbol rate R has a bandwidth of R
Hz (for bandpass signal it is twice that.)

366

183
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

An important relationship

Bandwidth of a square pulse


= R for lowpass signals,
= 2R for bandpass.
The frequency response of the square pulse goes on
forever.
This is not a good thing, because it would interfere
with others and interference is not allowed by the
authorities.

367

Disadvantages of the square pulse

1. The ideal square pulse is difficult to create in time


domain because of rise time and a decay time.
2. Its frequency response goes on forever and decays
slowly.
The second lobe is only 13 dB lower than the first
one.
3. It is very sensitive to ISI.

368

184
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Duality

If a square pulse gives us a sinc function in the


frequency domain, then we could use a sinc function
as a pulse shape in time domain and get a square
wave frequency response.
We could use a pulse that is shaped like a sinc
function instead of a square pulse and get that very
nice boxcar spectrum, with nothing spilling outside
the bandwidth.

369

Example

A sequences of bits (1011) by shaping the bits as sinc


pulses

370

185
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Comparison between square and sinc pulse

Using the sinc pulse cuts the bandwidth requirement


to one-half compared to the square pulse.

371

Nyquist bandwidth
The bandwidth achieved by the sinc pulse is called the Nyquist
bandwidth.
It requires only 1/2 Hz per symbol.
Can we find something even better?
It turns out that we have not been able to find any other shape
that can improve on this.
It is an ultimate limit for perfect reconstruction of the signal.
Band-limited spectrum in frequency domain with no energy
going to waste and small total bandwidth requirement seems to
be great!
But not so great however, because a sinc pulse is actually no
more possible to build than is a square pulse.

372

186
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Disadvantages of sinc pulse


1. In time domain a true sinc pulse is of infinite length with tails
extending to infinity so the energy can theoretically continue
to add up even after the signal has ended.
We can only design an approximation to the real sinc pulse of
a finite length.
But truncation leads to an imperfect pulse that does not have a
true sinc pattern and allows ISI to leak in.
2. The pulse tails that fall in the adjacent symbols decay at the
rate of 1/x so if there is some error in timing, this pulse is not
very forgiving.
It requires near-perfect timing to achieve decent performance.

373

Nyquist pulses (filters)

Nyquist pulses (filters):


Pulses (filters) which result in no ISI at the sampling time.
Nyquist filter:
Its transfer function in frequency domain is obtained by
convolving a rectangular function with any real even-
symmetric frequency function
Nyquist pulse:
Its shape can be represented by a sinc(t/T) function
multiply by another time function.
Example of Nyquist filters: Raised-Cosine filter
374

187
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Raised Cosine filter


Raised-Cosine filter
A Nyquist pulse (No ISI at the sampling time)

1 for | f |< 2W0 W


| f | +W 2W0
H ( f ) = cos 2 for 2W0 W <| f |< W
4 W W0
0 for | f |> W
cos[2 (W W0 )t ]
h(t ) = 2W0 (sinc(2W0t ))
1 [4(W W0 )t ]2
W W0
Excess bandwidth: W W Roll-off factor r =
W0
0 r 1
0

375

Raised Cosine (RC) filter: Nyquist pulse


approximation

| H ( f ) |=| H RC ( f ) | h(t ) = hRC (t )


1 r=0 1

r = 0.5
0.5 0.5 r =1
r =1 r = 0.5
r =0

1 3 1 0 1 3 1 3T 2T T 0 T 2T 3T
T 4T 2T 2T 4T T

Rs
Baseband W sSB= (1 + r ) Passband W DSB= (1 + r ) Rs
2
376
r = roll-off factor

188
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Raised cosine
Nyquist offered ways to build (realizable) shapes that had the
same good qualities as the sinc pulse and less of the
disadvantages.
One class of pulses he proposed are called the raised cosine
pulses.
They are really a modification of the sinc pulse.
The sinc pulse has a bandwidth of W0, where W0 is specified
as
W0 = 1/2T
The raised cosine pulses have an adjustable bandwidth which
can be varied from W0 to 2W0.
We want to get as close to W0, which is called the Nyquist
bandwidth.

377

Roll-off factor
The factor r related the achieved bandwidth to the ideal bandwidth W0 as

W W0
Roll-off factor r =
W0
0 r 1
where W0 is Nyquist bandwidth, and W is the utilized bandwidth.
The factor r is called the roll-off factor.
It indicates how much bandwidth is being used over the ideal bandwidth.
The smaller this factor, the smaller bandwidth and the more efficient the
scheme.
The percentage over the minimum required W is called the excess
bandwidth.
It is 100% for roll-off of 1.0 and 50% for roll-off of 0.5.

378

189
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Roll-off

The alternate way to express the utilized bandwidth is

The typical roll-off values used for communications


range from .2 to .4.
Obviously we want to use as small a roll-off as
possible, since this gives the smallest bandwidth.

379

Sinc and cosine parts


How the class of raised cosine pulse is defined in time
domain?
Time domain presentation includes product of two parts
The first part is the sinc pulse.
The second part is a cosine correction applied to the sinc pulse
to make it behave better.
The bandwidth is now adjustable.
It can be anywhere from 1/2 Rs to Rs.
It is greater than the Nyquist bandwidth by a factor (1+ r).
For r = 0, the impulse response of RC filter reduces to the sinc
pulse.

380

190
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Raised cosine impulse response

Roll-off
factor r=

381

Frequency response of the raised cosine


pulses of Rs = 1

Roll-off
factor r=

382

191
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example of pulse shaping


Raised Cosine pulse at the output of matched filter
Amp. [V]

Baseband received waveform at


the matched filter output
(zero ISI)

t/T

383

Root-raised cosine
The whole raised cosine can be applied at once at the transmitter but in
practice it has been found that concatenating two filters each with a root
raised cosine response (called split-filtering) works better.
So to implement the raised cosine response, we split the filtering in two
parts to create a matched set.
In frequency domain, we take the square root of the frequency response
hence the name root-raised cosine.
Split filtering of raised cosine response, a root-raised cosine filter at the transmitter and
one at the receiver, giving a total response of a raised cosine.

384

192
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Square-root raised cosine pulse

385

Impact of AWGN only

386

193
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Square-root raised cosine pulse

387

Example of pulse shaping


Square-root Raised-Cosine (SRRC) pulse shaping
Amp. [V]

Baseband tr. Waveform

Third pulse

t/T
First pulse
Second pulse

Data symbol

388

194
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Monitoring transmission quality using eye


diagram

389

Eye diagram

The optimum sampling time


corresponds to the
maximum eye opening.
ISI at that time partially = Sensivity to timing
closes the eye and thereby error
reduces the noise margin.
If synchronization is
derived from the zero-
crossings, as it usually is,
zero-crossing distortion
produces jitter and results
in non-optimum sampling 390
times.

195
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Eye pattern
Eye pattern: Display on an oscilloscope which sweeps
the system response to a baseband signal at the rate
1/T (T symbol duration) the superposition of
successive symbol intervals
Distortion
due to ISI
Noise margin
amplitude scale

Sensitivity to
timing error

Timing jitter
time scale 391

Example of eye pattern: BPAM, SRRC pulse

Perfect (ideal) channel (no noise and no ISI)

392

196
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example of eye pattern: BPAM, SRRC pulse

AWGN (Eb/N0=20 dB) and no ISI

393

Example of eye pattern: BPAM, SRRC pulse

AWGN (Eb/N0=10 dB) and no ISI

394

197
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example of eye pattern with ISI:


BPAM, SRRC pulse
Distorted (non-ideal) channel and no noise
hc (t ) = (t ) + 0.7 (t T )

395

Example of eye pattern with ISI:


BPAM, SRRC pulse
AWGN (Eb/N0=20 dB) and ISI
hc (t ) = (t ) + 0.7 (t T )

396

198
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example of eye pattern with ISI:


Binary-PAM, SRRC pulse
AWGN (Eb/N0=10 dB) and ISI
hc (t ) = (t ) + 0.7 (t T )

397

Multipath: Power-delay profile


Power

path-1
path-2
path-3
multi-path path-2
propagation
Path Delay

path-1

path-3
Mobile Station (MS)
Base Station (BS)

Channel Impulse Response:


Channel amplitude |h| correlated at delays .
Each tap value at kTs Rayleigh distributed
(actually the sum of several sub-paths)

398

199
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Power delay profile (WLAN/indoor)

399

Multipath: Time-dispersion => frequency


selectivity
The impulse response of the channel is correlated in
the time-domain (sum of echoes)
Manifests as a power-delay profile, dispersion in channel
autocorrelation function A()
Equivalent to selectivity or deep fades in the
frequency domain
Delay spread:
~ 50ns (indoor) 1s (outdoor/cellular).
Coherence Bandwidth:
Bc = 500kHz (outdoor/cellular) 20MHz (indoor)
Implications: High data rate: symbol smears onto the
adjacent ones (ISI).

400

200
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Multipath intensity profile

Multipath
effects
~ O(1s)

401

Doppler: Non-stationary impulse response

Set of multipaths
changes ~ O(5 ms)

402

201
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Doppler: Dispersion (frequency) => time-selectivity


The Doppler power spectrum shows dispersion/flatness ~ Doppler
spread (100-200 Hz for vehicular speeds)
Equivalent to selectivity or deep fades in the time domain
correlation envelope.
Each envelope point in time-domain is drawn from Rayleigh
distribution. But because of Doppler, it is not IID, but correlated for
a time period ~ Tc (correlation time).
Doppler Spread: Ds ~ 100 Hz (vehicular speeds at 1GHz)
Coherence Time: Tc = 2.5-5ms.
Implications: A deep fade on a tone can persist for 2.5-5 ms!
Closed-loop estimation is valid only for 2.5-5 ms.
Note: A collection of random
variables is independent and
identically distributed (IID) if
each random variable has the
same probability distribution as
the others and all are mutually
independent. White noise is an
example of IID.

403

Time-varying (fading) channel impulse response


Note: IID refers to
sequences of
random variables.
"Independent and
identically
distributed"
implies an
element in the
sequence is
independent of
the random
variables that
came before it.

Note 1: At each tap, channel gain |h| is a Rayleigh distributed r.v.. The
random process is not IID.
Note 2: Response spreads out in the time-domain (), leading to inter-
symbol interference and deep fades in the frequency domain:
frequency-selectivity caused by multi-path fading
Note 3: Response completely vanish (deep fade) for certain values of t:
Time-selectivity caused by doppler effects (frequency-domain 404
dispersion/spreading)

202
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Dispersion-selectivity duality I

405

Dispersion-selectivity duality II

406

203
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Fading terminology
Flat fading: no multipath ISI effects.
E.g. narrowband, indoors
Frequency-selective fading: multipath ISI effects.
E.g. broadband, outdoor.

Slow fading: no doppler effects.


E.g. indoor WiFi home networking
Fast fading: doppler effects, time-selective channel
E.g. cellular, vehicular

Broadband cellular + vehicular => Fast + frequency-


selective
407

Inter-Symbol-Interference (ISI) due to


multipath fading
Transmitted signal:

Received Signals:
Line-of-sight:

Reflected:

The symbols add up


on the channel Delays
Distortion!

Multipath Radio Channel

204
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

What is an equalizer?

We use it for music in everyday life!


Eg: default settings for various types of
music to emphasize bass, treble etc
Essentially we are setting up a (f-domain)
filter to cancel out the channel multipath
filtering effects

409

Equalization

Step 1 waveform to sample transformation Step 2 decision making

Demodulate & Sample Detect

z (T ) Threshold m i
r (t ) Frequency Receiving Equalizing
comparison
down-conversion filter filter

For bandpass signals Compensation for


channel induced ISI

Received waveform Baseband pulse


Baseband pulse Sample
(possibly distorted)
(test statistic)

410

205
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Equalization: Channel is a LTI filter


ISI due to filtering effect of the
communications channel (e.g. wireless
channels)
Channels behave like band-limited filters

H c ( f ) = H c ( f ) e j c ( f )

Non-constant amplitude Non-linear phase

Amplitude distortion Phase distortion


411

Pulse shaping and equalization principles


No ISI at the sampling time

H RC ( f ) = H t ( f ) H c ( f ) H r ( f ) H e ( f )

Square-Root Raised Cosine (SRRC) filter and Equalizer


H RC ( f ) = H t ( f ) H r ( f )
Taking care of ISI
H r ( f ) = H t ( f ) = H RC ( f ) = H SRRC ( f ) caused by tr. filter

1
He ( f ) = Taking care of ISI
Hc ( f ) caused by channel

Equalizer: enhance weak frequencies, dampen strong frequencies to flatten


the spectrum
Since the channel Hc(f) changes with time, we need adaptive equalization,
i.e. re-estimate channel & equalize 412

206
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Equalization: Slow fading channel


Example of a (somewhat) frequency selective, slowly changing (slow
fading) channel for a mobile user

413

Equalization: Fast fading channel


Example of a highly frequency-selective, fast changing (fast fading)
channel for a mobile user

414

207
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Equalizing filters
Baseband system model
a1
a (t kT ) Tx filter
k Channel r (t ) Equalizer Rx. filter z (t ) z k {a k }
k ht (t ) hc (t ) he (t ) hr (t ) Detector
t = kT
Ta a Ht ( f ) Hc ( f ) He ( f ) Hr ( f )
2 3
n(t )

Equivalent model H ( f ) = H t ( f )H c ( f )H r ( f )
a1
a (t kT )
k
Equivalent system z (t ) x(t ) Equalizer z (t )
zk {ak }
k h(t ) he (t ) Detector
t = kT
Ta a H( f ) He ( f )
2 3 n (t )
filtered (colored) noise
n (t ) = n(t ) hr (t )
415

Equalizer types

416

208
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Recursive Least Squares (RLS) filters

The Recursive least squares (RLS) adaptive filter is


an algorithm which recursively finds the filter
coefficients that minimize a weighted linear least
squares cost function relating to the input signals.
This in contrast to the least mean squares (LMS)
algorithms that aim to reduce the mean square error.
In the derivation of the RLS, the input signals are
considered deterministic, while for the LMS they are
considered stochastic.
Compared to most of its competitors, the RLS
exhibits extremely fast convergence.
However, this benefit comes at the cost of high
computational complexity, and potentially poor
tracking performance when the filter to be estimated
(the "true system") changes.
417

Filter coefficients
The idea behind RLS filters is to minimize a cost function C by
appropriately selecting the filter coefficients wn , updating the
filter as new data arrives.
The error signal e(n) and desired signal d(n) are defined in the
negative feedback diagram below:

The error implicitly depends on the filter coefficients through


the estimate :
418

209
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Cost function
The weighted least squares error function C the cost function
we desire to minimize being a function of e(n) is therefore
also dependent on the filter coefficients:

where 0<1 is the "forgetting factor" which gives exponentially


less weight to older error samples.

419

Linear equalizer
A linear equalizer effectively inverts the channel.

n(t)
Equalizer
Channel
1
Hc(f) Heq(f)
Hc(f)

The linear equalizer is usually implemented as a


tapped delay line.
On a channel with deep spectral nulls, this equalizer
enhances the noise (both signal and noise pass thru equalizer).

poor performance on frequency-selective


fading channels
420

210
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Noise enhancement with spectral nulls

421

Decision Feedback Equalizer (DFE)


DFE
n(t)
x(t) ^
x(t)
Forward +
Hc(f)
Filter -

Feedback
Filter

The DFE determines the ISI from the previously detected


symbols and subtracts it from the incoming symbols.
This equalizer does not suffer from noise enhancement
because it estimates the channel rather than inverting it.
The DFE has better performance than the linear
equalizer in a frequency-selective fading channel.
The DFE is subject to error propagation if decisions are
made incorrectly.
=> doesnt work well with low SNR.
Note. Optimal non-linear: MLSE (complexity grows exponentially
with delay spread) 422

211
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Equalization by transversal filtering


Transversal filter:
A weighted tap delayed line that reduces the effect of
ISI by proper adjustment of the filter taps.
N
z (t ) = c x(t n )
n= N
n n = N ,..., N k = 2 N ,...,2 N

x (t )

c N cN +1 c N 1 cN

z (t )

Coefficient
adjustment 423

Training the filter

424

212
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Transversal equalizing filter


Zero-forcing equalizer:
The filter taps are adjusted such that the equalizer output is forced to be
zero at N sample points on each side:

Adjust 1 k =0
z (k ) =
{cn }nN= N 0 k = 1,..., N

Mean Square Error (MSE) equalizer:


The filter taps are adjusted such that the MSE of ISI and noise power at
the equalizer output is minimized. (note: noise is whitened before filter)

Adjust
{c n }nN= N
[
min E ( z (kT ) ak ) 2 ]
425

Equalizer
The ideal equalizer, an exact inverse system for the channel, is almost
always unrealizable.
With enough prior information about the channel, very good
approximations can be realized.
Using filter theory, the combination of channel and equalizer can be
made a near distortionless system.
Less distortion higher order filter, more delay.
Equalization can be done at receiver, transmitter or both.
However, the channel often is not known precisely at the time of
system design and needs to be fine-tuned during operation.
One approach to simplify the design of the equalizer is to focus on
eliminating ISI, rather than a complete inverse system.
At the input to the decision device, we simply try to enforce the zero-
ISI condition
peq(t0) = 1 and peq(t0 + nD) = 0 for n = 1, 2, ....

213
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Zero-forcing equalizer
Consider a transversal-filter equalizer positioned between the receiver
filter and the decision device.

427

Zero-forcing equalizer
When the system commences operation, the values of pR(t0 - kD)
for k = -2N, ... , 2N are measured during a training phase.
The system of equations (1) is then a set of 2N+ 1 linear equations
in 2N + 1 unknowns: the tap gains c[n] easily solved.
Typically the equalizer (and the solution of the linear equa-
tions) is implemented digitally.
This approach attempts to zero out as much of the ISI as possible:
hence called a zero-forcing equalizer.
A side effect may be noise enhancement: the noise power input
to the decision device may increase.
In many scenarios, the channel response may vary with time.
In this case, we may need to periodically suspend transmission
of data while the equalizer is re-trained.
More advanced equalizers are able to update continuously or
428
track the channel.
This is called adaptive equalization.

214
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Effect on BER: AWGN only

In a Gaussian channel (no fading) BER <=> Q(S/N)


erfc(S/N)

Typical BER vs. S/N curves


BER

Frequency-selective channel
(no equalization)

Gaussian
channel Flat fading channel
(no fading)

S/N

429

Effect on BER: Flat Fading

Flat fading: BER = BER ( S N z ) p ( z ) dz


z = signal power level

Typical BER vs. S/N curves


BER

Frequency-selective channel
(no equalization)

Gaussian
channel Flat fading channel
(no fading)

S/N

430

215
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Effect on BER: ISI/Frequency selective


channel
Frequency selective fading <=> irreducible BER floor!

Typical BER vs. S/N curves


BER

Frequency-selective channel
(no equalization)

Gaussian
channel Flat fading channel
(no fading)

S/N

431

Effect on BER: using equalization

Diversity (e.g. multipath diversity) <=> improved


performance

Typical BER vs. S/N curves


BER

Gaussian Frequency-selective channel


channel (with equalization)
Flat fading channel
(no fading)

S/N

432

216
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Complexity and Adaptation

Nonlinear equalizers (DFE, MLSE) have


better performance but higher complexity

Equalizer filters must be FIR


Can approximate IIR Filters as FIR filters
Truncate or use MMSE criterion

Channel response needed for equalization


Training sequence used to learn channel
Tradeoffs in overhead, complexity, and delay
Channel tracked during data transmission
Based on bit decisions
Cant track large channel fluctuations
433

Equalization: Summary
Equalizer equalizes the channel response in frequency domain to
remove ISI
Can be difficult to design/implement,
Can get noise enhancement (linear EQs) or error propagation
(decision feedback EQs)

434

217
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Statistical Tools for


Telecommunications

Probability & Stochastic


processes

Introduction

218
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Elementary probability concepts

Experiment, sample space and event

219
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Formal definition of probability: Axioms

Probability measure: Important


properties

220
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Union Bound

A B

P(A B) P(A) + P(B)

P(A1 A2 AN) i= 1..N P(Ai)

Applications:
Getting bounds on BER (bit-error rates),
Bounding the tails of probability distributions

Experiment, outcome, probability, event


and sample space
Think of probability as modeling an experiment
E.g.: tossing a coin!
The set of all possible outcomes is the sample space:
S
Any subset A of S is an event

Classic Experiment:
Tossing a die:S = {1,2,3,4,5,6}
Any subset A of S is an event:
A = {the outcome is even} = {2,4,6}

221
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Probability of events: Axioms


P is the probability mass function if it maps each
event A, into a real number P(A), and:
i.) P ( A ) 0 for every event A S

ii.) P(S) = 1

iii.) If A and B are mutually exclusive events then,

A B
P ( A B ) = P ( A ) + P (B )

A B =

Probability of events
In fact for any sequence of pair-wise-mutually-
exclusive events, we have

A1, A2 , A3 ,... (i.e. Ai A j = 0 for any i j )

Ai A j = , and UA i =
S.
i =1

A1

A2
Ai P An = P ( An )
A
n =1 n =1
j An

222
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Conditional probability

Total probability theorem

223
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Conditional probability and


independence
P ( A | B ) = (conditional) probability that the
outcome is in A given that we know the
outcome in B
P ( AB )
P( A | B) = P (B ) 0
P (B )

Note that: P ( AB ) = P (B )P ( A | B ) = P ( A )P (B | A )

Events A and B are independent if P(AB) = P(A)P(B).

Also: P ( A | B ) = P ( A) and P (B | A ) = P (B )

Random variables

224
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Random variable as a measurement


Thus a random variable can be thought of as a
measurement (yielding a real number) on an experiment
Maps events to real numbers
We can then talk about the cdf, pdf, and define the
mean/variance and other moments

Cumulative Distribution Function (CDF or


cdf)

225
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Cumulative Distribution Function


The cumulative distribution function (CDF) for a random
variable X is

FX ( x) = P( X x) = P({s S | X ( s ) x})
Note that FX ( x ) is non-decreasing in x, i.e.

x1 x2 Fx ( x1 ) Fx ( x2 )
Also lim Fx ( x) = 0 and lim Fx ( x) = 1
x x

Plots of cdf

226
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Cumulative Distribution Function


(CDF)
1

0 .9 L o g n o rm a l(0 ,1 )
G a m m a (.5 3 ,3 )
0 .8 E x p o n e n tia l(1 .6 )
W e ib u ll(.7 ,.9 )
0 .7 P a re to (1 ,1 .5 )

0 .6
F(x)

0 .5

0 .4

0 .3

0 .2
median
0 .1

0
0 2 4 6 8 10 12 14 16 18 20
x

Emphasizes skews, easy identification of median/quartiles,


converting uniform rvs to other distribution rvs

Complementary CDFs (CCDF)


0
10

-1
10

-2
10
log(1-F(x))

-3
10 L o g n o rm a l(0 ,1 )
G a m m a (.5 3 ,3 )
E x p o n e n tia l(1 .6 )
W e ib u ll(.7 ,.9 )
-4
10 P a re to II(1 ,1 .5 )
P a re to I(0 .1 ,1 .5 )

-1 0 1 2
10 10 10 10
lo g (x )

Useful for focusing on tails of distributions:


Line in a log-log plot => heavy tail

227
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Probability Density Function (PDF or pdf)

Histogram: Plotting frequencies

Class Freq.
Count 15 but < 25 3
5 25 but < 35 5
35 but < 45 2
Frequency 4

Relative 3
frequency Bars
2
Percent
1

0 15 25 35 45 55

Lower Boundary

228
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Probability distribution function (pdf):


Continuous version of histogram

a.k.a. frequency histogram, p.m.f (for discrete r.v.)

Continuous probability density


function
Mathematical formula
Frequency
Shows all values, x, and
frequencies, f(x) (Value, Frequency)

f(x) is not probability f(x)


Properties

f (x )dx = 1
x
All X a b
(Area Under Curve)

f ( x ) 0, a x b Value

229
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Continuous-valued random variables

Thus, for a continuous random variable X, we can


define its probability density function (pdf)

Note that since FX ( x) is non-decreasing in x we


have
f X ( x) 0 for all x.

Example: Uniform random variable

230
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Gaussian (or normal) random variable

Gaussian random variable


(a) Emg (electromyography) signal

231
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Gaussian random variable


(b) Histogram and pdf fits

probability density functions (pdf)


1 .5
L o g n o rm a l(0 ,1 )
G a m m a (.5 3 ,3 )
E x p o n e n tia l(1 .6 )
W e ib u ll(.7 ,.9 )
P a re to (1 ,1 .5 )

1
f(x)

0 .5

0
0 0 .5 1 1 .5 2 2 .5 3 3 .5 4 4 .5 5
x

Emphasizes main body of distribution, frequencies,


various modes (peaks), variability, skews

232
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Functions of random variables

Numerical data properties

Central Tendency
(Location)

Variation (Dispersion)

Shape

233
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Numerical data:
Properties & measures

Numerical Data
Properties

Central
Variation Shape
Tendency
Mean Range Skew
Median Inter-quartile Range
Mode Variance
Standard Deviation

Expectation of random variables

234
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Expectation of a random variable:


E[X]
The expectation (average) of a (discrete-valued) random variable X is

X = E ( X ) = xP( X = x) = xPX ( x)
x =

Expectation
The expectation (average) of a continuous random variable X
is given by

E( X ) = xf

X ( x)dx

Note that this is just the continuous equivalent of the discrete


expectation

E ( X ) = xPX ( x)
x =

235
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Other Measures: Median and mode

Median = F-1 (0.5), where F = CDF


Aka 50% percentile element
Order the values and pick the middle element
Used when distribution is skewed
Considered a robust measure

Mode: Most frequent or highest probability value


Multiple modes are possible
Need not be the central element
Mode may not exist (e.g. uniform distribution)

236
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Indices/Measures of spread/dispersion

Why care?

You can drown in a river of average depth 15 cm!


Lesson: The measure of uncertainty or dispersion may matter more than
the index of central tendency

Expectation of random variables

237
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

ACpower = Totalpower DCpower =


variance

Variance, standard deviation, coefficient


of variation, SIQR
Variance: second moment around the mean:
2 = E[(X-)2]
Standard deviation =

Coefficient of Variation (C.o.V.) = /

SIQR= Semi-Inter-Quartile Range (used with median


= 50th percentile)
(75th percentile 25th percentile)/2

238
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Multiple random variables

Multiple random variables

239
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Multiple random variables

Multiple random variables

240
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Correlation coefficient

Covariance and correlation: Measures of


dependence
Covariance: =

For i = j, covariance = variance!


Independence => covariance = 0 (not vice-versa!)

Correlation (coefficient) is a normalized (or scaleless) form


of covariance:

Between 1 and +1.


Zero => no correlation (uncorrelated).
Note uncorrelated DOES NOT mean independent!

241
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Random vectors and sum of r.v.s

Random vector = [X1, , Xn], where Xi = r.v.


Covariance matrix:
K is an nxn matrix
Kij = Cov[Xi,Xj]
Kii = Cov[Xi,Xi] = Var[Xi]

Sum of independent r.v.s


Z=X+Y
PDF of Z is the convolution of PDFs of X and Y
Can use transforms!

Characteristic function
The distribution of a random variable X can be determined from its
characteristic function, defined as

It captures all the moments, and is related to the IFT of pdf:


We see that the characteristic function X() of X(t) is the inverse
Fourier transform of the distribution pX(x) evaluated at f = /(2). Thus
we can obtain pX(x) from X() as

This will become significant in finding the distribution for sums of random
variables.

242
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Important (discrete) random


variable: Bernoulli
The simplest possible measurement on an experiment:
Success (X = 1) or failure (X = 0).
Usual notation:

PX (1) = P( X = 1) = p PX (0) = P( X = 0) = 1 p
A discrete random variable that takes two values 1 and 0 with
probabilities p and 1-p.
Good model for a binary data source whose output is 1 or 0.
Can also be used to model the channel errors.

Bernoulli random variable

243
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Binomial random variable

Binomial distribution

P(X)
.6
.4 n = 5 p = 0.1
.2
.0 X
Mean 0 1 2 3 4 5

= E ( x ) = np
P(X) n = 5 p = 0.5
Standard Deviation .6
.4
.2
= np (1 p) .0 X
0 1 2 3 4 5

244
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Binomial distribution
Binomial can looks like skewed or normal

Depends upon
p and n !

Binomials for different p, N =20


Distribution of Blocks Experiencing k losse s out of N
Distribution of Blocks Experiencing k losses out of N

25.00%
30.00%

25.00% 20.00%

20.00%
Number of Blocks
Number of Blocks

15.00%

15.00%

10.00%
10.00%

5.00%
5.00%

0.00% 0.00%
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Num ber of Losses out of N = 20 Number of Losses out of N = 20

10% PER 30% PER


Distribution of Blocks Experiencing k losses out of N

Npq = 1.8 20.00% Npq = 4.2


As Npq >> 1, better approximated by normal 18.00%

16.00%
distribution near the mean: 14.00%

symmetric, sharp peak at mean, exponential-square


Number of Blocks

12.00%

(e-x^2) decay of tails


10.00%

8.00%

(pmf concentrated near mean) 6.00%

50% PER
4.00%

Npq = 5
2.00%

0.00%
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Num ber of Losses out of N = 20

245
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Important random variable:


Poisson
A Poisson random variable X is defined by its PMF: (limit of binomial)

x
P( X = x) = e x = 0,1, 2,...
x!
where > 0 is a constant
It can be shown that


PX ( x) = 1
x =0
and E(X) =
Poisson random variables are good for counting frequency of occurrence:
like the number of calls that arrive to a switchboard in one hour (busy
hour), or the number of packets that arrive to a router in one second.

Important continuous random variable:


Exponential
Used to represent time, e.g. until the next arrival
Has PDF e x for x 0
X f ( x) = {
0 for x < 0

for some >0


Properties:

1

0
f X ( x)dx = 1 and E ( X ) =

246
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Gaussian/Normal distribution
Normal distribution:
Completely characterized by
mean () and variance (2)

Q-function: one-sided tail of


normal pdf

erfc(): two-sided tail.


So:

Normal distribution: Why?


Uniform distribution
looks nothing like
bell shaped (Gaussian)!
Large spread ()!

CENTRAL LIMIT TENDENCY!

Sum of r.v.s from a uniform distribution after


very few samples looks remarkably normal
BONUS: it has decreasing !

247
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Central Limit Theorem


Let X1, X2, X3, , XN denote N mutually independent random variables
whose individual distributions are not known and they are not necessarily
Gaussian distributed.
The theorem establishes that the sum of the N random variables (say, X) is
a random variable which tends to follow a Gaussian (or, normal)
distribution as N .
Further,
a) The mean of the sum random variable X is the sum of the mean values
of the constituent random variables and
b) the variance of the sum random variable X is the sum of the variance
values of the constituent random variables X1, X2, X3, , XN .
The Central Limit Theorem is very useful in modeling and analyzing
several situations in the study of electrical communications.
However, one necessary condition to look for before invoking Central
Limit theorem is that no single random variable should have significant
contribution to the sum random variable.

Gaussian distribution

Rapidly dropping tail probability

Why? Doubly exponential PDF (e-z^2 term)


A.k.a: Light tailed (not heavy-tailed).
No skew
Fully specified with just mean and variance (2nd order)

248
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Error functions

Error functions

249
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Error functions

Q and Phi () functions


Phi function: CDF of standard Gaussian
Q function: Complementary CDF of standard Gaussian

250
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Q and Phi function properties

By the symmetry of the N(0,1) density, we have: Q(x) = (x)

From their definitions, we have: Q(x) = 1 (x)

Combining, we get:

Note: we can express Q and Phi functions with any real-valued arguments in terms
of the Q function with positive arguments alone.
Simplifies computation, enables use of bounds and approximations
for Q functions with positive arguments.

Q function has exponentially decaying


tails
Asymptotically tight bounds for large arguments

Very Important Conclusion: The asymptotic behavior of the Q function is

The rapid decay of the Q function implies, for example, that: Q(1) + Q(4) Q(1)

Design implications: We will use this to identify dominant events causing errors

Useful bound for analysis (and works for small arguments):

251
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Plots of Q function and its bounds


Asymptotically tight

Useful for analysis


and for small arguments

Note the rapid decay: y-axis


has log scale

Gaussian distribution

252
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Gaussian distribution

Height and spread of Gaussian can vary

Gaussian R.V.

Standard Gaussian :

Tail: Q(x)
tail decays exponentially!

Gaussian property preserved


with linear transformations:

253
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Standardized normal distribution

X
Z=
Normal Standardized normal
distribution distribution

=1

X = 0 Z
One table!

Obtaining the probability

Standardized normal
probability table (portion)
Z .00 .01 .02 =1
0.0 .0000 .0040 .0080
.0478
0.1 .0398 .0438 .0478
0.2 .0793 .0832 .0871
= 0 .12 Z
0.3 .1179 .1217 .1255 Shaded area
Probabilities exaggerated

254
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: P(X 8)

X 85
Z= = = .30
10
Normal Standardized Normal
Distribution Distribution

= 10 =1
.5000
.3821
.1179

=5 8 X =0 .30 Z
Shaded area exaggerated

Q-function:
Tail of normal
distribution

Q(z) = P(Z > z) = 1 P[Z < z]

255
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Sampling from non-normal populations

Central tendency
Population distribution
x =
= 10
Dispersion

x =
n = 50 X
Sampling distribution
Sampling with
n=4 n =30
replacement X = 5 X = 1.8

X- = 50 X

Central Limit Theorem (CLT)

As sample
size gets
large
enough
(n 30) ...

256
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Central Limit Theorem (CLT)


As sample x =
size gets n
sampling
large distribution
enough becomes
(n 30) ... almost normal.

X
x =

Comment on CLT
Central limit theorem works if original distribution are not
heavy tailed
Need to have enough samples.
E.g. with multipaths, if there is not rich enough
scattering, the convergence to normal may have not
happened yet.
Moments converge to limits.
Trouble with aggregates of heavy tailed distribution
samples
Rate of convergence to normal also varies with
distributional skew, and dependence in samples

257
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Jointly Gaussian random variables (or


Gaussian random vectors)
Multiple random variables defined on a common
probability space are also called random vectors
same probability space means we can talk about joint
distributions
A random vector is Gaussian (or the random variables
concerned are jointly Gaussian) if any linear combination
is a Gaussian random variable
These arise naturally when we manipulate Gaussian noise
Correlation of Gaussian noise
Multiple samples of filtered Gaussian noise
Joint distribution characterized by mean vector and
covariance matrix
Analogous to mean and variance for scalar Gaussian

Jointly Gaussian distribution (bivariate)

258
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Joint pdf for x = y = 1 and x,y = 0

Joint pdf for x = y = 1 and x,y = 0.95

259
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Multivariate Gaussian pdf

Mean vector and covariance matrix

Lets first review these for arbitrary random


vectors m x 1 random vector
Mean Vector (m x 1)

Covariance Matrix (m x m)

(i,j)th entry

Compact
representation

260
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Properties of covariance
Covariance unaffected when we add constants

Adding constants changes the mean but not the covariance.


So we can always consider zero mean versions of random variables when
computing covariance.

Common scenario: Mean is due to signal, covariance is due to noise.


So we can often ignore the signal when computing covariance.

Covariance is a bilinear function (i.e., multiplicative constants pull out)

Quantities related to covariance

Variance of a random variable is its covariance with itself

Correlation coefficient is the normalized covariance

261
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Mean and covariance evolve separately under


affine transformations

Mean of Y depends only on the mean of X

Covariance of Y depends only on the covariance of X (and does not


depend on the additive constant b)

Back to Gaussian random vectors

is a Gaussian random vector if any linear combination


is a Gaussian random variable
A Gaussian random vector is completely characterized by its
mean vector and covariance matrix.
Notation:

Why? Consider the characteristic function of X (which specifies


its distribution)
X ( u) = E [e j ( u1 X 1 +...+ un X n ) ] = E [e ju X ]
T

But the linear combination uT X ~ N(uT mX ,uT C X u) is a Gaussian random


variable whose distribution depends only on the mean vector and covariance matrix of X
Thus, the characteristic function, and hence the distribution, of X depends only on its
mean vector and covariance matrix.

262
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Joint Gaussian density


Exists if and only if covariance matrix is invertible. If so, is given by:

How would we compute expectation of p(X), where X is a Gaussian random


vector?
Integrating over multiple dimensions is tedious. Instead we can use Monte Carlo simulations.
--start with samples of independent N(0,1) random variables
--transform to random vector with desired joint Gaussian stats
--evaluate function
--average over runs

Often we deal with a linear combination (e.g., sample at output of a filter), which
are simply scalar Gaussian, so we do not need multidimensional integration.

Independence and uncorrelatedness

Two random variables are uncorrelated if their covariance is zero.

Independent random variables are uncorrelated

Uncorrelated random variables are not necessarily independent

Uncorrelated, jointly Gaussian, random variables are independent


Diagonal covariance matrix means inverse is also diagonal,
and joint density decomposes into product of marginals.

263
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Gaussian vectors (real-valued)


Collection of i.i.d. standard Gaussian r.v.s:

Euclidean distance from the origin to w

The density f(w) depends only on the magnitude of w, i.e. ||w||2

Orthogonal transformation O (i.e., OtO = OOt = I) preserves the magnitude of a


vector

Gaussian random vectors


Linear transformations of the standard Gaussian vector:

pdf: has covariance matrix K = AAt in the quadratic form instead of 2

When the covariance matrix K is diagonal, i.e., the component random


variables are uncorrelated. Uncorrelated + Gaussian => independence.
White Gaussian vector => uncorrelated, or K is diagonal
Whitening filter => convert K to become diagonal (using eigen-
decomposition)

Note: Normally AWGN noise has infinite components, but it is projected


onto a finite signal space to become a Gaussian vector.

264
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Complex Gaussian R.V: Circular Symmetry


A complex Gaussian random variable X whose real and
imaginary components are i.i.d. gaussian
satisfies a circular symmetry property:
ejX has the same distribution as X for any .
ej multiplication: rotation in the complex plane.
We shall call such a random variable circularly symmetric
complex Gaussian, denoted by CN(0, 2), where 2 = E[|X|2].

Complex Gaussian: Summary

265
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Complex Gaussian vectors: Summary

We will often see equations like:

Here, we will make use of the fact


that projections of w are complex Gaussian, i.e.:
h can describe the complex channel

Related distributions

X = [X1, , Xn] is Normal


||X|| is Rayleigh ( e.g., magnitude of a complex gaussian channel X1 +
jX2 )
||X||2 is Chi-Squared with n-degrees of freedom
When n = 2, chi-squared becomes exponential (e.g., power in
complex gaussian channel: sum of squares)

266
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Random processes

Interpretation of random process


A random process X(A,t) can be viewed as a function of two
variables: an event A and time t.
In the next figure there are N sample functions of time,
{Xj(t)}.
Each of the sample functions can be regarded as the output of
a different noise generator.
For a specific event Aj we have a single time function X(Aj,t)
= Xj(t) (i.e., a sample function).
The totality of all sample functions is called an ensemble.
For a specific time tk, X(A,tk) is a random variable X(tk) whose
value depends on the event.
Finally, for a specific event, A = Aj, and a specific time t=tk,
X(Aj,tk) is simply a number.
For notational convenience we often shall designate the
random process by X(t), and let the functional dependence
upon event A be implicit.

267
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Random noise process

Random sequences and random processes

268
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Random processes

Random process
A random process is a collection of time functions, or signals,
corresponding to various outcomes of a random experiment.
For each outcome, there exists a deterministic function, which is called a
sample function or a realization.

Random
variables
Real number

Sample functions
or realizations
(deterministic
function)
time (t)

269
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Examples of random processes

Examples of random processes

270
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Random process: Definition


A random process is defined by all its joint CDFs

for all possible sets of sample times

tn
t2
t0 t1

Random processes

271
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Classification of random processes

(weak sense stationarity)

Example: Stationary vs. nonstationary

The outside temperature of the house is an example of a nonstationary


random process, as the expected temperature in the summer is warmer than
in the winter.
The temperature in your refrigerator can be modelled as a stationary
random process.

272
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Stationarity
If time-shifts (any value T) do not affect its joint CDF

tn+T
tn
t2 t +T t2+T
t0 + T 1
t0 t1

Wide (Weak) sense stationarity (WSS)

Many nonstationary random processes have the property that


the mean and autocorrelation functions are independent of
time.
Such random processes are referred to as wide-sense stationary
(WSS).

Note: All strict-sense stationary random processes are also wide-


sense stationary, but not all wide-sense stationary random
processes are strict-sense stationary.
Note: All WSS Gaussian random processes are also stationary in
the strict sense.

273
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

LTI systems: WSS in WSS out

Keep only above two properties (2nd order stationarity)


Dont insist that higher-order moments or higher order joint
CDFs be unaffected by lag T

With LTI systems, we will see that WSS inputs lead to WSS
outputs,
In particular, if a WSS process with PSD SX(f) is passed through a linear
time-invariant filter with frequency response H(f), then the filter output
is also a WSS process with power spectral density |H(f)|2SX(f).

Gaussian w.s.s. = Gaussian stationary process (since it only has


2nd order moments)

Stationarity: Summary
Strictly stationary: If none of the statistics of the random
process are affected by a shift in the time origin.

Wide sense stationary (WSS): If the mean and


autocorrelation function do not change with a shift in the
origin time.

Cyclostationary: If the mean and autocorrelation function


are periodic in time.

274
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Statistical averages or joint moments

Example: Mean value or the 1st moment

275
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Mean-squared value or the 2nd


moment

Correlation

276
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Power Spectral Density (PSD) of a random


process

Time averaging and ergodicity

277
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Random processes and LTI systems

Ergodicity
Time averages = Ensemble averages
[i.e. ensemble averages like mean/autocorrelation can be computed as time-
averages over a single realization of the random process]
A random process: ergodic in mean and autocorrelation (like w.s.s.) if
and

278
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Power Spectral Density (PSD)


For deterministic signals, the power spectrum is usually found by taking the Fourier
transform of the signal. For stationary random processes, the power spectrum
(spectral density) is found by taking the Fourier transform of the autocorrelation
function.

1. SX(f) is real and SX(f) 0


2. SX(-f) = SX(f)
3. AX(0) = SX() d

Power spectrum
For a deterministic signal x(t), the spectrum is well defined: If X ( )
represents its Fourier transform, i.e., if
+
X ( ) = x(t )e jt dt ,
then | X ( ) |2 represents its energy spectrum.
This follows from Parsevals theorem since the signal energy is given by

+ +
x (t )dt = 21 | X ( ) | d = E.
2 2

Thus | X ( ) |2 represents the signal energy in the band ( , + )


| X ( )|2
X (t ) Energy in( , + )

0 t 0
+

279
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Spectral density: Summary


Energy signals:

Energy spectral density (ESD):

Power signals:

Power spectral density (PSD):

Random process:
Power spectral density (PSD):

Note: We have used f for and Gx for Sx

Properties of autocorrelation function

For real-valued (and WSS for random signals):


1. Autocorrelation and spectral density form a Fourier
transform pair RX() SX()
2. Autocorrelation is symmetric around zero RX(-) = RX()
3. Its maximum value occurs at the origin |RX()| RX(0)
4. Its value at the origin is equal to the average power or
energy

280
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Autocorrelation: Summary
Autocorrelation of an energy signal

Autocorrelation of a power signal

For a periodic signal:

Autocorrelation of a random signal

For a WSS process:

Signal transmission with linear systems


(filters)
Input Output
Linear system

Deterministic signals:

Random signals:

Ideal distortionless transmission:


All the frequency components of the signal not only arrive with an identical
time delay, but also amplified or attenuated equally.

281
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Deterministic systems with stochastic inputs

Deterministic systems

Memoryless Systems Systems with Memory

Y ( t ) = g [ X ( t )]

Time-varying Time-Invariant Linear systems


systems systems Y ( t ) = L[ X ( t )]

Linear-Time Invariant
(LTI) systems
+
X (t ) h(t ) Y ( t ) = h ( t ) X ( ) d
+
LTI system = h ( ) X ( t ) d .

282
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

LTI systems
WSS input is good enough

Noise

283
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Noise in communication systems


Noise in communcation systems is often described by a zero-mean
Gaussian random process, n(t).
This process is stationary and its PSD is flat, hence, it is called white
noise.
Observations at different times, no matter how close, are uncorrelated
(and therefore independent, since the process is Gaussian), i.i.d.
Gaussian.

[w/Hz]

Power spectral
density

Autocorrelation
function
Probability density function

White Gaussian Noise (WGN)


White:
Similar to white light contains equal amounts of all frequencies in the
visible band of EM spectrum
Power spectral density (PSD) is constant, i.e. flat, for all frequencies of
interest (from dc to 1012 Hz)
Autocorrelation is a delta function => two samples, no matter however
close, are uncorrelated.
N0/2 to indicate two-sided PSD
Zero-mean gaussian completely characterized by its variance (2)
Variance of filtered noise is finite = N0/2
Gaussian + uncorrelated => i.i.d.
Affects each symbol independently: memoryless channel
Practically: if bandwith of noise is much larger than that of the system:
white Gaussian noise approximation is good enough

Note: Colored noise exhibits correlations at positive lags

284
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Shannon capacity with AWGN

Approximation:
log2(1+x) x for small x

Application: Shannon capacity with AWGN noise:


Bits-per-Hz = C/B = log2(1+ SNR)
If we can increase SNR linearly when SNR is small (e.g.
cell-edge) we get a linear increase in capacity.

When SNR is large, of course increase in SNR gives only a


diminishing return in terms of capacity: log (1+ SNR)
C/B = log2(1+ SNR) log2(SNR) , when SNR>>

Gaussian random process


Random process: collection of random variables on a common
probability space.
(simple generalization of random vectors: instead of the number of random
variables being finite, we can have countably or uncountably many of them.)

Gaussian random process: any linear combination of samples is a


Gaussian random variable.
is a Gaussian random process
is a Gaussian random variable

for any choice of number of samples, sampling times


and combining coefficients

are jointly Gaussian


for any choice of number of samples, sampling times
and combining coefficients

285
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Characterizing a Gaussian random


process

Statistics of Gaussian random process completely specified by


mean function and autocorrelation/autocovariance function

Why? Need to be able to specify the statistics of any collection of samples.


Since these are jointly Gaussian, only need their means and covariances.

WSS Gaussian random processes are stationary.

Why?
Gaussian random process characterized by second order stats.
If second order stats are shift-invariant, then we cannot distinguish
statistically between shifted versions of the random process.

White Gaussian Noise (WGN)

Real-valued WGN: zero mean, WSS, Gaussian random process with

Sn ( f ) = N 0 /2 = 2 Rn ( ) = (N 0 /2) ( ) = 2 ( )
Two-sided PSD
(need to integrate over both positive and negative frequencies to get the power)

(need to integrate only over physical band of


One-sided PSD: N0 positive frequencies)

Complex-valued WGN: Real and imaginary components are i.i.d.


real valued WGN

286
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Why we use a physically unrealizable


noise model
WGN has infinite power
Not physically realizable
Actual receiver noise is bandlimited and has finite power
OK and convenient to assume WGN at receiver input
Receiver noise PSD is relatively flat over typical
receiver bandwidths
Receiver always performs some form of bandlimiting
(e.g., filtering, correlation), at the output of which we
have finite power
Output noise statistics with WGN as input and
bandlimited noise at input are identical
Why is WGN more convenient?
Impulsive autocorrelation function makes computation
of output second order stats much easier

Modeling using WGN


Physical baseband system: corrupted by
real-valued WGN
Replace bandlimited noise by infinite-power
WGN at input to receiver
Physical passband system: complex
envelope of passband receiver noise
modeled as complex-valued WGN
Replace bandlimited noise by infinite-power
WGN at input to receiver

287
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Modeling using WGN: the big picture

How much is N0?

Ideal receiver at room temperature

Boltzmanns constant

room temperature (usually set to 290K)

Raised this using the receiver Noise Figure


for noise figure of F dB

E.g. if F=6 dB and B=20MHz

288
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Noise power computation


Communication theory can work with signal-to-noise
ratios
But we do need absolute numbers when calculating the
link budget: need to calculate required signal power
based on the actual value of noise power

Example: B = 20 MHz bandwidth, receiver noise figure (F) of 6 dB


Noise power

Noise power in dBm

White noise process & LTI systems


White noise

Example


Example

Basic detection and estimation concepts


AWGN Channel and hypothesis testing

One of M signals sent: s_i(t), i = 1, ..., M.

Receiver has to decide between one of M hypotheses based on the received
signal, which is modeled as

  y(t) = s_i(t) + n(t)  under hypothesis H_i,

where n(t) is white Gaussian noise.

Need to learn some detection theory first, before we can deal with
this hypothesis testing problem.

Likelihood principle

Experiment:
  Pick Urn A or Urn B at random (say Urn A holds 1 black and 2 white balls,
  Urn B holds 2 black and 1 white ball, matching the likelihoods used below).
  Select a ball from that urn. The ball is black.
What is the probability that the selected urn is A?


Likelihood principle

Write out what you know!


P(Black | UrnA) = 1/3
P(Black | UrnB) = 2/3
P(Urn A) = P(Urn B) = 1/2
We want P(Urn A | Black).
Intuition: Urn B is more likely than Urn A (given that the ball is black). But
by how much?
This is an inverse probability problem.
Solution technique: Use Bayes Theorem.

Likelihood principle
Bayes manipulations:
P(Urn A | Black) = P(Urn A and Black) /P(Black)
Decompose the numerator and denominator in terms of the probabilities we know.

P(Urn A and Black) = P(Black | UrnA)*P(Urn A)


P(Black) = P(Black| Urn A)*P(Urn A) + P(Black| UrnB)*P(UrnB)

We know all these values.


P(Urn A and Black) = 1/3 * 1/2
P(Black) = 1/3 * 1/2 + 2/3 * 1/2 = 1/2
P(Urn A and Black) /P(Black) = 1/3 = 0.333
Notice that it matches our intuition that Urn A is less likely, once we have seen
black.

The information that the ball is black has CHANGED !


From P(Urn A) = 0.5 to P(Urn A | Black) = 0.333
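
The same computation as a few lines of Python (illustrative):

# Posterior for the urn example via Bayes' theorem.
p_black_given_A = 1 / 3
p_black_given_B = 2 / 3
p_A = p_B = 1 / 2

p_black = p_black_given_A * p_A + p_black_given_B * p_B   # total probability = 1/2
p_A_given_black = p_black_given_A * p_A / p_black         # Bayes' theorem
print(p_A_given_black)   # 0.333...: seeing black lowers P(Urn A) from 0.5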


Likelihood detection concepts

Hypotheses: Urn A or Urn B ?


Observation: Black
Prior probabilities: P(Urn A) and P(Urn B)
Likelihood of Black given choice of Urn: {aka forward probability}
P(Black | Urn A) and P(Black | Urn B)
Posterior Probability: of each hypothesis given evidence
P(Urn A | Black) {aka inverse probability}
Likelihood Principle (informal): All inferences depend ONLY on
The likelihoods P(Black | Urn A) and P(Black | Urn B), and
The priors P(Urn A) and P(Urn B)
Result is a probability (or distribution) model over the space of possible hypotheses.

Maximum Likelihood (intuition)


Recall:
P(Urn A | Black) = P(Urn A and Black) /P(Black) =
P(Black | UrnA)*P(Urn A) / P(Black)

P(Urn? | Black) is maximized when P(Black | Urn?) is maximized.


Maximization over the hypotheses space (Urn A or Urn B)

P(Black | Urn?) = likelihood


=> Maximum Likelihood approach to maximizing posterior probability


Maximum Likelihood

Max likelihood hypothesis: the hypothesis that has the highest (maximum)
likelihood of explaining the data observed.

Maximum Likelihood (ML) mechanics

Independent observations (like "Black"): X1, ..., Xn

Hypothesis: theta

Likelihood function: L(theta) = P(X1, ..., Xn | theta) = prod_i P(Xi | theta)
  {independence => multiply individual likelihoods}

Log-likelihood: LL(theta) = sum_i log P(Xi | theta)

Maximum likelihood: maximize LL(theta) by taking the derivative with respect
to theta, setting it to zero and solving for theta.
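
A toy numerical sketch of these mechanics (an assumed Gaussian model with
known variance; not from the notes), where the ML estimate coincides with the
sample mean:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=1000)   # data with true theta = 2

def log_likelihood(theta, x, sigma=1.0):
    # LL(theta) = sum_i log P(x_i | theta) for a N(theta, sigma^2) model
    return (-0.5 * np.sum((x - theta) ** 2) / sigma**2
            - x.size * np.log(sigma * np.sqrt(2 * np.pi)))

thetas = np.linspace(0, 4, 401)
ll = [log_likelihood(t, x) for t in thetas]
print(thetas[int(np.argmax(ll))], x.mean())     # both close to 2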


Back to urn example


In our urn example, we are asking:
Given the observed data ball is black
which hypothesis (Urn A or Urn B) has
the highest likelihood of explaining this
observed data?
Ans from above analysis: Urn B

Not just urns and balls: Detection of signal in AWGN

Hypothesis testing framework


Want to determine which of M possible hypotheses best explains an observation.
Three ingredients:
  Hypotheses H_1, ..., H_M
  Observation Y taking values in an observation space
  (assume finite-dimensional--good enough for our purpose)
  Statistical relationship between hypotheses and observation, expressed
  through the conditional densities p(y | H_i) of the observation given
  each hypothesis

Fourth ingredient needed for Bayesian hypothesis testing

Prior probabilities


Decision rule
A decision rule is a mapping from the observation space to the set of hypotheses

Can also view it as a partition of the observation space into M disjoint


regions:


[Figure: observation space partitioned into M disjoint decision regions.]

Basic Gaussian example

0 or 1 sent:
  H0 : Y ~ N(0, sigma^2)    ("0" sent)
  H1 : Y ~ N(m, sigma^2)    ("1" sent)

Conditional densities:
  p(y | H0) = (1 / sqrt(2 pi sigma^2)) exp( -y^2 / (2 sigma^2) )
  p(y | H1) = (1 / sqrt(2 pi sigma^2)) exp( -(y - m)^2 / (2 sigma^2) )

Could model a noisy sample at the output of an equalizer.


Basic Gaussian example (contd.)

Sensible rule: split the difference, i.e. decide "1" if y > m/2, else decide "0".
Would this rule make sense if we know for sure that 0 was sent?
What if we know beforehand that 0 was sent with probability 0.75?
What if the noise is not Gaussian or additive?

Need a systematic framework for deriving good decision rules.


First step: define the performance metrics of interest

Performance metrics for evaluating decision rules
Conditional error probabilities: P(e | H_i) = P(decide some H_j, j != i | H_i)

Average error probability: P_e = sum_i P(H_i) P(e | H_i)

Error probs for the sensible rule in the basic Gaussian example:

  Conditional error probs:
    P(e | H0) = P(Y > m/2 | H0) = Q( m / (2 sigma) )
    P(e | H1) = P(Y < m/2 | H1) = Q( m / (2 sigma) )

  Average error prob: P_e = Q( m / (2 sigma) ), regardless of the prior probs


Maximum Likelihood (ML) rule


Choose the hypothesis that maximizes the conditional density of the observation:

  delta_ML(y) = arg max_i p(y | H_i)
Check: Sensible rule for the basic Gaussian example is the ML rule

ML rule seems like a good idea. Is there anything optimal about it?

Minimizes error probability if all hypotheses are equally likely

Asymptotically optimal when observations can be trusted more and


more (e.g., high SNR, large number of samples).

Minimum Probability of Error (MPE) rule
(which turns out to be the Maximum A Posteriori Probability (MAP) rule)

Minimize the average probability of error (assume the prior probabilities of
the hypotheses are known).

Let's derive it. Convenient to consider maximizing the prob of a correct decision.

For any given decision rule, the decision regions Gamma_1, ..., Gamma_M
partition the observation space.

Conditional prob of correct decision: P(correct | H_i) = P(Y in Gamma_i | H_i)

Average prob of correct decision:
  P_c = sum_i P(H_i) * integral over Gamma_i of p(y | H_i) dy

Consider any potential observation y:
If we put it in the i-th decision region (y in Gamma_i), then our reward
(contribution to the prob of correct decision) is P(H_i) p(y | H_i).

MPE rule: choose i to maximize this contribution.


MPE rule (contd.)

We have derived the MPE rule to be: delta_MPE(y) = arg max_i P(H_i) p(y | H_i)

1) MPE rule is equivalent to the Maximum A Posteriori Probability (MAP) rule.

   Posterior probability of hypothesis i given the observation:
     P(H_i | y) = P(H_i) p(y | H_i) / p(y)

   Can rewrite the MPE rule as: delta_MPE(y) = arg max_i P(H_i | y)

2) MPE rule reduces to the ML rule for equal priors (P(H_i) = 1/M):
   we can drop P(H_i) from the maximization if it does not depend on i.

Likelihood Ratio Test (LRT)


For binary hypothesis testing, the MPE rule specializes to:
  decide H1 if P(H1) p(y | H1) > P(H0) p(y | H0), else decide H0.

Can rewrite as a Likelihood Ratio Test (LRT):
  L(y) = p(y | H1) / p(y | H0), compared against the threshold P(H0) / P(H1).

Often we take the log on both sides to get the Log LRT (LLRT):
  log L(y)  vs  log( P(H0) / P(H1) )

Note: Comparing the likelihood ratio to a threshold is a common feature of optimal
decision rules resulting from many different criteria--MPE, ML, Neyman-Pearson
(the radar problem, trading off false alarm vs. miss probabilities). The threshold
changes based on the criterion.
The likelihood ratio summarizes all the information relevant to the hypothesis
testing problem; that is, it is a sufficient statistic.
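
As an illustration, a small Monte Carlo sketch of the LRT/MPE rule for the
basic Gaussian example (the parameter values are arbitrary):

import numpy as np

m, sigma = 1.0, 0.5
p0, p1 = 0.75, 0.25                      # unequal priors shift the threshold
rng = np.random.default_rng(1)

sent1 = rng.random(100_000) < p1         # True = H1 sent, with prior p1
y = rng.normal(0.0, sigma, sent1.size) + m * sent1

# Log-LR for this model: log p(y|H1) - log p(y|H0) = (m*y - m^2/2)/sigma^2
llr = (m * y - m**2 / 2) / sigma**2
decide1 = llr > np.log(p0 / p1)          # MPE/MAP rule; threshold 0 gives ML
print("error rate:", np.mean(decide1 != sent1))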


Likelihood ratio for the basic Gaussian example

  log L(y) = (m y - m^2/2) / sigma^2

Compare the log LR with zero to get the sensible (ML) rule: decide H1 if y > m/2.
Note that the inequalities are reversed when m < 0.

Irrelevant statistics

Consider the following hypothesis testing problem, with a two-component observation:
  Y1 = s_i + N1   (signal plus noise under hypothesis H_i)
  Y2 = N2         (noise only)

When can we throw Y2 away without performance degradation?
That is, when is Y2 irrelevant to our decision?
Intuition for two scenarios:
If the noises are independent, Y2 should be irrelevant, since it
contains no signal contribution.
If the noises are equal (extreme case of highly correlated), Y2 is
very relevant: subtract it out from Y1 to get perfect detection!

Need a systematic way of recognizing irrelevant statistics


Irrelevant statistics

BEGIN PROOF
Conditional densities are all that are relevant. Under the given conditions,
  p(y1, y2 | H_i) = p(y1 | H_i) p(y2 | y1, H_i) = p(y1 | H_i) p(y2 | y1).
These depend on the hypothesis only through the first observation. END PROOF

Relation to sufficient statistics: f(Y) is a sufficient statistic if


Y is irrelevant for hypothesis testing with (f(Y), Y). That is, once
we know f(Y), we have all the information needed for our decision, and
no longer need the original observation Y.

Irrelevant statistics: Example

If N1 and N2 are independent (and N2 is independent of the signal), then
p(y2 | y1, H_i) = p(y2), so the theorem condition is (more than) satisfied
and Y2 is irrelevant.

This argument is applied when deriving optimal receivers for signaling in
AWGN: noise components outside the signal space can be discarded.


Big Picture: Detection under AWGN

Baseband digital link


Unipolar binary error probability

Decision threshold


Error probabilities and Q-function

Additive White Gaussian Noise (AWGN)
Thermal noise is described by a zero-mean Gaussian random process,
n(t) that adds on to the signal => additive
Its PSD is flat, hence, it is called white noise.
Autocorrelation is a spike at 0: uncorrelated at any non-zero
lag

[Figure: flat power spectral density S_n(f) = N0/2 in W/Hz (flat => white);
autocorrelation function: a spike at tau = 0 (uncorrelated at non-zero lag);
Gaussian probability density function.]
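
A discrete-time sketch of these two WGN properties (illustrative only):

import numpy as np

rng = np.random.default_rng(2)
N0 = 2.0                                   # two-sided PSD N0/2 = 1
n = rng.normal(0, np.sqrt(N0 / 2), 2**16)

print("R(0) ~", np.mean(n * n))            # ~N0/2 = 1: spike at zero lag
print("R(1) ~", np.mean(n[:-1] * n[1:]))   # ~0: uncorrelated at non-zero lag

psd = np.abs(np.fft.fft(n)) ** 2 / n.size  # periodogram estimate
print("mean PSD ~", psd.mean())            # ~N0/2 = 1: flat ("white")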


Detection of signal in AWGN


Detection problem:
Given the observation vector z, perform a mapping from z to an estimate m_hat
of the transmitted symbol m_i, such that the average probability of error in
the decision is minimized.

[Block diagram: m_i -> Modulator -> s_i -> (+ noise n) -> z -> Decision rule -> m_hat]

Binary PAM + AWGN

[Signal space: s1 = +sqrt(Eb) and s2 = -sqrt(Eb) on the basis function psi_1(t),
with conditional densities p_z(z | m1) and p_z(z | m2) centered on them.]

Signal s1 or s2 is sent; z is received.

Additive white Gaussian noise (AWGN) => the likelihoods p_z(z | m1) and
p_z(z | m2) are bell-shaped pdfs around s1 and s2.

ML detection => at any point on the z-axis, see which likelihood curve has the
higher (maximum) value and select the corresponding signal (s1 or s2):
this simplifies into a nearest-neighbor rule.


Effect of noise in signal space


The cloud falls off exponentially (gaussian).
Vector viewpoint can be used in signal space, with a random noise vector w

Maximum Likelihood (ML) detection: Scalar case

Likelihoods: p(z | u_A) and p(z | u_B), Gaussian and centered on u_A and u_B.

Assuming both symbols equally likely, u_A is chosen if p(z | u_A) > p(z | u_B).

Log-likelihood => a simple distance criterion: choose the symbol closest to z,
i.e. u_A if |z - u_A| < |z - u_B|.


AWGN detection for Binary PAM

[Signal space as before: s1 = +sqrt(Eb), s2 = -sqrt(Eb) on basis psi_1(t).]

  Pe(m1) = Pe(m2) = Q( (||s1 - s2|| / 2) / sqrt(N0 / 2) )

  PB = PE(2) = Q( sqrt(2 Eb / N0) )
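
The bit error probability formula can be checked with a short Monte Carlo
sketch (assuming antipodal signaling and the usual N0/2 noise variance):

import numpy as np
from math import erfc, sqrt

def Q(x):
    # Gaussian tail probability Q(x)
    return 0.5 * erfc(x / sqrt(2))

EbN0_dB = 6.0
EbN0 = 10 ** (EbN0_dB / 10)
Eb, N0 = 1.0, 1.0 / EbN0

rng = np.random.default_rng(3)
bits = rng.integers(0, 2, 1_000_000)
s = np.sqrt(Eb) * (2 * bits - 1)                 # s1 = +sqrt(Eb), s2 = -sqrt(Eb)
z = s + rng.normal(0, np.sqrt(N0 / 2), s.size)   # noise variance N0/2
ber = np.mean((z > 0).astype(int) != bits)       # nearest-neighbor (ML) decision
print("simulated:", ber, " theory:", Q(sqrt(2 * Eb / N0)))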

AWGN nearest neighbor detection

Projection onto the signal directions (subspace) is called matched filtering to


get the sufficient statistic
Error probability is the tail of the normal distribution (Q-function), based
upon the mid-point between the two signals


Detection in AWGN: Summary

Vector detection


Detection vs. estimation

In detection we have to decide which symbol was transmitted


sA or sB
This is a binary (0/1, true/false or yes/no) type answer,
with an associated error probability

In estimation, we have to output an estimate of a transmitted


signal h.
This estimate is a complex number, not a binary answer.
Typically, we try to estimate the complex channel h, so that
we can use it in coherent combining (matched filtering)

Estimation in AWGN: MMSE

Need: an estimate x_hat(y) of x from a noisy observation y.

Performance criterion: mean-squared error (MSE), E[ (x - x_hat(y))^2 ].

The optimal estimator is the conditional mean of x given the observation y:
  x_hat(y) = E[ x | y ]
This gives the Minimum Mean-Square Error (MMSE).

It satisfies the orthogonality property:
  the error is uncorrelated with the observation, E[ (x - x_hat) y ] = 0.

But the conditional mean is a non-linear operator.
  It becomes linear if x is also Gaussian.
  Else, we need to find the best linear approximation (LMMSE)!


LMMSE

We are looking for a linear estimate: x_hat = c y.

The best linear estimator, i.e. the weighting coefficient c, is
  c = E[x^2] / E[y^2] = sigma_x^2 / (sigma_x^2 + sigma_w^2)   (for y = x + w)

We are weighting the received signal y by the transmit signal energy as a
fraction of the received signal energy.

The corresponding error (MMSE) is
  MMSE = (1 - c) sigma_x^2 = sigma_x^2 sigma_w^2 / (sigma_x^2 + sigma_w^2)
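
A quick numerical check of the LMMSE coefficient and the orthogonality
property (a sketch with arbitrary variances, not from the notes):

import numpy as np

rng = np.random.default_rng(4)
sx2, sw2 = 2.0, 0.5
x = rng.normal(0, np.sqrt(sx2), 1_000_000)
y = x + rng.normal(0, np.sqrt(sw2), x.size)

c = sx2 / (sx2 + sw2)
err = x - c * y
print("empirical MSE:", np.mean(err**2))        # ~ sx2*sw2/(sx2+sw2) = 0.4
print("theory MMSE:  ", sx2 * sw2 / (sx2 + sw2))
print("E[err * y] ~ 0:", np.mean(err * y))      # orthogonality property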

Linear Algebra Tools for Advanced Telecommunications


What is linear and algebra?


Properties satisfied by a line through the origin (one-dimensional case):
  A directed arrow from the origin (v) on the line, when scaled by a
  constant (c), remains on the line (cv).
  Two directed arrows (u and v) on the line can be added to create a
  longer directed arrow (u + v) on the same line.

This is nothing but arithmetic with symbols!
  Algebra: generalization and extension of arithmetic.
  Linear operations: addition and scaling.

Abstract and generalize!
  Line -> vector space having N dimensions.
  Point -> vector with N components, one in each of the N dimensions
  (basis vectors).
  Vectors have: length and direction.
  Basis vectors: span or define the space and its dimensionality.
  Linear function transforming vectors -> matrix.
    The function acts on each vector component and scales it;
    add up the resulting scaled components to get a new vector!
  In general: f(cu + dv) = c f(u) + d f(v)


Vector

Think of a vector as a directed line segment in N dimensions!

  v = [a b c]^T   (a column vector with components a, b, c)

A vector has length and direction.

Basic idea: convert geometry in higher dimensions into algebra!
  Once you define a basis along each dimension (x-, y-, z-axis),
  a vector becomes an N x 1 (column) matrix: v = [a b c]^T.
  Geometry starts to become linear algebra on vectors like v!


Examples of geometry becoming algebra

Lines are vectors through the origin, scaled and translated: y = mx + c.
Intersection of lines can be modeled as addition of vectors: solution of
linear equations.
Linear transformations of vectors can be associated with a
matrix A, whose columns represent how each basis vector is
transformed.
Ellipses and conic sections:
  ax^2 + 2bxy + cy^2 = d
  Let x = [x y]^T and let A be the symmetric matrix with rows [a b] and [b c].
  x^T A x = d   {quadratic form equation for the ellipse!}
  This becomes convenient at higher dimensions.
Note how a symmetric matrix A naturally arises from such a
homogeneous multivariate equation.


Scalar vs. matrix equations

Line equation: y = mx + c
Matrix equation: y = Mx + c

Second order equations:
  x^T M x = d
  y = (x^T M x) u + M x
involve quadratic forms like x^T M x.

Vector addition: A+B


A+B = ( x1 , x2 ) + ( y1 , y2 ) = ( x1 + y1 , x2 + y2 )

[Figure: head-to-tail method to combine vectors: A + B = C.]


Scalar product: av

av = a( x1 , x2 ) = (ax1 , ax2 )


Change only the length (scaling), but keep direction fixed.

Note: a matrix operation (Av) can change length, direction and also dimensionality!


Vectors: Magnitude (length) and phase (direction)

  v = ||v|| * (v / ||v||)   (unit vector => pure direction)

Alternate representations:
  Polar coordinates: (||v||, theta)
  Complex numbers: ||v|| e^(j*theta)

Inner (dot) product: v.w or wTv

  v.w = (x1, x2).(y1, y2) = x1 y1 + x2 y2

The inner product is a SCALAR!

  v.w = ||v|| ||w|| cos(theta)

  v.w = 0  <=>  v is orthogonal to w

If the vectors v, w are columns, then the dot product is w^T v.

Inner products, norms, signal space


Signals modeled as vectors in a vector space: signal space
To form a signal space, first we need to know the inner
product between two signals (functions):
Inner (scalar) product (generalized for functions):

  <x(t), y(t)> = integral of x(t) y*(t) dt

  = cross-correlation between x(t) and y(t)

Properties of the inner product:
  <a x(t), y(t)> = a <x(t), y(t)>
  <x(t), a y(t)> = a* <x(t), y(t)>
  <x(t) + y(t), z(t)> = <x(t), z(t)> + <y(t), z(t)>

Signal space
The distance in signal space is measured by calculating the norm.
What is a norm?
Norm of a signal (generalization of length):

  ||x(t)|| = sqrt( <x(t), x(t)> ) = sqrt( integral of |x(t)|^2 dt ) = sqrt(E_x)
           = "length" of x(t)

  ||a x(t)|| = |a| ||x(t)||

Norm between two signals:

  d_{x,y} = ||x(t) - y(t)||

We refer to the norm between two signals as the Euclidean distance between them.
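
A discrete-time sketch of these definitions (sampled signals standing in for
the continuous ones):

import numpy as np

dt = 1e-3
t = np.arange(0, 1, dt)
x = np.cos(2 * np.pi * 5 * t)
y = np.sin(2 * np.pi * 5 * t)

inner = np.sum(x * np.conj(y)) * dt               # <x, y> ~ integral x(t) y*(t) dt
Ex = np.sum(np.abs(x) ** 2) * dt                  # signal energy
dist = np.sqrt(np.sum(np.abs(x - y) ** 2) * dt)   # Euclidean distance

print(inner)         # ~0: sine and cosine at the same frequency are orthogonal
print(np.sqrt(Ex))   # ~sqrt(0.5): "length" of x(t)
print(dist)          # ~1.0: sqrt(Ex + Ey) for orthogonal signals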

Example of distances in signal space

[Figure: three signal vectors s1 = (a11, a12), s2 = (a21, a22), s3 = (a31, a32)
with energies E1, E2, E3, and a received vector z = (z1, z2), in the
(psi_1(t), psi_2(t)) plane, with distances d_{s1,z}, d_{s2,z}, d_{s3,z}.]

The Euclidean distance between the signals z(t) and s_i(t):

  d_{si,z} = ||s_i(t) - z(t)|| = sqrt( (a_i1 - z1)^2 + (a_i2 - z2)^2 ),  i = 1, 2, 3

Detection in AWGN noise: pick the closest signal vector.

Bases and orthonormal bases

Basis (or axes): frame of reference.

Basis: a space is totally defined by a set of vectors; any point is a linear
combination of the basis.

Ortho-normal: orthogonal + normal
  x = [1 0 0]^T    x . y = 0
  y = [0 1 0]^T    x . z = 0
  z = [0 0 1]^T    y . z = 0

Note:
  Orthogonal: dot product is zero.
  Normal: magnitude is one.


Projections and orthogonal basis


Get the component of the vector on each axis:
dot-product with unit vector on each axis!

Note: This is what the Fourier transform does!
It projects a function onto an infinite number of orthonormal basis functions
(the phasors e^(j*omega*t), or e^(j*2*pi*n*t/T) for the Fourier series), and
adds the results up (to get an equivalent representation in the frequency domain).

CDMA codes are orthogonal, and projecting the composite received signal
on each code helps extract the symbol transmitted on that code.

Orthogonal projections: CDMA, Spread Spectrum (SS)

[Figure: spread-spectrum transmitter and receiver; the baseband spectrum is
spread by code A into the radio spectrum, with several codes A, B, C sharing
the band over time.]

Each code is an orthogonal basis vector => the signals sent are orthogonal.


Matrix

A matrix is a set of elements, organized into rows and columns:

  A = [ a  b ]
      [ c  d ]


Matrix (geometrically)
A matrix represents a linear function acting on vectors:
  Linearity (a.k.a. superposition): f(au + bv) = a f(u) + b f(v)
  f transforms the unit x-axis basis vector i = [1 0]^T to [a c]^T
  f transforms the unit y-axis basis vector j = [0 1]^T to [b d]^T
  f can be represented by the matrix A with [a c]^T and [b d]^T as columns.
    Why? f(w) = f(mi + nj) = A [m n]^T
  Column viewpoint: focus on the columns of the matrix!

  A = [ a  b ]
      [ c  d ]

[Figure: unit basis vectors [1,0]^T and [0,1]^T mapped to [a,c]^T and [b,d]^T.]

Linear functions f: rotate and/or stretch/shrink the basis vectors.


Matrix operating on vectors


A matrix is like a function that transforms the vectors of a plane.
A matrix operating on a general point transforms its x- and y-components.
System of linear equations: the matrix is just the bunch of coefficients!

  [ a  b ] [ x ]   [ x' ]        x' = ax + by
  [ c  d ] [ y ] = [ y' ]        y' = cx + dy

Vector (column) viewpoint:
  New basis vector [a c]^T is scaled by x, and added to
  new basis vector [b d]^T scaled by y,
  i.e. a linear combination of the columns of A gives [x' y']^T.

Vector spaces, dimension, span


Another way to view Ax = b, is that a solution exists for all vectors b that lie in the
column space of A,
i.e. b is a linear combination of the basis vectors represented by the columns of
A
The columns of A span the column space
The dimension of the column space is the column rank (or rank) of matrix A.

In general, given a bunch of vectors, they span a vector space.


The dimension of the space is maximal only when the vectors are linearly
independent of the others.
Subspaces are vector spaces with lower dimension that are a subset of the
original space

Note: Linear channel codes (eg: Hamming, Reed-Solomon, BCH) can be viewed as
k-dimensional vector sub-spaces of a larger N-dimensional space.
k-data bits can therefore be protected with N-k parity bits
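
To make the subspace view concrete, here is a small sketch using the (7,4)
Hamming code; the generator matrix G below is one standard systematic choice,
not taken from these notes:

import numpy as np

# Codewords form a k = 4 dimensional subspace (over GF(2)) of the N = 7 dim space.
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])

data = np.array([1, 0, 1, 1])   # 4 data bits
codeword = data @ G % 2         # 7 bits: 4 data + 3 parity
print(codeword)                 # the subspace holds 2^4 = 16 of the 2^7 = 128 vectors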


Forward Error Correction (FEC): e.g. Reed-Solomon RS(N,K)

[Figure: an RS(N,K) block of size N = K data packets + (N-K) FEC packets sent
over a lossy network; any K of the N received packets recover the K data packets.]

This is linear algebra in action: design an appropriate K-dimensional vector
sub-space out of an N-dimensional vector space.

Matrices: Scaling, rotation, identity


Pure scaling, no rotation => diagonal matrix (note: the x- and y-axes can be
scaled differently!)
Pure rotation, no stretching => orthogonal matrix Q
Identity (do nothing) matrix = unit scaling, no rotation!

Scaling:   [ r1  0  ]    maps [1,0]^T -> [r1,0]^T and [0,1]^T -> [0,r2]^T
           [ 0   r2 ]

Rotation:  [ cos(theta)  -sin(theta) ]    maps [1,0]^T -> [cos, sin]^T
           [ sin(theta)   cos(theta) ]    and  [0,1]^T -> [-sin, cos]^T


Scaling

  [ r  0 ]    a.k.a. dilation (r > 1), contraction (r < 1)
  [ 0  r ]

Rotation

  [ cos(theta)  -sin(theta) ]
  [ sin(theta)   cos(theta) ]


Reflections

Reflection can be about any line or point.
  Complex conjugate: reflection about the x-axis (i.e. flip the phase theta to -theta).
  Reflection => two times the projection distance from the line.
  Reflection does not affect magnitude.

[Figure: reflection about a line and the induced matrix.]

Orthogonal projections: Matrices

[Figure: projection matrices.]


2D translation

  P' = (x + t_x, y + t_y) = P + t

[Figure: point P translated by the vector t = (t_x, t_y) to P'.]

Basic matrix operations

Addition, subtraction, multiplication: creating new matrices (or functions).

  [ a b ]   [ e f ]   [ a+e  b+f ]
  [ c d ] + [ g h ] = [ c+g  d+h ]      Just add elements

  [ a b ]   [ e f ]   [ a-e  b-f ]
  [ c d ] - [ g h ] = [ c-g  d-h ]      Just subtract elements

  [ a b ]   [ e f ]   [ ae+bg  af+bh ]
  [ c d ] * [ g h ] = [ ce+dg  cf+dh ]  Multiply each row by each column


Multiplication

Is AB = BA? Maybe, but maybe not!

  [ a b ] [ e f ]   [ ae+bg ... ]         [ e f ] [ a b ]   [ ea+fc ... ]
  [ c d ] [ g h ] = [ ...   ... ]   but   [ g h ] [ c d ] = [ ...   ... ]

Matrix multiplication AB: apply transformation B first, and then transform
using A!
Multiplication is NOT commutative!

Note: If A and B both represent pure rotations, or both pure scalings,
they can be interchanged (i.e. AB = BA).

Multiplication as composition

[Figure: applying two transformations in different orders gives different results!]


Inverse of a matrix

Identity matrix: AI = A

      [ 1 0 0 ]
  I = [ 0 1 0 ]
      [ 0 0 1 ]

Some matrices have an inverse, such that: A A^-1 = I.
The inverse exists only for square matrices that are non-singular
(they map an N-dim space to another N-dim space bijectively).

Inversion is tricky: (ABC)^-1 = C^-1 B^-1 A^-1

Determinant of a matrix

Used for inversion. If det(A) = 0, then A has no inverse.

  A = [ a b ]       det(A) = ad - bc
      [ c d ]

  A^-1 = ( 1 / (ad - bc) ) [  d  -b ]
                           [ -c   a ]

Note: Determinant criterion for space-time code design.
A good code exploiting time diversity should maximize the minimum
product distance between codewords. The coding gain is determined by the
minimum of the determinant over codewords.


Projection: Using inner products

  p = a (a^T x),   assuming a is a unit vector: ||a||^2 = a^T a = 1

Projection: Using inner products

  p = a (a^T b) / (a^T a)

Note: the error vector e = b - p is orthogonal (perpendicular) to p,
i.e. the inner product (b - p)^T p = 0.

Orthogonalization principle: after projection, the difference or error is
orthogonal to the projection.

Note: We can use this idea to find a least-squares line that minimizes
the sum of squared errors (i.e. min e^T e).

This is also used in detection under AWGN noise to get the test statistic.
Idea: project the noisy received vector y onto the (complex) transmit vector h:
matched filter / maximal-ratio combining (MRC). See the sketch below.
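
A tiny numeric illustration of the projection and the orthogonality principle
(the values are chosen arbitrarily):

import numpy as np

a = np.array([2.0, 1.0])
b = np.array([1.0, 3.0])

p = a * (a @ b) / (a @ a)   # projection of b onto the line spanned by a
e = b - p                   # error vector
print(p)                    # [2. 1.]: (a.b)/(a.a) = 5/5 = 1, so p = a
print(e @ p)                # ~0: the error is orthogonal to the projection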

(Cauchy-)Schwarz inequality and the matched filter

Inner product (a^T x) <= product of norms (i.e. ||a|| ||x||)
  Projection length <= product of individual lengths.
  This is the Schwarz inequality!
  Equality happens when a and x are in the same direction
  (i.e. cos(theta) = 1, when theta = 0).

Application: matched filter
  Received vector y = x + w (zero-mean AWGN)
    Note: w is infinite dimensional.
  Project y onto the subspace formed by the finite set of transmitted
  symbols x: y_hat.
  y_hat is said to be a sufficient statistic for detection, i.e. we reject the
  noise dimensions outside the signal space.
  This operation is called matching to the signal space (projecting).
  Now pick the x which is closest to y_hat in distance (ML detection = nearest
  neighbor).


Receiver without matched filter

[Figure: transmitted signal vs. received signal.]

Signal + AWGN noise will not reveal the original transmitted sequence:
the noise power is high relative to the power of the desired signal (low SNR).

If the receiver were to sample this signal at the correct times, the
resulting binary message would have a lot of bit errors.


Matched filter

Consider the received signal as a vector r, and the transmitted signal vector as s
Matched filter projects the r onto signal space spanned by s (matches it)

The filtered signal can now be safely sampled by the receiver at the correct
sampling instants, resulting in a correct interpretation of the binary message.
The matched filter is the filter that maximizes the signal-to-noise ratio;
it can be shown that it also minimizes the BER. It is a simple projection
operation (see the sketch below).
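
A minimal simulation sketch (rectangular pulses and an arbitrary noise level
assumed) showing the matched filter's gain over naive single-sample detection:

import numpy as np

rng = np.random.default_rng(5)
sps = 8                                      # samples per symbol
bits = rng.integers(0, 2, 200)
pulse = np.ones(sps)                         # rectangular transmit pulse
tx = np.repeat(2 * bits - 1, sps).astype(float)
rx = tx + rng.normal(0, 2.0, tx.size)        # low SNR: raw samples unreliable

mf = np.convolve(rx, pulse[::-1])            # filter matched to the pulse
samples = mf[sps - 1 :: sps][: bits.size]    # sample at the end of each symbol
print("MF errors:   ", np.sum((samples > 0).astype(int) != bits))

naive = (rx[sps // 2 :: sps][: bits.size] > 0).astype(int)
print("naive errors:", np.sum(naive != bits))   # many more errors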

Matched filter and repetition coding

[Figure: received vector y = h x1 + w; h x1 only spans a 1-dim space, so
project y onto h/||h||. Multiply by the conjugate of h => cancel the phase!]


Symmetric, Hermitian, positive definite

Symmetric: A = A^T
  Symmetric => square matrix.
Complex vectors/matrices:
  The transpose of a vector or matrix with complex elements must involve a
  conjugate transpose, i.e. flip the phase as well.
  For example: ||x||^2 = x^H x, where x^H refers to the conjugate transpose of x.
Hermitian (for complex elements): A = A^H
  Like a symmetric matrix, but must also conjugate each element (i.e. flip
  its phase); i.e. symmetric, except for the flipped phase.
  Note: we will use A* instead of A^H for convenience.
Positive definite: symmetric, and its quadratic forms are strictly positive
for non-zero x:
  x^T A x > 0
  Geometry: bowl-shaped minimum at x = 0.


Orthogonal, unitary matrices: Rotations


Rotations and reflections: orthogonal matrices Q
  Pure rotation => changes vector direction, but not magnitude (no scaling effect).
  Retains dimensionality, and is invertible.
  The inverse rotation is simply Q^T.

Unitary matrix U: complex elements, rotation in the complex plane.
  Inverse: U^H (note: conjugate transpose).

Note:
  Gaussian noise exhibits isotropy, i.e. invariance to direction. So any
  rotation Q of a Gaussian vector w yields another Gaussian vector Qw.
  Circular symmetric (c-s) complex Gaussian vector w => a complex rotation
  with U yields another c-s Gaussian vector Uw.

Note: The Discrete Fourier Transform (DFT) matrix is both unitary and symmetric.
  The DFT is nothing but a complex rotation, i.e. the signal viewed in a basis
  that is a rotated version of the original basis.
  The FFT is just a fast implementation of the DFT. It is fundamental in OFDM.

Quadratic forms: x^T A x

Linear:
  y = mx + c generalizes to the vector equation
  y = Mx + c  (y, x, c are vectors, M = matrix)

Quadratic expressions in 1 variable: x^2
  Vector expression: x^T x (a projection!)
  Quadratic forms generalize this, by allowing a linear transformation A as well.

Multivariable quadratic expression: x^2 + 2xy + y^2
  Captured by a symmetric matrix A, and the quadratic form x^T A x.

Note: The Gaussian vector formula has a quadratic-form term in its exponent:
  exp[ -0.5 (x - mu)^T K^-1 (x - mu) ]
Similar to the 1-variable Gaussian exp( -0.5 (x - mu)^2 / sigma^2 ):
  K^-1 (the inverse covariance matrix) instead of 1/sigma^2
  A quadratic form involving the vector (x - mu) instead of (x - mu)^2

Rectangular matrices

Linear system of equations: Ax = b
  More or fewer equations than unknowns => not necessarily full rank.
  If A has full column rank, we can modify the equation as:
    A^T A x = A^T b
  Now (A^T A) is square, symmetric and invertible.
    x = (A^T A)^-1 A^T b now solves the system of equations!
  This solution is called the least-squares solution: project b onto the
  column space and then solve.
  (A^T A)^-1 A^T is sometimes called the pseudo-inverse.

Note: (A^T A) or (A* A) will appear often in communications math (MIMO). They
will also appear in the SVD (singular value decomposition).
The pseudo-inverse (A^T A)^-1 A^T will appear in decorrelator receivers for MIMO.
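
A minimal sketch of the least-squares/pseudo-inverse recipe above (the data
values are made up for illustration):

import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])     # 3 equations, 2 unknowns (fit a line y = x0 + x1*t)
b = np.array([0.1, 1.1, 1.9])

x = np.linalg.inv(A.T @ A) @ A.T @ b           # pseudo-inverse applied to b
print(x)                                       # ~[0.13, 0.9]
print(np.linalg.lstsq(A, b, rcond=None)[0])    # same answer via library routine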


Invariants of matrices: Eigenvectors

Consider an NxN matrix (or linear transformation) T.
An invariant input x of a function T(x) is nice because it does not change
when the function T is applied to it,
  i.e. solve this eqn for x: T(x) = x.
We allow (positive or negative) scaling, but want invariance of direction:
  T(x) = lambda x.
There are multiple solutions to this equation, up to the rank of the matrix T.
If T is full rank, then we have a full set of solutions.
These invariant solution vectors x are eigenvectors, and the characteristic
scaling factors lambda associated with each x are eigenvalues.

Example (reflection about the x-axis, T = diag(1, -1)):
  E-vectors:
  - Points on the x-axis are unaffected: [1 0]^T
  - Points on the y-axis are flipped: [0 1]^T
    (but this is equivalent to scaling by -1!)
  E-values: 1, -1 (also on the diagonal of the matrix)

Eigenvectors
Eigenvectors are even more interesting because any vector in the domain of
T can now be viewed in a new coordinate system formed with the invariant
eigen directions as a basis.
The operation of T(x) is now decomposable into simpler operations on x,
which involve projecting x onto the eigen directions and applying the
characteristic (eigenvalue) scaling along those directions

Note: In Fourier transforms (associated with linear systems):
  The unit-length phasors e^(j*omega*t) are the eigenvectors! And the
  frequency response is composed of the eigenvalues!
  Why? Linear systems are described by differential equations (i.e. d/dt and
  higher orders).
  Recall d(e^(j*omega*t))/dt = j*omega * e^(j*omega*t):
  j*omega is the eigenvalue and e^(j*omega*t) the eigenvector (actually, an
  eigenfunction).


Eigenvalues and eigenvectors

Eigenvectors (for a square m x m matrix S):

  S v = lambda v,  v != 0
  v = (right) eigenvector, lambda = eigenvalue

How many eigenvalues are there at most?
  (S - lambda I) v = 0 only has a non-zero solution v if det(S - lambda I) = 0.
  This is an m-th order equation in lambda which can have at most m distinct
  solutions (roots of the characteristic polynomial). The eigenvalues can be
  complex even though S is real.

Diagonal (eigen) decomposition

Let S = [ 2 1 ] ;  lambda_1 = 1, lambda_2 = 3.
        [ 1 2 ]

The eigenvectors [1 -1]^T and [1 1]^T form U = [  1  1 ]
                                               [ -1  1 ]

Inverting, we have U^-1 = [ 1/2  -1/2 ]      Recall: U U^-1 = I.
                          [ 1/2   1/2 ]

Then, S = U Lambda U^-1 = [  1  1 ] [ 1 0 ] [ 1/2  -1/2 ]
                          [ -1  1 ] [ 0 3 ] [ 1/2   1/2 ]
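
The same decomposition can be verified numerically (a sketch using NumPy's
eigh, which returns orthonormal eigenvectors for a symmetric matrix):

import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])
vals, U = np.linalg.eigh(S)
print(vals)                        # [1. 3.]
print(U @ np.diag(vals) @ U.T)     # reconstructs S (for symmetric S, U^-1 = U^T)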


Example

Let's divide U (and multiply U^-1) by sqrt(2). Then,

  S = [  1/sqrt(2)  1/sqrt(2) ] [ 1 0 ] [ 1/sqrt(2)  -1/sqrt(2) ]
      [ -1/sqrt(2)  1/sqrt(2) ] [ 0 3 ] [ 1/sqrt(2)   1/sqrt(2) ]

    = Q Lambda Q^T   (Q orthogonal: Q^-1 = Q^T)

Geometric view: Eigenvectors

Homogeneous (2nd order) multivariable equations:
  represented in matrix (quadratic) form x^T A x = d with a symmetric matrix A.

Eigenvector decomposition: A = Q Lambda Q^T.

Geometry: principal axes of the ellipse x^T A x = d.
  Symmetric A => orthogonal e-vectors!
    Same idea in Fourier transforms: the e-vectors are the frequencies.
  Positive definite A => positive real e-values!


Why do eigenvalues/vectors matter?

Eigenvectors are invariants of A:
  they do not change direction when operated on by A.

Recall d(e^(lambda*t))/dt = lambda * e^(lambda*t):
  e^(lambda*t) is an invariant function for the linear operator d/dt, with
  eigenvalue lambda.

E.g., a pair of differential eqns can be written as dy/dt = Ay, where y = [v u]^T.

Substitute y = e^(lambda*t) x into the equation dy/dt = Ay:

  lambda e^(lambda*t) x = A e^(lambda*t) x

This simplifies to the eigenvalue vector equation: A x = lambda x.

Solutions of multivariable differential equations correspond to solutions of
linear algebraic eigenvalue equations!

Eigen decomposition

Every square matrix A with distinct eigenvalues has an eigen decomposition:
  A = S Lambda S^-1
where S is a matrix of eigenvectors and
Lambda is a diagonal matrix of distinct eigenvalues: Lambda = diag(lambda_1, ..., lambda_N).

Follows from the definition of eigenvector/eigenvalue:
  A x = lambda x
Collect all these N eigenvectors into a matrix S:
  A S = S Lambda
or, if S is invertible (if the e-values are distinct):
  A = S Lambda S^-1


Eigen decomposition: Symmetric A

Every square, symmetric matrix A can be decomposed into a product of a
rotation (Q^T), a scaling (Lambda) and an inverse rotation (Q):
  A = Q Lambda Q^T
The idea is similar to A = S Lambda S^-1,
but the eigenvectors of a symmetric matrix A are orthogonal and form
an orthogonal basis transformation Q.
  For an orthogonal matrix Q, the inverse is just the transpose Q^T.

This is why we like symmetric (or Hermitian) matrices: they admit a nice
decomposition.
  We like positive definite matrices even more: they are symmetric and
  have all eigenvalues strictly positive.
  Many linear systems are equivalent to symmetric/Hermitian or positive
  definite transformations.

Fourier methods = Eigen decomposition

Applying transform techniques is just eigen decomposition!

Discrete/finite case (DFT/FFT):
  C = F Lambda F*, where F is the (complex) Fourier matrix, which
  happens to be both unitary and symmetric, and multiplication with F is
  rapid using the FFT.
  Applying F = DFT, i.e. transform to the frequency domain, i.e.
  rotate the basis to view C in the frequency basis.
  Applying Lambda is like applying the complex gains/phase
  changes to each frequency component (basis vector).
  Applying F* inverts back to the time domain (IDFT or IFFT).


Fourier/Eigen decomposition

Continuous case:
  Any function f(t) can be viewed as an integral (sum) of scaled,
  time-shifted impulses: f(t) = integral of f(tau) delta(t - tau) dtau.
  h(t) is the response the system gives to an impulse (the impulse response).
  For linear time-invariant (LTI) systems, the response to f(t) is the
  convolution of f(t) with the impulse response h(t): f(t)*h(t).
  Convolution is messy in the time domain, but becomes a simple
  multiplication in the frequency domain: F(s)H(s).

[Figure: Input -> Linear system -> Output]

Fourier/Eigen decomposition

Transforming an impulse response h(t) to the frequency domain gives H(s), the
characteristic frequency response.
  This is a generalization of multiplying by a Fourier matrix F.
  H(s) captures the eigenvalues (i.e. scalings) corresponding to each
  frequency component s.
  Doing convolution now becomes a matter of multiplying eigenvalues
  for each frequency component, and then transforming back (i.e. like
  multiplying with the IDFT matrix F*).

The eigenvectors are the orthogonal harmonics, i.e. the phasors e^(jkx).
  Every harmonic e^(jkx) is an eigenfunction of every derivative and every
  finite difference, which are linear operators.
  Since dynamic systems can be written as differential/difference
  equations, eigentransform methods convert them into simple
  polynomial equations!


Applications in random vectors/processes

Covariance matrix K for random vectors X:
  Generalization of variance; K_ij is the covariance between the
  components x_i and x_j:
    K = E[ (X - mu)(X - mu)^T ]
  K_ij = K_ji: K is a real, symmetric matrix, with orthogonal eigenvectors!
  K is positive semi-definite. When K is full rank, it is positive definite.
White <=> no off-diagonal correlations:
  K is diagonal, and has the same variance in each element of the diagonal.
  E.g.: Additive White Gaussian Noise (AWGN).
Whitening filter: eigen decomposition of K + normalization of each
eigenvalue to 1!
(Auto)correlation matrix R = E[X X^T]
Random vectors X, Y uncorrelated <=> E[X Y^T] = 0 <=> orthogonal
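
A small numerical sketch of coloring and whitening via the eigen decomposition
of K (the matrix K below is an arbitrary example):

import numpy as np

rng = np.random.default_rng(6)
K = np.array([[2.0, 0.8],
              [0.8, 1.0]])                 # target covariance (symmetric, pos. def.)

vals, Q = np.linalg.eigh(K)
A = Q @ np.diag(np.sqrt(vals))             # "coloring" transform: K = A A^T
x = A @ rng.normal(size=(2, 100_000))      # correlated samples, covariance ~ K

W = np.diag(1 / np.sqrt(vals)) @ Q.T       # whitening filter
y = W @ x
print(np.cov(y))                           # ~ identity: y is white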

Gaussian random vectors

Linear transformations of the standard Gaussian vector: x = A w + mu.

pdf: has the covariance matrix K = A A^T in the quadratic form instead of sigma^2.

When the covariance matrix K is diagonal, the component random variables are
uncorrelated. Uncorrelated + Gaussian => independence.
  White Gaussian vector => uncorrelated, i.e. K is diagonal.
  Whitening filter => convert K to become diagonal (using eigen decomposition).

Note: normally AWGN noise has infinitely many components, but it is projected
onto a finite signal space to become a Gaussian vector.


Digital Modulation

Basic concepts

Modulation
Placing baseband signals on high frequency carriers using the
process of modulation facilitates the long distance
transmission of data, voice and video signals.
Modulation:
The signal processing technique where, at the transmitter
one signal (the modulating signal) modifies a property of
another signal (the carrier signal) so that a composite wave
(the modulated wave) is formed.
Demodulation:
At the receiver, the modulating signal is recovered from the
modulated wave.


Bandwidth

The bandwidth of the modulated wave is equal to, or greater


than the bandwidth of the modulating signal.
Since the modulated wave has a higher frequency it can be
launched from practical sized antennas, cables or waveguides
Each symbol represents a specific sequence of bits and the
symbol set covers all possible bit combinations.
The maximum symbol rate is determined by the passband of the
bearer and associated equipment.


Analogue modulation

Analogue modulation combines a higher frequency


sinusoidal carrier with a lower frequency signal
carrying the message.
Such carriers can be modulated in three distinct ways
Amplitude A can be varied according to the message
Amplitude modulation
Frequency f can be varied according to the message signal
Frequency modulation
Phase can also be varied with the message signal.
Phase modulation
Note that frequency and phase modulation are
referred to as angle modulation.


What is digital modulation?

Digital modulation combines a high frequency


sinusoidal carrier signal and a digital data stream to
create a modulated wave that assumes a limited
number of states.
As for analogue modulation, we can modulate the
wave in sympathy with the digital data stream in three
basic ways:
Amplitude A can be varied in sympathy with the message
Amplitude modulation
Frequency f can be varied according to the message signal
Frequency modulation
Phase can also be varied with the message signal.
Phase modulation


Why digital modulation?

Most communication systems can be classified into


one of three different categories:
Bandwidth efficient
Ability of system to accommodate data within a prescribed
bandwidth
Power efficient
Reliable sending of data with minimal power requirements
Cost efficient
System needs to be economical in the context of its use


Why digital modulation?

Digital modulation provides better information


capacity, higher data security, better quality
communications.

Industry trends:


Why digital modulation?

Another layer of complexity in many new systems is


multiplexing.
Two principal types of multiplexing (or multiple access)
used only in digital systems are
TDMA (Time Division Multiple Access) and
CDMA (Code Division Multiple Access).
These are two different ways to add diversity to
signals allowing different signals to be separated from
one another.


Transmitting information

A pure carrier is generated at the transmitter.


The carrier is modulated with the information to be
transmitted.
Any reliably detectable change in signal
characteristics can carry information.
At the receiver the signal modifications or changes are
detected and demodulated.

Modulation


Polar display

Polar display - magnitude and phase


represented together
A simple way to view amplitude
and phase is with the polar
diagram.
The carrier becomes a frequency
and phase reference and the
signal is interpreted relative to
the carrier.
The signal can be expressed in
polar form as a magnitude and a
phase.


Polar display

Magnitude is represented as the


distance from the centre and
phase is represented as the
angle.
Amplitude modulation (AM)
changes only the magnitude of
the signal.
Phase modulation (PM)
changes only the phase of the
signal. Amplitude and phase
modulation can be used together.
Frequency modulation (FM)
looks similar to phase
modulation, though frequency
is the controlled parameter,
rather than relative phase.


I/Q formats

In digital communications,
modulation is often expressed in
terms of I and Q.
This is a rectangular representation
of the polar diagram.
On a polar diagram, the I axis lies
on the zero degree phase reference,
and the Q axis is rotated by 90
degrees.
The signal vector's projection onto
the I axis is its I component and
the projection onto the Q axis is its
Q component.


I and Q in transmitter

I/Q diagrams are useful since they mirror the way in


which digital communication signals are created using
an I/Q modulator.
In the transmitter, I and Q signals are mixed with the
same local oscillator.
A 90-degree phase shifter is placed on one of the paths.
Signals that are at 90 degrees are said to be orthogonal to
each other, or in quadrature.


Transmitter side

Signals that are in quadrature are independent and do


not interfere with each other.
This simplifies digital radios and similar devices

Receiver side

On the receiver side, the combined signals


are easily separated out


Why use I/Q?

Digital modulation is easy to accomplish with I/Q


modulators.
Most modulators map data onto a number of discrete
points on the I-Q plane.
Points are known as constellation points.
As the signal moves from one point to another,
simultaneous amplitude and phase modulation usually
takes place.
Difficult to achieve in conventional phase modulators.


Application areas

Modulation format     Application
MSK, GMSK             GSM
BPSK                  Deep space telemetry, cable modems
QPSK and DQPSK        Satellite, CDMA, TETRA
OQPSK (Offset QPSK)   CDMA, satellite
FSK, GFSK             DECT, paging, AMPS, CT2
VSB                   North American digital TV
8PSK                  Satellite, aircraft
16 QAM                Microwave digital radio, modems, DVB-C, DVB-T
32 QAM                Terrestrial microwave, DVB-T
64 QAM                DVB-C, modems
256 QAM               Modems, digital video (USA)

Terrestrial Trunked Radio (TETRA) is a professional mobile radio and two-way
transceiver specification. TETRA was specifically designed for use by
government agencies, emergency services (police forces, fire departments,
ambulance), transport services and the military.
TETRA is a European Telecommunications Standards Institute (ETSI) standard.

Digital modulation

The modulating signal m(t) is a digital signal given by


Binary line codes
or
Multi-level line codes
Correspondingly, the bandpass signals are also given
by
Binary line codes
or
Multi-level line codes


Binary signal format example: Unipolar
We shall illustrate a number of binary signal formats
or line codes in the following examples.
Unipolar
A 1 is represented by a current of 2A signal units and a 0 is
represented by a current of zero signal units.


Binary signal format example: Unipolar
Unipolar actually can occur in two forms:
  Non return to zero (NRZ)
    Current maintained for the entire bit period (time slot).
    In a long sequence with equally likely 1s and 0s, the power is
    (1/2)(2A)^2 = 2A^2 signal watts.
  Return to zero (RZ)
    Current maintained for only a fraction of the time slot. If we assume
    that the current is maintained for 1/2 of the time slot and the symbols
    are equally likely, then the power in this case is (1/2) x 2A^2 = A^2
    signal watts.
Consider the sequence 101100111000 and view the following diagrams to
compare the two cases.

Binary signal format example: Unipolar
With Non Return to Zero operation:
Long sequences of 0s produce periods where there is no current
generated
Long sequences of 1s produce periods where positive current is
generated
When the 1s and 0s are equally likely, the mean value is A signal
units.
Each of the above conditions can cause problems for an
electronic receiver:
When a constant current or no current flows there is no timing
information and synchronization is difficult.

Unipolar (Non Return to Zero)


Binary signal format example: Unipolar

With Return to Zero operation:


Long sequences of 0s produce periods where there is no
current generated
Long sequences of 1s produce periods where positive current
is generated for a fraction of the time and hence a change can
be detected by the receiver.
When the 1s and 0s are equally likely and the pulses are T/2 wide, the
mean value is A/2 signal units.
So RZ eliminates the timing problem, but not the
problem of long term level shifts.


Binary signal format example: Bipolar

Bipolar operation:
  A 1 is represented by a current of +A signal units.
  A 0 is represented by a current of -A signal units.
Two modes of operation, once again:
  Non return to zero (NRZ)
    Currents maintained for the entire time slot.
    Power needed for equally likely symbols is A^2 signal watts.
  Return to zero (RZ)
    Currents maintained for a fraction of the time slot.
    Power needed for equally likely symbols is A^2/2 signal watts.


Binary signal format example: Bipolar

Bipolar Non Return to Zero

Bipolar Return to Zero


Binary signal format example: Bipolar

Long strings of 1s or 0s produce constant currents in


NRZ bipolar and these represent a problem for
electronic circuits once again.
For RZ bipolar, these problems are basically
eliminated because the receiver detects the return to
zero in each pulse period.
When 1s and 0s are equally likely, the mean signal
value is just zero.


Binary signal format example: Biphase
Biphase (or Manchester)
A 1 is a positive current of amplitude A signal units that
changes to a negative current pulse of equal magnitude and a
0 is a negative pulse that changes to a positive current pulse
of equal magnitude.
The change-over occurs at the midpoint of the timeslot.
This type of coding is used between equipment that operates
at a high speed and requires close synchronization.


Binary signal format example: AMI

Alternate Mark Inversion (AMI)


1s are represented by return to zero current pulses of equal
magnitude A that alternate between positive and negative.
0s are represented by the absence of current pulses.
Power requirements are A2/4 which is half of RZ bipolar and
one eighth of NRZ bipolar.
Since the polarity alternates, almost all the power is contained
within a bandwidth equal to the bit rate expressed in Hz.
With a pulse shape that is approximately the same as a raised cosine, AMI is
used extensively in carrier systems (the line codes are sketched below).
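
The line codes above can be sketched in a few lines of Python (conventions as
described in these slides; two samples per bit for brevity):

import numpy as np

bits = np.array([1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0])   # 101100111000
A = 1.0

uni_nrz = np.repeat(2 * A * bits, 2)             # unipolar NRZ: 1 -> 2A, 0 -> 0
bip_nrz = np.repeat(A * (2 * bits - 1), 2)       # bipolar NRZ: 1 -> +A, 0 -> -A

# Manchester: a 1 is +A then -A within the slot; a 0 is -A then +A
manch = np.ravel(np.column_stack((A * (2 * bits - 1), -A * (2 * bits - 1))))

# AMI: 1s alternate +A/-A (RZ half-slot pulses), 0s are zero
ami = np.zeros(2 * bits.size)
polarity = A
for i, b in enumerate(bits):
    if b:
        ami[2 * i] = polarity     # pulse in the first half of the slot
        polarity = -polarity
print(uni_nrz, bip_nrz, manch, ami, sep="\n")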


Binary signal format example: 2B1Q

Two binary, one quaternary (2B1Q)


Four signal levels (+3, +1, -1, -3) each represent a pair of bits.
Of each pair, the first bit determines whether the level is positive
or negative (1 = positive, 0 = negative).

101100111000


Comments on 2B1Q signalling

2B1Q signaling is used for ISDN basic rate services (at 160 kbps) and
digital subscriber loop services.

For long sequences of 1s and 0s, or alternating 1s and 0s (i.e. 1010101010),
2B1Q signaling produces constant currents and synchronization is impossible.

Since the power spectral densities of 2B1Q, AMI and raised-cosine signaling
are narrower, they are employed in bandwidth-limited environments such as
telephone connections.

Manchester is used in LANs and other applications where precise
synchronization is important and bandwidth is available.

Bit rate and symbol rate


To understand and compare different modulation format efficiencies, it is
important to first understand the difference between bit rate and symbol rate.

The signal bandwidth needed for the communications channel depends on the
symbol rate, not on the bit rate. (Ignore sync and error control.)

Bit rate:
  Bit rate is the frequency of the system bit stream.
  Take, for example, a radio with an 8-bit sampler, sampling at 10 kHz for
  voice. The bit rate, the basic bit stream rate in the radio, would be eight
  bits multiplied by 10k samples per second, or 80 kbits per second.

Symbol rate:
  If symbols are generated at a rate of r per second to create a baseband
  signal with a bandwidth of W Hz, then Nyquist has shown that r <= 2W.
  For a double-sideband modulated wave whose transmission bandwidth is
  BT Hz, BT = 2W, so that r <= BT.


Bit rate and symbol rate


The state diagram opposite represents QPSK (more
details later).
Notice that for each constellation point two bits are
transmitted.
If only one bit was being transmitted per symbol, then
in the previous example the symbol and bit rates would
be identical at 80kbits per second.
For the QPSK example, the symbol rate will be 40 ksymbols per second
(half the 80 kbit/s bit rate).
Symbol rate is sometimes called the baud rate.   [Figure: QPSK state diagram]
Note that the baud rate is not the same as the bit rate. (These
terms are often confused.)
If more bits can be sent with each symbol, then the
same amount of data can be sent in a narrower
spectrum.
This is why modulation formats that are more complex
and use a higher number of states can send the same
information over a narrower piece of the RF spectrum.


Bandwidth requirements

Consider the two modulation schemes depicted in the


figures below:

BPSK 8PSK
One bit per symbol 3 bits per symbol
Bit rate = Symbol rate Symbol rate = 1/3 Bit rate

An example of how symbol rate influences spectrum requirements can be seen in
eight-state Phase Shift Keying (8PSK), shown above on the right. It is a
variation of PSK. There are eight possible states that the signal can
transition to at any time. The phase of the signal can take any of eight
values at any symbol time. Since 2^3 = 8, there are three bits per symbol.
This means the symbol rate is one third of the bit rate.


Digital modulation basics

The bit rate defines the rate at which information is


passed.
The baud (or signaling) rate defines the number of
symbols per second.
Each symbol represents n bits, and has M signal states, where M = 2^n.
This is called M-ary signaling.
The maximum rate of information transfer through a baseband channel is
given by:

  Capacity = 2 W log2(M) bits per second

where W = bandwidth of the modulating baseband signal.
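
A quick numeric illustration of this formula (the bandwidth value is chosen
arbitrarily):

import math

W = 20e3   # baseband bandwidth [Hz]
for M in (2, 4, 16, 64):
    capacity = 2 * W * math.log2(M)        # bits per second
    print(M, capacity / 1e3, "kbit/s")     # 40, 80, 160, 240 kbit/s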


Symbol clock
The symbol clock represents the frequency and exact
timing of the transmission of the individual symbols.
At the symbol clock transitions, the transmitted carrier
is at the correct I/Q (or magnitude/phase) value to
represent a specific symbol (a specific point in the
constellation).


Binary bandpass signaling examples


Binary keying

Binary keying definition:


The bits in a message stream switch the modulation
parameters (amplitude, frequency and phase) from one state
to another. This process is called binary keying.
Binary keying is a process that makes the values of amplitude,
phase or frequency of the carrier signal change in sympathy
with the values of the bits in the binary signal stream.
Basic actions can be classified as:
ASK Amplitude Shift Keying
PSK Phase Shift Keying
FSK Frequency Shift Keying


Binary Amplitude Shift Keying (BASK)
The transmitted signal for BASK is a sinusoid whose amplitude is changed by
on-off keying (OOK), so that a 1 is represented by the presence of a signal
and a 0 is represented by the absence of a signal.

The modulated pulse can be described mathematically when a 1 is present as

  s(t) = A cos(2 pi fc t),  0 <= t <= Tb,

where Tb is the bit duration (in seconds).

When a 0 is present we have s(t) = 0.

Double Side Band - Suppressed Carrier (DSB-SC)

The Double Side Band - Suppressed Carrier (DSB-SC) signal is essentially an
AM signal that has a suppressed discrete carrier.
This signal is given by the following equation:

  s(t) = Ac m(t) cos(omega_c t)

where m(t) is assumed to have a zero dc level for the suppressed-carrier case.


On-off Keying (OOK)

[Figure: OOK waveform.]

On-off keying is also known as Amplitude Shift Keying (ASK).
The graph above shows a time-domain representation of binary amplitude
shift keying.


Binary or Bi-Phase Shift Keying (BPSK)

One of the simplest forms of digital modulation is Binary or Bi-Phase Shift
Keying (BPSK).
One application where this is used is deep space telemetry.
The phase of a constant-amplitude carrier signal moves between zero and 180
degrees.
On an I and Q diagram, the I state has two different values.
There are two possible locations in the state diagram, so a binary one or
zero can be sent.

[Figure: BPSK constellation - one bit per symbol, bit rate = symbol rate.]


Binary Phase-Shift Keying

This is illustrated in the chart above. Notice the 180-degree phase shifts
indicated by the arrow.


Binary Phase-Shift Keying

The following equations describe the waveforms for BPSK. Note that it can
also be referred to as phase-reversal keying (PRK).

Let

  s(t) = Ac cos[ omega_c t + Dp m(t) ]

where m(t) is the polar data waveform given in the figure below:


Binary Phase-Shift Keying

Typically, m(t) has peak values of +-1 and Dp = pi/2 radians; thus

  s(t) = -Ac m(t) sin(omega_c t)

BPSK is equivalent to DSB-SC with a polar data waveform.

The complex envelope is given by

  g(t) = j Ac m(t)


Quadrature Phase Shift Keying (QPSK)

A more common type of phase modulation is Quadrature Phase Shift Keying (QPSK).

QPSK is used extensively in applications including:
  CDMA (Code Division Multiple Access) cellular service,
  wireless local loop,
  Iridium (a voice/data satellite system) and
  DVB-S (Digital Video Broadcasting - Satellite).

QPSK is effectively two independent BPSK systems (I and Q), and therefore
exhibits the same performance but twice the bandwidth efficiency.

[Figure: QPSK state diagram.]

Quadrature Phase Shift Keying


(QPSK)
Quadrature Phase Shift Keying can be filtered using
raised cosine filters to achieve excellent out of band
suppression.


Nyquist & Root raised cosine filters

The Nyquist bandwidth is the


minimum bandwidth that can be
used to represent a signal.
It is important to limit the spectral
occupancy of a signal, to improve
bandwidth efficiency and remove
adjacent channel interference.
Root raised cosine filters allow an
approximation to this minimum
bandwidth.


Types of Quadrature Phase Shift Keying

[Figure: Conventional QPSK, Offset QPSK and pi/4-QPSK constellations.]

Conventional QPSK has transitions through zero (i.e. 180-degree phase
transitions). A highly linear amplifier is required.
In Offset QPSK, the transitions on the I and Q channels are staggered.
Phase transitions are therefore limited to 90 degrees.
In pi/4-QPSK the set of constellation points is toggled for each symbol, so
transitions through zero cannot occur. This scheme produces the lowest
envelope variations.
All QPSK schemes require linear power amplifiers.

QPSK and OQPSK

[Figure, left: constellation diagram for QPSK with Gray coding - each adjacent
symbol differs by only one bit. Right: OQPSK - the signal doesn't cross zero,
because only one bit of the symbol changes at a time.]

Offset QPSK (OQPSK)

Offset quadrature phase-shift keying (OQPSK) is a variant of phase-shift keying modulation using 4 different values of the phase to transmit.
Taking four values of the phase (two bits) at a time to construct a QPSK symbol can allow the phase of the signal to jump by as much as 180° at a time.
When the signal is low-pass filtered (as is typical in a transmitter), these phase-shifts result in large amplitude fluctuations, an undesirable quality in communication systems.
By offsetting the timing of the odd and even bits by one bit-period, or half a symbol-period, the in-phase and quadrature components will never change at the same time.
In the constellation diagram it can be seen that this will limit the phase-shift to no more than 90° at a time.
This yields much lower amplitude fluctuations than non-offset QPSK and is sometimes preferred in practice.
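A minimal sketch of the half-symbol stagger (illustrative only; rectangular pulses and an example oversampling factor):

import numpy as np

def oqpsk_iq(bits, sps=8):
    # Oversampled I/Q streams; Q is delayed by half a symbol so that
    # I and Q never switch at the same instant (max 90-degree phase steps).
    i_sym = 2 * np.asarray(bits[0::2]) - 1
    q_sym = 2 * np.asarray(bits[1::2]) - 1
    i = np.repeat(i_sym, sps)
    q = np.repeat(q_sym, sps)
    q = np.concatenate([q[:sps // 2], q])[:i.size]   # half-symbol offset
    return i, q

i, q = oqpsk_iq([0, 1, 1, 1, 0, 0, 1, 0])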


Difference of the phase between QPSK and OQPSK

π/4-QPSK

This final variant of QPSK uses two identical constellations which are rotated by 45° (π/4 radians, hence the name) with respect to one another.
Usually, either the even or odd symbols are used to select points from one of the constellations and the other symbols select points from the other constellation.
This also reduces the phase-shifts from a maximum of 180°, but only to a maximum of 135°, so the amplitude fluctuations of π/4-QPSK are between OQPSK and non-offset QPSK.
One property this modulation scheme possesses is that if the modulated signal is represented in the complex domain, it does not have any paths through the origin.
In other words, the signal does not pass through the origin.
This lowers the dynamical range of fluctuations in the signal, which is desirable when engineering communications signals.

[Figure: dual constellation diagram for π/4-QPSK. This shows the two separate constellations with identical Gray coding but rotated by 45° with respect to each other.]

QPSK Summary

Quadrature means that the signal shifts between phase states that are separated by 90 degrees (π/2 radians). The signal shifts in increments of 90 degrees from 45° to 135°, −45°, or −135°.
These points are chosen as they can be easily implemented using an I/Q modulator.
Only two I values and two Q values are needed, and this gives two bits per symbol.
There are four states because 2² = 4.
It is therefore a more bandwidth-efficient type of modulation than BPSK: twice as efficient.


Frequency Shift Keying (FSK)

Frequency modulation and phase modulation are closely related.


Frequency Shift Keying

Discontinuous-phase FSK, where ω1 = mark frequency and ω2 = space frequency.


Frequency Shift Keying


FSK

Continuous-phase FSK, where …


Frequency Shift Keying


In FSK, the frequency of the carrier is changed as a function of the
modulating signal (data) being transmitted. The amplitude is unchanged.
In Binary FSK (BFSK or 2FSK), a 1 is represented by one frequency
and a 0 is represented by another frequency.

The bandwidth occupancy of FSK depends on the spacing of the two symbols. A frequency spacing of 0.5 times the symbol rate is typically used.
FSK can be expanded to an M-ary scheme, employing multiple frequencies as different states.


Applications for FSK

FSK (Frequency Shift Keying) is used in many applications including cordless and paging systems.
Some of the cordless systems include DECT (Digital Enhanced Cordless Telephone) and CT-2 (Cordless Telephone 2).

CT-2 is a second-generation cordless telephone system that allows users to roam away from their home base stations and receive service in public places. Away from the home base station, the service is one-way outbound from the phone to a telepoint that is within range.


Binary Frequency Shift Keying

Here the modulated wave is a sinusoid of constant amplitude whose presence at one frequency means a "1" is present, and if another frequency is present then this means a "0" is present.
When signal "1" is present, the pulse can be described as:

When signal "0" is present, the pulse can be described as:


Binary Frequency Shift Keying


Minimum Shift Keying

Since a frequency shift produces an advancing or retarding phase, frequency shifts can be detected by sampling the phase at each symbol period.
Phase shifts of (2N + 1)·π/2 radians are easily detected with an I/Q demodulator.
At even-numbered symbols, the polarity of the I channel conveys the transmitted data.
At odd-numbered symbols, the polarity of the Q channel conveys the data.
This orthogonality between I and Q simplifies detection algorithms and hence reduces power consumption in a mobile receiver.
MSK is used in the GSM (Global System for Mobile Communications) cellular standard.


Minimum Shift Keying

The minimum frequency shift which yields orthogonality of I and Q is that which results in a phase shift of π/2 radians per symbol (90 degrees per symbol).
FSK with this deviation is called MSK (Minimum Shift Keying). The deviation must be accurate in order to generate repeatable 90-degree phase shifts.
A phase shift of +90 degrees represents a data bit equal to "1", while −90 degrees represents a "0".
The peak-to-peak frequency shift of an MSK signal is equal to half of the bit rate.
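These two properties can be checked with a short phase-accumulation sketch (illustrative Python; parameter values are arbitrary):

import numpy as np

def msk_phase(bits, sps=16):
    # Accumulate phase at +/- (pi/2) per symbol: frequency deviation
    # +/- Rb/4, i.e. a peak-to-peak shift of half the bit rate.
    data = 2 * np.asarray(bits) - 1                   # 0 -> -1, 1 -> +1
    dphi = np.repeat(data, sps) * (np.pi / 2) / sps   # per-sample increment
    return np.cumsum(dphi)                            # continuous phase

phase = msk_phase([1, 1, 0, 1, 0, 0])
# At symbol boundaries the phase has moved by exactly +/- 90 degrees:
print(np.degrees(phase[15::16]).round(1))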


Comments on FSK and MSK

FSK and MSK produce constant-envelope carrier signals, which have no amplitude variations.
This is a desirable characteristic for improving the power efficiency of transmitters.
Amplitude variations can exercise nonlinearities in an amplifier's amplitude-transfer function, generating spectral re-growth, a component of adjacent channel power.
Therefore, more efficient amplifiers (which tend to be less linear) can be used with constant-envelope signals, reducing power consumption.


Comments on FSK and MSK

MSK has a narrower spectrum than wider deviation forms of FSK.


The width of the spectrum is also influenced by the waveforms
causing the frequency shift.
If those waveforms have fast transitions, then the
spectrum of the transmitter will be broad.
In practice, the waveforms are filtered with a Gaussian filter,
resulting in a narrow spectrum.
In addition, the Gaussian filter has no time-domain overshoot, which
would broaden the spectrum by increasing the peak deviation.
MSK with a Gaussian filter is termed GMSK (Gaussian MSK).


Differential PSK

Recovery of the data stream from a PSK-modulated wave requires synchronous demodulation.
The receiver must reconstruct the carrier exactly so that it can detect changes in the phase of the received signal.
Differential PSK eliminates the need for the synchronous carrier in the demodulation process, and this has the effect of simplifying the receiver.
At the transmitter, we process the data stream to give a modulated wave where the phase changes by π radians whenever a "1" appears in the stream.
It remains constant whenever a "0" appears in the stream.
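A sketch of the differential encoding and decoding logic (illustrative; an XOR convention in which a "1" toggles the transmitted phase):

def dpsk_encode(data, initial=0):
    # Differentially encode: the output bit flips whenever the data bit is 1,
    # which the modulator turns into a 180-degree (pi radian) phase change.
    encoded, state = [], initial
    for bit in data:
        state ^= bit           # 1 -> toggle phase, 0 -> keep phase
        encoded.append(state)
    return encoded

def dpsk_decode(encoded, initial=0):
    # Recover data by comparing consecutive received phases only.
    prev, data = initial, []
    for bit in encoded:
        data.append(bit ^ prev)
        prev = bit
    return data

data = [1, 0, 1, 1, 0]
assert dpsk_decode(dpsk_encode(data)) == data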


DPSK
Differential Phase-Shift Keying
Binary data are first differentially encoded and then passed to
the BPSK modulator.
Example:


DPSK

Thus we see that the receiver only needs to detect phase changes. It does not need to search for specific phase values.


Example: DPSK


Digital Modulation

A more sophisticated approach to modulation and demodulation/detection

Digital communication system

[Block diagram: Information source → Source encoder → Channel encoder → Interleaving → Modulator → Channel → Demodulator → Deinterleaving → Channel decoder → Source decoder → Information sink.]
Annotations from the figure:
- Information: analog sources are characterized by bandwidth and dynamic range, digital sources by bit rate.
- Source encoder/decoder: maximization of the information transferred.
- Channel encoder/decoder: message protection and channel adaptation (convolutional and block coding).
- Interleaving/deinterleaving: fights against burst errors.
- Modulator/demodulator: M-PSK/FSK/ASK etc., depending on channel bandwidth and characteristics; determines transmitted power and the bandpass/baseband signal bandwidth. In baseband systems these blocks are missing: baseband means that no carrier-wave modulation is used for transmission.
- Channel: wired/wireless, constant/variable, linear/nonlinear; adds noise and interference, so the received signal may contain errors, and the output delivered to the information sink is the message estimate.

Digital communication system as an application of theories

Modulation and demodulation/detection

Modulation
Transform digital data into an analog signal that
can be transmitted or stored (the real world is
analog, not digital).
Demodulation/detection
The received signal contains information about the
transmitted data but is corrupted by noise.
Estimate what data was sent, aiming at minimum
possible probability of making mistakes.


Electrical communication system

Modulation

Modulation

Signal, waveform, modulation and demodulation

Geometry of signal set

Basis waveforms

Inner product and norm

Signal space

Basis of signal space

Linear independency

Comparison: Waveforms vs. vectors

Gram-Schmidt

Signal space examples

Example: PAM

Example: MPAM and 2PAM

Example: PPM

Example: PPM

Example: Bi-orthogonal signals

Example: PSK

Example: PSK

Example: Fourier series

Example: Sampling expansion

Demodulation and detection

Modulation/demodulation

Receiver

Transmission over an AWGN channel

Optimal demodulation

Correlator demodulator

Equivalent Gaussian (vector) channel

Matched filtering + sampling = correlation

Equivalency of matched filter and correlator

Optimum detection: MAP decision rule

Optimum detection: ML decision rule

Example

Error probability

Equivalence of the original waveform problem and the discrete vector problem

Error probability for two signals

Q-function

Examples

Example: OOK

Example: Two-pole signaling

Example: Orthogonal signaling

Pe vs. SNR

Example: Four signals

Error probability with two signals

Error probability with two signals

Union bound for ML decisions

Note

Union Bound: for events A and B,

P(A ∪ B) ≤ P(A) + P(B)

and in general

P(A1 ∪ A2 ∪ … ∪ AN) ≤ Σi=1..N P(Ai)

Applications:
Getting bounds on BER;
in general, bounding the tails of probability distributions.
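A toy numeric check of the bound (a made-up two-dice example, not from the notes):

import itertools

# Sample space: two fair dice; A1 = "first die shows 6", A2 = "second die shows 6".
outcomes = list(itertools.product(range(1, 7), repeat=2))
A1 = {o for o in outcomes if o[0] == 6}
A2 = {o for o in outcomes if o[1] == 6}

p_union = len(A1 | A2) / len(outcomes)            # exact: 11/36
p_bound = (len(A1) + len(A2)) / len(outcomes)     # union bound: 12/36
assert p_union <= p_bound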

398
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Approximation with dominating term(s)

797

Digital Modulation

Appendix: Basic signal space


and orthogonalization
concepts

798

399
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Vector space concepts

799

Vector space concepts

800

400
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Signal space concepts

801

Signal space concepts

802

401
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Gram-Schmidt (G-S) orthogonalization

803

G-S orthogonalization

804

402
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

G-S orthogonalization

805

Example

806

403
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example (contd.)

807

Example (contd.)

808

404
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Summary

809

Proof of G-S-O procedure

810

405
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Proof of G-S-O procedure

811

Note

812

406
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

M-ary ASK (PAM)

813

M-ary ASK (PAM)

814

407
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

M-ary ASK (PAM)

815

M-ary PSK

816

408
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

M-ary PSK

817

M-ary PSK

Quarternary Phase Shift Keying (QPSK)

818

409
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

M-ary PSK

819

M-ary QAM

820

410
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

M-ary QAM

821

M-ary QAM
Signal space diagram (M=16)

822

411
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

M-ary QAM
Signal space diagram (M=8)

823

M-ary QAM

824

412
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

M-ary FSK

825

M-ary FSK

826

413
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

M-ary FSK
Signal space diagram (M = 2)

827

M-ary FSK

828

414
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Multicarrier Systems (OFDM)

OFDM, COFDM, DMT


Orthogonal frequency-division multiplexing (OFDM), essentially
identical to coded OFDM (COFDM) and discrete multi-tone modulation
(DMT), is a frequency-division multiplexing (FDM) scheme utilized as a
digital multi-carrier modulation method.
A large number of closely-spaced orthogonal sub-carriers are used to
carry data.
The data is divided into several parallel data streams or channels, one for
each sub-carrier.
Each sub-carrier is modulated with a conventional modulation scheme such
as quadrature amplitude modulation (QAM) or phase-shift keying (PSK) at
a low symbol rate, maintaining total data rates similar to conventional
single-carrier modulation schemes in the same bandwidth.
OFDM has developed into a popular scheme for wideband digital
communication, whether wireless or over wirelines, used in applications
such as digital television and audio broadcasting, wireless networking and
broadband internet access.


Advantages of OFDM
The primary advantage of OFDM over single-carrier schemes is its ability
to cope with severe channel conditions (for example, attenuation of high
frequencies in a long copper wire, narrowband interference and
frequency-selective fading due to multipath) without complex
equalization filters.
Channel equalization is simplified because OFDM may be viewed as
using many slowly-modulated narrowband signals rather than one rapidly-
modulated wideband signal.
The low symbol rate makes the use of a guard interval between symbols
affordable, making it possible to handle time-spreading and eliminate
intersymbol interference (ISI).
This mechanism also facilitates the design of single frequency networks
(SFNs), where several adjacent transmitters send the same signal
simultaneously at the same frequency, as the signals from multiple distant
transmitters may be combined constructively, rather than interfering as
would typically occur in a traditional single-carrier system.


Multicarrier modulation

As is known from single-carrier communication systems, nonideal channels introduce intersymbol interference (ISI), which degrades the performance compared with the ideal channel.
The degree of performance degradation depends on the frequency response of the channel.
Typically, the complexity of the receiver increases as the spread of the ISI increases.
An alternative approach to the design of a bandwidth-efficient communication system in the presence of channel distortion is to subdivide the available channel bandwidth into a number of narrow sub-channels such that the frequency response of each subchannel is nearly flat.
The multicarrier modulation method has been used in a variety of applications (e.g., DAB, DVB-T/H, 3GPP-LTE, ADSL and HDSL).


Capacity of multicarrier modulation in a linear channel

Let's suppose that C(f) is the frequency response of a nonideal bandlimited channel with bandwidth W.
Noise is supposed to be additive white Gaussian noise (AWGN) with PSD G.
The band is divided into a number N of equispaced subbands of bandwidth Δf = W/N, where Δf is small enough that |C(f)|²/G ≈ constant within each subband.

Hartley-Shannon (H-S) law capacity

Max C with P(f)


P distribution

It can be stated that a multicarrier modulation scheme that divides the available bandwidth into subbands of relatively narrow width provides a solution that could yield transmission rates close to the capacity of the channel.
The signal in each sub-band may be controlled independently (coding, modulation) at a synchronous symbol rate of Δf = W/N, i.e. with symbol duration 1/Δf = N/W.
If Δf is small enough, the equalizer contains only one tap to correct amplitude and phase distortion.
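The optimum power allocation referred to on these slides is the classical water-filling solution. Below is an illustrative Python sketch (the channel gains and power budget are made-up values) that computes Pi = max(0, μ − G/|Ci|²) per subband and bisects on the water level μ until the power budget is met:

import numpy as np

def water_filling(gains, noise, total_power, iters=60):
    # gains = |C(f_i)|^2 per subband, noise = noise power per subband.
    inv_snr = noise / gains
    lo, hi = inv_snr.min(), inv_snr.max() + total_power
    for _ in range(iters):                    # bisect on the water level mu
        mu = 0.5 * (lo + hi)
        if np.maximum(0.0, mu - inv_snr).sum() < total_power:
            lo = mu
        else:
            hi = mu
    return np.maximum(0.0, 0.5 * (lo + hi) - inv_snr)

gains = np.array([1.0, 0.5, 0.1, 0.01])       # example |C|^2 values
p = water_filling(gains, noise=np.ones(4), total_power=4.0)
print(p, p.sum())                             # strong subbands get more power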


Heavy distortion media

A very suitable application of multicarrier modulation is digital transmission over copper-wire subscriber loops, because of their large amplitude distortion.
In this kind of wire, the attenuation, which increases rapidly as a function of frequency, makes it extremely difficult to achieve a high transmission rate with a conventional single modulated carrier and an equalizer at the receiver.
The dominant noise in transmission over subscriber lines is crosstalk interference from signals carried on other telephone lines located in the same cable.
This interference power is frequency dependent, which can be taken into account when allocating the power to subcarriers.

The ISI penalty in the performance can be large in a wireless system.
Multicarrier modulation with optimum power distribution provides the potential for a higher transmission rate.

Transmitter
Idealized system model: a simple idealized OFDM system model suitable for a time-invariant AWGN channel (transmitter side).

Transmitter
An OFDM carrier signal is the sum of a number of orthogonal sub-carriers, with
baseband data on each sub-carrier being independently modulated commonly using
some type of quadrature amplitude modulation (QAM) or phase-shift keying
(PSK).
This composite baseband signal is typically used to modulate a main RF carrier.
Input signal s[n] is a serial stream of binary digits.
By inverse multiplexing, these are first demultiplexed into N parallel streams, and
each one mapped to a (possibly complex) symbol stream using some modulation
constellation (QAM, PSK, etc.).
Note that the constellations may be different, so some streams may carry a higher
bit-rate than others.
An inverse FFT (IFFT) is computed on each set of symbols, giving a set of complex
time-domain samples.
These samples are then quadrature-mixed to passband in the standard way.
However, the real and imaginary components are first converted to the analogue
domain using digital-to-analogue converters (DACs); the analogue signals are then
used to modulate cosine and sine waves at the carrier frequency, fc.
These signals are then summed to give the transmission signal s(t).
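The baseband part of this chain compresses into a few lines (a simplified sketch: QPSK on every subcarrier, no DACs or RF stage; the bit-to-symbol mapping is an assumed convention):

import numpy as np

def ofdm_tx_symbol(bits, n_sc=64):
    # Serial bits -> n_sc parallel QPSK symbols -> IFFT -> one block of
    # complex baseband time-domain samples (no CP, DAC or RF stage here).
    b = np.asarray(bits).reshape(n_sc, 2)
    syms = ((1 - 2 * b[:, 0]) + 1j * (1 - 2 * b[:, 1])) / np.sqrt(2)
    return np.fft.ifft(syms)

x = ofdm_tx_symbol(np.random.default_rng(0).integers(0, 2, 128))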


Receiver
The receiver picks up the signal r(t) , which is then
quadrature-mixed down to baseband using cosine and sine
waves at the carrier frequency.
This also creates signals (mirror-images) centered on 2fc , so
low-pass filters are used to reject these.
The baseband signals are then sampled and digitised using
analogue-to-digital converters (ADCs).
Next FFT is used to convert signals back to the frequency
domain.
This returns N parallel streams, each of which is converted to a
binary stream using an appropriate symbol detector.
These streams are then re-combined into a serial stream, which
is an estimate of the original binary stream at the transmitter.
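The matching receiver-side sketch (again idealized: no channel, no noise, perfect synchronization), including a loopback check against the transmit mapping used above:

import numpy as np

def ofdm_rx_symbol(samples):
    # FFT back to the frequency domain; detect each QPSK subcarrier from
    # the signs of its real and imaginary parts (inverse of the 1 - 2b map).
    syms = np.fft.fft(samples)
    bits = np.empty((syms.size, 2), dtype=int)
    bits[:, 0] = syms.real < 0
    bits[:, 1] = syms.imag < 0
    return bits.ravel()

rng = np.random.default_rng(0)
tx_bits = rng.integers(0, 2, 128)
b = tx_bits.reshape(64, 2)
syms = ((1 - 2 * b[:, 0]) + 1j * (1 - 2 * b[:, 1])) / np.sqrt(2)
assert np.array_equal(ofdm_rx_symbol(np.fft.ifft(syms)), tx_bits)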


Receiver
Idealized system model: a simple idealized OFDM system model suitable for a time-invariant AWGN channel (receiver side).

OFDM principle: Transmitting end


Multicarrier spectrum


Spectra

With rectangular pulse shaping, the amplitude spectrum of each subcarrier is sinc-shaped, so spectral overlapping occurs between adjacent subcarriers.

OFDM transmitter

The multiplexing operations in the transmitter can be implemented by using IDFT (Inverse Discrete Fourier Transform) operations.
In practice, the IDFT is implemented by using IFFTs (Inverse Fast Fourier Transforms).

OFDM principle: Receiving end


OFDM Receiver

The multiplexing operations in the receiver can be implemented by using DFT (Discrete Fourier Transform) operations.
In practice, the DFT is implemented by using FFTs (Fast Fourier Transforms).

FFT based system

FFT based system

FFT based system

Orthogonality

In OFDM, the sub-carrier frequencies are chosen so that the sub-carriers are orthogonal to each other, meaning that cross-talk between the sub-channels is eliminated and inter-carrier guard bands are not required.
This greatly simplifies the design of both the transmitter and the receiver; unlike conventional FDM, a separate filter for each sub-channel is not required.
The orthogonality requires that the sub-carrier spacing is Δf = k/TU Hz, where TU seconds is the useful symbol duration (the receiver-side window size), and k is a positive integer, typically equal to 1.
Therefore, with N sub-carriers, the total passband bandwidth will be B ≈ N·Δf (Hz).
The orthogonality also allows high spectral efficiency, with a total symbol rate near the Nyquist rate for the equivalent baseband signal (i.e. near half the Nyquist rate for the double-sideband physical passband signal), because almost the whole available frequency band can be utilized.
OFDM generally has a nearly 'white' spectrum, giving it gentle electromagnetic interference properties with respect to other co-channel users.

FFT based system

FFT based system

Avoiding ISI

Guard interval for elimination of ISI


One key principle of OFDM is that since low symbol rate modulation
schemes (i.e., where the symbols are relatively long compared to the
channel time characteristics) suffer less from intersymbol interference
caused by multipath propagation, it is advantageous to transmit a number of
low-rate streams in parallel instead of a single high-rate stream.
Since the duration of each symbol is long, it is feasible to insert a guard
interval between the OFDM symbols, thus eliminating the intersymbol
interference.
The guard interval also eliminates the need for a pulse-shaping filter, and it
reduces the sensitivity to time synchronization problems.
The cyclic prefix, which is transmitted during the guard interval, consists of
the end of the OFDM symbol copied into the guard interval, and the guard
interval is transmitted followed by the OFDM symbol.
The reason that the guard interval consists of a copy of the end of the
OFDM symbol is so that the receiver will integrate over an integer number
of sinusoid cycles for each of the multipaths when it performs OFDM
demodulation with the FFT.
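The key consequence, that the linear channel becomes a circular convolution once the cyclic prefix is stripped, can be verified directly (an illustrative sketch with a made-up 3-tap channel and example sizes):

import numpy as np

N, CP = 64, 16                              # FFT size and prefix length
h = np.array([1.0, 0.5, 0.25])              # made-up 3-tap multipath channel

rng = np.random.default_rng(1)
x = np.fft.ifft(np.exp(2j * np.pi * rng.random(N)))   # one OFDM symbol
tx = np.concatenate([x[-CP:], x])           # copy the symbol tail into the guard
rx = np.convolve(tx, h)[CP:CP + N]          # channel, then discard the prefix

# With the CP, linear convolution acts as circular convolution, so each
# subcarrier sees a single complex gain H[k] (the basis of one-tap equalization):
H = np.fft.fft(h, N)
assert np.allclose(np.fft.fft(rx), H * np.fft.fft(x))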

Channel responses

Channel estimation

Inaccurate synchronization and frequency offset
OFDM requires very accurate frequency synchronization between the
receiver and the transmitter; with frequency deviation the sub-carriers will
no longer be orthogonal, causing inter-carrier interference (ICI) (i.e.,
cross-talk between the sub-carriers).
Frequency offsets are typically caused by mismatched transmitter and
receiver oscillators, or by Doppler shift due to movement.
While Doppler shift alone may be compensated for by the receiver, the
situation is worsened when combined with multipath, as reflections will
appear at various frequency offsets, which is much harder to correct.
This effect typically worsens as speed increases, and is an important factor
limiting the use of OFDM in high-speed vehicles.
Several techniques for ICI suppression are suggested, but they may
increase the receiver complexity.


Example
If one sends a million symbols per second using conventional single-carrier
modulation over a wireless channel, then the duration of each symbol
would be one microsecond or less.
This imposes severe constraints on synchronization and necessitates the
removal of multipath interference.
If the same million symbols per second are spread among one thousand
sub-channels, the duration of each symbol can be longer by a factor of a
thousand (i.e., one millisecond) for orthogonality with approximately the
same bandwidth.
Assume that a guard interval of 1/8 of the symbol length is inserted
between each symbol.
Intersymbol interference can be avoided if the multipath time-spreading
(the time between the reception of the first and the last echo) is shorter than
the guard interval (i.e., 125 microseconds).
This corresponds to a maximum difference of 37.5 kilometers between the
lengths of the paths.
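The arithmetic behind these numbers (a one-line sketch):

c = 3e8                       # speed of light, m/s
symbol = 1e-3                 # 1 ms symbol duration from the example
guard = symbol / 8            # 125 microseconds
print(guard * c / 1e3)        # 37.5 km maximum path-length difference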


OFDM and linear distortion

OFDM and linear distortion

Simplified equalization
The effects of frequency-selective channel conditions, for
example fading caused by multipath propagation, can be
considered as constant (flat) over an OFDM sub-channel if the
sub-channel is sufficiently narrow-banded (i.e., if the number
of sub-channels is sufficiently large).
This makes equalization far simpler at the receiver in OFDM
in comparison to conventional single-carrier modulation.
The equalizer only has to multiply each detected sub-carrier
(each Fourier coefficient) by a constant complex number, or a
rarely changed value.


Channel coding and interleaving


OFDM is invariably used in conjunction with channel coding (forward error
correction, FEC), and almost always uses frequency and/or time interleaving.
Frequency (subcarrier) interleaving increases resistance to frequency-selective
channel conditions such as fading.
For example, when a part of the channel bandwidth fades, frequency interleaving
ensures that the bit errors that would result from those subcarriers in the faded part
of the bandwidth are spread out in the bit-stream rather than being concentrated.
Similarly, time interleaving ensures that bits that are originally close together in the
bit-stream are transmitted far apart in time, thus mitigating against severe fading as
would happen when travelling at high speed.
However, time interleaving is of little benefit in slowly fading channels, such as for
stationary reception, and frequency interleaving offers little to no benefit for
narrowband channels that suffer from flat-fading (where the whole channel
bandwidth fades at the same time).
The reason why interleaving is used on OFDM is to attempt to spread the errors out
in the bit-stream that is presented to the error correction decoder, because when
such decoders are presented with a high concentration of errors the decoder is
unable to correct all the bit errors, and a burst of uncorrected errors occurs.
Note that a similar design of audio data encoding makes compact disc (CD) playback robust.

Channel coding and interleaving


A classical type of error correction coding used with OFDM-based systems
is convolutional coding, often concatenated with Reed-Solomon or BCH
coding.
Usually, additional interleaving (time and frequency interleaving) in
between the two layers of coding is implemented.
The choice for Reed-Solomon coding as the outer error correction code is
based on the observation that the Viterbi decoder used for inner
convolutional decoding produces short error bursts when there is a high
concentration of errors, and Reed-Solomon codes are inherently well-suited
to correcting bursts of errors.
Newer systems adopt near-optimal types of error correction codes that use
the turbo decoding principle, where the decoder iterates towards the desired
solution.
Examples of such error correction coding types include turbo codes and
LDPC codes, which perform close to the Shannon limit for the Additive
White Gaussian Noise (AWGN) channel.

Low-density parity-check (LDPC) and turbo codes are capacity-approaching codes, which means that practical constructions exist that allow codes to closely approach the channel capacity, the theoretical maximum for the code rate at which reliable communication is still possible given a specific noise level.


OFDM extended with multiple access


OFDM in its primary form is considered as a digital modulation technique, and not
a multi-user channel access method, since it is utilized for transferring one bit
stream over one communication channel using one sequence of OFDM symbols.
However, OFDM can be combined with multiple access using time, frequency or
coding separation of the users.
In Orthogonal Frequency Division Multiple Access (OFDMA), frequency-division
multiple access is achieved by assigning different OFDM sub-channels to different
users.
OFDMA supports differentiated quality of service by assigning different numbers of sub-carriers to different users in a similar fashion as in CDMA, and thus complex packet scheduling or Media Access Control (MAC) schemes can be avoided.
In Multi-carrier code division multiple access (MC-CDMA), also known as OFDM-
CDMA, OFDM is combined with CDMA spread spectrum communication for
coding separation of the users.
Co-channel interference can be mitigated, meaning that manual fixed channel
allocation (FCA) frequency planning is simplified, or complex dynamic channel
allocation (DCA) schemes are avoided.


Example


Introduction to Information
and Coding Theory

Basic concepts

Information and coding theory

Information sources and source coding:
information measures, entropy;
represent source data efficiently in digital form.
Channel capacity and coding:
channel capacity, limits;
use redundant bits to counteract transmission errors.


Digital communication system

[Block diagram repeated from the Digital Modulation part: Information source → Source encoder → Channel encoder → Interleaving → Modulator → Channel → Demodulator → Deinterleaving → Channel decoder → Source decoder → Information sink, with the same annotations.]

Digital communication system as an application of theories

Some probability basics

P(not a) = 1 − P(a)
P(a or b) = P(a) + P(b) − P(a and b),
where a and b are events.
We will often denote P(a and b) by P(a, b).
If P(a, b) = 0, we say a and b are mutually exclusive.

Conditional probability

P(a|b) is the probability of a, given that we know b.
The joint probability of both a and b is given by:
P(a, b) = P(a|b) P(b).
Since P(a, b) = P(b, a), we have Bayes' Theorem:

P(a|b) P(b) = P(b|a) P(a),
or
P(a|b) = P(b|a) P(a) / P(b)

Independence

If two events a and b are such that
P(a|b) = P(a),
we say that the events a and b are independent.
Note that from Bayes' Theorem, we will also have that
P(b|a) = P(b),
and furthermore,
P(a, b) = P(a|b) P(b) = P(a) P(b).
This last equation is often taken as the definition of independence.

Required properties of information measure

We will want our information measure I(p) to have several properties:
1. Information is a non-negative quantity: I(p) ≥ 0.
2. If an event has probability p = 1, we get no information from the occurrence of the event: I(1) = 0.
3. If two independent events occur (whose joint probability is the product of their individual probabilities), then the information we get from observing the events is the sum of the two pieces of information: I(p1·p2) = I(p1) + I(p2). (This is the critical property.)
4. We will want our information measure to be a continuous and monotonic function of the probability, meaning that slight changes in probability should result in slight changes in information.


Derivation of information measure

We can therefore derive the following:

1. I(p²) = I(p·p) = I(p) + I(p) = 2·I(p)

2. Thus, further, I(pⁿ) = n·I(p) (by induction)

3. I(p) = I((p^(1/m))^m) = m·I(p^(1/m)), so I(p^(1/m)) = (1/m)·I(p), and thus in general I(p^(n/m)) = (n/m)·I(p)

4. And thus, by continuity, we get, for 0 < p ≤ 1 and a real number a > 0:

I(p^a) = a·I(p)

Information measure

We can find a simple expression which satisfies the previous properties.
This is

I(p) = −log_b(p) = log_b(1/p)

for any base b.
The base b determines the units we are using.
We can change the units by changing the base, using the formula, for bases b1, b2 and x > 0,

log_b2(x) = log_b1(x) / log_b1(b2)

and therefore

I_b2(p) = I_b1(p) / log_b1(b2)


Units of information

Thus, using different bases for the logarithm results in information measures which are just constant multiples of each other, corresponding with measurements in different units:
log2 units are bits (from "binary");
loge units are nats (from "natural logarithm");
log10 units are hartleys, after an early scientist in the field of transmission techniques.

Note: Unless we want to emphasize the units, we need not bother to specify the base for the logarithm, and will write log(p). Typically, however, we will think in terms of log2(p).

Example

A) Flipping a fair coin once will give us events h(ead) and t(ail), each with probability 1/2, and thus a single flip of a coin gives us −log2(1/2) = 1 bit of information (whether it comes up h or t).
B) Flipping a fair coin n times (or, equivalently, flipping n fair coins) gives us −log2((1/2)^n) = log2(2^n) = n·log2(2) = n bits of information.

Example

We could enumerate a sequence of 50 flips as, for example:
hthhtththhht…
or, using 1 for h and 0 for t, the 50 bits
101100101110…
Thus n flips of a fair coin give us n bits of information, and take n binary digits to specify.
That these two are the same reassures us that our definition of information measure is good (enough).

Average amount of information


Suppose now that we have n symbols {a1, a2, . . . , an},
and some source is providing us with a stream of these
symbols.
Suppose further that the source emits the symbols with
probabilities {p1, p2, . . . , pn}, respectively.
For now, we also assume that the symbols are emitted
independently (successive symbols do not depend in
any way on past symbols).
What is the average amount of information we get
from each symbol we see in the stream?



Average amount of information

If we observe the symbol ai, we will get log(1/pi) information from that particular observation.
In a long run (say N) of observations, we will see (approximately) N·pi occurrences of symbol ai.
Thus, in the N (independent) observations, we will get total information I of

I = Σi N·pi·log(1/pi)

The average information per symbol observed will be

I/N = Σi pi·log(1/pi)

Note

Note that

lim (p→0) p·log(1/p) = 0,

so we can define pi·log(1/pi) to be 0 when pi = 0.


Entropy of the distribution


We have defined information strictly in terms of the probabilities of events.
Therefore, let us suppose that we have a set of probabilities (a probability distribution) P = {p1, p2, . . . , pn}.
We define the entropy of the distribution P by:

H(P) = Σi pi·log(1/pi)

For a continuous probability distribution P(x),

h(P) = ∫ P(x)·log(1/P(x)) dx
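A direct transcription of this definition (a small Python sketch; base-2 logarithm, so the result is in bits):

import math

def entropy(probs, base=2):
    # H(P) = sum_i p_i * log(1/p_i); terms with p_i == 0 contribute 0.
    return sum(p * math.log(1.0 / p, base) for p in probs if p > 0)

print(entropy([0.5, 0.5]))     # 1.0 bit  (fair coin)
print(entropy([1.0, 0.0]))     # 0.0 bits (no uncertainty)
print(entropy([0.25] * 4))     # 2.0 bits (maximum for four outcomes)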

Review of some probability concepts

The joint probability is denoted by pX,Y(x, y).

The conditional probability density function of X given (the occurrence of) the value y of Y can be written as

pX|Y(x|y) = pX,Y(x, y) / pY(y)

where pX,Y(x, y) gives the joint probability of X and Y, while pY(y) gives the marginal density for Y (pY(y) > 0).

Definition of information

Here the message signal is modeled as a random process.
We begin by considering observations of a random variable:
Each observation gives a certain amount of information.
But rare observations give more information than usual ones.

Example: The statement "The sun will rise next morning" gives very little information (high probability).
The statement "San Francisco will be destroyed tomorrow morning by an earthquake" gives a lot of information (low probability).

Definition: (Self-)information

Observing a random variable X that takes its values from the set of possible outcomes X = {x1, x2, …, xK}, the (self-)information of observation xm is defined as

I(xm) = −log2(pX(xm)) = log2(1/pX(xm))

where pX(xm) are the probabilities of the outcomes.


Interpretation

It is easy to see that I(·) ≥ 0 for 0 ≤ pX ≤ 1.
For a rare event the probability p(·) is small and the information is large.
For a usual event the probability p(·) ≈ 1 and the information is small.

In case of two independent random variables X and Y, Y = {y1, y2, …, yN}, the information of the joint event (xm, yn) becomes

I(xm, yn) = −log2(pXY(xm, yn)) = −log2(pX(xm)) − log2(pY(yn)) = I(xm) + I(yn)

Thus in case of independent events, the information is additive, which makes sense intuitively.

Information source

Source data: a speech signal, an image, a computer file, ...


In practice source data is time-varying and unpredictable.
Bandlimited continuous-time signals (e.g. speech) can be
sampled into discrete time and reproduced nearly without
loss (quantization noise).

A source is a discrete-time stochastic process {Xn}


Properties of information source

Entropy and information

Entropy

Entropy

Entropy has the following interpretations:
Average information obtained from an observation.
Average uncertainty about X before the observation.
Entropy is a measure of uncertainty.
The more we know about something, the lower the entropy.

Why the term "entropy"?
Thermodynamics (mid 19th century): the amount of unusable heat in a system.
Statistical physics (late 19th century): log(complexity of current system state) ≈ the amount of disorder in the system.


Binary entropy function

Binary entropy function h(p)

Example

Example

Theorem

We have H(X) = 0 when exactly one of the probabilities is one and all
the rest are zero. We have H(X) = log(L) only when all of the events
have the same probability 1/L. That is, the maximum of the entropy
function is the log() of the number of possible events, and occurs
when all the events are equally likely.

Example
How much information can a student get from a
single grade?
First, the maximum information occurs if all grades
have equal probability.
E.g., in a pass/fail class, on average half should pass
if we want to maximize the information given by the
grade.
The maximum information the student gets from a
grade will be:
Pass/Fail: 1 bit.
Grades 1, 2, 3, 4, 5: log2(5) ≈ 2.3 bits.


Example: Entropy of English text (memoryless model)

Average entropy

A probability mass function (pmf); the size of the set is also called its cardinality.

Note

A probability mass function (pmf) is a function


that gives the probability that a discrete random
variable is exactly equal to some value.

A pmf differs from a probability density


function (pdf) in that the values of a pdf,
defined only for continuous random variables, are
not probabilities as such.

The integral of a pdf over a range of possible


values (a, b] gives the probability of the random
variable falling within that range.

Comment
It is important to recognize that our definitions of information
and entropy depend only on the probability distribution.
In general, it would not make sense for us to talk about the
information or the entropy of a source without specifying the
probability distribution.
It can certainly happen that two different observers of the same
data stream have different models of the source, and thus
associate different probability distributions to the source.
The two observers will then assign different values to the
information and entropy associated with the source.


Example on comment

Two people listening to the same lecture can get very


different information from the lecture depending on
their backgrounds.
For example, without appropriate background, one
person might not understand anything at all, and
therefore have as probability model a completely
random source, and therefore get much more
information [!!??] than the listener who understands
quite a bit, and can therefore anticipate much of what
goes on, and therefore assigns non-equal probabilities
to successive words.

Conditional entropy
Now given two random variables X and Y, the
conditional entropy of X given Y is denoted as
H(X|Y) and measures
average information obtained from observing X given that
the value of Y is known
average uncertainty about the observation X given that the
value of Y is known
how much extra information one still needs to supply on
average to communicate X given that the other party knows
Y
Thus the conditional entropy measures the statistical
dependence between X and Y in information theoretic
sense.


Conditional entropy

(cf. = confer, "compare")

Conditional entropy

The conditional uncertainty of the discrete random variable X with L outcomes given the discrete random variable Y with M outcomes is the quantity

H(X|Y) = Σm=1..M Σl=1..L p(xl, ym)·log(1/p(xl|ym))


Theorem


Joint entropy H(XY) = H(X,Y)

The joint entropy of two random variables X, Y is the amount of information needed on average to specify both their values.

Chain rule

Example

Example

Example

Example: Dice

Let's consider the vector-valued random variable XY; we get 12 outcomes in total.

Example: Dice

Theorem

Entropy rate

Recap: Entropy properties


Entropy measures the amount of information in a
random variable or the length of the message required
to transmit the outcome.
Joint entropy is the amount of information in two (or
more) random variables.
Conditional entropy is the amount of information in
one random variable, given we already know the
other.
Entropy rate is per-word or per-character entropy.


Reduction of uncertainty due to an observation

Symmetry in the reduction of uncertainty

Mutual information

The mutual information basically measures the amount of information which Y contains about X (or vice versa):

I(X;Y) = H(X) − H(X|Y)


Note

Statistically independent X and Y
⇒ I(X;Y) = 0
⇒ entropy is additive for independent variables.

I(X;X) = H(X) − H(X|X) = H(X),
i.e. the self-information is the entropy.
Example

Example

Definition: Mutual information

Information measures

Continuous variables: Differential entropy

Shannon theorems and channel capacity

Model of communication systems


In 1948, Claude Shannon laid the foundations for
modern information, coding, and communication
theory.
He developed a general model for communication
systems, and a set of theoretical tools for analyzing
such systems.
His basic model consists of three parts: a sender (or
source), a channel, and a receiver (or sink).



Communication with source and channel coding

Shannon's general model also includes encoding and decoding elements, and noise within the channel ⇒ equivalent noiseless channel.

Transmission channels

Cables
wire pairs (e.g., ordinary telephone line)
coaxial cable
waveguide (metallic waveguide and optical fiber)
More or less free space radio transmission
broadcasting
point-to-point microwave transmission
satellite position transmission
cell networks
(Portable magnetic/electronic/optical memory equipment)


Channel models

Channel models of the random phenomena introduced by the physical channel are needed.

Examples: a discrete channel; a linear additive noise channel.

Discrete information transmission system


Probability concepts of transmission

Several types of symbol probabilities will be needed to deal with the two alphabets here, and we'll use the notation defined as follows:
P(xi) is the probability that the source selects symbol xi for transmission;
P(yj) is the probability that symbol yj is received at the destination;
P(xi, yj) is the joint probability that xi is transmitted and yj is received;
P(xi|yj) is the conditional probability that xi was transmitted given that yj is received;
P(yj|xi) is the conditional probability that yj is received given that xi was transmitted.

Example

We'll assume, for simplicity, that the channel is time-invariant and memoryless, so the conditional probabilities are independent of time and previous symbol transmissions.
The conditional probabilities P(yj|xi) then have special significance as the channel's forward transition probabilities.
By way of example, the figure depicts the forward transitions for a noisy channel with two source symbols and three destination symbols.
If this system is intended to deliver yj = y1 when xi = x1 and yj = y2 when xi = x2, then the symbol error probabilities are given by P(yj|xi) for j ≠ i.

[Figure: forward transition probabilities for a noisy discrete channel]

Shannon's model

In Shannon's discrete model, it is assumed that the source provides a stream of symbols selected from a finite alphabet A = {a1, a2, . . . , an}, which are then encoded.
The code is sent through the channel and possibly disturbed by noise.
At the other end of the channel, the receiver will decode, and derive information from the sequence of symbols.

Note: Sending information from one place to another is equivalent to sending information from one time to another time, and thus Shannon's theory applies equally well to information storage questions as to information transmission questions.

Shannon's model

Given a source of symbols and a channel with noise (more precisely, a probability model for these elements), we can talk about the capacity of the channel.
The general model Shannon worked with involved two sets of symbols, the input symbols and the output symbols.
Let us say the two sets of symbols are A = {a1, a2, . . . , an} and B = {b1, b2, . . . , bm}.
Note that we do not necessarily assume the same number of symbols in the two sets.
Given the noise in the channel, when symbol bj comes out of the channel, we cannot be sure which ai was put in.
The channel is characterized by the set of probabilities {P(ai|bj)}.


Mutual information
We can then consider various related information and entropy
measures.
First, we can consider the information we get from observing a
symbol bj.
Given a probability model of the source, we have an a priori
estimate P(ai) that symbol ai will be sent next.
Upon observing bj, we can revise our estimate to P(ai|bj).
The change in our information (the mutual information) will be given by:

I(ai; bj) = log(P(ai|bj) / P(ai))

Mutual information: Properties

If ai and bj are independent (i.e., if


P(ai, bj) = P(ai) P(bj)), then
I(ai; bj) = 0.



Average mutual information


What we actually want is to average the mutual information
over all the symbols:

and from these,


Average mutual information: Properties

Also we have:
I(A;B) ≥ 0,
and
I(A;B) = 0
if and only if A and B are independent.


Entropy: Definitions and properties

Mutual information and entropies

Channel capacity

If we are given a channel, we could ask what is the


maximum possible information that can be
transmitted through the channel.
We could also ask what mix of the symbols {ai} we
should use to achieve the maximum.
In particular, using the definitions above, we can
define the channel capacity to be:

943

Shannon's main theorem


For any channel, there exist ways of encoding input symbols
such that we can simultaneously utilize the channel as closely
as we wish to the capacity, and at the same time have an error
rate as close to zero as we wish.
This is actually quite a remarkable theorem.
We might naively guess that in order to minimize the error rate, we
would have to use more of the channel capacity for error
detection/correction, and less for actual transmission of information.
Shannon showed that it is possible to keep error rates low and
still use the channel for information transmission at (or near)
its capacity.



Shannon's channel coding theorem

Unfortunately, Shannon's theorem has a couple of problematic points.
The first is that the proof is non-constructive.
It doesn't tell us how to construct the coding system to optimize channel use, but only tells us that such a code exists.
The second is that in order to use the capacity with a low error rate, we may have to encode very large blocks of data.
This means that if we are attempting to use the channel in real time, there may be time lags while we are filling buffers.
There is thus still much work possible in the search for efficient coding schemes.

Application: Source coding

Consider a memoryless information source producing an output signal to be represented by a bit stream.
The source coder doing this must use a unique bit stream for each of the possible messages (message streams).
This is called lossless source coding.
Also lossy coding techniques exist.
Different source coding variants:
encoding each source output individually vs. treating many consecutive outputs as a whole;
fixed- vs. variable-length code words in the coded stream;
known vs. unknown statistics of the source.
Question:
What's the minimum number of bits that can be used?


Discrete Memoryless Source (DMS) and typical sequence

Examples: Typical sequences


Possible sequences


Coding problem

We need NH bits to enumerate all different typical sequences.
That is, H bits/source symbol.
A source code: (1) observe a sequence; (2a) if it is a typical sequence, produce and store/transmit its NH-bit index; (2b) if it is a non-typical sequence, declare an error; (3) reproduce the source sequence from the stored/transmitted index.
This code has rate R = H bits/source symbol.
As N → ∞ the code works without errors.

Source Coding Theorem

The (lossless) Source Coding Theorem: For a source with entropy rate H, a lossless source code of rate R exists as long as R > H. For R < H no lossless source code can be found.

H measures the information content of the source, in the sense that H bits per symbol are required to describe its output!

Source coding: Example


One very simple but rather efficient code is obtained by coding two consecutive source bits as follows: assign the shortest codeword to the most probable event.

It is also easy to decode a bit stream constructed in this way since no codeword is a prefix of another.
Now in this code, 0.645 bits per source bit are used on average (as opposed to 1 bit per sample of the trivial code): the average code length to transmit 2 source bits is 0.81·1 + 0.09·2 + 0.09·3 + 0.01·3 = 1.29, so the average code length per source bit is 1.29/2 = 0.645.
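The arithmetic can be checked, and compared with the entropy bound, in a few lines (a sketch; the pair probabilities correspond to an i.i.d. binary source with P(0) = 0.9, and the codeword set {0, 10, 110, 111} is one prefix-free choice consistent with the stated lengths):

import math

probs   = [0.81, 0.09, 0.09, 0.01]   # bit pairs for an assumed P(0) = 0.9 source
lengths = [1, 2, 3, 3]               # e.g. codewords 0, 10, 110, 111

avg_len_per_pair = sum(p * l for p, l in zip(probs, lengths))
print(avg_len_per_pair / 2)          # 0.645 bits per source bit

# Entropy bound per source bit: h(0.9) = h(0.1)
h = -(0.9 * math.log2(0.9) + 0.1 * math.log2(0.1))
print(h)                             # about 0.469 < 0.645, so R > H as required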


Example

The Shannon source coding theorem relates the uncertainty of the source output to the probability of typical long sequences of source symbols.

Typical sequence

Let's assume that a source is a discrete-time stochastic process {Xn} and the Xn are independent and identically distributed binary digits 0 and 1 with probabilities p and 1 − p.


Example

Assuming that p < 1/2, the most probable output sequence consists of only ones.
But such a sequence is not a typical sequence.

Suppose we bet on horses: it is likely that we lose, but it would be very unlikely that we lose all the time; such an all-losing sequence is not typical!

Probability of the typical sequence

956

478
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Amount of the typical sequences

957

Typical set

958

479
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example

959

960

480
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example
Choose a smaller , namely =0.046 [ 5% of
h(1/3)], and increase the length of the
sequences.

961

Meaningful sequences

If we consider a source that outputs text, then the


typical long sequences are the sequences of
"meaningful" text while the nontypical sequences are
simply garbled text.
What is meant by "meaningful" is determined by the
structure of the language; that is, by its grammar,
spelling rules etc.

481
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Meaningful text

963

Meaningful fraction

964

482
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Coding of the typical sequences

965

Digital communication system

966

483
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Source coding theorem

967

Codewords

968

484
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Code rate

969

Transmission rate

The transmission rate Rt is measured in bits/second


and is obtained by multiplying the code rate R by the
number of transmitted channel symbols/second.
If the duration of the pulse for a symbol is T seconds,
then we obtain the transmission rate as

485
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example

The (3,2) block code B = {000, 011, 101, 110}


consists of M = 2K= 4 codewords and has rate

R = log(M)/K = K/N = 2/3.

971

(N,K) block code

972

486
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example

973

Separate source and channel coding

974

487
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Binary Symmetric Channel (BSC)

975

Channel coding theorem

976

488
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Jointly typical sequences

By combining
and the previous equations we obtain

977

Cardinality of jointly typical set

978

489
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Fan

979

Fans

nonoverlapping fans in the following figure.

980

490
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Fans

981

Number of distinguishable messages

Each fan can represent a message.

982

491
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Channel capacity

983

Non-overlapping fans and correct


decoding

984

492
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

R>C

985

Channel coding theorem

986

493
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Reversed fans

987

Coding strategy

988

494
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Proof of validity of strategy

989

Proof of validity of strategy

990

495
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Proof of validity of strategy

991

Example: Binary Symmetric Channel (BSC)

992

496
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Channel capacity of BSC

993

Discrete time Gaussian channel

994

497
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Infinite capacity Infinity energy

995

Limitation: Signaling energy

996

498
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Capacity of Gaussian channel

997

Bandlimited noisy channel

998

499
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Transmitted signal power

999

Signal and noise energies/sample

1000

500
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Channel capacity

1001

Channel capacity

1002

501
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Capacity of bandlimited Gaussian


channel

1003

Without bandwith limitation

1004

502
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Without bandwith limitation

1005

Energy/bit

1006

503
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Capacity without bandwith limitation

1007

Shannon limit

1008

504
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Shannon limit and reliable


communication

1009

Example: Bandlimited Gaussian


channel

1010

505
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Bandlimited Gaussian


channel

1011

Example: Bandlimited Gaussian


channel

1012

506
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Bandlimited Gaussian


channel

1013

Example: Bandlimited Gaussian


channel

1014

507
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Bandlimited Gaussian


channel

1015

Losless source coding

1016

508
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Source coding

1017

Lossless source coding: Concepts

1018

509
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Desired properties of a source code

1019

Instantaneous Prefix property

1020

510
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Classifications

1021

Shannon Source Coding Theorem


(SCT)
A discrete source with entropy rate H [bits/source
symbol], and a lossless source code of rate R
[bits/source symbol].
In 1948 Claude Shannon showed, based on typical
sequences, that a lossless (errorfree) code exists as
long as R > H.
A lossless code does not exist for any R < H.
The theorem does not say how to design practical
coding schemes.

Note: Shannon SCT is an "existence/non-existence theorem" (as many


results in information theory are).

511
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Inequalites

1023

SCT for uniquely decodable codes

1024

512
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Source coding

1025

Source coding

1026

513
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Prefix

1027

Prefix-free code

1028

514
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example

1029

Example

1030

515
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Leaves and nodes

1031

Path length lemma

1032

516
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Average codeword length = average leaves


depth

1033

Example

1034

517
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Huffman tree

1035

Example: Huffman

1036

518
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Huffman

1037

Example: Huffman

1038

519
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Average codeword length vs. uncertainty


of the source

1039

Optimality of Huffman code

1040

520
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Coding source symbols pairwise

1041

Huffman codes

1042

521
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Huffman codes

1043

Lempel-Ziv-Welch

The LZW algorithm is a so-called universal source-


coding algorithm, which means that we do not need
to know the source statistics.
The algorithm is easy to implement and for long
sequences it approaches the uncertainty of the source;
it is asymptotically optimum.

522
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Lempel-Ziv algoritm

1045

Lossy source coding

1046

523
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Lossy source coding

1047

Distortion measures

1048

524
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Rate-distortion function

1049

Distortion-rate function

1050

525
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Examples

1051

Quantization

1052

526
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Scalar quantizer

1053

MSE and SQNR

1054

527
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Uniform/Linear quantization

1055

Quantization noise

1056

528
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Optimal non-uniform quantization

1057

Compressor and expander

1058

529
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Speech compression

1059

Waveform coding: PCM

1060

530
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Waveform coding: DPCM

1061

1-point DPCM

1062

531
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Multipoint DPCM

1063

1064

532
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Analysis-syntesis technique

1065

Some examples

1066

533
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Channel coding (Error correction)

1067

Main classes of channel coding

1068

534
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Binary field

1069

Addition

1070

535
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Linear code

1071

Received word = codeword + error

1072

536
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Errors and Hamming distance

1073

Minimum distance

1074

537
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Hamming (7,4)


In Table 1 we specify an encoder mapping for the
(7,4) Hamming code with M = 24 = 16 codewords.

Table 1:

1075

Example

1076

538
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example

1077

Example

1078

539
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example

1079

Example

1080

540
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example

1081

Example

1082

541
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

G-matrix

1083

Generation of codewords

1084

542
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example

1085

Parity-checking procedure

1086

543
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Parity-checking procedure

1087

Parity-check matrix

1088

544
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

GHT=0

1089

Convolutional codes

1090

545
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Encoder for convolutional


code

1091

Encoder state

The state of a system is a compact description of


its past history such that it, together with the
present input, suffices to determine the present
output and the next state.

For our convolutional encoder we choose the state


to be the contents of its memory element;
that is, at time t the state is

546
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example

1093

State transition diagram

1094

547
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Trellis-diagram

1095

Viterbi-algorithm

1096

548
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: Viterbi decoding

1097

Example: Evolution of subpaths through


trellis

1098

549
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Error correction capability

1099

Error correction capability

1100

550
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Free distance and capability of error


correction

1101

Diversity

551
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Introduction to diversity
Basic Idea
Send same bits over independent fading paths
Independent fading paths obtained by time, space, frequency, or polarization
diversity
Combine paths to mitigate fading effects

Tb

t
Multiple paths unlikely fade simultaneously
1103

Diversity gain

AWGN case: BER vs SNR:


(any modulation scheme, only the constants differ)
Note: Here is received SNR

Rayleigh Fading without diversity:

Rayleigh Fading with diversity:

(MIMO):

Note: Diversity is a reliability theme, not a capacity/bit-rate one.


For capacity: need more degrees-of-freedom (i.e. symbols/s) & packing of bits/symbol
1104
(e.g. MQAM).

552
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Time diversity

1105

Time diversity
Time diversity can be obtained by interleaving and coding over
symbols across different coherent time periods.

Channel: time
diversity/selectivity,
but correlated across
successive symbols

(Repetition) Coding
without interleaving: a full
codeword lost during fade

Interleaving: of sufficient
depth: (> coherence time)
At most 1 symbol of codeword
lost!
Coding alone is not sufficient!

553
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Forward Error Correction (FEC):


Eg: Reed-Solomon RS(N,K)
K of N Recover K
RS(N,K) received data packets!

FEC (N-K)

Block
Size Lossy Network
(N)

Data = K

Block: of sufficient size: (> coherence time), 1107


else need to interleave, or use with hybrid ARQ

Hybrid ARQ/FEC model


Packets Sequence Numbers
CRC or Checksum
Proactive FEC
Timeout
ACKs
Status Reports
NAKs,
SACKs
Bitmaps
Retransmissions
Packets
Reactive FEC

1108

554
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: GSM

The data of each user are sent over time slots of length 577 s
Time slots of the 8 users together form a frame of length 4.615 ms
Voice: 20 ms frames, rate convolution coded = 456 bits/voice-frame
Interleaved across 8 consecutive time slots assigned to that specific user:
0th, 8th, . . ., 448th bits are put into the first time slot,
1st, 9th, . . ., 449th bits are put into the second time slot, etc.
One time slot every 4.615 ms per user, or a delay of ~ 40 ms (ok for voice).
The 8 time slots are shared between two 20 ms speech frames.

1109

Repetition code: Diversity analysis

After interleaving over L coherence time periods,

Repetition coding: for all

where and

This is classic vector detection in white Gaussian


noise.
1110

555
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Repetition coding: Matched filtering

hx1 only spans a 1-dim space


(similar to MPAM, with
random channel gains instead!)

||h||

1111
Multiply by conjugate => cancel phase!

Repetition coding: Fading analysis


BPSK Error probability:

Average over ||h||2 i.e. over Chi-squared distribution,


L-degrees of freedom! Repetition coding gets full diversity,
but sends only one symbol every L
symbol times.
i.e. trades off bit-rate for
reliability (better BER)

1112

556
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Chi-square distribution
In probability theory and statistics, the chi-square distribution
(also chi-squared or -distribution) with k degrees of freedom
is the distribution of a sum of the squares of k independent
standard normal random variables.
It is one of the most widely used probability distributions in
hypothesis testing, or in construction of confidence intervals
If X1, ..., Xk are independent, standard normal random variables,
then the sum of their squares

is distributed according to the chi-square distribution with k


degrees of freedom denoted as

The chi-square distribution has one parameter: k a positive integer that specifies the
1113
number of degrees of freedom (i.e. the number of Xis)

Diversity gain: Intuition


Typical error (deep fade) event probability:
In other words, ||h|| < ||w||/||x||
i.e. ||hx|| < ||w||
(i.e. signal x is attenuated to be of the order of noise w)

Chi-Squared pdf of

1114

557
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Key note: Deep fades become rarer


Deep fade Error event

Note: this graph plots


reliability (i.e. BER vs SNR)

Repetition code trades


off information rate
(i.e. poor use of deg-of-freedom)
1115

Antenna diversity

1116

558
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Antenna diversity

Receive Transmit Both


(SIMO) (MISO) (MIMO)

1117

Antenna diversity: Rx

Receive Transmit Both


(SIMO) (MISO) (MIMO)

1118

559
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Receive diversity

Same mathematical structure as repetition


coding in time diversity (!), except that
there is a further power gain (aka array
gain).
Optimal reception is via matched filtering/MRC
(Maximal Ratio combiner)
(a.k.a. receive beamforming).

1119

Array gain vs. diversity gain


Diversity Gain: There are multiple independent channels between
the transmitter and receiver, and diversity gain is a product of the
statistical richness of those channels

Array gain is not caused by statistical diversity between the


different channels but coherent combination of the actual energy
received by each of the antennas.
Even if the channels are completely correlated, as might happen in a
line-of-sight (LOS) system, the received SNR increases linearly with
the number of receive antennas.
Eg: Correlated flat-fading:

Single Antenna SNR:

Adding all received paths:

1120

560
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Receive diversity: Selection combining

Recall: Bandpass vs. matched filter analogy.


Pick max signal, but dont fully combine signal
power from all taps. Diminishing returns from
more taps.

1121

Receive Beamforming: Maximal Ratio Combining


(MRC)
Weight each branch

SNR:

MRC idea: Branches with better signal energy should be enhanced,


whereas branches with lower SNRs given lower weights
1122

561
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Equivalence of MRC and matched filtering


Maximal Ratio Combining (MRC) or Beamforming is just
matched filtering in the spatial domain!
Generalization of this f-domain picture, for combining multi-tap
signal

Weight each branch

SNR: 1123

Selection diversity vs. MRC

1124

562
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Antenna diversity: Tx

Receive Transmit Both


(SIMO) (MISO) (MIMO)

1125

Transmit diversity

If transmitter knows the channel, send:

maximizes the received SNR by in-phase addition of


signals at the receiver (transmit beamforming), i.e.
closed-loop Tx diversity.

1126

563
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Spacetime coding (STC)


Spacetime coding is a technique used in wireless
communications to transmit multiple copies of a data stream
across a number of antennas and to exploit the various received
versions of the data to improve the reliability of data-transfer.
The fact that the transmitted signal must traverse a difficult
environment with scattering, reflection, refraction and so on and
may then be further corrupted by thermal noise in the receiver
means that some of the received copies of the data will be
'better' than others.
This redundancy results in a higher chance of being able to use
one or more of the received copies to correctly decode the
received signal.
In fact, spacetime coding combines all the copies of the
received signal in an optimal way to extract as much information
from each of them as possible.

1127

Space-Time Block Coding (STBC)


STC involves the transmission of multiple redundant
copies of data to compensate for fading and thermal
noise in the hope that some of them may arrive at the
receiver in a better state than others.
In the case of STBC in particular, the data stream to
be transmitted is encoded in blocks, which are
distributed among spaced antennas and across time.
While it is necessary to have multiple transmit
antennas, it is not necessary to have multiple receive
antennas, although to do so improves performance.
This process of receiving diverse copies of the data
is known as diversity reception

1128

564
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Antenna Diversity: Tx+Rx = MIMO

Receive Transmit Both


(SIMO) (MISO) (MIMO)

1129

Wireless Overview

565
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Digital communication system


Wireless channel with RX and TX antennas
and between them more or less free-space
Noise
Transmitted Received Received
Info. signal signal info.
Transmitter Channel Receiver
Source
SOURCE
User

Transmitter

Source Channel
Formatter Modulator
encoder encoder

Receiver

Source Channel
Formatter Demodulator
decoder decoder

What is wireless communication?


Any form of communication that does not require the
transmitter and receiver to be in physical contact through
guided media
Electromagnetic wave propagated through free-space
RF, Microwave, IR, Optical
Simplex: one-way communication (e.g., radio, TV)
Half-duplex: two-way communication but not simultaneous
(e.g., push-to-talk radios)
Full-duplex: two-way communication (e.g., cellular phones)
Frequency-division duplex (FDD)
Time-division duplex (TDD)
1132

566
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Why wireless?
Characteristics
Mostly radio transmission
New protocols for data transmission are needed

Advantages
Spatial flexibility in radio reception range
Ad hoc networks without former planning
No problems with wiring (e.g. historical buildings, fire protection, esthetics)
Robust against disasters like earthquake, fire and careless users (which
remove and break connectors and cut wires)

Disadvantages
Generally lower transmission rates for higher numbers of users
Often proprietary, standards are often restricted
Many national regulations, global regulations are evolving slowly
Restricted frequency range, interferences of frequencies

Nevertheless, in the last 30 years, it has really been a wireless revolution 1133

Radio wave propagation

Propagation of the radio wave in free space


depends heavily on the frequency of the
signal and obstacles in its path.
There are some major effects on signal
behavior
Reflection and multipath
Diffraction or shadowing
scattering
Building and vehicle penetration
Fading of the signal
Interference 1134

567
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Factors affecting wave propagation

(1) direct signal


(2) diffraction
(3) vehicle penetration
(4) interference
(5) building penetration

1135

Path loss, shadowing, fading


Variable & rapid decay of signal due to environment, multi-
paths, mobility

568
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Fading channel

Wireless channel is very different!


Wireless channel is very different from a wired channel.
Not a point-to-point link: EM signal propagates in patterns determined by the
antenna gains and environment
Noise adds on to the signal (AWGN)
Signal strength falls off rapidly with distance (especially in cluttered
environments): large-scale fading.
Shadowing effects make this large-scale signal strength drop-off non-isotropic.
Fast fading leads to huge variations in signal strength over short distance, times,
or in the frequency domain.
Interference due to superimposition of signals, leakage of energy can raise the
noise-floor and fundamentally limit performance:
Self-interference (inter-symbol, inter-carrier), co-channel interference (in a
cellular system with high frequency reuse), cross-system interference
(microwave ovens vs. WiFi vs. bluetooth)
Results:
Variable capacity
Unreliable channel: errors, outages
Variable delays
Capacity is shared with interferers.

569
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Wireless systems
Cellular
With a big emphasis on voice communication
Terrestrial microwave and satellite systems
WiFi
Local networks over wireless, with infrastructure
E.g., 802.11a,b,g,n
WiMAX
Internet provider last mile replacement
Ad Hoc Network
Local networks over wireless, without infrastructure
Sensor network
Radar and radio telescope system 1139

Cellular systems
Geographic region divided into cells
Frequencies/timeslots/codes reused at spatially-separated
locations.
Base stations/MTSOs (Mobile Telephone Switching Offices)
coordinate handoff and control functions
Shrinking cell size increases capacity, as well as networking
burden
Note: Co-channel interference (between same-color cells
below).

BASE
STATION
MTSO

1140

570
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Cellular phone networks


Los Angeles

BS
BS

Internet
New York
MTSO MTSO
PSTN

BS

1141
PSTN - Public ServiceTelephone Network

Inside the BS/MTSO is buzzwords bonanza!

1142

571
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Wireless generations
First Generation (1G): Analog 25 or 30 kHz FM,
voice only, mostly vehicular communication
Second Generation (2G): Narrowband TDMA and
CDMA, voice and low bit-rate data, portable
units.
Third Generation (3G): Wideband TDMA and
CDMA, voice and high bit-rate data, portable
units
Fourth Generation (4G and beyond 2015): true
broadband wireless: advanced versions of
WiMAX, 3G LTE, 802.11 a/b/g/n, UWB together
in form of intelligent cognitive radio

1143

Generations

Other Tradeoffs:
Rate Rate vs. Coverage
4G Rate vs. Delay
Rate vs. Cost
3G Rate vs. Energy

2G

Mobility
1144

572
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

LTE: Long-Term Evolution

Based upon OFDM, OFDMA, MIMO


Longer term objective is to support
up to peak data rate of 200 Mbps
with a high average spectral
efficiency

Rule of thumb: the actual capacity (Mbps per channel per sector) in a
multi-cell environment for most wireless
1145
technologies is about 20% to
30% of the peak theoretical data rate.

Wireless evolution to 4G

573
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

4G/IMT-Advanced
International Mobile Telecommunications (IMT)-Advanced Standard
are requirements issued by the ITU-R of the International
Telecommunication Union (ITU) in 2008 for what is marketed as 4G (Or
sometimes as 4,5G mobile phone and Internet access service.
4G provides, in addition to the usual voice and other services of 3G, mobile
broadband Internet access, for example to laptops with wireless modems, to
smartphones, and to other mobile devices.
Potential and current applications include amended mobile web access, IP
telephony, gaming services, high-definition mobile TV, video
conferencing, 3D television, and cloud computing.
4.5G provides better performance than 4G systems, as an interim step
towards deployment of full 5G capability.
The technology includes:
LTE Advanced
MIMO

5G
5G denotes the next major phase of mobile telecommunications standards
beyond the current 4G/IMT-Advanced standards.
5G network requirements could be as:
Spectral efficiency should be significantly enhanced compared to 4G.
Coverage should be improved.
Signaling efficiency enhanced.
Latency should be significantly reduced compared to LTE.
Data rates of several tens of Mb/s should be supported for tens of thousands of
users.
1 Gbit/s to be offered, simultaneously to tens of workers on the same office
floor.
Several hundreds of thousands of simultaneous connections to be supported for
massive sensor deployments.

574
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

5G

A road map would be:


Detailed requirements ready and initial system design in 2017
Standards ready end of 2018
Trials start in 2018
Commercial system ready in 2020
The launch of 5G will happen on an operator and country
specific basis in 2020.
In addition to simply providing faster speeds 5G networks will
also need to meet the needs such as the Internet of Things
(IOT).

Classification

Wireless (vs. wired) communication medium


Cellular (vs. meshed vs. MANETs) architectures for
coverage, capacity, QoS, mobility, auto-configuration,
infrastructure support
Mobile (vs. fixed vs. portable) implications for devices
WPAN (vs. WLAN vs. WMAN) network scope, coverage,
mobility
Technologies/Standards/Marketing Alliances: 802.11,
UWB, Bluetooth, Zigbee, 3G, GSM, CDMA, OFDM, MIMO,
WiMAX

Mobile Ad-hoc NETwork (MANET) - ad hoc: no backbone infrastructure


1150
Note: Wireless Body Area Network (WBAN)

575
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Wireless standards

IEEE 802.15.4 Sensors RFID


(Zigbee Alliance)

RAN
IEEE 802.22
WAN
IEEE 802.20
IEEE 802.16e

IEEE 802.16d MAN ETSI HiperMAN


WiMAX & HIPERACCESS

IEEE 802.11 LAN ETSI


Wi-Fi Alliance HiperLAN

IEEE 802.15.3 PAN ETSI


UWB, Bluetooth
HiperPAN
Wi-Media
1151

Wireless LANs: WiFi/802.11


Based on the IEEE 802.11a/b/g/n family of standards, and is
primarily a local area networking technology designed to provide
in-building or campus broadband coverage.
IEEE 802.11a/g peak physical layer data rate of 54 Mbps and
indoor coverage over a distance of 30 m.
Beyond buildings: municipal WiFi, Neighborhood Area
Networks (NaN), hotspots
Much higher peak data rates than 3G systems, primarily since it
operates over a larger bandwidth (20 MHz).
Its MAC scheme CSMA (Carrier Sense Multiple Access) is
inefficient for large numbers of users
The interference constraints of operating in the license-
exempt band is likely to significantly reduce the actual
capacity of outdoor Wi-Fi systems.
Wi-Fi systems are not designed to support high-speed
mobility.
Wide availability of terminal devices
802.11n: MIMO techniques for range extension and higher bit
rates 1152

576
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Some wireless LAN standards (Wi-Fi)


802.11b
Standard for 2.4GHz ISM band
Frequency hopped spread spectrum
Up to 11 Mbps
802.11a
Standard for 5GHz
OFDM
Up to 54 Mbps
HiperLAN in Europe
802.11g
Standard in both 2.4 GHz and 5 GHz bands
OFDM
Speeds up to 54 Mbps

Frequency Hopping Spread Spectrum (FHSS) - the total frequency band is split into a number of channels.
The broadcast data is spread across the entire frequency band by hopping between the channels in a
pseudorandom fashion.
OFDM - Orthogonal Frequency Division Multiplexing is a multi carrier transmission technique capable of
supporting high speed services whilst still being bandwidth efficient. It achieves this by forcing multiple
1153
sub-carriers together. However, to ensure these adjacent sub-carriers do not cause excessive
interference, they must be orthogonal or 90 to one another.

IEEE 802.11n
Over-the-air (OTA): 200 Mbps; MAC layer (MAC-SAP*): 100Mbps
Microcells, neighborhood area networks (NANs)
PHY
MIMO/multiple antenna techniques
Advanced FEC, (Forward Error Correction)
10, 20 & 40MHz channels widths
Higher order modulation/coding

1154

577
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

WLAN network architecture


Basic Service Set (BSS): a set of stations which
communicate with one another

Ad-hoc network Infrastructure Mode

Only direct communication


Stations communicate with AP
possible
AP provides connection to wired network
No relay function
(e.g. Ethernet)
Stations not allowed to communicate
directly
Some similarities with cellular
1155

WLAN network architecture


ESS: a set of BSSs interconnected by a distribution system (DS)

Local Area Network (e.g .Ethernet)

ESS Extended Service Set

578
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

IEEE 802.15 (WPANs)

802.15.1 adoption of Bluetooth standard into


IEEE
802.15.2 coexistence of WPANs and WLANs
in the 2.4GHz band
802.15.3 high rate WPAN (UWB)
802.15.4 low rate WPAN (Zigbee)

802.15: Wireless Personal Area Network

less than 10 m diameter


replacement for cables
(mouse, keyboard, S
headphones) radius of
M
ad hoc: no backbone coverage

infrastructure S
S
master/slaves:
slaves request permission to
send (to master)
master grants requests
802.15: evolved from
Bluetooth specification
2.4-2.5 GHz radio band
1158
up to 1 Mbps

579
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Bluetooth: WPAN (piconet)

Cable replacement RF technology (low cost)


Short range (typically <10 m, extendable to 100 m)
2.4 GHz ISM band (crowded!)
Data rate 1 Mbit/s
Widely supported by telcos, PC and consumer
electronics companies

What is UltraWideBand (UWB)?


Time-domain behavior Frequency-domain behavior
Narrowband
Communication

0 1 0 1

Frequency
Modulation
2.4 GHz
Communication
Ultrawideband

1 0 1
Impulse
Modulation

time 3 frequency 10 GHz

(FCC Min=500MHz)

Communication occupies more than 500 MHz of spectrum:


baseband or 3.6-10.1 GHz range. Strict power limits.

580
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Why is UWB interesting?


UWB is an impulse radio: sends pulses of tens of picoseconds (10-12) to
nanoseconds (10-9)
Duty cycle of only a fraction of a percent; carrier is not necessarily needed
Uses a lot of bandwidth (GHz); Low probability of detection
Excellent ranging capability; Synchronization (accurate/rapid) an issue.
Multipath highly resolvable: good and bad
Can use OFDM or Rake receiver to get around multipath problem.
Low power transmitters -- 100 times lower than Bluetooth for same
range/data rate
Very high data rates possible Gbps at ~10 m under current regulations
7.5 GHz of free spectrum in the U.S.
FCC legalized UWB for commercial use
Spectrum allocation overlays existing users, but its allowed power
level is very low to minimize interference
Apps: Wireless USB,
480 Mbps, 10m,

UWB spectrum

Bluetooth,
802.11b 802.11a
GPS
PCS

Emitted Cordless Phones


Signal Microwave Ovens
Power

-41 dBm/MHz
UWB
Spectrum
1.6 1.9 2.4 3.1 5 10.6
Frequency (GHz)

Worldwide regulations differ from US --- Japan, EU, Asia

1162

581
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

IEEE 802.15.4 / ZigBee radios

Low-Rate WPAN
Very low power consumption (no recharge for months or
years!), up to 255 devices
Data rates of 20, 40, 250 kbps
Star clusters or peer-to-peer operation
CSMA-CA channel access
Frequency of operation in ISM bands
Home automation, consumer electronics applications,
RFID/tagging applications (goods supply-chain)

WiMAX fixed and mobile


WiMAX Fixed / Nomadic WiMAX Mobile
802.16d or 802.16 802.16e
Usage: Backhaul, Wireless DSL Usage: Long-distance mobile
Frequencies: 2.5GHz, 3.5GHz wireless broadband
and 5.8GHz (Licensed and L- Frequencies: 2.5GHz
free) Description: Wireless
Description: wireless connections to laptops, PDAs and
connections to homes, handsets when outside of Wi-Fi
businesses, and other WiMAX hotspot coverage
or cellular network towers

WiMAX - Worldwide Interoperability for Microwave Access - the radio interface


1164
within two broad radio bands 2 10 GHz and 11 66 GHz

582
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Wide area: Satellite systems

Cover very large areas


Different orbit heights
GEOs (~40000 km), LEOs
(~2000 km), MEOs (~9000km)
Dish antennas, or bulky handsets
Optimized for one-way transmission
Location positioning
GPS systems
Satellite Radio
Radio ( DAB) and (SatTV) broadcasting
Most two-way systems struggling or bankrupt
Expensive alternative to terrestrial cellular system
Trucking fleets, journalists in wild areas, oil rigs

Paging systems
Broad coverage for short messaging
Message broadcast from all base stations
High Tx power (hundreds of watts to kilowatts),
low power pagers
Simple terminals
Optimized for 1-way transmission
Answer-back hard
Overtaken by cellular
obsolete

1166

583
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Radio spectrum and its efficient


utilization

Crowded spectrum: FCC chart

1168

584
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

EM Spectrum for telecom

Most spectra licensed; license can be very


expensive (cellular);
Infrared, ISM (Industrial, Scientific,
Medical) band, and amateur radio bands are
license-free

Licensed and unlicensed spectrum

Licensed
Cell phones, police & fire radio, taxi dispatch, etc.
Unlicensed
All unlicensed bands impose power limits
Industrial, Scientific, Medical (ISM) bands
e.g. (900MHz, 2.4GHz, 5.8GHz)
Unlicensed Personal Communication System (UPCS)
e.g. 1.910-1.920 GHz and 2.390-2.400 GHz
Unlicensed National Information Infrastructure
(UNII) bands
e.g. 5.2GHz

585
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Example: 2.4 GHz interference


Micro-wave oven
Bluetooth
802.11b/g WLAN
Cordless phone
Analog video link

Radio/TV/Wireless allocations: 30 MHz-30 GHz

1172

586
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Open spectrum: ISM and UNII Bands


ISM: Industrial, Scientific & Medical Band UNII
UNII: Unlicensed National Information Infrastructure band
ISM
ISM

1 2 3 4 5 GHz

1173

802.11/802.16 spectrum
UNII

International International
US Japan ISM
Licensed ISM Licensed Licensed
Licensed

1 2 3 4 5 GHz

802.16a has both licensed and license-free options


ISM: Industrial, Scientific & Medical Band Unlicensed band
UNII: Unlicensed National Information Infrastructure band Unlicensed band

1174

587
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Summary: Key pieces of licensed and unlicensed


spectrum
Upper
Low/Mid UNII
WCS ISM MMDS Intl Intl UNII and ISM
License Exempt

Licensed

New Spectrum

2 3 4 5 GHz

1175

Spectrum allocation methods

Auctions: raise revenue, market-based, but may shut out


smaller players; upfront cost depress innovation (lower
equipment budget).
Beauty contest: best technology wins. Faster deployments,
monopolies/oligopolies.
Unlicensed: power limits. (WiFi, some Wimax)
Underlay: primary vs. secondary users. Stricter power limits
for secondary: hide in a wider band under the noise floor
(UWB)
Cognitive radio: primary user has priority. Secondary user
can use greater power, but has to detect and vacate the
spectrum when primary users come up.
1176

588
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Counter-attacking the challenges


Turn disadvantages into advantages!
Resources associated with a fading channel: 1) diversity; 2) number of degrees of
freedom; 3) received power.
Cellular concept: reuse frequency and capacity by taking advantage of the fact that
signal fades with distance. Cost: cells, interference management
Multiple access technologies: CDMA, OFDMA, CSMA, TDMA: share the spectrum
amongst variable number of users within a cell
Diversity i.e. use performance variability as an ally by having access to multiple
modes (time, frequency, codes, space/antennas, users) and combining the signal
from all these modes
Directional/Smart/Multiple Antenna Techniques (MIMO): use spatial diversity, spatial
multiplexing.
Adaptive modulation/coding/power control per-user within a frame in low-SNR regime
Multi-hop/Meshed wireless networks with micro-cells
Interference: still the biggest challenge.
Interference estimation and cancellation techniques (e.g., multi-user) may be
key in the future.
CDMA: interference averaging.
Opportunistic beamforming using intelligent antennas

Cellular concept: Spatial reuse

Note: With CDMA


or WiMAX there can
be frequency
1178 reuse of 1

589
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Cells in reality

Cellular model vs. reality: shadowing and variable


large-scale propagation due to environment

1179

Interference in cellular networks


Assume the asynchronous users sharing the
same bandwidth and using the same radio base
station in each coverage area or cell.
Intra-cell/co-channel interference due to the
signal from the other users in the home cell.
Inter-cell/adjacent channel interference due to
the signal from the users in the other cell.
Interference due to the thermal noise.

Methods for reducing interference:


Frequency reuse: in each cell of cluster
pattern different frequency is used
By optimizing reuse pattern the problems
of interference can be reduced
significantly, resulting in increased
capacity.
Reducing cell size: in smaller cells the
frequency is used more efficiently: cell
sectoring, splitting
Multilayer network design (overlays):
macro-, micro-, picocells

590
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Cell splitting increases capacity

1181

Trend towards smaller cells


Driving forces:
Need for higher capacity in areas with high user density
Reduced size and cost of base station electronics.
[Large cells require 1 million base stations]
Lower height/power, closer to street.

Issues:
Mobiles traverse a small cell more quickly than a large cell.
Handoffs must be processed more quickly.
Location management becomes more complicated, since there are
more cells within a given area where a mobile may be located.
May need wireless backhaul
Wireless propagation models dont work for small cells.
Microcellular systems are often designed using square or triangular
cell shapes, but these shapes have a large margin of error

1182

591
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Sectoring improves S/I

Capacity increase 3X.


Each sector can reuse time and code slots.
Interference is reduced by sectoring, since users
only experience interference from the sectors at
their frequency.
1183

Sectoring: Tradeoffs

More antennas.

Even though intersector handoff is simpler compared


to intercell handoff, sectoring also increases the
overhead due to the increased number of inter-
sector handoffs.

In channels with heavy scattering, desired power can


be lost into other sectors, which can cause inter-
sector interference as well as power loss

1184

592
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Cell sizes: Multiple layers

Global
Satellite

Suburban Urban
In-Building

Picocell
Microcell
Macrocell

Basic Phone
Smart Phone
Laaptop

1185

Cell breathing: CDMA networks

1186

593
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Capacity planning: Multi-cell issues,


coverage-capacity-quality tradeoffs
Coverage and Range
Required site-to-site distance
in [m]
Capacity:
kbps/cell/MHz for data
Quality
Service dependent

Delay and packet loss rate


important for data services

Interference due to spectrum


reuse in nearby cells.

1187

Handover (Handoff)
Handover :
Cellular system tracks mobile stations in order to maintain their
communication links.
When mobile station goes to neighbor cell, communication link switches
from current cell to the neighbor cell.
Hard handover :
In FDMA or TDMA cellular system, new communication establishes after
breaking current communication at the moment doing handover.
Communication between MS and BS breaks at the moment switching
frequency or time slot.

switching

Cell B Cell A

Hard handover : connect (new cell B) after break (old cell A)

594
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Soft handover
Soft handover :
In CDMA cellular system, communication does not break even at the
moment doing handover, because switching frequency or time slot is
not required.

transmitting same signal from both BS A


and BS B simultaneously to the MS

Cell B
Cell A

Soft handover: break (old cell A) after connect (new cell B)

Mobility/Handover in umbrella cells

Avoids multiple handoffs.


1190

595
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Overlay wireless networks: Mobility &


Handover

1191

Duplexing methods for radio links

Base Station

Forward link
Reverse link

Mobile Station

1192

596
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Frequency Division Duplex (FDD)

Forward link frequency and reverse link frequency is different


In each link, signals are continuously transmitted in parallel.

Forward link (F1)


Reverse link (F2) Base Station

Mobile Station

1193

Example of FDD systems

Mobile Station Base Station

Transmitter BPF BPF Transmitter


F1 F2

Receiver BPF BPF Receiver


F2 F1

BPF: Band Pass Filter

1194

597
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Time Division Duplex (TDD)

Forward link frequency and reverse link frequency is the same.


In each link, signals take turns just like a ping-pong game.

Forward link (F1)

Reverse link (F1)


Base Station

Mobile Station

1195

Example of TDD Systems

Mobile Station Base Station

Transmitter Transmitter

BPF BPF
Receiver F1 F1 Receiver

Synchronous Switches

BPF: Band Pass Filter

1196

598
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Multiplexing: Outline

Single link:
Channel partitioning (TDM, FDM, WDM)
vs. Packets/Queuing/Scheduling
Series of links:
Circuit switching vs. packet switching
Statistical Multiplexing (leverage randomness)
Multiplexing gain
Distributed multiplexing (MAC protocols)
Channel partitioning: TDMA, FDMA, CDMA
Randomized protocols: Aloha, Ethernet (CSMA/CD)
Taking turns: distributed round-robin: polling, tokens

1197

Multiplexing: TDM

1198

599
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Multi-carrier: FDM and OFDM


Ch.1 Ch.2 Ch.3 Ch.4 Ch.5 Ch.6 Ch.7 Ch.8 Ch.9 Ch.10

Conventional multicarrier techniques frequency

Ch.2 Ch.4 Ch.6 Ch.8 Ch.10


Ch.1 Ch.3 Ch.5 Ch.7 Ch.9
Saving of bandwidth

50% bandwidth saving

Orthogonal multicarrier techniques frequency

Actually these are sinc-pulses in frequency domain.


Symbols are longer duration
1199in time-domain, and can
eliminate ISI caused by dispersion due to multipaths

Multipath propagation & ISI


Time dispersive channel
Reflections from walls, etc.
Impulse response:

p ( )

[ns]
Problem with high rate data
transmission:
multipath delay spread is of the
order of symbol time
inter-symbol-interference (ISI)

1200

600
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Inter-Symbol-Interference (ISI) due to


Multipath fading

Transmitted signal:

Received Signals:
Line-of-sight:

Reflected:

The symbols add up on the


channel Delays
Distortion!

1201

OFDM: Parallel Tx on narrow bands


Channel
Channel impulse
transfer function
response Time
Frequency
(Freq.selective fading)

1 Channel (serial) Signal is


Frequency
broadband

2 Channels Frequency

8 Channels
Frequency

Channels are
narrowband
(flat fading, ISI)
1202

601
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

MIMO: Spatial diversity, spatial multiplexing


with multiple antennas

Example: Simple selection diversity (Rx only), Diversity Gains.. 1203

SISO, MISO, SIMO, MIMO, SDMA

SISO
Single Input,
Single Output

MISO
Multiple Input,
Single Output
SIMO
Single Input,
Multiple Output

MIMO
Multiple Input,
SDMA Multiple Output
1204

602
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Adaptive antenna gains (Tx or Rx)


Diversity
differently fading paths
fading margin reduction
no gain when noise-
limited
Coherent Gain
energy focusing
improved link budget
reduced radiation

Interference Mitigation
energy reduction
enhanced capacity
improved link budget

Enhanced Rate/Throughput
co-channel streams
increased capacity
increased data rate
1205

Multiple Access Control (MAC)

Base Station

Forward link
Reverse link
Mobile Station

Mobile Station
Mobile Station Mobile Station

1206

603
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

MAC protocols: a taxonomy


Channel Partitioning: TDMA, FDMA
divide channel into pieces (time slots, frequency)
allocate piece to node for exclusive use

Random Access: Aloha, Ethernet CSMA/CD, WiFi CSMA/CA


allow collisions
recover from collisions
Wireless: inefficiencies arise from hidden terminal problem, residual
interference
Cannot support large numbers of users and at high loads

Taking turns: Token ring, distributed round-robin, CDMA, polling


Coordinate shared access using turns to avoid collisions.
Achieve statistical multiplexing gain & large user base, but complexity
CDMA can be loosely classified here (orthogonal code = token)
OFDMA with scheduling also in this category 1207

MAC protocols: Efficiency


Channel partitioning MAC protocols:
share channel efficiently at high load
inefficient at low load: delay in channel access, 1/N
bandwidth allocated even if only 1 active node!

Random access MAC protocols


efficient at low load: single node can fully utilize channel
high load: collision overhead

Taking turns protocols


look for best of both worlds!

1208

604
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Channel partitioning
MAC protocols
TDMA: time division multiple access
Access to channel in "rounds"
Each station gets fixed length slot (length = pkt
trans time) in each round
Unused slots go idle
Example: 6-station LAN, 1,3,4 have pkt, slots
2,5,6 idle
Does not statistical multiplexing gains here

1209

TDMA overview

A
B f0
C B A C B A C B A C B A

C
Time

1210

605
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

FDMA overview

C C
f2

B B f1

A A f0

Time

Need substantial guard bands: inefficient


1211

CDMA
spread spectrum

Base-band Spectrum Radio Spectrum

Code B
Code A
B

B
Code A A
A

B C C
B B C
A A A B
A C
B

Time
Sender Receiver

1212

606
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

OFDMA
OFDMA: a mix of FDMA/TDMA: (OFDM modulation)
Sub channels are allocated in the frequency domain,
OFDM symbols allocated in the time domain.
Dynamic scheduling leverages statistical multiplexing gains,
and allows adaptive modulation/coding/power control, user
diversity
t TD M A

T D M A \O F D M A
m

N
1213

Summary of multiple access

FDMA
power

TDMA
power

CDMA
power

1214

607
Lecture notes Telecommunications Engineering II by Jorma Kekalainen

Wireless is hot, but note


The many advantages of wireless are evident to all
anywhere, anytime, unwired access to the global phone
network or wireless Internet via a highly portable
lightweight device.

But if you are not mobile user, it is often more efficient to go


wired (especially optical)
Nearly interference free
If you need more bandwidth: just add a bunch of fibers
As fiber is much cheaper than digging and resurfacing
streets, put in more fiber than you would ever need (dark
fiber)
Often only the last mile is wireless

1215

608

You might also like