MMAE 517
© 2010 K. W. Cassel
Introduction to CFD
CFD Advertisements...
Computational:
+ Address more complex problems (physics and geometries).
+ Can consider hypothetical flows ⇒ test theoretical models.
+ Provides detailed solutions ⇒ good understanding of flow, e.g. does
separation occur?
+ Perform parametric studies.
+ Can easily try different configurations, e.g. geometry, boundary conditions,
etc... ⇒ important in design.
+ Computers becoming faster and cheaper ⇒ range of CFD expanding.
+ Increased potential using parallel processing.
+ More cost effective and faster than experimental prototyping.
− Requires accurate governing equations (don’t have for turbulence,
combustion, etc..., which require modeling).
− Boundary conditions sometimes difficult to implement, e.g. outlet.
− Difficult to do in certain parameter regimes, e.g. high Reynolds numbers, and
complex geometries.
Experimental:
+ Easier to get overall quantities for problem (e.g. lift and drag on an airfoil).
+ No “modeling” necessary.
− Often requires intrusive measurement probes.
− Limited measurement accuracy.
− Limited measurement resolution.
− Effects of support apparatus, end walls, etc... must be considered.
− Some quantities difficult to obtain, e.g. streamfunction, vorticity, etc....
− Experimental equipment often expensive and takes up space.
− Difficult and costly to test full-scale models.
Note: Computational approaches do not replace analytical or experimental
approaches, but complement them.
Numerical Methods: General Considerations and Approaches
Components and Properties of a Numerical Solution

Physical System (i.e. reality) → 1 Mathematical Model → 2 Discretization → 3 Matrix Solver → Numerical Solution
Steps:
1 Mathematical model – mass, momentum and energy conservation + models, idealizations, etc.
2 Discretization – approximation of the continuous differential equation(s) by a system of algebraic equations for the dependent variables at discrete locations in the independent variables (space and time), for example using finite differences.
Sources of errors in numerical solutions arise due to each step of the Numerical
Solution Procedure:
1 Modeling errors – difference between actual flow and exact solution of
mathematical model.
2 Discretization errors – difference between exact solution of governing
equations and exact solution of algebraic equations.
i) Method of discretization → inherent error of method, i.e. truncation error.
ii) Computational grid → can be refined.
3 Iterative convergence errors – difference between the (iterative) numerical solution and the exact solution of the algebraic equations.
This is where the “pragmatism” mentioned by Dawes is an important element of
CFD.
e.g. DNS of turbulence in other than elementary flows is currently impossible.
⇒ Compromises must often be made at each stage of the procedure.
Finite Difference
Basic approach:
Discretize the governing equations in differential form using
Taylor-series-based finite-difference approximations at each grid point.
Produces algebraic equations involving each grid point and surrounding
points.
Local approximation method.
Popular in fluid dynamics research.
Advantages:
Relatively straightforward to understand and implement (based on Taylor
series).
Utilizes familiar differential form of governing equations.
Very general ⇒ Apply to a wide variety of problems (including complex
physics, e.g. fluids plus heat transfer plus combustion).
Can extend to higher-order approximations.
Disadvantages:
More difficult to implement for complex geometries.
Finite Volume
Basic approach:
Apply conservation equations in integral form to a set of control volumes.
Produces algebraic equations for each control volume involving surrounding
control volumes.
Local approximation method.
Popular for commercial CFD codes (e.g. FLUENT).
Advantages:
Easier to treat complex geometries than finite-difference approach.
“Ensures” conservation of the necessary quantities (i.e. mass, momentum, energy, etc.), even if the solution is inaccurate.
Disadvantages:
More difficult to construct higher-order schemes.
Uses less familiar integral formulation of governing equations.
Finite Element
Basic approach:
Apply conservation equations in variational form with weighting function to
set of finite elements.
Produces set of linear or nonlinear algebraic equations.
Local approximation method.
Popular in commercial codes (particularly for solid mechanics and heat
transfer).
Advantages:
Easy to treat complex geometries.
Disadvantages:
Results in unstructured grids.
Solution methods are inefficient for the types of matrices resulting from
finite-element discretizations (cf. finite difference ⇒ sparse, highly structured
matrices).
Spectral Methods
Basic approach:
Solution of governing equations in differential form are approximated using
truncated (usually orthogonal) eigenfunction expansions.
Produces system of algebraic equations (steady) or system of ordinary
differential equations (unsteady) involving the coefficients in the
eigenfunction expansion.
Global approximation method.
Popular for direct numerical simulation (DNS) of turbulence.
Advantages:
Obtain highly accurate solutions when underlying solution is smooth.
Can achieve rapid convergence.
Disadvantages:
Less straightforward to implement than finite difference.
More difficult to treat complicated boundary conditions (e.g. Neumann).
Small changes in problem can cause large changes in algorithm.
Not well suited for solutions having large gradients.
Vortex Methods
Outline
3 Finite-Difference Methods
Extended Fin Example
Formal Basis for Finite Differences
Application to Extended Fin Example
Properties of Tridiagonal Matrices
Thomas Algorithm
Extended Fin Example – Convection Boundary Condition
The heat transfer within the extended fin is governed by the 1-D ordinary-differential equation (see, for example, Incropera & DeWitt):

d²T/dx² + (1/Ac)(dAc/dx)(dT/dx) − (1/Ac)(h/k)(dAs/dx)(T − T∞) = 0,   (3.1)

where Ac(x) is the cross-sectional area, As(x) is the surface area, h is the convection coefficient, and k is the thermal conductivity. Defining θ(x) = T(x) − T∞, this may be written

d²θ/dx² + f(x) dθ/dx + g(x)θ = 0,   (3.2)

where

f(x) = (1/Ac) dAc/dx,   g(x) = −(1/Ac)(h/k) dAs/dx.

The Dirichlet boundary conditions are

θ = θb = Tb − T∞ at x = 0,
θ = θL = TL − T∞ at x = L.   (3.3)
Equation (3.2) with boundary conditions (3.3) represents the mathematical model
(step 1 in the Numerical Solution Procedure).
Step 2 → Discretization:
Divide the interval 0 ≤ x ≤ L into I equal subintervals of length ∆x = L/I.
Here, fi = f(xi) and gi = g(xi) are known, and the solution θi = θ(xi) is to be determined for i = 2, . . . , I. The first derivative at xi may be approximated by

Forward difference:   dθ/dx|i ≈ (θi+1 − θi)/∆x
Backward difference:  dθ/dx|i ≈ (θi − θi−1)/∆x
Central difference:   dθ/dx|i ≈ (θi+1 − θi−1)/(2∆x)
Which approximation is better?
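As a quick numerical check (not part of the notes), the three approximations can be compared on a function with a known derivative; the helper names below are illustrative:

```python
import math

# Illustrative helpers; each approximates d(theta)/dx at x.
def forward(f, x, dx):
    return (f(x + dx) - f(x)) / dx

def backward(f, x, dx):
    return (f(x) - f(x - dx)) / dx

def central(f, x, dx):
    return (f(x + dx) - f(x - dx)) / (2 * dx)

# Compare on f(x) = sin(x) at x = 1, where the exact derivative is cos(1).
x, exact = 1.0, math.cos(1.0)
for dx in (0.1, 0.05):
    errs = [abs(d(math.sin, x, dx) - exact)
            for d in (forward, backward, central)]
    print(f"dx={dx}: fwd={errs[0]:.2e} bwd={errs[1]:.2e} ctr={errs[2]:.2e}")
# Halving dx roughly halves the one-sided errors (first order)
# but quarters the central error (second order).
```

For the same stencil width, central differencing is the more accurate choice, which is why it is preferred in the interior of the grid.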
Consider the Taylor series expansion of a function θ(x) in the vicinity of the point xi:

θ(x) = θ(xi) + (x − xi) dθ/dx|i + [(x − xi)²/2!] d²θ/dx²|i + [(x − xi)³/3!] d³θ/dx³|i + · · · + [(x − xi)ⁿ/n!] dⁿθ/dxⁿ|i + · · · .   (3.4)

Evaluating (3.4) at x = xi+1 = xi + ∆x and solving for the first derivative gives

dθ/dx|i = (θi+1 − θi)/∆x − (∆x/2) d²θ/dx²|i − · · · − (∆xⁿ⁻¹/n!) dⁿθ/dxⁿ|i − · · · .   (3.6)
Equations (3.6), (3.8) and (3.10) are exact expressions for the first derivative
(dθ/dx)i , i.e. if all of the terms are retained in the expansions.
Approximate finite difference expressions for the first derivative may then be
obtained by truncating the series after the first term:
dθ/dx|i ≈ (θi+1 − θi)/∆x + O(∆x)        → Forward difference
dθ/dx|i ≈ (θi − θi−1)/∆x + O(∆x)        → Backward difference
dθ/dx|i ≈ (θi+1 − θi−1)/(2∆x) + O(∆x²)  → Central difference
To obtain a second-order accurate one-sided approximation, expand θi+2 about xi:

θi+2 = θi + 2∆x dθ/dx|i + [(2∆x)²/2!] d²θ/dx²|i + [(2∆x)³/3!] d³θ/dx³|i + · · · .   (3.11)

Subtracting (3.11) from four times the expansion for θi+1 eliminates the second-derivative terms:

4θi+1 − θi+2 = 3θi + 2∆x dθ/dx|i − (2∆x³/3) d³θ/dx³|i + · · · .

Solving for the first derivative,

dθ/dx|i = (−3θi + 4θi+1 − θi+2)/(2∆x) + (∆x²/3) d³θ/dx³|i + · · · ,   (3.12)

which is second-order accurate and involves the point of interest and the next two points.
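The second-order accuracy of the one-sided formula (3.12) can be verified numerically (a quick sketch, not part of the notes):

```python
import math

def one_sided_2nd(f, x, dx):
    # (-3 f_i + 4 f_{i+1} - f_{i+2}) / (2 dx), equation (3.12)
    return (-3*f(x) + 4*f(x + dx) - f(x + 2*dx)) / (2 * dx)

exact = math.cos(1.0)                      # derivative of sin at x = 1
e1 = abs(one_sided_2nd(math.sin, 1.0, 0.1) - exact)
e2 = abs(one_sided_2nd(math.sin, 1.0, 0.05) - exact)
print(e1 / e2)  # ~4: halving dx quarters the error, i.e. O(dx^2)
```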
Applying central differences to equation (3.2) at each interior grid point gives the difference equation (3.14), aiθi−1 + biθi + ciθi+1 = di, with coefficients

ai = 1 − (∆x/2) fi,
bi = −2 + (∆x)² gi,
ci = 1 + (∆x/2) fi,
di = 0.
Note that because we have discretized the differential equation at each interior grid point, we obtain a set of (I − 1) algebraic equations for the (I − 1) unknown values of the temperature θi, i = 2, . . . , I.
The coefficient matrix for the difference equation (3.14) is tridiagonal:

[ b2 c2                     ] [ θ2   ]   [ d2 − a2θ1   ]
[ a3 b3 c3                  ] [ θ3   ]   [ d3          ]
[    a4 b4 c4               ] [ θ4   ]   [ d4          ]
[       ·  ·  ·             ] [ ·    ] = [ ·           ]
[          aI−1 bI−1 cI−1   ] [ θI−1 ]   [ dI−1        ]
[               aI   bI     ] [ θI   ]   [ dI − cIθI+1 ]
In terms of the eigenvalues of A,

cond₂(A) = |λ|max / |λ|min.

Let us consider N large. Thus, expanding the cosines in Taylor series (the first about π/(N + 1) → 0 and the second about Nπ/(N + 1) → π):

cos[π/(N + 1)] = 1 − (1/2!)[π/(N + 1)]² + (1/4!)[π/(N + 1)]⁴ − · · · ,
cos[Nπ/(N + 1)] = −1 + (1/2!)[Nπ/(N + 1) − π]² − · · · = −1 + (1/2)[π/(N + 1)]² − · · · .

Consider the common case that may result from the use of central differences for a second-order derivative:

a = 1, b = −2, c = 1,

for which

cond₂(A) ≈ 4/[π/(N + 1)]² = 4(N + 1)²/π², for large N.

Now consider

a = 1, b = −4, c = 1,

which is strictly, or strongly, diagonally dominant. Then from equation (3.16), the condition number for large N is approximately

cond₂(A) ≈ 6/2 = 3, for large N.
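These estimates can be checked numerically using the known eigenvalues λj = b + 2√(ac) cos(jπ/(N + 1)) of a constant tridiagonal Toeplitz matrix (a short sketch, assuming ac > 0):

```python
import math

def cond2_tridiag(a, b, c, N):
    """2-norm condition number of the N x N constant tridiagonal matrix
    [a, b, c], using eigenvalues b + 2*sqrt(a*c)*cos(j*pi/(N+1)),
    j = 1..N (requires a*c > 0)."""
    lams = [abs(b + 2*math.sqrt(a*c)*math.cos(j*math.pi/(N + 1)))
            for j in range(1, N + 1)]
    return max(lams) / min(lams)

N = 100
print(cond2_tridiag(1, -2, 1, N))   # grows like 4(N+1)^2 / pi^2
print(4*(N + 1)**2 / math.pi**2)
print(cond2_tridiag(1, -4, 1, N))   # stays near 3 for the dominant case
```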
1) Forward Elimination:

F1 = 0, δ1 = θ1 = θb (boundary condition),
Fi = ci/(bi − aiFi−1), δi = (di − aiδi−1)/(bi − aiFi−1), i = 2, . . . , I.

2) Back Substitution:

θI+1 = θL (boundary condition), θi = δi − Fiθi+1, i = I, I − 1, . . . , 2.
Notes:
1 See Anderson, Appendix A for a derivation.
2 The Thomas algorithm only requires O(I) operations, which is as good a
scaling as one could hope for. Gauss elimination of a full (dense) matrix, for
example, requires O(I 3 ) operations.
3 To prevent ill-conditioning the system of equations should be diagonally dominant. For our tridiagonal system, this means

|bi| ≥ |ai| + |ci|,

where if the greater-than sign applies we say that the matrix is strictly diagonally dominant, or weakly diagonally dominant if the equal sign applies.
Performing operations with ill-conditioned matrices can result in the growth
of small round-off errors that then contaminate the solution. For example,
note how errors could accumulate in the Fi , δi coefficients in the Thomas
algorithm.
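A minimal Python sketch of the Thomas algorithm (indices shifted so the arrays hold only the unknowns; the boundary values enter through the right-hand side):

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system with sub-, main- and super-diagonals
    a, b, c and right-hand side d (all length n) in O(n) operations.
    a[0] and c[-1] are unused."""
    n = len(d)
    F, delta = [0.0]*n, [0.0]*n
    F[0], delta[0] = c[0]/b[0], d[0]/b[0]
    for i in range(1, n):                  # forward elimination
        denom = b[i] - a[i]*F[i-1]
        F[i] = c[i]/denom
        delta[i] = (d[i] - a[i]*delta[i-1])/denom
    x = [0.0]*n
    x[-1] = delta[-1]
    for i in range(n - 2, -1, -1):         # back substitution
        x[i] = delta[i] - F[i]*x[i+1]
    return x

# theta'' = 0 with theta = 1 at the left boundary and 0 at the right,
# five interior unknowns; boundary values are folded into d.
n = 5
x = thomas([1.0]*n, [-2.0]*n, [1.0]*n, [-1.0] + [0.0]*(n - 1))
print([round(v, 6) for v in x])  # linear profile 5/6, 4/6, ..., 1/6
```

Note how an error introduced early in the Fi, δi recursions would propagate through the sweep, which is why diagonal dominance matters.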
Rather than a Dirichlet boundary condition at the tip of the fin, which assumes
that we know the temperature there, let us consider a more realistic convection
condition at the tip
−k dT/dx = h(T − T∞) at x = L,

or, with θ(x) = T(x) − T∞,

−k dθ/dx|x=L = hθ(L).

Now let us consider evaluation of the heat flux at the base of the fin:

qb = −kAc(0) dT/dx|x=0 = −kAc(0) dθ/dx|x=0.
Observe that each successive approximation requires one additional point in the
interior of the domain.
Along C, let

φ1(τ) = uxx, φ2(τ) = uxy, φ3(τ) = uyy,
ψ1(τ) = ux, ψ2(τ) = uy.   (4.2)

Substituting into equation (4.1) gives

aφ1 + bφ2 + cφ3 = H.   (4.3)

Along the curve C,

d/dτ = (dx/dτ) ∂/∂x + (dy/dτ) ∂/∂y;

therefore,

dψ1/dτ = d(ux)/dτ = uxx dx/dτ + uxy dy/dτ = φ1 dx/dτ + φ2 dy/dτ,   (4.4)
dψ2/dτ = d(uy)/dτ = uxy dx/dτ + uyy dy/dτ = φ2 dx/dτ + φ3 dy/dτ.   (4.5)

Equations (4.3)–(4.5) are three equations for three unknowns, i.e. the second-order derivatives φ1, φ2 and φ3. Written in matrix form, they are

[ a      b      c     ] [ φ1 ]   [ H      ]
[ dx/dτ  dy/dτ  0     ] [ φ2 ] = [ dψ1/dτ ]
[ 0      dx/dτ  dy/dτ ] [ φ3 ]   [ dψ2/dτ ]
If the determinant of the coefficient matrix is not equal to zero, a unique solution
exists for the second derivatives along the curve C. It can be shown that if the
second-order derivatives exist, then derivatives of all orders exist along C as well.
On the other hand, if the determinant of the coefficient matrix is equal to zero,
the solution is not unique, i.e. the second derivatives are discontinuous along C.
Setting the determinant equal to zero gives

a (dy/dτ)² − b (dx/dτ)(dy/dτ) + c (dx/dτ)² = 0.

Dividing by (dx/dτ)² yields a quadratic equation for dy/dx, which is the slope of the curve C. Thus,

dy/dx = [b ± √(b² − 4ac)] / (2a).   (4.6)
→ The curves C for which y(x) satisfy (4.6) are called characteristic curves of
equation (4.1), and they are curves along which the second-order derivatives
are discontinuous.
Because the characteristics must be real, their behavior is determined by the sign
of b2 − 4ac:
b2 − 4ac > 0 ⇒ 2 real roots ⇒ 2 characteristics ⇒ hyperbolic p.d.e.
b2 − 4ac = 0 ⇒ 1 real root ⇒ 1 characteristic ⇒ parabolic p.d.e.
b2 − 4ac < 0 ⇒ no real roots ⇒ no characteristics ⇒ elliptic p.d.e.
Equivalently, if A is the matrix of the second-derivative coefficients, det[A] = ac − b²/4, or

−4 det[A] = b² − 4ac:  > 0 hyperbolic, = 0 parabolic, < 0 elliptic.
Notes:
1 Physically, characteristics are curves along which information propagates in
the solution.
2 For the case of elliptic equations, the matrix A is positive definite.
3 The classification depends on the coefficients of the highest-order derivatives,
i.e. a, b and c.
4 It can be shown that the classification of a partial differential equation is
independent of the coordinate system (see, for example, Tannehill et al.).
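The classification rule is easy to mechanize; a small sketch (the example coefficients follow the cases treated below):

```python
def classify(a, b, c):
    """Classify a*u_xx + b*u_xy + c*u_yy + (lower-order terms) = H
    by the sign of the discriminant b^2 - 4ac."""
    disc = b*b - 4*a*c
    if disc > 0:
        return "hyperbolic"
    if disc == 0:
        return "parabolic"
    return "elliptic"

sigma, alpha = 2.0, 0.5
print(classify(sigma**2, 0, -1))  # wave equation: hyperbolic
print(classify(alpha, 0, 0))      # diffusion equation: parabolic
print(classify(1, 0, 1))          # Laplace equation: elliptic
```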
From equation (4.6), for a hyperbolic equation there are two real roots, say

dy/dx = λ1, dy/dx = λ2.   (4.7)

If a, b and c are constant (λ1, λ2 constant), then we may integrate to obtain

y = λ1x + x1, y = λ2x + x2,

which are straight lines. Therefore, the solution propagates along two linear characteristic curves.

For example, consider the wave equation

∂²u/∂t² = σ² ∂²u/∂x², (y → t: a = σ², b = 0, c = −1),

where u(x, t) is the amplitude of the wave, and σ is the wave speed. Therefore, from equations (4.6) and (4.7),

λ1 = 1/σ, λ2 = −1/σ.
Therefore, the characteristics of the wave equation with a, b, and c constant are
straight lines with slopes 1/σ and −1/σ.
Note that no boundary conditions are necessary at specified values of x, i.e. the
solution (F1 , F2 ) only depends upon the initial conditions.
Hyperbolic equations in fluid dynamics:
Unsteady inviscid flow
Steady supersonic inviscid flow
From equation (4.6), for a parabolic equation there is only one real root, which is

dy/dx = b/(2a).   (4.8)

If a and b are constant, then we may integrate to obtain

y = [b/(2a)] x + γ1,

which is a straight line. Therefore, the solution propagates along one linear characteristic direction (usually time).

For example, consider the one-dimensional, unsteady diffusion equation (e.g. heat conduction)

∂u/∂t = α ∂²u/∂x², (y → t: a = α, b = c = 0),

where u(x, t) is the quantity undergoing diffusion (e.g. temperature), and α is the diffusivity.
The solution marches forward in time, i.e. the characteristics (with b = 0) are lines
of constant t.
Another parabolic example is the steady boundary-layer (momentum) equation

u ∂u/∂x + v ∂u/∂y = −∂p/∂x + ∂²u/∂y², (a = 0, b = 0, c = 1).

In this case the solution marches forward in the x-direction from an initial velocity profile.
For example, consider the Laplace equation

∂²u/∂x² + ∂²u/∂y² = 0, (a = 1, b = 0, c = 1),

for which b² − 4ac = −4 < 0 ⇒ elliptic.
If a, b and c are variable coefficients, then b2 − 4ac may change sign with space
and/or time.
⇒ Character of equations may be different in certain regions.
For example, consider transonic flow (Mach ∼ 1). The governing equation for
two-dimensional, steady, compressible, potential flow about a slender body is
(1 − M²) ∂²φ/∂s² + ∂²φ/∂n² = 0, (x → s, y → n: a = 1 − M², b = 0, c = 1),

where φ(s, n) is the velocity potential, M is the local Mach number, and s and n are streamline coordinates, with s locally tangent to the streamline and n normal to it.

To determine the nature of the equation, observe that b² − 4ac = −4(1 − M²); therefore,

M < 1 ⇒ b² − 4ac < 0 ⇒ Elliptic
M = 1 ⇒ b² − 4ac = 0 ⇒ Parabolic
M > 1 ⇒ b² − 4ac > 0 ⇒ Hyperbolic
In the above example, we have the same equation, but different behavior in
various regions. In the following example, we have different equations in different
regions of the flow.
Linear: e.g. the Poisson equation

∂²φ/∂x² + ∂²φ/∂y² = f(x, y),

with Dirichlet, Neumann or Robin boundary conditions.

Non-Linear:
1 Linear equation with non-linear boundary conditions, e.g. heat conduction with a radiation condition:

∂²T/∂x² + ∂²T/∂y² = 0, with ∂T/∂n = D(T⁴ − Tsur⁴) on the boundary.

2 Non-linear equation, e.g. the Navier-Stokes equations:

u ∂u/∂x + v ∂u/∂y = −∂p/∂x + (1/Re)(∂²u/∂x² + ∂²u/∂y²).
Numerical Solutions of Elliptic Problems: Finite-Difference Methods for the Poisson Equation
∂²φ/∂x² + ∂²φ/∂y² = f(x, y),   (5.1)
Consider an approximation to equation (5.1) at a typical point (i, j); the five-point finite-difference stencil is
Substituting into (5.1) and multiplying by (∆x)² gives the final form of the finite-difference equation

φi+1,j − 2[1 + (∆x/∆y)²]φi,j + φi−1,j + (∆x/∆y)²(φi,j+1 + φi,j−1) = (∆x)²fi,j,   (5.2)

which results in a system of (I + 1) × (J + 1) equations for the (I + 1) × (J + 1) unknowns.
There are two options for solving such systems of equations:
1 Direct Methods:
i) No iterative convergence errors.
ii) Efficient for certain types of linear systems, e.g. tridiagonal, block-tridiagonal.
iii) Become less efficient for large systems of equations.
iv) Typically cannot adapt to non-linear problems.
2 Iterative Methods:
i) Iterative convergence errors.
ii) Generally more efficient for large systems of equations.
iii) Apply to non-linear problems.
We wish to consider direct methods for solving the discretized Poisson equation.
Repeating the difference equation (5.2) for the Poisson equation, we have
φi+1,j − 2(1 + ∆̄)φi,j + φi−1,j + ∆̄(φi,j+1 + φi,j−1) = (∆x)²fi,j,   (5.3)

where ∆̄ = (∆x/∆y)², and i = 1, . . . , I + 1; j = 1, . . . , J + 1.
In order to write the system of difference equations in matrix form, let us renumber
the two-dimensional mesh (i, j) into a one-dimensional array (n) as follows
where i = 1, . . . , I + 1 and j = 1, . . . , J + 1.
Therefore, our five-point finite difference stencil becomes
And the finite-difference equation (5.4) for the Poisson equation becomes (with ∆ = ∆x = ∆y ⇒ ∆̄ = 1)
0 is a zero matrix block, I is an identity matrix block, and the tridiagonal blocks are

    [ −4  1           ]
    [  1 −4  1        ]
D = [     1 −4  ·     ]
    [        ·  ·  1  ]
    [           1 −4  ]
In general, the method for solving the system Aφ = d depends upon the form of
the coefficient matrix A:
Full or dense ⇒ Gauss elimination, LU decomposition, etc. (very expensive
computationally).
Sparse and banded → Result from discretizations of certain classes of
problems (e.g. separable elliptic partial differential equations, such as the
Poisson equation):
→ Tridiagonal ⇒ Thomas algorithm
→ Fast Fourier Transform (FFT) and/or cyclic reduction, which generally are the
fastest methods for problems in which they apply.
The Fourier transform of a function h(t) is

H(f) = ∫ h(t) e^{2πift} dt (integrated over −∞ < t < ∞),

where f is the frequency, and i is the imaginary number. The inverse transform is

h(t) = ∫ H(f) e^{−2πift} df (integrated over −∞ < f < ∞).

Consider the discrete form of the Fourier transform in which we have N values of h(t) at discrete points defined by

hk = h(tk), tk = k∆, k = 0, 1, 2, . . . , N − 1.

After taking the Fourier transform, we will have N discrete points in the frequency domain defined by

fn = n/(N∆), n = −N/2, . . . , N/2,

corresponding to the Nyquist critical frequency range. Thus, the discrete Fourier transform is approximated by

H(fn) ≈ Σ_{k=0}^{N−1} hk e^{2πifn tk} ∆ = ∆ Σ_{k=0}^{N−1} hk e^{2πikn/N}.
This is equivalent to taking the Fourier transform in each direction. The inverse Fourier transform is (analogous to (5.6))

φk,l = (1/KL) Σ_{m=−K/2}^{K/2} Σ_{n=−L/2}^{L/2} φ̂m,n e^{−2πikm/K} e^{−2πiln/L}.   (5.8)
Now let us apply this to the discretized Poisson equation (5.2). For simplicity, set ∆ = ∆x = ∆y, giving (with i → k, j → l)

φk+1,l + φk−1,l + φk,l+1 + φk,l−1 − 4φk,l = ∆²fk,l,   (5.9)

where now fk,l is the right-hand side of the Poisson equation (not the frequency). Substituting equation (5.8) into equation (5.9) leads to

φ̂m,n [2 cos(2πm/K) + 2 cos(2πn/L) − 4] = ∆²f̂m,n;

therefore,

φ̂m,n = ∆²f̂m,n / (2[cos(2πm/K) + cos(2πn/L) − 2]),   (5.10)

for m = 1, . . . , K − 1; n = 1, . . . , L − 1.
Therefore, to solve the difference equation (5.9) using Fourier transform methods:

1) Compute the Fourier transform f̂m,n of the right-hand side fk,l using (similar to (5.7))

f̂m,n = Σ_{k=0}^{K−1} Σ_{l=0}^{L−1} fk,l e^{2πimk/K} e^{2πinl/L}.   (5.11)

2) Compute φ̂m,n from equation (5.10).

3) Compute the solution φk,l from the inverse transform (5.8).
Notes:
1) The above procedure works for periodic boundary conditions, i.e. the solution
satisfies
φk,l = φk+K,l = φk,l+L .
For Dirichlet boundary conditions ⇒ Use sine transform.
For Neumann boundary conditions ⇒ Use cosine transform.
2) In practice, the Fourier (and inverse) transforms are computed using a Fast
Fourier Transform (FFT) technique (see, for example, Numerical Recipes).
3) Fourier transform methods can only be applied to partial differential
equations with constant coefficients in the direction(s) for which the Fourier
transform is applied.
4) We use Fourier transforms to solve the difference equation, not the
differential equation; therefore, this is not a spectral method.
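The procedure can be illustrated end-to-end on a tiny periodic grid. For clarity this sketch uses direct summation for the transforms rather than an FFT; a manufactured periodic solution generates the right-hand side from (5.9), and equation (5.10) recovers it:

```python
import cmath
import math

def dft2(g, K, L, sign):
    """2-D discrete Fourier transform by direct summation; O((KL)^2),
    fine for a tiny demonstration grid."""
    return [[sum(g[k][l] * cmath.exp(sign * 2j * math.pi * (m*k/K + n*l/L))
                 for k in range(K) for l in range(L))
             for n in range(L)] for m in range(K)]

K = L = 8
d = 1.0                                   # grid spacing Delta
# Manufactured periodic solution with zero mean:
phi = [[math.sin(2*math.pi*k/K) + math.cos(4*math.pi*l/L)
        for l in range(L)] for k in range(K)]
# Right-hand side from the 5-point difference equation (5.9):
f = [[(phi[(k+1) % K][l] + phi[(k-1) % K][l] + phi[k][(l+1) % L]
       + phi[k][(l-1) % L] - 4*phi[k][l]) / d**2
      for l in range(L)] for k in range(K)]

fh = dft2(f, K, L, +1)                    # forward transform, as in (5.11)
ph = [[0j]*L for _ in range(K)]
for m in range(K):
    for n in range(L):
        if m == n == 0:
            continue                      # zero mode: fixed by zero mean
        den = 2*(math.cos(2*math.pi*m/K) + math.cos(2*math.pi*n/L) - 2)
        ph[m][n] = d**2 * fh[m][n] / den  # equation (5.10)
rec = dft2(ph, K, L, -1)                  # inverse transform, as in (5.8)
err = max(abs(rec[k][l]/(K*L) - phi[k][l])
          for k in range(K) for l in range(L))
print(err)  # the manufactured solution is recovered to machine precision
```

In practice the transforms would be done with an FFT, and sine or cosine transforms would replace the complex exponentials for Dirichlet or Neumann conditions, as noted above.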
Cyclic Reduction
Again, consider the Poisson equation
∂²φ/∂x² + ∂²φ/∂y² = f(x, y),
discretized on a two-dimensional grid with ∆ = ∆x = ∆y and
i = 0, . . . , I; j = 0, . . . , J, where I = 2n with integer n ((I + 1) × (J + 1) points):
Applying central differences to the Poisson equation, the difference equation for
constant x-lines becomes
where

ui = [φi,0, φi,1, φi,2, . . . , φi,J−1, φi,J]ᵀ,   fi = [fi,0, fi,1, fi,2, . . . , fi,J−1, fi,J]ᵀ,

and

    [ −2  1           ]
    [  1 −2  1        ]
B = [     1 −2  ·     ]
    [        ·  ·  1  ]
    [           1 −2  ]
The first three terms in equation (5.12) correspond to the central difference in the x-direction, and the fourth term corresponds to the central difference in the y-direction (see B).
Note that equation (5.13) corresponds to the block-tridiagonal matrix for equation (5.4), where B is the tridiagonal portion and ui−1 and ui+1 are the 'fringes.'
Writing three successive equations of (5.13) for i − 1, i and i + 1:
Multiplying the middle equation by −B and adding all three gives

ui−2 + B*ui + ui+2 = fi*,

where

B* = 2I − B²,
fi* = fi−1 − Bfi + fi+1.
This is an equation of the same form as (5.13); therefore, applying this procedure
to all even numbered i equations in (5.13) reduces the number of equations by a
factor of two.
This cyclic reduction procedure can be repeated until a single equation remains for
the middle line of variables, uI/2 (I = 2n , with integer n), which is tridiagonal.
Thus, using the solution for uI/2 , solutions for all other i are obtained by
successively solving the tridiagonal problems at each level in reverse:
Returning to the difference equation (5.2) for the Poisson equation (5.1),

φi+1,j − 2(1 + ∆̄)φi,j + φi−1,j + ∆̄(φi,j+1 + φi,j−1) = (∆x)²fi,j,   (5.15)

where ∆̄ = (∆x/∆y)².
This may be written in general form as
Lφ = f, (5.16)
Jacobi Iteration
Solving equation (5.15) for φi,j and indicating the iteration number using a superscript:

φ^{n+1}_{i,j} = [φ^n_{i+1,j} + φ^n_{i−1,j} + ∆̄(φ^n_{i,j+1} + φ^n_{i,j−1}) − (∆x)²fi,j] / [2(1 + ∆̄)].   (5.17)

Procedure:
1 Provide an initial guess φ¹_{i,j} for φi,j at each point i = 1, . . . , I + 1, j = 1, . . . , J + 1.
2 Relax (iterate) by applying (5.17) at each grid point to produce successive approximations: φ²_{i,j}, φ³_{i,j}, . . . , φ^n_{i,j}, . . .
3 Continue until convergence, determined by

max |φ^{n+1}_{i,j} − φ^n_{i,j}| / max |φ^n_{i,j}| < ε,

where ε is a small convergence tolerance.
Notes:
1 Convergence is too slow ⇒ not used in practice.
2 Requires both φ^{n+1}_{i,j} and φ^n_{i,j} to be stored for all i = 1, . . . , I + 1, j = 1, . . . , J + 1.
3 Used as a basis for comparison with other methods to follow.
4 Although not necessary, it is instructive to view iterative methods in matrix form Ax = c.
→ See MMAE 501, section 2.2.2 notes.
We write A = M1 − M2; thus, an iterative scheme may be devised by writing Ax = c in the form

M1 x^{(n+1)} = M2 x^{(n)} + c,
x^{(n+1)} = M1⁻¹ M2 x^{(n)} + M1⁻¹ c,

or

x^{(n+1)} = M x^{(n)} + M1⁻¹ c,

where M = M1⁻¹ M2 is the iteration matrix.

For Jacobi iteration, let D be the diagonal of A, and let −L and −U be its strictly lower and upper triangular parts, so that A = D − L − U. Taking M1 = D and M2 = L + U gives

D x^{(n+1)} = (L + U) x^{(n)} + c.
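A short sketch of Jacobi relaxation (5.17) applied to the Laplace equation (f = 0) on the unit square, with φ = 1 on one boundary and 0 on the others (a hypothetical test problem, not from the notes):

```python
def jacobi_poisson(f, phi, dx, dy, tol=1e-6, max_iter=20000):
    """Jacobi relaxation (5.17) for the 5-point Poisson difference
    equation; phi holds the initial guess with boundary values set."""
    I, J = len(phi) - 1, len(phi[0]) - 1
    beta = (dx/dy)**2                 # the quantity written Delta-bar
    for it in range(max_iter):
        new = [row[:] for row in phi]
        diff = 0.0
        for i in range(1, I):
            for j in range(1, J):
                new[i][j] = (phi[i+1][j] + phi[i-1][j]
                             + beta*(phi[i][j+1] + phi[i][j-1])
                             - dx*dx*f[i][j]) / (2*(1 + beta))
                diff = max(diff, abs(new[i][j] - phi[i][j]))
        phi = new
        scale = max(abs(v) for row in phi for v in row) or 1.0
        if diff/scale < tol:          # convergence criterion from the notes
            return phi, it + 1
    return phi, max_iter

# Laplace equation (f = 0) on a 21 x 21 grid, phi = 1 on the top boundary:
N = 20
f = [[0.0]*(N + 1) for _ in range(N + 1)]
phi0 = [[0.0]*(N + 1) for _ in range(N + 1)]
for i in range(N + 1):
    phi0[i][N] = 1.0
sol, iters = jacobi_poisson(f, phi0, 1.0/N, 1.0/N)
print(iters, sol[N//2][N//2])  # centre value ~0.25 by symmetry
```

Note the double storage (`phi` and `new`) and the large iteration count, both of which Gauss-Seidel improves on.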
Gauss-Seidel Iteration
There are two problems with Jacobi iteration:
1 Slow convergence.
2 Must store current and previous iterations for entire grid.
The Gauss-Seidel method addresses both problems by using the most recently
updated information. For example, if sweeping along lines of constant y
Therefore, when updating φi,j , the points φi,j−1 and φi−1,j have already been
updated, and using these updated values, equation (5.17) is changed to
φ^{n+1}_{i,j} = [φ^n_{i+1,j} + φ^{n+1}_{i−1,j} + ∆̄(φ^n_{i,j+1} + φ^{n+1}_{i,j−1}) − (∆x)²fi,j] / [2(1 + ∆̄)].   (5.19)
Observe that now it is not necessary to store φni,j at the previous iteration. The
values of φi,j are all stored in the same array, and it is not necessary to distinguish
between the (n) or (n + 1)st iterates. We simply use the most recently updated
information.
In matrix form, the Gauss-Seidel method is (M1 = D − L, M2 = U):
(D − L)x(n+1) = Ux(n) + c
Thus, the rate of convergence is twice as fast as for Jacobi, i.e. the Gauss-Seidel
method requires one-half the iterations for the same level of accuracy.
Note: It can be shown that diagonal dominance of A is a sufficient (but not necessary) condition for convergence of the Jacobi and Gauss-Seidel iteration methods, i.e. ρ < 1, where ρ is the spectral radius of M = M1⁻¹ M2 (see Morton and Mayers, p. 205 for proof).
Successive over-relaxation (SOR) blends the Gauss-Seidel update with the current value,

φ^{n+1}_{i,j} = (1 − ω)φ^n_{i,j} + ω[φ^n_{i+1,j} + φ^{n+1}_{i−1,j} + ∆̄(φ^n_{i,j+1} + φ^{n+1}_{i,j−1}) − (∆x)²fi,j] / [2(1 + ∆̄)],   (5.21)

where ω is the relaxation parameter and 0 < ω < 2 for convergence (Morton & Mayers, p. 206):

ω = 1     ⇒ Gauss-Seidel
1 < ω < 2 ⇒ Overrelaxation
0 < ω < 1 ⇒ Underrelaxation
Recall that for our model problem, ρJac = 1 − (1/2)[π/(I + 1)]² + · · · ; thus,

ωopt = 2 / (1 + √(1 − ρ²Jac))
     = 2 / (1 + √(1 − {1 − (1/2)[π/(I + 1)]² + · · ·}²))   (5.24)
     = 2 / (1 + √([π/(I + 1)]² + · · ·)),

so that

ωopt ≈ 2 / [1 + π/(I + 1)].
Notes:
1 Although ωopt does not depend on the right-hand-side, it does depend upon:
i) differential equation
ii) method of discretization
iii) boundary conditions
iv) shape of domain
2 For a given problem, ωopt must be estimated from a similar problem and/or
trial and error.
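The payoff from ω can be seen by comparing iteration counts for Gauss-Seidel (ω = 1) against SOR with the estimate (5.24), on a small Laplace test problem (∆x = ∆y, so the Gauss-Seidel average carries a factor 1/4; the setup is a hypothetical example):

```python
import math

def sor_laplace(phi, omega, tol=1e-6, max_iter=10000):
    """Point-SOR sweeps (5.21) for the Laplace difference equation
    with dx = dy; phi carries the boundary values."""
    I, J = len(phi) - 1, len(phi[0]) - 1
    for it in range(max_iter):
        diff = 0.0
        for i in range(1, I):
            for j in range(1, J):
                gs = 0.25*(phi[i+1][j] + phi[i-1][j]
                           + phi[i][j+1] + phi[i][j-1])
                new = phi[i][j] + omega*(gs - phi[i][j])
                diff = max(diff, abs(new - phi[i][j]))
                phi[i][j] = new
        if diff < tol:
            return it + 1
    return max_iter

def iterations(omega, N=20):
    phi = [[0.0]*(N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        phi[i][N] = 1.0               # phi = 1 on the top boundary
    return sor_laplace(phi, omega)

N = 20
w_opt = 2.0 / (1.0 + math.pi/(N + 1))  # estimate from equation (5.24)
print(iterations(1.0), iterations(w_opt))  # Gauss-Seidel vs SOR counts
```

SOR at the near-optimal ω converges in a small fraction of the Gauss-Seidel iteration count, at no extra cost per sweep.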
For example, the update at the point (2, 2) is

φ^{n+1}_{2,2} = [φ^n_{3,2} + φ_{1,2} + ∆̄(φ^n_{2,3} + φ_{2,1}) − (∆x)²f2,2] / [2(1 + ∆̄)].

The simplest treatment would be to use the Jacobi (5.17), Gauss-Seidel (5.19), or SOR (5.21) equation to update φi,j in the interior for i = 2, . . . , I, and then to approximate the boundary condition (5.26) by a forward difference applied at i = 1:

(φ2,j − φ1,j)/∆x + O(∆x) = c.   (5.27)
This could then be used to update φ1,j, j = 2, . . . , J. Alternatively, the difference equation (5.17) may be applied at i = 1 itself,

φ^{n+1}_{1,j} = [φ^n_{2,j} + φ^n_{0,j} + ∆̄(φ^n_{1,j+1} + φ^n_{1,j−1}) − (∆x)²f1,j] / [2(1 + ∆̄)].   (5.28)

However, this involves a value φ^n_{0,j} that is outside the domain. A second-order accurate central-difference approximation for the boundary condition (5.26) is

(φ^n_{2,j} − φ^n_{0,j})/(2∆x) + O(∆x²) = c,   (5.29)

which also involves the value φ^n_{0,j}. Therefore, solving (5.29) for φ^n_{0,j} gives

φ^n_{0,j} = φ^n_{2,j} − 2c∆x,

and substituting into the difference equation (5.28) to eliminate φ^n_{0,j} leads to

φ^{n+1}_{1,j} = [2(φ^n_{2,j} − c∆x) + ∆̄(φ^n_{1,j+1} + φ^n_{1,j−1}) − (∆x)²f1,j] / [2(1 + ∆̄)].   (5.30)
Notes:
1 This is the same procedure used for a Dirichlet condition but with an additional sweep along the left boundary using (5.30) for φ^{n+1}_{1,j}, j = 2, . . . , J.
2 A Neumann condition on another boundary is treated in the same manner, e.g.

∂φ/∂y = d at y = b.   (5.31)
Recall that in elliptic problems, the solution anywhere depends on the solution
everywhere, i.e. it has an infinite speed of propagation in all directions.
However, in Jacobi, Gauss-Seidel, and SOR, information only propagates through
the mesh one point at a time. For example, if sweeping along lines of constant y
with 0 ≤ x ≤ a, it takes I iterations before the boundary condition at x = a is
“felt” at x = 0.
⇒ These techniques are not very “elliptic like.”
A more “elliptic-like” method could be obtained by solving entire lines in the grid
in an implicit manner. For example, sweeping along lines of constant y and
solving each constant y-line implicitly, i.e. all at once, would allow for the
boundary condition at x = a to influence the solution in the entire domain after
only one sweep through the grid.
Consider the j th line and assume that values along the j + 1st and j − 1st lines
are taken from the previous iterate.
Rewriting (5.33) as an implicit equation for the values of φi,j along the j th line
gives
Notes:
1 If sweeping through j-lines, j = 2, . . . , J, then φ^n_{i,j−1} becomes φ^{n+1}_{i,j−1} in (5.34), i.e. it has already been updated. Therefore, we use updated values as in Gauss-Seidel.
2 Can also incorporate SOR.
3 More efficient at spreading information throughout the domain; therefore, it
reduces the number of iterations required for convergence, but there is more
computation per iteration.
4 This provides motivation for the ADI method.
ADI Method
In the ADI method we sweep along lines but in alternating directions.
In the first half of the iteration we perform a sweep along constant y-lines by
solving the series of tridiagonal problems for j = 2, . . . , J:
φ^{n+1/2}_{i+1,j} − (2 + σ)φ^{n+1/2}_{i,j} + φ^{n+1/2}_{i−1,j} = (∆x)²fi,j − ∆̄[φ^n_{i,j+1} − (2 − σ/∆̄)φ^n_{i,j} + φ^{n+1/2}_{i,j−1}],   i = 2, . . . , I.   (5.35)
Notes:
1 The φ^{n+1/2}_{i,j−1} term on the right-hand-side has been updated from the previous line.
2 Unlike in equation (5.34), differencing in the x- and y-directions are kept
separate to mimic diffusion in each direction. This is called a splitting
method.
3 σ is an acceleration parameter to enhance diagonal dominance (σ > 0).
σ = 0 corresponds to no acceleration. Note that the σ terms on each side of
the equation cancel.
In the second half of the iteration we sweep along constant x-lines by solving the
series of tridiagonal problems for i = 2, . . . , I:
∆̄φ^{n+1}_{i,j+1} − (2∆̄ + σ)φ^{n+1}_{i,j} + ∆̄φ^{n+1}_{i,j−1} = (∆x)²fi,j − [φ^{n+1/2}_{i+1,j} − (2 − σ)φ^{n+1/2}_{i,j} + φ^{n+1}_{i−1,j}],   j = 2, . . . , J,   (5.36)

where φ^{n+1}_{i−1,j} has been updated from the previous line.
Notes:
1 Involves (I − 1) + (J − 1) tridiagonal solves for each iteration (for Dirichlet
boundary conditions).
2 For ∆x = ∆y (∆ ¯ = 1), it can be shown that for the Poisson (or Laplace)
equation with Dirichlet boundary conditions that the acceleration parameter
that gives the best speedup is
σ = 2 sin (π/R) ,
Recall that the second-order central-difference operator δ²x satisfies

∂²φ/∂x² = δ²xφ − (∆x²/12) ∂⁴φ/∂x⁴ + O(∆x⁴).   (5.37)

Therefore,

δ²xφ = ∂²φ/∂x² + (∆x²/12) ∂⁴φ/∂x⁴ + O(∆x⁴),

or

δ²xφ = [1 + (∆x²/12) ∂²/∂x²] ∂²φ/∂x² + O(∆x⁴).   (5.38)

But from equation (5.37),

∂²/∂x² = δ²x + O(∆x²).

Substituting into (5.38) gives

δ²xφ = [1 + (∆x²/12)(δ²x + O(∆x²))] ∂²φ/∂x² + O(∆x⁴),

or

δ²xφ = [1 + (∆x²/12) δ²x] ∂²φ/∂x² + O(∆x⁴).   (5.39)
Because the last term in equation (5.39) is still O(∆x⁴), we can write equation (5.39) as

∂²φ/∂x² = [1 + (∆x²/12) δ²x]⁻¹ δ²xφ + O(∆x⁴).   (5.41)

Substituting the expansion (5.40) into equation (5.41) leads to an O(∆x⁴) accurate central-difference approximation for the second derivative:

∂²φ/∂x² = [1 − (∆x²/12) δ²x] δ²xφ + O(∆x⁴).
Due to the δx2 (δx2 φ) operator, however, this approximation involves the five points
φi−2 , φi−1 , φi , φi+1 and φi+2 ; therefore, it is not compact.
Consider the Poisson equation

∂²φ/∂x² + ∂²φ/∂y² = f(x, y).

Substituting equations (5.41) and (5.42) into the Poisson equation leads to

[1 + (∆x²/12)δ²_x]^{−1} δ²_x φ + [1 + (∆y²/12)δ²_y]^{−1} δ²_y φ + O(∆⁴) = f(x, y),

where ∆ = max(∆x, ∆y). Multiplying by [1 + (∆x²/12)δ²_x][1 + (∆y²/12)δ²_y] gives

[1 + (∆y²/12)δ²_y] δ²_x φ + [1 + (∆x²/12)δ²_x] δ²_y φ + O(∆⁴) = [1 + (∆x²/12)δ²_x + (∆y²/12)δ²_y + O(∆⁴)] f(x, y).  (5.43)
Therefore, we have a nine-point stencil, but the approximation only requires three points in each direction and thus it is compact. Expanding the operators,

[1 + (∆x²/12)δ²_x] δ²_y φ = (1/(12∆y²)) [−20φ_{i,j} + 10(φ_{i,j−1} + φ_{i,j+1}) − 2(φ_{i−1,j} + φ_{i+1,j}) + φ_{i−1,j−1} + φ_{i−1,j+1} + φ_{i+1,j−1} + φ_{i+1,j+1}],

[1 + (∆x²/12)δ²_x + (∆y²/12)δ²_y] f(x, y) = f_{i,j} + (1/12)[f_{i−1,j} − 2f_{i,j} + f_{i+1,j} + f_{i,j−1} − 2f_{i,j} + f_{i,j+1}]
 = (1/12)[8f_{i,j} + f_{i−1,j} + f_{i+1,j} + f_{i,j−1} + f_{i,j+1}].
Thus, the coefficients in the finite-difference stencil for φ(x, y) are as follows:
Notes:
1 Observe that in equation (5.43) the two-dimensionality of the equation has
been taken advantage of to obtain the compact finite-difference stencil, i.e.
see the δx2 δy2 φ and δy2 δx2 φ difference operators.
2 Because the finite-difference stencil is compact, i.e. only involving three
points in each direction, application of the ADI method as in the previous
section results in a set of tridiagonal problems to solve. Therefore, this
fourth-order, compact finite-difference approach is no less efficient than that
for the second-order scheme used in the previous section (there are simply
additional terms on the right-hand-side of the equations).
3 The primary disadvantage of using higher-order schemes is that it is generally
necessary to use lower-order approximations for derivative boundary
conditions. This is not a problem, however, for Dirichlet boundary conditions.
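The fourth-order accuracy of the one-dimensional compact formula can be checked numerically. This is a sketch; the test function sin x and the grid spacings are arbitrary choices.

```python
import math

def d2_compact(phi, i, dx):
    # O(dx^4) approximation (1 - dx^2/12 * delta_x^2)(delta_x^2 phi);
    # requires the five points i-2 .. i+2.
    d2 = lambda k: (phi[k + 1] - 2.0 * phi[k] + phi[k - 1]) / dx ** 2
    return d2(i) - (d2(i + 1) - 2.0 * d2(i) + d2(i - 1)) / 12.0

def error(dx):
    # Error of d2_compact against the exact second derivative of sin(x) at x0.
    x0 = 1.0
    phi = [math.sin(x0 + k * dx) for k in range(-2, 3)]
    return abs(d2_compact(phi, 2, dx) - (-math.sin(x0)))

ratio = error(0.1) / error(0.05)   # approaches 2**4 = 16 for an O(dx^4) scheme
```

Halving ∆x should reduce the error by a factor of about 16, confirming the O(∆x⁴) behavior.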
Motivation
The iterative techniques we have discussed all have the following property:
High-frequency components of error ⇒ fast convergence.
Low-frequency components of error ⇒ slow convergence.
To illustrate this, consider the simple one-dimensional problem
d2 φ
= 0, 0 ≤ x ≤ 1, (5.44)
dx2
with φ(0) = φ(1) = 0, which has the exact solution φ(x) = 0. Therefore, all plots
of the numerical solution are also plots of the error.
Discretizing equation (5.44) at I + 1 points using central differences gives
φ_{i+1} − 2φ_i + φ_{i−1} = 0,  i = 1, . . . , I − 1.
To show how the nature of the error affects convergence, consider an initial guess
consisting of the Fourier mode φ(x) = sin(kπx), where k is the wavenumber and
indicates the number of half sine waves on the interval 0 ≤ x ≤ 1. In discretized
form, with xi = i∆x = i/I, this is
φ_i = sin(kπi/I),  i = 0, . . . , I.
Consider initial guesses with k = 1, 3, 6 (the figures are from Briggs et al.):
Applying Jacobi and Gauss-Seidel iteration with I = 64, the solution converges
more rapidly for the higher frequency initial guess.
(Figures from Briggs et al. show the error decay for Jacobi and Gauss-Seidel iteration.)
A more realistic situation is one in which the initial guess contains multiple modes,
for example
φ_i = (1/3)[sin(πi/I) + sin(6πi/I) + sin(32πi/I)].
Applying Jacobi iteration with I = 64, the error is reduced rapidly during the early
iterations but more slowly thereafter.
Thus, there is rapid convergence until the high-frequency modes are smoothed
out, then slow convergence for the lower frequency modes.
Thus, relaxation is more effective on a coarse grid representation of the error (it is
also faster).
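The smoothing behavior described above can be reproduced with a few lines of Python, applying plain Jacobi to the model problem (5.44). The sweep count and the mode numbers compared are illustrative choices.

```python
import math

def jacobi_sweeps(phi, sweeps):
    # Plain Jacobi for d2(phi)/dx2 = 0 with phi(0) = phi(1) = 0:
    # each interior value becomes the average of its neighbors.
    n = len(phi)
    for _ in range(sweeps):
        phi = [0.0] + [(phi[i - 1] + phi[i + 1]) / 2.0
                       for i in range(1, n - 1)] + [0.0]
    return phi

I = 64
def amplitude_after(k, sweeps=100):
    # Start from the single Fourier mode sin(k*pi*x) and measure what is left.
    phi = [math.sin(k * math.pi * i / I) for i in range(I + 1)]
    return max(abs(v) for v in jacobi_sweeps(phi, sweeps))

a1, a6 = amplitude_after(1), amplitude_after(6)
```

After 100 sweeps the k = 6 mode is nearly gone while the k = 1 mode is barely reduced, consistent with the damping factor cos(kπ/I) per sweep.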
Notes:
1) Multigrid methods are not so much a specific set of techniques as they are a
framework for accelerating relaxation (iterative) methods.
2) Multigrid methods are comparable in speed with fast direct methods, such as
Fourier methods and cyclic reduction, but can be used to solve general
elliptic equations with variable coefficients and even nonlinear equations.
Multigrid Methodology
Consider the general second-order linear elliptic partial differential equation of the
form
A(x, y) ∂²φ/∂x² + B(x, y) ∂φ/∂x + C(x, y) ∂²φ/∂y² + D(x, y) ∂φ/∂y + E(x, y)φ = F(x, y).  (5.45)
To be elliptic, A(x, y)C(x, y) > 0 for all (x, y). Approximating this differential
equation using second-order accurate central differences gives
A_{i,j} (φ_{i+1,j} − 2φ_{i,j} + φ_{i−1,j})/∆x² + B_{i,j} (φ_{i+1,j} − φ_{i−1,j})/(2∆x)
 + C_{i,j} (φ_{i,j+1} − 2φ_{i,j} + φ_{i,j−1})/∆y² + D_{i,j} (φ_{i,j+1} − φ_{i,j−1})/(2∆y)
 + E_{i,j} φ_{i,j} = F_{i,j},

or

a_{i,j}φ_{i+1,j} + b_{i,j}φ_{i−1,j} + c_{i,j}φ_{i,j+1} + d_{i,j}φ_{i,j−1} + e_{i,j}φ_{i,j} = F_{i,j},  (5.46)
where

a_{i,j} = A_{i,j}/∆x² + B_{i,j}/(2∆x),  b_{i,j} = A_{i,j}/∆x² − B_{i,j}/(2∆x),
c_{i,j} = C_{i,j}/∆y² + D_{i,j}/(2∆y),  d_{i,j} = C_{i,j}/∆y² − D_{i,j}/(2∆y),
e_{i,j} = E_{i,j} − 2A_{i,j}/∆x² − 2C_{i,j}/∆y².
For convenience, write (5.46) (or some other difference equation) as

Lφ = f.  (5.47)

Given an approximate solution φ̄, define the error

e = φ − φ̄,  (5.48)

and the residual

r = f − Lφ̄.  (5.49)
Observe from (5.47) that if φ̄ = φ, then the residual is zero; therefore, the residual
is a measure of how “wrong” the approximate solution is.
Substituting (5.48) into equation (5.47) gives the error equation
Le = r = f − Lφ̄. (5.50)
From these definitions, we can devise a scheme with which to correct the solution
on a fine grid by solving for the error on a coarse grid.
⇒ Coarse-Grid Correction (CGC)
Definitions:
Fine grid → Ωh ; Coarse grid → Ω2h
This CGC scheme is the primary component of the many multigrid algorithms. To illustrate, consider the CGC sequence for equation (5.44), i.e. φ⁰ = 0 (I = 64, ν₁ = 3):
Initial guess:
→ The low-frequency mode is reduced very little after three relaxation steps.
Note: The error being solved for on successively coarser grids is the error of the
error on the next finer grid, i.e. on grid...
Ωh → relaxation on original equation for φ.
Ω2h → relaxation on equation for error on Ωh .
Ω4h → relaxation on equation for error on Ω2h .
..
.
This simple V-cycle scheme is appropriate when a good initial guess is available.
For example, when considering a solution to equation (5.45) in the context of an
unsteady calculation in which case the solution for φh from the previous time step
is a good initial guess for the current time step.
If no good initial guess is available, then Full Multigrid V-Cycle (FMV) may be
applied according to the following procedure:
1) Solve Lφ = f on the coarsest grid (Note: φ, not e).
2) Interpolate to next finer grid.
3) Perform V-cycle to correct φ.
4) Interpolate to next finer grid.
5) Repeat (3) and (4) until finest grid is reached.
6) Perform V-cycles until convergence.
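The coarse-grid correction at the heart of these cycles can be sketched for a 1-D Poisson-type model problem. The smoother (weighted Jacobi with ω = 2/3), the sweep counts, the full-weighting restriction, and the test problem are illustrative assumptions made here, not the notes' prescription.

```python
import math

def relax(phi, f, h, sweeps, w=2.0 / 3.0):
    # Weighted Jacobi smoothing for d2(phi)/dx2 = f, phi fixed at both ends.
    n = len(phi)
    for _ in range(sweeps):
        new = phi[:]
        for i in range(1, n - 1):
            new[i] = (1 - w) * phi[i] + w * (phi[i - 1] + phi[i + 1] - h * h * f[i]) / 2.0
        phi = new
    return phi

def residual(phi, f, h):
    n = len(phi)
    return [0.0] + [f[i] - (phi[i + 1] - 2 * phi[i] + phi[i - 1]) / (h * h)
                    for i in range(1, n - 1)] + [0.0]

def two_grid_cycle(phi, f, h):
    # CGC: smooth, restrict the residual, solve the error equation L e = r on
    # the coarse grid, interpolate the error back, correct, and smooth again.
    phi = relax(phi, f, h, 3)
    r = residual(phi, f, h)
    nc = (len(phi) - 1) // 2 + 1
    rc = [0.0] + [0.25 * r[2 * i - 1] + 0.5 * r[2 * i] + 0.25 * r[2 * i + 1]
                  for i in range(1, nc - 1)] + [0.0]      # full weighting
    ec = relax([0.0] * nc, rc, 2 * h, 500)                # near-exact coarse solve
    e = [0.0] * len(phi)                                  # linear interpolation
    for i in range(nc - 1):
        e[2 * i] = ec[i]
        e[2 * i + 1] = 0.5 * (ec[i] + ec[i + 1])
    e[-1] = ec[-1]
    phi = [p + de for p, de in zip(phi, e)]
    return relax(phi, f, h, 3)

I = 16
h = 1.0 / I
f = [-math.pi ** 2 * math.sin(math.pi * i * h) for i in range(I + 1)]
phi = [0.0] * (I + 1)
r0 = max(abs(v) for v in residual(phi, f, h))
for _ in range(5):
    phi = two_grid_cycle(phi, f, h)
r5 = max(abs(v) for v in residual(phi, f, h))
```

A few cycles reduce the residual by orders of magnitude, far faster than smoothing alone.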
Grid Definitions:
Because each successive grid differs by a factor of two, the finest grid size is often taken as 2^n + 1, where n is an integer. Somewhat more general grids may be obtained using the following grid definitions.
The differential equation (5.45) is discretized on a uniform grid having N_x × N_y points, which are defined by
N_x = m_x 2^{n_x−1} + 1,  N_y = m_y 2^{n_y−1} + 1,
where n_x and n_y determine the number of grid levels, and m_x and m_y determine the size of the coarsest grid, which is (m_x + 1) × (m_y + 1).
For a given grid, nx and ny should be as large as possible, and mx and my should
be as small as possible for maximum efficiency (typically mx and my are 2, 3 or
5). For example:
Nx = 65 ⇒ mx = 2, nx = 6
Nx = 129 ⇒ mx = 2, nx = 7
Nx = 49 ⇒ mx = 3, nx = 5
Nx = 81 ⇒ mx = 5, nx = 5
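The decomposition N = m·2^{n−1} + 1 implied by these examples can be computed by factoring powers of two out of N − 1. This is a sketch; the rule that m stay at least 2 (so the coarsest grid has interior points) is inferred from the examples above.

```python
def grid_levels(N):
    # Factor N - 1 = m * 2**(n-1) with m as small as possible but m >= 2,
    # so that N = m * 2**(n-1) + 1; n is the number of grid levels and the
    # coarsest grid has m + 1 points.
    m, n = N - 1, 1
    while m % 2 == 0 and m > 2:
        m //= 2
        n += 1
    return m, n
```

For example, `grid_levels(65)` gives m = 2 and n = 6, matching the table above.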
MMAE 517 (Illinois Institute of Technology), Computational Fluid Dynamics, © 2010 K. W. Cassel
The grids are denoted G(L), L = 1, . . . , N, where
N = max(n_x, n_y),  (5.52)
G(1) is the coarsest grid, and G(N) is the finest grid. Each grid G(L) has M_x(L) × M_y(L) grid points.
For example,
Nx = 65, Ny = 49 ⇒ mx = 2, nx = 6 and my = 3, ny = 5.
Boundary Conditions
At each boundary the general form of the boundary condition is
pφ + q ∂φ/∂n = s,  (5.55)
where n is the direction normal to the surface. This boundary condition is applied
directly on the finest grid Ωh , i.e.
p^h φ^h + q^h ∂φ^h/∂n = s^h.  (5.56)
But on the coarser grids, we need the boundary condition for the error. In order to
obtain such a condition, consider the following. On the coarse grid Ω2h , equation
(5.55) applies to the solution φ; thus,
p^{2h} φ^{2h} + q^{2h} ∂φ^{2h}/∂n = s^{2h}.

Similarly, the approximate solution φ̄^{2h} satisfies

p^{2h} φ̄^{2h} + q^{2h} ∂φ̄^{2h}/∂n = s^{2h}.

Therefore, to obtain a boundary condition for the error on Ω^{2h}, subtract:

p^{2h} e^{2h} + q^{2h} ∂e^{2h}/∂n = [p^{2h} φ^{2h} + q^{2h} ∂φ^{2h}/∂n] − [p^{2h} φ̄^{2h} + q^{2h} ∂φ̄^{2h}/∂n] = s^{2h} − s^{2h},  (5.57)

so that

p^{2h} e^{2h} + q^{2h} ∂e^{2h}/∂n = 0.
Thus, the boundary conditions are homogeneous on all but the finest grid, where
the original condition on φ is applied. For example,
Dirichlet ⇒ e = 0 on boundary.
Neumann ⇒ ∂e/∂n = 0 on boundary.
That is, the type of boundary condition (Dirichlet, Neumann, or Robin) does not
change, i.e. the p and q coefficients are the same, but they become homogeneous,
i.e. s = 0.
Relaxation
Typically, red-black Gauss-Seidel iteration is used to relax the difference equation. By performing the relaxation on all of the red and black points separately, it eliminates data dependencies such that it is easily implemented on parallel computers (see section 10). Note that when Gauss-Seidel is used, SOR should not be implemented because it destroys the high-frequency smoothing.

Alternatively, line relaxation may be used. Lines of constant y are swept by solving the tridiagonal problem for each j = 1, . . . , M_y(L) given by

a_{i,j}φ_{i+1,j} + e_{i,j}φ_{i,j} + b_{i,j}φ_{i−1,j} = f_{i,j} − c_{i,j}φ*_{i,j+1} − d_{i,j}φ*_{i,j−1},  (5.58)

for i = 1, . . . , M_x(L). Then lines of constant x are swept by solving the tridiagonal problem for each i = 1, . . . , M_x(L) given by

c_{i,j}φ_{i,j+1} + e_{i,j}φ_{i,j} + d_{i,j}φ_{i,j−1} = f_{i,j} − a_{i,j}φ*_{i+1,j} − b_{i,j}φ*_{i−1,j},  (5.59)

for j = 1, . . . , M_y(L). Again we could sweep all lines with i even and i odd separately.
Restriction Operator: I^{2h}_h
The coarse grid has
NXC = (NXF − 1)/2 + 1,  NYC = (NYF − 1)/2 + 1
points, and coarse-grid point (i, j) coincides with fine-grid point (i*, j*), where
i* = 2i − 1,  j* = 2j − 1.
The simplest restriction operator is straight injection,
φ^{2h}_{i,j} = φ^h_{i*,j*},
i.e. we simply drop the points that are not common to both grids. The matrix symbol for straight injection is [1].
Thus,

φ^{2h}_{i,j} = (1/16)[φ^h_{i*−1,j*−1} + φ^h_{i*−1,j*+1} + φ^h_{i*+1,j*−1} + φ^h_{i*+1,j*+1}]
 + (1/8)[φ^h_{i*,j*−1} + φ^h_{i*,j*+1} + φ^h_{i*−1,j*} + φ^h_{i*+1,j*}]
 + (1/4)φ^h_{i*,j*}.  (5.60)
Interpolation (Prolongation) Operator: I^h_{2h}
The interpolation operator is required for moving information from the coarser to
finer grid. The most commonly used interpolation operator is based on bilinear
interpolation.
φ^h_{i*,j*} = φ^{2h}_{i,j}  ← copy common points
φ^h_{i*+1,j*} = (1/2)(φ^{2h}_{i,j} + φ^{2h}_{i+1,j})
φ^h_{i*,j*+1} = (1/2)(φ^{2h}_{i,j} + φ^{2h}_{i,j+1})
φ^h_{i*+1,j*+1} = (1/4)(φ^{2h}_{i,j} + φ^{2h}_{i+1,j} + φ^{2h}_{i,j+1} + φ^{2h}_{i+1,j+1})
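The restriction and prolongation operators can be sketched as follows. For simplicity this sketch assumes square grids and 0-based indexing, so coarse point (i, j) maps to fine point (2i, 2j) rather than the 1-based i* = 2i − 1 of the notes.

```python
def restrict_fw(fine):
    # Full-weighting restriction (5.60); boundary points are copied directly.
    nc = (len(fine) - 1) // 2 + 1
    coarse = [[0.0] * nc for _ in range(nc)]
    for i in range(nc):
        for j in range(nc):
            I, J = 2 * i, 2 * j
            if i in (0, nc - 1) or j in (0, nc - 1):
                coarse[i][j] = fine[I][J]
            else:
                coarse[i][j] = (fine[I][J] / 4.0
                    + (fine[I - 1][J] + fine[I + 1][J]
                       + fine[I][J - 1] + fine[I][J + 1]) / 8.0
                    + (fine[I - 1][J - 1] + fine[I - 1][J + 1]
                       + fine[I + 1][J - 1] + fine[I + 1][J + 1]) / 16.0)
    return coarse

def prolong_bilinear(coarse):
    # Bilinear interpolation to the next finer grid.
    nf = 2 * (len(coarse) - 1) + 1
    fine = [[0.0] * nf for _ in range(nf)]
    for i in range(len(coarse)):          # copy common points
        for j in range(len(coarse)):
            fine[2 * i][2 * j] = coarse[i][j]
    for i in range(0, nf, 2):             # fill odd columns on even rows
        for j in range(1, nf, 2):
            fine[i][j] = 0.5 * (fine[i][j - 1] + fine[i][j + 1])
    for i in range(1, nf, 2):             # fill odd rows by vertical averaging
        for j in range(nf):
            fine[i][j] = 0.5 * (fine[i - 1][j] + fine[i + 1][j])
    return fine

# Both operators reproduce a bilinear function exactly, e.g. phi = x + y.
n = 9
f0 = [[i / (n - 1) + j / (n - 1) for j in range(n)] for i in range(n)]
f1 = prolong_bilinear(restrict_fw(f0))
diff = max(abs(f1[i][j] - f0[i][j]) for i in range(n) for j in range(n))
```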
Speed Comparisons
Consider the test problem
∂2φ ∂φ ∂2φ ∂φ
A(x) + B(x) + C(y) + D(y) = F (x, y),
∂x2 ∂x ∂y 2 ∂y
with Neumann boundary conditions. The following times are for an SGI Indy
R5000-150MHz. The grid is N × N .
ADI (ε = convergence tolerance):

N     Iterations (ε = 10⁻⁴)   Time (sec)   Iterations (ε = 10⁻⁵)   Time (sec)
65    673                     22.35        821                     27.22
129   2,408                   366.06       2,995                   456.03
Note that in both cases, the total time required for the N = 129 case is
approximately 16× that with N = 65 (∼ 4× increase in points and ∼ 4× increase
in iterations).
Multigrid:
V-cycle with ADI relaxation (no FMV to get an improved initial guess). Here the convergence criterion is evaluated between V-cycles.

N     Iterations (ε = 10⁻⁴)   Time (sec)   Iterations (ε = 10⁻⁵)   Time (sec)
65    18                      1.78         23                      2.28
129   23                      10.10        29                      12.68
Notes:
1) In both cases, the total time required for the N = 129 case is approximately
6× that with N = 65 (the minimum is 4×).
⇒ The multigrid method scales to larger grid sizes more effectively than ADI
alone, i.e. note the small increase in the number of V-cycles with increasing N .
2) The case with N = 65 is approximately 13× faster than ADI, and the case
with N = 129 is approximately 36× faster!
3) References:
1 Briggs, W.L., Henson, V.E. and McCormick, S.F., A Multigrid Tutorial, 2nd ed., SIAM (2000).
2 Thomas, J.L., Diskin, B. and Brandt, A., “Textbook Multigrid Efficiency for Fluid Simulations,” Annu. Rev. Fluid Mech. 35 (2003), pp. 317–340.
Consider the steady, two-dimensional Burgers equations

Re(u ∂u/∂x + v ∂u/∂y) = ∂²u/∂x² + ∂²u/∂y²,  (5.61)

Re(u ∂v/∂x + v ∂v/∂y) = ∂²v/∂x² + ∂²v/∂y²,  (5.62)

which represent a simplified version of the Navier-Stokes equations as there are no pressure terms. The terms on the left-hand-side are the convection terms, and those on the right-hand-side are the viscous or diffusion terms.
The Burgers equations are elliptic due to the nature of the second-order viscous terms, but the convection terms make the equations non-linear.
A simple approach to linearizing the equations is known as Picard iteration, in which we take the coefficients of the non-linear (first-derivative) terms to be known from the previous iteration, denoted by u*_{i,j}, v*_{i,j}.
Let us begin by approximating (5.61) using central differences for all derivatives as follows:

Re[u*_{i,j}(u_{i+1,j} − u_{i−1,j})/(2∆x) + v*_{i,j}(u_{i,j+1} − u_{i,j−1})/(2∆y)] = (u_{i+1,j} − 2u_{i,j} + u_{i−1,j})/(∆x)² + (u_{i,j+1} − 2u_{i,j} + u_{i,j−1})/(∆y)².

Multiplying by (∆x)² and collecting terms gives

(1 − ½Re ∆x u*_{i,j})u_{i+1,j} + (1 + ½Re ∆x u*_{i,j})u_{i−1,j} + (∆̄ − ½Re ∆x ∆̄^{1/2} v*_{i,j})u_{i,j+1} + (∆̄ + ½Re ∆x ∆̄^{1/2} v*_{i,j})u_{i,j−1} − 2(1 + ∆̄)u_{i,j} = 0,  (5.63)

where ∆̄ = (∆x/∆y)².
We can solve (5.63) using any of the iterative methods discussed except SOR (we generally need under-relaxation for non-linear problems). However, is (5.63) diagonally dominant? To be diagonally dominant we must have

|1 − p| + |1 + p| + |∆̄ − q| + |∆̄ + q| ≤ |−2(1 + ∆̄)|,

where

p = ½Re ∆x u*_{i,j},  q = ½Re ∆x ∆̄^{1/2} v*_{i,j}.

Suppose, for example, that p > 1 and q > ∆̄; then this requires that

(p − 1) + (1 + p) + (q − ∆̄) + (∆̄ + q) ≤ 2(1 + ∆̄),

2(p + q) ≤ 2(1 + ∆̄),

but with p > 1 and q > ∆̄ this condition cannot be satisfied, and equation (5.63) is not diagonally dominant. The same result holds for p < −1 and q < −∆̄. Therefore, we must have |p| ≤ 1 and |q| ≤ ∆̄, or

½Re ∆x |u*_{i,j}| ≤ 1  and  ½Re ∆x |v*_{i,j}| ≤ ∆̄^{1/2},

which places severe restrictions on the mesh size for large Re.
Upwind-Downwind Differencing
In order to restore diagonal dominance, we use forward or backward differences for
the first-derivative terms depending upon the signs of the coefficients of the first
derivative terms, i.e. the velocities. For example, consider the u∗ ∂u/∂x term:
1 If u*_{i,j} > 0, then using a backward difference,
u* ∂u/∂x = u*_{i,j}(u_{i,j} − u_{i−1,j})/∆x + O(∆x),
which gives a positive addition to the u_{i,j} term to promote diagonal dominance.
2 If u*_{i,j} < 0, then using a forward difference,
u* ∂u/∂x = u*_{i,j}(u_{i+1,j} − u_{i,j})/∆x + O(∆x),
which again gives a positive addition to the u_{i,j} term to promote diagonal dominance.
Denoting the x-direction convection and diffusion terms by T_x, the upwind-downwind scheme gives

(∆x)² T_x = { u_{i+1,j} + (1 + Re ∆x u*_{i,j})u_{i−1,j} − (2 + Re ∆x u*_{i,j})u_{i,j},  u*_{i,j} > 0,
              (1 − Re ∆x u*_{i,j})u_{i+1,j} + u_{i−1,j} − (2 − Re ∆x u*_{i,j})u_{i,j},  u*_{i,j} < 0.

In both cases the off-diagonal coefficients sum in magnitude to the diagonal coefficient:

1 + (1 + Re ∆x u*_{i,j}) = 2 + Re ∆x u*_{i,j},  u*_{i,j} > 0,
(1 − Re ∆x u*_{i,j}) + 1 = 2 − Re ∆x u*_{i,j},  u*_{i,j} < 0.
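The coefficient switching can be sketched as a small helper that returns the (∆x)²T_x coefficients for one row; `upwind_row` is a hypothetical helper written for illustration, not part of the notes.

```python
def upwind_row(Re, dx, u_star):
    # Coefficients (a, b, c) multiplying (u_{i+1,j}, u_{i,j}, u_{i-1,j}) in
    # (dx^2) T_x, with the difference direction chosen by the sign of the
    # lagged coefficient u* = u_star (upwind-downwind differencing).
    p = Re * dx * u_star
    if u_star > 0:                     # backward difference for u* du/dx
        return 1.0, -(2.0 + p), 1.0 + p
    else:                              # forward difference for u* du/dx
        return 1.0 - p, -(2.0 - p), 1.0
```

For either sign of u*, the two off-diagonal magnitudes sum exactly to the diagonal magnitude, so diagonal dominance holds for any Re ∆x.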
Notes:
1 The same is true for the y-derivative terms when sweeping along lines of
constant x.
2 Upwind-downwind differencing forces diagonal dominance; therefore, the iteration will always converge with no mesh restrictions.
3 The forward and backward differences used for the first-order derivatives are
only first-order accurate, i.e. the method is O(∆x, ∆y) accurate. To see the potential effects of this error, consider the 1-D Burgers equation
Re u du/dx = d²u/dx².  (5.64)
Recall from section 3.2 that, for example, the first-order, backward-difference
approximation to the first-order derivative is
(du/dx)_i = (u_i − u_{i−1})/∆x + (∆x/2)(d²u/dx²)_i + · · · .

Substituting into the linearized equation (5.64) gives

Re u*_i [(u_i − u_{i−1})/∆x + (∆x/2)(d²u/dx²)_i + · · ·] = (d²u/dx²)_i,

or

Re u*_i (u_i − u_{i−1})/∆x = [1 − (Re/2)∆x u*_i](d²u/dx²)_i.
Therefore, depending upon the values of Re, ∆x, and u∗ , the truncation
error from the first-derivative terms, which is not included in the numerical
solution, may be of the same order, or even larger than, the physical diffusion
term. This is often referred to as artificial or numerical diffusion, the effects
of which increase with increasing Reynolds number.
4 Remedies:
i) Can return to O(∆x2 , ∆y 2 ) accuracy using deferred correction in which we
use the approximate solution to evaluate the leading term of the truncation
error which is then added to the original discretized equation as a source term.
ii) Alternatively, we could use second-order accurate forward and backward
differences, but the resulting system of equations would no longer be
tridiagonal.
5 Note that we have linearized the difference equation, not the differential
equation, in order to obtain a linear system of algebraic equations
⇒ The physical nonlinearity is still being solved for.
Consider the general, unsteady, one-dimensional parabolic equation

∂φ/∂t = a(x, t) ∂²φ/∂x² + b(x, t) ∂φ/∂x + c(x, t)φ + d(x, t).  (6.1)
A simple model problem for this is the unsteady, 1-D diffusion equation
∂φ/∂t = α ∂²φ/∂x²,  (6.2)
where
φ = T ⇒ heat conduction
φ = u ⇒ momentum diffusion (due to viscosity)
φ = ω ⇒ vorticity diffusion
φ = c ⇒ mass diffusion (c = concentration)
Techniques developed for equation (6.2) can be used for equation (6.1).
Methods of Solution:
1 Reduce partial differential equation to a set of ordinary differential equations
and solve, e.g. method of lines, predictor-corrector, Runge-Kutta, etc...
2 Finite-difference methods:
a) Explicit methods – obtain equation for φ at each mesh point.
b) Implicit methods – obtain set of algebraic equations for φ at all mesh points at
each ∆t.
Explicit Methods ⇒ Spatial derivatives are all evaluated at previous time level(s), i.e. a single unknown φ^{n+1}_i on the left-hand-side.
Note that now the superscript n denotes the time step rather than the iteration
number.
Using a first-order forward difference for the time derivative,

∂φ/∂t = (φ^{n+1}_i − φ^n_i)/∆t + O(∆t).
Second-order, central difference for spatial derivatives at nth time level (known)
(φ^{n+1}_i − φ^n_i)/∆t = α(φ^n_{i+1} − 2φ^n_i + φ^n_{i−1})/(∆x)²,

φ^{n+1}_i = φ^n_i + [α∆t/(∆x)²](φ^n_{i+1} − 2φ^n_i + φ^n_{i−1}),

φ^{n+1}_i = (1 − 2s)φ^n_i + s(φ^n_{i+1} + φ^n_{i−1}),  i = 2, . . . , I,  (6.3)

where s = α∆t/(∆x)².
Notes:
1 Equation (6.3) is an explicit equation for φ^{n+1}_i at the (n + 1)st time step.
2 Method is second-order accurate in space and first-order accurate in time.
3 Time steps ∆t may be varied from step-to-step.
4 Restrictions on ∆t and ∆x for the Euler method applied to the 1-D diffusion equation to remain stable (see section 6.3):

s = α∆t/(∆x)² ≤ 1/2 ⇒ stable (very restrictive);
s = α∆t/(∆x)² > 1/2 ⇒ unstable.
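The stability restriction can be demonstrated directly with equation (6.3). The grid size, step counts, and the tiny high-frequency seed added to the initial condition (to excite the unstable mode quickly) are illustrative choices for this sketch.

```python
import math

def euler_step(phi, s):
    # One step of (6.3): phi_i^{n+1} = (1 - 2s) phi_i + s (phi_{i+1} + phi_{i-1}),
    # with phi = 0 held at both boundaries.
    n = len(phi)
    return [0.0] + [(1 - 2 * s) * phi[i] + s * (phi[i + 1] + phi[i - 1])
                    for i in range(1, n - 1)] + [0.0]

I = 20
def max_after(s, steps=200):
    # Mode-1 initial condition plus a tiny high-frequency (k = 19) seed.
    phi = [math.sin(math.pi * i / I) + 1e-6 * math.sin(19 * math.pi * i / I)
           for i in range(I + 1)]
    for _ in range(steps):
        phi = euler_step(phi, s)
    return max(abs(v) for v in phi)

stable = max_after(0.5)    # s <= 1/2: the solution decays
unstable = max_after(0.6)  # s > 1/2: the high-frequency error is amplified
```

At s = 0.6 the highest-frequency mode grows by a factor of roughly 1.4 per step, so even a 10⁻⁶ perturbation dominates after a few hundred steps.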
Richardson Method
We want to improve on the temporal accuracy of the Euler method; therefore, we
use a central difference for the time derivative.
Notes:
1 Second-order accurate in space and time.
2 Must keep ∆t constant and requires starting method (need φi at two time
steps).
3 Unconditionally unstable for s > 0 ⇒ Do not use.
DuFort-Frankel Method
In order to maintain second-order accuracy, but improve stability, let us modify the Richardson method by taking an average between time levels for φ^n_i. To devise such an approximation, consider the Taylor series approximation at t_{n+1} about t_n:

φ^{n+1}_i = φ^n_i + ∆t(∂φ/∂t)^n_i + (∆t²/2)(∂²φ/∂t²)^n_i + · · · .

The resulting scheme is

φ^{n+1}_i = [(1 − 2s)/(1 + 2s)]φ^{n−1}_i + [2s/(1 + 2s)](φ^n_{i+1} + φ^n_{i−1}),  i = 2, . . . , I,  (6.6)

with truncation error

−(α∆t²/∆x²)(∂²φ/∂t²)^n_i − (∆t²/6)(∂³φ/∂t³)^n_i + (α∆x²/12)(∂⁴φ/∂x⁴)^n_i + · · · .
Introduction
Real flows ⇒ small disturbances, e.g. imperfections, vibrations, etc...
Numerical solutions ⇒ small errors, e.g. truncation, round-off, etc...
Issue → What happens to small disturbances/errors as flow and/or solution
evolves in time?
Decay ⇒ stable (disturbances/errors are damped out).
Grow ⇒ unstable (disturbances/errors are amplified).
Two possible sources of instability in CFD:
1 Hydrodynamic instability – the flow itself is inherently unstable (see, for
example, Drazin & Reid)
→ This is real, i.e. physical
2 Numerical instability – the numerical algorithm magnifies small errors
→ This is not physical ⇒ Need a new method.
Difficulty → In CFD both are manifest in similar ways, i.e. oscillatory solutions;
therefore, it is often difficult to determine whether oscillatory numerical solutions
are a result of a numerical or hydrodynamic instability.
→ For an example of this, see “Supersonic Boundary-Layer Flow Over a
Compression Ramp,” Cassel, Ruban & Walker, JFM 1995, 1996.
Hydrodynamic vs. Numerical Instability:
1 Hydrodynamic stability analysis (Section 9 and MMAE 514):
Often difficult to perform.
Assumptions must often be made, e.g. parallel flow.
Can provide conclusive evidence for hydrodynamic instability (particularly if
confirmed by analytical or numerical results). For example, in supersonic flow
over a ramp, the Rayleigh and Fjørtoft theorems provide necessary conditions that
can be tested for.
2 Numerical stability analysis:
Often gives guidance, but not always conclusive for complex problems.
Note: Just because a numerical solution does not become oscillatory does not
mean that no physical instabilities are present! There may not be sufficient
resolution.
Matrix Method
Denote the exact solution of the difference equation at t = tn by φ̃ni ; then the
error is
e^n_i = φ^n_i − φ̃^n_i,  i = 2, . . . , I,  (6.7)
where φni is the approximate solution at t = tn . Consider the
first-order explicit (Euler) method given by equation (6.3), which is repeated here
φ^{n+1}_i = (1 − 2s)φ^n_i + s(φ^n_{i+1} + φ^n_{i−1}),  (6.8)
where s = α∆t/∆x². Both φ^n_i and φ̃^n_i satisfy this equation; thus, the error satisfies the same equation,

e^{n+1} = A e^n.

Thus, we perform a matrix multiply to advance each time step (cf. the matrix form for iterative methods), where

    ⎡ 1−2s    s      0    · · ·    0      0   ⎤          ⎡ e^n_2    ⎤
    ⎢   s   1−2s     s    · · ·    0      0   ⎥          ⎢ e^n_3    ⎥
A = ⎢   0     s    1−2s   · · ·    0      0   ⎥ ,  e^n = ⎢ e^n_4    ⎥
    ⎢   ⋮     ⋮      ⋮     ⋱       ⋮      ⋮   ⎥          ⎢   ⋮      ⎥
    ⎢   0     0      0    · · ·  1−2s     s   ⎥          ⎢ e^n_{I−1}⎥
    ⎣   0     0      0    · · ·    s    1−2s  ⎦          ⎣ e^n_I    ⎦
Note that if φ is specified at the boundaries, then the error is zero there, i.e. e^n_1 = e^n_{I+1} = 0.
The method is stable if the eigenvalues λ_j of the matrix A are such that |λ_j| ≤ 1 for all j.

von Neumann (Fourier) Method
Expand the error at t = 0 in a Fourier series,

e(x, 0) = Σ_{m=1}^{I−1} a_m(0) e^{iθ_m x},

where a_m(0) are the amplitudes of the Fourier modes, θ_m = mπ, and here i = √−1. At a later time t,
e(x, t) = Σ_{m=1}^{I−1} e_m(x, t) = Σ_{m=1}^{I−1} a_m(t) e^{iθ_m x}.  (6.14)
Define

G_m(x, t) = e_m(x, t)/e_m(x, t − ∆t) = a_m(t)/a_m(t − ∆t),

which is the amplification factor for the mth mode during one time step. Therefore, the error will not grow if |G_m| ≤ 1 for all m, i.e. the method is stable.
If it takes n time steps to get to time t, then the amplification after n time steps is (G_m)^n = a_m(t)/a_m(0).
For the first-order explicit (Euler) method, the error equation corresponding to (6.8) is (use index j instead of i)
e^{n+1}_j = (1 − 2s)e^n_j + s(e^n_{j+1} + e^n_{j−1}),  j = 2, . . . , I.  (6.15)

This equation is linear; therefore, each mode m must satisfy the equation independently. Thus, substituting (6.14) with a_m(t) = (G_m)^n a_m(0) into equation (6.15) gives (canceling a_m(0) in each term)

(G_m)^{n+1} e^{iθ_m x} = (1 − 2s)(G_m)^n e^{iθ_m x} + s(G_m)^n [e^{iθ_m(x+∆x)} + e^{iθ_m(x−∆x)}].

Dividing by (G_m)^n e^{iθ_m x} gives

G_m = 1 − 2s + 2s cos(θ_m ∆x) = 1 − 4s sin²(θ_m ∆x/2),

and stability requires −1 ≤ G_m ≤ 1; the right inequality holds for all s > 0.
The left inequality holds if s ≤ 1/2 (see matrix method). Thus, for this case we
obtain the same stability criterion as from the matrix method.
Note that |Gm | is the modulus; thus, if we have a complex number, |Gm | equals
the square root of the sum of the squares of the real and imaginary parts.
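The amplification factor derived above is easy to tabulate numerically. This is a sketch; the θ sample points are an arbitrary choice.

```python
import math

def G_euler(s, theta_dx):
    # Amplification factor G_m = 1 - 4 s sin^2(theta_m dx / 2) for the
    # first-order explicit (Euler) scheme applied to 1-D diffusion.
    return 1.0 - 4.0 * s * math.sin(theta_dx / 2.0) ** 2

def worst(s, samples=100):
    # Largest |G_m| over theta_m dx in [0, pi].
    return max(abs(G_euler(s, k * math.pi / samples)) for k in range(samples + 1))
```

Scanning θ_m∆x over [0, π] confirms that max|G_m| ≤ 1 exactly up to s = 1/2 and exceeds 1 beyond it, reproducing the matrix-method criterion.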
First-Order Implicit
Recall the first-order explicit method:
Central difference for spatial derivatives on the nth (previous) time level.
First-order forward difference for time derivative.
First-order implicit:
Central difference for spatial derivatives on the (n + 1)st (current) time level.
First-order backward difference for time derivative.
(φ^{n+1}_j − φ^n_j)/∆t = α(φ^{n+1}_{j+1} − 2φ^{n+1}_j + φ^{n+1}_{j−1})/∆x² + O(∆t, ∆x²).
Thus,
sφ^{n+1}_{j+1} − (1 + 2s)φ^{n+1}_j + sφ^{n+1}_{j−1} = −φ^n_j,  j = 2, . . . , I,  (6.16)

which is a tridiagonal problem for φ^{n+1}_j at the current time level. Note that it is strongly diagonally dominant.
The error satisfies the same equation,

se^{n+1}_{j+1} − (1 + 2s)e^{n+1}_j + se^{n+1}_{j−1} = −e^n_j,  (6.17)

and von Neumann analysis gives

G_m = [1 + 4s sin²(θ_m ∆x/2)]^{−1}.

Thus, the method is stable if |G_m| ≤ 1. Note that

1 + 4s sin²(θ_m ∆x/2) > 1

for all s > 0; therefore, |G_m| < 1, and the method is unconditionally stable.
Crank-Nicolson
We prefer second-order accuracy in time; therefore, consider approximating the
equation midway between time levels.
Later we will show that averaging the diffusion terms across time levels in this
manner is second-order accurate in time. Writing the difference equation in
tridiagonal form, we have
sφ^{n+1}_{i+1} − 2(1 + s)φ^{n+1}_i + sφ^{n+1}_{i−1} = −sφ^n_{i+1} − 2(1 − s)φ^n_i − sφ^n_{i−1},  i = 2, . . . , I,  (6.19)

which we solve for φ^{n+1}_i at the current time level.
Notes:
1) Second-order accurate in space and time.
2) Unconditionally stable for all s.
3) Apply derivative boundary conditions at current time level.
4) Very popular scheme for parabolic problems.
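A single Crank-Nicolson step per (6.19) can be sketched with a tridiagonal solve. Homogeneous Dirichlet boundaries and the grid/parameter choices are assumptions of this sketch.

```python
import math

def thomas(sub, diag, sup, rhs):
    # Thomas algorithm for the tridiagonal systems produced by (6.19).
    n = len(rhs)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = sup[0] / diag[0], rhs[0] / diag[0]
    for k in range(1, n):
        m = diag[k] - sub[k] * cp[k - 1]
        cp[k] = sup[k] / m
        dp[k] = (rhs[k] - sub[k] * dp[k - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for k in range(n - 2, -1, -1):
        x[k] = dp[k] - cp[k] * x[k + 1]
    return x

def cn_step(phi, s):
    # One Crank-Nicolson step (6.19); phi = 0 at both boundaries is assumed.
    n = len(phi)
    rhs = [-s * phi[i + 1] - 2 * (1 - s) * phi[i] - s * phi[i - 1]
           for i in range(1, n - 1)]
    return [0.0] + thomas([s] * (n - 2), [-2 * (1 + s)] * (n - 2),
                          [s] * (n - 2), rhs) + [0.0]

I = 32
phi0 = [math.sin(math.pi * i / I) for i in range(I + 1)]
phi1 = cn_step(phi0, s=5.0)       # s far beyond the explicit limit of 1/2
amp = max(abs(v) for v in phi1)
```

Even at s = 5 the amplitude decays, matching the analytic amplification factor (1 − 2s sin²(θ∆x/2))/(1 + 2s sin²(θ∆x/2)) for the sine mode.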
φ^{n+1/2}_i = ½(φ^{n+1}_i + φ^n_i) + T.E.  (6.20)

We seek an expression of the form φ^{n+1/2}_i = φ̃ + T.E.
Numerical Solutions of Parabolic Problems Implicit Methods
where D_t = ∂/∂t, and φ̃ is the exact value of φ midway between time levels, i.e. φ^{n+1/2}_i = φ̃ + T.E. Substituting these expansions into (6.20) gives

φ^{n+1/2}_i = ½ Σ_{k=0}^∞ (1/k!)(∆t/2)^k [1 + (−1)^k] D_t^k φ̃.

Note that

1 + (−1)^k = { 0, k = 1, 3, 5, . . . ; 2, k = 0, 2, 4, . . . }.

Therefore,

φ^{n+1/2}_i = ½(φ^{n+1}_i + φ^n_i) + O(∆t²).
This shows that averaging across time levels gives an O(∆t2 ) approximation of φi
at the mid-time level tn+1/2 . Note that this agrees with the result (6.5) except for
the constant factor (we averaged across two time levels for the DuFort-Frankel
method).
Numerical Solutions of Parabolic Problems Non-Linear Convective Problems
Outline
Outline (cont’d)
Factored ADI Method
Consider the unsteady, 1-D Burgers equation (the 1-D, unsteady diffusion equation with a convection term)

∂u/∂t = ν ∂²u/∂x² − u ∂u/∂x,  (6.21)

where ν is the viscosity. We want to consider how the nonlinear convection term is treated in the various schemes.
First-Order Explicit
Approximating spatial derivatives at the previous time level, and using a forward difference in time, we obtain an explicit scheme; stability requires that 2 ≤ Re_∆x ≤ 2/C_i, where Re_∆x = u^n_i ∆x/ν is the mesh Reynolds number. This is very restrictive.
Crank-Nicolson
Averaging across time levels,

u^{n+1/2}_i = ½(u^{n+1}_i + u^n_i) + O(∆t²).  (6.23)

Thus, this results in the implicit finite-difference equation

−(s − C_i/2)u^{n+1}_{i+1} + 2(1 + s)u^{n+1}_i − (s + C_i/2)u^{n+1}_{i−1} = (s − C_i/2)u^n_{i+1} + 2(1 − s)u^n_i + (s + C_i/2)u^n_{i−1},  (6.24)

where here C_i = u^{n+1/2}_i ∆t/∆x, but we do not know u^{n+1/2}_i yet, i.e. it is nonlinear.
Therefore, this procedure requires iteration at each time step:
1) Begin with u^{n+1/2}_i = u^n_i, i.e. use u_i from the previous time step as the initial guess at the current time step.
2) Compute the update for u^{n+1}_i, i = 1, . . . , I + 1, using equation (6.24).
3) Update u^{n+1/2}_i = ½(u^{n+1}_i + u^n_i).
4) Repeat (2) and (3) until u^{n+1}_i converges for all i.
Notes:
1) It typically requires less than ten iterations to converge at each time step; if
more are required, then the time step ∆t is too large.
2) In elliptic problems we use Picard iteration because we only care about the
final converged solution, whereas here we want an accurate solution at each
time step.
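The per-time-step iteration above can be sketched as follows. Zero Dirichlet boundaries and the parameter values are illustrative assumptions of this sketch.

```python
import math

def thomas(sub, diag, sup, rhs):
    # Thomas algorithm for the tridiagonal systems produced by (6.24).
    n = len(rhs)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = sup[0] / diag[0], rhs[0] / diag[0]
    for k in range(1, n):
        m = diag[k] - sub[k] * cp[k - 1]
        cp[k] = sup[k] / m
        dp[k] = (rhs[k] - sub[k] * dp[k - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for k in range(n - 2, -1, -1):
        x[k] = dp[k] - cp[k] * x[k + 1]
    return x

def burgers_cn_step(u, nu, dt, dx, tol=1e-8, max_iter=50):
    # One Crank-Nicolson time step (6.24) for the Burgers equation, iterating
    # on the nonlinear coefficient C_i = u_i^{n+1/2} dt/dx; u = 0 at both ends.
    n = len(u)
    s = nu * dt / dx ** 2
    u_new = u[:]                             # step 1: previous level as guess
    for it in range(max_iter):
        u_half = [0.5 * (a + b) for a, b in zip(u_new, u)]   # step 3
        C = [v * dt / dx for v in u_half]
        sub = [-(s + C[i] / 2) for i in range(1, n - 1)]
        diag = [2 * (1 + s)] * (n - 2)
        sup = [-(s - C[i] / 2) for i in range(1, n - 1)]
        rhs = [(s - C[i] / 2) * u[i + 1] + 2 * (1 - s) * u[i]
               + (s + C[i] / 2) * u[i - 1] for i in range(1, n - 1)]
        u_prev, u_new = u_new, [0.0] + thomas(sub, diag, sup, rhs) + [0.0]
        if max(abs(a - b) for a, b in zip(u_new, u_prev)) < tol:
            break                            # step 4: converged
    return u_new, it + 1

I = 32
u0 = [math.sin(math.pi * i / I) for i in range(I + 1)]
u1, niter = burgers_cn_step(u0, nu=0.1, dt=0.01, dx=1.0 / I)
```

With these parameters the iteration converges in well under ten passes, consistent with the note above.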
Upwind-Downwind Differencing
Consider the first-order convective terms in the Crank-Nicolson approximation, i.e.

u^{n+1/2} ∂u^{n+1/2}/∂x.

If u^{n+1/2} > 0, then approximate ∂u^{n+1/2}/∂x as follows:

∂u/∂x = ½[(∂u^{n+1}/∂x)_{i−1/2} + (∂u^n/∂x)_{i+1/2}] + O(∆t²)
 = ½[(u^{n+1}_i − u^{n+1}_{i−1})/∆x + (u^n_{i+1} − u^n_i)/∆x] + O(∆x², ∆t²),  (6.25)
Notes:
1) Although the finite-difference approximation at the current time level appears
to be a backward difference and that at the previous time level appears to be
a forward difference, they are really central differences evaluated at
half-points in the grid.
2) The fact that this approximation is O(∆x2 , ∆t2 ) accurate will be shown at
the end of this section.
Now, if u^{n+1/2} < 0, we have

∂u/∂x = ½[(∂u^{n+1}/∂x)_{i+1/2} + (∂u^n/∂x)_{i−1/2}] + O(∆t²)
 = ½[(u^{n+1}_{i+1} − u^{n+1}_i)/∆x + (u^n_i − u^n_{i−1})/∆x] + O(∆x², ∆t²).  (6.26)
Use (6.25) and (6.26) in equation (6.22) rather than central differences:

u^{n+1}_i − u^n_i = (s/2)[u^{n+1}_{i+1} − 2u^{n+1}_i + u^{n+1}_{i−1} + u^n_{i+1} − 2u^n_i + u^n_{i−1}]
 − (C_i/2){ u^{n+1}_i − u^{n+1}_{i−1} + u^n_{i+1} − u^n_i,  u^{n+1/2}_i > 0,
            u^{n+1}_{i+1} − u^{n+1}_i + u^n_i − u^n_{i−1},  u^{n+1/2}_i < 0. }

Collecting the unknowns on the left-hand-side gives

−su^{n+1}_{i+1} + 2(1 + s)u^{n+1}_i − su^{n+1}_{i−1} + C_i { u^{n+1}_i − u^{n+1}_{i−1},  u^{n+1/2}_i > 0,
                                                             u^{n+1}_{i+1} − u^{n+1}_i,  u^{n+1/2}_i < 0 }
 = su^n_{i+1} + 2(1 − s)u^n_i + su^n_{i−1} − C_i { u^n_{i+1} − u^n_i,  u^{n+1/2}_i > 0,
                                                   u^n_i − u^n_{i−1},  u^{n+1/2}_i < 0. }  (6.27)
Notes:
1) Equation (6.27) is diagonally dominant for all s and C_i (note that C_i may be positive or negative). Be sure to check this for different equations, i.e. for other than the one-dimensional Burgers equation.
2) Iteration at each time step may require under-relaxation on u_i; therefore,
u^{k+1}_i = ωu^{k+1/2}_i + (1 − ω)u^k_i,  k = 0, 1, 2, . . . ,
i = ωui + (1 − ω)uki , k = 0, 1, 2, . . . ,
n+1/2
Here, Dt = ∂/∂t, Dx = ∂/∂x, and ũ is the exact value of ui midway
between time levels.
We seek an expression of the form
∂u ∂ ũ
= + T.E.
∂x ∂x
Expanding each term in equation (6.28) as a 2-D Taylor series about (x_i, t_{n+1/2}):

u^{n+1}_{i+1} = Σ_{k=0}^∞ (1/k!)[(∆t/2)D_t + ∆x D_x]^k ũ,

u^{n+1}_i = Σ_{k=0}^∞ (1/k!)[(∆t/2)D_t]^k ũ,

u^n_i = Σ_{k=0}^∞ (1/k!)[−(∆t/2)D_t]^k ũ = Σ_{k=0}^∞ ((−1)^k/k!)[(∆t/2)D_t]^k ũ,

u^n_{i−1} = Σ_{k=0}^∞ (1/k!)[−(∆t/2)D_t − ∆x D_x]^k ũ = Σ_{k=0}^∞ ((−1)^k/k!)[(∆t/2)D_t + ∆x D_x]^k ũ.
Note that

1 + (−1)^{k+1} = { 0, k = 0, 2, 4, . . . ; 2, k = 1, 3, 5, . . . }.

Therefore, let k = 2l + 1; thus,

∂u/∂x = (1/∆x) Σ_{l=0}^∞ (1/(2l+1)!) {[(∆t/2)D_t + ∆x D_x]^{2l+1} − [(∆t/2)D_t]^{2l+1}} ũ.
Expanding via the binomial theorem, where the binomial coefficients are

(k m) = k!/(m!(k − m)!)  (0! = 1),

gives

∂u/∂x = (1/∆x) Σ_{l=0}^∞ (1/(2l+1)!) [Σ_{m=0}^{2l+1} (2l+1 m) ((∆t/2)D_t)^{2l+1−m} (∆x D_x)^m − ((∆t/2)D_t)^{2l+1}] ũ

 = (1/∆x) Σ_{l=0}^∞ (1/(2l+1)!) [Σ_{m=1}^{2l+1} (2l+1 m) ((∆t/2)D_t)^{2l−m+1} (∆x D_x)^m + ((∆t/2)D_t)^{2l+1} − ((∆t/2)D_t)^{2l+1}] ũ

 = Σ_{l=0}^∞ Σ_{m=1}^{2l+1} [1/(m!(2l − m + 1)!)] ((∆t/2)D_t)^{2l−m+1} (∆x)^{m−1} D_x^m ũ.

The l = 0 term is simply ∂ũ/∂x; therefore,

∂u/∂x = ∂ũ/∂x + Σ_{l=1}^∞ Σ_{m=1}^{2l+1} [1/(m!(2l − m + 1)!)] ((∆t/2)D_t)^{2l−m+1} (∆x)^{m−1} D_x^m ũ.
Thus, the leading truncation-error terms are O(∆t², ∆t∆x, ∆x²). The loss of accuracy (the mixed ∆t∆x term) is due to the diagonal averaging across time levels.
Note:
1 Method is unconditionally stable.
Consider the unsteady, 2-D diffusion equation approximated using the first-order explicit method:

(φ^{n+1}_{i,j} − φ^n_{i,j})/∆t = α[(φ^n_{i+1,j} − 2φ^n_{i,j} + φ^n_{i−1,j})/(∆x)² + (φ^n_{i,j+1} − 2φ^n_{i,j} + φ^n_{i,j−1})/(∆y)²] + O(∆t, ∆x², ∆y²).

Therefore, solving for the only unknown gives the explicit expression

φ^{n+1}_{i,j} = (1 − 2s_x − 2s_y)φ^n_{i,j} + s_x(φ^n_{i+1,j} + φ^n_{i−1,j}) + s_y(φ^n_{i,j+1} + φ^n_{i,j−1}),  (6.31)

where s_x = α∆t/(∆x)² and s_y = α∆t/(∆y)². For stability with s_x = s_y = s, we require

s ≤ 1/4,

which is even more restrictive than for the 1-D diffusion equation, where s ≤ 1/2 for stability.
For the corresponding first-order implicit method (spatial derivatives evaluated at the (n + 1)st level):
Notes:
1 Unconditionally stable for all s_x and s_y.
2 Crank-Nicolson could be used to obtain second-order accuracy in time. It
produces a similar implicit equation, but with more terms on the
right-hand-side, i.e. evaluated at the previous time step.
3 Produces a banded matrix (with five unknowns) that is difficult to solve
efficiently.
4 Alternatively, we could split each time step into two half steps, called ADI
with time splitting, resulting in two sets of tridiagonal problems per time step:
Step 1: Solve implicitly for terms associated with one coordinate direction.
Step 2: Solve implicitly for terms associated with other coordinate direction.
Step 1 is

(φ^{n+1/2}_{i,j} − φ^n_{i,j})/(∆t/2) = α[(φ^{n+1/2}_{i+1,j} − 2φ^{n+1/2}_{i,j} + φ^{n+1/2}_{i−1,j})/(∆x)² + (φ^n_{i,j+1} − 2φ^n_{i,j} + φ^n_{i,j−1})/(∆y)²].  (6.33)

Therefore,

½s_x φ^{n+1/2}_{i+1,j} − (1 + s_x)φ^{n+1/2}_{i,j} + ½s_x φ^{n+1/2}_{i−1,j} = −½s_y φ^n_{i,j+1} − (1 − s_y)φ^n_{i,j} − ½s_y φ^n_{i,j−1}.  (6.34)

The tridiagonal problems (6.34) are solved for φ^{n+1/2}_{i,j}, i = 1, . . . , I + 1, j = 1, . . . , J + 1, at the intermediate time level.
Therefore,

½s_y φ^{n+1}_{i,j+1} − (1 + s_y)φ^{n+1}_{i,j} + ½s_y φ^{n+1}_{i,j−1} = −½s_x φ^{n+1/2}_{i+1,j} − (1 − s_x)φ^{n+1/2}_{i,j} − ½s_x φ^{n+1/2}_{i−1,j}.  (6.36)

The tridiagonal problems (6.36) are solved for φ^{n+1}_{i,j}, i = 1, . . . , I + 1, j = 1, . . . , J + 1, at the current time level.
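One full time step of (6.34) and (6.36) can be sketched as follows. Homogeneous Dirichlet boundaries and equal s_x = s_y are assumed for brevity in this sketch.

```python
import math

def thomas(sub, diag, sup, rhs):
    # Thomas algorithm for the tridiagonal systems of each half step.
    n = len(rhs)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = sup[0] / diag[0], rhs[0] / diag[0]
    for k in range(1, n):
        m = diag[k] - sub[k] * cp[k - 1]
        cp[k] = sup[k] / m
        dp[k] = (rhs[k] - sub[k] * dp[k - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for k in range(n - 2, -1, -1):
        x[k] = dp[k] - cp[k] * x[k + 1]
    return x

def adi_step(phi, sx, sy):
    # One time step of ADI with time splitting: (6.34) implicit in x, then
    # (6.36) implicit in y; homogeneous Dirichlet boundaries are assumed.
    n = len(phi)
    half = [row[:] for row in phi]
    for j in range(1, n - 1):            # step 1 (6.34)
        rhs = [-0.5 * sy * phi[i][j + 1] - (1 - sy) * phi[i][j]
               - 0.5 * sy * phi[i][j - 1] for i in range(1, n - 1)]
        sol = thomas([0.5 * sx] * (n - 2), [-(1 + sx)] * (n - 2),
                     [0.5 * sx] * (n - 2), rhs)
        for i in range(1, n - 1):
            half[i][j] = sol[i - 1]
    new = [row[:] for row in half]
    for i in range(1, n - 1):            # step 2 (6.36)
        rhs = [-0.5 * sx * half[i + 1][j] - (1 - sx) * half[i][j]
               - 0.5 * sx * half[i - 1][j] for j in range(1, n - 1)]
        sol = thomas([0.5 * sy] * (n - 2), [-(1 + sy)] * (n - 2),
                     [0.5 * sy] * (n - 2), rhs)
        for j in range(1, n - 1):
            new[i][j] = sol[j - 1]
    return new

I = 16
phi0 = [[math.sin(math.pi * i / I) * math.sin(math.pi * j / I)
         for j in range(I + 1)] for i in range(I + 1)]
phi1 = adi_step(phi0, 3.0, 3.0)   # s = 3 is far beyond the explicit limit
amp = max(abs(phi1[i][j]) for i in range(I + 1) for j in range(I + 1))
```

For the product sine mode the two half steps give the amplification factor [(1 − a)/(1 + a)]² with a = s(1 − cos(π/I)), which is below one for any s, illustrating the unconditional stability noted below.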
Notes:
1 Method is O(∆t2 , ∆x2 , ∆y 2 ).
2 Requires boundary conditions at the intermediate time level n + 1/2 for
equation (6.34).
For example, if the boundary condition at x = 0 is Dirichlet, φ^n_{1,j} = a^n_j, then eliminating the x-derivative terms between the two half steps gives

φ^{n+1/2}_{i,j} = ½(φ^n_{i,j} + φ^{n+1}_{i,j}) + ¼s_y[(φ^n_{i,j+1} − 2φ^n_{i,j} + φ^n_{i,j−1}) − (φ^{n+1}_{i,j+1} − 2φ^{n+1}_{i,j} + φ^{n+1}_{i,j−1})].

Applying this equation at the boundary x = 0 leads to

φ^{n+1/2}_{1,j} = ½(a^n_j + a^{n+1}_j) + ¼s_y[(a^n_{j+1} − 2a^n_j + a^n_{j−1}) − (a^{n+1}_{j+1} − 2a^{n+1}_j + a^{n+1}_{j−1})].

This provides the boundary condition for φ_{1,j} at the intermediate (n + 1/2) time level. Note that the first term on the right-hand-side is the average of a at the n and n + 1 time levels, the second term is proportional to ∂²a^n/∂y², and the third term is proportional to ∂²a^{n+1}/∂y².
3 For stability, apply von Neumann analysis at each half step and take the
product of the resulting amplification factors, G1 and G2 , to obtain G for the
full time step.
⇒ Method is unconditionally stable for all sx and sy .
4 In 3-D, we require three fractional steps (∆t/3) for each time step, and the
method is only conditionally stable, where
s_x, s_y, s_z ≤ 3/2.
Consider the Crank-Nicolson approximation of the 2-D diffusion equation,

(φ_{i,j}^{n+1} − φ_{i,j}^n)/∆t = (α/2)(δ_x²φ_{i,j}^{n+1} + δ_x²φ_{i,j}^n + δ_y²φ_{i,j}^{n+1} + δ_y²φ_{i,j}^n),

where δ² represents the second-order central difference operators (as in section 5.7).
Rewriting the difference equation with the unknowns on the left-hand-side and the
knowns on the right leads to

[1 − (1/2)α∆t(δ_x² + δ_y²)]φ_{i,j}^{n+1} = [1 + (1/2)α∆t(δ_x² + δ_y²)]φ_{i,j}^n.   (6.38)
In the factored form (6.39), the first factor only involves the difference operator in
the x-direction, and the second factor only involves the difference operator in the
y-direction. The factored operator produces an extra term as compared to the
unfactored operator,

(1/4)α²(∆t)² δ_x² δ_y²,

which is O(∆t²). Therefore, the factorization (6.39) is consistent with the
second-order accuracy in time of the Crank-Nicolson approximation.
Notes:
1 Similar to the ADI method with time splitting, but have an intermediate
n+1/2
variable φ̂i,j rather than half time step φi,j .
Factored ADI is somewhat faster; it only requires one evaluation of the spatial
derivatives on the right-hand-side per time step (for equation (6.41)) rather
than two for the ADI method (see equations (6.34) and (6.36)).
2 Method is O(∆t2 , ∆x2 , ∆y 2 ) accurate and is unconditionally stable (even for
3-D implementation of unsteady diffusion equation).
3 Requires boundary conditions for the intermediate variable φ̂i,j to solve
(6.41). These are obtained from equation (6.40) applied at the boundaries
(see Fletcher, section 8.4.1).
4 The order of solution can be reversed, i.e. we could define

φ̂_{i,j} = [1 − (1/2)α∆t δ_x²] φ_{i,j}^{n+1}

instead of (6.40).
The Navier-Stokes equations in dimensional (starred) variables are

ρ(∂V*/∂t* + V*·∇V*) = −∇p* + µ∇²V*,

which are nondimensionalized using

(x, y, z) = (x*, y*, z*)/L,  t = t*/(L/U),  V = V*/U,  p = p*/(ρU²).

In 2-D, the momentum equations are then

∂u/∂t + u ∂u/∂x + v ∂u/∂y = −∂p/∂x + (1/Re)(∂²u/∂x² + ∂²u/∂y²),   (7.1)

∂v/∂t + u ∂v/∂x + v ∂v/∂y = −∂p/∂y + (1/Re)(∂²v/∂x² + ∂²v/∂y²).   (7.2)
Continuity:

∂u/∂x + ∂v/∂y = 0.   (7.3)
Thus, we have three coupled equations for three dependent variables
u(x, y, t), v(x, y, t) and p(x, y, t), which we refer to as primitive variables.
Therefore, the system is closed mathematically:
Given v(x, y, t) and p(x, y, t), we can determine u(x, y, t) from equation
(7.1).
Given u(x, y, t) and p(x, y, t), we can determine v(x, y, t) from equation
(7.2).
But how do we determine p(x, y, t) given that p does not appear in equation
(7.3)?
⇒ Need equation for p(x, y) in terms of u(x, y) and v(x, y) at time t.
To obtain such an equation, take the divergence of the momentum equation in
vector form, i.e. ∇ · (N S), which in 2-D is equivalent to taking ∂/∂x of equation
(7.1), ∂/∂y of equation (7.2) and adding.
Doing so gives

∂²p/∂x² + ∂²p/∂y² = −[∂²u/∂t∂x + (∂u/∂x)² + u ∂²u/∂x² + (∂v/∂x)(∂u/∂y) + v ∂²u/∂x∂y
    + ∂²v/∂t∂y + (∂u/∂y)(∂v/∂x) + u ∂²v/∂x∂y + (∂v/∂y)² + v ∂²v/∂y²]
    + (1/Re)[∂³u/∂x³ + ∂³u/∂x∂y² + ∂³v/∂x²∂y + ∂³v/∂y³].
Or

∂²p/∂x² + ∂²p/∂y² = −[∂/∂t(∂u/∂x + ∂v/∂y) + u ∂/∂x(∂u/∂x + ∂v/∂y) + v ∂/∂y(∂u/∂x + ∂v/∂y)
    + (∂u/∂x)² + (∂v/∂y)² + 2(∂v/∂x)(∂u/∂y)]
    + (1/Re)[∂²/∂x²(∂u/∂x + ∂v/∂y) + ∂²/∂y²(∂u/∂x + ∂v/∂y)].

By the continuity equation (7.3), ∂u/∂x + ∂v/∂y = 0, so every term containing the
divergence vanishes, and

(∂u/∂x)² + (∂v/∂y)² = (∂u/∂x + ∂v/∂y)² − 2(∂u/∂x)(∂v/∂y) = −2(∂u/∂x)(∂v/∂y).
Substituting gives

∂²p/∂x² + ∂²p/∂y² = 2[(∂u/∂x)(∂v/∂y) − (∂v/∂x)(∂u/∂y)],   (7.4)
which is a Poisson equation for pressure with u(x, y) and v(x, y) known from
solutions of equations (7.1) and (7.2), respectively.
Notes:
1 The unsteady momentum equations (7.1) and (7.2) are parabolic in time.
⇒ May be solved using Crank-Nicolson, ADI, Factored-ADI, etc....
2 The pressure equation (7.4) and steady forms of (7.1) and (7.2) are elliptic.
⇒ Can be solved using cyclic reduction, Gauss-Seidel, ADI, multigrid, etc....
Velocity Boundary Conditions (u_s and u_n are velocity components tangential and
normal to the surface, respectively):
Surface: u_s = u_n = 0 (no slip and impermeability)
Inflow: u_s and u_n specified
Outflow: ∂u_s/∂n = ∂u_n/∂n = 0 (fully-developed flow)
Symmetry: ∂u_s/∂n = 0, u_n = 0 (no flow through symmetry plane)
Note that the domain must be sufficiently long for the fully-developed outflow
boundary condition to be valid.
Pressure Boundary Conditions at a Surface:
From the momentum equations (7.1) and (7.2) with u = v = 0:

n = x ⇒ ∂p/∂x = (1/Re) ∂²u/∂x²

n = y ⇒ ∂p/∂y = (1/Re) ∂²v/∂y²
Other boundary conditions for pressure obtained similarly.
Observe that we have Neumann boundary conditions on pressure at solid surfaces.
Outline
u = ∂ψ/∂y,  v = −∂ψ/∂x,   (7.5)

such that the continuity equation (7.3) is identically satisfied. Lines of constant ψ
are called streamlines and are everywhere tangent to the local velocity vectors.
The vorticity ω(x, y, t) in 2-D is defined by

ω = ∂v/∂x − ∂u/∂y,   (7.6)

and measures the local rate of rotation of fluid particles, with the sign
corresponding to the right-hand-rule. Note that in general 3-D flows vorticity is a
vector

ω = ∇ × V.
To eliminate pressure, take ∂/∂x of equation (7.2) minus ∂/∂y of equation (7.1).
Doing so gives

∂/∂t(∂v/∂x − ∂u/∂y) + u ∂/∂x(∂v/∂x − ∂u/∂y) + v ∂/∂y(∂v/∂x − ∂u/∂y)
    + (∂v/∂x)(∂u/∂x + ∂v/∂y) − (∂u/∂y)(∂u/∂x + ∂v/∂y)
    = (1/Re)[∂²/∂x²(∂v/∂x − ∂u/∂y) + ∂²/∂y²(∂v/∂x − ∂u/∂y)].

But from the continuity equation (7.3), ∂u/∂x + ∂v/∂y = 0, and from the definition
of vorticity (7.6), ∂v/∂x − ∂u/∂y = ω; therefore, this reduces to the
vorticity-transport equation

∂ω/∂t + u ∂ω/∂x + v ∂ω/∂y = (1/Re)(∂²ω/∂x² + ∂²ω/∂y²).   (7.7)
The vorticity and streamfunction may be related by substituting (7.5) into (7.6)
to obtain

∂²ψ/∂x² + ∂²ψ/∂y² = −ω,   (7.8)
which is a Poisson equation for ψ(x, y, t) if ω(x, y, t) is known.
Notes:
1 Equations (7.7) and (7.8) are coupled equations for ω and ψ (with
u = ∂ψ/∂y, v = −∂ψ/∂x in (7.7)).
2 The vorticity-transport equation (7.7) is parabolic when unsteady and elliptic
when steady. The streamfunction equation (7.8) is elliptic.
3 The pressure terms have been eliminated, i.e. there is no need to calculate
the pressure in order to advance the solution in time.
→ Can compute p(x, y, t) from equation (7.4) if desired.
4 The vorticity-streamfunction formulation consists of two equations for the
two unknowns ω(x, y, t) and ψ(x, y, t) (cf. primitive variables formulation
with three equations for three unknowns in 2-D).
5 Unlike the primitive variables formulation, the ω–ψ formulation does not
easily extend to 3-D:
Three components of vorticity ⇒ Three vorticity equations.
Stretching and tilting terms in 3-D vorticity equations.
Cannot define streamfunction in 3-D ⇒ Vorticity-velocity potential
formulation.
6 We do not have straightforward boundary conditions for vorticity (see next
section).
⇒ Because of notes (3) and (4) (notwithstanding (6)), this is the preferred
formulation for 2-D, incompressible flows.
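To make the coupling in note 1 concrete, here is a hypothetical sketch (not code from the notes) of the two building blocks of a vorticity-streamfunction solver: a point Gauss-Seidel iteration for the Poisson equation (7.8), and an explicit FTCS-type step of the vorticity-transport equation (7.7) with u = ∂ψ/∂y and v = −∂ψ/∂x from central differences. Vorticity boundary treatment (see the next section) is omitted:

```python
import numpy as np

def poisson_gs(psi, omega, h, iters=2000):
    """Gauss-Seidel sweeps for psi_xx + psi_yy = -omega, equation (7.8),
    with Dirichlet values of psi held fixed on the boundary."""
    for _ in range(iters):
        for i in range(1, psi.shape[0] - 1):
            for j in range(1, psi.shape[1] - 1):
                psi[i, j] = 0.25 * (psi[i + 1, j] + psi[i - 1, j]
                                    + psi[i, j + 1] + psi[i, j - 1]
                                    + h * h * omega[i, j])
    return psi

def vorticity_step(omega, psi, h, dt, Re):
    """One explicit step of (7.7); u = psi_y, v = -psi_x by central differences."""
    u = (psi[1:-1, 2:] - psi[1:-1, :-2]) / (2 * h)
    v = -(psi[2:, 1:-1] - psi[:-2, 1:-1]) / (2 * h)
    conv = (u * (omega[2:, 1:-1] - omega[:-2, 1:-1]) / (2 * h)
            + v * (omega[1:-1, 2:] - omega[1:-1, :-2]) / (2 * h))
    lap = (omega[2:, 1:-1] - 2 * omega[1:-1, 1:-1] + omega[:-2, 1:-1]
           + omega[1:-1, 2:] - 2 * omega[1:-1, 1:-1] + omega[1:-1, :-2]) / h**2
    new = omega.copy()
    new[1:-1, 1:-1] += dt * (-conv + lap / Re)
    return new
```

A full solver alternates the two: advance ω one time step, then re-solve (7.8) for ψ, then recompute u and v.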
On CD:

v = 0 ⇒ ∂ψ/∂x = 0 ⇒ ψ = 0,

u = 1 ⇒ ∂ψ/∂y = 1.

∴ ψ and ∂ψ/∂n specified (n = y).

At solid boundaries, therefore, the streamfunction and its normal derivative are
specified. Note that we have two boundary conditions on the streamfunction,
where only one is needed.
Thom’s Method
Consider, for example, the lower boundary AB (y = 0). Throughout the domain
we have

∂²ψ/∂x² + ∂²ψ/∂y² = −ω.

However, along AB ψ = 0; therefore, ∂²ψ/∂x² = 0, and

ω_w = −∂²ψ/∂y² |_{y=0}.   (7.9)

We also have

u = ∂ψ/∂y = 0 on y = 0.

For generality, consider a moving wall with

∂ψ/∂y |_{y=0} = u_w(x) = g(x).   (7.10)

Approximating (7.10) with a central difference about the wall (j = 1) gives

(ψ_{i,2} − ψ_{i,0})/(2∆y) = g_i + O(∆y²);

therefore,

ψ_{i,0} = ψ_{i,2} − 2∆y g_i + O(∆y³).   (7.11)

Substituting into a central-difference approximation of (7.9) then yields

ω_{i,1} = −[2/(∆y)²][ψ_{i,2} − ψ_{i,1} − ∆y g_i] + O(∆y).   (7.13)
We then use the most recent iterate for streamfunction ψ to obtain a Dirichlet
boundary condition for vorticity ω.
Note: The truncation error for Thom’s method is only O(∆y). However, it
exhibits second-order convergence (Huang & Wetton 1996).
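In code, Thom's condition (7.13) is a one-liner. This minimal sketch uses a hypothetical array layout in which column j = 0 holds the wall values ψ_{i,1} of the notes and column j = 1 the first interior row ψ_{i,2}:

```python
import numpy as np

def thom_wall_vorticity(psi, dy, g):
    """Dirichlet wall vorticity from Thom's condition (7.13).
    psi[:, 0] is the wall row, psi[:, 1] the first interior row,
    and g the tangential wall velocity g(x) (zero for a fixed wall)."""
    return -2.0 / dy**2 * (psi[:, 1] - psi[:, 0] - dy * g)
```

For the quadratic ψ = y²/2 next to a fixed wall (g = 0), the exact wall vorticity is ω_w = −ψ_yy = −1, which the formula reproduces exactly since the truncation error involves higher derivatives of ψ.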
Jensen’s Method
We would like a method that is O(∆y 2 ) overall; therefore, consider using an
O(∆y 3 ) approximation for (7.10)
We then use (7.14) in place of (7.13) to obtain a Dirichlet boundary condition for
vorticity.
Notes:
1 For discussion of boundary conditions on vorticity and streamfunction at
inlets and outlets, see Fletcher, Vol. II, pp. 380–381.
2 Treatment of vorticity (and pressure) boundary conditions is still an active
area of research and debate (see, for example, Rempfer 2003).
Until now, we have considered methods for solving single equations; but in fluid
dynamics we must solve systems of coupled equations, such as the Navier-Stokes
equations in the primitive-variables or vorticity-streamfunction formulations.
Two methods for treating coupled equations numerically:
1 Sequential Solution:
Solve, i.e. iterate on, each equation for its dominant variable, treating the
other variables as known, i.e. use most recent values.
Requires one pass through mesh for each equation at each iteration.
⇒ Most common and easiest to implement.
2 Simultaneous (or Coupled) Solution:
Combine coupled equations into a single system of algebraic equations.
⇒ If we have n dependent variables (e.g. 2-D primitive variables ⇒ n = 3 (u, v, p)),
this produces an n × n block tridiagonal system of equations that is solved for all
the dependent variables simultaneously.
See, for example, S. P. Vanka, J. Comput. Phys., Vol. 65, pp. 138–158 (1986).
Sequential Method
Consider for example the primitive-variables formulation.
Steady Problems:
Note:
May require underrelaxation for convergence due to nonlinearity.
Unsteady Problems:
Notes:
Outer loop for time marching.
Inner loop to obtain solution of coupled equations at current time step.
⇒ Trade-off: Generally, reducing ∆t reduces number of inner loop iterations. (If
∆t small enough, no iteration is necessary).
Thus far we have used what are called “uniform, collocated grids.”
Uniform ⇒ Grid spacings in each direction ∆x and ∆y are uniform.
Collocated ⇒ All dependent variables are approximated at the same point.
In the following two sections we consider alternatives to uniform and collocated
grids.
For example, let us consider how each of the terms in the x-momentum equation
(8.1) is approximated (using central differences) on a staggered grid:

u ∂u/∂x ≈ u_{i,j} (u_{i+1,j} − u_{i−1,j})/(2∆x) + O(∆x²),

v ∂u/∂y ≈ (1/4)(v_{i,j} + v_{i+1,j} + v_{i,j−1} + v_{i+1,j−1}) (u_{i,j+1} − u_{i,j−1})/(2∆y) + O(∆y²),

∂p/∂x ≈ (p_{i+1,j} − p_{i,j})/∆x + O(∆x²).

The terms in equations (8.2) and (8.3) are treated in a similar manner, with each
approximated at its respective location in the staggered grid.
In many flows, e.g. those involving boundary layers, the solution has local regions
of intense gradients.
Therefore, a fine grid is necessary to resolve the flow near boundaries, but the
same resolution is not necessary in the remainder of the domain. As a result, a
uniform grid would waste computational resources where they are not needed.
Alternatively, a non-uniform grid would allow us to refine the grid where it is
needed.
Let us obtain the finite-difference approximations using Taylor series as before, but
without assuming all ∆x's are equal. For example, consider the first-derivative
term ∂φ/∂x.
Applying Taylor series at x_{i−1} and x_{i+1} with ∆x_i ≠ ∆x_{i+1} and solving for ∂φ/∂x
leads to

∂φ/∂x = (φ_{i+1} − φ_{i−1})/(∆x_i + ∆x_{i+1})
    − [(∆x_{i+1}² − ∆x_i²)/(2(∆x_i + ∆x_{i+1}))] ∂²φ/∂x²|_i
    − [(∆x_{i+1}³ + ∆x_i³)/(6(∆x_i + ∆x_{i+1}))] ∂³φ/∂x³|_i + ··· .
If the grid is uniform, i.e. ∆xi = ∆xi+1 , then the second term vanishes, and the
approximation reduces to the usual O(∆x2 )-accurate central difference
approximation for the first derivative. However, for a non-uniform grid, the
truncation error is only O(∆x).
We could restore second-order accuracy by including an appropriate approximation
to ∂²φ/∂x²|_i from the second term in the difference formula. As one can imagine,
however, this gets very complicated, and it is difficult to ensure consistent accuracy
for all approximations.
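The first-order error term can be seen numerically. This small check (not from the notes) differentiates e^x with unequal neighbouring spacings h and 1.5h; as h is halved, the error ratios come out near 2, i.e. O(∆x) rather than O(∆x²):

```python
import numpy as np

def central_nonuniform(f, x_im1, x_i, x_ip1):
    """(phi_{i+1} - phi_{i-1}) / (dx_i + dx_{i+1}) on a non-uniform grid."""
    dxi = x_i - x_im1
    dxip1 = x_ip1 - x_i
    return (f(x_ip1) - f(x_im1)) / (dxi + dxip1)

errs = []
for h in (0.1, 0.05, 0.025):
    # made-up test: f = exp about x = 1, with spacings h (left) and 1.5h (right)
    approx = central_nonuniform(np.exp, 1.0 - h, 1.0, 1.0 + 1.5 * h)
    errs.append(abs(approx - np.e))   # exact derivative of exp at 1 is e
```

With equal spacings the same function would show error ratios near 4 (second order).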
where (x, y) are the variables in the physical domain, and (ξ, η) are the variables
in the computational domain such that ξ = ξ(x, y) and η = η(x, y). To transform
back to the physical domain, we regard x = x(ξ, η) and y = y(ξ, η).
3 We may want to enforce orthogonality of the grid (see Fletcher pp. 97-100).
y_b = f(x) = H₁ + [(H₂ − H₁)/L] x,

so that

∂η/∂y = 1/f(x) = 1/(H₁ + [(H₂ − H₁)/L] ξ).
See figure 2 for plots of these metrics, which show that they are smooth.
Let us transform the governing equations using the transformation laws (8.4) and
(8.5):

∂/∂x = (∂ξ/∂x) ∂/∂ξ + (∂η/∂x) ∂/∂η = ∂/∂ξ − η [f′(ξ)/f(ξ)] ∂/∂η,

∂/∂y = (∂ξ/∂y) ∂/∂ξ + (∂η/∂y) ∂/∂η = [1/f(ξ)] ∂/∂η.

For example, the convection term becomes

u ∂u/∂x = u(ξ, η) [∂u/∂ξ − η (f′(ξ)/f(ξ)) ∂u/∂η].
y = Re^{−1/2} Y,  0 ≤ Y ≤ ∞.
We want a transformation that maps the semi-infinite domain into a finite one
and clusters points near the surface. One possibility is
ξ = x,  η = (2/π) tan⁻¹(Y/a).   (8.9)

The transformation laws become

∂/∂x = ∂/∂ξ,

∂/∂Y = (∂ξ/∂Y) ∂/∂ξ + (∂η/∂Y) ∂/∂η = Γ(η) ∂/∂η,

where

Γ(η) = [1/(πa)] [1 + cos(πη)].
Then

∂²/∂Y² = Γ(η) ∂/∂η [Γ(η) ∂/∂η] = Γ²(η) ∂²/∂η² + Γ(η)Γ′(η) ∂/∂η.
Notes:
1 Algebraic methods move the complexity, i.e. complex boundaries and/or
non-uniform grids, to the equations themselves.
Physical domain ⇒ “simple” equations; complex geometry and grid.
Computational domain ⇒ simple geometry and grid; “complex” equations.
2 Computational overhead is typically relatively small for algebraic methods, i.e.
there are no additional equations to solve.
3 It is easy to cluster grid points in the desired regions of the domain; however,
it is necessary to know where to cluster the grid points a priori.
4 Must choose the algebraic transformation ahead of time, i.e. before solving
the problem (cf. variational grid generation).
5 It is difficult to handle complex geometries.
The velocity potential and streamfunction both satisfy the Laplace equation, with

z = x + iy,  ζ = ξ + iη.

This approach uses conformal mapping (see Fletcher, vol. II, pp. 89-96) and
is good for two-dimensional flows in certain types of geometries.
2) Solve a boundary-value problem to generate the grid, i.e. elliptic grid
generation.
In order to control grid clustering, known functions involving sources and sinks
may be added to the right-hand-side of equations (8.11)
where P and Q contain exponential functions. Note that equations (8.12) are for
ξ = ξ(x, y) and η = η(x, y).
We want to solve (8.12) in the computational domain (ξ, η) to obtain the grid
transformations x = x(ξ, η) and y = y(ξ, η). In addition, we must transform the
governing equation(s), e.g. Navier-Stokes, to the computational domain.
Therefore, we seek the transformation laws for (x, y) → (ξ, η).
For
ξ = ξ(x, y), η = η(x, y),
the total differentials are
∂ξ ∂ξ ∂η ∂η
dξ = dx + dy, dη = dx + dy,
∂x ∂y ∂x ∂y
or in matrix form

[dξ; dη] = [ξ_x  ξ_y; η_x  η_y] [dx; dy].   (8.13)

Similarly, for the inverse transformation from (ξ, η) to (x, y), we have that

[dx; dy] = [x_ξ  x_η; y_ξ  y_η] [dξ; dη].

We can solve the latter expression for [dξ; dη] by multiplying by the inverse:

[dξ; dη] = [x_ξ  x_η; y_ξ  y_η]⁻¹ [dx; dy] = (1/J) [y_η  −x_η; −y_ξ  x_ξ] [dx; dy],

where J = x_ξ y_η − x_η y_ξ is the Jacobian determinant of the transformation.
and

∂²/∂y² = (∂²ξ/∂y²) ∂/∂ξ + (∂²η/∂y²) ∂/∂η + (∂ξ/∂y)² ∂²/∂ξ²
    + (∂η/∂y)² ∂²/∂η² + 2(∂ξ/∂y)(∂η/∂y) ∂²/∂ξ∂η,   (8.15)

where

g₁₁ = (∂x/∂ξ)² + (∂y/∂ξ)²,

g₁₂ = (∂x/∂ξ)(∂x/∂η) + (∂y/∂ξ)(∂y/∂η),

g₂₂ = (∂x/∂η)² + (∂y/∂η)².

Note that the coefficients in the equations are defined such that

g = |g₁₁  g₁₂; g₂₁  g₂₂| = g₁₁g₂₂ − g₁₂² = J²,  (g₁₂ = g₂₁).
7) References:
For more elliptic grid generation options, see Knupp, P. and Steinberg, S.,
“Fundamentals of Grid Generation,” CRC Press (1994), who consider
smoothness, Winslow and TTM methods.
Thompson, J. F., Warsi, Z. U. A. and Mastin, C. W., “Numerical Grid
Generation - Foundation and Applications,” North Holland (1985).
8) The ultimate in grid generation is adaptive grid methods, in which the grid
“adapts” to local features of the solution as it is computed; see, for example,
variational grid generation.
Physical domain: a ≤ x ≤ b
Computational domain: 0 ≤ ξ ≤ 1
where xi = x(ξi ), xi+1 = x(ξi+1 ) and φi+1/2 = (φ(ξi ) + φ(ξi+1 ))/2. For a given
weight function φ(ξ), we want to minimize S subject to the end conditions
x1 = a, xI+1 = b.
Dividing the above expression by ∆ξ (= constant) gives

S/∆ξ = Σ_{i=1}^{I} (∆x_i/∆ξ)² [∆ξ/(2φ_{i+1/2})],

whose continuum limit has the integrand

F = F(ξ, x, x_ξ) = (1/2) x_ξ²/φ.
Euler’s equation is given by

∂F/∂x − d/dξ(∂F/∂x_ξ) = 0,

which here reduces to

−d/dξ(x_ξ/φ) = 0,   (8.18)

∴ x_ξξ − (φ_ξ/φ) x_ξ = 0,  (φ > 0).   (8.19)
Integrating (8.18) once gives

dx/dξ = C φ(ξ).

Writing this expression in discrete form yields

∆x_i/∆ξ = C φ((ξ_{i+1} + ξ_i)/2).
Therefore, we see that this requires that ∆xi be proportional to φi+1/2 , where the
proportionality constant is C∆ξ.
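The statement ∆x_i ∝ φ_{i+1/2} suggests a simple way to construct a grid from a given weight: accumulate the weight and rescale to the interval endpoints. The following sketch (function name and the sample weight function are made up) does exactly that:

```python
import numpy as np

def equidistributed_grid(phi, a, b, n):
    """Discrete equidistribution: dx_i proportional to phi(xi_{i+1/2})."""
    xi = np.linspace(0.0, 1.0, n + 1)
    xi_mid = 0.5 * (xi[:-1] + xi[1:])
    dx = phi(xi_mid)                       # dx_i proportional to the weight
    x = np.concatenate(([0.0], np.cumsum(dx)))
    return a + (b - a) * x / x[-1]         # enforce x(0) = a, x(1) = b

# made-up weight: small near xi = 0, so the grid is finest there
x = equidistributed_grid(lambda s: 0.2 + s, 0.0, 1.0, 20)
```

Each spacing divided by its midpoint weight is the same constant C∆ξ, which is precisely the equidistribution property.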
Note that the weight function φ(ξ) has been expressed in the computational
domain, and it is not necessarily clear how it should be chosen. Conceptually, we
prefer to think in terms of physical weight functions, say w(x). In that case, the
grid spacing is determined by a physical variable or one of its derivatives giving
rise to feature-adaptive grid generation.
Taking the weight in the physical domain with φ = w²(x), the integrand becomes

F = F(ξ, x, x_ξ) = (1/2) x_ξ²/w²(x).
Therefore,

∂F/∂x = −(w_x/w³) x_ξ²,  ∂F/∂x_ξ = x_ξ/w²,

and Euler’s equation gives

−(w_x/w³) x_ξ² − d/dξ(x_ξ/w²) = 0,

(w_x/w³) x_ξ² + x_ξξ/w² − 2(w_x/w³) x_ξ² = 0,

x_ξξ − (w_x/w) x_ξ² = 0.

By the chain rule,

d/dx = (dξ/dx) d/dξ = (1/x_ξ) d/dξ  ⇒  w_x = w_ξ/x_ξ.

Substituting gives

x_ξξ − (w_ξ/w) x_ξ = 0,

which is of the same form as (8.19) (this is why we set φ = w²), except that now
w(x) is a physical weight function. The boundary conditions are

x(0) = a,  x(1) = b.
One-Dimensional Illustration:
Consider the one-dimensional, steady convection-diffusion equation

u_xx − c u_x = 0,  a ≤ x ≤ b,  u(a) = 0,  u(b) = 1,   (8.22)

which has the exact solution

u(x) = (e^{c(x−a)} − 1)/(e^{c(b−a)} − 1).
The governing equation (8.22) and the grid equation must be transformed into
the computational domain. In one dimension, the transformation laws are

d/dx = (dξ/dx) d/dξ = (1/x_ξ) d/dξ,

d²/dx² = (1/x_ξ) d/dξ[(1/x_ξ) d/dξ] = (1/x_ξ²) d²/dξ² − (x_ξξ/x_ξ³) d/dξ.

Equation (8.22) then becomes

(1/x_ξ²) u_ξξ − (x_ξξ/x_ξ³) u_ξ − c (1/x_ξ) u_ξ = 0,

or

u_ξξ − (x_ξξ/x_ξ + c x_ξ) u_ξ = 0,   (8.23)
with boundary conditions
u(0) = 0, u(1) = 1.
To solve (8.23) for the velocity u(ξ) requires the grid x(ξ). Here we will use the
grid equation (8.21) (corresponding to the physical weighted-length functional
(8.20))
x_ξξ − (w_ξ/w) x_ξ = 0,   (8.24)
with the boundary conditions x(0) = a, x(1) = b.
→ How to choose w(x)?
A common choice for a feature-adaptive weight function is based on the gradient
of the dependent variable, of the form

w(x) = 1/√(1 + ε² u_x²),  0 < w ≤ 1,   (8.25)

so that

u_x small ⇒ w ≈ 1,

u_x large ⇒ w ≈ 1/(ε|u_x|),  (|u_x| ↑ ⇒ w ↓ ⇒ ∆x ↓).
For use in the grid equation (8.24), we need the weight function (8.25) in terms of
the computational coordinate ξ:

w(x(ξ)) = 1/√(1 + (ε²/x_ξ²) u_ξ²).   (8.26)
Numerical Procedure:
The governing equation (8.23) and the grid equation (8.24) (with (8.27)),
expressed in the computational ξ-plane, are discretized with central differences.
For (8.23) this gives the tridiagonal system

A_i u_{i−1} + B_i u_i + C_i u_{i+1} = D_i,  i = 2, . . . , I,

where X_i denotes the discrete value of the coefficient x_ξξ/x_ξ + c x_ξ at node i, and

A_i = 1 + (∆ξ/2) X_i,
B_i = −2,
C_i = 1 − (∆ξ/2) X_i,
D_i = 0.
We obtain a similar tridiagonal problem for the grid equation (8.29).
where we now have two weight functions φ(ξ, η) > 0 and ψ(ξ, η) > 0. This
produces a grid for which the lengths of the coordinate lines are proportional to
the weight functions, i.e.

√g₁₁ = √(x_ξ² + y_ξ²) = K₁ φ(ξ, η),

√g₂₂ = √(x_η² + y_η²) = K₂ ψ(ξ, η).
Notes:
1) If φ = ψ = c, then the Euler equations are Laplace equations.

Euler equations:

(J x_η/φ)_ξ − (J x_ξ/φ)_η = 0,

(J y_η/φ)_ξ − (J y_ξ/φ)_η = 0.
Orthogonality Functional:
The grid is orthogonal if g₁₂ = x_ξ x_η + y_ξ y_η = 0; therefore, the orthogonality
functional is

I_O[x, y] = (1/2) ∫₀¹∫₀¹ g₁₂² dξ dη,

such that g₁₂ is minimized in a least squares sense (without a weight
function).
Euler equations:

(g₁₂ x_η)_ξ + (g₁₂ x_ξ)_η = 0,

(g₁₂ y_η)_ξ + (g₁₂ y_ξ)_η = 0.
Combination Functionals:
We can form combinations of the above functionals. For example, consider
the area-orthogonality functional

I_AO[x, y] = (1/2) ∫₀¹∫₀¹ (J² + g₁₂²)/φ dξ dη = (1/2) ∫₀¹∫₀¹ g₁₁g₂₂/φ dξ dη,

where the equality follows from g₁₁g₂₂ = J² + g₁₂².
Notes:
1) Just as in the one-dimensional case, it is generally preferable to define weight
functions in the physical domain, i.e. w(x, y), rather than in the
computational domain, i.e. φ(ξ, η).
2) The grid x(ξ, η), y(ξ, η) can be obtained by solving the Euler equations (most
common) or the variational form directly.
3) All of the two-dimensional functionals above have been written in the form

I[x, y] = (1/2) ∫₀¹∫₀¹ F(ξ, η, x, y, x_ξ, y_ξ, x_η, y_η) dξ dη.

These are called contravariant functionals (see Knupp and Steinberg section
8.5 and chapter 11).
∂u/∂t = α ∂²u/∂x²,  0 ≤ x ≤ ℓ,   (9.1)
with the boundary conditions
Exact Solution
Let us begin by using the method of separation of variables to obtain an exact
solution for equation (9.1). We separate the variables according to
u(x, t) = φ(x)ψ(t), so that

φ(x) dψ/dt = α ψ(t) d²φ/dx².

Dividing by αφψ moves everything depending upon t to the left-hand-side and
everything depending upon x to the right-hand-side:

[1/(αψ)] dψ/dt = (1/φ) d²φ/dx² = λ = −µ².
Because the x and t dependence can be separated in this way, both sides of the
equation must be equal to a constant, say λ. Positive and zero λ produce only the
trivial solution for the boundary conditions given, so we consider the case where
λ = −µ2 < 0.
Therefore, the partial differential equation (9.1) is converted into two ordinary
differential equations,

d²φ/dx² + µ²φ = 0,   (9.5)

dψ/dt + αµ²ψ = 0,   (9.6)

the first of which is a differential eigenproblem. The solution to equation (9.5) is

φ(x) = c₁ cos(µx) + c₂ sin(µx).
The boundary condition u(0, t) = 0 requires that φ(0) = 0, which requires that
c1 = 0. From the boundary condition u(`, t) = 0, we must have φ(`) = 0, which
requires that sin(µn `) = 0. Therefore,
µ_n = nπ/ℓ,  n = 1, 2, 3, . . . ,
or

Σ_{n=1}^{∞} c_n φ_n(x) = f(x).
Taking the inner product of φm (x) with both sides, the only non-vanishing term
(due to orthogonality of the eigenfunctions φn (x)) occurs when m = n, giving
Therefore,

c_n = ⟨f(x), φ_n(x)⟩ / ‖φ_n(x)‖² = (2/ℓ) ∫₀^ℓ f(x) sin(nπx/ℓ) dx,  n = 1, 2, 3, . . . ,   (9.11)
which are the Fourier sine coefficients of f (x). Thus, the exact solution to
(9.1)–(9.3) is given by equation (9.10) with (9.11).
Numerical Solution
Now let us consider solving the differential eigenproblem (9.5), with boundary
conditions (9.2) numerically. Using central differences, the differential equation
becomes
(φ_{i+1} − 2φ_i + φ_{i−1})/(∆x)² = λφ_i.
Thus, for i = 2, . . . , I (φ₁ = 0, φ_{I+1} = 0), in matrix form we have

    [ −2   1   0  ···   0   0 ] [ φ₂      ]            [ φ₂      ]
    [  1  −2   1  ···   0   0 ] [ φ₃      ]            [ φ₃      ]
    [  0   1  −2  ···   0   0 ] [ φ₄      ]            [ φ₄      ]
    [  :   :   :   ⋱    :   : ] [  :      ]  = (∆x)²λ  [  :      ]
    [  0   0   0  ···  −2   1 ] [ φ_{I−1} ]            [ φ_{I−1} ]
    [  0   0   0  ···   1  −2 ] [ φ_I     ]            [ φ_I     ]

or

Aφ = λ̄φ.   (9.12)
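A quick numerical check of (9.12) (a sketch with made-up parameters): the eigenvalues of A = tridiag(1, −2, 1) are known in closed form, and for the lowest modes λ̄/(∆x)² approaches the exact eigenvalues λ_n = −µ_n² = −(nπ/ℓ)² of (9.5):

```python
import numpy as np

l = 1.0
I = 64
dx = l / I
m = I - 1                      # interior points i = 2, ..., I
A = (np.diag(-2.0 * np.ones(m)) + np.diag(np.ones(m - 1), 1)
     + np.diag(np.ones(m - 1), -1))
lam_bar = np.sort(np.linalg.eigvalsh(A))[::-1]           # least negative first
n = np.arange(1, m + 1)
exact_Abar = -4.0 * np.sin(n * np.pi * dx / (2 * l))**2  # closed-form eigenvalues of A
```

The closed form shows that λ̄_n = −4 sin²(nπ∆x/(2ℓ)) ≈ −(∆x)²(nπ/ℓ)² when n∆x/ℓ is small, i.e. the discrete spectrum matches the continuous one only for the well-resolved modes.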
The standard method for numerically determining the eigenvalues and eigenvectors
of a matrix is based on QR decomposition, which entails performing a series of
similarity transformations. This is the approach used by the built-in Mathematica
and Matlab functions Eigenvalues[]/ Eigenvectors[] and eig(), respectively.
Similarity Transformation
Consider the eigenproblem
Ax = λx, (9.13)
where A is a real, square matrix. Suppose that Q is an orthogonal matrix such
that Q⁻¹ = Qᵀ. Let us consider the transformation

B = QᵀAQ.   (9.14)
Applying B to Qᵀx and using QQᵀ = I gives

BQᵀx = QᵀAQQᵀx = QᵀAx = Qᵀλx = λQᵀx;

therefore,

By = λy,   (9.15)

where

y = Qᵀx  (x = Qy).   (9.16)
Decompose

A₀ = Q₀R₀,   (9.17)

and form the reverse product

A₁ = R₀Q₀.   (9.18)

Because Q₀ᵀA₀ = Q₀ᵀQ₀R₀ = R₀, it follows that

A₁ = Q₀ᵀA₀Q₀,   (9.19)

i.e. A₁ is similar to A₀ and has the same eigenvalues. Repeating this process,

A_{k+1} = R_kQ_k,  k = 0, 1, 2, . . . ,   (9.20)

produces the sequence of similar matrices A₀, A₁, A₂, . . ., which converges to a
(block) triangular matrix with the eigenvalues on its diagonal.
Plane Rotations
Consider the n × n transformation matrix P comprised of the identity matrix with
only four elements changed, in the pth and qth rows and columns, according to
y = Px. (9.21)
y1 = cx1 + sx2 ,
y2 = −sx1 + cx2 ,
or

[y₁; y₂] = [cos φ  sin φ; −sin φ  cos φ] [x₁; x₂].
This transformation rotates the vector x through an angle φ to obtain y. Note
that y = Pᵀx rotates the vector x through an angle −φ.
Thus, in the general n-D case (9.21), P rotates the vector x through an angle φ
in the xp xq -plane.
Notes:
1 The transformation matrix P is orthogonal, i.e. Pᵀ = P⁻¹.
2 We can generalize to rotate a set of vectors, i.e. a matrix, by taking
Y = PX.
3 The angle φ may be chosen with one of several objectives in mind. For
example,
i) To zero all elements below (or to the right of) a specified element, e.g.
yT = [y1 y2 · · · yj 0 · · · 0].
Householder transformation (reflection) – efficient for dense matrices.
ii) To zero a single element, e.g. yp or yq (see equations (9.22) and (9.23)).
Givens transformation (rotation) – efficient for sparse, structured (e.g. banded)
matrices.
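A minimal sketch of objective (ii): choose c = cos φ and s = sin φ directly from the vector components so that the rotation zeroes element y_q while preserving the norm (the sample vector and indices below are made up):

```python
import numpy as np

def givens(x, p, q):
    """Plane-rotation matrix P chosen so that (P @ x)[q] = 0."""
    r = np.hypot(x[p], x[q])
    c, s = x[p] / r, x[q] / r
    P = np.eye(len(x))
    P[p, p] = c;  P[p, q] = s
    P[q, p] = -s; P[q, q] = c
    return P

x = np.array([3.0, 1.0, 4.0, 1.0])   # made-up vector
P = givens(x, 0, 2)
y = P @ x                            # y[2] becomes zero; the norm is unchanged
```

Applying a sequence of such rotations to annihilate the subdiagonal entries of a matrix is exactly how a Givens-based QR decomposition is assembled.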
Qᵀ = P_m ··· P₂P₁.
Notes:
1 The QR decomposition (9.24) and (9.25) is obtained from a series of plane
(Givens or Householder) rotations.
2 Givens transformations are most efficient for large, sparse, structured
matrices.
→ Configure to only zero elements that are not already zero.
3 There is a “fast Givens transformation” for which the P matrices are not
orthogonal, but the QR decompositions can be obtained two times faster
than in the standard Givens transformation illustrated in “QRmethod.nb.”
4 Convergence of the iterative QR method may be accelerated using shifting
(see, for example, Numerical Recipes, section 11.3).
5 The operation count for the QR method per iteration is as follows:
Dense matrix → O(n3 ) ⇒ Very expensive.
Hessenberg matrix → O(n2 )
Tridiagonal matrix → O(n).
Thus, the most efficient procedure is as follows:
i) Transform A to a similar tridiagonal or Hessenberg form if A is symmetric or
non-symmetric, respectively.
→ This is done using a series of similarity transformations based on Householder
rotations for dense matrices or Givens rotations for sparse matrices.
ii) Use iterative QR method to obtain eigenvalues of tridiagonal or Hessenberg
matrix.
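The basic (unshifted) iteration (9.17)–(9.20) is short to write down. This sketch builds a symmetric test matrix with known, well-separated eigenvalues (all values are made up) and watches A_k approach diagonal form; a practical implementation would add the Hessenberg reduction and shifting described above:

```python
import numpy as np

rng = np.random.default_rng(1)
Q0, _ = np.linalg.qr(rng.standard_normal((5, 5)))
A = Q0 @ np.diag([1.0, 2.0, 3.0, 4.0, 5.0]) @ Q0.T   # known spectrum {1,...,5}
Ak = A.copy()
for _ in range(200):
    Qk, Rk = np.linalg.qr(Ak)   # A_k = Q_k R_k, as in (9.17)
    Ak = Rk @ Qk                # A_{k+1} = R_k Q_k, as in (9.20)
diag = np.sort(np.diag(Ak))     # approaches the eigenvalues of A
```

Convergence of the off-diagonal entries is governed by ratios of adjacent eigenvalue magnitudes, which is why shifting is used in practice to accelerate it.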
Arnoldi Method
The Arnoldi method has been developed to treat situations in which we only need
a small number of eigenvalues of a large sparse matrix:
1 The iterative QR method described in the previous section is the general
approach used to obtain the full spectrum of eigenvalues of a dense matrix.
2 As we saw in the 1-D unsteady diffusion example, and as we will see when we
evaluate hydrodynamic stability, we often seek the eigenvalues of large sparse
matrices.
3 In addition, we often do not require the full spectrum of eigenvalues in
stability problems as we only seek the “least stable mode.”
⇒ We would like an efficient algorithm that determines a subset of the full
spectrum of eigenvalues (and possibly eigenvectors) of a sparse matrix.
Suppose we seek the largest k eigenvalues (by magnitude) of the large sparse
n × n matrix A, where k ≪ n. Given an arbitrary n-D vector q₀, we define the
Krylov subspace by

K_k(A, q₀) = span{q₀, Aq₀, A²q₀, . . . , A^{k−1}q₀}.
At each step i = 2, . . . , k:
→ An n × i orthonormal matrix Q is produced that forms an orthonormal basis
for the Krylov subspace Ki (A, q0 ).
→ Using the projection matrix Q, we transform A to produce an i × i
Hessenberg matrix H (or tridiagonal for symmetric A), which is an
orthogonal projection of A onto the Krylov subspace Ki .
→ The eigenvalues of H, sometimes called the Ritz eigenvalues, approximate
the largest i eigenvalues of A.
The approximations of the eigenvalues improve as each step is incorporated, and
we obtain the approximation of one additional eigenvalue.
Notes:
1 Because k ≪ n, we only require the determination of eigenvalues of
Hessenberg matrices that are no larger than k × k as opposed to the original
n × n matrix A.
2 Although the outcome of each step depends upon the starting Arnoldi vector
q0 used, the procedure converges to the correct eigenvalues of matrix A.
3 The more sparse the matrix A is, the smaller k can be to obtain a good
approximation of the largest k eigenvalues of A.
4 When applied to symmetric matrices, the Arnoldi method reduces to the
Lanczos method.
5 A shift and invert approach can be incorporated to determine the k
eigenvalues close to a specified part of the spectrum rather than that with
the largest magnitude.
For example, it can be designed to determine the k eigenvalues with the
largest real or imaginary part.
6 When seeking a set of eigenvalues in a particular portion of the full spectrum,
it is desirable that the starting Arnoldi vector q0 be in (or ‘nearly’ in) the
subspace spanned by the eigenvectors corresponding to the sought after
eigenvalues.
As the Arnoldi method progresses, we get better approximations of the desired
eigenvectors that can then be used to form a more desirable starting vector.
This is known as the implicitly restarted Arnoldi method and is based on the
implicitly-shifted QR decomposition method.
Restarting also reduces storage requirements by keeping k small.
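A bare-bones Arnoldi sketch (not a library implementation — ARPACK's implicitly restarted version is what is used in practice): k steps build an orthonormal basis Q of K_k(A, q₀) and the Hessenberg projection H, whose Ritz values approximate the dominant eigenvalues. The test matrix is a made-up diagonal example:

```python
import numpy as np

def arnoldi(A, q0, k):
    """k Arnoldi steps: columns of Q span K_k(A, q0); H is the Hessenberg projection."""
    n = len(q0)
    Q = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    Q[:, 0] = q0 / np.linalg.norm(q0)
    for i in range(k):
        v = A @ Q[:, i]
        for j in range(i + 1):            # modified Gram-Schmidt
            H[j, i] = Q[:, j] @ v
            v = v - H[j, i] * Q[:, j]
        H[i + 1, i] = np.linalg.norm(v)
        Q[:, i + 1] = v / H[i + 1, i]
    return Q[:, :k], H[:k, :k]

rng = np.random.default_rng(2)
n = 100
A = np.diag(np.arange(1.0, n + 1))        # made-up matrix, eigenvalues 1..100
Q, H = arnoldi(A, rng.standard_normal(n), 60)
ritz = np.sort(np.linalg.eigvals(H).real)[::-1]   # Ritz values
```

The largest Ritz values converge first, which is exactly the behavior exploited when only the extreme (or least stable) part of the spectrum is needed.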
Ax = λBx.
∂u/∂t + u ∂u/∂x + v ∂u/∂y = −∂p/∂x + (1/Re)(∂²u/∂x² + ∂²u/∂y²),   (9.27)

∂v/∂t + u ∂v/∂x + v ∂v/∂y = −∂p/∂y + (1/Re)(∂²v/∂x² + ∂²v/∂y²).   (9.28)
We denote the solution to (9.26)–(9.28), i.e. the base flow, by u0 (x, y, t),
v0 (x, y, t) and p0 (x, y, t), and seek the behavior of small perturbations to this
base flow.
⇒ If the amplitude of the small perturbations grows, the flow is
hydrodynamically unstable.
∂û/∂x + ∂v̂/∂y = 0,   (9.30)

∂û/∂t + u₀ ∂û/∂x + v₀ ∂û/∂y + û ∂u₀/∂x + v̂ ∂u₀/∂y = −∂p̂/∂x + (1/Re)(∂²û/∂x² + ∂²û/∂y²),   (9.31)

∂v̂/∂t + u₀ ∂v̂/∂x + v₀ ∂v̂/∂y + û ∂v₀/∂x + v̂ ∂v₀/∂y = −∂p̂/∂y + (1/Re)(∂²v̂/∂x² + ∂²v̂/∂y²).   (9.32)
Because ε is small, we neglect O(ε²) terms. Thus, the evolution of the
disturbances is governed by the linearized Navier-Stokes (LNS) equations
(9.30)–(9.32), where the base flow is known.
⇒ Linear Stability Theory
In principle, we could impose a perturbation û, v̂, p̂ at any time ti and track its
evolution in time and space to determine if the flow is stable to the imposed
perturbation. To fully characterize the stability of the base flow, however, would
require many calculations of the LNS equations with different perturbation
“shapes” imposed at different times.
We can formulate a more manageable stability problem by doing one or both of
the following:
1 Consider simplified base flows.
2 Impose “well-behaved” perturbations.
⇒ Sine wave with wavenumber α and phase velocity cr , i.e. normal mode.
If ci > 0, the amplitude of the perturbation grows unbounded as t → ∞ with
growth rate αci .
3 Because equations (9.30)–(9.32) are linear, each normal mode with
wavenumber α may be considered independently of one another (cf. von
Neumann numerical stability analysis).
⇒ For a given mode α, we are looking for the eigenvalue (wavespeed) with the
fastest growth rate.
For steady, parallel base flow (9.33) and (9.34), the Navier-Stokes equations
(9.26)–(9.28) reduce to (from equation (9.27))

d²u₀/dy² = Re ∂p₀/∂x,   (9.36)

where Re p₀′(x) is a constant for Poiseuille flow. The disturbance equations
(9.30)–(9.32) become
∂û/∂x + ∂v̂/∂y = 0,   (9.37)

∂û/∂t + u₀ ∂û/∂x + v̂ du₀/dy = −∂p̂/∂x + (1/Re)(∂²û/∂x² + ∂²û/∂y²),   (9.38)

∂v̂/∂t + u₀ ∂v̂/∂x = −∂p̂/∂y + (1/Re)(∂²v̂/∂x² + ∂²v̂/∂y²).   (9.39)
Notes:
1 For a given base flow u₀(y), Reynolds number Re, and wavenumber α, the
Orr-Sommerfeld equation is a differential eigenproblem of the form

L₁v₁ = cL₂v₁,

where the coefficients in the discretized operators are

P(y_j) = u₀(y_j) − 2αi/Re,

Q(y_j) = α³i/Re − α²u₀(y_j) − u₀′′(y_j).
v = v′ = 0  at  y = a, b.   (9.48)

From v′ = 0 at y = a (j = 1),

(v₂ − v₀)/(2∆y) = 0  ⇒  v₀ = v₂.

Substituting into the difference equation for j = 2 results in

(C + A₂)v₂ + B₂v₃ + Cv₄ = c(Āv₂ + B̄v₃).   (9.49)
Similarly, for j = J
2 The least stable mode, i.e. that with the fastest growth rate αci, is given by
α max(ci).
Methods of Solution:
1 Convert the generalized eigenproblem (9.45) to a regular eigenproblem by
multiplying both sides by the inverse of N. That is, find the eigenvalues from

    det(N⁻¹M − cI) = 0.
This requires inverting a large matrix, and although M and N are typically
sparse and banded, N−1 M is a full, dense matrix.
⇒ This would require use of a general approach, such as the iterative QR
method, to determine the eigenvalues of a large, dense matrix.
2 In order to avoid solving the large matrix problem that results from the BVP,
traditionally the shooting method for IVPs has been used.
→ This approach avoided the need to find the eigenvalues of large matrices in the
days when computers were not capable of such large calculations.
→ In addition, it allowed for use of well-developed algorithms for IVPs.
→ However, this is like “using a hammer to drive a screw.”
3 Solve the generalized eigenproblem

    Mv = cNv,

where M and N are large, sparse matrices, for the least stable mode. In
addition to the fact that the matrices are sparse, in stability contexts such as
this, we only need the least stable mode, not the entire spectrum of
eigenvalues. Recall that the least stable mode is that with the largest
imaginary part.
→ Currently, the state-of-the-art in such situations is the Arnoldi method
discussed in the last section.
Note that N must be positive definite for use in the Arnoldi method. That is,
it must have all positive eigenvalues. In our case, this requires us to take the
negatives of the matrices M and N as defined above.
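As a small illustration of method (1), the sketch below forms N⁻¹M and picks the eigenvalue with the largest imaginary part. The matrices here are tiny diagonal placeholders chosen so the answer is known by inspection, not an actual discretized Orr-Sommerfeld operator; for the large sparse matrices that arise in practice one would instead use an Arnoldi-based routine such as scipy.sparse.linalg.eigs.

```python
import numpy as np

def least_stable_mode(M, N):
    """Solve the generalized eigenproblem M v = c N v by forming
    inv(N) @ M (method 1 above) and return the eigenvalue c with the
    largest imaginary part, i.e. the least stable mode."""
    c, V = np.linalg.eig(np.linalg.inv(N) @ M)
    k = np.argmax(c.imag)   # fastest growth ~ largest imaginary part
    return c[k], V[:, k]

# Hypothetical diagonal matrices: the eigenvalues are simply M_ii / N_ii,
# so the least stable mode here is c = 1 + 2i.
M = np.diag([1.0 + 2.0j, 3.0 + 1.0j, 2.0 - 0.5j])
N = np.diag([1.0 + 0.0j, 1.0, 2.0])
c, v = least_stable_mode(M, N)
```

Note how forming inv(N) @ M destroys sparsity, which is exactly the drawback the notes point out for large problems.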
Such a flow is parallel; therefore, the base flow is a solution of equation (9.36).
The solution is a parabolic velocity profile.

[Figure: stability diagram; vertical axis spanning 0.80–1.05 versus Reynolds
number spanning 5000–10 000.]
Outline
In LES we filter the velocity and pressure fields so they only contain large-scale
components, i.e. a local average of the complete fields:

    ūi(xi) = ∫ G(x, x′) ui(x′) dx′,    (10.1)

    ui(xi, t) = ūi(xi, t) + u′i(xi, t),    p(xi, t) = p̄(xi, t) + p′(xi, t),

where ūi(xi, t) is the resolvable-scale velocity (computed), and u′i(xi, t) is the
subgrid-scale (SGS) velocity.
The filtered Navier-Stokes equations are (in tensor notation)

    ∂ūi/∂xi = 0,
                                                            (10.2)
    ∂ūi/∂t + ∂\overline{ui uj}/∂xj = −(1/ρ) ∂p̄/∂xi + ν∇²ūi.
The first term on the right-hand side is computed from the resolved scales, and
the remaining terms, τij = the SGS Reynolds stress, must be modeled, i.e. they
contain u′i and u′j.
Therefore, the subgrid scale (SGS) model must specify τij as a function of the
resolvable variables (ūi , ūj ), and it provides for energy transfer between the
resolvable scales and the SGS.
The earliest and most common SGS model is the Smagorinsky model, which is an
eddy viscosity model (effective viscosity due to small-scale turbulent motion).
Thus, the SGS Reynolds stress τij increases transport and dissipation.
    τij = (1/3) τkk δij + 2µt S̄ij,    (10.4)

where

    S̄ij = (1/2)(∂ūi/∂xj + ∂ūj/∂xi) = strain rate for resolved field,

    µt = CS² ρ ∆² |S̄|.

Here, CS is the model parameter, which must be specified, and |S̄| = (S̄ij S̄ij)¹ᐟ².
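A minimal sketch of the Smagorinsky eddy viscosity above for a 2-D resolved field on a uniform grid. The function name, the choice CS = 0.17 (a commonly quoted value), and the filter width ∆ = √(dx·dy) are my assumptions, not specified by the notes; |S̄| follows the definition (S̄ij S̄ij)¹ᐟ² given here.

```python
import numpy as np

def smagorinsky_mu_t(u, v, dx, dy, rho=1.0, Cs=0.17):
    """Smagorinsky SGS eddy viscosity mu_t = Cs^2 * rho * Delta^2 * |S|
    for a 2-D resolved field (u, v) on a uniform grid; Cs = 0.17 and
    Delta = sqrt(dx*dy) are illustrative choices."""
    dudx = np.gradient(u, dx, axis=0)
    dudy = np.gradient(u, dy, axis=1)
    dvdx = np.gradient(v, dx, axis=0)
    dvdy = np.gradient(v, dy, axis=1)
    # Resolved strain-rate tensor components S_ij
    S11, S22 = dudx, dvdy
    S12 = 0.5 * (dudy + dvdx)
    Smag = np.sqrt(S11**2 + S22**2 + 2.0 * S12**2)  # |S| = (S_ij S_ij)^(1/2)
    Delta = np.sqrt(dx * dy)
    return Cs**2 * rho * Delta**2 * Smag
```

For a uniform shear u = γy, v = 0 this gives the constant value Cs²ρ∆²·γ/√2, which is a quick sanity check on any implementation.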
Notes:
1) The Smagorinsky model only accounts for energy transfer from large to small
scales.
2) The Smagorinsky model does not work well near boundaries (eddy viscosity is
much smaller and flow is more anisotropic).
3) Points (1) and (2) can be improved upon with dynamic SGS models:
Allow model parameter CS to vary with space and time, i.e. it is computed
from the resolvable flow field.
Automatically adjusts SGS parameter for anisotropic flow and flow near walls.
Allows for backscatter (µt < 0), which accounts for energy transferred from
small scales to large scales.
Active area of ongoing research.
Outline
In RANS we compute only the mean flow and model all of the turbulence:

    u(xi, t) = ū(xi) + u′(xi, t),

where ū(xi) is the mean flow, and u′(xi, t) are the turbulent fluctuations (not the
same as u′ in LES).
To obtain the mean flow, we use Reynolds averaging:
Steady mean flow ⇒ time average the Navier-Stokes equations:

    ū(xi) = lim_{T→∞} (1/T) ∫₀ᵀ u(xi, t) dt,    (10.6)

    ∂ū/∂x + ∂v̄/∂y = 0,

    ρ(∂ū/∂t + ū ∂ū/∂x + v̄ ∂ū/∂y) = −∂p̄/∂x + ∂/∂x(µ ∂ū/∂x − ρ\overline{u′u′}) + ∂/∂y(µ ∂ū/∂y − ρ\overline{u′v′}),

    ρ(∂v̄/∂t + ū ∂v̄/∂x + v̄ ∂v̄/∂y) = −∂p̄/∂y + ∂/∂x(µ ∂v̄/∂x − ρ\overline{u′v′}) + ∂/∂y(µ ∂v̄/∂y − ρ\overline{v′v′}),
                                                            (10.7)

where ρ\overline{u′u′}, ρ\overline{u′v′}, ρ\overline{v′v′} are the Reynolds stresses. Equations (10.7) are three
equations for five unknowns (ū, v̄, p̄, u′, v′); therefore, we have a closure problem.
Closure is achieved by relating the Reynolds stresses to the mean flow quantities
through a turbulence model.
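The Reynolds decomposition and stresses above can be illustrated on a sampled velocity signal at a single point; this is a schematic example on a synthetic record (the function name and signal are mine), not part of any solver.

```python
import numpy as np

def reynolds_decompose(u, v):
    """Split sampled velocities u(t), v(t) into mean + fluctuation and
    return the means and the time-averaged product u'v' (the kinematic
    Reynolds shear stress, up to the factor -rho)."""
    ubar, vbar = u.mean(), v.mean()
    up, vp = u - ubar, v - vbar          # fluctuations u'(t), v'(t)
    return ubar, vbar, (up * vp).mean()

# Synthetic record: perfectly correlated fluctuations on a steady mean.
t = 2.0 * np.pi * np.arange(1000) / 1000.0
u = 1.0 + np.sin(t)
v = 2.0 + np.sin(t)
ubar, vbar, uv = reynolds_decompose(u, v)
```

Here the fluctuations are identical, so the averaged product u′v′ is mean(sin²) = 1/2, while the means recover 1 and 2; uncorrelated fluctuations would instead average toward zero.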
k–ε Turbulence Model:
The k–ε model is the most common turbulence model used in applications, but
there are many others.
Define:

    k = (1/2) \overline{u′i u′i} = turbulent kinetic energy,

    ε = ν \overline{(∂u′i/∂xj)(∂u′i/∂xj)} = rate of dissipation of turbulent energy.
Here, k and ε are determined from a solution of the following coupled equations,
which are derived from the Navier-Stokes equations:

    ρ Dk/Dt = ∂/∂xj[(µT/σk) ∂k/∂xj] + µT(∂ui/∂xj + ∂uj/∂xi) ∂ui/∂xj − ρε,    (10.8)

    ρ Dε/Dt = ∂/∂xj[(µT/σε) ∂ε/∂xj] + (C1 µT ε/k)(∂ui/∂xj + ∂uj/∂xi) ∂ui/∂xj − ρC2 ε²/k.    (10.9)

The terms on the left-hand side represent the transport of k and ε, and the first,
second and third terms on the right-hand side represent diffusion, production and
dissipation, respectively, of k and ε. The eddy viscosity is

    µT = Cµ ρ k²/ε.    (10.10)

The eddy viscosity µT is used to relate the Reynolds stresses to the mean flow
quantities through

    −ρ \overline{u′i u′j} = µT(∂ūi/∂xj + ∂ūj/∂xi) − (2/3) ρ k δij.    (10.11)
Notes:
1) Equations (10.8)–(10.10) involve five constants, Cµ, C1, C2, σk, σε, which
must be determined empirically.
2) Special treatment is necessary at solid boundaries; therefore, wall functions
that are usually based on the log-law are used for turbulent boundary layers.
3) RANS produces less detailed information about the flow than DNS and LES,
but requires far more modest computational resources. Thus, it is well suited
to engineering applications and is used by most commercial CFD codes.
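The closure step in equations (10.10)–(10.11) is simple enough to sketch directly: given k and ε, compute µT and then a Reynolds shear stress. The constants below are the commonly quoted standard values; the function names are illustrative.

```python
# Commonly quoted standard k-epsilon constants (determined empirically,
# as noted above).
C_MU, C1, C2, SIGMA_K, SIGMA_EPS = 0.09, 1.44, 1.92, 1.0, 1.3

def eddy_viscosity(rho, k, eps):
    """Equation (10.10): mu_T = C_mu * rho * k^2 / eps."""
    return C_MU * rho * k**2 / eps

def reynolds_shear_stress(mu_t, dudy, dvdx):
    """Off-diagonal (i != j) component of (10.11):
    -rho*u'v' = mu_T*(du/dy + dv/dx);
    the (2/3)*rho*k*delta_ij term vanishes for i != j."""
    return mu_t * (dudy + dvdx)
```

For example, ρ = 1, k = 2, ε = 0.5 gives µT = 0.09·4/0.5 = 0.72, which a mean shear then converts into a Reynolds stress via the Boussinesq relation.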
Outline
11 Parallel Computing
Introduction
Glossary:
• Supercomputer – the fastest computers available at a particular time.
• MIPS – Millions of Instructions Per Second; outdated measure of computer
  performance (architecture dependent).
• Floating point operation – an operation, e.g. addition, subtraction,
  multiplication or division, on one or more floating point numbers (non-integers).
• FLOPS – Floating Point Operations Per Second; current measure of
  computer performance for numerical applications.
      MFLOPS  MegaFLOPS  million (10⁶) FLOPS
      GFLOPS  GigaFLOPS  billion (10⁹) FLOPS
      TFLOPS  TeraFLOPS  trillion (10¹²) FLOPS
      PFLOPS  PetaFLOPS  quadrillion (10¹⁵) FLOPS
• Serial processing – a computer code is executed line-by-line in sequence on
  one processor.
• Scalar processing – a processor performs calculations on a single data
  element.
• Vector processing – a processor performs calculations on multiple data
  elements, i.e. a vector, simultaneously.
• Parallel processing – multiple processors (or cores) perform operations
  simultaneously on different data elements.
• Massively parallel – parallel computers involving thousands of processors.
• MPP – Massively Parallel Processing.
• SMP – Symmetric Multi-Processing; shared-memory parallelism.
• Shared memory parallelism – all of the processors share the same memory.
• Distributed memory parallelism – each processor accesses its own memory.
• MPI – Message Passing Interface; the most common library used for
  inter-processor communication on distributed-memory computers.
• Cluster – a parallel computer comprised of commodity hardware, e.g. a Beowulf
  cluster (commodity hardware, Linux OS, and open-source software).
• SIMD – Single Instruction, Multiple Data; all processors perform the same
  instruction on different data elements.
• MIMD – Multiple Instruction, Multiple Data; processors may perform different
  instructions on different data elements.
• Embarrassingly parallel – processors work in parallel with very little or no
  communication between them, e.g. image processing and SETI@home.
• Coupled parallel – processors work in parallel but require significant
  communication between them, e.g. CFD.
• HPC – High Performance Computing.
• Grid computing – parallel computing across a geographically distributed
  network (e.g. the internet); analogous to the electric grid.
• Multi-core CPUs – a single chip with multiple processors (cores).
Note that each architecture requires its own approach to programming, and not
all algorithms are amenable to each.
Milestones:
Year Supercomputer Peak Speed
1906 Babbage Analytical Engine 0.3 OPS
1946 ENIAC 50 kOPS
1964 CDC 6600 (Seymour Cray) 3 MFLOPS
1969 CDC 7600 36 MFLOPS
1976 Cray-1 (Seymour Cray) 250 MFLOPS
1981 CDC Cyber 205 400 MFLOPS
1983 Cray X-MP 941 MFLOPS
1985 Cray-2 3.9 GFLOPS
1985 Thinking Machines CM-2 (64k)
1989 Cray Y-MP
1993 Thinking Machines CM-5 65.5 GFLOPS
1993 Intel Paragon 143.4 GFLOPS
1996 Hitachi/Tsukuba CP-PACS (2k) 368.2 GFLOPS
1999 Intel ASCI Red (10k) 2.8 TFLOPS
2002 NEC Earth Simulator (5k) 35.9 TFLOPS
2005 IBM Blue Gene/L (131k) 280.6 TFLOPS
2008 IBM Roadrunner (130k) 1.105 PFLOPS
Milestones (cont’d):
1954 – Fortran (Formula Translation) developed
1970 – Unix developed at AT&T Bell Labs
Components of Computing:
1 Hardware – the computer (CPUs, memory, storage, network, etc.)
→ Assuming adequate memory, the computational time is primarily determined
by:
Sequential → CPU speed
Parallel → number of CPUs, CPU speed, inter-processor communication speed
and bandwidth, percent of code that is run in parallel (see Amdahl’s law).
2 Software – OS, compiler, libraries, program, etc.
3 Algorithms – the numerical method
→ Note that the best algorithms for serial computers often are not the best for
parallel computers (e.g. BLKTRI).
4 Architectures:
  SIMD (e.g. CM-2) vs. MIMD
  Shared memory (e.g. Cray, SGI, multicore CPUs, etc.)
  → Ingredients (3) and (4) were not available during the MPP (e.g. CM) heyday
  in the mid ’80s to mid ’90s. This led to the demise of the CM.
Amdahl’s Law:
Measure of speedup S(N) on N processors versus a single processor, i.e. serial code:

    S(N) = T(1)/T(N) = N / [N − (N − 1)F],

where T(N) is the time required using N processors, and F is the fraction of the
serial run time that the code spends in the parallel portion.
See the Mathematica notebook “AmdahlsLaw.nb”.
Notes:
1 If F = 1 (100% parallel) ⇒ S(N) = N (linear speedup).
  If F < 1 ⇒ S(N) < N (sub-linear speedup).
2 As N → ∞, the speedup approaches

    S(∞) = 1/Fs,

  where Fs = 1 − F is the fraction of time in the serial code spent doing the
  serial portion.
  e.g. F = 0.95 ⇒ Fs = 0.05 ⇒ S(∞) = 20.
3 In practice, Fs (and F) depends on the number of processors N and the size
  of the problem n, i.e. the number of grid points.
  Thus, for ideal scalability a parallel algorithm should be such that
  Fs(N, n) → 0 (F → 1) as n → ∞.
The parallel efficiency is

    E(N) = S(N)/N.
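Amdahl's law and the efficiency E(N) are easy to tabulate; a short sketch (the function names are mine):

```python
def amdahl_speedup(N, F):
    """Amdahl's law: S(N) = T(1)/T(N) = N / (N - (N-1)*F), where F is
    the fraction of the serial run time spent in the parallel portion."""
    return N / (N - (N - 1) * F)

def efficiency(N, F):
    """Parallel efficiency E(N) = S(N)/N."""
    return amdahl_speedup(N, F) / N

# For F = 0.95 the speedup saturates at S(inf) = 1/Fs = 1/0.05 = 20,
# however many processors are added.
for N in (1, 10, 100, 1000):
    print(N, amdahl_speedup(N, 0.95), efficiency(N, 0.95))
```

Note how the efficiency collapses as N grows for any F < 1, which is the quantitative content of the sub-linear speedup remark above.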
November 2008:
The entry level to the list moved up to the 12.64 Tflop/s mark on the
Linpack benchmark, compared to 9.0 Tflop/s six months ago.
The last system on the newest list would have been listed at position 267 in
the previous TOP500 just six months ago.
Total combined performance of all 500 systems has grown to 16.95 Pflop/s,
compared to 11.7 Pflop/s six months ago and 6.97 Pflop/s one year ago.
The entry point for the top 100 increased in six months from 18.8 Tflop/s to
27.37 Tflop/s (which would have been # 9 on the November 2005 list).
The entry level into the TOP50 is at 50.55 Tflop/s (which would have been
# 5 on the November 2005 list).
Of the top 50, 56 percent of systems are installed at research labs and 32
percent at universities. Cray’s XT is the most-used system family with 20
percent, followed by IBM’s BlueGene with 16 percent. The average
concurrency level is 30,490 cores per system, up from 24,400 six months ago.
Seven U.S. DOE systems dominate the TOP10.
Roadrunner (# 1) is based on the IBM QS22 blades that are built with
advanced versions of the processor in the Sony PlayStation 3.
The list now includes energy consumption of the supercomputers.