MMAE 517
© 2010 K. W. Cassel
Introduction to CFD
CFD Advertisements...
Computational:
+ Address more complex problems (physics and geometries).
+ Can consider hypothetical flows ⇒ test theoretical models.
+ Provides detailed solutions ⇒ good understanding of flow, e.g. does
separation occur?
+ Perform parametric studies.
+ Can easily try different configurations, e.g. geometry, boundary conditions,
etc... ⇒ important in design.
+ Computers becoming faster and cheaper ⇒ range of CFD expanding.
+ Increased potential using parallel processing.
+ More cost effective and faster than experimental prototyping.
− Requires accurate governing equations (don’t have for turbulence,
combustion, etc..., which require modeling).
− Boundary conditions sometimes difficult to implement, e.g. outlet.
− Difficult to do in certain parameter regimes, e.g. high Reynolds numbers, and
complex geometries.
Experimental:
+ Easier to get overall quantities for problem (e.g. lift and drag on an airfoil).
+ No “modeling” necessary.
− Often requires intrusive measurement probes.
− Limited measurement accuracy.
− Limited measurement resolution.
− Effects of support apparatus, end walls, etc... must be considered.
− Some quantities difficult to obtain, e.g. streamfunction, vorticity, etc....
− Experimental equipment often expensive and takes up space.
− Difficult and costly to test full-scale models.
Note: Computational approaches do not replace analytical or experimental
approaches, but complement them.
Numerical Methods: General Considerations and Approaches
Components and Properties of a Numerical Solution

Physical System (i.e. reality) → 1 Mathematical Model → 2 Discretization → 3 Matrix Solver → Numerical Solution
Steps:
1 Mathematical model – mass, momentum and energy conservation + models, idealizations, etc.
2 Discretization – approximation of the continuous differential equation(s) by a system of algebraic equations for the dependent variables at discrete locations in the independent variables (space and time), for example using finite differences.
Sources of errors in numerical solutions arise due to each step of the Numerical
Solution Procedure:
1 Modeling errors – difference between actual flow and exact solution of
mathematical model.
2 Discretization errors – difference between exact solution of governing
equations and exact solution of algebraic equations.
i) Method of discretization → inherent error of method, i.e. truncation error.
ii) Computational grid → can be refined.
3 Iterative convergence errors – difference between the (iterative) numerical solution and the exact solution of the algebraic equations.
This is where the “pragmatism” mentioned by Dawes is an important element of
CFD.
e.g. DNS of turbulence in other than elementary flows is currently impossible.
⇒ Compromises must often be made at each stage of the procedure.
Finite Difference
Basic approach:
Discretize the governing equations in differential form using
Taylor-series-based finite-difference approximations at each grid point.
Produces algebraic equations involving each grid point and surrounding
points.
Local approximation method.
Popular in fluid dynamics research.
Advantages:
Relatively straightforward to understand and implement (based on Taylor
series).
Utilizes familiar differential form of governing equations.
Very general ⇒ Apply to a wide variety of problems (including complex
physics, e.g. fluids plus heat transfer plus combustion).
Can extend to higher-order approximations.
Disadvantages:
More difficult to implement for complex geometries.
Finite Volume
Basic approach:
Apply conservation equations in integral form to a set of control volumes.
Produces algebraic equations for each control volume involving surrounding
control volumes.
Local approximation method.
Popular for commercial CFD codes (e.g. FLUENT).
Advantages:
Easier to treat complex geometries than finite-difference approach.
“Ensures” conservation of the necessary quantities (i.e. mass, momentum, energy, etc.), even if the solution is inaccurate.
Disadvantages:
More difficult to construct higher-order schemes.
Uses less familiar integral formulation of governing equations.
Finite Element
Basic approach:
Apply conservation equations in variational form with weighting function to
set of finite elements.
Produces set of linear or nonlinear algebraic equations.
Local approximation method.
Popular in commercial codes (particularly for solid mechanics and heat
transfer).
Advantages:
Easy to treat complex geometries.
Disadvantages:
Results in unstructured grids.
Solution methods are inefficient for the types of matrices resulting from
finite-element discretizations (cf. finite difference ⇒ sparse, highly structured
matrices).
Spectral Methods
Basic approach:
Solution of governing equations in differential form are approximated using
truncated (usually orthogonal) eigenfunction expansions.
Produces system of algebraic equations (steady) or system of ordinary
differential equations (unsteady) involving the coefficients in the
eigenfunction expansion.
Global approximation method.
Popular for direct numerical simulation (DNS) of turbulence.
Advantages:
Obtain highly accurate solutions when underlying solution is smooth.
Can achieve rapid convergence.
Disadvantages:
Less straightforward to implement than finite difference.
More difficult to treat complicated boundary conditions (e.g. Neumann).
Small changes in problem can cause large changes in algorithm.
Not well suited for solutions having large gradients.
Vortex Methods
Outline
3 Finite-Difference Methods
Extended Fin Example
Formal Basis for Finite Differences
Application to Extended Fin Example
Properties of Tridiagonal Matrices
Thomas Algorithm
Extended Fin Example – Convection Boundary Condition
The heat transfer within the extended fin is governed by the 1-D ordinary-differential equation (see, for example, Incropera & DeWitt):

d²T/dx² + (1/Ac)(dAc/dx)(dT/dx) − (1/Ac)(h/k)(dAs/dx)(T − T∞) = 0,   (3.1)

where Ac(x) is the cross-sectional area, As(x) is the surface area, h is the convection coefficient, and k is the thermal conductivity. Defining θ(x) = T(x) − T∞, this may be written

d²θ/dx² + f(x) dθ/dx + g(x)θ = 0,   (3.2)

where

f(x) = (1/Ac) dAc/dx,   g(x) = −(1/Ac)(h/k) dAs/dx.

The Dirichlet boundary conditions are

θ = θb = Tb − T∞ at x = 0,
θ = θL = TL − T∞ at x = L.   (3.3)
Equation (3.2) with boundary conditions (3.3) represents the mathematical model
(step 1 in the Numerical Solution Procedure).
Step 2 → Discretization:
Divide the interval 0 ≤ x ≤ L into I equal subintervals of length ∆x = L/I.
Here, fi = f(xi) and gi = g(xi) are known, and the solution θi = θ(xi) is to be determined for i = 2, . . . , I. The first derivative at xi may be approximated by

Forward difference:   dθ/dx|i ≈ (θi+1 − θi)/∆x
Backward difference:  dθ/dx|i ≈ (θi − θi−1)/∆x
Central difference:   dθ/dx|i ≈ (θi+1 − θi−1)/(2∆x)
Which approximation is better?
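As a quick numerical check (not part of the notes), the three approximations can be compared on a function with a known derivative; the helper names below are illustrative:

```python
import math

# Illustrative helpers; each approximates d(theta)/dx at x.
def forward(f, x, dx):
    return (f(x + dx) - f(x)) / dx

def backward(f, x, dx):
    return (f(x) - f(x - dx)) / dx

def central(f, x, dx):
    return (f(x + dx) - f(x - dx)) / (2 * dx)

# Compare on f(x) = sin(x) at x = 1, where the exact derivative is cos(1).
x, exact = 1.0, math.cos(1.0)
for dx in (0.1, 0.05):
    errs = [abs(d(math.sin, x, dx) - exact)
            for d in (forward, backward, central)]
    print(f"dx={dx}: fwd={errs[0]:.2e} bwd={errs[1]:.2e} ctr={errs[2]:.2e}")
# Halving dx roughly halves the one-sided errors (first order)
# but quarters the central error (second order).
```

For the same stencil width, central differencing is the more accurate choice, which is why it is preferred in the interior of the grid.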
Consider the Taylor series expansion of a function θ(x) in the vicinity of the point xi:

θ(x) = θ(xi) + (x − xi) dθ/dx|i + [(x − xi)²/2!] d²θ/dx²|i + [(x − xi)³/3!] d³θ/dx³|i + · · · + [(x − xi)ⁿ/n!] dⁿθ/dxⁿ|i + · · · .   (3.4)

Evaluating (3.4) at x = xi+1 = xi + ∆x and solving for the first derivative gives

dθ/dx|i = (θi+1 − θi)/∆x − (∆x/2) d²θ/dx²|i − · · · − (∆xⁿ⁻¹/n!) dⁿθ/dxⁿ|i − · · · .   (3.6)
Equations (3.6), (3.8) and (3.10) are exact expressions for the first derivative
(dθ/dx)i , i.e. if all of the terms are retained in the expansions.
Approximate finite difference expressions for the first derivative may then be
obtained by truncating the series after the first term:
dθ/dx|i ≈ (θi+1 − θi)/∆x + O(∆x)        → Forward difference
dθ/dx|i ≈ (θi − θi−1)/∆x + O(∆x)        → Backward difference
dθ/dx|i ≈ (θi+1 − θi−1)/(2∆x) + O(∆x²)  → Central difference
To obtain a second-order accurate one-sided approximation, expand θi+2 about xi:

θi+2 = θi + 2∆x dθ/dx|i + [(2∆x)²/2!] d²θ/dx²|i + [(2∆x)³/3!] d³θ/dx³|i + · · · .   (3.11)

Subtracting (3.11) from four times the expansion for θi+1 eliminates the second-derivative terms:

4θi+1 − θi+2 = 3θi + 2∆x dθ/dx|i − (2∆x³/3) d³θ/dx³|i + · · · .

Solving for the first derivative,

dθ/dx|i = (−3θi + 4θi+1 − θi+2)/(2∆x) + (∆x²/3) d³θ/dx³|i + · · · ,   (3.12)

which is second-order accurate and involves the point of interest and the next two points.
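The second-order accuracy of the one-sided formula (3.12) can be verified numerically (a quick sketch, not part of the notes):

```python
import math

def one_sided_2nd(f, x, dx):
    # (-3 f_i + 4 f_{i+1} - f_{i+2}) / (2 dx), equation (3.12)
    return (-3*f(x) + 4*f(x + dx) - f(x + 2*dx)) / (2 * dx)

exact = math.cos(1.0)                      # derivative of sin at x = 1
e1 = abs(one_sided_2nd(math.sin, 1.0, 0.1) - exact)
e2 = abs(one_sided_2nd(math.sin, 1.0, 0.05) - exact)
print(e1 / e2)  # ~4: halving dx quarters the error, i.e. O(dx^2)
```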
Applying central differences to equation (3.2) at each interior grid point gives the difference equation (3.14), aiθi−1 + biθi + ciθi+1 = di, with coefficients

ai = 1 − (∆x/2) fi,
bi = −2 + (∆x)² gi,
ci = 1 + (∆x/2) fi,
di = 0.
Note that because we have discretized the differential equation at each interior grid point, we obtain a set of (I − 1) algebraic equations for the (I − 1) unknown values of the temperature θi, i = 2, . . . , I.
The coefficient matrix for the difference equation (3.14) is tridiagonal:

[ b2 c2                     ] [ θ2   ]   [ d2 − a2θ1   ]
[ a3 b3 c3                  ] [ θ3   ]   [ d3          ]
[    a4 b4 c4               ] [ θ4   ]   [ d4          ]
[       ·  ·  ·             ] [ ·    ] = [ ·           ]
[          aI−1 bI−1 cI−1   ] [ θI−1 ]   [ dI−1        ]
[               aI   bI     ] [ θI   ]   [ dI − cIθI+1 ]
In terms of the eigenvalues of A,

cond₂(A) = |λ|max / |λ|min.

Let us consider N large. Thus, expanding the cosines in Taylor series (the first about π/(N + 1) → 0 and the second about Nπ/(N + 1) → π):

cos[π/(N + 1)] = 1 − (1/2!)[π/(N + 1)]² + (1/4!)[π/(N + 1)]⁴ − · · · ,
cos[Nπ/(N + 1)] = −1 + (1/2!)[Nπ/(N + 1) − π]² − · · · = −1 + (1/2)[π/(N + 1)]² − · · · .

Consider the common case that may result from the use of central differences for a second-order derivative:

a = 1, b = −2, c = 1,

for which

cond₂(A) ≈ 4/[π/(N + 1)]² = 4(N + 1)²/π², for large N.

Now consider

a = 1, b = −4, c = 1,

which is strictly, or strongly, diagonally dominant. Then from equation (3.16), the condition number for large N is approximately

cond₂(A) ≈ 6/2 = 3, for large N.
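These estimates can be checked numerically using the known eigenvalues λj = b + 2√(ac) cos(jπ/(N + 1)) of a constant tridiagonal Toeplitz matrix (a short sketch, assuming ac > 0):

```python
import math

def cond2_tridiag(a, b, c, N):
    """2-norm condition number of the N x N constant tridiagonal matrix
    [a, b, c], using eigenvalues b + 2*sqrt(a*c)*cos(j*pi/(N+1)),
    j = 1..N (requires a*c > 0)."""
    lams = [abs(b + 2*math.sqrt(a*c)*math.cos(j*math.pi/(N + 1)))
            for j in range(1, N + 1)]
    return max(lams) / min(lams)

N = 100
print(cond2_tridiag(1, -2, 1, N))   # grows like 4(N+1)^2 / pi^2
print(4*(N + 1)**2 / math.pi**2)
print(cond2_tridiag(1, -4, 1, N))   # stays near 3 for the dominant case
```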
1) Forward Elimination:

F1 = 0, δ1 = θ1 = θb (boundary condition),
Fi = ci/(bi − aiFi−1), δi = (di − aiδi−1)/(bi − aiFi−1), i = 2, . . . , I.

2) Back Substitution:

θI+1 = θL (boundary condition), θi = δi − Fiθi+1, i = I, I − 1, . . . , 2.
Notes:
1 See Anderson, Appendix A for a derivation.
2 The Thomas algorithm only requires O(I) operations, which is as good a
scaling as one could hope for. Gauss elimination of a full (dense) matrix, for
example, requires O(I 3 ) operations.
3 To prevent ill-conditioning the system of equations should be diagonally dominant. For our tridiagonal system, this means

|bi| ≥ |ai| + |ci|,

where if the greater-than sign applies we say that the matrix is strictly diagonally dominant, or weakly diagonally dominant if the equal sign applies.
Performing operations with ill-conditioned matrices can result in the growth
of small round-off errors that then contaminate the solution. For example,
note how errors could accumulate in the Fi , δi coefficients in the Thomas
algorithm.
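A minimal Python sketch of the Thomas algorithm (indices shifted so the arrays hold only the unknowns; the boundary values enter through the right-hand side):

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system with sub-, main- and super-diagonals
    a, b, c and right-hand side d (all length n) in O(n) operations.
    a[0] and c[-1] are unused."""
    n = len(d)
    F, delta = [0.0]*n, [0.0]*n
    F[0], delta[0] = c[0]/b[0], d[0]/b[0]
    for i in range(1, n):                  # forward elimination
        denom = b[i] - a[i]*F[i-1]
        F[i] = c[i]/denom
        delta[i] = (d[i] - a[i]*delta[i-1])/denom
    x = [0.0]*n
    x[-1] = delta[-1]
    for i in range(n - 2, -1, -1):         # back substitution
        x[i] = delta[i] - F[i]*x[i+1]
    return x

# theta'' = 0 with theta = 1 at the left boundary and 0 at the right,
# five interior unknowns; boundary values are folded into d.
n = 5
x = thomas([1.0]*n, [-2.0]*n, [1.0]*n, [-1.0] + [0.0]*(n - 1))
print([round(v, 6) for v in x])  # linear profile 5/6, 4/6, ..., 1/6
```

Note how an error introduced early in the Fi, δi recursions would propagate through the sweep, which is why diagonal dominance matters.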
Rather than a Dirichlet boundary condition at the tip of the fin, which assumes
that we know the temperature there, let us consider a more realistic convection
condition at the tip
−k dT/dx = h(T − T∞) at x = L,

or, with θ(x) = T(x) − T∞,

−k dθ/dx|x=L = hθ(L).

Now let us consider evaluation of the heat flux at the base of the fin:

qb = −kAc(0) dT/dx|x=0 = −kAc(0) dθ/dx|x=0.
Observe that each successive approximation requires one additional point in the
interior of the domain.
Along C, let

φ1(τ) = uxx, φ2(τ) = uxy, φ3(τ) = uyy,
ψ1(τ) = ux, ψ2(τ) = uy.   (4.2)

Substituting into equation (4.1) gives

aφ1 + bφ2 + cφ3 = H.   (4.3)

Along the curve C,

d/dτ = (dx/dτ) ∂/∂x + (dy/dτ) ∂/∂y;

therefore,

dψ1/dτ = d(ux)/dτ = uxx dx/dτ + uxy dy/dτ = φ1 dx/dτ + φ2 dy/dτ,   (4.4)
dψ2/dτ = d(uy)/dτ = uxy dx/dτ + uyy dy/dτ = φ2 dx/dτ + φ3 dy/dτ.   (4.5)

Equations (4.3)–(4.5) are three equations for three unknowns, i.e. the second-order derivatives φ1, φ2 and φ3. Written in matrix form, they are

[ a      b      c     ] [ φ1 ]   [ H      ]
[ dx/dτ  dy/dτ  0     ] [ φ2 ] = [ dψ1/dτ ]
[ 0      dx/dτ  dy/dτ ] [ φ3 ]   [ dψ2/dτ ]
If the determinant of the coefficient matrix is not equal to zero, a unique solution
exists for the second derivatives along the curve C. It can be shown that if the
second-order derivatives exist, then derivatives of all orders exist along C as well.
On the other hand, if the determinant of the coefficient matrix is equal to zero,
the solution is not unique, i.e. the second derivatives are discontinuous along C.
Setting the determinant equal to zero gives

a (dy/dτ)² − b (dx/dτ)(dy/dτ) + c (dx/dτ)² = 0.

Dividing by (dx/dτ)² yields a quadratic equation for dy/dx, which is the slope of the curve C. Thus,

dy/dx = [b ± √(b² − 4ac)] / (2a).   (4.6)
→ The curves C for which y(x) satisfy (4.6) are called characteristic curves of
equation (4.1), and they are curves along which the second-order derivatives
are discontinuous.
Because the characteristics must be real, their behavior is determined by the sign
of b2 − 4ac:
b2 − 4ac > 0 ⇒ 2 real roots ⇒ 2 characteristics ⇒ hyperbolic p.d.e.
b2 − 4ac = 0 ⇒ 1 real root ⇒ 1 characteristic ⇒ parabolic p.d.e.
b2 − 4ac < 0 ⇒ no real roots ⇒ no characteristics ⇒ elliptic p.d.e.
Equivalently, if A is the matrix of the second-derivative coefficients, det[A] = ac − b²/4, or

−4 det[A] = b² − 4ac:  > 0 hyperbolic, = 0 parabolic, < 0 elliptic.
Notes:
1 Physically, characteristics are curves along which information propagates in
the solution.
2 For the case of elliptic equations, the matrix A is positive definite.
3 The classification depends on the coefficients of the highest-order derivatives,
i.e. a, b and c.
4 It can be shown that the classification of a partial differential equation is
independent of the coordinate system (see, for example, Tannehill et al.).
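The classification rule is easy to mechanize; a small sketch (the example coefficients follow the cases treated below):

```python
def classify(a, b, c):
    """Classify a*u_xx + b*u_xy + c*u_yy + (lower-order terms) = H
    by the sign of the discriminant b^2 - 4ac."""
    disc = b*b - 4*a*c
    if disc > 0:
        return "hyperbolic"
    if disc == 0:
        return "parabolic"
    return "elliptic"

sigma, alpha = 2.0, 0.5
print(classify(sigma**2, 0, -1))  # wave equation: hyperbolic
print(classify(alpha, 0, 0))      # diffusion equation: parabolic
print(classify(1, 0, 1))          # Laplace equation: elliptic
```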
From equation (4.6), for a hyperbolic equation there are two real roots, say

dy/dx = λ1, dy/dx = λ2.   (4.7)

If a, b and c are constant (λ1, λ2 constant), then we may integrate to obtain

y = λ1x + x1, y = λ2x + x2,

which are straight lines. Therefore, the solution propagates along two linear characteristic curves.

For example, consider the wave equation

∂²u/∂t² = σ² ∂²u/∂x², (y → t: a = σ², b = 0, c = −1),

where u(x, t) is the amplitude of the wave, and σ is the wave speed. Therefore, from equations (4.6) and (4.7),

λ1 = 1/σ, λ2 = −1/σ.
Therefore, the characteristics of the wave equation with a, b, and c constant are
straight lines with slopes 1/σ and −1/σ.
Note that no boundary conditions are necessary at specified values of x, i.e. the
solution (F1 , F2 ) only depends upon the initial conditions.
Hyperbolic equations in fluid dynamics:
Unsteady inviscid flow
Steady supersonic inviscid flow
From equation (4.6), for a parabolic equation there is only one real root, which is

dy/dx = b/(2a).   (4.8)

If a and b are constant, then we may integrate to obtain

y = [b/(2a)] x + γ1,

which is a straight line. Therefore, the solution propagates along one linear characteristic direction (usually time).

For example, consider the one-dimensional, unsteady diffusion equation (e.g. heat conduction)

∂u/∂t = α ∂²u/∂x², (y → t: a = α, b = c = 0),

where u(x, t) is the quantity undergoing diffusion (e.g. temperature), and α is the diffusivity.
The solution marches forward in time, i.e. the characteristics (with b = 0) are lines
of constant t.
Another parabolic example is the steady boundary-layer (momentum) equation

u ∂u/∂x + v ∂u/∂y = −∂p/∂x + ∂²u/∂y², (a = 0, b = 0, c = 1).

In this case the solution marches forward in the x-direction from an initial velocity profile.
For example, consider the Laplace equation

∂²u/∂x² + ∂²u/∂y² = 0, (a = 1, b = 0, c = 1),

for which b² − 4ac = −4 < 0 ⇒ elliptic.
If a, b and c are variable coefficients, then b2 − 4ac may change sign with space
and/or time.
⇒ Character of equations may be different in certain regions.
For example, consider transonic flow (Mach ∼ 1). The governing equation for
two-dimensional, steady, compressible, potential flow about a slender body is
(1 − M²) ∂²φ/∂s² + ∂²φ/∂n² = 0, (x → s, y → n: a = 1 − M², b = 0, c = 1),

where φ(s, n) is the velocity potential, M is the local Mach number, and s and n are streamline coordinates, with s locally tangent to the streamline and n normal to it.

To determine the nature of the equation, observe that b² − 4ac = −4(1 − M²); therefore,

M < 1 ⇒ b² − 4ac < 0 ⇒ Elliptic
M = 1 ⇒ b² − 4ac = 0 ⇒ Parabolic
M > 1 ⇒ b² − 4ac > 0 ⇒ Hyperbolic
In the above example, we have the same equation, but different behavior in
various regions. In the following example, we have different equations in different
regions of the flow.
Linear: e.g. the Poisson equation

∂²φ/∂x² + ∂²φ/∂y² = f(x, y),

with Dirichlet, Neumann or Robin boundary conditions.

Non-Linear:
1 Linear equation with non-linear boundary conditions, e.g. heat conduction with a radiation condition:

∂²T/∂x² + ∂²T/∂y² = 0, with ∂T/∂n = D(T⁴ − Tsur⁴) on the boundary.

2 Non-linear equation, e.g. the Navier-Stokes equations:

u ∂u/∂x + v ∂u/∂y = −∂p/∂x + (1/Re)(∂²u/∂x² + ∂²u/∂y²).
Numerical Solutions of Elliptic Problems: Finite-Difference Methods for the Poisson Equation
∂²φ/∂x² + ∂²φ/∂y² = f(x, y),   (5.1)
Consider an approximation to equation (5.1) at a typical point (i, j); the five-point finite-difference stencil is
Substituting into (5.1) and multiplying by (∆x)² gives the final form of the finite-difference equation

φi+1,j − 2[1 + (∆x/∆y)²]φi,j + φi−1,j + (∆x/∆y)²(φi,j+1 + φi,j−1) = (∆x)²fi,j,   (5.2)

which results in a system of (I + 1) × (J + 1) equations for the (I + 1) × (J + 1) unknowns.
There are two options for solving such systems of equations:
1 Direct Methods:
i) No iterative convergence errors.
ii) Efficient for certain types of linear systems, e.g. tridiagonal, block-tridiagonal.
iii) Become less efficient for large systems of equations.
iv) Typically cannot adapt to non-linear problems.
2 Iterative Methods:
i) Iterative convergence errors.
ii) Generally more efficient for large systems of equations.
iii) Apply to non-linear problems.
We wish to consider direct methods for solving the discretized Poisson equation.
Repeating the difference equation (5.2) for the Poisson equation, we have
φi+1,j − 2(1 + ∆̄)φi,j + φi−1,j + ∆̄(φi,j+1 + φi,j−1) = (∆x)²fi,j,   (5.3)

where ∆̄ = (∆x/∆y)², and i = 1, . . . , I + 1; j = 1, . . . , J + 1.
In order to write the system of difference equations in matrix form, let us renumber
the two-dimensional mesh (i, j) into a one-dimensional array (n) as follows
where i = 1, . . . , I + 1 and j = 1, . . . , J + 1.
Therefore, our five-point finite difference stencil becomes
And the finite-difference equation (5.4) for the Poisson equation becomes (with ∆ = ∆x = ∆y ⇒ ∆̄ = 1)
0 is a zero matrix block, I is an identity matrix block, and the tridiagonal blocks are

    [ −4  1           ]
    [  1 −4  1        ]
D = [     1 −4  ·     ]
    [        ·  ·  1  ]
    [           1 −4  ]
In general, the method for solving the system Aφ = d depends upon the form of
the coefficient matrix A:
Full or dense ⇒ Gauss elimination, LU decomposition, etc. (very expensive
computationally).
Sparse and banded → Result from discretizations of certain classes of
problems (e.g. separable elliptic partial differential equations, such as the
Poisson equation):
→ Tridiagonal ⇒ Thomas algorithm
→ Fast Fourier Transform (FFT) and/or cyclic reduction, which generally are the
fastest methods for problems in which they apply.
The Fourier transform of a function h(t) is

H(f) = ∫ h(t) e^{2πift} dt (integrated over −∞ < t < ∞),

where f is the frequency, and i is the imaginary number. The inverse transform is

h(t) = ∫ H(f) e^{−2πift} df (integrated over −∞ < f < ∞).

Consider the discrete form of the Fourier transform in which we have N values of h(t) at discrete points defined by

hk = h(tk), tk = k∆, k = 0, 1, 2, . . . , N − 1.

After taking the Fourier transform, we will have N discrete points in the frequency domain defined by

fn = n/(N∆), n = −N/2, . . . , N/2,

corresponding to the Nyquist critical frequency range. Thus, the discrete Fourier transform is approximated by

H(fn) ≈ Σ_{k=0}^{N−1} hk e^{2πifn tk} ∆ = ∆ Σ_{k=0}^{N−1} hk e^{2πikn/N}.
This is equivalent to taking the Fourier transform in each direction. The inverse Fourier transform is (analogous to (5.6))

φk,l = (1/KL) Σ_{m=−K/2}^{K/2} Σ_{n=−L/2}^{L/2} φ̂m,n e^{−2πikm/K} e^{−2πiln/L}.   (5.8)
Now let us apply this to the discretized Poisson equation (5.2). For simplicity, set ∆ = ∆x = ∆y, giving (with i → k, j → l)

φk+1,l + φk−1,l + φk,l+1 + φk,l−1 − 4φk,l = ∆²fk,l,   (5.9)

where now fk,l is the right-hand side of the Poisson equation (not the frequency). Substituting equation (5.8) into equation (5.9) leads to

φ̂m,n [2 cos(2πm/K) + 2 cos(2πn/L) − 4] = ∆²f̂m,n;

therefore,

φ̂m,n = ∆²f̂m,n / (2[cos(2πm/K) + cos(2πn/L) − 2]),   (5.10)

for m = 1, . . . , K − 1; n = 1, . . . , L − 1.
Therefore, to solve the difference equation (5.9) using Fourier transform methods:

1) Compute the Fourier transform f̂m,n of the right-hand side fk,l using (similar to (5.7))

f̂m,n = Σ_{k=0}^{K−1} Σ_{l=0}^{L−1} fk,l e^{2πimk/K} e^{2πinl/L}.   (5.11)

2) Compute φ̂m,n from equation (5.10).

3) Compute the solution φk,l from the inverse transform (5.8).
Notes:
1) The above procedure works for periodic boundary conditions, i.e. the solution
satisfies
φk,l = φk+K,l = φk,l+L .
For Dirichlet boundary conditions ⇒ Use sine transform.
For Neumann boundary conditions ⇒ Use cosine transform.
2) In practice, the Fourier (and inverse) transforms are computed using a Fast
Fourier Transform (FFT) technique (see, for example, Numerical Recipes).
3) Fourier transform methods can only be applied to partial differential
equations with constant coefficients in the direction(s) for which the Fourier
transform is applied.
4) We use Fourier transforms to solve the difference equation, not the
differential equation; therefore, this is not a spectral method.
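The procedure can be illustrated end-to-end on a tiny periodic grid. For clarity this sketch uses direct summation for the transforms rather than an FFT; a manufactured periodic solution generates the right-hand side from (5.9), and equation (5.10) recovers it:

```python
import cmath
import math

def dft2(g, K, L, sign):
    """2-D discrete Fourier transform by direct summation; O((KL)^2),
    fine for a tiny demonstration grid."""
    return [[sum(g[k][l] * cmath.exp(sign * 2j * math.pi * (m*k/K + n*l/L))
                 for k in range(K) for l in range(L))
             for n in range(L)] for m in range(K)]

K = L = 8
d = 1.0                                   # grid spacing Delta
# Manufactured periodic solution with zero mean:
phi = [[math.sin(2*math.pi*k/K) + math.cos(4*math.pi*l/L)
        for l in range(L)] for k in range(K)]
# Right-hand side from the 5-point difference equation (5.9):
f = [[(phi[(k+1) % K][l] + phi[(k-1) % K][l] + phi[k][(l+1) % L]
       + phi[k][(l-1) % L] - 4*phi[k][l]) / d**2
      for l in range(L)] for k in range(K)]

fh = dft2(f, K, L, +1)                    # forward transform, as in (5.11)
ph = [[0j]*L for _ in range(K)]
for m in range(K):
    for n in range(L):
        if m == n == 0:
            continue                      # zero mode: fixed by zero mean
        den = 2*(math.cos(2*math.pi*m/K) + math.cos(2*math.pi*n/L) - 2)
        ph[m][n] = d**2 * fh[m][n] / den  # equation (5.10)
rec = dft2(ph, K, L, -1)                  # inverse transform, as in (5.8)
err = max(abs(rec[k][l]/(K*L) - phi[k][l])
          for k in range(K) for l in range(L))
print(err)  # the manufactured solution is recovered to machine precision
```

In practice the transforms would be done with an FFT, and sine or cosine transforms would replace the complex exponentials for Dirichlet or Neumann conditions, as noted above.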
Cyclic Reduction
Again, consider the Poisson equation
∂²φ/∂x² + ∂²φ/∂y² = f(x, y),
discretized on a two-dimensional grid with ∆ = ∆x = ∆y and
i = 0, . . . , I; j = 0, . . . , J, where I = 2n with integer n ((I + 1) × (J + 1) points):
Applying central differences to the Poisson equation, the difference equation for
constant x-lines becomes
where

ui = [φi,0, φi,1, φi,2, . . . , φi,J−1, φi,J]ᵀ,   fi = [fi,0, fi,1, fi,2, . . . , fi,J−1, fi,J]ᵀ,

and

    [ −2  1           ]
    [  1 −2  1        ]
B = [     1 −2  ·     ]
    [        ·  ·  1  ]
    [           1 −2  ]
The first three terms in equation (5.12) correspond to the central difference in the x-direction, and the fourth term corresponds to the central difference in the y-direction (see B).
Note that equation (5.13) corresponds to the block-tridiagonal matrix for equation (5.4), where B is the tridiagonal portion and ui−1 and ui+1 are the 'fringes.'
Writing three successive equations of (5.13) for i − 1, i and i + 1:
Multiplying the middle equation by −B and adding all three gives

ui−2 + B*ui + ui+2 = fi*,

where

B* = 2I − B²,
fi* = fi−1 − Bfi + fi+1.
This is an equation of the same form as (5.13); therefore, applying this procedure
to all even numbered i equations in (5.13) reduces the number of equations by a
factor of two.
This cyclic reduction procedure can be repeated until a single equation remains for
the middle line of variables, uI/2 (I = 2n , with integer n), which is tridiagonal.
Thus, using the solution for uI/2 , solutions for all other i are obtained by
successively solving the tridiagonal problems at each level in reverse:
Returning to the difference equation (5.2) for the Poisson equation (5.1),

φi+1,j − 2(1 + ∆̄)φi,j + φi−1,j + ∆̄(φi,j+1 + φi,j−1) = (∆x)²fi,j,   (5.15)

where ∆̄ = (∆x/∆y)².
This may be written in general form as
Lφ = f, (5.16)
Jacobi Iteration
Solving equation (5.15) for φi,j and indicating the iteration number using a superscript:

φ^{n+1}_{i,j} = [φ^n_{i+1,j} + φ^n_{i−1,j} + ∆̄(φ^n_{i,j+1} + φ^n_{i,j−1}) − (∆x)²fi,j] / [2(1 + ∆̄)].   (5.17)

Procedure:
1 Provide an initial guess φ¹_{i,j} for φi,j at each point i = 1, . . . , I + 1, j = 1, . . . , J + 1.
2 Relax (iterate) by applying (5.17) at each grid point to produce successive approximations: φ²_{i,j}, φ³_{i,j}, . . . , φ^n_{i,j}, . . .
3 Continue until convergence, determined by

max |φ^{n+1}_{i,j} − φ^n_{i,j}| / max |φ^n_{i,j}| < ε,

where ε is a small convergence tolerance.
Notes:
1 Convergence is too slow ⇒ not used in practice.
2 Requires both φ^{n+1}_{i,j} and φ^n_{i,j} to be stored for all i = 1, . . . , I + 1, j = 1, . . . , J + 1.
3 Used as a basis for comparison with other methods to follow.
4 Although not necessary, it is instructive to view iterative methods in matrix form Ax = c.
→ See MMAE 501, section 2.2.2 notes.
We write A = M1 − M2; thus, an iterative scheme may be devised by writing Ax = c in the form

M1 x^{(n+1)} = M2 x^{(n)} + c,
x^{(n+1)} = M1⁻¹ M2 x^{(n)} + M1⁻¹ c,

or

x^{(n+1)} = M x^{(n)} + M1⁻¹ c,

where M = M1⁻¹ M2 is the iteration matrix.

For Jacobi iteration, let D be the diagonal of A, and let −L and −U be its strictly lower and upper triangular parts, so that A = D − L − U. Taking M1 = D and M2 = L + U gives

D x^{(n+1)} = (L + U) x^{(n)} + c.
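A short sketch of Jacobi relaxation (5.17) applied to the Laplace equation (f = 0) on the unit square, with φ = 1 on one boundary and 0 on the others (a hypothetical test problem, not from the notes):

```python
def jacobi_poisson(f, phi, dx, dy, tol=1e-6, max_iter=20000):
    """Jacobi relaxation (5.17) for the 5-point Poisson difference
    equation; phi holds the initial guess with boundary values set."""
    I, J = len(phi) - 1, len(phi[0]) - 1
    beta = (dx/dy)**2                 # the quantity written Delta-bar
    for it in range(max_iter):
        new = [row[:] for row in phi]
        diff = 0.0
        for i in range(1, I):
            for j in range(1, J):
                new[i][j] = (phi[i+1][j] + phi[i-1][j]
                             + beta*(phi[i][j+1] + phi[i][j-1])
                             - dx*dx*f[i][j]) / (2*(1 + beta))
                diff = max(diff, abs(new[i][j] - phi[i][j]))
        phi = new
        scale = max(abs(v) for row in phi for v in row) or 1.0
        if diff/scale < tol:          # convergence criterion from the notes
            return phi, it + 1
    return phi, max_iter

# Laplace equation (f = 0) on a 21 x 21 grid, phi = 1 on the top boundary:
N = 20
f = [[0.0]*(N + 1) for _ in range(N + 1)]
phi0 = [[0.0]*(N + 1) for _ in range(N + 1)]
for i in range(N + 1):
    phi0[i][N] = 1.0
sol, iters = jacobi_poisson(f, phi0, 1.0/N, 1.0/N)
print(iters, sol[N//2][N//2])  # centre value ~0.25 by symmetry
```

Note the double storage (`phi` and `new`) and the large iteration count, both of which Gauss-Seidel improves on.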
Gauss-Seidel Iteration
There are two problems with Jacobi iteration:
1 Slow convergence.
2 Must store current and previous iterations for entire grid.
The Gauss-Seidel method addresses both problems by using the most recently
updated information. For example, if sweeping along lines of constant y
Therefore, when updating φi,j , the points φi,j−1 and φi−1,j have already been
updated, and using these updated values, equation (5.17) is changed to
φ^{n+1}_{i,j} = [φ^n_{i+1,j} + φ^{n+1}_{i−1,j} + ∆̄(φ^n_{i,j+1} + φ^{n+1}_{i,j−1}) − (∆x)²fi,j] / [2(1 + ∆̄)].   (5.19)
Observe that now it is not necessary to store φni,j at the previous iteration. The
values of φi,j are all stored in the same array, and it is not necessary to distinguish
between the (n) or (n + 1)st iterates. We simply use the most recently updated
information.
In matrix form, the Gauss-Seidel method is (M1 = D − L, M2 = U):
(D − L)x(n+1) = Ux(n) + c
Thus, the rate of convergence is twice as fast as for Jacobi, i.e. the Gauss-Seidel
method requires one-half the iterations for the same level of accuracy.
Note: It can be shown that diagonal dominance of A is a sufficient (but not necessary) condition for convergence of the Jacobi and Gauss-Seidel iteration methods, i.e. ρ < 1, where ρ is the spectral radius of M = M1⁻¹ M2 (see Morton and Mayers, p. 205 for proof).
Successive over-relaxation (SOR) blends the Gauss-Seidel update with the current value,

φ^{n+1}_{i,j} = (1 − ω)φ^n_{i,j} + ω[φ^n_{i+1,j} + φ^{n+1}_{i−1,j} + ∆̄(φ^n_{i,j+1} + φ^{n+1}_{i,j−1}) − (∆x)²fi,j] / [2(1 + ∆̄)],   (5.21)

where ω is the relaxation parameter and 0 < ω < 2 for convergence (Morton & Mayers, p. 206):

ω = 1     ⇒ Gauss-Seidel
1 < ω < 2 ⇒ Overrelaxation
0 < ω < 1 ⇒ Underrelaxation
Recall that for our model problem, ρJac = 1 − (1/2)[π/(I + 1)]² + · · · ; thus,

ωopt = 2 / (1 + √(1 − ρ²Jac))
     = 2 / (1 + √(1 − {1 − (1/2)[π/(I + 1)]² + · · ·}²))   (5.24)
     = 2 / (1 + √([π/(I + 1)]² + · · ·)),

so that

ωopt ≈ 2 / [1 + π/(I + 1)].
Notes:
1 Although ωopt does not depend on the right-hand-side, it does depend upon:
i) differential equation
ii) method of discretization
iii) boundary conditions
iv) shape of domain
2 For a given problem, ωopt must be estimated from a similar problem and/or
trial and error.
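The payoff from ω can be seen by comparing iteration counts for Gauss-Seidel (ω = 1) against SOR with the estimate (5.24), on a small Laplace test problem (∆x = ∆y, so the Gauss-Seidel average carries a factor 1/4; the setup is a hypothetical example):

```python
import math

def sor_laplace(phi, omega, tol=1e-6, max_iter=10000):
    """Point-SOR sweeps (5.21) for the Laplace difference equation
    with dx = dy; phi carries the boundary values."""
    I, J = len(phi) - 1, len(phi[0]) - 1
    for it in range(max_iter):
        diff = 0.0
        for i in range(1, I):
            for j in range(1, J):
                gs = 0.25*(phi[i+1][j] + phi[i-1][j]
                           + phi[i][j+1] + phi[i][j-1])
                new = phi[i][j] + omega*(gs - phi[i][j])
                diff = max(diff, abs(new - phi[i][j]))
                phi[i][j] = new
        if diff < tol:
            return it + 1
    return max_iter

def iterations(omega, N=20):
    phi = [[0.0]*(N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        phi[i][N] = 1.0               # phi = 1 on the top boundary
    return sor_laplace(phi, omega)

N = 20
w_opt = 2.0 / (1.0 + math.pi/(N + 1))  # estimate from equation (5.24)
print(iterations(1.0), iterations(w_opt))  # Gauss-Seidel vs SOR counts
```

SOR at the near-optimal ω converges in a small fraction of the Gauss-Seidel iteration count, at no extra cost per sweep.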
For example, the update at the point (2, 2) is

φ^{n+1}_{2,2} = [φ^n_{3,2} + φ_{1,2} + ∆̄(φ^n_{2,3} + φ_{2,1}) − (∆x)²f2,2] / [2(1 + ∆̄)].

The simplest treatment would be to use the Jacobi (5.17), Gauss-Seidel (5.19), or SOR (5.21) equation to update φi,j in the interior for i = 2, . . . , I, and then to approximate the boundary condition (5.26) by a forward difference applied at i = 1:

(φ2,j − φ1,j)/∆x + O(∆x) = c.   (5.27)
This could then be used to update φ1,j, j = 2, . . . , J. Alternatively, the difference equation (5.17) may be applied at i = 1 itself,

φ^{n+1}_{1,j} = [φ^n_{2,j} + φ^n_{0,j} + ∆̄(φ^n_{1,j+1} + φ^n_{1,j−1}) − (∆x)²f1,j] / [2(1 + ∆̄)].   (5.28)

However, this involves a value φ^n_{0,j} that is outside the domain. A second-order accurate central-difference approximation for the boundary condition (5.26) is

(φ^n_{2,j} − φ^n_{0,j})/(2∆x) + O(∆x²) = c,   (5.29)

which also involves the value φ^n_{0,j}. Therefore, solving (5.29) for φ^n_{0,j} gives

φ^n_{0,j} = φ^n_{2,j} − 2c∆x,

and substituting into the difference equation (5.28) to eliminate φ^n_{0,j} leads to

φ^{n+1}_{1,j} = [2(φ^n_{2,j} − c∆x) + ∆̄(φ^n_{1,j+1} + φ^n_{1,j−1}) − (∆x)²f1,j] / [2(1 + ∆̄)].   (5.30)
Notes:
1 This is the same procedure used for a Dirichlet condition but with an additional sweep along the left boundary using (5.30) for φ^{n+1}_{1,j}, j = 2, . . . , J.
2 A Neumann condition on another boundary is treated in the same manner, e.g.

∂φ/∂y = d at y = b.   (5.31)
Recall that in elliptic problems, the solution anywhere depends on the solution
everywhere, i.e. it has an infinite speed of propagation in all directions.
However, in Jacobi, Gauss-Seidel, and SOR, information only propagates through
the mesh one point at a time. For example, if sweeping along lines of constant y
with 0 ≤ x ≤ a, it takes I iterations before the boundary condition at x = a is
“felt” at x = 0.
⇒ These techniques are not very “elliptic like.”
A more “elliptic-like” method could be obtained by solving entire lines in the grid
in an implicit manner. For example, sweeping along lines of constant y and
solving each constant y-line implicitly, i.e. all at once, would allow for the
boundary condition at x = a to influence the solution in the entire domain after
only one sweep through the grid.
Consider the j th line and assume that values along the j + 1st and j − 1st lines
are taken from the previous iterate.
Rewriting (5.33) as an implicit equation for the values of φi,j along the j th line
gives
Notes:
1 If sweeping through j-lines, j = 2, . . . , J, then φ^n_{i,j−1} becomes φ^{n+1}_{i,j−1} in (5.34), i.e. it has already been updated. Therefore, we use updated values as in Gauss-Seidel.
2 Can also incorporate SOR.
3 More efficient at spreading information throughout the domain; therefore, it
reduces the number of iterations required for convergence, but there is more
computation per iteration.
4 This provides motivation for the ADI method.
ADI Method
In the ADI method we sweep along lines but in alternating directions.
In the first half of the iteration we perform a sweep along constant y-lines by
solving the series of tridiagonal problems for j = 2, . . . , J:
φ^{n+1/2}_{i+1,j} − (2 + σ)φ^{n+1/2}_{i,j} + φ^{n+1/2}_{i−1,j} = (∆x)²fi,j − ∆̄[φ^n_{i,j+1} − (2 − σ/∆̄)φ^n_{i,j} + φ^{n+1/2}_{i,j−1}],   i = 2, . . . , I.   (5.35)
Notes:
1 The φ^{n+1/2}_{i,j−1} term on the right-hand-side has been updated from the previous line.
2 Unlike in equation (5.34), differencing in the x- and y-directions are kept
separate to mimic diffusion in each direction. This is called a splitting
method.
3 σ is an acceleration parameter to enhance diagonal dominance (σ > 0).
σ = 0 corresponds to no acceleration. Note that the σ terms on each side of
the equation cancel.
In the second half of the iteration we sweep along constant x-lines by solving the
series of tridiagonal problems for i = 2, . . . , I:
∆̄φ^{n+1}_{i,j+1} − (2∆̄ + σ)φ^{n+1}_{i,j} + ∆̄φ^{n+1}_{i,j−1} = (∆x)²fi,j − [φ^{n+1/2}_{i+1,j} − (2 − σ)φ^{n+1/2}_{i,j} + φ^{n+1}_{i−1,j}],   j = 2, . . . , J,   (5.36)

where φ^{n+1}_{i−1,j} has been updated from the previous line.
Notes:
1 Involves (I − 1) + (J − 1) tridiagonal solves for each iteration (for Dirichlet
boundary conditions).
2 For ∆x = ∆y (∆ ¯ = 1), it can be shown that for the Poisson (or Laplace)
equation with Dirichlet boundary conditions that the acceleration parameter
that gives the best speedup is
σ = 2 sin (π/R) ,
Recall that the second-order central-difference operator δ²x satisfies

∂²φ/∂x² = δ²xφ − (∆x²/12) ∂⁴φ/∂x⁴ + O(∆x⁴).   (5.37)

Therefore,

δ²xφ = ∂²φ/∂x² + (∆x²/12) ∂⁴φ/∂x⁴ + O(∆x⁴),

or

δ²xφ = [1 + (∆x²/12) ∂²/∂x²] ∂²φ/∂x² + O(∆x⁴).   (5.38)

But from equation (5.37),

∂²/∂x² = δ²x + O(∆x²).

Substituting into (5.38) gives

δ²xφ = [1 + (∆x²/12)(δ²x + O(∆x²))] ∂²φ/∂x² + O(∆x⁴),

or

δ²xφ = [1 + (∆x²/12) δ²x] ∂²φ/∂x² + O(∆x⁴).   (5.39)
Because the last term in equation (5.39) is still O(∆x⁴), we can write equation (5.39) as

∂²φ/∂x² = [1 + (∆x²/12) δ²x]⁻¹ δ²xφ + O(∆x⁴).   (5.41)

Substituting the expansion (5.40) into equation (5.41) leads to an O(∆x⁴) accurate central-difference approximation for the second derivative:

∂²φ/∂x² = [1 − (∆x²/12) δ²x] δ²xφ + O(∆x⁴).
Due to the δx2 (δx2 φ) operator, however, this approximation involves the five points
φi−2 , φi−1 , φi , φi+1 and φi+2 ; therefore, it is not compact.
Consider the Poisson equation

∂²φ/∂x² + ∂²φ/∂y² = f(x, y).

Substituting equations (5.41) and (5.42) into the Poisson equation leads to

[1 + (∆x²/12)δ²_x]^{−1} δ²_x φ + [1 + (∆y²/12)δ²_y]^{−1} δ²_y φ + O(∆⁴) = f(x, y),

where ∆ = max(∆x, ∆y). Multiplying by [1 + (∆x²/12)δ²_x][1 + (∆y²/12)δ²_y] gives

[1 + (∆y²/12)δ²_y] δ²_x φ + [1 + (∆x²/12)δ²_x] δ²_y φ + O(∆⁴) = [1 + (∆x²/12)δ²_x + (∆y²/12)δ²_y + O(∆⁴)] f(x, y).  (5.43)
Therefore, we have a nine-point stencil, but the approximation only requires three points in each direction and thus it is compact. Expanding the operators,

[1 + (∆x²/12)δ²_x] δ²_y φ = (1/(12∆y²)) [−20φ_{i,j} + 10(φ_{i,j−1} + φ_{i,j+1}) − 2(φ_{i−1,j} + φ_{i+1,j}) + φ_{i−1,j−1} + φ_{i−1,j+1} + φ_{i+1,j−1} + φ_{i+1,j+1}],

[1 + (∆x²/12)δ²_x + (∆y²/12)δ²_y] f(x, y) = f_{i,j} + (1/12)[f_{i−1,j} − 2f_{i,j} + f_{i+1,j} + f_{i,j−1} − 2f_{i,j} + f_{i,j+1}]
 = (1/12)[8f_{i,j} + f_{i−1,j} + f_{i+1,j} + f_{i,j−1} + f_{i,j+1}].
Thus, the coefficients in the finite-difference stencil for φ(x, y) are as follows:
Notes:
1 Observe that in equation (5.43) the two-dimensionality of the equation has
been taken advantage of to obtain the compact finite-difference stencil, i.e.
see the δx2 δy2 φ and δy2 δx2 φ difference operators.
2 Because the finite-difference stencil is compact, i.e. only involving three
points in each direction, application of the ADI method as in the previous
section results in a set of tridiagonal problems to solve. Therefore, this
fourth-order, compact finite-difference approach is no less efficient than that
for the second-order scheme used in the previous section (there are simply
additional terms on the right-hand-side of the equations).
3 The primary disadvantage of using higher-order schemes is that it is generally
necessary to use lower-order approximations for derivative boundary
conditions. This is not a problem, however, for Dirichlet boundary conditions.
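The fourth-order accuracy of the one-dimensional compact formula can be checked numerically. This is a sketch; the test function sin x and the grid spacings are arbitrary choices.

```python
import math

def d2_compact(phi, i, dx):
    # O(dx^4) approximation (1 - dx^2/12 * delta_x^2)(delta_x^2 phi);
    # requires the five points i-2 .. i+2.
    d2 = lambda k: (phi[k + 1] - 2.0 * phi[k] + phi[k - 1]) / dx ** 2
    return d2(i) - (d2(i + 1) - 2.0 * d2(i) + d2(i - 1)) / 12.0

def error(dx):
    # Error of d2_compact against the exact second derivative of sin(x) at x0.
    x0 = 1.0
    phi = [math.sin(x0 + k * dx) for k in range(-2, 3)]
    return abs(d2_compact(phi, 2, dx) - (-math.sin(x0)))

ratio = error(0.1) / error(0.05)   # approaches 2**4 = 16 for an O(dx^4) scheme
```

Halving ∆x should reduce the error by a factor of about 16, confirming the O(∆x⁴) behavior.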
Motivation
The iterative techniques we have discussed all have the following property:
High-frequency components of error ⇒ fast convergence.
Low-frequency components of error ⇒ slow convergence.
To illustrate this, consider the simple one-dimensional problem
d2 φ
= 0, 0 ≤ x ≤ 1, (5.44)
dx2
with φ(0) = φ(1) = 0, which has the exact solution φ(x) = 0. Therefore, all plots
of the numerical solution are also plots of the error.
Discretizing equation (5.44) at I + 1 points using central differences gives
φ_{i+1} − 2φ_i + φ_{i−1} = 0,  i = 1, . . . , I − 1.
To show how the nature of the error affects convergence, consider an initial guess
consisting of the Fourier mode φ(x) = sin(kπx), where k is the wavenumber and
indicates the number of half sine waves on the interval 0 ≤ x ≤ 1. In discretized
form, with xi = i∆x = i/I, this is
φ_i = sin(kπi/I),  i = 0, . . . , I.
Consider initial guesses with k = 1, 3, 6 (the figures are from Briggs et al.):
Applying Jacobi and Gauss-Seidel iteration with I = 64, the solution converges
more rapidly for the higher frequency initial guess.
(Figures from Briggs et al. show the error decay for Jacobi and Gauss-Seidel iteration.)
A more realistic situation is one in which the initial guess contains multiple modes,
for example
φ_i = (1/3)[sin(πi/I) + sin(6πi/I) + sin(32πi/I)].
Applying Jacobi iteration with I = 64, the error is reduced rapidly during the early
iterations but more slowly thereafter.
Thus, there is rapid convergence until the high-frequency modes are smoothed
out, then slow convergence for the lower frequency modes.
Thus, relaxation is more effective on a coarse grid representation of the error (it is
also faster).
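The smoothing behavior described above can be reproduced with a few lines of Python, applying plain Jacobi to the model problem (5.44). The sweep count and the mode numbers compared are illustrative choices.

```python
import math

def jacobi_sweeps(phi, sweeps):
    # Plain Jacobi for d2(phi)/dx2 = 0 with phi(0) = phi(1) = 0:
    # each interior value becomes the average of its neighbors.
    n = len(phi)
    for _ in range(sweeps):
        phi = [0.0] + [(phi[i - 1] + phi[i + 1]) / 2.0
                       for i in range(1, n - 1)] + [0.0]
    return phi

I = 64
def amplitude_after(k, sweeps=100):
    # Start from the single Fourier mode sin(k*pi*x) and measure what is left.
    phi = [math.sin(k * math.pi * i / I) for i in range(I + 1)]
    return max(abs(v) for v in jacobi_sweeps(phi, sweeps))

a1, a6 = amplitude_after(1), amplitude_after(6)
```

After 100 sweeps the k = 6 mode is nearly gone while the k = 1 mode is barely reduced, consistent with the damping factor cos(kπ/I) per sweep.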
Notes:
1) Multigrid methods are not so much a specific set of techniques as they are a
framework for accelerating relaxation (iterative) methods.
2) Multigrid methods are comparable in speed with fast direct methods, such as
Fourier methods and cyclic reduction, but can be used to solve general
elliptic equations with variable coefficients and even nonlinear equations.
Multigrid Methodology
Consider the general second-order linear elliptic partial differential equation of the
form
A(x, y) ∂²φ/∂x² + B(x, y) ∂φ/∂x + C(x, y) ∂²φ/∂y² + D(x, y) ∂φ/∂y + E(x, y)φ = F(x, y).  (5.45)
To be elliptic, A(x, y)C(x, y) > 0 for all (x, y). Approximating this differential
equation using second-order accurate central differences gives
A_{i,j} (φ_{i+1,j} − 2φ_{i,j} + φ_{i−1,j})/∆x² + B_{i,j} (φ_{i+1,j} − φ_{i−1,j})/(2∆x)
 + C_{i,j} (φ_{i,j+1} − 2φ_{i,j} + φ_{i,j−1})/∆y² + D_{i,j} (φ_{i,j+1} − φ_{i,j−1})/(2∆y)
 + E_{i,j} φ_{i,j} = F_{i,j},

or

a_{i,j}φ_{i+1,j} + b_{i,j}φ_{i−1,j} + c_{i,j}φ_{i,j+1} + d_{i,j}φ_{i,j−1} + e_{i,j}φ_{i,j} = F_{i,j},  (5.46)
where

a_{i,j} = A_{i,j}/∆x² + B_{i,j}/(2∆x),  b_{i,j} = A_{i,j}/∆x² − B_{i,j}/(2∆x),
c_{i,j} = C_{i,j}/∆y² + D_{i,j}/(2∆y),  d_{i,j} = C_{i,j}/∆y² − D_{i,j}/(2∆y),
e_{i,j} = E_{i,j} − 2A_{i,j}/∆x² − 2C_{i,j}/∆y².
For convenience, write (5.46) (or some other difference equation) as

Lφ = f.  (5.47)

Given an approximate solution φ̄, define the error

e = φ − φ̄,  (5.48)

and the residual

r = f − Lφ̄.  (5.49)
Observe from (5.47) that if φ̄ = φ, then the residual is zero; therefore, the residual
is a measure of how “wrong” the approximate solution is.
Substituting (5.48) into equation (5.47) gives the error equation
Le = r = f − Lφ̄. (5.50)
From these definitions, we can devise a scheme with which to correct the solution
on a fine grid by solving for the error on a coarse grid.
⇒ Coarse-Grid Correction (CGC)
Definitions:
Fine grid → Ωh ; Coarse grid → Ω2h
This CGC scheme is the primary component of the many multigrid algorithms. To illustrate, consider the CGC sequence for equation (5.44), i.e. φ⁰ = 0 (I = 64, ν₁ = 3):
Initial guess:
→ The low-frequency mode is reduced very little after three relaxation steps.
Note: The error being solved for on successively coarser grids is the error of the
error on the next finer grid, i.e. on grid...
Ωh → relaxation on original equation for φ.
Ω2h → relaxation on equation for error on Ωh .
Ω4h → relaxation on equation for error on Ω2h .
..
.
This simple V-cycle scheme is appropriate when a good initial guess is available.
For example, when considering a solution to equation (5.45) in the context of an
unsteady calculation in which case the solution for φh from the previous time step
is a good initial guess for the current time step.
If no good initial guess is available, then Full Multigrid V-Cycle (FMV) may be
applied according to the following procedure:
1) Solve Lφ = f on the coarsest grid (Note: φ, not e).
2) Interpolate to next finer grid.
3) Perform V-cycle to correct φ.
4) Interpolate to next finer grid.
5) Repeat (3) and (4) until finest grid is reached.
6) Perform V-cycles until convergence.
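The coarse-grid correction at the heart of these cycles can be sketched for a 1-D Poisson-type model problem. The smoother (weighted Jacobi with ω = 2/3), the sweep counts, the full-weighting restriction, and the test problem are illustrative assumptions made here, not the notes' prescription.

```python
import math

def relax(phi, f, h, sweeps, w=2.0 / 3.0):
    # Weighted Jacobi smoothing for d2(phi)/dx2 = f, phi fixed at both ends.
    n = len(phi)
    for _ in range(sweeps):
        new = phi[:]
        for i in range(1, n - 1):
            new[i] = (1 - w) * phi[i] + w * (phi[i - 1] + phi[i + 1] - h * h * f[i]) / 2.0
        phi = new
    return phi

def residual(phi, f, h):
    n = len(phi)
    return [0.0] + [f[i] - (phi[i + 1] - 2 * phi[i] + phi[i - 1]) / (h * h)
                    for i in range(1, n - 1)] + [0.0]

def two_grid_cycle(phi, f, h):
    # CGC: smooth, restrict the residual, solve the error equation L e = r on
    # the coarse grid, interpolate the error back, correct, and smooth again.
    phi = relax(phi, f, h, 3)
    r = residual(phi, f, h)
    nc = (len(phi) - 1) // 2 + 1
    rc = [0.0] + [0.25 * r[2 * i - 1] + 0.5 * r[2 * i] + 0.25 * r[2 * i + 1]
                  for i in range(1, nc - 1)] + [0.0]      # full weighting
    ec = relax([0.0] * nc, rc, 2 * h, 500)                # near-exact coarse solve
    e = [0.0] * len(phi)                                  # linear interpolation
    for i in range(nc - 1):
        e[2 * i] = ec[i]
        e[2 * i + 1] = 0.5 * (ec[i] + ec[i + 1])
    e[-1] = ec[-1]
    phi = [p + de for p, de in zip(phi, e)]
    return relax(phi, f, h, 3)

I = 16
h = 1.0 / I
f = [-math.pi ** 2 * math.sin(math.pi * i * h) for i in range(I + 1)]
phi = [0.0] * (I + 1)
r0 = max(abs(v) for v in residual(phi, f, h))
for _ in range(5):
    phi = two_grid_cycle(phi, f, h)
r5 = max(abs(v) for v in residual(phi, f, h))
```

A few cycles reduce the residual by orders of magnitude, far faster than smoothing alone.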
Grid Definitions:
Because each successive grid differs by a factor of two, the finest grid size is often taken as 2^n + 1, where n is an integer. Somewhat more general grids may be obtained using the following grid definitions.
The differential equation (5.45) is discretized on a uniform grid having N_x × N_y points, which are defined by
N_x = m_x 2^{n_x−1} + 1,  N_y = m_y 2^{n_y−1} + 1,
where n_x and n_y determine the number of grid levels, and m_x and m_y determine the size of the coarsest grid, which is (m_x + 1) × (m_y + 1).
For a given grid, nx and ny should be as large as possible, and mx and my should
be as small as possible for maximum efficiency (typically mx and my are 2, 3 or
5). For example:
Nx = 65 ⇒ mx = 2, nx = 6
Nx = 129 ⇒ mx = 2, nx = 7
Nx = 49 ⇒ mx = 3, nx = 5
Nx = 81 ⇒ mx = 5, nx = 5
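The decomposition N = m·2^{n−1} + 1 implied by these examples can be computed by factoring powers of two out of N − 1. This is a sketch; the rule that m stay at least 2 (so the coarsest grid has interior points) is inferred from the examples above.

```python
def grid_levels(N):
    # Factor N - 1 = m * 2**(n-1) with m as small as possible but m >= 2,
    # so that N = m * 2**(n-1) + 1; n is the number of grid levels and the
    # coarsest grid has m + 1 points.
    m, n = N - 1, 1
    while m % 2 == 0 and m > 2:
        m //= 2
        n += 1
    return m, n
```

For example, `grid_levels(65)` gives m = 2 and n = 6, matching the table above.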
MMAE 517 (Illinois Institute of Technology), Computational Fluid Dynamics, © 2010 K. W. Cassel
The grids are denoted G(L), L = 1, . . . , N, where
N = max(n_x, n_y),  (5.52)
G(1) is the coarsest grid, and G(N) is the finest grid. Each grid G(L) has M_x(L) × M_y(L) grid points.
For example,
Nx = 65, Ny = 49 ⇒ mx = 2, nx = 6 and my = 3, ny = 5.
Boundary Conditions
At each boundary the general form of the boundary condition is
pφ + q ∂φ/∂n = s,  (5.55)
where n is the direction normal to the surface. This boundary condition is applied
directly on the finest grid Ωh , i.e.
p^h φ^h + q^h ∂φ^h/∂n = s^h.  (5.56)
But on the coarser grids, we need the boundary condition for the error. In order to
obtain such a condition, consider the following. On the coarse grid Ω2h , equation
(5.55) applies to the solution φ; thus,
p^{2h} φ^{2h} + q^{2h} ∂φ^{2h}/∂n = s^{2h}.

Similarly, the approximate solution φ̄^{2h} satisfies

p^{2h} φ̄^{2h} + q^{2h} ∂φ̄^{2h}/∂n = s^{2h}.

Therefore, to obtain a boundary condition for the error on Ω^{2h}, subtract:

p^{2h} e^{2h} + q^{2h} ∂e^{2h}/∂n = [p^{2h} φ^{2h} + q^{2h} ∂φ^{2h}/∂n] − [p^{2h} φ̄^{2h} + q^{2h} ∂φ̄^{2h}/∂n] = s^{2h} − s^{2h},  (5.57)

so that

p^{2h} e^{2h} + q^{2h} ∂e^{2h}/∂n = 0.
Thus, the boundary conditions are homogeneous on all but the finest grid, where
the original condition on φ is applied. For example,
Dirichlet ⇒ e = 0 on boundary.
Neumann ⇒ ∂e/∂n = 0 on boundary.
That is, the type of boundary condition (Dirichlet, Neumann, or Robin) does not
change, i.e. the p and q coefficients are the same, but they become homogeneous,
i.e. s = 0.
Relaxation
Typically, red-black Gauss-Seidel iteration is used to relax the difference equation. By performing the relaxation on all of the red and black points separately, it eliminates data dependencies such that it is easily implemented on parallel computers (see section 10). Note that when Gauss-Seidel is used, SOR should not be implemented because it destroys the high-frequency smoothing.

Alternatively, line relaxation may be used. Lines of constant y are swept by solving the tridiagonal problem for each j = 1, . . . , M_y(L) given by

a_{i,j}φ_{i+1,j} + e_{i,j}φ_{i,j} + b_{i,j}φ_{i−1,j} = f_{i,j} − c_{i,j}φ*_{i,j+1} − d_{i,j}φ*_{i,j−1},  (5.58)

for i = 1, . . . , M_x(L). Then lines of constant x are swept by solving the tridiagonal problem for each i = 1, . . . , M_x(L) given by

c_{i,j}φ_{i,j+1} + e_{i,j}φ_{i,j} + d_{i,j}φ_{i,j−1} = f_{i,j} − a_{i,j}φ*_{i+1,j} − b_{i,j}φ*_{i−1,j},  (5.59)

for j = 1, . . . , M_y(L). Again we could sweep all lines with i even and i odd separately.
Restriction Operator: I^{2h}_h
The coarse grid has
NXC = (NXF − 1)/2 + 1,  NYC = (NYF − 1)/2 + 1
points, and coarse-grid point (i, j) coincides with fine-grid point (i*, j*), where
i* = 2i − 1,  j* = 2j − 1.
The simplest restriction operator is straight injection,
φ^{2h}_{i,j} = φ^h_{i*,j*},
i.e. we simply drop the points that are not common to both grids. The matrix symbol for straight injection is [1].
Thus,

φ^{2h}_{i,j} = (1/16)[φ^h_{i*−1,j*−1} + φ^h_{i*−1,j*+1} + φ^h_{i*+1,j*−1} + φ^h_{i*+1,j*+1}]
 + (1/8)[φ^h_{i*,j*−1} + φ^h_{i*,j*+1} + φ^h_{i*−1,j*} + φ^h_{i*+1,j*}]
 + (1/4)φ^h_{i*,j*}.  (5.60)
Interpolation (Prolongation) Operator: I^h_{2h}
The interpolation operator is required for moving information from the coarser to
finer grid. The most commonly used interpolation operator is based on bilinear
interpolation.
φ^h_{i*,j*} = φ^{2h}_{i,j}  ← copy common points
φ^h_{i*+1,j*} = (1/2)(φ^{2h}_{i,j} + φ^{2h}_{i+1,j})
φ^h_{i*,j*+1} = (1/2)(φ^{2h}_{i,j} + φ^{2h}_{i,j+1})
φ^h_{i*+1,j*+1} = (1/4)(φ^{2h}_{i,j} + φ^{2h}_{i+1,j} + φ^{2h}_{i,j+1} + φ^{2h}_{i+1,j+1})
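The restriction and prolongation operators can be sketched as follows. For simplicity this sketch assumes square grids and 0-based indexing, so coarse point (i, j) maps to fine point (2i, 2j) rather than the 1-based i* = 2i − 1 of the notes.

```python
def restrict_fw(fine):
    # Full-weighting restriction (5.60); boundary points are copied directly.
    nc = (len(fine) - 1) // 2 + 1
    coarse = [[0.0] * nc for _ in range(nc)]
    for i in range(nc):
        for j in range(nc):
            I, J = 2 * i, 2 * j
            if i in (0, nc - 1) or j in (0, nc - 1):
                coarse[i][j] = fine[I][J]
            else:
                coarse[i][j] = (fine[I][J] / 4.0
                    + (fine[I - 1][J] + fine[I + 1][J]
                       + fine[I][J - 1] + fine[I][J + 1]) / 8.0
                    + (fine[I - 1][J - 1] + fine[I - 1][J + 1]
                       + fine[I + 1][J - 1] + fine[I + 1][J + 1]) / 16.0)
    return coarse

def prolong_bilinear(coarse):
    # Bilinear interpolation to the next finer grid.
    nf = 2 * (len(coarse) - 1) + 1
    fine = [[0.0] * nf for _ in range(nf)]
    for i in range(len(coarse)):          # copy common points
        for j in range(len(coarse)):
            fine[2 * i][2 * j] = coarse[i][j]
    for i in range(0, nf, 2):             # fill odd columns on even rows
        for j in range(1, nf, 2):
            fine[i][j] = 0.5 * (fine[i][j - 1] + fine[i][j + 1])
    for i in range(1, nf, 2):             # fill odd rows by vertical averaging
        for j in range(nf):
            fine[i][j] = 0.5 * (fine[i - 1][j] + fine[i + 1][j])
    return fine

# Both operators reproduce a bilinear function exactly, e.g. phi = x + y.
n = 9
f0 = [[i / (n - 1) + j / (n - 1) for j in range(n)] for i in range(n)]
f1 = prolong_bilinear(restrict_fw(f0))
diff = max(abs(f1[i][j] - f0[i][j]) for i in range(n) for j in range(n))
```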
Speed Comparisons
Consider the test problem
∂2φ ∂φ ∂2φ ∂φ
A(x) + B(x) + C(y) + D(y) = F (x, y),
∂x2 ∂x ∂y 2 ∂y
with Neumann boundary conditions. The following times are for an SGI Indy
R5000-150MHz. The grid is N × N .
ADI (ε = convergence tolerance):

N     Iterations (ε = 10⁻⁴)   Time (sec)   Iterations (ε = 10⁻⁵)   Time (sec)
65    673                     22.35        821                     27.22
129   2,408                   366.06       2,995                   456.03
Note that in both cases, the total time required for the N = 129 case is
approximately 16× that with N = 65 (∼ 4× increase in points and ∼ 4× increase
in iterations).
Multigrid:
V-cycle with ADI relaxation (no FMV to get an improved initial guess). Here the convergence criterion is evaluated between V-cycles.

N     Iterations (ε = 10⁻⁴)   Time (sec)   Iterations (ε = 10⁻⁵)   Time (sec)
65    18                      1.78         23                      2.28
129   23                      10.10        29                      12.68
Notes:
1) In both cases, the total time required for the N = 129 case is approximately
6× that with N = 65 (the minimum is 4×).
⇒ The multigrid method scales to larger grid sizes more effectively than ADI
alone, i.e. note the small increase in the number of V-cycles with increasing N .
2) The case with N = 65 is approximately 13× faster than ADI, and the case
with N = 129 is approximately 36× faster!
3) References:
1 Briggs, W.L., Henson, V.E. and McCormick, S.F., A Multigrid Tutorial, 2nd ed., SIAM (2000).
2 Thomas, J.L., Diskin, B. and Brandt, A., “Textbook Multigrid Efficiency for Fluid Simulations,” Annu. Rev. Fluid Mech. 35 (2003), pp. 317–340.
Consider the steady, two-dimensional Burgers equations

Re(u ∂u/∂x + v ∂u/∂y) = ∂²u/∂x² + ∂²u/∂y²,  (5.61)

Re(u ∂v/∂x + v ∂v/∂y) = ∂²v/∂x² + ∂²v/∂y²,  (5.62)

which represent a simplified version of the Navier-Stokes equations as there are no pressure terms. The terms on the left-hand-side are the convection terms, and those on the right-hand-side are the viscous or diffusion terms.
The Burgers equations are elliptic due to the nature of the second-order viscous terms, but the convection terms make the equations non-linear.
A simple approach to linearizing the equations is known as Picard iteration, in which we take the coefficients of the non-linear (first-derivative) terms to be known from the previous iteration, denoted by u*_{i,j}, v*_{i,j}.
Let us begin by approximating (5.61) using central differences for all derivatives as follows:

Re[u*_{i,j}(u_{i+1,j} − u_{i−1,j})/(2∆x) + v*_{i,j}(u_{i,j+1} − u_{i,j−1})/(2∆y)] = (u_{i+1,j} − 2u_{i,j} + u_{i−1,j})/(∆x)² + (u_{i,j+1} − 2u_{i,j} + u_{i,j−1})/(∆y)².

Multiplying by (∆x)² and collecting terms gives

(1 − ½Re ∆x u*_{i,j})u_{i+1,j} + (1 + ½Re ∆x u*_{i,j})u_{i−1,j} + (∆̄ − ½Re ∆x ∆̄^{1/2} v*_{i,j})u_{i,j+1} + (∆̄ + ½Re ∆x ∆̄^{1/2} v*_{i,j})u_{i,j−1} − 2(1 + ∆̄)u_{i,j} = 0,  (5.63)

where ∆̄ = (∆x/∆y)².
We can solve (5.63) using any of the iterative methods discussed except SOR (we generally need under-relaxation for non-linear problems). However, is (5.63) diagonally dominant? To be diagonally dominant we must have

|1 − p| + |1 + p| + |∆̄ − q| + |∆̄ + q| ≤ |−2(1 + ∆̄)|,

where

p = ½Re ∆x u*_{i,j},  q = ½Re ∆x ∆̄^{1/2} v*_{i,j}.

Suppose, for example, that p > 1 and q > ∆̄; then this requires that

(p − 1) + (1 + p) + (q − ∆̄) + (∆̄ + q) ≤ 2(1 + ∆̄),

2(p + q) ≤ 2(1 + ∆̄),

but with p > 1 and q > ∆̄ this condition cannot be satisfied, and equation (5.63) is not diagonally dominant. The same result holds for p < −1 and q < −∆̄. Therefore, we must have |p| ≤ 1 and |q| ≤ ∆̄, or

½Re ∆x |u*_{i,j}| ≤ 1  and  ½Re ∆x |v*_{i,j}| ≤ ∆̄^{1/2},

which places severe restrictions on the mesh size for large Re.
Upwind-Downwind Differencing
In order to restore diagonal dominance, we use forward or backward differences for
the first-derivative terms depending upon the signs of the coefficients of the first
derivative terms, i.e. the velocities. For example, consider the u∗ ∂u/∂x term:
1 If u*_{i,j} > 0, then using a backward difference,
u* ∂u/∂x = u*_{i,j}(u_{i,j} − u_{i−1,j})/∆x + O(∆x),
which gives a positive addition to the u_{i,j} term to promote diagonal dominance.
2 If u*_{i,j} < 0, then using a forward difference,
u* ∂u/∂x = u*_{i,j}(u_{i+1,j} − u_{i,j})/∆x + O(∆x),
which again gives a positive addition to the u_{i,j} term to promote diagonal dominance.
Denoting the x-direction convection and diffusion terms by T_x, the upwind-downwind scheme gives

(∆x)² T_x = { u_{i+1,j} + (1 + Re ∆x u*_{i,j})u_{i−1,j} − (2 + Re ∆x u*_{i,j})u_{i,j},  u*_{i,j} > 0,
              (1 − Re ∆x u*_{i,j})u_{i+1,j} + u_{i−1,j} − (2 − Re ∆x u*_{i,j})u_{i,j},  u*_{i,j} < 0.

In both cases the off-diagonal coefficients sum in magnitude to the diagonal coefficient:

1 + (1 + Re ∆x u*_{i,j}) = 2 + Re ∆x u*_{i,j},  u*_{i,j} > 0,
(1 − Re ∆x u*_{i,j}) + 1 = 2 − Re ∆x u*_{i,j},  u*_{i,j} < 0.
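The coefficient switching can be sketched as a small helper that returns the (∆x)²T_x coefficients for one row; `upwind_row` is a hypothetical helper written for illustration, not part of the notes.

```python
def upwind_row(Re, dx, u_star):
    # Coefficients (a, b, c) multiplying (u_{i+1,j}, u_{i,j}, u_{i-1,j}) in
    # (dx^2) T_x, with the difference direction chosen by the sign of the
    # lagged coefficient u* = u_star (upwind-downwind differencing).
    p = Re * dx * u_star
    if u_star > 0:                     # backward difference for u* du/dx
        return 1.0, -(2.0 + p), 1.0 + p
    else:                              # forward difference for u* du/dx
        return 1.0 - p, -(2.0 - p), 1.0
```

For either sign of u*, the two off-diagonal magnitudes sum exactly to the diagonal magnitude, so diagonal dominance holds for any Re ∆x.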
Notes:
1 The same is true for the y-derivative terms when sweeping along lines of
constant x.
2 Upwind-downwind differencing forces diagonal dominance; therefore, the iteration will always converge with no mesh restrictions.
3 The forward and backward differences used for the first-order derivatives are
only first-order accurate, i.e. the method is O(∆x, ∆y) accurate. To see the potential effects of this error, consider the 1-D Burgers equation
Re u du/dx = d²u/dx².  (5.64)
Recall from section 3.2 that, for example, the first-order, backward-difference
approximation to the first-order derivative is
(du/dx)_i = (u_i − u_{i−1})/∆x + (∆x/2)(d²u/dx²)_i + · · · .

Substituting into the linearized equation (5.64) gives

Re u*_i [(u_i − u_{i−1})/∆x + (∆x/2)(d²u/dx²)_i + · · ·] = (d²u/dx²)_i,

or

Re u*_i (u_i − u_{i−1})/∆x = [1 − (Re/2)∆x u*_i](d²u/dx²)_i.
Therefore, depending upon the values of Re, ∆x, and u∗ , the truncation
error from the first-derivative terms, which is not included in the numerical
solution, may be of the same order, or even larger than, the physical diffusion
term. This is often referred to as artificial or numerical diffusion, the effects
of which increase with increasing Reynolds number.
4 Remedies:
i) Can return to O(∆x2 , ∆y 2 ) accuracy using deferred correction in which we
use the approximate solution to evaluate the leading term of the truncation
error which is then added to the original discretized equation as a source term.
ii) Alternatively, we could use second-order accurate forward and backward
differences, but the resulting system of equations would no longer be
tridiagonal.
5 Note that we have linearized the difference equation, not the differential
equation, in order to obtain a linear system of algebraic equations
⇒ The physical nonlinearity is still being solved for.
Consider the general, unsteady, one-dimensional parabolic equation

∂φ/∂t = a(x, t) ∂²φ/∂x² + b(x, t) ∂φ/∂x + c(x, t)φ + d(x, t).  (6.1)
A simple model problem for this is the unsteady, 1-D diffusion equation
∂φ/∂t = α ∂²φ/∂x²,  (6.2)
where
φ = T ⇒ heat conduction
φ = u ⇒ momentum diffusion (due to viscosity)
φ = ω ⇒ vorticity diffusion
φ = c ⇒ mass diffusion (c = concentration)
Techniques developed for equation (6.2) can be used for equation (6.1).
Methods of Solution:
1 Reduce partial differential equation to a set of ordinary differential equations
and solve, e.g. method of lines, predictor-corrector, Runge-Kutta, etc...
2 Finite-difference methods:
a) Explicit methods – obtain equation for φ at each mesh point.
b) Implicit methods – obtain set of algebraic equations for φ at all mesh points at
each ∆t.
Explicit Methods ⇒ Spatial derivatives are all evaluated at previous time level(s), i.e. a single unknown φ^{n+1}_i on the left-hand-side.
Note that now the superscript n denotes the time step rather than the iteration
number.
Using a first-order forward difference for the time derivative,

∂φ/∂t = (φ^{n+1}_i − φ^n_i)/∆t + O(∆t).
Second-order, central difference for spatial derivatives at nth time level (known)
(φ^{n+1}_i − φ^n_i)/∆t = α(φ^n_{i+1} − 2φ^n_i + φ^n_{i−1})/(∆x)²,

φ^{n+1}_i = φ^n_i + [α∆t/(∆x)²](φ^n_{i+1} − 2φ^n_i + φ^n_{i−1}),

φ^{n+1}_i = (1 − 2s)φ^n_i + s(φ^n_{i+1} + φ^n_{i−1}),  i = 2, . . . , I,  (6.3)

where s = α∆t/(∆x)².
Notes:
1 Equation (6.3) is an explicit equation for φ^{n+1}_i at the (n + 1)st time step.
2 Method is second-order accurate in space and first-order accurate in time.
3 Time steps ∆t may be varied from step-to-step.
4 Restrictions on ∆t and ∆x for the Euler method applied to the 1-D diffusion equation to remain stable (see section 6.3):

s = α∆t/(∆x)² ≤ 1/2 ⇒ stable (very restrictive);
s = α∆t/(∆x)² > 1/2 ⇒ unstable.
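The stability restriction can be demonstrated directly with equation (6.3). The grid size, step counts, and the tiny high-frequency seed added to the initial condition (to excite the unstable mode quickly) are illustrative choices for this sketch.

```python
import math

def euler_step(phi, s):
    # One step of (6.3): phi_i^{n+1} = (1 - 2s) phi_i + s (phi_{i+1} + phi_{i-1}),
    # with phi = 0 held at both boundaries.
    n = len(phi)
    return [0.0] + [(1 - 2 * s) * phi[i] + s * (phi[i + 1] + phi[i - 1])
                    for i in range(1, n - 1)] + [0.0]

I = 20
def max_after(s, steps=200):
    # Mode-1 initial condition plus a tiny high-frequency (k = 19) seed.
    phi = [math.sin(math.pi * i / I) + 1e-6 * math.sin(19 * math.pi * i / I)
           for i in range(I + 1)]
    for _ in range(steps):
        phi = euler_step(phi, s)
    return max(abs(v) for v in phi)

stable = max_after(0.5)    # s <= 1/2: the solution decays
unstable = max_after(0.6)  # s > 1/2: the high-frequency error is amplified
```

At s = 0.6 the highest-frequency mode grows by a factor of roughly 1.4 per step, so even a 10⁻⁶ perturbation dominates after a few hundred steps.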
Richardson Method
We want to improve on the temporal accuracy of the Euler method; therefore, we
use a central difference for the time derivative.
Notes:
1 Second-order accurate in space and time.
2 Must keep ∆t constant and requires starting method (need φi at two time
steps).
3 Unconditionally unstable for s > 0 ⇒ Do not use.
DuFort-Frankel Method
In order to maintain second-order accuracy, but improve stability, let us modify the Richardson method by taking an average between time levels for φ^n_i. To devise such an approximation, consider the Taylor series approximation at t_{n+1} about t_n:

φ^{n+1}_i = φ^n_i + ∆t(∂φ/∂t)^n_i + (∆t²/2)(∂²φ/∂t²)^n_i + · · · .

The resulting scheme is

φ^{n+1}_i = [(1 − 2s)/(1 + 2s)]φ^{n−1}_i + [2s/(1 + 2s)](φ^n_{i+1} + φ^n_{i−1}),  i = 2, . . . , I,  (6.6)

with truncation error

−(α∆t²/∆x²)(∂²φ/∂t²)^n_i − (∆t²/6)(∂³φ/∂t³)^n_i + (α∆x²/12)(∂⁴φ/∂x⁴)^n_i + · · · .
Introduction
Real flows ⇒ small disturbances, e.g. imperfections, vibrations, etc...
Numerical solutions ⇒ small errors, e.g. truncation, round-off, etc...
Issue → What happens to small disturbances/errors as flow and/or solution
evolves in time?
Decay ⇒ stable (disturbances/errors are damped out).
Grow ⇒ unstable (disturbances/errors are amplified).
Two possible sources of instability in CFD:
1 Hydrodynamic instability – the flow itself is inherently unstable (see, for
example, Drazin & Reid)
→ This is real, i.e. physical
2 Numerical instability – the numerical algorithm magnifies small errors
→ This is not physical ⇒ Need a new method.
Difficulty → In CFD both are manifest in similar ways, i.e. oscillatory solutions;
therefore, it is often difficult to determine whether oscillatory numerical solutions
are a result of a numerical or hydrodynamic instability.
→ For an example of this, see “Supersonic Boundary-Layer Flow Over a
Compression Ramp,” Cassel, Ruban & Walker, JFM 1995, 1996.
Hydrodynamic vs. Numerical Instability:
1 Hydrodynamic stability analysis (Section 9 and MMAE 514):
Often difficult to perform.
Assumptions must often be made, e.g. parallel flow.
Can provide conclusive evidence for hydrodynamic instability (particularly if
confirmed by analytical or numerical results). For example, in supersonic flow
over a ramp, the Rayleigh and Fjørtoft theorems provide necessary conditions that
can be tested for.
2 Numerical stability analysis:
Often gives guidance, but not always conclusive for complex problems.
Note: Just because a numerical solution does not become oscillatory does not
mean that no physical instabilities are present! There may not be sufficient
resolution.
Matrix Method
Denote the exact solution of the difference equation at t = tn by φ̃ni ; then the
error is
e^n_i = φ^n_i − φ̃^n_i,  i = 2, . . . , I,  (6.7)
where φni is the approximate solution at t = tn . Consider the
first-order explicit (Euler) method given by equation (6.3), which is repeated here
φ^{n+1}_i = (1 − 2s)φ^n_i + s(φ^n_{i+1} + φ^n_{i−1}),  (6.8)
where s = α∆t/∆x². Both φ^n_i and φ̃^n_i satisfy this equation; thus, the error satisfies the same equation,

e^{n+1} = A e^n.

Thus, we perform a matrix multiply to advance each time step (cf. the matrix form for iterative methods), where

    ⎡ 1−2s    s      0    · · ·    0      0   ⎤          ⎡ e^n_2    ⎤
    ⎢   s   1−2s     s    · · ·    0      0   ⎥          ⎢ e^n_3    ⎥
A = ⎢   0     s    1−2s   · · ·    0      0   ⎥ ,  e^n = ⎢ e^n_4    ⎥
    ⎢   ⋮     ⋮      ⋮     ⋱       ⋮      ⋮   ⎥          ⎢   ⋮      ⎥
    ⎢   0     0      0    · · ·  1−2s     s   ⎥          ⎢ e^n_{I−1}⎥
    ⎣   0     0      0    · · ·    s    1−2s  ⎦          ⎣ e^n_I    ⎦
Note that if φ is specified at the boundaries, then the error is zero there, i.e. e^n_1 = e^n_{I+1} = 0.
The method is stable if the eigenvalues λ_j of the matrix A are such that |λ_j| ≤ 1 for all j.

von Neumann (Fourier) Method
Expand the error at t = 0 in a Fourier series,

e(x, 0) = Σ_{m=1}^{I−1} a_m(0) e^{iθ_m x},

where a_m(0) are the amplitudes of the Fourier modes, θ_m = mπ, and here i = √−1. At a later time t,
e(x, t) = Σ_{m=1}^{I−1} e_m(x, t) = Σ_{m=1}^{I−1} a_m(t) e^{iθ_m x}.  (6.14)
Define

G_m(x, t) = e_m(x, t)/e_m(x, t − ∆t) = a_m(t)/a_m(t − ∆t),

which is the amplification factor for the mth mode during one time step. Therefore, the error will not grow if |G_m| ≤ 1 for all m, i.e. the method is stable.
If it takes n time steps to get to time t, then the amplification after n time steps is (G_m)^n = a_m(t)/a_m(0).
For the first-order explicit (Euler) method, the error equation corresponding to (6.8) is (use index j instead of i)
e^{n+1}_j = (1 − 2s)e^n_j + s(e^n_{j+1} + e^n_{j−1}),  j = 2, . . . , I.  (6.15)

This equation is linear; therefore, each mode m must satisfy the equation independently. Thus, substituting (6.14) with a_m(t) = (G_m)^n a_m(0) into equation (6.15) gives (canceling a_m(0) in each term)

(G_m)^{n+1} e^{iθ_m x} = (1 − 2s)(G_m)^n e^{iθ_m x} + s(G_m)^n [e^{iθ_m(x+∆x)} + e^{iθ_m(x−∆x)}].

Dividing by (G_m)^n e^{iθ_m x} gives

G_m = 1 − 2s + 2s cos(θ_m ∆x) = 1 − 4s sin²(θ_m ∆x/2),

and stability requires −1 ≤ G_m ≤ 1; the right inequality holds for all s > 0.
The left inequality holds if s ≤ 1/2 (see matrix method). Thus, for this case we
obtain the same stability criterion as from the matrix method.
Note that |Gm | is the modulus; thus, if we have a complex number, |Gm | equals
the square root of the sum of the squares of the real and imaginary parts.
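The amplification factor derived above is easy to tabulate numerically. This is a sketch; the θ sample points are an arbitrary choice.

```python
import math

def G_euler(s, theta_dx):
    # Amplification factor G_m = 1 - 4 s sin^2(theta_m dx / 2) for the
    # first-order explicit (Euler) scheme applied to 1-D diffusion.
    return 1.0 - 4.0 * s * math.sin(theta_dx / 2.0) ** 2

def worst(s, samples=100):
    # Largest |G_m| over theta_m dx in [0, pi].
    return max(abs(G_euler(s, k * math.pi / samples)) for k in range(samples + 1))
```

Scanning θ_m∆x over [0, π] confirms that max|G_m| ≤ 1 exactly up to s = 1/2 and exceeds 1 beyond it, reproducing the matrix-method criterion.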
First-Order Implicit
Recall the first-order explicit method:
Central difference for spatial derivatives on the nth (previous) time level.
First-order forward difference for time derivative.
First-order implicit:
Central difference for spatial derivatives on the (n + 1)st (current) time level.
First-order backward difference for time derivative.
(φ^{n+1}_j − φ^n_j)/∆t = α(φ^{n+1}_{j+1} − 2φ^{n+1}_j + φ^{n+1}_{j−1})/∆x² + O(∆t, ∆x²).
Thus,
sφ^{n+1}_{j+1} − (1 + 2s)φ^{n+1}_j + sφ^{n+1}_{j−1} = −φ^n_j,  j = 2, . . . , I,  (6.16)

which is a tridiagonal problem for φ^{n+1}_j at the current time level. Note that it is strongly diagonally dominant.
The error satisfies the same equation,

se^{n+1}_{j+1} − (1 + 2s)e^{n+1}_j + se^{n+1}_{j−1} = −e^n_j,  (6.17)

and von Neumann analysis gives

G_m = [1 + 4s sin²(θ_m ∆x/2)]^{−1}.

Thus, the method is stable if |G_m| ≤ 1. Note that

1 + 4s sin²(θ_m ∆x/2) > 1

for all s > 0; therefore, |G_m| < 1, and the method is unconditionally stable.
Crank-Nicolson
We prefer second-order accuracy in time; therefore, consider approximating the
equation midway between time levels.
Later we will show that averaging the diffusion terms across time levels in this
manner is second-order accurate in time. Writing the difference equation in
tridiagonal form, we have
sφ^{n+1}_{i+1} − 2(1 + s)φ^{n+1}_i + sφ^{n+1}_{i−1} = −sφ^n_{i+1} − 2(1 − s)φ^n_i − sφ^n_{i−1},  i = 2, . . . , I,  (6.19)

which we solve for φ^{n+1}_i at the current time level.
Notes:
1) Second-order accurate in space and time.
2) Unconditionally stable for all s.
3) Apply derivative boundary conditions at current time level.
4) Very popular scheme for parabolic problems.
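A single Crank-Nicolson step per (6.19) can be sketched with a tridiagonal solve. Homogeneous Dirichlet boundaries and the grid/parameter choices are assumptions of this sketch.

```python
import math

def thomas(sub, diag, sup, rhs):
    # Thomas algorithm for the tridiagonal systems produced by (6.19).
    n = len(rhs)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = sup[0] / diag[0], rhs[0] / diag[0]
    for k in range(1, n):
        m = diag[k] - sub[k] * cp[k - 1]
        cp[k] = sup[k] / m
        dp[k] = (rhs[k] - sub[k] * dp[k - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for k in range(n - 2, -1, -1):
        x[k] = dp[k] - cp[k] * x[k + 1]
    return x

def cn_step(phi, s):
    # One Crank-Nicolson step (6.19); phi = 0 at both boundaries is assumed.
    n = len(phi)
    rhs = [-s * phi[i + 1] - 2 * (1 - s) * phi[i] - s * phi[i - 1]
           for i in range(1, n - 1)]
    return [0.0] + thomas([s] * (n - 2), [-2 * (1 + s)] * (n - 2),
                          [s] * (n - 2), rhs) + [0.0]

I = 32
phi0 = [math.sin(math.pi * i / I) for i in range(I + 1)]
phi1 = cn_step(phi0, s=5.0)       # s far beyond the explicit limit of 1/2
amp = max(abs(v) for v in phi1)
```

Even at s = 5 the amplitude decays, matching the analytic amplification factor (1 − 2s sin²(θ∆x/2))/(1 + 2s sin²(θ∆x/2)) for the sine mode.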
φ^{n+1/2}_i = ½(φ^{n+1}_i + φ^n_i) + T.E.  (6.20)

We seek an expression of the form φ^{n+1/2}_i = φ̃ + T.E.
Numerical Solutions of Parabolic Problems Implicit Methods
where D_t = ∂/∂t, and φ̃ is the exact value of φ midway between time levels, i.e. φ^{n+1/2}_i = φ̃ + T.E. Substituting these expansions into (6.20) gives

φ^{n+1/2}_i = ½ Σ_{k=0}^∞ (1/k!)(∆t/2)^k [1 + (−1)^k] D_t^k φ̃.

Note that

1 + (−1)^k = { 0, k = 1, 3, 5, . . . ; 2, k = 0, 2, 4, . . . }.

Therefore,

φ^{n+1/2}_i = ½(φ^{n+1}_i + φ^n_i) + O(∆t²).
This shows that averaging across time levels gives an O(∆t2 ) approximation of φi
at the mid-time level tn+1/2 . Note that this agrees with the result (6.5) except for
the constant factor (we averaged across two time levels for the DuFort-Frankel
method).
Numerical Solutions of Parabolic Problems Non-Linear Convective Problems
Outline
Outline (cont’d)
Factored ADI Method
Consider the unsteady, 1-D Burgers equation (the 1-D, unsteady diffusion equation with a convection term)

∂u/∂t = ν ∂²u/∂x² − u ∂u/∂x,  (6.21)

where ν is the viscosity. We want to consider how the nonlinear convection term is treated in the various schemes.
First-Order Explicit
Approximating spatial derivatives at the previous time level, and using a forward difference in time, we obtain an explicit scheme; stability requires that 2 ≤ Re_∆x ≤ 2/C_i, where Re_∆x = u^n_i ∆x/ν is the mesh Reynolds number. This is very restrictive.
Crank-Nicolson
Averaging across time levels,

u^{n+1/2}_i = ½(u^{n+1}_i + u^n_i) + O(∆t²).  (6.23)

Thus, this results in the implicit finite-difference equation

−(s − C_i/2)u^{n+1}_{i+1} + 2(1 + s)u^{n+1}_i − (s + C_i/2)u^{n+1}_{i−1} = (s − C_i/2)u^n_{i+1} + 2(1 − s)u^n_i + (s + C_i/2)u^n_{i−1},  (6.24)

where here C_i = u^{n+1/2}_i ∆t/∆x, but we do not know u^{n+1/2}_i yet, i.e. it is nonlinear.
Therefore, this procedure requires iteration at each time step:
1) Begin with u^{n+1/2}_i = u^n_i, i.e. use u_i from the previous time step as the initial guess at the current time step.
2) Compute the update for u^{n+1}_i, i = 1, . . . , I + 1, using equation (6.24).
3) Update u^{n+1/2}_i = ½(u^{n+1}_i + u^n_i).
4) Repeat (2) and (3) until u^{n+1}_i converges for all i.
Notes:
1) It typically requires less than ten iterations to converge at each time step; if
more are required, then the time step ∆t is too large.
2) In elliptic problems we use Picard iteration because we only care about the
final converged solution, whereas here we want an accurate solution at each
time step.
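The per-time-step iteration above can be sketched as follows. Zero Dirichlet boundaries and the parameter values are illustrative assumptions of this sketch.

```python
import math

def thomas(sub, diag, sup, rhs):
    # Thomas algorithm for the tridiagonal systems produced by (6.24).
    n = len(rhs)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = sup[0] / diag[0], rhs[0] / diag[0]
    for k in range(1, n):
        m = diag[k] - sub[k] * cp[k - 1]
        cp[k] = sup[k] / m
        dp[k] = (rhs[k] - sub[k] * dp[k - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for k in range(n - 2, -1, -1):
        x[k] = dp[k] - cp[k] * x[k + 1]
    return x

def burgers_cn_step(u, nu, dt, dx, tol=1e-8, max_iter=50):
    # One Crank-Nicolson time step (6.24) for the Burgers equation, iterating
    # on the nonlinear coefficient C_i = u_i^{n+1/2} dt/dx; u = 0 at both ends.
    n = len(u)
    s = nu * dt / dx ** 2
    u_new = u[:]                             # step 1: previous level as guess
    for it in range(max_iter):
        u_half = [0.5 * (a + b) for a, b in zip(u_new, u)]   # step 3
        C = [v * dt / dx for v in u_half]
        sub = [-(s + C[i] / 2) for i in range(1, n - 1)]
        diag = [2 * (1 + s)] * (n - 2)
        sup = [-(s - C[i] / 2) for i in range(1, n - 1)]
        rhs = [(s - C[i] / 2) * u[i + 1] + 2 * (1 - s) * u[i]
               + (s + C[i] / 2) * u[i - 1] for i in range(1, n - 1)]
        u_prev, u_new = u_new, [0.0] + thomas(sub, diag, sup, rhs) + [0.0]
        if max(abs(a - b) for a, b in zip(u_new, u_prev)) < tol:
            break                            # step 4: converged
    return u_new, it + 1

I = 32
u0 = [math.sin(math.pi * i / I) for i in range(I + 1)]
u1, niter = burgers_cn_step(u0, nu=0.1, dt=0.01, dx=1.0 / I)
```

With these parameters the iteration converges in well under ten passes, consistent with the note above.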
Upwind-Downwind Differencing
Consider the first-order convective terms in the Crank-Nicolson approximation, i.e.

u^{n+1/2} ∂u^{n+1/2}/∂x.

If u^{n+1/2} > 0, then approximate ∂u^{n+1/2}/∂x as follows:

∂u/∂x = ½[(∂u^{n+1}/∂x)_{i−1/2} + (∂u^n/∂x)_{i+1/2}] + O(∆t²)
 = ½[(u^{n+1}_i − u^{n+1}_{i−1})/∆x + (u^n_{i+1} − u^n_i)/∆x] + O(∆x², ∆t²),  (6.25)
Notes:
1) Although the finite-difference approximation at the current time level appears
to be a backward difference and that at the previous time level appears to be
a forward difference, they are really central differences evaluated at
half-points in the grid.
2) The fact that this approximation is O(∆x2 , ∆t2 ) accurate will be shown at
the end of this section.
Now, if u^{n+1/2} < 0, we have

∂u/∂x = ½[(∂u^{n+1}/∂x)_{i+1/2} + (∂u^n/∂x)_{i−1/2}] + O(∆t²)
 = ½[(u^{n+1}_{i+1} − u^{n+1}_i)/∆x + (u^n_i − u^n_{i−1})/∆x] + O(∆x², ∆t²).  (6.26)
Use (6.25) and (6.26) in equation (6.22) rather than central differences:

u^{n+1}_i − u^n_i = (s/2)[u^{n+1}_{i+1} − 2u^{n+1}_i + u^{n+1}_{i−1} + u^n_{i+1} − 2u^n_i + u^n_{i−1}]
 − (C_i/2){ u^{n+1}_i − u^{n+1}_{i−1} + u^n_{i+1} − u^n_i,  u^{n+1/2}_i > 0,
            u^{n+1}_{i+1} − u^{n+1}_i + u^n_i − u^n_{i−1},  u^{n+1/2}_i < 0. }

Collecting the unknowns on the left-hand-side gives

−su^{n+1}_{i+1} + 2(1 + s)u^{n+1}_i − su^{n+1}_{i−1} + C_i { u^{n+1}_i − u^{n+1}_{i−1},  u^{n+1/2}_i > 0,
                                                             u^{n+1}_{i+1} − u^{n+1}_i,  u^{n+1/2}_i < 0 }
 = su^n_{i+1} + 2(1 − s)u^n_i + su^n_{i−1} − C_i { u^n_{i+1} − u^n_i,  u^{n+1/2}_i > 0,
                                                   u^n_i − u^n_{i−1},  u^{n+1/2}_i < 0. }  (6.27)
Notes:
1) Equation (6.27) is diagonally dominant for all s and C_i (note that C_i may be positive or negative). Be sure to check this for different equations, i.e. for other than the one-dimensional Burgers equation.
2) Iteration at each time step may require under-relaxation on u_i; therefore,
u^{k+1}_i = ωu^{k+1/2}_i + (1 − ω)u^k_i,  k = 0, 1, 2, . . . ,
i = ωui + (1 − ω)uki , k = 0, 1, 2, . . . ,
n+1/2
Here, Dt = ∂/∂t, Dx = ∂/∂x, and ũ is the exact value of ui midway
between time levels.
We seek an expression of the form
∂u ∂ ũ
= + T.E.
∂x ∂x
Expanding each term in equation (6.28) as a 2-D Taylor series about (x_i, t_{n+1/2}):

u^{n+1}_{i+1} = Σ_{k=0}^∞ (1/k!)[(∆t/2)D_t + ∆x D_x]^k ũ,

u^{n+1}_i = Σ_{k=0}^∞ (1/k!)[(∆t/2)D_t]^k ũ,

u^n_i = Σ_{k=0}^∞ (1/k!)[−(∆t/2)D_t]^k ũ = Σ_{k=0}^∞ ((−1)^k/k!)[(∆t/2)D_t]^k ũ,

u^n_{i−1} = Σ_{k=0}^∞ (1/k!)[−(∆t/2)D_t − ∆x D_x]^k ũ = Σ_{k=0}^∞ ((−1)^k/k!)[(∆t/2)D_t + ∆x D_x]^k ũ.
Note that

1 + (−1)^{k+1} = { 0, k = 0, 2, 4, . . . ; 2, k = 1, 3, 5, . . . }.

Therefore, let k = 2l + 1; thus,

∂u/∂x = (1/∆x) Σ_{l=0}^∞ (1/(2l+1)!) {[(∆t/2)D_t + ∆x D_x]^{2l+1} − [(∆t/2)D_t]^{2l+1}} ũ.
Expanding via the binomial theorem, where the binomial coefficients are

(k m) = k!/(m!(k − m)!)  (0! = 1),

gives

∂u/∂x = (1/∆x) Σ_{l=0}^∞ (1/(2l+1)!) [Σ_{m=0}^{2l+1} (2l+1 m) ((∆t/2)D_t)^{2l+1−m} (∆x D_x)^m − ((∆t/2)D_t)^{2l+1}] ũ

 = (1/∆x) Σ_{l=0}^∞ (1/(2l+1)!) [Σ_{m=1}^{2l+1} (2l+1 m) ((∆t/2)D_t)^{2l−m+1} (∆x D_x)^m + ((∆t/2)D_t)^{2l+1} − ((∆t/2)D_t)^{2l+1}] ũ

 = Σ_{l=0}^∞ Σ_{m=1}^{2l+1} [1/(m!(2l − m + 1)!)] ((∆t/2)D_t)^{2l−m+1} (∆x)^{m−1} D_x^m ũ.

The l = 0 term is simply ∂ũ/∂x; therefore,

∂u/∂x = ∂ũ/∂x + Σ_{l=1}^∞ Σ_{m=1}^{2l+1} [1/(m!(2l − m + 1)!)] ((∆t/2)D_t)^{2l−m+1} (∆x)^{m−1} D_x^m ũ.
Thus, the leading truncation-error terms are O(∆t², ∆t∆x, ∆x²). The loss of accuracy (the mixed ∆t∆x term) is due to the diagonal averaging across time levels.
Note:
1 Method is unconditionally stable.
Consider the unsteady, 2-D diffusion equation approximated using the first-order explicit method:

(φ^{n+1}_{i,j} − φ^n_{i,j})/∆t = α[(φ^n_{i+1,j} − 2φ^n_{i,j} + φ^n_{i−1,j})/(∆x)² + (φ^n_{i,j+1} − 2φ^n_{i,j} + φ^n_{i,j−1})/(∆y)²] + O(∆t, ∆x², ∆y²).

Therefore, solving for the only unknown gives the explicit expression

φ^{n+1}_{i,j} = (1 − 2s_x − 2s_y)φ^n_{i,j} + s_x(φ^n_{i+1,j} + φ^n_{i−1,j}) + s_y(φ^n_{i,j+1} + φ^n_{i,j−1}),  (6.31)

where s_x = α∆t/(∆x)² and s_y = α∆t/(∆y)². For stability with s_x = s_y = s, we require

s ≤ 1/4,

which is even more restrictive than for the 1-D diffusion equation, where s ≤ 1/2 for stability.
For the corresponding first-order implicit method (spatial derivatives evaluated at the (n + 1)st level):
Notes:
1 Unconditionally stable for all s_x and s_y.
2 Crank-Nicolson could be used to obtain second-order accuracy in time. It
produces a similar implicit equation, but with more terms on the
right-hand-side, i.e. evaluated at the previous time step.
3 Produces a banded matrix (with five unknowns) that is difficult to solve
efficiently.
4 Alternatively, we could split each time step into two half steps, called ADI
with time splitting, resulting in two sets of tridiagonal problems per time step:
Step 1: Solve implicitly for terms associated with one coordinate direction.
Step 2: Solve implicitly for terms associated with other coordinate direction.
Step 1 is

(φ^{n+1/2}_{i,j} − φ^n_{i,j})/(∆t/2) = α[(φ^{n+1/2}_{i+1,j} − 2φ^{n+1/2}_{i,j} + φ^{n+1/2}_{i−1,j})/(∆x)² + (φ^n_{i,j+1} − 2φ^n_{i,j} + φ^n_{i,j−1})/(∆y)²].  (6.33)

Therefore,

½s_x φ^{n+1/2}_{i+1,j} − (1 + s_x)φ^{n+1/2}_{i,j} + ½s_x φ^{n+1/2}_{i−1,j} = −½s_y φ^n_{i,j+1} − (1 − s_y)φ^n_{i,j} − ½s_y φ^n_{i,j−1}.  (6.34)

The tridiagonal problems (6.34) are solved for φ^{n+1/2}_{i,j}, i = 1, . . . , I + 1, j = 1, . . . , J + 1, at the intermediate time level.
Therefore,

½s_y φ^{n+1}_{i,j+1} − (1 + s_y)φ^{n+1}_{i,j} + ½s_y φ^{n+1}_{i,j−1} = −½s_x φ^{n+1/2}_{i+1,j} − (1 − s_x)φ^{n+1/2}_{i,j} − ½s_x φ^{n+1/2}_{i−1,j}.  (6.36)

The tridiagonal problems (6.36) are solved for φ^{n+1}_{i,j}, i = 1, . . . , I + 1, j = 1, . . . , J + 1, at the current time level.
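One full time step of (6.34) and (6.36) can be sketched as follows. Homogeneous Dirichlet boundaries and equal s_x = s_y are assumed for brevity in this sketch.

```python
import math

def thomas(sub, diag, sup, rhs):
    # Thomas algorithm for the tridiagonal systems of each half step.
    n = len(rhs)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = sup[0] / diag[0], rhs[0] / diag[0]
    for k in range(1, n):
        m = diag[k] - sub[k] * cp[k - 1]
        cp[k] = sup[k] / m
        dp[k] = (rhs[k] - sub[k] * dp[k - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for k in range(n - 2, -1, -1):
        x[k] = dp[k] - cp[k] * x[k + 1]
    return x

def adi_step(phi, sx, sy):
    # One time step of ADI with time splitting: (6.34) implicit in x, then
    # (6.36) implicit in y; homogeneous Dirichlet boundaries are assumed.
    n = len(phi)
    half = [row[:] for row in phi]
    for j in range(1, n - 1):            # step 1 (6.34)
        rhs = [-0.5 * sy * phi[i][j + 1] - (1 - sy) * phi[i][j]
               - 0.5 * sy * phi[i][j - 1] for i in range(1, n - 1)]
        sol = thomas([0.5 * sx] * (n - 2), [-(1 + sx)] * (n - 2),
                     [0.5 * sx] * (n - 2), rhs)
        for i in range(1, n - 1):
            half[i][j] = sol[i - 1]
    new = [row[:] for row in half]
    for i in range(1, n - 1):            # step 2 (6.36)
        rhs = [-0.5 * sx * half[i + 1][j] - (1 - sx) * half[i][j]
               - 0.5 * sx * half[i - 1][j] for j in range(1, n - 1)]
        sol = thomas([0.5 * sy] * (n - 2), [-(1 + sy)] * (n - 2),
                     [0.5 * sy] * (n - 2), rhs)
        for j in range(1, n - 1):
            new[i][j] = sol[j - 1]
    return new

I = 16
phi0 = [[math.sin(math.pi * i / I) * math.sin(math.pi * j / I)
         for j in range(I + 1)] for i in range(I + 1)]
phi1 = adi_step(phi0, 3.0, 3.0)   # s = 3 is far beyond the explicit limit
amp = max(abs(phi1[i][j]) for i in range(I + 1) for j in range(I + 1))
```

For the product sine mode the two half steps give the amplification factor [(1 − a)/(1 + a)]² with a = s(1 − cos(π/I)), which is below one for any s, illustrating the unconditional stability noted below.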
Notes:
1 Method is O(∆t2 , ∆x2 , ∆y 2 ).
2 Requires boundary conditions at the intermediate time level n + 1/2 for
equation (6.34).
For example, if the boundary condition at x = 0 is Dirichlet, φ^n_{1,j} = a^n_j, then eliminating the x-derivative terms between the two half steps gives

φ^{n+1/2}_{i,j} = ½(φ^n_{i,j} + φ^{n+1}_{i,j}) + ¼s_y[(φ^n_{i,j+1} − 2φ^n_{i,j} + φ^n_{i,j−1}) − (φ^{n+1}_{i,j+1} − 2φ^{n+1}_{i,j} + φ^{n+1}_{i,j−1})].

Applying this equation at the boundary x = 0 leads to

φ^{n+1/2}_{1,j} = ½(a^n_j + a^{n+1}_j) + ¼s_y[(a^n_{j+1} − 2a^n_j + a^n_{j−1}) − (a^{n+1}_{j+1} − 2a^{n+1}_j + a^{n+1}_{j−1})].

This provides the boundary condition for φ_{1,j} at the intermediate (n + 1/2) time level. Note that the first term on the right-hand-side is the average of a at the n and n + 1 time levels, the second term is proportional to ∂²a^n/∂y², and the third term is proportional to ∂²a^{n+1}/∂y².
3 For stability, apply von Neumann analysis at each half step and take the
product of the resulting amplification factors, G1 and G2 , to obtain G for the
full time step.
⇒ Method is unconditionally stable for all sx and sy .
4 In 3-D, we require three fractional steps (∆t/3) for each time step, and the
method is only conditionally stable, where
s_x, s_y, s_z ≤ 3/2.
Consider the Crank-Nicolson approximation of the 2-D diffusion equation,

(φ_{i,j}^{n+1} − φ_{i,j}^n)/∆t = (α/2)(δ_x²φ_{i,j}^{n+1} + δ_x²φ_{i,j}^n + δ_y²φ_{i,j}^{n+1} + δ_y²φ_{i,j}^n),

where δ² represents the second-order central difference operators (as in section 5.7).
Rewriting the difference equation with the unknowns on the left-hand-side and the
knowns on the right leads to

[1 − (1/2)α∆t(δ_x² + δ_y²)]φ_{i,j}^{n+1} = [1 + (1/2)α∆t(δ_x² + δ_y²)]φ_{i,j}^n.   (6.38)
In the factored form (6.39), the first factor only involves the difference operator in
the x-direction, and the second factor only involves the difference operator in the
y-direction. The factored operator produces an extra term as compared to the
unfactored operator,

(1/4)α²(∆t)² δ_x² δ_y²,

which is O(∆t²). Therefore, the factorization (6.39) is consistent with the
second-order accuracy in time of the Crank-Nicolson approximation.
Notes:
1 Similar to the ADI method with time splitting, but have an intermediate
n+1/2
variable φ̂i,j rather than half time step φi,j .
Factored ADI is somewhat faster; it only requires one evaluation of the spatial
derivatives on the right-hand-side per time step (for equation (6.41)) rather
than two for the ADI method (see equations (6.34) and (6.36)).
2 Method is O(∆t2 , ∆x2 , ∆y 2 ) accurate and is unconditionally stable (even for
3-D implementation of unsteady diffusion equation).
3 Requires boundary conditions for the intermediate variable φ̂i,j to solve
(6.41). These are obtained from equation (6.40) applied at the boundaries
(see Fletcher, section 8.4.1).
4 The order of solution can be reversed, i.e. we could define

φ̂_{i,j} = [1 − (1/2)α∆t δ_x²] φ_{i,j}^{n+1}

instead of (6.40).
The Navier-Stokes equations in dimensional (starred) variables are

ρ(∂V*/∂t* + V*·∇V*) = −∇p* + µ∇²V*,

which are nondimensionalized using

(x, y, z) = (x*, y*, z*)/L,  t = t*/(L/U),  V = V*/U,  p = p*/(ρU²).

In 2-D, the momentum equations are then

∂u/∂t + u ∂u/∂x + v ∂u/∂y = −∂p/∂x + (1/Re)(∂²u/∂x² + ∂²u/∂y²),   (7.1)

∂v/∂t + u ∂v/∂x + v ∂v/∂y = −∂p/∂y + (1/Re)(∂²v/∂x² + ∂²v/∂y²).   (7.2)
Continuity:

∂u/∂x + ∂v/∂y = 0.   (7.3)
Thus, we have three coupled equations for three dependent variables
u(x, y, t), v(x, y, t) and p(x, y, t), which we refer to as primitive variables.
Therefore, the system is closed mathematically:
Given v(x, y, t) and p(x, y, t), we can determine u(x, y, t) from equation
(7.1).
Given u(x, y, t) and p(x, y, t), we can determine v(x, y, t) from equation
(7.2).
But how do we determine p(x, y, t) given that p does not appear in equation
(7.3)?
⇒ Need equation for p(x, y) in terms of u(x, y) and v(x, y) at time t.
To obtain such an equation, take the divergence of the momentum equation in
vector form, i.e. ∇ · (N S), which in 2-D is equivalent to taking ∂/∂x of equation
(7.1), ∂/∂y of equation (7.2) and adding.
Doing so gives

∂²p/∂x² + ∂²p/∂y² = −[∂²u/∂t∂x + (∂u/∂x)² + u ∂²u/∂x² + (∂v/∂x)(∂u/∂y) + v ∂²u/∂x∂y
    + ∂²v/∂t∂y + (∂u/∂y)(∂v/∂x) + u ∂²v/∂x∂y + (∂v/∂y)² + v ∂²v/∂y²]
    + (1/Re)[∂³u/∂x³ + ∂³u/∂x∂y² + ∂³v/∂x²∂y + ∂³v/∂y³].
Or

∂²p/∂x² + ∂²p/∂y² = −[∂/∂t(∂u/∂x + ∂v/∂y) + u ∂/∂x(∂u/∂x + ∂v/∂y) + v ∂/∂y(∂u/∂x + ∂v/∂y)
    + (∂u/∂x)² + (∂v/∂y)² + 2(∂v/∂x)(∂u/∂y)]
    + (1/Re)[∂²/∂x²(∂u/∂x + ∂v/∂y) + ∂²/∂y²(∂u/∂x + ∂v/∂y)].

By the continuity equation (7.3), ∂u/∂x + ∂v/∂y = 0, so every term containing the
divergence vanishes, and

(∂u/∂x)² + (∂v/∂y)² = (∂u/∂x + ∂v/∂y)² − 2(∂u/∂x)(∂v/∂y) = −2(∂u/∂x)(∂v/∂y).
Substituting gives

∂²p/∂x² + ∂²p/∂y² = 2[(∂u/∂x)(∂v/∂y) − (∂v/∂x)(∂u/∂y)],   (7.4)
which is a Poisson equation for pressure with u(x, y) and v(x, y) known from
solutions of equations (7.1) and (7.2), respectively.
Notes:
1 The unsteady momentum equations (7.1) and (7.2) are parabolic in time.
⇒ May be solved using Crank-Nicolson, ADI, Factored-ADI, etc....
2 The pressure equation (7.4) and steady forms of (7.1) and (7.2) are elliptic.
⇒ Can be solved using cyclic reduction, Gauss-Seidel, ADI, multigrid, etc....
Velocity Boundary Conditions (u_s and u_n are velocity components tangential and
normal to the surface, respectively):
Surface: u_s = u_n = 0 (no slip and impermeability)
Inflow: u_s and u_n specified
Outflow: ∂u_s/∂n = ∂u_n/∂n = 0 (fully-developed flow)
Symmetry: ∂u_s/∂n = 0, u_n = 0 (no flow through symmetry plane)
Note that the domain must be sufficiently long for the fully-developed outflow
boundary condition to be valid.
Pressure Boundary Conditions at a Surface:
From the momentum equations (7.1) and (7.2) with u = v = 0:

n = x ⇒ ∂p/∂x = (1/Re) ∂²u/∂x²

n = y ⇒ ∂p/∂y = (1/Re) ∂²v/∂y²
Other boundary conditions for pressure obtained similarly.
Observe that we have Neumann boundary conditions on pressure at solid surfaces.
Outline
u = ∂ψ/∂y,  v = −∂ψ/∂x,   (7.5)

such that the continuity equation (7.3) is identically satisfied. Lines of constant ψ
are called streamlines and are everywhere tangent to the local velocity vectors.
The vorticity ω(x, y, t) in 2-D is defined by

ω = ∂v/∂x − ∂u/∂y,   (7.6)

and measures the local rate of rotation of fluid particles, with the sign
corresponding to the right-hand-rule. Note that in general 3-D flows vorticity is a
vector

ω = ∇ × V.
To eliminate pressure, take ∂/∂x of equation (7.2) minus ∂/∂y of equation (7.1).
Doing so gives

∂/∂t(∂v/∂x − ∂u/∂y) + u ∂/∂x(∂v/∂x − ∂u/∂y) + v ∂/∂y(∂v/∂x − ∂u/∂y)
    + (∂v/∂x)(∂u/∂x + ∂v/∂y) − (∂u/∂y)(∂u/∂x + ∂v/∂y)
    = (1/Re)[∂²/∂x²(∂v/∂x − ∂u/∂y) + ∂²/∂y²(∂v/∂x − ∂u/∂y)].

But from the continuity equation (7.3), ∂u/∂x + ∂v/∂y = 0, and from the definition
of vorticity (7.6), ∂v/∂x − ∂u/∂y = ω; therefore, this reduces to the
vorticity-transport equation

∂ω/∂t + u ∂ω/∂x + v ∂ω/∂y = (1/Re)(∂²ω/∂x² + ∂²ω/∂y²).   (7.7)
The vorticity and streamfunction may be related by substituting (7.5) into (7.6)
to obtain

∂²ψ/∂x² + ∂²ψ/∂y² = −ω,   (7.8)
which is a Poisson equation for ψ(x, y, t) if ω(x, y, t) is known.
Notes:
1 Equations (7.7) and (7.8) are coupled equations for ω and ψ (with
u = ∂ψ/∂y, v = −∂ψ/∂x in (7.7)).
2 The vorticity-transport equation (7.7) is parabolic when unsteady and elliptic
when steady. The streamfunction equation (7.8) is elliptic.
3 The pressure terms have been eliminated, i.e. there is no need to calculate
the pressure in order to advance the solution in time.
→ Can compute p(x, y, t) from equation (7.4) if desired.
4 The vorticity-streamfunction formulation consists of two equations for the
two unknowns ω(x, y, t) and ψ(x, y, t) (cf. primitive variables formulation
with three equations for three unknowns in 2-D).
5 Unlike the primitive variables formulation, the ω–ψ formulation does not
easily extend to 3-D:
Three components of vorticity ⇒ Three vorticity equations.
Stretching and tilting terms in 3-D vorticity equations.
Cannot define streamfunction in 3-D ⇒ Vorticity-velocity potential
formulation.
6 We do not have straightforward boundary conditions for vorticity (see next
section).
⇒ Because of notes (3) and (4) (notwithstanding (6)), this is the preferred
formulation for 2-D, incompressible flows.
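To make the coupling in note 1 concrete, here is a hypothetical sketch (not code from the notes) of the two building blocks of a vorticity-streamfunction solver: a point Gauss-Seidel iteration for the Poisson equation (7.8), and an explicit FTCS-type step of the vorticity-transport equation (7.7) with u = ∂ψ/∂y and v = −∂ψ/∂x from central differences. Vorticity boundary treatment (see the next section) is omitted:

```python
import numpy as np

def poisson_gs(psi, omega, h, iters=2000):
    """Gauss-Seidel sweeps for psi_xx + psi_yy = -omega, equation (7.8),
    with Dirichlet values of psi held fixed on the boundary."""
    for _ in range(iters):
        for i in range(1, psi.shape[0] - 1):
            for j in range(1, psi.shape[1] - 1):
                psi[i, j] = 0.25 * (psi[i + 1, j] + psi[i - 1, j]
                                    + psi[i, j + 1] + psi[i, j - 1]
                                    + h * h * omega[i, j])
    return psi

def vorticity_step(omega, psi, h, dt, Re):
    """One explicit step of (7.7); u = psi_y, v = -psi_x by central differences."""
    u = (psi[1:-1, 2:] - psi[1:-1, :-2]) / (2 * h)
    v = -(psi[2:, 1:-1] - psi[:-2, 1:-1]) / (2 * h)
    conv = (u * (omega[2:, 1:-1] - omega[:-2, 1:-1]) / (2 * h)
            + v * (omega[1:-1, 2:] - omega[1:-1, :-2]) / (2 * h))
    lap = (omega[2:, 1:-1] - 2 * omega[1:-1, 1:-1] + omega[:-2, 1:-1]
           + omega[1:-1, 2:] - 2 * omega[1:-1, 1:-1] + omega[1:-1, :-2]) / h**2
    new = omega.copy()
    new[1:-1, 1:-1] += dt * (-conv + lap / Re)
    return new
```

A full solver alternates the two: advance ω one time step, then re-solve (7.8) for ψ, then recompute u and v.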
On CD:

v = 0 ⇒ ∂ψ/∂x = 0 ⇒ ψ = 0,

u = 1 ⇒ ∂ψ/∂y = 1.

∴ ψ and ∂ψ/∂n specified (n = y).

At solid boundaries, therefore, the streamfunction and its normal derivative are
specified. Note that we have two boundary conditions on the streamfunction,
where only one is needed.
Thom’s Method
Consider, for example, the lower boundary AB (y = 0). Throughout the domain
we have

∂²ψ/∂x² + ∂²ψ/∂y² = −ω.

However, along AB ψ = 0; therefore, ∂²ψ/∂x² = 0, and

ω_w = −∂²ψ/∂y² |_{y=0}.   (7.9)

We also have

u = ∂ψ/∂y = 0 on y = 0.

For generality, consider a moving wall with

∂ψ/∂y |_{y=0} = u_w(x) = g(x).   (7.10)

Approximating (7.10) with a central difference about the wall (j = 1) gives

(ψ_{i,2} − ψ_{i,0})/(2∆y) = g_i + O(∆y²);

therefore,

ψ_{i,0} = ψ_{i,2} − 2∆y g_i + O(∆y³).   (7.11)

Substituting into a central-difference approximation of (7.9) then yields

ω_{i,1} = −[2/(∆y)²][ψ_{i,2} − ψ_{i,1} − ∆y g_i] + O(∆y).   (7.13)
We then use the most recent iterate for streamfunction ψ to obtain a Dirichlet
boundary condition for vorticity ω.
Note: The truncation error for Thom’s method is only O(∆y). However, it
exhibits second-order convergence (Huang & Wetton 1996).
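In code, Thom's condition (7.13) is a one-liner. This minimal sketch uses a hypothetical array layout in which column j = 0 holds the wall values ψ_{i,1} of the notes and column j = 1 the first interior row ψ_{i,2}:

```python
import numpy as np

def thom_wall_vorticity(psi, dy, g):
    """Dirichlet wall vorticity from Thom's condition (7.13).
    psi[:, 0] is the wall row, psi[:, 1] the first interior row,
    and g the tangential wall velocity g(x) (zero for a fixed wall)."""
    return -2.0 / dy**2 * (psi[:, 1] - psi[:, 0] - dy * g)
```

For the quadratic ψ = y²/2 next to a fixed wall (g = 0), the exact wall vorticity is ω_w = −ψ_yy = −1, which the formula reproduces exactly since the truncation error involves higher derivatives of ψ.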
Jensen’s Method
We would like a method that is O(∆y 2 ) overall; therefore, consider using an
O(∆y 3 ) approximation for (7.10)
We then use (7.14) in place of (7.13) to obtain a Dirichlet boundary condition for
vorticity.
Notes:
1 For discussion of boundary conditions on vorticity and streamfunction at
inlets and outlets, see Fletcher, Vol. II, pp. 380–381.
2 Treatment of vorticity (and pressure) boundary conditions is still an active
area of research and debate (see, for example, Rempfer 2003).
Until now, we have considered methods for solving single equations; but in fluid
dynamics we must solve systems of coupled equations, such as the Navier-Stokes
equations in the primitive-variables or vorticity-streamfunction formulations.
Two methods for treating coupled equations numerically:
1 Sequential Solution:
Solve, i.e. iterate on, each equation for its dominant variable, treating the
other variables as known, i.e. use most recent values.
Requires one pass through mesh for each equation at each iteration.
⇒ Most common and easiest to implement.
2 Simultaneous (or Coupled) Solution:
Combine coupled equations into a single system of algebraic equations.
⇒ If we have n dependent variables (e.g. 2-D primitive variables ⇒ n = 3 (u, v, p)),
this produces an n × n block tridiagonal system of equations that is solved for all
the dependent variables simultaneously.
See, for example, S. P. Vanka, J. Comput. Phys., Vol. 65, pp. 138–158 (1986).
Sequential Method
Consider for example the primitive-variables formulation.
Steady Problems:
Note:
May require underrelaxation for convergence due to nonlinearity.
Unsteady Problems:
Notes:
Outer loop for time marching.
Inner loop to obtain solution of coupled equations at current time step.
⇒ Trade-off: Generally, reducing ∆t reduces number of inner loop iterations. (If
∆t small enough, no iteration is necessary).
Thus far we have used what are called “uniform, collocated grids.”
Uniform ⇒ Grid spacings in each direction ∆x and ∆y are uniform.
Collocated ⇒ All dependent variables are approximated at the same point.
In the following two sections we consider alternatives to uniform and collocated
grids.
For example, let us consider how each of the terms in the x-momentum equation
(8.1) is approximated (using central differences) on a staggered grid:

u ∂u/∂x ≈ u_{i,j} (u_{i+1,j} − u_{i−1,j})/(2∆x) + O(∆x²),

v ∂u/∂y ≈ (1/4)(v_{i,j} + v_{i+1,j} + v_{i,j−1} + v_{i+1,j−1}) (u_{i,j+1} − u_{i,j−1})/(2∆y) + O(∆y²),

∂p/∂x ≈ (p_{i+1,j} − p_{i,j})/∆x + O(∆x²).

The terms in equations (8.2) and (8.3) are treated in a similar manner, with each
approximated at its respective location in the staggered grid.
In many flows, e.g. those involving boundary layers, the solution has local regions
of intense gradients.
Therefore, a fine grid is necessary to resolve the flow near boundaries, but the
same resolution is not necessary in the remainder of the domain. As a result, a
uniform grid would waste computational resources where they are not needed.
Alternatively, a non-uniform grid would allow us to refine the grid where it is
needed.
Let us obtain the finite-difference approximations using Taylor series as before, but
without assuming all ∆x's are equal. For example, consider the first-derivative
term ∂φ/∂x.
Applying Taylor series at x_{i−1} and x_{i+1} with ∆x_i ≠ ∆x_{i+1} and solving for ∂φ/∂x
leads to

∂φ/∂x = (φ_{i+1} − φ_{i−1})/(∆x_i + ∆x_{i+1})
    − [(∆x_{i+1}² − ∆x_i²)/(2(∆x_i + ∆x_{i+1}))] ∂²φ/∂x²|_i
    − [(∆x_{i+1}³ + ∆x_i³)/(6(∆x_i + ∆x_{i+1}))] ∂³φ/∂x³|_i + ··· .
If the grid is uniform, i.e. ∆xi = ∆xi+1 , then the second term vanishes, and the
approximation reduces to the usual O(∆x2 )-accurate central difference
approximation for the first derivative. However, for a non-uniform grid, the
truncation error is only O(∆x).
We could restore second-order accuracy by including an appropriate approximation
to ∂²φ/∂x²|_i from the second term in the difference formula. As one can imagine,
however, this gets very complicated, and it is difficult to ensure consistent accuracy
for all approximations.
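The first-order error term can be seen numerically. This small check (not from the notes) differentiates e^x with unequal neighbouring spacings h and 1.5h; as h is halved, the error ratios come out near 2, i.e. O(∆x) rather than O(∆x²):

```python
import numpy as np

def central_nonuniform(f, x_im1, x_i, x_ip1):
    """(phi_{i+1} - phi_{i-1}) / (dx_i + dx_{i+1}) on a non-uniform grid."""
    dxi = x_i - x_im1
    dxip1 = x_ip1 - x_i
    return (f(x_ip1) - f(x_im1)) / (dxi + dxip1)

errs = []
for h in (0.1, 0.05, 0.025):
    # made-up test: f = exp about x = 1, with spacings h (left) and 1.5h (right)
    approx = central_nonuniform(np.exp, 1.0 - h, 1.0, 1.0 + 1.5 * h)
    errs.append(abs(approx - np.e))   # exact derivative of exp at 1 is e
```

With equal spacings the same function would show error ratios near 4 (second order).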
where (x, y) are the variables in the physical domain, and (ξ, η) are the variables
in the computational domain such that ξ = ξ(x, y) and η = η(x, y). To transform
back to the physical domain, we regard x = x(ξ, η) and y = y(ξ, η).
3 We may want to enforce orthogonality of the grid (see Fletcher pp. 97-100).
y_b = f(x) = H₁ + [(H₂ − H₁)/L] x,

so that

∂η/∂y = 1/f(x) = 1/(H₁ + [(H₂ − H₁)/L] ξ).
See figure 2 for plots of these metrics, which show that they are smooth.
Let us transform the governing equations using the transformation laws (8.4) and
(8.5):

∂/∂x = (∂ξ/∂x) ∂/∂ξ + (∂η/∂x) ∂/∂η = ∂/∂ξ − η [f′(ξ)/f(ξ)] ∂/∂η,

∂/∂y = (∂ξ/∂y) ∂/∂ξ + (∂η/∂y) ∂/∂η = [1/f(ξ)] ∂/∂η.

For example, the convection term becomes

u ∂u/∂x = u(ξ, η) [∂u/∂ξ − η (f′(ξ)/f(ξ)) ∂u/∂η].
y = Re^{−1/2} Y,  0 ≤ Y ≤ ∞.
We want a transformation that maps the semi-infinite domain into a finite one
and clusters points near the surface. One possibility is
ξ = x,  η = (2/π) tan⁻¹(Y/a).   (8.9)

The transformation laws become

∂/∂x = ∂/∂ξ,

∂/∂Y = (∂ξ/∂Y) ∂/∂ξ + (∂η/∂Y) ∂/∂η = Γ(η) ∂/∂η,

where

Γ(η) = [1/(πa)] [1 + cos(πη)].
Then

∂²/∂Y² = Γ(η) ∂/∂η [Γ(η) ∂/∂η] = Γ²(η) ∂²/∂η² + Γ(η)Γ′(η) ∂/∂η.
Notes:
1 Algebraic methods move the complexity, i.e. complex boundaries and/or
non-uniform grids, to the equations themselves.
Physical domain ⇒ “simple” equations; complex geometry and grid.
Computational domain ⇒ simple geometry and grid; “complex” equations.
2 Computational overhead is typically relatively small for algebraic methods, i.e.
there are no additional equations to solve.
3 It is easy to cluster grid points in the desired regions of the domain; however,
it is necessary to know where to cluster the grid points a priori.
4 Must choose the algebraic transformation ahead of time, i.e. before solving
the problem (cf. variational grid generation).
5 It is difficult to handle complex geometries.
The velocity potential and streamfunction both satisfy the Laplace equation, with

z = x + iy,  ζ = ξ + iη.

This approach uses conformal mapping (see Fletcher, vol. II, pp. 89-96) and
is good for two-dimensional flows in certain types of geometries.
2) Solve a boundary-value problem to generate the grid, i.e. elliptic grid
generation.
In order to control grid clustering, known functions involving sources and sinks
may be added to the right-hand-side of equations (8.11)
where P and Q contain exponential functions. Note that equations (8.12) are for
ξ = ξ(x, y) and η = η(x, y).
We want to solve (8.12) in the computational domain (ξ, η) to obtain the grid
transformations x = x(ξ, η) and y = y(ξ, η). In addition, we must transform the
governing equation(s), e.g. Navier-Stokes, to the computational domain.
Therefore, we seek the transformation laws for (x, y) → (ξ, η).
For
ξ = ξ(x, y), η = η(x, y),
the total differentials are
∂ξ ∂ξ ∂η ∂η
dξ = dx + dy, dη = dx + dy,
∂x ∂y ∂x ∂y
or in matrix form

[dξ; dη] = [ξ_x  ξ_y; η_x  η_y] [dx; dy].   (8.13)

Similarly, for the inverse transformation from (ξ, η) to (x, y), we have that

[dx; dy] = [x_ξ  x_η; y_ξ  y_η] [dξ; dη].

We can solve the latter expression for [dξ; dη] by multiplying by the inverse:

[dξ; dη] = [x_ξ  x_η; y_ξ  y_η]⁻¹ [dx; dy] = (1/J) [y_η  −x_η; −y_ξ  x_ξ] [dx; dy],

where J = x_ξ y_η − x_η y_ξ is the Jacobian determinant of the transformation.
and

∂²/∂y² = (∂²ξ/∂y²) ∂/∂ξ + (∂²η/∂y²) ∂/∂η + (∂ξ/∂y)² ∂²/∂ξ²
    + (∂η/∂y)² ∂²/∂η² + 2(∂ξ/∂y)(∂η/∂y) ∂²/∂ξ∂η,   (8.15)

where

g₁₁ = (∂x/∂ξ)² + (∂y/∂ξ)²,

g₁₂ = (∂x/∂ξ)(∂x/∂η) + (∂y/∂ξ)(∂y/∂η),

g₂₂ = (∂x/∂η)² + (∂y/∂η)².

Note that the coefficients in the equations are defined such that

g = |g₁₁  g₁₂; g₂₁  g₂₂| = g₁₁g₂₂ − g₁₂² = J²,  (g₁₂ = g₂₁).
7) References:
For more elliptic grid generation options, see Knupp, P. and Steinberg, S.,
“Fundamentals of Grid Generation,” CRC Press (1994), who consider
smoothness, Winslow and TTM methods.
Thompson, J. F., Warsi, Z. U. A. and Mastin, C. W., “Numerical Grid
Generation - Foundation and Applications,” North Holland (1985).
8) The ultimate in grid generation is adaptive grid methods, in which the grid
“adapts” to local features of the solution as it is computed; see, for example,
variational grid generation.
Physical domain: a ≤ x ≤ b
Computational domain: 0 ≤ ξ ≤ 1
where xi = x(ξi ), xi+1 = x(ξi+1 ) and φi+1/2 = (φ(ξi ) + φ(ξi+1 ))/2. For a given
weight function φ(ξ), we want to minimize S subject to the end conditions
x1 = a, xI+1 = b.
Dividing the above expression by ∆ξ (= constant) gives

S/∆ξ = Σ_{i=1}^{I} (∆x_i/∆ξ)² [∆ξ/(2φ_{i+1/2})],

whose continuum limit has the integrand

F = F(ξ, x, x_ξ) = (1/2) x_ξ²/φ.
Euler’s equation is given by

∂F/∂x − d/dξ(∂F/∂x_ξ) = 0,

which here reduces to

−d/dξ(x_ξ/φ) = 0,   (8.18)

∴ x_ξξ − (φ_ξ/φ) x_ξ = 0,  (φ > 0).   (8.19)
Integrating (8.18) once gives

dx/dξ = C φ(ξ).

Writing this expression in discrete form yields

∆x_i/∆ξ = C φ((ξ_{i+1} + ξ_i)/2).
Therefore, we see that this requires that ∆xi be proportional to φi+1/2 , where the
proportionality constant is C∆ξ.
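The statement ∆x_i ∝ φ_{i+1/2} suggests a simple way to construct a grid from a given weight: accumulate the weight and rescale to the interval endpoints. The following sketch (function name and the sample weight function are made up) does exactly that:

```python
import numpy as np

def equidistributed_grid(phi, a, b, n):
    """Discrete equidistribution: dx_i proportional to phi(xi_{i+1/2})."""
    xi = np.linspace(0.0, 1.0, n + 1)
    xi_mid = 0.5 * (xi[:-1] + xi[1:])
    dx = phi(xi_mid)                       # dx_i proportional to the weight
    x = np.concatenate(([0.0], np.cumsum(dx)))
    return a + (b - a) * x / x[-1]         # enforce x(0) = a, x(1) = b

# made-up weight: small near xi = 0, so the grid is finest there
x = equidistributed_grid(lambda s: 0.2 + s, 0.0, 1.0, 20)
```

Each spacing divided by its midpoint weight is the same constant C∆ξ, which is precisely the equidistribution property.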
Note that the weight function φ(ξ) has been expressed in the computational
domain, and it is not necessarily clear how it should be chosen. Conceptually, we
prefer to think in terms of physical weight functions, say w(x). In that case, the
grid spacing is determined by a physical variable or one of its derivatives giving
rise to feature-adaptive grid generation.
Taking the weight in the physical domain with φ = w²(x), the integrand becomes

F = F(ξ, x, x_ξ) = (1/2) x_ξ²/w²(x).
Therefore,

∂F/∂x = −(w_x/w³) x_ξ²,  ∂F/∂x_ξ = x_ξ/w²,

and Euler’s equation gives

−(w_x/w³) x_ξ² − d/dξ(x_ξ/w²) = 0,

(w_x/w³) x_ξ² + x_ξξ/w² − 2(w_x/w³) x_ξ² = 0,

x_ξξ − (w_x/w) x_ξ² = 0.

By the chain rule,

d/dx = (dξ/dx) d/dξ = (1/x_ξ) d/dξ  ⇒  w_x = w_ξ/x_ξ.

Substituting gives

x_ξξ − (w_ξ/w) x_ξ = 0,

which is of the same form as (8.19) (this is why we set φ = w²), except that now
w(x) is a physical weight function. The boundary conditions are

x(0) = a,  x(1) = b.
One-Dimensional Illustration:
Consider the one-dimensional, steady convection-diffusion equation

u_xx − c u_x = 0,  a ≤ x ≤ b,  u(a) = 0,  u(b) = 1,   (8.22)

which has the exact solution

u(x) = (e^{c(x−a)} − 1)/(e^{c(b−a)} − 1).
The governing equation (8.22) and the grid equation must be transformed into
the computational domain. In one dimension, the transformation laws are

d/dx = (dξ/dx) d/dξ = (1/x_ξ) d/dξ,

d²/dx² = (1/x_ξ) d/dξ[(1/x_ξ) d/dξ] = (1/x_ξ²) d²/dξ² − (x_ξξ/x_ξ³) d/dξ.

Equation (8.22) then becomes

(1/x_ξ²) u_ξξ − (x_ξξ/x_ξ³) u_ξ − c (1/x_ξ) u_ξ = 0,

or

u_ξξ − (x_ξξ/x_ξ + c x_ξ) u_ξ = 0,   (8.23)
with boundary conditions
u(0) = 0, u(1) = 1.
To solve (8.23) for the velocity u(ξ) requires the grid x(ξ). Here we will use the
grid equation (8.21) (corresponding to the physical weighted-length functional
(8.20))
x_ξξ − (w_ξ/w) x_ξ = 0,   (8.24)
with the boundary conditions x(0) = a, x(1) = b.
→ How to choose w(x)?
A common choice for a feature-adaptive weight function is based on the gradient
of the dependent variable, of the form

w(x) = 1/√(1 + ε² u_x²),  0 < w ≤ 1,   (8.25)

so that

u_x small ⇒ w ≈ 1,

u_x large ⇒ w ≈ 1/(ε|u_x|),  (|u_x| ↑ ⇒ w ↓ ⇒ ∆x ↓).
For use in the grid equation (8.24), we need the weight function (8.25) in terms of
the computational coordinate ξ:

w(x(ξ)) = 1/√(1 + (ε²/x_ξ²) u_ξ²).   (8.26)
Numerical Procedure:
The governing equation (8.23) and the grid equation (8.24) (with (8.27)),
expressed in the computational ξ-plane, are discretized with central differences.
For (8.23) this gives the tridiagonal system

A_i u_{i−1} + B_i u_i + C_i u_{i+1} = D_i,  i = 2, . . . , I,

where X_i denotes the discrete value of the coefficient x_ξξ/x_ξ + c x_ξ at node i, and

A_i = 1 + (∆ξ/2) X_i,
B_i = −2,
C_i = 1 − (∆ξ/2) X_i,
D_i = 0.
We obtain a similar tridiagonal problem for the grid equation (8.29).
where we now have two weight functions φ(ξ, η) > 0 and ψ(ξ, η) > 0. This
produces a grid for which the lengths of the coordinate lines are proportional to
the weight functions, i.e.

√g₁₁ = √(x_ξ² + y_ξ²) = K₁ φ(ξ, η),

√g₂₂ = √(x_η² + y_η²) = K₂ ψ(ξ, η).
Notes:
1) If φ = ψ = c, then the Euler equations are Laplace equations.

Euler equations:

(J x_η/φ)_ξ − (J x_ξ/φ)_η = 0,

(J y_η/φ)_ξ − (J y_ξ/φ)_η = 0.
Orthogonality Functional:
The grid is orthogonal if g₁₂ = x_ξ x_η + y_ξ y_η = 0; therefore, the orthogonality
functional is

I_O[x, y] = (1/2) ∫₀¹∫₀¹ g₁₂² dξ dη,

such that g₁₂ is minimized in a least squares sense (without a weight
function).
Euler equations:

(g₁₂ x_η)_ξ + (g₁₂ x_ξ)_η = 0,

(g₁₂ y_η)_ξ + (g₁₂ y_ξ)_η = 0.
Combination Functionals:
We can form combinations of the above functionals. For example, consider
the area-orthogonality functional

I_AO[x, y] = (1/2) ∫₀¹∫₀¹ (J² + g₁₂²)/φ dξ dη = (1/2) ∫₀¹∫₀¹ g₁₁g₂₂/φ dξ dη,

where the equality follows from g₁₁g₂₂ = J² + g₁₂².
Notes:
1) Just as in the one-dimensional case, it is generally preferable to define weight
functions in the physical domain, i.e. w(x, y), rather than in the
computational domain, i.e. φ(ξ, η).
2) The grid x(ξ, η), y(ξ, η) can be obtained by solving the Euler equations (most
common) or the variational form directly.
3) All of the two-dimensional functionals above have been written in the form

I[x, y] = (1/2) ∫₀¹∫₀¹ F(ξ, η, x, y, x_ξ, y_ξ, x_η, y_η) dξ dη.

These are called contravariant functionals (see Knupp and Steinberg section
8.5 and chapter 11).
∂u/∂t = α ∂²u/∂x²,  0 ≤ x ≤ ℓ,   (9.1)
with the boundary conditions
Exact Solution
Let us begin by using the method of separation of variables to obtain an exact
solution for equation (9.1). We separate the variables according to
u(x, t) = φ(x)ψ(t), so that

φ(x) dψ/dt = α ψ(t) d²φ/dx².

Dividing by αφψ moves everything depending upon t to the left-hand-side and
everything depending upon x to the right-hand-side:

[1/(αψ)] dψ/dt = (1/φ) d²φ/dx² = λ = −µ².
Because the x and t dependence can be separated in this way, both sides of the
equation must be equal to a constant, say λ. Positive and zero λ produce only the
trivial solution for the boundary conditions given, so we consider the case where
λ = −µ2 < 0.
Therefore, the partial differential equation (9.1) is converted into two ordinary
differential equations,

d²φ/dx² + µ²φ = 0,   (9.5)

dψ/dt + αµ²ψ = 0,   (9.6)

the first of which is a differential eigenproblem. The solution to equation (9.5) is

φ(x) = c₁ cos(µx) + c₂ sin(µx).
The boundary condition u(0, t) = 0 requires that φ(0) = 0, which requires that
c1 = 0. From the boundary condition u(`, t) = 0, we must have φ(`) = 0, which
requires that sin(µn `) = 0. Therefore,
µ_n = nπ/ℓ,  n = 1, 2, 3, . . . ,
or

Σ_{n=1}^{∞} c_n φ_n(x) = f(x).
Taking the inner product of φm (x) with both sides, the only non-vanishing term
(due to orthogonality of the eigenfunctions φn (x)) occurs when m = n, giving
Therefore,

c_n = ⟨f(x), φ_n(x)⟩ / ‖φ_n(x)‖² = (2/ℓ) ∫₀^ℓ f(x) sin(nπx/ℓ) dx,  n = 1, 2, 3, . . . ,   (9.11)
which are the Fourier sine coefficients of f (x). Thus, the exact solution to
(9.1)–(9.3) is given by equation (9.10) with (9.11).
Numerical Solution
Now let us consider solving the differential eigenproblem (9.5), with boundary
conditions (9.2) numerically. Using central differences, the differential equation
becomes
(φ_{i+1} − 2φ_i + φ_{i−1})/(∆x)² = λφ_i.
Thus, for i = 2, . . . , I (φ₁ = 0, φ_{I+1} = 0), in matrix form we have

    [ −2   1   0  ···   0   0 ] [ φ₂      ]            [ φ₂      ]
    [  1  −2   1  ···   0   0 ] [ φ₃      ]            [ φ₃      ]
    [  0   1  −2  ···   0   0 ] [ φ₄      ]            [ φ₄      ]
    [  :   :   :   ⋱    :   : ] [  :      ]  = (∆x)²λ  [  :      ]
    [  0   0   0  ···  −2   1 ] [ φ_{I−1} ]            [ φ_{I−1} ]
    [  0   0   0  ···   1  −2 ] [ φ_I     ]            [ φ_I     ]

or

Aφ = λ̄φ.   (9.12)
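A quick numerical check of (9.12) (a sketch with made-up parameters): the eigenvalues of A = tridiag(1, −2, 1) are known in closed form, and for the lowest modes λ̄/(∆x)² approaches the exact eigenvalues λ_n = −µ_n² = −(nπ/ℓ)² of (9.5):

```python
import numpy as np

l = 1.0
I = 64
dx = l / I
m = I - 1                      # interior points i = 2, ..., I
A = (np.diag(-2.0 * np.ones(m)) + np.diag(np.ones(m - 1), 1)
     + np.diag(np.ones(m - 1), -1))
lam_bar = np.sort(np.linalg.eigvalsh(A))[::-1]           # least negative first
n = np.arange(1, m + 1)
exact_Abar = -4.0 * np.sin(n * np.pi * dx / (2 * l))**2  # closed-form eigenvalues of A
```

The closed form shows that λ̄_n = −4 sin²(nπ∆x/(2ℓ)) ≈ −(∆x)²(nπ/ℓ)² when n∆x/ℓ is small, i.e. the discrete spectrum matches the continuous one only for the well-resolved modes.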
The standard method for numerically determining the eigenvalues and eigenvectors
of a matrix is based on QR decomposition, which entails performing a series of
similarity transformations. This is the approach used by the built-in Mathematica
and Matlab functions Eigenvalues[]/ Eigenvectors[] and eig(), respectively.
Similarity Transformation
Consider the eigenproblem
Ax = λx, (9.13)
where A is a real, square matrix. Suppose that Q is an orthogonal matrix such
that Q⁻¹ = Qᵀ. Let us consider the transformation

B = QᵀAQ.   (9.14)
Applying B to Qᵀx and using QQᵀ = I gives

BQᵀx = QᵀAQQᵀx = QᵀAx = Qᵀλx = λQᵀx;

therefore,

By = λy,   (9.15)

where

y = Qᵀx  (x = Qy).   (9.16)
Decompose

A₀ = Q₀R₀,   (9.17)

and form the reverse product

A₁ = R₀Q₀.   (9.18)

Because Q₀ᵀA₀ = Q₀ᵀQ₀R₀ = R₀, it follows that

A₁ = Q₀ᵀA₀Q₀,   (9.19)

i.e. A₁ is similar to A₀ and has the same eigenvalues. Repeating this process,

A_{k+1} = R_kQ_k,  k = 0, 1, 2, . . . ,   (9.20)

produces the sequence of similar matrices A₀, A₁, A₂, . . ., which converges to a
(block) triangular matrix with the eigenvalues on its diagonal.
Plane Rotations
Consider the n × n transformation matrix P comprised of the identity matrix with
only four elements changed, in the pth and qth rows and columns, according to
y = Px. (9.21)
y1 = cx1 + sx2 ,
y2 = −sx1 + cx2 ,
or

[y₁; y₂] = [cos φ  sin φ; −sin φ  cos φ] [x₁; x₂].
This transformation rotates the vector x through an angle φ to obtain y. Note
that y = Pᵀx rotates the vector x through an angle −φ.
Thus, in the general n-D case (9.21), P rotates the vector x through an angle φ
in the xp xq -plane.
Notes:
1 The transformation matrix P is orthogonal, i.e. Pᵀ = P⁻¹.
2 We can generalize to rotate a set of vectors, i.e. a matrix, by taking
Y = PX.
3 The angle φ may be chosen with one of several objectives in mind. For
example,
i) To zero all elements below (or to the right of) a specified element, e.g.
yT = [y1 y2 · · · yj 0 · · · 0].
Householder transformation (reflection) – efficient for dense matrices.
ii) To zero a single element, e.g. yp or yq (see equations (9.22) and (9.23)).
Givens transformation (rotation) – efficient for sparse, structured (e.g. banded)
matrices.
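A minimal sketch of objective (ii): choose c = cos φ and s = sin φ directly from the vector components so that the rotation zeroes element y_q while preserving the norm (the sample vector and indices below are made up):

```python
import numpy as np

def givens(x, p, q):
    """Plane-rotation matrix P chosen so that (P @ x)[q] = 0."""
    r = np.hypot(x[p], x[q])
    c, s = x[p] / r, x[q] / r
    P = np.eye(len(x))
    P[p, p] = c;  P[p, q] = s
    P[q, p] = -s; P[q, q] = c
    return P

x = np.array([3.0, 1.0, 4.0, 1.0])   # made-up vector
P = givens(x, 0, 2)
y = P @ x                            # y[2] becomes zero; the norm is unchanged
```

Applying a sequence of such rotations to annihilate the subdiagonal entries of a matrix is exactly how a Givens-based QR decomposition is assembled.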
Qᵀ = P_m ··· P₂P₁.
Notes:
1 The QR decomposition (9.24) and (9.25) is obtained from a series of plane
(Givens or Householder) rotations.
2 Givens transformations are most efficient for large, sparse, structured
matrices.
→ Configure to only zero elements that are not already zero.
3 There is a “fast Givens transformation” for which the P matrices are not
orthogonal, but the QR decompositions can be obtained two times faster
than in the standard Givens transformation illustrated in “QRmethod.nb.”
4 Convergence of the iterative QR method may be accelerated using shifting
(see, for example, Numerical Recipes, section 11.3).
5 The operation count for the QR method per iteration is as follows:
Dense matrix → O(n3 ) ⇒ Very expensive.
Hessenberg matrix → O(n2 )
Tridiagonal matrix → O(n).
Thus, the most efficient procedure is as follows:
i) Transform A to a similar tridiagonal or Hessenberg form if A is symmetric or
non-symmetric, respectively.
→ This is done using a series of similarity transformations based on Householder
rotations for dense matrices or Givens rotations for sparse matrices.
ii) Use iterative QR method to obtain eigenvalues of tridiagonal or Hessenberg
matrix.
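The basic (unshifted) iteration (9.17)–(9.20) is short to write down. This sketch builds a symmetric test matrix with known, well-separated eigenvalues (all values are made up) and watches A_k approach diagonal form; a practical implementation would add the Hessenberg reduction and shifting described above:

```python
import numpy as np

rng = np.random.default_rng(1)
Q0, _ = np.linalg.qr(rng.standard_normal((5, 5)))
A = Q0 @ np.diag([1.0, 2.0, 3.0, 4.0, 5.0]) @ Q0.T   # known spectrum {1,...,5}
Ak = A.copy()
for _ in range(200):
    Qk, Rk = np.linalg.qr(Ak)   # A_k = Q_k R_k, as in (9.17)
    Ak = Rk @ Qk                # A_{k+1} = R_k Q_k, as in (9.20)
diag = np.sort(np.diag(Ak))     # approaches the eigenvalues of A
```

Convergence of the off-diagonal entries is governed by ratios of adjacent eigenvalue magnitudes, which is why shifting is used in practice to accelerate it.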
Arnoldi Method
The Arnoldi method has been developed to treat situations in which we only need
a small number of eigenvalues of a large sparse matrix:
1 The iterative QR method described in the previous section is the general
approach used to obtain the full spectrum of eigenvalues of a dense matrix.
2 As we saw in the 1-D unsteady diffusion example, and as we will see when we
evaluate hydrodynamic stability, we often seek the eigenvalues of large sparse
matrices.
3 In addition, we often do not require the full spectrum of eigenvalues in
stability problems as we only seek the “least stable mode.”
⇒ We would like an efficient algorithm that determines a subset of the full
spectrum of eigenvalues (and possibly eigenvectors) of a sparse matrix.
Suppose we seek the largest k eigenvalues (by magnitude) of the large sparse
n × n matrix A, where k ≪ n. Given an arbitrary n-D vector q₀, we define the
Krylov subspace by

K_k(A, q₀) = span{q₀, Aq₀, A²q₀, . . . , A^{k−1}q₀}.
At each step i = 2, . . . , k:
→ An n × i orthonormal matrix Q is produced that forms an orthonormal basis
for the Krylov subspace Ki (A, q0 ).
→ Using the projection matrix Q, we transform A to produce an i × i
Hessenberg matrix H (or tridiagonal for symmetric A), which is an
orthogonal projection of A onto the Krylov subspace Ki .
→ The eigenvalues of H, sometimes called the Ritz eigenvalues, approximate
the largest i eigenvalues of A.
The approximations of the eigenvalues improve as each step is incorporated, and
we obtain the approximation of one additional eigenvalue.
Notes:
1 Because k ≪ n, we only require the determination of eigenvalues of
Hessenberg matrices that are no larger than k × k as opposed to the original
n × n matrix A.
2 Although the outcome of each step depends upon the starting Arnoldi vector
q0 used, the procedure converges to the correct eigenvalues of matrix A.
3 The more sparse the matrix A is, the smaller k can be to obtain a good
approximation of the largest k eigenvalues of A.
4 When applied to symmetric matrices, the Arnoldi method reduces to the
Lanczos method.
5 A shift and invert approach can be incorporated to determine the k
eigenvalues close to a specified part of the spectrum rather than that with
the largest magnitude.
For example, it can be designed to determine the k eigenvalues with the
largest real or imaginary part.
6 When seeking a set of eigenvalues in a particular portion of the full spectrum,
it is desirable that the starting Arnoldi vector q0 be in (or ‘nearly’ in) the
subspace spanned by the eigenvectors corresponding to the sought after
eigenvalues.
As the Arnoldi method progresses, we get better approximations of the desired
eigenvectors that can then be used to form a more desirable starting vector.
This is known as the implicitly restarted Arnoldi method and is based on the
implicitly-shifted QR decomposition method.
Restarting also reduces storage requirements by keeping k small.
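A bare-bones Arnoldi sketch (not a library implementation — ARPACK's implicitly restarted version is what is used in practice): k steps build an orthonormal basis Q of K_k(A, q₀) and the Hessenberg projection H, whose Ritz values approximate the dominant eigenvalues. The test matrix is a made-up diagonal example:

```python
import numpy as np

def arnoldi(A, q0, k):
    """k Arnoldi steps: columns of Q span K_k(A, q0); H is the Hessenberg projection."""
    n = len(q0)
    Q = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    Q[:, 0] = q0 / np.linalg.norm(q0)
    for i in range(k):
        v = A @ Q[:, i]
        for j in range(i + 1):            # modified Gram-Schmidt
            H[j, i] = Q[:, j] @ v
            v = v - H[j, i] * Q[:, j]
        H[i + 1, i] = np.linalg.norm(v)
        Q[:, i + 1] = v / H[i + 1, i]
    return Q[:, :k], H[:k, :k]

rng = np.random.default_rng(2)
n = 100
A = np.diag(np.arange(1.0, n + 1))        # made-up matrix, eigenvalues 1..100
Q, H = arnoldi(A, rng.standard_normal(n), 60)
ritz = np.sort(np.linalg.eigvals(H).real)[::-1]   # Ritz values
```

The largest Ritz values converge first, which is exactly the behavior exploited when only the extreme (or least stable) part of the spectrum is needed.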
Ax = λBx.
∂u/∂t + u ∂u/∂x + v ∂u/∂y = −∂p/∂x + (1/Re)(∂²u/∂x² + ∂²u/∂y²),   (9.27)

∂v/∂t + u ∂v/∂x + v ∂v/∂y = −∂p/∂y + (1/Re)(∂²v/∂x² + ∂²v/∂y²).   (9.28)
We denote the solution to (9.26)–(9.28), i.e. the base flow, by u0 (x, y, t),
v0 (x, y, t) and p0 (x, y, t), and seek the behavior of small perturbations to this
base flow.
⇒ If the amplitude of the small perturbations grows, the flow is
hydrodynamically unstable.
∂û/∂x + ∂v̂/∂y = 0,   (9.30)

∂û/∂t + u₀ ∂û/∂x + v₀ ∂û/∂y + û ∂u₀/∂x + v̂ ∂u₀/∂y = −∂p̂/∂x + (1/Re)(∂²û/∂x² + ∂²û/∂y²),   (9.31)

∂v̂/∂t + u₀ ∂v̂/∂x + v₀ ∂v̂/∂y + û ∂v₀/∂x + v̂ ∂v₀/∂y = −∂p̂/∂y + (1/Re)(∂²v̂/∂x² + ∂²v̂/∂y²).   (9.32)
Because ε is small, we neglect O(ε²) terms. Thus, the evolution of the
disturbances is governed by the linearized Navier-Stokes (LNS) equations
(9.30)–(9.32), where the base flow is known.
⇒ Linear Stability Theory
In principle, we could impose a perturbation û, v̂, p̂ at any time ti and track its
evolution in time and space to determine if the flow is stable to the imposed
perturbation. To fully characterize the stability of the base flow, however, would
require many calculations of the LNS equations with different perturbation
“shapes” imposed at different times.
We can formulate a more manageable stability problem by doing one or both of
the following:
1 Consider simplified base flows.
2 Impose “well-behaved” perturbations.
⇒ Sine wave with wavenumber α and phase velocity cr , i.e. normal mode.
If ci > 0, the amplitude of the perturbation grows unbounded as t → ∞ with
growth rate αci .
3 Because equations (9.30)–(9.32) are linear, each normal mode with
wavenumber α may be considered independently of one another (cf. von
Neumann numerical stability analysis).
⇒ For a given mode α, we are looking for the eigenvalue (wavespeed) with the
fastest growth rate.
For steady, parallel base flow (9.33) and (9.34), the Navier-Stokes equations
(9.26)–(9.28) reduce to (from equation (9.27))

d²u₀/dy² = Re ∂p₀/∂x,   (9.36)

where Re p₀′(x) is a constant for Poiseuille flow. The disturbance equations
(9.30)–(9.32) become
∂û/∂x + ∂v̂/∂y = 0,   (9.37)

∂û/∂t + u₀ ∂û/∂x + v̂ du₀/dy = −∂p̂/∂x + (1/Re)(∂²û/∂x² + ∂²û/∂y²),   (9.38)

∂v̂/∂t + u₀ ∂v̂/∂x = −∂p̂/∂y + (1/Re)(∂²v̂/∂x² + ∂²v̂/∂y²).   (9.39)
Notes:
1 For a given base flow u₀(y), Reynolds number Re, and wavenumber α, the
Orr-Sommerfeld equation is a differential eigenproblem of the form

L₁v₁ = cL₂v₁,

where the coefficients in the discretized operators are

P(y_j) = u₀(y_j) − 2αi/Re,

Q(y_j) = α³i/Re − α²u₀(y_j) − u₀′′(y_j).
v = v′ = 0  at  y = a, b.   (9.48)

From v′ = 0 at y = a (j = 1),

(v₂ − v₀)/(2∆y) = 0  ⇒  v₀ = v₂.

Substituting into the difference equation for j = 2 results in

(C + A₂)v₂ + B₂v₃ + Cv₄ = c(Āv₂ + B̄v₃).   (9.49)
Similarly, for j = J
2 The least stable mode, i.e. that with the fastest growth rate αci, is given by
α max(ci).
Methods of Solution:
1 Convert the generalized eigenproblem (9.45) to a regular eigenproblem by
multiplying both sides by the inverse of N. That is, find the eigenvalues from

    det(N⁻¹M − cI) = 0.
This requires inverting a large matrix, and although M and N are typically
sparse and banded, N−1 M is a full, dense matrix.
⇒ This would require use of a general approach, such as the iterative QR
method, to determine the eigenvalues of a large, dense matrix.
2 In order to avoid solving the large matrix problem that results from the BVP,
traditionally the shooting method for IVPs has been used.
→ This approach avoided the need to find the eigenvalues of large matrices in the
days when computers were not capable of such large calculations.
→ In addition, it allowed for use of well-developed algorithms for IVPs.
→ However, this is like “using a hammer to drive a screw.”
3 Solve the generalized eigenproblem

    Mv = cNv,

where M and N are large, sparse matrices, for the least stable mode. In
addition to the fact that the matrices are sparse, in stability contexts such as
this, we only need the least stable mode, not the entire spectrum of
eigenvalues. Recall that the least stable mode is that with the largest
imaginary part.
→ Currently, the state-of-the-art in such situations is the Arnoldi method
discussed in the last section.
Note that N must be positive definite for use in the Arnoldi method. That is,
it must have all positive eigenvalues. In our case, this requires us to take the
negatives of the matrices M and N as defined above.
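As a small illustration of method (1), the sketch below forms N⁻¹M and picks the eigenvalue with the largest imaginary part. The matrices here are tiny diagonal placeholders chosen so the answer is known by inspection, not an actual discretized Orr-Sommerfeld operator; for the large sparse matrices that arise in practice one would instead use an Arnoldi-based routine such as scipy.sparse.linalg.eigs.

```python
import numpy as np

def least_stable_mode(M, N):
    """Solve the generalized eigenproblem M v = c N v by forming
    inv(N) @ M (method 1 above) and return the eigenvalue c with the
    largest imaginary part, i.e. the least stable mode."""
    c, V = np.linalg.eig(np.linalg.inv(N) @ M)
    k = np.argmax(c.imag)   # fastest growth ~ largest imaginary part
    return c[k], V[:, k]

# Hypothetical diagonal matrices: the eigenvalues are simply M_ii / N_ii,
# so the least stable mode here is c = 1 + 2i.
M = np.diag([1.0 + 2.0j, 3.0 + 1.0j, 2.0 - 0.5j])
N = np.diag([1.0 + 0.0j, 1.0, 2.0])
c, v = least_stable_mode(M, N)
```

Note how forming inv(N) @ M destroys sparsity, which is exactly the drawback the notes point out for large problems.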
Such a flow is parallel; therefore, the base flow is a solution of equation (9.36).
The solution is a parabolic velocity profile.

[Figure: stability diagram; vertical axis spanning 0.80–1.05 versus Reynolds
number spanning 5000–10 000.]
Outline
In LES we filter the velocity and pressure fields so they only contain large-scale
components, i.e. a local average of the complete fields:

    ūi(xi) = ∫ G(x, x′) ui(x′) dx′,    (10.1)

    ui(xi, t) = ūi(xi, t) + u′i(xi, t),    p(xi, t) = p̄(xi, t) + p′(xi, t),

where ūi(xi, t) is the resolvable-scale velocity (computed), and u′i(xi, t) is the
subgrid-scale (SGS) velocity.
The filtered Navier-Stokes equations are (in tensor notation)

    ∂ūi/∂xi = 0,
                                                            (10.2)
    ∂ūi/∂t + ∂\overline{ui uj}/∂xj = −(1/ρ) ∂p̄/∂xi + ν∇²ūi.
The first term on the right-hand side is computed from the resolved scales, and
the remaining terms, τij = the SGS Reynolds stress, must be modeled, i.e. they
contain u′i and u′j.
Therefore, the subgrid scale (SGS) model must specify τij as a function of the
resolvable variables (ūi , ūj ), and it provides for energy transfer between the
resolvable scales and the SGS.
The earliest and most common SGS model is the Smagorinsky model, which is an
eddy viscosity model (effective viscosity due to small-scale turbulent motion).
Thus, the SGS Reynolds stress τij increases transport and dissipation.
    τij = (1/3) τkk δij + 2µt S̄ij,    (10.4)

where

    S̄ij = (1/2)(∂ūi/∂xj + ∂ūj/∂xi) = strain rate for resolved field,

    µt = CS² ρ ∆² |S̄|.

Here, CS is the model parameter, which must be specified, and |S̄| = (S̄ij S̄ij)¹ᐟ².
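A minimal sketch of the Smagorinsky eddy viscosity above for a 2-D resolved field on a uniform grid. The function name, the choice CS = 0.17 (a commonly quoted value), and the filter width ∆ = √(dx·dy) are my assumptions, not specified by the notes; |S̄| follows the definition (S̄ij S̄ij)¹ᐟ² given here.

```python
import numpy as np

def smagorinsky_mu_t(u, v, dx, dy, rho=1.0, Cs=0.17):
    """Smagorinsky SGS eddy viscosity mu_t = Cs^2 * rho * Delta^2 * |S|
    for a 2-D resolved field (u, v) on a uniform grid; Cs = 0.17 and
    Delta = sqrt(dx*dy) are illustrative choices."""
    dudx = np.gradient(u, dx, axis=0)
    dudy = np.gradient(u, dy, axis=1)
    dvdx = np.gradient(v, dx, axis=0)
    dvdy = np.gradient(v, dy, axis=1)
    # Resolved strain-rate tensor components S_ij
    S11, S22 = dudx, dvdy
    S12 = 0.5 * (dudy + dvdx)
    Smag = np.sqrt(S11**2 + S22**2 + 2.0 * S12**2)  # |S| = (S_ij S_ij)^(1/2)
    Delta = np.sqrt(dx * dy)
    return Cs**2 * rho * Delta**2 * Smag
```

For a uniform shear u = γy, v = 0 this gives the constant value Cs²ρ∆²·γ/√2, which is a quick sanity check on any implementation.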
Notes:
1) The Smagorinsky model only accounts for energy transfer from large to small
scales.
2) The Smagorinsky model does not work well near boundaries (eddy viscosity is
much smaller and flow is more anisotropic).
3) Points (1) and (2) can be improved upon with dynamic SGS models:
Allow model parameter CS to vary with space and time, i.e. it is computed
from the resolvable flow field.
Automatically adjusts SGS parameter for anisotropic flow and flow near walls.
Allows for backscatter (µt < 0), which accounts for energy transferred from
small scales to large scales.
Active area of ongoing research.
Outline
In RANS we compute only the mean flow and model all of the turbulence:

    u(xi, t) = ū(xi) + u′(xi, t),

where ū(xi) is the mean flow, and u′(xi, t) are the turbulent fluctuations (not the
same as u′ in LES).
To obtain the mean flow, we use Reynolds averaging:
Steady mean flow ⇒ time average the Navier-Stokes equations:

    ū(xi) = lim_{T→∞} (1/T) ∫₀ᵀ u(xi, t) dt,    (10.6)

    ∂ū/∂x + ∂v̄/∂y = 0,

    ρ(∂ū/∂t + ū ∂ū/∂x + v̄ ∂ū/∂y) = −∂p̄/∂x + ∂/∂x(µ ∂ū/∂x − ρ\overline{u′u′}) + ∂/∂y(µ ∂ū/∂y − ρ\overline{u′v′}),

    ρ(∂v̄/∂t + ū ∂v̄/∂x + v̄ ∂v̄/∂y) = −∂p̄/∂y + ∂/∂x(µ ∂v̄/∂x − ρ\overline{u′v′}) + ∂/∂y(µ ∂v̄/∂y − ρ\overline{v′v′}),
                                                            (10.7)

where ρ\overline{u′u′}, ρ\overline{u′v′}, ρ\overline{v′v′} are the Reynolds stresses. Equations (10.7) are three
equations for five unknowns (ū, v̄, p̄, u′, v′); therefore, we have a closure problem.
Closure is achieved by relating the Reynolds stresses to the mean flow quantities
through a turbulence model.
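The Reynolds decomposition and stresses above can be illustrated on a sampled velocity signal at a single point; this is a schematic example on a synthetic record (the function name and signal are mine), not part of any solver.

```python
import numpy as np

def reynolds_decompose(u, v):
    """Split sampled velocities u(t), v(t) into mean + fluctuation and
    return the means and the time-averaged product u'v' (the kinematic
    Reynolds shear stress, up to the factor -rho)."""
    ubar, vbar = u.mean(), v.mean()
    up, vp = u - ubar, v - vbar          # fluctuations u'(t), v'(t)
    return ubar, vbar, (up * vp).mean()

# Synthetic record: perfectly correlated fluctuations on a steady mean.
t = 2.0 * np.pi * np.arange(1000) / 1000.0
u = 1.0 + np.sin(t)
v = 2.0 + np.sin(t)
ubar, vbar, uv = reynolds_decompose(u, v)
```

Here the fluctuations are identical, so the averaged product u′v′ is mean(sin²) = 1/2, while the means recover 1 and 2; uncorrelated fluctuations would instead average toward zero.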
k–ε Turbulence Model:
The k–ε model is the most common turbulence model used in applications, but
there are many others.
Define:

    k = (1/2) \overline{u′i u′i} = turbulent kinetic energy,

    ε = ν \overline{(∂u′i/∂xj)(∂u′i/∂xj)} = rate of dissipation of turbulent energy.
Here, k and ε are determined from a solution of the following coupled equations,
which are derived from the Navier-Stokes equations:

    ρ Dk/Dt = ∂/∂xj[(µT/σk) ∂k/∂xj] + µT(∂ui/∂xj + ∂uj/∂xi) ∂ui/∂xj − ρε,    (10.8)

    ρ Dε/Dt = ∂/∂xj[(µT/σε) ∂ε/∂xj] + (C1 µT ε/k)(∂ui/∂xj + ∂uj/∂xi) ∂ui/∂xj − ρC2 ε²/k.    (10.9)

The terms on the left-hand side represent the transport of k and ε, and the first,
second and third terms on the right-hand side represent diffusion, production and
dissipation, respectively, of k and ε. The eddy viscosity is

    µT = Cµ ρ k²/ε.    (10.10)

The eddy viscosity µT is used to relate the Reynolds stresses to the mean flow
quantities through

    −ρ \overline{u′i u′j} = µT(∂ūi/∂xj + ∂ūj/∂xi) − (2/3) ρ k δij.    (10.11)
Notes:
1) Equations (10.8)–(10.10) involve five constants, Cµ, C1, C2, σk, σε, which
must be determined empirically.
2) Special treatment is necessary at solid boundaries; therefore, wall functions
that are usually based on the log-law are used for turbulent boundary layers.
3) RANS produces less detailed information about the flow than DNS and LES,
but requires far more modest computational resources. Thus, it is well suited
to engineering applications and is used by most commercial CFD codes.
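The closure step in equations (10.10)–(10.11) is simple enough to sketch directly: given k and ε, compute µT and then a Reynolds shear stress. The constants below are the commonly quoted standard values; the function names are illustrative.

```python
# Commonly quoted standard k-epsilon constants (determined empirically,
# as noted above).
C_MU, C1, C2, SIGMA_K, SIGMA_EPS = 0.09, 1.44, 1.92, 1.0, 1.3

def eddy_viscosity(rho, k, eps):
    """Equation (10.10): mu_T = C_mu * rho * k^2 / eps."""
    return C_MU * rho * k**2 / eps

def reynolds_shear_stress(mu_t, dudy, dvdx):
    """Off-diagonal (i != j) component of (10.11):
    -rho*u'v' = mu_T*(du/dy + dv/dx);
    the (2/3)*rho*k*delta_ij term vanishes for i != j."""
    return mu_t * (dudy + dvdx)
```

For example, ρ = 1, k = 2, ε = 0.5 gives µT = 0.09·4/0.5 = 0.72, which a mean shear then converts into a Reynolds stress via the Boussinesq relation.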
Outline
11 Parallel Computing
Introduction
Glossary:
• Supercomputer – the fastest computers available at a particular time.
• MIPS – Millions of Instructions Per Second; outdated measure of computer
  performance (architecture dependent).
• Floating point operation – an operation, e.g. addition, subtraction,
  multiplication or division, on one or more floating point numbers (non-integers).
• FLOPS – Floating Point Operations Per Second; current measure of
  computer performance for numerical applications.
      MFLOPS  MegaFLOPS  million (10⁶) FLOPS
      GFLOPS  GigaFLOPS  billion (10⁹) FLOPS
      TFLOPS  TeraFLOPS  trillion (10¹²) FLOPS
      PFLOPS  PetaFLOPS  quadrillion (10¹⁵) FLOPS
• Serial processing – a computer code is executed line-by-line in sequence on
  one processor.
• Scalar processing – a processor performs calculations on a single data
  element.
• Vector processing – a processor performs calculations on multiple data
  elements, i.e. a vector, simultaneously.
• Parallel processing – multiple processors (or cores) perform operations
  simultaneously on different data elements.
• Massively parallel – parallel computers involving thousands of processors.
• MPP – Massively Parallel Processing.
• SMP – Symmetric Multi-Processing; shared-memory parallelism.
• Shared memory parallelism – all of the processors share the same memory.
• Distributed memory parallelism – each processor accesses its own memory.
• MPI – Message Passing Interface; the most common library used for
  inter-processor communication on distributed-memory computers.
• Cluster – a parallel computer comprised of commodity hardware, e.g. a Beowulf
  cluster (commodity hardware, Linux OS, and open-source software).
• SIMD – Single Instruction, Multiple Data; all processors perform the same
  instruction on different data elements.
• MIMD – Multiple Instruction, Multiple Data; processors may perform different
  instructions on different data elements.
• Embarrassingly parallel – processors work in parallel with very little or no
  communication between them, e.g. image processing and SETI@home.
• Coupled parallel – processors work in parallel but require significant
  communication between them, e.g. CFD.
• HPC – High Performance Computing.
• Grid computing – parallel computing across a geographically distributed
  network (e.g. the internet); analogous to the electric grid.
• Multi-core CPUs – a single chip with multiple processors (cores).
Note that each architecture requires its own approach to programming, and not
all algorithms are amenable to each.
Milestones:
Year Supercomputer Peak Speed
1906 Babbage Analytical Engine 0.3 OPS
1946 ENIAC 50 kOPS
1964 CDC 6600 (Seymour Cray) 3 MFLOPS
1969 CDC 7600 36 MFLOPS
1976 Cray-1 (Seymour Cray) 250 MFLOPS
1981 CDC Cyber 205 400 MFLOPS
1983 Cray X-MP 941 MFLOPS
1985 Cray-2 3.9 GFLOPS
1985 Thinking Machines CM-2 (64k)
1989 Cray Y-MP
1993 Thinking Machines CM-5 65.5 GFLOPS
1993 Intel Paragon 143.4 GFLOPS
1996 Hitachi/Tsukuba CP-PACS (2k) 368.2 GFLOPS
1999 Intel ASCI Red (10k) 2.8 TFLOPS
2002 NEC Earth Simulator (5k) 35.9 TFLOPS
2005 IBM Blue Gene/L (131k) 280.6 TFLOPS
2008 IBM Roadrunner (130k) 1.105 PFLOPS
Milestones (cont’d):
1954 – Fortran (Formula Translation) developed
1970 – Unix developed at AT&T Bell Labs
Components of Computing:
1 Hardware – the computer (CPUs, memory, storage, network, etc.)
→ Assuming adequate memory, the computational time is primarily determined
by:
Sequential → CPU speed
Parallel → number of CPUs, CPU speed, inter-processor communication speed
and bandwidth, percent of code that is run in parallel (see Amdahl’s law).
2 Software – OS, compiler, libraries, program, etc.
3 Algorithms – the numerical method
→ Note that the best algorithms for serial computers often are not the best for
parallel computers (e.g. BLKTRI).
4 Architectures:
  SIMD (e.g. CM-2) vs. MIMD
  Shared memory (e.g. Cray, SGI, multicore CPUs, etc.)
  → Ingredients (3) and (4) were not available during the MPP (e.g. CM) heyday
  in the mid ’80s to mid ’90s. This led to the demise of the CM.
Amdahl’s Law:
Measure of speedup S(N) on N processors versus a single processor, i.e. serial code:

    S(N) = T(1)/T(N) = N / [N − (N − 1)F],

where T(N) is the time required using N processors, and F is the fraction of the
serial run time that the code spends in the parallel portion.
See the Mathematica notebook “AmdahlsLaw.nb”.
Notes:
1 If F = 1 (100% parallel) ⇒ S(N) = N (linear speedup).
  If F < 1 ⇒ S(N) < N (sub-linear speedup).
2 As N → ∞, the speedup approaches

    S(∞) = 1/Fs,

  where Fs = 1 − F is the fraction of time in the serial code spent doing the
  serial portion.
  e.g. F = 0.95 ⇒ Fs = 0.05 ⇒ S(∞) = 20.
3 In practice, Fs (and F) depends on the number of processors N and the size
  of the problem n, i.e. the number of grid points.
  Thus, for ideal scalability a parallel algorithm should be such that
  Fs(N, n) → 0 (F → 1) as n → ∞.
The parallel efficiency is

    E(N) = S(N)/N.
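Amdahl's law and the efficiency E(N) are easy to tabulate; a short sketch (the function names are mine):

```python
def amdahl_speedup(N, F):
    """Amdahl's law: S(N) = T(1)/T(N) = N / (N - (N-1)*F), where F is
    the fraction of the serial run time spent in the parallel portion."""
    return N / (N - (N - 1) * F)

def efficiency(N, F):
    """Parallel efficiency E(N) = S(N)/N."""
    return amdahl_speedup(N, F) / N

# For F = 0.95 the speedup saturates at S(inf) = 1/Fs = 1/0.05 = 20,
# however many processors are added.
for N in (1, 10, 100, 1000):
    print(N, amdahl_speedup(N, 0.95), efficiency(N, 0.95))
```

Note how the efficiency collapses as N grows for any F < 1, which is the quantitative content of the sub-linear speedup remark above.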
November 2008:
The entry level to the list moved up to the 12.64 Tflop/s mark on the
Linpack benchmark, compared to 9.0 Tflop/s six months ago.
The last system on the newest list would have been listed at position 267 in
the previous TOP500 just six months ago.
Total combined performance of all 500 systems has grown to 16.95 Pflop/s,
compared to 11.7 Pflop/s six months ago and 6.97 Pflop/s one year ago.
The entry point for the top 100 increased in six months from 18.8 Tflop/s to
27.37 Tflop/s (which would have been # 9 on the November 2005 list).
The entry level into the TOP50 is at 50.55 Tflop/s (which would have been
# 5 on the November 2005 list).
Of the top 50, 56 percent of systems are installed at research labs and 32
percent at universities. Cray’s XT is the most-used system family with 20
percent, followed by IBM’s BlueGene with 16 percent. The average
concurrency level is 30,490 cores per system, up from 24,400 six months ago.
Seven U.S. DOE systems dominate the TOP10.
Roadrunner (# 1) is based on the IBM QS22 blades that are built with
advanced versions of the processor in the Sony PlayStation 3.
The list now includes energy consumption of the supercomputers.