9.1 Introduction
The functionals considered previously all involve fixed end points, that is the inde-
pendent variable is defined on a given interval at the ends of which the value of the
dependent variable is known. It is not hard to find variational problems with different
types of boundary conditions: in this introduction we describe a few of these problems
in order to motivate the analysis described here and in chapter 11.
The simplest generalisation is to natural boundary conditions in which the interval
of integration is given, but the value of the path at either one or both ends is not
given but needs to be determined as part of the variational principle. An example is a
stationary, loaded, stiff beam, which adopts a configuration that minimises its energy.
If the unloaded beam is horizontal along the x-axis, between $x = 0$ and $x = L$, and $y(x)$
represents the displacement, assumed small, the bending energy is proportional to the square of its
curvature, which for small displacements is proportional to $y''(x)^2$; then if $\rho(x)$ is the load per unit
length the energy functional can be shown to be
\[ E[y] = \int_0^L dx \left( \frac{1}{2}\kappa y''^2 - g\rho(x)\, y \right), \tag{9.1} \]
where $\kappa$ is a positive constant and $g$ the acceleration due to gravity. Note that here $y(x)$
is positive for displacements below the x-axis. The Euler-Lagrange equation for this
functional is a linear, fourth-order equation, see section 9.2.1, so requires four boundary
conditions.
If the beam is clamped horizontally at $x = 0$, there are just two boundary conditions,
$y(0) = y'(0) = 0$, though experience shows that this problem has a unique solution.
It transpires that the other two conditions, needed to determine this solution of the
Euler-Lagrange equation, can be derived directly from the variational principle that
requires E[y] to be stationary.
Alternatively, if the beam is simply supported at both ends, giving the boundary
conditions y(0) = y(L) = 0, it can be shown that the remaining two boundary conditions
are also obtained by insisting that E[y] is stationary. We explore this problem in
section 9.2.1.
341
342 CHAPTER 9. VARIABLE END POINTS
The first person to generalise boundary conditions was Newton in his investigations
of the motion of an axially symmetric body through a resisting medium, see equa-
tion 2.22 (page 96).
The brachistochrone problem was generalised by John Bernoulli in 1697, by allowing
the lower end of the stationary path to move on a given curve, defined by an equation
of the form $\tau(x, y) = 0$. Figure 9.1 shows an example in which the right end of the
brachistochrone lies on the straight line $\tau(x, y) = x + y - 1 = 0$, the
left end is fixed at $(0, A)$, with $A < 1$, and the particle starts at rest at $(0, A)$;
the brachistochrones are shown for various values of $A$. The equation for the
stationary paths is derived in exercise 9.13. Notice that the cycloid intersects the curve
$\tau(x, y) = 0$ at right angles and that at $x = 0$ the gradient of the cycloid is infinite.
[Figure 9.1: plot of y against x, both from 0 to 1, showing cycloid segments ending on the line $\tau(x, y) = x + y - 1 = 0$.]
Figure 9.1 Diagram showing stationary paths through the points $(0, A)$, for
$A = 0.2$, 0.5 and 0.9, and $(v, y(v))$, where the right end is constrained to lie on the
straight line $\tau(x, y) = x + y - 1 = 0$, and the particle starts from rest at $(0, A)$.
In this case the functional is, see equation 4.5 (page 166),
\[ T[y] = \int_0^v dx \sqrt{\frac{1 + y'^2}{2E/m - 2gy}}, \quad y(0) = A, \quad \tau(v, y(v)) = 0. \tag{9.2} \]
Exercise 9.1
Explain why the stationary curves depicted in figure 9.1 are cycloids.
Many fixed end point problems can be modified in this manner. For instance, a variation
of the catenary problem, described in section 2.5.6 (page 97), is given by an inelastic rope
hanging between two curves, defined by $\tau_1(x, y) = 0$ and $\tau_2(x, y) = 0$, on which the ends may
slide without hindrance, as shown in figure 9.2: the curve AB is a catenary, but now we
also need to determine the positions of A and B. Another example is a cable hanging
between two points A and B, with a weight of mass M attached at a given
point C, with the distances AC and CB along the curve known, see figure 9.3. The
segments AC and CB will be catenaries but the gradient at C will be discontinuous.
Both these problems involve constraints, so are dealt with in chapter 11.
[Figures 9.2 and 9.3: sketches in the (x, y)-plane of a catenary AB hanging between the curves $\tau_1(x, y) = 0$ and $\tau_2(x, y) = 0$, and of a rope between A and B carrying a weight Mg at C.]
Figure 9.2 Diagram of a rope hanging between the two curves defined by $\tau_k(x, y) = 0$, $k = 1, 2$, on which it can slide freely.
Figure 9.3 Diagram of a rope hanging between two given points, A and B, and with a weight firmly attached at a given point of the rope.
9.2 Natural boundary conditions
In this section we develop the theory for a particularly simple type of free boundary,
because this illustrates the method in the clearest manner. The ideas used for the more
general case are similar, but the algebra is more complicated. Here the interval [a, b]
and the value of the path at x = a are given, but the value of y(b) is to be determined.
Thus the functional is
\[ S[y] = \int_a^b dx\, F(x, y, y'), \quad y(a) = A, \tag{9.3} \]
and both y(x) and y(b) need to be chosen to ensure that S[y] is stationary, as shown
schematically in figure 9.4, where the stationary and a varied path are depicted. This
problem differs from the general case, treated later, in that the value of x at the right-
hand end is given. This type of boundary condition is known as a natural condition or
a natural boundary condition because the value of y(b) is not imposed, but is defined
by the variational principle.
The admissible paths all pass through $(a, A)$; the right end is constrained to lie on
the line $x = b$, but the actual position on this line needs to be determined. If $y + \epsilon h$
are admissible paths then $h(a) = 0$, but $h(b)$ need not be zero.
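[Figure 9.4: sketch of the stationary path $y(x)$ and a varied path $y(x) + \epsilon h(x)$ between $x = a$ and $x = b$; both start at $(a, A)$ and end on the line $x = b$, at $y(b)$ and $y(b) + \epsilon h(b)$ respectively.]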
Figure 9.4 Diagram showing the stationary path, the solid curve, and a varied path,
the dashed curve, for a problem in which the left end is fixed, but the other end is free
to move along the line x = b, parallel to the y-axis, so y(b) needs to be determined.
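The Gateaux differential of the functional is given by equation 3.9 (page 125), that is
\[ \Delta S[y, h] = \int_a^b dx \left( h(x)\frac{\partial F}{\partial y} + h'(x)\frac{\partial F}{\partial y'} \right). \tag{9.4} \]
As before we integrate the second term by parts: using the fact that $h(a) = 0$ this gives
\[ \Delta S[y, h] = h(b)\left.\frac{\partial F}{\partial y'}\right|_{x=b} - \int_a^b dx \left( \frac{d}{dx}\frac{\partial F}{\partial y'} - \frac{\partial F}{\partial y} \right) h(x). \tag{9.5} \]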
This is the equivalent of equation 3.10 (page 125) but now the boundary term is not
automatically zero.
For a stationary path $\Delta S[y, h] = 0$ for all $h(x)$ and because the allowed variations
include those functions for which $h(b) = 0$ the stationary paths must satisfy the
Euler-Lagrange equation
\[ \frac{d}{dx}\frac{\partial F}{\partial y'} - \frac{\partial F}{\partial y} = 0, \quad y(a) = A, \tag{9.6} \]
with only one boundary condition¹. The general solution of this equation will contain
one arbitrary constant c, so we write the solution as $y(x, c)$. Because $y(x, c)$ satisfies
equation 9.6, the Gateaux differential becomes $\Delta S = h(b) F_{y'}(x, y, y')|_{x=b}$ and because
this must be zero for all $h(b)$, the solution of the Euler-Lagrange equation must satisfy
the boundary condition
\[ F_{y'}\big(b, y(b, c), y'(b, c)\big) = 0, \tag{9.7} \]
which determines possible values of c, and hence the stationary paths. Equation 9.7 is
the natural boundary condition.
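As a hedged numerical illustration of equation 9.7, consider the functional $S[y] = \int_0^1 dx\,(y'^2 + y^2)$ with $y(0) = 1$ and the right end free on $x = 1$; this functional and the helper names below are our own choices for illustration, not taken from the text. The Euler-Lagrange solutions through $(0, 1)$ are $y(x, c) = \cosh x + c \sinh x$, and the natural boundary condition $F_{y'} = 2y'(1) = 0$ predicts $c = -\tanh 1$. The sketch evaluates $S$ on this one-parameter family and locates the minimising value of c.

```python
import math

def action(c, n=2000):
    """Simpson estimate of S = int_0^1 (y'^2 + y^2) dx on y = cosh x + c sinh x."""
    def integrand(x):
        y = math.cosh(x) + c * math.sinh(x)
        dy = math.sinh(x) + c * math.cosh(x)
        return dy * dy + y * y
    h = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

def minimise(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        x1 = hi - g * (hi - lo)
        x2 = lo + g * (hi - lo)
        if f(x1) < f(x2):
            hi = x2
        else:
            lo = x1
    return 0.5 * (lo + hi)

c_best = minimise(action, -2.0, 0.0)      # S is quadratic in c, so unimodal
c_natural = -math.tanh(1.0)               # value predicted by F_{y'} = 0 at x = 1
slope_at_end = math.sinh(1.0) + c_best * math.cosh(1.0)
print(c_best, c_natural, slope_at_end)
```

The minimising constant should agree with $-\tanh 1$ to within the accuracy of the search, and the end slope $y'(1)$ should be close to zero.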
As an example consider the brachistochrone problem, studied in section 4.2. It is
convenient to use the dependent variable $z(x) = A - y(x)$, defined in equation 4.6
(page 166), and as before, we suppose that the initial velocity is zero, $v_0 = 0$. Then the
functional may be taken to be²
\[ T[z] = \int_0^b dx \sqrt{\frac{1 + z'^2}{z}}, \quad z(0) = 0. \tag{9.8} \]
¹ This derivation assumes that there exists at least a one parameter family of variations, $h(x)$, such
that $h(a) = h(b) = 0$, which is always the case for the problems we consider.
² For convenience we ignore the factor $(2g)^{-1/2}$, which does not affect the Euler-Lagrange equation.
The Euler-Lagrange equation is the same as in the previous discussion and, because the
functional does not depend explicitly upon x, it reduces to the first-order equation 4.7,
having the solution, see equation 4.8 (page 167),
\[ x = \frac{1}{2}c^2(2\phi - \sin 2\phi), \quad z = c^2\sin^2\phi, \tag{9.9} \]
where we have set $d = 0$, because $z = 0$ when $x = 0$; the boundary condition at $x = b$
determines the value of c. For future reference we note that
\[ \frac{dz}{dx} = \frac{dz}{d\phi}\Big/\frac{dx}{d\phi} = \frac{2\sin\phi\cos\phi}{1 - \cos 2\phi} = \frac{1}{\tan\phi} \quad \text{because} \quad \cos 2\phi = 1 - 2\sin^2\phi. \]
At $x = b$ this solution must satisfy the boundary condition 9.7 which, for this
problem, becomes
\[ F_{z'} = \frac{z'}{\sqrt{z(1 + z'^2)}} = 0. \]
But z is bounded so the only solution is $z' = 0$, and since $z' = 1/\tan\phi$, this gives
$\phi = \pi/2$, and means that the cycloid intersects the vertical line through $x = b$
orthogonally, see figure 9.5. But at the right end $x = b$, so $2b = \pi c^2$, which gives the
solution
\[ x = \frac{b}{\pi}(2\phi - \sin 2\phi), \quad z = \frac{2b}{\pi}\sin^2\phi, \quad 0 \le \phi \le \frac{\pi}{2}. \tag{9.10} \]
The shape of this curve depends only upon b, rather than both A and b as in the
conventional problem. Here the value of A merely changes the vertical displacement of
the whole curve. It is therefore convenient to set $A = 2b/\pi$, and then the dependence
upon b becomes a change of scale, seen by setting $\bar{x} = \pi x/b$ and $\bar{y} = \pi y/b$ to give
\[ \bar{x} = 2\phi - \sin 2\phi, \quad \bar{y} = 2\cos^2\phi, \quad 0 \le \phi \le \frac{\pi}{2}. \tag{9.11} \]
The graph of this scaled solution is shown in figure 9.5.
[Figure 9.5: graph of the scaled stationary path 9.11, with $\bar{x}$ from 0 to about 3 and $\bar{y}$ from 0 to 2.]
The time of passage is also independent of A, and is given by the simple formula
$T(b) = \sqrt{\pi b/g}$, a result derived in exercise 9.6.
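The formula $T(b) = \sqrt{\pi b/g}$ can be checked numerically. The following is a minimal sketch, assuming the cycloid 9.10 with $c^2 = 2b/\pi$ and restoring the factor $(2g)^{-1/2}$ dropped from equation 9.8; the function name, the value of g and the choice of the midpoint rule are our own.

```python
import math

def passage_time(b, g=9.81, n=20000):
    """Midpoint-rule estimate of T = (2g)^(-1/2) * int sqrt((1 + z'^2)/z) dx
    along the cycloid 9.10, using phi in (0, pi/2) as the integration variable."""
    c2 = 2.0 * b / math.pi                      # c^2, fixed by x(pi/2) = b
    dphi = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        z = c2 * math.sin(phi) ** 2
        dz_dx = 1.0 / math.tan(phi)
        dx_dphi = c2 * (1.0 - math.cos(2.0 * phi))
        total += math.sqrt((1.0 + dz_dx ** 2) / z) * dx_dphi * dphi
    return total / math.sqrt(2.0 * g)

b = 2.0                                         # hypothetical width
t_numeric = passage_time(b)
t_formula = math.sqrt(math.pi * b / 9.81)
print(t_numeric, t_formula)
```

In terms of $\phi$ the integrand reduces to the constant $2c$, which is why a simple quadrature rule is adequate here.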
Exercise 9.2
Write down a functional for the distance between the point (0, A) and the line
x = X > 0, parallel to the y-axis. Show that the stationary path is the straight
line through (0, A) and parallel to the x-axis.
Exercise 9.3
Find the stationary path of the functional
\[ S[y] = \int_0^{\pi/4} dx \left( y'^2 - y^2 \right), \quad y(0) = A > 0, \]
where the right-hand end of the path lies on the line $x = \pi/4$.
Exercise 9.4
Show that the functional $S[y] = \int_a^b dx\, F(x, y, y')$, $y(b) = B$, with the left end
of the path constrained to the line $x = a$, is stationary on the solution of the
Euler-Lagrange equation,
\[ \frac{d}{dx}\frac{\partial F}{\partial y'} - \frac{\partial F}{\partial y} = 0, \quad F_{y'}(x, y, y')\big|_{x=a} = 0, \quad y(b) = B. \]
Exercise 9.5
Find the stationary path for the functional
\[ S[y] = \int_0^1 dx \left( y'^2 + y^2 \right), \quad y(1) = B > 0, \]
with the left end of the path constrained to the y-axis.
Exercise 9.6
Show that the time to traverse the curve 9.10 is $T(b) = \sqrt{\pi b/g}$.
Hint: use equation 9.8, but remember the factor $(2g)^{-1/2}$.
Exercise 9.7
The navigation problem defined in section 2.5.4 gives rise to the functional
\[ T[y] = \int_0^b dx\, F(x, y'), \quad F(x, y') = \frac{\sqrt{c^2(1 + y'^2) - v^2} - v y'}{c^2 - v^2}, \]
for the time to cross a river. The start point is at the origin so $y(0) = 0$, but the
terminus is, in this version of the problem, undefined so the boundary condition
at $x = b$ is a natural boundary condition. Assuming that $v(x) \ge 0$ show that the
stationary path is given by
\[ y(x) = \frac{1}{c}\int_0^x du\, v(u). \]
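The stated result can be checked numerically. Because $F$ does not depend on $y$, the Euler-Lagrange equation gives $F_{y'}$ = constant, and the natural boundary condition at $x = b$ makes that constant zero, so $F_{y'} = 0$ along the whole path. The sketch below, which assumes a hypothetical parabolic stream profile and boat speed of our own choosing, estimates $F_{y'}$ by central differences on $y' = v/c$.

```python
import math

def F(yp, v, c):
    """Integrand of exercise 9.7 for stream speed v and boat speed c > |v|."""
    return (math.sqrt(c * c * (1.0 + yp * yp) - v * v) - v * yp) / (c * c - v * v)

def dF_dyp(yp, v, c, eps=1e-6):
    """Central-difference estimate of F_{y'}."""
    return (F(yp + eps, v, c) - F(yp - eps, v, c)) / (2.0 * eps)

def stream_speed(x, b=1.0, v_max=1.0):
    """Hypothetical parabolic stream profile, zero at the banks x = 0 and x = b."""
    return 4.0 * v_max * x * (b - x) / (b * b)

c = 2.0                                      # hypothetical boat speed, larger than v_max
residuals = []
for x in (0.1, 0.3, 0.5, 0.9):
    v = stream_speed(x)
    residuals.append(dF_dyp(v / c, v, c))    # F_{y'} evaluated on y' = v/c
    print(x, v, residuals[-1], c * F(v / c, v, c))
```

On this path the integrand reduces to $F = 1/c$, so the last printed column should equal 1 and the crossing time is $b/c$.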
Exercise 9.8
This exercise is important because it uses the method introduced in this section
to extend the range of boundary conditions that can be described by functionals.
(a) Show that the Euler-Lagrange equation for the functional
\[ S[y] = \int_a^b dx \left( y'(x)^2 - y(x)^2 \right), \quad y(a) = A, \quad y(b) = B, \]
is $y'' + y = 0$, $y(a) = A$, $y(b) = B$.
(b) Second-order equations of the above form occur frequently, but the boundary
conditions are sometimes different, involving linear combinations of $y$ and $y'$. Thus
a typical equation is
\[ \frac{d^2y}{dx^2} + y = 0, \quad g_a y(a) + y'(a) = 0, \quad g_b y(b) + y'(b) = 0, \tag{9.12} \]
where $g_a$ and $g_b$ are constants.
Show, from first principles, that the functional
\[ S[y] = g_b\, y(b)^2 - g_a\, y(a)^2 + \int_a^b dx \left( y'(x)^2 - y(x)^2 \right) \]
is stationary on the path that satisfies equation 9.12, for all $g_a$ and $g_b$.
On a stationary path this must be zero for all allowed $h(x)$. A subset of varied paths
has $h(b) = h'(b) = 0$ and hence the stationary path must satisfy the Euler-Lagrange
equation
\[ \frac{d^2}{dx^2}\frac{\partial F}{\partial y''} - \frac{d}{dx}\frac{\partial F}{\partial y'} + \frac{\partial F}{\partial y} = 0, \quad y(a) = A, \quad y'(a) = A'. \tag{9.16} \]
The solution of this equation contains two arbitrary constants. Now consider those
varied paths for which $h(b) = 0$ and $h'(b) \ne 0$, and those for which $h(b) \ne 0$ and
$h'(b) = 0$, to see that the solutions of this Euler-Lagrange equation must also satisfy
the two extra boundary conditions,
\[ F_{y''} = 0 \quad \text{and} \quad F_{y'} - \frac{d}{dx}F_{y''} = 0 \quad \text{at } x = b, \tag{9.17} \]
which determine the two constants in the solution of equation 9.16.
Exercise 9.9
Derive equation 9.13.
Exercise 9.10
For the functional defined in equation 9.1 (page 341) with $\rho$ = constant and the
boundary conditions $y(0) = y'(0) = 0$, use equations 9.16 and 9.17 to derive the
associated Euler-Lagrange equation and show that its solution is
\[ y(x) = \frac{g\rho}{24\kappa}\, x^2\left( x^2 - 4Lx + 6L^2 \right). \]
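The quoted solution can be verified with exact arithmetic. For the integrand of equation 9.1, $F_{y''} = \kappa y''$, $F_{y'} = 0$ and $F_y = -g\rho$, so equation 9.16 reads $\kappa y'''' = g\rho$ and the natural conditions 9.17 become $y''(L) = y'''(L) = 0$. The sketch below checks these for the polynomial above; the numerical values of $L$ and of $g\rho/\kappa$ are hypothetical, and the helper names are our own.

```python
from fractions import Fraction

def derivative(coeffs):
    """Differentiate a polynomial stored as [a0, a1, a2, ...]."""
    return [k * a for k, a in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x."""
    return sum(a * x ** k for k, a in enumerate(coeffs))

L = Fraction(3)                       # hypothetical beam length
load = Fraction(5, 2)                 # hypothetical value of g*rho/kappa
# y(x) = (load/24) * (6 L^2 x^2 - 4 L x^3 + x^4)
y = [0, 0, load * 6 * L ** 2 / 24, -load * 4 * L / 24, load / 24]
d1 = derivative(y)
d2 = derivative(d1)
d3 = derivative(d2)
d4 = derivative(d3)

print(evaluate(y, 0), evaluate(d1, 0))    # clamped end: y(0), y'(0)
print(evaluate(d2, L), evaluate(d3, L))   # free end: y''(L), y'''(L)
print(d4)                                 # y'''' as a constant polynomial
```

All four boundary values should vanish and the fourth derivative should equal the chosen value of $g\rho/\kappa$.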
Exercise 9.11
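\[ \frac{d^2}{dx^2}\frac{\partial F}{\partial y''} - \frac{d}{dx}\frac{\partial F}{\partial y'} + \frac{\partial F}{\partial y} = 0, \quad y(a) = A, \quad y(b) = B, \quad F_{y''}\big|_a = F_{y''}\big|_b = 0. \]
(b) Apply the result found in part (a) to the functional defined in equation 9.1
(page 341), with $\rho$ = constant and the boundary conditions $y(0) = y(L) = 0$, to
derive the associated Euler-Lagrange equation and show that its solution is
\[ y(x) = \frac{g\rho}{24\kappa}\, x(L - x)\left( L^2 + xL - x^2 \right). \]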
9.3. VARIABLE END POINTS 349
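[Figure 9.6: sketch of the stationary path $y(x)$ from $(a, A)$ to the curve $\tau(x, y) = 0$ at $x = v$, and a varied path $y(x) + \epsilon h(x)$ meeting the curve at $x = v + \epsilon\xi$.]
Figure 9.6 Diagram showing the stationary path, the solid line, and a varied
path, the dashed curve, for a problem in which the left-hand end is fixed, but
the other end is free to move along the line defined by $\tau(x, y) = 0$.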
The functional is
\[ S[y] = \int_a^v dx\, F(x, y, y'), \quad y(a) = A, \tag{9.18} \]
where the path $y(x)$ and $v$ need to be chosen to make the functional stationary.
Let $y(x) + \epsilon h(x)$ be an admissible varied path, so $h(a) = 0$. If $x = v$ is the right-hand
terminal point of $y(x)$, the terminal point of the varied path is at $x = v + \epsilon\xi$, for some
$\xi$, so the x and y coordinates of this point are,
\[ x = v + \epsilon\xi \quad \text{and} \quad y = y(v + \epsilon\xi) + \epsilon h(v + \epsilon\xi) = y(v) + \epsilon\big( \xi y'(v) + h(v) \big) + O(\epsilon^2). \]
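so the derivative with respect to $\epsilon$ is given by equation 1.52 (page 45), with $b = z(\epsilon)$
so $dz/d\epsilon = \xi$,
\[ \frac{dS}{d\epsilon} = \xi F\big( z, y(z) + \epsilon h(z), y'(z) + \epsilon h'(z) \big) + \int_a^z dx \left( h\frac{\partial F}{\partial y} + h'\frac{\partial F}{\partial y'} \right). \]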
Now use integration by parts and the fact that $h(a) = 0$ to give
\[ \int_a^v dx\, h' F_{y'} = \Big[ h F_{y'} \Big]_a^v - \int_a^v dx\, h \frac{d}{dx}\left( F_{y'} \right) = h F_{y'}\Big|_{x=v} - \int_a^v dx\, h \frac{d}{dx}\left( F_{y'} \right). \]
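Hence the Gateaux differential, equation 9.21, becomes
\[ \Delta S[y, h] = \Big[ \xi F + h F_{y'} \Big]_v - \int_a^v dx \left( \frac{d}{dx}\frac{\partial F}{\partial y'} - \frac{\partial F}{\partial y} \right) h. \tag{9.22} \]
Finally we use equation 9.19 to express $h(v)$ in terms of $\xi$ to arrive at the relation
\[ \Delta S[y, h] = -\frac{\xi}{\tau_y}\Big[ F_{y'}\tau_x + (y' F_{y'} - F)\tau_y \Big]_v - \int_a^v dx \left( \frac{d}{dx}\frac{\partial F}{\partial y'} - \frac{\partial F}{\partial y} \right) h. \tag{9.23} \]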
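On a stationary path $\Delta S[y, h] = 0$ for all allowed h. A subset of these variations will
have $\xi = 0$, consequently $y(x)$ must satisfy the Euler-Lagrange equation,
\[ \frac{d}{dx}\frac{\partial F}{\partial y'} - \frac{\partial F}{\partial y} = 0, \quad y(a) = A. \tag{9.24} \]
On a path satisfying this equation only the boundary term of equation 9.23 remains,
and this must also be zero for all $\xi$. Hence, the equation
\[ \tau_x F_{y'} + \tau_y\left( y' F_{y'} - F \right) = 0, \quad x = v, \tag{9.26} \]
must be satisfied. This equation is the required boundary condition for the right-hand
end of the path and is named a transversality condition.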
In order to see how this works, consider the solution of equation 9.24, y(x, c), which
depends upon a single constant c, because there is only one boundary condition. By
substituting this into equation 9.26 we obtain an equation relating v and c. But the
right-hand end of the path satisfies the condition $\tau(v, y(v, c)) = 0$, and this gives another
relation between v and c: if these two equations can be solved for one or more real pairs
of v and c, stationary paths are obtained.
The derivation of equation 9.26 implicitly assumed that $\tau_y \ne 0$, see equation 9.23.
Suppose that on the stationary path $\tau_y = 0$, which means that at this point the curve
$\tau(x, y) = 0$ is parallel to the y-axis; then from equation 9.19 we see that $\xi = 0$, since we
assumed that $\tau_x$ and $\tau_y$ are not simultaneously zero, so the boundary term of 9.22 reduces
to $h F_{y'} = 0$, which means that $F_{y'} = 0$. Equation 9.26 also gives $F_{y'} = 0$ if $\tau_y = 0$ so it
is also valid in this exceptional case. Note that in this limit the transversality condition
reduces to the natural boundary condition of equation 9.7, which is also retrieved by
setting $\tau = x - b$ in equation 9.26.
The transversality condition can be written in an alternative form by noting that
if the equation $\tau(x, y) = 0$ defines a curve $y = g_2(x)$ then $g_2'(x) = -\tau_x/\tau_y$, and
equation 9.26 becomes
\[ F + (g_2' - y')F_{y'} = 0, \quad x = v. \tag{9.27} \]
This form of the transversality condition is not valid when $\tau_y = 0$, that is where $|g_2'(x)|$
is infinite.
If the left end of the path is also constrained to a prescribed curve, $\gamma(x, y) = 0$, then
a similar equation can be derived. In summary we have the following result.
Theorem 9.1
For the functional $S[y] = \int_u^v dx\, F(x, y, y')$ and the smooth curves $C_\gamma$ and $C_\tau$ defined by
the equations $\gamma(x, y) = 0$ and $\tau(x, y) = 0$, the continuously differentiable path joining
$C_\gamma$ and $C_\tau$, at $x = u$ and $x = v$ respectively, that makes $S[y]$ stationary, satisfies the
Euler-Lagrange equation
\[ \frac{d}{dx}\frac{\partial F}{\partial y'} - \frac{\partial F}{\partial y} = 0 \tag{9.28} \]
and the boundary conditions
\[ \Big[ \gamma_x F_{y'} + \gamma_y\left( y' F_{y'} - F \right) \Big]_{x=u} = 0 \quad \text{and} \quad \Big[ \tau_x F_{y'} + \tau_y\left( y' F_{y'} - F \right) \Big]_{x=v} = 0. \tag{9.29} \]
with the right end of the path terminating on the curve C defined by $\tau(x, y) = 0$. For
this functional a first-integral exists and is given by
\[ F - y' F_{y'} = \frac{f(y)}{\sqrt{1 + y'^2}} = c = \text{constant}. \]
The transversality condition 9.26 then gives
\[ \frac{\tau_x\, y' f(y)}{\sqrt{1 + y'^2}} - c\,\tau_y = 0, \quad \text{that is} \quad \tau_x\, y'(v) = \tau_y. \]
But the gradient of C is $-\tau_x/\tau_y$ and hence at the terminal point the stationary path is
perpendicular to C.
Exercise 9.12
Find the stationary path of the functional
\[ S[y] = \int_0^v dx\, \frac{\sqrt{1 + y'^2}}{y}, \quad y(0) = 0, \]
for a path terminating on the line $y = x - a$, $a > 0$.
Hint: first show that the solutions of the Euler-Lagrange equation are circles
through the origin and with centres on the x-axis.
Exercise 9.13
Consider the brachistochrone in which the left end is fixed at (0, A) and the right
end is constrained to the curve x/a + y/b = 1, a, b > 0. Initially the particle is
stationary at (0, A).
Show that the equations of the stationary path are
\[ x = \frac{1}{2}c^2(2\phi - \sin 2\phi), \quad y = A - c^2\sin^2\phi, \quad 0 \le \phi \le \phi_b = \pi - \tan^{-1}(b/a), \]
where c is given by the equation $c^2\phi_b = a(1 - A/b)$.
Graphs of this solution, for various values of A and a = b = 1, are shown in
figure 9.1 (page 342).
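A short numerical check of these formulae is sketched below. It assumes the reading $\phi_b = \pi - \tan^{-1}(b/a)$ of the end-point parameter (so that the cycloid is rising when it meets the line) together with $c^2\phi_b = a(1 - A/b)$, and confirms that the end point lies on the line $x/a + y/b = 1$ and that the path meets the line at right angles. The function name is our own.

```python
import math

def cycloid_to_line(A, a, b):
    """End point and end slope of the path of exercise 9.13, assuming
    phi_b = pi - atan(b/a) and c^2 * phi_b = a * (1 - A/b)."""
    phi_b = math.pi - math.atan(b / a)
    c2 = a * (1.0 - A / b) / phi_b
    x = 0.5 * c2 * (2.0 * phi_b - math.sin(2.0 * phi_b))
    y = A - c2 * math.sin(phi_b) ** 2
    slope = -1.0 / math.tan(phi_b)          # dy/dx = -dz/dx = -cot(phi)
    return x, y, slope

for A in (0.2, 0.5, 0.9):                   # the cases drawn in figure 9.1
    x, y, slope = cycloid_to_line(A, 1.0, 1.0)
    print(A, x + y, slope)                  # x + y should be 1, slope should be 1
```

For $a = b = 1$ the line has gradient $-1$, so a path gradient of $+1$ at the end point is the orthogonality noted for figure 9.1.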
Exercise 9.14
Consider the ellipse and the straight line defined, respectively, by the equations
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \quad \text{and} \quad \frac{x}{A} + \frac{y}{B} = 1, \quad x > 0, \quad y > 0, \]
in the first quadrant, where a, b, A and B are positive constants.
(a) Show that these curves do not intersect if $AB > \Delta$, where $\Delta^2 = A^2b^2 + B^2a^2$.
(b) Construct a functional for the distance between two points $(u, v)$ on the ellipse,
and $(\xi, \eta)$ on the straight line, and show that the solution of the associated
Euler-Lagrange equation is the straight line $y = mx + c$. Show also that the values of
the six constants m and c, $(u, v)$ and $(\xi, \eta)$ making this distance stationary satisfy
the equations
\[ \frac{mu}{a^2} = \frac{v}{b^2}, \quad m = \frac{A}{B}, \quad \frac{u^2}{a^2} + \frac{v^2}{b^2} = 1, \quad \frac{\xi}{A} + \frac{\eta}{B} = 1, \]
together with $v = mu + c$ and $\eta = m\xi + c$.
(c) Solve these equations to show that when the curves do not intersect the
stationary distance is
\[ d = \frac{AB - \Delta}{\sqrt{A^2 + B^2}}. \]
where the end of the path at $t = 0$ is fixed and the end at $t = 1$ lies on a smooth
curve, C, defined parametrically by $x = \phi(\tau)$, $y = \psi(\tau)$, where both $\phi(\tau)$ and $\psi(\tau)$
are continuously differentiable and such that $\phi'(\tau)$ and $\psi'(\tau)$ are not simultaneously
zero for any $\tau$ in the region of interest. Notice that the parameter t varies in the fixed
interval [0, 1] because the integrand is homogeneous of degree one in $\dot{x}$ and $\dot{y}$: this is
different from the functional 9.18 in which it was necessary to allow the upper limit to
vary. Here $0 \le t \le 1$ on all paths.
By considering the varied path $(x + \epsilon h_1, y + \epsilon h_2)$ we obtain the Gateaux differential
in the usual manner,
\[ \Delta S[x, y, h_1, h_2] = \int_0^1 dt \left( h_1\Phi_x + \dot{h}_1\Phi_{\dot{x}} + h_2\Phi_y + \dot{h}_2\Phi_{\dot{y}} \right). \tag{9.32} \]
The left end of the path is fixed at $t = 0$, consequently $h_1(0) = h_2(0) = 0$, and
integration by parts gives
\[ \Delta S = \Big[ h_1\Phi_{\dot{x}} + h_2\Phi_{\dot{y}} \Big]_{t=1} - \int_0^1 dt \left\{ h_1\left( \frac{d}{dt}\Phi_{\dot{x}} - \Phi_x \right) + h_2\left( \frac{d}{dt}\Phi_{\dot{y}} - \Phi_y \right) \right\}. \tag{9.33} \]
If $S[x, y]$ is stationary it is necessary that $\Delta S = 0$ for all allowed variations $h_1(t)$ and
$h_2(t)$. By restricting the varied paths to those on which $h_1(1) = h_2(1) = 0$ we see that
the stationary path must satisfy the Euler-Lagrange equations
\[ \frac{d}{dt}\Phi_{\dot{x}} - \Phi_x = 0, \quad \frac{d}{dt}\Phi_{\dot{y}} - \Phi_y = 0, \quad x(0) = a, \quad y(0) = A. \tag{9.34} \]
The general solutions of these equations satisfying the conditions at $t = 0$ will contain
two constants, which we denote by c and d. On these paths the Gateaux differential
becomes
\[ \Delta S = \Big[ h_1(t)\Phi_{\dot{x}} + h_2(t)\Phi_{\dot{y}} \Big]_{t=1}. \tag{9.35} \]
Because all admissible paths terminate on C, as shown in figure 9.7, the values of h1 (1)
and h2 (1) are related.
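[Figure 9.7: sketch of the stationary path and a varied path starting at $(a, A)$ and ending on the curve C at $\tau = \tau_1$ and $\tau = \tau_1 + \epsilon\xi$ respectively.]
Figure 9.7 Diagram showing the stationary path, the terminating
curve, C, and a varied path. At the intersection of C and the stationary
path $\tau = \tau_1$; and the varied path intersects C at $\tau = \tau_1 + \epsilon\xi$.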
Suppose that the stationary path terminates at $(\phi(\tau_1), \psi(\tau_1))$ and a varied path at a
different value of $\tau$, $\tau_1 + \epsilon\xi$. Hence, to first order in $\epsilon$, $h_1(1) = \xi\phi'(\tau_1)$ and
$h_2(1) = \xi\psi'(\tau_1)$, so that equation 9.35 gives $\Delta S = \xi\,[\phi'(\tau_1)\Phi_{\dot{x}} + \psi'(\tau_1)\Phi_{\dot{y}}]_{t=1}$.
But $\Delta S$ must be zero for all $\xi \ne 0$ and hence the required boundary condition is
\[ \Big[ \phi'(\tau_1)\Phi_{\dot{x}} + \psi'(\tau_1)\Phi_{\dot{y}} \Big]_{t=1} = 0. \tag{9.37} \]
This is the transversality condition in parametric form and is the equivalent of
equation 9.26 (page 350).
There are now three constants that need to be determined: these are $(c, d)$ from
the solution of equations 9.34 and the value of the parameter $\tau_1$, where the stationary
path intersects C. Equation 9.37 gives one relation between these three parameters: the
other two are $x(1, c, d) = \phi(\tau_1)$ and $y(1, c, d) = \psi(\tau_1)$. In principle these equations can
be solved to give the required stationary path.
In order to see how this theory works consider the problem solved in exercise 8.1(b)
(page 313), that is the stationary values of the distance between the origin and the
parabola now defined parametrically by $(1 - \tau^2, a\tau)$.
The parametric form of the functional is
\[ S[x, y] = \int_0^1 dt\, \sqrt{\dot{x}^2 + \dot{y}^2}, \quad x(0) = y(0) = 0, \tag{9.38} \]
where c and d are constants to be determined: these solutions are the parametric
equations of a straight line through the origin, as expected. Hence equation 9.39 becomes
$ad = 2\tau_1 c$. But at $t = 1$ the solution 9.40 intersects the parabola, hence $c = 1 - \tau_1^2$ and
$d = a\tau_1$. Substituting these into the equation $ad = 2\tau_1 c$ gives
\[ a^2\tau_1 = 2\tau_1(1 - \tau_1^2), \quad \text{that is} \quad \tau_1 = 0 \quad \text{or} \quad \tau_1^2 = 1 - \frac{a^2}{2}. \]
The first of these solutions, $\tau_1 = 0$, gives $x = 1$ and $y = 0$. The second equation,
$\tau_1^2 = 1 - a^2/2$, has real solutions if $a < \sqrt{2}$, which are the solutions found previously in
exercise 8.1(b).
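These stationary values can be confirmed directly, since the distance from the origin to the point $(1 - \tau^2, a\tau)$ is an ordinary function of $\tau$. The sketch below assumes the straight-line solution has the form $x = ct$, $y = dt$ (our reading of the solution referred to as 9.40), checks the relation $ad = 2\tau_1 c$, and estimates the derivative of the distance by central differences; the value of a is hypothetical.

```python
import math

def distance(tau, a):
    """Distance from the origin to the point (1 - tau^2, a*tau) on the parabola."""
    return math.hypot(1.0 - tau * tau, a * tau)

def d_distance(tau, a, eps=1e-6):
    """Central-difference estimate of d(distance)/d(tau)."""
    return (distance(tau + eps, a) - distance(tau - eps, a)) / (2.0 * eps)

a = 1.0                                     # hypothetical value with 0 < a < sqrt(2)
tau_1 = math.sqrt(1.0 - a * a / 2.0)
checks = []
for tau in (0.0, tau_1, -tau_1):
    c, d = 1.0 - tau * tau, a * tau         # end point of the line x = c t, y = d t
    checks.append((a * d - 2.0 * tau * c, d_distance(tau, a)))
    print(tau, checks[-1])
```

Both entries of each pair should vanish: the first is the transversality relation, the second the derivative of the distance along the parabola.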
9.5. WEIERSTRASS-ERDMANN CONDITIONS 355
Exercise 9.15
For the parametrically defined curve $x = \phi(\tau)$, $y = \psi(\tau)$, use the method described
above to show that the distance along the straight line $y = mx$ from the origin to
a point on this curve is stationary if $m = -\phi'(\tau)/\psi'(\tau)$. If the curve is represented
by the function $y(x)$, show that this becomes $m y'(x) = -1$ and give a geometric
interpretation of this formula.
Exercise 9.16
Express the functional defined in equation 9.38 in non-parametric form and find
its stationary paths.
The configuration adopted by the wire is the continuous stationary path of this func-
tional.
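[Figure 9.8: sketch of a wire stretched between $x = 0$ and $x = L$, with segments $y_1$ and $y_2$ meeting at the weight Mg.]
Figure 9.8 Diagram of a light, taut wire of length L supporting a weight at $x = \xi$.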
This energy functional is different from others considered because the point $x = \xi$ is
special. We deal with this by splitting the interval $[0, L]$ into two subintervals, $[0, \xi]$
and $[\xi, L]$, and writing the whole path, $y(x)$, in terms of two functions,
\[ y(x) = \begin{cases} y_1(x), & 0 \le x \le \xi, \\ y_2(x), & \xi \le x \le L. \end{cases} \tag{9.42} \]
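Now proceed in the usual manner. By choosing those $h(x)$ for which $h(x) = 0$ for
$\xi \le x \le L$ and those for which $h(x) = 0$ for $0 \le x \le \xi$, we obtain the Euler-Lagrange
equations for $y_1(x)$ and $y_2(x)$,
\[ \begin{aligned} \frac{d^2y_1}{dx^2} &= 0, \quad 0 \le x < \xi, \quad y_1(0) = 0, \\ \frac{d^2y_2}{dx^2} &= 0, \quad \xi < x \le L, \quad y_2(L) = 0. \end{aligned} \tag{9.44} \]
On the path satisfying these equations the Gateaux differential becomes
\[ \Delta E[y, h] = \Big\{ Mg + T\big( y_1'(\xi) - y_2'(\xi) \big) \Big\}\, h(\xi), \]
and since this must be zero for all $h(\xi)$,
\[ y_2'(\xi) - y_1'(\xi) = \frac{Mg}{T}. \tag{9.45} \]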
Physically, this equation represents the resolution of forces acting on the weight in the
vertical direction. Together with the continuity of y(x) this condition provides sufficient
information to find a stationary path, as we now show by solving the equations.
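As a hedged illustration of how these conditions fix the path, the sketch below takes the straight-line solutions of equations 9.44 in the form $y_1 = m_1 x$ and $y_2 = m_2(x - L)$ (the names $m_1$, $m_2$ are ours), imposes continuity at $x = \xi$ and the jump condition 9.45, and checks both numerically. The parameter values are hypothetical.

```python
def wire_slopes(M, g, T, L, xi):
    """Slopes (m1, m2) of y1 = m1*x and y2 = m2*(x - L), obtained by solving
    m1*xi = m2*(xi - L) (continuity) and m2 - m1 = M*g/T (equation 9.45)."""
    m2 = M * g * xi / (T * L)
    m1 = -M * g * (L - xi) / (T * L)
    return m1, m2

M, g, T, L, xi = 0.5, 9.81, 200.0, 2.0, 0.7     # hypothetical values
m1, m2 = wire_slopes(M, g, T, L, xi)
y1_at_xi = m1 * xi
y2_at_xi = m2 * (xi - L)
print(y1_at_xi, y2_at_xi)                       # equal: the path is continuous
print(m2 - m1, M * g / T)                       # equal: the jump condition holds
```

The displacement at the weight is negative, as expected for a wire that sags under the load when y is measured upwards.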
The solutions of the Euler-Lagrange equations 9.44 that satisfy the boundary con-
ditions at x = 0 and x = L are
Exercise 9.17
Find the continuous stationary paths of the functional
\[ S[y] = C y(\xi)^2 + \frac{1}{2}\int_0^L dx\, y'^2, \quad y(0) = A, \quad 0 < \xi < L, \]
with natural boundary conditions at $x = L$. Explain why there cannot be a
unique, nontrivial solution if $A = 0$.
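and compute its value on the varied paths $y_1 + \epsilon h_1$ and $y_2 + \epsilon h_2$, and also allowing the
point $x = c$ to move to $c' = c + \epsilon\xi$, as shown diagrammatically in figure 9.9.
[Figure 9.9: sketch of the stationary path, made up of $y_1$ from $(a, A)$ to the corner at $x = c$ and $y_2$ from the corner to $(b, B)$, together with a varied path $y_1 + \epsilon h_1$, $y_2 + \epsilon h_2$ whose corner is at $x = c'$.]
Figure 9.9 Diagram showing the stationary and a varied
path: here $c' = c + \epsilon\xi$.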
Each integral is similar to that defined in equation 9.20 (page 349) and using the same
analysis that leads to equation 9.25, see exercise 9.18, we obtain,
\[ \Delta S[y, h] = \Big[ \xi F + h_1 F_{y'} \Big]_{(x,y)=(c,y_1)} - \Big[ \xi F + h_2 F_{y'} \Big]_{(x,y)=(c,y_2)}. \tag{9.49} \]
On the stationary path the coordinates of the corner are $(c, y(c))$ and on a varied path
these become $(c + \epsilon\xi, y(c) + \epsilon\eta)$, with $\xi$ and $\eta$ independent variables. In terms of $y_1$ and
$y_2$ we have
Since $y(x)$ is continuous, $y_1(c) = y_2(c) = y(c)$, these equations allow $h_1(c)$ and $h_2(c)$
to be expressed in terms of the independent variables $\xi$ and $\eta$. Substituting these
expressions into equation 9.49 for $\Delta S$ we obtain
\[ \Delta S[y, h] = \Big[ \xi\left( F - y' F_{y'} \right) + \eta F_{y'} \Big]_{(x,y)=(c,y_1)} - \Big[ \xi\left( F - y' F_{y'} \right) + \eta F_{y'} \Big]_{(x,y)=(c,y_2)}. \tag{9.52} \]
Note that each term of the right-hand side of this equation is similar to the left-hand
side of equation 9.26 (page 350) with $\xi = \tau_y$ and $\eta = -\tau_x$: the important difference is
that $\xi$ and $\eta$ are independent variables. Because of this $\Delta S = 0$ only if the coefficients
of $\xi$ and $\eta$ are both zero, which gives the two relations
\[ \Big[ F_{y'} \Big]_{(c,y_1)} = \Big[ F_{y'} \Big]_{(c,y_2)} \quad \text{and} \quad \Big[ F - y' F_{y'} \Big]_{(c,y_1)} = \Big[ F - y' F_{y'} \Big]_{(c,y_2)}. \]
known as the Weierstrass-Erdmann (corner) conditions and they hold at every corner
of a stationary path. With one corner the Euler-Lagrange equations 9.50 and 9.51 may
be solved to give functions y1 (x, ) and y2 (x, ), each involving one arbitrary constant.
Substituting these into the corner conditions gives two equations relating , and c:
a third equation is given by the continuity equation y1 (c, ) = y2 (c, ). These three
equations allow, in principle, values for , and c to be found.
Exercise 9.18
Derive equations 9.499.52.
Because the integrand depends only upon $y'$, the solutions of the Euler-Lagrange
equation are the straight lines $y = mx + \alpha$, for some constants m and $\alpha$. Therefore the
smooth solution that fits the boundary conditions is $y = x/2$ and on this path $S = 1/8$:
moreover, by considering the second-order terms in the expansion of $S[y + \epsilon h]$ we see
that this path is a local maximum of S.
However, if $y' = 0$ or $y' = 1$ the integrand is zero, so we can imagine a broken
path comprising segments of straight lines at 45° and parallel to the x-axis on which
$S[y] = 0$; because the integrand is non-negative such a path gives a global minimum.
We now show that the corner conditions give such solutions.
Suppose that there is one corner at $x = c$. The two solutions that fit the boundary
conditions either side of c are
\[ y = \begin{cases} y_1 = m_1 x, & 0 \le x \le c, \\ y_2 = m_2(x - 2) + 1, & c \le x \le 2. \end{cases} \]
Since
The only non-trivial solutions of these equations and the continuity condition,
$m_1 c = m_2(c - 2) + 1$, are $(m_1, m_2, c) = (1, 0, 1)$ and $(0, 1, 1)$, which give the two solutions shown
by the solid and dashed lines, respectively, in figure 9.10. On both lines the functional
has its smallest possible value of zero.
[Figure 9.10: sketch of broken paths from the origin to $(2, 1)$, with the points $(1, 1)$ and $(2, 1)$ marked.]
Figure 9.10 Graph of some broken extremals for the functional 9.55. On the
solid line $(m_1, m_2) = (1, 0)$: on the dashed line $(m_1, m_2) = (0, 1)$ and in both
cases $c = 1$. The dotted line is a broken extremal with several corners.
In this example there are solutions with any number of corners comprising alternate
lines with unit gradient and horizontal lines; an example is depicted by the dotted line
in figure 9.10.
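The corner conditions for this example can be checked in a few lines. The sketch below assumes that the functional 9.55, which is not reproduced above, has the integrand $F = y'^2(1 - y')^2$ on $0 \le x \le 2$ with $y(0) = 0$, $y(2) = 1$; this assumption is consistent with the values $S = 1/8$ on $y = x/2$ and $S = 0$ on the broken paths quoted in the text. The function names are our own.

```python
def F(p):
    """Assumed integrand of functional 9.55: F = p^2 (1 - p)^2 with p = y'."""
    return p * p * (1.0 - p) ** 2

def F_p(p):
    """dF/dp, the quantity F_{y'} in the first corner condition."""
    return 2.0 * p * (1.0 - p) * (1.0 - 2.0 * p)

def F_minus_pFp(p):
    """F - y' F_{y'}, the quantity in the second corner condition."""
    return F(p) - p * F_p(p)

def action(segments):
    """S for a piecewise-linear path given as (slope, width in x) pairs."""
    return sum(F(m) * width for m, width in segments)

smooth = action([(0.5, 2.0)])                  # the path y = x/2
broken = action([(1.0, 1.0), (0.0, 1.0)])      # (m1, m2, c) = (1, 0, 1)
print(smooth, broken)
print(F_p(1.0) - F_p(0.0), F_minus_pFp(1.0) - F_minus_pFp(0.0))
```

Both differences in the last line vanish, so the slopes 1 and 0 can be joined at a corner, in either order, without violating the Weierstrass-Erdmann conditions.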
Exercise 9.19
(a) Show that the stationary path of the functional 9.55 without corners is $y = x/2$
and that on this path $S[y] = 1/8$.
(b) If $y = x/2$ show that
\[ S[y + \epsilon h] = S[y] - \frac{1}{2}\epsilon^2\int_0^2 dx\, h'(x)^2 + \cdots \]
Exercise 9.20
Show that the only solutions of equations 9.56 are those given in the text.
Exercise 9.21
Find the stationary paths of the functional 9.55 with two corners.
Exercise 9.22
Find the stationary paths of the functional
\[ S[y] = \int_0^4 dx \left( y'^2 - 1 \right)^2, \quad y(0) = 0, \quad y(4) = 2, \]
having just one corner.
9.6. NEWTONS MINIMUM RESISTANCE PROBLEM 361
can be derived directly from equations 9.53 and 9.54, by setting $\Phi(x, y, \dot{x}, \dot{y}) = \dot{x} F(x, y, \dot{y}/\dot{x})$
and recalling the results of exercise 8.11 (page 319), to give
where the corner is at $t = c$, with $0 < c < 1$. At such a corner either or both of $\dot{x}(t)$
and $\dot{y}(t)$ are discontinuous.
\[ x(p) = \frac{c(1 + p^2)^2}{p} \quad \text{and} \quad y(p) = B + c\left( \ln p - p^2 - \frac{3}{4}p^4 \right) \tag{9.64} \]
where B is a constant which has absorbed all other constants: in these equations p
may be regarded as a parameter, so we have found a solution in parametric form. The
required solution is obtained by finding the appropriate values of B, c and a range of p
that satisfy $(x, y) = (0, A)$ and $(b, 0)$: it transpires that this is impossible, as will now
be demonstrated.
Define the related functions
\[ \xi(p) = \frac{x}{c} = \frac{(1 + p^2)^2}{p} \quad \text{and} \quad \eta(p) = \frac{y - B}{c} = \ln p - p^2 - \frac{3}{4}p^4, \quad p > 0, \tag{9.65} \]
which contain no arbitrary constants. Since, by definition, $p = -y'(x)$ it follows from
the chain rule that $p\,\xi'(p) = -\eta'(p)$ and hence for $p \ne 0$ the stationary points of $\xi(p)$
and $\eta(p)$ coincide. The graphs of $\xi(p)$ and $\eta(p)$ are shown in figure 9.11.
[Figure 9.11: two panels showing $\xi(p)$ (values between 2 and 7) and $\eta(p)$ (values between 0 and −6) plotted against p for $0 < p \le 1.4$.]
Figure 9.11 Graphs of $\xi(p)$ and $\eta(p)$. Each function is stationary at $p = 1/\sqrt{3}$.
Since $\xi'(p) = (p^2 + 1)(3p^2 - 1)/p^2$, $\xi(p)$ has a single minimum at $p = 1/\sqrt{3}$ and, because
$p\,\xi'(p) = -\eta'(p)$, $\eta(p)$ has a single maximum at $p = 1/\sqrt{3}$. The coordinates of the
stationary points are $(\xi, \eta) = \left( 16\sqrt{3}/9, -\frac{1}{2}\ln 3 - \frac{5}{12} \right) = (3.08, -0.97)$. The minimum
value of $x(p)$ is $16\sqrt{3}\,c/9$, with $c > 0$, hence there is no nontrivial stationary path that
can pass through $x = 0$, the lower boundary. However, we pursue the investigation of
this general solution because it is needed for the stationary path of $S_2$.
³ Note that this is a fairly common trick and was used to simplify Riccati's equation in section 7.9.
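The stationary values quoted above, and the relation between the derivatives of the two functions, can be checked numerically. The sketch below uses the definitions of equation 9.65 as reconstructed here, with the sign convention $p\,\xi'(p) = -\eta'(p)$; the derivative is a simple central-difference estimate and the function names are ours.

```python
import math

def xi(p):
    """xi(p) = (1 + p^2)^2 / p, the scaled x-coordinate of equation 9.65."""
    return (1.0 + p * p) ** 2 / p

def eta(p):
    """eta(p) = ln p - p^2 - (3/4) p^4, the scaled y-coordinate of equation 9.65."""
    return math.log(p) - p * p - 0.75 * p ** 4

def derivative(f, p, eps=1e-6):
    """Central-difference estimate of f'(p)."""
    return (f(p + eps) - f(p - eps)) / (2.0 * eps)

p0 = 1.0 / math.sqrt(3.0)
print(xi(p0), 16.0 * math.sqrt(3.0) / 9.0)            # minimum of xi
print(eta(p0), -0.5 * math.log(3.0) - 5.0 / 12.0)     # maximum of eta
residuals = [p * derivative(xi, p) + derivative(eta, p) for p in (0.3, p0, 1.2)]
print(residuals)                                      # p*xi' + eta' should vanish
```

The first two lines reproduce the point $(3.08, -0.97)$, and the residuals confirm that the two functions are stationary at the same value of p.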
The graphs of $\xi(p)$ and $\eta(p)$ show that there are two branches of the function $\eta(\xi)$,
the solution of equation 9.62, one defined by p in the interval $[1/\sqrt{3}, \infty)$ and the other
on the interval $(0, 1/\sqrt{3}]$. Consider each case.
$p\sqrt{3} > 1$: for p increasing from $1/\sqrt{3}$, $\xi(p)$ increases monotonically from its
minimum value and $\eta(p)$ decreases monotonically from its maximum value. Hence
the function $\eta(\xi)$ remains in the fourth quadrant starting at $(3.08, -0.97)$, where
$p = 1/\sqrt{3}$, and behaving as $\eta \simeq -3\xi^{4/3}/4$ for large p. This is the curve MR in
figure 9.12. At $p = 1/\sqrt{3}$, $\eta'(\xi) = -1/\sqrt{3}$. Since $p > 1/\sqrt{3}$, this curve is a local
minimum of $S_1[y]$, see equation 9.61.
$p\sqrt{3} < 1$: for p decreasing from $1/\sqrt{3}$, $\xi(p)$ increases monotonically from its
minimum value and $\eta(p)$ decreases monotonically from its maximum value: again
$\eta(\xi)$ remains in the fourth quadrant, and for small p, $\xi \simeq 1/p$ and $\eta(\xi) \simeq -\ln\xi$.
On this curve $\eta(\xi)$ decreases more slowly than on the previous curve. At $p = 1/\sqrt{3}$,
$\eta'(\xi) = -1/\sqrt{3}$. This is the curve MS in figure 9.12.
The equations 9.64 define the parametric equations of a curve in the $(\xi, \eta)$-plane with
parameter p; this curve is shown in figure 9.12. In principle $\eta$ can be expressed in terms
of $\xi$, but no simple formula for this relation exists. The two branches MR and MS of
$\eta(\xi)$, shown in figure 9.12, start at $(3.08, -0.97)$ with the same gradient.
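[Figure 9.12: the branches MR ($p\sqrt{3} > 1$) and MS ($p\sqrt{3} < 1$) in the $(\xi, \eta)$-plane, starting from the point M, for $\xi$ from 2 to 7 and $\eta$ from 0 to −6.]
Figure 9.12 Graph of the two branches of $\eta(\xi)$, the solution of equation 9.62.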
The above analysis shows that there is no smooth stationary path for $S_1[y]$.
However, suppose that the solid of revolution surrounds a hollow cylinder with axis
along Oy and with a given radius, a, and height A, through which the fluid flows
unhindered, as shown in figure 9.13.
[Figure 9.13: sketch of the profile between $x = a$, where the height is A, and $x = b$ on the x-axis.]
Figure 9.13 Diagram of a solid surrounding a hollow cylinder.
The functional for this problem is a variation of that defined in equation 9.59,
\[ S_3[y] = \int_a^b dx\, \frac{x}{1 + y'^2}, \quad y(a) = A, \quad y(b) = 0. \tag{9.66} \]
For given values of b/a and A/a we now show that these equations have a unique solution, provided A/a is larger than some minimum value that depends upon b/a, and tends to zero as $b \to a$. The equations can be solved numerically for the constants $(p_1, p_2, c, B)$, but this task is made easier by first expressing $p_2$ and c in terms of $p_1$. This is achieved by dividing the two equations 9.69 to eliminate c and writing the resultant expression in the form
$$\xi(p_2) = \frac{b}{a}\,\xi(p_1), \quad p_1 > \frac{1}{\sqrt{3}}. \tag{9.70}$$
This equation can be interpreted geometrically as illustrated in figure 9.14, which shows the graphs of $\xi(p) = (1 + p^2)^2/p$ and $b\xi(p)/a > \xi(p)$, the dashed line. For a given value of $p_1 > 1/\sqrt{3}$, we can see by following the arrows on the dotted lines that a unique value of $p_2$ is obtained. For large p we have $\xi(p) \simeq p^3$, giving the approximate solution $p_2 \simeq (b/a)^{1/3} p_1$.
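The geometric construction can also be carried out numerically. The following is a minimal sketch, assuming only the function $(1 + p^2)^2/p$ that appears in equation 9.70; the names `xi` and `solve_p2` and the bracket-doubling bisection are illustrative choices of this sketch and are not taken from the text. It relies on the fact that this function is increasing for $p > 1/\sqrt{3}$, so the root is unique.

```python
import math


def xi(p):
    """The function (1 + p^2)^2 / p appearing in equation 9.70."""
    return (1.0 + p * p) ** 2 / p


def solve_p2(p1, ratio, tol=1e-12):
    """Solve xi(p2) = ratio * xi(p1) for p2 > p1, where ratio = b/a > 1.

    xi is increasing for p > 1/sqrt(3), so the root found is unique.
    """
    if p1 <= 1.0 / math.sqrt(3.0) or ratio <= 1.0:
        raise ValueError("need p1 > 1/sqrt(3) and b/a > 1")
    target = ratio * xi(p1)
    lo, hi = p1, 2.0 * p1
    while xi(hi) < target:       # widen the bracket until it holds the root
        hi *= 2.0
    while hi - lo > tol * hi:    # plain bisection on the monotonic branch
        mid = 0.5 * (lo + hi)
        if xi(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for p1 in (1.0, 5.0, 50.0):
        p2 = solve_p2(p1, ratio=8.0)
        # For large p1 the ratio p2/p1 should approach (b/a)**(1/3).
        print(p1, p2, p2 / p1)
```

For large $p_1$ the computed ratio $p_2/p_1$ can be compared with the approximation $(b/a)^{1/3}$ quoted above.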
9.6. NEWTONS MINIMUM RESISTANCE PROBLEM 365
Figure 9.14 Graphs of the functions $\xi(p)$, the solid line, and $b\xi(p)/a$, the dashed line, and the geometric interpretation of the solution of equation 9.70.
Figure 9.15 Examples of the stationary paths of $S_3[y]$ for A = 4a and various values of b/a (b = 1.5a, 2a, 2.5a and 3.5a). For this value of A there are no solutions for b > 3.72a.
Differentiating with respect to $\epsilon$, then setting $\epsilon$ to zero, integrating by parts and using the fact that h(b) = 0 gives the Gateaux differential, see exercise 9.26,
$$\Delta S_2 = ak - kF\big(a, y'(a)\big) - h(a)F_{y'}\big(a, y'(a)\big) - \int_a^b dx\, \frac{dF_{y'}}{dx}\, h. \tag{9.78}$$
Using the subset of variations with k = h(a) = 0 gives equation 9.62, having the parametric solution
$$x = c\,\frac{(1 + p^2)^2}{p}, \quad y = B + c\left(\ln p - p^2 - \frac{3}{4}p^4\right), \quad p_1 \le p \le p_2, \tag{9.79}$$
where c and B are constants and we restrict $p > 1/\sqrt{3}$ because in the previous case only this range of p gave a minimum: this assumption is justified in exercise 9.27.
On using equation 9.76 to express h(a) in terms of k, we see that on the stationary path the Gateaux differential has the value
$$\Delta S_2 = \Big\{ a - F\big(a, y'(a)\big) + y'(a)F_{y'}\big(a, y'(a)\big) \Big\}\, k. \tag{9.80}$$
From this it follows that $y'(a)^2 = 1$ (we ignore the solution $y'(a) = 0$) and since y(x) is a decreasing function $y'(a) = -1$. From the definition of p, equation 9.63, it follows that $p_1 = 1$. Thus the solution is
$$x = \frac{a}{4}\,\frac{(1 + p^2)^2}{p}, \quad y = A + \frac{7a}{16} + \frac{a}{4}\left(\ln p - p^2 - \frac{3}{4}p^4\right), \quad 1 \le p \le p_2. \tag{9.82}$$
Finally the values of a and p2 are determined from the boundary conditions x(p2 ) = b
and y(p2 ) = 0. Combining these equations we obtain
$$A = \frac{b p_2}{(1 + p_2^2)^2}\left\{ \frac{3}{4}p_2^4 + p_2^2 - \ln p_2 - \frac{7}{4} \right\}. \tag{9.83}$$
The term in curly brackets is zero at p2 = 1 and the right-hand side of this equation
increases as p2 for large p2 . Also the gradient of the right-hand side is positive for
p2 > 1, see exercise 9.25. Hence for any positive value of A this equation gives a unique
value of p2 ; and then a can be determined from either of equations 9.82. Further, this
path is a local (weak) minimum, see exercise 9.27.
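The determination of $p_2$ and a can be sketched numerically. The code below is an illustration under stated assumptions, not the method used in the text: it takes the right-hand side of equation 9.83 divided by b (the function called G in exercise 9.25), uses the stated fact that it increases from zero for $p_2 > 1$, and finds the root by bisection. The names `solve_newton_profile` and `profile` are hypothetical helpers introduced here.

```python
import math


def G(p):
    """Right-hand side of equation 9.83 divided by b (see exercise 9.25)."""
    bracket = 0.75 * p ** 4 + p ** 2 - math.log(p) - 1.75
    return p * bracket / (1.0 + p * p) ** 2


def solve_newton_profile(A, b, tol=1e-12):
    """Return (p2, a) for a given height A > 0 and outer radius b > 0."""
    target = A / b
    lo, hi = 1.0, 2.0            # G(1) = 0, and G increases for p > 1
    while G(hi) < target:        # widen the bracket until it holds the root
        hi *= 2.0
    while hi - lo > tol * hi:    # bisection
        mid = 0.5 * (lo + hi)
        if G(mid) < target:
            lo = mid
        else:
            hi = mid
    p2 = 0.5 * (lo + hi)
    a = 4.0 * b * p2 / (1.0 + p2 * p2) ** 2   # from x(p2) = b in equation 9.82
    return p2, a


def profile(A, a, p):
    """Point (x, y) on the curved part of the path, equation 9.82."""
    x = 0.25 * a * (1.0 + p * p) ** 2 / p
    y = A + 7.0 * a / 16.0 + 0.25 * a * (math.log(p) - p * p - 0.75 * p ** 4)
    return x, y


if __name__ == "__main__":
    p2, a = solve_newton_profile(A=4.0, b=1.0)
    print(p2, a, profile(4.0, a, 1.0), profile(4.0, a, p2))
```

Evaluating `profile` at p = 1 and at p = p2 gives a direct check that the curve starts at (a, A) and ends at (b, 0).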
In figure 9.16 are shown some solutions for the cases A = 4, $b = 1, 2, \dots, 5$, 8 and 10: in this figure only the curved parts of the solutions are shown.
Figure 9.16 Graphs of the solutions defined in equation 9.82 for A = 4 and $b = 1, 2, \dots, 5$, 8 and 10. Here the horizontal part of each solution, from x = 0 to a, is not shown.
Exercise 9.23
Derive the first two terms of equation 9.72.
Exercise 9.24
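Show that as $a \to 0$ equation 9.73 can be written in the approximate form,
$$\frac{A}{a} \simeq \frac{3}{4}\left(\frac{b}{a}\right)^{4/3} \xi(p_1)^{1/3},$$
with $\xi(p) = (1 + p^2)^2/p$ as in equation 9.70, and hence that for sufficiently small a there is no solution if $A < 1.09\, b^{4/3} a^{-1/3}$.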
Exercise 9.25
Denote the right-hand side of equation 9.83 by $bG(p_2)$ where
$$G(p) = \frac{p}{(1 + p^2)^2}\left\{ \frac{3}{4}p^4 + p^2 - \ln p - \frac{7}{4} \right\}$$
Exercise 9.26
Derive the Gateaux differential 9.78.
Exercise 9.27
Show that the second derivative of $S_2[y + \epsilon h]$ evaluated at $\epsilon = 0$ is
$$\frac{d^2 S_2}{d\epsilon^2} = \frac{1}{2}k^2 + 2\int_a^b dx\, \frac{x(3y'^2 - 1)}{(1 + y'^2)^3}\, h'^2,$$
where k is defined in equation 9.76. Deduce that the stationary path defined by equation 9.82 gives a (weak) local minimum of $S_2[y]$, provided $3y'^2 > 1$.
Exercise 9.28
(a) Consider the value of $S_1[y]$, defined in equation 9.59, on the path
$$z(x) = A\cos\left(\left(n + \frac{1}{2}\right)\frac{\pi x}{b}\right), \quad 0 \le x \le b,$$
where n is any integer, and show that $S_1[z]$ may be made arbitrarily small.
(b) Which norm does this path satisfy?
9.7. MISCELLANEOUS EXERCISES 369
Exercise 9.30
Show that the stationary paths of the functional $S[y] = \int_0^a dx\, y\sqrt{1 + y'^2}$, $y(0) = A > 0$, with natural boundary conditions at x = a, are given by
$$y = c\cosh\left(\frac{a - x}{c}\right) \quad\text{with}\quad A = c\cosh\left(\frac{a}{c}\right).$$
Show that there are two solutions if A > 1.509a and none for smaller A.
Exercise 9.31
Derive the Euler-Lagrange equations for the functional
$$S[y] = G(y(v)) + \int_a^v dx\, F(x, y, y'), \quad y(a) = A,$$
Exercise 9.32
Find the stationary paths for the functional $S[y] = \int_0^v dx\, \frac{\sqrt{1 + y'^2}}{y}$, $y(0) = 0$, where the point (v, y(v)) is constrained to the curve $x^2 + (y - r)^2 = r^2$, that is a
Exercise 9.33
Consider the functional
Z v
dx f (x, y) 1 + y 0 2 exp tan1 y 0 ,
p
S[y] = y(a) = A,
a
with the condition that the right-hand end of the stationary path lies on the curve
C defined by (x, y) = 0. If the gradient of C and the stationary path at the point
of intersection are, respectively, tan C and tan , show that
1
tan( C ) = .
Exercise 9.34
Show that the stationary path of the functional
$$S[y] = \int_0^1 dx\, \left(xy + y''^2\right), \quad y(0) = y'(0) = y(1) = 0,$$
is $y(x) = x^2(1 - x)(2x^2 + 2x - 7)/480$.
Exercise 9.35
A weight of mass M is hung from the end, x = L, of the beam described by the
functional of equation 9.1 and the beam is clamped at x = 0. The relevant energy
functional is
Z L
1 00 2
E[y] = M gy(L) + dx y (x)gy , y(0) = y 0 (0) = 0,
0 2
where the y-axis is pointing downwards. Find the associated Euler-Lagrange equa-
tion and boundary conditions for this problem. Solve this equation in the case
that is independent of x.
Exercise 9.36
A weight of mass M is hung from a given point, x = , 0 < < L, of the beam
described by the functional of equation 9.1 and the beam rests on supports at
x = 0 and x = L, both at the same level. The relevant energy functional is
Z L
1 00 2
E[y] = M gy() + dx y (x)gy , y(0) = y(L) = 0,
0 2
where the y-axis is pointing downwards. Assuming that y(x) is continuous at
x = , find the associated Euler-Lagrange equation and all the boundary condi-
tions for this problem.
Exercise 9.37
Prove that the functional
$$S[y] = \int_a^b dx\, \left(y'^2 + 2yy' + y^2\right), \quad y(a) = A, \quad y(b) = B,$$
Exercise 9.38
Can the functional $S[y] = \int_0^a dx\, y'^3$, y(0) = 0, y(a) = A, have broken extremals?
Exercise 9.39
Does the functional $S[y] = \int_0^a dx\, \left(y'^4 - 6y'^2\right)$, y(0) = 0, y(a) = A > 0, have any stationary paths with a single corner? Find any such paths.
Exercise 9.40
Find the equation for the stationary curve of the modified brachistochrone problem in which the initial point is (0, A), A > 0, and the final point is on a circle with centre on the x-axis at x = b and with radius r < b. The particle starts from rest at (0, A).
9.8. SOLUTIONS FOR CHAPTER 9 371
and the Euler-Lagrange equation reduces to $y'' = 0$. The solution passing through (0, A) is therefore y = mx + A, for some m to be determined. The natural boundary condition, equation 9.7, is
$$F_{y'} = \frac{y'}{\sqrt{1 + y'^2}} = 0,$$
that is, $y' = m = 0$, so that y = A, which defines a straight line parallel to the x-axis.
The same reasoning as used in the text gives the required result.
dz dz . dx 1 dx 4b
z0 = = = and = sin2 ,
dx d d tan d
and the time is given by the functional, see equation 4.6 (page 166),
Z /2 r s Z /2 s
1 dx 1 + z 0 2 b b
T = d =2 d = .
2g 0 d z g 0 g
Integrating this and assuming that v(x) 0 and that y 0 (x) 0 gives
1 x
Z
y(x) = du v(u).
c 0
Using the subset of variations with h(a) = h(b) = 0 and the fundamental lemma of the
Calculus of Variations we see that S[y] is stationary only on those paths satisfying the
equations y 00 + y = 0. On these paths the Gateaux differential is
S[y, h] = 2h(b) y 0 (b) + gb y(b) 2h(a) y 0 (a) + ga y(a)
and this is zero for all variations only if ga y(a) + y 0 (a) = 0 and gb y(b) + y 0 (b) = 0.
g 2 g gL gL2
L + 6AL + 2B = 0 and L + 6A = 0 = A= , B= ,
2 6 4
g 2 2
x x 4Lx + 6L2 .
giving the solution y(x) =
24
Solution for Exercise 9.11
(a) The Gateaux differential is given in equation 9.14 and since h(a) = h(b) = 0 this
reduces to
b b
d2
F F d F F
Z
0
S[y, h] = h + dx + h.
y 00 a a dx2 y 00 dx y 0 y
Using the subset of varied paths for which h0 (a) = h0 (b) = 0, we see that y(x) satisfies
the Euler-Lagrange equation
d2
F d F F
+ = 0, y(a) = A, y(b) = B.
dx2 y 00 dx y 0 y
The other boundary conditions are obtained by considering those paths for which
h0 (a) = 0 and those for which h0 (b) = 0, which gives Fy00 |a = Fy00 |b = 0.
y0 1
Fy0 + (y 0 Fy0 F ) = 0 that is p =0
y 1+y 0 2 c
p
but y 1 + y 0 2 = c therefore y 0 = 1.
0 0
for y , with y = 1 gives y = c/ 2and
If the intersection is at x = v, the equation
from the equation for the circle v = c(1 1/ 2); the required root is v = c(1 + 1/ 2),
the other root corresponding to y 0 = 1. Now substitute these coordinates into the
straight line equation, y = x a, to see that c = a.
1 2
x= c (2 sin 2) , y = A c2 sin2 , 0 b .
2
p
The integrand of the functional may be taken to be F = 1 + y 0 2 / A y, so the
transversality condition 9.26 gives, since = x/a + y/b 1,
y0
1 1 dy a
p = 0 that is = .
Ay 1+ y0 2 a b dx b
dy 1 a
Hence the equation for b is = = . Finally, at = b the end of the
dx tan b b
cycloid is on the line x/a + y/b = 1 that is
c2 1
2b sin 2b + A c2 sin2 b = 1.
2a b
The equation for can be written in the form cos( ) = AB/, and this has real
roots only if |AB| . If |AB| > the equation has only complex roots and the
ellipse and the line do not intersect.
Z p
(b) The functional is S[y] = dx 1 + y 0 2 where the pairs of coordinates (u, v), on
u
the ellipse, and (, ), on the line, satisfy the equations
u2 v2
e = 2
+ 2 1 and l = + 1
a b A B
and also v = y(u) and = y().
The general solution of the Euler-Lagrange equation is y = mx + c, where m and c are
constants, which are chosen to satisfy the boundary conditions.
For the boundary conditions, equation 9.29, we require
y0 m 1
Fy 0 = p = and y 0 Fy0 F = .
1+ y0 2 1 + m2 1 + m2
The boundary conditions on the ellipse give
u m v 1 mu v
2
2 = 0 and hence 2
= 2. (9.84)
a 1+m 2 b 1 + m2 a b
The boundary conditions on the straight line give
1 m 1 1 A
= 0 and hence m = , (9.85)
A 1+m 2 B 1+m 2 B
which is the condition for the stationary path to be perpendicular to the straight line.
Also these points lie on the boundary curves,
u2 v2
+ = 1 and v = mu + c (9.86)
a2 b2
and
+ = 1 and = m + c. (9.87)
A B
Thus we have six equations for the six parameters (u, v), (, ) and (m, c) that we need
to find.
The distance along the stationary path is
Z p p
S[y] = dx 1 + m2 = ( u) 1 + m2 . (9.88)
u
Now m = A/B is given directly, equation 9.85, and subtracting equations 9.86 from 9.87
gives v = ( u)m: rearranging this and substituting for v from 9.84 gives
2
b
= m + mu 1 .
a2
Substitute this into the first of equation 9.87 gives
2
2 u
2 2 2 b 2 2 2
+ 2 = AB 2 .
B +A +A u 1 = AB or ( u) B + A
a2 a
Using
equations 9.86 and 9.84 we obtain (since u > 0) u = aB/ and since 1 + m2 =
A2 + B 2 /B we have
A2 + B 2 AB
S[y] = ( u) = .
B A2 + B 2
The solution of the associated Euler-Lagrange equation satisfying the boundary condition at x = 0 is y = mx for some constant m. The boundary condition on the parabola gives, on using equation 9.26, $a^2 y' = 2y$ and hence $v = a^2/2$. At this point $y = ma^2/2$, and since this point lies on the parabola we obtain $m^2 = 4/a^2 - 2$. Thus there are stationary paths if $a < \sqrt{2}$ and none if $a > \sqrt{2}$. Note that the stationary path through (1, 0) is not given by this method.
Using the same arguments as in the text we see that y1 and y2 satisfy the Euler-Lagrange
equations
the first to make the path stationary and the second to ensure that the path is continuous.
The Euler-Lagrange equations have the following solutions
where , and are constants. The natural boundary condition at x = L, see equa-
tion 9.7, gives Fy0 = y 0 (L) = 0, that is = 0.
Continuity at x = gives = + A and the other condition at x = gives
+ 2C = 0, and these two equations give
2CA A
= , and = .
1 + 2C 1 + 2C
Using the subset of variations for which = 0 gives equations 9.50 and 9.51, for y 1 (x)
and y2 (x). On these paths S reduces to equation 9.49.
The term $O(\epsilon)$ is, by definition, zero on a stationary path and since
Fy0 y0 = 2 1 6y 0 + 6y 0 2
Hence for all allowed h and $0 < |\epsilon| \ll 1$, $S[y + \epsilon h] - S[y] < 0$, so this stationary path is a local maximum of S[y].
interesting solutions; denote these by $x_1(y)$ and $x_2(y)$. These must also be solutions of the quartic, f(x) = f(y). But the root structure of a quartic and a cubic is different; for instance roots of the quartic will coalesce at different values of y than for the cubic, so it is unlikely that there is a range of y for which $x_1(y)$ and $x_2(y)$ satisfy both equations. There may, however, be accidental coincidences; we now show that there are none.
Consider the differences,
$$f(x) - f(y) = (x - y)F(x, y) \quad\text{and}\quad g(x) - g(y) = (x - y)G(x, y),$$
where F(x, y) and G(x, y) are respectively symmetric cubic and quadratic functions of x and y. The solution x = y is of no interest, so we require the solutions of the equations
$$F(x, y) = 3(x^3 + y^3) + 3x^2 y + 3xy^2 - 4(x^2 + y^2 + xy) + x + y = 0,$$
$$G(x, y) = 2(x^2 + y^2) + 2xy - 3(x + y) + 1 = 0.$$
These equations are more conveniently expressed in terms of the variables u = x + y and v = xy (so when x = y, $u^2 = 4v$),
$$F(u, v) = 3u^3 - 4u^2 - 6uv + u + 4v = 0,$$
$$G(u, v) = 1 - 3u + 2u^2 - 2v = 0,$$
which gives
$$v = \frac{1}{2}\left(1 - 3u + 2u^2\right)$$
and
$$F = (1 - u)(3u^2 - 6u + 2) = 0.$$
If u = 1 then v = 0 and (x, y) = (0, 1) and (1, 0), which are the solutions found in the text.
If $3u^2 - 6u + 2 = 0$, $u = 1 \pm \sqrt{3}/3$ and $v = 1/3 \pm \sqrt{3}/6$, so $u^2 = 4v$ giving x = y.
Hence there are no real solutions other than those found in the text.
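The elimination of v and the factorisation of F can be checked with exact rational arithmetic. The following is a small sketch, not part of the original solution; the function names are chosen here for illustration. Since both sides of each identity are polynomials of degree at most three in u, agreement at more than four sample points establishes the identity.

```python
from fractions import Fraction


def F(u, v):
    """F expressed in the variables u = x + y and v = xy."""
    return 3 * u ** 3 - 4 * u ** 2 - 6 * u * v + u + 4 * v


def v_from_G(u):
    """Solve G(u, v) = 1 - 3u + 2u^2 - 2v = 0 for v."""
    return (1 - 3 * u + 2 * u ** 2) / 2


def factored(u):
    """Claimed factorisation of F(u, v(u))."""
    return (1 - u) * (3 * u ** 2 - 6 * u + 2)


def coincidence(u):
    """u^2 - 4v along G = 0; this vanishes exactly when x = y."""
    return u ** 2 - 4 * v_from_G(u)


samples = [Fraction(k, 7) for k in range(-14, 15)]

# F(u, v(u)) agrees with the factorised form at every sample point.
identity_holds = all(F(u, v_from_G(u)) == factored(u) for u in samples)

# u^2 - 4v equals -(3u^2 - 6u + 2), so it vanishes at the roots of the quadratic factor.
double_root_holds = all(
    coincidence(u) == -(3 * u ** 2 - 6 * u + 2) for u in samples
)

print(identity_holds, double_root_holds)
```

The second check confirms, without evaluating square roots, that the roots of $3u^2 - 6u + 2 = 0$ correspond to x = y.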
For small |2 |
1
tan 2 = 2 + 23 + O(25 )
3
1 2 1 3 2 2
= d 1 + d + d + O(d ) = d 1 + d + O(d5 ),
5
3 3 3
and hence
1 1 1 2
p2 = = 2 2 4
= 1 d2 + O(d4 ) .
tan 2 d(1 + 3 d + O(d )) d 3
Integrating by parts and using the fact that h(b) = 0 then gives the required result,
$$\Delta S_2[y, h] = ak - kF\big(a, y'(a)\big) - h(a)F_{y'}\big(a, y'(a)\big) - \int_a^b dx\, \frac{dF_{y'}}{dx}\, h.$$
Since this equation is true for all $\epsilon$ in a neighbourhood of the origin it follows that $ky'(a) + h(a) = 0$, as in equation 9.76, and $k^2 y''(a) + 2kh'(a) = 0$. Thus the second derivative becomes the simple expression
$$\frac{d^2 S_2}{d\epsilon^2} = k^2\Big[1 - F_x\big(a, y'(a)\big)\Big] + \int_a^b dx\, h'^2 F_{y'y'}(x, y').$$
But,
$$F = \frac{x}{1 + y'^2} \quad\text{so that}\quad F_x = \frac{1}{1 + y'^2} \quad\text{and}\quad F_{y'y'} = \frac{2x(3y'^2 - 1)}{(1 + y'^2)^3},$$
and since $y'(a) = -1$, see equation 9.81, the second derivative becomes
$$\frac{d^2 S_2}{d\epsilon^2} = \frac{1}{2}k^2 + 2\int_a^b dx\, \frac{x(3y'^2 - 1)}{(1 + y'^2)^3}\, h'^2.$$
It follows that provided $3y'^2 > 1$ the second variation is positive for all nonzero k and h(x), and that the stationary path is a weak local minimum.
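In each integral of the sum put $(n + 1/2)u = p - 1 + w$, $0 \le w \le 1$, and $(n + 1/2)u = n + w$ in the last integral to write this in the form
$$S_1[z] = \frac{b^2}{(n + 1/2)^2}\sum_{p=1}^{n}\int_0^1 dw\, \frac{p - 1 + w}{1 + B^2\sin^2 \pi w} + \frac{b^2}{(n + 1/2)^2}\int_0^{1/2} dw\, \frac{n + w}{1 + B^2\sin^2 \pi w}.$$
But
$$\int_0^1 dw\, \frac{p - 1 + w}{1 + B^2\sin^2 \pi w} \le p\int_0^1 \frac{dw}{1 + B^2\sin^2 \pi w} = \frac{p}{\sqrt{1 + B^2}}, \quad p = 1, 2, \dots, n,$$
and
$$\int_0^{1/2} dw\, \frac{n + w}{1 + B^2\sin^2 \pi w} \le (n + 1)\int_0^1 \frac{dw}{1 + B^2\sin^2 \pi w} = \frac{n + 1}{\sqrt{1 + B^2}},$$
so that
$$S_1[z] \le \frac{b^2}{(n + 1/2)^2\sqrt{1 + B^2}}\sum_{p=1}^{n+1} p = \frac{b^2}{2\sqrt{1 + B^2}}\big(1 + O(1/n)\big).$$
But B = O(n) so $S_1[z] = O(1/n)$ for large n. Hence, given any number $\epsilon > 0$, an n can be found such that $S_1[z] < \epsilon$.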
(b) Since $\max|z'(x)| = O(n)$ the derivative is not bounded and z(x) satisfies the $D_0$ norm.
$$\frac{f(x, y)\, y'}{\sqrt{1 + y'^2}} = 0 \quad\text{at } x = b.$$
Since $f(x, y) \ne 0$, this means that $y'(b) = 0$ and that the stationary path is perpendicular to the line x = b.
Using the subset of variations for which h(v) = 0, we see that y(x) satisfies the Euler-Lagrange equation. Then the boundary terms show that the boundary condition at x = v is
$$F_{y'}\big(v, y(v), y'(v)\big) + G'\big(y(v)\big) = 0.$$
(b) If the right end of the path satisfies (v, y(v)) = 0 and the varied path is y + h,
and ends at v + for some , the same analysis that leads to equation 9.19 gives
x + y 0 (v)y + h(v)y = 0.
It is convenient to write
then differentiate with respect to , and then set = 0 to obtain the Gateaux differential
Z v
S = G0 (y(v)) y 0 (v) + h(v) + F (v, y(v), v 0 (v)) + dx (hFy + h0 Fy0 ) .
a
1 1
F y 0 Fy 0 = p =
y 1+y 0 2 c
Figure 9.17 Diagram of the boundary curve and the stationary path.
so that
f (x, y)
y 0 Fy 0 F = p exp( tan1 y 0 )(y 0 1).
1+y 0 2
f (x, y)
p {( + y 0 )x (y 0 1)y } = 0 at x = v.
1 + y0 2
If the gradient of and the stationary path at x = v are respectively tan C and tan ,
so x = y tan C and y 0 = tan , this boundary condition becomes
which rearranges to (tan tan c ) = 1 + tan tan c , giving the required result.
The general solution of this equation is y(x) = x5 /240 + Ax3 + Bx2 + Cx + D. The
boundary conditions y(0) = y 0 (0) = 0 give C = D = 0, and the other two conditions
give the equations
7 1
0= +A+B and 0 = + 6A + 2B
480 12
so that
L
dE
Z
= M gh(L) + dx h00 (y 00 + h00 ) gh
d 0
and putting = 0 gives the Gateaux differential
Z L
[E, h] = M gh(L) + dx h00 y 00 gh .
0
d4 y Mg
= (x)g, y(0) = y 0 (0) = y 00 (L) = 0, y (3) (L) = .
dx4
If is independent of x the general solution of this equation that satisfies the boundary
conditions at x = 0 is
g 4
y(x) = x + Ax3 + Bx2
24
and the constants A and B are determined by the two conditions at x = L; since
g 2 g
y (2) (x) = x + 6Ax + 2B and y (3) (x) = x + 6A
2
these give the equations
g Lg L
A= (M + L) and B = M+ .
6 2 2
Hence
g 4 x3 Lx2
L
y(x) = x (M + L)g + M+ g,
24 6 2 2
g 2 2 Mg 2
= x (x 4xL + 6L2 ) + x (3L x).
24 6
Since Z h i Z
dx y 00 h00 = y 00 h0 y 000 h + dx y (4) h
On collecting relevant terms together and using the fact that h(x) is continuous at x =
and that h1 (0) = h2 (L) = 0 this becomes
y2000 y1000 M g h(x) + (y100 h01 y200 h02 )
E =
x= x=
h i h i
00 0 00 0
y1 h1 + y 2 h2
x=0 x=L
Z Z L
(4) (4)
+ dx y1 g h1 + dx y2 g h2 .
0
Now choose the subset of variations that make all the boundary terms zero to see that
y1 and y2 satisfy the Euler-Lagrange equations
(4)
y1 = g, y1 (0) = 0, 0 x ,
(4)
y2 = g, y2 (L) = 0, x L.
Also, since h01 (0) 6= 0 and h02 (L) 6= 0 the natural boundary conditions at x = 0 and L
are
y100 (0) = y200 (L) = 0.
Finally choose the subset of variations for which h0 (x) is continuous to see that y100 () =
y200 (), that is y 00 (x) is continuous at x = .
For the sake of completeness we now show how these conditions can be used to
find the solution when is independent of x. This analysis was not requested in the
question.
The two solutions of the Euler-Lagrange equation that fit the boundary conditions
at x = 0 and L are
g 4
y1 (x) = x + a1 x3 + b1 x,
24
g
y2 (x) = (L x)4 + a2 (L x)3 + b2 (L x),
24
so there are four further constants to be determined by the conditions just derived. The
conditions of the second and third derivatives at x = give
gL
M g = gL 6(a1 + a2 ) and 6a2 L = 6(a1 + a2 ) + (2 L).
2
Hence
Mg Lg M g Lg
a1 = ( L) and a2 =
6L 12 6L 12
and
gL3 Mg gL3 Mg
b1 = + (L )(2L ) and b2 = + (L2 2 ).
24 6L 24 6L
The second equation gives m2 = m1 and the first shows that the only solution is
m1 = m2 . Hence the functional has no corners.
The first equation gives m2 = m1 (we ignore the solution m2 = m1 ), and then the
second equation gives
m1 = m, m2 = m with m = 3.
[Figure: graph of the cycloid and the boundary circle.]
which gives cos( ) = 0 (we ignore the solution sin = 0). This equation has many
solutions and we determine which is appropriate by considering the limiting cases where
= /2 (and c2 = A) so at the terminus the cycloid is tangential to the x-axis. Then
= , so the required solution is = /2.
The equations for c and the value of at the terminus, which we denote by , are
1 2
c (2 sin 2) = b r sin and c2 sin2 = A r cos .
2
An equation for is therefore
2 sin 2 b r sin
2 = , (b > r).
2 sin A r cos
This equation has one real root in the interval 0 < < , as may be seen by sketching
the graphs of
2 sin 2 b r sin
f1 () = 2 and f2 () = .
2 sin A r cos
Observe that f1 () = 43 + O( 3 ), that f1 () as and that f1 is monotonc
increasing for 0 < < . Note also that the behaviour of f2 depends upon whether
A > r or A < r, but in either case sketches of these functions show that there is one
real root for 0 < < .
Chapter 10
10.1 Introduction
In this chapter we introduce the method needed to treat constrained variational prob-
lems, examples of which are the isoperimetric and catenary problems, described in
sections 2.5.5 and 2.5.6. With such problems the admissible paths are constrained to
a subset of all possible paths: in the isoperimetric and catenary problems these con-
straints are the lengths of the boundary and chain, respectively.
We introduce the technique required using the simpler example of constrained sta-
tionary points of functions of two or more variables, beginning with a discussion of a few
elementary cases; the method is applied to the Calculus of Variations in the next chap-
ter. Throughout this chapter we assume that all functions are sufficiently differentiable
in the region of interest.
Consider a walker on a hill but confined to a one-dimensional path, AB, as shown
in figure 10.1.
Figure 10.1 Graph showing the height h(x, y) of the hill as x and y vary. The path x + y = 1 is depicted by the solid line.
394 CHAPTER 10. CONDITIONAL STATIONARY POINTS
and the path by the equation x + y = 1. This hill has a global maximum at x = y = 0,
but because the path does not pass through this point the maximum height attained
by the walker is less. The problem is to find this stationary point and its position: we
should also like to classify this stationary point, but usually this is more difficult.
The maximum height of the walker may be determined by rearranging the equation of the path to express y in terms of x, $y = 1 - x$, and then by expressing the height in terms of x alone,
$$h(x) = 3\exp\left(-\frac{3}{2}x^2 + x - \frac{1}{2}\right). \tag{10.2}$$
The maximum of this function may be found by the methods described in section 7.2, see also exercise 10.1, and is $\max(h(x, y)) = 3e^{-1/3}$. In this example the path x + y = 1 constrains the walker and is named the constraint, or the equation of constraint.
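The direct method for this example can be sketched in a few lines. The code below is an illustration only: the hill profile $h(x, y) = 3\exp(-x^2 - y^2/2)$ is an assumption inferred from equation 10.2 and the constraint y = 1 − x, and the golden-section search (`golden_max`, a helper written for this sketch) stands in for the analytical methods of section 7.2.

```python
import math


def h(x, y):
    # Assumed hill profile, inferred from equation 10.2; not quoted from the text.
    return 3.0 * math.exp(-x * x - 0.5 * y * y)


def h_on_path(x):
    """Height along the constraint x + y = 1, that is with y = 1 - x."""
    return h(x, 1.0 - x)


def golden_max(f, lo, hi, tol=1e-10):
    """Golden-section search for the maximum of a unimodal function on [lo, hi]."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - invphi * (hi - lo)
    d = lo + invphi * (hi - lo)
    while hi - lo > tol:
        if f(c) > f(d):
            hi = d      # the maximum lies in [lo, d]
        else:
            lo = c      # the maximum lies in [c, hi]
        c = hi - invphi * (hi - lo)
        d = lo + invphi * (hi - lo)
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    x_star = golden_max(h_on_path, -2.0, 3.0)
    print(x_star, 1.0 - x_star, h_on_path(x_star))
    print(3.0 * math.exp(-1.0 / 3.0))   # value quoted in the text
```

The search works here because the exponent of the constrained height is a concave quadratic in x, so the function has a single maximum on the interval.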
Another problem is that of inscribing a rectangle of maximum area inside a given
ellipse, such that all corners of the rectangle lie on the ellipse, as shown in figure 10.2.
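[Figure 10.2: a rectangle inscribed in an ellipse with semi-axes a and b; the top right-hand corner of the rectangle is at (x, y).]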
The coordinates of the top right-hand corner of the rectangle are (x, y) and since the equation of the ellipse is
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1, \tag{10.3}$$
this is the equation of constraint. The area of the rectangle is
$$A = 4xy, \tag{10.4}$$
so we need the maximum of this function subject to the constraint 10.3: this problem is solved in exercise 10.2.
If there are two independent variables, (x, y), there can be only one constraint which
we denote by g(x, y) = 0, and we require the stationary points of f (x, y) subject to this
constraint. Geometrically the constraint equation, g(x, y) = 0, defines a curve C g in the
Oxy plane, see for example figure 10.3, so we are searching for the stationary points of
f (x, y) along this curve.
With two independent variables there can be only one constraint because another
constraint, (x, y) = 0, defines another curve, C that intersects Cg at isolated points,
if at all. Sometimes, however, the equations g(x, y) = 0 and (x, y) = 0 will define the
same curve, despite being algebraically dissimilar: then the functions g and are said
to be dependent and it can be shown that in the region where the curves g(x, y) = 0 and
(x, y) = 0 coincide there is a differentiable function F (u, v) of two real variables such
that F (g(x, y), (x, y)) =constant: alternatively, using the implicit function theorem,
section 1.3.7, it can be shown that can be expressed in terms of g, (x, y) = G(g(x, y)),
or vice versa. It is not always obvious that two functions define the same curve: for
instance the equations
2
g(x, y) = y sinh1 (tan x) = 0 and (x, y) = 1 ey = 0 (10.5)
1 + tan(x/2)
[Figure 10.3: sketch of constraint curves, including $C_g$, in the Oxy plane.]
If there are three independent variables, (x, y, z) and we require the stationary points
of f (x, y, z) subject to the single constraint g1 (x, y, z) = 0, we may proceed in the same
manner, by using the constraint to express z in terms of (x, y) to form the function
f (x, y, z(x, y)) of two independent variables. With two constraints gk (x, y, z) = 0,
k = 1, 2, the more general implicit function theorem, described on page 32, may be used
to express any two variables in terms of the third, to express f (x, y, z) as a function
of one variable. In either case there are three ways to proceed and it is rarely clear in
advance which yields the simplest algebra.
In general, with n variables $x = (x_1, x_2, \dots, x_n)$ there can be at most n − 1 constraints. Suppose there are m constraints, $m \le n - 1$, $g_k(x) = 0$, $k = 1, 2, \dots, m$. Then, in principle we may use these m equations to express m of the variables in terms of the remaining n − m, hence giving a function of n − m variables. In practice this is rarely an easy task.
There are two main methods of dealing with constrained stationary problems. The conceptually simplest method is to reduce the number of independent variables, as described above, and in simple examples this method is usually preferable. The more elegant method, due to Lagrange (1736–1813), is described in the next section.
There are two main disadvantages with the direct method:
Exercise 10.1
Show that the function defined in equation 10.1 has a local maximum at x = 1/3, where y = 2/3, and that the height of the hill here is $3e^{-1/3}$.
Exercise 10.2
Show that the area of the rectangle inscribed in the ellipse shown in figure 10.2 can be expressed in the form
$$A(x) = \frac{4b}{a}\, x\sqrt{a^2 - x^2}, \quad 0 \le x \le a,$$
and by finding the stationary point of this expression show that max(A) = 2ab.
Exercise 10.3
Geometric problems often give rise to constrained stationary problems and here
we consider a relatively simple example.
Let P be a point in the Cartesian plane with coordinates (A, B) and D the distance from P to any point (x, y) on the straight line with equation
$$\frac{x}{a} + \frac{y}{b} = 1.$$
Show that $D^2 = (x - A)^2 + (y - B)^2$ and deduce that the shortest distance is
$$\min(D) = \frac{|ab - Ab - Ba|}{\sqrt{a^2 + b^2}}.$$
Exercise 10.4
If A, B and C are the angles of a triangle show that the function
Exercise 10.5
If z = f (x, y) and x and y satisfy the constraint g(x, y) = 0, show that at the
stationary points of z the contours of f (x, y), that is the curves defined by the
equations f (x, y) = constant, are tangential to the curve defined by g(x, y) = 0.
10.2. THE LAGRANGE MULTIPLIER 397
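This equation is true for any value of $\lambda$. Because of the constraint, variations in u, v and w are not independent but, if $\partial g/\partial z \ne 0$ we may choose $\lambda$ to make the coefficient of w in equation 10.9 zero, that is
$$\frac{\partial f}{\partial z} - \lambda\frac{\partial g}{\partial z} = 0. \tag{10.10}$$
Then equation 10.9 reduces to
$$\left(\frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x}\right)u + \left(\frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y}\right)v = 0.$$
Since u and v may be varied independently, it follows that
$$\frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} = 0, \quad \frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y} = 0. \tag{10.11}$$
The three equations 10.10 and 10.11 relate the four variables x, y, z and $\lambda$. Assuming that the implicit function theorem can be applied, that is the Jacobian 1.26 (page 32) is not zero, we can use these equations to express (x, y, z) in terms of $\lambda$. Then the constraint becomes $g(x(\lambda), y(\lambda), z(\lambda)) = 0$, which determines appropriate values of $\lambda$.
This procedure is equivalent to defining an auxiliary function of four variables
$$F(x, y, z, \lambda) = f(x, y, z) - \lambda g(x, y, z)$$
and finding the stationary points of $F(x, y, z, \lambda)$ using the conventional theory for all four variables, that is the solutions of
$$F_x = \frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} = 0, \quad F_y = \frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y} = 0, \quad F_z = \frac{\partial f}{\partial z} - \lambda\frac{\partial g}{\partial z} = 0,$$
and $F_\lambda = -g(x, y, z) = 0$. Usually the first three of these are solved first to give $(x(\lambda), y(\lambda), z(\lambda))$ in terms of $\lambda$, and then the fourth, the equation of constraint, is used to determine $\lambda$, although the order in which these equations are solved is clearly immaterial.
Thus the introduction of the Lagrange multiplier, $\lambda$, gives a method of finding stationary points that treats the three original variables equally. Before showing how this method generalises to n variables and $m \le n - 1$ constraints we apply it to the triangle problem treated in exercise 10.4.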
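For this problem $f(x, y, z) = \sin x \sin y \sin z$ and $g(x, y, z) = x + y + z - \pi$, so that the auxiliary function is
$$F(x, y, z, \lambda) = \sin x \sin y \sin z - \lambda(x + y + z - \pi),$$
with each of x, y and z in the interval $(0, \pi)$. Equations 10.10 and 10.11 become
$$\cos x \sin y \sin z = \lambda, \quad \sin x \cos y \sin z = \lambda, \quad \sin x \sin y \cos z = \lambda.$$
Subtracting the third of these from the second gives
$$\sin x \sin(y - z) = 0. \tag{10.13}$$
Similarly, by subtracting the second from the first and the third from the first we obtain
$$\sin z \sin(x - y) = 0 \quad\text{and}\quad \sin y \sin(z - x) = 0.$$
From 10.13 either $\sin x = 0$ or $\sin(y - z) = 0$; but for a triangle of nonzero area none of x, y or z can be zero or $\pi$, so $-\pi < y - z < \pi$ and the only solution is y = z. The remaining two equations give y = x and z = x and hence x = y = z and then the constraint gives $x = y = z = \pi/3$.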
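The stationary point can be checked numerically. The sketch below assumes the sign convention $F = f - \lambda g$ with $g = x + y + z - \pi$, so that the gradient of g is (1, 1, 1); the helper names are illustrative and not from the text. It evaluates the residuals of the multiplier equations at $x = y = z = \pi/3$ and compares f there with its value on a few nearby triangles.

```python
import math


def f(x, y, z):
    return math.sin(x) * math.sin(y) * math.sin(z)


def grad_f(x, y, z):
    return (
        math.cos(x) * math.sin(y) * math.sin(z),
        math.sin(x) * math.cos(y) * math.sin(z),
        math.sin(x) * math.sin(y) * math.cos(z),
    )


def lagrange_residuals(x, y, z, lam):
    """Residuals of grad f - lam * grad g = 0, where grad g = (1, 1, 1)."""
    return [component - lam for component in grad_f(x, y, z)]


def on_constraint(s, t):
    """A nearby triangle: perturb two angles, fix the third from x + y + z = pi."""
    third = math.pi / 3.0
    return (third + s, third + t, third - s - t)


x0 = y0 = z0 = math.pi / 3.0
lam0 = grad_f(x0, y0, z0)[0]      # multiplier read off from the first equation

if __name__ == "__main__":
    print(lam0, lagrange_residuals(x0, y0, z0, lam0), f(x0, y0, z0))
    for s, t in [(0.1, -0.05), (0.2, 0.2), (-0.3, 0.1)]:
        print(f(*on_constraint(s, t)))   # each is smaller than f(x0, y0, z0)
```

Comparing nearby points on the constraint gives only numerical evidence about the nature of the stationary point; it is not a classification.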
Exercise 10.6
Use a Lagrange multiplier to find the stationary points of the problems set in
exercises 10.1, 10.2 and 10.3.
Exercise 10.7
Show that the stationary distance between the origin and the plane defined by the equation ax + by + cz = d is given by the formula $|d|/\sqrt{a^2 + b^2 + c^2}$.
Exercise 10.8
Consider a rectangle, two sides of which are along the x- and y-axes; the bot-
tom left-hand corner is at the origin and the opposite corner lies on the line
x/a + y/b = 1, where a and b are positive numbers. Show that the stationary
area of such a rectangle is A = ab/4 and that for this rectangle the top right-hand
corner is at (a/2, b/2).
where all derivatives are evaluated at the stationary point. Provided neither g 1 (x, y, z)
nor g2 (x, y, z) is stationary, and that the normals to the planes defined by the equations
are not parallel, so that the planes exist and are distinct, then the planes intersect along
a line and there can be only one independent variable.
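Equation 10.8 remains valid and now we proceed by introducing two Lagrange multipliers, $\lambda_1$ and $\lambda_2$, one for each constraint. Thus from equations 10.8 and 10.14 we may form another equation,
$$\left(\frac{\partial f}{\partial x} - \lambda_1\frac{\partial g_1}{\partial x} - \lambda_2\frac{\partial g_2}{\partial x}\right)u + \left(\frac{\partial f}{\partial y} - \lambda_1\frac{\partial g_1}{\partial y} - \lambda_2\frac{\partial g_2}{\partial y}\right)v + \left(\frac{\partial f}{\partial z} - \lambda_1\frac{\partial g_1}{\partial z} - \lambda_2\frac{\partial g_2}{\partial z}\right)w = 0. \tag{10.15}$$
Now choose $\lambda_1$ and $\lambda_2$ to make the coefficients of v and w zero, that is
$$\frac{\partial f}{\partial y} - \lambda_1\frac{\partial g_1}{\partial y} - \lambda_2\frac{\partial g_2}{\partial y} = 0 \quad\text{and}\quad \frac{\partial f}{\partial z} - \lambda_1\frac{\partial g_1}{\partial z} - \lambda_2\frac{\partial g_2}{\partial z} = 0. \tag{10.16}$$
Then, since u may be varied independently, we have a third equation
$$\frac{\partial f}{\partial x} - \lambda_1\frac{\partial g_1}{\partial x} - \lambda_2\frac{\partial g_2}{\partial x} = 0. \tag{10.17}$$
The three equations 10.16 and 10.17 may, in principle, be solved to give (x, y, z) in terms of $\lambda_1$ and $\lambda_2$ and then the constraints, $g_j(x, y, z) = 0$, j = 1, 2, give two equations for $\lambda_1$ and $\lambda_2$. Needless to say, in practice these equations are not usually easy to solve.
As in the previous case this is formally equivalent to defining an auxiliary function of five variables
$$F(x, y, z, \lambda_1, \lambda_2) = f(x, y, z) - \lambda_1 g_1(x, y, z) - \lambda_2 g_2(x, y, z)$$
and finding the stationary points of this, that is the solutions of
$$\frac{\partial F}{\partial x} = 0, \quad \frac{\partial F}{\partial y} = 0, \quad \frac{\partial F}{\partial z} = 0, \quad \frac{\partial F}{\partial \lambda_1} = 0 \quad\text{and}\quad \frac{\partial F}{\partial \lambda_2} = 0.$$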
We illustrate this method by showing how to find the stationary values of $f(x, y, z) = ax^2 + by^2 + cz^2$, subject to the variables being confined to the planes x + y + z = 1 and x + 2y + 3z = 2. The auxiliary function is
$$F = ax^2 + by^2 + cz^2 - \lambda_1(x + y + z - 1) - \lambda_2(x + 2y + 3z - 2),$$
and this is stationary where
$$F_x = 2ax - (\lambda_1 + \lambda_2) = 0,$$
$$F_y = 2by - (\lambda_1 + \lambda_2) - \lambda_2 = 0,$$
$$F_z = 2cz - (\lambda_1 + \lambda_2) - 2\lambda_2 = 0.$$
In this case it is convenient to define a new variable $\lambda = \lambda_1 + \lambda_2$, and then these three equations can be solved to give
$$x = \frac{\lambda}{2a}, \quad y = \frac{\lambda + \lambda_2}{2b}, \quad z = \frac{\lambda + 2\lambda_2}{2c}$$
and the equations of constraint become
$$(ab + ac + bc)\lambda + \lambda_2(2ab + ac) = 2abc \quad\text{and}\quad (3ab + 2ac + bc)\lambda + \lambda_2(6ab + 2ac) = 4abc,$$
Hence
2b a+c 2b
x= , y= and z = x = . (10.19)
a + 4b + c a + 4b + c a + 4b + c
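The result 10.19 can be checked with exact rational arithmetic. The following is a minimal sketch rather than part of the text: the function names and the sample coefficients $a = 2$, $b = 3$, $c = 5$ are illustrative assumptions. It evaluates the quoted point, recovers the multipliers from $F_x = 0$ and $F_y = 0$, and then confirms that both constraints and the remaining condition $F_z = 0$ hold.

```python
from fractions import Fraction

def stationary_point(a, b, c):
    """Point (x, y, z) quoted in equation 10.19."""
    s = a + 4 * b + c
    return Fraction(2 * b, s), Fraction(a + c, s), Fraction(2 * b, s)

def multipliers(a, b, point):
    """Recover lam = lambda_1 + lambda_2 and lambda_2 from F_x = 0 and F_y = 0."""
    x, y, _ = point
    lam = 2 * a * x            # F_x = 2ax - lam = 0
    lam2 = 2 * b * y - lam     # F_y = 2by - lam - lambda_2 = 0
    return lam, lam2

# Illustrative coefficients (an assumption; any values with a + 4b + c != 0 work).
a, b, c = 2, 3, 5
x, y, z = stationary_point(a, b, c)
lam, lam2 = multipliers(a, b, (x, y, z))

constraint_1 = x + y + z - 1              # plane x + y + z = 1
constraint_2 = x + 2 * y + 3 * z - 2      # plane x + 2y + 3z = 2
residual_Fz = 2 * c * z - lam - 2 * lam2  # remaining condition F_z = 0

print(x, y, z, constraint_1, constraint_2, residual_Fz)
```

Because the arithmetic is exact, the three residuals are either identically zero or the formula is wrong; there is no tolerance to choose.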
Exercise 10.9
Derive equations 10.19 by using the constraints to express x and y in terms of z.
Note that in this example the direct method is easier, because the constraints are
linear.
Exercise 10.10
If $f(\mathbf{x})$ is a function of the $n$ variables $\mathbf{x} = (x_1, x_2, \dots, x_n)$ constrained by the
single function $g(\mathbf{x}) = 0$ show that the stationary points can be found by forming
the auxiliary function $F(\mathbf{x}, \lambda) = f(\mathbf{x}) - \lambda g(\mathbf{x})$ of $n + 1$ variables and finding its
stationary points.
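where $f(\mathbf{x})$ is the function for which stationary points are required. The stationary
points of $F$ are at the roots of
$$
\frac{\partial F}{\partial x_k} = \frac{\partial f}{\partial x_k} - \sum_{j=1}^{m} \lambda_j \frac{\partial g_j}{\partial x_k} = 0, \quad k = 1, 2, \dots, n, \tag{10.21}
$$
$$
\frac{\partial F}{\partial \lambda_j} = -g_j(\mathbf{x}) = 0, \quad j = 1, 2, \dots, m \le n - 1. \tag{10.22}
$$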
This method has the advantage of treating all variables equally and hence retaining any
symmetries that might be present.
The Lagrange multiplier method determines the position of stationary points. It is
generally more difficult to determine the nature of a constrained stationary point and
normally one has to use physical or geometric considerations besides algebraic methods
to understand the problem.
This equation can be used to find the stationary points of $g(\mathbf{x})$ subject to the constraint
$f(\mathbf{x}) = 0$, which are given by the roots of
$$
\frac{\partial G}{\partial x_k} = \frac{\partial g}{\partial x_k} - \mu \frac{\partial f}{\partial x_k} = 0, \quad k = 1, 2, \dots, n,
$$
which are the same equations as for the stationary points of the original problem. If
$\mathbf{x}(\mu)$ is a solution of these equations the stationary point of the new constrained problem
is given by those $\mu$ satisfying $f(\mathbf{x}(\mu)) = 0$. Further, since $\lambda\mu = 1$, the stationary points
of the original problem are $\mathbf{x}(1/\lambda)$ with the values of $\lambda$ given by $g(\mathbf{x}(1/\lambda)) = 0$. Thus
the Lagrange multiplier method highlights a duality between,
a) the stationary points of $f(\mathbf{x})$ with the constraint $g(\mathbf{x}) = 0$, and
b) the stationary points of $g(\mathbf{x})$ with the constraint $f(\mathbf{x}) = 0$,
which is not apparent in the conventional method.
Exercise 10.11
This exercise provides an illustration of the duality described above; compare this
problem with that considered in exercise 10.1.
Find the stationary value of the function g(x, y) = x + y 1 subject to the
constraint f (x, y) = 3 exp(x2 y 2 /2) c where c is a positive constant.
Exercise 10.12
An open rectangular box is made of thin sheet metal and has sides of height $z$ and a
rectangular base of interior dimensions $x$ and $y$. The base and the sides of length $x$
are of (small) uniform thickness $d$ and the sides of length $y$ are of thickness $2d$. If
the volume of metal is fixed prove that the volume of the box is stationary when
$x = 2y = 4z$.
Exercise 10.13
A vessel comprises a cylinder of radius $r$ and height $h$ with equal conical ends, the
semi-vertical angle of each cone being $\alpha$. Show that the volume $V$ and the surface
area, $S$, are given by
$$
V = \pi r^2 h + \frac{2\pi r^3}{3\tan\alpha} \quad\text{and}\quad S = 2\pi r h + \frac{2\pi r^2}{\sin\alpha}.
$$
If $r$, $h$ and $\alpha$ can vary, show that for a vessel of given volume the stationary surface
area occurs when $\cos\alpha = \frac{2}{3}$. Also find the value of $h$ in terms of $r$ and $\alpha$ and $r$ in
terms of $V$.
10.4. MISCELLANEOUS EXERCISES 403
Exercise 10.15
Find the stationary value of $f = x^2 + y^2 + z^2 + w^2$, subject to the constraint
$(xyzw)^2 = 1$, and the values of the variables at which the stationary values are
attained.
Exercise 10.16
Find the stationary points of f = xyzw 9 subject to the constraint g = 4x4 + 2y 8 +
z 16 + 9w16 = 1 in the region where all variables are positive.
Exercise 10.17
If $a$, $b$, $c$ and $d$ are given positive numbers and $x$, $y$ and $z$ are positive, real variables
satisfying the equation $x + y + z = d$, show that the function
$$
f(x, y, z) = \frac{a^2}{x} + \frac{b^2}{y} + \frac{c^2}{z}
$$
Exercise 10.18
Show that the shortest distance between the plane $ax + by + cz = d$ in the Oxy-plane
and the point $(A, B, C)$ is given by
$$
D = \frac{|Aa + Bb + Cc - d|}{\sqrt{a^2 + b^2 + c^2}}.
$$
Exercise 10.19
For a simple lens with focal length f the object distance p and the image distance
q are related by 1/p + 1/q = 1/f . If p + q =constant find the stationary value of f .
Exercise 10.20
Show that the stationary points of f = ax2 + by 2 + cz 2 , where the constants a, b
and c are all positive, on the line where the vertical cylinder, x2 +y 2 = 1, intersects
the plane x + y + z = 1, are given by
2 2 2
x= , y= and z= ,
2(a 1 ) 2(b 1 ) 2c
1 2c(6c 2) p
1 = (a + b + 4c) , 2 = with = (a b)2 + 8c2 .
2 2c
Exercise 10.21
Show that the area, $S$, of canvas needed to make a tent of given volume $V$ comprising
a right circular cylinder of radius $r$, made of a single thickness of canvas,
together with a conical top of height $h$, made of two thicknesses of canvas, is given by
$$
S = \frac{2V}{r} + 2\pi r\sqrt{r^2 + h^2} - \frac{2}{3}\pi r h.
$$
If both $r$ and $h$ can vary show that the stationary value of $S$, for fixed $V$, is given
by $r\sqrt{2} = 4h = R$ where $V = 2\pi R^3/3$.
10.5. SOLUTIONS FOR CHAPTER 10 405
x2 4b a2 2x2
dA 4b p 2 2
= a x = ,
dx a a2 x2 a a2 x2
so that A0 (x) = 0 when x = a/ 2 (since x > 0). If x = a/ 2 r2 , A0 2 , so that
4b a a2
A(x) has a local maximum at this point, with value A = a2 = 2ab.
a 2 2
This is a quadratic equation in x and since the coefficient of x is positive it has a single
minimum, which is most easily seen by writing it in the form
2
a2 + b 2 (Aa + b(b B))2
2 a 2 2
D = x (Aa + b(b B)) + A + (b B) .
a2 a2 + b 2 a2 + b 2
Since $\sin A \ne 0$ and $\sin B \ne 0$ (because $0 < A, B < \pi$) we have $\sin(2A + B) = 0$ and
$\sin(A + 2B) = 0$, that is $2A + B = n\pi$ and $A + 2B = m\pi$, with $n$ and $m$ positive integers,
both smaller than 3. Hence $3A = (2n - m)\pi$ and $3B = (2m - n)\pi$, and $3C = (3 - n - m)\pi$.
The bounds on $A$ and $B$ give $n = m = 1$ and hence $A = B = C = \pi/3$.
which gives
2 (ab Ab Ba)2
1 1
D2 = + 2 = .
4 a2 b a2 + b 2
d2 |d|
D2 = 2 (a2 + b2 + c2 ) = that is D = .
a2 + b 2 + c 2 a2 + b2 + c2
and have the solution $x = z$ and $y = 1 - 2z$. Hence the expression for $f(x, y, z)$ becomes
If a + 4b + c < 0 this stationary point is a maximum. Using the expressions for x(z)
and y(z) the results quoted in equation 10.19 are obtained.
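$$
\begin{aligned}
F_x &= yz - \lambda(y + 2z) = 0, \\
F_y &= xz - \lambda(x + 4z) = 0, \\
F_z &= xy - \lambda(4y + 2x) = 0.
\end{aligned}
$$
The second two equations can be rearranged to give $t(x - 4\lambda) = \lambda$ and $s(x - 4\lambda) = 2\lambda$,
so that $s = 2t$. Also $t = \lambda/(x - 4\lambda)$, so the first equation gives $x = 8\lambda$; hence $t = 1/4$
and $s = 1/2$, giving $2y = x$ and $4z = x$.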
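$$
F = 2\pi r h + \frac{2\pi r^2}{\sin\alpha} - \lambda\left(\pi r^2 h + \frac{2\pi r^3}{3\tan\alpha}\right).
$$
$$
F_\alpha = -\frac{2\pi r^2\cos\alpha}{\sin^2\alpha} + \lambda\frac{2\pi r^3}{3\sin^2\alpha} = \frac{2\pi r^2}{\sin^2\alpha}\left(\frac{\lambda r}{3} - \cos\alpha\right) = 0
$$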
where we have used the identity tan x = 2t/(1 t2 ). Adding unity to each side of the
last equation gives the second of equations 10.5.
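$$
\begin{aligned}
F_y &= 2y\left(1 - \lambda(xzw)^2\right) = 0 \implies y = 0 \text{ or } \lambda(xzw)^2 = 1, \\
F_z &= 2z\left(1 - \lambda(xyw)^2\right) = 0 \implies z = 0 \text{ or } \lambda(xyw)^2 = 1, \\
F_w &= 2w\left(1 - \lambda(xyz)^2\right) = 0 \implies w = 0 \text{ or } \lambda(xyz)^2 = 1.
\end{aligned}
$$
If $x = 0$ then the three remaining equations for $\lambda$ have no solution: hence we discard
the solution $x = y = z = w = 0$. The equation $\lambda(yzw)^2 = 1$ gives $\lambda = x^2$ (on
multiplying by $x^2$ and using the constraint equation). Similarly, $\lambda = y^2 = z^2 = w^2$,
hence $(xyzw)^2 = \lambda^4 = 1$ and $\lambda = 1$ ($\lambda = -1$ is not allowed); so there are 16 stationary
points, $x = \pm 1$, $y = \pm 1$, $z = \pm 1$ and $w = \pm 1$, all of which give $f = 4$.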
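$$
\begin{aligned}
F_x &= 2(x - A) - 2\lambda a = 0 \implies x = \lambda a + A, \\
F_y &= 2(y - B) - 2\lambda b = 0 \implies y = \lambda b + B, \\
F_z &= 2(z - C) - 2\lambda c = 0 \implies z = \lambda c + C.
\end{aligned}
$$
$$
\lambda(a^2 + b^2 + c^2) + Aa + Bb + Cc = d \implies \lambda = -\frac{Aa + Bb + Cc - d}{a^2 + b^2 + c^2}.
$$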
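The multiplier solution above translates directly into a short calculation. The sketch below is illustrative only: the function name, the sample plane $x + 2y + 2z = 3$ and the sample point $(3, 4, 5)$ are assumptions chosen so that the numbers come out as integers. It computes $\lambda$, the closest point $(\lambda a + A, \lambda b + B, \lambda c + C)$, and compares the resulting distance with the formula quoted in exercise 10.18.

```python
import math

def closest_point_on_plane(normal, d, point):
    """Closest point on the plane a*x + b*y + c*z = d to the given point,
    using x = lam*a + A etc. with lam from the constraint equation."""
    a, b, c = normal
    A, B, C = point
    lam = -(A * a + B * b + C * c - d) / (a * a + b * b + c * c)
    return (A + lam * a, B + lam * b, C + lam * c), lam

# Illustrative data (assumptions): plane x + 2y + 2z = 3 and point (3, 4, 5).
normal, d, point = (1.0, 2.0, 2.0), 3.0, (3.0, 4.0, 5.0)
foot, lam = closest_point_on_plane(normal, d, point)

distance = math.dist(foot, point)
formula = abs(sum(n * p for n, p in zip(normal, point)) - d) / math.hypot(*normal)
on_plane = sum(n * f for n, f in zip(normal, foot)) - d

print(foot, lam, distance, formula, on_plane)
```

The distance equals $|\lambda|\sqrt{a^2 + b^2 + c^2}$, which is why the modulus appears in the final formula.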
But at the stationary point
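$$
\frac{\partial F}{\partial p} = \frac{q^2}{(p + q)^2} - \lambda = 0 \quad\text{and}\quad \frac{\partial F}{\partial q} = \frac{p^2}{(p + q)^2} - \lambda = 0.
$$
Clearly since both $p$ and $q$ are positive, $p = q$ and then $\lambda = 1/4$. The constraint
equation then gives $p = q = 2c$.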
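$$
F = ax^2 + by^2 + cz^2 - \lambda_1(x^2 + y^2 - 1) - \lambda_2(x + y + z - 1)
$$
$$
\begin{aligned}
F_x &= 2x(a - \lambda_1) - \lambda_2 = 0 \implies x = \frac{\lambda_2}{2(a - \lambda_1)}, \\
F_y &= 2y(b - \lambda_1) - \lambda_2 = 0 \implies y = \frac{\lambda_2}{2(b - \lambda_1)}, \\
F_z &= 2cz - \lambda_2 = 0 \implies z = \frac{\lambda_2}{2c}.
\end{aligned}
$$
The constraints now give the following equations for $\lambda_1$ and $\lambda_2$: on the plane,
$$
\frac{\lambda_2}{2}\left(\frac{1}{a - \lambda_1} + \frac{1}{b - \lambda_1} + \frac{1}{c}\right) = 1. \tag{10.24}
$$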
1 1 4c
+ = .
a 1 b 1 2c(3c )
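$$
S(r, h) = \frac{2V}{r} - \frac{2}{3}\pi r h + 2\pi r\sqrt{r^2 + h^2}.
$$
Thus
$$
\frac{\partial S}{\partial h} = -\frac{2}{3}\pi r + \frac{2\pi r h}{\sqrt{r^2 + h^2}} = 0 \quad\text{and hence}\quad \sqrt{r^2 + h^2} = 3h \quad\text{or}\quad r = 2\sqrt{2}\,h,
$$
and hence
$$
V = \frac{4}{3}\pi\sqrt{2}\,r^3 = \frac{2}{3}\pi\left(r\sqrt{2}\right)^3.
$$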
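The stationary dimensions of the tent can be confirmed numerically. The sketch below is an illustration under stated assumptions: the function names, the sample volume $V = 10$ and the finite-difference step are not from the text. It evaluates $S(r, h)$ at $r\sqrt{2} = 4h = R$, with $R$ fixed by $V = 2\pi R^3/3$, and estimates both partial derivatives by central differences; both should be close to zero.

```python
import math

def canvas_area(r, h, V):
    """S(r, h) for fixed volume V: single-thickness cylinder, double-thickness cone."""
    return 2 * V / r - 2 * math.pi * r * h / 3 + 2 * math.pi * r * math.sqrt(r * r + h * h)

def stationary_dimensions(V):
    """r and h from r*sqrt(2) = 4h = R with V = 2*pi*R**3/3."""
    R = (3 * V / (2 * math.pi)) ** (1 / 3)
    return R / math.sqrt(2), R / 4

def gradient(r, h, V, eps=1e-5):
    """Central-difference estimate of (dS/dr, dS/dh)."""
    dS_dr = (canvas_area(r + eps, h, V) - canvas_area(r - eps, h, V)) / (2 * eps)
    dS_dh = (canvas_area(r, h + eps, V) - canvas_area(r, h - eps, V)) / (2 * eps)
    return dS_dr, dS_dh

V = 10.0  # illustrative volume (an assumption)
r, h = stationary_dimensions(V)
dS_dr, dS_dh = gradient(r, h, V)
print(r, h, dS_dr, dS_dh)
```

This checks only that the point is stationary; it says nothing about whether it is a minimum.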
Chapter 11
Constrained Variational
Problems
11.1 Introduction
In this chapter we apply the Lagrange multiplier method to functionals with constrained
admissible functions. Examples are the isoperimetric and the catenary problems, de-
scribed in sections 2.5.5 and 2.5.6, where the constraint is another functional. In these
examples the stationary path is described by a single function, y(x).
But, the most celebrated isoperimetric problem is that enshrined in the myth de-
scribing the foundation of the Phoenician city of Carthage in 814 BC: this is that Dido,
also known as Elissa, having fled from Tyre after her brother, King Pygmalion, had
killed her husband, was granted by the Libyans as much land as an ox-hide could cover.
By cutting the hide into thin strips, she was able to claim far more ground than an-
ticipated. In common with all foundation myths there is no trace of evidence for its
veracity.
Dido's solution is a circle, which cannot be described by a single function, the natural
representation being parametric. Thus, we need to consider the effects of constraints
on both types of functionals.
There is, however, another type of constrained problem, of equal significance, exem-
plified by the problem of finding geodesics on surfaces. Consider a surface defined in the
three dimensional Cartesian space, which we suppose can be defined by an equation of
the form S(x, y, z) = 0. Given two points on this surface we require the shortest line, on
the surface, joining these points. Any smooth path can be represented parametrically
by three functions (x(t), y(t), z(t)) of a parameter t, with end points at t = 0 and t = 1.
The distance along this path is given by the functional
$$
D[x, y, z] = \int_0^1 dt\, \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}
$$
and the constraint that forces this path to be on the surface is $S(x(t), y(t), z(t)) = 0$ for
$0 \le t \le 1$. This is a different type of constraint from that found in the problems described
above. In the non-assessed sections 11.7 and 11.8 this theory is used to solve variants
of the brachistochrone problem.
No fundamentally new ideas are presented in this chapter, but many ideas and
techniques introduced in previous chapters are used in a slightly different context, to
derive new results. As you read through this chapter you should ensure that you
thoroughly understand the previous work upon which it is based.
where the admissible curves must also satisfy the constraint functional
$$
C[y] = \int_a^b dx\, G(x, y, y') = c, \tag{11.2}
$$
where $c$ is a given constant, then, if $y(x)$ is not a stationary path of $C[y]$, there exists a
Lagrange multiplier $\lambda$ such that $y(x)$ is a stationary path of the auxiliary functional
$$
\overline{S}[y] = \int_a^b dx\, \overline{F}(x, y, y'), \quad y(a) = A, \quad y(b) = B, \tag{11.3}
$$
where $\overline{F} = F - \lambda G$. That is, the stationary path is given by the solutions of the
Euler-Lagrange equation
$$
\frac{d}{dx}\left(\frac{\partial \overline{F}}{\partial y'}\right) - \frac{\partial \overline{F}}{\partial y} = 0, \quad y(a) = A, \quad y(b) = B. \tag{11.4}
$$
The solution of this Euler-Lagrange equation will depend upon $\lambda$, the value of which is
determined by substituting the solution into the constraint functional 11.2.
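As a concrete illustration of the theorem, take $F = y'^2$ and $G = y$ on $0 \le x \le 1$ with $y(0) = y(1) = 0$ and constraint value $A$, which is the problem posed in exercise 11.2. The Euler-Lagrange equation for $F - \lambda G$ is $2y'' + \lambda = 0$, so $y = \lambda x(1 - x)/4$, and the constraint gives $\lambda = 24A$. The sketch below is a numerical sanity check, not a proof: the perturbation $h(x) = \sin 2\pi x$, the value $A = 0.5$ and the midpoint rule are assumptions made for illustration. The perturbation is chosen because it vanishes at both ends and has zero area, so $y + \epsilon h$ remains admissible.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite midpoint rule on [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))

AREA = 0.5          # prescribed constraint value C[y] = AREA (illustrative assumption)
lam = 24 * AREA     # multiplier fixed by the constraint

def y(x):
    return lam * x * (1 - x) / 4

def dy(x):
    return lam * (1 - 2 * x) / 4

def h(x):
    # Admissible variation: h(0) = h(1) = 0 and zero area, so C[y + eps*h] = C[y].
    return math.sin(2 * math.pi * x)

def dh(x):
    return 2 * math.pi * math.cos(2 * math.pi * x)

def S(eps):
    return integrate(lambda x: (dy(x) + eps * dh(x)) ** 2, 0.0, 1.0)

def C(eps):
    return integrate(lambda x: y(x) + eps * h(x), 0.0, 1.0)

for eps in (-0.1, 0.0, 0.1):
    print(eps, C(eps), S(eps))
```

For this variation the first-order change in $S$ vanishes and the second-order change is positive, so the printed values of $S$ are smallest at $\epsilon = 0$ while $C$ stays fixed.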
The proof of this theorem requires a significant, and not immediately obvious, change
to the proof presented in section 3.4. Thus before providing the proof it is instructive
to see what happens when the general theory of section 3.4 is applied directly to this
type of problem; this shows why a modification is required.
11.2. CONDITIONAL STATIONARY VALUES OF FUNCTIONALS 417
Suppose that $y(x)$ is the required solution: consider the neighbouring admissible
function $y(x) + h(x)$ where $h(a) = h(b) = 0$, then the Gateaux differential is
$$
\Delta S[y, h] = \int_a^b dx \left(\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'}\right) h(x). \tag{11.5}
$$
But both $y(x) + h(x)$ and $y(x)$ are chosen to satisfy the constraint, that is $C[y + h] = C[y]$,
so the rate of change of $C[y]$ is zero, that is the Gateaux differential is zero
$$
\Delta C[y, h] = \int_a^b dx \left(\frac{\partial G}{\partial y} - \frac{d}{dx}\frac{\partial G}{\partial y'}\right) h(x) = 0 \quad\text{for all } h(x). \tag{11.6}
$$
It is assumed that $C[y]$ is not stationary, so $G(x, y, y')$ does not satisfy the Euler-Lagrange
equation. But 11.6 is true for all $h(x)$ only if $G$ satisfies the Euler-Lagrange
equation. This contradiction can be resolved with a judicious choice of $h(x)$. The
problem is that the constraint places an additional restriction on the variation $h(x)$
so that the theory developed in chapter 4, which placed no restriction (other than
differentiability) on $h(x)$, needs to be modified.
The same problem arises with functions of $n$ real variables, $s(\mathbf{x})$ and a single constraint
$c(\mathbf{x}) = 0$. In this case the equivalents of expressions 11.5 and 11.6 are
$$
\Delta s[\mathbf{x}, \mathbf{h}] = \sum_{k=1}^{n} \frac{\partial s}{\partial x_k} h_k = 0 \quad\text{and}\quad \Delta c[\mathbf{x}, \mathbf{h}] = \sum_{k=1}^{n} \frac{\partial c}{\partial x_k} h_k = 0.
$$
But the second of these equations is true for all variations satisfying the constraint, so
the $h_k$ cannot be varied independently, and therefore we cannot deduce that $\partial s/\partial x_k = 0$
for all $x_k$.
In order to derive the Euler-Lagrange equation 11.4 we use a special set of variations.
Recall that when first deriving the Euler-Lagrange equation in section 3.4 we used the
fundamental Lemma, section 3.3, which involved sets of functions h(x) that isolated
small intervals of the integrand. Here we use a modification of this method that involves
picking out two, small, distinct intervals.
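This is achieved by writing
$$
h(x) = \epsilon_1 g(x - \xi_1) + \epsilon_2 g(x - \xi_2), \quad \xi_1 \ne \xi_2, \tag{11.7}
$$
where the function $g(x - \xi)$ is strongly peaked in a neighbourhood of $x = \xi$ and zero
for other $x$.
Such functions can be constructed from the type of function used to prove the
fundamental lemma, section 3.3; for example define
$$
g(x - \xi) = \begin{cases} \dfrac{1}{\delta^2}\left(\delta^2 - (x - \xi)^2\right), & a < \xi - \delta \le x \le \xi + \delta < b, \\[1ex] 0, & \text{otherwise}. \end{cases} \tag{11.8}
$$
The coefficient $\delta^{-2}$ is chosen to make $g = O(1)$. This function is zero except in the
neighbourhood of width $2\delta$ centred at $x = \xi$.
For any function $f(x)$ possessing a third derivative for $a \le x \le b$, we have, see
exercise 11.1,
$$
\int_a^b dx\, f(x) g(x - \xi) = \beta f(\xi) + O(\delta^3), \quad \beta = \frac{4\delta}{3}, \quad a + \delta < \xi < b - \delta. \tag{11.9}
$$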
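The scaling in equation 11.9 can be observed numerically. The sketch below is illustrative and rests on assumptions not in the text: the test function $f = \exp$, the point $\xi = 0.5$, the widths $\delta$ and the midpoint rule. Expanding $f$ about $\xi$ shows that the leading correction is $2\delta^3 f''(\xi)/15$, so the difference between the integral and $4\delta f(\xi)/3$, divided by $\delta^3$, should settle near $2f''(\xi)/15$ as $\delta$ decreases.

```python
import math

def g(u, delta):
    """Peaked function of equation 11.8, written in terms of u = x - xi."""
    return (delta ** 2 - u ** 2) / delta ** 2 if abs(u) < delta else 0.0

def peaked_integral(f, xi, delta, n=2000):
    """Midpoint rule for the integral of f(x) g(x - xi) over [xi - delta, xi + delta]."""
    step = 2 * delta / n
    total = 0.0
    for i in range(n):
        u = -delta + (i + 0.5) * step
        total += f(xi + u) * g(u, delta)
    return total * step

def scaled_error(f, xi, delta):
    """(integral - (4*delta/3) * f(xi)) / delta**3, which should be O(1)."""
    leading = 4 * delta / 3 * f(xi)
    return (peaked_integral(f, xi, delta) - leading) / delta ** 3

xi = 0.5  # illustrative point (an assumption)
for delta in (0.2, 0.1, 0.05):
    print(delta, scaled_error(math.exp, xi, delta))
```

If the remainder were only $O(\delta^2)$ the printed ratio would grow as $\delta$ shrinks; a bounded ratio is what the $O(\delta^3)$ estimate predicts.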
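In the following analysis we use the specific family of functions 11.8 in order to illustrate
how the proof works. Such a restriction is not necessary, but without it the more general
equivalent of equation 11.9 needs to be derived; the only significant difference between
equation 11.9 and the general case is that the term $O(\delta^3)$ is replaced by a term $O(\delta^2)$.
For convenience define the functions
$$
\mathcal{F}(x) = \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} \quad\text{and}\quad \mathcal{G}(x) = \frac{d}{dx}\left(\frac{\partial G}{\partial y'}\right) - \frac{\partial G}{\partial y}
$$
which we assume are sufficiently well behaved for $a \le x \le b$. Then the integrals 11.5
and 11.6 become, respectively,
$$
\Delta S = -\int_a^b dx\, \mathcal{F}(x)\left[\epsilon_1 g(x - \xi_1) + \epsilon_2 g(x - \xi_2)\right]
= -\beta\left[\epsilon_1 \mathcal{F}(\xi_1) + \epsilon_2 \mathcal{F}(\xi_2)\right] + O(\delta^3), \tag{11.10}
$$
$$
\Delta C = -\int_a^b dx\, \mathcal{G}(x)\left[\epsilon_1 g(x - \xi_1) + \epsilon_2 g(x - \xi_2)\right]
= -\beta\left[\epsilon_1 \mathcal{G}(\xi_1) + \epsilon_2 \mathcal{G}(\xi_2)\right] + O(\delta^3), \tag{11.11}
$$
where $\beta = 4\delta/3$ is the coefficient in equation 11.9.
The functional $C[y]$ is not stationary, therefore we may choose $\xi_2$ such that $\mathcal{G}(\xi_2) \ne 0$,
and then equation 11.11 gives, since $\Delta C = 0$,
$$
\epsilon_2 = -\epsilon_1 \frac{\mathcal{G}(\xi_1)}{\mathcal{G}(\xi_2)} + O(\delta^2).
$$
Substituting this into equation 11.10 and using the fact $S[y]$ is stationary, so $\Delta S = 0$,
$$
\epsilon_1\left(\frac{\mathcal{F}(\xi_1)}{\mathcal{G}(\xi_1)} - \frac{\mathcal{F}(\xi_2)}{\mathcal{G}(\xi_2)}\right) = O(\delta^2).
$$
Since this equation must be true for all $\epsilon_1$, and the left-hand side is independent of $\delta$,
we must have
$$
\frac{\mathcal{F}(\xi_1)}{\mathcal{G}(\xi_1)} = \frac{\mathcal{F}(\xi_2)}{\mathcal{G}(\xi_2)}.
$$
Finally, recall that $\xi_1$ and $\xi_2$ are arbitrary, so it follows that the ratio $\mathcal{F}(x)/\mathcal{G}(x)$ is
independent of $x$. Setting this ratio to a constant $\lambda$ we obtain
$$
\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} - \lambda\left[\frac{d}{dx}\left(\frac{\partial G}{\partial y'}\right) - \frac{\partial G}{\partial y}\right] = 0 \quad\text{for } a \le x \le b, \tag{11.12}
$$
which is just equation 11.4 and can be derived from the functional $\overline{S}[y]$ in the usual
manner.
This proof shows clearly why two small parameters, $\epsilon_1$ and $\epsilon_2$, are necessary; we
need the flexibility to isolate two distinct points, $\xi_1$ and $\xi_2$, in the interval $(a, b)$ to
show that the ratio $\mathcal{F}(x)/\mathcal{G}(x)$ is independent of $x$. In this proof it is necessary to
assume that $\mathcal{G}(x) \ne 0$ for almost all values of $x$ in this interval: that is, $C[y]$ must not
be stationary.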
Exercise 11.1
Prove equation 11.9
Exercise 11.2
Use theorem 11.1 to show that the stationary path of the variational problem
$$
S[y] = \int_0^1 dx\, y'^2, \quad y(0) = y(1) = 0,
$$
subject to the constraint that the area under the curve is fixed, that is
$$
C[y] = \int_0^1 dx\, y(x) = A,
$$
Exercise 11.3
Show that the stationary path of the functional
$$
S[y] = \int_1^2 dx\, x y'^2, \quad y(1) = y(2) = 0,
$$
where $\mathbf{y} = (y_1, y_2, \dots, y_n)$ and where the admissible curves must also satisfy the $M$
constraint functionals
$$
C_j[\mathbf{y}] = \int_a^b dx\, G_j(x, \mathbf{y}(x), \mathbf{y}'(x)) = c_j, \quad j = 1, 2, \dots, M, \tag{11.14}
$$
where the $c_j$ are $M$ given constants, then, if $\mathbf{y}(x)$ is not a stationary path of any of the
constraints, there exists a set of $M$ Lagrange multipliers $\lambda_j$, $j = 1, 2, \dots, M$, such that
$\mathbf{y}(x)$ is a stationary path of the functional
$$
\overline{S}[\mathbf{y}] = \int_a^b dx\, \overline{F}(x, \mathbf{y}(x), \mathbf{y}'(x)), \quad \mathbf{y}(a) = \mathbf{A}, \quad \mathbf{y}(b) = \mathbf{B}, \tag{11.15}
$$
where $\overline{F} = F - \sum_{j=1}^{M} \lambda_j G_j$. That is, the stationary path is given by the solution of the
Euler-Lagrange equations
$$
\frac{d}{dx}\left(\frac{\partial \overline{F}}{\partial y_k'}\right) - \frac{\partial \overline{F}}{\partial y_k} = 0, \quad y_k(a) = A_k, \quad y_k(b) = B_k, \quad k = 1, 2, \dots, n. \tag{11.16}
$$
The solution of these $n$ Euler-Lagrange equations will depend upon $M$ Lagrange multipliers,
the values of which are determined by substituting the solution into the $M$
constraint functionals 11.14.
Exercise 11.4
Show that the stationary paths of the functional
Z 1
S[y, z] = dx y 0 2 + z 0 2 2xz 0 4z , y(0) = z(0) = 0, y(1) = z(1) = 1,
0
are given by
(4 3)x x2 (3 + 2)x x2
y= and z=
4(1 ) 2(1 + )
where is a solution of
24 46 + 232 1
1+c = .
48(1 )2 12(1 + )2
Exercise 11.5
Show that the stationary path of the variational problem
$$
S[y] = \int_0^1 dx\, y, \quad y(0) = y(1) = 0,
$$
subject to the constraint $C[y] = \int_0^1 dx\, y'^2 = c$, is given by $y(x) = \sqrt{3c}\, x(1 - x)$,
and that the undetermined multiplier is $\lambda = 1/(4\sqrt{3c})$.
Figure 11.1 The catenary formed by a uniform cable hanging between two points at different heights (the ends are at $x = 0$, height $B$, and at $(a, A)$).
If a curve is described by a differentiable function $y(x)$ it can be shown, see exercise 2.19
(page 103), that the potential energy $E$ of the cable is proportional to the functional
$$
E[y] = \rho g \int_0^a dx\, y\sqrt{1 + y'^2}, \quad y(0) = B, \quad y(a) = A \ge B. \tag{11.17}
$$
The curve that minimises this functional, subject to the length of the cable,
$$
L[y] = \int_0^a dx\, \sqrt{1 + y'^2}, \tag{11.18}
$$
Figure 11.2 Diagram showing a cable hanging over two
smooth pegs, at the same height, $A$, above the ground, a
distance $a$ apart. The cable is long enough to reach the
ground on both sides.
In this example the potential energy of the vertical segments is independent of the
shape of the hanging portion, so the energy is given by equation 11.17, and there is no
constraint. This is the same functional as gives the area of a surface of revolution.
The hanging portion of the cable is supported only by the weight of the vertical
portion of the cable, so we consider the effect of keeping A and B fixed and changing
a, the separation between the pegs. First consider the case A = B.
If $a \ll A$ the weight of the hanging cable is relatively small by comparison to the
vertical portion, and we expect the portion between the pegs to be almost horizontal.
In addition there will be a solution where the hanging portion falls almost vertically
near the pegs and with a section of it resting on the floor. Figure 4.11 (page 176) shows
the two solutions, for $a \ll A$; one is almost horizontal and is shown in section 7.7 to be
a local minimum.
If $a \gg A$ the weight of the hanging cable is relatively large and cannot be supported
by the vertical portion. Now the only solution is the Goldschmidt solution,
equation 4.20, which is physically possible only for an infinitely flexible cable.
Notice that if B = 0 the length of the vertical portion of the cable must be less
than the hanging portion, which therefore cannot be supported, so there is no smooth
solution, as in exercise 4.19. This example demonstrates the importance of constraints.
Returning to the main problem, choose the axes so the left-hand support is at the
origin, that is $B = 0$, and the right-hand end has coordinates $(a, A)$. Further we may
assume, with no loss of generality, that $A \ge 0$. The energy and constraint functionals
are given in equations 11.17 and 11.18, so if $\rho g \lambda$ is the Lagrange multiplier the auxiliary
functional is proportional to
$$
\overline{E}[y] = \int_0^a dx\, (y - \lambda)\sqrt{1 + y'^2}, \quad y(0) = 0, \quad y(a) = A \ge 0. \tag{11.19}
$$
Since the integrand does not depend explicitly upon $x$, with $z = y - \lambda$ the first integral is $z/\sqrt{1 + z'^2} = c$,
where $c$ is a constant. Solving this equation for $z'$ gives the first-order equation
$$
\frac{dz}{dx} = \frac{\sqrt{z^2 - c^2}}{c},
$$
which is the same equation as derived in section 4.3.2. Putting $z = c\cosh\phi(x)$ gives
$c\phi' = 1$, so the general solution is
$$
y = \lambda + c\cosh\left(\frac{x + d}{c}\right), \tag{11.20}
$$
Equations 11.21 and 11.22 give three equations enabling the constants $\lambda$, $c$ and $d$ to be
determined in terms of $L$, $B$ and $(a, A)$. It is not possible to find formulae for these
constants, but a numerical solution is made relatively easy after some rearrangements
are made. Subtracting equations 11.21 gives
$$
A = c\left[\cosh\left(\frac{a + d}{c}\right) - \cosh\left(\frac{d}{c}\right)\right] = 2c\sinh\left(\frac{a + 2d}{2c}\right)\sinh\left(\frac{a}{2c}\right). \tag{11.23}
$$
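A numerical solution along these lines can be sketched as follows. This is a hedged illustration, not the text's own algorithm: it assumes the boundary conditions $y(0) = 0$, $y(a) = A$ and that the length of the curve 11.20 is $L = c[\sinh((a + d)/c) - \sinh(d/c)] = 2c\cosh((a + 2d)/2c)\sinh(a/2c)$. Combining this with equation 11.23 gives $\sqrt{L^2 - A^2} = 2c\sinh(a/2c)$ and $A/L = \tanh((a + 2d)/2c)$. The function names, the bisection tolerance and the sample values $a = A = 1$, $L = 3$ are assumptions.

```python
import math

def solve_catenary(a, A, L, tol=1e-12):
    """Return (lam, c, d) for y = lam + c*cosh((x + d)/c) with y(0) = 0,
    y(a) = A and arc length L, taking the branch with c > 0."""
    if L <= math.hypot(a, A):
        raise ValueError("cable must be longer than the distance between the supports")
    target = math.sqrt(L * L - A * A) / a      # equals sinh(phi)/phi with phi = a/(2c)
    lo, hi = 1e-9, 1.0
    while math.sinh(hi) / hi < target:         # bracket the root; sinh(phi)/phi is increasing
        hi *= 2.0
    while hi - lo > tol:                       # bisection
        mid = 0.5 * (lo + hi)
        if math.sinh(mid) / mid < target:
            lo = mid
        else:
            hi = mid
    phi = 0.5 * (lo + hi)
    c = a / (2.0 * phi)
    d = c * math.atanh(A / L) - a / 2.0        # from A/L = tanh((a + 2d)/(2c))
    lam = -c * math.cosh(d / c)                # from y(0) = 0
    return lam, c, d

def y(x, lam, c, d):
    return lam + c * math.cosh((x + d) / c)

a, A, L = 1.0, 1.0, 3.0                        # illustrative values (assumptions)
lam, c, d = solve_catenary(a, A, L)
length = c * (math.sinh((a + d) / c) - math.sinh(d / c))
print(lam, c, d, y(0.0, lam, c, d), y(a, lam, c, d), length)
```

Bisection is used because $\sinh\phi/\phi$ increases monotonically from 1 for $\phi > 0$, so the root exists and is unique whenever $L$ exceeds the straight-line distance between the supports.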
d a
= 0 D0 , 0 = .
c0 2c0
= c0 cosh(0 D0 ), giving + + = A.
and
20
y (x) = c0 cosh x (0 + D0 ) cosh(0 + D0 ) . (11.26)
a
The solution y+ (x) has a local minimum at x = xm = a(1 D0 /0 )/2, and y (x) has
a maximum at x = xm . Also we note that y (a x) = A y+ (x). An example of each
of these solutions is shown in figure 11.3. Only y+ (x) is physically significant in the
present context.
Figure 11.3 Graphs of the functions $y_\pm(x)$ in the case
$a = A = 1$ and $L = 3$ (horizontal axis $x$, vertical axis $y$).
We can deduce the existence of these two solutions directly from the original functional.
Suppose that y(x) satisfies the Euler-Lagrange equations associated with E[y], then if
w(x) = A y(a x), so w(0) = 0 and w(a) = A, then
Z a p
E[y] = dx (A w(a x)) 1 + w0 (a x)2
0
Z a p
= du (w(u) ) 1 + w0 (u)2 , u = a x, + = A.
0
that is the potential energy of the path $y_+(x)$ is less than that of the path $y_-(x)$; physical
considerations suggest that $y_+(x)$ gives a minimum value of $E[y]$.
In figure 11.4 we show some examples of $y_+(x)$ for $a = A = 1$ and various values
of $L$.
11.3. VARIABLE END POINTS 425
Figure 11.4 Graphs showing catenaries $y_+(x)$, defined in
equation 11.25, of various lengths, $L = 1.42$, $1.5$, $2$ and $3$, for $a = A = 1$ (horizontal axis $x$, vertical axis $y$).
Exercise 11.6
Show that equation 11.24 for has a unique real solution if L is larger than the
distance between the origin and (a, A). What is the positive limiting value of c as
the stationary path y+ (x) tends to the straight line between the end points?
Exercise 11.7
For given values of a and L, (L > a), show that the catenary y+ (x) with zero
gradient at the left end, x = 0, has the height difference A = L tanh where
a sinh 2 = 2L.
Exercise 11.8
Prove the inequality 11.27.
Exercise 11.9
(a) Show that the Euler-Lagrange equation associated with the functional $\overline{E}[y]$,
defined in equation 11.19, is
$$
(y - \lambda)y'' = 1 + y'^2, \quad y(0) = 0, \quad y(a) = A.
$$
Hence explain why another solution for the minimum surface problem, discussed
in section 4.3, cannot be generated by this transformation.
where the right-hand end of the path lies on the curve defined by (x, y) = 0 and where
the constraint
$$
C[y] = \int_a^v dx\, G(x, y, y') = c, \quad\text{a constant}, \tag{11.29}
$$
also needs to be satisfied.
Using a similar analysis to that outlined in section 11.2.1 it can be shown that the
required stationary path is given by the stationary path of the auxiliary functional
$$
\overline{S}[y] = \int_a^v dx\, \overline{F}(x, y, y'), \quad y(a) = A, \quad \overline{F} = F - \lambda G, \tag{11.30}
$$
is satisfied.
As before the solution of the associated Euler-Lagrange equation depends upon $\lambda$,
the value of which is determined by the constraint.
Exercise 11.10
A curve of given length L is described by the positive function y(x) passing through
the origin and some point, (v, 0), with v > 0, to be determined. Find the shape
of the curve making the area under it stationary.
Hint in this example the boundary curve is (x, y) = y = 0.
Exercise 11.11
A curve described by the positive function y(x) passing through the origin and
some point, to be determined, x = v > 0 on the x-axis, is rotated about the x-axis
to form a solid body.
(a) Show that the volume, $V[y]$, and the surface area, $A[y]$, of this body are given
by
$$
V[y] = \pi\int_0^v dx\, y^2 \quad\text{and}\quad A[y] = 2\pi\int_0^v dx\, y\sqrt{1 + y'^2}.
$$
(b) If the surface area is given determine the path making the volume stationary,
and find the volume in terms of A.
Hint in this example the boundary curve is (x, y) = y = 0.
Exercise 11.12
Show that the equation of the cable with the right-hand end fixed at $(a, A)$, where
$a$ and $A$ are positive, and with the left-hand end free to slide on a vertical pole
aligned along the $y$-axis is given by $y = A + c\cosh(x/c) - c\cosh(a/c)$, where $\phi$ is
given by the positive root of $L\phi/a = \sinh\phi$ and $c = a/\phi$.
11.4. BROKEN EXTREMALS 427
Exercise 11.13
Show that the equation of a cable of length $L$ and uniform density, with the left
end free to slide on a vertical pole aligned along the $y$-axis and the right end free
to slide along the straight line $x/a + y/b = 1$, $a, b > 0$, is
$$
y = \lambda + \frac{bL}{a}\cosh\left(\frac{ax}{bL}\right), \quad 0 \le x \le \frac{bL}{a}\sinh^{-1}\left(\frac{a}{b}\right),
$$
for some $\lambda$ for which you should find an expression in terms of $a$, $b$ and $L$.
of given length
$$
L[y] = \int_0^a dx\, \sqrt{1 + y'^2}
$$
This integrand depends only upon $y'$, so the solutions of the associated Euler-Lagrange
equation are straight lines, $y = mx + d$. On the interval $0 \le x \le c$, since $y(0) = 0$, the
appropriate solution is $y = m_1 x$, for some constant $m_1$. On the interval $c \le x \le a$, the
solution through $y(a) = A$ is $y = A + m_2(x - a)$. The solution is continuous at $x = c$,
so
$$
(m_1 - m_2)c = A - m_2 a. \tag{11.34}
$$
The Weierstrass-Erdmann (corner) conditions connecting the two sides of the solution
at $x = c$ are, see equations 9.53 and 9.53 (page 359),
$$
\lim_{x \to c^-}\left(\overline{F} - y'\overline{F}_{y'}\right) = \lim_{x \to c^+}\left(\overline{F} - y'\overline{F}_{y'}\right),
\qquad
\lim_{x \to c^-}\overline{F}_{y'} = \lim_{x \to c^+}\overline{F}_{y'}.
$$
Since !
F y0 = y 0
2 p and F y 0 F y0 = y 0 2 p
1 + y0 2 1 + y0 2
Equation 11.34 for continuity then gives $c = (A + ma)/2m$. Hence the stationary paths
are
$$
y(x) = \begin{cases} mx, & 0 \le x \le c, \\ A + m(a - x), & c \le x \le a, \end{cases}
\quad\text{where}\quad m = \sqrt{\frac{L^2}{a^2} - 1} \quad\text{and}\quad c = \frac{A + ma}{2m}.
$$
Since $0 < c < a$ we must have $|A| < ma$.
With no corner conditions the differentiable solution exists only if $L = \sqrt{a^2 + A^2}$,
there being insufficient flexibility to satisfy the constraints and the Euler-Lagrange
equation for any other values of $L$, $a$ and $A$.
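The broken extremal above is easy to construct and check numerically. The sketch below is illustrative: the function name and the sample values $a = 1$, $A = 0.5$, $L = 2$ are assumptions. It builds the two straight segments, then confirms continuity at the corner and that the total length equals $L$.

```python
import math

def broken_extremal(a, A, L):
    """Slope m, corner position c and the piecewise-linear path y(x)
    with slopes +m and -m either side of the corner."""
    m = math.sqrt(L * L / (a * a) - 1.0)
    c = (A + m * a) / (2.0 * m)
    if not 0.0 < c < a:
        raise ValueError("corner outside (0, a): the text requires |A| < m*a")

    def y(x):
        return m * x if x <= c else A + m * (a - x)

    return m, c, y

a, A, L = 1.0, 0.5, 2.0      # illustrative values (assumptions)
m, c, y = broken_extremal(a, A, L)

left_limit = m * c                       # value approached from x < c
right_limit = A + m * (a - c)            # value approached from x > c
total_length = math.sqrt(1.0 + m * m) * c + math.sqrt(1.0 + m * m) * (a - c)
print(m, c, left_limit, right_limit, total_length, y(a))
```

Both segments have the same value of $|y'|$, so the length depends only on $m$ and $a$; the corner position $c$ is then fixed by continuity alone.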
Exercise 11.14
Show that the only solutions of equations 11.35 are those considered in the text.
Exercise 11.15
This is a long, difficult question which should be attempted only if time permits.
An inextensible cable with uniform density , is suspended between the points
(0, B) and (a, A), with A B, where the y-axis is vertically upwards. A weight of
mass M is firmly attached to the cable at distances L1 and L2 from the left and
right ends respectively, all distances being measured along the cable.
(a) Show that the energy functional is
Z p Z a p
E[y] = M gy() + g dx y 1 + y 0 2 + g dx y 1 + y0 2,
0
where is the x-coordinate of the weight, and that the two constraints are
Z p Z L p
L1 = dx 1 + y0 2 and L2 = dx 1 + y0 2 .
0
11.5. PARAMETRIC FUNCTIONALS 429
(b) Derive the Euler-Lagrange equations for the cable and show that their solu-
tions are
x d1
y1 (x) = 1 + c1 cosh , 0 x , y1 (0) = B,
c1
x d2
y2 (x) = 2 + c2 cosh , x a, y2 (L) = A,
c2
where 1 and 2 are two Lagrange multipliers and (c1 , c2 , d1 , d2 ) are constants
arising from the integration of the Euler-Lagrange equations.
(c) Show that c1 = c2 = c and that the six remaining unknown constants (1 , 2 , , c, d1 , d2 )
are determined by the following six equations.
Z
d1 d1
q
L1 = dx 1 + y10 2 = c sinh + sinh
0 c c
Z a
a d d2
q
2
L2 = dx 1 + y20 2 = c sinh sinh .
c c
d1 a d2
B = 1 + c cosh and A = 2 + c cosh .
c c
d2 d1
M = c sinh sinh
c c
and
d1 d2
1 + c cosh = 2 + c cosh .
c c
with given boundary conditions and with admissible functions restricted to those paths
that satisfy the constraint,
$$
C[x, y, z] = \int_0^1 dt\, G(x, y, z, \dot{x}, \dot{y}, \dot{z}) = c \tag{11.37}
$$
where $c$ is a constant. This is just the problem dealt with by theorem 11.2, so the
stationary paths satisfy the three Euler-Lagrange equations
d
, u = {x, y, z}, = G, (11.38)
dt u u
with the same boundary conditions as defined for the original functional, and where
is a Lagrange multiplier.
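These simplify to
$$
\lambda\frac{d}{dt}\left(\frac{\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}}\right) + \dot{y} = 0 \quad\text{and}\quad \lambda\frac{d}{dt}\left(\frac{\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}}\right) - \dot{x} = 0, \tag{11.42}
$$
which integrate directly to
$$
\frac{\lambda\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}} = \beta - y \quad\text{and}\quad \frac{\lambda\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}} = \alpha + x, \tag{11.43}
$$
for some constants $\alpha$ and $\beta$. Now multiply the first of these by $\dot{y}$, the second by $\dot{x}$ and
subtract to give $(\beta - y)\dot{y} - (\alpha + x)\dot{x} = 0$. Integrate this to obtain
$$
(x + \alpha)^2 + (y - \beta)^2 = \gamma^2, \tag{11.44}
$$
where $\gamma$ is another real constant. This is the equation of the circle with centre at
$(-\alpha, \beta)$ and radius $\gamma$. Its circumference is $2\pi\gamma = L$, which gives the required path. In
parametric form its equations are
$$
x = -\alpha + \frac{L}{2\pi}\cos t \quad\text{and}\quad y = \beta + \frac{L}{2\pi}\sin t. \tag{11.45}
$$
The position of the centre of this circle cannot be determined from the information
provided.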
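The isoperimetric property of the circle can be illustrated with a simple comparison. The sketch below is a sanity check only, not part of the derivation: it compares the area of the circle of circumference $L$ with the areas of regular polygons of the same perimeter. The function names, the value $L = 1$ and the choice of regular polygons as comparison curves are assumptions.

```python
import math

def polygon_area(n, L):
    """Area of a regular n-gon with perimeter L (side L/n)."""
    return L * L / (4.0 * n * math.tan(math.pi / n))

def circle_area(L):
    """Area of the circle with circumference L, radius L/(2*pi)."""
    return L * L / (4.0 * math.pi)

L = 1.0  # illustrative perimeter (an assumption)
sides = (3, 4, 6, 12, 100)
areas = [polygon_area(n, L) for n in sides]
for n, area in zip(sides, areas):
    print(n, area)
print("circle", circle_area(L))
```

The polygon areas increase with the number of sides and approach the circle's area $L^2/4\pi$ from below, which is consistent with the circle being the stationary path.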
11.6. THE LAGRANGE PROBLEM 431
Exercise 11.16
An alternative method of finding the stationary path for the area from equations 11.42
is to use the arc length, $s$, as the independent variable, which is related
to the parameter $t$ by the relation
$$
\frac{ds}{dt} = \sqrt{\dot{x}^2 + \dot{y}^2}.
$$
(a) Show that with $s$ as the independent variable equations 11.42 become
$$
\lambda\frac{dx}{ds} = \beta - y \quad\text{and}\quad \lambda\frac{dy}{ds} = \alpha + x.
$$
Further, show that these equations can be converted to
$$
\lambda^2\frac{d^2 y}{ds^2} + y = \beta \quad\text{having the general solutions}\quad
\begin{cases} y = \beta + a\cos(s/\lambda + \delta), \\ x = -\alpha - a\sin(s/\lambda + \delta), \end{cases}
$$
Exercise 11.17
What is the shape of the closed curve, enclosing a given area, for which the length
is stationary.
convenient method and has the advantage that each constraint reduces the number
of dependent variables by unity. For example if there are three variables with the
constraint $C(y_1, y_2, y_3) = y_1^2 + y_2^2 + y_3^2 - r^2 = 0$, that forces the admissible paths to lie
on a sphere, it is usually better to use the two spherical polar angles $(\theta, \phi)$, where
$$
y_1 = r\sin\theta\cos\phi, \quad y_2 = r\sin\theta\sin\phi, \quad y_3 = r\cos\theta.
$$
The method described here is an alternative, and a specific example is considered in
section 11.6.2.
We assume that the $m$ constraint equations $C_j(x, \mathbf{y}) = 0$ are sufficiently well behaved
that along the stationary path they can be used to express $m$ of the dependent variables
in terms of the remaining $n - m$ variables, which means that boundary conditions for at
most $n - m$ variables need be specified. We shall assume that all holonomic constraints
are consistent with the boundary conditions. In the following proof we assume that
there is just one constraint, $C(x, \mathbf{y}) = 0$.
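Suppose that $\mathbf{y}(x)$ is the stationary path with the boundary conditions $\mathbf{y}(a) = \mathbf{A}$,
$\mathbf{y}(b) = \mathbf{B}$. If $\mathbf{y} + \mathbf{h}$ is a neighbouring admissible path that also satisfies the constraint,
so $\mathbf{h}(a) = \mathbf{h}(b) = 0$, and for each $j$, $h_j(x)$ is in $D_1(a, b)$, then the Gateaux differential
is
$$
\Delta S[\mathbf{y}, \mathbf{h}] = \int_a^b dx \sum_{k=1}^{n}\left(\frac{\partial F}{\partial y_k} - \frac{d}{dx}\frac{\partial F}{\partial y_k'}\right) h_k(x). \tag{11.48}
$$
But also $C(x, \mathbf{y}) = C(x, \mathbf{y} + \mathbf{h})$ for all $a \le x \le b$, and hence
$$
\sum_{k=1}^{n} \frac{\partial C}{\partial y_k} h_k(x) = 0, \tag{11.49}
$$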
as in equation 11.9 (page 417). Thus equations 11.48 and 11.49 become
n n
X C 3
X F d F
k = O( ) and k = O( 3 ),
yk yk dx yk0
k=1 k=1
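Now choose $\lambda(\xi)$ so that the coefficient of $\epsilon_n$ is zero. We have the freedom to choose
the remaining $n - 1$ coefficients $\epsilon_k$, $k = 1, 2, \dots, n - 1$, independently hence, using the
same argument as in section 10.2, we obtain the $n$ Euler-Lagrange equations
$$
\frac{d}{dx}\left(\frac{\partial F}{\partial y_k'}\right) - \frac{\partial F}{\partial y_k} + \lambda(x)\frac{\partial C}{\partial y_k} = 0, \quad \mathbf{y}(a) = \mathbf{A}, \quad \mathbf{y}(b) = \mathbf{B}, \quad k = 1, 2, \dots, n. \tag{11.51}
$$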
The derivation of this result assumed that there is a single holonomic constraint $C(x, \mathbf{y})$.
This is not necessary; the addition of another holonomic constraint adds another Lagrange
multiplier and in equation 11.51 the term
$$
\lambda(x)\frac{\partial C}{\partial y_k} \quad\text{is replaced by}\quad \lambda_1(x)\frac{\partial C_1}{\partial y_k} + \lambda_2(x)\frac{\partial C_2}{\partial y_k}.
$$
A common type of problem involving a single holonomic constraint is described in
section 11.6.2.
where $\overline{F} = F - \lambda(x)C$, is stationary on this path and satisfies the natural boundary
condition
$$
\left.\overline{F}_{y_2'}\right|_{x=b} = \left.\left(F_{y_2'} - \lambda C_{y_2'}\right)\right|_{x=b} = 0. \tag{11.55}
$$
The solution of the associated Euler-Lagrange equation will depend upon $\lambda(x)$, which
is determined by substituting the solution into the constraint equation 11.53.
It is now helpful to use s, the arc length along the curve for the independent variable,
so that s2 = 2 . First we note that
d2 x
d x d 1 dx
= s s = s 2 .
dt ds ds ds
We now show that the Lagrange multiplier is a constant, which makes the integration
of these equations easy. Differentiating the constraint twice with respect to s gives
xx00 + yy 00 + x0 2 + y 0 2 = 0.
But, from the definition of s we have x0 2 +y 0 2 +z 0 2 = 1 and together with equations 11.57
we obtain 2a2 + 1 z 0 2 = 0, and hence =constant (since z 0 is a constant). The
solution of equations 11.57 that fit the initial conditions are, with 2 = 2,
x = a cos s, y = a sin s, z = s,
for some constant . If the length of the curve is S then S = + 2n, for some integer
n, and S = b. Defining a new variable = s we obtain a parametric representation
of a geodesic,
b
x = a cos , y = a sin , z= , 0 2n + . (11.58)
2n +
For this example it is far easier to use cylindrical polar coordinates, see exercise 2.20
(page 104), which automatically satisfy the constraint.
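The geodesic found above is a helix, and this can be checked numerically. The sketch below is only an illustration under stated assumptions: the names `omega` and `alpha` and the sample values are placeholders chosen here, not the book's notation. It checks that the curve lies on the cylinder x^2 + y^2 = a^2 and that its second derivative with respect to arc length is parallel to the surface normal (x, y, 0), which is the geometric content of the multiplier term.

```python
import math

def helix(s, a, omega, alpha):
    """Point on the helix at arc length s (omega, alpha are illustrative names)."""
    return (a * math.cos(omega * s), a * math.sin(omega * s), alpha * s)

def second_derivative(f, s, h=1e-3):
    """Central-difference estimate of the second derivative of a vector function."""
    p0, p1, p2 = f(s - h), f(s), f(s + h)
    return tuple((u - 2.0 * v + w) / h**2 for u, v, w in zip(p0, p1, p2))

def check_helix(a=2.0, alpha=0.6, s=1.3):
    """Return (deviation from the cylinder, size of acceleration x normal)."""
    # Unit speed along the curve requires (a*omega)^2 + alpha^2 = 1.
    omega = math.sqrt(1.0 - alpha**2) / a
    f = lambda t: helix(t, a, omega, alpha)
    x, y, _ = f(s)
    ax, ay, az = second_derivative(f, s)
    on_cylinder = abs(x * x + y * y - a * a)
    # Cross product of the acceleration with the normal (x, y, 0);
    # it vanishes when the two vectors are parallel.
    cross = (-az * y, az * x, ax * y - ay * x)
    return on_cylinder, max(abs(c) for c in cross)
```

Both returned numbers should be close to zero for any admissible choice of `a` and `alpha` with |alpha| < 1.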
11.7. BRACHISTOCHRONE IN A RESISTING MEDIUM 435
Exercise 11.18
Consider the functional
\[ S[y, z] = \int_a^b dx \left( y'^2 + z'^2 - y^2 \right), \quad y(a) = A_1, \; z(a) = A_2, \; y(b) = B_1, \]
(a) Show that the Euler-Lagrange equations can be written in the form
\[ \frac{d^4 y}{dx^4} - \frac{d^2 y}{dx^2} - y = 0, \quad y(a) = A_1, \; y'(a) = A_2, \; y(b) = B_1, \; y''(b) = 0, \]
For a particle at P , consider the tangent P N , figure 11.5, to the curve which makes an
angle with the downward vertical, and let s be the distance along the curve from the
starting point, increasing with x. The component of the vertical force of gravity along
the tangent, P N , in the direction of increasing s, is mg cos .
If the magnitude of the resistance per unit mass is R(v), where R(v) is a positive
function such that² R(0) = 0, then by resolving forces along the tangent at P, Newton's
equation becomes
d2 s
m 2 = mR(v) + mg cos . (11.59)
dt
The chain rule gives
\[ \frac{d^2 s}{dt^2} = \frac{dv}{dt} = \frac{dv}{ds}\frac{ds}{dt} = v\frac{dv}{ds} \]
and since y = s cos the equation of motion can be written as the first-order
equation
\[ v\frac{dv}{ds} = -R(v) - g\frac{dy}{ds}. \tag{11.60} \]
We consider only cases where initially the particle is either stationary or moving
downwards with a speed such that R(v) is small compared with the gravitational force, g
per unit mass. Thus v(s) is initially increasing. Subsequently there are two possible
types of motion:
A: v(s) steadily increases until the terminal point is reached, or;
B: v(s) increases to a maximum value at which v'(s) = 0, so here gy'(s) = −R(v) < 0,
2 A typical approximation is to assume that R is proportional to v^2, see section 2.5.3, but this is
poor for low speeds, when R is proportional to v, and fails near the speed of sound.
Exercise 11.19
If the wire is vertical, so s = y, and the particle starts from rest at s = 0, and
R(v) = v 2 , for some constant , show that the equation of motion 11.60 becomes
dv g`
= v 2 + g and hence show that v 2 = 1 e2s , for a particle start-
v
ds
ing at rest where s = 0. p
Note that as s , v g/ and approaches this limiting or terminal speed
monotonically.
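The result of exercise 11.19 can be checked numerically. The sketch below is a hedged illustration: the symbol for the resistance constant is lost in this copy of the text, so the name `kappa` and the sample values are placeholders introduced here. Writing u = v^2 turns the equation of motion into du/ds = 2(g − kappa·u), which is integrated with a classical fourth-order Runge-Kutta step and compared with the closed form v^2 = (g/kappa)(1 − exp(−2·kappa·s)).

```python
import math

def speed_squared_numeric(s_end, g=9.81, kappa=0.5, steps=2000):
    """Integrate du/ds = 2*(g - kappa*u), with u = v**2 and u(0) = 0, by RK4."""
    rhs = lambda u: 2.0 * (g - kappa * u)
    h = s_end / steps
    u = 0.0
    for _ in range(steps):
        k1 = rhs(u)
        k2 = rhs(u + 0.5 * h * k1)
        k3 = rhs(u + 0.5 * h * k2)
        k4 = rhs(u + h * k3)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return u

def speed_squared_exact(s, g=9.81, kappa=0.5):
    """Closed form v**2 = (g/kappa) * (1 - exp(-2*kappa*s))."""
    return (g / kappa) * (1.0 - math.exp(-2.0 * kappa * s))

def terminal_speed(g=9.81, kappa=0.5):
    """Limiting speed sqrt(g/kappa), approached as s grows."""
    return math.sqrt(g / kappa)
```

For large s the value of `speed_squared_exact(s)` approaches `terminal_speed()**2` from below, in line with the monotonic approach noted in the exercise.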
where
p 1
F = H(, v) x0 ( )2 + y 0 ( )2 vv 0 gy 0 with H(, v) = ( )R(v) (11.65)
v
and where ( ) is the Lagrange multiplier, that depends upon the independent variable,
here .
There are five known boundary conditions. The initial values of (x, y, v) are assumed
known, and given by (0, A, v0 ), and the final values of (x, y) are given by (b, 0). The
final value of v is not known, because this depends upon the path taken. For this we
use the natural boundary condition, equation 11.55 (page 433), at = 1,
F
= (1)v(1) = 0. (11.66)
v 0
Assuming that v(1) 6= 0, this gives (1) = 0. In exercise 11.20 it is shown that ( ) < 0
for 0 < 1, and hence that H(, v) > 0.
Exercise 11.20
(a) Show that 0 ( ) > 0 at = 1.
(b) If (1 ) = 0 for 0 < 1 < 1, show that 0 (1 ) > 0. Deduce that ( ) < 0 for
0 < 1, and that H(, v) > 0 for 0 1.
Hint for part (a) you will need the Euler-Lagrange equation for ( ), given in
equation 11.71.
where is another constant. Since (1) = 0 it follows that for type A motion, in which
y 0 ( ) < 0 for all , > 0; for type B motion during which y 0 ( ) changes sign we must
have, < 0. It is shown how the values of the constants and may be determined
by expressions derived at the end of this calculation.
It is now helpful to use s as the independent variable because, using equation 11.63,
1 dx dx 1 dy dy
p = and p =
x0 ( )2 + y 0 ( )2 d ds x0 ( )2 + y 0 ( )2 d ds
and hence equations 11.67 and 11.68 have the simpler form
dx
H(s, v) = , (11.69)
ds
dy 1
H(s, v) = g , H= (s)R(v). (11.70)
ds v
d p
(v) x0 ( )2 + y 0 ( )2 Hv + v 0 = 0
d
and this simplifies to
d p 0 2
v+ x ( ) + y 0 ( )2 Hv = 0. (11.71)
d
Again using s for the independent variable gives the simpler equation
d
v = Hv . (11.72)
ds
Equations 11.69, 11.70 and 11.72 are the three Euler-Lagrange equations that we need
to solve. The remaining analysis is difficult partly because we change variables several
times and partly because it is necessary to keep in mind the expected behaviour of the
solution: in particular the two types of motion described before exercise 11.19 need to
be treated slightly differently.
Since x0 (s)2 + y 0 (s)2 = 1, squaring and adding equations 11.69 and 11.70 gives
2
2 2 2 1
H = + (g ) that is R = 2 + (g )2 (11.73)
v
where we have used the definition 11.65. This is a quadratic equation for and hence
can be used to express as a function of v.
Before solving this equation consider its value at the terminal point, = 1, where
(1) = 0. If the speed at the terminus is Vt , this equation gives
1
Vt2 = ,
2 + 2
Since = 0 at the terminus the correct solution is given by the negative sign, and this
is conveniently written in the form
2 2
R f (v)
g R g = p (11.76)
v 2 + 2
Exercise 11.21
In this exercise the limit R = 0 is considered and it is shown that equations 11.81
and 11.83 reduce to the conventional parametric equations of the cycloid.
(a) Show that in the limit R = 0 equation 11.83 for y(v) reduces to the energy
equation
\[ mg(A - y) = \tfrac{1}{2}mv^2 - \tfrac{1}{2}mv_0^2. \]
(b) Show that if R = 0,
p
2 + 2 v
=
f (v) g 1 2 v 2
and hence that equation 11.81 for x(v) becomes
v v2
Z
x(v) = dv .
g v0 1 2 v 2
(c) Using the substitution v = sin and setting v0 = 0, show that the equations
found in parts (a) and (b) become
c2 1
x= (2 sin 2), y = A c2 sin2 , c2 = , 0 b .
2 22 g
(d) Show also that g = / tan , and hence that = / tan b . Deduce that
> 0 if 0 b < /2 and < 0 if /2 < b < ; explain the significance of the
condition b = /2.
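The resistance-free limit in this exercise can be illustrated numerically. The sketch below is a hedged example, not part of the exercise: it assumes the cycloid in the form x = (c^2/2)(2φ − sin 2φ), y = A − c^2 sin^2 φ with A = c^2 and v0 = 0, uses the name `phi` for the parameter (the original symbol is lost in this copy), and takes the speed from energy conservation. It computes the descent time along the cycloid and compares it with the time along the straight wire joining the same end points.

```python
import math

def cycloid_time(c, g, phi_b, n=1000):
    """Descent time along the cycloid from phi = 0 to phi_b (midpoint rule)."""
    h = phi_b / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * h            # midpoints avoid the 0/0 at phi = 0
        dx = 2.0 * c**2 * math.sin(phi) ** 2
        dy = -2.0 * c**2 * math.sin(phi) * math.cos(phi)
        ds = math.hypot(dx, dy)        # ds/dphi
        drop = c**2 * math.sin(phi) ** 2   # A - y
        v = math.sqrt(2.0 * g * drop)  # energy conservation with v0 = 0
        total += ds / v * h
    return total

def straight_line_time(A, b, g):
    """Descent time from rest along the straight wire from (0, A) to (b, 0)."""
    length = math.hypot(A, b)
    acceleration = g * A / length      # component of gravity along the wire
    return math.sqrt(2.0 * length / acceleration)

c, g, phi_b = 1.0, 1.0, math.pi / 2
A = c**2 * math.sin(phi_b) ** 2
b = 0.5 * c**2 * (2.0 * phi_b - math.sin(2.0 * phi_b))
t_cycloid = cycloid_time(c, g, phi_b)
t_line = straight_line_time(A, b, g)
```

With these assumptions the integrand ds/v is the constant c·sqrt(2/g), so the cycloid time is c·sqrt(2/g)·phi_b, and it is smaller than the straight-line time.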
In general equations 11.81 and 11.82 can be dealt with only using numerical methods,
and this is not easy because it is necessary to solve two coupled nonlinear equations,
which can be evaluated only by numerical integration. However, if the resistance is
relatively small we expect the stationary path to be close to that of the cycloid of the
resistance free motion, which suggests making an expansion in powers of R(v). Such
an analysis also helps check numerical solutions.
In order to facilitate this expansion it is helpful to replace R(v) by R(v), where
is a small positive, dimensionless, quantity, and to use to keep track of the expansion.
An approximation to f (v), to order , can be written,
p
2 + 2 v
= q + O(2 ).
f (v) g 1 v 2 2 2
vR(v)
g
The first task is to determine the values of and b from the terminal conditions. This
is facilitated by noting that the equation y(b ) = 0 is a quadratic in 1/(g2 ),
sin2 b
G2 (b ) + A = 0.
g 2 4 2g2
The quadratic term is proportional to so one of the roots behaves as 1 , as 0,
and since we require a root that is finite when there is no resistance, the relevant
solution is
1 4A
= . (11.88)
g2
q
sin2 b + sin4 b 16AG2 (b )
This expression defines in terms of b , but numerical calculations show that it is real
only for small .
Using the equation = / tan b for allows us to write the equation x(b ) = b in
the form
1
b= (2b sin 2b ) + 2 4 G1 (b ). (11.89)
4g2 g tan b
Since g2 is given in terms of b by equation 11.88 this is a single equation for b that
can be solved numerically.
In figure 11.6 we show an example of such a solution. For the purposes of illustration
we choose g = 1 and take the end points to be (0, A), with A = 2/( 2), and (b, 0),
with b = 1, so for the cycloid b = /4. For these parameters it is necessary that
< 0.135 (approximately) for (b ) to be real, so we take = 0.12. As might be
expected the resistance forces the stationary path below that of the cycloid, on to a
path that is initially steeper.
[Figure 11.6: graphs of y against x, for 0 ≤ x ≤ 1, comparing the cycloid with the stationary path with resistance.]
Now return briefly to case B, in which the speed reaches a maximum value along the path,
so v(s) is not a monotonic increasing function for all s. If v'(s) = 0 at some intermediate
point where s = S_m and v = V_m, then the equation of motion 11.60 shows that at this
point gy'(S_m) = −R(V_m) < 0, that is y(s) is still decreasing, so the maximum speed is
reached before the lowest point of the path; this is contrary to the case R = 0 where
energy conservation ensures that these points coincide. Substituting this value of y'
into the Euler-Lagrange equation 11.70 for y'(s) gives the relation
R
g 2 R2 = g ,
v
which, on comparing with equation 11.76, gives f(V_m) = 0. Prior to this point the speed
is increasing to its maximum, V_m, and y'(s) < 0; subsequently v decreases steadily to the
speed at the terminus. The vertical component of the velocity changes sign when y'(s) = 0.
This situation is summarised in figure 11.7.
Figure 11.7 Diagram showing where v'(s) and y'(s) are zero.
On the first part of the path v 0 (s) > 0 and g < 0 and we have
g f (v) gR vR2
g = , < 0.
g 2 R 2 2 + 2 v(g 2 R2 )
p
We now use the limiting case R = 0, dealt with in exercise 11.21, to suggest how this
problem may be simplified. Assuming that at v = Vm , f (v)2 has a simple zero, it is
convenient to factor f (v) in the form f (v)2 = (Vm2 v 2 )f1 (v), where f1 (v) > 0 for
0 v Vm . Now define a new parameter [0, ] by v = pVm sin , so v increases for
< /2 and decreases for > /2, and then f (v) = Vm f1 (v) cos . If R = 0 then
Vm = 1/ and this is the same parameter used for the cycloid. The two expressions for
g can now both be written in the form
p
gVm cos f1 (v) gR vR2
g = 2 , v = Vm sin .
g R2 v(g 2 R2 )
p
2 + 2
In terms of equations 11.80 for x and y become
dx p v dy p v(g )
= 2 + 2 p and = 2 + 2 p .
d f1 (v) d f1 (v)
Substituting for g and integrating gives
Z Z
p v p sin cos
x() = 2 + 2 d p = Vm2 2 + 2 d , (11.90)
0 f1 (v) 0 f (v)
and
sin cos p 2 gR vR2
Z Z
y() = gVm2 d + 2 d
g 2 R2
p
0 0 (g 2 R2 ) f1 (v)
v
v cos gR vR2
Z p Z
= Ag dv 2 2
Vm 2 + 2 d . (11.91)
v0 g R 0 f (v) g 2 R2
11.8. BRACHISTOCHRONE WITH COULOMB FRICTION 445
The first integral in this expression is the equivalent of the kinetic energy discussed in
part (a) of exercise 11.21, to which it reduces when R = 0. Further, for < /2 these
two equations for (x(), y()) are identical to equations 11.81 and 11.82, but now they
are valid for all . The two equations for and are obtained by integrating to t ,
where Vt = Vm sin t and where t > /2 if < 0 and t < /2 if > 0.
Exercise 11.22
Consider the case where the initial speed, v0 , is large, so that R(v0 ) > g, and show
that the equations for the stationary path are now
dx p v dy p v
= 2 + 2 and = ( g) 2 + 2
dv f (v) dv f (v)
where
2
g 1
f (v)2 = (2 + 2 )R 2 g 2 2 + 2 2 .
v v
Hence show that in the limit g 0 the stationary path between the points (0, A)
and (b, 0) is the straight line y = A(1 x/b), as expected.
The Cartesian coordinates of the end points of the wire are taken to be (x, y) = (0, A),
for the starting point, and (b, 0) for the terminus, with A > 0 and b > 0, and where
the y-axis is vertically upwards. If m is the mass of the bead this configuration and
the forces acting on the bead are shown in figure 11.8. The gradient of the wire at the
bead is tan = dy/dx, where y(x) is the required curve.
3 N Ashby, W E Brittin, W F Love and W Wyss, Amer J Phys 1975, 43, pages 902-6.
Figure 11.8 Diagram showing the wire and its terminal points, on the left, and the
forces acting on the bead on the right: here N is the force normal to the wire.
There are three forces acting on the bead, as shown on the right of figure 11.8; that due
to gravity, the force N normal to the wire, which does not directly affect the motion, and
the frictional force of magnitude N directed along the wire and opposing the motion.
Here is the constant coefficient of friction and 0. For the reason discussed above
for a given value of we expect no stationary paths if b/A is too large.
The forces on the bead in the x- and y-directions are obtained directly by resolving
the forces shown in the inset of figure 11.8,
where we use the notation (due to Newton) $\dot{x} = dx/dt$ and $\ddot{x} = d^2x/dt^2$. Along the
wire, if v is the speed
mv = FT = N mg sin . (11.96)
Eliminating N from equations 11.94 and 11.95 gives
But also
x = v cos and y = v sin , (11.98)
and by differentiation we see that x sin y cos = v , so that equation 11.97 becomes
N = mv + mg cos . (11.99)
By substituting this into equation 11.96, for the tangential motion, we obtain the equa-
tion of motion
v + (v + g cos ) + g sin = 0. (11.100)
Using equation 11.98 this equation can be written in the alternative form
v v + v 2 + g x + g y = 0. (11.101)
In this equation (x, y) are related to v and , by geometry, equation 11.98; squaring
and adding these equations gives the obvious identity v 2 = x2 + y 2 , which is one of the
constraints on the functional. Differentiation of equations 11.98 gives
y cos x sin yx xy
= = .
v v2
This relation, together with the equation of motion 11.101, is the other constraint.
Exercise 11.23
A bead slides on a rough wire joining (0, A) to (b, 0) in a straight line, starting
from (0, A) with speed v0 .
Show that provided v02 > 2g(b A) the bead reaches the terminus at the time
2 A2 + b 2
t= p .
v0 + v02 + 2g(A )
Exercise 11.24
Consider a wire in the shape of the quadrant of a circle of radius R, centre at
(R, R) joining the points (0, R) and (R, 0). The coordinates of a point on this
quadrant can be expressed in terms of the angle ,
x = R(1 cos ), y = R(1 sin ), 0 ,
2
dv
v + v 2 = gR(cos sin ).
d
(c) By making an appropriate change of variable deduce, without solving the equa-
tion, that if v(0) = 0 the value of for which v(/2) = 0 is independent of R.
(d) By solving the differential equation derived in part (b) with v(0) = 0 show
that v(/2) = 0 for = 1 where 1 is the solution of
22 + 3e = 1.
Deduce that if is slightly larger than 1 the bead does not reach the terminus.
with a prime denoting differentiation with respect to . For 0 , since tan = y/x =
y 0 /x0 , differentiation gives
y 00 x0 y 0 x00
vv 0 + gy 0 + + gx0 = 0.
t0 2
The auxiliary functional is therefore
Z 1
T [x, y, v, t] = d F (x0 , x00 , y 0 , y 00 , v, v 0 , t0 ) (11.103)
0
where
y 00 x0 y 0 x00
p
0 0 0
F = t + 1 vv + gy + + gx0 + 2 x0 2 + y 0 2 vt0 , (11.104)
t0 2
with both the Lagrange multipliers, 1 and 2 , depending upon . The dependent
variables are (x, y, v, t) and the functional contains second derivatives of x and y.
The known boundary conditions at the start, = 0, are
F y0 F x0
00
= 1 0 2 = 0, 00
= 1 0 2 = 0, at = 0 and 1, (11.107)
x t y t
F F x0 y 00 y 0 x00
= 0, = 1 2 1 2 v
t t0 t0 3
F F
= 1 v 0 2 t0 , = 1 v,
v v 0
F 1 y 00 2 x0 F 1 y 0
= + 1 g + , = ,
x0 t0 2 x00 t0 2
p
x0 2 + y 0 2
F 1 x00 2 y 0 F 1 x0
0
= 0 2 + 1 g + p , 00
= 02 .
y t x0 2 + y 0 2 y t
From these expressions we obtain the four Euler-Lagrange equations in terms of , after
which we may replace by t (because the choice of parameter is arbitrary). Thus the
four following Euler-Lagrange equations are obtained
where c1 , cx and cy are integration constants. These four equations, together with the
constraints allow a solution to be found; remarkably these equations can be integrated
in terms of known functions, though this process is not simple.
Using equations 11.110 and 11.98 we see that x1 = 2 cos and y1 = 2 sin .
Equation 11.109 gives 2 in terms of 1 , and the second derivatives in equations 11.111
and 11.112, x and y, may be replaced by the first derivatives v and using
Now note that the combination v + v also occurs in the equation of motion 11.101,
which can therefore be used to obtain two algebraic equations relating v and 1 . Thus
equations 11.113 and 11.114 become
c1
1 g 1 2 sin cos 2 sin2 + (cos sin )
= cx , (11.115)
v
2 2
c1
1 g 1 + 2 sin cos + 2 cos (sin + cos ) = cy . (11.116)
v
cos
v() = where h() = 1 + 2 sin cos + 2C cos2 (11.117)
Bh()
and
cx cy cy + cx
1 g = Bc1 (C + tan ) where B = , C= . (11.118)
c1 (1 + 2 ) cx cy
x0 () dx dt dy dt
x = = v() cos that is = v cos and similarly = v sin .
t0 () d d d d
dt 2 1
gB = 2 . (11.119)
d h h
Hence the differential equations for x() and y() are
dx 2 1 2 dy 2 1
gB 2 = cos 2
and gB = sin cos . (11.120)
d h3 h2 d h3 h2
The two boundary conditions give two equations for B and 1 which may be solved
(numerically) to yield the stationary path. Some examples of the solutions of these
equations are shown in figure 11.9; here the frictionless case ends tangentially to the
x-axis and if > 0 the stationary path dips below the x-axis, but too little to be seen
on this graph.
Figure 11.9 Graphs of the curves traced out by equations 11.121 and 11.122
for the terminal points (0, 1) and (/2, 0) for which the frictionless brachis-
tochrone, = 0, ends tangentially to the x-axis; this is depicted by the dashed
line. The cases = 0.2, 0.3 and 0.5 are shown.
Figure 11.10 shows stationary paths with the end points (0, 1) and (5, 0) for which
the frictionless brachistochrone dips below the x-axis. In this case the distance travelled
is longer than in figure 11.9 and the value of above which there is no stationary path
is smaller, as illustrated in the problems considered in exercises 11.24 and 11.32.
Figure 11.10 Graphs of the curves traced out by equations 11.121 and 11.122
for the terminal points (0, 1) and (5, 0) and various values of , with the case
= 0 shown with the dashed line.
Exercise 11.25
Assuming that v0 = 0 show that at the end points h() = 1, where h() is defined
in equation 11.117, and that h() has a single minimum at = 1 /2 /4.
Find the minimum value of h() and deduce that solutions exist only if
1
tan + < 1.
2 4
Exercise 11.26
In the friction free limit, = 0, show that equations 11.121 and 11.122 give
1 1
x= (2 sin 2) and y =A (1 cos 2), = + ,
4gB 2 4gB 2 2
and that 1 is related to A by
b 21 sin 21
= , 1 = + 1 .
A 1 cos 21 2
11.9. MISCELLANEOUS EXERCISES 453
d2 y
+ y = 0, y(0) = y() = 0,
dx2
where is the Lagrange multiplier.
(b) Show that the functions y(x) = 2/ sin nx, with Lagrange multiplier = n2 ,
p
Exercise 11.28
(a) Show that the functional, which is quadratic in y and y',
\[ S[y] = \int_a^b dx \left( p(x)y'^2 - q(x)y^2 \right), \quad y(a) = y(b) = 0, \]
and the constraint $\int_a^b dx\, w(x)y(x)^2 = 1$ lead to the linear equation
\[ \frac{d}{dx}\left( p(x)\frac{dy}{dx} \right) + \left( q(x) + \lambda w(x) \right) y = 0, \quad y(a) = y(b) = 0. \]
(b) If the constraint were not also quadratic in y(x) would the resulting Euler-
Lagrange equation be linear?
Exercise 11.29
Find the stationary value of the functional $S[y] = \int_0^1 dx\, y^2$ subject to the
constraint $\int_0^1 dx\, y = a$.
Exercise 11.30 Z
Find the function y(x) making the functional P [y] = dx y ln y stationary
Z Z
subject to the two constraints dx y = 1 and dx x2 y = 2 , and where
y(x) goes to zero sufficiently rapidly as |x| for all integrals to exist.
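You will find the following integrals useful:
\[ \int_{-\infty}^{\infty} dx\, e^{-ax^2} = \sqrt{\frac{\pi}{a}}, \qquad \int_{-\infty}^{\infty} dx\, x^2 e^{-ax^2} = \frac{\sqrt{\pi}}{2a^{3/2}}, \quad \text{where } \Re(a) > 0. \]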
This is an important problem that occurs in statistical physics and information
theory, where y(x) is the probability distribution of a continuously distributed
random variable x and P [y] is the entropy. The first constraint is just the normal-
isation condition, satisfied by all distributions, and the second is the variance.
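The two Gaussian integrals quoted in this exercise can be confirmed numerically for real a > 0. The sketch below is a hedged check only: the helper names, the truncation of the infinite range at a multiple of 1/sqrt(a), and the sample value of a are choices made here for illustration.

```python
import math

def simpson(f, lo, hi, n=20000):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def gaussian_moments(a, cutoff=12.0):
    """Approximate the integrals of exp(-a x^2) and x^2 exp(-a x^2) over the real line.

    The range is truncated at +-cutoff/sqrt(a), where the integrand is negligible.
    """
    L = cutoff / math.sqrt(a)
    i0 = simpson(lambda x: math.exp(-a * x * x), -L, L)
    i2 = simpson(lambda x: x * x * math.exp(-a * x * x), -L, L)
    return i0, i2

a = 1.7
i0, i2 = gaussian_moments(a)
exact0 = math.sqrt(math.pi / a)
exact2 = math.sqrt(math.pi) / (2.0 * a**1.5)
```

The ratio `i2 / i0` equals 1/(2a), which is how the second constraint fixes the multiplier of the x^2 term once the distribution is normalised.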
Exercise 11.31
Show that the stationary path of the functional
\[ S[y] = \int_0^{\pi} dx\, y'^2, \quad y(0) = y(\pi) = 0, \]
subject to the constraint $\int_0^{\pi} dx\, y \sin x = a$, is $y(x) = (2a/\pi)\sin x$.
Exercise 11.32
The points (0, a) and (b, 0), respectively on the Oy and Ox axes, are joined by
a rough wire in the shape of the quadrant of the ellipse parameterised by the
equations
x = b(1 cos ), y = a(1 sin ), 0 .
2
A bead slides down this wire under the influence of gravity and Coulomb friction,
show that the equation of motion 11.101 can be written in the form
dz 2abz
+ 2 = g(a cos b sin ),
d a cos2 + b2 sin2
1 1
1 2 00 1 3 000
Z Z
2 2 0
dy f ( +y) y = 2 dy f () + yf () + y f () + y f ( + )
2 2 6
for some in (, ), where we have used Taylors series, (section 1.3.8). Since
4
4 5
Z Z Z
dy 2 y 2 = 3 , dy 2 y 2 y = 0, dy 2 y 2 y 2 =
,
3 15
we see that
b
4
Z
dx f (x)g(x ) = f () + O( 3 ), = .
a 3
and the associated Euler-Lagrange equation is $2y'' + \lambda = 0$. This has the general solution
\[ y(x) = -\frac{\lambda}{4}x^2 + ax + b, \]
where a and b are constants. The boundary condition at x = 0 gives b = 0: that at
x = 1 gives a = λ/4, so the solution is y(x) = (λ/4)x(1 − x). The constraint gives
\[ A = \frac{\lambda}{4}\int_0^1 dx\, x(1 - x) = \frac{\lambda}{24} \quad \text{giving} \quad y(x) = 6Ax(1 - x). \]
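A quick numerical comparison can make this solution more concrete. The sketch below is hedged: the statement of the exercise is not present in this copy, so it assumes the functional is the integral of y'^2 over [0, 1] with y(0) = y(1) = 0 and the constraint that the integral of y equals A, which is consistent with the Euler-Lagrange equation quoted above. It compares the parabola 6Ax(1 − x) with another admissible path, a sine arch scaled to satisfy the same constraint.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def functional_and_area(y, dy):
    """Return (integral of y'^2, integral of y) over [0, 1]."""
    return simpson(lambda x: dy(x) ** 2, 0.0, 1.0), simpson(y, 0.0, 1.0)

A = 0.8
# Stationary path and its derivative.
parabola = (lambda x: 6.0 * A * x * (1.0 - x),
            lambda x: 6.0 * A * (1.0 - 2.0 * x))
# Comparison path: sine arch with the same area A and the same end values.
sine = (lambda x: 0.5 * math.pi * A * math.sin(math.pi * x),
        lambda x: 0.5 * math.pi**2 * A * math.cos(math.pi * x))

S_parabola, area_parabola = functional_and_area(*parabola)
S_sine, area_sine = functional_and_area(*sine)
```

Under these assumptions the parabola gives 12A^2 while the sine arch gives (π^4/8)A^2, which is slightly larger, as expected if the stationary path is a minimum.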
and the associated Euler-Lagrange equation is $2(xy')' + \lambda = 0$, which has the general
solution
\[ y(x) = -\frac{\lambda}{2}x + a\ln x + b, \]
where a and b are constants. The boundary condition at x = 1 gives b = λ/2: that at
x = 2 gives 0 = a ln 2 − λ/2, so the solution is
\[ y(x) = \frac{\lambda}{2}(1 - x) + \frac{\lambda \ln x}{2\ln 2}. \]
and hence
2 ln 2 ln x
y(x) = 1x+ .
3 ln 2 2 ln 2
Hence
24 46 + 232
c1 c 2 = .
48(1 )2
Also
1 1
1 1
Z Z
2
c3 = dx z 0 2 = dx (3 + 2 2x) = 1 + .
0 4(1 + )2 0 12(1 + )2
Thus the constraint becomes
24 46 + 232 1
c= 1 .
48(1 )2 12(1 + )2
Define = E[y ] E[y+ ], so we need to show that > 0. Write in the form
d d+
= ac0 + Lc cosh + cosh
c0 c0
1 a + 2d a + 2d+ a
c20 cosh + cosh sinh
2 c0 c0 c0
d + d d+ d
= ac0 + 2Lc0 cosh cosh
2c0 2c0
a a + d + + d d+ d
c20 sinh cosh cosh .
c0 c0 c0
y
y 0 F y0 F = p y 2 = c,
1 + y0 2
where c is a constant.
The boundary condition at x = v is given by equation 11.31 with (x, y) = y: hence
as in exercise 11.10, c = 0 and,the equation for the stationary path is exactly the same
as in exercise 11.10, that is 2 = (x B)2 + y 2 , for some constant B. The boundary
condition at x = 0 gives = B, so the stationary path is a semicircle of radius with
centre at (, 0) and hence v = 2. Since the shape created is a sphere of radius , its
area and volume are A = 42 and V = 34 3 = A3/2 /(6 ).
The left-hand end of the cable is constrained to the curve = x = 0, so the boundary
condition at x = 0 is, from equation 11.31,
(y )y 0
0 = F y0 = p , (x = 0), which gives d = 0.
1 + y0 2
) 1 + y 0 2 , so
y 0 (y )
p = 0, that is y(0) = or y 0 (0) = 0.
1 + y0 2
The first equation gives cosh(d/c) = 0, which cannot be satisfied, and the second gives
sinh(d/c) = 0, which gives d = 0.
The transversality condition, equation 11.31, with = x/a + y/b 1, gives
0
y y 1 v a
p = 0 at x = v, and hence sinh = .
1 + y0 2 a b c b
This is one equation relating v and c. The other is given by the length constraint
Z v p
v ac bL
L= dx 1 + y 0 2 = c sinh = hence c = .
0 c b a
Thus the required solution is
Lb ax Lb a
y =+ cosh , 0x sinh1 .
a bL a b
Finally, at x = v we have v/a + y(v)/b = 1 and since
Lb b Lp 2
y(v) = + cosh sinh1 =+ a + b2 ,
a a a
this gives
Lp 2 vb Lp 2 Lb2 a
=b a + b2 =b a + b2 2 sinh1 .
a a a a b
sin 1 + sin 1
sin 1 cos 2 + sin 2 cos 1 2
2
2
2
= .
cos 1 + sin 1
2 cos 1 cos 2 2
2
2
2
Assuming that 1 6= 2 , because we have already dealt with this solution, gives
2 1 + 2
cos = cos 1 cos 2
2
which simplifies to 1 = cos 1 cos 2 + sin 1 sin 2 = cos(1 2 ), the only solution of
which is 1 = 2 + 2n.
Since the total energy is the sum of these three components we obtain the given result.
The constraints are just the lengths along each portion of the cable.
where
y1 (x), 0 x ,
y(x) = with y1 () = y2 ().
y2 (x), x a,
Now evaluate the functional on the varied path y + h, using the method described in
section 9.5.2. The corner moves to the point ( + u, y() + v), where u and v are
independent variables. Thus
Z +u Z a
E[y+h] = M g y() + v +g dx F (y1 +h1 , y10 +h01 )+g dx F (y2 +h2 , y20 +h02 )
0 +u
p (11.123)
where F (y, y 0 ) = (y ) 1 + y0 2.
We have, as in section 9.5.2 (but with notation changes)
Differentiate equation 11.123 with respect to and then set = 0 to obtain the Gateaux
differential,
h i
E[y] = M gv + gu F (y1 , y10 ) F (y2 , y20 )
x=
Z Z a
dx h1 Fy1 + h01 Fy10 + g dx h2 Fy2 + h02 Fy20 .
+g
0
11.10. SOLUTIONS FOR CHAPTER 11 463
First consider the subset of variations for which u = h() = 0, to obtain the Euler-
Lagrange equations satisfied by y1 and y2 :
d
Fyk0 Fyk = 0, y1 (0) = B y2 (a) = A.
dx
Since $F = (y - \lambda)\sqrt{1 + y'^2}$ is independent of x we have the first integrals
\[ \frac{y_k - \lambda_k}{\sqrt{1 + y_k'^2}} = c_k = \text{constant}, \quad k = 1 \text{ and } 2. \tag{11.124} \]
E[y] = gv M Fy20 Fy10 + gu F (y1 , y10 ) y10 Fy10 F (y2 , y20 ) y20 Fy20 .
This expression must be zero for all u and v and hence we have the conditions
The first of these equations represents the resolution of forces in the vertical direction
at x = : the second equation is the resolution of forces in the horizontal direction.
In addition the first integral, equation 11.125, represents the fact that the horizontal
component of the tension in the cable is constant. Since F y 0 Fy0 is c1 or c2 we see
that the second of these conditions gives c1 = c2 = c. Using the actual expression for
F the first condition becomes
( )
(y2 2 )y20 (y1 1 )y10
M = p p = c (y20 y10 ) . (11.127)
1 + y20 2 1 + y10 2
Now we have sufficient conditions to solve the problem, as may be seen by substituting
the solutions 11.126 into these equations.
d2 x 0 d2 y
s + sy (s) = 0 and s sx0 (s) = 0.
ds2 ds2
On putting t = s, so s = 1 these integrate to
x0 + y = and y 0 x = .
Differentiate the second with respect to s and use the first to substitute for x0 to obtain
2 y 00 + y = which has the general solution y = + a cos(s/ + ), where a and are
constants. From this we obtain x = y 0 = a sin(s/ + ).
(b) The curve is closed and has length L, that is x(0) = x(L) and y(0) = y(L), so
that L/ = 2. Further (x + )2 + (y a)2 = a2 , so a is the radius of the circle of
circumference L, that is 2a = L.
Integrating these and using s, the arc length, for the independent variables gives as in
exercise 11.16
dx dy
= y and = + x,
ds ds
with solutions
where is a constant.
d d
(y 0 + ) + y = 0 and (z 0 ) + = 0,
dx dx
and the natural boundary condition is z 0 (b) = 0. These equations simplify to
y 00 + y + 0 = 0 and = z 00 ,
d4 y d2 y
4
2 y = 0, y(a) = A1 , z(a) = A2 , y(b) = B1 , z 0 (b) = y 00 (b) = 0.
dx dx
(b) The Euler-Lagrange equation for the functional J[y] is given using the general result
given in section 9.2.1, but see also exercise 3.34 (page 141),
d2
F d F F
2 00
0
+ = 0,
dx y dx y y
dv
v = v 2 + g, v(0) = 0.
ds
Integration gives
v iv
v
Z h
2
dv = s that is ln(g v = 2s
0 g v 2 0
\[ y = A - \frac{1}{g}\int_{v_0}^{v} dv\, v \quad \text{that is} \quad g(A - y) = \tfrac{1}{2}v^2 - \tfrac{1}{2}v_0^2. \]
g v0 2 2
Multiplying by the mass gives the energy equation: the left-hand side is the loss in
potential energy as the particle falls through a distance A y; the right-hand side is
the gain in kinetic energy.
(b) If R = 0
2 g2
2 2 2 1 2 2
f (v) = + g
v2 v2
p
2 + 2
1 v
= g 2 2 + 2 2
hence =
v2 f (v) g 1 2 v 2
v v2
Z
x(v) = dv p .
g v0 1 (v)2
1 1
y =A sin2 = A 2 (1 cos 2) .
22 g 4 g
At the terminus = 0, so = / tan b . Since tan > 0 for (0, /2) and tan < 0
for (/2, ) the result follows.
If b = /2 the cycloid is tangent to the x-axis at the terminus. If b > /2 it crosses
the x-axis and reaches a point lower than the end point. Thus type A motion has
b < /2 and type B motion has b > /2.
and using the condition v = Vt when = 0 we see that the lower sign gives the required
solution. Hence
2 2
R f (v)
R g g = p
g 2 + 2
where f (v) is defined in the question. Then, as in the text,
2
1
H 2 = 2 + (g )2 = R = 2 + (g )2
v
and
dH d d d
H = g(g ) and H Hv R = ( g)
dv dv dv dv
so that
d 1 d f (v)
HHv = R R g( g) = p
dv v dv 2 + 2
As in the text, equation 11.77
dx v d dy ( g)v d
= and =
dv HHv dv d HHv dv
and hence
dx p v dy p v
= 2 + 2 and = ( g) 2 + 2 .
dv f (v) dv f (v)
With / = A/b this gives the straight line y = A(1x/b) through the terminal points.
If A < b the above expression for s(t) is valid only t < t0 where v(t0 )= 0: for t t0
the bead is stationary. The equation for the time to reach the end, s = A2 + b2 is the
same but now both roots are positive and only one satisfies t < t0 and this gives the
above expression for t. If v02 < 2g( A) this time is complex and the bead does not
reach the point x = b.
But
\[ \int dx\, e^{ax + ibx} = \frac{1}{a + ib}\, e^{ax + ibx} = \frac{e^{ax}}{a^2 + b^2}\left( a\cos bx + b\sin bx + i(a\sin bx - b\cos bx) \right) \]
so that
e2
Z
d e2 (cos sin ) = 3 cos + (1 22 ) sin )
1 + 4 2
and hence
1 2 gR h i
v = 2
3 cos + (1 22 ) sin 3e2
2 1 + 4
so that v(/2) = 0 if 22 + 3e = 1.
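The complex-exponential antiderivative used in this solution is easy to confirm numerically. The sketch below is a minimal check with arbitrary sample values; the function names are chosen here for illustration. It compares the real and imaginary parts written out in terms of cosines and sines with the direct complex quotient exp((a + ib)x)/(a + ib).

```python
import cmath
import math

def antiderivative_real_form(x, a, b):
    """exp(ax)/(a^2+b^2) * (a cos bx + b sin bx + i(a sin bx - b cos bx))."""
    prefactor = math.exp(a * x) / (a * a + b * b)
    real = prefactor * (a * math.cos(b * x) + b * math.sin(b * x))
    imag = prefactor * (a * math.sin(b * x) - b * math.cos(b * x))
    return complex(real, imag)

def antiderivative_direct(x, a, b):
    """exp((a + ib) x) / (a + ib), evaluated with complex arithmetic."""
    z = complex(a, b)
    return cmath.exp(z * x) / z

difference = abs(antiderivative_real_form(0.7, 2.0, 1.0)
                 - antiderivative_direct(0.7, 2.0, 1.0))
```

Taking the real part gives the integral of exp(ax) cos bx and the imaginary part gives that of exp(ax) sin bx, which is how a combination of cosine and sine terms is integrated in one step.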
The speed is zero when and satisfy
This equation defines a function (), the angle at which the bead stops. If > 1 ,
where 1 is the solution of this equation when = /2, the implicit function theorem
gives the rate of change of (1 ), d/d = g /g with the derivatives evaluated at
= /2 and = 1 . Thus
4 + 3(1 )e
0 (1 ) = = 1.36
3(1 2e )
Putting = /2 + gives
1 1
Z
x() = d (1 cos 2) = (2 sin 2)
2gB 2 0 4gB 2
1 1
Z
y() = A d sin 2 = A (1 cos 2).
2gB 2 0 4gB 2
which is a linear inhomogeneous equation. Otherwise f'(y) is not linear and the
Euler-Lagrange equation is a nonlinear equation.
and the Euler-Lagrange equation gives y = λ/2. The constraint then gives λ = 2a, so
y = a.
1 + ln y + 1 + 2 x2 = 0 that is y = exp 1 1 2 x2 .
Hence
x2
1
y(x) = exp 2 .
2 2
\[ a = \frac{\lambda}{2}\int_0^{\pi} dx\, \sin^2 x = \frac{\lambda\pi}{4} \quad \text{hence} \quad y(x) = \frac{2a}{\pi}\sin x. \]
dy a cos a
tan = = =
dx b sin b tan
hence
1 d a d ab
= that is = 2 .
2
cos d b sin2 d a cos + b2 sin2
2
dv abv 2
v + 2 = g(a cos b sin ).
d a cos2 + b2 sin2
If z = v 2 /2 this gives
dz 2abz
+ = g(a cos b sin ).
d a2 cos2 + b2 sin2
d(zf )
= g(a cos b sin )f (), z(0) = 0,
d
and integration gives
g
Z
z() = dw (a cos w b sin w)f (w)
f () 0
Sturm-Liouville systems
12.1 Introduction
The general theory of Sturm-Liouville systems presented in the first part of this chapter
was created in a series of articles in 1836 and 1837 by Sturm (1803-1855) and Liouville
(1809-1882): their work, later known as Sturm-Liouville theory, created a new subject
in mathematical analysis. The theory deals with the general linear, second-order
differential equation
\[ \frac{d}{dx}\left( p(x)\frac{dy}{dx} \right) + \left( q(x) + \lambda w(x) \right) y = 0 \tag{12.1} \]
where the real variable, x, is confined to an interval, a ≤ x ≤ b, which may be the whole
real line or just x ≥ 0. The functions p(x), q(x) and w(x) are real and satisfy certain,
not very restrictive, conditions that will be delineated in section 12.4; in any particular
problem these functions are known. A second-order differential equation is said to be
in self-adjoint form when expressed as in equation 12.1: most second-order equations
can be expressed in this form, see exercise 12.1.
In addition to the differential equation, boundary conditions are specified with the
consequence that solutions exist for only particular values of the constant λ = λ_k,
k = 1, 2, ..., which are named¹ eigenvalues: the solution y_k(x) is named the eigenfunction
for the eigenvalue² λ_k. At this stage we shall not specify any boundary conditions,
despite their importance, because different types of problems produce different types of
conditions. Equation 12.1, together with any necessary boundary conditions, is known
as a Sturm-Liouville system, or problem, which belongs to the class of problems known
as eigenvalue problems.
Sturm-Liouville problems are important partly because they arise in diverse cir-
cumstances and partly because the properties of the eigenvalues and eigenfunctions are
1 The fact that we use the same symbol for the eigenvalue and the Lagrange multiplier introduced
in chapter 11, is not a coincidence, as is seen by comparing equation 12.1 with the equation derived in
exercise 11.28, page 453.
2 There are also important examples where the eigenvalues can take any real number in an interval
(which may be infinite), and there are examples in which the eigenvalues can be both discrete and
continuous. Such problems are common and important in quantum mechanics. In this course we deal
only with discrete sets of eigenvalues.
476 CHAPTER 12. STURM-LIOUVILLE SYSTEMS
well understood. Moreover, the behaviour of both the eigenvalues and eigenfunctions
of a wide class of Sturm-Liouville systems is remarkably similar and is independent
of the particular form of the functions p(x), q(x) and w(x). In this class of problems
there is always a countable infinity of real eigenvalues $\lambda_k$, $k = 1, 2, \dots$, and the set of
eigenfunctions $y_k(x)$, $k = 1, 2, \dots$, is complete, meaning that these functions may be
used to form generalised Fourier series, as described in section 12.3. Further, there are
simple approximations for both the eigenvalues and eigenfunctions which are accurate
for large k, as shown in exercise 12.35 (page 512).
The achievements of Sturm and Liouville are more impressive when seen in the
context of early nineteenth century mathematics. Prior to 1820 work on differential
equations was concerned with finding solutions in terms of finite formulae or power
series; but for the general equation 12.1 Sturm could not find an expression for the
solution and instead obtained information about the properties of the solution from
the equation itself. This was the first qualitative theory of differential equations and
anticipated Poincaré's work on nonlinear differential equations developed at the end
of that century. Today the work of Sturm and Liouville is intimately interconnected:
however, although they were lifelong friends who discussed their work prior to publication,
this theory emerged from a series of articles published separately by each author during
the period 1829 to 1840. More details of this history may be found in Lützen (1990,
chapter 10).
This chapter introduces the basic theory of Sturm-Liouville systems and shows that
variational principles are useful for the approximation of eigenvalues and eigenfunctions.
Traditional Sturm-Liouville theory does not depend upon the calculus of variations, but
stems from the theory of ordinary linear differential equations which is introduced in
section 12.5.
The Sturm-Liouville eigenvalue problem is, however, readily formulated as a con-
strained variational principle, and this formulation can be used to approximate the
solutions. The crucial property of Sturm-Liouville systems that makes this method so
useful and important is their linearity, which means that the associated functional is
quadratic in y. Besides allowing convenient approximations many general properties
of the eigenvalues can be derived using the variational principle. Some aspects of this
theory are presented in section 12.6.
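As a minimal sketch of how the constrained variational formulation can approximate an eigenvalue, consider the simplest case p = w = 1, q = 0 on 0 ≤ x ≤ π with y(0) = y(π) = 0, for which the smallest eigenvalue is exactly 1. The trial function x(π − x), the helper `simpson` and the variable names are illustrative assumptions, not taken from the text. For any admissible trial function the ratio S[y]/C[y] is an upper bound on the smallest eigenvalue.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

# Hypothetical trial function vanishing at both ends, and its derivative.
def trial(x):
    return x * (math.pi - x)

def trial_derivative(x):
    return math.pi - 2.0 * x

# S[y] = integral of p y'^2 - q y^2 with p = 1, q = 0 (assumed simple case).
S = simpson(lambda x: trial_derivative(x) ** 2, 0.0, math.pi)
# C[y] = integral of w y^2 with w = 1.
C = simpson(lambda x: trial(x) ** 2, 0.0, math.pi)

estimate = S / C  # equals 10 / pi^2 analytically, slightly above the exact value 1
print(estimate)
```

The estimate exceeds the exact eigenvalue by a little over one per cent, even though the trial function is only a quadratic.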
Sturm-Liouville systems are important because they arise in attempts to solve the
linear, partial differential equations that describe a wide variety of physical problems.
In addition most of the special functions that are so useful in mathematical physics,
and the study of which led to advances in analysis in the 19th century, originate in
Sturm-Liouville equations. The importance of these functions should not be under-
estimated, as is frequent in this age of computing, for they furnish useful solutions to
many physical problems and can lead to a broader understanding than purely numerical
solutions. Further, the mathematics associated with these functions is elegant and its
study rewarding. There is no time in this course for any discussion of these functions,
but aspects of the important Bessel function are described in section 12.3.1.
Section 12.2 therefore briefly describes how Sturm-Liouville systems occur and gives
some idea of the variety of types of Sturm-Liouville problems that need to be tackled.
This section is optional, but recommended.
In section 12.3 we consider a particularly simple, solvable, Sturm-Liouville system
and examine the properties of its eigenvalues and eigenfunctions in order to illustrate
all the relevant properties of more general systems, which normally cannot be solved in
terms of elementary functions. Some of these properties depend on elementary properties
of second-order differential equations; this theory is described in section 12.5.
Other properties are endowed on the eigenvalues and eigenfunctions because the canon-
ical form of equation 12.1 is self-adjoint, a term defined in section 12.5.3.
The canonical form of equation 12.1 may seem rather special and to be unrepresen-
tative of most linear second-order differential equations: in fact, as shown in the next
exercise, this equation is typical of a large class of such equations.
Equation 12.1 can be cast into a variety of other forms which are useful in the
following discussion. Additionally this equation, with appropriate boundary conditions,
is the Euler-Lagrange equation of a constrained variational problem, with $\lambda$ as the
Lagrange multiplier, and this is crucial for the later developments in section 12.6. The
following exercises lead you through this background and we recommend that you do
these exercises.
Exercise 12.1
Consider the second-order, homogeneous, linear differential equation
$$a_2(x)\frac{d^2y}{dx^2} + a_1(x)\frac{dy}{dx} + a_0(x)y = 0.$$
$$\frac{d^2u}{dx^2} + I(x)u = 0, \quad u = y\sqrt{p}, \tag{12.3}$$
and where $I(x) = \dfrac{1}{4p^2}\left(p'^2 + 4qp - 2pp''\right)$. Equation 12.3 is sometimes known as
the normal form and I(x) the invariant of the original equation.
Exercise 12.2
(a) Show that the Euler-Lagrange equation for the functional and constraint
$$S[y] = \int_a^b dx\,\left(p\,y'^2 - q\,y^2\right), \qquad C[y] = \int_a^b dx\,w(x)\,y^2 = 1,$$
$$\frac{d^2y}{d\xi^2} + p\,(q + \lambda w)\,y = 0.$$
(c) By putting y = uv and by choosing v carefully, show that the original functional
and constraint can be written in the form
$$S[u] = \int_a^b dx\,\left(u'^2 - \frac{1}{4p^2}\left(p'^2 + 4pq - 2pp''\right)u^2\right), \qquad C[u] = \int_a^b dx\,\frac{w}{p}\,u^2,$$
where u(a) = u(b) = 0. Hence derive the Euler-Lagrange equation for u and
compare this with equation 12.3.
Exercise 12.3
Liouville's normal form:
Consider the functional
$$S[y] = \int_a^b dx\,\left(p(x)\,y'^2 - (q + \lambda w)\,y^2\right).$$
(a) Change the independent variable to $\xi = \xi(x)$ and the dependent variable to
$v(\xi)$ where $y = A(\xi)v(\xi)$. With a suitable choice of $\xi(x)$ show that the functional
can be written in the form
$$S[v] = \frac{1}{2}\Bigl[p\,\xi'(x)\,(A^2)'\,v^2\Bigr]_c^d + \int_c^d d\xi\,\left(\left(\frac{dv}{d\xi}\right)^2 - F(\xi)\,v^2\right),$$
where $\dfrac{d\xi}{dx} = \dfrac{1}{pA^2}$, $c = \xi(a)$, $d = \xi(b)$ and
$$F(\xi) = (q + \lambda w)\,pA^4 - A\,\frac{d^2}{d\xi^2}\left(\frac{1}{A}\right).$$
(b) By defining $A = (wp)^{-1/4}$, show that $\xi'(x) = \sqrt{w/p}$ and the associated
Euler-Lagrange equation is
$$\frac{d^2v}{d\xi^2} + \left(\frac{q}{w} - A\,\frac{d^2}{d\xi^2}\left(\frac{1}{A}\right) + \lambda\right)v = 0.$$
This transformation is sometimes named Liouville's transformation, and is particularly
useful for approximating the eigenvalues and eigenfunctions when $\lambda$ is
large, see exercise 12.35 (page 512).
The original work of Sturm appears to have been motivated by the problem of
heat conduction. One example he discussed is the temperature distribution in a one-
dimensional bar, described by the linear partial differential equation
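$$h(x)\frac{\partial u}{\partial t} = \frac{\partial}{\partial x}\left(p(x)\frac{\partial u}{\partial x}\right) - l(x)\,u, \tag{12.4}$$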
where u(x, t) denotes the temperature at a point x of the bar at time t, and h(x), p(x)
and l(x) are positive functions. If the surroundings of the bar are held at constant
temperature and the ends of the bar, at x = 0 and x = L, are in contact with large
bodies at a different temperature, then the boundary conditions can be shown to be
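$$\begin{aligned} p(x)\frac{\partial u}{\partial x} + \alpha\,u(x, t) &= 0, \quad \text{at } x = 0,\\ p(x)\frac{\partial u}{\partial x} + \beta\,u(x, t) &= 0, \quad \text{at } x = L, \end{aligned} \tag{12.5}$$
for some constants $\alpha$ and $\beta$. Finally, the initial temperature of the bar needs to be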
specified, so u(x, 0) = f (x) where f (x) is the known initial temperature.
Sturm attempted to solve this equation by first substituting a function of the form
$u(x, t) = X(x)e^{-\lambda t}$, where $\lambda$ is a constant and X(x) is independent of t. This yields
the ordinary differential equation
$$\frac{d}{dx}\left(p(x)\frac{dX}{dx}\right) + \bigl(\lambda h(x) - l(x)\bigr)X = 0 \tag{12.6}$$
for X(x) in terms of the unknown constant $\lambda$, together with the boundary conditions
This is an eigenvalue problem. Assuming that there are solutions $X_k(x)$ with eigenvalues
$\lambda = \lambda_k$, for $k = 1, 2, \dots$, Sturm used the linearity of the original equation to write a
general solution as the sum
$$u(x, t) = \sum_{k=1}^{\infty} A_k X_k(x)\,e^{-\lambda_k t},$$
where the coefficients $A_k$ are arbitrary. This solution formally satisfies the differential
equation and the boundary conditions, but not the initial condition u(x, 0) = f(x),
which will be satisfied only if
$$f(x) = \sum_{k=1}^{\infty} A_k X_k(x).$$
Thus the problem reduces to that of finding the values of the $A_k$ satisfying this equation.
Fourier (1768–1830) and Poisson (1781–1840) found expressions for the coefficients
$A_k$ for particular functions h(x), p(x) and l(x), but Sturm and Liouville determined
the general solution.
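The separation-of-variables series can be sketched numerically in the simplest special case. The code below assumes h = p = 1 and l = 0 on 0 ≤ x ≤ π, with the ends held at u = 0 (a simplification of the boundary conditions above), so that the eigenfunctions are sin kx with eigenvalues k². The initial profile f(x) = x(π − x), the truncation at 50 terms and all function names are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def f(x):
    """Hypothetical initial temperature profile, zero at both ends."""
    return x * (math.pi - x)

def coefficient(k):
    """A_k = (2/pi) * integral of f(x) sin(kx), using orthogonality of sin(kx)."""
    return (2.0 / math.pi) * simpson(lambda x: f(x) * math.sin(k * x), 0.0, math.pi)

# Coefficients A_1, ..., A_50 of the truncated series.
A = [coefficient(k) for k in range(1, 51)]

def u(x, t):
    """Truncated series: sum of A_k sin(kx) exp(-k^2 t)."""
    return sum(a * math.sin(k * x) * math.exp(-k * k * t)
               for k, a in enumerate(A, start=1))

print(u(1.0, 0.0), f(1.0))  # the series at t = 0 should be close to f
print(u(1.0, 0.5))          # the temperature has decayed at a later time
```

At t = 0 the truncated sum reproduces the initial profile to within the truncation error, and for t > 0 the high-order terms decay fastest because of the factor exp(−k²t).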
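$$\nabla^2\psi + k^2\psi = 0, \tag{12.8}$$
$$\nabla^2\psi - k\,\frac{\partial\psi}{\partial t} = 0, \quad \text{heat or diffusion equation,} \tag{12.9}$$
$$\nabla^2\psi - \frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} = 0, \quad \text{wave equation,} \tag{12.10}$$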
1 2
1
(x) 2 2 = 0, canal or horn equation, (12.11)
(x) x x c t
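$$\nabla^2\psi = \frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}.$$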
The first of these equations arises in the solution of Poisson's equation, that is
$\nabla^2\psi = F(\mathbf{r})$, and similar equations occur when using separation of variables. The
second equation describes diffusion processes and heat flow. The third equation 12.10
is the wave equation for propagation of small disturbances in an isotropic medium and
describes a variety of wave phenomena such as electromagnetic radiation, water and
air waves, waves in strings and membranes. The fourth equation is a variant of the
previous wave equation and in this form was derived by Green (1793–1841) in his
1838³ paper describing waves on a canal of rectangular cross section but with a width
varying along its length; a similar equation describes, approximately, the air pressure
in a horn, though in many instruments the flare is sufficiently rapid for the longitudinal
and radial modes to couple, so it is necessary to use the two-dimensional version of 12.11
in which the variation of the air pressure along the length of the pipe and in the radial
direction is included.
The many different forms of the Sturm-Liouville system that we discuss in the fol-
lowing sections are largely a consequence of the shapes of the regions in which the
physical system is defined and of the coordinate system that simplifies the equations.
A Sturm-Liouville system arises when the method of separation of variables is used to
reduce a partial differential equation to a set of uncoupled ordinary differential equa-
tions. Whether or not such a simplification is feasible depends upon the existence of
a suitable coordinate system and this depends upon the form of the original equation
and the shape of the boundary. Relatively few problems yield to this treatment, but
it is important because it is one of the principal means of finding solutions in terms
of known functions: the main alternatives are numerical and variational methods, the
latter being introduced in section 12.6.
In problems with two spatial dimensions separation of variables can be used with
equations 12.8 and 12.10 for rectangular, circular and elliptical boundaries but not, for
example, most triangular boundaries.
3 On the Motion of Waves in a variable Canal of small Depth and Width, 1838 Camb Phil Soc, Vol
$$\frac{d}{dx}\left((1 - x^2)\frac{d\Theta}{dx}\right) + \left(\lambda - \frac{\mu^2}{1 - x^2}\right)\Theta = 0, \quad -1 \le x \le 1. \tag{12.17}$$
Both equation 12.15 for $\Phi$ and 12.17 for $\Theta$ are in the canonical form of equation 12.1.
Comparison of 12.15 for $\Phi$ with equation 12.1 shows that the separation constant $\mu^2$
now plays the role of the eigenvalue; its value is determined by the boundary conditions
that $\Phi$ needs to satisfy. Comparison of 12.17 for $\Theta$ with equation 12.1 shows that here
$\lambda$ plays the role of the eigenvalue.
This analysis shows that in spherical polar coordinates the equation $\nabla^2\psi + k^2\psi = 0$
gives rise to three Sturm-Liouville systems for $R(r)$, $\Theta(\theta)$ and $\Phi(\phi)$ where $\psi = R(r)\Theta(\theta)\Phi(\phi)$.
These equations are summarised in table 12.1.
Table 12.1: Summary of the three Sturm-Liouville systems arising from separation of vari-
ables of equation 12.8 using spherical polar coordinates, giving the explicit form for the three
functions p, q and w, in each case.
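Equation                                                                       p           q                    w       Eigenvalue
$\Phi'' + \mu^2\Phi = 0$                                                       $1$         $0$                  $1$     $\mu^2$
$\bigl((1 - x^2)\Theta'(x)\bigr)' + \bigl(\lambda - \mu^2/(1 - x^2)\bigr)\Theta = 0$   $1 - x^2$   $-\mu^2/(1 - x^2)$   $1$     $\lambda$
$\bigl(r^2R'(r)\bigr)' + (k^2r^2 - \lambda)R = 0$                              $r^2$       $-\lambda$           $r^2$   $k^2$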
that $\Theta(\theta)$ is bounded for $x \in [-1, 1]$. It can be shown that with $\mu = m$, this condition
gives $\lambda = l(l + 1)$, $l = m, m + 1, m + 2, \dots$⁴; these solutions are named the associated
Legendre polynomials and are denoted by $P_l^m(x)$.
The radial equation for R(r) has $p(r) = r^2$, so if the original space includes the origin
we find that because p(0) = 0 the solutions are of two types, those that are bounded
and those that are unbounded at r = 0. Again, physical considerations usually suggest
that the bounded solutions are chosen. The other boundary conditions are either given
by some condition at r = a > 0, where a is the radius of the sphere in which the original
problem is defined, or that the solutions remain bounded as $r \to \infty$.
Summary: the method of separation of variables applied to the equation $\nabla^2\psi + k^2\psi = 0$,
using spherical polar coordinates leads to three different types of Sturm-Liouville sys-
tems. In this summary we introduce the idea of regular and singular Sturm-Liouville
systems, that will be discussed further and defined in section 12.4.
$$\frac{d}{dx}\left((1 - x^2)\frac{d\Theta}{dx}\right) + \left(\lambda - \frac{\mu^2}{1 - x^2}\right)\Theta = 0, \quad -1 \le x \le 1. \tag{12.19}$$
The condition that $\Theta(\theta)$ is bounded for all x serves the same purpose as boundary
conditions, and determines possible values of the eigenvalue $\lambda$, once $\mu^2$ is known.
Because $p(x) = 1 - x^2$ is zero at the ends of the interval this type of Sturm-Liouville
equation is classified as a singular Sturm-Liouville system.
(3) The equation
$$\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left(k^2r^2 - \lambda\right)R = 0. \tag{12.20}$$
For this equation several types of conditions can specify the solution uniquely and
determine possible values of the eigenvalue $k^2$.
(i) If $0 \le r \le a$, since $p(r) = r^2$ is zero at r = 0, the solutions will normally
be required to be bounded at r = 0 and satisfy a condition of the form
$A_1y(a) + A_2y'(a) = 0$ at r = a, where $A_1$ and $A_2$ are constants. This system
is classified as a singular Sturm-Liouville system because p(r) = 0 at r = 0.
(ii) If $r \in [0, \infty)$, since p(0) = 0 the solutions will normally be required to
be bounded at r = 0 and tend to zero as $r \to \infty$. Again this is a singular
Sturm-Liouville system.
4 A physical reason why $l \ge m$ is that in some circumstances l is proportional to the magnitude of
an angular momentum and m a projection of this vector along a given axis, which can be no longer
than the original vector.
where $A_1$, $A_2$, $B_1$ and $B_2$ are constants. For this system $p(r) = r^2 > 0$ for all
r and the system is a regular Sturm-Liouville system.
The examples described in this section show how Sturm-Liouville equations arise and
why a variety of types of these equations exist. The significance of the differing types
will become clear as the theory develops.
Exercise 12.4
Consider the system 2 + k2 = 0 with (x, y) = 0 on the rectangle defined
by the x- and y-axes, and the lines x = a > 0, y = b > 0. Show that inside
this rectangle separation of variables with Cartesian coordinates leads to the two
Sturm-Liouville systems
d2 X d2 Y
+ 12 X = 0 and + 22 Y = 0
dx2 dy 2
with X(0) = X(a) = 0, Y (0) = Y (b) = 0 and where = X(x)Y (y) and
12 + 22 = k2 .
Exercise 12.5
Consider the system 2 + k2 = 0 with (x, y) = 0 defined inside the circle of
radius a. Use the polar coordinates x = r cos , y = r sin , 0 r a to cast the
equation in the form
2 1 1 2
2
+ + 2 + k2 = 0.
r r r r 2
By putting = R(r)(), where R(r) depends only upon r and () only upon
, show that
d2
+ 2 = 0, with () 2-periodic,
d2
d2 R dR ` 2 2
r2 2 + r + k r 2 R
= 0,
dr dr
where is a positive constant. Show further that the equation for R(r) can be
cast in self-adjoint form
2
d dR
r + k2 r R = 0.
dr dr r
12.3. EIGENVALUES AND FUNCTIONS OF SIMPLE SYSTEMS 485
In exercise 12.6 it was shown that the eigenfunctions and eigenvalues of equation 12.21
are
$$y_n(x) = B\sin nx, \quad \lambda_n = n^2, \quad n = 1, 2, \dots. \tag{12.22}$$
The constant B is undetermined because the equation and boundary conditions are
homogeneous. It is often convenient to fix the value of this constant by normalising the
eigenfunctions to unity, that is we set
$$\int_0^\pi dx\,y_n(x)^2 = 1 \quad\text{and this gives}\quad B^2\int_0^\pi dx\,\sin^2 nx = \frac{1}{2}\pi B^2 = 1. \tag{12.23}$$
By choosing B to be positive this convention gives the following eigenfunctions and
eigenvalues
$$y_n(x) = \sqrt{\frac{2}{\pi}}\,\sin nx, \quad \lambda_n = n^2, \quad n = 1, 2, \dots. \tag{12.24}$$
Graphs of the adjacent pairs of eigenfunctions {y1 (x), y2 (x)}, and {y5 (x), y6 (x)} are
shown in the following figure.
Figure 12.1 Graphs of $y_k(x) = \sqrt{2/\pi}\,\sin kx$ for k = 1, 2 on the left, and k = 5, 6 on the right.
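The normalisation, the orthogonality of different eigenfunctions and the number of interior zeros can all be checked numerically for the eigenfunctions $\sqrt{2/\pi}\,\sin nx$ on 0 ≤ x ≤ π. The following is a minimal sketch; the function names, the Simpson rule and the sampling used to count sign changes are illustrative choices rather than anything prescribed by the text.

```python
import math

def y(n, x):
    """Normalised eigenfunction sqrt(2/pi) sin(nx) on [0, pi]."""
    return math.sqrt(2.0 / math.pi) * math.sin(n * x)

def inner(m, n, steps=2000):
    """Simpson approximation to the integral of y_m(x) y_n(x) over [0, pi]."""
    h = math.pi / steps
    total = y(m, 0.0) * y(n, 0.0) + y(m, math.pi) * y(n, math.pi)
    for i in range(1, steps):
        x = i * h
        total += (4 if i % 2 else 2) * y(m, x) * y(n, x)
    return total * h / 3.0

def count_interior_zeros(n, samples=10000):
    """Count sign changes of y_n at sample points strictly inside (0, pi)."""
    values = [y(n, math.pi * (i + 0.5) / samples) for i in range(samples)]
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)

print(inner(5, 5))   # close to 1: normalised
print(inner(5, 6))   # close to 0: orthogonal
print([count_interior_zeros(n) for n in (1, 2, 5, 6)])
```

The zero counts illustrate, for this system, that the nth eigenfunction has n − 1 zeros inside the interval.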
We now list the important properties of these eigenvalues and eigenfunctions and state
which are common to all Sturm-Liouville systems. It is surprising that most of these
properties are common to all Sturm-Liouville systems regardless of the precise forms of
the functions p, q and w.
In this list we first state the specific property of the solutions of the Sturm-Liouville
system 12.21, and then state the equivalent general property of the solutions for the
general system, equation 12.1.
Interlacing zeros The zeros of adjacent eigenfunctions interlace, so there is one and
only one zero of yn+1 (x) between adjacent zeros of yn (x), see figure 12.1.
This is also true in the general case, and is a property of many solutions of second-
order equations, see theorem 12.2 (page 501), see also theorem 12.3.
Number of zeros of the nth eigenfunction The nth eigenfunction has $n - 1$ zeros
in $0 < x < \pi$.
For the general Sturm-Liouville problem on the interval [a, b] the nth eigenfunction
has $n - 1$ zeros in a < x < b. This property is largely a consequence of the
interlacing of zeros.
For the general Sturm-Liouville system, regular and singular, defined in equation 12.1
there is a similar result. If $\phi_n(x)$ and $\phi_m(x)$ are eigenfunctions belonging
to two distinct eigenvalues, then they can be shown to satisfy the orthogonality
relation
$$\int_a^b dx\,w(x)\,\phi_n(x)^*\,\phi_m(x) = h_n\,\delta_{nm}, \tag{12.25}$$
where f (x) and g(x) are any functions, which may be complex, for which the
integral exists. Notice that $(g, f)_w = (f, g)_w^*$. With this notation equation 12.25
can be written in the form $h_n = (\phi_n, \phi_n)_w$. If w(x) = 1 we denote the inner
product by (f, g).
If $(f, g)_w = 0$ the two functions are said to be orthogonal and if $(f, f)_w = 1$ the
function f is said to be normalised.
where
$$a_n = \frac{(\phi_n, f)_w}{(\phi_n, \phi_n)_w} = \frac{1}{h_n}\int_a^b dx\,w(x)\,\phi_n(x)^*\,f(x), \qquad h_n = (\phi_n, \phi_n)_w.$$
It is conventional to name the more general series 12.28 a Fourier series and the
coefficients an the Fourier components: the series 12.27 is often referred to as a
trigonometric series, if a distinction is necessary.
The twin properties of orthogonality and completeness of the eigenfunctions, and
hence the existence of the series 12.28, are two reasons why Sturm-Liouville sys-
tems play a significant role in the theory of linear differential equations. It means,
for instance, that solutions of the inhomogeneous equation
$$\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + q(x)\,y = F(x), \tag{12.29}$$
with suitable boundary conditions, can usually be expressed as a linear combination
of the eigenfunctions of the related Sturm-Liouville system,
$$\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + \bigl(q(x) + \lambda w(x)\bigr)y = 0,$$
with the same boundary conditions. The rigorous treatment of this theory is too
involved to be included in this course, but an outline of the theory is contained
in the next exercise.
Exercise 12.7
Suppose that the Sturm-Liouville system
$$\frac{d}{dx}\left(p\frac{dy}{dx}\right) + (q + \lambda w)\,y = 0, \quad y(a) = y(b) = 0,$$
has an infinite set of eigenvalues and eigenfunctions $\lambda_k$ and $\phi_k(x)$, $k = 1, 2, \dots$,
with $0 < \lambda_1 < \lambda_2 < \cdots$, which satisfy the orthogonality relation 12.25.
(a) Consider the infinite series
$$y(x) = \sum_{k=1}^{\infty} y_k\,\phi_k(x)$$
where the coefficients $y_k$ are constants. Assuming the order of summation and
differentiation can be interchanged, show that
$$\frac{d}{dx}\left(p\frac{dy}{dx}\right) + qy = -\sum_{k=1}^{\infty} y_k\lambda_k\,w(x)\,\phi_k(x).$$
(b) Hence show that the solution of the inhomogeneous equation 12.29 can be
written in the form
$$y(x) = \int_a^b du\,G(x, u)F(u) \quad\text{where}\quad G(x, u) = -\sum_{k=1}^{\infty}\frac{\phi_k(u)^*\,\phi_k(x)}{h_k\lambda_k}.$$
$$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + (x^2 - \nu^2)\,y = 0, \tag{12.30}$$
where $\nu$ is a real number⁷, though in the following we consider only the case $\nu = 1$.
The various solutions of this equation are collectively named Bessel functions. This
equation is singular at the origin (see section 12.5) and, as a consequence, it can be
shown to possess two types of solution. Those denoted by $J_\nu(x)$ are bounded at the
origin: those denoted by $Y_\nu(x)$ are unbounded at the origin.
The second application arises because it is frequently necessary to expand the function
$e^{iz\sin t}$, which is $2\pi$-periodic in t, as a Fourier series. It transpires that the Fourier
components are Bessel functions,
$$e^{iz\sin t} = \sum_{n=-\infty}^{\infty} J_n(z)\,e^{int}. \tag{12.31}$$
This relation is useful in the modern problem of the interaction of periodic electric
fields, lasers for example, with atoms and molecules: but the original application of
Bessel functions in this context was the inversion of Kepler's equation, which relates
the time, t, to the eccentric anomaly, u, of a planet in an elliptical orbit with the Sun
at one focus,
$$\omega t = u - \epsilon\sin u \quad \text{(Kepler's equation)}. \tag{12.32}$$
6 G N Watson 1966 A treatise on the Theory of Bessel Functions (Cambridge University Press),
Here $\omega$ is the angular frequency of the planet and $\epsilon$ the eccentricity of the elliptical
path, typically less than 0.1, the exceptions being Mercury (0.21) and Pluto (0.25).
Elementary dynamics gives the approximate position of each planet in terms of u, but
for practical applications they are needed in terms of the time. By writing t = and
u = + P (), so P () is a 2-periodic function, we find that the Fourier components
of P () are related to Bessel functions, see exercise 12.49.
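The inversion of Kepler's equation can be illustrated numerically. The sketch below solves u − e sin u = M for the eccentric anomaly u in two ways: by Newton iteration, and by the classical series u = M + Σ (2/n) J_n(ne) sin(nM), with J_n evaluated from its Fourier-integral representation by the trapezoidal rule. The name M for the time-like variable, the eccentricity 0.21, the number of terms and all function names are assumptions made for this example.

```python
import cmath
import math

def bessel_j(n, x, points=256):
    """J_n(x) as the mean of exp(-i(nt - x sin t)) over one period (trapezoidal rule)."""
    total = 0j
    for j in range(points):
        t = -math.pi + 2.0 * math.pi * j / points
        total += cmath.exp(-1j * (n * t - x * math.sin(t)))
    return (total / points).real

def kepler_newton(M, ecc, tol=1e-14, max_iter=50):
    """Solve u - ecc*sin(u) = M for u by Newton iteration, starting from u = M."""
    u = M
    for _ in range(max_iter):
        step = (u - ecc * math.sin(u) - M) / (1.0 - ecc * math.cos(u))
        u -= step
        if abs(step) < tol:
            break
    return u

def kepler_series(M, ecc, terms=40):
    """Truncated Bessel-function series for the eccentric anomaly."""
    return M + sum((2.0 / n) * bessel_j(n, n * ecc) * math.sin(n * M)
                   for n in range(1, terms + 1))

ecc = 0.21   # roughly the eccentricity quoted for Mercury
M = 1.0      # hypothetical value of the time-like variable, in radians
u_newton = kepler_newton(M, ecc)
u_series = kepler_series(M, ecc)
print(u_newton, u_series)
```

For small eccentricities the series converges rapidly, which is why a few dozen terms already agree with the Newton solution to many digits.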
This application gives rise to the integral definition of $J_n(x)$,
$$J_n(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} dt\,\exp\bigl(-i(nt - x\sin t)\bigr), \quad n = 0, \pm 1, \pm 2, \dots. \tag{12.33}$$
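The integral definition lends itself to direct numerical evaluation, because the trapezoidal rule is extremely accurate for smooth periodic integrands. The following sketch evaluates $J_n(x)$ in this way and spot-checks a few standard properties numerically; the function name `bessel_j` and the choice of 256 sample points are illustrative assumptions.

```python
import cmath
import math

def bessel_j(n, x, points=256):
    """Approximate J_n(x) = (1/2pi) * integral over [-pi, pi] of exp(-i(nt - x sin t))."""
    total = 0j
    for j in range(points):
        t = -math.pi + 2.0 * math.pi * j / points
        total += cmath.exp(-1j * (n * t - x * math.sin(t)))
    # (1/2pi) * (2pi/points) * sum reduces to the mean of the samples.
    return (total / points).real

x = 1.0
print(bessel_j(0, 0.0))                      # J_0(0)
print(bessel_j(3, 0.0))                      # J_n(0) for n != 0
print(bessel_j(0, x) + bessel_j(2, x),       # J_{n-1} + J_{n+1} with n = 1 ...
      2.0 * bessel_j(1, x) / x)              # ... compared with (2n/x) J_n
```

A numerical check of this kind is no substitute for the derivations requested in the exercises, but it is a quick way to catch sign errors.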
Exercise 12.8
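(a) Show that the self-adjoint form of equation 12.30 is
$$\frac{d}{dx}\left(x\frac{dy}{dx}\right) + \left(x - \frac{\nu^2}{x}\right)y = 0.$$
(b) Show that the normal form, defined in exercise 12.1, of equation 12.30 is
$$\frac{d^2u}{dx^2} + \left(1 - \frac{\nu^2 - \frac{1}{4}}{x^2}\right)u = 0 \quad\text{where}\quad y(x) = \frac{u(x)}{\sqrt{x}}, \quad x > 0.$$
(c) Apply the Liouville transformation, defined in exercise 12.3, to equation 12.30
to give the alternative form of Bessel's equation
$$\frac{d^2y}{d\xi^2} + \left(e^{2\xi} - \nu^2\right)y = 0 \quad\text{where}\quad \xi = \ln x, \quad x > 0.$$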
Exercise 12.9
(a) Use the Fourier series 12.31 to show that
(i) $J_{-n}(x) = (-1)^n J_n(x)$;
(ii) $J_n(-x) = (-1)^n J_n(x)$;
(iii) $J_0(x) + 2J_2(x) + 2J_4(x) + \cdots = 1$.
(b) Use the integral definition to show that $J_0(0) = 1$ and that $J_n(0) = 0$ for
$n \ne 0$.
(c) By differentiating the integral definition 12.33 with respect to x derive the
recurrence relation
$$2J_n'(x) = J_{n-1}(x) - J_{n+1}(x).$$
(d) Use the integral definition 12.33 to show that
$$J_{n-1}(x) + J_{n+1}(x) = \frac{2n}{x}J_n(x).$$
In the remainder of this section we describe the behaviour of the eigenvalues and eigenfunctions
of the singular Sturm-Liouville system associated with Bessel's equation,
$$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + (\omega^2x^2 - 1)\,y = 0, \quad 0 \le x \le 1, \quad y(1) = 0, \tag{12.34}$$
with $\omega > 0$; in particular we show that they satisfy most of the properties listed at
the beginning of section 12.3. By converting equation 12.34 to the self-adjoint form
$(xy')' + (\omega^2x - 1/x)y = 0$, see exercise 12.8, and comparing with equation 12.1 we see
that the eigenvalue is $\lambda = \omega^2$ (and $p = w = x$, $q = -1/x$). By changing the independent
variable to $\xi = \omega x$ we see that this equation is the same as equation 12.30 with $\nu = 1$
and hence has the solutions $Y_1(\omega x)$ and $J_1(\omega x)$; we require the solution that is bounded,
that is $J_1(\omega x)$.
The boundary condition at x = 1 then gives $J_1(\omega) = 0$, that is $\omega$ must be one of the
zeros of the Bessel function. A graph of $J_1(\omega)$ is shown in figure 12.2 and this suggests
that there are an infinite number of positive zeros, $\omega_k$, $k = 1, 2, \dots$.
Figure 12.2 Graph of the Bessel function $J_1(\omega)$.
Using its series expansion Daniel Bernoulli (1738) first suggested that this Bessel func-
tion has an infinite set of zeros. Later we shall see how this follows from the general
theory of second-order differential equations: the first five zeros are
which gives the first zero to within 0.006% and progressively improves in accuracy with
increasing k.
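The positive zeros of $J_1$ can be located numerically by scanning for sign changes and refining each bracket by bisection. The sketch below evaluates $J_1$ from its Fourier-integral representation with the trapezoidal rule; the scan range from 1 to 18, the step 0.5 and the function names are assumptions chosen so that the first five zeros are bracketed.

```python
import cmath
import math

def bessel_j1(x, points=128):
    """J_1(x) as the mean of exp(-i(t - x sin t)) over one period (trapezoidal rule)."""
    total = 0j
    for j in range(points):
        t = -math.pi + 2.0 * math.pi * j / points
        total += cmath.exp(-1j * (t - x * math.sin(t)))
    return (total / points).real

def bisect(f, a, b, tol=1e-12):
    """Refine a bracket [a, b] on which f changes sign."""
    fa = f(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        fm = f(m)
        if fa * fm <= 0.0:
            b = m
        else:
            a, fa = m, fm
    return 0.5 * (a + b)

# Scan a coarse grid for sign changes, then refine each bracket.
grid = [1.0 + 0.5 * i for i in range(35)]   # 1.0, 1.5, ..., 18.0
zeros = [bisect(bessel_j1, a, b)
         for a, b in zip(grid, grid[1:])
         if bessel_j1(a) * bessel_j1(b) < 0.0]
print(zeros)
```

The spacing between successive zeros approaches π, in keeping with the oscillatory behaviour of $J_1$ for large argument.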
The easiest way to understand why $J_1(x)$ oscillates in the manner shown in figure 12.2
is to use the result derived in exercise 12.8(b). For large x this shows that
$u(x) = \sqrt{x}\,J_1(x)$ is given approximately by the equation $u'' + u = 0$, so that
$J_1(x) \simeq (A\cos x + B\sin x)/\sqrt{x}$; this shows why $J_1(x)$ oscillates but does not give the phase of
the oscillations, that is the values of A and B.
The eigenfunctions of equation 12.34 are thus
In the following two figures are shown the graphs of the eigenfunctions {y1 (x), y2 (x)}
and {y5 (x), y6 (x)}, as in figure 12.1 (page 485), with which you should compare the
present figures.
Figure 12.3 Graphs of $y_k(x) = J_1(\omega_k x)$, for k = 1, 2, on the left, and k = 5, 6 on the right.
The eigenfunctions are complete, which means that any sufficiently well behaved
real function, f (x), on the interval 0 < x < 1 can be expressed as the infinite
series, equation 12.28 (page 487),
$$f(x) = \sum_{n=1}^{\infty} a_n J_1(x\omega_n) \quad\text{where}\quad a_n = \frac{2}{J_1'(\omega_n)^2}\int_0^1 dx\,x f(x)\,J_1(x\omega_n).$$
Exercise 12.10
This exercise shows how the boundary conditions can affect the eigenvalues and
eigenfunctions. Find all eigenvalues and eigenfunctions of the Sturm-Liouville
systems defined by the differential equation
d2 y
+ y = 0,
dx2
and the three sets of boundary conditions
In each case show that the eigenfunctions, n (x), belonging to distinct eigenvalues
are orthogonal, that is satisfy,
Z
dx n (x) m (x) = hn nm
0
Exercise 12.11
This exercise involves lengthy algebraic manipulations. In exercise 12.10 you found
the following sets of eigenfunctions, yn (x), and eigenvalues, n , for the equation
d2 y/dx2 + y = 0 with three different boundary conditions,
The Sturm-Liouville theorem shows that each of these sets of functions is complete
on $(0, \pi)$. Use equation 12.28 to show that the function x may be represented by
any of the following series on the interval $(0, \pi)$
$$x = \frac{\pi}{2} - \frac{4}{\pi}\sum_{k=0}^{\infty}\frac{\cos(2k + 1)x}{(2k + 1)^2},$$
$$x = \frac{2}{\pi}\sum_{k=0}^{\infty}\frac{(-1)^k}{(k + 1/2)^2}\sin\left(k + \frac{1}{2}\right)x,$$
k=0
2( 1) cosh 0 X cos k sin k x
x = 2
sinh 0 x 2( 1) 2
.
0 ( cosh 0 ) k=1
k ( cos k )
Exercise 12.12
Periodic boundary conditions:
(a) Show that the eigenvalues of the Sturm-Liouville system
$$\frac{d^2y}{dx^2} + \lambda y = 0, \quad y(0) = y(2a), \quad y'(0) = y'(2a), \quad a > 0,$$
are given by
$$\lambda_n = \left(\frac{n\pi}{a}\right)^2, \quad n = 0, 1, 2, \dots,$$
and that there are no negative eigenvalues. Show also that for n = 0 there is
just one eigenfunction, which can be taken to be $y_0(x) = 1$, and for $n \ge 1$ each
eigenvalue has two linearly independent eigenfunctions,
$$y_n(x) = \left\{\cos\frac{n\pi x}{a},\ \sin\frac{n\pi x}{a}\right\},$$
or any linear combination of these.
(b) Consider the two eigenfunctions associated with the nth eigenvalue
$$u_1(x) = A_1\cos\frac{n\pi x}{a} + B_1\sin\frac{n\pi x}{a} \quad\text{and}\quad u_2(x) = A_2\cos\frac{n\pi x}{a} + B_2\sin\frac{n\pi x}{a}.$$
Show that these are orthogonal only if $A_1A_2 + B_1B_2 = 0$.
Exercise 12.13
Mixed boundary conditions:
The solutions of a Sturm-Liouville equation with mixed boundary conditions usu-
ally behave quite differently from those with unmixed conditions. An example is
considered in this exercise.
Consider the system with mixed boundary conditions
$$\frac{d^2y}{dx^2} + \lambda y = 0, \quad y(0) = 0, \quad y(\pi) = ay'(0), \quad a > 0.$$
Show that if $0 < a < \pi$ there are a finite number of real eigenvalues given by the
real roots of the equation $\sin\omega\pi = a\omega$, $(\omega_1, \omega_2, \dots, \omega_N)$, with $\lambda = \omega^2$ and with
eigenfunctions $y_k(x) = \sin\omega_k x$ and $N \simeq 1/a$.
Are these eigenfunctions orthogonal?
defined on a finite interval of the real axis $a \le x \le b$, together with the homogeneous
boundary conditions
the functions p(x), q(x) and w(x) are real and continuous for $a \le x \le b$;
(1965) and Birkhoff and Rota (1962) the sign in front of q(x) is negative and in Körner (1988) the
signs in front of q(x) and $\lambda$ are negative. Care is needed when using different sources.
12.4. STURM-LIOUVILLE SYSTEMS 495
Equation 12.34, defining the Bessel function, and the radial equation 12.20 for R(r) and
equation 12.19 for $\Theta(\theta)$, do not satisfy the condition p > 0. Further, equation 12.18
for $\Phi(\phi)$ has a different type of boundary condition from those of equation 12.37. It
follows that the scope of the theory needs to be extended if it is to be useful.
First, it needs to apply to periodic boundary conditions, that is
which are an important subset of the class of mixed boundary conditions, see exercise 12.12.
Equation 12.18 for $\Phi(\phi)$ has this type of boundary condition. Another
common Sturm-Liouville system with periodic boundary conditions is Mathieu's equation,
$$\frac{d^2y}{d\theta^2} + (\lambda - 2q\cos 2\theta)\,y = 0, \quad y(0) = y(\pi), \quad y'(0) = y'(\pi), \tag{12.39}$$
where here q is a real variable. This equation seems to have been first studied by the
French mathematician Mathieu (1835–1890) in his discussion of the vibrations of an
elliptic membrane and occurs when separating variables in elliptical coordinates, see
exercise 12.49 (page 526). In this example $\lambda(q)$ is the eigenvalue and it has a fairly
complicated dependence upon the variable q.
The main difference between periodic and separated boundary values is that some-
times, see exercise 12.12, each eigenvalue has more than one eigenfunction. In such
cases it is always possible to choose linear combinations that are orthogonal.
The second necessary extension is to those equations where p(x) = 0 at either or
both end points. In the example treated in section 12.2, the equation 12.20 for R(r) is
singular if the interval contains r = 0, as is the Bessel function example, equation 12.34:
the equation 12.19 for $\Theta(\theta)$ is singular because $p(x) = 1 - x^2$ is zero at both ends of
the interval. Thus singular systems are as common as regular systems.
As an aside we note that all these singular systems arise because the spherical polar
coordinates used to separate variables are singular at the poles, where $x = \cos\theta = \pm 1$
and $\phi$ is undefined, and at r = 0 where neither $\theta$ nor $\phi$ is defined. It is this geometric
singularity in the transformation between Cartesian and polar coordinates that
makes the Sturm-Liouville systems singular: therefore we do not expect these particular
singular systems to be much different from regular systems.
A Sturm-Liouville system for which p(x) is positive for a < x < b but vanishes at
one or both ends is named a singular Sturm-Liouville system. These systems comprise
the differential equation 12.36, with w(x) and q(x) satisfying the same conditions as
for a regular system, and
the solution is bounded for $a \le x \le b$;
at an end point at which p(x) does not vanish, y(x) satisfies a boundary condition
of the type 12.37.
The example of equation 12.19 shows that for some singular systems q(x) is unbounded
at the interval ends. The behaviour of q(x) is not, however, so important in determining
the behaviour of the eigenfunctions.
The third necessary extension is to systems defined on infinite or semi-infinite inter-
vals, which arise in many applications in quantum mechanics. We shall not deal with
these problems, but note that in many cases these systems behave like regular systems.
Exercise 12.14
Consider the eigenvalue problem
d 2 dy
x + y = 0, 0 x 1, y(1) = c 0,
dx dx
9 G Birkhoff and G-C Rota, 1962 Ordinary differential equations (Blaisdell Publishing Co.).
10 E L Ince, 1956 Ordinary differential equations (Dover).
12.5. SECOND-ORDER DIFFERENTIAL EQUATIONS 497
Exercise 12.15
Consider the inhomogeneous equation $\dfrac{d^2y}{dx^2} + y = x$.
(a) Find the general solution of the homogeneous equation.
(b) Find any particular integral of the inhomogeneous equation, and hence find
its general solution.
The solutions of equations 12.40 and 12.41 satisfy the following properties.
P2: Uniqueness of the initial value problem. If $p_1/p_2$ and $p_0/p_2$ are continuous
for $x \in [a, b]$ then at most one solution of equation 12.40 can satisfy the given
initial conditions $y(a) = \alpha_0$, $y'(a) = \alpha_1$, see also section 3.5.2.
P3: If $f(x)$ and $g(x)$ are solutions of the homogeneous equation 12.41 and if, for
some $x = \xi$, the vectors $(f(\xi), f'(\xi))$ and $(g(\xi), g'(\xi))$ are linearly independent,
then every solution of equation 12.41 can be written as a linear combination of
$f(x)$ and $g(x)$,
$$y(x) = c_1 f(x) + c_2 g(x).$$
The two functions $f(x)$ and $g(x)$ are said to form a basis of the differential equation.
P4: The general solution of the inhomogeneous equation 12.40 is given by the
sum of any particular solution and the general solution of the homogeneous equation 12.41.
Exercise 12.16
Use property P2 to show that if a nontrivial solution $y(x)$ of equation 12.41 is
zero at $x = \xi$, then $y'(\xi) \ne 0$, that is the zeros of the solutions are simple.
Exercise 12.17
Consider the two vectors $\mathbf{x} = (x_1, x_2)$ and $\mathbf{y} = (y_1, y_2)$ in the Cartesian plane.
Show that they are linearly independent, that is not parallel, if
$$x_1 y_2 - x_2 y_1 = \begin{vmatrix} x_1 & x_2 \\ y_1 & y_2 \end{vmatrix} \ne 0.$$
The function $W(f, g; x)$ is named the Wronskian¹¹ of the functions $f(x)$ and $g(x)$.
This notation for the Wronskian shows which functions are used to construct it and the
independent variable; sometimes such detail is unnecessary so either of the notations
$W(x)$ or $W(f, g)$ is freely used.
If $W(f, g; x) \ne 0$ for $a < x < b$ the functions $f(x)$ and $g(x)$ are said to be linearly
independent in $(a, b)$; alternatively if $W(f, g; x) = 0$ they are linearly dependent. These
rules apply only to sufficiently smooth functions as the example of exercise 12.22 shows.
The Wronskian of any two solutions, $f$ and $g$, of equation 12.41 satisfies the identity
$$W(f, g; x) = W(f, g; a)\exp\left(-\int_a^x dt\,\frac{p_1(t)}{p_2(t)}\right). \tag{12.43}$$
This identity is proved in exercise 12.23 by showing that W (x) satisfies a first-order
differential equation and solving it. Because the right-hand side of equation 12.43
always has the same sign, it follows that the Wronskian of two solutions is either always
positive, always negative or always zero. Thus, if f and g are linearly independent
at one point of the interval (a, b) they are linearly independent at all points of (a, b).
Conversely, if W (f, g) vanishes anywhere it vanishes everywhere.
The Wronskian can be used with one known solution to construct another. Suppose
that $f(x)$ is a known solution and let $g(x)$ be another (unknown) solution. The equation
for $W(x)$ can be interpreted as a first-order equation for $g$,
$$g'f - gf' = W(x),$$
and, because $g'f - gf' = f^2\dfrac{d}{dx}\left(\dfrac{g}{f}\right)$, this equation, with 12.43, can be written in the
form
$$\frac{d}{dx}\left(\frac{g}{f}\right) = \frac{W(a)}{f(x)^2}\exp\left(-\int_a^x dt\,\frac{p_1(t)}{p_2(t)}\right)$$
having the general solution
$$g(x) = f(x)\left\{C + W(a)\int_a^x ds\,\frac{1}{f(s)^2}\exp\left(-\int_a^s dt\,\frac{p_1(t)}{p_2(t)}\right)\right\}, \tag{12.44}$$
where C is an arbitrary constant.
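As an illustration of equation 12.44, the following sketch evaluates the nested integrals numerically. It is only a minimal example: the function name `second_solution`, the trapezoidal quadrature and the test equations are choices made here, not part of the text. For $y'' + y = 0$ (so $p_1 = 0$) with $f = \cos x$, $a = 0$, $C = 0$ and $W(a) = 1$ the formula should reproduce $\sin x$.

```python
import math

def second_solution(f, p1_over_p2, a, x, C=0.0, W_a=1.0, n=2000):
    """Evaluate equation 12.44 at x using the composite trapezoidal rule.

    f          : known solution f(x), assumed nonzero on [a, x]
    p1_over_p2 : the ratio p1(t)/p2(t) appearing in the exponent
    C, W_a     : the arbitrary constant C and the Wronskian value W(a)
    """
    h = (x - a) / n
    exponent = 0.0                    # running value of int_a^s p1/p2 dt
    prev_ratio = p1_over_p2(a)
    prev_term = 1.0 / f(a) ** 2       # outer integrand at s = a
    outer = 0.0
    for i in range(1, n + 1):
        s = a + i * h
        ratio = p1_over_p2(s)
        exponent += 0.5 * h * (prev_ratio + ratio)
        term = math.exp(-exponent) / f(s) ** 2
        outer += 0.5 * h * (prev_term + term)
        prev_ratio, prev_term = ratio, term
    return f(x) * (C + W_a * outer)

# y'' + y = 0: starting from f = cos x the formula should return sin x.
g_at_1 = second_solution(math.cos, lambda t: 0.0, 0.0, 1.0)

# x^2 y'' - 2x y' + 2y = 0 has solutions x and x^2; here p1/p2 = -2/x and,
# starting from f = x at a = 1, the formula gives x(x - 1) = x^2 - x.
g_euler = second_solution(lambda t: t, lambda t: -2.0 / t, 1.0, 2.0)

print(g_at_1, math.sin(1.0), g_euler)
```

The second call shows that the new solution need not be one of the "standard" basis functions: any combination $C f + g$ is equally acceptable.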
Exercise 12.18
If F (z) is a differentiable function and g = F (f ), with f (x) a differentiable,
non-constant function of x, show that W (f, g) = 0 only if g(x) = cf (x) for any
constant c.
11 Josef Hoene (1778-1853) was born in Poland, moved to France and became a French citizen in
1800. He moved to Paris in 1810 and adopted the name Josef Hoene de Wronski at about that time,
just after he married.
Exercise 12.19
Show that the functions $a_1\sin x + a_2\cos x$ and $b_1\sin x + b_2\cos x$ are linearly
independent if $a_1b_2 \ne a_2b_1$.
Exercise 12.20
Use equation 12.44 to show that if $f(x)$ is any nontrivial solution of the equation
$y'' + q(x)y = 0$ for $a < x < b$, then another solution is
$$g(x) = f(x)\int_a^x \frac{ds}{f(s)^2}.$$
Exercise 12.21
(a) If $f$ and $g$ are linearly independent solutions of the homogeneous differential
equation $y'' + p_1(x)y' + p_0(x)y = 0$, show that
$$p_1(x) = -\frac{fg'' - gf''}{W(f, g; x)} \quad\text{and}\quad p_0(x) = \frac{f'g'' - g'f''}{W(f, g; x)}.$$
Exercise 12.22
If $f(x) = x^3$ and $g(x) = |x|^3$ show that, (a) $W(f, g) = 0$ for all $x \ne 0$, and (b) that
the vectors $(f, f')$, $(g, g')$ are linearly independent for $-1 \le x \le 1$. Why does this
not contradict the properties stated after equation 12.43?
Exercise 12.23
Show that the Wronskian $W(f, g; x)$, where $f$ and $g$ are linearly independent
solutions of equation 12.41, satisfies the first-order differential equation
$$\frac{dW}{dx} = -\frac{p_1(x)}{p_2(x)}W$$
and hence derive equation 12.43.
Now let $c$ and $d$ be two successive zeros of $g(x)$, so $g(c) = g(d) = 0$; then $f(c) \ne 0$
and $f(d) \ne 0$; also $g'(c)$ and $g'(d)$ must have different signs (because if $g(x)$ is increasing
at $x = c$ it must be decreasing at $x = d$, or vice-versa). Since $W(f, g; x)$ has constant
sign and
$$W(c) = f(c)g'(c), \qquad W(d) = f(d)g'(d),$$
it follows that $f(c)$ and $f(d)$ must have opposite signs. Hence $f(x)$ must have at least
one zero for $c < x < d$; two possible situations are shown in figure 12.4.
Figure 12.4 Diagram showing the behaviour of $f(x)$ between two adjacent zeros, $c$ and $d$, of $g(x)$,
consistent with $W(f, g)$ not changing sign. Only the behaviour on the left-hand side is
actually possible, because we assume that $g(x) \ne 0$ for $c < x < d$, see text.
However, there can be only one zero of f (x) between adjacent zeros of g(x). Suppose
there are more: by reversing the roles of f and g we see that between two of the zeros
of f (x), there must be at least one zero of g(x), which contradicts the assumption that
c and d are adjacent zeros. Thus we have the following theorem.
Theorem 12.1
Sturm's separation theorem. If $f(x)$ and $g(x)$ are linearly independent solutions of
the second-order homogeneous equation
$$p_2(x)\frac{d^2y}{dx^2} + p_1(x)\frac{dy}{dx} + p_0(x)y = 0, \quad a \le x \le b, \tag{12.45}$$
where $p_2(x) \ne 0$ for $x \in [a, b]$, then the zeros of $f(x)$ and $g(x)$ alternate in $(a, b)$.
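The separation theorem can be checked numerically for a simple case. The sketch below is an illustration only: the helper `zeros`, the scanning grid and the two particular solutions of $y'' + y = 0$ are choices made here. It locates the zeros of two linearly independent solutions on $(0, 20)$ and confirms that, when the zeros are merged in increasing order, they alternate between the two functions.

```python
import math

def zeros(func, a, b, n=4000):
    """Locate simple zeros of func in (a, b): sign scan on a grid, then bisection."""
    found = []
    h = (b - a) / n
    x0, f0 = a, func(a)
    for i in range(1, n + 1):
        x1 = a + i * h
        f1 = func(x1)
        if f0 * f1 < 0.0:                 # a sign change brackets a zero
            lo, hi, flo = x0, x1, f0
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                fm = func(mid)
                if flo * fm <= 0.0:
                    hi = mid
                else:
                    lo, flo = mid, fm
            found.append(0.5 * (lo + hi))
        x0, f0 = x1, f1
    return found

# Two solutions of y'' + y = 0; independent because a1*b2 - a2*b1 = -7 is nonzero.
f = lambda x: math.sin(x) + 2.0 * math.cos(x)
g = lambda x: 3.0 * math.sin(x) - math.cos(x)

zeros_f = zeros(f, 0.0, 20.0)
zeros_g = zeros(g, 0.0, 20.0)

# Merge the two lists and check that neighbouring zeros belong to different functions.
tagged = sorted([(x, "f") for x in zeros_f] + [(x, "g") for x in zeros_g])
alternate = all(t1[1] != t2[1] for t1, t2 in zip(tagged, tagged[1:]))
print(len(zeros_f), len(zeros_g), alternate)
```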
A well known example of this theorem is the equation $y'' + y = 0$, on the whole real
line, which has the independent solutions $\sin x$ and $\cos x$ with the alternating zeros $n\pi$
and $(n + 1/2)\pi$, $n = 0, \pm 1, \pm 2, \dots$, respectively. A less obvious consequence is that the
two functions
Theorem 12.2
Sturm's comparison theorem. Let $y_1(x)$ and $y_2(x)$ be, respectively, nontrivial solutions
of the differential equations
$$\frac{d^2y}{dx^2} + Q_1(x)y = 0 \quad\text{and}\quad \frac{d^2y}{dx^2} + Q_2(x)y = 0 \tag{12.46}$$
on an interval $(a, b)$ and assume that $Q_1(x) \ge Q_2(x)$ everywhere in this interval. Then
between any two zeros of $y_2(x)$ there is at least one zero of $y_1(x)$, unless $Q_1(x) = Q_2(x)$
everywhere and $y_1$ is a constant multiple of $y_2$.
But
$$\frac{dW}{dx} = \frac{d}{dx}\left(y_1y_2' - y_1'y_2\right) = y_1y_2'' - y_1''y_2$$
and, on using the differential equations 12.46 defining $y_1$ and $y_2$, this simplifies to
$$\frac{dW}{dx} = \bigl(Q_1(x) - Q_2(x)\bigr)\,y_1(x)y_2(x) \ge 0, \quad c \le x \le d.$$
It follows that if $Q_1(x) > Q_2(x)$, $W(y_1, y_2; x)$ is a monotonic increasing function of $x$,
so that $W(c) < W(d)$, which contradicts equation 12.47. Thus we must have $y_1(d) < 0$
and hence $y_1(x)$ must have at least one zero in $(c, d)$.
Further, if $Q_1 = Q_2$ the separation theorem implies that there is one zero unless $y_1$
and $y_2$ are linearly dependent, that is $y_2(x)$ is a multiple of $y_1(x)$.
Applications of the comparison theorem
The equation $y'' + Q(x)y = 0$, $Q(x) \le 0$
The first important result that follows from this is that every nontrivial solution of
$$\frac{d^2y}{dx^2} + Q(x)y = 0 \tag{12.48}$$
has at most one zero in any interval where $Q(x) \le 0$.
The proof is by contradiction. A solution of $y'' = 0$ (that is, $Q_1(x) = 0$) is $y_1(x) = 1$.
If a solution of 12.48 has two zeros in a region where $Q_2 \le Q_1 = 0$, then $y_1(x)$ would
have at least one zero in between, which is a contradiction.
$$\frac{d^2y}{dx^2} + \frac{1}{x}\frac{dy}{dx} + \left(1 - \frac{\nu^2}{x^2}\right)y = 0, \tag{12.49}$$
which can be written in the normal form, exercise 12.1 (page 477),
$$\frac{d^2u}{dx^2} + \left(1 + \frac{1 - 4\nu^2}{4x^2}\right)u = 0 \quad\text{where}\quad y(x) = \frac{u(x)}{\sqrt{x}}, \quad x > 0. \tag{12.50}$$
If $\nu < 1/2$ the function $Q_1(x) = 1 + \dfrac{1 - 4\nu^2}{4x^2} > 1$ so a suitable comparison equation is
$v'' + v = 0$, that is $Q_2 = 1 < Q_1$. A solution of the comparison equation is $v = \sin x$,
with positive zeros at $x = n\pi$, $n = 1, 2, \dots$. Hence $u(x)$ has at least one zero in each
of the intervals $(n\pi, (n + 1)\pi)$, $n = 1, 2, \dots$.
If $\nu > 1/2$ we can show that the solution has an infinity of positive zeros. In this case
$Q_1(x) = 1 - (4\nu^2 - 1)/(4x^2) < 1$, so we take the comparison equation to be $v'' + \omega^2v = 0$,
with $0 < \omega < 1$: then for $x > x_0(\omega)$, where $Q_1(x_0) = \omega^2$, $Q_1(x) > Q_2 = \omega^2$, and
the comparison theorem shows that there is at least one zero of $u(x)$ in each interval
$(n\pi/\omega, (n + 1)\pi/\omega)$, with $n\pi/\omega > x_0$; as $x \to \infty$, we may choose $\omega$ close to unity.
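The effect of the comparison theorem on the spacing of the zeros can be seen numerically. The sketch below is a rough illustration, not part of the text's argument: it integrates the normal form 12.50 with a classical fourth-order Runge-Kutta scheme, using arbitrarily chosen initial values $u(1) = 1$, $u'(1) = 0$, and records where $u$ changes sign. For $\nu = 0$ (where $Q_1 > 1$) successive zeros should be less than $\pi$ apart; for $\nu = 1$ (where $Q_1 < 1$) they should be more than $\pi$ apart; in both cases the spacing should approach $\pi$ for large $x$.

```python
import math

def zeros_of_u(nu, x_start=1.0, x_end=40.0, h=1e-3):
    """Zeros of a solution of u'' + (1 + (1 - 4 nu^2)/(4 x^2)) u = 0 on [x_start, x_end]."""
    def rhs(x, u, v):
        Q = 1.0 + (1.0 - 4.0 * nu * nu) / (4.0 * x * x)
        return v, -Q * u

    x, u, v = x_start, 1.0, 0.0          # illustrative initial conditions
    found = []
    for _ in range(int(round((x_end - x_start) / h))):
        k1u, k1v = rhs(x, u, v)
        k2u, k2v = rhs(x + 0.5 * h, u + 0.5 * h * k1u, v + 0.5 * h * k1v)
        k3u, k3v = rhs(x + 0.5 * h, u + 0.5 * h * k2u, v + 0.5 * h * k2v)
        k4u, k4v = rhs(x + h, u + h * k3u, v + h * k3v)
        u_new = u + h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6.0
        v_new = v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        if u * u_new < 0.0:              # sign change: interpolate linearly for the zero
            found.append(x + h * u / (u - u_new))
        x, u, v = x + h, u_new, v_new
    return found

def gaps(points):
    return [b - a for a, b in zip(points, points[1:])]

gaps_nu0 = gaps(zeros_of_u(0.0))   # Q1 > 1: zeros closer together than pi
gaps_nu1 = gaps(zeros_of_u(1.0))   # Q1 < 1: zeros further apart than pi
print(max(gaps_nu0), min(gaps_nu1), math.pi)
```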
We end this section by quoting, without proof, a more general comparison theorem,
needed later to obtain approximate positions of the zeros of an eigenfunction. The proof
of this theorem may be found in Birkhoff and Rota (1962, chapter 10).
Theorem 12.3
Sturm's comparison theorem II. For the differential equations
$$\frac{d}{dx}\left(p_1(x)\frac{dy}{dx}\right) + Q_1(x)y = 0 \quad\text{and}\quad \frac{d}{dx}\left(p_2(x)\frac{dy}{dx}\right) + Q_2(x)y = 0, \quad a \le x \le b,$$
where $p_2(x) \ge p_1(x)$ and $Q_2(x) \le Q_1(x)$ for $x \in (a, b)$, then if $y_1(x)$ is a solution of the
first equation and $y_2(x)$ any solution of the second equation, between any two adjacent
zeros of $y_2$ there lies at least one zero of $y_1$, except if $p_1 = p_2$, $Q_1 = Q_2$, for all $x \in [a, b]$,
and $y_1$ is a constant multiple of $y_2$.
A shorter, approximate, easy to remember version is that as Q(x) increases and/or p(x)
decreases, the number of zeros of every solution increases.
The first comparison theorem is a direct consequence of this theorem. These theorems
can be used to show that for a regular Sturm-Liouville system, provided the
eigenfunctions $y_n(x)$ exist and the eigenvalues satisfy $\lambda_1 < \lambda_2 < \cdots < \lambda_n < \lambda_{n+1} < \cdots$,
then the zeros of $y_n(x)$ interlace and that $y_n(x)$ has $n - 1$ zeros in $(a, b)$. We outline a
proof that these eigenfunctions exist in section 12.5.4.
Exercise 12.24
Use the Liouville normal form found in exercise 12.3 (page 478) and the comparison
theorem to show that there is a lower bound on the eigenvalues of a regular Sturm-
Liouville system with the boundary conditions y(a) = y(b) = 0.
Exercise 12.25
(a) Show that every solution of the Airy equation $y'' + xy = 0$ vanishes infinitely
often for $x > 1$ and at most once for $x < 0$.
(b) Show that if $y(x)$ satisfies Airy's equation, then $v(x) = y(ax)$ satisfies the
equation $v'' + a^3xv = 0$.
(c) Show that the Sturm-Liouville system $y'' + \lambda xy = 0$, $y(0) = y(1) = 0$, has an
infinite sequence of positive eigenvalues and no negative eigenvalues.
$$v\,(Lu)^* - u^*Lv = \frac{d}{dx}\left[p(x)\left(v\frac{du^*}{dx} - u^*\frac{dv}{dx}\right)\right] \tag{12.52}$$
where $u$ and $v$ are any, possibly complex, functions for which both sides of the identity
exist.
Exercise 12.26
Prove Lagrange's identity, equation 12.52.
Using the inner product notation, with unit weight function¹², $(f, g) = \displaystyle\int_a^b dx\,f(x)^*g(x)$,
Lagrange's identity can be written in the form
$$(Lu, v) - (u, Lv) = \left[p(x)\left(v(x)\frac{du^*}{dx} - u(x)^*\frac{dv}{dx}\right)\right]_a^b. \tag{12.53}$$
For some boundary conditions the right-hand side of this equation is zero and then
physics, particularly in quantum mechanics, but in mathematics texts the integrand is often taken to
be $f(x)g(x)^*$. Provided one definition is used consistently the difference is immaterial.
In this case the operator and the boundary conditions are said to be self-adjoint. It is
important to note that a differential operator cannot be self-adjoint without appropriate
boundary conditions.
For the homogeneous, separated boundary conditions defined in equation 12.37
(page 494) we have, since $A_1$ and $A_2$ are real, and assuming $A_2 \ne 0$,
This shows that the boundary term of equation 12.53 is zero at x = a; a similar analysis
shows it to be zero at x = b. If A2 = 0 then u(a) = v(a) = 0 and the same result follows.
For a singular system, if p(a) = 0 the boundary term at x = a is clearly zero. Thus
for regular and singular systems (Lu, v) = (u, Lv) and the operator L is self-adjoint.
Periodic boundary conditions also make the system self-adjoint, as shown in the next
exercise.
Exercise 12.27
Prove that if the boundary conditions are periodic, $y(a) = y(b)$ and $y'(a) = y'(b)$
and $p(a) = p(b)$, then $L$ is self-adjoint.
Note: periodic boundary conditions are examples of mixed boundary conditions
in which the values of the function, and possibly its derivative, at the two ends of
the range are non-trivially related. Normally mixed boundary conditions produce
operators that are not self-adjoint, exercise 12.30.
Exercise 12.28
In this chapter the operators considered are real but complex operators are often
useful.
Show that on the space of differentiable functions for which $\int_{-\infty}^{\infty} dx\,|u(x)|^2$ exists
the real operator $L = \dfrac{d}{dx}$ is not self-adjoint, but that the complex operator $\hat{L} = iL$
is self-adjoint.
In this example there are no boundary conditions: the condition that the integral
$\int_{-\infty}^{\infty} dx\,|u(x)|^2$ exists means that $|u| \to 0$ as $x \to \pm\infty$ and this plays the role of
the boundary conditions.
Exercise 12.29
Show that the operator L defined by
d2 y
Ly = + y = 0, y(0) = A, y 0 () = B,
dx2
where , A and B are nonzero constants, is not self-adjoint. This exercise shows
why the boundary conditions need to be homogeneous.
Exercise 12.30
Show that the system Ly = y 00 + y = 0, with the mixed boundary conditions,
y(0) = 0, y() = ay 0 (0), a 6= 0, is not self-adjoint.
Note in exercise 12.13 it was shown that some of the eigenvalues of this system
are complex and that the eigenfunctions are not orthogonal.
Also
$$(\phi, L\phi) = (\phi, -\lambda w\phi) = -\lambda(\phi, w\phi)$$
and hence, since $w(x)$ is real,
$$0 = (L\phi, \phi) - (\phi, L\phi) = (\lambda - \lambda^*)\int_a^b dx\,w(x)|\phi(x)|^2.$$
Since $w(x) > 0$ and $(\phi, \phi)_w > 0$, for almost all $x$, the right-hand side can be zero only if
$\lambda = \lambda^*$, that is the eigenvalues of a Sturm-Liouville system are real: this proof is valid
for regular and singular systems and if the boundary conditions are periodic.
The eigenfunctions are orthogonal
Now consider two eigenfunctions $\phi(x)$ and $\psi(x)$ corresponding to distinct eigenvalues
$\lambda$ and $\mu$, respectively, that is $L\phi = -\lambda w\phi$ and $L\psi = -\mu w\psi$. By the self-adjoint property
$$0 = (L\phi, \psi) - (\phi, L\psi) = -\lambda(\phi, \psi)_w + \mu(\phi, \psi)_w = (\mu - \lambda)\int_a^b dx\,w(x)\phi(x)^*\psi(x).$$
has an infinite sequence of real eigenvalues $\lambda_1 < \lambda_2 < \cdots < \lambda_n < \lambda_{n+1} < \cdots$ with
$\lim_{n\to\infty}\lambda_n = \infty$. The eigenfunction $y_n(x)$ belonging to the eigenvalue $\lambda_n$ has exactly
The main idea behind the proof outlined here is the Prüfer substitution, named after
the German mathematician Heinz Prüfer (1896-1934); this involves using polar coordinates
in the Cartesian plane having coordinates $(py', y)$ to understand how the solution
behaves. Two new dependent variables $(r(x), \theta(x))$ are defined by the relations
$$py' = r\cos\theta, \qquad y = r\sin\theta,$$
so that
$$r^2 = y^2 + p^2y'^2 \quad\text{and}\quad \tan\theta = \frac{y}{py'}. \tag{12.59}$$
Since $y$ and $y'$ cannot simultaneously be zero, $r > 0$. Notice that $y(x) = 0$ when
$\theta(x) = n\pi$, where $n$ is an integer.
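First we need the differential equations for $r$ and $\theta$. Differentiating the equation for
$\tan\theta$ gives
$$\frac{1}{\cos^2\theta}\frac{d\theta}{dx} = \frac{1}{p} - \frac{y\,(py')'}{(py')^2} = \frac{1}{p} + Q\left(\frac{y}{py'}\right)^2$$
where we have used the relation $(py')' = -Qy$. Multiplying by $\cos^2\theta$ gives
$$\frac{d\theta}{dx} = Q(x)\sin^2\theta + \frac{1}{p(x)}\cos^2\theta, \quad Q = q(x) + \lambda w(x). \tag{12.60}$$
Similarly, differentiating the equation for $r^2$ gives
$$r\frac{dr}{dx} = y\frac{dy}{dx} + py'\frac{d(py')}{dx} = \frac{r^2}{p}\sin\theta\cos\theta - Qr^2\sin\theta\cos\theta.$$
Hence
$$\frac{dr}{dx} = \frac{1}{2}\left(\frac{1}{p(x)} - Q(x)\right)r\sin 2\theta. \tag{12.61}$$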
The two equations 12.60 and 12.61 are equivalent to the original differential equation
and are named the Prüfer system associated with the self-adjoint equation 12.36.
The equation for $r$ can be expressed as an integral
$$r(x) = r(a)\exp\left\{\frac{1}{2}\int_a^x dt\left(\frac{1}{p(t)} - Q(t)\right)\sin 2\theta(t)\right\}, \tag{12.62}$$
which can be evaluated once $\theta(x)$ is known; however, we shall not need this equation.
Notice that because the original equation for $y$ is homogeneous the magnitude of $r(x)$
is unimportant, which is why $r(x)$ depends linearly upon $r(a)$.
The solution of equation 12.60 for $\theta(x)$ depends only upon the initial conditions,
that is the boundary condition $A_1y(a) + A_2y'(a) = 0$, which gives
$$\tan\theta_a = -\frac{A_2}{A_1p(a)} \quad\text{with}\quad 0 \le \theta_a < \pi, \tag{12.63}$$
and with $\theta_a = \pi/2$ if $A_1 = 0$. The eigenvalues are given by those values of $\lambda$ for which
$\theta_b = \theta(b, \lambda)$ satisfies the equation $\tan\theta_b = -B_2/(B_1p(b))$. However, here the main
objective is not to find the eigenvalues but to first determine that they exist and second
to determine some of their properties, and for this only the initial condition is required.
It is necessary to understand how $\theta(x, \lambda)$ behaves as a function of $x$ and $\lambda$; this
behaviour is summarised in the following theorem which is proved rigorously in Birkhoff
and Rota (1962, chapter 10).
Theorem 12.5
The oscillation theorem. The solution of the differential equation 12.60 satisfying
the initial condition $\theta(a, \lambda) = \theta_a < \pi$, for all $\lambda$, is continuous and strictly monotonic
in $\lambda$ for fixed $x$ on the interval $a < x \le b$. Also
This theorem shows that $y(b, \lambda) = r(b)\sin\theta(b, \lambda)$ has infinitely many zeros for $\lambda > 0$,
and hence that there are infinitely many eigenfunctions.
In order to understand why (x, ) behaves in the manner described in theorem 12.5
we consider two specific examples. The first is a very simple system with known eigen-
functions; the second example is sufficiently general to contain all the essential features
of the general case.
The first system is
$$\frac{d^2y}{dx^2} + \lambda y = 0, \quad 0 \le x \le \pi, \tag{12.64}$$
and here $p = 1$ and $Q = \lambda$, so the equation 12.60 for $\theta$ is
$$\frac{d\theta}{dx} = \cos^2\theta + \lambda\sin^2\theta, \quad \theta(0) = \theta_0.$$
This equation is particularly simple because the right-hand side is independent of $x$, so
it can be integrated directly, to give
$$x(\theta) = \int_{\theta_0}^{\theta} d\phi\,\frac{1}{\cos^2\phi + \lambda\sin^2\phi}. \tag{12.65}$$
However, this means that it is unrepresentative which is why another example is considered
after the following discussion. We now deduce the qualitative behaviour of the
function $\theta(x)$ from this integral.
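The way the Prüfer angle determines the eigenvalues can be sketched numerically. The code below is an illustration under stated assumptions: the names `prufer_angle` and `eigenvalue`, the Runge-Kutta step count and the bisection brackets are choices made here. It integrates $\theta' = \cos^2\theta + \lambda\sin^2\theta$ from $x = 0$ to $x = \pi$ with $\theta(0) = 0$, which corresponds to $y(0) = 0$, and then uses the monotonic dependence of $\theta(\pi, \lambda)$ on $\lambda$ to solve $\theta(\pi, \lambda) = k\pi$, the condition for $y(\pi) = 0$. For this system the result should be close to $\lambda = k^2$.

```python
import math

def prufer_angle(lam, x_end=math.pi, theta0=0.0, n=2000):
    """Integrate d(theta)/dx = cos^2(theta) + lam*sin^2(theta) from 0 to x_end (RK4)."""
    rhs = lambda th: math.cos(th) ** 2 + lam * math.sin(th) ** 2
    h = x_end / n
    th = theta0
    for _ in range(n):
        k1 = rhs(th)
        k2 = rhs(th + 0.5 * h * k1)
        k3 = rhs(th + 0.5 * h * k2)
        k4 = rhs(th + h * k3)
        th += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return th

def eigenvalue(k, lo, hi, iters=40):
    """Bisection on lam for theta(pi, lam) = k*pi; assumes [lo, hi] brackets the root."""
    target = k * math.pi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if prufer_angle(mid) < target:
            lo = mid          # theta(pi, lam) increases with lam
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam1 = eigenvalue(1, 0.5, 2.0)
lam2 = eigenvalue(2, 3.0, 5.0)
print(lam1, lam2)
```

The same two functions apply unchanged to other boundary conditions at $x = \pi$ by replacing the target $k\pi$ with the appropriate value of $\theta_b$.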
If $\lambda > 0$, $\theta'(x) > 0$ and $\theta(x)$ is a monotonic increasing function of $x$; the larger $\lambda$ the
greater the rate of increase of $\theta(x, \lambda)$. In particular $\theta(\pi, \lambda)$ is an increasing function of
$\lambda$: this is clear from the integral 12.65 because the integrand is positive and for most
values of $\phi$ a decreasing function of $\lambda$. Thus for a given value of $x$ the upper limit,
$\theta(x)$, must increase as $\lambda$ increases to compensate for the decreasing magnitude of the
integrand, see exercise 12.31.
If $\lambda < 0$, then $\theta(x) > 0$ tends to a constant value as $x \to \infty$. To see this observe
that $\theta'(x) = 0$ when $\theta = \theta_c$ and $\pi - \theta_c$ where $0 < \theta_c = \tan^{-1}(1/\sqrt{-\lambda}) < \pi/2$, and thus,
This behaviour is shown graphically in figure 12.5, where $\lambda = -1/4$, which gives
$\theta_c = 1.107$ and graphs of $\theta(x, \lambda)$ are shown for various initial conditions. Figure 12.6
shows the graphs of $\theta(x, \lambda)$, with the same initial condition $\theta_0 = 0.6$, but various values
of $\lambda$. Since $\theta_c$ depends upon $\lambda$, $\theta'(0) > 0$ for $\lambda > -2.14$, and $\theta'(0) < 0$ for $\lambda < -2.14$.
Figure 12.5 Graphs of $\theta(x)$ for $\lambda = -1/4$ and various initial conditions.
Figure 12.6 Graphs of $\theta(x)$, for the initial condition $\theta_0 = 0.6$, and various negative $\lambda$.
It is clear from these graphs that there can be at most one negative eigenvalue. For the
parameters of figure 12.6, $\theta_0 = 0.6$, $\theta(\pi, \lambda)$ varies between $0$ and $\tan^{-1}(\pi + \tan\theta_0) =
1.315$, as $\lambda$ increases from $-\infty$ to $0$: if the boundary condition $\theta_b$, at $x = \pi$, lies in this
range there will be a single eigenvalue for some negative $\lambda$. Otherwise there will be no
negative eigenvalue.
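The numerical values quoted for figures 12.5 and 12.6 follow directly from the equation $\theta' = \cos^2\theta + \lambda\sin^2\theta$, and a few lines of arithmetic reproduce them. In this short check the variable names are illustrative; the threshold for the sign of $\theta'(0)$ is $\lambda = -\cot^2\theta_0$, and the upper limit of $\theta(\pi, \lambda)$ comes from the $\lambda = 0$ solution $\tan\theta = x + \tan\theta_0$.

```python
import math

# Critical angle for lambda = -1/4: theta' = 0 when tan(theta_c) = 1/sqrt(-lambda).
lam = -0.25
theta_c = math.atan(1.0 / math.sqrt(-lam))

# theta'(0) = cos^2(theta0) + lambda*sin^2(theta0) changes sign at lambda = -cot^2(theta0).
theta0 = 0.6
lam_threshold = -1.0 / math.tan(theta0) ** 2

# For lambda = 0, theta' = cos^2(theta) integrates to tan(theta) = x + tan(theta0).
theta_max = math.atan(math.pi + math.tan(theta0))

print(round(theta_c, 3), round(lam_threshold, 2), round(theta_max, 3))
```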
Now restrict attention to the case $\lambda > 0$, where $\theta(x, \lambda)$ increases with $x$, for fixed
$\lambda$, and with $\lambda$ for fixed $x$. Graphs of $\theta(x, \lambda)$ for $0 \le x \le \pi$ and various values of $\lambda$ are
shown in figure 12.7.
Figure 12.7 Some representative graphs of $\theta(x)$, defined by equation 12.65
with $\theta_0 = 0$, for various values of $\lambda$. Using the integral 12.65 it can be shown
that if $\lambda \gg 1$, $\theta(x) \simeq x\sqrt{\lambda}$.
The following exercise uses the integral 12.65 to deduce some properties of $\theta(x, \lambda)$ for
the differential equation 12.64 with the boundary conditions $y(0) = y(\pi) = 0$.
Exercise 12.31
(a) For the boundary value problem
$$y'' + \lambda y = 0, \quad y(0) = y(\pi) = 0,$$
show that $\theta(0, \lambda) = 0$ and $\theta(\pi, \lambda) = n\pi$, for some positive integer $n$. Use equation 12.65
to deduce that the value of $\lambda$ satisfying this last equation is $\lambda = n^2$.
Deduce that the $n$th eigenvalue is $\lambda_n = n^2$ and show that its eigenfunction has
$n - 1$ zeros in the interval $0 < x < \pi$.
Now consider a slightly different, but more typical problem, for which there is no simple
formula for $\theta(x)$. Consider the eigenvalue problem
$$\frac{d^2y}{dx^2} + \lambda xy = 0, \quad y(0) = y(1) = 0, \tag{12.66}$$
also treated in exercise 12.25. In this example $p = 1$ and $Q = \lambda x$, so the equation for $\theta$
is
$$\frac{d\theta}{dx} = \cos^2\theta + \lambda x\sin^2\theta, \quad \theta(0) = 0, \quad 0 \le x \le 1. \tag{12.67}$$
If $\lambda > 0$, $\theta'(x) > 0$ and, as before, $\theta(x)$ is a monotonic increasing function of $x$, with a
greater rate of increase the larger $\lambda$. Further if $\lambda_2 > \lambda_1$, $\theta(x, \lambda_2) \ge \theta(x, \lambda_1)$, as shown by
an application of the theorem for first-order equations quoted in exercise 12.32. Thus
for $\lambda > 0$ there is little qualitative difference between this and the previous simpler
example; some representative graphs of $\theta(x, \lambda)$ are depicted in figure 12.8.
Figure 12.8 Some representative graphs of $\theta(x)$, defined by equation 12.67
for various values of $\lambda$.
If $\lambda < 0$ the behaviour is not so easy to understand but, nevertheless, is similar to the
simpler example. Put $\lambda = -\mu$, with $\mu > 0$, so the equation for $\theta$ becomes
$$\frac{d\theta}{dx} = \cos^2\theta - \mu x\sin^2\theta, \quad \theta(0) = 0, \quad 0 \le x \le 1. \tag{12.68}$$
For small $x$, $\mu x\theta^2 < 1$, this equation is approximated by $\theta' = \cos^2\theta \simeq 1$, so $\theta(x)$ grows
linearly with $x$, that is $\theta(x) = x$. The two terms on the right-hand side of equation 12.68
are comparable when $1 = \mu x^3$ and near this value of $x$, $\theta'(x)$ becomes negative and for
large $\mu$ both $\theta$ and $x$ are small, so the equation is approximately
$$\frac{d\theta}{dx} = 1 - \mu x\theta^2. \tag{12.69}$$
For $\mu x^3 > 1$ the approximate solution of this equation is the function that makes the
derivative zero, that is $\mu x\theta^2 = 1$. To see this put $\mu x\theta^2 = 1 + \delta$, so $\theta' = -\delta$: if $\delta > 0$,
$\theta$ decreases; if $\delta < 0$, $\theta$ increases. In either case the solution moves towards the line¹³
$\mu x\theta^2 = 1$. A more accurate solution in the region $\mu x^3 > 1$ is found in exercise 12.33.
In figure 12.9 we compare the numerically generated solution of equation 12.68 with
the linear approximation, for $x < \mu^{-1/3}$, and the approximation $\mu x\theta^2 = 1$ for larger $x$,
for the cases $\mu = 10$ and $100$. This comparison confirms the predicted behaviour.
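The two regimes described above can be reproduced with a short numerical experiment. The following sketch (the function name `theta_negative` and the step count are arbitrary choices, and a standard fourth-order Runge-Kutta method is used rather than whatever method generated figure 12.9) integrates equation 12.68 for $\mu = 100$ and compares the result with $\theta \simeq x$ at a small value of $x$ and with $\theta \simeq (\mu x)^{-1/2}$ at $x = 1$.

```python
import math

def theta_negative(mu, x_end, n=4000):
    """Integrate d(theta)/dx = cos^2(theta) - mu*x*sin^2(theta), theta(0) = 0, by RK4."""
    rhs = lambda x, th: math.cos(th) ** 2 - mu * x * math.sin(th) ** 2
    h = x_end / n
    x, th = 0.0, 0.0
    for _ in range(n):
        k1 = rhs(x, th)
        k2 = rhs(x + 0.5 * h, th + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h, th + 0.5 * h * k2)
        k4 = rhs(x + h, th + h * k3)
        th += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    return th

mu = 100.0
theta_small = theta_negative(mu, 0.05)   # well inside x < mu**(-1/3): expect roughly x
theta_end = theta_negative(mu, 1.0)      # well beyond it: expect roughly 1/sqrt(mu*x)
print(theta_small, 0.05)
print(theta_end, 1.0 / math.sqrt(mu))
```

The value at $x = 1$ stays far below $\pi$, which is the numerical counterpart of the statement that $\theta$ varies too little for a negative eigenvalue to exist with $y(0) = y(1) = 0$.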
Figure 12.9 Graphs of the numerical solution of equation 12.68 and the approximations $\theta = x$
and $\mu x\theta^2 = 1$, shown by the dashed lines, for small and larger values of $x$, respectively, for
$\mu = 10$ (left) and $\mu = 100$ (right). The boundary $x = \mu^{-1/3}$ is shown by the arrows.
13 This type of analysis is useful in the study of boundary layer problems, relaxation oscillations and
$$\frac{d\theta}{dx} \simeq \frac{1}{p(x)} - \mu w(x)\theta^2.$$
The same reasoning as above shows that the approximate solution is $\mu p(x)w(x)\theta^2 = 1$,
giving $\theta(b, \lambda) \simeq (\mu p(b)w(b))^{-1/2}$, that is the variation of $\theta(x)$ is too small for eigenvalues
to exist, for the boundary conditions $y(a) = y(b) = 0$: for other boundary conditions
one negative eigenvalue may exist.
Exercise 12.32
In this exercise bounds on the positions of zeros and eigenvalues are obtained for
the Sturm-Liouville system defined by equation 12.56 with the boundary condi-
tions y(a) = y(b) = 0. For this the following comparison theorem for the first-order
equations y 0 = F (x, y) is needed.
Suppose that $F(x, y)$ and $G(x, z)$ satisfy the Lipschitz condition
on suitable intervals of $y$ and $z$, for some constant $L$. If $y' = F(x, y)$ and $z' = G(x, z)$,
with $y(a) = z(a)$, then if $F(x, y) \ge G(x, y)$ for $a \le x \le b$ and a suitable domain
of $y$, it can be shown that $y(x) \ge z(x)$ for $a < x \le b$.
Use this theorem with equation 12.60 for $\theta(x)$ to show that the $k$th zero, $x_k$, lies
between the limits,
$$\pi\sqrt{\frac{p_1}{q_2 + \lambda w_2}} \le \frac{x_k - a}{k} \le \pi\sqrt{\frac{p_2}{q_1 + \lambda w_1}}.$$
Exercise 12.33
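In equation 12.69 define a new variable $\phi = \theta/\epsilon$, where $\epsilon = 1/\sqrt{\mu}$, and show that
$$\epsilon\phi'(x) = 1 - x\phi^2.$$
By writing the solution of this equation in the form
and equating the coefficients of the powers of $\epsilon$ to zero, show that $\phi_0$, $\phi_1$ and $\phi_2$
satisfy the equations
$$1 - x\phi_0^2 = 0, \quad \phi_0' = -2x\phi_0\phi_1, \quad \phi_1' = -x\left(\phi_1^2 + 2\phi_0\phi_2\right).$$
Hence show that
$$\theta(x) = \frac{1}{\sqrt{\mu x}} + \frac{1}{4\mu x^2} + \frac{7}{32\mu^{3/2}x^{7/2}} + O(\mu^{-2}).$$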
Exercise 12.34
Use the comparison theorem for first-order equations quoted in exercise 12.32 to
show that if $\lambda_2 > \lambda_1$ then $\theta(b, \lambda_2) \ge \theta(b, \lambda_1)$.
Exercise 12.35
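In this exercise an approximation to the eigenvalues and eigenfunctions for large $n$
is found. The Liouville transformation, exercise 12.3 (page 478), shows that the
equation
$$\frac{d}{dx}\left(p\frac{dy}{dx}\right) + (q + \lambda w)\,y = 0, \quad a \le x \le b,$$
can be transformed to the equation
$$\frac{d^2v}{d\xi^2} + Q(\xi)v = 0, \quad Q(\xi, \lambda) = \lambda + \frac{q}{w} - A\frac{d^2}{d\xi^2}\left(\frac{1}{A}\right),$$
where $y = A(\xi)v(\xi)$,
$$\xi(x) = \int_a^x dx\,\sqrt{\frac{w}{p}} \quad\text{and}\quad A(\xi) = (wp)^{-1/4}.$$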
d p Q0 d Q0
= Q sin 2 and (ln R) = cos 2.
d 4Q d 4Q
(b)
Assume that Q is bounded and that max(Q) and show that () '
and R ' r, where and r are constants, and deduce that with the boundary
conditions y(a) = y(b) = 0 the approximate eigenvalues and eigenfunctions are
2
n r n
n = and vn () = sin .
(b) Q(, n )1/4 (b)
12.6. DIRECT METHODS USING VARIATIONAL PRINCIPLES 513
Figure 12.10 On the left we compare the exact solution of equation 12.70 with the variational
approximation, defined in equation 12.71. On the right we show the difference, $100(y - z)$, between
the exact and the variational approximation obtained using the trial function, defined in
equation 12.71.
Further thought suggests that this trial function is a poor choice, because the actual
solution is an odd function of x. This can be deduced from the differential equation
because its right-hand side is odd, so we expect the solution to be odd, for if y(x) were
even, so also is y 00 (x) and the left-hand side of the equation would be even. Thus a
more sensible trial function is
which leads to a = 7/38. This estimate of the solution is very close to the exact
solution as seen in figure 12.11 where we show the graphs of $100(y - z)$: notice that
the differences are about 10 times smaller than those in figure 12.10, which shows that
a careful choice of trial function can lead to significantly improved results with little
extra effort.
Figure 12.11 Graph of the difference $100(y - z)$, between the exact solution and the
trial function defined in equation 12.72. Notice that the differences are about 10 times
smaller than those in figure 12.10.
A more general odd trial function that satisfies the boundary conditions is
$$\frac{d^2y}{dx^2} + x^2y^2 = x, \quad y(0) = y(2) = 0, \tag{12.74}$$
whose solution cannot be expressed in terms of elementary functions. The functional
for this equation is
$$S[y] = \int_0^2 dx\left(\frac{1}{2}y'^2 - \frac{1}{3}x^2y^3 + xy\right), \quad y(0) = y(2) = 0. \tag{12.75}$$
so that
$$S(a) = -\frac{64}{189}a^3 + \frac{4}{3}a^2 + \frac{4}{3}a \quad\text{and}\quad S'(a) = -\frac{64}{63}a^2 + \frac{8}{3}a + \frac{4}{3}.$$
Now there are two stationary paths given by the roots of this quadratic, which we
denote by
$$a_- = \frac{1}{16}\left(21 - \sqrt{777}\right) \quad\text{and}\quad a_+ = \frac{1}{16}\left(21 + \sqrt{777}\right),$$
which suggests that this nonlinear boundary value problem has two solutions. Numerical
calculations, guided by this approximation, confirm this and in figure 12.12 we
compare these approximate solutions with those given by a numerical calculation.
Figure 12.12 A graphical comparison of the exact (numerical) solutions of equation 12.74, the
solid lines, and the variational approximation, the dashed lines. On the left is the comparison for
$a < 0$ and on the right for $a > 0$.
By substituting a power series into the differential equation, it can be seen that a better
trial function is $z = ax(4 - x^2)$, because the coefficient of the term $x^2$ is zero, but for
this trial function the integrals are slightly more complicated. We have
$$\frac{1}{2}\int_0^2 dx\,z'^2 = \frac{64}{5}a^2, \quad \frac{1}{3}\int_0^2 dx\,x^2z^3 = \frac{512}{45}a^3, \quad \int_0^2 dx\,xz = \frac{64}{15}a$$
so that
$$S(a) = -\frac{512}{45}a^3 + \frac{64}{5}a^2 + \frac{64}{15}a \quad\text{and}\quad S'(a) = -\frac{512}{15}a^2 + \frac{128}{5}a + \frac{64}{15}$$
and the two stationary paths are given by setting $a$ to the values,
$$a_- = \frac{3 - \sqrt{17}}{8} \quad\text{and}\quad a_+ = \frac{3 + \sqrt{17}}{8}.$$
In figure 12.13 these approximations are compared with numerically generated solutions
of equation 12.74. For $a = a_-$ the trial solution, shown by the circles, is very close to
the exact solution. In both cases the approximations are better, which again illustrates
the value of choosing suitable trial functions.
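The coefficients of the cubic $S(a)$ and the stationary values of $a$ can be checked with exact rational arithmetic. The sketch below is an illustration only: the helper names are invented here, polynomials are stored as coefficient lists in increasing powers of $x$, and the first trial function is assumed to be $z = ax(2 - x)$, an assumption that reproduces the coefficients $64/189$, $4/3$ and $4/3$ quoted earlier. The second trial function is $z = ax(4 - x^2)$, as stated in the text.

```python
from fractions import Fraction
import math

def pmul(p, q):
    """Product of two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def pdiff(p):
    """Derivative of a polynomial."""
    return [k * c for k, c in enumerate(p)][1:]

def pint(p, lo, hi):
    """Exact integral of a polynomial over [lo, hi]."""
    return sum(Fraction(c) * (Fraction(hi) ** (k + 1) - Fraction(lo) ** (k + 1)) / (k + 1)
               for k, c in enumerate(p))

def ritz_cubic(t):
    """For z = a*t(x) in functional 12.75 return (c1, c2, c3),
    where S(a) = -c3*a^3 + c2*a^2 + c1*a."""
    t = [Fraction(c) for c in t]
    x = [Fraction(0), Fraction(1)]
    dt = pdiff(t)
    c2 = pint(pmul(dt, dt), 0, 2) / 2                            # (1/2) int z'^2
    c3 = pint(pmul(pmul(x, x), pmul(t, pmul(t, t))), 0, 2) / 3   # (1/3) int x^2 z^3
    c1 = pint(pmul(x, t), 0, 2)                                  # int x z
    return c1, c2, c3

def stationary_points(t):
    """Roots of S'(a) = -3*c3*a^2 + 2*c2*a + c1 = 0, in increasing order."""
    c1, c2, c3 = ritz_cubic(t)
    A, B, C = -3 * c3, 2 * c2, c1
    disc = math.sqrt(float(B * B - 4 * A * C))
    return sorted((-float(B) + s * disc) / (2.0 * float(A)) for s in (1.0, -1.0))

first = stationary_points([0, 2, -1])       # assumed trial function z = a x (2 - x)
second = stationary_points([0, 4, 0, -1])   # z = a x (4 - x^2)
print(first, second)
```

Because `ritz_cubic` accepts any polynomial that vanishes at $x = 0$ and $x = 2$, other one-parameter trial functions can be tried without redoing the integrals by hand.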
It is worth noting that some black-box numerical methods for solving boundary
value problems give only the first solution, a > 0, and provide no inkling that another
solution exists. Thus, simple variational calculations, such as described here, can avoid
embarrassing errors; but they give no guarantee that only two solutions to this problem
exist.
Exercise 12.36
Using the trial function $y = 1 - ax - (1 - a)x^2$ obtain an approximate solution for
the equation $y'' + xy = 0$, $y(0) = 1$, $y(1) = 0$.
Exercise 12.37
(a) Show that the functional associated with the equation $y'' + y^3 = 0$, $y(0) = 0$,
$y'(X) = 0$ is
$$S[y] = \int_0^X dx\left(\frac{1}{2}y'^2 - \frac{1}{4}y^4\right), \quad y(0) = 0,$$
and a natural boundary condition at $x = X$.
(b) Use the trial function $y = a\sin(\pi x/(2X))$ to find an approximate solution.
You will need the integral
$$\int_0^{\pi/2} du\,\sin^4u = \frac{3\pi}{16}.$$
which we assume to have an infinite sequence of real eigenvalues $\lambda_1 < \lambda_2 < \lambda_3 \cdots$, and
associated eigenfunctions $y_1(x), y_2(x), \dots$.
First we need the following relation between the $n$th eigenfunction, $y_n(x)$, and its
eigenvalue
$$\lambda_n = S[y_n] = \int_0^1 dx\left(py_n'^2 - qy_n^2\right). \tag{12.79}$$
This formula is useful because we shall use it, with approximations for $y_n(x)$, to both
approximate and bound $\lambda_n$.
Exercise 12.38
By multiplying equation 12.78 by yn and integrating over (0, 1) prove equa-
tion 12.79.
Exercise 12.39
If $y_n(x)$ is an exact eigenfunction with eigenvalue $\lambda_n$ and $z_n = y_n + \epsilon u(x)$, with
$|\epsilon| \ll 1$ and $O(u) = 1$, is an admissible function, show that
$$\lambda_n = S[z_n] + O(\epsilon^2).$$
The result derived in exercise 12.39 is important. It shows that if an eigenfunction
is known approximately, with an accuracy $O(\epsilon)$, then it can be used to approximate
the eigenvalue to an accuracy $O(\epsilon^2)$.
For the linear system 12.78 we construct trial functions using a subset of a complete
set of functions $\{\phi\} = \{\phi_1, \phi_2, \dots\}$, each of which satisfies the boundary conditions.
Normally this set comprises the eigenfunctions of another Sturm-Liouville system and it is
clear that when choosing this system it is sensible to use a system that is similar to
that being studied.
Here we use the complete, orthogonal sequence $\phi_k(x)$, $k = 1, 2, \dots$, satisfying
$$\int_0^1 dx\,\phi_i(x)\phi_j(x) = h_i\delta_{ij}, \tag{12.80}$$
with each $\phi_i(x)$ satisfying the same boundary conditions as the original Sturm-Liouville
system, in this case $\phi_i(0) = \phi_i(1) = 0$. At the end of this analysis we shall use a specific
set of functions by setting $\phi_k = \sin k\pi x$. A trial function is obtained using a linear
combination of the first $n$ of these functions
$$z(x; \mathbf{a}) = \sum_{k=1}^{n} a_k\phi_k(x),$$
and this will provide an approximation to the first $n$ of the required eigenvalues and
eigenfunctions. The trial function needs to satisfy the constraint $C[z] = 1$, and this
defines the function
$$C(\mathbf{a}) = \int_0^1 dx\left(\sum_{k=1}^{n} a_k\phi_k(x)\right)^2 = \sum_{k=1}^{n} h_ka_k^2 = 1, \tag{12.81}$$
where we have used the orthogonal property, equation 12.80. In the space of real
variables $\mathbf{a} = (a_1, a_2, \dots, a_n)$ this quadratic function of $\mathbf{a}$, equation 12.81, defines an
$n$-dimensional ellipsoid. It is convenient to write this constraint in terms of the vector $\mathbf{a}$,
$$C(\mathbf{a}) = \mathbf{a}^\top H\mathbf{a} = 1,$$
where $H$ is the $n \times n$, diagonal matrix with $H_{kk} = h_k$. The functional $S[z]$ defines
another function of $\mathbf{a}$,
$$S(\mathbf{a}) = \int_0^1 dx\left[p(x)\left(\sum_{k=1}^{n} a_k\phi_k'(x)\right)^2 - q(x)\left(\sum_{k=1}^{n} a_k\phi_k(x)\right)^2\right], \tag{12.82}$$
12.6. DIRECT METHODS USING VARIATIONAL PRINCIPLES 519
where $S$ is a real, symmetric $n \times n$ matrix, with elements $S_{ij}$. Specifically, these matrix elements are given by
$$S_{ij} = \int_0^1 dx \left[ p(x)\phi_i'(x)\phi_j'(x) - q(x)\phi_i(x)\phi_j(x) \right]. \qquad (12.84)$$
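$$S\mathbf{a} = \lambda H\mathbf{a} \quad\text{or}\quad H^{-1}S\mathbf{a} = \lambda\mathbf{a}. \qquad (12.86)$$
That is, the stationary points are given by the eigenvectors of $H^{-1}S$. Further, since $S$ is real and symmetric and $H$ is diagonal with positive elements, the $n$ eigenvalues of $H^{-1}S$ are real and can be ordered, $\lambda_1 < \lambda_2 < \cdots < \lambda_n$, and the $k$th eigenvalue provides an approximation to the $k$th eigenvalue of the original Euler-Lagrange equation, as shown next.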
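The reduction to a matrix eigenvalue problem can be sketched in a few lines. The example below is an illustration under stated assumptions rather than the author's own computation: it takes the system $y'' + (x + \lambda)y = 0$, $y(0) = y(1) = 0$ treated later in this section ($p = 1$, $q = x$), uses the two trial functions $\phi_k = \sin k\pi x$, $k = 1, 2$, evaluates $S_{ij}$ and $h_k$ by Simpson's rule, and solves the $2 \times 2$ problem $H^{-1}S\mathbf{a} = \lambda\mathbf{a}$ from its characteristic quadratic. All function names are hypothetical.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def p(x):            # assumed coefficient p(x) = 1
    return 1.0

def q(x):            # assumed coefficient q(x) = x
    return x

def phi(k, x):       # trial functions sin(k pi x)
    return math.sin(k * math.pi * x)

def dphi(k, x):      # their derivatives
    return k * math.pi * math.cos(k * math.pi * x)

def s_elem(i, j):
    """Matrix element S_ij of equation 12.84."""
    return simpson(lambda x: p(x) * dphi(i, x) * dphi(j, x)
                   - q(x) * phi(i, x) * phi(j, x), 0.0, 1.0)

def h_elem(k):
    """Normalisation h_k of equation 12.80."""
    return simpson(lambda x: phi(k, x) ** 2, 0.0, 1.0)

def ritz_two_terms():
    """Eigenvalues of the 2 x 2 matrix H^{-1} S, smallest first."""
    h1, h2 = h_elem(1), h_elem(2)
    m11, m12 = s_elem(1, 1) / h1, s_elem(1, 2) / h1
    m21, m22 = s_elem(2, 1) / h2, s_elem(2, 2) / h2
    trace, det = m11 + m22, m11 * m22 - m12 * m21
    disc = math.sqrt(trace * trace - 4.0 * det)
    return (trace - disc) / 2.0, (trace + disc) / 2.0

lam_small, lam_large = ritz_two_terms()
print(lam_small, lam_large)
```

For larger $n$ the same matrices would normally be passed to a library eigenvalue routine; the quadratic formula is used here only to keep the sketch free of dependencies.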
If $\mathbf{a}_k$ is the $k$th eigenvector of $H^{-1}S$ with eigenvalue $\lambda_k$, then assuming that the associated trial function, $z(x; \mathbf{a}_k)$, is an approximation to the $k$th eigenfunction of the Sturm-Liouville system we have, from the result found in exercise 12.39,
$$\frac{d^2y}{dx^2} + (x + \lambda)y = 0, \quad y(0) = y(1) = 0, \qquad (12.87)$$
520 CHAPTER 12. STURM-LIOUVILLE SYSTEMS
where $p(x) = 1$, $q(x) = x$ and $w(x) = 1$. The associated functional and constraint are
$$S[y] = \int_0^1 dx \left( y'^2 - xy^2 \right), \quad C[y] = \int_0^1 dx\, y^2 = 1, \quad y(0) = y(1) = 0. \qquad (12.88)$$
We use the complete set $\phi_k = \sin k\pi x$, $k = 1, 2, \ldots$, to construct the trial functions, and the simplest of these is obtained by using only the first function,
$$z(x; a_1) = a_1\sin\pi x.$$
The constraint gives $a_1^2\int_0^1 dx\, \sin^2\pi x = 1$, that is $a_1^2 = 2$. Thus the first approximation for $\lambda_1$ is given by
$$\lambda_1 \simeq S(a_1) = a_1^2\int_0^1 dx \left( \pi^2\cos^2\pi x - x\sin^2\pi x \right) = \pi^2 - \frac{1}{2} \simeq 9.3696. \qquad (12.89)$$
That is $\lambda_1 \simeq 9.3696$. The exact eigenvalues are given by the real solutions of
$$\mathrm{Ai}(-\lambda)\,\mathrm{Bi}(-1-\lambda) - \mathrm{Ai}(-1-\lambda)\,\mathrm{Bi}(-\lambda) = 0,$$
where Ai(z) and Bi(z) are the Airy functions which are solutions of Airy's equation, $y'' - xy = 0$: to 7 significant figures the first eigenvalue is 9.368507, so the approximation is larger than this by 0.01%.
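The quoted exact value can be cross-checked without Airy functions. The sketch below swaps in a different technique, a shooting method: it integrates $y'' + (x + \lambda)y = 0$ from $x = 0$ with $y(0) = 0$, $y'(0) = 1$ using a fourth-order Runge-Kutta scheme and bisects on $\lambda$ until $y(1) = 0$. The bracket $9 \le \lambda \le 10$, the step count and the function names are assumptions made for this illustration.

```python
import math

def shoot(lam, steps=1000):
    """Return y(1) for y'' + (x + lam) y = 0, y(0) = 0, y'(0) = 1 (classical RK4)."""
    h = 1.0 / steps
    x, y, v = 0.0, 0.0, 1.0

    def acc(xx, yy):
        # second derivative y'' = -(x + lam) y
        return -(xx + lam) * yy

    for _ in range(steps):
        k1y, k1v = v, acc(x, y)
        k2y, k2v = v + 0.5 * h * k1v, acc(x + 0.5 * h, y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, acc(x + 0.5 * h, y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, acc(x + h, y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        x += h
    return y

def first_eigenvalue(lo=9.0, hi=10.0, tol=1e-10):
    """Bisection on lam for the root of y(1; lam) = 0 inside the assumed bracket."""
    f_lo = shoot(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = shoot(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

lam_exact = first_eigenvalue()
lam_trial = math.pi ** 2 - 0.5          # one-term estimate, equation 12.89
print(lam_exact, lam_trial, 100.0 * (lam_trial - lam_exact) / lam_exact)
```

The last printed number is the percentage by which the one-term variational estimate exceeds the shooting value.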
Exercise 12.40
Consider the eigenvalue problem
$$y'' + (x^2 + \lambda)y = 0, \quad y(0) = y(1) = 0.$$
(a) Using the orthogonal set $\phi_k(x) = \sin k\pi x$, $k = 1, 2, \ldots$, show that an upper bound to the smallest eigenvalue is
$$\lambda_1 \le 2\int_0^1 dx \left( \pi^2\cos^2\pi x - x^2\sin^2\pi x \right) = \pi^2 - \frac{1}{3} + \frac{1}{2\pi^2} \simeq 9.59.$$
(b) Show that the trial function $z = ax(1 - x)$ gives the bound $\lambda_1 \le 68/7 \simeq 9.71$.
Which of these two estimates is closer to the exact value?
Exercise 12.41
Use the bounds determined in exercise 12.32 (page 511) to show that the $n$th eigenvalue of the system $y'' + (x^a + \lambda)y = 0$, $y(0) = y(1) = 0$, with $a > 0$ is bounded by $(n\pi)^2 - 1 \le \lambda_n \le (n\pi)^2$.
Exercise 12.42
Using the trial function $z = a(1 - x^2)$ show that an upper bound to the smallest eigenvalue of the system
$$y'' + \left(x^{2p} + \lambda\right)y = 0, \quad y(-1) = y(1) = 0,$$
Exercise 12.44
(a) Find the eigenvalues and eigenfunctions of the problem
$$\frac{d^2y}{dx^2} + \lambda y = 0, \quad y'(0) = y(1) = 0.$$
(b) Use the first eigenfunction of this problem to show that an approximation to the first eigenvalue of
$$\frac{d^2y}{dx^2} + \left( b\sin\frac{\pi x}{2} + \lambda \right)y = 0, \quad y'(0) = y(1) = 0 \quad\text{is}\quad \lambda_1 \simeq \frac{\pi^2}{4} - \frac{4b}{3\pi}.$$
Hint use the nth eigenfunction of the system defined in part (a) to construct a
one parameter trial function.
Exercise 12.45
(a) Determine an approximation to the eigenvalues and eigenfunctions of the equation
$$\frac{d^2y}{dx^2} + (x + \lambda)y = 0, \quad y(0) = y(1) = 0,$$
by substituting the series
$$y(x) = \sum_{k=1}^{n} a_k\sin k\pi x$$
(b) Show that for $n = 1$ and 2 this method gives the approximations 12.89 and 12.90 respectively.
(c) Show that for arbitrary $n$ this method gives the equation 12.86 (page 519) for $\mathbf{a}$ if $\phi_k = \sin k\pi x$, $p(x) = 1$ and $q(x) = x$.
$$\sin k\pi x, \quad k = 1, 2, 3, \ldots,$$
$$x(1 - x), \quad x(1 - x)(1 + x), \quad x(1 - x)(1 + x + x^2), \quad x(1 - x)(1 + x + x^2 + x^3), \quad \ldots.$$
From the sequence $\{\phi\}$ a finite dimensional subspace $M_n$ is formed from the first $n$ members $\{\phi_1, \phi_2, \ldots, \phi_n\}$; that is the set of all the linear combinations
$$z = \sum_{k=1}^{n} a_k\phi_k(x),$$
where the $a_k$, $k = 1, 2, \ldots, n$ are any real numbers. On this subspace $S[z]$ becomes a function of the real numbers $\mathbf{a} = (a_1, a_2, \ldots, a_n)$,
$$S(\mathbf{a}) = S[z].$$
This is exactly as in the previous section; but now we use the fact that the functional
has a minimum.
Choose $(a_1, a_2, \ldots, a_n)$ to minimise $S(\mathbf{a})$ and denote this minimum value by $s_n$ and the associated element of $M_n$ by $y_n$,
$$s_n = \min S(a_1, a_2, \ldots, a_n).$$
Clearly $s_n$ cannot increase with $n$ because $M_{n+1}$ contains $M_n$, that is any linear combination of $\{\phi_1, \phi_2, \ldots, \phi_n\}$ is a linear combination of $\{\phi_1, \phi_2, \ldots, \phi_n, \phi_{n+1}\}$. If the sequence $\{\phi\}$ is complete, then it can be shown that the sequence $s_n$ converges to $s$, the minimum value of $S[y]$. This method of successively approximating a functional using sequences of functions is the Ritz method.
For Sturm-Liouville systems the significance of this result is that the eigenvalue is
just the value of a functional that has a minimum, equation 12.79.
Then the function $S(\mathbf{a})$, equation 12.82, has a minimum: $S(\mathbf{a})$ is continuous and the constraint confines $\mathbf{a}$ to a closed, bounded region, so $S(\mathbf{a})$ is bounded on this region and there is some value of $\mathbf{a}$ that yields the minimum value. Substituting this value for $\mathbf{a}$ into $S(\mathbf{a})$ gives an upper bound $\lambda_1^{(n)}$ for $\lambda_1$,
$$\lambda_1 \le \lambda_1^{(n)} = S(\mathbf{a}). \qquad (12.92)$$
For each $m = 1, 2, \ldots$, we similarly obtain an upper bound, $\lambda_1^{(m)}$, for the lowest eigenvalue, and by the same reasoning as used above, we see that
$$\lambda_1^{(1)} \ge \lambda_1^{(2)} \ge \cdots \ge \lambda_1^{(m)} \ge \cdots \quad\text{and}\quad \lim_{m\to\infty}\lambda_1^{(m)} = \lambda_1. \qquad (12.93)$$
Thus the method used in the previous section provides successively closer upper bounds to the lowest eigenvalue.
A numerical example of this behaviour was seen in the calculation of the smallest eigenvalue of equation 12.87 (page 519) where we used the trial function
$$z(x; \mathbf{a}) = \sum_{k=1}^{n} a_k\sin k\pi x.$$
The exact value of this eigenvalue is, to 10 significant figures, 9.368 507 162: for $n = 1$, 2 and 3 the variational estimates for $\lambda_1$ are 9.3696, 9.368 509 and 9.368 508 6. With the trial function
the estimates of this eigenvalue with $n = 1$, 2, 3 and 4 are 9.5, 9.4989, 9.3687 and 9.368 513. As predicted the estimates approach the exact value from above.
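The first two polynomial estimates can be reproduced with exact rational arithmetic. This is a sketch under an assumption: the trial space is taken to be spanned by $x^k(1 - x)$, $k = 1, \ldots, n$, which spans the same functions as the polynomial sequence $x(1 - x)$, $x(1 - x)(1 + x)$, ... listed earlier; the text's own trial function is not legible here. Because this basis is not orthogonal, $H$ is a full matrix and the eigenvalues are the roots of $\det(S - \lambda H) = 0$. All helper names are hypothetical.

```python
from fractions import Fraction
import math

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists, lowest power first."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_diff(a):
    """Derivative of a polynomial."""
    return [k * a[k] for k in range(1, len(a))]

def integrate01(a):
    """Exact integral of a polynomial over (0, 1)."""
    return sum(c / (k + 1) for k, c in enumerate(a))

X = [Fraction(0), Fraction(1)]          # the polynomial q(x) = x

def basis(k):
    """Assumed trial function x^k (1 - x)."""
    coeffs = [Fraction(0)] * (k + 2)
    coeffs[k], coeffs[k + 1] = Fraction(1), Fraction(-1)
    return coeffs

def s_elem(f, g):
    """Integral of f'g' - x f g over (0, 1), i.e. p = 1 and q = x."""
    return (integrate01(poly_mul(poly_diff(f), poly_diff(g)))
            - integrate01(poly_mul(X, poly_mul(f, g))))

def h_elem(f, g):
    """Integral of f g over (0, 1)."""
    return integrate01(poly_mul(f, g))

phi1, phi2 = basis(1), basis(2)

# n = 1: the estimate is a simple ratio.
est1 = s_elem(phi1, phi1) / h_elem(phi1, phi1)

# n = 2: smallest root of det(S - lam H) = qa lam^2 + qb lam + qc = 0.
s11, s12, s22 = s_elem(phi1, phi1), s_elem(phi1, phi2), s_elem(phi2, phi2)
h11, h12, h22 = h_elem(phi1, phi1), h_elem(phi1, phi2), h_elem(phi2, phi2)
qa = float(h11 * h22 - h12 ** 2)
qb = float(-(s11 * h22 + s22 * h11 - 2 * s12 * h12))
qc = float(s11 * s22 - s12 ** 2)
est2 = (-qb - math.sqrt(qb * qb - 4.0 * qa * qc)) / (2.0 * qa)

print(est1, est2)
```

Under this assumption the two values agree with the first two estimates quoted above, and the second is smaller than the first, as the nesting of the subspaces requires.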
The Ritz method can be applied to any functional with a minimum value. In
particular it applies to the general Sturm-Liouville system
Z b
2 2
dx py 0 2 qy 2 ,
S[y] = p(a)y(a) + p(b)y(b) + (12.94)
a
see exercise 12.46. Provided the integrals exist, the Ritz method applies to singular and
regular systems. For the boundary conditions y(a) = 0 and/or y(b) = 0 the appropriate
boundary term of the functional is removed.
For this system a sequence can be found such that the smallest eigenvalue satisfies
the conditions of equation 12.93. Further, the rigorous application of this method proves
the existence of an infinite sequence of eigenvalues and eigenfunctions for both regular
and singular systems, see for instance Fomin and Gelfand (1992, chapter 8) or Courant
and Hilbert (1965, chapter 6).
By adding an additional constraint that forces the admissible functions to be orthogonal to $y_1(x)$, the eigenfunction associated with the smallest eigenvalue, we obtain bounds for the next eigenvalue. Thus by considering the system defined by equations 12.94 and 12.95 with the additional constraint
$$C_1[y, y_1] = \int_a^b dx\, w y y_1 = 0, \qquad (12.97)$$
and using trial functions $z$ satisfying the two constraints $C[z] = 1$ and $C_1[z, y_1] = 0$ we obtain another convergent sequence
$$\lambda_2^{(1)} \ge \lambda_2^{(2)} \ge \cdots \ge \lambda_2^{(m)} \ge \cdots \quad\text{and}\quad \lim_{m\to\infty}\lambda_2^{(m)} = \lambda_2. \qquad (12.98)$$
By adding further constraints this process can be continued to obtain upper bounds for any eigenvalue.
Exercise 12.46
(a) Show that the constrained functional with natural boundary conditions
Z b
S[y] = p(a)y(a)2 + p(b)y(b)2 + dx py 0 2 qy 2
a
Exercise 12.48
Show that changing to the independent variable $t = \int_a^x dx\, \sqrt{q(x)}$ converts the equation $y'' + p_1(x)y' + q(x)y = 0$, $a \le x \le b$, $q(x) > 0$, into
$$\frac{d^2y}{dt^2} + \frac{q'(x) + 2p_1q}{2q^{3/2}}\frac{dy}{dt} + y = 0.$$
Exercise 12.49
For problems defined inside an elliptical region it is sometimes convenient to use
elliptical coordinates defined by
d2 g
(a 2q cos 2v) g = 0, q = (k)2 , g(v + 2) = g(v) for all v,
dv 2
2
d f
+ (a 2q cosh 2u) = 0.
du2
The first of these equations is commonly known as Mathieu's equation and periodic solutions exist only for certain values of $a(q)$.
Exercise 12.50
Kepler's equation
Show that Kepler's equation $\xi = u - \epsilon\sin u$ with $0 \le \epsilon < 1$ can be inverted in terms of Bessel functions with the formula,
$$u = \xi + 2\sum_{k=1}^{\infty}\frac{1}{k}J_k(k\epsilon)\sin k\xi.$$
12.7. MISCELLANEOUS EXERCISES 527
Exercise 12.51
Show that the function defined by the integral 12.33 (page 490) satisfies the dif-
ferential equation 12.30, with = n.
Hint, by differentiating under the integral sign, show that Bessels equation can
be written in the form
Z
1 d n cos t
dt g(t)ei(ntx sin t) with g(t) = i + .
2 dt x2 t
Exercise 12.52
Find the eigenvalues and eigenfunctions of the Sturm-Liouville system y 00 +y = 0,
y(0) = 0, y() = y 0 () any real .
Exercise 12.53
If $f(x)$, $g(x)$ and $h(x)$ are any solutions of the second-order equation $y'' + p_1(x)y' + q(x)y = 0$, show that the following determinant is zero
$$\begin{vmatrix} f & f' & f'' \\ g & g' & g'' \\ h & h' & h'' \end{vmatrix}.$$
Exercise 12.54
Use the results found in exercise 12.21 (page 499) to construct a linear, homogeneous, second-order differential equation having the solutions
(a) $(\sinh x, \sin x)$, (b) $(\tan x, 1/\tan x)$.
Exercise 12.55
Use the results found in exercise 12.21 (page 499) to show that the equation
$$\frac{d^2y}{dx^2} - \frac{u'}{u}\frac{dy}{dx} - u^2y = 0, \quad u = \frac{f'}{f},$$
has solutions $f(x)$ and $1/f(x)$.
Exercise 12.56
Let $f(x)$, $g(x)$ and $h(x)$ be three solutions of the linear, third order differential equation
$$\frac{d^3y}{dx^3} + p_2(x)\frac{d^2y}{dx^2} + p_1(x)\frac{dy}{dx} + p_0(x)y = 0.$$
Derive a first-order differential equation for the Wronskian
$$W(x) = \begin{vmatrix} f & g & h \\ f' & g' & h' \\ f'' & g'' & h'' \end{vmatrix}.$$
Exercise 12.57
Find the self-adjoint form of the equation y 00 + y 0 tan x = 0.
Exercise 12.58
Use a comparison theorem to show that the solutions of $y'' + \dfrac{x}{1+x}\,y = 0$ have infinitely many zeros for $x > 1$.
Exercise 12.59
Show that the eigenvalues of the Sturm-Liouville system $y'' + \lambda y = 0$ with the $2\pi$-periodic boundary conditions $y(-\pi) = y(\pi)$ and $y'(-\pi) = y'(\pi)$ are $\lambda_n = n^2$, $n = 0, 1, 2, \ldots$ and that for each eigenvalue, except $\lambda_0$, there are two distinct eigenfunctions, which can be expressed as the real or the complex functions
Show, also, that any linear combination of the pairs $e^{\pm inx}$ is also an eigenfunction with eigenvalue $\lambda_n = n^2$.
Exercise 12.60
(a) Using the new independent variable defined by $x = e^t$, show that if $B > 1/4$ the equation $y''(x) + By/x^2 = 0$ has infinitely many zeros on $(1, \infty)$.
(b) Show that the equation $y''(x) + q(x)y/x^2 = 0$ has infinitely many zeros on $(1, \infty)$ if $q(x) > 1/4$ for $x \ge 1$.
Exercise 12.61
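Consider the system
$$x\frac{d^2y}{dx^2} + \frac{dy}{dx} + \frac{\lambda}{x}y = 0, \quad x \ge 0.$$
(a) Show that the self-adjoint form of this equation is
$$\frac{d}{dx}\left(x\frac{dy}{dx}\right) + \frac{\lambda}{x}y = 0, \quad x \ge 0,$$
$$\frac{d^2u}{dx^2} + \frac{\lambda + \frac{1}{4}}{x^2}u = 0, \quad u(x) = y(x)\sqrt{x},$$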
and determine the intervals on which it is a regular system and on which it is a
singular system.
(c) Find any eigenvalues and eigenfunctions for the boundary conditions y(0) =
y(1) = c, for any c.
(d) Find the eigenvalues and eigenfunctions for the boundary conditions y(a) = y(b) = 0,
0 < a < b.
Exercise 12.62
The Schwarzian derivative
(a) If $f(x)$ and $g(x)$ are any two linearly independent solutions of the equation $y'' + q(x)y = 0$, show that the ratio $v = f/g$ is a solution of the third order, nonlinear equation $S(v) = 2q(x)$, where
$$S(v) = \frac{v'''}{v'} - \frac{3}{2}\left(\frac{v''}{v'}\right)^2.$$
The function $S(v)$ is named the Schwarzian derivative and has the important property that if $S(F) < 0$ and $S(G) < 0$ in an interval, then $S(H) < 0$, where $H(x) = F(G(x))$. This result is useful in the study of bifurcations of the fixed points of one dimensional maps.
$$u''\sqrt{p} + \left( \frac{q}{\sqrt{p}} + \frac{p'^2}{4p^{3/2}} - \frac{p''}{2\sqrt{p}} \right)u = 0.$$
Dividing by $\sqrt{p}$ gives the quoted result.
x
1
Z
where (x) = dt and b0 = (b). The associated Euler-Lagrange equation is
a p(t)
y 00 () + p(q + w)y = 0.
12.8. SOLUTIONS FOR CHAPTER 12 531
but
b b
d
Z Z
b
dx p(u2 )0 (v 2 )0 = pu2 (v 2 )0 a dx u2 (v 2 )0 p
a a dx
so, since u(a) = u(b) = 0, the functional becomes
b
1d
Z
2 02 2 02 2 0
2
S[y] = dx pv u qv pv + p(v ) u .
a 2 dx
d2 u p0 2 p00
q
+ + u = 0, u = y p,
d 2 p 4p2 2p
b0 2
1 A
Z
S[v] = d p 0 (x) A2 v 0 2 + (A2 )0 (v 2 )0 (q + w) p 0
(x)A 02
v2 .
a0 2 0 (x)
so that
ib 0
1h 0
S[v] = p (x)(A2 )0 v 2
2 a0
Z b0 2
A 1d 0
+ d A2 p 0 (x)v 0 2 (q + w) p 0 02
A + p (x)(A 2 0
) v2 .
a0 0 (x) 2 d
Now define (x) with the equation A2 p 0 (x) = 1 to put S[v] in the simpler form
b0 Z b0
1 (A2 )0 v 2
d v 0 2 F ()v 2
S[v] = 2
+
2 A a0 a0
where
A0 2 (A2 )0
1d
F () = (q + w)A4 p + .
A2 2 d A2
But
A0 2 (A2 )0 A0 2 d A0 A00 2A0 2 d2
1d 1
2 + = 2 + = = A ,
A 2 d A2 A d A A A2 d 2 A
and hence
d2
4 1
F () = (q + w)pA A 2 .
d A
d2
q 1
F () = + A 2 , where A = (wp)1/4
w d A
1 d2 X 1 d2 Y
+ + k 2 = 0.
X dx2 Y dy 2
Thus defining the two constants 1 and 2 by the equations
1 d2 X 1 d2 Y
= 12 and = 22
X dx2 Y dy 2
2
d dR 2
r + k r R = 0.
dr dr r
Now multiply this equation by p (x) , integrate and use the orthogonality relation 12.25
to obtain Z b
du p (u) F (u) = p yp hp ,
a
which gives a value for yp . Substituting this value for yk into the original sum for y(x)
gives a solution of the inhomogeneous equation in the form
b b
1
X Z Z
y(x) = du F (u)k (u) k (x) = du G(x, u)F (u)
k hk a a
k=1
X k (u) k (x)
where G(x, u) = .
hk k
k=1
2 2 14
1
I(x) = 2 1 + 4x x =1 .
4x x x2
Compare the nth coefficient of this and the original series to obtain the first result.
(ii) Put $z = x$ in equation 12.31, $e^{ix\sin t} = \sum_{n=-\infty}^{\infty} J_n(x)e^{int}$, and now set $t = \pi + s$, so $\sin t = -\sin s$, to obtain $e^{-ix\sin s} = \sum_{n=-\infty}^{\infty} J_n(x)e^{in\pi}e^{ins}$. Compare the $n$th coefficient of this and the original series to obtain the second result.
(iii) Put $t = 0$
$$1 = \sum_{n=-\infty}^{\infty} J_n(z) = J_0(z) + \sum_{n=1}^{\infty} J_n(z) + \sum_{n=1}^{\infty} J_{-n}(z) = J_0(z) + 2\sum_{n=1}^{\infty} J_{2n}(z),$$
yn = cos nx, n = n2 , n = 0, 1, 2, .
The orthogonality condition is more difficult to establish in this case. First consider I0n ,
Z Z
I0n = dx sinh 0 x sin n x = i dx sin i0 x sin n x
0 0
Z
sin(n i0 )
= i= dx cos(n i0 )x = i=
0 n i0
i
= = ( n + i 0 ) sin n cos i 0 cos n sin i 0 ,
n2 + 02
where =(z) is the imaginary part of z. Using the definitions of k we see that the term
in the outer brackets is real, and hence I0n = 0.
For n, m 6= 0 we have
Z
1
Z
Inm = dx sin n x sin m x = dx (cos(n m )x cos(n + m )x)
0 2 0
1 sin(n m ) sin(n + m )
=
2 n m n + m
m sin n cos m n cos n sin m
=
n2 m2
n m
= cos n cos m cos n cos m .
n2 m
2
If $n \neq m$ this is zero. The case $n = m$ can be obtained from this using L'Hôpital's rule. Alternatively,
Z
2 1 2
1 1
Inn = dx sin n x = cos n =
0 2 2 1 + n2
so Inn /2 as n .
Finally
1 1 1
Z
2
I00 = dx sinh 0 x = cosh2 0 = .
0 2 2 1 02
giving
2 X (1)k
sin k + 12 x.
x= 1 2
(k + 2 )
k=0
since
Z
1 1
Z
dx sinh2 0 x = cosh2 0 ,
dx x sinh 0 x = ( 1) cosh 0 ,
0 0 0 2
where the definition sinh 0 = 0 cosh 0 has been used. For n 1, we use the
results
Z Z
1 1
dx sin2 n x = cos2 n
dx x sin n x = ( 1) cos n ,
0 n 0 2
to obtain
R
dx x sin n x 2( 1) cos n
an = R0 2 = n = 1, 2, 3 ,
0
dx sin n x n ( cos2 n )
giving
2( 1) cosh 0 X cos k sin k x
x= 2 sinh 0 x 2( 1) 2
.
0 ( cosh 0 ) k ( cos k )
k=1
which is zero only if A1 A2 +B1 B2 = 0, that is the vectors a = (A1 , A2 ) and b = (B1 , B2 )
are orthogonal.
Figure 12.14 Graphs of the functions u = sin and u = a for a = 1/5 and 1/10.
For > 0 these curves intersect if < c ' 1/a, giving real zeros; for > c the zeros
are complex. There are about N ' 1/a zeros because there is one zero every time
passes through an integer. Hence there are a finite number of real zeros.
Consider the inner product of two distinct eigenfunctions, yi and yj with j > i.
Z
1
Z
Iij = dx sin i x sin j x = dx (cos(j i )x cos(j + i )x)
0 2 0
1 sin(j i ) sin(j + i )
= sin k = ak , k = i, j
2 j i j + i
a j cos i i cos j j cos i + i cos j
=
2 j i j + i
aj i
= (cos i cos j ) .
j2 i2
It is obvious that cos j + cos i 6= 0, but this is easily proved. We note that
and since
cos j cos i = 2 sin(j i ) sin(j + i ) ,
2 2
and hence g 00 + qg = 0.
f 00 + p1 f 0 + p0 f = 0 and g 00 + p1 g 0 + p0 g = 0.
$$p_1 = \frac{xa^2}{1 - ax} \quad\text{and}\quad p_0 = -\frac{a^2}{1 - ax}$$
giving the equation $(1 - ax)y'' + xa^2y' - a^2y = 0$ which has a singular point at $x = 1/a$.
d2
q 1
f () = A 2
w d A
is continuous and hence has a minimum value Qm . If < Qm then f () + < 0 for
all and the result proved in the text shows that any solution v() has at most one
zero, so cannot satisfy the boundary conditions. Hence there are no eigenvalues smaller
than Qm .
1 d2 y
+ ay = 0 that is v 00 () + a3 v() = 0.
a2 d 2
(c) Suppose the solution of y 00 + xy = 0 with the condition y(0) = 0 is v(x); then
v(x) = y(1/3 x) where y(x) is a solution of y 00 + xy = 0.
If = rn3 then v(1) = y(rn ) = 0, so y(rn x) is an eigenfunction with eigenvalue n = rn3 ,
and there are infinitely many of these.
There are no negative eigenvalues because y(0) = 0 and there can be no other zeros.
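$$\begin{aligned}
v(Lu) - u(Lv) &= v\left[\frac{d}{dx}\left(p\frac{du}{dx}\right) + qu\right] - u\left[\frac{d}{dx}\left(p\frac{dv}{dx}\right) + qv\right] \\
&= v\left[\frac{dp}{dx}\frac{du}{dx} + p\frac{d^2u}{dx^2}\right] - u\left[\frac{dp}{dx}\frac{dv}{dx} + p\frac{d^2v}{dx^2}\right] \\
&= \frac{dp}{dx}\left(v\frac{du}{dx} - u\frac{dv}{dx}\right) + p\left(v\frac{d^2u}{dx^2} - u\frac{d^2v}{dx^2}\right) \\
&= \frac{dp}{dx}\left(v\frac{du}{dx} - u\frac{dv}{dx}\right) + p\frac{d}{dx}\left(v\frac{du}{dx} - u\frac{dv}{dx}\right) \\
&= \frac{d}{dx}\left[p\left(v\frac{du}{dx} - u\frac{dv}{dx}\right)\right].
\end{aligned}$$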
and hence 2 2
p1 ba q2 p2 ba q1
n .
w2 n w2 w1 n w1
Also
dR 1 Q0 2 1 Q0 0 2
2R = 2vv 0 Q1/2 + v + 2v 0 00 1/2
v Q v
d 2 Q1/2 2 Q3/2
1 Q0 2
cos2 sin2
= R and hence
2Q
d 1 Q0
ln R = cos 2.
d 4Q
(b) Write Q(, ) = Q0 () + where Q0 () = q/w A(1/A)00 , is independent of , so
d 1 Q00
= ( + Q0 )1/2 sin 2
d 4 + Q0
d Q00
ln R = cos 2.
d 4( + Q0 )
If max(Q0 ) we may expand in powers of 1 ,
1 Q00
d Q0 Q0
= 1+ + 1 + sin 2
d 2 4
= + O(1/2 ),
and
d Q0
ln R = 0 cos 2 + O(2 ).
d 4
Hence an approximation accurate to the lowest order is
() = and R = r,
for some constants and r. Hence
r
v() = 1/4
cos
( + Q0 ())
and
since y(a) = y(b) = 0, we set = /2 to satisfy the condition at x = a ( = 0) and
(b) = n to satisfy the condition at x = b, to obtain the approximate eigenvalue
2 Z b r
n w r n
n = , (b) = dx , with eigenfunction vn () = sin .
(b) a p Q(, n )1/4 (b)
so that
$$S(a) = \frac{19}{60}a^2 - \frac{17}{30}a + \frac{7}{6} \quad\text{and}\quad S'(a) = \frac{19}{30}a - \frac{17}{30}.$$
The stationary point is at $a = 17/19$ and hence the approximate solution is
$$z = 1 - \frac{17}{19}x - \frac{2}{19}x^2.$$
In the interval [0, 1] the largest difference between this approximation and the numeri-
cally generated solution is 0.0012.
since $y(0) = y(1) = 0$ the boundary term vanishes and $\lambda C[y] = S[y]$. Putting $y = y_n$ gives the result.
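The functional is
$$\begin{aligned}
S[a] &= \int_{-1}^{1} dx\, (2ax)^2 - a^2\int_{-1}^{1} dx\, x^{2p}(1 - x^2)^2 \\
&= 8a^2\int_0^1 dx\, x^2 - 2a^2\int_0^1 dx\, x^{2p}(1 - 2x^2 + x^4) \\
&= a^2\left[\frac{8}{3} - 2\left(\frac{1}{2p+1} - \frac{2}{2p+3} + \frac{1}{2p+5}\right)\right].
\end{aligned}$$
Hence, with $a^2 = 15/16$ from the constraint,
$$\lambda_1 \le S(a) = \frac{5}{2}\left[1 - \frac{6}{(2p+1)(2p+3)(2p+5)}\right].$$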
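The functional is
$$S(a_1) = a_1^2\int_0^1 dx \left( \frac{\pi^2}{4}\cos^2\frac{\pi x}{2} - x\sin^2\frac{\pi x}{2} \right) = a_1^2\left( \frac{\pi^2}{8} - \frac{1}{4} - \frac{1}{\pi^2} \right).$$
Hence
$$\lambda_1 \simeq S(a_1) = \frac{\pi^2}{4} - \frac{1}{2} - \frac{2}{\pi^2} = 1.76476.$$
Note that to 10 significant figures the value of the first eigenvalue is 1.762682254; it can be shown to be the first zero of $\mathrm{Ai}(-u)\,\mathrm{Bi}'(-1-u) - \mathrm{Bi}(-u)\,\mathrm{Ai}'(-1-u)$, where Ai and Bi are Airy functions.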
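With the two parameter trial function $z = a_1\sin(\pi x/2) + a_2\sin(3\pi x/2)$ the constraint gives
$$a_1^2\int_0^1 dx\, \sin^2\frac{\pi x}{2} + a_2^2\int_0^1 dx\, \sin^2\frac{3\pi x}{2} = \frac{1}{2}a_1^2 + \frac{1}{2}a_2^2 = 1$$
The functional is
$$\begin{aligned}
S(\mathbf{a}) &= \int_0^1 dx \left( \frac{\pi}{2}a_1\cos\frac{\pi x}{2} + \frac{3\pi}{2}a_2\cos\frac{3\pi x}{2} \right)^2 - \int_0^1 dx\, x\left( a_1\sin\frac{\pi x}{2} + a_2\sin\frac{3\pi x}{2} \right)^2 \\
&= \frac{\pi^2}{8}\left(a_1^2 + 9a_2^2\right) - \left(\frac{1}{4} + \frac{1}{\pi^2}\right)a_1^2 - \left(\frac{1}{4} + \frac{1}{9\pi^2}\right)a_2^2 + \frac{2}{\pi^2}a_1a_2 \\
&= \left(\frac{\pi^2}{8} - \frac{1}{4} - \frac{1}{\pi^2}\right)a_1^2 + \left(\frac{9\pi^2}{8} - \frac{1}{4} - \frac{1}{9\pi^2}\right)a_2^2 + \frac{2}{\pi^2}a_1a_2
\end{aligned}$$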
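where we have used the integrals quoted in the question. Thus the equation is
$$\begin{pmatrix} \dfrac{\pi^2}{4} - \dfrac{1}{2} - \dfrac{2}{\pi^2} & \dfrac{2}{\pi^2} \\[2ex] \dfrac{2}{\pi^2} & \dfrac{9\pi^2}{4} - \dfrac{1}{2} - \dfrac{2}{9\pi^2} \end{pmatrix}\mathbf{a} = \lambda\mathbf{a}$$
and the eigenvalues are given by the quadratic equation $\lambda^2 - 23.4489\lambda + 38.2261 = 0$ and the smallest root is 1.7627.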
Taking the lowest eigenfunction of the simpler problem, $z = a\cos(\pi x/2)$, to be the trial function the constraint gives
$$a^2\int_0^1 dx\, \cos^2\frac{\pi x}{2} = \frac{a^2}{2} = 1,$$
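(c) With the trial function $z = a\cos(n - 1/2)\pi x$, the constraint gives $a^2 = 2$ and the functional becomes
$$S(a) = a^2\pi^2(n - 1/2)^2\int_0^1 dx\, \sin^2(n - 1/2)\pi x - a^2b\int_0^1 dx\, \sin(\pi x/2)\cos^2(n - 1/2)\pi x$$
But
$$\sin(\pi x/2)\cos^2(n - 1/2)\pi x = \frac{1}{2}\sin(\pi x/2) + \frac{1}{4}\Big[\sin(2n - 1/2)\pi x - \sin(2n - 3/2)\pi x\Big],$$
so
$$\begin{aligned}
\int_0^1 dx\, \sin(\pi x/2)\cos^2(n - 1/2)\pi x &= \frac{1}{\pi} + \frac{1}{4\pi}\left[\frac{1}{2n - 1/2} - \frac{1}{2n - 3/2}\right] \\
&= \frac{1}{\pi}\left[1 - \frac{1}{(4n - 1)(4n - 3)}\right]
\end{aligned}$$
and hence
$$S(a) = \frac{1}{2}a^2\pi^2\left(n - \frac{1}{2}\right)^2 - \frac{a^2b}{\pi}\left[1 - \frac{1}{(4n - 1)(4n - 3)}\right], \quad\text{and since } a^2 = 2$$
$$\lambda_n \simeq \nu_n^2\pi^2 - \frac{2b}{\pi}\left[1 - \frac{1}{16\nu_n^2 - 1}\right], \quad \nu_n = n - \frac{1}{2}.$$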
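These $n$ linear equations for $\mathbf{a}$ can be written in the matrix form $M\mathbf{a} = \lambda\mathbf{a}$ where $M_{ij}$ is defined in the question.
(b) If $n = 1$ we have
$$M_{11} = \pi^2 - 2\int_0^1 dx\, x\sin^2\pi x = \pi^2 - \int_0^1 dx\, x(1 - \cos 2\pi x), \quad\text{but since } \int_0^1 dx\, x\cos 2\pi x = 0,$$
this gives $M_{11} = \pi^2 - 1/2$, in agreement with equation 12.89. For $n = 2$, since
$$\int_0^1 dx\, x\cos k\pi x = \frac{(-1)^k - 1}{(k\pi)^2}, \quad M_{12} = \frac{16}{9\pi^2}$$
and
$$M_{22} = 4\pi^2 - \int_0^1 dx\, x(1 - \cos 4\pi x) = 4\pi^2 - \frac{1}{2},$$
the matrix equation is
$$\begin{pmatrix} \pi^2 - \dfrac{1}{2} & \dfrac{16}{9\pi^2} \\[2ex] \dfrac{16}{9\pi^2} & 4\pi^2 - \dfrac{1}{2} \end{pmatrix}\mathbf{a} = \lambda\mathbf{a},$$
which is just equation 12.90.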
Z b
2 dx (py 0 )0 + (q + w)y h.
a
Using the class of variations with h(a) = h(b) = 0, we see that the Euler-Lagrange
equation,
d dy
p + (q + w)y = 0.
dx dx
must be satisfied by a stationary path. Further, since S = 0 for all admissible paths
the given boundary conditions must also be satisfied.
Using the constraint condition and the boundary conditions to replace y 0 (a) with y(a)
and y 0 (b) with y(b), this becomes
Z b
k = dx pyk0 2 qyk2 + p(b)y(b)2 p(a)y(a)2 = S[yk ].
a
so if v 0 /v = x/(1 x2 ), that is v = 1/ 1 x2 the equation becomes
d2 u
1 u
+ = 0.
dx2 1 x2 1 x2
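$$\frac{dy}{dx} = \frac{dy}{dt}\frac{dt}{dx} = \sqrt{q(x)}\,\frac{dy}{dt} \quad\text{and}\quad \frac{d^2y}{dx^2} = \frac{q'(x)}{2\sqrt{q}}\frac{dy}{dt} + q(x)\frac{d^2y}{dt^2}.$$
Hence the equation becomes
$$q\frac{d^2y}{dt^2} + \frac{q'(x)}{2\sqrt{q}}\frac{dy}{dt} + p_1\sqrt{q}\,\frac{dy}{dt} + qy = 0,$$
which, on dividing by $q$, is the required result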
f 00
00
2 g 2
+ 2(k) cosh 2u + 2(k) cos 2v = 0.
f g
f 00 g 00
+ 2(k)2 cosh 2u = a and 2(k)2 cos 2v = a
f g
where a is a constant. Hence the quoted equations. Since the points with coordinates
(u, v) and (u, v +2) are physically identical, g(v) must be 2-periodic, g(v +2) = g(v)
for all v.
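$$u(\xi) = \xi + \sum_{k=1}^{\infty} a_k\sin k\xi,$$
where
$$\begin{aligned}
a_k &= \frac{1}{\pi}\int_{-\pi}^{\pi} d\xi\, \big(u(\xi) - \xi\big)\sin k\xi = \frac{\epsilon}{\pi}\int_{-\pi}^{\pi} du\, \sin u\, \sin k(u - \epsilon\sin u)\,\frac{d}{du}(u - \epsilon\sin u) \\
&= \frac{\epsilon}{\pi}\int_{-\pi}^{\pi} du\, \sin u\,(1 - \epsilon\cos u)\sin k(u - \epsilon\sin u) = -\frac{\epsilon}{\pi k}\int_{-\pi}^{\pi} du\, \sin u\,\frac{d}{du}\cos k(u - \epsilon\sin u) \\
&= -\frac{\epsilon}{\pi k}\Big[\sin u\,\cos k(u - \epsilon\sin u)\Big]_{-\pi}^{\pi} + \frac{\epsilon}{\pi k}\int_{-\pi}^{\pi} du\, \cos u\,\cos k(u - \epsilon\sin u) \\
&= \frac{\epsilon}{\pi k}\int_{-\pi}^{\pi} du\, \cos u\,\cos k(u - \epsilon\sin u) \\
&= \frac{\epsilon}{2\pi k}\int_{-\pi}^{\pi} du\, \Big[\cos\big((k + 1)u - k\epsilon\sin u\big) + \cos\big((k - 1)u - k\epsilon\sin u\big)\Big] \\
&= \frac{\epsilon}{k}\big(J_{k+1}(k\epsilon) + J_{k-1}(k\epsilon)\big) = \frac{2}{k}J_k(k\epsilon).
\end{aligned}$$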
<0
If < 0, put = 2 , ( > 0), the general solution is y = A cosh x + B sinh x: the
boundary condition at x = 0 gives A = 0 and the boundary condition at x = gives
tanh = , > 0.
If > there are no real solutions of this equation the gradient of the left and right
hand sides at = 0 are, respectively, and , so if > , > tanh for > 0.
If < , the same reasoning shows that there is one real positive solution which we
denote by 0 .
>0
If > 0, put = 2 , ( > 0), the general solution is y = A cos x + B sin x: the
boundary condition at x = 0 gives A = 0 and the boundary condition at x = gives
tan = , > 0.
If > , the first positive solution, 0 is in (0, /2) and the solution k , is in the
interval (k, (k + 1/2)), k = 0, 1, .
If < , the first positive solution, 1 is in (/2, 3/2) with kk < (k + 1/2),
k = 1, 2, .
Thus we have the following,
if > the eigenvalues are k = k2 with k < < (k + 1/2), k = 0, 1, :
if = the function y = Bx is a solution for = 0 and all B:
if < then 0 = 02 and k = k2 with k < < (k + 1/2), k = 1, 2, .
Multiply the first and second rows by $p_0$ and $p_1$, respectively, and add to the third row to obtain
$$\frac{dW}{dx} = \begin{vmatrix} f & g & h \\ f' & g' & h' \\ -p_2f'' & -p_2g'' & -p_2h'' \end{vmatrix} = -p_2(x)W(x).$$
Hence
$$W(x) = W(a)\exp\left(-\int_a^x dx\, p_2(x)\right).$$
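This formula for the Wronskian is easy to test on a constant-coefficient example. The sketch below assumes, purely for illustration, a third-order equation whose characteristic roots are 1, 2 and $-4$, so that the three solutions are $e^{x}$, $e^{2x}$ and $e^{-4x}$ and $p_2 = -(1 + 2 - 4) = 1$; the names `ROOTS`, `wronskian` and `abel` are hypothetical.

```python
import math

ROOTS = (1.0, 2.0, -4.0)   # assumed characteristic roots r; solutions are exp(r x)
P2 = -sum(ROOTS)           # coefficient of y'' in y''' + p2 y'' + p1 y' + p0 y = 0

def wronskian(x):
    """3 x 3 determinant with rows (f, g, h), (f', g', h'), (f'', g'', h'')."""
    rows = [[r ** d * math.exp(r * x) for r in ROOTS] for d in range(3)]
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def abel(x, a=0.0):
    """W(a) exp(-integral of p2 from a to x) for the constant p2 above."""
    return wronskian(a) * math.exp(-P2 * (x - a))

for x in (0.5, 1.0, 2.0):
    print(x, wronskian(x), abel(x))
```

The two columns should agree to rounding error, since for this example the determinant is a Vandermonde factor multiplied by $e^{-x}$.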
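$$x\frac{dy}{dx} = \frac{dy}{dt} \quad\text{and}\quad x\frac{d}{dx}\left(x\frac{dy}{dx}\right) = \frac{d^2y}{dt^2},$$
and the equation becomes $y'' - y' + By = 0$. Putting $y = e^{pt}$ gives $p^2 - p + B = 0$ so that $2p = 1 \pm \sqrt{1 - 4B}$. If $4B > 1$ this gives the general solution
$$y = \sqrt{x}\,\Big[A\cos(\omega\ln x) + C\sin(\omega\ln x)\Big], \quad \omega = \tfrac{1}{2}\sqrt{4B - 1},$$
where $A$ and $C$ are arbitrary constants, which has infinitely many zeros for $x > 1$. If $4B < 1$ the general solution is
$$y = \sqrt{x}\,\left(Ax^q + Cx^{-q}\right), \quad q = \tfrac{1}{2}\sqrt{1 - 4B},$$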
d2 u + 14
1
I(x) = 2 (1 + 4) giving + u = 0 with y = u/ x.
4x dx2 x2
Comparing with equation 12.36 (page 494) we see that q and w are continuous only if
x 6= 0, so this system is regular provided the interval does not contain the origin.
(c) Put x = et , so 0 < t < and
du du d2 u d2 u du
x = , x2 2
= 2 +
dx dt dx dt dt
and the equation for u becomes u00 (t) + u0 (t) + ( + 1/4)u = 0. Putting u = ept gives
p2 + p + ( + 1/4) = 0 and hence the general solution is
u = et/2 Ae t + Be t ,
= 2 , > 0,
(Ax + Bx )
y = (A B ln x) , = 0,
i ln x i ln x
= 2 , > 0.
Ae + Be
(i) < 0: the solution is bound at the origin only if B = 0, so y = Ax giving y(0) = 0
and y(1) = A. Hence there are no nontrivial solutions.
(ii) = 0: In the case the bound solutions are y = A: if c 6= 0, the solution is y = c,
with eigenvalue = 0.
(iii) > 0: the solution is not defined at the origin for any A or B, except A = B = 0.
A B ln a = 0, A B ln b = 0 = A = B = 0.
Aei ln a + Aei ln a = 0
= e2i ln(b/a) = 1,
Aei ln b + Aei ln b = 0
n
hence n = n2 , n = , and yn = c sin (n ln(x/a)), for some constant c.
ln(b/a)
f 00 f g 00 2f g 0 2 2f 0 g 0
v 00 = 2 + , but f 0 = qf, g 0 = qg,
g g g3 g2
2g 0 0 0 2g 0 0
= (f g f g) = v.
g3 g
Hence
2g 0 00 g0 2
00
000 g
v = v 2 2 v0 ,
g g g
0 2 2
v 000 g v 000 3 v 00
= 6 2q, hence = 2q.
v0 g v0 2 v0
af + bg av + b
v= = and S(v) = 2q = S(v).
cf + dg cv + d
References
Books and articles referred to in the text
Akhiezer N I 1962 The Calculus of Variations, (Blaisdell Publishing Company, translated from Russian by A H Frink)
Apostol T M 1963 Mathematical Analysis: A Modern Approach to Advanced Calculus,
(Addison-Wesley)
Arnold V I 1973 Ordinary Differential Equations, (The MIT press)
Ashby A, Brittin W E, Love W F and Wyss W, 1975 Brachistochrone with Coulomb Friction, Amer J Physics 43 902-5.
Aughton P 2001 Newton's Apple, (Weidenfeld and Nicolson)
Bernstein S N 1912 Sur les équations du calcul des variations, Ann. Sci. École Norm. Sup. 29 431-485
Birkhoff G and Rota G-C 1962 Ordinary differential equations (Blaisdell Publishing
Co.)
Brunt, van B 2004 The Calculus of Variations, (Springer)
Courant R and Hilbert D 1937a Methods of Mathematical Physics, Vol 1 (Interscience
Publishers Inc)
Courant R and Hilbert D 1937b Methods of Mathematical Physics, Vol 2 (Interscience
Publishers Inc)
Gelfand I M and Fomin S V 1963 Calculus of Variations, (Prentice Hall, translated
from the Russian by R A Silverman), reprinted 2000 (Dover)
Goldstine H H 1980 A History of the Calculus of Variations from the 17th through the 19th Century, (Springer, New York)
Green G 1838 On the Motion of Waves in a variable Canal of small Depth and Width,
Camb Phil Soc, Vol VI, part III
Ince E L 1956 Ordinary differential equations (Dover)
Isenberg C 1992 The Science of Soap Films and Soap Bubbles, (Dover)
Jeffrey A 1990 Linear Algebra and Ordinary Differential Equations (Blackwell Scientific
Publications)
Kolmogorov A N and Fomin S V 1975 Introductory Real Analysis, (Dover)
Landau L D and Lifshitz E M 1959 Fluid mechanics, (Pergamon)
Lützen J 1990 Joseph Liouville (1809-1882): Master of Pure and Applied Mathematics (Springer-Verlag)
Prandtl L 1904 Über Flüssigkeitsbewegung bei sehr kleiner Reibung, Verhandlungen des III. internationalen Mathematiker-Kongresses, Heidelberg, 1904
Rudin W 1976 Principles of Mathematical Analysis, (McGraw-Hill)
Schlichting H 1955 Boundary Layer Theory, (McGraw-Hill, New York)
Smith G E 2000 Fluid Resistance: Why Did Newton Change His Mind? Published in
The Foundations of Newtonian Scholarship, Eds R H Dalitz and M Nauenberg, (World
Scientific)
Sutherland W A 1975 Introduction to Metric and Topological Spaces, (Oxford University
Press)
Troutman J L 1983 Variational Calculus with Elementary Convexity, (Springer-Verlag)
Watson G N 1965 A Treatise on the Theory of Bessel Functions (Cambridge University
Press), first published in 1922.
Whittaker E T and Watson G N 1965 A Course of Modern Analysis, (Cambridge
University Press)
Yoder J G 1988 Unrolling Time, (Cambridge University Press)
Yourgrau W and Mandelstam S 1968 Variational Principles in Dynamics and Quantum Theory (Pitman)