Variable End Point

Chapter 9
Variable end points
9.1 Introduction
The functionals considered previously all involve fixed end points, that is the inde-
pendent variable is defined on a given interval at the ends of which the value of the
dependent variable is known. It is not hard to find variational problems with different
types of boundary conditions: in this introduction we describe a few of these problems
in order to motivate the analysis described here and in chapter 11.
The simplest generalisation is to natural boundary conditions in which the interval
of integration is given, but the value of the path at either one or both ends is not
given but needs to be determined as part of the variational principle. An example is a
stationary, loaded, stiff beam, which adopts a configuration that minimises its energy.
If the unloaded beam is horizontal along the x-axis, between x = 0 and L, and y(x)
represents the displacement, assumed small, the bending energy is proportional to the
square of its curvature, which for small $|y'|$ is approximately $y''(x)^2$; then if $\rho(x)$ is the load per unit
length the energy functional can be shown to be
$$E[y] = \int_0^L dx \left( \tfrac{1}{2}\kappa\, y''^2 - g\rho(x)\,y \right), \tag{9.1}$$
where $\kappa$ is a positive constant and g the acceleration due to gravity. Note that here y(x)
is positive for displacements below the x-axis. The Euler-Lagrange equation for this
functional is a linear, fourth order equation, see section 9.2.1, so requires four boundary
conditions.
If the beam is clamped horizontally at x = 0, there are just two boundary conditions,
$y(0) = y'(0) = 0$, though experience shows that this problem has a unique solution.
It transpires that the other two conditions, needed to determine this solution of the
Euler-Lagrange equation, can be derived directly from the variational principle that
requires E[y] to be stationary.
Alternatively, if the beam is simply supported at both ends, giving the boundary
conditions y(0) = y(L) = 0, it can be shown that the remaining two boundary conditions
are also obtained by insisting that E[y] is stationary. We explore this problem in
section 9.2.1.
The first person to generalise boundary conditions was Newton in his investigations
of the motion of an axially symmetric body through a resisting medium, see equa-
tion 2.22 (page 96).
The brachistochrone problem was generalised by John Bernoulli in 1697, by allowing
the lower end of the stationary path to move on a given curve, defined by an equation
of the form $\tau(x, y) = 0$. In figure 9.1 we show an example where the right end of the
brachistochrone lies on the straight line defined by $\tau(x, y) = x + y - 1 = 0$, and the
left end is fixed at (0, A), with A < 1. In figure 9.1 are shown the brachistochrones
for various values of A when the particle starts at rest at (0, A). The equation for the
stationary paths is derived in exercise 9.13. Notice that the cycloid intersects the curve
$\tau(x, y) = 0$ at right angles and at x = 0 the gradient of the cycloid is infinite.
[Figure: cycloid segments in the (x, y) plane, both axes running from 0 to 1, ending on the line $\tau(x, y) = x + y - 1 = 0$.]
Figure 9.1 Diagram showing stationary paths through the point (0, A), for
A = 0.2, 0.5 and 0.9, and (v, y(v)) where the right end is constrained to lie on the
straight line $\tau(x, y) = x + y - 1 = 0$, and the particle starts from rest at (0, A).
In this case the functional is, see equation 4.5 (page 166),
$$T[y] = \int_0^v dx\, \sqrt{\frac{1 + y'^2}{2E/m - 2gy}}, \quad y(0) = A, \quad \tau(v, y(v)) = 0, \tag{9.2}$$
where A is known, but v and y(x), $0 \le x \le v$, need to be determined. The actual
stationary path is clearly a cycloid, but which, of the infinitely many cycloids through
these points, needs to be determined by an additional equation for v. In Bernoulli's
original formulation the curve $\tau(x, y) = 0$ was a vertical line through a given point,
that is the value of v is fixed, but y(v) is unknown; in this case $\tau(x, y) = x - v$.
Exercise 9.1
Explain why the stationary curves depicted in figure 9.1 are cycloids.
Many fixed end point problems can be modified in this manner. For instance a variation
of the catenary problem, described in section 2.5.6 (page 97) is given by an inelastic rope
hanging between two curves, defined by $\tau_1(x, y) = 0$ and $\tau_2(x, y) = 0$, on which the ends may
slide without hindrance, as shown in figure 9.2: the curve AB is a catenary, but now we
also need to determine the positions of A and B. Another example is a cable hanging
between two points A and B between which a weight of mass M is attached at a given
point C, with the distances AC and CB along the curve known, see figure 9.3. The
segments AC and CB will be catenaries but the gradient at C will be discontinuous.
Both these problems involve constraints, so are dealt with in chapter 11.
[Figure: a catenary AB hanging between the curves $\tau_1(x, y) = 0$ and $\tau_2(x, y) = 0$.]
Figure 9.2 Diagram of a rope hanging between the two curves defined by $\tau_k(x, y) = 0$, k = 1, 2, on which it can slide freely.
[Figure: a rope hanging between the points A and B, with a weight Mg attached at the point C.]
Figure 9.3 Diagram of a rope hanging between two given points, A and B, and with a weight firmly attached at a given point of the rope.
9.2 Natural boundary conditions
In this section we develop the theory for a particularly simple type of free boundary,
because this illustrates the method in the clearest manner. The ideas used for the more
general case are similar, but the algebra is more complicated. Here the interval [a, b]
and the value of the path at x = a are given, but the value of y(b) is to be determined.
Thus the functional is
$$S[y] = \int_a^b dx\, F(x, y, y'), \quad y(a) = A, \tag{9.3}$$
and both y(x) and y(b) need to be chosen to ensure that S[y] is stationary, as shown
schematically in figure 9.4, where the stationary and a varied path are depicted. This
problem differs from the general case, treated later, in that the value of x at the right-
hand end is given. This type of boundary condition is known as a natural condition or
a natural boundary condition because the value of y(b) is not imposed, but is defined
by the variational principle.
The admissible paths all pass through (a, A); the right end is constrained to lie on
the line x = b, but the actual position on this line needs to be determined. If $y + \epsilon h$
are admissible paths then h(a) = 0, but h(b) need not be zero.
[Figure: the stationary path y(x) and a varied path $y(x) + \epsilon h(x)$, both starting at (a, A) and ending on the line x = b at y(b) and $y(b) + \epsilon h(b)$ respectively.]
Figure 9.4 Diagram showing the stationary path, the solid curve, and a varied path,
the dashed curve, for a problem in which the left end is fixed, but the other end is free
to move along the line x = b, parallel to the y-axis, so y(b) needs to be determined.
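The Gateaux differential of the functional is given by equation 3.9 (page 125), that is
$$\Delta S[y, h] = \int_a^b dx \left( h(x)\frac{\partial F}{\partial y} + h'(x)\frac{\partial F}{\partial y'} \right). \tag{9.4}$$
As before we integrate the second term by parts: using the fact that h(a) = 0 this
gives
$$\Delta S[y, h] = h(b)\left.\frac{\partial F}{\partial y'}\right|_{x=b} - \int_a^b dx \left( \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} \right) h(x). \tag{9.5}$$
This is the equivalent of equation 3.10 (page 125) but now the boundary term is not
automatically zero.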
For a stationary path $\Delta S[y, h] = 0$ for all h(x) and because the allowed variations
include those functions for which h(b) = 0 the stationary paths must satisfy the
Euler-Lagrange equation
$$\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} = 0, \quad y(a) = A, \tag{9.6}$$
with only one boundary condition (see footnote 1). The general solution of this equation will contain
one arbitrary constant c, so we write the solution as y(x, c). Because y(x, c) satisfies
equation 9.6, the Gateaux differential becomes $\Delta S = h(b)F_{y'}(x, y, y')|_{x=b}$ and because
this must be zero for all h(b), the solution of the Euler-Lagrange equation must satisfy
the boundary condition
$$F_{y'}\left(b, y(b, c), y'(b, c)\right) = 0, \tag{9.7}$$
which determines possible values of c, and hence the stationary paths. Equation 9.7 is
the natural boundary condition.
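The recipe can be tried on a small problem. The sketch below is an illustration only: the integrand $F = \tfrac{1}{2}y'^2 - y$ on [0, 1] with y(0) = 0 is a hypothetical choice, not one taken from the text. Its Euler-Lagrange equation is $y'' = -1$, so $y(x, c) = cx - x^2/2$, and the natural boundary condition $F_{y'} = y'(1) = 0$ should select c = 1. The code evaluates S on this one-parameter family and locates the stationary value of c numerically.

```python
# Hypothetical example (an assumption for illustration): F = y'^2/2 - y on [0, 1], y(0) = 0.
# Euler-Lagrange solutions through (0, 0): y(x, c) = c*x - x**2/2.

def simpson(f, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def action(c):
    """S[y] evaluated on the Euler-Lagrange solution y(x, c)."""
    y = lambda x: c * x - 0.5 * x * x
    dy = lambda x: c - x
    return simpson(lambda x: 0.5 * dy(x) ** 2 - y(x), 0.0, 1.0)


def stationary_c(lo=-5.0, hi=5.0, tol=1e-10):
    """Bisection on a central-difference estimate of dS/dc."""
    dS = lambda c, e=1e-5: (action(c + e) - action(c - e)) / (2 * e)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dS(lo) * dS(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


c_star = stationary_c()
natural_bc = c_star - 1.0   # F_{y'} at x = 1 equals y'(1) = c - 1
print(c_star, natural_bc)
```

On this family $S = \tfrac{1}{2}c^2 - c + \tfrac{1}{3}$, so the stationary value of c found by the search should agree with the one fixed by equation 9.7.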
As an example consider the brachistochrone problem, studied in section 4.2. It is
convenient to use the dependent variable $z(x) = A - y(x)$, defined in equation 4.6
(page 166), and as before, we suppose that the initial velocity is zero, $v_0 = 0$. Then the
functional may be taken to be (see footnote 2)
$$T[z] = \int_0^b dx\, \sqrt{\frac{1 + z'^2}{z}}, \quad z(0) = 0. \tag{9.8}$$
Footnote 1: This derivation assumes that there exists at least a one parameter family of variations, h(x), such
that h(a) = h(b) = 0, which is always the case for the problems we consider.
Footnote 2: For convenience we ignore the factor $(2g)^{-1/2}$, which does not affect the Euler-Lagrange equation.
The Euler-Lagrange equation is the same as in the previous discussion and, because the
functional does not depend explicitly upon x, it reduces to the first-order equation 4.7,
having the solution, see equation 4.8 (page 167),
$$x = \tfrac{1}{2}c^2(2\phi - \sin 2\phi), \quad z = c^2\sin^2\phi, \tag{9.9}$$
where we have set d = 0, because z = 0 when x = 0; the boundary condition at x = b
determines the value of c. For future reference we note that
$$\frac{dz}{dx} = \frac{dz}{d\phi}\Big/\frac{dx}{d\phi} = \frac{2\sin\phi\cos\phi}{1 - \cos 2\phi} = \frac{1}{\tan\phi} \quad\text{because}\quad \cos 2\phi = 1 - 2\sin^2\phi.$$
At x = b this solution must satisfy the boundary condition 9.7 which, for this
problem, becomes
$$F_{z'} = \frac{z'}{\sqrt{z(1 + z'^2)}} = 0.$$
But z is bounded so the only solution is $z' = 0$, and since $z' = 1/\tan\phi$, this gives
$\phi = \pi/2$, and means that the cycloid intersects the vertical line through x = b
orthogonally, see figure 9.5. But at the right end x = b, so $2b = \pi c^2$, which gives the
solution
$$x = \frac{b}{\pi}(2\phi - \sin 2\phi), \quad z = \frac{2b}{\pi}\sin^2\phi, \quad 0 \le \phi \le \frac{\pi}{2}. \tag{9.10}$$
The shape of this curve depends only upon b, rather than both A and b as in the
conventional problem. Here the value of A merely changes the vertical displacement of
the whole curve. It is therefore convenient to set $A = 2b/\pi$, and then the dependence
upon b becomes a change of scale, seen by setting $\bar{x} = \pi x/b$ and $\bar{y} = \pi y/b$ to give
$$\bar{x} = 2\phi - \sin 2\phi, \quad \bar{y} = 2\cos^2\phi, \quad 0 \le \phi \le \frac{\pi}{2}. \tag{9.11}$$
The graph of this scaled solution is shown in figure 9.5.
[Figure: graph of $\bar{y}$ against $\bar{x}$, with $\bar{x}$ running from 0 to about 3 and $\bar{y}$ from 0 to 2.]
Figure 9.5 Graph showing the cycloid defined in equation 9.11,
where $\bar{x} = \pi x/b$ and $\bar{y} = \pi y/b$.
The time of passage is also independent of A, and is given by the simple formula
$T(b) = \sqrt{\pi b/g}$, a result derived in exercise 9.6.
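The passage time can be checked numerically. The following is a minimal sketch, assuming the cycloid of equation 9.10 in the form $x = (b/\pi)(2\phi - \sin 2\phi)$, $z = (2b/\pi)\sin^2\phi$ with $0 \le \phi \le \pi/2$, and a particle starting from rest so that its speed is $\sqrt{2gz}$. The function name `passage_time` and the default value of g are choices made here for illustration.

```python
import math


def passage_time(b, g=9.81, n=2000):
    """Time along the cycloid from phi = 0 to pi/2, by the midpoint rule.

    Integrates dt = ds / sqrt(2 g z) with the parameter phi as the
    integration variable.
    """
    c2 = 2.0 * b / math.pi                      # c^2 fixed by x(pi/2) = b
    dphi = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        dx = c2 * (1.0 - math.cos(2 * phi))     # dx/dphi
        dz = c2 * math.sin(2 * phi)             # dz/dphi
        z = c2 * math.sin(phi) ** 2             # depth below the start point
        total += math.hypot(dx, dz) / math.sqrt(2 * g * z) * dphi
    return total


b = 1.0
print(passage_time(b), math.sqrt(math.pi * b / 9.81))
```

The two printed numbers should agree closely, because along this cycloid the integrand is constant in $\phi$.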
Exercise 9.2
Write down a functional for the distance between the point (0, A) and the line
x = X > 0, parallel to the y-axis. Show that the stationary path is the straight
line through (0, A) and parallel to the x-axis.
Exercise 9.3
Find the stationary path of the functional
$$S[y] = \int_0^{\pi/4} dx \left( y'^2 - y^2 \right), \quad y(0) = A > 0,$$
where the right-hand end of the path lies on the line $x = \pi/4$.
Exercise 9.4
Show that the functional $S[y] = \int_a^b dx\, F(x, y, y')$, y(b) = B, with the left end
of the path constrained to the line x = a, is stationary on the solution of the
Euler-Lagrange equation,
$$\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} = 0, \quad F_{y'}(x, y, y')\Big|_{x=a} = 0, \quad y(b) = B.$$
Exercise 9.5
Find the stationary path for the functional
$$S[y] = \int_0^1 dx \left( y'^2 + y^2 \right), \quad y(1) = B > 0,$$
with the left end of the path constrained to the y-axis.
Exercise 9.6
Show that the time to traverse the curve 9.10 is $T(b) = \sqrt{\pi b/g}$.
Hint use equation 9.8, but remember the factor $(2g)^{-1/2}$.
Exercise 9.7
The navigation problem defined in section 2.5.4 gives rise to the functional
$$T[y] = \int_0^b dx\, F(x, y'), \quad F(x, y') = \frac{\sqrt{c^2(1 + y'^2) - v^2} - vy'}{c^2 - v^2},$$
for the time to cross a river. The start point is at the origin so y(0) = 0, but the
terminus is, in this version of the problem, undefined so the boundary condition
at x = b is a natural boundary condition. Assuming that v(x) 0 show that the
stationary path is given by
$$y(x) = \frac{1}{c}\int_0^x du\, v(u).$$
c 0
Exercise 9.8
This exercise is important because it uses the method introduced in this section
to extend the range of boundary conditions that can be described by functionals.
(a) Show that the Euler-Lagrange equation for the functional
$$S[y] = \int_a^b dx \left( y'(x)^2 - y(x)^2 \right), \quad y(a) = A, \quad y(b) = B,$$
is $y'' + y = 0$, y(a) = A, y(b) = B.
(b) Second-order equations of the above form occur frequently, but the boundary
conditions are sometimes different, involving linear combinations of y and $y'$. Thus
a typical equation is
$$\frac{d^2y}{dx^2} + y = 0, \quad g_a y(a) + y'(a) = 0, \quad g_b y(b) + y'(b) = 0, \tag{9.12}$$
where $g_a$ and $g_b$ are constants.
Show, from first principles, that the functional
$$S[y] = g_b\, y(b)^2 - g_a\, y(a)^2 + \int_a^b dx \left( y'(x)^2 - y(x)^2 \right)$$
is stationary on the path that satisfies equation 9.12, for all $g_a$ and $g_b$.
9.2.1 Natural boundary conditions for the loaded beam

In this section we discuss functionals such as those for the energy of the loaded beam,
equation 9.1, which contain the second derivative, $y''$, so the associated Euler-Lagrange
equation is fourth-order, see equation 9.16 below. We start with the general functional
$$S[y] = \int_a^b dx\, F(x, y, y', y''), \quad y(a) = A, \quad y'(a) = A',$$
with natural boundary conditions at x = b. The derivation provided is brief because it
is similar to previous analysis. The Gateaux differential is
$$\Delta S[y, h] = \int_a^b dx \left( hF_y + h'F_{y'} + h''F_{y''} \right). \tag{9.13}$$
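Integration by parts gives
$$\int_a^b dx\, h'F_{y'} = \Big[ hF_{y'} \Big]_a^b - \int_a^b dx\, h\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right),$$
$$\int_a^b dx\, h''F_{y''} = \Big[ h'F_{y''} \Big]_a^b - \left[ h\frac{d}{dx}\left(\frac{\partial F}{\partial y''}\right) \right]_a^b + \int_a^b dx\, h\frac{d^2}{dx^2}\left(\frac{\partial F}{\partial y''}\right),$$
so the Gateaux differential can be cast into the form
$$\Delta S[y, h] = \left[ h\left( \frac{\partial F}{\partial y'} - \frac{d}{dx}\left(\frac{\partial F}{\partial y''}\right) \right) + h'\frac{\partial F}{\partial y''} \right]_a^b + \int_a^b dx \left( \frac{d^2}{dx^2}\left(\frac{\partial F}{\partial y''}\right) - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) + \frac{\partial F}{\partial y} \right) h. \tag{9.14}$$
In this example $h(a) = h'(a) = 0$ but there are no conditions on h(b). Hence $\Delta S$
reduces to
$$\Delta S[y, h] = h(b)\left( \frac{\partial F}{\partial y'} - \frac{d}{dx}\left(\frac{\partial F}{\partial y''}\right) \right)\bigg|_{x=b} + h'(b)\,\frac{\partial F}{\partial y''}\bigg|_{x=b} + \int_a^b dx \left( \frac{d^2}{dx^2}\left(\frac{\partial F}{\partial y''}\right) - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) + \frac{\partial F}{\partial y} \right) h. \tag{9.15}$$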
On a stationary path this must be zero for all allowed h(x). A subset of varied paths
has $h(b) = h'(b) = 0$ and hence the stationary path must satisfy the Euler-Lagrange
equation
$$\frac{d^2}{dx^2}\left(\frac{\partial F}{\partial y''}\right) - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) + \frac{\partial F}{\partial y} = 0, \quad y(a) = A, \quad y'(a) = A'. \tag{9.16}$$
The solution of this equation contains two arbitrary constants. Now consider those
varied paths for which h(b) = 0 and $h'(b) \ne 0$, and those for which $h(b) \ne 0$ and
$h'(b) = 0$, to see that the solutions of this Euler-Lagrange equation must also satisfy
the two extra boundary conditions,
$$F_{y''} = 0 \quad\text{and}\quad F_{y'} - \frac{d}{dx}F_{y''} = 0 \quad\text{at } x = b, \tag{9.17}$$
which determine the two constants in the solution of equation 9.16.
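As a concrete check of equations 9.16 and 9.17, the sketch below takes the beam integrand in the form $F = \tfrac{1}{2}\kappa y''^2 - g\rho y$ with constant $\kappa$ and $\rho$ (symbols and numerical values assumed here for illustration), for which $F_y = -g\rho$, $F_{y'} = 0$ and $F_{y''} = \kappa y''$. It verifies that the quartic quoted in exercise 9.10 satisfies the Euler-Lagrange equation, the clamped conditions at x = 0 and both natural conditions at x = L. Polynomials are stored as plain coefficient lists.

```python
def derivative(coeffs):
    """Differentiate a polynomial stored as [a0, a1, a2, ...] (ascending powers)."""
    return [k * a for k, a in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x."""
    return sum(a * x ** k for k, a in enumerate(coeffs))


# Illustrative values only (assumptions, not taken from the text).
L, kappa, rho, g = 2.0, 3.0, 0.5, 9.81
p = g * rho / (24.0 * kappa)

# Candidate path y = p * x^2 * (x^2 - 4 L x + 6 L^2).
y = [0.0, 0.0, 6.0 * L ** 2 * p, -4.0 * L * p, p]
y1 = derivative(y)
y2 = derivative(y1)
y3 = derivative(y2)
y4 = derivative(y3)

el_residual = kappa * evaluate(y4, 0.7) - g * rho   # equation 9.16 at a sample point
clamp = (evaluate(y, 0.0), evaluate(y1, 0.0))       # y(0) and y'(0)
bc_first = kappa * evaluate(y2, L)                  # F_y'' at x = L
bc_second = -kappa * evaluate(y3, L)                # F_y' - d/dx F_y'' at x = L
print(el_residual, clamp, bc_first, bc_second)
```

All four printed quantities should vanish to rounding error, which is what the two natural boundary conditions require at the free end.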
Exercise 9.9
Derive equation 9.13.
Exercise 9.10
For the functional defined in equation 9.1 (page 341) with $\rho$ = constant and the
boundary conditions $y(0) = y'(0) = 0$, use equations 9.16 and 9.17 to derive the
associated Euler-Lagrange equation and show that its solution is
$$y(x) = \frac{g\rho}{24\kappa}\, x^2 \left( x^2 - 4Lx + 6L^2 \right).$$
Exercise 9.11
(a) Show that the stationary paths of the functional
$$S[y] = \int_a^b dx\, F(x, y, y', y''), \quad y(a) = A, \quad y(b) = B,$$
satisfy the Euler-Lagrange equation
$$\frac{d^2}{dx^2}\left(\frac{\partial F}{\partial y''}\right) - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) + \frac{\partial F}{\partial y} = 0, \quad y(a) = A, \quad y(b) = B, \quad F_{y''}\Big|_a = F_{y''}\Big|_b = 0.$$
(b) Apply the result found in part (a) to the functional defined in equation 9.1
(page 341), with $\rho$ = constant and the boundary conditions y(0) = y(L) = 0, to
derive the associated Euler-Lagrange equation and show that its solution is
$$y(x) = \frac{g\rho}{24\kappa}\, x(L - x)\left( L^2 + xL - x^2 \right).$$
9.3 Variable end points

The theory for variable end points is similar to that described above, but is slightly
more complicated because the x-coordinate of the free end must also be determined.
Here we consider the case where the left end of the stationary path is known, and has
coordinates (a, A), but the right end is free to lie on a given curve, defined by the
equation $\tau(x, y) = 0$, as shown schematically in figure 9.6: we shall assume that $\tau_x$ and
$\tau_y$ are not simultaneously zero in the region of interest. Note that if $\tau = x - b$, the
equation $\tau = 0$ defines the line x = b parallel to the y-axis, which is the example dealt
with in the previous section.
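[Figure: the stationary path y(x) from (a, A) meeting the curve $\tau(x, y) = 0$ at x = v, and a varied path $y(x) + \epsilon h(x)$ meeting it at $x = v + \epsilon\xi$.]
Figure 9.6 Diagram showing the stationary path, the solid line, and a varied
path, the dashed curve, for a problem in which the left-hand end is fixed, but
the other end is free to move along the line defined by $\tau(x, y) = 0$.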
The functional is
$$S[y] = \int_a^v dx\, F(x, y, y'), \quad y(a) = A, \tag{9.18}$$
where the path y(x) and v need to be chosen to make the functional stationary.
Let $y(x) + \epsilon h(x)$ be an admissible varied path, so h(a) = 0. If x = v is the right-hand
terminal point of y(x), the terminal point of the varied path is at $x = v + \epsilon\xi$, for some
$\xi$, so the x and y coordinates of this point are,
$$x = v + \epsilon\xi \quad\text{and}\quad y = y(v + \epsilon\xi) + \epsilon h(v + \epsilon\xi) = y(v) + \epsilon\left( \xi y'(v) + h(v) \right) + O(\epsilon^2).$$
This point also lies on the constraining curve so, to first-order in $\epsilon$,
$$\tau\Big( v + \epsilon\xi,\; y(v) + \epsilon\left( \xi y'(v) + h(v) \right) \Big) = 0.$$
Expanding this to first-order in $\epsilon$, and remembering that $\tau(v, y(v)) = 0$, gives
$$\xi\left( \tau_x + y'(v)\tau_y \right) + h(v)\tau_y = 0, \tag{9.19}$$
which provides a relation between $\xi$ and h(v) that is needed later.
The Gateaux differential of the functional is computed using equation 3.5 (page 121),
in the normal manner, except that the upper limit of the integral now depends upon $\epsilon$.
Thus on the varied path
$$S[y + \epsilon h] = \int_a^z dx\, F(x, y + \epsilon h, y' + \epsilon h'), \quad z = v + \epsilon\xi, \tag{9.20}$$
so the derivative with respect to $\epsilon$ is given by equation 1.52, (page 45), with $b = z(\epsilon)$
so $dz/d\epsilon = \xi$,
$$\frac{dS}{d\epsilon} = \xi F\Big( z, y(z) + \epsilon h(z), y'(z) + \epsilon h'(z) \Big) + \int_a^z dx \left( h\frac{\partial F}{\partial y} + h'\frac{\partial F}{\partial y'} \right).$$
On putting $\epsilon = 0$, so z = v, we obtain the Gateaux differential
$$\Delta S[y, h] = \xi F(v, y(v), y'(v)) + \int_a^v dx \left( hF_y + h'F_{y'} \right). \tag{9.21}$$
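Now use integration by parts and the fact that h(a) = 0 to give
$$\int_a^v dx\, h'F_{y'} = \Big[ hF_{y'} \Big]_a^v - \int_a^v dx\, h\frac{d}{dx}\left(F_{y'}\right) = hF_{y'}\Big|_{x=v} - \int_a^v dx\, h\frac{d}{dx}\left(F_{y'}\right).$$
Hence the Gateaux differential, equation 9.21, becomes
$$\Delta S[y, h] = \Big[ \xi F + hF_{y'} \Big]_{x=v} - \int_a^v dx \left( \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} \right) h. \tag{9.22}$$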
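Finally we use equation 9.19 to express h(v) in terms of $\xi$ to arrive at the relation
$$\Delta S[y, h] = -\frac{\xi}{\tau_y}\Big[ F_{y'}\tau_x + (y'F_{y'} - F)\tau_y \Big]_{x=v} - \int_a^v dx \left( \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} \right) h. \tag{9.23}$$
On a stationary path $\Delta S[y, h] = 0$ for all allowed h. A subset of these variations will
have $\xi = 0$, consequently y(x) must satisfy the Euler-Lagrange equation,
$$\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} = 0, \quad y(a) = A. \tag{9.24}$$
On a path satisfying this equation the Gateaux differential reduces to
$$\Delta S[y, h] = -\frac{\xi}{\tau_y}\Big[ \tau_x F_{y'} + (y'F_{y'} - F)\tau_y \Big]_{x=v} \tag{9.25}$$
and this must also be zero for all $\xi$. Hence, the equation
$$\tau_x F_{y'} + \tau_y\left( y'F_{y'} - F \right) = 0, \quad x = v, \tag{9.26}$$
must be satisfied. This equation is the required boundary condition for the right-hand
end of the path and is named a transversality condition.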
In order to see how this works, consider the solution of equation 9.24, y(x, c), which
depends upon a single constant c, because there is only one boundary condition. By
substituting this into equation 9.26 we obtain an equation relating v and c. But the
right-hand end of the path satisfies the condition $\tau(v, y(v, c)) = 0$, and this gives another
relation between v and c: if these two equations can be solved for one or more real pairs
of v and c, stationary paths are obtained.
The derivation of equation 9.26 implicitly assumed that $\tau_y \ne 0$, see equation 9.23.
Suppose that on the stationary path $\tau_y = 0$, which means that at this point the curve
$\tau(x, y) = 0$ is parallel to the y-axis; then from equation 9.19 we see that $\xi = 0$, since we
assumed that $\tau_x$ and $\tau_y$ are not simultaneously zero. The boundary term of 9.22 then reduces
to $hF_{y'} = 0$, which means that $F_{y'} = 0$. Equation 9.26 also gives $F_{y'} = 0$ if $\tau_y = 0$ so it
is also valid in this exceptional case. Note that in this limit the transversality condition
reduces to the natural boundary condition of equation 9.7, which is also retrieved by
setting $\tau = x - b$ in equation 9.26.
The transversality condition can be written in an alternative form by noting that
if the equation $\tau(x, y) = 0$ defines a curve $y = g_2(x)$ then $g_2'(x) = -\tau_x/\tau_y$, and
equation 9.26 becomes
$$F + \left( g_2' - y' \right)F_{y'} = 0, \quad x = v. \tag{9.27}$$
This form of the transversality condition is not valid when $\tau_y = 0$, that is where $|g_2'(x)|$
is infinite.
If the left end of the path is also constrained to a prescribed curve, $\sigma(x, y) = 0$, then
a similar equation can be derived. In summary we have the following result.
Theorem 9.1
For the functional $S[y] = \int_u^v dx\, F(x, y, y')$ and the smooth curves $C_\sigma$ and $C_\tau$ defined by
the equations $\sigma(x, y) = 0$ and $\tau(x, y) = 0$, the continuously differentiable path joining
$C_\sigma$ and $C_\tau$, at x = u and x = v respectively, that makes S[y] stationary, satisfies the
Euler-Lagrange equation
$$\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} = 0 \tag{9.28}$$
and the boundary conditions
$$\Big[ \sigma_x F_{y'} + \sigma_y\left( y'F_{y'} - F \right) \Big]_{x=u} = 0 \quad\text{and}\quad \Big[ \tau_x F_{y'} + \tau_y\left( y'F_{y'} - F \right) \Big]_{x=v} = 0. \tag{9.29}$$
Either of these boundary conditions may be replaced by conventional boundary conditions.
As an example consider the functional
$$S[y] = \int_0^v dx\, f(y)\sqrt{1 + y'^2}, \quad y(0) = a, \tag{9.30}$$
with the right end of the path terminating on the curve $C_\tau$ defined by $\tau(x, y) = 0$. For
this functional a first-integral exists and is given by
$$F - y'F_{y'} = \frac{f(y)}{\sqrt{1 + y'^2}} = c = \text{constant}.$$
The transversality condition 9.26 then gives
$$\frac{\tau_x y' f(y)}{\sqrt{1 + y'^2}} - c\tau_y = 0 \quad\text{that is}\quad \tau_x y'(v) = \tau_y.$$
But the gradient of $C_\tau$ is $-\tau_x/\tau_y$ and hence at the terminal point the stationary path is
perpendicular to $C_\tau$.
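The perpendicularity result can be illustrated with the simplest case f(y) = 1, where the functional is arc length and the Euler-Lagrange solutions are straight lines. The sketch below is an assumption-laden illustration: it fixes the left end at (0, A) with A = 0.2, takes the terminal curve to be the line x + y - 1 = 0 (so both partial derivatives of its defining function equal 1), finds the end point v that minimises the length by a golden-section search, and then evaluates the left-hand side of equation 9.26 there. The names `length` and `golden_min` are introduced here only for this example.

```python
import math

A = 0.2  # assumed start height; the left end is fixed at (0, A)


def length(v):
    """Length of the straight path from (0, A) to (v, 1 - v) on x + y - 1 = 0."""
    return math.hypot(v, (1.0 - v) - A)


def golden_min(f, lo, hi, tol=1e-12):
    """Golden-section search for the minimiser of a unimodal f on [lo, hi]."""
    r = (math.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = hi - r * (hi - lo), lo + r * (hi - lo)
    while hi - lo > tol:
        if f(x1) < f(x2):
            hi, x2 = x2, x1
            x1 = hi - r * (hi - lo)
        else:
            lo, x1 = x1, x2
            x2 = lo + r * (hi - lo)
    return 0.5 * (lo + hi)


v = golden_min(length, 0.0, 1.0)
m = ((1.0 - v) - A) / v                 # slope y' of the shortest path
F = math.sqrt(1.0 + m * m)              # integrand with f(y) = 1
Fyp = m / F                             # F_{y'}
curve_x, curve_y = 1.0, 1.0             # partial derivatives of x + y - 1
transversality = curve_x * Fyp + curve_y * (m * Fyp - F)
print(v, m, transversality)
```

The slope found should be close to 1, the negative reciprocal of the slope of the terminal line, and the transversality expression should be close to zero. The search resolves v only to about the square root of machine precision because the length is flat near its minimum.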
Exercise 9.12
Find the stationary path of the functional
$$S[y] = \int_0^v dx\, \frac{\sqrt{1 + y'^2}}{y}, \quad y(0) = 0,$$
for a path terminating on the line $y = x - a$, a > 0.
Hint first show that the solutions of the Euler-Lagrange equation are circles
through the origin and with centres on the x-axis.
Exercise 9.13
Consider the brachistochrone in which the left end is fixed at (0, A) and the right
end is constrained to the curve x/a + y/b = 1, a, b > 0. Initially the particle is
stationary at (0, A).
Show that the equations of the stationary path are
$$x = \tfrac{1}{2}c^2(2\phi - \sin 2\phi), \quad y = A - c^2\sin^2\phi, \quad 0 \le \phi \le \phi_b = \pi - \tan^{-1}(b/a),$$
where c is given by the equation $c^2\phi_b = a(1 - A/b)$.
Graphs of this solution, for various values of A and a = b = 1, are shown in
figure 9.1 (page 342).
Exercise 9.14
Consider the ellipse and the straight line defined, respectively, by the equations
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \quad\text{and}\quad \frac{x}{A} + \frac{y}{B} = 1, \quad x > 0, \; y > 0,$$
in the first quadrant, where a, b, A and B are positive constants.
(a) Show that these curves do not intersect if $AB > \Delta$, where $\Delta^2 = A^2b^2 + B^2a^2$.
(b) Construct a functional for the distance between two points (u, v) on the ellipse,
and $(\xi, \eta)$ on the straight line, and show that the solution of the associated
Euler-Lagrange equation is the straight line y = mx + c. Show also that the values of
the six constants m and c, (u, v) and $(\xi, \eta)$ making this distance stationary satisfy
the equations
$$\frac{mu}{a^2} = \frac{v}{b^2}, \quad m = \frac{A}{B}, \quad \frac{u^2}{a^2} + \frac{v^2}{b^2} = 1, \quad \frac{\xi}{A} + \frac{\eta}{B} = 1,$$
together with v = mu + c and $\eta = m\xi + c$.
(c) Solve these equations to show that when the curves do not intersect the
stationary distance is
$$d = \frac{AB - \Delta}{\sqrt{A^2 + B^2}}.$$
9.4 Parametric functionals

It is sometimes useful to formulate a functional in terms of curves defined parametrically
using the theory described in chapter 8. For variable end point problems the derivation
of the appropriate formulae follows in a similar manner to that described above, but
the homogeneity of the integrand simplifies the final result.
Consider the parametric functional
$$S[x, y] = \int_0^1 dt\, \Phi(x, y, \dot{x}, \dot{y}), \quad x(0) = a, \quad y(0) = A, \tag{9.31}$$
where the end of the path at t = 0 is fixed and the end at t = 1 lies on a smooth
curve, C, defined parametrically by $x = \phi(\tau)$, $y = \psi(\tau)$, where both $\phi(\tau)$ and $\psi(\tau)$
are continuously differentiable and such that $\phi'(\tau)$ and $\psi'(\tau)$ are not simultaneously
zero for any $\tau$ in the region of interest. Notice that the parameter t varies in the fixed
interval [0, 1] because the integrand is homogeneous of degree one in $\dot{x}$ and $\dot{y}$: this is
different from the functional 9.18 in which it was necessary to allow the upper limit to
vary. Here $0 \le t \le 1$ on all paths.
By considering the varied path $(x + \epsilon h_1, y + \epsilon h_2)$ we obtain the Gateaux differential
in the usual manner,
$$\Delta S[x, y, h_1, h_2] = \int_0^1 dt \left( h_1\Phi_x + \dot{h}_1\Phi_{\dot{x}} + h_2\Phi_y + \dot{h}_2\Phi_{\dot{y}} \right). \tag{9.32}$$
The left end of the path is fixed at t = 0, consequently $h_1(0) = h_2(0) = 0$, and
integration by parts gives
$$\Delta S = \Big[ h_1\Phi_{\dot{x}} + h_2\Phi_{\dot{y}} \Big]_{t=1} - \int_0^1 dt \left\{ h_1\left( \frac{d}{dt}\Phi_{\dot{x}} - \Phi_x \right) + h_2\left( \frac{d}{dt}\Phi_{\dot{y}} - \Phi_y \right) \right\}. \tag{9.33}$$
If S[x, y] is stationary it is necessary that $\Delta S = 0$ for all allowed variations $h_1(t)$ and
$h_2(t)$. By restricting the varied paths to those on which $h_1(1) = h_2(1) = 0$ we see that
the stationary path must satisfy the Euler-Lagrange equations
$$\frac{d}{dt}\Phi_{\dot{x}} - \Phi_x = 0, \quad \frac{d}{dt}\Phi_{\dot{y}} - \Phi_y = 0, \quad x(0) = a, \quad y(0) = A. \tag{9.34}$$
The general solutions of these equations satisfying the conditions at t = 0 will contain
two constants, which we denote by c and d. On these paths the Gateaux differential
becomes
$$\Delta S = \Big[ h_1(t)\Phi_{\dot{x}} + h_2(t)\Phi_{\dot{y}} \Big]_{t=1}. \tag{9.35}$$
Because all admissible paths terminate on C, as shown in figure 9.7, the values of $h_1(1)$
and $h_2(1)$ are related.
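[Figure: the stationary path and a varied path from (a, A) to the curve C, meeting C at $\tau = \tau_1$ and $\tau = \tau_1 + \epsilon\xi$ respectively.]
Figure 9.7 Diagram showing the stationary path, the terminating
curve, C, and a varied path. At the intersection of C and the stationary
path $\tau = \tau_1$; and the varied path intersects C at $\tau = \tau_1 + \epsilon\xi$.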
Suppose that the stationary path terminates at $(\phi(\tau_1), \psi(\tau_1))$ and a varied path at a
different value of $\tau$, $\tau_1 + \epsilon\xi$. Hence
$$x(1) = \phi(\tau_1) \quad\text{and}\quad x(1) + \epsilon h_1(1) = \phi(\tau_1 + \epsilon\xi).$$
Expanding to first-order in $\epsilon$ gives $h_1(1) = \xi\phi'(\tau_1)$ and, similarly, $h_2(1) = \xi\psi'(\tau_1)$.
Thus equation 9.35 becomes
$$\Delta S = \xi\Big[ \phi'(\tau_1)\Phi_{\dot{x}} + \psi'(\tau_1)\Phi_{\dot{y}} \Big]_{t=1}. \tag{9.36}$$
But $\Delta S$ must be zero for all $\xi \ne 0$ and hence the required boundary condition is
$$\Big[ \phi'(\tau_1)\Phi_{\dot{x}} + \psi'(\tau_1)\Phi_{\dot{y}} \Big]_{t=1} = 0. \tag{9.37}$$
This is the transversality condition in parametric form and is the equivalent of
equation 9.26 (page 350).
There are now three constants that need to be determined: these are (c, d) from
the solution of equations 9.34 and the value of the parameter $\tau_1$, where the stationary
path intersects C. Equation 9.37 gives one relation between these three parameters: the
other two are $x(1, c, d) = \phi(\tau_1)$ and $y(1, c, d) = \psi(\tau_1)$. In principle these equations can
be solved to give the required stationary path.
In order to see how this theory works consider the problem solved in exercise 8.1(b)
(page 313), that is the stationary values of the distance between the origin and the
parabola now defined parametrically by $(1 - \tau^2, a\tau)$.
The parametric form of the functional is
$$S[x, y] = \int_0^1 dt\, \sqrt{\dot{x}^2 + \dot{y}^2}, \quad x(0) = y(0) = 0, \tag{9.38}$$
and the boundary curve is $\phi(\tau) = 1 - \tau^2$, $\psi(\tau) = a\tau$. Hence the boundary condition 9.37
becomes
$$a\dot{y} = 2\tau_1\dot{x} \quad\text{at } t = 1. \tag{9.39}$$
The Euler-Lagrange equations and the solutions that satisfy the boundary conditions
at the origin are
$$\frac{d}{dt}\left( \frac{\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}} \right) = 0, \quad \frac{d}{dt}\left( \frac{\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}} \right) = 0 \quad\Longrightarrow\quad x = ct, \quad y = dt, \tag{9.40}$$
where c and d are constants to be determined: these solutions are the parametric
equations of a straight line through the origin, as expected. Hence equation 9.39 becomes
$ad = 2\tau_1 c$. But at t = 1 the solution 9.40 intersects the parabola, hence $c = 1 - \tau_1^2$ and
$d = a\tau_1$. Substituting these into the equation $ad = 2\tau_1 c$ gives
$$a^2\tau_1 = 2\tau_1(1 - \tau_1^2) \quad\text{that is}\quad \tau_1 = 0 \quad\text{or}\quad \tau_1^2 = 1 - \frac{a^2}{2}.$$
The first of these solutions, $\tau_1 = 0$, gives x = 1 and y = 0. The second equation,
$\tau_1^2 = 1 - a^2/2$, has real solutions if $a < \sqrt{2}$, which are the solutions found previously in
exercise 8.1(b).
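These stationary parameter values can be checked directly. The sketch below assumes the parabola is parametrised as $(1 - \tau^2, a\tau)$ and that the stationary path is the straight line from the origin to the point with parameter $\tau_1$; the function names are introduced here for illustration. It evaluates the residual of the transversality relation $ad = 2\tau_1 c$ and, independently, a finite-difference slope of the distance from the origin to the curve.

```python
import math


def stationary_parameters(a):
    """Roots tau_1 of a^2 * tau = 2 * tau * (1 - tau^2)."""
    roots = [0.0]
    s = 1.0 - 0.5 * a * a
    if s > 0.0:
        roots += [math.sqrt(s), -math.sqrt(s)]
    return roots


def transversality_residual(tau1, a):
    """a*d - 2*tau1*c for the line x = c t, y = d t ending on the parabola."""
    c, d = 1.0 - tau1 * tau1, a * tau1
    return a * d - 2.0 * tau1 * c


def distance(tau, a):
    """Distance from the origin to the point (1 - tau^2, a*tau)."""
    return math.hypot(1.0 - tau * tau, a * tau)


def distance_slope(tau, a, e=1e-6):
    """Central-difference derivative of the distance with respect to tau."""
    return (distance(tau + e, a) - distance(tau - e, a)) / (2.0 * e)


a = 1.0
for tau1 in stationary_parameters(a):
    print(tau1, transversality_residual(tau1, a), distance_slope(tau1, a))
```

For each root both the residual and the slope should vanish to within numerical error; once a reaches $\sqrt{2}$ only the root $\tau_1 = 0$ remains.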
Exercise 9.15
For the parametrically defined curve $x = \phi(\tau)$, $y = \psi(\tau)$, use the method described
above to show that the distance along the straight line y = mx from the origin to
a point on this curve is stationary if $m = -\phi'(\tau)/\psi'(\tau)$. If the curve is represented
by the function y(x), show that this becomes $my'(x) = -1$ and give a geometric
interpretation of this formula.
Exercise 9.16
Express the functional defined in equation 9.38 in non-parametric form and find
its stationary paths.
9.5 Broken Extremals: the Weierstrass-Erdmann conditions
The theory so far has dealt almost entirely with continuously differentiable solutions
of the Euler-Lagrange equations. In the construction of the minimum area of a surface
of revolution, section 4.3, it was seen that the Goldschmidt function, equation 4.20
(page 176), was the only solution if the end radii were too small: this function is
continuous, but at two points its derivatives do not exist.
Solutions of variational problems that are continuous, but have discontinuous deriva-
tives at a finite number of points are named broken extremals (though they are often
merely stationary paths rather than extremals). The points of discontinuity are named
corners. Such solutions are dealt with by dividing the path into contiguous segments
in each of which the path is continuously differentiable and satisfies the Euler-Lagrange
equation; supplementing these equations are the Weierstrass-Erdmann (corner) conditions
which allow the paths in each segment to be joined to form a continuous path.
It transpires that the variational principle and the requirement of continuity provide
just sufficient extra conditions for particular solutions to be formed.
It is quite easy to find real problems that require broken extremals. One example is
illustrated in figure 9.3 (page 343), and we use a variant of this to introduce the basic
ideas before developing the general theory.
9.5.1 A taut wire

Consider a taut, elastic wire under tension T, fixed at both ends, one being at the origin,
the other at x = L, on the horizontal x-axis. We suppose the wire sufficiently light that
it lies along Ox. If a weight of mass M is hung from the wire at a given point $x = \xi$
it will deform as shown in figure 9.8, and we assume that the deflection is sufficiently
small that the change in tension is negligible.
If the y-axis is vertically upwards the energy due to the tension in the wire can
be shown to be $\tfrac{1}{2}T\int_0^L dx\, y'^2$ provided the displacement, y(x), is sufficiently small for
Hooke's law to be valid. The potential energy of the mass is $Mgy(\xi)$, g being the
acceleration due to gravity, and, for the sake of simplicity, we assume that the wire
is sufficiently light that its potential energy is negligible. The functional for the total
energy of the system is
$$E[y] = Mg\,y(\xi) + \tfrac{1}{2}T\int_0^L dx\, y'^2, \quad y(0) = y(L) = 0, \quad 0 < \xi < L. \tag{9.41}$$
The configuration adopted by the wire is the continuous stationary path of this
functional.
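[Figure: a wire between x = 0 and x = L, deflected below the x-axis into two straight segments $y_1$ and $y_2$ by a weight Mg.]
Figure 9.8 Diagram of a light, taut wire of length L supporting
a weight at $x = \xi$.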
This energy functional is different from others considered because the point $x = \xi$ is
special. We deal with this by splitting the interval [0, L] into two subintervals, $[0, \xi]$
and $[\xi, L]$ and writing the whole path, y(x) in terms of two functions,
$$y(x) = \begin{cases} y_1(x), & 0 \le x \le \xi, \\ y_2(x), & \xi \le x \le L, \end{cases} \tag{9.42}$$
and since y(x) is continuous at $x = \xi$, we have $y_1(\xi) = y_2(\xi)$. The derivatives of y(x)
are not defined at $x = \xi$, but this does not hinder the analysis because we require only
the left and right-hand derivatives. These are defined, respectively, by
$$\lim_{\epsilon\to 0+} \frac{y(\xi) - y(\xi - \epsilon)}{\epsilon} = \lim_{\epsilon\to 0+} \frac{y_1(\xi) - y_1(\xi - \epsilon)}{\epsilon}, \quad\text{(left derivative)},$$
$$\lim_{\epsilon\to 0+} \frac{y(\xi + \epsilon) - y(\xi)}{\epsilon} = \lim_{\epsilon\to 0+} \frac{y_2(\xi + \epsilon) - y_2(\xi)}{\epsilon}, \quad\text{(right derivative)}.$$
In the following the derivatives at $x = \xi$ are to be understood in this sense.
Now evaluate the functional on the varied path, $y + \epsilon h$, also continuous at $x = \xi$,
and where h(0) = h(L) = 0,
$$E[y + \epsilon h] = Mg\left( y(\xi) + \epsilon h(\xi) \right) + \tfrac{1}{2}T\int_0^\xi dx\, (y_1' + \epsilon h')^2 + \tfrac{1}{2}T\int_\xi^L dx\, (y_2' + \epsilon h')^2,$$
so that the Gateaux differential is
$$\Delta E[y, h] = Mg\,h(\xi) + T\int_0^\xi dx\, y_1'h' + T\int_\xi^L dx\, y_2'h'.$$
Integration by parts gives, on remembering that h(0) = h(L) = 0,
$$\Delta E[y, h] = \Big\{ Mg + T\left( y_1'(\xi) - y_2'(\xi) \right) \Big\} h(\xi) - T\int_0^\xi dx\, y_1''h - T\int_\xi^L dx\, y_2''h. \tag{9.43}$$
Now proceed in the usual manner. By choosing those h(x) for which h(x) = 0 for
$\xi \le x \le L$ and those for which h(x) = 0 for $0 \le x \le \xi$, we obtain the Euler-Lagrange
equations for $y_1(x)$ and $y_2(x)$,
$$\frac{d^2y_1}{dx^2} = 0, \quad 0 \le x < \xi, \quad y_1(0) = 0, \qquad \frac{d^2y_2}{dx^2} = 0, \quad \xi < x \le L, \quad y_2(L) = 0. \tag{9.44}$$
On the path satisfying these equations the Gateaux differential becomes
$$\Delta E[y, h] = \Big\{ Mg + T\left( y_1'(\xi) - y_2'(\xi) \right) \Big\} h(\xi)$$
and this can be zero for all h(x) only if
$$y_2'(\xi) - y_1'(\xi) = \frac{Mg}{T}. \tag{9.45}$$
Physically, this equation represents the resolution of forces acting on the weight in the
vertical direction. Together with the continuity of y(x) this condition provides sufficient
information to find a stationary path, as we now show by solving the equations.
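The solutions of the Euler-Lagrange equations 9.44 that satisfy the boundary
conditions at x = 0 and x = L are
$$y_1(x) = \alpha x \quad\text{and}\quad y_2(x) = \beta(L - x),$$
for some constants $\alpha$ and $\beta$. Since $y_1(\xi) = y_2(\xi)$ we have $(\alpha + \beta)\xi = \beta L$ and
equation 9.45 gives $\alpha + \beta = -Mg/T$. Hence the stationary path comprises the two straight
line segments,
$$y(x) = \begin{cases} -\dfrac{Mg}{TL}(L - \xi)\,x, & 0 \le x \le \xi, \\[2ex] -\dfrac{Mg\,\xi}{TL}(L - x), & \xi \le x \le L. \end{cases} \tag{9.46}$$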
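The two-segment solution can be compared with a direct discretisation of the energy. The following is a sketch under stated assumptions: the wire is sampled at n + 1 equally spaced nodes, the weight sits exactly on node k, the energy is approximated by $Mg\,y_k + \tfrac{T}{2}\sum_i (y_{i+1} - y_i)^2/h$, and only the ratio Mg/T (called `load` here) enters. Setting the derivative with respect to each interior $y_i$ to zero gives a tridiagonal linear system, solved below by the Thomas algorithm.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0] = upper[0] / diag[0]
    dp[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * cp[i - 1]
        cp[i] = upper[i] / denom
        dp[i] = (rhs[i] - lower[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def wire_profile(n, k, L=1.0, load=0.05):
    """Nodal displacements y_0..y_n with load = M g / T acting at node k."""
    h = L / n
    m = n - 1                          # number of interior nodes
    rhs = [0.0] * m
    rhs[k - 1] = -load * h             # stationarity of the discrete energy at node k
    y = thomas([-1.0] * m, [2.0] * m, [-1.0] * m, rhs)
    return [0.0] + y + [0.0]


def two_segment(x, xi, L=1.0, load=0.05):
    """Piecewise-linear path with slopes fixed by continuity and the corner condition."""
    if x <= xi:
        return -load * (L - xi) * x / L
    return -load * xi * (L - x) / L


n, k = 10, 3
y = wire_profile(n, k)
for i in range(n + 1):
    print(i / n, y[i], two_segment(i / n, k / n))
```

Because the continuous solution is piecewise linear with its corner on a node, the discrete and continuous values should agree at every node up to rounding, and the jump in the discrete slope across node k should equal Mg/T.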
Exercise 9.17
Find the continuous stationary paths of the functional
$$S[y] = C\,y(\xi)^2 + \tfrac{1}{2}\int_0^L dx\, y'^2, \quad y(0) = A, \quad 0 < \xi < L,$$
with natural boundary conditions at x = L. Explain why there cannot be a
unique, nontrivial solution if A = 0.
9.5.2 The Weierstrass-Erdmann conditions

Now consider the problem of finding stationary paths of the functional
$$S[y] = \int_a^b dx\, F(x, y, y'), \quad y(a) = A, \quad y(b) = B, \tag{9.47}$$
that are continuously differentiable for $a \le x \le b$, except possibly at a single, unknown
point c (a < c < b), where the path is continuous but its derivatives are not.
The main difference between this and the previous special case is that the value of
c is not known in advance. However, we proceed in the same manner by splitting the
functional into two components,
Z c Z b
S[y] = dx F (x, y1 , y10 ) + dx F (x, y2 , y20 ), (9.48)
a c
and compute its value on the varied paths y1 + h1 and y2 + h2 , and also allowing the
point x = c to move to c0 = c + , as shown diagrammatically in figure 9.9.
Figure 9.9 Diagram showing the stationary and a varied path: here $c'=c+\epsilon\xi$.
The value of the functional on the varied path is
\[ S[y+\epsilon h] = \int_a^{c+\epsilon\xi} dx\, F(x,\,y_1+\epsilon h_1,\,y_1'+\epsilon h_1') + \int_{c+\epsilon\xi}^{b} dx\, F(x,\,y_2+\epsilon h_2,\,y_2'+\epsilon h_2'). \]
Each integral is similar to that defined in equation 9.20 (page 349) and using the same analysis that leads to equation 9.25, see exercise 9.18, we obtain,
\[ \Delta S[y,h] = \Bigl(\xi F + h_1F_{y'}\Bigr)\Big|_{(x,y)=(c,y_1)} - \Bigl(\xi F + h_2F_{y'}\Bigr)\Big|_{(x,y)=(c,y_2)}, \tag{9.49} \]
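with $y_1(x)$ and $y_2(x)$ satisfying the Euler-Lagrange equations
\[ \frac{d}{dx}\left(\frac{\partial F}{\partial y_1'}\right) - \frac{\partial F}{\partial y_1} = 0, \quad y_1(a)=A, \quad a\le x\le c, \tag{9.50} \]
\[ \frac{d}{dx}\left(\frac{\partial F}{\partial y_2'}\right) - \frac{\partial F}{\partial y_2} = 0, \quad y_2(b)=B, \quad c\le x\le b. \tag{9.51} \]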
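On the stationary path the coordinates of the corner are $(c, y(c))$ and on a varied path these become $(c+\epsilon\xi,\; y(c)+\epsilon\eta)$, with $\xi$ and $\eta$ independent variables. In terms of $y_1$ and $y_2$ we have
\[ \begin{aligned} y(c)+\epsilon\eta &= y_k(c+\epsilon\xi) + \epsilon h_k(c+\epsilon\xi), \quad k=1 \text{ and } 2, \\ &= y_k(c) + \epsilon\xi\, y_k'(c) + \epsilon h_k(c) + O(\epsilon^2). \end{aligned} \]
Since $y(x)$ is continuous, $y_1(c)=y_2(c)=y(c)$, these equations allow $h_1(c)$ and $h_2(c)$ to be expressed in terms of the independent variables $\xi$ and $\eta$, that is $h_k(c)=\eta-\xi y_k'(c)$. Substituting these expressions into equation 9.49 for $\Delta S$ we obtain
\[ \Delta S[y,h] = \Bigl(\xi(F-y'F_{y'}) + \eta F_{y'}\Bigr)\Big|_{(x,y)=(c,y_1)} - \Bigl(\xi(F-y'F_{y'}) + \eta F_{y'}\Bigr)\Big|_{(x,y)=(c,y_2)}. \tag{9.52} \]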
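Note that each term of the right-hand side of this equation is similar to the left-hand side of equation 9.26 (page 350), with $\xi$ and $\eta$ playing the roles of the partial derivatives $\tau_y$ and $\tau_x$ (up to sign): the important difference is that $\xi$ and $\eta$ are independent variables. Because of this $\Delta S=0$ only if the coefficients of $\xi$ and $\eta$ are both zero, which gives the two relations
\[ \lim_{x\to c-}\bigl(F-y'F_{y'}\bigr) = \lim_{x\to c+}\bigl(F-y'F_{y'}\bigr), \tag{9.53} \]
\[ \lim_{x\to c-}F_{y'} = \lim_{x\to c+}F_{y'}. \tag{9.54} \]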
These relations between the values of $y_1$ and $y_2$, and their first derivatives at $x=c$, are known as the Weierstrass-Erdmann (corner) conditions and they hold at every corner of a stationary path. With one corner the Euler-Lagrange equations 9.50 and 9.51 may be solved to give functions $y_1(x,\alpha)$ and $y_2(x,\beta)$, each involving one arbitrary constant. Substituting these into the corner conditions gives two equations relating $\alpha$, $\beta$ and $c$: a third equation is given by the continuity equation $y_1(c,\alpha)=y_2(c,\beta)$. These three equations allow, in principle, values for $\alpha$, $\beta$ and $c$ to be found.
Exercise 9.18
Derive equations 9.49 to 9.52.
For an example consider the functional
\[ S[y] = \int_0^2 dx\, y'^2(1-y')^2, \quad y(0)=0, \quad y(2)=1. \tag{9.55} \]
Because the integrand depends only upon $y'$, the solutions of the Euler-Lagrange equation are the straight lines $y=mx+\alpha$, for some constants $m$ and $\alpha$. Therefore the smooth solution that fits the boundary conditions is $y=x/2$ and on this path $S=1/8$: moreover, by considering the second-order terms in the expansion of $S[y+\epsilon h]$ we see that this path is a local maximum of $S$.
However, if $y'=0$ or $y'=1$ the integrand is zero, so we can imagine a broken path comprising segments of straight lines at 45° and parallel to the $x$-axis on which $S[y]=0$; because the integrand is non-negative such a path gives a global minimum. We now show that the corner conditions give such solutions.
Suppose that there is one corner at $x=c$. The two solutions that fit the boundary conditions either side of $c$ are
\[ y = \begin{cases} y_1 = m_1x, & 0\le x\le c, \\ y_2 = m_2(x-2)+1, & c\le x\le 2. \end{cases} \]
Since
\[ F_{y'} = 2y'(1-y')(1-2y') \quad\text{and}\quad F-y'F_{y'} = -y'^2(1-y')(1-3y') \]
the Weierstrass-Erdmann conditions become
\[ \begin{aligned} m_1^2(1-m_1)(1-3m_1) &= m_2^2(1-m_2)(1-3m_2) \quad\text{and} \\ m_1(1-m_1)(1-2m_1) &= m_2(1-m_2)(1-2m_2). \end{aligned} \tag{9.56} \]
The only non-trivial solutions of these equations and the continuity condition, $m_1c=m_2(c-2)+1$, are $(m_1,m_2,c)=(1,0,1)$ and $(0,1,1)$, which give the two solutions shown by the solid and dashed lines, respectively, in figure 9.10. On both lines the functional has its smallest possible value of zero.
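The corner conditions for this example are easy to test numerically. The sketch below is illustrative only: the function names are hypothetical, and it simply evaluates $F_{y'}$, $F-y'F_{y'}$ and the continuity condition for a candidate triple $(m_1,m_2,c)$.

```python
def F(p):
    """Integrand of functional 9.55 as a function of the gradient p = y'."""
    return p**2 * (1 - p)**2


def F_p(p):
    """Derivative of F with respect to y'."""
    return 2 * p * (1 - p) * (1 - 2 * p)


def H(p):
    """The combination F - y' F_{y'} used in the first corner condition."""
    return F(p) - p * F_p(p)


def satisfies_corner_conditions(m1, m2, c, tol=1e-12):
    """Check continuity at x = c and both Weierstrass-Erdmann conditions."""
    continuity = abs(m1 * c - (m2 * (c - 2) + 1)) < tol
    first = abs(H(m1) - H(m2)) < tol
    second = abs(F_p(m1) - F_p(m2)) < tol
    return continuity and first and second


def S_broken(m1, m2, c):
    """Value of the functional on the two-segment path with a corner at x = c."""
    return c * F(m1) + (2 - c) * F(m2)
```

For instance `satisfies_corner_conditions(1, 0, 1)` and `satisfies_corner_conditions(0, 1, 1)` both hold, with `S_broken` equal to zero in each case, whereas the smooth path corresponds to `S_broken(0.5, 0.5, 1)`, which is 1/8.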
Figure 9.10 Graph of some broken extremals for the functional 9.55. On the solid line $(m_1,m_2)=(1,0)$: on the dashed line $(m_1,m_2)=(0,1)$ and in both cases $c=1$. The dotted line is a broken extremal with several corners.
In this example there are solutions with any number of corners comprising alternate
lines with unit gradient and horizontal lines; an example is depicted by the dotted line
in figure 9.10.
Exercise 9.19
(a) Show that the stationary path of the functional 9.55 without corners is $y=x/2$ and that on this path $S[y]=1/8$.
(b) If $y=x/2$ show that
\[ S[y+\epsilon h] = S[y] - \frac{1}{2}\epsilon^2\int_0^2 dx\, h'(x)^2 + O(\epsilon^4) \]
and deduce that this path gives a local maximum of $S[y]$.
Exercise 9.20
Show that the only solutions of equations 9.56 are those given in the text.
Exercise 9.21
Find the stationary paths of the functional 9.55 with two corners.
Exercise 9.22
Find the stationary paths of the functional
\[ S[y] = \int_0^4 dx\, \bigl(y'^2-1\bigr)^2, \quad y(0)=0, \quad y(4)=2, \]
having just one corner.
9.5.3 The parametric form of the corner conditions

The Weierstrass-Erdmann corner conditions for the parametric functional
\[ S[x,y] = \int_0^1 dt\, \Phi(x,y,\dot x,\dot y), \quad x(0)=a, \; y(0)=A, \; x(1)=b, \; y(1)=B, \tag{9.57} \]
can be derived directly from equations 9.53 and 9.54, by setting $\Phi(x,y,\dot x,\dot y)=\dot x F(x,y,\dot y/\dot x)$ and recalling the results of exercise 8.11 (page 319), to give
\[ \lim_{t\to c-}\Phi_{\dot x} = \lim_{t\to c+}\Phi_{\dot x} \quad\text{and}\quad \lim_{t\to c-}\Phi_{\dot y} = \lim_{t\to c+}\Phi_{\dot y}, \tag{9.58} \]
where the corner is at $t=c$, with $0<c<1$. At such a corner either or both of $\dot x(t)$ and $\dot y(t)$ are discontinuous.
9.6 Newton's minimum resistance problem

We now consider the solution of Newton's minimum resistance problem described in section 2.5.3 where the relevant functionals are derived, equations 2.21 and 2.22 (page 96). Although the solution of this problem is of little practical value, for the reasons discussed in section 2.5.3, its derivation is worth pursuing because of the techniques needed. The detailed analysis in this section is not, however, assessed. To recap we require the stationary paths of the functionals
\[ S_1[y] = \int_0^b dx\, \frac{x}{1+y'^2}, \quad y(0)=A>0, \quad y(b)=0, \tag{9.59} \]
and
\[ S_2[y] = \frac{1}{2}a^2 + \int_a^b dx\, \frac{x}{1+y'^2}, \quad y(a)=A>0, \quad y(b)=0, \quad 0<a<b. \tag{9.60} \]
For $S_2[y]$ both the stationary path and the value of $a$ need to be determined. Physical considerations suggest that $y'(x)$ is piecewise continuous; further, in the derivation of these functionals we made the implicit assumption that $y'(x)\le 0$ and without this constraint $S_1[y]$ can be made arbitrarily small, as shown in exercise 9.28.
Here we show that $S_1[y]$ has no stationary paths that can satisfy the boundary conditions; using this analysis we derive a stationary path for $S_2[y]$.
The functional $S_1[y]$ is of the type considered in exercise 7.13 (page 280) because the integrand, $F=x/(1+y'^2)$, does not depend explicitly upon $y$. The conclusion of this exercise shows that if $y(x)$ is a stationary path and if
\[ F_{y'y'} = \frac{2x(3y'^2-1)}{(1+y'^2)^3} > 0, \tag{9.61} \]
it gives a minimum value of $S_1$.
The Euler-Lagrange equation associated with $S_1$ can be integrated directly, and assuming that $y(x)$ decreases monotonically from $y(0)=A>0$ to $y(b)=0$, we obtain
\[ -\frac{xy'}{(1+y'^2)^2} = c, \quad\text{with } c>0. \tag{9.62} \]
This equation can be solved by defining a new positive variable$^3$,
\[ p(x) = -y'(x) \quad\text{giving the equation}\quad xp = c(1+p^2)^2. \tag{9.63} \]
Integrating the first equation gives
\[ \begin{aligned} y(x) &= A - \int_0^x dx\, p(x) \\ &= A - \int_{p_0}^{p} dp\, p\frac{dx}{dp} = A - \int_{p_0}^{p} dp \left( \frac{d}{dp}(xp) - x \right), \end{aligned} \]
where $p_0=p(0)$ is an unknown constant. The last expression gives
\[ y(p) = A - \Bigl[xp\Bigr]_{p_0}^{p} + c\int_{p_0}^{p} dp \left( \frac{1}{p} + 2p + p^3 \right). \]
This equation can be integrated directly to obtain $(x,y)$ in terms of $p$,
\[ x(p) = \frac{c(1+p^2)^2}{p} \quad\text{and}\quad y(p) = B + c\left( \ln p - p^2 - \frac{3}{4}p^4 \right), \tag{9.64} \]
where $B$ is a constant which has absorbed all other constants: in these equations $p$ may be regarded as a parameter, so we have found a solution in parametric form. The required solution is obtained by finding the appropriate values of $B$, $c$ and a range of $p$ that satisfy $(x,y)=(0,A)$ and $(b,0)$: it transpires that this is impossible, as will now be demonstrated.
Define the related functions
\[ \xi(p) = \frac{x}{c} = \frac{(1+p^2)^2}{p} \quad\text{and}\quad \eta(p) = \frac{y-B}{c} = \ln p - p^2 - \frac{3}{4}p^4, \quad p>0, \tag{9.65} \]
which contain no arbitrary constants. Since, by definition, $p=-y'(x)$ it follows from the chain rule that $-p\,\xi'(p)=\eta'(p)$ and hence for $p\ne 0$ the stationary points of $\xi(p)$ and $\eta(p)$ coincide. The graphs of $\xi(p)$ and $\eta(p)$ are shown in figure 9.11.
Figure 9.11 Graphs of $\xi(p)$ and $\eta(p)$. Each function is stationary at $p=1/\sqrt{3}$.

Since $\xi'(p)=(p^2+1)(3p^2-1)/p^2$, $\xi(p)$ has a single minimum at $p=1/\sqrt{3}$ and, because $-p\,\xi'(p)=\eta'(p)$, $\eta(p)$ has a single maximum at $p=1/\sqrt{3}$. The coordinates of the stationary points are $(\xi,\eta)=\bigl(16\sqrt{3}/9,\; -\tfrac{1}{2}\ln 3-\tfrac{5}{12}\bigr)=(3.08,\,-0.97)$. The minimum value of $x(p)$ is $16\sqrt{3}\,c/9$, with $c>0$, hence there is no nontrivial stationary path that can pass through $x=0$, the lower boundary. However, we pursue the investigation of this general solution because it is needed for the stationary path of $S_2$.

$^3$ Note that this is a fairly common trick and was used to simplify Riccati's equation in section 7.9.
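The location and value of the common stationary point can be confirmed with a few lines of code. This is a minimal sketch using only the standard library; the names `xi`, `eta`, `xi_prime` and `eta_prime` are hypothetical labels for the functions of equation 9.65 and their derivatives with respect to $p$.

```python
import math


def xi(p):
    """x/c as a function of the parameter p, equation 9.65."""
    return (1 + p**2)**2 / p


def eta(p):
    """(y - B)/c as a function of the parameter p, equation 9.65."""
    return math.log(p) - p**2 - 0.75 * p**4


def xi_prime(p):
    """Derivative of xi with respect to p."""
    return (p**2 + 1) * (3 * p**2 - 1) / p**2


def eta_prime(p):
    """Derivative of eta, using the chain-rule relation eta' = -p * xi'."""
    return -p * xi_prime(p)


p_star = 1 / math.sqrt(3)      # common stationary point
xi_min = xi(p_star)            # minimum of xi
eta_max = eta(p_star)          # maximum of eta
```

Evaluating `xi_min` and `eta_max` reproduces the rounded coordinates quoted in the text, and `xi(p)` exceeds `xi_min` for every other positive `p`, which is why the curve cannot reach $x=0$.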
The graphs of $\xi(p)$ and $\eta(p)$ show that there are two branches of the function $\eta(\xi)$, the solution of equation 9.62, one defined by $p$ in the interval $[1/\sqrt{3},\infty)$ and the other on the interval $(0,1/\sqrt{3}]$. Consider each case.

$p\sqrt{3}>1$: for $p$ increasing from $1/\sqrt{3}$, $\xi(p)$ increases monotonically from its minimum value and $\eta(p)$ decreases monotonically from its maximum value. Hence the function $\eta(\xi)$ remains in the fourth quadrant starting at $(3.08,\,-0.97)$, where $p=1/\sqrt{3}$, and behaving as $\eta\simeq -3\xi^{4/3}/4$ for large $p$. This is the curve $MR$ in figure 9.12. At $p=1/\sqrt{3}$, $\eta'(\xi)=-1/\sqrt{3}$. Since $p>1/\sqrt{3}$, this curve is a local minimum of $S_1[y]$, see equation 9.61.

$p\sqrt{3}<1$: for $p$ decreasing from $1/\sqrt{3}$, $\xi(p)$ increases monotonically from its minimum value and $\eta(p)$ decreases monotonically from its maximum value: again $\eta(\xi)$ remains in the fourth quadrant, and for small $p$, $\xi\simeq 1/p$ and $\eta(\xi)\simeq -\ln\xi$. On this curve $\eta(\xi)$ decreases more slowly than on the previous curve. At $p=1/\sqrt{3}$, $\eta'(\xi)=-1/\sqrt{3}$. This is the curve $MS$ in figure 9.12.

The equations 9.64 define the parametric equations of a curve in the $(\xi,\eta)$-plane with parameter $p$; this curve is shown in figure 9.12. In principle $\eta$ can be expressed in terms of $\xi$, but no simple formula for this relation exists. The two branches $MR$ and $MS$ of $\eta(\xi)$, shown in figure 9.12, start at $(3.08,\,-0.97)$ with the same gradient.
Figure 9.12 Graph of the two branches of $\eta(\xi)$, the solution of equation 9.62: the branch $MR$ has $p\sqrt{3}>1$ and the branch $MS$ has $p\sqrt{3}<1$.
Solid surrounding a hollow right circular cylinder
The above analysis shows that there is no smooth stationary path for $S_1[y]$ that satisfies the boundary conditions. However, suppose that the solid of revolution surrounds a hollow cylinder with axis along $Oy$ and with a given radius, $a$, and height $A$, and through which the fluid flows unhindered, as shown in figure 9.13.
Figure 9.13 Diagram of a solid surrounding a hollow cylinder.
The functional for this problem is a variation of that defined in equation 9.59,
\[ S_3[y] = \int_a^b dx\, \frac{x}{1+y'^2}, \quad y(a)=A, \quad y(b)=0. \tag{9.66} \]
The solution of the associated Euler-Lagrange equation that makes $S_3[y]$ a minimum is given by equations 9.64 with $1/\sqrt{3}<p_1\le p\le p_2$, with $p_1$ corresponding to the point $(a,A)$ and $p_2$ to $(b,0)$. The four constants, $(p_1,p_2,c,B)$, are given in terms of $(a,b,A)$ by the four boundary conditions,
\[ A = B + c\,\eta(p_1), \quad (\text{from } y(p_1)=A), \tag{9.67} \]
\[ 0 = B + c\,\eta(p_2), \quad (\text{from } y(p_2)=0), \tag{9.68} \]
\[ a = c\,\xi(p_1), \quad b = c\,\xi(p_2), \quad (\text{from } x(p_1)=a \text{ and } x(p_2)=b>a). \tag{9.69} \]
For given values of $b/a$ and $A/a$ we now show that these equations have a unique solution, provided $A/a$ is larger than some minimum value that depends upon $b/a$, and tends to zero as $b\to a$. The equations can be solved numerically for the constants, $(p_1,p_2,c,B)$. But this task is made easier by first expressing $p_2$ and $c$ in terms of $p_1$. This is achieved by dividing the two equations 9.69 to eliminate $c$ and writing the resultant expression in the form
\[ \xi(p_2) = \frac{b}{a}\,\xi(p_1), \quad p_1>\frac{1}{\sqrt{3}}. \tag{9.70} \]
This equation can be interpreted geometrically as illustrated in figure 9.14 which shows the graphs of $\xi(p)=(1+p^2)^2/p$ and $b\,\xi(p)/a>\xi(p)$, the dashed line. For a given value of $p_1>1/\sqrt{3}$, we can see by following the arrows on the dotted lines that a unique value of $p_2$ is obtained. For large $p$ we have $\xi(p)\simeq p^3$ giving the approximate solution $p_2\simeq (b/a)^{1/3}p_1$.
Figure 9.14 Graphs of the functions $\xi(p)$, the solid line, and $b\,\xi(p)/a$, the dashed line, and the geometric interpretation of the solution of equation 9.70.
The equation 9.70 for $p_2$ is simplified by defining a new variable $\phi_2$, $p_2=1/\tan\phi_2=\tan(\pi/2-\phi_2)$, with $0<\phi_2<\pi/3$, so equation 9.70 becomes
\[ \sin^3\phi_2\cos\phi_2 = d^3, \quad d^3 = \frac{a}{b}\,\frac{p_1}{(1+p_1^2)^2} < \frac{3\sqrt{3}}{16}\,\frac{a}{b}, \quad d<0.69. \tag{9.71} \]
The left-hand side is $O(\phi_2^3)$ as $\phi_2\to 0$ and in this limit the solution is approximately $\phi_2=d$, that is $p_2\simeq (b/a)^{1/3}p_1$, which is why we wrote $d^3$ on the right-hand side. For larger values of $d$ the solution can be approximated by a truncated Taylor series. It transpires that the first few terms of this series provide sufficient accuracy for graphical representations of the solution: the first four terms give,
\[ p_2(d) = \frac{1}{d}\left( 1 - \frac{2}{3}d^2 - \frac{1}{3}d^4 - \frac{28}{81}d^6 + \cdots \right). \tag{9.72} \]
The constant $c$ is given directly in terms of $p_1$ by equation 9.69, so, on eliminating $B$ from equations 9.67 and 9.68 we obtain an equation for $p_1$,
\[ \frac{A}{a} = \frac{\eta(p_1)-\eta(p_2)}{\xi(p_1)}. \tag{9.73} \]
This equation can be used to determine $p_1$ for a given value of $A/a$. Numerical investigations suggest that for a given value of $b/a$ there is a minimum value of $A$ below which there are no solutions, with this critical value of $A$ tending to zero as $b\to a$. Alternatively, as $a\to 0$, for fixed $b$, the minimum value of $A$ for which solutions exist tends to infinity, see exercise 9.24. A few such solutions are shown in figure 9.15 for $A=4a$, and in this case there are no solutions if $b>3.72a$.
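The numerical procedure described above can be sketched as follows. This is an illustration rather than the method used to produce the figures: the function names are hypothetical, `ratio` stands for $b/a$, and a simple bisection is used for equation 9.70, relying on the fact that the function of equation 9.65 that gives $x/c$ increases monotonically for $p>1/\sqrt{3}$.

```python
import math


def xi(p):
    """x/c as a function of p, equation 9.65."""
    return (1 + p**2)**2 / p


def eta(p):
    """(y - B)/c as a function of p, equation 9.65."""
    return math.log(p) - p**2 - 0.75 * p**4


def solve_p2(p1, ratio):
    """Solve xi(p2) = ratio * xi(p1) for p2 > p1, assuming ratio > 1 and p1 > 1/sqrt(3)."""
    target = ratio * xi(p1)
    lo, hi = p1, 2 * p1
    while xi(hi) < target:        # bracket the root; xi grows like p**3
        hi *= 2
    for _ in range(200):          # bisection on the increasing branch of xi
        mid = 0.5 * (lo + hi)
        if xi(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def height_ratio(p1, ratio):
    """A/a from equation 9.73 for a given p1 and ratio = b/a."""
    p2 = solve_p2(p1, ratio)
    return (eta(p1) - eta(p2)) / xi(p1)
```

Scanning `height_ratio(p1, ratio)` over $p_1>1/\sqrt{3}$ for a fixed `ratio` shows which values of $A/a$ can be reached; inverting that relation for $p_1$ then fixes the remaining constants through equations 9.67 to 9.69.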
Figure 9.15 Examples of the stationary paths of $S_3[y]$ for $A=4a$ and various values of $b/a$ ($b=1.5a$, $2a$, $2.5a$ and $3.5a$). For this value of $A$ there are no solutions for $b>3.72a$.
Body surrounding a solid right circular cylinder

We now return to Newton's original problem. Another solution of the Euler-Lagrange equation 9.62 is $y'(x)=0$, with $c=0$, so we might expect a suitable solution to be the piecewise differentiable function,
\[ z(x) = \begin{cases} A, & 0\le x\le a, \\ y(x), & a\le x\le b, \end{cases} \]
where $y(x)$ is the solution found in equation 9.64 above, with $y(a)=A$, and $a$ a parameter to be found. This solution has a corner at $(a,A)$, so can be treated using a modification of the Weierstrass-Erdmann corner conditions, the modification being required because $A$ is fixed and $y(a)=A$, so constraining the end of the varied path to move only in the $x$-direction.
However a more transparent formulation of this problem is obtained by explicitly including the path $y=A$, for $0\le x\le a$, in the functional. Thus the equivalent functional is
\[ S_2[y] = \frac{1}{2}a^2 + \int_a^b dx\, \frac{x}{1+y'^2}, \quad y(a)=A, \quad y(b)=0, \tag{9.74} \]
where both the path and the variable $a$ need to be found. The varied path is $y(x)+\epsilon h(x)$, with $h(b)=0$, and $a+\epsilon k$. At the corner, $x=a$,
\[ A = y(a) \quad\text{and}\quad A = y(a+\epsilon k) + \epsilon h(a+\epsilon k) \tag{9.75} \]
and expanding the second equation to first order in $\epsilon$ we obtain a relation between $k$ and $h(a)$,
\[ k\,y'(a) + h(a) = 0. \tag{9.76} \]
Setting $F(x,y')=x/(1+y'^2)$ we obtain
\[ S_2[y+\epsilon h] = \frac{1}{2}(a+\epsilon k)^2 + \int_a^b dx\, F(x,\,y'+\epsilon h') - \int_a^{a+\epsilon k} dx\, F(x,\,y'+\epsilon h'). \tag{9.77} \]
Differentiating with respect to $\epsilon$, then setting $\epsilon$ to zero, integrating by parts and using the fact that $h(b)=0$ gives the Gateaux differential, see exercise 9.26,
\[ \Delta S_2 = ak - kF\bigl(a,y'(a)\bigr) - h(a)F_{y'}\bigl(a,y'(a)\bigr) - \int_a^b dx\, h\,\frac{dF_{y'}}{dx}. \tag{9.78} \]
Using the subset of variations with $k=h(a)=0$ gives equation 9.62, having the parametric solution
\[ x = c\,\frac{(1+p^2)^2}{p}, \quad y = B + c\left( \ln p - p^2 - \frac{3}{4}p^4 \right), \quad \frac{1}{\sqrt{3}} < p_1\le p\le p_2, \tag{9.79} \]
where $c$ and $B$ are constants and we restrict $p>1/\sqrt{3}$ because in the previous case only this range of $p$ gave a minimum: this assumption is justified in exercise 9.27.
On using equation 9.76 to express $h(a)$ in terms of $k$, we see that on the stationary path the Gateaux differential has the value
\[ \Delta S_2 = \Bigl\{ a - F\bigl(a,y'(a)\bigr) + y'(a)F_{y'}\bigl(a,y'(a)\bigr) \Bigr\}\, k. \tag{9.80} \]
This must be zero for all $k$ and hence
\[ a = F\bigl(a,y'(a)\bigr) - y'(a)F_{y'}\bigl(a,y'(a)\bigr) = a\,\frac{1+3y'(a)^2}{\bigl(1+y'(a)^2\bigr)^2}. \tag{9.81} \]
From this it follows that $y'(a)=\pm 1$ (we ignore the solution $y'(a)=0$) and since $y(x)$ is a decreasing function $y'(a)=-1$. From the definition of $p$, equation 9.63, it follows that $p_1=1$. Thus the solution is
\[ x = \frac{a}{4}\,\frac{(1+p^2)^2}{p}, \quad y = A + \frac{7a}{16} + \frac{a}{4}\left( \ln p - p^2 - \frac{3}{4}p^4 \right), \quad 1\le p\le p_2. \tag{9.82} \]
Finally the values of $a$ and $p_2$ are determined from the boundary conditions $x(p_2)=b$ and $y(p_2)=0$. Combining these equations we obtain
\[ A = \frac{b\,p_2}{(1+p_2^2)^2} \left\{ \frac{3}{4}p_2^4 + p_2^2 - \ln p_2 - \frac{7}{4} \right\}. \tag{9.83} \]
The term in curly brackets is zero at $p_2=1$ and the right-hand side of this equation increases as $p_2$ for large $p_2$. Also the gradient of the right-hand side is positive for $p_2>1$, see exercise 9.25. Hence for any positive value of $A$ this equation gives a unique value of $p_2$; and then $a$ can be determined from either of equations 9.82. Further, this path is a local (weak) minimum, see exercise 9.27.
In figure 9.16 are shown some solutions for the cases $A=4$, $b=1,2,\ldots,5$, 8 and 10: in this figure only the curved parts of the solutions are shown.
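The determination of $p_2$ and $a$ from $A$ and $b$ can be sketched in a few lines. The code below is an illustration only: the function names are hypothetical, `G` is the right-hand side of equation 9.83 divided by $b$ (the function of exercise 9.25), and a bisection is used because that function increases for $p>1$.

```python
import math


def G(p):
    """Right-hand side of equation 9.83 divided by b."""
    return p * (0.75 * p**4 + p**2 - math.log(p) - 1.75) / (1 + p**2)**2


def solve_newton_profile(A, b):
    """Return (a, p2) for the path 9.82 with y(a) = A and y(b) = 0, assuming A, b > 0."""
    target = A / b
    lo, hi = 1.0, 2.0
    while G(hi) < target:         # bracket the root; G(1) = 0 and G grows like 3p/4
        hi *= 2
    for _ in range(200):          # bisection on the increasing function G
        mid = 0.5 * (lo + hi)
        if G(mid) < target:
            lo = mid
        else:
            hi = mid
    p2 = 0.5 * (lo + hi)
    a = 4 * b * p2 / (1 + p2**2)**2      # from x(p2) = b in equation 9.82
    return a, p2


def x_of_p(p, a):
    """x(p) from equation 9.82."""
    return a * (1 + p**2)**2 / (4 * p)


def y_of_p(p, a, A):
    """y(p) from equation 9.82."""
    return A + 7 * a / 16 + (a / 4) * (math.log(p) - p**2 - 0.75 * p**4)
```

For example, `a, p2 = solve_newton_profile(4.0, 5.0)` gives the flat-nosed radius `a` for one of the cases plotted in figure 9.16; the curved part of the profile is then traced by `x_of_p(p, a)` and `y_of_p(p, a, 4.0)` for `p` between 1 and `p2`.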
Figure 9.16 Graphs of the solutions defined in equation 9.82 for $A=4$ and $b=1,2,\ldots,5$, 8 and 10. Here the horizontal part of each solution, from $x=0$ to $a$, is not shown.
Exercise 9.23
Derive the first two terms of equation 9.72.
Exercise 9.24
Show that as $a\to 0$ equation 9.73 can be written in the approximate form,
\[ \frac{A}{a} \simeq \frac{3}{4}\left(\frac{b}{a}\right)^{4/3}\xi(p_1)^{1/3}, \]
and hence that for sufficiently small $a$ there is no solution if $A<1.09\,b^{4/3}a^{-1/3}$.
Exercise 9.25
Denote the right-hand side of equation 9.83 by $bG(p_2)$ where
\[ G(p) = \frac{p}{(1+p^2)^2}\left\{ \frac{3}{4}p^4 + p^2 - \ln p - \frac{7}{4} \right\} \]
and show that $G(1)=0$, $G(p)=\frac{3}{4}p+O(1/p)$ as $p\to\infty$, and that $G'(p)>0$ for $p>1/\sqrt{3}$.
Exercise 9.26
Derive the Gateaux differential 9.78.
Exercise 9.27
Show that the second derivative of $S_2[y+\epsilon h]$ evaluated at $\epsilon=0$ is
\[ \frac{d^2S_2}{d\epsilon^2} = \frac{1}{2}k^2 + 2\int_a^b dx\, \frac{x(3y'^2-1)}{(1+y'^2)^3}\,h'^2, \]
where $k$ is defined in equation 9.76. Deduce that the stationary path defined by equation 9.82 gives a (weak) local minimum of $S_2[y]$, provided $3y'^2>1$.
Exercise 9.28
(a) Consider the value of $S_1[y]$, defined in equation 9.59, on the path
\[ z(x) = A\cos\left( \left(n+\frac{1}{2}\right)\frac{\pi x}{b} \right), \quad 0\le x\le b, \]
where $n$ is any integer, and show that $S_1[z]$ may be made arbitrarily small.
(b) Which norm does this path satisfy?
9.7 Miscellaneous exercises

Exercise 9.29
Show that the stationary paths of the functional
\[ S[y] = \int_a^b dx\, f(x,y)\sqrt{1+y'^2}, \quad y(a)=0, \]
with natural boundary conditions at $x=b$, are parallel to the $x$-axis at $x=b$ if $f(x,y)\ne 0$.
Exercise 9.30
Show that the stationary paths of the functional
\[ S[y] = \int_0^a dx\, y\sqrt{1+y'^2}, \quad y(0)=A>0, \]
with natural boundary conditions at $x=a$, are given by
\[ y = c\cosh\left(\frac{a-x}{c}\right) \quad\text{with}\quad A = c\cosh\left(\frac{a}{c}\right). \]
Show that there are two solutions if $A>1.509a$ and none for smaller $A$.
Exercise 9.31
Derive the Euler-Lagrange equations for the functional
\[ S[y] = G(y(v)) + \int_a^v dx\, F(x,y,y'), \quad y(a)=A, \]
for each of the two boundary conditions,
(a) natural boundary conditions at $x=v$, and
(b) the right end of the stationary path terminating on the curve defined by $\tau(x,y)=0$.
Exercise 9.32
Find the stationary paths for the functional
\[ S[y] = \int_0^v dx\, \frac{\sqrt{1+y'^2}}{y}, \quad y(0)=0, \]
where the point $(v,y(v))$ is constrained to the curve $x^2+(y-r)^2=r^2$, that is a circle of radius $r$ with centre on the $y$-axis at $y=r$.
Exercise 9.33
Consider the functional
\[ S[y] = \int_a^v dx\, f(x,y)\sqrt{1+y'^2}\,\exp\bigl(\beta\tan^{-1}y'\bigr), \quad y(a)=A, \]
with the condition that the right-hand end of the stationary path lies on the curve $C$ defined by $\tau(x,y)=0$. If the gradient of $C$ and the stationary path at the point of intersection are, respectively, $\tan\theta_C$ and $\tan\theta$, show that
\[ \tan(\theta-\theta_C) = \frac{1}{\beta}. \]
Exercise 9.34
Show that the stationary path of the functional
\[ S[y] = \int_0^1 dx\, \bigl(xy+y''^2\bigr), \quad y(0)=y'(0)=y(1)=0, \]
is $y(x)=x^2(1-x)(2x^2+2x-7)/480$.
Exercise 9.35
A weight of mass $M$ is hung from the end, $x=L$, of the beam described by the functional of equation 9.1 and the beam is clamped at $x=0$. The relevant energy functional is
\[ E[y] = -Mg\,y(L) + \int_0^L dx \left( \frac{1}{2}\kappa y''^2 - \rho(x)gy \right), \quad y(0)=y'(0)=0, \]
where the $y$-axis is pointing downwards. Find the associated Euler-Lagrange equation and boundary conditions for this problem. Solve this equation in the case that $\rho$ is independent of $x$.
Exercise 9.36
A weight of mass $M$ is hung from a given point, $x=\xi$, $0<\xi<L$, of the beam described by the functional of equation 9.1 and the beam rests on supports at $x=0$ and $x=L$, both at the same level. The relevant energy functional is
\[ E[y] = -Mg\,y(\xi) + \int_0^L dx \left( \frac{1}{2}\kappa y''^2 - \rho(x)gy \right), \quad y(0)=y(L)=0, \]
where the $y$-axis is pointing downwards. Assuming that $y(x)$ is continuous at $x=\xi$, find the associated Euler-Lagrange equation and all the boundary conditions for this problem.
Exercise 9.37
Prove that the functional
\[ S[y] = \int_a^b dx\, \bigl(y'^2+2yy'+y^2\bigr), \quad y(a)=A, \quad y(b)=B, \]
can have no broken extremals.
Exercise 9.38
Can the functional $S[y]=\int_0^a dx\, y'^3$, $y(0)=0$, $y(a)=A$, have broken extremals?
Exercise 9.39
Does the functional
\[ S[y] = \int_0^a dx\, \bigl(y'^4-6y'^2\bigr), \quad y(0)=0, \quad y(a)=A>0, \]
have any stationary paths with a single corner? Find any such paths.
Exercise 9.40
Find the equation for the stationary curve of the modified brachistochrone problem in which the initial point is $(0,A)$, $A>0$, and the final point is on a circle with centre on the $x$-axis at $x=b$ and with radius $r<b$. The particle starts from rest at $(0,A)$.
9.8 Solutions for chapter 9

Solution for Exercise 9.1
Suppose the stationary solution, $y(x)$, exists; then $T[y]$ is stationary. One set of allowed variations about this curve are those curves that pass through the same end points as $y(x)$. The stationary path for these boundaries is a cycloid because the Euler-Lagrange equation must be the same as that of the conventional brachistochrone; only the boundary conditions are different. The present boundary conditions merely pick out a different cycloid.

The functional is
\[ S[y] = \int_0^X dx\, \sqrt{1+y'^2}, \quad y(0)=A, \]
and the Euler-Lagrange equation has the solution $y''=0$. The solution passing through $(0,A)$ is therefore $y=mx+A$, for some $m$ to be determined. The natural boundary condition, equation 9.7, is
\[ F_{y'} = \frac{y'}{\sqrt{1+y'^2}} = 0, \]
that is, $y'=m=0$, so that $y=A$, which defines a straight line parallel to the $x$-axis.

The Euler-Lagrange equation is $y''+y=0$ and the solution satisfying the boundary condition at $x=0$ is $y=A\cos x+\alpha\sin x$, for some $\alpha$. The natural boundary condition is $F_{y'}=2y'=0$ at $x=\pi/4$, that is $\alpha\cos(\pi/4)-A\sin(\pi/4)=0$, that is $\alpha=A$. Thus the required solution is $y=A\sqrt{2}\sin(x+\pi/4)$.

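The analysis follows that of the text, the only difference being that the varied path, $y+\epsilon h$, is fixed at $x=b$, so $h(b)=0$. Then equation 9.5 becomes
\[ \Delta S[y,h] = -h(a)\,\frac{\partial F}{\partial y'}\bigg|_{x=a} - \int_a^b dx \left( \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} \right) h(x). \]
The same reasoning as used in the text gives the required result.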

The Euler-Lagrange equation is $y''-y=0$ and the solution satisfying the boundary condition at $x=1$ is $y=B\cosh(1-x)+\beta\sinh(1-x)$, for some $\beta$. The natural boundary condition is $F_{y'}=2y'=0$ at $x=0$, that is $B\sinh 1+\beta\cosh 1=0$. Thus the required solution is $y=B\cosh x/\cosh 1$.

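Using the path defined in equation 9.10 we have
\[ z' = \frac{dz}{dx} = \frac{dz}{d\phi}\bigg/\frac{dx}{d\phi} = \frac{1}{\tan\phi} \quad\text{and}\quad \frac{dx}{d\phi} = \frac{4b}{\pi}\sin^2\phi, \]
and the time is given by the functional, see equation 4.6 (page 166),
\[ T = \frac{1}{\sqrt{2g}}\int_0^{\pi/2} d\phi\, \frac{dx}{d\phi}\sqrt{\frac{1+z'^2}{z}} = 2\sqrt{\frac{b}{\pi g}}\int_0^{\pi/2} d\phi = \sqrt{\frac{\pi b}{g}}. \]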

The Euler-Lagrange equation for this problem is $dF_{y'}/dx=0$ and the natural boundary condition at $x=b$ is $F_{y'}|_{x=b}=0$. Hence the Euler-Lagrange equation becomes $F_{y'}=0$, that is,
\[ \frac{\partial F}{\partial y'} = \frac{1}{c^2-v^2}\left\{ \frac{c^2y'}{\sqrt{c^2(1+y'^2)-v^2}} - v \right\} = 0 \quad\text{giving}\quad y'(x)^2 = \frac{v^2}{c^2}. \]
Integrating this and assuming that $v(x)\ge 0$ and that $y'(x)\ge 0$ gives
\[ y(x) = \frac{1}{c}\int_0^x du\, v(u). \]

(a) In this case $F=y'^2-y^2$ so $\partial F/\partial y'=2y'$, $\partial F/\partial y=-2y$, giving the Euler-Lagrange equation $y''+y=0$.
(b) First note that no boundary conditions are given. Suppose that $y(x)$ and $y(x)+\epsilon h(x)$ are two admissible functions. Then the Gateaux differential is
\[ \Delta S[y,h] = 2\Bigl[ g_b h(b)y(b) - g_a h(a)y(a) \Bigr] + 2\int_a^b dx\, (h'y'-hy). \]
Integrate by parts to put this in the form
\[ \Delta S[y,h] = 2h(b)\bigl(y'(b)+g_by(b)\bigr) - 2h(a)\bigl(y'(a)+g_ay(a)\bigr) - 2\int_a^b dx\, (y''+y)h. \]
Using the subset of variations with $h(a)=h(b)=0$ and the fundamental lemma of the Calculus of Variations we see that $S[y]$ is stationary only on those paths satisfying the equation $y''+y=0$. On these paths the Gateaux differential is
\[ \Delta S[y,h] = 2h(b)\bigl(y'(b)+g_by(b)\bigr) - 2h(a)\bigl(y'(a)+g_ay(a)\bigr) \]
and this is zero for all variations only if $g_ay(a)+y'(a)=0$ and $g_by(b)+y'(b)=0$.
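Solution for Exercise 9.9
On the varied path $S[y+\epsilon h]=\int_a^b dx\, F(x,\,y+\epsilon h,\,y'+\epsilon h',\,y''+\epsilon h'')$ so differentiating with respect to $\epsilon$, using the chain rule, gives
\[ \frac{dS}{d\epsilon} = \int_a^b dx \left( h\frac{\partial F}{\partial y} + h'\frac{\partial F}{\partial y'} + h''\frac{\partial F}{\partial y''} \right) \]
where the derivatives of $F$ are evaluated at $y+\epsilon h$. Now put $\epsilon=0$ to obtain the result.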

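In this example $F=\frac{1}{2}\kappa y''^2-\rho gy$, so the Euler-Lagrange equation is $\kappa y^{(4)}=\rho g$ with $y(0)=y'(0)=0$. The general solution of this equation is
\[ y(x) = \frac{\rho g}{24\kappa}x^4 + Ax^3 + Bx^2 + Cx + D \]
for some constants $A$, $B$, $C$ and $D$. The conditions at $x=0$ give $C=D=0$. The conditions 9.17 become $y''(L)=0$ and $y^{(3)}(L)=0$ so that
\[ \frac{\rho g}{2\kappa}L^2 + 6AL + 2B = 0 \quad\text{and}\quad \frac{\rho g}{\kappa}L + 6A = 0 \quad\Longrightarrow\quad A = -\frac{\rho gL}{6\kappa}, \quad B = \frac{\rho gL^2}{4\kappa}, \]
giving the solution
\[ y(x) = \frac{\rho g}{24\kappa}\,x^2\bigl(x^2-4Lx+6L^2\bigr). \]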
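The clamped-beam solution can be checked by differentiating the quartic exactly. The sketch below is illustrative: the helper names are hypothetical, `kappa` and `rho` denote the stiffness constant and the uniform load per unit length of the energy functional (the names are an assumption made for this sketch), and the numerical values are arbitrary.

```python
def poly_derivative(coeffs):
    """Derivative of a polynomial given as [c0, c1, c2, ...] in ascending powers."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def poly_eval(coeffs, x):
    """Evaluate the polynomial with ascending coefficients at x."""
    return sum(c * x**k for k, c in enumerate(coeffs))


def cantilever_coeffs(rho, g, kappa, L):
    """Coefficients of y(x) = rho*g/(24*kappa) * x**2 * (x**2 - 4*L*x + 6*L**2)."""
    s = rho * g / (24 * kappa)
    return [0.0, 0.0, 6 * L**2 * s, -4 * L * s, s]


# Arbitrary illustrative values.
rho, g, kappa, L = 1.5, 9.81, 200.0, 2.0
y0 = cantilever_coeffs(rho, g, kappa, L)
y1 = poly_derivative(y0)   # y'
y2 = poly_derivative(y1)   # y''
y3 = poly_derivative(y2)   # y'''
y4 = poly_derivative(y3)   # y''''
```

With these lists, `poly_eval(y0, 0)` and `poly_eval(y1, 0)` vanish (the clamped end), `poly_eval(y2, L)` and `poly_eval(y3, L)` vanish (the natural conditions at the free end), and `y4` is the constant `rho * g / kappa`.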
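(a) The Gateaux differential is given in equation 9.14 and since $h(a)=h(b)=0$ this reduces to
\[ \Delta S[y,h] = \left[ h'\frac{\partial F}{\partial y''} \right]_a^b + \int_a^b dx \left( \frac{d^2}{dx^2}\left(\frac{\partial F}{\partial y''}\right) - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) + \frac{\partial F}{\partial y} \right) h. \]
Using the subset of varied paths for which $h'(a)=h'(b)=0$, we see that $y(x)$ satisfies the Euler-Lagrange equation
\[ \frac{d^2}{dx^2}\left(\frac{\partial F}{\partial y''}\right) - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) + \frac{\partial F}{\partial y} = 0, \quad y(a)=A, \quad y(b)=B. \]
The other boundary conditions are obtained by considering those paths for which $h'(a)=0$ and those for which $h'(b)=0$, which gives $F_{y''}|_a=F_{y''}|_b=0$.
(b) In this problem $F=\frac{1}{2}\kappa y''^2-\rho gy$ and the appropriate Euler-Lagrange equation is $\kappa y^{(4)}=\rho g$, with the boundary conditions $y(0)=y(L)=0$. The general solution of this equation that satisfies the condition $y(0)=0$ is
\[ y(x) = \frac{\rho g}{24\kappa}x^4 + Ax^3 + Bx^2 + Cx. \]
But $y(L)=0$ and since $F_{y''}=\kappa y''(x)$ we also have $y''(0)=y''(L)=0$. Since $y''(x)=\dfrac{\rho g}{2\kappa}x^2+6Ax+2B$ the condition at $x=0$ gives $B=0$. Then the conditions at $x=L$ give
\[ A = -\frac{\rho gL}{12\kappa} \quad\text{and}\quad 0 = \frac{\rho gL^4}{24\kappa} - \frac{\rho gL^4}{12\kappa} + CL \quad\text{giving}\quad C = \frac{\rho gL^3}{24\kappa} \]
so that
\[ y(x) = \frac{\rho g}{24\kappa}\bigl(x^4-2Lx^3+L^3x\bigr) = \frac{\rho g}{24\kappa}\,x(L-x)\bigl(L^2+xL-x^2\bigr). \]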

The integrand of the functional is independent of $x$, so the first integral is
\[ F - y'F_{y'} = \frac{1}{y}\left( \sqrt{1+y'^2} - \frac{y'^2}{\sqrt{1+y'^2}} \right) = \frac{1}{y\sqrt{1+y'^2}} = \frac{1}{c} \]
where $c$ is a constant. Rearranging gives
\[ y'^2 = \frac{c^2}{y^2} - 1 \quad\text{or, assuming } y'>0, \quad x = \int_0^y du\, \frac{u}{\sqrt{c^2-u^2}}. \]
Integration gives $x=c-\sqrt{c^2-y^2}$, or $y^2+(x-c)^2=c^2$. This is the equation of a circle, radius $c$ with centre at $(c,0)$, and is also the solution for $y'<0$.
The transversality condition shows that the stationary path must be perpendicular to the line $y=x-a$. But the only lines that are perpendicular to a circle are its diameters, and hence this line must pass through the centre of the circle, that is $c=a$, giving the stationary path $y^2+(x-a)^2=a^2$.
The same result follows algebraically, but this derivation is more difficult. With $\tau=y-x+a$, the boundary condition 9.26 becomes
\[ -F_{y'} + (y'F_{y'}-F) = 0 \quad\text{that is}\quad -\frac{y'}{y\sqrt{1+y'^2}} - \frac{1}{c} = 0 \]
but $y\sqrt{1+y'^2}=c$, therefore $y'=-1$.
If the intersection is at $x=v$, the equation for $y'$, with $y'=-1$, gives $y=c/\sqrt{2}$ and from the equation for the circle $v=c(1\pm 1/\sqrt{2})$; the required root is $v=c(1+1/\sqrt{2})$, the other root corresponding to $y'=1$. Now substitute these coordinates into the straight line equation, $y=x-a$, to see that $c=a$.

The solution of the Euler-Lagrange equation is given in equations 4.8 (page 167) and the boundary conditions at $x=0$ give $d=0$, so
\[ x = \frac{1}{2}c^2(2\phi-\sin 2\phi), \quad y = A - c^2\sin^2\phi, \quad 0\le\phi\le\phi_b. \]
The integrand of the functional may be taken to be $F=\sqrt{1+y'^2}/\sqrt{A-y}$, so the transversality condition 9.26 gives, since $\tau=x/a+y/b-1$,
\[ \frac{1}{\sqrt{A-y}\,\sqrt{1+y'^2}}\left( \frac{y'}{a} - \frac{1}{b} \right) = 0 \quad\text{that is}\quad \frac{dy}{dx} = \frac{a}{b}. \]
Hence the equation for $\phi_b$ is $\dfrac{dy}{dx}=-\dfrac{1}{\tan\phi_b}=\dfrac{a}{b}$. Finally, at $\phi=\phi_b$ the end of the cycloid is on the line $x/a+y/b=1$, that is
\[ \frac{c^2}{2a}(2\phi_b-\sin 2\phi_b) + \frac{1}{b}\bigl(A-c^2\sin^2\phi_b\bigr) = 1. \]
Since $\sin 2\phi_b=2\sin\phi_b\cos\phi_b$ this becomes
\[ c^2\left( \phi_b - \sin\phi_b\cos\phi_b - \frac{a}{b}\sin^2\phi_b \right) = a\left( 1-\frac{A}{b} \right) \]
but $-b/a=\tan\phi_b$ so this simplifies to $c^2\phi_b=a(1-A/b)$, which gives $c$ once $a$, $b$ and $A$ are known. Notice that the solution exists only if $A<b$, as would be expected.

(a) Points on the ellipse can be parameterised by $x=a\cos\theta$ and $y=b\sin\theta$. Substituting these expressions into the equation of the straight line gives
\[ \frac{a}{A}\cos\theta + \frac{b}{B}\sin\theta = 1 \quad\text{which gives}\quad \sqrt{\frac{a^2}{A^2}+\frac{b^2}{B^2}}\;\cos(\theta-\phi) = 1, \]
where
\[ \cos\phi = \frac{a/A}{\sqrt{\dfrac{a^2}{A^2}+\dfrac{b^2}{B^2}}} \quad\text{and}\quad \sin\phi = \frac{b/B}{\sqrt{\dfrac{a^2}{A^2}+\dfrac{b^2}{B^2}}}. \]
The equation for $\theta$ can be written in the form $\cos(\theta-\phi)=AB/\Delta$, where $\Delta=\sqrt{a^2B^2+b^2A^2}$, and this has real roots only if $|AB|\le\Delta$. If $|AB|>\Delta$ the equation has only complex roots and the ellipse and the line do not intersect.
(b) The functional is $S[y]=\int_u^{\xi} dx\,\sqrt{1+y'^2}$ where the pairs of coordinates $(u,v)$, on the ellipse, and $(\xi,\eta)$, on the line, satisfy the equations
\[ \tau_e = \frac{u^2}{a^2}+\frac{v^2}{b^2}-1 = 0 \quad\text{and}\quad \tau_l = \frac{\xi}{A}+\frac{\eta}{B}-1 = 0 \]
and also $v=y(u)$ and $\eta=y(\xi)$.
The general solution of the Euler-Lagrange equation is $y=mx+c$, where $m$ and $c$ are constants, which are chosen to satisfy the boundary conditions.
For the boundary conditions, equation 9.29, we require
\[ F_{y'} = \frac{y'}{\sqrt{1+y'^2}} = \frac{m}{\sqrt{1+m^2}} \quad\text{and}\quad y'F_{y'}-F = -\frac{1}{\sqrt{1+m^2}}. \]
The boundary conditions on the ellipse give
\[ \frac{2u}{a^2}\frac{m}{\sqrt{1+m^2}} - \frac{2v}{b^2}\frac{1}{\sqrt{1+m^2}} = 0 \quad\text{and hence}\quad \frac{mu}{a^2} = \frac{v}{b^2}. \tag{9.84} \]
The boundary conditions on the straight line give
\[ \frac{1}{A}\frac{m}{\sqrt{1+m^2}} - \frac{1}{B}\frac{1}{\sqrt{1+m^2}} = 0 \quad\text{and hence}\quad m = \frac{A}{B}, \tag{9.85} \]
which is the condition for the stationary path to be perpendicular to the straight line.
Also these points lie on the boundary curves,
\[ \frac{u^2}{a^2}+\frac{v^2}{b^2} = 1 \quad\text{and}\quad v = mu+c \tag{9.86} \]
and
\[ \frac{\xi}{A}+\frac{\eta}{B} = 1 \quad\text{and}\quad \eta = m\xi+c. \tag{9.87} \]
Thus we have six equations for the six parameters $(u,v)$, $(\xi,\eta)$ and $(m,c)$ that we need to find.
The distance along the stationary path is
\[ S[y] = \int_u^{\xi} dx\, \sqrt{1+m^2} = (\xi-u)\sqrt{1+m^2}. \tag{9.88} \]
Now $m=A/B$ is given directly, equation 9.85, and subtracting equations 9.86 from 9.87 gives $\eta-v=(\xi-u)m$: rearranging this and substituting for $v$ from 9.84 gives
\[ \eta = m\xi + mu\left( \frac{b^2}{a^2}-1 \right). \]
Substituting this into the first of equations 9.87 gives
\[ (B^2+A^2)\,\xi + A^2u\left( \frac{b^2}{a^2}-1 \right) = AB^2 \quad\text{or}\quad (\xi-u)(B^2+A^2) + \Delta^2\frac{u}{a^2} = AB^2. \]
Using equations 9.86 and 9.84 we obtain (since $u>0$) $u=a^2B/\Delta$ and since $\sqrt{1+m^2}=\sqrt{A^2+B^2}/B$ we have
\[ S[y] = (\xi-u)\frac{\sqrt{A^2+B^2}}{B} = \frac{AB-\Delta}{\sqrt{A^2+B^2}}. \]
There are easier ways of deriving this result.
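One of the easier routes is a direct numerical comparison. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the quantity in the final formula is $\Delta=\sqrt{a^2B^2+b^2A^2}$, with $A,B>0$ and the line $x/A+y/B=1$ lying outside the ellipse. It compares the closed form with a brute-force minimum over sampled points of the ellipse.

```python
import math


def distance_formula(a, b, A, B):
    """Closed-form shortest distance, assuming Delta = sqrt(a^2 B^2 + b^2 A^2)."""
    delta = math.sqrt(a**2 * B**2 + b**2 * A**2)
    return (A * B - delta) / math.sqrt(A**2 + B**2)


def distance_brute_force(a, b, A, B, n=200000):
    """Smallest perpendicular distance from sampled ellipse points to the line x/A + y/B = 1."""
    norm = math.sqrt(A**2 + B**2)
    best = float("inf")
    for k in range(n):
        t = 2 * math.pi * k / n
        x, y = a * math.cos(t), b * math.sin(t)
        d = abs(B * x + A * y - A * B) / norm   # distance from (x, y) to the line
        best = min(best, d)
    return best
```

For a circle ($a=b$) the closed form reduces to the distance of the line from the origin minus the radius, which provides a second simple check.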

As in the text the appropriate solutions of the Euler-Lagrange equations are $x=ct$ and $y=dt$, which is equivalent to $y=mx$ with $m=d/c$. The boundary condition 9.37 then gives $\phi'(\tau)c+\psi'(\tau)d=0$. If the boundary curve can be represented by a function $f(x)$, then $f'(x)=\psi'/\phi'$ so $\psi'/\phi'=-c/d$ and $mf'(x)=-1$.
If two lines intersect at a point where they have gradients $m_1$ and $m_2$ the angle between them, $\theta$, is given by
\[ \theta = \tan^{-1}m_2 - \tan^{-1}m_1 = \tan^{-1}\left( \frac{m_2-m_1}{1+m_1m_2} \right). \]
If $m_1m_2=-1$ the lines intersect at right angles. Hence the condition $mf'(x)=-1$ means that the stationary path and the boundary line intersect at right angles.

The functional defining the distance between the origin and a point on the parabola is
\[ S[y] = \int_0^v dx\,\sqrt{1+y'^2}, \quad y(0) = 0, \]
with the right-hand end of the path on the curve $\tau(x, y) = y^2 + a^2(x - 1) = 0$.

The solution of the associated Euler-Lagrange equation satisfying the boundary condition at $x = 0$ is $y = mx$ for some constant $m$. The boundary condition on the parabola gives, on using equation 9.26, $a^2y' = 2y$ and hence $v = a^2/2$. At this point $y = ma^2/2$, and since this point lies on the parabola we obtain $m^2 = 4/a^2 - 2$. Thus there are stationary paths if $a < \sqrt{2}$ and none if $a > \sqrt{2}$. Note that the stationary path through $(1, 0)$ is not given by this method.

First consider the case $A = 0$. Suppose that $z(x)$ is a solution, so that $\Delta S[z, h] = 0$; then for any constant $c$, $S[cz] = c^2S[z]$ and since $cz(x)$ also satisfies the boundary conditions if $A = 0$, $\Delta S[cz, h] = 0$, $cz$ is also a stationary path and there is no unique, nontrivial solution.
The Gateaux differential is
\[ \begin{aligned} \Delta S[y, h] &= 2Cy(\xi)h(\xi) + \int_0^{\xi} dx\,y_1'h' + \int_{\xi}^{L} dx\,y_2'h' \\ &= h(\xi)\left\{2Cy(\xi) + y_1'(\xi) - y_2'(\xi)\right\} - \int_0^{\xi} dx\,hy_1'' - \int_{\xi}^{L} dx\,hy_2''. \end{aligned} \]
Using the same arguments as in the text we see that $y_1$ and $y_2$ satisfy the Euler-Lagrange equations
\[ y_1'' = 0, \quad y_1(0) = A, \quad 0 \le x \le \xi, \quad\text{and}\quad y_2'' = 0, \quad \xi \le x \le L, \]
together with the following conditions at $x = \xi$
\[ y_1'(\xi) - y_2'(\xi) + 2Cy(\xi) = 0 \quad\text{and}\quad y_1(\xi) = y_2(\xi), \]
the first to make the path stationary and the second to ensure that the path is continuous.
The Euler-Lagrange equations have the following solutions
\[ y_1(x) = \alpha x + A \quad\text{and}\quad y_2(x) = \beta x + \gamma, \]
where $\alpha$, $\beta$ and $\gamma$ are constants. The natural boundary condition at $x = L$, see equation 9.7, gives $F_{y'} = y'(L) = 0$, that is $\beta = 0$.
Continuity at $x = \xi$ gives $\gamma = \alpha\xi + A$ and the other condition at $x = \xi$ gives $\alpha + 2C\gamma = 0$, and these two equations give
\[ \alpha = -\frac{2CA}{1 + 2C\xi} \quad\text{and}\quad \gamma = \frac{A}{1 + 2C\xi}. \]
Hence the solution is
\[ y(x) = \begin{cases} A\,\dfrac{1 + 2C(\xi - x)}{1 + 2C\xi}, & 0 \le x \le \xi, \\[2mm] \dfrac{A}{1 + 2C\xi}, & \xi \le x \le L. \end{cases} \]
Note that if $A = 0$, $y(x) = 0$.


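Differentiate equation 9.49 with respect to $\epsilon$ and then set $\epsilon = 0$,
\[ \Delta S = \xi\left[F(c, y_1(c), y_1'(c)) - F(c, y_2(c), y_2'(c))\right] + \int_a^c dx\left(h_1F_y + h_1'F_{y'}\right) + \int_c^b dx\left(h_2F_y + h_2'F_{y'}\right). \]
Integrate by parts to give
\[ \Delta S = \lim_{x\to c-}\left(\xi F + h_1F_{y'}\right) - \lim_{x\to c+}\left(\xi F + h_2F_{y'}\right) + \int_a^c dx\,h_1\left(F_y - \frac{dF_{y'}}{dx}\right) + \int_c^b dx\,h_2\left(F_y - \frac{dF_{y'}}{dx}\right). \]
Using the subset of variations for which $\xi = 0$ gives equations 9.50 and 9.51, for $y_1(x)$ and $y_2(x)$. On these paths $\Delta S$ reduces to equation 9.49.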

(a) The general solution of the Euler-Lagrange is y = mx + c and since y(0) = 0, c = 0;
also y(2) = 1, so m = 1/2 giving y = x/2.
(b) Substituting $y + \epsilon h$ into the functional and expanding in powers of $\epsilon$ gives
\[ S[y + \epsilon h] = S[y] + \epsilon\int_0^2 dx\,F_{y'}h' + \frac{1}{2}\epsilon^2\int_0^2 dx\,F_{y'y'}h'^2 + O(\epsilon^3), \quad F(y') = y'^2(1 - y')^2. \]
The term $O(\epsilon)$ is, by definition, zero on a stationary path and since
\[ F_{y'y'} = 2\left(1 - 6y' + 6y'^2\right) = -1 \]
on the path $y = x/2$ we have
\[ S[y + \epsilon h] = S[y] - \frac{\epsilon^2}{2}\int_0^2 dx\,h'^2 + O(\epsilon^3). \]
Hence for all allowed $h$ and $0 < |\epsilon| \ll 1$, $S[y + \epsilon h] - S[y] < 0$, so this stationary path is
a local maximum of $S[y]$.
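The value of $F_{y'y'}$ on the path $y = x/2$ can be checked numerically. The sketch below uses a central difference on $F(p) = p^2(1-p)^2$, with $p$ standing for $y'$; the helper names and the step size are assumptions made for illustration.

```python
def F(p):
    """Integrand F(y') = y'^2 (1 - y')^2, with p standing for y'."""
    return p * p * (1.0 - p) ** 2

def second_derivative(func, p, step=1e-4):
    """Central-difference estimate of func''(p)."""
    return (func(p + step) - 2.0 * func(p) + func(p - step)) / (step * step)
```

Evaluating `second_derivative(F, 0.5)` should give a value close to $-1$, consistent with the negative second variation found above.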

We require the simultaneous solutions of the equations f (x) = f (y) and g(x) = g(y),
where
\[ g(z) = z(z - 1)(2z - 1) \quad\text{and}\quad f(z) = z^2(z - 1)(3z - 1). \]
Clearly x = y (that is m1 = m2 ) are solutions but since c = (1 m2 )/(m1 m2 ), these
solutions are excluded. The solutions given are those for which f (z) = g(z) = 0, so now
we require solutions for which f and g are nonzero.
Given a value of $y$, with $f(y) \ne 0$ and $g(y) \ne 0$, the equation $g(x) = g(y)$ is a cubic for $x$ and has either one or three real solutions. If there is one real solution this must be $x = y$. If there are three real solutions, then one is $x = y$ leaving two other potentially interesting solutions; denote these by $x_1(y)$ and $x_2(y)$. These must also be solutions of the quartic, $f(x) = f(y)$. But the root structures of a quartic and a cubic are different; for instance roots of the quartic will coalesce at different values of $y$ than for the cubic, so it is unlikely that there is a range of $y$ for which $x_1(y)$ and $x_2(y)$ satisfy both equations.
There may, however, be accidental coincidences; we now show that there are none.
Consider the differences,
\[ f(x) - f(y) = (x - y)F(x, y) \quad\text{and}\quad g(x) - g(y) = (x - y)G(x, y) \]
where $F(x, y)$ and $G(x, y)$ are respectively symmetric cubic and quadratic functions of $x$ and $y$. The solution $x = y$ is of no interest, so we require the solutions of the equations
\[ \begin{aligned} F(x, y) &= 3(x^3 + y^3) + 3x^2y + 3xy^2 - 4(x^2 + y^2 + xy) + x + y = 0, \\ G(x, y) &= 2(x^2 + y^2) + 2xy - 3(x + y) + 1 = 0. \end{aligned} \]
These equations are more conveniently expressed in terms of the variables $u = x + y$ and $v = xy$ (so when $x = y$, $u^2 = 4v$),
\[ \begin{aligned} F(u, v) &= 3u^3 - 4u^2 - 6uv + u + 4v = 0, \\ G(u, v) &= 1 - 3u + 2u^2 - 2v = 0, \end{aligned} \]
which gives
\[ v = \frac{1}{2}\left(1 - 3u + 2u^2\right) \]
and
\[ F = (1 - u)(3u^2 - 6u + 2) = 0. \]
If $u = 1$ then $v = 0$ and $(x, y) = (0, 1)$ and $(1, 0)$, which are the solutions found in the text.
If $3u^2 - 6u + 2 = 0$, $u = 1 \pm \sqrt{3}/3$ and $v = 1/3 \pm \sqrt{3}/6$, so $u^2 = 4v$ giving $x = y$.
Hence there are no real solutions other than those found in the text.
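The elimination of $v$ and the resulting factorisation can be spot-checked numerically. This is only a sketch: the function names are hypothetical and the check is performed at a handful of sample values of $u$ rather than symbolically.

```python
import math

def F_uv(u, v):
    """F(u, v) = 3u^3 - 4u^2 - 6uv + u + 4v."""
    return 3 * u**3 - 4 * u**2 - 6 * u * v + u + 4 * v

def v_from_G(u):
    """Solve G(u, v) = 1 - 3u + 2u^2 - 2v = 0 for v."""
    return 0.5 * (1 - 3 * u + 2 * u * u)

def factored(u):
    """Claimed factorisation (1 - u)(3u^2 - 6u + 2) of F(u, v(u))."""
    return (1 - u) * (3 * u * u - 6 * u + 2)

def coincident_root_gap(sign):
    """u^2 - 4v at the root u = 1 + sign*sqrt(3)/3; zero means x = y."""
    u = 1 + sign * math.sqrt(3) / 3
    return u * u - 4 * v_from_G(u)
```

Both `coincident_root_gap(1)` and `coincident_root_gap(-1)` should vanish to rounding error, which is the statement that these roots only reproduce $x = y$.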

Suppose the corners are at $c_1$ and $c_2$, with $0 < c_1 < c_2 < 2$, and that the gradients are $m_1$ (for $0 \le x < c_1$), $m_2$ (for $c_1 < x < c_2$) and $m_3$ (for $c_2 \le x \le 2$). The Weierstrass-Erdmann conditions can be applied at each corner so the following solutions are possible,
\[ (m_1, m_2, m_3) = (1, 0, 1) \quad\text{and}\quad (0, 1, 0). \]
Consider each in turn.
For $(m_1, m_2, m_3) = (1, 0, 1)$ the continuous stationary path is constructed by drawing any horizontal line joining the lines $y = x$ and $y = x - 1$, that is
\[ y(x) = \begin{cases} x, & 0 \le x \le c, \\ c, & c \le x \le c + 1, \\ x - 1, & c + 1 \le x \le 2, \end{cases} \quad 0 < c < 1. \]
For $(m_1, m_2, m_3) = (0, 1, 0)$ the stationary path is
\[ y(x) = \begin{cases} 0, & 0 \le x \le c, \\ x - c, & c \le x \le c + 1, \\ 1, & c + 1 \le x \le 2, \end{cases} \quad 0 < c < 1. \]


The general solution of the Euler-Lagrange equation is $y = mx + c$ (because the integrand depends only upon $y'$). Hence the continuous solution, with one corner at $x = a$, has the form
\[ y(x) = \begin{cases} m_1x, & 0 \le x \le a < 4, \\ m_2(x - 4) + 2, & a \le x \le 4, \end{cases} \]
with continuity at $x = a$ giving $m_1a = m_2(a - 4) + 2$, that is $a = \dfrac{2 - 4m_2}{m_1 - m_2}$.
The Weierstrass-Erdmann condition 9.54 gives $m_1(m_1^2 - 1) = m_2(m_2^2 - 1)$ and three obvious solutions are $(m_1, m_2) = (0, \pm 1)$, $(\pm 1, 0)$ and $(\pm 1, \mp 1)$, in addition to $m_1 = m_2$, which we ignore because it does not give a corner.
The condition 9.53 gives $(m_1^2 - 1)(3m_1^2 + 1) = (m_2^2 - 1)(3m_2^2 + 1)$. Only the solution $(m_1, m_2) = (\pm 1, \mp 1)$ also satisfies this equation.
Thus there are two stationary paths
\[ y(x) = \begin{cases} x, & 0 \le x \le 3, \\ 6 - x, & 3 \le x \le 4, \end{cases} \quad\text{and}\quad y(x) = \begin{cases} -x, & 0 \le x \le 1, \\ x - 2, & 1 \le x \le 4. \end{cases} \]
The value of $S$ is the same on each path, $S = 0$, which is a global minimum.
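The selection of gradient pairs can be illustrated with a short sketch that tests both corner conditions, in the forms quoted above, on the candidate pairs. The function names and the tolerance are assumptions introduced for this example.

```python
def corner_conditions_hold(m1, m2, tol=1e-12):
    """Test the two Weierstrass-Erdmann conditions in the forms quoted above."""
    c54 = m1 * (m1 * m1 - 1) - m2 * (m2 * m2 - 1)
    c53 = (m1 * m1 - 1) * (3 * m1 * m1 + 1) - (m2 * m2 - 1) * (3 * m2 * m2 + 1)
    return abs(c54) < tol and abs(c53) < tol

def corner_position(m1, m2):
    """Corner abscissa a = (2 - 4 m2) / (m1 - m2) from continuity at x = a."""
    return (2 - 4 * m2) / (m1 - m2)

CANDIDATES = [(0, 1), (0, -1), (1, 0), (-1, 0), (1, -1), (-1, 1)]
ADMISSIBLE = [pair for pair in CANDIDATES if corner_conditions_hold(*pair)]
```

Only the pairs with $m_1 = -m_2 = \pm 1$ survive, and `corner_position` then places the corners at $x = 3$ and $x = 1$, matching the two paths given above.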

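Using the Taylor series for $\sin\theta_2$ and $\cos\theta_2$ the left-hand side of equation 9.71 becomes
\[ \theta_2^3\left(1 - \frac{1}{6}\theta_2^2 + O(\theta_2^4)\right)^3\left(1 - \frac{1}{2}\theta_2^2 + O(\theta_2^4)\right) = \theta_2^3\left(1 - \theta_2^2 + O(\theta_2^4)\right), \]
so the equation can be written in the form
\[ \theta_2 = d\left(1 - \theta_2^2 + O(\theta_2^4)\right)^{-1/3}. \]
If $|\theta_2| \ll 1$ the approximate solution of this equation is $\theta_2 = d$, so we may put $\theta_2 = d$ in the right-hand side to obtain
\[ \theta_2 = d\left(1 - d^2 + O(d^4)\right)^{-1/3} = d\left(1 + \frac{1}{3}d^2 + O(d^4)\right). \]
For small $|\theta_2|$
\[ \begin{aligned} \tan\theta_2 &= \theta_2 + \frac{1}{3}\theta_2^3 + O(\theta_2^5) \\ &= d\left(1 + \frac{1}{3}d^2\right) + \frac{1}{3}d^3 + O(d^5) = d\left(1 + \frac{2}{3}d^2\right) + O(d^5), \end{aligned} \]
and hence
\[ p_2 = \frac{1}{\tan\theta_2} = \frac{1}{d\left(1 + \frac{2}{3}d^2 + O(d^4)\right)} = \frac{1}{d}\left(1 - \frac{2}{3}d^2 + O(d^4)\right). \]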

If a b we see from figure 9.14 that p2 p1 > 1/ 3, so we may use the approximation
(p2 ) ' p32 to write equation 9.70 in the form
b
p32 = (p1 ),
a
and then using the approximation (p2 ) (p1 ), see figure 9.11 , equation 9.73 becomes
4/3
A (p2 ) 3 A 3 b
' , but (p2 ) ' p42 giving ' (p1 ) ,
a (p1 ) 4 a 4(p1 ) a
that is 4/3
A 3b 1
' (p1 )1/3 , p1 > .
a 4a 3

Since (p) is monotonic increasing
for p > 1/ 3 the smallest value of the right-hand
side is given by setting p1 = 1/ 3. Then we see that if a b the equation for p1 has
a solution only if
4/3 1/3 4/3
A 3 b 1 b
= 1.09 ,
a 4 a 3 a
and there are no solutions for smaller values of A.
Numerical calculations with a = 1, b = 10, 20 and 30 give this lower boundary
to be 19.5, 53.3 and 94.2 respectively. The above formula gives 23.5, 59.2 and 102
respectively.

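Clearly $G(1) = 0$ and for large $p$, on using the binomial expansion
\[ \begin{aligned} G(p) &= \frac{3p^5}{4(1 + p^2)^2}\left(1 + \frac{4}{3p^2} - \frac{4\ln p}{3p^4} - \frac{7}{3p^4}\right) \\ &= \frac{3p}{4}\left(1 - \frac{2}{p^2} + \frac{3}{p^4} + O(p^{-6})\right)\left(1 + \frac{4}{3p^2} - \frac{4\ln p}{3p^4} - \frac{7}{3p^4}\right) \\ &= \frac{3p}{4}\left(1 - \frac{2}{3p^2} - \frac{4\ln p}{3p^4} + O(p^{-4})\right). \end{aligned} \]
Hence
\[ G(p) = \frac{3p}{4} - \frac{1}{2p} - \frac{\ln p}{p^3} + O(p^{-3}), \]
which gives the result quoted.
The derivative of $G(p)$ can be obtained by defining
\[ f(p) = \frac{3}{4}p^4 + p^2 - \ln p - \frac{7}{4} \quad\text{so that}\quad f'(p) = \frac{1}{p}(3p^2 - 1)(p^2 + 1) \]
and
\[ G'(p) = \frac{(1 - 3p^2)f(p)}{(1 + p^2)^3} + \frac{3p^2 - 1}{p^2 + 1} = \frac{3p^2 - 1}{(p^2 + 1)^3}\left(\frac{11}{4} + \frac{1}{4}p^4 + p^2 + \ln p\right). \]
If $h(p) = 11/4 + p^4/4 + p^2 + \ln p$ then clearly $h(p) > 0$ for $p > 1$. At $p = 1/\sqrt{3}$, $h(p) = \frac{28}{9} - \frac{1}{2}\ln 3 > 0$ and $h'(p) = p^3 + 2p + 1/p > 0$ for $p > 0$; hence $h(p) > 0$ for $p > 1/\sqrt{3}$ and therefore $G'(p) > 0$ for $p > 1/\sqrt{3}$.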
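The derivative formula can be compared with a finite difference. The sketch below assumes that $G(p) = p\,f(p)/(1+p^2)^2$, which is the form consistent with the expansions above but is not stated explicitly in this solution; the function names are likewise illustrative.

```python
import math

def f(p):
    """Auxiliary function f(p) = (3/4) p^4 + p^2 - ln p - 7/4."""
    return 0.75 * p**4 + p * p - math.log(p) - 1.75

def G(p):
    """Assumed form G(p) = p f(p) / (1 + p^2)^2 (an inference, see lead-in)."""
    return p * f(p) / (1 + p * p) ** 2

def G_prime(p):
    """Closed form (3p^2 - 1) h(p) / (p^2 + 1)^3 with h(p) as defined above."""
    h = 2.75 + 0.25 * p**4 + p * p + math.log(p)
    return (3 * p * p - 1) * h / (p * p + 1) ** 3

def G_prime_numeric(p, step=1e-6):
    """Central-difference estimate of G'(p)."""
    return (G(p + step) - G(p - step)) / (2 * step)
```

Agreement between `G_prime` and `G_prime_numeric` at a few values of $p > 1/\sqrt{3}$ supports the algebra, under the stated assumption about $G$.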

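Since
\[ S_2[y + \epsilon h] = \frac{1}{2}(a + \epsilon k)^2 + \int_a^b dx\,F(x, y' + \epsilon h') - \int_a^{a + \epsilon k} dx\,F(x, y' + \epsilon h') \]
differentiation with respect to $\epsilon$ gives
\[ \frac{d}{d\epsilon}S_2[y + \epsilon h] = (a + \epsilon k)k + \int_a^b dx\,h'F_{y'}(x, y' + \epsilon h') - kF\left(z, y'(z) + \epsilon h'(z)\right) - \int_a^z dx\,h'F_{y'}(x, y' + \epsilon h'), \tag{9.89} \]
where $z = a + \epsilon k$. Now set $\epsilon = 0$ to obtain the Gateaux differential
\[ \Delta S_2[y, h] = ak - kF\left(a, y'(a)\right) + \int_a^b dx\,h'F_{y'}(x, y'). \]
Integrating by parts and using the fact that $h(b) = 0$ then gives the required result,
\[ \Delta S_2[y, h] = ak - kF\left(a, y'(a)\right) - h(a)F_{y'}\left(a, y'(a)\right) - \int_a^b dx\,h\frac{dF_{y'}}{dx}. \]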

The first derivative of $S_2[y + \epsilon h]$ with respect to $\epsilon$ is given in the solution of exercise 9.26, equation 9.89. Differentiating this expression again gives
\[ \begin{aligned} \frac{d^2S_2}{d\epsilon^2} = {} & k^2 + \int_a^b dx\,h'^2F_{y'y'}(x, y' + \epsilon h') \\ & - k\left\{kF_x\left(z, y'(z) + \epsilon h'(z)\right) + F_{y'}\left(z, y'(z) + \epsilon h'(z)\right)\frac{d}{d\epsilon}\left(y'(z) + \epsilon h'(z)\right)\right\} \\ & - kh'(z)F_{y'}\left(z, y'(z) + \epsilon h'(z)\right) - \int_a^z dx\,h'^2F_{y'y'}(x, y' + \epsilon h'), \end{aligned} \]
where $z = a + \epsilon k$. Putting $\epsilon = 0$ this becomes
\[ \frac{d^2S_2}{d\epsilon^2} = k^2 + \int_a^b dx\,h'^2F_{y'y'}(x, y') - k^2F_x(a, y'(a)) - k\left(ky''(a) + h'(a)\right)F_{y'}\left(a, y'(a)\right) - kh'(a)F_{y'}(a, y'(a)). \]
But equation 9.75 can be expanded in powers of $\epsilon$ to give
\[ 0 = \epsilon\left(ky'(a) + h(a)\right) + \epsilon^2\left(\frac{1}{2}k^2y''(a) + kh'(a)\right) + O(\epsilon^3). \]
Since this equation is true for all $\epsilon$ in a neighbourhood of the origin it follows that $ky'(a) + h(a) = 0$, as in equation 9.76, and $k^2y''(a) + 2kh'(a) = 0$. The two terms proportional to $F_{y'}(a, y'(a))$ therefore cancel and the second derivative becomes the simple expression
\[ \frac{d^2S_2}{d\epsilon^2} = k^2\left[1 - F_x\left(a, y'(a)\right)\right] + \int_a^b dx\,h'^2F_{y'y'}(x, y'). \]
But,
\[ F = \frac{x}{1 + y'^2} \quad\text{so that}\quad F_x = \frac{1}{1 + y'^2} \quad\text{and}\quad F_{y'y'} = \frac{2x(3y'^2 - 1)}{(1 + y'^2)^3}, \]
and since $y'(a) = 1$, see equation 9.81, the second derivative becomes
\[ \frac{d^2S_2}{d\epsilon^2} = \frac{1}{2}k^2 + 2\int_a^b dx\,\frac{x(3y'^2 - 1)}{(1 + y'^2)^3}h'^2. \]
It follows that provided $3y'^2 > 1$ the second variation is positive for all nonzero $k$ and $h(x)$, and that the stationary path is a weak local minimum.

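(a) The functional can be written
\[ \begin{aligned} S_1[z] &= b^2\int_0^1 du\,\frac{u}{1 + B^2\sin^2\left((n + 1/2)\pi u\right)}, \quad B = \frac{\pi A}{b}\left(n + \frac{1}{2}\right), \quad u = \frac{x}{b}, \\ &= b^2\sum_{p=1}^{n}\int_{(p-1)/(n+1/2)}^{p/(n+1/2)} du\,\frac{u}{1 + B^2\sin^2\left((n + 1/2)\pi u\right)} + b^2\int_{n/(n+1/2)}^{1} du\,\frac{u}{1 + B^2\sin^2\left((n + 1/2)\pi u\right)}. \end{aligned} \]
In each integral of the sum put $(n + 1/2)u = p - 1 + w$, $0 \le w \le 1$, and $(n + 1/2)u = n + w$ in the last integral to write this in the form
\[ S_1[z] = \frac{b^2}{(n + 1/2)^2}\sum_{p=1}^{n}\int_0^1 dw\,\frac{p - 1 + w}{1 + B^2\sin^2\pi w} + \frac{b^2}{(n + 1/2)^2}\int_0^{1/2} dw\,\frac{n + w}{1 + B^2\sin^2\pi w}. \]
But
\[ \int_0^1 dw\,\frac{p - 1 + w}{1 + B^2\sin^2\pi w} \le p\int_0^1 dw\,\frac{1}{1 + B^2\sin^2\pi w} = \frac{p}{\sqrt{1 + B^2}}, \quad p = 1, 2, \ldots, n \]
and
\[ \int_0^{1/2} dw\,\frac{n + w}{1 + B^2\sin^2\pi w} \le (n + 1)\int_0^1 dw\,\frac{1}{1 + B^2\sin^2\pi w} = \frac{n + 1}{\sqrt{1 + B^2}} \]
so that
\[ S_1[z] \le \frac{b^2}{(n + 1/2)^2\sqrt{1 + B^2}}\sum_{p=1}^{n+1} p = \frac{b^2}{2\sqrt{1 + B^2}}\left(1 + O(1/n)\right). \]
But $B = O(n)$ so $S_1[z] = O(1/n)$ for large $n$. Hence, given any number $\epsilon > 0$, an $n$ can be found such that $S_1[z] < \epsilon$.
(b) Since $\max(z'(x)) = O(n)$ the derivative is not bounded and $z(x)$ satisfies the $D_0$ norm.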

The natural boundary condition is, see equation 9.7, $F_{y'} = 0$ at $x = b$, that is
\[ \frac{f(x, y)y'}{\sqrt{1 + y'^2}} = 0 \quad\text{at } x = b. \]
Since $f(x, y) \ne 0$, this means that $y'(b) = 0$ and that the stationary path is perpendicular to the line $x = b$.

The general solution of the Euler-Lagrange equation is $y(x) = c\cosh(x/c + d)$, for some constants $c$ and $d$. The boundary condition at $x = 0$ gives $A = c\cosh d$, and at $x = a$, $y'(a) = 0$, as shown in exercise 9.29, gives
\[ y'(a) = \sinh\left(\frac{a}{c} + d\right) = 0 \quad\text{that is}\quad d = -\frac{a}{c}. \]
Hence $y(x) = c\cosh\left(\dfrac{a - x}{c}\right)$ with $A = c\cosh\left(\dfrac{a}{c}\right)$. With $\eta = a/c$ the equation for $c$ becomes $A/a = \eta^{-1}\cosh\eta$, which was considered in section 4.3 (equation 4.16, page 174).
This has two real solutions if $A > 1.509a$ and none for smaller $A$.
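The threshold 1.509 is the minimum of $\eta^{-1}\cosh\eta$, attained where $\eta\tanh\eta = 1$. A minimal sketch locating it by bisection follows; the function name, the bracket $[1, 2]$ and the iteration count are assumptions made for illustration.

```python
import math

def critical_ratio(iterations=100):
    """Return (eta, cosh(eta)/eta) at the minimum of cosh(eta)/eta.

    The minimum satisfies eta * tanh(eta) = 1, which changes sign on [1, 2].
    """
    lo, hi = 1.0, 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid * math.tanh(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    eta = 0.5 * (lo + hi)
    return eta, math.cosh(eta) / eta
```

The second value returned is the smallest ratio $A/a$ for which the equation for $c$ has a real solution.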

(a) On the path $y + \epsilon h$, with $h(a) = 0$, the functional is
\[ S[y + \epsilon h] = G\left(y(v) + \epsilon h(v)\right) + \int_a^v dx\,F(x, y + \epsilon h, y' + \epsilon h') \]
so that differentiating with respect to $\epsilon$ and then setting $\epsilon = 0$ gives
\[ \begin{aligned} \Delta S[y, h] &= h(v)G'(y(v)) + \int_a^v dx\left(hF_y + h'F_{y'}\right) \\ &= \Big[h\left(G'(y) + F_{y'}\right)\Big]_{x=v} + \int_a^v dx\left(F_y - \frac{dF_{y'}}{dx}\right)h. \end{aligned} \]
Using the subset of variations for which $h(v) = 0$, we see that $y(x)$ satisfies the Euler-Lagrange equation. Then the boundary term shows that the boundary condition at $x = v$ is
\[ F_{y'}(v, y(v), y'(v)) + G'(y(v)) = 0. \]
(b) If the right end of the path satisfies $\tau(v, y(v)) = 0$ and the varied path is $y + \epsilon h$, and ends at $v + \epsilon\xi$ for some $\xi$, the same analysis that leads to equation 9.19 gives
\[ \xi\tau_x + \xi y'(v)\tau_y + h(v)\tau_y = 0. \]
On the varied path
\[ S[y + \epsilon h] = G\left(y(v + \epsilon\xi) + \epsilon h(v + \epsilon\xi)\right) + \int_a^{v + \epsilon\xi} dx\,F(x, y + \epsilon h, y' + \epsilon h'). \]
It is convenient to write
\[ y(v + \epsilon\xi) + \epsilon h(v + \epsilon\xi) = y(v) + \epsilon\left(\xi y'(v) + h(v)\right) + O(\epsilon^2) \]
then differentiate with respect to $\epsilon$, and then set $\epsilon = 0$ to obtain the Gateaux differential
\[ \Delta S = G'(y(v))\left(\xi y'(v) + h(v)\right) + \xi F(v, y(v), y'(v)) + \int_a^v dx\left(hF_y + h'F_{y'}\right). \]
Integrating the second term of the integral by parts gives
\[ \Delta S = \xi\left(G'(y)y' + F\right) + h\left(G' + F_{y'}\right) + \int_a^v dx\left(F_y - \frac{dF_{y'}}{dx}\right)h, \]
where the boundary terms are evaluated at $x = v$. Because $\xi$ is proportional to $h$, using the subset of variations for which $h(v) = 0$, we deduce that $y(x)$ satisfies the Euler-Lagrange equation. Then it follows that at $x = v$
\[ \left(G'(y) + F_{y'}\right)\tau_x + \left(y'F_{y'} - F\right)\tau_y = 0. \]

The Euler-Lagrange equation has a first integral,
\[ F - y'F_{y'} = \frac{1}{y\sqrt{1 + y'^2}} = \frac{1}{c} \]
so that, as in the solution to exercise 9.12, $(x - c)^2 + y^2 = c^2$. The boundary conditions show that this circle must intersect the circle $x^2 + (y - r)^2 = r^2$ at right angles.
The two circles intersect perpendicularly at the origin, see figure 9.17. At other points the angles of intersection are the same, so the circle $(x - c)^2 + y^2 = c^2$ satisfies the boundary condition for all $c$ and there are infinitely many stationary paths.
Figure 9.17 The boundary curve, a circle of radius $r$, and a stationary path, a circle of radius $c$.
Solution for Exercise 9.33

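Since $F(x, y, y') = f(x, y)\sqrt{1 + y'^2}\exp(\alpha\tan^{-1}y')$ we have
\[ \begin{aligned} F_{y'} &= f(x, y)\exp(\alpha\tan^{-1}y')\left\{\frac{y'}{\sqrt{1 + y'^2}} + \frac{\alpha\sqrt{1 + y'^2}}{1 + y'^2}\right\} \\ &= \frac{f(x, y)}{\sqrt{1 + y'^2}}(\alpha + y')\exp(\alpha\tan^{-1}y') \end{aligned} \]
so that
\[ y'F_{y'} - F = \frac{f(x, y)}{\sqrt{1 + y'^2}}\exp(\alpha\tan^{-1}y')(\alpha y' - 1). \]
The boundary condition 9.8 then gives
\[ \frac{f(x, y)}{\sqrt{1 + y'^2}}\left\{(\alpha + y')\tau_x + (\alpha y' - 1)\tau_y\right\} = 0 \quad\text{at } x = v. \]
If the gradients of $\tau$ and the stationary path at $x = v$ are respectively $\tan\theta_c$ and $\tan\theta$, so $\tau_x = -\tau_y\tan\theta_c$ and $y' = \tan\theta$, this boundary condition becomes
\[ -(\alpha + \tan\theta)\tan\theta_c + (\alpha\tan\theta - 1) = 0 \]
which rearranges to $\alpha(\tan\theta - \tan\theta_c) = 1 + \tan\theta\tan\theta_c$, giving the required result.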

On the varied path $y + \epsilon h$, with $h(0) = h'(0) = h(1) = 0$, the functional has the value
\[ S[y + \epsilon h] = \int_0^1 dx\left(x(y + \epsilon h) + (y'' + \epsilon h'')^2\right) \quad\text{giving}\quad \Delta S = \int_0^1 dx\left(xh + 2h''y''\right). \]
Integrating by parts twice gives
\[ \Delta S = 2\Big[h'y'' - hy'''\Big]_0^1 + \int_0^1 dx\left(x + 2y^{(4)}\right)h. \]
Since $h(0) = h'(0) = h(1) = 0$ this gives
\[ \Delta S = 2h'(1)y''(1) + \int_0^1 dx\left(x + 2y^{(4)}\right)h. \]
The Euler-Lagrange equation is therefore
\[ 2y^{(4)}(x) = -x, \quad y(0) = 0, \quad y'(0) = 0, \quad y(1) = 0, \quad y''(1) = 0. \]
The general solution of this equation is $y(x) = -x^5/240 + Ax^3 + Bx^2 + Cx + D$. The boundary conditions $y(0) = y'(0) = 0$ give $C = D = 0$, and the other two conditions give the equations
\[ 0 = -\frac{1}{240} + A + B \quad\text{and}\quad 0 = -\frac{1}{12} + 6A + 2B \]
with solution $A = 3/160$ and $B = -7/480$ giving $y(x) = x^2(1 - x)(2x^2 + 2x - 7)/480$.
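Because the solution is a polynomial, the equation and all four boundary conditions can be verified exactly with rational arithmetic. In this sketch the dictionary representation of the polynomial and the helper names are assumptions made for illustration.

```python
from fractions import Fraction

# y(x) = -x^5/240 + 3x^3/160 - 7x^2/480, stored as {power: coefficient}.
Y = {5: Fraction(-1, 240), 3: Fraction(3, 160), 2: Fraction(-7, 480)}

def derivative(poly):
    """Differentiate a polynomial stored as {power: coefficient}."""
    return {n - 1: n * c for n, c in poly.items() if n > 0}

def evaluate(poly, x):
    """Evaluate the polynomial exactly at a rational point x."""
    return sum(c * Fraction(x) ** n for n, c in poly.items())

Y1 = derivative(Y)
Y2 = derivative(Y1)
Y4 = derivative(derivative(Y2))
```

Evaluating `Y`, `Y1` and `Y2` at the end points reproduces the four conditions, and `2 * evaluate(Y4, x) + x` vanishes for any rational `x`.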

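First compute the Gateaux differential by evaluating the functional on the path $y + \epsilon h$, where $h(0) = h'(0) = 0$,
\[ E[y + \epsilon h] = -Mg\left(y(L) + \epsilon h(L)\right) + \int_0^L dx\left(\frac{1}{2}\kappa(y'' + \epsilon h'')^2 - \rho(x)g(y + \epsilon h)\right) \]
so that
\[ \frac{dE}{d\epsilon} = -Mgh(L) + \int_0^L dx\left(\kappa h''(y'' + \epsilon h'') - \rho gh\right) \]
and putting $\epsilon = 0$ gives the Gateaux differential
\[ \Delta E[y, h] = -Mgh(L) + \int_0^L dx\left(\kappa h''y'' - \rho gh\right). \]
Integrating by parts twice gives
\[ \Delta E[y, h] = -\left(Mg + \kappa y'''(L)\right)h(L) + \kappa y''(L)h'(L) + \int_0^L dx\left(\kappa y^{(4)} - \rho g\right)h(x). \]
Hence the stationary path satisfies the equation
\[ \kappa\frac{d^4y}{dx^4} = \rho(x)g, \quad y(0) = y'(0) = y''(L) = 0, \quad y^{(3)}(L) = -\frac{Mg}{\kappa}. \]
If $\rho$ is independent of $x$ the general solution of this equation that satisfies the boundary conditions at $x = 0$ is
\[ y(x) = \frac{\rho g}{24\kappa}x^4 + Ax^3 + Bx^2 \]
and the constants $A$ and $B$ are determined by the two conditions at $x = L$; since
\[ y^{(2)}(x) = \frac{\rho g}{2\kappa}x^2 + 6Ax + 2B \quad\text{and}\quad y^{(3)}(x) = \frac{\rho g}{\kappa}x + 6A \]
these give the equations
\[ A = -\frac{g}{6\kappa}(M + \rho L) \quad\text{and}\quad B = \frac{Lg}{2\kappa}\left(M + \frac{\rho L}{2}\right). \]
Hence
\[ \begin{aligned} y(x) &= \frac{\rho g}{24\kappa}x^4 - (M + \rho L)\frac{gx^3}{6\kappa} + \left(M + \frac{\rho L}{2}\right)\frac{gLx^2}{2\kappa}, \\ &= \frac{\rho g}{24\kappa}x^2\left(x^2 - 4xL + 6L^2\right) + \frac{Mg}{6\kappa}x^2(3L - x). \end{aligned} \]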

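This problem is essentially the same as that considered in section 9.5.1, but with the addition of natural boundary conditions at both ends, as in exercise 9.11. Let the stationary path be
\[ y(x) = \begin{cases} y_1(x), & 0 \le x \le \xi, \\ y_2(x), & \xi \le x \le L, \end{cases} \]
with $y_1(\xi) = y_2(\xi)$. Using the same analysis as used to derive equation 9.43 we obtain
\[ \Delta E = -Mgh(\xi) + \int_0^{\xi} dx\left(\kappa y_1''h_1'' - \rho gh_1\right) + \int_{\xi}^{L} dx\left(\kappa y_2''h_2'' - \rho gh_2\right). \]
Since
\[ \int dx\,y''h'' = \Big[y''h' - y'''h\Big] + \int dx\,y^{(4)}h \]
the Gateaux differential becomes
\[ \begin{aligned} \Delta E = {} & -Mgh(\xi) + \kappa\Big[y_1''h_1' - y_1'''h_1\Big]_0^{\xi} + \kappa\Big[y_2''h_2' - y_2'''h_2\Big]_{\xi}^{L} \\ & + \int_0^{\xi} dx\left(\kappa y_1^{(4)} - \rho g\right)h_1 + \int_{\xi}^{L} dx\left(\kappa y_2^{(4)} - \rho g\right)h_2. \end{aligned} \]
On collecting relevant terms together and using the fact that $h(x)$ is continuous at $x = \xi$ and that $h_1(0) = h_2(L) = 0$ this becomes
\[ \begin{aligned} \Delta E = {} & \Big\{\kappa\left(y_2''' - y_1'''\right) - Mg\Big\}h(x)\Big|_{x=\xi} + \kappa\left(y_1''h_1' - y_2''h_2'\right)\Big|_{x=\xi} \\ & - \kappa\Big[y_1''h_1'\Big]_{x=0} + \kappa\Big[y_2''h_2'\Big]_{x=L} \\ & + \int_0^{\xi} dx\left(\kappa y_1^{(4)} - \rho g\right)h_1 + \int_{\xi}^{L} dx\left(\kappa y_2^{(4)} - \rho g\right)h_2. \end{aligned} \]
Now choose the subset of variations that make all the boundary terms zero to see that $y_1$ and $y_2$ satisfy the Euler-Lagrange equations
\[ \begin{aligned} \kappa y_1^{(4)} &= \rho g, \quad y_1(0) = 0, \quad 0 \le x \le \xi, \\ \kappa y_2^{(4)} &= \rho g, \quad y_2(L) = 0, \quad \xi \le x \le L. \end{aligned} \]
The coefficient of $h(\xi)$ in $\Delta E$ gives
\[ Mg = \kappa\left(y_2'''(\xi) - y_1'''(\xi)\right). \]
Also, since $h_1'(0) \ne 0$ and $h_2'(L) \ne 0$ the natural boundary conditions at $x = 0$ and $L$ are
\[ y_1''(0) = y_2''(L) = 0. \]
Finally choose the subset of variations for which $h'(x)$ is continuous to see that $y_1''(\xi) = y_2''(\xi)$, that is $y''(x)$ is continuous at $x = \xi$.
For the sake of completeness we now show how these conditions can be used to find the solution when $\rho$ is independent of $x$. This analysis was not requested in the question.
The two solutions of the Euler-Lagrange equation that fit the boundary conditions at $x = 0$ and $L$ are
\[ \begin{aligned} y_1(x) &= \frac{\rho g}{24\kappa}x^4 + a_1x^3 + b_1x, \\ y_2(x) &= \frac{\rho g}{24\kappa}(L - x)^4 + a_2(L - x)^3 + b_2(L - x), \end{aligned} \]
so there are four further constants to be determined by the conditions just derived. The conditions on the second and third derivatives at $x = \xi$ give
\[ Mg = -\rho gL - 6\kappa(a_1 + a_2) \quad\text{and}\quad 6\kappa a_2L = 6\kappa(a_1 + a_2)\xi + \frac{\rho gL}{2}(2\xi - L). \]
Hence
\[ a_1 = \frac{Mg}{6\kappa L}(\xi - L) - \frac{\rho Lg}{12\kappa} \quad\text{and}\quad a_2 = -\frac{Mg\xi}{6\kappa L} - \frac{\rho Lg}{12\kappa} \]
and
\[ b_1 = \frac{\rho gL^3}{24\kappa} + \frac{Mg\xi}{6\kappa L}(L - \xi)(2L - \xi) \quad\text{and}\quad b_2 = \frac{\rho gL^3}{24\kappa} + \frac{Mg\xi}{6\kappa L}\left(L^2 - \xi^2\right). \]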

Here $F = y'^2 + 2yy' + y^2$ so that $F_{y'} = 2(y' + y)$. The Weierstrass-Erdmann (corner) conditions, equations 9.53 and 9.54, show that this expression is continuous at any corner. Since $y(x)$ is continuous it follows that $y'(x)$ is also continuous. The Euler-Lagrange equation is second-order, so it follows (by differentiation) that all higher derivatives are continuous. Hence there are no corners.

The general solution of the Euler-Lagrange equation is $y = mx + d$. Let $m_1$ and $m_2$ be the gradients to the left and right, respectively, of the corner. The Weierstrass-Erdmann corner conditions, equations 9.53 and 9.54, give, since $F - y'F_{y'} = -2y'^3$,
\[ m_1^3 = m_2^3, \quad m_1^2 = m_2^2. \]
The second equation gives $m_2 = \pm m_1$ and the first shows that the only solution is $m_1 = m_2$. Hence the stationary paths of this functional have no corners.

Here $F = y'^4 - 6y'^2$ and the general solution of the associated Euler-Lagrange equation is $y = mx + d$. If there is a corner and if the gradients on the left and right-hand sides are $m_1$ and $m_2$ then since
\[ F_{y'} = 4y'\left(y'^2 - 3\right) \quad\text{and}\quad F - y'F_{y'} = 3y'^2\left(2 - y'^2\right) \]
the Weierstrass-Erdmann corner conditions, equations 9.53 and 9.54, give
\[ m_1\left(m_1^2 - 3\right) = m_2\left(m_2^2 - 3\right) \quad\text{and}\quad m_1^2\left(m_1^2 - 2\right) = m_2^2\left(m_2^2 - 2\right). \]
The second of these equations gives $m_2 = -m_1$ (we ignore the solution $m_2 = m_1$), and then the first equation gives
\[ m_1 = m, \quad m_2 = -m \quad\text{with}\quad m = \pm\sqrt{3}. \]
The stationary path that satisfies the boundary conditions is $y = mx$ (for $0 \le x \le c$) and $y = A + m(a - x)$ (for $c \le x \le a$) and continuity at $x = c$ gives $mc = A + m(a - c)$.
Thus the two solutions are
\[ y(x) = \begin{cases} mx, & 0 \le x \le c, \\ A + m(a - x), & c \le x \le a, \end{cases} \quad\text{where}\quad c = \frac{A + ma}{2m}, \quad m = \pm\sqrt{3}. \]
Since $0 < c < a$ we also need $|A| < \sqrt{3}\,a$.

The equation of the circle is most conveniently expressed in the parametric form
\[ x = b + r\cos\phi, \quad y = r\sin\phi \quad\text{(circle)} \]
and the equation of the cycloid, found in section 4.2.3, is
\[ x = \frac{1}{2}c^2(2\theta - \sin 2\theta), \quad y = A - c^2\sin^2\theta \quad\text{(cycloid)}. \]
The geometric interpretation of $\phi$ is shown in figure 9.18, where we have set $A = b = 2$ and $r = 1/2$ for the purpose of illustration.
Figure 9.18 Example of a cycloid starting at (0, 2) and terminating on the circumference of the circle of radius $r = 1/2$ with centre at (2, 0).
The boundary condition, equation 9.37, is therefore
\[ 2c^2r\sin^2\theta\sin\phi + 2c^2r\sin\theta\cos\theta\cos\phi = 0 \]
which gives $\cos(\theta - \phi) = 0$ (we ignore the solution $\sin\theta = 0$). This equation has many solutions and we determine which is appropriate by considering the limiting case where $\theta = \pi/2$ (and $c^2 = A$) so at the terminus the cycloid is tangential to the $x$-axis. Then $\phi = \pi$, so the required solution is $\phi - \theta = \pi/2$.
The equations for $c$ and the value of $\theta$ at the terminus, which we denote by $\theta_1$, are
\[ \frac{1}{2}c^2(2\theta_1 - \sin 2\theta_1) = b - r\sin\theta_1 \quad\text{and}\quad c^2\sin^2\theta_1 = A - r\cos\theta_1. \]
An equation for $\theta_1$ is therefore
\[ \frac{2\theta_1 - \sin 2\theta_1}{2\sin^2\theta_1} = \frac{b - r\sin\theta_1}{A - r\cos\theta_1}, \quad (b > r). \]
This equation has one real root in the interval $0 < \theta_1 < \pi$, as may be seen by sketching the graphs of
\[ f_1(\theta) = \frac{2\theta - \sin 2\theta}{2\sin^2\theta} \quad\text{and}\quad f_2(\theta) = \frac{b - r\sin\theta}{A - r\cos\theta}. \]
Observe that $f_1(\theta) = \frac{2}{3}\theta + O(\theta^3)$, that $f_1(\theta) \to \infty$ as $\theta \to \pi$ and that $f_1$ is monotonic increasing for $0 < \theta < \pi$. Note also that the behaviour of $f_2$ depends upon whether $A > r$ or $A < r$, but in either case sketches of these functions show that there is one real root for $0 < \theta < \pi$.
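The root can also be located numerically rather than by sketching. The following is a minimal bisection sketch for the equation $f_1 = f_2$, assuming $A > r$ so that $f_1 - f_2$ is negative near $0$ and positive near $\pi$; the function names, the bracket and the iteration count are illustrative assumptions.

```python
import math

def f1(theta):
    """Left-hand side (2 theta - sin 2 theta) / (2 sin^2 theta)."""
    return (2.0 * theta - math.sin(2.0 * theta)) / (2.0 * math.sin(theta) ** 2)

def f2(theta, A, b, r):
    """Right-hand side (b - r sin theta) / (A - r cos theta)."""
    return (b - r * math.sin(theta)) / (A - r * math.cos(theta))

def terminus_angle(A, b, r, iterations=200):
    """Bisection for f1 = f2 on (0, pi); assumes A > r and b > r."""
    lo, hi = 1e-6, math.pi - 1e-6
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f1(mid) < f2(mid, A, b, r):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For the illustrative values $A = b = 2$, $r = 1/2$ of figure 9.18, `terminus_angle(2, 2, 0.5)` returns the cycloid parameter at the terminus, from which $c^2$ follows using the second of the two terminus equations.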
Chapter 10
Conditional stationary points
10.1 Introduction
In this chapter we introduce the method needed to treat constrained variational prob-
lems, examples of which are the isoperimetric and catenary problems, described in
sections 2.5.5 and 2.5.6. With such problems the admissible paths are constrained to
a subset of all possible paths: in the isoperimetric and catenary problems these con-
straints are the lengths of the boundary and chain, respectively.
We introduce the technique required using the simpler example of constrained sta-
tionary points of functions of two or more variables, beginning with a discussion of a few
elementary cases; the method is applied to the Calculus of Variations in the next chap-
ter. Throughout this chapter we assume that all functions are sufficiently differentiable
in the region of interest.
Consider a walker on a hill but confined to a one-dimensional path, AB, as shown
in figure 10.1.
Figure 10.1 Graph showing the height $h(x, y)$ of the hill as $x$ and $y$ vary. The path $x + y = 1$ is depicted by the solid line.
In this example the height of the hill is represented by the function
\[ h(x, y) = 3\exp\left(-x^2 - y^2/2\right) \tag{10.1} \]
and the path by the equation x + y = 1. This hill has a global maximum at x = y = 0,
but because the path does not pass through this point the maximum height attained
by the walker is less. The problem is to find this stationary point and its position: we
should also like to classify this stationary point, but usually this is more difficult.
The maximum height of the walker may be determined by rearranging the equation of the path to express $y$ in terms of $x$, $y = 1 - x$, and then by expressing the height in terms of $x$ alone,
\[ h(x) = 3\exp\left(-\frac{3}{2}x^2 + x - \frac{1}{2}\right). \tag{10.2} \]
The maximum of this function may be found by the methods described in section 7.2, see also exercise 10.1, and is $\max(h(x, y)) = 3e^{-1/3}$. In this example the path $x + y = 1$
constrains the walker and is named the constraint, or the equation of constraint.
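The direct method can be illustrated numerically: substitute the constraint into the height and search along the path. This is a sketch only; the function names, the search interval and the grid size are assumptions made for illustration.

```python
import math

def hill(x, y):
    """Height of the hill, equation 10.1."""
    return 3.0 * math.exp(-x * x - y * y / 2.0)

def height_on_path(x):
    """Height along the path x + y = 1, using y = 1 - x (equation 10.2)."""
    return hill(x, 1.0 - x)

def argmax_on_path(lo=-2.0, hi=2.0, n=40_001):
    """Grid search for the x that maximises the height on the path."""
    grid = (lo + (hi - lo) * i / (n - 1) for i in range(n))
    return max(grid, key=height_on_path)
```

The grid search should return a value close to $x = 1/3$, where the height is $3e^{-1/3}$.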
Another problem is that of inscribing a rectangle of maximum area inside a given
ellipse, such that all corners of the rectangle lie on the ellipse, as shown in figure 10.2.
Figure 10.2 Diagram of a rectangle inscribed in the ellipse defined by equation 10.3.
The coordinates of the top right-hand corner of the rectangle are $(x, y)$ and since the equation of the ellipse is
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1, \tag{10.3} \]
this is the equation of constraint. The area of the rectangle is
\[ A(x, y) = 4xy, \quad x > 0, \quad y > 0, \tag{10.4} \]
so we need the maximum of this function subject to the constraint 10.3: this problem
is solved in exercise 10.2.
If there are two independent variables, $(x, y)$, there can be only one constraint which we denote by $g(x, y) = 0$, and we require the stationary points of $f(x, y)$ subject to this constraint. Geometrically the constraint equation, $g(x, y) = 0$, defines a curve $C_g$ in the $Oxy$ plane, see for example figure 10.3, so we are searching for the stationary points of $f(x, y)$ along this curve.
With two independent variables there can be only one constraint because another constraint, $\phi(x, y) = 0$, defines another curve, $C_\phi$, that intersects $C_g$ at isolated points, if at all. Sometimes, however, the equations $g(x, y) = 0$ and $\phi(x, y) = 0$ will define the same curve, despite being algebraically dissimilar: then the functions $g$ and $\phi$ are said to be dependent and it can be shown that in the region where the curves $g(x, y) = 0$ and $\phi(x, y) = 0$ coincide there is a differentiable function $F(u, v)$ of two real variables such that $F(g(x, y), \phi(x, y)) = \text{constant}$: alternatively, using the implicit function theorem, section 1.3.7, it can be shown that $\phi$ can be expressed in terms of $g$, $\phi(x, y) = G(g(x, y))$, or vice versa. It is not always obvious that two functions define the same curve: for instance the equations
\[ g(x, y) = y - \sinh^{-1}(\tan x) = 0 \quad\text{and}\quad \phi(x, y) = \frac{2}{1 + \tan(x/2)} - 1 - e^{-y} = 0 \tag{10.5} \]
define the same line in the vicinity of the origin.

The equation of constraint, $g(x, y) = 0$, can be used to express $y$ in terms of $x$, provided $g_y(x, y) \ne 0$, and then the function $f(x, y)$ becomes a function, $f(x, y(x))$, of the single variable $x$, representing the variation of $f$ along those segments of the curve $C_g$ not including tangents parallel to $Oy$, for instance the segments $AB$ and $BC$ of the curve depicted in figure 10.3. The stationary points of $f(x, y(x))$ can then be found in the usual manner. Similarly, for segments of $C_g$ on which $g_x(x, y) \ne 0$, such as $A'B'$ or $B'C'$ in figure 10.3, we can form $f(x(y), y)$ and treat this as a function of the single variable $y$.
Figure 10.3 A typical curve defined by a constraint equation $g(x, y) = 0$. The segments $AB$ and $BC$ may be represented by functions $y(x)$ and the segments $A'B'$ and $B'C'$ by functions $x(y)$.
If there are three independent variables, (x, y, z) and we require the stationary points
of f (x, y, z) subject to the single constraint g1 (x, y, z) = 0, we may proceed in the same
manner, by using the constraint to express z in terms of (x, y) to form the function
f (x, y, z(x, y)) of two independent variables. With two constraints gk (x, y, z) = 0,
k = 1, 2, the more general implicit function theorem, described on page 32, may be used
to express any two variables in terms of the third, to express f (x, y, z) as a function
of one variable. In either case there are three ways to proceed and it is rarely clear in
advance which yields the simplest algebra.
In general, with $n$ variables $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ there can be at most $n - 1$ constraints. Suppose there are $m$ constraints, $m \le n - 1$, $g_k(\mathbf{x}) = 0$, $k = 1, 2, \ldots, m$. Then, in principle we may use these $m$ equations to express $m$ of the variables in terms of the remaining $n - m$, hence giving a function of $n - m$ variables. In practice this is rarely an easy task.
There are two main methods of dealing with constrained stationary problems. The conceptually simplest method is to reduce the number of independent variables, as described above, and in simple examples this method is usually preferable. The more elegant method, due to Lagrange (1736-1813), is described in the next section.
There are two main disadvantages with the direct method:
1. The method is biased because it treats the variables asymmetrically, by expressing some in terms of the others; it is often difficult to determine the most convenient choice in advance.
2. The most important difficulty, however, is that the method cannot easily be generalised to deal with other situations, such as functionals.
Use of the direct method is illustrated in the following exercises.
Exercise 10.1
Show that the function defined in equation 10.1 has a local maximum at $x = 1/3$,
where $y = 2/3$, and that the height of the hill here is $3e^{-1/3}$.
Exercise 10.2
Show that the area of the rectangle inscribed in the ellipse shown in figure 10.2
can be expressed in the form
\[ A(x) = \frac{4b}{a}x\sqrt{a^2 - x^2}, \quad 0 \le x \le a, \]
and by finding the stationary point of this expression show that $\max(A) = 2ab$.
Exercise 10.3
Geometric problems often give rise to constrained stationary problems and here
we consider a relatively simple example.
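Let $P$ be a point in the Cartesian plane with coordinates $(A, B)$ and $D$ the distance from $P$ to any point $(x, y)$ on the straight line with equation
\[ \frac{x}{a} + \frac{y}{b} = 1. \]
Show that $D^2 = (x - A)^2 + (y - B)^2$ and deduce that the shortest distance is
\[ \min(D) = \frac{|ab - Ab - Ba|}{\sqrt{a^2 + b^2}}. \]
[Diagram: the line through $(a, 0)$ and $(0, b)$, the point $P = (A, B)$ and the distance $D$ from $P$ to a point $(x, y)$ on the line.]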
Exercise 10.4
If A, B and C are the angles of a triangle show that the function
f (A, B, C) = sin A sin B sin C
is stationary when the triangle is equilateral.

Hint: the constraint is $A + B + C = \pi$.
Exercise 10.5
If z = f (x, y) and x and y satisfy the constraint g(x, y) = 0, show that at the
stationary points of z the contours of f (x, y), that is the curves defined by the
equations f (x, y) = constant, are tangential to the curve defined by g(x, y) = 0.
10.2 The Lagrange multiplier

The method for finding constrained stationary points described in the introduction
is unsatisfactory partly because it forces an arbitrary distinction between variables,
and partly because this technique cannot be applied to constrained problems in the
Calculus of Variations. The introduction of the Lagrange multiplier overcomes both
these difficulties.
10.2.1 Three variables and one constraint

Lagrange's method allows all variables to be treated equally, and may be illustrated using a function $f(x, y, z)$ of three variables and with one constraint $g(x, y, z) = 0$. The problem is to find the points at which $f(x, y, z)$ is stationary subject to the constraint. Let $(a, b, c)$ be the required stationary point and consider the neighbouring points $(a + \epsilon u, b + \epsilon v, c + \epsilon w)$, where $\epsilon$ is small, which also satisfy the constraint, that is $g(a + \epsilon u, b + \epsilon v, c + \epsilon w) = 0$. Using Taylor's theorem, see section 1.3.9, we have
\[ g(a + \epsilon u, b + \epsilon v, c + \epsilon w) = g(a, b, c) + \epsilon\left(u\frac{\partial g}{\partial x} + v\frac{\partial g}{\partial y} + w\frac{\partial g}{\partial z}\right) + O(\epsilon^2), \tag{10.6} \]
where all derivatives are evaluated at $(a, b, c)$. But both points satisfy the constraint so we have
\[ u\frac{\partial g}{\partial x} + v\frac{\partial g}{\partial y} + w\frac{\partial g}{\partial z} = O(\epsilon). \]
The left-hand side is independent of $\epsilon$, so taking the limit $\epsilon \to 0$ gives
\[ u\frac{\partial g}{\partial x} + v\frac{\partial g}{\partial y} + w\frac{\partial g}{\partial z} = 0. \tag{10.7} \]
This equation can be interpreted as the equation of a plane passing through the origin, in the Cartesian space with axes $Ou$, $Ov$ and $Ow$, as shown in the diagram. The normal to this plane is parallel to the vector $\mathbf{n} = (g_x, g_y, g_z)$, and the plane exists provided $|\mathbf{n}| \ne 0$: this means that the constraint must not be stationary at $(a, b, c)$. Any point in this plane can be defined uniquely with just two coordinates. It follows that $(u, v, w)$ cannot vary independently but that usually any one of these variables can be expressed in terms of the other two.
[Diagram: the plane through the origin $O$ of the space with axes $Ou$, $Ov$ and $Ow$, with normal $\mathbf{n}$.]
This is, of course, equivalent to using the implicit function theorem on the equation $g(x, y, z) = 0$ to express one variable in terms of the other two.
If $f(x, y, z)$ is stationary then, by definition, see section 2.2.1,
\[ f(a + \epsilon u, b + \epsilon v, c + \epsilon w) - f(a, b, c) = O(\epsilon^2) \]
which means, by the same argument as before, that
\[ u\frac{\partial f}{\partial x} + v\frac{\partial f}{\partial y} + w\frac{\partial f}{\partial z} = 0. \tag{10.8} \]
Recall that if there were no constraint this equation must hold for independent variations $u$, $v$ and $w$: then by choosing $v = w = 0$ and $u \ne 0$ we see that $\partial f/\partial x = 0$: the other two equations, $\partial f/\partial y = \partial f/\partial z = 0$, are obtained similarly. But because of the constraint $u$, $v$ and $w$ cannot vary independently.
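We proceed by introducing a new variable, the Lagrange multiplier $\lambda$, also named the
undetermined multiplier, so there are now four variables to be determined, $(x, y, z)$ and
$\lambda$: surprisingly this simplifies the problem. Multiply equation 10.7 by $\lambda$ and
subtract from equation 10.8 to form another equation,
$$\left(\frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x}\right)u + \left(\frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y}\right)v + \left(\frac{\partial f}{\partial z} - \lambda\frac{\partial g}{\partial z}\right)w = 0. \qquad (10.9)$$
This equation is true for any value of $\lambda$. Because of the constraint, variations in
$u$, $v$ and $w$ are not independent but, if $\partial g/\partial z \neq 0$, we may choose
$\lambda$ to make the coefficient of $w$ in equation 10.9 zero, that is
$$\frac{\partial f}{\partial z} - \lambda\frac{\partial g}{\partial z} = 0. \qquad (10.10)$$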
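Then equation 10.9 reduces to
$$\left(\frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x}\right)u + \left(\frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y}\right)v = 0.$$
Because $u$ and $v$ may be varied independently, by first setting $v = 0$ and then $u = 0$,
we obtain the two equations
$$\frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y} = 0. \qquad (10.11)$$
The three equations 10.10 and 10.11 relate the four variables $x$, $y$, $z$ and $\lambda$.
Assuming that the implicit function theorem can be applied, that is the Jacobian 1.26
(page 32) is not zero, we can use these equations to express $(x, y, z)$ in terms of
$\lambda$. Then the constraint becomes $g(x(\lambda), y(\lambda), z(\lambda)) = 0$, which
determines appropriate values of $\lambda$.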
This procedure is equivalent to defining an auxiliary function of four variables
$$F(x, y, z, \lambda) = f(x, y, z) - \lambda g(x, y, z) \qquad (10.12)$$
and finding the stationary points of $F(x, y, z, \lambda)$ using the conventional theory for
all four variables, that is the solutions of
$$F_x = \frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} = 0, \quad F_y = \frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y} = 0, \quad F_z = \frac{\partial f}{\partial z} - \lambda\frac{\partial g}{\partial z} = 0,$$
and $F_\lambda = -g(x, y, z) = 0$. Usually the first three of these are solved first to give
$(x(\lambda), y(\lambda), z(\lambda))$ in terms of $\lambda$, and then the fourth, the
equation of constraint, is used to determine $\lambda$, although the order in which these
equations are solved is clearly immaterial.
Thus the introduction of the Lagrange multiplier, $\lambda$, gives a method of finding
stationary points that treats the three original variables equally. Before showing how
this method generalises to $n$ variables and $m \le n - 1$ constraints we apply it to the
triangle problem treated in exercise 10.4.
For this problem $f(x, y, z) = \sin x \sin y \sin z$ and $g(x, y, z) = x + y + z - \pi$, so that
the auxiliary function is
$$F(x, y, z, \lambda) = \sin x \sin y \sin z - \lambda(x + y + z - \pi),$$
with each of $x$, $y$ and $z$ in the interval $(0, \pi)$. Equations 10.10 and 10.11 become
$$\sin x \sin y \cos z - \lambda = 0, \quad \sin x \cos y \sin z - \lambda = 0, \quad \cos x \sin y \sin z - \lambda = 0,$$
and $x + y + z = \pi$. Three different equations, independent of $\lambda$, may be obtained by
forming pairs of differences: thus subtracting the second equation from the first gives
$$\sin x\,(\sin y \cos z - \cos y \sin z) = \sin x \sin(y - z) = 0. \qquad (10.13)$$
Similarly, by subtracting the third from the second and the third from the first we
obtain
$$\sin z \sin(x - y) = 0 \quad\text{and}\quad \sin y \sin(z - x) = 0.$$
From 10.13 either $\sin x = 0$ or $\sin(y - z) = 0$; but for a triangle of nonzero area none
of $x$, $y$ or $z$ can be zero or $\pi$, so $-\pi < y - z < \pi$ and the only solution is
$y = z$. The remaining two equations give $y = x$ and $z = x$ and hence $x = y = z$ and then
the constraint gives $x = y = z = \pi/3$.
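The result for the triangle problem can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `grad_f` and `lagrange_residuals` are hypothetical, and the check simply substitutes $x = y = z = \pi/3$ into the three multiplier equations and the constraint.

```python
import math

def grad_f(x, y, z):
    """Gradient of f(x, y, z) = sin x sin y sin z."""
    return (
        math.cos(x) * math.sin(y) * math.sin(z),
        math.sin(x) * math.cos(y) * math.sin(z),
        math.sin(x) * math.sin(y) * math.cos(z),
    )

def lagrange_residuals(x, y, z, lam):
    """Left-hand sides of F_x = F_y = F_z = 0 and of the constraint g = 0.

    Here g = x + y + z - pi, so grad g = (1, 1, 1).
    """
    fx, fy, fz = grad_f(x, y, z)
    return (fx - lam, fy - lam, fz - lam, x + y + z - math.pi)

# Candidate stationary point from the text.
x = y = z = math.pi / 3
# The multiplier is fixed by any one of the three equations.
lam = math.sin(x) * math.sin(y) * math.cos(z)
residuals = lagrange_residuals(x, y, z, lam)
f_value = math.sin(x) * math.sin(y) * math.sin(z)
```

All four residuals vanish to rounding error, and the multiplier takes the value $\lambda = 3/8$ at this point.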
Exercise 10.6
Use a Lagrange multiplier to find the stationary points of the problems set in
exercises 10.1, 10.2 and 10.3.
Exercise 10.7
Show that the stationary distance between the origin and the plane defined by the
equation $ax + by + cz = d$ is given by the formula $|d|/\sqrt{a^2 + b^2 + c^2}$.
Exercise 10.8
Consider a rectangle, two sides of which are along the x- and y-axes; the bot-
tom left-hand corner is at the origin and the opposite corner lies on the line
x/a + y/b = 1, where a and b are positive numbers. Show that the stationary
area of such a rectangle is A = ab/4 and that for this rectangle the top right-hand
corner is at (a/2, b/2).
10.2.2 Three variables and two constraints

If there are three variables and two constraints, $g_1(x, y, z) = 0$ and $g_2(x, y, z) = 0$, then
equation 10.7 must hold for both constraints so we have the two equations
$$u\frac{\partial g_1}{\partial x} + v\frac{\partial g_1}{\partial y} + w\frac{\partial g_1}{\partial z} = 0 \quad\text{and}\quad u\frac{\partial g_2}{\partial x} + v\frac{\partial g_2}{\partial y} + w\frac{\partial g_2}{\partial z} = 0, \qquad (10.14)$$
where all derivatives are evaluated at the stationary point. Provided neither $g_1(x, y, z)$
nor $g_2(x, y, z)$ is stationary, and that the normals to the planes defined by the equations
are not parallel, so that the planes exist and are distinct, then the planes intersect along
a line and there can be only one independent variable.
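Equation 10.8 remains valid and now we proceed by introducing two Lagrange multipliers,
$\lambda_1$ and $\lambda_2$, one for each constraint. Thus from equations 10.8 and 10.14 we
may form another equation,
$$\left(\frac{\partial f}{\partial x} - \lambda_1\frac{\partial g_1}{\partial x} - \lambda_2\frac{\partial g_2}{\partial x}\right)u + \left(\frac{\partial f}{\partial y} - \lambda_1\frac{\partial g_1}{\partial y} - \lambda_2\frac{\partial g_2}{\partial y}\right)v + \left(\frac{\partial f}{\partial z} - \lambda_1\frac{\partial g_1}{\partial z} - \lambda_2\frac{\partial g_2}{\partial z}\right)w = 0. \qquad (10.15)$$
Now choose $\lambda_1$ and $\lambda_2$ to make the coefficients of $v$ and $w$ zero, that is
$$\frac{\partial f}{\partial y} - \lambda_1\frac{\partial g_1}{\partial y} - \lambda_2\frac{\partial g_2}{\partial y} = 0 \quad\text{and}\quad \frac{\partial f}{\partial z} - \lambda_1\frac{\partial g_1}{\partial z} - \lambda_2\frac{\partial g_2}{\partial z} = 0. \qquad (10.16)$$
Then, since $u$ may be varied independently, we have a third equation
$$\frac{\partial f}{\partial x} - \lambda_1\frac{\partial g_1}{\partial x} - \lambda_2\frac{\partial g_2}{\partial x} = 0. \qquad (10.17)$$
The three equations 10.16 and 10.17 may, in principle, be solved to give $(x, y, z)$ in terms
of $\lambda_1$ and $\lambda_2$ and then the constraints, $g_j(x, y, z) = 0$, $j = 1, 2$, give two
equations for $\lambda_1$ and $\lambda_2$. Needless to say, in practice these equations are not
usually easy to solve.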
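As in the previous case this is formally equivalent to defining an auxiliary function
$$F(x, y, z, \lambda_1, \lambda_2) = f(x, y, z) - \lambda_1 g_1(x, y, z) - \lambda_2 g_2(x, y, z), \qquad (10.18)$$
of five variables and finding the stationary points of this, that is the solutions of
$$\frac{\partial F}{\partial x} = 0, \quad \frac{\partial F}{\partial y} = 0, \quad \frac{\partial F}{\partial z} = 0, \quad \frac{\partial F}{\partial \lambda_1} = 0 \quad\text{and}\quad \frac{\partial F}{\partial \lambda_2} = 0.$$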
We illustrate this method by showing how to find the stationary values of
$f(x, y, z) = ax^2 + by^2 + cz^2$, subject to the variables being confined to the planes
$x + y + z = 1$ and $x + 2y + 3z = 2$. The auxiliary function is
$$F = ax^2 + by^2 + cz^2 - \lambda_1(x + y + z - 1) - \lambda_2(x + 2y + 3z - 2)$$
so the equations to be solved are
$$F_x = 2ax - (\lambda_1 + \lambda_2) = 0,$$
$$F_y = 2by - (\lambda_1 + \lambda_2) - \lambda_2 = 0,$$
$$F_z = 2cz - (\lambda_1 + \lambda_2) - 2\lambda_2 = 0.$$
In this case it is convenient to define a new variable $\mu = \lambda_1 + \lambda_2$, and then
these three equations can be solved to give
$$x = \frac{\mu}{2a}, \quad y = \frac{\mu + \lambda_2}{2b}, \quad z = \frac{\mu + 2\lambda_2}{2c}$$
and the equations of constraint become
$$(ab + ac + bc)\mu + \lambda_2(2ab + ac) = 2abc \quad\text{and}\quad (3ab + 2ac + bc)\mu + \lambda_2(6ab + 2ac) = 4abc,$$
which have the solution
$$\mu = \frac{4ab}{a + 4b + c} \quad\text{and}\quad \lambda_2 = \frac{2b(c - a)}{a + 4b + c}.$$
Hence
$$x = \frac{2b}{a + 4b + c}, \quad y = \frac{a + c}{a + 4b + c} \quad\text{and}\quad z = x = \frac{2b}{a + 4b + c}. \qquad (10.19)$$
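The algebra leading to equation 10.19 can be checked in exact rational arithmetic. The sketch below is an illustration only: the function name `stationary_point` and the sample values $a = 1$, $b = 2$, $c = 5$ are assumptions made for the check, and integer coefficients are assumed so that `Fraction` can be used.

```python
from fractions import Fraction

def stationary_point(a, b, c):
    """Evaluate mu, lambda_2 and (x, y, z) from the formulae in the text.

    a, b, c are assumed to be integers with a + 4b + c != 0.
    """
    denom = a + 4 * b + c
    mu = Fraction(4 * a * b, denom)            # mu = lambda_1 + lambda_2
    lam2 = Fraction(2 * b * (c - a), denom)
    lam1 = mu - lam2
    x = mu / (2 * a)
    y = (mu + lam2) / (2 * b)
    z = (mu + 2 * lam2) / (2 * c)
    return x, y, z, lam1, lam2

a, b, c = 1, 2, 5
x, y, z, lam1, lam2 = stationary_point(a, b, c)

# Residuals of F_x = F_y = F_z = 0 and of the two constraints.
Fx = 2 * a * x - (lam1 + lam2)
Fy = 2 * b * y - (lam1 + 2 * lam2)
Fz = 2 * c * z - (lam1 + 3 * lam2)
g1 = x + y + z - 1
g2 = x + 2 * y + 3 * z - 2
```

For these sample values the point is $(x, y, z) = (2/7, 3/7, 2/7)$, in agreement with equation 10.19, and every residual is exactly zero.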
Exercise 10.9
Derive equations 10.19 by using the constraints to express x and y in terms of z.
Note that in this example the direct method is easier, because the constraints are
linear.
Exercise 10.10
If $f(\mathbf{x})$ is a function of the $n$ variables $\mathbf{x} = (x_1, x_2, \ldots, x_n)$
constrained by the single function $g(\mathbf{x}) = 0$ show that the stationary points can be
found by forming the auxiliary function $F(\mathbf{x}, \lambda) = f(\mathbf{x}) - \lambda g(\mathbf{x})$
of $n + 1$ variables and finding its stationary points.
10.2.3 The general case

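The method of Lagrange multipliers is applied to the case of $n$ variables and $m \le n - 1$
constraints, $g_j(\mathbf{x})$, $j = 1, 2, \ldots, m$, in a similar fashion, but with $m$
multipliers, so the auxiliary function has $n + m$ variables,
$$F(\mathbf{x}, \lambda_1, \lambda_2, \ldots, \lambda_m) = f(\mathbf{x}) - \sum_{j=1}^{m} \lambda_j g_j(\mathbf{x}) \qquad (10.20)$$
where $f(\mathbf{x})$ is the function for which stationary points are required. The stationary
points of $F$ are at the roots of
$$\frac{\partial F}{\partial x_k} = \frac{\partial f}{\partial x_k} - \sum_{j=1}^{m} \lambda_j \frac{\partial g_j}{\partial x_k} = 0, \quad k = 1, 2, \ldots, n, \qquad (10.21)$$
$$\frac{\partial F}{\partial \lambda_j} = -g_j(\mathbf{x}) = 0, \quad j = 1, 2, \ldots, m \le n - 1. \qquad (10.22)$$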
This method has the advantage of treating all variables equally and hence retaining any
symmetries that might be present.
The Lagrange multiplier method determines the position of stationary points. It is
generally more difficult to determine the nature of a constrained stationary point and
normally one has to use physical or geometric considerations besides algebraic methods
to understand the problem.
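As an illustration of the general conditions, the following sketch evaluates the left-hand sides of the stationarity equations and the constraint functions for arbitrary user-supplied gradients. It is a check only, not a solver; the function name `stationarity_residuals` and the sample plane $x + 2y + 2z = 9$ (the distance problem of exercise 10.7 with assumed coefficients) are hypothetical choices.

```python
import math

def stationarity_residuals(grad_f, grads_g, gs, x, lams):
    """Residuals of the multiplier equations and values of the constraints.

    grad_f  : function returning the gradient of f at x
    grads_g : list of functions returning the gradient of each g_j at x
    gs      : list of the constraint functions g_j
    x       : candidate point (sequence of n numbers)
    lams    : candidate multipliers (sequence of m numbers)
    """
    gf = grad_f(x)
    ggs = [gg(x) for gg in grads_g]
    eqs = [
        gf[k] - sum(lam * gg[k] for lam, gg in zip(lams, ggs))
        for k in range(len(x))
    ]
    cons = [g(x) for g in gs]
    return eqs, cons

# Sample problem: stationary value of x^2 + y^2 + z^2 on the plane x + 2y + 2z = 9.
normal, d = (1.0, 2.0, 2.0), 9.0
grad_f = lambda p: [2 * c for c in p]
grad_g = lambda p: list(normal)
g = lambda p: sum(n * c for n, c in zip(normal, p)) - d

lam = 2 * d / sum(n * n for n in normal)       # from the constraint
point = [lam * n / 2 for n in normal]          # from grad f = lam * grad g
eqs, cons = stationarity_residuals(grad_f, [grad_g], [g], point, [lam])
distance = math.sqrt(sum(c * c for c in point))
```

Here the stationary point is $(1, 2, 2)$ and the distance is $3 = |d|/\sqrt{a^2 + b^2 + c^2}$, as exercise 10.7 asserts.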
10.3 The dual problem

We end this chapter by returning to the case of only one constraint and one Lagrange
multiplier. That is we seek the stationary points of the function $f(\mathbf{x})$ subject to
the constraint $g(\mathbf{x}) = 0$. The auxiliary function is
$F(\mathbf{x}, \lambda) = f(\mathbf{x}) - \lambda g(\mathbf{x})$ and, provided $\lambda \neq 0$,
this may be rewritten in the alternative form
$$G(\mathbf{x}, \mu) = g(\mathbf{x}) - \mu f(\mathbf{x}) \quad\text{where}\quad \lambda\mu = 1 \quad\text{and}\quad G(\mathbf{x}, \mu) = -\mu F(\mathbf{x}, \lambda). \qquad (10.23)$$
This equation can be used to find the stationary points of $g(\mathbf{x})$ subject to the
constraint $f(\mathbf{x}) = 0$, which are given by the roots of
$$\frac{\partial G}{\partial x_k} = \frac{\partial g}{\partial x_k} - \mu\frac{\partial f}{\partial x_k} = 0, \quad k = 1, 2, \ldots, n,$$
which are the same equations as for the stationary points of the original problem. If
$\mathbf{x}(\mu)$ is a solution of these equations the stationary point of the new constrained
problem is given by those $\mu$ satisfying $f(\mathbf{x}(\mu)) = 0$. Further, since
$\lambda\mu = 1$, the stationary points of the original problem are $\mathbf{x}(1/\lambda)$ with
the values of $\lambda$ given by $g(\mathbf{x}(1/\lambda)) = 0$. Thus the Lagrange multiplier
method highlights a duality between,
a) the stationary points of f (x) with the constraint g(x) = 0, and
b) the stationary points of g(x) with the constraint f (x) = 0,
which is not apparent in the conventional method.
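A small worked case may make the duality concrete. The sketch below uses two hypothetical functions chosen purely for illustration, $f(x, y) = x^2 + y^2 - 2$ and $g(x, y) = x + y - 1$; neither appears in the text. Both problems share the one-parameter family $x = y = \lambda/2$ obtained from $F_x = F_y = 0$, and differ only in which function is then set to zero.

```python
from fractions import Fraction

def f(x, y):
    return x * x + y * y - 2

def g(x, y):
    return x + y - 1

def family(lam):
    """Solutions of f_x - lam*g_x = 0 and f_y - lam*g_y = 0, i.e. 2x = 2y = lam."""
    return (lam / 2, lam / 2)

# Problem (a): stationary points of f with g = 0.
# g(family(lam)) = lam - 1 = 0 gives lam = 1.
primal_point = family(Fraction(1))

# Problem (b): stationary points of g with f = 0, using mu = 1/lam.
# f(family(1/mu)) = 1/(2 mu^2) - 2 = 0 gives mu = +1/2 or -1/2.
dual_points = [family(1 / mu) for mu in (Fraction(1, 2), Fraction(-1, 2))]
```

Problem (a) selects the point $(1/2, 1/2)$ from the family, while problem (b) selects $(1, 1)$ and $(-1, -1)$: the same equations, with a different condition fixing the multiplier.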
Exercise 10.11
This exercise provides an illustration of the duality described above; compare this
problem with that considered in exercise 10.1.
Find the stationary value of the function $g(x, y) = x + y - 1$ subject to the
constraint $f(x, y) = 3\exp(-x^2 - y^2/2) - c = 0$, where $c$ is a positive constant.
Exercise 10.12
An open rectangular box is made of thin sheet metal, with sides of height $z$ and a
rectangular base of interior dimensions $x$ and $y$. The base and sides of length $x$
are of (small) uniform thickness $d$ and the sides of length $y$ are of thickness $2d$. If
the volume of metal is fixed prove that the volume of the box is stationary when
$x = 2y = 4z$.
Exercise 10.13
A vessel comprises a cylinder of radius $r$ and height $h$ with equal conical ends, the
semi-vertical angle of each cone being $\alpha$. Show that the volume, $V$, and the surface
area, $S$, are given by
$$V = \pi r^2 h + \frac{2\pi r^3}{3\tan\alpha} \quad\text{and}\quad S = 2\pi r h + \frac{2\pi r^2}{\sin\alpha}.$$
If $r$, $h$ and $\alpha$ can vary, show that for a vessel of given volume the stationary surface
area occurs when $\cos\alpha = \frac{2}{3}$. Also find the value of $h$ in terms of $r$ and
$\alpha$, and $r$ in terms of $V$.

Exercise 10.14
Show that equations 10.5 define the same line in the neighbourhood of the origin.
Exercise 10.15
Find the stationary value of $f = x^2 + y^2 + z^2 + w^2$, subject to the constraint
$(xyzw)^2 = 1$, and the values of the variables at which the stationary values are
attained.
Exercise 10.16
Find the stationary points of $f = xyzw^9$ subject to the constraint
$g = 4x^4 + 2y^8 + z^{16} + 9w^{16} = 1$ in the region where all variables are positive.
Exercise 10.17
If a, b, c and d are given positive numbers and x, y and z are positive, real variables
satisfying the equation x + y + z = d, show that the function
$$f(x, y, z) = \frac{a^2}{x} + \frac{b^2}{y} + \frac{c^2}{z}$$
possesses a stationary value $(a + b + c)^2/d$.
Exercise 10.18
Show that the shortest distance between the plane ax + by + cz = d in the Oxy-
plane and the point (A, B, C) is given by
$$D = \frac{|Aa + Bb + Cc - d|}{\sqrt{a^2 + b^2 + c^2}}.$$
Exercise 10.19
For a simple lens with focal length f the object distance p and the image distance
q are related by 1/p + 1/q = 1/f . If p + q =constant find the stationary value of f .
Exercise 10.20
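Show that the stationary points of $f = ax^2 + by^2 + cz^2$, where the constants $a$, $b$
and $c$ are all positive, on the line where the vertical cylinder, $x^2 + y^2 = 1$, intersects
the plane $x + y + z = 1$, are given by
$$x = \frac{\lambda_2}{2(a - \lambda_1)}, \quad y = \frac{\lambda_2}{2(b - \lambda_1)} \quad\text{and}\quad z = \frac{\lambda_2}{2c},$$
where the two possible values of $(\lambda_1, \lambda_2)$ are
$$\lambda_1 = \frac{1}{2}(a + b + 4c \pm \Delta), \quad \lambda_2 = \frac{2c(6c \pm 2\Delta)}{2c \pm \Delta} \quad\text{with}\quad \Delta = \sqrt{(a - b)^2 + 8c^2}.$$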
Exercise 10.21
Show that the area, $S$, of canvas needed to make a tent of given volume $V$ comprising
a right circular cylinder of radius $r$, made of a single thickness of canvas,
together with a conical top of height $h$, made of two thicknesses of canvas, is given by
$$S = \frac{2V}{r} + 2\pi r\sqrt{r^2 + h^2} - \frac{2}{3}\pi r h.$$
If both $r$ and $h$ can vary show that the stationary value of $S$, for fixed $V$, is given
by $r\sqrt{2} = 4h = R$ where $V = 2\pi R^3/3$.

Differentiating equation 10.2 gives $h'(x) = 3(-3x + 1)h(x)$ so that $h'(x) = 0$ at $x = 1/3$,
$y = 1 - x = 2/3$, and here $h = 3\exp(-1/3)$. Since $h > 0$, the sign change of $h'$ follows
that of $-3x + 1$, so this stationary point is a maximum, as is clear from figure 10.1.
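Solution for Exercise 10.2

The constraint can be written in the form $y = b\sqrt{1 - x^2/a^2}$; substituting this into the
area function, $A = 4xy$, gives the expression quoted. Differentiation gives
$$\frac{dA}{dx} = \frac{4b}{a}\left(\sqrt{a^2 - x^2} - \frac{x^2}{\sqrt{a^2 - x^2}}\right) = \frac{4b}{a}\,\frac{a^2 - 2x^2}{\sqrt{a^2 - x^2}},$$
so that $A'(x) = 0$ when $x = a/\sqrt{2}$ (since $x > 0$). If $x = a/\sqrt{2} + \epsilon$, with
$|\epsilon|$ small, $A'(x)$ has the opposite sign to $\epsilon$, so that $A(x)$ has a local
maximum at this point, with value
$$A = \frac{4b}{a}\,\frac{a}{\sqrt{2}}\sqrt{a^2 - \frac{a^2}{2}} = 2ab.$$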

Using Pythagoras' theorem the distance is given by $D^2 = (x - A)^2 + (y - B)^2$, and
since $y = b(1 - x/a)$ this becomes
$$D^2 = x^2 - 2Ax + A^2 + b^2\left(1 - \frac{x}{a}\right)^2 - 2bB\left(1 - \frac{x}{a}\right) + B^2
= \frac{a^2 + b^2}{a^2}\,x^2 - 2x\left(A + \frac{b^2}{a} - \frac{Bb}{a}\right) + A^2 + (b - B)^2.$$
This is a quadratic in $x$ and since the coefficient of $x^2$ is positive it has a single
minimum, which is most easily seen by writing it in the form
$$D^2 = \frac{a^2 + b^2}{a^2}\left(x - \frac{a}{a^2 + b^2}\bigl(Aa + b(b - B)\bigr)\right)^2 + A^2 + (b - B)^2 - \frac{\bigl(Aa + b(b - B)\bigr)^2}{a^2 + b^2}.$$
Hence $D$ has its minimum value at
$$x = \frac{a}{a^2 + b^2}\bigl(Aa + b(b - B)\bigr)$$
and here
$$D^2 = A^2 + (b - B)^2 - \frac{\bigl(Aa + b(b - B)\bigr)^2}{a^2 + b^2} = \frac{(ab - Ab - Ba)^2}{a^2 + b^2}.$$
For this example the method of Lagrange multipliers is easier, see exercise 10.6.

Eliminate $C$ to give $f = \sin A \sin B \sin(A + B)$, so that
$$f_A = \sin B\,\bigl(\cos A \sin(A + B) + \sin A \cos(A + B)\bigr) = \sin B \sin(2A + B),$$
$$f_B = \sin A\,\bigl(\cos B \sin(A + B) + \sin B \cos(A + B)\bigr) = \sin A \sin(A + 2B).$$
Since $\sin A \neq 0$ and $\sin B \neq 0$ (because $0 < A, B < \pi$) we have $\sin(2A + B) = 0$ and
$\sin(A + 2B) = 0$, that is $2A + B = n\pi$ and $A + 2B = m\pi$, with $n$ and $m$ positive
integers, both smaller than 3. Hence $3A = (2n - m)\pi$ and $3B = (2m - n)\pi$, and
$3C = (3 - n - m)\pi$. The bounds on $A$ and $B$ give $n = m = 1$ and hence
$A = B = C = \pi/3$.

The contours of $f$ are the curves $f(x, y) = c$, which we assume can be expressed as
the function $y(x)$; the gradients on these contours are given by $f_x + f_y y'(x) = 0$, that is
$y'(x) = -f_x/f_y$.
Suppose the constraint $g(x, y) = 0$ (which is a particular contour of $g$) defines the
function $y_g(x)$, so that on this curve the original function has the values
$$z(x) = f(x, y_g(x)) \quad\text{and}\quad \frac{dz}{dx} = f_x + f_y\,y_g'(x).$$
Thus $z(x)$ is stationary when $y_g'(x) = -f_x/f_y = y'(x)$, that is when the contour of $f$ is
tangential to the contour defined by $g = 0$.

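(a) For the walker on the hill we define
$$F(x, y, \lambda) = h(x, y) - \lambda(x + y - 1) \quad\text{where}\quad h(x, y) = 3\exp(-x^2 - y^2/2),$$
so that
$$F_x = -2xh(x, y) - \lambda = 0 \quad\text{and}\quad F_y = -yh(x, y) - \lambda = 0.$$
These give $\lambda = -yh(x, y)$ and $\lambda = -2xh(x, y)$, and since $h \neq 0$, $y = 2x$; then the
equation of constraint gives $3x = 1$, hence the result.
(b) For the area of the rectangle inscribed inside the ellipse we define
$$F(x, y, \lambda) = 4xy - \lambda\left(\frac{x^2}{a^2} + \frac{y^2}{b^2} - 1\right),$$
where $\lambda$ is the Lagrange multiplier. Thus
$$F_x = 4y - \frac{2\lambda}{a^2}\,x = 0, \quad F_y = 4x - \frac{2\lambda}{b^2}\,y = 0,$$
so that $y = sx$, for some $s$. Divide these equations to see that $\lambda = 2ab$ and hence
$s = b/a$. The constraint equation gives $2x^2 = a^2$ and then $2y^2 = b^2$, so that the
stationary value of the area is $A = 2ab$.
(c) The auxiliary function is
$$F = (x - A)^2 + (y - B)^2 - \lambda\left(\frac{x}{a} + \frac{y}{b} - 1\right)$$
so we require the solutions of
$$F_x = 2(x - A) - \frac{\lambda}{a} = 0 \implies x - A = \frac{\lambda}{2a},$$
$$F_y = 2(y - B) - \frac{\lambda}{b} = 0 \implies y - B = \frac{\lambda}{2b}.$$
Hence the constraint equation gives
$$\frac{A}{a} + \frac{B}{b} + \frac{\lambda}{2}\left(\frac{1}{a^2} + \frac{1}{b^2}\right) = 1 \quad\text{that is}\quad \frac{\lambda}{2}\left(\frac{1}{a^2} + \frac{1}{b^2}\right) = \frac{ab - Ab - Ba}{ab}$$
which gives
$$D^2 = \frac{\lambda^2}{4}\left(\frac{1}{a^2} + \frac{1}{b^2}\right) = \frac{(ab - Ab - Ba)^2}{a^2 + b^2}.$$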

Let $(x, y, z)$ be the point on the plane and $D$ the distance, given by Pythagoras' theorem,
$D^2 = x^2 + y^2 + z^2$. Then $F = D^2 - 2\lambda(ax + by + cz - d)$, with Lagrange multiplier
$2\lambda$. Hence
$$F_x = 2(x - \lambda a) = 0, \quad F_y = 2(y - \lambda b) = 0, \quad F_z = 2(z - \lambda c) = 0.$$
Now use the constraint equation to give $\lambda(a^2 + b^2 + c^2) = d$ and then
$$D^2 = \lambda^2(a^2 + b^2 + c^2) = \frac{d^2}{a^2 + b^2 + c^2} \quad\text{that is}\quad D = \frac{|d|}{\sqrt{a^2 + b^2 + c^2}}.$$

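Let the coordinates of the corner opposite the origin be $(u, v)$, as shown in the diagram.
The area is $A = uv$ and if $\lambda$ is a Lagrange multiplier the auxiliary function is
$$F = uv - \lambda\left(\frac{u}{a} + \frac{v}{b} - 1\right)$$
then $F_u = v - \lambda/a = 0$ and $F_v = u - \lambda/b = 0$. Solving these equations for $u$
and $v$, and substituting into the constraint equation gives $2\lambda = ab$ and hence
$u = a/2$, $v = b/2$ so the stationary area is $A = ab/4$.
[Diagram: the rectangle with one corner at the origin and the opposite corner $(u, v)$ on the line $x/a + y/b = 1$, which meets the axes at $x = a$ and $y = b$.]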

We use the constraints to express $x$ and $y$ in terms of $z$, which is trivial because the
constraint equations are both linear; they give
$$x + y = 1 - z \quad\text{and}\quad x + 2y = 2 - 3z$$
and have the solution $x = z$ and $y = 1 - 2z$. Hence the expression for $f(x, y, z)$ becomes
$$f(z) = az^2 + b(1 - 2z)^2 + cz^2 = (a + 4b + c)z^2 - 4bz + b.$$
This is a quadratic in $z$ (provided $a + 4b + c \neq 0$): if $a + 4b + c > 0$ it has a single
minimum at the root of
$$f'(z) = 2(a + 4b + c)z - 4b = 0 \quad\text{that is}\quad z = \frac{2b}{a + 4b + c}.$$
If $a + 4b + c < 0$ this stationary point is a maximum. Using the expressions for $x(z)$
and $y(z)$ the results quoted in equation 10.19 are obtained.

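The analysis is a minor generalisation of that given in section 10.2.1. Suppose that
$\mathbf{a}$ is the stationary point, and consider a nearby point $\mathbf{a} + \epsilon\mathbf{u}$,
that also satisfies the constraint, so Taylor's theorem gives
$$g(\mathbf{a} + \epsilon\mathbf{u}) = g(\mathbf{a}) + \epsilon\sum_{k=1}^{n} u_k g_k(\mathbf{a}) + O(\epsilon^2)$$
and since $g(\mathbf{a} + \epsilon\mathbf{u}) = g(\mathbf{a}) = 0$ we have
$\sum_{k=1}^{n} u_k g_k(\mathbf{a}) = 0$. Also, by definition,
$f(\mathbf{a} + \epsilon\mathbf{u}) - f(\mathbf{a}) = O(\epsilon^2)$ and hence
$\sum_{k=1}^{n} u_k f_k(\mathbf{a}) = 0$.
If $\lambda$ is a Lagrange multiplier we have, for all $\lambda$,
$\sum_{k=1}^{n} u_k\bigl(f_k(\mathbf{a}) - \lambda g_k(\mathbf{a})\bigr) = 0$. It
follows, by the same reasoning as used in the text, that
$f_k(\mathbf{a}) - \lambda g_k(\mathbf{a}) = 0$ for all $k$.
But these equations are just those that determine the stationary points of the auxiliary
function $F(\mathbf{x}, \lambda) = f(\mathbf{x}) - \lambda g(\mathbf{x})$.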

In this case the auxiliary function is
$$F = x + y - 1 - \lambda\bigl(h(x, y) - c\bigr), \quad h(x, y) = 3\exp(-x^2 - y^2/2),$$
so that $F_x = 1 + 2\lambda x h = 0$ and $F_y = 1 + \lambda y h = 0$, giving $y = 2x$ (as in the
dual problem). The constraint equation, $h(x, y) = c$, then gives $c = 3\exp(-3x^2)$, that is
$3x^2 = \ln(3/c)$.

The volume of the box is $V_b = xyz$. The volume of the material is proportional to the
area of the sides and is $V_m = (xy + 4yz + 2xz)d$, so the auxiliary function can be taken
to be
$$F = xyz - \lambda(xy + 4yz + 2xz)$$
where we have absorbed the thickness, $d$, into the Lagrange multiplier, and ignored an
irrelevant constant. The equations for the stationary values are
$$F_x = yz - \lambda(y + 2z) = 0,$$
$$F_y = xz - \lambda(x + 4z) = 0,$$
$$F_z = xy - \lambda(4y + 2x) = 0.$$
Put $y = sx$ and $z = tx$, so these become (assuming that $x \neq 0$)
$$stx = \lambda(s + 2t), \quad tx = \lambda(1 + 4t), \quad sx = \lambda(2 + 4s).$$
The second two equations can be rearranged to give $t(x - 4\lambda) = \lambda$ and
$s(x - 4\lambda) = 2\lambda$, so that $s = 2t$. Also $t = \lambda/(x - 4\lambda)$, so the first
equation gives $x = 8\lambda$; hence $t = 1/4$ and $s = 1/2$, giving $2y = x$ and $4z = x$.
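The result can be confirmed by direct substitution. The short sketch below is an illustrative check only: the function name `box_residuals` and the sample multiplier value $3/7$ are arbitrary assumptions, and exact rational arithmetic is used so that the residuals are exactly zero.

```python
from fractions import Fraction

def box_residuals(x, y, z, lam):
    """Left-hand sides of F_x = 0, F_y = 0 and F_z = 0 for the box problem."""
    return (
        y * z - lam * (y + 2 * z),
        x * z - lam * (x + 4 * z),
        x * y - lam * (4 * y + 2 * x),
    )

lam = Fraction(3, 7)        # any nonzero value of the multiplier will do
x = 8 * lam                 # x = 8*lambda from the first equation
y = x / 2                   # s = 1/2
z = x / 4                   # t = 1/4
residuals = box_residuals(x, y, z, lam)
```

All three residuals are zero for every nonzero choice of the multiplier, which is then fixed by the given volume of metal.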

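The volume and surface area of the cylinder are, respectively, $\pi r^2 h$ and $2\pi r h$. The
volume and surface area of a right circular cone of base radius $r$, height $h_c$ and slant
height $l$ are, respectively, $\pi r^2 h_c/3$ and $\pi r l$; if the semi-vertical angle is
$\alpha$ we have $\tan\alpha = r/h_c$ and $\sin\alpha = r/l$. Adding the volumes and areas in
the appropriate proportions gives the quoted results.
The auxiliary function can be taken to be
$$F = 2\pi r h + \frac{2\pi r^2}{\sin\alpha} - \lambda\left(\pi r^2 h + \frac{2\pi r^3}{3\tan\alpha}\right).$$
Differentiation with respect to $h$ gives $F_h = \pi r(2 - \lambda r) = 0$, so that
$\lambda r = 2$. Differentiation with respect to $\alpha$ gives
$$F_\alpha = -\frac{2\pi r^2\cos\alpha}{\sin^2\alpha} + \lambda\,\frac{2\pi r^3}{3\sin^2\alpha} = \frac{2\pi r^2}{\sin^2\alpha}\left(\frac{\lambda r}{3} - \cos\alpha\right) = 0.$$
Since $\lambda r = 2$, this gives $\cos\alpha = 2/3$.
Finally, differentiation with respect to $r$ gives
$$F_r = 2\pi h + \frac{4\pi r}{\sin\alpha} - \lambda r\left(2\pi h + \frac{2\pi r}{\tan\alpha}\right)
= -2\pi h + \frac{4\pi r}{\sin\alpha}(1 - \cos\alpha) = 0 \quad\text{hence}\quad h = \frac{2r}{3\sin\alpha}.$$
The volume is
$$V = \frac{2\pi r^3}{3\sin\alpha} + \frac{2\pi r^3}{3\tan\alpha} = \frac{10\pi r^3}{3\sqrt{5}} \quad\text{and hence}\quad r = \left(\frac{3\sqrt{5}\,V}{10\pi}\right)^{1/3}.$$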

The first equation can be rearranged to give $\tan x = -\sinh y$ and then
$2\tan x = e^{-y} - e^{y}$. The second of these can be expressed as a quadratic in $e^y$,
$$e^{2y} + 2e^y\tan x - 1 = 0 \quad\text{with solutions}\quad e^y = -\tan x \pm \sqrt{1 + \tan^2 x}.$$
When $x = 0$, $y = 0$, so the upper sign gives the required solution, that is
$$e^y = \sqrt{1 + \tan^2 x} - \tan x
= \sqrt{1 + \frac{4t^2}{(1 - t^2)^2}} - \frac{2t}{1 - t^2}, \quad\text{where } t = \tan(x/2),$$
$$\phantom{e^y} = \frac{1 - t}{1 + t},$$
where we have used the identity $\tan x = 2t/(1 - t^2)$. Adding unity to each side of the
last equation gives the second of equations 10.5.

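If $\lambda$ is the Lagrange multiplier, $F = x^2 + y^2 + z^2 + w^2 - \lambda(xyzw)^2$ and
$$F_x = 2x\bigl(1 - \lambda(yzw)^2\bigr) = 0 \implies x = 0 \text{ or } \lambda(yzw)^2 = 1,$$
$$F_y = 2y\bigl(1 - \lambda(xzw)^2\bigr) = 0 \implies y = 0 \text{ or } \lambda(xzw)^2 = 1,$$
$$F_z = 2z\bigl(1 - \lambda(xyw)^2\bigr) = 0 \implies z = 0 \text{ or } \lambda(xyw)^2 = 1,$$
$$F_w = 2w\bigl(1 - \lambda(xyz)^2\bigr) = 0 \implies w = 0 \text{ or } \lambda(xyz)^2 = 1.$$
If $x = 0$ then the three remaining equations for $\lambda$ have no solution: hence we discard
the solution $x = y = z = w = 0$. The equation $\lambda(yzw)^2 = 1$ gives $\lambda = x^2$ (on
multiplying by $x^2$ and using the constraint equation). Similarly,
$\lambda = y^2 = z^2 = w^2$, hence $(xyzw)^2 = \lambda^4 = 1$ and $\lambda = 1$ ($\lambda = -1$
is not allowed); so there are 16 stationary points, $x = \pm 1$, $y = \pm 1$, $z = \pm 1$ and
$w = \pm 1$, all of which give $f = 4$.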

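If $\lambda$ is the Lagrange multiplier,
$F = xyzw^9 - \lambda(4x^4 + 2y^8 + z^{16} + 9w^{16} - 1)$ and
$$F_x = yzw^9 - 16\lambda x^3 = 0 \implies f = 16\lambda x^4,$$
$$F_y = xzw^9 - 16\lambda y^7 = 0 \implies f = 16\lambda y^8,$$
$$F_z = xyw^9 - 16\lambda z^{15} = 0 \implies f = 16\lambda z^{16},$$
$$F_w = 9xyzw^8 - 16 \times 9\lambda w^{15} = 0 \implies f = 16\lambda w^{16}.$$
Hence $x^4 = y^8 = z^{16} = w^{16}$ and since all variables are positive we can put
$w = a > 0$ to give $z = a$, $y = a^2$, $x = a^4$ and the constraint equation becomes
$$g = 4a^{16} + 2a^{16} + a^{16} + 9a^{16} = 1 \quad\text{that is}\quad 16a^{16} = 1.$$
Thus the stationary point is $x = \dfrac{1}{2}$, $y = \dfrac{1}{\sqrt{2}}$, $z = w = \dfrac{1}{2^{1/4}}$.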

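If $\lambda$ is the Lagrange multiplier,
$F = \dfrac{a^2}{x} + \dfrac{b^2}{y} + \dfrac{c^2}{z} - \lambda(x + y + z - d)$ so that
$$\frac{\partial F}{\partial x} = -\frac{a^2}{x^2} - \lambda = 0, \quad \frac{\partial F}{\partial y} = -\frac{b^2}{y^2} - \lambda = 0 \quad\text{and}\quad \frac{\partial F}{\partial z} = -\frac{c^2}{z^2} - \lambda = 0.$$
Hence $x^2 = -\dfrac{a^2}{\lambda}$, $y^2 = -\dfrac{b^2}{\lambda}$ and $z^2 = -\dfrac{c^2}{\lambda}$,
so $\lambda$ is negative, and the equation of constraint gives
$$\frac{1}{\sqrt{-\lambda}}(a + b + c) = d \quad\text{that is}\quad \sqrt{-\lambda} = \frac{a + b + c}{d},$$
and at this point $f = \sqrt{-\lambda}\,(a + b + c) = \dfrac{(a + b + c)^2}{d}$.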
The distance $D$ is given by $D^2 = (x - A)^2 + (y - B)^2 + (z - C)^2$, so the auxiliary
function is
$$F = (x - A)^2 + (y - B)^2 + (z - C)^2 - 2\lambda(ax + by + cz - d)$$
with Lagrange multiplier $2\lambda$. The derivatives are
$$F_x = 2(x - A) - 2\lambda a = 0 \implies x = \lambda a + A,$$
$$F_y = 2(y - B) - 2\lambda b = 0 \implies y = \lambda b + B,$$
$$F_z = 2(z - C) - 2\lambda c = 0 \implies z = \lambda c + C.$$
Thus the constraint gives
$$\lambda(a^2 + b^2 + c^2) + Aa + Bb + Cc = d \implies \lambda = -\frac{Aa + Bb + Cc - d}{a^2 + b^2 + c^2}.$$
But at the stationary point
$$D^2 = \lambda^2(a^2 + b^2 + c^2) = \frac{(Aa + Bb + Cc - d)^2}{a^2 + b^2 + c^2} \implies D = \frac{|Aa + Bb + Cc - d|}{\sqrt{a^2 + b^2 + c^2}}.$$

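Since $f = pq/(p + q)$ the auxiliary function is
$F(p, q, \lambda) = \dfrac{pq}{p + q} - \lambda(p + q - 4c)$, where the constant value of
$p + q$ is written as $4c$, and we require the roots of
$$\frac{\partial F}{\partial p} = \frac{q^2}{(p + q)^2} - \lambda = 0 \quad\text{and}\quad \frac{\partial F}{\partial q} = \frac{p^2}{(p + q)^2} - \lambda = 0.$$
Clearly since both $p$ and $q$ are positive, $p = q$ and then $\lambda = 1/4$. The constraint
equation then gives $p = q = 2c$.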

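The auxiliary function is
$$F = ax^2 + by^2 + cz^2 - \lambda_1(x^2 + y^2 - 1) - \lambda_2(x + y + z - 1)$$
and so we require the solutions of
$$F_x = 2x(a - \lambda_1) - \lambda_2 = 0 \implies x = \frac{\lambda_2}{2(a - \lambda_1)},$$
$$F_y = 2y(b - \lambda_1) - \lambda_2 = 0 \implies y = \frac{\lambda_2}{2(b - \lambda_1)},$$
$$F_z = 2cz - \lambda_2 = 0 \implies z = \frac{\lambda_2}{2c}.$$
The constraints now give the following equations for $\lambda_1$ and $\lambda_2$: on the plane,
$$\frac{\lambda_2}{2}\left(\frac{1}{a - \lambda_1} + \frac{1}{b - \lambda_1} + \frac{1}{c}\right) = 1. \qquad (10.24)$$
And on the cylinder
$$\frac{\lambda_2^2}{4}\left(\frac{1}{(a - \lambda_1)^2} + \frac{1}{(b - \lambda_1)^2}\right) = 1.$$
The first of these equations gives
$$\left(\frac{1}{a - \lambda_1} + \frac{1}{b - \lambda_1} + \frac{1}{c}\right)^2 = \frac{4}{\lambda_2^2} \quad\text{and the second gives}\quad \frac{4}{\lambda_2^2} = \frac{1}{(a - \lambda_1)^2} + \frac{1}{(b - \lambda_1)^2}.$$
These two equations give the following quadratic equation for $\lambda_1$
$$\lambda_1^2 - (a + b + 4c)\lambda_1 + ab + 2c(a + b) + 2c^2 = 0$$
which has the two real solutions
$$\lambda_1 = \frac{1}{2}(a + b + 4c) \pm \frac{1}{2}\Delta, \quad \Delta = \sqrt{(a - b)^2 + 8c^2}.$$
The quadratic equation for $\lambda_1$ can be rewritten in the form
$$(\lambda_1 - a)(\lambda_1 - b) = 2c(2\lambda_1 - a - b) - 2c^2$$
and also the solution for $\lambda_1$ gives $a + b - 2\lambda_1 = -(4c \pm \Delta)$. Hence
$$\frac{1}{a - \lambda_1} + \frac{1}{b - \lambda_1} = -\frac{4c \pm \Delta}{2c(3c \pm \Delta)}.$$
Hence, equation 10.24 becomes
$$\frac{\lambda_2}{2c}\left(1 - \frac{4c \pm \Delta}{6c \pm 2\Delta}\right) = 1 \quad\text{giving}\quad \lambda_2 = \frac{2c(6c \pm 2\Delta)}{2c \pm \Delta}.$$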

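Let $h_c$ be the height of the right circular cylinder, so its volume and surface area are
$$V_{cy} = \pi r^2 h_c \quad\text{and}\quad S_{cy} = 2\pi r h_c.$$
The volume and surface area of the cone are
$$V_{cn} = \frac{1}{3}\pi r^2 h \quad\text{and}\quad S_{cn} = \pi r\sqrt{r^2 + h^2}$$
and since the material of the cone is double thickness the total volume and surface areas
are
$$V = \pi r^2 h_c + \frac{1}{3}\pi r^2 h \quad\text{and}\quad S = 2\pi r h_c + 2\pi r\sqrt{r^2 + h^2}.$$
Eliminate $h_c$ to give
$$S(r, h) = \frac{2V}{r} - \frac{2}{3}\pi r h + 2\pi r\sqrt{r^2 + h^2}.$$
Thus
$$\frac{\partial S}{\partial h} = -\frac{2}{3}\pi r + \frac{2\pi r h}{\sqrt{r^2 + h^2}} = 0 \quad\text{and hence}\quad \sqrt{r^2 + h^2} = 3h \quad\text{or}\quad r = 2\sqrt{2}\,h,$$
which is the first result. Also
$$\frac{\partial S}{\partial r} = -\frac{2V}{r^2} - \frac{2}{3}\pi h + 2\pi\sqrt{r^2 + h^2} + \frac{2\pi r^2}{\sqrt{r^2 + h^2}} = 0.$$
Substituting for $r = 2\sqrt{2}\,h$ and $\sqrt{r^2 + h^2} = \dfrac{3r}{2\sqrt{2}}$ gives
$$\frac{2V}{r^2} = -\frac{2\pi}{3}\,\frac{r}{2\sqrt{2}} + 2\pi\,\frac{3r}{2\sqrt{2}} + 2\pi r\,\frac{2\sqrt{2}}{3} = \frac{8}{3}\pi r\sqrt{2}$$
and hence
$$V = \frac{4}{3}\pi\sqrt{2}\,r^3 = \frac{2}{3}\pi\bigl(r\sqrt{2}\bigr)^3.$$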
Chapter 11
Constrained Variational Problems
11.1 Introduction
In this chapter we apply the Lagrange multiplier method to functionals with constrained
admissible functions. Examples are the isoperimetric and the catenary problems, de-
scribed in sections 2.5.5 and 2.5.6, where the constraint is another functional. In these
examples the stationary path is described by a single function, y(x).
But, the most celebrated isoperimetric problem is that enshrined in the myth de-
scribing the foundation of the Phoenician city of Carthage in 814 BC: this is that Dido,
also known as Elissa, having fled from Tyre after her brother, King Pygmalion, had
killed her husband, was granted by the Libyans as much land as an ox-hide could cover.
By cutting the hide into thin strips, she was able to claim far more ground than an-
ticipated. In common with all foundation myths there is no trace of evidence for its
veracity.
Dido's solution is a circle, which cannot be described by a single function, the natural
representation being parametric. Thus, we need to consider the effects of constraints
on both types of functionals.
There is, however, another type of constrained problem, of equal significance, exem-
plified by the problem of finding geodesics on surfaces. Consider a surface defined in the
three dimensional Cartesian space, which we suppose can be defined by an equation of
the form S(x, y, z) = 0. Given two points on this surface we require the shortest line, on
the surface, joining these points. Any smooth path can be represented parametrically
by three functions (x(t), y(t), z(t)) of a parameter t, with end points at t = 0 and t = 1.
The distance along this path is given by the functional
$$D[x, y, z] = \int_0^1 dt\,\sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}$$
and the constraint that forces this path to be on the surface is $S(x(t), y(t), z(t)) = 0$ for
$0 \le t \le 1$. This is a different type of constraint from that found in the problems
described above. In the non-assessed sections 11.7 and 11.8 this theory is used to solve
variants of the brachistochrone problem.
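To make the geodesic setting concrete, the following sketch approximates the distance functional for two paths lying on the unit sphere, $S(x, y, z) = x^2 + y^2 + z^2 - 1 = 0$, that join $(1, 0, 0)$ to $(0, 1, 0)$. The sphere, the two sample paths and the function names are illustrative assumptions, and the integral is approximated simply by summing the lengths of short chords.

```python
import math

def path_length(path, n=2000):
    """Approximate D[x, y, z] by summing chord lengths between n + 1 samples."""
    pts = [path(i / n) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def surface(p):
    """Constraint function S(x, y, z) for the unit sphere."""
    x, y, z = p
    return x * x + y * y + z * z - 1.0

def equator(t):
    """Great-circle arc from (1, 0, 0) to (0, 1, 0)."""
    phi = 0.5 * math.pi * t
    return (math.cos(phi), math.sin(phi), 0.0)

def detour(t):
    """Another path on the sphere with the same end points."""
    theta = 0.5 * math.pi - 0.5 * math.sin(math.pi * t)   # colatitude
    phi = 0.5 * math.pi * t                               # longitude
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

length_equator = path_length(equator)
length_detour = path_length(detour)
max_violation = max(abs(surface(detour(i / 100))) for i in range(101))
```

Both paths satisfy the constraint at every value of $t$, but only the great-circle arc attains the length $\pi/2$; the other admissible path is longer.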
No fundamentally new ideas are presented in this chapter, but many ideas and
techniques introduced in previous chapters are used in a slightly different context, to
derive new results. As you read through this chapter you should ensure that you
thoroughly understand the previous work upon which it is based.
11.2 Conditional Stationary values of functionals

11.2.1 Functional constraints
One possible method of dealing with constrained problems is to use admissible functions
that automatically satisfy the constraint; this is the equivalent of the direct method
discussed in the introduction to the previous chapter. Unfortunately it is not always
possible to formulate satisfactory rules for defining such functions so the alternative
method, described in theorem 11.1 below, is essential.
The general theory for this type of function is a combination of the Lagrange multi-
plier method, described in chapter 10, and the derivation of the Euler-Lagrange equation
given in chapter 3; it is convenient to summarise the result as a theorem.
Theorem 11.1
Given the functional
$$S[y] = \int_a^b dx\,F(x, y, y'), \quad y(a) = A, \quad y(b) = B, \qquad (11.1)$$
where the admissible curves must also satisfy the constraint functional
$$C[y] = \int_a^b dx\,G(x, y, y') = c, \qquad (11.2)$$
where $c$ is a given constant, then, if $y(x)$ is not a stationary path of $C[y]$, there exists a
Lagrange multiplier $\lambda$ such that $y(x)$ is a stationary path of the auxiliary functional
$$\overline{S}[y] = \int_a^b dx\,\overline{F}(x, y, y'), \quad y(a) = A, \quad y(b) = B, \qquad (11.3)$$
where $\overline{F} = F - \lambda G$. That is, the stationary path is given by the solutions of the
Euler-Lagrange equation
$$\frac{d}{dx}\left(\frac{\partial \overline{F}}{\partial y'}\right) - \frac{\partial \overline{F}}{\partial y} = 0, \quad y(a) = A, \quad y(b) = B. \qquad (11.4)$$
The solution of this Euler-Lagrange equation will depend upon $\lambda$, the value of which is
determined by substituting the solution into the constraint functional 11.2.
The proof of this theorem requires a significant, and not immediately obvious, change
to the proof presented in section 3.4. Thus before providing the proof it is instructive
to see what happens when the general theory of section 3.4 is applied directly to this
type of problem; this shows why a modification is required.
Suppose that $y(x)$ is the required solution: consider the neighbouring admissible
function $y(x) + \epsilon h(x)$ where $h(a) = h(b) = 0$, then the Gateaux differential is
$$\Delta S[y, h] = \int_a^b dx \left(\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'}\right) h(x). \qquad (11.5)$$
But both $y(x) + \epsilon h(x)$ and $y(x)$ are chosen to satisfy the constraint, that is
$C[y + \epsilon h] = C[y]$, so the rate of change of $C[y]$ is zero, that is the Gateaux
differential is zero
$$\Delta C[y, h] = \int_a^b dx \left(\frac{\partial G}{\partial y} - \frac{d}{dx}\frac{\partial G}{\partial y'}\right) h(x) = 0 \quad\text{for all } h(x). \qquad (11.6)$$
It is assumed that $C[y]$ is not stationary, so $G(x, y, y')$ does not satisfy the
Euler-Lagrange equation. But 11.6 is true for all $h(x)$ only if $G$ satisfies the Euler-Lagrange
equation. This contradiction can be resolved with a judicious choice of $h(x)$. The
problem is that the constraint places an additional restriction on the variation $h(x)$
so that the theory developed in chapter 4, which placed no restriction (other than
differentiability) on $h(x)$, needs to be modified.
The same problem arises with functions of $n$ real variables, $s(\mathbf{x})$ and a single
constraint $c(\mathbf{x}) = 0$. In this case the equivalents of expressions 11.5 and 11.6 are
$$\Delta s[\mathbf{x}, \mathbf{h}] = \sum_{k=1}^{n} \frac{\partial s}{\partial x_k} h_k = 0 \quad\text{and}\quad \Delta c[\mathbf{x}, \mathbf{h}] = \sum_{k=1}^{n} \frac{\partial c}{\partial x_k} h_k = 0.$$
But the second of these equations is true for all variations satisfying the constraint, so
the $h_k$ cannot be varied independently, and therefore we cannot deduce that
$\partial s/\partial x_k = 0$ for all $x_k$.
In order to derive the Euler-Lagrange equation 11.4 we use a special set of variations.
Recall that when first deriving the Euler-Lagrange equation in section 3.4 we used the
fundamental Lemma, section 3.3, which involved sets of functions h(x) that isolated
small intervals of the integrand. Here we use a modification of this method that involves
picking out two, small, distinct intervals.
This is achieved by writing
$$h(x) = \epsilon_1 g(x - \xi_1) + \epsilon_2 g(x - \xi_2), \quad \xi_1 \neq \xi_2, \qquad (11.7)$$
where the function $g(x - \xi)$ is strongly peaked in a neighbourhood of $x = \xi$ and zero
for other $x$.
Such functions can be constructed from the type of function used to prove the
fundamental lemma, section 3.3; for example define
$$g(x - \xi) = \begin{cases} \dfrac{1}{\rho^2}\bigl(\rho^2 - (x - \xi)^2\bigr), & a < \xi - \rho \le x \le \xi + \rho < b, \\ 0, & \text{otherwise.} \end{cases} \qquad (11.8)$$
The coefficient $\rho^{-2}$ is chosen to make $g = O(1)$. This function is zero except in the
neighbourhood of width $2\rho$ centred at $x = \xi$.
For any function $f(x)$ possessing a third derivative for $a \le x \le b$, we have, see
exercise 11.1
$$\int_a^b dx\,f(x)g(x - \xi) = \beta\rho f(\xi) + O(\rho^3), \quad \beta = \frac{4}{3}, \quad a + \rho < \xi < b - \rho. \qquad (11.9)$$
In the following analysis we use the specific family of functions 11.8 in order to illustrate
how the proof works. Such a restriction is not necessary, but without it the more general
equivalent of equation 11.9 needs to be derived; the only significant difference between
equation 11.9 and the general case is that the term $O(\rho^3)$ is replaced by a term $O(\rho^2)$.
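The behaviour claimed for the peaked function can be examined numerically. In the sketch below, which is an illustration and not a proof, the interval is taken to be $(0, 1)$, the test function is the cosine, and the integral of $f(x)$ times the peaked function is evaluated with a composite Simpson rule; the function names are hypothetical. If the remainder is of third order in the half-width $\rho$, halving $\rho$ should reduce it by a factor close to 8.

```python
import math

def bump(x, xi, rho):
    """Peaked function: (rho^2 - (x - xi)^2) / rho^2 on |x - xi| < rho, else 0."""
    s = x - xi
    return (rho * rho - s * s) / (rho * rho) if abs(s) < rho else 0.0

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3

def remainder(f, xi, rho):
    """Integral of f times the bump, minus the leading term (4/3)*rho*f(xi)."""
    integral = simpson(lambda x: f(x) * bump(x, xi, rho), xi - rho, xi + rho)
    return integral - (4.0 / 3.0) * rho * f(xi)

xi = 0.5                                   # centre of the peak, inside (0, 1)
ratio = remainder(math.cos, xi, 0.1) / remainder(math.cos, xi, 0.05)
```

The computed ratio is very close to 8, consistent with a remainder proportional to the cube of $\rho$.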
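For convenience define the functions
$$\mathcal{F}(x) = \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} \quad\text{and}\quad \mathcal{G}(x) = \frac{d}{dx}\left(\frac{\partial G}{\partial y'}\right) - \frac{\partial G}{\partial y}$$
which we assume are sufficiently well behaved for $a \le x \le b$. Then the integrals 11.5
and 11.6 become, respectively,
$$\Delta S = -\int_a^b dx\,\mathcal{F}(x)\bigl(\epsilon_1 g(x - \xi_1) + \epsilon_2 g(x - \xi_2)\bigr) = -\beta\rho\bigl(\epsilon_1\mathcal{F}(\xi_1) + \epsilon_2\mathcal{F}(\xi_2)\bigr) + O(\rho^3), \qquad (11.10)$$
$$\Delta C = -\int_a^b dx\,\mathcal{G}(x)\bigl(\epsilon_1 g(x - \xi_1) + \epsilon_2 g(x - \xi_2)\bigr) = -\beta\rho\bigl(\epsilon_1\mathcal{G}(\xi_1) + \epsilon_2\mathcal{G}(\xi_2)\bigr) + O(\rho^3). \qquad (11.11)$$
The functional $C[y]$ is not stationary therefore we may choose $\xi_2$ such that
$\mathcal{G}(\xi_2) \neq 0$, and then equation 11.11 gives, since $\Delta C = 0$,
$$\epsilon_2 = -\epsilon_1\frac{\mathcal{G}(\xi_1)}{\mathcal{G}(\xi_2)} + O(\rho^2).$$
Substituting this into equation 11.10 and using the fact that $S[y]$ is stationary, so
$\Delta S = 0$,
$$\epsilon_1\left(\frac{\mathcal{F}(\xi_1)}{\mathcal{G}(\xi_1)} - \frac{\mathcal{F}(\xi_2)}{\mathcal{G}(\xi_2)}\right) = O(\rho^2).$$
Since this equation must be true for all $\epsilon_1$, and the left-hand side is independent of
$\rho$, we must have
$$\frac{\mathcal{F}(\xi_1)}{\mathcal{G}(\xi_1)} = \frac{\mathcal{F}(\xi_2)}{\mathcal{G}(\xi_2)}.$$
Finally, recall that $\xi_1$ and $\xi_2$ are arbitrary, so it follows that the ratio
$\mathcal{F}(x)/\mathcal{G}(x)$ is independent of $x$. Setting this ratio to a constant $\lambda$
we obtain
$$\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} - \lambda\left(\frac{d}{dx}\left(\frac{\partial G}{\partial y'}\right) - \frac{\partial G}{\partial y}\right) = 0 \quad\text{for } a \le x \le b, \qquad (11.12)$$
which is just equation 11.4 and can be derived from the functional $\overline{S}[y]$ in the usual
manner.
This proof shows clearly why two small parameters, $\epsilon_1$ and $\epsilon_2$, are necessary; we
need the flexibility to isolate two distinct points, $\xi_1$ and $\xi_2$, in the interval $(a, b)$ to
show that the ratio $\mathcal{F}(x)/\mathcal{G}(x)$ is independent of $x$. In this proof it is
necessary to assume that $\mathcal{G}(x) \neq 0$ for almost all values of $x$ in this interval:
that is, $C[y]$ must not be stationary.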
Exercise 11.1
Prove equation 11.9
Exercise 11.2
Use theorem 11.1 to show that the stationary path of the variational problem
$$S[y] = \int_0^1 dx\,y'^2, \quad y(0) = y(1) = 0,$$
subject to the constraint that the area under the curve is fixed, that is
$$C[y] = \int_0^1 dx\,y(x) = A,$$
is given by $y = 6Ax(1 - x)$, and that the undetermined multiplier is $\lambda = 24A$.
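The quoted answer can be checked, although not derived, with a few lines of exact arithmetic. In the sketch below the sample area $A = 5/3$ and the helper `simpson_01` are assumptions made for the check. It uses the Euler-Lagrange equation of theorem 11.1 with $\overline{F} = y'^2 - \lambda y$, which reduces to $2y'' + \lambda = 0$, and a single Simpson panel, which is exact for the quadratic integrand.

```python
from fractions import Fraction

def simpson_01(func):
    """Simpson's rule on [0, 1] with one panel; exact for cubic polynomials."""
    return (func(Fraction(0)) + 4 * func(Fraction(1, 2)) + func(Fraction(1))) / 6

A = Fraction(5, 3)                         # sample value of the prescribed area
lam = 24 * A                               # multiplier quoted in the exercise

def y(x):
    """Claimed stationary path y = 6 A x (1 - x)."""
    return 6 * A * x * (1 - x)

y_second = -12 * A                         # constant second derivative of y
area = simpson_01(y)                       # value of the constraint functional
el_residual = 2 * y_second + lam           # Euler-Lagrange equation 2y'' + lam = 0
end_values = (y(Fraction(0)), y(Fraction(1)))
```

The area equals $A$, the end conditions hold, and the Euler-Lagrange residual vanishes, so the quoted path and multiplier are consistent.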
Exercise 11.3
Show that the stationary path of the functional
$$S[y] = \int_1^2 dx\,x\,y'^2, \quad y(1) = y(2) = 0,$$
subject to the constraint
$$\int_1^2 dx\,y = 1 \quad\text{is given by}\quad y(x) = \frac{2\ln 2}{3\ln 2 - 2}\left(1 - x + \frac{\ln x}{\ln 2}\right).$$
We end this section by considering the effect of $M$ functional constraints on a functional
of $n$ dependent variables. The extension required is identical to that described in
the previous chapter, namely a Lagrange multiplier is added for each constraint: we
summarise the result as a theorem.
Theorem 11.2
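Given the functional
\[
S[\mathbf{y}] = \int_a^b dx\, F(x, \mathbf{y}(x), \mathbf{y}'(x)), \quad \mathbf{y}(a) = \mathbf{A}, \quad \mathbf{y}(b) = \mathbf{B}, \tag{11.13}
\]
where $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ and where the admissible curves must also satisfy the M
constraint functionals
\[
C_j[\mathbf{y}] = \int_a^b dx\, G_j(x, \mathbf{y}(x), \mathbf{y}'(x)) = c_j, \quad j = 1, 2, \ldots, M, \tag{11.14}
\]
where the $c_j$ are M given constants, then, if $\mathbf{y}(x)$ is not a stationary path of any of the
constraints, there exists a set of M Lagrange multipliers $\lambda_j$, $j = 1, 2, \ldots, M$, such that
$\mathbf{y}(x)$ is a stationary path of the functional
\[
\bar{S}[\mathbf{y}] = \int_a^b dx\, \bar{F}(x, \mathbf{y}(x), \mathbf{y}'(x)), \quad \mathbf{y}(a) = \mathbf{A}, \quad \mathbf{y}(b) = \mathbf{B}, \tag{11.15}
\]
where $\bar{F} = F - \sum_{j=1}^{M} \lambda_j G_j$. That is, the stationary path is given by the solution of the
Euler-Lagrange equations
\[
\frac{d}{dx}\left(\frac{\partial \bar{F}}{\partial y_k'}\right) - \frac{\partial \bar{F}}{\partial y_k} = 0, \quad
y_k(a) = A_k, \quad y_k(b) = B_k, \quad k = 1, 2, \ldots, n. \tag{11.16}
\]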
The solution of these n Euler-Lagrange equations will depend upon M Lagrange mul-
tipliers, the values of which are determined by substituting the solution into the M
constraint functionals 11.14.
Exercise 11.4
Show that the stationary paths of the functional
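\[
S[y, z] = \int_0^1 dx \left( y'^2 + z'^2 - 2xz' - 4z \right), \quad y(0) = z(0) = 0, \quad y(1) = z(1) = 1,
\]
subject to the constraint
\[
C[y, z] = \int_0^1 dx \left( y'^2 - xy' - z'^2 \right) = c,
\]
are given by
\[
y = \frac{(4 - 3\lambda)x - \lambda x^2}{4(1 - \lambda)} \quad \text{and} \quad z = \frac{(3 + 2\lambda)x - x^2}{2(1 + \lambda)}
\]
where $\lambda$ is a solution of
\[
1 + c = \frac{24 - 46\lambda + 23\lambda^2}{48(1 - \lambda)^2} - \frac{1}{12(1 + \lambda)^2}.
\]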
11.2.2 The dual problem

The form of theorem 11.1 suggests the same duality as for functions of real variables,
described in section 10.3. Thus we may change the roles of the functionals S[y] and C[y]
in theorem 11.1 and, provided $\lambda \neq 0$, the Euler-Lagrange equation of the functional
$\bar{C}[y] = C[y] - \mu S[y]$ gives the stationary paths of C[y] subject to the constraint S[y] = s,
which provides the equation for the Lagrange multiplier $\mu$. The following exercise is
the dual of the problem considered in exercise 11.2.
Exercise 11.5
Show that the stationary path of the variational problem
\[
S[y] = \int_0^1 dx\, y, \quad y(0) = y(1) = 0,
\]
subject to the constraint $C[y] = \int_0^1 dx\, y'^2 = c$, is given by $y(x) = \sqrt{3c}\, x(1 - x)$,
and that the undetermined multiplier is $\lambda = 1/(4\sqrt{3c})$.
11.2.3 The catenary

Here we determine the shape of the catenary, that is, the shape assumed by an inextensible
cable of uniform density, $\rho$, and known length, hanging between fixed supports.
In figure 11.1 we show an example of such a curve with the points of support at (0, B)
and (a, A), with a > 0 and B < A.
Figure 11.1 The catenary formed by a uniform cable
hanging between two points at different heights.
If a curve is described by a differentiable function y(x) it can be shown, see exercise 2.19
(page 103), that the potential energy E of the cable is proportional to the functional
\[
E[y] = \rho g \int_0^a dx\, y \sqrt{1 + y'^2}, \quad y(0) = B, \quad y(a) = A \ge B. \tag{11.17}
\]
The curve that minimises this functional, subject to the length of the cable,
\[
L[y] = \int_0^a dx \sqrt{1 + y'^2}, \tag{11.18}
\]
remaining constant is the shape assumed by the hanging cable.

Notice that the functional E[y] is identical to that giving the area of a surface of
revolution, see equation 4.11 (page 171). But, in the present case we shall see that the
existence of the constraint changes the behaviour of the solutions.
Experience leads us to expect that provided L is larger than the distance between
the supports, $\sqrt{a^2 + (A - B)^2}$, the cable hangs in a specific manner; thus we expect
that there is a unique path that minimises E[y] with the constraint L[y]. Here we show
that provided $L > \sqrt{a^2 + (A - B)^2}$ (and the cable is strong enough) there are always
two stationary paths. But, in section 4.3 we saw that when A = B, with no constraint,
there are either two or no smooth stationary paths of E[y], depending upon the ratio
A/a. In exercise 4.19 it was shown that if B = 0 and A > 0, again with no constraint,
there is no solution. This illustrates the significance of constraints.
A physical interpretation of the effect of the removal of this constraint is given by
considering a slight modification of the catenary, whereby the points of support are two
smooth pegs and the cable is draped over these with the surplus cable resting on the
ground, as shown in figure 11.2: the important property of a smooth peg is that around
it tension in the cable does not change. We suppose that the cable is sufficiently long
that there is always some cable on the ground.
Figure 11.2 Diagram showing a cable hanging over two
smooth pegs, at the same height, A, above the ground, a
distance a apart. The cable is long enough to reach the
ground on both sides.
In this example the potential energy of the vertical segments is independent of the
shape of the hanging portion, so the energy is given by equation 11.17, and there is no
constraint. This is the same functional as gives the area of a surface of revolution.
The hanging portion of the cable is supported only by the weight of the vertical
portion of the cable, so we consider the effect of keeping A and B fixed and changing
a, the separation between the pegs. First consider the case A = B.
If $a \ll A$ the weight of the hanging cable is relatively small by comparison to the
vertical portion, and we expect the portion between the pegs to be almost horizontal.
In addition there will be a solution where the hanging portion falls almost vertically
near the pegs and with a section of it resting on the floor. Figure 4.11 (page 176) shows
the two solutions, for $a \ll A$; one is almost horizontal and is shown in section 7.7 to be
a local minimum.
If $a \gg A$ the weight of the hanging cable is relatively large and cannot be supported
by the vertical portion. Now the only solution is the Goldschmidt solution,
equation 4.20, which is physically possible only for an infinitely flexible cable.
Notice that if B = 0 the length of the vertical portion of the cable must be less
than the hanging portion, which therefore cannot be supported, so there is no smooth
solution, as in exercise 4.19. This example demonstrates the importance of constraints.
Returning to the main problem, choose the axes so the left-hand support is at the
origin, that is B = 0, and the right-hand end has coordinates (a, A). Further we may
assume, with no loss of generality, that $A \ge 0$. The energy and constraint functionals
are given in equations 11.17 and 11.18, so if $\rho g \lambda$ is the Lagrange multiplier the auxiliary
functional is proportional to
\[
\bar{E}[y] = \int_0^a dx\, (y - \lambda) \sqrt{1 + y'^2}, \quad y(0) = 0, \quad y(a) = A \ge 0, \tag{11.19}
\]
and this can be expressed in terms of a new variable $z = y - \lambda$,
\[
\bar{E}[z] = \int_0^a dx\, z \sqrt{1 + z'^2}, \quad z(0) = -\lambda, \quad z(a) = A - \lambda.
\]
The first-integral of this functional is
\[
\frac{z}{\sqrt{1 + z'^2}} = c,
\]
where c is a constant. Solving this equation for $z'$ gives the first-order equation
\[
\frac{dz}{dx} = \pm\frac{\sqrt{z^2 - c^2}}{c},
\]
which is the same equation as derived in section 4.3.2. Putting $z = c \cosh \phi(x)$ gives
$c\phi' = \pm 1$, so the general solution is
\[
y = \lambda + c \cosh\left(\frac{x + d}{c}\right), \tag{11.20}
\]
where d is another constant. This solution contains three unknown constants, $\lambda$, d
and c, which are obtained from the two boundary conditions and the constraint, as
shown next.
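The boundary conditions y(0) = 0 and y(a) = A give the equations
\[
\lambda = -c \cosh\left(\frac{d}{c}\right) \quad \text{and} \quad A = \lambda + c \cosh\left(\frac{a + d}{c}\right), \tag{11.21}
\]
and the constraint becomes
\[
L = \int_0^a dx \sqrt{1 + \sinh^2\left(\frac{x + d}{c}\right)} = \int_0^a dx \cosh\left(\frac{x + d}{c}\right)
\]
\[
\phantom{L} = c\left[\sinh\left(\frac{a + d}{c}\right) - \sinh\left(\frac{d}{c}\right)\right]
= 2c \cosh\left(\frac{a + 2d}{2c}\right) \sinh\left(\frac{a}{2c}\right). \tag{11.22}
\]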
Equations 11.21 and 11.22 give three equations enabling the constants $\lambda$, c and d to be
determined in terms of L, B and (a, A). It is not possible to find closed-form formulae for these
constants, but a numerical solution is made relatively easy after some rearrangements
are made. Subtracting equations 11.21 gives
\[
A = c\left[\cosh\left(\frac{a + d}{c}\right) - \cosh\left(\frac{d}{c}\right)\right]
= 2c \sinh\left(\frac{a + 2d}{2c}\right) \sinh\left(\frac{a}{2c}\right). \tag{11.23}
\]
On squaring and subtracting equations 11.22 and 11.23 we obtain
\[
L^2 - A^2 = 4c^2 \sinh^2\left(\frac{a}{2c}\right) \quad \text{or, with } \eta = \frac{a}{2c}, \quad
\sinh \eta = \frac{\sqrt{L^2 - A^2}}{a}\, \eta. \tag{11.24}
\]
This equation for $\eta$ has two nonzero real solutions, $\eta = \pm\eta_0$, where $\eta_0$ is the positive solution
of the second equation; so $c = \pm c_0$, $c_0 = a/(2\eta_0) > 0$. These two values of c give two
values, $d_\pm$, of d which can be found by dividing 11.23 by 11.22 to give
\[
\tanh\left(\frac{a + 2d_\pm}{2c_0}\right) = \pm\frac{A}{L}, \quad \left(0 < \frac{A}{L} < 1\right).
\]
If $D_0$ is the positive solution of $\tanh D_0 = A/L$ then
\[
\frac{d_\pm}{c_0} = -\eta_0 \pm D_0, \quad \eta_0 = \frac{a}{2c_0}.
\]
Then equation 11.21 gives the following two values for $\lambda$,
\[
\lambda_\pm = \mp c_0 \cosh(\eta_0 \mp D_0), \quad \text{giving} \quad \lambda_+ + \lambda_- = A.
\]
Hence the two solutions are
\[
y_+(x) = c_0\left[\cosh\left(\frac{2\eta_0}{a}x - (\eta_0 - D_0)\right) - \cosh(\eta_0 - D_0)\right] \tag{11.25}
\]
and
\[
y_-(x) = -c_0\left[\cosh\left(\frac{2\eta_0}{a}x - (\eta_0 + D_0)\right) - \cosh(\eta_0 + D_0)\right]. \tag{11.26}
\]
The solution $y_+(x)$ has a local minimum at $x = x_m = a(1 - D_0/\eta_0)/2$, and $y_-(x)$ has
a maximum at $x = a - x_m$. Also we note that $y_-(a - x) = A - y_+(x)$. An example of each
of these solutions is shown in figure 11.3. Only $y_+(x)$ is physically significant in the
present context.
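The numerical solution mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the symbol $\eta$ for a/(2c), the bisection solver and all function names are choices made here rather than anything prescribed by the text, and the parameter values a = A = 1, L = 3 are those of figure 11.3.

```python
import math

def solve_eta0(a, A, L, tol=1e-13):
    """Positive root of sinh(eta) = k*eta, k = sqrt(L^2 - A^2)/a, by bisection."""
    k = math.sqrt(L * L - A * A) / a
    if k <= 1.0:
        raise ValueError("need L > sqrt(a^2 + A^2) for a nonzero root")
    f = lambda eta: math.sinh(eta) - k * eta
    lo, hi = 1e-9, 1.0          # f(lo) < 0 because sinh(eta) ~ eta < k*eta
    while f(hi) < 0.0:          # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def catenary_pair(a, A, L):
    """Return eta0, c0, D0 and the two stationary paths y_plus, y_minus."""
    eta0 = solve_eta0(a, A, L)
    c0 = a / (2.0 * eta0)
    D0 = math.atanh(A / L)

    def y_plus(x):
        s = eta0 - D0
        return c0 * (math.cosh(2.0 * eta0 * x / a - s) - math.cosh(s))

    def y_minus(x):
        s = eta0 + D0
        return -c0 * (math.cosh(2.0 * eta0 * x / a - s) - math.cosh(s))

    return eta0, c0, D0, y_plus, y_minus

def polyline_length(f, a, n=20000):
    """Approximate arc length of y = f(x) on [0, a] by a fine polyline."""
    h = a / n
    return sum(math.hypot(h, f((i + 1) * h) - f(i * h)) for i in range(n))

a, A, L = 1.0, 1.0, 3.0
eta0, c0, D0, y_plus, y_minus = catenary_pair(a, A, L)
x_m = a * (1.0 - D0 / eta0) / 2.0   # position of the minimum of y_plus
```

Evaluating `y_plus(a)`, `y_minus(a)` and `polyline_length(y_plus, a)` gives an independent check that both paths pass through (a, A) and have the prescribed length.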
Figure 11.3 Graphs of the functions $y_\pm(x)$ in the case
a = A = 1 and L = 3.
We can deduce the existence of these two solutions directly from the original functional.
Suppose that y(x) satisfies the Euler-Lagrange equations associated with $\bar{E}[y]$, then if
$w(x) = A - y(a - x)$, so w(0) = 0 and w(a) = A, then
\[
\bar{E}[y] = \int_0^a dx\, \bigl(A - \lambda - w(a - x)\bigr) \sqrt{1 + w'(a - x)^2}
= -\int_0^a du\, \bigl(w(u) - \mu\bigr) \sqrt{1 + w'(u)^2}, \quad u = a - x, \quad \lambda + \mu = A.
\]
Thus if $y_+(x)$ is a stationary path then so is $y_-(x)$. Also,
\[
E[y_-] - E[y_+] = \rho g\, c_0^2 \left(\sinh 2\eta_0 - 2\eta_0\right) > 0, \tag{11.27}
\]
that is the potential energy of the path $y_+(x)$ is less than that of the path $y_-(x)$; physical
considerations suggest that $y_+(x)$ gives a minimum value of E[y].
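The energy difference between the two stationary paths can be checked by direct quadrature. The sketch below is illustrative only: energies are measured in units of $\rho g$, the closed form $c_0^2(\sinh 2\eta_0 - 2\eta_0)$ used for comparison is the value obtained by integrating $y\sqrt{1 + y'^2}$ along the two catenaries, and the symbol $\eta_0$, the function names and the parameter values a = A = 1, L = 3 are assumptions made here.

```python
import math

def eta0_bisect(k, tol=1e-13):
    """Positive root of sinh(eta) = k*eta for k > 1, by bisection."""
    lo, hi = 1e-9, 1.0
    while math.sinh(hi) < k * hi:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.sinh(mid) < k * mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a, A, L = 1.0, 1.0, 3.0
eta0 = eta0_bisect(math.sqrt(L * L - A * A) / a)
c0 = a / (2.0 * eta0)
D0 = math.atanh(A / L)

def energy(sign, n=20000):
    """Midpoint-rule value of the integral of y*sqrt(1 + y'^2) over [0, a].

    sign = +1 selects y_plus, sign = -1 selects y_minus (units of rho*g).
    """
    shift = eta0 - sign * D0
    h = a / n
    total = 0.0
    for i in range(n):
        u = 2.0 * eta0 * (i + 0.5) * h / a - shift
        y = sign * c0 * (math.cosh(u) - math.cosh(shift))
        total += y * math.cosh(u) * h      # sqrt(1 + sinh(u)^2) = cosh(u)
    return total

numeric_gap = energy(-1) - energy(+1)
formula_gap = c0**2 * (math.sinh(2.0 * eta0) - 2.0 * eta0)
```

The two values `numeric_gap` and `formula_gap` should agree to within the quadrature error, and both are positive.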
In figure 11.4 we show some examples of y+ (x) for a = A = 1 and various values
of L.
[Figure: curves $y_+(x)$ for L = 1.42, 1.5, 2 and 3, plotted for 0 ≤ x ≤ 1.]
Figure 11.4 Graphs showing catenaries $y_+(x)$, defined in
equation 11.25, of various lengths, L, for a = A = 1.
Exercise 11.6
Show that equation 11.24 for $\eta$ has a unique positive real solution if L is larger than the
distance between the origin and (a, A). What is the positive limiting value of c as
the stationary path $y_+(x)$ tends to the straight line between the end points?
Exercise 11.7
For given values of a and L, (L > a), show that the catenary $y_+(x)$ with zero
gradient at the left end, x = 0, has the height difference $A = L \tanh \eta$ where
$a \sinh 2\eta = 2L\eta$.
Exercise 11.8
Prove the inequality 11.27.
Exercise 11.9
(a) Show that the Euler-Lagrange equation associated with the functional $\bar{E}[y]$,
defined in equation 11.19, is
\[
(y - \lambda) y'' = 1 + y'^2, \quad y(0) = 0, \quad y(a) = A.
\]
(b) If y(x) is a solution of this equation and $w(u) = A - y(x)$, $u = a - x$, show
that w(u) satisfies the equation
\[
(w - \mu) w''(u) = 1 + w'(u)^2, \quad w(0) = 0, \quad w(a) = A \quad \text{and} \quad \lambda + \mu = A.
\]
Hence explain why another solution for the minimum surface problem, discussed
in section 4.3, cannot be generated by this transformation.
11.3 Variable end points

Variational problems with variable end points, but without constraints, were considered
in section 9.3. The addition of one or more constraints does not alter this theory in any
significant way, although its implementation is usually more difficult.
Suppose that we require the stationary paths of the functional
\[
S[y] = \int_a^v dx\, F(x, y, y'), \quad y(a) = A, \tag{11.28}
\]
where the right-hand end of the path lies on the curve defined by $\tau(x, y) = 0$ and where
the constraint
\[
C[y] = \int_a^v dx\, G(x, y, y') = c, \quad \text{a constant}, \tag{11.29}
\]
also needs to be satisfied.
Using a similar analysis to that outlined in section 11.2.1 it can be shown that the
required stationary path is given by the stationary path of the auxiliary functional
\[
\bar{S}[y] = \int_a^v dx\, \bar{F}(x, y, y'), \quad y(a) = A, \quad \bar{F} = F - \lambda G, \tag{11.30}
\]
where $\lambda$ is a Lagrange multiplier and at x = v the transversality condition (page 350),
\[
\Bigl[\tau_x \bar{F}_{y'} + \tau_y \bigl(y' \bar{F}_{y'} - \bar{F}\bigr)\Bigr]_{x=v} = 0, \tag{11.31}
\]
is satisfied.
As before the solution of the associated Euler-Lagrange equation depends upon $\lambda$,
the value of which is determined by the constraint.
Exercise 11.10
A curve of given length L is described by the positive function y(x) passing through
the origin and some point, (v, 0), with v > 0, to be determined. Find the shape
of the curve making the area under it stationary.
Hint: in this example the boundary curve is $\tau(x, y) = y = 0$.
Exercise 11.11
A curve described by the positive function y(x) passing through the origin and
some point, to be determined, x = v > 0 on the x-axis, is rotated about the x-axis
to form a solid body.
(a) Show that the volume, V[y], and the surface area, A[y], of this body are given
by
\[
V[y] = \pi \int_0^v dx\, y^2 \quad \text{and} \quad A[y] = 2\pi \int_0^v dx\, y \sqrt{1 + y'^2}.
\]
(b) If the surface area is given determine the path making the volume stationary,
and find the volume in terms of A.
Hint: in this example the boundary curve is $\tau(x, y) = y = 0$.
Exercise 11.12
Show that the equation of the cable with the right-hand end fixed at (a, A), where
a and A are positive, and with the left-hand end free to slide on a vertical pole
aligned along the y-axis is given by $y = A + c \cosh(x/c) - c \cosh(a/c)$, where c is
given by the positive root, $\eta$, of $L\eta/a = \sinh \eta$ and $c = a/\eta$.
Exercise 11.13
Show that the equation of a cable of length L and uniform density, with the left
end free to slide on a vertical pole aligned along the y-axis and the right end free
to slide along the straight line x/a + y/b = 1, a, b > 0, is
\[
y = \lambda + \frac{bL}{a} \cosh\left(\frac{ax}{bL}\right), \quad 0 \le x \le \frac{bL}{a} \sinh^{-1}\left(\frac{a}{b}\right),
\]
for some $\lambda$ for which you should find an expression in terms of a, b and L.
11.4 Broken extremals

The theory of broken extremals, section 9.5.2, remains essentially unchanged when
constraints are added. For one constraint the theory is as described in that section
except that the integrand F is replaced by $\bar{F} = F - \lambda G$, where $\lambda$ is a Lagrange multiplier
and G the integrand of the constraint.
We illustrate this theory with the simple example requiring the stationary paths of
\[
S[y] = \int_0^a dx\, y'^2, \quad y(0) = 0, \quad y(a) = A \tag{11.32}
\]
of given length
\[
L[y] = \int_0^a dx \sqrt{1 + y'^2}
\]
and with a discontinuous derivative at x = c, with 0 < c < a.
The modified functional is
\[
\bar{S}[y] = \int_0^a dx \left( y'^2 - \lambda \sqrt{1 + y'^2} \right), \quad y(0) = 0, \quad y(a) = A. \tag{11.33}
\]
This integrand depends only upon $y'$, so the solutions of the associated Euler-Lagrange
equation are straight lines, y = mx + d. On the interval $0 \le x \le c$, since y(0) = 0, the
appropriate solution is $y = m_1 x$, for some constant $m_1$. On the interval $c \le x \le a$, the
solution through y(a) = A is $y = A + m_2(x - a)$. The solution is continuous at x = c,
so
\[
(m_1 - m_2)c = A - m_2 a. \tag{11.34}
\]
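The Weierstrass-Erdmann (corner) conditions connecting the two sides of the solution
at x = c are, see equations 9.53 and 9.53 (page 359),
\[
\lim_{x \to c-} \bigl(\bar{F} - y' \bar{F}_{y'}\bigr) = \lim_{x \to c+} \bigl(\bar{F} - y' \bar{F}_{y'}\bigr), \qquad
\lim_{x \to c-} \bar{F}_{y'} = \lim_{x \to c+} \bar{F}_{y'}.
\]
Since
\[
\bar{F}_{y'} = y'\left(2 - \frac{\lambda}{\sqrt{1 + y'^2}}\right) \quad \text{and} \quad
\bar{F} - y' \bar{F}_{y'} = -\left(y'^2 + \frac{\lambda}{\sqrt{1 + y'^2}}\right)
\]
these conditions become
\[
m_1^2 + \frac{\lambda}{\sqrt{1 + m_1^2}} = m_2^2 + \frac{\lambda}{\sqrt{1 + m_2^2}}, \qquad
m_1\left(2 - \frac{\lambda}{\sqrt{1 + m_1^2}}\right) = m_2\left(2 - \frac{\lambda}{\sqrt{1 + m_2^2}}\right). \tag{11.35}
\]
A solution of the first equation is $m_1 = m$, $m_2 = -m$, for some m; then the second
equation gives
\[
m\left(2 - \frac{\lambda}{\sqrt{1 + m^2}}\right) = 0 \quad \text{giving the nontrivial solution} \quad \sqrt{1 + m^2} = \frac{\lambda}{2}.
\]
The constraint now gives
\[
L = \int_0^a dx \sqrt{1 + m^2} = a\sqrt{1 + m^2} \quad \text{and hence} \quad \lambda = \frac{2L}{a} \quad \text{and} \quad m = \pm\sqrt{\frac{L^2}{a^2} - 1}.
\]
Equation 11.34 for continuity then gives c = (A + ma)/2m. Hence the stationary paths
are
\[
y(x) = \begin{cases} mx, & 0 \le x \le c, \\ A + m(a - x), & c \le x \le a, \end{cases}
\quad \text{where} \quad m = \pm\sqrt{\frac{L^2}{a^2} - 1} \quad \text{and} \quad c = \frac{A + ma}{2m}.
\]
Since 0 < c < a we must have $|A| < |m|a$.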
With no corner conditions the differentiable solution exists only if $L = \sqrt{a^2 + A^2}$,
there being insufficient flexibility to satisfy the constraints and the Euler-Lagrange
equation for any other values of L, a and A.
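The worked example above can be checked numerically. The following is a minimal sketch for the branch m > 0: it evaluates the slope, multiplier and corner position from the formulae in the text and then confirms that both corner conditions, continuity and the length constraint hold. The function names and the sample values a = 2, A = 0.5, L = 3 are illustrative assumptions.

```python
import math

def broken_extremal(a, A, L):
    """Slope m > 0, multiplier lam and corner position c for the example."""
    if L <= a:
        raise ValueError("need L > a for a real slope")
    m = math.sqrt(L * L / (a * a) - 1.0)
    lam = 2.0 * L / a
    c = (A + m * a) / (2.0 * m)
    if not 0.0 < c < a:
        raise ValueError("corner outside (0, a): need |A| < m*a")
    return m, lam, c

def corner_quantities(slope, lam):
    """Return (Fbar_{y'}, Fbar - y' Fbar_{y'}) for Fbar = y'^2 - lam*sqrt(1+y'^2)."""
    root = math.sqrt(1.0 + slope * slope)
    f_p = slope * (2.0 - lam / root)
    h = -(slope * slope + lam / root)
    return f_p, h

a, A, L = 2.0, 0.5, 3.0
m, lam, c = broken_extremal(a, A, L)

left = corner_quantities(m, lam)      # segment y = m x
right = corner_quantities(-m, lam)    # segment y = A + m (a - x)

length = c * math.sqrt(1.0 + m * m) + (a - c) * math.sqrt(1.0 + m * m)
y_left_at_c = m * c
y_right_at_c = A + m * (a - c)
```

For these values both components of `left` and `right` coincide, the two segments meet at x = c, and `length` reproduces L.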
Exercise 11.14
Show that the only solutions of equations 11.35 are those considered in the text.
Exercise 11.15
This is a long, difficult question which should be attempted only if time permits.
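An inextensible cable with uniform density $\rho$, is suspended between the points
(0, B) and (a, A), with $A \ge B$, where the y-axis is vertically upwards. A weight of
mass M is firmly attached to the cable at distances $L_1$ and $L_2$ from the left and
right ends respectively, all distances being measured along the cable.
(a) Show that the energy functional is
\[
E[y] = M g y(\xi) + \rho g \int_0^{\xi} dx\, y \sqrt{1 + y'^2} + \rho g \int_{\xi}^{a} dx\, y \sqrt{1 + y'^2},
\]
where $\xi$ is the x-coordinate of the weight, and that the two constraints are
\[
L_1 = \int_0^{\xi} dx \sqrt{1 + y'^2} \quad \text{and} \quad L_2 = \int_{\xi}^{a} dx \sqrt{1 + y'^2}.
\]
(b) Derive the Euler-Lagrange equations for the cable and show that their solutions
are
\[
y_1(x) = \lambda_1 + c_1 \cosh\left(\frac{x - d_1}{c_1}\right), \quad 0 \le x \le \xi, \quad y_1(0) = B,
\]
\[
y_2(x) = \lambda_2 + c_2 \cosh\left(\frac{x - d_2}{c_2}\right), \quad \xi \le x \le a, \quad y_2(a) = A,
\]
where $\lambda_1$ and $\lambda_2$ are two Lagrange multipliers and $(c_1, c_2, d_1, d_2)$ are constants
arising from the integration of the Euler-Lagrange equations.
(c) Show that $c_1 = c_2 = c$ and that the six remaining unknown constants $(\lambda_1, \lambda_2, \xi, c, d_1, d_2)$
are determined by the following six equations.
\[
L_1 = \int_0^{\xi} dx \sqrt{1 + y_1'^2} = c\left[\sinh\left(\frac{\xi - d_1}{c}\right) + \sinh\left(\frac{d_1}{c}\right)\right],
\]
\[
L_2 = \int_{\xi}^{a} dx \sqrt{1 + y_2'^2} = c\left[\sinh\left(\frac{a - d_2}{c}\right) - \sinh\left(\frac{\xi - d_2}{c}\right)\right],
\]
\[
B = \lambda_1 + c \cosh\left(\frac{d_1}{c}\right) \quad \text{and} \quad A = \lambda_2 + c \cosh\left(\frac{a - d_2}{c}\right),
\]
\[
M = \rho c\left[\sinh\left(\frac{\xi - d_2}{c}\right) - \sinh\left(\frac{\xi - d_1}{c}\right)\right]
\]
and
\[
\lambda_1 + c \cosh\left(\frac{\xi - d_1}{c}\right) = \lambda_2 + c \cosh\left(\frac{\xi - d_2}{c}\right).
\]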
11.5 Parametric functionals

The general theory for a parametrically defined curve is identical to that described
in section 11.2.1, in particular theorem 11.2. Consider the case of three dependent
variables, (x, y, z), depending upon a parameter t, and one constraint: the functional
will be
\[
S[x, y, z] = \int_0^1 dt\, \Phi(x, y, z, \dot{x}, \dot{y}, \dot{z}) \tag{11.36}
\]
with given boundary conditions and with admissible functions restricted to those paths
that satisfy the constraint,
\[
C[x, y, z] = \int_0^1 dt\, G(x, y, z, \dot{x}, \dot{y}, \dot{z}) = c \tag{11.37}
\]
where c is a constant. This is just the problem dealt with by theorem 11.2, so the
stationary paths satisfy the three Euler-Lagrange equations
\[
\frac{d}{dt}\left(\frac{\partial \bar{\Phi}}{\partial \dot{u}}\right) - \frac{\partial \bar{\Phi}}{\partial u} = 0, \quad u = \{x, y, z\}, \quad \bar{\Phi} = \Phi - \lambda G, \tag{11.38}
\]
with the same boundary conditions as defined for the original functional, and where
$\lambda$ is a Lagrange multiplier.
We illustrate this theory by applying it to the original isoperimetric problem of
Dido, that is we require the shape of the closed curve of given length L that encloses
the largest area, though we show only that the area is stationary. A version of this
problem was considered in exercise 11.10, where only the upper half of the curve was
considered: using a parametric representation of the functions this restriction is not
necessary.
The area of a closed curve in the Oxy-plane, see equation 8.5 (page 314), is
\[
A[x, y] = \frac{1}{2} \int_0^{2\pi} dt\, (x\dot{y} - \dot{x}y), \quad x(0) = x(2\pi), \quad y(0) = y(2\pi), \tag{11.39}
\]
where the range of the parameter t is appropriate for it to be an angle, and the curve
is traversed anti-clockwise. The constraint is the length,
\[
C[x, y] = \int_0^{2\pi} dt \sqrt{\dot{x}^2 + \dot{y}^2} = L. \tag{11.40}
\]
If $\lambda$ is the Lagrange multiplier the modified functional is
\[
\bar{A}[x, y] = \frac{1}{2} \int_0^{2\pi} dt \left[ x\dot{y} - \dot{x}y - 2\lambda \sqrt{\dot{x}^2 + \dot{y}^2} \right] \tag{11.41}
\]
and the two associated Euler-Lagrange equations for x and y, respectively, are
\[
\frac{d}{dt}\left( \frac{\lambda \dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}} + \frac{1}{2} y \right) + \frac{1}{2} \dot{y} = 0,
\]
\[
\frac{d}{dt}\left( \frac{\lambda \dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}} - \frac{1}{2} x \right) - \frac{1}{2} \dot{x} = 0.
\]
These simplify to
\[
\frac{d}{dt}\left( \frac{\lambda \dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}} + y \right) = 0 \quad \text{and} \quad
\frac{d}{dt}\left( \frac{\lambda \dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}} - x \right) = 0, \tag{11.42}
\]
which integrate directly to
\[
\frac{\lambda \dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}} = \beta - y \quad \text{and} \quad
\frac{\lambda \dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}} = \alpha + x, \tag{11.43}
\]
for some constants $\alpha$ and $\beta$. Now multiply the first of these by $\dot{y}$, the second by $\dot{x}$ and
subtract to give $(\beta - y)\dot{y} - (\alpha + x)\dot{x} = 0$. Integrate this to obtain
\[
(x + \alpha)^2 + (y - \beta)^2 = \gamma^2, \tag{11.44}
\]
where $\gamma$ is another real constant. This is the equation of the circle with centre at
$(-\alpha, \beta)$ and radius $\gamma$. Its circumference is $2\pi\gamma = L$, which gives the required path. In
parametric form its equations are
\[
x = -\alpha + \frac{L}{2\pi} \cos t \quad \text{and} \quad y = \beta + \frac{L}{2\pi} \sin t. \tag{11.45}
\]
The position of the centre of this circle cannot be determined from the information
provided.
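The conclusion can be illustrated numerically by discretising the functionals 11.39 and 11.40 on closed polygons and comparing the areas enclosed by different shapes rescaled to a common length. This is a sketch under stated assumptions, not a demonstration of optimality: the comparison shapes (an ellipse and a square), the value L = 2 and the function names are choices made here.

```python
import math

def area_and_length(points):
    """Discrete analogues of (11.39) and (11.40) for a closed polygon.

    points is a list of (x, y) vertices traversed anti-clockwise.
    """
    area = 0.0
    length = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        area += 0.5 * (x0 * y1 - x1 * y0)       # shoelace formula
        length += math.hypot(x1 - x0, y1 - y0)
    return area, length

def ellipse(p, q, n=4000):
    """Vertices of an ellipse with semi-axes p and q (a circle when p == q)."""
    return [(p * math.cos(2.0 * math.pi * i / n),
             q * math.sin(2.0 * math.pi * i / n)) for i in range(n)]

def area_at_length(points, L):
    """Area enclosed after rescaling the curve so that its length is L."""
    area, length = area_and_length(points)
    return area * (L / length) ** 2

L = 2.0
circle_area = area_at_length(ellipse(1.0, 1.0), L)
ellipse_area = area_at_length(ellipse(2.0, 1.0), L)
square_area = area_at_length([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)], L)
```

The circle gives an area close to $L^2/(4\pi)$, and the two comparison shapes of the same length enclose less.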
Exercise 11.16
An alternative method of finding the stationary path for the area from equations 11.42
is to use the arc length, s, as the independent variable, which is related
to the parameter t by the relation
\[
\frac{ds}{dt} = \sqrt{\dot{x}^2 + \dot{y}^2}.
\]
(a) Show that with s as the independent variable equations 11.42 become
\[
\lambda \frac{dx}{ds} = \beta - y \quad \text{and} \quad \lambda \frac{dy}{ds} = \alpha + x.
\]
Further, show that these equations can be converted to
\[
\lambda^2 \frac{d^2 y}{ds^2} + y = \beta \quad \text{having the general solutions} \quad
\begin{cases} y = \beta + a \cos(s/\lambda + \delta), \\ x = -\alpha - a \sin(s/\lambda + \delta), \end{cases}
\]
where a and $\delta$ are constants.
(b) Show that $2\pi\lambda = L$, derive equations 11.45 and deduce that $2\pi a = L$.
Exercise 11.17
What is the shape of the closed curve, enclosing a given area, for which the length
is stationary?
11.6 The Lagrange problem

A different type of problem, originally formulated by Lagrange (1736–1813) and since
associated with his name, consists of finding stationary paths of the functional
\[
S[\mathbf{y}] = \int_a^b dx\, F(x, \mathbf{y}, \mathbf{y}'), \tag{11.46}
\]
where $\mathbf{y}(x) = (y_1, y_2, \ldots, y_n)$ is an n-dimensional vector function, with constraints
defined by the m < n functions
\[
C_j(x, \mathbf{y}, \mathbf{y}') = 0, \quad j = 1, 2, \ldots, m < n, \tag{11.47}
\]
and such that certain boundary conditions are satisfied.

There are a number of complications and variants to this type of problem, which is
one reason that boundary conditions were not specified, but is also why this introductory
treatment is not assessed.
There are two different types of constraints to consider. The simplest type depends
upon x and y, but not the derivatives y0 . Such constraints play an important role
in dynamics and are known as holonomic constraints. Constraints that depend upon
y0 , and cannot be reduced to a form independent of y0 , are known as non-holonomic
constraints: both types of constraints are sometimes named finite subsidiary conditions,
or side-conditions. We consider holonomic constraints first.
The simplest method of dealing with holonomic constraints is to use a coordinate
system that automatically satisfies the constraint. If possible this is usually the most
convenient method and has the advantage that each constraint reduces the number
of dependent variables by unity. For example if there are three variables with the
constraint $C(y_1, y_2, y_3) = y_1^2 + y_2^2 + y_3^2 - r^2 = 0$, that forces the admissible paths to lie
on a sphere, it is usually better to use the two spherical polar angles $(\theta, \phi)$, where
\[
y_1 = r \sin\theta \cos\phi, \quad y_2 = r \sin\theta \sin\phi, \quad y_3 = r \cos\theta.
\]
The method described here is an alternative, and a specific example is considered in
section 11.6.2.
We assume that the m constraint equations $C_j(x, \mathbf{y}) = 0$ are sufficiently well behaved
that along the stationary path they can be used to express m of the dependent variables
in terms of the remaining n − m variables, which means that boundary conditions for at
most n − m variables need be specified. We shall assume that all holonomic constraints
are consistent with the boundary conditions. In the following proof we assume that
there is just one constraint, $C(x, \mathbf{y}) = 0$.
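Suppose that $\mathbf{y}(x)$ is the stationary path with the boundary conditions $\mathbf{y}(a) = \mathbf{A}$,
$\mathbf{y}(b) = \mathbf{B}$. If $\mathbf{y} + \mathbf{h}$ is a neighbouring admissible path that also satisfies the constraint,
so $\mathbf{h}(a) = \mathbf{h}(b) = 0$, and for each j, $h_j(x)$ is in $D_1(a, b)$, then the Gateaux differential
is
\[
\Delta S[\mathbf{y}, \mathbf{h}] = \int_a^b dx \sum_{k=1}^{n} \left( \frac{\partial F}{\partial y_k} - \frac{d}{dx} \frac{\partial F}{\partial y_k'} \right) h_k(x). \tag{11.48}
\]
But also $C(x, \mathbf{y}) = C(x, \mathbf{y} + \mathbf{h})$ for all $a \le x \le b$, and hence, to first order in $\mathbf{h}$,
\[
\sum_{k=1}^{n} \frac{\partial C}{\partial y_k} h_k(x) = 0, \tag{11.49}
\]
which shows that the variations, $h_k(x)$, are not independent.
Now integrate this expression over the range of x and choose the $h_k$ to be functions
peaked about $x = \xi$, see equation 11.8 (page 417), so that for any sufficiently
differentiable function f(x)
\[
\int_a^b dx\, f(x) h_k(x) = \epsilon_k f(\xi) + O(\epsilon_k^3)
\]
as in equation 11.9 (page 417). Thus equations 11.48 and 11.49 become
\[
\sum_{k=1}^{n} \frac{\partial C}{\partial y_k} \epsilon_k = O(\epsilon^3) \quad \text{and} \quad
\sum_{k=1}^{n} \left( \frac{\partial F}{\partial y_k} - \frac{d}{dx} \frac{\partial F}{\partial y_k'} \right) \epsilon_k = O(\epsilon^3),
\]
all functions being evaluated at $x = \xi$. Introduce a Lagrange multiplier, $\lambda(\xi)$, which is
a function of $\xi$, and subtract these equations to obtain
\[
\sum_{k=1}^{n} \left[ \frac{\partial F}{\partial y_k} - \frac{d}{dx} \frac{\partial F}{\partial y_k'} - \lambda(\xi) \frac{\partial C}{\partial y_k} \right]_{x=\xi} \epsilon_k = 0. \tag{11.50}
\]
Now choose $\lambda(\xi)$ so that the coefficient of $\epsilon_n$ is zero. We have the freedom to choose
the remaining n − 1 coefficients $\epsilon_k$, $k = 1, 2, \ldots, n - 1$, independently hence, using the
same argument as in section 10.2, we obtain the n Euler-Lagrange equations
\[
\frac{d}{dx}\left( \frac{\partial F}{\partial y_k'} \right) - \frac{\partial F}{\partial y_k} + \lambda(x) \frac{\partial C}{\partial y_k} = 0, \quad
\mathbf{y}(a) = \mathbf{A}, \quad \mathbf{y}(b) = \mathbf{B}, \quad k = 1, 2, \ldots, n. \tag{11.51}
\]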
The derivation of this result assumed that there is a single holonomic constraint $C(x, \mathbf{y})$.
This is not necessary; the addition of another holonomic constraint adds another Lagrange
multiplier and in equation 11.51 the term
\[
\lambda(x) \frac{\partial C}{\partial y_k} \quad \text{is replaced by} \quad
\lambda_1(x) \frac{\partial C_1}{\partial y_k} + \lambda_2(x) \frac{\partial C_2}{\partial y_k}.
\]
A common type of problem involving a single holonomic constraint is described in
section 11.6.2.
11.6.1 A single non-holonomic constraint

In order to be specific consider a single non-holonomic constraint, that is m = 1, and
n = 2 in equations 11.46 and 11.47. Assume first that the boundary conditions,
y1 (a) = A1 , y2 (a) = A2 , y1 (b) = B1 , y2 (b) = B2 ,
are prescribed. But the constraint $C(x, y_1, y_2, y_1', y_2') = 0$ can, provided $C_{y_2'} \neq 0$, be
inverted to express $y_2'$ as a function of all the other variables. If we assume that $y_1$ is
known, as it would be if the stationary paths had been found, then the constraint gives
another first-order differential equation for y2 : integration gives one arbitrary constant,
which may be chosen to satisfy the boundary condition y2 (a) = A2 , but there is no
guarantee that the other boundary condition, y2 (b) = B2 , will be satisfied. In these
circumstances it is usually necessary to impose fewer boundary conditions and rely on
natural boundary conditions to supply the rest. Because there are many combinations
of imposed and natural boundary conditions we provide a flavour of the theory by
quoting a theorem valid for the restricted set of imposed conditions,
y1 (a) = A1 , y2 (a) = A2 , y1 (b) = B1 ,
and a natural boundary condition on y2 at x = b.
Theorem 11.3
Given the functional
\[
S[\mathbf{y}] = \int_a^b dx\, F(x, y_1, y_2, y_1', y_2'), \quad y_1(a) = A_1, \quad y_2(a) = A_2, \quad y_1(b) = B_1, \tag{11.52}
\]
with the single constraint
\[
C(x, y_1, y_2, y_1', y_2') = 0 \quad \text{where} \quad C_{y_2'} \neq 0, \quad a \le x \le b, \tag{11.53}
\]
then if $y_1(x)$ and $y_2(x)$ are twice continuously differentiable stationary paths of this
system, there exists a Lagrange multiplier, $\lambda(x)$, such that
\[
\bar{S}[\mathbf{y}] = \int_a^b dx\, \bar{F}(x, y_1, y_2, y_1', y_2'), \quad y_1(a) = A_1, \quad y_2(a) = A_2, \quad y_1(b) = B_1, \tag{11.54}
\]
where $\bar{F} = F - \lambda(x) C$, is stationary on this path and satisfies the natural boundary
condition
\[
\bar{F}_{y_2'} \Big|_{x=b} = \bigl( F_{y_2'} - \lambda C_{y_2'} \bigr) \Big|_{x=b} = 0. \tag{11.55}
\]
The solution of the associated Euler-Lagrange equation will depend upon $\lambda(x)$, which
is determined by substituting the solution into the constraint equation 11.53.
11.6.2 An example with a single holonomic constraint

A simple problem with a single holonomic constraint involves finding geodesics on the
surface of a right circular cylinder. Consider such a surface in Oxyz, with equation
$x^2 + y^2 = a^2$: we require the geodesics on this surface through points with coordinates
(a, 0, 0) and $(a \cos\alpha, a \sin\alpha, b)$. Let the paths be parameterised by a variable $0 \le t \le 1$,
so the distance along a path is
\[
S[x, y, z] = \int_0^1 dt \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2} \quad \text{with the constraint} \quad x(t)^2 + y(t)^2 = a^2. \tag{11.56}
\]
Using the Euler-Lagrange equations 11.51 we see that
\[
\frac{d}{dt}\left( \frac{\dot{x}}{\sigma} \right) + 2\lambda x = 0, \quad \sigma^2 = \dot{x}^2 + \dot{y}^2 + \dot{z}^2,
\]
\[
\frac{d}{dt}\left( \frac{\dot{y}}{\sigma} \right) + 2\lambda y = 0 \quad \text{and} \quad \frac{d}{dt}\left( \frac{\dot{z}}{\sigma} \right) = 0.
\]
It is now helpful to use s, the arc length along the curve, for the independent variable,
so that $\dot{s}^2 = \sigma^2$. First we note that
\[
\frac{d}{dt}\left( \frac{\dot{x}}{\sigma} \right) = \dot{s} \frac{d}{ds}\left( \frac{1}{\dot{s}} \dot{s} \frac{dx}{ds} \right) = \dot{s} \frac{d^2 x}{ds^2}.
\]
Since t is an arbitrary parameter we may put t = s to reduce the three Euler-Lagrange
equations (since now $\dot{s} = 1$) to
\[
x''(s) + 2\lambda(s) x(s) = 0, \quad y''(s) + 2\lambda(s) y(s) = 0, \quad z''(s) = 0. \tag{11.57}
\]
We now show that the Lagrange multiplier is a constant, which makes the integration
of these equations easy. Differentiating the constraint twice with respect to s gives
\[
x x'' + y y'' + x'^2 + y'^2 = 0.
\]
But, from the definition of s we have $x'^2 + y'^2 + z'^2 = 1$ and together with equations 11.57
we obtain $-2\lambda a^2 + 1 - z'^2 = 0$, and hence $\lambda$ = constant (since $z'$ is a constant). The
solutions of equations 11.57 that fit the initial conditions are, with $\omega^2 = 2\lambda$,
\[
x = a \cos \omega s, \quad y = a \sin \omega s, \quad z = \gamma s,
\]
for some constant $\gamma$. If the length of the curve is S then $\omega S = \alpha + 2n\pi$, for some integer
n, and $\gamma S = b$. Defining a new variable $\phi = \omega s$ we obtain a parametric representation
of a geodesic,
\[
x = a \cos\phi, \quad y = a \sin\phi, \quad z = \frac{b\phi}{2n\pi + \alpha}, \quad 0 \le \phi \le 2n\pi + \alpha. \tag{11.58}
\]
For this example it is far easier to use cylindrical polar coordinates, see exercise 2.20
(page 104), which automatically satisfy the constraint.
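The family of geodesics in equation 11.58 can be explored numerically. The sketch below samples the helix for several winding numbers n ≥ 0 and compares a polyline estimate of its length with $\sqrt{a^2(2n\pi + \alpha)^2 + b^2}$, the length obtained by unrolling the cylinder into a plane. The symbols $\alpha$ and $\phi$, the function names and the sample values are assumptions made for illustration.

```python
import math

def helix_point(phi, a, alpha, b, n):
    """Point on the geodesic (11.58) at angle phi, for winding number n."""
    total = 2.0 * math.pi * n + alpha
    return (a * math.cos(phi), a * math.sin(phi), b * phi / total)

def helix_length(a, alpha, b, n, steps=20000):
    """Polyline estimate of the length of the geodesic from phi = 0 to 2*n*pi + alpha."""
    total = 2.0 * math.pi * n + alpha
    prev = helix_point(0.0, a, alpha, b, n)
    length = 0.0
    for i in range(1, steps + 1):
        cur = helix_point(total * i / steps, a, alpha, b, n)
        length += math.dist(prev, cur)
        prev = cur
    return length

a, alpha, b = 1.0, math.pi / 2.0, 1.0
lengths = {n: helix_length(a, alpha, b, n) for n in (0, 1, 2)}
unrolled = {n: math.hypot(a * (2.0 * math.pi * n + alpha), b) for n in (0, 1, 2)}
end_point = helix_point(alpha, a, alpha, b, 0)   # should be (a cos alpha, a sin alpha, b)
```

Each n gives a stationary path between the same two points; with these values the shortest is the one with n = 0.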
Exercise 11.18
Consider the functional
\[
S[y, z] = \int_a^b dx \left( y'^2 + z'^2 - y^2 \right), \quad y(a) = A_1, \quad z(a) = A_2, \quad y(b) = B_1,
\]
with the constraint $C(z, y') = z - y' = 0$ and with a natural boundary condition
for z(b).
(a) Show that the Euler-Lagrange equations can be written in the form
\[
\frac{d^4 y}{dx^4} - \frac{d^2 y}{dx^2} - y = 0, \quad y(a) = A_1, \quad y'(a) = A_2, \quad y(b) = B_1, \quad y''(b) = 0,
\]
with $z = y'$ and $\lambda = -2y'''$.
(b) Show that this equation for y(x) can be derived from the associated functional
of the single dependent variable
\[
J[y] = \int_a^b dx \left( y''^2 + y'^2 - y^2 \right), \quad y(a) = A_1, \quad y'(a) = A_2, \quad y(b) = B_1,
\]
and with a natural boundary condition for $y''(b)$.
11.7 Brachistochrone in a resisting medium

The modification of the brachistochrone problem to include a resistance is of historical
importance and was first successfully treated by Euler (1707–1783) in chapter 3 of
his 1744 volume The method of Finding Plane Curves that Show Some Property of
Maximum or Minimum . . . . Indeed it was Euler who first considered the problems
described in chapters 9 and 11. The problem considered here is difficult, requiring
many of the techniques and ideas developed earlier in the course, and is therefore good
revision even though this section is not assessed. Euler, on the other hand, developed
these techniques in order to solve this type of problem.
The analysis that follows is difficult and follows that outlined by Pars [1] (1965, chapter 8).
You may find it hard to understand why certain steps are taken but, as usual
with any complicated problem, there is often no simple explanation and what is written
down is the result of trial and many errors: the blind alleys cannot be shown.
There are a variety of types of resistance that can be considered and here we follow
Euler by assuming that the resistance depends only upon the speed, v, of the particle.
This is a more difficult problem than that dealt with in chapter 4 because now energy
is not conserved, which means that there is not a simple relation between the speed
and the height of the particle, as in equation 4.2 (page 165). Instead we need to use
Newton's equation of motion, which here takes on the role of a constraint. First we
need to derive this equation in an appropriate form.
[1] L. A. Pars, An Introduction to the Calculus of Variations (Heinemann).
Newton's equation of motion

For a particle of mass, m, sliding along a smooth, rigid wire a natural variable for the
description of its position is the distance, s, measured along the wire, from the starting
point. The Cartesian coordinates of the initial point are taken to be (x, y) = (0, A),
and here s = 0; we take the y-axis to be vertically upwards. There are two forces acting
on the particle, the downward force of gravity and the resistance, that depends upon
the speed, $v = \dot{s}$, and acts tangentially (because the wire is smooth) so as to slow the
motion.
Figure 11.5 Diagram showing the forces acting on the particle, assuming that the
distance AP is increasing with time. The line PN is the tangent to the curve at
the instantaneous position P, and makes an angle $\psi$ with the downward vertical.
For a particle at P, consider the tangent PN, figure 11.5, to the curve which makes an
angle $\psi$ with the downward vertical, and let s be the distance along the curve from the
starting point, increasing with x. The component of the vertical force of gravity along
the tangent, PN, in the direction of increasing s, is $mg \cos\psi$.
If the magnitude of the resistance per unit mass is R(v), where R(v) is a positive
function such that [2] R(0) = 0, then by resolving forces along the tangent at P, Newton's
equation becomes
\[
m \frac{d^2 s}{dt^2} = -mR(v) + mg \cos\psi. \tag{11.59}
\]
The chain rule gives
\[
\frac{d^2 s}{dt^2} = \frac{dv}{dt} = \frac{dv}{ds} \frac{ds}{dt} = v \frac{dv}{ds}
\]
and since $\delta y = -\delta s \cos\psi$ the equation of motion can be written as the first-order
equation
\[
v \frac{dv}{ds} = -R(v) - g \frac{dy}{ds}. \tag{11.60}
\]
We consider only cases where initially the particle is either stationary or moving downwards
with a speed such that R(v) is small compared with the gravitational force, g
per unit mass. Thus, v(s) is initially increasing. Subsequently there are two possible
types of motion:
A: v(s) steadily increases until the terminal point is reached, or;
B: v(s) increases to a maximum value at which $v'(s) = 0$, so here $g y'(s) = -R(v) < 0$,
after which v(s) decreases to its value at the terminal point.
[2] A typical approximation is to assume that R is proportional to $v^2$, see section 2.5.3, but this is
poor for low speeds, when R is proportional to v, and fails near the speed of sound.
We assume that the actual motion is either type A or type B; it will be seen that the
distinction between these two types of motion is important.
Exercise 11.19
If the wire is vertical, so that $y = A - s$, and the particle starts from rest at s = 0, and
$R(v) = \kappa v^2$, for some constant $\kappa$, show that the equation of motion 11.60 becomes
\[ v\frac{dv}{ds} = -\kappa v^2 + g \]
and hence show that $v^2 = \dfrac{g}{\kappa}\left(1 - e^{-2\kappa s}\right)$, for a particle starting at rest where s = 0.
Note that as $s \to \infty$, $v \to \sqrt{g/\kappa}$ and approaches this limiting or terminal speed
monotonically.
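The closed form quoted in exercise 11.19 can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the quadratic law $R(v) = \kappa v^2$, the function names are our own, and the values of g and $\kappa$ are purely illustrative. It integrates the equation for $w = v^2$, namely $dw/ds = 2(g - \kappa w)$, which avoids dividing by v at the starting point.

```python
import math


def speed_squared_exact(s, g, kappa):
    """Closed form v^2 = (g/kappa) * (1 - exp(-2*kappa*s)) from exercise 11.19."""
    return (g / kappa) * (1.0 - math.exp(-2.0 * kappa * s))


def speed_squared_rk4(s_end, g, kappa, steps=2000):
    """Integrate dw/ds = 2*(g - kappa*w) for w = v^2, w(0) = 0, by classical RK4."""
    def rhs(w):
        return 2.0 * (g - kappa * w)

    h = s_end / steps
    w = 0.0
    for _ in range(steps):
        k1 = rhs(w)
        k2 = rhs(w + 0.5 * h * k1)
        k3 = rhs(w + 0.5 * h * k2)
        k4 = rhs(w + h * k3)
        w += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return w


if __name__ == "__main__":
    g, kappa = 9.81, 0.5  # illustrative values only
    for s in (0.5, 2.0, 10.0):
        print(s, speed_squared_rk4(s, g, kappa), speed_squared_exact(s, g, kappa))
    print("terminal speed sqrt(g/kappa) =", math.sqrt(g / kappa))
```

The two columns printed should agree closely, and for large s both approach $g/\kappa$, the square of the terminal speed.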
The functional and boundary conditions

Now consider the integral for the time taken to travel between two given points (0, A)
and (b, 0), along a curve parameterised by $\tau \in [0, 1]$. The time of passage, T, is given
by
\[ T = \int_0^T dt = \int_0^1 \frac{dt}{d\tau}\,d\tau. \tag{11.61} \]
If the coordinates of points on the curve are $(x(\tau), y(\tau))$, by definition,
\[ v = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2} = \sqrt{x'(\tau)^2 + y'(\tau)^2}\,\frac{d\tau}{dt}, \]
and hence the functional for the three variables $x(\tau)$, $y(\tau)$ and $v(\tau)$ is
\[ T[x, y, v] = \int_0^1 d\tau\,\frac{\sqrt{x'(\tau)^2 + y'(\tau)^2}}{v}, \tag{11.62} \]
where a prime denotes differentiation with respect to $\tau$. Now express the equation of
motion in terms of the independent variable $\tau$, rather than s. Since
\[ \frac{ds}{d\tau} = \sqrt{x'(\tau)^2 + y'(\tau)^2} \tag{11.63} \]
equation 11.60 becomes, on using the chain rule, $v\dfrac{dv}{d\tau}\dfrac{d\tau}{ds} = -R(v) - gy'\dfrac{d\tau}{ds}$, that is,
\[ vv' + R(v)\sqrt{x'(\tau)^2 + y'(\tau)^2} + gy' = 0. \]
This constraint is satisfied by the three variables, so the auxiliary functional is
\[ \overline{T}[x, y, v] = \int_0^1 d\tau\,F(x', y', v, v'), \tag{11.64} \]
where
\[ F = H(\lambda, v)\sqrt{x'(\tau)^2 + y'(\tau)^2} - \lambda vv' - \lambda gy' \quad\text{with}\quad H(\lambda, v) = \frac{1}{v} - \lambda(\tau)R(v) \tag{11.65} \]
and where $\lambda(\tau)$ is the Lagrange multiplier, which depends upon the independent variable,
here $\tau$.
There are five known boundary conditions. The initial values of (x, y, v) are assumed
known, and given by $(0, A, v_0)$, and the final values of (x, y) are given by (b, 0). The
final value of v is not known, because this depends upon the path taken. For this we
use the natural boundary condition, equation 11.55 (page 433), at $\tau = 1$,
\[ \frac{\partial F}{\partial v'} = -\lambda(1)v(1) = 0. \tag{11.66} \]
Assuming that $v(1) \neq 0$, this gives $\lambda(1) = 0$. In exercise 11.20 it is shown that $\lambda(\tau) < 0$
for $0 \le \tau < 1$, and hence that $H(\lambda, v) > 0$.
Exercise 11.20
(a) Show that $\lambda'(\tau) > 0$ at $\tau = 1$.
(b) If $\lambda(\tau_1) = 0$ for $0 < \tau_1 < 1$, show that $\lambda'(\tau_1) > 0$. Deduce that $\lambda(\tau) < 0$ for
$0 \le \tau < 1$, and that $H(\lambda, v) > 0$ for $0 \le \tau \le 1$.
Hint: for part (a) you will need the Euler-Lagrange equation for $\lambda(\tau)$, given in
equation 11.71.
The Euler-Lagrange equations and their solution

The Euler-Lagrange equations for x and y are particularly simple because F does not
depend upon either x or y. Thus the equation for x is
\[ \frac{d}{d\tau}\left(\frac{x'H}{\sqrt{x'(\tau)^2 + y'(\tau)^2}}\right) = 0 \quad\text{that is}\quad \frac{x'H}{\sqrt{x'(\tau)^2 + y'(\tau)^2}} = \alpha, \tag{11.67} \]
where $\alpha$ is a constant. Because we expect $x(\tau)$ to be an increasing function of $\tau$, $\alpha$
must be a positive constant.
The Euler-Lagrange equation for y is
\[ \frac{d}{d\tau}\left(\frac{y'H}{\sqrt{x'(\tau)^2 + y'(\tau)^2}} - g\lambda\right) = 0 \quad\text{that is}\quad \frac{y'H}{\sqrt{x'(\tau)^2 + y'(\tau)^2}} = g\lambda - \beta, \tag{11.68} \]
where $\beta$ is another constant. Since $\lambda(1) = 0$ it follows that for type A motion, in which
$y'(\tau) < 0$ for all $\tau$, $\beta > 0$; for type B motion, during which $y'(\tau)$ changes sign, we must
have $\beta < 0$. The values of the constants $\alpha$ and $\beta$ are determined from the terminal
conditions, using expressions derived at the end of this calculation.
It is now helpful to use s as the independent variable because, using equation 11.63,
\[ \frac{1}{\sqrt{x'(\tau)^2 + y'(\tau)^2}}\frac{dx}{d\tau} = \frac{dx}{ds} \quad\text{and}\quad \frac{1}{\sqrt{x'(\tau)^2 + y'(\tau)^2}}\frac{dy}{d\tau} = \frac{dy}{ds} \]
and hence equations 11.67 and 11.68 have the simpler form
\[ H(s, v)\frac{dx}{ds} = \alpha, \tag{11.69} \]
\[ H(s, v)\frac{dy}{ds} = g\lambda - \beta, \qquad H = \frac{1}{v} - \lambda(s)R(v). \tag{11.70} \]
The third Euler-Lagrange equation, for v, is
\[ -\frac{d}{d\tau}(\lambda v) - \sqrt{x'(\tau)^2 + y'(\tau)^2}\,H_v + \lambda v' = 0, \qquad H_v = \frac{\partial H}{\partial v}, \]
and this simplifies to
\[ v\frac{d\lambda}{d\tau} + \sqrt{x'(\tau)^2 + y'(\tau)^2}\,H_v = 0. \tag{11.71} \]
Again using s for the independent variable gives the simpler equation
\[ v\frac{d\lambda}{ds} = -H_v. \tag{11.72} \]
Equations 11.69, 11.70 and 11.72 are the three Euler-Lagrange equations that we need
to solve. The remaining analysis is difficult partly because we change variables several
times and partly because it is necessary to keep in mind the expected behaviour of the
solution: in particular the two types of motion described before exercise 11.19 need to
be treated slightly differently.
Since $x'(s)^2 + y'(s)^2 = 1$, squaring and adding equations 11.69 and 11.70 gives
\[ H^2 = \alpha^2 + (g\lambda - \beta)^2 \quad\text{that is}\quad \left(\frac{1}{v} - \lambda R\right)^2 = \alpha^2 + (g\lambda - \beta)^2 \tag{11.73} \]
where we have used the definition 11.65. This is a quadratic equation for $\lambda$ and hence
can be used to express $\lambda$ as a function of v.
Before solving this equation consider its value at the terminal point, $\tau = 1$, where
$\lambda(1) = 0$. If the speed at the terminus is $V_t$, this equation gives
\[ V_t^2 = \frac{1}{\alpha^2 + \beta^2}, \]
a result needed later.

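It is helpful to concentrate first on the type A motion, in which the speed steadily
increases. Then $\beta > 0$, the maximum speed is at the terminus, $\max(v) = V_t$, and
consequently during the motion $v^{-2} > \alpha^2 + \beta^2$. The quadratic equation 11.73 can be written
in the form
\[ \lambda^2\left(g^2 - R^2\right) - 2\lambda\left(g\beta - \frac{R}{v}\right) - \left(\frac{1}{v^2} - \alpha^2 - \beta^2\right) = 0. \tag{11.74} \]
In general air resistance is relatively small, so we assume that g > R for the range of
speeds considered. Then this quadratic equation has the solutions
\[ \left(g^2 - R^2\right)\lambda = \left(g\beta - \frac{R}{v}\right) \pm \sqrt{\left(g\beta - \frac{R}{v}\right)^2 + \left(g^2 - R^2\right)\left(\frac{1}{v^2} - \alpha^2 - \beta^2\right)}. \tag{11.75} \]
Since $\lambda = 0$ at the terminus the correct solution is given by the negative sign, and this
is conveniently written in the form
\[ \left(g^2 - R^2\right)\lambda - \left(g\beta - \frac{R}{v}\right) = -\frac{f(v)}{\sqrt{\alpha^2 + \beta^2}} \tag{11.76} \]
where f(v) is the positive function defined by
\[ f(v)^2 = \left((\alpha^2 + \beta^2)R - \frac{g\beta}{v}\right)^2 + g^2\alpha^2\left(\frac{1}{v^2} - \alpha^2 - \beta^2\right). \]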
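As a hedged numerical illustration of equations 11.73 and 11.76 (not part of the original text), the sketch below evaluates $\lambda(v)$ from the negative-sign root and checks that it satisfies the quadratic constraint and vanishes at the terminal speed. The function names, the sample constants and the quadratic resistance law used in the demonstration are assumptions made only for this example.

```python
import math


def f_squared(v, alpha, beta, g, R):
    """f(v)^2 as defined after equation 11.76; R is a callable returning R(v)."""
    q = alpha**2 + beta**2
    return (q * R(v) - g * beta / v) ** 2 + (g * alpha) ** 2 * (1.0 / v**2 - q)


def multiplier(v, alpha, beta, g, R):
    """Lagrange multiplier lambda(v) from the negative-sign root, equation 11.76."""
    q = alpha**2 + beta**2
    f = math.sqrt(f_squared(v, alpha, beta, g, R))
    return (g * beta - R(v) / v - f / math.sqrt(q)) / (g**2 - R(v) ** 2)


def constraint_residual(v, alpha, beta, g, R):
    """H^2 - alpha^2 - (g*lambda - beta)^2, which equation 11.73 requires to vanish."""
    lam = multiplier(v, alpha, beta, g, R)
    H = 1.0 / v - lam * R(v)
    return H**2 - alpha**2 - (g * lam - beta) ** 2


if __name__ == "__main__":
    alpha, beta, g = 0.5, 0.3, 9.81          # illustrative constants only
    R = lambda v: 0.01 * v**2                # assumed resistance law
    V_t = 1.0 / math.sqrt(alpha**2 + beta**2)
    print("lambda(1.0)  =", multiplier(1.0, alpha, beta, g, R))
    print("residual     =", constraint_residual(1.0, alpha, beta, g, R))
    print("lambda(V_t)  =", multiplier(V_t, alpha, beta, g, R))
```

With these sample values the multiplier is negative for $v < V_t$ and is zero, to rounding error, at $v = V_t$, in line with the boundary condition $\lambda(1) = 0$.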
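The first two Euler-Lagrange equations, 11.69 and 11.70, are simplified if divided by
the third, equation 11.72, to give
\[ \frac{dx}{d\lambda} = -\frac{\alpha v}{HH_v} \quad\text{and}\quad \frac{dy}{d\lambda} = -\frac{(g\lambda - \beta)v}{HH_v}. \tag{11.77} \]
These equations can be used to express (x, y) as integrals over known functions of the
speed v. First we need to express $HH_v$ in terms of known quantities: differentiate H
with respect to v,
\[ \frac{dH}{dv} = H_v + H_\lambda\frac{d\lambda}{dv} = H_v - R(v)\frac{d\lambda}{dv}, \]
where we have used equation 11.65 for H. Similarly, from equation 11.73
\[ H\frac{dH}{dv} = g(g\lambda - \beta)\frac{d\lambda}{dv}, \]
and on combining these two results
\[ H\left(H_v - R\frac{d\lambda}{dv}\right) = g(g\lambda - \beta)\frac{d\lambda}{dv} \quad\text{that is}\quad HH_v = \left[\left(g^2 - R^2\right)\lambda - \left(g\beta - \frac{R}{v}\right)\right]\frac{d\lambda}{dv}. \tag{11.78} \]
Observe that the right-hand side of this equation is proportional to the left-hand side
of equation 11.76 for $\lambda$, and hence
\[ HH_v\frac{dv}{d\lambda} = -\frac{f(v)}{\sqrt{\alpha^2 + \beta^2}}. \tag{11.79} \]
But, using the chain rule, equations 11.77 can be written in the form
\[ \frac{dx}{dv} = -\frac{\alpha v}{HH_v}\frac{d\lambda}{dv} \quad\text{and}\quad \frac{dy}{dv} = -\frac{(g\lambda - \beta)v}{HH_v}\frac{d\lambda}{dv} \]
so that using equation 11.79 these can be written as a pair of uncoupled first-order
differential equations,
\[ \frac{dx}{dv} = \alpha\sqrt{\alpha^2 + \beta^2}\,\frac{v}{f(v)} \quad\text{and}\quad \frac{dy}{dv} = -\sqrt{\alpha^2 + \beta^2}\,\frac{(\beta - g\lambda)v}{f(v)}. \tag{11.80} \]
Notice that $x'(v) > 0$ and $y'(v) < 0$, since $\lambda \le 0$ and $\beta > 0$ (for type A motion).
The right-hand sides of these equations are functions of v, with $\lambda(v)$ being given
by equation 11.76. Integration, and taking account of the initial conditions, gives the
equation of the curve in the form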
\[ x(v) = \alpha\sqrt{\alpha^2 + \beta^2}\int_{v_0}^{v} dv\,\frac{v}{f(v)}, \tag{11.81} \]
\[ y(v) = A - \sqrt{\alpha^2 + \beta^2}\int_{v_0}^{v} dv\,\frac{(\beta - g\lambda)v}{f(v)}. \tag{11.82} \]
Using equation 11.76 for $\lambda$ we obtain
\[ \beta - g\lambda = \frac{1}{g^2 - R^2}\left(\frac{gR}{v} - \beta R^2\right) + \frac{g}{g^2 - R^2}\,\frac{f(v)}{\sqrt{\alpha^2 + \beta^2}} \]
so that the equation for y(v) becomes
\[ y(v) = A - g\int_{v_0}^{v} dv\,\frac{v}{g^2 - R^2} - \sqrt{\alpha^2 + \beta^2}\int_{v_0}^{v} dv\,\frac{gR - \beta vR^2}{f(v)\left(g^2 - R^2\right)}. \tag{11.83} \]
These expressions depend upon the unknown constants $\alpha$ and $\beta$, which are obtained
using information about the terminal point, at which
\[ v = V_t = \frac{1}{\sqrt{\alpha^2 + \beta^2}}, \qquad x(V_t) = b \quad\text{and}\quad y(V_t) = 0. \]
Thus the end conditions give two equations for the unknown constants $\alpha$ and $\beta$ in terms
of the given parameters b and A. These equations are, however, nonlinear and so are difficult
to solve: this difficulty is compounded by the fact that the relations can usually be
determined only by numerically evaluating the integrals. Physical considerations suggest,
however, that for any pair of values (b, A) a solution exists.
Equations 11.81 and 11.82 define the stationary path parametrically, with the speed
v as the parameter. They are therefore directly equivalent to equations 4.8 (page 167),
in which the parameter is the angle $\theta$. Further, in the limit R(v) = 0 these equations
should reduce to those found previously: it is important that we establish that this is
true in order to check the derivation.
Exercise 11.21
In this exercise the limit R = 0 is considered and it is shown that equations 11.81
and 11.83 reduce to the conventional parametric equations of the cycloid.
(a) Show that in the limit R = 0 equation 11.83 for y(v) reduces to the energy
equation
\[ mg(A - y) = \frac{1}{2}mv^2 - \frac{1}{2}mv_0^2. \]
(b) Show that if R = 0,
\[ \frac{\sqrt{\alpha^2 + \beta^2}}{f(v)} = \frac{v}{g\sqrt{1 - \alpha^2v^2}} \]
and hence that equation 11.81 for x(v) becomes
\[ x(v) = \frac{\alpha}{g}\int_{v_0}^{v} dv\,\frac{v^2}{\sqrt{1 - \alpha^2v^2}}. \]
(c) Using the substitution $\alpha v = \sin\theta$ and setting $v_0 = 0$, show that the equations
found in parts (a) and (b) become
\[ x = \frac{c^2}{2}(2\theta - \sin 2\theta), \qquad y = A - c^2\sin^2\theta, \qquad c^2 = \frac{1}{2\alpha^2 g}, \qquad 0 \le \theta \le \theta_b. \]
(d) Show also that $g\lambda = \beta - \alpha/\tan\theta$, and hence that $\beta = \alpha/\tan\theta_b$. Deduce that
$\beta > 0$ if $0 \le \theta_b < \pi/2$ and $\beta < 0$ if $\pi/2 < \theta_b < \pi$; explain the significance of the
condition $\theta_b = \pi/2$.
In general equations 11.81 and 11.82 can be dealt with only using numerical methods,
and this is not easy because it is necessary to solve two coupled nonlinear equations,
which can be evaluated only by numerical integration. However, if the resistance is
relatively small we expect the stationary path to be close to the cycloid of the
resistance-free motion, which suggests making an expansion in powers of R(v). Such
an analysis also helps check numerical solutions.
In order to facilitate this expansion it is helpful to replace R(v) by $\epsilon R(v)$, where $\epsilon$
is a small, positive, dimensionless quantity, and to use $\epsilon$ to keep track of the expansion.
An approximation to f(v), to order $\epsilon$, can be written
\[ \frac{\sqrt{\alpha^2 + \beta^2}}{f(v)} = \frac{v}{g\sqrt{1 - \alpha^2v^2 - \dfrac{2\epsilon\beta}{g}vR(v)}} + O(\epsilon^2), \]
so that equation 11.81 for x(v) becomes
\[ x(v) = \frac{\alpha}{g}\int_{v_0}^{v} dv\,\frac{v^2}{\sqrt{1 - \alpha^2v^2 - \dfrac{2\epsilon\beta}{g}vR(v)}}
 = \frac{\alpha}{g}\int_{v_0}^{v} dv\,\frac{v^2}{\sqrt{1 - \alpha^2v^2}}\left[1 + \frac{\epsilon\beta\,vR(v)}{g\left(1 - \alpha^2v^2\right)} + O(\epsilon^2)\right] \]
and equation 11.83 for y(v) becomes, to this order,
\[ y(v) = A - \frac{1}{2g}\left(v^2 - v_0^2\right) - \frac{\epsilon}{g^2}\int_{v_0}^{v} dv\,\frac{vR(v)}{\sqrt{1 - \alpha^2v^2}}. \]
We now set $v_0 = 0$ and use the same substitution as used in exercise 11.21, $\alpha v = \sin\theta$,
to write these relations in the form
\[ x(\theta) = \frac{1}{4\alpha^2 g}(2\theta - \sin 2\theta) + \frac{\epsilon\beta}{\alpha^3g^2}\int_0^{\theta} d\theta\,\frac{\sin^3\theta}{\cos^2\theta}\,R(v), \tag{11.84} \]
\[ y(\theta) = A - \frac{1}{2\alpha^2 g}\sin^2\theta - \frac{\epsilon}{\alpha^2g^2}\int_0^{\theta} d\theta\,\sin\theta\,R(v). \tag{11.85} \]
At the terminal point (x, y) = (b, 0), if $\theta = \theta_b$ we have $\alpha = \sqrt{\alpha^2 + \beta^2}\,\sin\theta_b$, which can
be rearranged to give $\beta = \alpha/\tan\theta_b$, so the two unknown parameters are now $\alpha$ and $\theta_b$.
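It is now necessary to choose a particular function for the resistance: a natural
choice is $R = \kappa v^2$, where $\kappa$ is a constant (with the dimensions of inverse length). Then
equations 11.84 and 11.85 become
\[ x(\theta) = \frac{1}{4g\alpha^2}(2\theta - \sin 2\theta) + \frac{\epsilon\kappa\beta}{g^2\alpha^5}\,G_1(\theta), \tag{11.86} \]
\[ y(\theta) = A - \frac{\sin^2\theta}{2g\alpha^2} - \frac{\epsilon\kappa}{g^2\alpha^4}\,G_2(\theta), \tag{11.87} \]
where $\alpha v = \sin\theta$ and
\[ G_1(\theta) = \int_0^{\theta} d\theta\,\frac{\sin^5\theta}{\cos^2\theta} = \frac{1}{\cos\theta} - \frac{8}{3} + \frac{7}{4}\cos\theta - \frac{1}{12}\cos 3\theta, \]
\[ G_2(\theta) = \int_0^{\theta} d\theta\,\sin^3\theta = \frac{2}{3} - \frac{3}{4}\cos\theta + \frac{1}{12}\cos 3\theta. \]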
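The closed forms quoted for $G_1$ and $G_2$ are easy to confirm by quadrature. The sketch below is an illustration only (the helper names are ours): it compares each closed form with a composite Simpson estimate of the defining integral, for an angle below $\pi/2$ where the integrand of $G_1$ is finite.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def G1_closed(theta):
    return (1.0 / math.cos(theta) - 8.0 / 3.0
            + 1.75 * math.cos(theta) - math.cos(3.0 * theta) / 12.0)


def G2_closed(theta):
    return 2.0 / 3.0 - 0.75 * math.cos(theta) + math.cos(3.0 * theta) / 12.0


def G1_numeric(theta):
    return simpson(lambda t: math.sin(t) ** 5 / math.cos(t) ** 2, 0.0, theta)


def G2_numeric(theta):
    return simpson(lambda t: math.sin(t) ** 3, 0.0, theta)


if __name__ == "__main__":
    for theta in (0.3, 0.8, 1.2):
        print(theta, G1_closed(theta), G1_numeric(theta),
              G2_closed(theta), G2_numeric(theta))
```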
The first task is to determine the values of and b from the terminal conditions. This
is facilitated by noting that the equation y(b ) = 0 is a quadratic in 1/(g2 ),
sin2 b
G2 (b ) + A = 0.
g 2 4 2g2
The quadratic term is proportional to so one of the roots behaves as 1 , as 0,
and since we require a root that is finite when there is no resistance, the relevant
solution is
1 4A
= . (11.88)
g2
q
sin2 b + sin4 b 16AG2 (b )
This expression defines in terms of b , but numerical calculations show that it is real
only for small .
Using the equation $\beta = \alpha/\tan\theta_b$ for $\beta$ allows us to write the equation $x(\theta_b) = b$ in
the form
\[ b = \frac{1}{4g\alpha^2}(2\theta_b - \sin 2\theta_b) + \frac{\epsilon\kappa}{g^2\alpha^4\tan\theta_b}\,G_1(\theta_b). \tag{11.89} \]
Since $g\alpha^2$ is given in terms of $\theta_b$ by equation 11.88 this is a single equation for $\theta_b$ that
can be solved numerically.
In figure 11.6 we show an example of such a solution. For the purposes of illustration
we choose g = 1 and take the end points to be (0, A), with $A = 2/(\pi - 2)$, and (b, 0),
with b = 1, so for the cycloid $\theta_b = \pi/4$. For these parameters it is necessary that
$\epsilon\kappa < 0.135$ (approximately) for $\alpha(\theta_b)$ to be real, so we take $\epsilon\kappa = 0.12$. As might be
expected the resistance forces the stationary path below that of the cycloid, on to a
path that is initially steeper.
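The statement that $\theta_b = \pi/4$ for the resistance-free cycloid with these end points can be checked directly. From exercise 11.21(c), the condition y = 0 at $\theta_b$ gives $c^2 = A/\sin^2\theta_b$, so $b/A = (2\theta_b - \sin 2\theta_b)/(2\sin^2\theta_b)$. The sketch below, with function names of our own choosing, solves this by bisection; it treats only the case without resistance.

```python
import math


def cycloid_ratio(theta):
    """b/A for the resistance-free cycloid of exercise 11.21(c) ending at theta."""
    return (2.0 * theta - math.sin(2.0 * theta)) / (2.0 * math.sin(theta) ** 2)


def terminal_angle(b, A, tol=1e-12):
    """Solve cycloid_ratio(theta_b) = b/A on (0, pi) by bisection."""
    lo, hi = 1e-6, math.pi - 1e-6
    target = b / A
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cycloid_ratio(mid) < target:
            lo = mid          # ratio increases with theta, so the root lies above
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    A, b = 2.0 / (math.pi - 2.0), 1.0   # end points used for figure 11.6
    print(terminal_angle(b, A), math.pi / 4.0)
```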
Figure 11.6 An example of a stationary path of a brachistochrone with resistance,
shown together with the cycloid, with end points given by $A = 2/(\pi - 2)$ and b = 1.
The other parameters used are defined in the text.
Now return briefly to case B, when the speed reaches a maximum value along the path,
so v(s) is not a monotonic increasing function for all s. If $v'(s) = 0$ at some intermediate
point where $s = S_m$ and $v = V_m$, then the equation of motion 11.60 shows that at this
point $gy'(S_m) = -R(V_m) < 0$, that is y(s) is still decreasing, so the maximum speed is
reached before the lowest point of the path; this is contrary to the case R = 0 where
energy conservation ensures that these points coincide. Substituting this value of $y'$
into the Euler-Lagrange equation 11.70 for $y'(s)$ gives the relation
\[ \left(g^2 - R^2\right)\lambda = g\beta - \frac{R}{v}, \]
which, on comparing with equation 11.76, gives $f(V_m) = 0$. Prior to this point the speed
is increasing to its maximum, $V_m$, and $y'(s) < 0$; subsequently v decreases steadily to the
speed at the terminus. The vertical component of the velocity changes sign when $y'(s) = 0$.
This situation is summarised in figure 11.7.
Figure 11.7 Diagram showing where $v'(s)$ and $y'(s)$ are zero: $v'(s) = 0$ where f(v) = 0,
and $y'(s) = 0$ where $g\lambda - \beta = 0$.
On the first part of the path v 0 (s) > 0 and g < 0 and we have
g f (v) gR vR2
g = , < 0.
g 2 R 2 2 + 2 v(g 2 R2 )
p
On the second part of the path g = 0 at some point and

g f (v) gR vR2
g = , < 0.
g 2 R 2 2 + 2 v(g 2 R2 )
p
We now use the limiting case R = 0, dealt with in exercise 11.21, to suggest how this
problem may be simplified. Assuming that at v = Vm , f (v)2 has a simple zero, it is
convenient to factor f (v) in the form f (v)2 = (Vm2 v 2 )f1 (v), where f1 (v) > 0 for
0 v Vm . Now define a new parameter [0, ] by v = pVm sin , so v increases for
< /2 and decreases for > /2, and then f (v) = Vm f1 (v) cos . If R = 0 then
Vm = 1/ and this is the same parameter used for the cycloid. The two expressions for
g can now both be written in the form
p
gVm cos f1 (v) gR vR2
g = 2 , v = Vm sin .
g R2 v(g 2 R2 )
p
2 + 2
In terms of equations 11.80 for x and y become
dx p v dy p v(g )
= 2 + 2 p and = 2 + 2 p .
d f1 (v) d f1 (v)
Substituting for g and integrating gives
Z Z
p v p sin cos
x() = 2 + 2 d p = Vm2 2 + 2 d , (11.90)
0 f1 (v) 0 f (v)
and

sin cos p 2 gR vR2
Z Z
y() = gVm2 d + 2 d
g 2 R2
p
0 0 (g 2 R2 ) f1 (v)
v
v cos gR vR2
Z p Z
= Ag dv 2 2
Vm 2 + 2 d . (11.91)
v0 g R 0 f (v) g 2 R2
The first-integral in this expression is the equivalent of the kinetic energy discussed in
part (a) of exercise 11.21, to which it reduces when R = 0. Further, for < /2 these
two equations for (x(), y()) are identical to equations 11.81 and 11.82, but now they
are valid for all . The two equations for and are obtained by integrating to t ,
where Vt = Vm sin t and where t > /2 if < 0 and t < /2 if > 0.
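Exercise 11.22
Consider the case where the initial speed, $v_0$, is large, so that $R(v_0) > g$, and show
that the equations for the stationary path are now
\[ \frac{dx}{dv} = -\alpha\sqrt{\alpha^2 + \beta^2}\,\frac{v}{f(v)} \quad\text{and}\quad \frac{dy}{dv} = (\beta - g\lambda)\sqrt{\alpha^2 + \beta^2}\,\frac{v}{f(v)} \]
where
\[ f(v)^2 = \left((\alpha^2 + \beta^2)R - \frac{g\beta}{v}\right)^2 - g^2\alpha^2\left(\alpha^2 + \beta^2 - \frac{1}{v^2}\right). \]
Hence show that in the limit $g \to 0$ the stationary path between the points (0, A)
and (b, 0) is the straight line $y = A(1 - x/b)$, as expected.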
11.8 Brachistochrone with Coulomb friction

In this variant of the brachistochrone problem there is friction between the wire and
the bead. Coulomb friction is proportional to the normal force between the bead and
the wire and opposes the motion. Thus the force normal to the wire affects the motion,
which is not so for a smooth wire as in the conventional brachistochrone or the problem
treated in the previous section. This means that energy is not conserved, and the
simplicity of the original problem is lost, as when the bead falls through a resisting
medium. A complete solution of this problem appears to have been described only
relatively recently by Ashby et al (1975)3 , and here we follow their analysis.
If the ratio of the horizontal to the vertical distance of the end points is large and
the initial speed is zero, the frictional forces must be small for a stationary path to
exist. As this ratio increases we expect the critical value of the friction, beyond which
there is no stationary path, to decrease: this behaviour is difficult to see in the exact
solution but is illustrated in exercises 11.23, 11.24 and 11.32.
Newton's equation of motion
The Cartesian coordinates of the end points of the wire are taken to be (x, y) = (0, A),
for the starting point, and (b, 0) for the terminus, with A > 0 and b > 0, and where
the y-axis is vertically upwards. If m is the mass of the bead, this configuration and
the forces acting on the bead are shown in figure 11.8. The gradient of the wire at the
bead is $\tan\theta = dy/dx$, where y(x) is the required curve.
3N Ashby, W E Brittin, W F Love and W Wyss, Amer J Phys 1975, 43 pages 902-6.
Figure 11.8 Diagram showing the wire and its terminal points, on the left, and the
forces acting on the bead on the right: here N is the force normal to the wire.
There are three forces acting on the bead, as shown on the right of figure 11.8: that due
to gravity, the force N normal to the wire, which does not directly affect the motion, and
the frictional force of magnitude $\mu N$ directed along the wire and opposing the motion.
Here $\mu$ is the constant coefficient of friction and $\mu \ge 0$. For the reason discussed above,
for a given value of $\mu$ we expect no stationary paths if b/A is too large.
The forces on the bead in the x- and y-directions are obtained directly by resolving
the forces shown in the inset of figure 11.8,
\[ F_x = -N(\sin\theta + \mu\cos\theta), \qquad F_y = N(\cos\theta - \mu\sin\theta) - mg, \tag{11.92} \]
so the force in the tangential direction is
\[ F_T = F_x\cos\theta + F_y\sin\theta = -\mu N - mg\sin\theta. \tag{11.93} \]
Newton's equations of motion are therefore
\[ m\ddot{x} = -N(\sin\theta + \mu\cos\theta), \tag{11.94} \]
\[ m\ddot{y} = N(\cos\theta - \mu\sin\theta) - mg, \tag{11.95} \]
where we use the notation (due to Newton) $\dot{x} = dx/dt$ and $\ddot{x} = d^2x/dt^2$. Along the
wire, if v is the speed,
\[ m\dot{v} = F_T = -\mu N - mg\sin\theta. \tag{11.96} \]
Eliminating $\mu N$ from equations 11.94 and 11.95 gives
\[ m\left(\ddot{x}\sin\theta - \ddot{y}\cos\theta\right) = -N + mg\cos\theta. \tag{11.97} \]
But also
\[ \dot{x} = v\cos\theta \quad\text{and}\quad \dot{y} = v\sin\theta, \tag{11.98} \]
and by differentiation we see that $\ddot{x}\sin\theta - \ddot{y}\cos\theta = -v\dot\theta$, so that equation 11.97 becomes
\[ N = mv\dot\theta + mg\cos\theta. \tag{11.99} \]
By substituting this into equation 11.96, for the tangential motion, we obtain the
equation of motion
\[ \dot{v} + \mu\left(v\dot\theta + g\cos\theta\right) + g\sin\theta = 0. \tag{11.100} \]
Using equation 11.98 this equation can be written in the alternative form
\[ v\dot{v} + \mu v^2\dot\theta + \mu g\dot{x} + g\dot{y} = 0. \tag{11.101} \]
In this equation $(\dot{x}, \dot{y})$ are related to v and $\theta$, by geometry, equation 11.98; squaring
and adding these equations gives the obvious identity $v^2 = \dot{x}^2 + \dot{y}^2$, which is one of the
constraints on the functional. Differentiation of equations 11.98 gives
\[ \dot\theta = \frac{\ddot{y}\cos\theta - \ddot{x}\sin\theta}{v} = \frac{\ddot{y}\dot{x} - \ddot{x}\dot{y}}{v^2}. \]
This relation, together with the equation of motion 11.101, is the other constraint.
Exercise 11.23
A bead slides on a rough wire joining (0, A) to (b, 0) in a straight line, starting
from (0, A) with speed $v_0$.
Show that provided $v_0^2 > 2g(\mu b - A)$ the bead reaches the terminus at the time
\[ t = \frac{2\sqrt{A^2 + b^2}}{v_0 + \sqrt{v_0^2 + 2g(A - \mu b)}}. \]
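The result of exercise 11.23 can be sanity-checked by substituting the quoted time back into the constant-acceleration kinematics: on a straight wire of length $L = \sqrt{A^2 + b^2}$ equation 11.100 gives the constant acceleration $g(A - \mu b)/L$. The following sketch is illustrative only; the function names and sample values are assumptions.

```python
import math


def straight_wire_time(A, b, mu, v0, g=9.81):
    """Time to slide from (0, A) to (b, 0) on a straight rough wire (exercise 11.23)."""
    disc = v0**2 + 2.0 * g * (A - mu * b)
    if disc <= 0.0:
        raise ValueError("bead does not reach the terminus: v0^2 <= 2g(mu*b - A)")
    return 2.0 * math.hypot(A, b) / (v0 + math.sqrt(disc))


def distance_travelled(t, A, b, mu, v0, g=9.81):
    """Distance along the wire after time t under constant acceleration."""
    accel = g * (A - mu * b) / math.hypot(A, b)
    return v0 * t + 0.5 * accel * t * t


if __name__ == "__main__":
    A, b, mu, v0 = 1.0, 2.0, 0.3, 0.5   # illustrative values only
    t = straight_wire_time(A, b, mu, v0)
    print("time:", t)
    print("distance:", distance_travelled(t, A, b, mu, v0), "wire length:", math.hypot(A, b))
```

In the frictionless vertical limit, b = 0 and $\mu = 0$ with $v_0 = 0$, the formula reduces to the free-fall time $\sqrt{2A/g}$.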
Exercise 11.24
Consider a wire in the shape of the quadrant of a circle of radius R, centre at
(R, R), joining the points (0, R) and (R, 0). The coordinates of a point on this
quadrant can be expressed in terms of the angle $\phi$,
\[ x = R(1 - \cos\phi), \qquad y = R(1 - \sin\phi), \qquad 0 \le \phi \le \frac{\pi}{2}, \]
with $\phi$ increasing from 0 at (0, R) to $\pi/2$ at (R, 0).
(a) Show that $\theta = \phi - \pi/2$ where $\theta$ is the angle defined in figure 11.8.

(b) Show that the equation of motion of the bead on the wire is
\[ v\frac{dv}{d\phi} + \mu v^2 = gR(\cos\phi - \mu\sin\phi). \]
(c) By making an appropriate change of variable deduce, without solving the
equation, that if v(0) = 0 the value of $\mu$ for which $v(\pi/2) = 0$ is independent of R.
(d) By solving the differential equation derived in part (b) with v(0) = 0 show
that $v(\pi/2) = 0$ for $\mu = \mu_1$ where $\mu_1$ is the solution of
\[ 2\mu^2 + 3\mu e^{-\mu\pi} = 1. \]
Deduce that if $\mu$ is slightly larger than $\mu_1$ the bead does not reach the terminus.
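The transcendental equation in part (d) has no closed-form solution, but its positive root is easily found numerically. A minimal sketch using bisection follows; the helper names are our own and the bracket [0, 1] is an assumption justified by the sign change of the left-hand side minus 1 between those points.

```python
import math


def circle_condition(mu):
    """Left-hand side minus right-hand side of 2*mu^2 + 3*mu*exp(-mu*pi) = 1."""
    return 2.0 * mu**2 + 3.0 * mu * math.exp(-mu * math.pi) - 1.0


def bisect(func, lo, hi, tol=1e-12):
    """Bisection for a root of func bracketed by [lo, hi]."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0.0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    mu_1 = bisect(circle_condition, 0.0, 1.0)
    print("critical friction coefficient mu_1 =", mu_1)
```

The root lies a little above 0.6 and, as part (c) anticipates, does not involve the radius R.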
The functional and boundary conditions

The time of passage, T, is given by equation 11.61 (page 437),
\[ T = \int_0^1 d\tau\,\frac{dt}{d\tau} \tag{11.102} \]
where $\tau$ is the parameter defining the position along the path: if the natural variable,
t the time, were used the required quantity T would appear as a limit in the integral,
which is inconvenient.
This functional has two constraints: the equation of motion 11.101 and the relation
between v and (x, y), so this is a Lagrange problem with two multipliers. The constraints
need to be expressed in terms of $\tau$. For v
\[ v(\tau) = \sqrt{\left(\frac{dx}{d\tau}\frac{d\tau}{dt}\right)^2 + \left(\frac{dy}{d\tau}\frac{d\tau}{dt}\right)^2} = \frac{\sqrt{x'(\tau)^2 + y'(\tau)^2}}{t'(\tau)}, \]
with a prime denoting differentiation with respect to $\tau$. For $\theta'$, since $\tan\theta = \dot{y}/\dot{x} =
y'/x'$, differentiation gives
\[ \frac{1}{\cos^2\theta}\frac{d\theta}{d\tau} = \frac{y''}{x'} - \frac{y'x''}{x'^2} \quad\text{hence}\quad \frac{d\theta}{d\tau} = \frac{y''x' - y'x''}{x'^2 + y'^2} = \frac{y''x' - y'x''}{v^2t'^2}. \]
Thus the equation of motion 11.101 becomes
\[ vv' + gy' + \mu\,\frac{y''x' - y'x''}{t'^2} + \mu gx' = 0. \]
The auxiliary functional is therefore
\[ \overline{T}[x, y, v, t] = \int_0^1 d\tau\,F(x', x'', y', y'', v, v', t') \tag{11.103} \]
where
\[ F = t' + \lambda_1\left(vv' + gy' + \mu\,\frac{y''x' - y'x''}{t'^2} + \mu gx'\right) + \lambda_2\left(\sqrt{x'^2 + y'^2} - vt'\right), \tag{11.104} \]
with both the Lagrange multipliers, $\lambda_1$ and $\lambda_2$, depending upon $\tau$. The dependent
variables are (x, y, v, t) and the functional contains second derivatives of x and y.
The known boundary conditions at the start, $\tau = 0$, are
\[ x(0) = 0, \qquad y(0) = A > 0, \qquad v(0) = v_0 \ge 0, \qquad t(0) = 0, \tag{11.105} \]
and at the terminus, $\tau = 1$,
\[ x(1) = b, \qquad y(1) = 0. \tag{11.106} \]
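The remaining conditions are determined by the natural boundary conditions: for x
and y,
\[ \frac{\partial F}{\partial x''} = -\mu\lambda_1\frac{y'}{t'^2} = 0, \qquad \frac{\partial F}{\partial y''} = \mu\lambda_1\frac{x'}{t'^2} = 0, \quad\text{at } \tau = 0 \text{ and } 1, \tag{11.107} \]
and for v at the terminus
\[ \frac{\partial F}{\partial v'} = \lambda_1(1)v(1) = 0. \tag{11.108} \]
This gives $\lambda_1(1) = 0$ and hence the boundary condition 11.107 at the terminus is
automatically satisfied.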
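The four Euler-Lagrange equations are obtained from the derivatives
\[ \frac{\partial F}{\partial t} = 0, \qquad \frac{\partial F}{\partial t'} = 1 - 2\mu\lambda_1\frac{x'y'' - y'x''}{t'^3} - \lambda_2v, \]
\[ \frac{\partial F}{\partial v} = \lambda_1v' - \lambda_2t', \qquad \frac{\partial F}{\partial v'} = \lambda_1v, \]
\[ \frac{\partial F}{\partial x'} = \frac{\mu\lambda_1y''}{t'^2} + \mu\lambda_1g + \frac{\lambda_2x'}{\sqrt{x'^2 + y'^2}}, \qquad \frac{\partial F}{\partial x''} = -\frac{\mu\lambda_1y'}{t'^2}, \]
\[ \frac{\partial F}{\partial y'} = -\frac{\mu\lambda_1x''}{t'^2} + \lambda_1g + \frac{\lambda_2y'}{\sqrt{x'^2 + y'^2}}, \qquad \frac{\partial F}{\partial y''} = \frac{\mu\lambda_1x'}{t'^2}. \]
From these expressions we obtain the four Euler-Lagrange equations in terms of $\tau$, after
which we may replace $\tau$ by t (because the choice of parameter is arbitrary). Thus the
following four Euler-Lagrange equations are obtained:
\[ \lambda_2v + 2\mu\lambda_1v^2\dot\theta = c_1 \quad\text{(for } t\text{)}, \tag{11.109} \]
\[ v\dot\lambda_1 + \lambda_2 = 0 \quad\text{(for } v\text{)}, \tag{11.110} \]
\[ 2\mu\lambda_1\ddot{y} + \mu\dot{y}\dot\lambda_1 + \mu\lambda_1g + \lambda_2\cos\theta = c_x \quad\text{(for } x\text{)}, \tag{11.111} \]
\[ 2\mu\lambda_1\ddot{x} + \mu\dot{x}\dot\lambda_1 - \lambda_1g - \lambda_2\sin\theta = c_y \quad\text{(for } y\text{)}, \tag{11.112} \]
where $c_1$, $c_x$ and $c_y$ are integration constants. These four equations, together with the
constraints, allow a solution to be found; remarkably these equations can be integrated
in terms of known functions, though this process is not simple.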
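Using equations 11.110 and 11.98 we see that $\dot{x}\dot\lambda_1 = -\lambda_2\cos\theta$ and $\dot{y}\dot\lambda_1 = -\lambda_2\sin\theta$.
Equation 11.109 gives $\lambda_2$ in terms of $\lambda_1$, and the second derivatives in equations 11.111
and 11.112, $\ddot{x}$ and $\ddot{y}$, may be replaced by the first derivatives $\dot{v}$ and $\dot\theta$ using
\[ \ddot{x} = \dot{v}\cos\theta - v\dot\theta\sin\theta, \qquad \ddot{y} = \dot{v}\sin\theta + v\dot\theta\cos\theta, \]
so equations 11.111 and 11.112 become, respectively,
\[ 2\mu\lambda_1\left(\dot{v} + \mu v\dot\theta\right)\sin\theta + \mu\lambda_1g + \frac{c_1}{v}(\cos\theta - \mu\sin\theta) = c_x, \tag{11.113} \]
\[ 2\mu\lambda_1\left(\dot{v} + \mu v\dot\theta\right)\cos\theta - \lambda_1g - \frac{c_1}{v}(\sin\theta + \mu\cos\theta) = c_y. \tag{11.114} \]
Now note that the combination $\dot{v} + \mu v\dot\theta$ also occurs in the equation of motion 11.101,
which can therefore be used to obtain two algebraic equations relating v and $\lambda_1$. Thus
equations 11.113 and 11.114 become
\[ \mu\lambda_1g\left(1 - 2\mu\sin\theta\cos\theta - 2\sin^2\theta\right) + \frac{c_1}{v}(\cos\theta - \mu\sin\theta) = c_x, \tag{11.115} \]
\[ -\lambda_1g\left(1 + 2\mu\sin\theta\cos\theta + 2\mu^2\cos^2\theta\right) - \frac{c_1}{v}(\sin\theta + \mu\cos\theta) = c_y. \tag{11.116} \]
These equations are linear in $\lambda_1g$ and $c_1/v$ so may be solved directly to give
\[ v(\theta) = \frac{\cos\theta}{Bh(\theta)} \quad\text{where}\quad h(\theta) = 1 + 2\mu\sin\theta\cos\theta + 2\mu C\cos^2\theta \tag{11.117} \]
and
\[ \lambda_1g = -Bc_1(C + \tan\theta) \quad\text{where}\quad B = \frac{c_x - \mu c_y}{c_1(1 + \mu^2)}, \qquad C = \frac{c_y + \mu c_x}{c_x - \mu c_y}. \tag{11.118} \]
Thus both v and $\lambda_1$ are explicit functions of $\theta$.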

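If v(0) = 0 the initial value of $\theta$ satisfies $\cos\theta = 0$, and physical considerations give
$\theta(0) = -\pi/2$; that is, the stationary curve is initially vertical, as in the conventional
problem.
Because v is a function of $\theta$ it is possible to express x and y as first-order differential
equations with $\theta$ as the independent variable. First note that $\dot\theta = 1/t'(\theta)$, then
\[ \dot{x} = \frac{x'(\theta)}{t'(\theta)} = v(\theta)\cos\theta \quad\text{that is}\quad \frac{dx}{d\theta} = v\cos\theta\,\frac{dt}{d\theta} \quad\text{and similarly}\quad \frac{dy}{d\theta} = v\sin\theta\,\frac{dt}{d\theta}. \]
An expression for $t'(\theta)$ is obtained from the equation of motion 11.101 by dividing by
$\dot\theta$ to give
\[ v\left(v' + \mu v\right) + gv\,t'(\theta)\left(\sin\theta + \mu\cos\theta\right) = 0, \]
where here a prime denotes differentiation with respect to $\theta$, that is
\[ g\frac{dt}{d\theta} = -\frac{v' + \mu v}{\sin\theta + \mu\cos\theta}. \]
Using equation 11.117 in this expression it becomes, after some algebra,
\[ gB\frac{dt}{d\theta} = \frac{2}{h^2} - \frac{1}{h}. \tag{11.119} \]
Hence the differential equations for $x(\theta)$ and $y(\theta)$ are
\[ gB^2\frac{dx}{d\theta} = \cos^2\theta\left(\frac{2}{h^3} - \frac{1}{h^2}\right) \quad\text{and}\quad gB^2\frac{dy}{d\theta} = \sin\theta\cos\theta\left(\frac{2}{h^3} - \frac{1}{h^2}\right). \tag{11.120} \]
At the terminus, where $\theta = \theta_1$, $\lambda_1 = 0$ so equation 11.118 relates C to $\theta_1$, $C = -\tan\theta_1$.

Thus the equations for the stationary path are
\[ x(\theta, B) = \frac{1}{gB^2}\int_{-\pi/2}^{\theta} d\theta\,\cos^2\theta\left(\frac{2}{h^3} - \frac{1}{h^2}\right), \qquad x(\theta_1, B) = b, \tag{11.121} \]
\[ y(\theta, B) = A + \frac{1}{gB^2}\int_{-\pi/2}^{\theta} d\theta\,\sin\theta\cos\theta\left(\frac{2}{h^3} - \frac{1}{h^2}\right), \qquad y(\theta_1, B) = 0. \tag{11.122} \]
The two boundary conditions give two equations for B and $\theta_1$ which may be solved
(numerically) to yield the stationary path. Some examples of the solutions of these
equations are shown in figure 11.9; here the frictionless case ends tangentially to the
x-axis and if $\mu > 0$ the stationary path dips below the x-axis, but too little to be seen
on this graph.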
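To make the structure of equations 11.121 and 11.122 concrete, the sketch below evaluates the two integrals by Simpson's rule for given values of $\theta_1$, B and $\mu$. It is an illustration under stated assumptions, not the method used to produce the figures: the function names are ours, $C = -\tan\theta_1$ is taken from the terminal condition, and no attempt is made to solve the two boundary conditions for B and $\theta_1$.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    step = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * step)
    return total * step / 3.0


def h(theta, theta1, mu):
    """h(theta) of equation 11.117 with C = -tan(theta1)."""
    C = -math.tan(theta1)
    return (1.0 + 2.0 * mu * math.sin(theta) * math.cos(theta)
            + 2.0 * mu * C * math.cos(theta) ** 2)


def end_point(theta1, B, mu, A, g=9.81, n=2000):
    """Return (x(theta1, B), y(theta1, B)) from equations 11.121 and 11.122."""
    if mu * math.tan(0.5 * theta1 + 0.25 * math.pi) >= 1.0:
        raise ValueError("h(theta) is not positive on the path (see exercise 11.25)")

    def weight(t):
        ht = h(t, theta1, mu)
        return 2.0 / ht**3 - 1.0 / ht**2

    x = simpson(lambda t: math.cos(t) ** 2 * weight(t), -0.5 * math.pi, theta1, n)
    y = simpson(lambda t: math.sin(t) * math.cos(t) * weight(t), -0.5 * math.pi, theta1, n)
    return x / (g * B**2), A + y / (g * B**2)


if __name__ == "__main__":
    print(end_point(theta1=-0.3, B=1.2, mu=0.0, A=1.0))
    print(end_point(theta1=-0.3, B=1.2, mu=0.2, A=1.0))
```

For $\mu = 0$ the output can be compared with the cycloid formulae of exercise 11.26; a root finder in the two unknowns $(B, \theta_1)$ would then be wrapped around `end_point` to impose x = b and y = 0.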
Figure 11.9 Graphs of the curves traced out by equations 11.121 and 11.122
for the terminal points (0, 1) and $(\pi/2, 0)$, for which the frictionless
brachistochrone, $\mu = 0$, ends tangentially to the x-axis; this is depicted by the dashed
line. The cases $\mu = 0.2$, 0.3 and 0.5 are shown.
Figure 11.10 shows stationary paths with the end points (0, 1) and (5, 0), for which
the frictionless brachistochrone dips below the x-axis. In this case the distance travelled
is longer than in figure 11.9 and the value of $\mu$ above which there is no stationary path
is smaller, as illustrated in the problems considered in exercises 11.24 and 11.32.
Figure 11.10 Graphs of the curves traced out by equations 11.121 and 11.122
for the terminal points (0, 1) and (5, 0) and various values of $\mu$ ($\mu = 0.05$, 0.1, 0.15
and 0.2), with the case $\mu = 0$ shown with the dashed line.
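Exercise 11.25
Assuming that $v_0 = 0$ show that at the end points $h(\theta) = 1$, where $h(\theta)$ is defined
in equation 11.117, and that $h(\theta)$ has a single minimum at $\theta = \theta_1/2 - \pi/4$.
Find the minimum value of $h(\theta)$ and deduce that solutions exist only if
\[ \mu\tan\left(\frac{\theta_1}{2} + \frac{\pi}{4}\right) < 1. \]
Exercise 11.26
In the friction-free limit, $\mu = 0$, show that equations 11.121 and 11.122 give
\[ x = \frac{1}{4gB^2}(2\phi - \sin 2\phi) \quad\text{and}\quad y = A - \frac{1}{4gB^2}(1 - \cos 2\phi), \qquad \phi = \theta + \frac{\pi}{2}, \]
and that $\phi_1$ is related to b and A by
\[ \frac{b}{A} = \frac{2\phi_1 - \sin 2\phi_1}{1 - \cos 2\phi_1}, \qquad \phi_1 = \frac{\pi}{2} + \theta_1. \]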

Exercise 11.27
(a) Show that the functional
\[ S[y] = \int_0^{\pi} dx\,y'^2, \qquad y(0) = y(\pi) = 0, \]
subject to the constraint $\displaystyle\int_0^{\pi} dx\,y^2 = 1$, gives rise to the equation
\[ \frac{d^2y}{dx^2} + \lambda y = 0, \qquad y(0) = y(\pi) = 0, \]
where $\lambda$ is the Lagrange multiplier.
(b) Show that the functions $y(x) = \sqrt{2/\pi}\,\sin nx$, with Lagrange multiplier $\lambda = n^2$,
$n = 1, 2, \ldots$, are solutions of this equation.
Exercise 11.28
(a) Show that the functional, which is quadratic in y and $y'$,
\[ S[y] = \int_a^b dx\,\left(p(x)y'^2 - q(x)y^2\right), \qquad y(a) = y(b) = 0, \]
and the constraint $\displaystyle\int_a^b dx\,w(x)y(x)^2 = 1$ lead to the linear equation
\[ \frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + \left(q(x) + \lambda w(x)\right)y = 0, \qquad y(a) = y(b) = 0. \]
(b) If the constraint were not also quadratic in y(x) would the resulting
Euler-Lagrange equation be linear?
Exercise 11.29
Find the stationary value of the functional $S[y] = \displaystyle\int_0^1 dx\,y^2$ subject to the
constraint $\displaystyle\int_0^1 dx\,y = a$.
Exercise 11.30
Find the function y(x) making the functional $P[y] = -\displaystyle\int_{-\infty}^{\infty} dx\,y\ln y$ stationary
subject to the two constraints $\displaystyle\int_{-\infty}^{\infty} dx\,y = 1$ and $\displaystyle\int_{-\infty}^{\infty} dx\,x^2y = \sigma^2$, and where
y(x) goes to zero sufficiently rapidly as $|x| \to \infty$ for all integrals to exist.
You will find the following integrals useful:
\[ \int_{-\infty}^{\infty} dx\,e^{-ax^2} = \sqrt{\frac{\pi}{a}}, \qquad \int_{-\infty}^{\infty} dx\,x^2e^{-ax^2} = \frac{\sqrt{\pi}}{2a^{3/2}} \quad\text{where } \Re(a) > 0. \]
This is an important problem that occurs in statistical physics and information
theory, where y(x) is the probability distribution of a continuously distributed
random variable x and P[y] is the entropy. The first constraint is just the
normalisation condition, satisfied by all distributions, and the second is the variance.
Exercise 11.31
Show that the stationary path of the functional
\[ S[y] = \int_0^{\pi} dx\,y'^2, \qquad y(0) = y(\pi) = 0, \]
subject to the constraint $\displaystyle\int_0^{\pi} dx\,y\sin x = a$, is $y(x) = (2a/\pi)\sin x$.
Exercise 11.32
The points (0, a) and (b, 0), respectively on the Oy and Ox axes, are joined by
a rough wire in the shape of the quadrant of the ellipse parameterised by the
equations
\[ x = b(1 - \cos\phi), \qquad y = a(1 - \sin\phi), \qquad 0 \le \phi \le \frac{\pi}{2}. \]
A bead slides down this wire under the influence of gravity and Coulomb friction;
show that the equation of motion 11.101 can be written in the form
\[ \frac{dz}{d\phi} + \frac{2\mu abz}{a^2\cos^2\phi + b^2\sin^2\phi} = g(a\cos\phi - \mu b\sin\phi), \]
where $z = v^2/2$. If v(0) = 0 show that
\[ \frac{1}{2}v^2 = \frac{g}{f(\phi)}\int_0^{\phi} dw\,(a\cos w - \mu b\sin w)f(w), \]
where
\[ f(\phi) = \exp\left(2\mu\tan^{-1}\left(\frac{b}{a}\tan\phi\right)\right). \]
Deduce that if $\mu = \mu_1$ where $\mu_1$ is the positive solution of
\[ \int_0^{\pi/2} dw\,(\cos w - \mu\gamma\sin w)f(w) = 0 \]
and $\gamma = b/a$ the bead has zero speed at the terminus.
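The condition in exercise 11.32 defines the critical friction coefficient $\mu_1$ only implicitly. The sketch below, which is illustrative and uses names of our own choosing, evaluates the integral by Simpson's rule and locates the root by bisection. It writes $\tan^{-1}(\gamma\tan w)$ as `atan2(gamma*sin(w), cos(w))`, an equivalent form on $[0, \pi/2]$ that avoids the overflow of $\tan w$ at the upper limit.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    step = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * step)
    return total * step / 3.0


def terminal_integral(mu, gamma, n=2000):
    """Integral of (cos w - mu*gamma*sin w) * f(w) over [0, pi/2], with gamma = b/a."""
    def integrand(w):
        f = math.exp(2.0 * mu * math.atan2(gamma * math.sin(w), math.cos(w)))
        return (math.cos(w) - mu * gamma * math.sin(w)) * f
    return simpson(integrand, 0.0, 0.5 * math.pi, n)


def critical_friction(gamma, tol=1e-10):
    """Positive root mu_1 of terminal_integral(mu, gamma) = 0, found by bisection."""
    lo, hi = 0.0, 1.0
    while terminal_integral(hi, gamma) > 0.0:   # expand until the sign changes
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if terminal_integral(mid, gamma) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for gamma in (0.5, 1.0, 2.0, 5.0):
        print("b/a =", gamma, " mu_1 =", critical_friction(gamma))
```

For $\gamma = 1$ the ellipse is the circular quadrant of exercise 11.24, so the value returned should satisfy $2\mu^2 + 3\mu e^{-\mu\pi} = 1$; larger values of b/a give smaller critical values, in line with the remarks at the start of this section.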


Put x = y + , so the integral becomes
1 1

1 2 00 1 3 000
Z Z
2 2 0

dy f ( +y) y = 2 dy f () + yf () + y f () + y f ( + )
2 2 6
for some in (, ), where we have used Taylors series, (section 1.3.8). Since
4
4 5
Z Z Z
dy 2 y 2 = 3 , dy 2 y 2 y = 0, dy 2 y 2 y 2 =

,
3 15
we see that
b
4
Z
dx f (x)g(x ) = f () + O( 3 ), = .
a 3

If $\lambda$ is the Lagrange multiplier,
\[ \overline{S}[y] = \int_0^1 dx\,\left(y'^2 - \lambda y\right), \qquad y(0) = y(1) = 0, \]
and the associated Euler-Lagrange equation is $2y'' + \lambda = 0$. This has the general solution
\[ y(x) = -\frac{\lambda}{4}x^2 + ax + b, \]
where a and b are constants. The boundary condition at x = 0 gives b = 0: that at
x = 1 gives $a = \lambda/4$, so the solution is $y(x) = \dfrac{\lambda}{4}x(1 - x)$. The constraint gives
\[ A = \frac{\lambda}{4}\int_0^1 dx\,x(1 - x) = \frac{\lambda}{24} \quad\text{giving}\quad y(x) = 6Ax(1 - x). \]

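\[ \overline{S}[y] = \int_1^2 dx\,\left(xy'^2 - \lambda y\right), \qquad y(1) = y(2) = 0, \]
and the associated Euler-Lagrange equation is $2(xy')' + \lambda = 0$, which has the general
solution
\[ y(x) = -\frac{\lambda}{2}x + a\ln x + b, \]
where a and b are constants. The boundary condition at x = 1 gives $b = \lambda/2$: that at
x = 2 gives $0 = a\ln 2 - \lambda/2$, so the solution is
\[ y(x) = \frac{\lambda}{2}(1 - x) + \frac{\lambda\ln x}{2\ln 2}. \]
The constraint gives
\[ 1 = \frac{\lambda}{2}\int_1^2 dx\,\left(1 - x + \frac{\ln x}{\ln 2}\right) = \frac{\lambda}{2}\left[x - \frac{1}{2}x^2 + \frac{x\ln x - x}{\ln 2}\right]_1^2 = \frac{\lambda(3\ln 2 - 2)}{4\ln 2} \]
and hence
\[ y(x) = \frac{2\ln 2}{3\ln 2 - 2}\left(1 - x + \frac{\ln x}{\ln 2}\right). \]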


F = y 0 2 + z 0 2 2xz 0 4z y 0 2 xy 0 z 0 2
= (1 )y 0 2 + (1 + )z 0 2 2xz 0 + xy 0 4z
so the Euler-Lagrange equations for y and z are, respectively,

d
2(1 )y 0 + x = 0
dx
d
2(1 + )z 0 2x + 4 = 0.
dx
The first equation gives
1
2(1 )y 0 + x = A and hence 2(1 )y = Ax + B x2
2
where A and B are constants. The boundary condition at x = 0 give B = 0 and that
at x = 1 gives A = (4 3)/2.
Similarly the second Euler-Lagrange equation gives
2(1 + )z 0 2x = 4x + C and hence 2(1 + )z = x2 + Cx + D
and the boundary condition at x = 0 gives D = 0 and that at x = 1 gives C = 3 + 2.

Hence the solution is
(4 3)x x2 (3 + 2)x x2
y= and z = .
4(1 ) 2(1 + )
The values of the Lagrange multiplier are obtained by substituting these functions into
the constraint. The component integrals are
Z 1 Z 1
1
c1 = dx y 0 2 = dx (4 3 2x)2
0 16(1 )2 0

1 49 2
= 16 32 +
16(1 )2 3
Z 1 Z 1
1
dx xy 0 = dx (4 3)x 2x2

c2 =
0 4(1 ) 0
12 13
= .
24(1 )
Hence
24 46 + 232
c1 c 2 = .
48(1 )2
Also
1 1
1 1
Z Z
2
c3 = dx z 0 2 = dx (3 + 2 2x) = 1 + .
0 4(1 + )2 0 12(1 + )2
Thus the constraint becomes
24 46 + 232 1
c= 1 .
48(1 )2 12(1 + )2

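If $\lambda$ is the Lagrange multiplier then
\[ \overline{S}[y] = \int_0^1 dx\,\left(y - \lambda y'^2\right), \qquad y(0) = y(1) = 0, \]
and the Euler-Lagrange equation is $2\lambda y'' + 1 = 0$, with the general solution
\[ y(x) = -\frac{x^2}{4\lambda} + Ax + B. \]
The boundary condition at x = 0 gives B = 0 and that at x = 1 gives $A = 1/(4\lambda)$, so
the stationary path is
\[ y(x) = \frac{1}{4\lambda}x(1 - x). \]
The constraint gives
\[ c = \left(\frac{1}{4\lambda}\right)^2\int_0^1 dx\,(1 - 2x)^2 = \frac{1}{3(4\lambda)^2} \quad\text{and hence}\quad \frac{1}{4\lambda} = \sqrt{3c}, \]
which gives the required result.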

The function $\sinh\xi$ passes through the origin with unit gradient, and for $\xi > 0$ its
gradient is a monotonic increasing function. Hence if $\sqrt{L^2 - (A-B)^2} > a$, equation 11.24
has a unique, positive solution. This condition can be written in the form
$L^2 > a^2 + (A-B)^2$, which is simply the condition that the cable is longer than the
distance between the end points.
In the limit $L^2 \to a^2 + (A-B)^2$ the root of equation 11.24 tends to zero, so $c \to \infty$.
More precisely, if $\sqrt{L^2 - (A-B)^2} = a(1+\delta)$, for some small positive quantity $\delta$,
equation 11.24 becomes $\sinh\xi = \xi(1+\delta)$ and using the approximation
$\sinh\xi = \xi + \xi^3/6 + O(\xi^5)$ gives $\xi^2 = 6\delta$, so, since $\xi = a/(2c)$,
\[
c \simeq \frac{a}{2\sqrt{6\delta}}.
\]
Since $y'(x) = \sinh((x+d)/c)$, $y'(0) = 0$ if $d = 0$. Then equations 11.22 and 11.23 give
\[
L = c\sinh\frac{a}{c} = c\sinh 2\xi, \quad \xi = \frac{a}{2c},
\quad\text{and}\quad A - B = c\,(\cosh 2\xi - 1),
\]
and we may eliminate $c$ by dividing these two equations,
\[
\frac{A-B}{L} = \frac{\cosh 2\xi - 1}{\sinh 2\xi} = \tanh\xi.
\]
Substitute for $A - B$ in equation 11.24 to obtain
\[
\sinh\xi = \frac{\xi L}{a}\sqrt{1 - \tanh^2\xi} = \frac{\xi L}{a\cosh\xi}
\quad\text{giving}\quad a\sinh 2\xi = 2\xi L.
\]
Given $L$ and $a$, provided $L > a$, this equation gives a unique real, positive value of $\xi$.
Then the height difference, $A - B$, is obtained from the equation $A - B = L\tanh\xi$.
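The equation $a\sinh 2\xi = 2\xi L$ has no closed-form solution, but its positive root is easily bracketed because $\sinh 2\xi/(2\xi)$ increases monotonically from 1. The following is a minimal numerical sketch in Python using plain bisection; the function names `solve_xi` and `height_difference` are illustrative choices and not part of the text, and the routine assumes $L$ is not within rounding error of $a$.

```python
import math

def solve_xi(L, a, tol=1e-12):
    """Positive root of a*sinh(2*xi) = 2*xi*L by bisection (requires L > a > 0)."""
    if not (L > a > 0):
        raise ValueError("a positive root exists only if L > a > 0")

    def g(xi):
        return a * math.sinh(2.0 * xi) - 2.0 * xi * L

    lo, hi = 1e-8, 1.0           # g(lo) < 0 because L > a
    while g(hi) <= 0.0:          # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def height_difference(L, a):
    """A - B = L*tanh(xi) for the cable with horizontal tangent at x = 0."""
    return L * math.tanh(solve_xi(L, a))
```

Once $\xi$ is known, the scale of the catenary follows from $c = a/(2\xi)$.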

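On the path defined in equation 11.20 the energy functional is
\[
\begin{aligned}
E[y] &= \int_0^a dx\,\left(\lambda + c\cosh\frac{x+d}{c}\right)\cosh\frac{x+d}{c}
      = \lambda L + c\int_0^a dx\,\cosh^2\frac{x+d}{c} \\
     &= \lambda L + \frac{1}{2}ac + \frac{c^2}{4}\left(\sinh\frac{2a+2d}{c} - \sinh\frac{2d}{c}\right),
\end{aligned}
\]
where $\lambda = -c\cosh(d/c)$. Hence
\[
\begin{aligned}
E[y_-] &= Lc_0\cosh\frac{d_-}{c_0} - \frac{1}{2}ac_0 - \frac{1}{2}c_0^2\cosh\frac{a+2d_-}{c_0}\,\sinh\frac{a}{c_0}, \\
E[y_+] &= -Lc_0\cosh\frac{d_+}{c_0} + \frac{1}{2}ac_0 + \frac{1}{2}c_0^2\cosh\frac{a+2d_+}{c_0}\,\sinh\frac{a}{c_0}.
\end{aligned}
\]
Define $\Delta = E[y_-] - E[y_+]$, so we need to show that $\Delta > 0$. Write $\Delta$ in the form
\[
\begin{aligned}
\Delta &= -ac_0 + Lc_0\left(\cosh\frac{d_-}{c_0} + \cosh\frac{d_+}{c_0}\right)
   - \frac{1}{2}c_0^2\left(\cosh\frac{a+2d_-}{c_0} + \cosh\frac{a+2d_+}{c_0}\right)\sinh\frac{a}{c_0} \\
  &= -ac_0 + 2Lc_0\cosh\frac{d_+ + d_-}{2c_0}\,\cosh\frac{d_+ - d_-}{2c_0}
   - c_0^2\sinh\frac{a}{c_0}\,\cosh\frac{a + d_+ + d_-}{c_0}\,\cosh\frac{d_+ - d_-}{c_0}.
\end{aligned}
\]
But $d_\pm/c_0 = \pm D_0 - \xi_0$, so $(d_+ + d_-)/c_0 = -2\xi_0$ and $(d_+ - d_-)/c_0 = 2D_0$, with $2\xi_0 = a/c_0$.

Hence
\[
\Delta = -ac_0 + 2Lc_0\cosh\xi_0\,\cosh D_0 - c_0^2\sinh 2\xi_0\,\cosh 2D_0.
\]
But $L = 2c_0\cosh D_0\,\sinh\xi_0$, and hence
\[
\begin{aligned}
\Delta &= c_0^2\sinh 2\xi_0\left(2\cosh^2 D_0 - \cosh 2D_0\right) - ac_0 \\
       &= c_0^2\left(\sinh 2\xi_0 - 2\xi_0\right) > 0 \quad\text{if } \xi_0 > 0.
\end{aligned}
\]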


(a) The Euler-Lagrange equation is
\[
\frac{d}{dx}\left(\frac{(y-\lambda)y'}{\sqrt{1+y'^2}}\right) - \sqrt{1+y'^2} = 0, \quad y(0) = 0,\ y(a) = A,
\]
and this expands to
\[
(y-\lambda)y'' = 1 + y'^2, \quad y(0) = 0,\ y(a) = A.
\]
(b) If $y(x) = A - w(u)$, $u = a - x$ then
\[
\frac{dy}{dx} = -\frac{dw}{du}\frac{du}{dx} = \frac{dw}{du}
\quad\text{and similarly}\quad
\frac{d^2y}{dx^2} = -\frac{d^2w}{du^2}.
\]
Hence the Euler-Lagrange equation becomes
\[
-\bigl(A - w(u) - \lambda\bigr)\frac{d^2w}{du^2} = 1 + w'(u)^2, \quad w(0) = 0,\ w(a) = A,
\]
and this can be rewritten in the form
\[
(w - \mu)w''(u) = 1 + w'(u)^2, \quad \lambda + \mu = A.
\]
Thus if $y(x)$ is a solution of the Euler-Lagrange equation, so is $w(x)$ if $\lambda$ is replaced by
$A - \lambda$.
For the minimum surface problem there is no Lagrange multiplier, that is $\lambda = 0$, so we
do not have sufficient flexibility for $w(x)$ to be a solution.

The area, $A[y]$, and the constraint, $C[y] = L$, are
\[
A[y] = \int_0^v dx\, y \quad\text{and}\quad C[y] = \int_0^v dx\,\sqrt{1+y'^2} = L, \quad y(0) = 0.
\]
If $\lambda$ is a Lagrange multiplier the auxiliary functional is
\[
\overline{A}[y] = \int_0^v dx\,\left(y - \lambda\sqrt{1+y'^2}\right), \quad y(0) = 0.
\]
The boundary condition at $x = v$ is given by equation 11.31, with the terminal curve defined by $y = 0$, that is
\[
y'F_{y'} - F = 0 \quad\text{where}\quad F = y - \lambda\sqrt{1+y'^2}.
\]
But the functional has the first integral $y'F_{y'} - F = c$, for some constant $c$, and the
boundary condition shows that $c = 0$. Hence the equation for the stationary path is
\[
y = \frac{\lambda}{\sqrt{1+y'^2}} \quad\text{or}\quad \frac{dy}{dx} = \frac{\sqrt{\lambda^2 - y^2}}{y}
\quad\text{with solution}\quad \lambda^2 = (x - A)^2 + y^2,
\]
for some constant $A$. The boundary condition at $x = 0$ gives $\lambda = A$, so the stationary
path is a semicircle of radius $\lambda$ with centre at $(\lambda, 0)$ and hence $v = 2\lambda$. The length of
the arc is therefore $L = \pi\lambda$, which gives $\lambda$.

(a) The surface area is derived in section 4.3. For the volume consider a thin disc with centre
at $x$ and width $\delta x$: its volume is approximately the product of the area of one surface,
$\pi y(x)^2$, and the width. Hence
\[
V[y] = \pi\int_0^v dx\, y^2 \quad\text{and}\quad A[y] = 2\pi\int_0^v dx\, y\sqrt{1+y'^2}.
\]
(b) If $\lambda/2$ is the Lagrange multiplier the modified functional is
\[
\overline{V}[y] = \pi\int_0^v dx\, F(y, y') \quad\text{where}\quad F = y^2 - \lambda y\sqrt{1+y'^2}, \quad y(0) = 0.
\]
The first integral of the associated Euler-Lagrange equation is
\[
y'F_{y'} - F = \frac{\lambda y}{\sqrt{1+y'^2}} - y^2 = c,
\]
where $c$ is a constant.
The boundary condition at $x = v$ is given by equation 11.31 with the terminal curve defined by $y = 0$: hence,
as in exercise 11.10, $c = 0$ and the equation for the stationary path is exactly the same
as in exercise 11.10, that is $\lambda^2 = (x - B)^2 + y^2$, for some constant $B$. The boundary
condition at $x = 0$ gives $\lambda = B$, so the stationary path is a semicircle of radius $\lambda$ with
centre at $(\lambda, 0)$ and hence $v = 2\lambda$. Since the shape created is a sphere of radius $\lambda$, its
area and volume are $A = 4\pi\lambda^2$ and $V = \tfrac{4}{3}\pi\lambda^3 = A^{3/2}/(6\sqrt{\pi})$.

The solution of the Euler-Lagrange equation is given in equation 11.20, and is
\[
y = \lambda + c\cosh\frac{x-d}{c}.
\]
The left-hand end of the cable is constrained to the line $x = 0$, so the boundary
condition at $x = 0$ is, from equation 11.31,
\[
0 = F_{y'} = \frac{(y-\lambda)y'}{\sqrt{1+y'^2}}, \quad (x = 0), \quad\text{which gives } d = 0.
\]
The boundary condition at $x = a$ and the length constraint give
\[
A = \lambda + c\cosh\frac{a}{c} \quad\text{and}\quad L = c\sinh\frac{a}{c}.
\]
The second of these equations can be written in the form $L/a = \sinh\xi/\xi$, $\xi = a/c$, so
gives a unique positive value for $c$. Then the first equation can be used to write the
solution in the form
\[
y = A - c\cosh\frac{a}{c} + c\cosh\frac{x}{c}.
\]

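If $\lambda$ is the Lagrange multiplier, the functional is given by equation 11.19 and the
associated Euler-Lagrange equation has the general solution
\[
y(x) = \lambda + c\cosh\frac{x-d}{c}.
\]
There is a natural boundary condition at $x = 0$, so here $F_{y'} = 0$, where
$F = (y - \lambda)\sqrt{1+y'^2}$, so
\[
\frac{y'(y-\lambda)}{\sqrt{1+y'^2}} = 0, \quad\text{that is}\quad y(0) = \lambda \ \text{ or }\ y'(0) = 0.
\]
The first equation gives $\cosh(d/c) = 0$, which cannot be satisfied, and the second gives
$\sinh(d/c) = 0$, which gives $d = 0$.
The transversality condition, equation 11.31, with the terminal curve $x/a + y/b - 1 = 0$, gives
\[
\frac{y-\lambda}{\sqrt{1+y'^2}}\left(\frac{y'}{a} - \frac{1}{b}\right) = 0 \ \text{ at } x = v,
\quad\text{and hence}\quad \sinh\frac{v}{c} = \frac{a}{b}.
\]
This is one equation relating $v$ and $c$. The other is given by the length constraint
\[
L = \int_0^v dx\,\sqrt{1+y'^2} = c\sinh\frac{v}{c} = \frac{ac}{b} \quad\text{hence}\quad c = \frac{bL}{a}.
\]
Thus the required solution is
\[
y = \lambda + \frac{Lb}{a}\cosh\frac{ax}{bL}, \quad 0 \le x \le \frac{Lb}{a}\sinh^{-1}\frac{a}{b}.
\]
Finally, at $x = v$ we have $v/a + y(v)/b = 1$ and since
\[
y(v) = \lambda + \frac{Lb}{a}\cosh\left(\sinh^{-1}\frac{a}{b}\right) = \lambda + \frac{L}{a}\sqrt{a^2+b^2},
\]
this gives
\[
\lambda = b - \frac{L}{a}\sqrt{a^2+b^2} - \frac{vb}{a}
        = b - \frac{L}{a}\sqrt{a^2+b^2} - \frac{Lb^2}{a^2}\sinh^{-1}\frac{a}{b}.
\]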

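Put $m_1 = \tan\theta_1$ and $m_2 = \tan\theta_2$ so that equations 11.35 become
\[
\tan^2\theta_1 - \tan^2\theta_2 = \lambda(\cos\theta_2 - \cos\theta_1)
\quad\text{and}\quad
2(\tan\theta_1 - \tan\theta_2) = \lambda(\sin\theta_1 - \sin\theta_2).
\]
Eliminate $\lambda$ by dividing these equations: assuming that $\tan\theta_1 \ne \tan\theta_2$, a solution of
no interest, we obtain
\[
\frac{\sin\theta_1\cos\theta_2 + \sin\theta_2\cos\theta_1}{2\cos\theta_1\cos\theta_2}
= \frac{\sin\left(\frac{\theta_1+\theta_2}{2}\right)\sin\left(\frac{\theta_1-\theta_2}{2}\right)}
       {\cos\left(\frac{\theta_1+\theta_2}{2}\right)\sin\left(\frac{\theta_1-\theta_2}{2}\right)}.
\]
Assuming that $\theta_1 \ne \theta_2$, because we have already dealt with this solution, gives
\[
\cos^2\left(\frac{\theta_1+\theta_2}{2}\right) = \cos\theta_1\cos\theta_2
\]
which simplifies to $1 = \cos\theta_1\cos\theta_2 + \sin\theta_1\sin\theta_2 = \cos(\theta_1 - \theta_2)$, the only solution of
which is $\theta_1 = \theta_2 + 2n\pi$.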

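(a) The energy of the hanging mass is $Mgy(\xi)$ and the energies of the two portions of
the cable either side of it are
\[
E_1[y] = \rho g\int_0^{\xi} dx\, y\sqrt{1+y'^2} \quad\text{and}\quad E_2[y] = \rho g\int_{\xi}^{a} dx\, y\sqrt{1+y'^2}.
\]
Since the total energy is the sum of these three components we obtain the given result.
The constraints are just the lengths along each portion of the cable.
(b) If $\rho g\lambda_1$ and $\rho g\lambda_2$ are the Lagrange multipliers the modified functional is
\[
\overline{E}[y] = Mgy(\xi) + \rho g\int_0^{\xi} dx\,(y_1 - \lambda_1)\sqrt{1+y_1'^2}
                + \rho g\int_{\xi}^{a} dx\,(y_2 - \lambda_2)\sqrt{1+y_2'^2},
\]
where
\[
y(x) = \begin{cases} y_1(x), & 0 \le x \le \xi, \\ y_2(x), & \xi \le x \le a, \end{cases}
\quad\text{with}\quad y_1(\xi) = y_2(\xi).
\]
Now evaluate the functional on the varied path $y + \epsilon h$, using the method described in
section 9.5.2. The corner moves to the point $(\xi + \epsilon u,\ y(\xi) + \epsilon v)$, where $u$ and $v$ are
independent variables. Thus
\[
\overline{E}[y + \epsilon h] = Mg\bigl(y(\xi) + \epsilon v\bigr)
 + \rho g\int_0^{\xi+\epsilon u} dx\, F(y_1 + \epsilon h_1,\ y_1' + \epsilon h_1')
 + \rho g\int_{\xi+\epsilon u}^{a} dx\, F(y_2 + \epsilon h_2,\ y_2' + \epsilon h_2')
\tag{11.123}
\]
where $F(y, y') = (y - \lambda)\sqrt{1+y'^2}$, with $\lambda = \lambda_1$ or $\lambda_2$ as appropriate.
We have, as in section 9.5.2 (but with notation changes)
\[
\begin{aligned}
y(\xi) + \epsilon v &= y_k(\xi + \epsilon u) + \epsilon h_k(\xi + \epsilon u), \quad k = 1 \text{ and } 2, \\
                    &= y_k(\xi) + \epsilon u\,y_k'(\xi) + \epsilon h_k(\xi) + O(\epsilon^2).
\end{aligned}
\]
Differentiate equation 11.123 with respect to $\epsilon$ and then set $\epsilon = 0$ to obtain the Gateaux
differential,
\[
\Delta\overline{E}[y] = Mgv + \rho g u\Bigl[F(y_1, y_1') - F(y_2, y_2')\Bigr]_{x=\xi}
 + \rho g\int_0^{\xi} dx\,\bigl(h_1F_{y_1} + h_1'F_{y_1'}\bigr)
 + \rho g\int_{\xi}^{a} dx\,\bigl(h_2F_{y_2} + h_2'F_{y_2'}\bigr).
\]
The usual integration by parts gives, since $h_1(0) = h_2(a) = 0$,
\[
\begin{aligned}
\Delta\overline{E}[y] = {}& Mgv + \rho g u\Bigl[F(y_1, y_1') - F(y_2, y_2')\Bigr]_{x=\xi}
 + \rho g\,h_1F_{y_1'}\Big|_{x=\xi} - \rho g\,h_2F_{y_2'}\Big|_{x=\xi} \\
 & + \rho g\int_0^{\xi} dx\, h_1\left(F_{y_1} - \frac{d}{dx}F_{y_1'}\right)
   + \rho g\int_{\xi}^{a} dx\, h_2\left(F_{y_2} - \frac{d}{dx}F_{y_2'}\right).
\end{aligned}
\]
First consider the subset of variations for which $u = h(\xi) = 0$, to obtain the Euler-Lagrange
equations satisfied by $y_1$ and $y_2$:
\[
\frac{d}{dx}F_{y_k'} - F_{y_k} = 0, \quad y_1(0) = B, \quad y_2(a) = A.
\]
Since $F = (y - \lambda)\sqrt{1+y'^2}$ is independent of $x$ we have the first integrals
\[
\frac{y_k - \lambda_k}{\sqrt{1+y_k'^2}} = c_k = \text{constant}, \quad k = 1 \text{ and } 2.
\tag{11.124}
\]
The general solutions of these equations are
\[
y_k = \lambda_k + c_k\cosh\frac{x - d_k}{c_k}, \quad k = 1 \text{ and } 2,
\tag{11.125}
\]
where $(c_1, c_2, d_1, d_2)$ are constants to be determined.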

(c) Now we need Weierstrass-Erdmann conditions at $x = \xi$. From the expansion of $y(\xi) + \epsilon v$ above we
have $v = u\,y_k'(\xi) + h_k(\xi)$, $k = 1, 2$, to replace $h_k(\xi)$. Thus
\[
\Delta\overline{E}[y] = Mgv + \rho g u\Bigl[F(y_1, y_1') - F(y_2, y_2')\Bigr]_{x=\xi}
 + \rho g\Bigl[(v - uy_1')F_{y_1'} - (v - uy_2')F_{y_2'}\Bigr]_{x=\xi}.
\]
Collecting the coefficients of $u$ and $v$ together gives
\[
\Delta\overline{E}[y] = gv\Bigl(M - \rho\bigl(F_{y_2'} - F_{y_1'}\bigr)\Bigr)
 + \rho g u\Bigl\{\bigl(F(y_1, y_1') - y_1'F_{y_1'}\bigr) - \bigl(F(y_2, y_2') - y_2'F_{y_2'}\bigr)\Bigr\}.
\]
This expression must be zero for all $u$ and $v$ and hence we have the conditions
\[
M = \rho\bigl(F_{y_2'} - F_{y_1'}\bigr) \quad\text{and}\quad
\lim_{x\to\xi+}\bigl(F - y'F_{y'}\bigr) = \lim_{x\to\xi-}\bigl(F - y'F_{y'}\bigr).
\tag{11.126}
\]
The first of these equations represents the resolution of forces in the vertical direction
at $x = \xi$: the second equation is the resolution of forces in the horizontal direction.
In addition the first integral, equation 11.124, represents the fact that the horizontal
component of the tension in the cable is constant. Since $F - y'F_{y'}$ is $c_1$ or $c_2$ we see
that the second of these conditions gives $c_1 = c_2 = c$. Using the actual expression for
$F$ the first condition becomes
\[
M = \rho\left\{\frac{(y_2 - \lambda_2)y_2'}{\sqrt{1+y_2'^2}} - \frac{(y_1 - \lambda_1)y_1'}{\sqrt{1+y_1'^2}}\right\}
  = \rho c\,(y_2' - y_1').
\tag{11.127}
\]
Now we have sufficient conditions to solve the problem, as may be seen by substituting
the solutions 11.125 into these equations.
First we have the length constraints
\[
L_1 = \int_0^{\xi} dx\,\sqrt{1+y_1'^2} = c\left(\sinh\frac{\xi - d_1}{c} + \sinh\frac{d_1}{c}\right)
\tag{11.128}
\]
\[
L_2 = \int_{\xi}^{a} dx\,\sqrt{1+y_2'^2} = c\left(\sinh\frac{a - d_2}{c} - \sinh\frac{\xi - d_2}{c}\right).
\tag{11.129}
\]
The boundary conditions give
\[
B = \lambda_1 + c\cosh\frac{d_1}{c} \quad\text{and}\quad A = \lambda_2 + c\cosh\frac{a - d_2}{c}.
\tag{11.130}
\]
Equation 11.127 gives
\[
M = \rho c\left(\sinh\frac{\xi - d_2}{c} - \sinh\frac{\xi - d_1}{c}\right).
\tag{11.131}
\]
Finally the solution is continuous at $x = \xi$,
\[
\lambda_1 + c\cosh\frac{\xi - d_1}{c} = \lambda_2 + c\cosh\frac{\xi - d_2}{c}.
\tag{11.132}
\]
Thus we have six equations for the six constants $(\lambda_1, \lambda_2, \xi, c, d_1, d_2)$.

(a) Since $\dfrac{df}{dt} = f'(s)\dot{s}$ equation 11.42 becomes
\[
\lambda\dot{s}\,\frac{d^2x}{ds^2} + \dot{s}\,y'(s) = 0 \quad\text{and}\quad
\lambda\dot{s}\,\frac{d^2y}{ds^2} - \dot{s}\,x'(s) = 0.
\]
On putting $t = s$, so $\dot{s} = 1$, these integrate to
\[
\lambda x' + y = \alpha \quad\text{and}\quad \lambda y' - x = \beta.
\]
Differentiate the second with respect to $s$ and use the first to substitute for $x'$ to obtain
$\lambda^2y'' + y = \alpha$ which has the general solution $y = \alpha + a\cos(s/\lambda + \phi)$, where $a$ and $\phi$ are
constants. From this we obtain $x = \lambda y' - \beta = -\beta - a\sin(s/\lambda + \phi)$.
(b) The curve is closed and has length $L$, that is $x(0) = x(L)$ and $y(0) = y(L)$, so
that $L/\lambda = 2\pi$. Further $(x + \beta)^2 + (y - \alpha)^2 = a^2$, so $a$ is the radius of the circle of
circumference $L$, that is $2\pi a = L$.

This is the dual of the problem dealt with in the text, so the stationary curve is a circle.
It is not necessary to do any calculations to prove this but here we provide the details.
The auxiliary functional is
Z 2 p

L[x, y] = dt x2 + y 2 (xy xy .
0 2
where $\lambda$ is the Lagrange multiplier. The Euler-Lagrange equations are

! !
d x d y
p + y = 0 and p x = 0.
dt x2 + y 2 dt x2 + y 2
Integrating these and using s, the arc length, for the independent variables gives as in
exercise 11.16
dx dy
= y and = + x,
ds ds
with solutions
x = a sin(s + ) and y = + a cos(s + )
where is a constant.

(a) If $2\lambda(x)$ is the Lagrange multiplier the auxiliary functional is
\[
\overline{S}[y, z] = \int_a^b dx\,\left(y'^2 + z'^2 - y^2 - 2\lambda(z - y')\right)
\]
so the Euler-Lagrange equations for $y$ and $z$, respectively, are
\[
\frac{d}{dx}(y' + \lambda) + y = 0 \quad\text{and}\quad \frac{d}{dx}(z') + \lambda = 0,
\]
and the natural boundary condition is $z'(b) = 0$. These equations simplify to
\[
y'' + y + \lambda' = 0 \quad\text{and}\quad \lambda = -z'',
\]
so eliminating $z$ and $\lambda$, using the constraint $z = y'$, gives
\[
\frac{d^4y}{dx^4} - \frac{d^2y}{dx^2} - y = 0, \quad y(a) = A_1,\ z(a) = A_2,\ y(b) = B_1,\ z'(b) = y''(b) = 0.
\]
(b) The Euler-Lagrange equation for the functional $J[y]$ is given using the general result
given in section 9.2.1, but see also exercise 3.34 (page 141),
\[
\frac{d^2}{dx^2}\left(\frac{\partial F}{\partial y''}\right) - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) + \frac{\partial F}{\partial y} = 0,
\]
with $F = y''^2 + y'^2 - y^2$ this gives $y^{(4)} - y^{(2)} - y = 0$, with $y(a) = A_1$, $y'(a) = A_2$,
$y(b) = B_1$. The natural boundary condition for $y''$ is given by $F_{y''} = 0$, that is $y''(b) = 0$.

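Since $y = s$, $dy/ds = 1$ and the equation of motion is
\[
v\frac{dv}{ds} = -\kappa v^2 + g, \quad v(0) = 0.
\]
Integration gives
\[
\int_0^v dv\,\frac{v}{g - \kappa v^2} = s \quad\text{that is}\quad \Bigl[\ln(g - \kappa v^2)\Bigr]_0^v = -2\kappa s
\]
which simplifies to the quoted result.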

(a) The Euler-Lagrange equation for is
p
d 1 0
v = + ( )R (v) x0 2 + y 0 2 .
d v2
p
Since (1) = 0, 0 (1) = x0 2 + y 0 2 /v 3 > 0.
(b) If (1 ) = 0 then by the same arguments as used above 0 (1 ) > 0.

Thus ( ) can only increase through a zero and, since is continuous there cannot be
adjacent zeros in the interval [0, 1]. Since (1) = 0 and 0 (1) > 0 we must have ( ) < 0
for 0 < 1.
Since H(, v) = v 1 ( )R(v) and R > 0 it follows that H(, v) > 0.

(a) When $R = 0$ equation 11.83 becomes
\[
y = A - \frac{1}{g}\int_{v_0}^{v} dv\, v \quad\text{that is}\quad g(A - y) = \frac{1}{2}v^2 - \frac{1}{2}v_0^2.
\]
Multiplying by the mass gives the energy equation: the left-hand side is the loss in
potential energy as the particle falls through a distance $A - y$; the right-hand side is
the gain in kinetic energy.
(b) If R = 0
2 g2

2 2 2 1 2 2
f (v) = + g
v2 v2
p
2 + 2

1 v
= g 2 2 + 2 2
hence =
v2 f (v) g 1 2 v 2
so that equation 11.81 for x(v) becomes
v v2
Z
x(v) = dv p .
g v0 1 (v)2
(c) Putting av = sin and v0 = 0 gives, with g + > 0,

1 1
Z
x(v) = d sin2 = (2 sin 2) .
2 g 0 42 g
The energy equation, found in part (a), gives
1 1
y =A sin2 = A 2 (1 cos 2) .
22 g 4 g
Putting c2 = 1/(22 g) gives the required result.
(d) Equation 11.76 becomes, with R = 0,

f (v) 1 2 v 2
g = p =
g 2 + 2 v
cos
= hence g = .
sin tan
At the terminus = 0, so = / tan b . Since tan > 0 for (0, /2) and tan < 0
for (/2, ) the result follows.
If b = /2 the cycloid is tangent to the x-axis at the terminus. If b > /2 it crosses
the x-axis and reaches a point lower than the end point. Thus type A motion has
b < /2 and type B motion has b > /2.

In this example we are interested in the limit where the gravitational force is negligible
by comparison to the resistive force, so R > g. The quadratic equation 11.74 for is
therefore most conveniently written in the form,

R 1
2 R2 g 2 2 g 2 + 2 2 = 0.

v v
p
At the terminus (1) = 0 and 1/Vt = 2 + 2 and we assume that v(t) > Vt , so
v 2 Vt2 < 0 throughout the motion and the third term of this quadratic is negative.
The solution is
s 2
2 2
R R 2 2 2 2
1
R g = g g + (R g ) + 2
g g v
and using the condition v = Vt when = 0 we see that the lower sign gives the required
solution. Hence
2 2
R f (v)
R g g = p
g 2 + 2
where f (v) is defined in the question. Then, as in the text,
2
1
H 2 = 2 + (g )2 = R = 2 + (g )2
v
and
dH d d d
H = g(g ) and H Hv R = ( g)
dv dv dv dv
so that
d 1 d f (v)
HHv = R R g( g) = p
dv v dv 2 + 2
As in the text, equation 11.77
dx v d dy ( g)v d
= and =
dv HHv dv d HHv dv
and hence
dx p v dy p v
= 2 + 2 and = ( g) 2 + 2 .
dv f (v) dv f (v)
If g = 0, f = (2 + 2 )R and since v decreases along the path

Z v0 Z v0
p v p v
x(v) = 2 + 2 dv and y = A 2 + 2 dv = A x.
v R(v) v R(v)
With / = A/b this gives the straight line y = A(1x/b) through the terminal points.

If $\tan\theta$ is the gradient of the wire, $\theta < 0$ and
\[
\sin\theta = -\frac{A}{\sqrt{A^2+b^2}}, \quad \cos\theta = \frac{b}{\sqrt{A^2+b^2}} \quad\text{and}\quad \dot\theta = 0
\]
so the equation of motion 11.100 becomes
\[
\frac{dv}{dt} = \frac{g(A - \mu b)}{\sqrt{A^2+b^2}}.
\]
If $A > \mu b$, $\dot{v} > 0$ and the bead reaches the end at $x = b$. If $A < \mu b$ the bead decelerates
and it reaches the end only if the initial speed is sufficiently large: in this case the
equation of motion is valid only until $v(t) = 0$.
Integration gives
\[
\frac{ds}{dt} = v(t) = v_0 + \frac{g(A - \mu b)t}{\sqrt{A^2+b^2}} \quad\text{and}\quad
s = v_0t + \frac{g(A - \mu b)t^2}{2\sqrt{A^2+b^2}},
\]
where $s$ is the distance travelled along the wire. If $A > \mu b$ the end is reached when
$s = \sqrt{A^2+b^2}$, that is at the positive root of
\[
\frac{g(A - \mu b)}{2\sqrt{A^2+b^2}}\,t^2 + v_0t - \sqrt{A^2+b^2} = 0
\]
that is
\[
t = \frac{\sqrt{v_0^2 + 2g(A - \mu b)} - v_0}{g(A - \mu b)}\,\sqrt{A^2+b^2}
  = \frac{2\sqrt{A^2+b^2}}{v_0 + \sqrt{v_0^2 + 2g(A - \mu b)}}.
\]
If $A < \mu b$ the above expression for $s(t)$ is valid only for $t < t_0$ where $v(t_0) = 0$: for $t \ge t_0$
the bead is stationary. The equation for the time to reach the end, $s = \sqrt{A^2+b^2}$, is the
same but now both roots are positive and only one satisfies $t < t_0$ and this gives the
above expression for $t$. If $v_0^2 < 2g(\mu b - A)$ this time is complex and the bead does not
reach the point $x = b$.
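The closed-form time derived above can be evaluated directly. The following Python sketch is illustrative only: the function name `time_to_end`, the default value of `g` and the convention of returning `None` when the bead stops short are assumptions made here, and the friction coefficient is written `mu`. It assumes $v_0 \ge 0$.

```python
import math

def time_to_end(A, b, mu, v0, g=9.81):
    """Time for the bead to slide the full length sqrt(A^2 + b^2) of the wire.

    Uses t = 2*sqrt(A^2 + b^2) / (v0 + sqrt(v0^2 + 2*g*(A - mu*b))).
    Returns None when the bead stops before reaching x = b.
    """
    length = math.hypot(A, b)
    disc = v0 * v0 + 2.0 * g * (A - mu * b)
    if disc < 0.0:
        return None               # v0^2 < 2g(mu*b - A): the time would be complex
    denom = v0 + math.sqrt(disc)
    if denom == 0.0:
        return None               # bead at rest with no net force along the wire
    return 2.0 * length / denom
```

The rationalised form of the root is used because it remains well behaved when $A - \mu b$ is close to zero, where the first form divides one small quantity by another.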

(a) There are two ways of doing this. The easiest is by using elementary geometry. The
harder method is to note that
\[
\tan\theta = \frac{dy}{dx} = \frac{dy}{d\phi}\Big/\frac{dx}{d\phi} = -\frac{\cos\phi}{\sin\phi}
\quad\text{hence}\quad \tan\theta\tan\phi = -1,
\]
so that $\tan(\theta - \phi) = \infty$. This equation has many solutions, but $\theta = -\pi/2$ when $\phi = 0$,
so the appropriate solution is $\theta = \phi - \pi/2$.
(b) Multiply equation 11.100 by $dt/d\theta$ to obtain
\[
v\frac{dv}{d\theta} + \mu v^2 + g\left(\frac{dy}{d\theta} + \mu\frac{dx}{d\theta}\right) = 0,
\]
and hence
\[
v\frac{dv}{d\phi} + \mu v^2 = gR(\cos\phi - \mu\sin\phi).
\]
(c) Observe that the equation of motion is homogeneous of degree two in $v$, which
suggests using the variable $w$ defined by $v^2 = gRw^2$: the equation for $w$ is
\[
w\frac{dw}{d\phi} + \mu w^2 = \cos\phi - \mu\sin\phi, \quad w(0) = 0.
\]
The solution of this equation, $w(\phi, \mu)$, depends only upon $\phi$ and $\mu$, so the condition
$w(\pi/2, \mu) = 0$ gives an equation involving $\mu$ only.
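(d) Write the equation for $v$ in the form
\[
e^{-2\mu\phi}\frac{d}{d\phi}\left(\frac{1}{2}v^2e^{2\mu\phi}\right) = gR(\cos\phi - \mu\sin\phi)
\]
and integrate to give
\[
\frac{1}{2}v^2e^{2\mu\phi} = gR\int_0^{\phi} d\phi\, e^{2\mu\phi}(\cos\phi - \mu\sin\phi).
\]
But
\[
\int dx\, e^{ax+ibx} = \frac{1}{a+ib}\,e^{ax+ibx}
 = \frac{e^{ax}}{a^2+b^2}\Bigl(a\cos bx + b\sin bx + i(a\sin bx - b\cos bx)\Bigr)
\]
so that
\[
\int d\phi\, e^{2\mu\phi}(\cos\phi - \mu\sin\phi) = \frac{e^{2\mu\phi}}{1+4\mu^2}\Bigl(3\mu\cos\phi + (1 - 2\mu^2)\sin\phi\Bigr)
\]
and hence
\[
\frac{1}{2}v^2 = \frac{gR}{1+4\mu^2}\Bigl[3\mu\cos\phi + (1 - 2\mu^2)\sin\phi - 3\mu e^{-2\mu\phi}\Bigr]
\]
so that $v(\pi/2) = 0$ if $2\mu^2 + 3\mu e^{-\mu\pi} = 1$.
The speed is zero when $\phi$ and $\mu$ satisfy
\[
g(\phi, \mu) = 3\mu\cos\phi + (1 - 2\mu^2)\sin\phi - 3\mu e^{-2\mu\phi} = 0.
\]
This equation defines a function $\phi(\mu)$, the angle at which the bead stops. If $\mu > \mu_1$,
where $\mu_1$ is the solution of this equation when $\phi = \pi/2$, the implicit function theorem
gives the rate of change $\phi'(\mu_1)$, $d\phi/d\mu = -g_\mu/g_\phi$, with the derivatives evaluated at
$\phi = \pi/2$ and $\mu = \mu_1$. Thus
\[
\phi'(\mu_1) = -\frac{4\mu_1 + 3(1 - \mu_1\pi)e^{-\mu_1\pi}}{3\mu_1\left(1 - 2\mu_1e^{-\mu_1\pi}\right)} = -1.36
\]
where we have used the result $\mu_1 = 0.603$. As expected $\phi(\mu)$ decreases as $\mu$ increases
past $\mu_1$.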
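At the start $\theta = -\pi/2$ and $h(-\pi/2) = 1$. At the terminus $\theta = \theta_1$ and since $C = \tan\theta_1$,
\[
h(\theta_1) = 1 + 2\mu\sin\theta_1\cos\theta_1 - 2\mu\tan\theta_1\cos^2\theta_1 = 1.
\]
The derivative is $h'(\theta) = 2\mu(\cos 2\theta + C\sin 2\theta)$ and $h'(\theta) = 0$ when $\cos 2\theta = -\tan\theta_1\sin 2\theta$,
that is $\tan\theta_1\tan 2\theta = -1$. But
\[
\tan(2\theta - \theta_1) = \frac{\tan 2\theta - \tan\theta_1}{1 + \tan\theta_1\tan 2\theta}
\]
so the stationary points are at $2\theta - \theta_1 = \pm\pi/2, \pm 3\pi/2, \ldots$. But $-\pi/2 \le \theta \le \theta_1 < \pi/2$
and the only physically significant solution is $2\theta - \theta_1 = -\pi/2$.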

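If $\mu = 0$, $h = 1$ and
\[
x(\theta) = \frac{1}{2gB^2}\int_{-\pi/2}^{\theta} d\theta\,(1 + \cos 2\theta) \quad\text{and}\quad
y(\theta) = A + \frac{1}{2gB^2}\int_{-\pi/2}^{\theta} d\theta\,\sin 2\theta.
\]
Putting $\theta = -\pi/2 + \phi$ gives
\[
\begin{aligned}
x(\phi) &= \frac{1}{2gB^2}\int_0^{\phi} d\phi\,(1 - \cos 2\phi) = \frac{1}{4gB^2}(2\phi - \sin 2\phi), \\
y(\phi) &= A - \frac{1}{2gB^2}\int_0^{\phi} d\phi\,\sin 2\phi = A - \frac{1}{4gB^2}(1 - \cos 2\phi).
\end{aligned}
\]
At the terminus $x = b$, $y = 0$ and put $\phi = \phi_1$ so
\[
4gB^2b = 2\phi_1 - \sin 2\phi_1 \quad\text{and}\quad 4gB^2A = 1 - \cos 2\phi_1
\]
and division gives the required equation for $\phi_1$.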
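The equation obtained by division, $(2\phi_1 - \sin 2\phi_1)/(1 - \cos 2\phi_1) = b/A$, is transcendental, so in practice $\phi_1$ is found numerically. A minimal Python sketch follows, assuming $A, b > 0$ and a root in $(0, \pi)$; the names `solve_phi1` and `cycloid_scale` are hypothetical helpers introduced for this illustration.

```python
import math

def solve_phi1(A, b, tol=1e-12):
    """Root in (0, pi) of A*(2*phi - sin 2phi) = b*(1 - cos 2phi), by bisection."""
    def F(phi):
        return (A * (2.0 * phi - math.sin(2.0 * phi))
                - b * (1.0 - math.cos(2.0 * phi)))

    lo, hi = 1e-6, math.pi       # F(lo) < 0 < F(hi) for A, b > 0 and b/A not tiny
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def cycloid_scale(A, b):
    """Return (phi1, 1/(4 g B^2)) using 4 g B^2 A = 1 - cos(2 phi1)."""
    phi1 = solve_phi1(A, b)
    return phi1, A / (1.0 - math.cos(2.0 * phi1))
```

The cross-multiplied form is used so that no division by $1 - \cos 2\phi$ occurs near $\phi = 0$.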


If $\lambda$ is the Lagrange multiplier the functional is
\[
\overline{S}[y] = \int_0^{\pi} dx\,\left(y'^2 - \lambda y^2\right), \quad y(0) = y(\pi) = 0,
\]
and the associated Euler-Lagrange equation is $y'' + \lambda y = 0$. If $\lambda \le 0$ there are no
non-trivial solutions that satisfy the boundary conditions. Thus we set $\lambda = \omega^2$, to give the solution
$y = A\sin\omega x$ that fits the boundary condition at $x = 0$. The condition at $x = \pi$ then
gives $\omega = n = 1, 2, 3, \ldots$, so there are infinitely many solutions. The constraint gives
\[
1 = A^2\int_0^{\pi} dx\,\sin^2 nx = \frac{1}{2}\pi A^2 \quad\text{giving}\quad A = \sqrt{\frac{2}{\pi}},
\]
and the Lagrange multiplier $\lambda = n^2$.
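As a numerical cross-check of this result, the sketch below evaluates the constraint integral $\int_0^\pi y^2\,dx$ and the integral $\int_0^\pi y'^2\,dx$ for $y = \sqrt{2/\pi}\sin nx$ using a composite Simpson rule. The helper names `simpson` and `check_mode` are assumptions of this illustration; the expectation is that the first integral is 1 and the second equals $n^2$, the value of the Lagrange multiplier.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def check_mode(n):
    """Return (integral of y^2, integral of y'^2) for y = sqrt(2/pi) sin(n x)."""
    amp = math.sqrt(2.0 / math.pi)
    norm = simpson(lambda x: (amp * math.sin(n * x)) ** 2, 0.0, math.pi)
    action = simpson(lambda x: (amp * n * math.cos(n * x)) ** 2, 0.0, math.pi)
    return norm, action
```

For example, `check_mode(3)` should return values close to 1 and 9 respectively.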

(a) If $\lambda$ is the Lagrange multiplier the functional is
\[
\overline{S}[y] = \int_a^b dx\,\left(py'^2 - (q + \lambda w)y^2\right), \quad y(a) = y(b) = 0,
\]
with associated Euler-Lagrange equation
\[
\frac{d}{dx}\left(p\frac{dy}{dx}\right) + (q + \lambda w)y = 0, \quad y(a) = y(b) = 0.
\]
(b) If the constraint is $\displaystyle\int_a^b dx\, w(x)f(y) = 1$ the functional becomes
\[
\overline{S}[y] = \int_a^b dx\,\left(py'^2 - qy^2 - \lambda wf(y)\right), \quad y(a) = y(b) = 0,
\]
with associated Euler-Lagrange equation
\[
\frac{d}{dx}\left(p\frac{dy}{dx}\right) + qy + \frac{1}{2}\lambda wf'(y) = 0, \quad y(a) = y(b) = 0.
\]
If $f(y)$ is linear in $y$, $f = y$, this equation becomes
\[
\frac{d}{dx}\left(p\frac{dy}{dx}\right) + qy = -\frac{1}{2}\lambda w, \quad y(a) = y(b) = 0,
\]
which is a linear inhomogeneous equation. Otherwise $f'(y)$ is not linear and the
Euler-Lagrange equation is a nonlinear equation.

If $\lambda$ is the Lagrange multiplier
\[
\overline{S}[y] = \int_0^1 dx\,\left(y^2 - \lambda y\right)
\]
and the Euler-Lagrange equation gives $y = \lambda/2$. The constraint then gives $\lambda = 2a$, so
$y = a$.

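If $\lambda_1$ and $\lambda_2$ are the Lagrange multipliers then
\[
\overline{P}[y] = \int_{-\infty}^{\infty} dx\,\left(y\ln y + \lambda_1y + \lambda_2x^2y\right),
\]
so the Euler-Lagrange equation is simply
\[
1 + \ln y + \lambda_1 + \lambda_2x^2 = 0 \quad\text{that is}\quad y = \exp\left(-1 - \lambda_1 - \lambda_2x^2\right).
\]
The first constraint gives
\[
1 = e^{-1-\lambda_1}\int_{-\infty}^{\infty} dx\, e^{-\lambda_2x^2} = e^{-1-\lambda_1}\sqrt{\frac{\pi}{\lambda_2}}
\]
and then the second constraint gives
\[
\sigma^2 = \sqrt{\frac{\lambda_2}{\pi}}\int_{-\infty}^{\infty} dx\, x^2e^{-\lambda_2x^2} = \frac{1}{2\lambda_2}.
\]
Hence
\[
y(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{x^2}{2\sigma^2}\right).
\]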
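A quick numerical check that the Gaussian satisfies both constraints can be sketched as follows, in Python. The integration range is truncated to $\pm 10\sigma$, an assumption made here because the tails beyond that are negligible; `simpson`, `gaussian` and `moments` are illustrative names, not taken from the text.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def gaussian(x, sigma):
    """y(x) = exp(-x^2 / (2 sigma^2)) / (sigma sqrt(2 pi))."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def moments(sigma, half_width=10.0):
    """Return (integral of y, integral of x^2 y) over [-half_width*sigma, half_width*sigma]."""
    lim = half_width * sigma
    m0 = simpson(lambda x: gaussian(x, sigma), -lim, lim)
    m2 = simpson(lambda x: x * x * gaussian(x, sigma), -lim, lim)
    return m0, m2
```

With `sigma = 1.5` the two returned values should be close to 1 and $\sigma^2 = 2.25$.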

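The auxiliary functional is
\[
\overline{S}[y] = \int_0^{\pi} dx\,\left(y'^2 - \lambda y\sin x\right), \quad y(0) = y(\pi) = 0
\]
where $\lambda$ is the Lagrange multiplier. The Euler-Lagrange equation is $2y'' + \lambda\sin x = 0$,
with the general solution
\[
y(x) = \frac{\lambda}{2}\sin x + Ax + B.
\]
The boundary condition at $x = 0$ gives $B = 0$ and that at $x = \pi$ gives $A = 0$. The
constraint then gives
\[
a = \frac{\lambda}{2}\int_0^{\pi} dx\,\sin^2x = \frac{\pi\lambda}{4} \quad\text{hence}\quad y(x) = \frac{2a}{\pi}\sin x.
\]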

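On the ellipse the relation between $\phi$, see figure 11.8, and $\theta$ is
\[
\tan\theta = \frac{dy}{dx} = -\frac{a\cos\phi}{b\sin\phi} = -\frac{a}{b\tan\phi}
\]
hence
\[
\frac{1}{\cos^2\theta}\frac{d\theta}{d\phi} = \frac{a}{b\sin^2\phi} \quad\text{that is}\quad
\frac{d\theta}{d\phi} = \frac{ab}{a^2\cos^2\phi + b^2\sin^2\phi}.
\]
Thus the equation of motion is
\[
v\frac{dv}{d\phi} + \frac{\mu abv^2}{a^2\cos^2\phi + b^2\sin^2\phi} = g(a\cos\phi - \mu b\sin\phi).
\]
If $z = v^2/2$ this gives
\[
\frac{dz}{d\phi} + \frac{2\mu abz}{a^2\cos^2\phi + b^2\sin^2\phi} = g(a\cos\phi - \mu b\sin\phi).
\]
Now define a function $f(\phi)$ by
\[
\ln f = 2\mu ab\int_0^{\phi} dw\,\frac{1}{a^2\cos^2w + b^2\sin^2w} = 2\mu\tan^{-1}\left(\frac{b}{a}\tan\phi\right),
\]
so the equation can be written in the form
\[
\frac{d(zf)}{d\phi} = g(a\cos\phi - \mu b\sin\phi)f(\phi), \quad z(0) = 0,
\]
and integration gives
\[
z(\phi) = \frac{g}{f(\phi)}\int_0^{\phi} dw\,(a\cos w - \mu b\sin w)f(w)
\]
which is the required result. If $z(\pi/2) = 0$ the equation for $\mu$, with $\beta = b/a$, is
\[
\int_0^{\pi/2} dw\,(\cos w - \mu\beta\sin w)f(w) = 0.
\]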
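The final condition determines the friction coefficient for a given axis ratio only implicitly. A rough numerical sketch in Python is given below, taking $f(w) = \exp\bigl(2\mu\tan^{-1}(\beta\tan w)\bigr)$ with $\beta = b/a$ and writing the friction coefficient as `mu`. The names `ellipse_condition` and `solve_mu`, the bracket $[0, 2]$, and the use of Simpson quadrature with bisection are all assumptions of this illustration. For $\beta = 1$ the ellipse is a circle, so the root should reproduce the circular-wire value close to 0.603.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def ellipse_condition(mu, beta):
    """Integral over [0, pi/2] of (cos w - mu*beta*sin w) * f(w)."""
    def integrand(w):
        # atan2 form of atan(beta * tan w); it stays finite at w = pi/2
        angle = math.atan2(beta * math.sin(w), math.cos(w))
        return (math.cos(w) - mu * beta * math.sin(w)) * math.exp(2.0 * mu * angle)
    return simpson(integrand, 0.0, 0.5 * math.pi)

def solve_mu(beta, lo=0.0, hi=2.0, tol=1e-10):
    """Bisection for the root in mu; the bracket is assumed and checked on entry."""
    if ellipse_condition(lo, beta) <= 0.0 or ellipse_condition(hi, beta) >= 0.0:
        raise ValueError("bracket does not straddle a root for this beta")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ellipse_condition(mid, beta) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Other values of `beta` may need a different bracket, which is why the function raises an error rather than silently returning an end point.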
Chapter 12
Sturm-Liouville systems
12.1 Introduction
The general theory of Sturm-Liouville systems presented in the first part of this chapter
was created in a series of articles in 1836 and 1837 by Sturm (1803–1855) and Liouville
(1809–1882): their work, later known as Sturm-Liouville theory, created a new subject
in mathematical analysis. The theory deals with the general linear, second-order
differential equation
\[
\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + \bigl(q(x) + \lambda w(x)\bigr)y = 0
\tag{12.1}
\]
where the real variable, $x$, is confined to an interval, $a \le x \le b$, which may be the whole
real line or just $x \ge 0$. The functions $p(x)$, $q(x)$ and $w(x)$ are real and satisfy certain,
not very restrictive, conditions that will be delineated in section 12.4; in any particular
problem these functions are known. A second-order differential equation is said to be
in self-adjoint form when expressed as in equation 12.1: most second-order equations
can be expressed in this form, see exercise 12.1.
In addition to the differential equation, boundary conditions are specified with the
consequence that solutions exist for only particular values of the constant $\lambda = \lambda_k$,
$k = 1, 2, \ldots$, which are named$^1$ eigenvalues: the solution $y_k(x)$ is named the eigenfunction
for the eigenvalue$^2$ $\lambda_k$. At this stage we shall not specify any boundary conditions,
despite their importance, because different types of problems produce different types of
conditions. Equation 12.1, together with any necessary boundary conditions, is known
as a Sturm-Liouville system, or problem, which belongs to the class of problems known
as eigenvalue problems.
Sturm-Liouville problems are important partly because they arise in diverse circumstances
and partly because the properties of the eigenvalues and eigenfunctions are
well understood. Moreover, the behaviour of both the eigenvalues and eigenfunctions
of a wide class of Sturm-Liouville systems is remarkably similar and is independent
of the particular form of the functions $p(x)$, $q(x)$ and $w(x)$. In this class of problems
there is always a countable infinity of real eigenvalues $\lambda_k$, $k = 1, 2, \ldots$, and the set of
eigenfunctions $y_k(x)$, $k = 1, 2, \ldots$, is complete, meaning that these functions may be
used to form generalised Fourier series, as described in section 12.3. Further, there are
simple approximations for both the eigenvalues and eigenfunctions which are accurate
for large $k$, as shown in exercise 12.35 (page 512).
$^1$ The fact that we use the same symbol for the eigenvalue and the Lagrange multiplier introduced
in chapter 11, is not a coincidence, as is seen by comparing equation 12.1 with the equation derived in
exercise 11.28, page 453.
$^2$ There are also important examples where the eigenvalues can take any real number in an interval
(which may be infinite), and there are examples in which the eigenvalues can be both discrete and
continuous. Such problems are common and important in quantum mechanics. In this course we deal
only with discrete sets of eigenvalues.
The achievements of Sturm and Liouville are more impressive when seen in the
context of early nineteenth century mathematics. Prior to 1820 work on differential
equations was concerned with finding solutions in terms of finite formulae or power
series; but for the general equation 12.1 Sturm could not find an expression for the
solution and instead obtained information about the properties of the solution from
the equation itself. This was the first qualitative theory of differential equations and
anticipated Poincaré's work on nonlinear differential equations developed at the end
of that century. Today the work of Sturm and Liouville is intimately interconnected:
however, though lifelong friends who discussed their work prior to publication, this
theory emerged from a series of articles published separately by each author during
the period 1829 to 1840. More details of this history may be found in Lützen (1990,
chapter 10).
This chapter introduces the basic theory of Sturm-Liouville systems and shows that
variational principles are useful for the approximation of eigenvalues and eigenfunctions.
Traditional Sturm-Liouville theory does not depend upon the calculus of variations, but
stems from the theory of ordinary linear differential equations which is introduced in
section 12.5.
The Sturm-Liouville eigenvalue problem is, however, readily formulated as a con-
strained variational principle, and this formulation can be used to approximate the
solutions. The crucial property of Sturm-Liouville systems that makes this method so
useful and important is their linearity, which means that the associated functional is
quadratic in y. Besides allowing convenient approximations many general properties
of the eigenvalues can be derived using the variational principle. Some aspects of this
theory are presented in section 12.6.
Sturm-Liouville systems are important because they arise in attempts to solve the
linear, partial differential equations that describe a wide variety of physical problems.
In addition most of the special functions that are so useful in mathematical physics,
and the study of which led to advances in analysis in the 19th century, originate in
Sturm-Liouville equations. The importance of these functions should not be under-
estimated, as is frequent in this age of computing, for they furnish useful solutions to
many physical problems and can lead to a broader understanding than purely numerical
solutions. Further, the mathematics associated with these functions is elegant and its
study rewarding. There is no time in this course for any discussion of these functions,
but aspects of the important Bessel function are described in section 12.3.1.
Section 12.2 therefore briefly describes how Sturm-Liouville systems occur and gives
some idea of the variety of types of Sturm-Liouville problems that need to be tackled.
This section is optional, but recommended.
In section 12.3 we consider a particularly simple, solvable, Sturm-Liouville system
and examine the properties of its eigenvalues and eigenfunctions in order to illustrate
all the relevant properties of more general systems, which normally cannot be solved in
terms of elementary functions. Some of these properties depend on elementary properties
of second-order differential equations; this theory is described in section 12.5.
Other properties are endowed on the eigenvalues and eigenfunctions because the canonical
form of equation 12.1 is self-adjoint, a term defined in section 12.5.3.
The canonical form of equation 12.1 may seem rather special and to be unrepresen-
tative of most linear second-order differential equations: in fact, as shown in the next
exercise, this equation is typical of a large class of such equations.
Equation 12.1 can be cast into a variety of other forms which are useful in the
following discussion. Additionally this equation, with appropriate boundary conditions,
is the Euler-Lagrange equation of a constrained variational problem, with $\lambda$ as the
Lagrange multiplier, and this is crucial for the later developments in section 12.6. The
following exercises lead you through this background and we recommend that you do
these exercises.
Exercise 12.1
Consider the second-order, homogeneous, linear differential equation
\[
a_2(x)\frac{d^2y}{dx^2} + a_1(x)\frac{dy}{dx} + a_0(x)y = 0.
\]
(a) Show that it may be put in the canonical form
\[
\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + q(x)y = 0
\tag{12.2}
\]
where
\[
p(x) = \exp\left(\int dx\,\frac{a_1(x)}{a_2(x)}\right) \quad\text{and}\quad q(x) = \frac{a_0(x)}{a_2(x)}\,p(x).
\]
This transformation shows that most linear, second-order, homogeneous differential
equations may be cast into the self-adjoint form of equation 12.1.
(b) By putting $y = uv$, with a judicious choice of the function $v(x)$, show that
equation 12.2 may be cast into the form
\[
\frac{d^2u}{dx^2} + I(x)u = 0, \quad u = y\sqrt{p},
\tag{12.3}
\]
where
\[
I(x) = \frac{1}{4p^2}\left(p'^2 + 4qp - 2pp''\right).
\]
Equation 12.3 is sometimes known as the normal form and $I(x)$ the invariant of the original equation.
Exercise 12.2
(a) Show that the Euler-Lagrange equation for the functional and constraint
\[
S[y] = \int_a^b dx\,\left(py'^2 - qy^2\right), \quad C[y] = \int_a^b dx\, w(x)y^2 = 1,
\]
with admissible functions satisfying $y(a) = y(b) = 0$, is
\[
\frac{d}{dx}\left(p\frac{dy}{dx}\right) + (q + \lambda w)y = 0, \quad y(a) = y(b) = 0.
\]
(b) Define a new independent variable $\xi$ by $\displaystyle\xi = \int_a^x \frac{du}{p(u)}$ to show that this
Euler-Lagrange equation is transformed into
\[
\frac{d^2y}{d\xi^2} + p(q + \lambda w)y = 0.
\]
(c) By putting $y = uv$ and by choosing $v$ carefully, show that the original functional
and constraint can be written in the form
\[
S[u] = \int_a^b dx\,\left(u'^2 - \frac{1}{4p^2}\left(p'^2 + 4pq - 2pp''\right)u^2\right), \quad
C[u] = \int_a^b dx\,\frac{w}{p}\,u^2,
\]
where $u(a) = u(b) = 0$. Hence derive the Euler-Lagrange equation for $u$ and
compare this with equation 12.3.
Exercise 12.3
Liouville's normal form:
\[
S[y] = \int_a^b dx\,\left(p(x)y'^2 - (q + \lambda w)y^2\right).
\]
(a) Change the independent variable to $\xi = \xi(x)$ and the dependent variable to
$v(\xi)$ where $y = A(\xi)v(\xi)$. With a suitable choice of $\xi(x)$ show that the functional
can be written in the form
\[
S[v] = \frac{1}{2}\Bigl[p\,\xi'(x)\left(A^2\right)'v^2\Bigr]_c^d
     + \int_c^d d\xi\,\left(\left(\frac{dv}{d\xi}\right)^2 - F(\xi)v^2\right),
\]
where $\dfrac{d\xi}{dx} = \dfrac{1}{pA^2}$, $c = \xi(a)$, $d = \xi(b)$ and
\[
F(\xi) = (q + \lambda w)pA^4 - A\frac{d^2}{d\xi^2}\left(\frac{1}{A}\right).
\]
(b) By defining $A = (wp)^{-1/4}$, show that $\xi'(x) = \sqrt{w/p}$ and the associated
Euler-Lagrange equation is
\[
\frac{d^2v}{d\xi^2} + \left(\frac{q}{w} - A\frac{d^2}{d\xi^2}\left(\frac{1}{A}\right) + \lambda\right)v = 0.
\]
This transformation is sometimes named Liouville's transformation, and is particularly
useful for approximating the eigenvalues and eigenfunctions when $\lambda$ is
large, see exercise 12.35 (page 512).
12.2 The origin of Sturm-Liouville systems

In this section we show how various types of Sturm-Liouville problems arise. This
material is not assessed but it is recommended that you read it and, time permitting,
that you do some of the exercises at the end of this section because it is important
background material.
The original work of Sturm appears to have been motivated by the problem of
heat conduction. One example he discussed is the temperature distribution in a one-
dimensional bar, described by the linear partial differential equation

u u
h(x) = p(x) l(x)u, (12.4)
t x x
where u(x, t) denotes the temperature at a point x of the bar at time t, and h(x), p(x)
and l(x) are positive functions. If the surroundings of the bar are held at constant
temperature and the ends of the bar, at x = 0 and x = L, are in contact with large
bodies at a different temperature, then the boundary conditions can be shown to be
u
p(x) + u(x, t) = 0, at x = 0,
x (12.5)
u
p(x) + u(x, t) = 0, at x = L,
x
for some constants and . Finally, the initial temperature of the bar needs to be
specified, so u(x, 0) = f (x) where f (x) is the known initial temperature.
Sturm attempted to solve this equation by first substituting a function of the form
u(x, t) = X(x)et , where is a constant and X(x) is independent of t. This yields
the ordinary differential equation

d dX
p(x) + h(x) l(x) X = 0 (12.6)
dx dx
for X(x) in terms of the unknown constant , together with the boundary conditions
p(0)X 0 (0) + X(0) = 0 and p(L)X 0 (L) + X(L) = 0. (12.7)
This is an eigenvalue problem. Assuming that there are solutions Xk (x) with eigenvalues
= k , for k = 1, 2, , Sturm used the linearity of the original equation to write a
general solution as the sum

X
u(x, t) = Ak Xk (x)ek t ,
k=1
where the coefficients Ak are arbitrary. This solution formally satisfies the differential
equation and the boundary conditions, but not the initial condition u(x, 0) = f (x),
which will be satisfied only if

$$f(x) = \sum_{k=1}^{\infty} A_k X_k(x).$$
Thus the problem reduces to that of finding the values of the Ak satisfying this equation.
Fourier (1768–1830) and Poisson (1781–1840) found expressions for the coefficients
Ak for particular functions h(x), p(x) and l(x), but Sturm and Liouville determined
the general solution.
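The step from equation 12.4 to equation 12.6 can be written out explicitly. The following is a short worked check, assuming the trial solution has the decaying form $u(x,t) = X(x)e^{-\lambda t}$ (the sign of the exponent is the reading adopted here; it is the one that reproduces the combination $\lambda h(x) - l(x)$ in equation 12.6).

```latex
\begin{align*}
h(x)\frac{\partial}{\partial t}\bigl(X e^{-\lambda t}\bigr)
   &= -\lambda h(x) X(x) e^{-\lambda t}, \\
\frac{\partial}{\partial x}\Bigl(p(x)\frac{\partial}{\partial x}\bigl(X e^{-\lambda t}\bigr)\Bigr)
   - l(x) X e^{-\lambda t}
   &= \Bigl[\frac{d}{dx}\Bigl(p(x)\frac{dX}{dx}\Bigr) - l(x) X\Bigr] e^{-\lambda t}.
\end{align*}
% Equating the two right-hand sides and dividing by e^{-\lambda t} (never zero):
\[
\frac{d}{dx}\Bigl(p(x)\frac{dX}{dx}\Bigr) + \bigl(\lambda h(x) - l(x)\bigr) X = 0 .
\]
```

The time dependence cancels completely, which is why the substitution reduces the partial differential equation to an ordinary one for $X(x)$ alone.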
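Typically Sturm-Liouville equations occur when the method of separating variables
is used to solve the linear partial differential equations that arise frequently in physical
problems; some common examples are
$$\nabla^2\psi + k^2\psi = 0, \tag{12.8}$$
$$\nabla^2\psi - k\frac{\partial\psi}{\partial t} = 0, \quad \text{heat or diffusion equation}, \tag{12.9}$$
$$\nabla^2\psi - \frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} = 0, \quad \text{wave equation}, \tag{12.10}$$
$$\frac{1}{\rho(x)}\frac{\partial}{\partial x}\left(\rho(x)\frac{\partial\psi}{\partial x}\right) - \frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} = 0, \quad \text{canal or horn equation}, \tag{12.11}$$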
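where $c$ is a constant representing the speed of propagation of small disturbances in the
medium, $k$ is a positive constant, $\rho(x)$ some positive function of $x$ and
$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}.$$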
The first of these equations arises in the solution of Poisson's equation, that is
$\nabla^2\psi = F(\mathbf{r})$, and similar equations occur when using separation of variables. The
second equation describes diffusion processes and heat flow. The third equation, 12.10,
is the wave equation for propagation of small disturbances in an isotropic medium and
describes a variety of wave phenomena such as electromagnetic radiation, water and
air waves, waves in strings and membranes. The fourth equation is a variant of the
previous wave equation and in this form was derived by Green (1793–1841) in his
1838 paper³ describing waves on a canal of rectangular cross section but with a width
varying along its length; a similar equation describes, approximately, the air pressure
in a horn, though in many instruments the flare is sufficiently rapid for the longitudinal
and radial modes to couple, so it is necessary to use the two-dimensional version of 12.11
in which the variation of the air pressure along the length of the pipe and in the radial
direction is included.
The many different forms of the Sturm-Liouville system that we discuss in the fol-
lowing sections are largely a consequence of the shapes of the regions in which the
physical system is defined and of the coordinate system that simplifies the equations.
A Sturm-Liouville system arises when the method of separation of variables is used to
reduce a partial differential equation to a set of uncoupled ordinary differential equa-
tions. Whether or not such a simplification is feasible depends upon the existence of
a suitable coordinate system and this depends upon the form of the original equation
and the shape of the boundary. Relatively few problems yield to this treatment, but
it is important because it is one of the principal means of finding solutions in terms
of known functions: the main alternatives are numerical and variational methods, the
latter being introduced in section 12.6.
In problems with two spatial dimensions separation of variables can be used with
equations 12.8 and 12.10 for rectangular, circular and elliptical boundaries but not, for
example, most triangular boundaries.
3 On the Motion of Waves in a variable Canal of small Depth and Width, 1838 Camb Phil Soc, Vol
VI, part III.

We end this section by separating variables for the equation $\nabla^2\psi + k^2\psi = 0$, using
the spherical polar coordinates,
$$x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta,$$
where $0 \le \theta \le \pi$, $0 \le \phi \le 2\pi$ and $r \ge 0$, which are appropriate when the equation
is defined in a spherically symmetric region, for instance the interior or exterior of a
sphere of given radius or the region between two spheres of given radii and coincident
centres. The purpose of this section is to show how and why different Sturm-Liouville
systems occur. Although this material is not assessed, you should read it in order to
understand why some of the later mathematics is necessary.
In these coordinates it can be shown that equation 12.8 becomes
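$$\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2\psi}{\partial\phi^2} + k^2r^2\psi = 0. \tag{12.12}$$
First, write $\psi(r,\theta,\phi)$ as the product $\psi(r,\theta,\phi) = R(r)S(\theta,\phi)$ where $R$ depends only
upon $r$ and $S$ only upon $(\theta,\phi)$. Equation 12.12 then can be written in the form
$$\frac{1}{R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + k^2r^2 = -\frac{1}{S}\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial S}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2 S}{\partial\phi^2}\right].$$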
The left-hand side of this equation depends only upon $r$ and the right-hand side only
upon $(\theta,\phi)$. Because $(r,\theta,\phi)$ are independent variables this equation can be satisfied
only if each side is equal to the same constant, which we denote by $\lambda$; constants introduced
for this purpose are named separation constants; note that the constant $k$ is also
a separation constant obtained when separating the time from the spatial coordinates,
as in passing from equations 12.4 to 12.6. Thus we obtain the two equations,
$$\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left(k^2r^2 - \lambda\right)R = 0, \tag{12.13}$$
$$\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial S}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2 S}{\partial\phi^2} + \lambda S = 0. \tag{12.14}$$
The first of these equations is already in the canonical form of equation 12.1, and
contains two constants $k$ and $\lambda$ which are determined by the boundary conditions.
The second equation for $S$ is converted into two suitable equations in the same
manner: substitute $S = \Theta(\theta)\Phi(\phi)$ where $\Theta$ and $\Phi$ are respectively functions of $\theta$ and
$\phi$ only. Then equation 12.14 can be cast in the form,
$$\frac{\sin\theta}{\Theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \lambda\sin^2\theta = -\frac{1}{\Phi}\frac{d^2\Phi}{d\phi^2}.$$
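The left-hand side of this equation depends only upon $\theta$ and the right-hand side only
upon $\phi$, so each must equal the same constant. Later we shall see that the separation
constant must be positive or zero: denoting it by $\mu^2$, with $\mu \ge 0$ so that the sign of
$\sqrt{\mu^2}$ is unambiguous, gives the two equations
$$\frac{d^2\Phi}{d\phi^2} + \mu^2\Phi = 0, \tag{12.15}$$
$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \left(\lambda - \frac{\mu^2}{\sin^2\theta}\right)\Theta = 0. \tag{12.16}$$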
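Finally, if we define a new independent variable by $x = \cos\theta$, so
$$\frac{df}{d\theta} = -\sin\theta\frac{df}{dx} \quad\text{and}\quad \frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{df}{d\theta}\right) = \frac{d}{dx}\left((1 - x^2)\frac{df}{dx}\right),$$
the equation for $\Theta$ becomes
$$\frac{d}{dx}\left((1 - x^2)\frac{d\Theta}{dx}\right) + \left(\lambda - \frac{\mu^2}{1 - x^2}\right)\Theta = 0, \quad -1 \le x \le 1. \tag{12.17}$$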
Both equation 12.15 for $\Phi$ and 12.17 for $\Theta$ are in the canonical form of equation 12.1.
Comparison of 12.15 for $\Phi$ with equation 12.1 shows that the separation constant $\mu^2$
now plays the role of the eigenvalue; its value is determined by the boundary conditions
that $\Phi$ needs to satisfy. Comparison of 12.17 for $\Theta$ with equation 12.1 shows that here
$\lambda$ plays the role of the eigenvalue.
This analysis shows that in spherical polar coordinates the equation $\nabla^2\psi + k^2\psi = 0$
gives rise to three Sturm-Liouville systems for $R(r)$, $\Theta(\theta)$ and $\Phi(\phi)$ where $\psi = R(r)\Theta(\theta)\Phi(\phi)$.
These equations are summarised in table 12.1.
Table 12.1: Summary of the three Sturm-Liouville systems arising from separation of variables
of equation 12.8 using spherical polar coordinates, giving the explicit form for the three
functions $p$, $q$ and $w$, in each case.

Equation                                                                      p          q                    w       Eigenvalue
$\Phi'' + \mu^2\Phi = 0$                                                      $1$        $0$                  $1$     $\mu^2$
$\bigl((1 - x^2)\Theta'(x)\bigr)' + \bigl(\lambda - \mu^2/(1 - x^2)\bigr)\Theta = 0$    $1 - x^2$  $-\mu^2/(1 - x^2)$   $1$     $\lambda$
$\bigl(r^2R'(r)\bigr)' + (k^2r^2 - \lambda)R = 0$                             $r^2$      $-\lambda$           $r^2$   $k^2$
Now consider the boundary conditions.

The equation for $\Phi$: the points with coordinates $(r,\theta,\phi)$ and $(r,\theta,\phi + 2n\pi)$,
$n = 0, \pm1, \pm2, \ldots$, all label the same point in space, so in most physical problems we must
have $\Phi(\phi + 2n\pi) = \Phi(\phi)$ for all $\phi$, that is $\Phi(\phi)$ must be $2\pi$-periodic. This is why the
separation constant introduced to derive equations 12.15 and 12.16 could not be negative,
for the equation $\Phi'' - \mu^2\Phi = 0$, with $\mu > 0$, does not have periodic solutions; further,
$\Phi$ is $2\pi$-periodic only if $\mu$ is a non-negative integer, $\mu = m$, $m = 0, 1, 2, \ldots$, see
exercise 12.12 (page 493).
The equation for $\Theta$ has $p(x) = 1 - x^2$, which is zero at the ends of the interval
$(-1, 1)$, that is at $\theta = 0$ and $\pi$, corresponding to the poles. The poles are singular
points of spherical polar coordinates, because at each pole $\phi$ is undefined, and this is
why $p(x) = 0$ at $x = \pm1$. Further, because the coefficient of $\Theta''$ is zero at $x = \pm1$,
the general theory of linear differential equations shows that there are two types of
solutions, those that are bounded at $x = \pm1$ and those that are unbounded. Physical
considerations suggest that in most circumstances only bounded solutions are significant.
Thus for this type of Sturm-Liouville problem the boundary conditions are simply
that $\Theta$ is bounded for $x \in [-1, 1]$. It can be shown that with $\mu = m$, this condition
gives $\lambda = l(l + 1)$, $l = m, m + 1, m + 2, \ldots$⁴; these solutions are named the associated
Legendre polynomials and are denoted by $P_l^m(x)$.
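Two low-order cases make the condition $\lambda = l(l + 1)$ concrete. The check below is a sketch that assumes the standard closed forms $\Theta = \tfrac12(3x^2 - 1)$ for $l = 2$, $m = 0$ and $\Theta = \sqrt{1 - x^2}$ for $l = 1$, $m = 1$ (these expressions are not derived in the text), and substitutes them into the equation $\bigl((1 - x^2)\Theta'\bigr)' + \bigl(\lambda - \mu^2/(1 - x^2)\bigr)\Theta = 0$.

```latex
% Case l = 2, m = 0: \lambda = 6, \mu = 0, \Theta = (3x^2 - 1)/2, \Theta' = 3x.
\[
\frac{d}{dx}\bigl((1 - x^2)\,3x\bigr) + 6\cdot\tfrac12(3x^2 - 1)
  = (3 - 9x^2) + (9x^2 - 3) = 0 .
\]
% Case l = 1, m = 1: \lambda = 2, \mu = 1, \Theta = \sqrt{1 - x^2},
% so (1 - x^2)\Theta' = -x\sqrt{1 - x^2}.
\[
\frac{d}{dx}\bigl(-x\sqrt{1 - x^2}\bigr)
  + \Bigl(2 - \frac{1}{1 - x^2}\Bigr)\sqrt{1 - x^2}
  = \frac{(2x^2 - 1) + 2(1 - x^2) - 1}{\sqrt{1 - x^2}} = 0 .
\]
```

Both functions are bounded on $-1 \le x \le 1$, as the boundary condition requires.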
The radial equation for $R(r)$ has $p(r) = r^2$, so if the original space includes the origin
we find that because $p(0) = 0$ the solutions are of two types, those that are bounded
and those that are unbounded at $r = 0$. Again, physical considerations usually suggest
that the bounded solutions are chosen. The other boundary conditions are either given
by some condition at $r = a > 0$, where $a$ is the radius of the sphere in which the original
problem is defined, or that the solutions remain bounded as $r \to \infty$.
Summary: the method of separation of variables applied to the equation $\nabla^2\psi + k^2\psi = 0$,
using spherical polar coordinates leads to three different types of Sturm-Liouville sys-
tems. In this summary we introduce the idea of regular and singular Sturm-Liouville
systems, that will be discussed further and defined in section 12.4.
(1) The equation
$$\frac{d^2\Phi}{d\phi^2} + \mu^2\Phi = 0 \tag{12.18}$$
with periodic boundary conditions $\Phi(\phi + 2\pi) = \Phi(\phi)$ for all $\phi$, which determines
possible values of $\mu$. Note that this condition implies the conditions $\Phi(0) = \Phi(2\pi)$
and $\Phi'(0) = \Phi'(2\pi)$.
(2) The equation
$$\frac{d}{dx}\left((1 - x^2)\frac{d\Theta}{dx}\right) + \left(\lambda - \frac{\mu^2}{1 - x^2}\right)\Theta = 0, \quad -1 \le x \le 1. \tag{12.19}$$
The condition that $\Theta$ is bounded for all $x$ serves the same purpose as boundary
conditions, and determines possible values of the eigenvalue $\lambda$, once $\mu^2$ is known.
Because $p(x) = 1 - x^2$ is zero at the ends of the interval this type of Sturm-Liouville
equation is classified as a singular Sturm-Liouville system.
(3) The equation
$$\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left(k^2r^2 - \lambda\right)R = 0. \tag{12.20}$$
For this equation several types of conditions can specify the solution uniquely and
determine possible values of the eigenvalue $k^2$.
(i) If $0 \le r \le a$, since $p(r) = r^2$ is zero at $r = 0$, the solutions will normally
be required to be bounded at $r = 0$ and satisfy a condition of the form
$A_1y(a) + A_2y'(a) = 0$ at $r = a$, where $A_1$ and $A_2$ are constants. This system
is classified as a singular Sturm-Liouville system because $p(r) = 0$ at $r = 0$.
(ii) If $r \in [0, \infty)$, since $p(0) = 0$ the solutions will normally be required to
be bounded at $r = 0$ and tend to zero as $r \to \infty$. Again this is a singular
Sturm-Liouville system.
4 A physical reason why $l \ge m$ is that in some circumstances $l$ is proportional to the magnitude of
an angular momentum and $m$ a projection of this vector along a given axis, which can be no longer
than the original vector.
(iii) If $0 < a \le r \le b$ the solution will be required to satisfy boundary conditions
of the form
$$A_1y(a) + A_2y'(a) = 0 \quad\text{and}\quad B_1y(b) + B_2y'(b) = 0,$$
where $A_1$, $A_2$, $B_1$ and $B_2$ are constants. For this system $p(r) = r^2 > 0$ for all
$r$ and the system is a regular Sturm-Liouville system.
The examples described in this section show how Sturm-Liouville equations arise and
why a variety of types of these equations exist. The significance of the differing types
will become clear as the theory develops.
Exercise 12.4
Consider the system $\nabla^2\psi + k^2\psi = 0$ with $\psi(x, y) = 0$ on the rectangle defined
by the $x$- and $y$-axes, and the lines $x = a > 0$, $y = b > 0$. Show that inside
this rectangle separation of variables with Cartesian coordinates leads to the two
Sturm-Liouville systems
$$\frac{d^2X}{dx^2} + \omega_1^2X = 0 \quad\text{and}\quad \frac{d^2Y}{dy^2} + \omega_2^2Y = 0$$
with $X(0) = X(a) = 0$, $Y(0) = Y(b) = 0$ and where $\psi = X(x)Y(y)$ and
$\omega_1^2 + \omega_2^2 = k^2$.
Exercise 12.5
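Consider the system $\nabla^2\psi + k^2\psi = 0$ with $\psi(x, y) = 0$ defined inside the circle of
radius $a$. Use the polar coordinates $x = r\cos\theta$, $y = r\sin\theta$, $0 \le r \le a$ to cast the
equation in the form
$$\frac{\partial^2\psi}{\partial r^2} + \frac{1}{r}\frac{\partial\psi}{\partial r} + \frac{1}{r^2}\frac{\partial^2\psi}{\partial\theta^2} + k^2\psi = 0.$$
By putting $\psi = R(r)\Theta(\theta)$, where $R(r)$ depends only upon $r$ and $\Theta(\theta)$ only upon
$\theta$, show that
$$\frac{d^2\Theta}{d\theta^2} + \omega^2\Theta = 0, \quad\text{with } \Theta(\theta) \text{ } 2\pi\text{-periodic},$$
$$r^2\frac{d^2R}{dr^2} + r\frac{dR}{dr} + \left(k^2r^2 - \omega^2\right)R = 0,$$
where $\omega$ is a positive constant. Show further that the equation for $R(r)$ can be
cast in self-adjoint form
$$\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \left(k^2r - \frac{\omega^2}{r}\right)R = 0.$$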
12.3 Eigenvalues and functions of simple systems

The eigenvalues and eigenfunctions of most Sturm-Liouville systems are not easy to
find; yet the theory of Sturm-Liouville systems, to be described later, shows that the
eigenfunctions for most Sturm-Liouville systems with discrete eigenvalues behave sim-
ilarly, independent of the detailed form of the three functions p, q and w and of the
boundary conditions.
Thus in this section, in order to help understand this behaviour, we consider the
Sturm-Liouville system defined by the equation
$$\frac{d^2y}{dx^2} + \lambda y = 0, \quad y(0) = y(\pi) = 0, \tag{12.21}$$
with $p(x) = w(x) = 1$, $q(x) = 0$ and defined in the interval $[0, \pi]$. This equation
has simple solutions, found in exercise 12.6, and by studying these it is possible to
understand almost everything about the solutions of other Sturm-Liouville systems
with discrete eigenvalues. We illustrate this point in section 12.3.1 by describing the
properties of a singular Sturm-Liouville system closely related to equation 12.20, and
whose eigenfunctions are Bessel functions.
Exercise 12.6
(a) Show that equation 12.21 has no real, nontrivial solutions if $\lambda \le 0$.
(b) Find the values of $\lambda > 0$ for which solutions exist and find these solutions.
In exercise 12.6 it was shown that the eigenfunctions and eigenvalues of equation 12.21
are
$$y_n(x) = B\sin nx, \quad \lambda_n = n^2, \quad n = 1, 2, \ldots. \tag{12.22}$$
The constant $B$ is undetermined because the equation and boundary conditions are
homogeneous. It is often convenient to fix the value of this constant by normalising the
eigenfunctions to unity, that is we set
$$\int_0^\pi dx\, y_n(x)^2 = 1 \quad\text{and this gives}\quad B^2\int_0^\pi dx\,\sin^2 nx = \frac{1}{2}\pi B^2 = 1. \tag{12.23}$$
By choosing $B$ to be positive this convention gives the following eigenfunctions and
eigenvalues
$$y_n(x) = \sqrt{\frac{2}{\pi}}\,\sin nx, \quad \lambda_n = n^2, \quad n = 1, 2, \ldots. \tag{12.24}$$

Graphs of the adjacent pairs of eigenfunctions {y1 (x), y2 (x)}, and {y5 (x), y6 (x)} are
shown in the following figure.
[Figure: two panels plotting $y$ against $x$ for $0 \le x \le \pi$; curves labelled $k = 1$, $k = 2$ (left) and $k = 5$, $k = 6$ (right).]
Figure 12.1 Graphs of $y_k(x) = \sqrt{2/\pi}\,\sin kx$ for $k = 1, 2$ on the left, and $k = 5, 6$ on the right.
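The behaviour of these eigenfunctions can also be checked numerically. The sketch below uses only the Python standard library; the helper names `simpson`, `overlap` and `interior_sign_changes` are illustrative choices, not part of the text. It evaluates the integral of $y_n(x)y_m(x)$ over $[0, \pi]$ with Simpson's rule and counts the sign changes of $y_n$ inside the interval.

```python
import math

def y(n, x):
    """Normalised eigenfunction y_n(x) = sqrt(2/pi) sin(n x) of equation 12.24."""
    return math.sqrt(2.0 / math.pi) * math.sin(n * x)

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def overlap(n, m):
    """Integral of y_n(x) y_m(x) over [0, pi]."""
    return simpson(lambda x: y(n, x) * y(m, x), 0.0, math.pi)

def interior_sign_changes(n, samples=10000):
    """Count sign changes of y_n sampled at midpoints of a uniform grid on (0, pi)."""
    values = [y(n, math.pi * (i + 0.5) / samples) for i in range(samples)]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)

print(overlap(3, 3))             # close to 1: normalisation of equation 12.23
print(overlap(2, 5))             # close to 0: distinct eigenfunctions
print(interior_sign_changes(6))  # number of zeros of y_6 in 0 < x < pi
```

The quadrature and the sampling grid are numerical approximations, so the printed values agree with the exact results only to rounding and discretisation error.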
We now list the important properties of these eigenvalues and eigenfunctions and state
which are common to all Sturm-Liouville systems. It is surprising that most of these
properties are common to all Sturm-Liouville systems regardless of the precise forms of
the functions p, q and w.
In this list we first state the specific property of the solutions of the Sturm-Liouville
system 12.21, and then state the equivalent general property of the solutions for the
general system, equation 12.1.
Real eigenvalues The eigenvalues $\lambda_n = n^2$, $n = 1, 2, \ldots$, are real.

The eigenvalues of all Sturm-Liouville systems are real and this is a consequence of
the form of the differential equation and the boundary conditions, which together
produce a self-adjoint operator: for an example of boundary conditions that give
complex eigenvalues, see exercise 12.13 (page 494).
Behaviour of eigenvalues The smallest eigenvalue is unity, but there is no largest
eigenvalue: further, $\lambda_n/n^2 = O(1)$ as $n \to \infty$.
For the general Sturm-Liouville system there is a smallest but no largest eigenvalue
and $\lambda_n$ increases as $n^2$ for large $n$; this is proved in exercise 12.35 (page 512).
Uniqueness of eigenfunctions For each eigenvalue $\lambda_n$ there is a single eigenfunction,
$y_n \propto \sin nx$, unique to within a multiplicative constant.
This is also true of regular Sturm-Liouville systems and most singular Sturm-
Liouville systems of physical interest. The important exception described in ex-
ercise 12.12 (page 493) shows that there is not always a unique eigenfunction for
periodic boundary conditions. The example of exercise 12.14 shows that some
singular Sturm-Liouville systems have no eigenfunctions.
Interlacing zeros The zeros of adjacent eigenfunctions interlace, so there is one and
only one zero of yn+1 (x) between adjacent zeros of yn (x), see figure 12.1.
This is also true in the general case, and is a property of many solutions of second-
order equations, see theorem 12.2 (page 501), see also theorem 12.3.
Number of zeros of the nth eigenfunction The $n$th eigenfunction has $n - 1$ zeros
in $0 < x < \pi$.
For the general Sturm-Liouville problem on the interval $[a, b]$ the $n$th eigenfunction
has $n - 1$ zeros in $a < x < b$. This property is largely a consequence of the
interlacing of zeros.
Orthogonality of eigenfunctions The integral of the product of two distinct eigenfunctions
over the interval $(0, \pi)$ is zero,
$$\int_0^\pi dx\, y_n(x)y_m(x) = \int_0^\pi dx\,\sin nx\,\sin mx = 0, \quad n \ne m.$$
For the general Sturm-Liouville system, regular and singular, defined in equation 12.1
there is a similar result. If $\phi_n(x)$ and $\phi_m(x)$ are eigenfunctions belonging
to two distinct eigenvalues, then they can be shown to satisfy the orthogonality
relation
$$\int_a^b dx\, w(x)\phi_n(x)^*\phi_m(x) = h_n\delta_{nm}, \tag{12.25}$$
where $h_n$ is a sequence of positive numbers, $\delta_{nm}$ is the Kronecker delta⁵ and
$a^*$ denotes the complex conjugate of $a$. Note that there are two differences between
the specific example of equation 12.21 and the general case. First, the function
$w(x)$, the same function that multiplies the eigenvalue in the original differential
equation 12.1, has been included in the integrand: in this context $w(x)$ is often
named the weight function. Second, the complex conjugate of $\phi_n(x)$ appears.
This is necessary because there are circumstances when it is more convenient to
use complex solutions even though the equations are real: for instance, we often
use $e^{inx}$ in place of the real trigonometric functions $\cos nx$ and $\sin nx$.
By analogy with ordinary geometric vectors this integral is named an inner product
and it is convenient to introduce the short-hand notation
$$(f, g)_w = \int_a^b dx\, w(x)f(x)^*g(x) \tag{12.26}$$
where $f(x)$ and $g(x)$ are any functions, which may be complex, for which the
integral exists. Notice that $(g, f)_w = (f, g)_w^*$. With this notation equation 12.25
can be written in the form $(\phi_n, \phi_m)_w = h_n\delta_{nm}$, so that $h_n = (\phi_n, \phi_n)_w$. If $w(x) = 1$
we denote the inner product by $(f, g)$.
If $(f, g)_w = 0$ the two functions are said to be orthogonal and if $(f, f)_w = 1$ the
function $f$ is said to be normalised.
Completeness of eigenfunctions The eigenfunctions $y_n(x) = \sin nx$ may be used in
a Fourier series to represent any sufficiently well behaved function $f(x)$ for which
$\int_0^\pi dx\,|f(x)|^2$ exists. The Fourier representation of $f(x)$ is,
$$f(x) = \sum_{n=1}^{\infty} b_n\sin nx, \quad 0 < x < \pi, \quad\text{where}\quad b_n = \frac{2}{\pi}\int_0^\pi dx\, f(x)\sin nx. \tag{12.27}$$
The infinite set of functions $\sin nx$, $n = 1, 2, \ldots$, is said to be complete on the
interval $(0, \pi)$ because any sufficiently well behaved function can be represented
in terms of such an infinite series.
In general if $\phi_n(x)$, $n = 1, 2, \ldots$, are the eigenfunctions of a Sturm-Liouville
system defined on $(a, b)$, with given boundary conditions, they are complete, which
means that any sufficiently well behaved function $f(x)$ for which $\int_a^b dx\,|f(x)|^2$
exists can be represented by the infinite series
$$f(x) = \sum_{n=1}^{\infty} a_n\phi_n(x), \quad a < x < b, \tag{12.28}$$
where
$$a_n = \frac{(\phi_n, f)_w}{(\phi_n, \phi_n)_w} = \frac{1}{h_n}\int_a^b dx\, w(x)\phi_n(x)^*f(x), \quad h_n = (\phi_n, \phi_n)_w.$$
5 The Kronecker delta is a function of two integers, $(n, m)$, defined as $\delta_{nm} = 0$ if $n \ne m$ and $1$ if
$n = m$.
It is conventional to name the more general series 12.28 a Fourier series and the
coefficients an the Fourier components: the series 12.27 is often referred to as a
trigonometric series, if a distinction is necessary.
The twin properties of orthogonality and completeness of the eigenfunctions, and
hence the existence of the series 12.28, are two reasons why Sturm-Liouville sys-
tems play a significant role in the theory of linear differential equations. It means,
for instance, that solutions of the inhomogeneous equation

$$\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + q(x)y = F(x), \tag{12.29}$$
with suitable boundary conditions, can usually be expressed as a linear combination
of the eigenfunctions of the related Sturm-Liouville system,
$$\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + \bigl(q(x) + \lambda w(x)\bigr)y = 0,$$
with the same boundary conditions. The rigorous treatment of this theory is too
involved to be included in this course, but an outline of the theory is contained
in the next exercise.
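Completeness can be illustrated numerically with the sine series of equation 12.27. The following sketch is an illustration under stated assumptions, not a proof: the test function $f(x) = x(\pi - x)$ is a hypothetical choice, the coefficients $b_n$ are approximated by Simpson's rule, and the helper names are invented for this example.

```python
import math

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def sine_coefficient(f, n):
    """b_n = (2/pi) * integral over [0, pi] of f(x) sin(n x), as in equation 12.27."""
    return (2.0 / math.pi) * simpson(lambda x: f(x) * math.sin(n * x), 0.0, math.pi)

def partial_sum(f, x, terms):
    """Sum of the first `terms` terms of the sine series of f, evaluated at x."""
    return sum(sine_coefficient(f, n) * math.sin(n * x) for n in range(1, terms + 1))

def f(x):
    """Hypothetical test function; it vanishes at x = 0 and x = pi."""
    return x * (math.pi - x)

# The error at an interior point shrinks as more eigenfunctions are included.
for terms in (1, 5, 25):
    print(terms, abs(partial_sum(f, 1.0, terms) - f(1.0)))
```

For this particular $f$ the coefficients can be found by hand: $b_n = 8/(\pi n^3)$ for odd $n$ and zero for even $n$, which is why the partial sums converge quickly.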
Exercise 12.7
Suppose that the Sturm-Liouville system
$$\frac{d}{dx}\left(p\frac{dy}{dx}\right) + (q + \lambda w)y = 0, \quad y(a) = y(b) = 0,$$
has an infinite set of eigenvalues and eigenfunctions $\lambda_k$ and $\phi_k(x)$, $k = 1, 2, \ldots$,
with $0 < \lambda_1 < \lambda_2 < \cdots$, which satisfy the orthogonality relation 12.25.
(a) Consider the infinite series
$$y(x) = \sum_{k=1}^{\infty} y_k\phi_k(x)$$
where the coefficients $y_k$ are constants. Assuming the order of summation and
differentiation can be interchanged, show that
$$\frac{d}{dx}\left(p\frac{dy}{dx}\right) + qy = -\sum_{k=1}^{\infty} y_k\lambda_kw(x)\phi_k(x).$$
(b) Hence show that the solution of the inhomogeneous equation 12.29 can be
written in the form
$$y(x) = \int_a^b du\, G(x, u)F(u) \quad\text{where}\quad G(x, u) = -\sum_{k=1}^{\infty}\frac{\phi_k(u)^*\phi_k(x)}{h_k\lambda_k}.$$
12.3.1 Bessel functions

Here we show that the properties described in the previous section are shared by Bessel
functions, which are among the special functions that can be defined by a singular
Sturm-Liouville equation, given in equation 12.30.
We choose the Bessel function for this illustration because it is one of the more
important special functions of mathematical physics. It was one of the first special
functions to be the subject of a comprehensive treatise (Watson 1966)⁶ which provides
a thorough history of the early development and use of Bessel functions: they have occurred
in the work of Euler (1764, in the vibrations of a stretched membrane), Lagrange
(1770, in the theory of planetary motion), Fourier (1822, in his theory of heat flow),
Poisson (1823, in the theory of heat flow in spherical bodies) and Bessel (1824, who
studied these functions in detail): Watson (1966) abandons his attempt to delineate the
chronological order of the study after Bessel with the words "After the time of Bessel, investigations
on the functions become so numerous . . . ."
Bessel functions are important because, unlike most other special functions, they
arise in two quite distinct types of problems. The first is in the solution of linear partial
differential equations where separation of variables is used to derive ordinary differential
equations; typically problems involving cylindrical and spherical symmetry give rise to
Bessel functions, but so does the problem of the small vibrations of a chain suspended
from one end (considered by Euler in 1782).
These types of problem lead to differential equations that can be cast into the form
$$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + (x^2 - \nu^2)y = 0, \tag{12.30}$$
where $\nu$ is a real number⁷, though in the following we consider only the case $\nu = 1$.
The various solutions of this equation are collectively named Bessel functions. This
equation is singular at the origin (see section 12.5) and, as a consequence, it can be
shown to possess two types of solution. Those denoted by $J_\nu(x)$ are bounded at the
origin: those denoted by $Y_\nu(x)$ are unbounded at the origin.
The second application arises because it is frequently necessary to expand the function
$e^{iz\sin t}$, which is $2\pi$-periodic in $t$, as a Fourier series. It transpires that the Fourier
components are Bessel functions,
$$e^{iz\sin t} = \sum_{n=-\infty}^{\infty} J_n(z)e^{int}. \tag{12.31}$$
This relation is useful in the modern problem of the interaction of periodic electric
fields, lasers for example, with atoms and molecules: but the original application of
Bessel functions in this context was the inversion of Kepler's equation, which relates
the time, $t$, to the eccentric anomaly, $u$, of a planet in an elliptical orbit with the Sun
at one focus,
$$\omega t = u - \epsilon\sin u \quad \text{(Kepler's equation)}. \tag{12.32}$$
6 G N Watson 1966 A treatise on the Theory of Bessel Functions (Cambridge University Press),
first published 1922.
7 In the general theory both $x$ and $\nu$ are complex variables. The important Modified Bessel functions
are obtained by making $x$ purely imaginary.

Here $\omega$ is the angular frequency of the planet and $\epsilon$ the eccentricity of the elliptical
path, typically less than 0.1, the exceptions being Mercury (0.21) and Pluto (0.25).
Elementary dynamics gives the approximate position of each planet in terms of $u$, but
for practical applications the positions are needed in terms of the time. By writing $\omega t = \tau$ and
$u = \tau + P(\tau)$, so $P(\tau)$ is a $2\pi$-periodic function, we find that the Fourier components
of $P(\tau)$ are related to Bessel functions, see exercise 12.49.
This application gives rise to the integral definition of $J_n(x)$,
$$J_n(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} dt\,\exp\bigl(-i(nt - x\sin t)\bigr), \quad n = 0, \pm1, \pm2, \ldots. \tag{12.33}$$
The integral representation of $J_\nu(x)$, where $\nu$ is not an integer, is more complicated
(Whittaker and Watson, 1965, sections 17.1 and 17.231). It can be shown, by differentiating
equation 12.33, that the function defined in this way satisfies the differential
equation 12.30, see exercise 12.50.
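The integral definition is also a practical way to compute $J_n(x)$ for integer $n$. The sketch below is a minimal standard-library illustration, not a production routine: the function name `bessel_j` and the number of quadrature points are choices made here. It takes the exponent as $-i(nt - x\sin t)$, which matches the Fourier coefficient of equation 12.31, and uses the fact that the imaginary part of the integrand is odd in $t$, so only the cosine contributes.

```python
import math

def bessel_j(n, x, points=400):
    """Approximate J_n(x) for integer n from the integral definition 12.33.

    The integrand is smooth and 2*pi-periodic in t, so the equally spaced
    trapezoidal rule over one period converges very quickly.
    """
    h = 2.0 * math.pi / points
    total = 0.0
    for j in range(points):
        t = -math.pi + j * h
        total += math.cos(n * t - x * math.sin(t))
    return total / points  # equals (1 / (2*pi)) * h * sum

print(bessel_j(0, 0.0))  # J_0(0)
print(bessel_j(1, 0.0))  # J_1(0), zero up to rounding
print(bessel_j(0, 1.0))  # approximately 0.7651976866
print(bessel_j(1, 1.0))  # approximately 0.4400505857
```

The default of 400 points is ample for the moderate values of $x$ used in this section; much larger arguments would need more points.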
Exercise 12.8
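(a) Show that the self-adjoint form of equation 12.30 is
$$\frac{d}{dx}\left(x\frac{dy}{dx}\right) + \left(x - \frac{\nu^2}{x}\right)y = 0.$$
(b) Show that the normal form, defined in exercise 12.1, of equation 12.30 is
$$\frac{d^2u}{dx^2} + \left(1 - \frac{\nu^2 - \frac14}{x^2}\right)u = 0 \quad\text{where}\quad y(x) = \frac{u(x)}{\sqrt{x}}, \quad x > 0.$$
(c) Apply the Liouville transformation, defined in exercise 12.3, to equation 12.30
to give the alternative form of Bessel's equation
$$\frac{d^2y}{d\xi^2} + \left(e^{2\xi} - \nu^2\right)y = 0 \quad\text{where}\quad \xi = \ln x, \quad x > 0.$$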
Exercise 12.9
(a) Use the Fourier series 12.31 to show that
(i) $J_{-n}(x) = (-1)^nJ_n(x)$;
(ii) $J_n(-x) = (-1)^nJ_n(x)$;
(iii) $J_0(x) + 2J_2(x) + 2J_4(x) + \cdots = 1$.
(b) Use the integral definition to show that $J_0(0) = 1$ and that $J_n(0) = 0$ for
$n \ne 0$.
(c) By differentiating the integral definition 12.33 with respect to $x$ derive the
recurrence relation
$$2J_n'(x) = J_{n-1}(x) - J_{n+1}(x).$$
(d) Use the integral definition 12.33 to show that
$$J_{n-1}(x) + J_{n+1}(x) = \frac{2n}{x}J_n(x).$$
In the remainder of this section we describe the behaviour of the eigenvalues and eigenfunctions
of the singular Sturm-Liouville system associated with Bessel's equation,
$$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + (\omega^2x^2 - 1)y = 0, \quad 0 \le x \le 1, \quad y(1) = 0, \tag{12.34}$$
with $\omega > 0$; in particular we show that they satisfy most of the properties listed at
the beginning of section 12.3. By converting equation 12.34 to the self-adjoint form
$(xy')' + (\omega^2x - 1/x)y = 0$, see exercise 12.8, and comparing with equation 12.1 we see
that the eigenvalue is $\lambda = \omega^2$ (and $p = w = x$, $q = -1/x$). By changing the independent
variable to $\xi = \omega x$ we see that this equation is the same as equation 12.30 with $\nu = 1$
and hence has the solutions $Y_1(\omega x)$ and $J_1(\omega x)$; we require the solution that is bounded,
that is $J_1(\omega x)$.
The boundary condition at $x = 1$ then gives $J_1(\omega) = 0$, that is $\omega$ must be one of the
zeros of the Bessel function. A graph of $J_1(\omega)$ is shown in figure 12.2 and this suggests
that there are an infinite number of positive zeros, $\omega_k$, $k = 1, 2, \ldots$.
[Figure: plot of $J_1(\omega)$ against $\omega$ for $0 \le \omega \le 20$, vertical axis from $-0.4$ to $0.6$.]
Figure 12.2 Graph of the Bessel function $J_1(\omega)$.
Using its series expansion Daniel Bernoulli (1738) first suggested that this Bessel function
has an infinite set of zeros. Later we shall see how this follows from the general
theory of second-order differential equations: the first five zeros are
$$\omega_1 = 3.832, \quad \omega_2 = 7.016, \quad \omega_3 = 10.17, \quad \omega_4 = 13.32, \quad \omega_5 = 16.47,$$
and these numbers can be approximated by the formula
$$\omega_k = \left(k + \frac14\right)\pi - \frac{3}{8\pi(k + 1/4)} + O(k^{-3}), \quad k = 1, 2, \ldots,$$
which gives the first zero to within 0.006% and progressively improves in accuracy with
increasing $k$.
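The quoted zeros and the accuracy of the approximate formula can be reproduced with a few lines of code. This is a sketch using only the standard library: $J_1$ is evaluated from its integral definition by the trapezoidal rule, and each zero is located by bisection in a bracket of half-width 0.5 around the approximate value (an assumption that holds here because consecutive zeros are separated by roughly $\pi$). The function names are illustrative.

```python
import math

def j1(x, points=400):
    """J_1(x) from the integral definition, by the periodic trapezoidal rule."""
    h = 2.0 * math.pi / points
    return sum(math.cos((-math.pi + j * h) - x * math.sin(-math.pi + j * h))
               for j in range(points)) / points

def asymptotic_zero(k):
    """Leading terms of the approximate formula for the k-th zero of J_1."""
    beta = (k + 0.25) * math.pi
    return beta - 3.0 / (8.0 * beta)

def j1_zero(k, tol=1e-12):
    """k-th positive zero of J_1 by bisection around the approximate value."""
    lo, hi = asymptotic_zero(k) - 0.5, asymptotic_zero(k) + 0.5
    f_lo = j1(lo)
    if f_lo * j1(hi) > 0.0:
        raise ValueError("bracket does not contain a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = j1(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

for k in range(1, 6):
    zero, approx = j1_zero(k), asymptotic_zero(k)
    # columns: k, computed zero, approximate formula, relative error
    print(k, round(zero, 4), round(approx, 4), abs(approx - zero) / zero)
```

The last column shows the relative error of the approximation decreasing as $k$ grows, in line with the $O(k^{-3})$ remainder.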
The easiest way to understand why $J_1(x)$ oscillates in the manner shown in figure 12.2
is to use the result derived in exercise 12.8(b). For large $x$ this shows that
$u(x) = \sqrt{x}\,J_1(x)$ is given approximately by the equation $u'' + u = 0$, so that $J_1(x) \simeq
(A\cos x + B\sin x)/\sqrt{x}$; this shows why $J_1(x)$ oscillates but does not give the phase of
the oscillations, that is the values of $A$ and $B$.
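The large-$x$ argument can be spelled out in two lines. The following assumes the normal form stated in exercise 12.8(b), namely $u'' + \bigl(1 - (\nu^2 - \tfrac14)/x^2\bigr)u = 0$ with $y = u/\sqrt{x}$, and sets $\nu = 1$.

```latex
\[
\frac{d^2u}{dx^2} + \Bigl(1 - \frac{3}{4x^2}\Bigr)u = 0,
\qquad u(x) = \sqrt{x}\,J_1(x).
\]
% For x \gg 1 the term 3/(4x^2) is small compared with 1
% (for example 0.0075 at x = 10), so to leading order
\[
\frac{d^2u}{dx^2} + u \simeq 0
\quad\Longrightarrow\quad
u \simeq A\cos x + B\sin x,
\qquad
J_1(x) \simeq \frac{A\cos x + B\sin x}{\sqrt{x}} .
\]
```

The neglected term is what fixes the slow drift in phase; the leading-order equation alone determines only the period $2\pi$ and the $1/\sqrt{x}$ decay of the amplitude.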
The eigenfunctions of equation 12.34 are thus
$$y_k(x) = J_1(\omega_kx), \quad k = 1, 2, \ldots. \tag{12.35}$$

In the following two figures are shown the graphs of the eigenfunctions {y1 (x), y2 (x)}
and {y5 (x), y6 (x)}, as in figure 12.1 (page 485), with which you should compare the
present figures.
[Figure: two panels plotting $y$ against $x$ for $0 \le x \le 1$; curves labelled $k = 1$, $k = 2$ (left) and $k = 5$, $k = 6$ (right).]
Figure 12.3 Graphs of $y_k(x) = J_1(\omega_kx)$, for $k = 1, 2$, on the left, and $k = 5, 6$ on the right.
These eigenfunctions and eigenvalues all behave as previously described, namely:
• the eigenvalues are real:
• for large $n$ the eigenvalues behave as $\lambda_n = \omega_n^2 \simeq (n + 1/4)^2\pi^2$, that is $\lambda_n/n^2 = O(1)$
as $n \to \infty$:
• the $n$th eigenfunction has $n - 1$ zeros in the interval $0 < x < 1$:
• there is one and only one zero of $y_{n+1}(x)$ between adjacent zeros of $y_n(x)$:
• the eigenfunctions are orthogonal with weight function $w(x) = x$. In this case it
can be shown that
$$\int_0^1 dx\, xJ_1(x\omega_n)J_1(x\omega_m) = \delta_{nm}h_n \quad\text{with}\quad h_n = \frac12J_1'(\omega_n)^2.$$
• The eigenfunctions are complete, which means that any sufficiently well behaved
real function, $f(x)$, on the interval $0 < x < 1$ can be expressed as the infinite
series, equation 12.28 (page 487),
$$f(x) = \sum_{n=1}^{\infty} a_nJ_1(x\omega_n) \quad\text{where}\quad a_n = \frac{2}{J_1'(\omega_n)^2}\int_0^1 dx\, xf(x)J_1(x\omega_n).$$
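The orthogonality relation with weight $w(x) = x$ can be checked numerically. The sketch below is self-contained and uses only the standard library; the helper names are invented for this example. It assumes the recurrence $2J_n'(x) = J_{n-1}(x) - J_{n+1}(x)$ from exercise 12.9(c) to evaluate $J_1'$, finds the zeros of $J_1$ by bisection around the approximate formula for the zeros, and integrates with Simpson's rule.

```python
import math

def bessel_j(n, x, points=200):
    """J_n(x) for integer n from the integral definition (periodic trapezoidal rule)."""
    h = 2.0 * math.pi / points
    return sum(math.cos(n * (-math.pi + j * h) - x * math.sin(-math.pi + j * h))
               for j in range(points)) / points

def j1_zero(k):
    """k-th positive zero of J_1 by bisection around the approximate formula."""
    beta = (k + 0.25) * math.pi
    guess = beta - 3.0 / (8.0 * beta)
    lo, hi = guess - 0.5, guess + 0.5
    f_lo = bessel_j(1, lo)
    for _ in range(60):  # fixed number of halvings; ample for double precision
        mid = 0.5 * (lo + hi)
        f_mid = bessel_j(1, mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def simpson(f, a, b, steps=400):
    """Composite Simpson rule; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def inner(n, m):
    """Weighted integral over [0, 1] of x * J_1(x w_n) * J_1(x w_m)."""
    wn, wm = j1_zero(n), j1_zero(m)
    return simpson(lambda x: x * bessel_j(1, wn * x) * bessel_j(1, wm * x), 0.0, 1.0)

def h_n(n):
    """h_n = J_1'(w_n)^2 / 2, with J_1' = (J_0 - J_2) / 2."""
    w = j1_zero(n)
    dj1 = 0.5 * (bessel_j(0, w) - bessel_j(2, w))
    return 0.5 * dj1 ** 2

print(inner(1, 2))          # close to 0
print(inner(1, 1), h_n(1))  # the two values should agree closely
```

Without the factor $x$ in the integrand the off-diagonal integrals do not vanish, which is the practical meaning of the weight function here.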
Exercise 12.10
This exercise shows how the boundary conditions can affect the eigenvalues and
eigenfunctions. Find all eigenvalues and eigenfunctions of the Sturm-Liouville
systems defined by the differential equation
$$\frac{d^2y}{dx^2} + \lambda y = 0,$$
and the three sets of boundary conditions
(a) $y'(0) = y'(\pi) = 0$, (b) $y(0) = y'(\pi) = 0$, (c) $y(0) = 0$, $y(\pi) = y'(\pi)$.
In each case show that the eigenfunctions, $\phi_n(x)$, belonging to distinct eigenvalues
are orthogonal, that is satisfy,
$$\int_0^\pi dx\,\phi_n(x)\phi_m(x) = h_n\delta_{nm}$$
where $h_n$ is a sequence of positive numbers which you should find.
Exercise 12.11
This exercise involves lengthy algebraic manipulations. In exercise 12.10 you found
the following sets of eigenfunctions, $y_n(x)$, and eigenvalues, $\lambda_n$, for the equation
$d^2y/dx^2 + \lambda y = 0$ with three different boundary conditions,
(a) $y_n(x) = \cos nx$, $\lambda_n = n^2$, $n = 0, 1, \ldots$, $y'(0) = y'(\pi) = 0$;
(b) $y_n(x) = \sin(n + 1/2)x$, $\lambda_n = (n + 1/2)^2$, $n = 0, 1, \ldots$, $y(0) = y'(\pi) = 0$;
(c) $y_0(x) = \sinh\omega_0x$, $\lambda_0 = -\omega_0^2$, $y_n(x) = \sin\omega_nx$, $\lambda_n = \omega_n^2$, where $\tanh\pi\omega_0 = \omega_0$
and $\tan\pi\omega_n = \omega_n$, $n = 1, 2, \ldots$.
The Sturm-Liouville theorem shows that each of these sets of functions is complete
on $(0, \pi)$. Use equation 12.28 to show that the function $x$ may be represented by
any of the following series on the interval $(0, \pi)$
$$x = \frac{\pi}{2} - \frac{4}{\pi}\sum_{k=0}^{\infty}\frac{\cos(2k + 1)x}{(2k + 1)^2},$$
$$x = \frac{2}{\pi}\sum_{k=0}^{\infty}\frac{(-1)^k}{(k + 1/2)^2}\sin\left(k + \frac12\right)x,$$
$$x = \frac{2(\pi - 1)\cosh\pi\omega_0}{\omega_0(\cosh^2\pi\omega_0 - \pi)}\sinh\omega_0x - 2(\pi - 1)\sum_{k=1}^{\infty}\frac{\cos\pi\omega_k}{\omega_k(\pi - \cos^2\pi\omega_k)}\sin\omega_kx.$$
Exercise 12.12
Periodic boundary conditions:
(a) Show that the eigenvalues of the Sturm-Liouville system
$$\frac{d^2y}{dx^2} + \lambda y = 0, \quad y(0) = y(2\pi a), \quad y'(0) = y'(2\pi a), \quad a > 0,$$
are given by
$$\lambda_n = \left(\frac{n}{a}\right)^2, \quad n = 0, 1, 2, \ldots,$$
and that there are no negative eigenvalues. Show also that for $n = 0$ there is
just one eigenfunction, which can be taken to be $y_0(x) = 1$, and for $n \ge 1$ each
eigenvalue has two linearly independent eigenfunctions,
$$y_n(x) = \left\{\cos\frac{nx}{a}, \ \sin\frac{nx}{a}\right\},$$
or any linear combination of these.
(b) Consider the two eigenfunctions associated with the $n$th eigenvalue
$$u_1(x) = A_1\cos\frac{nx}{a} + B_1\sin\frac{nx}{a} \quad\text{and}\quad u_2(x) = A_2\cos\frac{nx}{a} + B_2\sin\frac{nx}{a}.$$
Show that these are orthogonal only if $A_1A_2 + B_1B_2 = 0$.
Exercise 12.13
Mixed boundary conditions:
The solutions of a Sturm-Liouville equation with mixed boundary conditions usually
behave quite differently from those with unmixed conditions. An example is
considered in this exercise.
Consider the system with mixed boundary conditions
$$\frac{d^2y}{dx^2} + \lambda y = 0, \quad y(0) = 0, \quad y(\pi) = ay'(0), \quad a > 0.$$
Show that if $0 < a < \pi$ there are a finite number of real eigenvalues given by the
real roots of the equation $\sin\pi\omega = a\omega$, $(\omega_1, \omega_2, \ldots, \omega_N)$, with $\lambda = \omega^2$ and with
eigenfunctions $y_k(x) = \sin\omega_kx$ and $N \simeq 1/a$.
Are these eigenfunctions orthogonal?
12.4 Sturm-Liouville systems

In the previous section it was shown how the eigenvalues and eigenfunctions of a particu-
lar Sturm-Liouville system behave and it was stated that most of these systems behave
similarly. We now formally define regular and singular systems before investigating
some of these properties. The distinction between regular and singular systems is im-
portant, because not all singular systems have eigenvalues, see exercise 12.14; however,
the regular and singular systems that arise from linear partial differential equations
behave similarly.
A regular Sturm-Liouville system is defined to be the linear, homogeneous, second-order
differential equation⁸
$$\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + \bigl(q(x) + \lambda w(x)\bigr)y = 0 \tag{12.36}$$
defined on a finite interval of the real axis $a \le x \le b$, together with the homogeneous
boundary conditions
$$A_1y(a) + A_2y'(a) = 0 \quad\text{and}\quad B_1y(b) + B_2y'(b) = 0, \tag{12.37}$$
with $A_1$, $A_2$, $B_1$ and $B_2$ real constants, and the two cases $A_1 = A_2 = 0$ and $B_1 = B_2 = 0$
are excluded. These conditions are sometimes named separated boundary conditions.
In addition, the coefficient functions are required to satisfy the following conditions:
• the functions $p(x)$, $q(x)$ and $w(x)$ are real and continuous for $a \le x \le b$;
• $p(x)$ and $w(x)$ are strictly positive for $a \le x \le b$;
• $p'(x)$ exists and is continuous for $a \le x \le b$.

8 There is no agreed convention for the signs in this equation. For instance, in Courant and Hilbert
(1965) and Birkhoff and Rota (1962) the sign in front of $q(x)$ is negative and in Körner (1988) the
signs in front of $q(x)$ and $\lambda$ are negative. Care is needed when using different sources.
Equation 12.34, defining the Bessel function, and the radial equation 12.20 for $R(r)$ and
equation 12.19 for $\Theta(\theta)$, do not satisfy the condition $p > 0$. Further, equation 12.18
for $\Phi(\phi)$ has a different type of boundary condition from those of equation 12.37. It
follows that the scope of the theory needs to be extended if it is to be useful.
First, it needs to apply to periodic boundary conditions, that is
\[ y(a) = y(b), \quad y'(a) = y'(b), \tag{12.38} \]
which are an important subset of the class of mixed boundary conditions, see
exercise 12.12. Equation 12.18 for $\Phi(\phi)$ has this type of boundary condition. Another
common Sturm-Liouville system with periodic boundary conditions is Mathieu's equation,
\[ \frac{d^2y}{d\theta^2} + (\lambda - 2q\cos 2\theta)\, y = 0, \quad y(0) = y(\pi), \quad y'(0) = y'(\pi), \tag{12.39} \]
where here $q$ is a real variable. This equation seems to have been first studied by the
French mathematician Mathieu (1835–1890) in his discussion of the vibrations of an
elliptic membrane and occurs when separating variables in elliptical coordinates, see
exercise 12.49 (page 526). In this example $\lambda(q)$ is the eigenvalue and it has a fairly
complicated dependence upon the variable $q$.
The main difference between periodic and separated boundary values is that some-
times, see exercise 12.12, each eigenvalue has more than one eigenfunction. In such
cases it is always possible to choose linear combinations that are orthogonal.
The second necessary extension is to those equations where $p(x) = 0$ at either or
both end points. In the example treated in section 12.2, the equation 12.20 for $R(r)$ is
singular if the interval contains $r = 0$, as is the Bessel function example, equation 12.34:
the equation 12.19 for $\Theta(\theta)$ is singular because $p(x) = 1 - x^2$ is zero at both ends of
the interval. Thus singular systems are as common as regular systems.
As an aside we note that all these singular systems arise because the spherical polar
coordinates used to separate variables are singular at the poles, where $x = \cos\theta = \pm 1$
and $\phi$ is undefined, and at $r = 0$ where neither $\theta$ nor $\phi$ is defined. It is this
geometric singularity in the transformation between Cartesian and polar coordinates that
makes the Sturm-Liouville systems singular: therefore we do not expect these particular
singular systems to be much different from regular systems.
A Sturm-Liouville system for which $p(x)$ is positive for $a < x < b$ but vanishes at
one or both ends is named a singular Sturm-Liouville system. These systems comprise
the differential equation 12.36, with $w(x)$ and $q(x)$ satisfying the same conditions as
for a regular system, and
- the solution is bounded for $a \le x \le b$;
- at an end point at which $p(x)$ does not vanish, $y(x)$ satisfies a boundary condition
  of the type 12.37.
The example of equation 12.19 shows that for some singular systems q(x) is unbounded
at the interval ends. The behaviour of q(x) is not, however, so important in determining
the behaviour of the eigenfunctions.
The third necessary extension is to systems defined on infinite or semi-infinite inter-
vals, which arise in many applications in quantum mechanics. We shall not deal with
these problems, but note that in many cases these systems behave like regular systems.
Exercise 12.14
Consider the eigenvalue problem
\[ \frac{d}{dx}\left( x^2 \frac{dy}{dx} \right) + \lambda y = 0, \quad 0 \le x \le 1, \quad y(1) = c \ge 0, \]
and with $y(x)$ bounded.
(a) Find the general solution of this equation and show that this problem has no
eigenvalues if $c = 0$ and infinitely many if $c > 0$.
(b) How does this problem change if the boundary conditions become
$y(a) = y(1) = 0$, $0 < a < 1$?
12.5 Second-order differential equations

In this section we describe the elementary properties of linear, second-order differential
equations, that give rise to some of the properties listed in the previous section. This
account merely highlights salient points of a theory that is far too extensive to do
it justice here. The interested reader should consult standard texts, such as Birkhoff
and Rota$^{9}$ (1962) and Ince$^{10}$ (1956). The equation we consider is the inhomogeneous
equation,
\[ p_2(x)\frac{d^2y}{dx^2} + p_1(x)\frac{dy}{dx} + p_0(x)y = h(x), \quad a \le x \le b, \tag{12.40} \]
where the coefficients $p_k(x)$, $k = 0, 1, 2$, are real and assumed to be continuous for
$x \in (a, b)$. The interval $(a, b)$ may be finite or infinite. The problem is to find the
functions y = f (x) satisfying this equation, that is the solutions, and to understand
their behaviour.
The nature of the solutions depends upon $p_2(x)$, the coefficient of $y''(x)$. The theory
is valid in intervals for which $p_2(x) \ne 0$ and for which $p_1/p_2$ and $p_0/p_2$ are continuous.
If $p_2(x) = 0$ at some point $x = c$ the equation is said to be singular at $x = c$, or to have
a singular point. Singular points, when they exist, always define the ends of intervals of
definition; hence we may always choose $p_2(x) \ge 0$ for $x \in [a, b]$. Singular equations are
important, as was seen in section 12.2, particularly equations 12.19 and 12.20, but a
proper treatment of these equations requires complex variable theory, and the examples
that occur are usually dealt with as individual cases: important examples give rise to
the Special Functions which are important in many applications of mathematics.
The homogeneous equation associated with equation 12.40 is obtained by setting
$h(x) = 0$,
\[ p_2(x)\frac{d^2y}{dx^2} + p_1(x)\frac{dy}{dx} + p_0(x)y = 0, \quad a \le x \le b. \tag{12.41} \]
All homogeneous equations have the trivial solution y(x) = 0, for all x. Solutions that
do not vanish identically are called nontrivial.
9 G Birkhoff and G-C Rota, 1962 Ordinary differential equations (Blaisdell Publishing Co.).
10 E L Ince, 1956 Ordinary differential equations (Dover).
Exercise 12.15
Consider the inhomogeneous equation $\dfrac{d^2y}{dx^2} + y = x$.
(a) Find the general solution of the homogeneous equation.
(b) Find any particular integral of the inhomogeneous equation, and hence find
its general solution.
The solutions of equations 12.40 and 12.41 satisfy the following properties.
P1: Solutions of the homogeneous equation satisfy the superposition principle:
that is if $f(x)$ and $g(x)$ are solutions of equation 12.41 then so is any linear
combination
\[ y(x) = c_1 f(x) + c_2 g(x) \]
where $c_1$ and $c_2$ are any constants.
P2: Uniqueness of the initial value problem. If $p_1/p_2$ and $p_0/p_2$ are continuous
for $x \in [a, b]$ then at most one solution of equation 12.40 can satisfy the given
initial conditions $y(a) = \alpha_0$, $y'(a) = \alpha_1$, see also section 3.5.2.
P3: If $f(x)$ and $g(x)$ are solutions of the homogeneous equation 12.41 and if, for
some $x = \xi$, the vectors $(f(\xi), f'(\xi))$ and $(g(\xi), g'(\xi))$ are linearly independent,
then every solution of equation 12.41 can be written as a linear combination of
$f(x)$ and $g(x)$,
\[ y(x) = c_1 f(x) + c_2 g(x). \]
The two functions $f(x)$ and $g(x)$ are said to form a basis of the differential
equation.
P4: The general solution of the inhomogeneous equation 12.40 is given by the
sum of any particular solution and the general solution of the homogeneous
equation 12.41.
Exercise 12.16
Use property P2 to show that if a nontrivial solution $y(x)$ of equation 12.41 is
zero at $x = \xi$, then $y'(\xi) \ne 0$, that is the zeros of the solutions are simple.
Exercise 12.17
Consider the two vectors $\mathbf{x} = (x_1, x_2)$ and $\mathbf{y} = (y_1, y_2)$ in the Cartesian plane.
Show that they are linearly independent, that is not parallel, if
\[ \begin{vmatrix} x_1 & x_2 \\ y_1 & y_2 \end{vmatrix} = x_1 y_2 - x_2 y_1 \ne 0. \]
12.5.1 The Wronskian

In property P3 we introduced the vectors $(f, f')$ and $(g, g')$ and in exercise 12.17 it was
shown that these vectors are linearly independent if
\[ W(f, g; x) = \begin{vmatrix} f(x) & f'(x) \\ g(x) & g'(x) \end{vmatrix} = f(x)g'(x) - f'(x)g(x) \ne 0. \tag{12.42} \]
The function $W(f, g; x)$ is named the Wronskian$^{11}$ of the functions $f(x)$ and $g(x)$.
This notation for the Wronskian shows which functions are used to construct it and the
independent variable; sometimes such detail is unnecessary so either of the notations
$W(x)$ or $W(f, g)$ is freely used.
If $W(f, g; x) \ne 0$ for $a < x < b$ the functions $f(x)$ and $g(x)$ are said to be linearly
independent in $(a, b)$; alternatively if $W(f, g; x) = 0$ they are linearly dependent. These
rules apply only to sufficiently smooth functions as the example of exercise 12.22 shows.
The Wronskian of any two solutions, $f$ and $g$, of equation 12.41 satisfies the identity
\[ W(f, g; x) = W(f, g; a) \exp\left( -\int_a^x dt\, \frac{p_1(t)}{p_2(t)} \right). \tag{12.43} \]
This identity is proved in exercise 12.23 by showing that W (x) satisfies a first-order
differential equation and solving it. Because the right-hand side of equation 12.43
always has the same sign, it follows that the Wronskian of two solutions is either always
positive, always negative or always zero. Thus, if f and g are linearly independent
at one point of the interval (a, b) they are linearly independent at all points of (a, b).
Conversely, if W (f, g) vanishes anywhere it vanishes everywhere.
The Wronskian can be used with one known solution to construct another. Suppose
that f (x) is a known solution and let g(x) be another (unknown) solution. The equation
for W (x) can be interpreted as a first-order equation for g,
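\[ g'f - gf' = W(x), \]
and, because $g'f - gf' = f^2 \dfrac{d}{dx}\left(\dfrac{g}{f}\right)$, this equation, with 12.43, can be written in the
form
\[ \frac{d}{dx}\left(\frac{g}{f}\right) = \frac{W(a)}{f(x)^2} \exp\left( -\int_a^x dt\, \frac{p_1(t)}{p_2(t)} \right) \]
having the general solution
\[ g(x) = f(x)\left\{ C + W(a)\int_a^x ds\, \frac{1}{f(s)^2} \exp\left( -\int_a^s dt\, \frac{p_1(t)}{p_2(t)} \right) \right\}, \tag{12.44} \]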
where C is an arbitrary constant.
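As a quick numerical check of equation 12.44, the sketch below takes the simplest case $y'' + y = 0$, for which $p_1 = 0$ and the exponential factor is unity, with the known solution $f(x) = \cos x$. The choices $a = 0$, $C = 0$ and $W(a) = 1$, and the helper names `simpson` and `second_solution`, are illustrative assumptions and not part of the text. With these choices the formula should reproduce $g(x) = \cos x \tan x = \sin x$.

```python
import math

def simpson(func, lo, hi, n=400):
    """Composite Simpson rule on [lo, hi] using an even number n of panels."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3

def second_solution(f, x, a=0.0, C=0.0, W_a=1.0):
    """Equation 12.44 in the special case p1 = 0, where the exponential equals 1."""
    return f(x) * (C + W_a * simpson(lambda s: 1.0 / f(s) ** 2, a, x))

# Compare the constructed solution with sin(x) at a few points where cos(x) != 0.
xs = [0.2, 0.5, 1.0]
errors = [abs(second_solution(math.cos, x) - math.sin(x)) for x in xs]
print(max(errors))
```

The interval is kept inside $0 \le x < \pi/2$ because the integrand $1/f(s)^2$ is singular at the zeros of $f$.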
Exercise 12.18
If F (z) is a differentiable function and g = F (f ), with f (x) a differentiable,
non-constant function of x, show that W (f, g) = 0 only if g(x) = cf (x) for any
constant c.
11 Josef Hoene (1778–1853) was born in Poland, moved to France and became a French citizen in
1800. He moved to Paris in 1810 and adopted the name Josef Hoene de Wronski at about that time,
just after he married.
Exercise 12.19
Show that the functions $a_1\sin x + a_2\cos x$ and $b_1\sin x + b_2\cos x$ are linearly
independent if $a_1b_2 \ne a_2b_1$.
Exercise 12.20
Use equation 12.44 to show that if $f(x)$ is any nontrivial solution of the equation
$y'' + q(x)y = 0$ for $a < x < b$, then another solution is
\[ g(x) = f(x)\int_a^x \frac{ds}{f(s)^2}. \]
Exercise 12.21
(a) If $f$ and $g$ are linearly independent solutions of the homogeneous differential
equation $y'' + p_1(x)y' + p_0(x)y = 0$, show that
\[ p_1(x) = -\frac{fg'' - gf''}{W(f, g; x)} \quad\text{and}\quad p_0(x) = \frac{f'g'' - g'f''}{W(f, g; x)}. \]
(b) Construct three linear, homogeneous, second-order differential equations having
the following bases of solutions:
(i) $(x, \sin x)$, (ii) $(x^a, x^b)$, (iii) $(x, e^{ax})$,
where $a$ and $b$ are distinct real numbers. Determine any singular points of these
equations.
Exercise 12.22
If $f(x) = x^3$ and $g(x) = |x|^3$ show that, (a) $W(f, g) = 0$ for all $x \ne 0$, and (b) that
the vectors $(f, f')$, $(g, g')$ are linearly independent for $-1 \le x \le 1$. Why does this
not contradict the properties stated after equation 12.43?
Exercise 12.23
Show that the Wronskian $W(f, g; x)$, where $f$ and $g$ are linearly independent
solutions of equation 12.41, satisfies the first-order differential equation
\[ \frac{dW}{dx} = -\frac{p_1(x)}{p_2(x)} W \]
and hence derive equation 12.43.
12.5.2 Separation and Comparison theorems

The Wronskian can be used to derive some useful properties about the positions of the
zeros of the solutions of the homogeneous equation 12.41. The theorems given here
were first discovered by Sturm: the first involves the relative positions of the zeros of
two linearly independent solutions, f (x) and g(x), of the homogeneous equation 12.41.
Since $W(f, g) \ne 0$, if $g(x) = 0$ at $x = c$, then
\[ W(f, g; c) = f(c)g'(c) \ne 0. \]
Hence $f(c) \ne 0$ and $g'(c) \ne 0$.

Now let $c$ and $d$ be two successive zeros of $g(x)$, so $g(c) = g(d) = 0$; then $f(c) \ne 0$
and $f(d) \ne 0$; also $g'(c)$ and $g'(d)$ must have different signs (because if $g(x)$ is increasing
at $x = c$ it must be decreasing at $x = d$, or vice-versa). Since $W(f, g; x)$ has constant
sign and
\[ W(c) = f(c)g'(c), \quad W(d) = f(d)g'(d), \]
it follows that $f(c)$ and $f(d)$ must have opposite signs. Hence $f(x)$ must have at least
one zero for $c < x < d$; two possible situations are shown in figure 12.4.
[Figure 12.4: two panels, each showing $f(x)$ and $g(x)$ plotted against $x$ between $x = c$ and $x = d$.]
Figure 12.4 Diagram showing the behaviour of $f(x)$ between two adjacent zeros of $g(x)$,
consistent with $W(f, g)$ not changing sign. Only the behaviour on the left-hand side is
actually possible, because we assume that $g(x) \ne 0$ for $c < x < d$, see text.
However, there can be only one zero of f (x) between adjacent zeros of g(x). Suppose
there are more: by reversing the roles of f and g we see that between two of the zeros
of f (x), there must be at least one zero of g(x), which contradicts the assumption that
c and d are adjacent zeros. Thus we have the following theorem.
Theorem 12.1
Sturm's separation theorem. If $f(x)$ and $g(x)$ are linearly independent solutions of
the second-order homogeneous equation
\[ p_2(x)\frac{d^2y}{dx^2} + p_1(x)\frac{dy}{dx} + p_0(x)y = 0, \quad a \le x \le b, \tag{12.45} \]
where $p_2(x) \ne 0$ for $x \in [a, b]$, then the zeros of $f(x)$ and $g(x)$ alternate in $(a, b)$.
A well known example of this theorem is the equation $y'' + y = 0$, on the whole real
line, which has the independent solutions $\sin x$ and $\cos x$ with the alternating zeros $n\pi$
and $(n + 1/2)\pi$, $n = 0, \pm 1, \pm 2, \ldots$, respectively. A less obvious consequence is that the
two functions
\[ f(x) = a_1\sin x + a_2\cos x \quad\text{and}\quad g(x) = b_1\sin x + b_2\cos x \]
have alternating zeros provided $a_1b_2 \ne a_2b_1$, see exercise 12.19.

Note that this theorem does not prove that the zeros exist. The equation $y'' - y = 0$,
with solutions $\sinh x$ and $\cosh x$, shows that zeros need not exist.
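The alternation of zeros asserted by the separation theorem can be checked numerically for two combinations of $\sin x$ and $\cos x$. The sketch below is only an illustration: the coefficients $(a_1, a_2) = (1, 2)$ and $(b_1, b_2) = (3, -1)$, the interval $(0, 10)$ and the helper name `zeros` are arbitrary assumptions, chosen so that $a_1b_2 \ne a_2b_1$.

```python
import math

def zeros(func, lo, hi, steps=2000):
    """Locate simple zeros of func in (lo, hi) by a sign scan followed by bisection."""
    found = []
    h = (hi - lo) / steps
    for k in range(steps):
        left, right = lo + k * h, lo + (k + 1) * h
        if func(left) * func(right) < 0:
            for _ in range(60):
                mid = 0.5 * (left + right)
                if func(left) * func(mid) <= 0:
                    right = mid
                else:
                    left = mid
            found.append(0.5 * (left + right))
    return found

f = lambda x: 1.0 * math.sin(x) + 2.0 * math.cos(x)   # a1 = 1, a2 = 2
g = lambda x: 3.0 * math.sin(x) - 1.0 * math.cos(x)   # b1 = 3, b2 = -1

# Merge the two sets of zeros and record which function each one belongs to.
labelled = sorted([(z, "f") for z in zeros(f, 0.0, 10.0)] +
                  [(z, "g") for z in zeros(g, 0.0, 10.0)])
labels = [name for _, name in labelled]
alternate = all(u != v for u, v in zip(labels, labels[1:]))
print(labels, alternate)
```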
The next theorem is more useful and in some circumstances can be used to show that
zeros exist and also to give their approximate positions. This is Sturm's comparison
theorem, which we first state, then prove.
Theorem 12.2
Sturm's comparison theorem. Let $y_1(x)$ and $y_2(x)$ be, respectively, nontrivial
solutions of the differential equations
\[ \frac{d^2y}{dx^2} + Q_1(x)y = 0 \quad\text{and}\quad \frac{d^2y}{dx^2} + Q_2(x)y = 0 \tag{12.46} \]
on an interval $(a, b)$ and assume that $Q_1(x) \ge Q_2(x)$ everywhere in this interval. Then
between any two zeros of $y_2(x)$ there is at least one zero of $y_1(x)$, unless $Q_1(x) = Q_2(x)$
everywhere and $y_1$ is a constant multiple of $y_2$.
A simple example of this theorem is the equation $y'' + \omega^2 y = 0$, with solution $\sin\omega x$
having zeros at $n\pi/\omega$, equally spaced, a distance $\pi/\omega$ apart. Hence for the two equations
with $\omega = \omega_2$ and $\omega = \omega_1 > \omega_2$ there must be at least one zero of $\sin\omega_1 x$ between
adjacent zeros of $\sin\omega_2 x$.
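This example is easily checked by direct counting. In the sketch below the frequencies $\omega_1 = 3.7$ and $\omega_2 = 2$, the range $0 < x \le 20$ and the helper name `zeros_of_sin` are illustrative assumptions; the code counts the zeros of $\sin\omega_1 x$ lying strictly between each pair of adjacent zeros of $\sin\omega_2 x$.

```python
import math

def zeros_of_sin(omega, x_max):
    """Zeros n*pi/omega of sin(omega*x) in the interval (0, x_max]."""
    n_max = int(math.floor(omega * x_max / math.pi))
    return [n * math.pi / omega for n in range(1, n_max + 1)]

omega1, omega2 = 3.7, 2.0          # omega1 > omega2, so Q1 = omega1**2 > Q2 = omega2**2
z1 = zeros_of_sin(omega1, 20.0)
z2 = zeros_of_sin(omega2, 20.0)

# Number of zeros of sin(omega1*x) strictly between adjacent zeros of sin(omega2*x).
counts = [sum(1 for z in z1 if c < z < d) for c, d in zip(z2, z2[1:])]
print(min(counts), max(counts))
```

Because the spacing $\pi/\omega_1$ is smaller than $\pi/\omega_2$, every count is at least one, in agreement with the theorem.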
Proof of the comparison theorem
The following proof depends upon the properties of the Wronskian. If $x = c$ and $x = d$
are adjacent zeros of $y_2(x)$, with $c < d$, suppose that $y_1(x) \ne 0$ for $c \le x \le d$. We may
assume that both $y_1(x)$ and $y_2(x)$ are positive in $(c, d)$. Then
\[ \begin{aligned} W(y_1, y_2; c) &= y_1(c)y_2'(c) > 0, \quad \text{since } y_2'(c) > 0, \\ W(y_1, y_2; d) &= y_1(d)y_2'(d) < 0, \quad \text{since } y_2'(d) < 0. \end{aligned} \tag{12.47} \]
But
\[ \frac{dW}{dx} = \frac{d}{dx}\left( y_1y_2' - y_1'y_2 \right) = y_1y_2'' - y_1''y_2 \]
and, on using the differential equations 12.46 defining $y_1$ and $y_2$, this simplifies to
\[ \frac{dW}{dx} = \bigl( Q_1(x) - Q_2(x) \bigr) y_1(x)y_2(x) \ge 0, \quad c \le x \le d. \]
It follows that if $Q_1(x) > Q_2(x)$, $W(y_1, y_2; x)$ is a monotonic increasing function of $x$,
so that $W(c) \le W(d)$, which contradicts equation 12.47. Thus we must have $y_1(d) < 0$
and hence $y_1(x)$ must have at least one zero in $(c, d)$.
Further, if $Q_1 = Q_2$ the separation theorem implies that there is one zero unless $y_1$
and $y_2$ are linearly dependent, that is $y_2(x)$ is a multiple of $y_1(x)$.
Applications of the comparison theorem
The equation $y'' + Q(x)y = 0$, $Q(x) \le 0$
The first important result that follows from this is that every nontrivial solution of
\[ \frac{d^2y}{dx^2} + Q(x)y = 0 \tag{12.48} \]
has at most one zero in any interval where $Q(x) \le 0$.
The proof is by contradiction. A solution of $y'' = 0$ (that is, $Q_1(x) = 0$) is $y_1(x) = 1$.
If a solution of 12.48 has two zeros in a region where $Q_2 \le Q_1 = 0$, then $y_1(x)$ would
have at least one zero in between, which is a contradiction.
The elementary equation $y'' - y = 0$, with the two sets of linearly independent
solutions, $\{\cosh x, \sinh x\}$ and $\{e^x, e^{-x}\}$, illustrates this result: only the second member
of the first pair, $\sinh x$, has a zero.
Bessel functions
The comparison theorem can sometimes be applied to obtain useful properties of
solutions. For instance the equation for an ordinary Bessel function of order $\nu$ is
\[ \frac{d^2y}{dx^2} + \frac{1}{x}\frac{dy}{dx} + \left( 1 - \frac{\nu^2}{x^2} \right) y = 0, \tag{12.49} \]
which can be written in the normal form, exercise 12.1 (page 477),
\[ \frac{d^2u}{dx^2} + \left( 1 + \frac{1 - 4\nu^2}{4x^2} \right) u = 0 \quad\text{where}\quad y(x) = \frac{u(x)}{\sqrt{x}}, \quad x > 0. \tag{12.50} \]
If $\nu < 1/2$ the function $Q_1(x) = 1 + \dfrac{1 - 4\nu^2}{4x^2} > 1$ so a suitable comparison equation is
$v'' + v = 0$, that is $Q_2 = 1 < Q_1$. A solution of the comparison equation is $v = \sin x$,
with positive zeros at $x = n\pi$, $n = 1, 2, \ldots$. Hence $u(x)$ has at least one zero in each
of the intervals $(n\pi, (n + 1)\pi)$, $n = 1, 2, \ldots$.
If $\nu > 1/2$ we can show that the solution has an infinity of positive zeros. In this case
$Q_1(x) = 1 - (4\nu^2 - 1)/(4x^2) < 1$, so we take the comparison equation to be $v'' + \omega^2 v = 0$,
with $0 < \omega < 1$: then for $x > x_0(\omega)$, where $Q_1(x_0) = \omega^2$, $Q_1(x) > Q_2 = \omega^2$, and
the comparison theorem shows that there is at least one zero of $u(x)$ in each interval
$(n\pi/\omega, (n + 1)\pi/\omega)$, with $n\pi/\omega > x_0$; as $x \to \infty$, we may choose $\omega$ close to unity.
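The statement for order less than $1/2$ can be illustrated with $\nu = 0$, for which $u(x) = \sqrt{x}\,J_0(x)$. The sketch below evaluates $J_0$ from its power series (adequate for the moderate values of $x$ used here) and counts sign changes of $u$ in each interval $(n\pi, (n+1)\pi)$, $n = 1, \ldots, 4$. The function names, the choice $\nu = 0$ and the range of $n$ are assumptions made only for this illustration.

```python
import math

def bessel_j0(x, terms=60):
    """Power series J_0(x) = sum_k (-1)^k (x/2)^(2k) / (k!)^2, for moderate x."""
    term, total = 1.0, 1.0
    q = (x / 2.0) ** 2
    for k in range(1, terms):
        term *= -q / (k * k)
        total += term
    return total

def count_sign_changes(func, lo, hi, steps=400):
    """Count sign changes of func on a uniform grid covering [lo, hi]."""
    h = (hi - lo) / steps
    values = [func(lo + k * h) for k in range(steps + 1)]
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)

u = lambda x: math.sqrt(x) * bessel_j0(x)   # normal-form solution for order 0

counts = [count_sign_changes(u, n * math.pi, (n + 1) * math.pi) for n in range(1, 5)]
print(counts)
```

Each entry of `counts` should be at least one, as the comparison with $v'' + v = 0$ requires.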
We end this section by quoting, without proof, a more general comparison theorem,
needed later to obtain approximate positions of the zeros of an eigenfunction. The proof
of this theorem may be found in Birkhoff and Rota (1962, chapter 10).
Theorem 12.3
Sturm's comparison theorem II. For the differential equations
\[ \frac{d}{dx}\left( p_1(x)\frac{dy}{dx} \right) + Q_1(x)y = 0 \quad\text{and}\quad \frac{d}{dx}\left( p_2(x)\frac{dy}{dx} \right) + Q_2(x)y = 0, \quad a \le x \le b, \]
where $p_2(x) \ge p_1(x)$ and $Q_2(x) \le Q_1(x)$ for $x \in (a, b)$, then if $y_1(x)$ is a solution of the
first equation and $y_2(x)$ any solution of the second equation, between any two adjacent
zeros of $y_2$ there lies at least one zero of $y_1$, except if $p_1 = p_2$, $Q_1 = Q_2$, for all $x \in [a, b]$,
and $y_1$ is a constant multiple of $y_2$.
A shorter, approximate, easy to remember version is that as $Q(x)$ increases and/or $p(x)$
decreases, the number of zeros of every solution increases.
The first comparison theorem is a direct consequence of this theorem. These
theorems can be used to show that for a regular Sturm-Liouville system, provided the
eigenfunctions $y_n(x)$ exist and the eigenvalues satisfy $\lambda_1 < \lambda_2 < \cdots < \lambda_n < \lambda_{n+1} < \cdots$,
then the zeros of $y_n(x)$ interlace and that $y_n(x)$ has $n - 1$ zeros in $(a, b)$. We outline a
proof that these eigenfunctions exist in section 12.5.4.
Exercise 12.24
Use the Liouville normal form found in exercise 12.3 (page 478) and the comparison
theorem to show that there is a lower bound on the eigenvalues of a regular Sturm-
Liouville system with the boundary conditions y(a) = y(b) = 0.
Exercise 12.25
(a) Show that every solution of the Airy equation $y'' + xy = 0$ vanishes infinitely
often for $x > 1$ and at most once for $x < 0$.
(b) Show that if $y(x)$ satisfies Airy's equation, then $v(x) = y(ax)$ satisfies the
equation $v'' + a^3xv = 0$.
(c) Show that the Sturm-Liouville system $y'' + \lambda xy = 0$, $y(0) = y(1) = 0$, has an
infinite sequence of positive eigenvalues and no negative eigenvalues.
12.5.3 Self-adjoint operators

The eigenvalues of a Sturm-Liouville system are real and the eigenfunctions of most
systems are orthogonal. These two important properties follow directly from the form
of the real, differential operator,

\[ Lf = \frac{d}{dx}\left( p(x)\frac{df}{dx} \right) + q(x)f, \quad a \le x \le b, \tag{12.51} \]
which defines the Sturm-Liouville equation.

The first result we need is Lagrange's identity,
\[ v(Lu) - u(Lv) = \frac{d}{dx}\left[ p(x)\left( v\frac{du}{dx} - u\frac{dv}{dx} \right) \right] \tag{12.52} \]
where $u$ and $v$ are any, possibly complex, functions for which both sides of the identity
exist.
Exercise 12.26
Prove Lagrange's identity, equation 12.52.
Using the inner product notation, with unit weight function$^{12}$,
\[ (f, g) = \int_a^b dx\, f(x)^* g(x), \]
Lagrange's identity can be written in the form
\[ (Lu, v) - (u, Lv) = \left[ p(x)\left( v(x)\frac{du^*}{dx} - u(x)^*\frac{dv}{dx} \right) \right]_a^b. \tag{12.53} \]
For some boundary conditions the right-hand side of this equation is zero and then
\[ (Lu, v) = (u, Lv). \tag{12.54} \]

12 There is no agreed version of the inner product notation. That adopted here is normally used in
physics, particularly in quantum mechanics, but in mathematics texts the integrand is often taken to
be $f(x)g(x)^*$. Provided one definition is used consistently the difference is immaterial.
In this case the operator and the boundary conditions are said to be self-adjoint. It is
important to note that a differential operator cannot be self-adjoint without appropriate
boundary conditions.
For the homogeneous, separated boundary conditions defined in equation 12.37
(page 494) we have, since $A_1$ and $A_2$ are real, and assuming $A_2 \ne 0$,
\[ \left. \begin{array}{l} A_1u(a) + A_2u'(a) = 0 \\ A_1v(a) + A_2v'(a) = 0 \end{array} \right\} \implies \frac{u^{*\prime}(a)}{u^*(a)} = \frac{v'(a)}{v(a)}. \]
This shows that the boundary term of equation 12.53 is zero at $x = a$; a similar analysis
shows it to be zero at $x = b$. If $A_2 = 0$ then $u(a) = v(a) = 0$ and the same result follows.
For a singular system, if $p(a) = 0$ the boundary term at $x = a$ is clearly zero. Thus
for regular and singular systems $(Lu, v) = (u, Lv)$ and the operator $L$ is self-adjoint.
Periodic boundary conditions also make the system self-adjoint, as shown in the next
exercise.
Exercise 12.27
Prove that if the boundary conditions are periodic, $y(a) = y(b)$ and $y'(a) = y'(b)$
and $p(a) = p(b)$, then $L$ is self-adjoint.
Note: periodic boundary conditions are examples of mixed boundary conditions
in which the values of the function, and possibly its derivative, at the two ends of
the range are non-trivially related. Normally mixed boundary conditions produce
operators that are not self-adjoint, exercise 12.30.
Exercise 12.28
In this chapter the operators considered are real but complex operators are often
useful.
Show that on the space of differentiable functions for which $\int_{-\infty}^{\infty} dx\, |u(x)|^2$ exists
the real operator $L = \dfrac{d}{dx}$ is not self-adjoint, but that the complex operator $iL$
is self-adjoint.
In this example there are no boundary conditions: the condition that the integral
$\int_{-\infty}^{\infty} dx\, |u(x)|^2$ exists means that $|u| \to 0$ as $x \to \pm\infty$ and this plays the role of
the boundary conditions.
Exercise 12.29
Show that the operator $L$ defined by
\[ Ly = \frac{d^2y}{dx^2} + \lambda y = 0, \quad y(0) = A, \quad y'(\pi) = B, \]
where $\lambda$, $A$ and $B$ are nonzero constants, is not self-adjoint. This exercise shows
why the boundary conditions need to be homogeneous.
Exercise 12.30
Show that the system $Ly = y'' + \lambda y = 0$, with the mixed boundary conditions,
$y(0) = 0$, $y(\pi) = ay'(0)$, $a \ne 0$, is not self-adjoint.
Note in exercise 12.13 it was shown that some of the eigenvalues of this system
are complex and that the eigenfunctions are not orthogonal.
The eigenvalues of a self-adjoint operator are real

If $\phi(x)$ is an eigenfunction corresponding to an eigenvalue $\lambda$, then $L\phi = -\lambda w\phi$ and
\[ (L\phi, \phi) = -(\lambda w\phi, \phi) = -\lambda^*(w\phi, \phi). \]
Also
\[ (\phi, L\phi) = -(\phi, \lambda w\phi) = -\lambda(\phi, w\phi) \]
and hence, since $w(x)$ is real,
\[ 0 = (L\phi, \phi) - (\phi, L\phi) = (\lambda - \lambda^*)\int_a^b dx\, w(x)|\phi(x)|^2. \]
Since $w(x) > 0$ for almost all $x$, so that $(\phi, \phi)_w > 0$, the right-hand side can be zero only if
$\lambda = \lambda^*$, that is the eigenvalues of a Sturm-Liouville system are real: this proof is valid
for regular and singular systems and if the boundary conditions are periodic.
The eigenfunctions are orthogonal
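Now consider two eigenfunctions $\phi(x)$ and $\psi(x)$ corresponding to distinct eigenvalues
$\lambda$ and $\mu$, respectively, that is $L\phi = -\lambda w\phi$ and $L\psi = -\mu w\psi$. By the self-adjoint property
\[ \begin{aligned} 0 = (L\phi, \psi) - (\phi, L\psi) &= -\lambda(\phi, \psi)_w + \mu(\phi, \psi)_w \\ &= (\mu - \lambda)\int_a^b dx\, w(x)\phi(x)^*\psi(x). \end{aligned} \]
Since we have assumed that $\lambda \ne \mu$ it follows that
\[ (\phi, \psi)_w = \int_a^b dx\, w(x)\phi(x)^*\psi(x) = 0. \tag{12.55} \]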
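The orthogonality relation 12.55 can be illustrated with the simplest regular system, $y'' + \lambda y = 0$ with $y(0) = y(\pi) = 0$, whose eigenfunctions are $\sin nx$ with unit weight. The sketch below is an illustration under these assumptions (the function name `weighted_inner` and the Simpson quadrature are choices made here, and only real functions are handled); it builds the matrix of weighted inner products for $n = 1, 2, 3$.

```python
import math

def weighted_inner(phi, psi, w, a, b, n=2000):
    """Simpson approximation to the integral of w(x)*phi(x)*psi(x) over [a, b] (real functions)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n + 1):
        x = a + k * h
        coeff = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += coeff * w(x) * phi(x) * psi(x)
    return total * h / 3

def eigenfunction(n):
    """Eigenfunction sin(n x) of y'' + lambda*y = 0, y(0) = y(pi) = 0."""
    return lambda x: math.sin(n * x)

unit_weight = lambda x: 1.0
orders = (1, 2, 3)
gram = [[weighted_inner(eigenfunction(m), eigenfunction(n), unit_weight, 0.0, math.pi)
         for n in orders] for m in orders]
for row in gram:
    print(["%.6f" % value for value in row])
```

The off-diagonal entries vanish to quadrature accuracy, while each diagonal entry is $\pi/2$, the squared norm of $\sin nx$ on $(0, \pi)$.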
12.5.4 The oscillation theorem

In this optional section we provide a brief outline of a proof that a regular Sturm-
Liouville system possesses a countable infinity of eigenfunctions. The final result is
summarised in the following theorem, which is a consequence of the oscillation theorem,
theorem 12.5. In the remainder of this section we describe the ideas behind the proof
of the oscillation theorem: rigorous details may be found in Birkhoff and Rota (1962,
chapter 10).
Theorem 12.4
The regular Sturm-Liouville system
\[ \frac{d}{dx}\left( p\frac{dy}{dx} \right) + Q(x)y = 0, \quad Q(x) = q(x) + \lambda w(x), \quad a \le x \le b, \tag{12.56} \]
with the separated boundary conditions
\[ A_1y(a) + A_2y'(a) = 0 \quad\text{and}\quad B_1y(b) + B_2y'(b) = 0 \tag{12.57} \]
has an infinite sequence of real eigenvalues $\lambda_1 < \lambda_2 < \cdots < \lambda_n < \lambda_{n+1} < \cdots$ with
$\lim_{n\to\infty}\lambda_n = \infty$. The eigenfunction $y_n(x)$ belonging to the eigenvalue $\lambda_n$ has exactly
$n - 1$ zeros in the interval $a < x < b$ and is determined uniquely up to a constant
multiplicative factor.
The main idea behind the proof outlined here is the Prüfer substitution, named after
the German mathematician Heinz Prüfer (1896–1934); this involves using polar
coordinates in the Cartesian plane having coordinates $(py', y)$ to understand how the solution
behaves. Two new dependent variables $(r(x), \theta(x))$ are defined by the relations
\[ p(x)y' = r\cos\theta \quad\text{and}\quad y = r\sin\theta \tag{12.58} \]
so that
\[ r^2 = y^2 + p^2y'^2 \quad\text{and}\quad \tan\theta = \frac{y}{py'}. \tag{12.59} \]
Since $y$ and $y'$ cannot simultaneously be zero, $r > 0$. Notice that $y(x) = 0$ when
$\theta(x) = n\pi$, where $n$ is an integer.
First we need the differential equations for $r$ and $\theta$. Differentiating the equation for
$\tan\theta$ gives
\[ \frac{1}{\cos^2\theta}\frac{d\theta}{dx} = \frac{1}{p} - \frac{y(py')'}{(py')^2} = \frac{1}{p} + Q\left( \frac{y}{py'} \right)^2 \]
where we have used the relation $(py')' = -Qy$. Multiplying by $\cos^2\theta$ gives
\[ \frac{d\theta}{dx} = Q(x)\sin^2\theta + \frac{1}{p(x)}\cos^2\theta, \quad Q = q(x) + \lambda w(x). \tag{12.60} \]
This first-order equation for $\theta$ is independent of $r$, and provided $p(x) \ne 0$, it has a
unique solution for every initial value of $\theta$, that is $\theta(a)$. Further it can be shown that
the solution $\theta(x, \lambda)$ is a continuous function of $x$ and $\lambda$ in the intervals $a \le x \le b$ and
$-\infty < \lambda < \infty$.
The equation for $r$ is found by differentiating the equation for $r^2$ and then using the
original equation
\[ r\frac{dr}{dx} = y\frac{dy}{dx} + py'\frac{d(py')}{dx} = \frac{r^2}{p}\sin\theta\cos\theta - Qr^2\sin\theta\cos\theta. \]
Hence
\[ \frac{dr}{dx} = \frac{1}{2}\left( \frac{1}{p(x)} - Q(x) \right) r\sin 2\theta. \tag{12.61} \]
The two equations 12.60 and 12.61 are equivalent to the original differential equation
and are named the Prüfer system associated with the self-adjoint equation 12.36.
The equation for $r$ can be expressed as an integral
\[ r(x) = r(a)\exp\left\{ \frac{1}{2}\int_a^x dt \left( \frac{1}{p(t)} - Q(t) \right)\sin 2\theta(t) \right\}, \tag{12.62} \]
which can be evaluated once $\theta(x)$ is known; however, we shall not need this equation.
Notice that because the original equation for $y$ is homogeneous the magnitude of $r(x)$
is unimportant, which is why $r(x)$ depends linearly upon $r(a)$.
The solution of equation 12.60 for $\theta(x)$ depends only upon the initial conditions,
that is the boundary condition $A_1y(a) + A_2y'(a) = 0$, which gives
\[ \tan\theta_a = -\frac{A_2}{A_1p(a)} \quad\text{with}\quad 0 \le \theta_a < \pi, \tag{12.63} \]
and with $\theta_a = \pi/2$ if $A_1 = 0$. The eigenvalues are given by those values of $\lambda$ for which
$\theta_b = \theta(b, \lambda)$ satisfies the equation $\tan\theta_b = -B_2/(B_1p(b))$. However, here the main
objective is not to find the eigenvalues but to first determine that they exist and second
to determine some of their properties, and for this only the initial condition is required.
It is necessary to understand how $\theta(x, \lambda)$ behaves as a function of $x$ and $\lambda$; this
behaviour is summarised in the following theorem which is proved rigorously in Birkhoff
and Rota (1962, chapter 10).
Theorem 12.5
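The oscillation theorem. The solution of the differential equation 12.60 satisfying
the initial condition $\theta(a, \lambda) = \theta_a < \pi$, for all $\lambda$, is a continuous and strictly monotonic
function of $\lambda$ for fixed $x$ on the interval $a < x \le b$. Also
\[ \lim_{\lambda\to\infty}\theta(x, \lambda) = \infty \quad\text{and}\quad \lim_{\lambda\to-\infty}\theta(x, \lambda) = 0 \quad\text{for } a < x \le b. \]
This theorem shows that $y(b, \lambda) = r(b)\sin\theta(b, \lambda)$ has infinitely many zeros for $\lambda > 0$,
and hence that there are infinitely many eigenfunctions.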
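The monotonic dependence of $\theta(b, \lambda)$ on $\lambda$ suggests a simple shooting method for the eigenvalues: integrate equation 12.60 from $x = a$ and adjust $\lambda$ by bisection until $\theta(b, \lambda)$ takes the value fixed by the boundary condition at $x = b$. The sketch below is one possible illustration, not a method taken from the text. It assumes the system $y'' + \lambda y = 0$, $y(0) = y(\pi) = 0$ (so $p = w = 1$, $q = 0$, $\theta_a = 0$ and the target is $\theta(\pi, \lambda) = n\pi$), a classical fourth-order Runge-Kutta integrator, and the hypothetical function names `theta_end` and `eigenvalue`.

```python
import math

def theta_end(lam, p=lambda x: 1.0, q=lambda x: 0.0, w=lambda x: 1.0,
              a=0.0, b=math.pi, theta_a=0.0, steps=1000):
    """Integrate d(theta)/dx = Q sin^2(theta) + cos^2(theta)/p, Q = q + lam*w, by RK4."""
    def rhs(x, th):
        Q = q(x) + lam * w(x)
        return Q * math.sin(th) ** 2 + math.cos(th) ** 2 / p(x)
    h = (b - a) / steps
    x, th = a, theta_a
    for _ in range(steps):
        k1 = rhs(x, th)
        k2 = rhs(x + h / 2, th + h * k1 / 2)
        k3 = rhs(x + h / 2, th + h * k2 / 2)
        k4 = rhs(x + h, th + h * k3)
        th += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return th

def eigenvalue(n, lo=0.0, hi=100.0, iterations=40):
    """Bisect on lam for theta(b, lam) = n*pi, relying on monotonicity in lam."""
    target = n * math.pi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if theta_end(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eigs = [eigenvalue(n) for n in (1, 2, 3)]
print(eigs)
```

For this system the computed values should be close to the exact eigenvalues $\lambda_n = n^2$. Other regular systems can be tried by passing different `p`, `q`, `w` and end points, with the bisection bracket adjusted accordingly.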
In order to understand why (x, ) behaves in the manner described in theorem 12.5
we consider two specific examples. The first is a very simple system with known eigen-
functions; the second example is sufficiently general to contain all the essential features
of the general case.
The first system is
\[ \frac{d^2y}{dx^2} + \lambda y = 0, \quad 0 \le x \le \pi, \tag{12.64} \]
and here $p = 1$ and $Q = \lambda$, so the equation 12.60 for $\theta$ is
\[ \frac{d\theta}{dx} = \cos^2\theta + \lambda\sin^2\theta, \quad \theta(0) = \theta_0. \]
This equation is particularly simple because the right-hand side is independent of $x$, so
it can be integrated directly, to give
\[ x(\theta) = \int_{\theta_0}^{\theta} d\phi\, \frac{1}{\cos^2\phi + \lambda\sin^2\phi}. \tag{12.65} \]
However, this means that it is unrepresentative, which is why another example is
considered after the following discussion. We now deduce the qualitative behaviour of the
function $\theta(x)$ from this integral.
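If $\lambda > 0$, $\theta'(x) > 0$ and $\theta(x)$ is a monotonic increasing function of $x$; the larger $\lambda$ the
greater the rate of increase of $\theta(x, \lambda)$. In particular $\theta(\pi, \lambda)$ is an increasing function of
$\lambda$: this is clear from the integral 12.65 because the integrand is positive and for most
values of the integration variable $\phi$ a decreasing function of $\lambda$. Thus for a given value of $x$ the
upper limit, $\theta(x)$, must increase as $\lambda$ increases to compensate for the decreasing magnitude of the
integrand, see exercise 12.31.
If $\lambda < 0$, then $\theta(x) > 0$ tends to a constant value as $x \to \infty$. To see this observe
that $\theta'(x) = 0$ when $\theta = \theta_c$ and $\pi - \theta_c$ where $0 < \theta_c = \tan^{-1}(1/\sqrt{-\lambda}) < \pi/2$, and thus,
- if $\theta_0 = \theta_c$ then $\theta(x) = \theta_c$ for all $x$; this solution is stable;
- if $\theta_0 = \pi - \theta_c$ then $\theta(x) = \pi - \theta_c$ for all $x$; this solution is unstable;
- if $0 \le \theta_0 < \theta_c$, then $\theta'(0) > 0$ and $\theta(x)$ increases monotonically to $\theta_c$ as $x \to \infty$;
- if $\theta_c < \theta_0 < \pi - \theta_c$, then $\theta'(0) < 0$ and $\theta(x)$ decreases monotonically to $\theta_c$ as
  $x \to \infty$;
- if $\pi - \theta_c < \theta_0 < \pi$ then $\theta'(0) > 0$ and $\theta(x)$ increases monotonically to $\theta_c + \pi$
  as $x \to \infty$.
This behaviour is shown graphically in figure 12.5, where $\lambda = -1/4$, which gives
$\theta_c = 1.107$ and graphs of $\theta(x, \lambda)$ are shown for various initial conditions. Figure 12.6
shows the graphs of $\theta(x, \lambda)$, with the same initial condition $\theta_0 = 0.6$, but various values
of $\lambda$. Since $\theta_c$ depends upon $\lambda$, $\theta'(0) > 0$ for $\lambda > -2.14$, and $\theta'(0) < 0$ for $\lambda < -2.14$.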
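[Figures 12.5 and 12.6: graphs of $\theta(x)$ against $x/\pi$ for $0 \le x/\pi \le 1$. In figure 12.5 the levels $\theta_c$, $\pi - \theta_c$ and $\pi + \theta_c$ are marked; in figure 12.6 the curves are labelled $\lambda = 0, -0.1, -0.5, -1.0, -1.5, -2.0, -5.0$.]
Figure 12.5 Graphs of $\theta(x)$ for $\lambda = -1/4$ and various initial conditions.
Figure 12.6 Graphs of $\theta(x)$, for the initial condition $\theta_0 = 0.6$, and various negative $\lambda$.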
It is clear from these graphs that there can be at most one negative eigenvalue. For the
parameters of figure 12.6, $\theta_0 = 0.6$, $\theta(\pi, \lambda)$ varies between $0$ and $\tan^{-1}(\pi + \tan\theta_0) = 1.315$,
as $\lambda$ increases from $-\infty$ to $0$: if the boundary condition $\theta_b$, at $x = \pi$, lies in this
range there will be a single eigenvalue for some negative $\lambda$. Otherwise there will be no
negative eigenvalue.
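The numerical values quoted for this example can be reproduced directly. The sketch below is an illustrative check: the function name `theta_at`, the Runge-Kutta integrator and the final integration length $x = 20$ are assumptions made here. It evaluates $\theta_c$ for $\lambda = -1/4$, the value of $\lambda$ at which $\theta'(0)$ changes sign when $\theta_0 = 0.6$, the value of $\theta(\pi, 0)$, and the approach of $\theta(x)$ to $\theta_c$ for large $x$.

```python
import math

def theta_at(x_end, lam, theta0, steps=4000):
    """RK4 solution of d(theta)/dx = cos^2(theta) + lam*sin^2(theta) from x = 0 to x_end."""
    rhs = lambda th: math.cos(th) ** 2 + lam * math.sin(th) ** 2
    h = x_end / steps
    th = theta0
    for _ in range(steps):
        k1 = rhs(th)
        k2 = rhs(th + h * k1 / 2)
        k3 = rhs(th + h * k2 / 2)
        k4 = rhs(th + h * k3)
        th += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return th

theta_c = math.atan(1.0 / math.sqrt(0.25))        # fixed point for lam = -1/4
lam_star = -1.0 / math.tan(0.6) ** 2              # theta'(0) = 0 when theta0 = 0.6
theta_pi_lam0 = theta_at(math.pi, 0.0, 0.6)       # upper end of the range of theta(pi, lam)
theta_far = theta_at(20.0, -0.25, 0.6)            # long-range behaviour for lam = -1/4

print(round(theta_c, 3), round(lam_star, 2), round(theta_pi_lam0, 3), round(theta_far, 3))
```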
Now restrict attention to the case $\lambda > 0$, where $\theta(x, \lambda)$ increases with $x$, for fixed
$\lambda$, and with $\lambda$ for fixed $x$. Graphs of $\theta(x, \lambda)$ for $0 \le x \le \pi$ and various values of $\lambda$ are
shown in figure 12.7.
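[Figure 12.7: graphs of $\theta(x)$ against $x/\pi$ for $0 \le x/\pi \le 1$, with curves labelled $\lambda = 1, 10, 30, 40, 50, 100, 120, 150, 250$.]
Figure 12.7 Some representative graphs of $\theta(x)$, defined by equation 12.65
with $\theta_0 = 0$, for various values of $\lambda$. Using the integral 12.65 it can be shown
that if $\lambda \gg 1$, $\theta(x) \simeq x\sqrt{\lambda}$.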
The following exercise uses the integral 12.65 to deduce some properties of $\theta(x, \lambda)$ for
the differential equation 12.64 with the boundary conditions $y(0) = y(\pi) = 0$.
Exercise 12.31
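(a) For the boundary value problem
\[ y'' + \lambda y = 0, \quad y(0) = y(\pi) = 0, \]
show that $\theta(0, \lambda) = 0$ and $\theta(\pi, \lambda) = n\pi$, for some positive integer $n$. Use
equation 12.65 to deduce that the value of $\lambda$ satisfying this last equation is $\lambda = n^2$.
Deduce that the $n$th eigenvalue is $\lambda_n = n^2$ and show that its eigenfunction has
$n - 1$ zeros in the interval $0 < x < \pi$.
(b) If $\theta(\pi, \lambda) = \Theta(\lambda)$ show that
\[ \frac{d\Theta}{d\lambda} = \left( \cos^2\Theta + \lambda\sin^2\Theta \right)\int_0^{\Theta} d\phi\, \frac{\sin^2\phi}{(\cos^2\phi + \lambda\sin^2\phi)^2} > 0. \]
(c) Show that $\lim_{\lambda\to\infty}\theta(\pi, \lambda) = \infty$.
For part (a) you will need the integral
\[ \int_0^{\pi/2} d\phi\, \frac{1}{a^2\cos^2\phi + b^2\sin^2\phi} = \frac{\pi}{2ab}, \quad a > 0, \quad b > 0. \]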
Now consider a slightly different, but more typical problem, for which there is no simple
formula for $\theta(x)$. Consider the eigenvalue problem
\[ \frac{d^2y}{dx^2} + \lambda xy = 0, \quad y(0) = y(1) = 0, \tag{12.66} \]
also treated in exercise 12.25. In this example $p = 1$ and $Q = \lambda x$, so the equation for $\theta$
is
\[ \frac{d\theta}{dx} = \cos^2\theta + \lambda x\sin^2\theta, \quad \theta(0) = 0, \quad 0 \le x \le 1. \tag{12.67} \]
If $\lambda > 0$, $\theta'(x) > 0$ and, as before, $\theta(x)$ is a monotonic increasing function of $x$, with a
greater rate of increase the larger $\lambda$. Further if $\lambda_2 > \lambda_1$, $\theta(x, \lambda_2) \ge \theta(x, \lambda_1)$, as shown by
an application of the theorem for first-order equations quoted in exercise 12.32. Thus
for $\lambda > 0$ there is little qualitative difference between this and the previous simpler
example; some representative graphs of $\theta(x, \lambda)$ are depicted in figure 12.8.
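[Figure 12.8: curves of $\theta(x)$ (vertical axis, 0 to 10) against $x$ (horizontal axis, 0 to 1), labelled $\lambda = 1, 10, 30, 40, 50, 100, 120$ and $150$.]
Figure 12.8 Some representative graphs of $\theta(x)$, defined by equation 12.67
for various values of $\lambda$.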
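The monotonic growth of $\theta(1, \lambda)$ with $\lambda$ suggests a simple numerical scheme: integrate equation 12.67 and adjust $\lambda$ until $\theta(1, \lambda) = n\pi$. The following is a minimal sketch of that idea, not a method taken from the text; the function names `theta_at` and `eigenvalue`, the fourth-order Runge-Kutta integrator, the step count and the bisection bracket are all illustrative assumptions.

```python
import math

def theta_at(x_end, lam, q=lambda x: x, theta0=0.0, steps=4000):
    """RK4 integration of theta' = cos^2(theta) + lam*q(x)*sin^2(theta) from x = 0.

    The default q(x) = x corresponds to equation 12.67 (an assumption of this sketch).
    """
    def rhs(x, th):
        return math.cos(th) ** 2 + lam * q(x) * math.sin(th) ** 2

    h = x_end / steps
    x, th = 0.0, theta0
    for _ in range(steps):
        k1 = rhs(x, th)
        k2 = rhs(x + h / 2, th + h * k1 / 2)
        k3 = rhs(x + h / 2, th + h * k2 / 2)
        k4 = rhs(x + h, th + h * k3)
        th += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return th

def eigenvalue(n, lo, hi, tol=1e-9):
    """Bisect for lam with theta(1, lam) = n*pi, using that theta(1, lam) increases with lam."""
    target = n * math.pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if theta_at(1.0, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical bracket [1, 50] for the first eigenvalue of equation 12.66.
lam1 = eigenvalue(1, 1.0, 50.0)
print("first eigenvalue estimate:", lam1)
```

Passing `q=lambda x: 1.0` and `x_end=math.pi` reproduces the simpler problem of exercise 12.31, for which $\theta(\pi, n^2) = n\pi$ provides a check on the integrator.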
If $\lambda < 0$ the behaviour is not so easy to understand but, nevertheless, is similar to the
simpler example. Put $\lambda = -\mu$, with $\mu > 0$, so the equation for $\theta$ becomes
$$\frac{d\theta}{dx} = \cos^2\theta - \mu x \sin^2\theta, \quad \theta(0) = 0, \quad 0 \le x \le 1. \tag{12.68}$$
For small $x$, $\mu x\theta^2 < 1$, this equation is approximated by $\theta' = \cos^2\theta \simeq 1$, so $\theta(x)$ grows
linearly with $x$, that is $\theta(x) = x$. The two terms on the right-hand side of equation 12.68
are comparable when $1 = \mu x^3$ and near this value of $x$, $\theta'(x)$ becomes negative; for
large $\mu$ both $\theta$ and $x$ are small, so the equation is approximately
$$\frac{d\theta}{dx} = 1 - \mu x\theta^2. \tag{12.69}$$
For $\mu x^3 > 1$ the approximate solution of this equation is the function that makes the
derivative zero, that is $\mu x\theta^2 = 1$. To see this put $\mu x\theta^2 = 1 + \delta$, so $\theta' = -\delta$: if $\delta > 0$,
$\theta$ decreases; if $\delta < 0$, $\theta$ increases. In either case the solution moves towards the line$^{13}$
$\mu x\theta^2 = 1$. A more accurate solution in the region $\mu x^3 > 1$ is found in exercise 12.33.
In figure 12.9 we compare the numerically generated solution of equation 12.68 with
the linear approximation, for $x < \mu^{-1/3}$, and the approximation $\mu x\theta^2 = 1$ for larger $x$,
for the cases $\mu = 10$ and 100. This comparison confirms the predicted behaviour.
[Figure 12.9: two panels showing $\theta(x)$ against $x$ for $0 \le x \le 1$; left panel $\mu = 10$ (vertical axis 0 to 0.5), right panel $\mu = 100$ (vertical axis 0 to 0.25).]
Figure 12.9 Graphs of the numerical solution of equation 12.68 and the approximations $\theta = x$
and $\mu x\theta^2 = 1$, shown by the dashed lines, for small and larger values of $x$, respectively. The
boundary $x = \mu^{-1/3}$ is shown by the arrows.
13 This type of analysis is useful in the study of boundary layer problems, relaxation oscillations and
certain types of limit cycles.


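These graphs and the approximations show that for $\mu \gg 1$, $\theta(1, \lambda) \simeq 1/\sqrt{\mu}$, and
$\max_{0<x<1} \theta(x) \simeq \mu^{-1/3}$; hence for $\mu \gg 1$, $\theta(1, \lambda)$ cannot reach $\pi$ and there can be
no eigenvalues for the boundary conditions $y(0) = y(1) = 0$.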
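The two estimates above can be checked numerically. The sketch below integrates equation 12.68 and prints $\theta(1)$ beside $1/\sqrt{\mu}$ and the largest value of $\theta$ beside $\mu^{-1/3}$. The function name `theta_negative`, the Runge-Kutta integrator and the step count are assumptions made for illustration only; the estimates are order-of-magnitude statements, so only rough agreement should be expected.

```python
import math

def theta_negative(mu, steps=20000):
    """RK4 for theta' = cos^2(theta) - mu*x*sin^2(theta), theta(0) = 0, on 0 <= x <= 1.

    Returns (theta(1), max of theta over the grid).
    """
    def rhs(x, th):
        return math.cos(th) ** 2 - mu * x * math.sin(th) ** 2

    h = 1.0 / steps
    x, th, th_max = 0.0, 0.0, 0.0
    for _ in range(steps):
        k1 = rhs(x, th)
        k2 = rhs(x + h / 2, th + h * k1 / 2)
        k3 = rhs(x + h / 2, th + h * k2 / 2)
        k4 = rhs(x + h, th + h * k3)
        th += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
        th_max = max(th_max, th)
    return th, th_max

for mu in (10.0, 100.0, 1000.0):
    th1, th_max = theta_negative(mu)
    print(mu, th1, 1 / math.sqrt(mu), th_max, mu ** (-1 / 3))
```

In each case the largest value of $\theta$ stays well below $\pi$, which is the reason no eigenvalue can occur.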
We now apply the same method to the general case $\theta' = (q + \lambda w)\sin^2\theta + (1/p)\cos^2\theta$,
to show that its solutions behave similarly. If $\lambda$ is sufficiently large that $q + \lambda w > 0$
for $a < x < b$, then $\theta(x)$ is a monotonic increasing function of $x$. Further, it can be
shown, see exercise 12.34, that $\theta(b, \lambda)$ increases with $\lambda$ and that $\lim_{\lambda\to\infty} \theta(b, \lambda) = \infty$.
Hence there are infinitely many positive eigenvalues with distinct eigenfunctions.
If $-\lambda = \mu \gg 1$, with the initial condition $\theta(a) = 0$, we again see that for small
$x - a$, $\theta(x) \simeq (x - a)/p(a)$. This growth continues until $\mu w \sin^2\theta$ is large enough, that is
$\mu w(x)(x - a)^2 \simeq p(x)$, and subsequently the equation is approximately
$$\frac{d\theta}{dx} \simeq \frac{1}{p(x)} - \mu w(x)\theta^2.$$
The same reasoning as above shows that the approximate solution is $\mu p(x)w(x)\theta^2 = 1$,
giving $\theta(b, \lambda) \simeq (\mu p(b)w(b))^{-1/2}$, that is the variation of $\theta(x)$ is too small for eigenval-
ues to exist, for the boundary conditions $y(a) = y(b) = 0$: for other boundary conditions
one negative eigenvalue may exist.
Exercise 12.32
In this exercise bounds on the positions of zeros and eigenvalues are obtained for
the Sturm-Liouville system defined by equation 12.56 with the boundary condi-
tions $y(a) = y(b) = 0$. For this the following comparison theorem for the first-order
equations $y' = F(x, y)$ is needed.
Suppose that $F(x, y)$ and $G(x, z)$ satisfy the Lipschitz condition
$$|F(x, y) - G(x, z)| \le L|y - z|, \quad a \le x \le b,$$
on suitable intervals of $y$ and $z$, for some constant $L$. If $y' = F(x, y)$ and $z' = G(x, z)$,
with $y(a) = z(a)$, then if $F(x, y) \le G(x, y)$ for $a \le x \le b$ and a suitable domain
of $y$, it can be shown that $y(x) \le z(x)$ for $a < x \le b$.
Use this theorem with equation 12.60 for $\theta(x)$ to show that the $k$th zero, $x_k$, lies
between the limits,
$$\pi\sqrt{\frac{p_1}{q_2 + \lambda w_2}} \le \frac{x_k - a}{k} \le \pi\sqrt{\frac{p_2}{q_1 + \lambda w_1}},$$
where $p_1 \le p(x) \le p_2$, $q_1 \le q(x) \le q_2$, and $w_1 \le w(x) \le w_2$. Deduce that $\lambda_n$, the
$n$th eigenvalue, satisfies
$$\frac{p_1}{w_2}\left(\frac{n\pi}{b - a}\right)^2 - \frac{q_2}{w_2} \le \lambda_n \le \frac{p_2}{w_1}\left(\frac{n\pi}{b - a}\right)^2 - \frac{q_1}{w_1}.$$
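As a quick illustration of how the eigenvalue bounds are used, the sketch below evaluates them for the system $y'' + (x + \lambda)y = 0$, $y(0) = y(1) = 0$, treated in section 12.6.3, for which $p = w = 1$ and $0 \le q(x) \le 1$. The helper name `eigenvalue_bounds` and its argument layout are hypothetical; the formula is the one quoted in the exercise and requires $w_1 > 0$.

```python
import math

def eigenvalue_bounds(n, a, b, p, q, w):
    """Lower and upper bounds on the nth eigenvalue.

    p, q, w are (min, max) pairs for p(x), q(x), w(x) on [a, b]; w[0] must be positive.
    """
    (p1, p2), (q1, q2), (w1, w2) = p, q, w
    k2 = (n * math.pi / (b - a)) ** 2
    lower = p1 / w2 * k2 - q2 / w2
    upper = p2 / w1 * k2 - q1 / w1
    return lower, upper

# q(x) = x on [0, 1], so q1 = 0 and q2 = 1; p = w = 1.
for n in (1, 2):
    print(n, eigenvalue_bounds(n, 0.0, 1.0, (1.0, 1.0), (0.0, 1.0), (1.0, 1.0)))
```

For this system the bounds reduce to $(n\pi)^2 - 1 \le \lambda_n \le (n\pi)^2$, consistent with the values 9.368507 and 38.9787 quoted in section 12.6.3.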
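Exercise 12.33
In equation 12.69 define a new variable $\phi = \theta/\epsilon$, where $\epsilon = 1/\sqrt{\mu}$, and show that
$$\epsilon\phi'(x) = 1 - x\phi^2.$$
By writing the solution of this equation in the form
$$\phi(x) = \phi_0(x) + \epsilon\phi_1(x) + \epsilon^2\phi_2(x) + \cdots,$$
and equating the coefficients of the powers of $\epsilon$ to zero, show that $\phi_0$, $\phi_1$ and $\phi_2$
satisfy the equations
$$1 - x\phi_0^2 = 0, \quad \phi_0' = -2x\phi_0\phi_1, \quad \phi_1' = -x\left(\phi_1^2 + 2\phi_0\phi_2\right),$$
and hence show that
$$\theta(x) = \frac{1}{\sqrt{\mu x}} + \frac{1}{4\mu x^2} + \frac{7}{32\mu^{3/2}x^{7/2}} + O(\mu^{-5/2}).$$
Exercise 12.34
Use the comparison theorem for first-order equations quoted in exercise 12.32 to
show that if $\lambda_2 > \lambda_1$ then $\theta(b, \lambda_2) \ge \theta(b, \lambda_1)$.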
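Exercise 12.35
In this exercise an approximation to the eigenvalues and eigenfunctions for large $n$
is found. The Liouville transformation, exercise 12.3 (page 478), shows that the
equation
$$\frac{d}{dx}\left(p\frac{dy}{dx}\right) + (q + \lambda w)y = 0, \quad a \le x \le b,$$
can be transformed to the equation
$$\frac{d^2v}{d\xi^2} + Q(\xi)v = 0, \quad Q(\xi, \lambda) = \lambda + \frac{q}{w} - A\frac{d^2}{d\xi^2}\left(\frac{1}{A}\right),$$
where $y = A(\xi)v(\xi)$,
$$\xi(x) = \int_a^x dx\, \sqrt{\frac{w}{p}} \quad\text{and}\quad A(\xi) = (wp)^{-1/4}.$$
(a) Define the modified Prufer transformation
$$v(\xi) = RQ^{-1/4}\cos\theta, \quad v'(\xi) = -RQ^{1/4}\sin\theta,$$
where we assume $Q(\xi) > 0$ for all $\xi$, and show that
$$\frac{d\theta}{d\xi} = \sqrt{Q} - \frac{Q'}{4Q}\sin 2\theta \quad\text{and}\quad \frac{d}{d\xi}(\ln R) = \frac{Q'}{4Q}\cos 2\theta.$$
(b) Assume that $Q - \lambda$ is bounded and that $\lambda \gg \max|Q - \lambda|$ and show that $\theta(\xi) \simeq \xi\sqrt{\lambda} - \alpha$
and $R \simeq r$, where $\alpha$ and $r$ are constants, and deduce that with the boundary
conditions $y(a) = y(b) = 0$ the approximate eigenvalues and eigenfunctions are
$$\lambda_n = \left(\frac{n\pi}{\xi(b)}\right)^2 \quad\text{and}\quad v_n(\xi) = \frac{r}{Q(\xi, \lambda_n)^{1/4}}\sin\left(\frac{n\pi\xi}{\xi(b)}\right).$$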
12.6 Direct methods using variational principles

12.6.1 Introduction
The approach adopted in this course has been to use a variational principle to obtain a
functional from which the Euler-Lagrange equation is derived. The stationary paths of
the functional are obtained by solving this equation. This approach is not always the
most practical because the Euler-Lagrange equation is usually a nonlinear boundary
value problem, and these are notoriously difficult to solve even numerically. The diffi-
culties of this approach are compounded if there are two or more independent variables
when the Euler-Lagrange equation becomes a partial differential equation.
These difficulties have led to the development of direct methods which avoid the
need to solve differential equations by dealing directly with the functional. Starting
with a differential equation the approach is to find an associated functional and to use
this to find approximations to the stationary paths, which are necessarily solutions of
the original differential equation.
A further refinement applies to those functionals for which the stationary paths
are actual minima. The technique described in section 12.6.4 shows how to construct
a sequence of stationary paths so that the functional approaches its minimum value
from above: this idea is particularly useful for Sturm-Liouville systems because the
eigenvalues are equal to the value of the functional to be minimised, provided suitable
admissible functions are used.
12.6.2 Basic ideas

The direct method is very simple and was introduced by Euler before the Euler-Lagrange
equation was discovered. Suppose we require a stationary path of a functional S[y], with
y belonging to a given class of admissible functions. Rather than solving the Euler-
Lagrange equation, we use a restricted set of admissible functions z(x; a), named a
trial function, depending upon a set of real variables a = (a1 , a2 , . . . , an ). Substituting
this into the functional gives a function, S(a) = S[z], of the n real variables. The
stationary points of S(a) can be determined using the methods of ordinary calculus
and this provides an approximation to the exact stationary path. An example of this
procedure was described in exercise 4.10 (page 172) and there it was shown how a
very simple trial function captured the qualitative features of the exact solution for the
minimum surface of revolution. Another example, described in section 3.2.1, is Euler's
original method whereby smooth paths are approximated by straight line segments,
with the vertex values (y1 , y2 , . . . , yN ) see figure 3.1 (page 118), playing the part of the
parameters a: exercise 4.10 is an example of this method.
Generally there are no rules for choosing the trial function z(x; a), other than it
being an admissible function, and the choice is guided by intuition and convenience.
The number of parameters, n, can be as small as one, or as large as one pleases; but
the larger n, the harder the algebra, though computers are particularly useful for this
type of problem. We illustrate this method with some simple problems.
First consider the functional
$$S[y] = \int_0^1 dx\, \left(y'^2 - y^2 + 2xy\right), \quad y(0) = y(1) = 0. \tag{12.70}$$
The Euler-Lagrange equation is $y'' + y = x$ and has the solution
$$y(x) = x - \frac{\sin x}{\sin 1}.$$
A simple trial function satisfying the boundary conditions is the polynomial $z(x; a) =
ax(1 - x)$, having just one free parameter, $a$. Substituting this into the functional we
obtain the integrals
$$\int_0^1 dx\, z'^2 = a^2\int_0^1 dx\, (1 - 2x)^2 = \frac{1}{3}a^2, \quad \int_0^1 dx\, z^2 = a^2\int_0^1 dx\, x^2(1 - x)^2 = \frac{1}{30}a^2,$$
$$2\int_0^1 dx\, xz = 2a\int_0^1 dx\, x^2(1 - x) = \frac{1}{6}a,$$
so that $S(a) = \frac{3}{10}a^2 + \frac{1}{6}a$. This is stationary at $a = -5/18$, giving the approximation
$$z = -\frac{5x}{18}(1 - x) \quad\text{to}\quad y = x - \frac{\sin x}{\sin 1}. \tag{12.71}$$
In the left-hand panel of figure 12.10 we compare the graphs of the exact and approximate
functions.
[Figure 12.10: left panel, the exact solution $y(x)$ and the approximation $z(x)$ for $0 \le x \le 1$; right panel, the difference $100(y - z)$ against $x$.]
Figure 12.10 On the left we compare the exact solution of equation 12.70 with the variational
approximation, defined in equation 12.71. On the right we show the difference, $100(y - z)$, be-
tween the exact solution and the variational approximation obtained using the trial function defined in
equation 12.71.
Further thought suggests that this trial function is a poor choice, because the actual
solution is an odd function of $x$. This can be deduced from the differential equation
because its right-hand side is odd, so we expect the solution to be odd, for if $y(x)$ were
even, so also is $y''(x)$ and the left-hand side of the equation would be even. Thus a
more sensible trial function is
$$z(x; a) = ax(1 - x^2) \tag{12.72}$$
which leads to $a = -7/38$. This estimate of the solution is very close to the exact
solution as seen in figure 12.11 where we show the graphs of $100(y - z)$: notice that
the differences are about 10 times smaller than those in figure 12.10, which shows that
a careful choice of trial function can lead to significantly improved results with little
extra effort.
[Figure 12.11: the difference $100(y - z)$ (vertical axis, $-0.1$ to $0.1$) against $x$ for $0 \le x \le 1$.]
Figure 12.11 Graph of the difference $100(y - z)$, between the exact solution and the
trial function defined in equation 12.72. Notice that the differences are about 10 times
smaller than those in figure 12.10.
A more general odd trial function that satisfies the boundary conditions is
$$z(x; a) = x(1 - x^2)\left(a_0 + a_1x^2 + a_2x^4 + \cdots + a_nx^{2n}\right), \tag{12.73}$$
and this has $n + 1$ parameters.
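Because the functional 12.70 is quadratic in $y$, substituting any polynomial trial function gives a quadratic $S(a)$ whose stationary point solves a linear system. The sketch below carries this out in exact rational arithmetic. It is an illustration under stated assumptions rather than a procedure from the text: polynomials are stored as coefficient lists (lowest power first), and the helper names `pmul`, `pder`, `pint01`, `solve` and `direct_method` are hypothetical.

```python
from fractions import Fraction

def pmul(f, g):
    """Product of two polynomials given as coefficient lists, lowest power first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

def pder(f):
    """Derivative of a polynomial."""
    return [k * c for k, c in enumerate(f)][1:]

def pint01(f):
    """Exact integral of a polynomial over [0, 1]."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(f))

def solve(A, b):
    """Gauss-Jordan elimination over the rationals."""
    n = len(b)
    M = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(A, b)]
    for i in range(n):
        piv = next(r for r in range(i, n) if M[r][i] != 0)
        M[i], M[piv] = M[piv], M[i]
        M[i] = [v / M[i][i] for v in M[i]]
        for r in range(n):
            if r != i:
                M[r] = [vr - M[r][i] * vi for vr, vi in zip(M[r], M[i])]
    return [row[-1] for row in M]

def direct_method(basis):
    """Stationary coefficients for z = sum_k a_k * basis[k] in functional 12.70."""
    x = [0, 1]
    A = [[pint01(pmul(pder(f), pder(g))) - pint01(pmul(f, g)) for g in basis]
         for f in basis]
    B = [2 * pint01(pmul(x, f)) for f in basis]
    # S(a) = a^T A a + B^T a, so dS/da = 0 gives 2 A a = -B.
    return solve([[2 * e for e in row] for row in A], [-v for v in B])

print(direct_method([[0, 1, -1]]))        # trial function a x (1 - x)
print(direct_method([[0, 1, 0, -1]]))     # trial function a x (1 - x^2)
print(direct_method([[0, 1, 0, -1], [0, 0, 0, 1, 0, -1]]))  # 12.73 with n = 1
```

The first two calls reproduce the single-parameter results quoted above; the third shows how the same routine handles the two-parameter case of equation 12.73.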

For the second, slightly more complicated, example we find an approximate solution
to the nonlinear boundary value problem,
$$\frac{d^2y}{dx^2} + x^2y^2 = x, \quad y(0) = y(2) = 0, \tag{12.74}$$
whose solution cannot be expressed in terms of elementary functions. The functional
for this equation is
$$S[y] = \int_0^2 dx\, \left(\frac{1}{2}y'^2 - \frac{1}{3}x^2y^3 + xy\right), \quad y(0) = y(2) = 0. \tag{12.75}$$
Now we use the trial function
$$z(x; a) = ax(2 - x).$$
The three integrals needed are,
$$\frac{1}{2}\int_0^2 dx\, z'^2 = \frac{4}{3}a^2, \quad \frac{1}{3}\int_0^2 dx\, x^2z^3 = \frac{64}{189}a^3, \quad \int_0^2 dx\, xz = \frac{4}{3}a,$$
so that
$$S(a) = -\frac{64}{189}a^3 + \frac{4}{3}a^2 + \frac{4}{3}a \quad\text{and}\quad S'(a) = -\frac{64}{63}a^2 + \frac{8}{3}a + \frac{4}{3}.$$
Now there are two stationary paths given by the roots of this quadratic, which we
denote by
$$a_- = -\frac{1}{16}\left(\sqrt{777} - 21\right) \quad\text{and}\quad a_+ = \frac{1}{16}\left(\sqrt{777} + 21\right),$$
which suggests that this nonlinear boundary value problem has two solutions. Numer-
ical calculations, guided by this approximation, confirm this and in figure 12.12 we
compare these approximate solutions with those given by a numerical calculation.
[Figure 12.12: two panels showing $y(x)$ against $x$ for $0 \le x \le 2$; left panel $a < 0$ (vertical axis $-0.5$ to 0), right panel $a > 0$ (vertical axis 0 to 4); curves labelled "approximate" and "exact".]
Figure 12.12 A graphical comparison of the exact (numerical) solutions of equation 12.74, the
solid lines, and the variational approximation, the dashed lines. On the left is the comparison for
$a < 0$ and on the right for $a > 0$.
By substituting a power series into the differential equation, it can be seen that a better
trial function is $z = ax(4 - x^2)$, because the coefficient of the term $x^2$ is zero, but for
this trial function the integrals are slightly more complicated. We have
$$\frac{1}{2}\int_0^2 dx\, z'^2 = \frac{64}{5}a^2, \quad \frac{1}{3}\int_0^2 dx\, x^2z^3 = \frac{512}{45}a^3, \quad \int_0^2 dx\, xz = \frac{64}{15}a,$$
so that
$$S(a) = -\frac{512}{45}a^3 + \frac{64}{5}a^2 + \frac{64}{15}a \quad\text{and}\quad S'(a) = -\frac{512}{15}a^2 + \frac{128}{5}a + \frac{64}{15},$$
and the two stationary paths are given by setting $a$ to the values,
$$a_- = \frac{3 - \sqrt{17}}{8} \quad\text{and}\quad a_+ = \frac{3 + \sqrt{17}}{8}.$$
In figure 12.13 these approximations are compared with numerically generated solutions
of equation 12.74. For $a = a_-$ the trial solution, shown by the circles, is very close to
the exact solution. In both cases the approximations are better, which again illustrates
the value of choosing suitable trial functions.
[Figure 12.13: two panels showing $y(x)$ against $x$ for $0 \le x \le 2$; left panel $a = a_- < 0$ (vertical axis $-0.5$ to 0), right panel $a = a_+ > 0$ (vertical axis 0 to 3); curves labelled "approximate" and "exact".]
Figure 12.13 Graphs comparing the exact numerically generated solution of equation 12.74 and
the approximations obtained using the trial function $z = ax(4 - x^2)$, denoted by the circles in the
left panel.
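The stationary amplitudes for both trial functions can be reproduced mechanically, which is a useful guard against slips in the integrals. The sketch below evaluates the three integrals of the functional 12.75 exactly for a trial function $z = a\,\psi(x)$ with polynomial $\psi$, then solves the quadratic $S'(a) = 0$. The names `pint` and `stationary_amplitudes`, and the coefficient-list representation of $\psi$, are assumptions of this illustration.

```python
from fractions import Fraction
import math

def pmul(f, g):
    """Product of two polynomials given as coefficient lists, lowest power first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

def pder(f):
    """Derivative of a polynomial."""
    return [k * c for k, c in enumerate(f)][1:]

def pint(f, upper):
    """Exact integral of a polynomial over [0, upper]."""
    return sum(Fraction(c) * Fraction(upper) ** (k + 1) / (k + 1)
               for k, c in enumerate(f))

def stationary_amplitudes(shape):
    """Roots of S'(a) for z = a*shape(x) in the functional 12.75 on [0, 2]."""
    c2 = pint(pmul(pder(shape), pder(shape)), 2) / 2                 # (1/2) int z'^2 / a^2
    c3 = pint(pmul([0, 0, 1], pmul(shape, pmul(shape, shape))), 2) / 3  # (1/3) int x^2 z^3 / a^3
    c1 = pint(pmul([0, 1], shape), 2)                                # int x z / a
    # S(a) = -c3 a^3 + c2 a^2 + c1 a, hence S'(a) = -3 c3 a^2 + 2 c2 a + c1.
    A, B, C = -3 * c3, 2 * c2, c1
    root = math.sqrt(float(B * B - 4 * A * C))
    return sorted([(-float(B) - root) / (2 * float(A)),
                   (-float(B) + root) / (2 * float(A))])

print(stationary_amplitudes([0, 2, -1]))     # z = a x (2 - x)
print(stationary_amplitudes([0, 4, 0, -1]))  # z = a x (4 - x^2)
```

Each call returns the pair $(a_-, a_+)$; one root is negative and one positive, corresponding to the two solution branches discussed above.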
It is worth noting that some black-box numerical methods for solving boundary
value problems give only the first solution, a > 0, and provide no inkling that another
solution exists. Thus, simple variational calculations, such as described here, can avoid
embarrassing errors; but they give no guarantee that only two solutions to this problem
exist.
Exercise 12.36
Using the trial function $y = 1 - ax - (1 - a)x^2$ obtain an approximate solution for
the equation $y'' + xy = 0$, $y(0) = 1$, $y(1) = 0$.
Exercise 12.37
(a) Show that the functional associated with the equation $y'' + y^3 = 0$, $y(0) = 0$,
$y'(X) = 0$ is
$$S[y] = \int_0^X dx\, \left(\frac{1}{2}y'^2 - \frac{1}{4}y^4\right), \quad y(0) = 0,$$
and a natural boundary condition at $x = X$.
(b) Use the trial function $y = a\sin(\pi x/(2X))$ to find an approximate solution.
You will need the integral
$$\int_0^{\pi/2} du\, \sin^4 u = \frac{3\pi}{16}.$$
12.6.3 Eigenvalues and eigenfunctions

In this section we show how the first n eigenvalues and eigenfunctions of a Sturm-
Liouville system can be approximated by solving a set of n linear equations in n vari-
ables. The method can start either from the Euler-Lagrange equation or the associated
functional, and though it is normally slightly easier to use the Euler-Lagrange equa-
tions, see exercise 12.45, we use the functional because this analysis is needed in the
next section to provide upper bounds to the eigenvalues.
We illustrate the method with the functional
$$S[y] = \int_0^1 dx\, \left(py'^2 - qy^2\right), \quad y(0) = y(1) = 0, \tag{12.76}$$
subject to the constraint
$$C[y] = \int_0^1 dx\, y^2 = 1. \tag{12.77}$$
If $\lambda$ is the Lagrange multiplier this leads to the Sturm-Liouville system
$$\frac{d}{dx}\left(p\frac{dy}{dx}\right) + (q + \lambda)y = 0, \quad y(0) = y(1) = 0, \tag{12.78}$$
which we assume to have an infinite sequence of real eigenvalues $\lambda_1 < \lambda_2 < \lambda_3 \cdots$, and
associated eigenfunctions $y_1(x), y_2(x), \cdots$.
First we need the following relation between the $n$th eigenfunction, $y_n(x)$, and its
eigenvalue
$$\lambda_n = S[y_n] = \int_0^1 dx\, \left(py_n'^2 - qy_n^2\right). \tag{12.79}$$
This formula is useful because we shall use it, with approximations for $y_n(x)$, to both
approximate and bound $\lambda_n$.
Exercise 12.38
By multiplying equation 12.78 by $y_n$ and integrating over $(0, 1)$ prove equa-
tion 12.79.
Exercise 12.39
If $y_n(x)$ is an exact eigenfunction with eigenvalue $\lambda_n$ and $z_n = y_n + \epsilon u(x)$, with
$|\epsilon| \ll 1$ and $O(u) = 1$, is an admissible function, show that
$$\lambda_n = S[z_n] + O(\epsilon^2).$$
The result derived in exercise 12.39 is important. It shows that if an eigenfunction
is known approximately, with an accuracy $O(\epsilon)$, then it can be used to approximate
the eigenvalue to an accuracy $O(\epsilon^2)$.
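The quadratic dependence of the eigenvalue error on $\epsilon$ is easy to see numerically. The sketch below takes the simplest case $p = 1$, $q = 0$, for which $y_1 \propto \sin \pi x$ and $\lambda_1 = \pi^2$, perturbs the eigenfunction by $\epsilon\sin 2\pi x$, and evaluates $S[z]/C[z]$, which equals $S$ for the normalised function. The choice of system, the perturbation, and the helper names `simpson` and `rayleigh` are all illustrative assumptions.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of intervals."""
    h = (b - a) / n
    odd = sum(f(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    even = sum(f(a + 2 * k * h) for k in range(1, n // 2))
    return (f(a) + f(b) + 4 * odd + 2 * even) * h / 3

def rayleigh(eps):
    """S[z]/C[z] for z = sin(pi x) + eps*sin(2 pi x), with p = 1 and q = 0."""
    z = lambda x: math.sin(math.pi * x) + eps * math.sin(2 * math.pi * x)
    dz = lambda x: (math.pi * math.cos(math.pi * x)
                    + 2 * math.pi * eps * math.cos(2 * math.pi * x))
    return simpson(lambda x: dz(x) ** 2, 0.0, 1.0) / simpson(lambda x: z(x) ** 2, 0.0, 1.0)

for eps in (0.2, 0.1, 0.05):
    print(eps, rayleigh(eps) - math.pi ** 2)
```

Halving $\epsilon$ reduces the printed error by a factor close to four, as the $O(\epsilon^2)$ estimate predicts; for this particular perturbation the error is $3\pi^2\epsilon^2/(1 + \epsilon^2)$.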
For the linear system 12.78 we construct trial functions using a subset of a complete
set of functions $\{\phi\} = \{\phi_1, \phi_2, \cdots\}$, each of which satisfies the boundary conditions.
Normally this set consists of the eigenfunctions of another Sturm-Liouville system and it is
clear that when choosing this system it is sensible to use a system that is similar to
that being studied.
Here we use the complete, orthogonal sequence $\phi_k(x)$, $k = 1, 2, \cdots$, satisfying
$$\int_0^1 dx\, \phi_i(x)\phi_j(x) = h_i\delta_{ij}, \tag{12.80}$$
with each $\phi_i(x)$ satisfying the same boundary conditions as the original Sturm-Liouville
system, in this case $\phi_i(0) = \phi_i(1) = 0$. At the end of this analysis we shall use a specific
set of functions by setting $\phi_k = \sin k\pi x$. A trial function is obtained using a linear
combination of the first $n$ of these functions
$$z(x; a) = \sum_{k=1}^{n} a_k\phi_k(x),$$
and this will provide an approximation to the first $n$ of the required eigenvalues and
eigenfunctions. The trial function needs to satisfy the constraint $C[z] = 1$, and this
defines the function
$$C(a) = \int_0^1 dx\, \left(\sum_{k=1}^{n} a_k\phi_k(x)\right)^2 = \sum_{k=1}^{n} h_ka_k^2 = 1, \tag{12.81}$$
where we have used the orthogonal property, equation 12.80. In the space of real
variables $a = (a_1, a_2, \ldots, a_n)$ this quadratic function of $a$, equation 12.81, defines an
$n$-dimensional ellipsoid. It is convenient to write this constraint in terms of the vector $a$,
$$C(a) = a^\top Ha = 1,$$
where $H$ is the $n \times n$, diagonal matrix with $H_{kk} = h_k$. The functional $S[z]$ defines
another function of $a$,
$$S(a) = \int_0^1 dx\, \left[p(x)\left(\sum_{k=1}^{n} a_k\phi_k'(x)\right)^2 - q(x)\left(\sum_{k=1}^{n} a_k\phi_k(x)\right)^2\right], \tag{12.82}$$
which is also a quadratic form and can be written as
$$S(a) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_iS_{ij}a_j = a^\top Sa, \tag{12.83}$$
where $S$ is a real, symmetric $n \times n$ matrix, with elements $S_{ij}$. Specifically, these matrix
elements are given by
$$S_{ij} = \int_0^1 dx\, \left(p(x)\phi_i'(x)\phi_j'(x) - q(x)\phi_i(x)\phi_j(x)\right). \tag{12.84}$$
An approximation to the first $n$ eigenvalues and eigenfunctions of the Euler-Lagrange
equation 12.78 is given by the stationary values of $S(a)$, subject to the constraint 12.81;
this is a conventional constrained stationary problem, dealt with in section 10.2. If
$\lambda$ is the Lagrange multiplier for this problem the auxiliary function is
$$\overline{S}(a) = S(a) - \lambda C(a) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_iS_{ij}a_j - \lambda\sum_{i=1}^{n} h_ia_i^2 = a^\top Sa - \lambda a^\top Ha. \tag{12.85}$$
The stationary values of $\overline{S}(a)$ are given by the solutions of $\partial\overline{S}(a)/\partial a_i = 0$, $i = 1, 2, \cdots, n$,
that is, the solution of the matrix eigenvalue equation
$$Sa = \lambda Ha \quad\text{or}\quad H^{-1}Sa = \lambda a. \tag{12.86}$$
That is, the stationary points are given by the eigenvectors of $H^{-1}S$. Further, since
$H$ is diagonal with positive elements, $H^{-1}S$ is similar to the real, symmetric matrix
$H^{-1/2}SH^{-1/2}$, so its $n$ eigenvalues are real and can be ordered,
$\bar\lambda_1 < \bar\lambda_2 < \cdots < \bar\lambda_n$, and the $k$th eigenvalue provides an approximation to the $k$th eigenvalue
of the original Euler-Lagrange equation, as shown next.
If $a_k$ is the $k$th eigenvector of $H^{-1}S$ with eigenvalue $\bar\lambda_k$, then assuming that the
associated trial function, $z(x; a_k)$, is an approximation to the $k$th eigenfunction of the
Sturm-Liouville system we have, from the result found in exercise 12.39,
$$\lambda_k \simeq S[z] = S(a_k) = a_k^\top Sa_k = \bar\lambda_ka_k^\top Ha_k = \bar\lambda_k.$$
An example using the orthogonal set $\phi_k(x) = \sin k\pi x$

For the interval $(0, 1)$ and boundary conditions $y(0) = y(1) = 0$, a convenient orthogonal
set is $\phi_k(x) = \sin k\pi x$. For this set $h_k = 1/2$ for all $k$, and the matrix elements of $H^{-1}S$
become
$$(H^{-1}S)_{ij} = 2\pi^2ij\int_0^1 dx\, p(x)\cos i\pi x\cos j\pi x - 2\int_0^1 dx\, q(x)\sin i\pi x\sin j\pi x.$$
Now we apply this approximation to the particular eigenvalue problem,
$$\frac{d^2y}{dx^2} + (x + \lambda)y = 0, \quad y(0) = y(1) = 0, \tag{12.87}$$
where $p(x) = 1$, $q(x) = x$ and $w(x) = 1$. The associated functional and constraint are
$$S[y] = \int_0^1 dx\, \left(y'^2 - xy^2\right), \quad C[y] = \int_0^1 dx\, y^2 = 1, \quad y(0) = y(1) = 0. \tag{12.88}$$
We use the complete set $\phi_k = \sin k\pi x$, $k = 1, 2, \cdots$, to construct the trial functions,
and the simplest of these is obtained by using only the first function,
$$z(x; a_1) = a_1\sin\pi x.$$
The constraint gives $a_1^2\int_0^1 dx\, \sin^2\pi x = 1$, that is $a_1^2 = 2$. Thus the first approximation
for $\lambda_1$ is given by
$$\lambda_1 \simeq S(a_1) = a_1^2\int_0^1 dx\, \left(\pi^2\cos^2\pi x - x\sin^2\pi x\right) = \pi^2 - \frac{1}{2} \simeq 9.3696. \tag{12.89}$$
That is $\lambda_1 \simeq 9.3696$. The exact eigenvalues are given by the real solutions of
$$\mathrm{Ai}(-\lambda)\mathrm{Bi}(-1 - \lambda) - \mathrm{Ai}(-1 - \lambda)\mathrm{Bi}(-\lambda) = 0,$$
where $\mathrm{Ai}(z)$ and $\mathrm{Bi}(z)$ are the Airy functions which are solutions of Airy's equation,
$y'' - xy = 0$: to 7 significant figures the first eigenvalue is 9.368507, so the approxima-
tion is larger than this by 0.01%.
Exercise 12.40
Consider the eigenvalue problem
$$y'' + (x^2 + \lambda)y = 0, \quad y(0) = y(1) = 0.$$
(a) Using the orthogonal set $\phi_k(x) = \sin k\pi x$, $k = 1, 2, \cdots$, show that an upper
bound to the smallest eigenvalue is
$$\lambda_1 \le 2\int_0^1 dx\, \left(\pi^2\cos^2\pi x - x^2\sin^2\pi x\right) = \pi^2 - \frac{1}{3} + \frac{1}{2\pi^2} \simeq 9.59.$$
(b) Show that the trial function $z = ax(1 - x)$ gives the bound $\lambda_1 \le 68/7 \simeq 9.71$.
Which of these two estimates is closer to the exact value?
Exercise 12.41
Use the bounds determined in exercise 12.32 (page 511) to show that the $n$th
eigenvalue of the system $y'' + (x^a + \lambda)y = 0$, $y(0) = y(1) = 0$, with $a > 0$ is bounded
by $(n\pi)^2 - 1 \le \lambda_n \le (n\pi)^2$.
Exercise 12.42
Using the trial function $z = a(1 - x^2)$ show that an upper bound to the smallest
eigenvalue of the system
$$y'' + \left(x^{2p} + \lambda\right)y = 0, \quad y(-1) = y(1) = 0,$$
where $p$ is a positive integer, is given by
$$\lambda_1 \le \frac{5}{2}\left(1 - \frac{6}{(2p + 1)(2p + 3)(2p + 5)}\right).$$
Better approximations to $\lambda_1$ are obtained by increasing $n$, but quickly the algebra
becomes cumbersome, time consuming and error prone. However, because the equa-
tions for $a$ are linear the standard methods of linear algebra may be used and so the
calculations become relatively trivial if a computer is available.
Here we limit the calculation to $n = 2$, which illustrates all the relevant details: for
the sake of brevity the minor details of this calculation are omitted. The trial function
is now
$$z(x; a_1, a_2) = a_1\sin\pi x + a_2\sin 2\pi x$$
and the constraint, equation 12.81, gives $a_1^2 + a_2^2 = 2$. The functional is given by
$$S_2(a_1, a_2) = \left(\frac{\pi^2}{2} - \frac{1}{4}\right)a_1^2 + \left(2\pi^2 - \frac{1}{4}\right)a_2^2 + \frac{16}{9\pi^2}a_1a_2.$$
Hence the matrix eigenvalue equation 12.86 is
$$\begin{pmatrix} \pi^2 - \dfrac{1}{2} & \dfrac{16}{9\pi^2} \\[2mm] \dfrac{16}{9\pi^2} & 4\pi^2 - \dfrac{1}{2} \end{pmatrix} a = \lambda a. \tag{12.90}$$
Since $16/(9\pi^2) \ll 1$ the eigenvalues of the matrix on the left are approximately $\pi^2 -
1/2$ and $(2\pi)^2 - 1/2$, giving $\lambda_1 \simeq 9.3696$ (the previous value) and $\lambda_2 \simeq 38.9784$; the
eigenvalues of this matrix are actually (9.368509, 38.9795), which compare favourably
with the exact eigenvalues (9.368507, 38.9787).
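For larger $n$ the matrix $H^{-1}S$ is most easily built and diagonalised by computer. The following sketch does this for the sine basis using only the Python standard library. It is one possible implementation, not the calculation used in the text: the integrals are evaluated with Simpson's rule rather than analytically, the eigenvalues are found with cyclic Jacobi rotations, and the names `simpson`, `ritz_matrix`, `jacobi_eigenvalues` and `ritz_eigenvalues` are hypothetical.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of intervals."""
    h = (b - a) / n
    odd = sum(f(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    even = sum(f(a + 2 * k * h) for k in range(1, n // 2))
    return (f(a) + f(b) + 4 * odd + 2 * even) * h / 3

def ritz_matrix(n, p=lambda x: 1.0, q=lambda x: x):
    """Matrix H^{-1} S for phi_k = sin(k pi x) on [0, 1] (h_k = 1/2).

    The defaults p = 1, q = x correspond to equation 12.87.
    """
    M = [[0.0] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            kin = simpson(lambda x: p(x) * math.cos(i * math.pi * x)
                          * math.cos(j * math.pi * x), 0.0, 1.0)
            pot = simpson(lambda x: q(x) * math.sin(i * math.pi * x)
                          * math.sin(j * math.pi * x), 0.0, 1.0)
            M[i - 1][j - 1] = M[j - 1][i - 1] = 2 * math.pi ** 2 * i * j * kin - 2 * pot
    return M

def jacobi_eigenvalues(A, sweeps=50, tol=1e-22):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    n = len(A)
    A = [row[:] for row in A]
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for r in range(n - 1):
            for s in range(r + 1, n):
                if A[r][s] == 0.0:
                    continue
                ang = 0.5 * math.atan2(2 * A[r][s], A[s][s] - A[r][r])
                c, sn = math.cos(ang), math.sin(ang)
                for k in range(n):          # rotate columns r and s
                    akr, aks = A[k][r], A[k][s]
                    A[k][r], A[k][s] = c * akr - sn * aks, sn * akr + c * aks
                for k in range(n):          # rotate rows r and s
                    ark, ask = A[r][k], A[s][k]
                    A[r][k], A[s][k] = c * ark - sn * ask, sn * ark + c * ask
    return sorted(A[i][i] for i in range(n))

def ritz_eigenvalues(n):
    return jacobi_eigenvalues(ritz_matrix(n))

for n in (1, 2, 3):
    print(n, ritz_eigenvalues(n))
```

With `n = 2` this reproduces the eigenvalues of the matrix in equation 12.90; other choices of `p` and `q` can be passed to `ritz_matrix` for different systems with the same boundary conditions.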
Exercise 12.43
Using the complete set of functions $\phi_k(x) = \sin\left(k - \frac{1}{2}\right)\pi x$, $k = 1, 2, \cdots$, which
satisfy the boundary conditions $\phi_k(0) = \phi_k'(1) = 0$, find approximations to the
first eigenvalue of the system
$$\frac{d^2y}{dx^2} + (x + \lambda)y = 0, \quad y(0) = y'(1) = 0,$$
using trial functions with one and with two parameters.
The following integral will be useful
$$\int_0^1 dx\, x\sin\left(\frac{n\pi x}{2}\right)\sin\left(\frac{m\pi x}{2}\right) = \begin{cases} \dfrac{1}{4} + \dfrac{1}{\pi^2}, & n = m = 1, \\[2mm] \dfrac{1}{4} + \dfrac{1}{9\pi^2}, & n = m = 3, \\[2mm] -\dfrac{1}{\pi^2}, & n = 1,\ m = 3. \end{cases}$$
Exercise 12.44
(a) Find the eigenvalues and eigenfunctions of the problem
$$\frac{d^2y}{dx^2} + \lambda y = 0, \quad y'(0) = y(1) = 0.$$
(b) Use the first eigenfunction of this problem to show that an approximation to
the first eigenvalue of
$$\frac{d^2y}{dx^2} + \left(b\sin\frac{\pi x}{2} + \lambda\right)y = 0, \quad y'(0) = y(1) = 0 \quad\text{is}\quad \lambda_1 \simeq \frac{\pi^2}{4} - \frac{4b}{3\pi}.$$
(c) Show that an approximation to the $n$th eigenvalue is
$$\lambda_n \simeq \pi^2\bar{n}^2 - \frac{2b}{\pi}\left(1 - \frac{1}{16\bar{n}^2 - 1}\right), \quad \bar{n} = n - \frac{1}{2}.$$
Hint: use the $n$th eigenfunction of the system defined in part (a) to construct a
one parameter trial function.
Exercise 12.45
(a) Determine an approximation to the eigenvalues and eigenfunctions of the equa-
tion
$$\frac{d^2y}{dx^2} + (x + \lambda)y = 0, \quad y(0) = y(1) = 0,$$
by substituting the series
$$y(x) = \sum_{k=1}^{n} a_k\sin k\pi x$$
into the equation to form the matrix equation $Ma = \lambda a$ where $a = (a_1, a_2, \ldots, a_n)$
and $M$ is an $n \times n$, real symmetric matrix, with elements
$$M_{ij} = \pi^2i^2\delta_{ij} - 2\int_0^1 dx\, x\sin i\pi x\sin j\pi x.$$
(b) Show that for $n = 1$ and 2 this method gives the approximations 12.89
and 12.90 respectively.
(c) Show that for arbitrary $n$ this method gives the equation 12.86 (page 519) for
$a$ if $\phi_k = \sin k\pi x$, $p(x) = 1$ and $q(x) = x$.
12.6.4 Minimising sequences and the Ritz method

In this section we consider the eigenvalues of Sturm-Liouville systems which are spe-
cial because the associated functionals have strict minima. This property allows the
eigenvalues to be bounded above arbitrarily accurately by a well defined and relatively
(especially with a computer) simple procedure; for the smallest eigenvalue this proce-
dure is that described in the previous section. The method is also applicable to many
important partial differential equations which is the main reason that it is important.
Suppose that the functional $S[y]$ has a minimum value: this means that in the
class of admissible functions, $M$, $S[y]$ has a greatest lower bound, $s$, and this bound is
achieved by a function in $M$. There are several technical issues behind this assertion
that we ignore.
The aim is to construct a sequence of functions $\{y_1, y_2, \cdots\}$, each in $M$, such
that $S[y_k] \ge S[y_{k+1}]$, so that $s_k = S[y_k]$ is a decreasing infinite sequence such that
$\lim_{k\to\infty} s_k = s$.
We start with an infinite set of functions $\{\phi\} = \{\phi_1, \phi_2, \cdots\}$, each in $M$, and with
a natural ordering. For instance if the admissible functions are defined on $[0, 1]$ and are
zero at the end points, $x = 0$ and 1, typical sequences would be
$$\sin k\pi x, \quad k = 1, 2, 3, \cdots,$$
$$x(1 - x), \quad x(1 - x)(1 + x), \quad x(1 - x)(1 + x + x^2), \quad x(1 - x)(1 + x + x^2 + x^3), \quad \cdots.$$
From the sequence $\{\phi\}$ a finite dimensional subspace $M_n$ is formed from the first $n$ members
$\{\phi_1, \phi_2, \cdots, \phi_n\}$; that is the set of all the linear combinations
$$z(x; a) = a_1\phi_1(x) + a_2\phi_2(x) + \cdots + a_n\phi_n(x), \tag{12.91}$$
where the $a_k$, $k = 1, 2, \cdots, n$ are any real numbers. On this subspace $S[z]$ becomes a
function of the real numbers $a = (a_1, a_2, \ldots, a_n)$,
$$S(a) = S[z].$$
This is exactly as in the previous section; but now we use the fact that the functional
has a minimum.
Choose $(a_1, a_2, \ldots, a_n)$ to minimise $S(a)$ and denote this minimum value by $s_n$ and
the associated element of $M_n$ by $y_n$,
$$s_n = \min_a S(a_1, a_2, \ldots, a_n).$$
Clearly $s_n$ cannot increase with $n$ because $M_{n+1}$ contains $M_n$, that is any linear
combination of $\{\phi_1, \phi_2, \cdots, \phi_n\}$ is a linear combination of $\{\phi_1, \phi_2, \cdots, \phi_n, \phi_{n+1}\}$. If the
sequence $\{\phi\}$ is complete, then it can be shown that the sequence $s_n$ converges to $s$, the
minimum value of $S[y]$. This method of successively approximating a functional using
sequences of functions is the Ritz method.
For Sturm-Liouville systems the significance of this result is that the eigenvalue is
just the value of a functional that has a minimum, equation 12.79.
The smallest eigenvalue of a Sturm-Liouville system

The simplest use of this technique is to estimate the lowest eigenvalue and eigenfunction
of a Sturm-Liouville system.
Consider the functional 12.76 (page 517), subject to the constraint 12.77. A suitable
subspace $M_n$ is that spanned by $\{\sin\pi x, \sin 2\pi x, \cdots, \sin n\pi x\}$, giving the linear combination
$$z(x; a) = \sum_{k=1}^{n} a_k\sin k\pi x.$$
Then the function $S(a)$, equation 12.82, has a minimum, because $S(a)$ is continuous
and the constraint confines $a$ to a closed, bounded region, on which a continuous
function is bounded above and below and attains its minimum value. Substituting the
minimising value of $a$ into $S(a)$ gives an upper bound $\lambda_1^{(n)}$ for $\lambda_1$,
$$\lambda_1 \le \lambda_1^{(n)} = S(a). \tag{12.92}$$
For each $m = 1, 2, \cdots$, we similarly obtain an upper bound, $\lambda_1^{(m)}$, for the lowest eigenvalue,
and by the same reasoning as used above, we see that
$$\lambda_1^{(1)} \ge \lambda_1^{(2)} \ge \cdots \ge \lambda_1^{(m)} \ge \cdots \ge \lambda_1 \quad\text{and}\quad \lim_{m\to\infty}\lambda_1^{(m)} = \lambda_1. \tag{12.93}$$
Thus the method used in the previous section provides successively closer upper bounds
to the lowest eigenvalue.
A numerical example of this behaviour was seen in the calculation of the smallest
eigenvalue of equation 12.87 (page 519) where we used the trial function
$$z(x; a) = \sum_{k=1}^{n} a_k\sin k\pi x.$$
The exact value of this eigenvalue is, to 10 significant figures, 9.368 507 162: for $n = 1, 2$
and 3 the variational estimates for $\lambda_1$ are 9.3696, 9.368 509 and 9.368 508 6. With the
trial function
$$z(x; a) = x(1 - x)\left(a_0 + a_1x + a_2x^2 + \cdots + a_{n-1}x^{n-1}\right)$$
the estimates of this eigenvalue with $n = 1, 2, 3$ and 4 are 9.5, 9.4989, 9.3687 and
9.368 513. As predicted the estimates approach the exact value from above.
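The first two polynomial estimates can be reproduced with exact integrals. The polynomial basis $x(1 - x)x^k$ is not orthogonal, so in this sketch the matrix $H$ with elements $\int_0^1 dx\, \phi_i\phi_j$ is no longer diagonal; the stationary condition is still $Sa = \lambda Ha$, and for $n = 2$ the smallest root of $\det(S - \lambda H) = 0$ is obtained from a quadratic. This extension to a non-diagonal $H$, and the names `pmul`, `pder`, `pint01` and `lowest_estimate`, are assumptions of the illustration.

```python
from fractions import Fraction
import math

def pmul(f, g):
    """Product of two polynomials given as coefficient lists, lowest power first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

def pder(f):
    """Derivative of a polynomial."""
    return [k * c for k, c in enumerate(f)][1:]

def pint01(f):
    """Exact integral of a polynomial over [0, 1]."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(f))

def lowest_estimate(n):
    """Ritz estimate of the smallest eigenvalue of 12.87 for n = 1 or 2.

    Basis: phi_k = x (1 - x) x^k = x^(k+1) - x^(k+2), k = 0, ..., n - 1.
    """
    basis = [[0] * (k + 1) + [1, -1] for k in range(n)]
    x = [0, 1]
    S = [[pint01(pmul(pder(f), pder(g))) - pint01(pmul(x, pmul(f, g)))
          for g in basis] for f in basis]
    H = [[pint01(pmul(f, g)) for g in basis] for f in basis]
    if n == 1:
        return float(S[0][0] / H[0][0])
    # det(S - lam H) = A lam^2 - B lam + C for the 2 x 2 case.
    A = H[0][0] * H[1][1] - H[0][1] ** 2
    B = S[0][0] * H[1][1] + S[1][1] * H[0][0] - 2 * S[0][1] * H[0][1]
    C = S[0][0] * S[1][1] - S[0][1] ** 2
    return (float(B) - math.sqrt(float(B * B - 4 * A * C))) / (2 * float(A))

print(lowest_estimate(1), lowest_estimate(2))
```

Larger $n$ would need a general solver for the generalised eigenvalue problem, which is beyond this sketch.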
The Ritz method can be applied to any functional with a minimum value. In
particular it applies to the general Sturm-Liouville system
$$S[y] = -\alpha p(a)y(a)^2 + \beta p(b)y(b)^2 + \int_a^b dx\, \left(py'^2 - qy^2\right), \tag{12.94}$$
with natural boundary conditions, and with the constraint
$$C[y] = \int_a^b dx\, wy^2 = 1 \tag{12.95}$$
which gives the Euler-Lagrange equation
$$\frac{d}{dx}\left(p\frac{dy}{dx}\right) + (q + \lambda w)y = 0, \tag{12.96}$$
with the separated boundary conditions
$$\alpha y(a) + y'(a) = 0, \quad\text{and}\quad \beta y(b) + y'(b) = 0,$$
see exercise 12.46. Provided the integrals exist, the Ritz method applies to singular and
regular systems. For the boundary conditions $y(a) = 0$ and/or $y(b) = 0$ the appropriate
boundary term of the functional is removed.
For this system a sequence can be found such that the smallest eigenvalue satisfies
the conditions of equation 12.93. Further, the rigorous application of this method proves
the existence of an infinite sequence of eigenvalues and eigenfunctions for both regular
and singular systems, see for instance Fomin and Gelfand (1992, chapter 8) or Courant
and Hilbert (1965, chapter 6).
By adding an additional constraint that forces the admissible functions to be or-
thogonal to $y_1(x)$, the eigenfunction associated with the smallest eigenvalue, we obtain
bounds for the next eigenvalue. Thus by considering the system defined by equa-
tions 12.94 and 12.95 with the additional constraint
$$C_1[y, y_1] = \int_a^b dx\, wyy_1 = 0, \tag{12.97}$$
and using trial functions $z$ satisfying the two constraints $C[z] = 1$ and $C_1[z, y_1] = 0$ we
obtain another convergent sequence
$$\lambda_2^{(1)} \ge \lambda_2^{(2)} \ge \cdots \ge \lambda_2^{(m)} \ge \cdots \ge \lambda_2 \quad\text{and}\quad \lim_{m\to\infty}\lambda_2^{(m)} = \lambda_2. \tag{12.98}$$
By adding further constraints this process can be continued to obtain upper bounds for
any eigenvalue.
Exercise 12.46
(a) Show that the constrained functional with natural boundary conditions
$$S[y] = -\alpha p(a)y(a)^2 + \beta p(b)y(b)^2 + \int_a^b dx\, \left(py'^2 - qy^2\right)$$
and the constraint
$$C[y] = \int_a^b dx\, wy^2 = 1$$
gives rise to the Euler-Lagrange equations
$$\frac{d}{dx}\left(p\frac{dy}{dx}\right) + (q + \lambda w)y = 0, \quad \alpha y(a) + y'(a) = 0, \quad \beta y(b) + y'(b) = 0,$$
where $\lambda$ is the Lagrange multiplier.

(b) Show that if $\lambda_k$ is the eigenvalue of the eigenfunction $y_k$, $\lambda_k = S[y_k]$.

Exercise 12.47
Find the normal forms, as defined in exercise 12.1(b), of Legendre's equation
$$\frac{d}{dx}\left((1 - x^2)\frac{dy}{dx}\right) + \lambda y = 0.$$
Exercise 12.48
Show that changing to the independent variable $t = \int_a^x dx\, \sqrt{q(x)}$ converts the
equation $y'' + p_1(x)y' + q(x)y = 0$, $a \le x \le b$, $q(x) > 0$, into
$$\frac{d^2y}{dt^2} + \frac{q'(x) + 2p_1q}{2q^{3/2}}\frac{dy}{dt} + y = 0.$$
Exercise 12.49
For problems defined inside an elliptical region it is sometimes convenient to use
elliptical coordinates defined by
$$x = \rho\cosh u\cos v \quad\text{and}\quad y = \rho\sinh u\sin v$$
where $\rho$ is a positive constant, so that
$$\frac{x^2}{\rho^2\cosh^2 u} + \frac{y^2}{\rho^2\sinh^2 u} = 1,$$
and when $v$ changes by $2\pi$, with $u$ fixed, this equation defines an ellipse.
Any elliptical boundary can be defined by a particular choice of $\rho$ and $u = u_0$,
and the interior of the ellipse is given by $0 \le u \le u_0$, $-\pi \le v \le \pi$.
In these coordinates the partial differential equation 2 + k2 = 0 becomes
2
2

1
+ + k2 = 0.
22 (cosh 2u cos 2v) u2 v 2
By putting = f (u)g(v) and using separation of variables form the two equations
d2 g
(a 2q cos 2v) g = 0, q = (k)2 , g(v + 2) = g(v) for all v,
dv 2
2
d f
+ (a 2q cosh 2u) = 0.
du2
The first of these equations is commonly known as Mathieu's equation and periodic
solutions exist only for certain values of $a(q)$.
Exercise 12.50
Keplers equation
Show that Keplers equation = u sin u with 0 < 1 can be inverted in
terms of Bessel functions with the formula,

X 1
u=+2 Jk (k) sin k.
k
k=1
Exercise 12.51
Show that the function defined by the integral 12.33 (page 490) satisfies the dif-
ferential equation 12.30, with = n.
Hint, by differentiating under the integral sign, show that Bessels equation can
be written in the form
Z
1 d n cos t
dt g(t)ei(ntx sin t) with g(t) = i + .
2 dt x2 t
Exercise 12.52
Find the eigenvalues and eigenfunctions of the Sturm-Liouville system y 00 +y = 0,
y(0) = 0, y() = y 0 () any real .
Exercise 12.53
If $f(x)$, $g(x)$ and $h(x)$ are any solutions of the second-order equation $y'' +
p_1(x)y' + q(x)y = 0$, show that the following determinant is zero
$$\begin{vmatrix} f & f' & f'' \\ g & g' & g'' \\ h & h' & h'' \end{vmatrix}.$$

Exercise 12.54
Using the results found in exercise 12.21 (page 499) to construct a linear, homo-
geneous, second-order differential equation having the solutions
(a) (sinh x, sin x), (b) (tan x, 1/ tan x).
Exercise 12.55
Use the results found in exercise 12.21 (page 499) to show that the equation
$$\frac{d^2y}{dx^2} - \frac{u'}{u}\frac{dy}{dx} - u^2 y = 0, \qquad u = \frac{f'}{f},$$
has solutions $f(x)$ and $1/f(x)$.
Exercise 12.56
Let $f(x)$, $g(x)$ and $h(x)$ be three solutions of the linear, third order differential
equation
$$\frac{d^3y}{dx^3} + p_2(x)\frac{d^2y}{dx^2} + p_1(x)\frac{dy}{dx} + p_0(x)y = 0.$$
Derive a first-order differential equation for the Wronskian
$$W(x) = \begin{vmatrix} f & g & h \\ f' & g' & h' \\ f'' & g'' & h'' \end{vmatrix}.$$
You will need to differentiate this determinant: the derivative of an $n \times n$
determinant, $\det(A)$, where the elements depend upon $x$, is
$$\frac{d}{dx}\det(A) = \sum_{k=1}^{n}\det(A_k),$$
where $A_k$ is the matrix formed by differentiating the $k$th row of $A$.

Exercise 12.57
Find the self-adjoint form of the equation y 00 + y 0 tan x = 0.
Exercise 12.58
Use a comparison theorem to show that the solutions of $y'' + \dfrac{x}{1+x}\,y = 0$ have
infinitely many zeros for $x > 1$.
Exercise 12.59
Show that the eigenvalues of the Sturm-Liouville system $y'' + \lambda y = 0$ with the
$2\pi$-periodic boundary conditions $y(-\pi) = y(\pi)$ and $y'(-\pi) = y'(\pi)$ are
$\lambda_n = n^2$, $n = 0, 1, 2, \ldots$, and that for each eigenvalue, except $\lambda_0$,
there are two distinct eigenfunctions, which can be expressed as the real or the complex functions
$$y_n(x) = \{\cos nx, \sin nx\} \quad\text{or}\quad y_n(x) = e^{\pm inx}, \qquad n = 0, 1, 2, \ldots.$$
Show also that any linear combination of the pair $e^{\pm inx}$ is an eigenfunction
with eigenvalue $\lambda_n = n^2$.
Exercise 12.60
(a) Using the new independent variable defined by $x = e^t$, show that if $B > 1/4$
the solutions of the equation $y''(x) + By/x^2 = 0$ have infinitely many zeros on $(1, \infty)$.
(b) Show that the solutions of the equation $y''(x) + q(x)y/x^2 = 0$ have infinitely many
zeros on $(1, \infty)$ if $q(x) > 1/4$ for $x \geq 1$.
Exercise 12.61
Consider the system
$$x\frac{d^2y}{dx^2} + \frac{dy}{dx} + \frac{\lambda}{x}\,y = 0, \qquad x \geq 0.$$
(a) Show that the self-adjoint form of this equation is
$$\frac{d}{dx}\left(x\frac{dy}{dx}\right) + \frac{\lambda}{x}\,y = 0, \qquad x \geq 0,$$
and determine the intervals on which it is a regular system and on which it is a
singular system.
(b) Show that the normal form of the equation is
$$\frac{d^2u}{dx^2} + \frac{\lambda + \frac14}{x^2}\,u = 0, \qquad u(x) = y(x)\sqrt{x},$$
and determine the intervals on which it is a regular system and on which it is a
singular system.
(c) Find any eigenvalues and eigenfunctions for the boundary conditions y(0) =
y(1) = c, for any c.
(d) Find the eigenvalues and eigenfunctions for the boundary conditions y(a) = y(b) = 0,
0 < a < b.
Exercise 12.62
The Schwarzian derivative
(a) If $f(x)$ and $g(x)$ are any two linearly independent solutions of the equation
$y'' + q(x)y = 0$, show that the ratio $v = f/g$ is a solution of the third order,
nonlinear equation $S(v) = 2q(x)$, where
$$S(v) = \frac{v'''}{v'} - \frac{3}{2}\left(\frac{v''}{v'}\right)^2.$$
(b) If $a$, $b$, $c$ and $d$ are four constants with $ad \neq bc$ deduce that
$$S\left(\frac{av + b}{cv + d}\right) = S(v).$$
The function $S(v)$ is named the Schwarzian derivative and has the important
property that if $S(F) < 0$ and $S(G) < 0$ in an interval, then $S(H) < 0$, where
$H(x) = F(G(x))$. This result is useful in the study of bifurcations of the fixed points
of one dimensional maps.

(a) By expanding equation 12.2, dividing by $p(x)$ and comparing the coefficients of $y'$
and $y$ (after division by $a_2(x)$) we obtain $\dfrac{p'}{p} = \dfrac{a_1}{a_2}$ and
$\dfrac{q}{p} = \dfrac{a_0}{a_2}$. Integrating the first equation gives
$$\ln p = \int dx\,\frac{a_1}{a_2}, \quad\text{that is}\quad p = \exp\left(\int dx\,\frac{a_1}{a_2}\right).$$
(b) Putting $y = uv$ in equation 12.2 gives
$$p\left(u''v + 2u'v' + uv''\right) + p'\left(u'v + uv'\right) + quv = 0,$$
which can be rearranged to
$$pvu'' + u'\left(2pv' + p'v\right) + u\left(qv + p'v' + pv''\right) = 0.$$
Choose $v$ to make the coefficient of $u'$ zero, that is $pv^2 = 1$, to give
$$u''\sqrt{p} + \left(\frac{q}{\sqrt{p}} + \frac{p'^2}{4p^{3/2}} - \frac{p''}{2\sqrt{p}}\right)u = 0.$$
Dividing by $\sqrt{p}$ gives the quoted result.

(a) With the Lagrange multiplier, the auxiliary functional is
Z b
dx py 0 2 (q + w)y 2

S[y] =
a
with the Euler-Lagrange equation (py 0 )0 + (q + w)y = 0.
(b) With 0 (x) = 1/p(x) this functional is

2 !
b0
dx dy d
Z
2
S[y] = d p (q + w)y
0 d d dx
Z b0 2 !
dy
= d p(q + w)y 2 ,
0 d
x
1
Z
where (x) = dt and b0 = (b). The associated Euler-Lagrange equation is
a p(t)
y 00 () + p(q + w)y = 0.
(c) With y = uv the functional S is

b
1
Z
S[y] = dx p u0 2 v 2 + (u2 )0 (v 2 )0 + u2 v 0 2 qu2 v 2
a 2
but
b b
d
Z Z
b
dx p(u2 )0 (v 2 )0 = pu2 (v 2 )0 a dx u2 (v 2 )0 p

a a dx
so, since u(a) = u(b) = 0, the functional becomes
b
1d
Z
2 02 2 02 2 0
2
S[y] = dx pv u qv pv + p(v ) u .
a 2 dx
Now set pv 2 = 1 to give

b
p0 2 p00

q
Z
02 2
S[y] = dx u + 2 u .
a p 4p 2p
The constraint then becomes

b
w 2
Z
C[u] = dx u .
a p
so that the Euler-Lagrange equation is
d2 u p0 2 p00

q
+ + u = 0, u = y p,
d 2 p 4p2 2p
which is the same as equation 12.3, if in that equation, q is replaced by q + w.

(a) In terms of the functional is
2 !
b0
d dy 0
Z
2
S[y] = 0
p (x) (q + w)y
a0 (x) d
Z b0
1
= d p 0 (x)y 0 ()2 0 (q + w)y 2 ,
a0 (x)
where a0 = (a) and b0 = (b). Now put y = A()v() to give
b0 2
1 A
Z
S[v] = d p 0 (x) A2 v 0 2 + (A2 )0 (v 2 )0 (q + w) p 0
(x)A 02
v2 .
a0 2 0 (x)
But, integration by parts gives

b0 b0
d
Z Z
b0
d p 0 (x)(A2 )0 (v 2 )0 = p 0 (x)(A2 )0 v 2 a0 d v 2 p 0 (x)(A2 )0

a0 a0 d
so that
ib 0
1h 0
S[v] = p (x)(A2 )0 v 2
2 a0
Z b0 2
A 1d 0
+ d A2 p 0 (x)v 0 2 (q + w) p 0 02
A + p (x)(A 2 0
) v2 .
a0 0 (x) 2 d
Now define (x) with the equation A2 p 0 (x) = 1 to put S[v] in the simpler form
b0 Z b0
1 (A2 )0 v 2

d v 0 2 F ()v 2

S[v] = 2
+
2 A a0 a0
where
A0 2 (A2 )0

1d
F () = (q + w)A4 p + .
A2 2 d A2
But
A0 2 (A2 )0 A0 2 d A0 A00 2A0 2 d2

1d 1
2 + = 2 + = = A ,
A 2 d A2 A d A A A2 d 2 A
and hence
d2

4 1
F () = (q + w)pA A 2 .
d A
(b) The coefficient of is made unity by defining wpA4 = 1 and then
d2

q 1
F () = + A 2 , where A = (wp)1/4
w d A
and the Euler-Lagrange equation is

s
x
d2 v v() w(t)
Z
+ F ()v = 0, y= , (x) = dt .
d 2 (wp)1/4 a p(t)

Substituting = XY into the equation and dividing by gives
1 d2 X 1 d2 Y
+ + k 2 = 0.
X dx2 Y dy 2
Thus defining the two constants 1 and 2 by the equations
1 d2 X 1 d2 Y
= 12 and = 22
X dx2 Y dy 2
gives the two quoted equations if 12 + 22 = k 2 .

Since = 0 on the boundary we have X(0)Y (y) = X(a)Y (y) = 0 which, for
nontrivial Y (y), gives X(0) = X(a) = 0. Similarly Y (0) = Y (b) = 0.

We need to express xx and yy in terms of the differentials of r and , using the chain
rule which gives
r r
= + and = + .
x r x x y r y y
But, since r2 = x2 + y 2 we have r/x = x/r = cos and r/y = y/r = sin . Also
by differentiating x = r cos and y = r sin with respect to x we obtain
r
1 = cos r sin
x x = r = sin
r x
0 = sin + r cos
x x
and, by differentiating with respect to y
r
0 = cos r sin
y y
= r = cos .
r y
1 = sin + r cos
y y
Hence

1 1
= cos sin and = sin + cos .
x r r y r r
Applying the first formula twice gives
2

1 1
= cos sin cos sin
x2 r r r r
2

2 1 1 1
= cos 2 sin cos sin cos + 2 sin sin
r r r r r r
and similarly
2 2

1 1 1
2
= sin2 2 cos cos + cos sin + 2 cos cos .
y r r r r r r
Adding these two expressions gives
2 1 1 2
2 = + + .
r2 r r r2 2
If = R(r)() the equation 2 + k 2 = 0 becomes, on division by and multipli-
cation by r2 ,
r2 2 R r R 2 2 1 2
+ + k r + = 0.
R r2 R r 2
Putting = 2 , where is a positive constant, gives
d2
+ 2 = 0,
d2
d2 R dR
r2 2 + r + k 2 r2 2 R = 0.

dr dr
The constant is chosen to ensure that () is 2-periodic.

Since (rR0 )0 = rR00 + R0 we can write the equation for R in the self-adjoint form
2

d dR 2
r + k r R = 0.
dr dr r

(a) If < 0, put = 2 ( > 0) to give y 00 2 y = 0 with the general solution
y = A cosh x + B sinh x. The boundary condition at x = 0 gives A = 0 and at x = ,
B sinh = 0, which can be satisfied only if B = 0 (since is real).
If = 0 the general solution is y = A + Bx, which satisfies the boundary conditions
only if A = B = 0.
(b) If > 0, put = 2 ( > 0) to give y 00 + 2 y = 0 with the general solution
y = A cos x + B sin x. The boundary condition at x = 0 gives A = 0 and at x = ,
B sin = 0, which is satisfied if = n, n = 1, 2, . Hence the eigenvalues and
eigenfunctions are
n = n 2 , yn (x) = B sin nx, n = 1, 2, .

(a) We have

d dy X d dk X
p + qy = yk p + qk = k yk w(x)k .
dx dx dx dx
k=1 k=1
(b) If y(x) is a solution of the inhomogeneous equation 12.29 this gives

X
F (x) = k yk w(x)k .
k=1
Now multiply this equation by p (x) , integrate and use the orthogonality relation 12.25
to obtain Z b
du p (u) F (u) = p yp hp ,
a
which gives a value for yp . Substituting this value for yk into the original sum for y(x)
gives a solution of the inhomogeneous equation in the form
b b
1
X Z Z
y(x) = du F (u)k (u) k (x) = du G(x, u)F (u)
k hk a a
k=1

X k (u) k (x)
where G(x, u) = .
hk k
k=1
Solution for Exercise 12.8 Z

dx
(a) Comparing with the equation in exercise 12.1 we see that p = exp = x
x
and that the self-adjoint form is
2

d dy
x + x y = 0.
dx dx x

(b) In this example p = x, q = x 2 /x so v = 1/ p = 1/ x and
2 2 14

1
I(x) = 2 1 + 4x x =1 .
4x x x2
12.3 we put p = x, q = x, = 2 and

(c) Comparing with the equations in exercisep
1/4 0
w = 1/x, so A = (pw) = 1 and (x) = w/p = 1/x, giving = ln x. Then the
transformed equation is
d2 v
+ e2 2 v = 0,

y(x()) = v().
d 2
The solution of this equation is J (e ).

(a) (i) Put n = m in equation 12.31

X
eiz sin t = Jm (z)eimt
m=
now put t = s, so sin t = sin s to put this in the form

X
eiz sin s = Jm (z)eim eims .
m=
Compare the nth coefficient of this and the original series to obtain the first result.
P
(ii) Put z = x in equation 12.31 eix sin t = n= Jn (x)eint , and now set t = +s,
P
so sin t = sin s to obtain eix sin t = n= Jn (x)ein eint . Compare the nth
coefficient of this and the original series to obtain the second result.
(iii) Put t = 0

X
X
X
1= Jn (z) = J0 (z) + Jn (z) + Jn (z) = J0 (z) + 2 J2n (z),
n= n=1 n=1
since the terms with odd n cancel.

(b) Putting z = 0 gives

1
Z
int 1, n = 0,
Jn (0) = dt e =
2 0, otherwise.
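(c) Differentiate equation 12.33 (page 490) with respect to $x$,
$$\frac{dJ_n}{dx} = \frac{1}{2\pi}\int_{-\pi}^{\pi} dt\,(-i\sin t)\exp i(nt - x\sin t)
= -\frac{1}{4\pi}\int_{-\pi}^{\pi} dt\,\left(e^{it} - e^{-it}\right)e^{i(nt - x\sin t)}
= \frac12\left(J_{n-1}(x) - J_{n+1}(x)\right),$$
that is $2J_n'(x) = J_{n-1}(x) - J_{n+1}(x)$.
(d) The sum is
$$J_{n-1}(x) + J_{n+1}(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} dt\,\left(e^{i((n-1)t - x\sin t)} + e^{i((n+1)t - x\sin t)}\right)
= \frac{1}{2\pi}\int_{-\pi}^{\pi} dt\,e^{i(nt - x\sin t)}\left(e^{-it} + e^{it}\right)
= \frac{1}{\pi}\int_{-\pi}^{\pi} dt\,\cos t\;e^{i(nt - x\sin t)}.$$
But
$$\frac{d}{dt}e^{i(nt - x\sin t)} = i(n - x\cos t)\,e^{i(nt - x\sin t)}$$
so
$$\cos t\;e^{i(nt - x\sin t)} = \frac{n}{x}e^{i(nt - x\sin t)} - \frac{1}{ix}\frac{d}{dt}e^{i(nt - x\sin t)},$$
and, since the integrated term is $2\pi$-periodic and so contributes nothing,
$$J_{n-1}(x) + J_{n+1}(x) = \frac{n}{\pi x}\int_{-\pi}^{\pi} dt\,e^{i(nt - x\sin t)} - \frac{1}{i\pi x}\int_{-\pi}^{\pi} dt\,\frac{d}{dt}e^{i(nt - x\sin t)} = \frac{2n}{x}J_n(x).$$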
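
The two recurrence relations, $2J_n'(x) = J_{n-1}(x) - J_{n+1}(x)$ and $J_{n-1}(x) + J_{n+1}(x) = (2n/x)J_n(x)$, can be checked numerically. The following is only a minimal sketch: it assumes that equation 12.33 is the integral representation $J_n(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} dt\, e^{i(nt - x\sin t)}$, and the function names `bessel_j` and `check_recurrences` are illustrative, not taken from the text.

```python
import cmath
import math


def bessel_j(n, x, m=400):
    """Approximate J_n(x) from the integral (assumed form of equation 12.33)
    (1/2pi) * int_{-pi}^{pi} exp(i(n t - x sin t)) dt.
    The trapezoidal rule is used; it converges very quickly for smooth
    periodic integrands such as this one."""
    h = 2.0 * math.pi / m
    total = 0.0 + 0.0j
    for j in range(m):
        t = -math.pi + j * h
        total += cmath.exp(1j * (n * t - x * math.sin(t)))
    return (total * h / (2.0 * math.pi)).real


def check_recurrences(n, x, eps=1e-5):
    """Return the residuals of the two recurrence relations at (n, x).
    J_n'(x) is estimated with a central difference, so the first residual
    is limited by that finite-difference error."""
    dj = (bessel_j(n, x + eps) - bessel_j(n, x - eps)) / (2.0 * eps)
    r1 = 2.0 * dj - (bessel_j(n - 1, x) - bessel_j(n + 1, x))
    r2 = bessel_j(n - 1, x) + bessel_j(n + 1, x) - 2.0 * n / x * bessel_j(n, x)
    return r1, r2


if __name__ == "__main__":
    for n, x in [(0, 1.0), (3, 2.5), (5, 7.0)]:
        print(n, x, check_recurrences(n, x))
```

Both residuals should be close to zero for any integer $n$ and any $x \neq 0$; the first is less accurate only because the derivative is approximated numerically.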

(a) If = 0 the solution is y = constant. Otherwise the general solution that fits the
boundary condition at x = 0 is

A cosh x, < 0,
y=
A cos x, > 0.
Only if > 0 can the boundary condition at x = be satisfied, so put = 2 ( > 0)

and then sin = 0, that is = n, n = 0, 1, 2, and = n2 . Note that n = 0 gives
the = 0 eigenvalue. Hence the eigenfunctions and eigenvalues are
yn = cos nx, n = n2 , n = 0, 1, 2, .
For the orthogonality condition the integral needed is

(
Z
1
Z 0, n 6= m,
dx cos nx cos mx = dx cos(n m)x + cos(n + m)x =
0 2 0 , n = m.
2
(b) The general solution that fits the boundary condition at x = 0 is

A sinh x, < 0,
y= Ax, = 0,
A sin x, > 0.

Only if > 0 can the boundary condition at x = be satisfied, so put = 2 ( > 0)

and then cos = 0, that is = n + 1/2, n = 0, 1, 2, and = (n + 1/2)2 . Hence
the eigenfunctions and eigenvalues are
yn = sin(n + 1/2)x, n = (n + 1/2)2 , n = 0, 1, 2, .
For the orthogonality condition the integral needed is

(
Z
1
Z 0, n 6= m,
dx sin(n+ 21 )x sin(m+ 12 )x = dx cos(n m)x cos(n + m + 1)x =
0 2 0 , n = m.
2
(c) The general solution that fits the boundary condition at x = 0 is

A sinh x, < 0,
y= Ax, = 0,
A sin x, > 0.

If = 0 the boundary condition at x = cannot be satisfied. If < 0, put = 2

( > 0) to give tanh = and this has one real (positive) solution (as can be seen
by sketching the graphs of either side of the equation). Hence there is one negative
eigenvalue.
If > 0, put = 2 ( > 0) and the boundary condition at x = gives tan = ,

which has an infinity of positive solutions, k , k = 1, 2, , with (k1)/2 < k < k/2
and k k/2 as k , as can be seen by sketching the graphs of either side of the
equation.
Hence the eigenfunctions and eigenvalues are
y0 (x) = sinh 0 x, 0 = 02 , tanh 0 = 0 , 0 > 0,

yn (x) = sin n x, n = n2 , tan n = n , n > 0.
The orthogonality condition is more difficult to establish in this case. First consider I0n ,
Z Z
I0n = dx sinh 0 x sin n x = i dx sin i0 x sin n x
0 0
Z
sin(n i0 )
= i= dx cos(n i0 )x = i=
0 n i0

i
= = ( n + i 0 ) sin n cos i 0 cos n sin i 0 ,
n2 + 02
where =(z) is the imaginary part of z. Using the definitions of k we see that the term
in the outer brackets is real, and hence I0n = 0.
For n, m 6= 0 we have
Z
1
Z
Inm = dx sin n x sin m x = dx (cos(n m )x cos(n + m )x)
0 2 0

1 sin(n m ) sin(n + m )
=
2 n m n + m
m sin n cos m n cos n sin m
=
n2 m2
n m
= cos n cos m cos n cos m .
n2 m
2
If n 6= m this is zero. The case n = m can be obtained from this using LHospitals
rule. Alternatively,
Z
2 1 2
1 1
Inn = dx sin n x = cos n =
0 2 2 1 + n2
so Inn /2 as n .
Finally

1 1 1
Z
2
I00 = dx sinh 0 x = cosh2 0 = .
0 2 2 1 02

In all cases we use the formula 12.28 (page 487) with f (x) = x and w(x) = 1.
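(a) The Fourier components are $a_0 = \pi/2$ and
$$a_n = \frac{\int_0^\pi dx\,x\cos nx}{\int_0^\pi dx\,\cos^2 nx} = \frac{2\left((-1)^n - 1\right)}{\pi n^2}
= \begin{cases} 0, & n \text{ even},\\[4pt] -\dfrac{4}{\pi n^2}, & n \text{ odd},\end{cases} \qquad n = 1, 2, \ldots,$$
giving
$$x = \frac{\pi}{2} - \frac{4}{\pi}\sum_{k=0}^{\infty}\frac{\cos(2k+1)x}{(2k+1)^2}.$$
(b) The Fourier components are
$$a_n = \frac{\int_0^\pi dx\,x\sin(n + \frac12)x}{\int_0^\pi dx\,\sin^2(n + \frac12)x} = \frac{8}{\pi}\frac{(-1)^n}{(2n+1)^2}, \qquad n = 0, 1, 2, \ldots,$$
giving
$$x = \frac{2}{\pi}\sum_{k=0}^{\infty}\frac{(-1)^k}{(k + \frac12)^2}\sin\left(k + \tfrac12\right)x.$$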
(c) The Fourier components for n = 0 is

R
dx x sinh 0 x 2( 1) cosh 0
a0 = R0 2 =
0 dx sinh 0 x 0 ( cosh2 0 )
since
Z
1 1
Z
dx sinh2 0 x = cosh2 0 ,

dx x sinh 0 x = ( 1) cosh 0 ,
0 0 0 2
where the definition sinh 0 = 0 cosh 0 has been used. For n 1, we use the
results
Z Z
1 1
dx sin2 n x = cos2 n

dx x sin n x = ( 1) cos n ,
0 n 0 2
to obtain
R
dx x sin n x 2( 1) cos n
an = R0 2 = n = 1, 2, 3 ,
0
dx sin n x n ( cos2 n )
giving

2( 1) cosh 0 X cos k sin k x
x= 2 sinh 0 x 2( 1) 2
.
0 ( cosh 0 ) k ( cos k )
k=1
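
The expansions of parts (a) and (b), $x = \frac{\pi}{2} - \frac{4}{\pi}\sum_{k\ge 0}\cos(2k+1)x/(2k+1)^2$ and $x = \frac{2}{\pi}\sum_{k\ge 0}(-1)^k\sin(k+\frac12)x/(k+\frac12)^2$ on $0 \le x \le \pi$, can be tested by summing a finite number of terms. This is a rough sketch only; the function names and the number of terms are arbitrary choices made for illustration.

```python
import math


def cosine_series(x, terms=2000):
    """Partial sum of the cosine expansion of f(x) = x from part (a)."""
    s = sum(math.cos((2 * k + 1) * x) / (2 * k + 1) ** 2 for k in range(terms))
    return math.pi / 2 - 4 / math.pi * s


def half_integer_sine_series(x, terms=2000):
    """Partial sum of the expansion of f(x) = x in sin((k + 1/2) x), part (b)."""
    s = sum((-1) ** k * math.sin((k + 0.5) * x) / (k + 0.5) ** 2
            for k in range(terms))
    return 2 / math.pi * s


if __name__ == "__main__":
    for x in (0.3, 1.0, 2.0, 3.0):
        print(x, cosine_series(x), half_integer_sine_series(x))
```

Because the coefficients decay like $1/k^2$, the truncation error after $N$ terms is of order $1/N$, so a few thousand terms give roughly three correct decimal places.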

(a) If < 0 the nontrivial solution cannot be made to satisfy the boundary conditions.
If = 0 the solution y = 1 satisfies the boundary condition.
If > 0, put = 2 , ( > 0) giving the general solution (note it is easier to use the
complex form here)
y = Aeix + Beix .
The boundary conditions give
A+B = Ae2ia + Be2ia
and therefore A = Ae2ia and B = Be2ia .
AB = Ae2ia Be2ia
If a = n, n = 0, 1, 2, , both equations are satisfied otherwise they are not. Hence
there are two linearly independent solution (except if n = 0), y = exp(inx/a) with
the eigenvalue n = 2 = (n/a)2 .
Alternatively we may use the linear combinations,
n nx nx o n 2
yn = cos , sin , n = , n = 0, 1, 2, .
a a a
(b) Consider the integral

Z 2a Z 2
dx u1 u2 = a dw A1 A2 cos2 nw + B1 B2 sin2 nw
0 0

+ (A1 B2 + A2 B1 ) sin nw cos nw
= a (A1 A2 + B1 B2 ) ,
which is zero only if A1 A2 +B1 B2 = 0, that is the vectors a = (A1 , A2 ) and b = (B1 , B2 )
are orthogonal.

First assume < 0 and put = 2 , > 0, giving y 00 2 y = 0 with the general
solution y = A cosh x + B sinh x. The boundary condition at x = 0 gives A = 0 and
then the other boundary condition gives sinh = a. This equation has no positive
roots if a < , and one if a > , as may be seen by sketching the graphs of a and
sinh .
If = 0 the solution satisfying the boundary conditon at x = 0 is y = Ax and the
second boundary condition gives a = .
Hence all eigenvalues are positive. Put = 2 , > 0, giving y 00 + 2 y = 0 with the
solution satisfying the condition at x = 0 being y = B sin x. The second boundary
condition gives sin = a.
In figure 12.14 the graphs of u = sin and u = a, for some representative values
of a are shown.
Figure 12.14 Graphs of the functions u = sin and u = a for a = 1/5
and 1/10.
For > 0 these curves intersect if < c ' 1/a, giving real zeros; for > c the zeros
are complex. There are about N ' 1/a zeros because there is one zero every time
passes through an integer. Hence there are a finite number of real zeros.
Consider the inner product of two distinct eigenfunctions, yi and yj with j > i.
Z
1
Z
Iij = dx sin i x sin j x = dx (cos(j i )x cos(j + i )x)
0 2 0

1 sin(j i ) sin(j + i )
= sin k = ak , k = i, j
2 j i j + i

a j cos i i cos j j cos i + i cos j
=
2 j i j + i
aj i
= (cos i cos j ) .
j2 i2
It is obvious that cos j + cos i 6= 0, but this is easily proved. We note that
a(j i ) = 2 sin(j i ) 2 cos(j + i ) 2 > 0

a(j + i ) = 2 sin((j + i ) 2 cos((j i ) 2 > 0
and since

cos j cos i = 2 sin(j i ) sin(j + i ) ,
2 2
it follows that Iji 6= 0.

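(a) Put $x = e^{-t}$ so
$$\frac{dy}{dx} = \frac{dy}{dt}\frac{dt}{dx} \quad\text{giving}\quad x\frac{dy}{dx} = -\frac{dy}{dt}, \quad\text{similarly}\quad x\frac{d}{dx}\left(x\frac{dy}{dx}\right) = \frac{d^2y}{dt^2},$$
and the equation becomes
$$\frac{d^2y}{dt^2} - \frac{dy}{dt} + \lambda y = 0.$$
If we put $y = e^{pt}$ the equation for $p$ is $p^2 - p + \lambda = 0$ giving $2p = 1 \pm \sqrt{1 - 4\lambda}$. Thus
the general solution is
$$y = e^{t/2}\left(Ae^{-qt} + Be^{qt}\right), \qquad q = \frac12\sqrt{1 - 4\lambda},$$
or, in terms of $x$,
$$y = \frac{1}{\sqrt{x}}\left(Ax^{q} + Bx^{-q}\right), \qquad q = \frac12\sqrt{1 - 4\lambda}, \quad \lambda < \frac14,$$
$$y = \frac{1}{\sqrt{x}}\left(Ae^{i\omega\ln x} + Be^{-i\omega\ln x}\right), \qquad \omega = \frac12\sqrt{4\lambda - 1}, \quad \lambda > \frac14,$$
and if $\lambda = 1/4$, the general solution is $y = e^{t/2}(A + Bt)$ giving
$$y = \frac{1}{\sqrt{x}}\left(A - B\ln x\right), \qquad \lambda = \frac14.$$
These solutions are bounded as $x \to 0$ only if $\lambda < \frac14$ and then only if $B = 0$ and
$q - \frac12 > 0$, that is $\lambda < 0$. Thus if $c > 0$ the solution is
$$y = cx^{q - 1/2}, \qquad q = \frac12\sqrt{1 - 4\lambda}, \quad\text{for all } \lambda < 0.$$
If $c = 0$ there are no nontrivial solutions.
(b) With the boundary conditions $y(a) = y(1) = 0$ the system is regular and we have:
$\lambda < 1/4$: the boundary conditions give
$$Aa^{q} + Ba^{-q} = 0 \quad\text{and}\quad A + B = 0,$$
which have no nontrivial real solutions for $A$ and $B$.
$\lambda = 1/4$: the boundary conditions give $A = B\ln a$ and $A = 0$, so there is no nontrivial
solution.
$\lambda > 1/4$: the boundary conditions give
$$Ae^{i\omega\ln a} = -Be^{-i\omega\ln a} \quad\text{and}\quad A + B = 0,$$
hence $\omega = n\pi/\ln a$ giving the eigenfunctions and eigenvalues
$$y_n(x) = \frac{1}{\sqrt{x}}\sin\left(n\pi\frac{\ln x}{\ln a}\right), \qquad \lambda_n = \frac14 + \left(\frac{n\pi}{\ln a}\right)^2, \qquad n = 1, 2, \ldots.$$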

The homogeneous equation, $y'' + y = 0$, has the general solution $y = A\cos x + B\sin x$. A
particular solution of $y'' + y = x$ is $y = x$, so the general solution of the inhomogeneous
equation is $y = x + A\cos x + B\sin x$.

Suppose y() = y 0 () = 0, then a solution is y(x) = 0 for all x, which, since by P2
this is the only solution, contradicts the assumption that y(x) is nontrivial. Hence if
y() = 0, y 0 () 6= 0 and the zero is simple.

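The vectors are dependent only if parallel, that is $\mathbf{x}\cdot\mathbf{y} = \pm|\mathbf{x}||\mathbf{y}|$. The vectors are not
parallel if $\mathbf{x}\cdot\mathbf{y} \neq \pm|\mathbf{x}||\mathbf{y}|$, that is
$(x_1^2 + x_2^2)(y_1^2 + y_2^2) \neq (x_1y_1 + x_2y_2)^2$, which rearranges
to $(x_1y_2 - x_2y_1)^2 \neq 0$, that is
$$\begin{vmatrix} x_1 & x_2 \\ y_1 & y_2 \end{vmatrix} \neq 0.$$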
y1 y2

Since g 0 (x) = F 0 (f )f 0 (x), W (f, g) = [f F 0 (f )F (f )]f 0 (x), so W = 0 only if F 0 /F = 1/f,
that is g(x) = cf (x).

The Wronskian is

W = a1 cos x a2 sin x b1 sin x + b2 cos x

a1 sin x + a2 cos x b1 cos x b2 sin x = a1 b2 a2 b1 .
The functions are linearly independent if W 6= 0, that is if a1 b2 6= a2 b1 .

In this example $p_2 = 1$ and $p_1 = 0$ so equation 12.44 becomes
$$g_1(x) = f(x)\left(C + W(a)\int_a^x ds\,\frac{1}{f(s)^2}\right).$$
Putting $C = 0$ and $g_1 = W(a)g(x)$ (permissible because the equation is linear) gives
$$g(x) = f(x)\int_a^x ds\,\frac{1}{f(s)^2}.$$
Differentiating this twice gives
$$g'(x) = f'(x)\int_a^x ds\,\frac{1}{f(s)^2} + \frac{1}{f(x)} \quad\text{and}\quad g''(x) = f''(x)\int_a^x ds\,\frac{1}{f(s)^2},$$
and hence, since $f'' = -qf$, $g'' + qg = 0$.

(a) The functions $f$ and $g$ satisfy the equations
$$f'' + p_1f' + p_0f = 0 \quad\text{and}\quad g'' + p_1g' + p_0g = 0.$$
Multiply the first by $g$, the second by $f$ and subtract to obtain
$$f''g - g''f + (f'g - fg')\,p_1 = 0 \quad\text{that is}\quad p_1 = -\frac{fg'' - f''g}{W(f, g; x)}.$$
Multiply the first by $g'$, the second by $f'$ and subtract to obtain
$$f''g' - g''f' + (fg' - gf')\,p_0 = 0 \quad\text{that is}\quad p_0 = \frac{f'g'' - g'f''}{W(f, g; x)}.$$
(b) (i) If $f = x$ and $g = \sin x$ then $W = x\cos x - \sin x$ and
$$p_1 = \frac{x\sin x}{x\cos x - \sin x} \quad\text{and}\quad p_0 = -\frac{\sin x}{x\cos x - \sin x},$$
giving the equation $(x\cos x - \sin x)y'' + y'x\sin x - y\sin x = 0$, which has singular points
at the roots of $\tan x = x$.
(ii) If $f = x^a$ and $g = x^b$ then $W = (b - a)x^{a+b-1}$ and
$$p_1 = -\frac{a + b - 1}{x} \quad\text{and}\quad p_0 = \frac{ab}{x^2},$$
giving the equation $x^2y'' - (a + b - 1)xy' + aby = 0$, which has a singular point at $x = 0$.
(iii) If $f = x$ and $g = e^{ax}$ then $W = (ax - 1)e^{ax}$ and
$$p_1 = \frac{xa^2}{1 - ax} \quad\text{and}\quad p_0 = -\frac{a^2}{1 - ax},$$
giving the equation $(1 - ax)y'' + xa^2y' - a^2y = 0$, which has a singular point at $x = 1/a$.
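
As a sanity check on part (b), each pair of functions can be substituted into the equation constructed from it, in the forms $(x\cos x - \sin x)y'' + x\sin x\,y' - \sin x\,y = 0$, $x^2y'' - (a+b-1)xy' + aby = 0$ and $(1-ax)y'' + a^2xy' - a^2y = 0$. The sketch below does this with hand-coded derivatives; the helper names and the sample values of $a$, $b$ and $x$ are arbitrary illustrative choices.

```python
import math


def residual_i(y, dy, d2y, x):
    """Left-hand side of (x cos x - sin x) y'' + x sin x y' - sin x y."""
    return ((x * math.cos(x) - math.sin(x)) * d2y(x)
            + x * math.sin(x) * dy(x) - math.sin(x) * y(x))


def residual_ii(y, dy, d2y, x, a, b):
    """Left-hand side of x^2 y'' - (a + b - 1) x y' + a b y."""
    return x ** 2 * d2y(x) - (a + b - 1) * x * dy(x) + a * b * y(x)


def residual_iii(y, dy, d2y, x, a):
    """Left-hand side of (1 - a x) y'' + a^2 x y' - a^2 y."""
    return (1 - a * x) * d2y(x) + a ** 2 * x * dy(x) - a ** 2 * y(x)


def power(c):
    """Return x^c with its first two derivatives."""
    return (lambda x: x ** c,
            lambda x: c * x ** (c - 1),
            lambda x: c * (c - 1) * x ** (c - 2))


# Each solution is stored as a triple (y, y', y'').
LINEAR = (lambda x: x, lambda x: 1.0, lambda x: 0.0)
SINE = (math.sin, math.cos, lambda x: -math.sin(x))


def exponential(a):
    """Return exp(a x) with its first two derivatives."""
    return (lambda x: math.exp(a * x),
            lambda x: a * math.exp(a * x),
            lambda x: a * a * math.exp(a * x))


if __name__ == "__main__":
    x, a, b = 1.7, 2.0, 3.5
    print(residual_i(*LINEAR, x), residual_i(*SINE, x))
    print(residual_ii(*power(a), x, a, b), residual_ii(*power(b), x, a, b))
    print(residual_iii(*LINEAR, x, a), residual_iii(*exponential(a), x, a))
```

All six residuals should vanish to rounding error. Dropping the factor $x$ from the $y'$ term in case (ii) gives a non-zero residual, which is a quick way to see that the factor is needed.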

If $x > 0$, $f = g$ and $W(f, g) = 0$. If $x < 0$ then $g = -f$ and again $W(f, g) = 0$.
However, for $x > 0$, $(f, f') = (g, g')$ and for $x < 0$, $(f, f') = -(g, g')$. For $x \in (-1, 0)$ or
$x \in (0, 1)$ the functions are linearly dependent; but they are not linearly dependent for
$x \in (-1, 1)$.
The function $g(x)$ is not differentiable at $x = 0$, so is not a solution of a second
order homogeneous differential equation with continuous coefficients.

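Differentiate the Wronskian and use the fact that $f$ and $g$ satisfy equation 12.41,
$$\frac{dW}{dx} = fg'' - f''g = \frac{g}{p_2}\left(p_1f' + p_0f\right) - \frac{f}{p_2}\left(p_1g' + p_0g\right)
= \frac{p_1}{p_2}\left(gf' - g'f\right) = -\frac{p_1}{p_2}W.$$
Integrate this equation,
$$\Big[\ln W\Big]_a^x = -\int_a^x dt\,\frac{p_1(t)}{p_2(t)}. \quad\text{Hence}\quad W(x) = W(a)\exp\left(-\int_a^x dt\,\frac{p_1(t)}{p_2(t)}\right).$$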

If w(x) > 0, p(x) > 0 and both have continuous second derivatives, the function
d2

q 1
f () = A 2
w d A
is continuous and hence has a minimum value Qm . If < Qm then f () + < 0 for
all and the result proved in the text shows that any solution v() has at most one
zero, so cannot satisfy the boundary conditions. Hence there are no eigenvalues smaller
than Qm .

(a) For x < 0 the equation is like equation 12.48 with Q(x) = x < 0. Hence all solutions
have at most one zero for x < 0.
For x 1 use the comparison theorem 12.2 with Q1 = x and Q2 = 1 Q1 . The
comparison equation has a solution sin x with infinitely many zeros in between which
there is at least one zero of the equation y 00 +xy = 0. Let these zeros be rn , n = 1, 2, .
(b) If = ax the equation becomes
1 d2 y
+ ay = 0 that is v 00 () + a3 v() = 0.
a2 d 2
Thus if y(x) is a solution of y 00 + xy = 0, v() = y(a) is a solution of v 00 + a3 v = 0.
(c) Suppose the solution of y 00 + xy = 0 with the condition y(0) = 0 is v(x); then
v(x) = y(1/3 x) where y(x) is a solution of y 00 + xy = 0.
If = rn3 then v(1) = y(rn ) = 0, so y(rn x) is an eigenfunction with eigenvalue n = rn3 ,
and there are infinitely many of these.
There are no negative eigenvalues because y(0) = 0 and there can be no other zeros.

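We have
$$\begin{aligned}
v(Lu) - u(Lv) &= v\left[\frac{d}{dx}\left(p\frac{du}{dx}\right) + qu\right] - u\left[\frac{d}{dx}\left(p\frac{dv}{dx}\right) + qv\right] \\
&= v\left(\frac{dp}{dx}\frac{du}{dx} + p\frac{d^2u}{dx^2}\right) - u\left(\frac{dp}{dx}\frac{dv}{dx} + p\frac{d^2v}{dx^2}\right) \\
&= \frac{dp}{dx}\left(v\frac{du}{dx} - u\frac{dv}{dx}\right) + p\left(v\frac{d^2u}{dx^2} - u\frac{d^2v}{dx^2}\right) \\
&= \frac{dp}{dx}\left(v\frac{du}{dx} - u\frac{dv}{dx}\right) + p\frac{d}{dx}\left(v\frac{du}{dx} - u\frac{dv}{dx}\right) \\
&= \frac{d}{dx}\left[p\left(v\frac{du}{dx} - u\frac{dv}{dx}\right)\right].
\end{aligned}$$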

The boundary term, B, of equation 12.53 is

B = p(b) v(b)u0 (b) u(b) v 0 (b) p(a) v(a)u0 (a) u(a) v 0 (a)

= p(b) p(a) v(b)u0 (b) u(b) v 0 (b) ,
which is zero, for all u and v, only if p(a) = p(b).

For Lu = du/dx we have
Z Z
dv
h

i du
(u, Lv) = dx u = u v dx v = (Lu, v),
dx dx
since u and v tend to zero as |x| 0.

For Lu = idu/dx we have, similarly

du

dv du
Z Z Z
(u, Lv) = i dx u = i dx v= dx i v = (Lu, v),
dx dx dx
so that L is self-adjoint, but L is not.

For this operator equation 12.53, with p = 1, holds so, with u and v real and satisfying
the boundary conditions we have

(Lu, v) = (u, Lv) = v()u0 () u()v 0 () v(0)u0 (0) u(0)v 0 (0)

= B v() u() A u0 (0) v 0 (0) .
The right hand side is zero for all u and v only if A = B = 0.

In this case p = 1 and for real functions

(Lu, v) (u, Lv) = v(0)u0 (0) u(0)v 0 (0) v()u0 () u()v 0 ()

= a u0 (0)v 0 () v 0 (0)u0 () ,
which is zero for all u and v only if a = 0.

(a) Equation 12.63 gives tan 0 = 0, so 0 = 0. Since y = r sin , y(x) = 0 when = n,
n = 1, 2, . For the nth eigenfunction y() = 0 and this is the nth zero, so is given
implicitly by
n /2
1 1 n
Z Z
= d = 2n d = .
0 cos + sin2
2
0 cos2 2
+ sin
showing that the nth eigenvalue is given by = n2 .

As increases from 0 to n, it passes through , 2, , (n 1), at which points
y(x) = 0, so there are precisely n 1 zeros.
(b) If () = (, ) it is defined implicitly by equation 12.65,
Z
1
= d 2 + sin2
.
0 cos
Differentiating with respect to gives

1 d sin2
Z
0= d ,
cos2 + sin2 d 0 (cos2 + sin2 )2
which gives the required result, and shows that increases with if > 0.
(c) If = n2 , (, ) = n, so as n , (, n2 ) = n . But (, ) is a
continuous, monotonic increasing function of , hence (, ) for all .

Since y = r sin (x), with (a) = 0, the kth zero of y(x) occurs at (x) = k. But
dx 1
= = F (x, )
d Q(x) sin2 + p(x)1 cos2
and we have
1 1
2 1 = G1 () F (x, ) G2 () = 2
2
Q2 sin + p1 cos Q1 sin + p1 2
2 cos
If x1 () and x2 () are the solutions of x0i = Gi (), i = 1, 2, we have x1 () x()

x2 (). The kth zero of the comparison equations are at = k so we have
Z k Z k
1 1
d 2 1 x k a d 2 .
0 Q 2 sin + p 1 cos2
0 Q 1 sin + p1 2
2 cos
But if A > 0 and B > 0 we have

Z k Z /2
1 1 k
d 2 2
= 2k d 2 2
= ,
0 A sin + B cos 0 A sin + B cos AB
and hence
p1 xk a p2
r r
.
q2 + w2 k q1 + w1
For the nth eigenfunction the nth zero is at x = b, so
2 2
ba p2 ba
= q1 + n w1 p2
n q1 + w1 n
and 2 2
ba p1 ba
= q 2 + n w2 p 1
n q2 + w2 n
and hence 2 2
p1 ba q2 p2 ba q1
n .
w2 n w2 w1 n w1

If = the equation becomes
1 d
= 1 x2 that is 0 = 1 x2 .
dx
Substituting the series into this gives
00 + 2 01 + 3 02 + = 1 x 0 + 1 + 2 2 +

= 1 x20 2x0 1 2 x 21 + 20 2 + O(3 ),

and this rearranges to

1 x20 + 00 + 2x0 1 + 2 01 + x 21 + 20 1 = O(3 ).
Equating the coefficients of k , k = 0, 1, , to zero gives

1 00 1 01 + x21 7
0 = , 1 = = 2, 2 = = ,
x 2x0 4x 20 32x7/2
which gives the series quoted.

Since 0 (x) = F (, x; ) = (q + w) sin2 + p1 cos2 , if 2 > 1 , F (, x; 2 )
F (, x; 1 ) and the comparison theorem shows that (x, 2 ) (x, 1 ). Setting x = b
gives the required result.
If q(x) q1 , w(x) w1 and p(x) p2 for x (a, b) then
F (, x; ) (q1 + w1 ) sin2 + p1 2
2 cos = G()
and if 0 = G() the comparison theorem gives (x, 2 ) (x, 1 ).

(a) Since Q1/4 v = R cos and Q1/4 v 0 = R sin , we have
v0
Q1/2 v 2 + Q1/2 v 0 2 = R2 and tan = .
Q1/2 v
Hence
1 d v 00 v0 2 v 0 Q0
= 1/2 2 , but v 00 = Qv
cos2 d 1/2
Q v Q v 2Q3/2 v
v0 2 + v2 Q v 0 Q0
= 2 1/2
v Q 2vQ3/2
Q1/2 1 Q0 sin
= 2 and hence
cos 2 Q cos
d 1 Q0
= Q1/2 sin 2.
d 4Q
Also
dR 1 Q0 2 1 Q0 0 2
2R = 2vv 0 Q1/2 + v + 2v 0 00 1/2
v Q v
d 2 Q1/2 2 Q3/2
1 Q0 2
cos2 sin2

= R and hence
2Q
d 1 Q0
ln R = cos 2.
d 4Q
(b) Write Q(, ) = Q0 () + where Q0 () = q/w A(1/A)00 , is independent of , so
d 1 Q00
= ( + Q0 )1/2 sin 2
d 4 + Q0
d Q00
ln R = cos 2.
d 4( + Q0 )
If max(Q0 ) we may expand in powers of 1 ,
1 Q00

d Q0 Q0
= 1+ + 1 + sin 2
d 2 4

= + O(1/2 ),
and
d Q0
ln R = 0 cos 2 + O(2 ).
d 4
Hence an approximation accurate to the lowest order is

() = and R = r,
for some constants and r. Hence
r
v() = 1/4
cos
( + Q0 ())
and
since y(a) = y(b) = 0, we set = /2 to satisfy the condition at x = a ( = 0) and
(b) = n to satisfy the condition at x = b, to obtain the approximate eigenvalue
2 Z b r
n w r n
n = , (b) = dx , with eigenfunction vn () = sin .
(b) a p Q(, n )1/4 (b)

The functional is $S[y] = \displaystyle\int_0^1 dx\,\left(y'^2 - xy^2\right)$, $y(0) = 1$, $y(1) = 0$. The trial function
$y = 1 - ax - (1 - a)x^2$ satisfies the boundary conditions, so we need the integrals
$$S_1 = \int_0^1 dx\,\left(a + 2(1 - a)x\right)^2 = \int_0^1 dx\,\left(a^2 + 4a(1 - a)x + 4(1 - a)^2x^2\right) = \frac43 - \frac23a + \frac13a^2,$$
$$S_2 = \int_0^1 dx\,x\left(1 - 2ax + (a^2 + 2a - 2)x^2 + 2a(1 - a)x^3 + (1 - a)^2x^4\right) = \frac16 - \frac{1}{10}a + \frac{1}{60}a^2,$$
so that
$$S(a) = S_1 - S_2 = \frac{19}{60}a^2 - \frac{17}{30}a + \frac76 \quad\text{and}\quad S'(a) = \frac{19}{30}a - \frac{17}{30}.$$
The stationary point is at $a = 17/19$ and hence the approximate solution is
$$z = 1 - \frac{17}{19}x - \frac{2}{19}x^2.$$
In the interval [0, 1] the largest difference between this approximation and the numerically
generated solution is 0.0012.
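
The coefficients of $S(a) = \frac{19}{60}a^2 - \frac{17}{30}a + \frac76$ and the stationary value $a = 17/19$ can be confirmed with exact rational arithmetic. The sketch below represents polynomials in $x$ as coefficient lists and uses Python's `fractions.Fraction`; the helper names are illustrative, and the quadratic in $a$ is recovered from three exact samples rather than by symbolic algebra.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def integrate01(p):
    """Exact integral over [0, 1] of a polynomial given by its coefficients."""
    return sum((c / (k + 1) for k, c in enumerate(p)), Fraction(0))


def S(a):
    """S[y] = int_0^1 (y'^2 - x y^2) dx for y = 1 - a x - (1 - a) x^2."""
    a = Fraction(a)
    y = [Fraction(1), -a, -(1 - a)]
    dy = [-a, -2 * (1 - a)]
    x = [Fraction(0), Fraction(1)]
    return integrate01(poly_mul(dy, dy)) - integrate01(poly_mul(x, poly_mul(y, y)))


# S(a) is a quadratic c2 a^2 + c1 a + c0; three exact samples determine it.
s0, s_plus, s_minus = S(0), S(1), S(-1)
c0 = s0
c1 = (s_plus - s_minus) / 2
c2 = (s_plus + s_minus) / 2 - s0
a_star = -c1 / (2 * c2)  # stationary point of the quadratic

if __name__ == "__main__":
    print(c2, c1, c0, a_star)
```

Working with exact fractions avoids any doubt about rounding when comparing with the quoted values.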

(a) The Euler-Lagrange of the functional is y 00 + y 3 = 0, y(0) = 0, and the natural
boundary condition at x = X is, equation 9.7 (page 344), y 0 (X) = 0.
x a x
(b) The trial function y(x) = a sin , y 0 (x) = cos , satisfies both bound-
2X 2X 2X
arty conditions. Substituting it into the functional gives
a 2 Z X x a4 Z X x
S(a) = dx cos2 dx sin4
2X 0 2X 4 0 2X
which becomes
2X 1 a 2 3 4
S(a) = a
2 2X 64
p
so S 0 (a) = 0 when aX = 4/3, giving the approximate solution
r
1 4 x
y= sin .
X 3 2X

Multiply equation 12.78 by y and integrate,
Z 1 Z 1
2 d dy 2
dx y = dx, y p qy
0 0 dx dx
1 Z 1 2 !
dy dy
= py + dx p qy 2 = S[y],
dx 0 0 dx
since y(0) = y(1) = 0 the boundary term vanishes and C[y] = S[y]. Putting y = y n
gives the result.

We have
S[zn ] = S[yn + u] = S[yn ] + S[yn , u] + O(2 )
But the Gateaux differential S is
Z 1
S[yn , u] = dx ((pyn0 )0 + qyn ) u
0
Z 1
= n dx yn u from the Euler-Lagrange equation.
0
But both yn and zn satisfy the constraint,

Z 1 Z 1
1 = C[yn + u] = C[yn ] + 2 dx yn u + O(2 ) = dx yn u = O(),
0 0
from which it follows that S[zn ] = n + O(2 ).

(a) In this example $p(x) = w(x) = 1$ and $q(x) = x^2$, and a simple trial function is
$z = a\sin\pi x$ having only one free variable, which is determined by the constraint,
$$C[z] = a^2\int_0^1 dx\,\sin^2\pi x = \frac12a^2 = 1.$$
The functional is
$$S(a) = \int_0^1 dx\,\left(pz'^2 - qz^2\right) = a^2\pi^2\int_0^1 dx\,\cos^2\pi x - a^2\int_0^1 dx\,x^2\sin^2\pi x.$$
But $\int_0^1 dx\,x^2\sin^2\pi x = \frac12\int_0^1 dx\,x^2(1 - \cos 2\pi x)$ and
$$\int_0^1 dx\,x^2\cos 2\pi x = \left[\frac{x^2}{2\pi}\sin 2\pi x\right]_0^1 - \frac{1}{\pi}\int_0^1 dx\,x\sin 2\pi x
= \frac{1}{2\pi^2}\Big[x\cos 2\pi x\Big]_0^1 - \frac{1}{2\pi^2}\int_0^1 dx\,\cos 2\pi x = \frac{1}{2\pi^2}.$$
Hence $\lambda_1 \leq S(a) = \pi^2 - \dfrac13 + \dfrac{1}{2\pi^2} \simeq 9.587$.
(b) For the trial function $z = ax(1 - x)$, the constraint gives
$$C[z] = a^2\int_0^1 dx\,x^2(1 - x)^2 = a^2\left(\frac13 - \frac24 + \frac15\right) = \frac{a^2}{30} = 1.$$
The functional therefore has the value
$$S[z] = 30\int_0^1 dx\,(1 - 2x)^2 - 30\int_0^1 dx\,x^4(1 - x)^2 = 10 - 30\left(\frac15 - \frac13 + \frac17\right) = \frac{68}{7} \simeq 9.714.$$
Hence $\lambda_1 \leq 68/7 \simeq 9.714$. The first bound is smaller, so is the better approximation.
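
The two upper bounds, $\pi^2 - \frac13 + \frac{1}{2\pi^2} \simeq 9.587$ for the sine trial function and $68/7 \simeq 9.714$ for the polynomial, can be reproduced by evaluating the Rayleigh quotient $S[z]/C[z]$ numerically, which removes the need to normalise the trial function first. This is a minimal sketch assuming $p = w = 1$ and $q = x^2$ on $[0, 1]$ as in the solution; `simpson` and `rayleigh_quotient` are illustrative helper names.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3


def rayleigh_quotient(z, dz, q):
    """S[z]/C[z] with S = int (z'^2 - q z^2) dx and C = int z^2 dx on [0, 1]."""
    num = simpson(lambda x: dz(x) ** 2 - q(x) * z(x) ** 2, 0.0, 1.0)
    den = simpson(lambda x: z(x) ** 2, 0.0, 1.0)
    return num / den


def q(x):
    return x * x


bound_sine = rayleigh_quotient(lambda x: math.sin(math.pi * x),
                               lambda x: math.pi * math.cos(math.pi * x), q)
bound_poly = rayleigh_quotient(lambda x: x * (1 - x),
                               lambda x: 1 - 2 * x, q)

if __name__ == "__main__":
    print(bound_sine, bound_poly)
```

Since both numbers are upper bounds on the lowest eigenvalue, the smaller one is the sharper estimate.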

In this example $q(x) = x^a$, so for $x \in [0, 1]$, $q_1 = 0$ and $q_2 = 1$; since $p = w = 1$, the
inequality of exercise 12.32 is $(n\pi)^2 - 1 \leq \lambda_n \leq (n\pi)^2$. This shows that for large $n$, $\lambda_n$
is relatively close to $(n\pi)^2$.

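In this case $p(x) = w(x) = 1$ and $q(x) = x^{2p}$. With the trial function $z = a(1 - x^2)$ the
constraint gives
$$C[z] = a^2\int_{-1}^1 dx\,(1 - x^2)^2 = 2a^2\int_0^1 dx\,(1 - 2x^2 + x^4) = \frac{16}{15}a^2.$$
The functional is
$$\begin{aligned}
S[a] &= \int_{-1}^1 dx\,(2ax)^2 - a^2\int_{-1}^1 dx\,x^{2p}(1 - x^2)^2 \\
&= 8a^2\int_0^1 dx\,x^2 - 2a^2\int_0^1 dx\,x^{2p}(1 - 2x^2 + x^4) \\
&= a^2\left[\frac83 - 2\left(\frac{1}{2p+1} - \frac{2}{2p+3} + \frac{1}{2p+5}\right)\right].
\end{aligned}$$
Hence, setting $C[z] = 1$,
$$\lambda_1 \leq S(a) = \frac52\left(1 - \frac{6}{(2p+1)(2p+3)(2p+5)}\right).$$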
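
The closed form $\lambda_1 \leq \frac52\left(1 - \frac{6}{(2p+1)(2p+3)(2p+5)}\right)$ can be compared with a direct numerical evaluation of $S[z]/C[z]$ for the trial function $z = 1 - x^2$ on $[-1, 1]$. The sketch below assumes non-negative integer values of the exponent $p$ (so that $x^{2p}$ is defined for negative $x$ without complications); the function names are illustrative.

```python
def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3


def bound_closed_form(p):
    """Closed-form upper bound quoted in the solution."""
    return 2.5 * (1 - 6 / ((2 * p + 1) * (2 * p + 3) * (2 * p + 5)))


def bound_quadrature(p):
    """S[z]/C[z] for z = 1 - x^2 with q(x) = x^(2p), p a non-negative integer."""
    num = simpson(lambda x: (2 * x) ** 2 - x ** (2 * p) * (1 - x * x) ** 2, -1.0, 1.0)
    den = simpson(lambda x: (1 - x * x) ** 2, -1.0, 1.0)
    return num / den


if __name__ == "__main__":
    for p in range(4):
        print(p, bound_closed_form(p), bound_quadrature(p))
```

As $p$ grows the correction term shrinks and the bound approaches $5/2$, the value for $q = 0$ with this trial function.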

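With the one parameter trial function $z = a_1\sin(\pi x/2)$ the constraint gives
$$a_1^2\int_0^1 dx\,\sin^2\frac{\pi x}{2} = \frac12a_1^2 = 1.$$
The functional is
$$S(a_1) = a_1^2\int_0^1 dx\,\left(\frac{\pi^2}{4}\cos^2\frac{\pi x}{2} - x\sin^2\frac{\pi x}{2}\right) = a_1^2\left(\frac{\pi^2}{8} - \frac14 - \frac{1}{\pi^2}\right).$$
Hence
$$\lambda_1 \simeq S(a_1) = \frac{\pi^2}{4} - \frac12 - \frac{2}{\pi^2} = 1.76476.$$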
Note that to 10 significant figures the value of the first eigenvalue is 1.762682254 it
can be shown to be the first zero of Ai(u)Bi0 (1 u) Bi(u)Ai0 (1 u), where Ai
and Bi are Airy functions.
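With the two parameter trial function $z = a_1\sin(\pi x/2) + a_2\sin(3\pi x/2)$ the
constraint gives
$$a_1^2\int_0^1 dx\,\sin^2\frac{\pi x}{2} + a_2^2\int_0^1 dx\,\sin^2\frac{3\pi x}{2} = \frac12a_1^2 + \frac12a_2^2 = 1.$$
The functional is
$$\begin{aligned}
S(a) &= \int_0^1 dx\,\left(\frac{\pi}{2}a_1\cos\frac{\pi x}{2} + \frac{3\pi}{2}a_2\cos\frac{3\pi x}{2}\right)^2
- \int_0^1 dx\,x\left(a_1\sin\frac{\pi x}{2} + a_2\sin\frac{3\pi x}{2}\right)^2 \\
&= \frac{\pi^2}{8}\left(a_1^2 + 9a_2^2\right) - \left[\left(\frac14 + \frac{1}{\pi^2}\right)a_1^2 + \left(\frac14 + \frac{1}{9\pi^2}\right)a_2^2 - \frac{2}{\pi^2}a_1a_2\right] \\
&= \left(\frac{\pi^2}{8} - \frac14 - \frac{1}{\pi^2}\right)a_1^2 + \left(\frac{9\pi^2}{8} - \frac14 - \frac{1}{9\pi^2}\right)a_2^2 + \frac{2}{\pi^2}a_1a_2,
\end{aligned}$$
where we have used the integrals quoted in the question. Thus the equation is
$$\begin{pmatrix} \dfrac{\pi^2}{4} - \dfrac12 - \dfrac{2}{\pi^2} & \dfrac{2}{\pi^2} \\[8pt] \dfrac{2}{\pi^2} & \dfrac{9\pi^2}{4} - \dfrac12 - \dfrac{2}{9\pi^2} \end{pmatrix}\mathbf{a} = \lambda\,\mathbf{a},$$
and the eigenvalues are given by the quadratic equation $\lambda^2 - 23.4489\lambda + 38.2261 = 0$,
the smallest root of which is 1.7627.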
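
The two-parameter calculation can be reproduced numerically. The sketch below assumes the functional $S[z] = \int_0^1 dx\,(z'^2 - xz^2)$ with constraint $\int_0^1 dx\,z^2 = 1$ and the basis functions $\sin(\pi x/2)$ and $\sin(3\pi x/2)$, as in the solution; the matrix is built by quadrature rather than from the quoted integrals, and all function names are illustrative.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3


def basis(k):
    """k-th basis function sin((2k - 1) pi x / 2) and its derivative."""
    w = (2 * k - 1) * math.pi / 2
    return (lambda x: math.sin(w * x)), (lambda x: w * math.cos(w * x))


def ritz_matrix():
    """Matrix M with M a = lambda a. The factor 2 arises because each
    basis function has squared norm 1/2 on [0, 1]."""
    phi = [basis(1), basis(2)]
    m = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            fi, dfi = phi[i]
            fj, dfj = phi[j]
            m[i][j] = 2.0 * simpson(
                lambda x: dfi(x) * dfj(x) - x * fi(x) * fj(x), 0.0, 1.0)
    return m


def symmetric_2x2_eigenvalues(m):
    """Roots of lambda^2 - tr(M) lambda + det(M) = 0, smallest first."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(tr * tr - 4.0 * det)
    return (tr - disc) / 2.0, (tr + disc) / 2.0


if __name__ == "__main__":
    M = ritz_matrix()
    print(M)
    print(symmetric_2x2_eigenvalues(M))
```

The trace and determinant of the matrix are the coefficients of the quadratic quoted in the solution, and the smaller root improves on the one-parameter estimate 1.76476.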

(a) If = 0 the general solution is y = A+Bx; the boundary conditions give A = B = 0.
If < 0, put = 2 , ( > 0), so the general solution is y = A cosh x + B sinh x;
the boundary condition at x = 0 gives B = 0, and that at x = 1 gives A cosh = 0, so
A = 0.
If > 0, put = 2 , ( > 0), so the general solution is y = A cos x + B sin x; the
boundary condition at x = 0 gives B = 0, and that at x = 1 gives A cos = 0, so
= (n 1/2), n = 1, 2, , giving n = (n 1/2)2 2 .
(b) For this problem the functional and constraint are,
Z 1 x Z 1
S[y] = dx y 0 2 by 2 sin and C[y] = dx y 2 = 1.
0 2 0
Taking the lowest eigenfunction of the simpler problem, z = a cos(x/2), to be the trial
function the constraint gives
Z 1 x a2
a2 dx cos2 = = 1,
0 2 2
and the functional becomes

Z 1 x
S(a) = dx y 0 2 by 2 sin
0 2
1 2 2 1
Z x Z 1 x x
= a dx sin2 a2 b dx sin cos2
4 0 2 0 2 2
2 2 2b 2 2 4b
= a a and hence 1 ' .
8 3 4 3
(c) With the trial function z = a cos(n 1/2)x, the constraint gives a2 = 2 and the
functional becomes
Z 1 Z 1
S(a) = a2 2 (n 1/2)2 dx sin2 (n 1/2)x a2 b dx sin(x/2) cos2 (n 1/2)x
0 0
But
1 1
sin(x/2) cos2 (n 1/2)x = sin(x/2) + sin(2n 1/2)x sin(2n 3/2)x ,
2 4
so
1
1 1 1 1
Z
dx sin(x/2) cos2 (n 1/2)x = +
0 4 2n 1/2 2n 3/2
1 1
=
(4n 1)(4n 3)
and hence
2
a2 b

1 2 2 1 1
S(a) = a n 1 , and since a2 = 2
2 2 (4n 1)(4n 3)

2b 1 1
n ' n2 2 1 , n=n .
16n2 1 2

(a) Substituting the series into the equation gives
\[
\sum_{k=1}^{n}\left(-a_k k^2\pi^2 + x a_k\right)\sin k\pi x = -\lambda\sum_{k=1}^{n} a_k\sin k\pi x.
\]
Now multiply by $\sin p\pi x$ and integrate to obtain
\[
-p^2\pi^2 a_p + 2\sum_{k=1}^{n} a_k\int_0^1 dx\,x\sin k\pi x\,\sin p\pi x = -\lambda a_p, \quad p = 1, 2, \ldots, n.
\]
These $n$ linear equations for $a$ can be written in the matrix form $Ma = \lambda a$ where $M_{ij}$
is defined in the question.
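(b) If $n = 1$ we have
\[
M_{11} = \pi^2 - 2\int_0^1 dx\,x\sin^2\pi x = \pi^2 - \int_0^1 dx\,x(1 - \cos 2\pi x), \quad\text{but since } \int_0^1 dx\,x\cos 2\pi x = 0,
\]
this gives $\lambda_1 \simeq M_{11} = \pi^2 - 1/2$.

If $n = 2$ the other matrix elements are
\[
M_{12} = -2\int_0^1 dx\,x\sin\pi x\,\sin 2\pi x = -\int_0^1 dx\,x\left(\cos\pi x - \cos 3\pi x\right)
\]
and since
\[
\int_0^1 dx\,x\cos k\pi x = \frac{(-1)^k - 1}{(k\pi)^2}, \quad M_{12} = \frac{16}{9\pi^2}
\]
and
\[
M_{22} = 4\pi^2 - \int_0^1 dx\,x(1 - \cos 4\pi x) = 4\pi^2 - \frac{1}{2},
\]
giving the eigenvalue problem
\[
\begin{pmatrix}
\pi^2 - \dfrac{1}{2} & \dfrac{16}{9\pi^2} \\[2mm]
\dfrac{16}{9\pi^2} & 4\pi^2 - \dfrac{1}{2}
\end{pmatrix} a = \lambda a,
\]
which is just equation 12.90.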
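The matrix elements quoted in part (b) can be reproduced by numerical quadrature. The following sketch assumes the definition $M_{ij} = \pi^2 i^2\delta_{ij} - 2\int_0^1 x\sin(i\pi x)\sin(j\pi x)\,dx$, inferred from the working above since the question's own definition is not reproduced here; the helper names are illustrative.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def matrix_element(i, j):
    """Assumed M_ij = pi^2 i^2 delta_ij - 2 int_0^1 x sin(i pi x) sin(j pi x) dx."""
    integral = simpson(
        lambda x: x * math.sin(i * math.pi * x) * math.sin(j * math.pi * x), 0.0, 1.0
    )
    diagonal = (i * math.pi) ** 2 if i == j else 0.0
    return diagonal - 2 * integral

m11 = matrix_element(1, 1)
m12 = matrix_element(1, 2)
m22 = matrix_element(2, 2)

# Smallest eigenvalue of the symmetric 2x2 matrix [[m11, m12], [m12, m22]].
trace = m11 + m22
det = m11 * m22 - m12 ** 2
lowest = (trace - math.sqrt(trace ** 2 - 4 * det)) / 2
print(m11, m12, m22, lowest)
```

Because the one-term estimate $M_{11}$ is the top-left entry of the $2\times 2$ matrix, the smallest eigenvalue of the larger matrix cannot exceed it, which is the usual behaviour of the Rayleigh-Ritz method.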
(c) If p = 1, q = x and k = sin kx we have

Z 1 Z 1
2
(HS)ij = 2 ij dx cos ix cos jx 2 dx x sin ix sin jx
0 0
Z 1
= 2 ijij 2 dx x sin ix sin jx = Mij .
0

(a) If $\lambda$ is the Lagrange multiplier, the Gateaux differential is
b Z
S = 2p(a)y(a)h(a) + 2p(b)y(b)h(b) + 2 dx py 0 h0 qyh wyh
a

= 2h(a)p(a) y(a) + y (a) + 2h(b)p(b) y(b) + y 0 (b)
0
Z b
2 dx (py 0 )0 + (q + w)y h.
a
Using the class of variations with $h(a) = h(b) = 0$, we see that the Euler-Lagrange
equation,
\[
\frac{d}{dx}\left(p\frac{dy}{dx}\right) + (q + \lambda w)y = 0,
\]
must be satisfied by a stationary path. Further, since $\Delta S = 0$ for all admissible paths
the given boundary conditions must also be satisfied.
(b) Since $-\lambda wy = qy + (py')'$ we have
\[
-\lambda_k\int_a^b dx\,w y_k^2 = \int_a^b dx\,q y_k^2 + \int_a^b dx\,y_k\left(p y_k'\right)'
 = \Big[y_k\,p\,y_k'\Big]_a^b - \int_a^b dx\,\left(p y_k'^2 - q y_k^2\right).
\]
Using the constraint condition and the boundary conditions to replace y 0 (a) with y(a)
and y 0 (b) with y(b), this becomes
Z b
k = dx pyk0 2 qyk2 + p(b)y(b)2 p(a)y(a)2 = S[yk ].
a

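The equation is $(1 - x^2)y'' - 2xy' + \lambda y = 0$. Put $y = uv$ to give
\[
(1 - x^2)vu'' + \left(2(1 - x^2)v' - 2xv\right)u' + \left(\lambda v + (1 - x^2)v'' - 2xv'\right)u = 0
\]
so if $v'/v = x/(1 - x^2)$, that is $v = 1/\sqrt{1 - x^2}$, the equation becomes
\[
\frac{d^2u}{dx^2} + \left(\lambda + \frac{1}{1 - x^2}\right)\frac{u}{1 - x^2} = 0.
\]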

The chain rule gives
dy dy dt dy p d2 y q 0 (x) dy d2 y
= = q(x) and 2
= + q(x) 2 .
dx dt dx dt dx 2 q dt dt
d2 y q 0 (x) dy dy
Hence the equation becomes q 2
+ + p1 q + qy = 0, which is the required
dt 2 q dt dt
result

If = f (u)g(v) the equation can be written in the form
f 00
00
2 g 2
+ 2(k) cosh 2u + 2(k) cos 2v = 0.
f g
The terms in curly braces are functions of u and v only, so
f 00 g 00
+ 2(k)2 cosh 2u = a and 2(k)2 cos 2v = a
f g
where $a$ is a constant. Hence the quoted equations. Since the points with coordinates
$(u, v)$ and $(u, v + 2\pi)$ are physically identical, $g(v)$ must be $2\pi$-periodic, $g(v + 2\pi) = g(v)$
for all $v$.

Increasing u by 2 increases by 2, so we define u() = + P () where P () is an
odd 2-periodic function of . Thus we can write u() in the form

X
u() = + ak sin k,
k=1
where
1 1 d
Z Z
ak = d (u() ) sin k = du sin u sin k (u sin u)
du
1
Z
1 d
Z
= du sin u(1 cos u) sin k (u sin u) = du sin u cos k (u sin u)
k du
h Z
1 i
= sin u cos k (u sin u) du cos u cos k (u sin u)
k
Z

= du cos u cos k (u sin u)
k
Z
h i
= du cos (k + 1)u k sin u + cos (k 1)u k sin u
2k
2
= (Jk+1 (k) + Jk1 (k)) = Jk (k).
k k

We have, by differentiating under the integral sign,
\[
J_n'(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} dt\,(-i\sin t)\,e^{i(nt - x\sin t)}
\quad\text{and}\quad
J_n''(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} dt\,(-\sin^2 t)\,e^{i(nt - x\sin t)},
\]
so that the left-hand side of Bessel's equation becomes
\[
\frac{1}{2\pi}\int_{-\pi}^{\pi} dt\,\left(-\sin^2 t - \frac{i}{x}\sin t + 1 - \frac{n^2}{x^2}\right)e^{i(nt - x\sin t)}.
\]
If the integrand of this integral can be expressed as the derivative of a periodic function,
the integral is zero and we have proved the required result. Consider the integral
\[
\frac{1}{2\pi}\int_{-\pi}^{\pi} dt\,\frac{d}{dt}\left(g(t)\,e^{i(nt - x\sin t)}\right).
\]
By expanding this and comparing with the previous integrand we obtain the differential
equation,
\[
g' + i(n - x\cos t)g = -\sin^2 t - \frac{i}{x}\sin t + 1 - \frac{n^2}{x^2}.
\]
Consider a solution $g = A + B\cos t$; by substituting this in the left-hand side we see
that if $B = i/x$ and $A = in/x^2$ a solution is obtained. This solution is periodic, hence
the result.
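The claim that $g = A + B\cos t$ with $B = i/x$ and $A = in/x^2$ solves the differential equation for $g$ can be checked pointwise. The sketch below evaluates both sides at a few arbitrarily chosen sample values of $(n, x, t)$; the function names are illustrative.

```python
import math

def lhs(n, x, t):
    """g'(t) + i (n - x cos t) g(t) for g = A + B cos t, A = i n / x^2, B = i / x."""
    a = 1j * n / x ** 2
    b = 1j / x
    g = a + b * math.cos(t)
    dg = -b * math.sin(t)
    return dg + 1j * (n - x * math.cos(t)) * g

def rhs(n, x, t):
    """-sin^2 t - (i/x) sin t + 1 - n^2/x^2."""
    return -math.sin(t) ** 2 - 1j * math.sin(t) / x + 1 - n ** 2 / x ** 2

# Sample points chosen arbitrarily for illustration.
samples = [(0, 1.3, 0.4), (2, 0.7, 2.1), (5, 3.0, -1.0)]
for n, x, t in samples:
    print(n, x, t, abs(lhs(n, x, t) - rhs(n, x, t)))
```

Each printed difference should be at the level of floating-point rounding error.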

If = 0 the general solution is y = A+Bx: the boundary condition at x = 0 gives A = 0
and the boundary condition at x = gives B = B. Hence there is no nontrivial
solution if = 0 (except possibly if = , a case we return to later).
<0
If < 0, put = 2 , ( > 0), the general solution is y = A cosh x + B sinh x: the
boundary condition at x = 0 gives A = 0 and the boundary condition at x = gives
tanh = , > 0.
If > there are no real solutions of this equation the gradient of the left and right
hand sides at = 0 are, respectively, and , so if > , > tanh for > 0.
If < , the same reasoning shows that there is one real positive solution which we
denote by 0 .
>0
If > 0, put = 2 , ( > 0), the general solution is y = A cos x + B sin x: the
boundary condition at x = 0 gives A = 0 and the boundary condition at x = gives
tan = , > 0.
If > , the first positive solution, 0 is in (0, /2) and the solution k , is in the
interval (k, (k + 1/2)), k = 0, 1, .
If < , the first positive solution, 1 is in (/2, 3/2) with kk < (k + 1/2),
k = 1, 2, .
Thus we have the following,
if > the eigenvalues are k = k2 with k < < (k + 1/2), k = 0, 1, :
if = the function y = Bx is a solution for = 0 and all B:
if < then 0 = 02 and k = k2 with k < < (k + 1/2), k = 1, 2, .

Because the second derivative of each function is a linear combination of the function
and its derivative, the third column is a linear combination of the first two and the
determinant is zero.

Use the results found in exercise 12.21 (page 499).
(a) The Wronskian is $W = \sinh x\cos x - \cosh x\sin x$ and then
\[
p_0 = \frac{f'g'' - g'f''}{W(f, g; x)} = \frac{\cosh x\sin x + \sinh x\cos x}{\cosh x\sin x - \sinh x\cos x},
\]
\[
p_1 = -\frac{fg'' - gf''}{W(f, g; x)} = \frac{2\sinh x\sin x}{\sinh x\cos x - \cosh x\sin x},
\]
so that the equation is
\[
(\sinh x\cos x - \cosh x\sin x)\,y'' + 2\sinh x\sin x\,y' - (\cosh x\sin x + \sinh x\cos x)\,y = 0.
\]
(b) The Wronskian is W = 4/ sin 2x and

cos 2x
p0 = 4/ sin2 2x, p1 = 2 giving the equation y 00 sin2 2x + y 0 sin 4x 4y = 0.
sin 2x
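The equation found in part (a) can be tested by substituting the two functions it was built from. The Wronskian $\sinh x\cos x - \cosh x\sin x$ suggests that these are $f = \sinh x$ and $g = \sin x$; that identification is an inference, since the exercise statement is not reproduced here. The sketch evaluates the residual of the equation at a few points.

```python
import math

def residual(y, dy, d2y, x):
    """Left-hand side of W y'' + 2 sinh x sin x y' - (cosh x sin x + sinh x cos x) y."""
    w = math.sinh(x) * math.cos(x) - math.cosh(x) * math.sin(x)
    return (
        w * d2y(x)
        + 2 * math.sinh(x) * math.sin(x) * dy(x)
        - (math.cosh(x) * math.sin(x) + math.sinh(x) * math.cos(x)) * y(x)
    )

# Candidate solutions with their exact first and second derivatives.
candidates = {
    "sinh": (math.sinh, math.cosh, math.sinh),
    "sin": (math.sin, math.cos, lambda x: -math.sin(x)),
}

for name, (y, dy, d2y) in candidates.items():
    for x in (0.5, 1.0, 2.0):
        print(name, x, residual(y, dy, d2y, x))
```

Both residuals should vanish up to rounding error.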

If $g = 1/f$ then the Wronskian is $W = -2f'/f$ and
\[
Wp_0 = 2\left(\frac{f'}{f}\right)^3 \quad\text{and}\quad Wp_1 = 2\left(\frac{f'}{f}\right)'
\]
so that the equation with solutions $f$ and $g = 1/f$ is
\[
y'' - \frac{u'}{u}\,y' - u^2 y = 0 \quad\text{where } u = f'/f.
\]

There are three determinants formed by differentiating $W(x)$; those obtained by
differentiating the first and second row are zero, so
\[
\frac{dW}{dx} =
\begin{vmatrix} f & g & h \\ f' & g' & h' \\ f''' & g''' & h''' \end{vmatrix}
=
\begin{vmatrix} f & g & h \\ f' & g' & h' \\ -p_2f'' - p_1f' - p_0f & -p_2g'' - p_1g' - p_0g & -p_2h'' - p_1h' - p_0h \end{vmatrix}.
\]
Multiply the first and second rows by $p_0$ and $p_1$, respectively, and add to the third row
to obtain
\[
\frac{dW}{dx} =
\begin{vmatrix} f & g & h \\ f' & g' & h' \\ -p_2f'' & -p_2g'' & -p_2h'' \end{vmatrix}
= -p_2(x)W(x).
\]
Hence
\[
W(x) = W(a)\exp\left(-\int_a^x dx\,p_2(x)\right).
\]
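As a concrete check of this formula, consider the constant-coefficient equation $y''' - 6y'' + 11y' - 6y = 0$, chosen here purely for illustration, with solutions $e^x$, $e^{2x}$ and $e^{3x}$ and $p_2 = -6$. The sketch below compares the Wronskian computed directly from the determinant with $W(0)e^{6x}$.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def wronskian(x):
    """W(x) for f = e^x, g = e^{2x}, h = e^{3x}; row k holds the k-th derivatives."""
    rates = (1.0, 2.0, 3.0)
    rows = [[r ** k * math.exp(r * x) for r in rates] for k in range(3)]
    return det3(rows)

P2 = -6.0  # coefficient of y'' in the illustrative equation y''' - 6y'' + 11y' - 6y = 0

def abel(x, a=0.0):
    """W(a) exp(-int_a^x p2 dx) for constant p2."""
    return wronskian(a) * math.exp(-P2 * (x - a))

for x in (0.0, 0.3, 1.0):
    print(x, wronskian(x), abel(x))
```

Here $W(0)$ is the Vandermonde determinant of $(1, 2, 3)$, which equals 2.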

In this example p0 /p = tan x, so that p(x) = 1/ cos x and the self-adjoint form is

d 1 dy
cos x = 0.
dx cos x dx

Since $x/(1 + x) \ge 1/2$ for $x \ge 1$, put $Q_2 = 1/2$ and $Q_1 = x/(1 + x)$ in the comparison
theorem 12.2, to see that the given equation has at least one zero in $\left(n\pi\sqrt{2}, (n+1)\pi\sqrt{2}\right)$
for every $n = 1, 2, \ldots$.

If $\lambda < 0$ there are no periodic solutions. If $\lambda = 0$ the general solution is $y = A + Bx$,
which is periodic if $B = 0$. If $\lambda > 0$, put $\lambda = \omega^2$, $\omega > 0$, to give the solutions $\cos\omega x$
and $\sin\omega x$, which are $2\pi$-periodic if $\omega = n$, $n = 1, 2, \ldots$.
Hence $2\pi$-periodic solutions are
$y_n = \{\cos nx, \sin nx\}$, $n = 0, 1, \ldots$, with $\lambda_n = n^2$.
Alternatively the functions $z_n = e^{inx}$, $n = 0, \pm 1, \ldots$, satisfy the equation and are
$2\pi$-periodic, with $\lambda_n = n^2$. These are linear combinations of the first set of functions.

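(a) If $x = e^t$ then $t \in (0, \infty)$ if $x \in (1, \infty)$, and
\[
x\frac{dy}{dx} = \frac{dy}{dt} \quad\text{and}\quad x\frac{d}{dx}\left(x\frac{dy}{dx}\right) = \frac{d^2y}{dt^2},
\]
and the equation becomes $y'' - y' + By = 0$, where the primes now denote differentiation
with respect to $t$. Putting $y = e^{pt}$ gives $p^2 - p + B = 0$ so
that $2p = 1 \pm \sqrt{1 - 4B}$. If $4B > 1$ this gives the general solution
\[
y = \sqrt{x}\left[A\cos(\omega\ln x) + C\sin(\omega\ln x)\right], \quad \omega = \tfrac{1}{2}\sqrt{4B - 1},
\]
where $A$ and $C$ are arbitrary constants, which has infinitely many zeros for $x > 1$. If $4B < 1$ the general solution is
\[
y = \sqrt{x}\left(Ax^{q} + Cx^{-q}\right), \quad q = \tfrac{1}{2}\sqrt{1 - 4B},
\]
which has at most one zero.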

(b) If q(x) > 1/4 use the equation defined in part (a) as a comparison equation.
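The two families of solutions in part (a) can be tested numerically. The sketch below assumes the equation in question is $x^2y'' + By = 0$, which is what the transformed equation $y'' - y' + By = 0$ implies, and estimates $y''$ with a central difference; the values of $B$ and the sample points are illustrative.

```python
import math

def oscillatory(x, b):
    """sqrt(x) cos(omega ln x) with omega = sqrt(4b - 1)/2, valid for 4b > 1."""
    omega = 0.5 * math.sqrt(4 * b - 1)
    return math.sqrt(x) * math.cos(omega * math.log(x))

def monotone(x, b):
    """sqrt(x) x^q with q = sqrt(1 - 4b)/2, valid for 4b < 1."""
    q = 0.5 * math.sqrt(1 - 4 * b)
    return math.sqrt(x) * x ** q

def residual(y, x, b, h=1e-4):
    """x^2 y'' + b y, with y'' estimated by a central difference of step h."""
    d2y = (y(x + h, b) - 2 * y(x, b) + y(x - h, b)) / h ** 2
    return x ** 2 * d2y + b * y(x, b)

print(residual(oscillatory, 2.0, 5.0))  # case 4B > 1
print(residual(monotone, 3.0, 0.1))     # case 4B < 1
```

Both residuals should be small, limited only by the accuracy of the finite-difference estimate.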

(a) Since (xy 0 )0 = xy 00 + y 0 the first result follows directly. Since p(x) = x, the system
is regular provided the interval does not include the origin.
(b) With p = x, q = /x the normal form, exercise 12.1 (page 477) is
d2 u + 14

1
I(x) = 2 (1 + 4) giving + u = 0 with y = u/ x.
4x dx2 x2
Comparing with equation 12.36 (page 494) we see that q and w are continuous only if
x 6= 0, so this system is regular provided the interval does not contain the origin.
(c) Put x = et , so 0 < t < and
du du d2 u d2 u du
x = , x2 2
= 2 +
dx dt dx dt dt
and the equation for u becomes u00 (t) + u0 (t) + ( + 1/4)u = 0. Putting u = ept gives
p2 + p + ( + 1/4) = 0 and hence the general solution is

u = et/2 Ae t + Be t ,

= 2 , > 0,

(Ax + Bx )

y = (A B ln x) , = 0,

i ln x i ln x
= 2 , > 0.

Ae + Be

(i) < 0: the solution is bound at the origin only if B = 0, so y = Ax giving y(0) = 0
and y(1) = A. Hence there are no nontrivial solutions.
(ii) = 0: In the case the bound solutions are y = A: if c 6= 0, the solution is y = c,
with eigenvalue = 0.
(iii) > 0: the solution is not defined at the origin for any A or B, except A = B = 0.
(d) (i) < 0: the boundary conditions give

Aa + Ba = 0
= A = B = 0.
Ab + Bb = 0
(ii) = 0: the boundary conditions give
A B ln a = 0, A B ln b = 0 = A = B = 0.
(iii) > 0: the boundary conditions give
Aei ln a + Aei ln a = 0
= e2i ln(b/a) = 1,
Aei ln b + Aei ln b = 0
n
hence n = n2 , n = , and yn = c sin (n ln(x/a)), for some constant c.
ln(b/a)

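(a) If $v = f/g$ then $v' = (f'g - fg')/g^2$ and
\[
v'' = \frac{f''}{g} - \frac{fg''}{g^2} + \frac{2fg'^2}{g^3} - \frac{2f'g'}{g^2}, \quad\text{but } f'' = -qf,\ g'' = -qg,
\]
\[
\phantom{v''} = -\frac{2g'}{g^3}\left(f'g - fg'\right) = -\frac{2g'}{g}\,v'.
\]
Hence
\[
v''' = -\frac{2g'}{g}\,v'' - 2\left(\frac{g''}{g} - \frac{g'^2}{g^2}\right)v',
\]
\[
\frac{v'''}{v'} = 6\left(\frac{g'}{g}\right)^2 + 2q, \quad\text{hence}\quad \frac{v'''}{v'} - \frac{3}{2}\left(\frac{v''}{v'}\right)^2 = 2q.
\]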
(b) If f and g are linearly independent, so are af + bg and cf + dg, so
af + bg av + b
v= = and S(v) = 2q = S(v).
cf + dg cv + d
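A simple instance of the result in part (a) is $q = k^2$ constant, with $f = \sin kx$ and $g = \cos kx$, so that $v = \tan kx$; this example is chosen here for illustration and is not taken from the exercise. The sketch evaluates $v'''/v' - \frac{3}{2}(v''/v')^2$ from the exact derivatives of $\tan kx$ and compares it with $2q$.

```python
import math

def schwarzian_of_tan(k, x):
    """v'''/v' - (3/2)(v''/v')^2 for v = tan(kx), using exact derivatives."""
    t = math.tan(k * x)
    sec2 = 1 + t * t                          # sec^2(kx)
    d1 = k * sec2                             # v'
    d2 = 2 * k ** 2 * sec2 * t                # v''
    d3 = 2 * k ** 3 * sec2 * (2 * t * t + sec2)  # v'''
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

for k, x in [(1.0, 0.2), (1.5, 0.3), (2.0, -0.4)]:
    print(k, x, schwarzian_of_tan(k, x), 2 * k ** 2)
```

The two printed values on each line should agree, independently of $x$.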
References
Books and articles referred to in the text
Akhiezer N I 1962 The Calculus of Variations, (Blaisdell Publishing Company, trans-
lated from Russian by A H Frink)
Apostol T M 1963 Mathematical Analysis: A Modern Approach to Advanced Calculus,
(Addison-Wesley)
Arnold V I 1973 Ordinary Differential Equations, (The MIT press)
Ashby A, Brittin W E, Love W F and Wyss W 1975 Brachistochrone with Coulomb
Friction, Amer J Physics 43 902-5.
Aughton P 2001 Newton's Apple, (Weidenfeld and Nicolson)
Bernstein S N 1912 Sur les équations du calcul des variations, Ann. Sci. École Norm.
Sup. 29 431-485
Birkhoff G and Rota G-C 1962 Ordinary differential equations (Blaisdell Publishing
Co.)
Brunt, van B 2004 The Calculus of Variations, (Springer)
Courant R and Hilbert D 1937a Methods of Mathematical Physics, Vol 1 (Interscience
Publishers Inc)
Courant R and Hilbert D 1937b Methods of Mathematical Physics, Vol 2 (Interscience
Publishers Inc)
Gelfand I M and Fomin S V 1963 Calculus of Variations, (Prentice Hall, translated
from the Russian by R A Silverman), reprinted 2000 (Dover)
Goldstine H H 1980 A History of the Calculus of Variations from the 17th through the
19th Century, (Springer, New York)
Green G 1838 On the Motion of Waves in a variable Canal of small Depth and Width,
Camb Phil Soc, Vol VI, part III
Ince E L 1956 Ordinary differential equations (Dover)
Isenberg C 1992 The Science of Soap Films and Soap Bubbles, (Dover)
Jeffrey A 1990 Linear Algebra and Ordinary Differential Equations (Blackwell Scientific
Publications)
Kolmogorov A N and Fomin S V 1975 Introductory Real Analysis, (Dover)
Landau L D and Lifshitz E M 1959 Fluid mechanics, (Pergamon)
Lützen J 1990 Joseph Liouville (1809-1882): Master of Pure and Applied Mathematics
(Springer-Verlag)
Prandtl L 1904 Über Flüssigkeitsbewegung bei sehr kleiner Reibung, Verhandlungen des
III. internationalen Mathematiker-Kongresses, Heidelberg, 1904
Rudin W 1976 Principles of Mathematical Analysis, (McGraw-Hill)
Schlichting H 1955 Boundary Layer Theory, (McGraw-Hill, New York)
Smith G E 2000 Fluid Resistance: Why Did Newton Change His Mind? Published in
The Foundations of Newtonian Scholarship, Eds R H Dalitz and M Nauenberg, (World
Scientific)
Sutherland W A 1975 Introduction to Metric and Topological Spaces, (Oxford University
Press)
Troutman J L 1983 Variational Calculus with Elementary Convexity, (Springer-Verlag)
Watson G N 1965 A Treatise on the Theory of Bessel Functions (Cambridge University
Press), first published in 1922.
Whittaker E T and Watson G N 1965 A Course of Modern Analysis, (Cambridge
University Press)
Yoder J G 1988 Unrolling Time, (Cambridge University Press)
Yourgrau W and Mandelstam S 1968 Variational Principles in Dynamics and Quantum
Theory (Pitman)
Books on the Calculus of Variations

The following books have also been used in the preparation of these course notes and
should be consulted for a more detailed study of the subject.
Akhiezer N I 1962 The Calculus of Variations, (Blaisdell Publishing Company, trans-
lated from Russian by A H Frink)
Courant R and Hilbert D 1937a Methods of Mathematical Physics, Vol 1 (Interscience
Publishers Inc)
Courant R and Hilbert D 1937b Methods of Mathematical Physics, Vol 2 (Interscience
Publishers Inc)
Forsyth A R 1926 Calculus of Variations, (Cambridge University Press), reprinted 1960
(Dover)
Fox C 1963 An Introduction to the Calculus of Variations, (Oxford University Press),
reprinted 1987 (Dover)
Gelfand I M and Fomin S V 1963 Calculus of Variations, (Prentice Hall, translated
from the Russian by R A Silverman), reprinted 2000 (Dover)
Pars L A 1962 An Introduction to the Calculus of Variations, (Heinemann)
Sagan H 1969 An Introduction to the Calculus of Variations, (General Publishing
Company, Canada), reprinted 1992 (Dover)
Troutman J L 1983 Variational Calculus with Elementary Convexity, (Springer-Verlag)