Educación para todos is not a for-profit project but
a collective effort by students and professors at UNAM
to make the materials needed for education accessible to
as many people as possible. We plan
to publish in digital format books that, because of their high cost or
because they can no longer be found in libraries and bookstores, are not
accessible to everyone.
We invite everyone interested in taking part in this project to
suggest titles, to lend us texts for digitization, and to
help us with all the technical work that their reproduction involves.
Ours is a collective project open to the participation of
anyone, and all contributions are welcome.
You can find us at the Talleres Estudiantiles of the Facultad de
Ciencias, and you can contact us at the following
e-mail address:
eduktodos@hotmail.com
http://eduktodos.dyndns.org
MATHEMATICAL
METHODS
FOR PHYSICISTS
Third Edition
GEORGE ARFKEN
Miami University
Oxford, Ohio
ACADEMIC PRESS, INC.
Harcourt Brace Jovanovich, Publishers
San Diego New York Berkeley Boston
London Sydney Tokyo Toronto
Copyright ©1985 by Academic Press, Inc.
All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any
means, electronic or mechanical, including photocopy, recording, or any information
storage and retrieval system, without permission in writing from the publisher.
ACADEMIC PRESS, INC.
San Diego, California 92101
United Kingdom Edition Published by Academic Press, Inc. (London) Ltd.,
24/28 Oval Road, London NW1 7DX
ISBN: 0-12-059820-5
ISBN: 0-12-059810-8 (paper)
Library of Congress Catalog Card Number: 84-71328
PRINTED IN THE UNITED STATES OF AMERICA
87 88 89 9 8 7 6 5 4 3
To Carolyn
CONTENTS
Chapter 1 VECTOR ANALYSIS 1
1.1 Definitions, Elementary Approach 1
1.2 Advanced Definitions 7
1.3 Scalar or Dot Product 13
1.4 Vector or Cross Product 18
1.5 Triple Scalar Product, Triple Vector Product 26
1.6 Gradient 33
1.7 Divergence 37
1.8 Curl 42
1.9 Successive Applications of $\nabla$ 47
1.10 Vector Integration 51
1.11 Gauss's Theorem 57
1.12 Stokes's Theorem 61
1.13 Potential Theory 64
1.14 Gauss's Law, Poisson's Equation 74
1.15 Helmholtz's Theorem 78
Chapter 2 COORDINATE SYSTEMS 85
2.1 Curvilinear Coordinates 86
2.2 Differential Vector Operations 90
2.3 Special Coordinate Systems—Rectangular Cartesian
Coordinates 94
2.4 Circular Cylindrical Coordinates $(\rho, \varphi, z)$ 95
2.5 Spherical Polar Coordinates $(r, \theta, \varphi)$ 102
2.6 Separation of Variables 111
Chapter 3 TENSOR ANALYSIS 118
3.1 Introduction, Definitions 118
3.2 Contraction, Direct Product 124
3.3 Quotient Rule 126
3.4 Pseudotensors, Dual Tensors 128
3.5 Dyadics 137
3.6 Theory of Elasticity 140
3.7 Lorentz Covariance of Maxwell's Equations 150
3.8 Noncartesian Tensors, Covariant Differentiation 158
3.9 Tensor Differential Operations 164
Chapter 4 DETERMINANTS, MATRICES, AND GROUP
THEORY 168
4.1 Determinants 168
4.2 Matrices 176
4.3 Orthogonal Matrices 191
4.4 Oblique Coordinates 206
4.5 Hermitian Matrices, Unitary Matrices 209
4.6 Diagonalization of Matrices 217
4.7 Eigenvectors, Eigenvalues 229
4.8 Introduction to Group Theory 237
4.9 Discrete Groups 243
4.10 Continuous Groups 251
4.11 Generators 261
4.12 SU(2), SU(3), and Nuclear Particles 267
4.13 Homogeneous Lorentz Group 271
Chapter 5 INFINITE SERIES 277
5.1 Fundamental Concepts 277
5.2 Convergence Tests 280
5.3 Alternating Series 293
5.4 Algebra of Series 295
5.5 Series of Functions 299
5.6 Taylor's Expansion 303
5.7 Power Series 313
5.8 Elliptic Integrals 321
5.9 Bernoulli Numbers, Euler-Maclaurin Formula 327
5.10 Asymptotic or Semiconvergent Series 339
5.11 Infinite Products 346
Chapter 6 FUNCTIONS OF A COMPLEX VARIABLE I 352
6.1 Complex Algebra 353
6.2 Cauchy-Riemann Conditions 360
6.3 Cauchy's Integral Theorem 365
6.4 Cauchy's Integral Formula 371
6.5 Laurent Expansion 376
6.6 Mapping 384
6.7 Conformal Mapping 392
Chapter 7 FUNCTIONS OF A COMPLEX VARIABLE II: Calculus
of Residues 396
7.1 Singularities 396
7.2 Calculus of Residues 400
7.3 Dispersion Relations 421
7.4 The Method of Steepest Descents 428
Chapter 8 DIFFERENTIAL EQUATIONS 437
8.1 Partial Differential Equations of Theoretical
Physics 437
8.2 First-Order Differential Equations 440
8.3 Separation of Variables: Ordinary Differential
Equations 448
8.4 Singular Points 451
8.5 Series Solutions: Frobenius' Method 454
8.6 A Second Solution 467
8.7 Nonhomogeneous Equation: Green's Function 480
8.8 Numerical Solutions 491
Chapter 9
STURM-LIOUVILLE THEORY—ORTHOGONAL
FUNCTIONS 497
9.1 Self-Adjoint Differential Equations 497
9.2 Hermitian (Self-Adjoint) Operators 510
9.3 Gram-Schmidt Orthogonalization 516
9.4 Completeness of Eigenfunctions 523
Chapter 10 THE GAMMA FUNCTION (FACTORIAL
FUNCTION) 539
10.1 Definitions, Simple Properties 539
10.2 Digamma and Polygamma Functions 549
10.3 Stirling's Series 555
10.4 The Beta Function 560
10.5 The Incomplete Gamma Functions and Related
Functions 565
Chapter 11 BESSEL FUNCTIONS 573
11.1 Bessel Functions of the First Kind, $J_\nu(x)$ 573
11.2 Orthogonality 591
11.3 Neumann Functions, Bessel Functions of the Second
Kind, $N_\nu(x)$ 596
11.4 Hankel Functions 603
11.5 Modified Bessel Functions, $I_\nu(x)$ and $K_\nu(x)$ 610
11.6 Asymptotic Expansions 616
11.7 Spherical Bessel Functions 622
Chapter 12 LEGENDRE FUNCTIONS 637
12.1 Generating Function 637
12.2 Recurrence Relations and Special Properties 645
12.3 Orthogonality 652
12.4 Alternate Definitions of Legendre Polynomials 663
12.5 Associated Legendre Functions 666
12.6 Spherical Harmonics 680
12.7 Angular Momentum Ladder Operators 685
12.8 The Addition Theorem for Spherical Harmonics 693
12.9 Integrals of the Product of Three Spherical
Harmonics 698
12.10 Legendre Functions of the Second Kind, $Q_n(x)$ 701
12.11 Vector Spherical Harmonics 707
Chapter 13 SPECIAL FUNCTIONS 712
13.1 Hermite Functions 712
13.2 Laguerre Functions 721
13.3 Chebyshev (Tschebyscheff) Polynomials 731
13.4 Chebyshev Polynomials—Numerical
Applications 740
13.5 Hypergeometric Functions 748
13.6 Confluent Hypergeometric Functions 753
Chapter 14 FOURIER SERIES 760
14.1 General Properties 760
14.2 Advantages, Uses of Fourier Series 766
14.3 Applications of Fourier Series 770
14.4 Properties of Fourier Series 778
14.5 Gibbs Phenomenon 783
14.6 Discrete Orthogonality—Discrete Fourier
Transform 787
Chapter 15 INTEGRAL TRANSFORMS 794
15.1 Integral Transforms 794
15.2 Development of the Fourier Integral 797
15.3 Fourier Transforms: Inversion Theorem 800
15.4 Fourier Transform of Derivatives 807
15.5 Convolution Theorem 810
15.6 Momentum Representation 814
15.7 Transfer Functions 820
15.8 Elementary Laplace Transforms 824
15.9 Laplace Transform of Derivatives 831
15.10 Other Properties 838
15.11 Convolution or Faltung Theorem 849
15.12 Inverse Laplace Transformation 853
Chapter 16 INTEGRAL EQUATIONS 865
16.1 Introduction 865
16.2 Integral Transforms, Generating Functions 873
16.3 Neumann Series, Separable (Degenerate) Kernels 879
16.4 Hilbert-Schmidt Theory 890
16.5 Green's Functions: One Dimension 897
16.6 Green's Functions: Two and Three Dimensions
Chapter 17 CALCULUS OF VARIATIONS 925
17.1 One-Dependent and One-Independent Variable 925
17.2 Applications of the Euler Equation 930
17.3 Generalizations, Several Dependent Variables 937
17.4 Several Independent Variables 942
17.5 More Than One Dependent, More Than One
Independent Variable 944
17.6 Lagrangian Multipliers 945
17.7 Variation Subject to Constraints 950
17.8 Rayleigh-Ritz Variational Technique 957
Appendix 1 REAL ZEROS OF A FUNCTION 963
Appendix 2 GAUSSIAN QUADRATURE 968
GENERAL REFERENCES 974
Index 975
PREFACE TO THE
THIRD EDITION
The many additions and revisions in this third edition of Mathematical
Methods for Physicists are based on 15 years of teaching from the second edition,
on the questions from current students, and on the advice of colleagues, reviewers,
and former students. Almost every section has been revised; many of the sections
have been completely rewritten. In most sections, there are new exercises, all class
tested. New sections have been added on non-Cartesian tensors, dispersion
theory, first-order differential equations, numerical application of Chebyshev
polynomials, the fast Fourier transform, and on transfer functions.
Throughout the text, I have placed significant additional emphasis on
numerical applications and on the relation of these mathematical methods to
computing and to numerical analysis.
For students studying graduate level physics, particularly theoretical physics,
a number of topics including Hermitian operators, Hilbert space, and the concept
of completeness have been expanded.
PREFACE TO THE
SECOND EDITION
This second edition of Mathematical Methods for Physicists incorporates a
number of changes, additions, and improvements made on the basis of experience
with the first edition and the helpful suggestions of a number of people. Major
revisions have been made in the sections on complex variables, Dirac delta
function, and Green's functions. New sections have been included on oblique
coordinates, Fourier-Bessel series, and angular momentum ladder operators. The
major addition is a series of sections on group theory. While these could have
been presented as a separate group theory chapter, there seemed to be several
advantages to including them in Chapter 4, Matrices. Since the group theory is
developed in terms of matrices, the arrangement seems a reasonable one.
PREFACE TO THE
FIRST EDITION
Mathematical Methods for Physicists is based upon two courses in mathematics
for physicists given by the author over the past fourteen years, one at the junior
level and one at the beginning graduate level. This book is intended to provide
the student with the mathematics he needs for advanced undergraduate and
beginning graduate study in physical science and to develop a strong background
for those who will continue into the mathematics of advanced theoretical physics.
A mastery of calculus and a willingness to build on this mathematical foundation
are assumed.
This text has been organized with two basic principles in view. First, it has been
written in a form that it is hoped will encourage independent study. There are
frequent cross references but no fixed, rigid page-by-page or chapter-by-chapter
sequence is demanded.
The reader will see that mathematics as a language is beautiful and elegant.
Unfortunately, elegance all too often means elegance for the expert and obscurity
for the beginner. While still attempting to point out the intrinsic beauty of
mathematics, elegance has occasionally been reluctantly but deliberately sacrificed in
the hope of achieving greater flexibility and greater clarity for the student.
Mathematical rigor has been treated in a similar spirit. It is not stressed to the
point of becoming a mental block to the use of mathematics. Limitations are
explained, however, and warnings given against blind, uncomprehending
application of mathematical relations.
The second basic principle has been to emphasize and re-emphasize physical
examples in the text and in the exercises to help motivate the student, to illustrate
the relevance of mathematics to his science and engineering.
This principle has also played a decisive role in the selection and development
of material. The subject of differential equations, for example, is no longer a
series of trick solutions of abstract, relatively meaningless puzzles but the
solutions and general properties of the differential equations the student will most
frequently encounter in a description of our real physical world.
ACKNOWLEDGMENTS
A major revision of this sort necessarily represents the influence and help of
many people. Many of the revisions resulted from current students' requests for
clarification. Many of the additions were a response to the comments and advice
of former students. To all my students, my thanks for their help. Professor P. A.
Macklin has been most helpful with his suggestions and corrections. The final
form of this text owes much to the talents of Senior Editor Jeff Holtmeier of
Academic Press, Inc. and of Carol Kosik of Editing, Design & Production, Inc. A
special acknowledgment is owed Mrs. Jane Kelly for so patiently and
conscientiously typing this manuscript.
INTRODUCTION
Many of the physical examples used to illustrate the applications of
mathematics are taken from the fields of electromagnetic theory and quantum mechanics.
For convenience the main equations are listed below and the symbols identified.
References in these fields are also given.
ELECTROMAGNETIC THEORY
MAXWELL'S EQUATIONS (MKS UNITS—VACUUM)
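\[ \nabla \cdot \mathbf{D} = \rho, \qquad \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \]
\[ \nabla \cdot \mathbf{B} = 0, \qquad \nabla \times \mathbf{H} = \frac{\partial \mathbf{D}}{\partial t} + \mathbf{J}. \]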
Here E is the electric field defined in terms of force on a static charge and B the
magnetic induction defined in terms of force on a moving charge. The related
fields D and H are given (in vacuum) by
\[ \mathbf{D} = \varepsilon_0 \mathbf{E} \quad \text{and} \quad \mathbf{B} = \mu_0 \mathbf{H}. \]
The quantity $\rho$ represents free charge density while J is the corresponding
current. The electric field E and the magnetic induction B are often expressed in
terms of the scalar potential $\varphi$ and the magnetic vector potential A:
\[ \mathbf{E} = -\nabla\varphi - \frac{\partial \mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla \times \mathbf{A}. \]
For additional details see: J. M. Marion, Classical Electromagnetic Radiation,
New York: Academic Press (1965); J. D. Jackson, Classical Electrodynamics, 2nd
ed. New York: Wiley (1975).
Note that Marion and Jackson prefer Gaussian units. A glance at the last two
texts and the great demands they make upon the student's mathematical
competence should provide considerable motivation for the study of this book.
QUANTUM MECHANICS
SCHRÖDINGER WAVE EQUATION (TIME INDEPENDENT)
\[ -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi = E\psi \]
$\psi$ is the (unknown) wave function. The potential energy, often a function of
position, is denoted by V while E is the total energy of the system. The mass of the
particle being described by $\psi$ is m. $\hbar$ is Planck's constant h divided by $2\pi$. Among
the extremely large number of beginning or intermediate texts we might note: A.
Messiah, Quantum Mechanics (2 vols), New York: Wiley (1961); R. H. Dicke and J.
P. Wittke, Introduction to Quantum Mechanics, Reading, Mass.: Addison-Wesley
(1960); E. Merzbacher, Quantum Mechanics, 2nd ed. New York: Wiley (1970).
1 VECTOR
ANALYSIS
1.1 DEFINITIONS, ELEMENTARY APPROACH
In science and engineering we frequently encounter quantities that have
magnitude and magnitude only: mass, time, and temperature. These we label
scalar quantities. In contrast, many interesting physical quantities have
magnitude and, in addition, an associated direction. This second group includes
displacement, velocity, acceleration, force, momentum, and angular
momentum. Quantities with magnitude and direction are labeled vector quantities.
Usually, in elementary treatments, a vector is defined as a quantity having
magnitude and direction. To distinguish vectors from scalars, we identify vector
quantities with boldface type, that is, V.
As an historical sidelight, it is interesting to note that the vector quantities
listed are all taken from mechanics but that vector analysis was not used in the
development of mechanics and, indeed, had not been created. The need for
vector analysis became apparent only with the development of Maxwell's
electromagnetic theory and in appreciation of the inherent vector nature of
quantities such as the electric field and magnetic field.
Our vector may be conveniently represented by an arrow with length
proportional to the magnitude. The direction of the arrow gives the direction of the
vector, the positive sense of direction being indicated by the point. In this
representation vector addition
\[ \mathbf{C} = \mathbf{A} + \mathbf{B} \tag{1.1} \]
consists in placing the rear end of vector B at the point of vector A. Vector C
is then represented by an arrow drawn from the rear of A to the point of B.
This procedure, the triangle law of addition, assigns meaning to Eq. 1.1 and is
illustrated in Fig. 1.1. By completing the parallelogram, we see that
\[ \mathbf{C} = \mathbf{A} + \mathbf{B} = \mathbf{B} + \mathbf{A}, \tag{1.2} \]
as shown in Fig. 1.2. In words, vector addition is commutative.
FIG. 1.1 Triangle law of vector addition
FIG. 1.2 Parallelogram law of vector addition
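For the sum of three vectors
\[ \mathbf{D} = \mathbf{A} + \mathbf{B} + \mathbf{C}, \]
Fig. 1.3, we may first add A and B:
\[ \mathbf{A} + \mathbf{B} = \mathbf{E}. \]
Then this sum is added to C:
\[ \mathbf{D} = \mathbf{E} + \mathbf{C}. \]
Similarly, we may first add B and C:
\[ \mathbf{B} + \mathbf{C} = \mathbf{F}. \]
Then
\[ \mathbf{D} = \mathbf{A} + \mathbf{F}. \]
In terms of the original expression,
\[ (\mathbf{A} + \mathbf{B}) + \mathbf{C} = \mathbf{A} + (\mathbf{B} + \mathbf{C}). \]
Vector addition is associative.
FIG. 1.3 Vector addition is associative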
A direct physical example of the parallelogram addition law is provided by
a weight suspended by two cords. If the junction point (O in Fig. 1.4) is in
equilibrium, the vector sum of the two forces F1 and F2 must just cancel the
downward force of gravity, F3. Here the parallelogram addition law is subject
to immediate experimental verification.¹
¹ Strictly speaking the parallelogram addition was introduced as a definition.
Experiments show that if we assume that the forces are vector quantities and
we combine them by parallelogram addition the equilibrium condition of
zero resultant force is satisfied.
FIG. 1.4 Equilibrium of forces. F1 + F2 = -F3
Subtraction may be handled by defining the negative of a vector as a vector
of the same magnitude but with reversed direction. Then
\[ \mathbf{A} - \mathbf{B} = \mathbf{A} + (-\mathbf{B}). \]
In Fig. 1.3
\[ \mathbf{A} = \mathbf{E} - \mathbf{B}. \]
Note that the vectors are treated as geometrical objects that are independent
of any coordinate system. Indeed, we have not yet introduced a coordinate
system. This concept of independence of a preferred coordinate system is
developed in considerable detail in the next section.
The representation of vector A by an arrow suggests a second possibility.
Arrow A (Fig. 1.5), starting from the origin,² terminates at the point $(x_1, y_1, z_1)$.
Thus, if we agree that the vector is to start at the origin, the positive end may
be specified by giving the cartesian coordinates $(x_1, y_1, z_1)$ of the arrow head.
Although A could have represented any vector quantity (momentum,
electric field, etc.), one particularly important vector quantity, the displacement
from the origin to the point $(x_1, y_1, z_1)$, is denoted by the special symbol r. We
then have a choice of referring to the displacement as either the vector r or the
collection $(x_1, y_1, z_1)$, the coordinates of its end point:
\[ \mathbf{r} \leftrightarrow (x_1, y_1, z_1). \tag{1.3} \]
² The reader will see that we could start from any point in our cartesian
reference frame; we choose the origin for simplicity.
FIG. 1.5 Cartesian components
Using r for the magnitude of vector r, we see from Fig. 1.6 that the end-point
coordinates and the magnitude are related by
\[ x_1 = r\cos\alpha, \qquad y_1 = r\cos\beta, \qquad z_1 = r\cos\gamma. \tag{1.4} \]
$\cos\alpha$, $\cos\beta$, and $\cos\gamma$ are called the direction cosines, $\alpha$ being the angle between
the given vector and the positive x-axis, and so on. One further bit of vocabulary:
The quantities $x_1$, $y_1$, and $z_1$ are known as the (cartesian) components of r or
the projections of r.
FIG. 1.6 Direction cosines
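The relation between end-point coordinates and direction cosines can be tried out numerically. The following is only a minimal sketch: the function name `direction_cosines` and the sample end point (1, 2, 2) are hypothetical choices made for illustration and do not come from the text. It also checks the standard consequence that the squares of the three direction cosines sum to one.

```python
import math

def direction_cosines(x1, y1, z1):
    """Return (cos alpha, cos beta, cos gamma) for the vector ending at (x1, y1, z1)."""
    r = math.sqrt(x1**2 + y1**2 + z1**2)  # magnitude of the displacement r
    if r == 0.0:
        raise ValueError("direction cosines are undefined for the zero vector")
    # Eq. 1.4 solved for the cosines: cos(alpha) = x1 / r, and so on.
    return (x1 / r, y1 / r, z1 / r)

# Hypothetical end point, chosen only for illustration (its magnitude is 3).
cos_a, cos_b, cos_g = direction_cosines(1.0, 2.0, 2.0)
sum_of_squares = cos_a**2 + cos_b**2 + cos_g**2

print(cos_a, cos_b, cos_g)
print(sum_of_squares)
```

Dividing each coordinate by r is all that is needed here; no angles have to be computed explicitly.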
If we proceed in the same manner, any vector A may be resolved into its
components (or projected onto the coordinate axes) to yield
\[ A_x = A\cos\alpha, \tag{1.5} \]
in which $\alpha$ is the angle between A and the positive x-axis. Again, we may choose
to refer to the vector as a single quantity A or to its components $(A_x, A_y, A_z)$.
Note that the subscript x in $A_x$ denotes the x component and not a dependence
on the variable x. $A_x$ may be a function of x, y, and z as $A_x(x, y, z)$. The choice
between using A or its components $(A_x, A_y, A_z)$ is essentially a choice between
a geometric or an algebraic representation. In the language of group theory
(Chapter 4), the two representations are isomorphic.
Use either representation at your convenience. The geometric "arrow in
space" may aid in visualization. The algebraic set of components is usually
much more suitable for precise numerical or algebraic calculations.
Vectors enter physics in two distinct forms. (1) Vector A may represent a
single force acting at a single point. The force of gravity acting at the center of
gravity illustrates this form. (2) Vector A may be defined over some extended
region; that is, A and its components may be functions of position:
$A_x = A_x(x, y, z)$, and so on. Examples of this sort include the velocity of a fluid
varying from point to point over a given volume and electric and magnetic
fields. Some writers distinguish these two cases by referring to the vector
defined over a region as a vector field. The concept of the vector defined over a
region and being a function of position will be extremely important in Section
1.2 and in later sections where we differentiate and integrate vectors.
At this stage it is convenient to introduce unit vectors along each of the
coordinate axes. Let i be a vector of unit magnitude pointing in the positive
x-direction, j, a vector of unit magnitude in the positive y-direction, and k, a
vector of unit magnitude in the positive z-direction. Then $\mathbf{i}A_x$ is a vector with
magnitude equal to $A_x$ and in the positive x-direction. By vector addition
\[ \mathbf{A} = \mathbf{i}A_x + \mathbf{j}A_y + \mathbf{k}A_z, \tag{1.6} \]
which states that a vector equals the vector sum of its components. Note that if
A vanishes, all of its components must vanish individually; that is, if
\[ \mathbf{A} = 0, \quad \text{then} \quad A_x = A_y = A_z = 0. \]
Finally, by the Pythagorean theorem, the magnitude of vector A is
\[ A = (A_x^2 + A_y^2 + A_z^2)^{1/2}. \tag{1.7a} \]
This resolution of a vector into its components can be carried out in a variety
of coordinate systems, as shown in Chapter 2. Here we restrict ourselves to
cartesian coordinates.
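As a short worked illustration of the Pythagorean magnitude formula and of the direction cosine along x, consider a vector with cartesian components (6, 4, 3). These component values are chosen here only as an example.

```latex
\[
A = \left(A_x^2 + A_y^2 + A_z^2\right)^{1/2}
  = \left(6^2 + 4^2 + 3^2\right)^{1/2}
  = \sqrt{61} \approx 7.81,
\qquad
\cos\alpha = \frac{A_x}{A} = \frac{6}{\sqrt{61}} \approx 0.768 .
\]
```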
Equation 1.6 is actually an assertion that the three unit vectors i, j, and k
span our real three-dimensional space: Any constant vector may be written as
a linear combination of i, j, and k. Since i, j, and k are linearly independent
(no one is a linear combination of the other two), they form a basis for the real
three-dimensional space.
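As a replacement of the graphical technique, addition and subtraction of
vectors may now be carried out in terms of their components. For
$\mathbf{A} = \mathbf{i}A_x + \mathbf{j}A_y + \mathbf{k}A_z$ and $\mathbf{B} = \mathbf{i}B_x + \mathbf{j}B_y + \mathbf{k}B_z$,
\[ \mathbf{A} \pm \mathbf{B} = \mathbf{i}(A_x \pm B_x) + \mathbf{j}(A_y \pm B_y) + \mathbf{k}(A_z \pm B_z). \tag{1.7b} \]
EXAMPLE 1.1.1
Let
\[ \mathbf{A} = 6\mathbf{i} + 4\mathbf{j} + 3\mathbf{k}, \qquad \mathbf{B} = 2\mathbf{i} - 3\mathbf{j} - 3\mathbf{k}. \]
Then by Eq. 1.7b
\[ \mathbf{A} + \mathbf{B} = 8\mathbf{i} + \mathbf{j} \]
and
\[ \mathbf{A} - \mathbf{B} = 4\mathbf{i} + 7\mathbf{j} + 6\mathbf{k}. \]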
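The component rule used in Example 1.1.1 translates directly into code. The sketch below is illustrative only: the `Vector` alias and the helper names `add` and `subtract` are hypothetical, and a vector is simply stored as a tuple of its three cartesian components.

```python
from typing import Tuple

Vector = Tuple[float, float, float]  # (Ax, Ay, Az)

def add(a: Vector, b: Vector) -> Vector:
    """Component-wise sum of two vectors."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def subtract(a: Vector, b: Vector) -> Vector:
    """Component-wise difference a - b."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

# The two vectors of Example 1.1.1.
A = (6.0, 4.0, 3.0)
B = (2.0, -3.0, -3.0)

total = add(A, B)
difference = subtract(A, B)

print(total)       # (8.0, 1.0, 0.0)
print(difference)  # (4.0, 7.0, 6.0)
```

Because each component is an ordinary sum of real numbers, `add(A, B)` and `add(B, A)` agree, which is the commutative property in component form.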
It should be emphasized here that the unit vectors i, j, and k are used for
convenience. They are not essential; we can describe vectors and use them
entirely in terms of their components: $\mathbf{A} \leftrightarrow (A_x, A_y, A_z)$. This is the approach of
the two more powerful, more sophisticated definitions of vector discussed in
the next section. However, i, j, and k emphasize the direction, which will be
useful in Chapter 2.
So far we have defined the operations of addition and subtraction of vectors.
Three varieties of multiplication are defined on the basis of their applicability:
a scalar or inner product in Section 1.3, a vector product peculiar to
three-dimensional space in Section 1.4, and a direct or outer product yielding a
second-rank tensor in Section 3.2. Division by a vector is not defined. See
Exercises 4.2.21 and 4.2.22.
EXERCISES
1.1.1 Show how to find A and B, given A + B and A - B.
1.1.2 The vector A whose magnitude is 10 units makes equal angles with the coordinate
axes. Find $A_x$, $A_y$, and $A_z$.
1.1.3 Calculate the components of a unit vector that lies in the xy-plane and makes
equal angles with the positive directions of the x- and y-axes.
1.1.4 The velocity of sailboat A relative to sailboat B, $\mathbf{v}_{\text{rel}}$, is defined by the equation
$\mathbf{v}_{\text{rel}} = \mathbf{v}_A - \mathbf{v}_B$, where $\mathbf{v}_A$ is the velocity of A and $\mathbf{v}_B$ is the velocity of B. Determine
the velocity of A relative to B if
$\mathbf{v}_A$ = 30 km/hr east
$\mathbf{v}_B$ = 40 km/hr north
ANS. $v_{\text{rel}}$ = 50 km/hr, 53.1° south of east.
1.1.5 A sailboat sails for 1 hr at 4 km/hr (relative to the water) on a steady compass
heading of 40° east of north. The sailboat is simultaneously carried along by a
current. At the end of the hour the boat is 6.12 km from its starting point. The
line from its starting point to its location lies 60° east of north. Find the x
(easterly) and y (northerly) components of the water's velocity.
ANS. $v_{\text{east}}$ = 2.73 km/hr, $v_{\text{north}}$ = 0 km/hr.
1.1.6 A vector equation can be reduced to the form A = B. From this show that the
one vector equation is equivalent to three scalar equations.
Assuming the validity of Newton's second law F = ma as a vector equation,
this means that ax depends only on Fx and is independent of Fy and Fz.
1.1.7 The vertices of a triangle A, B, and C are given by the points (-1, 0, 2), (0, 1, 0),
and (1, -1, 0), respectively. Find point D so that the figure ABDC forms a plane
parallelogram.
ANS. (2, 0, -2).
1.1.8 A triangle is defined by the vertices of three vectors, A, B, and C, that extend
from the origin. In terms of A, B, and C show that the vector sum of the successive
sides of the triangle (AB + BC + CA) is zero.
1.1.9 A sphere of radius a is centered at a point $\mathbf{r}_1$.
(a) Write out the algebraic equation for the sphere.
(b) Write out a vector equation for the sphere.
ANS. (a) $(x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2 = a^2$.
(b) $\mathbf{r} = \mathbf{r}_1 + \mathbf{a}$.
(a takes on all directions but has a fixed magnitude, a.)
1.1.10 A corner reflector is formed by three mutually perpendicular reflecting surfaces.
Show that a ray of light incident upon the corner reflector (striking all three
surfaces) is reflected back along a line parallel to the line of incidence.
Hint. Consider the effect of a reflection on the components of a vector describing
the direction of the light ray.
1.1.11 Hubble's law. Hubble found that distant galaxies are receding with a velocity
proportional to their distance from where we are on Earth. For the ith galaxy
\[ \mathbf{v}_i = H_0 \mathbf{r}_i, \]
with us at the origin. Show that this recession of the galaxies from us does not
imply that we are at the center of the universe. Specifically, take the galaxy
at $\mathbf{r}_1$ as a new origin and show that Hubble's law is still obeyed.
1.2 ADVANCED DEFINITIONS*
In the preceding section vectors were defined or represented in two
equivalent ways: (1) geometrically by specifying magnitude and direction, as with an
arrow, and (2) algebraically by specifying the components relative to cartesian
coordinate axes. The second definition is adequate for the vector analysis of
this chapter. In this section two more refined, sophisticated, and powerful
definitions are presented. First, the vector field is defined in terms of the
behavior of its components under rotation of the coordinate axes. This
transformation theory approach leads into the tensor analysis of Chapter 3. Second,
the component definition of Section 1.1 is refined and generalized according to
the mathematician's concepts of vector and vector space. This approach leads
to function spaces including the Hilbert space (Section 9.4).
*This section is optional. It is not essential for the remaining sections of this
chapter.
ROTATION OF THE COORDINATE AXES
The definition of vector as a quantity with magnitude and direction breaks
down in advanced work. On the one hand, we encounter quantities, such as
elastic constants and index of refraction in anisotropic crystals, that have
magnitude and direction but which are not vectors. On the other hand, our
naive approach is awkward to generalize, to extend to more complex quantities.
We seek a new definition of vector field, using our displacement vector r as a
prototype.
There is an important physical basis for our development of a new definition.
We describe our physical world by mathematics, but it and any physical
predictions we may make must be independent of our mathematical analysis.
Some writers compare the physical system to a building and the mathematical
analysis to the scaffolding used to construct the building. In the end the
scaffolding is stripped off and the building stands.
In our specific case we assume that space is isotropic; that is, there is no
preferred direction or all directions are equivalent. Then the physical system
being analyzed or the physical law being enunciated cannot and must not
depend on our choice or orientation of the coordinate axes.
Now we return to the concept of vector r as a geometric object independent
of the coordinate system. Let us look at r in two different systems, one rotated
in relation to the other.
For simplicity we consider first the two-dimensional case. If the x-, y-coordinates
are rotated counterclockwise through an angle $\varphi$, keeping r fixed
(Fig. 1.7), we get the following relations between the components resolved in
the original system (unprimed) and those resolved in the new rotated system
(primed):
\[ x' = x\cos\varphi + y\sin\varphi, \qquad y' = -x\sin\varphi + y\cos\varphi. \tag{1.8} \]
FIG. 1.7 Rotation of cartesian coordinate axes about the z-axis
We saw in Section 1.1 that a vector could be represented by the coordinates
of a point; that is, the coordinates were proportional to the vector components.
Hence the components of a vector must transform under rotation as coordinates
of a point (such as r). Therefore whenever any pair of quantities $A_x(x, y)$ and
$A_y(x, y)$ in the xy-coordinate system is transformed into $(A'_x, A'_y)$ by this rotation
of the coordinate system with
\[ A'_x = A_x\cos\varphi + A_y\sin\varphi, \qquad A'_y = -A_x\sin\varphi + A_y\cos\varphi, \tag{1.9} \]
we define¹ $A_x$ and $A_y$ as the components of a vector A. Our vector now is
defined in terms of the transformation of its components under rotation of the
coordinate system. If $A_x$ and $A_y$ transform in the same way as x and y, the
components of the two-dimensional displacement vector, they are the
components of a vector A. If $A_x$ and $A_y$ do not show this form invariance when the
coordinates are rotated, they do not form a vector.
The vector field components $A_x$ and $A_y$ satisfying the defining equations,
Eq. 1.9, associate a magnitude A and a direction with each point in space. The
magnitude is a scalar quantity, invariant to the rotation of the coordinate
system. The direction (relative to the unprimed system) is likewise invariant to
the rotation of the coordinate system (see Exercise 1.2.1). The result of all this
is that the components of a vector may vary according to the rotation of the
primed coordinate system. This is what Eq. 1.9 says. But the variation with the
angle is just such that the components in the rotated coordinate system $A'_x$ and
$A'_y$ define a vector with the same magnitude and the same direction as the
vector defined by the components $A_x$ and $A_y$ relative to the x-, y-coordinate
axes. (Compare Exercise 1.2.1.) The components of A in a particular coordinate
system constitute the representation of A in that coordinate system. Equation
1.9, the transformation relation, is a guarantee that the entity A is independent
of the rotation of the coordinate system.
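The invariance described above can be checked numerically. The sketch below applies the two-dimensional transformation of Eq. 1.9 to a sample vector and compares magnitude and direction before and after. The function name `rotate_components`, the components (3, 4), and the 30° angle are hypothetical values picked for illustration.

```python
import math

def rotate_components(ax, ay, phi):
    """Components in axes rotated counterclockwise by phi (radians), per Eq. 1.9."""
    ax_new = ax * math.cos(phi) + ay * math.sin(phi)
    ay_new = -ax * math.sin(phi) + ay * math.cos(phi)
    return ax_new, ay_new

ax, ay = 3.0, 4.0            # hypothetical components in the unprimed system
phi = math.radians(30.0)     # hypothetical rotation angle
ax_p, ay_p = rotate_components(ax, ay, phi)

# The magnitude is a scalar: it should be the same in both systems.
magnitude = math.hypot(ax, ay)
magnitude_rotated = math.hypot(ax_p, ay_p)

# Direction relative to the unprimed x-axis: the angle measured in the
# primed system plus the rotation angle phi.
angle = math.atan2(ay, ax)
angle_from_primed = math.atan2(ay_p, ax_p) + phi

print(magnitude, magnitude_rotated)
print(math.degrees(angle), math.degrees(angle_from_primed))
```

The individual components change with the angle, but the magnitude and the direction referred back to the unprimed axes do not.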
¹ The corresponding definition of a scalar quantity is S' = S, that is, invariant
under rotation of the coordinates.
To go on to three and, later, four dimensions, we find it convenient to use a
more compact notation. Let
\[ x \to x_1, \qquad y \to x_2, \tag{1.10} \]
\[ a_{11} = \cos\varphi, \quad a_{12} = \sin\varphi, \quad a_{21} = -\sin\varphi, \quad a_{22} = \cos\varphi. \tag{1.11} \]
Then Eq. 1.8 becomes
\[ x'_1 = a_{11}x_1 + a_{12}x_2, \qquad x'_2 = a_{21}x_1 + a_{22}x_2. \tag{1.12} \]
The coefficient $a_{ij}$ may be interpreted as a direction cosine, the cosine of the
angle between $x'_i$ and $x_j$; that is,
\[ a_{12} = \cos(x'_1, x_2) = \sin\varphi, \qquad a_{21} = \cos(x'_2, x_1) = \cos\left(\varphi + \frac{\pi}{2}\right) = -\sin\varphi. \tag{1.13} \]
The advantage of the new notation² is that it permits us to use the summation
symbol $\sum$ and to rewrite Eqs. 1.12 as
\[ x'_i = \sum_{j=1}^{2} a_{ij}x_j, \qquad i = 1, 2. \tag{1.14} \]
Note that i remains as a parameter that gives rise to one equation when it is
set equal to 1 and to a second equation when it is set equal to 2. The index j,
of course, is a summation index, a dummy index, and as with a variable of
integration, j may be replaced by any other convenient symbol.
The generalization to three, four, or N dimensions is now very simple. The
set of N quantities, $V_j$, is said to be the components of an N-dimensional vector,
V, if and only if their values relative to the rotated coordinate axes are given by
$$V'_i = \sum_{j=1}^{N} a_{ij}V_j, \qquad i = 1, 2, \ldots, N. \tag{1.15}$$
As before, $a_{ij}$ is the cosine of the angle between $x'_i$ and $x_j$. Often the upper limit
N and the corresponding range of i will not be indicated. It is taken for granted
that the reader knows how many dimensions his or her space has.
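The transformation law of Eq. 1.15 can be tried out numerically. The following is a minimal Python sketch, not part of the original text: the helper names `direction_cosines` and `transform`, the sample components (3, 4), and the 30° angle are all illustrative assumptions. It builds the two-dimensional direction cosines of Eq. 1.11, applies Eq. 1.15, and compares magnitudes before and after the rotation.

```python
import math

def direction_cosines(phi):
    """2-D direction cosines a_ij for axes rotated by phi (Eq. 1.11)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s], [-s, c]]

def transform(a, v):
    """Apply V'_i = sum_j a_ij V_j (Eq. 1.15); works for any dimension N."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

A = [3.0, 4.0]  # illustrative components, not taken from the text
A_prime = transform(direction_cosines(math.radians(30.0)), A)

mag = math.sqrt(sum(x * x for x in A))
mag_prime = math.sqrt(sum(x * x for x in A_prime))
print(A_prime, mag, mag_prime)
```

The components change with the angle, while the two magnitudes should agree to rounding error, which is the content of Exercise 1.2.1(a).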
2 The reader may wonder at the replacement of one parameter $\varphi$ by four
parameters $a_{ij}$. Clearly, the $a_{ij}$ do not constitute a minimum set of parameters.
For two dimensions the four $a_{ij}$ are subject to the three constraints given in
Eq. 1.18. The justification for the redundant set of direction cosines is the
convenience it provides. Hopefully, this convenience will become more
apparent in Chapters 3 and 4. For three-dimensional rotations (nine $a_{ij}$ but only
three independent) alternate descriptions are provided by (1) the Euler angles
discussed in Section 4.3, (2) quaternions, and (3) the Cayley-Klein parameters.
These alternatives have their respective advantages and disadvantages.
From the definition of $a_{ij}$ as the cosine of the angle between the positive $x'_i$
direction and the positive $x_j$ direction we may write (cartesian coordinates)$^3$
$$a_{ij} = \frac{\partial x'_i}{\partial x_j} = \frac{\partial x_j}{\partial x'_i}. \tag{1.16}$$
Note carefully that these are partial derivatives. By use of Eq. 1.16, Eq. 1.15
becomes
$$V'_i = \sum_{j=1}^{N} \frac{\partial x'_i}{\partial x_j} V_j = \sum_{j=1}^{N} \frac{\partial x_j}{\partial x'_i} V_j. \tag{1.17}$$
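The direction cosines $a_{ij}$ satisfy an orthogonality condition
$$\sum_i a_{ij}a_{ik} = \delta_{jk} \tag{1.18}$$
or, equivalently,
$$\sum_i a_{ji}a_{ki} = \delta_{jk}. \tag{1.19}$$
The symbol $\delta_{jk}$ is the Kronecker delta defined by
$$\delta_{jk} = 1 \ \text{ for } j = k, \qquad \delta_{jk} = 0 \ \text{ for } j \neq k. \tag{1.20}$$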
The reader may easily verify that Eqs. 1.18 and 1.19 hold in the two-dimensional
case by substituting in the specific $a_{ij}$ from Eq. 1.11. The result is the well-known
identity $\sin^2\varphi + \cos^2\varphi = 1$ for the nonvanishing case. To verify Eq. 1.18 in
general form, we may use the partial derivative forms of Eqs. 1.16 to obtain
$$\sum_i \frac{\partial x_j}{\partial x'_i}\frac{\partial x_k}{\partial x'_i} = \sum_i \frac{\partial x_j}{\partial x'_i}\frac{\partial x'_i}{\partial x_k} = \frac{\partial x_j}{\partial x_k}. \tag{1.21}$$
The last step follows by the standard rules for partial differentiation, assuming
that $x_j$ is a function of $x'_1$, $x'_2$, $x'_3$, and so on. The final result, $\partial x_j/\partial x_k$, is equal
to $\delta_{jk}$, since $x_j$ and $x_k$ as coordinate lines ($j \neq k$) are assumed to be perpendicular
(two or three dimensions) or orthogonal (for any number of dimensions).
Equivalently, we may assume that $x_j$ and $x_k$ ($j \neq k$) are totally independent
variables. If $j = k$, the partial derivative is clearly equal to 1.
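The two-dimensional check of Eq. 1.18 suggested above can also be carried out numerically. This is a sketch under stated assumptions: the function names `kronecker` and `orthogonality_defect` and the test angle 0.7 rad are hypothetical choices, not from the text. The function returns the largest deviation of $\sum_i a_{ij}a_{ik}$ from $\delta_{jk}$.

```python
import math

def kronecker(j, k):
    """Kronecker delta of Eq. 1.20."""
    return 1.0 if j == k else 0.0

def orthogonality_defect(a):
    """Largest |sum_i a_ij a_ik - delta_jk| over all j, k (Eq. 1.18)."""
    n = len(a)
    return max(
        abs(sum(a[i][j] * a[i][k] for i in range(n)) - kronecker(j, k))
        for j in range(n)
        for k in range(n)
    )

phi = 0.7  # arbitrary test angle in radians (an assumption)
a = [[math.cos(phi), math.sin(phi)], [-math.sin(phi), math.cos(phi)]]
defect = orthogonality_defect(a)
print(defect)
```

For a genuine rotation the defect should vanish to rounding error; a matrix that is not a set of direction cosines, such as a shear, gives a defect of order one.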
In redefining a vector in terms of how its components transform under a
rotation of the coordinate system, we should emphasize two points:
1. This definition is developed because it is useful and appropriate in
describing our physical world. Our vector equations will be independent of
any particular coordinate system. (The coordinate system need not even be
cartesian.) The vector equation can always be expressed in some particular
coordinate system and, to obtain numerical results, we must ultimately express
the equation in some specific coordinate system.
2. This definition is subject to a generalization that will open up the branch
of mathematics known as tensor analysis (Chapter 3).
3 Differentiate $x'_i = \sum_k a_{ik}x_k$ with respect to $x_j$. See the discussion following
Eq. 1.21. Section 4.3 provides an alternate approach.
A qualification is also in order. The behavior of the vector components
under rotation of the coordinates is used in Section 1.3 to prove that a scalar
product is a scalar, in Section 1.4 to prove that a vector product is a vector,
and in Section 1.6 to show that the gradient of a scalar, $\nabla\psi$, is a vector. The
remainder of this chapter proceeds on the basis of the less restrictive definitions
of the vector given in Section 1.1.
Vectors and Vector Space
It is customary in mathematics to label an ordered triple of real numbers
$(x_1, x_2, x_3)$ a vector x. The number $x_n$ is called the nth component of vector x.
The collection of all such vectors (obeying the properties that follow) forms a
three-dimensional real vector space. We ascribe five properties to our vectors:
If $\mathbf{x} = (x_1, x_2, x_3)$ and $\mathbf{y} = (y_1, y_2, y_3)$,
1. Vector equality: x = y means $x_i = y_i$, i = 1, 2, 3.
2. Vector addition: x + y = z means $x_i + y_i = z_i$.
3. Scalar multiplication: $a\mathbf{x} \leftrightarrow (ax_1, ax_2, ax_3)$ (with a real).
4. Negative of a vector: $-\mathbf{x} = (-1)\mathbf{x} \leftrightarrow (-x_1, -x_2, -x_3)$.
5. Null vector: There exists a null vector $\mathbf{0} \leftrightarrow (0, 0, 0)$.
Since our vector components are simply numbers, the following properties
also hold:
1. Addition of vectors is commutative: x + y = y + x.
2. Addition of vectors is associative: (x + y) + z = x + (y + z).
3. Scalar multiplication is distributive:
a(x + y) = ax + ay, also (a + b)x = ax + bx.
4. Scalar multiplication is associative: (ab)x = a(bx).
Further, the null vector 0 is unique as is the negative of a given vector x.
So far as the vectors themselves are concerned this approach merely
formalizes the component discussion of Section 1.1. The importance lies in the
extensions which will be considered in later chapters. In Chapter 4, we show that
vectors form both an Abelian group under addition and a linear space with
the transformations in the linear space described by matrices. Finally, and
perhaps most important, for advanced physics the concept of vectors presented
here may be generalized to (1) complex quantities,$^4$ (2) functions, and (3) an
infinite number of components. This leads to infinite dimensional function
spaces, the Hilbert spaces, which are important in modern quantum theory. A
brief introduction to function expansions and Hilbert space appears in Section
9.4.
4 The n-dimensional vector space of real n-tuples is often labeled $R^n$ and the
n-dimensional vector space of complex n-tuples is labeled $C^n$.
EXERCISES
1.2.1 (a) Show that the magnitude of a vector A, $A = (A_x^2 + A_y^2)^{1/2}$, is independent
of the orientation of the rotated coordinate system,
$$(A_x^2 + A_y^2)^{1/2} = (A'^{2}_{x} + A'^{2}_{y})^{1/2},$$
independent of the rotation angle $\varphi$.
This independence of angle is expressed by saying that A is invariant under
rotations.
(b) At a given point (x, y) A defines an angle $\alpha$ relative to the positive x-axis
and $\alpha'$ relative to the positive x'-axis. The angle from x to x' is $\varphi$. Show that
A = A' defines the same direction in space when expressed in terms of its
primed components, as in terms of its unprimed components; that is,
$$\alpha' = \alpha - \varphi.$$
1.2.2 Prove the orthogonality condition $\sum_i a_{ji}a_{ki} = \delta_{jk}$. As a special case of this the
direction cosines of Section 1.1 satisfy the relation
$$\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1,$$
a result that also follows from Eq. 1.7a.
1.3 SCALAR OR DOT PRODUCT
Having defined vectors, we now proceed to combine them. The laws for
combining vectors must be mathematically consistent. From the possibilities
that are consistent we select two that are both mathematically and physically
interesting. A third possibility is introduced in Chapter 3, in which we form
tensors.
The combination $AB\cos\theta$, in which A and B are the magnitudes of two
vectors and $\theta$ the angle between them, occurs frequently in physics (Fig. 1.8).
FIG. 1.8 Scalar product $\mathbf{A}\cdot\mathbf{B} = AB\cos\theta$
For instance,
work = force × displacement × $\cos\theta$
is usually interpreted as displacement times the projection of the force along
the displacement.
With such applications in mind, we define
$$\mathbf{A}\cdot\mathbf{B} = A_xB_x + A_yB_y + A_zB_z = \sum_i A_iB_i \tag{1.22}$$
as the scalar, dot, or inner product of A and B. The scalar product of two
vectors is a scalar quantity. We note that from this definition A · B = B · A;
the scalar product is commutative. The unit vectors i, j, and k satisfy the relations
$$\mathbf{i}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{k} = 1, \tag{1.22a}$$
whereas
$$\mathbf{i}\cdot\mathbf{j} = \mathbf{i}\cdot\mathbf{k} = \mathbf{j}\cdot\mathbf{k} = 0, \qquad \mathbf{j}\cdot\mathbf{i} = \mathbf{k}\cdot\mathbf{i} = \mathbf{k}\cdot\mathbf{j} = 0. \tag{1.22b}$$
If we reorient our axes and let A define a new x-axis,$^1$ then
$$A_x = A, \qquad A_y = A_z = 0$$
and
$$B_x = B\cos\theta.$$
Then by Eq. 1.22
$$\mathbf{A}\cdot\mathbf{B} = AB\cos\theta, \tag{1.23}$$
which may be taken as a second definition of scalar product. The component
definition, Eq. 1.22, might be labeled an algebraic definition. Then Eq. 1.23
would be a geometric definition. One of the most common applications of the
scalar product in physics is in the calculation of work, $W = \mathbf{F}\cdot\mathbf{s}$, the scalar
product of force and displacement.
EXAMPLE 1.3.1
For the two vectors A and B of Example 1.1.1, A = 6i + 4j + 3k, B =
2i - 3j - 3k,
$$\mathbf{A}\cdot\mathbf{B} = (12 - 12 - 9) = -9$$
by Eq. 1.22. In this case the projection of A on B (or B on A) is negative. Actually,
$$|\mathbf{A}| = (36 + 16 + 9)^{1/2} = (61)^{1/2} = 7.81,$$
$$|\mathbf{B}| = (4 + 9 + 9)^{1/2} = (22)^{1/2} = 4.69,$$
and $\cos\theta = -0.246$, $\theta = 104.2^\circ$.
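The arithmetic of this example is easy to reproduce. The short Python sketch below is an illustration only (the helper name `dot` is an assumption): it evaluates the algebraic definition, Eq. 1.22, and then solves the geometric definition, Eq. 1.23, for the angle.

```python
import math

def dot(a, b):
    """Scalar product from the component definition, Eq. 1.22."""
    return sum(x * y for x, y in zip(a, b))

A = (6.0, 4.0, 3.0)
B = (2.0, -3.0, -3.0)

AB = dot(A, B)
cos_theta = AB / (math.sqrt(dot(A, A)) * math.sqrt(dot(B, B)))  # Eq. 1.23 solved for cos(theta)
theta_deg = math.degrees(math.acos(cos_theta))
print(AB, round(cos_theta, 3), round(theta_deg, 1))
```

The printed values should match the -9, -0.246, and 104.2° quoted in the example.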
1 The invariance of A · B under rotation of the coordinate axes is proved later
in this section.
If A · B = 0 and we know that $\mathbf{A} \neq 0$ and $\mathbf{B} \neq 0$, then from Eq. 1.23 $\cos\theta = 0$
or $\theta = 90^\circ$, $270^\circ$, and so on. The vectors A and B must be perpendicular.
Alternately, we may say A and B are orthogonal. The unit vectors i, j, and k
are mutually orthogonal. To develop this notion of orthogonality one more
step, suppose that n is a unit vector and r is a nonzero vector in the xy-plane;
that is, r = ix + jy (Fig. 1.9). If
$$\mathbf{n}\cdot\mathbf{r} = 0$$
for all choices of r, then n must be perpendicular (orthogonal) to the xy-plane.
FIG. 1.9 A normal vector
Often it is convenient to replace i, j, and k by subscripted unit vectors $\mathbf{e}_m$,
m = 1, 2, 3 with $\mathbf{i} = \mathbf{e}_1$, and so on. Then Eqs. 1.22a and b become
$$\mathbf{e}_m\cdot\mathbf{e}_n = \delta_{mn}. \tag{1.22c}$$
For $m \neq n$ the unit vectors $\mathbf{e}_m$ and $\mathbf{e}_n$ are orthogonal. For m = n each vector is
normalized to unity, that is, has unit magnitude. The set $\mathbf{e}_m$ is said to be
orthonormal. A major advantage of Eq. 1.22c over Eqs. 1.22a and b is that Eq. 1.22c
may readily be generalized to N-dimensional space: m, n = 1, 2, ..., N.
Finally, we are picking sets of unit vectors $\mathbf{e}_m$ that are orthonormal for
convenience, a very great convenience. The nonorthogonal situation is explored
in Section 4.4, "Oblique Coordinates."
SCALAR PROPERTY
We have not yet shown that the word scalar is justified or that the scalar
product is indeed a scalar quantity. To do this, we investigate the behavior of
A · B under a rotation of the coordinate system. By use of Eq. 1.15
$$\begin{aligned} A'_xB'_x &= \sum_i a_{xi}A_i \sum_j a_{xj}B_j, \\ A'_yB'_y &= \sum_i a_{yi}A_i \sum_j a_{yj}B_j, \\ A'_zB'_z &= \sum_i a_{zi}A_i \sum_j a_{zj}B_j. \end{aligned} \tag{1.24}$$
Using the indices k and l to sum over x, y, and z, we obtain
$$\sum_k A'_kB'_k = \sum_l \sum_i \sum_j a_{li}A_i a_{lj}B_j, \tag{1.25}$$
and, by rearranging the terms on the right-hand side, we have
$$\sum_k A'_kB'_k = \sum_l \sum_i \sum_j (a_{li}a_{lj})A_iB_j = \sum_i \sum_j \delta_{ij}A_iB_j = \sum_i A_iB_i. \tag{1.26}$$
The last two steps follow by using Eq. 1.18, the orthogonality condition of the
direction cosines, and Eq. 1.20, which defines the Kronecker delta. The effect
of the Kronecker delta is to cancel all terms in a summation over either index
except the term for which the indices are equal. In Eq. 1.26 its effect is to set
j = i and to eliminate the summation over j. Of course, we could equally well
set i = j and eliminate the summation over i. Equation 1.26 gives us
$$\sum_k A'_kB'_k = \sum_i A_iB_i, \tag{1.27}$$
which is just our definition of a scalar quantity, one that remains invariant under
the rotation of the coordinate system.
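The invariance just derived can be spot-checked with numbers. The sketch below is illustrative only: it assumes a set of three-dimensional direction cosines built by composing a rotation about the z-axis with one about the x-axis (the helper names and the two angles are arbitrary choices, not from the text), transforms A and B by Eq. 1.15, and compares the scalar products in the two systems.

```python
import math

def rot_z(phi):
    """Direction cosines for a rotation of the axes by phi about z."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(psi):
    """Direction cosines for a rotation of the axes by psi about x."""
    c, s = math.cos(psi), math.sin(psi)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transform(a, v):
    """V'_i = sum_j a_ij V_j (Eq. 1.15)."""
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

a = matmul(rot_x(0.4), rot_z(1.1))  # arbitrary angles, chosen for illustration
A, B = [6.0, 4.0, 3.0], [2.0, -3.0, -3.0]

dot_unprimed = dot(A, B)
dot_primed = dot(transform(a, A), transform(a, B))
print(dot_unprimed, dot_primed)
```

Both values should equal -9 to rounding error, in line with Eq. 1.27.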
In a similar approach which exploits this concept of invariance, we take
C = A + B and dot it into itself.
$$\mathbf{C}\cdot\mathbf{C} = (\mathbf{A} + \mathbf{B})\cdot(\mathbf{A} + \mathbf{B}) = \mathbf{A}\cdot\mathbf{A} + \mathbf{B}\cdot\mathbf{B} + 2\mathbf{A}\cdot\mathbf{B}. \tag{1.28}$$
Since
$$\mathbf{C}\cdot\mathbf{C} = C^2, \tag{1.29}$$
the square of the magnitude of vector C and thus an invariant quantity, we
see that
$$\mathbf{A}\cdot\mathbf{B} = \tfrac{1}{2}(C^2 - A^2 - B^2), \quad \text{invariant}. \tag{1.30}$$
Since the right-hand side of Eq. 1.30 is invariant (that is, a scalar quantity),
the left-hand side, A · B, must also be invariant under rotation of the coordinate
system. Hence A · B is a scalar.
Equation 1.28 is really another form of the law of cosines which is
$$C^2 = A^2 + B^2 + 2AB\cos\theta. \tag{1.31}$$
Comparing Eqs. 1.28 and 1.31, we have another verification of Eq. 1.23, or,
if preferred, a vector derivation of the law of cosines (Fig. 1.10).
FIG. 1.10 The law of cosines
An interesting illustration of the geometric interpretation of the scalar
product is provided by an example from a branch of general relativity. Consider
a four-dimensional sphere
x2 + y2 + z2 + w2 = 1
in x, y, z, w space. The surface of this four-dimensional sphere may be described
by the vector r = (x, y, z, w) with the restriction that |r| = 1. It is possible to
construct a unit vector t that is tangential to this four-dimensional sphere over
its entire surface. As one possible example,
$$\mathbf{t} = (y, -x, w, -z).$$
The reader may verify that
$$\mathbf{t}\cdot\mathbf{t} = y^2 + x^2 + w^2 + z^2 = 1,$$
therefore unit magnitude, and
$$\mathbf{t}\cdot\mathbf{r} = yx - xy + wz - zw = 0,$$
therefore tangential, over the entire sphere.
The two-dimensional analog exists but there is no three-dimensional analog.
Hair growing out of a sphere cannot be combed down all over. There will be
a cowlick.
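The two properties claimed for t can be sampled numerically. This is a rough sketch, not a proof: the sampling scheme (normalized Gaussian components), the seed, and the function names are assumptions made for illustration. It draws points on the unit sphere in (x, y, z, w) space and records how far t · t departs from 1 and t · r from 0.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def random_point_on_sphere(rng):
    """A point r = (x, y, z, w) with |r| = 1, from normalized Gaussian samples."""
    p = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(dot(p, p))
    return [c / norm for c in p]

def tangent(r):
    """The field t = (y, -x, w, -z) quoted in the text."""
    x, y, z, w = r
    return [y, -x, w, -z]

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [random_point_on_sphere(rng) for _ in range(1000)]
worst_norm_error = max(abs(dot(tangent(r), tangent(r)) - 1.0) for r in samples)
worst_tangency_error = max(abs(dot(tangent(r), r)) for r in samples)
print(worst_norm_error, worst_tangency_error)
```

Both errors should be at the level of floating-point rounding.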
The dot product, given by Eq. 1.22, may be generalized in two ways. The
space need not be restricted to three dimensions. In n-dimensional space,
Eq. 1.22 applies with the sum running from 1 to n; n may be infinity, with the
sum then a convergent infinite series (Section 5.2). The other generalization
extends the concept of vector to embrace functions. The function analog of a
dot or inner product appears in Section 9.4.
EXERCISES
1.3.1 What is the cosine of the angle between the vectors
A = 3i + 4j + k
and
B = i - j + k?
ANS. $\cos\theta = 0$, $\theta = \pi/2$.
1.3.2 Two unit magnitude vectors $\mathbf{e}_i$ and $\mathbf{e}_j$ are required to be either parallel or
perpendicular to each other. Show that $\mathbf{e}_i\cdot\mathbf{e}_j$ provides an interpretation of Eq. 1.18,
the direction cosine orthogonality relation.
1.3.3 Given that (1) the dot product of a unit vector with itself is unity and (2) this
relation is valid in all (rotated) coordinate systems, show that i' · i' = 1 (with the
primed system rotated 45° about the z-axis relative to the unprimed) implies that
i · j = 0.
1.3.4 The vector r, starting at the origin, terminates at and specifies the point in space
(x, y, z). Find the surface swept out by the tip of r if
(a) (r - a) · a = 0,
(b) (r - a) · r = 0.
The vector a is a constant (constant in magnitude and direction).
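1.3.5 The interaction energy between two dipoles of moments $\boldsymbol{\mu}_1$ and $\boldsymbol{\mu}_2$ may be written
in the vector form
$$V = -\frac{\boldsymbol{\mu}_1\cdot\boldsymbol{\mu}_2}{r^3} + \frac{3(\boldsymbol{\mu}_1\cdot\mathbf{r})(\boldsymbol{\mu}_2\cdot\mathbf{r})}{r^5}$$
and in the scalar form
$$V = \frac{\mu_1\mu_2}{r^3}(2\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2\cos\varphi).$$
Here $\theta_1$ and $\theta_2$ are the angles of $\boldsymbol{\mu}_1$ and $\boldsymbol{\mu}_2$ relative to r, while $\varphi$ is the azimuth of
$\boldsymbol{\mu}_2$ relative to the $\boldsymbol{\mu}_1$-r plane. Show that these two forms are equivalent.
Hint. Eq. 12.198 will be helpful.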
1.3.6 A pipe comes diagonally down the south wall of a building, making an angle
of 45° with the horizontal. Coming into a corner, the pipe turns and continues
diagonally down a west-facing wall, still making an angle of 45° with the horizontal.
What is the angle between the south-wall and west-wall sections of the pipe?
ANS. 120°.
1.4 VECTOR OR CROSS PRODUCT
A second form of vector multiplication employs the sine of the included
angle instead of the cosine. For instance, the angular momentum of a body is
defined as
angular momentum = radius arm × linear momentum
= distance × linear momentum × $\sin\theta$.
For convenience in treating problems relating to quantities such as angular
momentum, torque, and angular velocity, we define the vector or cross product
as
$$\mathbf{C} = \mathbf{A}\times\mathbf{B},$$
with
$$C = AB\sin\theta. \tag{1.32}$$
FIG. 1.11 Angular momentum
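Unlike the preceding case of the scalar product, C is now a vector, and we assign
it a direction perpendicular to the plane of A and B such that A, B, and C form a
right-handed system. With this choice of direction we have
$$\mathbf{A}\times\mathbf{B} = -\mathbf{B}\times\mathbf{A}, \quad \text{anticommutation}. \tag{1.32a}$$
From this definition of cross product we have
$$\mathbf{i}\times\mathbf{i} = \mathbf{j}\times\mathbf{j} = \mathbf{k}\times\mathbf{k} = 0, \tag{1.32b}$$
whereas
$$\mathbf{i}\times\mathbf{j} = \mathbf{k}, \quad \mathbf{j}\times\mathbf{k} = \mathbf{i}, \quad \mathbf{k}\times\mathbf{i} = \mathbf{j}$$
and
$$\mathbf{j}\times\mathbf{i} = -\mathbf{k}, \quad \mathbf{k}\times\mathbf{j} = -\mathbf{i}, \quad \mathbf{i}\times\mathbf{k} = -\mathbf{j}. \tag{1.32c}$$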
Among the examples of the cross product in mathematical physics are the
relation between linear momentum p and angular momentum L (defining
angular momentum),
$$\mathbf{L} = \mathbf{r}\times\mathbf{p}$$
and the relation between linear velocity v and angular velocity $\boldsymbol{\omega}$,
$$\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}.$$
Vectors v and p describe properties of the particle or physical system. However,
the position vector r is determined by the choice of the origin of the coordinates.
This means that $\boldsymbol{\omega}$ and L depend on the choice of the origin.
The familiar magnetic induction B is usually defined by the vector product
force equation$^1$
$$\mathbf{F}_M = q\mathbf{v}\times\mathbf{B}.$$
Here v is the velocity of the electric charge q and $\mathbf{F}_M$ is the resulting force on
the moving charge.
The cross product has an important geometrical interpretation which we
shall use in subsequent sections. In the parallelogram defined by A and B
(Fig. 1.12) $B\sin\theta$ is the height if A is taken as the length of the base. Then
$|\mathbf{A}\times\mathbf{B}| = AB\sin\theta$ is the area of the parallelogram. As a vector, A × B is the
area of the parallelogram defined by A and B, with the area vector normal to
the plane of the parallelogram. This suggests that area may be treated as a
vector quantity.
FIG. 1.12 Parallelogram representation of the vector product
Parenthetically, it might be noted that Eq. 1.32c and a modified Eq. 1.32b
form the starting point for the development of quaternions. Equation 1.32b
is replaced by i × i = j × j = k × k = -1.
An alternate definition of the vector product C = A × B consists in specifying
the components of C:
$$C_x = A_yB_z - A_zB_y, \qquad C_y = A_zB_x - A_xB_z, \qquad C_z = A_xB_y - A_yB_x, \tag{1.33}$$
or
$$C_i = A_jB_k - A_kB_j, \qquad i, j, k \text{ all different}, \tag{1.34}$$
and with cyclic permutation of the indices i, j, and k. The vector product C
may be conveniently represented by a determinant$^2$
$$\mathbf{C} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ A_x & A_y & A_z \\ B_x & B_y & B_z \end{vmatrix}. \tag{1.35}$$
Expansion of the determinant across the top row reproduces the three
components of C listed in Eq. 1.33.
1 The electric field E is assumed here to be zero.
Equation 1.32 might be called a geometric definition of the vector product.
Then Eq. 1.33 would be an algebraic definition.
EXAMPLE 1.4.1
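With A and B given in Example 1.1.1,
A = 6i + 4j + 3k,
B = 2i - 3j - 3k,
$$\mathbf{A}\times\mathbf{B} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 6 & 4 & 3 \\ 2 & -3 & -3 \end{vmatrix} = \mathbf{i}(-12 + 9) - \mathbf{j}(-18 - 6) + \mathbf{k}(-18 - 8) = -3\mathbf{i} + 24\mathbf{j} - 26\mathbf{k}.$$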
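The component definition, Eq. 1.33, translates directly into a few lines of code. The following Python sketch (the function name `cross` is an assumption) reproduces the result of this example and, by swapping the arguments, illustrates the anticommutation of Eq. 1.32a.

```python
def cross(a, b):
    """Components of C = A x B from Eq. 1.33."""
    return (
        a[1] * b[2] - a[2] * b[1],  # C_x = A_y B_z - A_z B_y
        a[2] * b[0] - a[0] * b[2],  # C_y = A_z B_x - A_x B_z
        a[0] * b[1] - a[1] * b[0],  # C_z = A_x B_y - A_y B_x
    )

A = (6, 4, 3)
B = (2, -3, -3)

C = cross(A, B)
C_swapped = cross(B, A)
print(C, C_swapped)
```

The first tuple should read (-3, 24, -26), and the second its negative.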
To show the equivalence of Eq. 1.32 and the component definition, Eq. 1.33,
let us form A • С and В • С, using Eq. 1.33. We have
$$\mathbf{A}\cdot\mathbf{C} = \mathbf{A}\cdot(\mathbf{A}\times\mathbf{B}) = A_x(A_yB_z - A_zB_y) + A_y(A_zB_x - A_xB_z) + A_z(A_xB_y - A_yB_x) = 0. \tag{1.36}$$
Similarly,
$$\mathbf{B}\cdot\mathbf{C} = \mathbf{B}\cdot(\mathbf{A}\times\mathbf{B}) = 0. \tag{1.37}$$
Equations 1.36 and 1.37 show that C is perpendicular to both A and B ($\cos\theta = 0$,
$\theta = \pm 90^\circ$) and therefore perpendicular to the plane they determine. The positive
direction is determined by considering special cases such as the unit vectors
$\mathbf{i}\times\mathbf{j} = \mathbf{k}$.
2 See Section 4.1 for a summary of determinants.
The magnitude is obtained from
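$$(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{A}\times\mathbf{B}) = A^2B^2 - (\mathbf{A}\cdot\mathbf{B})^2 = A^2B^2 - A^2B^2\cos^2\theta = A^2B^2\sin^2\theta. \tag{1.38}$$
Hence
$$C = AB\sin\theta. \tag{1.39}$$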
The big first step in Eq. 1.38 may be verified by expanding out in component
form, using Eq. 1.33 for A x В and Eq. 1.22 for the dot product. From Eqs.
1.36, 1.37, and 1.39 we see the equivalence of Eqs. 1.32 and 1.33, the two
definitions of vector product.
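The equivalence of the two definitions can be illustrated with the vectors of Example 1.4.1. The sketch below is a numerical check rather than a derivation, with helper names chosen here for illustration: it confirms that C from Eq. 1.33 is perpendicular to A and B (Eqs. 1.36 and 1.37) and that its magnitude agrees with $AB\sin\theta$ (Eq. 1.39).

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

A = (6.0, 4.0, 3.0)
B = (2.0, -3.0, -3.0)
C = cross(A, B)

a_len, b_len = math.sqrt(dot(A, A)), math.sqrt(dot(B, B))
cos_theta = dot(A, B) / (a_len * b_len)
sin_theta = math.sqrt(1.0 - cos_theta ** 2)  # theta lies between 0 and 180 degrees

magnitude_from_components = math.sqrt(dot(C, C))     # from Eq. 1.33
magnitude_from_geometry = a_len * b_len * sin_theta  # from Eq. 1.39
print(dot(A, C), dot(B, C), magnitude_from_components, magnitude_from_geometry)
```

Here C · C = 1261 = 61 · 22 - 81, which is the first step of Eq. 1.38 for these particular vectors.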
There still remains the problem of verifying that C = A × B is indeed a
vector; that is, it obeys Eq. 1.15, the vector transformation law. Starting in a
rotated (primed) system
$$\begin{aligned} C'_i &= A'_jB'_k - A'_kB'_j, \qquad i, j, \text{ and } k \text{ in cyclic order}, \\ &= \sum_l a_{jl}A_l \sum_m a_{km}B_m - \sum_l a_{kl}A_l \sum_m a_{jm}B_m \\ &= \sum_{l,m} (a_{jl}a_{km} - a_{kl}a_{jm})A_lB_m. \end{aligned} \tag{1.40}$$
The combination of direction cosines in parentheses vanishes for m = l. We
therefore have j and k taking on fixed values, dependent on the choice of i,
and six combinations of l and m. If i = 3, then j = 1, k = 2 (cyclic order), and
we have the following direction cosine combinations
$$a_{11}a_{22} - a_{21}a_{12} = a_{33}, \qquad a_{13}a_{21} - a_{23}a_{11} = a_{32}, \qquad a_{12}a_{23} - a_{22}a_{13} = a_{31} \tag{1.41}$$
and their negatives. Equations 1.41 are identities satisfied by the direction
cosines. They may be verified with the use of determinants and matrices
(see Exercise 4.3.3). Substituting back into Eq. 1.40,
$$\begin{aligned} C'_3 &= a_{33}A_1B_2 + a_{32}A_3B_1 + a_{31}A_2B_3 - a_{33}A_2B_1 - a_{32}A_1B_3 - a_{31}A_3B_2 \\ &= a_{31}C_1 + a_{32}C_2 + a_{33}C_3 = \sum_n a_{3n}C_n. \end{aligned} \tag{1.42}$$
By permuting indices to pick up $C'_1$ and $C'_2$, we see that Eq. 1.15 is satisfied
and C is indeed a vector. It should be mentioned here that this vector nature of
the cross product is an accident associated with the three-dimensional nature
of ordinary space.$^3$ It will be seen in Chapter 3 that the cross product may also
be treated as a second-rank antisymmetric tensor!
If we define a vector as an ordered triple of numbers (or functions) as in the
latter part of Section 1.2, then there is no problem identifying the cross product
as a vector. The cross-product operation maps the two triples A and B into a
third triple C which by definition is a vector.
We now have two ways of multiplying vectors; a third form appears in
Chapter 3. But what about division by a vector? It turns out that the ratio
B/A is not uniquely specified (Exercise 4.2.19) unless A and B are also required
to be parallel. Hence division of one vector by another is not defined.
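The statement that A × B obeys the vector transformation law, Eq. 1.15, lends itself to a numerical spot check. In this sketch the direction cosines are assumed to come from a rotation about z followed by one about x; the angles and helper names are arbitrary illustrative choices. The cross product is formed once from the rotated components A', B' and once by rotating C itself, and the two results are compared.

```python
import math

def rot_z(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(psi):
    c, s = math.cos(psi), math.sin(psi)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transform(a, v):
    """V'_i = sum_j a_ij V_j (Eq. 1.15)."""
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

a = matmul(rot_x(0.4), rot_z(1.1))  # arbitrary angles, chosen for illustration
A, B = [6.0, 4.0, 3.0], [2.0, -3.0, -3.0]

c_from_primed = cross(transform(a, A), transform(a, B))  # C' built from A' and B'
c_transformed = transform(a, cross(A, B))                # Eq. 1.15 applied to C
max_diff = max(abs(p - q) for p, q in zip(c_from_primed, c_transformed))
print(max_diff)
```

The difference should vanish to rounding error. The check uses proper rotations only; reflections of the axes are a separate question that this sketch does not address.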
EXERCISES
1.4.1 Two vectors A and B are given by
A = 2i + 4j + 6k,
B = 3i - 3j - 5k.
Compute the scalar and vector products A · B and A × B.
1.4.2 Show the equivalence of Eq. 1.32 and the component definition Eq. 1.33 by
expanding A, B, and C in C = A × B in cartesian components.
1.4.3 Starting with C = A + B, show that C × C leads to
A × B = -B × A.
1.4.4 Show that
(a) $(\mathbf{A} - \mathbf{B})\cdot(\mathbf{A} + \mathbf{B}) = A^2 - B^2$,
(b) $(\mathbf{A} - \mathbf{B})\times(\mathbf{A} + \mathbf{B}) = 2\mathbf{A}\times\mathbf{B}$.
The distributive laws needed here,
$$\mathbf{A}\cdot(\mathbf{B} + \mathbf{C}) = \mathbf{A}\cdot\mathbf{B} + \mathbf{A}\cdot\mathbf{C}$$
and
$$\mathbf{A}\times(\mathbf{B} + \mathbf{C}) = \mathbf{A}\times\mathbf{B} + \mathbf{A}\times\mathbf{C},$$
may easily be verified (if desired) by expansion in cartesian components.
1.4.5 Given the three vectors,
P = 3i + 2j - k,
Q = -6i - 4j + 2k,
R = i - 2j - k,
find two that are perpendicular and two that are parallel or antiparallel.
3 Specifically Eq. 1.41 holds only for three-dimensional space. Technically, it
is also possible to define a cross product in $R^7$, seven-dimensional space, but
the cross product turns out to have unacceptable (pathological) properties.
1.4.6 If $\mathbf{P} = \mathbf{i}P_x + \mathbf{j}P_y$ and $\mathbf{Q} = \mathbf{i}Q_x + \mathbf{j}Q_y$ are any two nonparallel (also nonantiparallel)
vectors in the xy-plane, show that P × Q is in the z-direction.
1.4.7 Prove that $(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{A}\times\mathbf{B}) = (AB)^2 - (\mathbf{A}\cdot\mathbf{B})^2$.
1.4.8 Using the vectors
$$\mathbf{P} = \mathbf{i}\cos\theta + \mathbf{j}\sin\theta,$$
$$\mathbf{Q} = \mathbf{i}\cos\varphi - \mathbf{j}\sin\varphi,$$
$$\mathbf{R} = \mathbf{i}\cos\varphi + \mathbf{j}\sin\varphi,$$
prove the familiar trigonometric identities
$$\sin(\theta + \varphi) = \sin\theta\cos\varphi + \cos\theta\sin\varphi,$$
$$\cos(\theta + \varphi) = \cos\theta\cos\varphi - \sin\theta\sin\varphi.$$
1.4.9 (a) Find a vector A that is perpendicular to
(b) What is A if, in addition to this requirement, we also demand that it have
unit magnitude?
1.4.10 If four vectors a, b, c, and d all lie in the same plane, show that
(a × b) × (c × d) = 0.
Hint. Consider the directions of the cross-product vectors.
1.4.11 The coordinates of the three vertices of a triangle are (2,1,5), (5,2,8), and (4,8,2).
Compute its area by vector methods.
1.4.12 The vertices of parallelogram ABCD are (1,0,0), (2,-1,0), (0,-1,1), and
(-1,0,1) in order. Calculate the vector areas of triangle ABD and of triangle
BCD. Are the two vector areas equal?
ANS. $\text{Area}_{ABD} = -\tfrac{1}{2}(\mathbf{i} + \mathbf{j} + 2\mathbf{k})$.
1.4.13 The origin and the three vectors A, B, and С (all of which start at the origin)
define a tetrahedron. Taking the outward direction as positive, calculate the total
vector area of the four tetrahedral surfaces.
Note. In Section 1.11 this result is generalized to any closed surface.
1.4.14 Find the sides and angles of the spherical triangle ABC defined by the three vectors
A = (1,0,0),
and
Each vector starts from the origin (Fig. 1.13).
FIG. 1.13 Spherical triangle
1.4.15 Derive the law of sines:
$$\frac{\sin\alpha}{|\mathbf{A}|} = \frac{\sin\beta}{|\mathbf{B}|} = \frac{\sin\gamma}{|\mathbf{C}|}.$$
1.4.16 The magnetic induction В is defined by the Lorentz force equation
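$$\mathbf{F} = q(\mathbf{v}\times\mathbf{B}).$$
Carrying out three experiments, we find that if
$$\mathbf{v} = \mathbf{i}, \qquad \frac{\mathbf{F}}{q} = 2\mathbf{k} - 4\mathbf{j},$$
$$\mathbf{v} = \mathbf{j}, \qquad \frac{\mathbf{F}}{q} = 4\mathbf{i} - \mathbf{k},$$
and
$$\mathbf{v} = \mathbf{k}, \qquad \frac{\mathbf{F}}{q} = \mathbf{j} - 2\mathbf{i}.$$
From the results of these three separate experiments calculate the magnetic
induction B.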
1.5 TRIPLE SCALAR PRODUCT, TRIPLE VECTOR PRODUCT
TRIPLE SCALAR PRODUCT
Sections 1.3 and 1.4 cover the two types of multiplication of interest here.
However, there are combinations of three vectors, A · (B × C) and A × (B × C),
which occur with sufficient frequency to deserve further attention. The
combination
$$\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C})$$
is known as the triple scalar product. B × C yields a vector which, dotted into
A, gives a scalar. We note that (A · B) × C represents a scalar crossed into a
vector, an operation that is not defined. Hence, if we agree to exclude this
undefined interpretation, the parentheses may be omitted and the triple scalar
product written A · B × C.
Using Eq. 1.33 for the cross product and Eq. 1.22 for the dot product, we
obtain
$$\begin{aligned} \mathbf{A}\cdot\mathbf{B}\times\mathbf{C} &= A_x(B_yC_z - B_zC_y) + A_y(B_zC_x - B_xC_z) + A_z(B_xC_y - B_yC_x) \\ &= \mathbf{B}\cdot\mathbf{C}\times\mathbf{A} = \mathbf{C}\cdot\mathbf{A}\times\mathbf{B} \\ &= -\mathbf{A}\cdot\mathbf{C}\times\mathbf{B} = -\mathbf{C}\cdot\mathbf{B}\times\mathbf{A} = -\mathbf{B}\cdot\mathbf{A}\times\mathbf{C}, \text{ and so on.} \end{aligned} \tag{1.43}$$
The high degree of symmetry present in the component expansion should be
noted. Every term contains the factors $A_i$, $B_j$, and $C_k$. If i, j, and k are in cyclic
order (x, y, z), the sign is positive. If the order is anticyclic, the sign is negative.
Further, the dot and the cross may be interchanged,
$$\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} = \mathbf{A}\times\mathbf{B}\cdot\mathbf{C}. \tag{1.44}$$
A convenient representation of the component expansion of Eq. 1.43 is provided
by the determinant
$$\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} = \begin{vmatrix} A_x & A_y & A_z \\ B_x & B_y & B_z \\ C_x & C_y & C_z \end{vmatrix}. \tag{1.45}$$
The rules for interchanging rows and columns of a determinant$^1$ provide an
immediate verification of the permutations listed in Eq. 1.43, whereas the
symmetry of A, B, and C in the determinant form suggests the relation given in
Eq. 1.44.
The triple products encountered in Section 1.4, which showed that A × B
was perpendicular to both A and B, were special cases of the general result
(Eq. 1.43).
The triple scalar product has a direct geometrical interpretation. The three
vectors A, B, and C may be interpreted as defining a parallelepiped (Fig. 1.14).
$$|\mathbf{B}\times\mathbf{C}| = BC\sin\theta = \text{area of parallelogram base}. \tag{1.46}$$
The direction, of course, is normal to the base. Dotting A into this means
multiplying the base area by the projection of A onto the normal, or base times height.
Therefore
$$\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} = \text{volume of parallelepiped defined by } \mathbf{A}, \mathbf{B}, \text{ and } \mathbf{C}.$$
FIG. 1.14 Parallelepiped representation of triple scalar product
EXAMPLE 1.5.1 A parallelepiped
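For
$$\mathbf{A} = \mathbf{i} + 2\mathbf{j} - \mathbf{k}, \qquad \mathbf{B} = \mathbf{j} + \mathbf{k}, \qquad \mathbf{C} = \mathbf{i} - \mathbf{j},$$
$$\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} = \begin{vmatrix} 1 & 2 & -1 \\ 0 & 1 & 1 \\ 1 & -1 & 0 \end{vmatrix}. \tag{1.47}$$
By expansion by minors across the top row the determinant equals
$$1(0 + 1) - 2(0 - 1) - 1(0 - 1) = 4.$$
1 See Section 4.1 for a summary of the properties of determinants.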
This is the volume of the parallelepiped defined by A, B, and C. The reader
should note that A · B × C may sometimes turn out to be negative! This
problem and its interpretation are considered in Chapter 3.
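The determinant of Eq. 1.47 can be evaluated in code by combining the dot and cross products. In the sketch below, B and C are read off from the second and third rows of that determinant, and the function names are illustrative assumptions. The permutation properties of Eq. 1.43 and the dot-cross interchange of Eq. 1.44 are exercised along the way.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def triple_scalar(a, b, c):
    """A . (B x C), the signed volume of the parallelepiped (Eq. 1.43)."""
    return dot(a, cross(b, c))

A = (1, 2, -1)
B = (0, 1, 1)   # second row of the determinant in Eq. 1.47
C = (1, -1, 0)  # third row of the determinant in Eq. 1.47

volume = triple_scalar(A, B, C)
cyclic = triple_scalar(B, C, A)      # cyclic permutation keeps the sign
anticyclic = triple_scalar(A, C, B)  # interchanging two vectors flips it
interchanged = dot(cross(A, B), C)   # A x B . C, as in Eq. 1.44
print(volume, cyclic, anticyclic, interchanged)
```

The anticyclic ordering gives a negative value, which is the sign behaviour the text warns about.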
The triple scalar product finds an interesting and important application
in the construction of a reciprocal crystal lattice. Let a, b, and c (not necessarily
mutually perpendicular) represent the vectors that define a crystal lattice. The
distance from one lattice point to another may then be written
$$\mathbf{r} = n_a\mathbf{a} + n_b\mathbf{b} + n_c\mathbf{c}, \tag{1.48}$$
with $n_a$, $n_b$, and $n_c$ taking on integral values. With these vectors we may form
$$\mathbf{a}' = \frac{\mathbf{b}\times\mathbf{c}}{\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}}, \qquad \mathbf{b}' = \frac{\mathbf{c}\times\mathbf{a}}{\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}}, \qquad \mathbf{c}' = \frac{\mathbf{a}\times\mathbf{b}}{\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}}. \tag{1.48a}$$
We see that a' is perpendicular to the plane containing b and c and has a
magnitude proportional to $a^{-1}$. In fact, we can readily show that
$$\mathbf{a}'\cdot\mathbf{a} = \mathbf{b}'\cdot\mathbf{b} = \mathbf{c}'\cdot\mathbf{c} = 1, \tag{1.48b}$$
whereas
$$\mathbf{a}'\cdot\mathbf{b} = \mathbf{a}'\cdot\mathbf{c} = \mathbf{b}'\cdot\mathbf{a} = \mathbf{b}'\cdot\mathbf{c} = \mathbf{c}'\cdot\mathbf{a} = \mathbf{c}'\cdot\mathbf{b} = 0. \tag{1.48c}$$
It is from Eqs. 1.48b and 1.48c that the name reciprocal lattice is derived. The
mathematical space in which this reciprocal lattice exists is sometimes called
a Fourier space, on the basis of relations to the Fourier analysis of Chapters
14 and 15. This reciprocal lattice is useful in problems involving the scattering
of waves from the various planes in a crystal. Further details may be found in
R. B. Leighton's Principles of Modern Physics, pp. 440-448 [New York:
McGraw-Hill (1959)]. We encounter the reciprocal lattice again in an analysis
of oblique coordinate systems, Section 4.4.
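The reciprocal vectors of Eq. 1.48a and the relations of Eqs. 1.48b and 1.48c can be explored with a short program. The following is a sketch only: the oblique lattice vectors are invented for illustration and the function name `reciprocal_vectors` is an assumption. It builds a', b', c' and tabulates all nine scalar products with a, b, c.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def reciprocal_vectors(a, b, c):
    """Return (a', b', c') from Eq. 1.48a; requires a . b x c != 0."""
    volume = dot(a, cross(b, c))
    if volume == 0:
        raise ValueError("a, b, c are coplanar; the reciprocal vectors are undefined")
    return tuple(
        tuple(x / volume for x in cross(p, q))
        for p, q in ((b, c), (c, a), (a, b))
    )

# An oblique lattice chosen purely for illustration (not from the text).
a, b, c = (1.0, 0.0, 0.0), (0.5, 1.0, 0.0), (0.2, 0.3, 2.0)
primed = reciprocal_vectors(a, b, c)

# Row i, column j holds (i-th primed vector) . (j-th lattice vector).
table = [[dot(p, q) for q in (a, b, c)] for p in primed]
for row in table:
    print(row)
```

The table should come out as the unit matrix to rounding error: ones on the diagonal (Eq. 1.48b) and zeros elsewhere (Eq. 1.48c).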
TRIPLE VECTOR PRODUCT
The second triple product of interest is A × (B × C). Here the parentheses
must be retained, as may be seen by considering the special case
$$\mathbf{i}\times(\mathbf{i}\times\mathbf{j}) = \mathbf{i}\times\mathbf{k} = -\mathbf{j} \tag{1.49}$$
but
$$(\mathbf{i}\times\mathbf{i})\times\mathbf{j} = 0.$$
The fact that the triple vector product is a vector follows from our discussion
of vector product. Also, we see that the direction of the resulting vector is
perpendicular to A and to B × C. The plane defined by B and C is perpendicular
to B × C and so A × (B × C) lies in this plane. Specifically, if B and C lie in
the xy-plane, then B × C is in the z-direction and A × (B × C) is back in the
xy-plane (Fig. 1.15). This means that A × (B × C) will be a linear combination
of B and C. We find that
$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = \mathbf{B}(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}(\mathbf{A}\cdot\mathbf{B}), \tag{1.50}$$
a relation sometimes known as the BAC-CAB rule. This result may be verified
by the direct though not very elegant method of expanding into cartesian
components (see Exercise 1.5.2).
FIG. 1.15 B and C are in the xy-plane. B × C is perpendicular to the xy-plane and is
shown here along the z-axis. Then A × (B × C) is perpendicular to the z-axis and
therefore is back in the xy-plane.
An alternate derivation using the Levi-Civita $\varepsilon_{ijk}$ of Section 3.4 is the topic
of Exercise 3.4.8.
The BAC-CAB rule is probably the single most important vector identity.
Because of its frequent use in problems and in future derivations, the rule
probably should be memorized.
It might be noted here that as vectors are independent of the coordinates
so a vector equation is independent of the particular coordinate system. The
coordinate system only determines the components. If the vector equation
can be established in cartesian coordinates, it is established and valid in any
of the coordinate systems to be introduced in Chapter 2.
EXAMPLE 1.5.2 A triple vector product
By using the three vectors given in Example 1.5.1, we obtain
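$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = (\mathbf{j} + \mathbf{k})(1 - 2) - (\mathbf{i} - \mathbf{j})(2 - 1) = -\mathbf{i} - \mathbf{k}$$
by Eq. 1.50. In detail,
$$\mathbf{B}\times\mathbf{C} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 0 & 1 & 1 \\ 1 & -1 & 0 \end{vmatrix} = \mathbf{i} + \mathbf{j} - \mathbf{k}$$
and
$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 2 & -1 \\ 1 & 1 & -1 \end{vmatrix} = -\mathbf{i} - \mathbf{k}.$$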
Other, more complicated, products may be simplified by using these forms
of the triple scalar and triple vector products.
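Both sides of the BAC-CAB rule, Eq. 1.50, are easily compared numerically. The sketch below uses the vectors of Examples 1.5.1 and 1.5.2, with B and C read off from the determinant rows shown there; the function name `bac_cab` is an assumption. It also repeats the special case of Eq. 1.49, which shows why the parentheses must be retained.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def bac_cab(a, b, c):
    """Right-hand side of Eq. 1.50: B(A . C) - C(A . B)."""
    ac, ab = dot(a, c), dot(a, b)
    return tuple(ac * bi - ab * ci for bi, ci in zip(b, c))

A, B, C = (1, 2, -1), (0, 1, 1), (1, -1, 0)
lhs = cross(A, cross(B, C))  # direct evaluation of A x (B x C)
rhs = bac_cab(A, B, C)

i, j = (1, 0, 0), (0, 1, 0)
left_grouping = cross(i, cross(i, j))   # i x (i x j)
right_grouping = cross(cross(i, i), j)  # (i x i) x j
print(lhs, rhs, left_grouping, right_grouping)
```

Both `lhs` and `rhs` should equal (-1, 0, -1), that is, -i - k, while the two groupings of i, i, j give -j and the null vector, respectively.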
EXERCISES
1.5.1 One vertex of a glass parallelepiped is at the origin. The three adjacent vertices
are at (3,0,0), (0,0,2), and (0,3,1). All lengths are in centimeters. Calculate the
number of cubic centimeters of glass in the parallelepiped by using the triple
scalar product.
1.5.2 Verify the expansion of the triple vector product
$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = \mathbf{B}(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}(\mathbf{A}\cdot\mathbf{B})$$
by direct expansion in cartesian coordinates.
1.5.3 Show that the first step in Eq. 1.38, which is
$$(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{A}\times\mathbf{B}) = A^2B^2 - (\mathbf{A}\cdot\mathbf{B})^2,$$
is consistent with the BAC-CAB rule for a triple vector product.
1.5.4 Given the three vectors A, B, and C,
A = i + j,
B = j + k,
C = i - k.
(a) Compute the triple scalar product, A · B × C. Noting that A = B + C, give
a geometric interpretation of your result for the triple scalar product.
(b) Compute A × (B × C).
1.5.5 The angular momentum L of a particle is given by L = r × p = mr × v, where p
is the linear momentum. With linear and angular velocity related by $\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$,
show that
$$\mathbf{L} = mr^2[\boldsymbol{\omega} - \mathbf{r}_0(\mathbf{r}_0\cdot\boldsymbol{\omega})].$$
Here $\mathbf{r}_0$ is a unit vector in the r direction. For $\mathbf{r}\cdot\boldsymbol{\omega} = 0$ this reduces to $\mathbf{L} = I\boldsymbol{\omega}$,
with the moment of inertia I given by $mr^2$. In Section 4.6 this result is generalized
to form an inertia tensor.
1.5.6 The kinetic energy of a single particle is given by $T = \tfrac{1}{2}mv^2$. For rotational
motion this becomes $\tfrac{1}{2}m(\boldsymbol{\omega}\times\mathbf{r})^2$. Show that
$$T = \tfrac{1}{2}m[r^2\omega^2 - (\mathbf{r}\cdot\boldsymbol{\omega})^2].$$
For $\mathbf{r}\cdot\boldsymbol{\omega} = 0$ this reduces to $T = \tfrac{1}{2}I\omega^2$ with the moment of inertia I given by $mr^2$.
1.5.7 Show that
a × (b × c) + b × (c × a) + c × (a × b) = 0.
1.5.8 A vector A is decomposed into a radial vector $\mathbf{A}_r$ and a tangential vector $\mathbf{A}_t$.
If $\mathbf{r}_0$ is a unit vector in the radial direction, show that
(a) $\mathbf{A}_r = \mathbf{r}_0(\mathbf{A}\cdot\mathbf{r}_0)$
and
(b) $\mathbf{A}_t = -\mathbf{r}_0\times(\mathbf{r}_0\times\mathbf{A})$.
1.5.9 Prove that a necessary and sufficient condition for the three (nonvanishing)
vectors A, B, and C to be coplanar is the vanishing of the triple scalar product
A · B × C = 0.
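1.5.10 Three vectors A, B, and C are given by
A = 3i - 2j + 2k,
B = 6i + 4j - 2k,
C = -3i - 2j - 4k.
Compute the values of A · B × C and A × (B × C), C × (A × B) and B ×
(C × A).
1.5.11 Vector D is a linear combination of three noncoplanar (and nonorthogonal)
vectors:
D = aA + bB + cC.
Show that the coefficients are given by a ratio of triple scalar products,
$$a = \frac{\mathbf{D}\cdot\mathbf{B}\times\mathbf{C}}{\mathbf{A}\cdot\mathbf{B}\times\mathbf{C}}, \text{ and so on.}$$
1.5.12 Show that
$$(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{C}\times\mathbf{D}) = (\mathbf{A}\cdot\mathbf{C})(\mathbf{B}\cdot\mathbf{D}) - (\mathbf{A}\cdot\mathbf{D})(\mathbf{B}\cdot\mathbf{C}).$$
1.5.13 Show that
$$(\mathbf{A}\times\mathbf{B})\times(\mathbf{C}\times\mathbf{D}) = (\mathbf{A}\cdot\mathbf{B}\times\mathbf{D})\mathbf{C} - (\mathbf{A}\cdot\mathbf{B}\times\mathbf{C})\mathbf{D}.$$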
1.5.14 For a spherical triangle such as pictured in Fig. 1.13 show that
sin Л sin В sin С
sin ВС sin С A sin А В
Here sin A is the sine of the included angle at A while ВС is the side opposite
(in radians).
Hint. Exercise 1.5.13 will be useful.
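1.5.15 Given
$$\mathbf{a}' = \frac{\mathbf{b} \times \mathbf{c}}{\mathbf{a} \cdot \mathbf{b} \times \mathbf{c}}, \quad \mathbf{b}' = \frac{\mathbf{c} \times \mathbf{a}}{\mathbf{a} \cdot \mathbf{b} \times \mathbf{c}}, \quad \mathbf{c}' = \frac{\mathbf{a} \times \mathbf{b}}{\mathbf{a} \cdot \mathbf{b} \times \mathbf{c}}, \quad \text{and} \quad \mathbf{a} \cdot \mathbf{b} \times \mathbf{c} \neq 0,$$
show that
(a) $\mathbf{x}' \cdot \mathbf{y} = \delta_{xy}$, $(\mathbf{x}, \mathbf{y} = \mathbf{a}, \mathbf{b}, \mathbf{c})$,
(b) $\mathbf{a}' \cdot \mathbf{b}' \times \mathbf{c}' = (\mathbf{a} \cdot \mathbf{b} \times \mathbf{c})^{-1}$,
(c) $\mathbf{a} = \dfrac{\mathbf{b}' \times \mathbf{c}'}{\mathbf{a}' \cdot \mathbf{b}' \times \mathbf{c}'}$.
1.5.16 If $\mathbf{x}' \cdot \mathbf{y} = \delta_{xy}$, $(\mathbf{x}, \mathbf{y} = \mathbf{a}, \mathbf{b}, \mathbf{c})$, prove that
$$\mathbf{a}' = \frac{\mathbf{b} \times \mathbf{c}}{\mathbf{a} \cdot \mathbf{b} \times \mathbf{c}}.$$
(This is the converse of Problem 1.5.15.)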
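1.5.17 Show that any vector $\mathbf{V}$ may be expressed in terms of the reciprocal vectors
$\mathbf{a}'$, $\mathbf{b}'$, $\mathbf{c}'$ by
$$\mathbf{V} = (\mathbf{V} \cdot \mathbf{a})\mathbf{a}' + (\mathbf{V} \cdot \mathbf{b})\mathbf{b}' + (\mathbf{V} \cdot \mathbf{c})\mathbf{c}'.$$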
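1.5.18 An electric charge $q_1$ moving with velocity $\mathbf{v}_1$ produces a magnetic induction
$\mathbf{B}$ given by
$$\mathbf{B} = \frac{\mu_0}{4\pi} q_1 \frac{\mathbf{v}_1 \times \mathbf{r}_0}{r^2} \quad \text{(mks units)},$$
where $\mathbf{r}_0$ points from $q_1$ to the point at which $\mathbf{B}$ is measured (Biot and Savart law).
(a) Show that the magnetic force on a second charge $q_2$, velocity $\mathbf{v}_2$, is given
by the triple vector product
$$\mathbf{F}_2 = \frac{\mu_0}{4\pi}\frac{q_1 q_2}{r^2}\,\mathbf{v}_2 \times (\mathbf{v}_1 \times \mathbf{r}_0).$$
(b) Write out the corresponding magnetic force $\mathbf{F}_1$ that $q_2$ exerts on $q_1$. Define
your unit radial vector. How do $\mathbf{F}_1$ and $\mathbf{F}_2$ compare?
(c) Calculate $\mathbf{F}_1$ and $\mathbf{F}_2$ for the case of $q_1$ and $q_2$ moving along parallel
trajectories side by side.
ANS. (b) $\mathbf{F}_1 = -\dfrac{\mu_0}{4\pi}\dfrac{q_1 q_2}{r^2}\,\mathbf{v}_1 \times (\mathbf{v}_2 \times \mathbf{r}_0)$.
In general, there is no simple relation between $\mathbf{F}_1$ and
$\mathbf{F}_2$. Specifically, Newton's third law, $\mathbf{F}_1 = -\mathbf{F}_2$, does not
hold.
(c) $\mathbf{F}_1 = \dfrac{\mu_0}{4\pi}\dfrac{q_1 q_2}{r^2}\,v^2\,\mathbf{r}_0 = -\mathbf{F}_2$.
Mutual attraction.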
1.6 GRADIENT, $\nabla$
Suppose that $\varphi(x, y, z)$ is a scalar point function, that is, a function whose
value depends on the values of the coordinates $(x, y, z)$. As a scalar, it must have
the same value at a given fixed point in space, independent of the rotation of
our coordinate system, or
$$\varphi'(x_1', x_2', x_3') = \varphi(x_1, x_2, x_3). \tag{1.51}$$
By differentiating with respect to $x_i'$ we obtain
$$\frac{\partial \varphi'(x_1', x_2', x_3')}{\partial x_i'} = \frac{\partial \varphi(x_1, x_2, x_3)}{\partial x_i'} = \sum_j \frac{\partial \varphi}{\partial x_j}\frac{\partial x_j}{\partial x_i'} = \sum_j a_{ij}\frac{\partial \varphi}{\partial x_j} \tag{1.52}$$
by the rules of partial differentiation and Eq. 1.16. But comparison with Eq.
1.17, the vector transformation law, now shows that we have constructed a
vector with components $\partial\varphi/\partial x_j$. This vector we label the gradient of $\varphi$.
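A convenient symbolism is
$$\nabla\varphi = \mathbf{i}\frac{\partial \varphi}{\partial x} + \mathbf{j}\frac{\partial \varphi}{\partial y} + \mathbf{k}\frac{\partial \varphi}{\partial z} \tag{1.53}$$
or
$$\nabla = \mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}. \tag{1.54}$$
$\nabla\varphi$ (or del $\varphi$) is our gradient of the scalar $\varphi$, whereas $\nabla$ (del) itself is a vector
differential operator (available to operate on or to differentiate a scalar $\varphi$). It
should be emphasized that this operator is a hybrid creature that must satisfy
both the laws for handling vectors and the laws of partial differentiation.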
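EXAMPLE 1.6.1 The Gradient of a Function of r.
Let us calculate the gradient of $f(r) = f(\sqrt{x^2 + y^2 + z^2})$.
$$\nabla f(r) = \mathbf{i}\frac{\partial f(r)}{\partial x} + \mathbf{j}\frac{\partial f(r)}{\partial y} + \mathbf{k}\frac{\partial f(r)}{\partial z}.$$
Now $f(r)$ depends on $x$ through the dependence of $r$ on $x$. Therefore¹
$$\frac{\partial f(r)}{\partial x} = \frac{df(r)}{dr} \cdot \frac{\partial r}{\partial x}.$$
From $r$ as a function of $x$, $y$, $z$
$$\frac{\partial r}{\partial x} = \frac{\partial (x^2 + y^2 + z^2)^{1/2}}{\partial x} = \frac{x}{(x^2 + y^2 + z^2)^{1/2}} = \frac{x}{r}.$$
Therefore
$$\frac{\partial f(r)}{\partial x} = \frac{df(r)}{dr} \cdot \frac{x}{r}.$$
Permuting coordinates ($x \to y$, $y \to z$, $z \to x$) to obtain the $y$ and $z$ derivatives,
we get
$$\nabla f(r) = (\mathbf{i}x + \mathbf{j}y + \mathbf{k}z)\frac{1}{r}\frac{df}{dr} = \frac{\mathbf{r}}{r}\frac{df}{dr} = \mathbf{r}_0\frac{df}{dr}.$$
Here $\mathbf{r}_0$ is a unit vector ($\mathbf{r}/r$) in the positive radial direction. The gradient of a
function of $r$ is a vector in the (positive or negative) radial direction. In Section
2.5 $\mathbf{r}_0$ is seen as one of the three orthonormal unit vectors of spherical polar
coordinates.
¹ This is a special case of the chain rule of partial differentiation:
$$\frac{\partial f(r, \theta, \varphi)}{\partial x} = \frac{\partial f}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial f}{\partial \theta}\frac{\partial \theta}{\partial x} + \frac{\partial f}{\partial \varphi}\frac{\partial \varphi}{\partial x}.$$
Here $\partial f/\partial\theta = \partial f/\partial\varphi = 0$, $\partial f/\partial r \to df/dr$.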
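The result of Example 1.6.1 can be checked numerically. The following is a minimal sketch, not part of the original text: the helper `grad`, the test function $f(r) = r^3$, and the sample point are all illustrative assumptions. It compares a central-difference gradient with $\mathbf{r}_0\,df/dr$.

```python
import math

def grad(f, p, h=1e-5):
    """Central-difference gradient of a scalar function f at p = (x, y, z)."""
    g = []
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        g.append((f(*plus) - f(*minus)) / (2 * h))
    return g

def f_of_r(x, y, z):
    # Assumed test function f(r) = r**3, so that df/dr = 3 r**2.
    r = math.sqrt(x * x + y * y + z * z)
    return r ** 3

p = (1.0, 2.0, 3.0)                      # arbitrary sample point
r = math.sqrt(sum(c * c for c in p))
numeric = grad(f_of_r, p)
analytic = [3 * r ** 2 * c / r for c in p]   # r0 * df/dr with r0 = r / r
```

The two lists should agree to within the finite-difference error, and both are parallel to the position vector, as the example states.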
A GEOMETRICAL INTERPRETATION
One immediate application of $\nabla\varphi$ is to dot it into an increment of length
$$d\mathbf{r} = \mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz. \tag{1.55}$$
Thus we obtain
$$(\nabla\varphi) \cdot d\mathbf{r} = \frac{\partial \varphi}{\partial x}dx + \frac{\partial \varphi}{\partial y}dy + \frac{\partial \varphi}{\partial z}dz = d\varphi, \tag{1.56}$$
the change in the scalar function $\varphi$ corresponding to a change in position $d\mathbf{r}$.
Now consider $P$ and $Q$ to be two points on a surface $\varphi(x, y, z) = C$, a constant.
These points are chosen so that $Q$ is a distance $d\mathbf{r}$ from $P$. Then moving from
$P$ to $Q$, the change in $\varphi(x, y, z) = C$ is given by
$$d\varphi = (\nabla\varphi) \cdot d\mathbf{r} = 0, \tag{1.57}$$
since we stay on the surface $\varphi(x, y, z) = C$. This shows that $\nabla\varphi$ is perpendicular
to $d\mathbf{r}$. Since $d\mathbf{r}$ may have any direction from $P$ as long as it stays in the surface
$\varphi$, point $Q$ being restricted to the surface, but having arbitrary direction, $\nabla\varphi$ is
seen as normal to the surface $\varphi$ = constant (Fig. 1.16).
FIG. 1.16 The length increment $d\mathbf{r}$ is required to stay on the surface $\varphi(x, y, z) = C$.
If we now permit $d\mathbf{r}$ to take us from one surface $\varphi = C_1$ to an adjacent
surface $\varphi = C_2$ (Fig. 1.17a),
$$d\varphi = C_2 - C_1 = \Delta C = (\nabla\varphi) \cdot d\mathbf{r}. \tag{1.58}$$
For a given $d\varphi$, $|d\mathbf{r}|$ is a minimum when it is chosen parallel to $\nabla\varphi$ ($\cos\theta = 1$);
or, for a given $|d\mathbf{r}|$, the change in the scalar function $\varphi$ is maximized by choosing
$d\mathbf{r}$ parallel to $\nabla\varphi$. This identifies $\nabla\varphi$ as a vector having the direction of the
maximum space rate of change of $\varphi$, an identification that will be useful in
Chapter 2 when we consider noncartesian coordinate systems.
This identification of $\nabla\varphi$ may also be developed by using the calculus of
variations subject to a constraint, Exercise 17.6.9.
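The statement that $\nabla\varphi$ points along the maximum space rate of change can be illustrated by brute force. The sketch below is an illustration only: the field $\varphi = x^2 + 2y^2 + 3z^2$, the sample point, and the 2-degree direction grid are assumptions, not taken from the text. It scans unit vectors $\mathbf{u}$ and records the rate $(\nabla\varphi) \cdot \mathbf{u}$.

```python
import math

def grad_phi(x, y, z):
    # Analytic gradient of the assumed field phi = x**2 + 2*y**2 + 3*z**2.
    return (2 * x, 4 * y, 6 * z)

p = (1.0, 1.0, 1.0)
g = grad_phi(*p)
g_norm = math.sqrt(sum(c * c for c in g))
g_hat = tuple(c / g_norm for c in g)

# Scan unit directions on a 2-degree grid and keep the one with the
# largest rate of change d(phi)/ds = grad(phi) . u.
best_rate, best_dir = -float("inf"), None
for i in range(0, 181, 2):
    theta = math.radians(i)
    for j in range(0, 360, 2):
        lam = math.radians(j)
        u = (math.sin(theta) * math.cos(lam),
             math.sin(theta) * math.sin(lam),
             math.cos(theta))
        rate = sum(a * b for a, b in zip(g, u))
        if rate > best_rate:
            best_rate, best_dir = rate, u

alignment = sum(a * b for a, b in zip(best_dir, g_hat))  # cosine of the angle to grad(phi)
```

Up to the grid resolution, the best direction coincides with $\nabla\varphi/|\nabla\varphi|$ and the best rate approaches $|\nabla\varphi|$ from below.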
EXAMPLE 1.6.2
As a specific example of the foregoing, and as an extension of Example 1.6.1,
we consider the surfaces consisting of concentric spherical shells, Fig. 1.17b.
We have
$$\varphi(x, y, z) = (x^2 + y^2 + z^2)^{1/2} = r_i = C_i,$$
where $r_i$ is the radius equal to $C_i$, our constant. $\Delta C = \Delta\varphi = \Delta r_i$, the distance
between two shells. From Example 1.6.1
$$\nabla\varphi(r) = \mathbf{r}_0\frac{d\varphi(r)}{dr} = \mathbf{r}_0.$$
FIG. 1.17a Gradient
FIG. 1.17b Gradient for $\varphi(x, y, z) = (x^2 + y^2 + z^2)^{1/2}$, spherical shells: $(x^2 + y^2 + z^2)^{1/2} = r_2 = C_2$, $(x^2 + y^2 + z^2)^{1/2} = r_1 = C_1$
The gradient is in the radial direction and is normal to the spherical surface
$\varphi = C$.
The gradient of a scalar is of extreme importance in physics in expressing
the relation between a force field and a potential field:
$$\text{force} = -\nabla(\text{potential}). \tag{1.59}$$
This is illustrated by both gravitational and electrostatic fields, among others.
Readers should note that the minus sign in Eq. 1.59 results in water flowing
downhill rather than uphill! We reconsider Eq. 1.59 in a broader context in
Section 1.13.
EXERCISES
1.6.1 If $S(x, y, z) = (x^2 + y^2 + z^2)^{-3/2}$, find
(a) $\nabla S$ at the point $(1, 2, 3)$;
(b) the magnitude of the gradient of $S$, $|\nabla S|$ at $(1, 2, 3)$;
and
(c) the direction cosines of $\nabla S$ at $(1, 2, 3)$.
1.6.2 (a) Find a unit vector perpendicular to the surface
$$x^2 + y^2 + z^2 = 3$$
at the point $(1, 1, 1)$.
(b) Derive the equation of the plane tangent to the surface at $(1, 1, 1)$.
ANS. (a) $(\mathbf{i} + \mathbf{j} + \mathbf{k})/\sqrt{3}$.
(b) $x + y + z = 3$.
1.6.3 Given a vector $\mathbf{r}_{12} = \mathbf{i}(x_1 - x_2) + \mathbf{j}(y_1 - y_2) + \mathbf{k}(z_1 - z_2)$, show that $\nabla_1 r_{12}$
(gradient with respect to $x_1$, $y_1$, and $z_1$ of the magnitude $r_{12}$) is a unit vector in the
direction of $\mathbf{r}_{12}$.
1.6.4 If a vector function $\mathbf{F}$ depends on both space coordinates $(x, y, z)$ and time $t$, show
that
$$d\mathbf{F} = (d\mathbf{r} \cdot \nabla)\mathbf{F} + \frac{\partial \mathbf{F}}{\partial t}dt.$$
1.6.5 Show that $\nabla(uv) = v\nabla u + u\nabla v$, where $u$ and $v$ are differentiable scalar functions
of $x$, $y$, and $z$.
1.6.6 (a) Show that a necessary and sufficient condition that $u(x, y, z)$ and $v(x, y, z)$
are related by some function $f(u, v) = 0$ is that $(\nabla u) \times (\nabla v) = 0$.
(b) If $u = u(x, y)$ and $v = v(x, y)$, show that the condition $(\nabla u) \times (\nabla v) = 0$ leads
to the two-dimensional Jacobian
$$J\left(\frac{u, v}{x, y}\right) = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[1ex] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{vmatrix} = 0.$$
The functions $u$ and $v$ are assumed differentiable.
1.7 DIVERGENCE, $\nabla\cdot$
Differentiating a vector function is a simple extension of differentiating
scalar quantities. Suppose $\mathbf{r}(t)$ describes the position of a satellite at some time
$t$. Then, for differentiation with respect to time,
$$\frac{d\mathbf{r}(t)}{dt} = \lim_{\Delta t \to 0}\frac{\mathbf{r}(t + \Delta t) - \mathbf{r}(t)}{\Delta t} = \mathbf{v}, \quad \text{linear velocity.}$$
FIG. 1.18 Differentiation of a vector
Graphically, we again have the slope of a curve, orbit, or trajectory, as shown
in Fig. 1.18.
If we resolve r(t) into its cartesian components, dr/dt always reduces directly
to a vector sum of not more than three (for three-dimensional space) scalar
derivatives. In other coordinate systems (Chapter 2) the situation is a little
more complicated, for the unit vectors are no longer constant in direction.
Differentiation with respect to the space coordinates is handled in the same
way as differentiation with respect to time, as seen in the following paragraphs.
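In Section 1.6 $\nabla$ was defined as a vector operator. Now, paying careful
attention to both its vector and its differential properties, we let it operate on
a vector. First, as a vector we dot it into a second vector to obtain
$$\nabla \cdot \mathbf{V} = \frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z}, \tag{1.60}$$
known as the divergence of $\mathbf{V}$. This is a scalar, as discussed in Section 1.3.
EXAMPLE 1.7.1
Calculate $\nabla \cdot \mathbf{r}$.
$$\nabla \cdot \mathbf{r} = \left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right) \cdot (\mathbf{i}x + \mathbf{j}y + \mathbf{k}z) = \frac{\partial x}{\partial x} + \frac{\partial y}{\partial y} + \frac{\partial z}{\partial z},$$
or
$$\nabla \cdot \mathbf{r} = 3.$$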
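EXAMPLE 1.7.2
Generalizing Example 1.7.1,
$$\nabla \cdot (\mathbf{r}f(r)) = \frac{\partial}{\partial x}[x f(r)] + \frac{\partial}{\partial y}[y f(r)] + \frac{\partial}{\partial z}[z f(r)]$$
$$= 3f(r) + \frac{x^2}{r}\frac{df}{dr} + \frac{y^2}{r}\frac{df}{dr} + \frac{z^2}{r}\frac{df}{dr} = 3f(r) + r\frac{df}{dr}.$$
The manipulation of the partial derivatives leading to the second equation in
Example 1.7.2 is discussed in Example 1.6.1.
In particular, if $f(r) = r^{n-1}$,
$$\nabla \cdot (\mathbf{r}\,r^{n-1}) = \nabla \cdot (\mathbf{r}_0 r^n) = 3r^{n-1} + (n - 1)r^{n-1} = (n + 2)r^{n-1}. \tag{1.60a}$$
This divergence vanishes for $n = -2$, an important fact in Section 1.14.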
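Equation 1.60a lends itself to a quick numerical check. The sketch below is illustrative only: the helpers `divergence` and `radial_field`, the sample point, and the chosen values of $n$ are assumptions. It evaluates a central-difference divergence of $\mathbf{r}\,r^{n-1}$ and pairs it with $(n + 2)r^{n-1}$.

```python
import math

def divergence(F, p, h=1e-5):
    """Central-difference divergence of a vector field F at p = (x, y, z)."""
    total = 0.0
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        total += (F(*plus)[i] - F(*minus)[i]) / (2 * h)
    return total

def radial_field(n):
    """Return the field r * r**(n-1), i.e. r0 * r**n."""
    def F(x, y, z):
        r = math.sqrt(x * x + y * y + z * z)
        s = r ** (n - 1)
        return (x * s, y * s, z * s)
    return F

p = (1.0, 2.0, 3.0)          # arbitrary point away from the origin
r = math.sqrt(14.0)
# Map n -> (numerical divergence, (n + 2) r**(n-1) from Eq. 1.60a)
results = {n: (divergence(radial_field(n), p), (n + 2) * r ** (n - 1))
           for n in (-2, 0, 1, 3)}
```

For $n = -2$ both entries are zero to within rounding, which is the inverse-square case singled out in the text; the check says nothing about the origin itself, where the field is singular.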
A PHYSICAL INTERPRETATION
To develop a feeling for the physical significance of the divergence, consider
$\nabla \cdot (\rho\mathbf{v})$ with $\mathbf{v}(x, y, z)$, the velocity of a compressible fluid and $\rho(x, y, z)$, its
density at point $(x, y, z)$. If we consider a small volume $dx\,dy\,dz$ (Fig. 1.19), the
fluid flowing into this volume per unit time (positive $x$-direction) through the
face $EFGH$ is (rate of flow in)$_{EFGH} = \rho v_x|_{x=0}\,dy\,dz$. The components of the flow
$\rho v_y$ and $\rho v_z$ tangential to this face contribute nothing to the flow through this
face. The rate of flow out (still positive $x$-direction) through face $ABCD$ is
$\rho v_x|_{x=dx}\,dy\,dz$. To compare these flows and to find the net flow out, we expand
this last result in a Maclaurin series¹, Section 5.6. This yields
$$(\text{rate of flow out})_{ABCD} = \rho v_x|_{x=dx}\,dy\,dz = \left[\rho v_x + \frac{\partial}{\partial x}(\rho v_x)\,dx\right]_{x=0} dy\,dz.$$
Here the derivative term is a first correction term allowing for the possibility
of nonuniform density or velocity or both². The zero-order term $\rho v_x|_{x=0}$
(corresponding to uniform flow) cancels out.
¹ A Maclaurin expansion for a single variable is given by Eq. 5.88, Section 5.6.
Here we have the increment $x$ of Eq. 5.88 replaced by $dx$. We show a partial
derivative with respect to $x$ since $\rho v_x$ may also depend on $y$ and $z$.
² Strictly speaking, $\rho v_x$ is averaged over face $EFGH$ and the expression
$\rho v_x + (\partial/\partial x)(\rho v_x)\,dx$ is similarly averaged over face $ABCD$. Using an
arbitrarily small differential volume, we find that the averages reduce to the values
employed here.
FIG. 1.19 Differential rectangular parallelepiped (in first or positive octant)
$$\text{Net rate of flow out}\,|_x = \frac{\partial}{\partial x}(\rho v_x)\,dx\,dy\,dz.$$
Equivalently, we can arrive at this result by
$$\lim_{\Delta x \to 0}\frac{\rho v_x(\Delta x, 0, 0) - \rho v_x(0, 0, 0)}{\Delta x} \equiv \left.\frac{\partial[\rho v_x(x, y, z)]}{\partial x}\right|_{0,0,0}.$$
Now the $x$-axis is not entitled to any preferred treatment. The preceding result
for the two faces perpendicular to the $x$-axis must hold for the two faces
perpendicular to the $y$-axis, with $x$ replaced by $y$ and the corresponding changes
for $y$ and $z$: $y \to z$, $z \to x$. This is a cyclic permutation of the coordinates. A
further cyclic permutation yields the result for the remaining two faces of our
parallelepiped. Adding the net rate of flow out for all three pairs of surfaces of
our volume element, we have
$$\text{net flow out (per unit time)} = \left[\frac{\partial}{\partial x}(\rho v_x) + \frac{\partial}{\partial y}(\rho v_y) + \frac{\partial}{\partial z}(\rho v_z)\right]dx\,dy\,dz = \nabla \cdot (\rho\mathbf{v})\,dx\,dy\,dz. \tag{1.61}$$
Therefore the net flow of our compressible fluid out of the volume element
$dx\,dy\,dz$ per unit volume per unit time is $\nabla \cdot (\rho\mathbf{v})$. Hence the name divergence. A
direct application is in the continuity equation
$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho\mathbf{v}) = 0, \tag{1.62}$$
which simply states that a net flow out of the volume results in a decreased
density inside the volume. Note that in Eq. 1.62 $\rho$ is considered to be a possible
function of time as well as of space: $\rho(x, y, z, t)$. The divergence appears in a
wide variety of physical problems, ranging from a probability current density
in quantum mechanics to neutron leakage in a nuclear reactor.
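The combination $\nabla \cdot (f\mathbf{V})$, in which $f$ is a scalar function and $\mathbf{V}$ a vector
function, may be written
$$\nabla \cdot (f\mathbf{V}) = \frac{\partial}{\partial x}(fV_x) + \frac{\partial}{\partial y}(fV_y) + \frac{\partial}{\partial z}(fV_z)$$
$$= \frac{\partial f}{\partial x}V_x + f\frac{\partial V_x}{\partial x} + \frac{\partial f}{\partial y}V_y + f\frac{\partial V_y}{\partial y} + \frac{\partial f}{\partial z}V_z + f\frac{\partial V_z}{\partial z} = (\nabla f) \cdot \mathbf{V} + f\nabla \cdot \mathbf{V}, \tag{1.62a}$$
which is just what we would expect for the derivative of a product. Notice that
$\nabla$ as a differential operator differentiates both $f$ and $\mathbf{V}$; as a vector it is dotted
into $\mathbf{V}$ (in each term).
If we have the special case of the divergence of a vector vanishing,
$$\nabla \cdot \mathbf{B} = 0, \tag{1.63}$$
the vector $\mathbf{B}$ is said to be solenoidal, the term coming from the example in
which $\mathbf{B}$ is the magnetic induction and Eq. 1.63 appears as one of Maxwell's
equations. When a vector is solenoidal it may be written as the curl of another
vector known as the vector potential. In Section 1.13 we shall calculate such a
vector potential.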
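EXERCISES
1.7.1 For a particle moving in a circular orbit $\mathbf{r} = \mathbf{i}\,r\cos\omega t + \mathbf{j}\,r\sin\omega t$,
(a) evaluate $\mathbf{r} \times \dot{\mathbf{r}}$.
(b) Show that $\ddot{\mathbf{r}} + \omega^2\mathbf{r} = 0$.
The radius $r$ and the angular velocity $\omega$ are constant.
ANS. (a) $\mathbf{k}\,\omega r^2$.
Note. $\dot{\mathbf{r}} = d\mathbf{r}/dt$, $\ddot{\mathbf{r}} = d^2\mathbf{r}/dt^2$.
1.7.2 Vector $\mathbf{A}$ satisfies the vector transformation law, Eq. 1.15. Show directly that its
time derivative $d\mathbf{A}/dt$ also satisfies Eq. 1.15 and is therefore a vector.
1.7.3 Show, by differentiating components, that
(a) $\dfrac{d}{dt}(\mathbf{A} \cdot \mathbf{B}) = \dfrac{d\mathbf{A}}{dt} \cdot \mathbf{B} + \mathbf{A} \cdot \dfrac{d\mathbf{B}}{dt}$,
(b) $\dfrac{d}{dt}(\mathbf{A} \times \mathbf{B}) = \dfrac{d\mathbf{A}}{dt} \times \mathbf{B} + \mathbf{A} \times \dfrac{d\mathbf{B}}{dt}$,
just like the derivative of the product of two algebraic functions.
1.7.4 In Chapter 2 it will be seen that the unit vectors in noncartesian coordinate systems
are usually functions of the coordinate variables, $\mathbf{e}_i = \mathbf{e}_i(q_1, q_2, q_3)$ but $|\mathbf{e}_i| = 1$.
Show that either $\partial\mathbf{e}_i/\partial q_j = 0$ or $\partial\mathbf{e}_i/\partial q_j$ is orthogonal to $\mathbf{e}_i$.
1.7.5 Prove $\nabla \cdot (\mathbf{a} \times \mathbf{b}) = \mathbf{b} \cdot \nabla \times \mathbf{a} - \mathbf{a} \cdot \nabla \times \mathbf{b}$.
Hint. Treat as a triple scalar product.
1.7.6 The electrostatic field of a point charge $q$ is
$$\mathbf{E} = \frac{q}{4\pi\varepsilon_0}\frac{\mathbf{r}_0}{r^2}.$$
Calculate the divergence of $\mathbf{E}$. What happens at the origin?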
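1.8 CURL, $\nabla\times$
Another possible operation with the vector operator $\nabla$ is to cross it into a
vector. We obtain
$$\nabla \times \mathbf{V} = \mathbf{i}\left(\frac{\partial V_z}{\partial y} - \frac{\partial V_y}{\partial z}\right) + \mathbf{j}\left(\frac{\partial V_x}{\partial z} - \frac{\partial V_z}{\partial x}\right) + \mathbf{k}\left(\frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y}\right) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ V_x & V_y & V_z \end{vmatrix}, \tag{1.64}$$
which is called the curl of $\mathbf{V}$. In expanding this determinant form or in any
operation with $\nabla$, we must consider the derivative nature of $\nabla$. Specifically,
$\mathbf{V} \times \nabla$ is defined only as an operator, another vector differential operator. It is
certainly not equal, in general, to $-\nabla \times \mathbf{V}$.¹ In the case of Eq. 1.64 the
determinant must be expanded from the top down so that we get the derivatives as
shown in the middle portion of Eq. 1.64. If $\nabla$ is crossed into the product of a
scalar and a vector, we can show
$$\nabla \times (f\mathbf{V})|_x = \left[\frac{\partial}{\partial y}(fV_z) - \frac{\partial}{\partial z}(fV_y)\right] = f\frac{\partial V_z}{\partial y} + \frac{\partial f}{\partial y}V_z - f\frac{\partial V_y}{\partial z} - \frac{\partial f}{\partial z}V_y = f\,\nabla \times \mathbf{V}|_x + (\nabla f) \times \mathbf{V}|_x. \tag{1.65}$$
If we permute the coordinates $x \to y$, $y \to z$, $z \to x$ to pick up the $y$-component
and then permute them a second time to pick up the $z$-component,
$$\nabla \times (f\mathbf{V}) = f\,\nabla \times \mathbf{V} + (\nabla f) \times \mathbf{V}, \tag{1.66}$$
which is the vector product analog of Eq. 1.62a. Again, as a differential operator
$\nabla$ differentiates both $f$ and $\mathbf{V}$. As a vector it is crossed into $\mathbf{V}$ (in each term).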
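¹ In this same spirit, if $\mathbf{A}$ is a differential operator, it is not necessarily true
that $\mathbf{A} \times \mathbf{A} = 0$. Specifically, for the quantum mechanical angular momentum
operator, $\mathbf{L} = -i(\mathbf{r} \times \nabla)$, we find that $\mathbf{L} \times \mathbf{L} = i\mathbf{L}$.
EXAMPLE 1.8.1
Calculate $\nabla \times \mathbf{r}f(r)$.
By Eq. 1.66
$$\nabla \times \mathbf{r}f(r) = f(r)\,\nabla \times \mathbf{r} + [\nabla f(r)] \times \mathbf{r}. \tag{1.67}$$
First,
$$\nabla \times \mathbf{r} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ x & y & z \end{vmatrix} = 0. \tag{1.68}$$
Second, using $\nabla f(r) = \mathbf{r}_0(df/dr)$ (Example 1.6.1), we obtain
$$\nabla \times \mathbf{r}f(r) = \frac{df}{dr}\,\mathbf{r}_0 \times \mathbf{r} = 0. \tag{1.69}$$
The vector product vanishes, since $\mathbf{r} = \mathbf{r}_0 r$ and $\mathbf{r}_0 \times \mathbf{r}_0 = 0$.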
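A numerical check of Example 1.8.1 is sketched below. It is an illustration under stated assumptions: the helper `curl`, the choice $f(r) = e^{-r}$, and the contrasting rigid-rotation field $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$ with $\boldsymbol{\omega} = \mathbf{k}$ are not taken from the example itself.

```python
import math

def curl(F, p, h=1e-5):
    """Central-difference curl of a vector field F at p = (x, y, z)."""
    def d(i, j):
        # Partial derivative of component i with respect to coordinate j.
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        return (F(*plus)[i] - F(*minus)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def central(x, y, z):
    # r f(r) with the assumed choice f(r) = exp(-r).
    r = math.sqrt(x * x + y * y + z * z)
    f = math.exp(-r)
    return (x * f, y * f, z * f)

def rotation(x, y, z):
    # v = omega x r with omega = k, i.e. v = (-y, x, 0); its curl is 2 k.
    return (-y, x, 0.0)

p = (1.0, 2.0, 3.0)
curl_central = curl(central, p)
curl_rotation = curl(rotation, p)
```

The first result is zero to within rounding, in line with Eq. 1.69, while the rotating field gives $(0, 0, 2)$, so the helper does distinguish rotational from irrotational fields.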
To develop a better feeling for the physical significance of the curl, we
consider the circulation of fluid around a differential loop in the $xy$-plane,
Fig. 1.20.
Although the circulation is technically given by a vector line integral $\int \mathbf{V} \cdot d\boldsymbol{\lambda}$
(Section 1.10), we can set up the equivalent scalar integrals here. Let us take
the circulation to be
$$\text{circulation}_{1234} = \int_1 V_x(x, y)\,d\lambda_x + \int_2 V_y(x, y)\,d\lambda_y + \int_3 V_x(x, y)\,d\lambda_x + \int_4 V_y(x, y)\,d\lambda_y. \tag{1.70}$$
The numbers 1, 2, 3, and 4 refer to the numbered line segments in Fig. 1.20.
In the first integral $d\lambda_x = +dx$ but in the third integral $d\lambda_x = -dx$ because
the third line segment is traversed in the negative $x$-direction. Similarly, $d\lambda_y =
+dy$ for the second integral, $-dy$ for the fourth. Next, the integrands are
referred to the point $(x_0, y_0)$ with a Taylor expansion² taking into account the
displacement of line segment 3 from 1 and 2 from 4. For our differential line
segments this leads to
$$\text{circulation}_{1234} = V_x(x_0, y_0)\,dx + \left[V_y(x_0, y_0) + \frac{\partial V_y}{\partial x}dx\right]dy + \left[V_x(x_0, y_0) + \frac{\partial V_x}{\partial y}dy\right](-dx) + V_y(x_0, y_0)(-dy)$$
$$= \left(\frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y}\right)dx\,dy. \tag{1.71}$$
Dividing by $dx\,dy$, we have
$$\text{circulation per unit area} = \nabla \times \mathbf{V}|_z. \tag{1.72}$$
The circulation³ about our differential area in the $xy$-plane is given by the
$z$-component of $\nabla \times \mathbf{V}$. In principle, the curl, $\nabla \times \mathbf{V}$ at $(x_0, y_0)$, could be
determined by inserting a (differential) paddle wheel into the moving fluid at
point $(x_0, y_0)$. The rotation of the little paddle wheel would be a measure of the
curl.
FIG. 1.20 Circulation around a differential loop
² $V_y(x_0 + dx, y_0) = V_y(x_0, y_0) + \left(\dfrac{\partial V_y}{\partial x}\right)_{x_0, y_0} dx + \cdots$.
The higher-order terms will drop out in the limit as $dx \to 0$. A correction term
for the variation of $V_y$ with $y$ is canceled by the corresponding term in the
fourth integral (see Section 5.6).
We shall use the result, Eq. 1.71, in Section 1.13 to derive Stokes's theorem.
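The relation between circulation per unit area and the $z$-component of the curl (Eq. 1.72) can be tried out on a small square loop. The sketch below is illustrative: the field $\mathbf{V} = \mathbf{i}\,xy + \mathbf{j}\,x^2$ (for which $\partial V_y/\partial x - \partial V_x/\partial y = x$), the corner point, and the loop size are assumptions. The four terms in the loop follow the segment numbering 1 to 4 used above.

```python
def V(x, y):
    # Assumed 2-D field; dVy/dx - dVx/dy = 2x - x = x.
    return (x * y, x * x)

def circulation(x0, y0, d, n=100):
    """Line integral of V counterclockwise around the square with lower-left
    corner (x0, y0) and side d, using the midpoint rule on each side."""
    step = d / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * step
        total += V(x0 + s, y0)[0] * step        # segment 1: +x along y = y0
        total += V(x0 + d, y0 + s)[1] * step    # segment 2: +y along x = x0 + d
        total -= V(x0 + s, y0 + d)[0] * step    # segment 3: -x along y = y0 + d
        total -= V(x0, y0 + s)[1] * step        # segment 4: -y along x = x0
    return total

x0, y0, d = 1.5, 0.7, 1.0e-3
per_unit_area = circulation(x0, y0, d) / (d * d)
curl_z = x0    # z-component of curl V evaluated at (x0, y0)
```

For this field the ratio equals $x_0 + d/2$, so it tends to the curl component as the loop shrinks; the residual $d/2$ is the kind of higher-order term that drops out in the limit.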
Whenever the curl of a vector $\mathbf{V}$ vanishes,
$$\nabla \times \mathbf{V} = 0, \tag{1.73}$$
$\mathbf{V}$ is labeled irrotational. The most important physical examples of irrotational
vectors are the gravitational and electrostatic forces. In each case
$$\mathbf{V} = C\frac{\mathbf{r}_0}{r^2} = C\frac{\mathbf{r}}{r^3}, \tag{1.74}$$
where $C$ is a constant and $\mathbf{r}_0$ is the unit vector in the outward radial direction.
For the gravitational case we have $C = -Gm_1m_2$, given by Newton's law of
universal gravitation. If $C = q_1q_2/4\pi\varepsilon_0$, we have Coulomb's law of
electrostatics (mks units). The force $\mathbf{V}$ given in Eq. 1.74 may be shown to be irrotational
by direct expansion into cartesian components as we did in Example 1.8.1.
Another approach is developed in Chapter 2, in which we express $\nabla\times$, the
curl, in terms of spherical polar coordinates. In Section 1.13 we shall see that
whenever a vector is irrotational, the vector may be written as the (negative)
gradient of a scalar potential. In Section 1.15 we shall prove that a vector may
be resolved into an irrotational part and a solenoidal part (subject to conditions
at infinity). In terms of the electromagnetic field this corresponds to the
resolution into an irrotational electric field and a solenoidal magnetic field.
For waves in an elastic medium, if the displacement $\mathbf{u}$ is irrotational,
$\nabla \times \mathbf{u} = 0$, plane waves (or spherical waves at large distances) become
longitudinal. If $\mathbf{u}$ is solenoidal, $\nabla \cdot \mathbf{u} = 0$, then the waves become transverse. A seismic
disturbance will produce a displacement that may be resolved into a solenoidal
part and an irrotational part (compare Section 1.15). The irrotational part
yields the longitudinal P (primary) earthquake waves. The solenoidal part
gives rise to the slower transverse S (secondary) waves, Exercise 3.6.8.
³ In fluid dynamics $\nabla \times \mathbf{V}$ is called the "vorticity."
Using the gradient, divergence, and curl, and of course the BAC-CAB rule,
we may construct or verify a large number of useful vector identities. For
verification, complete expansion into cartesian components is always a
possibility. Sometimes if we use insight instead of routine shuffling of cartesian
components, the verification process can be shortened drastically.
Remember that $\nabla$ is a vector operator, a hybrid creature satisfying two sets
of rules:
1. vector rules, and
2. partial differentiation rules, including differentiation of a product.
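EXAMPLE 1.8.2. Gradient of a Dot Product
Verify that
$$\nabla(\mathbf{A} \cdot \mathbf{B}) = (\mathbf{B} \cdot \nabla)\mathbf{A} + (\mathbf{A} \cdot \nabla)\mathbf{B} + \mathbf{B} \times (\nabla \times \mathbf{A}) + \mathbf{A} \times (\nabla \times \mathbf{B}). \tag{1.75}$$
This particular example hinges on the recognition that $\nabla(\mathbf{A} \cdot \mathbf{B})$ is the type of
term that appears in the BAC-CAB expansion of a triple vector product, Eq.
1.50. For instance,
$$\mathbf{A} \times (\nabla \times \mathbf{B}) = \nabla(\mathbf{A} \cdot \mathbf{B}) - (\mathbf{A} \cdot \nabla)\mathbf{B},$$
with the $\nabla$ differentiating only $\mathbf{B}$, not $\mathbf{A}$. From the commutativity of factors in
a scalar product we may interchange $\mathbf{A}$ and $\mathbf{B}$ and write
$$\mathbf{B} \times (\nabla \times \mathbf{A}) = \nabla(\mathbf{A} \cdot \mathbf{B}) - (\mathbf{B} \cdot \nabla)\mathbf{A},$$
now with $\nabla$ differentiating only $\mathbf{A}$, not $\mathbf{B}$. Adding these two equations, we
obtain $\nabla$ differentiating the product $\mathbf{A} \cdot \mathbf{B}$ and the identity, Eq. 1.75.
This identity is used frequently in advanced electromagnetic theory. Exercise
1.8.15 is one simple illustration.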
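EXERCISES
1.8.1 Show, by rotating the coordinates, that the components of the curl of a vector
transform as a vector.
Hint. The direction cosine identities of Eq. 1.41 are available as needed.
1.8.2 Show that $\mathbf{u} \times \mathbf{v}$ is solenoidal if $\mathbf{u}$ and $\mathbf{v}$ are each irrotational.
1.8.3 If $\mathbf{A}$ is irrotational, show that $\mathbf{A} \times \mathbf{r}$ is solenoidal.
1.8.4 A rigid body is rotating with constant angular velocity $\boldsymbol{\omega}$. Show that the linear
velocity $\mathbf{v}$ is solenoidal.
1.8.5 A vector function $\mathbf{f}(x, y, z)$ is not irrotational but the product of $\mathbf{f}$ and a scalar
function $g(x, y, z)$ is irrotational. Show that
$$\mathbf{f} \cdot \nabla \times \mathbf{f} = 0.$$
1.8.6 If (a) $\mathbf{V} = \mathbf{i}V_x(x, y) + \mathbf{j}V_y(x, y)$ and (b) $\nabla \times \mathbf{V} \neq 0$, prove that $\nabla \times \mathbf{V}$ is
perpendicular to $\mathbf{V}$.
1.8.7 Classically, angular momentum is given by $\mathbf{L} = \mathbf{r} \times \mathbf{p}$, where $\mathbf{p}$ is the linear
momentum. To go from classical mechanics to quantum mechanics, replace $\mathbf{p}$
by the operator $-i\nabla$ (Section 15.6). Show that the quantum mechanical angular
momentum operator has cartesian components
$$L_x = -i\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right), \quad L_y = -i\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right), \quad L_z = -i\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right)$$
(in units of $\hbar$).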
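1.8.8 Using the angular momentum operators previously given, show that they satisfy
commutation relations of the form
$$[L_x, L_y] \equiv L_xL_y - L_yL_x = iL_z$$
and hence
$$\mathbf{L} \times \mathbf{L} = i\mathbf{L}.$$
These commutation relations will be taken later as the defining relations of an
angular momentum operator (Exercise 4.2.15 and the following one, and Section
12.7).
1.8.9 With the commutator bracket notation $[L_x, L_y] = L_xL_y - L_yL_x$, the angular
momentum vector $\mathbf{L}$ satisfies $[L_x, L_y] = iL_z$, and so on, or $\mathbf{L} \times \mathbf{L} = i\mathbf{L}$.
Two other vectors $\mathbf{a}$ and $\mathbf{b}$ commute with each other and with $\mathbf{L}$, that is, $[\mathbf{a}, \mathbf{b}] =
[\mathbf{a}, \mathbf{L}] = [\mathbf{b}, \mathbf{L}] = 0$. Show that
$$[\mathbf{a} \cdot \mathbf{L}, \mathbf{b} \cdot \mathbf{L}] = i(\mathbf{a} \times \mathbf{b}) \cdot \mathbf{L}.$$
1.8.10 For $\mathbf{A} = \mathbf{i}A_x(x, y, z)$ and $\mathbf{B} = \mathbf{i}B_x(x, y, z)$ evaluate each term in the vector identity
$$\nabla(\mathbf{A} \cdot \mathbf{B}) = (\mathbf{B} \cdot \nabla)\mathbf{A} + (\mathbf{A} \cdot \nabla)\mathbf{B} + \mathbf{B} \times (\nabla \times \mathbf{A}) + \mathbf{A} \times (\nabla \times \mathbf{B})$$
and verify that the identity is satisfied.
1.8.11 Verify the vector identity
$$\nabla \times (\mathbf{A} \times \mathbf{B}) = (\mathbf{B} \cdot \nabla)\mathbf{A} - (\mathbf{A} \cdot \nabla)\mathbf{B} - \mathbf{B}(\nabla \cdot \mathbf{A}) + \mathbf{A}(\nabla \cdot \mathbf{B}).$$
1.8.12 As an alternative to the vector identity of Example 1.8.2 show that
$$\nabla(\mathbf{A} \cdot \mathbf{B}) = (\mathbf{A} \times \nabla) \times \mathbf{B} + (\mathbf{B} \times \nabla) \times \mathbf{A} + \mathbf{A}(\nabla \cdot \mathbf{B}) + \mathbf{B}(\nabla \cdot \mathbf{A}).$$
1.8.13 Verify the identity
$$\mathbf{A} \times (\nabla \times \mathbf{A}) = \tfrac{1}{2}\nabla(A^2) - (\mathbf{A} \cdot \nabla)\mathbf{A}.$$
1.8.14 If $\mathbf{A}$ and $\mathbf{B}$ are constant vectors, show that
$$\nabla(\mathbf{A} \cdot \mathbf{B} \times \mathbf{r}) = \mathbf{A} \times \mathbf{B}.$$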
1.8.15 A distribution of electric currents creates a constant magnetic moment $\mathbf{m}$. The
force on $\mathbf{m}$ in an external magnetic induction $\mathbf{B}$ is given by
$$\mathbf{F} = \nabla \times (\mathbf{B} \times \mathbf{m}).$$
Show that
$$\mathbf{F} = \nabla(\mathbf{m} \cdot \mathbf{B}).$$
Note. Assuming no time dependence of the fields, Maxwell's equations yield
$\nabla \times \mathbf{B} = 0$. Also $\nabla \cdot \mathbf{B} = 0$.
1.8.16 An electric dipole of moment $\mathbf{p}$ is located at the origin. The dipole creates an
electric potential at $\mathbf{r}$ given by
$$\psi(\mathbf{r}) = \frac{\mathbf{p} \cdot \mathbf{r}}{4\pi\varepsilon_0 r^3}.$$
Find the electric field, $\mathbf{E} = -\nabla\psi$ at $\mathbf{r}$.
1.8.17 The vector potential $\mathbf{A}$ of a magnetic dipole, dipole moment $\mathbf{m}$, is given by
$\mathbf{A}(\mathbf{r}) = (\mu_0/4\pi)(\mathbf{m} \times \mathbf{r}/r^3)$. Show that the magnetic induction $\mathbf{B} = \nabla \times \mathbf{A}$ is given
by
$$\mathbf{B} = \frac{\mu_0}{4\pi}\frac{3\mathbf{r}_0(\mathbf{r}_0 \cdot \mathbf{m}) - \mathbf{m}}{r^3}.$$
Note. The limiting process leading to point dipoles is discussed in Section 12.1
for electric dipoles, Section 12.5 for magnetic dipoles.
1.8.18 The velocity of a two-dimensional flow of liquid is given by
$$\mathbf{V} = \mathbf{i}\,u(x, y) - \mathbf{j}\,v(x, y).$$
If the liquid is incompressible and the flow is irrotational show that
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad \text{and} \quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}.$$
These are the Cauchy-Riemann conditions of Section 6.2.
1.8.19 The evaluation in this section of the four integrals for the circulation omitted
Taylor series terms such as $\partial V_x/\partial x$, $\partial V_y/\partial y$ and all second derivatives. Show that
$\partial V_x/\partial x$, $\partial V_y/\partial y$ cancel out when the four integrals are added and that the second
derivative terms drop out in the limit as $dx \to 0$, $dy \to 0$.
Hint. Calculate the circulation per unit area and then take the limit $dx \to 0$, $dy \to 0$.
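1.9 SUCCESSIVE APPLICATIONS OF $\nabla$
We have now defined gradient, divergence, and curl to obtain vector,
scalar, and vector quantities, respectively. Letting $\nabla$ operate on each of these
quantities, we obtain
(a) $\nabla \cdot \nabla\varphi$ (b) $\nabla \times \nabla\varphi$ (c) $\nabla\nabla \cdot \mathbf{V}$
(d) $\nabla \cdot \nabla \times \mathbf{V}$ (e) $\nabla \times (\nabla \times \mathbf{V})$,
all five expressions involving second derivatives and all five appearing in the
second-order differential equations of mathematical physics, particularly in
electromagnetic theory.
The first expression, $\nabla \cdot \nabla\varphi$, the divergence of the gradient, is named the
Laplacian of $\varphi$. We have
$$\nabla \cdot \nabla\varphi = \left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right) \cdot \left(\mathbf{i}\frac{\partial \varphi}{\partial x} + \mathbf{j}\frac{\partial \varphi}{\partial y} + \mathbf{k}\frac{\partial \varphi}{\partial z}\right) = \frac{\partial^2 \varphi}{\partial x^2} + \frac{\partial^2 \varphi}{\partial y^2} + \frac{\partial^2 \varphi}{\partial z^2}. \tag{1.76a}$$
When $\varphi$ is the electrostatic potential, we have
$$\nabla \cdot \nabla\varphi = 0, \tag{1.76b}$$
which is Laplace's equation of electrostatics. Often the combination $\nabla \cdot \nabla$ is
written $\nabla^2$.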
EXAMPLE 1.9.1
Calculate $\nabla \cdot \nabla g(r)$.
Referring to Examples 1.6.1 and 1.7.2,
$$\nabla \cdot \nabla g(r) = \nabla \cdot \mathbf{r}_0\frac{dg}{dr} = \frac{2}{r}\frac{dg}{dr} + \frac{d^2g}{dr^2},$$
replacing $f(r)$ in Example 1.7.2 by $(1/r) \cdot dg/dr$. If $g(r) = r^n$, this reduces to
$$\nabla \cdot \nabla r^n = n(n + 1)r^{n-2}.$$
This vanishes for $n = 0$ [$g(r)$ = constant] and for $n = -1$; that is, $g(r) = 1/r$
is a solution of Laplace's equation, $\nabla^2 g(r) = 0$. This is for $r \neq 0$. At $r = 0$ a
Dirac delta function is involved (see Eq. 1.173 and Section 8.7).
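The formula $\nabla \cdot \nabla r^n = n(n + 1)r^{n-2}$ can be spot-checked with second differences. The sketch below is illustrative; the helper `laplacian`, the step size, and the sample point are assumptions, and the check is only meaningful away from $r = 0$.

```python
import math

def laplacian(g, p, h=1e-3):
    """Second-difference Laplacian of a scalar function g at p = (x, y, z)."""
    g0 = g(*p)
    total = 0.0
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        total += (g(*plus) - 2.0 * g0 + g(*minus)) / (h * h)
    return total

def power_of_r(n):
    """Return the function g(r) = r**n of the cartesian coordinates."""
    return lambda x, y, z: math.sqrt(x * x + y * y + z * z) ** n

p = (1.0, 2.0, 3.0)
r = math.sqrt(14.0)
# Map n -> (numerical Laplacian, n (n + 1) r**(n-2))
checks = {n: (laplacian(power_of_r(n), p), n * (n + 1) * r ** (n - 2))
          for n in (-1, 0, 2, 3)}
```

The entries for $n = 0$ and $n = -1$ vanish to within the discretization error, matching the two solutions of Laplace's equation named in the example.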
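Expression (b) may be written
$$\nabla \times \nabla\varphi = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ \dfrac{\partial \varphi}{\partial x} & \dfrac{\partial \varphi}{\partial y} & \dfrac{\partial \varphi}{\partial z} \end{vmatrix}.$$
By expanding the determinant, we obtain
$$\nabla \times \nabla\varphi = \mathbf{i}\left(\frac{\partial^2 \varphi}{\partial y\,\partial z} - \frac{\partial^2 \varphi}{\partial z\,\partial y}\right) + \mathbf{j}\left(\frac{\partial^2 \varphi}{\partial z\,\partial x} - \frac{\partial^2 \varphi}{\partial x\,\partial z}\right) + \mathbf{k}\left(\frac{\partial^2 \varphi}{\partial x\,\partial y} - \frac{\partial^2 \varphi}{\partial y\,\partial x}\right) = 0, \tag{1.77}$$
assuming that the order of partial differentiation may be interchanged. This is
true as long as these second partial derivatives of $\varphi$ are continuous functions.
Then, from Eq. 1.77, the curl of a gradient is identically zero. All gradients,
therefore, are irrotational. Note carefully that the zero in Eq. 1.77 comes as a
mathematical identity, independent of any physics. The zero in Eq. 1.76b is a
consequence of physics.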
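Expression (d) is a triple scalar product which may be written
$$\nabla \cdot \nabla \times \mathbf{V} = \begin{vmatrix} \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ V_x & V_y & V_z \end{vmatrix}. \tag{1.78}$$
Again, assuming continuity so that the order of differentiation is immaterial,
we obtain
$$\nabla \cdot \nabla \times \mathbf{V} = 0. \tag{1.79}$$
The divergence of a curl vanishes or all curls are solenoidal. In Section 1.15 we
shall see that vectors may be resolved into solenoidal and irrotational parts by
Helmholtz's theorem.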
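Both identities, Eqs. 1.77 and 1.79, can be observed numerically by composing finite-difference operators. The sketch below is an illustration under assumptions: the helpers `partial`, `grad`, `curl`, and `div`, the test functions, and the sample point are not from the text. Discrete central differences along different axes commute, so the results vanish up to rounding error, mirroring the interchange of partial derivatives in the argument above.

```python
import math

H = 1e-3  # step used by every difference quotient below

def partial(f, i, p, h=H):
    """Central difference of the scalar function f along coordinate i at p."""
    plus, minus = list(p), list(p)
    plus[i] += h
    minus[i] -= h
    return (f(*plus) - f(*minus)) / (2 * h)

def grad(phi):
    return lambda x, y, z: tuple(partial(phi, i, (x, y, z)) for i in range(3))

def curl(F):
    def c(x, y, z):
        p = (x, y, z)
        d = lambda i, j: partial(lambda *q: F(*q)[i], j, p)
        return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))
    return c

def div(F):
    return lambda x, y, z: sum(
        partial(lambda *q, i=i: F(*q)[i], i, (x, y, z)) for i in range(3))

def phi(x, y, z):
    # Assumed smooth scalar test function.
    return math.sin(x) * y * y * math.exp(z)

def V(x, y, z):
    # Assumed smooth vector test function.
    return (y * z * z, x ** 3 * math.sin(z), x * y * math.exp(z))

point = (0.3, 0.7, 0.2)
curl_of_grad = curl(grad(phi))(*point)   # expected: (0, 0, 0)
div_of_curl = div(curl(V))(*point)       # expected: 0
```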
The two remaining expressions satisfy a relation
$$\nabla \times (\nabla \times \mathbf{V}) = \nabla\nabla \cdot \mathbf{V} - \nabla \cdot \nabla\mathbf{V}. \tag{1.80}$$
This follows immediately from Eq. 1.50, the BAC-CAB rule, which we rewrite
so that $\mathbf{C}$ appears at the extreme right of each term. The term $\nabla \cdot \nabla\mathbf{V}$ was not
included in our list, but it may be defined by Eq. 1.80. If $\mathbf{V}$ is expanded in
cartesian coordinates so that the unit vectors are constant in direction as well
as in magnitude, $\nabla \cdot \nabla\mathbf{V}$, a vector Laplacian, reduces to
$$\nabla \cdot \nabla\mathbf{V} = \mathbf{i}\,\nabla \cdot \nabla V_x + \mathbf{j}\,\nabla \cdot \nabla V_y + \mathbf{k}\,\nabla \cdot \nabla V_z,$$
a vector sum of ordinary scalar Laplacians. By expanding in cartesian
coordinates, we may verify Eq. 1.80 as a vector identity.
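EXAMPLE 1.9.2 Electromagnetic Wave Equation
One important application of this vector relation (Eq. 1.80) is in the
derivation of the electromagnetic wave equation. In vacuum Maxwell's equations
become
$$\nabla \cdot \mathbf{B} = 0, \tag{1.81a}$$
$$\nabla \cdot \mathbf{E} = 0, \tag{1.81b}$$
$$\nabla \times \mathbf{B} = \varepsilon_0\mu_0\frac{\partial \mathbf{E}}{\partial t}, \tag{1.81c}$$
$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}. \tag{1.81d}$$
Here $\mathbf{E}$ is the electric field, $\mathbf{B}$ the magnetic induction, $\varepsilon_0$ the electric permittivity,
and $\mu_0$ the magnetic permeability (mks or SI units). Suppose we eliminate $\mathbf{B}$
from Eqs. 1.81c and 1.81d. We may do this by taking the curl of both sides of
Eq. 1.81d and the time derivative of both sides of Eq. 1.81c. Since the space
and time derivatives commute,
$$\frac{\partial}{\partial t}\nabla \times \mathbf{B} = \nabla \times \frac{\partial \mathbf{B}}{\partial t}, \tag{1.82}$$
and we obtain
$$\nabla \times (\nabla \times \mathbf{E}) = -\varepsilon_0\mu_0\frac{\partial^2 \mathbf{E}}{\partial t^2}. \tag{1.83}$$
Application of Eqs. 1.80 and of 1.81b yields
$$\nabla \cdot \nabla\mathbf{E} = \varepsilon_0\mu_0\frac{\partial^2 \mathbf{E}}{\partial t^2}, \tag{1.84}$$
the electromagnetic vector wave equation. Again, if $\mathbf{E}$ is expressed in cartesian
coordinates, Eq. 1.84 separates into three scalar wave equations, each involving
a scalar Laplacian.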
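One scalar component of Eq. 1.84 can be tested on a plane wave. The sketch below is an illustration with assumed ingredients: the trial solution $E_x = \cos(kz - \omega t)$, the wave number, the sample point, and the numerical values of $\varepsilon_0$ and $\mu_0$ (CODATA 2018 figures) are not given in the text. The wave speed $c = 1/\sqrt{\varepsilon_0\mu_0}$ fixes $\omega = ck$.

```python
import math

EPS0 = 8.8541878128e-12   # electric permittivity of vacuum, F/m (assumed CODATA 2018 value)
MU0 = 1.25663706212e-6    # magnetic permeability of vacuum, H/m (assumed CODATA 2018 value)
c = 1.0 / math.sqrt(EPS0 * MU0)

k = 1.0          # wave number in 1/m, arbitrary choice
omega = c * k    # angular frequency required by the wave equation

def Ex(z, t):
    # Trial plane wave travelling in the +z direction.
    return math.cos(k * z - omega * t)

def second_difference(f, u, h):
    """Second-difference approximation to the second derivative of f at u."""
    return (f(u + h) - 2.0 * f(u) + f(u - h)) / (h * h)

z0, t0 = 0.4, 1.0e-9
hz = 1.0e-3
ht = hz / c      # time step giving the same phase increment as hz
lhs = second_difference(lambda z: Ex(z, t0), z0, hz)                # Laplacian of Ex
rhs = EPS0 * MU0 * second_difference(lambda t: Ex(z0, t), t0, ht)   # eps0 mu0 d2Ex/dt2
```

The two sides agree only because $\omega/k$ was set equal to $c$; any other ratio breaks the equality, which is the content of the wave equation for this trial solution.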
EXERCISES
1.9.1 Verify Eq. 1.80
$$\nabla \times (\nabla \times \mathbf{V}) = \nabla\nabla \cdot \mathbf{V} - \nabla \cdot \nabla\mathbf{V}$$
by direct expansion in cartesian coordinates.
1.9.2 Show that the identity
$$\nabla \times (\nabla \times \mathbf{V}) = \nabla\nabla \cdot \mathbf{V} - \nabla \cdot \nabla\mathbf{V}$$
follows from the BAC-CAB rule for a triple vector product. Justify any alteration
of the order of factors in the BAC and CAB terms.
1.9.3 Prove that $\nabla \times (\varphi\nabla\varphi) = 0$.
1.9.4 You are given that the curl of $\mathbf{F}$ equals the curl of $\mathbf{G}$. Show that $\mathbf{F}$ and $\mathbf{G}$ may
differ by (a) a constant and (b) a gradient of a scalar function.
1.9.5 The Navier-Stokes equation of hydrodynamics contains a nonlinear term
$(\mathbf{v} \cdot \nabla)\mathbf{v}$. Show that the curl of this term may be written $-\nabla \times [\mathbf{v} \times (\nabla \times \mathbf{v})]$.
1.9.6 From the Navier-Stokes equation for the steady flow of an incompressible
viscous fluid we have the term
$$\nabla \times [\mathbf{v} \times (\nabla \times \mathbf{v})]$$
where $\mathbf{v}$ is the fluid velocity. Show that this term vanishes for the special case
$\mathbf{v} = \mathbf{i}\,v(y, z)$.
1.9.7 Prove that $(\nabla u) \times (\nabla v)$ is solenoidal where $u$ and $v$ are differentiable scalar
functions.
1.9.8 $\varphi$ is a scalar satisfying Laplace's equation, $\nabla^2\varphi = 0$. Show that $\nabla\varphi$ is both
solenoidal and irrotational.
1.9.9 With $\psi$ a scalar function, show that
$$(\mathbf{r} \times \nabla) \cdot (\mathbf{r} \times \nabla)\psi = r^2\nabla^2\psi - r^2\frac{\partial^2 \psi}{\partial r^2} - 2r\frac{\partial \psi}{\partial r}.$$
(This can actually be shown more easily in spherical polar coordinates, Section
2.5).
1.9.10 In a (nonrotating) isolated mass such as a star, the condition for equilibrium is
$$\nabla P + \rho\nabla\varphi = 0.$$
Here $P$ is the total pressure, $\rho$ the density, and $\varphi$ the gravitational potential.
Show that at any given point the normals to the surfaces of constant pressure
and constant gravitational potential are parallel.
1.9.11 In the Pauli theory of the electron one encounters the expression
$$(\mathbf{p} - e\mathbf{A}) \times (\mathbf{p} - e\mathbf{A})\psi,$$
where $\psi$ is a scalar function. $\mathbf{A}$ is the magnetic vector potential related to the
magnetic induction $\mathbf{B}$ by $\mathbf{B} = \nabla \times \mathbf{A}$. Given that $\mathbf{p} = -i\nabla$, show that this
expression reduces to $ie\mathbf{B}\psi$.
1.9.12 Show that any solution of the equation
$$\nabla \times \nabla \times \mathbf{A} - k^2\mathbf{A} = 0$$
automatically satisfies the vector Helmholtz equation
$$\nabla^2\mathbf{A} + k^2\mathbf{A} = 0$$
and the solenoidal condition
$$\nabla \cdot \mathbf{A} = 0.$$
Hint. Let $\nabla\cdot$ operate on the first equation.
1.9.13 The theory of heat conduction leads to an equation
$$\nabla^2\Psi = k|\nabla\Phi|^2,$$
where $\Phi$ is a potential satisfying Laplace's equation: $\nabla^2\Phi = 0$. Show that a
solution of this equation is
$$\Psi = \tfrac{1}{2}k\Phi^2.$$
1.10 VECTOR INTEGRATION
The next step after differentiating vectors is to integrate them. Let us start
with line integrals and then proceed to surface and volume integrals. In each
case the method of attack will be to reduce the vector integral to scalar integrals
with which the reader is assumed familiar.
Line Integrals
Using an increment of length $d\mathbf{r} = \mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz$, we may encounter the
line integrals
$$\int_C \varphi\,d\mathbf{r}, \tag{1.85a}$$
$$\int_C \mathbf{V} \cdot d\mathbf{r}, \tag{1.85b}$$
$$\int_C \mathbf{V} \times d\mathbf{r}, \tag{1.85c}$$
in each of which the integral is over some contour $C$ that may be open (with
starting point and ending point separated) or closed (forming a loop). Because
of its physical interpretation that follows, the second form, Eq. 1.85b, is by far
the most important of the three.
With $\varphi$, a scalar, the first integral reduces immediately to
$$\int_C \varphi\,d\mathbf{r} = \mathbf{i}\int_C \varphi(x, y, z)\,dx + \mathbf{j}\int_C \varphi(x, y, z)\,dy + \mathbf{k}\int_C \varphi(x, y, z)\,dz. \tag{1.86}$$
This separation has employed the relation
$$\int \mathbf{i}\,\varphi\,dx = \mathbf{i}\int \varphi\,dx, \tag{1.87}$$
which is permissible because the cartesian unit vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ are constant
in both magnitude and direction. Perhaps this relation is obvious here, but it
will not be true in the noncartesian systems encountered in Chapter 2.
The three integrals on the right side of Eq. 1.86 are ordinary scalar integrals
and, to avoid complications, we assume that they are Riemann integrals. Note,
however, that the integral with respect to $x$ cannot be evaluated unless $y$ and $z$
are known in terms of $x$ and similarly for the integrals with respect to $y$ and $z$.
This simply means that the path of integration $C$ must be specified. Unless the
integrand has special properties that lead the integral to depend only on the
value of the end points, the value will depend on the particular choice of
contour $C$. For instance, if we choose the very special case $\varphi = 1$, Eq. 1.85a is
just the vector distance from the start of contour $C$ to the end point, in this
case independent of the choice of path connecting fixed end points. With
$d\mathbf{r} = \mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz$, the second and third forms also reduce to scalar
integrals and, like Eq. 1.85a, are dependent, in general, on the choice of path.
The form (Eq. 1.85b) is exactly the same as that encountered when we calculate
the work done by a force that varies along the path,
$$W = \int \mathbf{F} \cdot d\mathbf{r} = \int F_x(x, y, z)\,dx + \int F_y(x, y, z)\,dy + \int F_z(x, y, z)\,dz. \tag{1.88a}$$
In this expression $\mathbf{F}$ is the force exerted on a particle.
VECTOR INTEGRATION 53
EXAMPLE 1.10.1
The force exerted on a body is $\mathbf{F} = -\mathbf{i}y + \mathbf{j}x$. The problem is to calculate
the work done going from the origin to the point (1,1).
$$W = \int_{0,0}^{1,1} \mathbf{F}\cdot d\mathbf{r} = \int_{0,0}^{1,1} (-y\,dx + x\,dy). \tag{1.88b}$$
Separating the two integrals, we obtain
$$W = -\int_0^1 y\,dx + \int_0^1 x\,dy. \tag{1.88c}$$
The first integral cannot be evaluated until we specify the values of y as x
ranges from 0 to 1. Likewise, the second integral requires x as a function of y.
Consider first the path shown in Fig. 1.21. Then
$$W = -\int_0^1 0\,dx + \int_0^1 1\,dy = 1, \tag{1.88d}$$
since y = 0 along the first segment of the path and x = 1 along the second.
If we select the path [x = 0, 0 ≤ y ≤ 1] and [0 ≤ x ≤ 1, y = 1], then Eq.
1.88c gives $W = -1$. For this force the work done depends on the choice of
path.
FIG. 1.21 A path of integration
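The path dependence found in Example 1.10.1 can also be seen numerically. The following is a minimal sketch, not part of the original development: the helper `work_along_path` and the midpoint-rule discretization are our own illustrative choices, and the two vertex lists are the two paths discussed above.

```python
def work_along_path(force, vertices, steps_per_segment=100):
    """Approximate W = integral of F . dr along straight segments joining `vertices`."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        dx = (x1 - x0) / steps_per_segment
        dy = (y1 - y0) / steps_per_segment
        for n in range(steps_per_segment):
            # Evaluate the force at the midpoint of the n-th sub-step.
            xm = x0 + (n + 0.5) * dx
            ym = y0 + (n + 0.5) * dy
            fx, fy = force(xm, ym)
            total += fx * dx + fy * dy
    return total


def example_force(x, y):
    # F = -i y + j x, the force of Example 1.10.1.
    return -y, x


# Path along the x-axis first, then up; and path up the y-axis first, then across.
w_via_x_axis = work_along_path(example_force, [(0, 0), (1, 0), (1, 1)])
w_via_y_axis = work_along_path(example_force, [(0, 0), (0, 1), (1, 1)])
print(w_via_x_axis, w_via_y_axis)
```

Because the force is linear in x and y, the midpoint rule is exact on each straight segment, so the two results should agree with the values +1 and -1 obtained analytically.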
Surface Integrals
Surface integrals appear in the same forms as line integrals, the element of
area also being a vector, $d\boldsymbol{\sigma}$.¹ Often this area element is written $\mathbf{n}\,dA$ in which
n is a unit (normal) vector to indicate the positive direction.² There are two
conventions for choosing the positive direction. First, if the surface is a closed
surface, we agree to take the outward normal as positive. Second, if the surface
is an open surface, the positive normal depends on the direction in which the
perimeter of the open surface is traversed. If the right-hand fingers are placed
in the direction of travel around the perimeter, the positive normal is indicated
by the thumb of the right hand. As an illustration, a circle in the xy-plane
(Fig. 1.22) mapped out from x to y to $-x$ to $-y$ and back to x will have its
positive normal parallel to the positive z-axis (for the right-handed coordinate
system). If readers should ever encounter one-sided surfaces, such as Moebius
strips, it is suggested that they either cut the strips and form reasonable,
well-behaved surfaces or label them pathological and send them to the nearest
mathematics department.
FIG. 1.22 Right-hand rule for the positive normal
¹ Recall that in Section 1.4 the area (of a parallelogram) represented a
cross-product vector.
² Although n always has unit length, its direction may well be a function of
position.
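Analogous to the line integrals, Eqs. 1.85a, b, c, surface integrals may
appear in the forms
$$\int \varphi\,d\boldsymbol{\sigma}, \qquad \int \mathbf{V}\cdot d\boldsymbol{\sigma}, \qquad \int \mathbf{V}\times d\boldsymbol{\sigma}.$$
Again, the dot product is by far the most commonly encountered form.
The surface integral $\int \mathbf{V}\cdot d\boldsymbol{\sigma}$ may be interpreted as a flow or flux through the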
given surface. This is really what we did in Section 1.7 to obtain the significance
of the term divergence. This identification reappears in Section 1.11 as Gauss's
theorem. Note that both physically and from the dot product the tangential
components of the velocity contribute nothing to the flow through the surface.
Volume Integrals
Volume integrals are somewhat simpler, for the volume element $d\tau$ is a
scalar quantity.³ We have
$$\int_V \mathbf{V}\,d\tau = \mathbf{i}\int_V V_x\,d\tau + \mathbf{j}\int_V V_y\,d\tau + \mathbf{k}\int_V V_z\,d\tau, \tag{1.89}$$
again reducing the vector integral to a vector sum of scalar integrals.
³ Frequently the symbols $d^3r$ and $d^3x$ are used to denote a volume element in
x (xyz or $x_1x_2x_3$) space.
Integral Definitions of Gradient, Divergence,
and Curl
One interesting and significant application of our surface and volume
integrals is their use in developing alternate definitions of our differential
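relations. We find
$$\nabla\varphi = \lim_{\int d\tau \to 0} \frac{\int \varphi\,d\boldsymbol{\sigma}}{\int d\tau}, \tag{1.90}$$
$$\nabla\cdot\mathbf{V} = \lim_{\int d\tau \to 0} \frac{\int \mathbf{V}\cdot d\boldsymbol{\sigma}}{\int d\tau}, \tag{1.91}$$
$$\nabla\times\mathbf{V} = \lim_{\int d\tau \to 0} \frac{\int d\boldsymbol{\sigma}\times\mathbf{V}}{\int d\tau}. \tag{1.92}$$
In these three equations $\int d\tau$ is the volume of a small region of space and $d\boldsymbol{\sigma}$
is the vector area element of this volume. The identification of Eq. 1.91 as the
divergence of V was carried out in Section 1.7. Here we show that Eq. 1.90 is
consistent with our earlier definition of $\nabla\varphi$ (Eq. 1.53). For simplicity we choose
$\int d\tau$ to be the differential volume dx dy dz (Fig. 1.23). This time we place the
origin at the geometric center of our volume element. The area integral leads
to six integrals, one for each of the six faces. Remembering that $d\boldsymbol{\sigma}$ is outward,
$d\boldsymbol{\sigma}\cdot\mathbf{i} = -|d\sigma|$ for surface EFHG, and $+|d\sigma|$ for surface ABDC, we have
$$\begin{aligned}
\int \varphi\,d\boldsymbol{\sigma} = {} & -\mathbf{i}\int_{EFHG}\left(\varphi - \frac{\partial\varphi}{\partial x}\frac{dx}{2}\right)dy\,dz + \mathbf{i}\int_{ABDC}\left(\varphi + \frac{\partial\varphi}{\partial x}\frac{dx}{2}\right)dy\,dz \\
& -\mathbf{j}\int_{AEGC}\left(\varphi - \frac{\partial\varphi}{\partial y}\frac{dy}{2}\right)dx\,dz + \mathbf{j}\int_{BFHD}\left(\varphi + \frac{\partial\varphi}{\partial y}\frac{dy}{2}\right)dx\,dz \\
& -\mathbf{k}\int_{ABFE}\left(\varphi - \frac{\partial\varphi}{\partial z}\frac{dz}{2}\right)dx\,dy + \mathbf{k}\int_{CDHG}\left(\varphi + \frac{\partial\varphi}{\partial z}\frac{dz}{2}\right)dx\,dy.
\end{aligned}$$
FIG. 1.23 Differential rectangular parallelepiped (origin at center)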
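Using the first two terms of a Maclaurin expansion, we evaluate each integrand
at the origin with a correction included for the displacement (±dx/2)
of the center of the face from the origin.⁴ Having chosen the total volume to be
of differential size ($\int d\tau = dx\,dy\,dz$), we drop the integral signs on the right and
obtain
$$\int \varphi\,d\boldsymbol{\sigma} = \left(\mathbf{i}\frac{\partial\varphi}{\partial x} + \mathbf{j}\frac{\partial\varphi}{\partial y} + \mathbf{k}\frac{\partial\varphi}{\partial z}\right)dx\,dy\,dz. \tag{1.93}$$
Dividing by
$$\int d\tau = dx\,dy\,dz,$$
we verify Eq. 1.90.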
This verification has been oversimplified in ignoring other correction terms
beyond the first derivatives. These additional terms, which are introduced in
Section 5.6 when the Taylor expansion is developed, vanish in the limit
$$\int d\tau \to 0 \quad (dx \to 0,\ dy \to 0,\ dz \to 0).$$
This, of course, is the reason for specifying in Eqs. 1.90, 1.91, and 1.92 that this
limit be taken.
Verification of Eq. 1.92 follows these same lines exactly, using a differential
volume dx dy dz.
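The integral definition of the gradient, Eq. 1.90, lends itself to a quick numerical check. The sketch below is only an illustration under our own assumptions: the test field `phi`, the sample point, and the cube edge length are arbitrary choices, and each face integral is approximated by the face-center value times the face area, in the spirit of the two-term expansion used above.

```python
import math


def phi(x, y, z):
    # Hypothetical smooth test field, chosen only for illustration.
    return x * x * y + math.sin(z)


def grad_from_surface_integral(f, center, edge=1e-3):
    """Estimate grad f as (surface integral of f d_sigma) / volume for a small cube.

    Each face integral is approximated by f at the face center times the
    face area; opposite faces carry opposite outward normals.
    """
    cx, cy, cz = center
    half = edge / 2.0
    area = edge * edge
    volume = edge ** 3
    gx = (f(cx + half, cy, cz) - f(cx - half, cy, cz)) * area / volume
    gy = (f(cx, cy + half, cz) - f(cx, cy - half, cz)) * area / volume
    gz = (f(cx, cy, cz + half) - f(cx, cy, cz - half)) * area / volume
    return gx, gy, gz


point = (1.0, 2.0, 0.5)
estimate = grad_from_surface_integral(phi, point)
# Analytic gradient of the test field: (2xy, x**2, cos z).
exact = (2 * point[0] * point[1], point[0] ** 2, math.cos(point[2]))
print(estimate, exact)
```

Shrinking `edge` plays the role of the limit in Eq. 1.90; for very small edges floating-point cancellation eventually dominates, so the default value is a compromise.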
EXERCISES
1.10.1 The force field acting on a two-dimensional linear oscillator may be described by
$$\mathbf{F} = -\mathbf{i}kx - \mathbf{j}ky.$$
Compare the work done moving against this force field when going from (1,1)
to (4,4) by the following straight-line paths:
(a) (1,1) → (4,1) → (4,4)
(b) (1,1) → (1,4) → (4,4)
(c) (1,1) → (4,4) along x = y.
This means evaluating
$$-\int_{(1,1)}^{(4,4)} \mathbf{F}\cdot d\mathbf{r}$$
along each path.
⁴ The origin has been placed at the geometric center.
1.10.2 Find the work done going around a unit circle in the xy-plane:
(a) counterclockwise from 0 to π,
(b) clockwise from 0 to $-\pi$,
doing work against a force field given by
$$\mathbf{F} = \frac{-\mathbf{i}y}{x^2+y^2} + \frac{\mathbf{j}x}{x^2+y^2}.$$
Note that the work done depends on the path.
1.10.3 Calculate the work you do in going from point (1,1) to point (3,3). The force
you exert is given by
$$\mathbf{F} = \mathbf{i}(x - y) + \mathbf{j}(x + y).$$
Specify clearly the path you choose. Note that this force field is nonconservative.
1.10.4 Evaluate $\oint \mathbf{r}\cdot d\mathbf{r}$.
Note. The symbol $\oint$ means that the path of integration is a closed loop.
1.10.5 Evaluate
$$\frac{1}{3}\int_S \mathbf{r}\cdot d\boldsymbol{\sigma}$$
over the unit cube defined by the point (0,0,0) and the unit intercepts on the
positive x-, y-, and z-axes. Note that (a) $\mathbf{r}\cdot d\boldsymbol{\sigma}$ is zero for three of the surfaces
and (b) each of the three remaining surfaces contributes the same amount to
the integral.
1.10.6 Show, by expansion of the surface integral, that
$$\lim_{\int d\tau \to 0} \frac{\int d\boldsymbol{\sigma}\times\mathbf{V}}{\int d\tau} = \nabla\times\mathbf{V}.$$
Hint. Choose the volume to be a differential volume, dx dy dz.
1.11 GAUSS'S THEOREM
Here we derive a useful relation between a surface integral of a vector and
the volume integral of the divergence of that vector. Let us assume that the
vector V and its first derivatives are continuous over the region of interest.
Then Gauss's theorem states that
$$\int_S \mathbf{V}\cdot d\boldsymbol{\sigma} = \int_V \nabla\cdot\mathbf{V}\,d\tau. \tag{1.94a}$$
In words, the surface integral of a vector over a closed surface equals the
volume integral of the divergence of that vector integrated over the volume
enclosed by the surface.
Imagine that volume V is subdivided into an arbitrarily large number of
tiny (differential) parallelepipeds. For each parallelepiped
$$\sum_{\text{six surfaces}} \mathbf{V}\cdot d\boldsymbol{\sigma} = \nabla\cdot\mathbf{V}\,d\tau \tag{1.94b}$$
from the analysis of Section 1.7, Eq. 1.61, with $\rho\mathbf{v}$ replaced by V. The summation
is over the six faces of the parallelepiped. Summing over all parallelepipeds,
we find that the $\mathbf{V}\cdot d\boldsymbol{\sigma}$ terms cancel (pairwise) for all interior faces; only the
contributions of the exterior surfaces survive (Fig. 1.24). Analogous to the
definition of a Riemann integral as the limit of a sum, we take the limit as the
number of parallelepipeds approaches infinity (→ ∞) and the dimensions of
each approach zero (→ 0).
$$\sum_{\text{exterior surfaces}} \mathbf{V}\cdot d\boldsymbol{\sigma} = \sum_{\text{volumes}} \nabla\cdot\mathbf{V}\,d\tau \quad\longrightarrow\quad \int_S \mathbf{V}\cdot d\boldsymbol{\sigma} = \int_V \nabla\cdot\mathbf{V}\,d\tau.$$
The result is Eq. 1.94a, Gauss's theorem.
FIG. 1.24 Exact cancellation of $d\boldsymbol{\sigma}$'s on interior surfaces. No cancellation on
exterior surface.
From a physical point of view Eq. 1.61 has established $\nabla\cdot\mathbf{V}$ as the net
outflow of fluid per unit volume. The volume integral then gives the total net
outflow. But the surface integral $\int \mathbf{V}\cdot d\boldsymbol{\sigma}$ is just another way of expressing this
same quantity, which is the equality, Gauss's theorem.
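Gauss's theorem can be checked numerically on a simple region. The sketch below is an illustration under our own assumptions: the test field V = (xy, yz, zx), the unit cube, and the midpoint-rule grids are arbitrary choices and not part of the text. Both sides of Eq. 1.94a are computed independently.

```python
def midpoints(n):
    # Midpoints of n equal subintervals of [0, 1].
    return [(i + 0.5) / n for i in range(n)]


def field(x, y, z):
    # Hypothetical test field V = (x*y, y*z, z*x); its divergence is x + y + z.
    return x * y, y * z, z * x


def divergence(x, y, z):
    return y + z + x


def flux_through_unit_cube(v, n=20):
    """Outward flux of v through the six faces of the cube [0, 1]^3."""
    pts = midpoints(n)
    d_area = 1.0 / (n * n)
    total = 0.0
    for a in pts:
        for b in pts:
            total += (v(1.0, a, b)[0] - v(0.0, a, b)[0]) * d_area  # faces x = 1, x = 0
            total += (v(a, 1.0, b)[1] - v(a, 0.0, b)[1]) * d_area  # faces y = 1, y = 0
            total += (v(a, b, 1.0)[2] - v(a, b, 0.0)[2]) * d_area  # faces z = 1, z = 0
    return total


def volume_integral_over_unit_cube(f, n=20):
    """Midpoint-rule integral of the scalar function f over [0, 1]^3."""
    pts = midpoints(n)
    d_tau = 1.0 / n ** 3
    return sum(f(x, y, z) for x in pts for y in pts for z in pts) * d_tau


surface_side = flux_through_unit_cube(field)
volume_side = volume_integral_over_unit_cube(divergence)
print(surface_side, volume_side)
```

For this field both integrands are linear on the cube, so the midpoint rule is exact and both sides should come out equal to 3/2.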
GREEN'S THEOREM
A frequently useful corollary of Gauss's theorem is a relation known as
Green's theorem. If u and v are two scalar functions, we have the identities
$$\nabla\cdot(u\,\nabla v) = u\,\nabla\cdot\nabla v + (\nabla u)\cdot(\nabla v), \tag{1.95}$$
$$\nabla\cdot(v\,\nabla u) = v\,\nabla\cdot\nabla u + (\nabla v)\cdot(\nabla u). \tag{1.96}$$
Subtracting Eq. 1.96 from Eq. 1.95, integrating over a volume (u, v, and their
derivatives, assumed continuous), and applying Eq. 1.94 (Gauss's theorem),
we obtain
$$\int_V (u\,\nabla\cdot\nabla v - v\,\nabla\cdot\nabla u)\,d\tau = \int_S (u\,\nabla v - v\,\nabla u)\cdot d\boldsymbol{\sigma}. \tag{1.97}$$
This is Green's theorem. We use it for developing Green's functions, Chapters
8 and 16. An alternate form of Green's theorem derived from Eq. 1.95 alone is
$$\int_S u\,\nabla v\cdot d\boldsymbol{\sigma} = \int_V u\,\nabla\cdot\nabla v\,d\tau + \int_V \nabla u\cdot\nabla v\,d\tau. \tag{1.98}$$
This is the form of Green's theorem used in Section 1.15.
ALTERNATE FORMS OF GAUSS'S THEOREM
Although Eq. 1.94 involving the divergence is by far the most important
form of Gauss's theorem, volume integrals involving the gradient and the curl
may also appear. Suppose
$$\mathbf{V}(x,y,z) = V(x,y,z)\,\mathbf{a}, \tag{1.99}$$
in which a is a vector with constant magnitude and constant but arbitrary
direction. (You pick the direction, but once you have chosen it, hold it fixed.)
Equation 1.94 becomes
$$\mathbf{a}\cdot\int_S V\,d\boldsymbol{\sigma} = \int_V \nabla\cdot\mathbf{a}V\,d\tau = \mathbf{a}\cdot\int_V \nabla V\,d\tau \tag{1.100}$$
by Eq. 1.62a. This may be rewritten
$$\mathbf{a}\cdot\left[\int_S V\,d\boldsymbol{\sigma} - \int_V \nabla V\,d\tau\right] = 0. \tag{1.101}$$
Since $|\mathbf{a}| \neq 0$ and its direction is arbitrary, meaning that the cosine of the
included angle cannot always vanish, the term in brackets must be zero.¹ The
result is
$$\int_S V\,d\boldsymbol{\sigma} = \int_V \nabla V\,d\tau. \tag{1.102}$$
In a similar manner, using $\mathbf{V} = \mathbf{a}\times\mathbf{P}$ in which a is a constant vector, we may
show
$$\int_S d\boldsymbol{\sigma}\times\mathbf{P} = \int_V \nabla\times\mathbf{P}\,d\tau. \tag{1.103}$$
These last two forms of Gauss's theorem are used in the vector form of Kirchhoff
diffraction theory. They may also be used to verify Eqs. 1.90 and 1.92.
Gauss's theorem may also be extended to dyadics or tensors (see Section
3.5).
¹ This exploitation of the arbitrary nature of a part of a problem is a valuable
and widely used technique. The arbitrary vector is used again in Sections 1.12
and 1.13. Other examples appear in Section 1.14 (integrands equated) and in
Section 3.3, quotient rule.
EXERCISES
1.11.1 Using Gauss's theorem prove that
$$\int_S d\boldsymbol{\sigma} = 0,$$
if S is a closed surface.
1.11.2 Show that
$$\frac{1}{3}\int_S \mathbf{r}\cdot d\boldsymbol{\sigma} = V,$$
where V is the volume enclosed by the closed surface S.
Note. This is a generalization of Exercise 1.10.5.
1.11.3 If $\mathbf{B} = \nabla\times\mathbf{A}$, show that
$$\int_S \mathbf{B}\cdot d\boldsymbol{\sigma} = 0$$
for any closed surface S.
1.11.4 Over some volume V let ψ be a solution of Laplace's equation (with the
derivatives appearing there continuous). Prove that the integral over any closed
surface in V of the normal derivative of ψ ($\partial\psi/\partial n$, or $\nabla\psi\cdot\mathbf{n}$) will be zero.
1.11.5 In analogy to the integral definitions of gradient, divergence, and curl of Section
1.10, show that
$$\nabla\cdot\nabla\varphi = \lim_{\int d\tau \to 0} \frac{\int \nabla\varphi\cdot d\boldsymbol{\sigma}}{\int d\tau}.$$
1.11.6 The electric displacement vector D satisfies the Maxwell equation $\nabla\cdot\mathbf{D} = \rho$
where ρ is the charge density (per unit volume). At the boundary between two
media there is a surface charge density σ (per unit area). Show that a boundary
condition for D is
$$(\mathbf{D}_2 - \mathbf{D}_1)\cdot\mathbf{n} = \sigma.$$
n is a unit vector normal to the surface and out of medium 1.
Hint. Consider a thin pillbox as shown in the figure.
1.11.7 From Eq. 1.62a with V the electric field E, and f the electrostatic potential φ,
show that
$$\int \rho\varphi\,d\tau = \varepsilon_0\int E^2\,d\tau.$$
This corresponds to a three-dimensional integration by parts.
Hint. $\mathbf{E} = -\nabla\varphi$, $\nabla\cdot\mathbf{E} = \rho/\varepsilon_0$. You may assume that φ vanishes at large r at
least as fast as $r^{-1}$.
1.11.8 A particular steady-state electric current distribution is localized in space.
Choosing a bounding surface far enough out so that the current density J is zero
everywhere on the surface, show that
$$\int \mathbf{J}\,d\tau = 0.$$
Hint. Take one component of J at a time. With $\nabla\cdot\mathbf{J} = 0$, show that $J_i = \nabla\cdot x_i\mathbf{J}$
and apply Gauss's theorem.
1.11.9 The creation of a localized system of steady electric currents (current density
J) and magnetic fields may be shown to require an amount of work
$$W = \frac{1}{2}\int \mathbf{H}\cdot\mathbf{B}\,d\tau.$$
Transform this into
$$W = \frac{1}{2}\int \mathbf{J}\cdot\mathbf{A}\,d\tau.$$
Here A is the magnetic vector potential: $\nabla\times\mathbf{A} = \mathbf{B}$.
Hint. In Maxwell's equations take the displacement current term $\partial\mathbf{D}/\partial t = 0$.
If the fields and currents are localized, a bounding surface may be taken far
enough out so that the integrals of the fields and currents over the surface
yield zero.
1.11.10 Prove the generalization of Green's theorem:
$$\int_V (v\mathcal{L}u - u\mathcal{L}v)\,d\tau = \oint_S p\,(v\nabla u - u\nabla v)\cdot d\boldsymbol{\sigma}.$$
Here $\mathcal{L}$ is the self-adjoint operator (Section 9.1):
$$\mathcal{L} = \nabla\cdot[p(\mathbf{r})\nabla] + q(\mathbf{r}),$$
and p, q, u, and v are functions of position, p and q having continuous first
derivatives and u and v having continuous second derivatives.
Note. This generalized Green's theorem appears in Sections 8.7 and 16.6.
1.12 STOKES'S THEOREM
Gauss's theorem relates the volume integral of a derivative of a function to
an integral of the function over the closed surface bounding the volume. Here
we consider an analogous relation between the surface integral of a derivative
of a function and the line integral of the function, the path of integration being
the perimeter bounding the surface.
Let us take the surface and subdivide it into a network of arbitrarily small
rectangles. In Section 1.8 we showed that the circulation about such a
differential rectangle (in the xy-plane) is $\nabla\times\mathbf{V}|_z\,dx\,dy$. From Eq. 1.71 applied to one
differential rectangle
$$\sum_{\text{four sides}} \mathbf{V}\cdot d\boldsymbol{\lambda} = \nabla\times\mathbf{V}\cdot d\boldsymbol{\sigma}. \tag{1.104}$$
We sum over all the little rectangles as in the definition of a Riemann integral.
The surface contributions (right-hand side of Eq. 1.104) are added together.
The line integrals (left-hand side of Eq. 1.104) of all interior line segments cancel
identically. Only the line integral around the perimeter survives (Fig. 1.25).
Taking the usual limit as the number of rectangles approaches infinity while
dx → 0, dy → 0, we have
$$\sum_{\text{exterior line segments}} \mathbf{V}\cdot d\boldsymbol{\lambda} = \sum_{\text{rectangles}} \nabla\times\mathbf{V}\cdot d\boldsymbol{\sigma} \quad\longrightarrow\quad \oint \mathbf{V}\cdot d\boldsymbol{\lambda} = \int_S \nabla\times\mathbf{V}\cdot d\boldsymbol{\sigma}. \tag{1.105}$$
FIG. 1.25 Exact cancellation on interior paths. No cancellation on exterior path.
This is Stokes's theorem. The surface integral on the right is over the surface
bounded by the perimeter or contour for the line integral on the left.
This demonstration of Stokes's theorem is limited by the fact that we used a
Maclaurin expansion of $\mathbf{V}(x,y,z)$ in establishing Eq. 1.71 in Section 1.8.
Actually we need only demand that the curl of $\mathbf{V}(x,y,z)$ exists and that it be
integrable over the surface. A proof of the Cauchy integral theorem analogous
to the development of Stokes's theorem here but using these less restrictive
conditions appears in Section 6.3.
Stokes's theorem obviously applies to an open surface. It is possible to
consider a closed surface as a limiting case of an open surface with the opening (and
therefore the perimeter) shrinking to zero. This is the point of Exercise 1.12.7.
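Stokes's theorem, Eq. 1.105, can be illustrated numerically for a flat surface. The sketch below rests on our own assumptions: the planar test field V = (-y², xy), the unit square in the xy-plane with its normal along +z, and the midpoint-rule discretization are illustrative choices only.

```python
def field(x, y):
    # Hypothetical planar test field V = (-y**2, x*y); (curl V)_z = 3*y.
    return -y * y, x * y


def curl_z(x, y):
    return 3.0 * y


def circulation_unit_square(v, n=200):
    """Counterclockwise line integral of v around the square [0, 1] x [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += v(t, 0.0)[0] * h  # bottom edge, traversed in +x
        total += v(1.0, t)[1] * h  # right edge, traversed in +y
        total -= v(t, 1.0)[0] * h  # top edge, traversed in -x
        total -= v(0.0, t)[1] * h  # left edge, traversed in -y
    return total


def curl_flux_unit_square(c, n=200):
    """Surface integral of (curl V)_z over the same square (normal along +z)."""
    h = 1.0 / n
    return sum(c((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h


line_side = circulation_unit_square(field)
surface_side = curl_flux_unit_square(curl_z)
print(line_side, surface_side)
```

The counterclockwise traversal matches the right-hand rule for a normal along +z, and for this field both sides should equal 3/2.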
ALTERNATE FORMS OF STOKES'S THEOREM
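As with Gauss's theorem, other relations between surface and line integrals
are possible. We find
$$\int_S d\boldsymbol{\sigma}\times\nabla\varphi = \oint \varphi\,d\boldsymbol{\lambda} \tag{1.106}$$
and
$$\int_S (d\boldsymbol{\sigma}\times\nabla)\times\mathbf{P} = \oint d\boldsymbol{\lambda}\times\mathbf{P}. \tag{1.107}$$
Equation 1.106 may readily be verified by the substitution $\mathbf{V} = \mathbf{a}\varphi$ in which a
is a vector of constant magnitude and of constant direction, as in Section 1.11.
Substituting into Stokes's theorem, Eq. 1.105,
$$\int_S (\nabla\times\mathbf{a}\varphi)\cdot d\boldsymbol{\sigma} = -\int_S \mathbf{a}\times\nabla\varphi\cdot d\boldsymbol{\sigma} = -\mathbf{a}\cdot\int_S \nabla\varphi\times d\boldsymbol{\sigma}. \tag{1.108}$$
For the line integral
$$\oint \mathbf{a}\varphi\cdot d\boldsymbol{\lambda} = \mathbf{a}\cdot\oint \varphi\,d\boldsymbol{\lambda}, \tag{1.109}$$
and we obtain
$$\mathbf{a}\cdot\left(\oint \varphi\,d\boldsymbol{\lambda} + \int_S \nabla\varphi\times d\boldsymbol{\sigma}\right) = 0. \tag{1.110}$$
Since the choice of direction of a is arbitrary, the expression in parentheses
must vanish, thus verifying Eq. 1.106. Equation 1.107 may be derived similarly
by using $\mathbf{V} = \mathbf{a}\times\mathbf{P}$, in which a is again a constant vector.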
Both Stokes's and Gauss's theorems are of tremendous importance in a wide
variety of problems involving vector calculus. Some idea of their power and
versatility may be obtained from the exercises of Sections 1.11 and 1.12 and the
development of potential theory in Sections 1.13 and 1.14.
EXERCISES
1.12.1 Given a vector $\mathbf{t} = -\mathbf{i}y + \mathbf{j}x$. With the help of Stokes's theorem, show that
the integral around a continuous closed curve in the xy-plane
$$\frac{1}{2}\oint \mathbf{t}\cdot d\boldsymbol{\lambda} = \frac{1}{2}\oint (x\,dy - y\,dx) = A,$$
the area enclosed by the curve.
1.12.2 The calculation of the magnetic moment of a current loop leads to the line
integral
$$\oint \mathbf{r}\times d\mathbf{r}.$$
(a) Integrate around the perimeter of a current loop (in the xy-plane) and
show that the scalar magnitude of this line integral is twice the area of
the enclosed surface.
(b) The perimeter of an ellipse is described by $\mathbf{r} = \mathbf{i}a\cos\theta + \mathbf{j}b\sin\theta$. From
part (a) show that the area of the ellipse is $\pi ab$.
1.12.3 Evaluate $\oint \mathbf{r}\times d\mathbf{r}$ by using the alternate form of Stokes's theorem given by
Eq. 1.107:
$$\int_S (d\boldsymbol{\sigma}\times\nabla)\times\mathbf{P} = \oint d\boldsymbol{\lambda}\times\mathbf{P}.$$
Take the loop to be entirely in the xy-plane.
1.12.4 In steady state the magnetic field H satisfies the Maxwell equation $\nabla\times\mathbf{H} = \mathbf{J}$,
where J is the current density (per square meter). At the boundary between
two media there is a surface current density K (per meter). Show that a boundary
condition on H is
$$\mathbf{n}\times(\mathbf{H}_2 - \mathbf{H}_1) = \mathbf{K}.$$
n is a unit vector normal to the surface and out of medium 1.
Hint. Consider a narrow loop perpendicular to the interface as shown in the
figure.
[Figure: interface with medium 2 above and medium 1 below]
1.12.5 From Maxwell's equations, $\nabla\times\mathbf{H} = \mathbf{J}$ with J here the current density and
E = 0. Show from this that
$$\oint \mathbf{H}\cdot d\mathbf{r} = I,$$
where I is the net electric current enclosed by the loop integral. These are the
differential and integral forms of Ampere's law of magnetism.
1.12.6 A magnetic induction B is generated by electric current in a ring of radius R.
Show that the magnitude of the vector potential A ($\mathbf{B} = \nabla\times\mathbf{A}$) at the ring is
$$|\mathbf{A}| = \frac{\varphi}{2\pi R},$$
where φ is the total magnetic flux passing through the ring.
Note. A is tangential to the ring.
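1.12.7 Prove that
$$\int_S \nabla\times\mathbf{V}\cdot d\boldsymbol{\sigma} = 0,$$
if S is a closed surface.
1.12.8 Evaluate $\oint \mathbf{r}\cdot d\mathbf{r}$ (Exercise 1.10.4) by Stokes's theorem.
1.12.9 Prove that
$$\oint u\,\nabla v\cdot d\boldsymbol{\lambda} = -\oint v\,\nabla u\cdot d\boldsymbol{\lambda}.$$
1.12.10 Prove that
$$\oint u\,\nabla v\cdot d\boldsymbol{\lambda} = \int_S (\nabla u)\times(\nabla v)\cdot d\boldsymbol{\sigma}.$$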
1.13 POTENTIAL THEORY
Scalar Potential
If a force over a given region of space S can be expressed as the negative
gradient of a scalar function φ,
$$\mathbf{F} = -\nabla\varphi, \tag{1.111}$$
we call φ a scalar potential. The force F appearing as the negative gradient of
a single-valued scalar potential is labeled a conservative force. We want to know
when a scalar potential function exists. To answer this question we establish
two other relations as equivalent to Eq. 1.111. These are
$$\nabla\times\mathbf{F} = 0 \tag{1.112}$$
and
$$\oint \mathbf{F}\cdot d\mathbf{r} = 0, \tag{1.113}$$
for every closed path in our region S. We proceed to show that each of these
three equations implies the other two.
Let us start with
$$\mathbf{F} = -\nabla\varphi. \tag{1.114}$$
Then
$$\nabla\times\mathbf{F} = -\nabla\times\nabla\varphi = 0 \tag{1.115}$$
by Eq. 1.77; that is, Eq. 1.111 implies Eq. 1.112. Turning to the line integral, we have
$$\oint \mathbf{F}\cdot d\mathbf{r} = -\oint \nabla\varphi\cdot d\mathbf{r} = -\oint d\varphi \tag{1.116}$$
using Eq. 1.56. Now dφ integrates to give φ. Since we have specified a closed
loop, the end points coincide and we get zero for every closed path in our
region S for which Eq. 1.111 holds. It is important to note the restriction here
that the potential be single-valued and that Eq. 1.111 hold for all points in S.
This problem may arise in using a scalar magnetic potential, a perfectly valid
procedure as long as no net current is encircled. As soon as we choose a path
in space that encircles a net current, the scalar magnetic potential ceases to be
single-valued and our analysis no longer applies.
Continuing this demonstration of equivalence, let us assume that Eq. 1.113
holds. If $\oint \mathbf{F}\cdot d\mathbf{r} = 0$ for all paths in S, we see that the value of the integral
joining two distinct points A and B is independent of the path (Fig. 1.26). Our
premise is that
$$\oint_{ACBDA} \mathbf{F}\cdot d\mathbf{r} = 0. \tag{1.117}$$
Therefore
$$\int_{ACB} \mathbf{F}\cdot d\mathbf{r} = -\int_{BDA} \mathbf{F}\cdot d\mathbf{r} = \int_{ADB} \mathbf{F}\cdot d\mathbf{r}, \tag{1.118}$$
reversing the sign by reversing the direction of integration. Physically, this
means that the work done in going from A to B is independent of the path and
that the work done in going around a closed path is zero. This is the reason for
labeling such a force conservative: Energy is conserved.
FIG. 1.26 Possible paths for doing work
With the result shown in Eq. 1.118, we have the work done dependent only
on the end points, A and B. That is,
$$\text{work done by force} = \int_A^B \mathbf{F}\cdot d\mathbf{r} = \varphi(A) - \varphi(B). \tag{1.119}$$
Eq. 1.119 defines a scalar potential (strictly speaking, the difference in potential
between points A and B) and provides a means of calculating the potential. If
point B is taken as a variable, say, (x,y,z), then differentiation with respect to
x, y, and z will recover Eq. 1.111.
The choice of sign on the right-hand side is arbitrary. The choice here is made
to achieve agreement with Eq. 1.111 and to ensure that water will run downhill
rather than uphill. For points A and B separated by a length dr, Eq. 1.119
becomes
$$\mathbf{F}\cdot d\mathbf{r} = -d\varphi = -\nabla\varphi\cdot d\mathbf{r}. \tag{1.120}$$
This may be rewritten
$$(\mathbf{F} + \nabla\varphi)\cdot d\mathbf{r} = 0, \tag{1.121}$$
and since dr is arbitrary, Eq. 1.111 must follow.
If
$$\oint \mathbf{F}\cdot d\mathbf{r} = 0, \tag{1.122}$$
we may obtain Eq. 1.112 by using Stokes's theorem (Eq. 1.105).
$$\oint \mathbf{F}\cdot d\mathbf{r} = \int \nabla\times\mathbf{F}\cdot d\boldsymbol{\sigma}. \tag{1.123}$$
If we take the path of integration to be the perimeter of an arbitrary differential
area $d\boldsymbol{\sigma}$, the integrand in the surface integral must vanish. Hence Eq. 1.113
implies Eq. 1.112.
Finally, if $\nabla\times\mathbf{F} = 0$, we need only reverse our statement of Stokes's
theorem (Eq. 1.123) to derive Eq. 1.113. Then, by Eqs. 1.119 to 1.121, the
initial statement $\mathbf{F} = -\nabla\varphi$ is derived. The triple equivalence is demonstrated
(Fig. 1.27).
FIG. 1.27 Equivalent formulations: $\mathbf{F} = -\nabla\varphi$ (1.111), $\nabla\times\mathbf{F} = 0$ (1.112),
$\oint \mathbf{F}\cdot d\mathbf{r} = 0$ (1.113)
To summarize, a single-valued scalar potential function φ exists if and only
if F is irrotational or the work done around every closed loop is zero. The
gravitational and electrostatic force fields given by Eq. 1.75 are irrotational and
therefore are conservative. Gravitational and electrostatic scalar potentials
exist. Now, by calculating the work done (Eq. 1.119), we proceed to determine
three potentials, Fig. 1.28.
EXAMPLE 1.13.1 Gravitational Potential
Find the scalar potential for the gravitational force on a unit mass $m_1$,
$$\mathbf{F}_G = -\frac{Gm_1m_2\,\mathbf{r}_0}{r^2} = -\frac{k\,\mathbf{r}_0}{r^2}. \tag{1.124}$$
By integrating Eq. 1.111 from infinity into position r, we obtain
$$\varphi_G(r) - \varphi_G(\infty) = -\int_\infty^r \mathbf{F}_G\cdot d\mathbf{r}. \tag{1.125}$$
By use of $\mathbf{F}_G = -\mathbf{F}_{\text{applied}}$, a comparison with Eq. 1.88 shows that the potential
is the work done in bringing the unit mass in from infinity. (We can define only
potential difference. Here we arbitrarily assign infinity to be a zero of potential.)
The right-hand side of Eq. 1.125 is negative, meaning that $\varphi_G(r)$
is negative. Since $\mathbf{F}_G$ is radial, we obtain a contribution to φ only when dr is
radial or
$$\varphi_G(r) = -\int_r^\infty \frac{k\,dr}{r^2} = -\frac{k}{r} = -\frac{Gm_1m_2}{r}.$$
The final negative sign is a consequence of the attractive force of gravity.
EXAMPLE 1.13.2 Centrifugal potential
Calculate the scalar potential for the centrifugal force per unit mass,
$\mathbf{F}_c = \omega^2 r\,\mathbf{r}_0$, radially outward. Physically, this might be you on a large horizontal
spinning disk at an amusement park. Proceeding as in Example 1.13.1, but
integrating from the origin outward and taking $\varphi_c(0) = 0$, we have
$$\varphi_c(r) = -\int_0^r \mathbf{F}_c\cdot d\mathbf{r} = -\frac{\omega^2r^2}{2}.$$
If we reverse signs, taking $\mathbf{F}_{\rm SHO} = -k\mathbf{r}$, we obtain $\varphi_{\rm SHO} = \frac{1}{2}kr^2$, the simple
harmonic oscillator potential.
FIG. 1.28 Potential energy versus distance (gravitational, centrifugal, and simple
harmonic oscillator)
The gravitational, centrifugal, and simple harmonic oscillator potentials are
shown in Fig. 1.28. Clearly, the simple harmonic oscillator yields stability and
describes a restoring force. The centrifugal potential describes an unstable
situation.
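The three potentials of Fig. 1.28 can be recovered by integrating the radial force numerically, following Eq. 1.119. The sketch below is illustrative only: the constants `K` and `OMEGA`, the end points, and the midpoint-rule helper are our own assumptions, chosen so that the closed forms $-k/r$, $-\omega^2r^2/2$, and $\frac{1}{2}kr^2$ are easy to compare against.

```python
def potential_difference(radial_force, r_start, r_end, steps=20000):
    """phi(r_end) - phi(r_start) = -integral of F_r dr from r_start to r_end."""
    h = (r_end - r_start) / steps
    total = 0.0
    for i in range(steps):
        total += radial_force(r_start + (i + 0.5) * h) * h
    return -total


K = 2.0      # assumed value of k (illustrative units)
OMEGA = 1.5  # assumed angular velocity (illustrative units)


def gravity(r):
    return -K / r ** 2


def centrifugal(r):
    return OMEGA ** 2 * r


def oscillator(r):
    return -K * r


# Centrifugal and oscillator potentials at r = 2, measured from phi(0) = 0.
phi_c = potential_difference(centrifugal, 0.0, 2.0)
phi_sho = potential_difference(oscillator, 0.0, 2.0)
# Gravitational potential difference phi(4) - phi(1).
delta_phi_g = potential_difference(gravity, 1.0, 4.0)
print(phi_c, phi_sho, delta_phi_g)
```

A finite interval is used for gravity because the reference point at infinity cannot be reached on a uniform grid; the difference $\varphi_G(4) - \varphi_G(1) = -k/4 + k$ tests the same formula.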
THERMODYNAMICS—EXACT
DIFFERENTIALS
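In thermodynamics, which is sometimes called a search for exact differentials,
we encounter equations of the form
$$df = P(x,y)\,dx + Q(x,y)\,dy. \tag{1.126}$$
The usual problem is to determine whether $\int (P(x,y)\,dx + Q(x,y)\,dy)$ depends
only on the end points, that is, whether df is indeed an exact differential. The
necessary and sufficient condition is that
$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy \tag{1.126a}$$
or that
$$P(x,y) = \frac{\partial f}{\partial x}, \qquad Q(x,y) = \frac{\partial f}{\partial y}. \tag{1.126b}$$
Equations 1.126b depend on the relation
$$\frac{\partial P(x,y)}{\partial y} = \frac{\partial Q(x,y)}{\partial x}$$
being satisfied. This, however, is exactly analogous to Eq. 1.112, the requirement
that F be irrotational. Indeed, the z-component of Eq. 1.112 yields
$$\frac{\partial F_x}{\partial y} = \frac{\partial F_y}{\partial x}.$$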
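The exactness condition $\partial P/\partial y = \partial Q/\partial x$ is easy to test numerically at a few sample points. The following sketch is a rough check under our own assumptions: the helper names, the sample points, and both test differentials are hypothetical, and a finite set of points can only suggest, not prove, exactness.

```python
def partial(f, x, y, wrt, h=1e-5):
    """Central-difference partial derivative of f(x, y) with respect to 'x' or 'y'."""
    if wrt == "x":
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


def is_exact(p, q, points, tol=1e-6):
    """True if dP/dy equals dQ/dx (within tol) at every sample point."""
    return all(
        abs(partial(p, x, y, "y") - partial(q, x, y, "x")) < tol
        for x, y in points
    )


samples = [(0.3, 0.7), (1.2, -0.4), (-2.0, 1.5)]

# Hypothetical exact case: f = x**2 * y, so P = 2*x*y and Q = x**2.
exact_case = is_exact(lambda x, y: 2 * x * y, lambda x, y: x * x, samples)
# Hypothetical inexact case: P = -y, Q = x (the force of Example 1.10.1).
inexact_case = is_exact(lambda x, y: -y, lambda x, y: x, samples)
print(exact_case, inexact_case)
```

The second case fails the test, consistent with the path dependence found for that force in Example 1.10.1.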
Vector Potential
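In some branches of physics, especially electromagnetic theory, it is
convenient to introduce a vector potential A, such that a (force) field B is given by
$$\mathbf{B} = \nabla\times\mathbf{A}. \tag{1.127}$$
Clearly, if Eq. 1.127 holds, $\nabla\cdot\mathbf{B} = 0$ by Eq. 1.79 and B is solenoidal. Here we
want to develop a converse, to show that when B is solenoidal a vector potential
A exists. We demonstrate the existence of A by actually calculating it. Suppose
$\mathbf{B} = \mathbf{i}b_1 + \mathbf{j}b_2 + \mathbf{k}b_3$ and our unknown $\mathbf{A} = \mathbf{i}a_1 + \mathbf{j}a_2 + \mathbf{k}a_3$. By Eq. 1.127
$$\frac{\partial a_3}{\partial y} - \frac{\partial a_2}{\partial z} = b_1, \tag{1.128a}$$
$$\frac{\partial a_1}{\partial z} - \frac{\partial a_3}{\partial x} = b_2, \tag{1.128b}$$
$$\frac{\partial a_2}{\partial x} - \frac{\partial a_1}{\partial y} = b_3. \tag{1.128c}$$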
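Let us assume that the coordinates have been chosen so that A is parallel to the
yz-plane; that is, $a_1 = 0$.¹ Then
$$b_2 = -\frac{\partial a_3}{\partial x}, \qquad b_3 = \frac{\partial a_2}{\partial x}. \tag{1.129}$$
Integrating, we obtain
$$a_2 = \int_{x_0}^x b_3\,dx + f_2(y,z), \qquad a_3 = -\int_{x_0}^x b_2\,dx + f_3(y,z), \tag{1.130}$$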
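where $f_2$ and $f_3$ are arbitrary functions of y and z but are not functions of x.
These two equations can be checked by differentiating and recovering Eq.
1.129. Eq. 1.128a becomes
$$\frac{\partial a_3}{\partial y} - \frac{\partial a_2}{\partial z} = -\int_{x_0}^x \left(\frac{\partial b_2}{\partial y} + \frac{\partial b_3}{\partial z}\right)dx + \frac{\partial f_3}{\partial y} - \frac{\partial f_2}{\partial z} = \int_{x_0}^x \frac{\partial b_1}{\partial x}\,dx + \frac{\partial f_3}{\partial y} - \frac{\partial f_2}{\partial z}, \tag{1.131}$$
using $\nabla\cdot\mathbf{B} = 0$. Integrating with respect to x, we obtain
$$\frac{\partial a_3}{\partial y} - \frac{\partial a_2}{\partial z} = b_1(x,y,z) - b_1(x_0,y,z) + \frac{\partial f_3}{\partial y} - \frac{\partial f_2}{\partial z}. \tag{1.132}$$
Remembering that $f_3$ and $f_2$ are arbitrary functions of y and z, we choose
$$f_2 = 0, \qquad f_3 = \int_{y_0}^y b_1(x_0,y,z)\,dy, \tag{1.133}$$
so that the right-hand side of Eq. 1.132 reduces to $b_1(x,y,z)$ in agreement with
Equation 1.128a. With $f_2$ and $f_3$ given by Eq. 1.133, we can construct A.
$$\mathbf{A} = \mathbf{j}\int_{x_0}^x b_3(x,y,z)\,dx + \mathbf{k}\left[\int_{y_0}^y b_1(x_0,y,z)\,dy - \int_{x_0}^x b_2(x,y,z)\,dx\right]. \tag{1.134}$$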
This is not quite complete. We may add any constant since B is a derivative of
A. What is much more important, we may add any gradient of a scalar function
$\nabla\varphi$ without affecting B at all. Finally, the functions $f_2$ and $f_3$ are not unique.
Other choices could have been made. It will be seen in Section 1.15 that we may
still specify $\nabla\cdot\mathbf{A}$.
¹ Clearly, this can be done at any one point. It is not at all obvious that this
assumption will hold at all points; that is, A will be two dimensional. The
justification for the assumption is that it works; Eq. 1.134 satisfies Eq. 1.127.
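The construction of Eq. 1.134 can be carried out numerically and then checked by taking the curl. The sketch below makes several assumptions of its own: the solenoidal test field B = (y, z, x), the reference point $x_0 = y_0 = 0$, the midpoint-rule integrator, and the central-difference curl are all illustrative choices rather than anything prescribed by the text.

```python
def b_field(x, y, z):
    # Hypothetical solenoidal test field B = (y, z, x); div B = 0.
    return y, z, x


def integrate(f, lower, upper, steps=50):
    """Midpoint-rule integral of f from lower to upper."""
    h = (upper - lower) / steps
    return sum(f(lower + (i + 0.5) * h) for i in range(steps)) * h


def vector_potential(x, y, z, x0=0.0, y0=0.0):
    """A from Eq. 1.134 with a1 = 0, f2 = 0, and f3 chosen as in Eq. 1.133."""
    a2 = integrate(lambda s: b_field(s, y, z)[2], x0, x)
    a3 = (integrate(lambda s: b_field(x0, s, z)[0], y0, y)
          - integrate(lambda s: b_field(s, y, z)[1], x0, x))
    return 0.0, a2, a3


def curl(a, x, y, z, h=1e-3):
    """Central-difference curl of the vector function a at (x, y, z)."""
    def d(component, axis):
        step = [0.0, 0.0, 0.0]
        step[axis] = h
        plus = a(x + step[0], y + step[1], z + step[2])[component]
        minus = a(x - step[0], y - step[1], z - step[2])[component]
        return (plus - minus) / (2 * h)
    return d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1)


point = (0.3, -0.7, 1.1)
recovered_b = curl(vector_potential, *point)
print(recovered_b, b_field(*point))
```

For this linear field the integrals give $a_2 = x^2/2$ and $a_3 = y^2/2 - zx$, so the recovered curl should reproduce B at the sample point to rounding accuracy.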
EXAMPLE 1.13.3 A Magnetic Vector Potential for a Constant Magnetic
Field
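To illustrate the construction of a magnetic vector potential, we take the
special but still important case of a constant magnetic induction
$$\mathbf{B} = \mathbf{k}B_z, \tag{1.135}$$
in which $B_z$ is a constant. Equation 1.128 becomes
$$\frac{\partial a_3}{\partial y} - \frac{\partial a_2}{\partial z} = 0, \qquad \frac{\partial a_1}{\partial z} - \frac{\partial a_3}{\partial x} = 0, \qquad \frac{\partial a_2}{\partial x} - \frac{\partial a_1}{\partial y} = B_z. \tag{1.136}$$
If we assume that $a_1 = 0$, as before, then by Eq. 1.134
$$\mathbf{A} = \mathbf{j}\int^x B_z\,dx = \mathbf{j}xB_z, \tag{1.137}$$
setting a constant of integration equal to zero. It can readily be seen that this A
satisfies Eq. 1.127.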
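To show that the choice $a_1 = 0$ was not sacred or at least not required, let us
try setting $a_3 = 0$. From Eq. 1.136
$$\frac{\partial a_2}{\partial z} = 0, \tag{1.138a}$$
$$\frac{\partial a_1}{\partial z} = 0, \tag{1.138b}$$
$$\frac{\partial a_2}{\partial x} - \frac{\partial a_1}{\partial y} = B_z. \tag{1.138c}$$
We see $a_1$ and $a_2$ are independent of z or
$$a_1 = a_1(x,y), \qquad a_2 = a_2(x,y). \tag{1.139}$$
Equation 1.138c is satisfied if we take
$$a_2 = p\int^x B_z\,dx = pxB_z \tag{1.140}$$
and
$$a_1 = (p - 1)\int^y B_z\,dy = (p - 1)yB_z, \tag{1.141}$$
with p any constant. Then
$$\mathbf{A} = \mathbf{i}(p - 1)yB_z + \mathbf{j}pxB_z. \tag{1.142}$$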
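Again, Eqs. 1.127, 1.135, and 1.142 are seen to be consistent. Comparison of
Eqs. 1.137 and 1.142 shows immediately that A is not unique. The difference
between Eqs. 1.137 and 1.142 and the appearance of the parameter p in Eq.
1.142 may be accounted for by rewriting Eq. 1.142 as
$$\mathbf{A} = -\tfrac{1}{2}(\mathbf{i}y - \mathbf{j}x)B_z + (p - \tfrac{1}{2})(\mathbf{i}y + \mathbf{j}x)B_z = -\tfrac{1}{2}(\mathbf{i}y - \mathbf{j}x)B_z + (p - \tfrac{1}{2})B_z\nabla\varphi \tag{1.143}$$
with
$$\varphi = xy. \tag{1.144}$$
The first term in A corresponds to the usual form
$$\mathbf{A} = \tfrac{1}{2}(\mathbf{B}\times\mathbf{r}) \tag{1.145}$$
for B, a constant.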
To summarize this discussion of the vector potential, when a vector B is
solenoidal, a vector potential A exists such that $\mathbf{B} = \nabla\times\mathbf{A}$. A is undetermined
to within an additive gradient. This corresponds to the arbitrary zero of
potential, a constant of integration for the scalar potential.
In many problems the magnetic vector potential A will be obtained from the
current distribution that produces the magnetic induction B. This means solving
Poisson's (vector) equation (see Exercise 1.14.4).
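The nonuniqueness of A in Example 1.13.3 can be seen numerically: every member of the one-parameter family of Eq. 1.142 has the same curl. The sketch below is illustrative; the value `B_Z`, the sample point, and the list of p values are arbitrary assumptions, and only the z-component of the curl is computed because these potentials have no z-dependence and no z-component.

```python
def make_potential(p, b_z):
    """Return A(x, y, z) = i (p - 1) y B_z + j p x B_z, as in Eq. 1.142."""
    def a(x, y, z):
        return (p - 1.0) * y * b_z, p * x * b_z, 0.0
    return a


def curl_z(a, x, y, z, h=1e-3):
    """z-component of curl A by central differences."""
    d_ay_dx = (a(x + h, y, z)[1] - a(x - h, y, z)[1]) / (2 * h)
    d_ax_dy = (a(x, y + h, z)[0] - a(x, y - h, z)[0]) / (2 * h)
    return d_ay_dx - d_ax_dy


B_Z = 0.8  # assumed field strength, arbitrary units
curls = [curl_z(make_potential(p, B_Z), 0.4, -1.3, 2.0)
         for p in (0.0, 0.5, 1.0, 3.0)]
print(curls)
```

The choice p = 1 reproduces Eq. 1.137 and p = 1/2 reproduces Eq. 1.145; all four values of p should return the same $B_z$.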
EXERCISES
1.13.1 If a force F is given by
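$$\mathbf{F} = (x^2 + y^2 + z^2)^n(\mathbf{i}x + \mathbf{j}y + \mathbf{k}z),$$
find
(a) $\nabla\cdot\mathbf{F}$,
(b) $\nabla\times\mathbf{F}$,
(c) A scalar potential $\varphi(x,y,z)$ so that $\mathbf{F} = -\nabla\varphi$.
(d) For what value of the exponent n does the scalar potential diverge at both
the origin and infinity?
ANS. (a) $(2n + 3)r^{2n}$ (b) 0 (c) $-\frac{1}{2n+2}r^{2n+2}$, $n \neq -1$ (d) $n = -1$, $\varphi = -\ln r$.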
1.13.2 A sphere of radius a is uniformly charged (throughout its volume). Construct
the electrostatic potential $\varphi(r)$ for 0 < r < ∞.
Hint. In Section 1.14 it is shown that the Coulomb force on a test charge at
$r = r_0$ depends only on the charge at distances less than $r_0$ and is independent
of the charge at distances greater than $r_0$. Note that this applies to a spherically
symmetric charge distribution.
1.13.3 The usual problem in classical mechanics is to calculate the motion of a particle
given the potential. For a uniform density ($\rho_0$), nonrotating massive sphere,
Gauss's law of Section 1.14 leads to a gravitational force on a unit mass $m_0$
at a point $r_0$ produced by the attraction of the mass at $r < r_0$. The mass at
$r > r_0$ contributes nothing to the force.
(a) Show that $\mathbf{F}/m_0 = -(4\pi G\rho_0/3)\,\mathbf{r}$, 0 < r < a where a is the radius of the
sphere.
(b) Find the corresponding gravitational potential, 0 < r < a.
(c) Imagine a vertical hole running completely through the center of the earth
and out to the far side. Neglecting the rotation of the earth and assuming
a uniform density $\rho_0 = 5.5$ gm/cm³, calculate the nature of the motion of
a particle dropped into the hole. What is its period?
Note. $F \propto r$ is actually a very poor approximation. Because of varying density,
the approximation F = constant, along the outer half of a radial line and
$F \propto r$ along the inner half is a much closer approximation.
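1.13.4 The origin of the cartesian coordinates is at the Earth's center. The moon is
on the z-axis, a fixed distance R away (center-to-center distance). The tidal
force exerted by the moon on a particle at the earth's surface (point x, y, z)
is given by
$$F_x = -GMm\frac{x}{R^3}, \qquad F_y = -GMm\frac{y}{R^3}, \qquad F_z = +2GMm\frac{z}{R^3}.$$
Find the potential that yields this tidal force.
ANS. $-\dfrac{GMm}{R^3}\left(z^2 - \tfrac{1}{2}x^2 - \tfrac{1}{2}y^2\right)$
In terms of the Legendre polynomials of Chapter 12 this becomes
$$-\frac{GMm}{R^3}\,r^2P_2(\cos\theta).$$
1.13.5 A long straight wire carrying a current I produces a magnetic induction B with
components
$$\mathbf{B} = \frac{\mu_0I}{2\pi}\left(-\frac{y}{x^2+y^2},\ \frac{x}{x^2+y^2},\ 0\right).$$
Find a magnetic vector potential, A.
ANS. $\mathbf{A} = -\mathbf{k}(\mu_0I/4\pi)\ln(x^2 + y^2)$.
(This solution is not unique.)
1.13.6 If
$$\mathbf{B} = \frac{\mathbf{r}_0}{r^2} = \left(\frac{x}{r^3},\ \frac{y}{r^3},\ \frac{z}{r^3}\right),$$
find a vector A such that $\nabla\times\mathbf{A} = \mathbf{B}$. One possible solution is
$$\mathbf{A} = \frac{\mathbf{i}yz}{r(x^2+y^2)} - \frac{\mathbf{j}xz}{r(x^2+y^2)}.$$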
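1.13.7 Show that the pair of equations
$$\mathbf{A} = \tfrac{1}{2}(\mathbf{B}\times\mathbf{r}), \qquad \mathbf{B} = \nabla\times\mathbf{A},$$
is satisfied by any constant vector B (any orientation).
1.13.8 Vector B is formed by the product of two gradients
$$\mathbf{B} = (\nabla u)\times(\nabla v),$$
where u and v are scalar functions.
(a) Show that B is solenoidal.
(b) Show that
$$\mathbf{A} = \tfrac{1}{2}(u\,\nabla v - v\,\nabla u)$$
is a vector potential for B in that
$$\mathbf{B} = \nabla\times\mathbf{A}.$$
1.13.9 The magnetic induction B is related to the magnetic vector potential A by
$\mathbf{B} = \nabla\times\mathbf{A}$. By Stokes's theorem
$$\int \mathbf{B}\cdot d\boldsymbol{\sigma} = \oint \mathbf{A}\cdot d\mathbf{r}.$$
Show that each side of this equation is invariant under the gauge transformation,
$\mathbf{A} \to \mathbf{A} + \nabla\varphi$.
Note. Take the function φ to be single-valued. The complete gauge
transformation is considered in Exercise 3.7.4.
1.13.10 With E the electric field and A the magnetic vector potential, show that
$[\mathbf{E} + \partial\mathbf{A}/\partial t]$ is irrotational and that therefore we may write
$$\mathbf{E} = -\nabla\varphi - \frac{\partial\mathbf{A}}{\partial t}.$$
1.13.11 The total force on a charge q moving with velocity v is
$$\mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}).$$
Using the scalar and vector potentials, show that
$$\mathbf{F} = q\left[-\nabla\varphi - \frac{d\mathbf{A}}{dt} + \nabla(\mathbf{A}\cdot\mathbf{v})\right].$$
Note that we now have a total time derivative of A in place of the partial
derivative of Ex. 1.13.10.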
1.14 GAUSS'S LAW, POISSON'S EQUATION
Gauss's Law
Consider a point electric charge q at the origin of our coordinate system.
This produces an electric field E given by¹
$$\mathbf{E} = \frac{q\,\mathbf{r}_0}{4\pi\varepsilon_0r^2}. \tag{1.146}$$
We now derive Gauss's law which states that the surface integral
$$\int_S \mathbf{E}\cdot d\boldsymbol{\sigma} = \begin{cases} q/\varepsilon_0 \\ 0 \end{cases} \tag{1.147}$$
that is, $q/\varepsilon_0$ if the closed surface S includes the origin (where q is located) and zero if the
surface does not include the origin (Fig. 1.29). The surface S is any closed
surface; it need not be spherical.
Using Gauss's theorem, Eq. 1.94 (and neglecting the $q/4\pi\varepsilon_0$), we obtain
$$\int_S \frac{\mathbf{r}_0\cdot d\boldsymbol{\sigma}}{r^2} = \int_V \nabla\cdot\left(\frac{\mathbf{r}_0}{r^2}\right)d\tau = 0 \tag{1.148}$$
by Example 1.7.2, provided the surface S does not include the origin, where the
integrands are not defined. This proves the second part of Gauss's law.
FIG. 1.29
¹ The electric field E is defined as the force per unit charge on a small stationary
test charge $q_t$: $\mathbf{E} = \mathbf{F}/q_t$. From Coulomb's law the force on $q_t$ due to q is
$\mathbf{F} = (qq_t/4\pi\varepsilon_0)(\mathbf{r}_0/r^2)$. When we divide by $q_t$, Eq. 1.146 follows.
FIG. 1.30 Exclusion of the origin
The first part, in which the surface S must include the origin, may be handled
by surrounding the origin with a small sphere S' of radius $\delta$ (Fig. 1.30). So that
there will be no question what is inside and what is outside, imagine the volume
outside the outer surface S and the volume inside surface S' ($r < \delta$) connected
by a small hole. This joins surfaces S and S', combining them into one single
simply connected closed surface. Because the radius of the imaginary hole may
be made vanishingly small, there is no additional contribution to the surface
integral. The inner surface is deliberately chosen to be spherical so that we will
be able to integrate over it. Gauss's theorem now applies to the volume between
S and S' without any difficulty. We have
\[ \oint_S \frac{\mathbf{r}_0\cdot d\boldsymbol{\sigma}}{r^2} + \oint_{S'} \frac{\mathbf{r}_0\cdot d\boldsymbol{\sigma}'}{\delta^2} = 0. \tag{1.149} \]
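We may evaluate the second integral, for $d\boldsymbol{\sigma}' = -\mathbf{r}_0\,\delta^2\,d\Omega$, in which $d\Omega$ is an
element of solid angle. The minus sign appears because we agreed in Section
1.10 to have the positive normal $\mathbf{r}_0'$ outward from the volume. In this case the
outward $\mathbf{r}_0'$ is in the negative radial direction, $\mathbf{r}_0' = -\mathbf{r}_0$. By integrating over all
angles, we have
\[ \oint_{S'} \frac{\mathbf{r}_0\cdot d\boldsymbol{\sigma}'}{\delta^2} = -\oint_{S'} \frac{\mathbf{r}_0\cdot\mathbf{r}_0\,\delta^2\,d\Omega}{\delta^2} = -4\pi, \tag{1.150} \]
independent of the radius $\delta$. With the constants from Eq. 1.146, this results in
\[ \oint_S \mathbf{E}\cdot d\boldsymbol{\sigma} = \frac{q}{4\pi\varepsilon_0}\,4\pi = \frac{q}{\varepsilon_0}, \tag{1.151} \]
completing the proof of Gauss's law. Notice carefully that although the surface
S may be spherical, it need not be spherical.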
Going just a bit further, we consider a distributed charge so that
\[ q = \int_V \rho\,d\tau. \tag{1.152} \]
Equation 1.151 still applies, with q now interpreted as the total distributed
charge enclosed by surface S:
\[ \oint_S \mathbf{E}\cdot d\boldsymbol{\sigma} = \int_V \frac{\rho}{\varepsilon_0}\,d\tau. \tag{1.153} \]
Using Gauss's theorem, we have
\[ \int_V \nabla\cdot\mathbf{E}\,d\tau = \int_V \frac{\rho}{\varepsilon_0}\,d\tau. \tag{1.154} \]
Since our volume is completely arbitrary, the integrands must be equal or
\[ \nabla\cdot\mathbf{E} = \frac{\rho}{\varepsilon_0}, \tag{1.155} \]
one of Maxwell's equations. If we reverse the argument, Gauss's law follows
immediately from Maxwell's equation.
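As a rough numerical illustration of the two cases of Gauss's law (a sketch, not part of the derivation above), the flux of $\mathbf{r}_0/r^2$ can be summed over the six faces of a cube with the midpoint rule. The function name `flux_through_cube`, the choice of a cube as the closed surface, and the grid size `n` are assumptions made here for illustration; the constant factor $q/4\pi\varepsilon_0$ is omitted, as in Eq. 1.148.

```python
import math

def flux_through_cube(center, half, n=100):
    """Midpoint-rule estimate of the outward flux of r0/r**2 through a cube.

    center: (cx, cy, cz) of the cube; half: half the edge length;
    n: number of subdivisions per edge on each face (illustrative choice).
    """
    h = 2.0 * half / n                      # edge length of one surface patch
    total = 0.0
    for axis in range(3):                   # faces normal to x, y, z
        a, b = [k for k in range(3) if k != axis]
        for sign in (-1.0, 1.0):            # outward normal is sign * unit vector
            for i in range(n):
                for j in range(n):
                    p = [0.0, 0.0, 0.0]
                    p[axis] = center[axis] + sign * half
                    p[a] = center[a] - half + (i + 0.5) * h
                    p[b] = center[b] - half + (j + 0.5) * h
                    r = math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)
                    # (r0 / r**2) . n dA = sign * p[axis] / r**3 * h**2
                    total += sign * p[axis] / r ** 3 * h * h
    return total
```

Calling `flux_through_cube((0.0, 0.0, 0.0), 1.0)` should give a value close to $4\pi$, while `flux_through_cube((3.0, 0.0, 0.0), 1.0)`, a cube that does not contain the origin, should give a value close to zero.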
Poisson's Equation
Replacing E by $-\nabla\varphi$, Eq. 1.155 becomes
\[ \nabla\cdot\nabla\varphi = -\frac{\rho}{\varepsilon_0}, \tag{1.156} \]
which is Poisson's equation. For the condition $\rho = 0$ this reduces to an even
more famous equation,
\[ \nabla\cdot\nabla\varphi = 0, \tag{1.157} \]
Laplace's equation. We encounter Laplace's equation frequently in discussing
various coordinate systems (Chapter 2) and the special functions of
mathematical physics which appear as its solutions. Poisson's equation will be
invaluable in developing the theory of Green's functions (Sections 8.7 and 16.5).
From direct comparison of the Coulomb electrostatic force law and
Newton's law of universal gravitation,
\[ \mathbf{F}_E = \frac{1}{4\pi\varepsilon_0}\,\frac{q_1 q_2}{r^2}\,\mathbf{r}_0, \qquad \mathbf{F}_G = -G\,\frac{m_1 m_2}{r^2}\,\mathbf{r}_0, \]
all of the potential theory of this section applies equally well to gravitational
potentials. For example, the gravitational Poisson equation is
\[ \nabla\cdot\nabla\varphi = +4\pi G\rho \tag{1.156a} \]
with $\rho$ now a mass density.
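A quick finite-difference check can make Eqs. 1.156 and 1.157 concrete. The sketch below is an illustration only: the helper name `laplacian`, the step size `h`, and the two test potentials are assumptions chosen here. It uses the standard seven-point central-difference stencil to confirm that $1/r$ satisfies Laplace's equation away from the origin and that $-r^2/6$ satisfies Poisson's equation with $\rho/\varepsilon_0 = 1$.

```python
import math

def laplacian(f, x, y, z, h=1e-3):
    """Seven-point central-difference estimate of the Laplacian of f at (x, y, z)."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6.0 * f(x, y, z)) / (h * h)

def coulomb_potential(x, y, z):
    # Potential of a point source at the origin, constants dropped: 1/r
    return 1.0 / math.sqrt(x * x + y * y + z * z)

def uniform_source_potential(x, y, z):
    # -r**2/6 has Laplacian -1, i.e. Poisson's equation with rho/eps0 = 1
    return -(x * x + y * y + z * z) / 6.0
```

Evaluating `laplacian(coulomb_potential, 1.0, 2.0, -0.5)` should return a number close to zero, and `laplacian(uniform_source_potential, 1.0, 2.0, -0.5)` a number close to $-1$.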
EXERCISES
1.14.1 Develop Gauss's law for the two-dimensional case in which
\[ \varphi = -\frac{q\,\ln\rho}{2\pi\varepsilon_0}, \qquad \mathbf{E} = -\nabla\varphi = \frac{q\,\boldsymbol{\rho}_0}{2\pi\varepsilon_0\,\rho}. \]
Here q is the charge at the origin or the line charge per unit length if the
two-dimensional system is a unit thickness slice of a three-dimensional (circular
cylindrical) system. The variable $\rho$ is measured radially outward from the line
charge. $\boldsymbol{\rho}_0$ is the corresponding unit vector (see Section 2.4).
1.14.2 (a) Show that Gauss's law follows from Maxwell's equation
\[ \nabla\cdot\mathbf{E} = \frac{\rho}{\varepsilon_0}. \]
Here $\rho$ is the usual charge density.
(b) Assuming that the electric field of a point charge q is spherically symmetric,
show that Gauss's law implies the Coulomb inverse square expression
\[ \mathbf{E} = \frac{q\,\mathbf{r}_0}{4\pi\varepsilon_0 r^2}. \]
1.14.3 Show that the value of the electrostatic potential $\varphi$ at any point P is equal to the
average of the potential over any spherical surface centered on P. There are no
electric charges on or within the sphere.
Hint. Use Green's theorem, Eq. 1.97, with $u^{-1} = r$, the distance from P, and $v = \varphi$.
Also note Eq. 1.173 in Section 1.15.
1.14.4 Using Maxwell's equations, show that for a system (steady current) the magnetic
vector potential A satisfies a vector Poisson equation,
\[ \nabla^2\mathbf{A} = -\mu\mathbf{J}, \]
provided we require $\nabla\cdot\mathbf{A} = 0$.
1.15 HELMHOLTZ'S THEOREM
In Section 1.13 it was emphasized that the choice of a magnetic vector
potential A was not unique. The divergence of A was still undetermined. In this
section two theorems about the divergence and curl of a vector are developed.
The first theorem is as follows.
A vector is uniquely specified by giving its divergence and its curl within a
region and its normal component over the boundary.
Let us take
\[ \nabla\cdot\mathbf{V}_1 = s, \qquad \nabla\times\mathbf{V}_1 = \mathbf{c}, \tag{1.158} \]
where s may be interpreted as a source (charge) density and c as a circulation
(current) density. Assuming also that the normal component $V_{1n}$ on the
boundary is given, we want to show that $\mathbf{V}_1$ is unique. We do this by assuming the
existence of a second vector $\mathbf{V}_2$, which satisfies Eq. 1.158 and has the same
normal component over the boundary, and then showing that $\mathbf{V}_1 - \mathbf{V}_2 = 0$.
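Let
\[ \mathbf{W} = \mathbf{V}_1 - \mathbf{V}_2. \]
Then
\[ \nabla\cdot\mathbf{W} = 0 \tag{1.159} \]
and
\[ \nabla\times\mathbf{W} = 0. \tag{1.160} \]
Since W is irrotational we may write (by Section 1.13)
\[ \mathbf{W} = -\nabla\varphi. \tag{1.161} \]
Substituting this into Eq. 1.159, we obtain
\[ \nabla\cdot\nabla\varphi = 0, \tag{1.162} \]
Laplace's equation.
Now we draw upon Green's theorem in the form given in Eq. 1.98, letting u
and v each equal $\varphi$. Since
\[ W_n = V_{1n} - V_{2n} = 0 \tag{1.163} \]
on the boundary, Green's theorem reduces to
\[ \int_V \nabla\varphi\cdot\nabla\varphi\,d\tau = \int_V \mathbf{W}\cdot\mathbf{W}\,d\tau = 0. \tag{1.164} \]
The quantity $\mathbf{W}\cdot\mathbf{W} = W^2$ is nonnegative and so we must have
\[ \mathbf{W} = \mathbf{V}_1 - \mathbf{V}_2 = 0 \tag{1.165} \]
everywhere. Thus $\mathbf{V}_1$ is unique, proving the theorem.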
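The reduction of Green's theorem used in this proof can be written out in one line. The sketch below assumes that Eq. 1.98 is Green's first identity in the form shown (the equation itself lies outside this section, so its exact arrangement here is an assumption), with $u = v = \varphi$:

```latex
\oint_S \varphi\,\nabla\varphi\cdot d\boldsymbol{\sigma}
  = \int_V \varphi\,\nabla\cdot\nabla\varphi\,d\tau
  + \int_V \nabla\varphi\cdot\nabla\varphi\,d\tau .
```

The left-hand side vanishes because $\nabla\varphi\cdot d\boldsymbol{\sigma} = -W_n\,d\sigma = 0$ on the boundary (Eq. 1.163), and the first integral on the right vanishes by Laplace's equation (Eq. 1.162). What remains is the statement that the volume integral of $\nabla\varphi\cdot\nabla\varphi = \mathbf{W}\cdot\mathbf{W}$ is zero.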
For our magnetic vector potential A the relation $\mathbf{B} = \nabla\times\mathbf{A}$ specifies the
curl of A. Often for convenience we set $\nabla\cdot\mathbf{A} = 0$ (compare Exercise 1.14.4).
Then (with boundary conditions) A is fixed.
This theorem may be written as a uniqueness theorem for solutions of
Laplace's equation, Exercise 1.15.1. In this form, this uniqueness theorem is of
great importance in solving electrostatic and other Laplace equation boundary
value problems. If we can find a solution of Laplace's equation that satisfies
the necessary boundary conditions, then our solution is the complete solution.
Such boundary value problems are taken up in Sections 12.3 and 12.5.
Helmholtz's Theorem
The second theorem we shall prove is Helmholtz's theorem.
A vector V satisfying Eq. 1.158 with both source and circulation densities
vanishing at infinity may be written as the sum of two parts, one of which is
irrotational, the other solenoidal.
Helmholtz's theorem will clearly be satisfied if we may write V as
\[ \mathbf{V} = -\nabla\varphi + \nabla\times\mathbf{A}, \tag{1.166} \]
$-\nabla\varphi$ being irrotational and $\nabla\times\mathbf{A}$ being solenoidal. We proceed to justify
Eq. 1.166.
FIG. 1.31 Source and field points
V is a known vector. Taking the divergence and curl, we have
\[ \nabla\cdot\mathbf{V} = s(\mathbf{r}), \tag{1.166a} \]
\[ \nabla\times\mathbf{V} = \mathbf{c}(\mathbf{r}), \tag{1.166b} \]
with s(r) and c(r) now known functions of position. From these two functions
we construct a scalar potential $\varphi(\mathbf{r}_1)$,
\[ \varphi(\mathbf{r}_1) = \frac{1}{4\pi}\int \frac{s(\mathbf{r}_2)}{r_{12}}\,d\tau_2, \tag{1.167a} \]
and a vector potential $\mathbf{A}(\mathbf{r}_1)$,
\[ \mathbf{A}(\mathbf{r}_1) = \frac{1}{4\pi}\int \frac{\mathbf{c}(\mathbf{r}_2)}{r_{12}}\,d\tau_2. \tag{1.167b} \]
Here the argument $\mathbf{r}_1$ indicates $(x_1, y_1, z_1)$, the field point; $\mathbf{r}_2$, the coordinates
of the source point $(x_2, y_2, z_2)$, whereas
\[ r_{12} = \left[(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2\right]^{1/2}. \tag{1.168} \]
When a direction is associated with $r_{12}$, the positive direction is taken to be
away from the source toward the field point. Vectorially, $\mathbf{r}_{12} = \mathbf{r}_1 - \mathbf{r}_2$, as
shown in Fig. 1.31. Of course, s and c must vanish sufficiently rapidly at large
distance so that the integrals exist. The actual expansion and evaluation of
integrals such as Eqs. 1.167a and b is treated in Section 12.1.
From the uniqueness theorem at the beginning of this section, V is uniquely
specified by its divergence, s, and curl, c (and boundary conditions). Returning
to Eq. 1.166, we have
\[ \nabla\cdot\mathbf{V} = -\nabla\cdot\nabla\varphi, \tag{1.169a} \]
the divergence of the curl vanishing and
\[ \nabla\times\mathbf{V} = \nabla\times\nabla\times\mathbf{A}, \tag{1.169b} \]
the curl of the gradient vanishing. If we can show that
\[ -\nabla\cdot\nabla\varphi(\mathbf{r}_1) = s(\mathbf{r}_1) \tag{1.169c} \]
and
\[ \nabla\times\nabla\times\mathbf{A}(\mathbf{r}_1) = \mathbf{c}(\mathbf{r}_1), \tag{1.169d} \]
then V as given in Eq. 1.166 will have the proper divergence and curl. Our
description will be internally consistent and Eq. 1.166 justified.¹
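First, we consider the divergence of V:
\[ \nabla\cdot\mathbf{V} = -\nabla\cdot\nabla\varphi = -\frac{1}{4\pi}\,\nabla\cdot\nabla\int \frac{s(\mathbf{r}_2)}{r_{12}}\,d\tau_2. \tag{1.170} \]
The Laplacian operator, $\nabla\cdot\nabla$ or $\nabla^2$, operates on the field coordinates $(x_1, y_1, z_1)$
and so commutes with the integration with respect to $(x_2, y_2, z_2)$. We have
\[ \nabla\cdot\mathbf{V} = -\frac{1}{4\pi}\int s(\mathbf{r}_2)\,\nabla_1^2\!\left(\frac{1}{r_{12}}\right) d\tau_2. \tag{1.171} \]
From Example 1.6.1 and the development of Gauss's law in Section 1.14,
\[ \int \nabla\cdot\nabla\!\left(\frac{1}{r}\right) d\tau = \begin{cases} -4\pi, \\ 0, \end{cases} \tag{1.172} \]
depending on whether the integration included the origin r = 0. This result
may be conveniently expressed by introducing the Dirac delta function, $\delta(\mathbf{r})$,²
\[ \nabla^2\!\left(\frac{1}{r}\right) = -4\pi\,\delta(\mathbf{r}). \tag{1.173} \]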
This Dirac delta function is defined by its assigned properties
\[ \delta(\mathbf{r}) = 0, \qquad \mathbf{r} \neq 0, \tag{1.174a} \]
\[ \int f(\mathbf{r})\,\delta(\mathbf{r})\,d\tau = f(0), \tag{1.174b} \]
where f(r) is any well-behaved function and the volume of integration includes
the origin. As a special case of Eq. 1.174b,
\[ \int \delta(\mathbf{r})\,d\tau = 1. \tag{1.175} \]
¹ Alternatively, we could solve Eq. 1.169c, Poisson's equation, and compare
the solution with the constructed potential, Eq. 1.167a. The solution of
Poisson's equation is developed in Section 8.7.
² Compare Section 8.7 for a more extended treatment of the Dirac delta
function.
The quantity $\delta(\mathbf{r})$ is really not a function at all, since it is undefined (infinite)
at r = 0. However, the crucial property, Eq. 1.174b, can be developed rigorously
as the limit of a sequence of functions, a distribution. This development appears
in Section 8.7. Here we proceed to use the delta function in terms of its defining
properties.
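The idea of a delta function as the limit of a sequence of functions can be illustrated numerically. The sketch below is an assumption-laden example, not the construction of Section 8.7: it picks a normalized three-dimensional Gaussian $\delta_n(r) = (n/\sqrt{\pi})^3 e^{-n^2 r^2}$ as the sequence, restricts itself to a spherically symmetric test function $f(r)$, and integrates radially with the midpoint rule. The names `smeared_value`, `steps`, and the cutoff radius `10/n` are choices made here.

```python
import math

def smeared_value(f, n, steps=4000):
    """Integrate f(r) * delta_n(r) over all space for a radial function f.

    delta_n(r) = (n / sqrt(pi))**3 * exp(-(n r)**2) is one possible
    delta sequence (an illustrative choice); d(tau) = 4 pi r**2 dr.
    """
    r_max = 10.0 / n            # the Gaussian is negligible beyond this radius
    dr = r_max / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * dr      # midpoint of the k-th radial shell
        delta_n = (n / math.sqrt(math.pi)) ** 3 * math.exp(-(n * r) ** 2)
        total += f(r) * delta_n * 4.0 * math.pi * r * r * dr
    return total
```

With `f = math.cos`, the value of `smeared_value(f, n)` should move toward `f(0) = 1` as `n` grows, which is the content of Eq. 1.174b; with `f` identically 1 it should stay close to 1 for every `n`, as in Eq. 1.175.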
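We must make two minor modifications in Eq. 1.173 before applying it.
First, our source is at $\mathbf{r}_2$, not at the origin. This means that the $4\pi$ in Gauss's
law appears if and only if the surface includes the point $\mathbf{r} = \mathbf{r}_2$. To show this,
we rewrite Eq. 1.173:
\[ \nabla^2\!\left(\frac{1}{r_{12}}\right) = -4\pi\,\delta(\mathbf{r}_1 - \mathbf{r}_2). \tag{1.176} \]
This shift of the source to $\mathbf{r}_2$ may be incorporated in the defining equations
(1.174) as
\[ \delta(\mathbf{r}_1 - \mathbf{r}_2) = 0, \qquad \mathbf{r}_1 \neq \mathbf{r}_2, \tag{1.177a} \]
\[ \int f(\mathbf{r}_1)\,\delta(\mathbf{r}_1 - \mathbf{r}_2)\,d\tau_1 = f(\mathbf{r}_2). \tag{1.177b} \]
Second, noting that differentiating $r_{12}^{-1}$ twice with respect to $x_2, y_2, z_2$ is the
same as differentiating twice with respect to $x_1, y_1, z_1$, we have
\[ \nabla_1^2\!\left(\frac{1}{r_{12}}\right) = \nabla_2^2\!\left(\frac{1}{r_{12}}\right) = -4\pi\,\delta(\mathbf{r}_1 - \mathbf{r}_2) = -4\pi\,\delta(\mathbf{r}_2 - \mathbf{r}_1). \tag{1.178} \]
We could equally well have noted that from its defining properties
\[ \delta(\mathbf{r}_1 - \mathbf{r}_2) = \delta(\mathbf{r}_2 - \mathbf{r}_1). \tag{1.179} \]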
Rewriting Eq. 1.171 and using the Dirac delta function, Eq. 1.178, we may
integrate to obtain
\[ \nabla\cdot\mathbf{V} = -\frac{1}{4\pi}\int s(\mathbf{r}_2)\,(-4\pi)\,\delta(\mathbf{r}_2 - \mathbf{r}_1)\,d\tau_2 = s(\mathbf{r}_1). \tag{1.180} \]
The final step follows from Eq. 1.177b with the subscripts 1 and 2 exchanged.
Our result, Eq. 1.180, shows that the assumed form of V and of the scalar
potential $\varphi$ are in agreement with the given divergence (Eq. 1.166a).
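To complete the proof of Helmholtz's theorem, we need to show that our
assumptions are consistent with Eq. 1.166b, that is, the curl of V is equal to
$\mathbf{c}(\mathbf{r}_1)$. From Eq. 1.166
\[ \nabla\times\mathbf{V} = \nabla\times\nabla\times\mathbf{A} = \nabla\nabla\cdot\mathbf{A} - \nabla^2\mathbf{A}. \tag{1.181} \]
The first term, $\nabla\nabla\cdot\mathbf{A}$, leads to
\[ 4\pi\,\nabla\nabla\cdot\mathbf{A} = \int \mathbf{c}(\mathbf{r}_2)\cdot\nabla_1\nabla_1\!\left(\frac{1}{r_{12}}\right) d\tau_2 \tag{1.182} \]
by Eq. 1.167b. Again replacing the second derivatives with respect to $x_1, y_1, z_1$
by second derivatives with respect to $x_2, y_2, z_2$, we integrate each component³
of Eq. 1.182 by parts:
\[ 4\pi\,\nabla\nabla\cdot\mathbf{A}\big|_x = \int \mathbf{c}(\mathbf{r}_2)\cdot\nabla_2\frac{\partial}{\partial x_2}\!\left(\frac{1}{r_{12}}\right) d\tau_2
 = \int \nabla_2\cdot\left[\mathbf{c}(\mathbf{r}_2)\,\frac{\partial}{\partial x_2}\!\left(\frac{1}{r_{12}}\right)\right] d\tau_2
 - \int \left[\nabla_2\cdot\mathbf{c}(\mathbf{r}_2)\right]\frac{\partial}{\partial x_2}\!\left(\frac{1}{r_{12}}\right) d\tau_2. \tag{1.183} \]
The second integral vanishes because the circulation density c is solenoidal.⁴
The first integral may be transformed to a surface integral by Gauss's theorem.
If c is bounded in space or vanishes faster than 1/r for large r, so that the integral
in Eq. 1.167b exists, then by choosing a sufficiently large surface the first integral
on the right-hand side of Eq. 1.183 also vanishes.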
With $\nabla\nabla\cdot\mathbf{A} = 0$, Eq. 1.181 now reduces to
\[ \nabla\times\mathbf{V} = -\nabla^2\mathbf{A} = -\frac{1}{4\pi}\int \mathbf{c}(\mathbf{r}_2)\,\nabla_1^2\!\left(\frac{1}{r_{12}}\right) d\tau_2. \tag{1.184} \]
This is exactly like Eq. 1.171 except that the scalar $s(\mathbf{r}_2)$ is replaced by the vector
circulation density $\mathbf{c}(\mathbf{r}_2)$. Introducing the Dirac delta function, as before, as a
convenient way of carrying out the integration, we find that Eq. 1.184 reduces
to Eq. 1.158. We see that our assumed form of V, given by Eq. 1.166, and of
the vector potential A, given by Eq. 1.167b, are in agreement with Eq. 1.158
specifying the curl of V.
This completes the proof of Helmholtz's theorem, showing that a vector
may be resolved into irrotational and solenoidal parts. Applied to the
electromagnetic field, we have resolved our field vector V into an irrotational electric
field E, derived from a scalar potential $\varphi$, and a solenoidal magnetic induction
field B, derived from a vector potential A. The source density s(r) may be
interpreted as an electric charge density (divided by electric permittivity $\varepsilon$),
whereas the circulation density c(r) becomes electric current density (times
magnetic permeability $\mu$).
³ This avoids creating the tensor $\mathbf{c}(\mathbf{r}_2)\nabla_2$.
⁴ Remember $\mathbf{c} = \nabla\times\mathbf{V}$ is known.
EXERCISES
1.15.1 Implicit in this section is a proof that a function $\psi(\mathbf{r})$ is uniquely specified by
requiring it to (1) satisfy Laplace's equation and (2) satisfy a complete set of
boundary conditions. Develop this proof explicitly.
1.15.2 (a) Assuming that P is a solution of the vector Poisson equation,
$\nabla_1^2\mathbf{P}(\mathbf{r}_1) = -\mathbf{V}(\mathbf{r}_1)$, develop an alternate proof of Helmholtz's theorem, showing that
V may be written as
\[ \mathbf{V} = -\nabla\varphi + \nabla\times\mathbf{A}, \]
where
\[ \mathbf{A} = \nabla\times\mathbf{P}, \]
and
\[ \varphi = \nabla\cdot\mathbf{P}. \]
(b) Solving the vector Poisson equation, we find
\[ \mathbf{P}(\mathbf{r}_1) = \frac{1}{4\pi}\int \frac{\mathbf{V}(\mathbf{r}_2)}{r_{12}}\,d\tau_2. \]
Show that this solution substituted into $\varphi$ and A of part (a) leads to the
expressions given for $\varphi$ and A in Section 1.15.
REFERENCES
Davis, Harry F. and Arthur D. Snider, Introduction to Vector Analysis, 4th ed. Boston:
Allyn & Bacon (1979).
Kellogg, O. D., Foundations of Potential Theory. New York: Dover (1953). Originally
published, 1929.
The classic text on potential theory.
Marion, J. B., Principles of Vector Analysis. New York: Academic Press (1965).
A moderately advanced presentation of vector analysis oriented toward tensor analysis.
Rotations and other transformations are described with the appropriate matrices.
Wrede, R. C., Introduction to Vector and Tensor Analysis. New York: Wiley (1963).
Reprinted, New York: Dover (1972).
Fine historical introduction. Excellent discussion of differentiation of vectors and
applications to mechanics.
2 COORDINATE SYSTEMS
In Chapter 1 we restricted ourselves almost completely to cartesian
coordinate systems. A cartesian coordinate system offers the unique advantage that
all three unit vectors, i, j, and k, are constant in direction as well as in magnitude.
We did introduce the radial distance r but even this was treated as a function
of x, y, and z. Unfortunately, not all physical problems are well adapted to
solution in cartesian coordinates. For instance, if we have a central force
problem, $\mathbf{F} = \mathbf{r}_0 F(r)$, such as gravitational or electrostatic force, cartesian
coordinates may be unusually inappropriate. Such a problem literally screams for
the use of a coordinate system in which the radial distance is taken to be one
of the coordinates, that is, spherical polar coordinates.
The point is that the coordinate system should be chosen to fit the problem,
to exploit any constraint or symmetry present in it. Then, hopefully, it will be
more readily soluble than if we had forced it into a cartesian framework. Quite
often "more readily soluble" will mean that we have a partial differential
equation that can be split into separate ordinary differential equations, often in
"standard form" in the new coordinate system. This technique, the separation
of variables, is discussed in Section 2.6.
We are primarily interested in coordinates in which the equation
\[ \nabla^2\psi + k^2\psi = 0 \tag{2.1} \]
is separable. Equation 2.1 is much more general than it may appear. If
$k^2 = 0$                              Eq. 2.1 becomes Laplace's equation,
$k^2 = (+)$ constant                   Helmholtz's equation,
$k^2 = (-)$ constant                   Diffusion equation (space part),
$k^2$ = constant × kinetic energy      Schrödinger wave equation.
It has been shown [L. P. Eisenhart, Phys. Rev. 45, 427 (1934)] that there are 11
coordinate systems in which Eq. 2.1 is separable, all of which can be considered
particular cases of the confocal ellipsoidal system.
Naturally, there is a price that must be paid for the use of a noncartesian
coordinate system. We have not yet written expressions for gradient, divergence,
or curl in any of the noncartesian coordinate systems. Such expressions are
developed in very general form in Section 2.2. First, we must develop a system
of curvilinear coordinates, a general system that may be specialized to any of
the particular systems of interest. We shall specialize to circular cylindrical
coordinates in Section 2.4 and to spherical polar coordinates in Section 2.5.
2.1 CURVILINEAR COORDINATES
In cartesian coordinates we deal with three mutually perpendicular families
of planes: x = constant, y = constant, and z = constant. Imagine that we
superimpose on this system three other families of surfaces. The surfaces of any one
family need not be parallel to each other and they need not be planes. If this is
difficult to visualize, the figure of a specific coordinate system such as Fig. 2.3
may be helpful. The three new families of surfaces need not be mutually
perpendicular, but for simplicity we quickly impose this condition (Eq. 2.7). We may
describe any point (x, y, z) as the intersection of three planes in cartesian
coordinates or as the intersection of the three surfaces that form our new,
curvilinear coordinates. Describing the curvilinear coordinate surfaces by $q_1$ =
constant, $q_2$ = constant, $q_3$ = constant, we may identify our point by $(q_1, q_2, q_3)$
as well as by (x, y, z). This means that in principle we may write
General curvilinear coordinates      Circular cylindrical coordinates
$q_1, q_2, q_3$                      $\rho, \varphi, z$
$x = x(q_1, q_2, q_3)$               $x = \rho\cos\varphi$
$y = y(q_1, q_2, q_3)$               $y = \rho\sin\varphi$               (2.2)
$z = z(q_1, q_2, q_3)$               $z = z$
specifying x, y, z in terms of the q's and the inverse relations,
$q_1 = q_1(x, y, z)$                 $\rho = (x^2 + y^2)^{1/2}$
$q_2 = q_2(x, y, z)$                 $\varphi = \arctan(y/x)$            (2.3)
$q_3 = q_3(x, y, z)$                 $z = z$
As a specific illustration of the general, abstract $q_1, q_2, q_3$ the transformation
equations for circular cylindrical coordinates (Section 2.4) are included in Eqs.
2.2 and 2.3. With each family of surfaces $q_i$ = constant, we can associate a unit
vector $\mathbf{e}_i$ normal to the surface $q_i$ = constant and in the direction of increasing
$q_i$. Then a vector V may be written
\[ \mathbf{V} = \mathbf{e}_1 V_1 + \mathbf{e}_2 V_2 + \mathbf{e}_3 V_3. \]
Differentiation of x in Eq. 2.2 leads to
\[ dx = \frac{\partial x}{\partial q_1}\,dq_1 + \frac{\partial x}{\partial q_2}\,dq_2 + \frac{\partial x}{\partial q_3}\,dq_3, \tag{2.4} \]
and similarly for differentiation of y and z. From the Pythagorean theorem in
cartesian coordinates the square of the distance between two neighboring
points is
\[ ds^2 = dx^2 + dy^2 + dz^2. \tag{2.4a} \]
We assume that in our curvilinear coordinate space the square of the distance
element can be written as a general quadratic form:
\[ \begin{aligned} ds^2 &= g_{11}\,dq_1^2 + g_{12}\,dq_1 dq_2 + g_{13}\,dq_1 dq_3 \\ &\quad + g_{21}\,dq_2 dq_1 + g_{22}\,dq_2^2 + g_{23}\,dq_2 dq_3 \\ &\quad + g_{31}\,dq_3 dq_1 + g_{32}\,dq_3 dq_2 + g_{33}\,dq_3^2 = \sum_{ij} g_{ij}\,dq_i\,dq_j. \end{aligned} \tag{2.5} \]
Spaces for which Eq. 2.5 is a legitimate expression are called metric or
Riemannian. Substituting Eq. 2.4 (squared) and the corresponding results for $dy^2$ and
$dz^2$ into Eq. 2.4a and equating coefficients of $dq_i\,dq_j$,¹ we find
\[ g_{ij} = \frac{\partial x}{\partial q_i}\frac{\partial x}{\partial q_j} + \frac{\partial y}{\partial q_i}\frac{\partial y}{\partial q_j} + \frac{\partial z}{\partial q_i}\frac{\partial z}{\partial q_j}. \tag{2.6} \]
These coefficients $g_{ij}$, which we now proceed to investigate, may be viewed as
specifying the nature of the coordinate system $(q_1, q_2, q_3)$. Collectively these
coefficients are referred to as the metric and in Section 3.3 will be shown to be
a second-rank tensor.² In general relativity the metric components are
determined by the properties of matter. Geometry is merged with physics.
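Equation 2.6 is straightforward to evaluate numerically. The following sketch is an illustration under stated assumptions: the function names `cylindrical_to_cartesian` and `metric`, and the step size `eps`, are choices made here. It approximates the partial derivatives by central differences and forms the sums of products for the circular cylindrical transformation of Eq. 2.2.

```python
import math

def cylindrical_to_cartesian(q):
    """Transformation of Eq. 2.2: (rho, phi, z) -> (x, y, z)."""
    rho, phi, z = q
    return (rho * math.cos(phi), rho * math.sin(phi), z)

def metric(transform, q, eps=1e-6):
    """Approximate g_ij of Eq. 2.6 at the point q using central differences."""
    derivs = []                                   # derivs[i][k] = d x_k / d q_i
    for i in range(3):
        q_plus, q_minus = list(q), list(q)
        q_plus[i] += eps
        q_minus[i] -= eps
        x_plus, x_minus = transform(q_plus), transform(q_minus)
        derivs.append([(a - b) / (2.0 * eps) for a, b in zip(x_plus, x_minus)])
    return [[sum(derivs[i][k] * derivs[j][k] for k in range(3))
             for j in range(3)] for i in range(3)]
```

At a point such as `(2.0, 0.7, 1.0)` the off-diagonal entries returned by `metric(cylindrical_to_cartesian, ...)` should be close to zero and the diagonal close to 1, $\rho^2$, 1, which is the orthogonality condition imposed next.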
At this point we limit ourselves to orthogonal (mutually perpendicular
surfaces) coordinate systems, which means (see Exercise 2.1.1)³
\[ g_{ij} = 0, \qquad i \neq j. \tag{2.7} \]
(Nonorthogonal coordinate systems are considered in some detail in Sections
3.8 and 3.9 in the framework of tensor analysis and in Section 4.4 by using
matrix analysis.) Now, to simplify the notation, we write $g_{ii} = h_i^2$ so that
\[ ds^2 = (h_1\,dq_1)^2 + (h_2\,dq_2)^2 + (h_3\,dq_3)^2. \tag{2.8} \]
The specific coordinate systems are described in subsequent sections by
specifying these scale factors $h_1$, $h_2$, and $h_3$. Conversely, the scale factors may be
conveniently identified by the relation
\[ ds_i = h_i\,dq_i \tag{2.9} \]
for any given $dq_i$, holding the other q's constant. Note that the three curvilinear
coordinates $q_1, q_2, q_3$ need not be lengths. The scale factors $h_i$ may depend on
the q's and they may have dimensions. The product $h_i\,dq_i$ must have dimensions
of length. The differential distance vector $d\mathbf{r}$ may be written
\[ d\mathbf{r} = h_1\,dq_1\,\mathbf{e}_1 + h_2\,dq_2\,\mathbf{e}_2 + h_3\,dq_3\,\mathbf{e}_3. \]
¹ The dq's are arbitrary. For instance, setting $dq_2 = dq_3 = 0$ isolates $g_{11}$. It
might be noted that Eq. 2.6 can be derived from Eq. 2.4 more elegantly with
the matrix notation of Chapter 4. Further, the matrix notation leads directly
to the Jacobian determinant, Exercise 2.1.5.
² The tensor nature of the set of $g_{ij}$'s follows from the quotient rule (Section
3.3). Then the tensor transformation law yields Eq. 2.6.
³ In relativistic cosmology the nondiagonal elements of the metric $g_{ij}$ are
usually set equal to zero as a consequence of the physical assumptions of no
rotation and no shear strains (see also Section 3.6).
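Using this curvilinear component form, we find that a line integral becomes
\[ \int \mathbf{V}\cdot d\mathbf{r} = \sum_i \int V_i h_i\,dq_i. \]
From Eq. 2.9 we may immediately develop the area and volume elements
\[ d\sigma_{ij} = ds_i\,ds_j = h_i h_j\,dq_i\,dq_j \tag{2.10} \]
and
\[ d\tau = ds_1\,ds_2\,ds_3 = h_1 h_2 h_3\,dq_1\,dq_2\,dq_3. \tag{2.11} \]
The expressions in Eqs. 2.10 and 2.11 agree, of course, with the results of using
the transformation equations, Eq. 2.2, and Jacobians.
From Eq. 2.10 an area element may be expanded:
\[ \begin{aligned} d\boldsymbol{\sigma} &= ds_2\,ds_3\,\mathbf{e}_1 + ds_3\,ds_1\,\mathbf{e}_2 + ds_1\,ds_2\,\mathbf{e}_3 \\ &= h_2 h_3\,dq_2\,dq_3\,\mathbf{e}_1 + h_3 h_1\,dq_3\,dq_1\,\mathbf{e}_2 + h_1 h_2\,dq_1\,dq_2\,\mathbf{e}_3. \end{aligned} \]
A surface integral becomes
\[ \int \mathbf{V}\cdot d\boldsymbol{\sigma} = \int V_1 h_2 h_3\,dq_2\,dq_3 + \int V_2 h_3 h_1\,dq_3\,dq_1 + \int V_3 h_1 h_2\,dq_1\,dq_2. \]
Examples of such line and surface integrals appear in Sections 2.4 and 2.5.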
In anticipation of the new forms of equations for vector calculus that appear
in the next section, the student should clearly understand that vector algebra is
the same in orthogonal curvilinear coordinates as in cartesian coordinates.
Specifically, for the dot product
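\[ \mathbf{A}\cdot\mathbf{B} = A_1 B_1 + A_2 B_2 + A_3 B_3, \tag{2.11a} \]
where the subscripts indicate curvilinear components. For the cross product
\[ \mathbf{A}\times\mathbf{B} = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ A_1 & A_2 & A_3 \\ B_1 & B_2 & B_3 \end{vmatrix}, \tag{2.11b} \]
just like Eq. 1.35.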
EXERCISES
2.1.1 Show that limiting our attention to orthogonal coordinate systems implies that
$g_{ij} = 0$ for $i \neq j$ (Eq. 2.7).
Hint. Construct a triangle with sides $ds_1$, $ds_2$, and $ds$. Equation 2.9 must hold
regardless of whether $g_{ij} = 0$. Then compare $ds^2$ from Eq. 2.5 with a calculation
using the law of cosines. Show that $\cos\theta_{12} = g_{12}/\sqrt{g_{11}g_{22}}$.
2.1.2 In the spherical polar coordinate system $q_1 = r$, $q_2 = \theta$, $q_3 = \varphi$. The transformation
equations corresponding to Eq. 2.2 are
$x = r\sin\theta\cos\varphi$
$y = r\sin\theta\sin\varphi$
$z = r\cos\theta$.
(a) Calculate the spherical polar coordinate scale factors: $h_r$, $h_\theta$, and $h_\varphi$.
(b) Check your calculated scale factors by the relation $ds_i = h_i\,dq_i$.
2.1.3 The u-, v-, z-coordinate system frequently used in electrostatics and in
hydrodynamics is defined by
$xy = u$,
$x^2 - y^2 = v$,
$z = z$.
This u-, v-, z-system is orthogonal.
(a) In words, describe briefly the nature of each of the three families of coordinate
surfaces.
(b) Sketch the system in the xy-plane showing the intersections of surfaces of
constant u and surfaces of constant v with the xy-plane.
(c) Indicate the directions of the unit vectors $\mathbf{u}_0$ and $\mathbf{v}_0$ in all four quadrants.
(d) Finally, is this u-, v-, z-system right-handed ($\mathbf{u}_0 \times \mathbf{v}_0 = +\mathbf{k}$) or left-handed
($\mathbf{u}_0 \times \mathbf{v}_0 = -\mathbf{k}$)?
2.1.4 The elliptic cylindrical coordinate system consists of three families of surfaces:
1. $\dfrac{x^2}{a^2\cosh^2 u} + \dfrac{y^2}{a^2\sinh^2 u} = 1$
2. $\dfrac{x^2}{a^2\cos^2 v} - \dfrac{y^2}{a^2\sin^2 v} = 1$
3. $z = z$
Sketch the coordinate surfaces u = constant and v = constant as they intersect
the first quadrant of the xy-plane. Show the unit vectors $\mathbf{u}_0$ and $\mathbf{v}_0$. The range of
u is $0 \le u < \infty$. The range of v is $0 \le v \le 2\pi$.
2.1.5 A two-dimensional orthogonal system is described by the coordinates $q_1$ and $q_2$.
Show that the Jacobian
\[ J\!\left(\frac{x, y}{q_1, q_2}\right) = \frac{\partial x}{\partial q_1}\frac{\partial y}{\partial q_2} - \frac{\partial x}{\partial q_2}\frac{\partial y}{\partial q_1} = h_1 h_2 \]
is in agreement with Eq. 2.10.
Hint. It's easier to work with the square of each side of this equation.
2.1.6 In Minkowski space we define $x_1 = x$, $x_2 = y$, $x_3 = z$, and $x_4 = ict$. This is done
so that the space-time interval $ds^2 = dx^2 + dy^2 + dz^2 - c^2 dt^2$ (c = velocity of
light) becomes $ds^2 = \sum_{i=1}^{4} dx_i^2$. Show that the metric in Minkowski space is $g_{ij} = \delta_{ij}$
or
\[ (g_{ij}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]
This indicates the advantage of using Minkowski space in a special relativity
theory: It is a four-dimensional cartesian system. We use Minkowski space in
Sections 3.7 and 4.12 for describing Lorentz transformations.
2.2 DIFFERENTIAL VECTOR OPERATIONS
Gradient
The starting point for developing the gradient, divergence, and curl operators
in curvilinear coordinates is our interpretation of the gradient as the vector
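having the magnitude and direction of the maximum space rate of change
(compare Section 1.6). From this interpretation the component of $\nabla\psi(q_1, q_2, q_3)$ in
the direction normal to the family of surfaces $q_1$ = constant is given by¹
\[ \nabla\psi\big|_1 = \frac{\partial\psi}{\partial s_1} = \frac{1}{h_1}\frac{\partial\psi}{\partial q_1}, \tag{2.12} \]
since this is the rate of change of $\psi$ for varying $q_1$, holding $q_2$ and $q_3$ fixed. The
quantity $ds_1$ is a differential length in the direction of increasing $q_1$ (compare
Eq. 2.9). In Section 2.1 we introduced a unit vector $\mathbf{e}_1$ to indicate this direction.
By repeating Eq. 2.12 for $q_2$ and again for $q_3$ and adding vectorially, we see
that the gradient becomes
\[ \begin{aligned} \nabla\psi(q_1, q_2, q_3) &= \mathbf{e}_1\frac{\partial\psi}{\partial s_1} + \mathbf{e}_2\frac{\partial\psi}{\partial s_2} + \mathbf{e}_3\frac{\partial\psi}{\partial s_3} \\ &= \mathbf{e}_1\frac{1}{h_1}\frac{\partial\psi}{\partial q_1} + \mathbf{e}_2\frac{1}{h_2}\frac{\partial\psi}{\partial q_2} + \mathbf{e}_3\frac{1}{h_3}\frac{\partial\psi}{\partial q_3}. \end{aligned} \tag{2.13} \]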
Exercise 2.2.4 offers a mathematical alternative independent of this physical
interpretation of the gradient.
Divergence
The divergence operator may be obtained from the second definition (Eq.
1.91) of Chapter 1 or equivalently from Gauss's theorem, Section 1.11. Let
us use Eq. 1.91:
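\[ \nabla\cdot\mathbf{V}(q_1, q_2, q_3) = \lim_{\int d\tau \to 0} \frac{\int \mathbf{V}\cdot d\boldsymbol{\sigma}}{\int d\tau}, \tag{2.14} \]
with a differential volume $h_1 h_2 h_3\,dq_1\,dq_2\,dq_3$ (Fig. 2.1). Note that the positive
directions have been chosen so that $(q_1, q_2, q_3)$ or $(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$ form a right-handed
set, $\mathbf{e}_1 \times \mathbf{e}_2 = \mathbf{e}_3$.
FIG. 2.1 Curvilinear volume element
The area integral for the two faces $q_1$ = constant is given by
\[ \left[V_1 h_2 h_3 + \frac{\partial}{\partial q_1}(V_1 h_2 h_3)\,dq_1\right] dq_2\,dq_3 - V_1 h_2 h_3\,dq_2\,dq_3 = \frac{\partial}{\partial q_1}(V_1 h_2 h_3)\,dq_1\,dq_2\,dq_3, \tag{2.15} \]
exactly as in Sections 1.7 and 1.10.² Adding in the similar results for the other
two pairs of surfaces, we obtain
\[ \int \mathbf{V}(q_1, q_2, q_3)\cdot d\boldsymbol{\sigma} = \left[\frac{\partial}{\partial q_1}(V_1 h_2 h_3) + \frac{\partial}{\partial q_2}(V_2 h_3 h_1) + \frac{\partial}{\partial q_3}(V_3 h_1 h_2)\right] dq_1\,dq_2\,dq_3. \tag{2.16} \]
Division by our differential volume (Eq. 2.14) yields
\[ \nabla\cdot\mathbf{V}(q_1, q_2, q_3) = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}(V_1 h_2 h_3) + \frac{\partial}{\partial q_2}(V_2 h_3 h_1) + \frac{\partial}{\partial q_3}(V_3 h_1 h_2)\right]. \tag{2.17} \]
In Eq. 2.17 $V_i$ is the component of V in the $\mathbf{e}_i$-direction, increasing $q_i$; that is,
$V_i = \mathbf{e}_i\cdot\mathbf{V}$.
We may obtain the Laplacian by combining Eqs. 2.13 and 2.17, using
$\mathbf{V} = \nabla\psi(q_1, q_2, q_3)$. This leads to
\[ \nabla\cdot\nabla\psi(q_1, q_2, q_3) = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}\!\left(\frac{h_2 h_3}{h_1}\frac{\partial\psi}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\!\left(\frac{h_3 h_1}{h_2}\frac{\partial\psi}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\!\left(\frac{h_1 h_2}{h_3}\frac{\partial\psi}{\partial q_3}\right)\right]. \tag{2.18a} \]
¹ Here the use of $\varphi$ to label a function is avoided because it is conventional to
use this symbol to denote an azimuthal coordinate.
² Since we take the limit $dq_1, dq_2, dq_3 \to 0$, the second- and higher-order
derivatives will drop out.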
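The general Laplacian built from the scale factors lends itself to a direct numerical sketch. The code below is an illustration under stated assumptions: the function names `curvilinear_laplacian` and `cylindrical_scale`, the step size `eps`, and the nested central-difference scheme are choices made here, not part of the text. Each bracketed term $(h_j h_k / h_i)\,\partial\psi/\partial q_i$ is evaluated at half-step offsets and then differenced once more.

```python
import math

def curvilinear_laplacian(psi, scale, q, eps=1e-3):
    """Laplacian in orthogonal curvilinear coordinates via nested central differences.

    psi(q) returns a float; scale(q) returns (h1, h2, h3); q = (q1, q2, q3).
    """
    def weighted_derivative(i, p):
        # (h1 h2 h3 / h_i**2) * d(psi)/d(q_i), evaluated at the point p
        h = scale(p)
        p_plus, p_minus = list(p), list(p)
        p_plus[i] += eps / 2.0
        p_minus[i] -= eps / 2.0
        dpsi = (psi(p_plus) - psi(p_minus)) / eps
        return h[0] * h[1] * h[2] / h[i] ** 2 * dpsi

    total = 0.0
    for i in range(3):
        ahead, behind = list(q), list(q)
        ahead[i] += eps / 2.0
        behind[i] -= eps / 2.0
        total += (weighted_derivative(i, ahead) - weighted_derivative(i, behind)) / eps
    h = scale(q)
    return total / (h[0] * h[1] * h[2])

def cylindrical_scale(q):
    # Scale factors for circular cylindrical coordinates (rho, phi, z)
    return (1.0, q[0], 1.0)
```

With the cylindrical scale factors, $\psi = \rho^2$ should give a value close to 4, and $\psi = \rho\cos\varphi$ (which is just x) a value close to zero.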
Curl
Finally, to develop V x V, let us apply Stokes's theorem (Section 1.12) and,
as with the divergence, take the limit as the surface area becomes vanishingly
small. Working on one component at a time, we consider a differential surface
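element in the curvilinear surface $q_1$ = constant. From
\[ \int_s \nabla\times\mathbf{V}\cdot d\boldsymbol{\sigma} = \nabla\times\mathbf{V}\big|_1\,h_2 h_3\,dq_2\,dq_3 \tag{2.18b} \]
(mean value theorem of integral calculus) Stokes's theorem yields
\[ \nabla\times\mathbf{V}\big|_1\,h_2 h_3\,dq_2\,dq_3 = \oint \mathbf{V}\cdot d\mathbf{r}, \tag{2.19} \]
with the line integral lying in the surface $q_1$ = constant. Following the loop
(1, 2, 3, 4) of Fig. 2.2,
\[ \begin{aligned} \oint \mathbf{V}(q_1, q_2, q_3)\cdot d\mathbf{r} &= V_2 h_2\,dq_2 + \left[V_3 h_3 + \frac{\partial}{\partial q_2}(V_3 h_3)\,dq_2\right] dq_3 \\ &\quad - \left[V_2 h_2 + \frac{\partial}{\partial q_3}(V_2 h_2)\,dq_3\right] dq_2 - V_3 h_3\,dq_3 \\ &= \left[\frac{\partial}{\partial q_2}(h_3 V_3) - \frac{\partial}{\partial q_3}(h_2 V_2)\right] dq_2\,dq_3. \end{aligned} \tag{2.20} \]
FIG. 2.2 Curvilinear surface element
We pick up a positive sign when going in the positive direction on parts 1 and 2
and a negative sign on parts 3 and 4 because here we are going in the negative
direction. Higher-order terms in Maclaurin or Taylor expansion have been
omitted. They will vanish in the limit as the surface becomes vanishingly small.
From Eq. 2.19
\[ \nabla\times\mathbf{V}\big|_1 = \frac{1}{h_2 h_3}\left[\frac{\partial}{\partial q_2}(h_3 V_3) - \frac{\partial}{\partial q_3}(h_2 V_2)\right]. \tag{2.21} \]
The remaining two components of $\nabla\times\mathbf{V}$ may be picked up by cyclic
permutation of the indices. As in Chapter 1, it is often convenient to write the curl in
determinant form:
\[ \nabla\times\mathbf{V} = \frac{1}{h_1 h_2 h_3}\begin{vmatrix} \mathbf{e}_1 h_1 & \mathbf{e}_2 h_2 & \mathbf{e}_3 h_3 \\[2pt] \dfrac{\partial}{\partial q_1} & \dfrac{\partial}{\partial q_2} & \dfrac{\partial}{\partial q_3} \\[6pt] h_1 V_1 & h_2 V_2 & h_3 V_3 \end{vmatrix}. \tag{2.22} \]
Remember that because of the presence of the differential operators, this
determinant must be expanded from the top down. Note that this equation is
not identical with the form for the cross product of two vectors, Eq. 2.11b.
$\nabla$ is not an ordinary vector; it is a vector operator.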
Our geometric interpretation of the gradient and the use of Gauss's and
Stokes's theorems (or integral definitions of divergence and curl) have enabled
us to obtain these quantities without having to differentiate the unit vectors $\mathbf{e}_i$.
There exist alternate ways to determine grad, div, and curl based on direct
differentiation of the $\mathbf{e}_i$. One approach resolves the $\mathbf{e}_i$ of a specific coordinate
system into its cartesian components (Exercises 2.4.1 and 2.5.1) and
differentiates this cartesian form (Exercises 2.4.3 and 2.5.2). The point here is that the
derivatives of the cartesian i, j, and k vanish since i, j, and k are constant in
direction as well as in magnitude. A second approach [L. J. Kijewski, Am. J.
Phys. 33, 816 (1965)] assumes the equality of $\partial^2\mathbf{r}/\partial q_i\,\partial q_j$ and $\partial^2\mathbf{r}/\partial q_j\,\partial q_i$ and
develops the derivatives of $\mathbf{e}_i$ in a general curvilinear form. Exercises 2.2.3 and
2.2.4 are based on this method.
EXERCISES
2.2.1 Develop arguments to show that ordinary dot and cross products (not involving
V) in orthogonal curvilinear coordinates proceed as in cartesian coordinates with
no involvement of scale factors.
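2.2.2 With $\mathbf{e}_1$ a unit vector in the direction of increasing $q_1$, show that
(a) $\nabla\cdot\mathbf{e}_1 = \dfrac{1}{h_1 h_2 h_3}\dfrac{\partial(h_2 h_3)}{\partial q_1}$,
(b) $\nabla\times\mathbf{e}_1 = \dfrac{1}{h_1}\left[\mathbf{e}_2\dfrac{1}{h_3}\dfrac{\partial h_1}{\partial q_3} - \mathbf{e}_3\dfrac{1}{h_2}\dfrac{\partial h_1}{\partial q_2}\right]$.
Note that even though $\mathbf{e}_1$ is a unit vector, its divergence and curl do not necessarily
vanish.
2.2.3 Show that the orthogonal unit vectors $\mathbf{e}_i$ may be defined by
\[ \mathbf{e}_i = \frac{1}{h_i}\frac{\partial\mathbf{r}}{\partial q_i}. \tag{a} \]
In particular, show that $\mathbf{e}_i\cdot\mathbf{e}_i = 1$ leads to an expression for $h_i$ in agreement with
Eq. 2.6.
Eq. (a) may be taken as a starting point for deriving
\[ \frac{\partial\mathbf{e}_i}{\partial q_j} = \mathbf{e}_j\frac{1}{h_i}\frac{\partial h_j}{\partial q_i}, \qquad i \neq j, \]
and
\[ \frac{\partial\mathbf{e}_i}{\partial q_i} = -\sum_{j \neq i} \mathbf{e}_j\frac{1}{h_j}\frac{\partial h_i}{\partial q_j}. \]
2.2.4 Derive
\[ \nabla\psi = \mathbf{e}_1\frac{1}{h_1}\frac{\partial\psi}{\partial q_1} + \mathbf{e}_2\frac{1}{h_2}\frac{\partial\psi}{\partial q_2} + \mathbf{e}_3\frac{1}{h_3}\frac{\partial\psi}{\partial q_3} \]
by direct application of Eq. 1.90,
\[ \nabla\psi = \lim_{\int d\tau \to 0} \frac{\int \psi\,d\boldsymbol{\sigma}}{\int d\tau}. \]
Hint. Evaluation of the surface integral will lead to terms like $(h_1 h_2 h_3)^{-1}(\partial/\partial q_1)(\mathbf{e}_1 h_2 h_3)$.
The results listed in Ex. 2.2.3 will be helpful. Cancellation of unwanted
terms occurs when the contributions of all three pairs of surfaces are added
together.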
2.3 SPECIAL COORDINATE SYSTEMS-
RECTANGULAR CARTESIAN COORDINATES
As mentioned in Section 2.1, there are 11 coordinate systems in which the
three-dimensional Helmholtz equation can be separated into three ordinary
differential equations. Some of these coordinate systems have achieved
prominence in the historical development of quantum mechanics. Other systems, such
as bipolar coordinates, satisfy special needs. Partly because the needs are rather
infrequent, but mostly because the development of high-speed computing
machines and efficient programming techniques reduces the need for these
coordinate systems, the discussion in this chapter is limited to (1) cartesian
coordinates, (2) spherical polar coordinates, and (3) circular cylindrical
coordinates. Specifications and details of the other coordinate systems will be
found in the first two editions of this work and in the references (Morse and
Feshbach, Margenau and Murphy).
Rectangular Cartesian Coordinates
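These are the cartesian coordinates on which Chapter 1 is based. In this
simplest of all systems
\[ h_1 = h_x = 1, \qquad h_2 = h_y = 1, \qquad h_3 = h_z = 1. \tag{2.23} \]
The families of coordinate surfaces are three sets of parallel planes: x =
constant, y = constant, and z = constant. The cartesian coordinate system is unique
in that all its $h_i$'s are constant. This will be a significant advantage in treating
tensors in Chapter 3. Note also that the unit vectors, $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ or i, j, k, have
fixed directions.
From Eqs. 2.13, 2.17, 2.18, and 2.22 we reproduce the results of Chapter 1,
\[ \nabla\psi = \mathbf{i}\frac{\partial\psi}{\partial x} + \mathbf{j}\frac{\partial\psi}{\partial y} + \mathbf{k}\frac{\partial\psi}{\partial z}, \tag{2.24} \]
\[ \nabla\cdot\mathbf{V} = \frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z}, \tag{2.25} \]
\[ \nabla\cdot\nabla\psi = \frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}, \tag{2.26} \]
\[ \nabla\times\mathbf{V} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt] V_x & V_y & V_z \end{vmatrix}. \tag{2.27} \]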
2.4 CIRCULAR CYLINDRICAL COORDINATES {p, cp, z)
In the circular cylindrical coordinate system the three curvilinear coordinates
$(q_1, q_2, q_3)$ are relabeled $(\rho, \varphi, z)$. The coordinate surfaces, shown in Fig. 2.3,
are
1. Right circular cylinders having the z-axis as a common axis,
$\rho = (x^2 + y^2)^{1/2}$ = constant.
2. Half planes through the z-axis,
$\varphi = \tan^{-1}(y/x)$ = constant.
3. Planes parallel to the xy-plane, as in the cartesian system,
z = constant.
The limits on $\rho$, $\varphi$, and z are
$0 \le \rho < \infty$, $0 \le \varphi \le 2\pi$, and $-\infty < z < \infty$.
Note that we are using $\rho$ for the perpendicular distance from the z-axis and
saving r for the distance from the origin.
Inverting the preceding equations for $\rho$ and $\varphi$ (or going directly to Fig. 2.3),
we obtain the transformation relations
FIG. 2.3 Circular cylinder coordinates
\[ x = \rho\cos\varphi, \qquad y = \rho\sin\varphi, \qquad z = z. \tag{2.28} \]
The z-axis remains unchanged. This is essentially a two-dimensional curvilinear
system with a cartesian z-axis added on to form a three-dimensional system.
According to Eq. 2.28 or from the length elements $ds_i$, the scale factors are
\[ h_1 = h_\rho = 1, \qquad h_2 = h_\varphi = \rho, \qquad h_3 = h_z = 1. \tag{2.29} \]
The unit vectors $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ are relabeled $(\boldsymbol{\rho}_0, \boldsymbol{\varphi}_0, \mathbf{k})$, Fig. 2.4. The unit vector
$\boldsymbol{\rho}_0$ is normal to the cylindrical surface pointing in the direction of increasing
radius $\rho$. The unit vector $\boldsymbol{\varphi}_0$ is tangential to the cylindrical surface, perpendicular
to the half plane $\varphi$ = constant and pointing in the direction of increasing
azimuth angle $\varphi$. The third unit vector, k, is the usual cartesian unit vector.
FIG. 2.4 Circular cylindrical coordinate unit vectors
A differential displacement $d\mathbf{r}$ may be written
\[ d\mathbf{r} = \boldsymbol{\rho}_0\,ds_\rho + \boldsymbol{\varphi}_0\,ds_\varphi + \mathbf{k}\,dz = \boldsymbol{\rho}_0\,d\rho + \boldsymbol{\varphi}_0\,\rho\,d\varphi + \mathbf{k}\,dz. \tag{2.30} \]
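The differential operations involving $\nabla$ follow from Eqs. 2.13, 2.17, 2.18,
and 2.22,
$$\nabla\psi(\rho,\varphi,z) = \boldsymbol{\rho}_0\,\frac{\partial\psi}{\partial\rho} + \boldsymbol{\varphi}_0\,\frac{1}{\rho}\frac{\partial\psi}{\partial\varphi} + \mathbf{k}\,\frac{\partial\psi}{\partial z}, \tag{2.31}$$
$$\nabla\cdot\mathbf{V} = \frac{1}{\rho}\frac{\partial}{\partial\rho}(\rho V_\rho) + \frac{1}{\rho}\frac{\partial V_\varphi}{\partial\varphi} + \frac{\partial V_z}{\partial z}, \tag{2.32}$$
$$\nabla^2\psi = \frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial\psi}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2\psi}{\partial\varphi^2} + \frac{\partial^2\psi}{\partial z^2}, \tag{2.33}$$
$$\nabla\times\mathbf{V} = \frac{1}{\rho}\begin{vmatrix} \boldsymbol{\rho}_0 & \rho\boldsymbol{\varphi}_0 & \mathbf{k} \\ \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\varphi} & \dfrac{\partial}{\partial z} \\ V_\rho & \rho V_\varphi & V_z \end{vmatrix}. \tag{2.34}$$
Finally, for problems such as circular wave guides or cylindrical cavity
resonators the vector Laplacian $\nabla^2\mathbf{V}$ resolved in circular cylindrical coordinates is
$$\nabla^2\mathbf{V}\big|_\rho = \nabla^2 V_\rho - \frac{1}{\rho^2}V_\rho - \frac{2}{\rho^2}\frac{\partial V_\varphi}{\partial\varphi},$$
$$\nabla^2\mathbf{V}\big|_\varphi = \nabla^2 V_\varphi - \frac{1}{\rho^2}V_\varphi + \frac{2}{\rho^2}\frac{\partial V_\rho}{\partial\varphi}, \tag{2.35}$$
$$\nabla^2\mathbf{V}\big|_z = \nabla^2 V_z.$$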
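The basic reason for the form of the z-component is that the z-axis is a cartesian
axis; that is,
$$\nabla^2(\boldsymbol{\rho}_0 V_\rho + \boldsymbol{\varphi}_0 V_\varphi + \mathbf{k} V_z) = \nabla^2(\boldsymbol{\rho}_0 V_\rho + \boldsymbol{\varphi}_0 V_\varphi) + \mathbf{k}\nabla^2 V_z = \boldsymbol{\rho}_0 f(V_\rho, V_\varphi) + \boldsymbol{\varphi}_0 g(V_\rho, V_\varphi) + \mathbf{k}\nabla^2 V_z.$$
The operator $\nabla^2$ operating on the $\boldsymbol{\rho}_0$, $\boldsymbol{\varphi}_0$ unit vectors stays in the $\boldsymbol{\rho}_0\boldsymbol{\varphi}_0$-plane.
This behavior holds in all such cylindrical systems.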
EXAMPLE 2.4.1 A Navier-Stokes Term
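The Navier-Stokes equations of hydrodynamics contain a nonlinear term
$$\nabla\times[\mathbf{v}\times(\nabla\times\mathbf{v})],$$
where $\mathbf{v}$ is the fluid velocity. For fluid flowing through a cylindrical pipe in
the z-direction
$$\mathbf{v} = \mathbf{k}\,v(\rho).$$
From Eq. 2.34
$$\nabla\times\mathbf{v} = \frac{1}{\rho}\begin{vmatrix} \boldsymbol{\rho}_0 & \rho\boldsymbol{\varphi}_0 & \mathbf{k} \\ \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\varphi} & \dfrac{\partial}{\partial z} \\ 0 & 0 & v(\rho) \end{vmatrix} = -\boldsymbol{\varphi}_0\,\frac{\partial v}{\partial\rho},$$
$$\mathbf{v}\times(\nabla\times\mathbf{v}) = \begin{vmatrix} \boldsymbol{\rho}_0 & \boldsymbol{\varphi}_0 & \mathbf{k} \\ 0 & 0 & v \\ 0 & -\dfrac{\partial v}{\partial\rho} & 0 \end{vmatrix} = \boldsymbol{\rho}_0\,v(\rho)\frac{\partial v}{\partial\rho}.$$
Finally,
$$\nabla\times(\mathbf{v}\times(\nabla\times\mathbf{v})) = \frac{1}{\rho}\begin{vmatrix} \boldsymbol{\rho}_0 & \rho\boldsymbol{\varphi}_0 & \mathbf{k} \\ \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\varphi} & \dfrac{\partial}{\partial z} \\ v\dfrac{\partial v}{\partial\rho} & 0 & 0 \end{vmatrix} = 0.$$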
For this particular case the nonlinear term vanishes.
EXERCISES
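2.4.1 Resolve the circular cylindrical unit vectors into their cartesian components
(Fig. 2.5).
FIG. 2.5
ANS. $\boldsymbol{\rho}_0 = \mathbf{i}\cos\varphi + \mathbf{j}\sin\varphi$,
$\boldsymbol{\varphi}_0 = -\mathbf{i}\sin\varphi + \mathbf{j}\cos\varphi$,
$\mathbf{k}_0 = \mathbf{k}$.
2.4.2 Resolve the cartesian unit vectors into their circular cylindrical components.
ANS. $\mathbf{i} = \boldsymbol{\rho}_0\cos\varphi - \boldsymbol{\varphi}_0\sin\varphi$,
$\mathbf{j} = \boldsymbol{\rho}_0\sin\varphi + \boldsymbol{\varphi}_0\cos\varphi$,
$\mathbf{k} = \mathbf{k}_0$.
2.4.3 From the results of Ex. 2.4.1 show that
$$\frac{\partial\boldsymbol{\rho}_0}{\partial\varphi} = \boldsymbol{\varphi}_0, \qquad \frac{\partial\boldsymbol{\varphi}_0}{\partial\varphi} = -\boldsymbol{\rho}_0,$$
and that all other first derivatives of the circular cylindrical unit vectors with
respect to the circular cylindrical coordinates vanish.
2.4.4 Compare $\nabla\cdot\mathbf{V}$ (Eq. 2.32) with the gradient operator
$$\nabla = \boldsymbol{\rho}_0\,\frac{\partial}{\partial\rho} + \boldsymbol{\varphi}_0\,\frac{1}{\rho}\frac{\partial}{\partial\varphi} + \mathbf{k}\,\frac{\partial}{\partial z}$$
(Eq. 2.31) dotted into $\mathbf{V}$. Note that the differential operators of $\nabla$ differentiate
both the unit vectors and the components of $\mathbf{V}$.
Hint. $\boldsymbol{\varphi}_0 (1/\rho)(\partial/\partial\varphi)\cdot\boldsymbol{\rho}_0 V_\rho$ becomes $\boldsymbol{\varphi}_0\cdot\dfrac{1}{\rho}\dfrac{\partial}{\partial\varphi}(\boldsymbol{\rho}_0 V_\rho)$ and does not vanish.
2.4.5 (a) Show that $\mathbf{r} = \boldsymbol{\rho}_0\rho + \mathbf{k}z$.
(b) Working entirely in circular cylindrical coordinates, show that
$\nabla\cdot\mathbf{r} = 3$ and $\nabla\times\mathbf{r} = 0$.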
2.4.6 (a) Show that the parity operation (reflection through the origin) on a point
$(\rho, \varphi, z)$ relative to fixed x-, y-, z-axes consists of the transformation
$$\rho \to \rho, \qquad \varphi \to \varphi \pm \pi, \qquad z \to -z.$$
(b) Show that $\boldsymbol{\rho}_0$ and $\boldsymbol{\varphi}_0$ have odd parity (reversal of direction) and that $\mathbf{k}$
has even parity.
Note. The cartesian unit vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ remain constant.
2.4.7 A rigid body is rotating about a fixed axis with a constant angular velocity $\boldsymbol{\omega}$.
Take $\boldsymbol{\omega}$ to lie along the z-axis. Express $\mathbf{r}$ in circular cylindrical coordinates and,
using circular cylindrical coordinates,
(a) calculate $\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$,
(b) calculate $\nabla\times\mathbf{v}$.
ANS. (a) $\mathbf{v} = \boldsymbol{\varphi}_0\,\omega\rho$,
(b) $\nabla\times\mathbf{v} = 2\boldsymbol{\omega}$.
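2.4.8 A particle is moving through space. Find the circular cylindrical components
of its velocity and acceleration.
$$v_\rho = \dot\rho, \qquad a_\rho = \ddot\rho - \rho\dot\varphi^2,$$
$$v_\varphi = \rho\dot\varphi, \qquad a_\varphi = \rho\ddot\varphi + 2\dot\rho\dot\varphi,$$
$$v_z = \dot z, \qquad a_z = \ddot z.$$
Hint.
$$\mathbf{r}(t) = \boldsymbol{\rho}_0(t)\rho(t) + \mathbf{k}z(t) = [\mathbf{i}\cos\varphi(t) + \mathbf{j}\sin\varphi(t)]\rho(t) + \mathbf{k}z(t).$$
Note. $\dot\rho = d\rho/dt$, $\ddot\rho = d^2\rho/dt^2$, and so on.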
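The component formulas quoted in Exercise 2.4.8 can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the trajectory $\rho(t)$, $\varphi(t)$, $z(t)$ is an arbitrary, hypothetical choice, the cartesian position is differentiated by central differences, and the result is projected onto $\boldsymbol{\rho}_0$, $\boldsymbol{\varphi}_0$, $\mathbf{k}$ using the resolution given in Exercise 2.4.1. It is a check of the stated answers, not a derivation.

```python
import math

def trajectory(t):
    """Hypothetical test path: rho(t), phi(t), z(t)."""
    return 1.0 + 0.5 * t, 0.3 * t * t, math.sin(t)

def cartesian(t):
    rho, phi, z = trajectory(t)
    return (rho * math.cos(phi), rho * math.sin(phi), z)

def central_diff(f, t, h):
    """First and second central differences of a vector-valued f(t)."""
    fm, f0, fp = f(t - h), f(t), f(t + h)
    d1 = tuple((p - m) / (2.0 * h) for p, m in zip(fp, fm))
    d2 = tuple((p - 2.0 * c + m) / (h * h) for p, c, m in zip(fp, f0, fm))
    return d1, d2

def cylindrical_components(vec, phi):
    """Project a cartesian vector onto rho_0, phi_0, k (Exercise 2.4.1)."""
    vx, vy, vz = vec
    return (vx * math.cos(phi) + vy * math.sin(phi),
            -vx * math.sin(phi) + vy * math.cos(phi),
            vz)

t, h = 0.7, 1e-4
rho, phi, z = trajectory(t)
# Exact time derivatives of the hypothetical path above.
rho_d, rho_dd = 0.5, 0.0
phi_d, phi_dd = 0.6 * t, 0.6
z_d, z_dd = math.cos(t), -math.sin(t)

v_formula = (rho_d, rho * phi_d, z_d)
a_formula = (rho_dd - rho * phi_d ** 2,
             rho * phi_dd + 2.0 * rho_d * phi_d,
             z_dd)

v_cart, a_cart = central_diff(cartesian, t, h)
v_numeric = cylindrical_components(v_cart, phi)
a_numeric = cylindrical_components(a_cart, phi)
print(v_formula, v_numeric)
print(a_formula, a_numeric)
```

The extra terms $-\rho\dot\varphi^2$ and $2\dot\rho\dot\varphi$ come from the time dependence of the unit vectors, which the cartesian differentiation picks up automatically.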
2.4.9 Solve Laplace's equation $\nabla^2\psi = 0$ in cylindrical coordinates for $\psi = \psi(\rho)$.
ANS. $\psi = k\ln\dfrac{\rho}{\rho_0}$.
2.4.10 In right circular cylindrical coordinates a particular vector function is given by
$$\mathbf{V}(\rho,\varphi) = \boldsymbol{\rho}_0 V_\rho(\rho,\varphi) + \boldsymbol{\varphi}_0 V_\varphi(\rho,\varphi).$$
Show that $\nabla\times\mathbf{V}$ has only a z-component. Note that this result will hold for
any vector confined to a surface $q_3$ = constant as long as the products $h_1 V_1$ and
$h_2 V_2$ are each independent of $q_3$.
2.4.11 For the flow of an incompressible viscous fluid the Navier-Stokes equations
lead to
$$-\nabla\times(\mathbf{v}\times(\nabla\times\mathbf{v})) = \frac{\eta}{\rho_0}\nabla^2(\nabla\times\mathbf{v}).$$
Here $\eta$ is the viscosity and $\rho_0$ the density of the fluid. For axial flow in a cylindrical
pipe we take the velocity $\mathbf{v}$ to be
$$\mathbf{v} = \mathbf{k}\,v(\rho).$$
From Example 2.4.1
$$\nabla\times(\mathbf{v}\times(\nabla\times\mathbf{v})) = 0$$
for this choice of $\mathbf{v}$.
Show that
$$\nabla^2(\nabla\times\mathbf{v}) = 0$$
leads to the differential equation
$$\frac{1}{\rho}\frac{d}{d\rho}\left(\rho\frac{d^2v}{d\rho^2}\right) - \frac{1}{\rho^2}\frac{dv}{d\rho} = 0$$
and that this is satisfied by
$$v = v_0 + a_2\rho^2.$$
2.4.12 A conducting wire along the z-axis carries a current $I$. The resulting magnetic
vector potential is given by
$$\mathbf{A} = \mathbf{k}\,\frac{\mu_0 I}{2\pi}\ln\left(\frac{1}{\rho}\right).$$
Show that the magnetic induction $\mathbf{B}$ is given by
$$\mathbf{B} = \boldsymbol{\varphi}_0\,\frac{\mu_0 I}{2\pi\rho}.$$
2.4.13 A force is described by
$$\mathbf{F} = -\mathbf{i}\,\frac{y}{x^2 + y^2} + \mathbf{j}\,\frac{x}{x^2 + y^2}.$$
(a) Express $\mathbf{F}$ in circular cylindrical coordinates.
Operating entirely in circular cylindrical coordinates for (b) and (c),
(b) calculate the curl of $\mathbf{F}$ and
(c) calculate the work done by $\mathbf{F}$ in encircling the unit circle once
counterclockwise.
(d) How do you reconcile the results of (b) and (c)?
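2.4.14 A transverse electromagnetic wave (TEM) in a coaxial wave guide has an electric
field $\mathbf{E} = \mathbf{E}(\rho,\varphi)e^{i(kz-\omega t)}$ and a magnetic induction field of $\mathbf{B} = \mathbf{B}(\rho,\varphi)e^{i(kz-\omega t)}$.
Since the wave is transverse, neither $\mathbf{E}$ nor $\mathbf{B}$ has a z component. The two fields
satisfy the vector Laplacian equation
$$\nabla^2\mathbf{E}(\rho,\varphi) = 0, \qquad \nabla^2\mathbf{B}(\rho,\varphi) = 0.$$
(a) Show that $\mathbf{E} = \boldsymbol{\rho}_0 E_0 (a/\rho)e^{i(kz-\omega t)}$ and $\mathbf{B} = \boldsymbol{\varphi}_0 B_0 (a/\rho)e^{i(kz-\omega t)}$ are solutions.
Here $a$ is the radius of the inner conductor and $E_0$ and $B_0$ are amplitudes.
(b) Assuming a vacuum inside the wave guide, verify that Maxwell's equations
are satisfied with
$$B_0/E_0 = k/\omega = \mu_0\varepsilon_0(\omega/k) = 1/c.$$
2.4.15 A calculation of the magnetohydrodynamic pinch effect involves the evaluation of
$(\mathbf{B}\cdot\nabla)\mathbf{B}$. If the magnetic induction $\mathbf{B}$ is taken to be $\mathbf{B} = \boldsymbol{\varphi}_0 B_\varphi(\rho)$, show that
$$(\mathbf{B}\cdot\nabla)\mathbf{B} = -\boldsymbol{\rho}_0\,\frac{B_\varphi^2}{\rho}.$$
2.4.16 The linear velocity of particles in a rigid body rotating with angular velocity
$\boldsymbol{\omega}$ is given by
$$\mathbf{v} = \boldsymbol{\varphi}_0\,\rho\omega.$$
Integrate $\oint\mathbf{v}\cdot d\boldsymbol{\lambda}$ around a circle in the xy-plane and verify that
$$\frac{\oint\mathbf{v}\cdot d\boldsymbol{\lambda}}{\text{area}} = \nabla\times\mathbf{v}\big|_z.$$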
2.5 SPHERICAL POLAR COORDINATES ($r$, $\theta$, $\varphi$)
Relabeling $(q_1, q_2, q_3)$ as $(r, \theta, \varphi)$, we see that the spherical polar coordinate
system consists of the following:
1. Concentric spheres centered at the origin,
$$r = (x^2 + y^2 + z^2)^{1/2} = \text{constant}.$$
2. Right circular cones centered on the z-(polar) axis,
vertices at the origin,
$$\theta = \arccos\frac{z}{(x^2 + y^2 + z^2)^{1/2}} = \text{constant}.$$
3. Half planes through the z-(polar) axis,
$$\varphi = \arctan\frac{y}{x} = \text{constant}.$$
By our arbitrary choice of definitions of $\theta$, the polar angle, and $\varphi$, the azimuth
angle, the z-axis is singled out for special treatment. The transformation
equations corresponding to Eq. 2.2 are
$$x = r\sin\theta\cos\varphi, \qquad y = r\sin\theta\sin\varphi, \qquad z = r\cos\theta, \tag{2.36}$$
measuring $\theta$ from the positive z-axis and $\varphi$ in the xy-plane from the positive
x-axis. The ranges of values are $0 \le r < \infty$, $0 \le \theta \le \pi$, and $0 \le \varphi \le 2\pi$. From
Eq. 2.6
$$h_1 = h_r = 1, \qquad h_2 = h_\theta = r, \qquad h_3 = h_\varphi = r\sin\theta. \tag{2.37}$$
This gives a line element
$$d\mathbf{r} = \mathbf{r}_0\,dr + \boldsymbol{\theta}_0\,r\,d\theta + \boldsymbol{\varphi}_0\,r\sin\theta\,d\varphi.$$
In this spherical coordinate system the area element (for $r$ = constant) is
$$dA = d\sigma_{\theta\varphi} = r^2\sin\theta\,d\theta\,d\varphi, \tag{2.38}$$
the dark, shaded area in Fig. 2.6. Integrating over the azimuth $\varphi$, we find that
the area element becomes a ring of width $d\theta$,
$$dA = 2\pi r^2\sin\theta\,d\theta. \tag{2.39}$$
This form will appear repeatedly in problems in spherical polar coordinates
with azimuthal symmetry, such as the scattering of an unpolarized beam of
nuclear particles. By definition of solid radians or steradians, an element of
solid angle $d\Omega$ is given by
$$d\Omega = \frac{dA}{r^2} = \sin\theta\,d\theta\,d\varphi. \tag{2.40}$$
Integrating over the entire spherical surface, we obtain
$$\int d\Omega = 4\pi.$$
From Eq. 2.11 the volume element is
$$d\tau = r^2\,dr\,\sin\theta\,d\theta\,d\varphi = r^2\,dr\,d\Omega. \tag{2.41}$$
FIG. 2.6 Spherical polar coordinate area elements
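As a numerical illustration of the ring element of Eq. 2.39, the sketch below sums $2\pi r^2\sin\theta\,d\theta$ over $0 \le \theta \le \pi$ with a midpoint rule. The function name, the number of rings, and the test radius are all hypothetical choices; the total should approach $4\pi r^2$, equivalent to a total solid angle of $4\pi$.

```python
import math

def sphere_area_from_rings(r, n_rings=400):
    """Midpoint-rule sum of dA = 2 pi r^2 sin(theta) d(theta), Eq. 2.39."""
    d_theta = math.pi / n_rings
    total = 0.0
    for i in range(n_rings):
        theta = (i + 0.5) * d_theta          # midpoint of the i-th ring
        total += 2.0 * math.pi * r * r * math.sin(theta) * d_theta
    return total

r = 1.5
area = sphere_area_from_rings(r)
solid_angle = area / (r * r)                 # integral of d(Omega)
print(area, 4.0 * math.pi * r * r)
print(solid_angle, 4.0 * math.pi)
```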
The spherical polar coordinate unit vectors are shown in Fig. 2.7.
It must be emphasized that the unit vectors $\mathbf{r}_0$, $\boldsymbol{\theta}_0$, and $\boldsymbol{\varphi}_0$ vary in direction as the
angles $\theta$ and $\varphi$ vary. Specifically, the $\theta$ and $\varphi$ derivatives of these spherical polar
coordinate unit vectors do not vanish (Exercise 2.5.2). When differentiating
vectors in spherical polar (or in any noncartesian system) this variation of the
unit vectors with position must not be neglected. In terms of the fixed direction
cartesian unit vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$,
$$\mathbf{r}_0 = \mathbf{i}\sin\theta\cos\varphi + \mathbf{j}\sin\theta\sin\varphi + \mathbf{k}\cos\theta,$$
$$\boldsymbol{\theta}_0 = \mathbf{i}\cos\theta\cos\varphi + \mathbf{j}\cos\theta\sin\varphi - \mathbf{k}\sin\theta, \tag{2.42}$$
$$\boldsymbol{\varphi}_0 = -\mathbf{i}\sin\varphi + \mathbf{j}\cos\varphi.$$
Note that a given vector can now be expressed in a number of different (but
equivalent) ways. For instance, the position vector $\mathbf{r}$ may be written
$$\mathbf{r} = \mathbf{r}_0 r = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z = \mathbf{i}r\sin\theta\cos\varphi + \mathbf{j}r\sin\theta\sin\varphi + \mathbf{k}r\cos\theta.$$
FIG. 2.7 Spherical polar coordinates
Select the form that is most useful for your particular problem.
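The cartesian resolution of the unit vectors in Eq. 2.42 can be checked for orthonormality and right-handedness with a few lines of code. This is a minimal sketch; the helper names `dot` and `cross` and the test angles are hypothetical choices made for illustration.

```python
import math

def spherical_unit_vectors(theta, phi):
    """Cartesian components of r_0, theta_0, phi_0 from Eq. 2.42."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    r0 = (st * cp, st * sp, ct)
    theta0 = (ct * cp, ct * sp, -st)
    phi0 = (-sp, cp, 0.0)
    return r0, theta0, phi0

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

r0, theta0, phi0 = spherical_unit_vectors(1.1, 2.3)
print(dot(r0, r0), dot(theta0, theta0), dot(phi0, phi0))    # each close to 1
print(dot(r0, theta0), dot(r0, phi0), dot(theta0, phi0))    # each close to 0
print(cross(r0, theta0), phi0)                              # should agree
```

Repeating the call with different angles shows the three vectors changing direction while staying mutually perpendicular, which is the point made in the text about their position dependence.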
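From Section 2.2, relabeling the curvilinear coordinate unit vectors $\mathbf{e}_1$, $\mathbf{e}_2$,
and $\mathbf{e}_3$ as $\mathbf{r}_0$, $\boldsymbol{\theta}_0$, and $\boldsymbol{\varphi}_0$ gives
$$\nabla\psi = \mathbf{r}_0\,\frac{\partial\psi}{\partial r} + \boldsymbol{\theta}_0\,\frac{1}{r}\frac{\partial\psi}{\partial\theta} + \boldsymbol{\varphi}_0\,\frac{1}{r\sin\theta}\frac{\partial\psi}{\partial\varphi}, \tag{2.44}$$
$$\nabla\cdot\mathbf{V} = \frac{1}{r^2\sin\theta}\left[\sin\theta\frac{\partial}{\partial r}(r^2 V_r) + r\frac{\partial}{\partial\theta}(\sin\theta\,V_\theta) + r\frac{\partial V_\varphi}{\partial\varphi}\right], \tag{2.45}$$
$$\nabla\cdot\nabla\psi = \frac{1}{r^2\sin\theta}\left[\sin\theta\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin\theta}\frac{\partial^2\psi}{\partial\varphi^2}\right], \tag{2.46}$$
$$\nabla\times\mathbf{V} = \frac{1}{r^2\sin\theta}\begin{vmatrix} \mathbf{r}_0 & r\boldsymbol{\theta}_0 & r\sin\theta\,\boldsymbol{\varphi}_0 \\ \dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\varphi} \\ V_r & rV_\theta & r\sin\theta\,V_\varphi \end{vmatrix}. \tag{2.47}$$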
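Occasionally, the vector Laplacian $\nabla^2\mathbf{V}$ is needed in spherical polar
coordinates. It is best obtained by using the vector identity (Eq. 1.80) of Chapter 1.
For future reference
$$\nabla^2\mathbf{V}\big|_r = \nabla^2 V_r - \frac{2}{r^2}V_r - \frac{2}{r^2}\frac{\partial V_\theta}{\partial\theta} - \frac{2\cos\theta}{r^2\sin\theta}V_\theta - \frac{2}{r^2\sin\theta}\frac{\partial V_\varphi}{\partial\varphi}, \tag{2.48}$$
$$\nabla^2\mathbf{V}\big|_\theta = \nabla^2 V_\theta - \frac{1}{r^2\sin^2\theta}V_\theta + \frac{2}{r^2}\frac{\partial V_r}{\partial\theta} - \frac{2\cos\theta}{r^2\sin^2\theta}\frac{\partial V_\varphi}{\partial\varphi}, \tag{2.49}$$
$$\nabla^2\mathbf{V}\big|_\varphi = \nabla^2 V_\varphi - \frac{1}{r^2\sin^2\theta}V_\varphi + \frac{2}{r^2\sin\theta}\frac{\partial V_r}{\partial\varphi} + \frac{2\cos\theta}{r^2\sin^2\theta}\frac{\partial V_\theta}{\partial\varphi}. \tag{2.50}$$
These expressions for the components of $\nabla^2\mathbf{V}$ are undeniably messy, but
sometimes they are needed. There is no guarantee that nature will always be
simple.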
EXAMPLE 2.5.1
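Using Eqs. 2.44 to 2.47, we can reproduce by inspection some of the results
derived in Chapter 1 by laborious application of cartesian coordinates.
From Eq. 2.44
$$\nabla f(r) = \mathbf{r}_0\,\frac{df}{dr}, \tag{2.51}$$
$$\nabla r^n = \mathbf{r}_0\,n r^{n-1}.$$
From Eq. 2.45
$$\nabla\cdot\mathbf{r}_0 f(r) = \frac{2}{r}f(r) + \frac{df}{dr}, \tag{2.52}$$
$$\nabla\cdot\mathbf{r}_0 r^n = (n + 2)r^{n-1}.$$
From Eq. 2.46
$$\nabla^2 f(r) = \frac{2}{r}\frac{df}{dr} + \frac{d^2 f}{dr^2}, \tag{2.53}$$
$$\nabla^2 r^n = n(n + 1)r^{n-2}. \tag{2.54}$$
Finally, from Eq. 2.47
$$\nabla\times\mathbf{r}_0 f(r) = 0. \tag{2.55}$$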
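The radial result of Eq. 2.54 can be cross-checked against the cartesian Laplacian of Eq. 2.26. The following sketch is only a numerical spot check under stated assumptions: the second derivatives are approximated by central differences, and the exponent $n$, the test point, and the step size $h$ are arbitrary, hypothetical choices.

```python
import math

def r_power(x, y, z, n):
    """f = r^n written in cartesian coordinates."""
    return (x * x + y * y + z * z) ** (n / 2.0)

def cartesian_laplacian(f, x, y, z, h=1e-3):
    """Central-difference estimate of Eq. 2.26."""
    centre = f(x, y, z)
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6.0 * centre) / (h * h)

n = 3
point = (0.4, -0.7, 1.1)
r = math.sqrt(sum(c * c for c in point))

numeric = cartesian_laplacian(lambda x, y, z: r_power(x, y, z, n), *point)
closed_form = n * (n + 1) * r ** (n - 2)     # Eq. 2.54
print(numeric, closed_form)
```

The agreement illustrates the remark in the text: the spherical form gives by inspection what the cartesian form gives only after some labor.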
EXAMPLE 2.5.2 Magnetic Vector Potential
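The computation of the magnetic vector potential of a single current loop
in the xy-plane involves the evaluation of
$$\mathbf{V} = \nabla\times[\nabla\times\boldsymbol{\varphi}_0 A_\varphi(r,\theta)].$$
In spherical polar coordinates this reduces as follows:
$$\mathbf{V} = \nabla\times\frac{1}{r^2\sin\theta}\begin{vmatrix} \mathbf{r}_0 & r\boldsymbol{\theta}_0 & r\sin\theta\,\boldsymbol{\varphi}_0 \\ \dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\varphi} \\ 0 & 0 & r\sin\theta\,A_\varphi(r,\theta) \end{vmatrix} \tag{2.56}$$
$$= \nabla\times\frac{1}{r^2\sin\theta}\left[\mathbf{r}_0\frac{\partial}{\partial\theta}(r\sin\theta\,A_\varphi) - r\boldsymbol{\theta}_0\frac{\partial}{\partial r}(r\sin\theta\,A_\varphi)\right].$$
Taking the curl a second time, we obtain
$$\mathbf{V} = \frac{1}{r^2\sin\theta}\begin{vmatrix} \mathbf{r}_0 & r\boldsymbol{\theta}_0 & r\sin\theta\,\boldsymbol{\varphi}_0 \\ \dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\varphi} \\ \dfrac{1}{r^2\sin\theta}\dfrac{\partial}{\partial\theta}(r\sin\theta\,A_\varphi) & -\dfrac{1}{\sin\theta}\dfrac{\partial}{\partial r}(r\sin\theta\,A_\varphi) & 0 \end{vmatrix}. \tag{2.57}$$
By expanding the determinant, we have
$$\mathbf{V} = -\boldsymbol{\varphi}_0\left\{\frac{1}{r}\frac{\partial^2}{\partial r^2}(rA_\varphi) + \frac{1}{r^2}\frac{\partial}{\partial\theta}\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}(\sin\theta\,A_\varphi)\right]\right\}. \tag{2.58}$$
In Chapter 12 we shall see that $\mathbf{V}$ leads to the associated Legendre equation and
that $A_\varphi$ may be given by a series of associated Legendre polynomials.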
EXERCISES
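2.5.1 Resolve the spherical polar unit vectors into their cartesian components.
ANS. $\mathbf{r}_0 = \mathbf{i}\sin\theta\cos\varphi + \mathbf{j}\sin\theta\sin\varphi + \mathbf{k}\cos\theta$,
$\boldsymbol{\theta}_0 = \mathbf{i}\cos\theta\cos\varphi + \mathbf{j}\cos\theta\sin\varphi - \mathbf{k}\sin\theta$,
$\boldsymbol{\varphi}_0 = -\mathbf{i}\sin\varphi + \mathbf{j}\cos\varphi$.
2.5.2 (a) From the results of Exercise 2.5.1 calculate the partial derivatives of $\mathbf{r}_0$,
$\boldsymbol{\theta}_0$, and $\boldsymbol{\varphi}_0$ with respect to $r$, $\theta$, and $\varphi$.
(b) With $\nabla$ given by
$$\mathbf{r}_0\,\frac{\partial}{\partial r} + \boldsymbol{\theta}_0\,\frac{1}{r}\frac{\partial}{\partial\theta} + \boldsymbol{\varphi}_0\,\frac{1}{r\sin\theta}\frac{\partial}{\partial\varphi}$$
(greatest space rate of change), use the results of part (a) to calculate $\nabla\cdot\nabla\psi$.
This is an alternate derivation of the Laplacian.
Note. The derivatives of the left-hand $\nabla$ operate on the unit vectors of the
right-hand $\nabla$ before the unit vectors are dotted together.
2.5.3 A rigid body is rotating about a fixed axis with a constant angular velocity $\boldsymbol{\omega}$.
Take $\boldsymbol{\omega}$ to be along the z-axis. Using spherical polar coordinates,
(a) Calculate
$$\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r},$$
(b) Calculate
$$\nabla\times\mathbf{v}.$$
ANS. (a) $\mathbf{v} = \boldsymbol{\varphi}_0\,\omega r\sin\theta$,
(b) $\nabla\times\mathbf{v} = 2\boldsymbol{\omega}$.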
2.5.4 The coordinate system $(x, y, z)$ is rotated through an angle $\Phi$ counterclockwise
about an axis defined by the unit vector $\mathbf{n}$ into system $(x', y', z')$. In terms of the
new coordinates the radius vector becomes
$$\mathbf{r}' = \mathbf{r}\cos\Phi + \mathbf{r}\times\mathbf{n}\sin\Phi + \mathbf{n}(\mathbf{n}\cdot\mathbf{r})(1 - \cos\Phi).$$
(a) Derive this expression from geometric considerations.
(b) Show that it reduces as expected for $\mathbf{n} = \mathbf{k}$. The answer, in matrix form,
appears in Section 4.3.
(c) Verify that $r'^2 = r^2$.
2.5.5 Resolve the cartesian unit vectors into their spherical polar components.
$$\mathbf{i} = \mathbf{r}_0\sin\theta\cos\varphi + \boldsymbol{\theta}_0\cos\theta\cos\varphi - \boldsymbol{\varphi}_0\sin\varphi,$$
$$\mathbf{j} = \mathbf{r}_0\sin\theta\sin\varphi + \boldsymbol{\theta}_0\cos\theta\sin\varphi + \boldsymbol{\varphi}_0\cos\varphi,$$
$$\mathbf{k} = \mathbf{r}_0\cos\theta - \boldsymbol{\theta}_0\sin\theta.$$
2.5.6 The direction of one vector is given by the angles $\theta_1$ and $\varphi_1$. For a second vector
the corresponding angles are $\theta_2$ and $\varphi_2$. Show that the cosine of the included
angle $\gamma$ is given by
$$\cos\gamma = \cos\theta_1\cos\theta_2 + \sin\theta_1\sin\theta_2\cos(\varphi_1 - \varphi_2).$$
See Fig. 12.16.
2.5.7 A certain vector $\mathbf{V}$ has no radial component. Its curl has no tangential
components. What does this imply about the radial dependence of the tangential
components of $\mathbf{V}$?
2.5.8 Modern physics lays great stress on the property of parity: whether a quantity
remains invariant or changes sign under an inversion of the coordinate system.
In cartesian coordinates this means $x \to -x$, $y \to -y$, and $z \to -z$.
(a) Show that the inversion (reflection through the origin) of a point $(r, \theta, \varphi)$
relative to fixed x-, y-, z-axes consists of the transformation
$$r \to r, \qquad \theta \to \pi - \theta, \qquad \varphi \to \varphi \pm \pi.$$
(b) Show that $\mathbf{r}_0$ and $\boldsymbol{\varphi}_0$ have odd parity (reversal of direction) and that $\boldsymbol{\theta}_0$
has even parity.
2.5.9 With $\mathbf{A}$ any vector
$$\mathbf{A}\cdot\nabla\mathbf{r} = \mathbf{A}.$$
(a) Verify this result in cartesian coordinates.
(b) Verify this result using spherical polar coordinates. (Eq. 2.44 provides $\nabla$.)
In the language of dyadics (Section 3.5), $\nabla\mathbf{r}$ is the idemfactor, a unit dyadic.
2.5.10 A particle is moving through space. Find the spherical coordinate components
of its velocity and acceleration:
$$v_r = \dot r,$$
$$v_\theta = r\dot\theta,$$
$$v_\varphi = r\sin\theta\,\dot\varphi,$$
$$a_r = \ddot r - r\dot\theta^2 - r\sin^2\theta\,\dot\varphi^2,$$
$$a_\theta = r\ddot\theta + 2\dot r\dot\theta - r\sin\theta\cos\theta\,\dot\varphi^2,$$
$$a_\varphi = r\sin\theta\,\ddot\varphi + 2\dot r\sin\theta\,\dot\varphi + 2r\cos\theta\,\dot\theta\dot\varphi.$$
Hint.
$$\mathbf{r}(t) = \mathbf{r}_0(t)r(t) = [\mathbf{i}\sin\theta(t)\cos\varphi(t) + \mathbf{j}\sin\theta(t)\sin\varphi(t) + \mathbf{k}\cos\theta(t)]\,r(t).$$
Note. Using the Lagrangian techniques of Section 17.3, we may obtain these
results somewhat more elegantly. The dot in $\dot r$ means time derivative, $\dot r = dr/dt$.
The notation was originated by Newton.
2.5.11 A particle $m$ moves in response to a central force according to Newton's second
law
$$m\ddot{\mathbf{r}} = \mathbf{r}_0 f(r).$$
Show that $\mathbf{r}\times\dot{\mathbf{r}} = \mathbf{c}$, a constant, and that the geometric interpretation of this
leads to Kepler's second law.
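2.5.12 Express $\partial/\partial x$, $\partial/\partial y$, $\partial/\partial z$ in spherical polar coordinates.
ANS.
$$\frac{\partial}{\partial x} = \sin\theta\cos\varphi\,\frac{\partial}{\partial r} + \cos\theta\cos\varphi\,\frac{1}{r}\frac{\partial}{\partial\theta} - \frac{\sin\varphi}{r\sin\theta}\frac{\partial}{\partial\varphi},$$
$$\frac{\partial}{\partial y} = \sin\theta\sin\varphi\,\frac{\partial}{\partial r} + \cos\theta\sin\varphi\,\frac{1}{r}\frac{\partial}{\partial\theta} + \frac{\cos\varphi}{r\sin\theta}\frac{\partial}{\partial\varphi},$$
$$\frac{\partial}{\partial z} = \cos\theta\,\frac{\partial}{\partial r} - \sin\theta\,\frac{1}{r}\frac{\partial}{\partial\theta}.$$
Hint. Equate $\nabla_{xyz}$ and $\nabla_{r\theta\varphi}$.
2.5.13 From Exercise 2.5.12 show that
$$-i\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right) = -i\frac{\partial}{\partial\varphi}.$$
This is the quantum mechanical operator corresponding to the z-component
of angular momentum.
2.5.14 With the quantum mechanical angular momentum operator defined as
$\mathbf{L} = -i(\mathbf{r}\times\nabla)$, show that
(a) $L_x + iL_y = e^{i\varphi}\left(\dfrac{\partial}{\partial\theta} + i\cot\theta\,\dfrac{\partial}{\partial\varphi}\right)$,
(b) $L_x - iL_y = -e^{-i\varphi}\left(\dfrac{\partial}{\partial\theta} - i\cot\theta\,\dfrac{\partial}{\partial\varphi}\right)$.
These are the raising and lowering operators of Sections 12.6 and 12.7.
2.5.15 Verify that $\mathbf{L}\times\mathbf{L} = i\mathbf{L}$ in spherical polar coordinates. $\mathbf{L} = -i(\mathbf{r}\times\nabla)$, the
quantum mechanical angular momentum operator.
Hint. Use spherical polar coordinates for $\mathbf{L}$ but cartesian components for the
cross product.
2.5.16 (a) From Eq. 2.44 show that
$$\mathbf{L} = -i(\mathbf{r}\times\nabla) = i\left(\boldsymbol{\theta}_0\,\frac{1}{\sin\theta}\frac{\partial}{\partial\varphi} - \boldsymbol{\varphi}_0\,\frac{\partial}{\partial\theta}\right).$$
(b) Resolving $\boldsymbol{\theta}_0$ and $\boldsymbol{\varphi}_0$ into cartesian components, determine $L_x$, $L_y$, and $L_z$
in terms of $\theta$, $\varphi$, and their derivatives.
(c) From $L^2 = L_x^2 + L_y^2 + L_z^2$ show that
$$\mathbf{L}^2 = -\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) - \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}.$$
2.5.17 With $\mathbf{L} = -i\mathbf{r}\times\nabla$ verify the operator identities
(a) $\nabla = \mathbf{r}_0\,\dfrac{\partial}{\partial r} - i\,\dfrac{\mathbf{r}\times\mathbf{L}}{r^2}$,
(b) $\mathbf{r}\nabla^2 - \nabla\left(1 + r\dfrac{\partial}{\partial r}\right) = i\nabla\times\mathbf{L}$.
This latter identity is useful in relating angular momentum and Legendre's
differential equation, Exercise 8.3.1.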
2.5.18 Show that the following three forms (spherical coordinates) of $\nabla^2\psi(r)$ are
equivalent.
(a) $\dfrac{1}{r^2}\dfrac{d}{dr}\left[r^2\dfrac{d\psi(r)}{dr}\right]$,
(b) $\dfrac{1}{r}\dfrac{d^2}{dr^2}[r\psi(r)]$,
(c) $\dfrac{d^2\psi(r)}{dr^2} + \dfrac{2}{r}\dfrac{d\psi(r)}{dr}$.
The second form is particularly convenient in establishing a correspondence
between spherical polar and cartesian descriptions of a problem. A generalization
of this appears in Exercise 8.6.11.
2.5.19 One model of the solar corona assumes that the steady-state equation of heat flow
$$\nabla\cdot(k\nabla T) = 0$$
is satisfied. Here, $k$, the thermal conductivity, is proportional to $T^{5/2}$. Assuming
that the temperature $T$ is proportional to $r^n$, show that the heat flow equation
is satisfied by $T = T_0(r_0/r)^{2/7}$.
2.5.20 A certain force field is given by
$$\mathbf{F} = \mathbf{r}_0\,\frac{2P\cos\theta}{r^3} + \boldsymbol{\theta}_0\,\frac{P}{r^3}\sin\theta, \qquad r \ge P/2,$$
(in spherical polar coordinates).
(a) Examine $\nabla\times\mathbf{F}$ to see if a potential exists.
(b) Calculate $\oint\mathbf{F}\cdot d\boldsymbol{\lambda}$ for a unit circle in the plane $\theta = \pi/2$.
What does this indicate about the force being conservative or
nonconservative?
(c) If you believe that $\mathbf{F}$ may be described by $\mathbf{F} = -\nabla\psi$, find $\psi$. Otherwise
simply state that no acceptable potential exists.
2.5.21 (a) Show that $\mathbf{A} = -\boldsymbol{\varphi}_0\cot\theta/r$ is a solution of $\nabla\times\mathbf{A} = \mathbf{r}_0/r^2$.
(b) Show that this spherical polar coordinate solution agrees with the solution
given for Exercise 1.13.5:
$$\mathbf{A} = \mathbf{i}\,\frac{yz}{r(x^2 + y^2)} - \mathbf{j}\,\frac{xz}{r(x^2 + y^2)}.$$
Note that the solution diverges for $\theta = 0, \pi$ corresponding to $x, y = 0$.
(c) Finally, show that $\mathbf{A} = -\boldsymbol{\theta}_0\,\varphi\sin\theta/r$ is a solution. Note that although this
solution does not diverge ($r \ne 0$) it is no longer single-valued for all possible
azimuth angles.
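2.5.22 A magnetic vector potential is given by
$$\mathbf{A} = \frac{\mu_0}{4\pi}\frac{\mathbf{m}\times\mathbf{r}}{r^3}.$$
Show that this leads to the magnetic induction $\mathbf{B}$ of a point magnetic dipole,
dipole moment $\mathbf{m}$.
ANS. For $\mathbf{m} = \mathbf{k}m$,
$$\nabla\times\mathbf{A} = \mathbf{r}_0\,\frac{\mu_0}{4\pi}\frac{2m\cos\theta}{r^3} + \boldsymbol{\theta}_0\,\frac{\mu_0}{4\pi}\frac{m\sin\theta}{r^3}.$$
Compare Eqs. 12.136 and 12.137.
2.5.23 At large distances from its source, electric dipole radiation has fields
$$\mathbf{E} = a_E\sin\theta\,\frac{e^{i(kr-\omega t)}}{r}\,\boldsymbol{\theta}_0, \qquad \mathbf{B} = a_B\sin\theta\,\frac{e^{i(kr-\omega t)}}{r}\,\boldsymbol{\varphi}_0.$$
Show that Maxwell's equations
$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t} \quad\text{and}\quad \nabla\times\mathbf{B} = \varepsilon_0\mu_0\frac{\partial\mathbf{E}}{\partial t}$$
are satisfied, if we take
$$\frac{a_E}{a_B} = \frac{\omega}{k} = c = (\varepsilon_0\mu_0)^{-1/2}.$$
Hint. Since $r$ is large, terms of order $r^{-2}$ may be dropped.
2.5.24 The magnetic vector potential for a uniformly charged rotating spherical shell is
$$\mathbf{A} = \boldsymbol{\varphi}_0\,\frac{\mu_0 a^4\sigma\omega}{3}\cdot\frac{\sin\theta}{r^2}, \qquad r > a,$$
$$\mathbf{A} = \boldsymbol{\varphi}_0\,\frac{\mu_0 a\sigma\omega}{3}\cdot r\sin\theta, \qquad r < a.$$
($a$ = radius of spherical shell, $\sigma$ = surface charge density, and $\omega$ = angular velocity.)
Find the magnetic induction $\mathbf{B} = \nabla\times\mathbf{A}$.
ANS. $B_r(r,\theta) = \dfrac{2\mu_0 a^4\sigma\omega}{3}\cdot\dfrac{\cos\theta}{r^3}$, $r > a$,
$B_\theta(r,\theta) = \dfrac{\mu_0 a^4\sigma\omega}{3}\cdot\dfrac{\sin\theta}{r^3}$, $r > a$,
$\mathbf{B} = \mathbf{k}\,\dfrac{2\mu_0 a\sigma\omega}{3}$, $r < a$.
2.5.25 (a) Explain why $\nabla^2$ in plane polar coordinates follows from $\nabla^2$ in circular
cylindrical coordinates with $z$ = constant.
(b) Explain why taking $\nabla^2$ in spherical polar coordinates and restricting $\theta$
to $\pi/2$ does not lead to the plane polar form of $\nabla^2$.
Note.
$$\nabla^2(\rho,\varphi) = \frac{\partial^2}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial^2}{\partial\varphi^2}.$$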
2.6 SEPARATION OF VARIABLES
CARTESIAN COORDINATES
In cartesian coordinates the Helmholtz equation (Eq. 2.1) becomes
$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} + k^2\psi = 0, \tag{2.59}$$
using Eq. 2.26 for the Laplacian. For the present let $k^2$ be a constant. Perhaps
the simplest way of treating a partial differential equation such as 2.59 is to
split it into a set of ordinary differential equations. This may be done as follows:
Let
$$\psi(x,y,z) = X(x)Y(y)Z(z) \tag{2.60}$$
and substitute back into Eq. 2.59. How do we know Eq. 2.60 is valid? The
answer is very simple. We do not know it is valid! Rather, we are proceeding
in the spirit of let's try and see if it works. If our attempt succeeds, then Eq. 2.60
will be justified. If it does not succeed, we shall find out soon enough and then
we shall try another attack such as Green's functions, integral transforms,
or brute force numerical analysis. With $\psi$ assumed given by Eq. 2.60, Eq. 2.59
becomes
$$YZ\frac{d^2X}{dx^2} + XZ\frac{d^2Y}{dy^2} + XY\frac{d^2Z}{dz^2} + k^2XYZ = 0. \tag{2.61}$$
Dividing by $\psi = XYZ$ and rearranging terms, we obtain
$$\frac{1}{X}\frac{d^2X}{dx^2} = -k^2 - \frac{1}{Y}\frac{d^2Y}{dy^2} - \frac{1}{Z}\frac{d^2Z}{dz^2}. \tag{2.62}$$
Equation 2.62 exhibits one separation of variables. The left-hand side is a
function of $x$ alone, whereas the right-hand side depends only on $y$ and $z$.
So Eq. 2.62 is a sort of paradox. A function of $x$ is equated to a function of $y$
and $z$, but $x$, $y$, and $z$ are all independent coordinates. This independence means
that the behavior of $x$ as an independent variable is not determined by $y$ and $z$.
The paradox is resolved by setting each side equal to a constant, a constant of
separation. We choose¹
$$\frac{1}{X}\frac{d^2X}{dx^2} = -l^2, \tag{2.63}$$
$$-k^2 - \frac{1}{Y}\frac{d^2Y}{dy^2} - \frac{1}{Z}\frac{d^2Z}{dz^2} = -l^2. \tag{2.64}$$
Now, turning our attention to Eq. 2.64, we obtain
$$\frac{1}{Y}\frac{d^2Y}{dy^2} = -k^2 + l^2 - \frac{1}{Z}\frac{d^2Z}{dz^2}, \tag{2.65}$$
and a second separation has been achieved. Here we have a function of $y$ equated
to a function of $z$ and the same paradox appears. We resolve it as before by
equating each side to another constant of separation, $-m^2$,
$$\frac{1}{Y}\frac{d^2Y}{dy^2} = -m^2, \tag{2.66}$$
$$\frac{1}{Z}\frac{d^2Z}{dz^2} = -k^2 + l^2 + m^2 = -n^2, \tag{2.67}$$
introducing a constant $n^2$ by $k^2 = l^2 + m^2 + n^2$ to produce a symmetric set of
equations. Now we have three ordinary differential equations (2.63, 2.66, and
2.67) to replace Eq. 2.59. Our assumption (Eq. 2.60) has succeeded and is
thereby justified.
Our solution should be labeled according to the choice of our constants $l$,
$m$, and $n$, that is,
$$\psi_{lmn}(x,y,z) = X_l(x)Y_m(y)Z_n(z). \tag{2.68}$$
Subject to the conditions of the problem being solved and to the condition
$k^2 = l^2 + m^2 + n^2$, we may choose $l$, $m$, and $n$ as we like, and Eq. 2.68 will
still be a solution of Eq. 2.1, provided $X_l(x)$ is a solution of Eq. 2.63, and so on.
We may develop the most general solution of Eq. 2.1 by taking a linear
combination of solutions $\psi_{lmn}$,
$$\Psi = \sum_{l,m,n} a_{lmn}\psi_{lmn}. \tag{2.69}$$
The constant coefficients $a_{lmn}$ are finally chosen to permit $\Psi$ to satisfy the
boundary conditions of the problem.
¹ The choice of sign, completely arbitrary here, will be fixed in specific
problems by the need to satisfy specific boundary conditions.
LINEAR OPERATORS
How is this possible? What is the justification for writing Eq. 2.69? The
justification is found in noting that V2 + k2 is a linear (differential) operator.
A linear operator $\mathcal{L}$ is defined as an operator with the following two properties:
$$\mathcal{L}(a\psi) = a\mathcal{L}\psi,$$
where $a$ is a constant and
$$\mathcal{L}(\psi_1 + \psi_2) = \mathcal{L}\psi_1 + \mathcal{L}\psi_2.$$
The derivatives $d^n/dx^n$ and the integral $\int[\ ]\,dx$ are examples of linear
operators. The square $(\ )^2$ and sin are examples of nonlinear operators.
In general, $(ax)^2 \ne ax^2$ and $\sin(\theta + \varphi) \ne \sin\theta + \sin\varphi$. As a consequence of
the defining properties, any linear combination of solutions of a linear
differential equation is also a solution. From its explicit form $\nabla^2 + k^2$ is seen to
have these two properties (and is therefore a linear operator). Equation 2.69 then
follows as a direct application of these two defining properties.²
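The two defining properties can be illustrated with a small numerical experiment. In the sketch below, operators are modeled as Python functions that map a function to a function; `derivative` is a central-difference stand-in for $d/dx$ and `square` is the squaring operator. All names, the test functions, and the sample point are hypothetical, and a check at a single point is only a spot check, not a proof of linearity.

```python
import math

def derivative(f, h=1e-5):
    """Central-difference approximation to the operator d/dx."""
    return lambda x: (f(x + h) - f(x - h)) / (2.0 * h)

def square(f):
    """The squaring operator: f -> f^2."""
    return lambda x: f(x) ** 2

def is_linear_at(op, f, g, a, x, tol=1e-6):
    """Spot-check L(f + g) = L f + L g and L(a f) = a L f at one point x."""
    additive = abs(op(lambda t: f(t) + g(t))(x) - (op(f)(x) + op(g)(x))) < tol
    homogeneous = abs(op(lambda t: a * f(t))(x) - a * op(f)(x)) < tol
    return additive and homogeneous

f = math.sin
g = lambda t: t * t
print(is_linear_at(derivative, f, g, 3.0, 0.8))   # derivative passes the check
print(is_linear_at(square, f, g, 3.0, 0.8))       # square fails the check
```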
A further generalization may be noted. The separation process just described
would go through just as well for
$$k^2 = f(x) + g(y) + h(z) + k'^2, \tag{2.70}$$
with $k'^2$ a new constant.
We would simply have
$$\frac{1}{X}\frac{d^2X}{dx^2} + f(x) = -l^2 \tag{2.71}$$
replacing Eq. 2.63. The solutions X, Y, and Z would be different, but the
technique of splitting the partial differential equation and of taking a linear
combination of solutions would be the same.
In case the reader wonders what is going on here, this technique of separation
of variables of a partial differential equation has been introduced to illustrate
the usefulness of these coordinate systems. The solutions of the resultant
ordinary differential equations are developed in Chapters 8 through 13.
CIRCULAR CYLINDRICAL COORDINATES
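With our unknown function $\psi$ dependent on $\rho$, $\varphi$, and $z$, the Helmholtz
equation becomes
$$\nabla^2\psi(\rho,\varphi,z) + k^2\psi(\rho,\varphi,z) = 0, \tag{2.72}$$
or
$$\frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial\psi}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2\psi}{\partial\varphi^2} + \frac{\partial^2\psi}{\partial z^2} + k^2\psi = 0. \tag{2.73}$$
² We are especially interested in linear operators because in quantum
mechanics physical quantities are represented by linear operators operating in a
complex, infinite dimensional Hilbert space.
As before, we assume a factored form for $\psi$,
$$\psi(\rho,\varphi,z) = P(\rho)\Phi(\varphi)Z(z). \tag{2.74}$$
Substituting into Eq. 2.73, we have
$$\frac{\Phi Z}{\rho}\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right) + \frac{PZ}{\rho^2}\frac{d^2\Phi}{d\varphi^2} + P\Phi\frac{d^2Z}{dz^2} + k^2P\Phi Z = 0. \tag{2.75}$$
All the partial derivatives have become ordinary derivatives. Dividing by $P\Phi Z$
and moving the $z$ derivative to the right-hand side yields
$$\frac{1}{\rho P}\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right) + \frac{1}{\rho^2\Phi}\frac{d^2\Phi}{d\varphi^2} + k^2 = -\frac{1}{Z}\frac{d^2Z}{dz^2}. \tag{2.76}$$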
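Again, we have the paradox. A function of $z$ on the right appears to depend
on a function of $\rho$ and $\varphi$ on the left. We resolve the paradox by setting each
side of Eq. 2.76 equal to a constant, the same constant. Let us choose³ $-l^2$.
Then
$$\frac{d^2Z}{dz^2} = l^2Z \tag{2.77}$$
and
$$\frac{1}{\rho P}\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right) + \frac{1}{\rho^2\Phi}\frac{d^2\Phi}{d\varphi^2} + k^2 = -l^2. \tag{2.78}$$
Setting $k^2 + l^2 = n^2$, multiplying by $\rho^2$, and rearranging terms, we obtain
$$\frac{\rho}{P}\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right) + n^2\rho^2 = -\frac{1}{\Phi}\frac{d^2\Phi}{d\varphi^2}. \tag{2.79}$$
We may set the right-hand side equal to $m^2$ and
$$\frac{d^2\Phi}{d\varphi^2} = -m^2\Phi. \tag{2.80}$$
Finally, for the $\rho$ dependence we have
$$\rho\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right) + (n^2\rho^2 - m^2)P = 0. \tag{2.81}$$
This is Bessel's differential equation. The solutions and their properties are
presented in Chapter 11.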
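As a preview, and only as a numerical spot check, the sketch below verifies that $P(\rho) = J_m(n\rho)$ satisfies Eq. 2.81. It assumes the standard power series for the Bessel function of integer order, $J_m(x) = \sum_s (-1)^s (x/2)^{2s+m} / [s!\,(s+m)!]$, which is not derived in this section. The function names, the number of series terms, the step size, and the sample values of $m$, $n$, and $\rho$ are hypothetical choices.

```python
import math

def bessel_j(m, x, terms=30):
    """Truncated power series for J_m(x), integer m >= 0 (assumed form)."""
    return sum((-1) ** s / (math.factorial(s) * math.factorial(s + m))
               * (x / 2.0) ** (2 * s + m) for s in range(terms))

def radial_residual(P, rho, n, m, h=1e-4):
    """Left-hand side of Eq. 2.81, derivatives by central differences."""
    dP = (P(rho + h) - P(rho - h)) / (2.0 * h)
    d2P = (P(rho + h) - 2.0 * P(rho) + P(rho - h)) / (h * h)
    # rho d/d(rho) (rho dP/d(rho)) = rho^2 P'' + rho P'
    return rho * rho * d2P + rho * dP + (n * n * rho * rho - m * m) * P(rho)

m, n = 2, 1.5
residual = radial_residual(lambda rho: bessel_j(m, n * rho), 1.3, n, m)
print(residual)   # close to zero, limited by the finite-difference error
```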
The original Helmholtz equation, a three-dimensional partial differential
equation, has been replaced by three ordinary differential equations, Eqs. 2.77,
2.80, and 2.81. A solution of the Helmholtz equation is
$$\psi(\rho,\varphi,z) = P(\rho)\Phi(\varphi)Z(z). \tag{2.74}$$
³ The choice of sign of the separation constant is arbitrary. However, a minus
sign is chosen for the axial coordinate $z$ in expectation of a possible exponential
dependence on $z$ (from Eq. 2.77). A positive sign is chosen for the azimuthal
coordinate $\varphi$ in expectation of a periodic dependence on $\varphi$ (from Eq. 2.80).
Identifying the specific $P$, $\Phi$, $Z$ solutions by subscripts, we see that the most
general solution of the Helmholtz equation is a linear combination of the
product solutions:
$$\Psi(\rho,\varphi,z) = \sum_{m,n} a_{mn}P_{mn}(\rho)\Phi_m(\varphi)Z_n(z). \tag{2.82}$$
SPHERICAL POLAR COORDINATES
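Let us try to separate Eq. 2.1, again with $k^2$ constant, in spherical polar
coordinates. Using Eq. 2.46, we obtain
$$\frac{1}{r^2\sin\theta}\left[\sin\theta\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin\theta}\frac{\partial^2\psi}{\partial\varphi^2}\right] = -k^2\psi. \tag{2.83}$$
Now, in analogy with Eq. 2.60 we try
$$\psi(r,\theta,\varphi) = R(r)\Theta(\theta)\Phi(\varphi). \tag{2.84}$$
By substituting back into Eq. 2.83 and dividing by $R\Theta\Phi$, we have
$$\frac{1}{Rr^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{1}{\Theta r^2\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{1}{\Phi r^2\sin^2\theta}\frac{d^2\Phi}{d\varphi^2} = -k^2. \tag{2.85}$$
Note that all derivatives are now ordinary derivatives rather than partials. By
multiplying by $r^2\sin^2\theta$, we can isolate $(1/\Phi)(d^2\Phi/d\varphi^2)$ to obtain⁴
$$\frac{1}{\Phi}\frac{d^2\Phi}{d\varphi^2} = r^2\sin^2\theta\left[-k^2 - \frac{1}{r^2R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{1}{r^2\sin\theta\,\Theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right)\right]. \tag{2.86}$$
Equation 2.86 relates a function of $\varphi$ alone to a function of $r$ and $\theta$ alone.
Since $r$, $\theta$, and $\varphi$ are independent variables, we equate each side of Eq. 2.86 to a
constant. Here a little consideration can simplify the later analysis. In almost
all physical problems $\varphi$ will appear as an azimuth angle. This suggests a periodic
solution rather than an exponential. With this in mind, let us use $-m^2$ as the
separation constant. Any constant will do, but this one will make life a little
easier. Then
$$\frac{1}{\Phi}\frac{d^2\Phi(\varphi)}{d\varphi^2} = -m^2 \tag{2.87}$$
and
$$\frac{1}{r^2R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{1}{r^2\sin\theta\,\Theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) - \frac{m^2}{r^2\sin^2\theta} = -k^2. \tag{2.88}$$
Multiplying Eq. 2.88 by $r^2$ and rearranging terms, we obtain
$$\frac{1}{R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + r^2k^2 = -\frac{1}{\sin\theta\,\Theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{m^2}{\sin^2\theta}. \tag{2.89}$$
Again, the variables are separated. We equate each side to a constant $Q$ and
finally obtain
$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) - \frac{m^2}{\sin^2\theta}\Theta + Q\Theta = 0, \tag{2.90}$$
$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + k^2R - \frac{QR}{r^2} = 0. \tag{2.91}$$
⁴ The order in which the variables are separated here is not unique. Many
quantum mechanics texts show the $r$ dependence split off first.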
Once more we have replaced a partial differential equation of three variables by
three ordinary differential equations. The solutions of these ordinary differential
equations are discussed in Chapters 11 and 12. In Chapter 12, for example,
Eq. 2.90 is identified as the associated Legendre equation in which the constant
$Q$ becomes $l(l + 1)$; $l$ is an integer. If $k^2$ is a (positive) constant, Eq. 2.91 becomes
the spherical Bessel equation of Section 11.7.
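A quick numerical spot check of Eqs. 2.90 and 2.91 is sketched below for a few simple trial functions. The choices are assumptions made for illustration: $\Theta = \cos\theta$ with $m = 0$ and $\Theta = \sin\theta$ with $m = 1$, both for $Q = l(l + 1) = 2$, and the radial function $R = \sin(kr)/(kr)$ for $Q = 0$. Derivatives are approximated by nested central differences, so each residual is small rather than exactly zero; the function names and sample points are hypothetical.

```python
import math

def theta_residual(Theta, theta, m, Q, h=1e-4):
    """Left-hand side of Eq. 2.90, derivatives by central differences."""
    def flux(t):
        return math.sin(t) * (Theta(t + h) - Theta(t - h)) / (2.0 * h)
    div = (flux(theta + h) - flux(theta - h)) / (2.0 * h)
    s = math.sin(theta)
    return div / s - (m * m / (s * s)) * Theta(theta) + Q * Theta(theta)

def radial_residual(R, r, k, Q, h=1e-4):
    """Left-hand side of Eq. 2.91, derivatives by central differences."""
    def flux(x):
        return x * x * (R(x + h) - R(x - h)) / (2.0 * h)
    div = (flux(r + h) - flux(r - h)) / (2.0 * h)
    return div / (r * r) + k * k * R(r) - Q * R(r) / (r * r)

k = 2.0
res_theta_10 = theta_residual(math.cos, 0.9, m=0, Q=2)   # l = 1, m = 0
res_theta_11 = theta_residual(math.sin, 0.9, m=1, Q=2)   # l = 1, m = 1
res_radial = radial_residual(lambda r: math.sin(k * r) / (k * r), 1.7, k, Q=0)
print(res_theta_10, res_theta_11, res_radial)
```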
Again, our most general solution may be written
$$\psi_{Qm}(r,\theta,\varphi) = \sum_{Q,m} R_Q(r)\Theta_{Qm}(\theta)\Phi_m(\varphi). \tag{2.92}$$
The restriction that $k^2$ be a constant is unnecessarily severe. The separation
process will still be possible for $k^2$ as general as
$$k^2 = f(r) + \frac{1}{r^2}g(\theta) + \frac{1}{r^2\sin^2\theta}h(\varphi) + k'^2. \tag{2.93}$$
In the hydrogen atom problem, one of the most important examples of the
Schrödinger wave equation with a closed form solution, we have $k^2 = f(r)$.
Equation 2.91 for the hydrogen atom becomes the associated Laguerre equation.
The great importance of this separation of variables in spherical polar
coordinates stems from the fact that the case $k^2 = k^2(r)$ covers a tremendous
amount of physics: a great deal of the theories of gravitation, electrostatics,
atomic physics, and nuclear physics. And, with $k^2 = k^2(r)$, the angular
dependence is isolated in Eqs. 2.87 and 2.90, which can be solved exactly.
Separation of variables and an investigation of the resulting ordinary
differential equations are discussed again in Section 8.3.
EXERCISES
2.6.1 By letting the operator $\nabla^2 + k^2$ act on the general form $a_1\psi_1(x,y,z) + a_2\psi_2(x,y,z)$,
show that it is linear, that is, $(\nabla^2 + k^2)(a_1\psi_1 + a_2\psi_2) = a_1(\nabla^2 + k^2)\psi_1 + a_2(\nabla^2 + k^2)\psi_2$.
2.6.2 Show that the Helmholtz equation
$$\nabla^2\psi + k^2\psi = 0$$
is still separable in circular cylindrical coordinates if $k^2$ is generalized to
$k^2 + f(\rho) + (1/\rho^2)g(\varphi) + h(z)$.
2.6.3 Separate variables in the Helmholtz equation in spherical polar coordinates,
splitting off the radial dependence first. Show that your separated equations have the
same form as Eqs. 2.87, 2.90, and 2.91.
2.6.4 Verify that
$$\nabla^2\psi(r,\theta,\varphi) + \left[k^2 + f(r) + \frac{1}{r^2}g(\theta) + \frac{1}{r^2\sin^2\theta}h(\varphi)\right]\psi(r,\theta,\varphi) = 0$$
is separable (in spherical polar coordinates). The functions $f$, $g$, and $h$ are functions
only of the variables indicated; $k^2$ is a constant.
2.6.5 An atomic (quantum mechanical) particle is confined inside a rectangular box of
sides $a$, $b$, and $c$. The particle is described by a wave function $\psi$ which satisfies the
Schrödinger wave equation
$$-\frac{\hbar^2}{2m}\nabla^2\psi = E\psi.$$
The wave function is required to vanish at each surface of the box (but not to be
identically zero). This condition imposes constraints on the separation constants
and therefore on the energy $E$. What is the smallest value of $E$ for which such a
solution can be obtained?
2.6.6 For a homogeneous spherical solid with constant thermal diffusivity, $K$, and no
heat sources the equation of heat conduction becomes
$$\frac{\partial T}{\partial t} = K\nabla^2T.$$
Assume a solution of the form
$$T = R(r)T(t)$$
and separate variables. Show that the radial equation may take on the standard
form
$$r^2\frac{d^2R}{dr^2} + 2r\frac{dR}{dr} + [\alpha^2r^2 - n(n + 1)]R = 0; \qquad n = \text{integer}.$$
The solutions of this equation are called spherical Bessel functions.
2.6.7 Separate variables in the thermal diffusion equation of Exercise 2.6.6 in circular
cylindrical coordinates. Assume that you can neglect end effects and take $T = T(\rho, t)$.
Additional exercises on separation of variables appear at the end of Section 8.3.
REFERENCES
Margenau, H., and G. M. Murphy. The Mathematics of Physics and Chemistry, 2nd ed.
Princeton, N.J.: D. Van Nostrand (1956).
Chapter 5 covers curvilinear coordinates and 13 specific coordinate systems.
Morse, P. M., and H. Feshbach. Methods of Theoretical Physics. New York:
McGraw-Hill (1953).
Chapter 5 includes a description of several different coordinate systems. Note carefully
that Morse and Feshbach are not above using left-handed coordinate systems even for
cartesian coordinates. Elsewhere in this excellent (and difficult) book there are many
examples of the use of the various coordinate systems in solving physical problems.
Eleven additional fascinating but seldom encountered orthogonal coordinate systems
are discussed in the second (1970) edition of Mathematical Methods for Physicists.
3 TENSOR ANALYSIS
3.1 INTRODUCTION, DEFINITIONS
Tensors are important in many areas of physics, including general relativity
and electromagnetic theory. One of the more prolific sources of tensor
quantities is the anisotropic solid. Here the elastic, optical, electrical, and magnetic
properties may well involve tensors. The elastic properties of the anisotropic
solid are considered in some detail in Section 3.6. As an introductory
illustration, let us consider the flow of electric current. We can write Ohm's law in the
usual form
$$\mathbf{J} = \sigma\mathbf{E}, \tag{3.1}$$
with current density $\mathbf{J}$ and electric field $\mathbf{E}$, both vector quantities.¹ If we have an
isotropic medium, $\sigma$, the conductivity, is a scalar, and for the x-component, for
example,
$$J_1 = \sigma E_1. \tag{3.2}$$
However, if our medium is anisotropic, as in many crystals, or a plasma in the
presence of a magnetic field, the current density in the x-direction may depend
on the electric fields in the y- and z-directions as well as on the field in the
x-direction. Assuming a linear relationship, we must replace Eq. 3.2 with
$$J_1 = \sigma_{11}E_1 + \sigma_{12}E_2 + \sigma_{13}E_3, \tag{3.3}$$
and, in general,
$$J_i = \sum_k \sigma_{ik}E_k. \tag{3.4}$$
For ordinary three-dimensional space the scalar conductivity $\sigma$ has given way
to a set of nine elements, $\sigma_{ik}$,
$$\sigma_{ik} = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_{22} & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix}. \tag{3.5}$$
This array of nine elements actually forms a tensor, as shown in Section 3.3.
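Equation 3.4 is a matrix-vector product, which the short sketch below makes explicit. The conductivity values are made up purely for illustration (they do not describe any real material), and the function name is hypothetical. With an isotropic $\sigma$ the current is parallel to the field; with the anisotropic array a field along $x$ also drives a current along $y$.

```python
def current_density(sigma, E):
    """Eq. 3.4: J_i = sum over k of sigma_ik E_k."""
    return [sum(sigma[i][k] * E[k] for k in range(3)) for i in range(3)]

# Hypothetical conductivity arrays, arbitrary units.
sigma_isotropic = [[2.0, 0.0, 0.0],
                   [0.0, 2.0, 0.0],
                   [0.0, 0.0, 2.0]]
sigma_anisotropic = [[2.0, 0.5, 0.0],
                     [0.5, 1.0, 0.0],
                     [0.0, 0.0, 3.0]]

E = [1.0, 0.0, 0.0]                      # field along the x-direction
J_iso = current_density(sigma_isotropic, E)
J_aniso = current_density(sigma_anisotropic, E)
print(J_iso)     # parallel to E
print(J_aniso)   # acquires a y-component
```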
¹ Another example of this type of physical equation appears in Section 4.6.
Generalizing Eq. 3.1, if a relation A = ВС with A and С as nonparallel
vectors holds in all orientations of a cartesian system, then В is a (second-rank)
tensor. This is proven in Section 3.3. Other physical problems giving rise to
tensors include elasticity (Section 3.6), electromagnetism (Section 3.7), the
inertia matrix (Section 4.6), and above all, general relativity.
In Chapter 1 a quantity that did not change under rotations of the coordinate
system, an invariant quantity, was labeled a scalar. A quantity whose
components transformed like those of the distance of a point from a chosen origin
(Eq. 1.9, Section 1.2) was called a vector. This transformation property was
adopted as the defining characteristic of a vector. The transformation of the
components of the vector under a rotation of the coordinates just preserves the
vector as a geometric entity (such as an arrow in space), independent of the
orientation of the reference frame.
There is a possible ambiguity in this transformation definition of a vector,
$$x'_i = \sum_j a_{ij}x_j, \tag{3.6}$$
in which $a_{ij}$ is the cosine of the angle between the $x'_i$-axis and the $x_j$-axis.
If we start with a differential distance vector $d\mathbf{x}$, then, taking $dx'_i$ to be a
function of the unprimed variables,
$$dx'_i = \sum_j \frac{\partial x'_i}{\partial x_j}\,dx_j \tag{3.7}$$
by partial differentiation. If we set
$$a_{ij} = \frac{\partial x'_i}{\partial x_j}, \tag{3.8}$$
Eqs. 3.6 and 3.7 are consistent. Any set of quantities $A_j$ transforming according
to
$$A'_i = \sum_j \frac{\partial x'_i}{\partial x_j}A_j \tag{3.9}$$
is defined as a contravariant vector.
However, we have already encountered a slightly different type of vector
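transformation. The gradient of a scalar, $\nabla\varphi$, defined by
$$\nabla\varphi = \mathbf{i}\frac{\partial\varphi}{\partial x_1} + \mathbf{j}\frac{\partial\varphi}{\partial x_2} + \mathbf{k}\frac{\partial\varphi}{\partial x_3} \tag{3.10}$$
(using $x_1, x_2, x_3$ for x, y, z), transforms as
$$\frac{\partial\varphi'}{\partial x'_i} = \sum_j \frac{\partial\varphi}{\partial x_j}\frac{\partial x_j}{\partial x'_i}, \tag{3.11}$$
using $\varphi = \varphi(x, y, z) = \varphi(x', y', z') = \varphi'$, $\varphi$ defined as a scalar quantity. Notice
that this differs from Eq. 3.9 in that we have $\partial x_j/\partial x'_i$ instead of $\partial x'_i/\partial x_j$. Equation
3.11 is taken as the definition of a covariant vector with the gradient as the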
prototype.
In cartesian coordinates
$$\frac{\partial x_j}{\partial x'_i} = \frac{\partial x'_i}{\partial x_j} = a_{ij}, \tag{3.12}$$
and there is no difference between contravariant and covariant transformations.
In other systems Eq. 3.12 in general does not apply, and the distinction between
contravariant and covariant is real and must be observed. This is of prime
importance in the curved Riemannian space of general relativity. A much
simpler example is provided by the oblique coordinates of Section 4.4.
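To make the distinction concrete, the short Python sketch below compares the two sets of coefficients, $\partial x'_i/\partial x_j$ (Eq. 3.9) and $\partial x_j/\partial x'_i$ (Eq. 3.11), for a linear change of coordinates $x' = Mx$ in two dimensions. The shear matrix used as a stand-in for oblique axes, the angle, and all function names are assumptions made for illustration only.

```python
import math

def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def contravariant_coefficients(m):
    """Entries dx'_i/dx_j for the linear change x' = m x (Eq. 3.9)."""
    return m

def covariant_coefficients(m):
    """Entries dx_j/dx'_i, the transposed inverse of m (Eq. 3.11)."""
    return transpose(inverse_2x2(m))

def max_difference(p, q):
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))

t = 0.3
rotation = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]
shear = [[1.0, 0.5], [0.0, 1.0]]   # hypothetical non-orthogonal (oblique) change

rotation_gap = max_difference(contravariant_coefficients(rotation),
                              covariant_coefficients(rotation))
shear_gap = max_difference(contravariant_coefficients(shear),
                           covariant_coefficients(shear))
print(rotation_gap, shear_gap)
```

For the rotation the two sets of coefficients agree to rounding error, as Eq. 3.12 requires; for the shear they differ, so the two transformation laws are no longer interchangeable.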
In the remainder of this section the components of a contravariant vector
are denoted by a superscript, $A^i$, whereas a subscript is used for the components
of a covariant vector $A_i$.²
Definition of Second-rank Tensors
To remove some of the fear and mystery from the term tensor, let us rechristen
a scalar as a tensor of rank zero and relabel a vector as a tensor of first rank.
Then we proceed to define contravariant, mixed, and covariant tensors of
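second rank by the following equations:
$$\begin{aligned}
A'^{ij} &= \sum_{kl} \frac{\partial x'_i}{\partial x_k}\frac{\partial x'_j}{\partial x_l}A^{kl}, \\
B'^{i}_{j} &= \sum_{kl} \frac{\partial x'_i}{\partial x_k}\frac{\partial x_l}{\partial x'_j}B^{k}_{l}, \\
C'_{ij} &= \sum_{kl} \frac{\partial x_k}{\partial x'_i}\frac{\partial x_l}{\partial x'_j}C_{kl}.
\end{aligned} \tag{3.13}$$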
Clearly, the rank goes as the number of partial derivatives (or direction cosines)
in the definition: zero for a scalar, one for a vector, two for a second-rank
tensor, and so on. Each index (subscript or superscript) ranges over the number
of dimensions of the space. The number of indices (rank of tensor) is independent
of the dimensions of the space. We see that $A^{kl}$ is contravariant with
respect to both indices, $C_{kl}$ is covariant with respect to both indices, and $B^k_l$
transforms contravariantly with respect to the first index k but covariantly with
respect to the second index l. Once again, if we are using cartesian coordinates,
all three forms of the tensors of second rank, contravariant, mixed, and
covariant, are the same.
As with the components of a vector, the transformation laws for the
components of a tensor, Eq. 3.13, yield entities (and properties) that are
independent of the choice of reference frame. This is what makes tensor analysis
important in physics. The independence of reference frame (invariance) is ideal
for expressing and investigating universal physical laws.
2 This means that the coordinates (x, y, z) should be written $(x^1, x^2, x^3)$ since
r transforms as a contravariant vector. Because we shall shortly restrict our
attention to cartesian tensors (where the distinction between contravariance
and covariance disappears) we continue to use subscripts on the coordinates.
This avoids the ambiguity of $x^2$ representing both x squared and y.
The second-rank tensor A (components $A^{kl}$) may be conveniently
represented by writing out its components in a square array (3 × 3 if we are in
three-dimensional space),
$$\mathsf{A} = \begin{pmatrix} A^{11} & A^{12} & A^{13} \\ A^{21} & A^{22} & A^{23} \\ A^{31} & A^{32} & A^{33} \end{pmatrix}. \tag{3.14}$$
This does not mean that any square array of numbers or functions forms a
tensor. The essential condition is that the components transform according to
Eq. 3.13.
In the context of matrix analysis the preceding transformation equations
become (for cartesian coordinates) an orthogonal similarity transformation,
Section 4.3. A geometrical interpretation of a second-rank tensor (the inertia
tensor) is developed in Section 4.6.
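The statement that the transformation law preserves physical relations can be checked numerically. The sketch below, restricted to cartesian rotations (where Eq. 3.13 reduces to $A'_{ij} = a_{ik}a_{jl}A_{kl}$), rotates the current density of Eq. 3.4 directly and compares it with the result of first rotating the conductivity array and the field. The numerical values and all function names are assumptions introduced for illustration.

```python
import math

def rotation_z(theta):
    """Direction cosines a_ij for axes rotated by theta about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_vector(a, v):
    """v'_i = a_ij v_j."""
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]

def transform_tensor(a, t):
    """t'_ij = a_ik a_jl t_kl, the cartesian form of Eq. 3.13."""
    return [[sum(a[i][k] * a[j][l] * t[k][l]
                 for k in range(3) for l in range(3))
             for j in range(3)]
            for i in range(3)]

def contract_with_vector(t, v):
    """(t v)_i = t_ik v_k."""
    return [sum(t[i][k] * v[k] for k in range(3)) for i in range(3)]

# Illustrative numbers only (assumed, not taken from the text).
sigma = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 3.0]]
E = [1.0, 2.0, -1.0]
a = rotation_z(0.7)

# Either rotate J = sigma E, or rotate sigma and E first and then form J'.
J_rotated = transform_vector(a, contract_with_vector(sigma, E))
J_from_rotated_parts = contract_with_vector(transform_tensor(a, sigma),
                                            transform_vector(a, E))
mismatch = max(abs(p - q) for p, q in zip(J_rotated, J_from_rotated_parts))
print(mismatch)   # expected to be at rounding-error level
```

The two routes agree because the rotation coefficients satisfy $a_{jl}a_{jm} = \delta_{lm}$; an arbitrary square array that did not transform by Eq. 3.13 would fail this check.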
Addition and Subtraction of Tensors
The addition and subtraction of tensors is defined in terms of the individual
elements just as for vectors. To add or subtract two tensors, we add or subtract
the corresponding elements. If
$$\mathsf{A} + \mathsf{B} = \mathsf{C}, \tag{3.15}$$
then
$$A^{ij} + B^{ij} = C^{ij}.$$
Of course, A and B must be tensors of the same rank and both expressed in a
space of the same number of dimensions.
Summation Convention
In tensor analysis it is customary to adopt a summation convention to put
Eq. 3.13 and subsequent tensor equations in a more compact (and, for the
beginning student, a more obscure) form. As long as we are distinguishing between
contravariance and covariance, let us agree that when an index appears on one
side of an equation, once as a superscript and once as a subscript, we
automatically sum over that index. Then we may write the second expression in
Eq. 3.13 as
$$B'^{i}_{j} = \frac{\partial x'_i}{\partial x_k}\frac{\partial x_l}{\partial x'_j}B^{k}_{l}, \tag{3.16}$$
with the summation of the right-hand side over k and l implied. This is the
summation convention.³
To illustrate the use of the summation convention and some of the techniques
of tensor analysis, let us show that the now familiar Kronecker delta, $\delta_{kl}$, is
really a mixed tensor of second rank, $\delta^k_l$.⁴ The question is, does $\delta^k_l$ transform
3 In this context $\partial x'_i/\partial x_k$ might better be written as $a^i_k$ and $\partial x_l/\partial x'_j$ as $b^l_j$.
4 It is common practice to refer to a tensor A by specifying a typical component,
$A_{ij}$. As long as the reader refrains from writing nonsense such as $\mathsf{A} = A_{ij}$, no
harm is done.
according to Eq. 3.13? This is our criterion for calling it a tensor. We have,
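using the summation convention,
$$\delta^k_l\,\frac{\partial x'_i}{\partial x_k}\frac{\partial x_l}{\partial x'_j} = \frac{\partial x'_i}{\partial x_k}\frac{\partial x_k}{\partial x'_j} \tag{3.17}$$
by definition of the Kronecker delta. Now
$$\frac{\partial x'_i}{\partial x_k}\frac{\partial x_k}{\partial x'_j} = \frac{\partial x'_i}{\partial x'_j} \tag{3.18}$$
by direct partial differentiation of the right-hand side (chain rule). However,
$x'_i$ and $x'_j$ are independent coordinates, and therefore the variation of one with
respect to the other must be zero if they are different, unity if they coincide;
that is,
$$\frac{\partial x'_i}{\partial x'_j} = \delta'^{i}_{j}. \tag{3.19}$$
Hence
$$\delta'^{i}_{j} = \frac{\partial x'_i}{\partial x_k}\frac{\partial x_l}{\partial x'_j}\,\delta^k_l,$$
showing that the $\delta^k_l$ are indeed the components of a mixed second-rank tensor.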
Notice that this result is independent of the number of dimensions of our space.
The Kronecker delta has one further interesting property. It has the same
components in all of our rotated coordinate systems and is therefore called
isotropic. In Section 3.4 we shall meet a third-rank isotropic tensor and three
fourth-rank isotropic tensors. No isotropic first-rank tensor (vector) exists.
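The isotropy of the Kronecker delta can be seen numerically with a few lines of Python. This sketch applies the cartesian second-rank law $\delta'_{ij} = a_{ik}a_{jl}\delta_{kl}$ for an arbitrarily chosen rotation and, for contrast, transforms a single vector. The rotation angles, the sample vector, and the function names are assumptions made for illustration.

```python
import math

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transform_tensor(a, t):
    """t'_ij = a_ik a_jl t_kl for cartesian rotations."""
    return [[sum(a[i][k] * a[j][l] * t[k][l]
                 for k in range(3) for l in range(3))
             for j in range(3)]
            for i in range(3)]

delta = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
a = matmul(rotation_x(0.4), rotation_z(1.1))   # an arbitrary rotation

delta_primed = transform_tensor(a, delta)
delta_change = max(abs(delta_primed[i][j] - delta[i][j])
                   for i in range(3) for j in range(3))

v = [1.0, 0.0, 0.0]                            # a sample vector, for contrast
v_primed = [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]
vector_change = max(abs(v_primed[i] - v[i]) for i in range(3))
print(delta_change, vector_change)
```

The components of the delta are unchanged to rounding error, whereas the components of the sample vector change appreciably, in line with the remark that no isotropic vector exists.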
Symmetry—Antisymmetry
The order in which the indices appear in our description of a tensor is
important. In general, $A^{mn}$ is independent of $A^{nm}$, but there are some cases of
special interest. If
$$A^{mn} = A^{nm}, \tag{3.20}$$
we call the tensor symmetric. If, on the other hand,
$$A^{mn} = -A^{nm}, \tag{3.21}$$
the tensor is antisymmetric. Clearly, every (second-rank) tensor can be resolved
into symmetric and antisymmetric parts by the identity
$$A^{mn} = \tfrac{1}{2}(A^{mn} + A^{nm}) + \tfrac{1}{2}(A^{mn} - A^{nm}), \tag{3.22}$$
the first term on the right being a symmetric tensor, the second, an
antisymmetric tensor. This resolution into symmetric and antisymmetric tensors will
reappear in the theory of elasticity (Section 3.6). A similar resolution of
functions into symmetric and antisymmetric parts is of extreme importance to
quantum mechanics.
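The identity of Eq. 3.22 is easy to try on a concrete array. In the following Python sketch the sample array `A` and the function names are assumptions chosen for illustration.

```python
def symmetric_part(t):
    """First term of Eq. 3.22: (A_mn + A_nm) / 2."""
    n = len(t)
    return [[0.5 * (t[m][k] + t[k][m]) for k in range(n)] for m in range(n)]

def antisymmetric_part(t):
    """Second term of Eq. 3.22: (A_mn - A_nm) / 2."""
    n = len(t)
    return [[0.5 * (t[m][k] - t[k][m]) for k in range(n)] for m in range(n)]

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]        # illustrative values only

S = symmetric_part(A)
W = antisymmetric_part(A)
recombined = [[S[m][k] + W[m][k] for k in range(3)] for m in range(3)]
print(S)
print(W)
```

The symmetric part carries six independent components and the antisymmetric part three, which together account for the nine components of the original array.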
Spinors
It was once thought that the system of scalars, vectors, tensors (second-
rank), and so on formed a complete mathematical system, one that is adequate
for describing a physics independent of the choice of reference frame. But the
universe (and mathematical physics) are not this simple. In the realm of
elementary particles, for example, spin zero particles⁵ (π mesons, α particles) may be
described with scalars, spin 1 particles (deuterons) by vectors, and spin 2
particles (hypothetical gravitons) by tensors. This listing omits the most
common particles: electrons, protons, and neutrons, all with spin 1/2. These particles
are properly described by spinors. A spinor is not a scalar, vector, or tensor. A
brief introduction to spinors in the context of group theory appears in Section
4.10.
EXERCISES
3.1.1 Show that if the components of any tensor of any rank vanish in one particular
coordinate system they vanish in all coordinate systems.
Note. This point takes on especial importance in the four-dimensional curved space
of general relativity. If a quantity, expressed as a tensor, exists in one coordinate
system, it exists in all coordinate systems and is not just a consequence of a choice
of a coordinate system (as are centrifugal and Coriolis forces in Newtonian
mechanics).
3.1.2 The components of tensor A are equal to the corresponding components of tensor
B in one particular coordinate system; that is,
$$A^0_{ij} = B^0_{ij}.$$
Show that tensor A is equal to tensor B, $A_{ij} = B_{ij}$, in all coordinate systems.
3.1.3 The first three components of a four-dimensional vector vanish in each of two
reference frames. If the second reference frame is not merely a rotation of the first
about the $x_4$ axis, that is, if at least one of the coefficients $a_{i4}$ $(i = 1, 2, 3) \neq 0$, show
that the fourth component vanishes in all reference frames. Translated into
relativistic mechanics this means that if momentum is conserved in two Lorentz frames,
then energy is conserved in all Lorentz frames.
3.1.4 From an analysis of the behavior of a general second-rank tensor under 90° and
180° rotations about the coordinate axes, show that an isotropic second-rank
tensor in three-dimensional space must be a multiple of $\delta_{ij}$.
3.1.5 The four-dimensional fourth-rank Riemann-Christoffel curvature tensor of
general relativity, $R_{iklm}$, satisfies the symmetry relations
$$R_{iklm} = -R_{ikml} = -R_{kilm}.$$
With the indices running from 1 to 4, show that the number of independent
components is reduced from 256 to 36 and that the condition
$$R_{iklm} = R_{lmik}$$
5 The particle spin is intrinsic angular momentum (in units of ħ). It is distinct
from classical, orbital angular momentum due to motion.
further reduces the number of independent components to 21. Finally, if the
components satisfy an identity $R_{iklm} + R_{ilmk} + R_{imkl} = 0$, show that the number of
independent components is reduced to 20.
Note. The final three-term identity furnishes new information only if all four
indices are different. Then it reduces the number of independent components by
one third.
3.1.6 $T_{iklm}$ is antisymmetric with respect to all pairs of indices. How many independent
components has it (three-dimensional space)?
3.2 CONTRACTION, DIRECT PRODUCT
Contraction
When dealing with vectors, we formed a scalar product (Section 1.3) by
summing products of corresponding components:
$$\mathbf{A}\cdot\mathbf{B} = A_iB_i \quad \text{(summation convention)}. \tag{3.23}$$
The generalization of this expression in tensor analysis is a process known as
contraction. Two indices, one covariant and the other contravariant, are set
equal to each other and then (as implied by the summation convention) we sum
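over this repeated index. For example, let us contract the second-rank mixed
tensor $B'^{i}_{j}$:
$$B'^{i}_{i} = \frac{\partial x'_i}{\partial x_k}\frac{\partial x_l}{\partial x'_i}B^{k}_{l} = \frac{\partial x_l}{\partial x_k}B^{k}_{l} \tag{3.24}$$
by Eq. 3.18 and then by Eq. 3.19
$$B'^{i}_{i} = \delta^{l}_{k}B^{k}_{l} = B^{k}_{k}. \tag{3.25}$$
Our contracted second-rank mixed tensor is invariant and therefore a scalar.¹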
This is exactly what we obtained in Section 1.3 for the dot product of two vectors
and Section 1.7 for the divergence of a vector. In general, the operation of
contraction reduces the rank of a tensor by 2. An example of the use of contraction
appears in Section 3.6.
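A small numerical check of Eq. 3.25 may help. The Python sketch below contracts a second-rank array before and after a cartesian rotation (for which the mixed law reduces to $B'_{ij} = a_{ik}a_{jl}B_{kl}$) and compares the two scalars. The array `B`, the rotation angle, and the function names are assumptions for illustration.

```python
import math

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_tensor(a, t):
    """t'_ij = a_ik a_jl t_kl for cartesian rotations."""
    return [[sum(a[i][k] * a[j][l] * t[k][l]
                 for k in range(3) for l in range(3))
             for j in range(3)]
            for i in range(3)]

def contract(t):
    """Set the two indices equal and sum: B_ii."""
    return sum(t[i][i] for i in range(3))

B = [[1.0, 2.0, 0.0],
     [-1.0, 4.0, 3.0],
     [0.5, 0.0, -2.0]]       # illustrative values only

trace_before = contract(B)
trace_after = contract(transform_tensor(rotation_z(0.9), B))
print(trace_before, trace_after)
```

The individual components of `B` change under the rotation, but the contracted quantity does not, which is what is meant by calling it a scalar.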
Direct Product
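The components of a covariant vector (first-rank tensor) $a_i$ and those of a
contravariant vector (first-rank tensor) $b^j$ may be multiplied component by
component to give the general term $a_ib^j$. This, by Eq. 3.13, is actually a second-
rank tensor, for
$$a'_ib'^{j} = \frac{\partial x_k}{\partial x'_i}a_k\,\frac{\partial x'_j}{\partial x_l}b^l = \frac{\partial x_k}{\partial x'_i}\frac{\partial x'_j}{\partial x_l}(a_kb^l). \tag{3.26}$$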
1 In matrix analysis this scalar is the trace of the matrix, Section 4.2.
Contracting, we obtain
$$a'_ib'^{i} = a_kb^k, \tag{3.27}$$
as in Eqs. 3.24 and 3.25 to give the regular scalar product.
The operation of adjoining two vectors $a_i$ and $b^j$ as in the last paragraph is
known as forming the direct product. For the case of two vectors, the direct
product is a tensor of second rank. In this sense we may attach meaning to $\nabla\mathbf{E}$,
which was not defined within the framework of vector analysis. In general, the
direct product of two tensors is a tensor of rank equal to the sum of the two
initial ranks; that is,
$$A^{i}_{j}B^{kl} = C^{ikl}_{j}, \tag{3.28}$$
where $C^{ikl}_{j}$ is a tensor of fourth rank. From Eqs. 3.13
$$C'^{ikl}_{j} = \frac{\partial x'_i}{\partial x_m}\frac{\partial x_n}{\partial x'_j}\frac{\partial x'_k}{\partial x_p}\frac{\partial x'_l}{\partial x_q}C^{mpq}_{n}.$$
The direct product appears in mathematical physics as a technique for
creating new higher-rank tensors. Exercise 3.2.1 is a form of the direct product
in which the first factor is $\nabla$. Applications appear in Section 3.7.
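The direct product of two vectors can be examined numerically. The sketch below is restricted to cartesian rotations, an assumption under which the covariant and contravariant laws coincide. It forms $T_{ij} = u_iw_j$, checks that building the product from the rotated vectors agrees with transforming $T_{ij}$ as a second-rank tensor, and contracts it to recover the scalar product. The sample vectors and function names are hypothetical.

```python
import math

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_vector(a, v):
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]

def transform_tensor(a, t):
    return [[sum(a[i][k] * a[j][l] * t[k][l]
                 for k in range(3) for l in range(3))
             for j in range(3)]
            for i in range(3)]

def direct_product(u, w):
    """T_ij = u_i w_j."""
    return [[u[i] * w[j] for j in range(3)] for i in range(3)]

u = [1.0, 2.0, 3.0]          # illustrative vectors only
w = [-1.0, 0.5, 2.0]
a = rotation_z(0.5)

T = direct_product(u, w)
T_from_rotated_vectors = direct_product(transform_vector(a, u),
                                        transform_vector(a, w))
T_transformed = transform_tensor(a, T)
mismatch = max(abs(T_from_rotated_vectors[i][j] - T_transformed[i][j])
               for i in range(3) for j in range(3))

contraction = sum(T[i][i] for i in range(3))    # T_ii
dot = sum(u[i] * w[i] for i in range(3))        # u . w
print(mismatch, contraction, dot)
```

Contracting the direct product gives back the ordinary scalar product, as in Eq. 3.27.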
When T is an nth rank cartesian tensor, $(\partial/\partial x_i)T_{jkl\ldots}$, an element of $\nabla\mathsf{T}$, is a
cartesian tensor of rank n + 1 (Exercise 3.2.1). However, $(\partial/\partial x_i)T_{jkl\ldots}$ is not a
tensor under more general transformations. In noncartesian systems $\partial/\partial x'_i$ will
act on the partial derivatives $\partial x_p/\partial x'_q$ and destroy the simple tensor
transformation relation.
So far the distinction between a covariant transformation and a
contravariant transformation has been maintained because it does exist in
noncartesian space and because it is of great importance in general relativity. In
Sections 3.8 and 3.9 we shall develop differential relations for noncartesian
tensors. Now, however, because of the simplification achieved, we restrict
ourselves to cartesian tensors. As noted in Section 3.1, the distinction between
contravariance and covariance disappears and all indices are from now on
shown as subscripts. We restate the summation convention and the operation
of contraction.
Summation Convention
When a subscript (letter, not number) appears twice on one side of an
equation, summation with respect to that subscript is implied.
Contraction
Contraction consists of setting two unlike indices (subscripts) equal to each
other and then summing as implied by the summation convention.
EXERCISES
3.2.1 If $T_{\ldots i}$ is a tensor of rank n, show that $\partial T_{\ldots i}/\partial x_j$ is a tensor of rank n + 1 (cartesian
coordinates).
Note. In noncartesian coordinate systems the coefficients $a_{ij}$ are, in general,
functions of the coordinates, and the simple derivative of a tensor of rank n is not a
tensor except in the special case of n = 0. In this case the derivative does yield a
covariant vector (tensor of rank 1) by Eq. 3.11.
3.2.2 If $T_{ijk\ldots}$ is a tensor of rank n, show that $\sum_j \partial T_{ijk\ldots}/\partial x_j$ is a tensor of rank $n - 1$
(cartesian coordinates).
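3.2.3 The operator
$$\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}$$
may be written as
$$\sum_{i=1}^{4}\frac{\partial^2}{\partial x_i^2},$$
using $x_4 = ict$. This is the four-dimensional Laplacian, usually called the
d'Alembertian and denoted by $\Box^2$. Show that it is a scalar operator.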
3.3 QUOTIENT RULE
If $A_i$ and $B_j$ are vectors, as seen in Section 3.2, we can easily show that $A_iB_j$
is a second-rank tensor. Here we are concerned with a variety of inverse
relations. Consider such equations as
$$K_iA_i = B \tag{3.29a}$$
$$K_{ij}A_j = B_i \tag{3.29b}$$
$$K_{ij}A_{jk} = B_{ik} \tag{3.29c}$$
$$K_{ijkl}A_{ij} = B_{kl} \tag{3.29d}$$
$$K_{ij}A_k = B_{ijk} \tag{3.29e}$$
In each of these expressions A and B are known tensors of rank indicated by the
number of indices and A is arbitrary. In each case K is an unknown quantity. We
wish to establish the transformation properties of K. The quotient rule asserts
that if the equation of interest holds in all (rotated) cartesian coordinate
systems, K is a tensor of the indicated rank. The importance in physical theory is
that the quotient rule can establish the tensor nature of quantities. Exercise
3.3.1 is a simple illustration of this. The quotient rule (Eq. 3.29b) shows that the
inertia matrix appearing in the angular momentum equation $\mathbf{L} = \mathsf{I}\boldsymbol{\omega}$, Section
4.6, is a tensor. And Eq. 3.29d is quoted in Section 3.6 to establish the tensor
nature of the generalized Hooke's law "constant" $c_{ijkl}$.
In proving the quotient rule, we consider Eq. 3.29b as a typical case. In our
primed coordinate system
$$K'_{ij}A'_j = B'_i = a_{ik}B_k, \tag{3.30}$$
using the vector transformation properties of B. Since the equation holds in all
rotated cartesian coordinate systems,
$$a_{ik}B_k = a_{ik}(K_{kl}A_l). \tag{3.31}$$
Now, transforming A back into the primed coordinate system¹ (compare Eq.
3.9), we have
$$K'_{ij}A'_j = a_{ik}K_{kl}a_{jl}A'_j. \tag{3.32}$$
Rearranging, we obtain
$$(K'_{ij} - a_{ik}a_{jl}K_{kl})A'_j = 0. \tag{3.33}$$
This must hold for each value of the index i and for every primed coordinate
system. Since the $A'_j$ are arbitrary,² we conclude
$$K'_{ij} = a_{ik}a_{jl}K_{kl}, \tag{3.34}$$
which is our definition of second-rank tensor.
The other equations may be treated similarly, giving rise to other forms of
the quotient rule. One minor pitfall should be noted—the quotient rule does
not necessarily apply if B is zero. The transformation properties of zero are
indeterminate.
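The argument leading to Eq. 3.34 can be mimicked numerically. In the Python sketch below the primed components $K'_{ij}$ are found by probing: $A'$ is taken in turn as each unit vector of the primed system, the relation $B_k = K_{kl}A_l$ is evaluated in the unprimed system, and $B$ is transformed as a vector. The result is then compared with $a_{ik}a_{jl}K_{kl}$. The array `K`, the rotation angle, and all function names are assumptions for illustration.

```python
import math

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_vector(a, v):
    """v'_i = a_ik v_k."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def inverse_transform_vector(a, v_primed):
    """v_l = a_jl v'_j (note the order of the indices)."""
    return [sum(a[j][l] * v_primed[j] for j in range(3)) for l in range(3)]

def contract_with_vector(t, v):
    return [sum(t[k][l] * v[l] for l in range(3)) for k in range(3)]

def transform_tensor(a, t):
    return [[sum(a[i][k] * a[j][l] * t[k][l]
                 for k in range(3) for l in range(3))
             for j in range(3)]
            for i in range(3)]

def primed_K_by_probing(K, a):
    """Column j of K' is the B' obtained when A' is the unit vector along x'_j."""
    K_primed = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        A_primed = [1.0 if m == j else 0.0 for m in range(3)]
        A = inverse_transform_vector(a, A_primed)
        B_primed = transform_vector(a, contract_with_vector(K, A))
        for i in range(3):
            K_primed[i][j] = B_primed[i]
    return K_primed

K = [[1.0, 2.0, 0.0], [0.0, 3.0, -1.0], [4.0, 0.0, 2.0]]   # illustrative only
a = rotation_z(0.6)
probed = primed_K_by_probing(K, a)
expected = transform_tensor(a, K)
mismatch = max(abs(probed[i][j] - expected[i][j])
               for i in range(3) for j in range(3))
print(mismatch)
```

Choosing unit vectors for $A'$ is one convenient way of exploiting the arbitrariness of A; any three linearly independent choices would serve.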
EXERCISES
3.3.1 The double summation $K_{ij}A_iB_j$ is invariant for any two vectors $A_i$ and $B_j$. Prove
that $K_{ij}$ is a second-rank tensor.
Note. In the form $ds^2$ (invariant) $= g_{ij}\,dx_i\,dx_j$, this result shows that $g_{ij}$, the "metric,"
is a tensor.
3.3.2 The equation $K_{ij}A_{jk} = B_{ik}$ holds for all orientations of the coordinate system. If
A and B are second-rank tensors, show that K is a second-rank tensor also.
3.3.3 The exponential in a plane wave is $\exp[i(\mathbf{k}\cdot\mathbf{r} - \omega t)]$. We recognize $x_\mu = (x_1, x_2, x_3, ict)$
as a prototype vector in Minkowski space. If $\mathbf{k}\cdot\mathbf{r} - \omega t$ is a scalar under
Lorentz transformations (Section 3.7), show that $k_\mu = (k_1, k_2, k_3, i\omega/c)$ is a vector
in Minkowski space.
Note. Multiplication by ħ yields $(\mathbf{p}, iE/c)$ as a vector in Minkowski space.
1 Note carefully the order of the indices of the direction cosine $a_{jl}$ in this
inverse transformation. We have
$$A_l = \sum_j \frac{\partial x_l}{\partial x'_j}A'_j = \sum_j a_{jl}A'_j.$$
2 We might, for instance, take $A'_1 = 1$ and $A'_m = 0$ for $m \neq 1$. Then the
equation $K'_{i1} = a_{ik}a_{1l}K_{kl}$ follows immediately. The rest of Eq. 3.34 comes from
other special choices of the arbitrary $A'_j$.
3.4 PSEUDOTENSORS, DUAL TENSORS
So far our coordinate transformations have been restricted to pure rotations.
We now consider the effect of reflections or inversions. If we have
transformation coefficients $a_{ij} = -\delta_{ij}$, then by Eq. 3.7
$$x'_i = -x_i, \tag{3.35}$$
which is an inversion. Note carefully that this transformation changes our
initial right-handed coordinate system into a left-handed coordinate system.¹
Our prototype vector r with components $(x_1, x_2, x_3)$ transforms to
$\mathbf{r}' = (x'_1, x'_2, x'_3) = (-x_1, -x_2, -x_3)$. This new vector r' has negative components,
relative to the new transformed set of axes. As shown in Fig. 3.1, reversing the
directions of the coordinate axes and changing the signs of the components
gives r' = r. The vector (an arrow in space) stays exactly as it was before the
transformation was carried out. The position vector r and all other vectors
whose components behave this way (reversing sign with a reversal of the
coordinate axes) are called polar vectors.
FIG. 3.1 Inversion of cartesian coordinates—polar vector
A fundamental difference appears when we encounter a vector defined as the
cross product of two polar vectors. Let $\mathbf{C} = \mathbf{A}\times\mathbf{B}$, where both A and B are
polar vectors. From Eq. 1.33 of Section 1.4 the components of C are given by
$$C_1 = A_2B_3 - A_3B_2, \tag{3.36}$$
and so on. Now when the coordinate axes are inverted, $A_i \to -A'_i$, $B_j \to -B'_j$,
but from its definition $C_k \to +C'_k$; that is, our cross-product vector, vector C,
does not behave like a polar vector under inversion. To distinguish, we label it a
pseudovector or axial vector (see Fig. 3.2). The term axial vector is frequently
used because these cross products often arise from a description of rotation.
1 This is an inversion of the coordinate system or coordinate axes, objects in
the physical world remaining fixed.
FIG. 3.2 Inversion of cartesian coordinates—axial vector
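Examples are
angular velocity,             $\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$,
angular momentum,             $\mathbf{L} = \mathbf{r}\times\mathbf{p}$,
torque,                       $\mathbf{N} = \mathbf{r}\times\mathbf{f}$,
magnetic induction field B,   $\partial\mathbf{B}/\partial t = -\nabla\times\mathbf{E}$.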
In $\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$, the axial vector is the angular velocity $\boldsymbol{\omega}$, and r and $\mathbf{v} = d\mathbf{r}/dt$ are
polar vectors. Clearly, axial vectors occur frequently in elementary physics,
although this fact is usually not pointed out. In a right-handed coordinate
system an axial vector C has a sense of rotation associated with it given by a right-
hand rule (compare Section 1.4). In the inverted left-handed system the sense
of rotation is a left-handed rotation. This is indicated by the curved arrows in
Fig. 3.2.
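The different behavior of polar and axial vectors under inversion is easy to see in numbers. The following Python sketch uses two arbitrarily chosen polar vectors (the values and function names are assumptions) and compares the cross product formed from the inverted components with what a polar vector would give.

```python
def cross(p, q):
    """Components of p x q, as in Eq. 3.36 and its cyclic partners."""
    return [p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0]]

def invert(v):
    """Components of a polar vector after inverting the axes, a_ij = -delta_ij."""
    return [-x for x in v]

A = [1.0, 2.0, 3.0]           # illustrative polar vectors only
B = [0.0, -1.0, 4.0]

C = cross(A, B)
C_from_inverted_factors = cross(invert(A), invert(B))   # what C actually does
C_if_polar = invert(C)                                  # what a polar vector would do
print(C, C_from_inverted_factors, C_if_polar)
```

The two sign reversals of the factors cancel, so the components of C are unchanged, whereas a polar vector would have reversed sign.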
The distinction between polar and axial vectors may also be illustrated by a
reflection. A polar vector reflects in a mirror like a real physical arrow, Fig.
3.3a. In Figs. 3.1 and 3.2 the coordinates are inverted; the physical world
remains fixed. Here the coordinate axes remain fixed; the world is reflected, as
in a mirror in the xz-plane. Specifically, in this representation we keep the axes
fixed and associate a change of sign with the component of the vector. For a
mirror in the xz-plane, $P_y \to -P_y$. We have
$$\mathbf{P} = (P_x, P_y, P_z),$$
$$\mathbf{P}' = (P_x, -P_y, P_z), \quad \text{polar vector.}$$
An axial vector such as a magnetic field H or a magnetic moment $\boldsymbol{\mu}$
(= current × area of current loop) behaves quite differently under reflection.
Consider the magnetic field H and magnetic moment $\boldsymbol{\mu}$ to be produced by an
electric charge moving in a circular path (Exercise 5.8.4 and Example 12.5.1).
FIG. 3.3 a. Mirror in xz-plane; b. Mirror in xz-plane
Reflection reverses the sense of rotation of the charge. The two current loops
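and the resulting magnetic moments are shown in Fig. 3.3b. We have
$$\boldsymbol{\mu} = (\mu_x, \mu_y, \mu_z),$$
$$\boldsymbol{\mu}' = (-\mu_x, \mu_y, -\mu_z), \quad \text{reflected axial vector.}$$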
If we agree that the universe does not care whether we use a right- or left-
handed coordinate system, then it does not make sense to add an axial vector to
a polar vector. In the vector equation A = B, both A and B are either polar
vectors or axial vectors.² Similar restrictions apply to scalars and
pseudoscalars and, in general, to the tensors and pseudotensors considered
subsequently.
Usually, pseudoscalars, pseudovectors, and pseudotensors will transform as
$$\begin{aligned}
S' &= |a|\,S, \\
C'_i &= |a|\,a_{ij}C_j, \\
A'_{ij} &= |a|\,a_{ik}a_{jl}A_{kl},
\end{aligned} \tag{3.37}$$
where |a| is the determinant³ of the array of coefficients $a_{mn}$. In our inversion the
determinant is
$$|a| = \begin{vmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{vmatrix} = -1. \tag{3.38}$$
For a reflection of one axis, the x-axis,
$$|a| = \begin{vmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = -1, \tag{3.39}$$
and again the determinant $|a| = -1$. On the other hand, for all pure rotations
the determinant |a| is always +1. This is discussed further in Section 4.3. Often
quantities that transform according to Eq. 3.37 are known as tensor densities.
They are regular tensors as far as rotations are concerned, differing from tensors
only in reflections or inversions of the coordinates, and then the only difference
is the appearance of an additional minus sign from the determinant |a|.
In Chapter 1 the triple scalar product $S = \mathbf{A}\times\mathbf{B}\cdot\mathbf{C}$ was shown to be a scalar
(under rotations). Now by considering the transformation given by Eq. 3.35, we
see that $S \to -S$, proving that the triple scalar product is actually a
pseudoscalar. This behavior was foreshadowed by the geometrical analogy of a volume.
If all three parameters of the volume, length, depth, and height, change from
positive distances to negative distances, the product of the three will be negative.
2 The big exception to this is in beta decay, weak interactions. Here the
universe distinguishes between right- and left-handed systems, and we add
polar and axial vector interactions.
3 Determinants are described in Section 4.1.
Levi-Civita Symbol
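For future use it is convenient to introduce the three-dimensional
Levi-Civita symbol $\varepsilon_{ijk}$ defined by
$$\begin{aligned}
\varepsilon_{123} = \varepsilon_{231} = \varepsilon_{312} &= 1, \\
\varepsilon_{132} = \varepsilon_{213} = \varepsilon_{321} &= -1, \\
\text{all other } \varepsilon_{ijk} &= 0.
\end{aligned} \tag{3.40}$$
Note that $\varepsilon_{ijk}$ is totally antisymmetric with respect to all pairs of indices.
Suppose now that we have a third-rank pseudotensor $\delta_{ijk}$, which in one particular
coordinate system is equal to $\varepsilon_{ijk}$. Then
$$\delta'_{ijk} = |a|\,a_{ip}a_{jq}a_{kr}\varepsilon_{pqr} \tag{3.41}$$
by definition of pseudotensor. Now
$$a_{1p}a_{2q}a_{3r}\varepsilon_{pqr} = |a| \tag{3.42}$$
by direct expansion of the determinant, showing that $\delta'_{123} = |a|^2 = 1 = \varepsilon_{123}$.
Considering the other possibilities one by one, we find
$$\delta'_{ijk} = \varepsilon_{ijk} \tag{3.43}$$
for rotations and reflections. Hence $\varepsilon_{ijk}$ is a pseudotensor.⁴,⁵ Furthermore, it is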
seen to be an isotropic pseudotensor with the same components in all rotated
cartesian coordinate systems.
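Equations 3.41 to 3.43 can be verified by brute force. The Python sketch below builds $\varepsilon_{ijk}$ with zero-based indices (0, 1, 2 standing in for 1, 2, 3), evaluates the determinant through Eq. 3.42, and applies the pseudotensor law of Eq. 3.41 for one rotation and one reflection. The closed-form expression used in `levi_civita`, the sample angle, and the function names are assumptions made for illustration.

```python
import math
from itertools import product

def levi_civita(i, j, k):
    """epsilon_ijk for zero-based indices 0, 1, 2 standing in for 1, 2, 3."""
    return (i - j) * (j - k) * (k - i) // 2

def determinant(a):
    """|a| = a_1p a_2q a_3r epsilon_pqr (Eq. 3.42)."""
    return sum(levi_civita(p, q, r) * a[0][p] * a[1][q] * a[2][r]
               for p, q, r in product(range(3), repeat=3))

def transformed_epsilon(a, i, j, k):
    """Eq. 3.41: |a| a_ip a_jq a_kr epsilon_pqr."""
    return determinant(a) * sum(
        a[i][p] * a[j][q] * a[k][r] * levi_civita(p, q, r)
        for p, q, r in product(range(3), repeat=3))

def max_change(a):
    """Largest departure of the transformed components from epsilon_ijk."""
    return max(abs(transformed_epsilon(a, i, j, k) - levi_civita(i, j, k))
               for i, j, k in product(range(3), repeat=3))

c, s = math.cos(0.8), math.sin(0.8)
rotation = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
reflection = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(max_change(rotation), max_change(reflection))
```

Without the factor |a| the reflection would reverse the sign of every nonzero component, which is why the symbol is a pseudotensor rather than a tensor.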
Dual Tensors
With any antisymmetric second-rank tensor $C_{ij}$ (in three-dimensional space)
we may associate a dual pseudovector $C_i$ defined by
$$C_i = \tfrac{1}{2}\varepsilon_{ijk}C_{jk}. \tag{3.44}$$
Here the antisymmetric $C_{jk}$ may be written
$$C_{jk} = \begin{pmatrix} 0 & C_{12} & -C_{31} \\ -C_{12} & 0 & C_{23} \\ C_{31} & -C_{23} & 0 \end{pmatrix}. \tag{3.45}$$
We know that $C_i$ must transform as a vector under rotations from the double
contraction of the fifth-rank (pseudo) tensor $\varepsilon_{ijk}C_{mn}$ but that it is really a
4 The usefulness of $\varepsilon_{ijk}$ extends far beyond this section. For instance, the
matrices $M_k$ of Exercise 4.2.16 were derived from $(M_k)_{ij} = -i\varepsilon_{ijk}$. Much of
elementary vector analysis can be written in a very compact form by using
$\varepsilon_{ijk}$ and the identity of Exercise 3.4.4. See Evett, A. A. "Permutation Symbol
Approach to Elementary Vector Analysis." Am. J. Phys. 34, 503 (1966).
5 The numerical value of $\varepsilon_{ijk}$ is given by the triple scalar product of coordinate
unit vectors:
$$\varepsilon_{ijk} = \mathbf{e}_i\cdot\mathbf{e}_j\times\mathbf{e}_k.$$
From this point of view each element of $\varepsilon_{ijk}$ is a pseudoscalar, but the $\varepsilon_{ijk}$
collectively form a third-rank pseudotensor.
pseudovector from the pseudo nature of $\varepsilon_{ijk}$. Specifically, the components of
C are given by
$$(C_1, C_2, C_3) = (C_{23}, C_{31}, C_{12}). \tag{3.46}$$
Notice the cyclic order of the indices that comes from the cyclic order of the
components of $\varepsilon_{ijk}$. This duality, given by Eq. 3.46, means that our three-
dimensional vector product may literally be taken to be either a pseudovector
or an antisymmetric second-rank tensor, depending on how we chose to write
it out.
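The duality can be tried out on the antisymmetric array built from two vectors, $C_{jk} = A_jB_k - A_kB_j$, whose dual by Eq. 3.44 should be the ordinary vector product. In this Python sketch the sample vectors and function names are assumptions; indices are zero-based, so `C[1][2]` plays the role of $C_{23}$.

```python
def levi_civita(i, j, k):
    """epsilon_ijk for zero-based indices 0, 1, 2 standing in for 1, 2, 3."""
    return (i - j) * (j - k) * (k - i) // 2

def dual_vector(C):
    """Eq. 3.44: C_i = (1/2) epsilon_ijk C_jk."""
    return [0.5 * sum(levi_civita(i, j, k) * C[j][k]
                      for j in range(3) for k in range(3))
            for i in range(3)]

A = [1.0, 2.0, 3.0]           # illustrative vectors only
B = [0.0, -1.0, 4.0]

# Antisymmetric second-rank array C_jk = A_j B_k - A_k B_j.
C = [[A[j] * B[k] - A[k] * B[j] for k in range(3)] for j in range(3)]

dual = dual_vector(C)
cyclic_components = [C[1][2], C[2][0], C[0][1]]   # (C_23, C_31, C_12) of Eq. 3.46
print(dual, cyclic_components)
```

Both lists reproduce the components of A × B, so the same information is carried either by the three components of the pseudovector or by the three independent components of the antisymmetric array.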
If we take three (polar) vectors A, B, and C, we may define
$$V_{ijk} = \begin{vmatrix} A_i & B_i & C_i \\ A_j & B_j & C_j \\ A_k & B_k & C_k \end{vmatrix} = A_iB_jC_k - A_iB_kC_j + \cdots. \tag{3.47}$$
By an extension of the analysis of Section 3.1 each term $A_pB_qC_r$ is seen to be a
third-rank tensor, making $V_{ijk}$ a tensor of third rank. From its definition as a
determinant $V_{ijk}$ is totally antisymmetric, reversing sign under the interchange
of any two indices, that is, the interchange of any two rows of the determinant.
The dual quantity is
$$V = \frac{1}{3!}\varepsilon_{ijk}V_{ijk}, \tag{3.48}$$
clearly a pseudoscalar. By expansion it is seen that
$$V = \begin{vmatrix} A_1 & B_1 & C_1 \\ A_2 & B_2 & C_2 \\ A_3 & B_3 & C_3 \end{vmatrix}, \tag{3.49}$$
our familiar triple scalar product.
For use in writing of Maxwell's equations in covariant form, Section 3.7, we
want to extend this dual vector analysis to four-dimensional space and, in
particular, to indicate that the four-dimensional volume element, $dx_1\,dx_2\,dx_3\,dx_4$,
is a pseudoscalar.
We introduce the Levi-Civita symbol $\varepsilon_{ijkl}$, the four-dimensional analog of
$\varepsilon_{ijk}$. This quantity $\varepsilon_{ijkl}$ is defined as totally antisymmetric in all four indices. If
(ijkl) is an even permutation⁶ of (1, 2, 3, 4), then $\varepsilon_{ijkl}$ is defined as +1; if it is an
odd permutation, then $\varepsilon_{ijkl}$ is $-1$. The Levi-Civita $\varepsilon_{ijkl}$ may be proved a
pseudotensor of rank 4 by analysis similar to that used for establishing the nature of
$\varepsilon_{ijk}$. Introducing a fourth-rank tensor,
6 A permutation is odd if it involves an odd number of interchanges of adjacent
indices such as (1 2 3 4) → (1 3 2 4). Even permutations arise from an even
number of transpositions of adjacent indices. (Actually the word "adjacent"
is not necessary.)
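$$H_{ijkl} = \begin{vmatrix} A_i & B_i & C_i & D_i \\ A_j & B_j & C_j & D_j \\ A_k & B_k & C_k & D_k \\ A_l & B_l & C_l & D_l \end{vmatrix}, \tag{3.50}$$
built from the polar vectors A, B, C, and D, we may define the dual quantity
$$H = \frac{1}{4!}\varepsilon_{ijkl}H_{ijkl}. \tag{3.51}$$
We actually have a quadruple contraction which reduces the rank to zero. From
the pseudo nature of $\varepsilon_{ijkl}$, H is a pseudoscalar. Now we let A, B, C, and D be
infinitesimal displacements along the four coordinate axes (Minkowski space),
$$\mathbf{A} = (dx_1, 0, 0, 0), \quad \mathbf{B} = (0, dx_2, 0, 0), \quad \text{and so on,} \tag{3.52}$$
and
$$H = dx_1\,dx_2\,dx_3\,dx_4. \tag{3.53}$$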
The four-dimensional volume element is now identified as a pseudoscalar. We
use this result in Section 3.7. This result could have been expected from the
results of the special theory of relativity. The Lorentz-Fitzgerald contraction
of $dx_1\,dx_2\,dx_3$ just balances the time dilation of $dx_4$.
We slipped into this four-dimensional space as a simple mathematical
extension of the three-dimensional space and, indeed, we could just as easily have
discussed 5-, 6-, or N-dimensional space. This is typical of the power of the
component analysis. Physically, this four-dimensional space is usually taken as
Minkowski space,
$$(x_1, x_2, x_3, x_4) = (x, y, z, ict), \tag{3.54}$$
where t is time. This is the merger of space and time achieved in special
relativity. The transformations that describe the rotations in four-dimensional
space are the Lorentz transformations of special relativity. We encounter these
Lorentz transformations in Sections 3.7 and 4.13.
Irreducible Tensors
For some applications, particularly in the quantum theory of angular
momentum, our cartesian tensors are not particularly convenient. In mathematical
language our general second-rank tensor $A_{ij}$ is reducible, which means that it
can be decomposed into parts of lower tensor rank. In fact, we have already
done this. From Eq. 3.25
$$A = A_{ii} \tag{3.55}$$
is a scalar quantity, the trace of $A_{ij}$.
7 An alternate approach, using matrices, is given in Section 4.3 (see Exercise
4.3.9).
The antisymmetric portion
$$B_{ij} = \tfrac{1}{2}(A_{ij} - A_{ji}) \tag{3.56}$$
has just been shown to be equivalent to a (pseudo) vector, or
$$B_{ij} = C_k, \quad \text{cyclic permutation of } i, j, k. \tag{3.57}$$
By subtracting the scalar A and the vector $C_k$ from our original tensor, we have
an irreducible, symmetric, zero-trace second-rank tensor, $S_{ij}$, in which
$$S_{ij} = \tfrac{1}{2}(A_{ij} + A_{ji}) - \tfrac{1}{3}A\,\delta_{ij}, \tag{3.58}$$
with five independent components. Then, finally, our original cartesian tensor
may be written
$$A_{ij} = \tfrac{1}{3}A\,\delta_{ij} + C_k + S_{ij}. \tag{3.59}$$
The three quantities A, $C_k$, and $S_{ij}$ form spherical tensors of rank 0, 1, and 2,
respectively, transforming like the spherical harmonics $Y_L^M$ (Chapter 12) for
L = 0, 1, and 2. Further details of such spherical tensors and their uses will be
found in the book by Rose, cited in Chapter 12.
A specific example of the preceding reduction is furnished by the symmetric
electric quadrupole tensor
$$Q_{ij} = \int (3x_ix_j - r^2\delta_{ij})\,\rho(x_1, x_2, x_3)\,d^3x.$$
The $-r^2\delta_{ij}$ term represents a subtraction of the scalar trace (the three i = j
terms). The resulting $Q_{ij}$ has zero trace. The strain tensor of Section 3.6 is
another example of this reduction (See Exercise 3.6.7).
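The reduction of Eqs. 3.55 to 3.59 can be carried out explicitly for a sample array. In the Python sketch below the array `A` and the function name `decompose` are assumptions for illustration; the antisymmetric part is kept as the array $B_{ij}$ of Eq. 3.56 rather than rewritten as the vector $C_k$.

```python
def decompose(A):
    """Split a 3 x 3 array into trace, antisymmetric, and symmetric zero-trace parts."""
    trace = sum(A[i][i] for i in range(3))                           # Eq. 3.55
    antisym = [[0.5 * (A[i][j] - A[j][i]) for j in range(3)]         # Eq. 3.56
               for i in range(3)]
    sym_traceless = [[0.5 * (A[i][j] + A[j][i])                      # Eq. 3.58
                      - (trace / 3.0 if i == j else 0.0)
                      for j in range(3)]
                     for i in range(3)]
    return trace, antisym, sym_traceless

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]        # illustrative values only

trace, antisym, sym_traceless = decompose(A)

# Reassemble as in Eq. 3.59: (A/3) delta_ij + antisymmetric part + S_ij.
rebuilt = [[(trace / 3.0 if i == j else 0.0) + antisym[i][j] + sym_traceless[i][j]
            for j in range(3)]
           for i in range(3)]
residual_trace = sum(sym_traceless[i][i] for i in range(3))
print(trace, residual_trace)
```

The count of independent components is 1 + 3 + 5 = 9: one for the trace, three for the antisymmetric part, and five for the symmetric zero-trace part.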
EXERCISES
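3.4.1 An antisymmetric square array is given by
$$\begin{pmatrix} 0 & C_3 & -C_2 \\ -C_3 & 0 & C_1 \\ C_2 & -C_1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & C_{12} & C_{13} \\ -C_{12} & 0 & C_{23} \\ -C_{13} & -C_{23} & 0 \end{pmatrix},$$
where $(C_1, C_2, C_3)$ form a pseudovector. Assuming that the relation
$$C_i = \frac{1}{2!}\varepsilon_{ijk}C_{jk}$$
holds in all coordinate systems, prove that $C_{jk}$ is a tensor. (This is another form
of the quotient theorem.)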
3.4.2 Show that the vector product is unique to three-dimensional space, that is, only
in three dimensions can we establish a one-to-one correspondence between the
components of an antisymmetric tensor (second-rank) and the components of a
vector.
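3.4.3 Show that
(a) $\delta_{ii} = 3$,
(b) $\delta_{ij}\varepsilon_{ijk} = 0$,
(c) $\varepsilon_{ipq}\varepsilon_{jpq} = 2\delta_{ij}$,
(d) $\varepsilon_{ijk}\varepsilon_{ijk} = 6$.
3.4.4 Show that
$$\varepsilon_{ijk}\varepsilon_{pqk} = \delta_{ip}\delta_{jq} - \delta_{iq}\delta_{jp}.$$
3.4.5 (a) Express the components of a cross-product vector C, $\mathbf{C} = \mathbf{A}\times\mathbf{B}$, in terms
of $\varepsilon_{ijk}$ and the components of A and B.
(b) Use the antisymmetry of $\varepsilon_{ijk}$ to show that $\mathbf{A}\cdot\mathbf{A}\times\mathbf{B} = 0$.
ANS. (a) $C_i = \varepsilon_{ijk}A_jB_k$.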
3.4.6 (a) Show that the inertia tensor (matrix) of Section 4.6 may be written
hi = т(хях„ди - xtXj)
for a particle of mass m at (xl,x2,x3).
(b) Show that
/..= -Af,,Afy= -meilkxkelJmxm,
where Ma — m1/2eilkxk. This is the contraction of two second-rank tensors
and is identical with the matrix product of Section 4.2.
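3.4.7 Write $\nabla\cdot\nabla\times\mathbf{A}$ and $\nabla\times\nabla\varphi$ in $\varepsilon_{ijk}$ notation, so that it becomes obvious that each
expression vanishes.
ANS. $\nabla\cdot\nabla\times\mathbf{A} = \varepsilon_{ijk}\dfrac{\partial}{\partial x_i}\dfrac{\partial}{\partial x_j}A_k$,
$(\nabla\times\nabla\varphi)_i = \varepsilon_{ijk}\dfrac{\partial}{\partial x_j}\dfrac{\partial}{\partial x_k}\varphi$.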
3.4.8 Expressing cross products in terms of Levi-Civita symbols ($\varepsilon_{ijk}$), derive the
BAC-CAB rule, Eq. 1.50.
Hint. The relation of Exercise 3.4.4 is helpful.
3.4.9 Verify that each of the following fourth-rank tensors is isotropic, that is, it has
the same form independent of any rotation of the coordinate systems.
(a) $A_{ijkl} = \delta_{ij}\delta_{kl}$,
(b) $B_{ijkl} = \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}$,
(c) $C_{ijkl} = \delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}$.
3.4.10 Show that the two-index Levi-Civita symbol $\varepsilon_{ij}$ is a second-rank pseudotensor (in
two-dimensional space). Does this contradict the uniqueness of $\delta_{ij}$ (Exercise
3.1.4)?
3.4.11 (a) Represent $\varepsilon_{ij}$ by a 2 x 2 matrix, and using the 2 x 2 rotation matrix of
Section 4.3, show that $\varepsilon_{ij}$ is invariant under orthogonal similarity
transformations.
(b) Demonstrate the pseudo nature of $\varepsilon_{ij}$ by using $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ as the transforming
matrix.
3.4.12 Given $A_k = \tfrac{1}{2}\varepsilon_{ijk}B_{ij}$ with $B_{ij} = -B_{ji}$, antisymmetric, show that
$$B_{mn} = \varepsilon_{mnk}A_k.$$
3.4.13 Show that the vector identity
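$$(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{C}\times\mathbf{D}) = (\mathbf{A}\cdot\mathbf{C})(\mathbf{B}\cdot\mathbf{D}) - (\mathbf{A}\cdot\mathbf{D})(\mathbf{B}\cdot\mathbf{C})$$
(Exercise 1.5.12) follows directly from the description of a cross product with
$\varepsilon_{ijk}$ and the identity of Exercise 3.4.4.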
3.5 DYADICS
Occasionally, particularly in the older literature and older textbooks, the
reader will see references to dyads or dyadics. The dyadic is a somewhat clumsy
device for extending ordinary vector analysis to cover tensors of second rank.
If we adjoin two vectors i and j to form the combination ij, we have a dyad.
Multiplication (scalar or vector) from the left involves the left-hand member of
the pair and leaves the right-hand member strictly alone:
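$$\mathbf{A}\cdot\mathbf{i}\mathbf{j} = (\mathbf{i}A_x + \mathbf{j}A_y + \mathbf{k}A_z)\cdot\mathbf{i}\mathbf{j} = A_x\mathbf{j}. \tag{3.60}$$
Multiplication from the right is just the reverse; that is,
$$\mathbf{i}\mathbf{j}\cdot\mathbf{A} = \mathbf{i}[\mathbf{j}\cdot(\mathbf{i}A_x + \mathbf{j}A_y + \mathbf{k}A_z)] = \mathbf{i}A_y. \tag{3.61}$$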
From this we see that, in general, the operation of multiplication is non-
commutative. It must be emphasized strongly that the i and j of the dyad ij are
not operating on each other. If they had scalar coefficients, these would be
multiplied together, but as far as the unit vectors are concerned there is no dot
or cross product involved; they are just sitting there. As just shown, the order
is significant: $\mathbf{i}\mathbf{j} \neq \mathbf{j}\mathbf{i}$. We thus have a composite quantity that depends in part on
the ordering. This dependence on ordering will reappear when we study matrices
(Chapter 4) and complex quantities (Chapter 6), the complex number being
literally an ordered pair of real numbers.
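Extending this construction, we adjoin two vectors A and B to form
$$\begin{aligned}
\mathsf{T} = \mathbf{A}\mathbf{B} &= (\mathbf{i}A_x + \mathbf{j}A_y + \mathbf{k}A_z)(\mathbf{i}B_x + \mathbf{j}B_y + \mathbf{k}B_z) \\
&= \mathbf{i}\mathbf{i}A_xB_x + \mathbf{i}\mathbf{j}A_xB_y + \mathbf{i}\mathbf{k}A_xB_z \\
&\quad + \mathbf{j}\mathbf{i}A_yB_x + \mathbf{j}\mathbf{j}A_yB_y + \mathbf{j}\mathbf{k}A_yB_z \\
&\quad + \mathbf{k}\mathbf{i}A_zB_x + \mathbf{k}\mathbf{j}A_zB_y + \mathbf{k}\mathbf{k}A_zB_z.
\end{aligned} \tag{3.62}$$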
The quantity T = AB is a dyadic formed as shown from a combination of dyads.
We have proved (Section 3.2) that this product of two vectors AB is a tensor of
second rank. Hence, dyadics are tensors of second rank, written in a form that
preserves the vector nature but obscures the tensor transformation properties.
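In component language the dyadic AB is simply the outer product $T_{ij} = A_iB_j$. The short Python sketch below, with hypothetical component values and helper names of our own choosing (`dyadic`, `dot_left`, `dot_right`), illustrates that multiplication from the left picks out the left-hand vector and multiplication from the right picks out the right-hand vector, so the two results generally differ.

```python
def dyadic(A, B):
    """Components T_ij = A_i B_j of the dyadic AB."""
    return [[a * b for b in B] for a in A]


def dot_left(a, T):
    """a . T, a vector with components a_i T_ij."""
    return [sum(a[i] * T[i][j] for i in range(3)) for j in range(3)]


def dot_right(T, a):
    """T . a, a vector with components T_ij a_j."""
    return [sum(T[i][j] * a[j] for j in range(3)) for i in range(3)]


# Hypothetical vectors, chosen only for illustration
A = [1.0, 2.0, 3.0]
B = [4.0, 5.0, 6.0]
T = dyadic(A, B)

i_hat = [1.0, 0.0, 0.0]
left = dot_left(i_hat, T)    # i . AB = A_x B
right = dot_right(T, i_hat)  # AB . i = A B_x
```

Here `left` is $A_x\mathbf{B}$ and `right` is $\mathbf{A}B_x$; they agree only when A and B are parallel.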
It has already been noted that the multiplication of a vector and a dyadic is
not commutative, but there is an important special case in which the operation
is commutative. We take the dyadic AB and set
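$$\mathbf{a}\cdot\mathbf{A}\mathbf{B} = \mathbf{A}\mathbf{B}\cdot\mathbf{a}, \tag{3.63}$$
where a is an arbitrary vector. If a = i, then
$$A_x\mathbf{B} = \mathbf{A}B_x, \tag{3.64}$$
or
$$\mathbf{i}A_xB_x + \mathbf{j}A_xB_y + \mathbf{k}A_xB_z = \mathbf{i}A_xB_x + \mathbf{j}A_yB_x + \mathbf{k}A_zB_x.$$
By equating components, we obtain
$$A_xB_x = A_xB_x, \qquad A_xB_y = A_yB_x, \qquad A_xB_z = A_zB_x, \tag{3.65}$$
showing that $\mathbf{A} = c\mathbf{B}$, in which $c$ is a constant. In other words, if multiplication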
with an arbitrary vector is commutative, the dyadic must be symmetric, and the
coefficient of dyad pq equals the coefficient of dyad qp. Conversely, if the dyadic
is symmetric, multiplication is commutative.
One of the most significant properties of a symmetric dyadic is that it can
always be put in normal or diagonal form by proper choice of the coordinate
axes:
$$\mathsf{T} = \mathbf{i}\mathbf{i}T_{xx} + \mathbf{j}\mathbf{j}T_{yy} + \mathbf{k}\mathbf{k}T_{zz}, \tag{3.66}$$
all the nondiagonal coefficients going to zero. The coordinate transformation
that puts our dyadic in this diagonal form is known as the principal axis
transformation. It is discussed at some length in Section 4.6.
There is an interesting and useful geometric interpretation of a symmetric
dyadic. For simplicity let us assume that our symmetric dyadic T is already in its
diagonal form. Then with r, the usual distance vector, we form the equation
$$\mathbf{r}\cdot\mathsf{T}\cdot\mathbf{r} = 1, \tag{3.67}$$
which limits the length of r according to its orientation. By expanding Eq. 3.67,
we have
$$(\mathbf{i}x + \mathbf{j}y + \mathbf{k}z)\cdot(\mathbf{i}\mathbf{i}T_{xx} + \mathbf{j}\mathbf{j}T_{yy} + \mathbf{k}\mathbf{k}T_{zz})\cdot(\mathbf{i}x + \mathbf{j}y + \mathbf{k}z) = 1$$
or
$$x^2T_{xx} + y^2T_{yy} + z^2T_{zz} = 1. \tag{3.68}$$
If $T_{xx} > 0$, $T_{yy} > 0$, and $T_{zz} > 0$, then Eq. 3.68 defines an ellipsoid with semiaxes
$a$, $b$, and $c$ given by
$$a = T_{xx}^{-1/2}, \qquad b = T_{yy}^{-1/2}, \qquad c = T_{zz}^{-1/2}. \tag{3.69}$$
For the inertia tensor of Section 4.6 these diagonal elements are clearly positive
from their definition (Eq. 4.139). Diagonalizing our dyadic corresponded to
orienting the dyadic ellipsoid so that the ellipsoid axes were lined up with the
coordinate axes.
If U is an antisymmetric dyadic,
$$U_{xx} = 0, \text{ and so on}, \qquad U_{xy} = -U_{yx}, \text{ and so on},$$
then for any vector a
$$\mathbf{a}\cdot\mathsf{U} = -\mathsf{U}\cdot\mathbf{a}. \tag{3.70}$$
Multiplication of a vector and an antisymmetric dyadic follows an
anticommutation rule (see Exercise 3.5.4a).
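The commutation and anticommutation rules stated above for symmetric and antisymmetric dyadics can be illustrated numerically. In this minimal Python sketch the symmetric dyadic is built as AB + BA and the antisymmetric one as AB - BA; the vectors and helper names are hypothetical choices made only for the illustration.

```python
def dyadic(A, B):
    """Components A_i B_j of the dyadic AB."""
    return [[a * b for b in B] for a in A]


def combine(S, T, sign):
    """Componentwise S + sign * T."""
    return [[S[i][j] + sign * T[i][j] for j in range(3)] for i in range(3)]


def dot_left(a, T):
    """a . T, with components a_i T_ij."""
    return [sum(a[i] * T[i][j] for i in range(3)) for j in range(3)]


def dot_right(T, a):
    """T . a, with components T_ij a_j."""
    return [sum(T[i][j] * a[j] for j in range(3)) for i in range(3)]


# Hypothetical vectors, chosen only for illustration
A = [1.0, 2.0, 3.0]
B = [-2.0, 0.5, 4.0]
a = [0.3, -1.2, 2.0]

S = combine(dyadic(A, B), dyadic(B, A), +1.0)  # symmetric dyadic
U = combine(dyadic(A, B), dyadic(B, A), -1.0)  # antisymmetric dyadic

sym_left, sym_right = dot_left(a, S), dot_right(S, a)
anti_left, anti_right = dot_left(a, U), dot_right(U, a)
```

For the symmetric dyadic the two products coincide, while for the antisymmetric dyadic they differ only in sign, as in Eq. 3.70.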
Dyadics are rather awkward to handle in comparison with the usual tensor
analysis (once the concept of transformation under coordinate rotation has been
absorbed). They are quite unwieldy for representing third- or higher-rank
tensors, so we shall return to tensor analysis and have nothing further to do with
dyadic notation.
EXERCISES
3.5.1 If A and B transform as vectors, Eqs. 3.6 and 3.8, show that the dyadic AB
satisfies the tensor transformation law, Eq. 3.13.
3.5.2 Show that $\mathsf{I} = \mathbf{i}\mathbf{i} + \mathbf{j}\mathbf{j} + \mathbf{k}\mathbf{k}$ is a unit dyadic in the sense that for any vector V
$$\mathsf{I}\cdot\mathbf{V} = \mathbf{V}.$$
The individual dyads ii, and so on are specific examples of the projection
operators of quantum mechanics.
3.5.3 Show that $\nabla\mathbf{r}$ is equal to the unit dyadic, I.
3.5.4 If U is an antisymmetric dyadic and V a vector, show that
(a) $\mathbf{V}\cdot\mathsf{U} = -\mathsf{U}\cdot\mathbf{V}$,
(b) $\mathbf{V}\cdot\mathsf{U}\cdot\mathbf{V} = 0$.
3.5.5 The two-dimensional vectors $\mathbf{r} = \mathbf{i}x + \mathbf{j}y$ and $\mathbf{t} = -\mathbf{i}y + \mathbf{j}x$ may be related by the
tensor equation $\mathbf{r}\cdot\mathsf{U} = \mathbf{t}$.
(a) Find the tensor U, using our earlier component description of tensors.
(b) Find U and treat it as a dyadic.
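3.5.6 In an investigation of the interaction of molecules a dyadic is formed from the
unit relative distance vectors $\mathbf{e}_{12}$ given by
$$\mathbf{e}_{12} = \frac{\mathbf{r}_2 - \mathbf{r}_1}{|\mathbf{r}_2 - \mathbf{r}_1|}.$$
For
$$\mathsf{U} = \mathsf{1} - 3\mathbf{e}_{12}\mathbf{e}_{12},$$
show that trace U = 0. Here 1 is the unit dyadic; that is,
$\mathsf{1} = \mathbf{i}\mathbf{i} + \mathbf{j}\mathbf{j} + \mathbf{k}\mathbf{k}$.
3.5.7 Show that Gauss's theorem holds for dyadics, that
$$\oint d\boldsymbol{\sigma}\cdot\mathsf{D} = \int \nabla\cdot\mathsf{D}\,d\tau.$$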
3.5.8 Show that
The function E is a vector function of position. The integration is over a simple
closed surface. This improbable combination of surface integrals actually appears
in the vector Kirchhoff diffraction theory.
3.5.9 Show that the following zero-trace, symmetric, unit tensors
$$\mathsf{t}^{0} = (2\mathbf{k}\mathbf{k} - \mathbf{i}\mathbf{i} - \mathbf{j}\mathbf{j})/\sqrt{6}$$
satisfy the double contraction relation
$$\mathsf{t}^{m*} : \mathsf{t}^{n} = \delta_{mn}.$$
These unit tensors are used in defining tensor spherical harmonics, an extension
of the vector spherical harmonics of Section 12.11. The tensor spherical
harmonics, in turn, are helpful in describing gravitational waves.
3.5.10 A mass $m$ is being pulled slowly (zero acceleration) up an incline (angle $\theta$ from
horizontal). The usual coefficient of friction is $\mu$: $f = \mu N$, where $N$ is the normal
force. Define a coordinate system and rewrite this scalar friction equation as a
vector equation replacing the scalar $\mu$ by a dyad.
3.5.11 The combination of two vectors AB forms a dyadic but not every dyadic can be
resolved into two vectors. Show that the two-dimensional dyadic
A = ij - ji
cannot be resolved into two two-dimensional vectors.
3.6 THEORY OF ELASTICITY
When an elastic body is subjected to an external force or stress, it becomes
deformed or strained. Our study of elasticity in terms of tensors falls naturally
into three parts: (1) a description of the strain or deformation of the elastic
substance, (2) a description of the force or stress that produces the deformation,
and (3) a generalized Hooke's law in tensor form, relating stress and strain.
Elastic Strain: Deformation
The deformation of our elastic body may be described by giving the change
in relative position of the parts of the body when the body is subjected to some
external stress (Fig. 3.4). Consider a point $P_0$ at position r relative to some
fixed origin and a second point $Q_0$ displaced from $P_0$ by a distance $\delta\mathbf{x}$. In the
unstrained state the coordinates of $Q_0$ relative to $P_0$ are $\delta x_i$; in the strained
state, when $P_0$ has been displaced a distance u to point $P_1$ and $Q_0$ a distance v
to $Q_1$, the coordinates of $Q_1$ relative to $P_1$ are $\delta y_i = \delta x_i + \delta u_i$. The change in
position of $Q$ relative to $P$ is just $\delta u_i$. Neglecting second- and higher-order
differentials,1 a three-dimensional Taylor expansion (Eq. 5.109) yields
$$\delta\mathbf{u} = \mathbf{u}(\mathbf{r} + \delta\mathbf{x}) - \mathbf{u}(\mathbf{r}) = (\delta\mathbf{x}\cdot\nabla)\mathbf{u}, \qquad \delta u_i = \frac{\partial u_i}{\partial x_k}\,\delta x_k. \tag{3.71}$$
FIG. 3.4 Elastic strain
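Since $u_i$ is the component of a vector, $\partial u_i/\partial x_k$ is an element of a second-rank
tensor, $\nabla\mathbf{u}$. Resolving this tensor into symmetric and antisymmetric terms, we
have
$$\frac{\partial u_i}{\partial x_k} = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_k} + \frac{\partial u_k}{\partial x_i}\right) + \frac{1}{2}\left(\frac{\partial u_i}{\partial x_k} - \frac{\partial u_k}{\partial x_i}\right) = \eta_{ik} + \xi_{ik}. \tag{3.72}$$
The antisymmetric part, $\xi_{ik}$, may be identified as a pure rotation (and not a
deformation at all). From Section 3.4 we may associate an axial vector $\boldsymbol{\xi}$ with $\xi_{ik}$:
$$\boldsymbol{\xi} = \tfrac{1}{2}\nabla\times\mathbf{u}. \tag{3.73}$$
The displacement $\delta\mathbf{u}$ corresponding to the antisymmetric part, $\xi_{ik}\,\delta x_k$, becomes
$$\delta\mathbf{u} = \tfrac{1}{2}(\nabla\times\mathbf{u})\times\delta\mathbf{x}. \tag{3.74}$$
This is a rotation about an instantaneous axis through $P_0$ in the direction
$(\nabla\times\mathbf{u})$ through $\tfrac{1}{2}|\nabla\times\mathbf{u}|$ radians. Equation 3.74 is the time integral of
$\mathbf{v}(t) = \boldsymbol{\omega}(t)\times\mathbf{r}$.
1 The limitation to first-order terms is a rather severe limitation implying
relative strains of no more than perhaps 1 percent in actual application.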
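The remaining symmetric part of our tensor, $\eta_{ik}$, is taken as a pure strain
tensor. The diagonal elements ($\eta_{11}$, $\eta_{22}$, $\eta_{33}$) of $\eta_{ik}$ represent stretches, whereas
the nondiagonal elements represent shear strains.2
This may be seen by considering $Q_0$ to be displaced from $P_0$ along the x-axis;
$\delta\mathbf{x} = \mathbf{i}\,\delta x_1$. From Eq. 3.72
$$\delta u_1 = \eta_{11}\,\delta x_1, \qquad \delta u_2 = \eta_{21}\,\delta x_1, \qquad \delta u_3 = \eta_{31}\,\delta x_1. \tag{3.75}$$
Hence the displacement in the strained case is
$$\begin{aligned}
\delta y_1 &= \delta x_1 + \delta u_1 = (1 + \eta_{11})\,\delta x_1, \\
\delta y_2 &= \delta u_2 = \eta_{21}\,\delta x_1, \\
\delta y_3 &= \delta u_3 = \eta_{31}\,\delta x_1.
\end{aligned} \tag{3.76}$$
For our initial displacement, $\delta\mathbf{x} = \mathbf{i}\,\delta x_1$, the diagonal term $\eta_{11}$ contributes to
the 1 component of $\delta\mathbf{y}$ (stretching) and $\eta_{21}$ and $\eta_{31}$ contribute to $\delta y_2$ and $\delta y_3$,
respectively, representing shears.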
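The split of the displacement gradient into strain and rotation can be made concrete with a small numerical sketch. The Python below assumes a uniform, hypothetical displacement gradient `G[i][k]` standing for $\partial u_i/\partial x_k$; all numerical values and helper names are illustrative only. It forms the symmetric and antisymmetric parts of Eq. 3.72, checks that the antisymmetric part acts on $\delta\mathbf{x}$ exactly as the cross product of Eq. 3.74, and evaluates Eq. 3.76 for a displacement along the x-axis.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


R = range(3)

# Hypothetical displacement gradient G[i][k] = du_i/dx_k (small, dimensionless)
G = [[0.010, 0.004, 0.000],
     [-0.002, 0.005, 0.003],
     [0.001, -0.001, -0.002]]

# Eq. 3.72: symmetric (strain) and antisymmetric (rotation) parts
eta = [[0.5 * (G[i][k] + G[k][i]) for k in R] for i in R]
xi = [[0.5 * (G[i][k] - G[k][i]) for k in R] for i in R]

# (curl u)_i = eps_ijk du_k/dx_j
curl_u = [sum(eps(i, j, k) * G[k][j] for j in R for k in R) for i in R]

# Compare xi_ik dx_k with (1/2)(curl u) x dx, Eq. 3.74
dx = [1.0e-3, 2.0e-3, -1.0e-3]
du_rot = [sum(xi[i][k] * dx[k] for k in R) for i in R]
du_cross = [0.5 * sum(eps(i, j, k) * curl_u[j] * dx[k] for j in R for k in R)
            for i in R]

# Eq. 3.76 for a displacement of length dx1 along the x-axis
dx1 = 1.0e-3
dy = [(1.0 + eta[0][0]) * dx1, eta[1][0] * dx1, eta[2][0] * dx1]
```

The first component of `dy` shows the stretch, the other two the shear contributions.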
Stress-force
The stresses or forces must be defined carefully. Referring to Fig. 3.5, which
shows a differential volume, we see that the force in the $x_i$-direction acting on
the surface $dA$ whose normal is in the $x_j$-direction is $P_{ij}\,dA$. The $P_{ij}$'s themselves
are actually "pressures" in the sense of being force/area. Whenever the terms
stress or force are used, it is understood that the $P_{ij}$ are to be multiplied by the
appropriate differential area. These are the forces acting on the small
parallelepiped of Fig. 3.5. For clarity only the forces on the front three faces are shown.
If we assume that the stresses are homogeneous, the forces on the opposite
faces will be reversed in sign as shown in Fig. 3.6. Note that $P_{21}$ is the shear
(in the $x_2$-direction) applied to face B. For a homogeneous force face A must
apply the same shearing stress $P_{21}$ to the outside medium. The stress applied to
face A by the outside medium is just the reverse, or $P_{21}$ directed downward
(in the $-x_2$-direction). Built into this argument are three assumptions that
should be noted explicitly.
1. Homogeneous stress throughout the body.
2. Existence of static equilibrium.
3. Absence of body forces (such as gravity acting on
the mass within the parallelepiped) and body torques
(external magnetic field acting on magnetic domains).
These assumptions permit placement of a further restriction on the $P_{ij}$'s.
Consider the net torque on the parallelepiped shown in Figs. 3.5 and 3.7 about
the $x_3$-axis. The normal pressures $P_{ii}$ exert no net torque. The shearing stresses
$P_{31}$ and $P_{32}$ have zero moment arm. The shearing stresses $P_{13}$ and $P_{23}$ are
balanced by equal and opposite stresses on the bottom face ($x_3 = 0$). The
remaining torques are
$$P_{21}(dx_2\,dx_3)\,dx_1 \quad\text{and}\quad P_{12}(dx_1\,dx_3)\,dx_2, \tag{3.77}$$
which must balance,
$$P_{21}\,dx_1\,dx_2\,dx_3 = P_{12}\,dx_1\,dx_2\,dx_3,$$
in the absence of rotation about the $x_3$-axis. We have
$$P_{21} = P_{12} \tag{3.78}$$
and, in general, repeating the argument for the absence of rotation about the
$x_1$- and $x_2$-axes, we get
$$P_{ij} = P_{ji}. \tag{3.79}$$
2 Clearly, for an ordinary liquid or a gas (which cannot support shear strains),
the nondiagonal elements must vanish.
FIG. 3.5 Stresses
FIG. 3.6 Homogeneous stresses: sign reversal
FIG. 3.7 Homogeneous stresses: balance of torques
Thus the array of stresses (pressure) $P_{ij}$ is symmetric. These are equalities of
magnitude, not direction, which is given by the first index. Now we show that
this array is a tensor. We form an infinitesimal tetrahedron with a slant face of
area $dA$, normal in the $x'_j$-direction, as shown in Fig. 3.8. The forces on the
slant face are $P'_{ij}\,dA$. The forces on the faces $x_1 = 0$, $x_2 = 0$, and $x_3 = 0$ are,
respectively,
$$P_{m1}(a_{j1}\,dA), \qquad P_{m2}(a_{j2}\,dA), \qquad\text{and}\qquad P_{m3}(a_{j3}\,dA),$$
where $a_{jk}\,dA$ is the area of the face $x_k = 0$, given by the slant area $dA$ projected
onto the plane $x_k = 0$. Our $a_{jk}$ is the usual direction cosine, the cosine of the
angle between the $x'_j$- and the $x_k$-axes.
FIG. 3.8 Differential tetrahedron: balance of forces
The force $P_{m1}a_{j1}\,dA$ is along $x_m$. Its component in the $x'_i$-direction is
$a_{im}a_{j1}P_{m1}\,dA$ (no summation). If we now sum over $m$, the foregoing expression
gives us the sum of the $x'_i$-components of the three forces on the back face,
$x_1 = 0$, in the $x'_i$-direction. Finally, summing over all three faces, $x_k = 0$, the
total force along $x'_i$ is
$$P'_{ij}\,dA = a_{im}a_{jk}P_{mk}\,dA \tag{3.80}$$
for static equilibrium. Since the area $dA$ is arbitrary, we have
$$P'_{ij} = a_{im}a_{jk}P_{mk}, \tag{3.81}$$
which by definition makes $P_{mk}$ a tensor.
We note that the strain tensor $\eta_{ij}$ is found to be a tensor by an essentially
mathematical argument, independent of any physics. $P_{ij}$, in contrast, is shown
to be a tensor by physical arguments (equilibrium) that lead directly to the
definition of tensor.
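The second-rank transformation law for the stress array lends itself to a quick numerical illustration. The Python sketch below assumes a rotation of the axes about $x_3$ and a hypothetical uniaxial stress; the function names are ours. It applies the rule $P'_{ij} = a_{im}a_{jk}P_{mk}$ directly as a double sum.

```python
import math


def rotation_about_x3(theta):
    """Direction cosines a[j][k] between the rotated x'_j axis and the x_k axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]


def transform_stress(a, P):
    """Second-rank tensor law P'_ij = a_im a_jk P_mk."""
    R = range(3)
    return [[sum(a[i][m] * a[j][k] * P[m][k] for m in R for k in R)
             for j in R] for i in R]


# Hypothetical uniaxial tension of unit magnitude along x_1
P = [[1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]

a = rotation_about_x3(math.pi / 4.0)
P_rot = transform_stress(a, P)
trace_before = sum(P[i][i] for i in range(3))
trace_after = sum(P_rot[i][i] for i in range(3))
```

In the rotated axes the pure tension appears partly as shear, while the array stays symmetric and its trace is unchanged.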
Stress-strain Relations: Hooke's Law
First, let us assume an isotropic elastic solid. Later we shall return to the
general anisotropic case. Consider a uniform rod parallel to the $x_1$-axis.3 Now
let us investigate the effect on the length of the rod of small tensile stresses $P_{11}$,
$P_{22}$, and $P_{33}$ acting separately. By applying a small tensile stress $P_{11}$, we obtain
$$E\eta_{11} = P_{11}, \tag{3.82a}$$
where $E$ is Young's modulus. Applying a small tensile stress $P_{22}$, we expect a
contraction along the $x_1$-axis and so write
$$E\eta_{11} = -\sigma P_{22}, \tag{3.82b}$$
the minus sign indicating contraction; $\sigma$ is Poisson's ratio. A similar equation
would hold for $P_{33}$. The effects of $P_{11}$, $P_{22}$, and $P_{33}$ together become
$$E\eta_{11} = P_{11} - \sigma P_{22} - \sigma P_{33}. \tag{3.83}$$
All through here we are limiting ourselves to small stresses and small strains so
that the stress-strain relation will be linear. Equation 3.83 may be rewritten
$$E\eta_{11} = (1 + \sigma)P_{11} - \sigma(P_{11} + P_{22} + P_{33}), \tag{3.84}$$
and similarly for $E\eta_{22}$ and $E\eta_{33}$.
Now $\eta_{ij}$ and $P_{ij}$ are tensors as proved earlier in this section. Because of the
symmetry of our system their nondiagonal components are zero. To find the
generalization of Eq. 3.84 in an arbitrarily oriented cartesian coordinate
system, we rotate the axes
$$\eta'_{ij} = a_{ik}a_{jk}\eta_{kk}, \qquad P'_{ij} = a_{ik}a_{jk}P_{kk}. \tag{3.85}$$
If we multiply Eq. 3.84 by $a_{i1}a_{j1}$, the corresponding equation for $E\eta_{22}$ by
$a_{i2}a_{j2}$, and the equation for $E\eta_{33}$ by $a_{i3}a_{j3}$ and add all three equations, we
obtain
$$E\eta'_{ij} = (1 + \sigma)a_{ik}a_{jk}P_{kk} - \sigma(P_{nn})a_{ik}a_{jk}. \tag{3.86}$$
Using Eqs. 3.85 and 3.18, we get
$$E\eta'_{ij} = (1 + \sigma)P'_{ij} - \sigma(P_{mm})\delta_{ij}, \tag{3.87}$$
where
$$(P_{mm}) = (P'_{nn}) = P_{11} + P_{22} + P_{33}, \tag{3.88}$$
the contracted (and therefore invariant) tensor $P_{ij}$.
It is frequently more convenient to solve for the stresses $P_{ik}$. We may do
this by setting $i = j$ in Eq. 3.87 and contracting
$$E\eta_{jj} = (1 + \sigma)P_{jj} - 3\sigma P_{jj} = (1 - 2\sigma)P_{jj}, \tag{3.89}$$
dropping the primes as superfluous. Substituting back into Eq. 3.87, we have
$$(1 + \sigma)P_{ij} = E\eta_{ij} + \frac{\sigma E}{1 - 2\sigma}\,\eta_{mm}\delta_{ij} \tag{3.90}$$
or
$$P_{ij} = 2\mu\eta_{ij} + \lambda\eta_{mm}\delta_{ij}, \tag{3.91}$$
where $\lambda$ and $\mu$, known as Lamé's constants, are given by
$$\lambda = \frac{\sigma E}{(1 + \sigma)(1 - 2\sigma)}, \qquad \mu = \frac{E}{2(1 + \sigma)}. \tag{3.92}$$
3 This special choice will start us off in a system identified in Chapter 4 as
the principal axis system, the particular coordinate system in which the
shearing strains vanish.
The constant $\mu$ may be identified as the rigidity or shear modulus. Consider a
parallelepiped fixed to the ($x_2 = 0$)-plane with a tangential stress $P_{12}$ applied.
The displacement ($\delta\mathbf{u}$ of Fig. 3.9) is $(\eta x_2, 0, 0)$. In terms of our strain tensor,
Eq. 3.72, $\eta_{ij} = 0$ except for $\eta_{12} = \eta_{21} = \tfrac{1}{2}\eta$. From Eq. 3.91
$$P_{12} = 2\mu\cdot\tfrac{1}{2}\eta = \mu\eta, \tag{3.93}$$
showing that $\mu$ is the ratio of shear stress to shear strain $\eta$.
FIG. 3.9 Shear stress and shear strain
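If the strain is spherically symmetric, as in hydrostatic pressure,
$$\eta_{11} = \eta_{22} = \eta_{33}; \qquad \eta_{12} = \eta_{13} = \eta_{23} = 0. \tag{3.94}$$
Then Eq. 3.91 becomes
$$P_{11} = (2\mu + 3\lambda)\eta_{11} = 3k\eta_{11}, \tag{3.95}$$
where
$$k = \lambda + \tfrac{2}{3}\mu. \tag{3.96}$$
Since $3\eta_{11}$ is the relative change in volume to first order, we identify $k$ as the
bulk modulus.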
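The relations among the elastic constants can be exercised with a few lines of Python. This is only a sketch: the values of Young's modulus and Poisson's ratio are hypothetical (roughly steel-like), the strain components are invented, and the function names are ours. The code evaluates Eqs. 3.92 and 3.96, computes the stress from Eq. 3.91, and then recovers the strain from Eq. 3.87 as a consistency check.

```python
def lame_constants(E, sigma):
    """Lame constants (Eq. 3.92) and bulk modulus (Eq. 3.96)."""
    lam = sigma * E / ((1.0 + sigma) * (1.0 - 2.0 * sigma))
    mu = E / (2.0 * (1.0 + sigma))
    k = lam + 2.0 * mu / 3.0
    return lam, mu, k


def stress_from_strain(eta, lam, mu):
    """Eq. 3.91: P_ij = 2 mu eta_ij + lam eta_mm delta_ij."""
    tr = sum(eta[m][m] for m in range(3))
    return [[2.0 * mu * eta[i][j] + (lam * tr if i == j else 0.0)
             for j in range(3)] for i in range(3)]


def strain_from_stress(P, E, sigma):
    """Eq. 3.87 solved for the strain: E eta_ij = (1 + sigma) P_ij - sigma P_mm delta_ij."""
    tr = sum(P[m][m] for m in range(3))
    return [[((1.0 + sigma) * P[i][j] - (sigma * tr if i == j else 0.0)) / E
             for j in range(3)] for i in range(3)]


# Hypothetical material constants (SI units) and small symmetric strain
E_young, sigma = 200.0e9, 0.3
eta = [[1.0e-4, 2.0e-5, 0.0],
       [2.0e-5, -3.0e-5, 1.0e-5],
       [0.0, 1.0e-5, 5.0e-5]]

lam, mu, k = lame_constants(E_young, sigma)
P = stress_from_strain(eta, lam, mu)
eta_back = strain_from_stress(P, E_young, sigma)
```

Combining Eqs. 3.92 and 3.96 also gives $k = E/[3(1 - 2\sigma)]$, which the computed `k` can be compared against.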
Generalized Hooke's Law
For the general case covering anisotropic as well as isotropic solids we
express the linear stress-strain relation by a generalized Hooke's law,
$$P_{ij} = c_{ijkl}\eta_{kl}, \tag{3.97}$$
where $c_{ijkl}$ is a fourth-rank tensor by the quotient rule, Eq. 3.29d. Since the
stress tensor $P_{ij}$ and the strain tensor $\eta_{kl}$ are both symmetric,
$$c_{ijkl} = c_{jikl} = c_{ijlk} = c_{jilk}, \tag{3.98}$$
reducing the number of components from 81 ($3^4$) to 36. It may further be
shown4 that
$$c_{ijkl} = c_{klij}, \tag{3.99}$$
which further reduces the number of independent components to 21.
If we apply our general tensor relation (Eq. 3.97) to an isotropic body, the
elastic constant tensor $c_{ijkl}$ must be a linear combination of the most general
isotropic tensors of fourth rank. Using the results of Exercise 3.4.9, we have
$$c_{ijkl} = a\,\delta_{ij}\delta_{kl} + b[\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}] + c[\delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}]. \tag{3.100}$$
By substituting into Eq. 3.97, we obtain
$$P_{ij} = a\,\delta_{ij}\eta_{kk} + b(\eta_{ij} + \eta_{ji}) + c(\eta_{ij} - \eta_{ji}) \tag{3.101}$$
as before. Since $\eta_{ij}$ is symmetric, this reduces to
$$P_{ij} = a\,\delta_{ij}\eta_{kk} + 2b\,\eta_{ij},$$
in complete agreement with Eq. 3.91.
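The contraction of the isotropic fourth-rank tensor with a symmetric strain can be verified by brute force. The Python sketch below is illustrative only: the coefficient values and strain components are hypothetical and dimensionless, and the function names are ours. It builds the 81 components from the three isotropic combinations, contracts them with a symmetric strain, and compares the result with the two-constant form.

```python
def delta(i, j):
    """Kronecker delta."""
    return 1.0 if i == j else 0.0


def isotropic_c(a, b, c):
    """Fourth-rank isotropic tensor built from the three combinations of Exercise 3.4.9."""
    R = range(3)
    return [[[[a * delta(i, j) * delta(k, l)
               + b * (delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k))
               + c * (delta(i, k) * delta(j, l) - delta(i, l) * delta(j, k))
               for l in R] for k in R] for j in R] for i in R]


def contract(c4, eta):
    """Generalized Hooke's law: P_ij = c_ijkl eta_kl."""
    R = range(3)
    return [[sum(c4[i][j][k][l] * eta[k][l] for k in R for l in R)
             for j in R] for i in R]


# Hypothetical coefficients; a plays the role of lambda and b the role of mu
a_coef, b_coef, c_coef = 1.2, 0.8, 0.37
eta = [[0.010, 0.002, 0.000],
       [0.002, -0.003, 0.001],
       [0.000, 0.001, 0.005]]

P_general = contract(isotropic_c(a_coef, b_coef, c_coef), eta)
tr = sum(eta[m][m] for m in range(3))
P_two_constant = [[2.0 * b_coef * eta[i][j] + a_coef * tr * delta(i, j)
                   for j in range(3)] for i in range(3)]
```

The coefficient `c_coef` multiplies a combination that is antisymmetric in the last two indices, so it drops out of the result for any symmetric strain.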
The properties and applications of the fourth-rank Hooke's law tensor are
explored further in the exercises.
All the preceding discussion of elasticity has been in a cartesian framework—
to simplify the mathematics. But sometimes problems in the real, physical
world simply demand some other coordinate system. In considering the free
oscillations of the Earth, for instance, we generally rewrite the elastic relations
of this section and the equations of motion in spherical polar coordinates. The
result is a complicated set of simultaneous second-order partial differential
equations which can be solved only by numerical integration (Section 8.8). The
comparison of such theoretical-numerical results and seismological records of
the free oscillations of our elastic Earth has yielded information about the
structure of the Earth's interior.
EXERCISES
3.6.1 The three-dimensional fourth-rank stress-strain tensor $c_{ijkl}$ satisfies
$$c_{ijkl} = c_{ijlk} = c_{jilk}.$$
(a) Show that application of these symmetry conditions reduces the number of
independent components or elements of $c_{ijkl}$ from 81 to 36.
(b) If we further specify that
$$c_{ijkl} = c_{klij},$$
show that the number of independent components drops to 21.
4 Compare I. S. Sokolnikoff, Mathematical Theory of Elasticity. New York:
McGraw-Hill (1956).
3.6.2 (a) What arguments can you adduce to demonstrate that Poisson's ratio $\sigma$ is
nonnegative?
(b) Assuming that the shear modulus $\mu$ and the bulk modulus $k$ are each
nonnegative, set an upper limit to the value of Poisson's ratio.
ANS. (b) $\sigma < 1/2$.
3.6.3 Calculate the elastic potential energy (per unit volume) of an elastic isotropic body
subjected to a small strain.
ANS. $\dfrac{W}{V} = \tfrac{1}{2}\lambda(\eta_{ii})^2 + \mu\,\eta_{ij}\eta_{ij}$.
3.6.4 The potential energy density of a strained elastic solid is given by
$$\text{P.E.} = \tfrac{1}{2}c_{ijkl}\eta_{ij}\eta_{kl}.$$
If the solid has cubic symmetry,
(a) Show that any $c_{ijkl}$, in which any subscript (1, 2, 3 or x, y, z) appears an odd
number of times, vanishes; that is,
$$c_{1112} = 0.$$
Hint. Reflect the coordinate appearing an odd number of times.
(b) Show that there remain three distinct nonzero elastic constants
$c_{1111} = c_{2222} = c_{3333}$,
$c_{1122} = c_{2211} = c_{1133}$, and so on,
$c_{1212} = c_{2121} = c_{1313}$, and so on
$= c_{1221}$, and so on,
for a total of 21 elements.
3.6.5 If the atomic force between every two atoms of our elastic body is along the line
joining the two atoms and each atom is a center of symmetry, then, as shown by
Cauchy, $c_{ijkl} = c_{kjil}$. Given (1) an isotropic elastic body and (2) this symmetry
condition of Cauchy, show that the elastic constant tensor $c_{ijkl}$ is completely
symmetric under all permutations of the indices.
3.6.6 If our elastic solid is isotropic, $c_{ijkl}$ will have 21 nonvanishing components. Express
these 21 components in terms of Young's modulus $E$ and Poisson's ratio $\sigma$.
ANS. $c_{1111} = E\,\dfrac{1 - \sigma}{(1 + \sigma)(1 - 2\sigma)}$,
$c_{1122} = E\,\dfrac{\sigma}{(1 + \sigma)(1 - 2\sigma)}$,
$c_{1212} = E\,\dfrac{1}{2(1 + \sigma)}$.
3.6.7 The original strain tensor $\partial u_i/\partial x_k$ is reducible in the sense of Section 3.4. A partial
reduction, the splitting off of the antisymmetric $\xi_{ik}$, is carried out in the first part
of Section 3.6. Completing the reduction, we may write
$$\eta_{ij} = \tfrac{1}{3}\eta\,\delta_{ij} + (\eta_{ij} - \tfrac{1}{3}\eta\,\delta_{ij}).$$
Here $\eta$ is the contracted (scalar) $\eta_{ii}$.
(a) Show that the tensor $v_{ij} = \tfrac{1}{3}\eta\,\delta_{ij}$ describes a change in volume and no change
of shape.
(b) Show that the second tensor $s_{ij} = \eta_{ij} - \tfrac{1}{3}\eta\,\delta_{ij}$ describes a change in shape
(shear) with no change in volume to first order.
Note. Our elasticity theory here is a first-order theory. Discard second- and
higher-order terms.
3.6.8 (a) Derive the equation for waves in an elastic medium
$$m\,\frac{\partial^2\mathbf{u}}{\partial t^2} = (k + \tfrac{4}{3}\mu)\nabla\nabla\cdot\mathbf{u} - \mu\nabla\times\nabla\times\mathbf{u}.$$
Hint. Consider the net force on a unit cube (mass $m$).
(b) If the displacement u is irrotational, show that the elastic waves are propagated
with a velocity $v = [(k + \tfrac{4}{3}\mu)/m]^{1/2}$ and are longitudinal (plane waves or
spherical waves at large distances).
(c) If the displacement u is solenoidal, show that the elastic waves are propagated
with velocity $v = (\mu/m)^{1/2}$ and are transverse (plane waves or spherical waves
at large distances).
3.7 LORENTZ COVARIANCE OF MAXWELL'S
EQUATIONS
If a physical law is to hold for all orientations of our (real) spatial coordinates
(i.e., to be invariant under rotations), the terms of the equation must be
covariant under rotations (Sections 1.2 and 3.1). This means that we write the
physical laws in the mathematical form scalar = scalar, vector = vector,
second-rank tensor = second-rank tensor, and so on. Similarly, if a physical
law is to hold for all inertial systems, the terms of the equation must be covariant
under Lorentz transformations.
Using Minkowski space ($x = x_1$, $y = x_2$, $z = x_3$, $ict = x_4$), we have a
four-dimensional cartesian space in that the metric $g_{ij} = \delta_{ij}$ (Section 2.1). The
Lorentz transformations take the form of a "rotation" in this four-dimensional
complex space.1
Here we consider Maxwell's equations
$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}, \tag{3.102a}$$
$$\nabla\times\mathbf{H} = \frac{\partial\mathbf{D}}{\partial t} + \rho\mathbf{v}, \tag{3.102b}$$
$$\nabla\cdot\mathbf{D} = \rho, \tag{3.102c}$$
$$\nabla\cdot\mathbf{B} = 0, \tag{3.102d}$$
and the relations
$$\mathbf{D} = \varepsilon_0\mathbf{E}, \qquad \mathbf{B} = \mu_0\mathbf{H}. \tag{3.103}$$
1 A group theoretic derivation of the Lorentz transformation in Minkowski
space appears in Section 4.13. See also H. Goldstein, Classical Mechanics.
Cambridge, Mass.: Addison-Wesley (1951), Chapter 6. The tensor equation
for a photon, $\sum_\lambda x_\lambda^2 = 0$, independent of reference frame, leads to the Lorentz
transformations.
The symbols have their usual meanings as given in the introduction. For
simplicity we assume vacuum ($\varepsilon = \varepsilon_0$, $\mu = \mu_0$).
We assume that Maxwell's equations hold in all inertial systems; that is,
Maxwell's equations are consistent with special relativity. (The covariance of
Maxwell's equations under Lorentz transformations was actually shown by
Lorentz and Poincare before Einstein proposed his theory of special relativity.)
Our immediate goal is to rewrite Maxwell's equations as tensor equations in
Minkowski space. This will make the Lorentz covariance explicit.
In terms of scalar and magnetic vector potentials, we may write2
$$\mathbf{B} = \nabla\times\mathbf{A}, \qquad \mathbf{E} = -\frac{\partial\mathbf{A}}{\partial t} - \nabla\varphi. \tag{3.104}$$
Equation 3.104 specifies the curl of A; the divergence of A is still undefined
(compare Sections 1.13 and 1.15). We may, and for future convenience we do,
impose the further restriction on the vector potential A,
$$\nabla\cdot\mathbf{A} + \varepsilon_0\mu_0\frac{\partial\varphi}{\partial t} = 0. \tag{3.105}$$
This is the Lorentz relation. It will serve the purpose of uncoupling the
differential equations for A and $\varphi$ that follow. The potentials A and $\varphi$ are not yet
completely fixed. The freedom remaining is the topic of Exercise 3.7.4.
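Now we rewrite the Maxwell equations in terms of the potentials A and $\varphi$.
From Eqs. 3.102c for $\nabla\cdot\mathbf{D}$ and 3.104
$$\nabla^2\varphi + \nabla\cdot\frac{\partial\mathbf{A}}{\partial t} = -\frac{\rho}{\varepsilon_0}, \tag{3.106}$$
whereas Eqs. 3.102b for $\nabla\times\mathbf{H}$ and 3.104 and Eq. 1.80 of Chapter 1 yield
$$\frac{\partial^2\mathbf{A}}{\partial t^2} + \nabla\frac{\partial\varphi}{\partial t} + \frac{1}{\varepsilon_0\mu_0}\left[\nabla\nabla\cdot\mathbf{A} - \nabla^2\mathbf{A}\right] = \frac{\rho\mathbf{v}}{\varepsilon_0}. \tag{3.107}$$
Using the Lorentz relation, Eq. 3.105, and the relation $\varepsilon_0\mu_0 = 1/c^2$, we obtain
$$\left[\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right]\mathbf{A} = -\mu_0\rho\mathbf{v}, \qquad \left[\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right]\varphi = -\frac{\rho}{\varepsilon_0}. \tag{3.108}$$
Now the differential operator
$$\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}$$
becomes in Minkowski space
$$\sum_{\mu=1}^{4}\frac{\partial^2}{\partial x_\mu^2} \equiv \frac{\partial^2}{\partial x_\mu^2}.$$
Here we adopt Greek indices, as is customary in relativity theory, to indicate a
summation from 1 to 4. This summation
is a four-dimensional Laplacian, usually called the d'Alembertian and denoted
by $\Box^2$. It may readily be proved a scalar (see Exercise 3.2.3).
2 Compare Section 1.13, especially Exercise 1.13.10.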
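For convenience we define
$$A_1 \equiv \frac{A_x}{\mu_0c} = c\varepsilon_0A_x, \qquad A_2 \equiv \frac{A_y}{\mu_0c} = c\varepsilon_0A_y, \qquad A_3 \equiv \frac{A_z}{\mu_0c} = c\varepsilon_0A_z, \qquad A_4 \equiv i\varepsilon_0\varphi. \tag{3.109}$$
If we further put
$$\frac{\rho v_x}{c} \equiv i_1, \qquad \frac{\rho v_y}{c} \equiv i_2, \qquad \frac{\rho v_z}{c} \equiv i_3, \qquad i\rho \equiv i_4, \tag{3.110}$$
then Eq. 3.108 may be written in the form
$$\Box^2A_\mu = -i_\mu. \tag{3.111}$$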
Equation 3.111 looks like a tensor equation, but looks do not constitute
proof. To prove that it is a tensor equation, we start by investigating the
transformation properties of the generalized current $i_\mu$.
Since an electric charge element $de$ is an invariant quantity, we have
$$de = \rho\,dx_1\,dx_2\,dx_3, \quad\text{invariant}. \tag{3.112}$$
We saw in Section 3.4 that the four-dimensional volume element, $dx_1\,dx_2\,dx_3\,dx_4$,
was also invariant. Comparing this result, Eq. 3.53, with Eq. 3.112, we see
that the charge density $\rho$ must transform the same way as $dx_4$, the fourth
component of a four-dimensional vector $dx_\lambda$. We put $i\rho = i_4$, with $i_4$ now
established as the fourth component of a four-dimensional vector. The other
parts of Eq. 3.110 may be expanded as
$$i_1 = \frac{\rho v_x}{c} = \frac{\rho}{c}\frac{dx_1}{dt} = \frac{i\rho}{ic}\frac{dx_1}{dt} = i_4\frac{dx_1}{dx_4}. \tag{3.113}$$
Since we have just shown that $i_4$ transforms as $dx_4$, this means that $i_1$ transforms
as $dx_1$. With similar results for $i_2$ and $i_3$, we have $i_\lambda$ transforming as $dx_\lambda$,
proving that $i_\lambda$ is a vector, a four-dimensional vector in Minkowski space.
Equation 3.111, which follows directly from Maxwell's equations, Eq. 3.102,
is assumed to hold in all cartesian systems (all Lorentz frames). Then, by the
quotient rule, Section 3.3, $A_\mu$ is also a vector and Eq. 3.111 is a legitimate
tensor equation.
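Now, working backward, Eq. 3.104 may be written
$$\varepsilon_0E_j = i\left(\frac{\partial A_4}{\partial x_j} - \frac{\partial A_j}{\partial x_4}\right), \quad j = 1, 2, 3,$$
$$c\varepsilon_0B_i = \frac{\partial A_k}{\partial x_j} - \frac{\partial A_j}{\partial x_k}, \quad (i, j, k) = (1, 2, 3) \tag{3.114}$$
and cyclic permutations.
We define a new tensor
$$f_{\mu\lambda} \equiv \frac{\partial A_\lambda}{\partial x_\mu} - \frac{\partial A_\mu}{\partial x_\lambda} = -f_{\lambda\mu},$$
an antisymmetric second-rank tensor, since $A_\lambda$ is a vector. Written out explicitly,
$$f_{\mu\lambda} = \varepsilon_0\begin{pmatrix} 0 & cB_z & -cB_y & -iE_x \\ -cB_z & 0 & cB_x & -iE_y \\ cB_y & -cB_x & 0 & -iE_z \\ iE_x & iE_y & iE_z & 0 \end{pmatrix}. \tag{3.115}$$
Notice that in our four-dimensional Minkowski space E and B are no longer
vectors but together form a second-rank tensor. With this tensor we may write
the two nonhomogeneous Maxwell equations (3.102b and 3.102c) combined as
a tensor equation
$$\frac{\partial f_{\lambda\mu}}{\partial x_\mu} = i_\lambda. \tag{3.116}$$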
The left-hand side of Eq. 3.116 is a four-dimensional divergence of a tensor
and therefore a vector. This, of course, is equivalent to contracting a third-rank
tensor $\partial f_{\lambda\mu}/\partial x_\nu$ (compare Exercises 3.2.1 and 3.2.2). The two homogeneous
Maxwell equations, 3.102a for $\nabla\times\mathbf{E}$ and 3.102d for $\nabla\cdot\mathbf{B}$, may be expressed
in the tensor form
$$\frac{\partial f_{23}}{\partial x_1} + \frac{\partial f_{31}}{\partial x_2} + \frac{\partial f_{12}}{\partial x_3} = 0 \tag{3.117}$$
for Eq. 3.102d and three equations of the form
$$\frac{\partial f_{34}}{\partial x_2} + \frac{\partial f_{42}}{\partial x_3} + \frac{\partial f_{23}}{\partial x_4} = 0 \tag{3.118}$$
for Eq. 3.102a. (A second equation permutes 124, a third permutes 134.) Since
$$\frac{\partial f_{\lambda\mu}}{\partial x_\nu} \equiv t_{\lambda\mu\nu}$$
is a tensor (of third rank), Eqs. 3.102a and 3.102d are given by the tensor
equation
$$t_{\lambda\mu\nu} + t_{\nu\lambda\mu} + t_{\mu\nu\lambda} = 0. \tag{3.119}$$
From Eqs. 3.117 and 3.118 the reader will understand that the indices $\lambda$, $\mu$,
and $\nu$ are supposed to be different. Actually Eq. 3.119 automatically reduces
to 0 = 0 if any two indices coincide. An alternate form of Eq. 3.119 appears
in Exercise 3.7.14.
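The remark that Eq. 3.119 reduces to 0 = 0 when two indices coincide depends only on the antisymmetry of $t_{\lambda\mu\nu}$ in its first two indices. A small Python sketch, using a hypothetical array filled with random numbers subject to that antisymmetry (it is not derived from any actual field), makes the point and also counts the index combinations that carry real content.

```python
import random

random.seed(0)
R = range(4)

# Hypothetical third-rank array, antisymmetric in its first two indices,
# standing in for t_{lambda mu nu} = d f_{lambda mu} / d x_nu
t = [[[0.0 for _ in R] for _ in R] for _ in R]
for lam in R:
    for mu in range(lam + 1, 4):
        for nu in R:
            value = random.uniform(-1.0, 1.0)
            t[lam][mu][nu] = value
            t[mu][lam][nu] = -value


def cyclic_sum(lam, mu, nu):
    """Left-hand side of Eq. 3.119."""
    return t[lam][mu][nu] + t[nu][lam][mu] + t[mu][nu][lam]


# Index triples with at least two equal indices
coincident = [cyclic_sum(l, m, n) for l in R for m in R for n in R
              if len({l, m, n}) < 3]

# Sets of three different indices: the nontrivial equations
distinct_sets = {frozenset((l, m, n)) for l in R for m in R for n in R
                 if len({l, m, n}) == 3}
```

Every entry of `coincident` vanishes identically, and there are four distinct index sets, matching the one equation of the form 3.117 and the three of the form 3.118.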
Lorentz Transformation of E and В
The construction of the tensor equations (3.116 and 3.119) completes our
initial goal of rewriting Maxwell's equations in tensor form.3 Now we exploit
the tensor properties of our four vectors and the tensor $f_{\mu\lambda}$.
For the Lorentz transformation corresponding to motion along the $z$($x_3$)-axis
with velocity $v$, the "direction cosines" are4
$$(a_{\mu\nu}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \gamma & i\beta\gamma \\ 0 & 0 & -i\beta\gamma & \gamma \end{pmatrix}, \tag{3.120}$$
where
$$\beta = \frac{v}{c} \qquad\text{and}\qquad \gamma = (1 - \beta^2)^{-1/2}. \tag{3.121}$$
Using the tensor transformation properties, we may calculate the electric and
magnetic fields in the moving system in terms of the values in the original
reference frame. From Eqs. 3.13, 3.115, and 3.120 we obtain
$$E'_x = \gamma(E_x - vB_y), \qquad E'_y = \gamma(E_y + vB_x), \qquad E'_z = E_z \tag{3.122}$$
and
$$B'_x = \gamma\left(B_x + \frac{v}{c^2}E_y\right), \qquad B'_y = \gamma\left(B_y - \frac{v}{c^2}E_x\right), \qquad B'_z = B_z. \tag{3.123}$$
3 Modern theories of quantum electrodynamics and elementary particles are
often written in this "manifestly covariant" form to guarantee consistency
with special relativity. Conversely, the insistence on such tensor form has
been a useful guide in the construction of these theories.
4 A group theoretic derivation of the Lorentz transformation appears in
Section 4.13. See also Goldstein, Chapter 6.
This coupling of E and B is to be expected. Consider, for instance, the case
of zero electric field in the unprimed system
$$E_x = E_y = E_z = 0.$$
Clearly, there will be no force on a stationary charged particle. When the
particle is in motion with a small velocity v along the z-axis,5 an observer on
the particle sees fields (exerting a force on his charged particle) given by
$$E'_x = -vB_y, \qquad E'_y = vB_x,$$
where B is a magnetic induction field in the unprimed system. These equations
may be put in vector form
$$\mathbf{E}' = \mathbf{v}\times\mathbf{B} \qquad\text{or}\qquad \mathbf{F} = q\mathbf{v}\times\mathbf{B}, \tag{3.124}$$
which is usually taken as the operational definition of the magnetic induction B.
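The transformation of E and B can be reproduced numerically by treating the field tensor as a 4 x 4 complex array and applying the second-rank transformation law with the "direction cosines" of the Lorentz transformation. The following Python sketch assumes SI values for the constants, the $x_4 = ict$ convention used in this section, and hypothetical field values; the helper names are ours. It compares the tensor result with the closed-form expressions for a boost along z.

```python
import math

C = 299792458.0          # speed of light in m/s
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def field_tensor(E, B):
    """Electromagnetic tensor laid out as in Eq. 3.115."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    rows = [[0.0, C * Bz, -C * By, -1j * Ex],
            [-C * Bz, 0.0, C * Bx, -1j * Ey],
            [C * By, -C * Bx, 0.0, -1j * Ez],
            [1j * Ex, 1j * Ey, 1j * Ez, 0.0]]
    return [[EPS0 * entry for entry in row] for row in rows]


def boost_z(beta):
    """'Direction cosines' for motion along z with velocity v = beta * c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, gamma, 1j * beta * gamma],
            [0.0, 0.0, -1j * beta * gamma, gamma]]


def transform(a, f):
    """Second-rank tensor law: f'_mn = a_mk a_nl f_kl."""
    R = range(4)
    return [[sum(a[m][k] * a[n][l] * f[k][l] for k in R for l in R)
             for n in R] for m in R]


def fields_from_tensor(f):
    """Read E and B back out of a tensor laid out as in Eq. 3.115."""
    E = tuple((f[3][j] / (1j * EPS0)).real for j in range(3))
    B = tuple((f[r][c] / (C * EPS0)).real for r, c in ((1, 2), (2, 0), (0, 1)))
    return E, B


# Hypothetical fields (V/m and T) and boost speed, for illustration only
E = (1.0e3, 2.0e3, 3.0e3)
B = (1.0e-5, -2.0e-5, 3.0e-5)
beta = 0.6
v = beta * C
gamma = 1.0 / math.sqrt(1.0 - beta * beta)

E_prime, B_prime = fields_from_tensor(transform(boost_z(beta), field_tensor(E, B)))

# Closed-form expressions for comparison (Eqs. 3.122 and 3.123)
E_expected = (gamma * (E[0] - v * B[1]), gamma * (E[1] + v * B[0]), E[2])
B_expected = (gamma * (B[0] + v * E[1] / C**2),
              gamma * (B[1] - v * E[0] / C**2), B[2])
```

Setting `E` to zero and `beta` to a small value in this sketch gives primed electric components close to $\mathbf{v}\times\mathbf{B}$, the low-velocity limit discussed above.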
Electromagnetic Invariants
Finally, the tensor (or vector) properties allow us to construct a multitude
of invariant quantities. One of the more important is the scalar product of the two
four-dimensional vectors or four vectors $A_\lambda$ and $i_\lambda$. We have
$$A_\lambda i_\lambda = c\varepsilon_0A_x\frac{\rho v_x}{c} + c\varepsilon_0A_y\frac{\rho v_y}{c} + c\varepsilon_0A_z\frac{\rho v_z}{c} + i\varepsilon_0\varphi\,i\rho = \varepsilon_0(\mathbf{A}\cdot\mathbf{J} - \rho\varphi), \quad\text{invariant}, \tag{3.125}$$
with A the usual magnetic vector potential and J the ordinary current density.
The final term $\rho\varphi$ is the ordinary static electric coupling with dimensions of
energy per unit volume. Hence our newly constructed scalar invariant is an
energy density. The dynamic interaction of field and current is given by the
product $\mathbf{A}\cdot\mathbf{J}$. This invariant $A_\lambda i_\lambda$ appears in the electromagnetic Lagrangians
of Exercises 17.3.6 and 17.5.1.
5 If the velocity is not small enough for $v^2/c^2$ to be negligible, a relativistic
transformation of force is needed.
Other possible electromagnetic invariants appear in Exercises 3.7.9 and
3.7.11.
EXERCISES
3.7.1 (a) Show that every four vector in Minkowski space may be decomposed into
an ordinary three-space vector and a three-space scalar. Examples: $(\mathbf{r}, ict)$,
$(\rho\mathbf{v}/c, i\rho)$, $(c\varepsilon_0\mathbf{A}, i\varepsilon_0\varphi)$, $(\mathbf{p}, iE/c)$, $(\mathbf{k}, i\omega/c)$.
Hint. Consider a rotation of the three-space coordinates with time fixed.
(b) Show that the converse of (a) is not true: every three vector plus scalar
does not form a Minkowski four vector.
3.7.2 (a) Show that
$$\frac{\partial i_\mu}{\partial x_\mu} = 0.$$
(b) Show how the previous tensor equation may be interpreted as a statement
of continuity of charge and current in ordinary three-dimensional space
and time.
(c) If this equation is known to hold in all Lorentz reference frames, why can
we not conclude that $i_\mu$ is a vector?
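3.7.3 Write the Lorentz condition (Eq. 3.105) as a tensor equation in Minkowski
space.
3.7.4 A gauge transformation consists of varying the scalar potential $\varphi_1$ and the
vector potential $\mathbf{A}_1$ according to the relation
$$\varphi_2 = \varphi_1 + \frac{\partial\chi}{\partial t}, \qquad \mathbf{A}_2 = \mathbf{A}_1 - \nabla\chi.$$
The new function $\chi$ is required to satisfy the homogeneous wave equation
$$\nabla^2\chi - \varepsilon_0\mu_0\frac{\partial^2\chi}{\partial t^2} = 0.$$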
Show the following:
(a) The Lorentz relation is unchanged.
(b) The new potentials satisfy the same inhomogeneous wave equations as
did the original potentials.
(c) The fields E and В are unaltered.
The invariance of our electromagnetic theory under this transformation is called
gauge invariance.
3.7.5 A charged particle, charge $q$, mass $m$, obeys the Lorentz covariant equation
$$\frac{dp_\mu}{d\tau} = \frac{q}{\varepsilon_0 m_0 c}\,f_{\mu\nu}\,p_\nu.$$
$p_\nu$ is the four-dimensional momentum vector $(p_1, p_2, p_3, iE/c)$. $\tau$ is the proper
time; $d\tau = dt\sqrt{1 - v^2/c^2}$, a Lorentz scalar. Show that the explicit space-time
forms are
$$\frac{d\mathbf{p}}{dt} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}).$$
3.7.6 From the Lorentz transformation matrix elements (Eq. 3.120) derive the Einstein
velocity addition law
$$u' = \frac{u - v}{1 - (uv/c^2)} \quad\text{or}\quad u = \frac{u' + v}{1 + (u'v/c^2)},$$
where $u = ic\,dx_3/dx_4$ and $u' = ic\,dx'_3/dx'_4$.
Hint. If $L_{12}(v)$ is the matrix transforming system 1 into system 2, $L_{23}(u')$ the
matrix transforming system 2 into system 3, $L_{13}(u)$ the matrix transforming
system 1 directly into system 3, then $L_{13}(u) = L_{23}(u')L_{12}(v)$. From this matrix
relation extract the Einstein velocity addition law.
3.7.7 The dual of a four-dimensional second-rank tensor $B$ may be defined by $B^*$,
where the elements of the dual tensor are given by
$$B^*_{ij} = \frac{1}{2!}\,\varepsilon_{ijkl}B_{kl}.$$
Show that B* transforms as
(a) a second-rank tensor under rotations,
(b) a pseudotensor under inversions.
Note. The asterisk here does not mean complex conjugate.
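3.7.8 Construct $f^*$, the dual of $f$, where $f$ is the electromagnetic tensor given by Eq.
3.115.
$$\text{ANS.}\quad f^* = \varepsilon_0\begin{pmatrix} 0 & -iE_z & iE_y & cB_x \\ iE_z & 0 & -iE_x & cB_y \\ -iE_y & iE_x & 0 & cB_z \\ -cB_x & -cB_y & -cB_z & 0 \end{pmatrix}.$$
This corresponds to
$$c\mathbf{B} \to -i\mathbf{E},$$
$$-i\mathbf{E} \to c\mathbf{B}.$$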
This transformation, sometimes called a "dual transformation," leaves Maxwell's
equations in vacuum (p = 0) invariant.
3.7.9 As the quadruple contraction of a fourth-rank pseudotensor and two second-rank
tensors, $\varepsilon_{\mu\lambda\nu\sigma}f_{\mu\lambda}f_{\nu\sigma}$ is clearly a pseudoscalar. Evaluate it.
ANS. $-8i\varepsilon_0^2\,c\,\mathbf{B}\cdot\mathbf{E}$.
3.7.10 (a) If an electromagnetic field is purely electric (or purely magnetic) in one
particular Lorentz frame, show that $\mathbf{E}$ and $\mathbf{B}$ will be orthogonal in other
Lorentz reference systems.
(b) Conversely, if $\mathbf{E}$ and $\mathbf{B}$ are orthogonal in one particular Lorentz frame, there
exists a Lorentz reference system in which $\mathbf{E}$ (or $\mathbf{B}$) vanishes. Find that
reference system.
3.7.11 Show that $c^2B^2 - E^2$ is a scalar invariant.
3.7.12 Since $(dx_1, dx_2, dx_3, dx_4)$ is a vector, $dx_\mu dx_\mu$ is a scalar. Evaluate this scalar for
a moving particle in two different coordinate systems: (a) a coordinate system
fixed relative to you (lab system), and (b) a coordinate system moving with a
moving particle (velocity $v$ relative to you). With the time increment labeled
$d\tau$ in the particle system and $dt$ in the lab system, show that
$$d\tau = dt\sqrt{1 - v^2/c^2}.$$
$\tau$ is the proper time of the particle, a Lorentz invariant quantity.
3.7.13 Expand the scalar expression
in terms of the fields and potentials. The resulting expression is the Lagrangian
density used in Exercise 17.5.1.
3.7.14 Show that Eq. 3.119 may be written
3.8 NONCARTESIAN TENSORS, COVARIANT
DIFFERENTIATION
The distinction between contravariant transformations and covariant
transformations was established in Section 3.1. Then, for convenience, we restricted
our attention to cartesian coordinates (in which the distinction disappears).
Now in these two concluding sections we return to noncartesian coordinates
and resurrect the contravariant and covariant dependence. As in Section 3.1,
a superscript will be used for an index denoting contravariant and a subscript
for an index denoting covariant dependence. The metric tensor of Section 2.1
will be used to relate contravariant and covariant indices.
The emphasis in this section is on differentiation, culminating in the
construction of the covariant derivative. We saw in Section 3.2 that the derivative of a
vector yields a second-rank tensor in cartesian coordinates. The covariant
derivative of a vector yields a second-rank tensor in noncartesian coordinate
systems.
Metric Tensor, Raising and Lowering Indices
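Let us start with a set of basis vectors $\boldsymbol{\varepsilon}_i$ such that an infinitesimal
displacement $d\mathbf{r}$ would be given by
$$d\mathbf{r} = \boldsymbol{\varepsilon}_1\,dq^1 + \boldsymbol{\varepsilon}_2\,dq^2 + \boldsymbol{\varepsilon}_3\,dq^3. \tag{3.126}$$
For convenience we take $\boldsymbol{\varepsilon}_1$, $\boldsymbol{\varepsilon}_2$, and $\boldsymbol{\varepsilon}_3$ to form a right-handed set. These
vectors are not necessarily orthogonal. The oblique coordinates of Section 4.4
furnish a convenient example of a nonorthogonal system. Also, a limitation to
three-dimensional space will be required only for the discussions of cross
products and curls. Otherwise these $\boldsymbol{\varepsilon}_i$ may be in $N$-dimensional space, including
the four-dimensional space-time of special and general relativity. The basis
vectors $\boldsymbol{\varepsilon}_i$ may be expressed by
$$\boldsymbol{\varepsilon}_i = \frac{\partial\mathbf{r}}{\partial q^i}, \tag{3.127}$$
as in Exercise 2.2.3. Note, however, that the $\boldsymbol{\varepsilon}_i$ here do not necessarily have unit
magnitude. From Exercise 2.2.3, the unit vectors are
$$\mathbf{e}_i = \frac{1}{h_i}\frac{\partial\mathbf{r}}{\partial q_i} \quad\text{(no summation)}$$
and therefore
$$\boldsymbol{\varepsilon}_i = h_i\mathbf{e}_i \quad\text{(no summation)}. \tag{3.128}$$
The $\boldsymbol{\varepsilon}_i$ are related to the unit vectors $\mathbf{e}_i$ by the scale factors $h_i$ of Section 2.2.
The $\mathbf{e}_i$ have no dimensions; the $\boldsymbol{\varepsilon}_i$ have the dimensions of $h_i$. In spherical polar
coordinates, as a specific example,
$$\boldsymbol{\varepsilon}_r = \mathbf{e}_r, \qquad \boldsymbol{\varepsilon}_\theta = r\,\mathbf{e}_\theta, \qquad \boldsymbol{\varepsilon}_\varphi = r\sin\theta\,\mathbf{e}_\varphi. \tag{3.129}$$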
As in Section 2.1, we construct the square of a differential displacement
$$(ds)^2 = d\mathbf{r}\cdot d\mathbf{r} = (\boldsymbol{\varepsilon}_i\,dq^i)\cdot(\boldsymbol{\varepsilon}_j\,dq^j) = \boldsymbol{\varepsilon}_i\cdot\boldsymbol{\varepsilon}_j\,dq^i\,dq^j. \tag{3.130}$$
Comparing this with $(ds)^2$ of Section 2.1, Eq. 2.4, we identify $\boldsymbol{\varepsilon}_i\cdot\boldsymbol{\varepsilon}_j$ as the
covariant metric tensor
$$\boldsymbol{\varepsilon}_i\cdot\boldsymbol{\varepsilon}_j = g_{ij}. \tag{3.131}$$
Clearly, $g_{ij}$ is symmetric. The tensor nature of $g_{ij}$ follows from the quotient
rule, Exercise 3.3.1. We take the relation
$$g^{ik}g_{kj} = \delta^i_j \tag{3.132}$$
to define the corresponding contravariant tensor $g^{ik}$. Contravariant $g^{ik}$ enters
as the inverse¹ of covariant $g_{kj}$. We use this contravariant $g^{ik}$ to raise indices,
converting a covariant index into a contravariant index, as shown subsequently.
Likewise the covariant $g_{kj}$ will be used to lower indices. The choice of $g^{ik}$ and
$g_{kj}$ for this raising-lowering operation is arbitrary. Any second-rank tensor
(and its inverse) would do. Specifically, we have
$$g^{ij}\boldsymbol{\varepsilon}_j = \boldsymbol{\varepsilon}^i \quad\text{relating covariant and contravariant basis vectors,}$$
$$g^{ij}F_j = F^i \quad\text{relating covariant and contravariant vector components.} \tag{3.133}$$
Then
$$g_{ij}\boldsymbol{\varepsilon}^j = \boldsymbol{\varepsilon}_i, \qquad g_{ij}F^j = F_i \quad\text{as the corresponding index lowering relations.} \tag{3.134}$$
1 If the tensor $g_{kj}$ is written as a matrix, the tensor $g^{ik}$ is given by the inverse
matrix.
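As an example of these transformations we start with the contravariant form
of vector
$$\mathbf{F} = F^i\boldsymbol{\varepsilon}_i. \tag{3.135}$$
From Eqs. 3.133 and 3.134
$$\mathbf{F} = F^i\boldsymbol{\varepsilon}_i = g^{ij}F_j\,g_{ik}\boldsymbol{\varepsilon}^k = F_j\boldsymbol{\varepsilon}^j, \tag{3.136}$$
the final equality coming from Eq. 3.132. Equation 3.135 gives the
contravariant representation of $\mathbf{F}$. Equation 3.136 gives the corresponding covariant
representation of the same $\mathbf{F}$. Examples of such representation appear in
Section 4.4, "Oblique Coordinates."
It should be emphasized again that the $\boldsymbol{\varepsilon}_i$ and $\boldsymbol{\varepsilon}^j$ do not have unit magnitude.
This may be seen in Eqs. 3.129 and in the metric tensor $g_{ij}$ for spherical polar
coordinates and its inverse $g^{ij}$:
$$(g_{ij}) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2\sin^2\theta \end{pmatrix}, \qquad (g^{ij}) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/r^2 & 0 \\ 0 & 0 & 1/(r^2\sin^2\theta) \end{pmatrix}.$$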
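The relation between the basis vectors and the metric can be checked numerically. The following is a minimal sketch, assuming spherical polar coordinates ordered as (r, theta, phi) and using central differences for Eq. 3.127; the function names `position`, `basis_vectors`, and `metric` are illustrative choices, not notation from the text.

```python
import math

def position(q):
    """Cartesian position vector for spherical polar coordinates q = (r, theta, phi)."""
    r, theta, phi = q
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def basis_vectors(q, h=1e-6):
    """Covariant basis vectors eps_i = d(position)/dq^i (Eq. 3.127), by central differences."""
    basis = []
    for i in range(3):
        q_plus, q_minus = list(q), list(q)
        q_plus[i] += h
        q_minus[i] -= h
        r_plus, r_minus = position(q_plus), position(q_minus)
        basis.append([(a - b) / (2 * h) for a, b in zip(r_plus, r_minus)])
    return basis

def metric(q):
    """Covariant metric tensor g_ij = eps_i . eps_j (Eq. 3.131)."""
    eps = basis_vectors(q)
    return [[sum(a * b for a, b in zip(eps[i], eps[j])) for j in range(3)]
            for i in range(3)]

r, theta, phi = 2.0, 0.7, 1.3
g = metric((r, theta, phi))
expected_diagonal = [1.0, r**2, (r * math.sin(theta))**2]
for i in range(3):
    print(i, g[i][i], expected_diagonal[i])
```

The diagonal entries should agree with 1, r², and r² sin²θ to within the finite-difference error, and the off-diagonal entries should be close to zero because this system is orthogonal.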
Derivatives, Christoffel Symbols
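Let us form the differential of a scalar
$$d\psi = \frac{\partial\psi}{\partial q^i}\,dq^i. \tag{3.137}$$
Since the $dq^i$ are the components of a contravariant vector, the partial
derivatives $\partial\psi/\partial q^i$ must form a covariant vector, by the quotient rule. The gradient
of a scalar becomes
$$\nabla\psi = \frac{\partial\psi}{\partial q^i}\,\boldsymbol{\varepsilon}^i. \tag{3.138}$$
The reader should note that $\partial\psi/\partial q^i$ are not the gradient components of Section
2.2, because $\boldsymbol{\varepsilon}^i \neq \mathbf{e}_i$ of Section 2.2.
Moving on to the derivatives of a vector, we find that the situation is much
more complicated because the basis vectors $\boldsymbol{\varepsilon}_i$ are in general not constant.
Remember we are no longer restricting ourselves to cartesian coordinates and
the nice, convenient $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$! Direct differentiation yields
$$\frac{\partial\mathbf{V}}{\partial q^j} = \frac{\partial V^i}{\partial q^j}\,\boldsymbol{\varepsilon}_i + V^i\frac{\partial\boldsymbol{\varepsilon}_i}{\partial q^j}. \tag{3.139}$$
Now $\partial\boldsymbol{\varepsilon}_i/\partial q^j$ will be some linear combination of the $\boldsymbol{\varepsilon}_k$ with the coefficient
depending on the indices $i$ and $j$ from the partial derivative and index $k$ from
the base vector. We write
$$\frac{\partial\boldsymbol{\varepsilon}_i}{\partial q^j} = \Gamma^k_{ij}\,\boldsymbol{\varepsilon}_k. \tag{3.140a}$$
Multiplying by $\boldsymbol{\varepsilon}^m$, we have
$$\Gamma^m_{ij} = \boldsymbol{\varepsilon}^m\cdot\frac{\partial\boldsymbol{\varepsilon}_i}{\partial q^j}. \tag{3.140b}$$
The $\Gamma^k_{ij}$ is a Christoffel symbol (of the second kind). It is also called a "coefficient
of connection." These $\Gamma^k_{ij}$ are not third-rank tensors and the $\partial V^i/\partial q^j$ of Eq.
3.139 are not second-rank tensors. Equation 3.140 should be compared with
the results quoted in Exercise 2.2.3 (remembering that in general $\boldsymbol{\varepsilon}_i \neq \mathbf{e}_i$). In
cartesian coordinates, $\Gamma^k_{ij} = 0$ for all values of the indices $i$, $j$, and $k$. These
Christoffel three index symbols may be computed by the techniques of Chapter
2. This is the topic of Exercise 3.8.7. Equation 3.153 at the end of this section
offers an easier method.
Using Eq. 3.127, we obtain
$$\frac{\partial\boldsymbol{\varepsilon}_i}{\partial q^j} = \frac{\partial^2\mathbf{r}}{\partial q^j\,\partial q^i} = \frac{\partial\boldsymbol{\varepsilon}_j}{\partial q^i} = \Gamma^k_{ji}\,\boldsymbol{\varepsilon}_k. \tag{3.141}$$
Hence these Christoffel symbols are symmetric in the two lower indices:
$$\Gamma^k_{ij} = \Gamma^k_{ji}. \tag{3.142}$$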
Covariant Derivative
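With the Christoffel symbols, Eq. 3.139 may be rewritten
$$\frac{\partial\mathbf{V}}{\partial q^j} = \frac{\partial V^i}{\partial q^j}\,\boldsymbol{\varepsilon}_i + V^i\Gamma^k_{ij}\,\boldsymbol{\varepsilon}_k. \tag{3.143}$$
Now $i$ and $k$ in the last term are dummy indices. Interchanging $i$ and $k$ (in this
one term), we have
$$\frac{\partial\mathbf{V}}{\partial q^j} = \left(\frac{\partial V^i}{\partial q^j} + V^k\Gamma^i_{kj}\right)\boldsymbol{\varepsilon}_i. \tag{3.144}$$
The quantity in parentheses is labeled a covariant derivative, $V^i_{;j}$. We have
$$V^i_{;j} = \frac{\partial V^i}{\partial q^j} + V^k\Gamma^i_{kj}. \tag{3.145}$$
The $;j$ subscript indicates differentiation with respect to $q^j$. The differential
$d\mathbf{V}$ becomes
$$d\mathbf{V} = \frac{\partial\mathbf{V}}{\partial q^j}\,dq^j = \left[V^i_{;j}\,dq^j\right]\boldsymbol{\varepsilon}_i. \tag{3.146}$$
A comparison with Eq. 3.126 or 3.135 shows that the quantity in square
brackets is the $i$th contravariant component of a vector. Since $dq^j$ is the $j$th
contravariant component of a vector (again, Eq. 3.126), $V^i_{;j}$ must be the $ij$th
component of a (mixed) second-rank tensor (quotient rule). The covariant
derivatives of the contravariant components of a vector form a mixed second-
rank tensor, $V^i_{;j}$.
Since the Christoffel symbols vanish in cartesian coordinates, the covariant
derivative and the ordinary partial derivative coincide
$$\frac{\partial V^i}{\partial q^j} = V^i_{;j} \quad\text{(cartesian coordinates)}. \tag{3.147}$$
The covariant derivative of a covariant vector $V_i$ is given by (Exercise 3.8.8)
$$V_{i;j} = \frac{\partial V_i}{\partial q^j} - V_k\Gamma^k_{ij}. \tag{3.148}$$
Like $V^i_{;j}$, $V_{i;j}$ is a second-rank tensor.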
The physical importance of the covariant derivative is that
A consistent replacement of regular partial derivatives by covariant derivatives
carries the laws of physics (in component form) from flat space time into the curved
(Riemannian) space time of general relativity. Indeed, this substitution may be taken
as a mathematical statement of Einstein's principle of equivalence.2
The Christoffel Symbols as Derivatives of the
Metric Tensor
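It is often convenient to have an explicit expression for the Christoffel
symbols in terms of derivatives of the metric tensor. As an initial step, we define
the Christoffel symbol of the first kind $[ij, k]$ by
$$[ij, k] \equiv g_{mk}\Gamma^m_{ij}. \tag{3.149}$$
This $[ij, k]$ is not a third-rank tensor. From Eq. 3.140b
$$[ij, k] = g_{mk}\,\boldsymbol{\varepsilon}^m\cdot\frac{\partial\boldsymbol{\varepsilon}_i}{\partial q^j} = \boldsymbol{\varepsilon}_k\cdot\frac{\partial\boldsymbol{\varepsilon}_i}{\partial q^j}. \tag{3.150}$$
Now we differentiate $g_{ij} = \boldsymbol{\varepsilon}_i\cdot\boldsymbol{\varepsilon}_j$, Eq. 3.131:
$$\frac{\partial g_{ij}}{\partial q^k} = \frac{\partial\boldsymbol{\varepsilon}_i}{\partial q^k}\cdot\boldsymbol{\varepsilon}_j + \boldsymbol{\varepsilon}_i\cdot\frac{\partial\boldsymbol{\varepsilon}_j}{\partial q^k} = [ik, j] + [jk, i] \tag{3.151}$$
by Eq. 3.150.
Then
$$[ij, k] = \frac{1}{2}\left\{\frac{\partial g_{ik}}{\partial q^j} + \frac{\partial g_{jk}}{\partial q^i} - \frac{\partial g_{ij}}{\partial q^k}\right\} \tag{3.152}$$
and
$$\Gamma^s_{ij} = g^{ks}[ij, k] = \frac{1}{2}\,g^{ks}\left\{\frac{\partial g_{ik}}{\partial q^j} + \frac{\partial g_{jk}}{\partial q^i} - \frac{\partial g_{ij}}{\partial q^k}\right\}. \tag{3.153}$$
These Christoffel symbols and the covariant derivatives are applied in the next
section.
2 Misner, C. W., K. S. Thorne, and J. A. Wheeler, Gravitation. San Francisco:
W. H. Freeman (1973), p. 387.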
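As an illustration of Eq. 3.153, the Christoffel symbols of spherical polar coordinates can be estimated from the metric alone. This is a sketch under stated assumptions: the metric is taken to be the diagonal spherical polar one, so the inverse metric is simply the reciprocal of each diagonal element; indices are numbered 0, 1, 2 for r, theta, phi; derivatives are approximated by central differences; and all function names are hypothetical.

```python
import math

def metric_diag(q):
    """Diagonal elements g_rr, g_theta_theta, g_phi_phi at q = (r, theta, phi)."""
    r, theta, _ = q
    return [1.0, r**2, (r * math.sin(theta))**2]

def g(q, i, j):
    """Covariant metric element g_ij (assumed diagonal)."""
    return metric_diag(q)[i] if i == j else 0.0

def dg(q, i, j, k, h=1e-5):
    """Central-difference estimate of d g_ij / d q^k."""
    q_plus, q_minus = list(q), list(q)
    q_plus[k] += h
    q_minus[k] -= h
    return (g(q_plus, i, j) - g(q_minus, i, j)) / (2 * h)

def christoffel(q, s, i, j):
    """Gamma^s_ij from Eq. 3.153 for a diagonal metric.

    Because g^{ks} is diagonal here, only k = s survives in the sum over k.
    """
    bracket = dg(q, i, s, j) + dg(q, j, s, i) - dg(q, i, j, s)
    return 0.5 * bracket / g(q, s, s)

q = (2.0, 0.7, 0.3)
print(christoffel(q, 0, 1, 1))  # Gamma^r_{theta theta}, analytically -r
print(christoffel(q, 1, 0, 1))  # Gamma^theta_{r theta}, analytically 1/r
print(christoffel(q, 2, 1, 2))  # Gamma^phi_{theta phi}, analytically cot(theta)
```

For a nondiagonal metric the division by `g(q, s, s)` would have to be replaced by a contraction with the full inverse matrix.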
EXERCISES
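3.8.1 Equations 3.128 and 3.129 use the scale factor $h_i$, citing Exercise 2.2.3. In Section
2.2 we had restricted ourselves to orthogonal coordinate systems, yet Eq. 3.128
holds for nonorthogonal systems. Justify the use of Eq. 3.128 for nonorthogonal
systems.
3.8.2 (a) Show that $\boldsymbol{\varepsilon}^i\cdot\boldsymbol{\varepsilon}_j = \delta^i_j$.
(b) From the result of part (a) show that
$$F^i = \mathbf{F}\cdot\boldsymbol{\varepsilon}^i \quad\text{and}\quad F_i = \mathbf{F}\cdot\boldsymbol{\varepsilon}_i.$$
3.8.3 For the special case of three-dimensional space ($\boldsymbol{\varepsilon}_1$, $\boldsymbol{\varepsilon}_2$, $\boldsymbol{\varepsilon}_3$ defining a right-handed
coordinate system, not necessarily orthogonal) show that
$$\boldsymbol{\varepsilon}^i = \frac{\boldsymbol{\varepsilon}_j\times\boldsymbol{\varepsilon}_k}{\boldsymbol{\varepsilon}_j\times\boldsymbol{\varepsilon}_k\cdot\boldsymbol{\varepsilon}_i}, \quad i, j, k = 1, 2, 3 \text{ and cyclic permutations.}$$
Note. These contravariant basis vectors, $\boldsymbol{\varepsilon}^i$, define the reciprocal lattice space
of Sections 1.5 and 4.4.
3.8.4 Prove that the contravariant metric tensor is given by
$$g^{ij} = \boldsymbol{\varepsilon}^i\cdot\boldsymbol{\varepsilon}^j.$$
3.8.4A If the covariant vectors $\boldsymbol{\varepsilon}_i$ are orthogonal, show that
(a) $g_{ij}$ is diagonal,
(b) $g^{ii} = 1/g_{ii}$ (no summation),
(c) $|\boldsymbol{\varepsilon}^i| = 1/|\boldsymbol{\varepsilon}_i|$.
3.8.5 Derive the covariant and contravariant metric tensors for circular cylindrical
coordinates.
3.8.6 Transform the right-hand side of Eq. 3.138
$$\nabla\psi = \frac{\partial\psi}{\partial q^i}\,\boldsymbol{\varepsilon}^i$$
into the $\mathbf{e}_i$ basis and verify that this expression agrees with the gradient developed
in Section 2.2 (for orthogonal coordinates).
3.8.7 Evaluate $\partial\boldsymbol{\varepsilon}_i/\partial q^j$ for the spherical polar coordinates, and from these results
calculate the spherical polar coordinate $\Gamma^k_{ij}$.
Note. Exercise 2.5.1 offers a way of calculating the needed partial derivatives.
Remember
$$\boldsymbol{\varepsilon}_1 = \mathbf{r}_0 \quad\text{but}\quad \boldsymbol{\varepsilon}_2 = r\,\boldsymbol{\theta}_0.$$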
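3.8.8 Show that the covariant derivative of a covariant vector is given by
$$V_{i;j} = \frac{\partial V_i}{\partial q^j} - V_k\Gamma^k_{ij}.$$
Hint. Differentiate
$$\boldsymbol{\varepsilon}^i\cdot\boldsymbol{\varepsilon}_j = \delta^i_j.$$
3.8.9 Verify that $V_{i;j} = g_{ik}V^k_{;j}$ by showing that
$$\frac{\partial V_i}{\partial q^j} - V_s\Gamma^s_{ij} = g_{ik}\left[\frac{\partial V^k}{\partial q^j} + V^m\Gamma^k_{mj}\right].$$
3.8.10 From the circular cylindrical metric tensor, $g_{ij}$, calculate the $\Gamma^k_{ij}$ for circular
cylindrical coordinates.
Note. There are only three nonvanishing $\Gamma$'s.
3.8.11 Using the $\Gamma^k_{ij}$ from Exercise 3.8.10, write out the covariant derivatives $V^i_{;j}$ of
a vector $\mathbf{V}$ in circular cylindrical coordinates.
3.8.12 A triclinic crystal is described, using an oblique coordinate system. The three
covariant base vectors are
$$\boldsymbol{\varepsilon}_2 = 0.4\,\mathbf{i} + 1.6\,\mathbf{j}$$
and
$$\boldsymbol{\varepsilon}_3 = 0.2\,\mathbf{i} + 0.3\,\mathbf{j} + 1.0\,\mathbf{k}.$$
(a) Calculate the elements of the covariant metric tensor $g_{ij}$.
(b) Calculate the Christoffel three index symbols, $\Gamma^k_{ij}$. (This is a "by inspection"
calculation.)
(c) From the cross-product form of Exercise 3.8.3 calculate the contravariant
base vector $\boldsymbol{\varepsilon}^3$.
(d) Using the explicit forms of $\boldsymbol{\varepsilon}^3$ and $\boldsymbol{\varepsilon}_i$, verify that $\boldsymbol{\varepsilon}^3\cdot\boldsymbol{\varepsilon}_i = \delta^3_i$.
Note. If it were needed, the contravariant metric tensor could be determined
by finding the inverse of $g_{ij}$ or by finding the $\boldsymbol{\varepsilon}^i$ and using $g^{ij} = \boldsymbol{\varepsilon}^i\cdot\boldsymbol{\varepsilon}^j$.
3.8.13 Verify that
$$[ij, k] = \frac{1}{2}\left\{\frac{\partial g_{ik}}{\partial q^j} + \frac{\partial g_{jk}}{\partial q^i} - \frac{\partial g_{ij}}{\partial q^k}\right\}.$$
Hint. Substitute Eq. 3.151 into the right-hand side and show that an identity
results.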
3.9 TENSOR DIFFERENTIAL OPERATIONS
In this section the covariant derivative of Section 3.8 is applied to rederive
the vector differential operations of Section 2.2 in general tensor form.
Divergence
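Replacing the partial derivative by the covariant derivative, we take the
divergence to be
$$\nabla\cdot\mathbf{V} = V^i_{;i} = \frac{\partial V^i}{\partial q^i} + V^k\Gamma^i_{ik}. \tag{3.154}$$
Expressing $\Gamma^i_{ik}$ by Eq. 3.153, we have
$$\Gamma^i_{ik} = \frac{1}{2}\,g^{im}\left\{\frac{\partial g_{im}}{\partial q^k} + \frac{\partial g_{km}}{\partial q^i} - \frac{\partial g_{ik}}{\partial q^m}\right\}. \tag{3.155}$$
When contracted with $g^{im}$ the last two terms in the curly bracket cancel, since
$$g^{im}\frac{\partial g_{km}}{\partial q^i} = g^{mi}\frac{\partial g_{ki}}{\partial q^m} = g^{im}\frac{\partial g_{ik}}{\partial q^m}; \tag{3.156}$$
then
$$\Gamma^i_{ik} = \frac{1}{2}\,g^{im}\frac{\partial g_{im}}{\partial q^k}. \tag{3.157}$$
From the theory of determinants, Section 4.1,
$$\frac{\partial g}{\partial q^k} = g\,g^{im}\frac{\partial g_{im}}{\partial q^k}, \tag{3.158}$$
where $g$ is the determinant of the metric, $g = \det(g_{ij})$. Substituting this result
into Eq. 3.157, we obtain
$$\Gamma^i_{ik} = \frac{1}{2g}\frac{\partial g}{\partial q^k} = \frac{1}{g^{1/2}}\frac{\partial g^{1/2}}{\partial q^k}. \tag{3.159}$$
This yields
$$\nabla\cdot\mathbf{V} = V^i_{;i} = \frac{1}{g^{1/2}}\frac{\partial}{\partial q^k}\left(g^{1/2}V^k\right). \tag{3.160}$$
To compare this result with Eq. 2.17, note that $h_1h_2h_3 = g^{1/2}$ and $V^i$
(contravariant coefficient of $\boldsymbol{\varepsilon}_i$) $= V_i/h_i$ (no summation), where $V_i$ is the Section 2.2
coefficient of $\mathbf{e}_i$.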
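A quick numerical check of the divergence formula, Eq. 3.160, is sketched below for spherical polar coordinates, where $g^{1/2} = r^2\sin\theta$. The assumptions are that the field is supplied through its contravariant components $V^k$ (coefficients of the nonunit basis vectors), that derivatives are taken by central differences, and that the function names are illustrative only.

```python
import math

def sqrt_g(q):
    """Square root of the metric determinant for q = (r, theta, phi)."""
    r, theta, _ = q
    return r**2 * math.sin(theta)

def divergence(contravariant_field, q, h=1e-5):
    """Evaluate (1/sqrt g) d/dq^k (sqrt g V^k), Eq. 3.160, by central differences."""
    total = 0.0
    for k in range(3):
        q_plus, q_minus = list(q), list(q)
        q_plus[k] += h
        q_minus[k] -= h
        upper = sqrt_g(q_plus) * contravariant_field(q_plus)[k]
        lower = sqrt_g(q_minus) * contravariant_field(q_minus)[k]
        total += (upper - lower) / (2 * h)
    return total / sqrt_g(q)

def radial_field(q):
    """The position vector field V = r e_r; since eps_r = e_r, V^r = r."""
    return (q[0], 0.0, 0.0)

div_value = divergence(radial_field, (2.0, 0.7, 0.3))
print(div_value)  # the divergence of the position vector is 3 analytically
```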
Laplacian
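In Section 2.2 replacement of the vector $\mathbf{V}$ in $\nabla\cdot\mathbf{V}$ by $\nabla\psi$ led to the Laplacian
$\nabla\cdot\nabla\psi$. Here we have a contravariant $V^i$. Using the metric tensor to create a
contravariant $\nabla\psi$, we make the substitution
$$V^i \to g^{ik}\frac{\partial\psi}{\partial q^k}.$$
Then the Laplacian $\nabla\cdot\nabla\psi$ becomes
$$\nabla\cdot\nabla\psi = \frac{1}{g^{1/2}}\frac{\partial}{\partial q^i}\left(g^{1/2}g^{ik}\frac{\partial\psi}{\partial q^k}\right). \tag{3.161}$$
For the orthogonal systems of Section 2.2 the metric tensor is diagonal and the
contravariant $g^{ii}$ becomes
$$g^{ii} = (h_i)^{-2} \quad\text{(no summation)}.$$
Equation 3.161 reduces to
$$\nabla\cdot\nabla\psi = \frac{1}{h_1h_2h_3}\frac{\partial}{\partial q^i}\left(\frac{h_1h_2h_3}{h_i^2}\frac{\partial\psi}{\partial q^i}\right)$$
in agreement with Eq. 2.18a.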
Curl
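The difference of derivatives that appears in the curl (Eq. 2.21) will be
written
$$\frac{\partial V_i}{\partial q^j} - \frac{\partial V_j}{\partial q^i}.$$
Again, remember that the components $V_i$ here are coefficients of the
contravariant (nonunit) base vectors $\boldsymbol{\varepsilon}^i$. The $V_i$ of Section 2.2 are coefficients of unit
vectors $\mathbf{e}_i$. Adding and subtracting, we obtain
$$\frac{\partial V_i}{\partial q^j} - \frac{\partial V_j}{\partial q^i} = \frac{\partial V_i}{\partial q^j} - V_k\Gamma^k_{ij} - \frac{\partial V_j}{\partial q^i} + V_k\Gamma^k_{ji} = V_{i;j} - V_{j;i},$$
using the symmetry of the Christoffel symbols, $\Gamma^k_{ij} = \Gamma^k_{ji}$.
The characteristic difference of derivatives of the curl becomes a difference of
covariant derivatives and therefore is a second-rank tensor (covariant in both
indices). As emphasized in Section 3.4, the special vector form of the curl exists
only in three-dimensional space.
From Eq. 3.153 it is clear that all the Christoffel three index symbols vanish
in Minkowski space ($g_{\lambda\mu} = \delta_{\lambda\mu}$) and in the real space-time of special relativity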
with
Here
$x_0 = ct$, $x_1 = x$, $x_2 = y$, and $x_3 = z$.
This completes the development of the differential operators in general
tensor form. (The gradient was given in Section 3.8.) In addition to the fields of
elasticity and electromagnetism, these differential forms find application in
mechanics (Lagrangian mechanics, Hamiltonian mechanics, and the Euler
equations for rotation of a rigid body); fluid mechanics; and perhaps most important
of all, in the curved space-time of modern theories of gravity.
EXERCISES
3.9.1 Verify Eq. 3.158
$$\frac{\partial g}{\partial q^k} = g\,g^{im}\frac{\partial g_{im}}{\partial q^k}$$
for the specific case of spherical polar coordinates.
3.9.2 Starting with the divergence in tensor notation, Eq. 3.160, develop the divergence
of a vector in spherical polar coordinates, Eq. 2.45.
3.9.3 The covariant vector $A_i$ is the gradient of a scalar. Show that the difference of
covariant derivatives $A_{i;j} - A_{j;i}$ vanishes.
REFERENCES
Heitler, W., The Quantum Theory of Radiation, 2nd ed. Oxford: Oxford University Press
(1947). Reprinted, New York: Dover (1983).
Jeffreys, Harold, Cartesian Tensors. Cambridge: Cambridge University Press (1952).
This is an excellent discussion of cartesian tensors and their application to a wide variety
of fields of classical physics.
Lawden, Derek F., An Introduction to Tensor Calculus, Relativity and Cosmology, 3rd ed.
New York: Wiley (1982).
Misner, C. W., Thorne, K. S., and Wheeler, J. A., Gravitation. San Francisco: W. H.
Freeman (1973), p. 387.
Moller, C., The Theory of Relativity. Oxford: Oxford University Press (1955). Reprinted
(1972).
Most texts on general relativity include a discussion of tensor analysis. Chapter 4
develops tensor calculus, including the topic of dual tensors. The extension to
noncartesian systems, as required by general relativity, is presented in Chapter 9.
Panofsky, W. K. H., and M. Phillips, Classical Electricity and Magnetism, 2nd ed.
Reading, Mass.: Addison-Wesley (1962).
The Lorentz covariance of Maxwell's equations is developed for both vacuum and
material media. Panofsky and Phillips use contravariant and covariant tensors rather
than Minkowski space. Discussions using Minkowski space are given by Heitler and
Stratton.
Sokolnikoff, I. S., Tensor Analysis: Theory and Applications, 2nd ed. New York: Wiley
(1964).
Particularly useful for its extension of tensor analysis to non-Euclidean geometries.
Stratton, J. A., Electromagnetic Theory. New York: McGraw-Hill (1941).
Weinberg, S., Gravitation and Cosmology: Principles and Applications of the General Theory
of Relativity. New York: Wiley (1972).
This book and the one by Misner, Thorne, and Wheeler are the two leading texts on
general relativity and cosmology (with tensors in noncartesian space).
4 DETERMINANTS,
MATRICES, AND
GROUP THEORY
"Disciplined judgement about what is neat
and symmetrical and elegant has time and
time again proved an excellent guide to
how nature works."
Murray Gell-Mann
4.1 DETERMINANTS
We begin our study of matrices by summarizing some properties of
determinants, partly because determinants are useful in matrix analysis and partly
to illustrate, by way of contrast, what matrices are not. The concept of
"determinant" and the notation were introduced by Leibnitz.
Properties
A determinant is (1) a square array of numbers or functions that (2) may be
combined together according to the rule that follows. We have
$$D = \begin{vmatrix} a_1 & b_1 & c_1 & \cdots \\ a_2 & b_2 & c_2 & \cdots \\ a_3 & b_3 & c_3 & \cdots \\ \vdots & \vdots & \vdots & \end{vmatrix}. \tag{4.1}$$
The number of columns (and of rows) in the array is sometimes called the
order of the determinant. In terms of its elements, $a_i$, $b_j$, and so on, the value of
the determinant $D$ is
$$D = \sum_{i,j,k,\ldots}\varepsilon_{ijk\cdots}\,a_i b_j c_k\cdots, \tag{4.2}$$
where $\varepsilon_{ijk\cdots}$, analogous to the Levi-Civita symbol of Section 3.4, is $+1$ for even
permutations¹ of $(1, 2, 3, \ldots, n)$, $-1$ for odd permutations, and zero if any
index is repeated.
1 In a linear array $abcd\ldots$, any single, simple transposition of adjacent
elements yields an odd permutation of the original array: $abcd \to bacd$. Two
such transpositions yield an even permutation. In general, an odd number of
such interchanges of adjacent elements results in an odd permutation; an even
number of such transpositions yields an even permutation.
Specifically, for the third-order determinant,
$$D = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}. \tag{4.3}$$
Equation 4.2 leads to
$$D = +a_1b_2c_3 - a_1b_3c_2 + a_2b_3c_1 - a_2b_1c_3 + a_3b_1c_2 - a_3b_2c_1, \tag{4.4}$$
with six terms in the sum.
The third-order determinant, then, is this particular linear combination of
products. Each product contains one and only one element from each row and
from each column. Each product is added in if the order represents an even
permutation of rows (the columns being in $a$, $b$, $c$ or 1, 2, 3 order) and
subtracted if we have an odd permutation. Equation 4.3 may be considered
shorthand notation for Eq. 4.4. The number of terms in the sum (Eq. 4.2) is 24
for a fourth-order determinant, $n!$ for an $n$th-order determinant. Because of
the appearance of the negative signs in Eq. 4.4 (and possibly in the individual
elements, $a_i$, $b_j$, ..., as well), there may be considerable cancellation. It is
quite possible that a determinant of large numbers will have a very small value.
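The permutation sum of Eq. 4.2 can be written out directly. The sketch below is an illustration only, with hypothetical function names; it stores the array as a list of rows, so `m[row][col]` plays the role of the element in a given row of column $a$, $b$, $c$, and so on, and it computes the sign by counting inversions.

```python
from itertools import permutations

def permutation_sign(p):
    """Return +1 for an even permutation of 0..n-1 and -1 for an odd one."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def det_by_permutations(m):
    """Sum of signed products, one element from each row and each column (Eq. 4.2)."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        term = permutation_sign(p)
        for col, row in enumerate(p):
            term *= m[row][col]
        total += term
    return total

print(det_by_permutations([[3, 2, 1], [2, 3, 1], [1, 1, 4]]))
```

The loop visits all $n!$ permutations, which is why this form is useful for definitions and small cases rather than for numerical work.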
Several useful properties of the $n$th-order determinants follow from Eq. 4.2.
Again, to be specific, Eq. 4.4 for third-order determinants is used to illustrate
these properties.
Laplacian Development by Minors
Equation 4.4 may be written
$$D = a_1(b_2c_3 - b_3c_2) - a_2(b_1c_3 - b_3c_1) + a_3(b_1c_2 - b_2c_1)
= a_1\begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix} - a_2\begin{vmatrix} b_1 & c_1 \\ b_3 & c_3 \end{vmatrix} + a_3\begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix}. \tag{4.5}$$
In general, the $n$th-order determinant may be expanded as a linear combination
of the products of the elements of any row (or any column) and the $(n - 1)$-
order determinants formed by striking out the row and column of the original
determinant in which the element appears. This reduced array ($2 \times 2$ in this
specific example) is called a "minor." If the element is in the $i$th row and the
$j$th column, the sign associated with the product is $(-1)^{i+j}$. The minor with
the sign $(-1)^{i+j}$ is called the "cofactor." If $M_{ij}$ is used to designate the minor
formed by omitting the $i$th row and the $j$th column and $C_{ij}$ is the corresponding
cofactor, Eq. 4.5 becomes
$$D = \sum_i (-1)^{i+j}a_{ij}M_{ij} = \sum_i a_{ij}C_{ij}.$$
In this case, expanding down the first column, we have $j = 1$ and the summation
over $i$.
This Laplace expansion may be used to advantage in the evaluation of
high-order determinants in which a lot of the elements are zero. For example,
to find the value of the determinant
$$D = \begin{vmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{vmatrix}, \tag{4.6}$$
we expand across the top row to obtain
$$D = (-1)^{1+2}\cdot(1)\begin{vmatrix} -1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{vmatrix}. \tag{4.7}$$
Again, expanding across the top row, we get
$$D = (-1)\cdot(-1)^{1+1}\cdot(-1)\begin{vmatrix} 0 & 1 \\ -1 & 0 \end{vmatrix} = \begin{vmatrix} 0 & 1 \\ -1 & 0 \end{vmatrix} = 1. \tag{4.8}$$
This determinant $D$ (Eq. 4.6) is formed from one of the Dirac matrices appearing
in Dirac's relativistic electron theory.
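The expansion by minors lends itself to a short recursive routine. The following sketch expands across the top row, skips zero elements (which is where the method saves work), and reproduces the value of Eq. 4.6; the names `minor` and `det_laplace` are illustrative, not taken from the text.

```python
def minor(m, row, col):
    """Array left after striking out the given row and column (0-based indices)."""
    return [r[:col] + r[col + 1:] for k, r in enumerate(m) if k != row]

def det_laplace(m):
    """Determinant by Laplace expansion across the top row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, element in enumerate(m[0]):
        if element != 0:  # zero elements contribute nothing to the expansion
            # With 0-based j the sign (-1)**j equals (-1)**(1 + j_one_based).
            total += (-1) ** j * element * det_laplace(minor(m, 0, j))
    return total

dirac_like = [[0, 1, 0, 0],
              [-1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, -1, 0]]
print(det_laplace(dirac_like))
```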
Antisymmetry
The determinant changes sign if any two rows are interchanged or if any
two columns are interchanged. This follows from the even-odd character of
the Levi-Civita $\varepsilon$ in Eq. 4.2 or explicitly from the form of Eqs. 4.3 and 4.4.²
This property was used in Section 3.4 to develop a totally antisymmetric
linear combination. It is also frequently used in quantum mechanics in the
construction of a many particle wave function that, in accordance with the
Pauli exclusion principle, will be antisymmetric under the interchange of any
two identical spin $\tfrac{1}{2}$ particles (electrons, protons, neutrons, etc.).
As a special case of antisymmetry, any determinant with two rows equal or
two columns equal equals zero.
If each element in a row or each element in a column is zero, the determinant
is equal to zero.
If each element in a row or each element in a column is multiplied by a
constant, the determinant is multiplied by that constant.
The value of a determinant is unchanged if a multiple of one row is added
(column by column) to another row or if a multiple of one column is added
(row by row) to another column.
We have
$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 + kb_1 & b_1 & c_1 \\ a_2 + kb_2 & b_2 & c_2 \\ a_3 + kb_3 & b_3 & c_3 \end{vmatrix}. \tag{4.9}$$
2 The sign reversal is reasonably obvious for the interchange of two adjacent
rows (or columns), this clearly being an odd permutation. The reader may
wish to show that the interchange of any two rows is still an odd permutation.
Using the Laplace development on the right-hand side, we obtain
$$\begin{vmatrix} a_1 + kb_1 & b_1 & c_1 \\ a_2 + kb_2 & b_2 & c_2 \\ a_3 + kb_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} + k\begin{vmatrix} b_1 & b_1 & c_1 \\ b_2 & b_2 & c_2 \\ b_3 & b_3 & c_3 \end{vmatrix}; \tag{4.10}$$
then by the property of antisymmetry the second determinant on the right-hand
side of Eq. 4.10 vanishes, verifying Eq. 4.9.
As a special case, a determinant is equal to zero if any two rows are
proportional or any two columns are proportional.
Some useful relations involving determinants of matrices appear in the
exercises of Sections 4.2 and 4.5.
Solution of a Set of Homogeneous Equations
One of the major applications of determinants is in the establishment of a
condition for the existence of a nontrivial solution for a set of linear
homogeneous algebraic equations. Suppose we have three homogeneous equations
with three unknowns (or $n$ equations with $n$ unknowns)
$$\begin{aligned} a_1x + b_1y + c_1z &= 0, \\ a_2x + b_2y + c_2z &= 0, \\ a_3x + b_3y + c_3z &= 0. \end{aligned} \tag{4.11}$$
The problem is to determine whether any solution, apart from the trivial one
$x = 0$, $y = 0$, $z = 0$, exists.
By forming the determinant of the coefficients of Eq. 4.11 and then
multiplying by $x$,
$$x\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1x & b_1 & c_1 \\ a_2x & b_2 & c_2 \\ a_3x & b_3 & c_3 \end{vmatrix}.$$
Now, adding to the first column $y$ times the second column and $z$ times the
third column, we get
$$x\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1x + b_1y + c_1z & b_1 & c_1 \\ a_2x + b_2y + c_2z & b_2 & c_2 \\ a_3x + b_3y + c_3z & b_3 & c_3 \end{vmatrix}. \tag{4.12}$$
This step follows from Eq. 4.9, but by Eq. 4.11 each element of the first column
vanishes. Then
$$x\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} 0 & b_1 & c_1 \\ 0 & b_2 & c_2 \\ 0 & b_3 & c_3 \end{vmatrix} = 0. \tag{4.13}$$
Therefore $x$ (and $y$ and $z$) must be zero unless the determinant of the coefficients
vanishes. Conversely, we can show that if the determinant of the coefficients
vanishes, a nontrivial solution does indeed exist. This is used in Section 8.6 to
establish the linear dependence or independence of a set of functions.
Solution of a Set of Nonhomogeneous Equations
If our linear algebraic equations are nonhomogeneous, that is, if the zeros
on the right-hand side of Eq. 4.11 are replaced by $d_1$, $d_2$, and $d_3$, respectively,
then from Eq. 4.12 we obtain,³ in place of Eq. 4.13,
$$x = \frac{\begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}. \tag{4.14}$$
If the determinant of the coefficients (the denominator) vanishes, the
nonhomogeneous set of equations has no solution, unless the numerators also
vanish. In this case solutions may exist but they are not unique (see Exercise
4.1.3 for a specific example).
For numerical work, this determinant solution, Eq. 4.14, is exceedingly
unwieldy. The determinant may involve large numbers with alternate signs,
and in the subtraction of two large numbers the relative error may soar to a
point that makes the result worthless. Also, although the determinant method
is illustrated here with 3 equations and 3 unknowns, we might easily have 20
equations with 20 unknowns. From the definition of determinant (Eq. 4.2),
our $n$th-order determinant will have $n!$ terms. If we were to ask a high-speed
electronic computer to compute these $n!$ terms at the rate of one each
microsecond, the computer would still take 20! microseconds or 77,000 years. There
must be a better way.
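The running-time estimate is simple arithmetic and can be reproduced in a few lines. The sketch assumes one term per microsecond, as in the text, and a year of 365.25 days; the variable names are illustrative.

```python
import math

terms = math.factorial(20)            # number of terms in a 20th-order determinant
seconds = terms * 1e-6                # one term per microsecond
years = seconds / (365.25 * 24 * 3600)
print(f"{terms} terms, about {years:,.0f} years")
```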
In fact, there are better ways. One of the best is a straightforward elimination
process often called Gauss elimination. To illustrate this technique, consider
the following set of equations.
EXAMPLE 4.1.4 Gauss Elimination
Solve
$$\begin{aligned} 3x + 2y + z &= 11 \\ 2x + 3y + z &= 13 \\ x + y + 4z &= 12. \end{aligned} \tag{4.15}$$
3 Exercise 1.5.13 gives the vector analog of Eq. 4.14.
For convenience and for the optimum numerical accuracy, the equations
are rearranged so that the largest coefficients run along the main diagonal
(upper left to lower right). This has already been done in the preceding set.
The Gauss technique is to use the first equation to eliminate the first
unknown, $x$, from the remaining equations. Then the (new) second equation is
used to eliminate $y$ from the last equation. In general, we work down through
the set of equations, and then, with one unknown determined, we work back
up to solve for each of the other unknowns in succession.
Dividing each row by its initial coefficient, we see that Eqs. 4.15 become
$$\begin{aligned} x + 0.6667y + 0.3333z &= 3.6667 \\ x + 1.5000y + 0.5000z &= 6.5000 \\ x + 1.0000y + 4.0000z &= 12.0000. \end{aligned} \tag{4.16}$$
Now, using the first equation, we eliminate $x$ from the second and third:
$$\begin{aligned} x + 0.6667y + 0.3333z &= 3.6667 \\ 0.8333y + 0.1667z &= 2.8333 \\ 0.3333y + 3.6667z &= 8.3333, \end{aligned} \tag{4.17}$$
and
$$\begin{aligned} x + 0.6667y + 0.3333z &= 3.6667 \\ y + 0.2000z &= 3.4000 \\ y + 11.0000z &= 25.0000. \end{aligned} \tag{4.18}$$
Repeating the technique, we use the new second equation to eliminate $y$
from the third equation:
$$\begin{aligned} x + 0.6667y + 0.3333z &= 3.6667 \\ y + 0.2000z &= 3.4000 \\ 10.8000z &= 21.6000, \end{aligned} \tag{4.19}$$
or
$$z = 2.0000.$$
Finally, working back up, we get
$$y + 0.2000 \times 2.0000 = 3.4000,$$
or
$$y = 3.0000.$$
Then with $z$ and $y$ determined,
$$x + 0.6667 \times 3.0000 + 0.3333 \times 2.0000 = 3.6667,$$
and
$$x = 1.0000.$$
The technique may not seem so elegant as Eq. 4.14, but it is well adapted to
modern computing machines and is far faster than the time spent with
determinants.
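The elimination and back-substitution steps above can be sketched in code. This version is a variant rather than a transcription of the worked example: it assumes exact rational arithmetic (`fractions.Fraction`) instead of four-decimal rounding, it picks the largest available pivot in each column instead of normalizing every row first, and the function name `gauss_solve` is hypothetical.

```python
from fractions import Fraction

def gauss_solve(a, d):
    """Solve a x = d by Gauss elimination followed by back-substitution."""
    n = len(a)
    # Augmented rows [a_i1, ..., a_in | d_i] in exact arithmetic.
    rows = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(a, d)]
    for col in range(n):
        # Bring the largest remaining coefficient of this unknown onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        if rows[pivot][col] == 0:
            raise ValueError("determinant of the coefficients vanishes")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Eliminate this unknown from every equation below the pivot row.
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    # Work back up, solving for one unknown at a time.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(rows[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (rows[i][n] - known) / rows[i][i]
    return x

solution = gauss_solve([[3, 2, 1], [2, 3, 1], [1, 1, 4]], [11, 13, 12])
print([str(value) for value in solution])
```

Applied to Eq. 4.15 this returns x = 1, y = 3, z = 2 exactly, in agreement with the rounded hand calculation.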
This Gauss technique may be used to convert a determinant into triangular
form:
$$D = \begin{vmatrix} a_1 & b_1 & c_1 \\ 0 & b_2 & c_2 \\ 0 & 0 & c_3 \end{vmatrix}$$
for a third-order determinant. In this form $D = a_1b_2c_3$. For an $n$th-order
determinant the evaluation of the triangular form requires only $n - 1$
multiplications compared with the $n!$ required for the general case.
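A determinant can be evaluated this way by reducing the array to triangular form and multiplying the diagonal elements. The sketch below assumes exact rational arithmetic and tracks the sign change from each row interchange (the antisymmetry property); adding a multiple of one row to another leaves the value unchanged. The function name is illustrative.

```python
from fractions import Fraction

def det_by_elimination(m):
    """Determinant via reduction to upper triangular form."""
    n = len(m)
    a = [[Fraction(x) for x in row] for row in m]
    sign = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # a whole column of zeros below the diagonal
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign  # interchanging two rows reverses the sign
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    product = Fraction(sign)
    for i in range(n):
        product *= a[i][i]
    return product

print(det_by_elimination([[3, 2, 1], [2, 3, 1], [1, 1, 4]]))
```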
A variation of this progressive elimination is known as Gauss-Jordan
elimination. We start as with the preceding Gauss elimination, but each new
equation considered is used to eliminate a variable from all the other equations,
not just those below it. If we had used this Gauss-Jordan elimination, Eq. 4.19
would become
$$\begin{aligned} x + 0.2000z &= 1.4000 \\ y + 0.2000z &= 3.4000 \\ z &= 2.0000, \end{aligned} \tag{4.20}$$
using the second equation of Eq. 4.18 to eliminate $y$ from both the first and
third equations. Then the third equation of Eq. 4.20 is used to eliminate $z$
from the first and second, giving
$$\begin{aligned} x &= 1.0000 \\ y &= 3.0000 \\ z &= 2.0000. \end{aligned} \tag{4.21}$$
We return to this Gauss-Jordan technique in Section 4.2 for inverting matrices.
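The Gauss-Jordan variation differs from plain Gauss elimination only in that each pivot row is used to clear its unknown from every other row, so no back-substitution pass is needed. A minimal sketch, again assuming exact rational arithmetic and a hypothetical function name:

```python
from fractions import Fraction

def gauss_jordan_solve(a, d):
    """Solve a x = d by reducing the augmented array to the unit matrix."""
    n = len(a)
    rows = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(a, d)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("determinant of the coefficients vanishes")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Normalize the pivot row so the diagonal coefficient is 1.
        diagonal = rows[col][col]
        rows[col] = [x / diagonal for x in rows[col]]
        # Eliminate this unknown from all other rows, above and below.
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

print(gauss_jordan_solve([[3, 2, 1], [2, 3, 1], [1, 1, 4]], [11, 13, 12]))
```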
Another technique suitable for computer use is the Gauss-Seidel iteration
technique. Each technique has its advantages and disadvantages. The Gauss
and Gauss-Jordan methods may have accuracy problems for large
determinants. This is also a problem for matrix inversion (Section 4.2). The Gauss-
Seidel method, as an iterative method, may have convergence problems. The
IBM Scientific Subroutine Package (SSP) uses Gauss and Gauss-Jordan
techniques. The Gauss-Seidel iterative method and the Gauss and Gauss-Jordan
elimination methods are discussed in considerable detail by Ralston and Wilf
and also by Pennington.⁴
4 Ralston, A., and H. Wilf, Eds., Mathematical Methods for Digital
Computers. New York: Wiley (1960). Pennington, R. H., Introductory Computer
Methods and Numerical Analysis. New York: Macmillan (1970).
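The text names the Gauss-Seidel iteration without spelling it out, so the following is a sketch of the standard form of that method rather than of any particular library routine: each equation is solved for its diagonal unknown using the most recent values of the others, and the sweep is repeated until the values stop changing. The function name, the starting guess of zero, and the stopping tolerance are assumptions. The system of Eq. 4.15 has a symmetric, positive definite coefficient array, for which the iteration converges.

```python
def gauss_seidel(a, d, sweeps=200, tol=1e-12):
    """Iterative solution of a x = d, updating each unknown in place."""
    n = len(a)
    x = [0.0] * n
    for _ in range(sweeps):
        largest_change = 0.0
        for i in range(n):
            others = sum(a[i][j] * x[j] for j in range(n) if j != i)
            new_value = (d[i] - others) / a[i][i]
            largest_change = max(largest_change, abs(new_value - x[i]))
            x[i] = new_value  # used immediately by the following equations
        if largest_change < tol:
            break
    return x

approx = gauss_seidel([[3.0, 2.0, 1.0], [2.0, 3.0, 1.0], [1.0, 1.0, 4.0]],
                      [11.0, 13.0, 12.0])
print(approx)
```

For a general array there is no guarantee that the sweeps converge, which is the convergence problem mentioned above.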
EXERCISES
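4.1.1 Evaluate the following determinants
$$\text{(a)}\ \begin{vmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{vmatrix}, \qquad \text{(b)}\ \begin{vmatrix} 1 & 2 & 0 \\ 3 & 1 & 2 \\ 0 & 3 & 1 \end{vmatrix}, \qquad \text{(c)}\ \frac{1}{\sqrt{2}}\begin{vmatrix} 0 & \sqrt{3} & 0 & 0 \\ \sqrt{3} & 0 & 2 & 0 \\ 0 & 2 & 0 & \sqrt{3} \\ 0 & 0 & \sqrt{3} & 0 \end{vmatrix}.$$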
4.1.2 Test the set of linear homogeneous equations
$$\begin{aligned} x + 3y + 3z &= 0, \\ x - y + z &= 0, \\ 2x + y + 3z &= 0, \end{aligned}$$
to see if it possesses a nontrivial solution.
4.1.3 Given the pair of equations
x + 2y = 3,
2x + 4y = 6.
(a) Show that the determinant of the coefficients vanishes.
(b) Show that the numerator determinants (Eq. 4.14) also vanish.
(c) Find at least two solutions.
4.1.4 Express the components of $\mathbf{A}\times\mathbf{B}$ as $2 \times 2$ determinants. Show then that the dot
product $\mathbf{A}\cdot(\mathbf{A}\times\mathbf{B})$ yields a Laplacian expansion of a $3 \times 3$ determinant. Finally,
note that two rows of the $3 \times 3$ determinant are identical and hence $\mathbf{A}\cdot(\mathbf{A}\times\mathbf{B}) = 0$.
4.1.5 If $C_{ij}$ is the cofactor of element $a_{ij}$ (formed by striking out the $i$th row and $j$th
column and including a sign $(-1)^{i+j}$), show that
(a) $\sum_i a_{ij}C_{ij} = \sum_i a_{ji}C_{ji} = |A|$, where $|A|$ is the determinant with the elements $a_{ij}$,
(b) $\sum_i a_{ij}C_{ik} = \sum_i a_{ji}C_{ki} = 0$, $j \neq k$.
4.1.6 A determinant with all elements of order unity may be surprisingly small. The
Hilbert determinant $H_{ij} = (i + j - 1)^{-1}$, $i, j = 1, 2, \ldots, n$ is notorious for its
small values.
(a) Calculate the value of the Hilbert determinants of order $n$ for $n = 1$, 2, and 3.
(b) If an appropriate subroutine is available, find the Hilbert determinants of
order $n$ for $n = 4$, 5, and 6.
ANS.
n    Det(H_n)
1    1.
2    8.33333 × 10^-2
3    4.62963 × 10^-4
4    1.65344 × 10^-7
5    3.74930 × 10^-12
6    5.36730 × 10^-18
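In place of the subroutine the exercise refers to, a short exact calculation can reproduce the tabulated values. This is a sketch under the assumption that rational arithmetic is acceptable for such small orders; in floating point the same elimination would lose accuracy rapidly as $n$ grows. The function names are hypothetical.

```python
from fractions import Fraction

def hilbert(n):
    """Hilbert array with elements 1/(i + j - 1), i, j = 1, ..., n."""
    return [[Fraction(1, i + j - 1) for j in range(1, n + 1)]
            for i in range(1, n + 1)]

def det_exact(m):
    """Determinant by elimination to triangular form, in exact arithmetic."""
    a = [row[:] for row in m]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return result

for order in range(1, 7):
    print(order, f"{float(det_exact(hilbert(order))):.5e}")
```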
4.1.7 Solve the following set of linear simultaneous equations. Give the results to five
decimal places.
$$\begin{aligned}
1.0x_1 + 0.9x_2 + 0.8x_3 + 0.4x_4 + 0.1x_5 \phantom{{}+ 0.0x_6} &= 1.0 \\
0.9x_1 + 1.0x_2 + 0.8x_3 + 0.5x_4 + 0.2x_5 + 0.1x_6 &= 0.9 \\
0.8x_1 + 0.8x_2 + 1.0x_3 + 0.7x_4 + 0.4x_5 + 0.2x_6 &= 0.8 \\
0.4x_1 + 0.5x_2 + 0.7x_3 + 1.0x_4 + 0.6x_5 + 0.3x_6 &= 0.7 \\
0.1x_1 + 0.2x_2 + 0.4x_3 + 0.6x_4 + 1.0x_5 + 0.5x_6 &= 0.6 \\
0.1x_2 + 0.2x_3 + 0.3x_4 + 0.5x_5 + 1.0x_6 &= 0.5
\end{aligned}$$
Note. These equations may also be solved by matrix inversion, Section 4.2.
4.2 MATRICES
Matrix analysis is essentially a theory of linear operations (linear algebra).
Suppose, for instance, that a linear operator A is operating in a space that is
described by the usual basis vectors i, j, and k. A operating on i transforms it
into some linear combination of the basis vectors:
$$A\mathbf{i} = \mathbf{i}a_{11} + \mathbf{j}a_{21} + \mathbf{k}a_{31}.$$
(In Section 4.3 the coefficients will be developed in detail for A, a rotation
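operator.) Similarly, the effect of A on j is given by a linear combination
$\mathbf{i}a_{12} + \mathbf{j}a_{22} + \mathbf{k}a_{32}$, and on k by $\mathbf{i}a_{13} + \mathbf{j}a_{23} + \mathbf{k}a_{33}$. Then the effect of A on a
vector u is to produce a vector v,
$$\mathbf{v} = A\mathbf{u}.$$
Expanding, we obtain
$$\begin{aligned}
\mathbf{i}v_1 + \mathbf{j}v_2 + \mathbf{k}v_3 ={}& \mathbf{i}(a_{11}u_1 + a_{12}u_2 + a_{13}u_3) \\
&+ \mathbf{j}(a_{21}u_1 + a_{22}u_2 + a_{23}u_3) \\
&+ \mathbf{k}(a_{31}u_1 + a_{32}u_2 + a_{33}u_3).
\end{aligned}$$
Equating the i components, we have
$$v_1 = \sum_{j=1}^{3} a_{1j}u_j$$
or, in general,
$$v_i = \sum_{j=1}^{3} a_{ij}u_j, \quad i = 1, 2, 3.$$ (4.22)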
We label the array of elements $a_{ij}$ a matrix and take the summation of products
in Eq. 4.22 as a definition of matrix multiplication (inner product). Before
passing to formal definitions, the reader should note that operator A is
described or characterized by its effect on the basis vectors. The matrix elements
$a_{ij}$ constitute a representation of the operator, a representation that depends on
the choice of the basis.
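As a minimal sketch of Eq. 4.22, the following Python function applies a matrix of elements $a_{ij}$ to the components of a vector. The function name `apply` and the sample operator (a 90° rotation about k) are illustrative assumptions, not taken from the text.

```python
def apply(a, u):
    """Return v with v[i] = sum_j a[i][j] * u[j], as in Eq. 4.22."""
    return [sum(a_ij * u_j for a_ij, u_j in zip(row, u)) for row in a]

# Hypothetical operator chosen only for illustration: a 90-degree rotation about k.
# Column j of A holds the components of A acting on the j-th basis vector.
A = [[0, -1, 0],
     [1,  0, 0],
     [0,  0, 1]]

print(apply(A, [1, 0, 0]))   # components of A i: the first column of A
print(apply(A, [2, 3, 5]))   # effect on a general vector u
```

Applying the matrix to a basis vector simply reads off a column, which is the sense in which the operator is characterized by its effect on the basis vectors.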
Basic Definitions
A matrix may be defined as a square or rectangular array of numbers or
functions that obeys certain laws. This is a perfectly logical extension of familiar
mathematical concepts. In arithmetic we deal with single numbers. In the theory
of complex variables (Chapter 6) we deal with ordered pairs of numbers,
$(1, 2) = 1 + 2i$, in which the ordering is important. We now consider numbers
(or functions) ordered in a square or rectangular array. For convenience in
later work the numbers are distinguished by two subscripts, the first indicating
the row (horizontal) and the second indicating the column (vertical) in which
the number appears. For instance, $a_{13}$ is the matrix element in the first row,
third column. Hence, if A is a matrix with m rows and n columns,
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}.$$
Perhaps the most important fact to note is that the elements $a_{ij}$ are not combined
with one another. The matrix is not a determinant. It is an ordered array of
numbers, not a single number. It makes no more sense to add or multiply all
the $a_{ij}$'s together than it does to write $1 + 2i = 3$!
The matrix A, so far just an array of numbers, has the properties we assign to
it. Literally, this means constructing a new form of mathematics. We postulate
that matrices A, B, and C, with elements $a_{ij}$, $b_{ij}$, and $c_{ij}$, respectively, combine
according to the following rules:
Equality
Matrix A = Matrix B if and only if $a_{ij} = b_{ij}$ for all values of i and j. This,
of course, requires that A and B each be m by n arrays (m rows, n columns).
Addition
A + B = C if and only if $a_{ij} + b_{ij} = c_{ij}$ for all values of i and j, the elements
combining according to the laws of ordinary algebra (or arithmetic if they are
simple numbers). This means that A + B = B + A, commutation. Also, an
associative law is satisfied: (A + B) + C = A + (B + C).
Multiplication (by a Scalar)
The multiplication of matrix A by the scalar quantity α is defined as
$$\alpha A = (\alpha A),$$
in which the elements of αA are $\alpha a_{ij}$; that is, each element of matrix A is
multiplied by the scalar factor. This is in striking contrast to the behavior of
determinants, in which the factor α multiplies only one column or one row and not
every element of the entire determinant. A consequence of this scalar
multiplication is that
$$\alpha A = A\alpha, \quad \text{commutation}.$$
Multiplication (Matrix Multiplication), Inner Product
AB = C if and only if¹ $c_{ij} = \sum_k a_{ik}b_{kj}$. (4.23)
The ij element of C is formed as a scalar product of the ith row of A with the
jth column of B (which demands that A have the same number of columns
(n) as B has rows). The dummy index k takes on all the values 1, 2, ..., n in
succession, that is,
$$c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + a_{i3}b_{3j}$$ (4.24)
for n = 3. Obviously, the dummy index k may be replaced by any other symbol
that is not already in use without altering Eq. 4.23. Perhaps the situation may
be clarified by stating that Eq. 4.23 defines the method of combining certain
matrices. This method of combination, to give it a label, is called matrix
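multiplication. To illustrate, consider two matrices
$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$ (4.25)
The 11 element of the product, $(\sigma_1\sigma_3)_{11}$, is given by the sum of the products
of elements of the first row of $\sigma_1$ with the corresponding elements of the first
column of $\sigma_3$:
$$(\sigma_1\sigma_3)_{11} = 0\cdot 1 + 1\cdot 0 = 0.$$
Continuing, we have
$$\sigma_1\sigma_3 = \begin{pmatrix} 0\cdot 1 + 1\cdot 0 & 0\cdot 0 + 1\cdot(-1) \\ 1\cdot 1 + 0\cdot 0 & 1\cdot 0 + 0\cdot(-1) \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$ (4.26)
Here
$$(\sigma_1\sigma_3)_{ij} = (\sigma_1)_{i1}(\sigma_3)_{1j} + (\sigma_1)_{i2}(\sigma_3)_{2j}.$$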
1 Some authors follow the summation convention here (compare Section 3.1).
Direct application of the definition of matrix multiplication shows that
$$\sigma_3\sigma_1 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$ (4.27)
and by Eq. 4.26
$$\sigma_3\sigma_1 = -\sigma_1\sigma_3.$$ (4.28)
Except in special cases, matrix multiplication is not commutative.²
$$AB \neq BA.$$ (4.29)
However, from the definition of matrix multiplication we can show that an
associative law holds, (AB)C = A(BC). There is also a distributive law,
A(B + C) = AB + AC.
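The rules above can be checked numerically. The sketch below implements Eq. 4.23 directly with nested Python lists; the function name `matmul` and the third matrix `c` are illustrative assumptions. It uses the two matrices of Eq. 4.25 to show the lack of commutativity and then tests the associative law on one example.

```python
def matmul(a, b):
    """c[i][j] = sum_k a[i][k] * b[k][j], as in Eq. 4.23."""
    n = len(b)
    if any(len(row) != n for row in a):
        raise ValueError("a must have as many columns as b has rows")
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(len(b[0]))]
            for i in range(len(a))]

sigma1 = [[0, 1], [1, 0]]
sigma3 = [[1, 0], [0, -1]]

p13 = matmul(sigma1, sigma3)   # [[0, -1], [1, 0]]
p31 = matmul(sigma3, sigma1)   # [[0, 1], [-1, 0]], the negative of p13
print(p13, p31)

# Associative law on one arbitrary (hypothetical) third matrix.
c = [[1, 2], [3, 4]]
print(matmul(matmul(sigma1, sigma3), c) == matmul(sigma1, matmul(sigma3, c)))
```

A single numerical example does not prove associativity, of course; it only illustrates the statement that follows from the definition.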
Direct Product
A second procedure for multiplying matrices, known as the direct tensor or
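Kronecker product, follows. If A is an m × m matrix and B an n × n matrix,
then the direct product is
$$A \otimes B = C.$$ (4.30)
C is an mn × mn matrix with elements
$$C_{\alpha\beta} = A_{ij}B_{kl},$$ (4.31)
with
$$\alpha = n(i - 1) + k, \quad \beta = n(j - 1) + l.$$
For instance, if A and B are both 2 × 2 matrices,
$$A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B \\ a_{21}B & a_{22}B \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} & a_{11}b_{12} & a_{12}b_{11} & a_{12}b_{12} \\ a_{11}b_{21} & a_{11}b_{22} & a_{12}b_{21} & a_{12}b_{22} \\ a_{21}b_{11} & a_{21}b_{12} & a_{22}b_{11} & a_{22}b_{12} \\ a_{21}b_{21} & a_{21}b_{22} & a_{22}b_{21} & a_{22}b_{22} \end{pmatrix}.$$ (4.32)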
The direct product is associative but not commutative. As an example of
the direct product, the Dirac matrices of Section 4.5 may be developed as direct
2 The reader should note that the basic definitions of equality, addition, and
multiplication are given in terms of the matrix elements, the $a_{ij}$'s, and so on.
All our matrix operations can be carried out in terms of the matrix elements.
However, Cayley (1859) showed that we can also treat a matrix as a single
algebraic operator, as in Eq. 4.29. Matrix elements and single operators each
have their advantages as will be seen in the following section. We shall use
both approaches.
3 Commutation or the lack of it is conveniently described by the commutator
bracket symbol, [A, B] = AB - BA. Equation 4.29 becomes [A, B] ≠ 0.
products of the Pauli matrices and the unit matrix. Other examples appear in
the construction of groups in group theory and in vector or Hilbert space in
quantum theory.
The direct product defined here is sometimes called the "standard" form
and denoted by ⊗. Three other types of direct products of matrices exist as
mathematical possibilities or curiosities but have little or no application in
mathematical physics.
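The index bookkeeping of Eq. 4.31 is easy to get wrong by hand, so a short sketch may help. The Python function below builds the direct product of two square matrices; the name `direct_product` and the sample matrices are illustrative assumptions. Because Python indices start at 0, the text's $\alpha = n(i - 1) + k$ becomes `n * i + k`.

```python
def direct_product(a, b):
    """Direct (Kronecker) product of an m x m matrix a and an n x n matrix b."""
    m, n = len(a), len(b)
    c = [[0] * (m * n) for _ in range(m * n)]
    for i in range(m):
        for j in range(m):
            for k in range(n):
                for l in range(n):
                    # zero-based form of alpha = n(i-1)+k, beta = n(j-1)+l
                    c[n * i + k][n * j + l] = a[i][j] * b[k][l]
    return c

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
for row in direct_product(A, B):
    print(row)
print(direct_product(A, B) == direct_product(B, A))   # False: not commutative
```

Each 2 × 2 block of the result is an element of A times the whole of B, matching the block structure of Eq. 4.32.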
Special Cases
A number of matrices are of special interest. If the matrix has one column and
n rows, it is called a column vector, $|x\rangle$ with components $x_i$, i = 1, 2, ..., n.
Similarly, if the matrix has one row and n columns, it is called a row vector,
$\langle x|$ with components $x_i$, i = 1, 2, ..., n. Clearly, if A is an n × n matrix, $|x\rangle$
an n-component column vector, and $\langle x|$ an n-component row vector,
$$A|x\rangle \quad \text{and} \quad \langle x|A$$
are defined by Eq. 4.23, whereas
$$A\langle x| \quad \text{and} \quad |x\rangle A$$
are not defined.
Clearly, the row vector $\langle x| = (x_1, x_2, \ldots, x_n)$ and the column vector $|x\rangle$
with the same components are not independent. Just as clearly they cannot be
added: $\langle x| + |x\rangle$ is not defined. In quantum theory it is convenient to consider
the column vectors $|x\rangle$ in one space and the row vectors $\langle x|$ in a different space,
a dual space.
In the remainder of this chapter we confine our attention to column vectors,
row vectors, and square matrices.
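The unit matrix 1 has elements $\delta_{ij}$, Kronecker delta, and the property that
1A = A1 = A for all A.
$$1 = \begin{pmatrix} 1 & 0 & 0 & 0 & \cdots \\ 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 1 & 0 & \cdots \\ 0 & 0 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}.$$ (4.33)
If all elements are zero, the matrix is called the null matrix and is denoted
by O. For all A
$$OA = AO = O.$$
$$O = \begin{pmatrix} 0 & 0 & 0 & \cdots \\ 0 & 0 & 0 & \cdots \\ 0 & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}.$$ (4.34)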
It should be noted that it is possible for the product of two matrices to be the
null matrix without either one being the null matrix. For example, if
$$A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 1 & 0 \\ -1 & 0 \end{pmatrix},$$
AB = O. Once more the results of ordinary algebra do not apply directly.
Diagonal Matrices
An important special type of matrix is the square matrix in which all the
nondiagonal elements are zero. Specifically, if a 3 × 3 matrix A is diagonal,
$$A = \begin{pmatrix} a_{11} & 0 & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & a_{33} \end{pmatrix}.$$
The physical interpretation of such diagonal matrices and the method of
reducing matrices to this diagonal form are considered in Section 4.6. Here
we simply note a significant property of diagonal matrices: multiplication of
diagonal matrices is commutative,
$$AB = BA, \quad \text{if A and B are each diagonal}.$$
Trace
In any square matrix the sum of the diagonal elements is called the trace.
One of its interesting and useful properties is that the trace of a product of
two matrices A and B is independent of the order of multiplication:
$$\mathrm{trace}(AB) = \mathrm{trace}(BA).$$ (4.35)
This holds even though AB ≠ BA. Equation 4.35 means that the trace of any
commutator bracket, [A, B] = AB - BA, is zero.
In Exercise 4.5.23 the operation of taking the trace selects one term out of
a sum of 16 terms. The trace will serve the same function relative to matrices
as orthogonality serves for vectors and functions.
In terms of tensors (Section 3.2) the trace is a contraction and like the
contracted second-rank tensor is a scalar (invariant).
Matrices are used extensively to represent the elements of groups (compare
Exercise 4.2.7 and Sections 4.8 to 4.12). The trace of the matrix representing
the group element is known in group theory as the character. The reason for
the special name and special attention is that while the matrices may vary the
trace or character remains invariant (compare Exercise 4.3.9). Finally, we note
that as an operator the trace is a linear operator.
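The order-independence of the trace of a product can be illustrated with a small numerical check. This is a sketch only: the helper names `matmul` and `trace` and the two sample matrices are assumptions made for the example.

```python
def matmul(a, b):
    """Matrix product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, 6]]

AB, BA = matmul(A, B), matmul(B, A)
print(AB != BA)                  # the products themselves differ
print(trace(AB), trace(BA))      # yet the traces agree

# Trace of the commutator [A, B] = AB - BA
commutator = [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]
print(trace(commutator))
```

The equality follows in general because $\sum_i \sum_k a_{ik}b_{ki}$ and $\sum_k \sum_i b_{ki}a_{ik}$ contain the same terms in a different order.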
Matrix Inversion
At the beginning of this section matrix A is introduced as the representation
of an operator that (linearly) transforms the coordinate axes. A rotation would
be one example of such a linear transformation. Now we look for the inverse
transformation $A^{-1}$ that will restore the original coordinate axes. This means,
as either a matrix or an operator equation,⁴
$$AA^{-1} = A^{-1}A = 1.$$ (4.36)
From Exercise 4.2.32
$$a^{(-1)}_{ij} = \frac{C_{ji}}{|A|},$$ (4.37)
with the assumption that the determinant of A (|A|) ≠ 0. If it is zero, we label
A singular. No inverse exists. This conclusion that we must require |A| ≠ 0 is
about the only use of Eq. 4.37. As explained at the end of Section 4.1, this
determinant form is totally unsuited for numerical work with large matrices.
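For small matrices the cofactor formula can still be written out directly, which makes the transposed indices in Eq. 4.37 explicit. The sketch below uses exact rational arithmetic from Python's `fractions` module; the helper names (`minor`, `det`, `cofactor_inverse`) and the sample symmetric 3 × 3 matrix are illustrative assumptions. The recursive Laplace expansion used here grows factorially with matrix size, which is one way to see why this form is unsuited to large matrices.

```python
from fractions import Fraction

def minor(a, i, j):
    """Matrix a with row i and column j struck out."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]

def det(a):
    """Determinant by Laplace expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))

def cofactor_inverse(a):
    """Inverse with elements C_ji / |A| (note the transposed cofactor indices)."""
    d = det(a)
    if d == 0:
        raise ValueError("singular matrix: no inverse exists")
    n = len(a)
    return [[Fraction((-1) ** (i + j) * det(minor(a, j, i)), d) for j in range(n)]
            for i in range(n)]

A = [[3, 2, 1], [2, 3, 1], [1, 1, 4]]
for row in cofactor_inverse(A):
    print([str(x) for x in row])
```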
There is a wide variety of alternative techniques. One of the best and most
commonly used is the Gauss-Jordan matrix inversion technique. The theory is
based on the results of Exercises 4.2.34 and 4.2.35, which show that there exist
matrices $M_L$ such that the product $M_L A$ will be A but with
a. one row multiplied by a constant, or
b. one row replaced by the original row minus a multiple of another row, or
c. rows interchanged.
Other matrices $M_R$ operating on the right ($AM_R$) can carry out the same
operations on the columns of A.
This means that the matrix rows and columns may be altered (by matrix
multiplication) as though we were dealing with determinants, so we can apply
the Gauss-Jordan elimination techniques of Section 4.1 to the matrix elements.
Hence there exists a matrix $M_L$ (or $M_R$) such that⁵
$$M_L A = 1.$$ (4.38)
Then $M_L = A^{-1}$. We determine $M_L$ by carrying out the identical elimination
operations on the unit matrix. Then
$$M_L 1 = M_L.$$ (4.39)
To clarify this, we consider a specific example.
EXAMPLE 4.2.1 Gauss-Jordan Matrix Inversion
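We want to invert the matrix
$$A = \begin{pmatrix} 3 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 4 \end{pmatrix}.$$ (4.40)
4 Here and throughout this chapter our matrices have finite rank. If A is an
infinite rank matrix (n × n with n → ∞), then life is more difficult. For $A^{-1}$
to be the inverse we must demand that both
$AA^{-1} = 1$ and $A^{-1}A = 1$.
One relation no longer implies the other.
5 Remember that det(A) ≠ 0.
For convenience we write A and 1 side by side and carry out the identical
operations on each:
$$\begin{pmatrix} 3 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 4 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$ (4.41)
To be systematic, we multiply each row to get $a_{k1} = 1$,
$$\begin{pmatrix} 1 & 0.6667 & 0.3333 \\ 1 & 1.5000 & 0.5000 \\ 1 & 1.0000 & 4.0000 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0.3333 & 0 & 0 \\ 0 & 0.5000 & 0 \\ 0 & 0 & 1.0000 \end{pmatrix}.$$ (4.42)
Subtracting the first row from the second and third, we obtain
$$\begin{pmatrix} 1 & 0.6667 & 0.3333 \\ 0 & 0.8333 & 0.1667 \\ 0 & 0.3333 & 3.6667 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0.3333 & 0 & 0 \\ -0.3333 & 0.5000 & 0 \\ -0.3333 & 0 & 1.0000 \end{pmatrix}.$$ (4.43)
Then we divide the second row (of both matrices) by 0.8333 and subtract 0.6667
times it from the first row, and 0.3333 times it from the third row. The results
for both matrices are
$$\begin{pmatrix} 1 & 0 & 0.2000 \\ 0 & 1 & 0.2000 \\ 0 & 0 & 3.6000 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0.6000 & -0.4000 & 0 \\ -0.4000 & 0.6000 & 0 \\ -0.2000 & -0.2000 & 1 \end{pmatrix}.$$ (4.44)
We divide the third row (of both matrices) by 3.6. Then as the last step 0.2
times the third row is subtracted from each of the first two rows (of both
matrices). Our final pair is
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad A^{-1} = \begin{pmatrix} 0.6111 & -0.3889 & -0.0556 \\ -0.3889 & 0.6111 & -0.0556 \\ -0.0556 & -0.0556 & 0.2778 \end{pmatrix}.$$ (4.45)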
The check is to multiply the original A by the calculated $A^{-1}$ to see if we
really do get the unit matrix 1. The result to four decimal places is
$$AA^{-1} = \begin{pmatrix} 0.9999 & -0.0001 & -0.0002 \\ -0.0001 & 0.9999 & -0.0002 \\ -0.0002 & -0.0002 & 1.0000 \end{pmatrix}$$ (4.46)
or 1, the unit matrix to within the round-off error (mostly from rounding off
-0.05555··· to -0.0556).
As with the Gauss-Jordan solution of simultaneous linear algebraic equations,
this technique is well adapted to large computing machines. Indeed, this
Gauss-Jordan matrix inversion technique will probably be available in the
program library as a subroutine.
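For readers without such a library routine at hand, the procedure can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production routine: the function name `gauss_jordan_inverse` and the tolerance `tol` are hypothetical, and the sketch works column by column with a row interchange to pick the largest pivot, so the intermediate matrices differ from those of the worked example even though the final inverse is the same.

```python
def gauss_jordan_inverse(a, tol=1e-12):
    """Invert a square matrix by Gauss-Jordan elimination on [A | 1]."""
    n = len(a)
    # Write A and the unit matrix side by side.
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # (c) interchange rows so the largest available element is the pivot
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < tol:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # (a) multiply the pivot row by a constant so the pivot becomes 1
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # (b) subtract multiples of the pivot row from every other row
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    # The left half is now the unit matrix; the right half is the inverse.
    return [row[n:] for row in aug]

A = [[3, 2, 1], [2, 3, 1], [1, 1, 4]]
for row in gauss_jordan_inverse(A):
    print([round(x, 4) for x in row])
```

Working in double precision rather than four decimal places removes most of the round-off visible in the hand calculation.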
4.2.1 Show that matrix multiplication is associative, (AB)C = A(BC).
4.2.2 Show that
$$(A + B)(A - B) = A^2 - B^2$$
if and only if A and B commute,
[A, B] = 0.
4.2.3 Show that matrix A is a linear operator by showing that
$$A(c_1\mathbf{r}_1 + c_2\mathbf{r}_2) = c_1A\mathbf{r}_1 + c_2A\mathbf{r}_2.$$
It can be shown that an n × n matrix is the most general linear operator in
an n-dimensional vector space. This means that every linear operator in this
n-dimensional vector space is equivalent to a matrix.
4.2.4 (a) Complex numbers, a + ib, with a and b real, may be represented by (or,
are isomorphic with) 2 × 2 matrices:
$$a + ib \leftrightarrow \begin{pmatrix} a & b \\ -b & a \end{pmatrix}.$$
Show that this matrix representation is valid for (i) addition and (ii) multiplication.
(b) Find the matrix corresponding to $(a + ib)^{-1}$.
4.2.5 If A is an n × n matrix, show that
$$\det(-A) = (-1)^n \det A.$$
4.2.6 (a) Matrix C is the matrix product of A and B. Show that the determinant of
C is the product of the determinants of A and B.
det C = det A × det B.
Hint. The determinant can be written
(b) If C = A + B, in general,
det C ≠ det A + det B.
Construct a specific numerical example to illustrate this inequality.
4.2.7 Given the three matrices
$$A = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \text{and} \quad C = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}.$$
Find all possible products of A, B, and C, two at a time, including squares.
Express your answers in terms of A, B, and C, and 1, the unit matrix.
These three matrices together with the unit matrix form a representation of a
mathematical group, the vierergruppe.
Sections 4.8 and 4.9 (Group Theory) contain repeated references to this group.
4.2.8 Given
$$K = \begin{pmatrix} 0 & 0 & i \\ -i & 0 & 0 \\ 0 & -1 & 0 \end{pmatrix},$$
show that
$$K^n = KKK\cdots \ (n \text{ factors}) = 1$$
(with the proper choice of n, n ≠ 0).
4.2.9 Verify the Jacobi identity
$$[A, [B, C]] = [B, [A, C]] - [C, [A, B]].$$
This is useful in matrix descriptions of elementary particles. As a mnemonic
aid, the reader might note that the Jacobi identity has the same form as the
BAC-CAB rule of Section 1.5.
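4.2.10 Show that the matrices
$$A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad \text{and} \quad C = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
satisfy the commutation relations
[A, B] = C, [A, C] = 0, and [B, C] = 0.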
4.2.11 Let
i =
0
0
0
0
0
0
°\
o/
c =
r
0
lo
0
0
0
1
0
0
and
Show that
(a) $\mathsf{i}^2 = \mathsf{j}^2 = \mathsf{k}^2 = -1$, where 1 is the unit matrix.
(b) ij = -ji = k,
jk = -kj = i,
ki = -ik = j.
These three matrices (i, j, and k) plus the unit matrix 1 form a basis for
quaternions. An alternate basis is provided by the four 2 × 2 matrices, $i\sigma_1$, $i\sigma_2$,
$-i\sigma_3$, and 1, where the σ's are the Pauli spin matrices of Exercise 4.2.13.
4.2.12 A matrix with elements $a_{ij} = 0$ for j < i may be called upper right triangular.
The elements in the lower left (below and to the left of the main diagonal) vanish.
Examples are the matrices in Chapters 12 and 13 relating power series and
eigenfunction expansions.
Show that the product of two upper right triangular matrices is an upper right
triangular matrix.
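4.2.13 The three Pauli spin matrices are
$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \text{and} \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
Show that
(a) $\sigma_i^2 = 1$,
(b) $\sigma_i\sigma_j = i\sigma_k$, (i, j, k) = (1, 2, 3), (2, 3, 1), (3, 1, 2) (cyclic permutation),
(c) $\sigma_i\sigma_j + \sigma_j\sigma_i = 2\delta_{ij}1$.
These matrices were used by Pauli in the nonrelativistic theory of electron spin.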
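4.2.14 Using the Pauli σ's of Exercise 4.2.13, show that
$$(\boldsymbol{\sigma}\cdot\mathbf{a})(\boldsymbol{\sigma}\cdot\mathbf{b}) = \mathbf{a}\cdot\mathbf{b}\,1 + i\boldsymbol{\sigma}\cdot(\mathbf{a}\times\mathbf{b}).$$
Here
$$\boldsymbol{\sigma} = \mathbf{i}\sigma_x + \mathbf{j}\sigma_y + \mathbf{k}\sigma_z$$
and a and b are ordinary vectors.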
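4.2.15 One description of spin 1 particles uses the matrices
$$M_x = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad M_y = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 & -i & 0 \\ i & 0 & -i \\ 0 & i & 0 \end{pmatrix},$$
and
$$M_z = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}.$$
Show that
(a) $[M_x, M_y] = iM_z$, and so on⁶ (cyclic permutation of indices).
Using the Levi-Civita symbol of Section 3.4, we may write
$$[M_i, M_j] = i\varepsilon_{ijk}M_k.$$
(b) $M^2 \equiv M_x^2 + M_y^2 + M_z^2 = 2 \cdot 1$,
where 1 is the unit matrix.
(c) $[M^2, M_i] = 0$, $[M_z, L^+] = L^+$, $[L^+, L^-] = 2M_z$,
where
$L^+ \equiv M_x + iM_y$,
$L^- \equiv M_x - iM_y$.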
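4.2.16 Repeat Exercise 4.2.15 using an alternate representation,
$$M_x = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad M_y = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix},$$
and
$$M_z = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
In Section 4.11 these matrices appear as the generators of the rotation matrices.
6 [A, B] = AB - BA.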
4.2.17 Show that the matrix-vector equation
reproduces Maxwell's equations in vacuum. Here ψ is a column vector with
components $\psi_j = B_j - iE_j/c$, j = x, y, z. M is a vector whose elements are the
angular momentum matrices of Exercise 4.2.16. Note that $\varepsilon_0\mu_0 = 1/c^2$.
From Exercise 4.2.15(b)
$$M^2\psi = 2\psi.$$
A comparison with the Dirac relativistic electron equation suggests that the
"particle" of electromagnetic radiation, the photon, has zero rest mass and
a spin of 1 (in units of ħ).
4.2.18 Repeat Exercise 4.2.15, using the matrices for a spin of 3/2,
and
0 О
1 О
О -1
О О
4.2.19 An operator P commutes with $J_x$ and $J_y$, the x and y components of an angular
momentum operator. Show that P commutes with the third component of
angular momentum; that is,
$$[P, J_z] = 0.$$
Hint. The angular momentum components must satisfy the commutation relation
of Exercise 4.2.15(a).
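4.2.20 The $L^+$ and $L^-$ matrices of Exercise 4.2.15 are "ladder operators." $L^+$ operating
on a system of spin projection m will raise the spin projection to m + 1 if m is
below its maximum. $L^+$ operating on $m_{\max}$ yields zero. $L^-$ reduces the spin
projection in unit steps in a similar fashion. Dividing by $\sqrt{2}$, we have
$$L^+ = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad L^- = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}.$$
Show that
$L^+|{-1}\rangle = |0\rangle$, $L^-|{-1}\rangle$ = null column vector,
$L^+|0\rangle = |1\rangle$, $L^-|0\rangle = |{-1}\rangle$,
$L^+|1\rangle$ = null column vector, $L^-|1\rangle = |0\rangle$,
where
$$|{-1}\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \quad |0\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad \text{and} \quad |1\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$$
representing states of spin projection -1, 0, and 1, respectively.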
Note. Differential operator analogs of these ladder operators appear in Exercise
12.6.7.
4.2.21 Vectors A and B are related by the tensor T
B = TA.
Given A and B show that there is no unique solution for the components of T.
This is why vector division B/A is undefined (apart from the special case of
A and B parallel and T then a scalar).
4.2.22 We might ask for a vector $\mathbf{A}^{-1}$, an inverse of a given vector A in the sense that
$$\mathbf{A}\cdot\mathbf{A}^{-1} = \mathbf{A}^{-1}\cdot\mathbf{A} = 1.$$
Show that this relation does not suffice to define $\mathbf{A}^{-1}$ uniquely. A has literally
an infinite number of inverses.
4.2.23 If A is diagonal, with all diagonal elements different, and A and B commute,
show that B is diagonal.
4.2.24 If A and B are diagonal, show that A and B commute.
4.2.25 Show that trace (ABC) = trace (CBA) if any two of the three matrices commute.
4.2.26 Angular momentum matrices satisfy a commutation relation
$[M_i, M_j] = iM_k$, i, j, k cyclic.
Show that the trace of each angular momentum matrix vanishes.
4.2.27 (a) The operator Tr replaces a matrix A by its trace; that is,
Tr(A) = trace(A) = $\sum_i a_{ii}$.
Show that Tr is a linear operator,
(b) The operator det replaces a matrix A by its determinant; that is,
det(A) = determinant of A.
Show that det is not a linear operator.
4.2.28 A and B anticommute. Also, $A^2 = 1$, $B^2 = 1$. Show that trace(A) = trace(B) = 0.
Note. The Pauli and Dirac (Section 4.5) matrices are specific examples.
4.2.29 With $|x\rangle$ an N-dimensional column vector and $\langle y|$ an N-dimensional row vector,
show that
Note. $|x\rangle\langle y|$ means column vector $|x\rangle$ multiplying row vector $\langle y|$. The result
is a square matrix N × N.
4.2.30 (a) If two nonsingular matrices anticommute, show that the trace of each
one is zero. (Nonsingular means that the determinant of the matrix elements
≠ 0.)
(b) For the conditions of part (a) to hold A and B must be n × n matrices with
n even. Show that if n is odd a contradiction results.
4.2.31 If a matrix has an inverse, show that the inverse is unique.
4.2.32 If $A^{-1}$ has elements
$$a^{(-1)}_{ij} = \frac{C_{ji}}{|A|},$$
where $C_{ji}$ is the jith cofactor of |A|, show that
$$A^{-1}A = 1.$$
Hence $A^{-1}$ is the inverse of A (if |A| ≠ 0).
Note. In numerical work it sometimes happens that | A| is almost equal to zero.
Then there is trouble.
4.2.33 Show that $\det A^{-1} = (\det A)^{-1}$.
Hint. Apply Exercise 4.2.6.
Note. If det A is zero, then A has no inverse. A is singular.
4.2.34 Find the matrices $M_L$ such that the product $M_L A$ will be A but with:
(a) the ith row multiplied by a constant k, ($a_{ij} \to ka_{ij}$, j = 1, 2, 3, ...).
(b) the ith row replaced by the original ith row minus a multiple of the mth
row, ($a_{ij} \to a_{ij} - ka_{mj}$, j = 1, 2, 3, ...).
(c) the ith and mth rows interchanged, ($a_{ij} \to a_{mj}$, $a_{mj} \to a_{ij}$, j = 1, 2, 3, ...).
4.2.35 Find the matrices $M_R$ such that the product $AM_R$ will be A but with:
(a) the ith column multiplied by a constant k, ($a_{ji} \to ka_{ji}$, j = 1, 2, 3, ...).
(b) the ith column replaced by the original ith column minus a multiple of the
mth column, ($a_{ji} \to a_{ji} - ka_{jm}$, j = 1, 2, 3, ...).
(c) the ith and mth columns interchanged, ($a_{ji} \to a_{jm}$, $a_{jm} \to a_{ji}$, j = 1, 2, 3, ...).
4.2.36 Find the inverse of
4.2.37 (a) Rewrite Eq. 2.4 of Chapter 2 (and the corresponding equations for dy and
dz) as a single matrix equation
$$|dx_k\rangle = J|dq_j\rangle.$$
J is a matrix of derivatives, the Jacobian matrix. Show that
$$\langle dx_k|dx_k\rangle = \langle dq_i|G|dq_j\rangle$$
with the metric (matrix) G having elements $g_{ij}$ given by Eq. 2.6.
(b) Show that
$$\det(J)\,dq_1\,dq_2\,dq_3 = dx\,dy\,dz.$$
det(J) is the usual Jacobian.
4.2.38 Matrices are far too useful to remain the exclusive property of physicists. They
may appear wherever there are linear relations. For instance, in a study of
population movement the initial fraction of a fixed population in each of n
areas (or industries or religions, etc.) is represented by an n-component column
vector P. The movement of people from one area to another in a given time is
described by an n × n (stochastic) matrix T. Here $T_{ij}$ is the fraction of the
population in the jth area that moves to the ith area. (Those not moving are covered
by i = j.) With P describing the initial population distribution, the final
population distribution is given by the matrix equation TP = Q.
From its definition $\sum_{i=1}^{n} P_i = 1$.
(a) Show that conservation of people requires that
$$\sum_{i=1}^{n} T_{ij} = 1, \quad j = 1, 2, \ldots, n.$$
(b) Prove that
$$\sum_{i=1}^{n} Q_i = 1$$
continues the conservation of people.
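4.2.39 Given a 6 × 6 matrix A with elements $a_{ij} = 0.5^{|i-j|}$, i = 0, 1, 2, ..., 5; j = 0, 1,
2, ..., 5. Find $A^{-1}$. List the matrix elements $a^{(-1)}_{ij}$ to five decimal places.
ANS.
$$A^{-1} = \frac{1}{3}\begin{pmatrix} 4 & -2 & 0 & 0 & 0 & 0 \\ -2 & 5 & -2 & 0 & 0 & 0 \\ 0 & -2 & 5 & -2 & 0 & 0 \\ 0 & 0 & -2 & 5 & -2 & 0 \\ 0 & 0 & 0 & -2 & 5 & -2 \\ 0 & 0 & 0 & 0 & -2 & 4 \end{pmatrix}.$$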
4.2.40 Exercise 4.1.7 may be written in matrix form
AX = C.
Find A and calculate X as $A^{-1}C$.
4.2.41 (a) Write a subroutine that will multiply complex matrices. Assume that the
complex matrices are in a general rectangular form.
(b) Test your subroutine by multiplying pairs of the Dirac 4 × 4 matrices of
Table 4.1, Section 4.5.
4.2.42 (a) Write a subroutine that will call the complex matrix multiplication
subroutine of Exercise 4.2.41 and will calculate the commutator bracket of
two complex matrices.
(b) Test your complex commutator bracket subroutine with the matrices of
Exercise 4.2.16.
4.2.43 Interpolating polynomial is the name given to the (n - 1)-degree polynomial
determined by (and passing through) n points, $(x_i, y_i)$ with all the $x_i$'s distinct.
This interpolating polynomial forms the basis for the numerical quadrature
developed in Appendix 2.
(a) Show that the requirement that an (n - 1)-degree polynomial in x passes
through each of the n points $(x_i, y_i)$ with all $x_i$ distinct leads to n simultaneous
equations of the form
$$\sum_{j=0}^{n-1} a_j x_i^j = y_i, \quad i = 1, 2, \ldots, n.$$
(b) Write a computer program that will read in n data points and return the
n coefficients $a_j$. Use a subroutine to solve the simultaneous equations if
such a subroutine is available.
(c) Rewrite the set of simultaneous equations as a matrix equation
XA = Y.
(d) Repeat the computer calculation of part (b) but this time solve for vector
A by inverting matrix X (again, using a subroutine).
4.2.44 A calculation of the values of electrostatic potential inside a cylinder leads to
V(0.0) = 52.640    V(0.6) = 25.844
V(0.2) = 48.292    V(0.8) = 12.648
V(0.4) = 38.270    V(1.0) = 0.0
The problem is to determine the values of the argument for which V = 10, 20,
30, 40, and 50. Express V(x) as a series $\sum_{n=0}^{5} a_{2n}x^{2n}$. (Symmetry requirements in
the original problem require that V(x) be an even function of x.) Determine the
coefficients $a_{2n}$. With V(x) now a known function of x, find the root of V(x) - 10
= 0, 0 < x < 1. Repeat for V(x) - 20, and so on.
ANS. $a_0 = 52.640$
$a_2 = -117.676$
V(0.6851) = 20.
4.3 ORTHOGONAL MATRICES
Ordinary three-dimensional space may be described with the familiar
cartesian coordinates (x,y, z). We consider a second set of cartesian coordinates
(x',y', z') whose origin coincides with that of the first set but whose orientation
is different (Fig. 4.1). We can say that the primed coordinate axes have been
rotated relative to the initial, unprimed coordinate axes. Since this rotation
is a linear operation, we expect a matrix equation relating the primed basis to
the unprimed basis.
This section repeats portions of Chapters 1 and 3 in a slightly different
context and with a different emphasis. Previously, attention was focused on
the vector or tensor. In the case of the tensor, transformation properties were
strongly stressed and were critical. Here emphasis is placed on the description
of the coordinate rotation itself—the matrix. Transformation properties, the
behavior of the matrix when the basis is changed, appear at the end of this
section. Sections 4.5 and 4.6 continue with transformation properties in complex
vector spaces.
Direction Cosines
A unit vector along the x'-axis (i') may be resolved into components along
the x-, y-, and z-axes by the usual projection technique.
$$\mathbf{i}' = \mathbf{i}\cos(x', x) + \mathbf{j}\cos(x', y) + \mathbf{k}\cos(x', z).$$ (4.47)
Equation 4.47 is a specific example of the linear relations discussed at the
beginning of Section 4.2.
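FIG. 4.1 Cartesian coordinate systems
For convenience these cosines, which are the direction cosines, are labeled
$$\cos(x', x) = \mathbf{i}'\cdot\mathbf{i} = a_{11}, \quad \cos(x', y) = \mathbf{i}'\cdot\mathbf{j} = a_{12}, \quad \cos(x', z) = \mathbf{i}'\cdot\mathbf{k} = a_{13}.$$ (4.48)
Continuing, we have
$$\cos(y', x) = \mathbf{j}'\cdot\mathbf{i} = a_{21}, \quad (a_{21} \neq a_{12}),$$
$$\cos(y', y) = \mathbf{j}'\cdot\mathbf{j} = a_{22}, \quad \text{and so on}.$$ (4.49)
Now Eq. 4.47 may be rewritten
$$\mathbf{i}' = \mathbf{i}a_{11} + \mathbf{j}a_{12} + \mathbf{k}a_{13}$$
and also
$$\mathbf{j}' = \mathbf{i}a_{21} + \mathbf{j}a_{22} + \mathbf{k}a_{23},$$
$$\mathbf{k}' = \mathbf{i}a_{31} + \mathbf{j}a_{32} + \mathbf{k}a_{33}.$$ (4.50)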
We may also go the other way by resolving i, j, and k into components in the
primed system. Then
$$\mathbf{i} = \mathbf{i}'a_{11} + \mathbf{j}'a_{21} + \mathbf{k}'a_{31},$$
$$\mathbf{j} = \mathbf{i}'a_{12} + \mathbf{j}'a_{22} + \mathbf{k}'a_{32},$$
$$\mathbf{k} = \mathbf{i}'a_{13} + \mathbf{j}'a_{23} + \mathbf{k}'a_{33}.$$ (4.51)
Associating i and i' with the subscript 1, j and j' with the subscript 2, k and k'
with the subscript 3, we see that in each case the first subscript of $a_{ij}$ refers to
the primed unit vector (i', j', k'), whereas the second subscript refers to the
unprimed unit vector (i, j, k).
Applications to Vectors
If we consider a vector whose components are functions of the position in
space, then
$$\mathbf{V}(x, y, z) = \mathbf{i}V_x + \mathbf{j}V_y + \mathbf{k}V_z = \mathbf{V}'(x', y', z') = \mathbf{i}'V'_{x'} + \mathbf{j}'V'_{y'} + \mathbf{k}'V'_{z'},$$ (4.52)
since the point may be given both by the coordinates (x, y, z) and the coordinates
(x', y', z'). Note that V and V' are geometrically the same vector (but with
different components). The coordinate axes are being rotated; the vector stays fixed.
Using Eq. 4.51 to eliminate i, j, and k, we may separate Eq. 4.52 into three
scalar equations,
$$V'_{x'} = a_{11}V_x + a_{12}V_y + a_{13}V_z,$$
$$V'_{y'} = a_{21}V_x + a_{22}V_y + a_{23}V_z,$$
and
$$V'_{z'} = a_{31}V_x + a_{32}V_y + a_{33}V_z.$$ (4.53)
In particular, these relations will hold for the coordinates of a point (x, y, z)
and (x', y', z'), giving
$$x' = a_{11}x + a_{12}y + a_{13}z,$$
$$y' = a_{21}x + a_{22}y + a_{23}z,$$
and
$$z' = a_{31}x + a_{32}y + a_{33}z.$$ (4.54)
It is convenient to change the notation slightly at this point.
Let
$$x \to x_1, \quad y \to x_2, \quad z \to x_3$$ (4.55)
and similarly for the primed coordinates. In this notation the set of three
equations (4.54) may be written as
$$x'_i = \sum_{j=1}^{3} a_{ij}x_j,$$ (4.56)
where i takes on the values 1, 2, and 3 and the result is three separate equations.
Now let us set aside these results and try a different approach to the same
problem. We consider two coordinate systems $(x_1, x_2, x_3)$ and $(x'_1, x'_2, x'_3)$ with
a common origin and one point $(x_1, x_2, x_3)$ in the unprimed system, $(x'_1, x'_2, x'_3)$
in the primed system. Note the usual ambiguity. The same symbol x denotes
both the coordinate axis and a particular distance along that axis. Since our
system is linear, $x'_i$ must be a linear combination of the $x_j$'s. Let
$$x'_i = \sum_{j=1}^{3} a_{ij}x_j.$$ (4.57)
The $a_{ij}$ may be identified as our old friends, the direction cosines. This
identification is carried out for the two-dimensional case later.
If we have two sets of quantities $(V_1, V_2, V_3)$ in the unprimed system and
$(V'_1, V'_2, V'_3)$ in the primed system, related in the same way as the coordinates
of a point in the two different systems (Eq. 4.57),
$$V'_i = \sum_{j=1}^{3} a_{ij}V_j,$$ (4.58)
then, as in Section 1.2, the quantities $(V_1, V_2, V_3)$ are defined as the components
of a vector; that is, a vector is defined in terms of transformation properties
of its components under a rotation of the coordinate axes. In a sense the
coordinates of a point have been taken as a prototype vector. The power and
usefulness of this definition become apparent in Chapter 3, in which it is
extended to define pseudovectors and tensors.
From Eq. 4.56 we can derive interesting information about the $a_{ij}$'s which
describe the orientation of coordinate system $(x'_1, x'_2, x'_3)$ relative to the system
$(x_1, x_2, x_3)$. The length from the origin to the point is the same in both systems.
Squaring, for convenience,
$$\sum_i x_i^2 = \sum_i x_i'^2 = \sum_i \Big(\sum_j a_{ij}x_j\Big)\Big(\sum_k a_{ik}x_k\Big) = \sum_{j,k} x_j x_k \sum_i a_{ij}a_{ik}.$$ (4.59)
This can be true for all points if and only if
$$\sum_i a_{ij}a_{ik} = \delta_{jk}, \quad j, k = 1, 2, 3.$$ (4.60)
Verification of Eq. 4.60, if needed, may be obtained by returning to Eq. 4.59
and setting $\mathbf{r} = (x_1, x_2, x_3)$ = (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), and so on to
evaluate the nine relations given by Eq. 4.60. This process is valid, since Eq. 4.59
must hold for all r for a given set of $a_{ij}$. Equation 4.60, a consequence of requiring
that the length remain constant (invariant) under rotation of the coordinate
system, is called the orthogonality condition. The $a_{ij}$'s, written as a matrix A,
form an orthogonal matrix. Note carefully that Eq. 4.60 is not matrix
multiplication. Rather, it is interpreted later as a scalar product of two columns
of A.
Note that two independent indices j and k are used.
In matrix notation Eq. 4.56 becomes
$$|x'\rangle = A|x\rangle.$$ (4.61)
Orthogonality Conditions—Two-Dimensional Case
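A better understanding of the $a_{ij}$'s and the orthogonality condition may be
gained by considering rotation in two dimensions in detail. (This can be thought
of as a three-dimensional system with the $x_1$-, $x_2$-axes rotated about $x_3$.) From
Fig. 4.2
$$\begin{aligned} x_1' &= x_1 \cos\varphi + x_2 \sin\varphi, \\ x_2' &= -x_1 \sin\varphi + x_2 \cos\varphi. \end{aligned} \tag{4.62}$$
FIG. 4.2
Therefore by Eq. 4.61
$$\mathsf{A} = \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix}. \tag{4.63}$$
Notice that $\mathsf{A}$ reduces to the unit matrix for $\varphi = 0$. Zero rotation means nothing
has changed. It is clear from Fig. 4.2 that
$$\begin{aligned} a_{11} &= \cos\varphi = \cos(x_1', x_1), \\ a_{12} &= \sin\varphi = \cos\left(\frac{\pi}{2} - \varphi\right) = \cos(x_1', x_2), \text{ and so on,} \end{aligned} \tag{4.64}$$
thus identifying the matrix elements $a_{ij}$ as the direction cosines. Equation 4.60,
the orthogonality condition, becomes
$$\begin{aligned} \sin^2\varphi + \cos^2\varphi &= 1, \\ \sin\varphi \cos\varphi - \sin\varphi \cos\varphi &= 0. \end{aligned} \tag{4.65}$$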
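As a numerical illustration of the orthogonality condition, the following minimal sketch builds the two-dimensional rotation matrix of Eq. 4.63 and evaluates the column sums on the left-hand side of Eq. 4.60. The function names `rotation_2d` and `column_products` and the 30 degree test angle are assumptions made for this example only.

```python
import math

def rotation_2d(phi):
    """Matrix of Eq. 4.63: axes rotated counterclockwise through phi (radians)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s],
            [-s, c]]

def column_products(a):
    """Left-hand side of Eq. 4.60: entry [j][k] is sum_i a[i][j] * a[i][k]."""
    n = len(a)
    return [[sum(a[i][j] * a[i][k] for i in range(n)) for k in range(n)]
            for j in range(n)]

A = rotation_2d(math.radians(30.0))   # arbitrary test angle
G = column_products(A)                # expected to be close to the unit matrix
for row in G:
    print(row)
```

Up to rounding error the printed rows should approximate those of the 2 x 2 unit matrix, which is Eq. 4.65 in numerical form.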
The extension to three dimensions (rotation of the coordinates through an
angle $\varphi$ counterclockwise about $x_3$) is simply
$$\mathsf{A} = \begin{pmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{4.66}$$
The $a_{33} = 1$ expresses the fact that $x_3' = x_3$, since the rotation has been about
the $x_3$-axis. The zeros guarantee that $x_1'$ and $x_2'$ do not depend on $x_3$ and that
$x_3'$ does not depend on $x_1$ and $x_2$. In more sophisticated language, $x_1$ and $x_2$
span an invariant subspace, whereas $x_3$ forms an invariant subspace alone. The
general form of $\mathsf{A}$ is reducible. Equation 4.66 gives one possible decomposition.
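Inverse Matrix, $\mathsf{A}^{-1}$
Returning to the general transformation matrix $\mathsf{A}$, the inverse matrix $\mathsf{A}^{-1}$ is
defined such that
$$|x\rangle = \mathsf{A}^{-1}|x'\rangle. \tag{4.67}$$
That is, $\mathsf{A}^{-1}$ describes the reverse of the rotation given by $\mathsf{A}$ and returns the
coordinate system to its original position. Symbolically, Eqs. 4.61 and 4.67
combine to give
$$|x\rangle = \mathsf{A}^{-1}\mathsf{A}|x\rangle, \tag{4.68}$$
and since $|x\rangle$ is arbitrary,
$$\mathsf{A}^{-1}\mathsf{A} = 1, \tag{4.69}$$
the unit matrix. Similarly,
$$\mathsf{A}\mathsf{A}^{-1} = 1, \tag{4.70}$$
using Eqs. 4.61 and 4.67 and eliminating $|x\rangle$ instead of $|x'\rangle$.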
Transpose Matrix, $\tilde{\mathsf{A}}$
We can determine the elements of our postulated inverse matrix $\mathsf{A}^{-1}$ by
employing the orthogonality condition. Equation 4.60, the orthogonality
condition, does not conform to our definition of matrix multiplication, but it can
be put in the required form by defining a new matrix $\tilde{\mathsf{A}}$ such that
$$\tilde{a}_{ji} = a_{ij}; \tag{4.71}$$
that is, $\tilde{\mathsf{A}}$, called "$\mathsf{A}$ transpose,"$^1$ is formed from $\mathsf{A}$ by interchanging rows
and columns. Equation 4.60 becomes
$$\tilde{\mathsf{A}}\mathsf{A} = 1. \tag{4.72}$$
This is a restatement of the orthogonality condition and may be taken as a
definition of orthogonality. Multiplying Eq. 4.72 by $\mathsf{A}^{-1}$ from the right and
using Eq. 4.70, we have
$$\tilde{\mathsf{A}} = \mathsf{A}^{-1}. \tag{4.73}$$
This important result that the inverse equals the transpose holds only for
orthogonal matrices and indeed may be taken as a further restatement of the
orthogonality condition.
Multiplying Eq. 4.73 by $\mathsf{A}$ from the left, we obtain
$$\mathsf{A}\tilde{\mathsf{A}} = 1 \tag{4.74}$$
or
$$\sum_i a_{ji} a_{ki} = \delta_{jk}, \tag{4.75}$$
which is still another form of the orthogonality condition.
Summarizing, the orthogonality condition may be stated in several equivalent
ways:
$$\sum_i a_{ij} a_{ik} = \delta_{jk} \tag{4.76a}$$
$$\sum_i a_{ji} a_{ki} = \delta_{jk} \tag{4.76b}$$
$$\tilde{\mathsf{A}}\mathsf{A} = \mathsf{A}\tilde{\mathsf{A}} = 1 \tag{4.76c}$$
$$\tilde{\mathsf{A}} = \mathsf{A}^{-1}. \tag{4.76d}$$
1 Some texts denote $\mathsf{A}$ transpose by $\mathsf{A}^T$.
Any one of these relations is a necessary and a sufficient condition for A to be
orthogonal.
It is now possible to see and understand why the term orthogonal is
appropriate for these matrices. We have the general form
$$\mathsf{A} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix},$$
a matrix of direction cosines in which $a_{ij}$ is the cosine of the angle between $x_i'$ and
$x_j$. Therefore $a_{11}, a_{12}, a_{13}$ are the direction cosines of $x_1'$ relative to $x_1, x_2, x_3$.
These three elements of $\mathsf{A}$ define a unit length along $x_1'$, that is, a unit vector $\mathbf{i}'$,
$$\mathbf{i}' = \mathbf{i}a_{11} + \mathbf{j}a_{12} + \mathbf{k}a_{13}.$$
The orthogonality relation (Eq. 4.75) is simply a statement that the unit vectors
$\mathbf{i}'$, $\mathbf{j}'$, and $\mathbf{k}'$ are mutually perpendicular or orthogonal. Our orthogonal
transformation matrix $\mathsf{A}$ rotates one orthogonal coordinate system into a second
orthogonal coordinate system.
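As an example of the use of matrices, the unit vectors in spherical polar
coordinates may be written as
$$\begin{pmatrix} \mathbf{r}_0 \\ \boldsymbol{\theta}_0 \\ \boldsymbol{\varphi}_0 \end{pmatrix} = \mathsf{C} \begin{pmatrix} \mathbf{i} \\ \mathbf{j} \\ \mathbf{k} \end{pmatrix}, \tag{4.77}$$
where $\mathsf{C}$ is given in Exercise 2.5.1. This is equivalent to Eq. 4.50 with $\mathbf{i}'$, $\mathbf{j}'$, and
$\mathbf{k}'$ replaced by $\mathbf{r}_0$, $\boldsymbol{\theta}_0$, and $\boldsymbol{\varphi}_0$. From the preceding analysis $\mathsf{C}$ is orthogonal.
Therefore the inverse relation becomes
$$\begin{pmatrix} \mathbf{i} \\ \mathbf{j} \\ \mathbf{k} \end{pmatrix} = \mathsf{C}^{-1} \begin{pmatrix} \mathbf{r}_0 \\ \boldsymbol{\theta}_0 \\ \boldsymbol{\varphi}_0 \end{pmatrix} = \tilde{\mathsf{C}} \begin{pmatrix} \mathbf{r}_0 \\ \boldsymbol{\theta}_0 \\ \boldsymbol{\varphi}_0 \end{pmatrix}, \tag{4.78}$$
and Exercise 2.5.5 is solved by inspection. Similar applications of matrix
inverses appear in connection with the transformation of a power series into
a series of orthogonal functions (Gram-Schmidt orthogonalization) and the
numerical solution of integral equations.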
Successive Rotations, Matrix Multiplication
Returning to orthogonal matrices, let the coordinate rotation
$$|x'\rangle = \mathsf{A}|x\rangle \tag{4.79}$$
be followed by a second rotation given by matrix $\mathsf{B}$ such that
$$|x''\rangle = \mathsf{B}|x'\rangle. \tag{4.80}$$
In component form
$$x_i'' = \sum_j b_{ij} x_j' = \sum_j b_{ij} \sum_k a_{jk} x_k = \sum_k \Big(\sum_j b_{ij} a_{jk}\Big) x_k. \tag{4.81}$$
The summation over $j$ is matrix multiplication defining a matrix $\mathsf{C} = \mathsf{B}\mathsf{A}$
such that
$$x_i'' = \sum_k c_{ik} x_k. \tag{4.82}$$
Again, the definition of matrix multiplication is found useful and indeed this is
the justification for its existence. The physical interpretation is that the matrix
product of the two matrices, $\mathsf{B}\mathsf{A}$, is the rotation that carries the unprimed
system directly into the double-primed coordinate system.
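The equivalence of two successive rotations and a single rotation by the product matrix can be checked numerically. The sketch below is illustrative only: the helper names (`rot_z`, `rot_y`, `matmul`, `apply`), the two rotation angles, and the test vector are assumptions chosen for the example, with the first rotation taken about $x_3$ and the second about $x_2$.

```python
import math

def rot_z(phi):
    """Coordinate rotation about the x3-axis through phi (same form as Eq. 4.66)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(phi):
    """Coordinate rotation about the x2-axis through phi."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]

def matmul(b, a):
    """Matrix product: c[i][k] = sum_j b[i][j] * a[j][k]."""
    return [[sum(b[i][j] * a[j][k] for j in range(len(a)))
             for k in range(len(a[0]))] for i in range(len(b))]

def apply(m, x):
    """Apply matrix m to the column vector x."""
    return [sum(m[i][k] * x[k] for k in range(len(x))) for i in range(len(m))]

A = rot_z(0.4)            # first rotation
B = rot_y(0.7)            # second rotation
x = [1.0, 2.0, 3.0]       # arbitrary test point

two_steps = apply(B, apply(A, x))       # rotate by A, then by B
one_step = apply(matmul(B, A), x)       # single rotation by C = BA
wrong_order = apply(matmul(A, B), x)    # AB is a different rotation in general

print(two_steps)
print(one_step)
print(wrong_order)
```

The first two printed vectors should agree to rounding error, while the third generally differs, which shows why the order of the factors in the product matters.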
Euler Angles
Our transformation matrix $\mathsf{A}$ contains nine direction cosines. Clearly, only
three of these are independent, Eq. 4.60 providing six constraints. Equivalently,
we may say that two parameters ($\theta$ and $\varphi$ in spherical polar coordinates) are
required to fix the axis of rotation. Then one additional parameter describes
the amount of rotation about the specified axis. In the Lagrangian formulation
of mechanics (Section 17.3) it is necessary to describe $\mathsf{A}$ by using some set of
three independent parameters rather than the redundant direction cosines. The
usual choice of parameters is the Euler angles.$^3$
FIG. 4.3 (a) Rotation about $x_3$ through angle $\alpha$; (b) Rotation about $x_2'$ through
angle $\beta$; (c) Rotation about $x_3''$ through angle $\gamma$.
The goal is to describe the orientation of a final rotated system $(x_1''', x_2''', x_3''')$
relative to some initial coordinate system $(x_1, x_2, x_3)$. The final system is
developed in three steps, each step involving one rotation described by one
Euler angle (Fig. 4.3):
1. The $x_1'$-, $x_2'$-, $x_3'$-axes are rotated about the $x_3$-axis
through an angle $\alpha$ counterclockwise relative to $x_1$,
$x_2$, $x_3$. (The $x_3$- and $x_3'$-axes coincide.)
2. The $x_1''$-, $x_2''$-, $x_3''$-axes are rotated about the $x_2'$-axis$^4$
through an angle $\beta$ counterclockwise relative to $x_1'$,
$x_2'$, $x_3'$. (The $x_2'$- and the $x_2''$-axes coincide.)
3. The third and final rotation is through an angle $\gamma$
counterclockwise about the $x_3''$-axis, yielding the $x_1'''$,
$x_2'''$, $x_3'''$ system. (The $x_3''$- and $x_3'''$-axes coincide.)
The three matrices describing these rotations are
$$\mathsf{R}_z(\alpha) = \begin{pmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}, \tag{4.83}$$
exactly like Eq. 4.66,
$$\mathsf{R}_y(\beta) = \begin{pmatrix} \cos\beta & 0 & -\sin\beta \\ 0 & 1 & 0 \\ \sin\beta & 0 & \cos\beta \end{pmatrix}, \tag{4.84}$$
and
$$\mathsf{R}_z(\gamma) = \begin{pmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{4.85}$$
3 There are almost as many definitions of the Euler angles as there are authors.
Here we follow the choice generally made by workers in the area of group
theory and the quantum theory of angular momentum (compare Section 4.9).
4 Many authors choose this second rotation to be about the $x_1'$-axis.
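The total rotation is described by the triple matrix product
$$\mathsf{A}(\alpha, \beta, \gamma) = \mathsf{R}_z(\gamma)\, \mathsf{R}_y(\beta)\, \mathsf{R}_z(\alpha). \tag{4.86}$$
(The component form of successive transformations is considered in Eqs. 4.79
to 4.82.)
Note the order: $\mathsf{R}_z(\alpha)$ operates first, then $\mathsf{R}_y(\beta)$, and finally $\mathsf{R}_z(\gamma)$. Direct
multiplication gives
$$\mathsf{A}(\alpha, \beta, \gamma) = \begin{pmatrix} \cos\gamma \cos\beta \cos\alpha - \sin\gamma \sin\alpha & \cos\gamma \cos\beta \sin\alpha + \sin\gamma \cos\alpha & -\cos\gamma \sin\beta \\ -\sin\gamma \cos\beta \cos\alpha - \cos\gamma \sin\alpha & -\sin\gamma \cos\beta \sin\alpha + \cos\gamma \cos\alpha & \sin\gamma \sin\beta \\ \sin\beta \cos\alpha & \sin\beta \sin\alpha & \cos\beta \end{pmatrix}. \tag{4.87}$$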
Equating $\mathsf{A}(a_{ij})$ with $\mathsf{A}(\alpha, \beta, \gamma)$, element by element, yields the direction
cosines in terms of the three Euler angles. We could use this Euler angle
identification to verify the direction cosine identities, Eq. 1.41, of Section 1.4, but
the approach of Exercise 4.3.3 is much more elegant.
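The triple product of Eq. 4.86 can be compared numerically with the closed form obtained by direct multiplication. The following is a minimal sketch, assuming the convention stated in the text (rotation about $x_3$, then $x_2'$, then $x_3''$); the function names and the three test angles are hypothetical choices for this example.

```python
import math

def rot_z(t):
    """Rotation of the coordinates about the 3-axis through t."""
    c, s = math.cos(t), math.sin(t)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(t):
    """Rotation of the coordinates about the 2-axis through t."""
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]

def matmul(b, a):
    return [[sum(b[i][j] * a[j][k] for j in range(3)) for k in range(3)]
            for i in range(3)]

def euler_product(alpha, beta, gamma):
    """Triple product R_z(gamma) R_y(beta) R_z(alpha), as in Eq. 4.86."""
    return matmul(rot_z(gamma), matmul(rot_y(beta), rot_z(alpha)))

def euler_closed_form(alpha, beta, gamma):
    """Element-by-element form obtained by multiplying out the product."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [cg * cb * ca - sg * sa, cg * cb * sa + sg * ca, -cg * sb],
        [-sg * cb * ca - cg * sa, -sg * cb * sa + cg * ca, sg * sb],
        [sb * ca, sb * sa, cb],
    ]

angles = (0.3, 1.1, -0.8)     # arbitrary test values of alpha, beta, gamma
A_prod = euler_product(*angles)
A_form = euler_closed_form(*angles)
max_diff = max(abs(A_prod[i][j] - A_form[i][j])
               for i in range(3) for j in range(3))
print(max_diff)
```

The printed difference should be at the level of rounding error. The same `A_prod` can also be tested against the orthogonality condition, Eq. 4.60.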
Our matrix description of rotation leads to the $O_3^+$ group, which will be
discussed in Sections 4.10 and 4.11. Rotations may also be described by the
SU(2) group and quaternions. The power and flexibility of matrices pushed
quaternions into obscurity early in this century.$^5$ The SU(2) concepts and
techniques are often encountered in modern particle physics. The SU(2) group
is also considered in Sections 4.10 and 4.11.
The Euler angle description of rotations forms a basis for developing the
rotation group of Section 4.10.
Two Techniques
It will be noted that the matrices have been handled in two ways in the
foregoing discussion: by their components and as single entities. Each technique
has its own advantages. Both are useful.
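Consider the evaluation of $(\mathsf{S}\mathsf{T})^{-1}$, where $\mathsf{S}\mathsf{T}$ is a (product) matrix that has
an inverse. Then, clearly,
$$(\mathsf{S}\mathsf{T})(\mathsf{S}\mathsf{T})^{-1} = 1.$$
Multiplying first by $\mathsf{S}^{-1}$ and then by $\mathsf{T}^{-1}$ successively from the left, we have
$$(\mathsf{S}\mathsf{T})^{-1} = \mathsf{T}^{-1}\mathsf{S}^{-1}. \tag{4.88}$$
The inverse of a product equals the product of the inverses in reverse order.
This may be readily generalized to any number of factors.
On the other hand, the evaluation of $\widetilde{\mathsf{S}\mathsf{T}}$ may perhaps best be carried out
by considering the components. Let $\mathsf{U} = \mathsf{S}\mathsf{T}$, with $\mathsf{S}$, $\mathsf{T}$, and $\mathsf{U}$ not necessarily
orthogonal. Then
$$\tilde{u}_{ik} = u_{ki} = \sum_j s_{kj} t_{ji} = \sum_j \tilde{t}_{ij} \tilde{s}_{jk},$$
using the definition of transpose. But
$$\tilde{u}_{ik} = (\widetilde{\mathsf{S}\mathsf{T}})_{ik},$$
and the component relation above may be written as
$$\widetilde{\mathsf{S}\mathsf{T}} = \tilde{\mathsf{T}}\tilde{\mathsf{S}}. \tag{4.89}$$
The transpose of a product equals the product of the transposes in reverse order.
Note that in the two illustrations neither $\mathsf{S}$ nor $\mathsf{T}$ is required to be orthogonal.
5 Stephenson, R. J., "Development of Vector Analysis from Quaternions."
Am. J. Phys. 34, 194 (1966).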
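Both reverse-order rules can be checked on a small numerical case. The sketch below uses two arbitrary nonorthogonal 2 x 2 matrices; the helper names and the matrix entries are assumptions made for illustration.

```python
def matmul(a, b):
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix from its determinant and cofactors."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

S = [[2.0, 1.0], [0.0, 1.0]]   # neither S nor T is orthogonal
T = [[1.0, 3.0], [1.0, 1.0]]
ST = matmul(S, T)

inverse_rule = close(inverse_2x2(ST), matmul(inverse_2x2(T), inverse_2x2(S)))
transpose_rule = close(transpose(ST), matmul(transpose(T), transpose(S)))
same_order = close(inverse_2x2(ST), matmul(inverse_2x2(S), inverse_2x2(T)))
print(inverse_rule, transpose_rule, same_order)
```

For these matrices the two reverse-order checks succeed, whereas keeping the factors in the original order does not reproduce the inverse of the product.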
Symmetry Properties
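The transpose matrix is useful in a discussion of symmetry properties. If
$$\mathsf{A} = \tilde{\mathsf{A}}, \qquad a_{ij} = a_{ji}, \tag{4.90}$$
the matrix is called symmetric, whereas if
$$\mathsf{A} = -\tilde{\mathsf{A}}, \qquad a_{ij} = -a_{ji}, \tag{4.91}$$
it is called antisymmetric or skewsymmetric. The diagonal elements of an
antisymmetric matrix vanish.
It is easy to show that any (square) matrix may be written as the sum of a
symmetric matrix and an antisymmetric matrix. Consider the identity
$$\mathsf{A} = \tfrac{1}{2}[\mathsf{A} + \tilde{\mathsf{A}}] + \tfrac{1}{2}[\mathsf{A} - \tilde{\mathsf{A}}]. \tag{4.92}$$
$[\mathsf{A} + \tilde{\mathsf{A}}]$ is clearly symmetric, whereas $[\mathsf{A} - \tilde{\mathsf{A}}]$ is clearly antisymmetric. This
is the matrix analog of Eq. 3.22, Chapter 3, for tensors.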
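The decomposition of Eq. 4.92 is easy to carry out numerically. The following sketch splits an arbitrary 3 x 3 matrix into its symmetric and antisymmetric parts; the function names and the sample matrix are assumptions for this example.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def symmetric_part(a):
    """(A + A~) / 2, the first term of Eq. 4.92."""
    t = transpose(a)
    n = len(a)
    return [[0.5 * (a[i][j] + t[i][j]) for j in range(n)] for i in range(n)]

def antisymmetric_part(a):
    """(A - A~) / 2, the second term of Eq. 4.92."""
    t = transpose(a)
    n = len(a)
    return [[0.5 * (a[i][j] - t[i][j]) for j in range(n)] for i in range(n)]

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 10.0]]
S_part = symmetric_part(A)
K_part = antisymmetric_part(A)
recombined = [[S_part[i][j] + K_part[i][j] for j in range(3)] for i in range(3)]
print(S_part)
print(K_part)
```

Adding the two parts element by element recovers the original matrix, and the diagonal of the antisymmetric part is zero, as stated after Eq. 4.91.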
Similarity Transformation
So far we have interpreted the orthogonal matrix as rotating the coordinate
system. This changes the components of a fixed vector (not rotating with the
coordinates) (Fig. 1.7, Chapter 1). However, Eq. 4.89 may be interpreted equally
well as a rotation of the vector in the opposite direction (Fig. 4.4).
FIG. 4.4 Fixed coordinates, rotated vector
These two possibilities, (1) rotating the vector keeping the basis fixed and
(2) rotating the basis (in the opposite sense) keeping the vector fixed, have a
direct analogy in quantum theory. Rotation (a time transformation) of the
state vector gives the Schrödinger picture. Rotation of the basis keeping the
state vector fixed yields the Heisenberg picture.
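Suppose we interpret matrix $\mathsf{A}$ as rotating a vector $\mathbf{r}$ into the position shown
by $\mathbf{r}_1$,
$$\mathbf{r}_1 = \mathsf{A}\mathbf{r}. \tag{4.93}$$
Now let us rotate the coordinates by applying matrix $\mathsf{B}$, which rotates $(x, y, z)$
into $(x', y', z')$,
$$\begin{aligned} \mathsf{B}\mathbf{r}_1 &= \mathsf{B}\mathsf{A}\mathbf{r} \\ &= \mathsf{B}\mathsf{A}(\mathsf{B}^{-1}\mathsf{B})\mathbf{r} \\ &= (\mathsf{B}\mathsf{A}\mathsf{B}^{-1})\mathsf{B}\mathbf{r}. \end{aligned} \tag{4.94}$$
$\mathsf{B}\mathbf{r}_1$ is just $\mathbf{r}_1$ in the new coordinate system with a similar interpretation holding
for $\mathsf{B}\mathbf{r}$. Hence in this new system $(\mathsf{B}\mathbf{r})$ is rotated into position $(\mathsf{B}\mathbf{r}_1)$ by the
matrix $\mathsf{B}\mathsf{A}\mathsf{B}^{-1}$:
$$\mathsf{B}\mathbf{r}_1 = (\mathsf{B}\mathsf{A}\mathsf{B}^{-1})\mathsf{B}\mathbf{r}, \qquad \text{that is,} \qquad \mathbf{r}_1' = \mathsf{A}'\mathbf{r}'.$$
In the new system, the coordinates having been rotated by matrix $\mathsf{B}$, $\mathsf{A}$ has the
form $\mathsf{A}'$, in which
$$\mathsf{A}' = \mathsf{B}\mathsf{A}\mathsf{B}^{-1}. \tag{4.95}$$
$\mathsf{A}'$ operates in the $x'$, $y'$, $z'$ space as $\mathsf{A}$ operates in the $x$, $y$, $z$ space.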
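The transformation defined by Eq. 4.95 with $\mathsf{B}$ any matrix, not necessarily
orthogonal, is known as a similarity transformation. In component form Eq.
4.95 becomes
$$a_{ij}' = \sum_{k,l} b_{ik}\, a_{kl}\, (\mathsf{B}^{-1})_{lj}. \tag{4.96}$$
Now if $\mathsf{B}$ is orthogonal,
$$(\mathsf{B}^{-1})_{lj} = (\tilde{\mathsf{B}})_{lj} = b_{jl}, \tag{4.97}$$
and we have
$$a_{ij}' = \sum_{k,l} b_{ik}\, b_{jl}\, a_{kl}. \tag{4.98}$$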
It may be helpful to think of A again as an operator, possibly as rotating
coordinate axes, relating current density and electric fields in an anisotropic
crystal (Section 3.1) or angular momentum and angular velocity of a rotating
solid (Section 4.6). Matrix A is the representation in a given coordinate system—
or basis. But there are directions associated with A—crystal axes, symmetry
axes in the rotating solid, and so on—so that the representation A depends
on the basis. The similarity transformation shows just how the representation
changes with a change of basis.
Relation to Tensors
Comparing Eq. 4.98 with the equations of Section 3.1, we see that it is the
definition of a tensor of second rank. Hence a matrix that transforms by an
orthogonal similarity transformation is, by definition, a tensor. Clearly, then,
any orthogonal matrix $\mathsf{A}$, interpreted as rotating a vector (Eq. 4.93), may be
called a tensor. If, however, we consider the orthogonal matrix as a collection
of fixed direction cosines, giving the new orientation of a coordinate system,
there is no tensor transformation involved.
The symmetry and antisymmetry properties defined earlier are preserved
under orthogonal similarity transformations. Let $\mathsf{A}$ be a symmetric matrix,
$\mathsf{A} = \tilde{\mathsf{A}}$, and
$$\mathsf{A}' = \mathsf{B}\mathsf{A}\mathsf{B}^{-1}. \tag{4.99}$$
Now
$$\tilde{\mathsf{A}}' = \widetilde{\mathsf{B}\mathsf{A}\mathsf{B}^{-1}} = \tilde{\mathsf{B}}^{-1}\tilde{\mathsf{A}}\tilde{\mathsf{B}} = \mathsf{B}\tilde{\mathsf{A}}\mathsf{B}^{-1}, \tag{4.100}$$
since $\mathsf{B}$ is orthogonal. But $\mathsf{A} = \tilde{\mathsf{A}}$. Therefore
$$\tilde{\mathsf{A}}' = \mathsf{B}\mathsf{A}\mathsf{B}^{-1} = \mathsf{A}', \tag{4.101}$$
showing that the property of symmetry is invariant under an orthogonal
similarity transformation. In general, symmetry is not preserved under a
nonorthogonal similarity transformation.
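The contrast between orthogonal and nonorthogonal similarity transformations can be seen in a 2 x 2 example. This is a sketch under stated assumptions: the symmetric matrix `A`, the rotation angle, and the nonorthogonal matrix `B_gen` (with its inverse written out by hand) are arbitrary choices, and the helper names are hypothetical.

```python
import math

def matmul(a, b):
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def is_symmetric(a, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - a[j][i]) < tol for i in range(n) for j in range(n))

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

A = [[2.0, 1.0], [1.0, 3.0]]                 # symmetric test matrix

c, s = math.cos(0.3), math.sin(0.3)
B_orth = [[c, s], [-s, c]]                   # orthogonal: inverse = transpose
A_orth = matmul(matmul(B_orth, A), transpose(B_orth))

B_gen = [[1.0, 2.0], [0.0, 1.0]]             # not orthogonal
B_gen_inv = [[1.0, -2.0], [0.0, 1.0]]        # its inverse, written out by hand
A_gen = matmul(matmul(B_gen, A), B_gen_inv)

print(is_symmetric(A_orth), is_symmetric(A_gen))
print(trace(A), trace(A_orth), trace(A_gen))
```

In this example the orthogonal transformation keeps the matrix symmetric and the nonorthogonal one does not, while the trace is the same in all three cases (compare Exercise 4.3.9).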
EXERCISES
Note. Assume all matrix elements are real.
4.3.1 Show that the product of two orthogonal matrices is orthogonal.
Note. This is a key step in showing that all n x n orthogonal matrices form a
group (Section 4.10).
4.3.2 If A is orthogonal, show that its determinant has unit magnitude.
4.3.3 If $\mathsf{A}$ is orthogonal and $\det \mathsf{A} = +1$, show that $a_{ij} = C_{ij}$, where $C_{ij}$ is the cofactor
of $a_{ij}$. This yields the identities of Eq. 1.41 used in Section 1.4 to show that a
cross product of vectors (in three-space) is itself a vector.
Hint. Note Exercise 4.2.32.
4.3.4 Another set of Euler rotations in common use is
1. a rotation about the $x_3$-axis through an angle $\varphi$,
counterclockwise,
2. a rotation about the $x_1'$-axis through an angle $\theta$,
counterclockwise, and
3. a rotation about the $x_3''$-axis through an angle $\psi$,
counterclockwise.
If
$$\begin{aligned} \alpha &= \varphi - \pi/2, & \varphi &= \alpha + \pi/2, \\ \beta &= \theta, & \theta &= \beta, \\ \gamma &= \psi + \pi/2, & \psi &= \gamma - \pi/2, \end{aligned}$$
show that the final systems are identical.
4.3.5 Suppose the Earth is moved (rotated) so that the north pole goes to 30° north,
20° west (original latitude and longitude system) and the 10° west meridian
points due south.
(a) What are the Euler angles describing this rotation?
(b) Find the corresponding direction cosines.
ANS. (b)
$$\begin{pmatrix} 0.9551 & -0.2552 & -0.1504 \\ 0.0052 & 0.5221 & -0.8529 \\ 0.2962 & 0.8138 & 0.5000 \end{pmatrix}$$
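4.3.6 Verify that the Euler angle rotation matrix, Eq. 4.87, is invariant under the
transformation
$$\alpha \to \alpha + \pi, \qquad \beta \to -\beta, \qquad \gamma \to \gamma - \pi.$$
4.3.7 Show that the Euler angle rotation matrix $\mathsf{A}(\alpha, \beta, \gamma)$ satisfies the following
relations:
(a) $\mathsf{A}^{-1}(\alpha, \beta, \gamma) = \tilde{\mathsf{A}}(\alpha, \beta, \gamma)$
(b) $\mathsf{A}^{-1}(\alpha, \beta, \gamma) = \mathsf{A}(-\gamma, -\beta, -\alpha)$.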
4.3.8 Show that the trace of the product of a symmetric and an antisymmetric matrix
is zero.
4.3.9 Show that the trace of a matrix remains invariant under similarity
transformations.
4.3.10 Show that the determinant of a matrix remains invariant under similarity
transformations.
Note. These two exercises (4.3.9 and 4.3.10) show that the trace and the
determinant are independent of the basis. They are characteristics of the matrix (operator)
itself.
4.3.11 Show that the property of antisymmetry is invariant under orthogonal similarity
transformations.
4.3.12 $\mathsf{A}$ is $2 \times 2$ and orthogonal. Find the most general form of
$$\mathsf{A} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$
Compare with two-dimensional rotation.
4.3.13 $|x\rangle$ and $|y\rangle$ are column vectors. Under an orthogonal transformation $\mathsf{S}$, $|x'\rangle =
\mathsf{S}|x\rangle$, $|y'\rangle = \mathsf{S}|y\rangle$. Show that the scalar product $\langle x|y\rangle$ is invariant under this
orthogonal transformation.
Note. This is equivalent to the invariance of the dot product of two vectors,
Section 1.3.
4.3.14 Show that the sum of the squares of the elements of a matrix remains invariant
under orthogonal similarity transformations.
Note. In Exercise 3.7.11 $c^2B^2 - E^2$ may be obtained as the sum of the squares
of the components of the matrix (tensor) f^.
4.3.15 As a generalization of Exercise 4.3.14, show that
$$\sum_{j,k} S_{jk}' T_{jk}' = \sum_{l,m} S_{lm} T_{lm},$$
where the primed and unprimed elements are related by an orthogonal similarity
transformation. This result is useful in deriving invariants in electromagnetic
theory (compare Section 3.7).
Note. This product $M_{jk} = S_{jk}T_{jk}$ is sometimes called a Hadamard product.
In the framework of tensor analysis, Chapter 3, this exercise becomes a double
contraction of two second-rank tensors and therefore is clearly a scalar
(invariant)!
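4.3.16 A rotation $\varphi_1 + \varphi_2$ about the $z$-axis is carried out as two successive rotations
$\varphi_1$ and $\varphi_2$, each about the $z$-axis. Use the matrix representation of the rotations
to derive the trigonometric identities:
$$\begin{aligned} \cos(\varphi_1 + \varphi_2) &= \cos\varphi_1 \cos\varphi_2 - \sin\varphi_1 \sin\varphi_2, \\ \sin(\varphi_1 + \varphi_2) &= \sin\varphi_1 \cos\varphi_2 + \cos\varphi_1 \sin\varphi_2. \end{aligned}$$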
4.3.17 A column vector $\mathbf{V}$ has components $V_1$ and $V_2$ in an initial (unprimed) system.
Calculate $V_1'$ and $V_2'$ for a
(a) rotation of the coordinates through an angle of $\theta$ counterclockwise,
(b) rotation of the vector through an angle of $\theta$ clockwise.
The results for parts (a) and (b) should be identical.
4.3.18 Write a subroutine that will test whether a real $N \times N$ matrix is symmetric.
Symmetry may be defined as
$$0 \le |a_{ij} - a_{ji}| < \varepsilon,$$
where $\varepsilon$ is some small tolerance (which allows for truncation error, and so on
in the machine).
4.4 OBLIQUE COORDINATES
Throughout this book so far—vector analysis, coordinate systems, tensor
analysis, and now matrices—we have always taken our coordinates to be
orthogonal. But sometimes the demands of a physical system force the use of
a nonorthogonal or oblique system of coordinates. In describing the physical
properties of a crystal, for example, we might find it more convenient to use
the coordinate system defined by the axes of this crystal—and these axes are
often oblique.
Consider a coordinate system in which the noncoplanar unit vectors $\mathbf{a}$, $\mathbf{b}$,
and $\mathbf{c}$ are not orthogonal. (When we describe a crystal, $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ might not
have unit magnitude either. The interatomic spacings would be more
appropriate lengths.) Then an arbitrary vector may be written
$$\mathbf{V} = \mathbf{i}V_x + \mathbf{j}V_y + \mathbf{k}V_z = \mathbf{a}v_a + \mathbf{b}v_b + \mathbf{c}v_c = \mathbf{v}. \tag{4.102}$$
V will denote the vector expressed in the usual rectangular cartesian system,
whereas v is the same vector expressed in the oblique coordinate system.
Equivalently, we can say that (Vx, Vy, Vz) is the representation in the usual
cartesian basis, whereas (va, vb, vc) is the representation of the same vector in the
nonorthogonal basis.
FIG. 4.5
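The special case (really two-dimensional) of $\mathbf{j}$, $\mathbf{k}$, $\mathbf{b}$, $\mathbf{c}$, and $\mathbf{V}$ all in the $x = 0$
plane is shown in Fig. 4.5. Note carefully that the components $v_b$ and $v_c$ are
found by projecting the tip of $\mathbf{V}$ parallel to $\mathbf{c}$ for $v_b$ and parallel to $\mathbf{b}$ for $v_c$. The
general procedure for obtaining one component would be to pass a plane
through the tip of $\mathbf{V}$ parallel to the plane defined by the other two unit vectors.
With the components defined this way, the sum of the components is just $\mathbf{V}$
by the triangle or parallelogram laws of vector addition, Section 1.1.
We proceed from Eq. 4.102 exactly as in Section 4.3 with $\mathbf{a}$ instead of $\mathbf{i}'$, $\mathbf{b}$
instead of $\mathbf{j}'$, and $\mathbf{c}$ in place of $\mathbf{k}'$. From
$$\begin{aligned} \mathbf{a} &= \mathbf{i}a_x + \mathbf{j}a_y + \mathbf{k}a_z, \\ \mathbf{b} &= \mathbf{i}b_x + \mathbf{j}b_y + \mathbf{k}b_z, \\ \mathbf{c} &= \mathbf{i}c_x + \mathbf{j}c_y + \mathbf{k}c_z, \end{aligned} \tag{4.103}$$
equating cartesian components, we obtain
$$\begin{aligned} V_x &= a_x v_a + b_x v_b + c_x v_c, \\ V_y &= a_y v_a + b_y v_b + c_y v_c, \\ V_z &= a_z v_a + b_z v_b + c_z v_c. \end{aligned} \tag{4.104}$$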
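In matrix form the vector $\mathbf{V}$ described by an orthogonal basis is related to
its description in the oblique basis by
$$\mathbf{V} = \mathsf{P}\mathbf{v}, \tag{4.105}$$
where
$$\mathsf{P} = \begin{pmatrix} a_x & b_x & c_x \\ a_y & b_y & c_y \\ a_z & b_z & c_z \end{pmatrix}. \tag{4.106}$$
The transformation matrix $\mathsf{P}$ is not orthogonal, since the column vectors
forming it, $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$, are not orthogonal.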
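Since
$$\mathbf{v} = \mathsf{P}^{-1}\mathbf{V}, \tag{4.107}$$
we seek $\mathsf{P}^{-1}$. The solution is actually developed in Section 1.5. The reciprocal
lattice vectors
$$\mathbf{a}' = \frac{\mathbf{b} \times \mathbf{c}}{\mathbf{a} \times \mathbf{b} \cdot \mathbf{c}}, \qquad \mathbf{b}' = \frac{\mathbf{c} \times \mathbf{a}}{\mathbf{a} \times \mathbf{b} \cdot \mathbf{c}}, \qquad \mathbf{c}' = \frac{\mathbf{a} \times \mathbf{b}}{\mathbf{a} \times \mathbf{b} \cdot \mathbf{c}}, \tag{4.108}$$
taken as row vectors, form a matrix $\mathsf{Q}$,
$$\mathsf{Q} = \begin{pmatrix} a_x' & a_y' & a_z' \\ b_x' & b_y' & b_z' \\ c_x' & c_y' & c_z' \end{pmatrix}. \tag{4.109}$$
It should be emphasized that $\mathbf{a}'$, $\mathbf{b}'$, and $\mathbf{c}'$ are not orthogonal. Also, they are
not of unit length, and if $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ have dimensions, then $\mathbf{a}'$, $\mathbf{b}'$, and $\mathbf{c}'$ have
reciprocal dimensions. If $a$ is a length, $a'$ could be a wave number. From the
properties developed in Section 1.5
$$\mathsf{P}\mathsf{Q} = \mathsf{Q}\mathsf{P} = 1 \tag{4.110}$$
or
$$\mathsf{Q} = \mathsf{P}^{-1}, \qquad \mathsf{P} = \mathsf{Q}^{-1}. \tag{4.111}$$
Exercise 4.4.1 outlines a slightly different, but equivalent, derivation of $\mathsf{Q}$.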
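From Eqs. 4.107 and 4.111
$$\mathbf{v} = \mathsf{Q}\mathbf{V}. \tag{4.112}$$
Taking the transpose of Eqs. 4.105 and 4.112, we have
$$\langle V| = \langle v|\tilde{\mathsf{P}}, \qquad \langle v| = \langle V|\tilde{\mathsf{Q}}, \tag{4.113}$$
$\langle\ |$ denoting a row vector, as in Section 4.2.
$\mathbf{V}$ may be resolved in the $\mathbf{a}'$, $\mathbf{b}'$, $\mathbf{c}'$-space (the reciprocal lattice) exactly as in
the $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$-space. From the primed analog of Eqs. 4.102 to 4.104
$$\mathbf{V} = \tilde{\mathsf{Q}}\mathbf{v}', \qquad \mathbf{v}' = \tilde{\mathsf{P}}\mathbf{V}, \tag{4.114}$$
and
$$\langle V| = \langle v'|\mathsf{Q}, \qquad \langle v'| = \langle V|\mathsf{P}. \tag{4.115}$$
The scalar product of two vectors $\mathbf{U}$ and $\mathbf{V}$ becomes
$$\mathbf{U} \cdot \mathbf{V} = \langle U|V\rangle = \langle u'|v\rangle \tag{4.116}$$
from Eqs. 4.112 and 4.115. $|\ \rangle$ denotes a column vector. The square of a
vector in oblique coordinates is not the sum of the squares of the components,
but rather, the sum of the products of an oblique component and the
corresponding reciprocal lattice component.
If $\mathbf{U}$ and $\mathbf{V}$ in Eq. 4.116 are the differential length $d\mathbf{R} = (dx, dy, dz)$, then
$$ds^2 = \langle dR|dR\rangle = \langle dr|\tilde{\mathsf{P}}\mathsf{P}|dr\rangle, \tag{4.117}$$
using Eqs. 4.105 and 4.113. $ds^2$ is the square of the distance element; $d\mathbf{r}$ is $d\mathbf{R}$
but resolved in the oblique coordinates. Reference to Eq. 2.4 identifies $\tilde{\mathsf{P}}\mathsf{P}$ as
the metric of our oblique coordinates. The metric of the reciprocal lattice is
$\mathsf{Q}\tilde{\mathsf{Q}}$.
Further development of vector analysis, particularly of a vector calculus in
oblique coordinates, is probably best considered a branch of noncartesian
tensor analysis, Sections 3.8 and 3.9.
$\mathbf{v} = (v_a, v_b, v_c)$ is a contravariant vector in the language of Section 3.1. The
corresponding covariant components are $(v_a', v_b', v_c')$ in the reciprocal lattice.
From Eqs. 4.105, 4.112, and 4.114
$$|v'\rangle = \tilde{\mathsf{P}}\mathsf{P}|v\rangle \qquad \text{and} \qquad |v\rangle = \mathsf{Q}\tilde{\mathsf{Q}}|v'\rangle. \tag{4.118}$$
The metric $\tilde{\mathsf{P}}\mathsf{P}$ transforms the contravariant vector into covariant form. Its
inverse, $\mathsf{Q}\tilde{\mathsf{Q}}$, transforms the covariant vector into contravariant form. In
contravariant-covariant tensor notation (Section 3.1) the elements of $\tilde{\mathsf{P}}\mathsf{P}$ are
$g_{ij}$, whereas the elements of $\mathsf{Q}\tilde{\mathsf{Q}}$ are $g^{ij}$. We have
$$\begin{aligned} (ds)^2 &= g_{ij}\, dx^i\, dx^j = g^{ij}\, dx_i\, dx_j, \\ g_{ij}\, g^{jk} &= \delta_i^{\,k}, \\ v_i \text{ (covariant)} &= g_{ij} v^j, \\ v^i \text{ (contravariant)} &= g^{ij} v_j. \end{aligned}$$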
The reader should note that the distinction between covariant and contra-
variant forms vanishes when the coordinates are orthogonal (cartesian).
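The oblique-coordinate relations above can be traced through numerically. The sketch below uses the basis of Exercise 4.4.2 ($\mathbf{a} = \mathbf{i}$, $\mathbf{b} = \mathbf{j}$, $\mathbf{c} = (\mathbf{j} + \mathbf{k})/\sqrt{2}$) and the vector $\mathbf{V} = \mathbf{i} + 3\mathbf{j} + 2\mathbf{k}$ as test data; all function and variable names are assumptions made for this illustration.

```python
import math

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]

def matvec(m, x):
    return [dot(row, x) for row in m]

# Oblique basis vectors (cartesian components).
a = [1.0, 0.0, 0.0]
b = [0.0, 1.0, 0.0]
c = [0.0, 1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]

P = transpose([a, b, c])                    # columns are a, b, c (Eq. 4.106)
vol = dot(cross(a, b), c)                   # a x b . c
Q = [[x / vol for x in cross(b, c)],        # rows are a', b', c' (Eqs. 4.108, 4.109)
     [x / vol for x in cross(c, a)],
     [x / vol for x in cross(a, b)]]

V = [1.0, 3.0, 2.0]
v = matvec(Q, V)                            # contravariant components, Eq. 4.112
v_prime = matvec(transpose(P), V)           # covariant components, Eq. 4.114
metric = matmul(transpose(P), P)            # metric, as in Eq. 4.117

print(matmul(Q, P))                         # should approximate the unit matrix
print(dot(v_prime, v), dot(V, V))           # the two numbers should agree
```

The product `matmul(Q, P)` should approximate the unit matrix (Eq. 4.110), and the sum of products of covariant and contravariant components should reproduce $V^2$, in line with Eq. 4.116.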
EXERCISES
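4.4.1 From the result of Exercise 4.2.32, $q_{ij} = P_{ji}/|\mathsf{P}|$, derive the relation
$$\mathbf{a}' = \frac{\mathbf{b} \times \mathbf{c}}{\mathbf{a} \times \mathbf{b} \cdot \mathbf{c}}.$$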
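4.4.2 The vectors defining a particular system of oblique coordinates are
$$\mathbf{a} = \mathbf{i}, \qquad \mathbf{b} = \mathbf{j}, \qquad \text{and} \qquad \mathbf{c} = (\mathbf{j} + \mathbf{k})/\sqrt{2}.$$
(a) Find $\mathsf{P}$, $\mathsf{Q}$, and metric $\tilde{\mathsf{P}}\mathsf{P}$.
(b) If $\mathbf{V} = \mathbf{i} + 3\mathbf{j} + 2\mathbf{k}$, find $\mathbf{v}$ and $\mathbf{v}'$. Verify that
$$\langle v'|v\rangle = V^2.$$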
4.4.3 Show that
(a) $v_a' = \mathbf{a} \cdot \mathbf{V}$
(b) $v_a = \mathbf{a}' \cdot \mathbf{V}$.
Note that the lattice defining vectors $\mathbf{a}$, $\mathbf{a}'$, and so on need not have unit magnitude.
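4.4.4 One vector with cartesian components $V_i$ and oblique (contravariant) components
$v_i$ and a second with cartesian components $U_i$ and reciprocal lattice (covariant)
components $u_i$ are transformed by a rotation of the coordinate systems described
by the (orthogonal) matrix $\mathsf{S}$. By definition of a vector
$$|V'\rangle = \mathsf{S}|V\rangle \qquad \text{and} \qquad |U'\rangle = \mathsf{S}|U\rangle.$$
(a) Show that
$$\begin{aligned} |v'\rangle \text{ (contravariant)} &= \mathsf{Q}\mathsf{S}\mathsf{P}|v\rangle, \\ |u'\rangle \text{ (covariant)} &= \tilde{\mathsf{P}}\mathsf{S}\tilde{\mathsf{Q}}|u\rangle. \end{aligned}$$
(b) Show that $\langle u'|v'\rangle$ is an invariant, independent of $\mathsf{S}$.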
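4.4.5 Show that the metric for contravariant vectors, $(g_{ij}) = \tilde{\mathsf{P}}\mathsf{P}$, is given by
$$(g_{ij}) = \begin{pmatrix} \mathbf{a} \cdot \mathbf{a} & \mathbf{a} \cdot \mathbf{b} & \mathbf{a} \cdot \mathbf{c} \\ \mathbf{b} \cdot \mathbf{a} & \mathbf{b} \cdot \mathbf{b} & \mathbf{b} \cdot \mathbf{c} \\ \mathbf{c} \cdot \mathbf{a} & \mathbf{c} \cdot \mathbf{b} & \mathbf{c} \cdot \mathbf{c} \end{pmatrix}.$$
For oblique coordinates all these dot products and therefore all the $g_{ij}$'s are
constants.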
4.5 HERMITIAN MATRICES, UNITARY MATRICES
Definitions
Thus far it has generally been assumed that our linear vector space is a real
space and that the matrix elements (the representations of the linear operators)
are real. For many calculations in classical physics real matrix elements will
suffice. However, in quantum mechanics complex variables are unavoidable
because of the form of the basic commutation relations (or the form of the
time-dependent Schrodinger equation). With this in mind, we generalize to
the case of complex matrix elements. To handle these elements, let us define,
or label, some new properties.
1. Complex conjugate, $\mathsf{A}^*$, formed by taking the
complex conjugate ($i \to -i$) of each element, where
$i = \sqrt{-1}$.
2. Adjoint, $\mathsf{A}^\dagger$, formed by transposing $\mathsf{A}^*$,
$$\mathsf{A}^\dagger = \widetilde{\mathsf{A}^*} = \tilde{\mathsf{A}}^*. \tag{4.119}$$
3. Hermitian matrix. The matrix $\mathsf{A}$ is labeled Hermitian
(or self-adjoint) if
$$\mathsf{A} = \mathsf{A}^\dagger. \tag{4.120}$$
In quantum mechanics (or matrix mechanics)
matrices are usually constructed to be Hermitian.
4. Unitary matrix. Matrix $\mathsf{U}$ is labeled unitary if
$$\mathsf{U}^\dagger = \mathsf{U}^{-1}, \tag{4.121}$$
which represents a generalization of the concept of
orthogonal matrix (compare Eq. 4.73).
If the matrix elements are complex, the physicist is almost always concerned
with adjoint matrices, Hermitian matrices, and unitary matrices. Unitary
matrices are especially important in quantum mechanics because they leave
the length of a (complex) vector unchanged, analogous to the operation of an
orthogonal matrix on a real vector. It is for this reason that the S matrix of
scattering theory is a unitary matrix. One important exception to this interest
in unitary matrices is the group of Lorentz matrices, Sections 3.7 and 4.13.
Using Minkowski space, we see that these matrices are orthogonal, not unitary.
If the transforming matrix in a similarity transformation is unitary, the
transformation is referred to as a unitary transformation,
$$\mathsf{A}' = \mathsf{U}\mathsf{A}\mathsf{U}^\dagger. \tag{4.122}$$
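Just as the product of two orthogonal matrices is found to be orthogonal
(Exercise 4.3.1), so we can show that the product of two unitary matrices is
unitary. Let $\mathsf{U}_1$ and $\mathsf{U}_2$ be unitary. Then
$$\begin{aligned} 1 &= (\mathsf{U}_1\mathsf{U}_2)(\mathsf{U}_1\mathsf{U}_2)^{-1} \\ &= \mathsf{U}_1\mathsf{U}_2\mathsf{U}_2^{-1}\mathsf{U}_1^{-1} \\ &= \mathsf{U}_1\mathsf{U}_2\mathsf{U}_2^\dagger\mathsf{U}_1^\dagger, \end{aligned} \tag{4.123}$$
using the unitary property. Since the operation of adjoint is the same as
transpose (except for the complex conjugate),
$$\mathsf{U}_2^\dagger\mathsf{U}_1^\dagger = (\mathsf{U}_1\mathsf{U}_2)^\dagger \tag{4.124}$$
(Exercise 4.5.3). Substituting into Eq. 4.123, we have
$$1 = (\mathsf{U}_1\mathsf{U}_2)(\mathsf{U}_1\mathsf{U}_2)^\dagger. \tag{4.125}$$
Multiplying from the left by $(\mathsf{U}_1\mathsf{U}_2)^{-1}$, we obtain
$$(\mathsf{U}_1\mathsf{U}_2)^{-1} = (\mathsf{U}_1\mathsf{U}_2)^\dagger, \tag{4.126}$$
which shows that the product of two unitary matrices is itself unitary. This is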
one of the steps in demonstrating that the n x n unitary matrices form a group
(Section 4.10). Other properties and applications of these concepts are included
in the exercises at the end of this section.
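The closure of unitary matrices under multiplication can be illustrated with complex 2 x 2 matrices. In this sketch the two unitary matrices `U1` and `U2`, the non-unitary counterexample `N`, and the helper names are all assumptions chosen for the example.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]

def adjoint(a):
    """Transpose followed by complex conjugation (Eq. 4.119)."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def is_unitary(u, tol=1e-12):
    """Test whether the adjoint times u is close to the unit matrix."""
    p = matmul(adjoint(u), u)
    n = len(u)
    return all(abs(p[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

r = 1.0 / math.sqrt(2.0)
U1 = [[r + 0j, 1j * r], [1j * r, r + 0j]]
U2 = [[cmath.exp(0.5j), 0j], [0j, cmath.exp(-0.5j)]]
N = [[1 + 0j, 1 + 0j], [0j, 1 + 0j]]         # not unitary, for contrast

print(is_unitary(U1), is_unitary(U2), is_unitary(matmul(U1, U2)))
print(is_unitary(N))
```

The product `matmul(U1, U2)` passes the same test as its factors, which is the content of Eq. 4.126 for this particular pair.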
Pauli Matrices
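Four by four complex matrices have been used extensively in relativistic
theories of the electron. A convenient starting point for developing the $4 \times 4$
matrices is the set of three $2 \times 2$ Pauli matrices
$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{4.127}$$
These were introduced by W. Pauli to describe a particle of spin $\tfrac{1}{2}$ (nonrelativistic
theory). It can readily be shown that (compare Exercise 4.2.13) the Pauli $\sigma$'s
satisfy
$$\sigma_i\sigma_j + \sigma_j\sigma_i = 2\delta_{ij}, \qquad \text{anticommutation} \tag{4.128}$$
$$\sigma_i\sigma_j = i\sigma_k, \qquad \text{cyclic permutation of indices} \tag{4.129}$$
$$(\sigma_i)^2 = 1. \tag{4.130}$$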
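The three Pauli relations can be confirmed by direct multiplication. The following sketch stores the standard Pauli matrices as nested lists of complex numbers; the helper names are hypothetical, and the right-hand side of the anticommutation relation is taken as $2\delta_{ij}$ times the 2 x 2 unit matrix.

```python
def matmul(a, b):
    return [[sum(a[i][j] * b[j][k] for j in range(2)) for k in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(z, a):
    return [[z * a[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]
s1 = [[0j, 1 + 0j], [1 + 0j, 0j]]
s2 = [[0j, -1j], [1j, 0j]]
s3 = [[1 + 0j, 0j], [0j, -1 + 0j]]
sigma = [s1, s2, s3]

# Anticommutation: sigma_i sigma_j + sigma_j sigma_i = 2 delta_ij (times the unit matrix).
anticommute_ok = all(
    close(add(matmul(sigma[i], sigma[j]), matmul(sigma[j], sigma[i])),
          scale(2.0 if i == j else 0.0, I2))
    for i in range(3) for j in range(3))

# Cyclic products: sigma_1 sigma_2 = i sigma_3, and cyclic permutations.
cyclic_ok = all(
    close(matmul(sigma[i], sigma[(i + 1) % 3]), scale(1j, sigma[(i + 2) % 3]))
    for i in range(3))

print(anticommute_ok, cyclic_ok)
```

Both checks should report `True`; the case $i = j$ of the anticommutation test is the statement that each Pauli matrix squares to the unit matrix.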
Dirac Matrices
In 1927 P. A. M. Dirac extended this formalism. Dirac required a set of
four anticommuting matrices. The three Pauli matrices plus the unit matrix
form a complete set; that is, any constant $2 \times 2$ matrix $\mathsf{M}$ may be written
$$\mathsf{M} = c_0 1 + c_1\sigma_1 + c_2\sigma_2 + c_3\sigma_3, \tag{4.131}$$
where $c_0$, $c_1$, $c_2$, and $c_3$ are constants. Hence the Pauli $2 \times 2$ matrices were
inadequate; no fourth anticommuting matrix exists. We can show that $3 \times 3$
matrices likewise cannot furnish an anticommuting set of four matrices
(Exercise 4.7.8).
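Turning to $4 \times 4$ matrices, we can build up a complete set as direct products$^1$
of the Pauli matrices and the unit matrix. Let
$$\sigma_{i,\,\text{Dirac}} = 1 \otimes \sigma_{i,\,\text{Pauli}}, \tag{4.132}$$
$$\rho_{i,\,\text{Dirac}} = \sigma_{i,\,\text{Pauli}} \otimes 1. \tag{4.133}$$
For example,
$$\sigma_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}.$$
We can show that these $4 \times 4$ matrices satisfy the relations
$$\sigma_i\sigma_j + \sigma_j\sigma_i = 2\delta_{ij}1, \qquad \text{anticommutation,} \tag{4.134}$$
$$\sigma_i\rho_j - \rho_j\sigma_i = [\sigma_i, \rho_j] = 0, \qquad \text{commutation,} \tag{4.135}$$
and
$$\sigma_i\sigma_j = i\sigma_k, \qquad \rho_i\rho_j = i\rho_k, \qquad \text{cyclic permutation of indices.} \tag{4.136}$$
1 The direct product $\mathsf{A} \otimes \mathsf{B}$ is defined in Section 4.2.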
It is now possible to set up a matrix multiplication table (Table 4.1).
Dirac originally chose to use the set of four matrices labeled $\alpha_1$, $\alpha_2$, $\alpha_3$, and
$\alpha_4$, where $\alpha_i = \rho_1\sigma_i$ and $\alpha_4 = \rho_3$. Today the set labeled $\gamma_i$, $i = 1, 2, 3, 4, 5$, is
in more common use.
These $4 \times 4$ Dirac matrices may be referred to as $\mathsf{E}_{ij}$, in which$^2$
$$\mathsf{E}_{ij} = \rho_i\sigma_j.$$
With the understanding that $\rho_0 = \sigma_0 = 1$, the unit matrix, we let the indices $i$
and $j$ range from 0 to 3. These 16 matrices $\mathsf{E}_{ij}$ have a number of interesting
properties:
1. $\det \mathsf{E}_{ij} = +1$.
2. $\mathsf{E}_{ij}^2 = 1$.
3. $\mathsf{E}_{ij} = \mathsf{E}_{ij}^\dagger$; all are Hermitian and then, by property 2,
unitary.
4. Trace $(\mathsf{E}_{ij}) = 0$ except for $\mathsf{E}_{00} = 1$, in which case
trace $(\mathsf{E}_{00}) = 4$. This property is exploited in Exercise
4.5.23 as the matrix analog of orthogonality.
5. The 16 $\mathsf{E}_{ij}$ matrices almost form a mathematical
group.$^3$ Any two of them multiplied together yield
a member of the set within a factor of $-1$ or $\pm i$.
6. The 16 $\mathsf{E}_{ij}$ are linearly independent. No one can be
written as a linear sum of the other 15.
7. The 16 $\mathsf{E}_{ij}$ form a complete set. Any $4 \times 4$ matrix
(with constant elements) may be written as a linear
combination of these 16,
$$\mathsf{A} = \sum_{i,j=0}^{3} c_{ij}\mathsf{E}_{ij},$$
where the coefficients $c_{ij}$ are constants, real or
complex.
2 $\mathsf{E}_{ij} = \sigma_{i,\,\text{Pauli}} \otimes \sigma_{j,\,\text{Pauli}}$.
3 The $\mathsf{E}_{ij}$ can be modified so that they satisfy the group property exactly, but
then they are no longer Hermitian and unitary.
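Several of the listed properties can be checked by constructing the 16 matrices explicitly. The sketch below assumes the direct-product construction described in the text, with $\rho_i = \sigma_i \otimes 1$ and the Dirac $\sigma_j = 1 \otimes \sigma_j$, so that each matrix is the product $\rho_i\sigma_j$; the helper names `kron`, `adjoint`, and `close` are hypothetical.

```python
def matmul(a, b):
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Direct product: blocks a[i][j] * b, rows indexed by (i, k), columns by (j, l)."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def adjoint(a):
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]
pauli = [I2,                                   # index 0 is the unit matrix
         [[0j, 1 + 0j], [1 + 0j, 0j]],
         [[0j, -1j], [1j, 0j]],
         [[1 + 0j, 0j], [0j, -1 + 0j]]]
I4 = kron(I2, I2)

# E[i][j] = rho_i sigma_j, with rho_i = sigma_i (x) 1 and sigma_j = 1 (x) sigma_j.
E = [[matmul(kron(pauli[i], I2), kron(I2, pauli[j])) for j in range(4)]
     for i in range(4)]

squares_ok = all(close(matmul(E[i][j], E[i][j]), I4)
                 for i in range(4) for j in range(4))          # property 2
hermitian_ok = all(close(adjoint(E[i][j]), E[i][j])
                   for i in range(4) for j in range(4))        # property 3
traces = [[sum(E[i][j][n][n] for n in range(4)) for j in range(4)]
          for i in range(4)]                                   # property 4
print(squares_ok, hermitian_ok)
print(traces[0][0], traces[1][2])
```

Only properties 2, 3, and 4 are tested here; the determinant and completeness properties would need additional code.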
TABLE 4.1 Dirac Matrices
Anticommuting Sets
From these 16 Hermitian matrices we can form six anticommuting sets of
5 matrices each. Using the labels shown in Table 4.1, we have the following
sets:
1. $\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5$.
2. $\gamma_1, \gamma_2, \gamma_3, \gamma_4, \gamma_5$.
3. $\delta_1, \delta_2, \delta_3, \rho_1, \rho_2$.
4. $\alpha_1, \gamma_1, \delta_1, \sigma_2, \sigma_3$.
5. $\alpha_2, \gamma_2, \delta_2, \sigma_1, \sigma_3$.
6. $\alpha_3, \gamma_3, \delta_3, \sigma_1, \sigma_2$.
Each $\mathsf{E}_{ij}$ (exclusive of the unit matrix) appears in two of the preceding sets. In
addition to the set of $\alpha$'s, the set of $\gamma$'s has been used extensively in relativistic
quantum theory.
The largest completely commuting sets of Dirac matrices (including the unit
matrix) have only four matrices.
The discussion of orthogonal matrices in Section 4.3 and unitary matrices
in this section is only a beginning. The further extensions are of vital
concern in modern "elementary" particle physics. With the Pauli and Dirac
matrices, we can develop spinors for describing electrons, protons, and other
spin $\tfrac{1}{2}$ particles. The coordinate system rotations lead to $D^j(\alpha, \beta, \gamma)$, the rotation
group usually represented by matrices in which the elements are functions of
the Euler angles describing the rotation. The special unitary group SU(3)
(composed of $3 \times 3$ unitary matrices with determinant $+1$) has been used
with considerable success to describe mesons and baryons. These extensions
are considered further in Sections 4.10 to 4.12.
EXERCISES
4.5.1 Show that
$$\det(\mathsf{A}^*) = (\det \mathsf{A})^* = \det(\mathsf{A}^\dagger).$$
4.5.2 Three angular momentum matrices satisfy the basic commutation relation
$$[\mathsf{J}_x, \mathsf{J}_y] = i\mathsf{J}_z$$
(and cyclic permutation of indices). If two of the matrices have real elements,
show that the elements of the third must be pure imaginary.
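4.5.3 Show that $(\mathsf{A}\mathsf{B})^\dagger = \mathsf{B}^\dagger\mathsf{A}^\dagger$.
4.5.4 Matrix $\mathsf{C} = \mathsf{S}^\dagger\mathsf{S}$. Show that the trace is positive definite unless $\mathsf{S}$ is the null
matrix, in which case trace $(\mathsf{C}) = 0$.
4.5.5 If $\mathsf{A}$ and $\mathsf{B}$ are Hermitian matrices, show that $(\mathsf{A}\mathsf{B} + \mathsf{B}\mathsf{A})$ and $i(\mathsf{A}\mathsf{B} - \mathsf{B}\mathsf{A})$
are also Hermitian.
4.5.6 Matrix $\mathsf{C}$ is not Hermitian. Show that $\mathsf{C} + \mathsf{C}^\dagger$ and $i(\mathsf{C} - \mathsf{C}^\dagger)$ are Hermitian.
This means that a non-Hermitian matrix may be resolved into two Hermitian
parts:
$$\mathsf{C} = \frac{1}{2}(\mathsf{C} + \mathsf{C}^\dagger) + \frac{1}{2i}\, i(\mathsf{C} - \mathsf{C}^\dagger).$$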
4.5.7 $\mathsf{A}$ and $\mathsf{B}$ are two noncommuting Hermitian matrices:
$$\mathsf{A}\mathsf{B} - \mathsf{B}\mathsf{A} = i\mathsf{C}.$$
Prove that $\mathsf{C}$ is Hermitian.
4.5.8 Show that a Hermitian matrix remains Hermitian under unitary similarity
transformations.
4.5.9 Two matrices $\mathsf{A}$ and $\mathsf{B}$ are each Hermitian. Find a necessary and sufficient
condition for their product $\mathsf{A}\mathsf{B}$ to be Hermitian.
ANS. $[\mathsf{A}, \mathsf{B}] = 0$.
4.5.10 Show that the reciprocal of a unitary matrix is unitary.
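4.5.11 A particular similarity transformation yields
$$\mathsf{A}' = \mathsf{U}\mathsf{A}\mathsf{U}^{-1}.$$
If the adjoint relationship is preserved ($\mathsf{A}^{\dagger\prime} = \mathsf{A}'^{\dagger}$) and $\det \mathsf{U} = 1$, show that $\mathsf{U}$
must be unitary.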
4.5.12 Two matrices $\mathsf{U}$ and $\mathsf{H}$ are related by
$$\mathsf{U} = e^{ia\mathsf{H}}$$
with $a$ real. (The exponential function is defined by a Maclaurin expansion. This
will be done in Section 4.11.)
(a) If $\mathsf{H}$ is Hermitian, show that $\mathsf{U}$ is unitary.
(b) If $\mathsf{U}$ is unitary, show that $\mathsf{H}$ is Hermitian. ($\mathsf{H}$ is independent of $a$.)
Note. With $\mathsf{H}$ the Hamiltonian,
$$\psi(x, t) = \mathsf{U}(x, t)\psi(x, 0) = \exp(-it\mathsf{H}/\hbar)\psi(x, 0)$$
is a solution of the time-dependent Schrödinger equation. $\mathsf{U}(x, t) = \exp(-it\mathsf{H}/\hbar)$
is the "evolution operator."
4.5.13 An operator $T(t + \varepsilon, t)$ describes the change in the wave function from t to
$t + \varepsilon$. For $\varepsilon$ real and small enough so that $\varepsilon^2$ may be neglected,
\[ T(t + \varepsilon, t) = 1 - \frac{i}{\hbar}\,\varepsilon H(t). \]
(a) If T is unitary, show that H is Hermitian.
(b) If H is Hermitian, show that T is unitary.
Note. When H{t) is independent of time, this relation may be put in exponential
form—Exercise 4.5.12.
4.5.14 Show that an alternate form
\[ T(t + \varepsilon, t) = \frac{1 - i\varepsilon H(t)/2\hbar}{1 + i\varepsilon H(t)/2\hbar} \]
agrees with the T of part (a) of Exercise 4.5.13, neglecting $\varepsilon^2$, and is exactly unitary
(for H Hermitian).
4.5.15 Prove that the direct product of two unitary matrices is unitary.
4.5.16 Denoting the 16 Dirac matrices by $E_{ij} = \rho_i \sigma_j$ ($\rho_0 = \sigma_0 = 1$), show that
(a) $E_{ij}^2 = 1$ for all i and j,
(b) $E_{ij}^\dagger = E_{ij}$ (Hermitian).
Hint. Use the known properties of $\rho_i$ and $\sigma_j$.
4.5.17 Verify Eqs. 4.134 to 4.136 for the 4 x 4 $\sigma$ and $\rho$ matrices.
4.5.18 Using Eqs. 4.135 and 4.136, show that each of the six sets of Dirac matrices
listed in Eq. 4.137 is actually an anticommuting set.
4.5.19 Using Eqs. 4.135 and 4.136, show that
(a) $\alpha_1\alpha_2\alpha_3\alpha_4\alpha_5 = +1$,
(b) $\gamma_1\gamma_2\gamma_3\gamma_4\gamma_5 = +1$.
4.5.20 If $M = \frac{1}{2}(1 + \gamma_5)$, show that
\[ M^2 = M. \]
Note that $\gamma_5$ may be replaced by any other Dirac matrix (any $E_{ij}$ of Table 4.1).
If M is Hermitian, then this result, $M^2 = M$, is the defining equation for a
quantum mechanical projection operator.
4.5.21 Show that
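\[ \boldsymbol{\sigma} \times \boldsymbol{\sigma} = 2i\boldsymbol{\sigma}, \]
where $\boldsymbol{\sigma}$ is a vector whose components are the $\sigma$ matrices,
\[ \boldsymbol{\sigma} = (\sigma_1, \sigma_2, \sigma_3). \]
Note that if $\boldsymbol{\alpha}$ is a polar vector (Section 3.4), then $\boldsymbol{\sigma}$ is an axial vector.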
4.5.22 Prove that the 16 Dirac matrices form a linearly independent set.
Hint. Assume the contrary. Let $E_{mn}$ be a linear combination of the other $E_{ij}$'s.
Multiply by $E_{mn}$. Take the trace and show that a contradiction results.
4.5.23 (a) If we assume that a given 4 x 4 matrix, A (with constant elements), can
be written as a linear combination of the 16 Dirac matrices
\[ A = \sum_{i,j=0}^{3} c_{ij} E_{ij}, \]
show that
\[ c_{mn} = \tfrac{1}{4}\,\mathrm{trace}(A E_{mn}). \]
(b) If A has one and only one nonvanishing element, show that there will be
exactly four nonvanishing coefficients in its expansion.
(c) Expand
\[ A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]
in terms of the $E_{ij}$.
ANS. $A = \tfrac{1}{4}(E_{00} + E_{03} + E_{30} + E_{33})$.
4.5.24 If A is any one of the Dirac matrices (excluding the unit matrix), it will commute
with eight of the Dirac matrices and anticommute with the other eight. List
the eight matrices that anticommute with $\gamma_1$.
ANS. a2, ст3, Pi, or,, Yi, Y3, Рз, <*i-
4.5.25 For investigating questions of covariance under Lorentz transformations, one
usually expresses the Dirac electron theory in terms of $\gamma_\mu$, $\mu = 1, 2, 3, 4$. Show
that these four matrices together with their products
(a) $\gamma_\mu\gamma_\nu$, $\mu \neq \nu$,
(b) $\gamma_\lambda\gamma_\mu\gamma_\nu$, indices all different,
(c) $\gamma_1\gamma_2\gamma_3\gamma_4$,
and the unit matrix 1 reproduce all 16 Dirac matrices (apart from constant
factors).
Note. In beta decay theory 1 is used to describe a scalar interaction, the four
$\gamma_\mu$'s a vector interaction, the six double products ($\gamma_\mu\gamma_\nu$) a tensor interaction, the
four triple products ($\gamma_\lambda\gamma_\mu\gamma_\nu$) an axial vector interaction, and the product $\gamma_5 =
\gamma_1\gamma_2\gamma_3\gamma_4$ a pseudoscalar interaction. Experiment shows the actual interaction
is a linear combination of vector and axial vector, not conserving parity.
4.5.26 (a) Given r' = Ur, with U a unitary matrix and r a (column) vector with complex
elements, show that the norm (magnitude) of r is invariant under this
operation.
(b) The matrix U transforms any column vector r with complex elements into
r', leaving the magnitude invariant: $r^\dagger r = r'^\dagger r'$. Show that U is unitary.
4.5.27 Write a subroutine that will test whether a complex N x N matrix is self-adjoint.
In demanding equality of matrix elements $a_{ij} = a_{ji}^*$, allow some small tolerance
$\varepsilon$ to compensate for truncation error, and so on, in the machine.
4.5.28 Write a subroutine that will form the adjoint of a complex M x N matrix.
4.5.29 (a) Write a subroutine that will take a complex M x N matrix A and will yield
the product $A^\dagger A$.
Hint. This subroutine can call the subroutines of Exercises 4.2.41 and 4.5.28.
(b) Test your subroutine by taking A to be one or more of the Dirac matrices,
Table 4.1.
4.6 DIAGONALIZATION OF MATRICES
Moment of Inertia Matrix
In many physical problems involving matrices it is desirable to carry out a
(real) orthogonal similarity transformation or a unitary transformation to
reduce the matrix to a diagonal form, nondiagonal elements all equal to zero.
One particularly direct example of this is the moment of inertia matrix I of a
rigid body. From the definition of angular momentum L we have
\[ \mathbf{L} = \mathsf{I}\boldsymbol{\omega}, \tag{4.138} \]
$\boldsymbol{\omega}$ being the angular velocity.1 The inertia matrix I is found to have diagonal
components
\[ I_{xx} = \sum_i m_i (r_i^2 - x_i^2), \quad \text{and so on,} \tag{4.139} \]
the subscript i referring to mass $m_i$ located at $\mathbf{r}_i = (x_i, y_i, z_i)$. For the nondiagonal
components we have the products of inertia,
\[ I_{xy} = I_{yx} = -\sum_i m_i x_i y_i, \quad \text{and so on.} \tag{4.140} \]
1 The moment of inertia matrix may also be developed from the kinetic
energy of a rotating body, $T = \frac{1}{2}\langle\omega|\mathsf{I}|\omega\rangle$.
FIG. 4.6 Moment of inertia ellipsoid
By inspection, matrix I is symmetric. Also, since I appears in a physical equation
of the form (4.138), which holds for all orientations of the coordinate system,
it may be considered to be a tensor (quotient rule, Section 3.3).
The problem now is to orient the coordinate axes in space so that the Ixy and
the other nondiagonal elements will vanish. As a consequence of this orientation
and an indication of it, if the angular velocity is along one such realigned axis,
the angular velocity and the angular momentum will be parallel.
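As a concrete illustration of the definitions above, the following minimal Python sketch assembles the inertia matrix of a set of point masses from the diagonal components $\sum m_i(r_i^2 - x_i^2)$ and the products of inertia $-\sum m_i x_i y_i$. The function name `inertia_matrix` and the two sample masses are illustrative assumptions, not part of the text.

```python
def inertia_matrix(masses, positions):
    """Return the 3 x 3 inertia matrix (list of rows) for point masses.

    masses    -- sequence of masses m_i
    positions -- sequence of (x_i, y_i, z_i) tuples
    """
    I = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = sum(c * c for c in r)  # r_i^2
        for a in range(3):
            for b in range(3):
                # Diagonal: m (r^2 - x_a^2); off-diagonal: -m x_a x_b
                delta = 1.0 if a == b else 0.0
                I[a][b] += m * (r2 * delta - r[a] * r[b])
    return I


# Hypothetical example: m = 1 at (1, 0, 0) and m = 2 at (0, 1, 1)
I_example = inertia_matrix([1.0, 2.0], [(1, 0, 0), (0, 1, 1)])
for row in I_example:
    print(row)
```

The result is symmetric, as stated above, and the nonzero $I_{yz}$ element signals that the chosen axes are not principal axes for this mass distribution.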
Geometrical Picture—Ellipsoid
It is perhaps instructive to consider a geometrical picture of this problem.
If the inertia matrix I is multiplied from each side by a unit vector of variable
direction, $\mathbf{n} = (\alpha, \beta, \gamma)$,
\[ \langle n|\mathsf{I}|n\rangle = I, \tag{4.141} \]
where I is a number (scalar) whose magnitude depends on the choice of direction
of n. Carrying out the multiplication, we obtain
\[ I = I_{xx}\alpha^2 + I_{yy}\beta^2 + I_{zz}\gamma^2 + 2I_{xy}\alpha\beta + 2I_{xz}\alpha\gamma + 2I_{yz}\beta\gamma. \tag{4.142} \]
To throw this into one of the standard forms for an ellipsoid, we introduce
\[ \boldsymbol{\rho} = \frac{\mathbf{n}}{\sqrt{I}}, \tag{4.143} \]
in which $\boldsymbol{\rho}$ is variable in direction and magnitude. Equation 4.142 becomes
\[ 1 = I_{xx}\rho_1^2 + I_{yy}\rho_2^2 + I_{zz}\rho_3^2 + 2I_{xy}\rho_1\rho_2 + 2I_{xz}\rho_1\rho_3 + 2I_{yz}\rho_2\rho_3. \tag{4.144} \]
This is the general form of an ellipsoid relative to the coordinates $\rho_1, \rho_2, \rho_3$.
However, from analytic geometry it is known that the coordinate axes can
always be rotated to coincide with the axes of our ellipsoid. Then
\[ 1 = I_1\rho_1'^2 + I_2\rho_2'^2 + I_3\rho_3'^2, \tag{4.145} \]
in which $\rho_1', \rho_2', \rho_3'$ is the new set of coordinates.
Principal Axes
In many elementary cases, especially when symmetry is present, these new
axes, called the principal axes, can be found by inspection. We now proceed to
develop a general method of finding the diagonal elements and the principal
axes.
Hermitian Matrices
First, let us examine an important theorem about the diagonal elements and
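the principal axes. In the equation
\[ A|r\rangle = \lambda|r\rangle, \tag{4.146} \]
$\lambda$, a number (scalar), is known as the eigenvalue, and $|r\rangle$, the corresponding vector,
is the eigenvector.2 The terms were introduced from the early German literature
on quantum mechanics. We now show that if A is a Hermitian matrix,3 its
eigenvalues are real and its eigenvectors orthogonal.
Let $\lambda_i$ and $\lambda_j$ be two eigenvalues and $|r_i\rangle$ and $|r_j\rangle$ the corresponding
eigenvectors of A, a Hermitian matrix. Then
\[ A|r_i\rangle = \lambda_i|r_i\rangle, \tag{4.147} \]
\[ A|r_j\rangle = \lambda_j|r_j\rangle. \tag{4.148} \]
Equation 4.147 is multiplied by $\langle r_j|$ to give
\[ \langle r_j|A|r_i\rangle = \lambda_i\langle r_j|r_i\rangle. \tag{4.149} \]
Equation 4.148 is multiplied by $\langle r_i|$ to give
\[ \langle r_i|A|r_j\rangle = \lambda_j\langle r_i|r_j\rangle. \tag{4.150} \]
Taking the adjoint* of this equation, we have
\[ \langle r_j|A^\dagger|r_i\rangle = \lambda_j^*\langle r_j|r_i\rangle, \tag{4.151} \]
or
\[ \langle r_j|A|r_i\rangle = \lambda_j^*\langle r_j|r_i\rangle, \tag{4.152} \]
since A is Hermitian. Subtracting Eq. 4.152 from Eq. 4.149, we obtain
\[ (\lambda_i - \lambda_j^*)\langle r_j|r_i\rangle = 0. \tag{4.153} \]
This is a general result for all possible combinations of i and j. First, let j = i.
Then Eq. 4.153 becomes
\[ (\lambda_i - \lambda_i^*)\langle r_i|r_i\rangle = 0. \tag{4.154} \]
Since $\langle r_i|r_i\rangle = 0$ would be a trivial solution of Eq. 4.154, we conclude that
\[ \lambda_i = \lambda_i^*, \tag{4.155} \]
or $\lambda_i$ is real, for all i.
Second, for $i \neq j$ and $\lambda_i \neq \lambda_j$,
\[ (\lambda_i - \lambda_j)\langle r_j|r_i\rangle = 0, \tag{4.156} \]
or
\[ \langle r_j|r_i\rangle = 0, \tag{4.157} \]
2 Equation 4.138 will take on this form when $\boldsymbol{\omega}$ is along one of the principal
axes. Then $\mathbf{L} = \lambda\boldsymbol{\omega}$ and $\mathsf{I}\boldsymbol{\omega} = \lambda\boldsymbol{\omega}$. In the mathematics literature $\lambda$ is usually
called a characteristic value, $\boldsymbol{\omega}$ a characteristic vector.
3 If A is real, the Hermitian requirement is replaced by a requirement of
symmetry.
* Note $\langle r_j| = |r_j\rangle^\dagger$.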
which means that the eigenvectors of distinct eigenvalues are orthogonal, Eq.
4.157 being our generalization of orthogonality in this complex space.4
If $\lambda_i = \lambda_j$ (degenerate case), $|r_i\rangle$ is not automatically orthogonal to $|r_j\rangle$, but
it may be made orthogonal.5 Consider the physical problem of the moment of
inertia matrix again. If $x_1$ is an axis of rotational symmetry, then we will find
that $\lambda_2 = \lambda_3$. Eigenvectors $|r_2\rangle$ and $|r_3\rangle$ are each perpendicular to the symmetry
axis, $|r_1\rangle$, but they lie anywhere in the plane perpendicular to $|r_1\rangle$; that is, any
linear combination of $|r_2\rangle$ and $|r_3\rangle$ is also an eigenvector. Consider
$(a_2|r_2\rangle + a_3|r_3\rangle)$ with $a_2$ and $a_3$ constants. Then
\[ A(a_2|r_2\rangle + a_3|r_3\rangle) = a_2\lambda_2|r_2\rangle + a_3\lambda_3|r_3\rangle
   = \lambda_2(a_2|r_2\rangle + a_3|r_3\rangle), \tag{4.158} \]
as is to be expected, for $x_1$ is an axis of rotational symmetry. Therefore, if
$|r_1\rangle$ and $|r_2\rangle$ are fixed, $|r_3\rangle$ may simply be chosen to lie in the plane
perpendicular to $|r_1\rangle$ and also perpendicular to $|r_2\rangle$. A general method of orthogonalizing
solutions, the Gram-Schmidt process, is applied to functions in Section 9.3.
The set of n orthogonal eigenvectors of our n x n Hermitian matrix forms a
4The corresponding theory for differential operators (Sturm-Liouville
theory) appears in Section 9.2. The integral equation analog (Hilbert-
Schmidt theory) is given in Section 16.4.
5 We are assuming here that the eigenvectors of the n-fold degenerate $\lambda_i$ span
the corresponding n-dimensional space. This may be shown by including a
parameter $\varepsilon$ in the original matrix to remove the degeneracy and then letting
$\varepsilon$ approach zero (compare Exercise 4.6.30). This is analogous to breaking a
degeneracy in atomic spectroscopy by applying an external magnetic field
(Zeeman effect).
complete set, spanning the n-dimensional (complex) space. This fact is useful
in a variational calculation of the eigenvalues, Section 17.8 (Exercise 4.7.19).
Eigenvalues and eigenvectors are not limited to Hermitian matrices. All
matrices have eigenvalues and eigenvectors. For instance, the stochastic
population matrix T satisfies an eigenvalue equation
\[ T\,P_{\text{equilibrium}} = \lambda P_{\text{equilibrium}} \]
with $\lambda = 1$. However, only Hermitian matrices have all eigenvectors orthogonal
and all eigenvalues real.
Antihermitian Matrices
Occasionally, in quantum theory we encounter antihermitian matrices:
\[ A^\dagger = -A. \]
Following the analysis of the first portion of this section, we can show that
a. The eigenvalues are pure imaginary (or zero).
b. The eigenvectors corresponding to distinct
eigenvalues are orthogonal.
The matrix R formed from the normalized eigenvectors is unitary. This
antihermitian property is preserved under unitary transformations.
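A brief sketch of why property (a) holds, repeating the Hermitian argument with $A^\dagger = -A$. The symbols $\lambda$ and $|r\rangle$ are reused from the eigenvalue equation, Eq. 4.146; the layout of the steps is an illustrative choice rather than the text's own derivation.

```latex
\begin{align*}
A|r\rangle &= \lambda|r\rangle
  &&\Rightarrow\quad \langle r|A|r\rangle = \lambda\langle r|r\rangle, \\
\langle r|A^\dagger|r\rangle &= \lambda^*\langle r|r\rangle
  &&\text{(adjoint of the line above)}, \\
-\langle r|A|r\rangle &= \lambda^*\langle r|r\rangle
  &&\text{(using } A^\dagger = -A\text{)}, \\
(\lambda + \lambda^*)\langle r|r\rangle &= 0
  &&\Rightarrow\quad \operatorname{Re}\lambda = 0 .
\end{align*}
```

Since $\langle r|r\rangle \neq 0$ for a nontrivial eigenvector, $\lambda$ is pure imaginary or zero.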
Secular Equation
The preceding demonstration of real eigenvalues and orthogonal
eigenvectors is essentially an existence theorem. To actually determine the eigenvalues $\lambda_i$
and the eigenvectors $|r_i\rangle$ we return to Eq. 4.146. Assuming $|r\rangle$ to be
multiplied by the unit matrix, we may rewrite Eq. 4.146 as
\[ (A - \lambda 1)|r\rangle = 0, \tag{4.159} \]
in which 1 is the unit matrix. This is a set of simultaneous, homogeneous,
linear equations. By Section 4.1 it has nontrivial solutions only if the
determinant of the coefficients vanishes,
\[ |A - \lambda 1| = 0. \tag{4.160} \]
Let us consider the case in which A is a 3 x 3 Hermitian matrix. Then
\[ \begin{vmatrix} a_{11} - \lambda & a_{12} & a_{13} \\ a_{21} & a_{22} - \lambda & a_{23} \\ a_{31} & a_{32} & a_{33} - \lambda \end{vmatrix} = 0. \tag{4.161} \]
Because of its applications in astronomical theories Eq. 4.161 is usually called
the secular equation.6 Equation 4.161 yields a cubic equation in $\lambda$, which, of
course, has three roots.7 By Eq. 4.155 we know that these roots are real.
6This equation also appears in second-order perturbation theory in quantum
mechanics.
7 See Exercise 6.4.9.
Substituting one root at a time back into Eq. 4.159, we can find the corresponding
eigenvectors.
EXAMPLE 4.6.1 Eigenvalues and Eigenvectors of a Symmetric Matrix
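Let
\[ A = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{4.162} \]
The secular equation is
\[ \begin{vmatrix} -\lambda & 1 & 0 \\ 1 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{vmatrix} = 0, \tag{4.163} \]
or
\[ -\lambda(\lambda^2 - 1) = 0, \tag{4.164} \]
expanding by minors. The roots are $\lambda = -1, 0, 1$. To find the eigenvector
corresponding to $\lambda = -1$, we substitute this value back into the eigenvalue
equation, Eq. 4.159,
\[ \begin{pmatrix} -\lambda & 1 & 0 \\ 1 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0. \tag{4.165} \]
With $\lambda = -1$, this yields
\[ x + y = 0, \qquad z = 0. \tag{4.166} \]
Within an arbitrary scale factor, and an arbitrary sign (or phase factor),
$|r_1\rangle = (1, -1, 0)$. Note carefully that (for real $|r\rangle$ in ordinary space) the
eigenvector singles out a line in space. The positive or negative sense is not
determined. This indeterminacy could be expected if we noted that Eq. 4.159 is
homogeneous in $|r\rangle$. For convenience we will require that the eigenvectors be
normalized to unity, $\langle r_1|r_1\rangle = 1$. With this choice of sign,
\[ |r_1\rangle \quad\text{or}\quad \mathbf{r}_1 = \left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}, 0\right) \tag{4.167} \]
is fixed. For $\lambda = 0$, Eq. 4.159 yields
\[ y = 0, \qquad x = 0. \tag{4.168} \]
$|r_2\rangle$ or $\mathbf{r}_2 = (0, 0, 1)$ is a suitable eigenvector. Finally, for $\lambda = 1$, we get
\[ -x + y = 0, \qquad z = 0, \tag{4.169} \]
or
\[ |r_3\rangle \quad\text{or}\quad \mathbf{r}_3 = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0\right). \tag{4.170} \]
The orthogonality of $\mathbf{r}_1$, $\mathbf{r}_2$, and $\mathbf{r}_3$, corresponding to three distinct eigenvalues,
may be easily verified.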
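The verification suggested at the end of Example 4.6.1 can be sketched numerically. The short Python check below takes the matrix read off from the secular determinant of Eq. 4.163 (rows (0, 1, 0), (1, 0, 0), (0, 0, 0)), confirms that each quoted vector satisfies $A\mathbf{r} = \lambda\mathbf{r}$, and prints the mutual scalar products. The helper names `matvec`, `dot`, and `residual` are illustrative assumptions.

```python
import math

# Matrix of Example 4.6.1, read off from the secular determinant (Eq. 4.163)
A = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 0]]

s = 1 / math.sqrt(2)
eigenpairs = [(-1, (s, -s, 0.0)),
              (0, (0.0, 0.0, 1.0)),
              (1, (s, s, 0.0))]


def matvec(M, v):
    """Matrix times column vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def dot(u, v):
    """Scalar product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))


def residual(M, lam, v):
    """Largest component of M v - lam v (zero for an exact eigenpair)."""
    return max(abs(a - lam * b) for a, b in zip(matvec(M, v), v))


for lam, v in eigenpairs:
    print("lambda =", lam, " residual =", residual(A, lam, v))

for i in range(3):
    for j in range(i + 1, 3):
        print("r%d . r%d =" % (i + 1, j + 1), dot(eigenpairs[i][1], eigenpairs[j][1]))
```

All residuals and all off-diagonal scalar products should vanish to within rounding error.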
EXAMPLE 4.6.2 Degenerate Eigenvalues
Consider
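\[ A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}. \tag{4.171} \]
The secular equation is
\[ \begin{vmatrix} 1 - \lambda & 0 & 0 \\ 0 & -\lambda & 1 \\ 0 & 1 & -\lambda \end{vmatrix} = 0, \tag{4.172} \]
or
\[ (1 - \lambda)(\lambda^2 - 1) = 0, \qquad \lambda = -1, 1, 1, \]
a degenerate case. If $\lambda = -1$, the eigenvalue equation (4.159) yields
\[ 2x = 0, \qquad y + z = 0. \tag{4.173} \]
A suitable normalized eigenvector is
\[ |r_1\rangle \quad\text{or}\quad \mathbf{r}_1 = \left(0, \frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right). \tag{4.174} \]
For $\lambda = 1$, we get
\[ -y + z = 0, \tag{4.175} \]
\[ y - z = 0, \tag{4.176} \]
and no further information. We have an infinite number of choices. Suppose,
as one possible choice, $\mathbf{r}_2$ is taken as
\[ |r_2\rangle \quad\text{or}\quad \mathbf{r}_2 = \left(0, \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right), \tag{4.177} \]
which clearly satisfies Eq. 4.176. Then $\mathbf{r}_3$ must be perpendicular to $\mathbf{r}_1$ and may
be made perpendicular to $\mathbf{r}_2$ by8
\[ \mathbf{r}_3 = \mathbf{r}_1 \times \mathbf{r}_2 = (1, 0, 0). \tag{4.178} \]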
8 The use of the cross product is limited to three-dimensional space (see
Section 1.4).
Diagonalization
The equations, developed for our existence theorem at the beginning of this
section, can be used to form a transformation matrix that will convert the
Hermitian matrix A into diagonal form. Let R be a matrix formed from the
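three orthonormal column vectors $|r_1\rangle$, $|r_2\rangle$, and $|r_3\rangle$ in any desired order.
\[ R = \begin{pmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ z_1 & z_2 & z_3 \end{pmatrix}, \tag{4.179} \]
in which each column $(x_i, y_i, z_i)$ is an eigenvector $\mathbf{r}_i$. Since
\[ \langle r_i|r_j\rangle = \delta_{ij}, \tag{4.180} \]
R is unitary (or simply orthogonal if A, and therefore r, are real). Then, forming
$R^\dagger A R$, we have
\[ R^\dagger A R = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}. \tag{4.181} \]
Hence $R^\dagger A R$ is a diagonal matrix with eigenvalues $\lambda_i$, the order of the
eigenvalues corresponding to the order of the column vectors $\mathbf{r}_i$ or $|r_i\rangle$ in R. To
develop the geometrical picture, consider A, a real (symmetric) matrix with
real eigenvalues and real eigenvectors. Matrix R corresponds to $\tilde{B}$ in Eq.
4.95 or, better, $\tilde{R}$ corresponds to B, $\tilde{R}$ being composed of $\langle r_1|$ and so on, the
eigenvectors $\mathbf{r}_i$ written as row vectors:
\[ \tilde{R} = \begin{pmatrix} \langle r_1| \\ \langle r_2| \\ \langle r_3| \end{pmatrix} = \begin{pmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \end{pmatrix}. \tag{4.182} \]
Now the row $(b_{i1}, b_{i2}, b_{i3})$, which defines a unit vector $\mathbf{r}_i$ in relation to the
original coordinate system, specifies the three direction cosines of $\mathbf{r}_i$ with the
original axes. Remembering that matrix B rotates the coordinate system into
a new system in which (here) A is diagonal, we see that this new system is
specified by the three eigenvectors $\mathbf{r}_i = (x_i, y_i, z_i)$. They are the unit vectors
along the principal axes, the axes in relation to which A is diagonal.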
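The construction of R can be illustrated with the degenerate matrix of Example 4.6.2, taken here as rows (1, 0, 0), (0, 0, 1), (0, 1, 0), with the eigenvector choices $\mathbf{r}_1 = (0, 1, -1)/\sqrt{2}$, $\mathbf{r}_2 = (0, 1, 1)/\sqrt{2}$, $\mathbf{r}_3 = (1, 0, 0)$. The following Python sketch (helper names `matmul` and `transpose` are assumptions) forms the transposed-R times A times R product for this real case.

```python
import math

s = 1 / math.sqrt(2)

# Matrix of Example 4.6.2 and one orthonormal choice of eigenvectors
A = [[1, 0, 0],
     [0, 0, 1],
     [0, 1, 0]]
r1 = (0.0, s, -s)    # eigenvalue -1
r2 = (0.0, s, s)     # eigenvalue +1
r3 = (1.0, 0.0, 0.0) # eigenvalue +1

# R has the eigenvectors as its columns, in the order r1, r2, r3
R = [[r1[i], r2[i], r3[i]] for i in range(3)]


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    """Transpose (equal to the adjoint for a real matrix)."""
    return [list(col) for col in zip(*X)]


D = matmul(transpose(R), matmul(A, R))
for row in D:
    print([round(x, 12) for x in row])
```

The diagonal of D should carry the eigenvalues in the same order as the columns of R (here -1, 1, 1), with off-diagonal elements zero to rounding error. Reordering the columns of R simply reorders the diagonal.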
The preceding analysis has the advantage of exhibiting and clarifying
conceptual relationships in the diagonalization of matrices. However, for
matrices larger than 3 x 3, or perhaps 4 x 4, the process rapidly becomes so
cumbersome that we turn gratefully to high-speed computers and iterative
techniques.9 One such technique is the Jacobi method for determining
eigenvalues and eigenvectors of real symmetric matrices. This Jacobi technique for
determining eigenvalues and eigenvectors and the Gauss-Seidel method of
solving systems of simultaneous linear equations are examples of relaxation
methods. They are iterative techniques in which hopefully the errors will
decrease or relax as the iterations continue. Relaxation methods are used
extensively for the solution of partial differential equations.
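To make the idea of the Jacobi method concrete, here is a minimal pure-Python sketch of a cyclic Jacobi iteration for a real symmetric matrix: each plane rotation is chosen to zero one off-diagonal element, and sweeps are repeated until the off-diagonal norm falls below a tolerance. This is one common textbook variant written for illustration, not necessarily the form the author has in mind; the function name `jacobi_eigen`, the tolerance, and the sweep limit are assumptions.

```python
import math


def jacobi_eigen(A, tol=1e-12, max_sweeps=50):
    """Cyclic Jacobi iteration for a real symmetric matrix.

    Returns (eigenvalues, V), with the eigenvectors as the columns of V.
    """
    n = len(A)
    A = [[float(x) for x in row] for row in A]  # work on a copy
    V = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = math.sqrt(sum(A[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(A[p][q]) < 1e-300:
                    continue
                # Rotation angle (|theta| <= pi/4) that zeroes A[p][q]
                diff = A[q][q] - A[p][p]
                if diff == 0.0:
                    theta = math.copysign(math.pi / 4, A[p][q])
                else:
                    theta = 0.5 * math.atan(2.0 * A[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = A[k][p], A[k][q]
                    A[k][p] = c * akp - s * akq
                    A[k][q] = s * akp + c * akq
                for k in range(n):  # A <- J^T A (rows p and q)
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k] = c * apk - s * aqk
                    A[q][k] = s * apk + c * aqk
                for k in range(n):  # accumulate eigenvectors: V <- V J
                    vkp, vkq = V[k][p], V[k][q]
                    V[k][p] = c * vkp - s * vkq
                    V[k][q] = s * vkp + c * vkq
    return [A[i][i] for i in range(n)], V


# Matrix of Example 4.6.1; the eigenvalues should come out close to -1, 0, 1
eigenvalues, V = jacobi_eigen([[0, 1, 0], [1, 0, 0], [0, 0, 0]])
print(sorted(eigenvalues))
```

Each rotation is an orthogonal similarity transformation, so the eigenvalues are preserved while the sum of squares of the off-diagonal elements decreases; this is the "relaxing" of the error mentioned above.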
EXERCISES
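4.6.1 (a) Starting with the angular momentum of the ith element of mass,
\[ \mathbf{L}_i = \mathbf{r}_i \times \mathbf{p}_i = m_i \mathbf{r}_i \times (\boldsymbol{\omega} \times \mathbf{r}_i), \]
derive the inertia matrix such that $\mathbf{L} = \mathsf{I}\boldsymbol{\omega}$, $|L\rangle = \mathsf{I}|\omega\rangle$.
(b) Repeat the derivation starting with kinetic energy
\[ T_i = \tfrac{1}{2} m_i (\boldsymbol{\omega} \times \mathbf{r}_i)^2 \qquad \left(T = \tfrac{1}{2}\langle\omega|\mathsf{I}|\omega\rangle\right). \]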
4.6.2 Show that the eigenvalues of a matrix are unaltered if the matrix is transformed
by a similarity transformation. This property is not limited to symmetric or
Hermitian matrices. It holds for any matrix satisfying the eigenvalue equation,
Eq. 4.159. If our matrix can be brought into diagonal form by a similarity
transformation, then two immediate consequences are
1. The trace (sum of eigenvalues) is invariant under a similarity
transformation.
2. The determinant (product of eigenvalues) is invariant under a
similarity transformation.
Note. Prove this separately (for matrices that cannot be diagonalized). The
invariance of the trace and determinant are often demonstrated by using the
Cayley-Hamilton theorem: A matrix satisfies its own characteristic (secular)
equation.
4.6.3 As a converse of the theorem that Hermitian matrices have real eigenvalues and
that eigenvectors corresponding to distinct eigenvalues are orthogonal, show
that if
(a) the eigenvalues of a matrix are real and
(b) the eigenvectors satisfy Eq. 4.180, $\mathbf{r}_i^\dagger\mathbf{r}_j = \delta_{ij}$ (or $\langle r_i|r_j\rangle = \delta_{ij}$), then the matrix
is Hermitian.
4.6.4 Show that a real matrix that is not symmetric cannot be diagonalized by an
orthogonal similarity transformation.
Hint. Assume that the nonsymmetric real matrix can be diagonalized and develop
a contradiction.
4.6.5 The matrices representing the angular momentum components Jx, Jy, and Jz
are all Hermitian. Show that the eigenvalues of $J^2$, where $J^2 = J_x^2 + J_y^2 + J_z^2$,
are real and nonnegative.
9 In higher-dimensional systems the secular equation may be strongly ill-
conditioned with respect to the determination of its roots (the eigenvalues).
Direct solution by machine may be very inaccurate. Iterative techniques for
diagonalizing the original matrix are usually preferred.
4.6.6 A has eigenvalues $\lambda_i$ and corresponding eigenvectors $|x_i\rangle$. Show that $A^{-1}$ has the
same eigenvectors but with eigenvalues $\lambda_i^{-1}$.
4.6.7 A square matrix with zero determinant is labeled singular.
(a) If A is singular, show that there is at least one nonzero column vector v
such that
A|v> = 0.
(b) If there is a nonzero vector $|v\rangle$ such that
\[ A|v\rangle = 0, \]
show that A is a singular matrix. This means that if a matrix (or operator)
has zero as an eigenvalue, the matrix (or operator) has no inverse.
4.6.8 The same similarity transformation diagonalizes each of two matrices. Show
that the original matrices must commute. (This is particularly important in the
matrix (Heisenberg) formulation of quantum mechanics.)
4.6.9 Two Hermitian matrices A and В have the same eigenvalues. Show that A and
В are related by a unitary similarity transformation.
4.6.10 Find the eigenvalues and an orthonormal (orthogonal and normalized) set of
eigenvectors for the matrices of Exercise 4.2.15.
4.6.11 Show that the inertia matrix for a single particle of mass m at (x,y, z) has a zero
determinant. Explain this result in terms of the invariance of the determinant
of a matrix under similarity transformations (Exercise 4.3.10) and a possible
rotation of the coordinate system.
4.6.12 A certain rigid body may be represented by three point masses:
m1 = 1 at (1, 1, -2)
m2 = 2 at (-1, -1, 0)
m3 = 1 at (1, 1, 2).
(a) Find the inertia matrix.
(b) Diagonalize the inertia matrix obtaining the eigenvalues and the principal
axes (as orthonormal eigenvectors).
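4.6.13 [Figure: unit masses at labeled points relative to the x-, y-, and z-axes; the legible labels are (1, 0, 1) and (0, 1, 1).]
Unit masses are placed as shown in the figure.
(a) Find the moment of inertia matrix.
(b) Find the eigenvalues and a set of orthonormal eigenvectors.
(c) Explain the degeneracy in terms of the symmetry of the system.
ANS. \[ \mathsf{I} = \begin{pmatrix} 4 & -1 & -1 \\ -1 & 4 & -1 \\ -1 & -1 & 4 \end{pmatrix}, \qquad \mathbf{r}_1 = (1/\sqrt{3}, 1/\sqrt{3}, 1/\sqrt{3}). \]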
4.6.14 A mass $m_1 = \frac{1}{2}$ kg is located at (1, 1, 1) (meters), a mass $m_2 = \frac{1}{2}$ kg is at (-1, -1,
-1). The two masses are held together by an ideal (weightless, rigid) rod.
(a) Find the moment of inertia tensor of this pair of masses.
(b) Find the eigenvalues and eigenvectors of this inertia matrix.
(c) Explain the meaning, the physical significance of the $\lambda = 0$ eigenvalue.
What is the significance of the corresponding eigenvector?
(d) Now that you have solved this problem by rather sophisticated matrix-
tensor techniques, explain how you could obtain
(1) $\lambda = 0$ and $\lambda = {?}$ by inspection.
(2) $\mathbf{r}_{\lambda=0} = {?}$ by inspection.
(By inspection means using freshman physics.)
4.6.15 Unit masses are at the eight corners of a cube (±1, ±1, ±1). Find the moment
of inertia matrix and show that there is a triple degeneracy. This means that so
far as moments of inertia are concerned, the cubic structure exhibits spherical
symmetry.
4.6.16 Find the eigenvalues and corresponding orthonormal eigenvectors of the
following matrices (as a numerical check, note that the sum of the eigenvalues equals
the sum of the diagonal elements of the original matrix, Exercise 4.3.9). Note
also the correspondence between det A = 0 and the existence of $\lambda = 0$, as
required by Exercises 4.6.2 and 4.6.7.
A = ( 0 1 0 |. ANS. Л = 0, 1,2.
4.6.17
ANS. X= -1,0,2.
V о о o/
4.6.18
/ \
ANS. X= -1,1,2.
Vo i i/
4.6.19
/ v \
ANS. к =-3,1,5.
4.6.20
ANS. X = 0, 1, 2.
Vo i \)
4.6.21
/ \
ANS. k= -1,1,2.
4.6.22
ANS. Я = -
4.6.23
ANS. Я = 0, 2, 2.
Vo i \)
4.6.24
ANS. X= -1,-1,2.
\i i o/
4.6.25
/ \
ANS. k= -1,2,2.
4.6.26
/ \
ANS. Я = 0, 0, 3.
\l 1 \)
4.6.27
/ \
ANS. Я =1,1,6.
\г о г/
4.6.28
ANS. Я = 0, 0, 2.
Vo о о/
4.6.29
А = I 0 3 0 1. ANS. Я = 2,3,6.
V/3 о з/
4.6.30 (a) Determine the eigenvalues and eigenvectors of
CO-
Note that the eigenvalues are degenerate for $\varepsilon = 0$ but the eigenvectors are
orthogonal for all $\varepsilon \neq 0$ and $\varepsilon \to 0$.
(b) Determine the eigenvalues and eigenvectors of
Note that the eigenvalues are degenerate for $\varepsilon = 0$ and for this (nonsymmetric)
matrix the eigenvectors ($\varepsilon = 0$) do not span the space.
(c) Find the cosine of the angle between the two eigenvectors as a function of
$\varepsilon$ for $0 < \varepsilon < 1$.
4.6.31 (a) Take the coefficients of the simultaneous linear equations of Exercise 4.1.7
to be the matrix elements $a_{ij}$ of matrix A (symmetric). Calculate the
eigenvalues and eigenvectors.
(b) Form a matrix R whose columns are the eigenvectors of A and calculate
the triple matrix product $\tilde{R}AR$.
ANS. X = 3.33163
4.6.32 Repeat Exercise 4.6.31 by using the matrix of Exercise 4.2.39.
4.7 EIGENVECTORS, EIGENVALUES
In Section 4.6 we concentrate primarily on Hermitian or real symmetric
matrices and on the actual process of finding the eigenvalues and eigenvectors.
In this section we generalize to normal matrices with Hermitian and unitary
matrices as special cases. The physically important problem of normal modes
of vibration and the numerically important problem of ill-conditioned matrices
are also considered.
Normal Matrices1
A normal matrix is a matrix that commutes with its adjoint,
\[ [A, A^\dagger] = 0. \]
Obvious and important examples are Hermitian and unitary matrices. We will
show that normal matrices have orthogonal eigenvectors (see Table 4.2). We
proceed in two steps.
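I. Let A have an eigenvector $|x\rangle$ and corresponding eigenvalue $\lambda$. Then
\[ A|x\rangle = \lambda|x\rangle \tag{4.183} \]
or
\[ (A - \lambda 1)|x\rangle = 0. \tag{4.184} \]
For convenience the combination $A - \lambda 1$ will be labeled B. Taking the adjoint
of Eq. 4.184, we obtain
\[ \langle x|(A - \lambda 1)^\dagger = 0 = \langle x|B^\dagger. \tag{4.185} \]
Because
\[ [(A - \lambda 1), (A - \lambda 1)^\dagger] = [A, A^\dagger] = 0, \]
we have
\[ [B, B^\dagger] = 0. \tag{4.186} \]
The matrix B is also normal.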
1 Normal matrices are the largest class of matrices that can be diagonalized
by unitary transformations. For an extensive discussion of normal matrices,
see "Normal matrices for physicists," P. A. Macklin, Am. J. Phys. 52: 513
(1984).
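From Eqs. 4.184 and 4.185 we form
\[ \langle x|B^\dagger B|x\rangle = 0. \tag{4.187} \]
This equals
\[ \langle x|B B^\dagger|x\rangle = 0 \tag{4.188} \]
by Eq. 4.186. Now Eq. 4.188 may be rewritten as
\[ (B^\dagger|x\rangle)^\dagger (B^\dagger|x\rangle) = 0. \tag{4.189} \]
Thus
\[ B^\dagger|x\rangle = (A^\dagger - \lambda^* 1)|x\rangle = 0. \tag{4.190} \]
We see that for normal matrices, $A^\dagger$ has the same eigenvectors as A but the
complex conjugate eigenvalues.
II. Now, considering more than one eigenvector-eigenvalue, we have
\[ A|x_i\rangle = \lambda_i|x_i\rangle, \tag{4.191} \]
\[ A|x_j\rangle = \lambda_j|x_j\rangle. \tag{4.192} \]
Multiplying Eq. 4.192 from the left by $\langle x_i|$ yields
\[ \langle x_i|A|x_j\rangle = \lambda_j\langle x_i|x_j\rangle. \tag{4.193} \]
Operating on the left side of Eq. 4.193, we obtain
\[ \langle x_i|A = (A^\dagger|x_i\rangle)^\dagger. \tag{4.194} \]
From Eq. 4.190 with $A^\dagger$ having the same eigenvectors as A but the complex
conjugate eigenvalues,
\[ (A^\dagger|x_i\rangle)^\dagger = (\lambda_i^*|x_i\rangle)^\dagger = \lambda_i\langle x_i|. \tag{4.195} \]
Substituting into Eq. 4.193, we have
\[ \lambda_i\langle x_i|x_j\rangle = \lambda_j\langle x_i|x_j\rangle \]
or
\[ (\lambda_i - \lambda_j)\langle x_i|x_j\rangle = 0. \tag{4.196} \]
This is the same as Eq. 4.156.
For $\lambda_i \neq \lambda_j$,
\[ \langle x_i|x_j\rangle = 0. \]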
The eigenvectors corresponding to different eigenvalues of a normal matrix
are orthogonal. This means that a normal matrix may be diagonalized by a
unitary transformation. The required unitary matrix may be constructed from
the orthonormal eigenvectors as shown earlier in Section 4.6.
The converse of this result is also valid. If A can be diagonalized by a unitary
transformation, then A is normal.
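A small numerical illustration of the two steps above, using a matrix that is neither Hermitian nor symmetric but is normal. The 2 x 2 matrix below and its eigenvectors are an assumed example, not taken from the text; the helper names are likewise illustrative.

```python
# Hypothetical example: a real matrix that is not symmetric but is normal
A = [[1, 1],
     [-1, 1]]


def adjoint(M):
    """Conjugate transpose of a square matrix stored as a list of rows."""
    n = len(M)
    return [[complex(M[j][i]).conjugate() for j in range(n)] for i in range(n)]


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def inner(u, v):
    """Scalar product <u|v> for complex vectors."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


Ad = adjoint(A)
AAd, AdA = matmul(A, Ad), matmul(Ad, A)
commutator = [[AAd[i][j] - AdA[i][j] for j in range(2)] for i in range(2)]

lam1, x1 = 1 + 1j, (1, 1j)    # A x1 = lam1 x1
lam2, x2 = 1 - 1j, (1, -1j)   # A x2 = lam2 x2

print("[A, A^dagger] =", commutator)
print("A x1         =", matvec(A, x1), " lam1 x1  =", [lam1 * c for c in x1])
print("A^dagger x1  =", matvec(Ad, x1), " lam1* x1 =", [lam1.conjugate() * c for c in x1])
print("<x1|x2>      =", inner(x1, x2))
```

The commutator vanishes, the adjoint matrix shares the eigenvector $x_1$ with the complex conjugate eigenvalue (Eq. 4.190), and the eigenvectors belonging to the two distinct eigenvalues are orthogonal, even though the eigenvalues themselves are complex.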
TABLE 4.2

Matrix          Eigenvalues                    Eigenvectors (for different eigenvalues)
Hermitian       Real                           Orthogonal
Antihermitian   Pure imaginary (or zero)       Orthogonal
Unitary         Unit magnitude                 Orthogonal
Normal          If A has eigenvalue λ,         Orthogonal; A and A† have the
                A† has eigenvalue λ*.          same eigenvectors.
Normal Modes of Vibration
We consider the vibrations of a classical model of the CO2 molecule. It is
an illustration of the application of matrix techniques to a problem that does
not start as a matrix problem. It also provides an example of the eigenvalues
and eigenvectors of an asymmetric real matrix.
EXAMPLE 4.7.1 Normal Modes
Consider three masses on the x-axis joined by springs as shown in Fig. 4.7.
The spring forces are assumed to be linear (small displacements, Hooke's law)
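and the masses are constrained to stay on the x-axis.
[FIG. 4.7: three masses (M, m, M) joined by springs of constant k along the x-axis, with coordinates x1, x2, x3]
Using a different coordinate for each mass, Newton's second law yields the
set of equations
\[ \ddot{x}_1 = -\frac{k}{M}(x_1 - x_2), \]
\[ \ddot{x}_2 = -\frac{k}{m}(x_2 - x_1) - \frac{k}{m}(x_2 - x_3), \tag{4.197} \]
\[ \ddot{x}_3 = -\frac{k}{M}(x_3 - x_2). \]
The system of masses is vibrating. We seek the common frequencies, $\omega$, such
that all masses vibrate at this same frequency. These are the normal modes. Let
\[ x_i = x_{i0}\, e^{i\omega t}, \qquad i = 1, 2, 3. \]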
Substituting into Eq. 4.197, we may rewrite this set as
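\[ \begin{pmatrix} \dfrac{k}{M} & -\dfrac{k}{M} & 0 \\[1ex] -\dfrac{k}{m} & \dfrac{2k}{m} & -\dfrac{k}{m} \\[1ex] 0 & -\dfrac{k}{M} & \dfrac{k}{M} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \omega^2 \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}, \tag{4.198} \]
with the common factor $e^{i\omega t}$ divided out. We have a matrix-eigenvalue equation
with the matrix asymmetric. The secular equation is
\[ \begin{vmatrix} \dfrac{k}{M} - \omega^2 & -\dfrac{k}{M} & 0 \\[1ex] -\dfrac{k}{m} & \dfrac{2k}{m} - \omega^2 & -\dfrac{k}{m} \\[1ex] 0 & -\dfrac{k}{M} & \dfrac{k}{M} - \omega^2 \end{vmatrix} = 0. \tag{4.199} \]
This leads to
\[ \omega^2 \left(\frac{k}{M} - \omega^2\right) \left(\omega^2 - \frac{2k}{m} - \frac{k}{M}\right) = 0. \]
The eigenvalues are
\[ \omega^2 = 0, \qquad \frac{k}{M}, \qquad \text{and} \qquad \frac{k}{M} + \frac{2k}{m}, \]
all real.
The corresponding eigenvectors are determined by substituting the
eigenvalues back into Eq. 4.198 one eigenvalue at a time. For $\omega^2 = 0$ Eq. 4.198
yields
\[ x_1 - x_2 = 0, \]
\[ -x_1 + 2x_2 - x_3 = 0, \]
\[ -x_2 + x_3 = 0. \]
Then, we get
\[ x_1 = x_2 = x_3. \]
This describes pure translation, no relative motion of the masses, no vibration.
For $\omega^2 = k/M$ Eq. 4.198 yields
\[ x_1 = -x_3, \qquad x_2 = 0. \]
The two outer masses are moving in opposite directions. The center mass is
stationary.
For $\omega^2 = k/M + 2k/m$ the eigenvector components are
\[ x_1 = x_3, \qquad x_2 = -\frac{2M}{m}\,x_1. \]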
The two outer masses are moving together. The center mass is moving opposite
to the two outer ones. The net momentum is zero.
Any displacement of the three masses along the x-axis can be described as
a linear combination of these three types of motion: translation plus two forms
of vibration.
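The three normal modes can be checked numerically for sample parameter values. The Python sketch below builds the asymmetric matrix of the three-mass system (rows $(k/M, -k/M, 0)$, $(-k/m, 2k/m, -k/m)$, $(0, -k/M, k/M)$) and verifies each eigenpair; the numerical values of k, M, and m are arbitrary illustrative assumptions (roughly oxygen and carbon mass numbers), and the helper names are not from the text.

```python
# Illustrative values only (assumed): spring constant and masses
k, M, m = 1.0, 16.0, 12.0

# Matrix of the three-mass system; note that it is not symmetric
A = [[k / M, -k / M, 0.0],
     [-k / m, 2 * k / m, -k / m],
     [0.0, -k / M, k / M]]

# (omega^2, unnormalized eigenvector) for the three modes described in the text
modes = [
    (0.0, (1.0, 1.0, 1.0)),                        # pure translation
    (k / M, (1.0, 0.0, -1.0)),                     # outer masses opposed
    (k / M + 2 * k / m, (1.0, -2 * M / m, 1.0)),   # center mass opposed
]


def matvec(mat, v):
    return [sum(mat[i][j] * v[j] for j in range(3)) for i in range(3)]


def residual(mat, lam, v):
    """Largest component of mat v - lam v."""
    return max(abs(a - lam * b) for a, b in zip(matvec(mat, v), v))


for omega2, x in modes:
    momentum = M * x[0] + m * x[1] + M * x[2]  # proportional to net momentum
    print("omega^2 =", omega2, " residual =", residual(A, omega2, x),
          " M x1 + m x2 + M x3 =", momentum)
```

For the two vibrational modes the mass-weighted sum $M x_1 + m x_2 + M x_3$ vanishes, consistent with zero net momentum; for the translational mode it does not.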
Ill-conditioned Systems
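A system of simultaneous linear equations may be written as
\[ A|x\rangle = |y\rangle \]
or
\[ A^{-1}|y\rangle = |x\rangle, \tag{4.200} \]
with A and $|y\rangle$ known and $|x\rangle$ unknown. The reader may encounter examples
in which a small error in $|y\rangle$ results in a larger error in $|x\rangle$. In this case the
matrix A is called "ill-conditioned." With $|\delta x\rangle$ an error in $|x\rangle$ and $|\delta y\rangle$ an
error in $|y\rangle$, the relative errors may be written as
\[ \left[\frac{\langle\delta x|\delta x\rangle}{\langle x|x\rangle}\right]^{1/2} \le K(A) \left[\frac{\langle\delta y|\delta y\rangle}{\langle y|y\rangle}\right]^{1/2}. \tag{4.201} \]
Here K(A), a property of matrix A, is labeled the condition number. For A
Hermitian one form of the condition number is given by1
\[ K(A) = \frac{|\lambda|_{\max}}{|\lambda|_{\min}}. \tag{4.202} \]
An approximate form due to Turing2 is
\[ K(A) = n\,[A_{ij}]_{\max}\,[A^{-1}_{ij}]_{\max}, \tag{4.203} \]
in which n is the order of the matrix and $[A_{ij}]_{\max}$ is the maximum element in A.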
EXAMPLE 4.7.2 An Ill-conditioned Matrix
A common example of an ill-conditioned matrix is the Hilbert matrix,
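$H_{ij} = (i + j - 1)^{-1}$. The Hilbert matrix of order 4, $H_4$, is encountered in a least
squares fit of data to a third-degree polynomial. We have
\[ H_4 = \begin{pmatrix} 1 & \frac{1}{2} & \frac{1}{3} & \frac{1}{4} \\ \frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \frac{1}{5} \\ \frac{1}{3} & \frac{1}{4} & \frac{1}{5} & \frac{1}{6} \\ \frac{1}{4} & \frac{1}{5} & \frac{1}{6} & \frac{1}{7} \end{pmatrix}. \tag{4.204} \]
The elements of the inverse matrix (order n) are given by
\[ (H_n^{-1})_{ij} = \frac{(-1)^{i+j}}{i + j - 1} \cdot \frac{(n + i - 1)!\,(n + j - 1)!}{[(i - 1)!\,(j - 1)!]^2\,(n - i)!\,(n - j)!}. \tag{4.205} \]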
1 Forsythe, George E., and Cleve B. Moler, Computer Solution of Linear
Algebraic Equations.
2 Compare Todd, John, The Condition of the Finite Segments of the Hilbert
Matrix, in the National Bureau of Standards' Applied Mathematics Series
#313.
For n = 4
\[ H_4^{-1} = \begin{pmatrix} 16 & -120 & 240 & -140 \\ -120 & 1200 & -2700 & 1680 \\ 240 & -2700 & 6480 & -4200 \\ -140 & 1680 & -4200 & 2800 \end{pmatrix}. \]
From Eq. 4.203 the Turing estimate of the condition number for $H_4$ becomes
\[ K_{\text{Turing}} = 4 \times 1 \times 6480 = 2.59 \times 10^4. \]
This is a warning that an input error may be multiplied by 25,000 in the
calculation of the output result. It is a statement that H4 is ill-conditioned.
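The numbers in this example can be reproduced exactly with rational arithmetic. The sketch below builds $H_n$ from $H_{ij} = 1/(i + j - 1)$, builds the inverse from the closed-form element formula of Eq. 4.205, checks that their product is the unit matrix, and evaluates the Turing estimate of Eq. 4.203. The function names are illustrative assumptions; only Python's standard `fractions` and `math` modules are used.

```python
from fractions import Fraction
from math import factorial


def hilbert(n):
    """Hilbert matrix of order n, H_ij = 1/(i + j - 1), with i, j = 1..n."""
    return [[Fraction(1, i + j - 1) for j in range(1, n + 1)]
            for i in range(1, n + 1)]


def hilbert_inverse(n):
    """Inverse Hilbert matrix from the closed-form element formula."""
    def elem(i, j):
        num = factorial(n + i - 1) * factorial(n + j - 1)
        den = ((factorial(i - 1) * factorial(j - 1)) ** 2
               * factorial(n - i) * factorial(n - j))
        return Fraction((-1) ** (i + j) * num, (i + j - 1) * den)
    return [[elem(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]


def turing_estimate(A, A_inv):
    """Turing's approximate condition number: n * max|A_ij| * max|(A^-1)_ij|."""
    n = len(A)
    return (n * max(abs(x) for row in A for x in row)
              * max(abs(x) for row in A_inv for x in row))


H4 = hilbert(4)
H4_inv = hilbert_inverse(4)
product = [[sum(H4[i][k] * H4_inv[k][j] for k in range(4)) for j in range(4)]
           for i in range(4)]

print([[int(x) for x in row] for row in H4_inv])
print(turing_estimate(H4, H4_inv))  # 25920, i.e. about 2.59 x 10^4
```

Because exact fractions are used, the product comes out as the unit matrix with no rounding error; repeating the calculation in floating point for larger n is a simple way to see the ill-conditioning appear.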
If you encounter a highly ill-conditioned system you have two alternatives
(besides abandoning the problem).
a. Try a different mathematical attack.
b. Arrange to carry more significant figures and push
through by brute force.
As previously seen, matrix eigenvector-eigenvalue techniques are not limited
to the solution of strictly matrix problems. A further example of the transfer of
techniques from one area to another is seen in the application of matrix
techniques to the solution of Fredholm eigenvalue integral equations, Section 16.3.
In turn, these matrix techniques are strengthened by a variational calculation
of Section 17.8.
EXERCISES
4.7.1 Show that every 2 x 2 matrix has two eigenvectors and corresponding
eigenvalues. The eigenvectors are not necessarily orthogonal. The eigenvalues are not
necessarily real.
4.7.2 As an illustration of Exercise 4.7.1, find the eigenvalues and corresponding
eigenvectors for
Note that the eigenvectors are not orthogonal.
ANS. $\lambda_1 = 0$, $\mathbf{r}_1 = (2, -1)$;
$\lambda_2 = 4$, $\mathbf{r}_2 = (2, 1)$.
4.7.3 If A is a 2 x 2 matrix show that its eigenvalues $\lambda$ satisfy the equation
\[ \lambda^2 - \lambda\,\mathrm{trace}(A) + \det A = 0. \]
4.7.4 Assuming a unitary matrix U to satisfy an eigenvalue equation $U\mathbf{r} = \lambda\mathbf{r}$, show
that the eigenvalues of the unitary matrix have unit magnitude. This same result
holds for real orthogonal matrices.
4.7.5 Since an orthogonal matrix describing a rotation in real three-dimensional
space is a special case of a unitary matrix, such an orthogonal matrix can be
diagonalized by a unitary transformation.
(a) Show that the sum of the three eigenvalues is $1 + 2\cos\varphi$, where $\varphi$ is the
net angle of rotation about a single fixed axis.
(b) Given that one eigenvalue is 1, show that the other two eigenvalues must
be $e^{i\varphi}$ and $e^{-i\varphi}$.
Our orthogonal rotation matrix (real elements) has complex eigenvalues.
4.7.6 A is an «th order Hermitian matrix with orthonormal eigenvectors |x;> and
real eigenvalues Ax < A2 < A3 < • • • < An. Show that for a unit magnitude vector
4.7.7 A particular matrix is both Hermitian and unitary. Show that its eigenvalues
are all ±1.
Note. The Pauli and Dirac matrices are specific examples.
4.7.8 For his relativistic electron theory Dirac required a set of four anticommuting
matrices. Assume that these matrices are to be Hermitian and unitary. If these
are n x n matrices, show that n must be even. With 2x2 matrices inadequate
(why?), this demonstrates that the smallest possible matrices forming a set of
four anticommuting, Hermitian, unitary matrices are 4x4.
4.7.9 A is a normal matrix with eigenvalues kn and orthonormal eigenvectors |х„>.
Show that A may be written as
Hint. Show that both this eigenvector form of A and the original A give the
same result acting on an arbitrary vector | y>.
4.7.10 A has eigenvalues 1 and −1 and corresponding eigenfunctions $\binom{1}{0}$ and $\binom{0}{1}$.
Construct A.
4.7.11 A non-Hermitian matrix A has eigenvalues $\lambda_i$ and corresponding eigenvectors
$|u_i\rangle$. The adjoint matrix $A^\dagger$ has the same set of eigenvalues but different
corresponding eigenvectors, $|v_i\rangle$. Show that the eigenvectors form a biorthogonal
set in the sense that
$$\langle v_i|u_j\rangle = 0 \quad \text{for } \lambda_i^* \ne \lambda_j.$$
4.7.12 You are given a pair of equations:
$$A|f_n\rangle = \lambda_n|g_n\rangle$$
$$\tilde{A}|g_n\rangle = \lambda_n|f_n\rangle \quad \text{with A real.}$$
(a) Prove that $|f_n\rangle$ is an eigenvector of $(\tilde{A}A)$ with eigenvalue $\lambda_n^2$.
(b) Prove that $|g_n\rangle$ is an eigenvector of $(A\tilde{A})$ with eigenvalue $\lambda_n^2$.
(c) State how you know that
1. The $|f_n\rangle$ form an orthogonal set.
2. The $|g_n\rangle$ form an orthogonal set.
3. $\lambda_n^2$ is real.
4.7.13 Prove that A of the preceding problem may be written as
$$A = \sum_n \lambda_n |g_n\rangle\langle f_n|,$$
with the $|g_n\rangle$ and $\langle f_n|$ normalized to unity.
Hint. (a) Show that this sum operating on an arbitrary vector yields the same
result as A operating on that vector.
(b) Expand your arbitrary vector as a linear combination of the $|f_n\rangle$.
4.7.14 Given
-4,
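(a) Construct the transpose $\tilde{A}$ and the symmetric forms $\tilde{A}A$ and $A\tilde{A}$.
(b) From $A\tilde{A}|g_n\rangle = \lambda_n^2|g_n\rangle$
find $\lambda_n$ and $|g_n\rangle$. Normalize the $|g_n\rangle$'s.
(c) From $\tilde{A}A|f_n\rangle = \lambda_n^2|f_n\rangle$
find $\lambda_n$ [same as (b)] and $|f_n\rangle$. Normalize the $|f_n\rangle$'s.
(d) Verify that
$A|f_n\rangle = \lambda_n|g_n\rangle$ and $\tilde{A}|g_n\rangle = \lambda_n|f_n\rangle$.
(e) Verify that
$A = \sum_n \lambda_n |g_n\rangle\langle f_n|$.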
4.7.15 Given the eigenvalues $\lambda_1 = 1$, $\lambda_2 = -1$, and the corresponding eigenvectors
l /Г
(a) Construct A.
(b) Verify that $A|f_n\rangle = \lambda_n|g_n\rangle$.
(c) Verify that $\tilde{A}|g_n\rangle = \lambda_n|f_n\rangle$.
ANS. A =
(
yf2\l
I
4.7.16 This is a continuation of Exercise 4.5.12, where the unitary matrix U and the
Hermitian matrix H are related by
$$U = e^{iaH}.$$
(a) If trace H = 0, show that det U = +1.
(b) If det U = +1, show that trace H = 0.
Hint. H may be diagonalized by a similarity transformation. Then, interpreting
the exponential by a Maclaurin expansion, U is also diagonal. The corresponding
eigenvalues are given by $u_j = \exp(iah_j)$.
Note. These properties, and those of Exercise 4.5.12, are vital in the development
of the concept of generators in group theory—Section 4.11.
4.7.17 An n x n matrix A has n eigenvalues $A_i$. If $B = e^A$ show that B has the same
eigenvectors as A with the corresponding eigenvalues $B_i$ given by $B_i = \exp(A_i)$.
Note. $e^A$ is defined by the Maclaurin expansion of the exponential:
$$e^A = 1 + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots.$$
4.7.18 A matrix P is a projection operator satisfying the condition
$$P^2 = P.$$
Show that the corresponding eigenvalues $(\rho^2)_\lambda$ and $\rho_\lambda$ satisfy the relation
$$(\rho^2)_\lambda = (\rho_\lambda)^2 = \rho_\lambda.$$
This means that the eigenvalues of P are 0 and 1.
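4.7.19 In the matrix eigenvector, eigenvalue equation
$$A|r_i\rangle = \lambda_i|r_i\rangle,$$
A is an n x n Hermitian matrix. For simplicity assume that its n real eigenvalues
are distinct, $\lambda_1$ being the largest. If $|r\rangle$ is an approximation to $|r_1\rangle$,
$$|r\rangle = |r_1\rangle + \sum_{i=2}^{n} \delta_i |r_i\rangle,$$
show that
$$\frac{\langle r|A|r\rangle}{\langle r|r\rangle} \le \lambda_1$$
and that the error in $\lambda_1$ is of the order $|\delta_i|^2$. Take $|\delta_i| \ll 1$.
Hint. The n $|r_i\rangle$ form a complete orthogonal set spanning the n-dimensional
(complex) space.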
4.7.20 Two equal masses are connected to each other and to walls by springs as shown
in the figure. The masses are constrained to stay on a horizontal line.
(a) Set up the Newtonian acceleration equation for each mass.
(b) Solve the secular equation for the eigenvalues.
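(c) Determine the eigenvectors and thus the normal modes of motion.
[Figure: two equal masses m on a horizontal line, connected to each other and to the walls by springs; the outer springs are labeled K.]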
4.8 INTRODUCTION TO GROUP THEORY
The theory of finite groups, developed originally as a branch of pure
mathematics, can be a beautiful, fascinating toy. For the physicist, group theory,
without any loss of its beauty, is also an extraordinarily useful tool for formalizing
semi-intuitive concepts and for exploiting symmetries. Group theory becomes
a useful tool for the development of crystallography and solid state physics
when we introduce specific representations (matrices) and start calculating
group characters (traces). A brief introduction to this area appears in Section
4.9. Perhaps even more important in physics is the extension of group theory to
continuous groups¹ and the applications of these continuous groups to
quantum theory and the particles of high energy physics. This is the topic of Sections
4.10 to 4.11.
As knowledge of our physical world expanded almost explosively in the first
third of this century, Wigner and others realized that invariance was a key
concept in understanding the new phenomena and in developing appropriate
theories. The mathematical tool for treating invariants and symmetries is group
theory. It represents a unification and formalization of principles such as parity
and angular momentum that are widely used by physicists. Parity is related to
1 These are groups with an infinite number of elements. Each element depends
on one or more parameters which vary continuously.
invariance under inversion. Conservation of angular momentum is a direct
consequence of rotational symmetry, which means invariance under spatial
rotations. Although the formal techniques of group theory may not be
necessary, these powerful mathematical techniques can save much labor. Group
theory can produce a unification that (once grasped) leads to greater simplicity.
Definition of Group
A group G may be defined as a set of objects or operations (called the
elements) that may be combined or "multiplied" to form a well-defined product
and that satisfy the following four conditions. We label the set of elements
a,b,c, ... :
1. If a and b are any two elements, then the product ab
is also a member of the set.
2. The defined multiplication is associative, (ab)c =
a(bc). This is automatic for matrix multiplication.
3. There is a unit element I such that Ia = aI = a for
every element in the set.²
4. There must be an inverse or reciprocal of each
element. The set must contain an element $b = a^{-1}$ such
that $aa^{-1} = a^{-1}a = I$ for each element of the set.
In physics, these abstract conditions often take on direct physical meaning in
terms of transformations of vectors, spinors, and tensors.
As a very simple, but not trivial, example of a group, consider the set 1, a, b, c
that combine according to the group multiplication table³

      |  1   a   b   c
   ---+----------------
    1 |  1   a   b   c
    a |  a   b   c   1
    b |  b   c   1   a
    c |  c   1   a   b

Clearly, the four conditions of the definition of "group" are satisfied. The
elements a, b, c, and 1 are abstract mathematical entities, completely
unrestricted except for the preceding multiplication table.
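The four conditions can be checked mechanically for a finite table. The following is a minimal sketch, assuming the table is stored as a dictionary with `TABLE[x][y]` equal to the product xy; the names `TABLE`, `table_product`, and `is_group` are illustrative and do not come from the text.

```python
from itertools import product

# Multiplication table of the four-element example: TABLE[x][y] is the product xy.
ELEMENTS = ["1", "a", "b", "c"]
ROWS = ["1abc", "abc1", "bc1a", "c1ab"]
TABLE = {x: dict(zip(ELEMENTS, row)) for x, row in zip(ELEMENTS, ROWS)}


def table_product(x, y):
    return TABLE[x][y]


def is_group(elements, mul):
    """Brute-force check of the four group conditions for a finite set."""
    # 1. Closure: every product is again a member of the set.
    if any(mul(x, y) not in elements for x, y in product(elements, repeat=2)):
        return False
    # 2. Associativity: (xy)z = x(yz) for every triple.
    if any(
        mul(mul(x, y), z) != mul(x, mul(y, z))
        for x, y, z in product(elements, repeat=3)
    ):
        return False
    # 3. Unit element: some e with ex = xe = x for every x.
    units = [e for e in elements if all(mul(e, x) == x == mul(x, e) for x in elements)]
    if not units:
        return False
    unit = units[0]
    # 4. Inverses: every x has some y with xy = yx = e.
    return all(
        any(mul(x, y) == unit == mul(y, x) for y in elements) for x in elements
    )


print(is_group(ELEMENTS, table_product))
```

The check is exhaustive, so it is practical only for small groups; for matrix groups associativity need not be tested at all, as condition 2 notes.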
Now, for a specific representation of these group elements, let
$$1 \to 1, \quad a \to i, \quad b \to -1, \quad c \to -i, \tag{4.207}$$
combining by ordinary multiplication. Again, the four group conditions are
satisfied, and these four elements form a group. We label this group, C4. Since
the multiplication of the group elements is commutative, the group is labeled
2 Following Wigner, the unit element of a group is often labeled E, from the
German Einheit, the unit.
3 The order of the factors is row-column: ab = c in the indicated previous
example.
commutative or abelian. Our group is also a cyclic group in that the elements may
be written as successive powers of one element, in this case $i^n$, n = 0, 1, 2, 3. Note
that in writing out Eq. 4.207, we have selected a specific representation for this
group of four objects, C4.
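We recognize that the group elements 1, i, −1, −i may be interpreted as
successive 90° rotations in the complex plane. Then, from Eq. 4.63 we create the
set of four 2 x 2 matrices (replacing $\varphi$ by $-\varphi$ in Eq. 4.63 to rotate a vector
rather than rotate the coordinates)
$$\begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix},$$
and for $\varphi = 0$, $\pi/2$, $\pi$, and $3\pi/2$ we have
$$1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \quad C = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{4.208}$$
This set of four matrices forms a group with the law of combination being matrix
multiplication. Here is a second representation, now in terms of matrices. A
little matrix multiplication verifies that this representation is also abelian and
cyclic. Clearly, there is a correspondence of the two representations
$$1 \leftrightarrow 1 \leftrightarrow 1, \quad a \leftrightarrow i \leftrightarrow A, \quad b \leftrightarrow -1 \leftrightarrow B, \quad c \leftrightarrow -i \leftrightarrow C. \tag{4.209}$$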
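The "little matrix multiplication" can be delegated to a few lines of code. This sketch assumes the four rotation matrices for 0, π/2, π, and 3π/2 written out as integer tuples; the helper `matmul` and the list names are illustrative, not part of the text.

```python
# The four rotation matrices for phi = 0, pi/2, pi, 3*pi/2, stored as tuples of rows.
ONE = ((1, 0), (0, 1))
A = ((0, -1), (1, 0))
B = ((-1, 0), (0, -1))
C = ((0, 1), (-1, 0))


def matmul(x, y):
    """Product of two 2x2 matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


# Element number n of each list is the n-th power of the generator (i or A).
NUMBERS = [1, 1j, -1, -1j]
MATRICES = [ONE, A, B, C]

# Cyclic: A, A^2, A^3, A^4 reproduce A, B, C, 1.
powers = [A]
for _ in range(3):
    powers.append(matmul(powers[-1], A))

# Same multiplication table: the product of elements m and n is element
# (m + n) mod 4 in both representations.
same_table = all(
    NUMBERS[m] * NUMBERS[n] == NUMBERS[(m + n) % 4]
    and matmul(MATRICES[m], MATRICES[n]) == MATRICES[(m + n) % 4]
    for m in range(4)
    for n in range(4)
)

print(powers == [A, B, C, ONE], same_table)
```

Because the index rule (m + n) mod 4 is symmetric in m and n, the same test also confirms that both representations are abelian.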
Homomorphism, Isomorphism
There may be a correspondence between the elements of two groups (or
between two representations), one-to-one, two-to-one, or many-to-one. If this
correspondence satisfies the same group multiplication table, we say that the
two groups are homomorphic. A most important homomorphic correspondence
between the groups O3 and SU(2) is developed in Section 4.10. As a special
case, if the correspondence is one-to-one, still preserving the multiplication
table, then the groups are isomorphic.⁴ In the group C4 the two representations
(1, i, −1, −i) and (1, A, B, C) are isomorphic.
In contrast to this, there is no such correspondence between either of these
representations of group C4 and another group of four objects, the vierergruppe
(Exercise 4.2.7). The vierergruppe has a multiplication table:

       |  1    V1   V2   V3
   ----+--------------------
    1  |  1    V1   V2   V3
    V1 |  V1   1    V3   V2
    V2 |  V2   V3   1    V1
    V3 |  V3   V2   V1   1

4 Suppose the elements of one group are labeled $g_i$, the elements of a second
group $h_i$. Then $g_i \leftrightarrow h_i$, a one-to-one correspondence for all values of i. Also,
if $g_i g_j = g_k$ and $h_i h_j = h_k$, then $g_k$ and $h_k$ must be corresponding elements.
Confirming the lack of correspondence between the group represented by
(1, i, −1, −i) or the matrices (1, A, B, C) of Eq. 4.208, note that although the
vierergruppe is abelian, it is not cyclic. The cyclic group C4 and the vierergruppe
are not isomorphic.
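One way to see the difference is to compare the order of each element, the smallest power that returns the unit element. The sketch below assumes both multiplication tables are typed in by hand with the labels 1, a, b, c and 1, V1, V2, V3; the helper names are illustrative only.

```python
def make_table(labels, rows):
    """Build table[x][y] = xy from rows listed in the order of `labels`."""
    return {x: dict(zip(labels, row.split())) for x, row in zip(labels, rows)}


C4 = make_table(
    ["1", "a", "b", "c"],
    ["1 a b c", "a b c 1", "b c 1 a", "c 1 a b"],
)
VIERERGRUPPE = make_table(
    ["1", "V1", "V2", "V3"],
    ["1 V1 V2 V3", "V1 1 V3 V2", "V2 V3 1 V1", "V3 V2 V1 1"],
)


def element_orders(table, unit="1"):
    """Order of each element: the smallest n >= 1 with x^n equal to the unit."""
    orders = {}
    for x in table:
        power, n = x, 1
        while power != unit:
            power = table[power][x]
            n += 1
        orders[x] = n
    return orders


def is_abelian(table):
    return all(table[x][y] == table[y][x] for x in table for y in table)


def is_cyclic(table, unit="1"):
    """A finite group is cyclic if some element has order equal to the group order."""
    return len(table) in element_orders(table, unit).values()


for name, table in [("C4", C4), ("vierergruppe", VIERERGRUPPE)]:
    print(name, element_orders(table), is_abelian(table), is_cyclic(table))
```

An isomorphism must carry each element to one of the same order, so a group with an element of order 4 cannot correspond one-to-one with a group whose elements all square to the unit.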
Matrix Representations—Reducible and
Irreducible
The representation of group elements by matrices is a very powerful
technique and has been almost universally adopted among physicists. The use of
matrices imposes no significant restriction. It can be shown that the elements of
any finite group and of the continuous groups of Section 4.10 may be
represented by matrices and, in particular, by unitary matrices. In quantum
mechanics these unitary representations assume a special importance since unitary
matrices can be diagonalized, and the eigenvalues can serve for the classification
of quantum states.
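If there exists a unitary transformation⁵ that will transform our original
representation matrices into a diagonal or block-diagonal form, for example,
$$\begin{pmatrix} r_{11} & r_{12} & r_{13} & r_{14} \\ r_{21} & r_{22} & r_{23} & r_{24} \\ r_{31} & r_{32} & r_{33} & r_{34} \\ r_{41} & r_{42} & r_{43} & r_{44} \end{pmatrix} \to \begin{pmatrix} p_{11} & p_{12} & 0 & 0 \\ p_{21} & p_{22} & 0 & 0 \\ 0 & 0 & q_{11} & q_{12} \\ 0 & 0 & q_{21} & q_{22} \end{pmatrix} \tag{4.210}$$
such that the smaller portions or submatrices are no longer coupled together,
then the original representation is reducible. Equivalently, we have
$$SRS^{-1} = \begin{pmatrix} P & 0 \\ 0 & Q \end{pmatrix}. \tag{4.211}$$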
If R is an n x n matrix, we might have P an m x m matrix, and Q an (n — m) x
(n — m) matrix. The O's are then rectangular matrices m x (n — m) and
(n — m) x m with all elements zero. We may write this result as
$$R = P \oplus Q, \tag{4.212}$$
and say that R has been decomposed into the representations P and Q. For
instance, all representations of dimension greater than 1 of Abelian groups are
reducible. If no such unitary transformation exists, the representation is
irreducible. Among the Dirac matrices of Table 4.1, 1, $\sigma_1$, $\sigma_2$, $\sigma_3$, $\rho_3$, $\delta_1$, $\delta_2$,
and $\delta_3$ are in this reduced form. The topic of Exercise 4.8.1 is to show that the
matrices 1, A, B, and C form a reducible representation and to reduce them to
the irreducible representations. The 2 x 2 matrix representation of the
vierergruppe is likewise reducible.
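As a concrete instance of a reduction, the sketch below applies one unitary transformation to all four matrices of the 2 x 2 representation of C4 and shows that each becomes diagonal. The choice S = M/√2, with the rows of M taken as the complex-conjugated eigenvectors of A, is an assumption made for this illustration; the text leaves the construction to Exercise 4.8.1.

```python
# The 2x2 representation of C4 (rotations by 0, pi/2, pi, 3*pi/2).
REPRESENTATION = {
    "1": [[1, 0], [0, 1]],
    "A": [[0, -1], [1, 0]],
    "B": [[-1, 0], [0, -1]],
    "C": [[0, 1], [-1, 0]],
}

# Assumed choice: S = M / sqrt(2). The rows of M are the conjugated eigenvectors of A.
M = [[1, 1j], [1, -1j]]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(m):
    """Hermitian conjugate (complex-conjugate transpose) of a 2x2 matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]


def transform(r):
    """S r S^{-1}; S is unitary, so S^{-1} = S^dagger and the two 1/sqrt(2) give 1/2."""
    product = matmul(matmul(M, r), dagger(M))
    return [[entry / 2 for entry in row] for row in product]


for name, r in REPRESENTATION.items():
    print(name, transform(r))
```

The two diagonal entries of each transformed matrix are themselves one-dimensional representations: the upper entries run through the powers of i and the lower entries through the powers of −i.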
The irreducible representations play a role in group theory that is roughly
analogous to the unit vectors of vector analysis. They are the simplest
representations; all others may be built up from them.
5 A unitary matrix remains unitary under a unitary transformation.
Classes and Character
Consider a group element x transformed into a group element y by a
similarity transform with respect to $g_i$, an element of the group
$$g_i x g_i^{-1} = y. \tag{4.213}$$
The group element y is conjugate to x. A class is a set of mutually conjugate
group elements. In general, this set of elements forming a class does not satisfy
the group postulates and is not a group. Indeed, the unit element 1 which is
always in a class by itself is the only class that is also a subgroup. All members
of a given class are equivalent in the sense that any one element is a similarity
transform of any other element. Clearly, if a group is abelian, every element is a
class by itself. We find that
1. Every element of the original group belongs to one
and only one class.
2. The number of elements in a class is a factor of the
order of the group.
We get a possible physical interpretation of the concept of class by noting
that y is a similarity transform of x. If $g_i$ represents a rotation of the coordinate
system, then y is the same operation as x but relative to the new, related
coordinates.
In Section 4.3 we see that a real matrix transforms under rotation of the
coordinates by an orthogonal similarity transformation. Depending on the
choice of reference frame, essentially the same matrix may take on an infinity of
different forms. Likewise, our group representations may be put in an infinity
of different forms by using unitary transformations. But each such transformed
representation is isomorphic with the original. From Exercise 4.3.9 the trace of
each element (each matrix of our representation) is invariant under unitary
transformations. Just because it is invariant, the trace (relabeled the character)
assumes a role of some importance in group theory, particularly in applications
to solid state physics. Clearly, all members of a given class (in a given
representation) have the same character. Elements of different classes may have the same
character but elements with different characters cannot be in the same class.
The concept of class is important (1) because of the trace or character and
(2) because the number of nonequivalent irreducible representations of a group is
equal to the number of classes.
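To see classes and characters in a nonabelian case, the sketch below uses the six permutations of three objects as the group; this choice of example is an assumption of the illustration, not something introduced in this section. The character is taken in the 3 x 3 permutation-matrix representation, where the trace equals the number of objects left in place.

```python
from itertools import permutations

# The six permutations of three objects; p[i] is the image of i.
GROUP = list(permutations(range(3)))


def compose(p, q):
    """Permutation obtained by applying q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def conjugacy_classes(group):
    """Collect the distinct sets {g x g^-1 : g in group}."""
    classes = []
    for x in group:
        cls = frozenset(compose(compose(g, x), inverse(g)) for g in group)
        if cls not in classes:
            classes.append(cls)
    return classes


def character(p):
    """Trace of the permutation matrix of p: the number of fixed points."""
    return sum(1 for i, image in enumerate(p) if i == image)


for cls in conjugacy_classes(GROUP):
    print(sorted(cls), {character(p) for p in cls})
```

Each printed class carries a single character value, every class size divides the group order 6, and the unit element sits in a class by itself.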
Subgroups and Cosets
Frequently a subset of the group elements (including the unit element I) will
by itself satisfy the four group requirements and therefore is a group. Such a
subset is called a subgroup. Every group has two trivial subgroups: the unit
element alone and the group itself. The elements 1 and b of the four-element
group C4 discussed earlier form a nontrivial subgroup. In Section 4.10 we
consider O3, the (continuous) group of all rotations in ordinary space. The
rotations about any single axis form a subgroup of O3. Numerous other examples
of subgroups appear in the following sections.
Consider a subgroup H with elements $h_i$ and a group element x not in H.
Then $xh_i$ and $h_ix$ are not in subgroup H. The sets generated by
$$xh_i, \quad i = 1, 2, \ldots \qquad \text{and} \qquad h_ix, \quad i = 1, 2, \ldots$$
are called cosets, respectively, the left and right cosets of subgroup H with
respect to x. It can be shown (assume the contrary and prove a contradiction)
that the coset of a subgroup has the same number of distinct elements as the
subgroup. Extending this result we may express the original group G as the sum
of H and cosets:
$$G = H + x_1H + x_2H + \cdots.$$
Then the order of any subgroup is a divisor of the order of the group. It is this
result that makes the concept of coset significant. In the next section the six-
element group D3 (order 6) has subgroups of order 1, 2, and 3. D3 cannot (and
does not) have subgroups of order 4 or 5.
The similarity transform of a subgroup H by a fixed group element x not in
H, $xHx^{-1}$, yields a subgroup (Exercise 4.8.8). If this new subgroup is identical
with H for all x,
$$xHx^{-1} = H,$$
then H is called an invariant, normal, or self-conjugate subgroup. Such subgroups
are involved in the analysis of multiplets of atomic and nuclear spectra and the
particles discussed in Section 4.12. All subgroups of a commutative (abelian)
group are automatically invariant.
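The coset decomposition and the invariance test can both be tried on a small nonabelian group. This sketch again assumes the six permutations of three objects as the working example, with a two-element and a three-element subgroup chosen by hand; all names are illustrative.

```python
from itertools import permutations

GROUP = list(permutations(range(3)))
UNIT = (0, 1, 2)
H2 = [UNIT, (1, 0, 2)]             # unit element plus one interchange
H3 = [UNIT, (1, 2, 0), (2, 0, 1)]  # unit element plus the two cyclic permutations


def compose(p, q):
    """Permutation obtained by applying q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def left_cosets(group, subgroup):
    """Distinct sets xH; for x in H the set is H itself."""
    return {frozenset(compose(x, h) for h in subgroup) for x in group}


def is_invariant(group, subgroup):
    """True if x H x^-1 = H for every group element x."""
    target = set(subgroup)
    return all(
        {compose(compose(x, h), inverse(x)) for h in subgroup} == target
        for x in group
    )


for name, sub in [("H2", H2), ("H3", H3)]:
    cosets = left_cosets(GROUP, sub)
    print(name, len(cosets), sorted(len(c) for c in cosets), is_invariant(GROUP, sub))
```

The subgroup together with its cosets exhausts the group without overlap, which is why the order of the subgroup must divide the order of the group.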
EXERCISES
4.8.1 Show that the matrices 1, A, B, and C of Eq. 4.208 are reducible. Reduce them.
Note. This means transforming A and C to diagonal form (by the same unitary
transformation).
Hint. A and C are anti-Hermitian. Their eigenvectors will be orthogonal.
4.8.2 Possible operations on a crystal lattice include $A_\pi$ (rotation by $\pi$), m (reflection),
and i (inversion). These three operations combine as
$$A_\pi \cdot m = i, \quad m \cdot i = A_\pi, \quad \text{and} \quad i \cdot A_\pi = m.$$
Show that the group (1, $A_\pi$, m, i) is isomorphic with the vierergruppe.
4.8.3 Four possible operations in the xy-plane are:
1. no change: $x \to x$, $y \to y$
2. inversion: $x \to -x$, $y \to -y$
3. reflection: $x \to -x$, $y \to y$
4. reflection: $x \to x$, $y \to -y$.
(a) Show that these four operations form a group.
(b) Show that this group is isomorphic with the vierergruppe.
(c) Set up a 2 x 2 matrix representation.
4.8.4 Rearrangement theorem.
Given a group of n distinct elements (I, a, b, c, ..., n), show that the set of products
(aI, $a^2$, ab, ac, ..., an) reproduces the n distinct elements in a new order.
4.8.5 Using the 2x2 matrix representation of Exercise 4.2.7 for the vierergruppe,
(a) Show that there are four classes, each with one element.
(b) Calculate the character (trace) of each class. Note that two different classes
may have the same character.
(c) Show that there are three two-element subgroups. (The unit element by
itself always forms a subgroup.)
(d) For any one of the two-element subgroups show that the subgroup and a
single coset reproduce the original vierergruppe.
Note that subgroups, classes, and cosets are entirely different.
4.8.6 Using the 2 x 2 matrix representation, Eq. 4.208, of C4,
(a) Show that there are four classes, each with one element.
(b) Calculate the character (trace) of each class.
(c) Show that there is one two-element subgroup.
(d) Show that the subgroup and a single coset reproduce the original group.
4.8.7 Prove that the number of distinct elements in a coset of a subgroup is the same
as the number of elements in the subgroup.
4.8.8 A subgroup H has elements $h_i$. x is a fixed element of the original group G and
is not a member of H. The transform
$$xh_ix^{-1}, \quad i = 1, 2, \ldots$$
generates a conjugate subgroup $xHx^{-1}$. Show that this conjugate subgroup satisfies
each of the four group postulates and therefore is a group.
4.8.9 (a) A particular group is abelian. A second group is created by replacing $g_i$ by
$g_i^{-1}$ for each element in the original group. Show that the two groups are
isomorphic.
Note. This means showing that if $a_ib_i = c_i$, then $a_i^{-1}b_i^{-1} = c_i^{-1}$.
(b) Continuing part (a), if the two groups are isomorphic, show that each must
be abelian.
4.9 DISCRETE GROUPS
In physics, groups usually appear as a set of operations that leave a system
unchanged, invariant. This is an expression of symmetry. Indeed, a symmetry
may be defined as the invariance of the Hamiltonian of a system under a group
of transformations. Symmetry in this sense is important in classical mechanics,
but it becomes even more important and more profound in quantum mechanics.
In this section we investigate the symmetry properties of sets of objects (atoms
in a molecule or crystal). This provides additional illustrations of the group
concepts of Section 4.8 and leads directly to dihedral groups. The dihedral
groups in turn open up the study of the 32 point groups and 230 space groups
that are of such importance in crystallography and solid state physics. It might
be noted that it was through the study of crystal symmetries that the concepts of
symmetry and group theory entered physics.
Two Objects—Twofold Symmetry Axis
Consider first the two-dimensional system of two identical atoms in the xy-plane
at (1, 0) and (−1, 0), Fig. 4.8. What rotations¹ can be carried out (keeping
both atoms in the xy-plane) that will leave this system invariant? The first
candidate is, of course, the unit operator 1. A rotation of $\pi$ radians about the
z-axis completes the list. So we have a rather uninteresting group of two
members (1, −1). The z-axis is labeled a twofold symmetry axis, corresponding to
the two rotation angles 0 and $\pi$ that leave the system invariant.
FIG. 4.8 Diatomic molecule H2, N2, O2, Cl2, and so on
Our system becomes more interesting in three dimensions. Now imagine a
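molecule (or part of a crystal) with atoms of element X at ±a on the x-axis,
atoms of element Y at ±b on the y-axis, and atoms of element Z at ±c on the
z-axis as shown in Fig. 4.9. Clearly, each axis is now a twofold symmetry axis.
Using $R_x(\pi)$ to designate a rotation of $\pi$ radians about the x-axis, we may set up
a matrix representation of the rotations as in Section 4.3:
$$R_x(\pi) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \quad R_y(\pi) = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix},$$
$$R_z(\pi) = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad 1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{4.214}$$
1 We deliberately exclude reflections and inversions. They must be
brought in to develop the full set of 32 point groups.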
FIG. 4.9 D2 symmetry
These four elements [1, $R_x(\pi)$, $R_y(\pi)$, $R_z(\pi)$] form an abelian group with a
group multiplication table:

             |  1    Rx   Ry   Rz
   ----------+--------------------
    1        |  1    Rx   Ry   Rz
    Rx(π)    |  Rx   1    Rz   Ry
    Ry(π)    |  Ry   Rz   1    Rx
    Rz(π)    |  Rz   Ry   Rx   1

The products shown in this table can be obtained in either of two distinct
ways: (1) We may analyze the operations themselves: a rotation of $\pi$ about the
x-axis followed by a rotation of $\pi$ about the y-axis is equivalent to a rotation of
$\pi$ about the z-axis, $R_y(\pi)R_x(\pi) = R_z(\pi)$. (2) Alternatively, once the matrix
representation is established, we can obtain the products by matrix
multiplication. This is where the power of mathematics is shown: when the system is too
complex for a direct physical interpretation.
Comparison with Exercises 4.2.7, 4.8.2, or 4.8.3 shows immediately that this
group is the vierergruppe. The matrices of Eq. 4.214 are isomorphic with those
of Exercise 4.2.7. Also, they are obviously reducible, being diagonal. The
subgroups are (1, $R_x$), (1, $R_y$), and (1, $R_z$). They are invariant. It should be noted
that a rotation of $\pi$ about the y-axis and a rotation of $\pi$ about the z-axis is
equivalent to a rotation of $\pi$ about the x-axis: $R_z(\pi)R_y(\pi) = R_x(\pi)$. In symmetry
terms, if y and z are twofold symmetry axes, x is automatically a twofold
symmetry axis.
This symmetry group,2 the vierergruppe, is often labeled D2, the D signifying
a dihedral group and the subscript 2 signifying a twofold symmetry axis (and no
higher symmetry axis).
2 A symmetry group is a group of symmetry-preserving operations, that is,
rotations, reflections, and inversions. A symmetric group is the group of
permutations of n distinct objects, of order n!
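The second route to the multiplication table, by matrix multiplication, is short enough to sketch. The code assumes the standard diagonal matrices for rotations by π about the three cartesian axes and stores each one by its diagonal alone; the dictionary names are illustrative.

```python
# Diagonals of the rotation matrices for a rotation by pi about each axis.
ROTATIONS = {
    "1": (1, 1, 1),
    "Rx": (1, -1, -1),
    "Ry": (-1, 1, -1),
    "Rz": (-1, -1, 1),
}
NAME_OF = {diag: name for name, diag in ROTATIONS.items()}


def multiply(p, q):
    """Product of two diagonal matrices: multiply the diagonals entry by entry."""
    return tuple(a * b for a, b in zip(p, q))


# TABLE[x][y] is the name of the product xy.
TABLE = {
    x: {y: NAME_OF[multiply(ROTATIONS[x], ROTATIONS[y])] for y in ROTATIONS}
    for x in ROTATIONS
}

for x, row in TABLE.items():
    print(x, [row[y] for y in ROTATIONS])
```

Every element is its own inverse and the table is symmetric about its diagonal, the two properties that identify the vierergruppe.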
FIG. 4.10 Symmetry operations on an equilateral triangle
Three Objects—Threefold Symmetry Axis
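Consider now three identical atoms at the vertices of an equilateral triangle,
Fig. 4.10. Rotations of the triangle of 0, $2\pi/3$, and $4\pi/3$ leave the triangle
invariant. In matrix form, we have³
$$1 = R_z(0) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$
$$A = R_z(2\pi/3) = \begin{pmatrix} \cos 2\pi/3 & -\sin 2\pi/3 \\ \sin 2\pi/3 & \cos 2\pi/3 \end{pmatrix} = \begin{pmatrix} -1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & -1/2 \end{pmatrix},$$
$$B = R_z(4\pi/3) = \begin{pmatrix} -1/2 & \sqrt{3}/2 \\ -\sqrt{3}/2 & -1/2 \end{pmatrix}. \tag{4.215}$$
The z-axis is a threefold symmetry axis. (1, A, B) form a cyclic group, a
subgroup of the complete six-element group that follows.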
In the xy-plane there are three additional axes of symmetry—each atom
(vertex) and the geometric center defining an axis. Each of these is a twofold
symmetry axis. These rotations may most easily be described within our two-
dimensional framework by introducing reflections. The rotation of $\pi$ about the
C- or y-axis, which means the interchanging of atoms a and c, is just a reflection
of the x-axis:
$$C = R_C(\pi) = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{4.216}$$
3 Note that here we are rotating the triangle counterclockwise relative to
fixed coordinates.
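We may replace the rotation about the D-axis by a rotation of $4\pi/3$ (about our
z-axis) followed by a reflection of the x-axis ($x \to -x$) (Fig. 4.11):
$$D = R_D(\pi) = CB = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} -1/2 & \sqrt{3}/2 \\ -\sqrt{3}/2 & -1/2 \end{pmatrix} = \begin{pmatrix} 1/2 & -\sqrt{3}/2 \\ -\sqrt{3}/2 & -1/2 \end{pmatrix}. \tag{4.217}$$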
FIG. 4.11 The triangle on the right is the triangle on the left rotated 180° about
the D-axis. D = CB.
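In a similar manner, the rotation of $\pi$ about the E-axis interchanging a and b is
replaced by a rotation of $2\pi/3$ (A) and then a reflection⁴ of the x-axis ($x \to -x$):
$$E = R_E(\pi) = CA = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} -1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & -1/2 \end{pmatrix} = \begin{pmatrix} 1/2 & \sqrt{3}/2 \\ \sqrt{3}/2 & -1/2 \end{pmatrix}. \tag{4.218}$$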
The complete group multiplication table is

      |  1   A   B   C   D   E
   ---+------------------------
    1 |  1   A   B   C   D   E
    A |  A   B   1   D   E   C
    B |  B   1   A   E   C   D
    C |  C   E   D   1   B   A
    D |  D   C   E   A   1   B
    E |  E   D   C   B   A   1

Notice that each element of the group appears only once in each row and in each
column—as required by the rearrangement theorem, Exercise 4.8.4. Also, from
the multiplication table the group is not abelian. We have constructed a six-
4 Note that, as a consequence of these reflections, det(C) = det(D) = det(E) =
−1. The rotations A and B, of course, have a determinant of +1.
element group and a 2 x 2 irreducible matrix representation of it. The only
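other distinct six-element group is the cyclic group [1, R, $R^2$, $R^3$, $R^4$, $R^5$] with
$$R = \begin{pmatrix} \cos\pi/3 & -\sin\pi/3 \\ \sin\pi/3 & \cos\pi/3 \end{pmatrix} = \begin{pmatrix} 1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & 1/2 \end{pmatrix}. \tag{4.219}$$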
Our group [1, A, B, C, D, E] is labeled D3 in crystallography, the dihedral
group with a threefold axis of symmetry. The three axes (C, D, and E) in the
xy-plane automatically become twofold symmetry axes. As a consequence,
(1, C), (1, D), and (1, E) all form two-element subgroups. None of these two-
element subgroups of D3 is invariant.
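The D3 multiplication table can be regenerated from the 2 x 2 matrices. This is a sketch under the assumption that the six matrices are the rotations by 0, 2π/3, 4π/3 and the products C, CB, CA with the reflection x → −x; floating-point products are matched to the named matrices within a small tolerance. The helper names are illustrative.

```python
import math

ROOT3_HALF = math.sqrt(3) / 2

# The six 2x2 matrices: rotations 1, A, B and the reflections C, D = CB, E = CA.
MATRICES = {
    "1": ((1.0, 0.0), (0.0, 1.0)),
    "A": ((-0.5, -ROOT3_HALF), (ROOT3_HALF, -0.5)),
    "B": ((-0.5, ROOT3_HALF), (-ROOT3_HALF, -0.5)),
    "C": ((-1.0, 0.0), (0.0, 1.0)),
    "D": ((0.5, -ROOT3_HALF), (-ROOT3_HALF, -0.5)),
    "E": ((0.5, ROOT3_HALF), (ROOT3_HALF, -0.5)),
}


def matmul(x, y):
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def identify(m):
    """Name of the group element whose matrix equals m up to rounding error."""
    for name, ref in MATRICES.items():
        if all(
            math.isclose(m[i][j], ref[i][j], abs_tol=1e-12)
            for i in range(2)
            for j in range(2)
        ):
            return name
    raise ValueError("product is not one of the six matrices")


# TABLE[x][y] names the product xy (row element times column element).
TABLE = {
    x: {y: identify(matmul(MATRICES[x], MATRICES[y])) for y in MATRICES}
    for x in MATRICES
}

for x, row in TABLE.items():
    print(x, " ".join(row[y] for y in MATRICES))
```

Comparing `TABLE["A"]["C"]` with `TABLE["C"]["A"]` shows directly that the group is not abelian, and each printed row is a rearrangement of the six labels.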
There are two other irreducible representations of the symmetry group of the
equilateral triangle: (1) the trivial (1, 1, 1, 1, 1, 1), and (2) the almost as trivial
(1, 1, 1, −1, −1, −1), the positive signs corresponding to proper rotations and
the negative signs to improper rotations (involving a reflection). Both of these
representations are homomorphic with D3.
A general and most important result for finite groups of h elements is that
$$\sum_i n_i^2 = h, \tag{4.220}$$
where $n_i$ is the dimension of the matrices of the ith irreducible representation.
This equality, sometimes called the dimensionality theorem, is very useful in
establishing the irreducible representation of a group. Here for D3 we have
$1^2 + 1^2 + 2^2 = 6$ for our three representations. No other irreducible
representations of the symmetry group of three objects exist.
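The arithmetic for D3 can be laid out from the characters of the three representations, listed in the element order 1, A, B, C, D, E. The sketch below checks the dimension sum and, as an extra, the orthonormality of the characters; that second check relies on the standard character orthogonality relation, which this section does not state, so it should be read as an assumption of the illustration.

```python
H = 6  # number of elements of D3

# Characters (traces) in the element order 1, A, B, C, D, E.
CHARACTERS = {
    "trivial": [1, 1, 1, 1, 1, 1],
    "sign": [1, 1, 1, -1, -1, -1],
    "two_dim": [2, -1, -1, 0, 0, 0],
}

# The character of the unit element is the dimension of the representation.
dimensions = {name: chi[0] for name, chi in CHARACTERS.items()}
dimension_sum = sum(n * n for n in dimensions.values())


def inner(chi1, chi2):
    """(1/h) * sum over elements of chi1 * chi2, valid here because all characters are real."""
    return sum(a * b for a, b in zip(chi1, chi2)) / H


print(dimensions, dimension_sum == H)
for p in CHARACTERS:
    print(p, [inner(CHARACTERS[p], CHARACTERS[q]) for q in CHARACTERS])
```

With three classes in D3 and three representations whose squared dimensions already add up to 6, there is no room for a further irreducible representation.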
Dihedral Groups, Dn
A dihedral group Dn with an n-fold symmetry axis implies n axes with angular
separation of $2\pi/n$ radians. n is a positive integer, but otherwise unrestricted. If
we apply the symmetry arguments to crystal lattices, then n is limited to 1, 2, 3, 4,
and 6. The requirement of invariance of the crystal lattice under translations in
the plane perpendicular to the n-fold axis excludes n = 5, 7, and higher values.
Try to cover a plane completely with identical regular pentagons and with no
overlapping.⁵ For individual molecules, this constraint does not exist, although
the examples with n > 6 are rare. n = 5 is a real possibility. As an example, the
symmetry group for ruthenocene, (C5H5)2Ru, illustrated in Fig. 4.12, is D5.⁶
Crystallographic Point and Space Groups
The dihedral groups just considered are examples of the crystallographic
point groups. A point group is composed of combinations of rotations and
reflections (including inversions) that will leave some crystal lattice unchanged.
Limiting the operations to rotations and reflections (including inversions)
means that one point—the origin—remains fixed, hence the term point group.
5 For D6 imagine a plane covered with regular hexagons and the axis of
rotation through the geometric center of one of them.
6 Actually the full technical label is D5h, the h, indicating invariance under a
reflection of the fivefold axis.
FIG. 4.12 Ruthenocene
Including the cyclic groups, two cubic groups (tetrahedron and octahedron
symmetries), and the improper forms (involving reflections), we come to a total
of 32 point groups.
If, to the rotation and reflection operations that produced the point groups,
we add the possibility of translations and still demand that some crystal lattice
remain invariant, we come to the space groups. There are 230 distinct space
groups, a number that is appalling except, possibly, to specialists in the field.
For details (which can cover hundreds of pages) see the references.
EXERCISES
4.9.1 (a) Once you have a matrix representation of any group, a one-dimensional
representation can be obtained by taking the determinants of the matrices.
Show that the multiplicative relations are preserved in this determinant
representation.
(b) Use determinants to obtain a one-dimensional representation of D2.
4.9.2 Explain how the relation
$$\sum_i n_i^2 = h$$
applies to the vierergruppe (h = 4) and to the dihedral group D3 (h = 6).
4.9.3 Show that the subgroup (1, A, B) of D3 is an invariant subgroup.
4.9.4 The group D3 may be discussed as a permutation group of three objects. Matrix
B, for instance, rotates vertex a (originally in location 1) to the position formerly
occupied by c (location 3). Vertex b moves from location 2 to location 1, and
so on. As a permutation, (a b c) → (b c a). In three dimensions
$$\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}\begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} b \\ c \\ a \end{pmatrix}.$$
(a) Develop analogous 3x3 representations for the other elements of D3.
(b) Reduce your 3 x 3 representation to the 2 x 2 representation of this section.
(This 3x3 representation must be reducible or Eq. 4.220 would be violated.)
Note. The actual reduction of a reducible representation may be awkward. It
is often easier to develop directly a new representation of the required dimension.
4.9.5 (a) The permutation group of four objects, P4, has 4! = 24 elements. Treating
the four elements of the cyclic group, C4, as permutations, set up a 4 x 4
matrix representation of C4. C4 becomes a subgroup of P4.
(b) How do you know that this 4x4 matrix representation of C4 must be
reducible?
Note. C4 is abelian and every abelian group of h objects has only h one-
dimensional irreducible representations.
4.9.6 (a) The objects (a b c d) are permuted to (d a c b). Write out a 4 x 4 matrix
representation of this one permutation.
(b) Is the permutation (a b d c) → (d a c b) odd or even?
(c) Is this permutation a possible member of the D4 group? Why or why not?
4.9.7 The elements of the dihedral group Dn may be written in the form
$$S^\lambda R_z^\mu(2\pi/n), \qquad \lambda = 0, 1; \quad \mu = 0, 1, \ldots, n - 1,$$
where $R_z(2\pi/n)$ represents a rotation of $2\pi/n$ about the n-fold symmetry axis,
whereas S represents a rotation of $\pi$ about an axis through the center of the
regular polygon and one of its vertices.
For S = E show that this form may describe the matrices A, B, C, and D of D3.
Note. The elements Rz and S are called the generators of this finite group.
Similarly, i is the generator of the group given by Eq. 4.207.
4.9.8 Show that the cyclic group of n objects, Cn, may be represented by $r^m$, m = 0, 1,
2, ..., n − 1. Here r is a generator given by
$$r = \exp(2\pi i s/n).$$
The parameter s takes on the values s = 1, 2, 3, ..., n, each value of s yielding
a different one-dimensional (irreducible) representation of Cn.
4.9.9 Develop the irreducible 2x2 matrix representation of the group of operations
(rotations and reflections) that transform a square into itself. Give the group
multiplication table.
Note. This is the symmetry group of a square and also the dihedral group, D4.
4.9.10 The permutation group of four objects contains 4! = 24 elements. From Ex.
4.9.9, D4, the symmetry group for a square, has far less than 24 elements. Explain
the relation between D4 and the permutation group of four objects.
4.9.11 A plane is covered with regular hexagons, as shown.
(a) Determine the dihedral symmetry of an axis perpendicular to the plane
through the common vertex of three hexagons (A). That is, if the axis has
n-fold symmetry, show (with careful explanation) what n is. Write out the
2x2 matrix describing the minimum (nonzero) positive rotation of the
array of hexagons that is a member of your Dn group.
(b) Repeat part (a) for an axis perpendicular to the plane through the geometric
center of one hexagon (B).
4.9.12 In a simple cubic crystal, we might have identical atoms at r = (la,ma,na), I,
m, and n taking on all integral values.
(a) Show that each cartesian axis is a fourfold symmetry axis.
(b) The cubic group will consist of all operations (rotations, reflections,
inversion) that leave the simple cubic crystal invariant. From a consideration
of the permutation of the positive and negative coordinate axes, predict
how many elements this cubic group will contain.
4.9.13 (a) From the D3 multiplication table construct a similarity transform table
showing $xyx^{-1}$, where x and y each range over all six elements of D3.
[Table skeleton: rows and columns labeled 1, A, ..., with only the entries for 1 and A filled in.]
(b) Divide the elements of D3 into classes. Using the 2 x 2 matrix representation
of Eqs. 4.215 to 4.218 note the trace (character) of each class.
4.10 CONTINUOUS GROUPS
Infinite Groups, Lie Groups
All of the groups in the two preceding sections have contained a finite number
of elements: four for the vierergruppe, six for $D_3$, and so on. Here we introduce
groups with an infinite number of elements. The group element will contain
one or more parameters that vary continuously over some range. The
continuously varying parameter gives rise to a continuum of group elements. In
contrast to the four-member cyclic group $(1, i, -1, -i)$, we might have $e^{i\varphi}$, with
$\varphi$ varying continuously over the range $[0, 2\pi]$. The $O_3^+$ and SU(2) groups
described subsequently are additional examples.
Among the various mathematical possibilities, the continuous groups known
as Lie groups are of particular interest. The characteristic of a Lie group is that
the parameters of a product element are analytic functions¹ of the parameters of
the factors. In the case of transformations, a rotation, for instance, we might
write
$$x_i' = f_i(x_1, x_2, x_3, \theta) \tag{4.221}$$
(compare Eq. 1.9). For this transformation group to be a Lie group the functions
$f_i$ must be analytic functions of the parameter $\theta$. This will be true for the
$O_3^+$ and SU(2) groups considered here and in Section 4.11, for SU(3)
encountered in Section 4.12, and for the Lorentz group of Section 4.13. All are
Lie groups. The analytic nature of the functions (differentiability) allows us to
develop the concept of generator (Section 4.11) and to reduce the study of the
whole group to a study of the group elements in the neighborhood of the identity
element.
If these parameters vary over closed intervals such as $[0, \pi]$, or $[0, 2\pi]$ for
angles, the group is compact. An important property of this is that every
representation of a compact group is equivalent to a unitary representation. In
contrast, the homogeneous Lorentz group of Section 4.13 is not compact and the
representation L(v) is not unitary.
We now consider two continuous groups: (1) the orthogonal group $O_3^+$ and
(2) the special unitary group SU(2). A representation of $O_3^+$ is obtained from
Section 4.3. For SU(2) a $(2j + 1) \times (2j + 1)$ representation is developed (Eq.
4.235). Then these two groups are shown to be homomorphic, a two-to-one
correspondence. From this homomorphism the SU(2) representation provides
a series of representations of rotations and leads to the rotation matrix $D^j$.
Orthogonal Group, $O_3^+$
The set of n x n real orthogonal matrices forms a group. (Check to see that
the group properties of Section 4.8 are satisfied.) Our n x n matrix has
$n(n - 1)/2$ independent parameters. For n = 2, there is only one independent
parameter: one angle in Eq. 4.63. For n = 3 there are three independent
parameters: the three Euler angles of Section 4.3.
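We consider in some detail the set of 3 x 3 real orthogonal matrices with a
determinant +1: rotations only, no reflections. This group is frequently
labeled $O_3^+$, the + indicating that the determinant is +1. From Section 4.3 the
rotations about the coordinate axes are
$$R_x(\varphi) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\varphi & \sin\varphi \\ 0 & -\sin\varphi & \cos\varphi \end{pmatrix}, \quad
R_y(\theta) = \begin{pmatrix} \cos\theta & 0 & -\sin\theta \\ 0 & 1 & 0 \\ \sin\theta & 0 & \cos\theta \end{pmatrix}, \quad
R_z(\psi) = \begin{pmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{4.222}$$
¹ Analytic, defined in Section 6.2, means having derivatives of all orders.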
We are following the conventions of Section 4.3. The rotations are
counterclockwise rotations of the coordinate system to a new orientation. Also, from
Section 4.3 the general member of $O_3^+$ is the Euler angle rotation
$$A(\alpha, \beta, \gamma) = R(\alpha, \beta, \gamma) = R_z(\gamma) R_y(\beta) R_z(\alpha). \tag{4.223}$$
The relation of the $O_3^+$ group and orbital angular momentum is developed in
Section 4.11. $O_3^+$ also appears in Section 4.12 leading into SU(3) and particle
physics.
Special Unitary Group, SU(2)
The set of n x n unitary matrices also forms a group. (Again, check to see
that the group properties are satisfied.) This group is often labeled U(n). We
impose the additional restriction that the determinant of the matrices be +1 and
obtain the special unitary or unitary unimodular group, SU(n). Our n x n
unitary, unit determinant matrix has $n^2 - 1$ independent parameters. For n = 2
there are three parameters, the same as for $O_3^+$. For n = 3 there are eight
parameters. This will become the eightfold way of Section 4.12.
For n = 2 we have SU(2) with a general group element
$$U = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix} \tag{4.224}$$
with $a^*a + b^*b = 1$. As indicated, a and b are complex. These parameters are
often called the Cayley-Klein parameters, having been introduced by Cayley
and Klein in connection with problems of rotation in mechanics. Although not
quite so obvious, an alternate general form is
$$U(\xi, \eta, \zeta) = \begin{pmatrix} e^{i\xi}\cos\eta & e^{i\zeta}\sin\eta \\ -e^{-i\zeta}\sin\eta & e^{-i\xi}\cos\eta \end{pmatrix} \tag{4.225}$$
with the three parameters $\xi$, $\eta$, and $\zeta$ real. Both these forms, Eqs. 4.224 and
4.225, may be checked by showing that $UU^\dagger = 1$.
Now let us determine the irreducible representations of SU(2). Returning to
Eq. 4.224, we see that U describes a transformation of a two-component
complex column vector (called a spinor):
$$\begin{pmatrix} u' \\ v' \end{pmatrix} = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} \tag{4.226}$$
or
$$u' = au + bv, \qquad v' = -b^*u + a^*v. \tag{4.227}$$
From the form of this result, if we were to start with a homogeneous polynomial
of the nth degree in u and v and carry out the unitary transformation, Eq. 4.227,
we would still have a homogeneous nth-degree polynomial. This is significant
in that the n + 1 terms $u^n$, $u^{n-1}v$, $u^{n-2}v^2$, and so on belong to an (n + 1)-
dimensional representation of our special unitary group.
To save algebraic juggling, we follow the choice of Wigner and let n = 2j and
consider the (monomial) function
$$f_m(u, v) = \frac{u^{j+m} v^{j-m}}{\sqrt{(j+m)!\,(j-m)!}}. \tag{4.228}$$
The index m will range from $-j$ to $+j$, covering all terms of the form $u^p v^q$ with
p + q = 2j. The denominator is a sort of normalizing factor that will make our
representation unitary. If we take the action of U on $f_m(u, v)$ to be²
$$U f_m(u, v) = f_m(u', v'), \tag{4.229}$$
then
$$U f_m(u, v) = f_m(au + bv, -b^*u + a^*v) = \frac{(au + bv)^{j+m}(-b^*u + a^*v)^{j-m}}{\sqrt{(j+m)!\,(j-m)!}}. \tag{4.230}$$
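Now the job is to express the right-hand side of Eq. 4.230 as a linear
combination of terms of the form of $f_m(u, v)$. The coefficients in the linear combination
will give us the desired representation. We expand the two binomials by the
binomial theorem (Section 5.6), obtaining
$$(au + bv)^{j+m} = \sum_{k=0}^{j+m} \frac{(j+m)!}{k!\,(j+m-k)!}\, a^{j+m-k} u^{j+m-k} b^k v^k, \tag{4.231}$$
$$(-b^*u + a^*v)^{j-m} = \sum_{l=0}^{j-m} \frac{(j-m)!}{l!\,(j-m-l)!}\, (-b^*)^{j-m-l} u^{j-m-l} a^{*l} v^l. \tag{4.232}$$
Then
$$U f_m(u, v) = \sum_{k=0}^{j+m} \sum_{l=0}^{j-m} (-1)^{j-m-l} \frac{\sqrt{(j+m)!\,(j-m)!}}{k!\,l!\,(j+m-k)!\,(j-m-l)!}\, a^{j+m-k} a^{*l} b^k b^{*(j-m-l)} u^{2j-k-l} v^{k+l}.$$
If we let $j - k - l = m'$,
$$u^{2j-k-l} v^{k+l} = u^{j+m'} v^{j-m'}, \tag{4.233}$$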
² In Section 4.11 the transformation (rotation) of a function is defined in
terms of the inverse rotation of the coordinates. Here we use Eq. 4.229 since
we are setting up a comparison with $O_3^+$, which is described in terms of
rotations of the coordinates (Eq. 4.222).
matching the form of Eq. 4.228. Replacing the summation over l by a
summation over m',
$$U f_m(u, v) = \sum_{m'=-j}^{j} U_{mm'} f_{m'}(u, v), \tag{4.234}$$
where the matrix element $U_{mm'}$ is given by
$$U_{mm'} = \sum_{k=0}^{j+m} (-1)^{m'-m+k} \frac{\sqrt{(j+m)!\,(j-m)!\,(j+m')!\,(j-m')!}}{k!\,(j-m'-k)!\,(j+m-k)!\,(m'-m+k)!}\, a^{j+m-k} (a^*)^{j-m'-k} b^k (b^*)^{m'-m+k}. \tag{4.235}$$
The index k starts with zero and runs up to j + m, but the factorials³ in the
denominator guarantee that the coefficient will vanish if any exponent goes
negative.
Equation 4.234 shows that the effect of U operating on $f_m$ is given by a linear
combination of $f_{m'}$ with coefficients $U_{mm'}$. This is the same as the rotation
operator discussed at the beginning of Section 4.2. The rotation operator was
represented by the matrix A. Here the operator U is represented by the matrix
of elements $U_{mm'}$. Since m and m' each range from $-j$ to $+j$ in unit steps, our
matrices ($U_{mm'}$) representing SU(2) have dimensions $(2j + 1) \times (2j + 1)$.
To be a little more specific about this, if j = 1/2, then with rows labeled by
m = 1/2, -1/2 and columns by m' = 1/2, -1/2,
$$U = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix}, \tag{4.236}$$
identical with Eq. 4.224. The cases for j = 1 and up are most conveniently
handled with trigonometric functions, as shown subsequently.
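As a numerical cross-check, the sum over k in Eq. 4.235 can be evaluated directly. The following is a minimal Python sketch, not part of the original text. It assumes the matrix element has the form $U_{mm'} = \sum_k (-1)^{m'-m+k} \sqrt{(j+m)!(j-m)!(j+m')!(j-m')!}\; a^{j+m-k} (a^*)^{j-m'-k} b^k (b^*)^{m'-m+k} / [k!\,(j-m'-k)!\,(j+m-k)!\,(m'-m+k)!]$. The names `su2_matrix` and `times_adjoint` and the sample values of a and b are illustrative choices. Terms with a negative factorial argument are skipped, which is how the vanishing coefficients are handled here.

```python
from fractions import Fraction
from math import factorial, sqrt
import cmath


def su2_matrix(j, a, b):
    """Return U_{mm'} as nested lists; rows m = j, ..., -j and columns m' = j, ..., -j."""
    j = Fraction(j)
    labels = [j - i for i in range(int(2 * j) + 1)]
    matrix = []
    for m in labels:
        row = []
        for mp in labels:
            norm = sqrt(factorial(int(j + m)) * factorial(int(j - m))
                        * factorial(int(j + mp)) * factorial(int(j - mp)))
            total = 0j
            for k in range(int(j + m) + 1):
                p_a = int(j + m - k)      # power of a
                p_ac = int(j - mp - k)    # power of a*
                p_bc = int(mp - m + k)    # power of b*
                if min(p_a, p_ac, p_bc) < 0:
                    continue              # 1/(negative integer)! vanishes
                denom = (factorial(k) * factorial(p_ac)
                         * factorial(p_a) * factorial(p_bc))
                total += ((-1) ** p_bc * norm / denom
                          * a ** p_a * a.conjugate() ** p_ac
                          * b ** k * b.conjugate() ** p_bc)
            row.append(total)
        matrix.append(row)
    return matrix


def times_adjoint(u):
    """Return U U^dagger, which is the unit matrix when U is unitary."""
    n = len(u)
    return [[sum(u[r][k] * u[c][k].conjugate() for k in range(n))
             for c in range(n)] for r in range(n)]


# Arbitrary test values of the Cayley-Klein parameters with |a|^2 + |b|^2 = 1.
a = cmath.exp(0.3j) * cmath.cos(0.4)
b = cmath.exp(-1.1j) * cmath.sin(0.4)

u_half = su2_matrix(Fraction(1, 2), a, b)   # 2 x 2 case, compare with Eq. 4.224
u_one = su2_matrix(1, a, b)                 # 3 x 3 case, j = 1
unit_check = times_adjoint(u_one)
```

For j = 1/2 the result can be compared entry by entry with the matrix of Eq. 4.224, and `unit_check` tests the unitarity that the normalizing factor in Eq. 4.228 is meant to provide.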
SU(2)-$O_3^+$ Homomorphism
As just seen, the elements of SU(2) describe rotations in a two-dimensional
complex space. (The invariance of $s^\dagger s$, Exercise 4.10.6, suggests a "rotation"
of the spinor s, Eq. 4.226.) The determinant is +1. There are three independent
parameters. Our real orthogonal group $O_3^+$, determinant +1, clearly describes
rotations in ordinary three-dimensional space with the important characteristic
of leaving $x^2 + y^2 + z^2$ invariant. Also, there are three independent parameters.
The rotation interpretations and the equality of numbers of parameters suggest
the existence of some sort of correspondence between the groups SU(2) and
$O_3^+$. Here we develop this correspondence.
The operation of SU(2) on a matrix is given by a unitary transformation,
Eq. 4.122,
$$M' = U M U^\dagger. \tag{4.237}$$
Taking M to be a 2 x 2 matrix, we note that any 2 x 2 matrix may be written
as a linear combination of the unit matrix and the three Pauli matrices of
Section 4.2. Let M be the zero-trace matrix,
$$M = x\sigma_1 + y\sigma_2 + z\sigma_3 = \begin{pmatrix} z & x - iy \\ x + iy & -z \end{pmatrix}, \tag{4.238}$$
the unit matrix not entering. Since the trace is invariant under a unitary
transformation (Exercise 4.3.9), M' must have the same form,
$$M' = x'\sigma_1 + y'\sigma_2 + z'\sigma_3 = \begin{pmatrix} z' & x' - iy' \\ x' + iy' & -z' \end{pmatrix}. \tag{4.239}$$
The determinant is also invariant under a unitary transformation (Exercise
4.3.10). Therefore
$$-(x^2 + y^2 + z^2) = -(x'^2 + y'^2 + z'^2), \tag{4.240}$$
or $x^2 + y^2 + z^2$ is invariant under this operation of SU(2), just as with $O_3^+$.
SU(2) must, therefore, describe a rotation. This suggests that SU(2) and $O_3^+$
may be isomorphic or homomorphic.
³ From Section 10.1, $(-n)! = \pm\infty$ for n = 1, 2, 3, ....
We approach the problem of what rotation SU(2) describes by considering
special cases. Returning to Eq. 4.224 with one eye on Eq. 4.225, let $a = e^{i\xi}$
and b = 0, or
$$U_z = \begin{pmatrix} e^{i\xi} & 0 \\ 0 & e^{-i\xi} \end{pmatrix}. \tag{4.241}$$
In anticipation of Eq. 4.245, this U is given a subscript z.
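Carrying out a unitary transformation on each of the three Pauli $\sigma$'s, we
have
$$U_z \sigma_1 U_z^\dagger = \begin{pmatrix} e^{i\xi} & 0 \\ 0 & e^{-i\xi} \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} e^{-i\xi} & 0 \\ 0 & e^{i\xi} \end{pmatrix} = \begin{pmatrix} 0 & e^{2i\xi} \\ e^{-2i\xi} & 0 \end{pmatrix}. \tag{4.242}$$
We reexpress this result in terms of the Pauli $\sigma_i$'s to obtain
$$U_z\, x\sigma_1\, U_z^\dagger = x\cos 2\xi\, \sigma_1 - x\sin 2\xi\, \sigma_2. \tag{4.243}$$
Similarly,
$$U_z\, y\sigma_2\, U_z^\dagger = y\sin 2\xi\, \sigma_1 + y\cos 2\xi\, \sigma_2, \qquad U_z\, z\sigma_3\, U_z^\dagger = z\sigma_3. \tag{4.244}$$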
From these double angle expressions we see that we should start with a
half-angle: $\xi = \alpha/2$. Then, from Eqs. 4.237-4.239, 4.243, and 4.244
$$\begin{aligned} x' &= x\cos\alpha + y\sin\alpha \\ y' &= -x\sin\alpha + y\cos\alpha \\ z' &= z. \end{aligned} \tag{4.245}$$
The 2 x 2 unitary transformation using $U_z(\alpha/2)$ is equivalent to the rotation
operator $R_z(\alpha)$ of Eq. 4.222.
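This equivalence can be checked numerically by forming $M' = U_z M U_z^\dagger$ and reading x', y', and z' off the matrix elements. What follows is a small Python sketch, not part of the original text. The helper names (`matmul`, `dagger`, `transform_point`) and the sample point are illustrative assumptions, and M is taken as the zero-trace matrix with elements z, x - iy, x + iy, -z.

```python
import cmath
import math


def matmul(p, q):
    """Product of two square matrices stored as nested lists."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(p):
    """Hermitian adjoint (conjugate transpose)."""
    n = len(p)
    return [[p[j][i].conjugate() for j in range(n)] for i in range(n)]


def transform_point(x, y, z, xi):
    """Apply M' = U_z M U_z^dagger with U_z = diag(e^{i xi}, e^{-i xi}); return (x', y', z')."""
    u_z = [[cmath.exp(1j * xi), 0j], [0j, cmath.exp(-1j * xi)]]
    m = [[complex(z), x - 1j * y], [x + 1j * y, complex(-z)]]
    m_new = matmul(matmul(u_z, m), dagger(u_z))
    x_new = (m_new[1][0] + m_new[0][1]).real / 2   # M'_21 + M'_12 = 2x'
    y_new = (m_new[1][0] - m_new[0][1]).imag / 2   # M'_21 - M'_12 = 2iy'
    z_new = m_new[0][0].real                       # M'_11 = z'
    return x_new, y_new, z_new


alpha = 0.7
x, y, z = 1.0, 2.0, 3.0
rotated = transform_point(x, y, z, alpha / 2)      # half-angle: xi = alpha/2
expected = (x * math.cos(alpha) + y * math.sin(alpha),
            -x * math.sin(alpha) + y * math.cos(alpha),
            z)
```

Comparing `rotated` with `expected` reproduces Eq. 4.245 for the chosen point; the half-angle enters only through the argument `alpha / 2`.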
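The establishment of the correspondence of
$$U_y(\beta/2) = \begin{pmatrix} \cos\beta/2 & \sin\beta/2 \\ -\sin\beta/2 & \cos\beta/2 \end{pmatrix} \tag{4.246}$$
and $R_y(\beta)$ and of
$$U_x(\varphi/2) = \begin{pmatrix} \cos\varphi/2 & i\sin\varphi/2 \\ i\sin\varphi/2 & \cos\varphi/2 \end{pmatrix} \tag{4.247}$$
and $R_x(\varphi)$ is left as Exercise 4.10.7. The reader might note that $U_k(\varphi/2)$ has
the general form
$$U_k(\varphi/2) = 1\cos\varphi/2 + i\sigma_k \sin\varphi/2, \tag{4.248}$$
where k = x, y, z. We return to this point in Section 4.11.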
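The correspondence
$$U_z(\alpha/2) = \begin{pmatrix} e^{i\alpha/2} & 0 \\ 0 & e^{-i\alpha/2} \end{pmatrix} \leftrightarrow \begin{pmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix} = R_z(\alpha) \tag{4.249}$$
is not a simple one-to-one correspondence. Specifically, as $\alpha$ in $R_z$ ranges from
0 to $2\pi$, the parameter in $U_z$, $\alpha/2$, goes from 0 to $\pi$. We find
$$R_z(\alpha + 2\pi) = R_z(\alpha), \qquad U_z(\alpha/2 + \pi) = \begin{pmatrix} -e^{i\alpha/2} & 0 \\ 0 & -e^{-i\alpha/2} \end{pmatrix} = -U_z(\alpha/2). \tag{4.250}$$
Therefore both $U_z(\alpha/2)$ and $U_z(\alpha/2 + \pi) = -U_z(\alpha/2)$ correspond to $R_z(\alpha)$. The
correspondence is 2 to 1, or SU(2) and $O_3^+$ are homomorphic. This establishment
of the correspondence between the representations of SU(2) and those of $O_3^+$
means that the known representations of SU(2) automatically provide us with
the representations⁴ of $O_3^+$.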
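Combining the various rotations, we find that a unitary transformation
using
$$U(\alpha, \beta, \gamma) = U_z(\gamma/2)\, U_y(\beta/2)\, U_z(\alpha/2) \tag{4.251}$$
corresponds to the general Euler rotation $R_z(\gamma) R_y(\beta) R_z(\alpha)$. By direct
multiplication,
$$U(\alpha, \beta, \gamma) = \begin{pmatrix} e^{i\gamma/2} & 0 \\ 0 & e^{-i\gamma/2} \end{pmatrix} \begin{pmatrix} \cos\beta/2 & \sin\beta/2 \\ -\sin\beta/2 & \cos\beta/2 \end{pmatrix} \begin{pmatrix} e^{i\alpha/2} & 0 \\ 0 & e^{-i\alpha/2} \end{pmatrix} = \begin{pmatrix} e^{i(\gamma+\alpha)/2}\cos\beta/2 & e^{i(\gamma-\alpha)/2}\sin\beta/2 \\ -e^{-i(\gamma-\alpha)/2}\sin\beta/2 & e^{-i(\gamma+\alpha)/2}\cos\beta/2 \end{pmatrix}. \tag{4.252}$$
⁴ Whereas SU(2) has representations for integral and half odd integral values
of j (j = 0, 1/2, 1, 3/2, ...), $O_3^+$ is limited to integral values of j (j = 0, 1, 2, ...).
Further discussion of this point, the relation between $O_3^+$ and orbital angular
momentum, appears in Sections 4.11 and 12.7.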
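The two-to-one character of the correspondence can be made concrete with a short numerical experiment. The Python sketch below is an illustration only. It assumes the map $R_{ij} = \tfrac12\,\mathrm{Tr}(\sigma_i U \sigma_j U^\dagger)$, which follows from $M' = UMU^\dagger$ with $M = x\sigma_1 + y\sigma_2 + z\sigma_3$, and it assumes the coordinate-rotation conventions of Section 4.3 for $R_y$ and $R_z$. All function names are hypothetical.

```python
import cmath
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(p, q):
    """Product of two square matrices stored as nested lists."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(p):
    """Hermitian adjoint (conjugate transpose)."""
    n = len(p)
    return [[p[j][i].conjugate() for j in range(n)] for i in range(n)]


def rotation_from_su2(u):
    """R_ij = (1/2) Tr(sigma_i U sigma_j U^dagger), so that x'_i = sum_j R_ij x_j."""
    ud = dagger(u)
    rot = []
    for i in range(3):
        row = []
        for j in range(3):
            prod = matmul(matmul(SIGMA[i], u), matmul(SIGMA[j], ud))
            row.append(0.5 * complex(prod[0][0] + prod[1][1]).real)
        rot.append(row)
    return rot


def u_z(half):
    return [[cmath.exp(1j * half), 0j], [0j, cmath.exp(-1j * half)]]


def u_y(half):
    c, s = math.cos(half), math.sin(half)
    return [[complex(c), complex(s)], [complex(-s), complex(c)]]


def r_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def r_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]


alpha, beta, gamma = 0.4, 1.1, -0.8
u = matmul(u_z(gamma / 2), matmul(u_y(beta / 2), u_z(alpha / 2)))
minus_u = [[-e for e in row] for row in u]

r_from_u = rotation_from_su2(u)
r_from_minus_u = rotation_from_su2(minus_u)      # same rotation as for +U
r_euler = matmul(r_z(gamma), matmul(r_y(beta), r_z(alpha)))
```

Both `r_from_u` and `r_from_minus_u` should agree with `r_euler`, since the sign of U cancels in $UMU^\dagger$.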
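This is our alternate general form, Eq. 4.225, with
$$\xi = (\gamma + \alpha)/2, \qquad \eta = \beta/2, \qquad \zeta = (\gamma - \alpha)/2. \tag{4.253}$$
From Eq. 4.252 we may identify the parameters of Eq. 4.224 as
$$a = e^{i(\gamma+\alpha)/2}\cos\beta/2, \qquad b = e^{i(\gamma-\alpha)/2}\sin\beta/2. \tag{4.254}$$
With these, our SU(2) representation $U_{mm'}$ of Eq. 4.235 becomes
$$U_{mm'}(\alpha, \beta, \gamma) = \sum_k (-1)^{m'-m+k} \frac{\sqrt{(j+m)!\,(j-m)!\,(j+m')!\,(j-m')!}}{k!\,(j-m'-k)!\,(j+m-k)!\,(m'-m+k)!}\, e^{im\gamma} \left(\cos\frac{\beta}{2}\right)^{2j+m-m'-2k} \left(\sin\frac{\beta}{2}\right)^{m'-m+2k} e^{im'\alpha}. \tag{4.255}$$
Here are our irreducible representations in terms of the Euler angles. The
importance of Eq. 4.255 is that it allows us to calculate the $(2j + 1) \times (2j + 1)$
irreducible representations of SU(2) for all j (j = 0, 1/2, 1, 3/2, ...) and the
irreducible representations of $O_3^+$ for integral orbital angular momentum j
(j = 0, 1, 2, ...).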
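Rotation Matrix $D^j(\alpha, \beta, \gamma)$
In the quantum mechanics literature it is customary to take the adjoint of
$U_{mm'}$, defining⁵
$$D^j_{m'm}(\alpha, \beta, \gamma) = U^*_{mm'}(\alpha, \beta, \gamma). \tag{4.256}$$
For j = 1/2, with rows labeled by m' = 1/2, -1/2 and columns by m = 1/2, -1/2,
$$D^{1/2}(\alpha, \beta, \gamma) = \begin{pmatrix} e^{-i\alpha/2}\cos(\beta/2)\, e^{-i\gamma/2} & -e^{-i\alpha/2}\sin(\beta/2)\, e^{i\gamma/2} \\ e^{i\alpha/2}\sin(\beta/2)\, e^{-i\gamma/2} & e^{i\alpha/2}\cos(\beta/2)\, e^{i\gamma/2} \end{pmatrix}. \tag{4.258}$$
For j = 1 Eqs. 4.255 and 4.256 lead to, with rows m' = 1, 0, -1 and columns m = 1, 0, -1,
$$D^1(\alpha, \beta, \gamma) = \begin{pmatrix} e^{-i\alpha}\,\dfrac{1+\cos\beta}{2}\, e^{-i\gamma} & -e^{-i\alpha}\,\dfrac{\sin\beta}{\sqrt2} & e^{-i\alpha}\,\dfrac{1-\cos\beta}{2}\, e^{i\gamma} \\ \dfrac{\sin\beta}{\sqrt2}\, e^{-i\gamma} & \cos\beta & -\dfrac{\sin\beta}{\sqrt2}\, e^{i\gamma} \\ e^{i\alpha}\,\dfrac{1-\cos\beta}{2}\, e^{-i\gamma} & e^{i\alpha}\,\dfrac{\sin\beta}{\sqrt2} & e^{i\alpha}\,\dfrac{1+\cos\beta}{2}\, e^{i\gamma} \end{pmatrix}. \tag{4.259}$$
⁵ The reason for this is that $U_{mm'}$ is defined here in terms of rotations of
coordinates. $D^j_{m'm}$ is used to rotate functions. Further discussion of this point
appears in Section 4.11.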
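The j = 1 rotation matrix can be generated numerically from the general sum rather than by hand. The Python sketch below is illustrative and rests on stated assumptions: the matrix element $U_{mm'}$ has the form of Eq. 4.235, the Cayley-Klein parameters are $a = e^{i(\gamma+\alpha)/2}\cos\beta/2$ and $b = e^{i(\gamma-\alpha)/2}\sin\beta/2$ (the identification that follows from the Euler product of Eq. 4.252), and $D^j_{m'm} = U^*_{mm'}$ as in Eq. 4.256. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial, sqrt, cos, sin
import cmath


def su2_matrix(j, a, b):
    """U_{mm'} from the sum over k; rows m = j, ..., -j and columns m' = j, ..., -j."""
    j = Fraction(j)
    labels = [j - i for i in range(int(2 * j) + 1)]
    matrix = []
    for m in labels:
        row = []
        for mp in labels:
            norm = sqrt(factorial(int(j + m)) * factorial(int(j - m))
                        * factorial(int(j + mp)) * factorial(int(j - mp)))
            total = 0j
            for k in range(int(j + m) + 1):
                p_a, p_ac, p_bc = int(j + m - k), int(j - mp - k), int(mp - m + k)
                if min(p_a, p_ac, p_bc) < 0:
                    continue              # term with a negative factorial argument
                denom = (factorial(k) * factorial(p_ac)
                         * factorial(p_a) * factorial(p_bc))
                total += ((-1) ** p_bc * norm / denom
                          * a ** p_a * a.conjugate() ** p_ac
                          * b ** k * b.conjugate() ** p_bc)
            row.append(total)
        matrix.append(row)
    return matrix


def rotation_matrix(j, alpha, beta, gamma):
    """D^j with D[m'][m] = conj(U[m][m']); rows m' = j, ..., -j, columns m = j, ..., -j."""
    a = cmath.exp(0.5j * (gamma + alpha)) * cos(beta / 2)
    b = cmath.exp(0.5j * (gamma - alpha)) * sin(beta / 2)
    u = su2_matrix(j, a, b)
    n = len(u)
    return [[u[m][mp].conjugate() for m in range(n)] for mp in range(n)]


alpha, beta, gamma = 0.5, 1.2, -0.3
d_one = rotation_matrix(1, alpha, beta, gamma)

# Closed forms for three of the j = 1 elements, for comparison.
d_11 = cmath.exp(-1j * alpha) * (1 + cos(beta)) / 2 * cmath.exp(-1j * gamma)
d_10 = -cmath.exp(-1j * alpha) * sin(beta) / sqrt(2)
d_00 = cos(beta)
```

Comparing `d_one[0][0]`, `d_one[0][1]`, and `d_one[1][1]` with `d_11`, `d_10`, and `d_00` checks the corner, edge, and central elements of the 3 x 3 matrix.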
FIG. 4.13 Euler angle rotations ($\gamma = 0$)
For j = l, integral, the operation of the rotation matrix $D^l$ on the spherical
harmonics (Section 12.6) is given by⁶
$$Y_l^m(\theta', \varphi') = \sum_{m'=-l}^{l} D^l_{m'm}(\alpha, \beta, \gamma)\, Y_l^{m'}(\theta, \varphi). \tag{4.260}$$
The point $(\theta', \varphi')$ is the same point in space as $(\theta, \varphi)$ but measured relative to
the rotated coordinate system rather than relative to the initial system. This
rotated system is specified by the three Euler angles: $\alpha$, $\beta$, and $\gamma$. The rotation
matrix $D^l(\alpha, \beta, \gamma)$ rotates the $Y_l^m(\theta, \varphi)$ the way $A(\alpha, \beta, \gamma)$, Eq. 4.87, rotates the
coordinates. The first two Euler angles $\alpha$ and $\beta$ define a new polar axis, z'' in
Fig. 4.13, and a new zero of azimuth. (The third Euler angle $\gamma$ corresponds to a
rotation about the new polar axis and is irrelevant here.) Eq. 4.260 has a wide variety
of applications, ranging from the angular correlation of nuclear radiations to
the relation between the body fixed axes of a rotating solid to the space fixed
axes.
Note the analogy with the homogeneous functions $f_m(u, v)$ of Eq. 4.228. The
spherical harmonics $Y_l^m(\theta, \varphi)$ expressed in cartesian coordinates are
homogeneous functions of x, y, and z. (Each term of $r^l Y_l^m$ has the form $x^a y^b z^c$ with
a + b + c = l.) Thus Eq. 4.260 is the analog of Eq. 4.229.
One immediate application of the rotation matrix $D^l$ is in the proof of the
spherical harmonic addition theorem, Exercise 4.10.11. For further details of
$D^l$ the reader should consult the text by Rose, cited in the references at the end
of this chapter.
⁶ The proof of this equation hinges on the identification of $D^l_{m'm}$ as a matrix
element of the rotation operator ($\exp(-i\varphi\, \mathbf{n} \cdot \mathbf{L})$, Section 4.12) with the
spherical harmonics taken as the basis functions.
EXERCISES
4.10.1 Show that an n x n orthogonal matrix has $n(n - 1)/2$ independent parameters.
Hint. The orthogonality condition, Eq. 4.60, provides constraints.
4.10.2 Show that an n x n special unitary matrix has $n^2 - 1$ independent parameters.
Hint. Each element may be complex, doubling the number of possible
parameters. Some of the constraining equations are likewise complex and count
as two constraints.
4.10.3 The special linear group SL(2) consists of all 2 x 2 matrices (with complex
elements) having a determinant of +1. Show that such matrices form a group.
Note. The SL(2) group can be related to the full Lorentz group, Section 4.13,
much as the SU(2) group is related to $O_3^+$.
4.10.4 Show that $R_z$ is (or is not) an invariant subgroup of $O_3^+$.
4.10.5 Prove that the general form of a 2 x 2 unitary, unimodular matrix is
$$U = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix}$$
with $a^*a + b^*b = 1$.
4.10.6 Denoting the spinor (u, v) of Eq. 4.226 by s, show that $(s^\dagger s)^{1/2}$, the length
of the spinor, is conserved under the transformation U.
4.10.7 (a) Show that $U_y(\beta/2)$ corresponds to $R_y(\beta)$.
(b) Show that $U_x(\varphi/2)$ corresponds to $R_x(\varphi)$.
4.10.8 (a) Show that the $\alpha$ and $\gamma$ dependence of $D^j(\alpha, \beta, \gamma)$ may be factored out such
that
$$D^j(\alpha, \beta, \gamma) = A^j(\alpha)\, d^j(\beta)\, C^j(\gamma).$$
(b) Show that $A^j(\alpha)$ and $C^j(\gamma)$ are diagonal. Find the explicit forms.
(c) Show that $d^j(\beta) = D^j(0, \beta, 0)$.
Hint. Exercises 4.2.28 and 4.2.29 may be helpful.
4.10.9 By inspection of Eqs. 4.255 and 4.256, or the special cases, Eqs. 4.258 and 4.259,
Explain why this should be so.
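4.10.10 For l = 1 Eq. 4.260 becomes
$$Y_1^m(\theta', \varphi') = \sum_{m'=-1}^{1} D^1_{m'm}(\alpha, \beta, \gamma)\, Y_1^{m'}(\theta, \varphi).$$
Rewrite these spherical harmonics in cartesian form. Using $D^1$ from Eq. 4.259,
show that the resulting cartesian coordinate equations are equivalent to the
Euler rotation matrix $A(\alpha, \beta, \gamma)$, Eq. 4.80, rotating the coordinates.
4.10.11 (a) Assuming that $D^j(\alpha, \beta, \gamma)$ is unitary, show that
$$\sum_{m=-l}^{l} Y_l^{m*}(\theta_1, \varphi_1)\, Y_l^m(\theta_2, \varphi_2)$$
is a scalar quantity (invariant under rotations). This is a function analog
of a scalar product of vectors.
(b) From part (a) derive the spherical harmonic addition theorem, Eq. 12.224:
$$P_l(\cos\gamma) = \frac{4\pi}{2l + 1} \sum_{m=-l}^{l} Y_l^{m*}(\theta_1, \varphi_1)\, Y_l^m(\theta_2, \varphi_2).$$
Hint. Set $\theta_1 = 0$ (which makes $\theta_2 = \gamma$) and quote Exercise 12.6.2.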
4.11 GENERATORS
Rotations and Angular Momentum
From Section 4.3 we have matrix representations of the rotation of a
coordinate system and the rotation of a vector. From Section 4.10 we have matrix
representations of the rotation of functions. In all these cases rotations about
a common axis combine as
$$R(\varphi_2)\, R(\varphi_1) = R(\varphi_1 + \varphi_2).$$
Multiplication of these matrices is equivalent to addition of the arguments.
This suggests that we look for an exponential representation of our rotations:
$\exp(\varphi_1) \cdot \exp(\varphi_2) = \exp(\varphi_1 + \varphi_2)$.
From Exercise 4.5.12 we take two matrices U and H related by
$$U = e^{iaH} = 1 + iaH + (iaH)^2/2! + \cdots. \tag{4.261}$$
Here a is a real parameter independent of H. The Maclaurin expansion of the
exponential serves to define the exponential. Further, from Exercise 4.5.12,
if H is Hermitian, then U is unitary. Similarly, if U is unitary, H is Hermitian.
Now, in the context of group theory, H is labeled a generator,¹ the generator
of U. The relation of the generator to the rotation group $O_3^+$ is indicated
schematically in Fig. 4.14.
1. Starting with the left side of Fig. 4.14, the matrix describing a finite rotation
of the coordinates through an angle $\varphi$ counterclockwise about the z-axis is given
by Eq. 4.44 as
$$R_z(\varphi) = \begin{pmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{4.262}$$
2. Let the rotation described by $R_z$ be an infinitesimal rotation through an
angle $\delta\varphi$. Then $R_z$ may be written as
$$R_z(\delta\varphi) = 1 + i\,\delta\varphi\, M_z, \tag{4.263}$$
where
$$M_z = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{4.264}$$
¹ The use of the term generator here for continuous groups is completely
different from the use of this term for finite groups (compare Exercise 4.9.7).
FIG. 4.14 Group-generator relationships (schematic with the labels: infinitesimal
rotation, generator, orbital angular momentum, commutation rules, structure
constants, differential equation, N-fold iteration, exponential)
$M_z$ and the corresponding matrices $M_x$ and $M_y$ appear in Exercise 4.2.16 where
they are shown to satisfy particular commutation relations (as in Exercise 1.8.8).
In Section 12.7 we will show that this identifies the M matrices as an angular
momentum representation. $M_z$ may also be obtained by differentiation. If we
interpret the derivative of a matrix as the matrix of the derivatives, then
$$\left. \frac{dR_z}{d\varphi} \right|_{\varphi=0} = iM_z. \tag{4.265}$$
From this point of view, Eq. 4.263 is a Maclaurin expansion of $R_z$ with terms
of order $(\delta\varphi)^2$ and beyond omitted. The validity of Eq. 4.265 is a consequence
of the differentiability of Lie groups.
3. Our finite rotation $\varphi$ may be compounded of successive infinitesimal
rotations $\delta\varphi$:
$$R_z(\delta\varphi_2 + \delta\varphi_1) = (1 + i\,\delta\varphi_2 M_z)(1 + i\,\delta\varphi_1 M_z). \tag{4.266}$$
Let $\delta\varphi = \varphi/N$ for N rotations, with $N \to \infty$. Then,
$$R_z(\varphi) = \lim_{N\to\infty} [1 + (i\varphi/N) M_z]^N = \exp(i\varphi M_z). \tag{4.267}$$
From this form we identify $M_z$ as the generator of the group $R_z(\varphi)$, a subgroup
of $O_3^+$. The actual reconstruction of $R_z(\varphi)$ appears subsequently. Two
characteristics are worth noting:
a. $M_z$ is Hermitian and $R_z(\varphi)$ is unitary.
b. Trace($M_z$) = 0 and det $R_z(\varphi)$ = +1.
In direct analogy with $M_z$, $M_x$ may be identified as the generator of $R_x$,
the (sub)group of rotations about the x-axis. And then $M_y$ generates $R_y$.
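4. As indicated in Eq. 4.261, the exponential may be expanded to give
$$\begin{aligned} \exp(i\varphi M_z) &= 1 + i\varphi M_z + (i\varphi M_z)^2/2! + (i\varphi M_z)^3/3! + \cdots \\ &= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} + \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \{1 - \varphi^2/2! + \varphi^4/4! - \cdots\} + iM_z \{\varphi - \varphi^3/3! + \varphi^5/5! - \cdots\}. \end{aligned} \tag{4.268}$$
In the second preceding equality the relations
$$M_z^2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \quad \text{and} \quad M_z^3 = M_z \tag{4.269}$$
have been used. Recognizing that the first series is $\cos\varphi$ and the second $\sin\varphi$,
we have $R_z(\varphi)$ as given in Eq. 4.262.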
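Both routes from the generator to the finite rotation, the N-fold iteration of Eq. 4.267 and the Maclaurin series of Eq. 4.268, are easy to try numerically. The following Python sketch is an illustration only. It takes $M_z$ with the nonzero elements $(M_z)_{12} = -i$ and $(M_z)_{21} = i$ (Eq. 4.264), and the helper names and the choice $N = 2^{20}$ are assumptions made for the example.

```python
import math

M_Z = [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]


def identity(n):
    return [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]


def matmul(p, q):
    """Product of two square matrices stored as nested lists."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def exp_series(a, terms=30):
    """Sum the Maclaurin series 1 + A + A^2/2! + ... with the given number of terms."""
    n = len(a)
    total = identity(n)
    term = identity(n)
    for k in range(1, terms):
        term = [[e / k for e in row] for row in matmul(term, a)]   # A^k / k!
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total


def n_fold(a, doublings):
    """[1 + A/N]^N for N = 2**doublings, evaluated by repeated squaring."""
    n = len(a)
    big_n = 2 ** doublings
    step = [[(1.0 if i == j else 0.0) + a[i][j] / big_n for j in range(n)]
            for i in range(n)]
    for _ in range(doublings):
        step = matmul(step, step)
    return step


phi = 0.9
i_phi_mz = [[1j * phi * e for e in row] for row in M_Z]

r_series = exp_series(i_phi_mz)       # series route, Eq. 4.268
r_limit = n_fold(i_phi_mz, 20)        # iteration route, Eq. 4.267 with N = 2**20
r_exact = [[math.cos(phi), math.sin(phi), 0.0],
           [-math.sin(phi), math.cos(phi), 0.0],
           [0.0, 0.0, 1.0]]
```

The series converges to machine precision with a few dozen terms, while the finite-N product differs from `r_exact` by terms of order $\varphi^2/N$, which is why the limit $N \to \infty$ is needed.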
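5. Returning to the infinitesimal level, our infinitesimal rotations commute
(to first order in the $\delta\varphi$'s):
$$[R_x(\delta\varphi_x), R_y(\delta\varphi_y)] = [R_y(\delta\varphi_y), R_z(\delta\varphi_z)] = [R_z(\delta\varphi_z), R_x(\delta\varphi_x)] = 0 \tag{4.270}$$
and an infinitesimal rotation about an axis defined by a unit vector n becomes
$$R_n(\delta\varphi) = 1 + i(\delta\varphi_x M_x + \delta\varphi_y M_y + \delta\varphi_z M_z) = 1 + i\,\delta\varphi\, \mathbf{n} \cdot \mathbf{M}. \tag{4.271}$$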
6. From Exercise 4.2.16 the generators satisfy the commutation relations
$$[M_i, M_j] = i\varepsilon_{ijk} M_k \tag{4.272}$$
characteristic of angular momentum, Exercise 1.8.8. Here $\varepsilon_{ijk}$ is the totally
antisymmetric Levi-Civita symbol of Section 3.4. A summation over k is
implied, but there is only one nonvanishing term. The coefficient of $M_k$, $i\varepsilon_{ijk}$,
is called the structure constant. The structure constants form the starting point
for the development of a Lie algebra. As previously seen, the group generators
determine the structure constants. Conversely, it may be shown that the
structure constants determine the group.
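The commutation relations of Eq. 4.272 can be verified mechanically. In the Python sketch below, which is an illustration and not taken from the text, the three generators are built from the assumption $(M_k)_{ab} = -i\varepsilon_{kab}$. This reproduces the $M_z$ of Eq. 4.264; the corresponding $M_x$ and $M_y$ are assumed to be the matrices of Exercise 4.2.16. Indices run over 0, 1, 2 for x, y, z.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices drawn from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def matmul(p, q):
    """Product of two square matrices stored as nested lists."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(p, q):
    """[P, Q] = PQ - QP."""
    pq, qp = matmul(p, q), matmul(q, p)
    n = len(p)
    return [[pq[i][j] - qp[i][j] for j in range(n)] for i in range(n)]


# Generators M_x, M_y, M_z with (M_k)_{ab} = -i * epsilon_{kab} (assumed form).
M = [[[-1j * levi_civita(k, a, b) for b in range(3)] for a in range(3)]
     for k in range(3)]

# Largest deviation of [M_i, M_j] from i * epsilon_{ijk} M_k over all i, j.
max_deviation = 0.0
for i in range(3):
    for j in range(3):
        lhs = commutator(M[i], M[j])
        for a in range(3):
            for b in range(3):
                rhs = sum(1j * levi_civita(i, j, k) * M[k][a][b] for k in range(3))
                max_deviation = max(max_deviation, abs(lhs[a][b] - rhs))
```

A value of `max_deviation` equal to zero confirms that the structure constants are $i\varepsilon_{ijk}$ for this set of matrices.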
The result of this manipulation is that
$R_z(\varphi) = \exp(i\varphi M_z)$ describes a rotation of the coordinate system about the z-axis and $M_z$
is identified as an angular momentum matrix. The sign in the exponent is
positive since we have rotated the coordinate system; rotation of a vector
relative to a fixed coordinate system would be described by $\exp(-i\varphi M_z)$.
It might be noted that Eq. 4.272 has an infinite number of solutions. The three
matrices $M_x$, $M_y$, $M_z$ of Exercise 4.2.15 constitute one solution, corresponding
to one unit of angular momentum. Other solutions, $(2l + 1) \times (2l + 1)$
matrices, with l = 2, 3, 4, ..., generate the other irreducible representations of
the rotation group, $O_3^+$.
Rotation of Functions
In all the foregoing discussion the matrices rotate the coordinates. Any
physical system being described is held fixed. Now let us hold the coordinates
fixed and rotate a function $\psi(x, y, z)$ relative to our fixed coordinates. With R
to rotate the coordinates, we introduce an operator $\mathcal{R}$ to rotate functions.
We define $\mathcal{R}$ by
$$\mathcal{R}\psi(x, y, z) = \psi'(x, y, z) = \psi(\mathbf{x}') \tag{4.273}$$
with
$$\mathbf{x}' = R\mathbf{x}. \tag{4.274}$$
In words, $\mathcal{R}$ operates on the function $\psi$, rotating $\psi$ and creating a new function
$\psi'$. This new function $\psi'$ is numerically equal to $\psi(\mathbf{x}')$, where $\mathbf{x}'$ indicates that
the coordinates have been rotated by R. For the special case of a rotation about
the z-axis
$$\mathcal{R}_z(\varphi)\psi(x, y, z) = \psi(x\cos\varphi + y\sin\varphi, -x\sin\varphi + y\cos\varphi, z). \tag{4.275}$$
To get some understanding of the meaning of Eq. 4.275 consider the case
$\varphi = \pi/2$. Then
$$\mathcal{R}_z(\pi/2)\psi(x, y, z) = \psi(y, -x, z). \tag{4.276}$$
FIG. 4.15 Rotation of a function $\psi(x, y, z)$
The function $\psi$ may represent a wavefunction or some classical physical system.
Imagine that $\psi(x, y, z)$ is large when its first argument is large. Then
$\mathcal{R}_z(\varphi = \pi/2)\psi(x, y, z)$ will be large when the first argument of $\psi(y, -x, z)$ is
large, that is, when y is large. This is pictured in Fig. 4.15. The effect, then, of $\mathcal{R}_z$
is to rotate the pattern of the function $\psi$ counterclockwise, the same as R would
rotate the coordinate system.
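Returning to Eq. 4.275, consider an infinitesimal rotation again, $\varphi \to \delta\varphi$.
Then, using $R_z$, Eq. 4.262, we obtain
$$\mathcal{R}_z(\delta\varphi)\psi(x, y, z) = \psi(x + y\,\delta\varphi, y - x\,\delta\varphi, z). \tag{4.277}$$
The right side may be expanded as a Taylor series (Section 5.6) to give
$$\begin{aligned} \mathcal{R}_z(\delta\varphi)\psi(x, y, z) &= \psi(x, y, z) - \delta\varphi \{x\,\partial\psi/\partial y - y\,\partial\psi/\partial x\} + O(\delta\varphi)^2 \\ &= (1 - i\,\delta\varphi\, L_z)\psi(x, y, z), \end{aligned} \tag{4.278}$$
the differential expression in curly brackets being $iL_z$, Exercise 1.8.7 again.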
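Since a rotation of first $\varphi$ and then $\delta\varphi$ about the z-axis is given by
$$\mathcal{R}_z(\varphi + \delta\varphi)\psi = \mathcal{R}_z(\delta\varphi)\mathcal{R}_z(\varphi)\psi = (1 - i\,\delta\varphi\, L_z)\mathcal{R}_z(\varphi)\psi, \tag{4.279}$$
we have (as an operator equation)
$$\frac{\mathcal{R}_z(\varphi + \delta\varphi) - \mathcal{R}_z(\varphi)}{\delta\varphi} = -iL_z \mathcal{R}_z(\varphi). \tag{4.280}$$
The left side is just $d\mathcal{R}_z(\varphi)/d\varphi$ (for $\delta\varphi \to 0$). In this form Eq. 4.280 integrates
immediately to
$$\mathcal{R}_z(\varphi) = \exp(-i\varphi L_z). \tag{4.281}$$
Note carefully that $\mathcal{R}_z(\varphi)$ rotates functions (counterclockwise) relative to fixed
coordinates and that $L_z$ is our angular momentum operator. The constant of
integration is fixed by the boundary condition $\mathcal{R}_z(0) = 1$.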
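A quick way to see Eq. 4.281 at work is to apply the rotation to an eigenfunction of $L_z$. The Python sketch below is an illustration under stated assumptions: $L_z = -i(x\,\partial/\partial y - y\,\partial/\partial x)$, as implied by the curly-bracket expression of Eq. 4.278, and the sample function $\psi = (x + iy)e^{-z^2}$, which satisfies $L_z\psi = \psi$. For this function $\exp(-i\varphi L_z)\psi$ should equal $e^{-i\varphi}\psi$. The names `rotate_function` and `psi` are hypothetical.

```python
import cmath
import math


def rotate_function(psi, phi):
    """Return the function R_z(phi) psi defined by the coordinate substitution of Eq. 4.275."""
    c, s = math.cos(phi), math.sin(phi)

    def rotated(x, y, z):
        return psi(x * c + y * s, -x * s + y * c, z)

    return rotated


def psi(x, y, z):
    """Sample function with L_z psi = psi: (x + iy) times a function of z alone."""
    return (x + 1j * y) * math.exp(-z * z)


phi = 0.6
point = (0.8, -0.5, 0.3)

by_substitution = rotate_function(psi, phi)(*point)   # left side of Eq. 4.281
by_exponential = cmath.exp(-1j * phi) * psi(*point)   # exp(-i phi L_z) with L_z -> 1

# Quarter turn, to compare with psi(y, -x, z) of Eq. 4.276.
quarter_turn = rotate_function(psi, math.pi / 2)(*point)
quarter_expected = psi(point[1], -point[0], point[2])
```

The agreement of `by_substitution` with `by_exponential` illustrates why the exponent in Eq. 4.281 carries a negative sign.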
Note the resemblance to Eq. 4.267 and the differences. $R_z$ rotates the
coordinates; $\mathcal{R}_z$ rotates functions. $M_z$ is a matrix, $L_z$ a differential operator.
Note also that $L_x$, $L_y$, and $L_z$ satisfy exactly the same commutation relation
as $M_x$, $M_y$, and $M_z$,
$$[L_i, L_j] = i\varepsilon_{ijk} L_k \tag{4.282}$$
and yield the same structure constants.
Equations 4.281 and 4.267 might also be compared with two equations in
Section 4.3: Eq. 4.89, in which A rotates coordinates counterclockwise, and
Eq. 4.93, in which the same A rotates a vector clockwise. Here we have R
rotating coordinates counterclockwise and $\mathcal{R}$ rotating functions
counterclockwise. This is a consequence of the negative exponential in Eq. 4.281.
SU(2) and the Pauli Matrices
The elements ($U_x$, $U_y$, $U_z$) of the two-dimensional unitary group, SU(2), may
be generated by
$$\exp(\tfrac12 ia\sigma_1), \quad \exp(\tfrac12 ib\sigma_2), \quad \text{and} \quad \exp(\tfrac12 ic\sigma_3), \tag{4.283}$$
where $\sigma_1$, $\sigma_2$, and $\sigma_3$ are the three Pauli spin matrices. The three parameters
a, b, and c are real. Again, note that the $\sigma$'s are Hermitian and have zero trace.
The elements of SU(2), Eq. 4.283, are unitary and have a determinant of +1.
It might be noted that the generators in diagonal form such as $\sigma_3$ lead to
conserved quantum numbers.
The Pauli $\sigma$'s satisfy commutation relations
$$[\sigma_i, \sigma_j] = 2i\varepsilon_{ijk}\sigma_k. \tag{4.284}$$
This differs from the L and M commutation relations, Eqs. 4.272 and 4.282,
by a factor of 2. Let us therefore define $s_i = \tfrac12\sigma_i$, i = 1, 2, 3. Then
$$[s_i, s_j] = i\varepsilon_{ijk} s_k \tag{4.285}$$
exactly like the angular momentum commutation relations,² Eqs. 4.272 and
4.282, showing that the $s_i$, not $\sigma_i$, are the angular momentum operators. This
is the reason for including the $\tfrac12$'s in the generator exponentials. Essentially
this is the same as the adoption of the half-angles in the investigation of the
SU(2)-$O_3^+$ homomorphism in Section 4.10. $\exp(\tfrac12 ic\sigma_3) = \exp(ics_3) = U_z$ is the
2 x 2 analog of Eq. 4.267. $U_z$ is the rotation matrix and $s_3 = \sigma_3/2$ is the
corresponding angular momentum matrix.
Equation 4.66 gives the rotation operator for rotating the coordinates in
the three-space. Using the angular momentum matrix $s_z$, we have as the
corresponding coordinate rotation operator in two-dimensional (complex)
space
$$U_z = \exp(i\varphi s_z) = \exp(i\varphi\sigma_z/2).$$
For rotating the two-component column vector wave function (spinor) of a
spin 1/2 particle relative to fixed coordinates, the rotation operator is
$\exp(-i\varphi\sigma_z/2)$.
Expanding $\exp(ias_1) = \exp(\tfrac12 ia\sigma_1)$ as a Maclaurin series, we obtain
$$\begin{aligned} \exp(\tfrac12 ia\sigma_1) &= 1\{1 - (a/2)^2/2! + (a/2)^4/4! - \cdots\} + i\sigma_1\{(a/2) - (a/2)^3/3! + (a/2)^5/5! - \cdots\} \\ &= \begin{pmatrix} \cos a/2 & i\sin a/2 \\ i\sin a/2 & \cos a/2 \end{pmatrix} \\ &= 1\cos a/2 + i\sigma_1 \sin a/2, \end{aligned} \tag{4.286}$$
a special case of Eq. 4.248. The parameter a appears as an angle, the coefficient
of an angular momentum matrix, like $\varphi$ in Eq. 4.267. But in SU(2) form the
angle always appears as a half-angle, a/2. Similarly (completing Eq. 4.248),
$$\begin{aligned} \exp(\tfrac12 ib\sigma_2) &= \begin{pmatrix} \cos b/2 & \sin b/2 \\ -\sin b/2 & \cos b/2 \end{pmatrix} = 1\cos b/2 + i\sigma_2 \sin b/2, \\ \exp(\tfrac12 ic\sigma_3) &= \begin{pmatrix} e^{ic/2} & 0 \\ 0 & e^{-ic/2} \end{pmatrix} = 1\cos c/2 + i\sigma_3 \sin c/2. \end{aligned} \tag{4.287}$$
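The closed forms for the three exponentials can be compared with a direct summation of the Maclaurin series. The Python sketch below is illustrative, with hypothetical helper names. It sums the series for $\exp(\tfrac12 i\theta\sigma_k)$ term by term and compares the result with $1\cos\theta/2 + i\sigma_k\sin\theta/2$; the determinant is evaluated as well, since the elements of SU(2) are required to have determinant +1.

```python
import math

SIGMA = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}


def matmul(p, q):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def exp_series(a, terms=30):
    """Sum the Maclaurin series 1 + A + A^2/2! + ... for a 2 x 2 matrix A."""
    total = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for k in range(1, terms):
        term = [[e / k for e in row] for row in matmul(term, a)]   # A^k / k!
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total


def closed_form(k, angle):
    """1 cos(angle/2) + i sigma_k sin(angle/2)."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[(c if i == j else 0) + 1j * s * SIGMA[k][i][j] for j in range(2)]
            for i in range(2)]


angle = 1.3
max_difference = 0.0
determinants = {}
for k in (1, 2, 3):
    exponent = [[0.5j * angle * e for e in row] for row in SIGMA[k]]
    series = exp_series(exponent)
    exact = closed_form(k, angle)
    for i in range(2):
        for j in range(2):
            max_difference = max(max_difference, abs(series[i][j] - exact[i][j]))
    determinants[k] = series[0][0] * series[1][1] - series[0][1] * series[1][0]
```

The series and the closed form agree because $\sigma_k^2 = 1$, which splits the expansion into a cosine series multiplying the unit matrix and a sine series multiplying $i\sigma_k$.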
With this identification of the exponentials, the general form of the SU(2)
matrix may be written as
$$U(\alpha, \beta, \gamma) = \exp(\tfrac12 i\gamma\sigma_3)\exp(\tfrac12 i\beta\sigma_2)\exp(\tfrac12 i\alpha\sigma_3). \tag{4.288}$$
This reproduces Eq. 4.252 of Section 4.10. With $D(\alpha, \beta, \gamma) = U^\dagger(\alpha, \beta, \gamma)$,
this leads to Eq. 4.258. The selection of the Pauli matrices corresponds to the
Euler angle rotations described in Sections 4.3 and 4.10.
Further examples of the infinitesimal rotation-exponentiation generator
technique appear in Section 4.13.
² These structure constants ($i\varepsilon_{ijk}$) lead to the SU(2) representations of
dimension 2j + 1 for generators of dimension 2j + 1, j = 0, 1/2, 1, 3/2, .... The integral
j cases also lead to the representations of $O_3^+$, as discussed in Section 4.10.
EXERCISES
4.11.1 A translation operator T(a) converts $\psi(x)$ to $\psi(x + a)$,
$$T(a)\psi(x) = \psi(x + a).$$
In terms of the (quantum mechanical) linear momentum operator $p_x = -i\,d/dx$,
show that
$$T(a) = \exp(iap_x).$$
Hint. Expand $\psi(x + a)$ as a Taylor series.
4.11.2 Consider the general SU(2) element Eq. 4.225 to be built up of three Euler
rotations: (i) a rotation of a/2 about the z-axis, (ii) a rotation of b/2 about the
new x-axis, and (iii) a rotation of c/2 about the new z-axis. (All rotations
counterclockwise.) Using the Pauli $\sigma$ generators, show that these rotation angles are
determined by
$$\begin{aligned} a &= \xi - \zeta + \pi/2 = \alpha + \pi/2 \\ b &= 2\eta = \beta \\ c &= \xi + \zeta - \pi/2 = \gamma - \pi/2. \end{aligned}$$
Note. The angles a and b here are not the a and b of Eq. 4.224.
4.11.3 The angular momentum-exponential form of the Euler angle rotation operators is
$$\mathcal{R} = \exp(-i\gamma J_{z''})\exp(-i\beta J_{y'})\exp(-i\alpha J_z).$$
Show that in terms of the original axes
$$\mathcal{R} = \exp(-i\alpha J_z)\exp(-i\beta J_y)\exp(-i\gamma J_z).$$
Hint. The $\mathcal{R}$ operators transform as matrices. The rotation about the y'-axis
(second Euler rotation) may be referred to the original y-axis by
$$\exp(-i\beta J_{y'}) = \exp(-i\alpha J_z)\exp(-i\beta J_y)\exp(i\alpha J_z).$$
4.12 SU(2), SU(3), AND NUCLEAR PARTICLES
The application of group theory to "elementary" particles has been labeled
by Wigner the third stage of group theory and physics. The first stage was
the search for the 32 point groups and the 230 space groups giving crystal
symmetries (Section 4.9). The second stage was a search for representations
such as the representations of $O_3^+$ and SU(2) (Section 4.10). Now in this third
stage, physicists are back to a search for groups.
In discussing the strongly interacting particles of high energy physics and the
special unitary groups SU(2) and SU(3), we should look to angular momentum
and the rotation group $O_3^+$ for an analogy. Suppose we have an electron in
the spherically symmetric attractive potential of some atomic nucleus. The
electron's Schrödinger wavefunction may be characterized by three quantum
numbers n, l, and m. The energy, however, is (2l + 1)-fold degenerate, depending
only on n and l.¹ The reason for this degeneracy may be stated in two equivalent
ways:
1. The potential is spherically symmetric, independent
of $\theta$ and $\varphi$, and
2. The Schrödinger Hamiltonian $-(\hbar^2/2m_e)\nabla^2 + V(r)$
is invariant under ordinary spatial rotations ($O_3^+$).
As a consequence of the spherical symmetry of the potential, the angular
momentum L is conserved. In Section 4.11 the cartesian components of L are
identified as the generators of the rotation group $O_3^+$. Instead of representing
$L_x$, $L_y$, and $L_z$ by operators, let us use matrices. The exercises at the end of
Section 4.2 provide examples for l = 1/2, 1, and 3/2. The $L_i$ matrices are
$(2l + 1) \times (2l + 1)$ matrices with the dimension the same as the number of the degenerate
states.² These $L_i$ matrices generate the $(2l + 1) \times (2l + 1)$ irreducible
representations of $O_3^+$. The dimension 2l + 1 is identified with the 2l + 1 degenerate
states.
The common method of eliminating this degeneracy is to introduce a
constant magnetic induction B. This leads to the Zeeman effect. This magnetic
induction adds a term to the Schrodinger Hamiltonian that is not invariant
under O3. This is a symmetry-breaking term.
So much for the analogy. In the case of the strongly interacting particles
(neutrons, protons, etc.) we cannot follow the analogy directly, because we
do not yet fully understand the nuclear interaction. We do not know the
Hamiltonian. So instead, let us run the analogy backward.
In the 1930s Heisenberg proposed that nuclear forces were charge-independent,
that the only two massive particles (baryons) known then, the neutron
and proton, were two different states of the same particle. Table 4.2 shows that
they have almost the same mass. The fractional difference, $(m_n - m_p)/m_p \approx 0.0014$,
is small, suggesting that the mass difference is produced by a small
charge-dependent perturbation. It was convenient to describe this near degeneracy
by introducing a quantity I with z-projections $I_3 = \tfrac12$ for the proton,
$-\tfrac12$ for the neutron. The name coined for I was isospin. Isospin had nothing
to do with spin (the particle's intrinsic angular momentum) but the two-
1 If the potential is a pure Coulomb potential, the energy depends only on n
(see Section 13.2).
2 With $L_i$ a matrix, the Schrödinger wavefunction $\psi(r, \theta, \varphi)$ is replaced by a
state vector with $2l+1$ components. Angular momentum and the $(2l+1)$-fold
degeneracy are discussed at some length in Section 12.7.
TABLE 4.3
Baryons with Spin 1/2, Even Parity

Multiplet   Particle   Mass (MeV)    Y     I     I_3
Ξ           Ξ⁻         1321.300     −1    1/2   −1/2
            Ξ⁰         1314.900                 +1/2
Σ           Σ⁻         1197.410      0    1     −1
            Σ⁰         1192.540                  0
            Σ⁺         1189.470                 +1
Λ           Λ          1115.500      0    0      0
N           n           939.550      1    1/2   −1/2
            p           938.256                 +1/2
component isospin state vector obeyed the same mathematical relations as the
spin $J = \tfrac12$ state vector, and in particular could be taken to be an eigenvector
of the Pauli $\sigma_z$ matrix.
In the absence of charge-dependent forces, isospin is conserved (the proton
and neutron have the same mass) and we have a twofold degeneracy. Equivalently,
the unknown nuclear Hamiltonian must be invariant under the group
generated by the isospin matrices. The isospin matrices are just the three Pauli
matrices (2 × 2 matrices), and the group generated is the SU(2) group of
Section 4.10, also 2 × 2, corresponding to our twofold degeneracy.
By 1961 many more particles had been discovered (or created). The eight
shown in Table 4.3 attracted particular attention.³ It was convenient to describe
them by characteristic quantum numbers, I for isospin and Y for hypercharge.
The particles may be grouped into charge or isospin multiplets. Then the
hypercharge Y may be taken as twice the average charge of the multiplet. For
the neutron-proton multiplet
$$Y = 2 \cdot \tfrac12 (0 + 1) = 1. \tag{4.289}$$
The hypercharge and isospin values are listed in Table 4.3.
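The rule "hypercharge equals twice the average charge of the multiplet" can be checked for all four multiplets with a few lines of Python. This is only an illustrative sketch: the charges are read off the particle superscripts, and the dictionary `MULTIPLETS` and the function names `hypercharge` and `isospin` are assumptions made for this example, not notation from the text.

```python
from fractions import Fraction

# Electric charges (in units of e) of the members of each isospin multiplet,
# read from the particle superscripts (illustrative data layout).
MULTIPLETS = {
    "Xi": [-1, 0],         # Xi-, Xi0
    "Sigma": [-1, 0, 1],   # Sigma-, Sigma0, Sigma+
    "Lambda": [0],         # Lambda
    "N": [0, 1],           # n, p
}


def hypercharge(charges):
    """Hypercharge Y taken as twice the average charge of the multiplet."""
    return 2 * Fraction(sum(charges), len(charges))


def isospin(charges):
    """Isospin I from the multiplicity 2I + 1 of the multiplet."""
    return Fraction(len(charges) - 1, 2)


for name, charges in MULTIPLETS.items():
    print(name, "Y =", hypercharge(charges), "I =", isospin(charges))
```

The values produced this way may be compared with the Y and I columns of Table 4.3.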
From scattering and production experiments it had become clear that both
hypercharge Y and isospin I were conserved under strong (nuclear) interaction.
Remember L (or l) is conserved under a spherically symmetric Hamiltonian.
The eight particles thus appeared as an eightfold degeneracy, but now with two
quantities to be conserved. In 1961 Gell-Mann, and independently Ne'eman,
suggested that the strong interaction should be invariant under the three-
dimensional special unitary group, SU(3), that is, should have SU(3) symmetry.
The choice of SU(3) was based first on the existence of two conserved
quantities. This dictated a group of rank 2, a group, two of whose generators
3 All masses are given in energy units, MeV.
(and only two) commuted. Second, the group had to have an 8 × 8 representation
to account for the eight degenerate baryons. In a sense SU(3) was the
simplest generalization of SU(2). Gell-Mann set up eight generators: three for
the components of isospin, one for hypercharge, and four additional ones. All
are 3 × 3, zero-trace matrices. As with O3 and SU(2), there are an infinity of
irreducible representations. An eight-dimensional one was associated with the
eight particles of Table 4.3.⁴
We imagine the Hamiltonian for our eight baryons to be composed of three
parts
$$H = H_{\text{strong}} + H_{\text{medium}} + H_{\text{electromagnetic}}. \tag{4.290}$$
The first part, $H_{\text{strong}}$, possesses the SU(3) symmetry and leads to the eightfold
degeneracy. Introduction of a symmetry-breaking interaction, $H_{\text{medium}}$, removes
part of the degeneracy, giving the four isospin multiplets Ξ, Σ, Λ, and N. These
are multiplets because $H_{\text{medium}}$ still possesses SU(2) symmetry. Finally, the
presence of charge-dependent forces splits the isospin multiplets and removes
the last degeneracy. This imagined sequence is shown in Fig. 4.16.
Applying first-order perturbation theory of quantum mechanics, simple
relations among the baryon masses may be calculated. Also, intensity rules for
decay and scattering processes may be obtained.
Perhaps the most spectacular success of this SU(3) model has been its
prediction of new particles. In 1961 four K and three π mesons (all pseudoscalar;
spin 0, odd parity) suggested another octet, similar to the baryon octet. The
SU(3) theory predicted an eighth meson $\eta^0$, mass 563 MeV. The $\eta^0$ meson,
experimentally determined mass 548 MeV, was found soon after. Groupings
of nine of the heavier baryons (all with spin $\tfrac32$, even parity) suggested a 10-
member group or decuplet. The missing tenth baryon was predicted to have a
mass of about 1680 MeV and a negative charge. In 1964 the negatively charged
$\Omega^-$, mass 1675 ± 12 MeV, was discovered.
Since the completion of this $\tfrac32^+$ decuplet, a $\tfrac32^-$ (odd parity) multiplet for
baryons and $1^-$ and $2^+$ multiplets for mesons have been established.
The application of group theory to strongly interacting particles has been
extended beyond SU(3). There has been an extensive investigation of SU(6)
and of the more complex, higher-dimensional groups. Great attention has been
paid to the group generators and to the structure constants in the generator
commutation relations (such as $i\varepsilon_{ijk}$ for orbital angular momentum). These
structure constants define a Lie algebra. It is possible to associate space integrals
of current densities with the group generators. This leads to a current algebra
far beyond the scope of this discussion.
To keep group theory and its very real accomplishment in proper perspective,
we should emphasize that group theory identifies and formalizes symmetries.
4 This application of SU(3) has been called by Gell-Mann the "eightfold
way." Note the eight independent parameters of SU(3) (from $n^2 - 1$), the
eight generators, the 8 × 8 representation associated with eight particles. The
name also refers to the Eightfold Way of Buddha.
[FIG. 4.16 Baryon mass splitting. Mass levels of the octet are shown for $H_{\text{strong}}$, for $H_{\text{strong}} + H_{\text{medium}}$, and for $H_{\text{strong}} + H_{\text{medium}} + H_{\text{electromagnetic}}$.]
It classifies (and sometimes predicts) particles. But aside from saying that one
part of the Hamiltonian has SU(2) symmetry and another part has SU(3)
symmetry, group theory says nothing about the particle interactions. Remember
that the statement that the atomic potential is spherically symmetric tells us
nothing about the radial dependence of the potential or of the wavefunction.
4.13 HOMOGENEOUS LORENTZ GROUP
Generalizing the approach to vectors of Section 1.2, scientists demand that
our physical laws be covariant¹ under
a. space and time translations,
b. rotations in real, three-dimensional space, and
c. Lorentz transformations.
The demand for covariance under translations is based on the homogeneity of
space and time. Covariance under rotations is an assertion of the isotropy of
space. The requirement of Lorentz covariance is based on acceptance of special
relativity. All three of these transformations together form the inhomogeneous
Lorentz group or the Poincare group. Here we exclude translations. The
space rotations and the Lorentz transformations together form a group—the
homogeneous Lorentz group.
1 To be covariant means to have the same form in different coordinate systems
so that there is no preferred reference system (compare Sections 1.2 and 3.1).
We first generate a subgroup, the Lorentz transformations in which the
relative velocity v is along the $x = x_1$ axis. The generator may be determined
by considering Lorentz space-time reference frames moving with a relative
velocity $\delta v$, an infinitesimal.² The relations are similar to those for rotations
in real space, Sections 1.2, 3.1, and 4.3, except that here the angle of rotation
is pure imaginary (compare Section 3.7).
We work in Minkowski space with $x_4 = ict$. For an infinitesimal relative
velocity $\delta v$ the space-time transformation is Galilean:
$$x_1' = x_1 - \delta v\, t = x_1 + i\,\delta\beta\, x_4. \tag{4.291}$$
Here, as usual, $\beta = v/c$. By symmetry we also write
$$x_4' = x_4 + i a\,\delta\beta\, x_1 \tag{4.292}$$
with a a parameter that is fixed by the requirement that $x_1^2 + x_4^2$ be invariant,
$$x_1'^2 + x_4'^2 = x_1^2 + x_4^2. \tag{4.293}$$
Remember $x_\mu$ is the prototype four-dimensional vector in Minkowski space.
Thus Eq. 4.293 is simply a statement of the invariance of the square of the
magnitude of the "distance" vector under rotation in Minkowski space. Here
is where the special relativity is brought into our transformation. Squaring
and adding Eqs. 4.291 and 4.292 and discarding terms of order $(\delta\beta)^2$, we find
$a = -1$. Equations 4.291 and 4.292 may be combined as a matrix equation
$$\begin{pmatrix} x_1' \\ x_4' \end{pmatrix} = (1 + \delta\beta\,\sigma_2)\begin{pmatrix} x_1 \\ x_4 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}. \tag{4.294}$$
$\sigma_2$ happens to be the negative of the Pauli matrix $\sigma_y$.
The parameter $\delta\beta$ represents an infinitesimal change. Using the same
techniques as in Section 4.11, we repeat the transformation N times to develop
a finite transformation with the velocity parameter $\theta = N\,\delta\beta$. Then
$$\begin{pmatrix} x_1' \\ x_4' \end{pmatrix} = \left(1 + \frac{\theta\sigma_2}{N}\right)^{N}\begin{pmatrix} x_1 \\ x_4 \end{pmatrix}. \tag{4.295}$$
In the limit as $N \to \infty$,
$$\lim_{N\to\infty}\left(1 + \frac{\theta\sigma_2}{N}\right)^{N} = \exp\theta\sigma_2. \tag{4.296}$$
As in Section 4.11, the exponential is interpreted by a Maclaurin expansion
$$\exp\theta\sigma_2 = 1 + \theta\sigma_2 + (\theta\sigma_2)^2/2! + (\theta\sigma_2)^3/3! + \cdots. \tag{4.297}$$
Noting that $\sigma_2^2 = 1$,
$$\exp\theta\sigma_2 = 1\cosh\theta + \sigma_2\sinh\theta. \tag{4.298}$$
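The step from the Maclaurin expansion to the closed hyperbolic form can be checked numerically. The following is a minimal sketch, assuming the 2 × 2 generator with rows (0, i) and (−i, 0) that follows from Eqs. 4.291 and 4.292 with a = −1; the helper names `matmul`, `exp_series`, and `closed_form` are illustrative choices for this example only.

```python
import math

# Generator read off from Eqs. 4.291 and 4.292 with a = -1 (assumed form).
SIGMA_2 = [[0, 1j], [-1j, 0]]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def exp_series(theta, generator, terms=40):
    """Sum the Maclaurin series of exp(theta * generator) term by term."""
    n = len(generator)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        product = matmul(term, generator)
        # term now holds theta**k * generator**k / k!
        term = [[theta * entry / k for entry in row] for row in product]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def closed_form(theta):
    """1 cosh(theta) + sigma_2 sinh(theta), written out as a 2 x 2 matrix."""
    c, s = math.cosh(theta), math.sinh(theta)
    return [[c, 1j * s], [-1j * s, c]]


def max_abs_difference(a, b):
    return max(abs(a[i][j] - b[i][j])
               for i in range(len(a)) for j in range(len(a)))


theta = 0.7
print(max_abs_difference(exp_series(theta, SIGMA_2), closed_form(theta)))
```

The printed difference should be at the level of floating-point rounding, which reflects the fact that even powers of the generator reduce to the unit matrix and odd powers to the generator itself.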
2 This derivation, with a slightly different metric, appears in an article by
Strecker, J. L., Am. J. Phys. 35, 12 (1967).
Hence our finite Lorentz transformation is
$$\begin{pmatrix} x_1' \\ x_4' \end{pmatrix} = \begin{pmatrix} \cosh\theta & i\sinh\theta \\ -i\sinh\theta & \cosh\theta \end{pmatrix}\begin{pmatrix} x_1 \\ x_4 \end{pmatrix}. \tag{4.299}$$
$\sigma_2$ has generated the representations of this special Lorentz transformation.
$\cosh\theta$ and $\sinh\theta$ may be identified by considering the origin of the primed
coordinate system, $x_1' = 0$, or $x_1 = vt$. Substituting into Eq. 4.299, we have
$$0 = x_1\cosh\theta + x_4\, i\sinh\theta. \tag{4.300}$$
With $x_1 = vt$ and $x_4 = ict$,
$$\tanh\theta = \beta = v/c.$$
Note that $\theta \ne v/c$ except in the limit as $v \to 0$.
Using $1 - \tanh^2\theta = (\cosh^2\theta)^{-1}$,
$$\cosh\theta = (1 - \beta^2)^{-1/2} \equiv \gamma, \qquad \sinh\theta = \beta\gamma. \tag{4.301}$$
The matrix in Eq. 4.299 agrees with the $x_3$-$x_4$ portion of the matrix in Eq.
3.120.
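A short numerical illustration of the identification of the velocity parameter may be helpful. The sketch below assumes only the relation tanh θ = β; the function names `rapidity` and `gamma` are labels chosen for this example.

```python
import math


def rapidity(beta):
    """Velocity parameter theta defined by tanh(theta) = beta = v/c."""
    return math.atanh(beta)


def gamma(beta):
    """The usual relativistic factor (1 - beta**2)**(-1/2)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


for beta in (0.01, 0.5, 0.9, 0.99):
    theta = rapidity(beta)
    print(f"beta={beta:5.2f}  theta={theta:.6f}  "
          f"cosh={math.cosh(theta):.6f}  gamma={gamma(beta):.6f}  "
          f"sinh={math.sinh(theta):.6f}  beta*gamma={beta * gamma(beta):.6f}")
```

For small β the parameter θ is nearly equal to β, while for β approaching 1 it grows without bound; in every row the hyperbolic functions should reproduce γ and βγ.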
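The preceding special case of the velocity parallel to one space axis is easy,
but it illustrates the technique: infinitesimal velocity, exponentiation, generator.
Now we apply this exact technique to derive the Lorentz transformation
for the relative velocity v not parallel to any space axis.
Let $v_1 = \lambda|\mathbf{v}|$, $v_2 = \mu|\mathbf{v}|$, and $v_3 = \nu|\mathbf{v}|$ with $\lambda$, $\mu$, and $\nu$ the direction cosines
of v. In analogy to Eq. 4.291 we write
$$\begin{aligned} x_1' &= x_1 + i\lambda\,\delta\beta\, x_4, \\ x_2' &= x_2 + i\mu\,\delta\beta\, x_4, \\ x_3' &= x_3 + i\nu\,\delta\beta\, x_4. \end{aligned} \tag{4.302}$$
Again, by symmetry we try
$$x_4' = x_4 + i a_1\,\delta\beta\, x_1 + i a_2\,\delta\beta\, x_2 + i a_3\,\delta\beta\, x_3. \tag{4.303}$$
From
$$\sum_{i=1}^{4} x_i'^2 = \sum_{i=1}^{4} x_i^2, \tag{4.304}$$
$a_1 = -\lambda$, $a_2 = -\mu$, and $a_3 = -\nu$.
Rewriting Eqs. 4.302 and 4.303 as a matrix equation, we have
$$\begin{pmatrix} x_1' \\ x_2' \\ x_3' \\ x_4' \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & i\lambda\,\delta\beta \\ 0 & 1 & 0 & i\mu\,\delta\beta \\ 0 & 0 & 1 & i\nu\,\delta\beta \\ -i\lambda\,\delta\beta & -i\mu\,\delta\beta & -i\nu\,\delta\beta & 1 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}. \tag{4.305}$$
Subtracting out 1 and removing $\delta\beta$ as a factor, we obtain
$$x' = (1 + \delta\beta\,\sigma)x. \tag{4.306}$$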
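Here
$$\sigma = \begin{pmatrix} 0 & 0 & 0 & i\lambda \\ 0 & 0 & 0 & i\mu \\ 0 & 0 & 0 & i\nu \\ -i\lambda & -i\mu & -i\nu & 0 \end{pmatrix}. \tag{4.307}$$
By direct multiplication (with $\lambda^2 + \mu^2 + \nu^2 = 1$),
$$\sigma^2 = \begin{pmatrix} \lambda^2 & \lambda\mu & \lambda\nu & 0 \\ \lambda\mu & \mu^2 & \mu\nu & 0 \\ \lambda\nu & \mu\nu & \nu^2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{4.308}$$
and
$$\sigma^3 = \sigma. \tag{4.309}$$
As before, we iterate N times with $\theta = N\,\delta\beta$. Forming the exponential
$$\lim_{N\to\infty}(1 + \theta\sigma/N)^N = e^{\theta\sigma} = 1 + \sigma\sinh\theta + \sigma^2(\cosh\theta - 1). \tag{4.310}$$
$\sigma$ is our generator with the parameters $\lambda$, $\mu$, and $\nu$ defining the direction of
the velocity built in. Writing out the second part of Eq. 4.310, the Lorentz
transformation matrix in all its glory is
$$L(\mathbf{v}) = \begin{pmatrix} 1 + \lambda^2(\cosh\theta - 1) & \lambda\mu(\cosh\theta - 1) & \lambda\nu(\cosh\theta - 1) & i\lambda\sinh\theta \\ \lambda\mu(\cosh\theta - 1) & 1 + \mu^2(\cosh\theta - 1) & \mu\nu(\cosh\theta - 1) & i\mu\sinh\theta \\ \lambda\nu(\cosh\theta - 1) & \mu\nu(\cosh\theta - 1) & 1 + \nu^2(\cosh\theta - 1) & i\nu\sinh\theta \\ -i\lambda\sinh\theta & -i\mu\sinh\theta & -i\nu\sinh\theta & \cosh\theta \end{pmatrix}. \tag{4.311}$$
Again, $\cosh\theta = (1 - \beta^2)^{-1/2} = \gamma$, $\sinh\theta = \beta\gamma$.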
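To see the general transformation matrix at work, one may build it numerically and confirm that it leaves the sum of the squares of the four Minkowski coordinates unchanged. This is a sketch under stated assumptions: the matrix layout follows Eq. 4.311 with x4 = ict, units with c = 1 are chosen for convenience, and the names `boost_matrix`, `apply`, and `interval` are illustrative only.

```python
import math


def boost_matrix(direction, beta):
    """Matrix of the form of Eq. 4.311 in coordinates (x1, x2, x3, x4 = ict)."""
    n = direction                         # direction cosines (lambda, mu, nu)
    theta = math.atanh(beta)              # velocity parameter, tanh(theta) = beta
    ch, sh = math.cosh(theta), math.sinh(theta)
    L = [[0j] * 4 for _ in range(4)]
    for i in range(3):
        for j in range(3):
            L[i][j] = (1.0 if i == j else 0.0) + n[i] * n[j] * (ch - 1.0)
        L[i][3] = 1j * n[i] * sh
        L[3][i] = -1j * n[i] * sh
    L[3][3] = ch
    return L


def apply(L, x):
    """Matrix times column vector."""
    return [sum(L[i][j] * x[j] for j in range(4)) for i in range(4)]


def interval(x):
    """Sum of the squares of the four components (no complex conjugation)."""
    return sum(component * component for component in x)


direction = (2 / 7, 3 / 7, 6 / 7)   # 4 + 9 + 36 = 49, so the cosines are normalized
c, t = 1.0, 2.0                      # illustrative units with c = 1
x = [0.3, -1.2, 0.5, 1j * c * t]
x_prime = apply(boost_matrix(direction, 0.6), x)
print(interval(x), interval(x_prime))
```

The two printed values should agree to rounding error. The space components of the transformed vector remain real and the fourth component remains pure imaginary, as required for x4 = ict.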
It is worth noting that the combination of Eqs. 4.310 and 4.311,
$$L(\mathbf{v}) = e^{\theta\sigma}, \tag{4.312}$$
is not in the exact form of Eq. 4.261. The exponent lacks the factor i, and L(v)
is not unitary.
The matrices given by Eq. 4.299 for the case of $\mathbf{v} = \hat{\mathbf{i}}\,v_x$ form a subgroup.
The matrices of Eq. 4.311 do not. The product of two Lorentz transformation
matrices, $L(\mathbf{v}_1)$ and $L(\mathbf{v}_2)$, yields a third Lorentz matrix $L(\mathbf{v}_3)$ if the two
velocities $\mathbf{v}_1$ and $\mathbf{v}_2$ are parallel. The resultant velocity $\mathbf{v}_3$ is related to $\mathbf{v}_1$ and $\mathbf{v}_2$
by the Einstein velocity addition law, Section 3.7. If $\mathbf{v}_1$ and $\mathbf{v}_2$ are not parallel,
no such simple relation exists. Specifically, consider three reference frames S,
S', and S", with S and S' related by $L(\mathbf{v}_1)$, and S' and S" related by $L(\mathbf{v}_2)$.
If the velocity of S" relative to the original system S is $\mathbf{v}_3$, S" is not obtained
from S by $L(\mathbf{v}_3) = L(\mathbf{v}_2)L(\mathbf{v}_1)$. Rather, we find that
$$L(\mathbf{v}_3) = R\,L(\mathbf{v}_2)L(\mathbf{v}_1), \tag{4.313}$$
where R is a 3 × 3 space rotation matrix embedded in our four-dimensional
space-time. With $\mathbf{v}_1$ and $\mathbf{v}_2$ not parallel, the final system S" is rotated relative
to S. This rotation is the origin of the Thomas precession involved in spin-orbit
coupling terms in atomic and nuclear physics. Because of its presence, the L(v)
by themselves do not form a group.
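The failure of the L(v) to close under multiplication can be made concrete numerically. Every matrix of the form of Eq. 4.311 is Hermitian (a real symmetric space block bordered by ±i times a real vector), so a product that is not Hermitian cannot be written in that form. The sketch below is illustrative only; the helper names `boost`, `matmul`, `hermitian_defect`, and `max_diff`, and the chosen parameter values, are assumptions made for this example.

```python
import math


def boost(direction, theta):
    """Matrix of the form of Eq. 4.311 for direction cosines and parameter theta."""
    n = direction
    ch, sh = math.cosh(theta), math.sinh(theta)
    L = [[0j] * 4 for _ in range(4)]
    for i in range(3):
        for j in range(3):
            L[i][j] = (1.0 if i == j else 0.0) + n[i] * n[j] * (ch - 1.0)
        L[i][3] = 1j * n[i] * sh
        L[3][i] = -1j * n[i] * sh
    L[3][3] = ch
    return L


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def hermitian_defect(m):
    """Largest |m_ij - conj(m_ji)|; zero for any matrix of the form of Eq. 4.311."""
    return max(abs(m[i][j] - m[j][i].conjugate())
               for i in range(4) for j in range(4))


def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))


X_AXIS, Y_AXIS = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
theta_1, theta_2 = 0.4, 0.9

parallel = matmul(boost(X_AXIS, theta_2), boost(X_AXIS, theta_1))
crossed = matmul(boost(Y_AXIS, theta_2), boost(X_AXIS, theta_1))

print("parallel vs single boost:", max_diff(parallel, boost(X_AXIS, theta_1 + theta_2)))
print("Hermitian defect, parallel:", hermitian_defect(parallel))
print("Hermitian defect, crossed: ", hermitian_defect(crossed))
```

For parallel velocities the product coincides with a single transformation whose parameter is the sum of the two parameters. For perpendicular velocities the product has a nonvanishing Hermitian defect, which signals the extra rotation R.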
EXERCISES
4.13.1 Obtain $\sigma(\lambda, \mu, \nu)$ by differentiating the final matrix, Eq. 4.311.
4.13.2 Two Lorentz transformations are carried out in succession: $v_1$ along the x-axis,
then $v_2$ along the y-axis. Show that the resultant transformation (given by the
product of these two successive transformations) cannot be put in the form of
Eq. 4.311.
Note. The discrepancy corresponds to a rotation.
4.13.3 Rederive the Lorentz transformation working entirely in the real space $(x_0, x_1, x_2, x_3)$
with $x_0 = ct$. Show that the Lorentz transformation may again be written
$L(\mathbf{v}) = \exp(\theta\sigma)$, Eq. 4.312, but now with
$$\sigma = \begin{pmatrix} 0 & -\lambda & -\mu & -\nu \\ -\lambda & 0 & 0 & 0 \\ -\mu & 0 & 0 & 0 \\ -\nu & 0 & 0 & 0 \end{pmatrix}.$$
4.13.4 Using the matrix relation, Eq. 4.299, let the velocity parameter $\theta_1$ relate the
Lorentz reference frames $(x_1', x_4')$ and $(x_1, x_4)$. Let $\theta_2$ relate $(x_1'', x_4'')$ and $(x_1', x_4')$.
Finally, let $\theta$ relate $(x_1'', x_4'')$ and $(x_1, x_4)$. From $\theta = \theta_1 + \theta_2$ derive the Einstein
velocity addition law
$$v = \frac{v_1 + v_2}{1 + \dfrac{v_1 v_2}{c^2}}.$$
REFERENCES
Aitken, A. C., Determinants and Matrices. New York: Interscience Publishers (1956).
Reprinted, Greenwood (1983).
A readable introduction to determinants and matrices.
Bickley, W. G., and R. S. H. G. Thompson, Matrices: Their Meaning and Manipulation.
Princeton, N.J.: Van Nostrand (1964).
A comprehensive account of the occurrence of matrices in physical problems, their
analytic properties, and numerical techniques.
Buerger, M. J., Elementary Crystallography. New York: Wiley (1956).
A comprehensive discussion of crystal symmetries. Buerger develops all 32 point groups
and all 230 space groups.
Related books by this author include Contemporary Crystallography. New York:
McGraw-Hill (1970), Crystal Structure Analysis, Krieger (1979) (reprint, 1960), and
Introduction to Crystal Geometry, Krieger (1977) (reprint, 1971).
Burns, G., and A. M. Glazer, Space Groups for Solid State Scientists. New York:
Academic Press (1978).
A well-organized, readable treatment of groups and their application to the solid state.
Falicov, L. M., Group Theory and Its Physical Applications. Notes compiled by A.
Luehrmann. Chicago: University of Chicago Press (1966).
Group theory with an emphasis on applications to crystal symmetries and solid state
physics.
Gell-Mann, M., and Ne'eman, Y., The Eightfold Way. New York: Benjamin (1965).
A collection of reprints of significant papers on SU(3) and the particles of high-energy
physics. The several introductory sections by Gell-Mann and Ne'eman are especially
helpful.
Hamermesh, M., Group Theory and Its Application to Physical Problems. Reading, Mass.:
Addison-Wesley (1962).
A detailed, rigorous account of both finite and continuous groups. The 32 point groups
are developed. The continuous groups are treated with Lie algebra included. A wealth
of applications to atomic and nuclear physics.
Higman, B., Applied Group-Theoretic and Matrix Methods. New York: Dover (1964),
Oxford: Oxford University Press (1955).
A rather complete and unusually intelligible development of matrix analysis and group
theory.
Park, D., "Resource Letter SP-1 on Symmetry in Physics." Am. J. Phys. 36, 577-584
(1968).
Includes a large selection of basic references on group theory and its applications to
physics: atoms, molecules, nuclei, solids, and elementary particles.
Ram, B., Am. J. Phys. 35, 16 (1967).
An excellent discussion of the application of SU(3) to the strongly interacting particles
(baryons). For a sequel to this see R. D. Young, "Physics of the Quark Model." Am.
J. Phys. 41, 472 (1973).
Rose, M. E., Elementary Theory of Angular Momentum. New York: Wiley (1957).
As part of the development of the quantum theory of angular momentum, Rose includes
a detailed and readable account of the rotation group.
Wigner, E. P., Group Theory and Its Application to the Quantum Mechanics of Atomic
Spectra. Translated by J. J. Griffin. New York and London: Academic Press (1959).
This is the classic reference on group theory for the physicist. The rotation group is
treated in considerable detail. There are a wealth of applications to atomic physics.
5 INFINITE SERIES
5.1 FUNDAMENTAL CONCEPTS
Infinite series, literally summations of an infinite number of terms, occur
frequently in both pure and applied mathematics. They may be used by the
pure mathematician to define functions as a fundamental approach to the theory
of functions, as well as for calculating accurate values of transcendental constants
and transcendental functions. In the mathematics of science and engineering
infinite series are ubiquitous, for they appear in the evaluation of integrals
(Sections 5.6 and 5.7), in the solution of differential equations (Sections 8.5 and
8.6), and as Fourier series (Chapter 14), and they compete with integral
representations for the description of a host of special functions (Chapters 11, 12, and 13).
In Section 16.3 the Neumann series solution for integral equations provides one
more example of the occurrence and use of infinite series.
Right at the start we face the problem of attaching meaning to the sum of an
infinite number of terms. The usual approach is by partial sums. If we have an
infinite sequence of terms $u_1, u_2, u_3, u_4, u_5, \ldots$, we define the ith partial sum as
$$s_i = \sum_{n=1}^{i} u_n. \tag{5.1}$$
This is a finite summation and offers no difficulties. If the partial sums $s_i$ converge
to a (finite) limit as $i \to \infty$,
$$\lim_{i\to\infty} s_i = S, \tag{5.2}$$
the infinite series $\sum_{n=1}^{\infty} u_n$ is said to be convergent and to have the value S. Note
carefully that we reasonably, plausibly, but still arbitrarily define the infinite
series as equal to S. The reader should also note that a necessary condition for
this convergence to a limit is that $\lim_{n\to\infty} u_n = 0$. This condition, however, is not
sufficient to guarantee convergence. Equation 5.2 is usually written in formal
mathematical notation:
The condition for the existence of a limit S is that for each $\varepsilon > 0$,
there is a fixed N such that
$|S - s_i| < \varepsilon$, for $i > N$.
This condition is often derived from the Cauchy criterion applied to the partial
sums $s_i$. The Cauchy criterion is:
A necessary and sufficient condition that a sequence $(s_i)$ converge
is that for each $\varepsilon > 0$ there is a fixed number N such that
$|s_j - s_i| < \varepsilon$ for all $i, j > N$.
This means that the individual partial sums must cluster together as
we move far out in the sequence.
The Cauchy criterion may easily be extended to sequences of functions. We see
it in this form in Section 5.5 in the definition of uniform convergence and in
Section 9.4 in the development of Hilbert space.
Our partial sums $s_i$ may not converge to a single limit but may oscillate,
as in the case
$$\sum_{n=1}^{\infty} u_n = 1 - 1 + 1 - 1 + 1 - \cdots - (-1)^n + \cdots. \tag{5.3}$$
Clearly, $s_i = 1$ for i odd but 0 for i even. There is no convergence to a limit,
and series such as this one are labeled oscillatory.
For the series
$$1 + 2 + 3 + \cdots + n + \cdots \tag{5.4}$$
we have
$$s_n = \frac{n(n+1)}{2}. \tag{5.5}$$
As $n \to \infty$,
$$\lim_{n\to\infty} s_n = \infty. \tag{5.6}$$
Whenever the sequence of partial sums diverges (approaches $\pm\infty$), the infinite
series is said to diverge. Often the term divergent is extended to include
oscillatory series as well.
Because we evaluate the partial sums by ordinary arithmetic, the convergent
series, defined in terms of a limit of the partial sums, assume a position of
supreme importance. Two examples may clarify the nature of convergence or
divergence of a series and will also serve as a basis for a further detailed
investigation in the next section.
EXAMPLE 5.1.1 The Geometric Series
The geometrical sequence, starting with a and with a ratio r ($r \ge 0$), is given
by
$$a,\ ar,\ ar^2,\ ar^3,\ \ldots,\ ar^{n-1},\ \ldots.$$
The nth partial sum is given by¹
$$s_n = a\,\frac{1 - r^n}{1 - r}. \tag{5.7}$$
1 Multiply and divide $s_n = \sum_{m=0}^{n-1} a r^m$ by $1 - r$.
Taking the limit as $n \to \infty$,
$$\lim_{n\to\infty} s_n = \frac{a}{1 - r}, \quad \text{for } r < 1. \tag{5.8}$$
Hence, by definition, the infinite geometric series converges for r < 1 and is
given by
$$\sum_{n=1}^{\infty} a r^{n-1} = \frac{a}{1 - r}. \tag{5.9}$$
On the other hand, if $r \ge 1$, the necessary condition $u_n \to 0$ is not satisfied and
the infinite series diverges.
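The closed form for the partial sum, and its limit for r < 1, can be compared with a direct term-by-term summation. The following is a minimal sketch; the function names and the sample values a = 3, r = 1/2 are arbitrary choices made for illustration.

```python
def geometric_partial_sum(a, r, n):
    """s_n = a + a r + ... + a r**(n - 1), summed term by term."""
    total, term = 0.0, a
    for _ in range(n):
        total += term
        term *= r
    return total


def closed_form(a, r, n):
    """a (1 - r**n) / (1 - r); valid for r != 1."""
    return a * (1 - r ** n) / (1 - r)


a, r = 3.0, 0.5
for n in (1, 5, 10, 50):
    print(n, geometric_partial_sum(a, r, n), closed_form(a, r, n))
print("limit a / (1 - r) =", a / (1 - r))

# For r > 1 the terms grow and the partial sums increase without bound.
print([geometric_partial_sum(1.0, 2.0, n) for n in (5, 10, 20)])
```

With r = 1/2 the partial sums settle rapidly toward a/(1 − r), whereas with r = 2 they grow roughly as 2^n.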
EXAMPLE 5.1.2 The Harmonic Series
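As a second and more involved example, we consider the harmonic series
$$\sum_{n=1}^{\infty} n^{-1} = 1 + \frac12 + \frac13 + \frac14 + \cdots + \frac1n + \cdots. \tag{5.10}$$
We have $\lim_{n\to\infty} u_n = \lim_{n\to\infty} 1/n = 0$, but this is not sufficient to guarantee
convergence. If we group the terms (no change in order) as
$$1 + \tfrac12 + \left(\tfrac13 + \tfrac14\right) + \left(\tfrac15 + \tfrac16 + \tfrac17 + \tfrac18\right) + \left(\tfrac19 + \cdots + \tfrac1{16}\right) + \cdots, \tag{5.11}$$
it will be seen that each pair of parentheses encloses p terms of the form
$$\frac{1}{p+1} + \frac{1}{p+2} + \cdots + \frac{1}{p+p} > \frac{p}{2p} = \frac12. \tag{5.12}$$
Forming partial sums by adding the parenthetical groups one by one, we obtain
$$s_1 = 1, \quad s_2 = \frac32, \quad s_3 > \frac42, \quad s_4 > \frac52, \quad s_5 > \frac62, \quad \ldots, \quad s_n \ge \frac{n+1}{2}. \tag{5.13}$$
The harmonic series considered in this way is certainly divergent.² An
alternate and independent demonstration of its divergence appears in Section
5.2.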
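The grouping argument can be followed with exact rational arithmetic. In the sketch below the kth grouped partial sum is taken to end at the term 1/2^(k−1), which is how the groups described above terminate; the function names `harmonic` and `grouped_sums` are illustrative assumptions.

```python
from fractions import Fraction


def harmonic(n):
    """Exact partial sum 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


def grouped_sums(groups):
    """Partial sums after 1, 2, ..., `groups` parenthetical groups.

    Group 1 is the term 1, group 2 is 1/2, and group k >= 3 runs from
    1/(p + 1) to 1/(2p) with p = 2**(k - 2), so k groups end at 1/2**(k - 1).
    """
    return [harmonic(2 ** (k - 1)) for k in range(1, groups + 1)]


for k, s in enumerate(grouped_sums(8), start=1):
    print(k, float(s), ">=", (k + 1) / 2)
```

Each additional group contributes at least 1/2, so the grouped partial sums stay at or above (k + 1)/2 and cannot remain bounded, although the growth is only logarithmic in the number of terms.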
2 The (finite) harmonic series appears in an interesting note on the maximum
stable displacement of a stack of coins, Johnson, P. R., "The Leaning Tower
of Lire." Am. J. Phys. 23, 240 (1955).
3 Actually Eq. 5.14 may be taken as an identity and verified by multiplying
both sides by 1 + x.
Using the binomial theorem³ (Section 5.6), we may expand the function
$(1 + x)^{-1}$:
$$\frac{1}{1 + x} = 1 - x + x^2 - x^3 + \cdots + (-x)^{n-1} + \cdots. \tag{5.14}$$
If we let $x \to 1$, this series becomes
$$1 - 1 + 1 - 1 + 1 - 1 + \cdots, \tag{5.15}$$
a series that we labeled oscillatory earlier in this section. Although it does not
converge in the usual sense, meaning can be attached to this series. Euler, for
example, assigned a value of $\tfrac12$ to this oscillatory sequence on the basis of the
correspondence between this series and the well-defined function $(1 + x)^{-1}$.
Unfortunately, such correspondence between series and function is not unique
and this approach must be refined. Other methods of assigning a meaning to a
divergent or oscillatory series, methods of defining a sum, have been developed.
In general, however, this aspect of infinite series is of relatively little interest to
the scientist or the engineer. An exception to this statement, the very important
asymptotic or semiconvergent series, is considered in Section 5.10.
EXERCISES
5.1.1 Show that
$$\sum_{n=1}^{\infty} \frac{1}{(2n - 1)(2n + 1)} = \frac12.$$
Hint. Show (by mathematical induction) that $s_m = m/(2m + 1)$.
5.1.2 Show that
$$\sum_{n=1}^{\infty} \frac{1}{n(n + 1)} = 1.$$
Find the partial sum $s_m$ and verify its correctness by mathematical induction.
Note. The method of expansion in partial fractions, Section 15.8, offers an
alternative way of solving Exercises 5.1.1 and 5.1.2.
5.2 CONVERGENCE TESTS
Although nonconvergent series may be useful in certain special cases
(compare Section 5.10), we usually insist, as a matter of convenience if not
necessity, that our series be convergent. It therefore becomes a matter of extreme
importance to be able to tell whether a given series is convergent. We shall
develop a number of possible tests, starting with the simple and relatively
insensitive tests and working up to the more complicated but quite sensitive tests.
For the present let us consider a series of positive terms, $a_n > 0$, postponing
negative terms until the next section.
Comparison Test
If term by term a series of terms $u_n \le a_n$, in which the $a_n$ form a convergent
series, the series $\sum_n u_n$ is also convergent. Symbolically, we have
$$\sum_n a_n = a_1 + a_2 + a_3 + \cdots, \quad \text{convergent},$$
$$\sum_n u_n = u_1 + u_2 + u_3 + \cdots.$$
If $u_n \le a_n$ for all n, then $\sum_n u_n \le \sum_n a_n$ and $\sum_n u_n$ therefore is convergent.
If term by term a series of terms $v_n \ge b_n$, in which the $b_n$ form a divergent
series, the series $\sum_n v_n$ is also divergent. Note that comparisons of $u_n$ with $b_n$ or
$v_n$ with $a_n$ yield no information. Here we have
$$\sum_n b_n = b_1 + b_2 + b_3 + \cdots, \quad \text{divergent},$$
$$\sum_n v_n = v_1 + v_2 + v_3 + \cdots.$$
If $v_n \ge b_n$ for all n, then $\sum_n v_n \ge \sum_n b_n$ and $\sum_n v_n$ therefore is divergent.
For the convergent series $a_n$ we already have the geometric series, whereas
the harmonic series will serve as the divergent series $b_n$. As other series are
identified as either convergent or divergent, they may be used for the known
series in this comparison test.
All tests developed in this section are essentially comparison tests. Figure 5.1
exhibits these tests and the interrelationships.
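[FIG. 5.1 Comparison tests. Diagram relating the tests of this section: the Cauchy root test (comparison with geometric series), the d'Alembert-Cauchy ratio test (also by comparison with geometric series), the Euler-Maclaurin integral test (comparison with integral), and Kummer's test with its choices of $a_n$, among them $a_n = n$ (Raabe) and $a_n = n \ln n$ (Gauss).]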
EXAMPLE 5.2.1 The p Series
Test $\sum_n n^{-p}$, $p = 0.999$, for convergence. Since $n^{-0.999} > n^{-1}$, and $b_n = n^{-1}$
forms the divergent harmonic series, the comparison test shows that $\sum_n n^{-0.999}$
is divergent. Generalizing, $\sum_n n^{-p}$ is seen to be divergent for all $p \le 1$.
Cauchy Root Test
If $(a_n)^{1/n} \le r < 1$ for all sufficiently large n, with r independent of n, then
$\sum_n a_n$ is convergent. If $(a_n)^{1/n} \ge 1$ for all sufficiently large n, then $\sum_n a_n$ is divergent.
The first part of this test is verified easily by raising $(a_n)^{1/n} \le r$ to the nth
power. We get
$$a_n \le r^n < 1.$$
Since $r^n$ is just the nth term in a convergent geometric series, $\sum_n a_n$ is convergent
by the comparison test. Conversely, if $(a_n)^{1/n} \ge 1$, then $a_n \ge 1$ and the series
must diverge. This root test is particularly useful in establishing the properties
of power series (Section 5.7).
D'Alembert or Cauchy Ratio Test
If $a_{n+1}/a_n \le r < 1$ for all sufficiently large n, and r is independent of n, then
$\sum_n a_n$ is convergent. If $a_{n+1}/a_n \ge 1$ for all sufficiently large n, then $\sum_n a_n$ is
divergent.
Convergence is proved by direct comparison with the geometric series
$(1 + r + r^2 + \cdots)$. In the second part $a_{n+1} \ge a_n$ and divergence should be
reasonably obvious. Although not quite so sensitive as the Cauchy root test,
this D'Alembert ratio test is one of the easiest to apply and is widely used. An
alternate statement of the ratio test is in the form of a limit:
If
$$\lim_{n\to\infty} \frac{a_{n+1}}{a_n} \begin{cases} < 1, & \text{convergence}, \\ > 1, & \text{divergence}, \\ = 1, & \text{indeterminate}. \end{cases} \tag{5.16}$$
Because of this final indeterminate possibility, the ratio test is likely to fail
at crucial points, and more delicate, more sensitive tests are necessary.
The alert reader may wonder how this indeterminacy arose. Actually it was
concealed in the first statement $a_{n+1}/a_n \le r < 1$. We might encounter $a_{n+1}/a_n < 1$
for all finite n but be unable to choose an r < 1 and independent of n such that
$a_{n+1}/a_n \le r$ for all sufficiently large n. An example is provided by the harmonic
series
$$\frac{a_{n+1}}{a_n} = \frac{n}{n + 1} < 1. \tag{5.17}$$
Since
$$\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = 1, \tag{5.18}$$
no fixed ratio r < 1 exists and the ratio test fails.
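EXAMPLE 5.2.2 D'Alembert Ratio Test
Test $\sum_n n/2^n$ for convergence.
$$\frac{a_{n+1}}{a_n} = \frac{(n + 1)/2^{n+1}}{n/2^n} = \frac12 \cdot \frac{n + 1}{n}. \tag{5.19}$$
Since
$$\frac{a_{n+1}}{a_n} \le \frac34 \quad \text{for } n \ge 2, \tag{5.20}$$
we have convergence. Alternatively,
$$\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = \frac12 \tag{5.21}$$
and again, convergence.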
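The ratios in this example can be tabulated exactly with rational arithmetic. The sketch below is illustrative; the function names `term` and `ratio` are chosen for this example, and the remark on the value of the sum relies on the standard closed form of the partial sum, 2 − (N + 2)/2^N, which is not derived in the text.

```python
from fractions import Fraction


def term(n):
    """a_n = n / 2**n as an exact fraction."""
    return Fraction(n, 2 ** n)


def ratio(n):
    """a_{n+1} / a_n, which equals (n + 1) / (2 n)."""
    return term(n + 1) / term(n)


for n in (1, 2, 3, 10, 100):
    print(n, ratio(n), float(ratio(n)))

partial = sum(term(n) for n in range(1, 61))
print(float(partial))
```

The ratio equals 1 for n = 1, is 3/4 at n = 2, and decreases steadily toward 1/2, so a fixed r = 3/4 serves for all n ≥ 2. The partial sums approach 2.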
Cauchy or Maclaurin Integral Test
This is another sort of comparison test in which we compare a series with an
integral. Geometrically, we compare the area of a series of unit-width rectangles
with the area under a curve.
[FIG. 5.2 (a) Comparison of integral and sum-blocks leading. (b) Comparison of
integral and sum-blocks lagging.]
Let f(x) be a continuous, monotonic decreasing function in which $f(n) = a_n$.
Then $\sum_n a_n$ converges if $\int_1^\infty f(x)\,dx$ is finite and diverges if the integral is infinite.
For the ith partial sum
$$s_i = \sum_{n=1}^{i} a_n = \sum_{n=1}^{i} f(n). \tag{5.22}$$
But
$$s_i > \int_1^{i+1} f(x)\,dx \tag{5.23}$$
by Fig. 5.2a, f(x) being monotonic decreasing. On the other hand, from Fig. 5.2b,
$$s_i - a_1 < \int_1^{i} f(x)\,dx, \tag{5.24}$$
in which the series is represented by the inscribed rectangles. Taking the limit
as $i \to \infty$, we have
$$\int_1^\infty f(x)\,dx \le \sum_{n=1}^{\infty} a_n \le \int_1^\infty f(x)\,dx + a_1. \tag{5.25}$$
Hence the infinite series converges or diverges as the corresponding integral
converges or diverges.
This integral test is particularly useful in setting upper and lower bounds on
the remainder of a series after some number of initial terms have been summed.
That is,
$$\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{N} a_n + \sum_{n=N+1}^{\infty} a_n,$$
where
$$\int_{N+1}^\infty f(x)\,dx \le \sum_{n=N+1}^{\infty} a_n \le \int_{N+1}^\infty f(x)\,dx + a_{N+1}.$$
EXAMPLE 5.2.3 Riemann Zeta Function
The Riemann zeta function is defined by
$$\zeta(p) = \sum_{n=1}^{\infty} n^{-p}. \tag{5.26}$$
We may take $f(x) = x^{-p}$ and then
$$\int_1^\infty x^{-p}\,dx = \begin{cases} \left.\dfrac{x^{-p+1}}{-p+1}\right|_1^\infty, & p \ne 1, \\[2ex] \left.\ln x\,\right|_1^\infty, & p = 1. \end{cases} \tag{5.27}$$
The integral and therefore the series are divergent for $p \le 1$, convergent for
p > 1. Hence Eq. 5.26 should carry the condition p > 1. This, incidentally,
is an independent proof that the harmonic series (p = 1) diverges and diverges
logarithmically. The sum of the first million terms $\sum_{n=1}^{1{,}000{,}000} n^{-1}$ is only
14.392 726....
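Both uses of the integral comparison, the logarithmic growth of the harmonic partial sums and the bracketing of a convergent series by a finite sum plus integral bounds on the remainder, can be tried numerically. This is a sketch under stated assumptions: the function names are illustrative, the choice p = 2 with N = 100 is arbitrary, and the comparison value π²/6 for ζ(2) is a standard result quoted here without derivation.

```python
import math


def harmonic(n):
    """Partial sum of the harmonic series, accumulated with math.fsum."""
    return math.fsum(1.0 / k for k in range(1, n + 1))


def zeta_bounds(p, big_n):
    """Bracket zeta(p), p > 1: first big_n terms plus integral bounds on the tail."""
    head = math.fsum(k ** -p for k in range(1, big_n + 1))
    # Integral of x**(-p) from N + 1 to infinity.
    tail_integral = (big_n + 1) ** (1 - p) / (p - 1)
    lower = head + tail_integral
    upper = head + tail_integral + (big_n + 1) ** -p
    return lower, upper


n = 1_000_000
s = harmonic(n)
print(s, math.log(n + 1), math.log(n) + 1)   # ln(N + 1) < s_N < ln N + 1
print(zeta_bounds(2, 100), math.pi ** 2 / 6)
```

A million terms of the harmonic series sum to a little over 14, squeezed between ln(N + 1) and ln N + 1, while for p = 2 only 100 terms already confine the sum to an interval of width about 10⁻⁴.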
This integral comparison may also be used to set an upper limit to the
Euler-Mascheroni constant¹ defined by
$$\gamma = \lim_{n\to\infty}\left(\sum_{m=1}^{n} m^{-1} - \ln n\right). \tag{5.28}$$
Returning to partial sums,
$$s_n = \sum_{m=1}^{n} m^{-1} - \ln n < \int_1^n \frac{dx}{x} - \ln n + 1. \tag{5.29}$$
Evaluating the integral on the right, $s_n < 1$ for all n and therefore $\gamma < 1$. Exercise
5.2.12 leads to more restrictive bounds. Actually the Euler-Mascheroni constant
is 0.577 215 66....
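The slow approach of these partial sums to the Euler-Mascheroni constant is easy to observe directly. The sketch below is illustrative only; the function name `gamma_estimate` is an assumption made for this example.

```python
import math


def gamma_estimate(n):
    """Sum of 1/m for m = 1..n, minus ln n; its limit is the Euler-Mascheroni constant."""
    return math.fsum(1.0 / m for m in range(1, n + 1)) - math.log(n)


for n in (1, 10, 100, 1000, 100000):
    print(n, gamma_estimate(n))
```

The values start at 1 for n = 1 (where the bound holds as an equality), decrease monotonically, and for large n exceed the limiting value by roughly 1/(2n).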
1 This is the notation of National Bureau of Standards, Handbook of
Mathematical Functions. Applied Mathematics Series-55 (AMS-55).
Kummer's Test
This is the first of three tests that are somewhat more difficult to apply than
the preceding tests. Their importance lies in their power and sensitivity.
Frequently, at least one of the three will work when the simpler, easier tests are
indecisive. It must be remembered, however, that these tests, like those
previously discussed, are ultimately based on comparisons. It can be shown that
there is no most slowly converging convergent series and no most slowly
diverging divergent series. This means that all convergence tests given here,
including Kummer's, may fail sometime.
We consider a series of positive terms $u_i$ and a sequence of finite positive
constants $a_i$. If
$$a_n \frac{u_n}{u_{n+1}} - a_{n+1} \ge C > 0 \tag{5.30}$$
for all $n \ge N$, some fixed number,² then $\sum_{i=1}^{\infty} u_i$ converges. If
$$a_n \frac{u_n}{u_{n+1}} - a_{n+1} \le 0 \tag{5.31}$$
and $\sum_{i=1}^{\infty} a_i^{-1}$ diverges, then $\sum_{i=1}^{\infty} u_i$ diverges.
2 With $u_m$ finite, the partial sum $s_N$ will always be finite for N finite. The
convergence or divergence of a series depends on the behavior of the last
infinity of terms, not on the first N terms.
The proof of this powerful test is remarkably simple. From Eq. 5.30, with
C some positive constant,
$$\begin{aligned} C u_{N+1} &\le a_N u_N - a_{N+1} u_{N+1}, \\ C u_{N+2} &\le a_{N+1} u_{N+1} - a_{N+2} u_{N+2}, \\ &\;\;\vdots \\ C u_n &\le a_{n-1} u_{n-1} - a_n u_n. \end{aligned} \tag{5.32}$$
Adding and dividing by C ($C \ne 0$), we obtain
$$\sum_{i=N+1}^{n} u_i \le \frac{a_N u_N}{C} - \frac{a_n u_n}{C}. \tag{5.33}$$
Hence for the partial sum, $s_n$,
$$s_n \le \sum_{i=1}^{N} u_i + \frac{a_N u_N}{C} - \frac{a_n u_n}{C} < \sum_{i=1}^{N} u_i + \frac{a_N u_N}{C}, \quad \text{a constant, independent of } n. \tag{5.34}$$
The partial sums therefore have an upper bound. With zero as an obvious
lower bound, the series $\sum u_i$ must converge.
Divergence is shown as follows. From Eq. 5.31
$$a_n u_n \ge a_{n-1} u_{n-1} \ge \cdots \ge a_N u_N, \quad n > N. \tag{5.35}$$
Thus
$$u_n \ge \frac{a_N u_N}{a_n} \tag{5.36}$$
and
$$\sum_{i=N+1}^{\infty} u_i \ge a_N u_N \sum_{i=N+1}^{\infty} a_i^{-1}. \tag{5.37}$$
If $\sum_{i=1}^{\infty} a_i^{-1}$ diverges, then by the comparison test $\sum_i u_i$ diverges.
Equations 5.30 and 5.31 are often given in a limit form:
$$\lim_{n\to\infty}\left(a_n \frac{u_n}{u_{n+1}} - a_{n+1}\right) = C. \tag{5.38}$$
Thus for C > 0 we have convergence, whereas for C < 0 (and $\sum_i a_i^{-1}$ divergent)
we have divergence. It is perhaps useful to show the equivalence of Eq. 5.38 and
Eqs. 5.30 and 5.31 and to show why indeterminacy creeps in when the limit
C = 0. From the definition of limit
$$\left| a_n \frac{u_n}{u_{n+1}} - a_{n+1} - C \right| < \varepsilon \tag{5.39}$$
for all $n \ge N$ and all $\varepsilon > 0$, no matter how small $\varepsilon$ may be. When the absolute
value signs are removed,
$$C - \varepsilon < a_n \frac{u_n}{u_{n+1}} - a_{n+1} < C + \varepsilon. \tag{5.40}$$
Now if C > 0, Eq. 5.30 follows from $\varepsilon$ sufficiently small. On the other hand,
if C < 0, Eq. 5.31 follows. However, if C = 0, the center term $a_n(u_n/u_{n+1}) - a_{n+1}$
may be either positive or negative and the proof fails. The primary use of
Kummer's test is to prove other tests such as Raabe's (compare also Exercise
5.2.3).
If the positive constants $a_n$ of Kummer's test are chosen $a_n = n$, we have
Raabe's test.
Raabe's Test
If un > 0 and if
n(~^-l)>P>l E.41)
for all n> N, where N is a positive integer independent of n, then ]Гг щ converges.
If
"(~-lW E-42)
then Y,i и{ diverges (]Г п 1 diverges).
CONVERGENCE TESTS 287
The limit form of Raabe's test is
lim n\
и
U
*- ~ 1 = P.
E.43)
We have convergence for P > 1, divergence for P < 1, and no test for P = 1
exactly as with the Kummer test. This indeterminacy is pointed up by Exercise
5.2.4, which presents a convergent series and a divergent series with both series
yielding P = 1 in Eq. 5.43.
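The limit form of Raabe's test is easy to probe numerically. The following is a minimal sketch, not part of the original text; the helper name `raabe_quantity` and the two sample series are illustrative assumptions. It evaluates $n(u_n/u_{n+1} - 1)$ for $u_n = n^{-2}$, where the quantity tends to $P = 2$, and for the harmonic series, where it equals 1 for every $n$.

```python
def raabe_quantity(u, n):
    """Return n * (u(n)/u(n+1) - 1), the expression whose limit is P in Eq. 5.43."""
    return n * (u(n) / u(n + 1) - 1.0)


def inverse_square(n):
    # Terms of the convergent series sum 1/n^2 (illustrative choice).
    return 1.0 / n**2


def harmonic(n):
    # Terms of the divergent harmonic series sum 1/n (illustrative choice).
    return 1.0 / n


for n in (10, 1000, 100000):
    print(n, raabe_quantity(inverse_square, n), raabe_quantity(harmonic, n))
```

For the harmonic series the quantity never exceeds 1, so the divergence half of the test (Eq. 5.42) applies directly; the limit form alone, with $P = 1$, would be silent.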
Raabe's test is more sensitive than the d'Alembert ratio test because $\sum_{n=1}^{\infty} n^{-1}$
diverges more slowly than $\sum_{n=1}^{\infty} 1$. We obtain a still more sensitive test (and one
that is relatively easy to apply) by choosing $a_n = n \ln n$. This is Gauss's test.
Gauss's Test
If $u_n > 0$ for all finite $n$ and
\[ \frac{u_n}{u_{n+1}} = 1 + \frac{h}{n} + \frac{B(n)}{n^2}, \tag{5.44a} \]
in which $B(n)$ is a bounded function of $n$ for $n \to \infty$, then $\sum_i u_i$ converges for
$h > 1$ and diverges for $h \leq 1$.
The ratio $u_n/u_{n+1}$ of Eq. 5.44a often comes as the ratio of two quadratic forms:
\[ \frac{u_n}{u_{n+1}} = \frac{n^2 + a_1 n + a_0}{n^2 + b_1 n + b_0}. \tag{5.44b} \]
It may be shown (Exercise 5.2.5) that we have convergence for $a_1 > b_1 + 1$ and
divergence for $a_1 \leq b_1 + 1$.
The Gauss test is an extremely sensitive test of series convergence. It will
work for all series the physicist is likely to encounter. For $h > 1$ or $h < 1$ the
proof follows directly from Raabe's test
\[ \lim_{n\to\infty} n\left[1 + \frac{h}{n} + \frac{B(n)}{n^2} - 1\right] = \lim_{n\to\infty}\left[h + \frac{B(n)}{n}\right] = h. \tag{5.45} \]
If $h = 1$, Raabe's test fails. However, if we return to Kummer's test and use
$a_n = n \ln n$, Eq. 5.38 leads to
\[ \lim_{n\to\infty}\left\{ n\ln n\left[1 + \frac{1}{n} + \frac{B(n)}{n^2}\right] - (n+1)\ln(n+1)\right\}
 = \lim_{n\to\infty}\left[ n\ln n\cdot\frac{n+1}{n} - (n+1)\ln(n+1)\right]
 = \lim_{n\to\infty}(n+1)\left[\ln n - \ln n - \ln\left(1 + \frac{1}{n}\right)\right]. \tag{5.46} \]
Borrowing a result from Section 5.6 (which is not dependent on Gauss's test),
we have
\[ \lim_{n\to\infty} -(n+1)\ln\left(1 + \frac{1}{n}\right) = \lim_{n\to\infty} -(n+1)\left(\frac{1}{n} - \frac{1}{2n^2} + \frac{1}{3n^3} - \cdots\right) = -1 < 0. \]
Hence we have divergence for $h = 1$. This is an example of a successful
application of Kummer's test in which Raabe's test had failed.
EXAMPLE 5.2.4 Legendre Series
The recurrence relation for the series solution of Legendre's equation
(Section 8.5) may be put in the form
\[ \frac{a_{2j+2}}{a_{2j}} = \frac{2j(2j+1) - l(l+1)}{(2j+1)(2j+2)}. \]
This is equivalent to $u_{2j+2}/u_{2j}$ for $x = +1$. For $j \gg l$,³
\[ \frac{u_{2j}}{u_{2j+2}} \to \frac{(2j+1)(2j+2)}{2j(2j+1)} = \frac{2j+2}{2j} = 1 + \frac{1}{j}. \]
By Eq. 5.44b the series is divergent. Later we shall demand that the Legendre
series be finite at $x = 1$. We shall eliminate the divergence by setting the
parameter $n = 2j_0$, an even integer. This will truncate the series, converting the
infinite series into a polynomial.
Improvement of Convergence
This section so far has been concerned with establishing convergence as an
abstract mathematical property. In practice, the rate of convergence may be
of considerable importance. Here we present one method of improving the rate
of convergence of a convergent series. Other techniques are given in Sections
5.4 and 5.9.
The basic principle of this method, due to Kummer, is to form a linear
combination of our slowly converging series and one or more series whose
sum is known. For the known series the collection
\[ \alpha_1 = \sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 1, \]
\[ \alpha_2 = \sum_{n=1}^{\infty} \frac{1}{n(n+1)(n+2)} = \frac{1}{4}, \]
\[ \alpha_3 = \sum_{n=1}^{\infty} \frac{1}{n(n+1)(n+2)(n+3)} = \frac{1}{18}, \]
\[ \alpha_p = \sum_{n=1}^{\infty} \frac{1}{n(n+1)\cdots(n+p)} = \frac{1}{p \cdot p!} \]
is particularly useful.⁴ The series are combined term by term and the coefficients
in the linear combination chosen to cancel the most slowly converging terms.
³The n dependence enters B(n) but does not affect h.
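EXAMPLE 5.2.5 Riemann Zeta Function, $\zeta(3)$
Let the series to be summed be $\sum_{n=1}^{\infty} n^{-3}$. In Section 5.9 this is identified as a
Riemann zeta function, $\zeta(3)$. We form a linear combination
\[ \sum_{n=1}^{\infty} n^{-3} + a_2\alpha_2 = \sum_{n=1}^{\infty} n^{-3} + \frac{a_2}{4}. \]
$\alpha_1$ is not included since it converges more slowly than $\zeta(3)$. Combining terms,
we obtain on the left-hand side
\[ \sum_{n=1}^{\infty}\left\{\frac{1}{n^3} + \frac{a_2}{n(n+1)(n+2)}\right\} = \sum_{n=1}^{\infty}\frac{n^2(1 + a_2) + 3n + 2}{n^3(n+1)(n+2)}. \]
If we choose $a_2 = -1$, the preceding equations yield
\[ \zeta(3) = \sum_{n=1}^{\infty} n^{-3} = \frac{1}{4} + \sum_{n=1}^{\infty}\frac{3n + 2}{n^3(n+1)(n+2)}. \]
The resulting series may not be beautiful but it does converge as $n^{-4}$, appreciably
faster than $n^{-3}$. A more convenient form comes from Exercise 5.2.21. There,
the symmetry leads to convergence as $n^{-5}$.
The method can be extended including $a_3\alpha_3$ to get convergence as $n^{-5}$,
$a_4\alpha_4$ to get convergence as $n^{-6}$, and so on. Eventually, you have to reach a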
compromise between how much algebra you do and how much arithmetic the
computing machine does. As computing machines get larger and faster, the
balance is steadily shifting to less algebra for you and more arithmetic for the
machine.
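The gain from Kummer's method in Example 5.2.5 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are assumptions, and the reference value is the one quoted in Exercise 5.2.10. It sums 100 terms of $\sum n^{-3}$ directly and 100 terms of the accelerated form $\tfrac14 + \sum (3n+2)/[n^3(n+1)(n+2)]$.

```python
ZETA3 = 1.202056903  # reference value quoted in Exercise 5.2.10


def zeta3_direct(terms):
    """Partial sum of sum n^-3, which converges as n^-3."""
    return sum(1.0 / n**3 for n in range(1, terms + 1))


def zeta3_accelerated(terms):
    """Partial sum of 1/4 + sum (3n+2)/(n^3 (n+1)(n+2)), which converges as n^-4."""
    tail = sum((3.0 * n + 2.0) / (n**3 * (n + 1) * (n + 2))
               for n in range(1, terms + 1))
    return 0.25 + tail


for terms in (10, 100, 1000):
    print(terms,
          abs(zeta3_direct(terms) - ZETA3),
          abs(zeta3_accelerated(terms) - ZETA3))
```

With the same number of terms the accelerated sum should land markedly closer to the reference value, which is the trade of a little algebra for less machine arithmetic described above.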
EXERCISES
5.2.1 (a) Prove that if
\[ \lim_{n\to\infty} n^p u_n \to A < \infty, \qquad p > 1, \]
the series $\sum_{n=1}^{\infty} u_n$ converges.
(b) Prove that if
\[ \lim_{n\to\infty} n u_n = A > 0, \]
the series diverges. (The test fails for $A = 0$.)
These two tests, known as limit tests, are often convenient for establishing the
convergence or divergence of a series. They may be treated as comparison tests,
comparing with
\[ \sum_n n^{-q}, \qquad 1 \leq q < p. \]
⁴These series sums may be verified by expanding the forms by partial fractions,
writing out the initial terms and inspecting the pattern of cancellation of
positive and negative terms.
5.2.2 If
\[ \lim_{n\to\infty}\frac{b_n}{a_n} = K, \]
a constant with $0 < K < \infty$, show that $\sum b_n$ converges or diverges with $\sum a_n$.
Hint. If $\sum a_n$ converges, use
\[ b_n' = \frac{1}{2K}\, b_n. \]
If $\sum a_n$ diverges, use
\[ b_n'' = \frac{2}{K}\, b_n. \]
5.2.3 Show that the complete d'Alembert ratio test follows directly from Kummer's
test with $a_i = 1$.
5.2.4 Show that Raabe's test is indecisive for $P = 1$ by establishing that $P = 1$ for the
series
(a) $u_n = \dfrac{1}{n \ln n}$ and that this series diverges.
(b) $u_n = \dfrac{1}{n(\ln n)^2}$ and that this series converges.
Note. By direct addition $\sum_{n=2}^{100{,}000}[n(\ln n)^2]^{-1} = 2.02288$. The remainder of the
series $n > 10^5$ yields 0.08686 by the integral comparison test. The total, then, 2
to $\infty$, is 2.1097.
5.2.5 Gauss's test is often given in the form of a test of the ratio
\[ \frac{u_n}{u_{n+1}} = \frac{n^2 + a_1 n + a_0}{n^2 + b_1 n + b_0}. \]
For what values of the parameters $a_1$ and $b_1$ is there convergence? Divergence?
ANS. Convergent for $a_1 - b_1 > 1$,
divergent for $a_1 - b_1 \leq 1$.
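5.2.6 Test for convergence
(a) $\sum_{n=2}^{\infty} (\ln n)^{-1}$
(b) $\sum_{n=1}^{\infty} \dfrac{n!}{10^n}$
(c) $\sum_{n=1}^{\infty} \dfrac{1}{2n(2n-1)}$
(d) $\sum_{n=1}^{\infty} [n(n+1)]^{-1/2}$
(e) …
5.2.7 Test for convergence
(a) $\sum_{n=1}^{\infty} \dfrac{1}{n(n+1)}$
(b) $\sum_{n=2}^{\infty} \dfrac{1}{n \ln n}$
(c) $\sum_{n=1}^{\infty} \dfrac{1}{n 2^n}$
(d) …
(e) …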
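5.2.8 For what values of $p$ and $q$ will the following series converge?
\[ \sum_{n=2}^{\infty} \frac{1}{n^p (\ln n)^q} \]
ANS. Convergent for $p > 1$, all $q$, and for $p = 1$, $q > 1$;
divergent for $p < 1$, all $q$, and for $p = 1$, $q \leq 1$.
5.2.9 Determine the range of convergence for Gauss's hypergeometric series
\[ F(\alpha, \beta; \gamma; x) = 1 + \frac{\alpha\beta}{1!\,\gamma}x + \frac{\alpha(\alpha+1)\beta(\beta+1)}{2!\,\gamma(\gamma+1)}x^2 + \cdots. \]
Hint. Gauss developed Gauss's test for the specific purpose of establishing the
convergence of this series.
ANS. Convergent for $-1 < x < 1$ and $x = \pm 1$ if $\gamma > \alpha + \beta$.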
5.2.10 A simple machine calculation yields
\[ \sum_{n=1}^{100} n^{-3} = 1.202007. \]
Show that
\[ 1.202056 \leq \sum_{n=1}^{\infty} n^{-3} \leq 1.202057. \]
Hint. Use integrals to set upper and lower bounds on $\sum_{n=101}^{\infty} n^{-3}$.
Comment. A more exact value for summation $\sum_{1}^{\infty} n^{-3}$ is 1.202056903....
5.2.11 Set upper and lower bounds on $\sum_{n=1}^{1{,}000{,}000} n^{-1}$, assuming that
(a) the Euler-Mascheroni constant is known.
ANS. $14.392\,726 < \sum_{n=1}^{1{,}000{,}000} n^{-1} < 14.392\,727$.
(b) the Euler-Mascheroni constant is unknown.
5.2.12 Given $\sum_{n=1}^{1000} n^{-1} = 7.485470\ldots$, set upper and lower bounds on the Euler-
Mascheroni constant. ANS. $0.5767 < \gamma < 0.5778$.
5.2.13 (From Olbers's paradox.) Assume a static universe in which the stars are uniformly
distributed. Divide all space into shells of constant thickness; the stars in any
one shell by themselves subtend a solid angle of $\omega_0$. Allowing for the blocking out
of distant stars by nearer stars, show that the total net solid angle subtended by
all stars, shells extending to infinity, is exactly $4\pi$. (Therefore the night sky
should be ablaze with light.)
5.2.14 Test for convergence
\[ \sum_{n=1}^{\infty}\left[\frac{1\cdot 3\cdot 5\cdots(2n-1)}{2\cdot 4\cdot 6\cdots(2n)}\right]^2 = \frac{1}{4} + \frac{9}{64} + \frac{25}{256} + \cdots. \]
5.2.15 The Legendre series, $\sum_{j\ \mathrm{even}} u_j(x)$, satisfies the recurrence relations
\[ u_{j+2}(x) = \frac{(j+1)(j+2) - l(l+1)}{(j+2)(j+3)}\, x^2 u_j(x), \]
in which the index $j$ is even and $l$ is some constant (but, in this problem, not a
nonnegative odd integer). Find the range of values of $x$ for which this Legendre
series is convergent. Test the end points carefully. ANS. $-1 < x < 1$.
5.2.16 A series solution (Section 8.5) of the Chebyshev equation leads to successive
terms having the ratio
\[ \frac{u_{j+2}(x)}{u_j(x)} = \frac{(k+j)^2 - n^2}{(k+j+1)(k+j+2)}\, x^2 \]
with $k = 0$ and $k = 1$. Test for convergence at $x = \pm 1$. ANS. Convergent.
5.2.17 A series solution for the ultraspherical (Gegenbauer) function $C_n^{\alpha}(x)$ leads to the
recurrence
\[ a_{j+2} = a_j\,\frac{(k+j)(k+j+2\alpha) - n(n+2\alpha)}{(k+j+1)(k+j+2)}. \]
Investigate the convergence of each of these series at $x = \pm 1$ as a function of the
parameter $\alpha$. ANS. Convergent for $\alpha < \tfrac12$;
divergent for $\alpha \geq \tfrac12$.
5.2.18 A series expansion of the incomplete beta function (Section 10.4) yields
\[ B_x(p, q) = x^p\left\{\frac{1}{p} + \frac{1-q}{p+1}x + \cdots + \frac{(1-q)(2-q)\cdots(n-q)}{n!\,(p+n)}x^n + \cdots\right\}. \]
Given that $0 \leq x \leq 1$, $p > 0$, and $q > 0$, test this series for convergence. What
happens at $x = 1$?
5.2.19 Show that the following series is convergent.
\[ \sum_{s=0}^{\infty}\frac{(2s-1)!!}{(2s)!!\,(2s+1)} \]
Note. $(2s-1)!! = (2s-1)(2s-3)\cdots 3\cdot 1$ with $(-1)!! = 1$. $(2s)!! = (2s)(2s-2)\cdots 4\cdot 2$
with $0!! = 1$. The series appears as a series expansion of $\sin^{-1}(1)$ and
equals $\pi/2$.
5.2.20 Show how to combine $\zeta(2) = \sum_{n=1}^{\infty} n^{-2}$ with $\alpha_1$ and $\alpha_2$ to obtain a series converging
as $n^{-4}$.
Note. $\zeta(2)$ is actually available in closed form: $\zeta(2) = \pi^2/6$ (see Section 5.9).
5.2.21 The convergence improvement of Example 5.2.5 may be carried out more
expediently (in this special case) by putting $\alpha_2$ into a more symmetric form:
Replacing $n$ by $n - 1$, we have
\[ \alpha_2' = \sum_{n=2}^{\infty}\frac{1}{(n-1)n(n+1)} = \frac{1}{4}. \]
(a) Combine $\zeta(3)$ and $\alpha_2'$ to obtain convergence as $n^{-5}$.
(b) Let $\alpha_4'$ be $\alpha_4$ with $n \to n - 2$. Combine $\zeta(3)$, $\alpha_2'$, and $\alpha_4'$ to obtain convergence
as $n^{-7}$.
(c) If $\zeta(3)$ is to be calculated to 6 decimal accuracy (error $5 \times 10^{-7}$), how many
terms are required for $\zeta(3)$ alone? combined as in part (a)? combined as in
part (b)?
Note. The error may be estimated using the corresponding integral.
ANS. (a) $\zeta(3) = \dfrac{5}{4} - \sum_{n=2}^{\infty}\dfrac{1}{n^3(n^2 - 1)}$.
5.2.22 Catalan's constant ($\beta(2)$ of AMS-55, Chapter 23) is defined by
\[ \beta(2) = \sum_{k=0}^{\infty}(-1)^k(2k+1)^{-2} = \frac{1}{1^2} - \frac{1}{3^2} + \frac{1}{5^2} - \cdots. \]
Calculate $\beta(2)$ to six-digit accuracy.
Hint. The rate of convergence is enhanced by pairing the terms:
\[ \frac{1}{(4k-1)^2} - \frac{1}{(4k+1)^2} = \frac{16k}{(16k^2 - 1)^2}. \]
If you have carried enough digits in your series summation, $\sum_{1 \leq k \leq N} 16k/(16k^2 - 1)^2$,
additional significant figures may be obtained by setting upper and lower bounds
on the tail of the series, $\sum_{k=N+1}^{\infty}$. These bounds may be set by comparison with
integrals as in the Maclaurin integral test.
ANS. $\beta(2) = 0.9159\,6559\,4177\ldots$
5.3 ALTERNATING SERIES
In Section 5.2 we limited ourselves to series of positive terms. Now, in
contrast, we consider infinite series in which the signs alternate. The partial
cancellation due to alternating signs makes convergence more rapid and much
easier to identify. We shall prove the Leibnitz criterion, a general condition
for the convergence of an alternating series.
Leibnitz Criterion
Consider the series $\sum_{n=1}^{\infty}(-1)^{n+1}a_n$ with $a_n > 0$. If $a_n$ is monotonic decreasing
(for sufficiently large $n$) and $\lim_{n\to\infty} a_n = 0$, then the series converges.
To prove this, we examine the even partial sums
\[ s_{2n} = a_1 - a_2 + a_3 - \cdots - a_{2n}, \]
\[ s_{2n+2} = s_{2n} + (a_{2n+1} - a_{2n+2}). \]
Since $a_{2n+1} > a_{2n+2}$, we have
\[ s_{2n+2} > s_{2n}. \tag{5.52} \]
On the other hand,
\[ s_{2n+2} = a_1 - (a_2 - a_3) - (a_4 - a_5) - \cdots - a_{2n+2}. \tag{5.53} \]
Hence, with each pair of terms $a_{2p} - a_{2p+1} > 0$,
\[ s_{2n+2} < a_1. \tag{5.54} \]
With the even partial sums bounded, $s_{2n} < s_{2n+2} < a_1$, and the terms $a_n$ decreasing
monotonically and approaching zero, this alternating series converges.
One further important result can be extracted from the partial sums. From
the difference between the series limit $S$ and the partial sum $s_n$
\[ S - s_n = a_{n+1} - a_{n+2} + a_{n+3} - a_{n+4} + \cdots
 = a_{n+1} - (a_{n+2} - a_{n+3}) - (a_{n+4} - a_{n+5}) - \cdots \]
or
\[ S - s_n < a_{n+1}. \tag{5.56} \]
Equation 5.56 says that the error in cutting off an alternating series after $n$
terms is less than $a_{n+1}$, the first term dropped. A knowledge of the error obtained
this way may be of great practical importance.
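As a small numerical illustration of Eq. 5.56 (a sketch added here, with an assumed helper name, in the spirit of Exercise 5.3.2), we can sum the first 10 terms of the alternating harmonic series and compare the magnitude of the error with the eleventh term, $1/11$.

```python
import math


def alternating_harmonic_partial(n):
    """Sum of the first n terms of 1 - 1/2 + 1/3 - 1/4 + ..."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))


s10 = alternating_harmonic_partial(10)
error = abs(math.log(2.0) - s10)   # distance from the series limit, ln 2
first_dropped = 1.0 / 11.0         # a_{n+1} for n = 10

print(s10, error, first_dropped)
```

The absolute value is taken so that the comparison does not depend on whether the last retained term was added or subtracted.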
Absolute Convergence
Given a series of terms $u_n$ in which $u_n$ may vary in sign, if $\sum |u_n|$ converges,
then $\sum u_n$ is said to be absolutely convergent. If $\sum u_n$ converges but $\sum |u_n|$
diverges, the convergence is called conditional.
The alternating harmonic series is a simple example of this conditional
convergence. We have
\[ \sum_{n=1}^{\infty}(-1)^{n-1}n^{-1} = 1 - \frac12 + \frac13 - \frac14 + \cdots + \frac1n - \cdots, \tag{5.57} \]
convergent by the Leibnitz criterion; but
\[ \sum_{n=1}^{\infty} n^{-1} = 1 + \frac12 + \frac13 + \frac14 + \cdots + \frac1n + \cdots \tag{5.58} \]
has been shown to be divergent in Sections 5.1 and 5.2.
The reader will note that all the tests developed in Section 5.2 assume a
series of positive terms. Therefore all the tests in that section guarantee absolute
convergence.
EXERCISES
5.3.1 (a) From the electrostatic two hemisphere problem (Exercise 12.3.20) we obtain
the series
\[ \sum_{s=0}^{\infty}(-1)^s(4s+3)\frac{(2s-1)!!}{(2s+2)!!}. \]
Test for convergence.
(b) The corresponding series for the surface charge density is
\[ \sum_{s=0}^{\infty}(-1)^s(4s+3)\frac{(2s-1)!!}{(2s)!!}. \]
Test for convergence. The !! notation is explained in Section 10.1.
5.3.2 Show by direct numerical computation that the sum of the first 10 terms of
\[ \lim_{x\to 1}\ln(1+x) = \ln 2 = \sum_{n=1}^{\infty}(-1)^{n-1}n^{-1} \]
differs from $\ln 2$ by less than the eleventh term: $\ln 2 = 0.69314\,71806\ldots$.
5.3.3 In Exercise 5.2.9 the hypergeometric series is shown convergent for $x = \pm 1$, if
$\gamma > \alpha + \beta$. Show that there is conditional convergence for $x = -1$ for $\gamma$ down to
$\gamma > \alpha + \beta - 1$.
Hint. The asymptotic behavior of the factorial function is given by Stirling's series,
Section 10.3.
5.4 ALGEBRA OF SERIES
The establishment of absolute convergence is important because it can be
proved that absolutely convergent series may be handled according to the
ordinary familiar rules of algebra or arithmetic.
1. If an infinite series is absolutely convergent, the series
sum is independent of the order in which the terms
are added.
2. The series may be multiplied with another absolutely
convergent series. The limit of the product will be the
product of the individual series limits. The product
series, a double series, will also converge absolutely.
No such guarantees can be given for conditionally convergent series. Again
consider the alternating harmonic series. If we write
\[ 1 - \tfrac12 + \tfrac13 - \tfrac14 + \cdots = 1 - \left(\tfrac12 - \tfrac13\right) - \left(\tfrac14 - \tfrac15\right) - \cdots, \tag{5.59} \]
it is clear that the sum
\[ \sum_{n=1}^{\infty}(-1)^{n-1}n^{-1} < 1. \tag{5.60} \]
However, if we rearrange the terms slightly, we may make the alternating
harmonic series converge to $\tfrac32$. We regroup the terms of Eq. 5.59, taking
\[ \left(1 + \tfrac13 + \tfrac15\right) - \left(\tfrac12\right) + \left(\tfrac17 + \tfrac19 + \tfrac1{11} + \tfrac1{13} + \tfrac1{15}\right) - \left(\tfrac14\right)
 + \left(\tfrac1{17} + \cdots + \tfrac1{25}\right) - \left(\tfrac16\right) + \left(\tfrac1{27} + \cdots + \tfrac1{35}\right) - \left(\tfrac18\right) + \cdots. \tag{5.61} \]
Treating the terms grouped in parentheses as single terms for convenience,
we obtain the partial sums
s_1 = 1.5333    s_2 = 1.0333
s_3 = 1.5218    s_4 = 1.2718
s_5 = 1.5143    s_6 = 1.3476
s_7 = 1.5103    s_8 = 1.3853
s_9 = 1.5078    s_10 = 1.4078
From this tabulation of $s_n$ and the plot of $s_n$ versus $n$ in Fig. 5.3 the convergence
to $\tfrac32$ is fairly clear. We have rearranged the terms, taking positive terms until the
partial sum was equal to or greater than $\tfrac32$, then adding in negative terms until
the partial sum just fell below $\tfrac32$, and so on. As the series extends to infinity,
all original terms will eventually appear, but the partial sums of this rearranged
alternating harmonic series converge to $\tfrac32$.
By a suitable rearrangement of terms a conditionally convergent series may
be made to converge to any desired value or even to diverge. This statement is
sometimes given as Riemann's theorem. Obviously, conditionally convergent
series must be treated with caution.
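The rearrangement procedure described for Eq. 5.61 is mechanical enough to hand to the machine. The following is a minimal sketch (the function name and argument names are assumptions, not from the text): positive terms $1, \tfrac13, \tfrac15, \ldots$ are added until the running total is equal to or greater than the target, then negative terms $-\tfrac12, -\tfrac14, \ldots$ until it falls below, and the total is recorded after each group.

```python
def rearranged_partial_sums(target=1.5, count=10):
    """Partial sums of the alternating harmonic series, rearranged toward `target`.

    One value is recorded after each group of positive terms and after each
    group of negative terms, as in the tabulation of s_1 ... s_10.
    """
    sums = []
    total = 0.0
    odd = 1    # denominator of the next unused positive term
    even = 2   # denominator of the next unused negative term
    while len(sums) < count:
        while total < target:          # climb to or above the target
            total += 1.0 / odd
            odd += 2
        sums.append(total)
        if len(sums) == count:
            break
        while total >= target:         # drop just below the target
            total -= 1.0 / even
            even += 2
        sums.append(total)
    return sums


for index, value in enumerate(rearranged_partial_sums(), start=1):
    print(index, round(value, 4))
```

Changing `target` to another number illustrates the statement that a conditionally convergent series may be steered toward any desired value.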
FIG. 5.3 Alternating harmonic series: terms rearranged to give convergence to 1.5 (partial sum plotted against the number of terms in sum, n)
Improvement of Convergence, Rational Approximations
The series
\[ \ln(1+x) = \sum_{n=1}^{\infty}(-1)^{n-1}\frac{x^n}{n}, \qquad -1 < x \leq 1, \tag{5.61a} \]
converges very slowly as $x$ approaches $+1$. The rate of convergence may be
improved substantially by multiplying both sides of Eq. 5.61a by a polynomial
and adjusting the polynomial coefficients to cancel the more slowly converging
portions of the series. Consider the simplest possibility: Multiply $\ln(1+x)$ by
$1 + a_1 x$.
\[ (1 + a_1 x)\ln(1+x) = \sum_{n=1}^{\infty}(-1)^{n-1}\frac{x^n}{n} + a_1\sum_{n=1}^{\infty}(-1)^{n-1}\frac{x^{n+1}}{n}. \]
Combining the two series on the right term by term, we obtain
\[ (1 + a_1 x)\ln(1+x) = x + \sum_{n=2}^{\infty}(-1)^{n-1}\left(\frac{1}{n} - \frac{a_1}{n-1}\right)x^n
 = x + \sum_{n=2}^{\infty}(-1)^{n-1}\frac{n(1 - a_1) - 1}{n(n-1)}x^n. \]
Clearly, if we take $a_1 = 1$, the $n$ in the numerator disappears and our combined
series converges as $n^{-2}$.
Continuing this process, we find that $(1 + 2x + x^2)\ln(1+x)$ vanishes as $n^{-3}$,
$(1 + 3x + 3x^2 + x^3)\ln(1+x)$ vanishes as $n^{-4}$. In effect we are shifting from a
simple series expansion of Eq. 5.61a to a rational fraction representation in which
the function $\ln(1+x)$ is represented by the ratio of a series and a polynomial:
\[ \ln(1+x) = \frac{x + \sum_{n=2}^{\infty}(-1)^n x^n/[n(n-1)]}{1 + x}. \]
Such rational approximations may be both compact and accurate. The SSP
computer subroutines make extensive use of such approximations.
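To see how much the factor $1 + x$ buys, one may compare the two representations at the slowest point, $x = 1$. The sketch below is illustrative; the function names are assumptions. It truncates both the direct series of Eq. 5.61a and the ratio form with $a_1 = 1$ at the same number of terms.

```python
import math


def log1p_direct(x, terms):
    """Partial sum of sum_{n>=1} (-1)^(n-1) x^n / n (Eq. 5.61a)."""
    return sum((-1) ** (n - 1) * x**n / n for n in range(1, terms + 1))


def log1p_rational(x, terms):
    """Ratio form: [x + sum_{n>=2} (-1)^n x^n / (n(n-1))] / (1 + x)."""
    series = x + sum((-1) ** n * x**n / (n * (n - 1))
                     for n in range(2, terms + 1))
    return series / (1.0 + x)


exact = math.log(2.0)
for terms in (10, 100, 1000):
    print(terms,
          abs(log1p_direct(1.0, terms) - exact),
          abs(log1p_rational(1.0, terms) - exact))
```

At $x = 1$ the direct error falls off roughly as $1/n$ while the ratio form falls off roughly as $1/n^2$, in line with the convergence rates quoted above.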
Rearrangement of Double Series
Another aspect of the rearrangement of series appears in the treatment of
double series (Fig. 5.4):
\[ \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} a_{n,m}. \]
Let us substitute
\[ n = q \geq 0, \qquad m = p - q \geq 0, \qquad (q \leq p). \]
This results in the identity
\[ \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} a_{n,m} = \sum_{p=0}^{\infty}\sum_{q=0}^{p} a_{q,p-q}. \tag{5.62} \]
The summation over $p$ and $q$ of Eq. 5.62 is illustrated in Fig. 5.5. The substitution
\[ n = s \geq 0, \qquad m = r - 2s \geq 0, \]
leads to
\[ \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} a_{n,m} = \sum_{r=0}^{\infty}\sum_{s=0}^{[r/2]} a_{s,r-2s}, \tag{5.63} \]
with $[r/2] = r/2$ for $r$ even, $(r-1)/2$ for $r$ odd. The summation over $r$ and $s$ of
Eq. 5.63 is shown in Fig. 5.6. Equations 5.62 and 5.63 are clearly rearrangements
of the array of coefficients $a_{nm}$, rearrangements that are valid as long as we have
absolute convergence.
FIG. 5.4 Double series: summation over $n$ indicated by vertical dashed lines.
FIG. 5.5 Double series: again, the first summation is represented by vertical dashed lines but these vertical lines correspond to diagonals in Fig. 5.4.
FIG. 5.6 Double series. The summation over $s$ corresponds to a summation along the almost horizontal slanted lines in Fig. 5.4.
The combination of Eqs. 5.62 and 5.63,
\[ \sum_{p=0}^{\infty}\sum_{q=0}^{p} a_{q,p-q} = \sum_{r=0}^{\infty}\sum_{s=0}^{[r/2]} a_{s,r-2s}, \tag{5.64} \]
is used in Section 12.1 in the determination of the series form of the Legendre
polynomials.
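The index bookkeeping behind Eqs. 5.62 and 5.63 can be checked by listing which pairs $(n, m)$ each rearranged sum visits. This is a minimal sketch with assumed function names; it uses finite cutoffs, so it tests only that every coefficient in the covered region appears exactly once, not convergence.

```python
def diagonal_pairs(p_max):
    """Index pairs (n, m) = (q, p - q) visited by Eq. 5.62 for p = 0..p_max."""
    return [(q, p - q) for p in range(p_max + 1) for q in range(p + 1)]


def slanted_pairs(r_max):
    """Index pairs (n, m) = (s, r - 2s) visited by Eq. 5.63 for r = 0..r_max."""
    return [(s, r - 2 * s) for r in range(r_max + 1) for s in range(r // 2 + 1)]


pairs = diagonal_pairs(3)
print(pairs)                 # diagonals n + m = 0, 1, 2, 3
print(slanted_pairs(4))      # slanted lines 2n + m = 0, 1, 2, 3, 4
```

Here `r // 2` is Python's integer division, which plays the role of $[r/2]$.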
EXERCISES
5.4.1 Given the series (derived in Section 5.6)
\[ \ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots, \qquad -1 < x \leq 1, \]
show that
(a) $\ln(1-x) = -x - \dfrac{x^2}{2} - \dfrac{x^3}{3} - \dfrac{x^4}{4} - \cdots, \qquad -1 \leq x < 1.$
(b) $\ln\left(\dfrac{1+x}{1-x}\right) = 2\left(x + \dfrac{x^3}{3} + \dfrac{x^5}{5} + \cdots\right), \qquad -1 < x < 1.$
The original series, $\ln(1+x)$, appears in an analysis of binding energy in crystals.
It is $\tfrac12$ the Madelung constant ($2\ln 2$) for a chain of atoms. The second series (b) is
useful in normalizing the Legendre polynomials (Section 12.3) and in developing a
second solution for Legendre's differential equation (Section 12.10).
5.4.2 Determine the values of the coefficients $a_1$, $a_2$, and $a_3$ that will make $(1 + a_1 x +
a_2 x^2 + a_3 x^3)\ln(1+x)$ converge as $n^{-4}$. Find the resulting series.
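5.4.3 Show that
(a) $\sum_{n=2}^{\infty}[\zeta(n) - 1] = 1$,
(b) $\sum_{n=2}^{\infty}(-1)^n[\zeta(n) - 1] = \tfrac12$,
where $\zeta(n)$ is the Riemann zeta function.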
5.4.4 Write a program that will rearrange the terms of the alternating harmonic series
to make the series converge to 1.5. Group your terms as indicated in Eq. 5.61. List
the first 100 successive partial sums that just climb above 1.5 or just drop below
1.5, and list the new terms included in each such partial sum.
ANS. n sn
1 1.5333
2 1.0333
3 1.5218
4 1.2718
5 1.5143
5.5 SERIES OF FUNCTIONS
We extend our concept of infinite series to include the possibility that each
term $u_n$ may be a function of some variable, $u_n = u_n(x)$. Numerous illustrations
of such series of functions appear in Chapters 11 to 14. The partial sums become
functions of the variable $x$
\[ s_n(x) = u_1(x) + u_2(x) + \cdots + u_n(x), \tag{5.65} \]
as does the series sum, defined as the limit of the partial sums
\[ \sum_{n=1}^{\infty} u_n(x) = S(x) = \lim_{n\to\infty} s_n(x). \tag{5.66} \]
So far we have concerned ourselves with the behavior of the partial sums as
a function of $n$. Now we consider how the foregoing quantities depend on $x$.
The key concept here is that of uniform convergence.
Uniform Convergence
If for any small $\varepsilon > 0$ there exists a number $N$, independent of $x$ in the interval
$[a, b]$ ($a \leq x \leq b$) such that
\[ |S(x) - s_n(x)| < \varepsilon, \qquad \text{for all } n \geq N, \tag{5.67} \]
the series is said to be uniformly convergent in the interval $[a, b]$. This says
that for our series to be uniformly convergent, it must be possible to find a
finite $N$ so that the tail of the infinite series, $\left|\sum_{i=N+1}^{\infty} u_i(x)\right|$, will be less than an
arbitrarily small $\varepsilon$ for all $x$ in the given interval.
This condition, Eq. 5.67, which defines uniform convergence, is illustrated
in Fig. 5.7. The point is that no matter how small $\varepsilon$ is taken to be we can always
choose $n$ large enough so that the absolute magnitude of the difference between
$S(x)$ and $s_n(x)$ is less than $\varepsilon$ for all $x$, $a \leq x \leq b$. If this cannot be done, then
$\sum u_n(x)$ is not uniformly convergent in $[a, b]$.
FIG. 5.7 Uniform convergence
EXAMPLE 5.5.1
\[ \sum_{n=1}^{\infty} u_n(x) = \sum_{n=1}^{\infty}\frac{x}{[(n-1)x + 1][nx + 1]}. \tag{5.68} \]
The partial sum $s_n(x) = nx(nx + 1)^{-1}$, as may be verified by mathematical
induction. By inspection this expression for $s_n(x)$ holds for $n = 1, 2$. We assume
it holds for $n$ terms and then prove it holds for $n + 1$ terms.
\[ s_{n+1}(x) = s_n(x) + \frac{x}{[nx + 1][(n+1)x + 1]}
 = \frac{nx}{[nx + 1]} + \frac{x}{[nx + 1][(n+1)x + 1]}
 = \frac{(n+1)x}{(n+1)x + 1}, \]
completing the proof.
Letting $n$ approach infinity, we obtain
\[ S(0) = \lim_{n\to\infty} s_n(0) = 0, \]
\[ S(x \neq 0) = \lim_{n\to\infty} s_n(x \neq 0) = 1. \]
We have a discontinuity in our series limit at $x = 0$. However, $s_n(x)$ is a continuous
function of $x$, $0 \leq x \leq 1$, for all finite $n$. Equation 5.67 with $\varepsilon$ sufficiently
small, will be violated for all finite $n$. Our series does not converge uniformly.
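The failure of uniform convergence in Example 5.5.1 can be made concrete by evaluating the gap $|S(x) - s_n(x)|$ at the moving point $x = 1/n$, where $s_n = \tfrac12$ whatever the value of $n$. The sketch below is illustrative; the function names are assumptions.

```python
def partial_sum(n, x):
    """s_n(x) = n x / (n x + 1), the closed form verified by induction."""
    return n * x / (n * x + 1.0)


def limit_sum(x):
    """S(x): 0 at x = 0 and 1 for x != 0."""
    return 0.0 if x == 0 else 1.0


def gap_at_one_over_n(n):
    """|S(x) - s_n(x)| evaluated at x = 1/n, a point inside (0, 1]."""
    x = 1.0 / n
    return abs(limit_sum(x) - partial_sum(n, x))


for n in (1, 10, 1000, 10**6):
    print(n, gap_at_one_over_n(n))
```

Since the gap stays at one half however large $n$ becomes, no single $N$ can satisfy Eq. 5.67 for all $x$ in $[0, 1]$ once $\varepsilon < \tfrac12$.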
Weierstrass M Test
The most commonly encountered test for uniform convergence is the
Weierstrass M test. If we can construct a series of numbers $\sum_{1}^{\infty} M_i$, in which
$M_i \geq |u_i(x)|$ for all $x$ in the interval $[a, b]$ and $\sum_{1}^{\infty} M_i$ is convergent, our series
$\sum_{1}^{\infty} u_i(x)$ will be uniformly convergent in $[a, b]$.
The proof of this Weierstrass M test is direct and simple. Since $\sum_i M_i$
converges, some number $N$ exists such that for $n + 1 \geq N$,
\[ \sum_{i=n+1}^{\infty} M_i < \varepsilon. \tag{5.69} \]
This follows from our definition of convergence. Then, with $|u_i(x)| \leq M_i$ for all
$x$ in the interval $a \leq x \leq b$,
\[ \sum_{i=n+1}^{\infty} |u_i(x)| < \varepsilon. \tag{5.70} \]
Hence
\[ |S(x) - s_n(x)| = \left|\sum_{i=n+1}^{\infty} u_i(x)\right| < \varepsilon, \tag{5.71} \]
and by definition $\sum_{i=1}^{\infty} u_i(x)$ is uniformly convergent in $[a, b]$. Since we have
specified absolute values in the statement of the Weierstrass M test, the series
$\sum_{i=1}^{\infty} u_i(x)$ is also seen to be absolutely convergent.
The reader should note carefully that uniform convergence and absolute
convergence are independent properties. Neither implies the other. For specific
examples,
\[ \sum_{n=1}^{\infty}\frac{(-1)^n}{n + x^2}, \qquad -\infty < x < \infty \tag{5.72} \]
and
\[ \sum_{n=1}^{\infty}(-1)^{n-1}\frac{x^n}{n} = \ln(1+x), \qquad 0 \leq x \leq 1 \tag{5.73} \]
converge uniformly in the indicated intervals but do not converge absolutely.
On the other hand,
\[ \sum_{n=0}^{\infty}(1 - x)x^n = \begin{cases} 1, & 0 \leq x < 1 \\ 0, & x = 1 \end{cases} \tag{5.74} \]
converges absolutely but does not converge uniformly in $[0, 1]$.
From the definition of uniform convergence we may show that any series
\[ f(x) = \sum_{n=1}^{\infty} u_n(x) \tag{5.75} \]
of continuous terms $u_n(x)$
cannot converge uniformly in any interval that includes a discontinuity of $f(x)$.
Since the Weierstrass M test establishes both uniform and absolute
convergence, it will necessarily fail for series that are uniformly but conditionally
convergent.
Abel's Test
A somewhat more delicate test for uniform convergence has been given by
Abel. If
\[ u_n(x) = a_n f_n(x), \]
\[ \sum a_n = A, \quad \text{convergent}, \]
and the functions $f_n(x)$ are monotonic [$f_{n+1}(x) \leq f_n(x)$] and bounded, $0 \leq f_n(x)
\leq M$, for all $x$ in $[a, b]$, then $\sum u_n(x)$ converges uniformly in $[a, b]$.
This test is especially useful in analyzing power series (compare Section 5.7).
Details of the proof of Abel's test and other tests for uniform convergence are
given in the references listed at the end of this chapter.
Uniformly convergent series have three particularly useful properties.
1. If the individual terms $u_n(x)$ are continuous, the series sum
\[ f(x) = \sum_{n=1}^{\infty} u_n(x) \tag{5.76} \]
is also continuous.
2. If the individual terms $u_n(x)$ are continuous, the series may be integrated
term by term. The sum of the integrals is equal to the integral of the sum.
\[ \int_a^b f(x)\,dx = \sum_{n=1}^{\infty}\int_a^b u_n(x)\,dx. \tag{5.77} \]
3. The derivative of the series sum $f(x)$ equals the sum of the individual
term derivatives,
\[ \frac{d}{dx}f(x) = \sum_{n=1}^{\infty}\frac{d}{dx}u_n(x), \tag{5.78} \]
provided the following conditions are satisfied.
$u_n(x)$ and $\dfrac{du_n(x)}{dx}$ are continuous in $[a, b]$.
$\sum_{n=1}^{\infty}\dfrac{du_n(x)}{dx}$ is uniformly convergent in $[a, b]$.
Term-by-term integration of a uniformly convergent series¹ requires only
continuity of the individual terms. This condition is almost always satisfied in
physical applications. Term-by-term differentiation of a series is often not valid
because more restrictive conditions must be satisfied. Indeed, we shall encounter
cases in Chapter 14, Fourier Series, in which term-by-term differentiation
of a uniformly convergent series leads to a divergent series.
¹Term-by-term integration may also be valid in the absence of uniform
convergence.
EXERCISES
5.5.1 Find the range of uniform convergence of
(a) i^ip
П = 1 П
(b) £ \ ANS. (a) 1 < x < oo.
"=i n* (b) 1 < s < x < oo.
5.5.2 For what range of $x$ is the geometric series $\sum_{n=0}^{\infty} x^n$ uniformly convergent?
ANS. $-1 < -s \leq x \leq s < 1$.
5.5.3 For what range of positive values of $x$ is $\sum_{n=0}^{\infty} 1/(1 + x^n)$
(a) Convergent?
(b) Uniformly convergent?
5.5.4 If the series of the coefficients $\sum a_n$ and $\sum b_n$ are absolutely convergent, show that
the Fourier series
\[ \sum (a_n \cos nx + b_n \sin nx) \]
is uniformly convergent for $-\infty < x < \infty$.
5.6 TAYLOR'S EXPANSION
This is an expansion of a function into an infinite series or into a finite series
plus a remainder term. The coefficients of the successive terms of the series
involve the successive derivatives of the function. We have already used Taylor's
expansion in the establishment of a physical interpretation of divergence
(Section 1.7) and in other sections of Chapters 1 and 2. Now we derive the
Taylor expansion.
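We assume that our function $f(x)$ has a continuous $n$th derivative¹ in the
interval $a \leq x \leq b$. Then, integrating this $n$th derivative $n$ times,
\[ \int_a^x f^{(n)}(x)\,dx = f^{(n-1)}(x)\Big|_a^x = f^{(n-1)}(x) - f^{(n-1)}(a), \]
\[ \int_a^x\left(\int_a^x f^{(n)}(x)\,dx\right)dx = \int_a^x\left[f^{(n-1)}(x) - f^{(n-1)}(a)\right]dx
 = f^{(n-2)}(x) - f^{(n-2)}(a) - (x - a)f^{(n-1)}(a). \]
Continuing, we obtain
\[ \iiint_a^x f^{(n)}(x)(dx)^3 = f^{(n-3)}(x) - f^{(n-3)}(a) - (x - a)f^{(n-2)}(a) - \frac{(x-a)^2}{2}f^{(n-1)}(a). \tag{5.80} \]
Finally, on integrating for the $n$th time,
\[ \int_a^x\cdots\int_a^x f^{(n)}(x)(dx)^n = f(x) - f(a) - (x - a)f'(a) - \frac{(x-a)^2}{2!}f''(a) - \cdots - \frac{(x-a)^{n-1}}{(n-1)!}f^{(n-1)}(a). \tag{5.81} \]
¹Taylor's expansion may be derived under slightly less restrictive conditions,
compare Jeffreys and Jeffreys, Methods of Mathematical Physics, Section
1.133.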
Note that this expression is exact. No terms have been dropped, no
approximations made. Now, solving for $f(x)$, we have
\[ f(x) = f(a) + (x - a)f'(a) + \frac{(x-a)^2}{2!}f''(a) + \cdots + \frac{(x-a)^{n-1}}{(n-1)!}f^{(n-1)}(a) + R_n. \tag{5.82} \]
The remainder, $R_n$, is given by the $n$-fold integral
\[ R_n = \int_a^x\cdots\int_a^x f^{(n)}(x)(dx)^n. \tag{5.83} \]
This remainder, Eq. 5.83, may be put into perhaps more intelligible form by
using the mean value theorem of integral calculus
\[ \int_a^x g(x)\,dx = (x - a)g(\xi), \tag{5.84} \]
with $a \leq \xi \leq x$. By integrating $n$ times we get the Lagrangian form² of the
remainder:
\[ R_n = \frac{(x-a)^n}{n!}f^{(n)}(\xi). \tag{5.85} \]
With Taylor's expansion in this form we are not concerned with any questions
of infinite series convergence. This series is finite, and the only questions
concern the magnitude of the remainder.
When the function $f(x)$ is such that
\[ \lim_{n\to\infty} R_n = 0, \tag{5.86} \]
Eq. 5.82 becomes Taylor's series
\[ f(x) = f(a) + (x - a)f'(a) + \frac{(x-a)^2}{2!}f''(a) + \cdots = \sum_{n=0}^{\infty}\frac{(x-a)^n}{n!}f^{(n)}(a). \tag{5.87} \]
²An alternate form derived by Cauchy is
\[ R_n = \frac{(x - \xi)^{n-1}(x - a)}{(n-1)!}f^{(n)}(\xi), \]
with $a \leq \xi \leq x$.
Our Taylor series specifies the value of a function at one point, $x$, in terms of
the value of the function and its derivatives at a reference point, $a$. It is an
expansion in powers of the change in the variable, $\Delta x = x - a$ in this case.
The notation may be varied at the user's convenience. With the substitution
$x \to x + h$ and $a \to x$ we have an alternate form
\[ f(x + h) = \sum_{n=0}^{\infty}\frac{h^n}{n!}f^{(n)}(x). \]
When we use the operator $D = d/dx$ the Taylor expansion becomes
\[ f(x + h) = \sum_{n=0}^{\infty}\frac{h^n D^n}{n!}f(x) = e^{hD}f(x). \]
(The transition to the exponential form anticipates Eq. 5.90 that follows.) An
equivalent operator form of this Taylor expansion appears in Exercise 4.11.1.
A derivation of the Taylor expansion in the context of complex variable theory
appears in Section 6.5.
Maclaurin Theorem
If we expand about the origin ($a = 0$), Eq. 5.87 is known as Maclaurin's
series
\[ f(x) = f(0) + xf'(0) + \frac{x^2}{2!}f''(0) + \cdots = \sum_{n=0}^{\infty}\frac{x^n}{n!}f^{(n)}(0). \tag{5.88} \]
the expansion of various transcendental functions into infinite series.
EXAMPLE 5.6.1
Let $f(x) = e^x$. Differentiating, we have
\[ f^{(n)}(0) = 1 \tag{5.89} \]
for all $n$, $n = 1, 2, 3, \ldots$. Then, by Eq. 5.88, we have
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{n=0}^{\infty}\frac{x^n}{n!}. \tag{5.90} \]
This is the series expansion of the exponential function. Some authors use this
series to define the exponential function.
³Note that $0! = 1$ (compare Section 10.1).
Although this series is clearly convergent for all $x$, we should check the
remainder term, $R_n$. By Eq. 5.85 we have
\[ R_n = \frac{x^n}{n!}f^{(n)}(\xi) = \frac{x^n}{n!}e^{\xi}, \qquad 0 \leq |\xi| \leq x. \tag{5.91} \]
Therefore
\[ |R_n| \leq \frac{x^n e^x}{n!} \tag{5.92} \]
and
\[ \lim_{n\to\infty} R_n = 0 \tag{5.93} \]
for all finite values of $x$, which indicates that this Maclaurin expansion of $e^x$ is
valid over the range $-\infty < x < \infty$.
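The remainder estimate of Eq. 5.92 can be compared with the actual truncation error. The following is a minimal sketch with assumed function names; it takes $n$ to count the retained terms (powers $0$ through $n-1$), matching Eq. 5.82, and uses $|x|$ in the bound so that negative arguments are covered as well.

```python
import math


def exp_maclaurin(x, n):
    """Sum of the first n terms (powers 0 .. n-1) of the series in Eq. 5.90."""
    return sum(x**k / math.factorial(k) for k in range(n))


def lagrange_bound(x, n):
    """Bound |x|^n e^|x| / n! on the remainder R_n, following Eq. 5.92."""
    return abs(x) ** n * math.exp(abs(x)) / math.factorial(n)


x = 2.0
for n in (5, 10, 20):
    actual = abs(math.exp(x) - exp_maclaurin(x, n))
    print(n, actual, lagrange_bound(x, n))
```

The bound is not tight, but it shrinks factorially with $n$, which is what guarantees that the remainder vanishes for every finite $x$.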
EXAMPLE 5.6.2
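Let $f(x) = \ln(1+x)$. By differentiating, we obtain
\[ f'(x) = (1+x)^{-1}, \qquad f^{(n)}(x) = (-1)^{n-1}(n-1)!\,(1+x)^{-n}. \tag{5.94} \]
The Maclaurin expansion (Eq. 5.88) yields
\[ \ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots + R_n. \tag{5.95} \]
In this case our remainder is given by
\[ R_n = \frac{x^n}{n!}f^{(n)}(\xi), \quad 0 \leq \xi \leq x; \qquad |R_n| \leq \frac{x^n}{n}, \quad 0 \leq \xi \leq x \leq 1. \tag{5.96} \]
Now the remainder approaches zero as $n$ is increased indefinitely, provided
$0 \leq x \leq 1$.⁴ As an infinite series
\[ \ln(1+x) = \sum_{n=1}^{\infty}(-1)^{n-1}\frac{x^n}{n}, \tag{5.97} \]
which converges for $-1 < x \leq 1$. The range $-1 < x < 1$ is easily established
by the d'Alembert ratio test (Section 5.2). Convergence at $x = 1$ follows by the
Leibnitz criterion (Section 5.3). In particular, at $x = 1$, we have
\[ \ln 2 = 1 - \frac12 + \frac13 - \frac14 + \frac15 - \cdots = \sum_{n=1}^{\infty}(-1)^{n-1}n^{-1}, \]
the conditionally convergent alternating harmonic series.
⁴This range can easily be extended to $-1 < x \leq 1$ but not to $x = -1$.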
Binomial Theorem
A second, extremely important application of the Taylor and Maclaurin
expansions is the derivation of the binomial theorem for negative and/or
nonintegral powers.
Let $f(x) = (1+x)^m$, in which $m$ may be negative and is not limited to integral
values. Direct application of Eq. 5.88 gives
\[ (1+x)^m = 1 + mx + \frac{m(m-1)}{2!}x^2 + \cdots + R_n. \tag{5.99} \]
For this function the remainder is
\[ R_n = \frac{x^n}{n!}(1+\xi)^{m-n} \times m(m-1)\cdots(m-n+1) \tag{5.100} \]
and $\xi$ lies between 0 and $x$, $0 \leq \xi \leq x$. Now, for $n > m$, $(1+\xi)^{m-n}$ is a maximum
for $\xi = 0$. Therefore
\[ R_n \leq \frac{x^n}{n!} \times m(m-1)\cdots(m-n+1). \tag{5.101} \]
Note that the $m$ dependent factors do not yield a zero unless $m$ is a nonnegative
integer; $R_n$ tends to zero as $n \to \infty$ if $x$ is restricted to the range $0 \leq x < 1$.
The binomial expansion therefore is shown to be
\[ (1+x)^m = 1 + mx + \frac{m(m-1)}{2!}x^2 + \frac{m(m-1)(m-2)}{3!}x^3 + \cdots. \tag{5.102} \]
In other, equivalent notation
\[ (1+x)^m = \sum_{n=0}^{\infty}\frac{m!}{n!\,(m-n)!}x^n = \sum_{n=0}^{\infty}\binom{m}{n}x^n. \tag{5.103} \]
The quantity $\binom{m}{n}$, which equals $m!/[n!\,(m-n)!]$, is called a binomial coefficient.
Although we have only shown that the remainder vanishes,
\[ \lim_{n\to\infty} R_n = 0, \]
for $0 \leq x < 1$, the series in Eq. 5.102 actually may be shown to be convergent
for the extended range $-1 < x < 1$. For $m$ an integer, $(m-n)! = \pm\infty$ if $n > m$
(Section 10.1) and the series automatically terminates at $n = m$.
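Equation 5.102 translates directly into a short routine. The sketch below is illustrative (the function name is an assumption): each coefficient is built from the previous one through the factor $(m - n)/(n + 1)$, which reproduces $m(m-1)\cdots(m-n+1)/n!$ without factorials and so works for negative and nonintegral $m$.

```python
def binomial_series(m, x, terms):
    """Partial sum of (1+x)^m = 1 + m x + m(m-1)/2! x^2 + ... (Eq. 5.102)."""
    total = 0.0
    coeff = 1.0                        # binomial coefficient for n = 0
    for n in range(terms):
        total += coeff * x**n
        coeff *= (m - n) / (n + 1)     # coefficient for n+1 from that for n
    return total


print(binomial_series(-0.5, 0.2, 30), (1.0 + 0.2) ** -0.5)
print(binomial_series(3, 2.0, 10), (1.0 + 2.0) ** 3)
```

For a nonnegative integer $m$ the factor $(m - n)$ vanishes at $n = m$, so every later coefficient is zero and the series terminates, as noted above.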
EXAMPLE 5.6.3 Relativistic Energy
The total relativistic energy of a particle is
\[ E = mc^2\left(1 - \frac{v^2}{c^2}\right)^{-1/2}. \tag{5.104} \]
Compare this equation with the classical kinetic energy, $\tfrac12 mv^2$.
By Eq. 5.102 with $x = -v^2/c^2$ and $m = -\tfrac12$ we have
\[ E = mc^2\left[1 - \frac12\left(-\frac{v^2}{c^2}\right) + \frac{(-1/2)(-3/2)}{2!}\left(-\frac{v^2}{c^2}\right)^2
 + \frac{(-1/2)(-3/2)(-5/2)}{3!}\left(-\frac{v^2}{c^2}\right)^3 + \cdots\right] \]
or
\[ E = mc^2 + \frac12 mv^2 + \frac38 mv^2\cdot\frac{v^2}{c^2} + \frac{5}{16}mv^2\cdot\left(\frac{v^2}{c^2}\right)^2 + \cdots. \tag{5.105} \]
The first term, $mc^2$, is identified as the rest mass energy. Then
\[ E_{\text{kinetic}} = \frac12 mv^2\left[1 + \frac{3v^2}{4c^2} + \frac58\left(\frac{v^2}{c^2}\right)^2 + \cdots\right]. \tag{5.106} \]
For particle velocity $v \ll c$, the velocity of light, the expression in the brackets
reduces to unity and we see that the kinetic portion of the total relativistic
energy agrees with the classical result.
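A quick numerical comparison shows how good the truncated expansion of Eq. 5.106 is at modest speeds. The sketch below is illustrative only: the function names are assumptions, and units are chosen so that $m = 1$ and $c = 1$.

```python
import math


def kinetic_exact(m, v, c):
    """Relativistic kinetic energy E - m c^2, from Eq. 5.104."""
    return m * c**2 * (1.0 / math.sqrt(1.0 - (v / c) ** 2) - 1.0)


def kinetic_series(m, v, c):
    """Three-term truncation of Eq. 5.106."""
    beta2 = (v / c) ** 2
    return 0.5 * m * v**2 * (1.0 + 0.75 * beta2 + 0.625 * beta2**2)


def kinetic_classical(m, v):
    """Classical kinetic energy, (1/2) m v^2."""
    return 0.5 * m * v**2


for v in (0.01, 0.1, 0.5):
    print(v, kinetic_exact(1.0, v, 1.0), kinetic_series(1.0, v, 1.0),
          kinetic_classical(1.0, v))
```

At $v = 0.1c$ the classical value is already off by somewhat under one percent, while the three-term series agrees with the exact expression to a few parts in $10^7$.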
For polynomials we can generalize the binomial expansion to
\[ (a_1 + a_2 + \cdots + a_m)^n = \sum\frac{n!}{n_1!\,n_2!\cdots n_m!}\,a_1^{n_1}a_2^{n_2}\cdots a_m^{n_m}, \]
where the summation includes all different combinations of $n_1, n_2, \ldots, n_m$ with
$\sum_{i=1}^{m} n_i = n$. Here $n_i$ and $n$ are all integral. This generalization finds considerable
use in statistical mechanics.
Maclaurin series may sometimes appear indirectly rather than by direct use
of Eq. 5.88. For instance, the most convenient way to obtain the series expansion
\[ \sin^{-1}x = \sum_{n=0}^{\infty}\frac{(2n-1)!!}{(2n)!!}\,\frac{x^{2n+1}}{2n+1} = x + \frac{x^3}{6} + \frac{3x^5}{40} + \cdots \tag{5.106a} \]
is to make use of the relation
\[ \sin^{-1}x = \int_0^x\frac{dt}{(1 - t^2)^{1/2}}. \]
We expand $(1 - t^2)^{-1/2}$ (binomial theorem) and then integrate term by term.
This term-by-term integration is discussed in Section 5.7. The result is Eq. 5.106a.
Finally, we may take the limit as $x \to 1$. The series converges by Gauss's test,
Exercise 5.2.5.
Taylor Expansion—More than One Variable
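If the function $f$ has more than one independent variable, say, $f = f(x, y)$,
the Taylor expansion becomes
\[ f(x, y) = f(a, b) + (x - a)\frac{\partial f}{\partial x} + (y - b)\frac{\partial f}{\partial y}
 + \frac{1}{2!}\left[(x-a)^2\frac{\partial^2 f}{\partial x^2} + 2(x-a)(y-b)\frac{\partial^2 f}{\partial x\,\partial y} + (y-b)^2\frac{\partial^2 f}{\partial y^2}\right] \]
\[ \qquad + \frac{1}{3!}\left[(x-a)^3\frac{\partial^3 f}{\partial x^3} + 3(x-a)^2(y-b)\frac{\partial^3 f}{\partial x^2\,\partial y} + 3(x-a)(y-b)^2\frac{\partial^3 f}{\partial x\,\partial y^2} + (y-b)^3\frac{\partial^3 f}{\partial y^3}\right] + \cdots, \tag{5.107} \]
with all derivatives evaluated at the point $(a, b)$. Using $\alpha_j t = x_j - x_{j0}$, we may
write the Taylor expansion for $m$ independent variables in the symbolic form
\[ f(x_j) = \sum_{n=0}^{\infty}\frac{t^n}{n!}\left(\sum_{i=1}^{m}\alpha_i\frac{\partial}{\partial x_i}\right)^n f(x_k)\Bigg|_{x_k = x_{k0}}. \tag{5.108} \]
A convenient vector form is
\[ \psi(\mathbf{r} + \mathbf{a}) = \sum_{n=0}^{\infty}\frac{1}{n!}(\mathbf{a}\cdot\nabla)^n\psi(\mathbf{r}). \tag{5.109} \]
EXERCISES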
5.6.1 Show that
(a) $\sin x = \sum_{n=0}^{\infty}(-1)^n\dfrac{x^{2n+1}}{(2n+1)!}$,
(b) $\cos x = \sum_{n=0}^{\infty}(-1)^n\dfrac{x^{2n}}{(2n)!}$.
In Section 6.1 $e^{ix}$ is defined by a series expansion such that
\[ e^{ix} = \cos x + i\sin x. \]
This is the basis for the polar representation of complex quantities. As a special
case we find, with $x = \pi$,
\[ e^{i\pi} = -1. \]
5.6.2 Derive a series expansion of $\cot x$ in increasing powers of $x$ by dividing $\cos x$ by
$\sin x$.
Note. The resultant series that starts with $1/x$ is actually a Laurent series (Section
6.5). Although the two series for $\sin x$ and $\cos x$ were valid for all $x$, the convergence
of the series for $\cot x$ is limited by the zeros of the denominator, $\sin x$.
5.6.3 (a) Expand $(1 + x)\ln(1 + x)$ in a Maclaurin series. Find the limits on $x$ for
convergence.
(b) From the results for part (a) show that
\[ \ln 2 = \frac{1}{2} + \frac{1}{2}\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n(n+1)}. \]
ANS. (a)
5.6.4 The Raabe test for $\sum_{n=2}^{\infty}(n\ln n)^{-1}$ leads to
\[ \lim_{n\to\infty} n\left[\frac{(n+1)\ln(n+1)}{n\ln n} - 1\right]. \]
Show that this limit is unity (which means that the Raabe test here is indeterminate).
5.6.5 Show by series expansion that
\[ \frac{1}{2}\ln\frac{\eta_0 + 1}{\eta_0 - 1} = \coth^{-1}\eta_0, \qquad |\eta_0| > 1. \]
This identity may be used to obtain a second solution for Legendre's equation.
5.6.6 Show that $f(x) = x^{1/2}$ (a) has no Maclaurin expansion but (b) has a Taylor
expansion about any point $x_0 \ne 0$. Find the range of convergence of the Taylor
expansion about $x = x_0$.
5.6.7 Let $x$ be an approximation for a zero of $f(x)$ and $\Delta x$, the correction.
Show that by neglecting terms of order $(\Delta x)^2$
\[ \Delta x = -\frac{f(x)}{f'(x)}. \]
This is Newton's formula for finding a root. Newton's method has the virtues of
illustrating series expansions and elementary calculus but is very treacherous.
See Appendix A1 for details and an alternative.
5.6.8 Expand a function $\Phi(x, y, z)$ by Taylor's expansion. Evaluate $\bar{\Phi}$, the average value
of $\Phi$, averaged over a small cube of side $a$ centered on the origin and show that the
Laplacian of $\Phi$ is a measure of deviation of $\bar{\Phi}$ from $\Phi(0,0,0)$.
5.6.9 The ratio of two differentiable functions $f(x)$ and $g(x)$ takes on the indeterminate
form 0/0 at $x = x_0$. Using Taylor expansions prove L'Hospital's rule
\[ \lim_{x\to x_0}\frac{f(x)}{g(x)} = \lim_{x\to x_0}\frac{f'(x)}{g'(x)}. \]
5.6.10 With $n > 1$, show that
(a) $\dfrac{1}{n} - \ln\left(\dfrac{n}{n-1}\right) < 0$,
(b) $\dfrac{1}{n} - \ln\left(\dfrac{n+1}{n}\right) > 0$.
Use these inequalities to show that the limit defining the Euler-Mascheroni
constant is finite.
5.6.11 Expand $(1 - 2tz + t^2)^{-1/2}$ in powers of $t$. Assume that $t$ is small. Collect the
coefficients of $t^0$, $t^1$, and $t^2$.
ANS. $a_0 = P_0(z) = 1$,
$a_1 = P_1(z) = z$,
$a_2 = P_2(z) = \tfrac{1}{2}(3z^2 - 1)$,
where $a_n = P_n(z)$, the $n$th Legendre polynomial.
5.6.12 Using the double factorial notation of Section 10.1, show that
for m = 1, 2, 3, ....
5.6.13 Using binomial expansions, compare the three Doppler shift formulas:
(a) $\nu' = \nu\left(1 \mp \dfrac{v}{c}\right)^{-1}$, moving source;
(b) $\nu' = \nu\left(1 \pm \dfrac{v}{c}\right)$, moving observer;
(c) $\nu' = \nu\left(1 \pm \dfrac{v}{c}\right)\left(1 - \dfrac{v^2}{c^2}\right)^{-1/2}$, relativistic.
Note. The relativistic formula agrees with the classical formulas if terms of order
$v^2/c^2$ can be neglected.
5.6.14 In the theory of general relativity there are various ways of relating (defining) a
velocity of recession of a galaxy to its red shift, $\delta$. Milne's model (kinematic
relativity) gives
(a) $v_1 = c\delta\left(1 + \tfrac{1}{2}\delta\right)$,
(b) $v_2 = c\delta\left(1 + \tfrac{1}{2}\delta\right)(1 + \delta)^{-2}$,
(c) $1 + \delta = \left[\dfrac{1 + v_3/c}{1 - v_3/c}\right]^{1/2}$.
1. Show that for $\delta \ll 1$ (and $v_3/c \ll 1$) all three formulas reduce to $v = c\delta$.
2. Compare the three velocities through terms of order $\delta^2$.
Note. In special relativity (with $\delta$ replaced by $z$), the ratio of observed wavelength
$\lambda$ to emitted wavelength $\lambda_0$ is given by
\[ \frac{\lambda}{\lambda_0} = 1 + z = \left(\frac{c + v}{c - v}\right)^{1/2}. \]
5.6.15 The relativistic sum $w$ of two velocities $u$ and $v$ is given by
\[ \frac{w}{c} = \frac{u/c + v/c}{1 + uv/c^2}. \]
If
\[ \frac{v}{c} = \frac{u}{c} = 1 - \alpha, \]
where $0 \le \alpha \le 1$, find $w/c$ in powers of $\alpha$ through terms in $\alpha^3$.
5.6.16 The displacement $x$ of a particle of rest mass $m_0$, resulting from a constant force
$m_0 g$ along the x-axis, is
\[ x = \frac{c^2}{g}\left\{\left[1 + \left(\frac{gt}{c}\right)^2\right]^{1/2} - 1\right\}, \]
including relativistic effects. Find the displacement $x$ as a power series in time $t$.
Compare with the classical result
\[ x = \tfrac{1}{2}gt^2. \]
5.6.17 By use of Dirac's relativistic theory the fine structure formula of atomic spectroscopy
is given by
\[ E = mc^2\left[1 + \frac{\gamma^2}{(s + n - |k|)^2}\right]^{-1/2}, \]
where
\[ s = (|k|^2 - \gamma^2)^{1/2}, \qquad k = \pm 1, \pm 2, \pm 3, \ldots. \]
Expand in powers of $\gamma^2$ through order $\gamma^4$. ($\gamma^2 = Ze^2/\hbar c$, with $Z$ the atomic
number.) This expansion is useful in comparing the predictions of the Dirac
electron theory with those of a relativistic Schrodinger electron theory.
Experimental results support the Dirac theory.
5.6.18 In a head-on proton-proton collision, the ratio of the kinetic energy in the center
of mass system to the incident kinetic energy is
\[ R = \frac{\sqrt{2mc^2(E_k + 2mc^2)} - 2mc^2}{E_k}. \]
Find the value of this ratio of kinetic energies for
(a) $E_k \ll mc^2$ (nonrelativistic),
(b) $E_k \gg mc^2$ (extreme-relativistic).
ANS. (a) $\tfrac{1}{2}$, (b) 0. The latter answer is a sort of law of diminishing returns
for high energy particle accelerators (with stationary targets).
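5.6.19 With binomial expansions
\[ \frac{x}{1 - x} = \sum_{n=1}^{\infty}x^n, \qquad \frac{x}{x - 1} = \frac{1}{1 - x^{-1}} = \sum_{n=0}^{\infty}x^{-n}. \]
Adding these two series yields $\sum_{n=-\infty}^{\infty}x^n = 0$.
Hopefully, we can agree that this is nonsense, but what has gone wrong?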
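5.6.20 (a) Planck's theory of quantized oscillators led to an average energy
\[ \langle\varepsilon\rangle = \frac{\sum_{n=1}^{\infty} n\varepsilon_0\exp(-n\varepsilon_0/kT)}{\sum_{n=0}^{\infty}\exp(-n\varepsilon_0/kT)}, \]
where $\varepsilon_0$ was a fixed energy. Identify numerator and denominator as binomial
expansions and show that the ratio is
\[ \langle\varepsilon\rangle = \frac{\varepsilon_0}{\exp(\varepsilon_0/kT) - 1}. \]
(b) Show that the $\langle\varepsilon\rangle$ of part (a) reduces to $kT$, the classical result, for $kT \gg \varepsilon_0$.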
5.6.21 (a) Expand by the binomial theorem and integrate term by term to obtain the
Gregory series for $\tan^{-1}x$:
\[ \tan^{-1}x = \int_0^x\frac{dt}{1 + t^2} = \int_0^x\{1 - t^2 + t^4 - t^6 + \cdots\}\,dt = \sum_{n=0}^{\infty}(-1)^n\frac{x^{2n+1}}{2n+1}, \qquad -1 \le x \le 1. \]
(b) By comparing series expansions, show that
\[ \tan^{-1}x = \frac{i}{2}\ln\left(\frac{1 - ix}{1 + ix}\right). \]
Hint. Compare Exercise 5.4.1.
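5.6.22 In numerical analysis it is often convenient to approximate $d^2\psi(x)/dx^2$ by
\[ \frac{d^2}{dx^2}\psi(x) \approx \frac{1}{h^2}\left[\psi(x + h) - 2\psi(x) + \psi(x - h)\right]. \]
Find the error in this approximation.
ANS. Error $= \dfrac{h^2}{12}\psi^{(4)}(x)$.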
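5.6.23 You have a function $y(x)$ tabulated at equally spaced values of the argument
\[ y_n = y(x_n), \qquad x_n = x_0 + nh. \]
Show that the linear combination
\[ \frac{1}{12h}\{-y_2 + 8y_1 - 8y_{-1} + y_{-2}\} \]
yields
\[ y_0' - \frac{h^4}{30}y_0^{(5)} + \cdots. \]
Hence this linear combination yields $y_0'$ if $(h^4/30)y^{(5)}$ and higher powers of $h$ and
higher derivatives of $y(x)$ are negligible.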
5.6.24 In a numerical integration of a partial differential equation the three-dimensional
Laplacian is replaced by
\[ \nabla^2\psi(x,y,z) \to h^{-2}\left[\psi(x+h,y,z) + \psi(x-h,y,z) + \psi(x,y+h,z) + \psi(x,y-h,z) + \psi(x,y,z+h) + \psi(x,y,z-h) - 6\psi(x,y,z)\right]. \]
Determine the error in this approximation. Here $h$ is the step size, the distance
between adjacent points in the x-, y-, or z-direction.
5.6.25 Using double precision, calculate e from its Maclaurin series.
Note. This simple, direct approach is the best way of calculating e to high accuracy.
Sixteen terms give e to 16 significant figures. The reciprocal factorials give very
rapid convergence.
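The direct summation of Exercise 5.6.25 might be sketched as follows. This is a minimal illustration in double precision; the function name and the stopping tolerance are assumptions made here, and each term is obtained from the previous one by a single division rather than by recomputing a factorial.

```python
import math


def e_from_series(tol=1e-17):
    """Sum the Maclaurin series of e^x at x = 1 until the next term drops below `tol`."""
    total, term, n = 0.0, 1.0, 0  # term holds 1/n!
    while term > tol:
        total += term
        n += 1
        term /= n  # 1/n! obtained from 1/(n-1)!
    return total, n  # n is the number of terms that were added


if __name__ == "__main__":
    value, count = e_from_series()
    print(value, count, abs(value - math.e))
```

The second returned value reports how many terms were actually summed for the chosen tolerance.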
5.7 POWER SERIES
The power series is a special and extremely useful type of infinite series of
the form
\[ f(x) = a_0 + a_1x + a_2x^2 + a_3x^3 + \cdots = \sum_{n=0}^{\infty}a_nx^n, \tag{5.110} \]
where the coefficients $a_i$ are constants, independent of $x$.¹
Convergence
Equation 5.110 may readily be tested for convergence by either the Cauchy
root test or the d'Alembert ratio test (Section 5.2). If
\[ \lim_{n\to\infty}\left|\frac{a_{n+1}}{a_n}\right| = R^{-1}, \tag{5.111} \]
the series converges for $-R < x < R$. This is the interval or radius of convergence.
Since the root and ratio tests fail when the limit is unity, the end points
of the interval require special attention.
For instance, if $a_n = n^{-1}$, then $R = 1$ and, from Sections 5.1, 5.2, and 5.3, the
series converges for $x = -1$ but diverges for $x = +1$. If $a_n = n!$, then $R = 0$
and the series diverges for all $x \ne 0$.
Uniform and Absolute Convergence
Suppose our power series (Eq. 5.110) has been found convergent for $-R < x < R$;
then it will be uniformly and absolutely convergent in any interior
interval, $-S \le x \le S$, where $0 < S < R$.
This may be proved directly by the Weierstrass M test (Section 5.5) by using
$M_i = |a_i|S^i$.
Continuity
Since each of the terms $u_n(x) = a_nx^n$ is a continuous function of $x$ and
$f(x) = \sum a_nx^n$ converges uniformly for $-S \le x \le S$, $f(x)$ must be a continuous
function in the interval of uniform convergence.
This behavior is to be contrasted with the strikingly different behavior of the
Fourier series (Chapter 14), in which the Fourier series is used frequently to
represent discontinuous functions such as sawtooth and square waves.
Differentiation and Integration
With $u_n(x)$ continuous and $\sum a_nx^n$ uniformly convergent, we find that the
differentiated series is a power series with continuous functions and the same
radius of convergence as the original series. The new factors introduced by
differentiation (or integration) do not affect either the root or the ratio test.
Therefore our power series may be differentiated or integrated as often as
desired within the interval of uniform convergence (Exercise 5.7.13).
In view of the rather severe restrictions placed on differentiation (Section 5.5),
this is a remarkable and valuable result.
¹ Equation 5.110 may be rewritten with $z = x + iy$, replacing $x$. The following
sections will then yield uniform convergence, integrability, and differentiability
in a region of a complex plane in place of an interval on the x-axis.
Uniqueness Theorem
In the preceding section, using the Maclaurin series, we expanded $e^x$ and
$\ln(1 + x)$ into infinite series. In the succeeding chapters functions are frequently
represented or perhaps defined by infinite series. We now establish that the
power-series representation is unique.
If
\[ f(x) = \sum_{n=0}^{\infty}a_nx^n, \qquad -R_a < x < R_a \]
\[ \phantom{f(x)} = \sum_{n=0}^{\infty}b_nx^n, \qquad -R_b < x < R_b, \tag{5.112} \]
with overlapping intervals of convergence, including the origin, then
\[ a_n = b_n \tag{5.113} \]
for all $n$; that is, we assume two (different) power-series representations and then
proceed to show that the two are actually identical.
From Eq. 5.112
\[ \sum_{n=0}^{\infty}a_nx^n = \sum_{n=0}^{\infty}b_nx^n, \qquad -R < x < R, \tag{5.114} \]
where $R$ is the smaller of $R_a$, $R_b$. By setting $x = 0$ to eliminate all but the constant
terms, we obtain
\[ a_0 = b_0. \tag{5.115} \]
Now, exploiting the differentiability of our power series, we differentiate Eq.
5.114, getting
\[ \sum_{n=1}^{\infty}na_nx^{n-1} = \sum_{n=1}^{\infty}nb_nx^{n-1}. \tag{5.116} \]
We again set $x = 0$ to isolate the new constant terms and find
\[ a_1 = b_1. \tag{5.117} \]
By repeating this process $n$ times, we get
\[ a_n = b_n, \tag{5.118} \]
which shows that the two series coincide. Therefore our power-series representation
is unique.
This will be a crucial point in Section 8.5, in which we use a power series to
develop solutions of differential equations. This uniqueness of power series
appears frequently in theoretical physics. The establishment of perturbation
theory in quantum mechanics is one example. The power-series representation
of functions is often useful in evaluating indeterminate forms, particularly when
l'Hospital's rule may be awkward to apply (Exercise 5.7.9).
EXAMPLE 5.7.1
Evaluate
\[ \lim_{x\to 0}\frac{1 - \cos x}{x^2}. \tag{5.119} \]
Replacing $\cos x$ by its Maclaurin series expansion, we obtain
\[ \frac{1 - \cos x}{x^2} = \frac{1 - (1 - x^2/2! + x^4/4! - \cdots)}{x^2} = \frac{x^2/2! - x^4/4! + \cdots}{x^2} = \frac{1}{2!} - \frac{x^2}{4!} + \cdots. \]
Letting $x \to 0$, we have
\[ \lim_{x\to 0}\frac{1 - \cos x}{x^2} = \frac{1}{2}. \tag{5.120} \]
The uniqueness of power series means that the coefficients $a_n$ may be identified
with the derivatives in a Maclaurin series. From
\[ f(x) = \sum_{n=0}^{\infty}a_nx^n = \sum_{n=0}^{\infty}\frac{1}{n!}f^{(n)}(0)x^n \]
we have
\[ a_n = \frac{1}{n!}f^{(n)}(0). \]
Reversion (Inversion) of Power Series
Suppose we are given a series
\[ y - y_0 = a_1(x - x_0) + a_2(x - x_0)^2 + \cdots = \sum_{n=1}^{\infty}a_n(x - x_0)^n. \tag{5.121} \]
This gives $(y - y_0)$ in terms of $(x - x_0)$. However, it may be desirable to have
an explicit expression for $(x - x_0)$ in terms of $(y - y_0)$. We may solve Eq. 5.121
for $x - x_0$ by reversion (or inversion) of our series. Assume that
\[ x - x_0 = \sum_{n=1}^{\infty}b_n(y - y_0)^n, \tag{5.122} \]
with the $b_n$ to be determined in terms of the assumed known $a_n$. A brute-force
approach, which is perfectly adequate for the first few coefficients, is simply to
substitute Eq. 5.121 into Eq. 5.122. By equating coefficients of $(x - x_0)^n$ on both
sides of Eq. 5.122, since the power series is unique, we obtain
\[ b_1 = \frac{1}{a_1}, \qquad b_2 = -\frac{a_2}{a_1^3}, \qquad b_3 = \frac{1}{a_1^5}(2a_2^2 - a_1a_3), \]
\[ b_4 = \frac{1}{a_1^7}(5a_1a_2a_3 - a_1^2a_4 - 5a_2^3), \quad\text{and so on.} \tag{5.123} \]
Some of the higher coefficients are listed by Dwight.² A more general and much
more elegant approach is developed by the use of complex variables in the first
and second editions of Mathematical Methods for Physicists.
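The coefficients of Eq. 5.123 can be checked on a pair of series whose inverse relation is known. The sketch below is an illustration, not part of the text: it takes $y = e^x - 1 = x + x^2/2! + x^3/3! + \cdots$, whose reversion should reproduce $x = \ln(1 + y) = y - y^2/2 + y^3/3 - y^4/4 + \cdots$. Exact rational arithmetic avoids any rounding question; the function name is hypothetical.

```python
from fractions import Fraction


def reversion_coefficients(a1, a2, a3, a4):
    """First four reverted coefficients b_1..b_4 from Eq. 5.123."""
    b1 = 1 / a1
    b2 = -a2 / a1**3
    b3 = (2 * a2**2 - a1 * a3) / a1**5
    b4 = (5 * a1 * a2 * a3 - a1**2 * a4 - 5 * a2**3) / a1**7
    return b1, b2, b3, b4


# Coefficients of y = e^x - 1 about x0 = 0: a_n = 1/n!
a = [Fraction(1, 1), Fraction(1, 2), Fraction(1, 6), Fraction(1, 24)]
b = reversion_coefficients(*a)

if __name__ == "__main__":
    print(b)  # compare with the coefficients of ln(1 + y)
```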
EXERCISES
5.7.1 The classical Langevin theory of paramagnetism leads to an expression for the
magnetic polarization
\[ P(x) = c\left(\frac{\cosh x}{\sinh x} - \frac{1}{x}\right). \]
Expand $P(x)$ as a power series for small $x$ (low fields, high temperature).
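5.7.2 The depolarizing factor $L$ for an oblate ellipsoid in a uniform electric field parallel
to the axis of rotation is
\[ L = \frac{1}{\varepsilon_0}(1 + \zeta_0^2)(1 - \zeta_0\cot^{-1}\zeta_0), \]
where $\zeta_0$ defines an oblate ellipsoid in oblate spheroidal coordinates $(\xi, \zeta, \varphi)$.
Show that
\[ \lim_{\zeta_0\to\infty}L = \frac{1}{3\varepsilon_0} \quad\text{(sphere)}, \]
\[ \lim_{\zeta_0\to 0}L = \frac{1}{\varepsilon_0} \quad\text{(thin sheet)}. \]
5.7.3 The corresponding depolarizing factor (Exercise 5.7.2) for a prolate ellipsoid is
\[ L = \frac{1}{\varepsilon_0}(\eta_0^2 - 1)\left(\frac{1}{2}\eta_0\ln\frac{\eta_0 + 1}{\eta_0 - 1} - 1\right). \]
Show that
\[ \lim_{\eta_0\to\infty}L = \frac{1}{3\varepsilon_0} \quad\text{(sphere)}, \]
\[ \lim_{\eta_0\to 1}L = 0 \quad\text{(long needle)}. \]
² Dwight, H. B., Tables of Integrals and Other Mathematical Data, 4th ed.
New York: Macmillan (1961). (Compare Formula No. 50.)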
5.7.4 The analysis of the diffraction pattern of a circular opening involves
\[ \int_0^{2\pi}\cos(c\cos\varphi)\,d\varphi. \]
Expand the integrand in a series and integrate by using
\[ \int_0^{2\pi}\cos^{2n}\varphi\,d\varphi = \frac{(2n)!}{2^{2n}(n!)^2}\cdot 2\pi, \qquad \int_0^{2\pi}\cos^{2n+1}\varphi\,d\varphi = 0. \]
The result is $2\pi$ times the Bessel function $J_0(c)$.
5.7.5 Neutrons are created (by a nuclear reaction) inside a hollow sphere of radius $R$.
The newly created neutrons are uniformly distributed over the spherical volume.
Assuming that all directions are equally probable (isotropy), what is the average
distance a neutron will travel before striking the surface of the sphere? Assume
straight line motion, no collisions.
(a) Show that
\[ \bar{r} = \frac{3}{2}R\int_0^1\int_0^{\pi}\sqrt{1 - k^2\sin^2\theta}\;k^2\,dk\,\sin\theta\,d\theta. \]
(b) Expand the integrand as a series and integrate to obtain
\[ \bar{r} = R\left[1 - 3\sum_{n=1}^{\infty}\frac{1}{(2n-1)(2n+1)(2n+3)}\right]. \]
(c) Show that the sum of this infinite series is $\tfrac{1}{12}$, giving $\bar{r} = \tfrac{3}{4}R$.
Hint. Show that $s_n = \tfrac{1}{12} - [4(2n+1)(2n+3)]^{-1}$ by mathematical induction.
Then let $n \to \infty$.
5.7.6 Given that
\[ \int_0^1\frac{dx}{1 + x^2} = \tan^{-1}x\,\Big|_0^1 = \frac{\pi}{4}, \]
expand the integrand into a series and integrate term by term obtaining³
\[ \frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \cdots + (-1)^n\frac{1}{2n+1} + \cdots, \]
which is Leibnitz's formula for $\pi$. Compare the convergence (or lack of it) of the
integrand series and the integrated series at $x = 1$.
Leibnitz's formula converges so slowly that it is quite useless for numerical
work; $\pi$ has been computed to 100,000 decimals⁴ by using expressions such as
\[ \pi = 24\tan^{-1}\tfrac{1}{8} + 8\tan^{-1}\tfrac{1}{57} + 4\tan^{-1}\tfrac{1}{239}, \]
\[ \pi = 48\tan^{-1}\tfrac{1}{18} + 32\tan^{-1}\tfrac{1}{57} - 20\tan^{-1}\tfrac{1}{239}. \]
These expressions may be verified by the use of Exercise 5.6.2.
³ The series expansion of $\tan^{-1}x$ (upper limit 1 replaced by $x$) was discovered
by James Gregory in 1671, 3 years before Leibnitz. See Peter Beckmann's
entertaining and informative book, A History of Pi, 2nd ed. Boulder, Col.:
The Golem Press (1971).
⁴ Shanks, D., and J. W. Wrench, Jr., "Computation of π to 100,000 decimals,"
Math. Computation 16, 76 (1962).
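5.7.7 Expand the incomplete factorial function
\[ \int_0^x e^{-t}t^n\,dt \]
in a series of powers of $x$ for small values of $x$. What is the range of convergence
of the resulting series? Why was $x$ specified to be small?
ANS. $\displaystyle\int_0^x e^{-t}t^n\,dt = x^{n+1}\left[\frac{1}{n+1} - \frac{x}{n+2} + \frac{x^2}{2!(n+3)} - \cdots + \frac{(-1)^px^p}{p!(n+p+1)} + \cdots\right]$.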
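5.7.8 Derive the series expansion of the incomplete beta function
\[ B_x(p,q) = \int_0^x t^{p-1}(1 - t)^{q-1}\,dt = x^p\left[\frac{1}{p} + \frac{1-q}{p+1}x + \cdots + \frac{(1-q)(2-q)\cdots(n-q)}{n!\,(p+n)}x^n + \cdots\right] \]
for $0 \le x \le 1$, $p > 0$ and $q > 0$ (if $x = 1$).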
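5.7.9 Evaluate
(a) $\displaystyle\lim_{x\to 0}\frac{\sin(\tan x) - \tan(\sin x)}{x^7}$,
(b) $\displaystyle\lim_{x\to 0}x^{-n}j_n(x)$ for $n = 3$,
where $j_n(x)$ is a spherical Bessel function (Section 11.7) defined by
\[ j_n(x) = (-1)^nx^n\left(\frac{1}{x}\frac{d}{dx}\right)^n\left(\frac{\sin x}{x}\right). \]
ANS. (a) $-\dfrac{1}{30}$,
(b) $\dfrac{1}{1\cdot 3\cdot 5\cdots(2n+1)} \to \dfrac{1}{105}$ for $n = 3$.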
5.7.10 Neutron transport theory gives the following expression for the inverse neutron
diffusion length of $k$:
\[ \frac{a - b}{k}\tanh^{-1}\left(\frac{k}{a}\right) = 1. \]
By series inversion or otherwise, determine $k^2$ as a series of powers of $b/a$. Give
the first two terms of the series.
5.7.11 Develop a series expansion of $y = \sinh^{-1}x$ (that is, $\sinh y = x$) in powers of $x$ by
(a) reversion of the series for $\sinh y$,
(b) a direct Maclaurin expansion.
5.7.12 A function $f(z)$ is represented by a descending power series
\[ f(z) = \sum_{n=0}^{\infty}a_nz^{-n}, \qquad R \le z < \infty. \]
Show that this series expansion is unique; that is, if $f(z) = \sum_{n=0}^{\infty}b_nz^{-n}$, $R \le z < \infty$,
then $a_n = b_n$ for all $n$.
5.7.13 A power series given by
\[ f(x) = \sum_{n=0}^{\infty}a_nx^n \]
converges for $-R < x < R$. Show that the differentiated series and the integrated
series have the same interval of convergence. (Do not bother about the end
points $x = \pm R$.)
5.7.14 Assuming that $f(x)$ may be expanded in a power series about the origin, $f(x) =
\sum_{n=0}^{\infty}a_nx^n$, with some nonzero range of convergence. Use the techniques employed
in proving uniqueness of series to show that your assumed series is a
Maclaurin series:
\[ a_n = \frac{1}{n!}f^{(n)}(0). \]
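5.7.15 The Klein-Nishina formula for the scattering of photons by electrons contains a
term of the form
\[ f(\varepsilon) = \frac{1 + \varepsilon}{\varepsilon^2}\left[\frac{2 + 2\varepsilon}{1 + 2\varepsilon} - \frac{\ln(1 + 2\varepsilon)}{\varepsilon}\right]. \]
Here $\varepsilon = h\nu/mc^2$, the ratio of the photon energy to the electron rest mass energy.
Find
\[ \lim_{\varepsilon\to 0}f(\varepsilon). \]
ANS. $\tfrac{4}{3}$.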
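5.7.16 The behavior of a neutron losing energy by colliding elastically with nuclei of
mass $A$ is described by a parameter $\xi_1$,
\[ \xi_1 = 1 + \frac{(A - 1)^2}{2A}\ln\frac{A - 1}{A + 1}. \]
An approximation, good for large $A$, is
\[ \xi_2 = \frac{2}{A + \tfrac{2}{3}}. \]
Expand $\xi_1$ and $\xi_2$ in powers of $A^{-1}$. Show that $\xi_2$ agrees with $\xi_1$ through $(A^{-1})^2$.
Find the difference in the coefficients of the $(A^{-1})^3$ term.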
5.7.17 Show that each of these two integrals equals Catalan's constant:
(a) $\displaystyle\int_0^1\arctan t\,\frac{dt}{t}$,
(b) $\displaystyle -\int_0^1\ln x\,\frac{dx}{1 + x^2}$.
5.7.18 Calculate $\pi$ (double precision) by each of the following arc tangent expressions:
\[ \pi = 16\tan^{-1}(1/5) - 4\tan^{-1}(1/239), \]
\[ \pi = 24\tan^{-1}(1/8) + 8\tan^{-1}(1/57) + 4\tan^{-1}(1/239), \]
\[ \pi = 48\tan^{-1}(1/18) + 32\tan^{-1}(1/57) - 20\tan^{-1}(1/239). \]
You should obtain 16 significant figures.
Note. These formulas have been used in some of the more accurate calculations
of $\pi$.⁵
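One way to carry out the calculation of Exercise 5.7.18 is sketched below, using the Gregory series of Exercise 5.6.21 for each arc tangent. The function names and the stopping tolerance are assumptions of this sketch. Because every argument is small, each series needs only a handful of terms.

```python
import math


def arctan_gregory(x, tol=1e-18):
    """Gregory series for tan^{-1} x, summed until a term falls below `tol`; needs |x| < 1."""
    total, power, n = 0.0, x, 0  # power holds x^(2n+1)
    while abs(power) / (2 * n + 1) > tol:
        total += (-1) ** n * power / (2 * n + 1)
        power *= x * x
        n += 1
    return total


def pi_three_ways():
    """Evaluate the three arc tangent expressions for pi from Exercise 5.7.18."""
    at = arctan_gregory
    first = 16 * at(1 / 5) - 4 * at(1 / 239)
    second = 24 * at(1 / 8) + 8 * at(1 / 57) + 4 * at(1 / 239)
    third = 48 * at(1 / 18) + 32 * at(1 / 57) - 20 * at(1 / 239)
    return first, second, third


if __name__ == "__main__":
    for value in pi_three_ways():
        print(value, value - math.pi)
```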
5.7.19 An analysis of the Gibbs phenomenon of Section 14.5 leads to the expression
\[ \frac{2}{\pi}\int_0^{\pi}\frac{\sin\xi}{\xi}\,d\xi. \]
(a) Expand the integrand in a series and integrate term by term. Find the
numerical value of this expression to four significant figures.
(b) Evaluate this expression by the Gaussian quadrature (Appendix A2).
ANS. 1.178980.
5.8 ELLIPTIC INTEGRALS
Elliptic integrals are included here partly as an illustration of the use of
power series and partly for their own intrinsic interest. This interest includes
the occurrence of elliptic integrals in physical problems (Example 5.8.1 and
Exercise 5.8.4) and applications in mathematical problems.
⁵ Shanks, D., and J. W. Wrench, "Computation of π to 100,000 decimals,"
Math. Computation 16, 76 (1962).
EXAMPLE 5.8.1 Period of a Simple Pendulum
FIG. 5.8 Simple pendulum
For small amplitude oscillations our pendulum (Figure 5.8) has simple harmonic
motion with a period $T = 2\pi(l/g)^{1/2}$. For a maximum amplitude $\theta_M$ large
enough so that $\sin\theta_M \ne \theta_M$, Newton's second law of motion and Lagrange's
equation (Section 17.7) lead to a nonlinear differential equation ($\sin\theta$ is a
nonlinear function of $\theta$), so we turn to a different approach.
The swinging mass $m$ has a kinetic energy of $\tfrac{1}{2}ml^2(d\theta/dt)^2$ and a potential
energy of $-mgl\cos\theta$ ($\theta = \tfrac{1}{2}\pi$ taken for the arbitrary zero of potential energy).
Since $d\theta/dt = 0$ at $\theta = \theta_M$, the conservation of energy principle gives
\[ \frac{1}{2}ml^2\left(\frac{d\theta}{dt}\right)^2 - mgl\cos\theta = -mgl\cos\theta_M. \tag{5.124} \]
Solving for $d\theta/dt$ we obtain
\[ \frac{d\theta}{dt} = \pm\left(\frac{2g}{l}\right)^{1/2}(\cos\theta - \cos\theta_M)^{1/2} \tag{5.125} \]
with the mass $m$ canceling out. We take $t$ to be zero when $\theta = 0$ and $d\theta/dt > 0$.
An integration from $\theta = 0$ to $\theta = \theta_M$ yields
\[ \int_0^{\theta_M}(\cos\theta - \cos\theta_M)^{-1/2}\,d\theta = \left(\frac{2g}{l}\right)^{1/2}\int_0^t dt = \left(\frac{2g}{l}\right)^{1/2}t. \tag{5.126} \]
This is $\tfrac{1}{4}$ of a cycle, and therefore the time $t$ is $\tfrac{1}{4}$ of the period, $T$. We note that
$\theta \le \theta_M$, and with a bit of clairvoyance we try the half-angle substitution
\[ \sin\left(\frac{\theta}{2}\right) = \sin\left(\frac{\theta_M}{2}\right)\sin\varphi. \tag{5.127} \]
With this, Eq. 5.126 becomes
\[ T = 4\left(\frac{l}{g}\right)^{1/2}\int_0^{\pi/2}\left(1 - \sin^2\left(\frac{\theta_M}{2}\right)\sin^2\varphi\right)^{-1/2}d\varphi. \tag{5.128} \]
Although not an obvious improvement over Eq. 5.126, the integral now defines
the complete elliptic integral of the first kind, $K(\sin^2\theta_M/2)$. From the series
expansion, the period of our pendulum may be developed as a power series in
powers of $\sin\theta_M/2$:
\[ T = 2\pi\left(\frac{l}{g}\right)^{1/2}\left\{1 + \frac{1}{4}\sin^2\frac{\theta_M}{2} + \frac{9}{64}\sin^4\frac{\theta_M}{2} + \cdots\right\}. \tag{5.129} \]
Definitions
Generalizing Example 5.8.1 to include the upper limit as a variable, the
elliptic integral of the first kind is defined as
\[ F(\varphi\backslash\alpha) = \int_0^{\varphi}(1 - \sin^2\alpha\,\sin^2\theta)^{-1/2}\,d\theta \tag{5.130a} \]
or
\[ F(x|m) = \int_0^x[(1 - t^2)(1 - mt^2)]^{-1/2}\,dt, \qquad 0 \le m < 1. \tag{5.130b} \]
(This is the notation of AMS-55.) For $\varphi = \pi/2$, $x = 1$, we have the complete
elliptic integral of the first kind;
\[ K(m) = \int_0^{\pi/2}(1 - m\sin^2\theta)^{-1/2}\,d\theta = \int_0^1[(1 - t^2)(1 - mt^2)]^{-1/2}\,dt, \tag{5.131} \]
with $m = \sin^2\alpha$, $0 \le m < 1$.
The elliptic integral of the second kind is defined by
\[ E(\varphi\backslash\alpha) = \int_0^{\varphi}(1 - \sin^2\alpha\,\sin^2\theta)^{1/2}\,d\theta \tag{5.132a} \]
or
\[ E(x|m) = \int_0^x\left(\frac{1 - mt^2}{1 - t^2}\right)^{1/2}dt, \qquad 0 \le m \le 1. \tag{5.132b} \]
Again, for the case $\varphi = \pi/2$, $x = 1$, we have the complete elliptic integral of the
second kind:
\[ E(m) = \int_0^{\pi/2}(1 - m\sin^2\theta)^{1/2}\,d\theta = \int_0^1\left(\frac{1 - mt^2}{1 - t^2}\right)^{1/2}dt, \qquad 0 \le m \le 1. \tag{5.133} \]
Exercise 5.8.1 is an example of its occurrence. Fig. 5.9 shows the behavior of
$K(m)$ and $E(m)$. Extensive tables are available in AMS-55.
Series Expansion
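For our range $0 \le m < 1$, the denominator of $K(m)$ may be expanded by the
binomial series,
\[ (1 - m\sin^2\theta)^{-1/2} = 1 + \frac{1}{2}m\sin^2\theta + \frac{3}{8}m^2\sin^4\theta + \cdots = \sum_{n=0}^{\infty}\frac{(2n-1)!!}{(2n)!!}m^n\sin^{2n}\theta. \tag{5.134} \]
For any closed interval $[0, m_{\max}]$, $m_{\max} < 1$, this series is uniformly convergent
and may be integrated term by term. From Exercise 10.4.9
\[ \int_0^{\pi/2}\sin^{2n}\theta\,d\theta = \frac{(2n-1)!!}{(2n)!!}\cdot\frac{\pi}{2}. \tag{5.135} \]
Hence
\[ K(m) = \frac{\pi}{2}\left\{1 + \left(\frac{1}{2}\right)^2m + \left(\frac{1\cdot 3}{2\cdot 4}\right)^2m^2 + \left(\frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6}\right)^2m^3 + \cdots\right\}. \tag{5.136} \]
Similarly,
\[ E(m) = \frac{\pi}{2}\left\{1 - \left(\frac{1}{2}\right)^2\frac{m}{1} - \left(\frac{1\cdot 3}{2\cdot 4}\right)^2\frac{m^2}{3} - \left(\frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6}\right)^2\frac{m^3}{5} - \cdots\right\} \tag{5.137} \]
(Exercise 5.8.2).
FIG. 5.9 Complete elliptic integrals, $K(m)$ and $E(m)$, plotted against $m$ for $0 \le m \le 1$
In Section 13.5 these series are identified as hypergeometric
functions, and we have
\[ K(m) = \frac{\pi}{2}\,{}_2F_1\left(\tfrac{1}{2}, \tfrac{1}{2}; 1; m\right), \tag{5.138} \]
\[ E(m) = \frac{\pi}{2}\,{}_2F_1\left(-\tfrac{1}{2}, \tfrac{1}{2}; 1; m\right). \tag{5.139} \]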
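The series of Eqs. 5.136 and 5.137 translate directly into a short routine. The sketch below is one possible arrangement, with an assumed function name and stopping rule: both sums share the squared ratio $[(2n-1)!!/(2n)!!]^2$, so it is updated once per term and used for $K$ and for $E$.

```python
import math


def elliptic_ke_series(m, tol=1e-17, max_terms=100_000):
    """Complete elliptic integrals K(m), E(m) from the series Eqs. 5.136 and 5.137.

    Intended for 0 <= m < 1; convergence becomes very slow as m approaches 1.
    """
    k_sum, e_sum = 1.0, 1.0
    coeff, power = 1.0, 1.0  # coeff = [(2n-1)!!/(2n)!!]^2, power = m^n
    for n in range(1, max_terms):
        coeff *= ((2 * n - 1) / (2 * n)) ** 2
        power *= m
        term = coeff * power
        k_sum += term
        e_sum -= term / (2 * n - 1)
        if term < tol:
            break
    return 0.5 * math.pi * k_sum, 0.5 * math.pi * e_sum


if __name__ == "__main__":
    for m in (0.0, 0.1, 0.5, 0.9):
        print(m, *elliptic_ke_series(m))
```

A convenient internal check at $m = \tfrac{1}{2}$ is Legendre's relation, which for this self-complementary value reduces to $2EK - K^2 = \pi/2$; that relation is standard but is not derived in this section.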
Limiting Values
From the series Eqs. 5.136 and 5.137, or from the defining integrals,
\[ \lim_{m\to 0}K(m) = \frac{\pi}{2}, \tag{5.140} \]
\[ \lim_{m\to 0}E(m) = \frac{\pi}{2}. \tag{5.141} \]
For $m \to 1$ the series expansions are of little use. However, the integrals yield
\[ \lim_{m\to 1}K(m) = \infty, \tag{5.142} \]
the integral diverging logarithmically, and
\[ \lim_{m\to 1}E(m) = 1. \tag{5.143} \]
The elliptic integrals have been used extensively in the past for evaluating
integrals. For instance, integrals of the form
\[ I = \int_0^x R\left(t, \sqrt{a_4t^4 + a_3t^3 + a_2t^2 + a_1t + a_0}\right)dt, \]
where $R$ is a rational function of $t$ and of the radical, may be expressed in terms
of elliptic integrals. Jahnke and Emde, Chapter 5, give pages of such transformations.
With high-speed computers available for direct numerical evaluation,
interest in these elliptic integral techniques has declined. However, elliptic
integrals still remain of interest because of their appearance in physical problems
(Exercises 5.8.4 and 5.8.5).
EXERCISES
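5.8.1 The ellipse $x^2/a^2 + y^2/b^2 = 1$ may be represented parametrically by $x = a\sin\theta$,
$y = b\cos\theta$. Show that the length of arc within the first quadrant is
\[ a\int_0^{\pi/2}(1 - m\sin^2\theta)^{1/2}\,d\theta = aE(m). \]
Here
\[ 0 \le m = (a^2 - b^2)/a^2 \le 1. \]
5.8.2 Derive the series expansion
\[ E(m) = \frac{\pi}{2}\left\{1 - \left(\frac{1}{2}\right)^2\frac{m}{1} - \left(\frac{1\cdot 3}{2\cdot 4}\right)^2\frac{m^2}{3} - \cdots\right\}. \]
5.8.3 Show that
\[ \lim_{m\to 0}\frac{K(m) - E(m)}{m} = \frac{\pi}{4}. \]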
5.8.4 A circular loop of wire in the xy-plane, as shown, carries a current $I$. Given that
the vector potential is
\[ A_\varphi(\rho, \varphi, z) = \frac{a\mu_0I}{2\pi}\int_0^{\pi}\frac{\cos\alpha\,d\alpha}{(a^2 + \rho^2 + z^2 - 2a\rho\cos\alpha)^{1/2}}, \]
show that
\[ A_\varphi(\rho, \varphi, z) = \frac{\mu_0I}{\pi k}\left(\frac{a}{\rho}\right)^{1/2}\left[\left(1 - \frac{k^2}{2}\right)K(k^2) - E(k^2)\right], \]
where
\[ k^2 = \frac{4a\rho}{(a + \rho)^2 + z^2}. \]
Note. For extension of Exercise 5.8.4 to $\mathbf{B}$, see Smythe,¹ page 270.
¹ Smythe, W. R., Static and Dynamic Electricity, 3rd ed. McGraw-Hill,
New York (1969).
5.8.5 An analysis of the magnetic vector potential of a circular current loop leads to
the expression
\[ f(k^2) = k^{-2}[(2 - k^2)K(k^2) - 2E(k^2)], \]
where $K(k^2)$ and $E(k^2)$ are the complete elliptic integrals of the first and second
kinds. Show that for $k^2 \ll 1$ ($r \gg$ radius of loop)
\[ f(k^2) \approx \frac{\pi k^2}{16}. \]
5.8.6 Show that
(a) $\dfrac{dE(k^2)}{dk} = \dfrac{1}{k}(E - K)$,
(b) $\dfrac{dK(k^2)}{dk} = \dfrac{E}{k(1 - k^2)} - \dfrac{K}{k}$.
Hint. For part (b) show that
\[ E(k^2) = (1 - k^2)\int_0^{\pi/2}(1 - k^2\sin^2\theta)^{-3/2}\,d\theta \]
by comparing series expansions.
5.8.7 (a) Write a function subroutine that will compute $E(m)$ from the series expansion,
Eq. 5.137.
(b) Test your function subroutine by using it to calculate $E(m)$ over the range
$m = 0.0(0.1)0.9$ and comparing the result with the values given by AMS-55.
5.8.8 Repeat Exercise 5.8.7 for $K(m)$.
Note. These series for $E(m)$, Eq. 5.137, and $K(m)$, Eq. 5.136, converge only very
slowly for $m$ near 1. More rapidly converging series for $E(m)$ and $K(m)$ exist. See
Dwight's Tables of Integrals:² No. 773.2 and 774.2. Your computer subroutine
for computing $E$ and $K$ probably uses polynomial approximations: AMS-55,
Chapter 17.
² Dwight, H. B., Tables of Integrals and Other Mathematical Data. New York:
Macmillan Co. (1947).
5.8.9 A simple pendulum is swinging with a maximum amplitude of $\theta_M$. In the limit as
$\theta_M \to 0$, the period is 1 sec. Using the elliptic integral, $K(k^2)$, $k = \sin(\theta_M/2)$, calculate
the period $T$ for $\theta_M = 0°\,(10°)\,90°$.
Caution. Some elliptic integral subroutines require $k = m^{1/2}$ as an input parameter,
not $m$ itself.
Check values.
θ_M     T (sec)
10°     1.00193
50°     1.05033
90°     1.18258
5.8.10 Calculate the magnetic vector potential $\mathbf{A}(\rho, \varphi, z) = \hat{\boldsymbol{\varphi}}_0A_\varphi(\rho, \varphi, z)$ of a circular
current loop (Exercise 5.8.4) for the ranges $\rho/a = 2, 3, 4$, and $z/a = 0, 1, 2, 3, 4$.
Note. This elliptic integral calculation of the magnetic vector potential may be
checked by an associated Legendre function calculation, Example 12.5.1.
Check value. For $\rho/a = 3$ and $z/a = 0$;
$A_\varphi = 0.029023\,\mu_0I$.
5.9 BERNOULLI NUMBERS, EULER-MACLAURIN
FORMULA
The Bernoulli numbers were introduced by Jacques (James, Jacob) Bernoulli.
There are several equivalent definitions, but extreme care must be taken, for
some authors introduce variations in numbering or in algebraic signs. One
relatively simple approach is to define the Bernoulli numbers by the series¹
\[ \frac{x}{e^x - 1} = \sum_{n=0}^{\infty}\frac{B_nx^n}{n!}. \tag{5.144} \]
By differentiating this power series repeatedly and then setting $x = 0$, we obtain
\[ B_n = \left[\frac{d^n}{dx^n}\left(\frac{x}{e^x - 1}\right)\right]_{x=0}. \tag{5.145} \]
Specifically,
\[ B_1 = \frac{d}{dx}\left(\frac{x}{e^x - 1}\right)\bigg|_{x=0} = \left[\frac{1}{e^x - 1} - \frac{xe^x}{(e^x - 1)^2}\right]_{x=0} = -\frac{1}{2}, \tag{5.146} \]
as may be seen by series expansion of the denominators.
Since these derivatives are awkward to evaluate, we may introduce instead
a series expansion into the defining expression (Eq. 5.144) to obtain
\[ 1 = \left(1 + \frac{x}{2!} + \frac{x^2}{3!} + \cdots\right)\left(B_0 + B_1x + B_2\frac{x^2}{2!} + \cdots\right). \tag{5.147} \]
Using the power-series uniqueness theorem (Section 5.7) with the coefficient of
$x^0$ equal to unity and the coefficient of $x^n$ ($n \ne 0$) equal to zero, we obtain
\[ \frac{1}{2!}B_0 + B_1 = 0, \qquad B_1 = -\frac{1}{2}, \tag{5.148} \]
\[ \frac{1}{3!}B_0 + \frac{1}{2!}B_1 + \frac{1}{2!}B_2 = 0, \qquad B_2 = \frac{1}{6}. \tag{5.149} \]
Continuing, we have Table 5.1.

TABLE 5.1 Bernoulli Numbers
n     B_n       B_n (decimal)
0     1          1.0000 00000
1     -1/2      -0.5000 00000
2     1/6        0.1666 66667
4     -1/30     -0.0333 33333
6     1/42       0.0238 09524
8     -1/30     -0.0333 33333
10    5/66       0.0757 57576

Further values are given in National Bureau of Standards, Handbook of Mathematical
Functions (AMS-55).
¹ The function $x/(e^x - 1)$ may be considered a generating function since it
generates the Bernoulli numbers. Generating functions that generate the
special functions of mathematical physics appear in Chapters 11, 12, and 13.
\[ B_{2n+1} = 0, \qquad n = 1, 2, 3, \ldots. \]
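The coefficient matching that leads to Eqs. 5.148 and 5.149 can be continued mechanically. Setting the coefficient of $x^n$ in Eq. 5.147 to zero and multiplying by $(n+1)!$ gives $\sum_{k=0}^{n}\binom{n+1}{k}B_k = 0$ for $n \ge 1$, which is solved for $B_n$ in the sketch below. The function name is an assumption, and exact rational arithmetic is used so that the entries can be compared with Table 5.1 without rounding.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] in the convention of Eq. 5.144 (B_1 = -1/2)."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        # sum_{k=0}^{n} C(n+1, k) B_k = 0, solved for the k = n term
        partial = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(-partial / (n + 1))
    return B


if __name__ == "__main__":
    for n, value in enumerate(bernoulli_numbers(10)):
        print(n, value, float(value))
```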
If the variable $x$ in Eq. 5.144 is replaced by $2ix$ (and $B_1$ set equal to $-\tfrac{1}{2}$), we
obtain an alternate (and equivalent) definition of $B_{2n}$ by the expression
\[ x\cot x = \sum_{n=0}^{\infty}(-1)^nB_{2n}\frac{(2x)^{2n}}{(2n)!}, \qquad -\pi < x < \pi. \tag{5.150} \]
Using the method of residues (Section 7.2) or working from the infinite product
representation of $\sin x$ (Section 5.10), we find that
\[ B_{2n} = \frac{(-1)^{n-1}\,2\,(2n)!}{(2\pi)^{2n}}\sum_{p=1}^{\infty}p^{-2n}, \qquad n = 1, 2, 3, \ldots. \tag{5.151} \]
This representation of the Bernoulli numbers was discovered by Euler. It is
readily seen from Eq. 5.151 that $|B_{2n}|$ increases without limit as $n \to \infty$.
Numerical values have been calculated by Glaisher.² Illustrating the divergent
behavior of the Bernoulli numbers, we have
\[ B_{20} = -5.291\times 10^{2}, \qquad B_{200} = -3.647\times 10^{215}. \tag{5.152} \]
² Glaisher, J. W. L., "Table of the first 250 Bernoulli's numbers (to nine
figures) and their logarithms (to ten figures)," Trans. Cambridge Phil. Soc. XII,
390 (1871-1879).
Some authors prefer to define the Bernoulli numbers with a modified version
of Eq. 5.151 by using
\[ B_n = \frac{2(2n)!}{(2\pi)^{2n}}\sum_{p=1}^{\infty}p^{-2n}, \tag{5.153} \]
the subscript being just half of our subscript and all signs are positive. Again,
when using other texts or references the reader must check carefully to see
exactly how the Bernoulli numbers are defined.
The Bernoulli numbers occur frequently in number theory. The von Staudt-Clausen
theorem states that
\[ B_{2n} = A_n - \frac{1}{p_1} - \frac{1}{p_2} - \frac{1}{p_3} - \cdots - \frac{1}{p_k}, \tag{5.154} \]
in which $A_n$ is an integer and $p_1, p_2, \ldots, p_k$ are prime numbers that exceed by 1
a divisor of $2n$. It may readily be verified that this holds for
\[ B_6\ (A_3 = 1,\ p = 2, 3, 7), \]
\[ B_8\ (A_4 = 1,\ p = 2, 3, 5), \tag{5.155} \]
\[ B_{10}\ (A_5 = 1,\ p = 2, 3, 11), \]
and other special cases.
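The special cases of Eq. 5.155 can be checked exactly. The sketch below is illustrative: the helper names are assumptions, and the Bernoulli numbers are generated by the recurrence that follows from the generating function, Eq. 5.144. For each $n$ it collects the primes $p$ with $(p - 1)$ dividing $2n$ and forms $A_n = B_{2n} + \sum 1/p$, which the theorem asserts is an integer.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """[B_0, ..., B_n_max] with B_1 = -1/2, from sum_{k<=n} C(n+1, k) B_k = 0."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B


def is_prime(p):
    """Trial-division primality test, adequate for the small p needed here."""
    return p >= 2 and all(p % d for d in range(2, int(p**0.5) + 1))


def staudt_clausen(n):
    """Return (A_n, primes) for Eq. 5.154, where the primes satisfy (p - 1) | 2n."""
    primes = [p for p in range(2, 2 * n + 2) if is_prime(p) and (2 * n) % (p - 1) == 0]
    a_n = bernoulli_numbers(2 * n)[2 * n] + sum(Fraction(1, p) for p in primes)
    return a_n, primes


if __name__ == "__main__":
    for n in range(1, 8):
        print(2 * n, *staudt_clausen(n))
```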
The Bernoulli numbers appear in the summation of integral powers of the
integers,
\[ \sum_{j=1}^{N}j^p, \qquad p \text{ integral}, \]
and in numerous series expansions of the transcendental functions, including
$\tan x$, $\cot x$, $\csc x$, $\ln|\sin x|$, $\ln|\cos x|$, $\ln|\tan x|$, $\tanh x$, $\coth x$, and $\operatorname{csch} x$.
For example,
\[ \tan x = x + \frac{x^3}{3} + \frac{2}{15}x^5 + \cdots + \frac{(-1)^{n-1}2^{2n}(2^{2n} - 1)B_{2n}}{(2n)!}x^{2n-1} + \cdots. \tag{5.156} \]

TABLE 5.2 Bernoulli Functions
B_0 = 1
B_1 = x - 1/2
B_2 = x^2 - x + 1/6
B_3 = x^3 - (3/2)x^2 + (1/2)x
B_4 = x^4 - 2x^3 + x^2 - 1/30
B_5 = x^5 - (5/2)x^4 + (5/3)x^3 - (1/6)x
B_6 = x^6 - 3x^5 + (5/2)x^4 - (1/2)x^2 + 1/42
B_n(0) = B_n, Bernoulli number

The Bernoulli numbers are likely to come in such series expansions because of
the defining equations (5.144) and (5.150) and because of their relation to the
Riemann zeta function
\[ \zeta(2n) = \sum_{p=1}^{\infty}p^{-2n}. \tag{5.157} \]
Bernoulli Functions
If Eq. 5.144 is generalized slightly, we have
\[ \frac{xe^{xs}}{e^x - 1} = \sum_{n=0}^{\infty}B_n(s)\frac{x^n}{n!}, \tag{5.158} \]
defining the Bernoulli functions, $B_n(s)$. The first seven Bernoulli functions are
given in Table 5.2.
From the generating function, Eq. 5.158,
\[ B_n(0) = B_n, \qquad n = 0, 1, 2, \ldots, \tag{5.159} \]
the Bernoulli function evaluated at zero equals the corresponding Bernoulli
number. Two particularly important properties of the Bernoulli functions
follow from the defining relation: a differentiation relation
\[ B_n'(s) = nB_{n-1}(s), \qquad n = 1, 2, 3, \ldots, \tag{5.160} \]
and a symmetry relation
\[ B_n(1) = (-1)^nB_n(0), \qquad n = 0, 1, 2, \ldots. \tag{5.161} \]
These relations are used in the development of the Euler-Maclaurin integration
formula.
Euler-Maclaurin Integration Formula
One use of the Bernoulli functions is in the derivation of the Euler-Maclaurin
integration formula. This formula is used in Section 10.3 for the development
of an asymptotic expression for the factorial function, Stirling's series.
The technique is repeated integration by parts using Eq. 5.160 to create new
derivatives. We start with
\[ \int_0^1 f(x)\,dx = \int_0^1 f(x)B_0(x)\,dx. \tag{5.162} \]
From Eq. 5.160 and Exercise 5.9.2
\[ B_1'(x) = B_0(x) = 1. \tag{5.163} \]
Substituting $B_1'(x)$ into Eq. 5.162 and integrating by parts, we obtain
\[ \int_0^1 f(x)\,dx = f(1)B_1(1) - f(0)B_1(0) - \int_0^1 f'(x)B_1(x)\,dx = \frac{1}{2}[f(1) + f(0)] - \int_0^1 f'(x)B_1(x)\,dx. \tag{5.164} \]
Again, using Eq. 5.160, we have
\[ B_1(x) = \tfrac{1}{2}B_2'(x), \tag{5.165} \]
and integrating by parts
\[ \int_0^1 f(x)\,dx = \frac{1}{2}[f(1) + f(0)] - \frac{1}{2!}[f'(1)B_2(1) - f'(0)B_2(0)] + \frac{1}{2!}\int_0^1 f^{(2)}(x)B_2(x)\,dx. \tag{5.166} \]
Using the relations,
\[ B_{2n}(1) = B_{2n}(0) = B_{2n}, \qquad n = 0, 1, 2, \ldots, \]
\[ B_{2n+1}(1) = B_{2n+1}(0) = 0, \qquad n = 1, 2, 3, \ldots, \tag{5.167} \]
and continuing this process, we have
\[ \int_0^1 f(x)\,dx = \frac{1}{2}[f(1) + f(0)] - \sum_{p=1}^{q}\frac{1}{(2p)!}B_{2p}\left[f^{(2p-1)}(1) - f^{(2p-1)}(0)\right] + \frac{1}{(2q)!}\int_0^1 f^{(2q)}(x)B_{2q}(x)\,dx. \tag{5.168a} \]
This is the Euler-Maclaurin integration formula. It assumes that the function
$f(x)$ has the required derivatives.
The range of integration in Eq. 5.168a may be shifted from [0, 1] to [1, 2]
by replacing $f(x)$ by $f(x + 1)$. Adding such results up to $[n - 1, n]$,
\[ \int_0^n f(x)\,dx = \tfrac{1}{2}f(0) + f(1) + f(2) + \cdots + f(n-1) + \tfrac{1}{2}f(n) - \sum_{p=1}^{q}\frac{1}{(2p)!}B_{2p}\left[f^{(2p-1)}(n) - f^{(2p-1)}(0)\right] + \text{remainder term}. \tag{5.168b} \]
The terms $\tfrac{1}{2}f(0) + f(1) + \cdots + \tfrac{1}{2}f(n)$ appear exactly as in trapezoidal
integration or quadrature. The summation over $p$ may be interpreted as a
correction to the trapezoidal approximation. Equation 5.168b is the form used
in Exercise 5.9.5 for summing positive powers of integers and in Section 10.3
for the derivation of Stirling's formula.
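As a small worked case of Eq. 5.168b, take $f(x) = x^3$. All derivatives beyond the third vanish, so with $q = 2$ the remainder term is zero and the formula gives the sum of cubes in closed form. The sketch below spells out each contribution; the function name is an assumption, and the example is an illustration rather than the solution the text has in mind for Exercise 5.9.5.

```python
from fractions import Fraction

B2, B4 = Fraction(1, 6), Fraction(-1, 30)  # Bernoulli numbers from Table 5.1


def sum_of_cubes_euler_maclaurin(n):
    """Sum of j^3 for j = 0..n, rearranged from Eq. 5.168b with f(x) = x^3."""
    n = Fraction(n)
    integral = n**4 / 4               # integral of x^3 from 0 to n
    endpoints = (0 + n**3) / 2        # (1/2)[f(0) + f(n)]
    corr_p1 = B2 / 2 * (3 * n**2)     # (B_2/2!)[f'(n) - f'(0)]
    corr_p2 = B4 / 24 * (6 - 6)       # (B_4/4!)[f'''(n) - f'''(0)], which vanishes
    return integral + endpoints + corr_p1 + corr_p2


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(n, sum_of_cubes_euler_maclaurin(n), sum(j**3 for j in range(n + 1)))
```

The three surviving pieces combine to $n^4/4 + n^3/2 + n^2/4 = [n(n+1)/2]^2$, the familiar closed form.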
TABLE 5.3 Riemann Zeta Function
s      ζ(s)
2      1.64493 40668
3      1.20205 69032
4      1.08232 32337
5      1.03692 77551
6      1.01734 30620
7      1.00834 92774
8      1.00407 73562
9      1.00200 83928
10     1.00099 45751

The Euler-Maclaurin formula is often useful in summing series by converting
them to integrals.³
Riemann Zeta Function
This series £^=1 p~2" was used as a comparison series for testing convergence
(Section 5.2) and in Eq. 5.151 as one definition of the Bernoulli numbers, B2n.
It also serves to define the Riemann zeta function by
C(s) =f>~s, 5>1. E.169)
Table 5.3 lists the values of £(s) for integral s, s = 2, 3, . .., 10. Closed forms for
even s appear in Exercise 5.9.6. Figure 5.10 is a plot of £(s) - 1. An integral
expression for this Riemann zeta function appears in Section 10.2 as part of
the development of the gamma function.
Another interesting expression for the Riemann zeta function may be derived
as follows:
$$\zeta(s)(1 - 2^{-s}) = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \cdots - \left(\frac{1}{2^s} + \frac{1}{4^s} + \frac{1}{6^s} + \cdots\right), \tag{5.170}$$
eliminating all the $n^{-s}$, where n is a multiple of 2. Then
$$\zeta(s)(1 - 2^{-s})(1 - 3^{-s}) = 1 + \frac{1}{3^s} + \frac{1}{5^s} + \frac{1}{7^s} + \frac{1}{9^s} + \cdots - \left(\frac{1}{3^s} + \frac{1}{9^s} + \frac{1}{15^s} + \cdots\right), \tag{5.171}$$
eliminating all the remaining terms in which n is a multiple of 3. Continuing,
we have $\zeta(s)(1 - 2^{-s})(1 - 3^{-s})(1 - 5^{-s}) \cdots (1 - P^{-s})$, where P is a prime
number, and all terms $n^{-s}$, in which n is a multiple of any integer up through
P, are canceled out. As $P \to \infty$,
$$\zeta(s)(1 - 2^{-s})(1 - 3^{-s}) \cdots (1 - P^{-s}) = \zeta(s)\prod_{P(\mathrm{prime})=2}^{\infty}(1 - P^{-s}) = 1. \tag{5.172}$$
Therefore
$$\zeta(s) = \left[\prod_{P(\mathrm{prime})=2}^{\infty}(1 - P^{-s})\right]^{-1}, \tag{5.173}$$
giving ζ(s) as an infinite product.4

FIG. 5.10 Riemann zeta function, ζ(s) − 1 versus s (logarithmic vertical scale)

3 Compare Boas, R. P., and C. Stutz, "Estimating Sums with Integrals,"
Am. J. Phys. 39, 745 (1971) for a number of examples.
This cancellation procedure has a clear application in numerical
computation. Equation 5.170 will give $\zeta(s)(1 - 2^{-s})$ to the same accuracy as Eq. 5.169
gives ζ(s), but with only half as many terms. (In either case, a correction would
be made for the neglected tail of the series by the Maclaurin integral test
technique: replacing the series by an integral, Section 5.2.)
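The comparison between Eqs. 5.169 and 5.170, and the Euler product of Eq. 5.173, can be tried numerically. The following is a minimal sketch, not the text's own program: the function names are hypothetical, and the tail estimates (integrating from the midpoint just past the last retained term) are one reasonable choice among several.

```python
import math

def zeta_all_terms(s, terms):
    """Eq. 5.169: sum n**-s for n = 1..terms, plus an integral estimate of the tail."""
    partial = sum(n ** -s for n in range(1, terms + 1))
    # Assumed tail estimate: integral of x**-s from terms + 1/2 to infinity
    tail = (terms + 0.5) ** (1 - s) / (s - 1)
    return partial + tail

def zeta_odd_terms(s, terms):
    """Eq. 5.170: sum over odd n only, then divide by (1 - 2**-s)."""
    partial = sum((2 * k - 1) ** -s for k in range(1, terms + 1))
    # Odd integers are spaced by 2, so the tail is half the integral from 2*terms
    tail = 0.5 * (2 * terms) ** (1 - s) / (s - 1)
    return (partial + tail) / (1 - 2 ** -s)

def zeta_euler_product(s, max_prime):
    """Eq. 5.173 truncated at primes <= max_prime (no tail correction)."""
    sieve = [True] * (max_prime + 1)
    product = 1.0
    for p in range(2, max_prime + 1):
        if sieve[p]:
            product /= 1 - p ** -s
            for multiple in range(p * p, max_prime + 1, p):
                sieve[multiple] = False
    return product

print(zeta_all_terms(3, 100), zeta_odd_terms(3, 50))
print(zeta_euler_product(2, 1000), math.pi ** 2 / 6)
```

With these tail estimates, 50 odd terms and 100 consecutive terms give ζ(3) to comparable accuracy, in line with the remark about halving the work.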
Along with the Riemann zeta function, AMS-55 (Chapter 23) defines three
other functions of sums of reciprocal powers:
$$\eta(s) = \sum_{n=1}^{\infty}(-1)^{n-1}n^{-s} = (1 - 2^{1-s})\zeta(s), \quad s = 1, 2, \ldots$$
$$\lambda(s) = \sum_{n=0}^{\infty}(2n + 1)^{-s} = (1 - 2^{-s})\zeta(s), \quad s = 2, 3, \ldots$$
and
$$\beta(s) = \sum_{n=0}^{\infty}(-1)^{n}(2n + 1)^{-s}, \quad s = 1, 2, \ldots$$
4 This is the starting point for the extensive applications of the Riemann zeta
function to number theory. See Edwards, H. M., Riemann's Zeta Function.
New York: Academic Press (1974).
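From the Bernoulli numbers (Exercise 5.9.6) or Fourier series (Example 14.3.3
and Exercise 14.3.13) special values are
$$\zeta(2) = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots = \frac{\pi^2}{6}$$
$$\zeta(4) = 1 + \frac{1}{2^4} + \frac{1}{3^4} + \cdots = \frac{\pi^4}{90}$$
$$\eta(2) = 1 - \frac{1}{2^2} + \frac{1}{3^2} - \cdots = \frac{\pi^2}{12}$$
$$\eta(4) = 1 - \frac{1}{2^4} + \frac{1}{3^4} - \cdots = \frac{7\pi^4}{720}$$
$$\lambda(2) = 1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots = \frac{\pi^2}{8}$$
$$\lambda(4) = 1 + \frac{1}{3^4} + \frac{1}{5^4} + \cdots = \frac{\pi^4}{96}$$
$$\beta(1) = 1 - \frac{1}{3} + \frac{1}{5} - \cdots = \frac{\pi}{4}$$
$$\beta(3) = 1 - \frac{1}{3^3} + \frac{1}{5^3} - \cdots = \frac{\pi^3}{32}.$$
Catalan's constant,
$$\beta(2) = 1 - \frac{1}{3^2} + \frac{1}{5^2} - \cdots = 0.9159\,6559\ldots,$$
is the topic of Exercise 5.2.22.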
Improvement of Convergence
If we are required to sum a convergent series $\sum_{n=1}^{\infty} a_n$ whose terms are rational
functions of n, the convergence may be improved dramatically by introducing
the Riemann zeta function.
EXAMPLE 5.9.1 Improvement of convergence
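The problem is to evaluate the series $\sum_{n=1}^{\infty} 1/(1 + n^2)$. Expanding $(1 + n^2)^{-1} =
n^{-2}(1 + n^{-2})^{-1}$ by direct division, we have
$$(1 + n^2)^{-1} = n^{-2}\left(1 - n^{-2} + n^{-4} - \frac{n^{-6}}{1 + n^{-2}}\right)
= \frac{1}{n^2} - \frac{1}{n^4} + \frac{1}{n^6} - \frac{1}{n^8 + n^6}.$$
Therefore
$$\sum_{n=1}^{\infty}\frac{1}{1 + n^2} = \zeta(2) - \zeta(4) + \zeta(6) - \sum_{n=1}^{\infty}\frac{1}{n^8 + n^6}.$$
The ζ functions are tabulated and the remainder series converges as $n^{-8}$. Clearly,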
the process can be continued as desired. You make a choice between how much
algebra you will do and how much arithmetic the computing machine will do.
Other methods for improving computational effectiveness are given at the
end of Sections 5.2 and 5.4.
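The gain in Example 5.9.1 is easy to see numerically. The sketch below is an illustration under stated assumptions: the ζ values are copied from Table 5.3, the function names are hypothetical, and the closed form (π coth π − 1)/2 used for comparison is a standard result that is not derived in this section.

```python
import math

# Tabulated values from Table 5.3
ZETA = {2: 1.6449340668, 4: 1.0823232337, 6: 1.0173430620}

def sum_direct(terms):
    """Partial sum of 1/(1 + n^2), n = 1..terms, with no acceleration."""
    return sum(1.0 / (1 + n * n) for n in range(1, terms + 1))

def sum_accelerated(terms):
    """zeta(2) - zeta(4) + zeta(6) minus a partial sum of the n^-8 remainder series."""
    remainder = sum(1.0 / (n ** 8 + n ** 6) for n in range(1, terms + 1))
    return ZETA[2] - ZETA[4] + ZETA[6] - remainder

# Reference value (assumed known identity): (pi * coth(pi) - 1) / 2
reference = (math.pi / math.tanh(math.pi) - 1) / 2

print(abs(sum_direct(10) - reference))       # error of the plain partial sum
print(abs(sum_accelerated(10) - reference))  # error after improving convergence
```

Ten terms of the remainder series already exhaust most of the accuracy of the ten-decimal table entries, whereas ten terms of the original series leave an error of order 1/10.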
EXERCISES
5.9.1 Show that
$$\tan x = \sum_{n=1}^{\infty}\frac{(-1)^{n-1}2^{2n}(2^{2n} - 1)B_{2n}}{(2n)!}x^{2n-1}, \quad -\frac{\pi}{2} < x < \frac{\pi}{2}.$$
Hint. tan x = cot x − 2 cot 2x.
5.9.2 The Bernoulli numbers generated in Eq. 5.144 may be generalized to Bernoulli
polynomials,
$$\frac{xe^{xs}}{e^x - 1} = \sum_{n=0}^{\infty}B_n(s)\frac{x^n}{n!}.$$
Show that
$$B_0(s) = 1$$
$$B_1(s) = s - \tfrac{1}{2}$$
$$B_2(s) = s^2 - s + \tfrac{1}{6}.$$
Note that $B_n(0) = B_n$, the Bernoulli number.
5.9.3 Show that $B_n'(s) = nB_{n-1}(s)$, n = 1, 2, 3, ....
Hint. Differentiate the equation in Exercise 5.9.2.
5.9.4 Show that
$$B_n(1) = (-1)^n B_n(0).$$
Hint. Go back to the generating function, Eq. 5.158 or Exercise 5.9.2.
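5.9.5 The Euler-Maclaurin integration formula may be used for the evaluation of
finite series:
$$\sum_{m=1}^{n}f(m) = \int_1^n f(x)\,dx + \tfrac{1}{2}f(1) + \tfrac{1}{2}f(n) + \frac{B_2}{2!}[f'(n) - f'(1)] + \frac{B_4}{4!}[f'''(n) - f'''(1)] + \cdots.$$
Show that
(a) $\sum_{m=1}^{n} m = \tfrac{1}{2}n(n + 1)$.
(b) $\sum_{m=1}^{n} m^2 = \tfrac{1}{6}n(n + 1)(2n + 1)$.
(c) $\sum_{m=1}^{n} m^3 = \tfrac{1}{4}n^2(n + 1)^2$.
(d) $\sum_{m=1}^{n} m^4 = \tfrac{1}{30}n(n + 1)(2n + 1)(3n^2 + 3n - 1)$.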
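5.9.6 From
$$B_{2n} = (-1)^{n-1}\frac{2(2n)!}{(2\pi)^{2n}}\zeta(2n),$$
show that
(a) $\zeta(2) = \pi^2/6$
(b) $\zeta(4) = \pi^4/90$
(c) $\zeta(6) = \pi^6/945$
(d) $\zeta(8) = \pi^8/9450$
(e) $\zeta(10) = \pi^{10}/93{,}555$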
5.9.7 Planck's black-body radiation law involves the integral
$$\int_0^{\infty}\frac{x^3\,dx}{e^x - 1}.$$
Show that this equals 6ζ(4). From Exercise 5.9.6, $\zeta(4) = \pi^4/90$.
Hint. Make use of the gamma function, Chapter 10.
5.9.8 Prove that
$$\int_0^{\infty}\frac{x^n e^x\,dx}{(e^x - 1)^2} = n!\,\zeta(n).$$
Assuming n to be real, show that each side of the equation diverges if n = 1.
Hence the preceding equation carries the condition n > 1. Integrals such as this
appear in the quantum theory of transport effects: thermal and electrical
conductivity.
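5.9.9 The Bloch-Grüneisen approximation for the resistance in a monovalent metal
is
$$\rho = C\frac{T^5}{\Theta^6}\int_0^{\Theta/T}\frac{x^5\,dx}{(e^x - 1)(1 - e^{-x})},$$
where Θ is the Debye temperature characteristic of the metal.
(a) For $T \to \infty$ show that
$$\rho \approx \frac{C}{4}\cdot\frac{T}{\Theta^2}.$$
(b) For $T \to 0$, show that
$$\rho \approx 5!\,\zeta(5)\,C\frac{T^5}{\Theta^6}.$$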
5.9.10 Show that
•4n(l + xK 1
(a)
o x
, . f ln(l — x) , ,,
(b) hm — -dx = CB).
^ J
o *
2/
From Exercise 5.9.6, $\zeta(2) = \pi^2/6$. Note that the integrand in part (b) diverges for
a = 1 but that the integrated series is convergent.
5.9.11 The integral
$$\int_0^1 [\ln(1 - x)]^2\,\frac{dx}{x}$$
appears in the fourth-order correction to the magnetic moment of the electron.
Show that it equals 2ζ(3).
Hint. Let $1 - x = e^{-t}$.
5.9.12 Show that
By contour integration (Exercise 7.2.17), this may be shown equal to $\pi^3/8$.
5.9.13 For "small" values of x
$$\ln(x!) = -\gamma x + \sum_{n=2}^{\infty}(-1)^n\frac{\zeta(n)}{n}x^n,$$
where γ is the Euler-Mascheroni constant and ζ(n) the Riemann zeta function.
For what values of x does this series converge?
ANS. $-1 < x \le 1$.
Note that if x = 1, we obtain
$$\gamma = \sum_{n=2}^{\infty}(-1)^n\frac{\zeta(n)}{n},$$
a series for the Euler-Mascheroni constant. The convergence of this series is
exceedingly slow. For actual computation of γ, other, indirect approaches are
far superior (see Exercises 5.9.17, 5.10.11, and 10.5.16).
5.9.14 Show that the series expansion of ln(x!) (Exercise 5.9.13) may be written as
(a) $$\ln(x!) = \frac{1}{2}\ln\left(\frac{\pi x}{\sin \pi x}\right) - \gamma x - \sum_{n=1}^{\infty}\frac{\zeta(2n + 1)}{2n + 1}x^{2n+1},$$
(b) $$\ln(x!) = \frac{1}{2}\ln\left(\frac{\pi x}{\sin \pi x}\right) - \frac{1}{2}\ln\left(\frac{1 + x}{1 - x}\right) + (1 - \gamma)x - \sum_{n=1}^{\infty}[\zeta(2n + 1) - 1]\frac{x^{2n+1}}{2n + 1}.$$
Determine the range of convergence of each of these expressions.
5.9.15 Show that Catalan's constant, β(2), may be written as
$$\beta(2) = 2\sum_{k=1}^{\infty}(4k - 3)^{-2} - \frac{\pi^2}{8}.$$
Hint. $\pi^2 = 6\zeta(2)$.
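5.9.16 Derive the following expansions of the Debye functions
(a) $$\int_0^x\frac{t^n\,dt}{e^t - 1} = x^n\left[\frac{1}{n} - \frac{x}{2(n + 1)} + \sum_{k=1}^{\infty}\frac{B_{2k}x^{2k}}{(2k + n)(2k)!}\right], \quad |x| < 2\pi,\ n \ge 1,$$
(b) $$\int_x^{\infty}\frac{t^n\,dt}{e^t - 1} = \sum_{k=1}^{\infty}e^{-kx}\left[\frac{x^n}{k} + \frac{nx^{n-1}}{k^2} + \frac{n(n - 1)x^{n-2}}{k^3} + \cdots + \frac{n!}{k^{n+1}}\right], \quad x > 0,\ n \ge 1.$$
The complete integral (0, ∞) equals n! ζ(n + 1), Exercise 10.2.15.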
5.9.17 Derive the following Bernoulli number series for the Euler-Mascheroni constant.
$$\gamma = \sum_{s=1}^{n}s^{-1} - \ln n - \frac{1}{2n} + \sum_{k=1}^{N}\frac{B_{2k}}{(2k)n^{2k}}.$$
Hint. Apply the Euler-Maclaurin integration formula to $f(x) = x^{-1}$ over the
range [n, N].
5.9.18 (a) Show that the equation $\ln 2 = \sum_{s=1}^{\infty}(-1)^{s+1}s^{-1}$ (Exercise 5.4.1) may be
rewritten as
$$\ln 2 = \sum_{s=2}^{n}2^{-s}\zeta(s) + \sum_{p=1}^{\infty}(2p)^{-n-1}\left[1 - \frac{1}{2p}\right]^{-1}.$$
Hint. Take the terms in pairs.
(b) Calculate ln 2 to six significant figures.
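5.9.19 (a) Show that the equation $\pi/4 = \sum_{n=1}^{\infty}(-1)^{n+1}(2n - 1)^{-1}$ (Exercise 5.7.6) may
be rewritten as
$$\frac{\pi}{4} = 1 - 2\sum_{s=1}^{n}4^{-2s}\zeta(2s) - 2\sum_{p=1}^{\infty}(4p)^{-2n-2}\left[1 - \frac{1}{(4p)^2}\right]^{-1}.$$
(b) Calculate π/4 to six significant figures.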
5.9.20 Write a function subprogram ZETA(N) that will calculate the Riemann zeta
function for integer argument. Tabulate ζ(s) for s = 2, 3, 4, ..., 20. Check your
values against Table 5.3 and AMS-55, Chapter 23.
Hint. If you simply supply the function subprogram with the values of ζ(2), ζ(3),
and ζ(4), you avoid the more slowly converging series. Calculation time may be
further shortened by using Eq. 5.170.
5.9.21 Calculate the logarithm (base 10) of $|B_{2n}|$, n = 10, 20, ..., 100.
Hint. Program the zeta function as a function subprogram, Exercise 5.9.20.
Check values. $\log|B_{100}| = 78.45$
$\log|B_{200}| = 215.56$.
5.10 ASYMPTOTIC OR SEMICONVERGENT SERIES
Asymptotic series frequently occur in physics. In numerical computations
they are employed for the accurate computation of a variety of functions. We
consider here two types of integrals that lead to asymptotic series: first, an
integral of the form
$$I_1(x) = \int_x^{\infty}e^{-u}f(u)\,du,$$
where the variable x appears as the lower limit of an integral. Second, we consider
the form
$$I_2(x) = \int_0^{\infty}e^{-u}f\left(\frac{u}{x}\right)du,$$
with the function f to be expanded as a Taylor series (binomial series).
Asymptotic series often occur as solutions of differential equations. An example of
this appears in Section 11.6 as a solution of Bessel's equation.
Incomplete Gamma Function
The nature of an asymptotic series is perhaps best illustrated by a specific
example. Suppose that we have the exponential integral function1
$$\mathrm{Ei}(x) = \int_{-\infty}^{x}\frac{e^u}{u}\,du, \tag{5.174}$$
or
$$-\mathrm{Ei}(-x) = \int_x^{\infty}\frac{e^{-u}}{u}\,du = E_1(x), \tag{5.175}$$
to be evaluated for large values of x. Better still, let us take a generalization of
the incomplete factorial function (incomplete gamma function),2
$$I(x, p) = \int_x^{\infty}e^{-u}u^{-p}\,du = \Gamma(1 - p, x), \tag{5.176}$$
in which x and p are positive. Again, we seek to evaluate it for large values of x.
Integrating by parts, we obtain
$$I(x, p) = \frac{e^{-x}}{x^p} - p\int_x^{\infty}e^{-u}u^{-p-1}\,du. \tag{5.177}$$
1 This function occurs frequently in astrophysical problems involving gas with
a Maxwell-Boltzmann energy distribution.
2 See also Section 10.5.
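Continuing to integrate by parts, we develop the series
$$I(x, p) = e^{-x}\left(\frac{1}{x^p} - \frac{p}{x^{p+1}} + \frac{p(p + 1)}{x^{p+2}} - \cdots + (-1)^{n-1}\frac{(p + n - 2)!}{(p - 1)!\,x^{p+n-1}}\right)
+ (-1)^n\frac{(p + n - 1)!}{(p - 1)!}\int_x^{\infty}e^{-u}u^{-p-n}\,du. \tag{5.178}$$
This is a remarkable series. Checking the convergence by the d'Alembert
ratio test, we find
$$\lim_{n\to\infty}\frac{|u_{n+1}|}{|u_n|} = \lim_{n\to\infty}\frac{(p + n)!}{(p + n - 1)!}\cdot\frac{1}{x} = \lim_{n\to\infty}\frac{p + n}{x} = \infty \tag{5.179}$$
for all finite values of x. Therefore our series as an infinite series diverges
everywhere! Before discarding Eq. 5.178 as worthless, let us see how well a given
partial sum approximates the incomplete factorial function, I(x, p).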
$$I(x, p) - s_n(x, p) = (-1)^{n+1}\frac{(p + n)!}{(p - 1)!}\int_x^{\infty}e^{-u}u^{-p-n-1}\,du = R_n(x, p). \tag{5.180}$$
In absolute value
$$|I(x, p) - s_n(x, p)| \le \frac{(p + n)!}{(p - 1)!}\int_x^{\infty}e^{-u}u^{-p-n-1}\,du.$$
When we substitute u = v + x the integral becomes
$$\int_x^{\infty}e^{-u}u^{-p-n-1}\,du = \frac{e^{-x}}{x^{p+n+1}}\int_0^{\infty}e^{-v}\left(1 + \frac{v}{x}\right)^{-p-n-1}dv.$$
For large x the final integral approaches 1 and
$$|I(x, p) - s_n(x, p)| \approx \frac{(p + n)!}{(p - 1)!}\cdot\frac{e^{-x}}{x^{p+n+1}}. \tag{5.181}$$
This means that if we take x large enough, our partial sum $s_n$ is an arbitrarily
good approximation to the desired function I(x, p). Our divergent series (Eq.
5.178) therefore is perfectly good for computations. For this reason it is
sometimes called a semiconvergent series. Note that the power of x in the denominator
of the remainder (p + n + 1) is higher than the power of x in the last term
included in $s_n(x, p)$, (p + n).
FIG. 5.11 Partial sums of $e^xE_1(x)|_{x=5}$ ($s_n(x = 5)$ plotted against n; marked values 0.1741, 0.1704, and 0.1664)

Since the remainder $R_n(x, p)$ alternates in sign, the successive partial sums give
alternately upper and lower bounds for I(x, p). The behavior of the series (with
p = 1) as a function of the number of terms included is shown in Fig. 5.11. We
have
$$e^xE_1(x) = e^x\int_x^{\infty}\frac{e^{-u}}{u}\,du \approx \frac{1}{x} - \frac{1!}{x^2} + \frac{2!}{x^3} - \frac{3!}{x^4} + \cdots + (-1)^n\frac{n!}{x^{n+1}}, \tag{5.182}$$
which is evaluated at x = 5. For a given value of x the successive upper and lower
bounds given by the partial sums first converge and then diverge. The optimum
determination of $e^xE_1(x)$ is then given by the closest approach of the upper and
lower bounds, that is, between $s_4 = s_6 = 0.1664$ and $s_5 = 0.1741$ for x = 5.
Therefore
$$0.1664 < e^xE_1(x)|_{x=5} < 0.1741. \tag{5.183}$$
Actually, from tables,
$$e^xE_1(x)|_{x=5} = 0.1704, \tag{5.184}$$
within the limits established by our asymptotic expansion. Note carefully that
inclusion of additional terms in the series expansion beyond the optimum point
literally reduces the accuracy of the representation.
As x is increased, the spread between the lowest upper bound and the highest
lower bound will diminish. By taking x large enough, one may compute $e^xE_1(x)$
to any desired degree of accuracy. Other properties of $E_1(x)$ are derived and
discussed in Section 10.5.
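The converge-then-diverge behavior of the partial sums can be reproduced in a few lines. This is a minimal sketch; the function name is hypothetical, and the sums are indexed by the number of terms kept, which matches the values of s₄, s₅, and s₆ quoted in the text.

```python
def partial_sums(x, count):
    """Return [s_1, ..., s_count], where s_n keeps the first n terms of Eq. 5.182."""
    sums = []
    total = 0.0
    term = 1.0 / x                # first term, 0!/x
    for n in range(count):
        total += term
        sums.append(total)
        term *= -(n + 1) / x      # next term: (-1)^(n+1) (n+1)! / x^(n+2)
    return sums

s = partial_sums(5.0, 10)
for n, value in enumerate(s, start=1):
    print(n, round(value, 6))
```

For x = 5 the printed values close in on the tabulated 0.1704 up to about the fifth or sixth term and then move away from it again, as in Fig. 5.11.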
Cosine and Sine Integrals
Asymptotic series may also be developed from definite integrals, provided the
integrand has the required behavior. As an example, the cosine and sine integrals
(Section 10.5) are defined by
$$\mathrm{Ci}(x) = -\int_x^{\infty}\frac{\cos t}{t}\,dt, \tag{5.185}$$
$$\mathrm{si}(x) = -\int_x^{\infty}\frac{\sin t}{t}\,dt. \tag{5.186}$$
Combining these with regular trigonometric functions, we may define
$$f(x) = \mathrm{Ci}(x)\sin x - \mathrm{si}(x)\cos x = \int_0^{\infty}\frac{\sin y}{y + x}\,dy,$$
$$g(x) = -\mathrm{Ci}(x)\cos x - \mathrm{si}(x)\sin x = \int_0^{\infty}\frac{\cos y}{y + x}\,dy, \tag{5.187}$$
with the new variable y = t − x. Going to complex variables, Section 6.1,
we have
$$g(x) + if(x) = \int_0^{\infty}\frac{e^{iy}}{y + x}\,dy = \int_0^{\infty}\frac{ie^{-xu}}{1 + iu}\,du, \tag{5.188}$$
in which u = −iy/x. The limits of integration, 0 to ∞, rather than 0 to −i∞,
may be justified by Cauchy's theorem, Section 6.3. Rationalizing the
denominator and equating real part to real part and imaginary part to imaginary
part, we obtain
$$g(x) = \int_0^{\infty}\frac{ue^{-xu}}{1 + u^2}\,du, \qquad f(x) = \int_0^{\infty}\frac{e^{-xu}}{1 + u^2}\,du. \tag{5.189}$$
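For convergence of the integrals we must require that $\Re(x) > 0$.3
Now, to develop the asymptotic expansions, let v = xu and expand the factor
$[1 + (v/x)^2]^{-1}$ by the binomial theorem.4 We have
$$f(x) \approx \frac{1}{x}\int_0^{\infty}e^{-v}\sum_{n=0}^{\infty}(-1)^n\frac{v^{2n}}{x^{2n}}\,dv = \frac{1}{x}\sum_{n=0}^{\infty}(-1)^n\frac{(2n)!}{x^{2n}},$$
$$g(x) \approx \frac{1}{x^2}\int_0^{\infty}e^{-v}\sum_{n=0}^{\infty}(-1)^n\frac{v^{2n+1}}{x^{2n}}\,dv = \frac{1}{x^2}\sum_{n=0}^{\infty}(-1)^n\frac{(2n + 1)!}{x^{2n}}. \tag{5.190}$$
3 $\Re(x)$ = real part of (complex) x (compare Section 6.1).
4 This step is valid for v < x. The contributions from v > x will be negligible
(for large x) because of the negative exponential. It is because the binomial
expansion does not converge for v > x that our final series is asymptotic rather
than convergent.
From Eqs. 5.187 and 5.190
$$\mathrm{Ci}(x) \approx \frac{\sin x}{x}\sum_{n=0}^{N}(-1)^n\frac{(2n)!}{x^{2n}} - \frac{\cos x}{x^2}\sum_{n=0}^{N}(-1)^n\frac{(2n + 1)!}{x^{2n}},$$
$$\mathrm{si}(x) \approx -\frac{\cos x}{x}\sum_{n=0}^{N}(-1)^n\frac{(2n)!}{x^{2n}} - \frac{\sin x}{x^2}\sum_{n=0}^{N}(-1)^n\frac{(2n + 1)!}{x^{2n}}, \tag{5.191}$$
the desired asymptotic expansions.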
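The expansion of f(x) in Eq. 5.190 can be compared with a direct numerical evaluation of the integral in Eq. 5.189. The sketch below is illustrative only: the function names, the cutoff of the integration range at u = 5, and the Simpson step count are assumptions chosen for x = 10, where the neglected part of the integrand is of order e⁻⁵⁰.

```python
import math

def f_asymptotic(x, terms):
    """First `terms` terms of (1/x) * sum_n (-1)^n (2n)! / x^(2n), from Eq. 5.190."""
    return sum((-1) ** n * math.factorial(2 * n) / x ** (2 * n) for n in range(terms)) / x

def f_quadrature(x, upper=5.0, steps=20000):
    """Composite Simpson estimate of the integral of exp(-x u)/(1 + u^2) over [0, upper]."""
    h = upper / steps             # steps must be even for Simpson's rule
    integrand = lambda u: math.exp(-x * u) / (1 + u * u)
    total = integrand(0.0) + integrand(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3

reference = f_quadrature(10.0)
print(abs(f_asymptotic(10.0, 5) - reference))   # near the optimum truncation
print(abs(f_asymptotic(10.0, 15) - reference))  # too many terms: the error grows
```

Because the remainder after N terms is bounded by the first omitted term, five terms at x = 10 are accurate to better than (10)!/10¹¹, while fifteen terms are far worse.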
This technique of expanding the integrand of a definite integral and
integrating term by term is applied in Section 11.6 to develop an asymptotic expansion
of the modified Bessel function $K_\nu$ and in Section 13.6 for expansions of the
two confluent hypergeometric functions M(a, c; x) and U(a, c; x).
Definition of Asymptotic Series
The behavior of these series (Eqs. 5.178 and 5.191) is consistent with the
defining properties of an asymptotic series.5 Following Poincaré, we take6
$$x^nR_n(x) = x^n[f(x) - s_n(x)], \tag{5.192}$$
where
$$s_n(x) = a_0 + \frac{a_1}{x} + \frac{a_2}{x^2} + \cdots + \frac{a_n}{x^n}. \tag{5.193}$$
The asymptotic expansion of f(x) has the properties that
$$\lim_{x\to\infty}x^nR_n(x) = 0, \quad \text{for fixed } n, \tag{5.194}$$
and
$$\lim_{n\to\infty}x^nR_n(x) = \infty, \quad \text{for fixed } x.^{7} \tag{5.195}$$
For power series, as assumed in the form of $s_n(x)$, $R_n(x) \sim x^{-n-1}$. With conditions
(5.194) and (5.195) satisfied, we write
$$f(x) \approx \sum_{n=0}^{\infty}a_nx^{-n}. \tag{5.196}$$
Note the use of ≈ in place of =. The function f(x) is equal to the series only
in the limit as $x \to \infty$.
5 It is not necessary that the asymptotic series be a power series. The required
property is that the remainder Rn(x) be of higher order than the last term
kept—as in Eq. 5.194.
6Poincare's definition allows (or neglects) exponentially decreasing functions.
The refinement of Poincare's definition is of considerable importance for the
advanced theory of asymptotic expansions, particularly for extensions into
the complex plane. However, for purposes of an introductory treatment and
especially for numerical computation with x real and positive, Poincare's
approach is perfectly satisfactory.
7 This excludes convergent series of inverse powers of x. Some writers feel that
this distribution, this exclusion, is artificial and unnecessary.
Asymptotic expansions of two functions may be multiplied together and the
result will be an asymptotic expansion of the product of the two functions.
The asymptotic expansion of a given function f(t) may be integrated term by
term (just as in a uniformly convergent series of continuous functions) from
$x \le t < \infty$ and the result will be an asymptotic expansion of $\int_x^{\infty}f(t)\,dt$.
Term-by-term differentiation, however, is valid only under very special conditions.
Some functions do not possess an asymptotic expansion; $e^x$ is an example
of such a function. However, if a function has an asymptotic expansion, it has
only one. The correspondence is not one to one; many functions may have
the same asymptotic expansion.
One of the most useful and powerful methods of generating asymptotic
expansions, the method of steepest descents, will be developed in Section 7.4.
Applications include the derivation of Stirling's formula for the (complete)
factorial function (Section 10.3) and the asymptotic forms of the various Bessel
functions (Section 11.6). Asymptotic series occur fairly often in mathematical
physics. One of the earliest and still important approximation treatments of
quantum mechanics, the WKB expansion, is an asymptotic series.
Applications to Computing
Asymptotic series are frequently used in the computations of functions by
modern high-speed electronic computers. This is the case for the Neumann
functions $N_0(x)$ and $N_1(x)$, and the modified Bessel functions $I_n(x)$ and $K_n(x)$.
The relevant asymptotic series are given as Eqs. 11.127, 11.134, and 11.136.
A further discussion of these functions is included in Section 11.6. The
asymptotic series for the exponential integral, Eq. 5.182, for the Fresnel integrals,
Exercise 5.10.2, and for the Gauss error function, Exercise 5.10.4, are used for
the evaluation of these integrals for large values of the argument. How large
the argument should be depends on what accuracy is required. In actual practice,
a finite portion of the asymptotic series is telescoped by using Chebyshev
techniques to optimize the accuracy as discussed in Section 13.4.
EXERCISES
5.10.1 Stirling's formula for the logarithm of the factorial function is
$$\ln(x!) = \frac{1}{2}\ln 2\pi + \left(x + \frac{1}{2}\right)\ln x - x + \sum_{n=1}^{\infty}\frac{B_{2n}}{(2n)(2n - 1)}x^{1-2n}.$$
The $B_{2n}$ are the Bernoulli numbers (Section 5.9). Show that Stirling's formula is
an asymptotic expansion.
5.10.2 Integrating by parts, develop asymptotic expansions of the Fresnel integrals.
(a) $C(x) = \int_0^x\cos\frac{\pi u^2}{2}\,du$
(b) $S(x) = \int_0^x\sin\frac{\pi u^2}{2}\,du$.
These integrals appear in the analysis of a knife-edge diffraction pattern.
5.10.3 Rederive the asymptotic expansions of Ci(x) and si(x) by repeated integration
by parts.
Hint. $\mathrm{Ci}(x) + i\,\mathrm{si}(x) = -\int_x^{\infty}\frac{e^{it}}{t}\,dt$.
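5.10.4 Derive the asymptotic expansion of the Gauss error function
$$\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt
\approx 1 - \frac{e^{-x^2}}{\sqrt{\pi}\,x}\left(1 - \frac{1}{2x^2} + \frac{1\cdot 3}{2^2x^4} - \frac{1\cdot 3\cdot 5}{2^3x^6} + \cdots + (-1)^n\frac{(2n - 1)!!}{2^nx^{2n}}\right).$$
Hint. $\mathrm{erf}(x) = 1 - \mathrm{erfc}(x) = 1 - \frac{2}{\sqrt{\pi}}\int_x^{\infty}e^{-t^2}\,dt$.
Normalized so that erf(∞) = 1, this function plays an important role in
probability theory. It may be expressed in terms of the Fresnel integrals (Exercise 5.10.2),
the incomplete gamma functions (Section 10.5), and the confluent hypergeometric
functions (Section 13.6).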
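5.10.5 The asymptotic expressions for the various Bessel functions, Section 11.6,
contain the series
$$P_\nu(z) \sim 1 + \sum_{n=1}^{\infty}(-1)^n\frac{\prod_{s=1}^{2n}[4\nu^2 - (2s - 1)^2]}{(2n)!\,(8z)^{2n}},$$
$$Q_\nu(z) \sim \sum_{n=1}^{\infty}(-1)^{n+1}\frac{\prod_{s=1}^{2n-1}[4\nu^2 - (2s - 1)^2]}{(2n - 1)!\,(8z)^{2n-1}}.$$
Show that these two series are indeed asymptotic series.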
5.10.6 For x > 1
$$\frac{1}{1 + x} = \sum_{n=0}^{\infty}(-1)^n\frac{1}{x^{n+1}}.$$
Test this series to see if it is an asymptotic series.
5.10.7 In Exercise 5.9.17 the Euler-Mascheroni constant γ is expressed with a Bernoulli
number series:
$$\gamma = \sum_{s=1}^{n}s^{-1} - \ln n - \frac{1}{2n} + \sum_{k=1}^{N}\frac{B_{2k}}{(2k)n^{2k}}.$$
Show that this is an asymptotic series.
5.10.8 Develop an asymptotic series for
$$\int_0^{\infty}e^{-xv}(1 + v^2)^{-2}\,dv.$$
Take x to be real and positive.
5.10.9 Calculate partial sums of $e^xE_1(x)$ for x = 5, 10, and 15 to exhibit the behavior
shown in Fig. 5.11. Determine the width of the throat for x = 10 and 15
analogous to Eq. 5.183.
ANS. Throat width: n = 10, 0.000051
n = 15, 0.0000002.
5.10.10 The knife-edge diffraction pattern is described by
$$I = 0.5\,I_0\{[C(u_0) + 0.5]^2 + [S(u_0) + 0.5]^2\},$$
where $C(u_0)$ and $S(u_0)$ are the Fresnel integrals. Here $I_0$ is the incident intensity
and I the diffracted intensity. $u_0$ is proportional to the distance away from the
knife edge (measured at right angles to the incident beam). Calculate $I/I_0$ for
$u_0$ varying from −1.0 to +4.0 in steps of 0.1. Tabulate your results and, if a
plotting routine is available, plot them.
Check value. $u_0 = 1.0$, $I/I_0 = 1.259226$.
5.10.11 The Euler-Maclaurin integration formula of Section 5.9 provides a way of
calculating the Euler-Mascheroni constant γ to high accuracy. Using f(x) = 1/x
in Eq. 5.168b (with interval [1, n]) and the definition of γ, Eq. 5.28, we obtain
$$\gamma = \sum_{s=1}^{n}s^{-1} - \ln n - \frac{1}{2n} + \sum_{k=1}^{N}\frac{B_{2k}}{(2k)n^{2k}}.$$
Using double precision arithmetic, calculate γ.
Note. Knuth, D. E., "Euler's constant to 1271 places," Math. Computation 16,
275 (1962). An even more precise calculation appears in Exercise 10.5.16.
ANS. For n = 1000,
γ = 0.57721 56649 01.
5.11 INFINITE PRODUCTS
Consider a succession of positive factors $f_1 \cdot f_2 \cdot f_3 \cdot f_4 \cdots f_n$, ($f_i > 0$). Using
capital pi to indicate product, as capital sigma indicates a sum, we have
$$f_1 \cdot f_2 \cdot f_3 \cdots f_n = \prod_{i=1}^{n}f_i. \tag{5.197}$$
We define $p_n$, a partial product, in analogy with $s_n$ the partial sum,
$$p_n = \prod_{i=1}^{n}f_i \tag{5.198}$$
and then investigate the limit
$$\lim_{n\to\infty}p_n = P. \tag{5.199}$$
If P is finite (but not zero), we say the infinite product is convergent. If P is
infinite or zero, the infinite product is labeled divergent.
Since the product will diverge to infinity if
$$\lim_{n\to\infty}f_n > 1 \tag{5.200}$$
or to zero for
$$\lim_{n\to\infty}f_n < 1, \quad (\text{and} > 0), \tag{5.201}$$
it is convenient to write our infinite product as
$$\prod_{n=1}^{\infty}(1 + a_n).$$
The condition $a_n \to 0$ is then a necessary (but not sufficient) condition for
convergence.
The infinite product may be related to an infinite series by the obvious method
of taking the logarithm
$$\ln\prod_{n=1}^{\infty}(1 + a_n) = \sum_{n=1}^{\infty}\ln(1 + a_n). \tag{5.202}$$
A more useful relationship is stated by the following theorem.
Convergence of Infinite Product
If $0 \le a_n < 1$, the infinite products $\prod_{n=1}^{\infty}(1 + a_n)$ and $\prod_{n=1}^{\infty}(1 - a_n)$ converge
if $\sum_{n=1}^{\infty}a_n$ converges and diverge if $\sum_{n=1}^{\infty}a_n$ diverges.
Considering the term $1 + a_n$, we see from Eq. 5.90
$$1 + a_n \le e^{a_n}. \tag{5.203}$$
Therefore for the partial product $p_n$
$$p_n \le e^{s_n}, \tag{5.204}$$
and, letting $n \to \infty$,
$$\prod_{n=1}^{\infty}(1 + a_n) \le \exp\sum_{n=1}^{\infty}a_n, \tag{5.205}$$
thus establishing an upper bound for the infinite product.
To develop a lower bound, we note that
$$p_n = 1 + \sum_{i=1}^{n}a_i + \sum_{i=1}^{n}\sum_{j=1}^{n}a_ia_j + \cdots > s_n, \tag{5.206}$$
since $a_i \ge 0$. Hence
$$\prod_{n=1}^{\infty}(1 + a_n) \ge \sum_{n=1}^{\infty}a_n. \tag{5.207}$$
If the infinite sum remains finite, the infinite product will also. If the infinite
sum diverges, so will the infinite product.
The case of $\prod(1 - a_n)$ is complicated by the negative signs, but a proof that
depends on the foregoing proof may be developed by noting that for $a_n < \tfrac{1}{2}$
(remember $a_n \to 0$ for convergence)
$$(1 - a_n) \le (1 + a_n)^{-1}$$
and
$$(1 - a_n) \ge (1 + 2a_n)^{-1}. \tag{5.208}$$
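The two bounds of Eqs. 5.204 and 5.206 can be checked on a concrete sequence. The following sketch takes aₙ = 1/n² purely as an illustrative choice (for n ≥ 2 this satisfies 0 ≤ aₙ < 1, and the inequalities used hold for a₁ = 1 as well); the function name is hypothetical.

```python
import math

def sum_product_and_bound(a_terms):
    """Return (s_n, p_n, exp(s_n)) for the factors 1 + a_i, as in Eqs. 5.204 and 5.206."""
    partial_sum = sum(a_terms)
    partial_product = math.prod(1 + a for a in a_terms)
    return partial_sum, partial_product, math.exp(partial_sum)

a = [1.0 / n ** 2 for n in range(1, 2001)]
s_n, p_n, upper = sum_product_and_bound(a)
print(s_n, p_n, upper)   # expect s_n < p_n < exp(s_n)
```

Since the sum stays finite as more terms are added, the partial product is squeezed between two finite quantities, which is the content of the theorem.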
Sine, Cosine, and Gamma Functions
The reader will recognize that an nth-order polynomial Pn(x) with n real
roots may be written as a product of n factors:
$$P_n(x) = (x - x_1)(x - x_2)\cdots(x - x_n) = \prod_{i=1}^{n}(x - x_i). \tag{5.209}$$
In much the same way we may expect that a function with an infinite number
of roots may be written as an infinite product, one factor for each root. This is
indeed the case for the trigonometric functions. We have two very useful infinite
product representations,
$$\sin x = x\prod_{n=1}^{\infty}\left(1 - \frac{x^2}{n^2\pi^2}\right), \tag{5.210}$$
$$\cos x = \prod_{n=1}^{\infty}\left[1 - \frac{4x^2}{(2n - 1)^2\pi^2}\right]. \tag{5.211}$$
The most convenient and perhaps most elegant derivation of these two
expressions is by the use of complex variables.1 By our theorem of convergence,
Eqs. 5.210 and 5.211 are convergent for all finite values of x. Specifically, for
the infinite product for sin x, $a_n = x^2/n^2\pi^2$,
$$\sum_{n=1}^{\infty}a_n = \frac{x^2}{\pi^2}\sum_{n=1}^{\infty}n^{-2} = \frac{x^2}{\pi^2}\zeta(2) = \frac{x^2}{6} \tag{5.212}$$
by Exercise 5.9.6. The series corresponding to Eq. 5.211 behaves in a similar
manner.
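Equation 5.210 leads to two interesting results. First, if we set x = π/2,
we obtain
$$1 = \frac{\pi}{2}\prod_{n=1}^{\infty}\left[1 - \frac{1}{(2n)^2}\right] = \frac{\pi}{2}\prod_{n=1}^{\infty}\left[\frac{(2n)^2 - 1}{(2n)^2}\right]. \tag{5.213}$$
Solving for π/2, we have
$$\frac{\pi}{2} = \prod_{n=1}^{\infty}\left[\frac{(2n)^2}{(2n - 1)(2n + 1)}\right] = \frac{2\cdot 2}{1\cdot 3}\cdot\frac{4\cdot 4}{3\cdot 5}\cdot\frac{6\cdot 6}{5\cdot 7}\cdots, \tag{5.214}$$
which is Wallis's famous formula for π/2.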
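Wallis's product, Eq. 5.214, is simple to evaluate and shows how slowly such products converge. A minimal sketch follows; the function name is hypothetical.

```python
import math

def wallis_partial(factors):
    """Partial product of Eq. 5.214 using the first `factors` factors."""
    product = 1.0
    for n in range(1, factors + 1):
        product *= (2 * n) ** 2 / ((2 * n - 1) * (2 * n + 1))
    return product

for count in (10, 100, 1000):
    print(count, wallis_partial(count), math.pi / 2 - wallis_partial(count))
```

The error falls off only like 1/(4N) times π/2, so each additional decimal place costs roughly ten times as many factors.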
The second result involves the gamma or factorial function (Section 10.1).
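One definition of the gamma function is
$$\Gamma(x) = \left[xe^{\gamma x}\prod_{r=1}^{\infty}\left(1 + \frac{x}{r}\right)e^{-x/r}\right]^{-1}, \tag{5.215}$$
where γ is the usual Euler-Mascheroni constant (compare Section 5.2). If we
take the product of Γ(x) and Γ(−x), Eq. 5.215 leads to
$$\Gamma(x)\Gamma(-x) = -\left[xe^{\gamma x}\prod_{r=1}^{\infty}\left(1 + \frac{x}{r}\right)e^{-x/r}\;xe^{-\gamma x}\prod_{r=1}^{\infty}\left(1 - \frac{x}{r}\right)e^{x/r}\right]^{-1}
= -\left[x^2\prod_{r=1}^{\infty}\left(1 - \frac{x^2}{r^2}\right)\right]^{-1}. \tag{5.216}$$
Using Eq. 5.210 with x replaced by πx, we obtain
$$\Gamma(x)\Gamma(-x) = -\frac{\pi}{x\sin\pi x}. \tag{5.217}$$
Anticipating a recurrence relation developed in Section 10.1, we have $-x\Gamma(-x)
= \Gamma(1 - x)$. Eq. 5.217 may be written as
$$\Gamma(x)\Gamma(1 - x) = \frac{\pi}{\sin\pi x}. \tag{5.218}$$
This will be useful in treating the gamma function (Chapter 10).
1 The derivation appears in Mathematical Methods for Physicists, 1st and
2nd eds. (Section 7.3). As an alternative Eq. 5.210 can be obtained from the
Weierstrass factorization theorem.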
Strictly speaking, we should check the range of x for which Eq. 5.215 is
convergent. Clearly, individual factors will vanish for x = 0, — 1, —2, .... The
proof that the infinite product converges for all other (finite) values of x is left
as Exercise 5.11.9.
These infinite products have a variety of uses in analytical mathematics.
However, because of rather slow convergence, they are not suitable for precise
numerical work.
EXERCISES
5.11.1 Using
$$\ln\prod_{n=1}^{\infty}(1 \pm a_n) = \sum_{n=1}^{\infty}\ln(1 \pm a_n)$$
and the Maclaurin expansion of $\ln(1 \pm a_n)$, show that the infinite product
$\prod_{n=1}^{\infty}(1 \pm a_n)$ converges or diverges with the infinite series $\sum_{n=1}^{\infty}a_n$.
5.11.2 An infinite product appears in the form
$$\prod_{n=1}^{\infty}\left(\frac{1 + a/n}{1 + b/n}\right),$$
where a and b are constants. Show that this infinite product converges only if
a = b.
5.11.3 Show that the infinite product representations of sin x and cos x are consistent
with the identity 2 sin x cos x = sin2x.
5.11.4 Determine the limit to which
»=2
converges.
5.11.5 Show that
00 ~
~ 1 -
5.11.6 Prove that
5 - ф
ИМИ
5.11.7 Using the infinite product representations of sin x, show that
$$x\cot x = 1 - 2\sum_{m,n=1}^{\infty}\left(\frac{x}{n\pi}\right)^{2m},$$
hence that the Bernoulli number
$$B_{2n} = (-1)^{n-1}\frac{2(2n)!}{(2\pi)^{2n}}\zeta(2n).$$
5.11.8 Verify the Euler identity
5.11.9 Show that $\prod_{r=1}^{\infty}(1 + x/r)e^{-x/r}$ converges for all finite x (except for the zeros of
1 + x/r).
Hint. Write the nth factor as $1 + a_n$.
5.11.10 Calculate cos x from its infinite product representation, Eq. 5.211, using (a) 10,
(b) 100, and (c) 1000 factors in the product. Calculate the absolute error. Note
how slowly the partial products converge, making the infinite product quite
unsuitable for precise numerical work.
ANS. For 1000 factors cos π = −1.00051.
REFERENCES
Bender, C. M., and S. Orszag, Advanced Mathematical Methods for Scientists and
Engineers. New York: McGraw-Hill (1978).
Particularly recommended for methods of accelerating convergence.
Davis, H. T., Tables of Higher Mathematical Functions. Bloomington, Ind.: Principia Press
(1935).
Volume II contains extensive information on Bernoulli numbers and polynomials.
Dingle, R. B., Asymptotic Expansions: Their Derivation and Interpretation. London and
New York: Academic Press (1973).
Gradshteyn, I. S., and I. N. Ryzhik, Table of Integrals, Series and Products. Corrected and
enlarged edition prepared by Alan Jeffrey. New York: Academic Press (1980).
Hansen, E., A Table of Series and Products. Englewood Cliffs, N.J.: Prentice-Hall, Inc.
(1975).
A tremendous compilation of series and products.
Hardy, G. H., Divergent Series. Oxford: Clarendon Press (1956).
A standard, comprehensive work on methods of treating divergent series. Hardy
includes an instructive account of the gradual development of the concepts of
convergence and divergence.
Knopp, Konrad, Theory and Application of Infinite Series. London: Blackie and Son
(reprinted 1946).
This is a thorough, comprehensive, and authoritative work, which covers infinite series
and products. Proofs of almost all of the statements not proved in Chapter 5 will be
found in this book.
Mangulis, V., Handbook of Series for Scientists and Engineers. New York and London:
Academic Press (1965).
A most convenient and useful collection of series. Includes algebraic functions, Fourier
series, and series of the special functions: Bessel, Legendre, and so on.
Olver, F. W. J., Asymptotics and Special Functions. New York: Academic Press (1974).
A detailed, readable development of asymptotic theory. Considerable attention is paid
to error bounds for use in computation.
Rainville, E. D., Infinite Series. New York: Macmillan Co. (1967).
A readable and useful account of series, both constants and functions.
Sokolnikoff, I. S., and R. M. Redheffer, Mathematics of Physics and Modern
Engineering, 2nd ed. New York: McGraw-Hill (1966).
A long Chapter 2 (101 pages) presents infinite series in a thorough but very readable
form. Extensions to the solutions of differential equations, to complex series, and to
Fourier series are included.
The topic of infinite series is treated in many texts on advanced calculus.
6 FUNCTIONS OF A
COMPLEX
VARIABLE I
ANALYTIC PROPERTIES
MAPPING
The imaginary numbers are a wonderful
flight of God's spirit; they are almost an
amphibian between being and not being.
GOTTFRIED WILHELM VON LEIBNITZ, 1702
We turn now to a study of functions of a complex variable. In this area we
develop some of the most powerful and widely useful tools in all of mathematical
analysis. To indicate, at least partly, why complex variables are important, we
mention briefly several areas of application.
1. For many pairs of functions u and v, both u and v satisfy Laplace's
equation
$$\nabla^2\psi = \frac{\partial^2\psi(x, y)}{\partial x^2} + \frac{\partial^2\psi(x, y)}{\partial y^2} = 0.$$
Hence either u or v may be used to describe a two-dimensional electrostatic
potential. The other function, which gives a family of curves orthogonal to those
of the first function, may then be used to describe the electric field E. A similar
situation holds for the hydrodynamics of an ideal fluid in irrotational motion.
The function u might describe the velocity potential, whereas the function v
would then be the stream function.
In many cases in which the functions u and v are unknown, mapping or
transforming in the complex plane permits us to create a coordinate system
tailored to the particular problem.
2. In Chapter 8 we shall see that the second-order differential equations of
interest in physics may be solved by power series. The same power series may
be used in the complex plane to replace x by the complex variable z. The
dependence of the solution/(z) at a given z0 on the behavior of/(z) elsewhere
gives us greater insight into the behavior of our solution and a powerful tool
(analytic continuation) for extending the region in which the solution is valid.
3. The change of a parameter k from real to imaginary, k → ik, transforms
the Helmholtz equation into the diffusion equation. The same change transforms
the Helmholtz equation solutions (Bessel and spherical Bessel functions) into
the diffusion equation solutions (modified Bessel and modified spherical Bessel
functions).
4. Integrals in the complex plane have a wide variety of useful applications.
a. Evaluating definite integrals.
b. Inverting power series.
c. Forming infinite products.
d. Obtaining solutions of differential equations for large
values of the variable (asymptotic solutions).
e. Investigating the stability of potentially oscillatory
systems.
f. Inverting integral transforms.
5. Many physical quantities that were originally real become complex as
a simple physical theory is made more general. The real index of refraction
of light becomes a complex quantity when absorption is included. The real
energy associated with a nuclear energy level becomes complex when the finite
lifetime of the energy level is considered.
6.1 COMPLEX ALGEBRA
A complex number is nothing more than an ordered pair of two ordinary
numbers, (a, b) or a + ib, in which i is $(-1)^{1/2}$. Similarly, a complex variable is
an ordered pair of two real variables,
$$z = (x, y) = x + iy. \tag{6.1}$$
The reader will see that the ordering is significant, that in general a + ib is not
equal to b + ia and x + iy is not equal to y + ix.1
It is frequently convenient to employ a graphical representation of the
complex variable. By plotting x, the real part of z, as the abscissa and y, the
imaginary part of z, as the ordinate, we have the complex plane or Argand
plane shown in Fig. 6.1. If we assign specific values to x and y, then z corresponds
to a point (x, y) in the plane. In terms of the ordering mentioned before, it is
obvious that the point (x, y) does not coincide with the point (y, x) except
for the special case of x = y.
All our complex variable analyses can be developed in terms of ordered
pairs2 of numbers (a, b), variables (x, y), and functions (u(x, y), v(x, y)). The i is
not necessary but it is convenient. It serves to keep pairs in order, somewhat
like the unit vectors of Chapter 1.
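The ordered-pair view can be made concrete in a few lines. The sketch below is an illustration, not the text's own construction: the function names are hypothetical, and the multiplication rule (a, b)(c, d) = (ac − bd, ad + bc) is the standard one that follows from i² = −1.

```python
def add(z, w):
    """Add two complex numbers stored as ordered pairs (real, imaginary)."""
    return (z[0] + w[0], z[1] + w[1])

def multiply(z, w):
    """Multiply ordered pairs using (a, b)(c, d) = (ac - bd, ad + bc)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

z = (1.0, 2.0)    # 1 + 2i
w = (3.0, -1.0)   # 3 - i
print(add(z, w), multiply(z, w))
print(complex(*z) * complex(*w))   # the built-in complex type for comparison
```

No symbol i appears anywhere in the pair arithmetic; the ordering of the two slots carries the same information.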
1 The algebra of complex numbers, a + ib, is isomorphic with that of matrices
of the form
$$\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$$
(compare Exercise 4.2.4).
2 This is how a computer would do complex arithmetic.

FIG. 6.1 Complex plane (Argand diagram): the point (x, y) plotted with x as abscissa and y as ordinate
In Chapter 1 the points in the xy-plane are identified with the two-dimen-
two-dimensional displacement vector r = ix + jy. As a result, two-dimensional vector
analogs can be developed for much of our complex analysis. Exercise 6.1.2 is
one simple example; Cauchy's theorem, Section 6.3, is another.
Further, from Fig. 6.1 we may write
$$x = r\cos\theta, \qquad y = r\sin\theta \tag{6.2}$$
and
$$z = r(\cos\theta + i\sin\theta). \tag{6.3}$$
Using a result that is suggested (but not rigorously proved)³ by Section 5.6,
we have the very useful polar representation
$$z = re^{i\theta}. \tag{6.4}$$
In this representation r is called the modulus or magnitude of z ($r = |z|$) and
the angle $\theta$ is labeled the argument or phase of z.
The choice of polar representation, Eq. 6.4, or cartesian representation,
Eq. 6.1, is a matter of convenience. Addition and subtraction of complex
variables are easier in the cartesian representation. Multiplication, division,
powers, and roots are easier to handle in polar form.
Analytically or graphically, using the vector analogy, we may show that the
modulus of the sum of two complex numbers is no greater than the sum of the
moduli and no less than the difference, Exercise 6.1.3,
$$|z_1| - |z_2| \le |z_1 + z_2| \le |z_1| + |z_2|. \tag{6.5}$$
Because of the vector analogy, these are called the triangle inequalities.
Using the polar form, Eq. 6.4, we find that the magnitude of a product is the
product of the magnitudes,
$$|z_1 z_2| = |z_1|\,|z_2|. \tag{6.6}$$
Also,
$$\arg(z_1 z_2) = \arg z_1 + \arg z_2. \tag{6.7}$$
³ Strictly speaking, Chapter 5 was limited to real variables. However, we can
define $e^z$ as $\sum_{n=0}^{\infty} z^n/n!$ for complex z. The development of power-series
expansions for complex functions is taken up in Section 6.5 (Laurent expansion).
Alternatively, $e^z$ can be defined by Eqs. 6.3 and 6.4.
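The polar relations above can be tried numerically. The following is a minimal sketch in Python using the standard `cmath` module; the sample values `z1` and `z2` are arbitrary choices made for this illustration and do not come from the text. Note that the phase relation of Eq. 6.7 is checked only modulo $2\pi$, because the library returns phases restricted to a single interval.

```python
import cmath
import math

# Two arbitrary sample values (chosen for illustration only).
z1 = complex(1.0, 2.0)
z2 = complex(-0.5, 1.5)

# Cartesian -> polar: modulus r = |z| and phase theta = arg z.
r1, theta1 = cmath.polar(z1)
r2, theta2 = cmath.polar(z2)

# Polar -> Cartesian, z = r e^{i theta} (Eq. 6.4).
z1_back = cmath.rect(r1, theta1)

product = z1 * z2
r12, theta12 = cmath.polar(product)

# Eq. 6.6: |z1 z2| = |z1| |z2|.
modulus_ok = math.isclose(r12, r1 * r2)

# Eq. 6.7 is compared modulo 2*pi, since cmath.polar returns one phase per number.
phase_diff = (theta12 - (theta1 + theta2)) % (2 * math.pi)
phase_ok = min(phase_diff, 2 * math.pi - phase_diff) < 1e-12

# Triangle inequalities, Eq. 6.5.
triangle_ok = abs(r1 - r2) <= abs(z1 + z2) <= r1 + r2
```

Addition is done directly on the cartesian components (`z1 + z2`), while the product is most easily understood through the moduli and phases, in line with the remark on the two representations.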
From our complex variable z complex functions f(z) or w(z) may be
constructed. These complex functions may then be resolved into real and imaginary
parts
$$w(z) = u(x, y) + iv(x, y), \tag{6.8}$$
in which the separate functions u(x, y) and v(x, y) are pure real. For example,
if $f(z) = z^2$, we have
$$f(z) = (x + iy)^2 = (x^2 - y^2) + i2xy.$$
The real part of a function f(z) will be labeled $\Re f(z)$, whereas the imaginary
part will be labeled $\Im f(z)$. In Eq. 6.8
$$\Re w(z) = u(x, y), \qquad \Im w(z) = v(x, y).$$
The relationship between the independent variable z and the dependent
variable w is perhaps best pictured as a mapping operation. A given z = x + iy
means a given point in the z-plane. The complex value of w(z) is then a point
in the w-plane. Points in the z-plane map into points in the w-plane and curves
in the z-plane map into curves in the w-plane as indicated in Fig. 6.2.
FIG. 6.2 The function w(z) = u(x, y) + iv(x, y) maps points in the xy-plane (z-plane) into points in
the uv-plane (w-plane).
Complex Conjugation
In all these steps (complex number, variable, and function) the operation of
replacing i by -i is called "taking the complex conjugate." The complex
conjugate of z is denoted by z*, where⁴
$$z^* = x - iy. \tag{6.9}$$
FIG. 6.3 Complex conjugate points
The complex variable z and its complex conjugate z* are mirror images of each
other reflected in the x-axis, that is, inversion of the y-axis (compare Fig. 6.3).
The product zz* leads to
$$zz^* = (x + iy)(x - iy) = x^2 + y^2 = r^2. \tag{6.10}$$
Hence
$$(zz^*)^{1/2} = |z|,$$
the magnitude of z.
Functions of a Complex Variable
All the elementary functions of real variables may be extended into the
complex plane—replacing the real variable x by the complex variable z. This is
an example of the analytic continuation mentioned in Section 6.5. The extremely
important relation, Eq. 6.4, is an illustration of this. Moving into the complex
plane opens up new opportunities for analysis.
EXAMPLE 6.1.1 De Moivre's Formula
If Eq. 6.3 is raised to the nth power, we have
$$e^{in\theta} = (\cos\theta + i\sin\theta)^n. \tag{6.11}$$
Expanding the exponential now with argument $n\theta$, we obtain
$$\cos n\theta + i\sin n\theta = (\cos\theta + i\sin\theta)^n. \tag{6.12}$$
This is De Moivre's formula.
Now if the right-hand side of Eq. 6.12 is expanded by the binomial theorem,
we obtain $\cos n\theta$ as a series of powers of $\cos\theta$ and $\sin\theta$, Exercise 6.1.6.
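As a numerical illustration of De Moivre's formula and of the binomial expansion it leads to, here is a short Python sketch. The function names `de_moivre_sides` and `cos_n_theta_binomial` and the sample values $\theta = 0.7$, $n = 5$ are assumptions made for this example only.

```python
import math

def de_moivre_sides(theta, n):
    """Return both sides of Eq. 6.12 for a real angle theta and integer n."""
    lhs = complex(math.cos(n * theta), math.sin(n * theta))
    rhs = complex(math.cos(theta), math.sin(theta)) ** n
    return lhs, rhs

def cos_n_theta_binomial(theta, n):
    """Real part of the binomial expansion of (cos(theta) + i sin(theta))**n.

    Only even powers of i*sin(theta) are real, and i**(2k) = (-1)**k.
    """
    c, s = math.cos(theta), math.sin(theta)
    return sum(
        (-1) ** k * math.comb(n, 2 * k) * c ** (n - 2 * k) * s ** (2 * k)
        for k in range(n // 2 + 1)
    )

lhs, rhs = de_moivre_sides(0.7, 5)
expansion = cos_n_theta_binomial(0.7, 5)
```

Here `lhs` and `rhs` should agree to rounding error, and `expansion` should reproduce `math.cos(5 * 0.7)`.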
Numerous other examples of relations among the exponential, hyperbolic,
and trigonometric functions in the complex plane appear in the exercises.
⁴ The complex conjugate is often denoted by $\bar{z}$.
Occasionally there are complications. The logarithm of a complex variable
may be expanded using the polar representation
$$\ln z = \ln re^{i\theta} = \ln r + i\theta. \tag{6.13a}$$
This is not complete. To the phase angle $\theta$, we may add any integral multiple
of $2\pi$ without changing z. Hence Eq. 6.13a should read
$$\ln z = \ln re^{i(\theta + 2n\pi)} = \ln r + i(\theta + 2n\pi). \tag{6.13b}$$
The parameter n may be any integer. This means that ln z is a multivalued
function having an infinite number of values for a single pair of real values r
and $\theta$. To avoid ambiguity, we usually agree to set n = 0 and limit the phase
to an interval of length $2\pi$ such as $(-\pi, \pi)$. The line in the z-plane that is not
crossed, the negative real axis in this case, is labeled a cut line. The value of ln z
with n = 0 is called the principal value of ln z.
Further discussion of these functions, including the logarithm, appears in
Section 6.6.
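The multivalued character of the logarithm, Eq. 6.13b, can be seen in a few lines of Python. This is only a sketch: the helper name `log_branch` and the sample point are assumptions of the illustration, and the phase returned by the standard `cmath` module is taken as the n = 0 phase.

```python
import cmath
import math

def log_branch(z, n=0):
    """ln z on branch n (Eq. 6.13b); n = 0 is the principal value."""
    r, theta = cmath.polar(z)          # one phase per number, used as the n = 0 phase
    return complex(math.log(r), theta + 2 * math.pi * n)

z = complex(-1.0, 1.0)
branches = [log_branch(z, n) for n in (-1, 0, 1)]

# Every branch exponentiates back to the same z.
round_trip_errors = [abs(cmath.exp(w) - z) for w in branches]

# The n = 0 branch agrees with the library's principal value.
principal_matches = cmath.isclose(branches[1], cmath.log(z))
```

Neighboring branches differ by $2\pi i$, while all of them give back the same z on exponentiation.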
EXERCISES
6.1.1 (a) Find the reciprocal of x + iy, working entirely in the cartesian
representation.
(b) Repeat part (a), working in polar form but expressing the final result in
cartesian form.
6.1.2 The complex quantities a = u + iv and b = x + iy may also be represented as
two-dimensional vectors, $\mathbf{a} = \mathbf{i}u + \mathbf{j}v$, $\mathbf{b} = \mathbf{i}x + \mathbf{j}y$. Show that
$$a^* b = \mathbf{a} \cdot \mathbf{b} + i\mathbf{k} \cdot \mathbf{a} \times \mathbf{b}.$$
6.1.3 Prove algebraically that
$$|z_1| - |z_2| \le |z_1 + z_2| \le |z_1| + |z_2|.$$
Interpret this result in terms of vectors.
Prove that
6.1.4 We may define a complex conjugation operator K such that Kz = z*. Show that
K is not a linear operator.
6.1.5 Show that complex numbers have square roots and that the square roots are
contained in the complex plane. What are the square roots of i?
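6.1.6 Show that
(a) $\cos n\theta = \cos^n\theta - \binom{n}{2}\cos^{n-2}\theta\,\sin^2\theta + \binom{n}{4}\cos^{n-4}\theta\,\sin^4\theta - \cdots$.
(b) $\sin n\theta = \binom{n}{1}\cos^{n-1}\theta\,\sin\theta - \binom{n}{3}\cos^{n-3}\theta\,\sin^3\theta + \binom{n}{5}\cos^{n-5}\theta\,\sin^5\theta - \cdots$.
Note. The quantities $\binom{n}{m}$ are binomial coefficients:
$$\binom{n}{m} = \frac{n!}{(n - m)!\,m!}.$$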
6.1.7 Prove that
(a) $\sum_{n=0}^{N-1} \cos nx = \dfrac{\sin(Nx/2)}{\sin(x/2)}\cos\left[(N - 1)\dfrac{x}{2}\right]$,
(b) $\sum_{n=0}^{N-1} \sin nx = \dfrac{\sin(Nx/2)}{\sin(x/2)}\sin\left[(N - 1)\dfrac{x}{2}\right]$.
These series occur in the analysis of the multiple-slit diffraction pattern. Another
application is the analysis of the Gibbs phenomenon, Section 14.5.
Hint. Parts (a) and (b) may be combined to form a geometric series (compare
Section 5.1).
6.1.8 For -1 < p < 1 prove that
(a) $\sum_{n=0}^{\infty} p^n \cos nx = \dfrac{1 - p\cos x}{1 - 2p\cos x + p^2}$,
(b) $\sum_{n=0}^{\infty} p^n \sin nx = \dfrac{p\sin x}{1 - 2p\cos x + p^2}$.
These series occur in the theory of the Fabry-Perot interferometer.
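6.1.9 Assume that the trigonometric functions and the hyperbolic functions are defined
for complex argument by the appropriate power series
$$\sin z = \sum_{n=1,\,\mathrm{odd}}^{\infty} (-1)^{(n-1)/2}\frac{z^n}{n!} = \sum_{s=0}^{\infty} (-1)^s\frac{z^{2s+1}}{(2s+1)!},$$
$$\cos z = \sum_{n=0,\,\mathrm{even}}^{\infty} (-1)^{n/2}\frac{z^n}{n!} = \sum_{s=0}^{\infty} (-1)^s\frac{z^{2s}}{(2s)!},$$
$$\sinh z = \sum_{n=1,\,\mathrm{odd}}^{\infty} \frac{z^n}{n!} = \sum_{s=0}^{\infty} \frac{z^{2s+1}}{(2s+1)!},$$
$$\cosh z = \sum_{n=0,\,\mathrm{even}}^{\infty} \frac{z^n}{n!} = \sum_{s=0}^{\infty} \frac{z^{2s}}{(2s)!}.$$
(a) Show that
i sin z = sinh iz, sin iz = i sinh z,
cos z = cosh iz, cos iz = cosh z.
(b) Verify that familiar functional relations such as
$$\cosh z = \frac{e^z + e^{-z}}{2},$$
$$\sin(z_1 + z_2) = \sin z_1\cos z_2 + \sin z_2\cos z_1,$$
still hold in the complex plane.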
6.1.10 Using the identities
$$\cos z = \frac{e^{iz} + e^{-iz}}{2}, \qquad \sin z = \frac{e^{iz} - e^{-iz}}{2i},$$
established from comparison of power series, show that
(a) sin(x + iy) = sin x cosh y + i cos x sinh y,
cos(x + iy) = cos x cosh y - i sin x sinh y,
(b) $|\sin z|^2 = \sin^2 x + \sinh^2 y$,
$|\cos z|^2 = \cos^2 x + \sinh^2 y$.
This demonstrates that we may have $|\sin z|$, $|\cos z| > 1$ in the complex plane.
6.1.11 From the identities in Exercises 6.1.9 and 6.1.10 show that
(a) sinh(x + iy) = sinh x cos y + i cosh x sin y,
cosh(x + iy) = cosh x cos y + i sinh x sin y,
(b) $|\sinh z|^2 = \sinh^2 x + \sin^2 y$,
$|\cosh z|^2 = \sinh^2 x + \cos^2 y$.
6.1.12 Prove that
(a) $|\sin z| \ge |\sin x|$,
(b) $|\cos z| \ge |\cos x|$.
6.1.13 Show that the exponential function $e^z$ is periodic with a pure imaginary period
of $2\pi i$.
6.1.14 Show that
(a) $\tanh(z/2) = \dfrac{\sinh x + i\sin y}{\cosh x + \cos y}$,
(b) $\coth(z/2) = \dfrac{\sinh x - i\sin y}{\cosh x - \cos y}$.
6.1.15 Find all the zeros of
(a) sin z, (c) sinh z,
(b) cos z, (d) cosh z.
6.1.16 Show that
(a) $\sin^{-1} z = -i\ln(iz \pm \sqrt{1 - z^2})$, (d) $\sinh^{-1} z = \ln(z + \sqrt{z^2 + 1})$,
(b) $\cos^{-1} z = -i\ln(z \pm \sqrt{z^2 - 1})$, (e) $\cosh^{-1} z = \ln(z + \sqrt{z^2 - 1})$,
(c) $\tan^{-1} z = \dfrac{i}{2}\ln\left(\dfrac{i + z}{i - z}\right)$, (f) $\tanh^{-1} z = \dfrac{1}{2}\ln\left(\dfrac{1 + z}{1 - z}\right)$.
Hint. 1. Express the trigonometric and hyperbolic functions in terms of
exponentials. 2. Solve for the exponential and then for the exponent.
6.1.17 In the quantum theory of photoionization we encounter the identity
$$\left(\frac{ia - 1}{ia + 1}\right)^{ib} = \exp(-2b\cot^{-1} a),$$
in which a and b are real. Verify this identity.
6.1.18 A plane wave of light of angular frequency $\omega$ is represented by
$$e^{i\omega(t - nx/c)}.$$
In a certain substance the simple real index of refraction n is replaced by the
complex quantity n - ik. What is the effect of k on the wave? What does k
correspond to physically? The generalization of a quantity from real to complex form
occurs frequently in physics. Examples range from the complex Young's modulus
of viscoelastic materials to the complex potential of the "cloudy crystal ball"
model of the atomic nucleus.
6.1.19 We see that for the angular momentum components defined in Exercise 2.5.14
$$(L_x - iL_y) \neq (L_x + iL_y)^*.$$
Explain why this occurs.
6.1.20 Show that the phase of f(z) = u + iv is equal to the imaginary part of the
logarithm of f(z). Exercise 10.2.13 depends on this result.
6.1.21 (a) Show that $e^{\ln z}$ always equals z.
(b) Show that $\ln e^z$ does not always equal z.
6.1.22 The infinite product representations of Section 5.11 hold when the real variable
x is replaced by the complex variable z. From this, develop infinite product
representations for
(a) sinh z
(b) cosh z.
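6.1.23 The equation of motion of a mass m relative to a rotating coordinate system is
$$m\frac{d^2\mathbf{r}}{dt^2} = \mathbf{F} - m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}) - 2m\left(\boldsymbol{\omega} \times \frac{d\mathbf{r}}{dt}\right) - m\left(\frac{d\boldsymbol{\omega}}{dt} \times \mathbf{r}\right).$$
Consider the case $\mathbf{F} = 0$, $\mathbf{r} = \mathbf{i}x + \mathbf{j}y$, and $\boldsymbol{\omega} = \omega\mathbf{k}$, with $\omega$ constant. Show that
the replacement of $\mathbf{r} = \mathbf{i}x + \mathbf{j}y$ by z = x + iy leads to
$$\frac{d^2 z}{dt^2} + i2\omega\frac{dz}{dt} - \omega^2 z = 0.$$
Note. This differential equation may be solved by the substitution $z = fe^{-i\omega t}$.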
6.1.24 Using the complex arithmetic available in FORTRAN IV, write a program that
will calculate the complex exponential $e^z$ from its series expansion (definition).
Calculate $e^z$ for $z = e^{in\pi/6}$, n = 0, 1, 2, ..., 12. Tabulate the phase angle $\theta$ ($= n\pi/6$),
$\Re(z)$, $\Im(z)$, $\Re(e^z)$, $\Im(e^z)$, $|e^z|$, and the phase of $e^z$.
Check value, n = 5, $\theta$ = 2.61799, $\Re(z)$ = -0.86602,
$\Im(z)$ = 0.50000, $\Re(e^z)$ = 0.36913, $\Im(e^z)$ = 0.20166,
$|e^z|$ = 0.42062, phase($e^z$) = 0.50000.
6.1.25 Using the complex arithmetic available in FORTRAN IV, calculate and
tabulate $\Re(\sinh z)$, $\Im(\sinh z)$, $|\sinh z|$, and phase(sinh z) for x = 0.0(0.1)1.0 and y
= 0.0(0.1)1.0.
Hint. Beware of dividing by zero if calculating an angle as an arc tangent.
Check value, z = 0.2 + 0.1i, $\Re(\sinh z)$ = 0.20033,
$\Im(\sinh z)$ = 0.10184, $|\sinh z|$ = 0.22473,
phase(sinh z) = 0.47030.
6.1.26 Repeat Exercise 6.1.25 for cosh z.
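Exercise 6.1.24 asks for a FORTRAN IV program. As a rough sketch of the same computation in a different language, the following uses Python (a substitution made for this illustration). The names `exp_series`, `tol`, and `max_terms`, and the stopping rule, are assumptions of the sketch rather than anything specified by the exercise.

```python
import cmath
import math

def exp_series(z, tol=1e-15, max_terms=200):
    """Sum e^z = sum_{k>=0} z**k / k! until the latest term is below tol."""
    term = complex(1.0, 0.0)   # k = 0 term
    total = term
    for k in range(1, max_terms):
        term *= z / k          # builds z**k / k! from the previous term
        total += term
        if abs(term) < tol:
            break
    return total

rows = []
for n in range(13):
    theta = n * math.pi / 6
    z = cmath.exp(1j * theta)          # point on the unit circle
    w = exp_series(z)                  # e^z from the series
    rows.append((n, theta, z.real, z.imag, w.real, w.imag, abs(w), cmath.phase(w)))

for row in rows:
    print("%2d %8.5f %9.5f %9.5f %9.5f %9.5f %8.5f %9.5f" % row)
```

The n = 5 row can be compared with the check values quoted in the exercise. Using `cmath.phase` (a two-argument arc tangent internally) avoids the division by zero that the hint to Exercise 6.1.25 warns about.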
6.2 CAUCHY-RIEMANN CONDITIONS
Having established complex functions of a complex variable, we now proceed
to differentiate them. The derivative of f(z), like that of a real function, is defined
by
$$\lim_{\delta z \to 0} \frac{f(z + \delta z) - f(z)}{z + \delta z - z} = \lim_{\delta z \to 0} \frac{\delta f(z)}{\delta z} = \frac{df}{dz} \quad \text{or} \quad f'(z), \tag{6.14}$$
provided that the limit is independent of the particular approach to the point z.
FIG. 6.4 Alternate approaches to $z_0$
For real variables we require that the right-hand limit ($x \to x_0$ from above)
and the left-hand limit ($x \to x_0$ from below) be equal for the derivative df(x)/dx
to exist at $x = x_0$. Now, with z (or $z_0$) some point in a plane, our requirement
that the limit be independent of the direction of approach is very restrictive.
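Consider increments $\delta x$ and $\delta y$ of the variables x and y, respectively. Then
$$\delta z = \delta x + i\,\delta y. \tag{6.15}$$
Also,
$$\delta f = \delta u + i\,\delta v, \tag{6.16}$$
so that
$$\frac{\delta f}{\delta z} = \frac{\delta u + i\,\delta v}{\delta x + i\,\delta y}. \tag{6.17}$$
Let us take the limit indicated by Eq. 6.14 by two different approaches as shown
in Fig. 6.4. First, with $\delta y = 0$, we let $\delta x \to 0$. Equation 6.14 yields
$$\lim_{\delta z \to 0} \frac{\delta f}{\delta z} = \lim_{\delta x \to 0} \left(\frac{\delta u}{\delta x} + i\frac{\delta v}{\delta x}\right) = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}, \tag{6.18}$$
assuming the partial derivatives exist. For a second approach, we set $\delta x = 0$
and then let $\delta y \to 0$. This leads to
$$\lim_{\delta z \to 0} \frac{\delta f}{\delta z} = \lim_{\delta y \to 0} \left(-i\frac{\delta u}{\delta y} + \frac{\delta v}{\delta y}\right) = -i\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y}. \tag{6.19}$$
If we are to have a derivative df/dz, Eqs. 6.18 and 6.19 must be identical.
Equating real parts to real parts and imaginary parts to imaginary parts (like
components of vectors), we obtain
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}. \tag{6.20}$$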
These are the famous Cauchy-Riemann conditions. They were discovered by
Cauchy and used extensively by Riemann in his theory of analytic functions.
These Cauchy-Riemann conditions are necessary for the existence of a
derivative of f(z), that is, if df/dz exists, the Cauchy-Riemann conditions must hold.
Conversely, if the Cauchy-Riemann conditions are satisfied and the partial
derivatives of u(x, y) and v(x,y) are continuous, the derivative df/dz exists.
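This may be shown by writing
$$\delta f = \left(\frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}\right)\delta x + \left(\frac{\partial u}{\partial y} + i\frac{\partial v}{\partial y}\right)\delta y. \tag{6.21}$$
The justification for this expression depends on the continuity of the partial
derivatives of u and v. Dividing by $\delta z$, we have
$$\frac{\delta f}{\delta z} = \frac{(\partial u/\partial x + i(\partial v/\partial x))\,\delta x + (\partial u/\partial y + i(\partial v/\partial y))\,\delta y}{\delta x + i\,\delta y} = \frac{(\partial u/\partial x + i(\partial v/\partial x)) + (\partial u/\partial y + i(\partial v/\partial y))\,\delta y/\delta x}{1 + i(\delta y/\delta x)}. \tag{6.22}$$
If $\delta f/\delta z$ is to have a unique value, the dependence on $\delta y/\delta x$ must be eliminated.
Applying the Cauchy-Riemann conditions to the y derivatives, we obtain
$$\frac{\partial u}{\partial y} + i\frac{\partial v}{\partial y} = -\frac{\partial v}{\partial x} + i\frac{\partial u}{\partial x} = i\left(\frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}\right). \tag{6.23}$$
Substituting Eq. 6.23 into Eq. 6.22, we may cancel out the $\delta y/\delta x$ dependence
and
$$\frac{\delta f}{\delta z} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}, \tag{6.24}$$
which shows that $\lim \delta f/\delta z$ is independent of the direction of approach in the
complex plane as long as the partial derivatives are continuous.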
It is worthwhile noting that the Cauchy-Riemann conditions guarantee
that the curves $u = c_1$ will be orthogonal to the curves $v = c_2$ (compare Section
2.1). This is fundamental in application to potential problems in a variety of
areas of physics. If $u = c_1$ is a line of electric force, then $v = c_2$ is an equipotential
line (surface), and vice versa. A further implication for potential theory is
developed in Exercise 6.2.1.
Analytic Functions
Finally, if f(z) is differentiable at $z = z_0$ and in some small region around $z_0$,
we say that f(z) is analytic¹ at $z = z_0$. If f(z) is analytic everywhere in the (finite)
complex plane, we call it an entire function. Our theory of complex variables
here is essentially one of analytic functions of complex variables, which points
up the crucial importance of the Cauchy-Riemann conditions. The concept of
analyticity carried on in advanced theories of modern physics plays a crucial
role in dispersion theory (of elementary particles). If $f'(z)$ does not exist at
$z = z_0$, then $z_0$ is labeled a singular point and consideration of it is postponed
until Section 7.1.
¹ Some writers use the term holomorphic.
To illustrate the Cauchy-Riemann conditions, consider two very simple
examples.
EXAMPLE 6.2.1
Let $f(z) = z^2$. Then the real part $u(x, y) = x^2 - y^2$ and the imaginary part
$v(x, y) = 2xy$. Following Eq. 6.20,
$$\frac{\partial u}{\partial x} = 2x = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -2y = -\frac{\partial v}{\partial x}.$$
We see that $f(z) = z^2$ satisfies the Cauchy-Riemann conditions throughout the
complex plane. Since the partial derivatives are clearly continuous, we conclude
that $f(z) = z^2$ is analytic.
EXAMPLE 6.2.2
Let $f(z) = z^*$. Now u = x and v = -y. Applying the Cauchy-Riemann
conditions, we obtain
$$\frac{\partial u}{\partial x} = 1 \neq \frac{\partial v}{\partial y}.$$
The Cauchy-Riemann conditions are not satisfied and $f(z) = z^*$ is not an
analytic function of z. It is interesting to note that $f(z) = z^*$ is continuous,
thus providing an example of a function that is everywhere continuous but
nowhere differentiable.
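The two examples can also be checked numerically. The sketch below estimates the partial derivatives in Eq. 6.20 by central differences; the function name `cauchy_riemann_residuals`, the step `h`, and the sample point are assumptions made for this illustration, and a finite-difference check of this kind is suggestive only, not a proof of analyticity.

```python
def cauchy_riemann_residuals(f, x, y, h=1e-6):
    """Central-difference estimates of the two Cauchy-Riemann residuals.

    Returns (du/dx - dv/dy, du/dy + dv/dx); both vanish when Eq. 6.20 holds.
    """
    df_dx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)
    df_dy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)
    du_dx, dv_dx = df_dx.real, df_dx.imag
    du_dy, dv_dy = df_dy.real, df_dy.imag
    return du_dx - dv_dy, du_dy + dv_dx

square = lambda z: z * z             # Example 6.2.1
conjugate = lambda z: z.conjugate()  # Example 6.2.2

res_square = cauchy_riemann_residuals(square, 0.3, -1.2)
res_conjugate = cauchy_riemann_residuals(conjugate, 0.3, -1.2)
```

For $z^2$ both residuals should be zero to within rounding, whereas for $z^*$ the first residual is $\partial u/\partial x - \partial v/\partial y = 1 - (-1) = 2$ at every point.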
The derivative of a real function of a real variable is essentially a local
characteristic, in that it provides information about the function only in a local
neighborhood—for instance, as a truncated Taylor expansion. The existence
of a derivative of a function of a complex variable has much more far-reaching
implications. The real and imaginary parts of our analytic function must
separately satisfy Laplace's equation. This is Exercise 6.2.1. Further, our analytic
function is guaranteed derivatives of all orders, Section 6.4. In this sense the
derivative not only governs the local behavior of the complex function, but
controls the distant behavior as well.
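As a brief worked check of the Laplace-equation statement, one may take the real and imaginary parts of $f(z) = z^2$ from Example 6.2.1 (the choice of this particular function is made here for illustration):

```latex
\begin{align*}
u &= x^2 - y^2: &
\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} &= 2 + (-2) = 0, \\
v &= 2xy: &
\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} &= 0 + 0 = 0.
\end{align*}
```

Both parts are therefore harmonic, as Exercise 6.2.1 asks the reader to show in general.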
EXERCISES
6.2.1 The functions u(x, y) and v(x, y) are the real and imaginary parts, respectively, of
an analytic function w(z).
(a) Assuming that the required derivatives exist, show that
$$\nabla^2 u = \nabla^2 v = 0.$$
Solutions of Laplace's equation such as u(x, y) and v(x, y) are called harmonic
functions.
(b) Show that
$$\frac{\partial u}{\partial x}\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\frac{\partial v}{\partial y} = 0$$
and give a geometric interpretation.
Hint. The technique of Section 1.6 allows you to construct vectors normal to the
curves $u(x, y) = c_i$ and $v(x, y) = c_j$.
6.2.2 Show whether or not the function $f(z) = \Re(z) = x$ is analytic.
6.2.3 Having shown that the real part u(x, y) and the imaginary part v(x, y) of an analytic
function w(z) each satisfy Laplace's equation, show that u(x, y) and v(x, y) cannot
have either a maximum or a minimum in the interior of any region in which w(z)
is analytic. (They can have saddle points.)
6.2.4 Let $A = \partial^2 w/\partial x^2$, $B = \partial^2 w/\partial x\,\partial y$, $C = \partial^2 w/\partial y^2$. From the calculus of functions
of two variables, w(x, y), we have a saddle point if
$$B^2 - AC > 0.$$
With f(z) = u(x, y) + iv(x, y), apply the Cauchy-Riemann conditions and show
that neither u(x, y) nor v(x, y) has a maximum or a minimum in a finite region of
the complex plane.
6.2.5 Find the analytic functions
w(z) = u(x, y) + iv(x, y)
if
(a) $u(x, y) = x^3 - 3xy^2$,
(b) $v(x, y) = e^{-y}\sin x$.
6.2.6 If there is some common region in which $w_1 = u(x, y) + iv(x, y)$ and $w_2 = w_1^* =
u(x, y) - iv(x, y)$ are both analytic, prove that u(x, y) and v(x, y) are constants.
6.2.7 The function f(z) = u(x, y) + iv(x, y) is analytic. Show that $f^*(z^*)$ is also analytic.
6.2.8 Using $f(re^{i\theta}) = R(r, \theta)e^{i\Theta(r, \theta)}$, in which $R(r, \theta)$ and $\Theta(r, \theta)$ are differentiable real
functions of r and $\theta$, show that the Cauchy-Riemann conditions in polar
coordinates become
(a) $\dfrac{\partial R}{\partial r} = \dfrac{R}{r}\dfrac{\partial \Theta}{\partial \theta}$,
(b) $\dfrac{1}{r}\dfrac{\partial R}{\partial \theta} = -R\dfrac{\partial \Theta}{\partial r}$.
Hint. Set up the derivative first with $\delta z$ radial and then with $\delta z$ tangential.
6.2.9 As an extension of Exercise 6.2.8 show that $\Theta(r, \theta)$ satisfies Laplace's equation in
polar coordinates, Eq. 2.33 (without the final term).
6.2.10 Two-dimensional irrotational fluid flow is conveniently described by a complex
potential f(z) = u(x, y) + iv(x, y). We label the real part u(x, y), the velocity
potential and the imaginary part v(x, y), the stream function. The fluid velocity $\mathbf{V}$ is
given by $\mathbf{V} = \nabla u$. If f(z) is analytic,
(a) Show that $df/dz = V_x - iV_y$,
(b) Show that $\nabla \cdot \mathbf{V} = 0$ (no sources or sinks),
(c) Show that $\nabla \times \mathbf{V} = 0$ (irrotational, nonturbulent flow).
6.2.11 A proof of the Schwarz inequality (Section 9.4) involves minimizing an expression
$$f = \psi_{aa} + \lambda\psi_{ab} + \lambda^*\psi_{ab}^* + \lambda\lambda^*\psi_{bb} \ge 0.$$
The $\psi$'s are integrals of products of functions; $\psi_{aa}$ and $\psi_{bb}$ are real, $\psi_{ab}$ is complex.
$\lambda$ is a parameter, possibly complex.
(a) Differentiate the preceding expression with respect to $\lambda^*$, treating $\lambda$ as an
independent parameter, independent of $\lambda^*$. Show that setting the derivative
$\partial f/\partial\lambda^*$ equal to zero yields
$$\lambda = -\psi_{ab}^*/\psi_{bb}.$$
(b) Show that $\partial f/\partial\lambda = 0$ leads to the same result.
(c) Let $\lambda = x + iy$, $\lambda^* = x - iy$. Set the x and y derivatives equal to zero and
show that again
$$\lambda = -\psi_{ab}^*/\psi_{bb}.$$
This independence of $\lambda$ and $\lambda^*$ appears again in Section 17.7.
6.2.12 The function f(z) is analytic. Show that the derivative of f(z) with respect to z*
vanishes.
Hint. Use the chain rule and take x = (z + z*)/2, y = (z - z*)/2i.
Note. This result emphasizes that our analytic function f(z) is not just a complex
function of two real variables x and y. It is a function of the complex variable
x + iy.
6.3 CAUCHY'S INTEGRAL THEOREM
Contour Integrals
With differentiation under control, we turn to integration. The integral of a
complex variable over a contour in the complex plane may be defined in close
analogy to the (Riemann) integral of a real function integrated along the real
x-axis.
We divide the contour from $z_0$ to $z_0'$ into n intervals by picking n - 1 intermediate
points $z_1, z_2, \ldots$, on the contour (Figure 6.5). Consider the sum
$$S_n = \sum_{j=1}^{n} f(\zeta_j)(z_j - z_{j-1}), \tag{6.25}$$
where $\zeta_j$ is a point on the curve between $z_j$ and $z_{j-1}$. Now let $n \to \infty$ with
$$|z_j - z_{j-1}| \to 0$$
for all j. If the $\lim_{n \to \infty} S_n$ exists and is independent of the details of choosing the
points $z_j$ and $\zeta_j$, then
$$\lim_{n \to \infty} \sum_{j=1}^{n} f(\zeta_j)(z_j - z_{j-1}) = \int_{z_0}^{z_0'} f(z)\,dz. \tag{6.26}$$
The right-hand side of Eq. 6.26 is called the contour integral of f(z) (along the
specified contour C from $z = z_0$ to $z = z_0'$).
The preceding development of the contour integral is closely analogous to
the Riemann integral of a real function of a real variable. As an alternative,
the contour integral may be defined by
$$\int_{z_1}^{z_2} f(z)\,dz = \int_{(x_1, y_1)}^{(x_2, y_2)} [u(x, y) + iv(x, y)][dx + i\,dy] = \int_{(x_1, y_1)}^{(x_2, y_2)} [u(x, y)\,dx - v(x, y)\,dy] + i\int_{(x_1, y_1)}^{(x_2, y_2)} [v(x, y)\,dx + u(x, y)\,dy],$$
with the path joining $(x_1, y_1)$ and $(x_2, y_2)$ specified. This reduces the complex
integral to the complex sum of real integrals. It is somewhat analogous to the
replacement of a vector integral by the vector sum of scalar integrals, Section
1.10.
FIG. 6.5 (contour from $z_0$ to $z_0' = z_n$ in the xy-plane)
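The defining sum of Eq. 6.25 translates almost directly into code. The following Python sketch approximates a contour integral for a path given as a function of a parameter t; the names `contour_integral`, `straight`, and `corner`, the choice of the midpoint for $\zeta_j$, and the two sample paths from 0 to 1 + i are all assumptions made for this illustration.

```python
def contour_integral(f, path, n=2000):
    """Approximate the integral of f along path(t), 0 <= t <= 1, by the sum of Eq. 6.25.

    zeta_j is taken at the midpoint (in t) of each subinterval.
    """
    total = 0j
    z_prev = path(0.0)
    for j in range(1, n + 1):
        z_next = path(j / n)
        zeta = path((j - 0.5) / n)
        total += f(zeta) * (z_next - z_prev)
        z_prev = z_next
    return total

# Two hypothetical paths from 0 to 1 + i (chosen for illustration).
straight = lambda t: t * (1 + 1j)

def corner(t):
    # Along the real axis to 1, then parallel to the imaginary axis up to 1 + i.
    return complex(2 * t, 0.0) if t <= 0.5 else complex(1.0, 2 * t - 1.0)

analytic_straight = contour_integral(lambda z: z * z, straight)
analytic_corner = contour_integral(lambda z: z * z, corner)
conj_straight = contour_integral(lambda z: z.conjugate(), straight)
conj_corner = contour_integral(lambda z: z.conjugate(), corner)
```

For $f(z) = z^2$ the two paths give the same value, $(1 + i)^3/3$, to within the discretization error, while for $f(z) = z^*$ the straight path gives 1 and the cornered path gives 1 + i, so the result depends on the path.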
Stokes's Theorem Proof
Cauchy's integral theorem is the first of two basic theorems in the theory of
the behavior of functions of a complex variable. First, a proof under relatively
restrictive conditions—conditions that are intolerable to the mathematician
developing a beautiful abstract theory but that are usually satisfied in physical
problems.
If a function f(z) is analytic (therefore single-valued) and its partial derivatives
are continuous throughout some simply connected region R,¹ for every closed
path C (Fig. 6.6) in R the line integral of f(z) around C is zero or
$$\int_C f(z)\,dz = \oint_C f(z)\,dz = 0. \tag{6.27}$$
¹ A simply connected region or domain is one in which every closed contour
in that region encloses only the points contained in it. If a region is not simply
connected, it is called multiply connected. As an example of a multiply
connected region, consider the z-plane with the interior of the unit circle excluded.
FIG. 6.6 A closed contour C within a simply connected region
The symbol $\oint$ is used to emphasize that the path is closed. The reader will recall
that in Section 1.13 such a function f(z), identified as a force, was labeled
conservative.
In this form the Cauchy integral theorem may be proved by direct
application of Stokes's theorem (Section 1.12). With f(z) = u(x, y) + iv(x, y) and
dz = dx + i dy,
$$\oint_C f(z)\,dz = \oint_C (u + iv)(dx + i\,dy) = \oint_C (u\,dx - v\,dy) + i\oint_C (v\,dx + u\,dy). \tag{6.28}$$
These two line integrals may be converted to surface integrals by Stokes's
theorem, a procedure that is justified if the partial derivatives are continuous
within C. In applying Stokes's theorem, the reader might note that the final
two integrals of Eq. 6.28 are completely real. Using
$$\mathbf{V} = \mathbf{i}V_x + \mathbf{j}V_y,$$
we have
$$\oint_C (V_x\,dx + V_y\,dy) = \int \left(\frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y}\right) dx\,dy. \tag{6.29}$$
For the first integral in the last part of Eq. 6.28 let $u = V_x$ and $v = -V_y$.² Then
$$\oint_C (u\,dx - v\,dy) = \oint_C (V_x\,dx + V_y\,dy) = \int \left(\frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y}\right) dx\,dy = -\int \left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right) dx\,dy. \tag{6.30}$$
For the second integral on the right side of Eq. 6.28 we let $u = V_y$ and $v = V_x$.
Using Stokes's theorem again, we obtain
$$\oint_C (v\,dx + u\,dy) = \int \left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right) dx\,dy. \tag{6.31}$$
On application of the Cauchy-Riemann conditions that must hold, since f(z)
is assumed analytic, each integrand vanishes and
$$\oint_C f(z)\,dz = 0. \tag{6.32}$$
² In the proof of Stokes's theorem, Section 1.12, $V_x$ and $V_y$ are any two
functions (with continuous partial derivatives).
Cauchy-Goursat Proof
This completes the proof of Cauchy's integral theorem. However, the proof
is marred from a theoretical point of view by the need for continuity of the first
partial derivatives. Actually, as shown by Goursat, this condition is not essential.
An outline of the Goursat proof is as follows. We subdivide the region inside
the contour C into a network of small squares as indicated in Fig. 6.7. Then
$$\oint_C f(z)\,dz = \sum_j \oint_{C_j} f(z)\,dz, \tag{6.33}$$
all integrals along interior lines canceling out. To attack the $\oint_{C_j} f(z)\,dz$, we
construct the function
$$\delta_j(z, z_j) = \frac{f(z) - f(z_j)}{z - z_j} - \left.\frac{df(z)}{dz}\right|_{z = z_j}, \tag{6.34}$$
with $z_j$ an interior point of the jth subregion. Note that $[f(z) - f(z_j)]/(z - z_j)$
is an approximation to the derivative at $z = z_j$. Equivalently, we may note that
if f(z) had a Taylor expansion (which we have not yet proved), then $\delta_j(z, z_j)$
would be of order $z - z_j$, approaching zero as the network was made finer.
We may make
$$|\delta_j(z, z_j)| < \varepsilon, \tag{6.35}$$
where $\varepsilon$ is an arbitrarily chosen small positive quantity.
Solving Eq. 6.34 for f(z) and integrating around $C_j$, we obtain
$$\oint_{C_j} f(z)\,dz = \oint_{C_j} (z - z_j)\,\delta_j(z, z_j)\,dz, \tag{6.36}$$
the integrals of the other terms vanishing.³ When Eqs. 6.35 and 6.36 are
combined, one may show that
$$\left|\sum_j \oint_{C_j} f(z)\,dz\right| < A\varepsilon, \tag{6.37}$$
where A is a term of the order of the area of the enclosed region. Since $\varepsilon$ is
arbitrary, we let $\varepsilon \to 0$ and conclude that:
If a function f(z) is analytic on and within a closed path C,
$$\oint_C f(z)\,dz = 0. \tag{6.38}$$
Details of the proof of this significantly more general and more powerful form
can be found in Churchill and in the other references cited. Actually we can still
prove the theorem for f(z) analytic within the interior of C and only continuous
on C.
FIG. 6.7 Cauchy-Goursat contours
The consequence of the Cauchy integral theorem is that for analytic functions
the line integral is a function only of its end points, independent of the path of
integration:
$$\int_{z_1}^{z_2} f(z)\,dz = F(z_2) - F(z_1) = -\int_{z_2}^{z_1} f(z)\,dz, \tag{6.39}$$
again exactly like the case of a conservative force, Section 1.13.
Multiply Connected Regions
The original statement of our theorem demanded a simply connected region.
This restriction may easily be relaxed by the creation of a barrier, a cut line.
Consider the multiply connected region of Fig. 6.8, in which f(z) is not defined
for the interior R′. Cauchy's integral theorem is not valid for the contour C,
as shown, but we can construct a contour C′ for which the theorem holds.
We cut from the interior forbidden region R′ to the forbidden region exterior
to R and then run a new contour C′, as shown in Fig. 6.9.
The new contour C′ through ABDEFGA never crosses the cut line that
literally converts R into a simply connected region. The three-dimensional
analog of this technique was used in Section 1.14 to prove Gauss's law. By
Eq. 6.39
$$\int_G^A f(z)\,dz = -\int_D^E f(z)\,dz, \tag{6.40}$$
f(z) having been continuous across the cut line and line segments DE and GA
arbitrarily close together. Then
$$\oint_{C'} f(z)\,dz = \int_{ABD} f(z)\,dz + \int_{EFG} f(z)\,dz = 0 \tag{6.41}$$
by Cauchy's integral theorem, with region R now simply connected. Applying
Eq. 6.39 once again with $ABD \to C_1'$ and $EFG \to -C_2'$, we obtain
$$\oint_{C_1'} f(z)\,dz = \oint_{C_2'} f(z)\,dz, \tag{6.42}$$
in which $C_1'$ and $C_2'$ are both traversed in the same (counterclockwise) direction.
³ $\oint dz$ and $\oint z\,dz = 0$.
FIG. 6.8 A closed contour C in a multiply connected region
FIG. 6.9 Conversion of a multiply connected region into a simply connected region
It should be emphasized that the cut line here is a matter of mathematical
convenience, to permit the application of Cauchy's integral theorem. Since f(z)
is analytic in the annular region, it is necessarily single-valued and continuous
across any such cut line. When we consider branch points (Section 7.1) our
functions will not be single-valued and a cut line will be required to make them
single-valued.
EXERCISES
6.3.1 Show that $\int_{z_1}^{z_2} f(z)\,dz = -\int_{z_2}^{z_1} f(z)\,dz$.
6.3.2 In the Goursat proof of Cauchy's integral theorem we take
$$\oint z\,dz = 0.$$
Show that this expression holds, taking the path of integration to be the unit
circle, |z| = 1.
6.3.3 Prove that
$$\left|\int_C f(z)\,dz\right| \le |f|_{\max} \cdot L,$$
where $|f|_{\max}$ is the maximum value of |f(z)| along the contour C and L is the length
of the contour.
6.3.4 Verify that
$$\int_{(0,0)}^{(1,1)} z^*\,dz$$
depends on the path by evaluating the integral for the two paths shown in Fig. 6.10.
Recall that f(z) = z* is not an analytic function of z and that Cauchy's integral
theorem therefore does not apply.
FIG. 6.10
6.3.5 Show that
$$\oint_C \frac{dz}{z^2 + z} = 0,$$
in which the contour C is a circle defined by |z| = R > 1.
Hint. Direct use of the Cauchy integral theorem is illegal. Why? The integral may
be evaluated by transforming to polar coordinates and using tables. The preferred
technique would be the calculus of residues, Section 7.2. This yields 0 for R > 1
and $2\pi i$ for R < 1.
6.4 CAUCHY'S INTEGRAL FORMULA
As in the preceding section, we consider a function f(z) that is analytic on
a closed contour C and within the interior region bounded by C. We seek to
prove that
$$\oint_C \frac{f(z)}{z - z_0}\,dz = 2\pi i f(z_0), \tag{6.43}$$
in which $z_0$ is some point in the interior region bounded by C. This is the second
of the two basic theorems mentioned in Section 6.3. Note carefully that since
z is on the contour C while $z_0$ is in the interior, $z - z_0 \neq 0$ and the integral
Eq. 6.43 is well defined.
FIG. 6.11 Exclusion of a singular point
Although f(z) is assumed analytic, the integrand is $f(z)/(z - z_0)$ and is not
analytic at $z = z_0$. If the contour is deformed as shown in Fig. 6.11 (or Fig. 6.9,
Section 6.3), Cauchy's integral theorem applies. By Eq. 6.42
$$\oint_C \frac{f(z)}{z - z_0}\,dz - \oint_{C_2} \frac{f(z)}{z - z_0}\,dz = 0, \tag{6.44}$$
where C is the original outer contour and $C_2$ is the circle surrounding the point
$z_0$ traversed in a counterclockwise direction. Let $z = z_0 + re^{i\theta}$, using the polar
representation because of the circular shape of the path around $z_0$. Here r is
small and will eventually be made to approach zero. We have
$$\oint_{C_2} \frac{f(z)}{z - z_0}\,dz = \oint_{C_2} \frac{f(z_0 + re^{i\theta})}{re^{i\theta}}\,rie^{i\theta}\,d\theta.$$
Taking the limit as $r \to 0$, we obtain
$$\oint_{C_2} \frac{f(z)}{z - z_0}\,dz = if(z_0)\oint_{C_2} d\theta = 2\pi i f(z_0), \tag{6.45}$$
since f(z) is analytic and therefore continuous at $z = z_0$. This proves the Cauchy
integral formula.
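The formula lends itself to a direct numerical check. The sketch below evaluates the contour integral on a circle with an equally spaced sum in $\theta$; the function name `cauchy_integral`, the unit-circle contour, the number of points, and the test function $e^z$ are assumptions of this illustration.

```python
import cmath
import math

def cauchy_integral(f, z0, center=0j, radius=1.0, n=400):
    """(1 / 2 pi i) * closed integral of f(z) / (z - z0) over a circle.

    The circle is z = center + radius * exp(i theta); the integral over theta
    is approximated by an n-point equally spaced (trapezoidal) sum.
    """
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * k / n
        e = cmath.exp(1j * theta)
        z = center + radius * e
        dz_dtheta = 1j * radius * e
        total += f(z) / (z - z0) * dz_dtheta
    return total * (2 * math.pi / n) / (2j * math.pi)

inside = cauchy_integral(cmath.exp, 0.3 + 0.2j)    # z0 inside the unit circle
outside = cauchy_integral(cmath.exp, 2.0 + 0.5j)   # z0 outside the unit circle
```

With $z_0$ inside the contour the result should reproduce $f(z_0)$, here $e^{0.3 + 0.2i}$, from boundary values alone; with $z_0$ outside, the integrand is analytic within the contour and the result should vanish.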
Here is a remarkable result. The value of an analytic function f(z) is given
at an interior point z = z0 once the values on the boundary С are specified.
This is closely analogous to a two-dimensional form of Gauss's law (Section 1.14)
in which the magnitude of an interior line charge would be given in terms of
the cylindrical surface integral of the electric field E.
A further analogy is the determination of a function in real space by an
integral of the function and the corresponding Green's function (and their
derivatives) over the bounding surface. Kirchhoff diffraction theory is an
example of this.
It has been emphasized that $z_0$ is an interior point. What happens if $z_0$ is
exterior to C? In this case the entire integrand is analytic on and within C.
Cauchy's integral theorem, Section 6.3, applies and the integral vanishes. We
have
$$\frac{1}{2\pi i}\oint_C \frac{f(z)\,dz}{z - z_0} = \begin{cases} f(z_0), & z_0 \text{ interior} \\ 0, & z_0 \text{ exterior.} \end{cases}$$
Derivatives
Cauchy's integral formula may be used to obtain an expression for the
derivative of f(z). From Eq. 6.43, with f(z) analytic,
$$\frac{f(z_0 + \delta z_0) - f(z_0)}{\delta z_0} = \frac{1}{2\pi i\,\delta z_0}\left(\oint \frac{f(z)}{z - z_0 - \delta z_0}\,dz - \oint \frac{f(z)}{z - z_0}\,dz\right).$$
Then, by definition of derivative (Eq. 6.14),
$$f'(z_0) = \lim_{\delta z_0 \to 0} \frac{1}{2\pi i\,\delta z_0}\oint \frac{\delta z_0\,f(z)}{(z - z_0 - \delta z_0)(z - z_0)}\,dz = \frac{1}{2\pi i}\oint \frac{f(z)}{(z - z_0)^2}\,dz. \tag{6.46}$$
The alert reader will see that this result could have been obtained by
differentiating Eq. 6.43 under the integral sign with respect to $z_0$. This formal or
turning-the-crank approach is valid, but the justification for it is contained in
the preceding analysis.
This technique for constructing derivatives may be repeated. We write
$f'(z_0 + \delta z_0)$ and $f'(z_0)$, using Eq. 6.46. Subtracting, dividing by $\delta z_0$, and finally
taking the limit as $\delta z_0 \to 0$, we have
$$f^{(2)}(z_0) = \frac{2}{2\pi i}\oint \frac{f(z)\,dz}{(z - z_0)^3}.$$
Note that $f^{(2)}(z_0)$ is independent of the direction of $\delta z_0$ as it must be. Continuing,
we get¹
$$f^{(n)}(z_0) = \frac{n!}{2\pi i}\oint \frac{f(z)\,dz}{(z - z_0)^{n+1}}; \tag{6.47}$$
that is, the requirement that f(z) be analytic not only guarantees a first derivative
but derivatives of all orders as well! The derivatives of f(z) are automatically
analytic. The reader should notice that this statement assumes the Goursat
version of the Cauchy integral theorem. This is why Goursat's contribution is so
significant in the development of the theory of complex variables.
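The integral representation of the derivatives can be tested in the same numerical spirit. In this Python sketch the contour is a circle centered on $z_0$; the function name `cauchy_derivative`, the radius, the number of points, and the test function sin z are assumptions chosen for the illustration.

```python
import cmath
import math

def cauchy_derivative(f, z0, order, radius=1.0, n=400):
    """Derivative of the given order at z0, from (order)! / (2 pi i) times the
    closed integral of f(z) / (z - z0)**(order + 1) on a circle about z0."""
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * k / n
        e = cmath.exp(1j * theta)
        z = z0 + radius * e
        dz = 1j * radius * e * (2 * math.pi / n)
        total += f(z) / (z - z0) ** (order + 1) * dz
    return math.factorial(order) * total / (2j * math.pi)

z0 = 0.4 - 0.1j
derivs = [cauchy_derivative(cmath.sin, z0, m) for m in range(4)]
```

The four entries of `derivs` should approximate sin, cos, -sin, and -cos evaluated at $z_0$, each obtained purely from values of sin z on the surrounding circle.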
Morera's Theorem
A further application of Cauchy's integral formula is in the proof of Morera's
theorem, which is the converse of Cauchy's integral theorem. The theorem states
1 This expression is the starting point for defining derivatives of fractional
order. See A. Erdelyi, et al., Tables of Integral Transforms, Vol. 2, New York:
McGraw-Hill (1954). For recent applications to mathematical analysis see
T. J. Osler, "An integral analogue of Taylor's series and its use in computing
Fourier transforms." Math. Computation 26, 449 (1972) and his references.
the following:
If a function f(z) is continuous in a simply connected region R and ∮_C f(z) dz = 0
for every closed contour C within R, then f(z) is analytic throughout R.
Let us integrate f(z) from z1 to z2. Since every closed path integral of f(z)
vanishes, the integral is independent of path and depends only on its end points.
We label the result of the integration F(z), with
$$
F(z_2) - F(z_1) = \int_{z_1}^{z_2} f(z)\,dz. \tag{6.48}
$$
As an identity,
$$
\frac{F(z_2) - F(z_1)}{z_2 - z_1} - f(z_1) = \frac{\int_{z_1}^{z_2} [f(t) - f(z_1)]\,dt}{z_2 - z_1}, \tag{6.49}
$$
using t as another complex variable. Now we take the limit as z2 → z1:
$$
\lim_{z_2 \to z_1} \frac{\int_{z_1}^{z_2} [f(t) - f(z_1)]\,dt}{z_2 - z_1} = 0, \tag{6.50}
$$
since f(t) is continuous.2 Therefore
$$
\lim_{z_2 \to z_1} \frac{F(z_2) - F(z_1)}{z_2 - z_1} = F'(z)\Big|_{z = z_1} = f(z_1) \tag{6.51}
$$
by definition of derivative (Eq. 6.14). We have proved that F'(z) at z = z1 exists
and equals f(z1). Since z1 is any point in R, we see that F(z) is analytic. Then by
Cauchy's integral formula (compare Eq. 6.47) F'(z) = f(z) is also analytic,
proving Morera's theorem.
Drawing once more on our electrostatic analog, we might use f(z) to represent
the electrostatic field E. If the net charge within every closed region in R is zero
(Gauss's law), the charge density is everywhere zero in R. Alternatively, in terms
of the analysis of Section 1.13, f(z) represents a conservative force (by definition
of conservative), and then we find that it is always possible to express it as the
derivative of a potential function F(z).
EXERCISES
6.4.1 Show that
$$
\oint_C (z - z_0)^n\,dz = \begin{cases} 2\pi i, & n = -1, \\ 0, & n \neq -1, \end{cases}
$$
where the contour C encircles the point z = z0 in a positive (counterclockwise)
sense. The exponent n is an integer.
The calculus of residues, Chapter 7, is based on this result.
2 We can quote the mean value theorem of calculus here.
6.4.2 Show that
$$
\frac{1}{2\pi i}\oint z^{m-n-1}\,dz, \qquad m \text{ and } n \text{ integers}
$$
(with the contour encircling the origin once counterclockwise), is a representation
of the Kronecker delta δ_mn.
6.4.3 Solve Exercise 6.3.5 by separating the integrand into partial fractions and then
applying Cauchy's integral theorem for multiply connected regions.
Note. Partial fractions are explained in Section 15.7 in connection with Laplace
transforms.
6.4.4 Evaluate
Г dz
where C is the circle |z| = 2.
6.4.5 Assuming that f(z) is analytic on and within a closed contour C and that the
point z0 is within C, show that
$$
\oint_C \frac{f'(z)}{z - z_0}\,dz = \oint_C \frac{f(z)}{(z - z_0)^2}\,dz.
$$
6.4.6 You know that f(z) is analytic on and within a closed contour C. You suspect
that the nth derivative f^(n)(z0) is given by
$$
f^{(n)}(z_0) = \frac{n!}{2\pi i}\oint_C \frac{f(z)\,dz}{(z - z_0)^{n+1}}.
$$
Using mathematical induction, prove that this expression is correct.
6.4.7 Show that
where R is the radius of a circle centered at z = z0 and M is the maximum value
of |f(z)| on that circle. Assume that f(z) is analytic on and within the circle.
6.4.8 If f(z) is analytic and bounded [|f(z)| < M, a constant] for all z, show that f(z)
must be a constant. This is Liouville's theorem.
6.4.9 Fundamental theorem of algebra. As a corollary of Liouville's theorem, Exercise
6.4.8, show that every polynomial equation,
$$
P(z) = a_0 + a_1 z + \cdots + a_n z^n = 0
$$
has at least one root. Here n > 0 and a_n ≠ 0.
Hint. Consider f(z) = 1/P(z).
Note. Once the preceding result is established we can divide out the root and
repeat the process for the resulting (n - 1)-degree polynomial. This leads to the
conclusion that P(z) has exactly n roots.
6.4.10 (a) A function f(z) is analytic within a closed contour C (and continuous on C).
If f(z) ≠ 0 within C and |f(z)| ≥ M on C, show that
$$
|f(z)| \geq M
$$
for all points within C.
Hint. Consider w(z) = 1/f(z).
(b) If f(z) = 0 within the contour C, show that the foregoing result does not
hold, that it is possible to have |f(z)| = 0 at one or more points in the interior
with |f(z)| > 0 over the entire bounding contour. Cite a specific example of
an analytic function that behaves this way.
6.4.11 Using the Cauchy integral formula for the nth derivative, convert the following
Rodrigues formulas into the corresponding Schlaefli integrals.
(a) Legendre
$$
P_n(x) = \frac{1}{2^n n!}\frac{d^n}{dx^n}(x^2 - 1)^n.
$$
ANS.
$$
P_n(x) = \frac{(-1)^n}{2^n}\cdot\frac{1}{2\pi i}\oint \frac{(1 - z^2)^n}{(z - x)^{n+1}}\,dz.
$$
(b) Hermite
$$
H_n(x) = (-1)^n e^{x^2}\frac{d^n}{dx^n}e^{-x^2}.
$$
(c) Laguerre
$$
L_n(x) = \frac{e^x}{n!}\frac{d^n}{dx^n}\left(x^n e^{-x}\right).
$$
Note. From the Schlaefli integral representations one can develop generating
functions for these special functions. Compare Sections 12.4, 13.1, and 13.2.
6.5 LAURENT EXPANSION
Taylor Expansion
The Cauchy integral formula of the preceding section opens up the way for
another derivation of Taylor's series (Section 5.6), but this time for functions of a
complex variable. Suppose we are trying to expand f(z) about z = z0 and we have
z = z1 as the nearest point on the Argand diagram for which f(z) is not analytic.
We construct a circle C centered at z = z0 with radius |z' - z0| < |z1 - z0| (Fig.
6.12). Since z1 was assumed to be the nearest point at which f(z) was not analytic,
f(z) is necessarily analytic on and within C.
FIG. 6.12
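From Equation 6.43, the Cauchy integral formula,
$$
f(z) = \frac{1}{2\pi i}\oint_C \frac{f(z')\,dz'}{z' - z}
= \frac{1}{2\pi i}\oint_C \frac{f(z')\,dz'}{(z' - z_0) - (z - z_0)}
= \frac{1}{2\pi i}\oint_C \frac{f(z')\,dz'}{(z' - z_0)[1 - (z - z_0)/(z' - z_0)]}. \tag{6.52}
$$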
Here z' is a point on the contour С and z is any point interior to C. It is not quite
rigorously legal to expand the denominator of the integrand in Eq. 6.52 by the
binomial theorem, for we have not yet proved the binomial theorem for complex
variables. Instead, we note the identity
$$
\frac{1}{1 - t} = 1 + t + t^2 + t^3 + \cdots = \sum_{n=0}^{\infty} t^n, \tag{6.53}
$$
which may easily be verified by multiplying both sides by 1 — t. The infinite
series, following the methods of Section 5.2, is convergent for \t\ < 1.
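Now for point z interior to C, |z - z0| < |z' - z0|, and, using Eq. 6.53, Eq.
6.52 becomes
$$
f(z) = \frac{1}{2\pi i}\oint_C \sum_{n=0}^{\infty} \frac{(z - z_0)^n f(z')\,dz'}{(z' - z_0)^{n+1}}. \tag{6.54}
$$
Interchanging the order of integration and summation (valid since Eq. 6.53 is
uniformly convergent for |t| < 1), we obtain
$$
f(z) = \frac{1}{2\pi i}\sum_{n=0}^{\infty} (z - z_0)^n \oint_C \frac{f(z')\,dz'}{(z' - z_0)^{n+1}}. \tag{6.55}
$$
Referring to Eq. 6.47, we get
$$
f(z) = \sum_{n=0}^{\infty} (z - z_0)^n \frac{f^{(n)}(z_0)}{n!}, \tag{6.56}
$$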
which is our desired Taylor expansion. Note that it is based only on the
assumption that f(z) is analytic for |z - z0| < |z1 - z0|. Just as for real variable power
series (Section 5.7), this expansion is unique for a given z0.
From the Taylor expansion for f(z) a binomial theorem may be derived
(Exercise 6.5.2).
Schwarz Reflection Principle
From the binomial expansion of g(z) = (z - x0)^n for integral n it is easy to
see that the complex conjugate of the function is the function of the complex
conjugate,
$$
g^*(z) = [(z - x_0)^n]^* = (z^* - x_0)^n = g(z^*). \tag{6.57}
$$
FIG. 6.13 Schwarz reflection. At the point z: f(z) = u(x, y) + iv(x, y) = f*(z*) = u(x, -y) - iv(x, -y). At the reflected point z*: f(z*) = u(x, -y) + iv(x, -y) = f*(z) = u(x, y) - iv(x, y).
This leads us to the Schwarz reflection principle:
If a function f(z) is (1) analytic over some region including the real axis and (2)
real when z is real, then
$$
f^*(z) = f(z^*) \tag{6.58}
$$
(see Fig. 6.13).
Expanding f(z) about some (nonsingular) point x0 on the real axis,
$$
f(z) = \sum_{n=0}^{\infty} (z - x_0)^n \frac{f^{(n)}(x_0)}{n!} \tag{6.59}
$$
by Eq. 6.56. Since f(z) is analytic at z = x0, this Taylor expansion exists. Since
f(z) is real when z is real, f^(n)(x0) must be real for all n. Then when we use Eq.
6.57, Eq. 6.58, the Schwarz reflection principle, follows immediately. Exercise
6.5.6 is another form of this principle.
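A short numerical illustration of the reflection principle, offered as a sketch rather than as part of the original argument. The helper name `reflection_gap` and the sample point are assumptions made for this example. The function sin z is real on the real axis, so the gap should vanish; f(z) = iz is not real there, so the principle does not apply (compare Exercise 6.5.6).

```python
import cmath


def reflection_gap(f, z):
    """Return |f*(z) - f(z*)|, which is zero when f*(z) = f(z*) holds at z."""
    return abs(f(z).conjugate() - f(z.conjugate()))


z = 0.7 + 1.3j
print(reflection_gap(cmath.sin, z))         # sin z is real for real z
print(reflection_gap(lambda s: 1j * s, z))  # iz is imaginary for real z
```

For f(z) = iz the gap equals 2|z|, since f*(z) = -i z* while f(z*) = i z*.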
Analytic Continuation
In the foregoing discussion we assumed that f(z) has an isolated nonanalytic
or singular point at z = z1 (Fig. 6.12). For a specific example of this behavior
consider
$$
f(z) = \frac{1}{1 + z}, \tag{6.60}
$$
which becomes infinite at z = -1. Therefore f(z) is nonanalytic at z1 = -1, or
z1 = -1 is our singular point. By Eq. 6.56 or the binomial theorem for complex
functions that follows directly from it,
$$
\frac{1}{1 + z} = 1 - z + z^2 - z^3 + \cdots = \sum_{n=0}^{\infty} (-1)^n z^n, \tag{6.61}
$$
convergent for |z| < 1. If we label this circle of convergence C1, Eq. 6.61 holds for
f(z) in the interior of C1, which we label region S1.
FIG. 6.14 Analytic continuation
The situation is that f(z) expanded about the origin holds only in S1 (and on
C1 excluding z1 = -1), but we know from the form of f(z) that it is well defined
and analytic elsewhere in the complex plane outside S1. Analytic continuation
is a process of extending the region in which a function such as the series in Eq.
6.61 is defined. For instance, suppose we expand f(z) about the point z = i. We
have
$$
f(z) = \frac{1}{1 + z} = \frac{1}{1 + i + (z - i)} = \frac{1}{(1 + i)[1 + (z - i)/(1 + i)]}. \tag{6.62}
$$
By Eq. 6.56 again or Eq. 6.62,
$$
f(z) = \frac{1}{1 + i}\left[1 - \frac{z - i}{1 + i} + \frac{(z - i)^2}{(1 + i)^2} - \cdots\right], \tag{6.63}
$$
convergent for |z - i| < |1 + i| = √2. Our circle of convergence is C2 and the
region bounded by C2 is labeled S2 (Fig. 6.14). Now f(z) is defined by the
expansion (Eq. 6.63) for S2, which overlaps S1 and extends out further in the complex
plane.1 This extension is an analytic continuation, and when we have only
isolated singular points to contend with, the function can be extended
indefinitely. Equations 6.60, 6.61, and 6.63 are three different representations of
the same function. Each representation has its own domain of convergence.
Equation 6.61 is a Maclaurin series. Equation 6.63 is a Taylor expansion about
z = i and from the following paragraphs Eq. 6.60 is seen to be a one-term
Laurent series.
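The overlap of the two domains of convergence can be seen numerically. The sketch below is illustrative only: the function names, the number of terms, and the two sample points are assumptions. It sums the Maclaurin series of 1/(1 + z) and the Taylor series about z = i, then compares both with the closed form.

```python
def maclaurin_sum(z, terms=200):
    """Partial sum of 1 - z + z^2 - ..., the expansion about the origin."""
    return sum((-z) ** n for n in range(terms))


def taylor_about_i_sum(z, terms=200):
    """Partial sum of the expansion about z = i:
    (1/(1+i)) * sum over n of (-(z - i)/(1 + i))^n."""
    t = -(z - 1j) / (1 + 1j)
    return sum(t ** n for n in range(terms)) / (1 + 1j)


overlap = 0.3 + 0.4j   # |z| < 1 and |z - i| < sqrt(2): both series converge
extended = 0.5 + 1.5j  # |z| > 1 but |z - i| < sqrt(2): only the second converges

for z in (overlap, extended):
    exact = 1 / (1 + z)
    print(abs(maclaurin_sum(z) - exact), abs(taylor_about_i_sum(z) - exact))
```

At the second point the Maclaurin partial sums grow without bound while the expansion about z = i still reproduces 1/(1 + z), which is the sense in which the second series continues the first.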
Analytic continuation may take many forms and the series expansion just
considered is not necessarily the most convenient technique. As an alternate
technique we shall use a recurrence relation in Section 10.1 to extend the factorial
function around the isolated singular points, z = -n, n = 1, 2, 3, .... As another
example, the hypergeometric equation is satisfied by the hypergeometric
function defined by the series, Eq. 13.114, for |z| < 1. The integral representation
given in Exercise 13.5.8 permits a continuation over the entire complex plane.
Permanence of Algebraic Form
All our elementary functions, e^z, sin z, and so on can be extended into the
complex plane (compare Exercise 6.1.9). For instance, they can be defined by
power-series expansions such as
$$
e^z = 1 + \frac{z}{1!} + \frac{z^2}{2!} + \cdots = \sum_{n=0}^{\infty} \frac{z^n}{n!} \tag{6.64}
$$
for the exponential. Such definitions agree with the real variable definition
along the real x-axis and literally constitute an analytic continuation of the
corresponding real functions into the complex plane. This result is often called
permanence of the algebraic form.
Laurent Series
We frequently encounter functions that are analytic in an annular region, say,
of inner radius r and outer radius R, as shown in Fig. 6.15. Drawing an imaginary
cut line to convert our region into a simply connected region, we apply Cauchy's
integral formula, and for two circles, C2 and C1, centered at z = z0 and with
radii r2 and r1, respectively, where r < r2 < r1 < R, we have2
$$
f(z) = \frac{1}{2\pi i}\oint_{C_1} \frac{f(z')\,dz'}{z' - z} - \frac{1}{2\pi i}\oint_{C_2} \frac{f(z')\,dz'}{z' - z}. \tag{6.65}
$$
1 One of the most powerful and beautiful results of the more abstract theory
of functions of a complex variable is that if two analytic functions coincide
in any region, such as the overlap of S1 and S2, or coincide on any line segment,
they are the same function in the sense that they will coincide everywhere as
long as they are both well defined. In this case the agreement of the expansions
(Eqs. 6.61 and 6.63) over the region common to S1 and S2 would establish
the identity of the functions these expansions represent. Then Eq. 6.63 would
represent an analytic continuation or extension of f(z) into regions not
covered by Eq. 6.61. We could equally well say that f(z) = 1/(1 + z) is itself
an analytic continuation of either of the series given by Eqs. 6.61 and 6.63.
2 We may take r2 arbitrarily close to r and r1 arbitrarily close to R, maximizing
the area enclosed between C1 and C2.
FIG. 6.15 |z' - z0| on C1 > |z - z0|; |z' - z0| on C2 < |z - z0|
Note carefully that in Eq. 6.65 an explicit minus sign has been introduced so that
contour C2 (like C1) is to be traversed in the positive (counterclockwise) sense.
The treatment of Eq. 6.65 now proceeds exactly like that of Eq. 6.62 in the
development of the Taylor series. Each denominator is written as (z' — z0) —
(z — z0) and expanded by the binomial theorem which now follows from the
Taylor series (Eq. 6.56).
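Noting that for C1, |z' - z0| > |z - z0| while for C2, |z' - z0| < |z - z0|, we
find
$$
f(z) = \frac{1}{2\pi i}\sum_{n=0}^{\infty} (z - z_0)^n \oint_{C_1} \frac{f(z')\,dz'}{(z' - z_0)^{n+1}}
+ \frac{1}{2\pi i}\sum_{n=1}^{\infty} (z - z_0)^{-n} \oint_{C_2} (z' - z_0)^{n-1} f(z')\,dz'. \tag{6.66}
$$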
The minus sign of Eq. 6.65 has been absorbed by the binomial expansion.
Labeling the first series S1 and the second S2,
$$
S_1 = \frac{1}{2\pi i}\sum_{n=0}^{\infty} (z - z_0)^n \oint_{C_1} \frac{f(z')\,dz'}{(z' - z_0)^{n+1}}, \tag{6.67}
$$
which is the regular Taylor expansion, convergent for |z - z0| < |z' - z0| = r1,
that is, for all z interior to the larger circle, C1. For the second series in Eq. 6.66
we have
$$
S_2 = \frac{1}{2\pi i}\sum_{n=1}^{\infty} (z - z_0)^{-n} \oint_{C_2} (z' - z_0)^{n-1} f(z')\,dz', \tag{6.68}
$$
convergent for |z - z0| > |z' - z0| = r2, that is, for all z exterior to the smaller
circle C2. Remember, C2 now goes counterclockwise.
These two series may be combined into one series3 (a Laurent series) by
$$
f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n, \tag{6.69}
$$
where
$$
a_n = \frac{1}{2\pi i}\oint_C \frac{f(z')\,dz'}{(z' - z_0)^{n+1}}. \tag{6.70}
$$
Since, in Eq. 6.70, convergence of a binomial expansion is no longer a problem,
C may be any contour within the annular region r < |z - z0| < R encircling z0
once in a counterclockwise sense. If we assume that such an annular region of
convergence does exist, Eq. 6.69 is the Laurent series or Laurent expansion of
f(z).
The use of the cut line (Fig. 6.15) is convenient in converting the annular
region into a simply connected region. Since our function is analytic in this
annular region (and therefore single-valued), the cut line is not essential and,
indeed, does not appear in the final result, Eq. 6.70. In contrast to this, functions
with branch points must have cut lines—Section 7.1.
Laurent series coefficients need not come from evaluation of contour
integrals (which may be very intractable). Other techniques such as ordinary series
expansions may provide the coefficients.
Numerous examples of Laurent series appear in Chapter 7. We limit
ourselves here to one simple example to illustrate the application of Eq. 6.69.
EXAMPLE 6.5.1
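Let f(z) = [z(z - 1)]^(-1). If we choose z0 = 0, then r = 0 and R = 1, f(z)
diverging at z = 1. From Eqs. 6.70 and 6.69
$$
a_n = \frac{1}{2\pi i}\oint \frac{dz'}{(z')^{n+2}(z' - 1)}
= \frac{-1}{2\pi i}\oint \sum_{m=0}^{\infty} (z')^m \frac{dz'}{(z')^{n+2}}. \tag{6.71}
$$
Again, interchanging the order of summation and integration (uniformly
convergent series), we have
$$
a_n = -\frac{1}{2\pi i}\sum_{m=0}^{\infty} \oint \frac{dz'}{(z')^{n+2-m}}. \tag{6.72}
$$
If we employ the polar form, as in Eq. 6.47 (or compare Exercise 6.4.1),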
3 Replace n by -n in S2 and add.
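$$
a_n = -\frac{1}{2\pi i}\sum_{m=0}^{\infty} \oint \frac{r i e^{i\theta}\,d\theta}{r^{n+2-m} e^{i(n+2-m)\theta}}
= -\frac{1}{2\pi i}\cdot 2\pi i \sum_{m=0}^{\infty} \delta_{n+2-m,\,1}. \tag{6.73}
$$
In other words,
$$
a_n = \begin{cases} -1 & \text{for } n \geq -1, \\ 0 & \text{for } n < -1. \end{cases} \tag{6.74}
$$
The Laurent expansion (Eq. 6.69) becomes
$$
\frac{1}{z(z - 1)} = -\frac{1}{z} - 1 - z - z^2 - z^3 - \cdots = -\sum_{n=-1}^{\infty} z^n. \tag{6.75}
$$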
For this simple function the Laurent series can, of course, be obtained by a
direct binomial expansion.
The Laurent series differs from the Taylor series by the obvious feature of
negative powers of (z — z0). For this reason the Laurent series will always
diverge at least at z = z0 and perhaps as far out as some distance r (Fig. 6.15).
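The coefficients found in Example 6.5.1 can be reproduced by evaluating the coefficient integral numerically. This is a minimal sketch under stated assumptions: the name `laurent_coefficient`, the contour radius 0.5 (any radius between 0 and 1 lies in the annulus), and the trapezoidal sampling are choices made for illustration.

```python
import cmath
import math


def laurent_coefficient(f, n, z0=0j, radius=0.5, samples=4000):
    """Estimate a_n = (1 / 2 pi i) * contour integral of f(z') / (z' - z0)^(n+1)
    on the circle |z' - z0| = radius, traversed counterclockwise."""
    total = 0j
    dtheta = 2 * math.pi / samples
    for k in range(samples):
        e = cmath.exp(1j * k * dtheta)
        z = z0 + radius * e
        dz = 1j * radius * e * dtheta
        total += f(z) / (z - z0) ** (n + 1) * dz
    return total / (2j * math.pi)


def example_function(z):
    """f(z) = 1 / [z (z - 1)], the function of Example 6.5.1."""
    return 1 / (z * (z - 1))


for n in range(-3, 4):
    print(n, laurent_coefficient(example_function, n))
```

The printed coefficients should be close to 0 for n < -1 and close to -1 for n ≥ -1, in agreement with the expansion -1/z - 1 - z - z^2 - ....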
EXERCISES
6.5.1 Develop the Taylor expansion of ln(1 + z).
ANS. $\sum_{n=1}^{\infty} (-1)^{n-1} z^n / n$.
6.5.2 Derive the binomial expansion
$$
(1 + z)^m = 1 + mz + \frac{m(m - 1)}{1 \cdot 2} z^2 + \cdots = \sum_{n=0}^{\infty} \binom{m}{n} z^n
$$
for m any real number. The expansion is convergent for |z| < 1.
6.5.3 A function f(z) is analytic on and within the unit circle. Also, |f(z)| ≤ 1 for |z| ≤ 1
and f(0) = 0. Show that |f(z)| ≤ |z| for |z| ≤ 1.
Hint. One approach is to show that f(z)/z is analytic and then express [f(z0)/z0]^n
by the Cauchy integral formula. Finally, consider absolute magnitudes and take
the nth root. This exercise is sometimes called Schwarz's theorem.
6.5.4 If f(z) is a real function of the complex variable z and the Laurent expansion
about the origin, f(z) = Σ a_n z^n, has a_n = 0 for n < -N, show that all of the
coefficients, a_n, are real.
6.5.5 A function f(z) = u(x, y) + iv(x, y) satisfies the conditions for the Schwarz
reflection principle. Show that
(a) u is an even function of y.
(b) v is an odd function of y.
6.5.6 A function f(z) can be expanded in a Laurent series about the origin with the
coefficients a_n real. Show that the complex conjugate of this function of z is the
same function of the complex conjugate of z; that is,
$$
f^*(z) = f(z^*).
$$
Verify this explicitly for
(a) f(z) = z^n, n an integer,
(b) f(z) = sin z.
If f(z) = iz (a1 = i), show that the foregoing statement does not hold.
6.5.7 The function f(z) is analytic in a domain that includes the real axis. When z is
real (z = x), f(x) is pure imaginary.
(a) Show that
$$
f(z^*) = -f^*(z).
$$
(b) For the specific case f(z) = iz, develop the cartesian forms of f(z), f(z*), and
f*(z). Do not quote the general result of part (a).
6.5.8 Develop the first three nonzero terms of the Laurent expansion of
$$
f(z) = (e^z - 1)^{-1}
$$
about the origin. Notice the resemblance to the Bernoulli number generating
function, Eq. 5.144 of Section 5.9.
6.5.9 Prove that the Laurent expansion of a given function about a given point is
unique; that is, if
$$
f(z) = \sum_{n=-N}^{\infty} a_n (z - z_0)^n = \sum_{n=-N}^{\infty} b_n (z - z_0)^n,
$$
show that a_n = b_n for all n.
Hint. Use the Cauchy integral formula.
6.5.10 (a) Develop a Laurent expansion of f(z) = [z(z - 1)]^(-1) about the point z = 1
valid for small values of |z - 1|. Specify the exact range over which your
expansion holds. This is an analytic continuation of Eq. 6.75.
(b) Determine the Laurent expansion of f(z) about z = 1 but for |z - 1| large.
6.5.11 (a) Given f1(z) = ∫_0^∞ e^(-zt) dt (with t real), show that the domain in which f1(z)
exists (and is analytic) is Re(z) > 0.
(b) Show that f2(z) = 1/z equals f1(z) over Re(z) > 0 and is therefore an analytic
continuation of f1(z) over the entire z-plane except for z = 0.
(c) Expand 1/z about the point z = i. You will have f3(z) = Σ_{n=0}^∞ a_n (z - i)^n.
What is the domain of f3(z)?
ANS.
$$
\frac{1}{z} = -i\sum_{n=0}^{\infty} i^n (z - i)^n, \qquad |z - i| < 1.
$$
6.6 MAPPING
In the preceding sections we have defined analytic functions and developed
some of their main features. From these developments the integral relations of
Chapter 7 follow directly. Here we introduce some of the more geometric aspects
of functions of complex variables, aspects that will be useful in visualizing the
integral operations in Chapter 7 and that are valuable in their own right in
solving Laplace's equation in two-dimensional systems.
In ordinary analytic geometry we may take y = f(x) and then plot y versus x.
Our problem here is more complicated, for z is a function of two variables x and
y. We use the notation
$$
w = f(z) = u(x, y) + iv(x, y). \tag{6.76}
$$
Then for a point in the z-plane (specific values for x and y) there may correspond
specific values for u(x,y) and v(x, y) which then yield a point in the w-plane. As
points in the z-plane transform or are mapped into points in the w-plane, lines
or areas in the z-plane will be mapped into lines or areas in the w-plane. Our
immediate purpose is to see how lines and areas map from the z-plane to the
w-plane for a number of simple functions.
Translation
$$
w = z + z_0. \tag{6.77}
$$
The function w is equal to the variable z plus a constant, z0 = x0 + iy0. By Eqs.
6.1 and 6.76
$$
u = x + x_0, \qquad v = y + y_0, \tag{6.78}
$$
representing a pure translation of the coordinate axes as shown in Fig. 6.16.
FIG. 6.16 Translation
Rotation
$$
w = z z_0. \tag{6.79}
$$
Here it is convenient to return to the polar representation, using
$$
w = \rho e^{i\varphi}, \qquad z = r e^{i\theta}, \qquad z_0 = r_0 e^{i\theta_0}, \tag{6.80}
$$
then
$$
\rho e^{i\varphi} = r r_0 e^{i(\theta + \theta_0)} \tag{6.81}
$$
or
$$
\rho = r r_0, \qquad \varphi = \theta + \theta_0. \tag{6.82}
$$
FIG. 6.17 Rotation
Two things have occurred. First, the modulus r has been modified, either
expanded or contracted, by the factor r0. Second, the argument θ has been
increased by the additive constant θ0 (Fig. 6.17). This represents a rotation of
the complex variable through an angle θ0. For the special case of z0 = i, we
have a pure rotation through π/2 radians.
Inversion
$$
w = \frac{1}{z}. \tag{6.83}
$$
Again, using the polar form, we have
$$
\rho e^{i\varphi} = \frac{1}{r e^{i\theta}} = \frac{1}{r} e^{-i\theta}, \tag{6.84}
$$
which shows that
$$
\rho = \frac{1}{r}, \qquad \varphi = -\theta. \tag{6.85}
$$
The first part of Eq. 6.85 shows the inversion clearly. The interior of the unit
circle is mapped onto the exterior and vice versa (Fig. 6.18). In addition, the
second part of Eq. 6.85 shows that the polar angle is reversed in sign. Equation
6.83 therefore also involves a reflection of the y-axis exactly like the complex
conjugate equation.
To see how lines in the z-plane transform into the w-plane, we simply return
to the cartesian form:
$$
u + iv = \frac{1}{x + iy}. \tag{6.86}
$$
FIG. 6.18 Inversion
Rationalizing the right-hand side by multiplying numerator and denominator
by z* and then equating the real parts and the imaginary parts, we have
$$
u = \frac{x}{x^2 + y^2}, \quad x = \frac{u}{u^2 + v^2}, \qquad
v = -\frac{y}{x^2 + y^2}, \quad y = -\frac{v}{u^2 + v^2}. \tag{6.87}
$$
A circle centered at the origin in the z-plane has the form
$$
x^2 + y^2 = r^2 \tag{6.88}
$$
and by Eq. 6.87 transforms into
$$
\frac{u^2}{(u^2 + v^2)^2} + \frac{v^2}{(u^2 + v^2)^2} = r^2. \tag{6.89}
$$
Simplifying Eq. 6.89, we obtain
$$
u^2 + v^2 = \frac{1}{r^2} = \rho^2, \tag{6.90}
$$
which describes a circle in the w-plane also centered at the origin.
The horizontal line y = c1 transforms into
$$
\frac{-v}{u^2 + v^2} = c_1 \tag{6.91}
$$
or
$$
u^2 + \left(v + \frac{1}{2c_1}\right)^2 = \frac{1}{(2c_1)^2}, \tag{6.92}
$$
which describes a circle in the w-plane of radius 1/(2c1) and centered at u = 0,
v = -1/(2c1) (Fig. 6.19).
FIG. 6.19 Inversion, line ↔ circle
The reader may pick up the other three possibilities, x = ±c1, y = -c1, by
rotating the xy-axes. In general, any straight line or circle in the z-plane will
transform into a straight line or a circle in the w-plane (compare Exercise 6.6.1).
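The line-to-circle behavior of the inversion is easy to confirm numerically. The sketch below is illustrative: the names `invert` and `circle_residual`, the value c1 = 0.5, and the sample abscissas are assumptions. It maps points of the horizontal line y = c1 through w = 1/z and evaluates how far each image lies from the circle centered at u = 0, v = -1/(2 c1) with radius 1/(2 c1).

```python
def invert(z):
    """The inversion w = 1/z."""
    return 1 / z


def circle_residual(w, c1):
    """u^2 + (v + 1/(2 c1))^2 - (1/(2 c1))^2, zero when w lies on the image circle."""
    half = 1 / (2 * c1)
    return w.real ** 2 + (w.imag + half) ** 2 - half ** 2


c1 = 0.5
for x in (-10.0, -1.0, 0.0, 2.0, 25.0):
    w = invert(complex(x, c1))
    print(x, w, circle_residual(w, c1))
```

As |x| grows along the line the image approaches w = 0, the point of the circle that corresponds to the point at infinity.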
The three transformations just discussed have all involved one-to-one
correspondence of points in the z-plane to points in the w-plane. Now to illustrate
the variety of transformations that are possible and the problems that can
arise, we introduce first a two-to-one correspondence and then a many-to-one
correspondence. Finally, we take up the inverses of these two transformations.
Consider first the transformation
$$
w = z^2, \tag{6.93}
$$
which leads to
$$
\rho = r^2, \qquad \varphi = 2\theta. \tag{6.94}
$$
Clearly, our transformation is nonlinear, for the modulus is squared, but the
significant feature of Eq. 6.94 is that the phase angle or argument is doubled.
This means that the
first quadrant of z, 0 ≤ θ < π/2 → upper half-plane of w, 0 ≤ φ < π,
upper half-plane of z, 0 ≤ θ < π → whole plane of w, 0 ≤ φ < 2π.
The lower half-plane of z maps into the already covered entire plane of w, thus
covering the w-plane a second time. This is our two-to-one correspondence,
two distinct points in the z-plane, z0 and z0 e^(iπ) = -z0, corresponding to the
single point w = z0^2.
In cartesian representation
$$
u + iv = (x + iy)^2 = x^2 - y^2 + i2xy, \tag{6.95}
$$
leading to
$$
u = x^2 - y^2, \qquad v = 2xy. \tag{6.96}
$$
Hence the lines u = c1, v = c2 in the w-plane correspond to x^2 - y^2 = c1,
2xy = c2, rectangular (and orthogonal) hyperbolas in the z-plane (Fig. 6.20).
To every point on the hyperbola x^2 - y^2 = c1 in the right half-plane, x > 0,
one point on the line u = c1 corresponds and vice versa. However, every point
on the line u = c1 also corresponds to a point on the hyperbola x^2 - y^2 = c1
in the left half-plane, x < 0, as already explained.
FIG. 6.20 Mapping: hyperbolic coordinates (u = c1, v = c2)
It will be shown in Section 6.7 that if lines in the w-plane are orthogonal the
corresponding lines in the z-plane are also orthogonal, as long as the
transformation is analytic. Since u = c1 and v = c2 are constructed perpendicular to
each other, the corresponding hyperbolas in the z-plane are orthogonal. We
have literally constructed a new orthogonal system of hyperbolic lines (or
surfaces if we add an axis perpendicular to x and y). Exercise 2.1.3 was an
analysis of this system. It might be noted that if the hyperbolic lines are electric
or magnetic lines of force, then we have a quadrupole lens useful in focusing
beams of high energy particles.
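The two-to-one character of w = z^2 and the orthogonality of the two families of hyperbolas can be checked with a few lines of code. This is a sketch for illustration; the function names and sample points are assumptions. Orthogonality is tested through the gradients of u = x^2 - y^2 and v = 2xy, which are normal to the curves u = c1 and v = c2.

```python
def square_map(x, y):
    """Return (u, v) for w = z^2 with z = x + iy."""
    w = complex(x, y) ** 2
    return w.real, w.imag


def gradient_dot(x, y):
    """Dot product of grad u = (2x, -2y) and grad v = (2y, 2x)."""
    return (2 * x) * (2 * y) + (-2 * y) * (2 * x)


print(square_map(1.5, -0.5))   # image of z
print(square_map(-1.5, 0.5))   # image of -z, the same point in the w-plane
print(gradient_dot(1.5, -0.5)) # vanishes: the hyperbolas cross at right angles
```

The gradient product vanishes identically, so the curves are orthogonal wherever the gradients are nonzero, that is, everywhere except the origin.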
The transformation
$$
w = e^z \tag{6.97}
$$
leads to
$$
\rho e^{i\varphi} = e^{x + iy} \tag{6.98}
$$
or
$$
\rho = e^x, \qquad \varphi = y. \tag{6.99}
$$
FIG. 6.21 A cut line
If y ranges from 0 ≤ y < 2π (or -π < y ≤ π), then φ covers the same range.
But this is the whole w-plane. In other words, a horizontal strip in the z-plane
of width 2π maps into the entire w-plane. Further, any point x + i(y + 2nπ),
in which n is any integer, maps into the same point (by Eq. 6.99), in the w-plane.
We have a many-(infinitely many)-to-one correspondence.
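A brief numerical illustration of the many-to-one correspondence, given as a sketch with assumed names and sample values. It confirms that the modulus of e^z is e^x, that the argument is y (for y inside the range returned by `cmath.phase`), and that shifting y by multiples of 2π leaves the image unchanged.

```python
import cmath
import math


def exp_map_polar(z):
    """Return (rho, phi) for w = e^z, with phi in (-pi, pi]."""
    w = cmath.exp(z)
    return abs(w), cmath.phase(w)


z = 0.4 + 1.1j
rho, phi = exp_map_polar(z)
print(rho, math.exp(z.real))  # modulus equals e^x
print(phi, z.imag)            # argument equals y
for n in (-2, 1, 3):
    shifted = z + 2j * math.pi * n
    print(n, abs(cmath.exp(shifted) - cmath.exp(z)))  # same image point
```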
The inverse of the fourth transformation (Eq. 6.93) is
$$
w = z^{1/2}. \tag{6.100}
$$
From the relation
$$
\rho e^{i\varphi} = r^{1/2} e^{i\theta/2} \tag{6.101}
$$
and
$$
2\varphi = \theta, \tag{6.102}
$$
we now have two points in the w-plane (arguments φ and φ + π) corresponding
to one point in the z-plane (except for the point z = 0). Or, to put it another way,
θ and θ + 2π correspond to φ and φ + π, two distinct points in the w-plane.
This is the complex variable analog of the simple real variable equation y^2 = x,
in which two values of y, plus and minus, correspond to each value of x.
The important point here is that we can make the function w of Eq. 6.100 a
single-valued function instead of a double-valued function if we agree to restrict
θ to a range such as 0 ≤ θ < 2π. This may be done by agreeing never to cross
the line θ = 0 in the z-plane (Fig. 6.21). Such a line of demarcation is called a
cut line. The point of termination (z = 0, here) in a multivalued function is
known as a branch point. It is a form of a singular point (compare Section 7.1),
f(z) not being analytic at z = 0.
Any line running from z = 0 out to infinity would serve equally well. The
purpose of the cut line is to restrict the argument of z. The points z0 and z0 e^(2πi)
coincide in the z-plane but yield different points w and w e^(iπ) = -w in the w-plane.
Hence in the absence of a cut line the function w = z^(1/2) is ambiguous.
We shall encounter branch points and cut lines frequently in Chapter 7.
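The effect of the cut line can be made concrete in code. The following sketch is an illustration, with the function name `sqrt_cut_positive_real` an assumption made here. It implements the branch with 0 ≤ θ < 2π, so the cut lies along the positive real axis as in the text. Python's own `cmath.sqrt` uses a different convention, with its cut along the negative real axis, so the two branches differ in sign in the lower half-plane.

```python
import cmath
import math


def sqrt_cut_positive_real(z):
    """Branch of z**(1/2) with the argument of z restricted to 0 <= theta < 2*pi."""
    theta = cmath.phase(z)        # cmath.phase returns a value in (-pi, pi]
    if theta < 0:
        theta += 2 * math.pi      # move the argument into [0, 2*pi)
    return math.sqrt(abs(z)) * cmath.exp(1j * theta / 2)


above = sqrt_cut_positive_real(1 + 1e-12j)  # just above the cut
below = sqrt_cut_positive_real(1 - 1e-12j)  # just below the cut
print(above, below)                         # nearly +1 and nearly -1
print(sqrt_cut_positive_real(-1 - 1e-12j), cmath.sqrt(-1 - 1e-12j))
```

The jump from +1 to -1 across the positive real axis is the discontinuity that the cut line records, and squaring either value returns the original z.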
Finally, as the inverse of the fifth transformation (Eq. 6.97), we have
$$
w = \ln z. \tag{6.103}
$$
By expanding it, we obtain
$$
u + iv = \ln r e^{i\theta} = \ln r + i\theta. \tag{6.104}
$$
For a given point z0 in the z-plane the argument θ is unspecified within an
integral multiple of 2π. This means that
$$
v = \theta + 2n\pi, \tag{6.105}
$$
and as in the exponential transformation, we have an infinitely many-to-one
correspondence.
FIG. 6.22 ln z, a multivalued function
Equation 6.103 has a nice physical representation. If we go around the unit
circle in the z-plane, r = 1, and by Eq. 6.104 u = ln r = 0; but v = θ, and θ is
steadily increasing and continues to increase as θ continues, past 2π. The
behavior in the w-plane as we go around and around the unit circle in the z-plane
is like the advance of a screw as it is rotated or the ascent of a person walking
up a spiral staircase (Fig. 6.22).
As in the preceding example, we make the correspondence unique (and
Eq. 6.103 unambiguous) by restricting θ to a range such as 0 ≤ θ < 2π by
taking the line θ = 0 (positive real axis) as a cut line. This is equivalent to
taking one and only one complete turn of the spiral staircase.
It is because of the multivalued nature of ln z that the contour integral
$$
\oint \frac{dz}{z} = 2\pi i \neq 0,
$$
integrating about the origin. This property appears in Exercises 6.4.1 and 6.4.2
and is the basis for the entire calculus of residues (Chapter 7).
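The spiral-staircase picture can be imitated numerically by following ln z continuously around the unit circle. The sketch below is an illustration under assumptions: the function name `continued_log` and the step count are choices made here. Each small step adds the principal value of the logarithm of the ratio of successive points, which is the continuous increment as long as the steps are short.

```python
import cmath
import math


def continued_log(turns, samples_per_turn=360):
    """Follow w = ln z continuously around the unit circle, starting at z = 1, w = 0."""
    w = 0j
    z_prev = 1 + 0j
    for k in range(1, turns * samples_per_turn + 1):
        z = cmath.exp(2j * math.pi * k / samples_per_turn)
        w += cmath.log(z / z_prev)  # small step, so no branch jump occurs
        z_prev = z
    return w


print(continued_log(1))                     # close to 2*pi*i after one turn
print(continued_log(3))                     # close to 6*pi*i after three turns
print(cmath.log(cmath.exp(2j * math.pi)))   # principal value, back near 0
```

The accumulated change in w over one turn is the value of the contour integral of dz/z about the origin, whereas the principal value returned by `cmath.log` stays on a single turn of the staircase.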
The concept of mapping is a very broad and useful one in mathematics.
Our mapping from a complex z-plane to a complex w-plane is a simple
generalization of one definition of function: a mapping of x (from one set) into y in a
second set. A more sophisticated form of mapping appears in Section 8.7 where
we use the Dirac delta function δ(x - a) to map a function f(x) into its value
at the point a. Then in Chapter 15 integral transforms are used to map one
function f(x) in x-space into a second (related) function F(t) in t-space.
FIG. 6.23 Bessel function integration contour
EXERCISES
6.6.1 How do circles centered on the origin in the z-plane transform for
(a) w1(z) = z + 1/z,
(b) w2(z) = z - 1/z,
for z ≠ 0?
What happens when |z| → 1?
6.6.2 What part of the z-plane corresponds to the interior of the unit circle in the w-plane
if
(a) w = (z - 1)/(z + 1),
(b) w = (z - i)/(z + i)?
6.6.3 Discuss the transformations
(a) w(z) = sin z, (c) w(z) = sinhz,
(b) w(z) = cos z, (d) w(z) = cosh z.
Show how the lines x = c1, y = c2 map into the w-plane. Note that the last three
transformations can be obtained from the first one by appropriate translation
and/or rotation.
6.6.4 Show that the function
$$
w(z) = (z^2 - 1)^{1/2}
$$
is single-valued if we take -1 ≤ x ≤ 1, y = 0 as a cut line.
6.6.5 Show that negative numbers have logarithms in the complex plane. In particular,
find ln(-1). ANS. ln(-1) = iπ.
6.6.6 An integral representation of the Bessel function follows the contour in the t-plane
shown in Fig. 6.23. Map this contour into the θ-plane with t = e^θ. Many additional
examples of mapping are given in Chapters 11, 12, and 13.
6.7 CONFORMAL MAPPING
In Section 6.6 hyperbolas were mapped into straight lines and straight lines
were mapped into circles. Yet in all these transformations one feature stayed
constant. This constancy was a result of the fact that all the transformations of
Section 6.6 were analytic.
FIG. 6.24 Conformal mapping: preservation of angles (z-plane and w-plane)
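As long as w = f(z) is an analytic function, we have
$$
\frac{df}{dz} = \frac{dw}{dz} = \lim_{\Delta z \to 0} \frac{\Delta w}{\Delta z}. \tag{6.106}
$$
Assuming that this equation is in polar form, we may equate modulus to modulus
and argument to argument. For the latter (assuming that df/dz ≠ 0)
$$
\arg \lim_{\Delta z \to 0} \frac{\Delta w}{\Delta z}
= \lim_{\Delta z \to 0} \arg \frac{\Delta w}{\Delta z}
= \lim_{\Delta z \to 0} \arg \Delta w - \lim_{\Delta z \to 0} \arg \Delta z
= \arg \frac{df}{dz} = \alpha, \tag{6.107}
$$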
where α, the argument of the derivative, may depend on z but is a constant for a
fixed z, independent of the direction of approach. To see the significance of this,
consider two curves, Cz in the z-plane and the corresponding curve Cw in the
w-plane (Fig. 6.24). The increment Δz is shown at an angle of θ relative to the
real (x) axis, whereas the corresponding increment Δw forms an angle of φ
with the real (u) axis. From Eq. 6.107
$$
\varphi = \theta + \alpha, \tag{6.108}
$$
or any line in the z-plane is rotated through an angle α in the w-plane as long
as w is an analytic transformation and the derivative is not zero.1
Since this result holds for any line through z0, it will hold for a pair of lines.
Then for the angle between these two lines
$$
\varphi_2 - \varphi_1 = (\theta_2 + \alpha) - (\theta_1 + \alpha) = \theta_2 - \theta_1, \tag{6.109}
$$
which shows that the included angle is preserved under an analytic
transformation. Such angle-preserving transformations are called conformal. The
rotation angle α will, in general, depend on z. In addition |f'(z)| will usually
be a function of z.
1 If df/dz = 0, its argument or phase is undefined and the (analytic)
transformation will not necessarily preserve angles.
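Angle preservation, and its failure where the derivative vanishes, can be observed with finite differences. This is a minimal sketch with assumed names and step size, not a procedure from the text. It measures the angle between the images of two short segments leaving a point z0 under w = z^2, once at z0 = 1 + i, where dw/dz ≠ 0, and once at z0 = 0, where dw/dz = 0.

```python
import cmath
import math


def image_angle(f, z0, dir1, dir2, h=1e-6):
    """Angle from the image of a short step along dir1 to the image of a short
    step along dir2, both starting at z0 and mapped by f."""
    dw1 = f(z0 + h * dir1) - f(z0)
    dw2 = f(z0 + h * dir2) - f(z0)
    return cmath.phase(dw2 / dw1)


def square(z):
    return z * z


d1 = 1 + 0j
d2 = cmath.exp(1j * math.pi / 3)         # the two directions differ by 60 degrees
print(image_angle(square, 1 + 1j, d1, d2), math.pi / 3)  # angle preserved
print(image_angle(square, 0j, d1, d2), 2 * math.pi / 3)  # angle doubled at z0 = 0
```

The doubling at the origin is the n = 2 case of the behavior described in Exercise 6.7.1.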
Historically, these conformal transformations have been of great importance
to scientists and engineers in solving Laplace's equation for problems of
electrostatics, hydrodynamics, heat flow, and so on. Unfortunately, the conformal
transformation approach, however elegant, is limited to problems that can be
reduced to two dimensions. The method is often beautiful if there is a high degree
of symmetry present but often impossible if the symmetry is broken or absent.
Because of these limitations and primarily because high-speed electronic
computers offer a useful alternative (iterative solution of the partial differential
equation), the details and applications of conformal mapping are omitted.
EXERCISES
6.7.1 Expand w(z) in a Taylor series about the point z = z0 where f'(z0) = 0. (Angles
not preserved.) Show that if the first n - 1 derivatives vanish but f^(n)(z0) ≠ 0, then
angles in the z-plane with vertices at z = z0 appear in the w-plane multiplied by n.
6.7.2 Develop the transformations that create each of the four cylindrical coordinate
systems:
(a) Circular cylindrical: x = ρ cos φ, y = ρ sin φ.
(b) Elliptic cylindrical: x = a cosh u cos v, y = a sinh u sin v.
(c) Parabolic cylindrical: x = ξη, y = (1/2)(η^2 - ξ^2).
(d) Bipolar: x = a sinh η / (cosh η - cos ξ), y = a sin ξ / (cosh η - cos ξ).
Note. These transformations are not necessarily analytic.
6.7.3 In the transformation
$$
e^z = \frac{a - w}{a + w},
$$
how do the coordinate lines in the z-plane transform? What coordinate system
have you constructed?
7 FUNCTIONS OF A COMPLEX VARIABLE II: CALCULUS OF RESIDUES
7.1 SINGULARITIES
In this chapter we return to the line of analysis that started with the Cauchy-
Riemann conditions in Chapter 6 and led on through the Laurent expansion
(Section 6.5). The Laurent expansion represents a generalization of the Taylor
series in the presence of singularities. We define the point $z_0$ as an isolated
singular point of the function $f(z)$ if $f(z)$ is not analytic at $z = z_0$ but is analytic
at neighboring points. A function that is analytic throughout the entire finite
complex plane except for isolated poles is called meromorphic.
Poles
In the Laurent expansion of $f(z)$ about $z_0$,
$$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n, \tag{7.1}$$
if $a_n = 0$ for $n < -m < 0$ and $a_{-m} \neq 0$, we say that $z_0$ is a pole of order $m$.
For instance, if $m = 1$, that is, if $a_{-1}/(z - z_0)$ is the first nonvanishing term in
the Laurent series, we have a pole of order one, often called a simple pole.
If, on the other hand, the summation continues to $n = -\infty$, then $z_0$ is a pole
of infinite order and is called an essential singularity. These essential singularities
have many pathological features. For instance, we can show that in any small
neighborhood of an essential singularity of $f(z)$ the function $f(z)$ comes
arbitrarily close to any (and therefore every) preselected complex quantity $w_0$.$^1$
Literally, the entire $w$-plane is mapped into the neighborhood of the point $z_0$.
One point of fundamental difference between a pole of finite order and an
essential singularity is that a pole of order m can be removed by multiplying
$f(z)$ by $(z - z_0)^m$. This obviously cannot be done for an essential singularity.
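$^1$ This theorem is due to Picard. A proof is given by E. C. Titchmarsh, The
Theory of Functions, 2nd ed. New York: Oxford University Press (1939).
The behavior of $f(z)$ as $z \to \infty$ is defined in terms of the behavior of $f(1/t)$
as $t \to 0$. Consider the function
$$\sin z = \sum_{n=0}^{\infty} \frac{(-1)^n z^{2n+1}}{(2n+1)!}. \tag{7.2}$$
As $z \to \infty$, we replace the $z$ by $1/t$ to obtain
$$\sin\frac{1}{t} = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!\,t^{2n+1}}. \tag{7.3}$$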
Clearly, from the definition, $\sin z$ has an essential singularity at infinity. This
result could be anticipated from Exercise 6.1.9 since
$$\sin z = \sin iy = i\sinh y, \qquad \text{when } x = 0,$$
which approaches infinity exponentially as $y \to \infty$.
Branch Points
There is another sort of singularity that will be important in the later sections
of this chapter. Consider
$$f(z) = z^a,$$
in which $a$ is not an integer.$^2$ As $z$ moves around the unit circle from $e^{0i}$ to $e^{2\pi i}$,
$$f(z) \to e^{2\pi a i} \neq e^{0i},$$
for nonintegral $a$. As in Section 6.6, we have a branch point. The points $e^{0i}$
and $e^{2\pi i}$ in the $z$-plane coincide but these coincident points lead to different
values of $f(z)$; that is, $f(z)$ is a multivalued function. The problem is resolved
by constructing a cut line so that $f(z)$ will be uniquely specified for a given point
in the $z$-plane.
Note carefully that a function with a branch point and a required cut line
will not be continuous across the cut line. In general, there will be a phase
difference on opposite sides of this cut line. Hence line integrals on opposite
sides of this branch point cut line will not generally cancel each other. Numerous
examples of this appear in the exercises.
The cut line used to convert a multiply connected region into a simply
connected region (Section 6.3) is completely different. Our function is continuous
across the cut line, and no phase difference exists.
EXAMPLE 7.1.1
Consider the function
$$f(z) = (z^2 - 1)^{1/2} = (z + 1)^{1/2}(z - 1)^{1/2}. \tag{7.4}$$
$^2$ $z = 0$ is technically a singular point, for $z^a$ has only a finite number of
derivatives, whereas an analytic function is guaranteed an infinite number of
derivatives (Section 6.4). The problem is that $f(z)$ is not single-valued as we encircle
the origin. The Cauchy integral formula may not be applied.
FIG. 7.1
The first factor on the right-hand side, $(z + 1)^{1/2}$, has a branch point at $z = -1$.
The second factor has a branch point at $z = +1$. To check on the possibility
of taking the line segment joining $z = +1$ and $z = -1$ as a cut line, let us
follow the phases of these two factors as we move along the contour shown in
Fig. 7.1.
For convenience in following the changes of phase let $z + 1 = re^{i\theta}$ and
$z - 1 = \rho e^{i\varphi}$. Then the phase of $f(z)$ is $(\theta + \varphi)/2$. We start at point 1 where
both $z + 1$ and $z - 1$ have a phase of zero. Moving from point 1 to point 2, $\varphi$,
the phase of $z - 1 = \rho e^{i\varphi}$, increases by $\pi$. ($z - 1$ becomes negative.) $\varphi$ then
stays constant until the circle is completed, moving from 6 to 7. $\theta$, the phase
of $z + 1 = re^{i\theta}$, shows a similar behavior, increasing by $2\pi$ as we move from
3 to 5. The phase of the function $f(z) = (z + 1)^{1/2}(z - 1)^{1/2} = r^{1/2}\rho^{1/2}e^{i(\theta + \varphi)/2}$ is
$(\theta + \varphi)/2$. This is tabulated in the final column of Table 7.1.
TABLE 7.1 Phase Angle
Point    $\theta$    $\varphi$    $(\theta + \varphi)/2$
1        $0$         $0$          $0$
2        $0$         $\pi$        $\pi/2$
3        $0$         $\pi$        $\pi/2$
4        $\pi$       $\pi$        $\pi$
5        $2\pi$      $\pi$        $3\pi/2$
6        $2\pi$      $\pi$        $3\pi/2$
7        $2\pi$      $2\pi$       $2\pi$
Two features emerge:
1. The phase at points 5 and 6 is not the same as the
phase at points 2 and 3. This behavior can be expected
at a branch point cut line.
2. The phase at point 7 exceeds that at point 1 by $2\pi$
and the function $f(z) = (z^2 - 1)^{1/2}$ is therefore
single-valued for the contour shown, encircling both branch
points.
If we take the x-axis — 1 < x < 1 as a cut line, f(z) is uniquely specified.
Alternatively, the positive x-axis for x > 1 and the negative x-axis for x < — 1
may be taken as cut lines. The branch points cannot be encircled and the
function remains single-valued.
Generalizing from this example, we have that the phase of a function
$$f(z) = f_1(z) \cdot f_2(z) \cdot f_3(z) \cdots$$
is the algebraic sum of the phase of its individual factors:
$$\arg f(z) = \arg f_1(z) + \arg f_2(z) + \arg f_3(z) + \cdots.$$
The phase of an individual factor may be taken as the arctangent of the ratio
of its imaginary part to its real part,
$$\arg f_i(z) = \tan^{-1}(v_i/u_i).$$
For the case of a factor of the form
$$f_i(z) = (z - z_0)$$
the phase corresponds to the phase angle of a two-dimensional vector from
$+z_0$ to $z$, the phase increasing by $2\pi$ as the point $+z_0$ is encircled. Conversely,
the traversal of any closed loop not encircling $z_0$ does not change the phase of
$z - z_0$.
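The phase bookkeeping of Example 7.1.1 can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names `tracked_phase` and `circle`, the circular paths, and the sample count are choices made here, not part of the text. It accumulates the continuous phase of each factor $z - z_0$ along a closed loop and halves the sum, as the phase of $(z^2 - 1)^{1/2}$ requires.

```python
import cmath
import math

def tracked_phase(points, z0):
    """Accumulate the continuous phase change of (z - z0) along a sampled path."""
    total = 0.0
    prev = cmath.phase(points[0] - z0)
    for z in points[1:]:
        cur = cmath.phase(z - z0)
        step = cur - prev
        # Undo the 2*pi jump of the principal value across the negative real axis.
        if step > math.pi:
            step -= 2 * math.pi
        elif step < -math.pi:
            step += 2 * math.pi
        total += step
        prev = cur
    return total

def circle(center, radius, n=2000):
    """Closed counterclockwise circle; the first point is repeated at the end."""
    return [center + radius * cmath.exp(2j * math.pi * k / n) for k in range(n + 1)]

# Loop enclosing both branch points z = -1 and z = +1.
both = circle(0.0, 2.0)
phase_both = 0.5 * (tracked_phase(both, -1.0) + tracked_phase(both, 1.0))

# Loop enclosing only z = +1.
one = circle(1.0, 0.5)
phase_one = 0.5 * (tracked_phase(one, -1.0) + tracked_phase(one, 1.0))

print(phase_both / math.pi, phase_one / math.pi)
```

A loop around both branch points changes the phase of $f(z)$ by $2\pi$, so the function returns to its starting value. A loop around a single branch point changes it by $\pi$, so the function changes sign.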
As a final note on singularities, Liouville's theorem (Exercise 6.4.8) states
"A function that is everywhere finite (bounded) and analytic must be a constant."
This is readily proved by the use of Cauchy's integral formula. Conversely,
the slightest deviation of an analytic function from a constant value implies
that there must be at least one singularity somewhere in the infinite complex
plane. Apart from the trivial constant functions, then, singularities are a fact
of life, and we must learn to live with them. But we shall do more than that. We
shall use singularities to develop the powerful and useful calculus of residues.
EXERCISES
7.1.1 The function $f(z)$ expanded in a Laurent series exhibits a pole of order $m$ at $z = z_0$.
Show that the coefficient of $(z - z_0)^{-1}$, $a_{-1}$, is given by
$$a_{-1} = \frac{1}{(m-1)!}\,\frac{d^{m-1}}{dz^{m-1}}\left[(z - z_0)^m f(z)\right]_{z = z_0},$$
with
$$a_{-1} = \left[(z - z_0) f(z)\right]_{z = z_0},$$
when the pole is a simple pole ($m = 1$). These equations for $a_{-1}$ are extremely
useful in determining the residue to be used in the residue theorem of the next
section.
Hint. The technique that was so successful in proving the uniqueness of power
series, Section 5.7, will work here also.
7.1.2 A function $f(z)$ can be represented by
$$f(z) = \frac{f_1(z)}{f_2(z)},$$
in which $f_1(z)$ and $f_2(z)$ are analytic. The denominator $f_2(z)$ vanishes at $z = z_0$,
showing that $f(z)$ has a pole at $z = z_0$. However, $f_1(z_0) \neq 0$, $f_2'(z_0) \neq 0$. Show that
$a_{-1}$, the coefficient of $(z - z_0)^{-1}$ in a Laurent expansion of $f(z)$ at $z = z_0$, is given
by
$$a_{-1} = \frac{f_1(z_0)}{f_2'(z_0)}.$$
This result leads to the Heaviside expansion theorem, Section 15.12.
7.1.3 In analogy with Example 7.1.1 consider in detail the phase of each factor and the
resultant overall phase of $f(z) = (z^2 + 1)^{1/2}$ following a contour similar to that of
Fig. 7.1, but encircling the new branch points.
7.1.4 The Legendre function of the second kind, $Q_\nu(z)$, has branch points at $z = \pm 1$. The
branch points are joined by a cut line along the real ($x$) axis.
(a) Show that $Q_0(z) = \tfrac{1}{2}\ln((z + 1)/(z - 1))$ is single-valued (with the real axis
$-1 \le x \le 1$ taken as a cut line).
(b) For real argument $x$ and $|x| < 1$ it is convenient to take
$$Q_0(x) = \tfrac{1}{2}\ln\frac{1 + x}{1 - x}.$$
Show that
$$Q_0(x) = \tfrac{1}{2}\left[Q_0(x + i0) + Q_0(x - i0)\right].$$
Here $x + i0$ indicates $z$ approaches the real axis from above, $x - i0$ indicates
an approach from below.
7.1.5 As an example of an essential singularity consider $e^{1/z}$ as $z$ approaches zero. For
any complex number $z_0$, $z_0 \neq 0$, show that
$$e^{1/z} = z_0$$
has an infinite number of solutions.
7.2 CALCULUS OF RESIDUES
Residue Theorem
If the Laurent expansion of a function $f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n$ is integrated
term by term by using a closed contour that encircles one isolated singular
point $z_0$ once in a counterclockwise sense, we obtain (Exercise 6.4.1)
$$a_n \oint (z - z_0)^n\,dz = a_n \left.\frac{(z - z_0)^{n+1}}{n + 1}\right|_{z_1}^{z_1} = 0 \quad \text{for all } n \neq -1. \tag{7.5}$$
However, if $n = -1$,
$$a_{-1} \oint (z - z_0)^{-1}\,dz = a_{-1} \oint \frac{i r e^{i\theta}\,d\theta}{r e^{i\theta}} = 2\pi i\,a_{-1}. \tag{7.6}$$
Summarizing Eqs. 7.5 and 7.6, we have
$$\frac{1}{2\pi i}\oint f(z)\,dz = a_{-1}. \tag{7.7}$$
FIG. 7.2 Excluding isolated singularities
The constant $a_{-1}$, the coefficient of $(z - z_0)^{-1}$ in the Laurent expansion, is called
the residue of $f(z)$ at $z = z_0$.
A set of isolated singularities can be handled very nicely by deforming our
contour as shown in Fig. 7.2. Cauchy's integral theorem (Section 6.3) leads to
$$\oint_C f(z)\,dz + \oint_{C_0} f(z)\,dz + \oint_{C_1} f(z)\,dz + \oint_{C_2} f(z)\,dz + \cdots = 0. \tag{7.8}$$
The circular integral around any given singular point is given by Eq. 7.7,
$$\oint_{C_i} f(z)\,dz = -2\pi i\,a_{-1z_i}, \tag{7.9}$$
assuming a Laurent expansion about the singular point, $z = z_i$. The negative
sign comes from the clockwise integration as shown in Fig. 7.2. Combining
Eqs. 7.8 and 7.9, we have
$$\oint_C f(z)\,dz = 2\pi i\,(a_{-1z_0} + a_{-1z_1} + a_{-1z_2} + \cdots)$$
$$= 2\pi i\ (\text{sum of enclosed residues}). \tag{7.10}$$
This is the residue theorem. The problem of evaluating one or more contour
integrals is replaced by the algebraic problem of computing residues at the
enclosed singular points.
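The residue theorem is easy to test numerically. The following is a minimal sketch, not a general-purpose integrator: the test function $e^z/[z(z - 1)]$, the helper name `contour_integral`, and the number of sample points are all assumptions chosen for illustration. The function has simple poles at $z = 0$ (residue $-1$) and $z = 1$ (residue $e$), and the trapezoidal rule on a circle converges rapidly for such periodic integrands.

```python
import cmath
import math

def contour_integral(f, center, radius, n=4000):
    """Trapezoidal approximation of the counterclockwise integral of f on a circle."""
    total = 0j
    dtheta = 2 * math.pi / n
    for k in range(n):
        phase = cmath.exp(1j * k * dtheta)
        z = center + radius * phase
        dz = 1j * radius * phase * dtheta  # dz = i r e^{i theta} d theta
        total += f(z) * dz
    return total

def f(z):
    # Illustrative integrand: simple poles at z = 0 (residue -1) and z = 1 (residue e).
    return cmath.exp(z) / (z * (z - 1))

big = contour_integral(f, 0, 2.0)    # encloses both poles
small = contour_integral(f, 0, 0.5)  # encloses only z = 0

print(big, 2j * math.pi * (math.e - 1))
print(small, -2j * math.pi)
```

Each numerical value agrees with $2\pi i$ times the sum of the residues enclosed by the corresponding circle, as Eq. 7.10 predicts.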
We first use this residue theorem to develop the concept of the Cauchy
principal value. Then in the remainder of this section we apply the residue
theorem to a wide variety of definite integrals of mathematical and physical
interest. In Section 7.3 the concept of Cauchy principal value is used to obtain
the important dispersion relations. The residue theorem will also be needed in
Chapter 16 for a variety of integral transforms, particularly the inverse Laplace
transform.
Cauchy Principal Value
Occasionally an isolated first-order pole will be directly on the contour of
integration. In this case we may deform the contour to include or exclude the
residue as desired by including a semicircular detour of infinitesimal radius.
FIG. 7.3 By-passing singular points
This is shown in Fig. 7.3. The integration over the semicircle then gives
$$\int f(z)\,dz = \begin{cases} \pi i\,a_{-1}, & \text{if counterclockwise,} \\ -\pi i\,a_{-1}, & \text{if clockwise.} \end{cases}$$
This contribution, $+$ or $-$, appears on the left-hand side of Eq. 7.10. If our
detour were clockwise, the residue would not be enclosed and there would be
no corresponding term on the right-hand side of Eq. 7.10. However, if our
detour were counterclockwise, this residue would be enclosed by the contour
$C$ and a term $2\pi i\,a_{-1}$ would appear on the right-hand side of Eq. 7.10. The net
result for either clockwise or counterclockwise detour is that a simple pole on
the contour is counted as one half what it would be if it were within the contour.
This corresponds to taking the Cauchy principal value.
FIG. 7.4 Closing the contour with an infinite-radius semicircle
For instance, let us suppose that $f(z)$ with a simple pole at $z = x_0$ is integrated
over the entire real axis. The contour is closed with an infinite semicircle in the
upper half-plane (Fig. 7.4). Then
$$\oint f(z)\,dz = \int_{-\infty}^{x_0 - \delta} f(x)\,dx + \int_{C_{x_0}} f(z)\,dz + \int_{x_0 + \delta}^{\infty} f(x)\,dx + \int_{C\ \text{infinite semicircle}} f(z)\,dz$$
$$= 2\pi i \sum \text{enclosed residues}. \tag{7.11}$$
If the small semicircle $C_{x_0}$ includes $x_0$ (by going below the $x$-axis,
counterclockwise), $x_0$ is enclosed, and its contribution appears twice (as $\pi i\,a_{-1}$ in $\int_{C_{x_0}}$
and as $2\pi i\,a_{-1}$ in the term $2\pi i \sum$ enclosed residues) for a net contribution of
$\pi i\,a_{-1}$. If the upper small semicircle is elected, $x_0$ is excluded. The only
contribution is from the clockwise integration over $C_{x_0}$, which yields $-\pi i\,a_{-1}$. Moving
this to the extreme right of Eq. 7.11, we have $+\pi i\,a_{-1}$, as before.
The integrals along the $x$-axis may be combined and the semicircle radius
permitted to approach zero. We have
$$\lim_{\delta \to 0}\left\{\int_{-\infty}^{x_0 - \delta} f(x)\,dx + \int_{x_0 + \delta}^{\infty} f(x)\,dx\right\} = P\int_{-\infty}^{\infty} f(x)\,dx. \tag{7.12}$$
FIG. 7.5
$P$ indicates the Cauchy principal value and represents the preceding limiting
process. Note carefully that the Cauchy principal value is a balancing or
canceling process. In the vicinity of our singularity at $z = x_0$,
$$f(x) \approx \frac{a_{-1}}{x - x_0}. \tag{7.13}$$
This is odd, relative to $x_0$. The symmetric or even interval (relative to $x_0$)
provides cancellation of the shaded areas, Fig. 7.5. The contribution of the
singularity is in the integration about the semicircle.
Sometimes, this same limiting technique is applied to the integration limits
$\pm\infty$. We may define
$$P\int_{-\infty}^{\infty} f(x)\,dx = \lim_{a \to \infty}\int_{-a}^{a} f(x)\,dx. \tag{7.14}$$
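The balancing character of the principal value can be made concrete. The sketch below rests on assumptions made for illustration only: the integrand $e^x/x$ on $[-1, 1]$, the helper name `principal_value_symmetric`, and the midpoint rule. Pairing the points $x_0 \pm u$ turns the singular integrand into the regular one $[f(x_0 + u) - f(x_0 - u)]/u$, which is the cancellation of the shaded areas described above.

```python
import math

def principal_value_symmetric(f, x0, h, n=2000):
    """Principal value of the integral of f(x)/(x - x0) over [x0 - h, x0 + h].

    Points symmetric about x0 are paired, so the odd 1/(x - x0) part cancels.
    The midpoint rule never evaluates the integrand at u = 0.
    """
    du = h / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * du
        total += (f(x0 + u) - f(x0 - u)) / u * du
    return total

pv = principal_value_symmetric(math.exp, 0.0, 1.0)

# Independent check: the paired integrand is 2*sinh(u)/u, integrated term by term.
series = 2.0 * sum(1.0 / ((2 * k + 1) * math.factorial(2 * k + 1)) for k in range(12))

print(pv, series)
```

The two numbers agree closely, although the ordinary integral of $e^x/x$ across $x = 0$ diverges. An unsymmetric exclusion interval about $x_0$ would give a different value, which is why the symmetric limit is part of the definition.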
An alternate treatment moves the pole off the contour and then considers
the limiting behavior as it is brought back. This technique is illustrated in
Example 7.2.4, in which the singular points are moved off the contour in such
a way that the solution is forced into the form desired to satisfy the boundary
conditions of the physical problem.
Evaluation of Definite Integrals
Definite integrals appear repeatedly in problems of mathematical physics as
well as in pure mathematics. Three moderately general techniques are useful in
evaluating definite integrals: (1) contour integration, (2) conversion to gamma
or beta functions (Chapter 10), and (3) numerical quadrature (Appendix A2).
Other approaches include series expansion with term-by-term integration and
integral transforms. As will be seen subsequently, the method of contour
integration is perhaps the most versatile of these methods, since it is applicable
to a wide variety of integrals.
Evaluation of Definite Integrals: $\int_0^{2\pi} f(\sin\theta, \cos\theta)\,d\theta$
The calculus of residues is useful in evaluating a wide variety of definite
integrals in both physical and purely mathematical problems. We consider,
first, integrals of the form
$$I = \int_0^{2\pi} f(\sin\theta, \cos\theta)\,d\theta, \tag{7.15}$$
where $f$ is finite for all values of $\theta$. We also require $f$ to be a rational function
of $\sin\theta$ and $\cos\theta$ so that it will be single-valued. Let
$$z = e^{i\theta}, \qquad dz = ie^{i\theta}\,d\theta.$$
From this,
$$d\theta = -i\,\frac{dz}{z}, \qquad \sin\theta = \frac{z - z^{-1}}{2i}, \qquad \cos\theta = \frac{z + z^{-1}}{2}. \tag{7.16}$$
Our integral becomes
$$I = -i\oint f\!\left(\frac{z - z^{-1}}{2i}, \frac{z + z^{-1}}{2}\right)\frac{dz}{z}, \tag{7.17}$$
with the path of integration the unit circle. By the residue theorem, Eq. 7.10,
$$I = (-i)\,2\pi i \sum \text{residues within the unit circle}. \tag{7.18}$$
Note that we are after the residues of $f(z)/z$. Illustrations of integrals of this type
are provided by Exercises 7.2.7 to 7.2.10.
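EXAMPLE 7.2.1
Our problem is to evaluate the definite integral
$$I = \int_0^{2\pi} \frac{d\theta}{1 + \varepsilon\cos\theta}, \qquad |\varepsilon| < 1.$$
By Eq. 7.17 this becomes
$$I = -i\oint \frac{dz}{z\left[1 + (\varepsilon/2)(z + z^{-1})\right]} = -i\,\frac{2}{\varepsilon}\oint \frac{dz}{z^2 + (2/\varepsilon)z + 1}.$$
The denominator has roots
$$z_- = -\frac{1}{\varepsilon} - \frac{1}{\varepsilon}\sqrt{1 - \varepsilon^2} \qquad \text{and} \qquad z_+ = -\frac{1}{\varepsilon} + \frac{1}{\varepsilon}\sqrt{1 - \varepsilon^2}.$$
$z_+$ is within the unit circle; $z_-$ is outside. Then by Eq. 7.18 and Exercise 7.1.1
$$I = -i\,\frac{2}{\varepsilon}\cdot 2\pi i \left.\frac{1}{z + 1/\varepsilon + (1/\varepsilon)\sqrt{1 - \varepsilon^2}}\right|_{z = -1/\varepsilon + (1/\varepsilon)\sqrt{1 - \varepsilon^2}}.$$
We obtain
$$\int_0^{2\pi} \frac{d\theta}{1 + \varepsilon\cos\theta} = \frac{2\pi}{\sqrt{1 - \varepsilon^2}}, \qquad |\varepsilon| < 1.$$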
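As a numerical cross-check of Example 7.2.1, the sketch below compares a direct quadrature of the angular integral with the closed form $2\pi/\sqrt{1 - \varepsilon^2}$. The value $\varepsilon = 0.5$, the function names, and the number of sample points are assumptions chosen for illustration.

```python
import math

def angular_integral(eps, n=2000):
    """Trapezoidal rule for the integral of 1/(1 + eps*cos(theta)) over [0, 2*pi]."""
    h = 2 * math.pi / n
    return sum(h / (1 + eps * math.cos(k * h)) for k in range(n))

def z_plus(eps):
    """Root of z^2 + (2/eps) z + 1 that lies inside the unit circle."""
    return (-1 + math.sqrt(1 - eps * eps)) / eps

eps = 0.5
numeric = angular_integral(eps)
closed_form = 2 * math.pi / math.sqrt(1 - eps * eps)

print(numeric, closed_form, abs(z_plus(eps)))
```

The quadrature reproduces the residue result, and $|z_+| < 1$ confirms which root contributes.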
Evaluation of Definite Integrals: $\int_{-\infty}^{\infty} f(x)\,dx$
Suppose that our definite integral has the form
$$I = \int_{-\infty}^{\infty} f(x)\,dx \tag{7.19}$$
and satisfies the two conditions:
a. $f(z)$ is analytic in the upper half-plane except for a
finite number of poles. (It will be assumed that there
are no poles on the real axis. If poles are present on
the real axis, they may be included or excluded as
discussed earlier in this section.)
b. $f(z)$ vanishes as strongly$^1$ as $1/z^2$ for $|z| \to \infty$,
$0 \le \arg z \le \pi$.
FIG. 7.6
With these conditions, we may take as a contour of integration the real axis
and a semicircle in the upper half-plane as shown in Fig. 7.6. We let the radius $R$
of the semicircle become infinitely large. Then
$$\oint f(z)\,dz = \lim_{R \to \infty}\int_{-R}^{R} f(x)\,dx + \lim_{R \to \infty}\int_0^{\pi} f(Re^{i\theta})\,iRe^{i\theta}\,d\theta$$
$$= 2\pi i \sum \text{residues (upper half-plane)}. \tag{7.20}$$
From the second condition the second integral (over the semicircle) vanishes
and
$$\int_{-\infty}^{\infty} f(x)\,dx = 2\pi i \sum \text{residues (upper half-plane)}. \tag{7.21}$$
$^1$ We could use "$f(z)$ vanishes faster than $1/z$," but we wish to have $f(z)$
single-valued.
EXAMPLE 7.2.2
Evaluate
$$I = \int_{-\infty}^{\infty} \frac{dx}{1 + x^2}. \tag{7.22}$$
From Eq. 7.21
$$\int_{-\infty}^{\infty} \frac{dx}{1 + x^2} = 2\pi i \sum \text{residues (upper half-plane)}.$$
Here and in every other similar problem we have the question: where are the
poles? Rewriting the integrand as
$$\frac{1}{z^2 + 1} = \frac{1}{z + i}\cdot\frac{1}{z - i}, \tag{7.23}$$
we see that there are simple poles (order 1) at $z = i$ and $z = -i$.
A simple pole at $z = z_0$ indicates (and is indicated by) a Laurent expansion of
the form
$$f(z) = \frac{a_{-1}}{z - z_0} + a_0 + \sum_{n=1}^{\infty} a_n (z - z_0)^n. \tag{7.24}$$
The residue $a_{-1}$ is easily isolated as (Exercise 7.1.1)
$$a_{-1} = (z - z_0) f(z)\big|_{z = z_0}. \tag{7.25}$$
Using Eq. 7.25, we find that the residue at $z = i$ is $1/2i$, whereas that at $z = -i$
is $-1/2i$.
Then
$$\int_{-\infty}^{\infty} \frac{dx}{1 + x^2} = 2\pi i \cdot \frac{1}{2i} = \pi. \tag{7.26}$$
Here we have used $a_{-1} = 1/2i$ for the residue of the one included pole at $z = i$.
Readers should satisfy themselves that it is possible to use the lower semicircle
and that this choice will lead to the same result, I = n. A somewhat more delicate
problem is provided by the next example.
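For the lower semicircle, a short check along the lines suggested above may be sketched as follows. Traversing the real axis from $-\infty$ to $+\infty$ and returning along a semicircle in the lower half-plane gives a clockwise contour, so the residue theorem acquires a minus sign. The only enclosed pole is $z = -i$, with residue $-1/2i$, and the semicircle again contributes nothing because the integrand falls off as $1/|z|^2$:

$$\int_{-\infty}^{\infty} \frac{dx}{1 + x^2} = -2\pi i \left(-\frac{1}{2i}\right) = \pi.$$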
Evaluation of Definite Integrals: $\int_{-\infty}^{\infty} f(x)e^{iax}\,dx$
Consider the definite integral
$$I = \int_{-\infty}^{\infty} f(x)e^{iax}\,dx, \tag{7.27}$$
with $a$ real and positive. This is a Fourier transform, Chapter 15. We assume the
two conditions:
a. $f(z)$ is analytic in the upper half-plane except for a
finite number of poles.
b.
$$\lim_{|z| \to \infty} f(z) = 0, \qquad 0 \le \arg z \le \pi. \tag{7.28}$$
Note that this is a less restrictive condition than the second condition imposed
on $f(z)$ for integrating $\int_{-\infty}^{\infty} f(x)\,dx$ previously.
FIG. 7.7 (a) $y = (2/\pi)\theta$, (b) $y = \sin\theta$
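We employ the contour shown in Fig. 7.6. The application of the calculus of
residues is the same as the one just considered, but here we have to work a little
harder to show that the integral over the (infinite) semicircle goes to zero. This
integral becomes
$$I_R = \int_0^{\pi} f(Re^{i\theta})\,e^{iaR\cos\theta - aR\sin\theta}\,iRe^{i\theta}\,d\theta. \tag{7.29}$$
Let $R$ be so large that $|f(z)| = |f(Re^{i\theta})| < \varepsilon$. Then
$$|I_R| \le \varepsilon R\int_0^{\pi} e^{-aR\sin\theta}\,d\theta = 2\varepsilon R\int_0^{\pi/2} e^{-aR\sin\theta}\,d\theta. \tag{7.30}$$
In the range $[0, \pi/2]$
$$\frac{2}{\pi}\theta \le \sin\theta.$$
Therefore (Fig. 7.7)
$$|I_R| \le 2\varepsilon R\int_0^{\pi/2} e^{-aR\,2\theta/\pi}\,d\theta. \tag{7.31}$$
Now, integrating by inspection, we obtain
$$|I_R| \le 2\varepsilon R\,\frac{1 - e^{-aR}}{aR\,2/\pi}.$$
Finally,
$$\lim_{R \to \infty} |I_R| \le \frac{\pi}{a}\,\varepsilon. \tag{7.32}$$
From Eq. 7.28 $\varepsilon \to 0$ as $R \to \infty$ and
$$\lim_{R \to \infty} |I_R| = 0. \tag{7.33}$$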
This useful result is sometimes called Jordan's lemma. With it, we are prepared to
tackle Fourier integrals of the form shown in Eq. 7.27.
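The key estimate behind Jordan's lemma is that the factor multiplying $\varepsilon$ in Eq. 7.30, $R\int_0^{\pi} e^{-aR\sin\theta}\,d\theta$, stays below $\pi/a$ however large $R$ becomes. The sketch below illustrates this numerically. The choice $a = 1$, the sample radii, the helper name `semicircle_factor`, and the midpoint rule are assumptions made for illustration.

```python
import math

def semicircle_factor(a, R, n=20000):
    """Midpoint-rule value of R * integral_0^pi exp(-a*R*sin(theta)) d(theta)."""
    h = math.pi / n
    return R * h * sum(math.exp(-a * R * math.sin((k + 0.5) * h)) for k in range(n))

a = 1.0
bound = math.pi / a  # right-hand side of Eq. 7.32 divided by epsilon
values = {R: semicircle_factor(a, R) for R in (1.0, 10.0, 100.0, 1000.0)}

for R, value in values.items():
    print(R, value, value < bound)
```

The factor remains bounded as $R$ grows (it settles near $2/a$), so the semicircle integral is driven to zero by $\varepsilon \to 0$ alone.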
Using the contour shown in Fig. 7.6, we have
$$\int_{-\infty}^{\infty} f(x)e^{iax}\,dx + \lim_{R \to \infty} I_R = 2\pi i \sum \text{residues (upper half-plane)}.$$
Since the integral over the upper semicircle $I_R$ vanishes as $R \to \infty$ (Jordan's
lemma),
$$\int_{-\infty}^{\infty} f(x)e^{iax}\,dx = 2\pi i \sum \text{residues (upper half-plane)}, \qquad (a > 0). \tag{7.34}$$
EXAMPLE 7.2.3 Singularity on Contour of Integration
The problem is to evaluate
$$I = \int_0^{\infty} \frac{\sin x}{x}\,dx. \tag{7.35}$$
This may be taken as the imaginary part$^2$ of
$$I_z = P\int_{-\infty}^{\infty} \frac{e^{iz}\,dz}{z}. \tag{7.36}$$
Now the only pole is a simple pole at $z = 0$ and the residue there by Eq. 7.25
is $a_{-1} = 1$. We choose the contour shown in Fig. 7.8 (1) to avoid the pole, (2) to
include the real axis, and (3) to yield a vanishingly small integrand for $z = iy$,
$y \to \infty$. Note that in this case a large (infinite) semicircle in the lower half-plane
would be disastrous. We have
$$\oint \frac{e^{iz}\,dz}{z} = \int_{-R}^{-r} e^{ix}\frac{dx}{x} + \int_{C_1} \frac{e^{iz}\,dz}{z} + \int_{r}^{R} e^{ix}\frac{dx}{x} + \int_{C_2} \frac{e^{iz}\,dz}{z} = 0, \tag{7.37}$$
the final zero coming from the residue theorem (Eq. 7.10). By Jordan's lemma
$$\int_{C_2} \frac{e^{iz}\,dz}{z} = 0, \tag{7.38}$$
and
$$\oint \frac{e^{iz}\,dz}{z} = \int_{C_1} \frac{e^{iz}\,dz}{z} + P\int_{-\infty}^{\infty} \frac{e^{ix}\,dx}{x} = 0. \tag{7.39}$$
The integral over the small semicircle yields $(-)\pi i$ times the residue of 1, minus,
as a result of going clockwise. Taking the imaginary part,$^3$ we have
$$\int_{-\infty}^{\infty} \frac{\sin x}{x}\,dx = \pi \tag{7.40}$$
or
$$\int_0^{\infty} \frac{\sin x}{x}\,dx = \frac{\pi}{2}. \tag{7.41}$$
FIG. 7.8
$^2$ One can use $\int [(e^{iz} - e^{-iz})/2iz]\,dz$, but then two different contours will be
needed for the two exponentials (compare Example 7.2.4).
The contour of Fig. 7.8, although convenient, is not at all unique. Another
choice of contour for evaluating Eq. 7.35 is presented as Exercise 7.2.15.
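EXAMPLE 7.2.4 Quantum Mechanical Scattering
The quantum mechanical analysis of scattering leads to the function
$$I(\sigma) = \int_{-\infty}^{\infty} \frac{x\sin x\,dx}{x^2 - \sigma^2}, \tag{7.42}$$
where $\sigma$ is real and positive. From the physical conditions of the problem there
is a further requirement: $I(\sigma)$ is to have the form $e^{i\sigma}$ so that it will represent an
outgoing scattered wave.
Using
$$\sin z = \frac{1}{i}\sinh iz = \frac{1}{2i}e^{iz} - \frac{1}{2i}e^{-iz}, \tag{7.43}$$
we write Eq. 7.42 in the complex plane as
$$I(\sigma) = I_1 + I_2, \tag{7.44}$$
with
$$I_1 = \frac{1}{2i}\int_{-\infty}^{\infty} \frac{ze^{iz}}{z^2 - \sigma^2}\,dz, \qquad I_2 = -\frac{1}{2i}\int_{-\infty}^{\infty} \frac{ze^{-iz}}{z^2 - \sigma^2}\,dz. \tag{7.45}$$
$^3$ Alternatively, we may combine the integrals of Eq. 7.37 as
$$\int_{-R}^{-r} e^{ix}\frac{dx}{x} + \int_{r}^{R} e^{ix}\frac{dx}{x} = \int_{r}^{R}\left(e^{ix} - e^{-ix}\right)\frac{dx}{x} = 2i\int_{r}^{R}\frac{\sin x}{x}\,dx.$$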
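Integral $I_1$ is similar to Example 7.2.3 and, as in that case, we may complete the
contour by an infinite semicircle in the upper half-plane. For $I_2$ the exponential
is negative and we complete the contour by an infinite semicircle in the lower
half-plane, as shown in Fig. 7.9. As in Example 7.2.3, neither semicircle
contributes anything to the integral (Jordan's lemma).
FIG. 7.9
There is still the problem of locating the poles and evaluating the residues.
We find poles at $z = +\sigma$ and $z = -\sigma$ on the contour of integration. The residues
are (Exercises 7.1.1, 7.2.1):
         $z = \sigma$         $z = -\sigma$
$I_1$    $e^{i\sigma}/2$      $e^{-i\sigma}/2$
$I_2$    $e^{-i\sigma}/2$     $e^{i\sigma}/2$
Detouring around the poles, as shown in Fig. 7.9 (it matters little whether we go
above or below), we find that the residue theorem leads to
$$PI_1 - \pi i\left(\frac{1}{2i}\right)\frac{e^{-i\sigma}}{2} + \pi i\left(\frac{1}{2i}\right)\frac{e^{i\sigma}}{2} = 2\pi i\left(\frac{1}{2i}\right)\frac{e^{i\sigma}}{2}, \tag{7.46}$$
for we have enclosed the singularity at $z = \sigma$ but excluded the one at $z = -\sigma$.
In similar fashion, but noting that the contour for $I_2$ is clockwise,
$$PI_2 - \pi i\left(\frac{-1}{2i}\right)\frac{e^{i\sigma}}{2} + \pi i\left(\frac{-1}{2i}\right)\frac{e^{-i\sigma}}{2} = -2\pi i\left(\frac{-1}{2i}\right)\frac{e^{i\sigma}}{2}. \tag{7.47}$$
Adding Eqs. 7.46 and 7.47, we have
$$PI(\sigma) = PI_1 + PI_2 = \frac{\pi}{2}\left(e^{i\sigma} + e^{-i\sigma}\right) = \pi\cosh i\sigma = \pi\cos\sigma. \tag{7.48}$$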
This is a perfectly good evaluation of Eq. 7.42, but unfortunately the cosine
dependence is appropriate for a standing wave and not for the outgoing scattered
wave as specified.
To obtain the desired form, we try a different technique. Instead of dodging
around the singular points, let us move them off the real axis. Specifically, let
$\sigma \to \sigma + i\gamma$, $-\sigma \to -\sigma - i\gamma$, where $\gamma$ is positive but small and will eventually be
made to approach zero, that is,
$$I_+(\sigma) = \lim_{\gamma \to 0} I(\sigma + i\gamma). \tag{7.49}$$
With this simple substitution, the first integral $I_1$ becomes
$$I_1(\sigma + i\gamma) = 2\pi i\left(\frac{1}{2i}\right)\frac{e^{i(\sigma + i\gamma)}}{2} \tag{7.50}$$
by direct application of the residue theorem. Also,
$$I_2(\sigma + i\gamma) = -2\pi i\left(\frac{-1}{2i}\right)\frac{e^{i(\sigma + i\gamma)}}{2}. \tag{7.51}$$
Adding Eqs. 7.50 and 7.51 and then letting $\gamma \to 0$, we obtain
$$I_+(\sigma) = \lim_{\gamma \to 0}\left[I_1(\sigma + i\gamma) + I_2(\sigma + i\gamma)\right] = \pi e^{i\sigma}, \tag{7.52}$$
a result that does fit the boundary conditions of our scattering problem.
It is interesting to note that the substitution $\sigma \to \sigma - i\gamma$ would have led to
$$I_-(\sigma) = \pi e^{-i\sigma}, \tag{7.53}$$
which could represent an incoming wave. Our earlier result (Eq. 7.48) is seen to
be the arithmetic average of Eqs. 7.52 and 7.53. This average is the Cauchy
principal value of the integral. Note that we have these possibilities (Eqs. 7.48,
7.52, and 7.53) because our integral is an improper integral. It is not uniquely
defined until we specify the particular limiting process (or average) to be used.
Evaluation of Definite Integrals: Exponential Forms
With exponential or hyperbolic functions present in the integrand, life gets
somewhat more complicated than before. Instead of a general overall prescription,
the contour must be chosen to fit the specific integral. These cases are also
opportunities to illustrate the versatility and power of contour integration.
As an example, we consider an integral that will be quite useful in developing
a relation between z! and (— z)!. Notice how the periodicity along the imaginary
axis is exploited.
EXAMPLE 7.2.5 Factorial Function
We wish to evaluate
$$I = \int_{-\infty}^{\infty} \frac{e^{ax}}{1 + e^{x}}\,dx, \qquad 0 < a < 1. \tag{7.54}$$
The limits on $a$ are necessary (and sufficient) to prevent the integral from
diverging as $x \to \pm\infty$. This integral (Eq. 7.54) may be handled by replacing the real
variable $x$ by the complex variable $z$ and integrating around the contour shown
in Fig. 7.10. If we take the limit as $R \to \infty$, the real axis, of course, leads to the
integral we want. The return path along $y = 2\pi$ is chosen to leave the
denominator of the integral invariant, at the same time introducing a constant factor
$e^{i2\pi a}$ in the numerator. We have, in the complex plane,
$$\oint \frac{e^{az}}{1 + e^{z}}\,dz = \lim_{R \to \infty}\left(\int_{-R}^{R} \frac{e^{ax}}{1 + e^{x}}\,dx - e^{i2\pi a}\int_{-R}^{R} \frac{e^{ax}}{1 + e^{x}}\,dx\right)$$
$$= \left(1 - e^{i2\pi a}\right)\int_{-\infty}^{\infty} \frac{e^{ax}}{1 + e^{x}}\,dx. \tag{7.55}$$
FIG. 7.10
In addition there are two vertical sections ($0 \le y \le 2\pi$), which vanish
(exponentially) as $R \to \infty$.
Now where are the poles and what are the residues? We have a pole when
$$e^{z} = e^{x}e^{iy} = -1. \tag{7.56}$$
Equation 7.56 is satisfied at $z = 0 + i\pi$. By a Laurent expansion$^4$ in powers of
$(z - i\pi)$ the pole is seen to be a simple pole with a residue of $-e^{i\pi a}$. Then,
applying the residue theorem once more,
$$\left(1 - e^{i2\pi a}\right)\int_{-\infty}^{\infty} \frac{e^{ax}}{1 + e^{x}}\,dx = 2\pi i\left(-e^{i\pi a}\right). \tag{7.57}$$
This quickly reduces to
$$\int_{-\infty}^{\infty} \frac{e^{ax}}{1 + e^{x}}\,dx = \frac{\pi}{\sin a\pi}, \qquad 0 < a < 1. \tag{7.58}$$
Using the beta function (Section 10.4), we can show the integral to be equal to
the product $(a - 1)!\,(-a)!$. This results in the interesting and useful factorial
function relation
$$a!\,(-a)! = \frac{\pi a}{\sin \pi a}. \tag{7.59}$$
$^4$ $1 + e^{z} = 1 + e^{z - i\pi}e^{i\pi} = 1 - e^{z - i\pi} = -(z - i\pi)\left(1 + \dfrac{z - i\pi}{2!} + \dfrac{(z - i\pi)^2}{3!} + \cdots\right)$.
Although Eq. 7.58 holds for real a, 0 < a < 1, Eq. 7.59 may be extended by
analytic continuation to all values of a, real and complex, excluding only real
integral values.
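Equation 7.58 lends itself to a direct numerical check, since the integrand decays exponentially in both directions. The sketch below is illustrative only: the cutoff $L$, the step count, the sample values of $a$, and the function name `exp_integral` are assumptions, and the infinite range is simply truncated to $[-L, L]$.

```python
import math

def exp_integral(a, L=60.0, n=12000):
    """Trapezoidal rule for the integral of e^{a x}/(1 + e^{x}) over [-L, L]."""
    h = 2 * L / n
    total = 0.0
    for k in range(n + 1):
        x = -L + k * h
        weight = 0.5 if k in (0, n) else 1.0  # half weight at the two end points
        total += weight * math.exp(a * x) / (1 + math.exp(x)) * h
    return total

for a in (0.25, 0.5, 0.75):
    print(a, exp_integral(a), math.pi / math.sin(a * math.pi))
```

For each sampled $a$ in $0 < a < 1$ the truncated quadrature agrees with $\pi/\sin a\pi$ to within the small tail that the cutoff discards.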
As a final example of contour integrals of exponential functions, we consider
Bernoulli numbers again.
EXAMPLE 7.2.6 Bernoulli Numbers
In Section 5.9 the Bernoulli numbers were defined by the expansion
$$\frac{x}{e^{x} - 1} = \sum_{n=0}^{\infty} \frac{B_n}{n!}\,x^n. \tag{7.60}$$
Replacing $x$ with $z$ (analytic continuation), we have a Taylor series (compare
Eq. 6.60) with
$$B_n = \frac{n!}{2\pi i}\oint_{C_0} \frac{z}{e^{z} - 1}\,\frac{dz}{z^{n+1}}, \tag{7.61}$$
where the contour $C_0$ is around the origin counterclockwise with $|z| < 2\pi$ to
avoid the poles at $\pm 2\pi i$.
For $n = 0$ we have a simple pole at $z = 0$ with a residue of $+1$. Hence by
Eq. 7.10
$$B_0 = \frac{0!}{2\pi i}\cdot 2\pi i\,(1) = 1. \tag{7.62}$$
For $n = 1$ the singularity at $z = 0$ becomes a second-order pole. The residue may
be shown to be $-\tfrac{1}{2}$ by series expansion of the exponential, followed by a binomial
expansion. This results in
$$B_1 = \frac{1!}{2\pi i}\cdot 2\pi i\left(-\frac{1}{2}\right) = -\frac{1}{2}. \tag{7.63}$$
FIG. 7.11 Contour of integration for Bernoulli numbers
For $n \ge 2$ this procedure becomes rather tedious, and we resort to a different
means of evaluating Eq. 7.61. The contour is deformed, as shown in Fig. 7.11.
The new contour $C$ still encircles the origin, as required, but now it also
encircles (in a negative direction) an infinite series of singular points along the
imaginary axis at $z = \pm p2\pi i$, $p = 1, 2, 3, \ldots$. The integration back and forth
along the $x$-axis cancels out, and for $R \to \infty$ the integration over the infinite
circle yields zero. Remember that $n \ge 2$. Therefore
$$B_n = \frac{n!}{2\pi i}\,(-2\pi i)\sum \text{residues} \qquad (z = \pm p2\pi i). \tag{7.64}$$
At $z = p2\pi i$ we have a simple pole with a residue $(p2\pi i)^{-n}$. When $n$ is odd, the
residue from $z = p2\pi i$ exactly cancels that from $z = -p2\pi i$ and $B_n = 0$,
$n = 3, 5, 7$, and so on. For $n$ even the residues add, giving
$$B_n = \frac{n!}{2\pi i}\,(-2\pi i)\,2\sum_{p=1}^{\infty} \frac{1}{p^n (2\pi i)^n} = -\frac{(-1)^{n/2}\,2\,n!}{(2\pi)^n}\sum_{p=1}^{\infty} p^{-n}$$
$$= -\frac{(-1)^{n/2}\,2\,n!}{(2\pi)^n}\,\zeta(n) \qquad (n \text{ even}), \tag{7.65}$$
where $\zeta(n)$ is the Riemann zeta function introduced in Section 5.9. Equation 7.65
corresponds to Eq. 5.151 of Section 5.9.
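Equation 7.65 can be exercised numerically by replacing $\zeta(n)$ with a partial sum. The sketch below is a rough check rather than a practical way to compute Bernoulli numbers. The function name `bernoulli_from_zeta` and the truncation at 200000 terms are assumptions, and the reference values $B_2 = 1/6$, $B_4 = -1/30$, $B_6 = 1/42$ are the standard tabulated ones.

```python
import math

def bernoulli_from_zeta(n, terms=200000):
    """Even-index Bernoulli number from Eq. 7.65, with zeta(n) as a partial sum."""
    if n < 2 or n % 2:
        raise ValueError("n must be an even integer >= 2")
    zeta = sum(p ** (-n) for p in range(1, terms + 1))
    sign = -((-1) ** (n // 2))  # the factor -(-1)^{n/2}
    return sign * 2 * math.factorial(n) * zeta / (2 * math.pi) ** n

for n, exact in ((2, 1 / 6), (4, -1 / 30), (6, 1 / 42)):
    print(n, bernoulli_from_zeta(n), exact)
```

The slow convergence of the partial sum for $n = 2$ limits the accuracy there, while for larger $n$ the series converges quickly.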
Branch Points, Cut Lines
Sometimes the integrand will contain z to a fractional power. The integrand
is multivalued. There is a branch point and a cut line is required. Exercises
7.2.18, 7.2.19, and 7.2.23 are examples of this situation. A key point to remember
is that the function can be expected to be discontinuous across this mandatory
cut line. The integral along one side of the cut line will probably not equal the
integral along the other side.
EXERCISES
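7.2.1 Determine the nature of the singularities of each of the following functions and
evaluate the residues ($a > 0$).
(a) $\dfrac{1}{z^2 + a^2}$, (b) $\dfrac{1}{(z^2 + a^2)^2}$,
(c) $\dfrac{z^2}{(z^2 + a^2)^2}$, (d) $\dfrac{\sin(1/z)}{z^2 + a^2}$,
(e) $\dfrac{ze^{+iz}}{z^2 + a^2}$, (f) $\dfrac{ze^{+iz}}{z^2 - a^2}$,
(g) $\dfrac{e^{+iz}}{z^2 - a^2}$, (h) $\dfrac{z^{-k}}{z + 1}$, $0 < k < 1$.
7.2.2 Locate the singularities and evaluate the residues of each of the following
functions
(a) $z^{-n}(e^{z} - 1)^{-1}$, $z \neq 0$,
(b) $\dfrac{z^2 e^{z}}{1 + e^{2z}}$.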
7.2.3 The statement that the integral halfway around a singular point is equal to
one half the integral all the way around was limited to simple poles. Show, by
a specific example, that
$$\int_{\text{semicircle}} f(z)\,dz = \frac{1}{2}\oint_{\text{circle}} f(z)\,dz$$
does not necessarily hold if the integral encircles a pole of higher order.
Hint. Try $f(z) = z^{-2}$.
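7.2.4 A function $f(z)$ is analytic along the real axis except for a third-order pole at
$z = x_0$. The Laurent expansion about $z = x_0$ has the form
$$f(z) = \frac{a_{-3}}{(z - x_0)^3} + \frac{a_{-1}}{(z - x_0)} + g(z),$$
with $g(z)$ analytic at $z = x_0$. Show that the Cauchy principal value technique
is applicable in the sense that
(a) $\displaystyle\lim_{\delta \to 0}\left\{\int_{-\infty}^{x_0 - \delta} f(x)\,dx + \int_{x_0 + \delta}^{\infty} f(x)\,dx\right\}$
is well behaved.
(b) $\displaystyle\int_{C_{x_0}} f(z)\,dz = \pm i\pi a_{-1}$,
where $C_{x_0}$ denotes a small semicircle about $z = x_0$.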
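7.2.5 The unit step function is defined as (compare Exercise 8.7.13)
$$u(s - a) = \begin{cases} 0, & s < a \\ 1, & s > a. \end{cases}$$
Show that $u(s)$ has the integral representations
(a) $\displaystyle u(s) = \lim_{\varepsilon \to 0^+} \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{e^{ixs}}{x - i\varepsilon}\,dx$,
(b) $\displaystyle u(s) = \frac{1}{2} + \frac{1}{2\pi i}\,P\int_{-\infty}^{\infty} \frac{e^{ixs}}{x}\,dx$.
Note. The parameter $s$ is real.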
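7.2.6 Most of the special functions of mathematical physics may be generated
(defined) by a generating function of the form
$$g(t, x) = \sum_n f_n(x)\,t^n.$$
Given the following integral representations, derive the corresponding
generating function:
(a) Bessel
$$J_n(x) = \frac{1}{2\pi i}\oint e^{(x/2)(t - 1/t)}\,t^{-n-1}\,dt.$$
(b) Modified Bessel
$$I_n(x) = \frac{1}{2\pi i}\oint e^{(x/2)(t + 1/t)}\,t^{-n-1}\,dt.$$
(c) Legendre
$$P_n(x) = \frac{1}{2\pi i}\oint (1 - 2tx + t^2)^{-1/2}\,t^{-n-1}\,dt.$$
(d) Hermite
$$H_n(x) = \frac{n!}{2\pi i}\oint e^{-t^2 + 2tx}\,t^{-n-1}\,dt.$$
(e) Laguerre
$$L_n(x) = \frac{1}{2\pi i}\oint \frac{e^{-xt/(1 - t)}}{(1 - t)\,t^{n+1}}\,dt.$$
(f) Chebyshev
$$T_n(x) = \frac{1}{4\pi i}\oint \frac{(1 - t^2)\,t^{-n-1}}{(1 - 2tx + t^2)}\,dt.$$
Each of the contours encircles the origin and no other singular points.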
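7.2.7 Generalizing Example 7.2.1, show that
$$\int_0^{2\pi} \frac{d\theta}{a \pm b\cos\theta} = \int_0^{2\pi} \frac{d\theta}{a \pm b\sin\theta} = \frac{2\pi}{(a^2 - b^2)^{1/2}}, \qquad \text{for } a > |b|.$$
What happens if $|b| > |a|$?
7.2.8 Show that
$$\int_0^{\pi} \frac{d\theta}{(a + \cos\theta)^2} = \frac{\pi a}{(a^2 - 1)^{3/2}}, \qquad a > 1.$$
7.2.9 Show that
$$\int_0^{2\pi} \frac{d\theta}{1 - 2t\cos\theta + t^2} = \frac{2\pi}{1 - t^2}, \qquad \text{for } |t| < 1.$$
What happens if $|t| > 1$?
What happens if $|t| = 1$?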
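7.2.10 With the calculus of residues show that
$$\int_0^{\pi} \cos^{2n}\theta\,d\theta = \pi\,\frac{(2n)!}{2^{2n}(n!)^2} = \pi\,\frac{(2n - 1)!!}{(2n)!!}, \qquad n = 0, 1, 2, \ldots.$$
(The double factorial notation is defined in Section 10.1.)
Hint. $\cos\theta = \tfrac{1}{2}(e^{i\theta} + e^{-i\theta}) = \tfrac{1}{2}(z + z^{-1})$, $|z| = 1$.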
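7.2.11 Evaluate
$$\int_{-\infty}^{\infty} \frac{\cos bx - \cos ax}{x^2}\,dx, \qquad a > b > 0.$$
ANS. $\pi(a - b)$.
7.2.12 Prove that
$$\int_0^{\infty} \frac{\sin^2 x}{x^2}\,dx = \frac{\pi}{2}.$$
Hint. $\sin^2 x = \tfrac{1}{2}(1 - \cos 2x)$.
7.2.13 A quantum mechanical calculation of a transition probability leads to the
function $f(t, \omega) = 2(1 - \cos\omega t)/\omega^2$. Show that
$$\int_{-\infty}^{\infty} f(t, \omega)\,d\omega = 2\pi t.$$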
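7.2.14 Show that ($a > 0$)
(a) $\displaystyle\int_{-\infty}^{\infty} \frac{\cos x}{x^2 + a^2}\,dx = \frac{\pi}{a}\,e^{-a}$.
How is the right side modified if $\cos x$ is replaced by $\cos kx$?
(b) $\displaystyle\int_{-\infty}^{\infty} \frac{x\sin x}{x^2 + a^2}\,dx = \pi e^{-a}$.
How is the right side modified if $\sin x$ is replaced by $\sin kx$?
These integrals may also be interpreted as Fourier cosine and sine transforms
(Chapter 15).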
7.2.15 Use the contour shown (Fig. 7.12) with $R \to \infty$ to prove that
$$\int_{-\infty}^{\infty} \frac{\sin x}{x}\,dx = \pi.$$
FIG. 7.12
7.2.16 In the quantum theory of atomic collisions we encounter the integral
$$I = \int_{-\infty}^{\infty} \frac{\sin t}{t}\,e^{ipt}\,dt,$$
in which $p$ is real. Show that
$$I = 0, \quad |p| > 1$$
$$I = \pi, \quad |p| < 1.$$
What happens if $p = \pm 1$?
7.2.17 Evaluate
$$\int_0^{\infty} \frac{(\ln x)^2}{1 + x^2}\, dx$$
(a) by appropriate series expansion of the integrand to obtain
$$4\sum_{n=0}^{\infty} (-1)^n (2n + 1)^{-3},$$
(b) by contour integration to obtain
$$\frac{\pi^3}{8}.$$
Hint. $x \to z = e^t$. Try the contour shown in Fig. 7.13, letting $R \to \infty$.
FIG. 7.13 (rectangular contour with corners at $-R$, $R$, $R + i\pi$, and $-R + i\pi$)
7.2.18 Show that
$$\int_0^{\infty} \frac{x^a}{(x + 1)^2}\, dx = \frac{\pi a}{\sin \pi a},$$
where $-1 < a < 1$. Here is still another way of deriving Eq. 7.59.
Hint. Use the contour shown in Fig. 7.14, noting that z = 0 is a branch point
and the positive x-axis is a cut line. Note also the comments on phases following
Example 7.1.1.
FIG. 7.14
7.2.19 Show that
$$\int_0^{\infty} \frac{x^{-a}}{x + 1}\, dx = \frac{\pi}{\sin a\pi},$$
where $0 < a < 1$. This opens up another way of deriving the factorial function
relation given by Eq. 7.59.
Hint. You have a branch point and you will need a cut line. Recall that $z^{-a} = w$
in polar form is
$$\left[r e^{i(\theta + 2\pi n)}\right]^{-a} = \rho e^{i\varphi},$$
which leads to $-a\theta - 2an\pi = \varphi$.
You must restrict n to zero (or any other single integer) in order that $\varphi$ may
be uniquely specified. Try the contour shown in Fig. 7.15.
FIG. 7.15
7.2.20 Show that
$$\int_0^{\infty} \frac{dx}{(x^2 + a^2)^2} = \frac{\pi}{4a^3}, \qquad a > 0.$$
7.2.21 Evaluate
$$\int_{-\infty}^{\infty} \frac{x^2}{1 + x^4}\, dx.$$
ANS. $\pi/\sqrt{2}$.
7.2.22 Show that
$$\int_0^{\infty} \cos(t^2)\, dt = \int_0^{\infty} \sin(t^2)\, dt = \frac{\sqrt{\pi}}{2\sqrt{2}}.$$
Hint. Try the contour shown in Fig. 7.16.
Note. These are the Fresnel integrals for the special case of infinity as the upper
limit. For the general case of a varying upper limit, asymptotic expansions of
the Fresnel integrals are the topic of Exercise 5.11.2. Spherical Bessel expansions
are the subject of Exercise 11.7.13.
7.2.23 Several of the Bromwich integrals, Section 15.12, involve a portion that may be
approximated by
$$I(y) = \int_{0 + iy}^{a + iy} \frac{e^{zt}}{z^{1/2}}\, dz.$$
Here a and t are positive and finite. Show that
$$\lim_{y \to \infty} I(y) = 0.$$
FIG. 7.16
7.2.24 Show that
$$\int_0^{\infty} \frac{dx}{1 + x^n} = \frac{\pi/n}{\sin(\pi/n)}.$$
Hint. Try the contour shown in Fig. 7.17.
FIG. 7.17
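7.2.25 (a) Show that
$$f(z) = z^4 - 2\cos 2\theta\, z^2 + 1$$
has zeros at $e^{i\theta}$, $e^{-i\theta}$, $-e^{i\theta}$, and $-e^{-i\theta}$.
(b) Show that
$$\int_{-\infty}^{\infty} \frac{dx}{x^4 - 2\cos 2\theta\, x^2 + 1} = \frac{\pi}{2\sin\theta} = \frac{\pi}{2^{1/2}(1 - \cos 2\theta)^{1/2}}.$$
Exercise 7.2.24 (n = 4) is a special case of this result.
7.2.26 Show that
$$\int_{-\infty}^{\infty} \frac{x^2\, dx}{x^4 - 2\cos 2\theta\, x^2 + 1} = \frac{\pi}{2\sin\theta} = \frac{\pi}{2^{1/2}(1 - \cos 2\theta)^{1/2}}.$$
Exercise 7.2.21 is a special case of this result.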
7.2.27 Apply the techniques of Example 7.2.4 to the evaluation of the improper
integral
$$I = \int_{-\infty}^{\infty} \frac{dx}{x^2 - \sigma^2}.$$
(a) Let $\sigma \to \sigma + i\gamma$.
(b) Let $\sigma \to \sigma - i\gamma$.
(c) Take the Cauchy principal value.
7.2.28 The integral in Exercise 7.2.17 may be transformed into
$$\int_0^{\infty} e^{-y}\, \frac{y^2}{1 + e^{-2y}}\, dy = \frac{\pi^3}{16}.$$
Evaluate this integral by the Gauss-Laguerre quadrature, Appendix A2, and
compare your result with $\pi^3/16$.
ANS. Integral = 1.93775 (10 points).
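The quadrature asked for in Exercise 7.2.28 can be sketched as follows. This is a minimal illustration, not the procedure of Appendix A2: the helper names `laguerre_pair` and `gauss_laguerre` are hypothetical, the nodes are located by a simple bracketing search (an assumption made to stay within the Python standard library), and the weights use the standard formula $w_i = x_i / [(n+1)^2 L_{n+1}(x_i)^2]$.

```python
import math

def laguerre_pair(n, x):
    """Return (L_n(x), L_{n+1}(x)) using the standard three-term recurrence."""
    prev, curr = 1.0, 1.0 - x                      # L_0(x), L_1(x)
    if n == 0:
        return prev, curr
    for k in range(1, n + 1):                      # (L_{k-1}, L_k) -> (L_k, L_{k+1})
        prev, curr = curr, ((2 * k + 1 - x) * curr - k * prev) / (k + 1)
    return prev, curr

def gauss_laguerre(n, step=0.01):
    """Nodes and weights for the weight e^{-y} on (0, inf), found by root bracketing."""
    nodes = []
    a, fa = 0.0, laguerre_pair(n, 0.0)[0]
    while len(nodes) < n and a < 4 * n + 2:        # search below the bound 4n + 2
        b = a + step
        fb = laguerre_pair(n, b)[0]
        if fa * fb < 0.0:                          # sign change: bisect to locate the zero
            lo, hi, flo = a, b, fa
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                fmid = laguerre_pair(n, mid)[0]
                if flo * fmid <= 0.0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            nodes.append(0.5 * (lo + hi))
        a, fa = b, fb
    weights = [x / ((n + 1) ** 2 * laguerre_pair(n, x)[1] ** 2) for x in nodes]
    return nodes, weights

# Integrand with the weight e^{-y} factored out, as in Exercise 7.2.28.
g = lambda y: y * y / (1.0 + math.exp(-2.0 * y))

nodes, weights = gauss_laguerre(10)
estimate = sum(w * g(y) for y, w in zip(nodes, weights))
exact = math.pi ** 3 / 16
print(estimate, exact)
```

The ten-point estimate should land close to $\pi^3/16 \approx 1.9379$, in line with the answer quoted in the exercise.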
7.3 DISPERSION RELATIONS
The concept of dispersion relations entered physics with the work of Kronig
and Kramers in optics. The name dispersion comes from optical dispersion,
a result of the dependence of the index of refraction on wavelength or angular
frequency. The index of refraction n may have a real part determined by the
phase velocity and a (negative) imaginary part determined by the absorption—
see Eq. 7.79. Kronig and Kramers showed that the real part of (n2 — 1) could
be expressed as an integral of the imaginary part. Generalizing this, we shall
apply the label dispersion relations to any pair of equations giving the real
part of a function as an integral of its imaginary part and the imaginary part
as an integral of its real part (Eqs. 7.71a and 7.71b, which follow). The existence
of such integral relations might be suspected as an integral analog of the
Cauchy-Riemann differential relations, Section 6.2.
The applications in modern physics are widespread. For instance, the real
part of the function might describe the forward scattering of a gamma ray in
a nuclear Coulomb field (a dispersive process). Then the imaginary part would
describe the electron-positron pair production in that same Coulomb field
(the absorptive process). As will be seen later, the dispersion relations may be
taken as a consequence of causality and therefore are independent of the details
of the particular interaction.
We consider a complex function f(z) that is analytic in the upper half-plane
and on the real axis. We also require that
$$\lim_{|z| \to \infty} |f(z)| = 0, \qquad 0 \le \arg z \le \pi, \tag{7.66}$$
in order that the integral over an infinite semicircle will vanish. The point of
these conditions is that we may express $f(z_0)$ by the Cauchy integral formula,
Eq. 6.43,
$$f(z_0) = \frac{1}{2\pi i}\oint \frac{f(z)}{z - z_0}\, dz. \tag{7.67}$$
The integral over the upper semicircle¹ vanishes and we have
$$f(z_0) = \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{f(x)}{x - z_0}\, dx. \tag{7.68}$$
The integral over the contour shown in Fig. 7.18 has become an integral along
the x-axis.
FIG. 7.18 (closed contour: the real axis from $-R$ to $R$ and a semicircle in the upper half-plane)
Equation 7.68 assumes that $z_0$ is in the upper half-plane, interior to the
closed contour. If $z_0$ were in the lower half-plane, the integral would yield zero
by the Cauchy integral theorem, Section 6.3. Now, either letting $z_0$ approach
the real axis from above ($z_0 \to x_0$), or placing it on the real axis and taking an
average of Eq. 7.68 and zero, we find that Eq. 7.68 becomes
$$f(x_0) = \frac{1}{\pi i}\, P\!\int_{-\infty}^{\infty} \frac{f(x)}{x - x_0}\, dx, \tag{7.69}$$
where P indicates the Cauchy principal value.
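Equation 7.68 and the remark about the lower half-plane can be checked numerically. The sketch below is only an illustration under stated assumptions: the test function $f(z) = 1/(z + i)$ (the function of Exercise 7.3.5) is chosen because it is analytic in the upper half-plane and vanishes at infinity, and the helper names `line_integral` and `cauchy_real_axis` are hypothetical.

```python
import math

def line_integral(h, n=20000):
    """Midpoint-rule estimate of the integral of h over the real line, using x = tan(phi)."""
    step = math.pi / n
    total = 0j
    for k in range(n):
        phi = -0.5 * math.pi + (k + 0.5) * step
        total += h(math.tan(phi)) / math.cos(phi) ** 2   # dx = dphi / cos^2(phi)
    return total * step

# Assumed test function: analytic for Im z >= 0, with |f| -> 0 as |z| -> infinity.
f = lambda z: 1.0 / (z + 1j)

def cauchy_real_axis(z0):
    """Right-hand side of Eq. 7.68 for a point z0 off the real axis."""
    return line_integral(lambda x: f(x) / (x - z0)) / (2j * math.pi)

z_upper = 0.3 + 0.5j
z_lower = 0.3 - 0.5j
value_upper = cauchy_real_axis(z_upper)   # expected to reproduce f(z_upper)
value_lower = cauchy_real_axis(z_lower)   # expected to vanish
print(value_upper, f(z_upper), value_lower)
```

For a point on the real axis the integrand becomes singular, which is where the principal value of Eq. 7.69 takes over.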
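Splitting Eq. 7.69 into real and imaginary parts² yields
$$f(x_0) = u(x_0) + i v(x_0) = \frac{1}{\pi}\, P\!\int_{-\infty}^{\infty} \frac{v(x)}{x - x_0}\, dx - \frac{i}{\pi}\, P\!\int_{-\infty}^{\infty} \frac{u(x)}{x - x_0}\, dx. \tag{7.70}$$
Finally, equating real part to real part and imaginary part to imaginary part,
we obtain
$$u(x_0) = \frac{1}{\pi}\, P\!\int_{-\infty}^{\infty} \frac{v(x)}{x - x_0}\, dx, \tag{7.71a}$$
$$v(x_0) = -\frac{1}{\pi}\, P\!\int_{-\infty}^{\infty} \frac{u(x)}{x - x_0}\, dx. \tag{7.71b}$$
1 The use of a semicircle to close the path of integration is convenient, not
mandatory. Other paths are possible.
2 The second argument, y = 0, is dropped: $u(x_0, 0) \to u(x_0)$.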
These are the dispersion relations. The real part of our complex function is
expressed as an integral over the imaginary part. The imaginary part is expressed
as an integral over the real part. The real and imaginary parts are Hilbert
transforms of each other. Note that these relations are meaningful only when
f(x) is a complex function of the real variable x. Compare Exercise 7.3.1.
From a physical point of view u(x) and/or v(x) represent some physical
measurements. Then f(z) = u(z) + iv(z) is an analytic continuation over the
upper half-plane, with the value on the real axis serving as a boundary condition.
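The pair of relations 7.71a and 7.71b can be tested numerically on a simple example. The sketch below assumes the function $f(x) = 1/(x + i)$, so that $u(x) = x/(x^2 + 1)$ and $v(x) = -1/(x^2 + 1)$ (the functions of Exercise 7.3.7). The helper `pv_integral` is hypothetical; it handles the principal value by folding the integral about $x_0$, which is one convenient choice among several.

```python
import math

def pv_integral(g, x0, n=100_000):
    """Principal value of the integral of g(x)/(x - x0) over the real line.

    Folding about x0 gives PV = int_0^inf [g(x0 + s) - g(x0 - s)] / s ds, which has
    no singularity; the half-line is mapped to (0, 1) by s = t/(1 - t) and the
    midpoint rule is applied.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        s = t / (1.0 - t)
        total += (g(x0 + s) - g(x0 - s)) / s / (1.0 - t) ** 2   # ds = dt / (1 - t)^2
    return total * h

# Assumed example: real and imaginary parts of f(x) = 1/(x + i).
u = lambda x: x / (x * x + 1.0)
v = lambda x: -1.0 / (x * x + 1.0)

x0 = 0.7
u_from_v = pv_integral(v, x0) / math.pi     # Eq. 7.71a
v_from_u = -pv_integral(u, x0) / math.pi    # Eq. 7.71b
print(u_from_v, u(x0), v_from_u, v(x0))
```

Each computed value should agree with the direct evaluation of u or v at $x_0$ to within the accuracy of the quadrature.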
Symmetry Relations
On occasion f(x) will satisfy a symmetry relation and the integral from $-\infty$
to $+\infty$ may be replaced by an integral over positive values only. This is of
considerable physical importance because the variable x might represent a
frequency and only zero and positive frequencies are available for physical
measurements. Suppose³
$$f(-x) = f^*(x). \tag{7.72}$$
Then
$$u(-x) + i v(-x) = u(x) - i v(x). \tag{7.73}$$
The real part of f(x) is even and the imaginary part is odd.⁴ In quantum
mechanical scattering problems these relations (Eq. 7.73) are called crossing
conditions. To exploit these crossing conditions, we rewrite Eq. 7.71a as
$$u(x_0) = \frac{1}{\pi}\, P\!\int_{-\infty}^{0} \frac{v(x)}{x - x_0}\, dx + \frac{1}{\pi}\, P\!\int_{0}^{\infty} \frac{v(x)}{x - x_0}\, dx. \tag{7.74}$$
Letting $x \to -x$ in the first integral on the right-hand side of Eq. 7.74 and
substituting $v(-x) = -v(x)$ from Eq. 7.73, we obtain
$$u(x_0) = \frac{1}{\pi}\, P\!\int_{0}^{\infty} v(x)\left\{\frac{1}{x + x_0} + \frac{1}{x - x_0}\right\} dx = \frac{2}{\pi}\, P\!\int_{0}^{\infty} \frac{x\, v(x)}{x^2 - x_0^2}\, dx. \tag{7.75}$$
Similarly,
$$v(x_0) = -\frac{2}{\pi}\, P\!\int_{0}^{\infty} \frac{x_0\, u(x)}{x^2 - x_0^2}\, dx. \tag{7.76}$$
The original Kronig-Kramers optical dispersion relations were in this form.
The asymptotic behavior ($x_0 \to \infty$) of Eqs. 7.75 and 7.76 leads to quantum
mechanical sum rules, Exercise 7.3.4.
3 This is not just a happy coincidence. It ensures that the Fourier transform
of f(x) will be real. In turn, Eq. 7.72 is a consequence of obtaining f(x) as the
Fourier transform of a real function.
4 $u(x, 0) = u(-x, 0)$, $v(x, 0) = -v(-x, 0)$. Compare these symmetry conditions
with those that follow from the Schwarz reflection principle, Section 6.5.
Optical Dispersion
The function $\exp[i(kx - \omega t)]$ describes a wave moving along the x-axis in
the positive direction with velocity $v = \omega/k$; $\omega$ is the angular frequency, k the
wave number or propagation vector, and $n = ck/\omega$ the index of refraction.
From Maxwell's equations, electric permittivity $\varepsilon$, and Ohm's law with
conductivity $\sigma$ the propagation vector k for a dielectric becomes⁵
$$k^2 = \varepsilon\,\frac{\omega^2}{c^2}\left(1 + i\,\frac{4\pi\sigma}{\omega\varepsilon}\right) \tag{7.77}$$
(with $\mu$, the magnetic permeability, taken to be unity). The presence of the
conductivity (which means absorption) gives rise to an imaginary part. The
propagation vector k (and therefore the index of refraction n) has become
complex.
Conversely, the (positive) imaginary part implies absorption. For poor
conductivity ($4\pi\sigma/\omega\varepsilon \ll 1$) a binomial expansion yields
$$k = \sqrt{\varepsilon}\,\frac{\omega}{c} + i\,\frac{2\pi\sigma}{c\sqrt{\varepsilon}}$$
and
$$e^{i(kx - \omega t)} = e^{i\omega(x\sqrt{\varepsilon}/c - t)}\, e^{-2\pi\sigma x/(c\sqrt{\varepsilon})},$$
an attenuated wave.
Returning to the general expression for $k^2$, we find from Eq. 7.77 that the index of
refraction becomes
$$n^2 = \frac{c^2 k^2}{\omega^2} = \varepsilon + i\,\frac{4\pi\sigma}{\omega}. \tag{7.78}$$
We take $n^2$ to be a function of the complex variable $\omega$ (with $\varepsilon$ and $\sigma$ depending
on $\omega$). However, $n^2$ does not vanish as $\omega \to \infty$ but instead approaches unity.
So to satisfy the condition, Eq. 7.66, one works with $f(\omega) = n^2(\omega) - 1$. The
original Kronig-Kramers optical dispersion relations were in the form of
$$\Re[n^2(\omega_0) - 1] = \frac{2}{\pi}\, P\!\int_0^{\infty} \frac{\omega\, \Im[n^2(\omega) - 1]}{\omega^2 - \omega_0^2}\, d\omega,$$
$$\Im[n^2(\omega_0) - 1] = -\frac{2}{\pi}\, P\!\int_0^{\infty} \frac{\omega_0\, \Re[n^2(\omega) - 1]}{\omega^2 - \omega_0^2}\, d\omega. \tag{7.79}$$
Knowledge of the absorption coefficient at all frequencies specifies the real
part of the index of refraction and vice versa.
5 See J. D. Jackson, Classical Electrodynamics, 2nd ed., Section 7.7, New York:
Wiley (1975). Equation 7.77 follows Jackson in the use of Gaussian units.
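The poor-conductor expansion of Eq. 7.77 can be compared with the exact square root. The sketch below is illustrative only: the numerical values of $\omega$, $\varepsilon$, and $\sigma$ are arbitrary assumptions in Gaussian-style units with c = 1, and the function names are hypothetical.

```python
import cmath
import math

def k_exact(omega, eps, sigma, c=1.0):
    """Principal square root of Eq. 7.77 (positive real and imaginary parts)."""
    return cmath.sqrt(eps * omega ** 2 / c ** 2 * (1 + 1j * 4 * math.pi * sigma / (omega * eps)))

def k_binomial(omega, eps, sigma, c=1.0):
    """First-order binomial expansion, valid when 4*pi*sigma/(omega*eps) << 1."""
    return math.sqrt(eps) * omega / c + 1j * 2 * math.pi * sigma / (c * math.sqrt(eps))

# Assumed illustrative values; the small parameter is about 0.0056 here.
omega, eps, sigma = 10.0, 2.25, 0.01
small = 4 * math.pi * sigma / (omega * eps)

k1 = k_exact(omega, eps, sigma)
k2 = k_binomial(omega, eps, sigma)
rel_err = abs(k1 - k2) / abs(k1)

# The positive imaginary part of k damps the wave: |exp(i k x)| = exp(-Im(k) x).
amplitude_at_x10 = abs(cmath.exp(1j * k1 * 10.0))
print(small, rel_err, amplitude_at_x10)
```

The relative error should be of second order in the small parameter, and the amplitude decays with distance, as expected for an attenuated wave.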
The Parseval Relation
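When the functions u(x) and v(x) are Hilbert transforms of each other and
each is square integrable,⁶ the two functions are related by
$$\int_{-\infty}^{\infty} |u(x)|^2\, dx = \int_{-\infty}^{\infty} |v(x)|^2\, dx. \tag{7.80}$$
This is the Parseval relation.
To derive Eq. 7.80, we start with
$$\int_{-\infty}^{\infty} |u(x)|^2\, dx = \int_{-\infty}^{\infty} \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{v(s)\, ds}{s - x}\; \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{v(t)\, dt}{t - x}\; dx,$$
using Eq. 7.71a twice.
Integrating first with respect to x, we have
$$\int_{-\infty}^{\infty} |u(x)|^2\, dx = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \frac{1}{\pi^2}\int_{-\infty}^{\infty} \frac{dx}{(s - x)(t - x)}\; v(s)\, ds\; v(t)\, dt. \tag{7.81}$$
From Exercise 7.3.8 the x integration yields a delta function:
$$\frac{1}{\pi^2}\int_{-\infty}^{\infty} \frac{dx}{(s - x)(t - x)} = \delta(s - t).$$
We have
$$\int_{-\infty}^{\infty} |u(x)|^2\, dx = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} v(s)\,\delta(s - t)\, ds\; v(t)\, dt. \tag{7.82}$$
Then the s integration is carried out by inspection, using the defining property
of the delta function:
$$\int_{-\infty}^{\infty} v(s)\,\delta(s - t)\, ds = v(t). \tag{7.83}$$
Substituting Eq. 7.83 into Eq. 7.82, we have Eq. 7.80, the Parseval relation.
Again, in terms of optics, the presence of refraction over some frequency range
($n \ne 1$) implies the existence of absorption and vice versa.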
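The Parseval relation can be checked on the Hilbert-transform pair of Exercise 7.3.7, $u(x) = x/(x^2 + 1)$ and $v(x) = -1/(x^2 + 1)$. The following is a minimal numerical sketch; the helper name `integrate_real_line` and the choice of the substitution $x = \tan\varphi$ are assumptions made for illustration.

```python
import math

def integrate_real_line(g, n=20000):
    """Midpoint-rule estimate of the integral of g over the real line, using x = tan(phi)."""
    step = math.pi / n
    total = 0.0
    for k in range(n):
        phi = -0.5 * math.pi + (k + 0.5) * step
        total += g(math.tan(phi)) / math.cos(phi) ** 2   # dx = dphi / cos^2(phi)
    return total * step

# Hilbert-transform pair taken from Exercise 7.3.7.
u = lambda x: x / (x * x + 1.0)
v = lambda x: -1.0 / (x * x + 1.0)

norm_u = integrate_real_line(lambda x: u(x) ** 2)
norm_v = integrate_real_line(lambda x: v(x) ** 2)
print(norm_u, norm_v, math.pi / 2)
```

Both integrals should come out equal to $\pi/2$, the value quoted in the answer to Exercise 7.3.7.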
Causality
The real significance of the dispersion relations in physics is that they are a
direct consequence of assuming that the particular physical system obeys
causality. Causality is awkward to define precisely but the general meaning is
that the effect cannot precede the cause. A scattered wave cannot be emitted
by the scattering center before the incident wave has arrived. For linear systems
the most general relation between an input function G (the cause) and an output
function H (the effect) may be written as
$$H(t) = \int_{-\infty}^{\infty} F(t - t')\, G(t')\, dt'. \tag{7.84}$$
Causality is imposed by requiring that
$$F(t - t') = 0 \qquad \text{for } t - t' < 0.$$
Equation 7.84 gives the time dependence. The frequency dependence is obtained
by taking Fourier transforms. By the Fourier convolution theorem, Section
15.5,
$$h(\omega) = f(\omega)\, g(\omega),$$
where $f(\omega)$ is the Fourier transform of F(t), and so on. Conversely, F(t) is the
Fourier transform of $f(\omega)$.
The connection with the dispersion relations is provided by the Titchmarsh
theorem.⁷ This states that if $f(\omega)$ is square integrable over the real $\omega$-axis, then
any one of the following three statements implies the other two.
1. The Fourier transform of $f(\omega)$ is zero for t < 0: Eq.
7.84.
2. Replacing $\omega$ by z, the function f(z) is analytic in the
complex z plane for y > 0 and approaches f(x) almost
everywhere as $y \to 0$. Further,
$$\int_{-\infty}^{\infty} |f(x + iy)|^2\, dx < K \qquad \text{for } y > 0,$$
that is, the integral is bounded.
3. The real and imaginary parts of f(z) are Hilbert
transforms of each other: Eqs. 7.71a and 7.71b.
The assumption that the relationship between the input and the output of
our linear system is causal (Eq. 7.84) means that the first statement is satisfied.
If $f(\omega)$ is square integrable, then the Titchmarsh theorem has the third statement
as a consequence and we have dispersion relations.
6 This means that $\int_{-\infty}^{\infty} |u(x)|^2\, dx$ and $\int_{-\infty}^{\infty} |v(x)|^2\, dx$ are finite.
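The causal form of Eq. 7.84 can be illustrated with a discretized convolution. The sketch below is a toy model under stated assumptions: the response function $F(\tau) = e^{-\tau}$ for $\tau \ge 0$ (zero otherwise) and the rectangular input pulse G switched on at $t' = 1$ are hypothetical choices, and the integral is approximated by a simple Riemann sum.

```python
import math

dt = 0.01
times = [k * dt for k in range(600)]          # grid covering 0 <= t < 6

def F(tau):
    """Assumed causal response function: vanishes for negative argument."""
    return math.exp(-tau) if tau >= 0.0 else 0.0

def G(tp):
    """Assumed input (the cause): a unit pulse lasting from t' = 1 to t' = 2."""
    return 1.0 if 1.0 <= tp < 2.0 else 0.0

def H(t):
    """Riemann-sum approximation to Eq. 7.84 on the grid."""
    return sum(F(t - tp) * G(tp) for tp in times) * dt

response = [H(t) for t in times]

# Largest output before the input is switched on, and the output at t = 2.
early = max(abs(h) for t, h in zip(times, response) if t < 0.95)
h_at_2 = response[200]
print(early, h_at_2, 1.0 - math.exp(-1.0))
```

Because F vanishes for negative argument, the output is identically zero until the pulse begins; for this particular pair the continuum value at t = 2 is $1 - e^{-1}$.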
EXERCISES
7.3.1 The function f(z) satisfies the conditions for the dispersion relations. In addition,
$f(z) = f^*(z^*)$, the Schwarz reflection principle, Section 6.5. Show that f(z) is
identically zero.
7.3.2 For f(z) such that we may replace the closed contour of the Cauchy integral
formula by an integral over the real axis we have
$$f(x_0) = \frac{1}{2\pi i}\left\{\int_{-\infty}^{x_0 - \delta} \frac{f(x)}{x - x_0}\, dx + \int_{C_x} \frac{f(x)}{x - x_0}\, dx + \int_{x_0 + \delta}^{\infty} \frac{f(x)}{x - x_0}\, dx\right\}.$$
Here $C_x$ designates a small semicircle about $x_0$ in the lower half-plane. Show that
this reduces to
$$f(x_0) = \frac{1}{\pi i}\, P\!\int_{-\infty}^{\infty} \frac{f(x)}{x - x_0}\, dx,$$
which is Eq. 7.69.
7 Refer to E. C. Titchmarsh, Introduction to the Theory of Fourier Integrals,
2nd ed., New York: Oxford University Press (1937). For a more informal
discussion of the Titchmarsh theorem and further details on causality see
J. Hilgevoord, Dispersion Relations and Causal Description. Amsterdam:
North-Holland Publishing Co. (1962).
7.3.3 (a) For $f(z) = e^{iz}$, Eq. 7.66 does not hold at the end points, $\arg z = 0, \pi$. Show,
with the help of Jordan's lemma, Section 7.2, that Eq. 7.67 still holds.
(b) For $f(z) = e^{iz}$ verify the dispersion relations, Eq. 7.71 or Eqs. 7.75 and 7.76,
by direct integration.
7.3.4 With $f(x) = u(x) + iv(x)$ and $f(x) = f^*(-x)$, show that as $x_0 \to \infty$,
(a) $$u(x_0) \sim -\frac{2}{\pi x_0^2}\int_0^{\infty} x\, v(x)\, dx,$$
(b) $$v(x_0) \sim \frac{2}{\pi x_0}\int_0^{\infty} u(x)\, dx.$$
In quantum mechanics relations of this form are often called sum rules.
7.3.5 (a) Given the integral equation
$$\frac{1}{1 + x_0^2} = \frac{1}{\pi}\, P\!\int_{-\infty}^{\infty} \frac{u(x)}{x - x_0}\, dx,$$
use Hilbert transforms to determine $u(x_0)$.
(b) Verify that the integral equation of part (a) is satisfied.
(c) From $f(z)|_{y=0} = u(x) + iv(x)$, replace x by z and determine f(z). Verify that
the conditions for the Hilbert transforms are satisfied.
(d) Are the crossing conditions satisfied?
ANS. (a) $u(x_0) = \dfrac{x_0}{1 + x_0^2}$,
(c) $f(z) = (z + i)^{-1}$.
7.3.6 (a) If the real part of the complex index of refraction (squared) is constant (no
optical dispersion), show that the imaginary part is zero (no absorption).
(b) Conversely, if there is absorption, show that there must be dispersion. In
other words, if the imaginary part of $n^2 - 1$ is not zero, show that the real
part of $n^2 - 1$ is not constant.
7.3.7 Given $u(x) = x/(x^2 + 1)$ and $v(x) = -1/(x^2 + 1)$, show by direct evaluation of each
integral that
$$\int_{-\infty}^{\infty} |u(x)|^2\, dx = \int_{-\infty}^{\infty} |v(x)|^2\, dx.$$
ANS. $\int_{-\infty}^{\infty} |u(x)|^2\, dx = \int_{-\infty}^{\infty} |v(x)|^2\, dx = \dfrac{\pi}{2}$.
7.3.8 Take $u(x) = \delta(x)$, a delta function, and assume that the Hilbert transform equations
hold.
(a) Show that
$$\delta(w) = \frac{1}{\pi^2}\int_{-\infty}^{\infty} \frac{dy}{y(y - w)}.$$
(b) With changes of variables $w = s - t$ and $x = s - y$, transform the $\delta$
representation of part (a) into
$$\delta(s - t) = \frac{1}{\pi^2}\int_{-\infty}^{\infty} \frac{dx}{(x - s)(x - t)}.$$
Note. The $\delta$ function is discussed in Section 8.7.
7.3.9 Show that
$$\delta(x) = \frac{1}{\pi^2}\int_{-\infty}^{\infty} \frac{dt}{t(t - x)}$$
is a valid representation of the delta function in the sense that
$$\int_{-\infty}^{\infty} f(x)\,\delta(x)\, dx = f(0).$$
Assume that f(x) satisfies the condition for the existence of a Hilbert transform.
Hint. Apply Eq. 7.69 twice.
7.4 THE METHOD OF STEEPEST DESCENTS
In analyzing problems in mathematical physics, one often finds it desirable
to know the behavior of a function for large values of the variable, that is,
the asymptotic behavior of the function. Specific examples are furnished by the
gamma function (Chapter 10) and the various Bessel functions (Chapter 11).
The method of steepest descents is a method of determining such asymptotic
behavior when the function can be expressed as an integral of the general form
$$I(s) = \int_C g(z)\, e^{s f(z)}\, dz. \tag{7.85}$$
For the present, let us take s to be real. The contour of integration C is then
chosen so that the real part of f(z) approaches minus infinity at both limits and
that the integrand will vanish at the limits, or is chosen as a closed contour.
It is further assumed that the factor g(z) in the integrand is dominated by the
exponential in the region of interest.
If the parameter s is large and positive, the value of the integrand will become
large when the real part of f(z) is large and small when the real part of f(z) is
small or negative. In particular, as s is permitted to increase indefinitely (leading
to the asymptotic dependence), the entire contribution of the integrand to the
integral will come from the region in which the real part of f(z) takes on a
positive maximum value. Away from this positive maximum the integrand will
become negligibly small in comparison. This is seen by expressing f(z) as
$$f(z) = u(x, y) + i v(x, y).$$
Then the integral may be written as
$$I(s) = \int_C g(z)\, e^{s u(x, y)}\, e^{i s v(x, y)}\, dz. \tag{7.86}$$
If now, in addition, we impose the condition that the imaginary part of the
exponent, iv(x, y), be constant in the region in which the real part takes on its
maximum value, that is, $v(x, y) = v(x_0, y_0) = v_0$, we may approximate the
integral by
$$I(s) \approx e^{i s v_0}\int_C g(z)\, e^{s u(x, y)}\, dz. \tag{7.87}$$
Away from the maximum of the real part, the imaginary part may be permitted
to oscillate as it wishes, for the integrand is negligibly small and the varying
phase factor is therefore irrelevant.
The real part of sf(z) is a maximum for a given s when the real part of f(z),
u(x, y), is a maximum. This implies that
$$\frac{\partial u}{\partial x} = \frac{\partial u}{\partial y} = 0,$$
and therefore, by use of the Cauchy-Riemann conditions of Section 6.2,
$$\frac{df(z)}{dz} = 0. \tag{7.88}$$
We proceed to search for such zeros of the derivative.
It is essential to note that the maximum value of u(x, y) is the maximum
only along a given contour. In the finite plane neither the real nor the imaginary
part of our analytic function possesses an absolute maximum. This may be
seen by recalling that both u and v satisfy Laplace's equation
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0. \tag{7.89}$$
From this, if the second derivative with respect to x is positive, the second
derivative with respect to y must be negative, and therefore neither u nor v
can possess an absolute maximum or minimum. Since the function f(z) was
taken to be analytic, singular points are clearly excluded. The vanishing of the
derivative (Eq. 7.88) then implies that we have a saddle point, a stationary
value, which may be a maximum of u(x, y) for one contour and a minimum
for another (Fig. 7.19).
Our problem, then, is to choose the contour of integration to satisfy two
conditions. (1) The contour must be chosen so that u(x, y) has a maximum at
the saddle point. (2) The contour must pass through the saddle in such a way
that the imaginary part, v(x, y), is a constant. This second condition leads to
the path of steepest descent and gives the method its name. From Section 6.2,
especially Exercise 6.2.1, we know that the curves corresponding to u = constant
and v = constant form an orthogonal system. This means that a curve $v = c_i$,
constant, is everywhere tangential to the gradient of u, $\nabla u$. Hence the curve
v = constant is the curve that gives the line of steepest descent from the saddle
point.¹
1 The line of steepest ascent is also characterized by constant v. The saddle
point must be inspected carefully to distinguish the line of steepest descent
from the line of steepest ascent. This is discussed later in two examples.
FIG. 7.19 A saddle point (surface u(x, y) showing the path of steepest descent and the contour lines u = constant)
At the saddle point the function f(z) can be expanded in a Taylor series to
give
$$f(z) = f(z_0) + \tfrac{1}{2}(z - z_0)^2 f''(z_0) + \cdots. \tag{7.90}$$
The first derivative is absent, since obviously Eq. 7.88 is satisfied. The first
correction term, $\tfrac{1}{2}(z - z_0)^2 f''(z_0)$, is real and negative. It is real, for we have
specified that the imaginary part shall be constant along our contour and
negative because we are moving down from the saddle point or mountain
pass. Then, assuming that $f''(z_0) \ne 0$, we have
$$f(z) - f(z_0) \approx \tfrac{1}{2}(z - z_0)^2 f''(z_0) = -\frac{1}{2s}\, t^2, \tag{7.91}$$
which serves to define a new variable t. If $(z - z_0)$ is written in polar form
$$(z - z_0) = \delta e^{i\alpha}, \tag{7.92}$$
(with the phase $\alpha$ held constant), we have
$$t^2 = -s f''(z_0)\, \delta^2 e^{2i\alpha}. \tag{7.93}$$
Since t is real,² it may be written as
$$t = \pm\delta\, |s f''(z_0)|^{1/2}. \tag{7.94}$$
Substituting Eq. 7.91 into Eq. 7.85, we obtain
$$I(s) \approx g(z_0)\, e^{s f(z_0)}\int_{-\infty}^{\infty} e^{-t^2/2}\, \frac{dz}{dt}\, dt. \tag{7.95}$$
We have
$$\frac{dz}{dt} = \left(\frac{dt}{dz}\right)^{-1} = \left(\frac{dt}{d\delta}\,\frac{d\delta}{dz}\right)^{-1} = |s f''(z_0)|^{-1/2}\, e^{i\alpha}, \tag{7.96}$$
from Eqs. 7.92 and 7.94. Equation 7.95 becomes
$$I(s) \approx \frac{g(z_0)\, e^{s f(z_0)}\, e^{i\alpha}}{|s f''(z_0)|^{1/2}}\int_{-\infty}^{\infty} e^{-t^2/2}\, dt. \tag{7.97}$$
It will be noted that the limits have been set as minus infinity to plus infinity. This
is permissible, for the integrand is essentially zero when t departs appreciably
from the origin. Noting that the remaining integral is just a Gauss error integral
equal to $\sqrt{2\pi}$, we finally obtain
$$I(s) \approx \frac{\sqrt{2\pi}\, g(z_0)\, e^{s f(z_0)}\, e^{i\alpha}}{|s f''(z_0)|^{1/2}}. \tag{7.98}$$
The phase $\alpha$ was introduced in Eq. 7.92 as the phase of the contour as it passed
through the saddle point. It is chosen so that the two conditions given [$\alpha$ =
constant; $\Re f(z)$ = maximum] are satisfied. It sometimes happens that the
contour passes through two or more saddle points in succession. If this is the case,
we need only add the contribution made by Eq. 7.98 from each of the saddle
points in order to get an approximation for the total integral.
One note of warning: We assumed that the only significant contribution to
the integral came from the immediate vicinity of the saddle point(s) $z = z_0$, that
is,
$$\Re[f(z)] = u(x, y) \ll u(x_0, y_0)$$
over the entire contour away from $z_0 = x_0 + iy_0$. This condition must be
checked for each new problem (Exercise 7.4.5).
2 The phase of the contour (specified by $\alpha$) at the saddle point is chosen so
that $\Im[f(z) - f(z_0)] = 0$, that is, $\tfrac{1}{2}(z - z_0)^2 f''(z_0)$ must be real.
EXAMPLE 7.4.1 Asymptotic Form of the Hankel Function, $H_\nu^{(1)}(s)$
In Section 11.4 it is shown that the Hankel functions, which satisfy Bessel's
equation, may be defined by
$$H_\nu^{(1)}(s) = \frac{1}{\pi i}\int_{C_1} e^{(s/2)(z - 1/z)}\, \frac{dz}{z^{\nu + 1}}, \tag{7.99}$$
$$H_\nu^{(2)}(s) = \frac{1}{\pi i}\int_{C_2} e^{(s/2)(z - 1/z)}\, \frac{dz}{z^{\nu + 1}}. \tag{7.100}$$
The contour $C_1$ is the curve in the upper half-plane of Fig. 7.20. The contour $C_2$
is in the lower half-plane. We apply the method of steepest descents to the first
Hankel function, $H_\nu^{(1)}(s)$, which is conveniently in the form specified by Eq. 7.85,
with f(z) given by
$$f(z) = \frac{1}{2}\left(z - \frac{1}{z}\right). \tag{7.101}$$
FIG. 7.20 Hankel function contours
By differentiating, we obtain
$$f'(z) = \frac{1}{2} + \frac{1}{2z^2}. \tag{7.102}$$
Setting $f'(z) = 0$ in accordance with Eq. 7.88, we obtain
$$z = i, -i. \tag{7.103}$$
Hence there are saddle points at $z = +i$ and $z = -i$. The integral for $H_\nu^{(1)}(s)$ is
chosen so that it starts at the origin, moves out tangentially to the positive real
axis, and then moves around through the saddle point at $z = +i$ and on out to
minus infinity, asymptotic with the negative real axis. We must choose the
contour through the point $z = +i$ in such a way that the real part of $(z - 1/z)$ will be
a maximum and the phase will be constant in the vicinity of the saddle point.
We have
$$\Re\left(z - \frac{1}{z}\right) = 0 \qquad \text{for } z = i.$$
We require $\Re(z - 1/z) < 0$ for the rest of $C_1$ ($z \ne i$).
In the vicinity of the saddle point at $z_0 = +i$ we have
$$z - i = \delta e^{i\alpha}, \tag{7.104}$$
where $\delta$ is a small number. Then
$$z - \frac{1}{z} = \delta\cos\alpha + i(\delta\sin\alpha + 1) - \frac{1}{\delta\cos\alpha + i(\delta\sin\alpha + 1)}$$
$$= \delta\cos\alpha + i(\delta\sin\alpha + 1) - \frac{\delta\cos\alpha - i(\delta\sin\alpha + 1)}{1 + 2\delta\sin\alpha + \delta^2}. \tag{7.105}$$
Therefore our real part becomes
$$\Re\left(z - \frac{1}{z}\right) = \delta\cos\alpha - \delta\cos\alpha\,(1 + 2\delta\sin\alpha + \delta^2)^{-1}. \tag{7.106}$$
Recalling that $\delta$ is small, we expand by the binomial theorem and neglect terms
of order $\delta^3$ and higher.
$$\Re\left(z - \frac{1}{z}\right) = 2\delta^2\cos\alpha\sin\alpha + O(\delta^3) \approx \delta^2\sin 2\alpha. \tag{7.107}$$
We see that the real part of $(z - 1/z)$ will take on an extreme value if $\sin 2\alpha$ is
an extremum, that is, if $2\alpha$ is $\pi/2$ or $3\pi/2$. Hence the phase of the contour $\alpha$ should
be chosen to be $\pi/4$ or $3\pi/4$. One choice will represent the path of steepest descent
that we want. The other choice will represent a path of steepest ascent that we
must avoid. We distinguish the two possibilities by substituting in the specific
values of $\alpha$. For $\alpha = \pi/4$
$$\Re\left(z - \frac{1}{z}\right) = \delta^2. \tag{7.108}$$
For this choice z = i is a minimum.
For $\alpha = 3\pi/4$
$$\Re\left(z - \frac{1}{z}\right) = -\delta^2 \tag{7.109}$$
and z = i is a maximum. This is the phase we want.
Direct substitution into Eq. 7.98 with $\alpha = 3\pi/4$ now yields
$$H_\nu^{(1)}(s) \approx \frac{1}{\pi i}\, \frac{\sqrt{2\pi}\, i^{-\nu - 1}\, e^{(s/2)(i - 1/i)}\, e^{i(3\pi/4)}}{|(s/2)(-2/i^3)|^{1/2}}. \tag{7.110}$$
By combining terms, we finally obtain
$$H_\nu^{(1)}(s) \approx \sqrt{\frac{2}{\pi s}}\; e^{i(s - \nu\pi/2 - \pi/4)} \tag{7.111}$$
as the leading term of the asymptotic expansion of the Hankel function $H_\nu^{(1)}(s)$.
Additional terms, if desired, may be picked up by assuming a series of descending
powers and substituting back into Bessel's equation.
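The choice between the two candidate phases at the saddle point z = i can be confirmed by direct evaluation. The sketch below simply computes the real part of $z - 1/z$ a small distance $\delta$ from the saddle along each direction; the step size and the function name `re_exponent` are illustrative assumptions.

```python
import cmath
import math

def re_exponent(delta, alpha):
    """Real part of z - 1/z at z = i + delta * exp(i * alpha)."""
    z = 1j + delta * cmath.exp(1j * alpha)
    return (z - 1 / z).real

delta = 1e-3
along_ascent = re_exponent(delta, math.pi / 4)        # expected near +delta^2
along_descent = re_exponent(delta, 3 * math.pi / 4)   # expected near -delta^2
print(along_ascent, along_descent, delta ** 2)
```

The two values should be close to $+\delta^2$ and $-\delta^2$ respectively, with differences of order $\delta^3$, so the direction $3\pi/4$ is the one along which the saddle is a maximum.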
EXAMPLE 7.4.2 Asymptotic Form of the Factorial Function, s!
In many physical problems, particularly in the field of statistical mechanics,
it is desirable to have an accurate approximation of the gamma or factorial
function of very large numbers. As developed in Section 10.1, the factorial
function may be defined by the integral
$$s! = \int_0^{\infty} \rho^s e^{-\rho}\, d\rho = s^{s+1}\int_0^{\infty} e^{s(\ln z - z)}\, dz. \tag{7.112}$$
Here we have made the substitution $\rho = zs$ in order to throw the integral into the
form required by Eq. 7.85. As before, we assume that s is real and positive, from
which it follows that the integrand vanishes at the limits 0 and $\infty$. By
differentiating the z-dependence appearing in the exponent, we obtain
$$\frac{df(z)}{dz} = \frac{d}{dz}(\ln z - z) = \frac{1}{z} - 1, \tag{7.113}$$
which shows that the point z = 1 is a saddle point. We let
$$z - 1 = \delta e^{i\alpha}, \tag{7.114}$$
with $\delta$ small to describe the contour in the vicinity of the saddle point.
Substituting into $f(z) = \ln z - z$, we develop a series expansion
$$f(z) = \ln(1 + \delta e^{i\alpha}) - (1 + \delta e^{i\alpha}) = \delta e^{i\alpha} - \tfrac{1}{2}\delta^2 e^{2i\alpha} + \cdots - 1 - \delta e^{i\alpha} = -1 - \tfrac{1}{2}\delta^2 e^{2i\alpha} + \cdots. \tag{7.115}$$
From this we see that the integrand takes on a maximum value ($e^{-s}$) at the saddle
point if we choose our contour C to follow the real axis, a conclusion that the
reader may well have reached more or less intuitively.
Direct substitution into Eq. 7.98 with $\alpha = 0$ now gives
$$s! \approx \frac{\sqrt{2\pi}\, s^{s+1} e^{-s}}{|s f''(1)|^{1/2}} = \frac{\sqrt{2\pi}\, s^{s+1} e^{-s}}{|s(-1)|^{1/2}}. \tag{7.116}$$
Thus the first term in the asymptotic expansion of the factorial function is
$$s! \approx \sqrt{2\pi s}\; s^s e^{-s}. \tag{7.117}$$
This result is the first term in Stirling's expansion of the factorial function. The
method of steepest descent is probably the easiest way of obtaining this first
term. If more terms in the expansion are desired, then the method of Section
10.3 is preferable.
In the foregoing example the calculation was carried out by assuming s to be
real. This assumption is not necessary. The student may show (Exercise 7.4.6)
that Eq. 7.117 also holds when s is replaced by the complex variable w, provided
only that the real part of w is required to be large and positive.
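The quality of the leading term, Eq. 7.117, can be gauged against the gamma function from the Python standard library. The sketch below is a simple numerical comparison; the sample values of s and the function name `stirling_leading` are arbitrary illustrative choices.

```python
import math

def stirling_leading(s):
    """Leading term of Stirling's expansion, Eq. 7.117."""
    return math.sqrt(2.0 * math.pi * s) * s ** s * math.exp(-s)

# Ratio of the approximation to s! = Gamma(s + 1) for a few sample values.
ratios = {s: stirling_leading(s) / math.gamma(s + 1.0) for s in (5.0, 10.0, 50.0)}
for s, r in ratios.items():
    print(s, r, 1.0 - 1.0 / (12.0 * s))
```

The ratio approaches unity as s grows; the shortfall is roughly $1/(12s)$, which is the size of the next term in the full Stirling series.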
EXERCISES
7.4.1 Using the method of steepest descents, evaluate the second Hankel function given
by
$$H_\nu^{(2)}(s) = \frac{1}{\pi i}\int_{C_2} e^{(s/2)(z - 1/z)}\, \frac{dz}{z^{\nu + 1}},$$
with contour $C_2$ as shown in Fig. 7.20.
ANS. $H_\nu^{(2)}(s) \approx \sqrt{\dfrac{2}{\pi s}}\; e^{-i(s - \pi/4 - \nu\pi/2)}$.
7.4.2 The negative square root in Eq. 7.94 does not appear in Eq. 7.97. What is the
justification for dropping it? Illustrate your argument by detailed reference to
$H_\nu^{(1)}(s)$, Example 7.4.1.
7.4.3 (a) In applying the method of steepest descent to the Hankel function $H_\nu^{(1)}(s)$,
show that
$$\Re[f(z)] < \Re[f(z_0)] = 0$$
for z on the contour $C_1$ but away from the point $z = z_0 = i$.
(b) Show that
$$\Re[f(z)] > 0 \qquad \text{for } 0 < r < 1, \quad \frac{\pi}{2} < \theta \le \pi \ \text{ or } \ -\pi \le \theta < -\frac{\pi}{2},$$
and
$$\Re[f(z)] > 0 \qquad \text{for } r > 1, \quad -\frac{\pi}{2} < \theta < \frac{\pi}{2}$$
(Fig. 7.21).
This is why $C_1$ may not be deformed to pass through the second saddle point
$z = -i$.
FIG. 7.21
7.4.4 Determine the asymptotic dependence of the modified Bessel functions $I_\nu(x)$,
given
$$I_\nu(x) = \frac{1}{2\pi i}\int_C e^{(x/2)(t + 1/t)}\, \frac{dt}{t^{\nu + 1}}.$$
The contour starts and ends at $t = -\infty$, encircling the origin in a positive sense.
There are two saddle points. Only the one at z = +1 contributes significantly to
the asymptotic form.
7.4.5 Determine the asymptotic dependence of the modified Bessel function of the
second kind, $K_\nu(x)$, by using
$$K_\nu(x) = \frac{1}{2}\int_0^{\infty} e^{(-x/2)(s + 1/s)}\, \frac{ds}{s^{1 - \nu}}.$$
7.4.6 Show that Stirling's formula
$$s! \approx \sqrt{2\pi s}\; s^s e^{-s}$$
holds for complex values of s (with $\Re(s)$ large and positive).
Hint. This involves assigning a phase to s and then demanding that $\Im[s f(z)]$ =
constant in the vicinity of the saddle point.
7.4.7 Assume $H_\nu^{(1)}(s)$ to have a negative power-series expansion of the form
$$H_\nu^{(1)}(s) = \sqrt{\frac{2}{\pi s}}\; e^{i(s - \nu\pi/2 - \pi/4)}\sum_{n=0}^{\infty} a_{-n}\, s^{-n},$$
with the coefficient of the summation obtained by the method of steepest descent.
Substitute into Bessel's equation and show that you reproduce the asymptotic
series for $H_\nu^{(1)}(s)$ given in Section 11.6.
REFERENCES
Nussenzveig, H. M., Causality and Dispersion Relations. New York: Academic Press
(1972). Volume 95 in Mathematics and Engineering series.
This is an advanced text covering causality and dispersion relations in the first chapter
and then moving on to develop the implications in a variety of areas of theoretical
physics.
Wyld, H. W., Mathematical Methods for Physics. Reading, Mass.: Benjamin/Cummings
(1976).
This is a relatively advanced text that contains an extensive discussion of the dispersion
relations.
8 DIFFERENTIAL
EQUATIONS
8.1 PARTIAL DIFFERENTIAL EQUATIONS OF
THEORETICAL PHYSICS
Almost all the elementary and numerous advanced parts of theoretical
physics are formulated in terms of differential equations, often partial differential
equations. Among the most frequently encountered are the following:
1. Laplace's equation, $\nabla^2\psi = 0$.
This very common and very important equation
occurs in studies of
a. electromagnetic phenomena including electrostatics,
dielectrics, steady currents, and magnetostatics,
b. hydrodynamics (irrotational flow of perfect
fluid and surface waves),
c. heat flow,
d. gravitation.
2. Poisson's equation, $\nabla^2\psi = -\rho/\varepsilon_0$.
In contrast to the homogeneous Laplace equation,
Poisson's equation is nonhomogeneous with a
source term $-\rho/\varepsilon_0$.
3. The wave (Helmholtz) and time-independent
diffusion equations, $\nabla^2\psi \pm k^2\psi = 0$.
These equations appear in such diverse phenomena
as
a. elastic waves in solids including vibrating
strings, bars, membranes,
b. sound or acoustics,
c. electromagnetic waves,
d. nuclear reactors.
4. The time-dependent diffusion equation
$$\nabla^2\psi = \frac{1}{a^2}\,\frac{\partial\psi}{\partial t}$$
and the corresponding four-dimensional forms
involving the d'Alembertian, a four-dimensional
analog of the Laplacian in Minkowski space,
$$\Box^2 = \frac{\partial^2}{\partial x_\mu^2} = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} + \frac{\partial^2}{(ic)^2\,\partial t^2}.$$
5. The time-dependent wave equation, $\Box^2\psi = 0$.
6. The scalar potential equation, $\Box^2\psi = -\rho/\varepsilon_0$.
Like Poisson's equation this equation is
nonhomogeneous with a source term $-\rho/\varepsilon_0$.
7. The Klein-Gordon equation, $\Box^2\psi = \mu^2\psi$, and the
corresponding vector equations in which the scalar
function $\psi$ is replaced by a vector function.
Other more complicated forms are common.
8. The Schrödinger wave equation,
$$-\frac{\hbar^2}{2m}\nabla^2\psi + V\psi = i\hbar\,\frac{\partial\psi}{\partial t}$$
and
$$-\frac{\hbar^2}{2m}\nabla^2\psi + V\psi = E\psi$$
for the time-independent case.
9. The equations for elastic waves and for viscous
fluids and the telegraphy equation.
10. Maxwell's coupled partial differential equations for
electric and magnetic fields and those of Dirac for
relativistic electron wave functions. For Maxwell's
equations see the Introduction and also Section 1.9.
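All these equations can be written in the form
$$H\psi = F,$$
in which H is a differential operator,
$$H = H\!\left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}, \frac{\partial}{\partial t}, x, y, z, t\right),$$
F is a known function, and $\psi$ is the unknown scalar (or vector) function.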
Two characteristics are particularly important:
1. All these equations are linear¹ in the unknown
function $\psi$. As the easier physical and mathematical
problems are being solved, nonlinear differential
equations such as those describing shock wave
phenomena are receiving more and more attention.
The fundamental equations of atmospheric physics
are nonlinear. Turbulence, perhaps the most
important unsolved problem of classical physics, is
basically nonlinear. However, both the nonlinear
differential equations themselves and the numerical
techniques to which we often resort for determining
solutions are beyond the scope of this book.
2. These equations are all second-order differential
equations. [Maxwell's and Dirac's equations are
first-order but involve two unknown functions.
Eliminating one unknown yields a second-order
differential equation for the other (compare Section
1.9).]
1 Compare Section 2.6 for definition of linearity.
Occasionally, we encounter equations of higher order. In both the theory of
the slow motion of a viscous fluid and the theory of an elastic body we find the
equation
$$(\nabla^2)^2\psi = 0.$$
Fortunately, for introductory treatments such as this one these higher-order
differential equations are relatively rare.
Although not so frequently encountered and perhaps not so important as
second-order differential equations, first-order differential equations do appear
in theoretical physics. The solutions of some of the more important types of
first-order (ordinary) equations are developed in Section 8.2.
Some general techniques for solving the partial differential equations are
discussed in this section.
1. Separation of variables. The partial differential equa-
equation is split into ordinary differential equations that
may be attacked by Frobenius's method, Section 8.5.
This separation technique is introduced in Section
2.6 and is discussed further in Section 8.3. It does not
always work but is often the simplest method when
it does.
2. Integral solutions employing a Green's function. An
introduction to the Green's function technique is
given in Section 8.7. A more detailed treatment
appears in Chapter 16.
3. Other analytical methods such as the use of integral
transforms. Some of the techniques in this class are
developed and applied in Chapter 15.
4. Numerical calculations. The development of modern
high-speed computing machines has opened up a
wealth of possibilities based on the calculus of finite
differences. Here we have the relaxation methods.
In Section 8.8 two numerical methods, the Runge-Kutta
and a predictor-corrector, are applied to
ordinary differential equations.2
8.2 FIRST-ORDER DIFFERENTIAL EQUATIONS
Physics involves some first-order differential equations. For completeness
(and possible review) it seems desirable to touch on them briefly.
We consider here differential equations of the general form
$$\frac{dy}{dx} = f(x,y) = -\frac{P(x,y)}{Q(x,y)}. \tag{8.1}$$
Equation 8.1 is clearly a first-order, ordinary differential equation. It is first-
order because it contains the first and no higher derivatives. Ordinary because
the only derivative dy/dx is an ordinary or total derivative. Equation 8.1 may or
may not be linear, although we shall treat the linear case explicitly later, Eq.
8.10.
Separable Variables
Frequently Eq. 8.1 will have the special form
$$\frac{dy}{dx} = f(x,y) = -\frac{P(x)}{Q(y)}. \tag{8.2}$$
Then it may be rewritten as
$$P(x)\,dx + Q(y)\,dy = 0.$$
Integrating from $(x_0, y_0)$ to $(x, y)$ yields
$$\int_{x_0}^{x} P(x)\,dx + \int_{y_0}^{y} Q(y)\,dy = 0. \tag{8.3}$$
Since the lower limits x0 and y0 contribute constants, we may ignore the lower
limits of integration and simply add a constant of integration. Note that this
separation of variables technique does not require that the differential equation
be linear.
EXAMPLE 8.2.1 Boyle's Law
In differential form Boyle's gas law is
$$\frac{dV}{dP} = -\frac{V}{P}$$
for the volume $V$ of a fixed quantity of gas at pressure $P$ (and constant
temperature). Separating variables, we have
2 For further details of numerical computation the reader could start with
R. W. Hamming's Numerical Methods for Scientists and Engineers. New York:
McGraw-Hill (1973) and proceed to specialized references.
$$\frac{dV}{V} = -\frac{dP}{P}$$
or
$$\ln V = -\ln P + C.$$
With two logarithms already, it is most convenient to rewrite the constant of
integration $C$ as $\ln k$. Then
$$\ln PV = \ln k$$
and
$$PV = k.$$
Exact Differential Equations
We rewrite Eq. 8.1 as
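$$P(x,y)\,dx + Q(x,y)\,dy = 0. \tag{8.4}$$
This equation is said to be exact if we can match it to a differential $d\varphi$,
$$d\varphi = \frac{\partial\varphi}{\partial x}\,dx + \frac{\partial\varphi}{\partial y}\,dy. \tag{8.5}$$
Since Eq. 8.4 has a zero on the right, we look for an unknown function $\varphi(x, y)$ =
constant and $d\varphi = 0$.
We have (if such a function $\varphi(x,y)$ exists)
$$P(x,y)\,dx + Q(x,y)\,dy = \frac{\partial\varphi}{\partial x}\,dx + \frac{\partial\varphi}{\partial y}\,dy \tag{8.6}$$
and
$$\frac{\partial\varphi}{\partial x} = P(x,y), \qquad \frac{\partial\varphi}{\partial y} = Q(x,y). \tag{8.7}$$
The necessary and sufficient condition for our equation to be exact is that the
second, mixed partial derivatives of $\varphi(x, y)$ (assumed continuous) are
independent of the order of differentiation:
$$\frac{\partial^2\varphi}{\partial y\,\partial x} = \frac{\partial P(x,y)}{\partial y} = \frac{\partial Q(x,y)}{\partial x} = \frac{\partial^2\varphi}{\partial x\,\partial y}. \tag{8.8}$$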
Note the resemblance to the equations of Section 1.13, "Potential Theory." If
Eq. 8.4 corresponds to a curl (equal to zero), then a potential, $\varphi(x,y)$, must exist.
If $\varphi(x,y)$ exists then from Eqs. 8.4 and 8.6 our solution is
$$\varphi(x,y) = C. \tag{8.9}$$
We may construct $\varphi(x,y)$ from its partial derivatives just as we constructed a
magnetic vector potential in Section 1.13 from its curl.
It may well turn out that Eq. 8.4 is not exact, that Eq. 8.8 is not satisfied.
However, there always exists at least one and perhaps an infinity of integrating
factors, $\alpha(x,y)$, such that
$$\alpha(x,y)P(x,y)\,dx + \alpha(x,y)Q(x,y)\,dy = 0$$
is exact. Unfortunately, an integrating factor is not always obvious or easy to
find. Unlike the case of the linear first-order differential equation to be
considered next, there is no systematic way to develop an integrating factor for
Eq. 8.4.
A differential equation in which the variables have been separated is
automatically exact. An exact differential equation is not necessarily separable.
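The exactness test of Eq. 8.8 and the construction of $\varphi$ can be tried numerically. The following is a minimal sketch, not part of the original text: the sample equation $(2xy + 1)\,dx + (x^2 + 3y^2)\,dy = 0$, the helper names, the finite-difference step, and the use of Simpson's rule are all assumptions chosen for illustration. The function `phi` uses the line-integral construction quoted in Exercise 8.2.7.

```python
def partial(f, x, y, wrt, h=1e-5):
    """Central-difference estimate of a first partial derivative of f(x, y)."""
    if wrt == "x":
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


def simpson(g, a, b, n=200):
    """Composite Simpson rule for the integral of g from a to b (n even)."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


# Hypothetical sample equation P dx + Q dy = 0 (an assumption for illustration).
def P(x, y):
    return 2 * x * y + 1


def Q(x, y):
    return x * x + 3 * y * y


def is_exact_at(x, y, tol=1e-6):
    """Check Eq. 8.8 at one point: dP/dy should equal dQ/dx."""
    return abs(partial(P, x, y, "y") - partial(Q, x, y, "x")) < tol


def phi(x, y, x0=0.0, y0=0.0):
    """phi(x, y) = int_{x0}^{x} P(t, y) dt + int_{y0}^{y} Q(x0, s) ds."""
    return simpson(lambda t: P(t, y), x0, x) + simpson(lambda s: Q(x0, s), y0, y)
```

For this sample equation the potential can be found by hand, $\varphi = x^2y + x + y^3$, so `phi(1.5, 2.0)` should agree with that expression; the solution curves are then $\varphi(x,y) = C$ as in Eq. 8.9.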
Linear First-order Differential Equations
If $f(x, y)$ in Eq. 8.1 has the form $-p(x)y + q(x)$, then Eq. 8.1 becomes
$$\frac{dy}{dx} + p(x)y = q(x). \tag{8.10}$$
Equation 8.10 is the most general linear first-order differential equation. If
$q(x) = 0$, Eq. 8.10 is homogeneous (in $y$). A nonzero $q(x)$ may represent a source
or a driving term. Equation 8.10 is linear; each term is linear in $y$ or $dy/dx$. There
are no higher powers, such as $y^2$, and no products, such as $y(dy/dx)$. Note that the
linearity refers to the $y$ and $dy/dx$; $p(x)$ and $q(x)$ need not be linear in $x$. Equation
8.10, the most important of these first-order differential equations for physics,
may be solved exactly.
Let us look for an integrating factor $\alpha(x)$ so that
$$\alpha(x)\frac{dy}{dx} + \alpha(x)p(x)y = \alpha(x)q(x) \tag{8.11}$$
may be rewritten as
$$\frac{d}{dx}[\alpha(x)y] = \alpha(x)q(x). \tag{8.12}$$
The purpose of this is to make the left-hand side of Eq. 8.10 a derivative so that
it can be integrated by inspection. It also, incidentally, makes Eq. 8.10 exact.
Expanding Eq. 8.12, we obtain
$$\alpha(x)\frac{dy}{dx} + \frac{d\alpha}{dx}y = \alpha(x)q(x).$$
Comparison with Eq. 8.11 shows that we must require
$$\frac{d\alpha}{dx} = \alpha(x)p(x). \tag{8.13}$$
Here is a differential equation for $\alpha(x)$, with the variables $\alpha$ and $x$ separable. We
separate variables, integrate, and obtain
$$\alpha(x) = \exp\left[\int^{x} p(x)\,dx\right] \tag{8.14}$$
as our integrating factor.
With $\alpha(x)$ known we proceed to integrate Eq. 8.12. This, of course, was the
point of introducing $\alpha$ in the first place. We have
$$\int^{x} \frac{d}{dx}[\alpha(x)y(x)]\,dx = \int^{x} \alpha(x)q(x)\,dx.$$
Now integrating by inspection, we have
$$\alpha(x)y(x) = \int^{x} \alpha(x)q(x)\,dx + C.$$
The constants from a constant lower limit of integration are lumped into the
constant $C$. Dividing by $\alpha(x)$, we obtain
$$y(x) = [\alpha(x)]^{-1}\left\{\int^{x} \alpha(x)q(x)\,dx + C\right\}.$$
Finally, substituting in Eq. 8.14 for $\alpha$ yields
$$y(x) = \exp\left[-\int^{x} p(t)\,dt\right]\left\{\int^{x} \exp\left[\int^{s} p(t)\,dt\right]q(s)\,ds + C\right\}. \tag{8.15}$$
Here the (dummy) variables of integration have been rewritten to make them
unambiguous. Equation 8.15 is the complete general solution of the linear,
first-order differential equation, Eq. 8.10. The portion
$$y_1(x) = C\exp\left[-\int^{x} p(t)\,dt\right] \tag{8.16}$$
corresponds to the case $q(x) = 0$ and is a general solution of the homogeneous
differential equation. The other term in Eq. 8.15,
$$y_2(x) = \exp\left[-\int^{x} p(t)\,dt\right]\int^{x} \exp\left[\int^{s} p(t)\,dt\right]q(s)\,ds, \tag{8.17}$$
is a particular solution corresponding to the specific source term $q(x)$.
The reader might note that if our linear first-order differential equation is
homogeneous ($q = 0$), then it is separable. Otherwise, apart from special cases
such as $p$ = constant, $q$ = constant, or $q(x) = a\,p(x)$, Eq. 8.10 is not separable.
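Equation 8.15 can be evaluated directly by numerical quadrature. The sketch below is an illustration under stated assumptions, not the text's own method: it fixes the lower limits at a chosen point $x_0$, so that $\alpha(x_0) = 1$ and the constant is $C = y(x_0)$, and it uses the trapezoidal rule for both integrals. The function name and the test equation $y' + y = x$ are hypothetical choices.

```python
import math


def solve_linear_first_order(p, q, x0, y0, x1, n=2000):
    """Evaluate Eq. 8.15 at x1 for dy/dx + p(x) y = q(x), y(x0) = y0.

    Both integrals are accumulated with the trapezoidal rule on n
    subintervals. With lower limits at x0, alpha(x0) = 1 and C = y0.
    """
    h = (x1 - x0) / n
    big_p = 0.0      # running integral of p from x0 to s
    integral = 0.0   # running integral of alpha(s) q(s) from x0 to s
    s = x0
    prev = math.exp(big_p) * q(s)
    for i in range(n):
        s_next = x0 + (i + 1) * h
        big_p += 0.5 * h * (p(s) + p(s_next))
        cur = math.exp(big_p) * q(s_next)
        integral += 0.5 * h * (prev + cur)
        s, prev = s_next, cur
    return math.exp(-big_p) * (integral + y0)


# Hypothetical check: y' + y = x with y(0) = 1 has y = x - 1 + 2 exp(-x).
approx = solve_linear_first_order(lambda x: 1.0, lambda x: x, 0.0, 1.0, 2.0)
exact = 2.0 - 1.0 + 2.0 * math.exp(-2.0)
```

The two values `approx` and `exact` should agree to within the quadrature error, which shrinks as `n` grows.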
EXAMPLE 8.2.2 RL Circuit
For a resistance-inductance circuit Kirchhoff's law leads to
$$L\frac{dI(t)}{dt} + RI(t) = V(t)$$
for the current $I(t)$, where $L$ is the inductance and $R$ the resistance, both constant.
$V(t)$ is the time-dependent impressed voltage.
From Eq. 8.14 our integrating factor $\alpha(t)$ is
$$\alpha(t) = \exp\int^{t} \frac{R}{L}\,dt = e^{Rt/L}.$$
Then by Eq. 8.15
$$I(t) = e^{-Rt/L}\left[\int^{t} e^{Rt/L}\,\frac{V(t)}{L}\,dt + C\right],$$
with the constant $C$ to be determined by an initial condition (a boundary
condition).
For the special case $V(t) = V_0$, a constant,
$$I(t) = e^{-Rt/L}\left[\frac{V_0}{L}\,\frac{L}{R}\,e^{Rt/L} + C\right] = \frac{V_0}{R} + Ce^{-Rt/L}.$$
If the initial condition is $I(0) = 0$, then $C = -V_0/R$ and
$$I(t) = \frac{V_0}{R}\left[1 - e^{-Rt/L}\right].$$
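As a sanity check on the constant-voltage result, the closed form can be compared with a direct numerical integration of $L\,dI/dt + RI = V_0$. This is only a sketch: the component values and function names are assumptions made for illustration, and the classical fourth-order Runge-Kutta step is a standard scheme used here as a stand-in integrator.

```python
import math


def rl_current_exact(t, v0, r, l):
    """Closed form I(t) = (V0/R) [1 - exp(-R t / L)] for I(0) = 0."""
    return (v0 / r) * (1.0 - math.exp(-r * t / l))


def rl_current_rk4(t_end, v0, r, l, steps=1000):
    """Integrate dI/dt = (V0 - R I) / L from I(0) = 0 with classical RK4."""
    def didt(i):
        return (v0 - r * i) / l

    h = t_end / steps
    i = 0.0
    for _ in range(steps):
        k1 = didt(i)
        k2 = didt(i + 0.5 * h * k1)
        k3 = didt(i + 0.5 * h * k2)
        k4 = didt(i + h * k3)
        i += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return i


# Hypothetical values: V0 = 10 V, R = 5 ohm, L = 2 H (time constant L/R = 0.4 s).
numeric = rl_current_rk4(1.0, 10.0, 5.0, 2.0)
closed = rl_current_exact(1.0, 10.0, 5.0, 2.0)
```

For times long compared with $L/R$ both values approach the steady current $V_0/R$.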
Conversion to Integral Equation
Our first-order differential equation, Eq. 8.1, may be converted to an integral
equation by direct integration:
$$y(x) - y(x_0) = \int_{x_0}^{x} f[x, y(x)]\,dx. \tag{8.18}$$
As an integral equation there is a possibility of a Neumann series solution
(Section 16.3) with the initial approximation $y(x) \approx y(x_0)$. In the differential equation
literature this is called the "Picard method of successive approximations."
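The successive approximations can be made concrete with a short sketch. The choice $f(x, y) = y$ with $y(0) = 1$ is an assumption made purely for illustration, because each iterate is then a polynomial that can be integrated exactly; the function names are hypothetical.

```python
from fractions import Fraction


def integrate_poly(coeffs):
    """Integrate a polynomial from 0 to x; coeffs[k] multiplies x**k."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]


def picard_exponential(iterations):
    """Picard iterates of Eq. 8.18 for dy/dx = y, y(0) = 1.

    Starts from y_0(x) = y(0) = 1 and repeats
    y_{n+1}(x) = y(0) + integral from 0 to x of y_n(t) dt.
    Returns the polynomial coefficients of the last iterate.
    """
    y = [Fraction(1)]
    for _ in range(iterations):
        integral = integrate_poly(y)
        integral[0] += 1  # add the initial value y(0) = 1
        y = integral
    return y
```

Each pass adds one more term of the Maclaurin series of $e^x$, so the iterates converge to the known solution of this sample equation.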
[FIG. 8.1 Flowchart relating the techniques of this section. Boxes: first-order
differential equation (Eq. 8.1); integral equation (Eq. 8.18); variables separable
(Eq. 8.2), with solution Eq. 8.3; solution Eq. 8.15; solution Exercise 8.2.7.
Link labels: "in special cases"; "when variables are separated."]
The relationships among the various techniques introduced in this section
are shown in Fig. 8.1.
First-order differential equations will be encountered again in Chapter 15 in
connection with Laplace transforms and in Chapter 17 from the Euler equation
of the calculus of variations. Numerical techniques for solving first-order
differential equations are examined in Section 8.8.
EXERCISES
8.2.1 From Kirchhoff's law the current $I$ in an RC (resistance-capacitance) circuit
(Fig. 8.2) obeys the equation
$$R\frac{dI}{dt} + \frac{1}{C}I = 0.$$
(a) Find $I(t)$.
(b) For a capacitance of 10,000 microfarads charged to 100 volts and discharging
through a resistance of 1 megohm, find the current $I$ for $t = 0$ and for
$t = 100$ seconds.
Note. The initial voltage is $I_0R$ or $Q/C$, where $Q = \int_0^\infty I(t)\,dt$.
[FIG. 8.2 RC circuit, showing the resistance R and the capacitance C.]
8.2.2 The Laplace transform of Bessel's equation (n = 0) leads to
$$(s^2 + 1)f'(s) + sf(s) = 0.$$
Solve for f(s).
8.2.3 The decay of a population by catastrophic two-body collisions is described by
$$\frac{dN}{dt} = -kN^2.$$
This is a first-order, nonlinear differential equation. Derive the solution
$$N(t) = N_0\left(1 + \frac{t}{\tau_0}\right)^{-1},$$
where $\tau_0 = (kN_0)^{-1}$. This implies an infinite population at $t = -\tau_0$.
ANS. $N(t) = N_0\left(1 + \dfrac{t}{\tau_0}\right)^{-1}$.
8.2.4 The rate of a particular chemical reaction $A + B \to C$ is proportional to the
concentrations of the reactants A and B:
$$\frac{dC(t)}{dt} = \alpha[A(0) - C(t)][B(0) - C(t)].$$
(a) Find $C(t)$ for $A(0) \neq B(0)$.
(b) Find $C(t)$ for $A(0) = B(0)$.
The initial condition is that $C(0) = 0$.
8.2.5 A boat, coasting through the water, experiences a resisting force proportional to
$v^n$, $v$ being the boat's instantaneous velocity. Newton's second law leads to
$$m\frac{dv}{dt} = -kv^n.$$
With $v(t = 0) = v_0$, $x(t = 0) = 0$, integrate to find $v$ as a function of time and $v$
as a function of distance.
8.2.6 In the first-order differential equation $dy/dx = f(x,y)$ the function $f(x,y)$ is a
function of the ratio $y/x$:
$$\frac{dy}{dx} = g(y/x).$$
Show that the substitution of $u = y/x$ leads to a separable equation in $u$ and $x$.
8.2.7 The differential equation
$$P(x,y)\,dx + Q(x,y)\,dy = 0$$
is exact. Construct a solution
$$\varphi(x,y) = \int_{x_0}^{x} P(x,y)\,dx + \int_{y_0}^{y} Q(x_0,y)\,dy = \text{constant}.$$
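8.2.8 The differential equation
$$P(x,y)\,dx + Q(x,y)\,dy = 0$$
is exact. If
$$\varphi(x,y) = \int_{x_0}^{x} P(x,y)\,dx + \int_{y_0}^{y} Q(x_0,y)\,dy,$$
show that
$$\frac{\partial\varphi}{\partial x} = P(x,y), \qquad \frac{\partial\varphi}{\partial y} = Q(x,y).$$
Hence $\varphi(x, y)$ = constant is a solution of the original differential equation.
8.2.9 Prove that Eq. 8.11 is exact in the sense of Eq. 8.8, provided that $\alpha(x)$ satisfies
Eq. 8.13.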
8.2.10 A certain differential equation has the form
f(x)dx + g(x)h(y)dy = 0,
with none of the functions f(x), g(x), h(y) identically zero. Show that a necessary
and sufficient condition for this equation to be exact is that g(x) = constant.
8.2.11 Show that
$$y(x) = \exp\left[-\int^{x} p(t)\,dt\right]\left\{\int^{x} \exp\left[\int^{s} p(t)\,dt\right]q(s)\,ds + C\right\}$$
is a solution of
$$\frac{dy}{dx} + p(x)y(x) = q(x)$$
by differentiating the expression for $y(x)$ and substituting into the differential
equation.
8.2.12 The motion of a body falling in a resisting medium may be described by
$$m\frac{dv}{dt} = mg - bv$$
when the retarding force is proportional to the velocity, $v$. Find the velocity.
Evaluate the constant of integration by demanding that $v(0) = 0$.
8.2.13 Radioactive nuclei decay according to the law
$$\frac{dN}{dt} = -\lambda N,$$
$N$ being the concentration of a given nuclide and $\lambda$, the particular decay constant.
In a radioactive series of $n$ different nuclides, starting with $N_1$,
$$\frac{dN_1}{dt} = -\lambda_1N_1,$$
$$\frac{dN_2}{dt} = \lambda_1N_1 - \lambda_2N_2, \quad \text{and so on.}$$
Find $N_2(t)$ for the conditions $N_1(0) = N_0$ and $N_2(0) = 0$.
8.2.14 The rate of evaporation from a particular spherical drop of liquid (constant
density) is proportional to its surface area. Assuming this to be the sole mechanism
of mass loss, find the radius of the drop as a function of time.
8.2.15 In the linear homogeneous differential equation
$$\frac{dv}{dt} = -av$$
the variables are separable. When the variables are separated the equation is
exact. Solve this differential equation subject to $v(0) = v_0$ by the following three
methods:
(a) Separating variables and integrating.
(b) Treating the separated variable equation as exact.
(c) Using the result for a linear homogeneous differential equation.
ANS. $v(t) = v_0e^{-at}$.
8.2.16 Bernoulli's equation,
$$\frac{dy}{dx} + f(x)y = g(x)y^n,$$
is nonlinear for $n \neq 0$ or 1. Show that the substitution $u = y^{1-n}$ reduces Bernoulli's
equation to a linear equation.
ANS. $\dfrac{du}{dx} + (1 - n)f(x)u = (1 - n)g(x)$.
8.2.17 Solve the linear, first-order equation, Eq. 8.10, by assuming y(x) = u(x)v(x), where
v(x) is a solution of the corresponding homogeneous equation [q(x) = 0]. This
is the method of variation of parameters due to Lagrange. We apply it to second-
order equations in Exercise 8.6.25.
8.3 SEPARATION OF VARIABLES—ORDINARY
DIFFERENTIAL EQUATIONS
The equations of mathematical physics listed in Section 8.1 are all partial
differential equations. Our first technique for their solution splits the partial
differential equation of n variables into n ordinary differential equations. Each
separation introduces an arbitrary constant of separation. If we have $n$ variables,
we have to introduce $n - 1$ constants, determined by the conditions imposed in
the problem being solved.
In Section 2.6 the technique of separation of variables was illustrated for the
wave equation in cartesian, circular cylindrical, and spherical polar coordinates.
In the spherical polar coordinate system the wave equation
$$\nabla^2\psi + k^2\psi = 0 \tag{8.19}$$
led to an azimuthal equation
$$\frac{d^2\Phi(\varphi)}{d\varphi^2} + m^2\Phi(\varphi) = 0, \tag{8.20}$$
in which $-m^2$ is a separation constant. As an illustration of how the constant is
restricted, we note that $\varphi$ in spherical polar coordinates is an azimuth angle. If
this is a classical problem, we shall certainly require that the azimuthal solution
$\Phi(\varphi)$ be single-valued, that is,
$$\Phi(\varphi + 2\pi) = \Phi(\varphi). \tag{8.21}$$
This is equivalent to requiring the azimuthal solution to have a period of $2\pi$ or
some integral multiple of it.1 Therefore $m$ must be an integer. Which integer it is
depends on the details of the problem. This is discussed in Chapter 9. Whenever
a coordinate corresponds to an axis of translation or to an azimuth angle the
separated equation always has the form
$$\frac{d^2\Phi(\varphi)}{d\varphi^2} = -m^2\Phi(\varphi)$$
for $\varphi$, the azimuth angle, and
$$\frac{d^2Z}{dz^2} = \pm a^2Z(z) \tag{8.22}$$
for $z$, an axis of translation in one of the cylindrical coordinate systems. The
solutions, of course, are $\sin az$ and $\cos az$ for $-a^2$ and the corresponding
hyperbolic function (or exponentials) $\sinh az$ and $\cosh az$ for $+a^2$.
The Legendre equation,
1 This also applies in most quantum mechanical problems but the argument
is much more involved. If $m$ is not an integer, rotation group relations (Section
4.9) and ladder operator relations (Section 12.7) are disrupted. Compare
E. Merzbacher, "Single Valuedness of Wave Functions." Am. J. Phys. 30,
237 (1962).
$$(1 - x^2)\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + l(l + 1)y = 0 \tag{8.23}$$
and the associated Legendre equation
$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + l(l + 1)\Theta - \frac{m^2}{\sin^2\theta}\Theta = 0, \tag{8.24}$$
also appear frequently. As noted in Section 2.6, these equations appear when $\nabla^2$
is used in spherical polar coordinates. Prolate and oblate spheroidal coordinates
also give rise to the Legendre and associated Legendre equations.
A third equation frequently encountered is Bessel's differential equation,
$$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + (x^2 - n^2)y = 0. \tag{8.25}$$
In Sections 2.4 and 2.5 circular cylindrical and spherical polar coordinates
yielded varieties of Bessel's equation. The separation of variables of Laplace's
equation in parabolic coordinates also gives rise to Bessel's equation. It may be
noted that the Bessel equation is notorious for the variety of disguises it may
assume. For an extensive tabulation of possible forms the reader is referred to
Tables of Functions by Jahnke and Emde.3
Other occasionally encountered ordinary differential equations include the
Laguerre and associated Laguerre equations from the supremely important
hydrogen atom problem in quantum mechanics:
$$x\frac{d^2y}{dx^2} + (1 - x)\frac{dy}{dx} + \alpha y = 0, \tag{8.26}$$
$$x\frac{d^2y}{dx^2} + (1 + k - x)\frac{dy}{dx} + \alpha y = 0. \tag{8.27}$$
From the quantum mechanical theory of the linear oscillator we have Hermite's
equation,
$$\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + 2\alpha y = 0. \tag{8.28}$$
Finally, from time to time we find the Chebyshev differential equation
$$(1 - x^2)\frac{d^2y}{dx^2} - x\frac{dy}{dx} + n^2y = 0. \tag{8.29}$$
2 These are equivalent algebraic forms in which $x = \cos\theta$.
3 Fourth revised edition. New York: Dover (1945), p. 146. Also, E. Jahnke,
F. Emde, and F. Losch, Tables of Higher Functions, 6th ed. New York:
McGraw-Hill (1960).
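TABLE 8.1 Solutions in Spherical Polar Coordinates*
$\psi = \sum_{l,m} a_{lm}\psi_{lm}$
1. $\nabla^2\psi = 0$:  $\psi_{lm} = \{r^l,\ r^{-l-1}\}\ \{P_l^m(\cos\theta),\ Q_l^m(\cos\theta)\}\ \{\cos m\varphi,\ \sin m\varphi\}$†
2. $\nabla^2\psi + k^2\psi = 0$:  $\psi_{lm} = \{j_l(kr),\ n_l(kr)\}\ \{P_l^m(\cos\theta),\ Q_l^m(\cos\theta)\}\ \{\cos m\varphi,\ \sin m\varphi\}$†
3. $\nabla^2\psi - k^2\psi = 0$:  $\psi_{lm} = \{i_l(kr),\ k_l(kr)\}\ \{P_l^m(\cos\theta),\ Q_l^m(\cos\theta)\}\ \{\cos m\varphi,\ \sin m\varphi\}$†
* References for some of the functions are $P_l^m(\cos\theta)$, $m = 0$,
Section 12.1; $m \neq 0$, Section 12.5; $Q_l^m(\cos\theta)$, Section 12.10;
$j_l(kr)$, $n_l(kr)$, $i_l(kr)$, and $k_l(kr)$, Section 11.7.
† $\cos m\varphi$ and $\sin m\varphi$ may be replaced by $e^{\pm im\varphi}$.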
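TABLE 8.2 Solutions in Circular Cylindrical Coordinates*
$\psi = \sum_{m,\alpha} a_{m\alpha}\psi_{m\alpha}$, $\nabla^2\psi = 0$
a. $\psi_{m\alpha} = \{J_m(\alpha\rho),\ N_m(\alpha\rho)\}\ \{\cos m\varphi,\ \sin m\varphi\}\ \{e^{-\alpha z},\ e^{\alpha z}\}$
b. $\psi_{m\alpha} = \{I_m(\alpha\rho),\ K_m(\alpha\rho)\}\ \{\cos m\varphi,\ \sin m\varphi\}\ \{\cos\alpha z,\ \sin\alpha z\}$
c. If $\alpha = 0$ (no z-dependence):  $\psi_m = \{\rho^m,\ \rho^{-m}\}\ \{\cos m\varphi,\ \sin m\varphi\}$
* References for the radial functions are $J_m(\alpha\rho)$, Section 11.1; $N_m(\alpha\rho)$,
Section 11.3; $I_m(\alpha\rho)$ and $K_m(\alpha\rho)$, Section 11.5.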
For convenient reference, the forms of the solutions of Laplace's equation,
Helmholtz's equation, and the diffusion equation for spherical polar
coordinates are collected in Table 8.1. The solutions of Laplace's equation in circular
cylindrical coordinates are presented in Table 8.2.
For the Helmholtz and the diffusion equation the constant $\pm k^2$ is added to
the separation constant $\pm\alpha^2$ to define a new parameter $\gamma^2$ or $-\gamma^2$. For the
choice $+\gamma^2$ (with $\gamma^2 > 0$) we get $J_m(\gamma\rho)$ and $N_m(\gamma\rho)$. For the choice $-\gamma^2$ (with
$\gamma^2 > 0$) we get $I_m(\gamma\rho)$ and $K_m(\gamma\rho)$ as previously.
These ordinary differential equations and two generalizations of them will be
examined and systematized in the next section. General properties following
from the form of the differential equations are discussed in Chapter 9. The
individual solutions are developed and applied in Chapters 10 to 13.
The practicing physicist may and probably will meet other second-order
ordinary differential equations, some of which may possibly be transformed into
the examples studied here. Some of these differential equations may be solved by
the techniques of Sections 8.5 and 8.6. Others may require a calculating machine
for a numerical solution.
EXERCISES
8.3.1 The quantum mechanical angular momentum operator is given by $\mathbf{L} = -i(\mathbf{r} \times \nabla)$.
Show that
$$\mathbf{L} \cdot \mathbf{L}\psi = l(l + 1)\psi$$
leads to the associated Legendre equation.
Hint. Exercises 1.9.9 and 2.5.16 may be helpful.
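8.3.2 The one-dimensional Schrodinger wave equation for a particle in a potential field
$V = \frac{1}{2}kx^2$ is
$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \frac{1}{2}kx^2\psi = E\psi.$$
(a) Using $\xi = ax$ and a constant $\lambda$, we have
$$a = \left(\frac{mk}{\hbar^2}\right)^{1/4}, \qquad \lambda = \frac{2E}{\hbar}\left(\frac{m}{k}\right)^{1/2};$$
show that
$$\frac{d^2\psi(\xi)}{d\xi^2} + (\lambda - \xi^2)\psi(\xi) = 0.$$
(b) Substituting
$$\psi(\xi) = y(\xi)e^{-\xi^2/2},$$
show that $y(\xi)$ satisfies the Hermite differential equation.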
8.3.3 Verify that the following are solutions of Laplace's equation:
(a) $\psi_1 = 1/r$,
(b) $\psi_2 = \dfrac{1}{2r}\ln\dfrac{r + z}{r - z}$.
8.3.4 If $\Psi$ is a solution of Laplace's equation, $\nabla^2\Psi = 0$, show that $\partial\Psi/\partial z$ is also a
solution.
Note. The z derivatives of $1/r$ generate the Legendre polynomials, $P_n(\cos\theta)$,
Exercise 12.1.7. The z derivatives of $(1/2r)\ln[(r + z)/(r - z)]$ generate the Legendre
functions, $Q_n(\cos\theta)$.
8.4 SINGULAR POINTS
In this section the concept of a singular point or singularity (as applied to a
differential equation) is introduced. The interest in this concept stems from its
usefulness in (1) classifying differential equations and (2) investigating the
feasibility of a series solution. This feasibility is the topic of Fuchs's theorem,
Sections 8.5 and 8.6. First, a definition.
All the ordinary differential equations listed in Section 8.3 may be solved for
$d^2y/dx^2$. Using the notation $d^2y/dx^2 = y''$, we have1
$$y'' = f(x, y, y'). \tag{8.30}$$
Now, if in Eq. 8.30 $y$ and $y'$ can take on all finite values at $x = x_0$ and $y''$ remains
finite, point $x = x_0$ is an ordinary point. On the other hand, if $y''$ becomes
infinite for any finite choice of $y$ and $y'$, point $x = x_0$ is labeled a singular point.
Another way of presenting this definition of singular point is to write our
homogeneous differential equation as
$$y'' + P(x)y' + Q(x)y = 0. \tag{8.31}$$
Now, if the functions $P(x)$ and $Q(x)$ remain finite at $x = x_0$, point $x = x_0$ is an
ordinary point. However, if either $P(x)$ or $Q(x)$ (or both) diverges as $x \to x_0$,
point $x_0$ is a singular point.
Using Eq. 8.31, we may distinguish between two kinds of singular points.
1. If either $P(x)$ or $Q(x)$ diverges as $x \to x_0$ but $(x - x_0)P(x)$
and $(x - x_0)^2Q(x)$ remain finite as $x \to x_0$, then
$x = x_0$ is called a regular or nonessential singular
point.
2. If $P(x)$ diverges faster than $1/(x - x_0)$, so that $(x - x_0)P(x)$
goes to infinity as $x \to x_0$, or $Q(x)$ diverges faster
than $1/(x - x_0)^2$ so that $(x - x_0)^2Q(x)$ goes to
infinity as $x \to x_0$, then point $x = x_0$ is labeled an
irregular or essential singularity.
These definitions hold for all finite values of $x_0$. The analysis of point $x \to \infty$
is similar to the treatment of functions of a complex variable (Section 6.6). We
set $x = 1/z$, substitute into the differential equation, and then let $z \to 0$. By
changing variables in the derivatives, we have
$$\frac{dy(x)}{dx} = \frac{dy(z^{-1})}{dz}\frac{dz}{dx} = -\frac{1}{x^2}\frac{dy(z^{-1})}{dz} = -z^2\frac{dy(z^{-1})}{dz}, \tag{8.32}$$
$$\frac{d^2y(x)}{dx^2} = \frac{d}{dz}\left[\frac{dy(x)}{dx}\right]\frac{dz}{dx} = (-z^2)\left[-2z\frac{dy(z^{-1})}{dz} - z^2\frac{d^2y(z^{-1})}{dz^2}\right] = 2z^3\frac{dy(z^{-1})}{dz} + z^4\frac{d^2y(z^{-1})}{dz^2}. \tag{8.33}$$
Using these results, we transform Eq. 8.31 into
$$z^4\frac{d^2y}{dz^2} + [2z^3 - z^2P(z^{-1})]\frac{dy}{dz} + Q(z^{-1})y = 0. \tag{8.34}$$
The behavior at $x = \infty$ ($z = 0$) then depends on the behavior of the new
coefficients
$$\frac{2z - P(z^{-1})}{z^2} \quad \text{and} \quad \frac{Q(z^{-1})}{z^4}$$
as $z \to 0$. If these two expressions remain finite, point $x = \infty$ is an ordinary
point. If they diverge no more rapidly than $1/z$ and $1/z^2$, respectively, point
$x = \infty$ is a regular singular point, otherwise an irregular singular point (an
essential singularity).
1 This prime notation, $y' = dy/dx$, was introduced by Lagrange in the late
eighteenth century as an abbreviation for Leibnitz's more explicit but more
cumbersome $dy/dx$.
EXAMPLE 8.4.1
Bessel's equation is
$$x^2y'' + xy' + (x^2 - n^2)y = 0. \tag{8.35}$$
Comparing it with Eq. 8.31, we have
$$P(x) = \frac{1}{x}, \qquad Q(x) = 1 - \frac{n^2}{x^2},$$
which shows that point $x = 0$ is a regular singularity. By inspection we see that
there are no other singular points in the finite range. As $x \to \infty$ ($z \to 0$), from
Eq. 8.34 we have the coefficients
$$\frac{2z - z}{z^2} \quad \text{and} \quad \frac{1 - n^2z^2}{z^4}.$$
Since the latter expression diverges as $1/z^4$, point $x = \infty$ is an irregular or essential
singularity.
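As a further illustration of the same test, here is a short worked sketch (not part of the original example) applying the coefficients of Eq. 8.34 to Hermite's equation, Eq. 8.28, for which $P(x) = -2x$ and $Q(x) = 2\alpha$. Both functions are finite for all finite $x$, so only the point at infinity needs to be examined.

```latex
% Hermite: y'' - 2x y' + 2\alpha y = 0, so P(x) = -2x and Q(x) = 2\alpha.
% With x = 1/z we have P(z^{-1}) = -2/z and Q(z^{-1}) = 2\alpha.
\[
  \frac{2z - P(z^{-1})}{z^{2}} = \frac{2z + 2/z}{z^{2}}
    = \frac{2}{z} + \frac{2}{z^{3}},
  \qquad
  \frac{Q(z^{-1})}{z^{4}} = \frac{2\alpha}{z^{4}} .
\]
% The first coefficient diverges faster than 1/z and the second faster
% than 1/z^2 as z -> 0, so x = \infty is an irregular singular point.
```

This agrees with the entry for Hermite's equation in Table 8.3: no singular points in the finite range and an irregular singularity at infinity.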
The ordinary differential equations of Section 8.3, plus two others, the
hypergeometric and the confluent hypergeometric, have singular points, as shown in
Table 8.3.
It will be seen that the first three equations in the preceding tabulation,
hypergeometric, Legendre, and Chebyshev, all have three regular singular points. The
hypergeometric equation with regular singularities at 0, 1, and $\infty$ is taken as the
standard, the canonical form. The solutions of the other two may then be
expressed in terms of its solutions, the hypergeometric functions. This is done
in Chapter 13.
In a similar manner, the confluent hypergeometric equation is taken as the
canonical form of a linear second-order differential equation with one regular
and one irregular singular point.
EXERCISES
8.4.1 Show that Legendre's equation has regular singularities at $x = -1$, 1, and $\infty$.
8.4.2 Show that Laguerre's equation, like the Bessel equation, has a regular singularity
at $x = 0$ and an irregular singularity at $x = \infty$.
TABLE 8.3
                                                      Regular        Irregular
                                                      singularity    singularity
Equation                                              x =            x =
1. Hypergeometric
   x(x - 1)y'' + [(1 + a + b)x - c]y' + aby = 0.      0, 1, ∞        none
2. Legendre*
   (1 - x^2)y'' - 2xy' + l(l + 1)y = 0.               -1, 1, ∞       none
3. Chebyshev
   (1 - x^2)y'' - xy' + n^2 y = 0.                    -1, 1, ∞       none
4. Confluent hypergeometric
   xy'' + (c - x)y' - ay = 0.                         0              ∞
5. Bessel
   x^2 y'' + xy' + (x^2 - n^2)y = 0.                  0              ∞
6. Laguerre*
   xy'' + (1 - x)y' + αy = 0.                         0              ∞
7. Simple harmonic oscillator
   y'' + ω^2 y = 0.                                   none           ∞
8. Hermite
   y'' - 2xy' + 2αy = 0.                              none           ∞
*The associated equations have the same singular points.
8.4.3 Show that the substitution
$$x \to \frac{1 - x}{2}, \qquad a = -l, \qquad b = l + 1, \qquad c = 1,$$
converts the hypergeometric equation into Legendre's equation.
8.5 SERIES SOLUTIONS—FROBENIUS' METHOD
In this section we develop a method of obtaining one solution of the linear,
second-order, homogeneous differential equation. The method, a series
expansion, will always work, provided the point of expansion is no worse than a
regular singular point. In physics this very gentle condition is almost always
satisfied.
A linear, second-order, homogeneous differential equation may be put in the
form
$$\frac{d^2y}{dx^2} + P(x)\frac{dy}{dx} + Q(x)y = 0. \tag{8.36}$$
The equation is homogeneous because each term contains $y(x)$ or a derivative;
linear because each $y$, $dy/dx$, or $d^2y/dx^2$ appears to the first power, with no
products. In this section we develop (at least) one solution of Eq. 8.36. In Section
8.6 we develop a second, independent solution and prove that no third,
independent solution exists. Therefore the most general solution of Eq. 8.36 may
be written as
$$y(x) = c_1y_1(x) + c_2y_2(x). \tag{8.37}$$
Our physical problem may lead to a nonhomogeneous, linear, second-order
differential equation
$$\frac{d^2y}{dx^2} + P(x)\frac{dy}{dx} + Q(x)y = F(x). \tag{8.38}$$
The function on the right, $F(x)$, represents a source (such as electrostatic charge)
or a driving force (as in a driven oscillator). Specific solutions of this
nonhomogeneous equation are touched on in Exercise 8.6.25. They are explored in some
detail, using Green's function techniques, in Sections 8.7, 16.5, and 16.6, and
with a Laplace transform technique in Section 15.11. Calling this solution $y_p$,
we may add to it any solution of the corresponding homogeneous equation
(Eq. 8.36). Hence the most general solution of Eq. 8.38 is
$$y(x) = c_1y_1(x) + c_2y_2(x) + y_p(x). \tag{8.39}$$
The constants $c_1$ and $c_2$ will eventually be fixed by boundary conditions.
For the present, we assume that $F(x) = 0$, that our differential equation is
homogeneous. We shall attempt to develop a solution of our linear, second-order,
homogeneous differential equation, Eq. 8.36, by substituting in a power
series with undetermined coefficients. Also available as a parameter is the power
of the lowest nonvanishing term of the series. To illustrate, we apply the method
to two important differential equations. First, the linear oscillator equation
$$\frac{d^2y}{dx^2} + \omega^2y = 0, \tag{8.40}$$
with known solutions $y = \sin\omega x$, $\cos\omega x$.
We try
$$y(x) = x^k(a_0 + a_1x + a_2x^2 + a_3x^3 + \cdots) = \sum_{\lambda=0}^{\infty} a_\lambda x^{k+\lambda}, \qquad a_0 \neq 0, \tag{8.41}$$
with the exponent $k$ and all the coefficients $a_\lambda$ still undetermined. Note that $k$
need not be an integer. By differentiating twice, we obtain
$$\frac{dy}{dx} = \sum_{\lambda=0}^{\infty} a_\lambda(k + \lambda)x^{k+\lambda-1},$$
$$\frac{d^2y}{dx^2} = \sum_{\lambda=0}^{\infty} a_\lambda(k + \lambda)(k + \lambda - 1)x^{k+\lambda-2}.$$
By substituting into Eq. 8.40, we have
$$\sum_{\lambda=0}^{\infty} a_\lambda(k + \lambda)(k + \lambda - 1)x^{k+\lambda-2} + \omega^2\sum_{\lambda=0}^{\infty} a_\lambda x^{k+\lambda} = 0. \tag{8.42}$$
From our analysis of the uniqueness of power series (Chapter 5) the coefficients
of each power of $x$ on the left-hand side of Eq. 8.42 must vanish individually.
The lowest power of $x$ appearing in Eq. 8.42 is $x^{k-2}$, for $\lambda = 0$ in the first
summation. The requirement that the coefficient vanish1 yields
$$a_0k(k - 1) = 0.$$
We had chosen $a_0$ as the coefficient of the lowest nonvanishing terms of the series
(Eq. 8.41), hence, by definition, $a_0 \neq 0$. Therefore we have
$$k(k - 1) = 0. \tag{8.43}$$
This equation, coming from the coefficient of the lowest power of $x$, we call the
indicial equation. The indicial equation and its roots are of critical importance to
our analysis. Clearly, in this example we must require either that $k = 0$ or $k = 1$.
Before considering these two possibilities for $k$, we return to Eq. 8.42 and
demand that the remaining net coefficients, say, the coefficient of $x^{k+j}$ ($j \geq 0$),
vanish. We set $\lambda = j + 2$ in the first summation and $\lambda = j$ in the second. (They
are independent summations and $\lambda$ is a dummy index.) This results in
$$a_{j+2}(k + j + 2)(k + j + 1) + \omega^2a_j = 0$$
or
$$a_{j+2} = -a_j\frac{\omega^2}{(k + j + 2)(k + j + 1)}. \tag{8.44}$$
This is a two-term recurrence relation.2 Given $a_j$, we may compute $a_{j+2}$ and then
$a_{j+4}$, $a_{j+6}$, and so on up as far as desired. The reader will note that for this
example, if we start with $a_0$, Eq. 8.44 leads to the even coefficients $a_2$, $a_4$, and so
on, and ignores $a_1$, $a_3$, $a_5$, and so on. Since $a_1$ is arbitrary, let us set it equal to
zero (compare Exercises 8.5.3 and 8.5.4) and then by Eq. 8.44
$$a_3 = a_5 = a_7 = \cdots = 0,$$
and all the odd numbered coefficients vanish. Do not worry about the lost
terms; the object here is to get a solution. The rejected powers of $x$ will actually
reappear when the second root of the indicial equation is used.
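Returning to Eq. 8.43, our indicial equation, we first try the solution $k = 0$.
The recurrence relation (Eq. 8.44) becomes
$$a_{j+2} = -a_j\frac{\omega^2}{(j + 2)(j + 1)}, \tag{8.45}$$
which leads to
$$a_2 = -a_0\frac{\omega^2}{1 \cdot 2} = -\frac{\omega^2}{2!}a_0,$$
$$a_4 = -a_2\frac{\omega^2}{3 \cdot 4} = +\frac{\omega^4}{4!}a_0,$$
$$a_6 = -a_4\frac{\omega^2}{5 \cdot 6} = -\frac{\omega^6}{6!}a_0, \quad \text{and so on.}$$
By inspection (and mathematical induction)
$$a_{2n} = (-1)^n\frac{\omega^{2n}}{(2n)!}a_0, \tag{8.46}$$
and our solution is
$$y(x)_{k=0} = a_0\left[1 - \frac{(\omega x)^2}{2!} + \frac{(\omega x)^4}{4!} - \frac{(\omega x)^6}{6!} + \cdots\right] = a_0\cos\omega x. \tag{8.47}$$
1 Uniqueness of power series, Section 5.7.
2 The recurrence relation may involve three terms: that is, $a_{j+2}$, depending
on $a_j$ and $a_{j-2}$. Equation 13.12 for the Hermite functions provides an example
of this behavior.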
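If we choose the indicial equation root $k = 1$, the recurrence
relation (Eq. 8.44) becomes
$$a_{j+2} = -a_j\frac{\omega^2}{(j + 3)(j + 2)}. \tag{8.48}$$
Substituting in $j = 0, 2, 4$, successively, we obtain
$$a_2 = -a_0\frac{\omega^2}{2 \cdot 3} = -\frac{\omega^2}{3!}a_0,$$
$$a_4 = -a_2\frac{\omega^2}{4 \cdot 5} = +\frac{\omega^4}{5!}a_0,$$
$$a_6 = -a_4\frac{\omega^2}{6 \cdot 7} = -\frac{\omega^6}{7!}a_0, \quad \text{and so on.}$$
Again, by inspection and mathematical induction,
$$a_{2n} = (-1)^n\frac{\omega^{2n}}{(2n + 1)!}a_0. \tag{8.49}$$
For this choice, $k = 1$, we obtain
$$y(x)_{k=1} = a_0x\left[1 - \frac{(\omega x)^2}{3!} + \frac{(\omega x)^4}{5!} - \frac{(\omega x)^6}{7!} + \cdots\right]$$
$$= \frac{a_0}{\omega}\left[(\omega x) - \frac{(\omega x)^3}{3!} + \frac{(\omega x)^5}{5!} - \frac{(\omega x)^7}{7!} + \cdots\right] = \frac{a_0}{\omega}\sin\omega x. \tag{8.50}$$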
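[FIG. 8.3 Schematic form of Eq. 8.42 with the terms grouped by power of x, each
group set equal to zero: (I) $a_0k(k - 1)x^{k-2}$; (II) $a_1(k + 1)k\,x^{k-1}$;
(III) $[a_2(k + 2)(k + 1) + a_0\omega^2]x^k$; (IV) $[a_3(k + 3)(k + 2) + a_1\omega^2]x^{k+1}$.]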
To summarize this approach, we may write Eq. 8.42 schematically as shown
in Fig. 8.3. From the uniqueness of power series (Section 5.7), the total coefficient
of each power of $x$ must vanish, all by itself. The requirement that the first
coefficient (I) vanish leads to the indicial equation, Eq. 8.43. The second
coefficient is handled by setting $a_1 = 0$. The vanishing of the coefficient of $x^k$ (and
higher powers, taken one at a time) leads to the recurrence relation Eq. 8.44.
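The recurrence relation lends itself to a direct numerical check. The following sketch is an illustration only; the function name, the normalization $a_0 = 1$, and the sample values of $\omega$ and $x$ are assumptions. It sums the series of Eq. 8.41 with $a_1 = 0$ and the coefficients generated by Eq. 8.44 for a chosen indicial root $k$.

```python
import math


def frobenius_oscillator(k, omega, x, terms=30):
    """Sum x**k * (a_0 + a_2 x**2 + ...) with a_0 = 1 and a_1 = 0.

    The even coefficients are generated by the two-term recurrence
    a_{j+2} = -a_j * omega**2 / ((k + j + 2) * (k + j + 1))   (Eq. 8.44).
    """
    a = 1.0  # a_0
    total = 0.0
    j = 0
    for _ in range(terms):
        total += a * x ** (k + j)
        a = -a * omega ** 2 / ((k + j + 2) * (k + j + 1))
        j += 2
    return total


# Hypothetical sample values for the comparison.
omega, x = 2.0, 0.7
even_solution = frobenius_oscillator(0, omega, x)  # root k = 0
odd_solution = frobenius_oscillator(1, omega, x)   # root k = 1
```

With $a_0 = 1$ the root $k = 0$ should reproduce $\cos\omega x$ (Eq. 8.47) and the root $k = 1$ should reproduce $(\sin\omega x)/\omega$ (Eq. 8.50).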
This series substitution, known as Frobenius' method, has given us two
series solutions of the linear oscillator equation. However, there are two points
about such series solutions that must be strongly emphasized:
1. The series solution should always be substituted back
into the differential equation, to see if it works, as a
precaution against algebraic and logical errors.
Conversely, if it works, it is a solution.
2. The acceptability of a series solution depends on its
convergence (including asymptotic convergence). It
is quite possible for Frobenius' method to give a
series solution that satisfies the original differential
equation when substituted in the equation but that
does not converge over the region of interest.
Legendre's differential equation illustrates this
situation.
Expansion about $x_0$
Equation 8.41 is an expansion about the origin, $x_0 = 0$. It is perfectly possible
to replace Eq. 8.41 with
$$y(x) = \sum_{\lambda=0}^{\infty} a_\lambda(x - x_0)^{k+\lambda}, \qquad a_0 \neq 0. \tag{8.51}$$
Indeed, for the Legendre, Chebyshev, and hypergeometric equations the choice
$x_0 = 1$ has some advantages. The point $x_0$ should not be chosen at an essential
singularity, or our Frobenius method will probably fail. The resultant series
($x_0$ an ordinary point or regular singular point) will be valid where it converges.
You can expect a divergence of some sort when $|x - x_0| = |z_s - x_0|$, where $z_s$
is the closest singularity to $x_0$ in the complex plane.
Symmetry of Solutions
The alert reader will note that we obtained one solution of even symmetry,
$y_1(x) = y_1(-x)$, and one of odd symmetry, $y_2(x) = -y_2(-x)$. This is not just
an accident but a direct consequence of the form of the differential equation.
Writing a general differential equation as
$$\mathcal{L}(x)y(x) = 0, \tag{8.52}$$
in which $\mathcal{L}(x)$ is the differential operator, we see that for the linear oscillator
equation (Eq. 8.40) $\mathcal{L}(x)$ is even; that is,
$$\mathcal{L}(x) = \mathcal{L}(-x). \tag{8.53}$$
Often this is described as even parity.
Whenever the differential operator has a specific parity or symmetry, either
even or odd, we may interchange $+x$ and $-x$, and Eq. 8.52 becomes
$$\pm\mathcal{L}(x)y(-x) = 0, \tag{8.54}$$
$+$ if $\mathcal{L}(x)$ is even, $-$ if $\mathcal{L}(x)$ is odd. Clearly, if $y(x)$ is a solution of the differential
equation, $y(-x)$ is also a solution. Then any solution may be resolved into even
and odd parts,
$$y(x) = \tfrac{1}{2}[y(x) + y(-x)] + \tfrac{1}{2}[y(x) - y(-x)], \tag{8.55}$$
the first bracket on the right giving an even solution, the second an odd solution.
If we refer back to Section 8.4, we can see that Legendre, Chebyshev, Bessel,
simple harmonic oscillator, and Hermite equations (or differential operators)
all exhibit this even parity. Solutions of all of them may be presented as series
of even powers of x and separate series of odd powers of x. The Laguerre
differential operator has neither even nor odd symmetry; hence its solutions
cannot be expected to exhibit even or odd parity. Our emphasis on parity stems
primarily from the importance of parity in quantum mechanics. We find that
wave functions usually are either even or odd, meaning that they have a definite
parity. Most interactions (beta decay is the big exception) are also even or odd
and the result is that parity is conserved.
Limitations of Series Approach—Bessel's
Equation
This attack on the linear oscillator equation was perhaps a bit too easy.
By substituting the power series (Eq. 8.41) into the differential equation (Eq.
8.40), we obtained two independent solutions with no trouble at all.
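To get some idea of what can happen we try to solve Bessel's equation,
\[ x^2 y'' + x y' + (x^2 - n^2) y = 0, \tag{8.56} \]
using $y'$ for $dy/dx$ and $y''$ for $d^2y/dx^2$. Again, assuming a solution of the form
\[ y(x) = \sum_{\lambda=0}^{\infty} a_\lambda x^{k+\lambda}, \]
we differentiate and substitute into Eq. 8.56. The result is
\[ \sum_{\lambda=0}^{\infty} a_\lambda (k+\lambda)(k+\lambda-1) x^{k+\lambda} + \sum_{\lambda=0}^{\infty} a_\lambda (k+\lambda) x^{k+\lambda} + \sum_{\lambda=0}^{\infty} a_\lambda x^{k+\lambda+2} - \sum_{\lambda=0}^{\infty} a_\lambda n^2 x^{k+\lambda} = 0. \tag{8.57} \]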
By setting $\lambda = 0$, we get the coefficient of $x^k$, the lowest power of $x$ appearing
on the left-hand side:
\[ a_0[k(k-1) + k - n^2] = 0, \tag{8.58} \]
and again $a_0 \neq 0$ by definition. Equation 8.58 therefore yields the indicial
equation
\[ k^2 - n^2 = 0 \tag{8.59} \]
with solutions $k = \pm n$.
It is of some interest to examine the coefficient of $x^{k+1}$ also. Here we obtain
\[ a_1[(k+1)k + k + 1 - n^2] = 0 \]
or
\[ a_1(k+1-n)(k+1+n) = 0. \tag{8.60} \]
For $k = \pm n$ neither $k+1-n$ nor $k+1+n$ vanishes and we must require
$a_1 = 0$.$^3$
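Proceeding to the coefficient of $x^{k+j}$ for $k = n$, we set $\lambda = j$ in the first, second,
and fourth terms of Eq. 8.57 and $\lambda = j - 2$ in the third term. By requiring the
resultant coefficient of $x^{k+j}$ to vanish, we obtain
\[ a_j[(n+j)(n+j-1) + (n+j) - n^2] + a_{j-2} = 0. \]
When $j$ is replaced by $j + 2$, this can be rewritten as
\[ a_{j+2} = -a_j \frac{1}{(j+2)(2n+j+2)}, \tag{8.61} \]
which is the desired recurrence relation. Repeated application of this recurrence
relation leads to
\[ a_2 = -a_0 \frac{1}{2(2n+2)} = -\frac{a_0\, n!}{2^2\, 1!\,(n+1)!}, \]
\[ a_4 = -a_2 \frac{1}{4(2n+4)} = +\frac{a_0\, n!}{2^4\, 2!\,(n+2)!}, \]
\[ a_6 = -a_4 \frac{1}{6(2n+6)} = -\frac{a_0\, n!}{2^6\, 3!\,(n+3)!}, \]
and in general
\[ a_{2p} = (-1)^p \frac{a_0\, n!}{2^{2p}\, p!\,(n+p)!}. \tag{8.62} \]
Inserting these coefficients in our assumed series solution, we have
\[ y(x) = a_0 x^n \left[ 1 - \frac{n!\,x^2}{2^2\,1!\,(n+1)!} + \frac{n!\,x^4}{2^4\,2!\,(n+2)!} - \cdots \right]. \tag{8.63} \]
$^3$ $k = \pm n = -\tfrac{1}{2}$ are exceptions.
In summation form
\[ y(x) = a_0 \sum_{j=0}^{\infty} (-1)^j \frac{n!\,x^{n+2j}}{2^{2j}\,j!\,(n+j)!}
       = a_0 2^n n! \sum_{j=0}^{\infty} (-1)^j \frac{1}{j!\,(n+j)!} \left(\frac{x}{2}\right)^{n+2j}. \tag{8.64} \]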
In Chapter 11 the final summation is identified as the Bessel function $J_n(x)$.
Notice that this solution $J_n(x)$ has either even or odd symmetry$^4$ as might be
expected from the form of Bessel's equation.
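The recurrence relation and the closed form for the coefficients can be checked against each other in exact rational arithmetic. The following is only a minimal sketch: the function names are hypothetical, $a_0$ is set to 1, and $n$ is assumed to be a nonnegative integer so that the factorials are defined.

```python
from fractions import Fraction
from math import factorial

def bessel_coeffs_recurrence(n, terms):
    """Even-index coefficients a_0, a_2, a_4, ... with a_0 = 1, built from
    a_{j+2} = -a_j / ((j + 2)(2n + j + 2))  (the recurrence of Eq. 8.61)."""
    coeffs = [Fraction(1)]
    j = 0
    for _ in range(terms - 1):
        coeffs.append(-coeffs[-1] / ((j + 2) * (2 * n + j + 2)))
        j += 2
    return coeffs

def bessel_coeff_closed(n, p):
    """a_{2p} = (-1)^p n! / (2^(2p) p! (n+p)!) with a_0 = 1 (Eq. 8.62)."""
    return Fraction((-1) ** p * factorial(n),
                    2 ** (2 * p) * factorial(p) * factorial(n + p))

n = 2  # illustrative choice of the order
from_recurrence = bessel_coeffs_recurrence(n, 6)
from_closed_form = [bessel_coeff_closed(n, p) for p in range(6)]
print(from_recurrence == from_closed_form)
```

Because `Fraction` keeps every coefficient exact, agreement of the two lists is a term-by-term check rather than a numerical coincidence.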
When $k = -n$ and $n$ is not an integer, we may generate a second distinct
series to be labeled $J_{-n}(x)$. However, when $-n$ is a negative integer, trouble
develops. The recurrence relation for the coefficients $a_j$ is still given by Eq. 8.61,
but with $2n$ replaced by $-2n$. Then, when $j + 2 = 2n$ or $j = 2(n - 1)$, the
coefficient $a_{j+2}$ blows up and we have no series solution. This catastrophe can
be remedied in Eq. 8.64, as it is done in Chapter 11, with the result that
\[ J_{-n}(x) = (-1)^n J_n(x), \quad n \text{ an integer}. \tag{8.65} \]
The second solution simply reproduces the first. We have failed to construct a
second independent solution for Bessel's equation by this series technique when
n is an integer.
By substituting in an infinite series, we have obtained two solutions for the
linear oscillator equation and one for Bessel's equation (two if n is not an integer).
To the questions "Can we always do this? Will this method always work?"
the answer is no, we cannot always do this. This method of series solution will
not always work.
Regular and Irregular Singularities
The success of the series substitution method depends on the roots of the
indicial equation and the degree of singularity of the coefficients in the
differential equation. To understand better the effect of the equation coefficients on
this naive series substitution approach, consider four simple equations:
\[ y'' - \frac{6}{x^2}\,y = 0, \tag{8.66a} \]
\[ y'' - \frac{6}{x^3}\,y = 0, \tag{8.66b} \]
\[ y'' + \frac{1}{x}\,y' - \frac{a^2}{x^2}\,y = 0, \tag{8.66c} \]
\[ y'' + \frac{1}{x^2}\,y' - \frac{a^2}{x^2}\,y = 0. \tag{8.66d} \]
The reader may show easily that for Eq. 8.66a the indicial equation is
\[ k^2 - k - 6 = 0, \]
giving $k = 3, -2$. Since the equation is homogeneous in $x$ (counting $d^2/dx^2$ as
$x^{-2}$), there is no recurrence relation; $a_i = 0$ for $i > 0$. However, we are left with
two perfectly good solutions, $x^3$ and $x^{-2}$.
$^4$ $J_n(x)$ is an even function if $n$ is an even integer, an odd function if $n$ is an odd
integer. For nonintegral $n$ the $x^n$ has no such simple symmetry.
Equation 8.66b differs from Eq. 8.66a by only one power of x, but this sends
the indicial equation to
\[ -6a_0 = 0, \]
with no solution at all, for we have agreed that $a_0 \neq 0$. Our series substitution
worked for Eq. 8.66a, which had only a regular singularity, but broke down at
Eq. 8.66b, which has an irregular singular point at the origin.
Continuing with Eq. 8.66c, we have added a term $y'/x$. The indicial equation
is
\[ k^2 - a^2 = 0, \]
but again, there is no recurrence relation. The solutions are $y = x^a, x^{-a}$, both
perfectly acceptable one-term series.
When we change the power of $x$ in the coefficient of $y'$ from $-1$ to $-2$,
Eq. 8.66d, there is a drastic change in the solution. The indicial equation (with
only the $y'$ term contributing) becomes
\[ k = 0. \]
There is a recurrence relation
\[ a_{j+1} = a_j\,\frac{a^2 - j(j-1)}{j+1}. \]
Unless the parameter $a$ is selected to make the series terminate, we have
\[ \lim_{j\to\infty} \left| \frac{a_{j+1}}{a_j} \right| = \lim_{j\to\infty} \frac{j(j-1)}{j+1} = \lim_{j\to\infty} \frac{j^2}{j} = \infty. \]
Hence our series solution diverges for all $x \neq 0$. Again, our method worked
for Eq. 8.66c with a regular singularity but failed when we had the irregular
singularity of 8.66d.
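The behavior of the coefficients for the equation with the irregular singularity can be seen directly by iterating the recurrence $a_{j+1} = a_j[a^2 - j(j-1)]/(j+1)$ with $k = 0$. The sketch below is illustrative only: the function name is hypothetical, $a_0$ is taken as 1, and the two values of $a^2$ are arbitrary choices, one that makes the series terminate and one that does not.

```python
from fractions import Fraction

def coeffs_irregular(a_squared, count):
    """Coefficients a_0 .. a_{count-1} for y'' + y'/x^2 - (a^2/x^2) y = 0,
    using k = 0, a_0 = 1 and a_{j+1} = a_j (a^2 - j(j - 1)) / (j + 1)."""
    coeffs = [Fraction(1)]
    for j in range(count - 1):
        coeffs.append(coeffs[-1] * (a_squared - j * (j - 1)) / (j + 1))
    return coeffs

# a^2 = 2: the factor a^2 - j(j - 1) vanishes at j = 2, so the series stops.
terminating = coeffs_irregular(2, 6)

# a^2 = 1: the factor never vanishes, and |a_{j+1} / a_j| keeps growing.
growing = coeffs_irregular(1, 12)
ratios = [abs(growing[j + 1] / growing[j]) for j in range(2, 11)]

print(terminating)
print([float(r) for r in ratios])
```

For the nonterminating case the ratio of successive coefficients grows roughly like $j$, so no nonzero $x$ can make the terms decrease, in line with the limit quoted above.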
Fuchs's Theorem
The answer to the basic question of when the method of series substitution
can be expected to work is given by Fuchs's theorem, which asserts that we can
always obtain at least one power-series solution, provided we are expanding
about a point that is an ordinary point or at worst a regular singular point.
If we attempt an expansion about an irregular or essential singularity, our
method may fail as it did for Eqs. 8.66b and 8.66d. Fortunately, the more
important equations of mathematical physics listed in Section 8.4 have no
irregular singularities in the finite plane. Further discussion of Fuchs's theorem
appears in Section 8.6.
From Table 8.3, Section 8.4, infinity is seen to be a singular point for all
equations considered. As a further illustration of Fuchs's theorem, Legendre's
equation (with infinity as a regular singularity) has a convergent series solution
in negative powers of the argument (Section 12.10). In contrast, Bessel's equation
(with an irregular singularity at infinity) yields asymptotic series (Sections 5.10
and 11.6). Although extremely useful, these asymptotic solutions are technically
divergent.
Summary
If we are expanding about an ordinary point or at worst about a regular
singularity, the series substitution approach will yield at least one solution
(Fuchs's theorem).
Whether we get one or two distinct solutions depends on the roots of the
indicial equation.
1. If the two roots of the indicial equation are equal,
we can obtain only one solution by this series
substitution method.
2. If the two roots differ by a nonintegral number, two
independent solutions may be obtained.
3. If the two roots differ by an integer, the larger of the
two will yield a solution.
The smaller may or may not give a solution, depending on the behavior of the
coefficients. In the linear oscillator equation we obtain two solutions; for Bessel's
equation, only one solution.
The usefulness of the series solution for obtaining actual numbers depends on
the rapidity of convergence of the series and the availability of the
coefficients. Many, probably most, differential equations will not
yield nice simple recurrence relations for the coefficients. In general, the available
series will probably be useful for $|x|$ (or $|x - x_0|$) very small. Computers can
be used to determine additional series coefficients using a language such as
FORMAC. Often, however, for numerical work a direct numerical integration
will be preferred (Section 8.8).
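The dependence on the rapidity of convergence can be illustrated with the even series of the linear oscillator equation $y'' + \omega^2 y = 0$, for which the two-term recurrence is $a_{j+2} = -\omega^2 a_j/[(j+1)(j+2)]$ and the sum is $\cos\omega x$. This is a rough sketch under stated assumptions: the function name, the tolerance, and the sample points are all illustrative choices, not part of the text.

```python
import math

def cos_series(x, omega=1.0, tol=1e-10, max_terms=200):
    """Sum the even oscillator series with a_0 = 1, generating each term from
    a_{j+2} = -omega^2 a_j / ((j + 1)(j + 2)).
    Returns (partial_sum, number_of_terms_used)."""
    term, total, j = 1.0, 0.0, 0
    for used in range(1, max_terms + 1):
        total += term
        term *= -(omega * x) ** 2 / ((j + 1) * (j + 2))
        j += 2
        if abs(term) < tol:  # stop once the next term is negligible
            return total, used
    return total, max_terms

small = cos_series(0.1)    # |x| small: a handful of terms suffices
large = cos_series(10.0)   # |x| large: many more terms are required

print(small, math.cos(0.1))
print(large, math.cos(10.0))
```

The series converges for every $x$, but the number of terms needed, and the size of the intermediate terms, grows quickly with $|x|$, which is why direct numerical integration is often preferred away from the expansion point.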
EXERCISES
8.5.1 Uniqueness theorem. The function $y(x)$ satisfies a second-order, linear,
homogeneous differential equation. At $x = x_0$, $y(x) = y_0$, and $dy/dx = y'_0$. Show that
$y(x)$ is unique in that no other solution of this differential equation passes through
the point $(x_0, y_0)$ with a slope of $y'_0$.
Hint. Assume a second solution satisfying these conditions and compare the
Taylor series expansions.
8.5.2 A series solution of Eq. 8.36 is attempted, expanding about the point $x = x_0$.
If $x_0$ is an ordinary point show that the indicial equation has roots $k = 0, 1$.
8.5.3 In the development of a series solution of the simple harmonic oscillator equation
the second series coefficient $a_1$ was neglected except to set it equal to zero. From
the coefficient of the next to the lowest power of $x$, $x^{k-1}$, develop a second indicial
type equation.
(a) (SHO equation with $k = 0$). Show that $a_1$ may be assigned any finite value
(including zero).
(b) (SHO equation with $k = 1$). Show that $a_1$ must be set equal to zero.
8.5.4 Analyze the series solutions of the following differential equations to see when $a_1$
may be set equal to zero without irrevocably losing anything and when $a_1$ must
be set equal to zero.
(a) Legendre, (b) Chebyshev, (c) Bessel, (d) Hermite.
ANS. (a) Legendre, (b) Chebyshev, and (d) Hermite: For $k = 0$, $a_1$ may
be set equal to zero; for $k = 1$, $a_1$ must be set equal to zero.
(c) Bessel: $a_1$ must be set equal to zero (except for $k = \pm n = -\tfrac{1}{2}$).
8.5.5 Solve the Legendre equation
\[ (1 - x^2)y'' - 2xy' + n(n+1)y = 0 \]
by direct series substitution.
(a) Verify that the indicial equation is
\[ k(k-1) = 0. \]
(b) Using $k = 0$, obtain a series of even powers of $x$ ($a_1 = 0$):
\[ y_{\text{even}} = a_0\left[ 1 - \frac{n(n+1)}{2!}x^2 + \frac{n(n-2)(n+1)(n+3)}{4!}x^4 + \cdots \right], \]
where
\[ a_{j+2} = \frac{j(j+1) - n(n+1)}{(j+1)(j+2)}\,a_j. \]
(c) Using $k = 1$, develop a series of odd powers of $x$ ($a_1 = 0$):
\[ y_{\text{odd}} = a_0\left[ x - \frac{(n-1)(n+2)}{3!}x^3 + \frac{(n-1)(n-3)(n+2)(n+4)}{5!}x^5 + \cdots \right], \]
where
\[ a_{j+2} = \frac{(j+1)(j+2) - n(n+1)}{(j+2)(j+3)}\,a_j. \]
(d) Show that both solutions, $y_{\text{even}}$ and $y_{\text{odd}}$, diverge for $x = \pm 1$ if the series
continue to infinity.
(e) Finally, show that by an appropriate choice of n, one series at a time may
be converted into a polynomial, thereby avoiding the divergence catastrophe.
In quantum mechanics this restriction of n to integral values corresponds
to quantization of angular momentum.
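8.5.6 Develop series solutions for Hermite's differential equation
(a) $y'' - 2xy' + 2ay = 0$.
ANS. $k(k-1) = 0$, indicial equation.
For $k = 0$
\[ a_{j+2} = 2a_j\,\frac{j - a}{(j+1)(j+2)} \quad (j \text{ even}), \]
\[ y_{\text{even}} = a_0\left[ 1 + \frac{2(-a)x^2}{2!} + \frac{2^2(-a)(2-a)x^4}{4!} + \cdots \right]. \]
For $k = 1$
\[ a_{j+2} = 2a_j\,\frac{j + 1 - a}{(j+2)(j+3)} \quad (j \text{ even}), \]
\[ y_{\text{odd}} = a_0\left[ x + \frac{2(1-a)x^3}{3!} + \frac{2^2(1-a)(3-a)x^5}{5!} + \cdots \right]. \]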
(b) Show that both series solutions are convergent for all $x$, the ratio of successive
coefficients behaving, for large index, like the corresponding ratio in the
expansion of $\exp(x^2)$.
(c) Show that by appropriate choice of a the series solutions may be cut off and
converted to finite polynomials. (These polynomials, properly normalized,
become the Hermite polynomials in Section 13.1.)
8.5.7 Laguerre's differential equation is
\[ xL_n''(x) + (1 - x)L_n'(x) + nL_n(x) = 0. \]
Develop a series solution selecting the parameter $n$ to make your series a
polynomial.
8.5.8 Solve the Chebyshev equation
\[ (1 - x^2)T_n'' - xT_n' + n^2T_n = 0 \]
by series substitution. What restrictions are imposed on $n$ if you demand that
the series solution converge for $x = \pm 1$?
ANS. The infinite series does converge for $x = \pm 1$. Therefore no
restriction on $n$ exists (compare Exercise 5.2.16).
8.5.9 Solve
\[ (1 - x^2)U_n''(x) - 3xU_n'(x) + n(n+2)U_n(x) = 0, \]
choosing the root of the indicial equation to obtain a series of odd powers of $x$.
Since the series will diverge for $x = 1$, choose $n$ to convert it into a polynomial.
ANS. $k(k-1) = 0$.
For $k = 1$
\[ a_{j+2} = \frac{(j+1)(j+3) - n(n+2)}{(j+2)(j+3)}\,a_j. \]
8.5.10 Obtain a series solution of the hypergeometric equation
\[ x(x-1)y'' + [(1 + a + b)x - c]y' + aby = 0. \]
Test your solution for convergence.
8.5.11 Obtain two series solutions of the confluent hypergeometric equation
xy" + (c — x)y' — ay = 0.
Test your solutions for convergence.
8.5.12 A quantum mechanical analysis of the Stark effect (parabolic coordinates) leads
to the differential equation
\[ \frac{d}{d\xi}\left( \xi\frac{du}{d\xi} \right) + \left( \tfrac{1}{2}E\xi + \alpha - \frac{m^2}{4\xi} - \tfrac{1}{4}F\xi^2 \right)u = 0. \]
Here $\alpha$ is a separation constant, $E$ is the total energy, and $F$ is a constant, where
$Fz$ is the potential energy added to the system by the introduction of an electric
field.
Using the larger root of the indicial equation, develop a power series solution
about $\xi = 0$. Evaluate the first three coefficients in terms of $a_0$.
ANS. Indicial equation $k^2 - \dfrac{m^2}{4} = 0$,
\[ u(\xi) = a_0\xi^{m/2}\left\{ 1 - \frac{\alpha}{m+1}\xi + \left[ \frac{\alpha^2}{2(m+1)(m+2)} - \frac{E}{4(m+2)} \right]\xi^2 + \cdots \right\}. \]
Note that the perturbation $F$ does not appear until $a_3$ is included.
8.5.13 For the special case of no azimuthal dependence, the quantum mechanical
analysis of the hydrogen molecular ion leads to the equation
\[ \frac{d}{d\eta}\left[ (1 - \eta^2)\frac{du}{d\eta} \right] + \alpha u + \beta\eta^2 u = 0. \]
Develop a power-series solution for $u(\eta)$. Evaluate the first three nonvanishing
coefficients in terms of $a_0$.
ANS. Indicial equation $k(k-1) = 0$,
\[ u_{k=1} = a_0\eta\left\{ 1 + \frac{2 - \alpha}{6}\eta^2 + \left[ \frac{(2-\alpha)(12-\alpha)}{120} - \frac{\beta}{20} \right]\eta^4 + \cdots \right\}. \]
8.5.14 To a good approximation, the interaction of two nucleons may be described by
a meson potential
\[ V = \frac{Ae^{-ax}}{x}, \]
attractive for $A$ negative. Develop a series solution of the resultant Schrödinger
wave equation
\[ \frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + (E - V)\psi = 0 \]
through the first three nonvanishing coefficients.
ANS. $\psi_{k=1} = a_0\{x + \tfrac{1}{2}A'x^2 + \tfrac{1}{6}[\tfrac{1}{2}A'^2 - E' - aA']x^3 + \cdots\}$,
where the prime indicates multiplication by $2m/\hbar^2$.
8.5.15 Near the nucleus of a complex atom the potential energy of one electron is given
by
\[ V = \frac{Ze^2}{r}(1 + b_1r + b_2r^2), \]
where the coefficients $b_1$ and $b_2$ arise from screening effects. For the case of zero
angular momentum show that the first three terms of the solution of the
Schrödinger equation have the same form as those of Exercise 8.5.14. By
appropriate translation of coefficients or parameters, write out the first three terms in
a series expansion of the wave function.
8.5.16 If the parameter $a^2$ in Eq. 8.66d is equal to 2, Eq. 8.66d becomes
\[ y'' + \frac{1}{x^2}y' - \frac{2}{x^2}y = 0. \]
From the indicial equation and the recurrence relation derive a solution
$y = 1 + 2x + 2x^2$. Verify that this is indeed a solution by substituting back into the
differential equation.
8.5.17 The modified Bessel function $I_0(x)$ satisfies the differential equation
\[ x^2\frac{d^2}{dx^2}I_0(x) + x\frac{d}{dx}I_0(x) - x^2I_0(x) = 0. \]
From Exercise 7.4.4 the leading term in an asymptotic expansion is found to be
\[ I_0(x) \sim \frac{e^x}{\sqrt{2\pi x}}. \]
Assume a series of the form
\[ I_0(x) \sim \frac{e^x}{\sqrt{2\pi x}}\{1 + b_1x^{-1} + b_2x^{-2} + \cdots\}. \]
Determine the coefficients $b_1$ and $b_2$.
ANS. $b_1 = \tfrac{1}{8}$,
$b_2 = \tfrac{9}{128}$.
8.5.18 The even power-series solution of Legendre's equation is given by Exercise 8.5.5.
Take $a_0 = 1$ and $n$ not an even integer, say, $n = 0.5$. Calculate the partial sums
of the series through $x^{200}, x^{400}, x^{600}, \ldots, x^{2000}$ for $x = 0.95(0.01)1.00$. Also,
write out the individual term corresponding to each of these powers.
Note. This calculation does not constitute proof of convergence at x = 0.99 or
divergence at x = 1.00, but perhaps you can see the difference in the behavior
of the sequence of partial sums for these two values of x.
8.5.19 (a) The odd power-series solution of Hermite's equation is given by Exercise
8.5.6. Take $a_0 = 1$. Evaluate this series for $a = 0$, $x = 1, 2, 3$. Cut off your
calculation after the last term calculated has dropped below the maximum
term by a factor of $10^6$ or more. Set an upper bound to the error made in
ignoring the remaining terms in the infinite series.
(b) As a check on the calculation of part (a), show that the Hermite series
$y_{\text{odd}}(a = 0)$ corresponds to $\int_0^x \exp(x^2)\,dx$.
(c) Calculate this integral for $x = 1, 2, 3$.
8.6 A SECOND SOLUTION
In Section 8.5 a solution of a second-order homogeneous differential equation
was developed by substituting in a power series. By Fuchs's theorem this is
possible, provided the power series is an expansion about an ordinary point
or a nonessential singularity.1 There is no guarantee that this approach will
yield the two independent solutions we expect from a linear second-order
differential equation. Indeed, the technique gave only one solution for Bessel's
equation (n an integer). In this section we develop two methods of obtaining a
second independent solution: an integral method and a power series containing
a logarithmic term. First, however, we consider the question of independence of
a set of functions.
$^1$ This is why the classification of singularities in Section 8.4 is of vital
importance.
Linear Independence of Solutions
Given a set of functions, $\varphi_\lambda$, the criterion for linear dependence is the existence
of a relation of the form
\[ \sum_\lambda k_\lambda\varphi_\lambda = 0, \tag{8.67} \]
in which not all the coefficients $k_\lambda$ are zero. On the other hand, if the only
solution of Eq. 8.67 is $k_\lambda = 0$ for all $\lambda$, the set of functions $\varphi_\lambda$ is said to be linearly
independent.
It may be helpful to think of linear dependence of vectors. Consider $\mathbf{A}$, $\mathbf{B}$, and
$\mathbf{C}$ in three-dimensional space with $\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} \neq 0$. Then no relation of the form
\[ a\mathbf{A} + b\mathbf{B} + c\mathbf{C} = 0 \tag{8.68} \]
exists (other than the trivial one with $a = b = c = 0$). $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ are linearly
independent. On the other hand, any fourth vector $\mathbf{D}$ may be expressed as a linear
combination of $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ (see Section 4.4). We can always write an equation of the form
\[ \mathbf{D} - a\mathbf{A} - b\mathbf{B} - c\mathbf{C} = 0, \tag{8.69} \]
and the four vectors are not linearly independent. The three noncoplanar vectors
$\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ span our real three-dimensional space.
If a set of vectors or functions are mutually orthogonal, then they are
automatically linearly independent. Orthogonality implies linear independence. This
can easily be demonstrated by taking inner products (scalar or dot product for
vectors, orthogonality integral of Section 9.2 for functions).
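Let us assume that the functions $\varphi_\lambda$ are differentiable as needed. Then,
differentiating Eq. 8.67 repeatedly, we generate a set of equations
\[ \sum_\lambda k_\lambda\varphi'_\lambda = 0, \tag{8.70} \]
\[ \sum_\lambda k_\lambda\varphi''_\lambda = 0, \quad \text{and so on.} \tag{8.71} \]
This gives us a set of homogeneous linear equations in which $k_\lambda$ are the unknown
quantities. By Section 4.1 there is a solution $k_\lambda \neq 0$ only if the determinant of
the coefficients of the $k_\lambda$'s vanishes. This means
\[ \begin{vmatrix} \varphi_1 & \varphi_2 & \cdots & \varphi_n \\ \varphi'_1 & \varphi'_2 & \cdots & \varphi'_n \\ \vdots & \vdots & & \vdots \\ \varphi_1^{(n-1)} & \varphi_2^{(n-1)} & \cdots & \varphi_n^{(n-1)} \end{vmatrix} = 0. \tag{8.72} \]
This determinant is called the Wronskian.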
1. If the Wronskian is not equal to zero, then Eq. 8.67
has no solution other than $k_\lambda = 0$. The set of functions
$\varphi_\lambda$ is therefore independent.
2. If the Wronskian vanishes at isolated values of the
argument, this does not necessarily prove linear
dependence (unless the set of functions has only two
functions). However, if the Wronskian is zero over
the entire range of the variable, the functions $\varphi_\lambda$
are linearly dependent over this range$^2$ (compare
Exercise 8.5.2 for the simple case of two functions).
EXAMPLE 8.6.1 Linear Independence
The solutions of the linear oscillator equation 8.40 are $\varphi_1 = \sin\omega x$, $\varphi_2 =
\cos\omega x$. The Wronskian becomes
\[ \begin{vmatrix} \sin\omega x & \cos\omega x \\ \omega\cos\omega x & -\omega\sin\omega x \end{vmatrix} = -\omega \neq 0. \]
These two solutions, $\varphi_1$ and $\varphi_2$, are therefore linearly independent. For just two
functions this means that one is not a multiple of the other, which is obviously
true in this case.
You know that
\[ \sin\omega x = \pm(1 - \cos^2\omega x)^{1/2}, \]
but this is not a linear relation of the form of Eq. 8.67.
EXAMPLE 8.6.2 Linear Dependence
For an illustration of linear dependence, consider the solutions of the
one-dimensional diffusion equation. We have $\varphi_1 = e^x$ and $\varphi_2 = e^{-x}$, and we add
$\varphi_3 = \cosh x$, also a solution. The Wronskian is
\[ \begin{vmatrix} e^x & e^{-x} & \cosh x \\ e^x & -e^{-x} & \sinh x \\ e^x & e^{-x} & \cosh x \end{vmatrix} = 0. \]
The determinant vanishes for all $x$ because the first and third rows are identical.
Hence $e^x$, $e^{-x}$, and $\cosh x$ are linearly dependent, and indeed, we have a relation
of the form of Eq. 8.67:
\[ e^x + e^{-x} - 2\cosh x = 0 \quad \text{with } k_\lambda \neq 0. \]
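The two examples can be checked numerically by building the Wronskian determinant from the functions and their derivatives at a few sample points. This is a small sketch only: the helper names `det` and `wronskian`, the value of `omega`, and the sample points are illustrative choices, and the derivatives are entered by hand rather than computed.

```python
import math

def det(m):
    """Determinant of a square matrix by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def wronskian(rows, x):
    """rows[i][j] is a callable giving the i-th derivative of the j-th function."""
    return det([[f(x) for f in row] for row in rows])

omega = 2.0
# sin(omega x), cos(omega x) and their first derivatives.
oscillator = [
    [lambda x: math.sin(omega * x), lambda x: math.cos(omega * x)],
    [lambda x: omega * math.cos(omega * x), lambda x: -omega * math.sin(omega * x)],
]
# e^x, e^-x, cosh x with first and second derivatives; rows 1 and 3 coincide.
exponentials = [
    [math.exp, lambda x: math.exp(-x), math.cosh],
    [math.exp, lambda x: -math.exp(-x), math.sinh],
    [math.exp, lambda x: math.exp(-x), math.cosh],
]

points = [-1.0, 0.3, 2.0]
w_oscillator = [wronskian(oscillator, x) for x in points]
w_exponentials = [wronskian(exponentials, x) for x in points]
print(w_oscillator)     # each value should be close to -omega
print(w_exponentials)   # each value should be close to zero
```

A numerical zero at a few points is of course not a proof of dependence; here it merely confirms the row argument given in the example.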
A Second Solution
Returning to our linear, second-order, homogeneous differential equation
of the general form
\[ y'' + P(x)y' + Q(x)y = 0, \tag{8.73} \]
let $y_1$ and $y_2$ be two independent solutions. Then the Wronskian, by definition,
is
\[ W = y_1y_2' - y_1'y_2. \tag{8.74} \]
By differentiating the Wronskian, we obtain
\[ \begin{aligned} W' &= y_1'y_2' + y_1y_2'' - y_1''y_2 - y_1'y_2' \\ &= y_1[-P(x)y_2' - Q(x)y_2] - y_2[-P(x)y_1' - Q(x)y_1] \\ &= -P(x)(y_1y_2' - y_1'y_2). \end{aligned} \tag{8.75} \]
The expression in parentheses is just $W$, the Wronskian, and we have
\[ W' = -P(x)W. \tag{8.76} \]
If $P(x) = 0$, that is,
\[ y'' + Q(x)y = 0, \tag{8.77} \]
the Wronskian
\[ W = y_1y_2' - y_1'y_2 = \text{constant}. \tag{8.78} \]
$^2$ Compare page 187 of H. Lass, Elements of Pure and Applied Mathematics.
New York: McGraw-Hill (1957) for proof of this assertion. It is assumed that
the functions have continuous derivatives and that at least one of the minors
of the bottom row of Eq. 8.72 (Laplace expansion) does not vanish in $[a, b]$,
the interval under consideration.
Since our original differential equation is homogeneous, we may multiply the
solutions yx and y2 by whatever constants we wish and arrange to have the
Wronskian equal to unity (or — 1). This case, P(x) = 0, appears more frequently
than might be expected. The reader will recall that $\nabla^2$ in cartesian coordinates
contains no first derivative. Similarly, the radial dependence of $\nabla^2(r\psi(r))$ in
spherical polar coordinates lacks a first derivative. Finally, every linear
second-order differential equation can be transformed into an equation of the form of
Eq. 8.77 (compare Exercise 8.6.11).
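Let us now assume that we have one solution of Eq. 8.73 by a series
substitution (or by guessing). We now proceed to develop a second, independent
solution. Rewriting Eq. 8.76 as
\[ \frac{dW}{W} = -P\,dx_1, \]
we integrate from $x_1 = a$ to $x_1 = x$ to obtain
\[ \ln\frac{W(x)}{W(a)} = -\int_a^x P(x_1)\,dx_1, \]
or$^3$
\[ W(x) = W(a)\exp\left[ -\int_a^x P(x_1)\,dx_1 \right]. \tag{8.79} \]
But
\[ W(x) = y_1y_2' - y_1'y_2 = y_1^2\frac{d}{dx}\left( \frac{y_2}{y_1} \right). \tag{8.80} \]
By combining Eqs. 8.79 and 8.80, we have
\[ \frac{d}{dx}\left( \frac{y_2}{y_1} \right) = W(a)\frac{\exp\left[ -\int_a^x P(x_1)\,dx_1 \right]}{y_1^2}. \tag{8.81} \]
Finally, by integrating Eq. 8.81 from $x_2 = b$ to $x_2 = x$ we get
\[ y_2(x) = y_1(x)W(a)\int_b^x \frac{\exp\left[ -\int_a^{x_2} P(x_1)\,dx_1 \right]}{[y_1(x_2)]^2}\,dx_2. \tag{8.82} \]
Here $a$ and $b$ are arbitrary constants and a term $y_1(x)y_2(b)/y_1(b)$ has been
dropped, for it leads to nothing new. Since $W(a)$, the Wronskian evaluated at
$x = a$, is a constant and our solutions for the homogeneous differential equation
always contain an unknown normalizing factor, we set $W(a) = 1$ and write
\[ y_2(x) = y_1(x)\int^x \frac{\exp\left[ -\int^{x_2} P(x_1)\,dx_1 \right]}{[y_1(x_2)]^2}\,dx_2. \tag{8.83} \]
Note that the lower limits $x_1 = a$ and $x_2 = b$ have been omitted. If they are
retained, they simply make a contribution equal to a constant times the known
first solution, $y_1(x)$, hence add nothing new.
$^3$ If $P(x_1)$ remains finite, $a \le x_1 \le x$, $W(x) \neq 0$ unless $W(a) = 0$. That is,
the Wronskian of our two solutions is either identically zero or never zero.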
If we have the important special case of $P(x) = 0$, Eq. 8.83 reduces to
\[ y_2(x) = y_1(x)\int^x \frac{dx_2}{[y_1(x_2)]^2}. \tag{8.84} \]
This means that by using either Eq. 8.83 or 8.84 we can take one known solution
and by integrating can generate a second independent solution of Eq. 8.73. This
technique is used in Section 12.10 to generate a second solution of Legendre's
differential equation.
EXAMPLE 8.6.3 A Second Solution for the Linear Oscillator Equation
From $d^2y/dx^2 + y = 0$ with $P(x) = 0$ let one solution be $y_1 = \sin x$. By
applying Eq. 8.84, we obtain
\[ y_2(x) = \sin x\int^x \frac{dx_2}{\sin^2x_2} = \sin x(-\cot x) = -\cos x, \]
which is clearly independent (not a linear multiple) of sin x.
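The same construction can be carried out numerically when the integral is not available in closed form. The sketch below evaluates $y_1(x)\int_b^x dx_2/[y_1(x_2)]^2$ for $y_1 = \sin x$ with a composite Simpson rule; the function names, the lower limit $b = 1$, and the sample points are illustrative assumptions, and the interval is kept inside $(0, \pi)$ so that $y_1$ does not vanish.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def second_solution(y1, x, b):
    """y1(x) times the integral of dx2 / y1(x2)^2 from b to x
    (the P(x) = 0 form, Eq. 8.84, with an explicit lower limit b)."""
    return y1(x) * simpson(lambda t: 1.0 / y1(t) ** 2, b, x)

b = 1.0
xs = [0.5, 1.5, 2.5]
# Keeping the lower limit b adds cot(b) * sin(x), a multiple of y1, to -cos(x).
residuals = [
    second_solution(math.sin, x, b)
    - (math.cos(b) / math.sin(b)) * math.sin(x)
    + math.cos(x)
    for x in xs
]
print(residuals)  # each entry should be close to zero
```

The extra term proportional to $\sin x$ is exactly the contribution of the retained lower limit, which, as noted above, adds nothing new.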
Series Form of the Second Solution
Further insight into the nature of the second solution of our differential
equation may be obtained by the following sequence of operations:
1. Express $P(x)$ and $Q(x)$ in Eq. 8.73 as
\[ P(x) = \sum_{i=-1}^{\infty} p_ix^i, \qquad Q(x) = \sum_{j=-2}^{\infty} q_jx^j. \tag{8.85} \]
The lower limits of the summations are selected to
create the strongest possible regular singularity (at
the origin). These conditions just satisfy Fuchs's
theorem and thus help us gain a better understanding
of Fuchs's theorem.
2. Develop the first few terms of a power-series solution,
as in Section 8.5.
3. Using this solution as yx, obtain a second series type
solution, y2, with Eq. 8.83, integrating term by term.
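Proceeding with step 1, we have
\[ y'' + (p_{-1}x^{-1} + p_0 + p_1x + \cdots)y' + (q_{-2}x^{-2} + q_{-1}x^{-1} + \cdots)y = 0, \tag{8.86} \]
in which point $x = 0$ is at worst a regular singular point. If $p_{-1} = q_{-1} = q_{-2} = 0$,
it reduces to an ordinary point. Substituting
\[ y = \sum_{\lambda=0}^{\infty} a_\lambda x^{k+\lambda} \]
(step 2), we obtain
\[ \sum_{\lambda=0}^{\infty} (k+\lambda)(k+\lambda-1)a_\lambda x^{k+\lambda-2} + \sum_{i=-1}^{\infty} p_ix^i\sum_{\lambda=0}^{\infty} (k+\lambda)a_\lambda x^{k+\lambda-1} + \sum_{j=-2}^{\infty} q_jx^j\sum_{\lambda=0}^{\infty} a_\lambda x^{k+\lambda} = 0. \tag{8.87} \]
Assuming that $p_{-1} \neq 0$, $q_{-2} \neq 0$, our indicial equation is
\[ k(k-1) + p_{-1}k + q_{-2} = 0, \]
which sets the net coefficient of $x^{k-2}$ equal to zero. This reduces to
\[ k^2 + (p_{-1} - 1)k + q_{-2} = 0. \tag{8.88} \]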
We denote the two roots of this indicial equation by $k = \alpha$ and $k = \alpha - n$, where
$n$ is zero or a positive integer. (If $n$ is not an integer, we expect two independent
series solutions by the methods of Section 8.5 and there is no problem.) Then
\[ (k - \alpha)(k - \alpha + n) = 0, \tag{8.89} \]
or
\[ k^2 + (n - 2\alpha)k + \alpha(\alpha - n) = 0, \]
and equating coefficients of $k$ in Eqs. 8.88 and 8.89, we have
\[ p_{-1} - 1 = n - 2\alpha. \tag{8.90} \]
The known series solution corresponding to the larger root $k = \alpha$ may be
written as
\[ y_1 = x^\alpha\sum_{\lambda=0}^{\infty} a_\lambda x^\lambda. \]
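Substituting this series solution into Eq. 8.83 (step 3), we are faced with
\[ y_2(x) = y_1(x)\int^x \frac{\exp\left( -\int_a^{x_2}\sum_{i=-1}^{\infty} p_ix_1^i\,dx_1 \right)}{x_2^{2\alpha}\left( \sum_{\lambda=0}^{\infty} a_\lambda x_2^\lambda \right)^2}\,dx_2, \tag{8.91} \]
where the solutions $y_1$ and $y_2$ have been normalized so that the Wronskian,
$W(a) = 1$. Tackling the exponential factor first, we have
\[ \int_a^{x_2}\sum_{i=-1}^{\infty} p_ix_1^i\,dx_1 = p_{-1}\ln x_2 + \sum_{k=0}^{\infty}\frac{p_k}{k+1}x_2^{k+1} + f(a), \tag{8.92} \]
where $f(a)$ collects the constant contributions of the lower limit. Hence
\[ \begin{aligned} \exp\left( -\int_a^{x_2}\sum_i p_ix_1^i\,dx_1 \right) &= \exp[-f(a)]\,x_2^{-p_{-1}}\exp\left( -\sum_{k=0}^{\infty}\frac{p_k}{k+1}x_2^{k+1} \right) \\ &= \exp[-f(a)]\,x_2^{-p_{-1}}\left[ 1 - \sum_{k=0}^{\infty}\frac{p_k}{k+1}x_2^{k+1} + \frac{1}{2!}\left( \sum_{k=0}^{\infty}\frac{p_k}{k+1}x_2^{k+1} \right)^2 - \cdots \right]. \end{aligned} \tag{8.93} \]
This final series expansion of the exponential is certainly convergent if the
original expansion of the coefficient $P(x)$ was convergent.
The denominator in Eq. 8.91 may be handled by writing
\[ \left[ x_2^{2\alpha}\left( \sum_{\lambda=0}^{\infty} a_\lambda x_2^\lambda \right)^2 \right]^{-1} = x_2^{-2\alpha}\left( \sum_{\lambda=0}^{\infty} a_\lambda x_2^\lambda \right)^{-2} = x_2^{-2\alpha}\sum_{\lambda=0}^{\infty} b_\lambda x_2^\lambda. \tag{8.94} \]
Neglecting constant factors that will be picked up anyway by the requirement
that $W(a) = 1$, we obtain
\[ y_2(x) = y_1(x)\int^x x_2^{-p_{-1}-2\alpha}\left( \sum_{\lambda=0}^{\infty} c_\lambda x_2^\lambda \right)dx_2. \tag{8.95} \]
By Eq. 8.90
\[ x_2^{-p_{-1}-2\alpha} = x_2^{-n-1}, \tag{8.96} \]
and we have assumed here that $n$ is an integer. Substituting this result into
Eq. 8.95, we obtain
\[ y_2(x) = y_1(x)\int^x (c_0x_2^{-n-1} + c_1x_2^{-n} + c_2x_2^{-n+1} + \cdots + c_nx_2^{-1} + \cdots)\,dx_2. \tag{8.97} \]
The integration indicated in Eq. 8.97 leads to a coefficient of $y_1(x)$ consisting of
two parts:
1. A power series starting with $x^{-n}$.
2. A logarithm term from the integration of $x^{-1}$ (when
$\lambda = n$). This term always appears when $n$ is an integer
unless $c_n$ fortuitously happens to vanish.$^4$
$^4$ For parity considerations, $\ln x$ is taken to be $\ln|x|$, even.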
EXAMPLE 8.6.4 A Second Solution of Bessel's Equation
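From Bessel's equation, Eq. 8.56 (divided by $x^2$ to agree with Eq. 8.73), we
have
\[ P(x) = x^{-1}, \qquad Q(x) = 1 \qquad \text{for the case } n = 0. \]
Hence $p_{-1} = 1$, $q_0 = 1$; all other $p_i$'s and $q_j$'s vanish. The Bessel indicial equation
is
\[ k^2 = 0 \]
(Eq. 8.59 with $n = 0$). Hence we verify Eqs. 8.88 to 8.90 with $n$ and $\alpha = 0$.
Our first solution is available from Eq. 8.64. Relabeling it to agree with
Chapter 11 (and using $a_0 = 1$), we obtain$^5$
\[ y_1(x) = J_0(x) = 1 - \frac{x^2}{4} + \frac{x^4}{64} - O(x^6). \tag{8.98a} \]
Now, substituting all this into Eq. 8.83, we have the specific case corresponding
to Eq. 8.91:
\[ y_2(x) = J_0(x)\int^x \frac{\exp\left[ -\int^{x_2} x_1^{-1}\,dx_1 \right]}{[1 - x_2^2/4 + x_2^4/64 - \cdots]^2}\,dx_2. \tag{8.98b} \]
From the numerator of the integrand
\[ \exp\left[ -\int^{x_2}\frac{dx_1}{x_1} \right] = \exp[-\ln x_2] = \frac{1}{x_2}. \]
This corresponds to the $x_2^{-p_{-1}}$ in Eq. 8.93. From the denominator of the
integrand, using a binomial expansion, we obtain
\[ \left[ 1 - \frac{x_2^2}{4} + \frac{x_2^4}{64} \right]^{-2} = 1 + \frac{x_2^2}{2} + \frac{5x_2^4}{32} + \cdots. \]
Corresponding to Eq. 8.95, we have
\[ y_2(x) = J_0(x)\int^x \frac{1}{x_2}\left[ 1 + \frac{x_2^2}{2} + \frac{5x_2^4}{32} + \cdots \right]dx_2 = J_0(x)\left\{ \ln x + \frac{x^2}{4} + \frac{5x^4}{128} + \cdots \right\}. \tag{8.98c} \]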
Let us check this result. From Eq. 11.63, which gives the standard form of the
second solution,
\[ N_0(x) = \frac{2}{\pi}[\ln x - \ln 2 + \gamma]J_0(x) + \frac{2}{\pi}\left\{ \frac{x^2}{4} - \frac{3x^4}{128} + \cdots \right\}. \]
$^5$ The capital $O$ (order of) as written here means terms proportional to $x^6$
and possibly higher powers of $x$.
Two points arise: (1) Since Bessel's equation is homogeneous, we may multiply
$y_2(x)$ by any constant. To match $N_0(x)$, we multiply our $y_2(x)$ by $2/\pi$. (2) To our
second solution $(2/\pi)y_2(x)$, we may add any constant multiple of the first
solution. Again, to match $N_0(x)$ we add
\[ \frac{2}{\pi}[-\ln 2 + \gamma]J_0(x), \]
where $\gamma$ is the usual Euler-Mascheroni constant (Section 5.2).$^6$ Our new,
modified second solution is
\[ y_2(x) = \frac{2}{\pi}[\ln x - \ln 2 + \gamma]J_0(x) + \frac{2}{\pi}J_0(x)\left\{ \frac{x^2}{4} + \frac{5x^4}{128} + \cdots \right\}. \]
Now the comparison with $N_0(x)$ becomes a simple multiplication of $J_0(x)$ from
Eq. 8.98a and the curly bracket of Eq. 8.98c. The multiplication checks through
terms of order $x^2$ and $x^4$, which is all we carried. Our second solution from Eqs.
8.83 and 8.91 agrees with the standard second solution, the Neumann function,
$N_0(x)$.
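The series manipulations in this example (inverting $J_0^2$, integrating term by term, and multiplying back by $J_0$) are easy to verify with exact truncated power series. The following is a sketch under stated assumptions: series are stored as coefficient lists truncated below $x^6$, the helper names are hypothetical, and the $\ln x$ term produced by the $1/x_2$ piece is set aside rather than represented.

```python
from fractions import Fraction

ORDER = 6  # keep powers x^0 .. x^5

def mul(p, q):
    """Product of two power series (coefficient lists), truncated at ORDER."""
    out = [Fraction(0)] * ORDER
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j < ORDER:
                out[i + j] += a * b
    return out

def inverse(p):
    """Reciprocal power series of p (requires p[0] != 0), truncated at ORDER."""
    inv = [Fraction(1) / p[0]] + [Fraction(0)] * (ORDER - 1)
    for k in range(1, ORDER):
        inv[k] = -sum(p[i] * inv[k - i] for i in range(1, k + 1)) / p[0]
    return inv

# J0(x) = 1 - x^2/4 + x^4/64 - ...
j0 = [Fraction(1), Fraction(0), Fraction(-1, 4),
      Fraction(0), Fraction(1, 64), Fraction(0)]

inv_sq = inverse(mul(j0, j0))  # expansion of 1 / J0^2

# Integrate (1/x) * inv_sq term by term; the k = 0 term gives ln x and is omitted.
bracket = [Fraction(0)] + [inv_sq[k] / k for k in range(1, ORDER)]

# J0 times the power-series part of the curly bracket.
power_part = mul(j0, bracket)

print(inv_sq)      # expect 1 + x^2/2 + 5x^4/32
print(bracket)     # expect x^2/4 + 5x^4/128
print(power_part)  # expect x^2/4 - 3x^4/128
```

The last list reproduces the power-series part of the Neumann function through order $x^4$, which is the multiplication check described in the text.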
From the preceding analysis, the second solution of Eq. 8.73, $y_2(x)$, may be
written as
\[ y_2(x) = y_1(x)\ln x + \sum_{j=-n}^{\infty} d_jx^{j+\alpha}, \tag{8.98f} \]
the first solution times $\ln x$ and another power series, this one starting with
$x^{\alpha-n}$, which means that we may look for a logarithmic term when the indicial
equation of Section 8.5 gives only one series solution. With the form of the
second solution specified by Eq. 8.98f we can substitute Eq. 8.98f into the
original differential equation and determine the coefficients $d_j$ exactly as in
Section 8.5. It may be worth noting that no series expansion of $\ln x$ is needed. In
the substitution $\ln x$ will drop out; its derivatives will survive.
The second solution will usually diverge at the origin because of the
logarithmic factor and the negative powers of $x$ in the series. For this reason $y_2(x)$ is often
referred to as the irregular solution. The first series solution, $y_1(x)$, which usually
converges at the origin, is called the regular solution. The question of behavior
at the origin is discussed in more detail in Chapters 11 and 12 in which we take
up Bessel functions, modified Bessel functions, and Legendre functions.
Summary
These two sections (together with the exercises) provide a complete solution
of our linear, homogeneous, second-order differential equation—assuming
that the point of expansion is no worse than a regular singularity. At least one
solution can always be obtained by series substitution (Section 8.5). A second,
linearly independent solution can be constructed by the Wronskian double
integral, Eq. 8.83. These two are all there are: no third, linearly independent solution
exists (compare Exercise 8.6.10).
$^6$ The Neumann function $N_0$ is defined as it is in order to achieve convenient
asymptotic properties, Section 11.6.
The nonhomogeneous, linear, second-order differential equation will have an
additional solution: the particular solution. This particular solution may be
obtained by the method of variation of parameters, Exercise 8.6.25, or by
techniques such as Green's functions, Section 8.7.
EXERCISES
8.6.1 You know that the three unit vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ are mutually perpendicular
(orthogonal). Show that $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ are linearly independent. Specifically, show
that no relation of the form of Eq. 8.67 exists for $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$.
8.6.2 The criterion for the linear independence of three vectors $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ is that the
equation
\[ a\mathbf{A} + b\mathbf{B} + c\mathbf{C} = 0 \]
(analogous to Eq. 8.67) has no solution other than the trivial $a = b = c = 0$. Using
components $\mathbf{A} = (A_1, A_2, A_3)$, and so on, set up the determinant criterion for the
existence or nonexistence of a nontrivial solution for the coefficients $a$, $b$, and $c$.
Show that your criterion is equivalent to the scalar product $\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} \neq 0$.
8.6.3 Using the Wronski determinant, show that the set of functions
$\left\{ 1, \dfrac{x^n}{n!}\ (n = 1, 2, \ldots, N) \right\}$ is linearly independent.
8.6.4 If the Wronskian of two functions $y_1$ and $y_2$ is identically zero, show by direct
integration that
\[ y_1 = cy_2; \]
that is, $y_1$ and $y_2$ are dependent. Assume the functions have continuous
derivatives and that at least one of the functions does not vanish in the interval under
consideration.
8.6.5 The Wronskian of two functions is found to be zero at x = x0. Show that this
Wronskian vanishes for all x and that the functions are linearly dependent.
8.6.6 The three functions $\sin x$, $e^x$, and $e^{-x}$ are linearly independent. No one function
can be written as a linear combination of the other two. Show that the Wronskian
of $\sin x$, $e^x$, and $e^{-x}$ vanishes but only at isolated points.
ANS. $W = 4\sin x$,
$W = 0$ for $x = \pm n\pi$, $n = 0, 1, 2, \ldots$.
8.6.7 Consider two functions $\varphi_1 = x$ and $\varphi_2 = |x| = x\,\mathrm{sgn}\,x$ (Fig. 8.4). The function
$\mathrm{sgn}\,x$ is just the sign of $x$. Since $\varphi_1' = 1$ and $\varphi_2' = \mathrm{sgn}\,x$, $W(\varphi_1, \varphi_2) = 0$ for any
interval including $[-1, +1]$. Does the vanishing of the Wronskian over $[-1, +1]$
prove that $\varphi_1$ and $\varphi_2$ are linearly dependent? Clearly, they are not. What is
wrong?
8.6.8 Explain that linear independence does not mean the absence of any dependence.
Illustrate your argument with coshx and ex.
FIG. 8.4 $x$ and $|x|$
8.6.9 Legendre's differential equation
\[ (1 - x^2)y'' - 2xy' + n(n+1)y = 0 \]
has a regular solution $P_n(x)$ and an irregular solution $Q_n(x)$. Show that the
Wronskian of $P_n$ and $Q_n$ is given by
\[ P_n(x)Q_n'(x) - P_n'(x)Q_n(x) = \frac{A_n}{1 - x^2}, \]
with $A_n$ independent of $x$.
8.6.10 Show, by means of the Wronskian, that a linear, second-order, homogeneous,
differential equation of the form
\[ y''(x) + P(x)y'(x) + Q(x)y(x) = 0 \]
cannot have three independent solutions. (Assume a third solution and show
that the Wronskian vanishes for all $x$.)
8.6.11 Transform our linear, second-order, differential equation
\[ y'' + P(x)y' + Q(x)y = 0 \]
by the substitution
\[ y = z\exp\left[-\frac{1}{2}\int^x P(t)\,dt\right] \]
and show that the resulting differential equation for z is
\[ z'' + q(x)z = 0, \]
where
\[ q(x) = Q(x) - \tfrac{1}{2}P'(x) - \tfrac{1}{4}P^2(x). \]
Note. This substitution can be derived by the technique of Exercise 8.6.24.
8.6.12 Use the result of Exercise 8.6.11 to show that the replacement of $\varphi(r)$ by $r\varphi(r)$
may be expected to eliminate the first derivative from the Laplacian in spherical
polar coordinates. See also Exercise 2.5.18 (b).
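8.6.13 By direct differentiation and substitution show that
\[ y_2(x) = y_1(x)\int^x \frac{\exp\left[-\int^s P(t)\,dt\right]}{[y_1(s)]^2}\,ds \]
satisfies
\[ y_2''(x) + P(x)y_2'(x) + Q(x)y_2(x) = 0. \]
Note. The Leibnitz formula for the derivative of an integral is
\[ \frac{d}{d\alpha}\int_{g(\alpha)}^{h(\alpha)} f(x,\alpha)\,dx = \int_{g(\alpha)}^{h(\alpha)} \frac{\partial f(x,\alpha)}{\partial\alpha}\,dx + f[h(\alpha),\alpha]\frac{dh(\alpha)}{d\alpha} - f[g(\alpha),\alpha]\frac{dg(\alpha)}{d\alpha}. \]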
8.6.14 In the equation
\[ y_2(x) = y_1(x)\int^x \frac{\exp\left[-\int^s P(t)\,dt\right]}{[y_1(s)]^2}\,ds \]
$y_1(x)$ satisfies
\[ y_1'' + P(x)y_1' + Q(x)y_1 = 0. \]
The function $y_2(x)$ is a linearly independent second solution of the same equation.
Show that the inclusion of lower limits on the two integrals leads to nothing new;
that is, it throws in only overall factors and/or a multiple of the known solution
$y_1(x)$.
8.6.15 Given that one solution of
\[ R'' + \frac{1}{r}R' - \frac{m^2}{r^2}R = 0 \]
is $R = r^m$, show that Eq. 8.83 predicts a second solution, $R = r^{-m}$.
8.6.16 Using $y_1(x) = \sum_{n=0}^{\infty}(-1)^n x^{2n+1}/(2n + 1)!$ as a solution of the linear oscillator
equation, follow the analysis culminating in Eq. 8.98f and show that $c_1 = 0$ so
that the second solution does not, in this case, contain a logarithmic term.
8.6.17 Show that when n is not an integer the second solution of Bessel's equation,
obtained from Eq. 8.83, does not contain a logarithmic term.
8.6.18 (a) One solution of Hermite's differential equation
\[ y'' - 2xy' + 2\alpha y = 0 \]
for $\alpha = 0$ is $y_1(x) = 1$. Find a second solution $y_2(x)$, using Eq. 8.83. Show
that your second solution is equivalent to $y_{\mathrm{odd}}$ (Exercise 8.5.6).
(b) Find a second solution for $\alpha = 1$, where $y_1(x) = x$, using Eq. 8.83. Show that
your second solution is equivalent to $y_{\mathrm{even}}$ (Exercise 8.5.6).
8.6.19 One solution of Laguerre's differential equation
\[ xy'' + (1 - x)y' + ny = 0 \]
for n = 0 is $y_1(x) = 1$. Using Eq. 8.83, develop a second, linearly independent
solution. Exhibit the logarithmic term explicitly.
8.6.20 For Laguerre's equation with n = 0
(a) Write y2(x) as a logarithm plus a power series.
(b) Verify that the integral form of y2(x), previously given, is a solution of
Laguerre's equation (n = 0) by direct differentiation of the integral and
substitution into the differential equation.
(c) Verify that the series form of y2(x), part (a), is a solution by differentiating
the series and substituting back into Laguerre's equation.
8.6.21 One solution of the Chebyshev equation
\[ (1 - x^2)y'' - xy' + n^2 y = 0 \]
for n = 0 is $y_1 = 1$.
(a) Using Eq. 8.83, develop a second, linearly independent solution.
(b) Find a second solution by direct integration of the Chebyshev equation.
Hint. Let $v = y'$ and integrate. Compare your result with the second solution given
in Section 13.3.
ANS. (a) $y_2 = \sin^{-1} x$.
(b) The second solution, $V_n(x)$, is not defined for n = 0.
8.6.22 One solution of the Chebyshev equation
\[ (1 - x^2)y'' - xy' + n^2 y = 0 \]
for n = 1 is $y_1(x) = x$. Set up the Wronskian double integral solution and derive
a second solution, $y_2(x)$.
ANS. $y_2 = -(1 - x^2)^{1/2}$.
8.6.23 The radial Schrödinger wave equation has the form
\[ \left[-\frac{\hbar^2}{2m}\frac{d^2}{dr^2} + l(l + 1)\frac{\hbar^2}{2mr^2} + V(r)\right]y(r) = E\,y(r). \]
The potential energy V(r) may be expanded about the origin as
\[ V(r) = \frac{b_{-1}}{r} + b_0 + b_1 r + \cdots. \]
(a) Show that there is one (regular) solution starting with $r^{l+1}$.
(b) From Eq. 8.84 show that the irregular solution diverges at the origin as $r^{-l}$.
8.6.24 Show that if a second solution, $y_2$, is assumed to have the form $y_2(x) = y_1(x)f(x)$,
substitution back into the original equation
\[ y_2'' + P(x)y_2' + Q(x)y_2 = 0 \]
leads to
\[ f(x) = \int^x \frac{\exp\left[-\int^s P(t)\,dt\right]}{y_1^2(s)}\,ds, \]
in agreement with Eq. 8.83.
8.6.25 If our linear, second-order differential equation is nonhomogeneous, that is, of
the form of Eq. 8.38, the most general solution is
\[ y(x) = y_1(x) + y_2(x) + y_p(x). \]
($y_1$ and $y_2$ are solutions of the homogeneous equation.)
Show that
\[ y_p(x) = y_2(x)\int^x \frac{y_1(s)F(s)\,ds}{W\{y_1(s), y_2(s)\}} - y_1(x)\int^x \frac{y_2(s)F(s)\,ds}{W\{y_1(s), y_2(s)\}}, \]
with $W\{y_1(s), y_2(s)\}$ the Wronskian of $y_1(s)$ and $y_2(s)$.
Hint. As in Exercise 8.6.24, let $y_p(x) = y_1(x)v(x)$ and develop a first-order
differential equation for $v'(x)$.
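8.6.26 (a) Show that
\[ y'' + \frac{1 - \alpha^2}{4x^2}\,y = 0 \]
has two solutions:
\[ y_1(x) = a_0 x^{(1+\alpha)/2}, \qquad y_2(x) = a_0 x^{(1-\alpha)/2}. \]
(b) For $\alpha = 0$ the two linearly independent solutions of part (a) reduce to
$y_{10} = a_0 x^{1/2}$. Using Eq. 8.84 derive a second solution
\[ y_{20}(x) = a_0 x^{1/2}\ln x. \]
Verify that $y_{20}$ is indeed a solution.
(c) Show that the second solution from part (b) may be obtained as a limiting
case from the two solutions of part (a):
\[ y_{20}(x) = \lim_{\alpha\to 0}\frac{y_1 - y_2}{\alpha}. \]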
8.7 NONHOMOGENEOUS EQUATION—GREEN'S
FUNCTION
The series substitution of Section 8.5 and the Wronskian double integral of
Section 8.6 provide the most general solution of the homogeneous, linear, second-
order differential equation. The specific solution, $y_p$, linearly dependent on the
source term (F(x) of Eq. 8.78) may be cranked out by the variation of parameters
method, Exercise 8.6.25. In this section we turn to a different method of solution:
Green's functions.
For a brief introduction to the Green's function method, as applied to the
solution of a nonhomogeneous partial differential equation, it is helpful to use the
electrostatic analog. In the presence of charges the electrostatic potential $\psi$
satisfies Poisson's nonhomogeneous equation (compare Section 1.14)
\[ \nabla^2\psi = -\frac{\rho}{\varepsilon_0} \quad \text{(mks units)} \tag{8.99} \]
and Laplace's homogeneous equation,
\[ \nabla^2\psi = 0, \tag{8.100} \]
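in the absence of electric charge ($\rho = 0$). If the charges are point charges $q_i$, we
know that the solution is
\[ \psi = \frac{1}{4\pi\varepsilon_0}\sum_i \frac{q_i}{r_i}, \tag{8.101} \]
a superposition of single-point charge solutions obtained from Coulomb's law
for the force between two point charges $q_1$ and $q_2$,
\[ \mathbf{F} = \frac{q_1 q_2\,\hat{\mathbf{r}}}{4\pi\varepsilon_0 r^2}. \tag{8.102} \]
By replacement of the discrete point charges with a smeared out distributed
charge, charge density $\rho$, Eq. 8.101 becomes
\[ \psi(\mathbf{r} = 0) = \frac{1}{4\pi\varepsilon_0}\int \frac{\rho(\mathbf{r})}{r}\,d\tau \tag{8.103} \]
or, for the potential at $\mathbf{r} = \mathbf{r}_1$ away from the origin and the charge at $\mathbf{r} = \mathbf{r}_2$,
\[ \psi(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\int \frac{\rho(\mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|}\,d\tau_2. \tag{8.104} \]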
Dirac Delta Function
A formal derivation and generalization of this result is facilitated by using
$\delta(x)$, the Dirac delta function, as in Section 1.15. For the one-dimensional case
the Dirac delta function is often defined by the following properties:
\[ \delta(x) = 0, \quad x \neq 0, \tag{8.105} \]
\[ \int_{-\infty}^{\infty}\delta(x)\,dx = 1, \tag{8.106} \]
and
\[ \int_{-\infty}^{\infty} f(x)\,\delta(x)\,dx = f(0). \tag{8.107} \]
Here it is assumed that f(x) is continuous at x = 0.
From these defining equations $\delta(x)$ must be an infinitely high, infinitely thin
spike, as in the description of an impulsive force (Section 15.9) or charge
density for a point charge.1 The problem is that no such function exists in the
usual sense of function. It is possible to approximate the delta function by a
variety of functions, Eqs. 8.108 to 8.111 and Figs. 8.5 to 8.8:
\[ \delta_n(x) = \begin{cases} 0, & x < -\dfrac{1}{2n}, \\ n, & -\dfrac{1}{2n} < x < \dfrac{1}{2n}, \\ 0, & x > \dfrac{1}{2n}, \end{cases} \tag{8.108} \]
\[ \delta_n(x) = \frac{n}{\sqrt{\pi}}\exp(-n^2x^2), \tag{8.109} \]
\[ \delta_n(x) = \frac{n}{\pi}\cdot\frac{1}{1 + n^2x^2}, \tag{8.110} \]
\[ \delta_n(x) = \frac{\sin nx}{\pi x} = \frac{1}{2\pi}\int_{-n}^{n} e^{ixt}\,dt. \tag{8.111} \]
1 The delta function is frequently invoked to describe very short range forces
such as nuclear forces. It also appears in the normalization of continuum wave
functions of quantum mechanics. Compare Eq. 15.21d for plane wave
eigenfunctions.
FIG. 8.5 δ-sequence function
FIG. 8.6 δ-sequence function
These approximations have varying degrees of usefulness. Equation 8.108 is
useful in providing a simple derivation of the integral property, Eq. 8.107.
Equation 8.109 is convenient to differentiate. Its derivatives lead to the Hermite
polynomials, Eq. 13.7. Equation 8.111 is particularly useful in Fourier analysis
and in its applications to quantum mechanics. In the theory of Fourier series,
Eq. 8.111 often appears (modified) as the Dirichlet kernel:
\[ \delta_n(x) = \frac{1}{2\pi}\,\frac{\sin\left[\left(n + \tfrac{1}{2}\right)x\right]}{\sin\left(\tfrac{1}{2}x\right)}. \tag{8.112} \]
In using these approximations in Eq. 8.107 and later, we assume that f(x) is well
behaved: it offers no problems at large x.
For most physical purposes such approximations are quite adequate. From a
mathematical point of view the situation is still unsatisfactory: The limits
\[ \lim_{n\to\infty}\delta_n(x) \]
do not exist.
FIG. 8.7 δ-sequence function
FIG. 8.8 δ-sequence function
A way out of this difficulty is provided by the theory of distributions.
Recognizing that Eq. 8.107 is the fundamental property, we focus our attention on it
rather than on $\delta(x)$ itself. Equations 8.108 to 8.111 with n = 1, 2, 3, ... may be
interpreted as sequences of normalized functions:
\[ \int_{-\infty}^{\infty}\delta_n(x)\,dx = 1. \tag{8.113} \]
The sequence of integrals has the limit
\[ \lim_{n\to\infty}\int_{-\infty}^{\infty}\delta_n(x)f(x)\,dx = f(0). \tag{8.114} \]
Note carefully that Eq. 8.114 is the limit of a sequence of integrals. Again, the
limit of $\delta_n(x)$, $n \to \infty$, does not exist. (The limits for all four forms of $\delta_n(x)$ diverge
at x = 0.)
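The limit of integrals in Eq. 8.114 can be checked numerically. The Python sketch below is an illustration added here, not part of the original text: it integrates the Gaussian sequence of Eq. 8.109 against a test function with a plain midpoint rule. The test function f(x) = cos x, the integration window, and the step count are all assumed choices. For this particular f the integral is known in closed form, $\exp(-1/4n^2)$, which tends to f(0) = 1 as n grows.

```python
import math

def delta_gauss(n, x):
    """Gaussian delta-sequence member of Eq. 8.109: (n / sqrt(pi)) exp(-n^2 x^2)."""
    return n / math.sqrt(math.pi) * math.exp(-(n * x) ** 2)

def midpoint_integral(g, a, b, steps):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def sifted(f, n, half_width=10.0, steps=100_000):
    """Approximate the integral of delta_n(x) f(x); the window is an assumed cutoff."""
    return midpoint_integral(lambda x: delta_gauss(n, x) * f(x),
                             -half_width, half_width, steps)

f = math.cos  # hypothetical test function with f(0) = 1
values = {n: sifted(f, n) for n in (1, 4, 16, 64)}
for n, v in values.items():
    # The integrals approach f(0) even though delta_n(0) itself grows without bound.
    print(n, v, delta_gauss(n, 0.0))
```

The printed integrals settle toward f(0) while the peak height $\delta_n(0)$ keeps growing, which is the distinction the text draws between the limit of the integrals and the (nonexistent) limit of the functions.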
We may treat $\delta(x)$ consistently in the form
\[ \int_{-\infty}^{\infty}\delta(x)f(x)\,dx = \lim_{n\to\infty}\int_{-\infty}^{\infty}\delta_n(x)f(x)\,dx. \tag{8.115} \]
$\delta(x)$ is labeled a distribution (not a function) defined by the sequences $\delta_n(x)$ as
indicated in Eq. 8.115. We might emphasize that the integral on the left-hand side
of Eq. 8.115 is not a Riemann integral.2 It is a limit.
This distribution $\delta(x)$ is only one of an infinity of possible distributions, but
it is the one we are interested in because of Eq. 8.107.
We use $\delta(x)$ frequently and call it the Dirac delta function3 for historical
reasons. Remember that it is not really a function. It is essentially a shorthand
notation, defined implicitly as the limit of integrals of a sequence, $\delta_n(x)$,
according to Eq. 8.115. It should be understood that our Dirac delta function has
significance only as part of an integrand and never as an end result. In this spirit
the Dirac delta function is often regarded as an operator, a linear operator:
$\delta(x - x_0)$ operates on f(x) and yields $f(x_0)$.
\[ \mathcal{L}(x_0)f(x) \equiv \int_{-\infty}^{\infty}\delta(x - x_0)f(x)\,dx = f(x_0). \tag{8.116} \]
It may also be classified as a linear mapping or simply as a generalized function.
Shifting our singularity to the point $x = x'$, we write the Dirac delta function as
$\delta(x - x')$. Equation 8.107 becomes
\[ \int_{-\infty}^{\infty} f(x)\,\delta(x - x')\,dx = f(x'). \tag{8.117} \]
As a description of a singularity at $x = x'$, the Dirac delta function may be
written as $\delta(x - x')$ or as $\delta(x' - x)$. Going to three dimensions and using
spherical polar coordinates, we obtain
\[ \int_0^{2\pi}\!\int_0^{\pi}\!\int_0^{\infty}\delta(\mathbf{r})\,r^2\,dr\,\sin\theta\,d\theta\,d\varphi = \iiint_{-\infty}^{\infty}\delta(x)\delta(y)\delta(z)\,dx\,dy\,dz = 1. \tag{8.118} \]
This corresponds to a singularity (or source) at the origin. Again, if our source is
at $\mathbf{r} = \mathbf{r}_1$, Eq. 8.118 becomes
\[ \iiint \delta(\mathbf{r}_2 - \mathbf{r}_1)\,r_2^2\,dr_2\,\sin\theta_2\,d\theta_2\,d\varphi_2 = 1. \tag{8.119} \]
As already mentioned,
\[ \delta(\mathbf{r}_2 - \mathbf{r}_1) = \delta(\mathbf{r}_1 - \mathbf{r}_2). \tag{8.120} \]
2 It can be treated as a Stieltjes integral if desired. $\delta(x)\,dx$ is replaced by $du(x)$,
where u(x) is the Heaviside step function (compare Exercise 8.7.13).
3 Dirac introduced the delta function to quantum mechanics. Actually the
delta function can be traced back to Kirchhoff, 1882. For further details see
M. Jammer, The Conceptual Development of Quantum Mechanics. McGraw-
Hill, New York (1966).
Poisson's Equation—Green's Function Solution
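Returning to our electrostatic problem, we use $\psi$ as the potential
corresponding to the given distribution of charge and therefore satisfying Poisson's equation
\[ \nabla^2\psi = -\frac{\rho}{\varepsilon_0}, \tag{8.121} \]
whereas a function G, which we label Green's function, is required to satisfy
Poisson's equation with a point source at the point defined by $\mathbf{r}_2$:
\[ \nabla^2 G = -\delta(\mathbf{r}_1 - \mathbf{r}_2). \tag{8.122} \]
Physically, then, G is the potential at $\mathbf{r}_1$ corresponding to a unit source ($\varepsilon_0$) at $\mathbf{r}_2$.
By Green's theorem (Section 1.11)
\[ \int(\psi\nabla^2 G - G\nabla^2\psi)\,d\tau_2 = \int(\psi\nabla G - G\nabla\psi)\cdot d\boldsymbol{\sigma}. \tag{8.123} \]
Assuming that the integrand falls off faster than $r^{-2}$, we may simplify our
problem by taking the volume so large that the surface integral vanishes, leaving
\[ \int\psi\nabla^2 G\,d\tau_2 = \int G\nabla^2\psi\,d\tau_2 \tag{8.124} \]
or by substituting in Eqs. 8.121 and 8.122, we have
\[ -\int\psi(\mathbf{r}_2)\,\delta(\mathbf{r}_1 - \mathbf{r}_2)\,d\tau_2 = -\int\frac{G(\mathbf{r}_1,\mathbf{r}_2)\,\rho(\mathbf{r}_2)}{\varepsilon_0}\,d\tau_2. \tag{8.125} \]
Integration by employing the defining property of the Dirac delta function
(Eq. 8.107) produces
\[ \psi(\mathbf{r}_1) = \frac{1}{\varepsilon_0}\int G(\mathbf{r}_1,\mathbf{r}_2)\,\rho(\mathbf{r}_2)\,d\tau_2. \tag{8.126} \]
Note that we have used Eq. 8.122 to eliminate $\nabla^2 G$ but that the function G
itself is still unknown. In Section 1.14, Gauss's law, we found that
\[ \int\nabla^2\left(\frac{1}{r}\right)d\tau = \begin{cases} 0, \\ -4\pi, \end{cases} \tag{8.127} \]
0 if the volume did not include the origin and $-4\pi$ if the origin were included.
This result from Section 1.14 may be rewritten as
\[ \nabla^2\left(\frac{1}{4\pi r}\right) = -\delta(\mathbf{r}), \quad\text{or}\quad \nabla^2\left(\frac{1}{4\pi r_{12}}\right) = -\delta(\mathbf{r}_1 - \mathbf{r}_2), \tag{8.128} \]
corresponding to a shift of the electrostatic charge from the origin to the position
$\mathbf{r} = \mathbf{r}_2$. Here $r_{12} = |\mathbf{r}_1 - \mathbf{r}_2|$, and the Dirac delta function $\delta(\mathbf{r}_1 - \mathbf{r}_2)$ vanishes
unless $\mathbf{r}_1 = \mathbf{r}_2$. Therefore in a comparison of Eqs. 8.122 and 8.128 the function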
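G (Green's function) is given by
\[ G(\mathbf{r}_1,\mathbf{r}_2) = \frac{1}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|}. \tag{8.129} \]
The solution of our differential equation (Poisson's equation) is
\[ \psi(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\int\frac{\rho(\mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|}\,d\tau_2, \tag{8.130} \]
in complete agreement with Eq. 8.104. Actually $\psi(\mathbf{r}_1)$, Eq. 8.130, is the particular
solution of Poisson's equation. We may add solutions of Laplace's equation
(compare Eq. 8.39). Such solutions could describe an external field.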
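As a rough numerical illustration of Eq. 8.130 (added here, not taken from the original text), the integral can be replaced by a sum over small cubic cells, each contributing $G(\mathbf{r}_1,\mathbf{r}_2)\,\rho\,d\tau_2/\varepsilon_0$. The sketch assumes a uniformly charged sphere of unit radius, units with $\varepsilon_0 = 1$, and a cell size of 0.1; all of these are illustrative choices. Outside the sphere the result should be close to the potential of a point charge carrying the same total charge.

```python
import math

EPS0 = 1.0  # assumed units in which epsilon_0 = 1

def green(r1, r2):
    """Free-space Green's function of Eq. 8.129: 1 / (4 pi |r1 - r2|)."""
    return 1.0 / (4.0 * math.pi * math.dist(r1, r2))

def sphere_cells(radius, spacing):
    """Centers of cubic cells (side = spacing) whose centers lie inside the sphere."""
    m = int(radius / spacing) + 1
    origin = (0.0, 0.0, 0.0)
    cells = []
    for i in range(-m, m):
        for j in range(-m, m):
            for k in range(-m, m):
                p = ((i + 0.5) * spacing, (j + 0.5) * spacing, (k + 0.5) * spacing)
                if math.dist(p, origin) <= radius:
                    cells.append(p)
    return cells

def potential(r1, cells, rho, spacing):
    """Discretized Eq. 8.130: sum of G(r1, r2) * rho * d(tau_2) / eps0 over the cells."""
    dtau = spacing ** 3
    return sum(green(r1, r2) * rho * dtau for r2 in cells) / EPS0

radius, spacing, rho = 1.0, 0.1, 1.0          # hypothetical charge distribution
cells = sphere_cells(radius, spacing)
total_charge = rho * spacing ** 3 * len(cells)
field_point = (3.0, 0.0, 0.0)                  # field point outside the sphere
psi = potential(field_point, cells, rho, spacing)
psi_point_charge = total_charge / (4.0 * math.pi * EPS0 * 3.0)
print(psi, psi_point_charge)
```

The weighting role of G is visible in the sum: each cell's charge is scaled by the inverse of its distance from the field point before being added.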
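In Sections 16.5 and 16.6 these results will be generalized to the second-order
linear but nonhomogeneous, differential equation
\[ \mathcal{L}y(\mathbf{r}_1) = -f(\mathbf{r}_1), \tag{8.131} \]
where $\mathcal{L}$ is a linear differential operator. The Green's function is taken to be a solution of
\[ \mathcal{L}G(\mathbf{r}_1,\mathbf{r}_2) = -\delta(\mathbf{r}_1 - \mathbf{r}_2) \tag{8.132} \]
(analogous to Eq. 8.122). Then the particular solution $y(\mathbf{r}_1)$ becomes
\[ y(\mathbf{r}_1) = \int G(\mathbf{r}_1,\mathbf{r}_2)\,f(\mathbf{r}_2)\,d\tau_2. \tag{8.133} \]
(There may also be an integral over a bounding surface depending on the
conditions specified.)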
In summary, Green's function, often written $G(\mathbf{r}_1,\mathbf{r}_2)$ as a reminder of the
name, is a solution of Eq. 8.122. It enters in an integral solution of our differential
equation, as in Eq. 8.104. For the simple, but important, electrostatic case we
obtain Green's function $G(\mathbf{r}_1,\mathbf{r}_2)$ by Gauss's law, comparing Eqs. 8.122 and 8.128.
Finally, from the final solution (Eq. 8.130) it is possible to develop a physical
interpretation of Green's function. It occurs as a weighting function or influence
function that enhances or reduces the effect of the charge element $\rho(\mathbf{r}_2)\,d\tau_2$
according to its distance from the field point $\mathbf{r}_1$. Green's function, $G(\mathbf{r}_1,\mathbf{r}_2)$, gives
the effect of a unit point source at $\mathbf{r}_2$ in producing a potential at $\mathbf{r}_1$. This is how
it was introduced in Eq. 8.122; this is how it appears in Eq. 8.130.
Symmetry of Green's Function
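An important property of Green's function is the symmetry of its two
variables, that is,
\[ G(\mathbf{r}_1,\mathbf{r}_2) = G(\mathbf{r}_2,\mathbf{r}_1). \tag{8.134} \]
Although this is obvious in the electrostatic case just considered, it can be
proved under much more general conditions. In place of Eq. 8.122, let us
require that $G(\mathbf{r},\mathbf{r}_1)$ satisfy4
\[ \nabla\cdot[p(\mathbf{r})\nabla G(\mathbf{r},\mathbf{r}_1)] + \lambda q(\mathbf{r})G(\mathbf{r},\mathbf{r}_1) = -\delta(\mathbf{r} - \mathbf{r}_1), \tag{8.135} \]
corresponding to a mathematical point source at $\mathbf{r} = \mathbf{r}_1$. Here the functions
$p(\mathbf{r})$ and $q(\mathbf{r})$ are well behaved but otherwise arbitrary functions of $\mathbf{r}$. Green's
function, $G(\mathbf{r},\mathbf{r}_2)$, satisfies the same equation but the subscript 1 is replaced by
subscript 2.
\[ \nabla\cdot[p(\mathbf{r})\nabla G(\mathbf{r},\mathbf{r}_2)] + \lambda q(\mathbf{r})G(\mathbf{r},\mathbf{r}_2) = -\delta(\mathbf{r} - \mathbf{r}_2). \tag{8.136} \]
Then $G(\mathbf{r},\mathbf{r}_2)$ is a sort of potential at $\mathbf{r}$, created by a unit point source at $\mathbf{r}_2$. We
multiply the equation for $G(\mathbf{r},\mathbf{r}_1)$ by $G(\mathbf{r},\mathbf{r}_2)$ and the equation for $G(\mathbf{r},\mathbf{r}_2)$ by
$G(\mathbf{r},\mathbf{r}_1)$ and then subtract the two:
\[ G(\mathbf{r},\mathbf{r}_2)\nabla\cdot[p(\mathbf{r})\nabla G(\mathbf{r},\mathbf{r}_1)] - G(\mathbf{r},\mathbf{r}_1)\nabla\cdot[p(\mathbf{r})\nabla G(\mathbf{r},\mathbf{r}_2)] = -G(\mathbf{r},\mathbf{r}_2)\delta(\mathbf{r} - \mathbf{r}_1) + G(\mathbf{r},\mathbf{r}_1)\delta(\mathbf{r} - \mathbf{r}_2). \tag{8.137} \]
The first term in Eq. 8.137,
\[ G(\mathbf{r},\mathbf{r}_2)\nabla\cdot[p(\mathbf{r})\nabla G(\mathbf{r},\mathbf{r}_1)] \]
may be replaced by
\[ \nabla\cdot[G(\mathbf{r},\mathbf{r}_2)\,p(\mathbf{r})\nabla G(\mathbf{r},\mathbf{r}_1)] - \nabla G(\mathbf{r},\mathbf{r}_2)\cdot p(\mathbf{r})\nabla G(\mathbf{r},\mathbf{r}_1). \]
4 Equation 8.135 is a three-dimensional version of the self-adjoint eigenvalue
equation, Eq. 9.4.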
A similar transformation is carried out on the second term. Then integrating
over whatever volume is involved and using Green's theorem, we obtain a
surface integral:
\[ \int_S [G(\mathbf{r},\mathbf{r}_2)\,p(\mathbf{r})\nabla G(\mathbf{r},\mathbf{r}_1) - G(\mathbf{r},\mathbf{r}_1)\,p(\mathbf{r})\nabla G(\mathbf{r},\mathbf{r}_2)]\cdot d\boldsymbol{\sigma} = -G(\mathbf{r}_1,\mathbf{r}_2) + G(\mathbf{r}_2,\mathbf{r}_1). \tag{8.138} \]
The terms on the right-hand side appear when we use the Dirac delta functions
and carry out the volume integration. Under the requirement that Green's
functions, $G(\mathbf{r},\mathbf{r}_1)$ and $G(\mathbf{r},\mathbf{r}_2)$, have the same values over the surface S and that
their normal derivatives have the same values over the surface S, or that the
Green's functions vanish (Dirichlet boundary conditions, Section 9.1)5 over the
surface S, the surface integral vanishes and
\[ G(\mathbf{r}_1,\mathbf{r}_2) = G(\mathbf{r}_2,\mathbf{r}_1), \tag{8.139} \]
which shows that Green's function is symmetric. If the eigenfunctions are
complex, boundary conditions corresponding to Eqs. 9.20 to 9.22 are appropriate.
Equation 8.139 becomes
\[ G(\mathbf{r}_1,\mathbf{r}_2) = G^*(\mathbf{r}_2,\mathbf{r}_1). \tag{8.140} \]
Note that this symmetry property holds for Green's function in every equation
in the form of Eq. 8.135. In Chapter 9 we shall call equations in this form
self-adjoint. The symmetry is the basis of various reciprocity theorems; the effect of
a charge at $\mathbf{r}_2$ on the potential at $\mathbf{r}_1$ is the same as the effect of a charge at $\mathbf{r}_1$
on the potential at $\mathbf{r}_2$.
5 Any attempt to demand that the normal derivatives vanish at the surface
(Neumann's conditions, Section 9.1) leads to trouble with Gauss's Law. It is
like demanding that $\int\mathbf{E}\cdot d\boldsymbol{\sigma} = 0$ when you know perfectly well that there is
some electric charge inside the surface.
This use of Green's functions is a powerful technique for solving many of the
more difficult problems of mathematical physics. We return to it when we take
up integral equations in Chapter 16.
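The symmetry $G(\mathbf{r}_1,\mathbf{r}_2) = G(\mathbf{r}_2,\mathbf{r}_1)$ can be seen in a small numerical experiment. The sketch below is an added illustration under stated assumptions, not the book's own procedure: it treats a one-dimensional analog of Eq. 8.135 on 0 ≤ x ≤ 1 with G = 0 at both ends (Dirichlet conditions), replaces the derivatives by central differences, and solves the resulting tridiagonal system. The functions p(x) = 1 + x and q(x) = 1 + x², the value λ = −2, and the grid size are hypothetical choices.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def green_column(j, p, q, lam, n_cells):
    """G(x_i, x_j) at interior nodes x_i = i / n_cells for a unit source at node j.

    Discretizes d/dx[p dG/dx] + lam * q * G = -delta(x - x_j) with G(0) = G(1) = 0;
    the delta function is represented by 1/h at the source node.
    """
    h = 1.0 / n_cells
    lower, diag, upper, rhs = [], [], [], []
    for i in range(1, n_cells):
        p_minus, p_plus = p((i - 0.5) * h), p((i + 0.5) * h)
        lower.append(-p_minus / h ** 2)
        upper.append(-p_plus / h ** 2)
        diag.append((p_minus + p_plus) / h ** 2 - lam * q(i * h))
        rhs.append(1.0 / h if i == j else 0.0)
    return solve_tridiagonal(lower, diag, upper, rhs)

n_cells, a, b = 100, 25, 70                    # nodes x_a = 0.25 and x_b = 0.70
p = lambda x: 1.0 + x                          # hypothetical p(x)
q = lambda x: 1.0 + x * x                      # hypothetical q(x)
lam = -2.0
g_ab = green_column(b, p, q, lam, n_cells)[a - 1]   # source at x_b, observed at x_a
g_ba = green_column(a, p, q, lam, n_cells)[b - 1]   # source at x_a, observed at x_b

# Sanity check with p = 1, lam = 0, where G(x, x') = x_< (1 - x_>).
g_simple = green_column(50, lambda x: 1.0, lambda x: 0.0, 0.0, n_cells)[24]
print(g_ab, g_ba, g_simple)
```

In the discrete problem the symmetry is simply the statement that the inverse of a symmetric matrix is symmetric, which mirrors the self-adjoint form of the operator.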
EXERCISES
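8.7.1 Let
\[ \delta_n(x) = \begin{cases} 0, & x < -\dfrac{1}{2n}, \\ n, & -\dfrac{1}{2n} < x < \dfrac{1}{2n}, \\ 0, & \dfrac{1}{2n} < x. \end{cases} \]
Show that
\[ \lim_{n\to\infty}\int_{-\infty}^{\infty} f(x)\,\delta_n(x)\,dx = f(0), \]
assuming that f(x) is continuous at x = 0.
8.7.2 Verify that the sequence $\delta_n(x)$, based on the function
\[ \delta_n(x) = \begin{cases} 0, & x < 0, \\ n e^{-nx}, & x > 0, \end{cases} \]
is a delta sequence (satisfying Eq. 8.114). Note that the singularity is at +0, the
positive side of the origin.
Hint. Replace the upper limit ($\infty$) by c/n, where c is large but finite and use the
mean value theorem of integral calculus.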
8.7.3 For
\[ \delta_n(x) = \frac{n}{\pi}\cdot\frac{1}{1 + n^2x^2} \]
(Eq. 8.110), show that
\[ \int_{-\infty}^{\infty}\delta_n(x)\,dx = 1. \]
8.7.4 Demonstrate that $\delta_n = \sin nx/\pi x$ is a delta distribution by showing that
\[ \lim_{n\to\infty}\int_{-\infty}^{\infty} f(x)\,\frac{\sin nx}{\pi x}\,dx = f(0). \]
Assume that f(x) is continuous at x = 0 and vanishes as $x \to \pm\infty$.
Hint. Replace x by y/n and take $\lim n \to \infty$ before integrating. The needed integral
is evaluated in Sections 7.2 and 15.7.
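8.7.5 Fejér's method of summing series is associated with the function
\[ \delta_n(t) = \frac{1}{2\pi n}\left[\frac{\sin(nt/2)}{\sin(t/2)}\right]^2. \]
Show that $\delta_n(t)$ is a delta distribution in the sense that
\[ \lim_{n\to\infty}\frac{1}{2\pi n}\int f(t)\left[\frac{\sin(nt/2)}{\sin(t/2)}\right]^2 dt = f(0). \]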
8.7.6 Prove that
\[ \delta[a(x - x_1)] = \frac{1}{a}\,\delta(x - x_1). \]
Note. If $\delta[a(x - x_1)]$ is considered even relative to $x_1$, the relation holds for
negative a and 1/a may be replaced by 1/|a|.
8.7.7 Show that
\[ \delta[(x - x_1)(x - x_2)] = [\delta(x - x_1) + \delta(x - x_2)]/|x_1 - x_2|. \]
Hint. Try using Exercise 8.7.6.
8.7.8 Using the Gauss error curve delta sequence ($\delta_n$), show that
\[ x\frac{d}{dx}\delta(x) = -\delta(x), \]
treating $\delta(x)$ and its derivative as in Eq. 8.115.
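8.7.9 Show that
\[ \int_{-\infty}^{\infty}\delta'(x)f(x)\,dx = -f'(0). \]
Here we assume that $f'(x)$ is continuous at x = 0.
8.7.10 Prove that
\[ \delta(f(x)) = \left|\frac{df(x)}{dx}\right|^{-1}_{x = x_0}\delta(x - x_0), \]
where $x_0$ is chosen so that $f(x_0) = 0$.
Hint. Note that $\delta(f)\,df = \delta(x)\,dx$.
8.7.11 Show that in spherical polar coordinates $(r, \cos\theta, \varphi)$ the delta function $\delta(\mathbf{r}_1 - \mathbf{r}_2)$
becomes
\[ \frac{1}{r_1^2}\,\delta(r_1 - r_2)\,\delta(\cos\theta_1 - \cos\theta_2)\,\delta(\varphi_1 - \varphi_2). \]
Generalize this to the curvilinear coordinates $(q_1, q_2, q_3)$ of Section 2.1 with scale
factors $h_1$, $h_2$, and $h_3$.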
8.7.12 A rigorous development of Fourier transforms (Sneddon, Fourier Transforms6)
includes as a theorem the relations
\[ \lim_{a\to\infty}\frac{2}{\pi}\int_{x_1}^{x_2} f(u + x)\,\frac{\sin ax}{x}\,dx = \begin{cases} f(u + 0) + f(u - 0), & x_1 < 0 < x_2, \\ f(u + 0), & x_1 = 0 < x_2, \\ f(u - 0), & x_1 < 0 = x_2, \\ 0, & x_1 < x_2 < 0 \text{ or } 0 < x_1 < x_2. \end{cases} \]
Verify these results using the Dirac delta function.
6 Sneddon, I. N., Fourier Transforms. New York: McGraw-Hill (1951).
FIG. 8.9 $\tfrac{1}{2}[1 + \tanh nx]$ and the Heaviside unit step function
8.7.13 (a) If we define a sequence $\delta_n(x) = n/(2\cosh^2 nx)$, show that
\[ \int_{-\infty}^{\infty}\delta_n(x)\,dx = 1, \quad\text{independent of } n. \]
(b) Continuing this analysis, show that*
\[ \int_{-\infty}^{x}\delta_n(x)\,dx = \tfrac{1}{2}[1 + \tanh nx] \equiv u_n(x) \]
and
\[ \lim_{n\to\infty}u_n(x) = \begin{cases} 0, & x < 0, \\ 1, & x > 0. \end{cases} \]
This is the Heaviside unit step function.
8.7.14 Show that the unit step function u(x) may be represented by
\[ u(x) = \frac{1}{2} + \frac{1}{2\pi i}\,P\!\int_{-\infty}^{\infty} e^{ixt}\,\frac{dt}{t}, \]
where P means Cauchy principal value (Section 7.2).
8.7.15 As a variation of Eq. 8.111, take
\[ \delta_n(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ixt - |t|/n}\,dt. \]
Show that this reduces to $(n/\pi)\cdot 1/(1 + n^2x^2)$, Eq. 8.110, and that
\[ \int_{-\infty}^{\infty}\delta_n(x)\,dx = 1. \]
Note. In terms of integral transforms, the initial equation here may be interpreted
as either a Fourier exponential transform of $e^{-|t|/n}$ or a Laplace transform of $e^{ixt}$.
8.7.16 Show that
\[ G(\mathbf{r}_1,\mathbf{r}_2) = \frac{\exp(ik|\mathbf{r}_1 - \mathbf{r}_2|)}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|} \]
is a Green's function satisfying the differential equation
\[ (\nabla^2 + k^2)G(\mathbf{r}_1,\mathbf{r}_2) = -\delta(\mathbf{r}_1 - \mathbf{r}_2). \]
* Many other symbols are used for this function. This is the AMS-55 notation:
u for unit.
This involves two parts:
(a) Show that $G(\mathbf{r}_1,\mathbf{r}_2)$ satisfies the homogeneous differential equation away
from $\mathbf{r}_1 = \mathbf{r}_2$.
(b) Show that
r2eV.
8.8 NUMERICAL SOLUTIONS
The analytic solutions and approximate solutions to differential equations in
this chapter and in succeeding chapters may suffice to solve the problem at hand
—particularly if there is some symmetry present. The power-series solutions
show how the solution behaves at small values of x. The asymptotic solutions
(compare Sections 11.6 and 12.10) show how the solution behaves at large values
of x. These limiting cases and also the possible resemblance of our differential
equation to the standard forms with known solutions (Chapters 11 to 13) are
invaluable in helping us gain an understanding of the general behavior of our
solution.
However, the usual situation is that we have a different equation, perhaps a
different potential in the Schrodinger wave equation, and we want a reasonably
exact solution. So we turn to numerical techniques.
First-Order Differential Equations
The differential equation involves a continuity of points. The independent
variable x is continuous. The (unknown) dependent variable y(x) is assumed
continuous. The concept of differentiation demands continuity. Our numerical
processes replace these continua by discrete sets. We consider x at
x0, x0 + h, x0 + 2h, x0 + 3h, and so on,
where h is some small interval. The smaller h is, the better the approximation
is—in principle. But if h is made too small, the demands on machine time will be
excessive, and accuracy may actually decline because of accumulated round-off
errors. We refer to the successive discrete values of x as $x_n$, $x_{n+1}$, and so on, and
the corresponding values of y(x) as $y(x_n) = y_n$. If $x_0$ and $y_0$ are given, the problem
is to find $y_1$, then to find $y_2$, and so on.
Taylor Series Solution
Consider the ordinary (possibly nonlinear) first-order differential equation
\[ \frac{d}{dx}y(x) = f(x, y) \tag{8.141} \]
with the initial condition $y(x_0) = y_0$. In principle, a step-by-step solution of the
first-order equation, Eq. 8.141, may be developed to any degree of accuracy by
a Taylor expansion
\[ y(x_0 + h) = y(x_0) + hy'(x_0) + \frac{h^2}{2!}y''(x_0) + \cdots + \frac{h^n}{n!}y^{(n)}(x_0) + \cdots \tag{8.142} \]
(assuming the derivatives exist and the series is convergent). The initial value
$y(x_0)$ is known and $y'(x_0)$ is given as $f(x_0, y_0)$. In principle, the higher derivatives
may be obtained by differentiating $y'(x) = f(x, y)$. In practice, this differentiation
may be tedious. Now, however, this differentiation can be done by computer,
using languages such as FORMAC. For equations of the form encountered in
this chapter a large computer has no trouble generating and evaluating ten or
more derivatives.
The Taylor series solution is a form of analytic continuation, Section 6.5.
If the right-hand side of Eq. 8.142 is truncated after two terms, we have
\[ y_1 = y(x_0) + hy'(x_0) = y_0 + hf(x_0, y_0), \tag{8.143} \]
neglecting the terms of order $h^2$. Equation 8.143 is often called the Euler solution.
Clearly, it is subject to serious error with the neglect of terms of order $h^2$.
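To make the size of the Euler error concrete, the following Python sketch (an added illustration, not from the original text) applies Eq. 8.143 repeatedly to the falling-body equation of Exercise 8.8.2, dv/dt = g − av, whose exact solution (g/a)(1 − e^{−at}) is available for comparison. The step sizes are assumed values chosen only to show the trend.

```python
import math

def euler(f, x0, y0, h, steps):
    """Advance y' = f(x, y) from (x0, y0) by repeated use of Eq. 8.143."""
    x, y = x0, y0
    for _ in range(steps):
        y += h * f(x, y)   # y_{n+1} = y_n + h f(x_n, y_n)
        x += h
    return y

g, a = 9.80, 0.2                       # constants borrowed from Exercise 8.8.2
f = lambda t, v: g - a * v             # dv/dt = g - a v
exact = (g / a) * (1.0 - math.exp(-a * 10.0))   # analytic v(10)

err_coarse = abs(euler(f, 0.0, 0.0, 0.1, 100) - exact)    # h = 0.1
err_fine = abs(euler(f, 0.0, 0.0, 0.05, 200) - exact)     # h = 0.05
print(err_coarse, err_fine, err_coarse / err_fine)
```

Halving h roughly halves the accumulated error at t = 10, the behavior expected when terms of order $h^2$ are dropped at every step.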
Runge-Kutta Method
The Runge-Kutta method is a refinement of this, with an error of order $h^5$.
The relevant formulas are
\[ y_{n+1} = y_n + \tfrac{1}{6}[k_0 + 2k_1 + 2k_2 + k_3], \tag{8.144} \]
where
\[ \begin{aligned} k_0 &= hf(x_n, y_n), \\ k_1 &= hf(x_n + \tfrac{1}{2}h,\; y_n + \tfrac{1}{2}k_0), \\ k_2 &= hf(x_n + \tfrac{1}{2}h,\; y_n + \tfrac{1}{2}k_1), \\ k_3 &= hf(x_n + h,\; y_n + k_2). \end{aligned} \tag{8.145} \]
A derivation of these equations appears in Ralston and Wilf1 (Chapter 9 by
M. J. Romanelli).
Equations 8.144 and 8.145 define what might be called the classic fourth-order
Runge-Kutta method (accurate through terms of order $h^4$). This is the form
followed in IBM's Scientific Subroutine Package (SSP). Many other Runge-
Kutta methods exist. Lapidus and Seinfeld (see references) analyze and compare
other possibilities and recommend a fifth-order form due to Butcher as slightly
superior to the classic method.
The form of Eqs. 8.144 and 8.145 is assumed and the parameters adjusted
to fit a Taylor expansion through $h^4$. From this Taylor expansion viewpoint
the Runge-Kutta method is also an example of analytic continuation.
For the special case in which dy/dx is a function of x alone [f(x, y) in Eq.
8.141 → f(x)], the last term in Eq. 8.144 reduces to a Simpson rule numerical
integration from $x_n$ to $x_{n+1}$.
The Runge-Kutta method is stable, meaning that small errors do not get
amplified. It is self-starting, meaning that we just take the $x_0$ and $y_0$ and away
we go. But it has disadvantages. Four separate calculations of f(x, y) are required
at each step. The errors, although of order $h^5$ per step, are not known. One
checks the numerical solution by cutting h in half and repeating the calculation.
If the second result agrees with the first, then h was small enough.
1 A. Ralston and H. S. Wilf, eds., Mathematical Methods for Digital
Computers. New York: Wiley (1960).
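A minimal Python sketch of the classic fourth-order step, Eqs. 8.144 and 8.145, is given below as an added illustration. It is applied to the falling-body equation of Exercise 8.8.2, and the step-halving check described above is carried out explicitly; the function names and the choice h = 0.1 are assumptions of this sketch.

```python
def rk4_step(f, x, y, h):
    """One classic fourth-order Runge-Kutta step (Eqs. 8.144 and 8.145)."""
    k0 = h * f(x, y)
    k1 = h * f(x + 0.5 * h, y + 0.5 * k0)
    k2 = h * f(x + 0.5 * h, y + 0.5 * k1)
    k3 = h * f(x + h, y + k2)
    return y + (k0 + 2.0 * k1 + 2.0 * k2 + k3) / 6.0

def rk4(f, x0, y0, h, steps):
    """Self-starting integration: only x0 and y0 are needed."""
    x, y = x0, y0
    for n in range(steps):
        y = rk4_step(f, x, y, h)
        x = x0 + (n + 1) * h   # recompute x to avoid accumulating round-off
    return y

g, a = 9.80, 0.2                          # constants of Exercise 8.8.2
f = lambda t, v: g - a * v

v10 = rk4(f, 0.0, 0.0, 0.1, 100)          # h = 0.1 out to t = 10
v10_half = rk4(f, 0.0, 0.0, 0.05, 200)    # same interval with h cut in half
print(v10, v10_half, abs(v10 - v10_half))
```

Agreement between the two runs is the practical sign that h was small enough; the result can also be compared with the check value v(10) = 42.369 quoted in Exercise 8.8.2.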
Finally, the Runge-Kutta method can be extended to a set of coupled
first-order equations:
\[ \frac{du}{dx} = f_1(x, u, v), \qquad \frac{dv}{dx} = f_2(x, u, v), \quad\text{and so on,} \tag{8.146} \]
with as many dependent variables as desired. Again, Eq. 8.146 may be nonlinear,
an advantage of the numerical solution.
Predictor-Corrector Methods
As an alternate attack on Eq. 8.141, we might estimate or predict a tentative
value of $y_{n+1}$ by
\[ \bar{y}_{n+1} = y_{n-1} + 2hy_n'. \tag{8.147} \]
This is not quite the same as Eq. 8.143. Rather, it may be interpreted as
\[ y_n' \approx \frac{\bar{y}_{n+1} - y_{n-1}}{2h}, \tag{8.148} \]
the derivative as a tangent being replaced by a chord. Next we calculate
\[ \bar{y}_{n+1}' = f(x_{n+1}, \bar{y}_{n+1}). \tag{8.149} \]
Then to correct for the crudeness of Eq. 8.147, we take
\[ y_{n+1} = y_n + \frac{h}{2}(\bar{y}_{n+1}' + y_n'). \tag{8.150} \]
Here the finite difference ratio $\Delta y/h$ is approximated by the average of the two
derivatives. This technique, a prediction followed by a correction (and iteration
until agreement is reached), is the heart of the predictor-corrector method.
It should be emphasized that the preceding set of equations is intended only to
illustrate the predictor-corrector method. The accuracy of this set (to order $h^3$)
is usually inadequate.
The iteration (substituting $y_{n+1}$ from Eq. 8.150 back into Eq. 8.149 and
recycling until $y_{n+1}$ settles down to some limit) is time-consuming in a computing
machine operation. Consequently, the iteration is usually replaced by an
intermediate step (the modifier) between Eqs. 8.147 and 8.149.
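The illustrative predictor-corrector cycle (predict, evaluate the derivative, correct, and iterate until the value settles) can be sketched in Python as follows. This is an added example, not Hamming's method and not the modified scheme mentioned above. Because the predictor needs two earlier values, the second starting value is taken here from the known analytic solution of the test equation, which is an assumption made only to keep the sketch short; the tolerance and iteration cap are likewise arbitrary.

```python
import math

def predictor_corrector(f, x0, y0, y1, h, steps, tol=1e-12, max_iter=50):
    """Integrate y' = f(x, y) with the illustrative predictor-corrector pair.

    y0 = y(x0) and y1 = y(x0 + h) are the starting values; the function
    returns the approximation to y(x0 + steps * h).
    """
    y_prev, y_curr = y0, y1
    for n in range(1, steps):
        x_n = x0 + n * h
        dy_n = f(x_n, y_curr)
        y_next = y_prev + 2.0 * h * dy_n                     # predictor
        for _ in range(max_iter):
            dy_next = f(x_n + h, y_next)                     # derivative at tentative value
            corrected = y_curr + 0.5 * h * (dy_next + dy_n)  # corrector
            settled = abs(corrected - y_next) <= tol
            y_next = corrected
            if settled:
                break
        y_prev, y_curr = y_curr, y_next
    return y_curr

g, a = 9.80, 0.2                              # constants of Exercise 8.8.2
f = lambda t, v: g - a * v
exact = lambda t: (g / a) * (1.0 - math.exp(-a * t))

h = 0.1
v10 = predictor_corrector(f, 0.0, 0.0, exact(h), h, 100)   # starting value from exact(h)
error = abs(v10 - exact(10.0))
print(v10, error)
```

For this mild test equation the corrector iteration converges in a few passes; each pass costs one more evaluation of f(x, y), which is the expense the modifier step is meant to avoid.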
This modified predictor-corrector method has the major advantage over
the Runge-Kutta method of requiring only two computations of/(x, y) per
step, instead of four. Unfortunately, the method as originally developed was
unstable—small errors (round-off and truncation) tended to propagate and
become amplified.
This very serious problem of instability has been overcome in a version of
the predictor-corrector method devised by Hamming. The formulas (which
are moderately involved), a partial derivation, and detailed instructions for
starting the solution are all given by Ralston (Chapter 8 of Ralston and Wilf).
Hamming's method is accurate to order $h^4$. It is stable for all reasonable values
of h and provides an estimate of the error. Unlike the Runge-Kutta method,
it is not self-starting. For example, Eq. 8.147 requires both $y_{n-1}$ and $y_n$. Starting
values ($y_0$, $y_1$, $y_2$, $y_3$) for the Hamming predictor-corrector method may be
computed by series solution (power series for small x, asymptotic series for
large x) or by the Runge-Kutta method.
The Hamming predictor-corrector method may be extended to cover a set
of coupled first-order differential equations, that is, Eq. 8.146.
Second-Order Differential Equations
Any second-order differential equation
\[ y''(x) + P(x)y'(x) + Q(x)y(x) = F(x), \tag{8.151} \]
may be split into two first-order differential equations by writing
y'(x) = z(x), (8.152)
and then
\[ z'(x) + P(x)z(x) + Q(x)y(x) = F(x). \tag{8.153} \]
These coupled first-order differential equations may be solved by either the
Runge-Kutta or Hamming predictor-corrector techniques previously
described.
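The splitting into two first-order equations can be sketched in Python as follows (an added illustration). The pair (y, z) is advanced with the classic Runge-Kutta formulas applied componentwise. As a test case the sketch uses Hermite's equation with α = 0 from Exercise 8.8.7, that is, P(x) = −2x, Q(x) = 0, F(x) = 0 with y(0) = 0, y'(0) = 1; the helper names and the step size h = 0.01 are assumptions.

```python
def rk4_system_step(f, x, u, h):
    """One classic Runge-Kutta step for a list u of dependent variables."""
    def shifted(base, k, c):
        return [b + c * ki for b, ki in zip(base, k)]
    k0 = [h * d for d in f(x, u)]
    k1 = [h * d for d in f(x + 0.5 * h, shifted(u, k0, 0.5))]
    k2 = [h * d for d in f(x + 0.5 * h, shifted(u, k1, 0.5))]
    k3 = [h * d for d in f(x + h, shifted(u, k2, 1.0))]
    return [ui + (a + 2.0 * b + 2.0 * c + d) / 6.0
            for ui, a, b, c, d in zip(u, k0, k1, k2, k3)]

def solve_second_order(P, Q, F, x0, y0, z0, h, steps):
    """Integrate y'' + P y' + Q y = F as the pair y' = z, z' = F - P z - Q y."""
    def rhs(x, u):
        y, z = u
        return [z, F(x) - P(x) * z - Q(x) * y]
    u = [y0, z0]
    for n in range(steps):
        u = rk4_system_step(rhs, x0 + n * h, u, h)
    return u  # [y, y'] at x0 + steps * h

# Hermite's equation with alpha = 0 (Exercise 8.8.7): y'' - 2 x y' = 0.
P = lambda x: -2.0 * x
Q = lambda x: 0.0
F = lambda x: 0.0
y_at_1, dy_at_1 = solve_second_order(P, Q, F, 0.0, 0.0, 1.0, 0.01, 100)
y_at_2, dy_at_2 = solve_second_order(P, Q, F, 0.0, 0.0, 1.0, 0.01, 200)
print(y_at_1, y_at_2)
```

The computed values can be compared with the answers y(1) = 1.463 and y(2) = 16.45 listed for Exercise 8.8.7.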
As a final note: a thoughtless "turning the crank" application of these powerful
numerical techniques is an invitation to disaster. The solution of a new and
different differential equation will usually involve a mixture of analysis and
numerical calculation. There is little point in trying to force a Runge-Kutta
solution through a singular point where the solution is going to blow up.
EXERCISES
8.8.1 The Runge-Kutta method, Eq. 8.144, is applied to a first-order differential equation
dy/dx = f(x). Note that this function f(x) is independent of y. Show that in this
special case the Runge-Kutta method reduces to Simpson's rule for numerical
quadrature, Appendix A2.
8.8.2 (a) A body falling through a resisting medium is described by
\[ \frac{dv}{dt} = g - av \]
(for a retarding force proportional to the velocity). Take the constants to
be g = 9.80 (meters/sec$^2$) and a = 0.2 (sec$^{-1}$). The initial conditions are t = 0,
v = 0. Integrate this equation out to t = 20.0 in steps of 0.1 sec. Tabulate the
value of the velocity for each whole second, v(1.0), v(2.0), and so on. If a
plotting routine is available, plot v(t) versus t.
(b) Calculate the ratio of v(20.0) to the terminal velocity $v(\infty)$.
Check value. v(10) = 42.369 meters/sec.
ANS. (b) 0.9817.
8.8.3 The differential equation for the population of a radioactive daughter element is
\[ \frac{dN_2(t)}{dt} = \lambda_1\exp(-\lambda_1 t) - \lambda_2 N_2, \]
$\lambda_1\exp(-\lambda_1 t)$ being the rate of production resulting from the decay of the parent
element. $\lambda_1 = 0.10$ sec$^{-1}$, $\lambda_2 = 0.08$ sec$^{-1}$. Integrate this differential equation from
t = 0 out to t = 40 seconds for the initial condition $N_2(0) = 0$. Tabulate and plot
$N_2(t)$ vs t.
8.8.4 The time-reversed asteroid depletion equation is
\[ \frac{dN}{dt} = kN^2. \]
Solve this equation by using a Runge-Kutta or equivalent subroutine. The initial
conditions are
$t_0 = 0$ (years)
$N_0 = 100$ (asteroids)
k = 0.25 x 101 (years) (asteroid).
Carry out your solution as far as you can. (There will be trouble as you approach
$t = 5 \times 10^9$ years.) Tabulate N(t) versus t, with $\Delta t = 5 \times 10^7$ years.
Note. Exercise 8.2.3 (with k replaced by −k) gives the analytic solution.
8.8.5 Integrate Legendre's differential equation, Exercise 8.5.5, from x = 0 to x = 1
with the initial conditions y(0) = 1, y'(0) = 0 (even solution). Tabulate y(x) and
dy/dx at intervals of 0.05. Take n = 2.
8.8.6 The Lane-Emden equation of astrophysics is
\[ \frac{d^2y}{dx^2} + \frac{2}{x}\frac{dy}{dx} + y^s = 0. \]
Take y(0) = 1, y'(0) = 0, and investigate the behavior of y(x) for s = 0, 1, 2, 3, 4, 5,
and 6. In particular, locate the first zero of y(x).
Hint. From a power-series solution $y''(0) = -\tfrac{1}{3}$.
Note. For s = 0, y(x) is a parabola, for s = 1, a spherical Bessel function, $j_0(x)$.
As $s \to 5$, the first zero moves out to $\infty$, and for $s \geq 5$, y(x) never crosses the positive
x-axis.
ANS. For $y(x_s) = 0$, $x_0 = 2.45\ (\sqrt{6})$,
$x_1 = 3.14\ (\pi)$, $x_2 = 4.35$,
$x_3 = 6.90$.
8.8.7 As a check on Exercise 8.6.18(a), integrate Hermite's equation
\[ \frac{d^2y}{dx^2} - 2x\frac{dy}{dx} = 0 \]
from x = 0 out to x = 3. The initial conditions are y(0) = 0, y'(0) = 1. Tabulate
y(1), y(2), and y(3).
ANS. y(1) = 1.463
y(2) = 16.45
y(3) = 1445.
REFERENCES
Bateman, H., Partial Differential Equations of Mathematical Physics. New York: Dover
(1944; first edition, 1932).
A wealth of applications of various partial differential equations in classical physics.
Excellent examples of the use of different coordinate systems: ellipsoidal,
paraboloidal, toroidal coordinates, and so on.
Davis, P. J. and P. Rabinowitz, Numerical Integration. Waltham, Mass.: Blaisdell (1967).
This book covers a great deal of material in a relatively easy-to-read form. Appendix 1
(On the Practical Evaluation of Integrals by M. Abramowitz) is excellent as an overall
view.
Hamming, R. W., Numerical Methods for Scientists and Engineers, 2nd ed. New York:
McGraw-Hill (1973).
This well-written text discusses a wide variety of numerical methods from zeros of
functions to the fast Fourier transform. All topics are selected and developed with a
modern high-speed computer in mind.
Ince, E. L., Ordinary Differential Equations. New York: Dover (1926).
The classic work in the theory of ordinary differential equations.
Lapidus, L., and J. H. Seinfeld, Numerical Solutions of Ordinary Differential Equations.
New York: Academic Press (1971).
A detailed and comprehensive discussion of numerical techniques with emphasis on the
Runge-Kutta and predictor-corrector methods. Recent work on the improvement of
characteristics such as stability is clearly presented.
Miller, R. K., and A. N. Michel, Ordinary Differential Equations. New York: Academic
Press (1982).
Murphy, G. M., Ordinary Differential Equations and Their Solutions. Princeton, N.J.:
Van Nostrand (1960).
A thorough, relatively readable treatment of ordinary differential equations, both
linear and nonlinear.
Ralston, A., and H. Wilf, Eds., Mathematical Methods for Digital Computers. New York:
Wiley (1960).
Ritger, P. D., and N. J. Rose, Differential Equations with Applications. New York:
McGraw-Hill (1968).
Stroud, A. H., Numerical Quadrature and Solution of Ordinary Differential Equations,
Applied Mathematics Series, Vol. 10. New York: Springer-Verlag (1974).
A balanced, readable, and very helpful discussion of various methods of integrating
differential equations. Stroud is familiar with recent work in this field and provides
numerous current references.
9 STURM-LIOUVILLE THEORY - ORTHOGONAL FUNCTIONS
In the preceding chapter we developed two linearly independent solutions
of the second-order linear homogeneous differential equation and proved that
no third, linearly independent solution existed. In this chapter the emphasis
shifts from solving the differential equation to developing and understanding
general properties of the solutions. In Section 9.1 the concepts of self-adjoint
operator, eigenfunction, eigenvalue, and Hermitian operator are presented. The
concept of adjoint operator, given first in terms of differential equations, is then
redefined in accordance with usage in quantum mechanics. The vital properties
of reality of eigenvalues and orthogonality of eigenfunctions are derived in
Section 9.2. In Section 9.3 we discuss the Gram-Schmidt procedure for
systematically constructing sets of orthogonal functions. Finally, the general property
of the completeness of a set of eigenfunctions is explored in Section 9.4.
9.1 SELF-ADJOINT DIFFERENTIAL EQUATIONS
In Chapter 8 we studied, classified, and solved linear, second-order,
differential equations corresponding to linear, second-order, differential operators
of the general form
\[ \mathcal{L}u(x) = p_0(x)\frac{d^2}{dx^2}u(x) + p_1(x)\frac{d}{dx}u(x) + p_2(x)u(x). \qquad (9.1) \]
The functions p_0(x), p_1(x), and p_2(x) are not to be confused with the constants
p_i of Section 8.6. Reference to Eq. 8.73 shows that P(x) = p_1(x)/p_0(x) and
Q(x) = p_2(x)/p_0(x).
These coefficients, p_0(x), p_1(x), and p_2(x), are real functions of x, and over the
region of interest, a < x < b, the first 2 - i derivatives of p_i(x) are continuous.
Further, p_0(x) does not vanish for a < x < b. Now, the zeros of p_0(x) are singular
points (Section 8.4), and the preceding statement simply means that we choose
our interval [a, b] so that there are no singular points in the interior of the
interval. There may be and often are singular points on the boundaries.
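It is convenient in the mathematical theory of differential equations to define
an adjoint¹ operator ℒ̄ by
\[ \bar{\mathcal{L}}u = \frac{d^2}{dx^2}\left[p_0 u\right] - \frac{d}{dx}\left[p_1 u\right] + p_2 u
 = p_0\frac{d^2u}{dx^2} + (2p_0' - p_1)\frac{du}{dx} + (p_0'' - p_1' + p_2)u. \qquad (9.2) \]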
In a comparison of Eqs. 9.1 and 9.2 the necessary and sufficient condition that
ℒ̄ = ℒ is that
\[ p_0'(x) = \frac{dp_0(x)}{dx} = p_1(x). \qquad (9.3) \]
When this condition is satisfied,
\[ \bar{\mathcal{L}}u = \mathcal{L}u = \frac{d}{dx}\left[p(x)\frac{du(x)}{dx}\right] + q(x)u(x) \qquad (9.4) \]
and the operator ℒ is said to be self-adjoint. Here, for the self-adjoint case,
p_0(x) is replaced by p(x) and p_2(x) by q(x) to avoid unnecessary subscripts.
The importance of the form of Eq. 9.4 is that we will be able to carry out two
integrations by parts (Eq. 9.21 and following).²
In a survey of the differential equations introduced in Section 8.3, Legendre's
equation and the linear oscillator equation are self-adjoint, but others, such as
the Laguerre and Hermite equations, are not. However, the theory of linear,
second-order, self-adjoint differential equations is perfectly general because we
can always transform the non-self-adjoint operator into the required self-adjoint
form. Consider Eq. 9.1 with p_0' ≠ p_1. If we multiply ℒ by³
\[ \frac{1}{p_0(x)}\exp\left[\int^x \frac{p_1(t)}{p_0(t)}\,dt\right], \]
we obtain
1 The adjoint operator bears a somewhat forced relationship to the adjoint
matrix. A better justification for the nomenclature is found in a comparison
of the self-adjoint operator (plus appropriate boundary conditions) with the
self-adjoint matrix. The significant properties are developed in Section 9.2.
Because of these properties, we are interested in self-adjoint operators.
2 The full importance of the self-adjoint form (plus boundary conditions) will
become apparent in Section 9.2. In addition, self-adjoint forms will be required
for developing integral equations and Green's functions in Section 16.5.
3 If we multiply ℒ by f(x)/p_0(x) and then demand that
\[ f'(x) = \frac{f\,p_1}{p_0}, \]
so that the new operator will be self-adjoint, we obtain
\[ f(x) = \exp\left[\int^x \frac{p_1(t)}{p_0(t)}\,dt\right]. \]
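\[ \frac{1}{p_0(x)}\exp\left[\int^x \frac{p_1(t)}{p_0(t)}\,dt\right]\mathcal{L}u(x)
 = \frac{d}{dx}\left\{\exp\left[\int^x \frac{p_1(t)}{p_0(t)}\,dt\right]\frac{du(x)}{dx}\right\}
 + \frac{p_2(x)}{p_0(x)}\exp\left[\int^x \frac{p_1(t)}{p_0(t)}\,dt\right]u, \qquad (9.5) \]
which is clearly self-adjoint. Notice the p_0(x) in the denominator. This is why
we require p_0(x) ≠ 0, a < x < b. In the following development we assume that
ℒ has been put into self-adjoint form.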
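The multiplying factor can be evaluated numerically for a concrete case. The sketch below is an illustration only: it takes Laguerre's equation in the form x y'' + (1 - x) y' + a y = 0, so that p_0 = x and p_1 = 1 - x, evaluates (1/p_0) exp[∫ p_1/p_0 dt] with Simpson's rule, and compares it with exp(-x). The function names and the lower limit x_ref are assumptions made for the example; a different lower limit only rescales the factor by a constant.

```python
import math


def self_adjoint_factor(p0, p1, x, x_ref=1.0, steps=2000):
    """Return (1/p0(x)) * exp( integral from x_ref to x of p1(t)/p0(t) dt ).

    The integral is evaluated with Simpson's rule (steps must be even).
    """
    h = (x - x_ref) / steps
    total = p1(x_ref) / p0(x_ref) + p1(x) / p0(x)
    for i in range(1, steps):
        t = x_ref + i * h
        total += (4 if i % 2 else 2) * p1(t) / p0(t)
    return math.exp(total * h / 3.0) / p0(x)


def laguerre_p0(x):
    return x


def laguerre_p1(x):
    return 1.0 - x


# The ratio to exp(-x) should be the same constant at every x.
for x in (0.5, 2.0, 3.0):
    ratio = self_adjoint_factor(laguerre_p0, laguerre_p1, x) / math.exp(-x)
    print(x, ratio)
```

With x_ref = 1 the constant ratio is e, so the factor is exp(-x) up to normalization, in line with Exercise 9.1.1.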
Eigenfunctions, Eigenvalues
From separation of variables or directly from a physical problem we have a
linear second-order differential equation of the form
\[ \mathcal{L}u(x) + \lambda w(x)u(x) = 0. \qquad (9.6) \]
Here λ is a constant and w(x) is a known function of x, called a density or
weighting function. The significance of these labels will appear in subsequent
sections. We require that w(x) > 0, except possibly at isolated points at which
w(x) = 0. For a given choice of the parameter λ, a function u_λ(x), which satisfies
Eq. 9.6 and the imposed boundary conditions, is called an eigenfunction
corresponding to λ. The constant λ is then called an eigenvalue. There is no guarantee
that an eigenfunction u_λ(x) will exist for any arbitrary choice of the parameter λ.
Indeed, the requirement that there be an eigenfunction often restricts the
acceptable values of λ to a discrete set. Examples of this for the Legendre,
Hermite, and Chebyshev equations appear in the exercises of Section 8.5.
Here we have one mathematical approach to the process of quantization in
quantum mechanics.
The major example of Eq. 9.6 in physics is the Schrödinger wave equation
\[ H\psi(x) = E\psi(x), \]
where the differential operator ℒ becomes the Hamiltonian H and the
eigenvalue (-λ) becomes the total energy E of the system. The eigenfunction ψ(x)
is usually called a wave function. A variational derivation of this Schrödinger
equation appears in Section 17.7.
EXAMPLE 9.1.1 Legendre's Equation
Legendre's equation is given by
\[ (1 - x^2)y'' - 2xy' + n(n + 1)y = 0. \qquad (9.7) \]
From Eqs. 9.1 and 9.6
\[ p_0(x) = 1 - x^2 = p, \qquad w(x) = 1, \]
\[ p_1(x) = -2x = p', \qquad \lambda = n(n + 1), \qquad (9.8) \]
\[ p_2(x) = 0 = q. \]
The reader will recall that our series solutions of Legendre's equation (Section
TABLE 9.1
Equation                      p(x)                   q(x)             λ            w(x)
Legendre                      1 - x^2                0                l(l + 1)     1
Shifted Legendre              x(1 - x)               0                l(l + 1)     1
Associated Legendre           1 - x^2                -m^2/(1 - x^2)   l(l + 1)     1
Chebyshev I                   (1 - x^2)^(1/2)        0                n^2          (1 - x^2)^(-1/2)
Shifted Chebyshev I           [x(1 - x)]^(1/2)       0                n^2          [x(1 - x)]^(-1/2)
Chebyshev II                  (1 - x^2)^(3/2)        0                n(n + 2)     (1 - x^2)^(1/2)
Ultraspherical (Gegenbauer)   (1 - x^2)^(α+1/2)      0                n(n + 2α)    (1 - x^2)^(α-1/2)
Bessel*                       x                      -n^2/x           a^2          x
Laguerre                      x e^(-x)               0                α            e^(-x)
Associated Laguerre           x^(k+1) e^(-x)         0                α - k        x^k e^(-x)
Hermite                       e^(-x^2)               0                2α           e^(-x^2)
Simple harmonic oscillator†   1                      0                n^2          1
* Orthogonality of Bessel functions is rather special. Compare Section 11.2
for details. A second type of orthogonality is developed in Section 11.7.
† This will form the basis for Chapter 14, Fourier series.
8.5)⁴ diverged unless n was restricted to one of the integers. This represents a
quantization of the eigenvalue λ.
When the equations of Chapter 8 are transformed into self-adjoint form,
we find the following values of the coefficients and parameters (Table 9.1).
The coefficient p(x) is the coefficient of the second derivative of the
eigenfunction and hopefully can be identified with no difficulty. The eigenvalue λ is
the parameter, or function of the parameter, that is available [in a term of the
form λw(x)y(x)]. Any x dependence apart from the eigenfunction becomes the
weighting function w(x). If there is another term containing the eigenfunction
(not the derivatives), the coefficient of the eigenfunction in this additional term
is identified as q(x). If no such term is present, q(x) is simply zero.
EXAMPLE 9.1.2 Deuteron
Further insight into the concepts of eigenfunction and eigenvalue may be
provided by an extremely simple model of the deuteron. The neutron-proton
nuclear interaction is represented by a square well potential: V = V_0 < 0 for
0 < r < a, V = 0 for r > a. The Schrödinger wave equation is
\[ -\frac{\hbar^2}{2M}\nabla^2\psi + V\psi = E\psi. \qquad (9.9) \]
With ψ = ψ(r), we may write u(r) = rψ(r), and using Exercise 2.5.18, the wave
equation becomes
4 Compare also Sections 5.2 and 12.10.
\[ \frac{d^2u}{dr^2} + k_1^2 u = 0, \qquad (9.10) \]
with
\[ k_1^2 = \frac{2M}{\hbar^2}(E - V_0) > 0 \qquad (9.11) \]
for the interior range, 0 < r < a. Here M is the reduced mass of the
neutron-proton system. For a < r < ∞, we have
\[ \frac{d^2u}{dr^2} - k_2^2 u = 0, \qquad (9.12) \]
with
\[ k_2^2 = -\frac{2ME}{\hbar^2} > 0. \qquad (9.13) \]
From the boundary condition that ψ remain finite, u(0) = 0 and
\[ u_1(r) = \sin k_1 r, \qquad 0 < r < a. \qquad (9.14) \]
In the range outside the potential well, we have a linear combination of the two
exponentials,
\[ u_2(r) = A\exp k_2 r + B\exp(-k_2 r), \qquad a < r < \infty. \qquad (9.15) \]
Continuity of particle density and current demand that u_1(a) = u_2(a) and that
u_1'(a) = u_2'(a). These joining conditions give
\[ \sin k_1 a = A\exp k_2 a + B\exp(-k_2 a), \]
\[ k_1\cos k_1 a = k_2 A\exp k_2 a - k_2 B\exp(-k_2 a). \qquad (9.16) \]
The condition that we actually have one proton-neutron combination is that
∫ψ*ψ dτ = 1. This constraint can be met if we impose a boundary condition
that ψ(r) remain finite as r → ∞. And this, in turn, means that A = 0. Dividing
the preceding pair of equations (to cancel B), we obtain
\[ \tan k_1 a = -\frac{k_1}{k_2} = -\sqrt{\frac{E - V_0}{-E}}, \qquad (9.17) \]
a transcendental equation for the energy E with only certain discrete solutions.
If E is such that Eq. 9.17 can be satisfied, our solutions u_1(r) and u_2(r) can
satisfy the boundary conditions. If Eq. 9.17 is not satisfied, no acceptable solution
exists. The values of E for which Eq. 9.17 is satisfied are the eigenvalues; the
corresponding functions u_1 and u_2 (or ψ) are the eigenfunctions. For the actual
deuteron problem there is one (and only one) negative value of E satisfying
Eq. 9.17, that is, the deuteron has one and only one bound state.
Now, what happens if E does not satisfy Eq. 9.17, if E is not an eigenvalue?
In graphical form, imagine that E and therefore k_1 are varied slightly.
For E = E_1 < E_0, k_1 is reduced, and sin k_1 a has not turned down as much.
FIG. 9.1 A deuteron eigenfunction (curves for E = E_1 < E_0, E = E_0, and E = E_2 > E_0; well edge at r = a)
The joining conditions, Eq. 9.16, require A > 0 and the wave function goes to
+∞, exponentially. For E = E_2 > E_0, k_1 is larger, sin k_1 a peaks sooner and is
descending more rapidly at r = a. The joining conditions demand A < 0, and
the wave function goes to -∞, exponentially. Only for E = E_0, an eigenvalue,
will the wave function have the required negative exponential asymptotic
behavior.
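The eigenvalue condition tan k_1 a = -k_1/k_2 can be solved numerically by bisection. The sketch below is illustrative only: the well depth V0 = -35 MeV, the radius a = 2.0 fm, and the constant ħ²/2M = 41.47 MeV fm² are assumed values chosen for the example, not data from the text. To avoid the poles of the tangent, the root is sought in the equivalent form k_2 sin k_1 a + k_1 cos k_1 a = 0.

```python
import math

HBAR2_OVER_2M = 41.47   # MeV fm^2, assumed value for the neutron-proton reduced mass
V0 = -35.0              # MeV, hypothetical well depth
A_WELL = 2.0            # fm, hypothetical well radius


def mismatch(E):
    """k2 sin(k1 a) + k1 cos(k1 a); it vanishes when tan(k1 a) = -k1/k2."""
    k1 = math.sqrt((E - V0) / HBAR2_OVER_2M)
    k2 = math.sqrt(-E / HBAR2_OVER_2M)
    return k2 * math.sin(k1 * A_WELL) + k1 * math.cos(k1 * A_WELL)


def bound_state_energy(lo=V0 + 1e-9, hi=-1e-9, tol=1e-12):
    """Bisection on V0 < E < 0; assumes mismatch changes sign exactly once there."""
    f_lo = mismatch(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = mismatch(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


E0 = bound_state_energy()
print("bound-state energy (MeV):", E0)
```

For these assumed parameters k_1 a stays between π/2 and 3π/2 at E = 0, so the well holds exactly one bound state, as in the discussion above. A deeper or wider well would need a scan for additional sign changes.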
Boundary Conditions
In the foregoing definition of eigenfunction, it was noted that the
eigenfunction u_λ(x) was required to satisfy certain imposed boundary conditions.
These boundary conditions may take three forms:
1. Cauchy boundary conditions. The value of a function
and normal derivative specified on the boundary. In
electrostatics this would mean φ, the potential, and
E_n, the normal component of the electric field.
2. Dirichlet boundary conditions. The value of a
function specified on the boundary.
3. Neumann boundary conditions. The normal
derivative (normal gradient) of a function specified on the
boundary. In the electrostatic case this would be E_n
and therefore σ, the surface charge density.
A summary of the relation of these three types of boundary condition to the
three types of two-dimensional partial differential equation is given in Table
9.2. For extended discussions of these partial differential equations the reader
may consult Sommerfeld, Chapter 2, or Morse and Feshbach, Chapter 6
(see General References).
Parts of Table 9.2 are simply a matter of maintaining internal consistency,
of common sense. For instance, for Poisson's equation with a closed surface,
Dirichlet conditions lead to a unique, stable solution. Neumann conditions,
TABLE 9.2
                     Type of partial differential equation
Boundary             Elliptic                  Hyperbolic                Parabolic
conditions           (Laplace, Poisson         (Wave equation            (Diffusion equation
                     in (x, y))                in (x, t))                in (x, t))
Cauchy
  Open surface       Unphysical results        Unique, stable            Too restrictive
                     (instability)             solution
  Closed surface     Too restrictive           Too restrictive           Too restrictive
Dirichlet
  Open surface       Insufficient              Insufficient              Unique, stable solution
                                                                         in one direction
  Closed surface     Unique, stable            Solution not              Too restrictive
                     solution                  unique
Neumann
  Open surface       Insufficient              Insufficient              Unique, stable solution
                                                                         in one direction
  Closed surface     Unique, stable            Solution not              Too restrictive
                     solution                  unique
independent of the Dirichlet conditions, likewise lead to a unique stable solution
independent of the Dirichlet solution. Therefore Cauchy boundary conditions
(meaning Dirichlet plus Neumann) could lead to an inconsistency.
The term boundary conditions includes as a special case the concept of
initial conditions. For instance, specifying the initial position x0 and the initial
velocity v0 in some dynamical problem would correspond to the Cauchy
boundary conditions. The only difference in the present usage of boundary
conditions in these one-dimensional problems is that we are going to apply
the conditions on both ends of the allowed range of the variable.
Usually the form of the differential equation or the boundary conditions
on the solutions will guarantee that at the ends of our interval (that is, at the
boundary) the following products will vanish:
\[ \left. p(x)v^*(x)\frac{du(x)}{dx}\right|_{x=a} = 0
 \quad\text{and}\quad
 \left. p(x)v^*(x)\frac{du(x)}{dx}\right|_{x=b} = 0. \qquad (9.18) \]
Here u(x) and v(x) are solutions of the particular differential equation (Eq. 9.6)
being considered. We can, however, work with a somewhat less restrictive set
of boundary conditions,
\[ \left. v^* p u'\right|_{x=a} = \left. v^* p u'\right|_{x=b}, \qquad (9.19) \]
in which u(x) and v(x) are solutions of the differential equation corresponding
to the same or to different eigenvalues. Equation 9.19 might well be satisfied
if we were dealing with a periodic physical system such as a crystal lattice.
Equations 9.18 and 9.19 are written in terms of v*, complex conjugate.
When the solutions are real, v = v* and the asterisk may be ignored. However,
in Fourier exponential expansions and in quantum mechanics the functions
will be complex and the complex conjugate will be needed.
These properties (Eq. 9.18 or 9.19) are so important for the concept of
Hermitian operator (which follows) and the consequences (Section 9.2) that
literally the interval (a, b) will be chosen to ensure that Eq. 9.18 or 9.19 are
satisfied. If our solutions are polynomials, the coefficient p(x) will determine the
range of integration. Note that p(x) also determines the singular points of the
differential equation, Section 8.3. For nonpolynomial solutions, for example,
sin nx, cos nx (p = 1), the range of integration is determined by properties of
the solutions, as in Example 9.1.3.
EXAMPLE 9.1.3 Choice of Integration Interval, [a,b]
For ℒ = d²/dx² a possible eigenvalue equation is
\[ \frac{d^2}{dx^2}y(x) + n^2 y(x) = 0, \qquad (9.20) \]
with eigenfunctions
\[ u_n = \cos nx, \qquad v_m = \sin mx. \]
Equation 9.19 becomes
\[ \left. -n\sin mx\,\sin nx\,\right|_a^b = 0 \]
or
\[ \left. m\cos mx\,\cos nx\,\right|_a^b = 0, \]
interchanging u_n and v_m. Since sin mx and cos nx are periodic with period 2π
(for n and m integral), Eq. 9.19 is clearly satisfied if a = x_0 and b = x_0 + 2π.
The interval is chosen so that the boundary conditions (Eq. 9.19, etc.) are
satisfied. For this case (Fourier series) the usual choices are x_0 = 0 leading to
(0, 2π) and x_0 = -π leading to (-π, π). Here and throughout the following
several chapters the integration interval is chosen so that the boundary conditions
(Eq. 9.19) will be satisfied. The interval [a, b] and the weighting factor w(x)
for the most commonly encountered second-order differential equations are
listed in Table 9.3.
Hermitian Operators
We now prove an important property of the combination self-adjoint,
second-order differential operator (Eq. 9.6), plus solutions u(x) and v(x) that
satisfy boundary conditions given by Eq. 9.19.
TABLE 9.3
Equation                      a       b       w(x)
Legendre                      -1      1       1
Shifted Legendre              0       1       1
Associated Legendre           -1      1       1
Chebyshev I                   -1      1       (1 - x^2)^(-1/2)
Shifted Chebyshev I           0       1       [x(1 - x)]^(-1/2)
Chebyshev II                  -1      1       (1 - x^2)^(1/2)
Laguerre                      0       ∞       e^(-x)
Associated Laguerre           0       ∞       x^k e^(-x)
Hermite                       -∞      ∞       e^(-x^2)
Simple harmonic oscillator    0       2π      1
                              -π      π       1
Note. 1. The orthogonality interval [a, b] is determined by the boundary
conditions of Section 9.1.
2. The weighting function is established by putting the differential
equation in self-adjoint form.
By integrating v* (complex conjugate) times the second-order self-adjoint
differential operator ℒ (operating on u) over the range a < x < b, we obtain
\[ \int_a^b v^*\mathcal{L}u\,dx = \int_a^b v^*(pu')'\,dx + \int_a^b v^* q u\,dx \qquad (9.21) \]
using Eq. 9.4. Integrating by parts, we have
\[ \int_a^b v^*(pu')'\,dx = \left. v^* p u'\right|_a^b - \int_a^b v^{*\prime} p u'\,dx. \qquad (9.22) \]
The integrated part vanishes on application of the boundary conditions (Eq.
9.19). Integrating the remaining integral by parts a second time, we have
\[ -\int_a^b v^{*\prime} p u'\,dx = \left. -v^{*\prime} p u\right|_a^b + \int_a^b u(pv^{*\prime})'\,dx. \qquad (9.23) \]
Again, the integrated part vanishes in an application of Eq. 9.19. A combination
of Eqs. 9.21 to 9.23 gives us
\[ \int_a^b v^*\mathcal{L}u\,dx = \int_a^b u\,\mathcal{L}v^*\,dx. \qquad (9.24) \]
This property, given by Eq. 9.24, is expressed by saying that the operator ℒ is
Hermitian with respect to the functions u(x) and v(x) which satisfy the boundary
conditions specified by Eq. 9.19. Note carefully that this Hermitian property
follows from self-adjointness plus boundary conditions.
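The Hermitian property can be illustrated numerically. The sketch below is an example under stated assumptions, not part of the derivation: it takes the Legendre operator ℒ = d/dx[(1 - x²) d/dx] on [-1, 1], where p(x) vanishes at both ends so the integrated parts drop out, and two arbitrary real polynomials u = x² and v = x⁴ whose images ℒu and ℒv were worked out by hand. Simpson's rule then compares the two sides of Eq. 9.24.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


# Legendre operator L = d/dx[(1 - x^2) d/dx] applied by hand to test functions.
def u(x):
    return x ** 2


def Lu(x):
    return 2.0 - 6.0 * x ** 2          # (1 - x^2)*2 - 2x*(2x)


def v(x):
    return x ** 4


def Lv(x):
    return 12.0 * x ** 2 - 20.0 * x ** 4   # (1 - x^2)*12x^2 - 2x*(4x^3)


lhs = simpson(lambda x: v(x) * Lu(x), -1.0, 1.0)   # integral of v L u
rhs = simpson(lambda x: u(x) * Lv(x), -1.0, 1.0)   # integral of u L v
print(lhs, rhs)
```

Neither u nor v is an eigenfunction here; the equality of the two integrals relies only on the self-adjoint form and on p(±1) = 0.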
Hermitian Operators in Quantum Mechanics
The preceding development in this section has focused on the classical
second-order differential operators of mathematical physics. Generalizing our
Hermitian operator theory as required in quantum mechanics, we have an
extension: The operators need be neither second-order differential operators
nor real. p_x = -iħ(∂/∂x) will be an Hermitian operator. We simply assume
(as is customary in quantum mechanics) that the wave functions satisfy
appropriate boundary conditions: vanishing sufficiently strongly at infinity or having
periodic behavior (as in a crystal lattice, or unit intensity for waves). The
operator ℒ is called Hermitian if
\[ \int \psi_1^*\mathcal{L}\psi_2\,d\tau = \int (\mathcal{L}\psi_1)^*\psi_2\,d\tau. \qquad (9.25) \]
Apart from the simple extension to complex quantities, this definition is identical
with Eq. 9.24.
The adjoint A† of an operator A is defined by
\[ \int \psi_1^* A^\dagger\psi_2\,d\tau = \int (A\psi_1)^*\psi_2\,d\tau. \qquad (9.26) \]
This is quite different from our classical, second derivative operator-oriented
definition, Eq. 9.2. Here the adjoint is defined in terms of the resultant integral,
with the A† as part of the integrand. Clearly, if A = A† (self-adjoint), then A is
Hermitian. The converse is not so simple (and not always true), but in quantum
mechanics the two terms self-adjoint and Hermitian are usually taken to be
synonymous. (This is also done in matrix analysis, Section 4.5.)
The expectation value of an operator ℒ is defined as
\[ \langle\mathcal{L}\rangle = \int \psi^*\mathcal{L}\psi\,d\tau. \qquad (9.27a) \]
In the framework of quantum mechanics ⟨ℒ⟩ corresponds to the result of a
measurement of the physical quantity represented by ℒ when the physical
system is in a state described by the wave function ψ. If we require ℒ to be
Hermitian, it is easy to show that ⟨ℒ⟩ is real (as would be expected from a
measurement in a physical theory). Taking the complex conjugate of Eq. 9.27a,
we obtain
\[ \langle\mathcal{L}\rangle^* = \left[\int \psi^*\mathcal{L}\psi\,d\tau\right]^* = \int \psi\,\mathcal{L}^*\psi^*\,d\tau. \]
Rearranging the factors in the integrand, we have
\[ \langle\mathcal{L}\rangle^* = \int (\mathcal{L}\psi)^*\psi\,d\tau. \]
Then, applying our definition of Hermitian operator, Eq. 9.25, we get
\[ \langle\mathcal{L}\rangle^* = \int \psi^*\mathcal{L}\psi\,d\tau = \langle\mathcal{L}\rangle, \qquad (9.27b) \]
or ⟨ℒ⟩ is real. It is worth noting that ψ is not necessarily an eigenfunction
of ℒ.
EXERCISES
9.1.1 Show that Laguerre's equation may be put into self-adjoint form by multiplying
by e^(-x) and that w(x) = e^(-x) is the weighting function.
9.1.2 Show that the Hermite equation may be put into self-adjoint form by multiplying
by e^(-x^2) and that this gives w(x) = e^(-x^2) as the appropriate density function.
9.1.3 Show that the Chebyshev equation (type I) may be put into self-adjoint form by
multiplying by (1 - x^2)^(-1/2) and that this gives w(x) = (1 - x^2)^(-1/2) as the
appropriate density function.
9.1.4 Show the following when the linear second-order differential equation is expressed
in self-adjoint form:
(a) The Wronskian is equal to a constant divided by the initial coefficient p.
(b) A second solution is given by
\[ y_2(x) = C\,y_1(x)\int^x \frac{dt}{p(t)\,[y_1(t)]^2}. \]
9.1.5 U_n(x), the Chebyshev polynomial (type II), satisfies the differential equation
\[ (1 - x^2)U_n''(x) - 3xU_n'(x) + n(n + 2)U_n(x) = 0. \]
(a) Locate the singular points that appear in the finite plane and show whether
they are regular or irregular.
(b) Put this equation in self-adjoint form.
(c) Identify the complete eigenvalue.
(d) Identify the weighting function.
9.1.6 For the very special case λ = 0 and q(x) = 0 the self-adjoint eigenvalue equation
becomes
\[ \frac{d}{dx}\left[p(x)\frac{du(x)}{dx}\right] = 0, \]
satisfied by
\[ \frac{du}{dx} = \frac{1}{p(x)}. \]
Use this to obtain a "second" solution of the following:
(a) Legendre's equation,
(b) Laguerre's equation,
(c) Hermite's equation.
ANS. (a) \( u_2(x) = \frac{1}{2}\ln\frac{1 + x}{1 - x} \),
(b) \( u_2(x) - u_2(x_0) = \int_{x_0}^{x} e^{t}\,\frac{dt}{t} \),
(c) \( u_2(x) = \int_0^x e^{t^2}\,dt \).
These second solutions illustrate the divergent behavior usually found in a
second solution.
Note. In all three cases u_1(x) = 1.
9.1.7 Given that ℒu = 0 and gℒu is self-adjoint, show that for the adjoint operator
ℒ̄, ℒ̄(gu) = 0.
9.1.8 For a second-order differential operator ℒ that is self-adjoint show that
\[ \int_a^b \left[y_2\mathcal{L}y_1 - y_1\mathcal{L}y_2\right]dx = \left. p\,(y_1'y_2 - y_1y_2')\right|_a^b. \]
9.1.9 Show that if a function ψ is required to satisfy Laplace's equation in a finite region
of space and to satisfy Dirichlet boundary conditions over the entire closed
bounding surface, then ψ is unique.
Hint. One of the forms of Green's theorem, Section 1.11 will be helpful.
9.1.10 Consider the solutions of the Legendre, Chebyshev, Hermite, and Laguerre
equations to be polynomials. Show that the ranges of integration that guarantee
that the Hermitian operator boundary conditions will be satisfied are
(a) Legendre [—1,1],
(b) Chebyshev [-1,1],
(c) Hermite (—oo, oo),
(d) Laguerre [0, oo).
9.1.11 Within the framework of quantum mechanics (Eqs. 9.25 and following), show
that the following are Hermitian operators:
(a) momentum \( \mathbf{p} = -i\hbar\nabla = -i\frac{h}{2\pi}\nabla \)
(b) angular momentum \( \mathbf{L} = -i\hbar\,\mathbf{r}\times\nabla = -i\frac{h}{2\pi}\,\mathbf{r}\times\nabla \)
Hint. In cartesian form L is a linear combination of noncommuting Hermitian
operators.
9.1.12 (a) A is a non-Hermitian operator. In the sense of Eqs. 9.25 and 9.26, show that
A + A† and i(A - A†)
are Hermitian operators.
(b) Using the preceding result, show that every non-Hermitian operator may
be written as a linear combination of two Hermitian operators.
9.1.13 U and V are two arbitrary operators, not necessarily Hermitian. In the sense of
Eq. 9.26, show that
(UV)† = V†U†.
Note the resemblance to Eq. 4.124 for adjoint matrices.
Hint. Apply the definition of adjoint operator, Eq. 9.26.
9.1.14 Prove that the product of two Hermitian operators is Hermitian (Eq. 9.25) if
and only if the two operators commute.
9.1.15 A and B are noncommuting quantum mechanical operators:
AB - BA = iC.
Show that C is Hermitian. Assume that appropriate boundary conditions are
satisfied.
9.1.16 The operator ℒ is Hermitian. Show that ⟨ℒ²⟩ ≥ 0.
9.1.17 A quantum mechanical expectation value is defined by
\[ \langle A\rangle = \int \psi^*(x)A\psi(x)\,dx, \]
where A is a linear operator. Show that demanding that ⟨A⟩ be real means that
A must be Hermitian with respect to ψ(x).
9.1.18 From the definition of adjoint, Eq. 9.26, show that A†† = A in the sense that
∫ψ_1* A†† ψ_2 dτ = ∫ψ_1* A ψ_2 dτ. The adjoint of the adjoint is the original operator.
Hint. The functions ψ_1 and ψ_2 of Eq. 9.26 represent a class of functions. The
subscripts 1 and 2 may be interchanged or replaced by other subscripts.
9.1.19 The Schrödinger wave equation for the deuteron (with a Woods-Saxon potential)
is
\[ -\frac{\hbar^2}{2M}\nabla^2\psi + \frac{V_0}{1 + \exp[(r - r_0)/a]}\,\psi = E\psi. \]
Here E = -2.224 MeV. a is a "thickness parameter," 0.4 × 10^-13 centimeters.
Expressing lengths in fermis (10^-13 centimeters) and energies in million electron
volts (MeV), we may rewrite the wave equation as
\[ \frac{d^2}{dr^2}(r\psi) + \frac{1}{41.47}\left[E - \frac{V_0}{1 + \exp\left(\frac{r - r_0}{a}\right)}\right](r\psi) = 0. \]
E is assumed known from experiment. The game is to find V_0 for a specified value
of r_0 (say, r_0 = 2.1). If we let y(r) = rψ(r), then y(0) = 0 and we take y'(0) = 1.
Find V_0 such that y(20.0) = 0. (This should be y(∞), but r = 20 is far enough
beyond the range of nuclear forces to approximate infinity.)
ANS. For a = 0.4 and r_0 = 2.1 fm, V_0 = -34.159 MeV.
9.1.20 Determine the nuclear potential well parameter V_0 of Exercise 9.1.19 as a function
of r_0 for r_0 = 2.00(0.05)2.25 fermis.
Express your results as a power law
Determine the exponent ν and the constant k. This power law formulation is
useful for accurate interpolation.
9.1.21 In Exercise 9.1.19 it was assumed that 20 fermis was a good approximation to
infinity. Check on this by calculating V_0 for rψ(r) = 0 at (a) r = 15, (b) r = 20,
(c) r = 25, and (d) r = 30. Sketch your results. Take r_0 = 2.10 and a = 0.4 (fermis).
9.1.22 For a quantum particle moving in a potential well, V(x) = (1/2)mω²x², the
Schrödinger wave equation is
\[ -\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + \frac{1}{2}m\omega^2x^2\psi(x) = E\psi(x), \]
or
\[ \frac{d^2\psi(z)}{dz^2} - z^2\psi(z) = -\frac{2E}{\hbar\omega}\psi(z), \]
where z = (mω/ħ)^(1/2) x. Since this operator is even, we expect solutions of definite
parity. For the initial conditions that follow integrate out from the origin and
determine the minimum constant 2E/ħω that will lead to ψ(∞) = 0 in each case.
(You may take z = 6 as an approximation of infinity.)
(a) For an even eigenfunction,
ψ(0) = 1, ψ'(0) = 0.
(b) For an odd eigenfunction,
ψ(0) = 0, ψ'(0) = 1.
Note. Analytical solutions appear in Section 13.1.
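One way to organize the shooting calculation of Exercise 9.1.22 is sketched below. It is a sketch under assumptions, not a prescribed solution: the equation is taken as ψ'' = (z² - ε)ψ with ε = 2E/ħω, the integration uses a classical fourth-order Runge-Kutta step out to z = 6, and the sign of ψ(6) is bisected in ε. The bracketing intervals [0.5, 1.5] and [2.5, 3.5] are assumed to come from a coarse preliminary scan; all names are illustrative.

```python
def psi_at(z_end, eps, psi0, dpsi0, h=2e-3):
    """RK4 integration of psi'' = (z^2 - eps) psi from z = 0 to z_end."""
    def deriv(z, psi, dpsi):
        return dpsi, (z * z - eps) * psi

    steps = int(round(z_end / h))
    z, psi, dpsi = 0.0, psi0, dpsi0
    for _ in range(steps):
        k1p, k1d = deriv(z, psi, dpsi)
        k2p, k2d = deriv(z + h / 2, psi + h / 2 * k1p, dpsi + h / 2 * k1d)
        k3p, k3d = deriv(z + h / 2, psi + h / 2 * k2p, dpsi + h / 2 * k2d)
        k4p, k4d = deriv(z + h, psi + h * k3p, dpsi + h * k3d)
        psi += h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        dpsi += h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
        z += h
    return psi


def shoot_eigenvalue(psi0, dpsi0, lo, hi, z_end=6.0, tol=1e-8):
    """Bisect on eps until psi(z_end) changes sign within [lo, hi]."""
    f_lo = psi_at(z_end, lo, psi0, dpsi0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = psi_at(z_end, mid, psi0, dpsi0)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


eps_even = shoot_eigenvalue(1.0, 0.0, 0.5, 1.5)   # psi(0) = 1, psi'(0) = 0
eps_odd = shoot_eigenvalue(0.0, 1.0, 2.5, 3.5)    # psi(0) = 0, psi'(0) = 1
print(eps_even, eps_odd)
```

Below the eigenvalue ψ(6) runs off to large positive values and above it to large negative values, which is what makes the sign of ψ(6) a usable bisection target.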
9.2 HERMITIAN (SELF-ADJOINT) OPERATORS
Hermitian or self-adjoint operators have three properties that are of extreme
importance in physics, both classical and quantum.
1. The eigenvalues of an Hermitian operator are real.
2. The eigenfunctions of an Hermitian operator are
orthogonal.
3. The eigenfunctions of an Hermitian operator form
a complete set.¹
Real Eigenvalues
We proceed to prove the first two of these three properties. Let
\[ \mathcal{L}u_i + \lambda_i w u_i = 0. \qquad (9.28) \]
Assuming the existence of a second eigenvalue and eigenfunction,
\[ \mathcal{L}u_j + \lambda_j w u_j = 0. \qquad (9.29) \]
Then, taking the complex conjugate, we obtain
\[ \mathcal{L}u_j^* + \lambda_j^* w u_j^* = 0. \qquad (9.30) \]
Here ℒ is a real operator (p and q are real functions of x) and w(x) is a real
function. But we permit λ_k, the eigenvalues, and u_k, the eigenfunctions, to be
complex. Multiplying Eq. 9.28 by u_j* and Eq. 9.30 by u_i and then subtracting,
we have
\[ u_j^*\mathcal{L}u_i - u_i\mathcal{L}u_j^* = (\lambda_j^* - \lambda_i)\,w\,u_i u_j^*. \qquad (9.31) \]
We integrate over the range a < x < b,
\[ \int_a^b u_j^*\mathcal{L}u_i\,dx - \int_a^b u_i\mathcal{L}u_j^*\,dx = (\lambda_j^* - \lambda_i)\int_a^b u_i u_j^* w\,dx. \qquad (9.32) \]
Since ℒ is Hermitian, the left-hand side vanishes by Eq. 9.24 and
\[ (\lambda_j^* - \lambda_i)\int_a^b u_i u_j^* w\,dx = 0. \qquad (9.33) \]
If i = j, the integral cannot vanish [w(x) > 0, apart from isolated points], except
in the trivial case u_i = 0. Hence the coefficient (λ_i* - λ_i) must be zero,
\[ \lambda_i^* = \lambda_i, \qquad (9.34) \]
which is a mathematical statement that the eigenvalue is real. Since λ_i can
represent any one of the eigenvalues, this proves the first property. This is an
exact analog of the nature of the eigenvalues of real symmetric (and of Hermitian)
matrices (compare Section 4.6).
1 This third property is not universal. It does hold for our linear, second-order
differential operators in Sturm-Liouville (self-adjoint) form. Completeness
is defined and discussed in Section 9.4. A proof that the eigenfunctions of our
linear, second-order, self-adjoint, differential equations form a complete set
may be developed from the calculus of variations of Section 17.8.
This reality of the eigenvalues of Hermitian operators has a fundamental
significance in quantum mechanics. In quantum mechanics the eigenvalues
correspond to precisely measurable quantities, such as energy and angular
momentum. With the theory formulated in terms of Hermitian operators, this
proof of the reality of the eigenvalues guarantees that the theory will predict
real numbers for these measurable physical quantities. In Section 17.8 it will
be seen that the set of real eigenvalues has a lower bound.
Orthogonal Eigenfunctions
If we now take i ≠ j and if λ_i ≠ λ_j, the integral of the product of the two
different eigenfunctions must vanish:
\[ \int_a^b u_i u_j^* w\,dx = 0. \qquad (9.35) \]
This condition, called orthogonality, is the continuum analog of the vanishing
of a scalar product of two vectors.² We say that the eigenfunctions u_i(x) and
u_j(x) are orthogonal with respect to the weighting function w(x) over the interval
[a, b]. Equation 9.35 constitutes a partial proof of the second property of our
Hermitian operators. Again, the precise analogy with matrix analysis should
be noted. Indeed, we can establish a one-to-one correspondence between this
Sturm-Liouville theory of differential equations and the treatment of Hermitian
matrices. Historically, this correspondence has been significant in establishing
the mathematical equivalence of matrix mechanics developed by Heisenberg
and wave mechanics developed by Schrödinger. Today, the two diverse
approaches are merged into the theory of quantum mechanics and the
mathematical formulation that is more convenient for a particular problem is used
for that problem. Actually the mathematical alternatives do not end here.
Integral equations, Chapter 16, form a third equivalent and sometimes more
convenient or more powerful approach.
This proof of orthogonality is not quite complete. There is a loophole,
because we may have i ≠ j but still have λ_i = λ_j. Such a case is labeled degenerate.
Illustrations of degeneracy are given at the end of this section. If λ_i = λ_j, the
integral in Eq. 9.33 need not vanish. This means that linearly independent
eigenfunctions corresponding to the same eigenvalue are not automatically
orthogonal and that some other method must be sought to obtain an orthogonal
set. Although the eigenfunctions in this degenerate case may not be orthogonal,
they can always be made orthogonal. One method is developed in the next
section.
2 From the definition of Riemann integral
\[ \int_a^b f(x)g(x)\,dx = \lim_{N\to\infty}\left(\sum_{i=1}^{N} f(x_i)g(x_i)\right)\Delta x, \]
where x_0 = a, x_N = b, and x_i - x_{i-1} = Δx. If we interpret f(x_i) and g(x_i) as
the ith components of an N component vector, then this sum (and therefore
this integral) corresponds directly to a scalar product of vectors, Eq. 1.22.
The vanishing of the scalar product is the condition for orthogonality of the
vectors, or functions.
We shall see in succeeding chapters that it is just as desirable to have a given
set of functions orthogonal as it is to have an orthogonal coordinate system.
We can work with nonorthogonal functions, but they are likely to prove as
messy as an oblique coordinate system.
EXAMPLE 9.2.1 Fourier Series: Orthogonality
Continuing Example 9.1.3, the eigenvalue equation, Eq. 9.20,
\[ \frac{d^2}{dx^2}y(x) + n^2y(x) = 0, \]
perhaps describes a quantum mechanical particle in a box, perhaps a vibrating
violin string with (degenerate) eigenfunctions cos nx and sin nx.
With n real (here taken to be integral), the orthogonality integrals become
\[ \text{a.}\quad \int_{x_0}^{x_0+2\pi}\sin mx\,\sin nx\,dx = C_n\delta_{nm}, \]
\[ \text{b.}\quad \int_{x_0}^{x_0+2\pi}\cos mx\,\cos nx\,dx = D_n\delta_{nm}, \]
\[ \text{c.}\quad \int_{x_0}^{x_0+2\pi}\sin mx\,\cos nx\,dx = 0. \]
For an interval of 2π the preceding analysis guarantees the Kronecker delta in
(a) and (b) but not the zero in (c) because (c) involves degenerate eigenfunctions.
However, inspection shows that (c) always vanishes for all integral m and n.
Our Sturm-Liouville theory says nothing about the values of C_n and D_n.
Actual calculation yields
\[ C_n = \begin{cases} \pi, & n \neq 0, \\ 0, & n = 0, \end{cases} \qquad
 D_n = \begin{cases} \pi, & n \neq 0, \\ 2\pi, & n = 0. \end{cases} \]
These orthogonality integrals form the basis of the Fourier series developed
in Chapter 14.
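The three orthogonality integrals of this example can be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the midpoint rule, and the step count are illustrative assumptions, and the starting point x0 is left free because the results should not depend on it.

```python
import math

def integrate(func, a, b, steps=2000):
    """Composite midpoint rule on [a, b] (an assumed, simple quadrature)."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

def orthogonality_integrals(m, n, x0=0.0):
    """Return integrals (a), (b), (c) of Example 9.2.1 over [x0, x0 + 2*pi]."""
    lo, hi = x0, x0 + 2.0 * math.pi
    ss = integrate(lambda x: math.sin(m * x) * math.sin(n * x), lo, hi)
    cc = integrate(lambda x: math.cos(m * x) * math.cos(n * x), lo, hi)
    sc = integrate(lambda x: math.sin(m * x) * math.cos(n * x), lo, hi)
    return ss, cc, sc
```

For m = n = 3 the first two integrals should come out close to $\pi$ and the third close to zero; for m = n = 0 the cosine integral should be close to $2\pi$, in agreement with the values of $C_n$ and $D_n$ quoted above.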
EXAMPLE 9.2.2 Expansion in Orthogonal Eigenfunctions: Square Wave
The property of completeness means that certain classes of function (i.e.,
sectionally or piecewise continuous) may be represented by a series of orthogonal
eigenfunctions to any desired degree of accuracy. Consider the square wave
$$f(x) = \begin{cases} \dfrac{h}{2}, & 0 < x < \pi, \\ -\dfrac{h}{2}, & -\pi < x < 0. \end{cases}$$
This function may be expanded in any of a variety of eigenfunctions—Legendre,
Hermite, Chebyshev, and so on. The choice of eigenfunction is made on the
basis of convenience. To illustrate the expansion technique, let us choose the
eigenfunctions of Example 9.2.1, cos nx and sin nx.
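The eigenfunction series is conveniently (and conventionally) written as
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx).$$
From the orthogonality integrals of Example 9.2.1 the coefficients are given by
$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos nt \, dt,$$
$$b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin nt \, dt, \qquad n = 0, 1, 2, \ldots.$$
Direct substitution of $\pm h/2$ for $f(t)$ yields
$$a_n = 0,$$
which is expected here because of the antisymmetry, and
$$b_n = \frac{h}{n\pi} (1 - \cos n\pi) = \begin{cases} 0, & n \text{ even}, \\ \dfrac{2h}{n\pi}, & n \text{ odd}. \end{cases}$$
Hence the eigenfunction (Fourier) expansion of the square wave is
$$f(x) = \frac{2h}{\pi} \sum_{n=0}^{\infty} \frac{\sin (2n+1)x}{2n+1}. \tag{9.37}$$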
Additional examples, using other eigenfunctions, appear in Chapters 11 and 12.
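The coefficient integral and the resulting series can be tried out directly. The sketch below is an illustration under stated assumptions, not the text's own procedure: the function names, the midpoint quadrature, and the step count are hypothetical choices, and the square wave is only defined here for $-\pi < x < \pi$, $x \neq 0$.

```python
import math

def square_wave(x, h=1.0):
    """Square wave of Example 9.2.2 for -pi < x < pi, x != 0."""
    return h / 2.0 if x > 0 else -h / 2.0

def b_coefficient(n, h=1.0, steps=20000):
    """b_n = (1/pi) * integral of f(t) sin(nt) over [-pi, pi], midpoint rule."""
    dt = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = -math.pi + (k + 0.5) * dt
        total += square_wave(t, h) * math.sin(n * t)
    return total * dt / math.pi

def partial_sum(x, terms, h=1.0):
    """Sum of the first `terms` terms of the expansion in Eq. 9.37."""
    return (2.0 * h / math.pi) * sum(
        math.sin((2 * n + 1) * x) / (2 * n + 1) for n in range(terms))
```

With h = 1 the computed $b_1$ should agree with $2h/\pi$, $b_2$ should be negligible, and the partial sum at $x = \pi/2$ should approach $h/2$ slowly as more terms are included.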
Degeneracy
The concept of degeneracy was introduced earlier. If N linearly independent
eigenfunctions correspond to the same eigenvalue, the eigenvalue is said to be
N-fold degenerate. A particularly simple illustration is provided by the
eigenvalues and eigenfunctions of the linear oscillator equation, Example 9.2.1. For
each value of the eigenvalue n, there are two possible solutions: sin nx and cos nx
(and any linear combination). We may say the eigenfunctions are degenerate or
the eigenvalue is degenerate.
A more involved example is furnished by the physical system of an electron in
an atom (nonrelativistic treatment, spin neglected). From the Schrodinger
equation, Eq. 13.53 for hydrogen, the total energy of the electron is our
eigenvalue. We may label it $E_{nLM}$ by using the quantum numbers n, L, and M as
subscripts. For each distinct set of quantum numbers (n, L, M) there is a distinct,
linearly independent eigenfunction $\psi_{nLM}(r, \theta, \varphi)$. For hydrogen, the energy $E_{nLM}$
is independent of L and M. With $0 \le L \le n - 1$ and $-L \le M \le L$, the
eigenvalue is $n^2$-fold degenerate (including the electron spin would raise this to $2n^2$).
In atoms with more than one electron the electrostatic potential is no longer a
simple $r^{-1}$ potential. The energy depends on L as well as on n, although not on
M. $E_{nLM}$ is still $(2L + 1)$-fold degenerate. This degeneracy may be removed by
applying an external magnetic field, giving rise to the Zeeman effect.
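The $n^2$ count quoted for hydrogen is just the number of allowed (L, M) pairs, and it can be confirmed by enumeration. This is a small illustrative sketch; the function names are hypothetical and nothing here goes beyond the ranges $0 \le L \le n - 1$, $-L \le M \le L$ stated above.

```python
def hydrogen_states(n):
    """Return the (L, M) pairs allowed for principal quantum number n."""
    return [(L, M) for L in range(n) for M in range(-L, L + 1)]

def degeneracy(n, include_spin=False):
    """Number of linearly independent eigenfunctions sharing the energy E_n."""
    count = len(hydrogen_states(n))
    return 2 * count if include_spin else count
```

For n = 3 the enumeration gives nine states (one with L = 0, three with L = 1, five with L = 2), or eighteen when the two spin orientations are counted.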
EXERCISES
9.2.1 The functions $u_1(x)$ and $u_2(x)$ are eigenfunctions of the same Hermitian operator
but for distinct eigenvalues $\lambda_1$ and $\lambda_2$. Prove that $u_1(x)$ and $u_2(x)$ are linearly
independent.
9.2.2 (a) The vectors $\mathbf{e}_n$ are orthogonal to each other: $\mathbf{e}_n \cdot \mathbf{e}_m = 0$ for $n \neq m$. Show
that they are linearly independent.
(b) The functions $\psi_n(x)$ are orthogonal to each other over the interval [a, b] and
with respect to the weighting function w(x). Show that the $\psi_n(x)$ are linearly
independent.
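9.2.3
$$P_1(x) = x \quad \text{and} \quad Q_0(x) = \frac{1}{2} \ln \left( \frac{1 + x}{1 - x} \right)$$
are solutions of Legendre's differential equation corresponding to different
eigenvalues.
(a) Evaluate their orthogonality integral
$$\int_{-1}^{1} \frac{x}{2} \ln \left( \frac{1 + x}{1 - x} \right) dx.$$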
(b) Explain why these two functions are not orthogonal, why the proof of
orthogonality does not apply.
9.2.4 $T_0(x) = 1$ and $V_1(x) = (1 - x^2)^{1/2}$ are solutions of the Chebyshev differential
equation corresponding to different eigenvalues. Explain, in terms of the boundary
conditions, why these two functions are not orthogonal.
9.2.5 (a) Show that the first derivatives of the Legendre polynomials satisfy a
self-adjoint differential equation with eigenvalue $\lambda = n(n + 1) - 2$.
(b) Show that these Legendre polynomial derivatives satisfy an orthogonality
relation
$$\int_{-1}^{1} P_m'(x) P_n'(x) (1 - x^2) \, dx = 0, \qquad m \neq n.$$
Note. In Section 12.5 $(1 - x^2)^{1/2} P_n'(x)$ will be labeled an associated Legendre
polynomial, $P_n^1(x)$.
9.2.6 A set of functions $u_n(x)$ satisfy the Sturm-Liouville equation
$$\frac{d}{dx} \left[ p(x) \frac{d}{dx} u_n(x) \right] + \lambda_n w(x) u_n(x) = 0.$$
The functions $u_m(x)$ and $u_n(x)$ satisfy boundary conditions that lead to
orthogonality. The corresponding eigenvalues $\lambda_m$ and $\lambda_n$ are distinct. Prove that for
appropriate boundary conditions $u_m'(x)$ and $u_n'(x)$ are orthogonal with p(x) as a weighting
function.
9.2.7 A linear operator A has n distinct eigenvalues and n corresponding
eigenfunctions: $A\psi_i = \lambda_i \psi_i$. Show that the n eigenfunctions are linearly independent. A is
not necessarily Hermitian.
Hint. Assume linear dependence, that $\psi_n = \sum_{i=1}^{n-1} a_i \psi_i$. Use this relation and the
operator-eigenfunction equation first in one order, then in the reverse order.
Show that a contradiction results.
9.2.8 A set of functions are mutually orthogonal. Show that they are automatically
linearly independent, that orthogonality implies linear independence.
9.2.9 The ultraspherical polynomials $C_n^{(\alpha)}(x)$ are solutions of the differential equation
$$\left\{ (1 - x^2) \frac{d^2}{dx^2} - (2\alpha + 1) x \frac{d}{dx} + n(n + 2\alpha) \right\} C_n^{(\alpha)}(x) = 0.$$
(a) Transform this differential equation into self-adjoint form.
(b) Show that the $C_n^{(\alpha)}(x)$ are orthogonal for different n. Specify the interval of
integration and the weighting factor.
Note. Assume that your solutions are polynomials.
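9.2.10 With $\mathcal{L}$ not self-adjoint,
$$\mathcal{L} u_i + \lambda_i w u_i = 0$$
and
$$\bar{\mathcal{L}} v_j + \lambda_j w v_j = 0,$$
(a) Show that
$$\int_a^b v_j \mathcal{L} u_i \, dx = \int_a^b u_i \bar{\mathcal{L}} v_j \, dx,$$
provided
$$u_i p_0 v_j' \Big|_a^b = v_j p_0 u_i' \Big|_a^b$$
and
$$u_i (p_1 - p_0') v_j \Big|_a^b = 0.$$
(b) Show that the orthogonality integral for the eigenfunctions $u_i$ and $v_j$ becomes
$$\int_a^b u_i v_j w \, dx = 0 \qquad (\lambda_i \neq \lambda_j).$$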
9.2.11 In Exercise 8.5.8 the series solution of the Chebyshev equation is found to be
convergent for all n. Therefore n is not quantized by the argument used for
Legendre (Exercise 8.5.4). Calculate the sum of the k = 0 Chebyshev series for
$n = \nu = 0.8$, 0.9, and 1.0 and for x = 0.0(0.1)0.9.
Note. The Chebyshev series recurrence relation is given in Exercise 5.2.16.
9.2.12 (a) Evaluate the $n = \nu = 0.9$, k = 0 Chebyshev series for x = 0.98, 0.99, and
1.00. The series converges very slowly at x = 1.00. You may wish to use
double precision. Upper bounds to the error in your calculation can be set
by comparison with the $\nu = 1.0$ case, which corresponds to $(1 - x^2)^{1/2}$.
(b) These series solutions for $\nu = 0.9$ and for $\nu = 1.0$ are obviously not
orthogonal despite the fact that they satisfy a self-adjoint eigenvalue equation
with different eigenvalues. From the behavior of the solutions in the vicinity
of x = 1.00 try to formulate a hypothesis as to why the proof of orthogonality
does not apply.
9.2.13 The Fourier expansion of the (asymmetric) square wave is given by Eq. 9.37. With
h = 2, evaluate this series for $x = 0(\pi/18)\pi/2$, using the first (a) 10 terms, (b) 100
terms of the series.
Note. For 10 terms and $x = \pi/18$ or 10° your Fourier representation has a sharp
hump. This is the Gibbs phenomenon of Section 14.5. For 100 terms this hump
has been shifted over to about 1°.
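9.2.14 The symmetric square wave
$$f(x) = \begin{cases} 1, & |x| < \dfrac{\pi}{2}, \\ -1, & \dfrac{\pi}{2} < |x| < \pi \end{cases}$$
has a Fourier expansion
$$f(x) = \frac{4}{\pi} \sum_{n=0}^{\infty} (-1)^n \frac{\cos (2n+1)x}{2n+1}.$$
Evaluate this series for $x = 0(\pi/18)\pi/2$ using the first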
(a) 10 terms,
(b) 100 terms of the series.
Note. As in Exercise 9.2.13, the Gibbs phenomenon appears at the discontinuity.
This means that a Fourier series is not suitable for precise numerical work in the
vicinity of a discontinuity.
9.3 GRAM-SCHMIDT ORTHOGONALIZATION
The Gram-Schmidt orthogonalization is a method that takes a
nonorthogonal set of linearly independent functions¹ and literally constructs an
orthogonal set over an arbitrary interval and with respect to an arbitrary weight
or density factor. In the language of linear algebra the process is equivalent to a
matrix transformation relating an orthogonal set of basis vectors (functions) to
a nonorthogonal set. A specific example of this matrix transformation appears
in Exercise 12.2.1. The functions involved may be real or complex. Here for
convenience they are assumed to be real. The generalization to the complex case
should offer little difficulty.
Before taking up orthogonalization, we should consider normalization of
functions. So far no normalization has been specified. This means that
$$\int_a^b \varphi_i^2 w \, dx = N_i^2,$$
but no attention has been paid to the value of $N_i$. Since our basic equation (Eq.
9.6) is linear and homogeneous, we may multiply our solution by any constant
1 Such a set of functions might well arise from the solutions of a (partial)
differential equation in which the eigenvalue was independent of one or more
of the constants of separation. As an example, we have the hydrogen atom
problem (Sections 9.2 and 13.2). The eigenvalue (energy) is independent of
both the electron orbital angular momentum and its projection on the z-axis,
m. The student should note, however, that the origin of the set of functions
is irrelevant to the Gram-Schmidt orthogonalization procedure.
and it will still be a solution. We now demand that each solution $\varphi_i(x)$ be
multiplied by $N_i^{-1}$ so that the new (normalized) $\varphi_i$ will satisfy
$$\int_a^b \varphi_i^2(x) w(x) \, dx = 1 \tag{9.38}$$
and
$$\int_a^b \varphi_i(x) \varphi_j(x) w(x) \, dx = \delta_{ij}. \tag{9.39}$$
Equation 9.38 says that we have normalized to unity. Including the property of
orthogonality, we have Eq. 9.39. Functions satisfying this equation are said to be
orthonormal (orthogonal plus unit normalization). It should be emphasized that
other normalizations are possible, and indeed, by historical convention, each of
the special functions of mathematical physics treated in Chapters 12 and 13 will
be normalized differently!
We consider three sets of functions: an original, given set $u_n(x)$, n = 0, 1, 2, ...;
an orthogonalized set $\psi_n(x)$ to be constructed; and a final set of functions $\varphi_n(x)$
which are the normalized $\psi_n$'s. The original $u_n$'s may be degenerate
eigenfunctions, but this is not necessary. We shall have

$u_n(x)$                 $\psi_n(x)$              $\varphi_n(x)$
linearly independent     linearly independent     linearly independent
nonorthogonal            orthogonal               orthogonal
unnormalized             unnormalized             normalized
                                                  (orthonormal)
The Gram-Schmidt procedure is to take the nth $\psi$ function ($\psi_n$) to be $u_n(x)$
plus an unknown linear combination of the previous $\varphi$'s. The presence of the
new $u_n(x)$ will guarantee linear independence. The requirement that $\psi_n(x)$ be
orthogonal to each of the previous $\varphi$'s yields just enough constraints to
determine each of the unknown coefficients. Then the fully determined $\psi_n$ will be
normalized to unity, yielding $\varphi_n(x)$. Then the sequence of steps is repeated for
$\psi_{n+1}(x)$.
Starting with n = 0, let
$$\psi_0(x) = u_0(x) \tag{9.40}$$
with no "previous" $\varphi$'s to worry about. Normalizing,
$$\varphi_0(x) = \frac{\psi_0(x)}{\left[ \int \psi_0^2 w \, dx \right]^{1/2}}. \tag{9.41}$$
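For n = 1, let
$$\psi_1(x) = u_1(x) + a_{10} \varphi_0(x). \tag{9.42}$$
We demand that $\psi_1(x)$ be orthogonal to $\varphi_0(x)$. (At this stage the normalization
of $\psi_1(x)$ is irrelevant.) This demand of orthogonality leads to
$$\int \psi_1 \varphi_0 w \, dx = \int u_1 \varphi_0 w \, dx + a_{10} \int \varphi_0^2 w \, dx = 0. \tag{9.43}$$
Since $\varphi_0$ is normalized to unity (Eq. 9.41), we have
$$a_{10} = -\int u_1 \varphi_0 w \, dx, \tag{9.44}$$
fixing the value of $a_{10}$. Normalizing, we define
$$\varphi_1(x) = \frac{\psi_1(x)}{\left( \int \psi_1^2 w \, dx \right)^{1/2}}. \tag{9.45}$$
Generalizing, we have
$$\varphi_i(x) = \frac{\psi_i(x)}{\left( \int \psi_i^2(x) w(x) \, dx \right)^{1/2}}, \tag{9.46}$$
where
$$\psi_i(x) = u_i + a_{i0} \varphi_0 + a_{i1} \varphi_1 + \cdots + a_{i,i-1} \varphi_{i-1}. \tag{9.47}$$
The coefficients $a_{ij}$ are given by
$$a_{ij} = -\int u_i \varphi_j w \, dx. \tag{9.48}$$
Equation 9.48 is for unit normalization. If some other normalization is
selected, then
$$\int_a^b [\varphi_j(x)]^2 w(x) \, dx = N_j^2.$$
Equation 9.46 is replaced by
$$\varphi_i(x) = N_i \frac{\psi_i(x)}{\left( \int \psi_i^2 w \, dx \right)^{1/2}} \tag{9.46a}$$
and $a_{ij}$ becomes
$$a_{ij} = -\frac{\int u_i \varphi_j w \, dx}{N_j^2}. \tag{9.48a}$$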
Equations 9.47 and 9.48 may be rewritten in terms of projection operators,
$P_j$. If we consider the $\varphi_n(x)$ to form a linear vector space, then the integral in
Eq. 9.48 may be interpreted as the projection of $u_i$ into the $\varphi_j$ "coordinate" or
the jth component of $u_i$. With
$$P_j u_i(x) = \left\{ \int u_i(t) \varphi_j(t) w(t) \, dt \right\} \varphi_j(x),$$
Eq. 9.47 becomes
$$\psi_i(x) = \left\{ 1 - \sum_{j=0}^{i-1} P_j \right\} u_i(x). \tag{9.47a}$$
Subtracting off the jth components, j = 0 to i − 1, leaves $\psi_i(x)$ orthogonal to all
the $\varphi_j(x)$.
It will be noticed that although this Gram-Schmidt procedure is one possible
way of constructing an orthogonal or orthonormal set, the functions $\varphi_i(x)$ are
not unique. There is an infinite number of possible orthonormal sets for a given
interval and a given density function. As an illustration of the freedom involved,
consider two (nonparallel) vectors A and B in the xy-plane. We may normalize
A to unit magnitude and then form B' = aA + B so that B' is perpendicular to
A. By normalizing B' we have completed the Gram-Schmidt orthogonalization
for two vectors. But any two perpendicular unit vectors such as i and j could
have been chosen as our orthonormal set. Again, with an infinite number of
possible rotations of i and j about the z-axis, we have an infinite number of
possible orthonormal sets.
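The two-vector illustration can be made concrete with a few lines of code. This is a minimal sketch for ordinary real vectors, with the dot product playing the role of the weighted integral; the function names are hypothetical, and the coefficient a is computed as in Eq. 9.48, from the original vector and each previously constructed unit vector.

```python
import math

def dot(u, v):
    """Ordinary scalar product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors, taken in the given order."""
    basis = []
    for u in vectors:
        psi = list(u)
        for phi in basis:
            a = -dot(u, phi)                      # vector analog of Eq. 9.48
            psi = [p + a * q for p, q in zip(psi, phi)]
        norm = math.sqrt(dot(psi, psi))
        basis.append([p / norm for p in psi])
    return basis
```

Starting from A = (3, 4) and B = (1, 0) this yields the pair (0.6, 0.8), (0.8, −0.6), whereas taking B first yields (1, 0), (0, 1). Both are orthonormal sets for the same plane, which is the nonuniqueness described above.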
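EXAMPLE 9.3.1 Legendre Polynomials by Gram-Schmidt Orthogonalization
Let us form an orthonormal set from the set of functions $u_n(x) = x^n$, n =
0, 1, 2, .... The interval is $-1 \le x \le 1$ and the density function is w(x) = 1.
In accordance with the Gram-Schmidt orthogonalization process described,
$$u_0 = 1 \quad \text{and} \quad \varphi_0 = \frac{1}{\sqrt{2}}. \tag{9.49}$$
Then
$$\psi_1(x) = x + a_{10} \frac{1}{\sqrt{2}} \tag{9.50}$$
and
$$a_{10} = -\int_{-1}^{1} \frac{x}{\sqrt{2}} \, dx = 0 \tag{9.51}$$
by symmetry. Normalizing $\psi_1$, we obtain
$$\varphi_1(x) = \sqrt{\frac{3}{2}} \, x. \tag{9.52}$$
Continuing the Gram-Schmidt process, we define
$$\psi_2(x) = x^2 + a_{20} \frac{1}{\sqrt{2}} + a_{21} \sqrt{\frac{3}{2}} \, x, \tag{9.53}$$
where
$$a_{20} = -\int_{-1}^{1} \frac{x^2}{\sqrt{2}} \, dx = -\frac{\sqrt{2}}{3}, \tag{9.54}$$
$$a_{21} = -\int_{-1}^{1} \sqrt{\frac{3}{2}} \, x^3 \, dx = 0, \tag{9.55}$$
again by symmetry. Therefore
$$\psi_2(x) = x^2 - \frac{1}{3}, \tag{9.56}$$
and, on normalizing to unity, we have
$$\varphi_2(x) = \sqrt{\frac{5}{2}} \cdot \frac{1}{2} (3x^2 - 1). \tag{9.57}$$
The next function $\varphi_3(x)$ is
$$\varphi_3(x) = \sqrt{\frac{7}{2}} \cdot \frac{1}{2} (5x^3 - 3x). \tag{9.58}$$
Reference to Chapter 12 will show that
$$\varphi_n(x) = \sqrt{\frac{2n + 1}{2}} \, P_n(x), \tag{9.59}$$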
where Pn(x) is the nth-order Legendre polynomial. Our Gram-Schmidt process
provides a possible but very cumbersome method of generating the Legendre
polynomials.
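The algebra of this example is easy to mechanize with exact rational arithmetic. The sketch below is an illustration, not the text's procedure verbatim: to avoid square roots it projects onto the unnormalized $\psi$'s and divides by their norms squared, which is equivalent to Eqs. 9.47 and 9.48 with $\varphi_j = \psi_j / (\int \psi_j^2 \, dx)^{1/2}$. Polynomials are stored as coefficient lists (index = power of x), and all names are hypothetical.

```python
from fractions import Fraction

def inner(p, q):
    """Integral of p(x) q(x) over [-1, 1] with w(x) = 1 (coefficient lists)."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:                  # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total

def orthogonal_polynomials(count):
    """Unnormalized psi_0 ... psi_{count-1} built from u_n = x**n."""
    psis = []
    for n in range(count):
        u = [Fraction(0)] * n + [Fraction(1)]     # coefficients of x**n
        psi = list(u)
        for prev in psis:
            c = inner(u, prev) / inner(prev, prev)
            for k, coeff in enumerate(prev):
                psi[k] -= c * coeff               # subtract the projection
        psis.append(psi)
    return psis
```

The third entry reproduces $\psi_2 = x^2 - \frac{1}{3}$ of Eq. 9.56, with norm squared 8/45, and the fourth is $x^3 - \frac{3}{5} x$, which is proportional to $5x^3 - 3x$ in Eq. 9.58.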
The equations for Gram-Schmidt orthogonalization tend to be
ill-conditioned because of the subtractions. A technique for avoiding this difficulty using
the polynomial recurrence relation is discussed by Hamming.²
In Example 9.3.1 we have specified an orthogonality interval [−1, 1], a unit
weighting function, and a set of functions, $x^n$, to be taken one at a time in
increasing order. Given all these specifications the Gram-Schmidt procedure is
unique (to within a normalization factor and an overall sign as discussed
subsequently). Our resulting orthogonal set, the Legendre polynomials, $P_0$ up
through $P_n$, form a complete set for the description of polynomials of order $\le n$
over [−1, 1]. This concept of completeness is taken up in detail in Section 9.4.
Expansions of functions in series of Legendre polynomials are found in Section
12.3.
Orthogonal Polynomials
This particular example has been chosen strictly to illustrate the Gram-
Schmidt procedure. Although it has the advantage of introducing the Legendre
polynomials, the initial functions $u_n = x^n$ are not degenerate eigenfunctions and
are not solutions of Legendre's equation. They are simply a set of functions that
we have here rearranged to create an orthonormal set for the given interval and
given weighting function. The fact that we obtained the Legendre polynomials
is not quite black magic but a direct consequence of the choice of interval and
2 R. W. Hamming, Numerical Methods for Scientists and Engineers, 2nd ed.
New York: McGraw-Hill (1973). See Section 27.2 and references given there.
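TABLE 9.4 Orthogonal Polynomials Generated by Gram-Schmidt Orthogonalization of $u_n(x) = x^n$, n = 0, 1, 2, ...

Polynomials            Interval                  Weighting function w(x)    Standard normalization
Legendre               $-1 \le x \le 1$          $1$                        $\int_{-1}^{1} [P_n(x)]^2 \, dx = \frac{2}{2n+1}$
Shifted Legendre       $0 \le x \le 1$           $1$                        $\int_0^1 [P_n^*(x)]^2 \, dx = \frac{1}{2n+1}$
Chebyshev I            $-1 \le x \le 1$          $(1 - x^2)^{-1/2}$         $\int_{-1}^{1} [T_n(x)]^2 (1 - x^2)^{-1/2} \, dx = \pi/2$ for $n \neq 0$, $\pi$ for $n = 0$
Shifted Chebyshev I    $0 \le x \le 1$           $[x(1 - x)]^{-1/2}$        $\int_0^1 [T_n^*(x)]^2 [x(1 - x)]^{-1/2} \, dx = \pi/2$ for $n > 0$, $\pi$ for $n = 0$
Chebyshev II           $-1 \le x \le 1$          $(1 - x^2)^{1/2}$          $\int_{-1}^{1} [U_n(x)]^2 (1 - x^2)^{1/2} \, dx = \frac{\pi}{2}$
Laguerre               $0 \le x < \infty$        $e^{-x}$                   $\int_0^{\infty} [L_n(x)]^2 e^{-x} \, dx = 1$
Associated Laguerre    $0 \le x < \infty$        $x^k e^{-x}$               $\int_0^{\infty} [L_n^k(x)]^2 x^k e^{-x} \, dx = \frac{(n+k)!}{n!}$
Hermite                $-\infty < x < \infty$    $e^{-x^2}$                 $\int_{-\infty}^{\infty} [H_n(x)]^2 e^{-x^2} \, dx = 2^n \pi^{1/2} n!$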
weighting function. The use of $u_n(x) = x^n$ but with other choices of interval and
weighting function leads to other sets of orthogonal polynomials as shown in
Table 9.4. We consider these polynomials in detail in Chapters 12 and 13 as
solutions of particular differential equations.
An examination of this orthogonalization process will reveal two arbitrary
features. First, as emphasized before, it is not necessary to normalize the
functions to unity. In the example just given we could have required
$$\int_{-1}^{1} \varphi_n(x) \varphi_m(x) \, dx = \frac{2}{2n + 1} \delta_{nm}, \tag{9.60}$$
and the resulting set would have been the actual Legendre polynomials. Second,
the sign of $\varphi_n$ is always indeterminate. In the example we chose the sign by
requiring the coefficient of the highest power of x in the polynomial to be
positive. For the Laguerre polynomials, on the other hand, we would require the
coefficient of the highest power to be $(-1)^n / n!$.
EXERCISES
9.3.1 Rework Example 9.3.1 by replacing $\varphi_n(x)$ by the conventional Legendre
polynomial, $P_n(x)$.
$$\int_{-1}^{1} [P_n(x)]^2 \, dx = \frac{2}{2n + 1}.$$
Using Eqs. 9.37a, 9.46a, and 9.48a, construct $P_0$, $P_1(x)$, and $P_2(x)$.
ANS. $P_0 = 1$,
$P_1 = x$,
$P_2 = \frac{3}{2} x^2 - \frac{1}{2}$.
9.3.2 Following the Gram-Schmidt procedure, construct a set of polynomials $P_n^*(x)$
orthogonal (unit weighting factor) over the range [0, 1] from the set {1, x}.
Normalize so that $P_n^*(1) = 1$.
ANS. $P_0^*(x) = 1$,
$P_1^*(x) = 2x - 1$,
$P_2^*(x) = 6x^2 - 6x + 1$,
$P_3^*(x) = 20x^3 - 30x^2 + 12x - 1$.
These are the first four shifted Legendre polynomials.
Note. The "*" is the standard notation for "shifted": [0,1] instead of [-1,1]. It
does not mean complex conjugate.
9.3.3 Apply the Gram-Schmidt procedure to form the first three Laguerre polynomials
$u_n(x) = x^n$, n = 0, 1, 2, ...,
$0 \le x < \infty$,
$w(x) = e^{-x}$.
The conventional normalization is
$$\int_0^{\infty} L_m(x) L_n(x) e^{-x} \, dx = \delta_{mn}.$$
ANS. $L_0 = 1$,
$L_1 = 1 - x$,
$L_2 = \dfrac{2 - 4x + x^2}{2}$.
9.3.4 You are given
(a) a set of functions $u_n(x) = x^n$, n = 0, 1, 2, ...,
(b) an interval (0, ∞),
(c) a weighting function $w(x) = x e^{-x}$.
Use the Gram-Schmidt procedure to construct the first three orthonormal functions
from the set $u_n(x)$ for this interval and this weighting function.
ANS. $\varphi_0(x) = 1$,
$\varphi_1(x) = (x - 2)/\sqrt{2}$,
$\varphi_2(x) = (x^2 - 6x + 6)/(2\sqrt{3})$.
9.3.5 Using the Gram-Schmidt orthogonalization procedure, construct the lowest three
Hermite polynomials:
$u_n(x) = x^n$, n = 0, 1, 2, ..., $-\infty < x < \infty$, $w(x) = e^{-x^2}$.
For this set of polynomials the usual normalization is
$$\int_{-\infty}^{\infty} H_m(x) H_n(x) w(x) \, dx = \delta_{mn} 2^m m! \, \pi^{1/2}.$$
ANS. $H_0 = 1$,
$H_1 = 2x$,
$H_2 = 4x^2 - 2$.
9.3.6 Use the Gram-Schmidt orthogonalization scheme to construct the first three
Chebyshev polynomials (type I).
$u_n(x) = x^n$, n = 0, 1, 2, ..., $-1 \le x \le 1$, $w(x) = (1 - x^2)^{-1/2}$.
Take the normalization
$$\int_{-1}^{1} T_m(x) T_n(x) w(x) \, dx = \delta_{mn} \begin{cases} \pi, & m = n = 0, \\ \dfrac{\pi}{2}, & m = n \ge 1. \end{cases}$$
Hint. The needed integrals are given in Exercise 10.4.3.
ANS. $T_0 = 1$,
$T_1 = x$,
$T_2 = 2x^2 - 1$,
($T_3 = 4x^3 - 3x$).
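9.3.7 Use the Gram-Schmidt orthogonalization scheme to construct the first three
Chebyshev polynomials (type II).
$u_n(x) = x^n$, n = 0, 1, 2, ..., $-1 \le x \le 1$, $w(x) = (1 - x^2)^{+1/2}$.
Take the normalization to be
$$\int_{-1}^{1} U_m(x) U_n(x) w(x) \, dx = \delta_{mn} \frac{\pi}{2}.$$
Hint.
$$\int_{-1}^{1} (1 - x^2)^{1/2} x^{2n} \, dx = \begin{cases} \dfrac{\pi}{2} \times \dfrac{1 \cdot 3 \cdot 5 \cdots (2n - 1)}{4 \cdot 6 \cdot 8 \cdots (2n + 2)}, & n = 1, 2, 3, \ldots, \\ \dfrac{\pi}{2}, & n = 0. \end{cases}$$
ANS. $U_0 = 1$,
$U_1 = 2x$,
$U_2 = 4x^2 - 1$.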
9.3.8 As a modification of Exercise 9.3.5, apply the Gram-Schmidt orthogonalization
procedure to the set $u_n(x) = x^n$, n = 0, 1, 2, ..., $0 \le x < \infty$. Take w(x) to be
$\exp[-x^2]$. Find the first two nonvanishing polynomials. Normalize so that the
coefficient of the highest power of x is unity. In Exercise 9.3.5 the interval (−∞, ∞)
led to the Hermite polynomials. These are certainly not the Hermite polynomials.
ANS. $\varphi_0 = 1$,
$\varphi_1 = x - \pi^{-1/2}$.
9.3.9 Form an orthogonal set over the interval $0 \le x < \infty$, using $u_n(x) = e^{-nx}$, n = 1,
2, 3, .... Take the weighting factor, w(x), to be unity. These functions are solutions
of $u_n'' - n^2 u_n = 0$, which is clearly already in Sturm-Liouville (self-adjoint) form.
Why doesn't the Sturm-Liouville theory guarantee the orthogonality of these
functions?
9.4 COMPLETENESS OF EIGENFUNCTIONS
The third important property of an Hermitian operator is that its
eigenfunctions form a complete set. This completeness means that any well-behaved
(at least piecewise continuous) function F(x) can be approximated by a series
$$F(x) = \sum_{n=0}^{\infty} a_n \varphi_n(x) \tag{9.61}$$
to any desired degree of accuracy.¹ More precisely, the set $\varphi_n(x)$ is called
complete² if the limit of the mean square error vanishes:
$$\lim_{m \to \infty} \int_a^b \left[ F(x) - \sum_{n=0}^{m} a_n \varphi_n(x) \right]^2 w(x) \, dx = 0. \tag{9.62}$$
Technically, the integral here is a Lebesgue integral. We have not required that
the error vanish identically in [a, b] but only that the integral of the error squared
go to zero.
This convergence in the mean, Eq. 9.62, should be compared with uniform
convergence, Section 5.5, Eq. 5.67. Clearly, uniform convergence implies
convergence in the mean but the converse does not hold; convergence in the mean
is less restrictive. Specifically, Eq. 9.62 is not upset by piecewise continuous
functions, a finite number of finite discontinuities. Equation 9.62 is perfectly
adequate for our purposes and is far more convenient than Eq. 5.67. Indeed,
since we frequently use eigenfunctions to describe discontinuous functions,
convergence in the mean is all we can expect.
In the language of linear algebra, we have a linear space, a function space. The
linearly independent, orthonormal functions $\varphi_n(x)$ form the basis for this
(infinite-dimensional) space. Equation 9.61 is a statement that the functions
$\varphi_n(x)$ span this linear space. With an inner product defined by Eq. 9.64, our linear
space is a Hilbert space.
The question of completeness of a set of functions is often determined by
comparison with a Laurent series, Section 6.5. In Section 14.1 this is done for
Fourier series, thus establishing the completeness of Fourier series. For all
orthogonal polynomials mentioned in Section 9.3 it is possible to find a
polynomial expansion of each power of z,
$$z^n = \sum_{i=0}^{n} a_i P_i(z), \tag{9.63}$$
where $P_i(z)$ is the ith polynomial. Exercises 12.4.6, 13.1.8, 13.2.5, and 13.3.22 are
specific examples of Eq. 9.63. Using Eq. 9.63, we may reexpress the Laurent
expansion of f(z) in terms of the polynomials, showing that the polynomial
expansion exists (and existing, it is unique, Exercise 9.4.1). The limitation of this
Laurent series development is that it requires the function to be analytic.
Equations 9.61 and 9.62 are more general. F(x) may be only piecewise
continuous. Numerous examples of the representation of such piecewise continuous
functions appear in Chapter 14 (Fourier series). A proof that our Sturm-
Liouville eigenfunctions form complete sets appears in Courant and Hilbert.3
In Eq. 9.61 the expansion coefficients $a_m$ may be determined by
1 If we have a finite set, as with vectors, the summation is over the number of
linearly independent members of the set.
2 Many authors use the term closed here.
3 R. Courant and D. Hilbert, Methods of Mathematical Physics, Vol. 1
(English translation). New York: Interscience Publishers (1953), Chapter 6,
Section 3.
[Figure: flow chart. Box labels: powers of x (Section 5.7); eigenfunctions (Sections 9.1, 9.2) with degenerate or nondegenerate eigenvalues; linearly independent set of functions $u_n(x)$; orthogonal set of functions $\varphi_n(x)$; Gram-Schmidt orthogonalization (Section 9.3); uniqueness of power series (Section 8.6); Section 9.4, Eq. 9.64; unique representation of function f(x) (Exs. 9.4.1, 9.4.2).]
FIG. 9.2 Linear independence, orthogonality, and uniqueness.
$$a_m = \int_a^b F(x) \varphi_m(x) w(x) \, dx. \tag{9.64}$$
This follows from multiplying Eq. 9.61 by $\varphi_m(x) w(x)$ and integrating. From the
orthogonality of the eigenfunctions, $\varphi_n(x)$, only the mth term survives. Here we
see the value of orthogonality. Equation 9.64 may be compared with the dot or
inner product of vectors, Section 1.3, and $a_m$ interpreted as the mth projection of
the function F(x). Often the coefficient $a_m$ is called a generalized Fourier
coefficient.
For a known function, F(x), Eq. 9.64 gives $a_m$ as a definite integral which can
always be evaluated, by machine if not analytically.
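As an illustration of evaluating Eq. 9.64 "by machine," the sketch below computes generalized Fourier coefficients in the orthonormal Legendre functions $\varphi_n(x) = \sqrt{(2n+1)/2} \, P_n(x)$ on [−1, 1] with w(x) = 1. The choices here are assumptions made for the example: $P_n$ is generated from the standard three-term recurrence $(k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}$, the quadrature is a simple midpoint rule, and the function names are hypothetical.

```python
import math

def legendre(n, x):
    """P_n(x) from the standard three-term recurrence."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def phi(n, x):
    """Orthonormal Legendre function on [-1, 1] with unit weight."""
    return math.sqrt((2 * n + 1) / 2.0) * legendre(n, x)

def coefficient(F, m, steps=4000):
    """a_m = integral of F(x) phi_m(x) over [-1, 1] (Eq. 9.64), midpoint rule."""
    h = 2.0 / steps
    total = 0.0
    for k in range(steps):
        x = -1.0 + (k + 0.5) * h
        total += F(x) * phi(m, x)
    return total * h
```

For F(x) = |x|, which is continuous but not analytic at the origin, the coefficients with odd m vanish by symmetry, $a_0 = 1/\sqrt{2}$, and $a_2 = \sqrt{5/2}/4$; the numerical values should agree with these to within the quadrature error.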
For examples of particular eigenfunction expansions, see the following:
Fourier series, Section 9.2 and Chapter 14; Bessel and Fourier-Bessel
expansions, Section 11.2; Legendre series, Section 12.3; Laplace series, Section 12.6;
Hermite series, Section 13.1; Laguerre series, Section 13.2; and Chebyshev series,
Section 13.3.
It may also happen that the eigenfunction expansion, Eq. 9.61, is the
expansion of an unknown F(x) in a series of known eigenfunctions $\varphi_n(x)$ with unknown
coefficients $a_n$. An example would be the quantum chemist's attempt to describe
an (unknown) molecular wave function as a linear combination of known
atomic wave functions. The unknown coefficients $a_n$ would be determined by a
variational technique (Rayleigh-Ritz, Section 17.8).
The relationships among eigenfunctions, orthogonal sets of functions,
linearly independent sets of functions, and uniqueness of representations are
presented schematically in Fig. 9.2.
Bessel's Inequality
If the set of functions $\varphi_n(x)$ does not form a complete set, possibly because we
simply have not included the required infinite number of members of an infinite
set, we are led to Bessel's inequality. First, consider the finite case. Let A be an
n-component vector,
$$\mathbf{A} = \mathbf{e}_1 a_1 + \mathbf{e}_2 a_2 + \cdots + \mathbf{e}_n a_n, \tag{9.65}$$
in which $\mathbf{e}_i$ is a unit vector and $a_i$ is the corresponding component (projection)
of A, that is,
$$a_i = \mathbf{A} \cdot \mathbf{e}_i. \tag{9.66}$$
Then
$$\left( \mathbf{A} - \sum_i \mathbf{e}_i a_i \right)^2 \ge 0. \tag{9.67}$$
If we sum over all n components, clearly, the summation equals A by Eq. 9.65
and the equality holds. If, however, the summation does not include all n
components, the inequality results. By expanding Eq. 9.67 and remembering that the
unit vectors satisfy an orthogonality relation,
$$\mathbf{e}_i \cdot \mathbf{e}_j = \delta_{ij}, \tag{9.68}$$
we have
$$A^2 \ge \sum_i a_i^2. \tag{9.69}$$
This is Bessel's inequality.
For functions we consider the integral
$$\int_a^b \left[ f(x) - \sum_i a_i \varphi_i(x) \right]^2 w(x) \, dx \ge 0. \tag{9.70}$$
This is the continuum analog of Eq. 9.67, letting $n \to \infty$ and replacing the
summation by an integration. Again, with the weighting factor w(x) > 0, the
integrand is nonnegative. The integral vanishes by Eq. 9.61 if we have a complete
set. Otherwise it is positive. Expanding the squared term, we obtain
$$\int_a^b [f(x)]^2 w(x) \, dx - 2 \sum_i a_i \int_a^b f(x) \varphi_i(x) w(x) \, dx + \sum_i a_i^2 \ge 0. \tag{9.71}$$
Applying Eq. 9.64, we have
$$\int_a^b [f(x)]^2 w(x) \, dx \ge \sum_i a_i^2. \tag{9.72}$$
Hence the sum of the squares of the expansion coefficients $a_i$ is less than or equal
to the weighted integral of $[f(x)]^2$, the equality holding if and only if the
expansion is exact, that is, if the set of functions $\varphi_n(x)$ is a complete set.
In later chapters when we consider eigenfunctions that form complete sets
(such as Legendre polynomials), Eq. 9.72 with the equal sign holding will be
called a Parseval relation.
Bessel's inequality has a variety of uses, including proof of convergence of the
Fourier series.
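Bessel's inequality can be watched at work on the square wave of Example 9.2.2. In the sketch below the orthonormal functions are taken to be $\sin(nx)/\sqrt{\pi}$ on $[-\pi, \pi]$ (an assumption made for this illustration), so that $a_n = \sqrt{\pi} \, b_n$ with $b_n$ the closed form found in that example; the function names are hypothetical.

```python
import math

def bessel_partial_sums(h=1.0, max_n=200):
    """Running sums of a_n**2 for the square wave of Example 9.2.2."""
    sums, total = [], 0.0
    for n in range(1, max_n + 1):
        b_n = h * (1.0 - math.cos(n * math.pi)) / (n * math.pi)
        a_n = math.sqrt(math.pi) * b_n            # coefficient of sin(nx)/sqrt(pi)
        total += a_n * a_n
        sums.append(total)
    return sums

def weighted_norm(h=1.0):
    """Integral of f(x)**2 over [-pi, pi] for f = +h/2 or -h/2."""
    return math.pi * h * h / 2.0
```

Every partial sum stays below $\pi h^2 / 2$, as Eq. 9.72 requires, and the sums creep up toward that value as more terms are kept, which is the Parseval relation in the limit.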
Schwarz Inequality
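The frequently used Schwarz inequality is similar to the Bessel inequality.
Consider the quadratic equation
$$\sum_{i=1}^{n} (a_i x + b_i)^2 = \sum_{i=1}^{n} a_i^2 \left( x + \frac{b_i}{a_i} \right)^2 = 0. \tag{9.73}$$
If $b_i / a_i$ = constant, c, then the solution is x = −c. If $b_i / a_i$ is not a constant, all
terms cannot vanish simultaneously for real x. So the solution must be complex.
Expanding, we find that
$$x^2 \sum_{i=1}^{n} a_i^2 + 2x \sum_{i=1}^{n} a_i b_i + \sum_{i=1}^{n} b_i^2 = 0, \tag{9.74}$$
and since x is complex (or $= -b_i / a_i$), the quadratic formula⁴ for x leads to
$$\left( \sum_{i=1}^{n} a_i b_i \right)^2 \le \left( \sum_{i=1}^{n} a_i^2 \right) \left( \sum_{i=1}^{n} b_i^2 \right), \tag{9.75}$$
the equality holding when $b_i / a_i$ equals a constant.
Once more, in terms of vectors, we have
$$(\mathbf{a} \cdot \mathbf{b})^2 = a^2 b^2 \cos^2 \theta \le a^2 b^2, \tag{9.76}$$
where $\theta$ is the included angle.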
The Schwarz inequality for functions has the form
$$\left| \int_a^b f^*(x) g(x) \, dx \right|^2 \le \int_a^b f^*(x) f(x) \, dx \int_a^b g^*(x) g(x) \, dx, \tag{9.77}$$
the equality holding if and only if $g(x) = \alpha f(x)$, $\alpha$ being a constant. To prove this
function form of the Schwarz inequality,⁵ consider a complex function
$\psi(x) = f(x) + \lambda g(x)$ with $\lambda$ a complex constant. The functions f(x) and g(x) are any
two functions (for which the integrals exist). Multiplying by the complex
conjugate and integrating, we obtain
$$\int_a^b \psi^* \psi \, dx = \int_a^b f^* f \, dx + \lambda \int_a^b f^* g \, dx + \lambda^* \int_a^b g^* f \, dx + \lambda \lambda^* \int_a^b g^* g \, dx \ge 0. \tag{9.78}$$
The $\ge 0$ appears since $\psi^* \psi$ is nonnegative, the equal (=) sign holding only
if $\psi(x)$ is identically zero. Noting that $\lambda$ and $\lambda^*$ are linearly independent, we
4 With discriminant $b^2 - 4ac$ negative (or zero).
5 An alternate derivation is provided by the inequality
$$\iint [f(x) g(y) - f(y) g(x)]^* [f(x) g(y) - f(y) g(x)] \, dx \, dy \ge 0.$$
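differentiate with respect to one of them and set the derivative equal to zero
to minimize $\int_a^b \psi^* \psi \, dx$:
$$\frac{\partial}{\partial \lambda^*} \int_a^b \psi^* \psi \, dx = \int_a^b g^* f \, dx + \lambda \int_a^b g^* g \, dx = 0.$$
This yields
$$\lambda = -\frac{\int_a^b g^* f \, dx}{\int_a^b g^* g \, dx}. \tag{9.79a}$$
Taking the complex conjugate, we obtain
$$\lambda^* = -\frac{\int_a^b f^* g \, dx}{\int_a^b g^* g \, dx}. \tag{9.79b}$$
Substituting these values of $\lambda$ and $\lambda^*$ back into Eq. 9.78, we obtain Eq. 9.77,
the Schwarz inequality.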
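A quick numerical check of Eq. 9.77 for complex functions may help fix the roles of f*, g, and the equality condition. This is only a sketch: the midpoint quadrature, the function names, and the sample functions suggested below are assumptions made for the illustration.

```python
import cmath

def inner(f, g, a, b, steps=2000):
    """Integral of conj(f(x)) * g(x) over [a, b], midpoint rule."""
    h = (b - a) / steps
    total = 0j
    for k in range(steps):
        x = a + (k + 0.5) * h
        total += f(x).conjugate() * g(x)
    return total * h

def schwarz_gap(f, g, a, b):
    """Right side minus left side of Eq. 9.77; it should be nonnegative."""
    rhs = (inner(f, f, a, b) * inner(g, g, a, b)).real
    lhs = abs(inner(f, g, a, b)) ** 2
    return rhs - lhs
```

With f(x) = exp(ix) and g(x) = x on [0, 1] the gap is clearly positive, while for g(x) = (2 − 3i) f(x), a constant multiple of f, the gap vanishes to rounding error, matching the stated condition for equality.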
In quantum mechanics f(x) and g(x) might each represent a state or
configuration of a physical system. Then the Schwarz inequality guarantees that
the inner product $\int_a^b f^*(x) g(x) \, dx$ exists. In some texts the Schwarz inequality
is a key step in the derivation of the Heisenberg uncertainty principle.
The function notation of Eqs. 9.77 and 9.78 is relatively cumbersome. In
advanced mathematical physics and especially in quantum mechanics it is
common to use a different notation:
$$\langle f | g \rangle = \int_a^b f^*(x) g(x) \, dx.$$
Using this new notation, we simply understand the range of integration, (a, b),
and any weighting function. In this notation the Schwarz inequality becomes
$$|\langle f | g \rangle|^2 \le \langle f | f \rangle \langle g | g \rangle. \tag{9.77a}$$
If g(x) is a normalized eigenfunction, $\varphi_i(x)$, Eq. 9.77 yields [here w(x) = 1]
$$a_i^* a_i \le \int_a^b f^*(x) f(x) \, dx, \tag{9.80}$$
a result that also follows from Eq. 9.72.
Dirac Delta Function
Let us assume that we have a complete, orthonormal set of real functions,
$\varphi_n(x)$, and use them to represent the Dirac delta function. We assume an
expansion of the form
$$\delta(x - t) = \sum_{n=0}^{\infty} a_n(t) \varphi_n(x) \tag{9.81}$$
(Eq. 9.61), with the coefficients $a_n$ functions of the variable t. Multiplying by
$\varphi_m(x)$ and integrating over the orthogonality interval (Eq. 9.64), we have
$$a_m(t) = \int_a^b \delta(x - t) \varphi_m(x) \, dx = \varphi_m(t) \tag{9.82}$$
or
$$\delta(x - t) = \sum_{n=0}^{\infty} \varphi_n(x) \varphi_n(t) = \delta(t - x). \tag{9.83}$$
(For convenience we assume that $\varphi_n(x)$ has been redefined to include $[w(x)]^{1/2}$
if $w(x) \neq 1$.) This series in Eq. 9.83 is assuredly not uniformly convergent, but
it may be used as part of an integrand in which the ensuing integration will
make it convergent (compare Section 5.5).
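Suppose we form the integral
$$\int F(t) \delta(t - x) \, dt,$$
where it is assumed that F(t) can be expanded in a series of eigenfunctions,
$\varphi_p(t)$. We obtain
$$\int F(t) \delta(t - x) \, dt = \int \sum_{p=0}^{\infty} a_p \varphi_p(t) \sum_{n=0}^{\infty} \varphi_n(x) \varphi_n(t) \, dt = \sum_{p=0}^{\infty} a_p \varphi_p(x) = F(x), \tag{9.84}$$
the cross products $\varphi_p \varphi_n$ ($n \neq p$) vanishing by orthogonality (Eq. 9.39). Referring
back to the definition of the Dirac delta function (Sections 1.15 and 8.7), we see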
that our series representation, Eq. 9.83, satisfies the defining property of the
Dirac delta function and therefore is a representation of it. This representation
of the Dirac delta function is called closure. The assumption of completeness
of a set of functions for expansion of $\delta(x - t)$ yields the closure relation. The
converse, that closure implies completeness, is the topic of Exercise 9.4.10.
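The sifting property of the truncated closure sum can be tested numerically. The sketch below assumes, purely for illustration, the orthonormal sine functions $\sqrt{2/\pi} \sin nx$ on $[0, \pi]$ (so the index starts at n = 1) and a test function that vanishes at both ends of the interval; the function names and the truncation are hypothetical choices.

```python
import math

def phi(n, x):
    """Orthonormal sine functions on [0, pi] (an assumed example basis)."""
    return math.sqrt(2.0 / math.pi) * math.sin(n * x)

def delta_partial(x, t, terms):
    """Truncated closure sum of Eq. 9.83, n = 1 ... terms."""
    return sum(phi(n, x) * phi(n, t) for n in range(1, terms + 1))

def sift(F, x, terms=40, steps=2000):
    """Integral of F(t) * delta_partial(x, t) dt over [0, pi], midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += F(t) * delta_partial(x, t, terms)
    return total * h
```

For $F(t) = t(\pi - t)$ the integral reproduces F(x) closely even with a modest number of terms, although the truncated sum itself grows without bound at t = x as terms are added. It is the integration that makes the representation meaningful.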
Green's Function
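A series somewhat similar to that representing $\delta(x - t)$ results when we
expand the Green's function in the eigenfunctions of the corresponding
homogeneous equation. In the inhomogeneous Helmholtz equation we have
$$\nabla^2 \psi(\mathbf{r}) + k^2 \psi(\mathbf{r}) = -\rho(\mathbf{r}). \tag{9.85}$$
The homogeneous Helmholtz equation is satisfied by its eigenfunctions $\varphi_n$,
$$\nabla^2 \varphi_n(\mathbf{r}) + k_n^2 \varphi_n(\mathbf{r}) = 0. \tag{9.86}$$
As outlined in Section 8.7, the Green's function $G(\mathbf{r}_1, \mathbf{r}_2)$ satisfies the point
source equation
$$\nabla^2 G(\mathbf{r}_1, \mathbf{r}_2) + k^2 G(\mathbf{r}_1, \mathbf{r}_2) = -\delta(\mathbf{r}_1 - \mathbf{r}_2). \tag{9.87}$$
We expand the Green's function in a series of eigenfunctions of the homogeneous
equation (9.86), that is,
$$G(\mathbf{r}_1, \mathbf{r}_2) = \sum_{n=0}^{\infty} a_n(\mathbf{r}_2) \varphi_n(\mathbf{r}_1), \tag{9.88}$$
and by substituting into Eq. 9.87 obtain
$$-\sum_{n=0}^{\infty} a_n(\mathbf{r}_2) k_n^2 \varphi_n(\mathbf{r}_1) + k^2 \sum_{n=0}^{\infty} a_n(\mathbf{r}_2) \varphi_n(\mathbf{r}_1) = -\sum_{n=0}^{\infty} \varphi_n(\mathbf{r}_1) \varphi_n(\mathbf{r}_2). \tag{9.89}$$
Here $\delta(\mathbf{r}_1 - \mathbf{r}_2)$ has been replaced by its eigenfunction expansion, Eq. 9.83.
When we employ the orthogonality of $\varphi_n(\mathbf{r}_1)$ to isolate $a_n$ and then substitute
into Eq. 9.88, the Green's function becomes
$$G(\mathbf{r}_1, \mathbf{r}_2) = \sum_{n=0}^{\infty} \frac{\varphi_n(\mathbf{r}_1) \varphi_n(\mathbf{r}_2)}{k_n^2 - k^2}, \tag{9.90}$$
a bilinear expansion, symmetric with respect to $\mathbf{r}_1$ and $\mathbf{r}_2$ as expected. Finally,
$\psi(\mathbf{r}_1)$, the desired solution of the inhomogeneous equation, is given by
$$\psi(\mathbf{r}_1) = \int G(\mathbf{r}_1, \mathbf{r}_2) \rho(\mathbf{r}_2) \, d\tau_2. \tag{9.91}$$
If we generalize our inhomogeneous differential equation to
$$\mathcal{L} \psi + \lambda \psi = -\rho, \tag{9.92}$$
where $\mathcal{L}$ is an Hermitian operator, we find that
$$G(\mathbf{r}_1, \mathbf{r}_2) = \sum_{n=0}^{\infty} \frac{\varphi_n(\mathbf{r}_1) \varphi_n(\mathbf{r}_2)}{\lambda_n - \lambda}, \tag{9.93}$$
where $\lambda_n$ is the nth eigenvalue and $\varphi_n$, the corresponding orthonormal
eigenfunction of the homogeneous differential equation
$$\mathcal{L} \psi + \lambda \psi = 0. \tag{9.94}$$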
investigate it in more detail and relate it to integral equations.
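A one-dimensional analog makes the bilinear expansion concrete. The sketch below is an illustration under stated assumptions, not part of the text: the equation is $\psi'' + k^2\psi = -\rho$ on $[0, \pi]$ with $\psi(0) = \psi(\pi) = 0$, the eigenfunctions are $\sqrt{2/\pi}\,\sin nx$ with $k_n = n$, the source is $\rho = 1$, and $k = 1/2$. For that case the closed form $\psi = 4\cos(x/2) + 4\sin(x/2) - 4$ can be verified by direct substitution. All function names are hypothetical.

```python
import math

def phi(n, x):
    """Orthonormal eigenfunction of psi'' + k_n^2 psi = 0 on [0, pi], k_n = n."""
    return math.sqrt(2.0 / math.pi) * math.sin(n * x)

def source_coefficient(rho, n, n_quad=2000):
    """Midpoint-rule estimate of the integral of rho(t) * phi_n(t) over [0, pi]."""
    h = math.pi / n_quad
    return h * sum(rho((j + 0.5) * h) * phi(n, (j + 0.5) * h) for j in range(n_quad))

def psi_series(x, rho, k, n_terms=60):
    """Truncated eigenfunction sum: phi_n(x) * rho_n / (k_n^2 - k^2)."""
    return sum(phi(n, x) * source_coefficient(rho, n) / (n * n - k * k)
               for n in range(1, n_terms + 1))

def psi_exact_half(x):
    """Closed form for rho = 1, k = 1/2, with psi(0) = psi(pi) = 0."""
    return 4.0 * math.cos(x / 2.0) + 4.0 * math.sin(x / 2.0) - 4.0

if __name__ == "__main__":
    x = math.pi / 2.0
    print(psi_series(x, lambda t: 1.0, 0.5), psi_exact_half(x))
```

Interchanging the sum and the integral, as done here, is the same step that turns the series for $G$ into a series for $\psi$.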
Summary—Linear Vector Spaces—
Completeness
Here we summarize some properties of linear vector spaces, first with the
vectors taken to be the familiar real vectors of Chapter 1 and then with the
vectors taken to be ordinary functions (polynomials). The concept of
completeness is developed for finite vector spaces and carried over into infinite
vector spaces.
1v. We shall describe our linear vector space with a set of n linearly
independent vectors $\mathbf{e}_i$, $i = 1, 2, \ldots, n$. If $n = 3$, $\mathbf{e}_1 = \mathbf{i}$, $\mathbf{e}_2 = \mathbf{j}$, and $\mathbf{e}_3 = \mathbf{k}$. The n $\mathbf{e}_i$
span the linear vector space.
1f. We shall describe our linear vector (function) space with a set of n
linearly independent functions, $\varphi_i(x)$, $i = 0, 1, \ldots, n - 1$. The index i starts
with 0 to agree with the labeling of the classical polynomials. Here $\varphi_i(x)$ is
assumed to be a polynomial of degree i. The n $\varphi_i(x)$ span the linear vector
(function) space.
2v. The vectors in our linear vector space satisfy the following relations
(Section 1.2; the vector components are numbers):
a. Vector addition is commutative      u + v = v + u
b. Vector addition is associative      [u + v] + w = u + [v + w]
c. There is a null vector              0 + v = v
d. Multiplication by a scalar
     Distributive                      a[u + v] = au + av
     Distributive                      (a + b)u = au + bu
     Associative                       a[bu] = (ab)u
e. Multiplication
     By unit scalar                    1u = u
     By zero                           0u = 0
f. Negative vector                     (-1)u = -u
2f. The functions in our linear function space satisfy the properties listed
for vectors (substitute "function" for "vector"):
     f(x) + g(x) = g(x) + f(x)
     [f(x) + g(x)] + h(x) = f(x) + [g(x) + h(x)]
     0 + f(x) = f(x)
     a[f(x) + g(x)] = af(x) + ag(x)
     (a + b)f(x) = af(x) + bf(x)
     a[bf(x)] = (ab)f(x)
     1 · f(x) = f(x)
     0 · f(x) = 0
     (-1) · f(x) = -f(x)
3v. In n-dimensional vector space an arbitrary vector $\mathbf{c}$ is described by its
n components $(c_1, c_2, \ldots, c_n)$ or
$$\mathbf{c} = \sum_{i=1}^{n} c_i\,\mathbf{e}_i.$$
When (1) the n $\mathbf{e}_i$ are linearly independent and (2) span the n-dimensional vector
space, then the $\mathbf{e}_i$ form a basis and constitute a complete set.
3f. In n-dimensional function space a polynomial of degree $m \le n - 1$ is
described by
$$f(x) = \sum_{i=0}^{n-1} c_i\,\varphi_i(x).$$
When (1) the n $\varphi_i(x)$ are linearly independent and (2) span the n-dimensional
function space, then the $\varphi_i(x)$ form a basis and constitute a complete set (for
describing polynomials of degree $m \le n - 1$).
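4v. An inner product (scalar, dot product) is defined by
$$\mathbf{c}\cdot\mathbf{d} = \sum_{i=1}^{n} c_i d_i.$$
(If $\mathbf{c}$ and $\mathbf{d}$ have complex components, the inner product is defined as $\sum_{i=1}^{n} c_i^* d_i$.)
The inner product has the properties of
a. Distributive law of addition        $\mathbf{c}\cdot(\mathbf{d} + \mathbf{e}) = \mathbf{c}\cdot\mathbf{d} + \mathbf{c}\cdot\mathbf{e}$
b. Scalar multiplication               $\mathbf{c}\cdot a\mathbf{d} = a\,\mathbf{c}\cdot\mathbf{d}$
c. Complex conjugation                 $\mathbf{c}\cdot\mathbf{d} = (\mathbf{d}\cdot\mathbf{c})^*$
4f. An inner product is defined by
$$\langle f|g\rangle = \int_a^b f^*(x)\,g(x)\,w(x)\,dx.$$
The choice of the weighting function $w(x)$ and the interval $(a, b)$ follows from
the differential equation satisfied by $\varphi_i(x)$ and the boundary conditions (Section
9.1). In matrix terminology, Section 4.2, $|g\rangle$ is a column vector and $\langle f|$ is a row
vector, the adjoint of $|f\rangle$.
The inner product has the properties listed for vectors:
a. $\langle f|g + h\rangle = \langle f|g\rangle + \langle f|h\rangle$
b. $\langle f|ag\rangle = a\langle f|g\rangle$
c. $\langle f|g\rangle = \langle g|f\rangle^*$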
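5v. Orthogonality.
$$\mathbf{e}_i\cdot\mathbf{e}_j = 0, \qquad i \neq j.$$
If the n $\mathbf{e}_i$ are not already orthogonal, the Gram-Schmidt process may be used
to create an orthogonal set.
5f. Orthogonality.
$$\langle\varphi_i|\varphi_j\rangle = \int_a^b \varphi_i^*(x)\,\varphi_j(x)\,w(x)\,dx = 0, \qquad i \neq j.$$
If the n $\varphi_i(x)$ are not already orthogonal, the Gram-Schmidt process (Section
9.3) may be used to create an orthogonal set.
6v. Definition of norm.
$$|\mathbf{c}| = (\mathbf{c}\cdot\mathbf{c})^{1/2} = \left[\sum_{i=1}^{n} c_i^2\right]^{1/2}.$$
The basis vectors $\mathbf{e}_i$ are taken to have unit norm (length), $\mathbf{e}_i\cdot\mathbf{e}_i = 1$. The
components of $\mathbf{c}$ are given by
$$c_i = \mathbf{e}_i\cdot\mathbf{c}, \qquad i = 1, 2, \ldots, n.$$
6f. Definition of norm.
$$\|f\| = \langle f|f\rangle^{1/2} = \left[\int_a^b |f(x)|^2 w(x)\,dx\right]^{1/2} = \left[\sum_{i=0}^{n-1} |c_i|^2\right]^{1/2},$$
Parseval's identity. $\|f\| > 0$ unless $f(x)$ is identically zero. The basis functions
$\varphi_i(x)$ may be taken to have unit norm (unit normalization),
$$\|\varphi_i\| = 1.$$
The expansion coefficients of our polynomial $f(x)$ are given by
$$c_i = \langle\varphi_i|f\rangle, \qquad i = 0, 1, \ldots, n - 1.$$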
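7v. Bessel's inequality.
$$\mathbf{c}\cdot\mathbf{c} \ge \sum_i c_i^2.$$
If the = sign holds for all $\mathbf{c}$, it indicates that the $\mathbf{e}_i$ span the vector space; that is,
they are complete.
7f. Bessel's inequality.
$$\langle f|f\rangle = \int_a^b |f(x)|^2 w(x)\,dx \ge \sum_i |c_i|^2.$$
If the equal sign holds for all allowable f's, it indicates that the $\varphi_i(x)$ span the
function space, that is, they are complete.
8v. Schwarz inequality.
$$|\mathbf{c}\cdot\mathbf{d}| \le |\mathbf{c}|\cdot|\mathbf{d}|.$$
The equal sign holds when $\mathbf{c}$ is a multiple of $\mathbf{d}$. If the angle included between $\mathbf{c}$
and $\mathbf{d}$ is $\theta$, then $|\cos\theta| \le 1$.
8f. Schwarz inequality.
$$|\langle f|g\rangle| \le \langle f|f\rangle^{1/2}\,\langle g|g\rangle^{1/2} = \|f\|\cdot\|g\|.$$
The equals sign holds when $f(x)$ and $g(x)$ are linearly dependent, that is, when
$f(x)$ is a multiple of $g(x)$.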
Now, let $n \to \infty$, forming an infinite-dimensional linear vector space, $l^2$.
9v. In an infinite-dimensional space our vector $\mathbf{c}$ is
$$\mathbf{c} = \sum_{i=1}^{\infty} c_i\,\mathbf{e}_i.$$
We require that
$$\sum_{i=1}^{\infty} c_i^2 < \infty.$$
The components of $\mathbf{c}$ are given by
$$c_i = \mathbf{e}_i\cdot\mathbf{c}, \qquad i = 1, 2, \ldots, \infty,$$
exactly as in a finite-dimensional space.
Then let $n \to \infty$, forming an infinite-dimensional linear vector (function)
space, $L^2$. The L stands for Lebesgue, the superscript 2 for the 2 in $|f(x)|^2$.
Our functions need no longer be polynomials but we do require that $f(x)$ be
at least piecewise continuous (Dirichlet conditions for Fourier series) and that
$\langle f|f\rangle = \int_a^b |f(x)|^2 w(x)\,dx$ exist. This latter condition is often stated as a
requirement that $f(x)$ be square integrable.
9f. Cauchy sequence.
Let
$$f_n(x) = \sum_{i=0}^{n} c_i\,\varphi_i(x).$$
If
$$\|f(x) - f_n(x)\| \to 0 \quad \text{as } n \to \infty$$
or
$$\lim_{n\to\infty} \int \left|f(x) - \sum_{i=0}^{n} c_i\,\varphi_i(x)\right|^2 w(x)\,dx = 0,$$
then we have convergence in the mean. This is analogous to the partial sum,
Cauchy sequence criterion for the convergence of an infinite series, Section 5.1.
If every Cauchy sequence of allowable vectors (square integrable, piecewise
continuous functions) converges to a limit vector in our linear space, the space
is said to be complete. Then
$$f(x) = \sum_{i=0}^{\infty} c_i\,\varphi_i(x) \qquad \text{(almost everywhere)}$$
in the sense of convergence in the mean. As noted before, this is a weaker
requirement than point-wise convergence (fixed value of x) or uniform convergence.
Expansion (Fourier) Coefficients
$$c_i = \langle\varphi_i|f\rangle, \qquad i = 0, 1, \ldots, \infty,$$
exactly as in a finite-dimensional space. Then
$$f(x) = \sum_i \langle\varphi_i|f\rangle\,\varphi_i(x).$$
A linear space (finite- or infinite-dimensional) that (1) has an inner product
defined ($\langle f|g\rangle$) and (2) is complete is a Hilbert space.
Infinite-dimensional Hilbert space provides a natural mathematical
framework for modern quantum mechanics. Away from quantum mechanics, Hilbert
space retains its abstract mathematical power and beauty but the necessity for
its use is reduced.
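The Gram-Schmidt step of items 5v and 5f can be sketched in exact arithmetic. The example below is an illustration under assumptions that are not in the text: the interval is $[-1, 1]$, the weight is $w(x) = 1$, polynomials are stored as coefficient lists (lowest power first), and the names `inner` and `gram_schmidt` are hypothetical. With these choices the powers $1, x, x^2$ should turn into unnormalized multiples of the first three Legendre polynomials.

```python
from fractions import Fraction

def inner(p, q):
    """<p|q> on [-1, 1] with w(x) = 1, for polynomials given as coefficient lists."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:  # odd powers integrate to zero on [-1, 1]
                total += Fraction(a) * Fraction(b) * Fraction(2, i + j + 1)
    return total

def gram_schmidt(basis):
    """Orthogonalize (without normalizing) a list of polynomials of increasing degree."""
    ortho = []
    for p in basis:
        q = [Fraction(c) for c in p]
        for u in ortho:
            coeff = inner(p, u) / inner(u, u)
            u_padded = u + [Fraction(0)] * (len(q) - len(u))
            q = [qc - coeff * uc for qc, uc in zip(q, u_padded)]
        ortho.append(q)
    return ortho

powers = [[1], [0, 1], [0, 0, 1]]  # 1, x, x^2
ortho = gram_schmidt(powers)

if __name__ == "__main__":
    for poly in ortho:
        print([str(c) for c in poly])
```

Dividing each result by its norm, $\langle u|u\rangle^{1/2}$, would give the unit normalization assumed in item 6f.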
EXERCISES
9.4.1 A function $f(x)$ is expanded in a series of orthonormal eigenfunctions
$$f(x) = \sum_{n=0}^{\infty} a_n\,\varphi_n(x).$$
Show that the series expansion is unique for a given set of $\varphi_n(x)$. The functions
$\varphi_n(x)$ are being taken here as the basis vectors in an infinite dimensional Hilbert
space.
9.4.2 A function $f(x)$ is represented by a finite set of basis functions $\varphi_i(x)$,
$$f(x) = \sum_{i=1}^{N} c_i\,\varphi_i(x).$$
Show that the components $c_i$ are unique, that no different set $c_i'$ exists.
Note. Your basis functions are automatically linearly independent. They are
not necessarily orthogonal.
9.4.3 A function $f(x)$ is approximated by a power series $\sum_{i=0}^{n-1} c_i x^i$ over the interval
[0, 1]. Show that minimizing the mean square error leads to a set of linear
equations
$$\mathsf{A}\mathbf{c} = \mathbf{b},$$
where
$$A_{ij} = \int_0^1 x^{i+j}\,dx = \frac{1}{i + j + 1}, \qquad i, j = 0, 1, 2, \ldots, n - 1$$
and
$$b_i = \int_0^1 x^i f(x)\,dx, \qquad i = 0, 1, 2, \ldots, n - 1.$$
Note. The $A_{ij}$ are the elements of the Hilbert matrix of order n. The determinant
of this Hilbert matrix is a rapidly decreasing function of n. For n = 5, det A =
$3.7 \times 10^{-12}$ and the set of equations Ac = b is becoming ill-conditioned and
unstable.
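The determinant quoted in the note to Exercise 9.4.3 can be reproduced with exact rational arithmetic. This is a minimal sketch, assuming plain Gaussian elimination on `fractions.Fraction` entries; the helper names `hilbert` and `determinant` are hypothetical.

```python
from fractions import Fraction

def hilbert(n):
    """Hilbert matrix of order n: A_ij = 1 / (i + j + 1), i, j = 0, ..., n - 1."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]

def determinant(matrix):
    """Determinant by Gaussian elimination, exact for Fraction entries."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

if __name__ == "__main__":
    for n in range(1, 7):
        print(n, float(determinant(hilbert(n))))
```

Exact fractions sidestep the ill-conditioning that the note warns about; the same elimination in floating point loses accuracy quickly as n grows.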
9.4.4 In place of the expansion of a function $F(x)$ given by
$$F(x) = \sum_{n=0}^{\infty} a_n\,\varphi_n(x),$$
with
$$a_n = \int_a^b F(x)\,\varphi_n(x)\,w(x)\,dx,$$
take the finite series approximation
$$F(x) \approx \sum_{n=0}^{m} c_n\,\varphi_n(x).$$
Show that the mean square error
$$\int_a^b \left[F(x) - \sum_{n=0}^{m} c_n\,\varphi_n(x)\right]^2 w(x)\,dx$$
is minimized by taking $c_n = a_n$.
Note. The values of the coefficients are independent of the number of terms in
the finite series. This independence is a consequence of orthogonality and would
not hold for a least-squares fit using powers of x.
9.4.5 From Example 9.2.2
$$f(x) = \begin{cases} h/2, & 0 < x < \pi \\ -h/2, & -\pi < x < 0 \end{cases} \;=\; \frac{2h}{\pi}\sum_{n=0}^{\infty} \frac{\sin(2n + 1)x}{2n + 1}.$$
(a) Show that
$$\int_{-\pi}^{\pi} [f(x)]^2\,dx = \frac{\pi}{2}h^2 = \frac{4h^2}{\pi}\sum_{n=0}^{\infty} (2n + 1)^{-2}.$$
For a finite upper limit this would be Bessel's inequality. For the upper limit,
$\infty$, as shown, this is Parseval's identity.
(b) Verify that
$$\frac{\pi}{2}h^2 = \frac{4h^2}{\pi}\sum_{n=0}^{\infty} (2n + 1)^{-2}$$
by evaluating the series.
Hint. The series can be expressed as a Riemann zeta function.
9.4.6 Differentiate Eq. 9.78:
$$\langle\psi|\psi\rangle = \langle f|f\rangle + \lambda\langle f|g\rangle + \lambda^*\langle g|f\rangle + \lambda\lambda^*\langle g|g\rangle$$
with respect to $\lambda^*$ and show that you get the Schwarz inequality, Eq. 9.77.
9.4.7 Derive the Schwarz inequality from the identity
$$\left[\int_a^b f(x)g(x)\,dx\right]^2 = \int_a^b [f(x)]^2\,dx \int_a^b [g(x)]^2\,dx - \frac{1}{2}\int_a^b\!\!\int_a^b [f(x)g(y) - f(y)g(x)]^2\,dx\,dy.$$
9.4.8 If the functions $f(x)$ and $g(x)$ of the Schwarz inequality, Eq. 9.77, may be expanded
in a series of eigenfunctions $\varphi_i(x)$, show that Eq. 9.77 reduces to Eq. 9.75 (with
n possibly infinite).
Note the description of $f(x)$ as a vector in a function space in which $\varphi_i(x)$
corresponds to the unit vector $\mathbf{e}_i$.
9.4.9 The operator H is Hermitian and positive definite, that is,
$$\int_a^b f^* H f\,dx > 0.$$
Prove the generalized Schwarz inequality:
$$\left|\int_a^b f^* H g\,dx\right|^2 \le \int_a^b f^* H f\,dx \int_a^b g^* H g\,dx.$$
9.4.10 (a) The Dirac delta function representation given by Eq. 9.83
$$\delta(x - t) = \sum_{n=0}^{\infty} \varphi_n(x)\,\varphi_n(t)$$
is often called the closure relation. For an orthonormal set of functions,
$\varphi_n$, show that closure implies completeness, that is, Eq. 9.61 follows from
Eq. 9.83.
Hint. One can take
$$F(x) = \int F(t)\,\delta(x - t)\,dt.$$
(b) Following the hint of part (a) you encounter the integral $\int F(t)\varphi_n(t)\,dt$. How
do you know that this integral is finite?
9.4.11 For the finite interval $(-\pi, \pi)$ expand the Dirac delta function $\delta(x - t)$ in a series
of sines and cosines: $\sin nx$, $\cos nx$, $n = 0, 1, 2, \ldots$. Note that although these
functions are orthogonal, they are not normalized to unity.
9.4.12 Substitute Eq. 9.90, the eigenfunction expansion of Green's function, into Eq.
9.91 and then show that Eq. 9.91 is indeed a solution of the nonhomogeneous
Helmholtz equation (9.85).
9.4.13 (a) Starting with a one-dimensional nonhomogeneous differential equation,
(Eq. 9.92), assume that $\psi(x)$ and $\rho(x)$ may be represented by eigenfunction
expansions. Without any use of the Dirac delta function or its
representations, show that
$$\psi(x) = \sum_{n=0}^{\infty} \frac{\int_a^b \rho(t)\,\varphi_n(t)\,dt}{\lambda_n - \lambda}\,\varphi_n(x).$$
Note that (1) if $\rho = 0$, no solution exists unless $\lambda = \lambda_n$ and (2) if $\lambda = \lambda_n$,
no solution exists unless $\rho$ is orthogonal to $\varphi_n$. This same behavior will
reappear with integral equations in Section 16.4.
(b) Interchanging summation and integration, show that you have constructed
the Green's function corresponding to Eq. 9.93.
9.4.14 The eigenfunctions of the Schrodinger equation are often complex. In this case
the orthogonality integral, Eq. 9.39, is replaced by
$$\int_a^b \varphi_i^*(x)\,\varphi_j(x)\,w(x)\,dx = \delta_{ij}.$$
Instead of Eq. 9.83, we have
$$\delta(\mathbf{r}_1 - \mathbf{r}_2) = \sum_{n=0}^{\infty} \varphi_n(\mathbf{r}_1)\,\varphi_n^*(\mathbf{r}_2).$$
Show that the Green's function, Eq. 9.90, becomes
$$G(\mathbf{r}_1, \mathbf{r}_2) = \sum_{n=0}^{\infty} \frac{\varphi_n(\mathbf{r}_1)\,\varphi_n^*(\mathbf{r}_2)}{k_n^2 - k^2} = G^*(\mathbf{r}_2, \mathbf{r}_1).$$
9.4.15 A normalized wave function $\psi(x) = \sum_{n=0}^{\infty} a_n\varphi_n(x)$. The expansion coefficients
$a_n$ are known as probability amplitudes. We may define a density matrix $\rho$ with
elements $\rho_{ij} = a_i a_j^*$. Show that
$$(\rho^2)_{ij} = \rho_{ij}$$
or
$$\rho^2 = \rho.$$
This result, by definition, makes $\rho$ a projection operator.
Hint.
$$\int \psi^*\psi\,dx = 1.$$
9.4.16 Show that
(a) the operator
$$|\varphi_i(x)\rangle\langle\varphi_i(t)|$$
operating on
$$f(t) = \sum_j c_j\,|\varphi_j(t)\rangle$$
yields
$$c_i\,|\varphi_i(x)\rangle.$$
(b) $$\sum_i |\varphi_i(x)\rangle\langle\varphi_i(x)| = 1.$$
This operator is a projection operator projecting $f(x)$ onto the ith
coordinate, selectively picking out the ith component $c_i|\varphi_i(x)\rangle$ of $f(x)$.
Hint. The operator operates via the defined inner product.
REFERENCES
Byron, F. W., Jr., and R. W. Fuller, Mathematics of Classical and Quantum Physics.
Reading, Mass.: Addison-Wesley (1969).
Miller, K. S., Linear Differential Equations in the Real Domain. New York: Norton (1963).
Titchmarsh, E. C., Eigenfunction Expansions Associated with Second Order Differential
Equations. London: Oxford University Press, Vol. I, 2nd ed. (1962), Vol. II (1958).
10 THE GAMMA FUNCTION (FACTORIAL FUNCTION)
The gamma function appears occasionally in physical problems such as the
normalization of Coulomb wave functions and the computation of probabilities
in statistical mechanics. In general, however, it has less direct physical
application and interpretation than, say, the Legendre and Bessel functions of Chapters
11 and 12. Rather, its importance stems from its usefulness in developing other
functions that have direct physical application. The gamma function, therefore,
is included here. A discussion of the numerical evaluation of the gamma function
appears in Section 10.3.
10.1 DEFINITIONS, SIMPLE PROPERTIES
At least three different, convenient definitions of the gamma function are in
common use. Our first task is to state these definitions, to develop some simple,
direct consequences, and to show the equivalence of the three forms.
Infinite Limit (Euler)
The first definition, named after Euler, is
$$\Gamma(z) \equiv \lim_{n\to\infty} \frac{1\cdot 2\cdot 3\cdots n}{z(z + 1)(z + 2)\cdots(z + n)}\,n^z, \qquad z \neq 0, -1, -2, -3, \ldots. \tag{10.1}$$
This definition of $\Gamma(z)$ is useful in developing the Weierstrass infinite-product
form of $\Gamma(z)$, Eq. 10.16, and in obtaining the derivative of $\ln\Gamma(z)$ (Section
10.2). Here and elsewhere in this chapter z may be either real or complex.
Replacing z with z + 1, we have
$$\Gamma(z + 1) = \lim_{n\to\infty} \frac{1\cdot 2\cdot 3\cdots n}{(z + 1)(z + 2)(z + 3)\cdots(z + n + 1)}\,n^{z+1}
= \lim_{n\to\infty} \frac{nz}{z + n + 1}\cdot\frac{1\cdot 2\cdot 3\cdots n}{z(z + 1)(z + 2)\cdots(z + n)}\,n^z
= z\,\Gamma(z). \tag{10.2}$$
This is the basic functional relation for the gamma function. It should be noted
that it is a difference equation. It has been shown that the gamma function is
one of a general class of functions that do not satisfy any differential equation
with rational coefficients. Specifically, the gamma function is one of the very
few functions of mathematical physics that does not satisfy either the
hypergeometric differential equation (Section 13.5) or the confluent hypergeometric
equation (Section 13.6).
Also, from the definition
$$\Gamma(1) = \lim_{n\to\infty} \frac{1\cdot 2\cdot 3\cdots n}{1\cdot 2\cdot 3\cdots n\,(n + 1)}\,n = 1. \tag{10.3}$$
Now, application of Eq. 10.2 gives
$$\Gamma(2) = 1, \qquad \Gamma(3) = 2\,\Gamma(2) = 2, \qquad \ldots, \qquad \Gamma(n) = 1\cdot 2\cdot 3\cdots(n - 1) = (n - 1)!. \tag{10.4}$$
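The Euler limit can be explored numerically for real positive z. The sketch below is an illustration only: it evaluates the finite-n expression of Eq. 10.1 through logarithms (an implementation choice made here to avoid overflow), and the function name `gamma_euler_limit` is hypothetical. Convergence is slow, with a relative error that shrinks roughly like 1/n.

```python
import math

def gamma_euler_limit(z, n):
    """Finite-n value of Eq. 10.1 for real z > 0, evaluated through logarithms."""
    log_value = z * math.log(n) - math.log(z)
    for k in range(1, n + 1):
        log_value += math.log(k) - math.log(z + k)
    return math.exp(log_value)

if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, gamma_euler_limit(0.5, n), gamma_euler_limit(5.0, n))
    print("library values:", math.gamma(0.5), math.gamma(5.0))
```

At fixed n the ratio of the values at z + 1 and z equals nz/(z + n + 1), the factor that appears in Eq. 10.2, so the functional relation is recovered only in the limit.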
Definite Integral (Euler)
A second definition, also frequently called Euler's form, is
$$\Gamma(z) \equiv \int_0^{\infty} e^{-t}\,t^{z-1}\,dt, \qquad \Re(z) > 0. \tag{10.5}$$
The restriction on z is necessary to avoid divergence of the integral. When the
gamma function does appear in physical problems, it is often in this form or
some variation such as
$$\Gamma(z) = 2\int_0^{\infty} e^{-t^2}\,t^{2z-1}\,dt, \qquad \Re(z) > 0, \tag{10.6}$$
$$\Gamma(z) = \int_0^1 \left[\ln\left(\frac{1}{t}\right)\right]^{z-1} dt, \qquad \Re(z) > 0. \tag{10.7}$$
When $z = \frac{1}{2}$, Eq. 10.6 is just the Gauss error function, and we have the interesting
result
$$\Gamma(\tfrac{1}{2}) = \sqrt{\pi}. \tag{10.8}$$
Generalizations of Eq. 10.6, the Gaussian integrals, are considered in Exercise
10.1.11. This definite integral form of $\Gamma(z)$, Eq. 10.5, leads to the beta function,
Section 10.4.
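The Gaussian variant, Eq. 10.6, is convenient for a numerical check because its integrand is smooth at the origin when z is real and at least 1/2. The following sketch makes several assumptions that are not in the text: a composite Simpson rule, truncation of the upper limit at t = 10 (where the integrand is below exp(-100)), and the hypothetical names `simpson` and `gamma_gaussian_form`.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def gamma_gaussian_form(z, t_max=10.0, n=4000):
    """Eq. 10.6 for real z >= 1/2, with the infinite upper limit cut at t_max."""
    def integrand(t):
        return math.exp(-t * t) * t ** (2.0 * z - 1.0)
    return 2.0 * simpson(integrand, 0.0, t_max, n)

if __name__ == "__main__":
    print(gamma_gaussian_form(0.5), math.sqrt(math.pi))
    print(gamma_gaussian_form(2.5), math.gamma(2.5))
```

For z below 1/2 the integrand is singular at t = 0 and this simple quadrature would not apply without a change of variable.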
To show the equivalence of these two definitions, Eqs. 10.1 and 10.5, consider
the function of two variables
$$F(z, n) = \int_0^n \left(1 - \frac{t}{n}\right)^n t^{z-1}\,dt, \qquad \Re(z) > 0, \tag{10.9}$$
with n a positive integer.¹ Since
$$\lim_{n\to\infty} \left(1 - \frac{t}{n}\right)^n \equiv e^{-t}, \tag{10.10}$$
from the definition of the exponential
$$\lim_{n\to\infty} F(z, n) = F(z, \infty) = \int_0^{\infty} e^{-t}\,t^{z-1}\,dt \equiv \Gamma(z) \tag{10.11}$$
by Eq. 10.5.
Returning to $F(z, n)$, we evaluate it in successive integrations by parts. For
convenience let $u = t/n$. Then
$$F(z, n) = n^z \int_0^1 (1 - u)^n u^{z-1}\,du. \tag{10.12}$$
Integrating by parts, we obtain
$$\frac{F(z, n)}{n^z} = (1 - u)^n \frac{u^z}{z}\bigg|_0^1 + \frac{n}{z}\int_0^1 (1 - u)^{n-1} u^z\,du. \tag{10.13}$$
Repeating this with the integrated part vanishing at both end points each time,
we finally get
$$F(z, n) = n^z\,\frac{n(n - 1)\cdots 1}{z(z + 1)\cdots(z + n - 1)} \int_0^1 u^{z+n-1}\,du
= \frac{1\cdot 2\cdot 3\cdots n}{z(z + 1)(z + 2)\cdots(z + n)}\,n^z. \tag{10.14}$$
This is identical with the expression on the right side of Eq. 10.1. Hence
$$\lim_{n\to\infty} F(z, n) = F(z, \infty) \equiv \Gamma(z) \tag{10.15}$$
by Eq. 10.1, completing the proof.
Infinite Product (Weierstrass)
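The third definition (Weierstrass's form) is
$$\frac{1}{\Gamma(z)} \equiv z\,e^{\gamma z} \prod_{n=1}^{\infty} \left(1 + \frac{z}{n}\right) e^{-z/n}, \tag{10.16}$$
where $\gamma$ is the usual Euler-Mascheroni constant,
$$\gamma = 0.577216\ldots. \tag{10.17}$$
This infinite-product form may be used to develop the reflection identity,
Eq. 10.23, and applied in the exercises, such as Exercise 10.1.19. This form can
be derived from the original definition (Eq. 10.1) by rewriting it as
$$\Gamma(z) = \lim_{n\to\infty} \frac{1\cdot 2\cdot 3\cdots n}{z(z + 1)\cdots(z + n)}\,n^z
= \lim_{n\to\infty} \frac{1}{z} \prod_{m=1}^{n} \left(1 + \frac{z}{m}\right)^{-1} n^z. \tag{10.18}$$
Inverting and using
$$n^{-z} = e^{(-\ln n)z}, \tag{10.19}$$
we obtain
$$\frac{1}{\Gamma(z)} = z \lim_{n\to\infty} e^{(-\ln n)z} \prod_{m=1}^{n} \left(1 + \frac{z}{m}\right). \tag{10.20}$$
Multiplying and dividing by
$$\exp\left[\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}\right)z\right] = \prod_{m=1}^{n} e^{z/m}, \tag{10.21}$$
we get
$$\frac{1}{\Gamma(z)} = z \left\{\lim_{n\to\infty} \exp\left[\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \ln n\right)z\right]\right\}
\left[\lim_{n\to\infty} \prod_{m=1}^{n} \left(1 + \frac{z}{m}\right) e^{-z/m}\right]. \tag{10.22}$$
As shown in Section 5.2, the infinite series in the exponent converges and defines
$\gamma$, the Euler-Mascheroni constant. Hence Eq. 10.16 follows.
¹ The form of F(z, n) is suggested by the beta function (compare Eq. 10.60).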
It was shown in Section 5.11 that the Weierstrass infinite-product definition
of $\Gamma(z)$ led directly to an important identity,
$$\Gamma(z)\,\Gamma(1 - z) = \frac{\pi}{\sin z\pi}. \tag{10.23}$$
This identity may also be derived by contour integration (Example 7.2.5 and
Exercises 7.2.18 and 7.2.19) and the beta function, Section 10.4. Setting $z = \frac{1}{2}$
in Eq. 10.23, we obtain
$$\Gamma(\tfrac{1}{2}) = \sqrt{\pi} \tag{10.24}$$
(taking the positive square root) in agreement with Eq. 10.8.
The Weierstrass definition shows immediately that $\Gamma(z)$ has simple poles at
$z = 0, -1, -2, -3, \ldots$, and that $[\Gamma(z)]^{-1}$ has no poles in the finite complex
plane, which means that $\Gamma(z)$ has no zeros. This behavior may also be seen in
Eq. 10.23, in which we note that $\pi/(\sin\pi z)$ is never equal to zero.
Actually the infinite-product definition of $\Gamma(z)$ may be derived from the
Weierstrass factorization theorem with the specification that $[\Gamma(z)]^{-1}$ have
simple zeros at $z = 0, -1, -2, -3, \ldots$. The Euler-Mascheroni constant is fixed
by requiring $\Gamma(1) = 1$.
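Both the Weierstrass product and the reflection identity lend themselves to a quick numerical check for real z. The sketch below is an illustration under assumptions made here: the product is truncated after a finite number of factors (the relative error falls off roughly like $z^2/2N$), the numerical value of $\gamma$ is hard-coded, and the name `inverse_gamma_weierstrass` is hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, hard-coded here

def inverse_gamma_weierstrass(z, n_terms=100000):
    """Partial product of Eq. 10.16 for 1/Gamma(z), real z > -1."""
    log_product = 0.0
    for n in range(1, n_terms + 1):
        log_product += math.log1p(z / n) - z / n
    return z * math.exp(EULER_GAMMA * z + log_product)

if __name__ == "__main__":
    z = 0.3
    print(inverse_gamma_weierstrass(z), 1.0 / math.gamma(z))
    # Reflection identity, Eq. 10.23, written for the reciprocals:
    print(inverse_gamma_weierstrass(z) * inverse_gamma_weierstrass(1.0 - z),
          math.sin(math.pi * z) / math.pi)
```

Working with $1/\Gamma(z)$ avoids the poles entirely, which is one practical reason the product form is stated for the reciprocal.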
In probability theory the gamma distribution (probability density) is given
by
$$f(x) = \begin{cases} \dfrac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\,x^{\alpha-1} e^{-x/\beta}, & x > 0 \\[2mm] 0, & x \le 0. \end{cases} \tag{10.24a}$$
The constant $[\beta^{\alpha}\Gamma(\alpha)]^{-1}$ is chosen so that the total (integrated) probability
will be unity. For $x \to E$, kinetic energy, $\alpha \to \frac{3}{2}$ and $\beta \to kT$, Eq. 10.24a yields
the classical Maxwell-Boltzmann statistics.
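The normalization of the gamma distribution can be confirmed by direct quadrature. The sketch below is illustrative only: the parameter values $\alpha = 3$, $\beta = 2$ are arbitrary (chosen so the integrand is smooth at the origin), the upper limit is cut at x = 100, and the names `gamma_pdf` and `simpson` are hypothetical. It also evaluates the mean, which should equal the product $\alpha\beta$.

```python
import math

def gamma_pdf(x, alpha, beta):
    """Gamma probability density as in Eq. 10.24a."""
    if x <= 0.0:
        return 0.0
    return x ** (alpha - 1.0) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))

def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

ALPHA, BETA = 3.0, 2.0  # arbitrary illustrative parameters

total_probability = simpson(lambda x: gamma_pdf(x, ALPHA, BETA), 0.0, 100.0, 10000)
mean_value = simpson(lambda x: x * gamma_pdf(x, ALPHA, BETA), 0.0, 100.0, 10000)

if __name__ == "__main__":
    print(total_probability, mean_value, ALPHA * BETA)
```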
Factorial Notation
So far this discussion has been presented in terms of the classical notation. As
pointed out by Jeffreys and others, the -1 of the z - 1 exponent in our second
definition (Eq. 10.5) is a continual nuisance. Accordingly, Eq. 10.5 is rewritten as
$$z! \equiv \int_0^{\infty} e^{-t}\,t^z\,dt, \qquad \Re(z) > -1, \tag{10.25}$$
to define a factorial function z!. Occasionally we may still encounter Gauss's
notation, $\Pi(z)$, for the factorial function
$$\Pi(z) = z!. \tag{10.26}$$
The $\Gamma$ notation is due to Legendre. The factorial function of Eq. 10.25 is, of
course, related to the gamma function by
$$\Gamma(z) = (z - 1)! \qquad \text{or} \qquad \Gamma(z + 1) = z!. \tag{10.27}$$
If z = n, a positive integer (Eq. 10.4) shows that
$$z! = n! = 1\cdot 2\cdot 3\cdots n, \tag{10.28}$$
the familiar factorial. However, it should be noted carefully that since z! is now
defined by Eq. 10.25 (or equivalently by Eq. 10.27) the factorial function is no
longer limited to positive integral values of the argument (Figure 10.1). The
difference relation (Eq. 10.2) becomes
$$(z - 1)! = \frac{z!}{z}. \tag{10.29}$$
This shows immediately that
$$0! = 1 \tag{10.30}$$
and
$$n! = \pm\infty \quad \text{for } n \text{, a negative integer.} \tag{10.31}$$
In terms of the factorial function Eq. 10.23 becomes
$$z!\,(-z)! = \frac{\pi z}{\sin\pi z}. \tag{10.32}$$
By restricting ourselves to the real values of the argument, we find that x!
defines the curve shown in Fig. 10.2. The minimum of the curve is
$$x! = (0.461\,63\ldots)! = 0.885\,60\ldots. \tag{10.33}$$
FIG. 10.1 The factorial function: extension to negative arguments
FIG. 10.2 The factorial function and the first two derivatives of ln(x!)
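The location of the minimum of x! quoted above can be reproduced with a bracketing search. The sketch below is an illustration, assuming a golden-section search on the interval [0, 1] and the library function `math.gamma` for $\Gamma(x + 1)$; the names `factorial_real` and `golden_section_min` are hypothetical.

```python
import math

def factorial_real(x):
    """x! = Gamma(x + 1) for real x > -1."""
    return math.gamma(x + 1.0)

def golden_section_min(f, a, b, tol=1e-9):
    """Golden-section search for the minimum of a unimodal f on [a, b]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d
            d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

x_min = golden_section_min(factorial_real, 0.0, 1.0)

if __name__ == "__main__":
    print(x_min, factorial_real(x_min))
```

Because the curve is flat near its minimum, the minimum value is determined far more precisely than its location: an error dx in x changes x! only by an amount of order dx squared.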
Double Factorial Notation
In many problems of mathematical physics, particularly in connection with
Legendre polynomials (Chapter 12), we encounter products of the odd positive
integers and products of the even positive integers. For convenience these are
given special labels: double factorials.
$$1\cdot 3\cdot 5\cdots(2n + 1) = (2n + 1)!!, \qquad 2\cdot 4\cdot 6\cdots(2n) = (2n)!!. \tag{10.33b}$$
Clearly, these are related to the regular factorial functions by
$$(2n)!! = 2^n\,n! \qquad \text{and} \qquad (2n + 1)!! = \frac{(2n + 1)!}{2^n\,n!}. \tag{10.33c}$$
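The double factorial relations are easy to confirm with exact integer arithmetic. The sketch below is only an illustration; the function name `double_factorial` is hypothetical, and returning 1 for the arguments 0 and -1 is the usual convention, adopted here as an assumption.

```python
import math

def double_factorial(n):
    """n!! for integer n >= -1, with 0!! = (-1)!! = 1 taken by convention."""
    if n < -1:
        raise ValueError("this sketch only handles n >= -1")
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

if __name__ == "__main__":
    n = 6
    print(double_factorial(2 * n), 2 ** n * math.factorial(n))
    print(double_factorial(2 * n + 1),
          math.factorial(2 * n + 1) // (2 ** n * math.factorial(n)))
    print(double_factorial(199) / double_factorial(200))
```

The last line evaluates the ratio 199!!/200!! that appears in Exercise 10.1.28, which the exercise quotes as 0.056348.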
FIG. 10.3 (Top) Factorial function contour, showing the cut line
FIG. 10.4 (Bottom) The contour of Fig. 10.3 deformed
Integral Representation
An integral representation that is useful in developing asymptotic series for
the Bessel functions is
$$\int_C e^{-z} z^{\nu}\,dz = (e^{2\pi i\nu} - 1)\,\nu!, \tag{10.34}$$
where C is the contour shown in Fig. 10.3. This contour integral representation
is particularly useful when $\nu$ is not an integer, z = 0 then being a branch point.
Equation 10.34 may be readily verified for $\nu > -1$ by deforming the contour as
shown in Fig. 10.4. The integral from $\infty$ into the origin yields $-(\nu!)$, placing the
phase of z at 0. The integral out to $\infty$ (in the fourth quadrant) then yields $e^{2\pi i\nu}\nu!$,
the phase of z having increased to $2\pi$. Since the circle around the origin
contributes nothing when $\nu > -1$, Eq. 10.34 follows.
It is often convenient to throw this result into a more symmetrical form
$$\int_C e^{-z} (-z)^{\nu}\,dz = 2i \sin\nu\pi\;\nu!. \tag{10.35}$$
This corresponds to choosing the phase of z to have a range of $-\pi$ to $+\pi$ in
Eq. 10.34.
This analysis establishes Eqs. 10.34 and 10.35 for $\nu > -1$. It is relatively
simple to extend the range to include all nonintegral $\nu$. First, we note that the
integral exists for $\nu < -1$ as long as we stay away from the origin. Second,
integrating by parts we find that Eq. 10.35 yields the familiar difference relation
(Eq. 10.29). If we take the difference relation to define the factorial function of
$\nu < -1$, then Eqs. 10.34 and 10.35 are verified for all $\nu$ (except negative integers).
EXERCISES
10.1.1 Derive the recurrence relations
$$\Gamma(z + 1) = z\,\Gamma(z)$$
from the Euler integral form (Eq. 10.5),
$$\Gamma(z) = \int_0^{\infty} e^{-t}\,t^{z-1}\,dt.$$
10.1.2 In a power-series solution for the Legendre functions of the second kind we
encounter the expression
$$\frac{(n + 1)(n + 2)(n + 3)\cdots(n + 2s - 1)(n + 2s)}{2\cdot 4\cdot 6\cdot 8\cdots(2s - 2)(2s)\cdot(2n + 3)(2n + 5)(2n + 7)\cdots(2n + 2s + 1)},$$
in which s is a positive integer. Rewrite this expression in terms of factorials.
10.1.3 Show that
$$\frac{(s - n)!}{(2s - 2n)!} = \frac{(-1)^{n-s}\,(2n - 2s)!}{(n - s)!}.$$
Here s and n are integers with s < n. This result can be used to avoid negative
factorials such as in the series representations of the spherical Neumann
functions and the Legendre functions of the second kind.
10.1.4 Show that $\Gamma(z)$ may be written
$$\Gamma(z) = 2\int_0^{\infty} e^{-t^2}\,t^{2z-1}\,dt, \qquad \Re(z) > 0,$$
$$\Gamma(z) = \int_0^1 \left[\ln\left(\frac{1}{t}\right)\right]^{z-1} dt, \qquad \Re(z) > 0.$$
10.1.5 In a Maxwellian distribution the fraction of particles between the speed v and
v + dv is
$$\frac{dN}{N} = 4\pi\left(\frac{m}{2\pi kT}\right)^{3/2} \exp\left(-\frac{mv^2}{2kT}\right) v^2\,dv,$$
N being the total number of particles. The average or expectation value of $v^n$ is
defined as $\langle v^n\rangle = N^{-1}\int v^n\,dN$. Show that
$$\langle v^n\rangle = \left(\frac{2kT}{m}\right)^{n/2} \frac{\left(\frac{n+1}{2}\right)!}{\left(\frac{1}{2}\right)!}.$$
10.1.6 By transforming the integral into a gamma function, show that
$$-\int_0^1 x^k \ln x\,dx = \frac{1}{(k + 1)^2}, \qquad k > -1.$$
10.1.7 Show that
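10.1.8 Show that
$$\lim_{x\to 0} \frac{(ax - 1)!}{(x - 1)!} = \frac{1}{a}.$$
10.1.9 Locate the poles of $\Gamma(z)$. Show that they are simple poles and determine the
residues.
10.1.10 Show that the equation $x! = k$, $k \neq 0$, has an infinite number of real roots.
10.1.11 Show that
(a) $$\int_0^{\infty} x^{2s+1} \exp(-ax^2)\,dx = \frac{s!}{2a^{s+1}}.$$
(b) $$\int_0^{\infty} x^{2s} \exp(-ax^2)\,dx = \frac{(s - \frac{1}{2})!}{2a^{s+1/2}} = \frac{(2s - 1)!!}{2^{s+1} a^s}\sqrt{\frac{\pi}{a}}.$$
These Gaussian integrals are of major importance in statistical mechanics.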
10.1.12 (a) Develop recurrence relations for (2n)!! and for (2n + 1)!!.
(b) Use these recurrence relations to calculate (or to define) 0!! and (-1)!!.
ANS. 0!! = 1, (-1)!! = 1.
10.1.13 For s a nonnegative integer, show that
$$(-2s - 1)!! = \frac{(-1)^s}{(2s - 1)!!} = \frac{(-1)^s\,2^s\,s!}{(2s)!}.$$
10.1.14 Express the coefficient of the nth term of the expansion of $(1 + x)^{1/2}$
(a) in terms of factorials of integers.
(b) in terms of the double factorial (!!) functions.
ANS. $$a_n = (-1)^{n+1}\frac{(2n - 3)!}{2^{2n-2}\,n!\,(n - 2)!} = (-1)^{n+1}\frac{(2n - 3)!!}{(2n)!!}, \qquad n = 2, 3, \ldots.$$
10.1.15 Express the coefficient of the nth term of the expansion of $(1 + x)^{-1/2}$
(a) in terms of the factorials of integers.
(b) in terms of the double factorial (!!) functions.
ANS. $$a_n = (-1)^n \frac{(2n)!}{2^{2n}\,(n!)^2} = (-1)^n \frac{(2n - 1)!!}{(2n)!!}, \qquad n = 1, 2, 3, \ldots.$$
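10.1.16 The Legendre polynomial may be written as
$$P_n(\cos\theta) = 2\,\frac{(2n - 1)!!}{(2n)!!}\left\{\cos n\theta + \frac{1}{1}\cdot\frac{n}{2n - 1}\cos(n - 2)\theta
+ \frac{1\cdot 3}{1\cdot 2}\,\frac{n(n - 1)}{(2n - 1)(2n - 3)}\cos(n - 4)\theta
+ \frac{1\cdot 3\cdot 5}{1\cdot 2\cdot 3}\,\frac{n(n - 1)(n - 2)}{(2n - 1)(2n - 3)(2n - 5)}\cos(n - 6)\theta + \cdots\right\}.$$
Let n = 2s + 1. Then $P_n(\cos\theta) = P_{2s+1}(\cos\theta) = \sum_{m=0}^{s} a_m \cos(2m + 1)\theta$.
Find $a_m$ in terms of factorials and double factorials.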
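10.1.17 (a) Show that
$$\Gamma(\tfrac{1}{2} - n)\,\Gamma(\tfrac{1}{2} + n) = (-1)^n \pi,$$
where n is an integer.
(b) Express $\Gamma(\frac{1}{2} + n)$ and $\Gamma(\frac{1}{2} - n)$ separately in terms of $\pi^{1/2}$ and a !! function.
ANS. $$\Gamma(\tfrac{1}{2} + n) = \frac{(2n - 1)!!}{2^n}\,\pi^{1/2}.$$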
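10.1.18 From one of the definitions of the factorial or gamma function, show that
$$|(ix)!|^2 = \frac{\pi x}{\sinh\pi x}.$$
10.1.19 Prove that
$$|\Gamma(\alpha + i\beta)| = |\Gamma(\alpha)| \prod_{n=0}^{\infty} \left[1 + \frac{\beta^2}{(\alpha + n)^2}\right]^{-1/2}.$$
This equation has been useful in calculations in the theory of beta decay.
10.1.20 Show that
$$|(n + ib)!| = \left(\frac{\pi b}{\sinh\pi b}\right)^{1/2} \prod_{s=1}^{n} (s^2 + b^2)^{1/2}$$
for n, a positive integer.
10.1.21 Show that
$$|x!| \ge |(x + iy)!|$$
for all x. The variables x and y are real.
10.1.22 Show that
$$|(-\tfrac{1}{2} + iy)!|^2 = \frac{\pi}{\cosh\pi y}.$$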
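10.1.23 The probability density associated with the normal distribution of statistics is
given by
$$f(x) = \frac{1}{\sigma(2\pi)^{1/2}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right]$$
with $(-\infty, \infty)$ for the range of x. Show that
(a) the mean value of x, $\langle x\rangle$, is equal to $\mu$.
(b) the standard deviation $(\langle x^2\rangle - \langle x\rangle^2)^{1/2}$ is given by $\sigma$.
10.1.24 From the gamma distribution of Eq. 10.24a
$$f(x) = \begin{cases} \dfrac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\,x^{\alpha-1} e^{-x/\beta}, & x > 0 \\[2mm] 0, & x \le 0, \end{cases}$$
show that
(a) $\langle x\rangle$ (mean) $= \alpha\beta$.
(b) $\sigma^2$ (variance) $= \langle x^2\rangle - \langle x\rangle^2 = \alpha\beta^2$.
10.1.25 The wave function of a particle scattered by a pure Coulomb potential is $\psi(r, \theta)$.
At the origin the wave function becomes
$$\psi(0) = e^{-\pi\gamma/2}\,\Gamma(1 + i\gamma),$$
where $\gamma = Z_1 Z_2 e^2/\hbar v$. Show that
$$|\psi(0)|^2 = \frac{2\pi\gamma}{e^{2\pi\gamma} - 1}.$$
10.1.26 Derive the contour integral representation
$$2i \sin\nu\pi\;\nu! = \int_C e^{-z} (-z)^{\nu}\,dz.$$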
10.1.27 Write a function subprogram FACT(N) (fixed point independent variable) that
will calculate N!. Include provision for rejection and appropriate error message
if N is negative.
Note. For small N direct multiplication is simplest. For large N, if large N are
considered, Eq. 10.55, Stirling's series would be appropriate.
10.1.28 (a) Write a function subprogram to calculate the double factorial ratio
(2N - 1)!!/(2N)!!. Include provision for N = 0 and for rejection and an
error message if N is negative. Calculate and tabulate this ratio for N =
1(1)100.
(b) Check your function subprogram calculation of 199!!/200!! against the
value obtained from Stirling's series (Section 10.3).
ANS. 199!!/200!! = 0.056348
10.1.29 Using either the Fortran supplied GAMMA or a library supplied subroutine
for x! or $\Gamma(x)$, determine the value of x for which $\Gamma(x)$ is a minimum ($1 \le x \le 2$)
and this minimum value of $\Gamma(x)$. Notice that although the minimum value of
$\Gamma(x)$ may be obtained to about six significant figures (single precision), the
corresponding value of x is much less accurate. Why this relatively low accuracy?
10.1.30 The factorial function expressed in integral form can be evaluated by the
Gauss-Laguerre quadrature. For a 10-point formula Appendix 2 guarantees
the resultant x! theoretically exact for x an integer, 0 up through 19. What
happens if x is not an integer? Use the Gauss-Laguerre quadrature to evaluate
x!, x = 0.0(0.1)2.0. Tabulate the absolute error as a function of x.
Check value. $x!_{\text{exact}} - x!_{\text{quadrature}} = 0.00034$ for x = 1.3.
10.2 DIGAMMA AND POLYGAMMA FUNCTIONS
Digamma Functions
As may be noted from the three definitions in Section 10.1, it is inconvenient
to deal with the derivatives of the gamma or factorial function directly. Instead,
it is customary to take the natural logarithm of the factorial function (Eq. 10.1),
convert the product to a sum, and then differentiate, that is,
$$z! = z\,\Gamma(z) = \lim_{n\to\infty} \frac{n!}{(z + 1)(z + 2)\cdots(z + n)}\,n^z \tag{10.36}$$
and
$$\ln(z!) = \lim_{n\to\infty} [\ln(n!) + z\ln n - \ln(z + 1) - \ln(z + 2) - \cdots - \ln(z + n)], \tag{10.37}$$
in which the logarithm of the limit is equal to the limit of the logarithm.
Differentiating with respect to z, we obtain
$$\frac{d}{dz}\ln(z!) \equiv F(z) = \lim_{n\to\infty}\left(\ln n - \frac{1}{z + 1} - \frac{1}{z + 2} - \cdots - \frac{1}{z + n}\right), \tag{10.38}$$
which defines F(z), the digamma function. From the definition of the
Euler-Mascheroni constant¹ Eq. 10.38 may be rewritten as
$$F(z) = -\gamma - \sum_{n=1}^{\infty}\left(\frac{1}{z + n} - \frac{1}{n}\right) = -\gamma + \sum_{n=1}^{\infty} \frac{z}{n(n + z)}. \tag{10.39}$$
One application of Eq. 10.39 is in the derivation of the series form of the
Neumann function (Section 11.3). Clearly,
$$F(0) = -\gamma = -0.577\,215\,664\,901\ldots.^2 \tag{10.40}$$
Another, perhaps more useful, expression for F(z) is derived in Section 10.3.
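The series of Eq. 10.39 can be summed directly as a check, although it converges slowly (the tail after N terms is roughly z/N). The sketch below is an illustration under assumptions made here: the value of $\gamma$ is hard-coded, the sum is simply truncated, and the comparison uses a central difference of the library function `math.lgamma`; the names `digamma_F` and `digamma_from_lgamma` are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, hard-coded here

def digamma_F(z, n_terms=200000):
    """Truncated series of Eq. 10.39: F(z) = -gamma + sum of z / (n (n + z))."""
    return -EULER_GAMMA + sum(z / (n * (n + z)) for n in range(1, n_terms + 1))

def digamma_from_lgamma(z, h=1e-5):
    """Central-difference estimate of d/dz ln(z!) = d/dz ln Gamma(z + 1)."""
    return (math.lgamma(z + 1.0 + h) - math.lgamma(z + 1.0 - h)) / (2.0 * h)

if __name__ == "__main__":
    print(digamma_F(0.0), -EULER_GAMMA)
    print(digamma_F(1.0), 1.0 - EULER_GAMMA)
    print(digamma_F(0.5), digamma_from_lgamma(0.5))
```

For z = 1 the series telescopes to 1, giving F(1) = 1 - gamma, which is a handy spot check.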
Polygamma Function
The digamma function may be differentiated repeatedly, giving rise to the
polygamma function:
$$F^{(m)}(z) \equiv \frac{d^{m+1}}{dz^{m+1}}\ln(z!) = (-1)^{m+1}\,m! \sum_{n=1}^{\infty} \frac{1}{(z + n)^{m+1}}, \qquad m = 1, 2, 3, \ldots. \tag{10.41}$$
A plot of F(x) and F'(x) is included in Fig. 10.2. Since the series in Eq. 10.41
defines the Riemann zeta function³ (with z = 0),
$$\zeta(m) \equiv \sum_{n=1}^{\infty} \frac{1}{n^m}, \tag{10.42}$$
we have
$$F^{(m)}(0) = (-1)^{m+1}\,m!\,\zeta(m + 1), \qquad m = 1, 2, 3, \ldots. \tag{10.43}$$
The values of the polygamma functions of positive integral argument, $F^{(m)}(n)$,
may be calculated by using Exercise 10.2.6.
In terms of the perhaps more common $\Gamma$ notation,
$$\frac{d^{n+1}}{dz^{n+1}}\ln\Gamma(z) = \frac{d^n}{dz^n}\psi(z) = \psi^{(n)}(z). \tag{10.44a}$$
From Eq. 10.27
$$\psi^{(n)}(z) = F^{(n)}(z - 1). \tag{10.44b}$$
¹ Compare Sections 5.2 and 5.6. We add and subtract $\sum_{s=1}^{n} s^{-1}$.
² $\gamma$ has been computed to 1271 places by D. E. Knuth, Math. Comp. 16, 275
(1962) and to 3566 decimal places by D. W. Sweeney, Math. Comp. 17, 170
(1963). It may be of interest that the fraction 228/395 gives $\gamma$ accurate to six
places.
³ Section 5.9. For $z \neq 0$ this series may be used to define a generalized zeta
function.
Maclaurin Expansion, Computation
It is now possible to write a Maclaurin expansion for ln(z!):
$$\ln(z!) = \sum_{n=1}^{\infty} \frac{z^n}{n!}\,F^{(n-1)}(0) = -\gamma z + \sum_{n=2}^{\infty} (-1)^n \frac{z^n}{n}\,\zeta(n), \tag{10.44c}$$
convergent for |z| < 1; for z = x, the range is $-1 < x \le 1$. Alternate forms of this
series appear in Exercise 5.9.14. Equation 10.44c is a possible means of
computing z! for real or complex z, but Stirling's series (Section 10.3) is usually better.
In addition, an excellent table of values of the gamma function for complex
arguments, based on the use of Stirling's series and the recurrence relation (Eq.
10.29), is now available.⁴
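The Maclaurin series for ln(z!) can be tested against a library value. The sketch below is illustrative and rests on assumptions made here rather than in the text: $\zeta(n)$ is computed by a direct sum plus an integral estimate of the tail, the series is cut at n = 60 (ample for |z| around 1/2), $\gamma$ is hard-coded, and the names `zeta` and `log_factorial_series` are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, hard-coded here

def zeta(s, n_terms=1000):
    """Riemann zeta for integer s >= 2: direct sum plus an integral tail estimate."""
    head = sum(k ** (-s) for k in range(1, n_terms + 1))
    tail = (n_terms + 0.5) ** (1 - s) / (s - 1)  # integral of x^(-s) from N + 1/2
    return head + tail

def log_factorial_series(z, n_max=60):
    """Truncated Maclaurin series for ln(z!), intended for |z| well below 1."""
    total = -EULER_GAMMA * z
    for n in range(2, n_max + 1):
        total += (-1) ** n * z ** n * zeta(n) / n
    return total

if __name__ == "__main__":
    print(zeta(2), math.pi ** 2 / 6.0)
    print(log_factorial_series(0.5), math.lgamma(1.5))
```

Near |z| = 1 the terms decay only like 1/n, so many more terms would be needed there; this is one reason Stirling's series is usually preferred.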
Series Summation
The digamma and polygamma functions may also be used in summing series.
If the general term of the series has the form of a rational fraction (with the
highest power of the index in the numerator at least two less than the highest
power of the index in the denominator), it may be transformed by the method of
partial fractions (compare Section 15.8). The infinite series may then be expressed
as a finite sum of digamma and polygamma functions. The usefulness of this
method depends on the availability of tables of digamma and polygamma
functions. Such tables and examples of series summation are given in AMS-55,
Chapter 6.
EXAMPLE 10.2.1 Catalan's Constant
Catalan's constant, Exercise 5.2.22, or $\beta(2)$ of Section 5.9, is given by
\[ K = \beta(2) = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)^2}. \]
Grouping the positive and negative terms separately and starting with unit
index (to match the form of $F^{(1)}$, Eq. 10.41), we obtain
\[ K = 1 + \sum_{n=1}^{\infty} \frac{1}{(4n+1)^2} - \frac{1}{9}
       - \sum_{n=1}^{\infty} \frac{1}{(4n+3)^2}. \]
Now, quoting Eqs. 10.41 and 10.44b, we get
\[ K = 1 + \tfrac{1}{16} F^{(1)}(\tfrac{1}{4}) - \tfrac{1}{9} - \tfrac{1}{16} F^{(1)}(\tfrac{3}{4})
     = 1 + \tfrac{1}{16}\psi^{(1)}(1 + \tfrac{1}{4}) - \tfrac{1}{9}
       - \tfrac{1}{16}\psi^{(1)}(1 + \tfrac{3}{4}). \tag{10.44e} \]
Using the values of $\psi^{(1)}$ from Table 6.1 of AMS-55, we obtain
K = 0.9159 6559....
Compare this calculation of Catalan's constant with the calculations of Chapter
5, either direct summation by machine or a modification using Riemann zeta
functions and then a (shorter) machine computation.
4 Table of the Gamma Function for Complex Arguments, National Bureau of
Standards, Applied Mathematics Series No. 34.
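The trigamma evaluation in Example 10.2.1 can be mimicked without tables. The
sketch below is only illustrative: the function names, the shift of 10 steps,
and the number of asymptotic terms are assumptions of this example. It
computes $\psi^{(1)}(x)$ by stepping the argument upward with the recurrence
$\psi^{(1)}(x) = \psi^{(1)}(x+1) + 1/x^2$ and then applying the leading terms
of the asymptotic series in the $\psi$ form, and compares the result with a
direct summation.

```python
def trigamma(x, shift=10):
    """psi^(1)(x) for real x > 0.

    Sum 1/(x+k)^2 for k = 0 .. shift-1, then add the asymptotic series
    psi^(1)(y) ~ 1/y + 1/(2y^2) + B2/y^3 + B4/y^5 + B6/y^7 at y = x + shift.
    """
    total = 0.0
    for k in range(shift):
        total += 1.0 / (x + k) ** 2
    y = x + shift
    total += (1.0 / y + 1.0 / (2.0 * y ** 2) + 1.0 / (6.0 * y ** 3)
              - 1.0 / (30.0 * y ** 5) + 1.0 / (42.0 * y ** 7))
    return total


def catalan_via_trigamma():
    """Catalan's constant from the trigamma form of Eq. 10.44e."""
    return 1.0 + trigamma(1.25) / 16.0 - 1.0 / 9.0 - trigamma(1.75) / 16.0


def catalan_direct(terms=200000):
    """Catalan's constant by brute-force summation of the defining series."""
    return sum((-1) ** k / (2 * k + 1) ** 2 for k in range(terms))
```

The trigamma route needs a few dozen arithmetic operations, whereas the direct
alternating sum needs on the order of $10^5$ terms to reach comparable accuracy.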
EXERCISES
10.2.1 Verify that the following two forms of the digamma function,
and
00
x
are equal to each other (for x a positive integer).
10.2.2 Show that $F(z)$ has the series expansion
\[ F(z) = -\gamma + \sum_{n=2}^{\infty} (-1)^n \zeta(n)\, z^{n-1}. \]
10.2.3 For a power series expansion of $\ln(z!)$ AMS-55 lists
\[ \ln(z!) = -\ln(1+z) + z(1-\gamma)
             + \sum_{n=2}^{\infty} (-1)^n [\zeta(n) - 1] \frac{z^n}{n}. \]
(a) Show that this agrees with Eq. 10.44c for |z| < 1.
(b) What is the range of convergence of this new expression?
10.2.4 Show that
2 П
Hint. Try Eq. 10.32.
10.2.5 Write out a Weierstrass infinite product definition of ln(z!). Without
differentiating, show that this leads directly to the Maclaurin expansion of
ln(z!), Eq. 10.44c.
10.2.6 Derive the difference relation for the polygamma function
\[ F^{(m)}(z+1) = F^{(m)}(z) + (-1)^m \frac{m!}{(z+1)^{m+1}}, \qquad m = 0, 1, 2, \ldots. \]
10.2.7 Show that if
\[ \Gamma(x + iy) = u + iv, \]
then
\[ \Gamma(x - iy) = u - iv. \]
This is a special case of the Schwarz reflection principle, Section 6.5.
10.2.8 The Pochhammer symbol $(a)_n$ is defined as
\[ (a)_n = a(a+1)\cdots(a+n-1), \qquad (a)_0 = 1 \]
(for integral n).
(a) Express $(a)_n$ in terms of factorials.
(b) Find $(d/da)(a)_n$ in terms of $(a)_n$ and digamma functions.
ANS. \[ \frac{d}{da}(a)_n = (a)_n\,[F(a+n-1) - F(a-1)]. \]
(c) Show that
10.2.9 Verify the following special values of the $\psi$ form of the di- and
polygamma functions
\[ \psi(1) = -\gamma, \qquad \psi^{(2)}(1) = -2\zeta(3). \]
10.2.10 Derive the polygamma function recurrence relation
\[ \psi^{(m)}(1+z) = \psi^{(m)}(z) + (-1)^m \frac{m!}{z^{m+1}}, \qquad m = 0, 1, 2, \ldots. \]
10.2.11 Verify
(a) \[ \int_0^\infty e^{-r} \ln r \, dr = -\gamma. \]
(b) \[ \int_0^\infty r e^{-r} \ln r \, dr = 1 - \gamma. \]
(c) \[ \int_0^\infty r^n e^{-r} \ln r \, dr = (n-1)! + n \int_0^\infty r^{n-1} e^{-r} \ln r \, dr,
       \qquad n = 1, 2, 3, \ldots. \]
Hint. These may be verified by integration by parts, three parts, or differentiating
the integral form of n! with respect to n.
10.2.12 Dirac relativistic wave functions for hydrogen involve factors such as
$[2(1 - \alpha^2 Z^2)^{1/2}]!$ where $\alpha$, the fine structure constant, is
1/137 and Z is the atomic number. Expand $[2(1 - \alpha^2 Z^2)^{1/2}]!$ in a
series of powers of $\alpha^2 Z^2$.
10.2.13 The quantum mechanical description of a particle in a coulomb field requires
a knowledge of the phase of the complex factorial function. Determine the phase
of $(1 + ib)!$ for small b.
10.2.14 The total energy radiated by a black body is given by
\[ u = \frac{8\pi k^4 T^4}{c^3 h^3} \int_0^\infty \frac{x^3}{e^x - 1}\, dx. \]
Show that the integral in this expression is equal to $3!\,\zeta(4)$.
[$\zeta(4) = \pi^4/90 = 1.0823\ldots$] The final result is the Stefan-Boltzmann law.
10.2.1 5 As a generalization of the result in Exercise 10.2.14, show that
o
10.2.16 The neutrino energy density (Fermi distribution) in the early history of the
universe is given by
\[ \rho_\nu = \frac{4\pi}{h^3} \int_0^\infty \frac{x^3}{\exp(x/kT) + 1}\, dx. \]
Show that
\[ \rho_\nu = \frac{7\pi^5}{30 h^3} (kT)^4. \]
10.2.17 Prove that
Г xs
Exercise 10.2.15 and 10.2.17 actually constitute Mellin integral transforms
(compare Section 15.1).
10.2.18 Prove that
10.2.19 Using di- and polygamma functions sum the series
(a)
(b)
GO (
у l
Note. You can use Exercise 10.2.6 to calculate the needed digamma functions.
10.2.20 Show that
\[ \sum_{n=1}^{\infty} \frac{1}{(n+a)(n+b)} = \frac{1}{b-a}\,\{F(b) - F(a)\}, \]
$a \ne b$, and neither a nor b is a negative integer. It is of some interest to
compare this summation with the corresponding integral
\[ \int_1^\infty \frac{dx}{(x+a)(x+b)} = \frac{1}{b-a}\,\{\ln(1+b) - \ln(1+a)\}. \]
The relation between $\psi(x)$ (or $F(x)$) and $\ln x$ is made explicit in
Eq. 10.51 in the next section.
10.2.21 Verify the contour integral representation of $\zeta(s)$,
The contour C is the same as that for Eq. 10.35. The points $z = \pm 2n\pi i$,
n = 1, 2, 3, ... are all excluded.
10.2.22 Show that $\zeta(s)$ is analytic in the entire finite complex plane except
at s = 1 where it has a simple pole with a residue of +1.
Hint. The contour integral representation will be useful.
10.2.23 Using the complex variable capability of FORTRAN IV calculate
$\Re(1 + ib)!$, $\Im(1 + ib)!$, $|(1 + ib)!|$, and phase $(1 + ib)!$ for
b = 0.0(0.1)1.0. Plot the phase of $(1 + ib)!$ versus b.
Hint. Exercise 10.2.3 offers a convenient approach. You will need to calculate
$\zeta(n)$.
10.3 STIRLING'S SERIES
For computation of ln(z!) for very large z (statistical mechanics) and for
numerical computations at nonintegral values of z a series expansion of ln(z!)
in negative powers of z is desirable. Perhaps the most elegant way of deriving
such an expansion is by the method of steepest descents (Section 7.4). The
following method, starting with a numerical integration formula, does not
require knowledge of contour integration and is particularly direct.
Derivation from Euler-Maclaurin Integration Formula
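The Euler-Maclaurin formula for evaluating a definite integral1 is
\[ \int_0^n f(x)\,dx = \tfrac{1}{2} f(0) + f(1) + f(2) + \cdots + \tfrac{1}{2} f(n)
   - b_2 [f'(n) - f'(0)] - b_4 [f'''(n) - f'''(0)] - \cdots, \tag{10.45} \]
in which the $b_{2n}$ are related to the Bernoulli numbers $B_{2n}$ (compare
Section 5.9) by
\[ (2n)!\, b_{2n} = B_{2n}, \tag{10.46} \]
\[ B_0 = 1, \quad B_2 = \tfrac{1}{6}, \quad B_4 = -\tfrac{1}{30}, \quad B_6 = \tfrac{1}{42},
   \quad B_8 = -\tfrac{1}{30}, \quad B_{10} = \tfrac{5}{66}, \quad \text{and so on.} \tag{10.47} \]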
By applying Eq. 10.45 to the definite integral
\[ \int_0^\infty \frac{dx}{(z+x)^2} = \frac{1}{z} \tag{10.48} \]
(for z not on the negative real axis), we obtain
\[ \frac{1}{z} = \frac{1}{2z^2} + F^{(1)}(z) - \frac{2!\,b_2}{z^3} - \frac{4!\,b_4}{z^5} - \cdots. \tag{10.49} \]
This is the reason for using Eq. 10.48. The Euler-Maclaurin evaluation yields
$F^{(1)}(z)$, which is $d^2 \ln(z!)/dz^2$.
Using Eq. 10.46 and solving for $F^{(1)}(z)$, we have
\[ F^{(1)}(z) = \frac{d}{dz} F(z)
   = \frac{1}{z} - \frac{1}{2z^2} + \frac{B_2}{z^3} + \frac{B_4}{z^5} + \cdots
   = \frac{1}{z} - \frac{1}{2z^2} + \sum_{n=1}^{\infty} \frac{B_{2n}}{z^{2n+1}}. \tag{10.50} \]
Since the Bernoulli numbers diverge strongly, this series does not converge! It
is a semiconvergent or asymptotic series, useful for computation despite its
divergence (compare Section 5.10).
1 Obtained by repeated integration by parts, Section 5.9.
Integrating once, we get the digamma function
\[ F(z) = C_1 + \ln z + \frac{1}{2z} - \frac{B_2}{2z^2} - \frac{B_4}{4z^4} - \cdots
        = C_1 + \ln z + \frac{1}{2z} - \sum_{n=1}^{\infty} \frac{B_{2n}}{2n\, z^{2n}}. \tag{10.51} \]
Integrating Eq. 10.51 with respect to z from z - 1 to z and then letting z
approach infinity, $C_1$, the constant of integration, may be shown to vanish.
This gives us a second expression for the digamma function, often more useful
than Eq. 10.38.
Stirling's Series
The indefinite integral of the digamma function (Eq. 10.51) is
\[ \ln(z!) = C_2 + (z + \tfrac{1}{2}) \ln z - z + \frac{B_2}{2z} + \cdots
   + \frac{B_{2n}}{2n(2n-1)\, z^{2n-1}} + \cdots, \tag{10.52} \]
in which $C_2$ is another constant of integration. To fix $C_2$ we find it
convenient to use the doubling or Legendre duplication formula derived in
Section 10.4,
\[ z!\,(z - \tfrac{1}{2})! = 2^{-2z} \pi^{1/2} (2z)!. \tag{10.53} \]
This may be proved directly when z is a positive integer by writing $(2z)!$ as a
product of even terms times a product of odd terms and extracting a factor of
two from each term (Exercise 10.3.5). Substituting Eq. 10.52 into the logarithm
of the doubling formula, we find that $C_2$ is
\[ C_2 = \tfrac{1}{2} \ln 2\pi, \tag{10.54} \]
giving
\[ \ln(z!) = \tfrac{1}{2} \ln 2\pi + (z + \tfrac{1}{2}) \ln z - z
   + \frac{1}{12z} - \frac{1}{360z^3} + \frac{1}{1260z^5} - \cdots. \tag{10.55} \]
FIG. 10.5 Accuracy of Stirling's formula
This is Stirling's series, an asymptotic expansion. The absolute value of the error
is less than the absolute value of the first term neglected.
The constants of integration $C_1$ and $C_2$ may also be evaluated by comparison
with the first term of the series expansion obtained by the method of "steepest
descent." This is carried out in Section 7.4.
To help convey a feeling of the remarkable precision of Stirling's series for s!,
the ratio of the first term of Stirling's approximation to s! is plotted in Fig. 10.5.
A tabulation gives the ratio of the first term in the expansion to s! and the ratio
of the first two terms in the expansion to s! (Table 10.2). The derivation of these
forms is Exercise 10.3.1.
TABLE 10.2

  s     $\frac{1}{s!}\sqrt{2\pi}\, s^{s+1/2} e^{-s}$     $\frac{1}{s!}\sqrt{2\pi}\, s^{s+1/2} e^{-s}\left[1 + \frac{1}{12s}\right]$
  1     0.92213                                            0.99898
  2     0.95950                                            0.99949
  3     0.97270                                            0.99972
  4     0.97942                                            0.99983
  5     0.98349                                            0.99988
  6     0.98621                                            0.99992
  7     0.98817                                            0.99994
  8     0.98964                                            0.99995
  9     0.99078                                            0.99996
 10     0.99170                                            0.99998
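A short sketch of Stirling's series (Eq. 10.55) follows. It is meant only as an
illustration: the function names and the choice to stop at the $1/(1260z^5)$
term are assumptions of this example. The second function reproduces the kind
of ratio tabulated in Table 10.2.

```python
import math

# Coefficients B_2n / (2n (2n - 1)) multiplying z^-(2n - 1) in Eq. 10.55,
# for n = 1, 2, 3.
STIRLING_COEFFS = (1.0 / 12.0, -1.0 / 360.0, 1.0 / 1260.0)


def ln_factorial_stirling(z, terms=3):
    """ln(z!) for real z > 0 from Stirling's series, Eq. 10.55."""
    total = 0.5 * math.log(2.0 * math.pi) + (z + 0.5) * math.log(z) - z
    for n, coeff in enumerate(STIRLING_COEFFS[:terms], start=1):
        total += coeff / z ** (2 * n - 1)
    return total


def stirling_ratios(s):
    """Ratios of the one-term and two-term approximations to s! (integer s >= 1)."""
    leading = math.sqrt(2.0 * math.pi) * s ** (s + 0.5) * math.exp(-s)
    exact = math.factorial(s)
    return leading / exact, leading * (1.0 + 1.0 / (12.0 * s)) / exact
```

Dividing `ln_factorial_stirling(100.0)` by `math.log(10.0)` gives
$\log_{10}(100!)$, which may be compared with the check value quoted in
Exercise 10.3.11.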
Numerical Computation
The possibility of using the Maclaurin expansion, Eq. 10.44c, for the
numerical evaluation of the factorial function is mentioned in Section 10.2.
However, for large x, Stirling's series, Eq. 10.55, gives much more rapid
convergence. The Table of the Gamma Function for Complex Arguments, National
Bureau of Standards, Applied Mathematics Series No. 34, is based on the use of
Stirling's series for z = x + iy, 9 < x < 10. Lower values of x are reached with
the recurrence relation, Eq. 10.29. Now suppose the numerical value of x! is
needed for some particular value of x in a program in a large, high-speed
digital computer. How shall we instruct the computer to compute x!? Stirling's
series followed by the recurrence relation is a good possibility. An even
better possibility is to fit x!, $0 \le x \le 1$, by a short power series
(polynomial) and then calculate x! directly from this empirical fit.
Presumably, the computing machine has been told the values of the coefficients
of the polynomial. Such polynomial fits have been made by Hastings2 for various
accuracy requirements. For example,
\[ x! = 1 + \sum_{n=1}^{8} b_n x^n + \varepsilon(x), \tag{10.56a} \]
with
    b_1 = -0.57719 1652        b_5 = -0.75670 4078
    b_2 =  0.98820 5891        b_6 =  0.48219 9394
    b_3 = -0.89705 6937        b_7 = -0.19352 7818        (10.56b)
    b_4 =  0.91820 6857        b_8 =  0.03586 8343
with the magnitude of the error $|\varepsilon(x)| < 3 \times 10^{-7}$, $0 \le x \le 1$.
This is not a least-squares fit. Hastings employed a Chebyshev polynomial
technique similar to that described in Section 13.4 to minimize the maximum
value of $|\varepsilon(x)|$.
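The polynomial fit of Eqs. 10.56a and 10.56b translates directly into a few
lines of code. The sketch below is an illustration under stated assumptions:
the function names are invented for this example, and the reduction of larger
arguments with the recurrence $x! = x\,(x-1)!$ is one possible way to extend
the fit beyond the unit interval.

```python
import math

# Coefficients b_1 ... b_8 of Eq. 10.56b.
HASTINGS_B = (
    -0.577191652, 0.988205891, -0.897056937, 0.918206857,
    -0.756704078, 0.482199394, -0.193527818, 0.035868343,
)


def factorial_unit_interval(x):
    """x! for 0 <= x <= 1 from the polynomial of Eq. 10.56a (Horner form)."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("the fit is only valid for 0 <= x <= 1")
    acc = 0.0
    for b in reversed(HASTINGS_B):
        acc = (acc + b) * x      # builds b_1 x + b_2 x^2 + ... + b_8 x^8
    return 1.0 + acc


def factorial_real(x):
    """x! for real x >= 0: step down into [0, 1] with x! = x (x - 1)!."""
    if x < 0.0:
        raise ValueError("this sketch handles only x >= 0")
    scale = 1.0
    while x > 1.0:
        scale *= x
        x -= 1.0
    return scale * factorial_unit_interval(x)
```

A comparison with `math.gamma(x + 1.0)` on a grid over the unit interval gives
a direct check of the quoted error bound. Note that the absolute error grows
with the product accumulated in `scale` when the recurrence is used.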
EXERCISES
10.3.1 Rewrite Stirling's series to give z! instead of ln(z!).
ANS. \[ z! = \sqrt{2\pi}\, z^{z+1/2} e^{-z}
     \left( 1 + \frac{1}{12z} + \frac{1}{288z^2} - \frac{139}{51{,}840z^3} + \cdots \right). \]
10.3.2 Use Stirling's formula to estimate 52!, the number of possible rearrangements
of cards in a standard deck of playing cards.
10.3.3 By integrating Eq. 10.51 from z - 1 to z and then letting $z \to \infty$,
evaluate the constant $C_1$ in the asymptotic series for the digamma function $F(z)$.
10.3.4 Show that the constant $C_2$ in Stirling's formula equals
$\tfrac{1}{2} \ln 2\pi$ by using the logarithm of the doubling formula.
10.3.5 By direct expansion verify the doubling formula for $z = n + \tfrac{1}{2}$;
n is an integer.
2 C. Hastings, Jr., Approximations for Digital Computers. Princeton, NJ:
Princeton University Press (1955).
10.3.6 Without using Stirling's series show that
(a) \[ \ln(n!) < \int_1^{n+1} \ln x \, dx, \]
(b) \[ \ln(n!) > \int_1^{n} \ln x \, dx; \] n is an integer $\ge 2$.
Notice that the arithmetic mean of these two integrals gives a good
approximation for Stirling's series.
10.3.7 Test for convergence
П2 x 2p+l = f
J 2p + 2 p%
p%l p! J 2p + 2 p% Bp)!!Bp + 2)!!
This series arises in an attempt to describe the magnetic field created by and
enclosed by a current loop.
10.3.8 Show that
(x + b)!
10.3.9 Show that
\[ \lim_{n \to \infty} \frac{(2n-1)!!}{(2n)!!}\, n^{1/2} = \pi^{-1/2}. \]
10.3.10 Calculate the binomial coefficient $\binom{2n}{n}$ to six significant
figures for n = 10, 20, and 30. Check your values by
(a) a Stirling series approximation through terms in $n^{-1}$,
(b) a double precision calculation.
ANS. $\binom{20}{10} = 1.84756 \times 10^{5}$,
     $\binom{40}{20} = 1.37846 \times 10^{11}$,
     $\binom{60}{30} = 1.18264 \times 10^{17}$.
10.3.11 Write a program (or subprogram) that will calculate $\log_{10}(x!)$ directly
from Stirling's series. Assume that $x \ge 10$. (Smaller values could be calculated
via the factorial recurrence relation.) Tabulate $\log_{10}(x!)$ versus x for
x = 10(10)300. Check your results against AMS-55 or by direct multiplication
(for n = 10, 20, and 30).
Check value. $\log_{10}(100!) = 157.97$.
10.3.12 Using the complex capability of FORTRAN IV, write a subroutine that will
calculate ln(z!) for complex z based on Stirling's series. Include a test and an
appropriate error message if z is too close to a negative real integer. Check your
subroutine against alternate calculations for z real, z pure imaginary, and
z = 1 + ib (Exercise 10.2.23).
Check values. $|(i0.5)!| = 0.82618$
              phase $(i0.5)! = -0.24406$.
FIG. 10.6 Transformation from cartesian to polar coordinates
10.4 THE BETA FUNCTION
Using the integral definition (Eq. 10.25), we write the product of two factorials
as the product of two integrals. To facilitate a change in variables, we take the
integrals over a finite range.
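\[ m!\,n! = \lim_{a^2 \to \infty} \int_0^{a^2} e^{-u} u^m \, du \int_0^{a^2} e^{-v} v^n \, dv,
   \qquad \Re(m) > -1, \quad \Re(n) > -1. \tag{10.57a} \]
Replacing u with $x^2$ and v with $y^2$, we obtain
\[ m!\,n! = \lim_{a \to \infty} 4 \int_0^{a} e^{-x^2} x^{2m+1} \, dx \int_0^{a} e^{-y^2} y^{2n+1} \, dy. \tag{10.57b} \]
Transforming to polar coordinates gives us
\[ m!\,n! = \lim_{a \to \infty} 4 \int_0^{a} e^{-r^2} r^{2m+2n+3} \, dr
            \int_0^{\pi/2} \cos^{2m+1}\theta \, \sin^{2n+1}\theta \, d\theta
          = (m+n+1)!\; 2 \int_0^{\pi/2} \cos^{2m+1}\theta \, \sin^{2n+1}\theta \, d\theta. \tag{10.58} \]
Here the cartesian area element dx dy has been replaced by $r\,dr\,d\theta$
(Fig. 10.6). The last equality in Eq. 10.58 follows from Exercise 10.1.11.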
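The definite integral, together with the factor 2, has been named the beta
function
\[ B(m+1, n+1) \equiv 2 \int_0^{\pi/2} \cos^{2m+1}\theta \, \sin^{2n+1}\theta \, d\theta
   = \frac{m!\,n!}{(m+n+1)!} = B(n+1, m+1). \tag{10.59a} \]
Equivalently, in terms of the gamma function
\[ B(p, q) = \frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p+q)}. \tag{10.59b} \]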
The only reason for choosing m + 1 and n + 1, rather than m and n, as the
arguments of B is to be in agreement with the conventional, historical beta
function.
In this manipulation the transformation from cartesian to polar coordinates
needs some justification. As seen in Fig. 10.6, the shaded area is being neglected.
However, the maximum value of the integrand in this region is
$e^{-a^2} a^{2m+2n+3}$, which vanishes so strongly as a approaches infinity that
the integral over the neglected region vanishes.
Definite Integrals, Alternate Forms
The beta function is useful in the evaluation of a wide variety of definite
integrals. The substitution $t = \cos^2\theta$ converts Eq. 10.59a to1
\[ B(m+1, n+1) = \frac{m!\,n!}{(m+n+1)!} = \int_0^1 t^m (1-t)^n \, dt. \tag{10.60a} \]
Replacing t by $x^2$, we obtain
\[ \frac{m!\,n!}{2(m+n+1)!} = \int_0^1 x^{2m+1} (1 - x^2)^n \, dx. \tag{10.60b} \]
The substitution $t = u/(1+u)$ in Eq. 10.60a yields still another useful form,
\[ \frac{m!\,n!}{(m+n+1)!} = \int_0^\infty \frac{u^m}{(1+u)^{m+n+2}} \, du. \tag{10.61} \]
The beta function as a definite integral is useful in establishing integral
representations of the Bessel function (Exercise 11.1.18) and the hypergeometric
function (Exercise 13.5.7).
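The equivalence of the gamma-function form (Eq. 10.59b) and the trigonometric
integral (Eq. 10.59a) is easy to spot check numerically. In the sketch below
the function names, the use of composite Simpson's rule, and the step count
are assumptions of this example; the quadrature is restricted to m, n $\ge$ 0
so that the integrand stays finite at both end points.

```python
import math


def beta(p, q):
    """B(p, q) from gamma functions, Eq. 10.59b (real p, q > 0)."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)


def beta_trig_integral(m, n, steps=2000):
    """B(m+1, n+1) as 2 * integral_0^{pi/2} cos^(2m+1) sin^(2n+1) d(theta).

    Composite Simpson's rule; steps must be even, and m, n >= 0 is assumed.
    """
    h = (math.pi / 2.0) / steps

    def integrand(theta):
        return math.cos(theta) ** (2 * m + 1) * math.sin(theta) ** (2 * n + 1)

    total = integrand(0.0) + integrand(math.pi / 2.0)
    for k in range(1, steps):
        weight = 4.0 if k % 2 else 2.0
        total += weight * integrand(k * h)
    return 2.0 * total * h / 3.0
```

For instance, `beta(1.3, 1.7)` and `beta_trig_integral(0.3, 0.7)` should agree
with each other and with the check value of Exercise 10.4.19.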
Verification of $\pi a/\sin \pi a$ Relation
If we take m = a, n = -a, -1 < a < 1, then
\[ \int_0^\infty \frac{u^a}{(1+u)^2} \, du = a!\,(-a)!. \tag{10.62} \]
By contour integration this integral may be shown to be equal to
$\pi a/\sin \pi a$ (Exercise 7.2.18), thus providing another method of obtaining
Eq. 10.32.
Derivation of Legendre Duplication Formula
The form of Eq. 10.59 suggests that the beta function may be useful in deriving
the doubling formula used in the preceding section. From Eq. 10.60a with
m = n = z and $\Re(z) > -1$,
\[ \frac{z!\,z!}{(2z+1)!} = \int_0^1 t^z (1-t)^z \, dt. \tag{10.63} \]
By substituting $t = (1 + s)/2$, we have
\[ \frac{z!\,z!}{(2z+1)!} = 2^{-2z-1} \int_{-1}^{1} (1 - s^2)^z \, ds
   = 2^{-2z} \int_0^1 (1 - s^2)^z \, ds. \tag{10.64} \]
The last equality holds because the integrand is even. Evaluating this integral as
a beta function (Eq. 10.60b), we obtain
\[ \frac{z!\,z!}{(2z+1)!} = 2^{-2z-1}\, \frac{z!\,(-\tfrac{1}{2})!}{(z + \tfrac{1}{2})!}. \tag{10.65} \]
Rearranging terms and recalling that $(-\tfrac{1}{2})! = \pi^{1/2}$, we quickly
reduce these equations to one form of the Legendre duplication formula,
\[ z!\,(z + \tfrac{1}{2})! = 2^{-2z-1} \pi^{1/2} (2z+1)!. \tag{10.66a} \]
Dividing by $(z + \tfrac{1}{2})$, we obtain an alternate form of the duplication
formula,
\[ z!\,(z - \tfrac{1}{2})! = 2^{-2z} \pi^{1/2} (2z)!. \tag{10.66b} \]
Although the integrals used in this derivation are defined only for
$\Re(z) > -1$, the results (Eqs. 10.66a and 10.66b) hold for all z by analytic
continuation.2
Using the double factorial notation (Section 10.1), we may rewrite Eq. 10.66a
(with z = n, an integer) as
\[ (n + \tfrac{1}{2})! = \pi^{1/2} (2n+1)!!/2^{n+1}. \tag{10.66c} \]
This is often convenient for eliminating factorials of fractions.
1 The Laplace transform convolution theorem provides an alternate derivation
of Eq. 10.60a, compare Exercise 15.11.2.
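A quick numerical check of the duplication formula may be reassuring. The
sketch below is illustrative only; the function names are assumptions of this
example, and `math.gamma` restricts it to real arguments with z > -1/2 so that
both factorials on the left of Eq. 10.66b are finite.

```python
import math


def fact(z):
    """Factorial function z! = Gamma(z + 1) for real z > -1."""
    return math.gamma(z + 1.0)


def duplication_residual(z):
    """Relative difference between the two sides of Eq. 10.66b."""
    lhs = fact(z) * fact(z - 0.5)
    rhs = 2.0 ** (-2.0 * z) * math.sqrt(math.pi) * fact(2.0 * z)
    return abs(lhs - rhs) / abs(rhs)


def half_integer_factorial(n):
    """(n + 1/2)! for integer n >= 0 from Eq. 10.66c."""
    double_fact = 1                      # accumulates (2n + 1)!!
    for k in range(1, 2 * n + 2, 2):
        double_fact *= k
    return math.sqrt(math.pi) * double_fact / 2 ** (n + 1)
```

The residual should sit at the level of floating-point roundoff for any
admissible z, and `half_integer_factorial(2)` may be compared with
`math.gamma(3.5)`.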
Incomplete Beta Function
Just as there is an incomplete gamma function (Section 10.5), there is also
an incomplete beta function,
\[ B_x(p, q) = \int_0^x t^{p-1} (1-t)^{q-1} \, dt, \qquad 0 \le x \le 1, \quad p > 0,
   \quad q > 0 \ (\text{if } x = 1). \tag{10.67} \]
Clearly, $B_{x=1}(p, q)$ becomes the regular (complete) beta function, Eq. 10.60. A
power-series expansion of $B_x(p, q)$ is the subject of Exercises 5.2.18 and 5.7.8.
The relation to hypergeometric functions appears in Section 13.5.
The incomplete beta function makes an appearance in probability theory in
calculating the probability of at most k successes in n independent trials.3
2 If 2z is a negative integer, we get the valid but unilluminating result
$\infty = \infty$.
3 W. Feller, An Introduction to Probability Theory and Its Applications. 3rd ed.,
Section VI.10. New York: Wiley (1968).
EXERCISES
10.4.1 Derive the doubling formula for the factorial function by integrating
$(\sin 2\theta)^{2n+1} = (2 \sin\theta \cos\theta)^{2n+1}$ (and using the beta function).
10.4.2 Verify the following beta function identities:
(a) $B(a, b) = B(a+1, b) + B(a, b+1)$,
(b) $B(a, b) = \dfrac{a+b}{b}\, B(a, b+1)$,
(c) $B(a, b) = \dfrac{b-1}{a}\, B(a+1, b-1)$,
(d) $B(a, b)\,B(a+b, c) = B(b, c)\,B(a, b+c)$.
10.4.3 (a) Show that
71/2
-x2)mx2ndx =
i-i
n
Bи + 2)!!'
n = 1, 2, 3, ....
(b) Show that
i-i
n
n
(In- 1)!!
Bи)М '
n= 1, 2, 3, ....
10.4.4 Show that
-x2)"dx =
i-i
п + 1)!!'
и = 0, 1, 2, ....
10.4.5 Evaluate jii A + х)аA — xf dx in terms of the beta function.
ANS. h
10.4.6 Show, by means of the beta function, that
f2 dx n
0 < а < 1.
Jt \z ~ x> Ух ~ l> sin na
This result is used in Section 16.2 to solve Abel's generalized integral equation.
10.4.7 Show that the Dirichlet integral
Яр алл PW B(P + !'<? + 1)
xpyqdA = ^—^ = —— -,
У (p + q + 2)\ p + q + 2
where the range of integration is the triangle bounded by the positive x- and
y-axes and the line x + у = 1.
10.4.8 Show that
o Jo
What are the limits on 91
Hint. Consider oblique xy coordinates.
ANS. -n<0<n.
564 THE GAMMA FUNCTION (FACTORIAL FUNCTION)
10.4.9 Evaluate (using the beta function)
fn/2 (?tt\3/2
(a) 1 "*U1'«-mar-
(Ь) Jo Jo 2(n/2)!
for n odd,
и!!
к (п- 1)!! .
_. i l— for n even.
2 и!!
10.4.10 Evaluate te A - хА)'щ dx as a beta function.
ANS. -'\/2 = 1.311028777.
10.4.11 Given
9 / \2 Ля/2
л (y_ ,B)
show, with the aid of beta functions, that this reduces to the Bessel series
00 1 /z\2s+v
identifying the initial Jv as an integral representation of the Bessel function,
Jv (Section 11.1).
10.4.12 Given that the associated Legendre polynomial Р„7(х) = Bm - 1)!!A - x2)^2,
Section 12.5, show that
f1 , 2
(a) [P™(*)] dx = -Bm)!, m = 0, 1, 2, ....
J-i Bm+1)
(b) Г [РДх)]2 dx 2 = 2-Bm-l)!, m = 1,2,3, ....
J-i 1-х
10.4.13 Show that
(a) [1(x2)s+1/2(l -x2r1'2dx= {2s)U ,
Jo B5+1)!!'
(b) \\x2n\-x2fdx = \
Jo 2
10.4.14 A particle of mass m moving in a symmetric potential that is well described by
V(x) = A\x\n has a total energy \m{dx/dtJ + V(x) = E. Solving for dx/dt and
integrating, we find that the period of motion is
т ят— f *max dx
Jo *• '
where xmax is a classical turning point given by Лх^ах = E. Show that
_2 [Ъш/EV1"
10.4.1 5 Referring to Exercise 10.4.14,
(a) Determine the limit as n -> oo of
THE INCOMPLETE GAMMA FUNCTIONS AND RELATED FUNCTIONS 565
nyj E \a)
(b) Find lim т from the behavior of the integrand, (£ — Ax")12.
n—* oo
(c) Investigate the behavior of the physical system (potential well) as n -> oo.
Obtain the period from inspection of this limiting physical system.
10.4.16 Show that
Г00 sinhax
H cosh^x 2 \ 2 ' 2
- 1 < a < p.
Hint. Let sinh2 x = u.
10.4.17 The beta distribution of probability theory has a probability density
with x restricted to the interval @,1). Show that
(a) <x>(mean) = .
(b) a (variance) = <x2> — <x>2 =
(a + ft) (<x + p -
10.4.18 From
Ля/2
sin2" 0 dO
lim -i = 1
sin2n+1 OdO
Jo
derive the Wallis formula for n:
n 2*2 4»4 6-6
10.4.19 Tabulate the beta function $B(p, q)$ for p and q = 1.0(0.1)2.0, independently.
Check value. $B(1.3, 1.7) = 0.40774$.
10.4.20 (a) Write a subroutine that will calculate the incomplete beta function
$B_x(p, q)$. For 0.5 < x < 1 you will find it convenient to use the relation
\[ B_x(p, q) = B(p, q) - B_{1-x}(q, p). \]
(b) Tabulate fix(f,f)- Spot check your results by using the Gauss-Legendre
quadrature.
10.5 THE INCOMPLETE GAMMA FUNCTIONS AND
RELATED FUNCTIONS
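Generalizing the Euler definition of the gamma function (Eq. 10.5), we define
the incomplete gamma functions by the variable limit integrals
\[ \gamma(a, x) = \int_0^x e^{-t} t^{a-1} \, dt \]
and
\[ \Gamma(a, x) = \int_x^\infty e^{-t} t^{a-1} \, dt. \tag{10.68} \]
Clearly, the two functions are related, for
\[ \gamma(a, x) + \Gamma(a, x) = \Gamma(a). \tag{10.69} \]
The choice of employing $\gamma(a, x)$ or $\Gamma(a, x)$ is purely a matter of
convenience. If the parameter a is a positive integer, Eqs. 10.68 may be
integrated completely to yield
\[ \gamma(n, x) = (n-1)! \left( 1 - e^{-x} \sum_{s=0}^{n-1} \frac{x^s}{s!} \right), \qquad
   \Gamma(n, x) = (n-1)!\, e^{-x} \sum_{s=0}^{n-1} \frac{x^s}{s!}, \qquad n = 1, 2, \ldots. \tag{10.70} \]
For nonintegral a a power-series expansion of $\gamma(a, x)$ for small x and an
asymptotic expansion of $\Gamma(a, x)$ are developed in Sections 5.7 and 5.10.
\[ \gamma(a, x) = x^a \sum_{n=0}^{\infty} (-1)^n \frac{x^n}{n!\,(a+n)}, \]
\[ \Gamma(a, x) \sim x^{a-1} e^{-x} \sum_{n=0}^{\infty} \frac{(a-1)!}{(a-1-n)!} \cdot \frac{1}{x^n}
   = x^{a-1} e^{-x} \sum_{n=0}^{\infty} (-1)^n \frac{(n-a)!}{(-a)!} \cdot \frac{1}{x^n}. \tag{10.71} \]
These incomplete gamma functions may also be expressed quite elegantly in
terms of confluent hypergeometric functions (compare Section 13.6).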
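The closed forms of Eq. 10.70 and the power series of Eq. 10.71 can be checked
against each other and against Eq. 10.69. The following is a minimal sketch;
the function names and the fixed number of series terms are assumptions of
this example, and the series routine is intended only for moderate x, where
cancellation between alternating terms is not severe.

```python
import math


def upper_incomplete_gamma_int(n, x):
    """Gamma(n, x) for positive integer n, Eq. 10.70."""
    partial = sum(x ** s / math.factorial(s) for s in range(n))
    return math.factorial(n - 1) * math.exp(-x) * partial


def lower_incomplete_gamma_int(n, x):
    """gamma(n, x) for positive integer n, Eq. 10.70."""
    partial = sum(x ** s / math.factorial(s) for s in range(n))
    return math.factorial(n - 1) * (1.0 - math.exp(-x) * partial)


def lower_incomplete_gamma_series(a, x, terms=80):
    """gamma(a, x) for a > 0 from the power series of Eq. 10.71."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * x ** k / (math.factorial(k) * (a + k))
    return x ** a * total
```

By Eq. 10.69 the two integer-order routines should sum to $(n-1)!$ for every x,
which provides a simple self-consistency test.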
Exponential Integral
Although the incomplete gamma function $\Gamma(a, x)$ in its general form
(Eq. 10.68) is only infrequently encountered in physical problems, a special
case is quite common and very useful. We define the exponential integral by1
\[ -Ei(-x) \equiv \int_x^\infty \frac{e^{-t}}{t} \, dt \equiv E_1(x). \tag{10.72} \]
(See Fig. 10.7.) To obtain a series expansion for small x, we start from
\[ E_1(x) = \Gamma(0, x) = \lim_{a \to 0} [\Gamma(a) - \gamma(a, x)]. \tag{10.73} \]
Caution is needed here, for the integral in Eq. 10.72 diverges logarithmically as
$x \to 0$. We may split the divergent term in the series expansion for
$\gamma(a, x)$,
\[ E_1(x) = \lim_{a \to 0} \left[ \frac{a\Gamma(a) - x^a}{a} \right]
   - \sum_{n=1}^{\infty} \frac{(-1)^n x^n}{n \cdot n!}. \tag{10.74} \]
Using l'Hospital's rule (Exercise 5.6.9) and
\[ \frac{d}{da} \{ a\Gamma(a) \} = \frac{d}{da}\, a! = \frac{d}{da}\, e^{\ln(a!)} = a!\,F(a), \tag{10.74a} \]
and then Eq. 10.40,2 we obtain
\[ E_1(x) = -\gamma - \ln x - \sum_{n=1}^{\infty} \frac{(-1)^n x^n}{n \cdot n!}, \tag{10.75} \]
useful for small x. An asymptotic expansion is given in Section 5.10.
1 The appearance of the two minus signs in $-Ei(-x)$ is an historical
monstrosity. AMS-55 denotes this integral as $E_1(x)$.
FIG. 10.7 The exponential integral, $E_1(x) = -Ei(-x)$
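The power series of Eq. 10.75 is straightforward to program. The sketch below
is one possible arrangement; the function name, the stopping tolerance, and
the term limit are assumptions of this example. Each term is built from the
previous one to avoid forming large factorials explicitly.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def exp_integral_e1(x, tol=1e-17, max_terms=500):
    """E_1(x) for real x > 0 from the convergent power series of Eq. 10.75."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    series = 0.0
    power_over_fact = 1.0              # holds x^n / n!
    for n in range(1, max_terms + 1):
        power_over_fact *= x / n
        term = (-1) ** n * power_over_fact / n
        series += term
        if abs(term) < tol:
            break
    return -EULER_GAMMA - math.log(x) - series
```

The values at x = 1.0 and x = 10.0 may be compared with the check values of
Exercise 10.5.15. At x = 10 the individual terms reach several hundred before
they decay, so a few digits are lost to cancellation even though the series
converges.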
Further special forms related to the exponential integral are the sine integral,
cosine integral (Fig. 10.8), and logarithmic integral defined by3
\[ si(x) = -\int_x^\infty \frac{\sin t}{t} \, dt, \qquad
   Ci(x) = -\int_x^\infty \frac{\cos t}{t} \, dt, \qquad
   li(x) = \int_0^x \frac{du}{\ln u} = Ei(\ln x). \tag{10.76} \]
By transforming from real to imaginary argument, we can show that
\[ si(x) = \frac{1}{2i} [Ei(ix) - Ei(-ix)] = \frac{1}{2i} [E_1(ix) - E_1(-ix)], \tag{10.77} \]
whereas
\[ Ci(x) = \tfrac{1}{2} [Ei(ix) + Ei(-ix)] = -\tfrac{1}{2} [E_1(ix) + E_1(-ix)],
   \qquad |\arg x| < \frac{\pi}{2}. \tag{10.78} \]
Adding these two relations, we obtain
\[ Ei(ix) = Ci(x) + i\, si(x), \tag{10.79} \]
to show that the relation among these integrals is exactly analogous to that
among $e^{ix}$, $\cos x$, and $\sin x$. In terms of $E_1$,
\[ E_1(ix) = -Ci(x) + i\, si(x). \]
Asymptotic expansions of $Ci(x)$ and $si(x)$ are developed in Section 5.10.
Power-series expansions about the origin for $Ci(x)$, $si(x)$, and $li(x)$ may be
obtained from those for the exponential integral, $E_1(x)$, or by direct
integration, Exercise 10.5.10. The exponential, sine, and cosine integrals are
tabulated in AMS-55, Chapter 5.
2 $dx^a/da = x^a \ln x$.
3 Another sine integral is given by $Si(x) = si(x) + \pi/2$.
FIG. 10.8 Sine and cosine integrals
Error Integrals
The error integrals
\[ \mathrm{erf}\, z = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2} \, dt, \qquad
   \mathrm{erfc}\, z = 1 - \mathrm{erf}\, z = \frac{2}{\sqrt{\pi}} \int_z^\infty e^{-t^2} \, dt \tag{10.80a} \]
(normalized so that $\mathrm{erf}\, \infty = 1$) are introduced in Section 5.10
(Fig. 10.9). Asymptotic forms are developed there. From the general form of the
integrands and Eq. 10.6 we expect that erf z and erfc z may be written as
incomplete gamma functions with $a = \tfrac{1}{2}$. The relations are
\[ \mathrm{erf}\, z = \pi^{-1/2} \gamma(\tfrac{1}{2}, z^2), \qquad
   \mathrm{erfc}\, z = \pi^{-1/2} \Gamma(\tfrac{1}{2}, z^2). \tag{10.80b} \]
The power-series expansion of erf z follows directly from Eq. 10.71.
FIG. 10.9 Error function, erf x
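The connection between erf z and the incomplete gamma function
$\gamma(\tfrac{1}{2}, z^2)$ can be verified numerically with the power series
for $\gamma(a, x)$. The sketch below assumes real z $\ge$ 0; the function names
and the number of terms are choices made for this example.

```python
import math


def lower_incomplete_gamma_series(a, x, terms=60):
    """gamma(a, x) = x^a * sum_n (-1)^n x^n / (n! (a + n)), for a > 0, x >= 0."""
    total = 0.0
    power_over_fact = 1.0              # holds x^n / n!
    for n in range(terms):
        if n > 0:
            power_over_fact *= x / n
        total += (-1) ** n * power_over_fact / (a + n)
    return x ** a * total


def erf_via_gamma(z):
    """erf z = pi^(-1/2) gamma(1/2, z^2) for real z >= 0."""
    return lower_incomplete_gamma_series(0.5, z * z) / math.sqrt(math.pi)
```

Comparing `erf_via_gamma(z)` with the library routine `math.erf(z)` for a few
values of z between 0 and 2 gives agreement close to machine precision.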
EXERCISES
10.5.1 Show that
(a) By repeatedly integrating by parts.
(b) Demonstrate this relation by transforming it into Eq. 10.71.
10.5.2
Show that
dm
(a)
dxm
dm
[x~ay(a,x)] = (-
m,x),
(а — m)
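10.5.3 Show that $\gamma(a, x)$ and $\Gamma(a, x)$ satisfy the recurrence relations
(a) $\gamma(a+1, x) = a\gamma(a, x) - x^a e^{-x}$,
(b) $\Gamma(a+1, x) = a\Gamma(a, x) + x^a e^{-x}$.
10.5.4 The potential produced by a 1s hydrogen electron is (Exercise 12.8.6) given by
\[ V(r) = \frac{q}{4\pi\varepsilon_0 a_0} \left\{ \frac{1}{2r}\, \gamma(3, 2r) + \Gamma(2, 2r) \right\}. \]
(a) For $r \ll 1$ show that
\[ V(r) = \frac{q}{4\pi\varepsilon_0 a_0} \left\{ 1 - \frac{2}{3} r^2 + \cdots \right\}. \]
(b) For $r \gg 1$ show that
\[ V(r) = \frac{q}{4\pi\varepsilon_0 a_0} \cdot \frac{1}{r}. \]
Here r is a pure number, the number of Bohr radii, $a_0$.
Note. For computation at intermediate values of r, Eqs. 10.70 are convenient.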
10.5.5 The potential of a 2p hydrogen electron is found to be (Exercise 12.8.7)
570 THE GAMMA FUNCTION (FACTORIAL FUNCTION)
0 24a0
4тге0 120ao[r3/v" '
Here r is expressed in units of a0, the Bohr radius. P2(cos^) is a Legendre
polynomial Section 12.1).
(a) For r«l, show that
1 a fl 1 , ">
V(t) = — < r2l
V 4тге0 ao[4 120
(b) For r » 1, show that
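10.5.6 Prove that the exponential integral has the expansion
\[ \int_x^\infty \frac{e^{-t}}{t} \, dt = -\gamma - \ln x - \sum_{n=1}^{\infty} \frac{(-1)^n x^n}{n \cdot n!}; \]
$\gamma$ is the Euler-Mascheroni constant.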
10.5.7 Show that E^{z) may be written as
Show also that we must impose the condition |argz| < л/2.
10.5.8 Related to the exponential integral (Eq. 10.72) by a simple change of variable is
the function
\[ E_n(x) = \int_1^\infty \frac{e^{-xt}}{t^n} \, dt. \]
Show that $E_n(x)$ satisfies the recurrence relation
\[ E_{n+1}(x) = \frac{1}{n} e^{-x} - \frac{x}{n} E_n(x), \qquad n = 1, 2, 3, \ldots. \]
10.5.9 With $E_n(x)$ defined in Exercise 10.5.8, show that $E_n(0) = 1/(n-1)$, n > 1.
10.5.10 Develop the following power-series expansions
(a) \[ si(x) = -\frac{\pi}{2} + \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)(2n+1)!}, \]
(b) \[ Ci(x) = \gamma + \ln x + \sum_{n=1}^{\infty} \frac{(-1)^n x^{2n}}{2n\,(2n)!}. \]
10.5.11 An analysis of a center-fed linear antenna leads to the expression
\[ \int_0^x \frac{1 - \cos t}{t} \, dt. \]
Show that this is equal to
\[ \gamma + \ln x - Ci(x). \]
FIG. 10.10 Distributed charge potential produced by a 1s hydrogen electron,
Exercise 10.5.14.
10.5.12 Using the relation
\[ \Gamma(a) = \gamma(a, x) + \Gamma(a, x), \]
show that if $\gamma(a, x)$ satisfies the relations of Exercise 10.5.2, then
$\Gamma(a, x)$ must satisfy the same relations.
10.5.13 (a) Write a subroutine that will calculate the incomplete gamma functions
$\gamma(n, x)$ and $\Gamma(n, x)$ for n a positive integer. Spot check
$\Gamma(n, x)$ by Gauss-Laguerre quadratures (Appendix 2).
(b) Tabulate $\gamma(n, x)$ and $\Gamma(n, x)$ for x = 0.0(0.1)1.0 and n = 1, 2, and 3.
10.5.14 Calculate the potential produced by a 1s hydrogen electron (Exercise 10.5.4)
(Fig. 10.10). Tabulate $V(r)/(q/4\pi\varepsilon_0 a_0)$ for x = 0.0(0.1)4.0. Check
your calculations for $r \ll 1$ and for $r \gg 1$ by calculating the limiting
forms given in Exercise 10.5.4.
10.5.15 Using Eqs. 5.204 and 10.75, calculate the exponential integral $E_1(x)$ for
(a) x = 0.2(0.2)1.0,
(b) x = 6.0(2.0)10.0.
Program your own calculation but check each value, using a library subroutine
if available. Also check your calculations at each point by a Gauss-Laguerre
quadrature.
You should find that the power-series converges rapidly and yields high
precision for small x. The asymptotic series, even for x = 10, yields relatively
poor accuracy.
Check values. $E_1(1.0) = 0.219384$
              $E_1(10.0) = 4.15697 \times 10^{-6}$
10.5.16 The two expressions for $E_1(x)$, (1) Eq. 5.204, an asymptotic series and
(2) Eq. 10.75, a convergent power series, provide a means of calculating the
Euler-Mascheroni constant $\gamma$ to high accuracy. Using double precision,
calculate $\gamma$ from Eq. 10.75 with $E_1(x)$ evaluated by Eq. 5.204.
Hint. As a convenient choice take x in the range 10 to 20. (Your choice of x
will set a limit on the accuracy of your result.) To minimize errors in the
alternating series of Eq. 10.75, accumulate the positive and negative terms
separately.
ANS. For x = 10 and "double precision" $\gamma = 0.5772\ 1566$.
REFERENCES
AMS-55, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical
Tables, U.S. Department of Commerce, National Bureau of Standards, Applied
Mathematics Series-55, M. Abramowitz and I. A. Stegun, Eds.
Contains a wealth of information about gamma functions, incomplete gamma
functions, exponential integrals, error functions, and related functions
(Chapters 4 to 6).
Artin, Emil, The Gamma Function. (Translated by Michael Butler.) New York: Holt,
Rinehart and Winston (1964).
Demonstrates that if a function f(x) is smooth (log convex) and equal to (n - 1)! when
x = n, it is the gamma function.
Davis, H. T., Tables of the Higher Mathematical Functions. Bloomington, Ind.: Principia
Press (1933).
Volume I contains extensive information on the gamma function and the polygamma
functions.
Luke, Y. L., The Special Functions and Their Approximations, Vol. I. New York and
London: Academic Press (1969).
Luke, Y. L., Mathematical Functions and Their Approximations. New York: Academic
Press (1975).
This is an updated supplement to Handbook of Mathematical Functions with Formulas,
Graphs, and Mathematical Tables (AMS-55). Chapter 1 deals with the gamma function.
Chapter 4 treats the incomplete gamma function and a host of related functions.
11 BESSEL FUNCTIONS
11.1 BESSEL FUNCTIONS OF THE FIRST KIND, $J_\nu(x)$
Bessel functions appear in a wide variety of physical problems. In Section 2.6
separation of the Helmholtz or wave equation in circular cylindrical coordinates
led to Bessel's equation. In Section 11.7 we will see that the Helmholtz equation
in spherical polar coordinates also leads to a form of Bessel's equation. Bessel
functions may also appear in integral form—integral representations. This may
result from integral transforms (Chapter 15) or from the mathematical elegance
of starting the study of Bessel functions with Hankel functions, Section 11.4.
Bessel functions and closely related functions form a rich area of
mathematical analysis with many representations, many interesting and useful
properties, and many interrelations. Some of the major interrelations developed
in Section 11.1 and in succeeding sections are outlined in Fig. 11.1. Note that
Bessel functions are not restricted to Chapter 11. The asymptotic forms are
developed in Section 7.4 as well as in Section 11.6. The confluent hypergeometric
representations appear in Section 13.6.
Generating Function, Integral Order, Jn(x)
Although Bessel functions are of interest primarily as solutions of differential
equations, it is instructive and convenient to develop them from a completely
different approach, that of the generating function.1 This approach also has the
advantage of focusing on the functions themselves rather than on the differential
equations they satisfy. An outline of the development of Bessel and related
functions from the generating function is shown in Fig. 11.1. Let us introduce
a function of two variables,
\[ g(x, t) = e^{(x/2)(t - 1/t)}. \tag{11.1} \]
Expanding this function in a Laurent series (Section 6.5), we obtain
\[ e^{(x/2)(t - 1/t)} = \sum_{n=-\infty}^{\infty} J_n(x)\, t^n. \tag{11.2} \]
1 Generating functions have already been used in Chapter 5. In Section 5.6
the generating function $(1 + x)^n$ generated the binomial coefficients. In
Section 5.9 the generating function $x/(e^x - 1)$ generated the Bernoulli
numbers.
FIG. 11.1 Bessel function interrelations
FIG. 11.2 Bessel functions, $J_0(x)$, $J_1(x)$, and $J_2(x)$
The coefficient of $t^n$, $J_n(x)$, is defined to be a Bessel function of the first
kind of integral order n. Expanding the exponentials, we have a product of
Maclaurin series in $xt/2$ and $-x/2t$, respectively,
\[ e^{xt/2} \cdot e^{-x/2t} = \sum_{r=0}^{\infty} \left( \frac{x}{2} \right)^r \frac{t^r}{r!}
   \sum_{s=0}^{\infty} (-1)^s \left( \frac{x}{2} \right)^s \frac{t^{-s}}{s!}. \tag{11.3} \]
For a given s we get $t^n$ ($n \ge 0$) from r = n + s:
\[ \left( \frac{x}{2} \right)^{n+s} \frac{t^{n+s}}{(n+s)!}\, (-1)^s \left( \frac{x}{2} \right)^s \frac{t^{-s}}{s!}. \tag{11.4} \]
The coefficient of $t^n$ is then2
\[ J_n(x) = \sum_{s=0}^{\infty} \frac{(-1)^s}{s!\,(n+s)!} \left( \frac{x}{2} \right)^{n+2s}
   = \frac{x^n}{2^n n!} - \frac{x^{n+2}}{2^{n+2}(n+1)!} + \cdots. \tag{11.5} \]
This series form exhibits the behavior of the Bessel function $J_n(x)$ for small x
and permits numerical evaluation of $J_n(x)$. The results for $J_0$, $J_1$, and
$J_2$ are shown in Fig. 11.2. From Section 5.3 the error in using only a finite
number of terms in numerical evaluation is less than the first term omitted. For
instance, if we want $J_n(x)$ to $\pm 1\%$ accuracy, the first term alone of
Eq. 11.5 will suffice, provided the ratio of the second term to the first is less
than 1% (in magnitude) or $x < 0.2(n+1)^{1/2}$. The Bessel functions oscillate but
are not periodic, except in the limit as $x \to \infty$ (Section 11.6). The
amplitude of $J_n(x)$ is not constant but decreases asymptotically as $x^{-1/2}$.
Equation 11.5 actually holds for n < 0, also giving
\[ J_{-n}(x) = \sum_{s=0}^{\infty} \frac{(-1)^s}{s!\,(s-n)!} \left( \frac{x}{2} \right)^{2s-n}, \tag{11.6} \]
which amounts to replacing n by -n in Eq. 11.5. Since n is an integer (here),
$(s - n)! \to \infty$ for s = 0, ..., (n - 1). Hence the series may be considered
to start with s = n. Replacing s by s + n, we obtain
\[ J_{-n}(x) = \sum_{s=0}^{\infty} \frac{(-1)^{s+n}}{s!\,(s+n)!} \left( \frac{x}{2} \right)^{n+2s}, \tag{11.7} \]
showing immediately that $J_n(x)$ and $J_{-n}(x)$ are not independent but are
related by
\[ J_{-n}(x) = (-1)^n J_n(x), \qquad \text{(integral } n\text{)}. \tag{11.8} \]
These series expressions (Eqs. 11.5 and 11.6) may be used with n replaced by $\nu$
to define $J_\nu(x)$ and $J_{-\nu}(x)$ for nonintegral $\nu$ (compare Exercise 11.1.7).
2 From the steps leading to this series and from its convergence characteristics
it should be clear that this series may be used with x replaced by z and with z
any point in the finite complex plane.
Recurrence Relations
The recurrence relations for Jn(x) and its derivatives may all be obtained by
operating on the series, Eq. 11.5, although this requires a bit of clairvoyance
(or a lot of trial and error). Verification of the known recurrence relations is
straightforward, Exercise 11.1.7. Here it is convenient to obtain them from the
generating function, g(x, t). Differentiating Eq. 11.1 partially with respect to t,
we find that
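$$ \frac{\partial}{\partial t} g(x,t) = \frac{x}{2}\left(1 + \frac{1}{t^{2}}\right) e^{(x/2)(t - 1/t)} = \sum_{n=-\infty}^{\infty} n J_n(x)\, t^{n-1}, \tag{11.9} $$
and substituting Eq. 11.2 for the exponential and equating the coefficients of
like powers of $t$,³ we obtain
$$ J_{n-1}(x) + J_{n+1}(x) = \frac{2n}{x} J_n(x). \tag{11.10} $$
This is a three-term recurrence relation. Given $J_0$ and $J_1$, for example, $J_2$ (and
any other integral order $J_n$) may be computed.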
With the opportunities offered by modern digital computers (and the
demands they levy), Eq. 11.10 has acquired an interesting new application.
In computing a numerical value of $J_N(x_0)$ for a given $x_0$, one could use Eq. 11.5
for small $x$, or the asymptotic form, Eq. 11.144 of Section 11.6 for large $x$. A
better way, in terms of accuracy and machine utilization, is to use the recurrence
relation, Eq. 11.10, and work down.⁴ With $n > N$ and $n \gg x_0$, assume
$$ J_{n+1}(x_0) = 0 \quad \text{and} \quad J_n(x_0) = \alpha, $$
where $\alpha$ is some small number. Then Eq. 11.10 leads to $J_{n-1}(x_0)$, $J_{n-2}(x_0)$,
and so on, and finally, to $J_0(x_0)$. Since $\alpha$ is arbitrary, the $J_n$'s are all off by a
common factor. This factor is determined by the condition
$$ J_0(x_0) + 2 \sum_{m=1}^{\infty} J_{2m}(x_0) = 1. \tag{11.10a} $$
(Set $t = 1$ in Eq. 11.2.) The accuracy of this calculation is checked by trying again
at $n' = n + 3$. This technique yields the desired $J_N(x_0)$ and all the lower integral
index $J$'s down to $J_0$. This is the technique employed by the FORTRAN SSP
subroutine BESJ.
3 This depends on the fact that the power-series representation is unique
(Sections 5.7, 6.5).
4 I. A. Stegun and M. Abramowitz, "Generation of Bessel functions on high
speed computers," Mathematical Tables and Other Aids for Computation, 11,
255-257 (1957).
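The downward recurrence just described can be sketched as follows. This is an illustrative version only, not the SSP routine BESJ: the function name `bessel_j_downward`, the starting value 1e-30 for α, and the rule of starting 20 orders above max(N, x0) are assumptions chosen for the example.

```python
def bessel_j_downward(N, x, extra=20):
    """Return [J_0(x), ..., J_N(x)] for x > 0 by downward recurrence."""
    start = max(N, int(x)) + extra      # starting order n, with n > N and n >> x (heuristic)
    j_next, j_curr = 0.0, 1e-30         # J_{n+1} = 0 and J_n = alpha (arbitrary small number)
    values = [0.0] * (start + 1)
    values[start] = j_curr
    for n in range(start, 0, -1):
        # Eq. 11.10 solved for the lower order: J_{n-1} = (2n/x) J_n - J_{n+1}
        j_prev = (2.0 * n / x) * j_curr - j_next
        values[n - 1] = j_prev
        j_next, j_curr = j_curr, j_prev
    # Eq. 11.10a fixes the common factor: J_0 + 2 (J_2 + J_4 + ...) = 1
    norm = values[0] + 2.0 * sum(values[2::2])
    return [v / norm for v in values[: N + 1]]

print(bessel_j_downward(2, 1.0))
```

Repeating the call with a larger `extra` and comparing the results plays the role of the check at n' = n + 3 mentioned in the text.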
High-speed, high-precision numerical computation is more or less an art.
Modifications and refinements of this and other numerical techniques are being
proposed year by year. For information on the current "state of the art" the
student will have to go to the literature, and this means primarily to the journal
Mathematics of Computation.
Differentiating Eq. 11.1 partially with respect to x, we have
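$$ \frac{\partial}{\partial x} g(x,t) = \frac{1}{2}\left(t - \frac{1}{t}\right) e^{(x/2)(t - 1/t)} = \sum_{n=-\infty}^{\infty} J'_n(x)\, t^{n}. \tag{11.11} $$
Again, substituting in Eq. 11.2 and equating the coefficients of like powers of $t$,
we obtain the result
$$ J_{n-1}(x) - J_{n+1}(x) = 2J'_n(x). \tag{11.12} $$
As a special case of this general recurrence relation,
$$ J'_0(x) = -J_1(x). \tag{11.13} $$
Adding Eqs. 11.10 and 11.12 and dividing by 2, we have
$$ J_{n-1}(x) = \frac{n}{x} J_n(x) + J'_n(x). \tag{11.14} $$
Multiplying by $x^n$ and rearranging terms produces
$$ \frac{d}{dx}\left[x^{n} J_n(x)\right] = x^{n} J_{n-1}(x). \tag{11.15} $$
Subtracting Eq. 11.12 from 11.10 and dividing by 2 yields
$$ J_{n+1}(x) = \frac{n}{x} J_n(x) - J'_n(x). \tag{11.16} $$
Multiplying by $x^{-n}$ and rearranging terms, we obtain
$$ \frac{d}{dx}\left[x^{-n} J_n(x)\right] = -x^{-n} J_{n+1}(x). \tag{11.17} $$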
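As a quick numerical sanity check of the two basic recurrence relations, Eqs. 11.10 and 11.12, one can evaluate both sides from the series, Eq. 11.5. The sketch below is illustrative only: the helper names `bessel_j` and `recurrence_residuals` and the step size h are assumptions, and the derivative is approximated by a central difference rather than computed exactly.

```python
import math

def bessel_j(n, x, terms=40):
    """Integer-order J_n(x), n >= 0, from the series of Eq. 11.5 (moderate x only)."""
    return sum((-1) ** s / (math.factorial(s) * math.factorial(n + s))
               * (x / 2.0) ** (n + 2 * s) for s in range(terms))

def recurrence_residuals(n, x, h=1e-5):
    """Left side minus right side of Eqs. 11.10 and 11.12, for n >= 1."""
    derivative = (bessel_j(n, x + h) - bessel_j(n, x - h)) / (2 * h)  # central difference
    r10 = bessel_j(n - 1, x) + bessel_j(n + 1, x) - (2 * n / x) * bessel_j(n, x)
    r12 = bessel_j(n - 1, x) - bessel_j(n + 1, x) - 2 * derivative
    return r10, r12

print(recurrence_residuals(2, 1.5))
```

Both residuals should be close to zero, the second one limited by the finite-difference error rather than by the recurrence itself.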
Bessel's Differential Equation
Suppose we consider a set of functions $Z_\nu(x)$ which satisfies the basic
recurrence relations (Eqs. 11.10 and 11.12), but with $\nu$ not necessarily an integer and
$Z_\nu$ not necessarily given by the series (Eq. 11.5). Equation 11.14 may be rewritten
($n \to \nu$) as
$$ x Z'_\nu(x) = x Z_{\nu-1}(x) - \nu Z_\nu(x). \tag{11.18} $$
On differentiating with respect to $x$, we have
$$ x Z''_\nu(x) + (\nu + 1) Z'_\nu - x Z'_{\nu-1} - Z_{\nu-1} = 0. \tag{11.19} $$
Multiplying by $x$ and then subtracting Eq. 11.18 multiplied by $\nu$ gives us
$$ x^{2} Z''_\nu + x Z'_\nu - \nu^{2} Z_\nu + (\nu - 1)\, x Z_{\nu-1} - x^{2} Z'_{\nu-1} = 0. \tag{11.20} $$
Now we rewrite Eq. 11.16 and replace $n$ by $\nu - 1$:
$$ x Z'_{\nu-1} = (\nu - 1) Z_{\nu-1} - x Z_\nu. \tag{11.21} $$
Using this to eliminate $Z_{\nu-1}$ and $Z'_{\nu-1}$ from Eq. 11.20, we finally get
$$ x^{2} Z''_\nu + x Z'_\nu + (x^{2} - \nu^{2}) Z_\nu = 0. \tag{11.22} $$
This is just Bessel's equation. Hence any functions, $Z_\nu(x)$, that satisfy the
recurrence relations (Eqs. 11.10 and 11.12, 11.14 and 11.16, or 11.15 and 11.17) satisfy
Bessel's equation; that is, the unknown $Z_\nu$ are Bessel functions. In particular,
we have shown that the functions $J_n(x)$, defined by our generating function,
satisfy Bessel's equation. If the argument is $k\rho$ rather than $x$, Eq. 11.22 becomes
$$ \rho^{2} \frac{d^{2}}{d\rho^{2}} Z_\nu(k\rho) + \rho \frac{d}{d\rho} Z_\nu(k\rho) + (k^{2}\rho^{2} - \nu^{2}) Z_\nu(k\rho) = 0. \tag{11.22a} $$
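The elimination step that turns Eq. 11.20 into Eq. 11.22 can be written out explicitly. The lines below are a worked sketch using only Eqs. 11.20 and 11.21; they introduce no new symbols. Substituting Eq. 11.21 for $x Z'_{\nu-1}$ in the last two terms of Eq. 11.20 gives

```latex
\begin{aligned}
(\nu-1)\,x Z_{\nu-1} - x^{2} Z'_{\nu-1}
  &= (\nu-1)\,x Z_{\nu-1} - x\left[(\nu-1) Z_{\nu-1} - x Z_{\nu}\right] \\
  &= x^{2} Z_{\nu},
\end{aligned}
```

so the terms in $Z_{\nu-1}$ cancel and Eq. 11.20 reduces to $x^{2} Z''_\nu + x Z'_\nu + (x^{2} - \nu^{2}) Z_\nu = 0$.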
Integral Representation
A particularly useful and powerful way of treating Bessel functions employs
integral representations. If we return to the generating function (Eq. 11.2),
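and substitute $t = e^{i\theta}$,
$$ e^{ix\sin\theta} = J_0(x) + 2\left(J_2(x)\cos 2\theta + J_4(x)\cos 4\theta + \cdots\right) + 2i\left(J_1(x)\sin\theta + J_3(x)\sin 3\theta + \cdots\right), \tag{11.23} $$
in which we have used the relations
$$ J_1(x)e^{i\theta} + J_{-1}(x)e^{-i\theta} = J_1(x)\left(e^{i\theta} - e^{-i\theta}\right) = 2iJ_1(x)\sin\theta, \tag{11.24} $$
$$ J_2(x)e^{2i\theta} + J_{-2}(x)e^{-2i\theta} = 2J_2(x)\cos 2\theta, $$
and so on.
In summation notation
$$ \cos(x\sin\theta) = J_0(x) + 2\sum_{n=1}^{\infty} J_{2n}(x)\cos(2n\theta), $$
$$ \sin(x\sin\theta) = 2\sum_{n=1}^{\infty} J_{2n-1}(x)\sin[(2n-1)\theta], \tag{11.25} $$
equating real and imaginary parts, respectively. It might be noted that angle $\theta$
(in radians) has no dimensions. Likewise $\sin\theta$ has no dimensions and the
function $\cos(x\sin\theta)$ is perfectly proper from a dimensional point of view.
By employing the orthogonality properties of cosine and sine,⁵
$$ \int_0^{\pi} \cos n\theta \cos m\theta \, d\theta = \frac{\pi}{2}\,\delta_{nm}, \tag{11.26a} $$
$$ \int_0^{\pi} \sin n\theta \sin m\theta \, d\theta = \frac{\pi}{2}\,\delta_{nm}, \tag{11.26b} $$
in which $n$ and $m$ are positive integers (zero is excluded),⁶ we obtain
$$ \frac{1}{\pi}\int_0^{\pi} \cos(x\sin\theta)\cos n\theta \, d\theta = \begin{cases} J_n(x), & n \text{ even}, \\ 0, & n \text{ odd}, \end{cases} \tag{11.27} $$
$$ \frac{1}{\pi}\int_0^{\pi} \sin(x\sin\theta)\sin n\theta \, d\theta = \begin{cases} 0, & n \text{ even}, \\ J_n(x), & n \text{ odd}. \end{cases} \tag{11.28} $$
If these two equations are added together,
$$ J_n(x) = \frac{1}{\pi}\int_0^{\pi} \left[\cos(x\sin\theta)\cos n\theta + \sin(x\sin\theta)\sin n\theta\right] d\theta = \frac{1}{\pi}\int_0^{\pi} \cos(n\theta - x\sin\theta)\, d\theta, \quad n = 0, 1, 2, 3, \ldots. \tag{11.29} $$
As a special case,
$$ J_0(x) = \frac{1}{\pi}\int_0^{\pi} \cos(x\sin\theta)\, d\theta. \tag{11.30} $$
Noting that $\cos(x\sin\theta)$ repeats itself in all four quadrants ($\theta_1 = \theta$, $\theta_2 = \pi - \theta$,
$\theta_3 = \pi + \theta$, $\theta_4 = -\theta$), we may write Eq. 11.30 as
$$ J_0(x) = \frac{1}{2\pi}\int_0^{2\pi} \cos(x\sin\theta)\, d\theta. \tag{11.30a} $$
On the other hand, $\sin(x\sin\theta)$ reverses its sign in the third and fourth quadrants
so that
$$ \frac{1}{2\pi}\int_0^{2\pi} \sin(x\sin\theta)\, d\theta = 0. \tag{11.30b} $$
Adding Eq. 11.30a and $i$ times Eq. 11.30b, we obtain the complex exponential
representation
$$ J_0(x) = \frac{1}{2\pi}\int_0^{2\pi} e^{ix\sin\theta}\, d\theta = \frac{1}{2\pi}\int_0^{2\pi} e^{ix\cos\theta}\, d\theta. \tag{11.30c} $$
5 They are eigenfunctions of a self-adjoint equation (linear oscillator equation)
and satisfy appropriate boundary conditions (compare Sections 9.2 and
14.1).
6 Equations 11.26a and b hold for either m or n = 0. If both m and n = 0, the
constant in 11.26a becomes $\pi$; the constant in Eq. 11.26b becomes 0.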
This integral representation (Eq. 11.29) may be obtained somewhat more
directly by employing contour integration (compare Exercise 11.1.16).7 Many
other integral representations exist (compare Exercise 11.1.18).
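The integral representation, Eq. 11.29, also lends itself to direct numerical evaluation. The sketch below uses the trapezoidal rule, which is a choice made for this example and not something prescribed by the text; the function name `bessel_j_integral` and the number of panels are likewise assumptions. Because the integrand is an even, periodic function of θ, the trapezoidal rule converges very quickly here.

```python
import math

def bessel_j_integral(n, x, panels=200):
    """Trapezoidal estimate of (1/pi) * integral_0^pi cos(n*theta - x*sin(theta)) d(theta)."""
    h = math.pi / panels
    # Endpoint values: cos(0) = 1 at theta = 0 and cos(n*pi) at theta = pi.
    total = 0.5 * (1.0 + math.cos(n * math.pi))
    for k in range(1, panels):
        theta = k * h
        total += math.cos(n * theta - x * math.sin(theta))
    return total * h / math.pi

for n in (0, 1, 2):
    print(n, bessel_j_integral(n, 1.0))
```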
FIG. 11.3 Fraunhofer diffraction, circular aperture
EXAMPLE 11.1.1 Fraunhofer Diffraction, Circular Aperture
In the theory of diffraction through a circular aperture we encounter the
integral
$$ \Phi \sim \int_0^{a}\int_0^{2\pi} e^{ibr\cos\theta}\, d\theta\, r\, dr \tag{11.31} $$
for $\Phi$, the amplitude of the diffracted wave.⁸ Here $\theta$ is an azimuth angle in the
plane of the circular aperture of radius $a$, and $\alpha$ is the angle defined by a point
on a screen below the circular aperture relative to the normal through the
center point. The parameter $b$ is given by
$$ b = \frac{2\pi}{\lambda}\sin\alpha, \tag{11.32} $$
with $\lambda$ the wavelength of the incident wave. The other symbols are defined by
Fig. 11.3. From Eq. 11.30c we get⁹
$$ \Phi \sim 2\pi\int_0^{a} J_0(br)\, r\, dr. \tag{11.33} $$
Equation 11.15 enables us to integrate Eq. 11.33 immediately to obtain
$$ \Phi \sim \frac{2\pi ab}{b^{2}} J_1(ab) \sim \frac{\lambda a}{\sin\alpha}\, J_1\!\left(\frac{2\pi a}{\lambda}\sin\alpha\right). \tag{11.34} $$
The intensity of the light in the diffraction pattern is proportional to $\Phi^2$ and
$$ \Phi^{2} \sim \left\{\frac{J_1[(2\pi a/\lambda)\sin\alpha]}{\sin\alpha}\right\}^{2}. \tag{11.35} $$
From Table 11.1, which lists the zeros of the Bessel functions and their first
derivatives,¹⁰ the expression 11.35 will have a zero at
$$ \frac{2\pi a}{\lambda}\sin\alpha = 3.8317\ldots \tag{11.36} $$
or
$$ \sin\alpha = \frac{3.8317\,\lambda}{2\pi a}. \tag{11.37} $$
For green light $\lambda = 5.5 \times 10^{-5}$ cm. Hence, if $a = 0.5$ cm,
$$ \alpha \approx \sin\alpha = 6.7 \times 10^{-5} \text{ (radian)} \approx 14 \text{ seconds of arc}, \tag{11.38} $$
which shows that the bending or spreading of the light ray is extremely small.
If this analysis had been known in the seventeenth century, the arguments
against the wave theory of light would have collapsed.
In mid-twentieth century this same diffraction pattern appears in the
scattering of nuclear particles by atomic nuclei, a striking demonstration of the wave
properties of the nuclear particles.

TABLE 11.1 Zeros of the Bessel Functions and Their First Derivatives

Number
of zero   J_0(x)    J_1(x)    J_2(x)    J_3(x)    J_4(x)    J_5(x)
1         2.4048    3.8317    5.1356    6.3802    7.5883    8.7715
2         5.5201    7.0156    8.4172    9.7610   11.0647   12.3386
3         8.6537   10.1735   11.6198   13.0152   14.3725   15.7002
4        11.7915   13.3237   14.7960   16.2235   17.6160   18.9801
5        14.9309   16.4706   17.9598   19.4094   20.8269   22.2178

Number
of zero   J'_0(x)   J'_1(x)   J'_2(x)   J'_3(x)
1         3.8317    1.8412    3.0542    4.2012
2         7.0156    5.3314    6.7061    8.0152
3        10.1735    8.5363    9.9695   11.3459

Note. $J'_0(x) = -J_1(x)$.

7 For n = 0 a simple integration over $\theta$ from 0 to $2\pi$ will convert Eq. 11.23
into Eq. 11.30c.
8 The exponent $ibr\cos\theta$ gives the phase of the wave on the distant screen at
angle $\alpha$ relative to the phase of the wave incident on the aperture at the point
$(r, \theta)$. The imaginary exponential form of this integrand means that the
integral is technically a Fourier transform, Chapter 15. In general, the
Fraunhofer diffraction pattern is given by the Fourier transform of the
aperture.
9 We could also refer to Exercise 11.1.16(b).
10 Additional roots of the Bessel functions and their first derivatives may be
found in C. L. Beattie, "Table of First 700 Zeros of Bessel Functions," Bell
Tech. J. 37, 689 (1958) and Bell Monograph 3055.
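The arithmetic behind the estimate of the first dark ring (Eqs. 11.37 and 11.38) can be reproduced directly. This is a small sketch; the function name `first_dark_ring` is an assumption, and the inputs are the values quoted in the example (green light of wavelength 5.5 × 10⁻⁵ cm and an aperture radius of 0.5 cm).

```python
import math

FIRST_ZERO_J1 = 3.8317  # first zero of J_1, from Table 11.1

def first_dark_ring(wavelength, radius):
    """Return (sin(alpha), alpha in seconds of arc) for the first zero of Eq. 11.35."""
    sin_alpha = FIRST_ZERO_J1 * wavelength / (2.0 * math.pi * radius)  # Eq. 11.37
    alpha_arcsec = math.degrees(math.asin(sin_alpha)) * 3600.0
    return sin_alpha, alpha_arcsec

sin_alpha, arcsec = first_dark_ring(wavelength=5.5e-5, radius=0.5)  # both in cm
print(sin_alpha, arcsec)
```

Only the ratio of wavelength to radius enters, so any consistent length unit may be used.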
A further example of the use of Bessel functions and their roots is provided by
the electromagnetic resonant cavity, Example 11.1.2 that follows and the
example and exercises of Section 11.2.
EXAMPLE 11.1.2 Cylindrical Resonant Cavity
In the interior of a resonant cavity electromagnetic waves oscillate with a
time dependence $e^{-i\omega t}$. Maxwell's equations lead to
$$ \nabla \times \nabla \times \mathbf{E} = \alpha^{2}\mathbf{E} $$
for the space part of the electric field with $\alpha^2 = \omega^2 \varepsilon_0 \mu_0$ (Example 1.9.2). With
$\nabla \cdot \mathbf{E} = 0$ (vacuum, no charges),
$$ \nabla^{2}\mathbf{E} + \alpha^{2}\mathbf{E} = 0. $$
Separating variables in circular cylindrical coordinates (Section 2.4), we find
that the z-component ($E_z$, space part only) satisfies the scalar Helmholtz
equation
$$ \nabla^{2}E_z + \alpha^{2}E_z = 0, \tag{11.39} $$
where $\alpha^2 = \omega^2 \varepsilon_0 \mu_0 = \omega^2/c^2$. Further,
$$ (E_z)_{mnk} = \sum_{m,n} J_m(\gamma_{mn}\rho)\, e^{\pm im\varphi}\left[a_{mn}\sin kz + b_{mn}\cos kz\right]. \tag{11.40} $$
The parameter $k$ is a separation constant introduced in splitting off the $z$
dependence of $E_z(\rho, \varphi, z)$. Similarly, $m$ entered in splitting off the $\varphi$ dependence.
$\gamma^2$ enters as $\alpha^2 - k^2$ and is quantized by the requirement that $\gamma a$ be a root of the
Bessel function $J_m$ (Eq. 11.43 which follows). Then the $n$ in $\gamma_{mn}$ designates the $n$th
root of $J_m$.
For the end surfaces at $z = 0$ and $z = l$ (as in Fig. 11.4), let us set $a_{mn} = 0$, and
$$ k = \frac{p\pi}{l}, \quad p = 0, 1, 2, \ldots. \tag{11.41} $$
Maxwell's equations then guarantee that the tangential electric fields $E_\rho$ and
$E_\varphi$ will vanish at $z = 0$ and $l$. This is the transverse magnetic or TM mode of
oscillation. We have
$$ \gamma^{2} = \frac{\omega^{2}}{c^{2}} - k^{2} = \frac{\omega^{2}}{c^{2}} - \frac{p^{2}\pi^{2}}{l^{2}}. \tag{11.42} $$
FIG. 11.4 Cylindrical resonant cavity
But there is the usual boundary condition that $E_z(\rho = a) = 0$. Hence we must
set
$$ \gamma_{mn} = \frac{\alpha_{mn}}{a}, \tag{11.43} $$
where $\alpha_{mn}$ is the $n$th zero of $J_m$.
The result of the two boundary conditions and the separation constant $m^2$
is that the angular frequency of our oscillation depends on three discrete
parameters
$$ \omega_{mnp} = c\sqrt{\frac{\alpha_{mn}^{2}}{a^{2}} + \frac{p^{2}\pi^{2}}{l^{2}}}, \quad \begin{cases} m = 0, 1, 2, \ldots \\ n = 1, 2, 3, \ldots \\ p = 0, 1, 2, \ldots \end{cases} \tag{11.44} $$
These are the allowable resonant frequencies for our TM mode. The TE mode
of oscillation is the topic of Exercise 11.1.26.
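Equation 11.44 is easy to tabulate once the zeros are known. The sketch below is illustrative: the function name `tm_frequency`, the small dictionary of zeros (copied from Table 11.1), and the sample cavity dimensions are assumptions made for the example, not values from the text.

```python
import math

C = 2.99792458e8  # speed of light in vacuum, m/s

# alpha_mn: n-th zero of J_m, copied from Table 11.1 (keys are (m, n)).
ALPHA = {(0, 1): 2.4048, (0, 2): 5.5201,
         (1, 1): 3.8317, (1, 2): 7.0156,
         (2, 1): 5.1356, (2, 2): 8.4172}

def tm_frequency(m, n, p, a, l):
    """Angular frequency omega_mnp of Eq. 11.44 for cavity radius a and length l (metres)."""
    return C * math.sqrt((ALPHA[(m, n)] / a) ** 2 + (p * math.pi / l) ** 2)

# Hypothetical cavity: radius 5 cm, length 10 cm.
for mode in [(0, 1, 0), (0, 1, 1), (1, 1, 0)]:
    print(mode, tm_frequency(*mode, a=0.05, l=0.10))
```

The lowest TM mode is (m, n, p) = (0, 1, 0), whose frequency depends only on the radius.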
Alternate Approaches
Bessel functions are introduced here by means of a generating function,
Eq. 11.2. Other approaches are possible. Listing the various possibilities,
we have
1. Generating function (magic), Eq. 11.2.
2. Series solution of Bessel's differential equation,
Section 8.5.
3. Contour integrals: Some writers prefer to start with
contour integral definitions of the Hankel functions,
Sections 7.4 and 11.4, and develop the Bessel function
Jv(x) from the Hankel functions.
4. Direct solution of physical problems: Example 11.1.1,
Fraunhofer diffraction with a circular aperture,
illustrates this. Incidentally, Eq. 11.31 can be treated by
series expansion, if desired. Feynman¹¹ develops
Bessel functions from a consideration of cavity
resonators.
In case the generating function seems too arbitrary, it can be derived from
a contour integral, Exercise 11.1.16, or from the Bessel function recurrence
relations, Exercise 11.1.6.
Bessel Functions of Nonintegral Order
These different approaches are not exactly equivalent. The generating
function approach is very convenient for deriving two recurrence relations,
Bessel's differential equation, integral representations, addition theorems
(Exercise 11.1.2), and upper and lower bounds (Exercise 11.1.1). However, the
reader will probably have noticed that the generating function defined only
Bessel functions of integral order, $J_0$, $J_1$, $J_2$, and so on. This is a great limitation
of the generating function approach. But the Bessel function of the first kind,
$J_\nu(x)$, may easily be defined for nonintegral $\nu$ by using the series (Eq. 11.5)
as a new definition.
The recurrence relations may be verified by substituting in the series form
of Jv(x) (Exercise 11.1.7). From these relations Bessel's equation follows. In fact,
if v is not an integer, there is actually an important simplification. It is found that
Jv and J_v are independent, for no relation of the form of Eq. 11.8 exists. On the
other hand, for v = n, an integer, we need another solution. The development
of this second solution and an investigation of its properties form the subject
of Section 11.3.
EXERCISES
11.1.1 From the product of the generating functions $g(x, t) \cdot g(x, -t)$ show that
$$ 1 = [J_0(x)]^{2} + 2[J_1(x)]^{2} + 2[J_2(x)]^{2} + \cdots $$
and therefore that $|J_0(x)| \le 1$ and $|J_n(x)| \le 1/\sqrt{2}$, $n = 1, 2, 3, \ldots$.
Hint. Use uniqueness of power series, Section 5.7.
11 R. P. Feynman, R. B. Leighton, and M. Sands, The Feynman Lectures on
Physics, Vol. II, Chap. 23. Reading, Mass.: Addison-Wesley (1964).
11.1.2 Using a generating function $g(x, t) = g(u + v, t) = g(u, t) \cdot g(v, t)$, show that
(a) $J_n(u + v) = \sum_{s=-\infty}^{\infty} J_s(u)\, J_{n-s}(v)$,
(b) $J_0(u + v) = J_0(u)J_0(v) + 2\sum_{s=1}^{\infty} J_s(u)\, J_{-s}(v)$.
These are addition theorems for the Bessel functions.
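11.1.3 Using only the generating function
$$ e^{(x/2)(t - 1/t)} = \sum_{n=-\infty}^{\infty} J_n(x)\, t^{n} $$
and not the explicit series form of $J_n(x)$, show that $J_n(x)$ has odd or even parity
according to whether $n$ is odd or even, that is,¹²
$$ J_n(x) = (-1)^{n} J_n(-x). $$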
11.1.4 Derive the Jacobi-Anger expansion
$$ e^{iz\cos\theta} = \sum_{m=-\infty}^{\infty} i^{m} J_m(z)\, e^{im\theta}. $$
This is an expansion of a plane wave in a series of cylindrical waves.
11.1.5 Show that
(a) $\cos x = J_0(x) + 2\sum_{n=1}^{\infty} (-1)^{n} J_{2n}(x)$,
(b) $\sin x = 2\sum_{n=1}^{\infty} (-1)^{n+1} J_{2n-1}(x)$.
11.1.6 To help remove the generating function from the realm of magic, show that it
can be derived from the recurrence relation, Eq. 11.10.
Hint. 1. Assume a generating function of the form
$$ g(x, t) = \sum_{m=-\infty}^{\infty} J_m(x)\, t^{m}. $$
2. Multiply Eq. 11.10 by $t^n$ and sum over $n$.
3. Rewrite the preceding result as
$$ \left(t + \frac{1}{t}\right) g(x, t) = \frac{2t}{x}\, \frac{\partial g(x, t)}{\partial t}. $$
4. Integrate and adjust the function of integration (a function of $x$) so
that the coefficient of $t^0$ is $J_0(x)$ as given by Eq. 11.5.
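11.1.7 Show, by direct differentiation, that
$$ J_\nu(x) = \sum_{s=0}^{\infty} \frac{(-1)^{s}}{s!\,(s+\nu)!} \left(\frac{x}{2}\right)^{\nu+2s} $$
satisfies the two recurrence relations
$$ J_{\nu-1}(x) + J_{\nu+1}(x) = \frac{2\nu}{x} J_\nu(x), $$
$$ J_{\nu-1}(x) - J_{\nu+1}(x) = 2J'_\nu(x), $$
and Bessel's differential equation
$$ x^{2} J''_\nu(x) + x J'_\nu(x) + (x^{2} - \nu^{2}) J_\nu(x) = 0. $$
11.1.8 Prove that
(a) $\dfrac{\sin x}{x} = \displaystyle\int_0^{\pi/2} J_0(x\cos\theta)\cos\theta\, d\theta$,
(b) $\dfrac{1 - \cos x}{x} = \displaystyle\int_0^{\pi/2} J_1(x\cos\theta)\, d\theta$.
Hint. The definite integral
$$ \int_0^{\pi/2} \cos^{2s+1}\theta\, d\theta = \frac{2 \cdot 4 \cdot 6 \cdots (2s)}{1 \cdot 3 \cdot 5 \cdots (2s+1)} $$
may be useful.
12 This is easily seen from the series form (Eq. 11.5).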
11.1.9 Show that
$$ J_0(x) = \frac{2}{\pi}\int_0^{1} \frac{\cos xt}{\sqrt{1 - t^{2}}}\, dt. $$
This integral is a Fourier cosine transform (compare Section 15.3). The
corresponding Fourier sine transform,
$$ J_0(x) = \frac{2}{\pi}\int_1^{\infty} \frac{\sin xt}{\sqrt{t^{2} - 1}}\, dt, $$
is established in Section 11.4, using a Hankel function integral representation.
11.1.10 Derive
Hint. Try mathematical induction.
11.1.11 Show that between any two consecutive zeros of Jn(x) there is one and only one
zero of Jn+1(x).
Hint. Equations 11.15 and 11.17 may be useful.
11.1.12 An analysis of antenna radiation patterns for a system with a circular aperture
involves the equation
$$ g(u) = \int_0^{1} f(r)\, J_0(ur)\, r\, dr. $$
If $f(r) = 1 - r^2$, show that
$$ g(u) = \frac{2}{u^{2}} J_2(u). $$
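11.1.13 The differential cross section in a nuclear scattering experiment is given by
$d\sigma/d\Omega = |f(\theta)|^2$. An approximate treatment leads to
$$ f(\theta) = \frac{-ik}{2\pi}\int_0^{2\pi}\int_0^{R} \exp[ik\rho\sin\theta\sin\varphi]\, \rho\, d\rho\, d\varphi. $$
Here $\theta$ is an angle through which the scattered particle is scattered. $R$ is the
nuclear radius. Show that
$$ \frac{d\sigma}{d\Omega} = (\pi R^{2})\, \frac{1}{\pi}\left[\frac{J_1(kR\sin\theta)}{\sin\theta}\right]^{2}. $$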
11.1.14 A set of functions $C_n(x)$ satisfies the recurrence relations
(a) What linear second-order differential equation does the $C_n(x)$ satisfy?
(b) By a change of variable transform your differential equation into Bessel's
equation. This suggests that Cn(x) may be expressed in terms of Bessel
functions of transformed argument.
11.1.15 A particle (mass $m$) is contained in a right circular cylinder (pillbox) of radius $R$
and height $H$. The particle is described by a wave function satisfying the
Schrödinger wave equation
$$ -\frac{\hbar^{2}}{2m}\nabla^{2}\psi(\rho, \varphi, z) = E\,\psi(\rho, \varphi, z) $$
and the condition that the wave function go to zero over the surface of the
pillbox. Find the lowest (zero point) permitted energy.
$$ E = \frac{\hbar^{2}}{2m}\left[\left(\frac{z_{pq}}{R}\right)^{2} + \left(\frac{n\pi}{H}\right)^{2}\right], $$
where $z_{pq}$ is the $q$th zero of $J_p$, the index $p$ fixed by the azimuthal dependence.
11.1.16 (a) Show by direct differentiation and substitution that
2ni)c
or that the equivalent equation
satisfies Bessel's equation. $C$ is the contour shown in Fig. 11.5. The negative real
axis is the cut line.
Hint. Show that the total integrand (after substituting in Bessel's differential
equation) may be written as a total derivative:
d\
JtTV
(b) Show that the first integral (with n an integer) may be transformed into
1 [2n
2nJo
FIG. 11.5 Bessel function contour
11.1.17 The contour $C$ in Exercise 11.1.16 is deformed to the path $-\infty$ to $-1$, unit
circle $e^{-i\pi}$ to $e^{i\pi}$, and finally $-1$ to $-\infty$. Show that
$$ J_\nu(x) = \frac{1}{\pi}\int_0^{\pi} \cos(\nu\theta - x\sin\theta)\, d\theta - \frac{\sin\nu\pi}{\pi}\int_0^{\infty} e^{-\nu\theta - x\sinh\theta}\, d\theta. $$
This is Bessel's integral.
Hint. The negative values of the variable of integration $u$ may be handled by
using
$$ u = te^{\pm i\pi}. $$
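11.1.18 (a) Show that
$$ J_\nu(x) = \frac{2}{\pi^{1/2}(\nu - \frac{1}{2})!}\left(\frac{x}{2}\right)^{\nu} \int_0^{\pi/2} \cos(x\sin\theta)\cos^{2\nu}\theta\, d\theta, $$
where $\nu > -\frac{1}{2}$.
Hint. Here is a chance to use series expansion and term-by-term integration.
The formulas of Section 10.4 will prove useful.
(b) Transform the integral in part (a) into
$$ J_\nu(x) = \frac{1}{\pi^{1/2}(\nu - \frac{1}{2})!}\left(\frac{x}{2}\right)^{\nu} \int_0^{\pi} \cos(x\cos\theta)\sin^{2\nu}\theta\, d\theta $$
$$ \phantom{J_\nu(x)} = \frac{1}{\pi^{1/2}(\nu - \frac{1}{2})!}\left(\frac{x}{2}\right)^{\nu} \int_0^{\pi} e^{\pm ix\cos\theta}\sin^{2\nu}\theta\, d\theta $$
$$ \phantom{J_\nu(x)} = \frac{1}{\pi^{1/2}(\nu - \frac{1}{2})!}\left(\frac{x}{2}\right)^{\nu} \int_{-1}^{1} e^{\pm ipx}(1 - p^{2})^{\nu - 1/2}\, dp. $$
These are alternate integral representations of $J_\nu(x)$.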
11.1.19 (a) From
derive the recurrence relation
v
t.
(b) From
j(x)
2ni
derive the recurrence relation
11.1.20 Show that the recurrence relation
follows directly from differentiation of
$$ J_n(x) = \frac{1}{\pi}\int_0^{\pi} \cos(n\theta - x\sin\theta)\, d\theta. $$
11.1.21 Evaluate
$$ \int_0^{\infty} e^{-ax} J_0(bx)\, dx, \quad a, b > 0. $$
Actually the results hold for $a \ge 0$, $-\infty < b < \infty$. This is a Laplace transform
of $J_0$.
Hint. Either an integral representation of $J_0$ or a series expansion will be
helpful.
11.1.22 Using trigonometric forms, verify that
$$ J_0(br) = \frac{1}{2\pi}\int_0^{2\pi} e^{ibr\sin\theta}\, d\theta. $$
11.1.23 (a) Plot the intensity ($\Phi^2$ of Eq. 11.35) as a function of $(\sin\alpha/\lambda)$ along a diameter
of the circular diffraction pattern. Locate the first two minima.
(b) What fraction of the total light intensity falls within the central maximum?
Hint. $[J_1(x)]^2/x$ may be written as a derivative and the area integral of the
intensity integrated by inspection.
11.1.24 The fraction of light incident on a circular aperture (normal incidence) that is
transmitted is given by
C2ka dx 1 f2ka
T2l JA}l
Here $a$ is the radius of the aperture, and $k$ is the wave number, $2\pi/\lambda$. Show that
(a) $T = 1 - \dfrac{1}{ka}\displaystyle\sum_{n=0}^{\infty} J_{2n+1}(2ka)$,
(b) $T = 1 - \dfrac{1}{2ka}\displaystyle\int_0^{2ka} J_0(x)\, dx$.
11.1.25 The amplitude $U(\rho, \varphi, t)$ of a vibrating circular membrane of radius $a$ satisfies
the wave equation
$$ \nabla^{2}U - \frac{1}{v^{2}}\frac{\partial^{2}U}{\partial t^{2}} = 0. $$
Here $v$ is the phase velocity of the wave fixed by the elastic constants and
whatever damping is imposed.
(a) Show that a solution is
$$ U(\rho, \varphi, t) = J_m(k\rho)\left(a_1 e^{im\varphi} + a_2 e^{-im\varphi}\right)\left(b_1 e^{i\omega t} + b_2 e^{-i\omega t}\right). $$
(b) From the Dirichlet boundary condition, $J_m(ka) = 0$, find the allowable
values of the wavelength $\lambda$ ($k = 2\pi/\lambda$).
Note. There are other Bessel functions besides $J_n$, but they all diverge at $\rho = 0$.
This is shown explicitly in Section 11.3. The divergent behavior is actually
implicit in Eq. 11.6.
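11.1.26 Example 11.1.2 describes the TM modes of electromagnetic cavity oscillation.
The transverse electric (TE) modes differ in that we work from the z component
of the magnetic induction $\mathbf{B}$:
$$ \nabla^{2}B_z + \alpha^{2}B_z = 0 $$
with boundary conditions
$$ B_z(0) = B_z(l) = 0 \quad \text{and} \quad \left.\frac{\partial B_z}{\partial\rho}\right|_{\rho=a} = 0. $$
Show that the TE resonant frequencies are given by
$$ \omega_{mnp} = c\sqrt{\frac{\beta_{mn}^{2}}{a^{2}} + \frac{p^{2}\pi^{2}}{l^{2}}}, \quad p = 1, 2, 3, \ldots, $$
where $\beta_{mn}$ is the $n$th zero of $J'_m$.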
11.1.27 Plot the three lowest TM and the three lowest TE angular resonant frequencies,
$\omega_{mnp}$, as a function of the radius/length ($a/l$) ratio for $0 \le a/l \le 1.5$.
Hint. Try plotting $\omega^2$ (in units of $c^2/a^2$) versus $(a/l)^2$. Why this choice?
11.1.28 A thin conducting disk of radius $a$ carries a charge $q$. Show that the potential
is described by
$$ \varphi(r, z) = \frac{q}{4\pi\varepsilon_0 a}\int_0^{\infty} e^{-k|z|} J_0(kr)\, \frac{\sin ka}{k}\, dk, $$
where $J_0$ is the usual Bessel function and $r$ and $z$ are the familiar cylindrical
coordinates.
Note. This is a difficult problem. One approach is through Fourier transforms
such as Exercise 15.3.11. For a discussion of the physical problem see Jackson
(Classical Electrodynamics).
11.1.29 Show that
$$ \int_0^{a} x^{m} J_n(x)\, dx, \quad m \ge n \ge 0, $$
(a) is integrable in terms of Bessel functions and powers of $x$ [such as $a^p J_q(a)$]
for $m + n$ odd;
(b) may be reduced to integrated terms plus $\int_0^{a} J_0(x)\, dx$ for $m + n$ even.
11.1.30 Show that
$$ \int_0^{\alpha_{0n}} \left(1 - \frac{y}{\alpha_{0n}}\right) J_0(y)\, y\, dy = \frac{1}{\alpha_{0n}}\int_0^{\alpha_{0n}} J_0(y)\, dy. $$
Here $\alpha_{0n}$ is the $n$th root of $J_0(y)$. This relation is useful in computation (Exercise
11.2.11). The expression on the right is easier and quicker to evaluate, and
much more accurate. Taking the difference of two terms in the expression on
the left leads to a large relative error.
11.1.31 Write a program that will compute successive roots of the Bessel function
$J_n(x)$, that is, $\alpha_{ns}$, where $J_n(\alpha_{ns}) = 0$. Tabulate the first five roots of $J_0$, $J_1$, and
of $J_2$.
Hint. See Appendix 1 for root-finding techniques and recommendations.
Check value. $\alpha_{12} = 7.01559$.
11.1.32 The circular aperture diffraction amplitude $\Phi$ of Eq. 11.35 is proportional
to $f(z) = J_1(z)/z$. The corresponding single slit diffraction amplitude is
proportional to $g(z) = \sin z/z$.
(a) Calculate and plot $f(z)$ and $g(z)$ for $z = 0.0(0.2)12.0$.
(b) Locate the two lowest values of $z$ ($z > 0$) for which $f(z)$ takes on an extreme
value. Calculate the corresponding values of $f(z)$.
(c) Locate the two lowest values of $z$ ($z > 0$) for which $g(z)$ takes on an extreme
value. Calculate the corresponding values of $g(z)$.
11.1.33 Calculate the electrostatic potential of a charged disk $\varphi(r, z)/(q/4\pi\varepsilon_0 a)$ from
the integral form of Exercise 11.1.28. Calculate the potential for $r/a = 0.0(0.5)2.0$
and $z/a = 0.25(0.25)1.25$. Why is $z/a = 0$ omitted? Exercise 12.3.17 is a spherical
harmonic version of this same problem.
Hint. Try a Gauss-Laguerre quadrature, Appendix 2.
11.2 ORTHOGONALITY
If Bessel's equation, Eq. 11.22a, is divided by x, we see that it becomes self-
adjoint, and therefore by the Sturm-Liouville theory, Section 9.2, the solutions
are expected to be orthogonal—if we can arrange to have appropriate boundary
conditions satisfied. To take care of the boundary conditions, for a finite interval
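$[0, a]$, we introduce parameters $a$ and $\alpha_{\nu m}$ into the argument of $J_\nu$ to get
$J_\nu(\alpha_{\nu m}\rho/a)$. Here $a$ is the upper limit of the cylindrical radial coordinate $\rho$.
From Eq. 11.22a
$$ \rho\frac{d^{2}}{d\rho^{2}} J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right) + \frac{d}{d\rho} J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right) + \left(\frac{\alpha_{\nu m}^{2}\rho}{a^{2}} - \frac{\nu^{2}}{\rho}\right) J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right) = 0. \tag{11.45} $$
Changing the parameter $\alpha_{\nu m}$ to $\alpha_{\nu n}$, we find that $J_\nu(\alpha_{\nu n}\rho/a)$ satisfies
$$ \rho\frac{d^{2}}{d\rho^{2}} J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right) + \frac{d}{d\rho} J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right) + \left(\frac{\alpha_{\nu n}^{2}\rho}{a^{2}} - \frac{\nu^{2}}{\rho}\right) J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right) = 0. \tag{11.45a} $$
Proceeding as in Section 9.2, we multiply Eq. 11.45 by $J_\nu(\alpha_{\nu n}\rho/a)$ and Eq. 11.45a
by $J_\nu(\alpha_{\nu m}\rho/a)$ and subtract, obtaining
$$ J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right)\frac{d}{d\rho}\left[\rho\frac{d}{d\rho} J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right)\right] - J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right)\frac{d}{d\rho}\left[\rho\frac{d}{d\rho} J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right)\right] = \frac{\alpha_{\nu n}^{2} - \alpha_{\nu m}^{2}}{a^{2}}\, \rho\, J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right) J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right). \tag{11.46} $$
Integrating from $\rho = 0$ to $\rho = a$, we obtain
$$ \int_0^{a} J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right)\frac{d}{d\rho}\left[\rho\frac{d}{d\rho} J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right)\right] d\rho - \int_0^{a} J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right)\frac{d}{d\rho}\left[\rho\frac{d}{d\rho} J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right)\right] d\rho = \frac{\alpha_{\nu n}^{2} - \alpha_{\nu m}^{2}}{a^{2}}\int_0^{a} J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right) J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right) \rho\, d\rho. \tag{11.47} $$
Upon integrating by parts, we see that the left-hand side of Eq. 11.47 becomes
$$ \left[\rho\, J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right)\frac{d}{d\rho} J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right)\right]_0^{a} - \left[\rho\, J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right)\frac{d}{d\rho} J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right)\right]_0^{a}. \tag{11.48} $$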
For $\nu \ge 0$ the factor $\rho$ guarantees a zero at the lower limit, $\rho = 0$. Actually the
lower limit on the index $\nu$ may be extended down to $\nu > -1$, Exercise 11.2.4.¹
At $\rho = a$, each expression vanishes if we choose the parameters $\alpha_{\nu n}$ and $\alpha_{\nu m}$
to be zeros or roots of $J_\nu$; that is, $J_\nu(\alpha_{\nu m}) = 0$. The subscripts now become
meaningful: $\alpha_{\nu m}$ is the $m$th zero of $J_\nu$.
With this choice of parameters, the left-hand side vanishes (the Sturm-
Liouville boundary conditions are satisfied) and for $m \ne n$
$$ \int_0^{a} J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right) J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right) \rho\, d\rho = 0. \tag{11.49} $$
This gives us orthogonality over the interval [0, a].
Normalization
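The normalization integral may be developed by returning to Eq. 11.48,
setting $\alpha_{\nu n} = \alpha_{\nu m} + \varepsilon$, and taking the limit $\varepsilon \to 0$ (compare Exercise 11.2.2). With
the aid of the recurrence relation, Eq. 11.16, the result may be written as
$$ \int_0^{a} \left[J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right)\right]^{2} \rho\, d\rho = \frac{a^{2}}{2}\left[J_{\nu+1}(\alpha_{\nu m})\right]^{2}. \tag{11.50} $$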
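Both the orthogonality integral, Eq. 11.49, and the normalization integral, Eq. 11.50, can be checked numerically for a simple case. The sketch below takes ν = 0 and a = 1 and uses the first two zeros of J_0 from Table 11.1; the helper names `bessel_j` and `simpson`, the use of Simpson's rule, and the number of panels are assumptions made for this illustration.

```python
import math

def bessel_j(n, x, terms=60):
    """Integer-order J_n(x), n >= 0, from the series of Eq. 11.5 (moderate x only)."""
    return sum((-1) ** s / (math.factorial(s) * math.factorial(n + s))
               * (x / 2.0) ** (n + 2 * s) for s in range(terms))

def simpson(f, a, b, panels=200):
    """Composite Simpson's rule on [a, b]; panels must be even."""
    h = (b - a) / panels
    total = f(a) + f(b)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

ALPHA_01, ALPHA_02 = 2.4048, 5.5201  # first two zeros of J_0 (Table 11.1)

# Eq. 11.49 with nu = 0, a = 1: should be close to zero.
overlap = simpson(lambda r: bessel_j(0, ALPHA_01 * r) * bessel_j(0, ALPHA_02 * r) * r, 0.0, 1.0)
# Eq. 11.50 with nu = 0, a = 1: should equal (1/2) [J_1(alpha_01)]^2.
norm = simpson(lambda r: bessel_j(0, ALPHA_01 * r) ** 2 * r, 0.0, 1.0)
expected = 0.5 * bessel_j(1, ALPHA_01) ** 2

print(overlap, norm, expected)
```

The overlap is not exactly zero because the tabulated zeros are rounded to four decimal places.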
Bessel Series
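If we assume that the set of Bessel functions $J_\nu(\alpha_{\nu m}\rho/a)$ ($\nu$ fixed, $m = 1, 2, 3, \ldots$)
is complete, then any well-behaved but otherwise arbitrary function $f(\rho)$ may
be expanded in a Bessel series (Bessel-Fourier or Fourier-Bessel)
$$ f(\rho) = \sum_{m=1}^{\infty} c_{\nu m}\, J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right), \quad 0 \le \rho \le a, \quad \nu > -1. \tag{11.51} $$
The coefficients $c_{\nu m}$ are determined by using Eq. 11.50,
$$ c_{\nu m} = \frac{2}{a^{2}\left[J_{\nu+1}(\alpha_{\nu m})\right]^{2}}\int_0^{a} f(\rho)\, J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right) \rho\, d\rho. \tag{11.52} $$
A similar series expansion involving $J_\nu(\beta_{\nu m}\rho/a)$ with $(d/d\rho)J_\nu(\beta_{\nu m}\rho/a)|_{\rho=a} = 0$
is included in Exercises 11.2.3 and 11.2.6(b).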
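To see Eqs. 11.51 and 11.52 at work, the sketch below expands a sample function in a five-term Bessel series with ν = 0 and a = 1. The test function f(ρ) = 1 − ρ², the helper names, and the use of Simpson's rule are all assumptions chosen for illustration; the zeros are taken from Table 11.1.

```python
import math

def bessel_j(n, x, terms=60):
    """Integer-order J_n(x), n >= 0, from the series of Eq. 11.5 (moderate x only)."""
    return sum((-1) ** s / (math.factorial(s) * math.factorial(n + s))
               * (x / 2.0) ** (n + 2 * s) for s in range(terms))

def simpson(f, a, b, panels=200):
    """Composite Simpson's rule on [a, b]; panels must be even."""
    h = (b - a) / panels
    total = f(a) + f(b)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

ZEROS_J0 = [2.4048, 5.5201, 8.6537, 11.7915, 14.9309]  # Table 11.1

def f(rho):
    return 1.0 - rho * rho  # sample function, vanishing at rho = a = 1

def coefficient(alpha):
    """c_0m of Eq. 11.52 with nu = 0 and a = 1."""
    integral = simpson(lambda r: f(r) * bessel_j(0, alpha * r) * r, 0.0, 1.0)
    return 2.0 * integral / bessel_j(1, alpha) ** 2

coeffs = [coefficient(alpha) for alpha in ZEROS_J0]

def partial_sum(rho):
    """Five-term truncation of the Bessel series, Eq. 11.51."""
    return sum(c * bessel_j(0, alpha * rho) for c, alpha in zip(coeffs, ZEROS_J0))

print(coeffs)
print(partial_sum(0.5), f(0.5))
```

Because this f(ρ) vanishes at the boundary, the coefficients fall off quickly and a few terms already reproduce the function well.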
EXAMPLE 11.2.1 Electrostatic Potential in a Hollow Cylinder
From Table 8.2 of Section 8.3 (with $\alpha$ replaced by $k$) our solution of Laplace's
equation in circular cylindrical coordinates is a linear combination of
$$ \psi_{km}(\rho, \varphi, z) = P_{km}(\rho)\,\Phi_m(\varphi)\,Z_k(z) = J_m(k\rho) \cdot [a_m\sin m\varphi + b_m\cos m\varphi] \cdot [c_1 e^{kz} + c_2 e^{-kz}]. \tag{11.53} $$
The particular linear combination is determined by the boundary conditions
to be satisfied.
Our cylinder here has a radius $a$ and a height $l$. The top end section has a
potential distribution $\psi(\rho, \varphi)$. Elsewhere on the surface the potential is zero.²
The problem is to find the electrostatic potential
$$ \psi(\rho, \varphi, z) = \sum_{k,m} \psi_{km}(\rho, \varphi, z) \tag{11.54} $$
everywhere in the interior.
For convenience, the circular cylindrical coordinates are placed as shown
in Fig. 11.4. Since $\psi(\rho, \varphi, 0) = 0$, we take $c_1 = -c_2 = \frac{1}{2}$. The $z$ dependence
becomes $\sinh kz$, vanishing at $z = 0$. The requirement that $\psi = 0$ on the
cylindrical sides is met by requiring the separation constant $k$ to be
$$ k = k_{mn} = \alpha_{mn}/a, \tag{11.55} $$
where the first subscript $m$ gives the index of the Bessel function, whereas the
second subscript identifies the particular zero of $J_m$.
The electrostatic potential becomes
$$ \psi(\rho, \varphi, z) = \sum_{m=0}^{\infty}\sum_{n=1}^{\infty} J_m\!\left(\alpha_{mn}\frac{\rho}{a}\right) \cdot [a_{mn}\sin m\varphi + b_{mn}\cos m\varphi] \cdot \sinh\!\left(\alpha_{mn}\frac{z}{a}\right). \tag{11.56} $$
Equation 11.56 is a double series: a Bessel series in $\rho$ and a Fourier series in $\varphi$.
At $z = l$, $\psi = \psi(\rho, \varphi)$, a known function of $\rho$ and $\varphi$. Therefore
$$ \psi(\rho, \varphi) = \sum_{m=0}^{\infty}\sum_{n=1}^{\infty} J_m\!\left(\alpha_{mn}\frac{\rho}{a}\right) \cdot [a_{mn}\sin m\varphi + b_{mn}\cos m\varphi] \cdot \sinh\!\left(\alpha_{mn}\frac{l}{a}\right). \tag{11.57} $$
The constants $a_{mn}$ and $b_{mn}$ are evaluated by using Eqs. 11.49 and 11.50 and the
corresponding equations for $\sin\varphi$ and $\cos\varphi$ (Example 9.2.1 and Eqs. 14.7 to
14.9). We find³
$$ \left.\begin{matrix} a_{mn} \\ b_{mn} \end{matrix}\right\} = 2\left[\pi a^{2}\sinh\!\left(\alpha_{mn}\frac{l}{a}\right) J_{m+1}^{2}(\alpha_{mn})\right]^{-1} \int_0^{2\pi}\int_0^{a} \psi(\rho, \varphi)\, J_m\!\left(\alpha_{mn}\frac{\rho}{a}\right) \left\{\begin{matrix} \sin m\varphi \\ \cos m\varphi \end{matrix}\right\} \rho\, d\rho\, d\varphi. \tag{11.58} $$
These are definite integrals, that is, numbers. Substituting back into Eq. 11.56
the series is specified and the potential $\psi(\rho, \varphi, z)$ is determined. The problem is
solved.
1 The case $\nu = -1$ reverts to $\nu = +1$, Eq. 11.8.
Continuum Form
The Bessel series, Eq. 11.51, and Exercise 11.2.6 apply to expansions over
the finite interval $[0, a]$. If $a \to \infty$, then the series forms may be expected to go
over into integrals. The discrete roots $\alpha_{\nu m}$ become a continuous variable $\alpha$.
A similar situation is encountered in the Fourier series, Section 14.2. The
development of the Bessel integral from the Bessel series is left as Exercise
11.2.8.
2 If $\psi = 0$ at $z = 0, l$, but $\psi \ne 0$ for $\rho = a$, the modified Bessel functions,
Section 11.5, are involved.
3 If m = 0, the factor 2 is omitted (compare Eq. 14.8).
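For operations with a continuum of Bessel functions, $J_\nu(\alpha\rho)$, a key relation
is the Bessel function closure equation
$$ \int_0^{\infty} J_\nu(\alpha\rho)\, J_\nu(\alpha'\rho)\, \rho\, d\rho = \frac{1}{\alpha}\,\delta(\alpha - \alpha'), \quad \nu > -\tfrac{1}{2}. \tag{11.59} $$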
This may be proved by the use of Hankel transforms, Section 15.1. An alternate
approach, starting from a relation similar to Eq. 9.82, is given by Morse and
Feshbach, Section 6.3.
A second kind of orthogonality (varying the index) is developed for spherical
Bessel functions in Section 11.7.
EXERCISES
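11.2.1 (a) Show that
$$ (a^{2} - b^{2})\int_0^{P} J_\nu(ax)\, J_\nu(bx)\, x\, dx = P\left[b J_\nu(aP) J'_\nu(bP) - a J'_\nu(aP) J_\nu(bP)\right], $$
with
$$ J'_\nu(aP) = \left.\frac{d}{d(ax)} J_\nu(ax)\right|_{x=P}. $$
(b) $$ \int_0^{P} [J_\nu(ax)]^{2}\, x\, dx = \frac{P^{2}}{2}\left\{[J'_\nu(aP)]^{2} + \left(1 - \frac{\nu^{2}}{a^{2}P^{2}}\right)[J_\nu(aP)]^{2}\right\}, \quad \nu > -1. $$
These two integrals are usually called the first and second Lommel integrals.
Hint. We have the development of the orthogonality of the Bessel functions as
an analogy.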
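11.2.2 Show that
$$ \int_0^{a} \left[J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right)\right]^{2} \rho\, d\rho = \frac{a^{2}}{2}\left[J_{\nu+1}(\alpha_{\nu m})\right]^{2}, \quad \nu > -1. $$
Here $\alpha_{\nu m}$ is the $m$th zero of $J_\nu$.
Hint. With $\alpha_{\nu n} = \alpha_{\nu m} + \varepsilon$, expand $J_\nu[(\alpha_{\nu m} + \varepsilon)\rho/a]$ about $\alpha_{\nu m}\rho/a$ by a Taylor
expansion.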
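11.2.3 (a) If $\beta_{\nu m}$ is the $m$th zero of $(d/d\rho)J_\nu(\beta_{\nu m}\rho/a)$, show that the Bessel functions
are orthogonal over the interval $[0, a]$ with an orthogonality integral
$$ \int_0^{a} J_\nu\!\left(\beta_{\nu m}\frac{\rho}{a}\right) J_\nu\!\left(\beta_{\nu n}\frac{\rho}{a}\right) \rho\, d\rho = 0, \quad m \ne n, \quad \nu > -1. $$
(b) Derive the corresponding normalization integral ($m = n$).
$$ \text{ANS.} \quad \frac{a^{2}}{2}\left(1 - \frac{\nu^{2}}{\beta_{\nu m}^{2}}\right)\left[J_\nu(\beta_{\nu m})\right]^{2}, \quad \nu > -1. $$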
11.2.4 Verify that the orthogonality equation, Eq. 11.49, and the normalization
equation, Eq. 11.50, hold for $\nu > -1$.
Hint. Using power-series expansions, examine the behavior of Eq. 11.48 as
$\rho \to 0$.
11.2.5 From Eq. 11.49 develop a proof that $J_\nu(z)$, $\nu > -1$, has no complex roots.
Hint.
(a) Use the series form of $J_\nu(z)$ to exclude pure imaginary roots.
(b) Assume $\alpha_{\nu m}$ to be complex and take $\alpha_{\nu n}$ to be $\alpha_{\nu m}^{*}$.
11.2.6 (a) In the series expansion
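$$ f(\rho) = \sum_{m=1}^{\infty} c_{\nu m}\, J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right), \quad 0 \le \rho \le a, \quad \nu > -1, $$
with $J_\nu(\alpha_{\nu m}) = 0$, show that the coefficients are given by
$$ c_{\nu m} = \frac{2}{a^{2}\left[J_{\nu+1}(\alpha_{\nu m})\right]^{2}}\int_0^{a} f(\rho)\, J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right) \rho\, d\rho. $$
(b) In the series expansion
$$ f(\rho) = \sum_{m=1}^{\infty} d_{\nu m}\, J_\nu\!\left(\beta_{\nu m}\frac{\rho}{a}\right), \quad 0 \le \rho \le a, \quad \nu > -1, $$
with $(d/d\rho)J_\nu(\beta_{\nu m}\rho/a)|_{\rho=a} = 0$, show that the coefficients are given by
$$ d_{\nu m} = \frac{2}{a^{2}\left(1 - \nu^{2}/\beta_{\nu m}^{2}\right)\left[J_\nu(\beta_{\nu m})\right]^{2}}\int_0^{a} f(\rho)\, J_\nu\!\left(\beta_{\nu m}\frac{\rho}{a}\right) \rho\, d\rho. $$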
11.2.7 A right circular cylinder has an electrostatic potential of $\psi(\rho, \varphi)$ on both ends.
The potential on the curved cylindrical surface is zero. Find the potential at
all interior points.
Hint. Choose your coordinate system and adjust your z dependence to exploit
the symmetry of your potential.
11.2.8 For the continuum case, show that Eqs. 11.51 and 11.52 are replaced by
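$$ f(\rho) = \int_0^{\infty} a(\alpha)\, J_\nu(\alpha\rho)\, d\alpha, $$
$$ a(\alpha) = \alpha\int_0^{\infty} f(\rho)\, J_\nu(\alpha\rho)\, \rho\, d\rho. $$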
Hint. The corresponding case for sines and cosines is worked out in Section 15.2.
These are Hankel transforms. A derivation for the special case v = 0 is the
topic of Exercise 15.1.1.
11.2.9 A function f(x) is expressed as a Bessel series:
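$$ f(x) = \sum_{n=1}^{\infty} a_n\, J_m(\alpha_{mn}x), $$
with $\alpha_{mn}$ the $n$th root of $J_m$. Prove the Parseval relation
$$ \int_0^{1} [f(x)]^{2}\, x\, dx = \frac{1}{2}\sum_{n=1}^{\infty} a_n^{2}\left[J_{m+1}(\alpha_{mn})\right]^{2}. $$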
11.2.10 Prove that
Hint. Expand xm in a Bessel series and apply the Parseval relation.
11.2.11 A right circular cylinder of length $l$ has a potential
$$ \psi(z = \pm l/2) = 100(1 - \rho/a), $$
where $a$ is the radius. The potential over the curved surface (side) is zero. Using
the Bessel series from Exercise 11.2.7, calculate the electrostatic potential for
$\rho/a = 0.0(0.2)1.0$ and $z/l = 0.0(0.1)0.5$. Take $a/l = 0.5$.
Hint. From Exercise 11.1.30 you have
10
Show that this equals
1
J0(y)dy.
Numerical evaluation of this latter form rather than the former is both faster
and more accurate.
Note. For $\rho/a = 0.0$ and $z/l = 0.5$ the convergence is slow, 20 terms giving
only 98.4 rather than 100.
Check value. For $\rho/a = 0.4$ and $z/l = 0.3$,
$\psi = 24.558$.
11.3 NEUMANN FUNCTIONS, BESSEL FUNCTIONS
OF THE SECOND KIND, Nv{x)
From the theory of differential equations it is known that Bessel's equation
has two independent solutions. Indeed, for nonintegral order v we have already
found two solutions and labeled them Jv(x) and J-V(x), using the infinite series
(Eq. 11.5). The trouble is that when v is integral Eq. 11.8 holds and we have but
one independent solution. A second solution may be developed by the methods
of Section 8.6. This yields a perfectly good second solution of Bessel's equation
but is not the usual standard form.
Definition
As an alternate approach, we take the particular linear combination of $J_\nu(x)$
and $J_{-\nu}(x)$
$$ N_\nu(x) = \frac{\cos\nu\pi\, J_\nu(x) - J_{-\nu}(x)}{\sin\nu\pi}. \tag{11.60} $$
This is the Neumann function (Fig. 11.6).¹ For nonintegral $\nu$, $N_\nu(x)$ clearly
satisfies Bessel's equation, for it is a linear combination of known solutions,
$J_\nu(x)$ and $J_{-\nu}(x)$. However, for integral $\nu$, $\nu = n$, Eq. 11.8 applies and Eq. 11.60
becomes indeterminate. The definition of $N_\nu(x)$ was chosen deliberately for this
indeterminate property. Evaluating $N_n(x)$ by l'Hospital's rule for indeterminate
forms, we obtain
¹ In AMS-55 and in most mathematics tables, this is labeled $Y_\nu(x)$.
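FIG. 11.6 Neumann functions, $N_0(x)$, $N_1(x)$, and $N_2(x)$
$$ N_n(x) = \left.\frac{(d/d\nu)\left[\cos\nu\pi\, J_\nu(x) - J_{-\nu}(x)\right]}{(d/d\nu)\sin\nu\pi}\right|_{\nu=n} $$
$$ = \left.\frac{-\pi\sin n\pi\, J_n(x) + \left[\cos n\pi\,\partial J_\nu/\partial\nu - \partial J_{-\nu}/\partial\nu\right]}{\pi\cos n\pi}\right|_{\nu=n} $$
$$ = \frac{1}{\pi}\left[\frac{\partial J_\nu(x)}{\partial\nu} - (-1)^n\frac{\partial J_{-\nu}(x)}{\partial\nu}\right]_{\nu=n}. \tag{11.61} $$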
Series Form
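A series expansion² gives the horrible result
$$ N_n(x) = \frac{2}{\pi} J_n(x)\ln\frac{x}{2} - \frac{1}{\pi}\sum_{r=0}^\infty \frac{(-1)^r}{r!\,(n+r)!}\left(\frac{x}{2}\right)^{n+2r}\left[F(r) + F(n+r)\right] - \frac{1}{\pi}\sum_{r=0}^{n-1}\frac{(n-r-1)!}{r!}\left(\frac{x}{2}\right)^{-n+2r}, \tag{11.62} $$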
which exhibits the logarithmic dependence that was to be expected. This,
of course, verifies the independence of Jn and Nn. F(r) is the digamma function
that arises from differentiating the factorials in the denominator of Jv(x)
(compare Section 10.2 and especially Eq. 10.39). Using the properties of the
digamma function, we rewrite Eq. 11.62 in the only slightly less horrible form
² Using $(d/d\nu)x^\nu = x^\nu \ln x$.
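$$ N_n(x) = \frac{2}{\pi} J_n(x)\left[\ln\frac{x}{2} + \gamma - \frac{1}{2}\sum_{p=1}^{n}\frac{1}{p}\right] - \frac{1}{\pi}\sum_{r=1}^\infty \frac{(-1)^r}{r!\,(n+r)!}\left(\frac{x}{2}\right)^{n+2r}\sum_{p=1}^{r}\left[\frac{1}{p} + \frac{1}{p+n}\right] - \frac{1}{\pi}\sum_{r=0}^{n-1}\frac{(n-r-1)!}{r!}\left(\frac{x}{2}\right)^{-n+2r}. \tag{11.63} $$
For $n = 0$ we have the limiting value
$$ N_0(x) = \frac{2}{\pi}(\ln x + \gamma - \ln 2) + O(x^2) \tag{11.64} $$
and for $\nu > 0$
$$ N_\nu(x) = -\frac{(\nu-1)!}{\pi}\left(\frac{2}{x}\right)^\nu + \cdots.^3 \tag{11.65} $$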
As with all the other Bessel functions, $N_\nu(x)$ has integral representations.
For $N_0(x)$ we have
$$ N_0(x) = -\frac{2}{\pi}\int_0^\infty \cos(x\cosh t)\, dt = -\frac{2}{\pi}\int_1^\infty \frac{\cos(xt)}{(t^2-1)^{1/2}}\, dt, \quad x > 0. \tag{11.65a} $$
These forms can be derived as the imaginary part of the Hankel representations
of Exercise 11.4.5. The latter form is a Fourier cosine transform.
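To verify that $N_\nu(x)$, our Neumann function (Fig. 11.6) or Bessel function
of the second kind, actually does satisfy Bessel's equation for integral $n$, we may
proceed as follows. Differentiating Bessel's equation for $J_{\pm\nu}(x)$ with respect to $\nu$,
we have
$$ x^2\frac{d^2}{dx^2}\left(\frac{\partial J_{\pm\nu}}{\partial\nu}\right) + x\frac{d}{dx}\left(\frac{\partial J_{\pm\nu}}{\partial\nu}\right) + (x^2 - \nu^2)\frac{\partial J_{\pm\nu}}{\partial\nu} = 2\nu J_{\pm\nu}. \tag{11.66} $$
Multiplying the equation for $J_{-\nu}$ by $(-1)^\nu$, subtracting from the equation for
$J_\nu$ (as suggested by Eq. 11.61), and taking the limit $\nu \to n$, we obtain
$$ x^2\frac{d^2}{dx^2}N_n + x\frac{d}{dx}N_n + (x^2 - n^2)N_n = \frac{2n}{\pi}\left[J_n - (-1)^n J_{-n}\right]. \tag{11.67} $$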
For $\nu = n$, an integer, the right-hand side vanishes by Eq. 11.8 and $N_n(x)$ is
seen to be a solution of Bessel's equation. The most general solution for any $\nu$
can therefore be written as
$$ y(x) = AJ_\nu(x) + BN_\nu(x). \tag{11.68} $$
It is seen from Eq. 11.62 that Nn diverges at least logarithmically. Any boundary
condition that requires the solution to be finite at the origin [as in our vibrating
circular membrane (Section 11.1)] automatically excludes Nn(x). Conversely,
in the absence of such a requirement Nn(x) must be considered.
3 Note that this limiting form applies to both integral and nonintegral values
of the index v.
To a certain extent the definition of the Neumann function Nn(x) is arbitrary.
Equation 11.63 contains terms of the form anJn(x). Clearly, any finite value of
the constant an would still give us a second solution of Bessel's equation. Why
should an have the particular value shown in Eq. 11.63? The answer involves
the asymptotic dependence developed in Section 11.6. If Jn corresponds to a
cosine wave, then Nn corresponds to a sine wave. This simple and convenient
asymptotic phase relationship is a consequence of the particular admixture of
Jn in Nn.
Recurrence Relations
Substituting Eq. 11.60 for Nv(x) (nonintegral v) or Eq. 11.61 (integral v)
into the recurrence relations (Eqs. 11.10 and 11.12) for Jn(x), we see immediately
that Nv(x) satisfies these same recurrence relations. This actually constitutes
another proof that Nv is a solution. Note carefully that the converse is not
necessarily true. All solutions need not satisfy the same recurrence relations.
An example of this sort of trouble appears in Section 11.5.
Wronskian Formulas
From Section 8.6 and Exercise 9.1.4 we have the Wronskian formula⁴
for solutions of the Bessel equation
$$ u_\nu(x)v_\nu'(x) - u_\nu'(x)v_\nu(x) = \frac{A_\nu}{x}, \tag{11.69} $$
in which $A_\nu$ is a parameter that depends on the particular Bessel functions
$u_\nu(x)$ and $v_\nu(x)$ being considered. It is a constant in the sense that it is independent
of $x$. Consider the special case
$$ u_\nu(x) = J_\nu(x), \quad v_\nu(x) = J_{-\nu}(x), \tag{11.70} $$
$$ J_\nu J_{-\nu}' - J_\nu' J_{-\nu} = \frac{A_\nu}{x}. \tag{11.71} $$
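Since $A_\nu$ is a constant, it may be identified at any convenient point such as
$x = 0$. Using the first terms in the series expansions (Eqs. 11.5 and 11.6), we
obtain
$$ J_\nu \to \frac{x^\nu}{2^\nu \nu!}, \quad J_{-\nu} \to \frac{2^\nu x^{-\nu}}{(-\nu)!}, \quad J_\nu' \to \frac{\nu x^{\nu-1}}{2^\nu \nu!}, \quad J_{-\nu}' \to \frac{-\nu\, 2^\nu x^{-\nu-1}}{(-\nu)!}. \tag{11.72} $$
Substitution into Eq. 11.69 yields
$$ J_\nu(x)J_{-\nu}'(x) - J_\nu'(x)J_{-\nu}(x) = \frac{-2\nu}{x\,\nu!\,(-\nu)!} = -\frac{2\sin\nu\pi}{\pi x}, \tag{11.73} $$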
4This result depends on P(x) of Section 8.5 being equal to p'(x)/p(x), the
corresponding coefficient of the self-adjoint form of Section 9.1.
using Eq. 10.32, we have
$$ \nu!\,(-\nu)! = \frac{\pi\nu}{\sin\pi\nu}. $$
Note that Av vanishes for integral v, as it must, since the nonvanishing of the
Wronskian is a test of the independence of the two solutions. By Eq. 11.73
Jn and J_n are clearly linearly dependent.
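Using our recurrence relations, we may readily develop a large number of
alternate forms, among which are
$$ J_\nu J_{-\nu+1} + J_{-\nu} J_{\nu-1} = \frac{2\sin\nu\pi}{\pi x}, \tag{11.74} $$
$$ J_\nu J_{-\nu-1} + J_{-\nu} J_{\nu+1} = -\frac{2\sin\nu\pi}{\pi x}, \tag{11.75} $$
$$ J_\nu N_\nu' - J_\nu' N_\nu = \frac{2}{\pi x}, \tag{11.76} $$
$$ J_\nu N_{\nu+1} - J_{\nu+1} N_\nu = -\frac{2}{\pi x}. \tag{11.77} $$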
Many more will be found in the references given.
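As a numerical sanity check of the Wronskian-derived relation $J_\nu N_{\nu+1} - J_{\nu+1} N_\nu = -2/(\pi x)$ (Eq. 11.77), the following Python sketch evaluates $J_\nu$ from its power series and $N_\nu$ from the defining linear combination of $J_\nu$ and $J_{-\nu}$ for a nonintegral order. The function names `bessel_j` and `neumann`, the 40-term truncation, and the sample point $\nu = 0.3$, $x = 2$ are illustrative assumptions, not part of the text; the truncated series is adequate only for moderate $x$.

```python
import math

def bessel_j(nu, x, terms=40):
    """Truncated power series for J_nu(x); nu must not be a negative integer."""
    total = 0.0
    for s in range(terms):
        coeff = (-1) ** s / (math.factorial(s) * math.gamma(s + nu + 1))
        total += coeff * (x / 2) ** (2 * s + nu)
    return total

def neumann(nu, x):
    """N_nu(x) = [cos(nu*pi) J_nu - J_{-nu}] / sin(nu*pi), nonintegral nu only."""
    c, s = math.cos(nu * math.pi), math.sin(nu * math.pi)
    return (c * bessel_j(nu, x) - bessel_j(-nu, x)) / s

nu, x = 0.3, 2.0  # arbitrary test point (assumption)
lhs = bessel_j(nu, x) * neumann(nu + 1, x) - bessel_j(nu + 1, x) * neumann(nu, x)
rhs = -2 / (math.pi * x)
print(lhs, rhs)
```

The two printed numbers should agree to rounding error, which is the kind of table check the Wronskian formulas are used for.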
The reader will recall that in Chapter 8 Wronskians were of great value in
two respects: (1) in establishing the linear independence or linear dependence
of solutions of differential equations and (2) in developing an integral form of a
second solution. Here the specific forms of the Wronskians and Wronskian-
derived combinations of Bessel functions are useful primarily to illustrate the
general behavior of the various Bessel functions. Wronskians are of great use
in checking tables of Bessel functions. In Chapter 16 Wronskians reappear in
connection with Green's functions.
EXAMPLE 11.3.1 Coaxial Wave Guides
We are interested in an electromagnetic wave confined between the
concentric, conducting cylindrical surfaces $\rho = a$ and $\rho = b$. Most of the
mathematics is worked out in Section 2.6 and Example 11.1.2. To go from the standing
wave of these examples to the traveling wave here, we let $a_{mn} = ib_{mn}$ in Eq. 11.40
and obtain
$$ E_z = \sum_{m,n} b_{mn} J_m(\gamma\rho)\, e^{\pm im\varphi} e^{i(kz - \omega t)}. \tag{11.78} $$
Additional properties of the components of the electromagnetic wave in the
simple cylindrical wave guide are explored in Exercises 11.3.9 and 11.3.10. For the
coaxial wave guide one generalization is needed. The origin $\rho = 0$ is now
excluded ($0 < a \le \rho \le b$). Hence the Neumann function $N_m(\gamma\rho)$ may not be
excluded. $E_z(\rho, \varphi, z, t)$ becomes
$$ E_z = \sum_{m,n} \left[b_{mn} J_m(\gamma\rho) + c_{mn} N_m(\gamma\rho)\right] e^{\pm im\varphi} e^{i(kz - \omega t)}. \tag{11.79} $$
With the condition
$$ H_z = 0, \tag{11.80} $$
we have the basic equations for a TM (transverse magnetic) wave.
The (tangential) electric field must vanish at the conducting surfaces (Dirichlet
boundary condition) or
$$ b_{mn} J_m(\gamma a) + c_{mn} N_m(\gamma a) = 0, \tag{11.81} $$
$$ b_{mn} J_m(\gamma b) + c_{mn} N_m(\gamma b) = 0. \tag{11.82} $$
These transcendental equations may be solved for $\gamma$ ($\gamma_{mn}$) and the ratio $c_{mn}/b_{mn}$.
From Example 11.1.2,
$$ k^2 = \omega^2\mu_0\varepsilon_0 - \gamma^2 = \frac{\omega^2}{c^2} - \gamma^2. \tag{11.83} $$
Since $k^2$ must be positive for a real wave, the minimum frequency that will be
propagated (in this TM mode) is
$$ \omega = \gamma c, \tag{11.84} $$
with $\gamma$ fixed by the boundary conditions, Eqs. 11.81 and 11.82. This is the cutoff
frequency of the wave guide.
There is also a TE (transverse electric) mode with $E_z = 0$, and $H_z$ given by
Eq. 11.79. Then we have Neumann boundary conditions in place of Eqs. 11.81
and 11.82. Finally, for the coaxial guide (not for the plain cylindrical guide,
$a = 0$), a TEM (transverse electromagnetic) mode, $E_z = H_z = 0$, is possible.
This corresponds to a plane wave as in free space.
The simpler cases (no Neumann functions, simpler boundary conditions) of
a circular wave guide are included as Exercises 11.3.9 and 11.3.10.
To conclude this discussion of Neumann functions, we introduce the
Neumann function, $N_\nu(x)$, for the following reasons:
1. It is a second, independent solution of Bessel's
equation, which completes the general solution.
2. It is required for specific physical problems such as
electromagnetic waves in coaxial cables.
3. It leads to a Green's function for the Bessel equation
(Sections 16.5 and 16.6).
4. It leads directly to the two Hankel functions (Section
11.4).
EXERCISES
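11.3.1 Verify the expansions (leading term only)
$$ N_0(x) \approx \frac{2}{\pi}(\ln x + \gamma - \ln 2), $$
$$ N_\nu(x) \approx -\frac{(\nu-1)!}{\pi}\left(\frac{2}{x}\right)^\nu, \quad x \ll 1. $$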
For $N_0(x)$ differentiate the definition of the Neumann function as indicated
in Eq. 11.61.
11.3.2 Prove that the Neumann functions $N_n$ (with $n$ an integer) satisfy the recurrence
relations
$$ N_{n-1}(x) + N_{n+1}(x) = \frac{2n}{x} N_n(x), $$
$$ N_{n-1}(x) - N_{n+1}(x) = 2N_n'(x). $$
Hint. These relations may be proved by differentiating the recurrence relations
for Jv or by using the limit form of Nv but not dividing everything by zero.
11.3.3 Show that
11.3.4 Show that
$$ N_0'(x) = -N_1(x). $$
11.3.5 If $Y$ and $Z$ are any two solutions of Bessel's equation, show that
$$ Y_\nu(x)Z_\nu'(x) - Y_\nu'(x)Z_\nu(x) = \frac{A_\nu}{x}, $$
in which $A_\nu$ may depend on $\nu$ but is independent of $x$. This is really a special
case of Exercise 9.1.4.
11.3.6 Verify the Wronskian formulas
$$ J_\nu(x)J_{-\nu+1}(x) + J_{-\nu}(x)J_{\nu-1}(x) = \frac{2\sin\nu\pi}{\pi x}, $$
$$ J_\nu(x)N_\nu'(x) - J_\nu'(x)N_\nu(x) = \frac{2}{\pi x}. $$
11.3.7 As an alternative to letting $x$ approach zero in the evaluation of the Wronskian
constant, we may invoke uniqueness of power series (Section 5.7). The coefficient
of $x^{-1}$ in the series expansion of $u_\nu(x)v_\nu'(x) - u_\nu'(x)v_\nu(x)$ is then $A_\nu$. Show by
series expansion that the coefficients of $x^0$ and $x^1$ of $J_\nu(x)J_{-\nu}'(x) - J_\nu'(x)J_{-\nu}(x)$
are each zero.
11.3.8 (a) By differentiating and substituting into Bessel's differential equation, show
that
$$ \int_0^\infty \cos(x\cosh t)\, dt $$
is a solution.
Hint. You can rearrange the final integral as
$$ \int_0^\infty \frac{d}{dt}\left\{x\sin(x\cosh t)\sinh t\right\} dt. $$
(b) Show that
$$ N_0(x) = -\frac{2}{\pi}\int_0^\infty \cos(x\cosh t)\, dt $$
is linearly independent of $J_0(x)$.
11.3.9 A cylindrical wave guide has radius $r_0$. Find the nonvanishing components
of the electric and magnetic fields for
(a) TM$_{01}$, transverse magnetic wave ($H_z = H_\rho = E_\varphi = 0$),
(b) TE$_{01}$, transverse electric wave ($E_z = E_\rho = H_\varphi = 0$).
The subscripts 01 indicate that the longitudinal component ($E_z$ or $H_z$) involves
$J_0$ and the boundary condition is satisfied by the first zero of $J_0$ or $J_0'$.
Hint. All components of the wave have the same factor: $\exp i(kz - \omega t)$.
11.3.10 For a given mode of oscillation the minimum frequency that will be passed by a
circular cylindrical wave guide (radius $r_0$) is
$$ \nu_c = \frac{c}{\lambda_c}, $$
in which $\lambda_c$ is fixed by the boundary condition
$$ J_n\left(\frac{2\pi r_0}{\lambda_c}\right) = 0 \quad \text{for TM}_{nm} \text{ mode}, $$
$$ J_n'\left(\frac{2\pi r_0}{\lambda_c}\right) = 0 \quad \text{for TE}_{nm} \text{ mode}. $$
The subscript $n$ denotes the order of the Bessel function and $m$ indicates the
zero used. Find this cut-off wavelength, $\lambda_c$, for the three TM and three TE modes
with the longest cut-off wavelengths. Explain your results in terms of the graph
of $J_0$, $J_1$, and $J_2$ (Fig. 11.2).
11.3.11 Write a program that will compute successive roots of the Neumann function
$N_n(x)$; that is, $\alpha_{ns}$, where $N_n(\alpha_{ns}) = 0$. Tabulate the first five roots of $N_0$, $N_1$,
and $N_2$. Check your values for the roots against those listed in AMS-55
(Chapter 9).
Hint. See Appendix 1 for root-finding techniques and recommendations.
Check value. $\alpha_{12} = 5.42968$.
FIG. 11.7 Plot of $J_0(2x)/N_0(2x) - J_0(x)/N_0(x)$ versus $x$
11.3.12 For the case $m = 0$, $a = 1$, and $b = 2$ the coaxial wave guide boundary
conditions lead to
$$ f(x) = \frac{J_0(2x)}{N_0(2x)} - \frac{J_0(x)}{N_0(x)} $$
(Fig. 11.7).
(a) Calculate $f(x)$ for $x = 0.0(0.1)10.0$ and plot $f(x)$ versus $x$ to find the
approximate location of the roots.
(b) Call a root-finding subroutine to determine the first three roots to higher
precision.
ANS. 3.1230, 6.2734, 9.4182.
Note. The higher roots can be expected to appear at intervals whose length
approaches $\pi$. Why? AMS-55, Section 9.5, gives an approximate formula for
the roots. The function $g(x) = J_0(x)N_0(2x) - J_0(2x)N_0(x)$ is much better
behaved than $f(x)$ previously discussed.
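One possible way to attack part (b) is sketched below in Python, working with the better-behaved function $g(x) = J_0(x)N_0(2x) - J_0(2x)N_0(x)$. It is only an illustration under stated assumptions: $J_0$ and $N_0$ are summed from their ascending series (the $n = 0$ case of the series form of $N_n$), which loses several digits through cancellation as the argument approaches 20, and the helper names `j0`, `n0`, `g`, `bisect`, the 0.1 scan step, and the bisection tolerance are all hypothetical choices rather than anything prescribed by the exercise.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def j0(x, terms=80):
    """J_0(x) = sum_k (-x^2/4)^k / (k!)^2, truncated."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -(x * x / 4) / ((k + 1) ** 2)
    return total

def n0(x, terms=80):
    """N_0(x) = (2/pi)[ln(x/2)+gamma] J_0(x) - (2/pi) sum_k H_k (-x^2/4)^k/(k!)^2."""
    total, term, harmonic = 0.0, 1.0, 0.0
    for k in range(1, terms):
        term *= -(x * x / 4) / (k * k)  # (-x^2/4)^k / (k!)^2
        harmonic += 1.0 / k             # H_k = 1 + 1/2 + ... + 1/k
        total -= harmonic * term
    return (2 / math.pi) * ((math.log(x / 2) + EULER_GAMMA) * j0(x, terms) + total)

def g(x):
    return j0(x) * n0(2 * x) - j0(2 * x) * n0(x)

def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection on a bracketing interval [lo, hi]."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if flo * fmid <= 0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)

# Scan 0.5 <= x <= 10 in steps of 0.1 for sign changes, then refine each bracket.
xs = [0.5 + 0.1 * i for i in range(96)]
roots = [bisect(g, a, b) for a, b in zip(xs, xs[1:]) if g(a) * g(b) < 0]
print([round(r, 4) for r in roots])
```

The printed values can be compared with the ANS line of the exercise. For larger arguments a library routine or the asymptotic forms of Section 11.6 would be a safer choice than the ascending series.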
11.4 HANKEL FUNCTIONS
Many authors prefer to introduce the Hankel functions by means of integral
representations and then use them to define the Neumann function, $N_\nu(z)$.
An outline of this approach is given at the end of this section.
Definitions
As we have already obtained the Neumann function by more elementary
(and less powerful) techniques, we may use it to define the Hankel functions,
$H_\nu^{(1)}(x)$ and $H_\nu^{(2)}(x)$:
$$ H_\nu^{(1)}(x) = J_\nu(x) + iN_\nu(x) \tag{11.85} $$
and
$$ H_\nu^{(2)}(x) = J_\nu(x) - iN_\nu(x). \tag{11.86} $$
This is exactly analogous to taking
$$ e^{\pm i\theta} = \cos\theta \pm i\sin\theta. \tag{11.87} $$
For real arguments $H_\nu^{(1)}$ and $H_\nu^{(2)}$ are complex conjugates. The extent of the
analogy will be seen even better when the asymptotic forms are considered
(Section 11.6). Indeed, it is their asymptotic behavior that makes the Hankel
functions useful.
Series expansion of $H_\nu^{(1)}(x)$ and $H_\nu^{(2)}(x)$ may be obtained by combining Eqs.
11.5 and 11.63. Often only the first term is of interest; it is given by
$$ H_0^{(1)}(x) \approx i\frac{2}{\pi}\ln x + 1 + i\frac{2}{\pi}(\gamma - \ln 2) + \cdots, \tag{11.88} $$
$$ H_\nu^{(1)}(x) \approx -i\frac{(\nu-1)!}{\pi}\left(\frac{2}{x}\right)^\nu + \cdots, \quad \nu > 0, \tag{11.89} $$
$$ H_0^{(2)}(x) \approx -i\frac{2}{\pi}\ln x + 1 - i\frac{2}{\pi}(\gamma - \ln 2) + \cdots, \tag{11.90} $$
$$ H_\nu^{(2)}(x) \approx i\frac{(\nu-1)!}{\pi}\left(\frac{2}{x}\right)^\nu + \cdots, \quad \nu > 0. \tag{11.91} $$
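Since the Hankel functions are linear combinations (with constant
coefficients) of $J_\nu$ and $N_\nu$, they satisfy the same recurrence relations (Eqs. 11.10 and
11.12).
$$ H_{\nu-1}(x) + H_{\nu+1}(x) = \frac{2\nu}{x}H_\nu(x), \tag{11.92} $$
$$ H_{\nu-1}(x) - H_{\nu+1}(x) = 2H_\nu'(x), \tag{11.93} $$
for both $H_\nu^{(1)}(x)$ and $H_\nu^{(2)}(x)$.
A variety of Wronskian formulas can be developed:
$$ H_\nu^{(2)}H_{\nu+1}^{(1)} - H_\nu^{(1)}H_{\nu+1}^{(2)} = \frac{4}{i\pi x}, \tag{11.94} $$
$$ J_{\nu-1}H_\nu^{(1)} - J_\nu H_{\nu-1}^{(1)} = \frac{2}{i\pi x}, \tag{11.95} $$
$$ J_\nu H_{\nu-1}^{(2)} - J_{\nu-1}H_\nu^{(2)} = \frac{2}{i\pi x}. \tag{11.96} $$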
EXAMPLE 11.4.1 Cylindrical Traveling Waves
As an illustration of the use of Hankel functions, consider a two-dimensional
wave problem similar to the vibrating circular membrane of Exercise 11.1.25.
Now imagine that the waves are generated at $r = 0$ and move outward to
infinity. We replace our standing waves by traveling ones. The differential
equation remains the same, but the boundary conditions change. We now
demand that for large $r$ the solution behave like
$$ U \to e^{i(kr - \omega t)} \tag{11.97} $$
to describe an outgoing wave. As before, $k$ is the wave number. This assumes,
for simplicity, that there is no azimuthal dependence, that is, no angular
momentum, or $m = 0$. In Sections 7.4 and 11.6, $H_0^{(1)}(kr)$ is shown to have the
asymptotic behavior
$$ H_0^{(1)}(kr) \to e^{ikr}. \tag{11.98} $$
This boundary condition at infinity then determines our wave solution as
$$ U(r, t) = H_0^{(1)}(kr)e^{-i\omega t}. \tag{11.99} $$
This solution diverges as $r \to 0$, which is just the behavior to be expected with
a source at the origin.
The choice of a two-dimensional wave problem to illustrate the Hankel
function $H_0^{(1)}(r)$ is not accidental. Bessel functions may appear in a variety of
ways, such as in the separation of conical coordinates. However, they enter
most commonly from the radial equations from the separation of variables in
the Helmholtz equation in cylindrical and in spherical polar coordinates. We
have taken a degenerate form of cylindrical coordinates for this illustration.
Had we used spherical polar coordinates (spherical waves), we should have
encountered index $\nu = n + \frac{1}{2}$, $n$ an integer. These special values yield the
spherical Bessel functions to be discussed in Section 11.7.
Contour Integral Representation of the
Hankel Functions
The integral representation (Schlaefli integral)
$$ J_\nu(x) = \frac{1}{2\pi i}\int_C e^{(x/2)(t - 1/t)}\frac{dt}{t^{\nu+1}} \tag{11.100} $$
may easily be established for $\nu = n$, an integer [recognizing that the numerator
is the generating function (Eq. 11.1) and integrating around the origin]. If $\nu$ is
not an integer, the integrand is not single-valued and a cut line is needed in our
complex plane. Choosing the negative real axis as the cut line and using the
contour shown in Fig. 11.8, we can extend Eq. 11.100 to nonintegral $\nu$.
Substituting Eq. 11.100 into Bessel's differential equation, we can represent the
combined integrand by an exact differential that vanishes as $t \to \infty e^{\pm i\pi}$ (compare
Exercise 11.1.16).
FIG. 11.8 Bessel function contour
We now deform the contour so that it approaches the origin along the positive
real axis, as shown in Fig. 11.9. This particular approach guarantees that the
exact differential mentioned will vanish as $t \to 0$ because of the $e^{-x/2t}$ factor.
Hence each of the separate portions $\infty e^{-i\pi}$ to 0 and 0 to $\infty e^{i\pi}$ is a solution of
Bessel's equation. We define
$$ H_\nu^{(1)}(x) = \frac{1}{\pi i}\int_0^{\infty e^{i\pi}} e^{(x/2)(t - 1/t)}\frac{dt}{t^{\nu+1}}, \tag{11.101} $$
$$ H_\nu^{(2)}(x) = \frac{1}{\pi i}\int_{\infty e^{-i\pi}}^0 e^{(x/2)(t - 1/t)}\frac{dt}{t^{\nu+1}}. \tag{11.102} $$
These expressions are particularly convenient because they may be handled
by the method of steepest descents (Section 7.4). $H_\nu^{(1)}(x)$ has a saddle point at
$t = +i$, whereas $H_\nu^{(2)}(x)$ has a saddle point at $t = -i$.
FIG. 11.9 Hankel function contours
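The problem of relating Eqs. 11.101 and 11.102 to our earlier definition of
the Hankel function (Eqs. 11.85 and 11.86) remains. Since Eqs. 11.100 to 11.102
combined yield
$$ J_\nu(x) = \frac{1}{2}\left[H_\nu^{(1)}(x) + H_\nu^{(2)}(x)\right] \tag{11.103} $$
by inspection, we need only show that
$$ N_\nu(x) = \frac{1}{2i}\left[H_\nu^{(1)}(x) - H_\nu^{(2)}(x)\right]. \tag{11.104} $$
This may be accomplished by the following steps:
1. With the substitutions $t = e^{i\pi}/s$ for $H_\nu^{(1)}$ and $t = e^{-i\pi}/s$
for $H_\nu^{(2)}$, we obtain
$$ H_\nu^{(1)}(x) = e^{-i\nu\pi}H_{-\nu}^{(1)}(x), \tag{11.105} $$
$$ H_\nu^{(2)}(x) = e^{i\nu\pi}H_{-\nu}^{(2)}(x). \tag{11.106} $$
2. From Eqs. 11.103 ($\nu \to -\nu$), and 11.105 and 11.106,
$$ J_{-\nu}(x) = \frac{1}{2}\left[e^{i\nu\pi}H_\nu^{(1)}(x) + e^{-i\nu\pi}H_\nu^{(2)}(x)\right]. \tag{11.107} $$
3. Finally, substitute $J_\nu$ (Eq. 11.103) and $J_{-\nu}$ (Eq. 11.107)
into the defining equation for $N_\nu$, Eq. 11.60. This leads
to Eq. 11.104 and establishes the contour integrals
Eqs. 11.101 and 11.102 as the Hankel functions.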
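The reflection relations used in these steps, $H_\nu^{(1)}(x) = e^{-i\nu\pi}H_{-\nu}^{(1)}(x)$ and $H_\nu^{(2)}(x) = e^{i\nu\pi}H_{-\nu}^{(2)}(x)$, together with the resulting expression for $J_{-\nu}$, can be spot-checked numerically. The Python sketch below does so for one nonintegral order, building the Hankel functions from the series for $J_{\pm\nu}$ and the definition of $N_\nu$ (Eq. 11.60) rather than from the contour integrals. The helper names, the 40-term truncation, and the test point $\nu = 0.3$, $x = 1.7$ are assumptions made for illustration.

```python
import cmath
import math

def bessel_j(nu, x, terms=40):
    """Truncated power series for J_nu(x); nu must not be a negative integer."""
    return sum((-1) ** s / (math.factorial(s) * math.gamma(s + nu + 1))
               * (x / 2) ** (2 * s + nu) for s in range(terms))

def neumann(nu, x):
    """N_nu from Eq. 11.60, nonintegral nu only."""
    return (math.cos(nu * math.pi) * bessel_j(nu, x) - bessel_j(-nu, x)) / math.sin(nu * math.pi)

def hankel1(nu, x):
    return complex(bessel_j(nu, x), neumann(nu, x))   # J + iN

def hankel2(nu, x):
    return complex(bessel_j(nu, x), -neumann(nu, x))  # J - iN

nu, x = 0.3, 1.7  # arbitrary nonintegral test point (assumption)
phase = cmath.exp(1j * nu * math.pi)

err1 = abs(hankel1(nu, x) - hankel1(-nu, x) / phase)  # H1_nu = e^{-i nu pi} H1_{-nu}
err2 = abs(hankel2(nu, x) - phase * hankel2(-nu, x))  # H2_nu = e^{+i nu pi} H2_{-nu}
# J_{-nu} = (1/2)[e^{i nu pi} H1_nu + e^{-i nu pi} H2_nu]
err3 = abs(bessel_j(-nu, x) - 0.5 * (phase * hankel1(nu, x) + hankel2(nu, x) / phase))
print(err1, err2, err3)
```

All three residuals should be at the level of rounding error. This does not replace the contour argument; it only confirms that the stated phases are mutually consistent with Eq. 11.60.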
Integral representations have appeared before: Eq. 10.35 for $\Gamma(z)$ and various
representations of $J_\nu(z)$ in Section 11.1. With these integral representations of
the Hankel functions, it is perhaps appropriate to ask why we are interested in
integral representations. There are at least four reasons. The first is simply
aesthetic appeal: some people find them attractive. Second, the integral
representations help to distinguish between two linearly independent solutions. In
Fig. 11.9, the contours $C_1$ and $C_2$ cross different saddle points (Section 7.4).
For the Legendre functions the contour for $P_n(z)$ (Fig. 12.9) and that for $Q_n(z)$
encircle different singular points.
Third, the integral representations facilitate manipulations, analysis, and the
development of relations among the various special functions. Fourth, and
probably most important of all, the integral representations are extremely useful
in developing asymptotic expansions. One approach, the method of steepest
descents, appears in Section 7.4. A second approach, the direct expansion of
an integral representation, is given in Section 11.6 for the modified Bessel
function $K_\nu(z)$. This same technique may be used to obtain asymptotic expansions
of the confluent hypergeometric functions, $M$ and $U$ (Exercise 13.6.13).
In conclusion, the Hankel functions are introduced here for the following
reasons:
1. As analogs of $e^{\pm ix}$ they are useful for describing
traveling waves.
2. They offer an alternate (contour integral) and a rather
elegant definition of Bessel functions.
3. $H_\nu^{(1)}$ is used to define the modified Bessel function $K_\nu$
of Section 11.5.
EXERCISES
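11.4.1 Verify the Wronskian formulas
(a) $J_\nu(x)H_\nu^{(1)\prime}(x) - J_\nu'(x)H_\nu^{(1)}(x) = \dfrac{2i}{\pi x}$,
(b) $J_\nu(x)H_\nu^{(2)\prime}(x) - J_\nu'(x)H_\nu^{(2)}(x) = \dfrac{-2i}{\pi x}$,
(c) $N_\nu(x)H_\nu^{(1)\prime}(x) - N_\nu'(x)H_\nu^{(1)}(x) = \dfrac{-2}{\pi x}$,
(d) $N_\nu(x)H_\nu^{(2)\prime}(x) - N_\nu'(x)H_\nu^{(2)}(x) = \dfrac{-2}{\pi x}$,
(e) $H_\nu^{(1)}(x)H_\nu^{(2)\prime}(x) - H_\nu^{(1)\prime}(x)H_\nu^{(2)}(x) = \dfrac{-4i}{\pi x}$,
(f) $H_\nu^{(2)}(x)H_{\nu+1}^{(1)}(x) - H_\nu^{(1)}(x)H_{\nu+1}^{(2)}(x) = \dfrac{4}{i\pi x}$,
(g) $J_{\nu-1}(x)H_\nu^{(1)}(x) - J_\nu(x)H_{\nu-1}^{(1)}(x) = \dfrac{2}{i\pi x}$.
11.4.2 Show that the integral forms
(a) $\dfrac{1}{i\pi}\displaystyle\int_{0\,C_1}^{\infty e^{i\pi}} e^{(x/2)(t - 1/t)}\frac{dt}{t^{\nu+1}} = H_\nu^{(1)}(x)$,
(b) $\dfrac{1}{i\pi}\displaystyle\int_{\infty e^{-i\pi}\,C_2}^{0} e^{(x/2)(t - 1/t)}\frac{dt}{t^{\nu+1}} = H_\nu^{(2)}(x)$
satisfy Bessel's differential equation. The contours $C_1$ and $C_2$ are shown in Fig.
11.9.
11.4.3 Using the integrals and contours given in problem 11.4.2, show that
$$ \frac{1}{2i}\left[H_\nu^{(1)}(x) - H_\nu^{(2)}(x)\right] = N_\nu(x). $$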
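11.4.4 Show that the integrals in Exercise 11.4.2 may be transformed to yield
(a) $H_\nu^{(1)}(x) = \dfrac{1}{\pi i}\displaystyle\int_{C_3} e^{x\sinh\gamma - \nu\gamma}\, d\gamma$,
(b) $H_\nu^{(2)}(x) = \dfrac{1}{\pi i}\displaystyle\int_{C_4} e^{x\sinh\gamma - \nu\gamma}\, d\gamma$
(see Fig. 11.10).
FIG. 11.10 Hankel function contours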
11.4.5 (a) Transform $H_0^{(1)}(x)$, Eq. 11.101, into
$$ H_0^{(1)}(x) = \frac{1}{i\pi}\int_C e^{ix\cosh s}\, ds, $$
where the contour $C$ runs from $-\infty - i\pi/2$ through the origin of the $s$-plane
to $\infty + i\pi/2$.
(b) Justify rewriting $H_0^{(1)}(x)$ as
$$ H_0^{(1)}(x) = \frac{2}{i\pi}\int_0^{\infty + i\pi/2} e^{ix\cosh s}\, ds. $$
(c) Verify that this integral representation actually satisfies Bessel's differential
equation. (The $i\pi/2$ in the upper limit is not essential. It serves as a
convergence factor. We can replace it by $ia\pi/2$ and take the limit.)
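11.4.6 From
$$ H_0^{(1)}(x) = \frac{2}{i\pi}\int_0^\infty e^{ix\cosh s}\, ds $$
show that
(a) $J_0(x) = \dfrac{2}{\pi}\displaystyle\int_0^\infty \sin(x\cosh s)\, ds$,
(b) $J_0(x) = \dfrac{2}{\pi}\displaystyle\int_1^\infty \frac{\sin(xt)}{\sqrt{t^2 - 1}}\, dt$.
This last result is a Fourier sine transform.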
11.4.7 From
$$ H_0^{(1)}(x) = \frac{2}{i\pi}\int_0^\infty e^{ix\cosh s}\, ds $$
show that
(a) $N_0(x) = -\dfrac{2}{\pi}\displaystyle\int_0^\infty \cos(x\cosh s)\, ds$,
(b) $N_0(x) = -\dfrac{2}{\pi}\displaystyle\int_1^\infty \frac{\cos(xt)}{\sqrt{t^2 - 1}}\, dt$.
These are the equations given as Eq. 11.65a.
This last result is a Fourier cosine transform.
11.5 MODIFIED BESSEL FUNCTIONS, $I_\nu(x)$ AND $K_\nu(x)$
The Helmholtz equation,
$$ \nabla^2\psi + k^2\psi = 0, $$
separated in circular cylindrical coordinates, leads to Eq. 11.22a, the Bessel
equation. Equation 11.22a is satisfied by the Bessel and Neumann functions
$J_\nu(k\rho)$ and $N_\nu(k\rho)$ and any linear combination such as the Hankel functions
$H_\nu^{(1)}(k\rho)$ and $H_\nu^{(2)}(k\rho)$. Now the Helmholtz equation describes the space part
of wave phenomena. If instead we have a diffusion problem, then the Helmholtz
equation is replaced by
$$ \nabla^2\psi - k^2\psi = 0. \tag{11.108} $$
The analog to Eq. 11.22a is
$$ \rho^2\frac{d^2}{d\rho^2}Y_\nu(k\rho) + \rho\frac{d}{d\rho}Y_\nu(k\rho) - (k^2\rho^2 + \nu^2)Y_\nu(k\rho) = 0. \tag{11.109} $$
The Helmholtz equation may be transformed into the diffusion equation by
the transformation $k \to ik$. Similarly, $k \to ik$ changes Eq. 11.22a into Eq. 11.109
and shows that
$$ Y_\nu(k\rho) = Z_\nu(ik\rho). $$
The solutions of Eq. 11.109 are Bessel functions of imaginary argument. To
obtain a solution that is regular at the origin, we take $Z_\nu$ as the regular Bessel
function $J_\nu$. It is customary (and convenient) to choose the normalization so
that
$$ Y_\nu(k\rho) = I_\nu(x) \equiv i^{-\nu}J_\nu(ix). \tag{11.110} $$
(Here the variable $k\rho$ is being replaced by $x$ for simplicity.) Often this is written
as
$$ I_\nu(x) = e^{-\nu\pi i/2}J_\nu(xe^{i\pi/2}). \tag{11.111} $$
$I_0$ and $I_1$ are shown in Figure 11.11.
FIG. 11.11 Modified Bessel functions ($I_0$, $I_1$, $K_0$, $K_1$)
Series Form
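In terms of infinite series this is equivalent to removing the $(-1)^s$ sign in
Eq. 11.5 and writing
$$ I_\nu(x) = \sum_{s=0}^\infty \frac{1}{s!\,(s+\nu)!}\left(\frac{x}{2}\right)^{2s+\nu}, \quad I_{-\nu}(x) = \sum_{s=0}^\infty \frac{1}{s!\,(s-\nu)!}\left(\frac{x}{2}\right)^{2s-\nu}. \tag{11.112} $$
The extra $i^{-\nu}$ normalization cancels the $i^\nu$ from each term and leaves $I_\nu(x)$ real.
For integral $\nu$ this yields
$$ I_n(x) = I_{-n}(x). \tag{11.113} $$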
Recurrence Relations
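The recurrence relations satisfied by $I_\nu(x)$ may be developed from the series
expansions, but it is perhaps easier to work from the existing recurrence relations
for $J_\nu(x)$. Let us replace $x$ by $-ix$ and rewrite Eq. 11.110 as
$$ J_\nu(x) = i^\nu I_\nu(-ix). \tag{11.114} $$
Then Eq. 11.10 becomes
$$ i^{\nu-1}I_{\nu-1}(-ix) + i^{\nu+1}I_{\nu+1}(-ix) = \frac{2\nu}{x}i^\nu I_\nu(-ix). $$
Replacing $x$ by $ix$, we have a recurrence relation for $I_\nu(x)$,
$$ I_{\nu-1}(x) - I_{\nu+1}(x) = \frac{2\nu}{x}I_\nu(x). \tag{11.115} $$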
Equation 11.12 transforms to
$$ I_{\nu-1}(x) + I_{\nu+1}(x) = 2I_\nu'(x). \tag{11.116} $$
These are the recurrence relations used in Exercise 11.1.14.
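A short Python sketch can confirm the two recurrence relations $I_{\nu-1} - I_{\nu+1} = (2\nu/x)I_\nu$ and $I_{\nu-1} + I_{\nu+1} = 2I_\nu'$ numerically, using the ascending series for $I_\nu$ (the $J_\nu$ series with the alternating sign removed). The function names, the 40-term truncation, the central-difference step, and the test point $\nu = 1.5$, $x = 2$ are assumptions chosen for illustration; the derivative is only approximated, so the second check holds to roughly the accuracy of the finite difference.

```python
import math

def bessel_i(nu, x, terms=40):
    """Truncated ascending series for I_nu(x), x > 0, nu not a negative integer."""
    return sum((x / 2) ** (2 * s + nu) / (math.factorial(s) * math.gamma(s + nu + 1))
               for s in range(terms))

def bessel_i_prime(nu, x, h=1e-5):
    """Central-difference estimate of dI_nu/dx (an approximation, not exact)."""
    return (bessel_i(nu, x + h) - bessel_i(nu, x - h)) / (2 * h)

nu, x = 1.5, 2.0  # arbitrary test point (assumption)
diff_residual = bessel_i(nu - 1, x) - bessel_i(nu + 1, x) - (2 * nu / x) * bessel_i(nu, x)
sum_residual = bessel_i(nu - 1, x) + bessel_i(nu + 1, x) - 2 * bessel_i_prime(nu, x)
print(diff_residual, sum_residual)
```

Both residuals should be small: the first at rounding level, the second limited by the finite-difference error.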
It is worth emphasizing that although two recurrence relations, Eqs. 11.115
and 11.116 or Exercise 11.5.7, specify the second-order differential equation,
the converse is not true. The differential equation does not uniquely fix the
recurrence relations. Equations 11.115 and 11.116 and Exercise 11.5.7 provide
an example.
From Eq. 11.113 it is seen that we have but one independent solution when
$\nu$ is an integer, exactly as in the Bessel functions $J_\nu$. The choice of a second,
independent solution of Eq. 11.108 is essentially a matter of convenience. The
second solution given here is selected on the basis of its asymptotic behavior,
as shown in the next section. The confusion of choice and notation for this
solution is perhaps greater than anywhere else in this field.¹ Many authors²
choose to define a second solution in terms of the Hankel function $H_\nu^{(1)}(x)$ by
$$ K_\nu(x) \equiv \frac{\pi}{2}i^{\nu+1}H_\nu^{(1)}(ix) = \frac{\pi}{2}i^{\nu+1}\left[J_\nu(ix) + iN_\nu(ix)\right]. \tag{11.117} $$
The factor $i^{\nu+1}$ makes $K_\nu(x)$ real when $x$ is real. Using Eqs. 11.60 and 11.110,
we may transform Eq. 11.117 to³
$$ K_\nu(x) = \frac{\pi}{2}\,\frac{I_{-\nu}(x) - I_\nu(x)}{\sin\nu\pi}, \tag{11.118} $$
analogous to Eq. 11.60 for $N_\nu(x)$. The choice of Eq. 11.117 as a definition is
somewhat unfortunate in that the function $K_\nu(x)$ does not satisfy the same
recurrence relations as $I_\nu(x)$ (compare Exercises 11.5.7 and 11.5.8). To avoid
this annoyance other authors⁴ have included an additional factor of cosine
$n\pi$. This permits $K_\nu$ to satisfy the same recurrence relations as $I_\nu$, but it has the
disadvantage of making $K_\nu = 0$ for $\nu = \frac{1}{2}, \frac{3}{2}, \frac{5}{2}, \ldots$.
The series expansion of $K_\nu(x)$ follows directly from the series form of $H_\nu^{(1)}(ix)$.
The lowest order terms are
$$ K_0(x) = -\ln x - \gamma + \ln 2 + \cdots, $$
$$ K_\nu(x) = 2^{\nu-1}(\nu-1)!\,x^{-\nu} + \cdots. \tag{11.119} $$
Because the modified Bessel function $I_\nu$ is related to the Bessel function $J_\nu$,
much as sinh is related to sine, $I_\nu$ and the second solution $K_\nu$ are sometimes
referred to as hyperbolic Bessel functions.
¹ A discussion and comparison of notations will be found in MTAC 1,
207-308 (1944).
² Watson, Morse and Feshbach, Jeffreys and Jeffreys (without the $\pi/2$).
³ For integral index $n$ we take the limit as $\nu \to n$.
⁴ Whittaker and Watson.
$I_0(x)$ and $K_0(x)$ have the integral representations
$$ I_0(x) = \frac{1}{\pi}\int_0^\pi \cosh(x\cos\theta)\, d\theta, \tag{11.120} $$
$$ K_0(x) = \int_0^\infty \cos(x\sinh t)\, dt = \int_0^\infty \frac{\cos(xt)\, dt}{(t^2 + 1)^{1/2}}, \quad x > 0. \tag{11.121} $$
Equation 11.120 may be derived from Eq. 11.30 for $J_0(x)$ or may be taken
as a special case of Exercise 11.5.4, $\nu = 0$. The integral representation of $K_0$,
Eq. 11.121, is a Fourier transform and may best be derived with Fourier
transforms, Chapter 15, or with Green's functions, Section 16.6. A variety of other
forms of integral representations (including $\nu \ne 0$) appear in the exercises.
These integral representations are useful in developing asymptotic forms
(Section 11.6) and in connection with Fourier transforms, Chapter 15.
To put the modified Bessel functions $I_\nu(x)$ and $K_\nu(x)$ in proper perspective,
we introduce them here because:
1. These functions are solutions of the frequently
encountered modified Bessel equation.
2. They are needed for specific physical problems such
as diffusion problems.
3. $K_\nu(x)$ provides a Green's function, Section 16.6.
4. $K_\nu(x)$ leads to a convenient determination of
asymptotic behavior (Section 11.6).
EXERCISES
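11.5.1 Show that
$$ e^{(x/2)(t + 1/t)} = \sum_{n=-\infty}^{\infty} I_n(x)t^n, $$
thus generating modified Bessel functions, $I_n(x)$.
11.5.2 Verify the following identities
(a) $1 = I_0(x) + 2\displaystyle\sum_{n=1}^\infty (-1)^n I_{2n}(x)$,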
CO
00
(d)
(e)
11.5.3 (a) From the generating function of Exercise 11.5.1 show that
$$ I_n(x) = \frac{1}{2\pi i}\oint \exp[(x/2)(t + 1/t)]\frac{dt}{t^{n+1}}. $$
(b) For $n = \nu$, not an integer, show that the preceding integral representation
may be generalized to
$$ I_\nu(x) = \frac{1}{2\pi i}\int_C \exp[(x/2)(t + 1/t)]\frac{dt}{t^{\nu+1}}. $$
The contour $C$ is the same as that for $J_\nu(x)$, Fig. 11.8.
11.5.4 For $\nu > -\frac{1}{2}$ show that $I_\nu(z)$ may be represented by
Jo—
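11.5.5 A cylindrical cavity has a radius $a$ and a height $l$, Fig. 11.4. The ends, $z = 0$ and
$l$, are at zero potential. The cylindrical walls, $\rho = a$, have a potential $V = V(\varphi, z)$.
(a) Show that the electrostatic potential $\Phi(\rho, \varphi, z)$ has the functional form
$$ \Phi(\rho, \varphi, z) = \sum_{m=0}^\infty \sum_{n=1}^\infty I_m(k_n\rho)\sin k_n z \cdot (a_{mn}\sin m\varphi + b_{mn}\cos m\varphi), $$
where $k_n = n\pi/l$.
(b) Show that the coefficients $a_{mn}$ and $b_{mn}$ are given by⁵
$$ \left.\begin{matrix} a_{mn} \\ b_{mn} \end{matrix}\right\} = \frac{2}{\pi l\, I_m(k_n a)}\int_0^{2\pi}\int_0^l V(\varphi, z)\sin k_n z \left\{\begin{matrix} \sin m\varphi \\ \cos m\varphi \end{matrix}\right\} dz\, d\varphi. $$
Hint. Expand $V(\varphi, z)$ as a double series and use the orthogonality of the
trigonometric functions.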
11.5.6 Verify that $K_\nu(x)$ is given by
$$ K_\nu(x) = \frac{\pi}{2}\,\frac{I_{-\nu}(x) - I_\nu(x)}{\sin\nu\pi} $$
and from this show that
$$ K_\nu(x) = K_{-\nu}(x). $$
11.5.7 Show that $K_\nu(x)$ satisfies the recurrence relations
$$ K_{\nu-1}(x) - K_{\nu+1}(x) = -\frac{2\nu}{x}K_\nu(x), $$
$$ K_{\nu-1}(x) + K_{\nu+1}(x) = -2K_\nu'(x). $$
11.5.8 If $\mathcal{K}_\nu = e^{\nu\pi i}K_\nu$, show that $\mathcal{K}_\nu$ satisfies the same recurrence relations as $I_\nu$.
⁵ When $m = 0$, the 2 in the coefficient is replaced by 1.
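11.5.9 For $\nu > -\frac{1}{2}$ show that $K_\nu(z)$ may be represented by
$$ K_\nu(z) = \frac{\pi^{1/2}}{(\nu - \frac{1}{2})!}\left(\frac{z}{2}\right)^\nu \int_0^\infty e^{-z\cosh t}\sinh^{2\nu}t\, dt, \quad -\frac{\pi}{2} < \arg z < \frac{\pi}{2}, $$
$$ = \frac{\pi^{1/2}}{(\nu - \frac{1}{2})!}\left(\frac{z}{2}\right)^\nu \int_1^\infty e^{-zp}(p^2 - 1)^{\nu - 1/2}\, dp. $$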
11.5.10 Show that $I_\nu(x)$ and $K_\nu(x)$ satisfy the Wronskian relation
$$ I_\nu(x)K_\nu'(x) - I_\nu'(x)K_\nu(x) = -\frac{1}{x}. $$
This result is quoted in Section 16.6 in the development of a Green's function.
11.5.11 If $r = (x^2 + y^2)^{1/2}$, prove that
$$ \frac{1}{r} = \frac{2}{\pi}\int_0^\infty \cos(xt)K_0(yt)\, dt. $$
This is a Fourier cosine transform of $K_0$.
11.5.12 (a) Verify that
$$ I_0(x) = \frac{1}{\pi}\int_0^\pi \cosh(x\cos\theta)\, d\theta $$
satisfies the modified Bessel equation, $\nu = 0$.
(b) Show that this integral contains no admixture of $K_0(x)$, the irregular
second solution.
(c) Verify the normalization factor $1/\pi$.
11.5.13 Verify that the integral representations
$$ I_n(z) = \frac{1}{\pi}\int_0^\pi e^{z\cos t}\cos(nt)\, dt, $$
$$ K_\nu(z) = \int_0^\infty e^{-z\cosh t}\cosh(\nu t)\, dt, \quad \Re(z) > 0, $$
satisfy the modified Bessel equation by direct substitution into that equation.
How can you show that the first form does not contain an admixture of $K_n$,
that the second form does not contain an admixture of $I_\nu$? How can you check
the normalization?
11.5.14 Derive the integral representation
$$ I_n(x) = \frac{1}{\pi}\int_0^\pi e^{x\cos\theta}\cos(n\theta)\, d\theta. $$
Hint. Start with the corresponding integral representation of $J_n(x)$. Equation
11.120 is a special case of this representation.
11.5.15 Show that
$$ K_0(z) = \int_0^\infty e^{-z\cosh t}\, dt $$
satisfies the modified Bessel equation. How can you establish that this form is
linearly independent of $I_0(z)$?
11.5.16 Show that
$$ e^{ax} = I_0(a)T_0(x) + 2\sum_{n=1}^\infty I_n(a)T_n(x), \quad -1 < x < 1. $$
$T_n(x)$ is the $n$th-order Chebyshev polynomial, Sections 13.3 and 13.4.
Hint. Assume a Chebyshev series expansion. Using the orthogonality and
normalization of the Tn(x), solve for the coefficients of the Chebyshev series.
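Before attempting the proof it can be reassuring to see the expansion hold numerically. The Python sketch below sums the Chebyshev series with $I_n(a)$ taken from the ascending series and $T_n(x) = \cos(n\arccos x)$, and compares the result with $e^{ax}$. The function names, the truncation at $n = 25$, and the sample values $a = 1.5$, $x = 0.3$ are assumptions made only for this illustration.

```python
import math

def bessel_i(n, x, terms=40):
    """Truncated ascending series for I_n(x), integer n >= 0."""
    return sum((x / 2) ** (2 * s + n) / (math.factorial(s) * math.factorial(s + n))
               for s in range(terms))

def chebyshev_sum(a, x, nmax=25):
    """I_0(a) T_0(x) + 2 sum_{n=1}^{nmax} I_n(a) T_n(x), with T_n(x) = cos(n arccos x)."""
    theta = math.acos(x)
    total = bessel_i(0, a)
    for n in range(1, nmax + 1):
        total += 2 * bessel_i(n, a) * math.cos(n * theta)
    return total

a, x = 1.5, 0.3  # sample values (assumption)
series_value = chebyshev_sum(a, x)
direct_value = math.exp(a * x)
print(series_value, direct_value)
```

Because $I_n(a)$ falls off roughly like $(a/2)^n/n!$, a couple of dozen terms are ample for moderate $a$.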
11.5.17 (a) Write a double precision subroutine to calculate $I_n(x)$ to 12-decimal place
accuracy for $n = 0, 1, 2, 3, \ldots$ and $0 < x < 1$. Check your results against
the 10 place values given in AMS-55, Table 9.11.
(b) Referring to Exercise 11.5.16, calculate the coefficients in the Chebyshev
expansions of cosh x and of sinh x.
Note. An alternate calculation of these coefficients is one of the topics of Section
13.4.
11.5.18 The cylindrical cavity of Exercise 11.5.5 has a potential along the cylinder walls
$$ V(z) = \begin{cases} 100\,z/l, & 0 < z/l < 1/2, \\ 100(1 - z/l), & 1/2 < z/l < 1. \end{cases} $$
With the radius-height ratio $a/l = 0.5$, calculate the potential for $z/l = 0.1(0.1)0.5$
and $\rho/a = 0.0(0.2)1.0$.
Check value. For $z/l = 0.3$ and $\rho/a = 0.8$, $V = 26.396$.
11.6 ASYMPTOTIC EXPANSIONS
Frequently in physical problems there is a need to know how a given Bessel
or modified Bessel function behaves for large values of the argument, that is,
the asymptotic behavior. This is one occasion when computers are not very
helpful. One possible approach is to develop a power-series solution of the
differential equation, as in Section 8.5, but now using negative powers. This is
Stokes's method, Exercise 11.6.5. The limitation is that starting from some
positive value of the argument (for convergence of the series), we do not know
what mixture of solutions or multiple of a given solution we have. The problem
is to relate the asymptotic series (useful for large values of the variable) to the
power series or related definition (useful for small values of the variable). This
relationship can be established by introducing a suitable integral representation
and then using either the method of steepest descent, Section 7.4, or the direct
expansion as developed in this section.
Expansion of an Integral Representation, $K_\nu(z)$
As a direct approach, consider the integral representation (Exercise 11.5.9)
$$ K_\nu(z) = \frac{\pi^{1/2}}{(\nu - \frac{1}{2})!}\left(\frac{z}{2}\right)^\nu \int_1^\infty e^{-zx}(x^2 - 1)^{\nu - 1/2}\, dx, \quad \nu > -\frac{1}{2}. \tag{11.122} $$
For the present let us take $z$ to be real, although Eq. 11.136 may be established
for $-\pi/2 < \arg z < \pi/2$ ($\Re(z) > 0$). We have three problems: (1) to show that
$K_\nu$ as given in Eq. 11.122 actually satisfies the modified Bessel equation (11.108);
(2) to show that the regular solution $I_\nu$ is absent; and (3) to show that Eq. 11.122
has the proper normalization.
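1. The fact that Eq. 11.122 is a solution of the modified
Bessel equation may be verified by direct
substitution. We obtain
$$ z^{\nu+1}\int_1^\infty \frac{d}{dx}\left[e^{-zx}(x^2 - 1)^{\nu + 1/2}\right] dx = 0, $$
which transforms the combined integrand into the
derivative of a function that vanishes at both end
points. Hence the integral is some linear
combination of $I_\nu$ and $K_\nu$.
2. The rejection of the possibility that this solution
contains $I_\nu$ constitutes Exercise 11.6.1.
3. The normalization may be verified by substituting
$x = 1 + t/z$.
$$ K_\nu(z) = \frac{\pi^{1/2}}{(\nu - \frac{1}{2})!}\left(\frac{z}{2}\right)^\nu \int_1^\infty e^{-zx}(x^2 - 1)^{\nu - 1/2}\, dx \tag{11.123a} $$
$$ = \frac{\pi^{1/2}}{(\nu - \frac{1}{2})!}\left(\frac{z}{2}\right)^\nu \frac{e^{-z}}{z^{2\nu}}\int_0^\infty e^{-t}t^{2\nu - 1}\left(1 + \frac{2z}{t}\right)^{\nu - 1/2} dt, \tag{11.123b} $$
taking out $t^2/z^2$ as a factor. This substitution has
changed the limits of integration to a more
convenient range and has isolated the negative exponential
dependence, $e^{-z}$. The integral in Eq. 11.123b may be
evaluated for $z = 0$ to yield $(2\nu - 1)!$. Then, using
the duplication formula (Section 10.4), we have
$$ \lim_{z \to 0} K_\nu(z) = \frac{(\nu - 1)!\,2^{\nu-1}}{z^\nu}, \quad \nu > 0, \tag{11.124} $$
in agreement with Eq. 11.119, which thus checks
the normalization.¹
Now to develop an asymptotic series for $K_\nu(z)$, we may rewrite 11.123a as
$$ K_\nu(z) = \sqrt{\frac{\pi}{2z}}\,\frac{e^{-z}}{(\nu - \frac{1}{2})!}\int_0^\infty e^{-t}t^{\nu - 1/2}\left(1 + \frac{t}{2z}\right)^{\nu - 1/2} dt \tag{11.125} $$
(taking out $2t/z$ as a factor).
We expand $(1 + t/2z)^{\nu - 1/2}$ by the binomial theorem to obtain
$$ K_\nu(z) = \sqrt{\frac{\pi}{2z}}\,\frac{e^{-z}}{(\nu - \frac{1}{2})!}\sum_{r=0}^\infty \frac{(\nu - \frac{1}{2})!}{r!\,(\nu - r - \frac{1}{2})!}(2z)^{-r}\int_0^\infty e^{-t}t^{\nu + r - 1/2}\, dt. \tag{11.126} $$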
1 For v = 0 the integral diverges logarithmically in agreement with the
logarithmic divergence of K0(z) (Section 11.5).
Term-by-term integration (valid for asymptotic series) yields the desired
asymptotic expansion of Kv(z).
~~ Dv2-l2) , Dv2 - I2)Dv2 - 32) ,
£
|
l!8z
2!(8zJ
A1.127)
Although the integral of Eq. 11.122, integrating along the real axis, was convergent
only for $-\pi/2 < \arg z < \pi/2$, Eq. 11.127 may be extended to
$-3\pi/2 < \arg z < 3\pi/2$. Considered as an infinite series, Eq. 11.127 is actually divergent.²
However, this series is asymptotic in the sense that for large enough $z$, $K_\nu(z)$
may be approximated to any fixed degree of accuracy. (Compare Section 5.10
for a definition and discussion of asymptotic series.)
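The asymptotic (rather than convergent) character of the series can be checked numerically. The following is a minimal sketch, not part of the original text: it sums the bracketed series of Eq. 11.127 term by term and compares the result with a reference value obtained from the standard integral representation $K_\nu(z)=\int_0^\infty e^{-z\cosh t}\cosh \nu t\,dt$. That representation is an assumption brought in only for the check (it is not Eq. 11.122), and the function names and trapezoid parameters are illustrative choices.

```python
import math

def kv_asymptotic(nu, z, n_terms):
    """Partial sum of Eq. 11.127, keeping n_terms terms of the bracketed series."""
    mu = 4.0 * nu * nu
    term = 1.0
    total = 0.0
    for k in range(n_terms):
        total += term
        # Ratio of successive terms: (4 nu^2 - (2k+1)^2) / ((k+1) * 8z)
        term *= (mu - (2 * k + 1) ** 2) / ((k + 1) * 8.0 * z)
    return math.sqrt(math.pi / (2.0 * z)) * math.exp(-z) * total

def kv_integral(nu, z, t_max=20.0, steps=20000):
    """Reference value from int_0^inf exp(-z cosh t) cosh(nu t) dt (trapezoid rule).

    This integral representation is an assumption used only for comparison.
    """
    h = t_max / steps
    total = 0.5 * math.exp(-z)  # t = 0 endpoint; the far endpoint is negligible
    for i in range(1, steps):
        t = i * h
        total += math.exp(-z * math.cosh(t)) * math.cosh(nu * t)
    return total * h

if __name__ == "__main__":
    reference = kv_integral(0.0, 5.0)
    for n_terms in (1, 5, 10, 20, 40):
        approx = kv_asymptotic(0.0, 5.0, n_terms)
        print(n_terms, abs(approx - reference) / reference)
```

For $\nu = 0$ and $z = 5$ the terms shrink until roughly the tenth term and then grow, so the relative error first falls and then rises as more terms are added.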
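It is convenient to rewrite Eq. 11.127 as
$$K_\nu(z) = \sqrt{\frac{\pi}{2z}}\,e^{-z}\left[P_\nu(iz) + iQ_\nu(iz)\right], \tag{11.128}$$
where
$$P_\nu(z) \sim 1 - \frac{(\mu-1)(\mu-9)}{2!\,(8z)^2} + \frac{(\mu-1)(\mu-9)(\mu-25)(\mu-49)}{4!\,(8z)^4} - \cdots, \tag{11.129a}$$
$$Q_\nu(z) \sim \frac{\mu-1}{1!\,(8z)} - \frac{(\mu-1)(\mu-9)(\mu-25)}{3!\,(8z)^3} + \cdots, \tag{11.129b}$$
and
$$\mu = 4\nu^2.$$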
It should be noted that although $P_\nu(z)$ of Eq. 11.129a and $Q_\nu(z)$ of Eq. 11.129b
have alternating signs, the series for $P_\nu(iz)$ and $Q_\nu(iz)$ of Eq. 11.128 have all signs
positive. Finally, for $z$ large, $P_\nu$ dominates.
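Then with the asymptotic form of $K_\nu(z)$, Eq. 11.128, we can obtain expansions
for all other Bessel and hyperbolic Bessel functions by defining relations:
1. From
$$\frac{\pi}{2}\,i^{\nu+1}H_\nu^{(1)}(iz) = K_\nu(z) \tag{11.130}$$
we have
$$H_\nu^{(1)}(z) = \sqrt{\frac{2}{\pi z}}\,\exp i\left[z - \left(\nu+\tfrac12\right)\frac{\pi}{2}\right]\cdot\left[P_\nu(z) + iQ_\nu(z)\right], \qquad -\pi < \arg z < 2\pi. \tag{11.131}$$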
² Our binomial expansion is valid only for $t < 2z$ and we have integrated $t$ out
to infinity. The exponential decrease of the integrand prevents a disaster but
the resultant series is still only asymptotic, not convergent. By Table 8.3,
$z = \infty$ is an essential singularity of the Bessel (and modified Bessel) equations.
Fuchs's theorem does not guarantee a convergent series and we do not get a
convergent series.
2. The second Hankel function is just the complex
conjugate of the first (for real argument),
$$H_\nu^{(2)}(z) = \sqrt{\frac{2}{\pi z}}\,\exp\left\{-i\left[z - \left(\nu+\tfrac12\right)\frac{\pi}{2}\right]\right\}\cdot\left[P_\nu(z) - iQ_\nu(z)\right], \qquad -2\pi < \arg z < \pi. \tag{11.132}$$
An alternate derivation of the asymptotic behavior
of the Hankel functions appears in Section 7.4 as
an application of the method of steepest descents.
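3. Since $J_\nu(z)$ is the real part of $H_\nu^{(1)}(z)$,
$$J_\nu(z) = \sqrt{\frac{2}{\pi z}}\left\{P_\nu(z)\cos\left[z - \left(\nu+\tfrac12\right)\frac{\pi}{2}\right] - Q_\nu(z)\sin\left[z - \left(\nu+\tfrac12\right)\frac{\pi}{2}\right]\right\}, \qquad -\pi < \arg z < \pi. \tag{11.133}$$
4. The Neumann function is the imaginary part of
$H_\nu^{(1)}(z)$, or
$$N_\nu(z) = \sqrt{\frac{2}{\pi z}}\left\{P_\nu(z)\sin\left[z - \left(\nu+\tfrac12\right)\frac{\pi}{2}\right] + Q_\nu(z)\cos\left[z - \left(\nu+\tfrac12\right)\frac{\pi}{2}\right]\right\}, \qquad -\pi < \arg z < \pi. \tag{11.134}$$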
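5. Finally, the regular hyperbolic or modified Bessel
function $I_\nu(z)$ is given by
$$I_\nu(z) = i^{-\nu}J_\nu(iz) \tag{11.135}$$
or
$$I_\nu(z) = \frac{e^{z}}{\sqrt{2\pi z}}\left[P_\nu(iz) - iQ_\nu(iz)\right], \qquad -\frac{\pi}{2} < \arg z < \frac{\pi}{2}. \tag{11.136}$$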
This completes our determination of the asymptotic expansions. However, it is
perhaps worth noting the primary characteristics. Apart from the ubiquitous
$z^{-1/2}$, $J_\nu$ and $N_\nu$ behave as cosine and sine, respectively. The zeros are almost
evenly spaced at intervals of $\pi$; the spacing becomes exactly $\pi$ in the limit as
$z \to \infty$. The Hankel functions have been defined to behave like the imaginary
exponentials, and the modified Bessel functions, $I_\nu$ and $K_\nu$, go into the positive
and negative exponentials. This asymptotic behavior may be sufficient to
eliminate immediately one of these functions as a solution for a physical
problem. We should also note that the asymptotic series $P_\nu(z)$ and $Q_\nu(z)$, Eqs.
11.129a and b, terminate for $\nu = \pm 1/2, \pm 3/2, \ldots$ and become polynomials
(in negative powers of $z$). For these special values of $\nu$ the asymptotic
approximations become exact solutions.
FIG. 11.12 Asymptotic approximation of $J_0(x)$
It is of some interest to consider the accuracy of the asymptotic forms,
taking just the first term, for example (Fig. 11.12),
$$J_n(x) \approx \sqrt{\frac{2}{\pi x}}\cos\left[x - \left(n+\tfrac12\right)\frac{\pi}{2}\right]. \tag{11.137}$$
Clearly, the condition for the validity of Eq. 11.137 is that the sine term be
negligible; that is,
$$8x \gg 4n^2 - 1. \tag{11.138}$$
For $n$ or $\nu > 1$ the asymptotic region may be far out.
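The accuracy of the one-term form can be seen directly by comparing Eq. 11.137 for $n = 0$ with $J_0(x)$ computed from its power series. The sketch below is illustrative only: the helper names are hypothetical, and the power series $J_0(x) = \sum_s (-1)^s (x/2)^{2s}/(s!)^2$ is used as the reference because it is accurate in double precision for the moderate $x$ considered here.

```python
import math

def j0_series(x, n_terms=60):
    """J_0(x) from its power series; adequate for moderate x (say x <= 15)."""
    term, total = 1.0, 0.0
    for s in range(n_terms):
        total += term
        term *= -((x / 2.0) ** 2) / ((s + 1) ** 2)
    return total

def jn_leading(n, x):
    """Leading asymptotic term, Eq. 11.137."""
    return math.sqrt(2.0 / (math.pi * x)) * math.cos(x - (n + 0.5) * math.pi / 2.0)

if __name__ == "__main__":
    for x in (1.0, 2.0, 5.0, 10.0):
        exact = j0_series(x)
        print(x, exact, jn_leading(0, x), abs(jn_leading(0, x) - exact))
```

The absolute difference shrinks as $x$ grows, consistent with the condition $8x \gg 4n^2 - 1$.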
As pointed out in Section 11.3, the asymptotic forms may be used to evaluate
the various Wronskian formulas (compare Exercise 11.6.3).
Numerical Evaluation
When a program in a large high-speed computing machine calls for one of
the Bessel or modified Bessel functions, the programmer has two alternatives:
to store all the Bessel functions and tell the computer how to locate the required
value or to instruct the computer to simply calculate the needed value. The
first alternative would be fairly slow and would place unreasonable demands
on the storage capacity. Thus our programmer adopts the "compute it yourself"
alternative.
The computation of $J_n(x)$ using the recurrence relation, Eq. 11.10, is discussed
in Section 11.1. For $N_n$, $I_n$, and $K_n$ the preferred methods are the series if $x$ is
small and the asymptotic forms (with many terms in the series of negative
powers) if $x$ is large. The criteria of large and small may vary as shown in
Table 11.2.
TABLE 11.2 Equations for the Computation of Neumann
and the Modified Bessel Functions

Function    Power Series                    Asymptotic Series
$N_n(x)$    Eq. 11.63,  x < 4               Eq. 11.134, x > 4
$I_n(x)$    Eq. 11.112, x < 12 or x < n     Eq. 11.136, x > 12 and x > n
$K_n(x)$    Eq. 11.119, x < 1               Eq. 11.127, x > 1
In actual practice, it is found convenient to limit the series (power or asymptotic)
computation of $N_n(x)$ and $K_n(x)$ to $n = 0, 1$. Then $N_n(x)$, $n \ge 2$, is computed using the recurrence
relation, Eq. 11.10. $K_n(x)$, $n \ge 2$, is computed using the recurrence relations of Exercise
11.5.7. $I_n(x)$ could be handled this way, if desired, but direct application of the power series
or asymptotic series is feasible for all values of $n$ and $x$.
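The upward-recurrence strategy for $K_n(x)$ can be sketched as follows. This is a hedged illustration, not the text's program: it assumes the recurrence in question is the standard relation $K_{n+1}(x) = K_{n-1}(x) + (2n/x)K_n(x)$, and, to stay self-contained, it obtains the starting values $K_0$ and $K_1$ from the integral representation $K_\nu(x) = \int_0^\infty e^{-x\cosh t}\cosh\nu t\,dt$ rather than from the series of Table 11.2. All function names are hypothetical.

```python
import math

def k_integral(nu, x, t_max=20.0, steps=20000):
    """K_nu(x) from int_0^inf exp(-x cosh t) cosh(nu t) dt (trapezoid rule).

    Stand-in for the series starting values; an assumption for this sketch.
    """
    h = t_max / steps
    total = 0.5 * math.exp(-x)  # t = 0 endpoint; the far endpoint is negligible
    for i in range(1, steps):
        t = i * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h

def k_upward(n, x):
    """K_n(x) for integer n >= 0 by upward recurrence from K_0 and K_1."""
    k_prev, k_curr = k_integral(0, x), k_integral(1, x)
    if n == 0:
        return k_prev
    for m in range(1, n):
        # K_{m+1} = K_{m-1} + (2m/x) K_m
        k_prev, k_curr = k_curr, k_prev + (2.0 * m / x) * k_curr
    return k_curr

if __name__ == "__main__":
    print(k_upward(3, 2.0), k_integral(3, 2.0))
```

Working upward is appropriate here because $K_n$ grows with $n$ at fixed $x$, so rounding errors are not amplified relative to the result.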
EXERCISES
11.6.1 In checking the normalization of the integral representation of $K_\nu(z)$ (Eq. 11.122),
we assumed that $I_\nu(z)$ was not present. How do we know that the integral
representation (Eq. 11.122) does not yield $K_\nu(z) + \varepsilon I_\nu(z)$ with $\varepsilon \ne 0$?
11.6.2 (a) Show that
$$y(z) = z^{\nu}\int e^{-zt}(t^2-1)^{\nu-1/2}\,dt$$
satisfies the modified Bessel equation, provided the contour is chosen so that
$$e^{-zt}(t^2-1)^{\nu+1/2}$$
has the same value at the initial and final points of the contour.
(b) Verify that the contours shown in Fig. 11.13 are suitable for this problem.
FIG. 11.13 Modified Bessel function contours
11.6.3 Use the asymptotic expansions to verify the following Wronskian formulas:
(a) $J_\nu(x)J_{-\nu-1}(x) + J_{-\nu}(x)J_{\nu+1}(x) = -\dfrac{2\sin\nu\pi}{\pi x}$,
(b) $J_\nu(x)N_{\nu+1}(x) - J_{\nu+1}(x)N_\nu(x) = -\dfrac{2}{\pi x}$,
(c) $J_\nu(x)H_{\nu-1}^{(2)}(x) - J_{\nu-1}(x)H_\nu^{(2)}(x) = \dfrac{2}{i\pi x}$,
(d) $I_\nu(x)K_\nu'(x) - I_\nu'(x)K_\nu(x) = -\dfrac{1}{x}$,
(e) $I_\nu(x)K_{\nu+1}(x) + I_{\nu+1}(x)K_\nu(x) = \dfrac{1}{x}$.
11.6.4 From the asymptotic form of $K_\nu(z)$, Eq. 11.127, derive the asymptotic form of
$H_\nu^{(1)}(z)$, Eq. 11.131. Note particularly the phase, $(\nu + \tfrac12)\pi/2$.
11.6.5 Stokes's method.
(a) Replace the Bessel function in Bessel's equation by $x^{-1/2}y(x)$ and show that
$y(x)$ satisfies
$$y''(x) + \left(1 - \frac{\nu^2 - \frac14}{x^2}\right)y(x) = 0.$$
(b) Develop a power-series solution with negative powers of $x$ starting with
the assumed form
$$y(x) = e^{ix}\sum_{n=0}^{\infty} a_n x^{-n}.$$
Determine the recurrence relation giving $a_{n+1}$ in terms of $a_n$. Check your
result against the asymptotic series, Eq. 11.131.
(c) From the results of Section 7.4 determine the initial coefficient, $a_0$.
11.6.6 Calculate the first 15 partial sums of $P_0(x)$ and $Q_0(x)$, Eqs. 11.129a and 11.129b.
Let $x$ vary from 4 to 10 in unit steps. Determine the number of terms to be retained
for maximum accuracy and the accuracy achieved as a function of $x$. Specifically,
how small may $x$ be without raising the error above $3 \times 10^{-6}$?
ANS. $x_{\min} = 6$.
11.6.7 (a) Using the asymptotic series (partial sums) $P_0(x)$ and $Q_0(x)$ determined in
Exercise 11.6.6, write a function subprogram FCT(X) that will calculate
$J_0(x)$, $x$ real, for $x \ge x_{\min}$.
(b) Test your function by comparing it with the $J_0(x)$ (tables or computer
library subroutine) for $x$ from $x_{\min}$ to $x_{\min} + 10$.
Note. A more accurate and perhaps simpler asymptotic form for $J_0(x)$ is given
in AMS-55, Eq. 9.4.3.
11.7 SPHERICAL BESSEL FUNCTIONS
When the Helmholtz equation is separated in spherical coordinates the
radial equation has the form
$$r^2\frac{d^2R}{dr^2} + 2r\frac{dR}{dr} + \left[k^2r^2 - n(n+1)\right]R = 0. \tag{11.139}$$
This is Eq. 2.91 of Section 2.6. The parameter $k$ enters from the original
Helmholtz equation while $n(n + 1)$ is a separation constant. From the behavior of
the polar angle function (Legendre's equation, Sections 8.5 and 12.7), the
separation constant must have this form, with $n$ a nonnegative integer.
Equation 11.139 has the virtue of being self-adjoint but clearly it is not Bessel's
equation. However, if we substitute
$$R(kr) = \frac{Z(kr)}{(kr)^{1/2}},$$
Equation 11.139 becomes
$$r^2\frac{d^2Z}{dr^2} + r\frac{dZ}{dr} + \left[k^2r^2 - \left(n+\tfrac12\right)^2\right]Z = 0, \tag{11.140}$$
which is Bessel's equation. $Z$ is a Bessel function of order $n + \tfrac12$ ($n$ an integer).
Because of the importance of spherical coordinates, this combination,
$$\frac{Z_{n+1/2}(kr)}{(kr)^{1/2}},$$
occurs quite often.
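Definitions
It is convenient to label these functions spherical Bessel functions with the
following defining equations:
$$j_n(x) = \sqrt{\frac{\pi}{2x}}\,J_{n+1/2}(x),$$
$$n_n(x) = \sqrt{\frac{\pi}{2x}}\,N_{n+1/2}(x) = (-1)^{n+1}\sqrt{\frac{\pi}{2x}}\,J_{-n-1/2}(x),^{1}$$
$$h_n^{(1)}(x) = \sqrt{\frac{\pi}{2x}}\,H_{n+1/2}^{(1)}(x) = j_n(x) + i\,n_n(x), \tag{11.141}$$
$$h_n^{(2)}(x) = \sqrt{\frac{\pi}{2x}}\,H_{n+1/2}^{(2)}(x) = j_n(x) - i\,n_n(x).$$
These spherical Bessel functions (Figs. 11.14 and 11.15) can be expressed in
series form by using the series (Eq. 11.5) for $J_n$, replacing $n$ with $n + \tfrac12$:
$$J_{n+1/2}(x) = \sum_{s=0}^{\infty}\frac{(-1)^s}{s!\,(s+n+\frac12)!}\left(\frac{x}{2}\right)^{2s+n+1/2}. \tag{11.142}$$
Using the Legendre duplication formula,
$$z!\,(z+\tfrac12)! = 2^{-2z-1}\pi^{1/2}(2z+1)!, \tag{11.143}$$
¹ This is possible because $\cos(n + \tfrac12)\pi = 0$.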
FIG. 11.14 Spherical Bessel functions
FIG. 11.15 Spherical Neumann functions
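we have
$$j_n(x) = \sqrt{\frac{\pi}{2x}}\sum_{s=0}^{\infty}\frac{(-1)^s\,2^{2s+2n+1}(s+n)!}{\pi^{1/2}(2s+2n+1)!\,s!}\left(\frac{x}{2}\right)^{2s+n+1/2}
= 2^n x^n\sum_{s=0}^{\infty}\frac{(-1)^s(s+n)!}{s!\,(2s+2n+1)!}\,x^{2s}. \tag{11.144}$$
Now $N_{n+1/2}(x) = (-1)^{n+1}J_{-n-1/2}(x)$ and from Eq. 11.5 we find that
$$J_{-n-1/2}(x) = \sum_{s=0}^{\infty}\frac{(-1)^s}{s!\,(s-n-\frac12)!}\left(\frac{x}{2}\right)^{2s-n-1/2}. \tag{11.145}$$
This yields
$$n_n(x) = (-1)^{n+1}\frac{2^n\pi^{1/2}}{x^{n+1}}\sum_{s=0}^{\infty}\frac{(-1)^s}{s!\,(s-n-\frac12)!}\left(\frac{x}{2}\right)^{2s}. \tag{11.146}$$
The Legendre duplication formula can be used again to give
$$n_n(x) = \frac{(-1)^{n+1}}{2^n x^{n+1}}\sum_{s=0}^{\infty}\frac{(-1)^s(s-n)!}{s!\,(2s-2n)!}\,x^{2s}. \tag{11.147}$$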
These series forms, Eqs. 11.144 and 11.147, are useful in three ways: (1) limiting
values as $x \to 0$, (2) closed form representations for $n = 0$, and, as an extension
of this, (3) an indication that the spherical Bessel functions are closely related
to sine and cosine.
For the special case $n = 0$ we find from Eq. 11.144
$$j_0(x) = \sum_{s=0}^{\infty}\frac{(-1)^s}{(2s+1)!}\,x^{2s} = \frac{\sin x}{x}, \tag{11.148}$$
whereas for $n_0$ Eq. 11.147 yields
$$n_0(x) = -\frac{\cos x}{x}. \tag{11.149}$$
From the definition of the spherical Hankel functions (Eq. 11.141),
$$h_0^{(1)}(x) = \frac{1}{x}(\sin x - i\cos x) = -\frac{i}{x}e^{ix},$$
$$h_0^{(2)}(x) = \frac{1}{x}(\sin x + i\cos x) = \frac{i}{x}e^{-ix}. \tag{11.150}$$
Equations 11.148 and 11.149 suggest expressing the spherical Bessel functions
as combinations of sine and cosine. The appropriate combinations can be
developed from the power-series solution, Eqs. 11.144 and 11.147, but this
approach is awkward. Actually the trigonometric forms are already available
as the asymptotic expansion of Section 11.6. From Eqs. 11.131 and 11.129a
$$h_n^{(1)}(z) = \sqrt{\frac{\pi}{2z}}\,H_{n+1/2}^{(1)}(z) = (-i)^{n+1}\frac{e^{iz}}{z}\left\{P_{n+1/2}(z) + iQ_{n+1/2}(z)\right\}. \tag{11.151}$$
Now $P_{n+1/2}$ and $Q_{n+1/2}$ are polynomials. This means that Eq. 11.151 is
mathematically exact, not simply an asymptotic approximation. We obtain
$$h_n^{(1)}(z) = (-i)^{n+1}\frac{e^{iz}}{z}\sum_{s=0}^{n}\frac{i^s}{s!\,(8z)^s}\,\frac{(2n+2s)!!}{(2n-2s)!!}
= (-i)^{n+1}\frac{e^{iz}}{z}\sum_{s=0}^{n}\frac{i^s}{s!\,(2z)^s}\,\frac{(n+s)!}{(n-s)!}. \tag{11.152}$$
Often a factor $(-i)^n = (e^{-i\pi/2})^n$ will be combined with the $e^{iz}$ to give $e^{i(z-n\pi/2)}$.
For $z$ real, $j_n(z)$ is the real part of this, $n_n(z)$ the imaginary part, and $h_n^{(2)}(z)$ the
complex conjugate.
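Specifically,
$$h_1^{(1)}(x) = e^{ix}\left(-\frac{1}{x} - \frac{i}{x^2}\right), \tag{11.153a}$$
$$h_2^{(1)}(x) = e^{ix}\left(\frac{i}{x} - \frac{3}{x^2} - \frac{3i}{x^3}\right), \tag{11.153b}$$
$$j_0(x) = \frac{\sin x}{x},$$
$$j_1(x) = \frac{\sin x}{x^2} - \frac{\cos x}{x}, \tag{11.154}$$
$$j_2(x) = \left(\frac{3}{x^3} - \frac{1}{x}\right)\sin x - \frac{3}{x^2}\cos x,$$
$$n_0(x) = -\frac{\cos x}{x},$$
$$n_1(x) = -\frac{\cos x}{x^2} - \frac{\sin x}{x}, \tag{11.155}$$
$$n_2(x) = -\left(\frac{3}{x^3} - \frac{1}{x}\right)\cos x - \frac{3}{x^2}\sin x,$$
and so on.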
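Limiting Values
For $x \ll 1$,² Eqs. 11.144 and 11.147 yield
$$j_n(x) \approx \frac{2^n n!}{(2n+1)!}\,x^n = \frac{x^n}{(2n+1)!!}, \tag{11.156}$$
$$n_n(x) \approx \frac{(-1)^{n+1}}{2^n}\cdot\frac{(-n)!}{(-2n)!}\,x^{-n-1} = -\frac{(2n)!}{2^n n!}\,x^{-n-1} = -(2n-1)!!\,x^{-n-1}. \tag{11.157}$$
² The condition that the second term in the series be negligible compared to
the first is actually $x \ll 2[(2n + 2)(2n + 3)/(n + 1)]^{1/2}$ for $j_n(x)$.
The transformation of factorials in the expressions for $n_n(x)$ employs Exercise
10.1.3. The limiting values of the spherical Hankel functions go as $\pm i\,n_n(x)$.
The asymptotic values of $j_n$, $n_n$, $h_n^{(2)}$, and $h_n^{(1)}$ may be obtained from the Bessel
asymptotic forms, Section 11.6. We find
$$j_n(x) \sim \frac{1}{x}\sin\left(x - \frac{n\pi}{2}\right), \tag{11.158}$$
$$n_n(x) \sim -\frac{1}{x}\cos\left(x - \frac{n\pi}{2}\right), \tag{11.159}$$
$$h_n^{(1)}(x) \sim (-i)^{n+1}\frac{e^{ix}}{x} = -i\,\frac{e^{i(x-n\pi/2)}}{x}, \tag{11.160a}$$
$$h_n^{(2)}(x) \sim i^{n+1}\frac{e^{-ix}}{x} = i\,\frac{e^{-i(x-n\pi/2)}}{x}. \tag{11.160b}$$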
The condition for these spherical Bessel forms is that $x \gg n(n + 1)/2$. From
these asymptotic values we see that $j_n(x)$ and $n_n(x)$ are appropriate for a
description of standing spherical waves; $h_n^{(1)}(x)$ and $h_n^{(2)}(x)$ correspond to traveling
spherical waves. If the time dependence for the traveling waves is taken to be
$e^{-i\omega t}$, then $h_n^{(1)}(x)$ yields an outgoing traveling spherical wave, $h_n^{(2)}(x)$ an incoming
wave. Radiation theory in electromagnetism and scattering theory in quantum
mechanics provide many applications.
Recurrence Relations
The recurrence relations to which we now turn provide a convenient way
of developing the higher-order spherical Bessel functions. These recurrence
relations may be derived from the series, but as with the modified Bessel
functions, it is easier to substitute into the known recurrence relations (Eqs. 11.10
and 11.12). This gives
$$f_{n-1}(x) + f_{n+1}(x) = \frac{2n+1}{x}\,f_n(x), \tag{11.161}$$
$$n f_{n-1}(x) - (n+1) f_{n+1}(x) = (2n+1) f_n'(x). \tag{11.162}$$
Rearranging these relations (or substituting into Eqs. 11.15 and 11.17), we
obtain
$$\frac{d}{dx}\left[x^{n+1}f_n(x)\right] = x^{n+1}f_{n-1}(x), \tag{11.163}$$
$$\frac{d}{dx}\left[x^{-n}f_n(x)\right] = -x^{-n}f_{n+1}(x). \tag{11.164}$$
Here $f_n$ may represent $j_n$, $n_n$, $h_n^{(1)}$, or $h_n^{(2)}$.
The specific forms, Eqs. 11.154 and 11.155, may also be readily obtained from
Eq. 11.164.
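As an illustration of how the recurrence relation generates the higher-order functions, the following sketch applies $f_{n+1}(x) = \frac{2n+1}{x}f_n(x) - f_{n-1}(x)$ upward, starting from the closed forms for $n = 0$ and $n = 1$. The function name is hypothetical, and the sketch assumes $x \ne 0$; it is meant as a check on the closed forms, not as a production routine.

```python
import math

def spherical_upward(n_max, x):
    """Return lists [j_0..j_n_max] and [n_0..n_n_max] by upward recurrence.

    Starting values are the closed forms for n = 0 and n = 1.
    """
    j = [math.sin(x) / x, math.sin(x) / x**2 - math.cos(x) / x]
    n = [-math.cos(x) / x, -math.cos(x) / x**2 - math.sin(x) / x]
    for m in range(1, n_max):
        # f_{m+1} = (2m+1)/x * f_m - f_{m-1}
        j.append((2 * m + 1) / x * j[m] - j[m - 1])
        n.append((2 * m + 1) / x * n[m] - n[m - 1])
    return j[: n_max + 1], n[: n_max + 1]

if __name__ == "__main__":
    x = 2.0
    j, n = spherical_upward(3, x)
    j2_closed = (3 / x**3 - 1 / x) * math.sin(x) - 3 / x**2 * math.cos(x)
    print(j[2], j2_closed)
```

The recurrence reproduces the trigonometric form of $j_2$ and $n_2$ to rounding error when $x$ is not small compared with $n$.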
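By mathematical induction we may establish the Rayleigh formulas
$$j_n(x) = (-1)^n x^n\left(\frac{1}{x}\frac{d}{dx}\right)^n\left(\frac{\sin x}{x}\right), \tag{11.165}$$
$$n_n(x) = -(-1)^n x^n\left(\frac{1}{x}\frac{d}{dx}\right)^n\left(\frac{\cos x}{x}\right), \tag{11.166}$$
$$h_n^{(1)}(x) = -i(-1)^n x^n\left(\frac{1}{x}\frac{d}{dx}\right)^n\left(\frac{e^{ix}}{x}\right),$$
$$h_n^{(2)}(x) = i(-1)^n x^n\left(\frac{1}{x}\frac{d}{dx}\right)^n\left(\frac{e^{-ix}}{x}\right). \tag{11.167}$$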
Numerical Computation
The spherical Bessel and modified Bessel functions may be computed using
the same techniques described in Sections 11.1 and 11.6 for evaluating the
Bessel functions. For $j_n(x)$ and $i_n(x)$,³ it is convenient to use Eq. 11.161 and
Exercise 11.7.18 and work downward, as is done for $J_n(x)$. Normalization is
accomplished by comparing with the known forms of $j_0(x)$ and $i_0(x)$, Eq. 11.154
and Exercise 11.7.15. For $n_n(x)$ and $k_n(x)$, Eq. 11.161 and Exercise 11.7.19 are
used again, but this time working upward, starting with the known forms of
$n_0(x)$, $n_1(x)$, $k_0(x)$, and $k_1(x)$, Eq. 11.155 and Exercise 11.7.17.
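The downward scheme for $j_n(x)$ can be sketched as below. This is a minimal illustration under stated assumptions rather than the text's program: the recurrence $f_{n-1}(x) = \frac{2n+1}{x}f_n(x) - f_{n+1}(x)$ is started from arbitrary tiny seed values at an order well above both $n_{\max}$ and $x$ (the margin `extra` is a heuristic), and the whole sequence is then rescaled so that its $n = 0$ member equals $\sin x / x$. The function name and parameters are hypothetical.

```python
import math

def spherical_jn_downward(n_max, x, extra=20):
    """Return [j_0(x), ..., j_n_max(x)] by downward recurrence and normalization.

    Limitations of this sketch: no rescaling during the recurrence (very small x
    can overflow), and the normalization fails where j_0(x) = 0 (x = pi, 2 pi, ...).
    """
    start = max(n_max, int(x)) + extra      # heuristic starting order
    f_next, f_curr = 0.0, 1.0e-30           # arbitrary seeds for orders start+1, start
    values = [0.0] * (start + 1)
    values[start] = f_curr
    for m in range(start, 0, -1):
        # f_{m-1} = (2m+1)/x * f_m - f_{m+1}
        f_prev = (2 * m + 1) / x * f_curr - f_next
        values[m - 1] = f_prev
        f_next, f_curr = f_curr, f_prev
    scale = (math.sin(x) / x) / values[0]   # normalize against the known j_0
    return [v * scale for v in values[: n_max + 1]]

if __name__ == "__main__":
    x = 0.1
    for order, value in enumerate(spherical_jn_downward(5, x)):
        print(order, value)
```

For small $x$ the results can be compared with the leading terms of the series, for example $j_2(x) \approx \frac{x^2}{15}\left(1 - \frac{x^2}{14}\right)$. An upward recurrence for $j_n$ at the same $x$ would suffer from cancellation, which is the reason for working downward.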
Orthogonality
We may take the orthogonality integral for the ordinary Bessel functions
(Eq. 11.50),
$$\int_0^a J_\nu\!\left(\alpha_{\nu p}\frac{\rho}{a}\right)J_\nu\!\left(\alpha_{\nu q}\frac{\rho}{a}\right)\rho\,d\rho = \frac{a^2}{2}\left[J_{\nu+1}(\alpha_{\nu p})\right]^2\delta_{pq}, \tag{11.168}$$
and substitute in the expression for $j_n$ to obtain
$$\int_0^a j_n\!\left(\alpha_{np}\frac{\rho}{a}\right)j_n\!\left(\alpha_{nq}\frac{\rho}{a}\right)\rho^2\,d\rho = \frac{a^3}{2}\left[j_{n+1}(\alpha_{np})\right]^2\delta_{pq}. \tag{11.169}$$
Here $\alpha_{np}$ and $\alpha_{nq}$ are roots of $j_n$.
This represents orthogonality with respect to the roots of the Bessel
functions. An illustration of this sort of orthogonality is provided later in this
section by the problem of a particle in a sphere. Equation 11.169 guarantees
orthogonality of the wave functions $j_n(r)$ for fixed $n$. (If $n$ varies, the spherical
harmonic will provide orthogonality.)
³ The spherical modified Bessel functions, $i_n(x)$ and $k_n(x)$, are defined in
Exercise 11.7.15.
EXAMPLE 11.7.1. Particle in a Sphere
An illustration of the use of the spherical Bessel functions is provided by the
problem of a quantum mechanical particle in a sphere of radius $a$. Quantum
theory requires that the wave function $\psi$, describing our particle, satisfy
$$-\frac{\hbar^2}{2m}\nabla^2\psi = E\psi, \tag{11.170}$$
and the boundary conditions (1) $\psi(r \le a)$ remains finite, (2) $\psi(a) = 0$. This
corresponds to a potential $V = 0$, $r \le a$, and $V = \infty$, $r > a$. Here $\hbar$ is Planck's
constant (divided by $2\pi$), $m$, the mass of our particle, and $E$, its energy. Let us
determine the minimum value of the energy for which our wave equation has an
acceptable solution. Equation 11.170 is just Helmholtz's equation with a radial
part (compare Section 2.6 for separation of variables):
$$\frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} + \left[k^2 - \frac{n(n+1)}{r^2}\right]R = 0, \tag{11.171}$$
with $k^2 = 2mE/\hbar^2$. Hence by Eq. 11.139, with $n = 0$,
$$R = A\,j_0(kr) + B\,n_0(kr).$$
We choose the index $n = 0$, for any angular dependence would raise the energy.
The spherical Neumann function is rejected because of its divergent behavior at
the origin. Technically, the spherical Neumann function $n_0$ is a Green's function
satisfying Green's equation and not satisfying the Schrodinger wave equation
at the origin. To satisfy the second boundary condition (for all angles), we require
$$ka = \alpha, \tag{11.172}$$
where $\alpha$ is a root of $j_0$, that is, $j_0(\alpha) = 0$. This has the effect of limiting the
allowable energies to a certain discrete set or, in other words, application of boundary
condition (2) quantizes the energy $E$. The smallest $\alpha$ is the first zero of $j_0$,
$$\alpha = \pi$$
and
$$E_{\min} = \frac{\pi^2\hbar^2}{2ma^2} = \frac{h^2}{8ma^2}, \tag{11.173}$$
which means that for any finite sphere the particle will have a positive minimum
or zero-point energy. This is an illustration of the Heisenberg uncertainty
principle.
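To get a feeling for the size of this zero-point energy, the short sketch below evaluates $E = \hbar^2 k^2/2m$ with $ka = \pi$ for an electron confined to a sphere of radius 1 nm. The particle, the radius, and the numerical constants (standard SI values) are assumptions chosen purely for illustration; none of them comes from the example itself.

```python
import math

# Physical constants in SI units (standard values, quoted here as assumptions).
HBAR = 1.054571817e-34         # J s
M_ELECTRON = 9.1093837015e-31  # kg
EV = 1.602176634e-19           # J per electron volt

def sphere_ground_energy(mass, radius):
    """Lowest energy of a particle in an impenetrable sphere: k a = pi."""
    k = math.pi / radius
    return (HBAR * k) ** 2 / (2.0 * mass)

# Illustrative case: an electron in a sphere of radius 1 nm.
e_joule = sphere_ground_energy(M_ELECTRON, 1.0e-9)
e_ev = e_joule / EV

if __name__ == "__main__":
    print(f"E_min = {e_joule:.4e} J = {e_ev:.4f} eV")
```

Because $E_{\min} \propto 1/a^2$, halving the radius raises the zero-point energy by a factor of four.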
In solid state physics, astrophysics, and other areas of physics we may wish
to know how many different solutions (energy states) correspond to energies
less than or equal to some fixed energy $E_0$. For a cubic volume (Exercise 2.6.5)
the problem is fairly simple. The considerably more difficult spherical case is
worked out by R. H. Lambert, Am. J. Phys. 36, 417, 1169 (1968).
Another form, orthogonality with respect to the indices, may be written as
$$\int_{-\infty}^{\infty} j_m(x)\,j_n(x)\,dx = 0, \qquad m \ne n, \quad m, n \ge 0. \tag{11.174}$$
The proof is left as Exercise 11.7.10. If $m = n$ (compare Exercise 11.7.11), we have
$$\int_{-\infty}^{\infty}\left[j_n(x)\right]^2 dx = \frac{\pi}{2n+1}. \tag{11.175}$$
Most physical applications of orthogonal Bessel and spherical Bessel
functions involve orthogonality with varying roots and an interval $[0, a]$, Eqs.
11.168 and 11.169. Orthogonality with varying index, Eq. 11.174, is mainly a
mathematical curiosity.
The spherical Bessel functions will enter again in connection with spherical
waves, but further consideration is postponed until the corresponding angular
functions, the Legendre functions, have been introduced.
EXERCISES
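11.7.1 Show that if
$$n_n(x) = \sqrt{\frac{\pi}{2x}}\,N_{n+1/2}(x),$$
it automatically equals
$$(-1)^{n+1}\sqrt{\frac{\pi}{2x}}\,J_{-n-1/2}(x).$$
11.7.2 Derive the trigonometric-polynomial forms of $j_n(z)$ and $n_n(z)$.⁴
(a) $$j_n(z) = \frac{1}{z}\sin\left(z - \frac{n\pi}{2}\right)\sum_{s=0}^{[n/2]}\frac{(-1)^s(n+2s)!}{(2s)!\,(2z)^{2s}(n-2s)!}
+ \frac{1}{z}\cos\left(z - \frac{n\pi}{2}\right)\sum_{s=0}^{[(n-1)/2]}\frac{(-1)^s(n+2s+1)!}{(2s+1)!\,(2z)^{2s+1}(n-2s-1)!},$$
(b) $$n_n(z) = \frac{(-1)^{n+1}}{z}\cos\left(z + \frac{n\pi}{2}\right)\sum_{s=0}^{[n/2]}\frac{(-1)^s(n+2s)!}{(2s)!\,(2z)^{2s}(n-2s)!}
+ \frac{(-1)^{n}}{z}\sin\left(z + \frac{n\pi}{2}\right)\sum_{s=0}^{[(n-1)/2]}\frac{(-1)^s(n+2s+1)!}{(2s+1)!\,(2z)^{2s+1}(n-2s-1)!}.$$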
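11.7.3 Use the integral representation of $J_\nu(x)$,
$$J_\nu(x) = \frac{1}{\pi^{1/2}(\nu-\frac12)!}\left(\frac{x}{2}\right)^{\nu}\int_{-1}^{1} e^{\pm ixp}(1-p^2)^{\nu-1/2}\,dp,$$
to show that the spherical Bessel functions $j_n(x)$ are expressible in terms of
trigonometric functions; that is, for example,
$$j_0(x) = \frac{\sin x}{x}, \qquad j_1(x) = \frac{\sin x}{x^2} - \frac{\cos x}{x}.$$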
⁴ The upper limit on the summation $[n/2]$ means the largest integer that does
not exceed $n/2$.
11.7.4 (a) Derive the recurrence relations
$$f_{n-1}(x) + f_{n+1}(x) = \frac{2n+1}{x}\,f_n(x),$$
$$n f_{n-1}(x) - (n+1) f_{n+1}(x) = (2n+1) f_n'(x),$$
satisfied by the spherical Bessel functions, $j_n(x)$, $n_n(x)$, $h_n^{(1)}(x)$, and $h_n^{(2)}(x)$.
(b) Show, from these two recurrence relations, that the spherical Bessel
function $f_n(x)$ satisfies the differential equation
$$x^2 f_n''(x) + 2x f_n'(x) + \left[x^2 - n(n+1)\right]f_n(x) = 0.$$
11.7.5 Prove by mathematical induction that
$$j_n(x) = (-1)^n x^n\left(\frac{1}{x}\frac{d}{dx}\right)^n\left(\frac{\sin x}{x}\right)$$
for $n$ an arbitrary nonnegative integer.
11.7.6 From the discussion of orthogonality of the spherical Bessel functions, show
that a Wronskian relation for $j_n(x)$ and $n_n(x)$ is
$$j_n(x)\,n_n'(x) - j_n'(x)\,n_n(x) = \frac{1}{x^2}.$$
11.7.7 Verify
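11.7.8 Verify Poisson's integral representation of the spherical Bessel function,
$$j_n(z) = \frac{z^n}{2^{n+1}n!}\int_0^{\pi}\cos(z\cos\theta)\sin^{2n+1}\theta\,d\theta.$$
11.7.9 Show that
$$\int_0^{\infty} J_\mu(x)\,J_\nu(x)\,\frac{dx}{x} = \frac{2}{\pi}\,\frac{\sin[(\mu-\nu)\pi/2]}{\mu^2-\nu^2}.$$
11.7.10 Derive Eq. 11.174:
$$\int_{-\infty}^{\infty} j_m(x)\,j_n(x)\,dx = 0, \qquad m \ne n, \quad m, n \ge 0.$$
11.7.11 Derive Eq. 11.175:
$$\int_{-\infty}^{\infty}\left[j_n(x)\right]^2 dx = \frac{\pi}{2n+1}.$$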
11.7.12 Set up the orthogonality integral for jL{kr) in a sphere of radius R with the
boundary condition
- o.
The result is used in classifying electromagnetic radiation according to its
angular momentum.
11.7.13 The Fresnel integrals (Fig. 11.16) occurring in diffraction theory are given by
$$x(t) = \int_0^{t}\cos(v^2)\,dv, \qquad y(t) = \int_0^{t}\sin(v^2)\,dv.$$
FIG. 11.16 Fresnel integrals
Show that these integrals may be expanded in series of spherical Bessel functions,
$$x(s) = \frac12\int_0^{s} j_{-1}(u)\,u^{1/2}\,du = s^{1/2}\sum_{n=0}^{\infty} j_{2n}(s),$$
$$y(s) = \frac12\int_0^{s} j_0(u)\,u^{1/2}\,du = s^{1/2}\sum_{n=0}^{\infty} j_{2n+1}(s),$$
where $s = t^2$.
Hint. To establish the equality of the integral and the sum, you may wish to
work with their derivatives. The spherical Bessel analogs of Eqs. 11.12 and 11.14
are helpful.
11.7.14 A hollow sphere of radius $a$ (Helmholtz resonator) contains standing sound
waves. Find the minimum frequency of oscillation in terms of the radius $a$ and
the velocity of sound $v$. The sound waves satisfy the wave equation
$$\nabla^2\psi = \frac{1}{v^2}\frac{\partial^2\psi}{\partial t^2}$$
and the boundary condition
$$\frac{\partial\psi}{\partial r} = 0, \qquad r = a.$$
This is a Neumann boundary condition. Example 11.7.1 has the same differential
equation but with a Dirichlet boundary condition.
ANS. $\nu_{\min} = 0.3313\,v/a$,
$\lambda_{\max} = 3.018\,a$.
FIG. 11.17 Spherical modified Bessel functions
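11.7.15 Defining the spherical modified Bessel functions (Fig. 11.17) by
$$i_n(x) = \sqrt{\frac{\pi}{2x}}\,I_{n+1/2}(x), \qquad k_n(x) = \sqrt{\frac{2}{\pi x}}\,K_{n+1/2}(x),$$
show that
$$i_0(x) = \frac{\sinh x}{x}, \qquad k_0(x) = \frac{e^{-x}}{x}.$$
Note that the numerical factors in the definitions of $i_n$ and $k_n$ are not identical.
11.7.16 (a) Show that the parity of $i_n(x)$ is $(-1)^n$.
(b) Show that $k_n(x)$ has no definite parity.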
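11.7.17 Show that the spherical modified Bessel functions satisfy the following relations:
(a) $i_n(x) = i^{-n}j_n(ix)$,
$k_n(x) = -i^{n}h_n^{(1)}(ix)$,
(b) $i_{n+1}(x) = x^n\dfrac{d}{dx}\left(x^{-n}i_n\right)$,
$k_{n+1}(x) = -x^n\dfrac{d}{dx}\left(x^{-n}k_n\right)$,
(c) $i_n(x) = x^n\left(\dfrac{1}{x}\dfrac{d}{dx}\right)^n\dfrac{\sinh x}{x}$,
$k_n(x) = (-1)^n x^n\left(\dfrac{1}{x}\dfrac{d}{dx}\right)^n\dfrac{e^{-x}}{x}$.
11.7.18 Show that the recurrence relations for $i_n(x)$ and $k_n(x)$ are
(a) $i_{n-1}(x) - i_{n+1}(x) = \dfrac{2n+1}{x}\,i_n(x)$,
$n\,i_{n-1}(x) + (n+1)\,i_{n+1}(x) = (2n+1)\,i_n'(x)$,
(b) $k_{n-1}(x) - k_{n+1}(x) = -\dfrac{2n+1}{x}\,k_n(x)$,
$n\,k_{n-1}(x) + (n+1)\,k_{n+1}(x) = -(2n+1)\,k_n'(x)$.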
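11.7.19 Derive the limiting values for the spherical modified Bessel functions
(a) $i_n(x) \approx \dfrac{x^n}{(2n+1)!!}$, $\quad k_n(x) \approx \dfrac{(2n-1)!!}{x^{n+1}}$, $\quad x \ll 1$.
(b) $i_n(x) \sim \dfrac{e^{x}}{2x}$, $\quad k_n(x) \sim \dfrac{e^{-x}}{x}$, $\quad x \gg n(n+1)/2$.
11.7.20 Show that the Wronskian of the spherical modified Bessel functions is given by
$$i_n(x)\,k_n'(x) - i_n'(x)\,k_n(x) = -\frac{1}{x^2}.$$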
11.7.21 A quantum particle is trapped in a "square" well of radius $a$. The Schrodinger
equation potential is
$$V(r) = \begin{cases} -V_0, & 0 \le r < a \\ 0, & r > a. \end{cases}$$
The particle's energy $E$ is negative (an eigenvalue).
(a) Show that the radial part of the wave function is given by $j_l(k_1 r)$ for $0 \le
r < a$ and $k_l(k_2 r)$ for $r > a$. (We require that $\psi(0)$ and $\psi(\infty)$ be finite.) Here
$k_1^2 = 2M(E + V_0)/\hbar^2$, $k_2^2 = -2ME/\hbar^2$, and $l$ is the angular momentum ($n$
in Eq. 11.139).
(b) The boundary condition at $r = a$ is that the wave function $\psi(r)$ and its
first derivative be continuous. Show that this means
$$\left.\frac{(d/dr)\,j_l(k_1 r)}{j_l(k_1 r)}\right|_{r=a} = \left.\frac{(d/dr)\,k_l(k_2 r)}{k_l(k_2 r)}\right|_{r=a}.$$
This equation determines the energy eigenvalues.
Note. This is a generalization of Example 9.1.2.
11.7.22 The quantum mechanical radial wave function for a scattered wave is given by
$$\psi_k(r) = \frac{\sin(kr + \delta_0)}{kr},$$
where $k$ is the wave number, $k = \sqrt{2mE}/\hbar$, and $\delta_0$ is the scattering phase shift.
Show that the normalization integral is
$$\int_0^{\infty}\psi_k(r)\,\psi_{k'}(r)\,r^2\,dr = \frac{\pi}{2k^2}\,\delta(k - k').$$
Hint. You can use a sine representation of the Dirac delta function. See Exercise
15.3.8.
11.7.23 Derive the spherical Bessel function closure relation
$$\frac{2a^2}{\pi}\int_0^{\infty} j_n(ar)\,j_n(br)\,r^2\,dr = \delta(a - b).$$
Note. An interesting derivation involving Fourier transforms, the Rayleigh
plane wave expansion, and spherical harmonics has been given by P. Ugincius,
Am. J. Phys., 40, 1690 (1972).
11.7.24 (a) Write a subroutine that will generate the spherical Bessel functions, $j_n(x)$,
that is, will generate the numerical value of $j_n(x)$ given $x$ and $n$.
Note. One possibility is to use the explicit known forms of $j_0$ and $j_1$ and
to develop the higher index $j_n$ by repeated application of the recurrence
relation.
(b) Check your subroutine by an independent calculation such as Eq. 11.153.
If possible, compare the machine time needed for this check with the time
required for your subroutine.
11.7.25 The wave function of a particle in a sphere (Example 11.7.1) with angular
momentum $l$ is $\psi(r, \theta, \varphi) = A\,j_l\!\left(\dfrac{\sqrt{2ME}}{\hbar}\,r\right)Y_l^m(\theta, \varphi)$. The $Y_l^m(\theta, \varphi)$ is a spherical
harmonic, described in Section 12.6. From the boundary condition $\psi(a, \theta, \varphi) = 0$
or $j_l\!\left(\dfrac{\sqrt{2ME}}{\hbar}\,a\right) = 0$ calculate the 10 lowest energy states. Disregard the $m$
degeneracy ($2l + 1$ values of $m$ for each choice of $l$). Check your results against
AMS-55, Table 10.6.
Hint. You can use your spherical Bessel subroutine and a root-finding subroutine.
Check values. $j_l(\alpha_{ls}) = 0$,
$\alpha_{01} = 3.1416$
$\alpha_{11} = 4.4934$
$\alpha_{21} = 5.7635$
$\alpha_{02} = 6.2832$.
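The check values above can be reproduced with a few lines of code. The sketch below is only one possible approach and rests on stated assumptions: $j_l(x)$ is generated by upward recurrence from the closed forms of $j_0$ and $j_1$ (adequate here because each root exceeds $l$), the roots are located by plain bisection, and the bracketing intervals were picked by hand from the sign of $j_l$ at their end points. All names are hypothetical.

```python
import math

def spherical_jn(n, x):
    """j_n(x) by upward recurrence from j_0 and j_1 (adequate when x > n)."""
    j_prev = math.sin(x) / x
    if n == 0:
        return j_prev
    j_curr = math.sin(x) / x**2 - math.cos(x) / x
    for m in range(1, n):
        j_prev, j_curr = j_curr, (2 * m + 1) / x * j_curr - j_prev
    return j_curr

def bisect_root(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("interval does not bracket a root")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Hand-picked brackets for (l, s): the s-th zero of j_l.
brackets = {(0, 1): (3.0, 3.5), (1, 1): (4.0, 5.0), (2, 1): (5.0, 6.5), (0, 2): (6.0, 6.5)}
roots = {
    key: bisect_root(lambda x, l=key[0]: spherical_jn(l, x), lo, hi)
    for key, (lo, hi) in brackets.items()
}

if __name__ == "__main__":
    for (l, s), alpha in sorted(roots.items(), key=lambda item: item[1]):
        # Energy in units of hbar^2 / (2 M a^2) is alpha^2.
        print(l, s, round(alpha, 4), round(alpha**2, 4))
```

Each root $\alpha_{ls}$ gives an energy $E = \hbar^2\alpha_{ls}^2/2Ma^2$, so sorting the roots orders the energy states.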
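11.7.26 Let Example 11.7.1 be modified so that the potential is a finite $V_0$ outside
($r > a$).
(a) For $E < V_0$ show that
$$\psi_{\rm out}(r, \theta, \varphi) \sim k_l\!\left(\frac{r}{\hbar}\sqrt{2M(V_0 - E)}\right).$$
(b) The new boundary conditions to be satisfied at $r = a$ are
$$\psi_{\rm in}(a, \theta, \varphi) = \psi_{\rm out}(a, \theta, \varphi),$$
$$\frac{\partial}{\partial r}\psi_{\rm in}(a, \theta, \varphi) = \frac{\partial}{\partial r}\psi_{\rm out}(a, \theta, \varphi),$$
or
$$\left.\frac{1}{\psi_{\rm in}}\frac{\partial\psi_{\rm in}}{\partial r}\right|_{r=a} = \left.\frac{1}{\psi_{\rm out}}\frac{\partial\psi_{\rm out}}{\partial r}\right|_{r=a}.$$
For $l = 0$ show that the boundary condition at $r = a$ leads to
$$f(E) = k\left[\cot ka - \frac{1}{ka}\right] + k'\left[1 + \frac{1}{k'a}\right] = 0,$$
where $k = \sqrt{2ME}/\hbar$ and $k' = \sqrt{2M(V_0 - E)}/\hbar$.
(c) With $a = 1\hbar^2/Me^2$ (Bohr radius) and $V_0 = 4Me^4/2\hbar^2$, compute the possible
bound states, ($0 < E < V_0$).
Hint. Call a root-finding subroutine after you know the approximate
location of the roots of
$$f(E) = 0, \qquad (0 \le E \le V_0).$$
(d) Show that when $a = 1\hbar^2/Me^2$ the minimum value of $V_0$ for which a bound
state exists is $V_0 = 2.4674\,Me^4/2\hbar^2$.
11.7.27 In some nuclear stripping reactions the differential cross section is proportional
to $[j_l(x)]^2$, where $l$ is the angular momentum. The location of the maximum on
the curve of experimental data permits a determination of $l$, if the location of
the (first) maximum of $j_l(x)$ is known. Compute the location of the first maximum
of $j_1(x)$, $j_2(x)$, and $j_3(x)$.
Note. For better accuracy look for the first zero of $j_l'(x)$. Why is this more accurate
than direct location of the maximum?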
REFERENCES
McBride, E. B., Obtaining Generating Functions. New York: Springer-Verlag (1971).
An introduction to methods of obtaining generating functions.
Watson, G. N., A Treatise on the Theory of Bessel Functions, 2nd ed. Cambridge:
Cambridge University Press (1952).
This is the definitive text on Bessel functions and their properties. Although difficult
reading, it is invaluable as the ultimate reference.
Watson, G. N., Theory of Bessel Functions. Cambridge: Cambridge University Press.
See also the references listed at the end of Chapter 13.
12 LEGENDRE
FUNCTIONS
12.1 GENERATING FUNCTION
Legendre polynomials may appear in many different mathematical and
physical situations: (1) They may originate as solutions of the Legendre differential
equation which we have already encountered in the separation of variables
(Section 2.6) for Laplace's equation, Helmholtz's equation, and similar
differential equations in spherical polar coordinates. (2) They may enter as a
consequence of a Rodrigues' formula (Section 12.4). (3) They may be constructed
as a consequence of demanding a complete, orthogonal set of functions over
the interval $[-1, 1]$ (Gram-Schmidt orthogonalization, Section 9.3). (4) In
quantum mechanics they (really the spherical harmonics, Sections 12.6 and 12.7)
represent angular momentum eigenfunctions. (5) They may be generated by a
generating function. We introduce Legendre polynomials here by way of a
generating function. The development of the various properties and related
functions is shown schematically in Fig. 12.1.
Physical Basis—Electrostatics
As with Bessel functions, it is convenient to introduce the Legendre
polynomials by means of a generating function. However, a direct physical
interpretation is possible. Consider an electric charge $q$ placed on the $z$-axis at $z = a$.
As shown in Fig. 12.2, the electrostatic potential of charge $q$ is
$$\varphi = \frac{1}{4\pi\varepsilon_0}\cdot\frac{q}{r_1} \quad \text{(SI units)}. \tag{12.1}$$
Our problem is to express the electrostatic potential in terms of the spherical
polar coordinates $r$ and $\theta$ (the azimuthal coordinate is absent because of symmetry
about the $z$-axis). Using the law of cosines, we obtain
$$\varphi = \frac{q}{4\pi\varepsilon_0}\left(r^2 + a^2 - 2ar\cos\theta\right)^{-1/2}. \tag{12.2}$$
Legendre Polynomials
Consider the case of $r > a$ or, more precisely, $r^2 > |a^2 - 2ar\cos\theta|$. The
radical may be expanded by the binomial series to give
$$\varphi = \frac{q}{4\pi\varepsilon_0 r}\sum_{n=0}^{\infty}P_n(\cos\theta)\left(\frac{a}{r}\right)^n, \tag{12.3}$$
a series of powers of $(a/r)$ with the coefficient of the $n$th power denoted by
$P_n(\cos\theta)$. The $P_n$ are the Legendre polynomials (Fig. 12.3) and may be defined
by
$$g(t, x) = (1 - 2xt + t^2)^{-1/2} = \sum_{n=0}^{\infty}P_n(x)\,t^n, \qquad |t| < 1. \tag{12.4}$$
FIG. 12.1 Legendre function interrelations (diagram nodes include the hypergeometric
representation, Schlaefli integral, associated Legendre functions, Legendre series,
spherical harmonics, and vector spherical harmonics)
FIG. 12.2 Electrostatic potential. Charge $q$ displaced from origin
FIG. 12.3 Legendre polynomials, $P_2(x)$, $P_3(x)$, $P_4(x)$, and $P_5(x)$
This is equivalent to equating the right-hand sides of Eqs. 12.2 and 12.3 with
$\cos\theta$ replaced by $x$ and $a/r$ replaced by $t$. Equation 12.4 is our generating
function. In the next section it is shown that $|P_n(\cos\theta)| \le 1$, which means that
the series expansion (Eq. 12.4) is convergent for $|t| < 1$.¹ Indeed, the series is
convergent for $|t| = 1$ except for $|x| = 1$.
Actually since Eq. 12.4 defines the Legendre polynomials, $P_n(x)$, convergence
of the series is not necessary. We can still obtain the explicit values of the
polynomials and develop useful relations between them even when the series
diverges. However, the property of convergence is convenient in order to be
able to exploit the properties of power series (Section 5.7).
In physical applications Eq. 12.4 often appears in the vector form
$$\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} = \frac{1}{r_>}\sum_{n=0}^{\infty}\left(\frac{r_<}{r_>}\right)^n P_n(\cos\theta), \tag{12.4a}$$
where
$$r_> = |\mathbf{r}_1|, \quad r_< = |\mathbf{r}_2| \qquad \text{for } |\mathbf{r}_1| > |\mathbf{r}_2|$$
and
$$r_> = |\mathbf{r}_2|, \quad r_< = |\mathbf{r}_1| \qquad \text{for } |\mathbf{r}_2| > |\mathbf{r}_1|.$$
¹ Note that the series in Eq. 12.3 is convergent for $r > a$ even though the
binomial expansion involved is valid only for $r > (a^2 + 2ar)^{1/2}$, $\cos\theta = -1$.
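Using the binomial theorem (Section 5.6) and Exercise 10.1.15, we expand
the generating function as follows:
$$(1 - 2xt + t^2)^{-1/2} = \sum_{n=0}^{\infty}\frac{(2n)!}{2^{2n}(n!)^2}\,(2xt - t^2)^n
= \sum_{n=0}^{\infty}\frac{(2n-1)!!}{(2n)!!}\,(2xt - t^2)^n. \tag{12.5}$$
For the first few Legendre polynomials, say, $P_0$, $P_1$, and $P_2$, we need the
coefficients of $t^0$, $t^1$, and $t^2$. These powers of $t$ appear only in the terms $n = 0$, 1, and
2 and hence we may limit our attention to the first three terms of the infinite
series:
$$\frac{0!}{2^0(0!)^2}(2xt - t^2)^0 + \frac{2!}{2^2(1!)^2}(2xt - t^2)^1 + \frac{4!}{2^4(2!)^2}(2xt - t^2)^2
= 1t^0 + xt^1 + \left(\tfrac32 x^2 - \tfrac12\right)t^2 + O(t^3).$$
Then, from Eq. 12.4 (and uniqueness of power series)
$$P_0(x) = 1, \qquad P_1(x) = x, \qquad P_2(x) = \tfrac32 x^2 - \tfrac12.$$
We repeat this limited development in a vector framework later in this section.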
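In employing a general treatment, we find that the binomial expansion of the
$(2xt - t^2)^n$ factor yields the double series
$$(1 - 2xt + t^2)^{-1/2} = \sum_{n=0}^{\infty}\frac{(2n)!}{2^{2n}(n!)^2}\,t^n\sum_{k=0}^{n}(-1)^k\frac{n!}{k!\,(n-k)!}\,(2x)^{n-k}t^k
= \sum_{n=0}^{\infty}\sum_{k=0}^{n}(-1)^k\frac{(2n)!}{2^{2n}\,n!\,k!\,(n-k)!}\,(2x)^{n-k}t^{n+k}. \tag{12.6}$$
From Eq. 5.64 of Section 5.4 (rearranging the order of summation), Eq. 12.6
becomes
$$(1 - 2xt + t^2)^{-1/2} = \sum_{n=0}^{\infty}\sum_{k=0}^{[n/2]}(-1)^k\frac{(2n-2k)!}{2^{2n-2k}\,k!\,(n-k)!\,(n-2k)!}\,(2x)^{n-2k}\,t^n, \tag{12.7}$$
with the variable $t$ independent of the index $k$.² Now, equating our two power
² $[n/2] = n/2$ for $n$ even, $(n - 1)/2$ for $n$ odd.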
series (Eqs. 12.4 and 12.7) term by term, we have³
$$P_n(x) = \sum_{k=0}^{[n/2]}(-1)^k\frac{(2n-2k)!}{2^n\,k!\,(n-k)!\,(n-2k)!}\,x^{n-2k}. \tag{12.8}$$
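The explicit sum for $P_n(x)$ can be checked against the generating function numerically. The sketch below is illustrative only: it evaluates the finite sum of Eq. 12.8 directly and compares $\sum_n P_n(x)t^n$, truncated after a fixed number of terms, with $(1 - 2xt + t^2)^{-1/2}$. The function names and the truncation length are hypothetical choices, and the direct sum loses accuracy for large $n$ because of cancellation.

```python
import math

def legendre_p(n, x):
    """P_n(x) from the explicit finite sum of Eq. 12.8."""
    total = 0.0
    for k in range(n // 2 + 1):
        coeff = math.factorial(2 * n - 2 * k) / (
            2**n * math.factorial(k) * math.factorial(n - k) * math.factorial(n - 2 * k)
        )
        total += (-1) ** k * coeff * x ** (n - 2 * k)
    return total

def generating_function_sum(x, t, n_terms=30):
    """Truncated right-hand side of Eq. 12.4: sum of P_n(x) t^n."""
    return sum(legendre_p(n, x) * t**n for n in range(n_terms))

if __name__ == "__main__":
    x, t = 0.5, 0.3
    closed_form = (1.0 - 2.0 * x * t + t * t) ** -0.5
    print(legendre_p(2, x), generating_function_sum(x, t), closed_form)
```

For $x = 0.5$ the sum gives $P_2 = \tfrac32 x^2 - \tfrac12 = -0.125$, and with $|t| < 1$ the truncated series agrees with the closed form to well within the truncation error.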
Linear Electric Multipoles
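Returning to the electric charge on the $z$-axis, we demonstrate the usefulness
and power of the generating function by adding a charge $-q$ at $z = -a$, as
shown in Fig. 12.4. The potential becomes
$$\varphi = \frac{q}{4\pi\varepsilon_0}\left(\frac{1}{r_1} - \frac{1}{r_2}\right), \tag{12.9}$$
and by using the law of cosines, we have
$$\varphi = \frac{q}{4\pi\varepsilon_0 r}\left\{\left[1 - 2\left(\frac{a}{r}\right)\cos\theta + \left(\frac{a}{r}\right)^2\right]^{-1/2} - \left[1 + 2\left(\frac{a}{r}\right)\cos\theta + \left(\frac{a}{r}\right)^2\right]^{-1/2}\right\} \qquad (r > a).$$
Clearly, the second radical is like the first, except that $a$ has been replaced by $-a$.
Then, using Eq. 12.4, we obtain
$$\varphi = \frac{q}{4\pi\varepsilon_0 r}\left[\sum_{n=0}^{\infty}P_n(\cos\theta)\left(\frac{a}{r}\right)^n - \sum_{n=0}^{\infty}P_n(\cos\theta)(-1)^n\left(\frac{a}{r}\right)^n\right]
= \frac{2q}{4\pi\varepsilon_0 r}\left[P_1(\cos\theta)\left(\frac{a}{r}\right) + P_3(\cos\theta)\left(\frac{a}{r}\right)^3 + \cdots\right]. \tag{12.10}$$
The first term (and dominant term for $r \gg a$) is
$$\varphi = \frac{2aq}{4\pi\varepsilon_0}\cdot\frac{P_1(\cos\theta)}{r^2}, \tag{12.11}$$
which is the usual electric dipole potential. Here $2aq$ is the dipole moment
(Fig. 12.4).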
that the P1 term, as well as the Po (monopole) term, is canceled. For instance,
charges of q at z = a and z — — a, — 2q at z = 0 give rise to a potential whose
series expansion starts with P2(cos 6). This is a linear electric quadrupole. Two
linear quadrupoles may be placed so that the quadrupole term is canceled, but
the P3, the octupole term, survives.
Vector Expansion
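We consider the electrostatic potential produced by a distributed charge
$\rho(\mathbf{r}_2)$:
$$\varphi(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\int\frac{\rho(\mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|}\,d\tau_2. \tag{12.12a}$$
This expression has already been encountered in Sections 1.15 and 8.7. Taking
the denominator of the integrand, using first the law of cosines and then a
binomial expansion, yields
$$\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} = \left(r_1^2 - 2\mathbf{r}_1\cdot\mathbf{r}_2 + r_2^2\right)^{-1/2}
= \frac{1}{r_1}\left[1 + \left(-\frac{2\mathbf{r}_1\cdot\mathbf{r}_2}{r_1^2} + \frac{r_2^2}{r_1^2}\right)\right]^{-1/2} \qquad \text{for } r_1 > r_2$$
$$= \frac{1}{r_1}\left[1 + \frac{\mathbf{r}_1\cdot\mathbf{r}_2}{r_1^2} - \frac12\frac{r_2^2}{r_1^2} + \frac32\frac{(\mathbf{r}_1\cdot\mathbf{r}_2)^2}{r_1^4} + O\!\left(\frac{r_2}{r_1}\right)^3\right]. \tag{12.12b}$$
(For $r_1 = 1$, $r_2 = t$, and $\mathbf{r}_1\cdot\mathbf{r}_2 = xt$, Eq. 12.12b reduces to the generating
function, Eq. 12.4.)
FIG. 12.4 Electric dipole
³ Equation 12.8 starts with $x^n$. By changing the index, we can transform it
into a series that starts with $x^0$ for $n$ even and $x^1$ for $n$ odd. These ascending
series are given as hypergeometric functions in Eqs. 13.104 and 13.105,
Section 13.5.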
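The first term in the square bracket, 1, yields a potential
$$\varphi_0(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\,\frac{1}{r_1}\int\rho(\mathbf{r}_2)\,d\tau_2. \tag{12.12c}$$
The integral is just the total charge. This part of the total potential is an electric
monopole.
The second term yields
$$\varphi_1(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\,\frac{\mathbf{r}_1}{r_1^3}\cdot\int\mathbf{r}_2\,\rho(\mathbf{r}_2)\,d\tau_2. \tag{12.12d}$$
Here the charge $\rho(\mathbf{r}_2)$ is weighted by a moment arm $\mathbf{r}_2$. We have an electric
dipole potential. For atomic or nuclear states of definite parity $\rho(\mathbf{r}_2)$ is an even
function and the dipole integral is identically zero.
The last two terms, both of order $(r_2/r_1)^2$, may be handled by using cartesian
coordinates
$$(\mathbf{r}_1\cdot\mathbf{r}_2)^2 = \sum_{i=1}^{3}x_{1i}x_{2i}\sum_{j=1}^{3}x_{1j}x_{2j}.$$
Rearranging variables to keep the $x_2$'s inside the integral yields
$$\varphi_2(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\,\frac{1}{2r_1^5}\sum_{i,j=1}^{3}x_{1i}x_{1j}\int\left[3x_{2i}x_{2j} - \delta_{ij}r_2^2\right]\rho(\mathbf{r}_2)\,d\tau_2. \tag{12.12e}$$
This is the electric quadrupole term. We note that the square bracket in the
integrand forms a symmetric, zero trace tensor.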
A general electrostatic multipole expansion can also be developed by using
Eq. 12.12a for the potential $\varphi(\mathbf{r}_1)$ and replacing $1/(4\pi|\mathbf{r}_1 - \mathbf{r}_2|)$ by Green's
function, Eq. 16.169. This yields the potential $\varphi(\mathbf{r}_1)$ as a (double) series of the spherical
harmonics $Y_l^m(\theta_1, \varphi_1)$ and $Y_l^m(\theta_2, \varphi_2)$.
Before leaving multipole fields, perhaps we should emphasize three points.
First, an electric (or magnetic) multipole has an absolute significance only if all
lower-order terms vanish. For instance, the potential of one charge $q$ at $z = a$
was expanded in a series of Legendre polynomials. Although we may refer to
the $P_1(\cos\theta)$ term in this expansion as a dipole term, it should be remembered
that this term exists only because of our choice of coordinates. We actually have
a monopole, $P_0(\cos\theta)$.
Second, in physical systems we do not encounter pure multipoles. As an
example, the potential of the finite dipole ($q$ at $z = a$, $-q$ at $z = -a$) contained
a $P_3(\cos\theta)$ term. These higher-order terms may be eliminated by shrinking the
multipole to a point multipole, in this case keeping the product $qa$ constant
($a \to 0$, $q \to \infty$) to maintain the same dipole moment.
Third, the multipole theory is not restricted to electrical phenomena. Plane-
Planetary configurations are described in terms of mass multipoles, Sections 12.3 and
12.5. Gravitational radiation depends on the time behavior of mass quadrupoles.
(The gravitational radiation field is a tensor field. The radiation units, gravitons,
carry two units of angular momentum.)
It might also be noted that a multipole expansion is actually a decomposition
into the irreducible representations of the rotation group (Section 4.10).
Extension to Ultraspherical Polynomials
The generating function, g(t, x), used here is actually a special case of a more
general generating function,
\[
\frac{1}{(1 - 2xt + t^2)^{\alpha}} = \sum_{n=0}^{\infty} C_n^{(\alpha)}(x)\,t^n. \tag{12.13}
\]
The coefficients $C_n^{(\alpha)}(x)$ are the ultraspherical polynomials (proportional to the
Gegenbauer polynomials). For $\alpha = 1/2$ this equation reduces to Eq. 12.4; that
is, $C_n^{(1/2)}(x) = P_n(x)$. The cases $\alpha = 0$ and $\alpha = 1$ are considered in Chapter 13 in
connection with the Chebyshev polynomials.
EXERCISES
12.1.1
Develop the electrostatic potential for the array of charges shown. This is a
linear electric quadrupole (Fig. 12.5).
[Figure: charges on the z-axis; surviving labels: q, 2q, z = -a, a]
FIG. 12.5 Linear electric quadrupole
12.1.2 Calculate the electrostatic potential of the array of charges shown (Fig. 12.6).
Here is an example of two equal but oppositely directed dipoles. The dipole
contributions cancel. The octupole terms do not cancel.
[Figure: charges on the z-axis at z = -2a, -a, a, 2a; surviving labels: +2q, -2q, q]
FIG. 12.6 Linear electric octupole
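12.1.3 Show that the electrostatic potential produced by a charge q at z = a for r < a is
\[
\varphi(\mathbf{r}) = \frac{q}{4\pi\varepsilon_0 a}\sum_{n=0}^{\infty}\left(\frac{r}{a}\right)^n P_n(\cos\theta).
\]
12.1.4 Using $\mathbf{E} = -\nabla\varphi$, determine the components of the electric field corresponding
to the (pure) electric dipole potential
\[
\varphi(\mathbf{r}) = \frac{2aq\cos\theta}{4\pi\varepsilon_0 r^2}.
\]
Here it is assumed that $r \gg a$.
ANS. $E_r = +\dfrac{4aq\cos\theta}{4\pi\varepsilon_0 r^3}$, $\quad E_\theta = +\dfrac{2aq\sin\theta}{4\pi\varepsilon_0 r^3}$.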
12.1.5 A point electric dipole of strength $p^{(1)}$ is placed at z = a; a second point electric
dipole of equal but opposite strength is at the origin. Keeping the product $p^{(1)}a$
constant, let $a \to 0$. Show that this results in a point electric quadrupole.
Hint. Exercise 12.2.5 (when proved) will be helpful.
12.1.6 A point charge q is in the interior of a hollow conducting sphere of radius $r_0$.
The charge q is displaced a distance a from the center of the sphere. If the
conducting sphere is grounded, show that the potential in the interior produced by
q and the distributed induced charge is the same as that produced by q and its
image charge q'. The image charge is at a distance $a' = r_0^2/a$ from the center,
collinear with q and the origin (Fig. 12.7).
Hint. Calculate the electrostatic potential for $a < r_0 < a'$. Show that the potential
vanishes for $r = r_0$ if we take $q' = -q r_0/a$.
FIG. 12.7
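12.1.7 Prove that
\[
P_n(\cos\theta) = (-1)^n\,\frac{r^{n+1}}{n!}\,\frac{\partial^n}{\partial z^n}\left(\frac{1}{r}\right).
\]
Hint. Compare the Legendre polynomial expansion of the generating function
($a \to \Delta z$, Fig. 12.2) with a Taylor series expansion of 1/r, where z dependence of
r changes from z to $z - \Delta z$ (Fig. 12.8).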
FIG. 12.8
12.1.8 By differentiation and direct substitution of the series form, Eq. 12.8, show that
$P_n(x)$ satisfies the Legendre differential equation. Note that there is no restriction
upon x. We may have any x, $-\infty < x < \infty$, and indeed any z in the entire finite
complex plane.
12.1.9 The Chebyshev polynomials (type II) are generated by (Eq. 13.62, Section 13.3)
\[
\frac{1}{1 - 2xt + t^2} = \sum_{n=0}^{\infty} U_n(x)\,t^n.
\]
Using the techniques of Section 5.4 for transforming series, develop a series
representation of $U_n(x)$.
12.2 RECURRENCE RELATIONS AND SPECIAL
PROPERTIES
Recurrence Relations
TABLE 12.1 Legendre Polynomials
$P_0(x) = 1$
$P_1(x) = x$
$P_2(x) = \frac{1}{2}(3x^2 - 1)$
$P_3(x) = \frac{1}{2}(5x^3 - 3x)$
$P_4(x) = \frac{1}{8}(35x^4 - 30x^2 + 3)$
$P_5(x) = \frac{1}{8}(63x^5 - 70x^3 + 15x)$
$P_6(x) = \frac{1}{16}(231x^6 - 315x^4 + 105x^2 - 5)$
$P_7(x) = \frac{1}{16}(429x^7 - 693x^5 + 315x^3 - 35x)$
$P_8(x) = \frac{1}{128}(6435x^8 - 12012x^6 + 6930x^4 - 1260x^2 + 35)$

The Legendre polynomial generating function provides a convenient way of
deriving the recurrence relations1 and some special properties. If our generating
function (Eq. 12.4) is differentiated with respect to t, we obtain
\[
\frac{\partial g(t,x)}{\partial t} = \frac{x - t}{(1 - 2xt + t^2)^{3/2}} = \sum_{n=0}^{\infty} nP_n(x)\,t^{n-1}. \tag{12.14}
\]
1 We can also apply the explicit series form (Eq. 12.8) directly.
By substituting Eq. 12.4 into this and rearranging terms, we have
\[
(1 - 2xt + t^2)\sum_{n=0}^{\infty} nP_n(x)\,t^{n-1} + (t - x)\sum_{n=0}^{\infty} P_n(x)\,t^n = 0. \tag{12.15}
\]
The left-hand side is a power series in t. Since this power series vanishes for all
values of t, we may put the coefficient of each power of t equal to zero, that is,
our power series is unique (Section 5.7). This may be done easily by separating
the individual summations and using distinctive summation indices,
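\[
\sum_{m=0}^{\infty} mP_m(x)\,t^{m-1} - \sum_{n=0}^{\infty} 2nxP_n(x)\,t^n + \sum_{s=0}^{\infty} sP_s(x)\,t^{s+1}
+ \sum_{s=0}^{\infty} P_s(x)\,t^{s+1} - \sum_{n=0}^{\infty} xP_n(x)\,t^n = 0. \tag{12.16}
\]
Now letting m = n + 1, s = n - 1, we find
\[
(2n + 1)xP_n(x) = (n + 1)P_{n+1}(x) + nP_{n-1}(x), \quad n = 1, 2, 3, \ldots. \tag{12.17}
\]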
This is another three-term recurrence relation similar to (but not identical to) the
recurrence relation for Bessel functions. With this recurrence relation we may
easily construct the higher Legendre polynomials. If we take n = 1 and insert the
easily found values of $P_0(x)$ and $P_1(x)$ (Exercise 12.1.7 or Eq. 12.8), we obtain
\[
3xP_1(x) = 2P_2(x) + P_0(x) \tag{12.18}
\]
or
\[
P_2(x) = \tfrac{1}{2}(3x^2 - 1). \tag{12.19}
\]
This process may be continued indefinitely. The first few Legendre polynomials
are listed in Table 12.1.
Cumbersome as it may appear at first, this technique is actually more efficient
for a large digital computer than is direct evaluation of the series (Eq. 12.8). For
greater stability (to avoid undue accumulation and magnification of round off
error), Eq. 12.17 is rewritten as
\[
P_{n+1}(x) = 2xP_n(x) - P_{n-1}(x) - [xP_n(x) - P_{n-1}(x)]/(n + 1). \tag{12.17a}
\]
One starts with $P_0(x) = 1$, $P_1(x) = x$, and computes the numerical values of all
the $P_n(x)$ for a given value of x up to the desired $P_N(x)$. The values of $P_n(x)$,
$0 \le n < N$, are available as a fringe benefit.
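The upward recurrence just described can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than a tuned routine: the function name `legendre_all` and the sample point x = 0.5 are choices made here.

```python
def legendre_all(n_max, x):
    """Return [P_0(x), ..., P_n_max(x)] from the stabilized recurrence, Eq. 12.17a."""
    values = [1.0, x][: n_max + 1]          # starting values P_0 = 1, P_1 = x
    for n in range(1, n_max):
        p_n, p_prev = values[n], values[n - 1]
        # P_{n+1} = 2x P_n - P_{n-1} - [x P_n - P_{n-1}] / (n + 1)
        values.append(2.0 * x * p_n - p_prev - (x * p_n - p_prev) / (n + 1))
    return values


values = legendre_all(4, 0.5)
print(values)
```

All lower-order values come back in the same list, which is the fringe benefit noted above.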
Differential Equations
More information about the behavior of the Legendre polynomials can be
obtained if we now differentiate Eq. 12.4 with respect to x. This gives
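\[
\frac{\partial g(t,x)}{\partial x} = \frac{t}{(1 - 2xt + t^2)^{3/2}} = \sum_{n=0}^{\infty} P_n'(x)\,t^n \tag{12.20}
\]
or
\[
(1 - 2xt + t^2)\sum_{n=0}^{\infty} P_n'(x)\,t^n - t\sum_{n=0}^{\infty} P_n(x)\,t^n = 0. \tag{12.21}
\]
As before, the coefficient of each power of t is set equal to zero and we obtain
\[
P_{n+1}'(x) + P_{n-1}'(x) = 2xP_n'(x) + P_n(x). \tag{12.22}
\]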
A more useful relation may be found by differentiating Eq. 12.17 with respect
to x and multiplying by 2. To this we add (2n + 1) times Eq. 12.22, canceling the
$P_n'$ term. The result is
\[
P_{n+1}'(x) - P_{n-1}'(x) = (2n + 1)P_n(x). \tag{12.23}
\]
From Eqs. 12.22 and 12.23 numerous additional equations may be
developed,2 including
\[
P_{n+1}'(x) = (n + 1)P_n(x) + xP_n'(x), \tag{12.24}
\]
\[
P_{n-1}'(x) = -nP_n(x) + xP_n'(x), \tag{12.25}
\]
\[
(1 - x^2)P_n'(x) = nP_{n-1}(x) - nxP_n(x), \tag{12.26}
\]
\[
(1 - x^2)P_n'(x) = (n + 1)xP_n(x) - (n + 1)P_{n+1}(x). \tag{12.27}
\]
2 Using the equation number in parentheses to denote the entire equation,
we may write the derivations as
$2\cdot\frac{d}{dx}(12.17) + (2n + 1)\cdot(12.22) \Rightarrow (12.23)$
$\frac{1}{2}\{(12.22) + (12.23)\} \Rightarrow (12.24)$
$\frac{1}{2}\{(12.22) - (12.23)\} \Rightarrow (12.25)$
$(12.24)_{n \to n-1} + x\cdot(12.25) \Rightarrow (12.26)$
$\frac{d}{dx}(12.26) + n\cdot(12.25) \Rightarrow (12.28)$
By differentiating Eq. 12.26 and using Eq. 12.25 to eliminate $P_{n-1}'(x)$, we find that
$P_n(x)$ satisfies the linear, second-order differential equation
\[
(1 - x^2)P_n''(x) - 2xP_n'(x) + n(n + 1)P_n(x) = 0. \tag{12.28}
\]
The previous equations, Eqs. 12.22 to 12.27, are all first-order differential
equations, but with polynomials of two different indices. The price for having all
indices alike is a second-order differential equation. Equation 12.28 is Legendre's
differential equation. We now see that the polynomials $P_n(x)$ generated by the
expansion of $(1 - 2xt + t^2)^{-1/2}$ satisfy Legendre's equation which, of course, is
why they are called Legendre polynomials.
In Eq. 12.28 differentiation is with respect to x ($x = \cos\theta$). Frequently, we
encounter Legendre's equation expressed in terms of differentiation with respect
to $\theta$,
\[
\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{dP_n(\cos\theta)}{d\theta}\right) + n(n + 1)P_n(\cos\theta) = 0. \tag{12.29}
\]
Special Values
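Our generating function provides still more information about the Legendre
polynomials. If we set x = 1, Eq. 12.4 becomes
\[
\frac{1}{(1 - 2t + t^2)^{1/2}} = \frac{1}{1 - t} = \sum_{n=0}^{\infty} t^n, \tag{12.30}
\]
using a binomial expansion. But
\[
\left.\frac{1}{(1 - 2xt + t^2)^{1/2}}\right|_{x=1} = \sum_{n=0}^{\infty} P_n(1)\,t^n.
\]
Comparing the two series expansions (uniqueness of power series, Section 5.7),
we have
\[
P_n(1) = 1. \tag{12.31}
\]
If we let x = -1, the same sort of analysis shows that
\[
P_n(-1) = (-1)^n. \tag{12.32}
\]
For obtaining these results, we find that the generating function is more
convenient than the explicit series form.
If we take x = 0, using the binomial expansion
\[
(1 + t^2)^{-1/2} = 1 - \frac{1}{2}t^2 + \frac{3}{8}t^4 + \cdots + (-1)^n\frac{1\cdot 3\cdots(2n - 1)}{2^n\,n!}\,t^{2n} + \cdots, \tag{12.33}
\]
we have3
\[
P_{2n}(0) = (-1)^n\frac{1\cdot 3\cdots(2n - 1)}{2^n\,n!} = (-1)^n\frac{(2n - 1)!!}{(2n)!!}, \tag{12.34}
\]
\[
P_{2n+1}(0) = 0, \quad n = 0, 1, 2, \ldots. \tag{12.35}
\]
These results also follow from Eq. 12.8 by inspection.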
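The special values can be spot-checked numerically. The sketch below evaluates $P_n$ with the three-term recurrence (Eq. 12.17) and compares against $P_n(1) = 1$, $P_n(-1) = (-1)^n$, and the double-factorial form $(-1)^n(2n-1)!!/(2n)!!$ for $P_{2n}(0)$. The helper names `legendre` and `double_factorial` are assumptions introduced here.

```python
def legendre(n, x):
    """P_n(x) from (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}  (Eq. 12.17)."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def double_factorial(m):
    """m!! = m (m-2) (m-4) ..., with the convention (-1)!! = 0!! = 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


for n in range(6):
    expected_at_zero = (-1) ** n * double_factorial(2 * n - 1) / double_factorial(2 * n)
    print(n, legendre(n, 1.0), legendre(n, -1.0), legendre(2 * n, 0.0), expected_at_zero)
```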
Parity
Some of these results are special cases of the parity property of the Legendre
polynomials. We refer once more to Eq. 12.4. If we replace x by -x and t by -t,
the generating function is unchanged. Hence
\[
g(t, x) = g(-t, -x) = [1 - 2(-t)(-x) + (-t)^2]^{-1/2}
= \sum_{n=0}^{\infty} P_n(-x)(-t)^n = \sum_{n=0}^{\infty} P_n(x)\,t^n. \tag{12.36}
\]
Comparing these two series, we have
\[
P_n(-x) = (-1)^n P_n(x); \tag{12.37}
\]
that is, the polynomial functions are odd or even (with respect to x = 0, $\theta = \pi/2$)
according to whether the index n is odd or even. This is the parity4 or reflection
property that plays such an important role in quantum mechanics. For central
forces the index n is a measure of the orbital angular momentum, thus linking
parity and orbital angular momentum.
The reader will see this parity property confirmed by the series solution and
for the special values tabulated in Table 12.1. It might also be noted that Eq.
12.37 may be predicted by inspection of Eq. 12.17, the recurrence relation.
Specifically, if $P_{n-1}(x)$ and $xP_n(x)$ are even, then $P_{n+1}(x)$ must be even.
3 The double factorial notation is defined in Section 10.1.
$(2n - 1)!! = 1\cdot 3\cdot 5\cdots(2n - 1)$.
4 In spherical polar coordinates the inversion of the point $(r, \theta, \varphi)$ through
the origin is accomplished by the transformation [$r \to r$, $\theta \to \pi - \theta$, and
$\varphi \to \varphi \pm \pi$]. Then, $\cos\theta \to \cos(\pi - \theta) = -\cos\theta$, corresponding to $x \to -x$
(compare Exercise 2.5.8).
Upper and Lower Bounds for $P_n(\cos\theta)$
Finally, in addition to these results, our generating function enables us to set
an upper limit on $|P_n(\cos\theta)|$. We have
\[
(1 - 2t\cos\theta + t^2)^{-1/2} = (1 - te^{i\theta})^{-1/2}(1 - te^{-i\theta})^{-1/2}
\]
\[
= \left(1 + \tfrac{1}{2}te^{i\theta} + \tfrac{3}{8}t^2e^{2i\theta} + \cdots\right)\left(1 + \tfrac{1}{2}te^{-i\theta} + \tfrac{3}{8}t^2e^{-2i\theta} + \cdots\right), \tag{12.38}
\]
with all coefficients positive. Our Legendre polynomial, $P_n(\cos\theta)$, still the
coefficient of $t^n$, may now be written as a sum of terms of the form
\[
a_m(e^{im\theta} + e^{-im\theta})/2 = a_m\cos m\theta \tag{12.39a}
\]
with all the $a_m$ positive. Then
\[
P_n(\cos\theta) = \sum_{m=0\ \text{or}\ 1}^{n} a_m\cos m\theta. \tag{12.39b}
\]
This series (Eq. 12.39b) is clearly a maximum when $\theta = 0$ and $\cos m\theta = 1$. But
for $x = \cos\theta = 1$, Eq. 12.31 shows that $P_n(1) = 1$. Therefore
\[
|P_n(\cos\theta)| \le P_n(1) = 1. \tag{12.39c}
\]

TABLE 12.2 Comparison of Generating Function plus Recurrence Relations and Series Expansion, Eq. 12.8

Application                                    Generating function, recurrence relations     Series
                                               (Eqs. 12.4, 12.17, and 12.22)                 (Eq. 12.8)
Table 12.1                                                                                   More direct
Numerical value                                Computer choice
Derivation of differential equation, Eq. 12.27 Moderately involved                           Verification easy, derivation requires clairvoyance
$P_n(1)$, Eq. 12.30                            Easy                                          Awkward
$P_n(0)$, Eq. 12.34                            Easy                                          By inspection
Parity, Eq. 12.36                              Easy                                          By inspection
Bounds, Eq. 12.38                              Fairly easy                                   Awkward
A fringe benefit of Eq. 12.39b is that it shows that our Legendre polynomial
is a linear combination of $\cos m\theta$. This means that the Legendre polynomials
form a complete set for any functions that may be expanded by a Fourier cosine
series (Section 14.1) over the interval $[0, \pi]$.
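The cosine expansion can be made concrete by multiplying out the two binomial series of Eq. 12.38. The sketch below assumes the standard closed form for the binomial coefficients of $(1 - u)^{-1/2}$, namely $(2j)!/(2^{2j}(j!)^2)$ (the text shows only the first terms, 1/2 and 3/8), collects the coefficient of $t^n$, and compares it with $P_n(\cos\theta)$ from the recurrence. The function names are illustrative assumptions.

```python
import math


def binom_coeff(j):
    """Coefficient of u**j in (1 - u)**(-1/2): (2j)! / (2**(2j) * (j!)**2)."""
    return math.comb(2 * j, j) / 4 ** j


def legendre_cos_sum(n, theta):
    """Coefficient of t**n in the product of Eq. 12.38, written with cosines."""
    return sum(
        binom_coeff(j) * binom_coeff(n - j) * math.cos((n - 2 * j) * theta)
        for j in range(n + 1)
    )


def legendre(n, x):
    """P_n(x) from the three-term recurrence, Eq. 12.17."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


for n in range(5):
    theta = 0.7
    print(n, legendre_cos_sum(n, theta), legendre(n, math.cos(theta)))
```

Because every product `binom_coeff(j) * binom_coeff(n - j)` is positive, the sum is largest at $\theta = 0$, which is the content of the bound in Eq. 12.39c.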
In this section various useful properties of the Legendre polynomials are
derived from the generating function, Eq. 12.4. The explicit series
representation, Eq. 12.8, offers an alternate and sometimes superior approach. Table 12.2
offers a comparison of the two approaches.
EXERCISES
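12.2.1 Given the series
\[
\alpha_0 + \alpha_2\cos^2\theta + \alpha_4\cos^4\theta + \alpha_6\cos^6\theta = a_0P_0 + a_2P_2 + a_4P_4 + a_6P_6.
\]
Express the coefficients $\alpha_i$ as a column vector $\boldsymbol{\alpha}$ and the coefficients $a_i$ as a
column vector $\mathbf{a}$ and determine the matrices A and B such that
$A\boldsymbol{\alpha} = \mathbf{a}$ and $B\mathbf{a} = \boldsymbol{\alpha}$.
Check your computation by showing that AB = 1 (unit matrix). Repeat for
the odd case
\[
\alpha_1\cos\theta + \alpha_3\cos^3\theta + \alpha_5\cos^5\theta + \alpha_7\cos^7\theta = a_1P_1 + a_3P_3 + a_5P_5 + a_7P_7.
\]
Note. $P_n(\cos\theta)$ and $\cos^n\theta$ are tabulated in terms of each other in AMS-55.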
12.2.2 By differentiating the generating function, g(t, x), with respect to t, multiplying
by 2t, and then adding g(t, x), show that
\[
\frac{1 - t^2}{(1 - 2tx + t^2)^{3/2}} = \sum_{n=0}^{\infty}(2n + 1)P_n(x)\,t^n.
\]
This result is useful in calculating the charge induced on a grounded metal
sphere by a point charge q.
12.2.3 (a) Derive Eq. 12.27
\[
(1 - x^2)P_n'(x) = (n + 1)xP_n(x) - (n + 1)P_{n+1}(x).
\]
(b) Write out the relation of Eq. 12.27 to preceding equations in symbolic
form analogous to the symbolic forms for Eqs. 12.23 to 12.26.
12.2.4 A point electric octupole may be constructed by placing a point electric
quadrupole (pole strength $p^{(2)}$ in the z-direction) at z = a and an equal but opposite
point electric quadrupole at z = 0 and then letting $a \to 0$, subject to $p^{(2)}a$ =
constant. Find the electrostatic potential corresponding to a point electric
octupole. Show from the construction of the point electric octupole that the
corresponding potential may be obtained by differentiating the point
quadrupole potential.
12.2.5 Operating in spherical polar coordinates, show that
\[
\frac{\partial}{\partial z}\left[\frac{P_n(\cos\theta)}{r^{n+1}}\right] = -(n + 1)\frac{P_{n+1}(\cos\theta)}{r^{n+2}}.
\]
This is the key step in the mathematical argument that the derivative of one
multipole leads to the next higher multipole.
Hint. Compare Exercise 2.5.12.
12.2.6 From
\[
P_L(\cos\theta) = \left.\frac{1}{L!}\frac{\partial^L}{\partial t^L}(1 - 2t\cos\theta + t^2)^{-1/2}\right|_{t=0}
\]
show that
12.2.7 Prove that
12.2.8 Show that $P_n(\cos\theta) = (-1)^n P_n(-\cos\theta)$ by use of the recurrence relation relating
$P_n$, $P_{n+1}$, and $P_{n-1}$ and your knowledge of $P_0$ and $P_1$.
12.2.9 From Eq. 12.38 write out the coefficient of $t^2$ in terms of $\cos n\theta$, $n \le 2$. This
coefficient is $P_2(\cos\theta)$.
12.2.10 Write a program that will generate the coefficients $a_s$ in the polynomial form
of the Legendre polynomial,
\[
P_n(x) = \sum_{s=0}^{n} a_s x^s.
\]
12.2.11 (a) Calculate $P_{10}(x)$ over the range [0, 1] and plot your results.
(b) Calculate precise (at least to five decimal places) values of the five positive
roots of $P_{10}(x)$. Compare your values with the values listed in AMS-55
(Table 25.4).
Hint. See Appendix 1 for root-finding techniques.
12.2.12 (a) Calculate the largest root of $P_n(x)$ for n = 2(1)50.
(b) Develop an approximation for the largest root from the hypergeometric
representation of $P_n(x)$ (Section 13.4) and compare your values from part
(a) with your hypergeometric approximation. Compare also with the
values listed in AMS-55 (Table 25.4).
12.2.13 (a) From Exercise 12.2.1 and AMS-55 (Table 22.9) develop the 6 x 6 matrix
B that will transform a series of even order Legendre polynomials through
$P_{10}(x)$ into a power series $\sum_{n=0}^{5}\alpha_{2n}x^{2n}$.
(b) Calculate A as $B^{-1}$. Check the elements of A against the values listed in
AMS-55 (Table 22.9).
(c) By using matrix multiplication, transform some even power series
$\sum_{n=0}^{5}\alpha_{2n}x^{2n}$ into a Legendre series.
12.2.14 Write a subroutine that will transform a finite power series $\sum_{n=0}^{N} a_n x^n$ into a
Legendre series $\sum_{n=0}^{N} b_n P_n(x)$. Use the recurrence relation Eq. 12.17 and follow
the technique outlined in Section 13.3 for a Chebyshev series.
12.3 ORTHOGONALITY
Legendre's differential equation (12.28) may be written in the form
\[
\frac{d}{dx}\left[(1 - x^2)P_n'(x)\right] + n(n + 1)P_n(x) = 0, \tag{12.40}
\]
showing clearly that it is self-adjoint. Subject to satisfying certain boundary
conditions, then, it is known that the solutions $P_n(x)$ will be orthogonal.
Repeating the Sturm-Liouville analysis (Section 9.2), we multiply Eq. 12.40 by
$P_m(x)$ and subtract the corresponding equation with m and n interchanged.
Integrating from -1 to +1, we get
\[
\int_{-1}^{1}\left\{P_m\frac{d}{dx}\left[(1 - x^2)P_n'\right] - P_n\frac{d}{dx}\left[(1 - x^2)P_m'\right]\right\}dx
= [m(m + 1) - n(n + 1)]\int_{-1}^{1} P_n(x)P_m(x)\,dx. \tag{12.41}
\]
Integrating by parts, the integrated part vanishing because of the factor $(1 - x^2)$,1
we have
\[
[m(m + 1) - n(n + 1)]\int_{-1}^{1} P_n(x)P_m(x)\,dx = 0. \tag{12.42}
\]
Then for $m \ne n$
\[
\int_{-1}^{1} P_n(x)P_m(x)\,dx = 0,^2 \qquad
\int_{0}^{\pi} P_n(\cos\theta)P_m(\cos\theta)\sin\theta\,d\theta = 0, \tag{12.43}
\]
showing that $P_n(x)$ and $P_m(x)$ are orthogonal for the interval [-1, 1]. This
orthogonality may also be demonstrated quite readily by using Rodrigues'
definition of $P_n(x)$ (compare Section 12.4, Exercise 12.4.2).
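We shall need to evaluate the integral (Eq. 12.42) when n = m. Certainly
it is no longer zero. From our generating function
\[
(1 - 2tx + t^2)^{-1} = \left[\sum_{n=0}^{\infty} P_n(x)\,t^n\right]^2. \tag{12.44}
\]
Integrating from x = -1 to x = +1, we have
\[
\int_{-1}^{1}\frac{dx}{1 - 2tx + t^2} = \sum_{n=0}^{\infty} t^{2n}\int_{-1}^{1}[P_n(x)]^2\,dx; \tag{12.45}
\]
the cross terms in the series vanish by means of Eq. 12.43. Using $y = 1 - 2tx + t^2$,
we obtain
\[
\int_{-1}^{1}\frac{dx}{1 - 2tx + t^2} = \frac{1}{2t}\int_{(1-t)^2}^{(1+t)^2}\frac{dy}{y} = \frac{1}{t}\ln\left(\frac{1 + t}{1 - t}\right). \tag{12.46}
\]
Expanding this in a power series (Exercise 5.4.1) gives us
\[
\frac{1}{t}\ln\left(\frac{1 + t}{1 - t}\right) = 2\sum_{n=0}^{\infty}\frac{t^{2n}}{2n + 1}. \tag{12.47}
\]
Since our power-series representation is known to be unique, we must have
\[
\int_{-1}^{1}[P_n(x)]^2\,dx = \frac{2}{2n + 1}. \tag{12.48}
\]
1 This of course is why the limits were chosen as -1 and +1.
2 In Section 9.4 such integrals are interpreted as inner products in a linear
vector (function) space. Alternate notations are
\[
\int_{-1}^{1} P_n(x)P_m(x)\,dx = \langle P_n(x)|P_m(x)\rangle = (P_n(x), P_m(x)).
\]
The $\langle\ \rangle$ form, popularized by Dirac, is common in physics literature. The
( ) form is more common in the mathematics literature.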
We shall return to this result in Section 12.6 when we construct the orthonormal
spherical harmonics.
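The orthogonality integral and the normalization 2/(2n + 1) can be confirmed exactly with rational arithmetic. The sketch below builds the polynomial coefficients of $P_n(x)$ from the recurrence relation, Eq. 12.17, and integrates products term by term over [-1, 1]. The function names `legendre_coeffs` and `inner_product` are assumptions made here for illustration.

```python
from fractions import Fraction


def legendre_coeffs(n):
    """Coefficients [a_0, ..., a_n] of P_n(x) = sum a_s x**s, built from Eq. 12.17."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for k in range(1, n):
        # (k + 1) P_{k+1} = (2k + 1) x P_k - k P_{k-1}
        x_p = [Fraction(0)] + p                                   # multiply P_k by x
        padded_prev = p_prev + [Fraction(0)] * (len(x_p) - len(p_prev))
        p_next = [((2 * k + 1) * a - k * b) / (k + 1) for a, b in zip(x_p, padded_prev)]
        p_prev, p = p, p_next
    return p


def inner_product(m, n):
    """Exact integral of P_m(x) P_n(x) over [-1, 1]."""
    pm, pn = legendre_coeffs(m), legendre_coeffs(n)
    total = Fraction(0)
    for i, a in enumerate(pm):
        for j, b in enumerate(pn):
            if (i + j) % 2 == 0:                                  # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total


print(legendre_coeffs(4))
print(inner_product(3, 5), inner_product(4, 4))
```

Using `Fraction` avoids any round-off question, so the off-diagonal integrals come out as exact zeros rather than small floating-point residues.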
Expansion of Functions, Legendre Series
In addition to orthogonality, the Sturm-Liouville theory shows that the
Legendre polynomials form a complete set. Let us assume, then, that the series
\[
\sum_{n=0}^{\infty} a_n P_n(x) = f(x), \tag{12.49}
\]
in the sense of convergence in the mean (Section 9.4) in the interval [-1, 1].
This demands that f(x) and f'(x) be at least sectionally continuous in this
interval. The coefficients $a_n$ are found by multiplying the series by $P_m(x)$ and
integrating term by term. Using the orthogonality property expressed in Eqs.
12.43 and 12.48, we obtain
\[
\frac{2}{2m + 1}\,a_m = \int_{-1}^{1} f(x)P_m(x)\,dx. \tag{12.50}
\]
We replace the variable of integration x by t and the index m by n. Then,
substituting into Eq. 12.49, we have
\[
f(x) = \sum_{n=0}^{\infty}\frac{2n + 1}{2}\left(\int_{-1}^{1} f(t)P_n(t)\,dt\right)P_n(x). \tag{12.51}
\]
This expansion in a series of Legendre polynomials is usually referred to as a
Legendre series.3 Its properties are quite similar to the more familiar Fourier
series (Chapter 14). In particular, we can use the orthogonality property
(Eq. 12.43) to show that the series is unique.
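The coefficient formula can be tried out numerically. The sketch below is one possible arrangement, not a prescribed method: it evaluates the integral in Eq. 12.50 by a composite Simpson rule (an assumption made here; any quadrature would serve) for the sample function f(x) = 1 - |x|, splitting the interval at x = 0 so that the kink falls on a panel boundary. The names `simpson` and `legendre_coefficient` are illustrative.

```python
def legendre(n, x):
    """P_n(x) from the three-term recurrence, Eq. 12.17."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def simpson(func, a, b, panels=200):
    """Composite Simpson rule; `panels` must be even."""
    h = (b - a) / panels
    total = func(a) + func(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def legendre_coefficient(f, m):
    """a_m = (2m + 1)/2 * integral over [-1, 1] of f(x) P_m(x) dx  (Eq. 12.50)."""
    integrand = lambda x: f(x) * legendre(m, x)
    # Split at x = 0 so a kink there sits on a panel boundary.
    integral = simpson(integrand, -1.0, 0.0) + simpson(integrand, 0.0, 1.0)
    return 0.5 * (2 * m + 1) * integral


f = lambda x: 1.0 - abs(x)
coeffs = [legendre_coefficient(f, m) for m in range(7)]
print(coeffs)
```

Since this f(x) is even, the odd-index coefficients vanish to within round-off, which is a quick sanity check on the quadrature.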
On a more abstract (and more powerful) level, Eq. 12.51 gives the
representation of f(x) in the linear vector space of Legendre polynomials (a Hilbert
space, Section 9.4).
From the viewpoint of integral transforms (Chapter 15) Eq. 12.50 may be
considered a finite Legendre transform of f(x). Equation 12.51 is then the
inverse transform. It may also be interpreted in terms of the projection operators
of quantum theory. We may take
\[
\mathcal{P}_m \equiv P_m(x)\,\frac{2m + 1}{2}\int_{-1}^{1} P_m(t)[\ \ ]\,dt
\]
as an (integral) operator, ready to operate on f(t). [The f(t) would go in the
square bracket as a factor in the integrand.] Then, from Eq. 12.50
\[
\mathcal{P}_m f = a_m P_m(x).^4
\]
The operator $\mathcal{P}_m$ projects out the mth component of the function f.
3 Note that Eq. 12.50 gives $a_m$ as a definite integral, that is, a number for a
given f(x).
4 The dependent variables are arbitrary. Here x came from the x in $\mathcal{P}_m$ while
t is a dummy variable of integration.
Equation 12.3, which leads directly to the generating function definition of
Legendre polynomials, is a Legendre expansion of $1/r_1$. This Legendre expansion
of $1/r_1$ or $1/r_{12}$ appears in several exercises of Section 12.8. Going beyond a
simple Coulomb field, the $1/r_{12}$ is often replaced by a potential $V(|\mathbf{r}_1 - \mathbf{r}_2|)$
and the solution of the problem is again effected by a Legendre expansion.
In nuclear physics calculations the coefficients $a_n$ may be computed (by a
computing machine) up through $a_{100}$.
The Legendre series, Eq. 12.49, has been treated as a known function f(x)
that we arbitrarily chose to expand in a series of Legendre polynomials.
Sometimes the origin and nature of the Legendre series is different. In the next
examples we consider unknown functions we know can be represented by a
Legendre series because of the differential equation the unknown functions
satisfy. As before, the problem is to determine the unknown coefficients in the
series expansion. Here, however, the coefficients are not found by Eq. 12.50.
Rather, they are determined by demanding that the Legendre series match a
known solution at a boundary. These are boundary value problems.
EXAMPLE 12.3.1 Earth's Gravitational Field
An example of a Legendre series is provided by the description of the earth's
gravitational potential U (for exterior points), neglecting azimuthal effects.
With
R = equatorial radius = 6378.1 ± 0.1 km,
GM/R = 62.494 ± 0.001 km²/sec²,
we write
\[
U(r, \theta) = \frac{GM}{R}\left[\frac{R}{r} - \sum_{n=2}^{\infty} a_n\left(\frac{R}{r}\right)^{n+1}P_n(\cos\theta)\right], \tag{12.52}
\]
a Legendre series. Artificial satellite motions have shown that
$a_2 = (1{,}082{,}635 \pm 11) \times 10^{-9}$,
$a_3 = (-2{,}531 \pm 7) \times 10^{-9}$.
This is the famous pear-shaped deformation of the earth,
$a_4 = (-1{,}600 \pm 12) \times 10^{-9}$.
Other coefficients have been computed through n = 20. The reader might note
that $P_1$ is omitted, since it would represent a displacement and not a deformation.
More recent satellite data permit a determination of the longitudinal
dependence of the earth's gravitational field. Such dependence may be described by a
Laplace series (Section 12.6).
FIG. 12.9 Conducting sphere in a uniform field (the sphere is labeled V = 0)
EXAMPLE 12.3.2 Sphere in a Uniform Field
Another illustration of the use of Legendre polynomials is provided by the
problem of a neutral conducting sphere (radius $r_0$) placed in a (previously)
uniform electric field (Fig. 12.9). The problem is to find the new, perturbed,
electrostatic potential. Calling the electrostatic potential V,5
\[
\nabla^2 V = 0, \tag{12.53}
\]
Laplace's equation. We select spherical polar coordinates because of the
spherical shape of the conductor. (This will simplify the application of the boundary
condition at the surface of the conductor.) Separating variables and glancing at
Table 8.1 if necessary, we can write the unknown potential $V(r, \theta)$ as a linear
combination of solutions:
\[
V(r, \theta) = \sum_{n=0}^{\infty} a_n r^n P_n(\cos\theta) + \sum_{n=0}^{\infty} b_n\frac{P_n(\cos\theta)}{r^{n+1}}. \tag{12.54}
\]
No $\varphi$-dependence appears because of the axial symmetry of our problem. (The
center of the conducting sphere is taken as the origin and the z-axis is oriented
parallel to the original uniform field.)
It might be noted here that n is an integer, because only for integral n is the $\theta$
dependence well behaved at $\cos\theta = \pm 1$. For nonintegral n the solutions of
Legendre's equation diverge at the ends of the interval [-1, 1], the poles $\theta = 0$,
$\pi$ of the sphere (compare Example 5.2.4 and Exercises 5.2.15 and 8.5.5). It is for
this same reason that the second solution of Legendre's equation, $Q_n$, is also
excluded.
Now we turn to our (Dirichlet) boundary conditions to determine the
unknown $a_n$'s and $b_n$'s of our series solution, Eq. 12.54. If the original unperturbed
electrostatic field is $E_0$, we require, as one boundary condition,
\[
V(r \to \infty) = -E_0 z = -E_0 r\cos\theta. \tag{12.55}
\]
5 It should be emphasized that this is not a presentation of a Legendre series
expansion of a known $V(\cos\theta)$. Here we are back to boundary value problems.
Since our Legendre series is unique, we may equate coefficients of $P_n(\cos\theta)$ in
Eq. 12.54 ($r \to \infty$) and Eq. 12.55 to obtain
\[
a_n = 0, \quad n > 1, \qquad a_1 = -E_0. \tag{12.56}
\]
If $a_n \ne 0$ for n > 1, these terms would dominate at large r and the boundary
condition (Eq. 12.55) could not be satisfied.
As a second boundary condition, we may choose the conducting sphere and
the plane $\theta = \pi/2$ to be at zero potential, which means that Eq. 12.54 now
becomes
\[
V(r = r_0) = a_0 + \frac{b_0}{r_0} + \left(\frac{b_1}{r_0^2} - E_0 r_0\right)P_1(\cos\theta) + \sum_{n=2}^{\infty} b_n\frac{P_n(\cos\theta)}{r_0^{n+1}} = 0. \tag{12.57}
\]
In order that this may hold for all values of $\theta$, each coefficient of $P_n(\cos\theta)$ must
vanish.6 Hence
\[
a_0 = b_0 = 0,^7 \qquad b_n = 0, \quad n \ge 2, \tag{12.58}
\]
whereas
\[
b_1 = E_0 r_0^3. \tag{12.59}
\]
The electrostatic potential (outside the sphere) is then
\[
V = -E_0 rP_1(\cos\theta) + \frac{E_0 r_0^3}{r^2}P_1(\cos\theta) = -E_0 rP_1(\cos\theta)\left(1 - \frac{r_0^3}{r^3}\right). \tag{12.60}
\]
In Section 1.15 it was shown that a solution of Laplace's equation that
satisfied the boundary conditions over the entire boundary was unique. The
electrostatic potential V, as given by Eq. 12.60, is a solution of Laplace's
equation. It satisfies our boundary conditions and therefore is the solution of
Laplace's equation for this problem.
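The three claims in this argument (V vanishes on the sphere, V approaches $-E_0 r\cos\theta$ far away, and V satisfies Laplace's equation) can be checked numerically. The sketch below is only a sanity check under assumed values $E_0 = r_0 = 1$ in arbitrary units; the finite-difference Laplacian and the names `potential` and `laplacian` are constructions introduced here.

```python
import math

E0, R0 = 1.0, 1.0   # illustrative field strength and sphere radius (arbitrary units)


def potential(r, theta):
    """Eq. 12.60: V = -E0 r cos(theta) (1 - r0^3 / r^3), for r >= r0."""
    return -E0 * r * math.cos(theta) * (1.0 - (R0 / r) ** 3)


def laplacian(r, theta, h=1e-3):
    """Central-difference Laplacian in spherical polar coordinates (no phi dependence)."""
    v0 = potential(r, theta)
    radial = ((r + h / 2) ** 2 * (potential(r + h, theta) - v0)
              - (r - h / 2) ** 2 * (v0 - potential(r - h, theta))) / (h * h * r * r)
    angular = (math.sin(theta + h / 2) * (potential(r, theta + h) - v0)
               - math.sin(theta - h / 2) * (v0 - potential(r, theta - h))) / (
                   h * h * r * r * math.sin(theta))
    return radial + angular


on_sphere = max(abs(potential(R0, th)) for th in (0.2, 1.0, 2.5))
far_ratio = potential(1.0e4, 0.7) / (-E0 * 1.0e4 * math.cos(0.7))
residual = abs(laplacian(2.0, 0.7))
print(on_sphere, far_ratio, residual)
```

The residual is limited by the step size h of the difference scheme, so it should be small compared with the individual second-derivative terms rather than exactly zero.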
It may further be shown (Exercise 12.3.13) that there is an induced surface
charge density
\[
\sigma = -\varepsilon_0\left.\frac{\partial V}{\partial r}\right|_{r=r_0} = 3\varepsilon_0 E_0\cos\theta \tag{12.61}
\]
on the surface of the sphere and an induced electric dipole moment (Exercise
12.3.13)
\[
P = 4\pi r_0^3\varepsilon_0 E_0. \tag{12.62}
\]
6 Again, this is equivalent to saying that a series expansion in Legendre
polynomials (or any complete orthogonal set) is unique.
7 The coefficient of $P_0$ is $a_0 + b_0/r_0$. We set $b_0 = 0$ (and therefore $a_0 = 0$ also),
since there is no net charge on the sphere. If there is a net charge q, then
$b_0 \ne 0$.
EXAMPLE 12.3.3 Electrostatic Potential of a Ring of Charge
As a further example, consider the electrostatic potential produced by a
conducting ring carrying a total electric charge q (Fig. 12.10). From electrostatics
(and Section 1.14) the potential $\psi$ satisfies Laplace's equation. Separating
variables in spherical polar coordinates (compare Table 8.1), we obtain
\[
\psi(r, \theta) = \sum_{n=0}^{\infty} a_n\frac{a^n}{r^{n+1}}P_n(\cos\theta), \quad r > a. \tag{12.63a}
\]
Here a is the radius of the ring that is assumed to be in the $\theta = \pi/2$ plane. There is
no $\varphi$ (azimuthal) dependence because of the cylindrical symmetry of the system.
FIG. 12.10 Charged, conducting ring
The terms with positive exponent radial dependence have been rejected since
the potential must have an asymptotic behavior
\[
\psi \sim \frac{q}{4\pi\varepsilon_0}\cdot\frac{1}{r}, \quad r \gg a. \tag{12.63b}
\]
The problem is to determine the coefficients $a_n$ in Eq. 12.63a. This may be done
by evaluating $\psi(r, \theta)$ at $\theta = 0$, r = z, and comparing with an independent
calculation of the potential from Coulomb's law. In effect, we are using a boundary
condition along the z-axis. From Coulomb's law (with all charge equidistant),
\[
\psi(r, 0) = \frac{q}{4\pi\varepsilon_0}\,\frac{1}{(z^2 + a^2)^{1/2}}, \quad r = z,
\]
\[
= \frac{q}{4\pi\varepsilon_0 z}\sum_{s=0}^{\infty}(-1)^s\frac{(2s)!}{2^{2s}(s!)^2}\left(\frac{a}{z}\right)^{2s}, \quad z > a. \tag{12.63c}
\]
The last step uses the result of Exercise 10.1.15. Now, Eq. 12.63a evaluated at
$\theta = 0$, r = z (with $P_n(1) = 1$), yields
\[
\psi(r, 0) = \sum_{n=0}^{\infty} a_n\frac{a^n}{z^{n+1}}, \quad r = z. \tag{12.63d}
\]
Comparing Eqs. 12.63c and 12.63d, we get $a_n = 0$ for n odd. Setting n = 2s, we
have
\[
a_{2s} = \frac{q}{4\pi\varepsilon_0}(-1)^s\frac{(2s)!}{2^{2s}(s!)^2}, \tag{12.63e}
\]
and our electrostatic potential $\psi(r, \theta)$ is given by
\[
\psi(r, \theta) = \frac{q}{4\pi\varepsilon_0 r}\sum_{s=0}^{\infty}(-1)^s\frac{(2s)!}{2^{2s}(s!)^2}\left(\frac{a}{r}\right)^{2s}P_{2s}(\cos\theta), \quad r > a. \tag{12.63f}
\]
The magnetic analog of this problem appears in Section 12.5—Example 12.5.1.
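A short numerical sketch of the ring potential may help make the series concrete. It sums Eq. 12.63f through $P_{22}(\cos\theta)$ and, on the z-axis, compares the result with the closed Coulomb form of Eq. 12.63c. The potential is returned in units of $q/(4\pi\varepsilon_0 a)$; that unit choice, the truncation order, and the names `legendre` and `ring_potential` are assumptions made here.

```python
import math


def legendre(n, x):
    """P_n(x) from the three-term recurrence, Eq. 12.17."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def ring_potential(r_over_a, theta, s_max=11):
    """psi in units of q/(4 pi eps0 a) for r > a, Eq. 12.63f truncated at P_{2 s_max}."""
    x = math.cos(theta)
    ratio = 1.0 / r_over_a                                   # a / r
    total = 0.0
    for s in range(s_max + 1):
        coeff = (-1) ** s * math.comb(2 * s, s) / 4 ** s     # (-1)^s (2s)! / (2^{2s} (s!)^2)
        total += coeff * ratio ** (2 * s) * legendre(2 * s, x)
    return total * ratio                                     # overall factor a / r


on_axis = ring_potential(2.0, 0.0)
closed_form = 1.0 / math.sqrt(2.0 ** 2 + 1.0)                # a / sqrt(z^2 + a^2) at z = 2a
off_axis = ring_potential(2.5, math.radians(60.0))
print(on_axis, closed_form, off_axis)
```

Convergence slows as r/a approaches 1, so a fixed truncation such as `s_max=11` is adequate only well outside the ring.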
EXERCISES
12.3.1 You have constructed a set of orthogonal functions by the Gram-Schmidt
process (Section 9.3), taking $u_n(x) = x^n$, n = 0, 1, 2, . . ., in increasing order with
w(x) = 1 and an interval $-1 \le x \le 1$. Prove that the nth such function
constructed is proportional to $P_n(x)$.
Hint. Use mathematical induction.
12.3.2 Expand the Dirac delta function in a series of Legendre polynomials, using the
interval $-1 \le x \le 1$.
12.3.3 Verify the Dirac delta function expansions
\[
\delta(1 - x) = \sum_{n=0}^{\infty}\frac{2n + 1}{2}P_n(x),
\]
\[
\delta(1 + x) = \sum_{n=0}^{\infty}(-1)^n\frac{2n + 1}{2}P_n(x).
\]
These expressions appear in a resolution of the Rayleigh plane wave expansion
(Exercise 12.4.7) into incoming and outgoing spherical waves.
Note. Assume that the entire Dirac delta function is covered when integrating
over [-1, 1].
12.3.4 Neutrons (mass 1) are being scattered by a nucleus of mass A (A > 1). In the
center of mass system the scattering is isotropic. Then, in the lab system the
average of the cosine of the angle of deflection of the neutron is
\[
\langle\cos\psi\rangle = \frac{1}{2}\int_{0}^{\pi}\frac{A\cos\theta + 1}{(A^2 + 2A\cos\theta + 1)^{1/2}}\sin\theta\,d\theta.
\]
Show, by expansion of the denominator, that $\langle\cos\psi\rangle = \dfrac{2}{3A}$.
12.3.5 A particular function f(x) defined over the interval [—1,1] is expanded in
a Legendre series over this same interval. Show that the expansion is unique.
12.3.6 A function f(x) is expanded in a Legendre series $f(x) = \sum_{n=0}^{\infty} a_n P_n(x)$. Show that
\[
\int_{-1}^{1}[f(x)]^2\,dx = \sum_{n=0}^{\infty}\frac{2a_n^2}{2n + 1}.
\]
This is the Legendre form of the Fourier series Parseval identity, Exercise
14.4.2. It also illustrates Bessel's inequality, Eq. 9.72, becoming an equality for
a complete set.
12.3.7 Derive the recurrence relation
\[
(1 - x^2)P_n'(x) = nP_{n-1}(x) - nxP_n(x)
\]
from the Legendre polynomial generating function.
12.3.8 Evaluate $\int_0^1 P_n(x)\,dx$.
ANS. n = 2s: 1 for s = 0, 0 for s > 0;
n = 2s + 1: $P_{2s}(0)/(2s + 2) = (-1)^s(2s - 1)!!/(2s + 2)!!$
Hint. Use a recurrence relation to replace $P_n(x)$ by derivatives and then integrate
by inspection! Alternatively, you can integrate the generating function.
show that
l-i n=o
Bи + 2)М
(b) By testing the series, prove that the series is convergent.
12.3.10 Prove that
\[
\int_{-1}^{1} x(1 - x^2)P_n'P_m'\,dx = 0, \quad \text{unless } m = n \pm 1.
\]
12.3.11 The amplitude of a scattered wave is given by
\[
f(\theta) = ƛ\sum_{l=0}^{\infty}(2l + 1)\exp[i\delta_l]\sin\delta_l\,P_l(\cos\theta).
\]
Here $\theta$ is the angle of scattering, l the angular momentum, and $\delta_l$ the phase
shift produced by the central potential that is doing the scattering. The total
cross section is $\sigma_{tot} = \int f^*(\theta)f(\theta)\,d\Omega$. Show that
\[
\sigma_{tot} = 4\pi ƛ^2\sum_{l=0}^{\infty}(2l + 1)\sin^2\delta_l.
\]
12.3.12 The coincidence counting rate, $W(\theta)$, in a gamma-gamma angular correlation
experiment has the form
\[
W(\theta) = \sum_{n=0}^{\infty} a_{2n}P_{2n}(\cos\theta).
\]
Show that data in the range $\pi/2 \le \theta \le \pi$ can, in principle, define the function
$W(\theta)$ (and permit a determination of the coefficients $a_{2n}$). This means that
although data in the range $0 \le \theta < \pi/2$ may be useful as a check, they are not
essential.
12.3.13 A conducting sphere of radius $r_0$ is placed in an initially uniform electric field,
$E_0$. Show the following:
(a) The induced surface charge density is
$\sigma = 3\varepsilon_0 E_0\cos\theta$.
(b) The induced electric dipole moment is
$P = 4\pi r_0^3\varepsilon_0 E_0$.
The induced electric dipole moment can be calculated either from the
surface charge [part (a)], or by noting that the final electric field E is the
result of superimposing a dipole field on the original uniform field.
12.3.14 A charge q is displaced a distance a along the z-axis from the center of a spherical
cavity of radius R.
(a) Show that the electric field averaged over the volume $a \le r \le R$ is zero.
(b) Show that the electric field averaged over the volume $0 \le r \le a$ is
\[
\mathbf{E} = \hat{\mathbf{k}}E_z = -\hat{\mathbf{k}}\frac{q}{4\pi\varepsilon_0 a^2} \quad \text{(SI units)}
= -\hat{\mathbf{k}}\frac{nqa}{3\varepsilon_0},
\]
where n is the number of such displaced charges per unit volume. This is a
basic calculation in the polarization of a dielectric.
Hint. $\mathbf{E} = -\nabla\varphi$.
12.3.15 Determine the electrostatic potential (Legendre expansion) of a circular ring of
electric charge for r < a.
12.3.16 Calculate the electric field produced by the charged conducting ring of Example
12.3.3 for
(a) r > a,
(b) r < a.
12.3.17 As an extension of Example 12.3.3, find the potential $\psi(r, \theta)$ produced by a
charged conducting disk, Fig. 12.11, for r > a, the radius of the disk.
The charge density $\sigma$ (on each side of the disk) is
\[
\sigma(\rho) = \frac{q}{4\pi a(a^2 - \rho^2)^{1/2}}, \quad \rho^2 = x^2 + y^2.
\]
FIG. 12.11 Charged, conducting disk
Hint. The definite integral you get can be evaluated as a beta function, Section
10.4.
ANS. $\psi(r, \theta) = \dfrac{q}{4\pi\varepsilon_0 r}\sum_{l=0}^{\infty}(-1)^l\dfrac{1}{2l + 1}\left(\dfrac{a}{r}\right)^{2l}P_{2l}(\cos\theta)$.
12.3.18 From the result of Exercise 12.3.17 calculate the potential of the disk. Since you
are violating the condition r > a, justify your calculation carefully.
Hint. You may run into the series given in Exercise 5.2.14.
12.3.19 The hemisphere defined by r = a, $0 \le \theta < \pi/2$ has an electrostatic potential
$+V_0$. The hemisphere r = a, $\pi/2 < \theta \le \pi$ has an electrostatic potential $-V_0$.
Show that the potential at interior points is
\[
V = V_0\sum_{n=0}^{\infty}\frac{4n + 3}{2n + 2}\left(\frac{r}{a}\right)^{2n+1}P_{2n}(0)\,P_{2n+1}(\cos\theta).
\]
Hint. You need Exercise 12.3.8.
12.3.20 A conducting sphere of radius a is divided into two electrically separate
hemispheres by a thin insulating barrier at its equator. The top hemisphere is
maintained at a potential $V_0$, the bottom hemisphere at $-V_0$.
(a) Show that the electrostatic potential exterior to the two hemispheres is
\[
V(r, \theta) = V_0\sum_{s=0}^{\infty}(-1)^s(4s + 3)\frac{(2s - 1)!!}{(2s + 2)!!}\left(\frac{a}{r}\right)^{2s+2}P_{2s+1}(\cos\theta).
\]
(b) Calculate the electric charge density $\sigma$ on the outside surface. Note that
your series diverges at $\cos\theta = \pm 1$ as you expected from the infinite
capacitance of this system (zero thickness for the insulating barrier).
ANS. $\sigma = \varepsilon_0 E_n = -\varepsilon_0\left.\dfrac{\partial V}{\partial r}\right|_{r=a} = \dfrac{\varepsilon_0 V_0}{a}\sum_{s=0}^{\infty}(-1)^s(4s + 3)\dfrac{(2s - 1)!!}{(2s)!!}P_{2s+1}(\cos\theta)$.
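12.3.21 In the notation of Section 9.4, $|\varphi_s\rangle = \sqrt{(2s + 1)/2}\,P_s(x)$, a Legendre polynomial
is renormalized to unity. Explain how $|\varphi_s\rangle\langle\varphi_s|$ acts as a projection operator. In
particular, show that if $|f\rangle = \sum_n a_n'|\varphi_n\rangle$, then
\[
|\varphi_s\rangle\langle\varphi_s|f\rangle = a_s'|\varphi_s\rangle.
\]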
12.3.22 Expand $x^8$ as a Legendre series. Determine the Legendre coefficients from Eq.
12.50,
\[
a_m = \frac{2m + 1}{2}\int_{-1}^{1} x^8 P_m(x)\,dx.
\]
Check your values against AMS-55, Table 22.9. This illustrates the expansion
of a simple function. Actually if f(x) is expressed as a power series, the technique
of Exercise 12.2.14 is both faster and more accurate.
Hint. Gaussian quadrature can be used to evaluate the integral.
12.3.23 Calculate and tabulate the electrostatic potential created by a ring of charge,
Example 12.3.3, for r/a = 1.5(0.5)5.0 and $\theta$ = 0°(15°)90°. Carry terms through
$P_{22}(\cos\theta)$.
Note. The convergence of your series will be slow for r/a = 1.5. Truncating the
series at P22 limits you to about a four significant figure accuracy.
Check value. For r/a = 2.5 and 0 = 60°, ф = 0A0272(q/4ne0r).
12.3.24 Calculate and tabulate the electrostatic potential created by a charged disk,
Exercise 12.3.17, for r/a = 1.5(0.5)5.0 and $\theta$ = 0°(15°)90°. Carry terms through
Check value. For r/a = 2.0 and $\theta$ = 15°, $\phi = 0.46638\,(q/4\pi\varepsilon_0 r)$.
12.3.25 Calculate the first five (nonvanishing) coefficients in the Legendre series
expansion of f(x) = 1 - |x| using Eq. 12.51 (numerical integration). Actually these
coefficients can be obtained in closed form. Compare your coefficients with
those obtained from Exercise 13.4.4.
ANS. $a_0$ = 0.5000
$a_2$ = -0.6250
$a_4$ = 0.1875
$a_6$ = -0.1016
$a_8$ = 0.0664.
12.3.26 Calculate and tabulate the exterior electrostatic potential created by the two
charged hemispheres of Exercise 12.3.20, for r/a = 1.5(0.5)5.0 and $\theta$ = 0°(15°)90°.
Carry terms through $P_{23}(\cos\theta)$.
Check value. For r/a = 2.0 and $\theta$ = 45°,
$V = 0.27066\,V_0$.
12.3.27 (a) Given f(x) = 2.0, |x| < 0.5; 0, 0.5 < |x| < 1.0. Expand f(x) in a Legendre
series and calculate the coefficients $a_n$ through $a_{80}$ (analytically).
(b) Evaluate $\sum_{n=0}^{80} a_n P_n(x)$ for x = 0.400(0.005)0.600. Plot your results.
Note. This illustrates the Gibbs phenomenon of Section 14.5 and the danger of
trying to calculate with a series expansion in the vicinity of a discontinuity.
12.4 ALTERNATE DEFINITIONS OF LEGENDRE
POLYNOMIALS
Rodrigues' Formula
The series form of the Legendre polynomials (Eq. 12.8) of Section 12.1 may
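be transformed as follows. From Eq. 12.8
$$P_n(x) = \sum_{r=0}^{[n/2]} (-1)^r \frac{(2n-2r)!}{2^n\, r!\,(n-r)!\,(n-2r)!}\, x^{n-2r}. \tag{12.64}$$
For n an integer
$$P_n(x) = \sum_{r=0}^{[n/2]} (-1)^r \frac{1}{2^n\, r!\,(n-r)!}\left(\frac{d}{dx}\right)^{n} x^{2n-2r}
= \frac{1}{2^n n!}\left(\frac{d}{dx}\right)^{n} \sum_{r=0}^{n} \frac{(-1)^r\, n!}{r!\,(n-r)!}\, x^{2n-2r}. \tag{12.64a}$$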
Note the extension of the upper limit. The reader is asked to show in Exercise
12.4.1 that the additional terms [n/2] + 1 to n in the summation contribute
nothing. However, the effect of these extra terms is to permit the replacement of
the new summation by $(x^2 - 1)^n$ (binomial theorem once again) to obtain
$$P_n(x) = \frac{1}{2^n n!}\left(\frac{d}{dx}\right)^{n} (x^2 - 1)^n. \tag{12.65}$$
This is Rodrigues' formula. It is useful in proving many of the properties of the
Legendre polynomials such as orthogonality. A related application is seen in
Exercise 12.4.3. The Rodrigues definition is extended in Section 12.5 to define
the associated Legendre functions. In Section 12.7 it is used to identify the orbital
angular momentum eigenfunctions.
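Rodrigues' formula, $P_n(x) = \frac{1}{2^n n!}(d/dx)^n (x^2 - 1)^n$, can be checked mechanically. The following is a minimal Python sketch, not part of the original text; the helper names `poly_mul`, `poly_diff`, and `legendre_rodrigues` are illustrative assumptions. Polynomials are stored as exact coefficient lists (index = power of x), $(x^2 - 1)^n$ is built by repeated multiplication, differentiated n times, and scaled by $1/2^n n!$.

```python
from fractions import Fraction
from math import factorial


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_diff(a):
    """Differentiate a polynomial once; the zero polynomial is returned as [0]."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]


def legendre_rodrigues(n):
    """Coefficients of P_n(x) obtained from Rodrigues' formula."""
    p = [Fraction(1)]
    for _ in range(n):
        # multiply by (x^2 - 1)
        p = poly_mul(p, [Fraction(-1), Fraction(0), Fraction(1)])
    for _ in range(n):
        p = poly_diff(p)
    scale = Fraction(1, 2**n * factorial(n))
    return [scale * c for c in p]


if __name__ == "__main__":
    for n in range(4):
        print(n, legendre_rodrigues(n))
```

For n = 2 and n = 3 the coefficient lists correspond to $(3x^2 - 1)/2$ and $(5x^3 - 3x)/2$, and the coefficients of every $P_n$ sum to 1, in agreement with $P_n(1) = 1$.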
FIG. 12.12 Schlaefli integral contour (t plane, with the cut line ending at t = -1)
Schlaefli Integral
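Rodrigues' formula provides a means of developing an integral
representation of $P_n(z)$. Using Cauchy's integral formula (Section 6.4)
$$f(z) = \frac{1}{2\pi i}\oint \frac{f(t)}{t - z}\,dt \tag{12.66}$$
with
$$f(z) = (z^2 - 1)^n, \tag{12.67}$$
we have
$$(z^2 - 1)^n = \frac{1}{2\pi i}\oint \frac{(t^2 - 1)^n}{t - z}\,dt. \tag{12.68}$$
Differentiating n times with respect to z and multiplying by $1/2^n n!$ gives
$$P_n(z) = \frac{1}{2^n n!}\frac{d^n}{dz^n}(z^2 - 1)^n = \frac{2^{-n}}{2\pi i}\oint \frac{(t^2 - 1)^n}{(t - z)^{n+1}}\,dt, \tag{12.69}$$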
with the contour enclosing the point t = z.
This is the Schlaefli integral. Margenau and Murphy1 use this to derive the
recurrence relations we obtained from the generating function.
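The Schlaefli representation, $P_n(z) = \frac{2^{-n}}{2\pi i}\oint (t^2 - 1)^n (t - z)^{-(n+1)}\,dt$ with the contour enclosing t = z, lends itself to a quick numerical check for integer n. The sketch below is an illustration only, not the text's method: it assumes a circular contour $t = z + \rho e^{i\varphi}$ and an equally spaced (trapezoidal) sum, and the function name, radius, and sample count are arbitrary choices. For integer n the integrand is a finite Laurent polynomial in $e^{i\varphi}$, so a modest number of sample points already reproduces $P_n(z)$ to rounding error.

```python
import cmath
import math


def schlaefli_legendre(n, z, radius=1.0, samples=256):
    """Approximate P_n(z), n a nonnegative integer, from the Schlaefli integral.

    The contour is the circle |t - z| = radius, sampled at equally spaced angles.
    """
    total = 0.0 + 0.0j
    dphi = 2.0 * math.pi / samples
    for k in range(samples):
        e = cmath.exp(1j * k * dphi)
        t = z + radius * e
        dt = 1j * radius * e * dphi  # dt = i * rho * exp(i phi) dphi
        total += (t * t - 1.0) ** n / (t - z) ** (n + 1) * dt
    return total / (2.0**n * 2j * math.pi)


if __name__ == "__main__":
    z = 0.3
    print(schlaefli_legendre(3, z))      # contour integral
    print((5 * z**3 - 3 * z) / 2)        # closed form of P_3(z) for comparison
```

The same routine accepts complex z, which is the natural setting for the contour integral.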
The Schlaefli integral may readily be shown to satisfy Legendre's equation by
differentiation and direct substitution (Fig. 12.12). We obtain
$$(1 - z^2)\frac{d^2 P_n}{dz^2} - 2z\frac{dP_n}{dz} + n(n+1)P_n = \frac{n+1}{2^n\, 2\pi i}\oint \frac{d}{dt}\left[\frac{(t^2 - 1)^{n+1}}{(t - z)^{n+2}}\right] dt. \tag{12.70}$$
For integral n our function $(t^2 - 1)^{n+1}/(t - z)^{n+2}$ is single-valued, and the
integral around the closed path vanishes. The Schlaefli integral may also be used
to define $P_\nu(z)$ for nonintegral $\nu$ by integrating around the points t = z, t = 1, but
not crossing the cut line from -1 to $-\infty$. We could equally well encircle the points
t = z and t = -1, but this would lead to nothing new. A contour about t = +1
and t = -1 will lead to a second solution $Q_\nu(z)$, Section 12.10.
1 H. Margenau and G. M. Murphy, The Mathematics of Physics and Chemistry,
2nd ed., Section 3.5. Princeton, N.J.: Van Nostrand (1956).
EXERCISES
12.4.1 Show that each term in the summation
$$\sum_{r=[n/2]+1}^{n} \left(\frac{d}{dx}\right)^{n} \frac{(-1)^r\, n!}{r!\,(n-r)!}\, x^{2n-2r}$$
vanishes (r and n integral).
12.4.2 Using Rodrigues' formula, show that the $P_n(x)$ are orthogonal and that
$$\int_{-1}^{1} [P_n(x)]^2\,dx = \frac{2}{2n+1}.$$
Hint. Use Rodrigues' formula and integrate by parts.
12.4.3 Show that $\int_{-1}^{1} x^m P_n(x)\,dx = 0$ when m < n.
Hint. Use Rodrigues' formula.
12.4.4 Show that
$$\int_{-1}^{1} x^n P_n(x)\,dx = \frac{2^{n+1}\, n!\, n!}{(2n+1)!}.$$
Note. You are expected to use Rodrigues' formula and integrate by parts but
also see if you can get the result from Eq. 12.8 by inspection.
12.4.5 Show that
12.4.6 As a generalization of Exercises 12.4.4 and 12.4.5, show that the Legendre
expansions of $x^s$ are
(a) $$x^{2r} = \sum_{n=0}^{r} \frac{2^{2n}(4n+1)(2r)!\,(r+n)!}{(2r+2n+1)!\,(r-n)!}\, P_{2n}(x),$$
(b) $$x^{2r+1} = \sum_{n=0}^{r} \frac{2^{2n+1}(4n+3)(2r+1)!\,(r+n+1)!}{(2r+2n+3)!\,(r-n)!}\, P_{2n+1}(x).$$
12.4.7 A plane wave may be expanded in a series of spherical waves by the Rayleigh
equation
$$e^{ikr\cos\gamma} = \sum_{n=0}^{\infty} a_n\, j_n(kr)\, P_n(\cos\gamma).$$
Show that $a_n = i^n(2n+1)$.
Hint. 1. Use the orthogonality of the $P_n$ to solve for $a_n j_n(kr)$.
2. Differentiate n times with respect to (kr) and set r = 0 to eliminate
the r-dependence.
3. Evaluate the remaining integral by Exercise 12.4.4.
Note. This problem may also be treated by noting that both sides of the equation
satisfy the Helmholtz equation. The equality can be established by showing that
the solutions have the same behavior at the origin and also behave alike at large
distances. A "by inspection" type solution is developed in Section 16.6 using
Green's functions.
12.4.8 Verify the Rayleigh equation of Exercise 12.4.7 by starting with the following
steps:
1. Differentiate with respect to (kr) to establish
$$\sum_n a_n\, j_n'(kr)\, P_n(\cos\gamma) = i\sum_n a_n\, j_n(kr)\cos\gamma\, P_n(\cos\gamma).$$
2. Use a recurrence relation to replace $\cos\gamma\, P_n(\cos\gamma)$ by a linear
combination of $P_{n-1}$ and $P_{n+1}$.
3. Use a recurrence relation to replace $j_n'$ by a linear
combination of $j_{n-1}$ and $j_{n+1}$.
12.4.9 From Exercise 12.4.7 show that
$$j_n(kr) = \frac{1}{2i^n}\int_{-1}^{1} e^{ikr\mu}\, P_n(\mu)\,d\mu.$$
This means that (apart from constant factors) the spherical Bessel function
$j_n(kr)$ is the Fourier transform of the Legendre polynomial $P_n(\mu)$.
12.4.10 The Legendre polynomials and the spherical Bessel functions are related by
$$j_n(z) = \tfrac{1}{2}(-i)^n \int_0^{\pi} e^{iz\cos\theta}\, P_n(\cos\theta)\sin\theta\,d\theta, \qquad n = 0, 1, 2, \ldots.$$
Verify this relation by transforming the right-hand side into
$$\frac{z^n}{2^{n+1}\, n!}\int_0^{\pi} \cos(z\cos\theta)\sin^{2n+1}\theta\,d\theta$$
and using Exercise 11.7.9.
12.4.11 By direct evaluation of the Schlaefli integral show that Pn(l) = 1.
12.4.12 Explain why the contour of the Schlaefli integral, Eq. 12.69, is chosen to enclose
the points t = z and t = 1 when $n \to \nu$, not an integer.
12.4.13 In numerical work (such as the Gauss-Legendre quadrature of Appendix 2)
it is useful to establish that $P_n(x)$ has n real zeros in the interior of [-1, 1].
Show that this is so.
Hint. Rolle's theorem shows that the first derivative of $(x^2 - 1)^{2n}$ has one zero
in the interior of [-1, 1]. Extend this argument to the second, third, and
ultimately to the nth derivative.
12.5 ASSOCIATED LEGENDRE FUNCTIONS
When Helmholtz's equation is separated in spherical polar coordinates
(Section 2.6), one of the separated ordinary differential equations is the associated
Legendre equation
$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{dP_n^m(\cos\theta)}{d\theta}\right) + \left[n(n+1) - \frac{m^2}{\sin^2\theta}\right] P_n^m(\cos\theta) = 0. \tag{12.71}$$
With $x = \cos\theta$, this becomes
$$(1 - x^2)\frac{d^2}{dx^2}P_n^m(x) - 2x\frac{d}{dx}P_n^m(x) + \left[n(n+1) - \frac{m^2}{1 - x^2}\right] P_n^m(x) = 0. \tag{12.72}$$
Only if the azimuthal separation constant $m^2 = 0$ do we have Legendre's
equation, Eq. 12.28.
One way of developing the solution of the associated Legendre equation is to
start with the regular Legendre equation and convert it into the associated
Legendre equation by using multiple differentiation. We take Legendre's
equation
$$(1 - x^2)P_n'' - 2xP_n' + n(n+1)P_n = 0, \tag{12.73}$$
and with the help of Leibnitz's formula¹ differentiate m times. The result is
$$(1 - x^2)u'' - 2x(m+1)u' + (n-m)(n+m+1)u = 0, \tag{12.74}$$
where
$$u \equiv \frac{d^m}{dx^m}P_n(x). \tag{12.75}$$
Equation 12.74 is not self-adjoint. To put it into self-adjoint form, we replace
u(x) by
$$v(x) = (1 - x^2)^{m/2}u(x) = (1 - x^2)^{m/2}\frac{d^m P_n(x)}{dx^m}. \tag{12.76}$$
Solving for u and differentiating, we obtain
$$u' = \left(v' + \frac{mxv}{1 - x^2}\right)(1 - x^2)^{-m/2}, \tag{12.77}$$
$$u'' = \left[v'' + \frac{2mxv'}{1 - x^2} + \frac{mv}{1 - x^2} + \frac{m(m+2)x^2 v}{(1 - x^2)^2}\right](1 - x^2)^{-m/2}. \tag{12.78}$$
Substituting into Eq. 12.74, we find that the new function v satisfies the
differential equation
$$(1 - x^2)v'' - 2xv' + \left[n(n+1) - \frac{m^2}{1 - x^2}\right]v = 0, \tag{12.79}$$
which is the associated Legendre equation, reducing to Legendre's equation, as
it must, when m is set equal to zero. Expressed in spherical polar coordinates, the
associated Legendre equation is
$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{dv}{d\theta}\right) + \left[n(n+1) - \frac{m^2}{\sin^2\theta}\right]v = 0. \tag{12.80}$$
1 Leibnitz's formula for the nth derivative of a product is
$$\frac{d^n}{dx^n}[A(x)B(x)] = \sum_{s=0}^{n}\binom{n}{s}\frac{d^{n-s}}{dx^{n-s}}A(x)\,\frac{d^s}{dx^s}B(x), \qquad \binom{n}{s} = \frac{n!}{(n-s)!\,s!},$$
a binomial coefficient.
Associated Legendre Functions
The regular solutions, relabeled $P_n^m(x)$, are
$$v = P_n^m(x) = (1 - x^2)^{m/2}\frac{d^m}{dx^m}P_n(x). \tag{12.81}$$
These are the associated Legendre functions.² Since the highest power of x in
$P_n(x)$ is $x^n$, we must have $m \le n$ (or the m-fold differentiation will drive our
function to zero). In quantum mechanics the requirement that $m \le n$ has the physical
interpretation that the expectation value of the square of the z-component of the
angular momentum is less than or equal to the expectation value of the square
of the angular momentum vector L, $\langle L_z^2\rangle \le \langle \mathbf{L}^2\rangle$.
From the form of Eq. 12.81 we might expect m to be nonnegative,
differentiating a negative number of times not having been defined. However, if $P_n(x)$ is
expressed by Rodrigues' formula, this limitation on m is relaxed and we may
have $-n \le m \le n$, negative as well as positive values of m being permitted.
Using Leibnitz's differentiation formula once again, the reader may show
(Exercise 12.5.1) that
$P_n^m(x)$ and $P_n^{-m}(x)$ are related by
$$P_n^{-m}(x) = (-1)^m\frac{(n-m)!}{(n+m)!}\,P_n^m(x). \tag{12.81a}$$
From our definition of the associated Legendre functions, $P_n^m(x)$,
$$P_n^0(x) = P_n(x). \tag{12.82}$$
In addition, we may develop Table 12.3.
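The extension to negative m can be made concrete with a short numerical sketch. The Python below is an illustration under stated assumptions, not the text's own procedure: it evaluates $P_n^m(x) = \frac{1}{2^n n!}(1 - x^2)^{m/2}\,\frac{d^{n+m}}{dx^{n+m}}(x^2 - 1)^n$ for $-n \le m \le n$ at a point with |x| < 1, using the binomial expansion of $(x^2 - 1)^n$ and term-by-term differentiation. The function name `assoc_legendre_rodrigues` is hypothetical, and the text's sign convention (no extra $(-1)^m$) is used.

```python
from math import comb, factorial


def assoc_legendre_rodrigues(n, m, x):
    """P_n^m(x) for -n <= m <= n and |x| < 1, from the Rodrigues-type definition.

    Convention of the text: no additional factor (-1)^m.
    """
    # Coefficients of (x^2 - 1)^n = sum_k C(n, k) (-1)^(n - k) x^(2k).
    coeffs = [0] * (2 * n + 1)
    for k in range(n + 1):
        coeffs[2 * k] = comb(n, k) * (-1) ** (n - k)
    # Differentiate n + m times (between 0 and 2n times).
    for _ in range(n + m):
        coeffs = [p * coeffs[p] for p in range(1, len(coeffs))]
    deriv = sum(c * x**p for p, c in enumerate(coeffs))
    return (1.0 - x * x) ** (m / 2) * deriv / (2**n * factorial(n))


if __name__ == "__main__":
    n, m, x = 3, 2, 0.4
    plus = assoc_legendre_rodrigues(n, m, x)
    minus = assoc_legendre_rodrigues(n, -m, x)
    # Ratio predicted by the relation between P_n^{-m} and P_n^m:
    predicted = (-1) ** m * factorial(n - m) / factorial(n + m) * plus
    print(plus, minus, predicted)
```

With n = 3, m = 2 the positive-m value agrees with $15x(1 - x^2)$, and the negative-m value agrees with $(-1)^m (n-m)!/(n+m)!$ times it.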
As with the Legendre polynomials, a generating function for the associated
Legendre functions does exist:
$$\frac{(2m)!\,(1 - x^2)^{m/2}}{2^m\, m!\,(1 - 2tx + t^2)^{m+1/2}} = \sum_{s=0}^{\infty} P_{s+m}^m(x)\,t^s. \tag{12.83}$$
However, because of its more cumbersome form and lack of any direct physical
application, it is seldom used.
2 Occasionally (as in AMS-55), the reader will find the associated Legendre
functions defined with an additional factor of $(-1)^m$. This $(-1)^m$ seems an
unnecessary complication at this point. It will be included in the definition
of the spherical harmonics $Y_n^m(\theta, \varphi)$ in Section 12.6.
TABLE 12.3 Associated Legendre Functions
$P_1^1(x) = (1 - x^2)^{1/2} = \sin\theta$
$P_2^1(x) = 3x(1 - x^2)^{1/2} = 3\cos\theta\sin\theta$
$P_2^2(x) = 3(1 - x^2) = 3\sin^2\theta$
$P_3^1(x) = \tfrac{3}{2}(5x^2 - 1)(1 - x^2)^{1/2} = \tfrac{3}{2}(5\cos^2\theta - 1)\sin\theta$
$P_3^2(x) = 15x(1 - x^2) = 15\cos\theta\sin^2\theta$
$P_3^3(x) = 15(1 - x^2)^{3/2} = 15\sin^3\theta$
$P_4^1(x) = \tfrac{5}{2}(7x^3 - 3x)(1 - x^2)^{1/2} = \tfrac{5}{2}(7\cos^3\theta - 3\cos\theta)\sin\theta$
$P_4^2(x) = \tfrac{15}{2}(7x^2 - 1)(1 - x^2) = \tfrac{15}{2}(7\cos^2\theta - 1)\sin^2\theta$
$P_4^3(x) = 105x(1 - x^2)^{3/2} = 105\cos\theta\sin^3\theta$
$P_4^4(x) = 105(1 - x^2)^2 = 105\sin^4\theta$
Recurrence Relations
As expected, the associated Legendre functions satisfy recurrence relations.
Because of the existence of two indices instead of just one, we have a wide
variety of recurrence relations:
$$P_n^{m+1} - \frac{2mx}{(1 - x^2)^{1/2}}P_n^m + [n(n+1) - m(m-1)]P_n^{m-1} = 0, \tag{12.84}$$
$$(2n+1)xP_n^m = (n+m)P_{n-1}^m + (n-m+1)P_{n+1}^m, \tag{12.85}$$
$$(2n+1)(1 - x^2)^{1/2}P_n^m = P_{n+1}^{m+1} - P_{n-1}^{m+1}
= (n+m)(n+m-1)P_{n-1}^{m-1} - (n-m+1)(n-m+2)P_{n+1}^{m-1}, \tag{12.86}$$
$$(1 - x^2)^{1/2}\,P_n^{m\prime} = \tfrac{1}{2}P_n^{m+1} - \tfrac{1}{2}(n+m)(n-m+1)P_n^{m-1}. \tag{12.87}$$
These relations, and many other similar ones, may be verified by use of the
generating function (Eq. 12.4), by substitution of the series solution of the
associated Legendre equation (12.79), or by reduction to the Legendre polynomial
recurrence relations, using Eq. 12.81. As an example of the last method, consider
the third equation in the preceding set. It is similar to Eq. 12.23:
$$(2n+1)P_n(x) = P_{n+1}'(x) - P_{n-1}'(x). \tag{12.88}$$
Let us differentiate this Legendre polynomial recurrence relation m times to
obtain
$$(2n+1)\frac{d^m}{dx^m}P_n(x) = \frac{d^{m+1}}{dx^{m+1}}P_{n+1}(x) - \frac{d^{m+1}}{dx^{m+1}}P_{n-1}(x). \tag{12.89}$$
Now multiplying by $(1 - x^2)^{(m+1)/2}$ and using the definition of $P_n^m(x)$, we obtain
the first part of Eq. 12.86.
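Recurrence in the lower index gives a practical way to evaluate $P_n^m(x)$ numerically. The sketch below is one possible arrangement, not a prescription from the text: it assumes the starting value $P_m^m(x) = (2m-1)!!\,(1 - x^2)^{m/2}$ (the result quoted in Exercise 12.5.4), takes $P_{m-1}^m = 0$, and steps upward with Eq. 12.85 solved for $P_{n+1}^m$, that is $(n-m+1)P_{n+1}^m = (2n+1)xP_n^m - (n+m)P_{n-1}^m$. The function name `assoc_legendre` is illustrative, and only $0 \le m \le n$, $|x| \le 1$ is handled.

```python
import math


def assoc_legendre(n, m, x):
    """P_n^m(x) for 0 <= m <= n and |x| <= 1 (text convention, no (-1)^m factor)."""
    # Starting value P_m^m(x) = (2m - 1)!! (1 - x^2)^(m/2).
    pmm = 1.0
    root = math.sqrt(1.0 - x * x)
    for k in range(1, m + 1):
        pmm *= (2 * k - 1) * root
    prev, curr = 0.0, pmm  # P_{m-1}^m = 0 and P_m^m
    for l in range(m, n):
        # (l - m + 1) P_{l+1}^m = (2l + 1) x P_l^m - (l + m) P_{l-1}^m
        prev, curr = curr, ((2 * l + 1) * x * curr - (l + m) * prev) / (l - m + 1)
    return curr


if __name__ == "__main__":
    x = 0.5
    print(assoc_legendre(2, 1, x), 3 * x * math.sqrt(1 - x * x))
    print(assoc_legendre(4, 2, x), 7.5 * (7 * x * x - 1) * (1 - x * x))
```

Each printed pair compares the recurrence result with the closed form for $P_2^1$ and $P_4^2$. Upward recurrence in n at fixed m is a common choice because it is numerically stable on [-1, 1]; other recurrences in the set could be used instead.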
Parity
The parity relation satisfied by the associated Legendre functions may be
determined by examination of the defining equation (12.81). As $x \to -x$, we
already know that $P_n(x)$ contributes a $(-1)^n$. The m-fold differentiation yields a
factor of $(-1)^m$. Hence we have
$$P_n^m(-x) = (-1)^{n+m}P_n^m(x). \tag{12.90}$$
A glance at Table 12.3 verifies this for $1 \le m \le n \le 4$.
Also, from the definition in Eq. 12.81
$$P_n^m(\pm 1) = 0, \qquad \text{for } m \ne 0. \tag{12.91}$$
Orthogonality
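The orthogonality of the $P_n^m(x)$ follows from the differential equation just as
for $P_n(x)$ (Section 12.3); the term $-m^2/(1 - x^2)$ cancels out, assuming m is the
same in both cases. However, it is instructive to demonstrate the orthogonality
by another method, a method that will also provide the normalization constant.
Using the definition in Eq. 12.81 and Rodrigues' formula (Eq. 12.65) for $P_n(x)$,
we find
$$\int_{-1}^{1} P_p^m(x)P_q^m(x)\,dx = \frac{(-1)^m}{2^{p+q}\,p!\,q!}\int_{-1}^{1} X^m\,\frac{d^{p+m}X^p}{dx^{p+m}}\,\frac{d^{q+m}X^q}{dx^{q+m}}\,dx. \tag{12.92}$$
The function X is given by $X \equiv (x^2 - 1)$. If $p \ne q$, let us assume that p < q.
Notice that the superscript m is the same for both functions. This is an essential
condition. The technique is to integrate repeatedly by parts; all the integrated
parts will vanish as long as there is a factor $X = x^2 - 1$. Let us integrate q + m
times to obtain
$$\int_{-1}^{1} P_p^m(x)P_q^m(x)\,dx = \frac{(-1)^m(-1)^{q+m}}{2^{p+q}\,p!\,q!}\int_{-1}^{1} X^q\,\frac{d^{q+m}}{dx^{q+m}}\left(X^m\frac{d^{p+m}X^p}{dx^{p+m}}\right)dx. \tag{12.93}$$
The integrand on the right-hand side is now expanded by Leibnitz's formula to
give
$$X^q\frac{d^{q+m}}{dx^{q+m}}\left(X^m\frac{d^{p+m}X^p}{dx^{p+m}}\right) = X^q\sum_{i=0}^{q+m}\frac{(q+m)!}{i!\,(q+m-i)!}\,\frac{d^{q+m-i}X^m}{dx^{q+m-i}}\,\frac{d^{p+m+i}X^p}{dx^{p+m+i}}. \tag{12.94}$$
Since the term $X^m$ contains no power of x greater than $x^{2m}$, we must have
$$q + m - i \le 2m \tag{12.95}$$
or the derivative will vanish. Similarly,
$$p + m + i \le 2p. \tag{12.96}$$
In the solution of these equations for the index i the conditions for a nonzero
result are
$$i \ge q - m, \qquad i \le p - m. \tag{12.97}$$
If p < q, as assumed, there is no solution and the integral vanishes. The same
result obviously must follow if p > q.
For the remaining case, p = q, we may still have the single term
corresponding to i = q - m. Putting Eq. 12.94 into Eq. 12.93, we have
$$\int_{-1}^{1}[P_q^m(x)]^2\,dx = \frac{(-1)^{q+2m}(q+m)!}{2^{2q}\,q!\,q!\,(2m)!\,(q-m)!}\int_{-1}^{1} X^q\left(\frac{d^{2m}}{dx^{2m}}X^m\right)\left(\frac{d^{2q}}{dx^{2q}}X^q\right)dx. \tag{12.98}$$
Since
$$X^m = (x^2 - 1)^m = x^{2m} - mx^{2m-2} + \cdots, \tag{12.99}$$
$$\frac{d^{2m}}{dx^{2m}}X^m = (2m)!, \tag{12.100}$$
Eq. 12.98 reduces to
$$\int_{-1}^{1}[P_q^m(x)]^2\,dx = \frac{(-1)^{q+2m}(2q)!\,(q+m)!}{2^{2q}\,q!\,q!\,(q-m)!}\int_{-1}^{1} X^q\,dx. \tag{12.101}$$
The integral on the right is just
$$(-1)^q\int_0^{\pi}\sin^{2q+1}\theta\,d\theta = \frac{(-1)^q\,2^{2q+1}\,q!\,q!}{(2q+1)!} \tag{12.102}$$
(compare Exercise 10.4.9). Combining Eqs. 12.101 and 12.102, we have the
orthogonality integral
$$\int_{-1}^{1} P_p^m(x)P_q^m(x)\,dx = \frac{2}{2q+1}\cdot\frac{(q+m)!}{(q-m)!}\,\delta_{p,q} \tag{12.103}$$
or, in spherical polar coordinates,
$$\int_0^{\pi} P_p^m(\cos\theta)P_q^m(\cos\theta)\sin\theta\,d\theta = \frac{2}{2q+1}\cdot\frac{(q+m)!}{(q-m)!}\,\delta_{p,q}. \tag{12.104}$$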
The orthogonality of the Legendre polynomials is actually a special case of
this result, obtained by setting m equal to zero; that is, for m = 0, Eq. 12.103
reduces to Eqs. 12.43 and 12.48. In both Eqs. 12.103 and 12.104 our Sturm-
Liouville theory of Chapter 9 could provide the Kronecker delta. A special
calculation, such as the analysis here, is required for the normalization constant.
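The orthogonality integral and its normalization constant, $\int_{-1}^{1} P_p^m P_q^m\,dx = \frac{2}{2q+1}\frac{(q+m)!}{(q-m)!}\delta_{p,q}$, can be confirmed exactly for small indices. The following Python sketch is an independent check, not the derivation used in the text: it assumes the series form of $P_n(x)$ (Eq. 12.8), writes $P_p^m P_q^m = (1 - x^2)^m\,(d^m P_p/dx^m)(d^m P_q/dx^m)$, which is a polynomial when both functions share the same m, and integrates it term by term in rational arithmetic. All function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def legendre_coeffs(n):
    """Coefficients of P_n(x) (index = power of x) from the finite series."""
    c = [Fraction(0)] * (n + 1)
    for r in range(n // 2 + 1):
        c[n - 2 * r] = Fraction(
            (-1) ** r * factorial(2 * n - 2 * r),
            2**n * factorial(r) * factorial(n - r) * factorial(n - 2 * r),
        )
    return c


def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_diff(a, times=1):
    for _ in range(times):
        a = [k * a[k] for k in range(1, len(a))] or [Fraction(0)]
    return a


def integrate_symmetric(a):
    """Exact integral of the polynomial over [-1, 1]; odd powers drop out."""
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(a) if k % 2 == 0)


def overlap(p, q, m):
    """Exact value of the integral of P_p^m(x) P_q^m(x) over [-1, 1], 0 <= m <= p, q."""
    up = poly_diff(legendre_coeffs(p), m)
    uq = poly_diff(legendre_coeffs(q), m)
    weight = [Fraction(1)]
    for _ in range(m):
        weight = poly_mul(weight, [Fraction(1), Fraction(0), Fraction(-1)])  # (1 - x^2)
    return integrate_symmetric(poly_mul(poly_mul(up, uq), weight))


def normalization(q, m):
    """Right-hand side of the orthogonality integral for p = q."""
    return Fraction(2, 2 * q + 1) * Fraction(factorial(q + m), factorial(q - m))


if __name__ == "__main__":
    print(overlap(3, 3, 2), normalization(3, 2))
    print(overlap(2, 4, 2))
```

The first line prints two equal fractions; the second prints zero, illustrating orthogonality for unequal lower indices with a common m.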
The orthogonality of the associated Legendre functions over the same interval
and with the same weighting factor as the Legendre polynomials does not
contradict the uniqueness of the Gram-Schmidt construction of the Legendre
polynomials, Example 9.3.1. Table 12.3 suggests (and Section 12.4 verifies) that
$\int_{-1}^{1} P_p^m(x)P_q^m(x)\,dx$ may be written as
$$\int_{-1}^{1} \mathcal{P}_p^m(x)\,\mathcal{P}_q^m(x)\,(1 - x^2)^m\,dx.$$
Here
$$\mathcal{P}_p^m(x)\,(1 - x^2)^{m/2} = P_p^m(x).$$
The functions $\mathcal{P}_q^m(x)$ may be constructed by the Gram-Schmidt procedure with
the weighting function $w(x) = (1 - x^2)^m$.
It is possible to develop an orthogonality relation for associated Legendre
functions of the same lower index but different upper index. We find
$$\int_{-1}^{1} P_n^m(x)P_n^k(x)\,(1 - x^2)^{-1}\,dx = \frac{(n+m)!}{m\,(n-m)!}\,\delta_{m,k}. \tag{12.105}$$
Note that a new weighting factor, $(1 - x^2)^{-1}$, has been introduced. This form is
essentially a mathematical curiosity. In physical problems orthogonality of the
$\varphi$ dependence ties the two upper indices together and leads to Eq. 12.104.
EXAMPLE 12.5.1 Magnetic Induction Field of a Current Loop
Like the other differential equations of mathematical physics, the associated
Legendre equation is likely to pop up quite unexpectedly. As an illustration,
consider the magnetic induction field B and magnetic vector potential A created
by a single circular current loop in the equatorial plane (Fig. 12.13).
FIG. 12.13 Circular current loop
We know from electromagnetic theory that the contribution of current
element $I\,d\boldsymbol{\lambda}$ to the magnetic vector potential is
$$d\mathbf{A} = \frac{\mu_0}{4\pi}\,\frac{I\,d\boldsymbol{\lambda}}{r}. \tag{12.106}$$
(This follows from Exercise 1.14.4.) Equation 12.106, plus the symmetry of our
system, shows that A has only a $\boldsymbol{\varphi}_0$-component and that the component is
independent of $\varphi$,³
$$\mathbf{A} = \boldsymbol{\varphi}_0 A_\varphi(r, \theta). \tag{12.107}$$
3 Pair off corresponding current elements $I\,d\boldsymbol{\lambda}(\varphi_1)$ and $I\,d\boldsymbol{\lambda}(\varphi_2)$, where
$\varphi - \varphi_1 = \varphi_2 - \varphi$.
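By Maxwell's equations
$$\nabla\times\mathbf{H} = \mathbf{J} \qquad (\partial\mathbf{D}/\partial t = 0,\ \text{SI units}). \tag{12.108}$$
Since
$$\mu_0\mathbf{H} = \mathbf{B} = \nabla\times\mathbf{A}, \tag{12.109}$$
we have
$$\nabla\times(\nabla\times\mathbf{A}) = \mu_0\mathbf{J}, \tag{12.110}$$
where J is the current density. In our problem J is zero everywhere except in the
current loop. Therefore, away from the loop,
$$\nabla\times\nabla\times\boldsymbol{\varphi}_0 A_\varphi(r, \theta) = 0, \tag{12.111}$$
using Eq. 12.107.
From the expression for the curl in spherical polar coordinates (Section 2.5),
we obtain (Example 2.5.2)
$$\nabla\times\nabla\times\boldsymbol{\varphi}_0 A_\varphi(r, \theta) = \boldsymbol{\varphi}_0\left[-\frac{\partial^2 A_\varphi}{\partial r^2} - \frac{2}{r}\frac{\partial A_\varphi}{\partial r} - \frac{1}{r^2}\frac{\partial^2 A_\varphi}{\partial\theta^2} - \frac{1}{r^2}\frac{\partial}{\partial\theta}(\cot\theta\,A_\varphi)\right] = 0. \tag{12.112}$$
Letting $A_\varphi(r, \theta) = R(r)\Theta(\theta)$ and separating variables, we have
$$r^2\frac{d^2R}{dr^2} + 2r\frac{dR}{dr} - n(n+1)R = 0, \tag{12.113}$$
$$\frac{d^2\Theta}{d\theta^2} + \cot\theta\,\frac{d\Theta}{d\theta} + n(n+1)\Theta - \frac{\Theta}{\sin^2\theta} = 0. \tag{12.114}$$
The second equation is the associated Legendre equation (12.80) with m = 1, and
we may immediately write
$$\Theta(\theta) = P_n^1(\cos\theta). \tag{12.115}$$
The separation constant n(n + 1) was chosen to keep this solution well behaved.
By trial, letting $R(r) = r^\alpha$, we find that $\alpha = n$, $-n-1$. The first possibility is
discarded, for our solution must vanish as $r \to \infty$. Hence
$$A_{\varphi n} = c_n\left(\frac{a}{r}\right)^{n+1}P_n^1(\cos\theta) \tag{12.116}$$
and
$$A_\varphi(r, \theta) = \sum_{n=1}^{\infty} c_n\left(\frac{a}{r}\right)^{n+1}P_n^1(\cos\theta), \qquad (r > a). \tag{12.117}$$
Here a is the radius of the current loop.
Since $A_\varphi$ must be invariant to reflection in the equatorial plane by the
symmetry of our problem,
$$A_\varphi(r, \cos\theta) = A_\varphi(r, -\cos\theta), \tag{12.118}$$
the parity property of $P_n^m(\cos\theta)$ (Eq. 12.90) shows that $c_n = 0$ for n even.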
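To complete the evaluation of the constants, we may use Eq. 12.117 to
calculate $B_z$ along the z-axis [$B_z = B_r(r, \theta = 0)$] and compare with the expression
obtained from the Biot and Savart law. This is the same technique that is used
in Example 12.3.3. We have (compare Eq. 2.47)
$$B_r = \nabla\times\mathbf{A}\big|_r = \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}(\sin\theta\,A_\varphi) = \frac{\cot\theta}{r}A_\varphi + \frac{1}{r}\frac{\partial A_\varphi}{\partial\theta}. \tag{12.119}$$
Using
$$\frac{\partial P_n^1(\cos\theta)}{\partial\theta} = -\sin\theta\,\frac{dP_n^1(\cos\theta)}{d(\cos\theta)} = -\tfrac{1}{2}P_n^2 + \frac{n(n+1)}{2}P_n^0 \tag{12.120}$$
(Eq. 12.87) and then Eq. 12.84 with m = 1:
$$P_n^2(\cos\theta) - \frac{2\cos\theta}{\sin\theta}P_n^1(\cos\theta) + n(n+1)P_n(\cos\theta) = 0, \tag{12.121}$$
we obtain
$$B_r(r, \theta) = \sum_{n=1}^{\infty} c_n\,n(n+1)\,\frac{a^{n+1}}{r^{n+2}}\,P_n(\cos\theta), \qquad r > a \tag{12.122}$$
(for all $\theta$). In particular, for $\theta = 0$,
$$B_r(r, 0) = \sum_{n=1}^{\infty} c_n\,n(n+1)\,\frac{a^{n+1}}{r^{n+2}}. \tag{12.123}$$
We may also obtain
$$B_\theta(r, \theta) = -\frac{1}{r}\frac{\partial(rA_\varphi)}{\partial r} = \sum_{n=1}^{\infty} c_n\,n\,\frac{a^{n+1}}{r^{n+2}}\,P_n^1(\cos\theta), \qquad r > a. \tag{12.124}$$
The Biot and Savart law states that
$$d\mathbf{B} = \frac{\mu_0}{4\pi}\,I\,\frac{d\boldsymbol{\lambda}\times\mathbf{r}_0}{r^2} \qquad (\text{SI units}). \tag{12.125}$$
FIG. 12.14 Law of Biot and Savart applied to a circular loop
We now integrate over the perimeter of our loop (radius a). The geometry is
shown in Fig. 12.14. The resulting magnetic induction field is $\mathbf{k}B_z$, along the
z-axis, with
$$B_z = \frac{\mu_0 I}{2}\,a^2(a^2 + z^2)^{-3/2} = \frac{\mu_0 I}{2}\,\frac{a^2}{z^3}\left(1 + \frac{a^2}{z^2}\right)^{-3/2}. \tag{12.126}$$
Expanding by the binomial theorem, we obtain
$$B_z = \frac{\mu_0 I}{2}\,\frac{a^2}{z^3}\left[1 - \frac{3}{2}\left(\frac{a}{z}\right)^2 + \frac{15}{8}\left(\frac{a}{z}\right)^4 - \cdots\right]
= \frac{\mu_0 I}{2}\,\frac{a^2}{z^3}\sum_{s=0}^{\infty}(-1)^s\frac{(2s+1)!!}{(2s)!!}\left(\frac{a}{z}\right)^{2s}, \qquad z > a. \tag{12.127}$$
Equating Eqs. 12.123 and 12.127 term by term (with r = z),⁴ we find
$$c_1 = \frac{\mu_0 I}{4}, \qquad c_3 = -\frac{\mu_0 I}{16}, \qquad c_2 = c_4 = \cdots = 0,$$
$$c_n = (-1)^{(n-1)/2}\,\frac{\mu_0 I}{2n(n+1)}\cdot\frac{n!!}{(n-1)!!}, \qquad n \text{ odd}. \tag{12.128}$$
4 The descending power series is also unique.
Equivalently, we may write
$$c_{2n+1} = (-1)^n\frac{\mu_0 I}{2^{2n+2}}\cdot\frac{(2n)!}{n!\,(n+1)!} = (-1)^n\frac{\mu_0 I}{2}\cdot\frac{(2n-1)!!}{(2n+2)!!} \tag{12.129}$$
and
$$A_\varphi(r, \theta) = \left(\frac{a}{r}\right)^2\sum_{n=0}^{\infty} c_{2n+1}\left(\frac{a}{r}\right)^{2n}P_{2n+1}^1(\cos\theta), \tag{12.130}$$
$$B_r(r, \theta) = \frac{a^2}{r^3}\sum_{n=0}^{\infty} c_{2n+1}(2n+1)(2n+2)\left(\frac{a}{r}\right)^{2n}P_{2n+1}(\cos\theta), \tag{12.131}$$
$$B_\theta(r, \theta) = \frac{a^2}{r^3}\sum_{n=0}^{\infty} c_{2n+1}(2n+1)\left(\frac{a}{r}\right)^{2n}P_{2n+1}^1(\cos\theta). \tag{12.132}$$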
These fields may be described in closed form by the use of elliptic integrals.
Exercise 5.8.4 is an illustration of this approach. A third possibility is direct
integration of Eq. 12.106 by expanding the factor 1/r as a Legendre polynomial
generating function. The current is specified by Dirac delta functions. These
methods have the advantage of yielding the constants cn directly.
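A comparison of magnetic current loop dipole fields and finite electric dipole
fields may be of interest. For the magnetic current loop dipole the preceding
analysis gives
$$B_r(r, \theta) = \frac{\mu_0 I a^2}{2r^3}\left[P_1 - \frac{3}{2}\left(\frac{a}{r}\right)^2 P_3 + \cdots\right], \tag{12.133}$$
$$B_\theta(r, \theta) = \frac{\mu_0 I a^2}{4r^3}\left[P_1^1 - \frac{3}{4}\left(\frac{a}{r}\right)^2 P_3^1 + \cdots\right]. \tag{12.134}$$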
From the finite electric dipole potential of Section 12.1 we have
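$$E_r(r, \theta) = \frac{qa}{\pi\varepsilon_0 r^3}\left[P_1 + 2\left(\frac{a}{r}\right)^2 P_3 + \cdots\right], \tag{12.135}$$
$$E_\theta(r, \theta) = \frac{qa}{2\pi\varepsilon_0 r^3}\left[P_1^1 + \left(\frac{a}{r}\right)^2 P_3^1 + \cdots\right]. \tag{12.136}$$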
The two fields agree in form as far as the leading term is concerned ($r^{-3}P_1$), and
this is the basis for calling them both dipole fields.
FIG. 12.15 Electric dipole
As with electric multipoles, it is sometimes convenient to discuss point
magnetic multipoles. For the dipole case, Eqs. 12.133 and 12.134, the point dipole is
formed by taking the limit $a \to 0$, $I \to \infty$ with $Ia^2$ held constant. With n a unit
vector normal to the current loop (positive sense by right-hand rule, Section
1.10) the magnetic moment m is given by $\mathbf{m} = \mathbf{n}\,I\pi a^2$.
EXERCISES
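12.5.1 Prove that
$$P_n^{-m}(x) = (-1)^m\frac{(n-m)!}{(n+m)!}\,P_n^m(x),$$
where $P_n^m(x)$ is defined by
$$P_n^m(x) = \frac{1}{2^n n!}(1 - x^2)^{m/2}\frac{d^{n+m}}{dx^{n+m}}(x^2 - 1)^n.$$
Hint. One approach is to apply Leibnitz's formula to $(x + 1)^n(x - 1)^n$.
12.5.2 Show that
$$P_{2n}^1(0) = 0,$$
$$P_{2n+1}^1(0) = (-1)^n\frac{(2n+1)!}{(2^n n!)^2} = (-1)^n\frac{(2n+1)!!}{(2n)!!}$$
by each of the three methods:
(a) use of recurrence relations,
(b) expansion of the generating function,
(c) Rodrigues' formula.
12.5.3 Evaluate $P_n^m(0)$.
ANS.
$$P_n^m(0) = \begin{cases}(-1)^{(n-m)/2}\,\dfrac{(n+m)!}{2^n\left(\frac{n-m}{2}\right)!\left(\frac{n+m}{2}\right)!}, & n + m \text{ even},\\[1ex] 0, & n + m \text{ odd}.\end{cases}$$
Also,
$$P_n^m(0) = (-1)^{(n-m)/2}\,\frac{(n+m-1)!!}{(n-m)!!}, \qquad n + m \text{ even}.$$
12.5.4 Show that
$$P_n^n(\cos\theta) = (2n-1)!!\,\sin^n\theta, \qquad n = 0, 1, 2, \ldots.$$
12.5.5 Derive the associated Legendre recurrence relation
$$P_n^{m+1}(x) - \frac{2mx}{(1 - x^2)^{1/2}}P_n^m(x) + [n(n+1) - m(m-1)]P_n^{m-1}(x) = 0.$$
12.5.6 Develop a recurrence relation that will yield $P_n^1(x)$ as
$$P_n^1(x) = f_1(x, n)P_n(x) + f_2(x, n)P_{n-1}(x).$$
Follow either (a) or (b).
(a) Derive a recurrence relation of the preceding form. Give $f_1(x, n)$ and $f_2(x, n)$
explicitly.
(b) Find the sought for recurrence relation in print.
(1) Give the source.
(2) Verify the recurrence relation.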
12.5.7 Show that
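12.5.8 Show that
(a) $$\int_0^{\pi}\left(\frac{dP_n^m}{d\theta}\frac{dP_{n'}^m}{d\theta} + \frac{m^2 P_n^m P_{n'}^m}{\sin^2\theta}\right)\sin\theta\,d\theta = \frac{2n(n+1)}{2n+1}\,\frac{(n+m)!}{(n-m)!}\,\delta_{nn'},$$
(b) $$\int_0^{\pi}\left(\frac{P_n^1}{\sin\theta}\frac{dP_{n'}^1}{d\theta} + \frac{P_{n'}^1}{\sin\theta}\frac{dP_n^1}{d\theta}\right)\sin\theta\,d\theta = 0.$$
These integrals occur in the theory of scattering of electromagnetic waves by
spheres.
12.5.9 As a repeat of Exercise 12.3.6, show, using associated Legendre functions, that
$$\int_{-1}^{1} x(1 - x^2)P_n'(x)P_m'(x)\,dx = \frac{n+1}{2n+1}\cdot\frac{2}{2n-1}\cdot\frac{n!}{(n-2)!}\,\delta_{m,n-1} + \frac{n}{2n+1}\cdot\frac{2}{2n+3}\cdot\frac{(n+2)!}{n!}\,\delta_{m,n+1}.$$
12.5.10 Evaluate
$$\int_0^{\pi}\sin^2\theta\,P_n^1(\cos\theta)\,d\theta.$$
12.5.11 The associated Legendre polynomial $P_n^m(x)$ satisfies the self-adjoint differential
equation
$$(1 - x^2)P_n^{m\prime\prime}(x) - 2xP_n^{m\prime}(x) + \left[n(n+1) - \frac{m^2}{1 - x^2}\right]P_n^m(x) = 0.$$
From the differential equations for $P_n^m(x)$ and $P_n^k(x)$ show that
$$\int_{-1}^{1} P_n^m(x)P_n^k(x)\,\frac{dx}{1 - x^2} = 0$$
for $k \ne m$.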
12.5.12 Determine the vector potential of a magnetic quadrupole by differentiating
the magnetic dipole potential.
ANS. AMQ = ^(Ia2)(dz)q>0 2^ ; + higher-order terms.
= HQ{Ia2){dz)\- -■"—> . a -2 (cos0I
J 3P2(C
)\ ro-^i
This corresponds to placing a current loop of radius a at z = dz, an oppositely
directed current loop at z = -dz, and letting $a \to 0$ subject to (dz) × (dipole
strength) equal constant.
Another approach to this problem would be to integrate dA (Eq. 12.106),
to expand the denominator in a series of Legendre polynomials, and to use the
Legendre polynomial addition theorem (Section 12.8).
12.5.13 A single loop of wire of radius a carries a current I.
(a) Find the magnetic induction B for r < a.
(b) Calculate the integral of the magnetic flux (B • da) over the area of the
current loop, that is,
ANS. x.
The earth is within such a ring current in which I approximates millions
of amperes arising from the drift of charged particles in the Van Allen belt.
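12.5.14 (a) Show that in the point dipole limit the magnetic induction field of the
current loop becomes
$$B_r(r, \theta) = \frac{\mu_0}{2\pi}\,\frac{m}{r^3}\,P_1(\cos\theta),$$
$$B_\theta(r, \theta) = \frac{\mu_0}{4\pi}\,\frac{m}{r^3}\,P_1^1(\cos\theta),$$
with $m = I\pi a^2$.
(b) Compare these results with the magnetic induction of the point magnetic
dipole of Exercise 1.8.17. Take $\mathbf{m} = \mathbf{k}m$.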
12.5.15 A uniformly charged spherical shell is rotating with constant angular velocity.
(a) Calculate the magnetic induction B along the axis of rotation outside the
sphere.
(b) Using the vector potential series of Section 12.5, find A and then B for
all space outside the sphere.
12.5.16 In the liquid drop model of the nucleus the spherical nucleus is subjected to
small deformations. Consider a sphere of radius $r_0$ that is deformed so that
its new surface is given by
$$r = r_0[1 + \alpha_2 P_2(\cos\theta)].$$
Find the area of the deformed sphere through terms of order $\alpha_2^2$.
Hint.
$$dA = \left[r^2 + \left(\frac{dr}{d\theta}\right)^2\right]^{1/2} r\sin\theta\,d\theta\,d\varphi.$$
ANS. $A = 4\pi r_0^2\left[1 + \tfrac{4}{5}\alpha_2^2 + O(\alpha_2^3)\right]$.
Note. The area element dA follows from noting that the line element ds for
fixed $\varphi$ is given by
$$ds = (r^2\,d\theta^2 + dr^2)^{1/2} = \left(r^2 + (dr/d\theta)^2\right)^{1/2} d\theta.$$
12.5.17 A nuclear particle is in a potential $V(r, \theta, \varphi) = 0$ for $0 \le r < a$ and $\infty$ for r > a.
The particle is described by a wave function $\psi(r, \theta, \varphi)$ which satisfies the wave
equation
$$-\frac{\hbar^2}{2M}\nabla^2\psi = E\psi$$
and the boundary condition
$$\psi(r = a) = 0.$$
Show that for the energy E to be a minimum there must be no angular
dependence in the wave function; that is, $\psi = \psi(r)$.
Hint. The problem centers on the boundary condition on the radial function.
12.5.18 (a) Write a subroutine to calculate the numerical value of the associated
Legendre function $P_N^1(x)$ for given values of N and x.
Hint. With the known forms of $P_1^1$ and $P_2^1$ you can use the recurrence
relation Eq. 12.85 to generate $P_N^1$, N > 2.
(b) Check your subroutine by having it calculate $P_N^1(x)$ for x = 0.0(0.5)1.0
and N = 1(1)10. Check these numerical values against the known values
of $P_N^1(0)$ and $P_N^1(1)$ and against the tabulated values of $P_N^1(0.5)$.
12.5.19 Calculate the magnetic vector potential of a current loop, Example 12.5.1.
Tabulate your results for r/a = 1.5(0.5)5.0 and $\theta$ = 0°(15°)90°. Include terms
in the series expansion, Eq. 12.130, until the absolute values of the terms drop
below the leading term by a factor of $10^5$ or more.
Note. This associated Legendre expansion can be checked by comparison with
the elliptic integral solution, Exercise 5.8.4.
Check value. For r/a = 4.0 and $\theta$ = 20°,
$A_\varphi/\mu_0 I = 4.9398 \times 10^{-3}$.
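12.6 SPHERICAL HARMONICS
In the separation of variables of (1) Laplace's equation, (2) Helmholtz's or
the space-dependence of the classical wave equation, and (3) the Schrödinger
wave equation for central force fields,
$$\nabla^2\psi + k^2 f(r)\psi = 0, \tag{12.137}$$
the angular dependence, coming entirely from the Laplacian operator, is¹
$$\frac{\Phi(\varphi)}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{\Theta(\theta)}{\sin^2\theta}\frac{d^2\Phi(\varphi)}{d\varphi^2} + n(n+1)\Theta(\theta)\Phi(\varphi) = 0. \tag{12.138}$$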
Azimuthal Dependence—Orthogonality
The separated azimuthal equation is
$$\frac{1}{\Phi(\varphi)}\frac{d^2\Phi(\varphi)}{d\varphi^2} = -m^2, \tag{12.139}$$
with solutions
$$\Phi(\varphi) = e^{-im\varphi},\ e^{im\varphi}, \tag{12.140}$$
which readily satisfy the orthogonal condition
$$\int_0^{2\pi} e^{-im_1\varphi}\,e^{im_2\varphi}\,d\varphi = 2\pi\delta_{m_1, m_2}. \tag{12.141}$$
Notice that it is the product $\Phi_{m_1}^*(\varphi)\Phi_{m_2}(\varphi)$ that is taken and that * is used to
indicate the complex conjugate function. This choice is not required, but it is
convenient for quantum mechanical calculations. We could have used
$$\Phi = \sin m\varphi,\ \cos m\varphi \tag{12.142}$$
and the conditions of orthogonality that form the basis for Fourier series
(Chapter 14). For applications such as describing the earth's gravitational or
magnetic field $\sin m\varphi$ and $\cos m\varphi$ would be the preferred choice (see Example
12.6.1).
In electrostatics and most other physical problems we require m to be an
integer in order that $\Phi(\varphi)$ may be a single-valued function of the azimuth angle.
In quantum mechanics the question is much more involved because the
observable quantity that must be single-valued is the square of the magnitude of the
wave function, $\Psi^*\Psi$. However, it can be shown that we must still have m integral.
Compare footnote in Section 8.3.
By means of Eq. 12.141,
$$\Phi_m(\varphi) = \frac{1}{\sqrt{2\pi}}\,e^{im\varphi} \tag{12.143}$$
is orthonormal (orthogonal and normalized) with respect to integration over
the azimuth angle $\varphi$.
1 For a separation constant of the form n(n + 1) with n an integer, a Legendre
equation series solution becomes a polynomial. Otherwise both series
solutions diverge, Exercise 8.5.5.
Polar Angle Dependence
Splitting off the azimuthal dependence, the polar angle dependence ($\theta$) leads
to the associated Legendre equation (12.80), which is satisfied by the associated
Legendre functions; that is, $\Theta(\theta) = P_n^m(\cos\theta)$. To include negative values of m,
we use Rodrigues' formula, Eq. 12.65, in the definition of $P_n^m(\cos\theta)$. This leads
to
$$P_n^m(\cos\theta) = \frac{1}{2^n n!}(1 - x^2)^{m/2}\frac{d^{m+n}}{dx^{m+n}}(x^2 - 1)^n, \qquad -n \le m \le n. \tag{12.144}$$
$P_n^m(\cos\theta)$ and $P_n^{-m}(\cos\theta)$ are related as indicated in Exercise 12.5.1. An advantage
of this approach over simply defining $P_n^m(\cos\theta)$ for $0 \le m \le n$ and requiring
that $P_n^{-m} = P_n^m$ is that the recurrence relations valid for $0 \le m \le n$ remain valid
for $-n \le m < 0$.
Normalizing the associated Legendre function by Eq. 12.103, we obtain the
orthonormal function
$$\mathcal{P}_n^m(\cos\theta) = \sqrt{\frac{2n+1}{2}\,\frac{(n-m)!}{(n+m)!}}\;P_n^m(\cos\theta), \qquad -n \le m \le n. \tag{12.145}$$
Spherical Harmonics
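The function $\Phi_m(\varphi)$ (Eq. 12.143) is orthonormal with respect to the azimuthal
angle $\varphi$, whereas the function $\mathcal{P}_n^m(\cos\theta)$ (Eq. 12.145) is orthonormal with respect
to the polar angle $\theta$. We take the product of the two and define
$$Y_n^m(\theta, \varphi) \equiv (-1)^m\sqrt{\frac{2n+1}{4\pi}\,\frac{(n-m)!}{(n+m)!}}\;P_n^m(\cos\theta)\,e^{im\varphi} \tag{12.146}$$
to obtain functions of two angles (and two indices) which are orthonormal over
the spherical surface. These $Y_n^m(\theta, \varphi)$ are spherical harmonics. The complete
orthogonality integral becomes
$$\int_{\varphi=0}^{2\pi}\int_{\theta=0}^{\pi} Y_{n_1}^{m_1*}(\theta, \varphi)\,Y_{n_2}^{m_2}(\theta, \varphi)\sin\theta\,d\theta\,d\varphi = \delta_{n_1, n_2}\,\delta_{m_1, m_2}. \tag{12.147}$$
The extra $(-1)^m$ included in the defining equation of $Y_n^m(\theta, \varphi)$ deserves some
comment. It is clearly legitimate, since Eq. 12.137 is linear and homogeneous.
It is not necessary, but in moving on to certain quantum mechanical calculations,
particularly in the quantum theory of angular momentum (Section 12.7), it is
most convenient. The factor $(-1)^m$ is a phase factor, often called the
Condon-Shortley phase, after the authors of a classic text on atomic spectroscopy. The
effect of this $(-1)^m$ (Eq. 12.146) and the $(-1)^m$ of Eq. 12.81a for $P_n^{-m}(\cos\theta)$
is to introduce an alternation of sign among the positive m spherical harmonics.
This is shown in Table 12.4.
TABLE 12.4 Spherical Harmonics (Condon-Shortley Phase)
$Y_0^0(\theta, \varphi) = \dfrac{1}{\sqrt{4\pi}}$
$Y_1^1(\theta, \varphi) = -\sqrt{\dfrac{3}{8\pi}}\,\sin\theta\,e^{i\varphi}$
$Y_1^0(\theta, \varphi) = \sqrt{\dfrac{3}{4\pi}}\,\cos\theta$
$Y_1^{-1}(\theta, \varphi) = +\sqrt{\dfrac{3}{8\pi}}\,\sin\theta\,e^{-i\varphi}$
$Y_2^2(\theta, \varphi) = \sqrt{\dfrac{5}{96\pi}}\,3\sin^2\theta\,e^{2i\varphi}$
$Y_2^1(\theta, \varphi) = -\sqrt{\dfrac{5}{24\pi}}\,3\sin\theta\cos\theta\,e^{i\varphi}$
$Y_2^0(\theta, \varphi) = \sqrt{\dfrac{5}{4\pi}}\left(\tfrac{3}{2}\cos^2\theta - \tfrac{1}{2}\right)$
$Y_2^{-1}(\theta, \varphi) = +\sqrt{\dfrac{5}{24\pi}}\,3\sin\theta\cos\theta\,e^{-i\varphi}$
$Y_2^{-2}(\theta, \varphi) = \sqrt{\dfrac{5}{96\pi}}\,3\sin^2\theta\,e^{-2i\varphi}$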
The functions $Y_n^m(\theta, \varphi)$ acquired the name "spherical harmonics" first because
they are defined over the surface of a sphere with $\theta$ the polar angle and $\varphi$ the
azimuth. The "harmonic" was included because solutions of Laplace's equation
were called harmonic functions and $Y_n^m(\theta, \varphi)$ is the angular part of such a
solution.
In the framework of quantum mechanics Eq. 12.138 becomes an orbital
angular momentum equation and the solution $Y_L^M(\theta, \varphi)$ (n replaced by L, m
by M) is an angular momentum eigenfunction, L being the angular momentum
quantum number and M the z-axis projection of L. These relationships are
developed in detail in Section 12.7.
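The definition of the spherical harmonics with the Condon-Shortley phase, $Y_n^m = (-1)^m\sqrt{\frac{2n+1}{4\pi}\frac{(n-m)!}{(n+m)!}}\,P_n^m(\cos\theta)\,e^{im\varphi}$, translates directly into a few lines of code. The Python below is a minimal sketch, not a library-grade routine: the names `assoc_legendre` and `sph_harm_cs` are hypothetical, $P_n^m$ for $m \ge 0$ is generated by upward recurrence in n from $P_m^m = (2m-1)!!\,(1 - x^2)^{m/2}$, and negative m is handled through $P_n^{-m} = (-1)^m\frac{(n-m)!}{(n+m)!}P_n^m$.

```python
import cmath
import math


def assoc_legendre(n, m, x):
    """P_n^m(x) for 0 <= m <= n, |x| <= 1, without any (-1)^m factor."""
    pmm = 1.0
    root = math.sqrt(1.0 - x * x)
    for k in range(1, m + 1):
        pmm *= (2 * k - 1) * root
    prev, curr = 0.0, pmm
    for l in range(m, n):
        prev, curr = curr, ((2 * l + 1) * x * curr - (l + m) * prev) / (l - m + 1)
    return curr


def sph_harm_cs(n, m, theta, phi):
    """Y_n^m(theta, phi) with the Condon-Shortley phase, -n <= m <= n."""
    am = abs(m)
    p = assoc_legendre(n, am, math.cos(theta))
    if m < 0:
        # P_n^{-|m|} = (-1)^|m| (n - |m|)! / (n + |m|)! P_n^{|m|}
        p *= (-1) ** am * math.factorial(n - am) / math.factorial(n + am)
    norm = math.sqrt(
        (2 * n + 1) / (4.0 * math.pi) * math.factorial(n - m) / math.factorial(n + m)
    )
    # (-1)^m and (-1)^|m| are equal; the latter avoids a float power for m < 0.
    return (-1) ** am * norm * p * cmath.exp(1j * m * phi)


if __name__ == "__main__":
    theta, phi = 0.7, 1.1
    print(sph_harm_cs(1, 1, theta, phi))
    print(-math.sqrt(3 / (8 * math.pi)) * math.sin(theta) * cmath.exp(1j * phi))
```

The two printed values agree, reproducing the negative sign of the m = +1 entry in the table of low-order harmonics. Note that some libraries attach the $(-1)^m$ to the associated Legendre function instead of to $Y_n^m$, so conventions should be checked before comparing numbers.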
Laplace Series, Fundamental Expansion
Theorem
Part of the importance of spherical harmonics lies in the completeness
property, a consequence of the Sturm-Liouville form of Laplace's equation.
This property, in this case, means that any function $f(\theta, \varphi)$ (with sufficient
continuity properties) evaluated over the surface of the sphere can be expanded
in a uniformly convergent double series of spherical harmonics² (Laplace's
series),
$$f(\theta, \varphi) = \sum_{m,n} a_{mn}\,Y_n^m(\theta, \varphi). \tag{12.148}$$
If $f(\theta, \varphi)$ is known, the coefficients can be immediately found by the use of the
orthogonality integral. Within the framework of the theory of linear vector
spaces, the completeness of the spherical harmonics follows from Weierstrass's
theorem.
EXAMPLE 12.6.1 Laplace Series—Gravity Fields
The gravity fields of the earth, moon, and Mars have been described by a
Laplace series with real eigenfunctions:
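$$U(r, \theta, \varphi) = \frac{GM}{R}\left[\frac{R}{r} - \sum_{n=2}^{\infty}\sum_{m=0}^{n}\left(\frac{R}{r}\right)^{n+1}\left\{C_{nm}Y_{mn}^e(\theta, \varphi) + S_{nm}Y_{mn}^o(\theta, \varphi)\right\}\right]. \tag{12.148a}$$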
Here M is the mass of the body, R the equatorial radius. The real functions
$Y_{mn}^e$ and $Y_{mn}^o$ are defined by
$$Y_{mn}^e(\theta, \varphi) = P_n^m(\cos\theta)\cos m\varphi,$$
$$Y_{mn}^o(\theta, \varphi) = P_n^m(\cos\theta)\sin m\varphi.$$
For applications such as this the real trigonometric forms are preferred to the
imaginary exponential form of $Y_L^M(\theta, \varphi)$. Satellite measurements have led to
the numerical values shown in Table 12.5.
TABLE 12.5 Gravity Field Coefficients, Eq. 12.148a
          Earth             Moon                        Mars
C20       1.083 × 10^-3     (0.200 ± 0.002) × 10^-3     (1.96 ± 0.01) × 10^-3
C22       0.16 × 10^-5      (2.4 ± 0.5) × 10^-5         (-5 ± 1) × 10^-5
S22       -0.09 × 10^-5     (0.5 ± 0.6) × 10^-5         (3 ± 1) × 10^-5
C20 represents an equatorial bulge, whereas C22 and S22 represent an azimuthal
dependence of the gravitational field.
2 For a proof of this fundamental theorem see E. W. Hobson, The Theory of
Spherical and Ellipsoidal Harmonics. New York: Chelsea (1955), Chapter VII.
If $f(\theta, \varphi)$ is discontinuous we may still have convergence in the mean, Section
9.4.
EXERCISES
12.6.1 Show that the parity of $Y_L^M(\theta, \varphi)$ is $(-1)^L$. Note the disappearance of any M
dependence.
Hint. For the parity operation in spherical polar coordinates see Section 2.5
and a footnote in Section 12.2.
12.6.2 Prove that
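12.6.3 In the theory of Coulomb excitation of nuclei we encounter $Y_L^M(\pi/2, 0)$. Show
that
$$Y_L^M\left(\frac{\pi}{2}, 0\right) = \left(\frac{2L+1}{4\pi}\right)^{1/2}\frac{[(L-M)!\,(L+M)!]^{1/2}}{(L-M)!!\,(L+M)!!}\,(-1)^{(L+M)/2} \quad \text{for } L + M \text{ even},$$
$$Y_L^M\left(\frac{\pi}{2}, 0\right) = 0 \quad \text{for } L + M \text{ odd}.$$
Here $(2n)!! = 2n(2n-2)\cdots 6\cdot 4\cdot 2$,
$(2n+1)!! = (2n+1)(2n-1)\cdots 5\cdot 3\cdot 1$.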
12.6.4 (a) Express the elements of the quadrupole moment tensor $x_i x_j$ as a linear
combination of the spherical harmonics $Y_2^M$ (and $Y_0^0$).
Note. The tensor $x_i x_j$ is reducible. The $Y_0^0$ indicates the presence of a
scalar component.
(b) The quadrupole moment tensor is usually defined as
$$Q_{ij} = \int (3x_i x_j - r^2\delta_{ij})\,\rho(\mathbf{r})\,d\tau,$$
with $\rho(\mathbf{r})$ the charge density. Express the components of $(3x_i x_j - r^2\delta_{ij})$ in
terms of $r^2 Y_2^M$.
(c) What is the significance of the $-r^2\delta_{ij}$ term?
Hint. Compare Section 3.4.
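12.6.5 The orthogonal azimuthal functions yield a useful representation of the Dirac
delta function. Show that
$$\delta(\varphi_1 - \varphi_2) = \frac{1}{2\pi}\sum_{m=-\infty}^{\infty}\exp[im(\varphi_1 - \varphi_2)].$$
12.6.6 Derive the spherical harmonic closure relation
$$\sum_{l=0}^{\infty}\sum_{m=-l}^{+l} Y_l^m(\theta_1,\varphi_1)\,Y_l^{m*}(\theta_2,\varphi_2) = \frac{1}{\sin\theta_1}\,\delta(\theta_1 - \theta_2)\,\delta(\varphi_1 - \varphi_2) = \delta(\cos\theta_1 - \cos\theta_2)\,\delta(\varphi_1 - \varphi_2).$$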
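12.6.7 The quantum mechanical angular momentum operators $L_x \pm iL_y$ are given by
$$L_x + iL_y = e^{i\varphi}\left(\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\varphi}\right),$$
$$L_x - iL_y = -e^{-i\varphi}\left(\frac{\partial}{\partial\theta} - i\cot\theta\,\frac{\partial}{\partial\varphi}\right).$$
Show that
(a) $(L_x + iL_y)\,Y_L^M(\theta,\varphi) = +\sqrt{(L-M)(L+M+1)}\;Y_L^{M+1}(\theta,\varphi)$,
(b) $(L_x - iL_y)\,Y_L^M(\theta,\varphi) = +\sqrt{(L+M)(L-M+1)}\;Y_L^{M-1}(\theta,\varphi)$.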
12.6.8 With L± given by
show that
(a) ГГ^Ш^ИЛ-П
(b) y,m-
12.6.9 In some circumstances it is desirable to replace the imaginary exponential of
our spherical harmonic by sine or cosine. Morse and Feshbach define
$$Y_{mn}^{e} = P_n^m(\cos\theta)\cos m\varphi, \qquad Y_{mn}^{o} = P_n^m(\cos\theta)\sin m\varphi,$$
where
$$\int_0^{2\pi}\!\!\int_0^{\pi}\left[Y_{mn}^{e\ \text{or}\ o}(\theta,\varphi)\right]^2\sin\theta\,d\theta\,d\varphi = \frac{4\pi}{2(2n+1)}\frac{(n+m)!}{(n-m)!} \quad \text{for } n = 1, 2, 3, \ldots$$
$$= 4\pi \quad \text{for } n = 0 \ (Y_{00}^{o} \text{ does not exist}).$$
These spherical harmonics are often named according to the patterns of their
positive and negative regions on the surface of a sphere: zonal harmonics for
$m = 0$, sectoral harmonics for $m = n$, and tesseral harmonics for $0 < m < n$.
For $Y_{mn}^{e}$, $n = 4$, $m = 0, 2, 4$, indicate on a diagram of a hemisphere (one diagram
for each spherical harmonic) the regions in which the spherical harmonic is
positive.
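12.6.10 A function $f(r, \theta, \varphi)$ may be expressed as a Laplace series
$$f(r,\theta,\varphi) = \sum_{l,m} a_{lm}\,r^l\,Y_l^m(\theta,\varphi).$$
With $\langle\ \rangle_{\text{sphere}}$ used to mean the average over a sphere (centered on the origin),
show that
$$\langle f(r,\theta,\varphi)\rangle_{\text{sphere}} = f(0,0,0).$$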
12.7 ANGULAR MOMENTUM AND LADDER OPERATORS
Orbital Angular Momentum
The classical concept of angular momentum $\mathbf{L}_{\text{classical}} = \mathbf{r} \times \mathbf{p}$ is presented in
Section 1.4 to introduce the cross product. Following the usual Schrödinger
representation of quantum mechanics, the classical linear momentum $\mathbf{p}$ is
replaced by the operator $-i\nabla$. The quantum mechanical angular momentum
operator becomes¹
$$\mathbf{L}_{QM} = -i\,\mathbf{r} \times \nabla. \tag{12.149}$$
¹ For simplicity, the $\hbar$ is dropped. This means that the angular momentum
is measured in units of $\hbar$.
This is used repeatedly in Sections 1.8, 1.9, and 2.4 to illustrate vector differential
operators. From Exercise 1.8.6 the angular momentum components satisfy a
commutation relation
$$[L_i, L_j] = i\varepsilon_{ijk}L_k. \tag{12.150}$$
The $\varepsilon_{ijk}$ is the Levi-Civita symbol of Section 3.4. A summation over the index $k$
is understood.
From Exercises 2.5.12 and 2.5.13 we find
$$L_z = -i\frac{\partial}{\partial\varphi} \tag{12.151}$$
in spherical polar coordinates. Hence
$$L_z Y_L^M(\theta, \varphi) = M\,Y_L^M(\theta, \varphi). \tag{12.152}$$
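The differential operator corresponding to the square of the angular momentum
$$\mathbf{L}^2 = \mathbf{L}\cdot\mathbf{L} = L_x^2 + L_y^2 + L_z^2 \tag{12.153}$$
may be determined from
$$\mathbf{L}\cdot\mathbf{L} = -(\mathbf{r}\times\nabla)\cdot(\mathbf{r}\times\nabla), \tag{12.154}$$
which is the subject of Exercises 1.9.9 and 2.5.17(b). From these we find that
$\mathbf{L}\cdot\mathbf{L}$ operating on a spherical harmonic yields²
$$\mathbf{L}\cdot\mathbf{L}\,Y_L^M(\theta,\varphi) = -\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right]Y_L^M(\theta,\varphi), \tag{12.155}$$
or
$$\mathbf{L}\cdot\mathbf{L}\,Y_L^M(\theta,\varphi) = L(L+1)\,Y_L^M(\theta,\varphi). \tag{12.156}$$
This is Exercise 8.3.1.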
Equation 12.150 presents the basic commutation relations of the components
of the quantum mechanical angular momentum. Indeed, within the framework
of quantum mechanics, these commutation relations define an angular
momentum operator. From Eq. 12.152 our spherical harmonic $Y_L^M(\theta,\varphi)$ is an
eigenfunction of $L_z$ with eigenvalue $M$. Finally, from Eq. 12.156, $Y_L^M(\theta,\varphi)$ is also an
eigenfunction of $\mathbf{L}^2$ with eigenvalue $L(L + 1)$.
General Operator Approach
Apart from the replacement of $\mathbf{p}$ by $-i\nabla$, the analysis so far has been in terms
of classical mathematics. Let us start anew with a more typical quantum
mechanical analysis.
2 In addition to these eigenvalue equations, the relation of L to rotations of
coordinate systems and to rotations of functions is examined in Sections 4.10
to 4.12.
1. We assume an Hermitian operator $\mathbf{J}$ whose components satisfy the
commutation relations
$$[J_i, J_j] = i\varepsilon_{ijk}J_k. \tag{12.157}$$
Otherwise $\mathbf{J}$ is arbitrary.
2. We assume that $\psi_{JM}$ is simultaneously a normalized
eigenfunction (or eigenvector) of $J_z$ with eigenvalue
$M$ and an eigenfunction of $J^2$ with eigenvalue
$J(J + 1)$:³
$$J_z\psi_{JM} = M\psi_{JM}, \tag{12.158}$$
$$J^2\psi_{JM} = J(J+1)\psi_{JM}. \tag{12.159}$$
Otherwise $\psi_{JM}$ is assumed unknown.
Let us see what general conclusions we can develop. Then we shall let our
general operators $J_x$, $J_y$, and $J_z$ become the specific orbital angular momentum
operators $L_x$, $L_y$, and $L_z$. $\psi_{JM}$ will then become a function of the spherical polar
coordinate angles $\theta$ and $\varphi$. We derive its form, in terms of Legendre
polynomials and differential operators, and identify it with the spherical harmonic
$Y_L^M(\theta, \varphi)$. This will illustrate the generality and power of operator techniques,
particularly the use of ladder operators.⁴ It will also make clear the basis of the
Condon-Shortley phase factor, the association of the $(-1)^M$ with the positive
$M$ spherical harmonics.
The ladder operators are defined as
$$J_+ = J_x + iJ_y, \qquad J_- = J_x - iJ_y. \tag{12.160}$$
In terms of these operators $J^2$ may be rewritten as
$$J^2 = \tfrac{1}{2}(J_+J_- + J_-J_+) + J_z^2. \tag{12.161}$$
From the commutation relations, Eq. 12.157, we find
$$[J_z, J_+] = +J_+, \qquad [J_z, J_-] = -J_-, \qquad [J_+, J_-] = 2J_z. \tag{12.162}$$
Since $J_+$ commutes with $J^2$ (Exercise 12.7.1),
$$J^2(J_+\psi_{JM}) = J_+(J^2\psi_{JM}) = J(J+1)(J_+\psi_{JM}). \tag{12.163}$$
Therefore, $J_+\psi_{JM}$ is still an eigenfunction of $J^2$ with eigenvalue $J(J + 1)$.
Similarly for $J_-\psi_{JM}$. But from Eq. 12.162
$$J_zJ_+ = J_+(J_z + 1), \tag{12.164}$$
or
$$J_z(J_+\psi_{JM}) = J_+(J_z + 1)\psi_{JM} = (M+1)(J_+\psi_{JM}). \tag{12.165}$$
³ That $\psi_{JM}$ is an eigenfunction of both $J_z$ and $J^2$ is a consequence of $[J_z, J^2] = 0$.
⁴ Ladder operators can be developed for other mathematical functions.
Compare Section 13.1 for Hermite polynomials.
Therefore $J_+\psi_{JM}$ is still an eigenfunction of $J_z$ but now with eigenvalue $M + 1$.
$J_+$ has raised the eigenvalue by 1 and so is often called a raising operator.
Similarly, $J_-$ lowers the eigenvalue by 1 and so is often called a lowering operator.
With respect to rotations ($J^2$, $J_z$, $J_+$, $J_-$), the $\psi_{JM}$ form an irreducible,
invariant subspace; $M$ varies and $J$ is fixed. In Section 4.10 this property appears
as the rotation group operating on the spherical harmonics, $Y_l^m$; $m$ varies and $l$
is fixed.
Now what is the effect of letting first $J_+$ and then $J_-$ operate on $\psi_{JM}$? The
answer comes from expressing $J_-J_+$ (and $J_+J_-$) in terms of $J^2$ and $J_z$. From
Eqs. 12.157 and 12.161,
$$J_-J_+ = J^2 - J_z(J_z + 1), \qquad J_+J_- = J^2 - J_z(J_z - 1). \tag{12.166}$$
Then using Eqs. 12.158, 12.159, and 12.166,
$$J_-J_+\psi_{JM} = [J(J+1) - M(M+1)]\psi_{JM} = (J-M)(J+M+1)\psi_{JM},$$
$$J_+J_-\psi_{JM} = [J(J+1) - M(M-1)]\psi_{JM} = (J+M)(J-M+1)\psi_{JM}. \tag{12.167}$$
Now, multiply by $\psi_{JM}^*$ and integrate (over all angles for the spherical harmonics).
Since the $\psi_{JM}$ have been assumed normalized,
$$\int\psi_{JM}^*\,J_-J_+\,\psi_{JM}\,d\tau = (J-M)(J+M+1) \ge 0,$$
$$\int\psi_{JM}^*\,J_+J_-\,\psi_{JM}\,d\tau = (J+M)(J-M+1) \ge 0. \tag{12.168}$$
The $\ge 0$ part is worth a comment. In the language of quantum mechanics, $J_+$
and $J_-$ are Hermitian conjugates,⁵
$$J_+^\dagger = J_-, \qquad J_-^\dagger = J_+. \tag{12.169}$$
Examples of this are provided by the matrices of Exercises 4.2.13 (spin 1/2), 4.2.15
(spin 1), and 4.2.18 (spin 3/2). Therefore
$$J_-J_+ = J_+^\dagger J_+, \qquad J_+J_- = J_-^\dagger J_-, \tag{12.170}$$
and the expectation values, Eq. 12.168, must be positive or zero.⁶ For our
particular orbital angular momentum ladder operators, $L_+$ and $L_-$, explicit forms
are given in Exercises 2.5.14 and 12.6.7. The reader can show (Exercise 12.7.2)
that
$$\int Y_L^{M*}\,L_-(L_+Y_L^M)\,d\Omega = \int (L_+Y_L^M)^*(L_+Y_L^M)\,d\Omega. \tag{12.171}$$
⁵ The Hermitian conjugation or adjoint operation is defined for matrices in
Section 4.5, for operators in general in Section 9.1.
⁶ For an excellent discussion of adjoint operators and Hilbert space see
A. Messiah, Quantum Mechanics, Chapter 7. New York: Wiley (1961).
This is a sort of integration by parts (with the extra minus sign in $L_-$ canceled
by the minus sign in the integration-by-parts formula). Actually the equality
is most easily verified by evaluating each side of Eq. 12.171, using Exercise 12.6.7.
From the right-hand side of Eq. 12.171 it is clear that the $\ge 0$ in Eq. 12.168 is
valid. With the $\ge 0$ justified, we must have $M$ restricted to the range
$-J \le M \le J$.
Since $J_+$ raises the eigenvalue $M$ to $M + 1$, we relabel the resultant
eigenfunction $\psi_{J,M+1}$. The normalization is given by Eq. 12.168 as
$$J_+\psi_{JM} = \sqrt{(J-M)(J+M+1)}\;\psi_{J,M+1}, \tag{12.172}$$
taking the positive square root and not introducing any phase factor. By the
same arguments
$$J_-\psi_{JM} = \sqrt{(J+M)(J-M+1)}\;\psi_{J,M-1}. \tag{12.173}$$
Both $\psi_{J,M+1}$ and $\psi_{J,M-1}$ remain normalized to unity. An explicit calculation of
these results (using known ladder operators and known spherical harmonics) is
the topic of Exercise 12.6.7. In Eqs. 12.172 and 12.173 the positive square root
has been taken. Then the relative phase of $\psi_{J,M\pm 1}$ and $\psi_{JM}$ is determined by the
ladder operators.
Repeated application of $J_+$ leads to
$$(J_+)^n\psi_{JM} = C_{JMn}\,\psi_{J,M+n}. \tag{12.174}$$
This operation must stop at $M' = M + n = J$, or else we would jump to $M' > J$
and be in contradiction with the conclusion from Eq. 12.168, $M \le J$.
Equivalently, we may say that whatever $M_{\max}$ is, since $J_+\psi_{J,M_{\max}} = 0$, the left-hand side
of Eq. 12.172 is zero, and therefore the right-hand side is zero. This yields
$M_{\max} = J$. In the same fashion,
$$(J_-)^n\psi_{JM} = D_{JMn}\,\psi_{J,M-n} \tag{12.175}$$
must terminate at $M'' = M - n = -J$. We conclude from this, first, that
$$J_+\psi_{J,J} = 0, \qquad J_-\psi_{J,-J} = 0. \tag{12.176}$$
Second, since $M$ ranges from $+J$ to $-J$ in unit steps, $2J$ must be an integer. $J$ is
either an integer or half of an odd integer. As seen later, orbital angular
momentum is described with integral $J$. But from the spins of some of the fundamental
particles and of some nuclei, we get $J = \tfrac{1}{2}, \tfrac{3}{2}, \tfrac{5}{2}, \ldots$. Our angular momentum is
quantized, essentially as a result of the commutation relations.
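As a numerical illustration (not part of the original text), the matrix elements implied by Eqs. 12.172 and 12.173 can be assembled into finite matrices in the basis $\psi_{JM}$, $M = J, J-1, \ldots, -J$, and the relations $[J_+, J_-] = 2J_z$ and $J^2 = J(J+1)$ checked directly. The sketch below uses only the Python standard library; the function names `ladder_matrices`, `matmul`, and `satisfies_algebra` are illustrative choices, and $J$ is passed as the integer $2J$ so that half-integral values are covered.

```python
import math
from fractions import Fraction


def ladder_matrices(two_j):
    """Return (Jz, Jp, Jm) for J = two_j/2 in the basis M = J, J-1, ..., -J."""
    J = Fraction(two_j, 2)
    dim = two_j + 1
    ms = [J - k for k in range(dim)]
    Jz = [[float(ms[r]) if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    Jp = [[0.0] * dim for _ in range(dim)]
    Jm = [[0.0] * dim for _ in range(dim)]
    for c, M in enumerate(ms):
        if c > 0:
            # J+ psi_{JM} = sqrt((J-M)(J+M+1)) psi_{J,M+1}; M+1 sits at index c-1
            Jp[c - 1][c] = math.sqrt((J - M) * (J + M + 1))
        if c < dim - 1:
            # J- psi_{JM} = sqrt((J+M)(J-M+1)) psi_{J,M-1}; M-1 sits at index c+1
            Jm[c + 1][c] = math.sqrt((J + M) * (J - M + 1))
    return Jz, Jp, Jm


def matmul(a, b):
    """Plain nested-list product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def satisfies_algebra(two_j, tol=1e-9):
    """Check [J+, J-] = 2 Jz and (J+J- + J-J+)/2 + Jz^2 = J(J+1) * identity."""
    Jz, Jp, Jm = ladder_matrices(two_j)
    n = two_j + 1
    J = two_j / 2
    JpJm, JmJp, JzJz = matmul(Jp, Jm), matmul(Jm, Jp), matmul(Jz, Jz)
    for i in range(n):
        for j in range(n):
            commutator = JpJm[i][j] - JmJp[i][j]
            j_squared = 0.5 * (JpJm[i][j] + JmJp[i][j]) + JzJz[i][j]
            expected = J * (J + 1) if i == j else 0.0
            if abs(commutator - 2 * Jz[i][j]) > tol or abs(j_squared - expected) > tol:
                return False
    return True


if __name__ == "__main__":
    for two_j in range(1, 7):
        print(f"J = {two_j}/2: algebra satisfied = {satisfies_algebra(two_j)}")
```

The matrices terminate automatically at $M = \pm J$ because the square-root factors vanish there, which is the content of Eq. 12.176.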
Orbital Angular Momentum Operators
Now we return to our specific orbital angular momentum operators, $L_x$, $L_y$,
and $L_z$. Equation 12.158 becomes
$$L_z\psi_{LM}(\theta, \varphi) = M\psi_{LM}(\theta, \varphi).$$
The explicit form of $L_z$ indicates that $\psi_{LM}(\theta, \varphi)$ has a $\varphi$ dependence of $e^{iM\varphi}$, with
$M$ an integer to keep $\psi_{LM}$ single-valued. And if $M$ is an integer, then $L$ is an
integer also.
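To determine the $\theta$ dependence of $\psi_{LM}(\theta, \varphi)$, we proceed in two main steps:
(1) the determination of $\psi_{LL}(\theta, \varphi)$ and (2) the development of $\psi_{LM}(\theta, \varphi)$ in terms
of $\psi_{LL}$ with the phase fixed by $\psi_{L0}$.
Let
$$\psi_{LM}(\theta, \varphi) = \Theta_{LM}(\theta)\,e^{iM\varphi}. \tag{12.177}$$
From Eq. 12.176, using the form of $L_+$ given in Exercises 2.5.14 and 12.6.7, we
have
$$e^{i(L+1)\varphi}\left[\frac{d}{d\theta} - L\cot\theta\right]\Theta_{LL}(\theta) = 0, \tag{12.178}$$
and
$$\Theta_{LL}(\theta) = c_L\sin^L\theta. \tag{12.179}$$
Normalizing, we obtain
$$c_L^*c_L\int_0^{2\pi}\!\!\int_0^{\pi}\sin^{2L+1}\theta\,d\theta\,d\varphi = 1. \tag{12.180}$$
The $\theta$ integral may be evaluated as a beta function (Exercise 10.4.9) and
$$|c_L| = \sqrt{\frac{(2L+1)!!}{4\pi\,(2L)!!}} = \frac{\sqrt{(2L)!}}{2^L L!}\sqrt{\frac{2L+1}{4\pi}}. \tag{12.181}$$
This completes our first step.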
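The normalization just obtained is easy to check numerically. The following sketch (an illustration added here, not from the original text) evaluates the angular integral of Eq. 12.180 with a simple midpoint rule and compares the two closed forms quoted for $|c_L|$, namely $\sqrt{(2L+1)!!/[4\pi(2L)!!]}$ and $\sqrt{(2L)!}\,\sqrt{(2L+1)/4\pi}/(2^L L!)$. The function names and the number of integration steps are arbitrary choices.

```python
import math


def double_factorial(n):
    """n!! with the convention 0!! = (-1)!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def c_L_double_factorial_form(L):
    """|c_L| = sqrt((2L+1)!! / (4 pi (2L)!!))."""
    return math.sqrt(double_factorial(2 * L + 1) / (4 * math.pi * double_factorial(2 * L)))


def c_L_factorial_form(L):
    """|c_L| = sqrt((2L)!) / (2^L L!) * sqrt((2L+1) / (4 pi))."""
    return (math.sqrt(math.factorial(2 * L)) / (2 ** L * math.factorial(L))
            * math.sqrt((2 * L + 1) / (4 * math.pi)))


def angular_norm_integral(L, steps=20000):
    """Midpoint-rule estimate of the double integral of sin^(2L+1)(theta) over the sphere."""
    h = math.pi / steps
    theta_part = sum(math.sin((k + 0.5) * h) ** (2 * L + 1) for k in range(steps)) * h
    return 2 * math.pi * theta_part  # the phi integral contributes a factor 2 pi


if __name__ == "__main__":
    for L in range(5):
        c = c_L_double_factorial_form(L)
        print(L, c, c_L_factorial_form(L), c * c * angular_norm_integral(L))
```

For each $L$ the last printed column should be close to 1, up to the quadrature error of the midpoint rule.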
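To obtain the $\psi_{LM}$, $M \ne \pm L$, we return to the ladder operators. From
Eqs. 12.172 and 12.173 ($J_+$ replaced by $L_+$ and $J_-$ replaced by $L_-$),
$$\psi_{L,M+1}(\theta,\varphi) = \frac{1}{\sqrt{(L-M)(L+M+1)}}\,L_+\psi_{LM}(\theta,\varphi),$$
$$\psi_{L,M-1}(\theta,\varphi) = \frac{1}{\sqrt{(L+M)(L-M+1)}}\,L_-\psi_{LM}(\theta,\varphi). \tag{12.182}$$
Again, note that the relative phases are set by the ladder operators. $L_+$ and
$L_-$ operating on $\Theta_{LM}(\theta)e^{iM\varphi}$ may be written as
$$L_+\Theta_{LM}(\theta)e^{iM\varphi} = e^{i(M+1)\varphi}\left[\frac{d}{d\theta} - M\cot\theta\right]\Theta_{LM}(\theta) = -e^{i(M+1)\varphi}\sin^{1+M}\theta\,\frac{d}{d(\cos\theta)}\left[\sin^{-M}\theta\,\Theta_{LM}(\theta)\right],$$
$$L_-\Theta_{LM}(\theta)e^{iM\varphi} = -e^{i(M-1)\varphi}\left[\frac{d}{d\theta} + M\cot\theta\right]\Theta_{LM}(\theta) = e^{i(M-1)\varphi}\sin^{1-M}\theta\,\frac{d}{d(\cos\theta)}\left[\sin^{M}\theta\,\Theta_{LM}(\theta)\right]. \tag{12.183}$$
Repeating these operations $n$ times yields
$$(L_+)^n\Theta_{LM}(\theta)e^{iM\varphi} = (-1)^n e^{i(M+n)\varphi}\sin^{n+M}\theta\,\frac{d^n}{d(\cos\theta)^n}\left[\sin^{-M}\theta\,\Theta_{LM}(\theta)\right],$$
$$(L_-)^n\Theta_{LM}(\theta)e^{iM\varphi} = e^{i(M-n)\varphi}\sin^{n-M}\theta\,\frac{d^n}{d(\cos\theta)^n}\left[\sin^{M}\theta\,\Theta_{LM}(\theta)\right]. \tag{12.184}$$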
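From Eq. 12.182
$$\psi_{LM}(\theta,\varphi) = \sqrt{\frac{(L+M)!}{(2L)!\,(L-M)!}}\,(L_-)^{L-M}\psi_{LL}(\theta,\varphi), \tag{12.185}$$
and for $M = -L$
$$\psi_{L,-L}(\theta,\varphi) = \frac{c_L}{(2L)!}\,e^{-iL\varphi}\sin^{L}\theta\,\frac{d^{2L}}{d(\cos\theta)^{2L}}\sin^{2L}\theta = (-1)^L c_L\sin^L\theta\,e^{-iL\varphi}. \tag{12.186}$$
Note the characteristic $(-1)^L$ phase of $\psi_{L,-L}$ relative to $\psi_{L,L}$. This $(-1)^L$ enters
from
$$\sin^{2L}\theta = (1 - x^2)^L = (-1)^L(x^2 - 1)^L, \qquad x = \cos\theta. \tag{12.187}$$
Combining Eqs. 12.182, 12.184, and 12.186, we obtain
$$\psi_{LM}(\theta,\varphi) = \sqrt{\frac{(L-M)!}{(2L)!\,(L+M)!}}\,(L_+)^{L+M}\psi_{L,-L}(\theta,\varphi) = (-1)^L c_L\sqrt{\frac{(L-M)!}{(2L)!\,(L+M)!}}\,(-1)^{L+M}e^{iM\varphi}\sin^M\theta\,\frac{d^{L+M}}{d(\cos\theta)^{L+M}}\sin^{2L}\theta. \tag{12.188}$$
Equations 12.185 and 12.188 agree that
$$\psi_{L0}(\theta,\varphi) = \frac{c_L}{\sqrt{(2L)!}}\,\frac{d^L}{d(\cos\theta)^L}\sin^{2L}\theta. \tag{12.189}$$
Using Rodrigues's formula, Eq. 12.65, we have
$$\psi_{L0}(\theta,\varphi) = (-1)^L c_L\,\frac{2^L L!}{\sqrt{(2L)!}}\,P_L(\cos\theta) = (-1)^L\frac{c_L}{|c_L|}\sqrt{\frac{2L+1}{4\pi}}\,P_L(\cos\theta). \tag{12.190}$$
The last equality follows from Eq. 12.181. We now demand that $\psi_{L0}(0, 0)$ be
real and positive. Therefore
$$c_L = (-1)^L|c_L| = (-1)^L\frac{\sqrt{(2L)!}}{2^L L!}\sqrt{\frac{2L+1}{4\pi}}. \tag{12.191}$$
With $(-1)^L c_L/|c_L| = 1$, $\psi_{L0}(\theta, \varphi)$ in Eq. 12.190 may be identified with the
spherical harmonic $Y_L^0(\theta, \varphi)$ of Section 12.6.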
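When we substitute $(-1)^L c_L = |c_L|$ into Eq. 12.188,
$$\psi_{LM}(\theta,\varphi) = \frac{\sqrt{(2L)!}}{2^L L!}\sqrt{\frac{2L+1}{4\pi}}\sqrt{\frac{(L-M)!}{(2L)!\,(L+M)!}}\,(-1)^{L+M}e^{iM\varphi}\sin^M\theta\,\frac{d^{L+M}}{d(\cos\theta)^{L+M}}\sin^{2L}\theta$$
$$= \sqrt{\frac{2L+1}{4\pi}}\sqrt{\frac{(L-M)!}{(L+M)!}}\,e^{iM\varphi}(-1)^M\left\{\frac{1}{2^L L!}(1-x^2)^{M/2}\frac{d^{L+M}}{dx^{L+M}}(x^2-1)^L\right\}, \qquad x = \cos\theta,\ M \ge 0. \tag{12.192}$$
The expression in the curly bracket is identified as the associated Legendre
function (Eq. 12.144), and we have
$$\psi_{LM}(\theta, \varphi) = Y_L^M(\theta, \varphi) = (-1)^M\sqrt{\frac{2L+1}{4\pi}\,\frac{(L-M)!}{(L+M)!}}\;P_L^M(\cos\theta)\,e^{iM\varphi}, \qquad M \ge 0, \tag{12.193}$$
in complete agreement with Section 12.6. Then by Eq. 12.81a, $Y_L^M$ for negative
superscript is given by
$$Y_L^{-M}(\theta, \varphi) = (-1)^M\,Y_L^{M*}(\theta, \varphi). \tag{12.194}$$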
Our angular momentum eigenfunctions $\psi_{LM}(\theta, \varphi)$ are identified with the
spherical harmonics. The phase factor $(-1)^M$ is associated with the positive
values of $M$ and is seen to be a consequence of the ladder operators.
Our development of spherical harmonics here may be considered a portion
of Lie algebra (related to group theory), Section 4.10.
EXERCISES
12.7.1 Show that
(a) [J+,J2] = 0,
(b) [J_,J2] = 0.
12.7.2 Using the known forms of $L_+$ and $L_-$ (Exercises 2.5.14 and 12.6.7), show that
$$\int Y_L^{M*}\,L_-(L_+Y_L^M)\,d\Omega = \int (L_+Y_L^M)^*(L_+Y_L^M)\,d\Omega.$$
12.7.3 Derive the relations
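12.7.4 Derive the multiple operator equations
(a) $(L_+)^n\Theta_{LM}(\theta)e^{iM\varphi} = (-1)^n e^{i(M+n)\varphi}\sin^{n+M}\theta\,\dfrac{d^n}{d(\cos\theta)^n}\left[\sin^{-M}\theta\,\Theta_{LM}(\theta)\right]$,
(b) $(L_-)^n\Theta_{LM}(\theta)e^{iM\varphi} = e^{i(M-n)\varphi}\sin^{n-M}\theta\,\dfrac{d^n}{d(\cos\theta)^n}\left[\sin^{M}\theta\,\Theta_{LM}(\theta)\right]$.
Hint. Try mathematical induction.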
12.7.5 Show, using $(L_-)^n$, that
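12.7.6 Verify by explicit calculation that
(a) $L_+Y_1^0(\theta, \varphi) = -\sqrt{\dfrac{3}{4\pi}}\,\sin\theta\,e^{i\varphi} = \sqrt{2}\,Y_1^1(\theta, \varphi)$,
(b) $L_-Y_1^0(\theta, \varphi) = +\sqrt{\dfrac{3}{4\pi}}\,\sin\theta\,e^{-i\varphi} = \sqrt{2}\,Y_1^{-1}(\theta, \varphi)$.
The signs (Condon-Shortley phase) are a consequence of the ladder operators,
$L_+$ and $L_-$.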
12.8 THE ADDITION THEOREM FOR SPHERICAL HARMONICS
Trigonometric Identity
In the following discussion $(\theta_1, \varphi_1)$ and $(\theta_2, \varphi_2)$ denote two different directions
in our spherical coordinate system, separated by an angle $\gamma$ (Fig. 12.16). These
angles satisfy the trigonometric identity
$$\cos\gamma = \cos\theta_1\cos\theta_2 + \sin\theta_1\sin\theta_2\cos(\varphi_1 - \varphi_2), \tag{12.195}$$
which is perhaps most easily proved by vector methods (compare Chapter 1).
[FIG. 12.16]
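The addition theorem, then, asserts that
$$P_n(\cos\gamma) = \frac{4\pi}{2n+1}\sum_{m=-n}^{n}(-1)^m\,Y_n^m(\theta_1,\varphi_1)\,Y_n^{-m}(\theta_2,\varphi_2), \tag{12.196}$$
or equivalently,¹
$$P_n(\cos\gamma) = \frac{4\pi}{2n+1}\sum_{m=-n}^{n}Y_n^m(\theta_1,\varphi_1)\,Y_n^{m*}(\theta_2,\varphi_2). \tag{12.197}$$
In terms of the associated Legendre functions the addition theorem is
$$P_n(\cos\gamma) = P_n(\cos\theta_1)P_n(\cos\theta_2) + 2\sum_{m=1}^{n}\frac{(n-m)!}{(n+m)!}\,P_n^m(\cos\theta_1)\,P_n^m(\cos\theta_2)\cos m(\varphi_1 - \varphi_2). \tag{12.198}$$
Equation 12.195 is a special case of Eq. 12.198.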
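The real form of the addition theorem lends itself to a direct numerical spot check. The sketch below (an added illustration, not from the original text) evaluates both sides of $P_n(\cos\gamma) = P_n(\cos\theta_1)P_n(\cos\theta_2) + 2\sum_{m=1}^{n}\frac{(n-m)!}{(n+m)!}P_n^m(\cos\theta_1)P_n^m(\cos\theta_2)\cos m(\varphi_1 - \varphi_2)$, with $\cos\gamma$ taken from Eq. 12.195. The associated Legendre functions are generated by the standard upward recurrence in the degree, without a Condon-Shortley sign (any such sign would cancel in the product anyway). The function names are hypothetical.

```python
import math


def assoc_legendre(n, m, x):
    """P_n^m(x) = (1 - x^2)^(m/2) d^m P_n / dx^m for 0 <= m <= n and |x| <= 1."""
    # Start from P_m^m(x) = (2m - 1)!! (1 - x^2)^(m/2).
    p_mm = 1.0
    if m > 0:
        s = math.sqrt(max(0.0, 1.0 - x * x))
        for k in range(1, m + 1):
            p_mm *= (2 * k - 1) * s
    if n == m:
        return p_mm
    # P_{m+1}^m(x) = (2m + 1) x P_m^m(x).
    p_next = (2 * m + 1) * x * p_mm
    if n == m + 1:
        return p_next
    # (l - m) P_l^m = (2l - 1) x P_{l-1}^m - (l + m - 1) P_{l-2}^m.
    for l in range(m + 2, n + 1):
        p_mm, p_next = p_next, ((2 * l - 1) * x * p_next - (l + m - 1) * p_mm) / (l - m)
    return p_next


def addition_theorem_sides(n, theta1, phi1, theta2, phi2):
    """Return (P_n(cos gamma), right-hand side of the real addition theorem)."""
    cos_gamma = (math.cos(theta1) * math.cos(theta2)
                 + math.sin(theta1) * math.sin(theta2) * math.cos(phi1 - phi2))
    cos_gamma = max(-1.0, min(1.0, cos_gamma))  # guard against rounding
    lhs = assoc_legendre(n, 0, cos_gamma)
    x1, x2 = math.cos(theta1), math.cos(theta2)
    rhs = assoc_legendre(n, 0, x1) * assoc_legendre(n, 0, x2)
    for m in range(1, n + 1):
        weight = math.factorial(n - m) / math.factorial(n + m)
        rhs += (2.0 * weight * assoc_legendre(n, m, x1) * assoc_legendre(n, m, x2)
                * math.cos(m * (phi1 - phi2)))
    return lhs, rhs


if __name__ == "__main__":
    for n in range(5):
        print(n, addition_theorem_sides(n, 0.7, 1.9, 2.1, 0.4))
```

For $n = 1$ the right-hand side reduces term by term to the trigonometric identity for $\cos\gamma$, which is the sense in which Eq. 12.195 is a special case.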
Derivation of Addition Theorem
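We now derive Eq. 12.197. Let $g(\theta, \varphi)$ be a function that may be expanded in
a Laplace series
$$g(\theta_1,\varphi_1) = Y_n^m(\theta_1,\varphi_1) \quad \text{relative to } x_1, y_1, z_1$$
$$= \sum_{\sigma=-n}^{n} a_{n\sigma}\,Y_n^{\sigma}(\gamma,\psi) \quad \text{relative to } x_2, y_2, z_2. \tag{12.199}$$
Actually the choice of the 0 of the azimuth angle $\psi$ is irrelevant. At $\gamma = 0$ we have
$$g(\theta_1,\varphi_1)\Big|_{\gamma=0} = a_{n0}\left(\frac{2n+1}{4\pi}\right)^{1/2}, \tag{12.200}$$
since $P_n(1) = 1$, whereas $P_n^{\sigma}(1) = 0$ ($\sigma \ne 0$). Multiplying Eq. 12.199 by $Y_n^{0*}(\gamma, \psi)$
and integrating over the sphere, we obtain
$$\int g(\theta_1,\varphi_1)\,Y_n^{0*}(\gamma,\psi)\,d\Omega_{\gamma,\psi} = a_{n0}. \tag{12.201}$$
Now, using Eq. 12.199, we may rewrite Eq. 12.201 as
$$\int Y_n^m(\theta_1,\varphi_1)\,Y_n^{0*}(\gamma,\psi)\,d\Omega = a_{n0}. \tag{12.202}$$
As for Eq. 12.199, we assume that $P_n(\cos\gamma)$ has an expansion of the form
$$P_n(\cos\gamma) = \sum_{m=-n}^{n} b_{nm}\,Y_n^m(\theta_1,\varphi_1), \tag{12.203}$$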
1 The asterisk may go on either spherical harmonic.
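where the $b_{nm}$ will, of course, depend on $\theta_2, \varphi_2$, that is, on the orientation of the
$z_2$-axis. Multiplying by $Y_n^{m*}(\theta_1, \varphi_1)$ and integrating with respect to $\theta_1$ and $\varphi_1$
over the sphere, we have
$$\int P_n(\cos\gamma)\,Y_n^{m*}(\theta_1,\varphi_1)\,d\Omega_{\theta_1,\varphi_1} = b_{nm}. \tag{12.204}$$
In terms of spherical harmonics Eq. 12.204 becomes
$$\left(\frac{4\pi}{2n+1}\right)^{1/2}\int Y_n^0(\gamma,\psi)\,Y_n^{m*}(\theta_1,\varphi_1)\,d\Omega = b_{nm}. \tag{12.205}$$
Note that the subscripts have been dropped from the solid angle element $d\Omega$.
Since the range of integration is over all solid angles, the choice of polar axis
is irrelevant. Then in a comparison of Eqs. 12.202 and 12.205,
$$b_{nm}^* = \left(\frac{4\pi}{2n+1}\right)^{1/2}a_{n0}$$
$$= \frac{4\pi}{2n+1}\,g(\theta_1,\varphi_1)\Big|_{\gamma=0} \quad \text{by Eq. 12.200}$$
$$= \frac{4\pi}{2n+1}\,Y_n^m(\theta_2,\varphi_2) \quad \text{by Eq. 12.199}. \tag{12.206}$$
The change in subscripts occurs because
$$\theta_1 \to \theta_2, \qquad \varphi_1 \to \varphi_2 \qquad \text{for } \gamma \to 0.$$
Substituting back into Eq. 12.203, we obtain Eq. 12.197, thus proving our
addition theorem.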
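The reader familiar with group theory will find a much more elegant proof
of Eq. 12.197 by using the rotation group.² This is Exercise 4.10.11.
One application of the addition theorem is in the construction of a Green's
function for the three-dimensional Laplace equation in spherical polar
coordinates. If the source is on the polar axis at the point $(r = a, \theta = 0, \varphi = 0)$,
then by Eq. 12.4
$$\frac{1}{R} = \frac{1}{|\mathbf{r} - \hat{\mathbf{z}}a|} = \sum_{n=0}^{\infty}P_n(\cos\gamma)\,\frac{a^n}{r^{n+1}}, \qquad r > a,$$
$$= \sum_{n=0}^{\infty}P_n(\cos\gamma)\,\frac{r^n}{a^{n+1}}, \qquad r < a. \tag{12.207}$$
Rotating our coordinate system to put the source at $(a, \theta_2, \varphi_2)$ and the point of
observation at $(r, \theta_1, \varphi_1)$, we obtain
$$\frac{1}{R} = \sum_{n=0}^{\infty}\sum_{m=-n}^{n}\frac{4\pi}{2n+1}\,Y_n^{m*}(\theta_1,\varphi_1)\,Y_n^m(\theta_2,\varphi_2)\,\frac{a^n}{r^{n+1}}, \qquad r > a,$$
$$= \sum_{n=0}^{\infty}\sum_{m=-n}^{n}\frac{4\pi}{2n+1}\,Y_n^{m*}(\theta_1,\varphi_1)\,Y_n^m(\theta_2,\varphi_2)\,\frac{r^n}{a^{n+1}}, \qquad r < a. \tag{12.208}$$
² Compare M. E. Rose, Elementary Theory of Angular Momentum. New York:
Wiley (1957).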
In Section 16.6 this argument is reversed to provide another derivation of the
Legendre polynomial addition theorem.
EXERCISES
12.8.1 In proving the addition theorem, we assumed that $Y_n^m(\theta_1, \varphi_1)$ could be expanded
in a series of $Y_n^m(\theta_2, \varphi_2)$ in which $m$ varied from $-n$ to $+n$ but $n$ was held fixed.
What arguments can you develop to justify summing only over the upper index
$m$ and not over the lower index $n$?
Hints. One possibility is to examine the homogeneity of the $Y_n^m$, that is, $Y_n^m$
may be expressed entirely in terms of the form $\cos^{n-p}\theta\,\sin^{p}\theta$ or $x^{n-p-s}y^{p}z^{s}/r^{n}$.
Another possibility is to examine the behavior of the Legendre equation
$[\nabla^2 + n(n+1)/r^2]\,P_n(\cos\theta) = 0$ under rotation of the coordinate system.
12.8.2 An atomic electron with angular momentum $L$ and magnetic quantum number
$M$ has a wave function
$$\psi(r, \theta, \varphi) = f(r)\,Y_L^M(\theta, \varphi).$$
Show that the sum of the electron densities in a given complete shell is spherically
symmetric; that is, $\sum_{M=-L}^{L}\psi^*(r, \theta, \varphi)\,\psi(r, \theta, \varphi)$ is independent of $\theta$ and $\varphi$.
12.8.3 The potential of an electron at point re in the field of Z protons at rp is
e2 * 1
4та
=1 |ге-гр|
Show that this may be written as
е p=1 LM\rej ll -v i
where $r_e > r_p$. How should $\varphi$ be written for $r_e < r_p$?
12.8.4 Two protons are uniformly distributed within the same spherical volume. If the
coordinates of one element of charge are (rl,el,q>l) and the coordinates of the
other are (г2,62,(р2) and r12 is the distance between them, the element of energy
of repulsion will be given by
2dvx dv2 _ 2r\drx sinQx ddx dcpx r\dr2sinв2dd2dq>2
r\i r12
tt charge 3e , ,
Here p = — = r, charge density,
volume 47гК
Calculate the total electrostatic energy (of repulsion) of the two protons. This
calculation is used in accounting for the mass difference in "mirror" nuclei, such
as O¹⁵ and N¹⁵.
ANS. For $r_2 > r_1$: $\dfrac{3}{5}\dfrac{e^2}{R}$; for $r_2 < r_1$: $\dfrac{3}{5}\dfrac{e^2}{R}$; $\dfrac{6}{5}\dfrac{e^2}{R}$ (total).
This is double that required to create a uniformly charged sphere because we have
two separate cloud charges interacting, not one charge interacting with itself
(with permutation of pairs not considered).
12.8.5 Each of the two 1s electrons in helium may be described by a hydrogenic wave
function
/73\l/2
И) () -**-
in the absence of the other electron. Here $Z$, the atomic number, is 2. The symbol
$a_0$ is the Bohr radius, $\hbar^2/me^2$. Find the mutual potential energy of the two electrons
given by
ANS.
8fl0
Note. d3rx= rfdrx %\пвх d6x d<px,
*1 -
12.8.6 The probability of finding a 1s hydrogen electron in a volume element
$r^2\,dr\,\sin\theta\,d\theta\,d\varphi$ is
$$\frac{1}{\pi a_0^3}\exp[-2r/a_0]\;r^2\,dr\,\sin\theta\,d\theta\,d\varphi.$$
Find the corresponding electrostatic potential. Calculate the potential from
Г12
with rx not on the z-axis. Expand r12. Apply the Legendre polynomial addition
theorem and show that the angular dependence of F(rx) drops out.
4 [2 j j
\у()
12.8.7 A hydrogen electron in a 2p orbit has a charge distribution
P=^2li2
where $a_0$ is the Bohr radius, $\hbar^2/me^2$. Find the electrostatic potential corresponding
to this charge distribution.
12.8.8 The electric current density produced by a 2p electron in a hydrogen atom is
Using
find the magnetic vector potential produced by this hydrogen electron.
Hint. Resolve into cartesian components. Use the addition theorem to eliminate
y, the angle included between rx and r2.
12.8.9 (a) As a Laplace series and as an example of Eq. 9.80 (now with complex
functions), show that
$$\delta(\Omega_1 - \Omega_2) = \sum_{n,m}Y_n^{m*}(\theta_2,\varphi_2)\,Y_n^m(\theta_1,\varphi_1).$$
(b) Show also that this same Dirac delta function may be written as
$$\delta(\Omega_1 - \Omega_2) = \sum_{n=0}^{\infty}\frac{2n+1}{4\pi}\,P_n(\cos\gamma).$$
Now, if you can justify equating the summations over $n$ term by term, you have
an alternate derivation of the spherical harmonic addition theorem.
12.9 INTEGRALS OF THE PRODUCT OF THREE SPHERICAL HARMONICS
Frequently in quantum mechanics we encounter integrals of the general form
$$\int Y_{L_1}^{M_1*}\,Y_{L_2}^{M_2}\,Y_{L_3}^{M_3}\,d\Omega,$$
in which the integration is over all solid angles. The first factor in the integrand
may come from the wave function of a final state and the third factor from an
initial state, whereas the middle factor may represent an operator that is being
evaluated or whose "matrix element" is being determined.
By using group theoretical methods, as in the quantum theory of angular
momentum, we may give a general expression for the forms listed. The analysis
involves the vector-addition or Clebsch-Gordan coefficients, which have been
tabulated. Three general restrictions appear.¹ (1) The integral vanishes unless the
vector sum of the $L$'s (angular momentum) is zero, $|L_1 - L_3| \le L_2 \le L_1 + L_3$.
(2) The integral vanishes unless $M_2 + M_3 = M_1$. Here we have the theoretical
foundation of the vector model of atomic spectroscopy. (3) Finally, the integral
vanishes unless the product $Y_{L_1}^{M_1*}Y_{L_2}^{M_2}Y_{L_3}^{M_3}$ is even, that is, unless $L_1 + L_2 + L_3$
is an even integer. This is a parity conservation law.
Details of this general and powerful approach will be found in the references.
¹ E. U. Condon and G. H. Shortley, The Theory of Atomic Spectra. Cambridge:
Cambridge University Press (1951); M. E. Rose, Elementary Theory of Angular
Momentum. New York: Wiley (1957); A. Edmonds, Angular Momentum in
Quantum Mechanics. Princeton, N.J.: Princeton University Press (1957);
E. P. Wigner, Group Theory and Its Applications to Quantum Mechanics
(translated by J. J. Griffin). New York: Academic Press (1959).
The reader will note that the vector-addition coefficients are developed in terms
of the Condon-Shortley phase convention in which the $(-1)^m$ of Eq. 12.146 is
associated with the positive $m$.
It is possible to evaluate many of the commonly encountered integrals of this
form with the techniques already developed. The integration over azimuth may
be carried out by inspection:
$$\int_0^{2\pi}e^{-iM_1\varphi}\,e^{iM_2\varphi}\,e^{iM_3\varphi}\,d\varphi = 2\pi\,\delta_{M_2+M_3-M_1,\,0}. \tag{12.209}$$
Physically this corresponds to the conservation of the z-component of angular
momentum.
Application of Recurrence Relations
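A glance at Table 12.4 will show that the $\theta$-dependence of $Y_{L_2}^{M_2}$, that is,
$P_{L_2}^{M_2}(\theta)$, can be expressed in terms of $\cos\theta$ and $\sin\theta$. However, a factor of $\cos\theta$
or $\sin\theta$ may be combined with the $Y_{L_3}^{M_3}$ factor by using the associated Legendre
polynomial recurrence relations. For instance, from Eqs. 12.85 and 12.86 we get
$$\cos\theta\,Y_L^M = +\left[\frac{(L-M+1)(L+M+1)}{(2L+1)(2L+3)}\right]^{1/2}Y_{L+1}^{M} + \left[\frac{(L-M)(L+M)}{(2L-1)(2L+1)}\right]^{1/2}Y_{L-1}^{M}, \tag{12.210}$$
$$e^{i\varphi}\sin\theta\,Y_L^M = -\left[\frac{(L+M+1)(L+M+2)}{(2L+1)(2L+3)}\right]^{1/2}Y_{L+1}^{M+1} + \left[\frac{(L-M)(L-M-1)}{(2L-1)(2L+1)}\right]^{1/2}Y_{L-1}^{M+1}, \tag{12.211}$$
$$e^{-i\varphi}\sin\theta\,Y_L^M = +\left[\frac{(L-M+1)(L-M+2)}{(2L+1)(2L+3)}\right]^{1/2}Y_{L+1}^{M-1} - \left[\frac{(L+M)(L+M-1)}{(2L-1)(2L+1)}\right]^{1/2}Y_{L-1}^{M-1}. \tag{12.212}$$
Using these equations, we obtain
$$\int Y_{L_1}^{M_1*}\cos\theta\,Y_L^M\,d\Omega = \left[\frac{(L-M+1)(L+M+1)}{(2L+1)(2L+3)}\right]^{1/2}\delta_{M_1,M}\,\delta_{L_1,L+1} + \left[\frac{(L-M)(L+M)}{(2L-1)(2L+1)}\right]^{1/2}\delta_{M_1,M}\,\delta_{L_1,L-1}. \tag{12.213}$$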
The occurrence of the Kronecker delta $(L_1, L \pm 1)$ is an aspect of the
conservation of angular momentum. Physically, this integral arises in a consideration of
ordinary atomic electromagnetic radiation (electric dipole). It leads to the
familiar selection rule that transitions to an atomic level with orbital angular
momentum quantum number $L_1$ can originate only from atomic levels with
quantum numbers $L_1 - 1$ or $L_1 + 1$. The application to expressions such as
$$\text{quadrupole moment} \sim \int Y_L^{M*}(\theta,\varphi)\,P_2(\cos\theta)\,Y_L^M(\theta,\varphi)\,d\Omega$$
is more involved but perfectly straightforward.
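For $M = 0$ the dipole selection rule reduces to a statement about Legendre polynomials: $\int_{-1}^{1}x\,P_L(x)\,P_N(x)\,dx$ vanishes unless $N = L \pm 1$. The sketch below (an added illustration, not part of the original text) verifies this with exact rational arithmetic, building the $P_n$ from the standard recurrence $(k+1)P_{k+1} = (2k+1)xP_k - kP_{k-1}$ and comparing the nonzero values with $2(L+1)/[(2L+1)(2L+3)]$ for $N = L+1$ and $2L/[(2L-1)(2L+1)]$ for $N = L-1$. The helper names are hypothetical.

```python
from fractions import Fraction


def legendre_coeffs(n):
    """Coefficients [c0, c1, ...] of P_n(x) in ascending powers of x."""
    p_prev, p_curr = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for k in range(1, n):
        x_times_p = [Fraction(0)] + p_curr  # multiply P_k by x
        padded_prev = p_prev + [Fraction(0)] * (len(x_times_p) - len(p_prev))
        p_next = [((2 * k + 1) * a - k * b) / (k + 1)
                  for a, b in zip(x_times_p, padded_prev)]
        p_prev, p_curr = p_curr, p_next
    return p_curr


def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def integrate_minus1_to_1(p):
    """Exact integral over [-1, 1]; odd powers integrate to zero."""
    return sum((2 * c / (k + 1) for k, c in enumerate(p) if k % 2 == 0), Fraction(0))


def dipole_integral(L, N):
    """Exact value of the integral of x P_L(x) P_N(x) over [-1, 1]."""
    x = [Fraction(0), Fraction(1)]
    return integrate_minus1_to_1(poly_mul(x, poly_mul(legendre_coeffs(L), legendre_coeffs(N))))


if __name__ == "__main__":
    for L in range(4):
        print(L, [str(dipole_integral(L, N)) for N in range(6)])
```

Each printed row has nonzero entries only in the columns $N = L - 1$ and $N = L + 1$.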
EXERCISES
12.9.1 Verify
(a) [y?
47Г
(b)
BL + l)BL
MyiyM+i
j S
L + M+ 1)(L + M + 2)
BL + 1)BL + 3)
VW BL-1)BL + 1) "
These integrals were used in an investigation of the angular correlation of internal
conversion electrons.
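12.9.2 Show that
(a) $$\int_{-1}^{1}x\,P_L(x)\,P_N(x)\,dx = \frac{2(L+1)}{(2L+1)(2L+3)}, \qquad N = L+1,$$
$$= \frac{2L}{(2L-1)(2L+1)}, \qquad N = L-1.$$
(b) $$\int_{-1}^{1}x^2\,P_L(x)\,P_N(x)\,dx = \frac{2(L+1)(L+2)}{(2L+1)(2L+3)(2L+5)}, \qquad N = L+2,$$
$$= \frac{2(2L^2+2L-1)}{(2L-1)(2L+1)(2L+3)}, \qquad N = L,$$
$$= \frac{2L(L-1)}{(2L-3)(2L-1)(2L+1)}, \qquad N = L-2.$$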
12.9.3 Since $xP_n(x)$ is a polynomial (degree $n + 1$), it may be represented by the Legendre
series
$$xP_n(x) = \sum_{s=0}^{\infty}a_s P_s(x).$$
(a) Show that $a_s = 0$ for $s < n - 1$ and $s > n + 1$.
(b) Calculate $a_{n-1}$, $a_n$, and $a_{n+1}$ and show that you have reproduced the
recurrence relation, Eq. 12.17.
Note. This argument may be put in a general form to demonstrate the existence
of a three-term recurrence relation for any of our complete sets of orthogonal
polynomials:
$$x\varphi_n = a_{n+1}\varphi_{n+1} + a_n\varphi_n + a_{n-1}\varphi_{n-1}.$$
12.10 LEGENDRE FUNCTIONS OF THE SECOND KIND, $Q_n(x)$
In all the analysis so far in this chapter we have been dealing with one solution
of Legendre's equation, the solution $P_n(\cos\theta)$, which is regular (finite) at the two
singular points of the differential equation, $\cos\theta = \pm 1$. From the general theory
of differential equations it is known that a second solution exists. We develop
this second solution, $Q_n$, by a series solution of Legendre's equation. Later a
closed form will be obtained.
Series Solutions of Legendre's Equation
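To solve
$$\frac{d}{dx}\left[(1 - x^2)\frac{dy}{dx}\right] + n(n+1)\,y = 0 \tag{12.214}$$
we proceed as in Chapter 8, letting¹
$$y = \sum_{\lambda=0}^{\infty}a_\lambda x^{k+\lambda}, \tag{12.215}$$
with
$$y' = \sum_{\lambda=0}^{\infty}(k+\lambda)\,a_\lambda x^{k+\lambda-1}, \tag{12.216}$$
$$y'' = \sum_{\lambda=0}^{\infty}(k+\lambda)(k+\lambda-1)\,a_\lambda x^{k+\lambda-2}. \tag{12.217}$$
¹ Note that $x$ may be replaced by the complex variable $z$.
Substitution into the original differential equation gives
$$\sum_{\lambda=0}^{\infty}(k+\lambda)(k+\lambda-1)\,a_\lambda x^{k+\lambda-2} + \sum_{\lambda=0}^{\infty}\left[n(n+1) - 2(k+\lambda) - (k+\lambda)(k+\lambda-1)\right]a_\lambda x^{k+\lambda} = 0. \tag{12.218}$$
The indicial equation is
$$k(k-1) = 0, \tag{12.219}$$
with solutions $k = 0, 1$. We try first $k = 0$ with $a_0 = 1$, $a_1 = 0$. Then our series is
described by the recurrence relation
$$(\lambda+2)(\lambda+1)\,a_{\lambda+2} + \left[n(n+1) - 2\lambda - \lambda(\lambda-1)\right]a_\lambda = 0, \tag{12.220}$$
which becomes
$$a_{\lambda+2} = -\frac{(n+\lambda+1)(n-\lambda)}{(\lambda+1)(\lambda+2)}\,a_\lambda. \tag{12.221}$$
Labeling this series $p_n$, we have
$$p_n(x) = 1 - \frac{n(n+1)}{2!}x^2 + \frac{(n-2)\,n\,(n+1)(n+3)}{4!}x^4 + \cdots. \tag{12.222}$$
The second solution of the indicial equation, $k = 1$, with $a_0 = 1$, $a_1 = 0$, leads
to the recurrence relation
$$a_{\lambda+2} = -\frac{(n+\lambda+2)(n-\lambda-1)}{(\lambda+2)(\lambda+3)}\,a_\lambda. \tag{12.223}$$
Labeling this series $q_n$, we obtain
$$q_n(x) = x - \frac{(n-1)(n+2)}{3!}x^3 + \frac{(n-3)(n-1)(n+2)(n+4)}{5!}x^5 - \cdots. \tag{12.224}$$
Our general solution of Eq. 12.214, then, is
$$y_n(x) = A_n p_n(x) + B_n q_n(x), \tag{12.225}$$
provided we have convergence. From Gauss's test, Section 5.2 (see Example 5.2.4),
we do not have convergence at $x = \pm 1$. To get out of this difficulty, we set the
separation constant $n$ equal to an integer (Exercise 8.5.5) and convert the infinite
series into a polynomial.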
For $n$ a positive even integer (or zero), series $p_n$ terminates, and with a proper
choice of a normalizing factor (selected to obtain agreement with the definition
of $P_n(x)$ in Section 12.1)
$$P_n(x) = (-1)^{n/2}\frac{n!}{2^n[(n/2)!]^2}\,p_n(x) = (-1)^s\frac{(2s-1)!!}{(2s)!!}\,p_{2s}(x), \qquad \text{for } n = 2s. \tag{12.226}$$
If $n$ is a positive odd integer, series $q_n$ terminates after a finite number of terms,
and we write
$$P_n(x) = (-1)^{(n-1)/2}\frac{n!}{2^{n-1}\{[(n-1)/2]!\}^2}\,q_n(x) = (-1)^s\frac{(2s+1)!!}{(2s)!!}\,q_{2s+1}(x), \qquad \text{for } n = 2s+1. \tag{12.227}$$
Note that these expressions hold for all real values of $x$, $-\infty < x < \infty$, and for
complex values in the finite complex plane. The constants that multiply $p_n$ and
$q_n$ are chosen to make $P_n$ agree with Legendre polynomials given by the
generating function.
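The termination of the two series and the choice of normalizing constants can be confirmed with exact arithmetic. The sketch below (an added illustration, not from the original text) generates the coefficients of $p_n$ from $a_{\lambda+2} = -\frac{(n+\lambda+1)(n-\lambda)}{(\lambda+1)(\lambda+2)}a_\lambda$ and of $q_n$ from $a_{\lambda+2} = -\frac{(n+\lambda+2)(n-\lambda-1)}{(\lambda+2)(\lambda+3)}a_\lambda$, multiplies by $(-1)^s(2s-1)!!/(2s)!!$ for $n = 2s$ or $(-1)^s(2s+1)!!/(2s)!!$ for $n = 2s+1$, and compares with the explicit finite sum $P_n(x) = \sum_r (-1)^r\frac{(2n-2r)!}{2^n r!(n-r)!(n-2r)!}x^{n-2r}$. All function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def double_factorial(m):
    """m!! with the convention 0!! = (-1)!! = 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def series_coeffs(n, k, max_power):
    """Coefficients of x^0 .. x^max_power for the k = 0 (p_n) or k = 1 (q_n) series."""
    coeffs = [Fraction(0)] * (max_power + 1)
    a, lam = Fraction(1), 0
    while k + lam <= max_power:
        coeffs[k + lam] = a
        if k == 0:
            a = -a * Fraction((n + lam + 1) * (n - lam), (lam + 1) * (lam + 2))
        else:
            a = -a * Fraction((n + lam + 2) * (n - lam - 1), (lam + 2) * (lam + 3))
        lam += 2
    return coeffs


def legendre_from_series(n):
    """P_n built from the terminating series p_n (n even) or q_n (n odd)."""
    s = n // 2
    if n % 2 == 0:
        factor = Fraction((-1) ** s * double_factorial(2 * s - 1), double_factorial(2 * s))
        base = series_coeffs(n, 0, n)
    else:
        factor = Fraction((-1) ** s * double_factorial(2 * s + 1), double_factorial(2 * s))
        base = series_coeffs(n, 1, n)
    return [factor * c for c in base]


def legendre_explicit(n):
    """P_n from the explicit finite sum in descending powers x^(n - 2r)."""
    coeffs = [Fraction(0)] * (n + 1)
    for r in range(n // 2 + 1):
        coeffs[n - 2 * r] = Fraction(
            (-1) ** r * factorial(2 * n - 2 * r),
            2 ** n * factorial(r) * factorial(n - r) * factorial(n - 2 * r))
    return coeffs


if __name__ == "__main__":
    for n in range(6):
        print(n, [str(c) for c in legendre_from_series(n)])
```

Because the arithmetic is rational, agreement between the two constructions is exact rather than approximate.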
Equations 12.222 and 12.224 may still be used with $n = \nu$, not an integer, but
now the series no longer terminates, and the range of convergence becomes
$-1 < x < 1$. The end points, $x = \pm 1$, are not included.
[FIG. 12.17 Second Legendre function, $Q_n(x)$, $0 \le x < 1$]
It is sometimes convenient to reverse the order of the terms in the series. This
may be done by putting
$$s \to \frac{n}{2} - s \quad \text{in the first form of } P_n(x),\ n \text{ even},$$
$$s \to \frac{n-1}{2} - s \quad \text{in the second form of } P_n(x),\ n \text{ odd},$$
so that Eqs. 12.226 and 12.227 become
$$P_n(x) = \sum_{s=0}^{[n/2]}(-1)^s\frac{(2n-2s)!}{2^n\,s!\,(n-s)!\,(n-2s)!}\,x^{n-2s}, \tag{12.228}$$
where the upper limit is $s = n/2$ (for $n$ even) or $(n - 1)/2$ (for $n$ odd). This reproduces
Eq. 12.8 of Section 12.1, which is obtained directly from the generating function.
This agreement with Eq. 12.8 is the reason for the particular choice of
normalization in Eqs. 12.226 and 12.227.
Qn(x), Functions of the Second Kind
It will be noticed that we have used only $p_n$ for $n$ even and $q_n$ for $n$ odd
(because they terminated for this choice of $n$). We may now define a second solution
of Legendre's equation (Fig. 12.17) by
$$Q_n(x) = (-1)^{n/2}\frac{[(n/2)!]^2\,2^n}{n!}\,q_n(x) = (-1)^s\frac{(2s)!!}{(2s-1)!!}\,q_{2s}(x), \qquad \text{for } n \text{ even},\ n = 2s, \tag{12.229}$$
$$Q_n(x) = (-1)^{(n+1)/2}\frac{\{[(n-1)/2]!\}^2\,2^{n-1}}{n!}\,p_n(x) = (-1)^{s+1}\frac{(2s)!!}{(2s+1)!!}\,p_{2s+1}(x), \qquad \text{for } n \text{ odd},\ n = 2s+1. \tag{12.230}$$
[FIG. 12.18 Second Legendre function, $Q_n(x)$, $x > 1$]
This choice of normalizing factors forces $Q_n$ to satisfy the same recurrence
relations as $P_n$. This may be verified by substituting Eqs. 12.229 and 12.230 into
Eqs. 12.17 and 12.26. Inspection of the (series) recurrence relations (Eqs. 12.221
and 12.223), that is, by the Cauchy ratio test, shows that $Q_n(x)$ will converge for
$-1 < x < 1$. If $|x| > 1$, these series forms of our second solution diverge. A
solution in a series of negative powers of $x$ can be developed for the region
$|x| > 1$ (Fig. 12.18), but we proceed to a closed form solution that can be used
over the entire complex plane (apart from the singular points $x = \pm 1$ and with
care on cut lines).
Closed Form Solutions
Frequently, a closed form of the second solution, $Q_n(z)$, is desirable. This may
be obtained by the method discussed in Section 8.6. We write
$$Q_n(z) = P_n(z)\left\{A_n + B_n\int^z\frac{dx}{(1-x^2)[P_n(x)]^2}\right\}, \tag{12.231}$$
in which the constant $A_n$ replaces the evaluation of the integral at the arbitrary
lower limit. Both constants, $A_n$ and $B_n$, may be determined for special cases.
For $n = 0$, Eq. 12.231 yields
$$Q_0(z) = P_0(z)\left\{A_0 + B_0\int^z\frac{dx}{(1-x^2)[P_0(x)]^2}\right\} = A_0 + B_0\,\frac{1}{2}\ln\frac{1+z}{1-z}$$
$$= A_0 + B_0\left(z + \frac{z^3}{3} + \frac{z^5}{5} + \cdots + \frac{z^{2s+1}}{2s+1} + \cdots\right), \tag{12.232}$$
the last expression following from a Maclaurin expansion of the logarithm.
Comparing this with the series solution (Eq. 12.224), for which
$$Q_0(z) = q_0(z) = z + \frac{z^3}{3} + \frac{z^5}{5} + \cdots + \frac{z^{2s+1}}{2s+1} + \cdots, \tag{12.233}$$
we have $A_0 = 0$, $B_0 = 1$. Similar results follow for $n = 1$. We obtain
$$Q_1(z) = z\left[A_1 + B_1\int^z\frac{dx}{(1-x^2)\,x^2}\right] = A_1 z + B_1 z\left(\frac{1}{2}\ln\frac{1+z}{1-z} - \frac{1}{z}\right). \tag{12.234}$$
Expanding in a power series and comparing with $Q_1(z) = -p_1(z)$, we have
$A_1 = 0$, $B_1 = 1$. Therefore we may write
$$Q_0(z) = \frac{1}{2}\ln\frac{1+z}{1-z}, \qquad Q_1(z) = \frac{1}{2}z\ln\frac{1+z}{1-z} - 1, \qquad |z| < 1. \tag{12.235}$$
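Perhaps the best way of determining the higher-order $Q_n(z)$ is to use the
recurrence relation (Eq. 12.17), which may be verified for both $x^2 < 1$ and for
$x^2 > 1$ by substituting in the series forms. This recurrence relation technique
yields
$$Q_2(z) = \frac{1}{2}P_2(z)\ln\frac{1+z}{1-z} - \frac{3}{2}P_1(z). \tag{12.236}$$
Repeated application of the recurrence formula leads to
$$Q_n(z) = \frac{1}{2}P_n(z)\ln\frac{1+z}{1-z} - \frac{2n-1}{1\cdot n}P_{n-1}(z) - \frac{2n-5}{3(n-1)}P_{n-3}(z) - \cdots. \tag{12.237}$$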
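The recurrence approach translates directly into a few lines of code. The sketch below (an added illustration, not the author's program) starts from the closed forms $Q_0(x) = \tfrac{1}{2}\ln\frac{1+x}{1-x}$ and $Q_1(x) = xQ_0(x) - 1$ for $-1 < x < 1$ and applies the three-term relation $(n+1)Q_{n+1} = (2n+1)xQ_n - nQ_{n-1}$, which is the standard Legendre recurrence that the text cites as Eq. 12.17. The function names are hypothetical.

```python
import math


def legendre_q(n_max, x):
    """Return [Q_0(x), ..., Q_{n_max}(x)] for -1 < x < 1 by upward recurrence."""
    if not -1.0 < x < 1.0:
        raise ValueError("x must lie strictly inside (-1, 1)")
    q0 = 0.5 * math.log((1.0 + x) / (1.0 - x))
    q = [q0, x * q0 - 1.0]
    for n in range(1, n_max):
        # (n + 1) Q_{n+1} = (2n + 1) x Q_n - n Q_{n-1}
        q.append(((2 * n + 1) * x * q[n] - n * q[n - 1]) / (n + 1))
    return q[:n_max + 1]


def legendre_p(n_max, x):
    """Return [P_0(x), ..., P_{n_max}(x)] from the same three-term recurrence."""
    p = [1.0, x]
    for n in range(1, n_max):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return p[:n_max + 1]


if __name__ == "__main__":
    x = 0.3
    q = legendre_q(4, x)
    p = legendre_p(4, x)
    closed_q2 = 0.5 * p[2] * math.log((1.0 + x) / (1.0 - x)) - 1.5 * p[1]
    print("Q_2 by recurrence:", q[2], " closed form:", closed_q2)
```

A useful independent check is the identity $n[P_n(x)Q_{n-1}(x) - P_{n-1}(x)Q_n(x)] = 1$, which follows from the recurrence together with $P_1Q_0 - P_0Q_1 = 1$.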
From the form $\ln[(1 + z)/(1 - z)]$ it will be seen that for real $z$ these
expressions hold in the range $-1 < x < 1$. If we wish to have closed forms valid outside
this range, we need only replace
$$\ln\frac{1+x}{1-x} \quad \text{by} \quad \ln\frac{z+1}{z-1}.$$
When using the latter form, valid for large $z$, we take the line interval $-1 \le x \le 1$
as a cut line. Values of $Q_n(x)$ on the cut line are customarily assigned by the
relation
$$Q_n(x) = \tfrac{1}{2}\left[Q_n(x + i0) + Q_n(x - i0)\right], \tag{12.238}$$
the arithmetic average of approaches from the positive imaginary side and from
the negative imaginary side. The reader will note that for $z \to x > 1$, $z - 1 \to
(1 - x)e^{\pm i\pi}$. The result is that for all $z$, except on the real axis $-1 \le x \le 1$,
we have
$$Q_0(z) = \frac{1}{2}\ln\frac{z+1}{z-1}, \tag{12.239}$$
$$Q_1(z) = \frac{1}{2}z\ln\frac{z+1}{z-1} - 1, \tag{12.240}$$
and so on.
For convenient reference some special values of $Q_n(z)$ are given.
1. $Q_n(1) = \infty$, from the logarithmic term (Eq. 12.237).
2. $Q_n(\infty) = 0$. This is best obtained from a representation of $Q_n(x)$ as a
series of negative powers of $x$, Exercise 12.10.4.
3. $Q_n(-z) = (-1)^{n+1}Q_n(z)$. This follows from the series form. It may also be
derived by using $Q_0(z)$, $Q_1(z)$, and the recurrence relation (Eq. 12.17).
4. $Q_n(0) = 0$, for $n$ even, by (3).
5. $Q_n(0) = (-1)^{(n+1)/2}\dfrac{\{[(n-1)/2]!\}^2}{n!}\,2^{n-1} = (-1)^{s+1}\dfrac{(2s)!!}{(2s+1)!!}$, for $n$ odd, $n = 2s + 1$.
This last result comes from the series form (Eq. 12.230) with $p_n(0) = 1$.
EXERCISES
12.10.1 Derive the parity relation for Qn(x).
12.10.2 From Eqs. 12.226 and 12.227 show that
Bл+2s- 1)!
(а) Р2,.(*)=Ут1(-1I
Z s=0
Bs)\(n
Bn + 2s+ 1)!
Z
s=0
Check the normalization by showing that one term of each series agrees with
the corresponding term of Eq. 12.8.
12.10.3 Show that
Q2n(x)-(-D2 ^(-
2s+l
Bs + 1)!Bh-2s)!
22" У (и + 5)!B5"
(b) С2,+1(х) = (-1Г122(-1
, 22П+, у (w+s)!Bs-2w-2)! 2,
Jii B5IE-И-1)! '
12.10.4 (a) Starting with the assumed form
Qn(x) = £ Ь_лхк-\
show that
2,
(b) The standard choice of b0 is
2"{n\J
u Bи + 1)!
Show that this choice of b0 brings this negative power series form of Qn(x)
into agreement with the closed form solutions.
12.10.5 Verify that the Legendre functions of the second kind, $Q_n(x)$, satisfy the same
recurrence relations as $P_n(x)$, both for $|x| < 1$ and for $|x| > 1$.
$$(2n+1)\,x\,Q_n(x) = (n+1)\,Q_{n+1}(x) + n\,Q_{n-1}(x),$$
12.10.6 (a) Using the recurrence relations, prove (independently of the Wronskian
relation) that
$$n\left[P_n(x)Q_{n-1}(x) - P_{n-1}(x)Q_n(x)\right] = P_1(x)Q_0(x) - P_0(x)Q_1(x).$$
(b) By direct substitution show that the right-hand side of this equation
equals 1.
12.10.7 (a) Write a subroutine that will generate Qn(x) and lower index Q's based on
the recurrence relation for these Legendre functions of the second kind.
Take x to be within (— 1,1)—excluding the end-points.
Hint. Take $Q_0(x)$ and $Q_1(x)$ to be known.
(b) Test your subroutine for accuracy by computing $Q_{10}(x)$ and comparing
with the values tabulated in AMS-55 (Chapter 8).
12.11 VECTOR SPHERICAL HARMONICS
Most of our attention in this chapter has been directed toward solving the
equations of scalar fields such as the electrostatic field. This was done primarily
because the scalar fields are easier to handle than vector fields! However, with
scalar field problems under firm control, more and more attention is being paid
to vector field problems.
Magnetic Field of a Current Loop
To illustrate the difficulties, let us consider the equation1
$$\nabla \times \nabla \times \mathbf{A} = \mu_0 \mathbf{J} \tag{12.241}$$
for the magnetic vector potential. Let us further suppose that the boundary
conditions are best expressed in spherical polar coordinates. In the example of
a current loop (Section 12.5) it was possible to handle this equation because
the form of A was highly restricted. In general, this equation will yield three
scalar equations, each involving all three components of A: $A_r$, $A_\theta$, and $A_\varphi$.
Such coupled differential equations can be solved, but the complexities are
formidable.
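Setting $\nabla \cdot \mathbf{A} = 0$, we can convert our equation into the vector Laplacian
$\nabla^2\mathbf{A}$. This will separate into one equation for each component in cartesian
coordinates. Unfortunately, our boundary conditions (for the current loop)
are in spherical coordinates. To satisfy them we would still have to mix the
cartesian components $A_x$, $A_y$, and $A_z$ in a form that would probably be both
awkward and difficult to handle.
To facilitate the solution of Eq. 12.241 and other equations, such as the
vector Helmholtz and the vector wave equation, we have used various
combinations of the (scalar) spherical harmonics to construct vectors in spherical
polar coordinates. One set, useful in quantum mechanics, has been described
by Hill.2 His three vector spherical harmonics are
$$\mathbf{V}_{LM} = \mathbf{r}_0\left[-\left(\frac{L+1}{2L+1}\right)^{1/2} Y_L^M\right] + \boldsymbol{\theta}_0\left[\frac{1}{[(L+1)(2L+1)]^{1/2}}\frac{\partial Y_L^M}{\partial\theta}\right] + \boldsymbol{\varphi}_0\left[\frac{iM}{[(L+1)(2L+1)]^{1/2}\sin\theta}\,Y_L^M\right], \tag{12.242}$$
$$\mathbf{W}_{LM} = \mathbf{r}_0\left[\left(\frac{L}{2L+1}\right)^{1/2} Y_L^M\right] + \boldsymbol{\theta}_0\left[\frac{1}{[L(2L+1)]^{1/2}}\frac{\partial Y_L^M}{\partial\theta}\right] + \boldsymbol{\varphi}_0\left[\frac{iM}{[L(2L+1)]^{1/2}\sin\theta}\,Y_L^M\right], \tag{12.243}$$
$$\mathbf{X}_{LM} = \boldsymbol{\theta}_0\left[\frac{-M}{[L(L+1)]^{1/2}\sin\theta}\,Y_L^M\right] + \boldsymbol{\varphi}_0\left[\frac{-i}{[L(L+1)]^{1/2}}\frac{\partial Y_L^M}{\partial\theta}\right]. \tag{12.244}$$
These functions satisfy a general orthogonality relation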
1 Compare Exercise 1.14.5 for a derivation from Maxwell's equations.
2 E. H. Hill, "Theory of Vector Spherical Harmonics," Am. J. Phys. 22, 211
(1954); also J. M. Blatt and V. Weisskopf, Theoretical Nuclear Physics.
New York: Wiley (1952). Note that Hill assigns phases in accordance with
the Condon-Shortley phase convention (Section 12.6).
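$$\int \mathbf{A}_{LM}\cdot\mathbf{B}^*_{L'M'}\,d\Omega = \delta_{AB}\,\delta_{LL'}\,\delta_{MM'}, \tag{12.245}$$
where A and B may be V, X, or W. This may be verified by using the definitions
of V, X, and W and reducing the integral to one of ordinary orthonormal
spherical harmonics, $Y_L^M(\theta, \varphi)$.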
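Under the parity operations (coordinate inversion) the vector spherical
harmonics transform as
$$\mathbf{V}_{LM}(\theta', \varphi') = (-1)^{L+1}\,\mathbf{V}_{LM}(\theta, \varphi),$$
$$\mathbf{W}_{LM}(\theta', \varphi') = (-1)^{L+1}\,\mathbf{W}_{LM}(\theta, \varphi), \tag{12.246}$$
$$\mathbf{X}_{LM}(\theta', \varphi') = (-1)^{L}\,\mathbf{X}_{LM}(\theta, \varphi),$$
where
$$\theta' = \pi - \theta, \qquad \varphi' = \pi + \varphi. \tag{12.247}$$
In verifying these relations, the reader should remember that the spherical polar
coordinate unit vectors $\mathbf{r}_0$ and $\boldsymbol{\varphi}_0$ are odd and $\boldsymbol{\theta}_0$ is even. These properties may
be verified by expressing the unit vectors $\mathbf{r}_0$, $\boldsymbol{\theta}_0$, and $\boldsymbol{\varphi}_0$ in terms of the cartesian
unit vectors i, j, and k and spherical polar coordinates.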
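To demonstrate the use of the vector spherical harmonics, consider Eq. 12.241
again. From Hill's table of differential relations
$$\nabla\cdot\left[F(r)\mathbf{V}_{LM}(\theta,\varphi)\right] = -\left(\frac{L+1}{2L+1}\right)^{1/2}\left[\frac{dF}{dr} + \frac{L+2}{r}F\right]Y_L^M(\theta,\varphi), \tag{12.248}$$
$$\nabla\cdot\left[F(r)\mathbf{W}_{LM}(\theta,\varphi)\right] = \left(\frac{L}{2L+1}\right)^{1/2}\left[\frac{dF}{dr} - \frac{L-1}{r}F\right]Y_L^M(\theta,\varphi), \tag{12.249}$$
$$\nabla\cdot\left[F(r)\mathbf{X}_{LM}(\theta,\varphi)\right] = 0. \tag{12.250}$$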
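The condition
$$\nabla\cdot\mathbf{A} = 0 \tag{12.251}$$
eliminates $\mathbf{V}_{LM}$ and $\mathbf{W}_{LM}$, leaving only $\mathbf{X}_{LM}$. In the absence of current (J = 0),
that is, away from the current loop, Eq. 12.241, subject to Eq. 12.251, becomes
$$\nabla^2\mathbf{A} = 0. \tag{12.252}$$
Using another Hill differential relation with $\mathbf{A}_{LM} = R(r)\mathbf{X}_{LM}(\theta, \varphi)$, we obtain
$$\nabla^2\left[R(r)\mathbf{X}_{LM}(\theta,\varphi)\right] = \left[\frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} - \frac{L(L+1)}{r^2}R\right]\mathbf{X}_{LM} = 0, \tag{12.253}$$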
in agreement with our Eq. 12.113. We have
$$\mathbf{A}_{LM} = a_{LM}\,r^{-L-1}\,\mathbf{X}_{LM}(\theta, \varphi). \tag{12.254}$$
We note that there can be no azimuthal dependence because of the symmetry
of our loop, M = 0, and our solution reduces to
$$\mathbf{A}_{L0} = a_L\,r^{-L-1}\,\mathbf{X}_{L0}(\theta, \varphi). \tag{12.255}$$
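This is equivalent to Eq. 12.116. The constants $a_L$ are determined by fitting
boundary conditions, as done in Section 12.5 for $c_n$. The magnetic field may be
found from
$$\nabla\times\left[F(r)\mathbf{X}_{LM}\right] = i\left(\frac{L}{2L+1}\right)^{1/2}\left[\frac{dF}{dr} - \frac{L}{r}F\right]\mathbf{V}_{LM} + i\left(\frac{L+1}{2L+1}\right)^{1/2}\left[\frac{dF}{dr} + \frac{L+1}{r}F\right]\mathbf{W}_{LM}, \tag{12.256}$$
which corresponds to Eq. 12.119. [Here $F(r) = a_L r^{-L-1}$.]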
The definitions of the vector spherical harmonics given here are dictated
by convenience, primarily in quantum mechanical calculations, in which the
angular momentum is a significant parameter. Morse and Feshbach describe
another set of vector spherical harmonics, В, С, and P, in which the radial
dependence is entirely in P and the angular dependence entirely in В and C.
This set offers advantages in treating the wave equation when we want to
separate the longitudinal and transverse parts of the wave.
Further examples of the usefulness and power of the vector spherical
harmonics will be found in Blatt and Weisskopf, in Morse and Feshbach, and in
Jackson's Classical Electrodynamics, which uses vector spherical harmonics in
a description of multipole radiation and related electromagnetic problems.
Vector spherical harmonics may be developed as the result of coupling L
units of orbital angular momentum and 1 unit of spin angular momentum.
An extension, coupling L units of orbital angular momentum and 2 units of
spin angular momentum to form tensor spherical harmonics, is presented by
Mathews.2 The major application of tensor spherical harmonics is in the
investigation of gravitational radiation.
EXERCISES
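12.11.1 Construct the l = 0, m = 0 and l = 1, m = 0 vector spherical harmonics.
ANS. $\mathbf{V}_{00} = -\mathbf{r}_0 (4\pi)^{-1/2}$
$\mathbf{X}_{00} = 0$
$\mathbf{V}_{10} = -\mathbf{r}_0 (2\pi)^{-1/2}\cos\theta - \boldsymbol{\theta}_0 (8\pi)^{-1/2}\sin\theta$
$\mathbf{W}_{10} = \mathbf{r}_0 (4\pi)^{-1/2}\cos\theta - \boldsymbol{\theta}_0 (4\pi)^{-1/2}\sin\theta$.
12.11.2 Verify that the parity of $\mathbf{V}_{LM}$ is $(-1)^{L+1}$, the parity of $\mathbf{X}_{LM}$ is $(-1)^L$, and that of
$\mathbf{W}_{LM}$ is $(-1)^{L+1}$. What happened to the M-dependence of the parity?
Hint. $\mathbf{r}_0$ and $\boldsymbol{\varphi}_0$ have odd parity; $\boldsymbol{\theta}_0$ has even parity (compare Exercise 2.5.8).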
2 J. Mathews, "Gravitational Multipole Radiation," in H. P. Robertson, In
Memoriam. Philadelphia: Society for Industrial and Applied Mathematics
(1963).
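12.11.3 Verify the orthonormality of the vector spherical harmonics $\mathbf{V}_{LM}$, $\mathbf{X}_{LM}$, and
$\mathbf{W}_{LM}$.
12.11.4 In Classical Electrodynamics, 2nd ed., Jackson defines $\mathbf{X}_{LM}$ by the equation
$$\mathbf{X}_{LM}(\theta, \varphi) = \frac{1}{\sqrt{L(L+1)}}\,\mathbf{L}\,Y_L^M(\theta, \varphi),$$
in which the angular momentum operator L is given by
$$\mathbf{L} = -i(\mathbf{r}\times\nabla).$$
Show that this definition agrees with Eq. 12.244.
12.11.5 Show that
$$\sum_{M=-L}^{L}\mathbf{X}^*_{LM}(\theta,\varphi)\cdot\mathbf{X}_{LM}(\theta,\varphi) = \frac{2L+1}{4\pi}.$$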
Hint. One way is to use Exercise 12.11.4 with L expanded in cartesian coordinates
using the raising and lowering operators of Section 12.7.
12.11.6 Show that
J-
The integrand represents an interference term in electromagnetic radiation
that contributes to angular distributions but not to total intensity.
REFERENCES
Hobson, E. W., The Theory of Spherical and Ellipsoidal Harmonics. New York: Chelsea
(1955).
This is a very complete reference, which is the classic text on Legendre polynomials and
all related functions.
See also the references listed at the end of Chapter 13.
13 SPECIAL FUNCTIONS
In this chapter we shall study four sets of orthogonal polynomials, Hermite,
Laguerre, and Chebyshev1 of first and second kinds. Although these four sets
are of less importance in mathematical physics than the Bessel and Legendre
functions of Chapters 11 and 12, they are used occasionally and therefore
deserve at least a little attention. Section 13.4 is devoted to important numerical
applications of Chebyshev polynomials. Because the general mathematical
techniques duplicate those of the preceding two chapters, the development of
these functions is only outlined. Detailed proofs, along the lines of Chapters 11
and 12, are left to the reader. To conclude the chapter, we express these
polynomials and other functions in terms of hypergeometric and confluent
hypergeometric functions.
13.1 HERMITE FUNCTIONS
Generating Functions—Hermite Polynomials
The Hermite polynomials (Fig. 13.1), $H_n(x)$, may be defined by the generating
function2
$$g(x, t) = e^{-t^2 + 2tx} = \sum_{n=0}^{\infty} H_n(x)\frac{t^n}{n!}. \tag{13.1}$$
Note the absence of a superscript, which distinguishes it from the unrelated
Hankel functions.
Recurrence Relations
From the generating function we find that the Hermite
polynomials satisfy the recurrence relations
$$H_{n+1}(x) = 2xH_n(x) - 2nH_{n-1}(x) \tag{13.2}$$
and
$$H_n'(x) = 2nH_{n-1}(x). \tag{13.3}$$
1 This is the spelling choice of AMS-55. However, a variety of forms such as
Tschebyscheff is encountered.
2 A derivation of this Hermite generating function is outlined in Exercise
13.1.3.
FIG. 13.1 Hermite polynomials
Equation 13.2 may be obtained by differentiating the generating function with
respect to t; differentiation with respect to x leads to Eq. 13.3.
Direct expansion of the generating function easily gives H0(x) = 1 and
H1(x) = 2x. Then Eq. 13.2 permits the construction of any Hn(x) desired
(integral n). For convenient reference the first several Hermite polynomials
are listed in Table 13.1.
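A minimal sketch of this construction follows, assuming only the starting values $H_0(x) = 1$, $H_1(x) = 2x$ and the recurrence $H_{n+1}(x) = 2xH_n(x) - 2nH_{n-1}(x)$ of Eq. 13.2. Each polynomial is stored as a list of integer coefficients, lowest power first; the function name `hermite_coefficients` is illustrative.

```python
def hermite_coefficients(n_max):
    """Return [c_0, ..., c_{n_max}], where c_n lists the coefficients of H_n(x),
    lowest power first, built from H_{n+1} = 2x H_n - 2n H_{n-1}."""
    polys = [[1], [0, 2]]                      # H_0 = 1, H_1 = 2x
    for n in range(1, n_max):
        nxt = [0] + [2 * c for c in polys[n]]  # multiply H_n by 2x
        for i, c in enumerate(polys[n - 1]):
            nxt[i] -= 2 * n * c                # subtract 2n H_{n-1}
        polys.append(nxt)
    return polys[: n_max + 1]


table = hermite_coefficients(6)
# table[4] holds the coefficients of H_4(x) = 16x^4 - 48x^2 + 12
```

Integer arithmetic keeps the coefficients exact, so the output can be compared directly with a printed table.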
Special values of the Hermite polynomials follow from the generating
function; that is,
$$H_{2n}(0) = (-1)^n\frac{(2n)!}{n!}, \tag{13.4}$$
$$H_{2n+1}(0) = 0. \tag{13.5}$$
We also obtain from the generating function the important parity relation
$$H_n(x) = (-1)^n H_n(-x). \tag{13.6}$$
Alternate Representations
Differentiation of the generating function3 n times with respect to t and then
setting t equal to zero yields
$$H_n(x) = (-1)^n e^{x^2}\frac{d^n}{dx^n}\left(e^{-x^2}\right). \tag{13.7}$$
This gives us a Rodrigues representation of $H_n(x)$. A second representation may
be obtained by using the calculus of residues (Chapter 7). If we multiply Eq. 13.1
by $t^{-m-1}$ and integrate around the origin, only the term with $H_m(x)$ will survive.
$$H_m(x) = \frac{m!}{2\pi i}\oint t^{-m-1}e^{-t^2+2tx}\,dt. \tag{13.8}$$
3 Rewrite the generating function as $g(x, t) = e^{x^2}e^{-(t-x)^2}$. Note that
$$\frac{\partial}{\partial t}e^{-(t-x)^2} = -\frac{\partial}{\partial x}e^{-(t-x)^2}.$$
TABLE 13.1 Hermite Polynomials
$H_0(x) = 1$
$H_1(x) = 2x$
$H_2(x) = 4x^2 - 2$
$H_3(x) = 8x^3 - 12x$
$H_4(x) = 16x^4 - 48x^2 + 12$
$H_5(x) = 32x^5 - 160x^3 + 120x$
$H_6(x) = 64x^6 - 480x^4 + 720x^2 - 120$
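Also, from Eq. 13.1 we may write our Hermite polynomial $H_n(x)$ in series form.
$$H_n(x) = (2x)^n - \frac{2\,n!}{(n-2)!\,2!}(2x)^{n-2} + \frac{4\,n!}{(n-4)!\,4!}(2x)^{n-4}\,1\cdot 3 - \cdots$$
$$= \sum_{s=0}^{[n/2]}(-2)^s(2x)^{n-2s}\binom{n}{2s}\,1\cdot 3\cdot 5\cdots(2s-1) = \sum_{s=0}^{[n/2]}(-1)^s(2x)^{n-2s}\frac{n!}{(n-2s)!\,s!}. \tag{13.9}$$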
This terminates for integral n and yields our Hermite polynomial.
Orthogonality
The recurrence relations (Eqs. 13.2 and 13.3) lead to the second-order linear
differential equation
$$H_n''(x) - 2xH_n'(x) + 2nH_n(x) = 0, \tag{13.10}$$
which is clearly not self-adjoint.
To put Eq. 13.10 in self-adjoint form, we multiply by $\exp(-x^2)$, Exercise 9.1.2.
This leads to the orthogonality integral
$$\int_{-\infty}^{\infty} H_m(x)H_n(x)e^{-x^2}\,dx = 0, \qquad m \neq n, \tag{13.10a}$$
with the weighting function $\exp(-x^2)$ a consequence of putting the differential
equation into self-adjoint form. The interval $(-\infty, \infty)$ is selected to satisfy the
Hermitian operator boundary conditions, Section 9.1. It is sometimes
convenient to absorb the weighting function into the Hermite polynomials. We may
define
$$\varphi_n(x) = e^{-x^2/2}H_n(x) \tag{13.11}$$
with $\varphi_n(x)$ no longer a polynomial.
Substitution into Eq. 13.10 yields the differential equation for $\varphi_n(x)$,
$$\varphi_n''(x) + (2n + 1 - x^2)\varphi_n(x) = 0. \tag{13.12}$$
This is the differential equation for a quantum mechanical, simple harmonic
oscillator, which is perhaps the most important single application of the Hermite
polynomials. Equation 13.12 is self-adjoint and the solutions $\varphi_n(x)$ are
orthogonal for the interval $(-\infty < x < \infty)$ with a unit weighting function.
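The problem of normalizing these functions remains. Proceeding as in
Section 12.3, we multiply Eq. 13.1 by itself and then by $e^{-x^2}$. This yields
$$e^{-x^2}e^{-s^2+2sx}e^{-t^2+2tx} = \sum_{m,n=0}^{\infty} e^{-x^2}H_m(x)H_n(x)\frac{s^m t^n}{m!\,n!}. \tag{13.13}$$
When we integrate over x from $-\infty$ to $+\infty$ the cross terms of the double sum
drop out because of the orthogonality property4
$$\sum_{n=0}^{\infty}\frac{(st)^n}{n!\,n!}\int_{-\infty}^{\infty}e^{-x^2}[H_n(x)]^2\,dx = \int_{-\infty}^{\infty}e^{-x^2-s^2+2sx-t^2+2tx}\,dx = \int_{-\infty}^{\infty}e^{-(x-s-t)^2}e^{2st}\,dx$$
$$= \pi^{1/2}e^{2st} = \pi^{1/2}\sum_{n=0}^{\infty}\frac{2^n(st)^n}{n!}. \tag{13.14}$$
By equating coefficients of like powers of st, we obtain
$$\int_{-\infty}^{\infty}e^{-x^2}[H_n(x)]^2\,dx = 2^n\pi^{1/2}n!. \tag{13.15}$$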
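The orthogonality and normalization integrals can be spot-checked numerically. The sketch below is illustrative only: it evaluates $\int e^{-x^2}H_m(x)H_n(x)\,dx$ with a plain trapezoidal rule on a truncated interval, which is an assumed stand-in for the Gauss-Hermite quadrature mentioned in the exercises. The names `hermite` and `weighted_overlap`, the cutoff `half_width`, and the step count are arbitrary choices.

```python
import math


def hermite(n, x):
    """Evaluate H_n(x) by the recurrence H_{k+1} = 2x H_k - 2k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h


def weighted_overlap(m, n, half_width=10.0, steps=4000):
    """Trapezoidal estimate of the integral of exp(-x^2) H_m(x) H_n(x)
    over [-half_width, half_width]; the tails beyond are negligible."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * dx
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-x * x) * hermite(m, x) * hermite(n, x)
    return total * dx


norm_3 = weighted_overlap(3, 3)    # compare with 2**3 * 3! * sqrt(pi)
cross_24 = weighted_overlap(2, 4)  # compare with zero
```

For a smooth integrand that decays this fast the trapezoidal rule converges very rapidly, so a modest number of steps suffices for low n.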
Quantum Mechanical Simple Harmonic Oscillator
As already indicated, the Hermite polynomials are used in analyzing the
quantum mechanical simple harmonic oscillator. For a potential energy
$V = \tfrac{1}{2}Kz^2 = \tfrac{1}{2}m\omega^2z^2$ (force $\mathbf{F} = -\nabla V = -Kz$), the Schrödinger wave equation
is
$$-\frac{\hbar^2}{2m}\nabla^2\Psi(z) + \frac{K}{2}z^2\Psi(z) = E\Psi(z). \tag{13.16}$$
Our oscillating particle has mass m and total energy E. By use of the
abbreviations
$$x = \alpha z \quad\text{with}\quad \alpha^4 = \frac{mK}{\hbar^2} = \frac{m^2\omega^2}{\hbar^2}, \qquad \lambda = \frac{2E}{\hbar}\left(\frac{m}{K}\right)^{1/2} = \frac{2E}{\hbar\omega}, \tag{13.17}$$
in which $\omega$ is the angular frequency of the corresponding classical oscillator,
Eq. 13.16 becomes [with $\Psi(z) = \Psi(x/\alpha) = \psi(x)$]
$$\frac{d^2\psi(x)}{dx^2} + (\lambda - x^2)\psi(x) = 0. \tag{13.18}$$
4 The cross terms ($m \neq n$) may be left in, if desired. Then, when the coefficients
of $s^\alpha t^\beta$ are equated, the orthogonality will be apparent.
FIG. 13.2 Quantum mechanical oscillator wave functions: the heavy bar on the
x-axis indicates the allowed range of the classical oscillator with the same total
energy
This is Eq. 13.12 with $\lambda = 2n + 1$. Hence (Fig. 13.2),
$$\psi_n(x) = 2^{-n/2}\pi^{-1/4}(n!)^{-1/2}e^{-x^2/2}H_n(x) \quad\text{(normalized)}. \tag{13.19}$$
The requirement that n be an integer is dictated by the boundary conditions of
the quantum mechanical system,
$$\lim_{z\to\pm\infty}\Psi(z) = 0.$$
Specifically, if $n \to \nu$, not an integer, a power-series solution of Eq. 13.10 (Exercise
8.5.6) shows that $H_\nu(x)$ will behave as $x^\nu e^{x^2}$ for large x. The functions $\psi_\nu(x)$ and
$\Psi_\nu(z)$ will therefore blow up at infinity, and it will be impossible to normalize
the wave function $\Psi(z)$. With this requirement, energy E becomes
$$E = \left(n + \tfrac{1}{2}\right)\hbar\omega. \tag{13.20}$$
As n ranges over integral values ($n \geq 0$), we see that the energy is quantized and
that there is a minimum or zero point energy
$$E_{\min} = \tfrac{1}{2}\hbar\omega. \tag{13.21}$$
This zero point energy is an aspect of the uncertainty principle, a purely quantum
phenomenon.
Raising and Lowering Operators
An alternate treatment of the quantum mechanical oscillator found in many
quantum mechanics texts employs raising and lowering operators:
$$\frac{1}{\sqrt{2}}\left(x - \frac{d}{dx}\right)\psi_n(x) = (n+1)^{1/2}\psi_{n+1}(x), \tag{13.22a}$$
$$\frac{1}{\sqrt{2}}\left(x + \frac{d}{dx}\right)\psi_n(x) = n^{1/2}\psi_{n-1}(x). \tag{13.22b}$$
Often in quantum mechanics the raising operator is labeled a creation operator,
$a^\dagger$, and the lowering operator an annihilation operator, a. The wave function $\psi_n$
(actually given by Eq. 13.19) is unknown. The development is similar to the use
of the raising and lowering operators presented in Section 12.7. The minimum
energy or ground state wave function, $\psi_0$, satisfies the equation
$$\frac{1}{\sqrt{2}}\left(x + \frac{d}{dx}\right)\psi_0(x) = 0. \tag{13.23}$$
Normalized to unity,
$$\psi_0(x) = \pi^{-1/4}e^{-x^2/2}, \tag{13.23a}$$
in agreement with Eq. 13.19. The excited state wave functions, $\psi_1$, $\psi_2$, and so on,
are then generated by the raising operator, Eq. 13.22a. The verification of these
raising and lowering operators, Eqs. 13.22a and 13.22b, is left as Exercise 13.1.16.
In quantum mechanical problems, particularly in molecular spectroscopy,
a number of integrals of the form
$$\int_{-\infty}^{\infty}x^r e^{-x^2}H_n(x)H_m(x)\,dx$$
are needed. Examples for r = 1 and r = 2 (with n = m) are included in the
exercises at the end of this section. A large number of other examples are
contained in Wilson, Decius, and Cross.5
The oscillator potential has also been employed extensively in calculations
of nuclear structure (nuclear shell model).
There is a second independent solution of Eq. 13.10. This Hermite function
of the second kind is an infinite series (Sections 8.5, 8.6) and of no physical
interest, at least not yet.
EXERCISES
13.1.1 Assume the Hermite polynomials are known as solutions of the differential
equation (13.10) and from this the recurrence relation, Eq. 13.3, and the values
of $H_n(0)$ are also known.
(a) Assume the existence of a generating function
$$g(x, t) = \sum_{n=0}^{\infty}H_n(x)\frac{t^n}{n!}.$$
5 E. B. Wilson, Jr., J. C. Decius, and P. C. Cross, Molecular Vibrations.
New York: McGraw-Hill (1955).
(b) Differentiate g(x, t) with respect to x and using the recurrence relation
develop a first-order differential equation for g(x, t).
(c) Integrate with respect to x, holding t fixed.
(d) Evaluate $g(0, t)$ using Eqs. 13.4 and 13.5. Finally, show that
g(x,t) = exp(-t2 +2tx).
13.1.2 In developing the properties of the Hermite polynomials, you could start at a
number of different points such as:
1. Hermite differential equations, Eq. 13.10,
2. Rodrigues' formula, Eq. 13.7,
3. Integral representation, Eq. 13.8,
4. Generating function, Eq. 13.1,
5. Gram-Schmidt construction of a complete set of orthogonal
polynomials over $(-\infty, \infty)$ with a weighting factor of
$\exp(-x^2)$, Section 9.3.
Outline how you can go from any one of these starting points to all the other
points.
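13.1.3 From the generating function show that
$$H_n(x) = \sum_{s=0}^{[n/2]}(-1)^s\frac{n!}{(n-2s)!\,s!}(2x)^{n-2s}.$$
13.1.4 From the generating function derive the recurrence relations
$$H_{n+1}(x) = 2xH_n(x) - 2nH_{n-1}(x),$$
$$H_n'(x) = 2nH_{n-1}(x).$$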
13.1.5 Prove that
Hint. Check out the first couple of examples and then use mathematical
induction.
13.1.6 Prove that
\Hn(x)\ < \Hn(ix)\.
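13.1.7 Rewrite the series form of $H_n(x)$, Eq. 13.9, as an ascending power series.
ANS. $H_{2n}(x) = (-1)^n\sum_{s=0}^{n}(-1)^s(2x)^{2s}\dfrac{(2n)!}{(2s)!\,(n-s)!}$,
$H_{2n+1}(x) = (-1)^n\sum_{s=0}^{n}(-1)^s(2x)^{2s+1}\dfrac{(2n+1)!}{(2s+1)!\,(n-s)!}$.
13.1.8 (a) Expand $x^{2r}$ in a series of even order Hermite polynomials.
(b) Expand $x^{2r+1}$ in a series of odd order Hermite polynomials.
ANS. (a) $x^{2r} = \dfrac{(2r)!}{2^{2r}}\sum_{n=0}^{r}\dfrac{H_{2n}(x)}{(2n)!\,(r-n)!}$,
(b) $x^{2r+1} = \dfrac{(2r+1)!}{2^{2r+1}}\sum_{n=0}^{r}\dfrac{H_{2n+1}(x)}{(2n+1)!\,(r-n)!}$.
Hint. Use a Rodrigues representation of $H_{2n}(x)$ and integrate by parts.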
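13.1.9 Show that
(a) $\displaystyle\int_{-\infty}^{\infty}H_n(x)\exp[-x^2/2]\,dx = \begin{cases}(2\pi)^{1/2}\,n!/(n/2)!, & n \text{ even}\\ 0, & n \text{ odd.}\end{cases}$
(b) $\displaystyle\int_{-\infty}^{\infty}xH_n(x)\exp[-x^2/2]\,dx = \begin{cases}0, & n \text{ even}\\ (2\pi)^{1/2}\dfrac{(n+1)!}{((n+1)/2)!}, & n \text{ odd.}\end{cases}$
13.1.10 Show that
$$\int_{-\infty}^{\infty}x^m e^{-x^2}H_n(x)\,dx = 0 \quad\text{for } m \text{ an integer, } 0 \leq m \leq n - 1.$$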
13.1.11 The transition probability between two oscillator states, m and n, depends on
$$\int_{-\infty}^{\infty}xe^{-x^2}H_n(x)H_m(x)\,dx.$$
Show that this integral equals $\pi^{1/2}2^{n-1}n!\,\delta_{m,n-1} + \pi^{1/2}2^n(n+1)!\,\delta_{m,n+1}$.
This result shows that such transitions can occur only between states of adjacent
energy levels, $m = n \pm 1$.
Hint. Multiply the generating function (Eq. 13.1) by itself using two different sets
of variables (x, s) and (x, t). Alternatively, the factor x may be eliminated by the
recurrence relation Eq. 13.2.
13.1.12 Show that
$$\int_{-\infty}^{\infty}x^2e^{-x^2}H_n(x)H_n(x)\,dx = \pi^{1/2}2^n n!\left(n + \tfrac{1}{2}\right).$$
This integral occurs in the calculation of the mean-square displacement of our
quantum oscillator.
Hint. Use the recurrence relation Eq. 13.2 and the orthogonality integral.
13.1.13 Evaluate
$$\int_{-\infty}^{\infty}x^2\exp[-x^2]H_n(x)H_m(x)\,dx$$
in terms of n and m and appropriate Kronecker delta functions.
ANS. $2^{n-1}\pi^{1/2}(2n+1)\,n!\,\delta_{nm} + 2^n\pi^{1/2}(n+2)!\,\delta_{n+2,m} + 2^{n-2}\pi^{1/2}n!\,\delta_{n-2,m}$.
13.1.14 Show that
$$\int_{-\infty}^{\infty}x^r e^{-x^2}H_n(x)H_{n+p}(x)\,dx = \begin{cases}0, & p > r\\ 2^n\pi^{1/2}(n+r)!, & p = r.\end{cases}$$
n, p, and r are nonnegative integers.
Hint. Use the recurrence relation, Eq. 13.2, p times.
13.1.15 (a) Using the Cauchy integral formula, develop an integral representation of
$H_n(x)$ based on Eq. 13.1 with the contour enclosing the point $z = -x$.
ANS. $H_n(x) = \dfrac{n!}{2\pi i}\,e^{x^2}\displaystyle\oint\frac{e^{-z^2}}{(z+x)^{n+1}}\,dz$.
(b) Show by direct substitution that this result satisfies the Hermite equation.
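13.1.16 With
$$\psi_n(x) = e^{-x^2/2}\frac{H_n(x)}{(2^n n!\,\pi^{1/2})^{1/2}},$$
verify that
$$a\psi_n(x) = \frac{1}{\sqrt{2}}\left(x + \frac{d}{dx}\right)\psi_n(x) = n^{1/2}\psi_{n-1}(x),$$
$$a^\dagger\psi_n(x) = \frac{1}{\sqrt{2}}\left(x - \frac{d}{dx}\right)\psi_n(x) = (n+1)^{1/2}\psi_{n+1}(x).$$
Note. The usual quantum mechanical operator approach establishes these
raising and lowering properties before the form of $\psi_n(x)$ is known.
13.1.17 (a) Verify the operator identity
$$x - \frac{d}{dx} = -\exp[x^2/2]\frac{d}{dx}\exp[-x^2/2].$$
(b) The normalized simple harmonic oscillator wave function is
$$\psi_n(x) = (\pi^{1/2}2^n n!)^{-1/2}\exp[-x^2/2]\,H_n(x).$$
Show that this may be written as
$$\psi_n(x) = (\pi^{1/2}2^n n!)^{-1/2}\left(x - \frac{d}{dx}\right)^n\exp[-x^2/2].$$
Note. This corresponds to an n-fold application of the raising operator of
Exercise 13.1.16.
13.1.18 (a) Show that the simple oscillator Hamiltonian (from Eq. 13.18) may be
written as
$$H = -\frac{1}{2}\frac{d^2}{dx^2} + \frac{1}{2}x^2 = \frac{1}{2}\left(aa^\dagger + a^\dagger a\right).$$
Hint. Express E in units of $\hbar\omega$.
(b) Using the creation-annihilation operator formulation of part (a), show
that
$$H\psi_n(x) = \left(n + \tfrac{1}{2}\right)\psi_n(x).$$
This means the energy eigenvalues are $E = (n + \tfrac{1}{2})(\hbar\omega)$, in agreement with
Eq. 13.20.
13.1.19 Write a program that will generate the coefficients $a_s$ in the polynomial form
of the Hermite polynomial, $H_n(x) = \sum_{s=0}^{n}a_s x^s$.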
13.1.20 A function f(x) is expanded in an Hermite series:
$$f(x) = \sum_{n=0}^{\infty}a_n H_n(x).$$
From the orthogonality and normalization of the Hermite polynomials the
coefficient $a_n$ is given by
$$a_n = \frac{1}{2^n\pi^{1/2}n!}\int_{-\infty}^{\infty}f(x)H_n(x)e^{-x^2}\,dx.$$
For $f(x) = x^8$ determine the Hermite coefficients $a_n$ by the Gauss-Hermite
quadrature (Appendix 2). Check your coefficients against AMS-55, Table 22.12.
13.1.21 (a) In analogy with Exercise 12.2.13 set up the matrix of even Hermite
polynomial coefficients that will transform an even Hermite series into an even
power series:
$$\mathsf{B} = \begin{pmatrix}1 & -2 & 12\\ 0 & 4 & -48\\ 0 & 0 & 16\end{pmatrix}.$$
Extend B to handle an even polynomial series through $H_8(x)$.
(b) Invert your matrix to obtain matrix A which will transform an even power
series (through $x^8$) into a series of even Hermite polynomials. Check the
elements of A against those listed in AMS-55 (Table 22.12).
(c) Finally, using matrix multiplication, determine the Hermite series
equivalent to $f(x) = x^8$.
13.1.22 Write a subroutine that will transform a finite power series $\sum_{n=0}^{N}a_n x^n$ into an
Hermite series $\sum_{n=0}^{N}b_n H_n(x)$. Use the recurrence relation Eq. 13.2 and follow
the technique outlined in Section 13.4 for a Chebyshev series.
Note. Both Exercises 13.1.21 and 13.1.22 are faster and more accurate than the
Gaussian quadrature, Exercise 13.1.20, if f(x) is available as a power series.
13.1.23 Write a subroutine for evaluating Hermite polynomial matrix elements of the
form
$$M_{pqr} = \int_{-\infty}^{\infty}H_p(x)H_q(x)\,x^r e^{-x^2}\,dx,$$
using the 10-point Gauss-Hermite quadrature (for p + q + r < 19). Include a
parity check and set equal to zero the integrals with odd parity integrand. Also,
check to see if r is in the range \p — q\ < r < p + q. Otherwise Mpqr = 0. Check
your results against the specific cases listed in Exercises 13.1.11, 13.1.12, 13.1.13
and 13.1.14.
13.1.24 Calculate and tabulate the normalized linear oscillator wave functions
$\psi_n(x) = 2^{-n/2}\pi^{-1/4}(n!)^{-1/2}H_n(x)\exp(-x^2/2)$ for x = 0.0(0.1)5.0
and n = 0(1)5. If a plotting routine is available, plot your results.
13.2 LAGUERRE FUNCTIONS
Differential Equation—Laguerre Polynomials
If we start with the appropriate generating function, it is possible to develop
the Laguerre polynomials in exact analogy with the Hermite polynomials.
Alternatively, a series solution may be developed by the methods of Section 8.5.
Instead, to illustrate a different technique, let us start with Laguerre's differential
equation and obtain a solution in the form of a contour integral, as we did with
the modified Bessel function $K_\nu(x)$ (Section 11.6). From this integral
representation a generating function will be derived.
Laguerre's differential equation is
$$xy''(x) + (1 - x)y'(x) + ny(x) = 0. \tag{13.24}$$
FIG. 13.3 Laguerre function contour
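We shall attempt to represent y, or rather $y_n$, since y will depend on n, by the
contour integral
$$y_n(x) = \frac{1}{2\pi i}\oint\frac{e^{-xz/(1-z)}}{(1-z)\,z^{n+1}}\,dz. \tag{13.25a}$$
The contour includes the origin but does not enclose the point z = 1. From
Section 6.4
$$y_n'(x) = -\frac{1}{2\pi i}\oint\frac{e^{-xz/(1-z)}}{(1-z)^2z^n}\,dz, \tag{13.25b}$$
$$y_n''(x) = \frac{1}{2\pi i}\oint\frac{e^{-xz/(1-z)}}{(1-z)^3z^{n-1}}\,dz. \tag{13.25c}$$
Substituting into the left-hand side of Eq. 13.24, we obtain
$$\frac{1}{2\pi i}\oint\left[\frac{x}{(1-z)^3z^{n-1}} - \frac{1-x}{(1-z)^2z^n} + \frac{n}{(1-z)z^{n+1}}\right]e^{-xz/(1-z)}\,dz,$$
which is equal to
$$-\frac{1}{2\pi i}\oint\frac{d}{dz}\left[\frac{e^{-xz/(1-z)}}{(1-z)z^n}\right]dz. \tag{13.26}$$
If we integrate our perfect differential around a contour chosen so that the final
value equals the initial value (Fig. 13.3), the integral will vanish, thus verifying
that $y_n(x)$ (Eq. 13.25a) is a solution of Laguerre's equation.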
It has become customary to define $L_n(x)$, the Laguerre polynomial (Fig. 13.4),
by1
$$L_n(x) = \frac{1}{2\pi i}\oint\frac{e^{-xz/(1-z)}}{(1-z)\,z^{n+1}}\,dz. \tag{13.27}$$
1 Other definitions of $L_n(x)$ are in use. The definitions here of the Laguerre
polynomial $L_n(x)$ and the associated Laguerre polynomial $L_n^k(x)$ agree with
AMS-55 (Chapter 22).
FIG. 13.4 Laguerre polynomials
This is exactly what we would obtain from the series
$$g(x, z) = \frac{e^{-xz/(1-z)}}{1-z} = \sum_{n=0}^{\infty}L_n(x)z^n, \qquad |z| < 1, \tag{13.28}$$
if we multiplied by $z^{-n-1}$ and integrated around the origin. As in the development
of the calculus of residues (Section 7.2), only the $z^{-1}$ term in the series survives.
On this basis we identify g(x, z) as the generating function for the Laguerre
polynomials.
With the transformation
$$\frac{xz}{1-z} = s - x \quad\text{or}\quad z = \frac{s-x}{s}, \tag{13.29}$$
$$L_n(x) = \frac{e^x}{2\pi i}\oint\frac{s^n e^{-s}}{(s-x)^{n+1}}\,ds, \tag{13.30}$$
the new contour enclosing the point s = x in the s-plane. By Cauchy's integral
formula (for derivatives)
$$L_n(x) = \frac{e^x}{n!}\frac{d^n}{dx^n}\left(x^n e^{-x}\right) \quad\text{(integral } n\text{)}, \tag{13.31}$$
giving Rodrigues' formula for Laguerre polynomials. From these
representations of $L_n(x)$ we find the series form (for integral n):
$$L_n(x) = \frac{(-1)^n}{n!}\left[x^n - \frac{n^2}{1!}x^{n-1} + \frac{n^2(n-1)^2}{2!}x^{n-2} - \cdots + (-1)^n n!\right]$$
$$= \sum_{m=0}^{n}(-1)^m\frac{n!}{(n-m)!\,m!\,m!}x^m = \sum_{s=0}^{n}(-1)^{n-s}\frac{n!}{(n-s)!\,(n-s)!\,s!}x^{n-s} \tag{13.32}$$
and the specific polynomials listed in Table 13.2 (Exercise 13.2.1).
TABLE 13.2 Laguerre Polynomials
$L_0(x) = 1$
$L_1(x) = -x + 1$
$2!\,L_2(x) = x^2 - 4x + 2$
$3!\,L_3(x) = -x^3 + 9x^2 - 18x + 6$
$4!\,L_4(x) = x^4 - 16x^3 + 72x^2 - 96x + 24$
$5!\,L_5(x) = -x^5 + 25x^4 - 200x^3 + 600x^2 - 600x + 120$
$6!\,L_6(x) = x^6 - 36x^5 + 450x^4 - 2400x^3 + 5400x^2 - 4320x + 720$
By differentiating the generating function in Eq. 13.28, with respect to x and
z, we obtain the recurrence relations
$$(n + 1)L_{n+1}(x) = (2n + 1 - x)L_n(x) - nL_{n-1}(x), \tag{13.33}$$
$$xL_n'(x) = nL_n(x) - nL_{n-1}(x). \tag{13.34}$$
Equation 13.33, modified to read
$$L_{n+1}(x) = 2L_n(x) - L_{n-1}(x) - \frac{(1 + x)L_n(x) - L_{n-1}(x)}{n + 1}, \tag{13.33a}$$
for reasons of economy and numerical stability, is used for machine
computation of numerical values of $L_n(x)$. The computing machine starts with known
numerical values of $L_0(x)$ and $L_1(x)$, Table 13.2, and works up step by step, in
milliseconds. This is the same technique discussed for computing Legendre
polynomials, Section 12.2.
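The upward recurrence can be sketched in a few lines. This is a minimal illustration, assuming the starting values $L_0(x) = 1$, $L_1(x) = 1 - x$ and the rearranged recurrence $L_{n+1} = 2L_n - L_{n-1} - [(1+x)L_n - L_{n-1}]/(n+1)$ of Eq. 13.33a; the function name `laguerre` is a placeholder.

```python
def laguerre(n_max, x):
    """Return [L_0(x), ..., L_{n_max}(x)] by upward recurrence (Eq. 13.33a)."""
    values = [1.0, 1.0 - x]                     # L_0 and L_1
    for n in range(1, n_max):
        l_n, l_prev = values[n], values[n - 1]
        values.append(2.0 * l_n - l_prev - ((1.0 + x) * l_n - l_prev) / (n + 1))
    return values[: n_max + 1]


values_at_two = laguerre(4, 2.0)
# Compare with the explicit polynomials, e.g. 2! L_2(x) = x^2 - 4x + 2.
```

The rearranged form needs one division per step and no multiplication by a factor that grows with n.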
Also, from Eq. 13.28 we find the special value
$$L_n(0) = 1. \tag{13.35}$$
As may be seen from the form of the generating function, the form of Laguerre's
differential equation, or from Table 13.2, the Laguerre polynomials have neither
odd nor even symmetry (parity).
The Laguerre differential equation is not self-adjoint and the Laguerre
polynomials, $L_n(x)$, do not by themselves form an orthogonal set. However,
following the method of Section 9.1, we may multiply Eq. 13.24 by $e^{-x}$ (Exercise 9.1.1)
and obtain
$$\int_0^{\infty}e^{-x}L_m(x)L_n(x)\,dx = \delta_{m,n}. \tag{13.36}$$
This orthogonality is a consequence of the Sturm-Liouville theory, Section 9.1.
The normalization follows from the generating function. It is sometimes
convenient to define orthogonalized Laguerre functions (with unit weighting
function) by
$$\varphi_n(x) = e^{-x/2}L_n(x). \tag{13.37}$$
Our new orthonormal function $\varphi_n(x)$ satisfies the differential equation
$$x\varphi_n''(x) + \varphi_n'(x) + \left(n + \frac{1}{2} - \frac{x}{4}\right)\varphi_n(x) = 0, \tag{13.38}$$
which is seen to have the Sturm-Liouville form (self-adjoint). Note that it is
the boundary conditions in the Sturm-Liouville theory that fix our interval as
$(0 \leq x < \infty)$.
Associated Laguerre Polynomials
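In many applications, particularly in quantum theory, we need the associated
Laguerre polynomials defined by2
$$L_n^k(x) = (-1)^k\frac{d^k}{dx^k}\left[L_{n+k}(x)\right]. \tag{13.39}$$
From the series form of $L_n(x)$
$$L_0^k(x) = 1,$$
$$L_1^k(x) = -x + k + 1. \tag{13.40}$$
In general,
$$L_n^k(x) = \sum_{m=0}^{n}(-1)^m\frac{(n+k)!}{(n-m)!\,(k+m)!\,m!}x^m, \qquad k > -1. \tag{13.41}$$
A generating function may be developed by differentiating the Laguerre
generating function k times. Adjusting the index to $L_{n+k}$, we obtain
$$\frac{e^{-xz/(1-z)}}{(1-z)^{k+1}} = \sum_{n=0}^{\infty}L_n^k(x)z^n, \qquad |z| < 1. \tag{13.42}$$
From this
$$L_n^k(0) = \frac{(n+k)!}{n!\,k!}. \tag{13.43}$$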
Recurrence relations can easily be derived from the generating function or by
differentiating the Laguerre polynomial recurrence relations. Among the
numerous possibilities are
$$(n + 1)L_{n+1}^k(x) = (2n + k + 1 - x)L_n^k(x) - (n + k)L_{n-1}^k(x), \tag{13.44}$$
$$x\,L_n^{k\prime}(x) = nL_n^k(x) - (n + k)L_{n-1}^k(x). \tag{13.45}$$
From these or from differentiating Laguerre's differential equation k times we
have the associated Laguerre equation
$$x\,L_n^{k\prime\prime}(x) + (k + 1 - x)L_n^{k\prime}(x) + nL_n^k(x) = 0. \tag{13.46}$$
When associated Laguerre polynomials appear in a physical problem it is
usually because that physical problem involves Eq. 13.46.
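The series form and the three-term recurrence can be cross-checked against each other. The sketch below is illustrative and assumes nonnegative integer k, the explicit sum $L_n^k(x) = \sum_m (-1)^m (n+k)!\,x^m/[(n-m)!\,(k+m)!\,m!]$, the recurrence $(n+1)L_{n+1}^k = (2n+k+1-x)L_n^k - (n+k)L_{n-1}^k$, and the starting values $L_0^k = 1$, $L_1^k = k + 1 - x$. Both function names are hypothetical.

```python
import math


def assoc_laguerre_series(n, k, x):
    """L_n^k(x) from the explicit finite sum (integer k >= 0 assumed)."""
    return sum(
        (-1) ** m
        * math.factorial(n + k)
        / (math.factorial(n - m) * math.factorial(k + m) * math.factorial(m))
        * x ** m
        for m in range(n + 1)
    )


def assoc_laguerre_recurrence(n, k, x):
    """L_n^k(x) from the three-term recurrence, starting at L_0^k and L_1^k."""
    prev, cur = 1.0, 1.0 + k - x
    if n == 0:
        return prev
    for j in range(1, n):
        prev, cur = cur, ((2 * j + k + 1 - x) * cur - (j + k) * prev) / (j + 1)
    return cur


# The two routes should agree, and the value at x = 0 should be (n+k)!/(n! k!).
by_series = assoc_laguerre_series(4, 3, 1.7)
by_recurrence = assoc_laguerre_recurrence(4, 3, 1.7)
```

For large n or large x the alternating series suffers from cancellation, which is one reason a recurrence is usually preferred for numerical work.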
2 Some authors use $\mathscr{L}_{n+k}^k(x) = (d^k/dx^k)[L_{n+k}(x)]$. Hence our $L_n^k(x) = (-1)^k\mathscr{L}_{n+k}^k(x)$.
A Rodrigues representation of the associated Laguerre polynomial is
$$L_n^k(x) = \frac{e^x x^{-k}}{n!}\frac{d^n}{dx^n}\left(e^{-x}x^{n+k}\right). \tag{13.47}$$
The reader will note that all these formulas for $L_n^k(x)$ reduce to the corresponding
expressions for $L_n(x)$ when k = 0.
The associated Laguerre equation (13.46) is not self-adjoint but it can be put
in self-adjoint form by multiplying by $e^{-x}x^k$, which becomes the weighting
function (Section 9.1). We obtain
$$\int_0^{\infty}e^{-x}x^kL_n^k(x)L_m^k(x)\,dx = \frac{(n+k)!}{n!}\,\delta_{m,n}. \tag{13.48}$$
Equation 13.48 shows the same orthogonality interval $(0, \infty)$ as that for the
Laguerre polynomials, but with a new weighting function we have a new set of
orthogonal polynomials, the associated Laguerre polynomials.
By letting $\psi_n^k(x) = e^{-x/2}x^{k/2}L_n^k(x)$, $\psi_n^k(x)$ satisfies the self-adjoint equation
$$x\,\psi_n^{k\prime\prime}(x) + \psi_n^{k\prime}(x) + \left(-\frac{x}{4} + \frac{2n+k+1}{2} - \frac{k^2}{4x}\right)\psi_n^k(x) = 0. \tag{13.49}$$
The $\psi_n^k(x)$ are sometimes called Laguerre functions. Equation 13.36 is the special
case k = 0.
A further useful form is given by defining3
$$\Phi_n^k(x) = e^{-x/2}x^{(k+1)/2}L_n^k(x). \tag{13.50}$$
Substitution into the associated Laguerre equation yields
$$\Phi_n^{k\prime\prime}(x) + \left(-\frac{1}{4} + \frac{2n+k+1}{2x} - \frac{k^2-1}{4x^2}\right)\Phi_n^k(x) = 0. \tag{13.51}$$
The corresponding normalization integral is
$$\int_0^{\infty}e^{-x}x^{k+1}L_n^k(x)L_n^k(x)\,dx = \frac{(n+k)!}{n!}(2n + k + 1). \tag{13.52}$$
The reader may show that the $\Phi_n^k(x)$ do not form an orthogonal set (except with
$x^{-1}$ as a weighting function) because of the $x^{-1}$ in the term $(2n + k + 1)/2x$.
The Laguerre functions $L_\nu^\mu(x)$ in which the indices $\nu$ and $\mu$ are not integers may
be defined using the confluent hypergeometric functions of Section 13.6.
EXAMPLE 13.2.1 The Hydrogen Atom
Perhaps the most important single application of the Laguerre polynomials
is in the solution of the Schrodinger wave equation for the hydrogen atom. This
equation is
3This corresponds to modifying the function ф in Eq. 13.49 to eliminate the
first derivative (compare Exercise 8.6.11).
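$$-\frac{\hbar^2}{2m}\nabla^2\psi - \frac{Ze^2}{r}\psi = E\psi, \tag{13.53}$$
in which Z = 1 for hydrogen, 2 for singly ionized helium, and so on. Separating
variables, we find that the angular dependence of $\psi$ is $Y_L^M(\theta, \varphi)$. The radial part,
R(r), satisfies the equation
$$-\frac{\hbar^2}{2m}\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{Ze^2}{r}R + \frac{\hbar^2}{2m}\frac{L(L+1)}{r^2}R = ER. \tag{13.54}$$
By use of the abbreviations
$$\rho = \alpha r \quad\text{with}\quad \alpha^2 = -\frac{8mE}{\hbar^2}, \quad E < 0, \qquad \lambda = \frac{2mZe^2}{\alpha\hbar^2}, \tag{13.55}$$
Eq. 13.54 becomes
$$\frac{1}{\rho^2}\frac{d}{d\rho}\left(\rho^2\frac{d\chi(\rho)}{d\rho}\right) + \left(\frac{\lambda}{\rho} - \frac{1}{4} - \frac{L(L+1)}{\rho^2}\right)\chi(\rho) = 0, \tag{13.56}$$
where $\chi(\rho) = R(\rho/\alpha)$. A comparison with Eq. 13.51 for $\Phi_n^k(x)$ shows that Eq. 13.56
is satisfied by
$$\rho\chi(\rho) = e^{-\rho/2}\rho^{L+1}L_{\lambda-L-1}^{2L+1}(\rho), \tag{13.57}$$
in which k is replaced by 2L + 1 and n by $\lambda - L - 1$.
We must restrict the parameter $\lambda$ by requiring it to be an integer n, n =
1, 2, 3, ... .4 This is necessary because the Laguerre function of nonintegral n
would diverge as $\rho^n e^{\rho}$, which is unacceptable for our physical problem in which
$$\lim_{r\to\infty}R(r) = 0.$$
This restriction on $\lambda$, imposed by our boundary condition, has the effect of
quantizing the energy
$$E_n = -\frac{Z^2me^4}{2n^2\hbar^2}. \tag{13.58}$$
The negative sign enters because we are dealing here with bound states, E = 0
corresponding to an electron that is just able to escape to infinity. Using this
result for $E_n$, we have
$$\alpha = \frac{me^2}{\hbar^2}\cdot\frac{2Z}{n} = \frac{2Z}{na_0}, \qquad \rho = \frac{2Z}{na_0}\,r, \tag{13.59}$$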
4 This is the conventional notation for $\lambda$. It is not the same n as the index n
in $\Phi_n^k(x)$.
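with
$$a_0 = \frac{\hbar^2}{me^2}, \quad\text{the Bohr radius.}$$
The final normalized hydrogen wave function may be written as
$$\psi_{nLM}(r, \theta, \varphi) = \left[\left(\frac{2Z}{na_0}\right)^3\frac{(n-L-1)!}{2n\,(n+L)!}\right]^{1/2}e^{-\alpha r/2}(\alpha r)^L L_{n-L-1}^{2L+1}(\alpha r)\,Y_L^M(\theta, \varphi). \tag{13.60}$$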
EXERCISES
13.2.1 Show with the aid of the Leibnitz formula, that the series expansion of Ln(x)
(Eq. 13.32) follows from the Rodrigues representation (Eq. 13.31).
13.2.2 (a) Using the explicit series form (Eq. 13.32) show that
$$L_n'(0) = -n.$$
(b) Repeat without using the explicit series form of Ln(x).
13.2.3 From the generating function derive the Rodrigues representation
13.2.4 Derive the normalization relation (Eq. 13.48) for the associated Laguerre
polynomials.
13.2.5 Expand $x^r$ in a series of associated Laguerre polynomials $L_n^k(x)$, k fixed and n
ranging from 0 to r (or to $\infty$ if r is not an integer).
Hint. The Rodrigues form of $L_n^k(x)$ will be useful.
ANS. $x^r = (r+k)!\,r!\displaystyle\sum_{n=0}^{r}\frac{(-1)^nL_n^k(x)}{(n+k)!\,(r-n)!}, \qquad 0 \leq x < \infty.$
13.2.6 Expand $e^{-ax}$ in a series of associated Laguerre polynomials $L_n^k(x)$, k fixed and n
ranging from 0 to $\infty$.
(a) Evaluate directly the coefficients in your assumed expansion.
(b) Develop the desired expansion from the generating function.
13.2.7 Show that
$$\int_0^{\infty}e^{-x}x^{k+1}L_n^k(x)L_n^k(x)\,dx = \frac{(n+k)!}{n!}(2n + k + 1).$$
Hint. Note that
$$xL_n^k = (2n + k + 1)L_n^k - (n + k)L_{n-1}^k - (n + 1)L_{n+1}^k.$$
13.2.8 Assume that a particular problem in quantum mechanics has led to the
differential equation
$$\frac{d^2y}{dx^2} - \left[\frac{k^2-1}{4x^2} - \frac{2n+k+1}{2x} + \frac{1}{4}\right]y = 0.$$
Write y(x) as
y(x) = A(x)B(x)C(x)
with the requirement that
(a) A(x) be a negative exponential giving the required asymptotic behavior of
y(x) and
(b) B(x) be a positive power of x giving the behavior of y(x) for $x \ll 1$.
Determine A(x) and B(x). Find the relation between C(x) and the associated
Laguerre polynomial.
ANS. $A(x) = e^{-x/2}$,
$B(x) = x^{(k+1)/2}$,
$C(x) = L_n^k(x)$.
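13.2.9 From Eq. 13.60 the normalized radial part of the hydrogenic wave function is
$$R_{nL}(r) = \left[\alpha^3 \frac{(n-L-1)!}{2n\,(n+L)!}\right]^{1/2} e^{-\alpha r/2} (\alpha r)^L L_{n-L-1}^{2L+1}(\alpha r),$$
in which $\alpha = 2Z/na_0 = 2Zme^2/n\hbar^2$. Evaluate
(a) $\langle r \rangle = \int_0^\infty r\, R_{nL}(\alpha r) R_{nL}(\alpha r)\, r^2\, dr$,
(b) $\langle r^{-1} \rangle = \int_0^\infty r^{-1} R_{nL}(\alpha r) R_{nL}(\alpha r)\, r^2\, dr$.
The quantity $\langle r \rangle$ is the average displacement of the electron from the nucleus,
whereas $\langle r^{-1} \rangle$ is the average of the reciprocal displacement.
ANS. $\langle r \rangle = \dfrac{a_0}{2}\left[3n^2 - L(L+1)\right]$, $\quad \langle r^{-1} \rangle = \dfrac{1}{n^2 a_0}$.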
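13.2.10 Derive the recurrence relation for the hydrogen wave function expectation
values,
$$\frac{s+2}{n^2}\langle r^{s+1} \rangle - (2s+3)\, a_0 \langle r^{s} \rangle + \frac{s+1}{4}\left[(2L+1)^2 - (s+1)^2\right] a_0^2 \langle r^{s-1} \rangle = 0,$$
with $s \ge -2L - 1$. Here $\langle r^s \rangle$ denotes the average value of $r^s$.
Hint. Transform Eq. 13.56 into a form analogous to Eq. 13.51. Multiply by
$\rho^{s+2} u' - c\rho^{s+1} u$. Here $u = \rho\Phi$. Adjust $c$ to cancel terms that do not yield
expectation values.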
13.2.11 The hydrogen wave functions, Eq. 13.60, are mutually orthogonal as they should
be, since they are eigenfunctions of the self-adjoint Schrodinger equation.
Yet the radial integral has the (misleading) form
$$\int_0^\infty e^{-\alpha r/2} (\alpha r)^L L_{n_1-L-1}^{2L+1}(\alpha r)\; e^{-\alpha r/2} (\alpha r)^L L_{n_2-L-1}^{2L+1}(\alpha r)\; r^2\, dr,$$
which appears to match Eq. 13.52 and not the associated Laguerre orthogonality
relation, Eq. 13.48. How do you resolve this paradox?
ANS. The parameter $\alpha$ is dependent on $n$. The first three $\alpha$'s previously
shown are $2Z/n_1 a_0$. The last three are $2Z/n_2 a_0$. For $n_1 = n_2$
Eq. 13.52 applies. For $n_1 \ne n_2$ neither Eq. 13.48 nor Eq. 13.52
is applicable.
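13.2.12 A quantum mechanical analysis of the Stark effect (parabolic coordinate) leads
to the differential equation
$$\frac{d}{d\xi}\left(\xi \frac{du}{d\xi}\right) + \left(\frac{1}{2}E\xi + \alpha - \frac{m^2}{4\xi} - \frac{1}{4}F\xi^2\right) u = 0.$$
Here $F$ is a measure of the perturbation energy introduced by an external electric
field. Find the unperturbed wave functions ($F = 0$) in terms of associated
Laguerre polynomials.
ANS. $u(\xi) = e^{-\varepsilon\xi/2}\, \xi^{m/2} L_p^m(\varepsilon\xi)$, with $\varepsilon = \sqrt{-2E} > 0$, $p = \alpha/\varepsilon - (m + 1)/2$,
a nonnegative integer.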
13.2.13 The wave equation for the three-dimensional harmonic oscillator is
$$-\frac{\hbar^2}{2M}\nabla^2\psi + \frac{1}{2}M\omega^2 r^2 \psi = E\psi.$$
Here $\omega$ is the angular frequency of the corresponding classical oscillator. Show
that the radial part of $\psi$ (in spherical polar coordinates) may be written in terms
of associated Laguerre functions of argument $(\beta r^2)$, where $\beta = M\omega/\hbar$.
Hint. As in Exercise 13.2.8, split off radial factors of $r^l$ and $e^{-\beta r^2/2}$. The associated
Laguerre function will have the form $L^{l+1/2}_{(n-l-1)/2}(\beta r^2)$.
13.2.14 Write a program that will generate the coefficients $a_s$ in the polynomial form
of the Laguerre polynomial, $L_n(x) = \sum_{s=0}^{n} a_s x^s$.
13.2.15 (a) Write a subroutine that will transform a finite power series $\sum_{n=0}^{N} a_n x^n$ into
a Laguerre series $\sum_{n=0}^{N} b_n L_n(x)$. Use the recurrence relation Eq. 13.33 and
follow the technique outlined in Section 13.4 for a Chebyshev series.
13.2.16 Tabulate $L_{10}(x)$ for $x = 0.0(0.1)30.0$. This will include the 10 roots of $L_{10}$.
Beyond x = 30.0, L10(x) is monotonic increasing. If a plotting subroutine is
available, plot your results.
Check value. Eighth root = 16.279.
13.2.17 Determine the 10 roots of $L_{10}(x)$ using a root-finding subroutine (compare
Appendix 1). You may use your knowledge of the approximate location of the
roots or develop a search routine to look for the roots. The 10 roots of L10(x)
are the evaluation points for the 10-point Gauss-Laguerre quadrature (compare
Appendix 2). Check your values by comparing with AMS-55 (Table 25.9).
13.2.18 Calculate the coefficients of a Laguerre series expansion ($L_n(x)$, $k = 0$) of the
exponential $e^{-x}$. Evaluate the coefficients by the Gauss-Laguerre quadrature
(compare Eq. 9.^4). Check your results against the values given in Exercise
13.2.6.
Note. Direct application of the Gauss-Laguerre quadrature with $f(x) = L_n(x)e^{-x}$
gives poor accuracy because of the extra $e^{-x}$. Try a change of variable,
$y = 2x$, so that the function appearing in the integrand will be simply $L_n(y/2)$.
13.2.19 (a) Write a subroutine to calculate the Laguerre matrix elements
$$M_{mnp} = \int_0^\infty L_m(x) L_n(x)\, x^p e^{-x}\, dx.$$
Include a check of the condition $|m - n| \le p \le m + n$. (If $p$ is outside
this range, $M_{mnp} = 0$: Why?)
Note. A 10-point Gauss-Laguerre quadrature will give accurate results
for $m + n + p \le 19$.
(b) Call your subroutine to calculate a variety of Laguerre matrix elements.
Check $M_{nn1}$ against Exercise 13.2.7.
13.2.20 Write a subroutine to calculate the numerical value of $L_n^k(x)$ for specified values
of $n$, $k$, and $x$. Require that $n$ and $k$ be nonnegative integers and $x$ be $> 0$.
Hint. Starting with known values of $L_0^k$ and $L_1^k(x)$, we may use the recurrence
relation, Eq. 13.44, to generate $L_n^k(x)$, $n = 2, 3, 4, \ldots$.
13.2.21 Write a program to calculate the normalized hydrogen radial wave function
$\psi_{nL}(r)$. This is $\psi_{nLM}$ of Eq. 13.60, omitting the spherical harmonic $Y_L^M(\theta, \varphi)$.
Take $Z = 1$ and $a_0 = 1$ (which means that $r$ is being expressed in units of Bohr
radii). Accept $n$ and $L$ as input data. Tabulate $\psi_{nL}(r)$ for $r = 0.0(0.2)R$ with $R$
taken large enough to exhibit the significant features of $\psi$. This means roughly
$R = 5$ for $n = 1$, 10 for $n = 2$, and 30 for $n = 3$.
13.3 CHEBYSHEV (TSCHEBYSCHEFF)
POLYNOMIALS
In this section two types of Chebyshev polynomials are developed as special
cases of ultraspherical polynomials. Their properties follow from the ultra-
spherical polynomial generating function. The primary importance of the
Chebyshev polynomials is in numerical analysis. Section 13.4 is devoted to these
numerical applications.
Generating Functions
In Section 12.1 the generating function for the ultraspherical or Gegenbauer
polynomials
$$\frac{1}{(1 - 2xt + t^2)^{\alpha}} = \sum_{n=0}^{\infty} C_n^{(\alpha)}(x)\, t^n, \qquad |x| < 1,\ |t| < 1 \tag{13.61}$$
was mentioned, with $\alpha = \frac{1}{2}$ giving rise to the Legendre polynomials. In this
section we first take $\alpha = 1$ and then $\alpha = 0$ to generate two sets of polynomials
known as the Chebyshev polynomials.
Type II
With $\alpha = 1$ and $C_n^{(1)}(x) = U_n(x)$, Eq. 13.61 gives
$$\frac{1}{1 - 2xt + t^2} = \sum_{n=0}^{\infty} U_n(x)\, t^n, \qquad |x| < 1,\ |t| < 1. \tag{13.62}$$
These functions, $U_n(x)$, generated by $(1 - 2xt + t^2)^{-1}$ are labeled Chebyshev
polynomials type II. Although these polynomials have few applications in
mathematical physics, one unusual application is in the development of
four-dimensional spherical harmonics used in angular momentum theory.
Type I
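With $\alpha = 0$ there is difficulty. Indeed, our generating function reduces to the
constant 1. We may avoid this problem by first differentiating Eq. 13.61 with
respect to $t$. This yields
$$\frac{-\alpha(-2x + 2t)}{(1 - 2xt + t^2)^{\alpha+1}} = \sum_{n=1}^{\infty} n\, C_n^{(\alpha)}(x)\, t^{n-1}, \tag{13.63}$$
or
$$\frac{x - t}{(1 - 2xt + t^2)^{\alpha+1}} = \sum_{n=1}^{\infty} \frac{n}{2}\left[\frac{C_n^{(\alpha)}(x)}{\alpha}\right] t^{n-1}. \tag{13.64}$$
We define $C_n^{(0)}(x)$ by
$$C_n^{(0)}(x) = \lim_{\alpha\to 0} \frac{C_n^{(\alpha)}(x)}{\alpha}. \tag{13.65}$$
The purpose of differentiating with respect to $t$ was to get an $\alpha$ in the
denominator and to create an indeterminate form. Now multiplying Eq. 13.64
by $2t$ and adding $1 = (1 - 2xt + t^2)/(1 - 2xt + t^2)$, we obtain
$$\frac{1 - t^2}{1 - 2xt + t^2} = 1 + 2\sum_{n=1}^{\infty} \frac{n}{2}\, C_n^{(0)}(x)\, t^n. \tag{13.66}$$
We define $T_n(x)$ by
$$T_n(x) = \begin{cases} 1, & n = 0, \\ \dfrac{n}{2}\, C_n^{(0)}(x), & n > 0. \end{cases} \tag{13.67}$$
Notice the special treatment for $n = 0$. This is similar to the treatment of the $n = 0$
term in the Fourier series. Also, note carefully that $C_n^{(0)}$ is the limit indicated in
Eq. 13.65 and not a literal substitution of $\alpha = 0$ into the generating function
series. With these new labels,
$$\frac{1 - t^2}{1 - 2xt + t^2} = T_0(x) + 2\sum_{n=1}^{\infty} T_n(x)\, t^n, \qquad |x| \le 1,\ |t| < 1. \tag{13.68}$$
We call $T_n(x)$ the Chebyshev polynomials, type I. The reader should be warned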
that the notation for these functions differs from reference to reference. There is
almost no general agreement. Here we follow the usage of AMS-55.
These Chebyshev polynomials (type I), which combine useful features of (1)
the Fourier series and (2) orthogonal polynomials, are of great interest in
numerical computation. For example, a least-squares approximation minimizes
the average squared error. An approximation using Chebyshev polynomials
allows a larger average squared error but may keep extreme errors down,
Section 13.4.
Differentiating the generating functions (Eqs. 13.62 and 13.68) with respect to
$t$, we obtain recurrence relations
$$T_{n+1}(x) - 2xT_n(x) + T_{n-1}(x) = 0, \tag{13.69}$$
$$U_{n+1}(x) - 2xU_n(x) + U_{n-1}(x) = 0 \tag{13.70}$$
(see Table 13.3).
TABLE 13.3 Orthogonal Polynomial Recurrence Relations (a)
$P_{n+1}(x) = (A_n x + B_n) P_n(x) - C_n P_{n-1}(x)$

                        P_n(x)      A_n             B_n               C_n
Legendre                P_n(x)      (2n+1)/(n+1)    0                 n/(n+1)
Chebyshev I             T_n(x)      2               0                 1
Shifted Chebyshev I     T_n*(x)     4               -2                1
Chebyshev II            U_n(x)      2               0                 1
Shifted Chebyshev II    U_n*(x)     4               -2                1
Laguerre                L_n(x)      -1/(n+1)        (2n+1)/(n+1)      n/(n+1)
Associated Laguerre     L_n^k(x)    -1/(n+1)        (2n+k+1)/(n+1)    (n+k)/(n+1)
Hermite                 H_n(x)      2               0                 2n

(a) $P_n$ is any orthogonal polynomial.

TABLE 13.4 Chebyshev Polynomials, Type I
$T_0 = 1$
$T_1 = x$
$T_2 = 2x^2 - 1$
$T_3 = 4x^3 - 3x$
$T_4 = 8x^4 - 8x^2 + 1$
$T_5 = 16x^5 - 20x^3 + 5x$
$T_6 = 32x^6 - 48x^4 + 18x^2 - 1$

TABLE 13.5 Chebyshev Polynomials, Type II
$U_0 = 1$
$U_1 = 2x$
$U_2 = 4x^2 - 1$
$U_3 = 8x^3 - 4x$
$U_4 = 16x^4 - 12x^2 + 1$
$U_5 = 32x^5 - 32x^3 + 6x$
$U_6 = 64x^6 - 80x^4 + 24x^2 - 1$
FIG. 13.5 Chebyshev polynomials, $T_n(x)$
FIG. 13.6 Chebyshev polynomials, $U_n(x)$
Then, using the generating functions for the first few values of n and these
recurrence relations for the higher-order polynomials, we get Tables 13.4 and
13.5 (see also Figs. 13.5 and 13.6).
As with the Hermite polynomials, Section 13.1, the recurrence relations, Eqs.
13.69 and 13.70, together with the known values of $T_0(x)$, $T_1(x)$, $U_0(x)$, and
$U_1(x)$, provide a convenient means (convenient, that is, for a high-speed electronic
computer) of getting the numerical value of any $T_n(x_0)$ or $U_n(x_0)$, with $x_0$ a given
number. Again, from the generating functions, we have the special values
$$T_n(1) = 1, \quad T_n(-1) = (-1)^n, \quad T_{2n}(0) = (-1)^n, \quad T_{2n+1}(0) = 0; \tag{13.71}$$
$$U_n(1) = n + 1, \quad U_n(-1) = (-1)^n (n + 1), \quad U_{2n}(0) = (-1)^n, \quad U_{2n+1}(0) = 0. \tag{13.72}$$
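The upward recurrence described above can be sketched in a few lines. This is only an illustration, assuming the starting values $T_0 = 1$, $T_1 = x$, $U_0 = 1$, $U_1 = 2x$; the function name `chebyshev_t_u` is hypothetical and not part of the text.

```python
def chebyshev_t_u(n, x):
    """Return (T_n(x), U_n(x)) using the three-term recurrence
    P_{n+1} = 2x P_n - P_{n-1}, started from T_0, T_1 and U_0, U_1."""
    if n < 0:
        raise ValueError("n must be a nonnegative integer")
    t_prev, t_curr = 1.0, x          # T_0, T_1
    u_prev, u_curr = 1.0, 2.0 * x    # U_0, U_1
    if n == 0:
        return t_prev, u_prev
    for _ in range(n - 1):
        # Advance both sequences by one order.
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
        u_prev, u_curr = u_curr, 2.0 * x * u_curr - u_prev
    return t_curr, u_curr


if __name__ == "__main__":
    for n in range(7):
        print(n, chebyshev_t_u(n, 0.5))
```

Evaluating at $x = \pm 1$ and $x = 0$ gives a quick check against the special values quoted above.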
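The parity relations for $T_n$ and $U_n$ are
$$T_n(x) = (-1)^n T_n(-x), \tag{13.73}$$
$$U_n(x) = (-1)^n U_n(-x). \tag{13.74}$$
Rodrigues representations of $T_n(x)$ and $U_n(x)$ are
$$T_n(x) = \frac{(-1)^n \pi^{1/2} (1 - x^2)^{1/2}}{2^n \left(n - \frac{1}{2}\right)!} \frac{d^n}{dx^n}\left[(1 - x^2)^{n - 1/2}\right] \tag{13.75}$$
and
$$U_n(x) = \frac{(-1)^n (n + 1)\, \pi^{1/2}}{2^{n+1} \left(n + \frac{1}{2}\right)!\, (1 - x^2)^{1/2}} \frac{d^n}{dx^n}\left[(1 - x^2)^{n + 1/2}\right]. \tag{13.76}$$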
Recurrence Relations-Derivatives
From the generating functions for $T_n(x)$ and $U_n(x)$ differentiation with respect
to $x$ leads to a variety of recurrence relations involving derivatives. Among the
more useful equations are
$$(1 - x^2)\, T_n'(x) = -nx\, T_n(x) + n\, T_{n-1}(x), \tag{13.77}$$
and
$$(1 - x^2)\, U_n'(x) = -nx\, U_n(x) + (n + 1)\, U_{n-1}(x). \tag{13.78}$$
From Eqs. 13.69 and 13.77, $T_n(x)$, the Chebyshev polynomial of type I, satisfies
$$(1 - x^2)\, T_n''(x) - x\, T_n'(x) + n^2\, T_n(x) = 0. \tag{13.79}$$
$U_n(x)$, the Chebyshev polynomial of type II, satisfies
$$(1 - x^2)\, U_n''(x) - 3x\, U_n'(x) + n(n + 2)\, U_n(x) = 0. \tag{13.80}$$
The ultraspherical equation
$$(1 - x^2) \frac{d^2}{dx^2} C_n^{(\alpha)}(x) - (2\alpha + 1)\, x \frac{d}{dx} C_n^{(\alpha)}(x) + n(n + 2\alpha)\, C_n^{(\alpha)}(x) = 0 \tag{13.81}$$
is a generalization of these differential equations, reducing to Eq. 13.79 for
$\alpha = 0$ and Eq. 13.80 for $\alpha = 1$ (and to Legendre's equation for $\alpha = \frac{1}{2}$).
Trigonometric Form
At this point in the development of the properties of the Chebyshev solutions
it is beneficial to change variables, replacing $x$ by $\cos\theta$. With $x = \cos\theta$ and
$d/dx = (-1/\sin\theta)(d/d\theta)$, Eq. 13.79 becomes
$$\frac{d^2 T_n}{d\theta^2} + n^2 T_n = 0, \tag{13.82}$$
the simple harmonic oscillator equation with solutions $\cos n\theta$ and $\sin n\theta$. The
special values (boundary conditions) identify
$$T_n = \cos n\theta = \cos n(\arccos x). \tag{13.83a}$$
A second linearly independent solution of Eqs. 13.79 and 13.82 is labeled
$$V_n = \sin n\theta = \sin n(\arccos x). \tag{13.83b}$$
The solutions of the type II Chebyshev equation, Eq. 13.80, become
$$U_n = \frac{\sin (n + 1)\theta}{\sin\theta}, \tag{13.84a}$$
$$W_n = \frac{\cos (n + 1)\theta}{\sin\theta}. \tag{13.84b}$$
The two sets of solutions, type I and type II, are related by
$$V_n(x) = (1 - x^2)^{1/2}\, U_{n-1}(x), \tag{13.85a}$$
$$W_n(x) = (1 - x^2)^{-1/2}\, T_{n+1}(x). \tag{13.85b}$$
As already seen from generating functions, Tn(x) and Un(x) are polynomials.
Clearly, Vn{x) and Wn(x) are not polynomials.
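From
$$T_n(x) + iV_n(x) = \cos n\theta + i\sin n\theta = (\cos\theta + i\sin\theta)^n = \left[x + i(1 - x^2)^{1/2}\right]^n, \qquad |x| \le 1 \tag{13.86}$$
we obtain expansions
$$T_n(x) = x^n - \binom{n}{2} x^{n-2}(1 - x^2) + \binom{n}{4} x^{n-4}(1 - x^2)^2 - \cdots \tag{13.87a}$$
and
$$V_n(x) = \sqrt{1 - x^2}\left[\binom{n}{1} x^{n-1} - \binom{n}{3} x^{n-3}(1 - x^2) + \binom{n}{5} x^{n-5}(1 - x^2)^2 - \cdots\right]. \tag{13.87b}$$
Here the binomial coefficient $\binom{n}{m}$ is given by
$$\binom{n}{m} = \frac{n!}{m!\,(n - m)!}.$$
From the generating functions, or from the differential equations, power-series
representations are
$$T_n(x) = \frac{n}{2} \sum_{m=0}^{[n/2]} (-1)^m \frac{(n - m - 1)!}{m!\,(n - 2m)!} (2x)^{n-2m} \tag{13.88a}$$
and
$$U_n(x) = \sum_{m=0}^{[n/2]} (-1)^m \frac{(n - m)!}{m!\,(n - 2m)!} (2x)^{n-2m}. \tag{13.88b}$$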
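As a check on the power-series representations, Eqs. 13.88a and 13.88b, the coefficients can be generated with exact integer arithmetic and compared with Tables 13.4 and 13.5. The sketch below assumes the standard closed forms (with $T_0 = 1$ treated separately, since the prefactor $n/2$ applies only for $n \ge 1$); the function names are hypothetical.

```python
from math import factorial


def t_coeffs(n):
    """Ascending power coefficients [c_0, ..., c_n] of T_n(x), from Eq. 13.88a."""
    if n == 0:
        return [1]
    coeffs = [0] * (n + 1)
    for m in range(n // 2 + 1):
        # (n/2) (n-m-1)! / (m! (n-2m)!) * 2^(n-2m); the quotient is an integer.
        num = n * factorial(n - m - 1) * 2 ** (n - 2 * m)
        den = 2 * factorial(m) * factorial(n - 2 * m)
        assert num % den == 0
        coeffs[n - 2 * m] = (-1) ** m * (num // den)
    return coeffs


def u_coeffs(n):
    """Ascending power coefficients [c_0, ..., c_n] of U_n(x), from Eq. 13.88b."""
    coeffs = [0] * (n + 1)
    for m in range(n // 2 + 1):
        num = factorial(n - m) * 2 ** (n - 2 * m)
        den = factorial(m) * factorial(n - 2 * m)
        coeffs[n - 2 * m] = (-1) ** m * (num // den)
    return coeffs


if __name__ == "__main__":
    print(t_coeffs(6))
    print(u_coeffs(6))
```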
Orthogonality
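If Eq. 13.79 is put into self-adjoint form (Section 9.1), we obtain $w(x) = (1 - x^2)^{-1/2}$
as a weighting factor. For Eq. 13.80 the corresponding weighting
factor is $(1 - x^2)^{+1/2}$. The resulting orthogonality integrals are
$$\int_{-1}^{1} T_m(x) T_n(x) (1 - x^2)^{-1/2}\, dx = \begin{cases} 0, & m \ne n, \\ \pi/2, & m = n \ne 0, \\ \pi, & m = n = 0, \end{cases} \tag{13.89}$$
$$\int_{-1}^{1} V_m(x) V_n(x) (1 - x^2)^{-1/2}\, dx = \begin{cases} 0, & m \ne n, \\ \pi/2, & m = n \ne 0, \\ 0, & m = n = 0, \end{cases} \tag{13.90}$$
$$\int_{-1}^{1} U_m(x) U_n(x) (1 - x^2)^{1/2}\, dx = \frac{\pi}{2}\, \delta_{m,n}, \tag{13.91}$$
and
$$\int_{-1}^{1} W_m(x) W_n(x) (1 - x^2)^{1/2}\, dx = \frac{\pi}{2}\, \delta_{m,n}. \tag{13.92}$$
This orthogonality is a direct consequence of the Sturm-Liouville theory,
Chapter 9. The normalization values may best be obtained by using $x = \cos\theta$
and converting these four integrals into Fourier normalization integrals (for the
half interval $[0, \pi]$).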
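The conversion to Fourier integrals suggested above also gives a simple numerical check of the type I orthogonality relation. The sketch below assumes the substitution $x = \cos\theta$, which turns the weighted integral into $\int_0^\pi \cos m\theta \cos n\theta\, d\theta$, and evaluates it with an equally spaced midpoint rule in $\theta$ (exact for these integrands when the number of points exceeds $m + n$). The function name is hypothetical.

```python
import math


def cheb_inner_product(m, n, points=64):
    """Approximate the integral of T_m(x) T_n(x) (1 - x^2)^(-1/2) over [-1, 1]
    via x = cos(theta) and a midpoint rule on [0, pi]."""
    total = 0.0
    for k in range(points):
        theta = (k + 0.5) * math.pi / points
        total += math.cos(m * theta) * math.cos(n * theta)
    return total * math.pi / points


if __name__ == "__main__":
    print(cheb_inner_product(3, 5))   # m != n
    print(cheb_inner_product(4, 4))   # m = n != 0
    print(cheb_inner_product(0, 0))   # m = n = 0
```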
EXERCISES
13.3.1 Another Chebyshev generating function is
$$\frac{1 - xt}{1 - 2xt + t^2} = \sum_{n=0}^{\infty} X_n(x)\, t^n, \qquad |t| < 1.$$
How is $X_n(x)$ related to $T_n(x)$ and $U_n(x)$?
13.3.2 Given
$$(1 - x^2)\, U_n''(x) - 3x\, U_n'(x) + n(n + 2)\, U_n(x) = 0,$$
show that $V_n(x)$ satisfies
$$(1 - x^2)\, V_n''(x) - x\, V_n'(x) + n^2\, V_n(x) = 0,$$
which is Chebyshev's equation.
13.3.3 Show that the Wronskian of $T_n(x)$ and $V_n(x)$ is given by
$$T_n(x) V_n'(x) - T_n'(x) V_n(x) = -\frac{n}{(1 - x^2)^{1/2}}.$$
This verifies that $T_n$ and $V_n$ ($n \ne 0$) are independent solutions of Eq. 13.79.
Conversely, for $n = 0$, we do not have linear independence. What happens at
$n = 0$? Where is the "second" solution?
13.3.4 Show that $W_n(x) = (1 - x^2)^{-1/2}\, T_{n+1}(x)$ is a solution of
$$(1 - x^2)\, W_n''(x) - 3x\, W_n'(x) + n(n + 2)\, W_n(x) = 0.$$
13.3.5 Evaluate the Wronskian of $U_n(x)$ and $W_n(x) = (1 - x^2)^{-1/2}\, T_{n+1}(x)$.
13.3.6 $V_n(x) = (1 - x^2)^{1/2}\, U_{n-1}(x)$ is not defined for $n = 0$. Show that a second and
independent solution of the Chebyshev differential equation for $T_n(x)$ ($n = 0$)
is $V_0(x) = \arccos x$ (or $\arcsin x$).
13.3.7 Show that $V_n(x)$ satisfies the same three-term recurrence relation as $T_n(x)$
(Eq. 13.69).
13.3.8 Verify the series solutions for $T_n(x)$ and $U_n(x)$ (Eqs. 13.88a and 13.88b).
13.3.9 Transform the series form of $T_n(x)$, Eq. 13.88a, into an ascending power series.
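ANS. $T_{2n}(x) = (-1)^n n \sum_{m=0}^{n} (-1)^m \dfrac{(n + m - 1)!}{(n - m)!\,(2m)!} (2x)^{2m}$, $n \ge 1$,
$T_{2n+1}(x) = (-1)^n \dfrac{2n + 1}{2} \sum_{m=0}^{n} (-1)^m \dfrac{(n + m)!}{(n - m)!\,(2m + 1)!} (2x)^{2m+1}$.
13.3.10 Rewrite the series form of $U_n(x)$, Eq. 13.88b, as an ascending power series.
ANS. $U_{2n}(x) = (-1)^n \sum_{m=0}^{n} (-1)^m \dfrac{(n + m)!}{(n - m)!\,(2m)!} (2x)^{2m}$,
$U_{2n+1}(x) = (-1)^n \sum_{m=0}^{n} (-1)^m \dfrac{(n + m + 1)!}{(n - m)!\,(2m + 1)!} (2x)^{2m+1}$.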
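13.3.11 Derive the Rodrigues representation of $T_n(x)$,
$$T_n(x) = \frac{(-1)^n \pi^{1/2} (1 - x^2)^{1/2}}{2^n \left(n - \frac{1}{2}\right)!} \frac{d^n}{dx^n}\left[(1 - x^2)^{n - 1/2}\right].$$
Hints. One possibility is to use the hypergeometric function relation
$${}_2F_1(a, b, c; z) = (1 - z)^{-a}\, {}_2F_1\!\left(a, c - b, c; \frac{-z}{1 - z}\right)$$
with $z = (1 - x)/2$. An alternate approach is to develop a first-order differential
equation for $y = (1 - x^2)^{n - 1/2}$. Repeated differentiation of this equation leads
to the Chebyshev equation.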
13.3.12 (a) From the differential equation for $T_n$ (in self-adjoint form) show that
$$\int_{-1}^{1} \frac{dT_m(x)}{dx} \frac{dT_n(x)}{dx} (1 - x^2)^{1/2}\, dx = 0, \qquad m \ne n.$$
(b) Confirm the preceding result by showing that
$$\frac{dT_n(x)}{dx} = n\, U_{n-1}(x).$$
13.3.13 The expansion of a power of $x$ in a Chebyshev series leads to the integral
$$I_{mn} = \int_{-1}^{1} x^m T_n(x) \frac{dx}{(1 - x^2)^{1/2}}.$$
(a) Show that this integral vanishes for $m < n$.
(b) Show that this integral vanishes for $m + n$ odd.
13.3.14 Evaluate the integral
$$I_{mn} = \int_{-1}^{1} x^m T_n(x) \frac{dx}{(1 - x^2)^{1/2}}$$
for $m \ge n$ and $m + n$ even by each of two methods:
(a) Operate with $x$ as the variable replacing $T_n$ by its Rodrigues representation.
(b) Using $x = \cos\theta$, transform the integral to a form with $\theta$ as the variable.
ANS. $I_{mn} = \pi \dfrac{m!}{(m - n)!} \dfrac{(m - n - 1)!!}{(m + n)!!}$, $m \ge n$, $m + n$ even.
13.3.15 Establish the following bounds, $-1 \le x \le 1$:
(a) $|U_n(x)| \le n + 1$,
(b) $\left|\dfrac{d}{dx} T_n(x)\right| \le n^2$.
13.3.16 (a) Establish the following bound, $-1 \le x \le 1$:
$$|V_n(x)| \le 1.$$
(b) Show that $W_n(x)$ is unbounded in $-1 \le x \le 1$.
13.3.17 Verify the orthogonality-normalization integrals for
(a) Tm(x), Tn(x)
(b) Vm(x), Vn(x)
(c) $U_m(x)$, $U_n(x)$
(d) $W_m(x)$, $W_n(x)$.
Hint. All these can be converted to Fourier orthogonality-normalization
integrals.
13.3.18 Show whether
(a) $T_m(x)$ and $V_n(x)$ are or are not orthogonal over the interval $[-1, 1]$ with
respect to the weighting factor $(1 - x^2)^{-1/2}$.
(b) $U_m(x)$ and $W_n(x)$ are or are not orthogonal over the interval $[-1, 1]$ with
respect to the weighting factor $(1 - x^2)^{1/2}$.
13.3.19 Derive
(a) $T_{n+1}(x) + T_{n-1}(x) = 2x\, T_n(x)$,
(b) $T_{m+n}(x) + T_{m-n}(x) = 2\, T_m(x)\, T_n(x)$,
from the "corresponding" cosine identities.
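13.3.20 A number of equations relate the two types of Chebyshev polynomials. As
examples show that
$$T_n(x) = U_n(x) - x\, U_{n-1}(x)$$
and
$$(1 - x^2)\, U_n(x) = x\, T_{n+1}(x) - T_{n+2}(x).$$
13.3.21 Show that
$$\frac{dV_n(x)}{dx} = -n \frac{T_n(x)}{\sqrt{1 - x^2}}$$
(a) using the trigonometric forms of $V_n$ and $T_n$,
(b) using the Rodrigues representation.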
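13.3.22 Starting with $x = \cos\theta$ and $T_n(\cos\theta) = \cos n\theta$, expand
$$x^k = \left(\frac{e^{i\theta} + e^{-i\theta}}{2}\right)^k$$
and show that
$$x^k = \frac{1}{2^{k-1}}\left[T_k(x) + \binom{k}{1} T_{k-2}(x) + \binom{k}{2} T_{k-4}(x) + \cdots\right],$$
the series in brackets terminating with $\binom{k}{m} T_1(x)$ for $k = 2m + 1$ or $\frac{1}{2}\binom{k}{m} T_0$
for $k = 2m$.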
13.3.23 (a) Calculate and tabulate the Chebyshev functions $V_1(x)$, $V_2(x)$, and $V_3(x)$
for $x = -1.0(0.1)1.0$.
(b) A second solution of the Chebyshev differential equation, Eq. 13.79, for
$n = 0$ is $y(x) = \sin^{-1} x$. Tabulate and plot this function over the same range:
$-1.0(0.1)1.0$.
13.3.24 Write a program that will generate the coefficients $a_s$ in the polynomial form of
the Chebyshev polynomial, $T_n(x) = \sum_{s=0}^{n} a_s x^s$.
13.3.25 Tabulate $T_{10}(x)$ for $0.00(0.01)1.00$. This will include the five positive roots of
$T_{10}$. If a plotting subroutine is available, plot your results.
13.3.26 Determine the five positive roots of $T_{10}(x)$ by calling a root-finding subroutine
(compare Appendix 1). Use your knowledge of the approximate location of
these roots from Exercise 13.3.25 or write a search routine to look for the roots.
These five positive roots (and their negatives) are the evaluation points of the
10-point Gauss-Chebyshev quadrature method (Appendix 2).
$x_k = \cos[(2k - 1)\pi/20]$, $k = 1, 2, 3, 4, 5$.
13.4 CHEBYSHEV POLYNOMIALS—NUMERICAL
APPLICATIONS
In contrast with the Legendre, Hermite, and Laguerre polynomials, the
Chebyshev polynomials (Tn(x)) play no significant role in a direct description
of the physical world. Their importance stems from a rapidly growing wealth
of applications in numerical analysis. The following are examples:
a. Chebyshev polynomials. They provide a convenient
and rather accurate approximation to a minimax
approximation of a function over $[-1, 1]$. This
minimax approximation is an approximation in
which the maximum magnitude of the error (of the
approximation) is minimized.
b. Numerical evaluation of integrals, Gauss-Chebyshev
quadrature. Compare Appendix 2.
c. A variety of miscellaneous applications, including
matrix inversion and numerical integration of
differential equations.
Here we concentrate on (a), Chebyshev series and their use in approximating
functions.
TRIGONOMETRIC FORM
From the preceding section
$$T_n(\cos\theta) = \cos n\theta \tag{13.93a}$$
or
$$T_n(x) = \cos(n \cos^{-1} x). \tag{13.93b}$$
From this trigonometric form we obtain the properties that make these
orthogonal polynomials so useful in numerical analysis (over the orthogonality
interval $[-1, 1]$).
a. $|T_n(x)| \le 1$.
b. $\max T_n(x) = +1$, $\min T_n(x) = -1$ (13.94)
for all maxima and minima. This leads to the
equiripple property discussed later.
c. The maxima and minima are spread reasonably
uniformly over the range $[-1, 1]$.
Chebyshev Series
The representation of a function $f(x)$ by a series of Chebyshev polynomials
has some significant advantages over the regular power series: (1) The
convergence is much more rapid,¹ (2) the technique of telescoping series to obtain a
more compact representation is opened up, and (3) a minimax approximation
is approached.
¹ The basic theorem was proved by Chebyshev.
From
$$f(x) = \sum_{n=0}^{\infty} a_n T_n(x) \tag{13.95}$$
the coefficients $a_n$ can be calculated by using the orthogonality of the Chebyshev
polynomials and the normalization, Eq. 13.89. We obtain
$$a_n = \frac{2}{\pi} \int_{-1}^{1} f(x)\, T_n(x)\, (1 - x^2)^{-1/2}\, dx, \qquad n = 1, 2, 3, \ldots \tag{13.96}$$
and half of this for $a_0$. This anomalous behavior of the first coefficient is repeated
in the Fourier cosine series of Chapter 14. Note that this is a least-squares fit.
Actually the Chebyshev series is a Fourier cosine series in disguise. With
Eq. 13.93a, Eq. 13.95 becomes
$$f(\cos\theta) = \sum_{n=0}^{\infty} a_n \cos n\theta, \tag{13.97}$$
similar to Eq. 14.1.
If $f(x)$ is a finite power series (polynomial), the Chebyshev coefficients may
be determined by other techniques that are faster and more accurate than the
direct integration of Eq. 13.96. We have
$$\sum_{n=0}^{N} b_n x^n = \sum_{n=0}^{N} a_n T_n(x). \tag{13.98}$$
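Because the Chebyshev series is a cosine series in $\theta$, the coefficients of Eq. 13.96 can be estimated by sampling $f(\cos\theta)$ at equally spaced angles. The sketch below is one possible numerical reading of that remark, not a method prescribed by the text: it assumes a midpoint rule in $\theta$ and a smooth $f$, and the function name and the choice of 64 sample points are illustrative.

```python
import math


def chebyshev_coefficients(f, n_max, points=64):
    """Estimate a_0..a_n_max in f(x) = sum a_n T_n(x) on [-1, 1]
    from a_n = (2/pi) * integral_0^pi f(cos t) cos(n t) dt (half of this for a_0)."""
    coeffs = []
    for n in range(n_max + 1):
        total = 0.0
        for k in range(points):
            theta = (k + 0.5) * math.pi / points
            total += f(math.cos(theta)) * math.cos(n * theta)
        a_n = 2.0 * total / points      # (2/pi) * (pi/points) * sum
        if n == 0:
            a_n *= 0.5                  # the first coefficient takes half the general formula
        coeffs.append(a_n)
    return coeffs


if __name__ == "__main__":
    for n, a in enumerate(chebyshev_coefficients(math.cosh, 12)):
        print(n, a)
```

For $\cosh x$ the even coefficients fall off very quickly and the odd ones vanish, which is the behavior exploited in the telescoping discussion of this section.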
The equality of upper limits $n = N$ is plausible if we recall that $T_n(x)$ has $x^n$ as
its highest power. The Tn(x) then are a reordering of the powers of x appearing
in the power series. This argument can be made rigorous by mathematical
induction or the Gram-Schmidt orthogonalization of Section 9.3.
With the power-series coefficients bn known, there are various techniques
for determining the unknown Chebyshev coefficients, an.
Matrix Multiplication In direct analogy with Exercise 12.2.1 for Legendre
polynomials we can set up the Chebyshev transformation matrix and obtain
the an coefficients by matrix multiplication.
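We may write
$$x^n = \sum_{s=0}^{n} c_{ns} T_s, \tag{13.99}$$
with the $c_{ns}$ tabulated in AMS-55, Table 22.3. Substituting into Eq. 13.98
(with the dummy index $n$ on the right replaced by $s$) and equating coefficients
of the same $T_s$ we obtain
$$\langle b_n |\, (c_{ns}) = \langle a_s |,$$
where $\langle b_n|$ and $\langle a_s|$ are row vectors (bras) and $c_{ns}$ is a matrix, actually lower
left triangular. Taking the adjoint
$$(c_{sn})\, | b_n \rangle = | a_s \rangle, \tag{13.100}$$
we have $|b_n\rangle$ and $|a_s\rangle$ as column vectors (kets). The power series to Chebyshev
series transformation $(c_{sn})$, now an upper right triangular matrix, is given by
$$(c_{sn}) = \begin{pmatrix}
1 & 0 & \frac{1}{2} & 0 & \frac{3}{8} & 0 \\
0 & 1 & 0 & \frac{3}{4} & 0 & \frac{10}{16} \\
0 & 0 & \frac{1}{2} & 0 & \frac{4}{8} & 0 \\
0 & 0 & 0 & \frac{1}{4} & 0 & \frac{5}{16} \\
0 & 0 & 0 & 0 & \frac{1}{8} & 0 \\
0 & 0 & 0 & 0 & 0 & \frac{1}{16}
\end{pmatrix}. \tag{13.101}$$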
The right-hand column of this matrix is taken from
$$x^5 = \frac{1}{16}\left\{10\, T_1(x) + 5\, T_3(x) + 1\, T_5(x)\right\},$$
a special case of Eq. 13.99 for $n = 5$. For $n > 0$ the $n$th column contains a
factor of $1/2^{n-1}$ which may be factored out.
A significant limitation of this matrix transformation technique, Eq. 13.100,
is that the matrix size and therefore the upper limit N is fixed. In the preceding
case N = 5. If you wish to handle N = 6, then the coefficients of x6 in Eq. 13.99
must be added on the right (and zeros at the bottom).
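The fixed-size limitation just noted disappears if the matrix is generated rather than tabulated. A minimal sketch, assuming the columns are built from $x\,T_0 = T_1$ and $x\,T_s = \frac{1}{2}(T_{s+1} + T_{s-1})$ with exact rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction


def power_to_chebyshev_matrix(N):
    """Return c with c[s][n] = coefficient of T_s in x^n, for 0 <= s, n <= N."""
    column = [Fraction(0)] * (N + 1)
    column[0] = Fraction(1)                      # x^0 = T_0
    columns = [column]
    for _ in range(N):
        prev = columns[-1]
        new = [Fraction(0)] * (N + 1)
        for s, value in enumerate(prev):
            if value == 0:
                continue
            if s == 0:
                new[1] += value                  # x T_0 = T_1
            else:
                new[s + 1] += value / 2          # x T_s = (T_{s+1} + T_{s-1}) / 2
                new[s - 1] += value / 2
        columns.append(new)
    return [[columns[n][s] for n in range(N + 1)] for s in range(N + 1)]


def chebyshev_from_power(b):
    """Apply the matrix: power-series coefficients b[n] -> Chebyshev coefficients a[s]."""
    c = power_to_chebyshev_matrix(len(b) - 1)
    return [sum(c[s][n] * b[n] for n in range(len(b))) for s in range(len(b))]


if __name__ == "__main__":
    for row in power_to_chebyshev_matrix(5):
        print([str(entry) for entry in row])
```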
Fast Fourier Transform Method This is discussed in Chapter 14.
Recurrence Relation Iteration This technique is discussed subsequently.
Power Series to Chebyshev Series
Let us rewrite our polynomial in a nested multiplication form
$$f(x) = b_0 + x(b_1 + \cdots + x(b_{N-2} + x(b_{N-1} + x b_N))). \tag{13.102}$$
We employ the recurrence relation, Eq. 13.69, as
$$x T_n(x) = \tfrac{1}{2} T_{n+1}(x) + \tfrac{1}{2} T_{n-1}(x), \qquad n = 1, 2, \ldots \tag{13.103}$$
and
$$x T_0 = T_1 \qquad (\text{for } n = 0). \tag{13.104}$$
Starting with the innermost parentheses, we obtain
$$b_{N-1} + x b_N = b_{N-1} T_0(x) + b_N T_1(x) \tag{13.105}$$
(from Table 13.4). Multiplying by $x$ and using Eqs. 13.103 and 13.104, we have
$$\begin{aligned} b_{N-2} + x(b_{N-1} + x b_N) &= b_{N-2} T_0 + x(b_{N-1} T_0 + b_N T_1) \\ &= b_{N-2} T_0 + b_{N-1} T_1 + \tfrac{1}{2} b_N T_0 + \tfrac{1}{2} b_N T_2. \end{aligned} \tag{13.106}$$
Collecting coefficients, we get
$$b_{N-2} + x(b_{N-1} + x b_N) = \left(b_{N-2} + \tfrac{1}{2} b_N\right) T_0 + b_{N-1} T_1 + \tfrac{1}{2} b_N T_2. \tag{13.107}$$
Schematically, we have (coefficient of $T_n$ in the column labeled $T_n$, each row down
giving the result of one more iteration):

    T_0                      T_1          T_2          T_3
    b_{N-1}                  b_N
    b_{N-2} + (1/2) b_N      b_{N-1}      (1/2) b_N

The coefficient of $T_N$ will be $a_N = 2^{-(N-1)} b_N$.
Note the following features:
1. In the $m$th row $b_{N-m}$ is added into the $T_0$ column.
($b_N$ appears in the $T_1$ column in the first row.)
2. The $T_0$ coefficient of one row is shifted to the $T_1$
column in the next row down (solid arrows).
3. All other entries ($T_1$, $T_2$, ... columns) are shifted to
both right and left but with a coefficient of $\frac{1}{2}$, in
accordance with Eq. 13.103 (dotted arrows).
This procedure continues until the last coefficient $b_0$ has been fed into the $T_0$
column and that row is complete. The number then appearing in the $T_m$ column
is its coefficient $a_m$. As a computing program this procedure is fast and accurate.
It also has the advantage of not requiring any knowledge of the coefficients of
the Chebyshev polynomials (beyond $T_0$ and $T_1$).
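The row-by-row procedure above translates almost directly into code. The following is a sketch under the stated rules only (multiply the current Chebyshev series by $x$, then feed the next power-series coefficient into the $T_0$ column); the function name `power_to_chebyshev` is hypothetical.

```python
from math import factorial


def power_to_chebyshev(b):
    """Convert power-series coefficients b[0..N] (f = sum b[n] x^n)
    into Chebyshev coefficients a[0..N] (f = sum a[n] T_n)."""
    N = len(b) - 1
    a = [0.0] * (N + 1)
    a[0] = b[N]                          # innermost term: b_N * T_0
    for m in range(1, N + 1):
        new = [0.0] * (N + 1)
        new[1] += a[0]                   # x T_0 = T_1
        for s in range(1, m):            # x T_s = (T_{s+1} + T_{s-1}) / 2
            new[s + 1] += a[s] / 2
            new[s - 1] += a[s] / 2
        new[0] += b[N - m]               # mth row: add b_{N-m} to the T_0 column
        a = new
    return a


if __name__ == "__main__":
    # Seven-term Maclaurin series of cosh x (through x^12).
    b = [1.0 / factorial(n) if n % 2 == 0 else 0.0 for n in range(13)]
    for n, a_n in enumerate(power_to_chebyshev(b)):
        print(n, a_n)
```

Only $T_0$ and $T_1$ enter explicitly, in keeping with the remark that no further knowledge of the Chebyshev polynomial coefficients is required.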
Telescoping Series (Economization)
Suppose that $\cosh x$ is represented in the interval $[-1, 1]$ by the truncated
Maclaurin series
$$\cosh x \approx \sum_{n=0}^{6} b_{2n} x^{2n}, \tag{13.108}$$
with $b_{2n} = 1/(2n)!$. Since the coefficients form a rapidly decreasing sequence,
the maximum error (at $x = 1$) is approximately the first term omitted, $1/(14)! =
1.147 \times 10^{-11}$. Transforming to a series of Chebyshev polynomials through
$T_{12}(x)$, we obtain
$$\cosh x \approx \sum_{n=0}^{6} b_{2n} x^{2n} = \sum_{n=0}^{6} a_{2n} T_{2n}(x). \tag{13.109}$$
The Maclaurin series coefficients and the Chebyshev coefficients are shown in
Table 13.6. The ratio $a_{2n}/b_{2n}$ is also included, to exhibit the much more rapid
convergence of the corresponding Chebyshev series.
TABLE 13.6
$$\cosh x \approx \sum_{n=0}^{6} b_{2n} x^{2n} = \sum_{n=0}^{6} a_{2n} T_{2n}(x).$$

 n    Maclaurin series     Chebyshev series      Ratio
      coefficients (a)     coefficients (a)      a_{2n}/b_{2n}
      Eq. 13.108           Eq. 13.109            (Chebyshev/Maclaurin)
 0    1.000 x 10^0         1.2660658777496       1.27
 2    5.000 x 10^-1        0.2714953395298       5.43 x 10^-1
 4    4.167 x 10^-2        0.0054742404393       1.31 x 10^-1
 6    1.389 x 10^-3        0.0000449773215       3.24 x 10^-2
 8    2.480 x 10^-5        0.0000001992120       8.03 x 10^-3
10    2.756 x 10^-7        0.0000000005505       2.00 x 10^-3
12    2.088 x 10^-9        0.0000000000010       4.88 x 10^-4

(a) All coefficients are calculated to 13 decimal accuracy.
TABLE 13.7 Approximations to cosh x

 n    Seven-term           Telescoped to        Telescoped to
      Maclaurin series     six terms            five terms
      b_n                  b'_n                 b''_n
 0    1.000000000000       0.999999999999       1.000000000549
 2    0.500000000000       0.500000000073       0.499999972550
 4    0.041666666667       0.041666665810       0.041666885995
 6    0.001388888889       0.001388892542       0.001388276026
 8    0.000024801587       0.000024794541       0.000025499132
10    0.000000275573       0.000000281836       -
12    0.000000002088       -                    -

Maximum error              (-)1.147 x 10^-11    1.3 x 10^-11         5.6 x 10^-10
Maximum error in
Maclaurin series of same
number of terms            -                    (-)2.1 x 10^-9       (-)2.8 x 10^-7
The final ratio $a_{12}/b_{12}$ is $2^{-11}$, as expected, to within the accuracy of the
Chebyshev coefficient.
Now the final term in this seven-term Chebyshev series is $1.0 \times 10^{-12}\, T_{12}(x)$,
with the maximum magnitude of $1.0 \times 10^{-12}$ by Eq. 13.94. Since our original
approximation of $\cosh x$ (Eq. 13.108) is accurate only to $1.1 \times 10^{-11}$, this $T_{12}$
term may be dropped without any significant loss of accuracy! If desired, the
shortened six-term Chebyshev series may be transformed back into a power
series through $x^{10}$. And this telescoped power series has essentially the same
accuracy as the original series through $x^{12}$.
This process of dropping the highest order term of the Chebyshev series
(telescoping) may be continued as desired. Table 13.7 gives the resulting power-
series coefficients.
The maximum error in the six-term telescoped series is comparable to the
maximum error in the original seven-term series. The maximum error in the
five-term telescoped series is appreciably less than the maximum error in the
six-term Maclaurin series. This process of telescoping reduces the maximum
error (comparing telescoped and Maclaurin series of the same number of terms)
and distributes it more uniformly across the interval $[-1, 1]$ instead of
concentrating it at $x = \pm 1$. For a fixed number of terms we have approached a
minimization of the maximum error, that is, a minimax approximation. This
redistribution of the error (shown in Fig. 13.7) is given approximately by the last
$a_m T_m(x)$ dropped and is approximately equiripple.
FIG. 13.7 Errors in representations of cosh x. (1) Error in seven-term Maclaurin series telescoped to five terms. (2) Error in five-term Maclaurin series. (3) Error in six-term Maclaurin series.

Our Chebyshev approximations
$$\cosh x \approx \sum_{n=0}^{5} b'_{2n} x^{2n} \tag{13.108a}$$
$$\cosh x \approx \sum_{n=0}^{4} b''_{2n} x^{2n} \tag{13.108b}$$
of Table 13.7 are not exact minimax approximations nor is the error curve,
Fig. 13.7, exactly equiripple. The approximation may be modified to be exactly
minimax or the error exactly equiripple by iterative numerical techniques, but
for almost all purposes the Chebyshev approximations will suffice.
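The telescoping procedure of this subsection can be reproduced numerically. The sketch below is an illustration rather than the program used for Table 13.7: it converts the seven-term Maclaurin series of $\cosh x$ to a Chebyshev series, drops the highest terms, and converts back, using exact rational arithmetic for the coefficients. All function names are hypothetical, and the error is sampled on a uniform grid, which only approximates the true maximum.

```python
import math
from fractions import Fraction


def power_to_chebyshev(b):
    """Power-series coefficients b[n] -> Chebyshev coefficients a[n]."""
    N = len(b) - 1
    a = [b[N]] + [Fraction(0)] * N
    for m in range(1, N + 1):
        new = [Fraction(0)] * (N + 1)
        new[1] += a[0]                               # x T_0 = T_1
        for s in range(1, m):                        # x T_s = (T_{s+1} + T_{s-1}) / 2
            new[s + 1] += a[s] / 2
            new[s - 1] += a[s] / 2
        new[0] += b[N - m]
        a = new
    return a


def chebyshev_to_power(a):
    """Chebyshev coefficients a[n] -> power-series coefficients."""
    N = len(a) - 1
    out = [Fraction(0)] * (N + 1)
    t_prev, t_curr = [Fraction(1)], [Fraction(0), Fraction(1)]   # T_0, T_1
    out[0] += a[0]
    if N >= 1:
        out[1] += a[1]
    for n in range(2, N + 1):
        t_next = [Fraction(0)] + [2 * c for c in t_curr]         # 2x T_{n-1}
        for i, c in enumerate(t_prev):
            t_next[i] -= c                                       # minus T_{n-2}
        for i, c in enumerate(t_next):
            out[i] += a[n] * c
        t_prev, t_curr = t_curr, t_next
    return out


def maclaurin_cosh(terms):
    """Coefficients of the Maclaurin series of cosh x with `terms` even terms."""
    return [Fraction(1, math.factorial(n)) if n % 2 == 0 else Fraction(0)
            for n in range(2 * terms - 1)]


def telescoped_cosh(terms_kept):
    """Telescope the seven-term Maclaurin series to `terms_kept` even terms."""
    a = power_to_chebyshev(maclaurin_cosh(7))
    return chebyshev_to_power(a[: 2 * terms_kept - 1])


def max_error(coeffs, samples=2001):
    """Largest sampled |cosh x - p(x)| on a uniform grid over [-1, 1]."""
    cs = [float(c) for c in coeffs]
    worst = 0.0
    for i in range(samples):
        x = -1.0 + 2.0 * i / (samples - 1)
        p = 0.0
        for c in reversed(cs):                       # Horner evaluation
            p = p * x + c
        worst = max(worst, abs(math.cosh(x) - p))
    return worst


if __name__ == "__main__":
    for kept in (6, 5):
        print(kept, max_error(telescoped_cosh(kept)), max_error(maclaurin_cosh(kept)))
```

The printed pairs compare a telescoped series with the Maclaurin series of the same number of terms, which is the comparison made in the last two rows of Table 13.7.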
Shifted Chebyshev Polynomials
Our Chebyshev polynomials are defined and are orthogonal over the specific
interval $[-1, 1]$. Since any finite interval $a \le x \le b$ can be transformed into
$-1 \le t \le 1$ by the linear transformation
$$x = \frac{b - a}{2}\, t + \frac{b + a}{2}, \tag{13.110}$$
the choice $[-1, 1]$ is perfectly general. However, it is often convenient to work
in the interval $[0, 1]$ and to define polynomials orthogonal over this interval.
Following Eq. 13.110 we use $T_n(t) = T_n(2x - 1)$ and define these to be the
shifted Chebyshev polynomials $T_n^*(x)$:
$$T_n^*(x) = T_n(2x - 1), \qquad 0 \le x \le 1, \quad n = 0, 1, 2, \ldots. \tag{13.111}$$
The shifted Chebyshev polynomials may be expressed in terms of an angle $\theta$.
We have
$$2x - 1 = \cos\theta$$
as the argument of $T_n$. Then
$$x = \frac{1 + \cos\theta}{2} = \cos^2\frac{\theta}{2}. \tag{13.112}$$
Since we have made a linear transformation in going from $T_n$ to $T_n^*$, we still
have
$$T_n^*(x) = \cos n\theta,$$
but now $x$ and $\theta$ are related by Eq. 13.112. The properties of $T_n^*(x)$ may be
derived from the corresponding $T_n(x)$ properties. Because of the
occasional usefulness of the shifted Chebyshev polynomials, the IBM Scientific
Subroutine Package (SSP) provides appropriate subroutines.
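A shifted Chebyshev polynomial can be evaluated by applying the ordinary recurrence to the argument $2x - 1$. The short sketch below assumes exactly that definition and checks it against the trigonometric form with $x = \cos^2(\theta/2)$; the function name is hypothetical.

```python
import math


def shifted_chebyshev(n, x):
    """Evaluate T_n*(x) = T_n(2x - 1) for 0 <= x <= 1 by upward recurrence."""
    t = 2.0 * x - 1.0
    prev, curr = 1.0, t                  # T_0*(x), T_1*(x)
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, curr = curr, 2.0 * t * curr - prev
    return curr


if __name__ == "__main__":
    theta = 0.7
    x = math.cos(theta / 2.0) ** 2       # x and theta related as in Eq. 13.112
    print(shifted_chebyshev(5, x), math.cos(5 * theta))
```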
EXERCISES
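13.4.1 Derive the relations
$$\int_0^1 T_m^*(x)\, T_n^*(x) \frac{dx}{(x - x^2)^{1/2}} = \begin{cases} 0, & m \ne n, \\ \pi/2, & m = n \ne 0, \\ \pi, & m = n = 0. \end{cases}$$
13.4.2 (a) Show that $T_0^*(x) = 1$ and $T_1^*(x) = 2x - 1$.
(b) Derive the shifted Chebyshev polynomial recurrence relation
$$T_{n+1}^*(x) = 2(2x - 1)\, T_n^*(x) - T_{n-1}^*(x).$$
With this recurrence relation and the results of part (a), all the other shifted
Chebyshev polynomials can be developed.
13.4.3 Develop the following Chebyshev expansions (for $[-1, 1]$):
(a) $(1 - x^2)^{1/2} = \dfrac{2}{\pi}\left[1 - 2\sum_{s=1}^{\infty} (4s^2 - 1)^{-1}\, T_{2s}(x)\right]$.
(b) $\left.\begin{matrix} +1, & 0 < x \le 1 \\ -1, & -1 \le x < 0 \end{matrix}\right\} = \dfrac{4}{\pi} \sum_{s=0}^{\infty} (-1)^s (2s + 1)^{-1}\, T_{2s+1}(x)$.
13.4.4 (a) For the interval $[-1, 1]$ show that
$$|x| = \frac{2}{\pi} + \frac{4}{\pi} \sum_{s=1}^{\infty} (-1)^{s+1} \frac{T_{2s}(x)}{4s^2 - 1}.$$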
(b) Show that the ratio of the coefficient of $T_{2s}(x)$ to that of $P_{2s}(x)$ approaches
$(\pi s)^{-1/2}$ as $s \to \infty$. This illustrates the relatively rapid convergence of the
Chebyshev series.
Hint. Legendre: with the Legendre recurrence relations, rewrite $xP_n(x)$ as
a linear combination of derivatives. Chebyshev: the trigonometric
substitution $x = \cos\theta$, $T_n(x) = \cos n\theta$ is most helpful.
13.4.5 Show that
$$\frac{\pi^2}{8} = 1 + 2\sum_{s=1}^{\infty} (4s^2 - 1)^{-2}.$$
Hint. Apply Parseval's identity (or the completeness relation) to the results of
Exercise 13.4.4.
13.4.6 Show that
(a) $\cos^{-1} x = \dfrac{\pi}{2} - \dfrac{4}{\pi} \sum_{n=0}^{\infty} \dfrac{1}{(2n + 1)^2}\, T_{2n+1}(x)$,
(b) $\sin^{-1} x = \dfrac{4}{\pi} \sum_{n=0}^{\infty} \dfrac{1}{(2n + 1)^2}\, T_{2n+1}(x)$.
13.4.7 (a) Write a double precision subroutine that will transform a finite power series
$\sum_{n=0}^{N} b_n x^n$ into a Chebyshev series $\sum_{n=0}^{N} a_n T_n(x)$. Use the recurrence relation
iteration technique outlined in this section.
(b) Call your subroutine to find the Chebyshev series coefficients for (1) $e^x$,
(2) $e^{-x}$, (3) $\cosh x$, and (4) $\sinh x$. Carry terms through $T_{12}(x)$.
Note. Exercise 11.5.16 is a calculation of these Chebyshev coefficients in
terms of modified Bessel functions, $I_n$.
13.4.8 (a) Using the double precision Chebyshev coefficients for $\sinh x$ from Exercise
13.4.7 or 11.5.16 through $a_{11} T_{11}$, drop the $a_{11} T_{11}$ term. Compare the error
in your telescoped series with the error in (1) the original series and (2) the
Maclaurin series of the same number of terms as your telescoped
series. Convert your new Chebyshev series into a power series.
(b) Repeat part (a), dropping $a_9 T_9$. Calculate the approximately equiripple
error curve and compare with the error curve for the Maclaurin series
through
13.5 HYPERGEOMETRIC FUNCTIONS
In Chapter 8 the hypergeometric equation1
x(l - x)y"(x) + [c - (a + b + l)x]y'(x) - aby(x) = 0 A3.113)
was introduced as a canonical form of a linear second-order differential equation
with regular singularities at x = 0, 1, and oo. One solution is
$$y(x) = {}_2F_1(a, b, c; x) = 1 + \frac{a\, b}{c} \frac{x}{1!} + \frac{a(a + 1)\, b(b + 1)}{c(c + 1)} \frac{x^2}{2!} + \cdots, \tag{13.114}$$
which is known as the hypergeometric function or hypergeometric series.
The range of convergence is $|x| < 1$ and $x = 1$, for $c > a + b$, and $x = -1$, for
$c > a + b - 1$. In terms of the often used Pochhammer symbol
$$(a)_n = a(a + 1)(a + 2) \cdots (a + n - 1) = \frac{(a + n - 1)!}{(a - 1)!}, \qquad (a)_0 = 1, \tag{13.115}$$
the hypergeometric function becomes
$${}_2F_1(a, b, c; x) = \sum_{n=0}^{\infty} \frac{(a)_n (b)_n}{(c)_n} \frac{x^n}{n!}. \tag{13.116}$$
¹ This is sometimes called Gauss's differential equation. The solutions then
become Gauss functions.
In this form the subscripts 2 and 1 become clear. The leading subscript 2
indicates that two Pochhammer symbols appear in the numerator and the
final subscript 1 indicates one Pochhammer symbol in the denominator.2 The
confluent hypergeometric function 1Fl with one Pochhammer symbol in the
numerator and one in the denominator appears in Section 13.6.
From the form of Eq. 13.114 we see that the parameter с may not be zero
or a negative integer. On the other hand, if a or b equals 0 or a negative integer,
the series terminates and the hypergeometric function becomes a simple
polynomial.
Many more or less elementary functions can be represented by the
hypergeometric function.³ We find
$$\ln(1 + x) = x\, {}_2F_1(1, 1, 2; -x). \tag{13.117}$$
For the complete elliptic integrals $K$ and $E$
$$K(k^2) = \int_0^{\pi/2} (1 - k^2 \sin^2\theta)^{-1/2}\, d\theta = \frac{\pi}{2}\, {}_2F_1\!\left(\tfrac{1}{2}, \tfrac{1}{2}, 1; k^2\right), \tag{13.118}$$
$$E(k^2) = \int_0^{\pi/2} (1 - k^2 \sin^2\theta)^{1/2}\, d\theta = \frac{\pi}{2}\, {}_2F_1\!\left(\tfrac{1}{2}, -\tfrac{1}{2}, 1; k^2\right). \tag{13.119}$$
The explicit series forms and other properties of the elliptic integrals are
developed in Section 5.8.
The hypergeometric equation as a second-order linear differential equation
has a second independent solution. The usual form is
$$y(x) = x^{1-c}\,{}_2F_1(a+1-c,\,b+1-c,\,2-c;\,x), \qquad c \neq 2, 3, 4, \ldots. \tag{13.120}$$
² The Pochhammer symbol is often useful in other expressions involving
factorials, for instance,
³ With three parameters, a, b, and c, we can represent almost anything.
The reader may show (Exercise 13.5.1) that if с is an integer either the two
solutions coincide or (barring a rescue by integral a or integral b) one of the
solutions will blow up. In such a case the second solution is expected to include
a logarithmic term.
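Alternate forms of the hypergeometric equation include
$$(1 - z^2)\frac{d^2}{dz^2}y\left(\frac{1-z}{2}\right) - [(a+b+1)z - (a+b+1-2c)]\frac{d}{dz}y\left(\frac{1-z}{2}\right) - ab\,y\left(\frac{1-z}{2}\right) = 0, \tag{13.121}$$
$$(1 - z^2)\frac{d^2}{dz^2}y(z^2) - \left[(2a+2b+1)z + \frac{1-2c}{z}\right]\frac{d}{dz}y(z^2) - 4ab\,y(z^2) = 0. \tag{13.122}$$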
Contiguous Function Relations
The parameters a, b, and c enter in the same way as the parameter n of Bessel,
Legendre, and other special functions. As we found with these functions, we
expect recurrence relations involving unit changes in the parameters a, b, and
c. The usual nomenclature for the hypergeometric functions in which one
parameter changes by +1 or -1 is a "contiguous function." Generalizing this
term to include simultaneous unit changes in more than one parameter, we
find 26 functions contiguous to 2F1(a,b,c;x). Taking them two at a time,
we can develop the formidable total of 325 equations among the contiguous
functions. One typical example is
$$(a-b)\{c(a+b-1) + 1 - a^2 - b^2 + [(a-b)^2 - 1](1-x)\}\,{}_2F_1(a,b,c;x)$$
$$\qquad = (c-a)(a-b+1)\,b\,{}_2F_1(a-1,b+1,c;x) + (c-b)(a-b-1)\,a\,{}_2F_1(a+1,b-1,c;x). \tag{13.123}$$
Another contiguous function relation appears in Exercise 13.5.10.
Hypergeometric Representations
Since the ultraspherical equation (13.81) in Section 13.3 is a special case of
Eq. 13.113, we see that ultraspherical functions (and Legendre and Chebyshev
functions) may be expressed as hypergeometric functions. For the ultraspherical
function we obtain
$$T_n^{\beta}(x) = \frac{(n+2\beta)!}{2^{\beta}\,n!\,\beta!}\,{}_2F_1\left(-n,\,n+2\beta+1,\,1+\beta;\,\frac{1-x}{2}\right). \tag{13.124}$$
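For Legendre and associated Legendre functions
$$P_n(x) = {}_2F_1\left(-n,\,n+1,\,1;\,\frac{1-x}{2}\right), \tag{13.125}$$
$$P_n^m(x) = \frac{(n+m)!}{(n-m)!}\frac{(1-x^2)^{m/2}}{2^m\,m!}\,{}_2F_1\left(m-n,\,m+n+1,\,m+1;\,\frac{1-x}{2}\right). \tag{13.126}$$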
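Alternate forms are
$$P_{2n}(x) = (-1)^n\frac{(2n)!}{2^{2n}\,n!\,n!}\,{}_2F_1\left(-n,\,n+\tfrac12,\,\tfrac12;\,x^2\right) = (-1)^n\frac{(2n-1)!!}{(2n)!!}\,{}_2F_1\left(-n,\,n+\tfrac12,\,\tfrac12;\,x^2\right), \tag{13.127}$$
$$P_{2n+1}(x) = (-1)^n\frac{(2n+1)!}{2^{2n}\,n!\,n!}\,x\,{}_2F_1\left(-n,\,n+\tfrac32,\,\tfrac32;\,x^2\right) = (-1)^n\frac{(2n+1)!!}{(2n)!!}\,x\,{}_2F_1\left(-n,\,n+\tfrac32,\,\tfrac32;\,x^2\right). \tag{13.128}$$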
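In terms of hypergeometric functions the Chebyshev functions become
$$T_n(x) = {}_2F_1\left(-n,\,n,\,\tfrac12;\,\frac{1-x}{2}\right), \tag{13.129}$$
$$U_n(x) = (n+1)\,{}_2F_1\left(-n,\,n+2,\,\tfrac32;\,\frac{1-x}{2}\right), \tag{13.130}$$
$$V_n(x) = n\sqrt{1-x^2}\;{}_2F_1\left(-n+1,\,n+1,\,\tfrac32;\,\frac{1-x}{2}\right). \tag{13.131}$$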
The leading factors are determined by direct comparison of complete power
series, comparison of coefficients of particular powers of the variable, or
evaluation at x = 0 or 1, and so on.
The hypergeometric series may be used to define functions with nonintegral
indices. The physical applications are minimal.
EXERCISES
13.5.1 (a) For c, an integer, and a and b, nonintegral, show that
${}_2F_1(a,b,c;x)$ and $x^{1-c}\,{}_2F_1(a+1-c,\,b+1-c,\,2-c;\,x)$
yield only one solution to the hypergeometric equation.
(b) What happens if a is an integer, say, a = -1, and c = -2?
13.5.2 Find the Legendre, Chebyshev I, and Chebyshev II recurrence relations
corresponding to the contiguous hypergeometric function equation (13.123).
13.5.3 Transform the following polynomials into hypergeometric functions of
argument $x^2$. (a) $T_{2n}(x)$; (b) $x^{-1}T_{2n+1}(x)$; (c) $U_{2n}(x)$; (d) $x^{-1}U_{2n+1}(x)$.
ANS. (a) $T_{2n}(x) = (-1)^n\,{}_2F_1(-n,\,n,\,\tfrac12;\,x^2)$.
(b) $x^{-1}T_{2n+1}(x) = (-1)^n(2n+1)\,{}_2F_1(-n,\,n+1,\,\tfrac32;\,x^2)$.
(c) $U_{2n}(x) = (-1)^n\,{}_2F_1(-n,\,n+1,\,\tfrac12;\,x^2)$.
(d) $x^{-1}U_{2n+1}(x) = (-1)^n(2n+2)\,{}_2F_1(-n,\,n+2,\,\tfrac32;\,x^2)$.
13.5.4 Derive or verify the leading factor in the hypergeometric representations of the
Chebyshev functions.
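13.5.5 Verify that the Legendre function of the second kind, $Q_\nu(z)$, is given by
$$Q_\nu(z) = \frac{\pi^{1/2}\,\nu!}{(\nu+\tfrac12)!\,(2z)^{\nu+1}}\,{}_2F_1\left(\frac{\nu}{2}+\frac12,\,\frac{\nu}{2}+1,\,\nu+\frac32;\,z^{-2}\right),$$
$|z| > 1$, $|\arg z| < \pi$, $\nu \neq -1, -2, -3, \ldots$.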
13.5.6 Analogous to the incomplete gamma function, we may define an incomplete
beta function by
$$B_x(a,b) = \int_0^x t^{a-1}(1-t)^{b-1}\,dt.$$
Show that
$$B_x(a,b) = a^{-1}x^a\,{}_2F_1(a,\,1-b,\,a+1;\,x).$$
13.5.7 Verify the integral representation
$${}_2F_1(a,b,c;z) = \frac{\Gamma(c)}{\Gamma(b)\,\Gamma(c-b)}\int_0^1 t^{b-1}(1-t)^{c-b-1}(1-tz)^{-a}\,dt.$$
What restrictions must you place on the parameters b and c, on the variable z?
Note. The restriction on |z| can be dropped (analytic continuation). For
nonintegral a the real axis in the z-plane from 1 to ∞ is a cut line.
Hint. The integral is suspiciously like a beta function and can be expanded into
a series of beta functions.
ANS. Re(c) > Re(b) > 0, and |z| < 1.
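13.5.8 Prove that
$${}_2F_1(a,b,c;1) = \frac{\Gamma(c)\,\Gamma(c-a-b)}{\Gamma(c-a)\,\Gamma(c-b)}, \qquad c \neq 0, -1, -2, \ldots, \quad c > a + b.$$
Hint. Here is a chance to use the integral representation, Exercise 13.5.7.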
13.5.9 Prove that
$${}_2F_1(a,b,c;x) = (1-x)^{-a}\,{}_2F_1\left(a,\,c-b,\,c;\,\frac{-x}{1-x}\right).$$
Hint. Try the integral representation, Exercise 13.5.7.
Note. This relation is useful in developing a Rodrigues representation of $T_n(x)$
(compare Exercise 13.3.11).
13.5.10 Verify
Hint. Here is a chance to use the contiguous function relation
[2a-c + (b- a)x]F(a, b,c;x) = a(l - x)F(a + 1, b, c; x) - (c - a)F(a - 1, b,
c;x) and mathematical induction. Alternatively, you can use the integral
representation and the beta function.
13.6 CONFLUENT HYPERGEOMETRIC FUNCTIONS
The confluent hypergeometric equation¹
$$x\,y''(x) + (c - x)\,y'(x) - a\,y(x) = 0 \tag{13.132}$$
may be obtained from the hypergeometric equation of Section 13.5 by merging
two of its singularities. The resulting equation has a regular singularity at x = 0
and an irregular one at x = ∞. One solution of the confluent hypergeometric
equation is
$$y(x) = {}_1F_1(a,c;x) = M(a,c;x) = 1 + \frac{a}{c}\frac{x}{1!} + \frac{a(a+1)}{c(c+1)}\frac{x^2}{2!} + \cdots, \qquad c \neq 0, -1, -2, \ldots. \tag{13.133}$$
This solution is convergent for all finite x (or z). In terms of the Pochhammer
symbols, we have
$$M(a,c;x) = \sum_{n=0}^{\infty}\frac{(a)_n}{(c)_n}\frac{x^n}{n!}. \tag{13.134}$$
Clearly, M(a,c;x) becomes a polynomial if the parameter a is 0 or a negative
integer. Numerous more or less elementary functions may be represented by
the confluent hypergeometric function. Examples are the error function and the
incomplete gamma function.
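$$\operatorname{erf}(x) = \frac{2}{\pi^{1/2}}\int_0^x e^{-t^2}\,dt = \frac{2}{\pi^{1/2}}\,x\,M\left(\tfrac12,\,\tfrac32;\,-x^2\right), \tag{13.135}$$
$$\gamma(a,x) = \int_0^x e^{-t}t^{a-1}\,dt = a^{-1}x^a M(a,\,a+1;\,-x), \qquad \mathrm{Re}(a) > 0 \quad \text{(from Eq. 10.71)}. \tag{13.136}$$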
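As an illustration of these two representations, here is a small Python sketch (not from the original text) that sums the Kummer series directly and compares it with the standard-library `math.erf` and with the closed form γ(1,x) = 1 - e^{-x}. The names `kummer_m`, `erf_via_m`, and `lower_gamma_via_m` are hypothetical, and the plain term-by-term sum is only intended for moderate |x|.

```python
import math

def kummer_m(a, c, x, max_terms=1000, tol=1e-16):
    """Sum M(a,c;x) = sum_n (a)_n/(c)_n x^n/n! term by term."""
    term = 1.0
    total = 1.0
    for n in range(max_terms):
        # Ratio of successive terms: (a+n) x / ((c+n)(n+1)).
        term *= (a + n) * x / ((c + n) * (n + 1))
        total += term
        if term == 0.0 or abs(term) <= tol * abs(total):
            break
    return total

def erf_via_m(x):
    """erf(x) = (2/sqrt(pi)) x M(1/2, 3/2; -x^2)."""
    return 2.0 * x / math.sqrt(math.pi) * kummer_m(0.5, 1.5, -x * x)

def lower_gamma_via_m(a, x):
    """Incomplete gamma function gamma(a,x) = x^a/a M(a, a+1; -x), a > 0."""
    return x ** a / a * kummer_m(a, a + 1.0, -x)

if __name__ == "__main__":
    print(erf_via_m(0.8), math.erf(0.8))
    print(lower_gamma_via_m(1.0, 1.2), 1.0 - math.exp(-1.2))
```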
The error function and the incomplete gamma function are discussed further in
Section 10.5.
A second solution of Eq. 13.132 is given by
$$y(x) = x^{1-c}M(a+1-c,\,2-c;\,x), \qquad c \neq 2, 3, 4, \ldots. \tag{13.137}$$
Clearly, this coincides with the first solution for c = 1.
The standard form of the second solution of Eq. 13.132 is a linear combination
of Eqs. 13.133 and 13.137:
$$U(a,c;x) = \frac{\pi}{\sin\pi c}\left[\frac{M(a,c;x)}{(a-c)!\,(c-1)!} - \frac{x^{1-c}M(a+1-c,\,2-c;\,x)}{(a-1)!\,(1-c)!}\right]. \tag{13.138}$$
Note the resemblance to our definition of the Neumann function, Eq. 11.60. As
with our Neumann function, Eq. 11.60, this definition of U(a,c;x) becomes
indeterminate for c an integer.
¹ This is often called Kummer's equation. The solutions, then, are Kummer
functions.
An alternate form of the confluent hypergeometric equation that will be
useful later is obtained by changing the independent variable from x to x²:
$$\frac{d^2}{dx^2}y(x^2) + \left[\frac{2c-1}{x} - 2x\right]\frac{d}{dx}y(x^2) - 4a\,y(x^2) = 0. \tag{13.139}$$
As with the hypergeometric functions, contiguous functions exist in which the
parameters a and c are changed by ±1. Including the cases of simultaneous
changes in the two parameters,2 we have eight possibilities. Taking the original
function and pairs of the contiguous functions, we can develop a total of 28
equations.3
Integral Representations
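It is frequently convenient to have the confluent hypergeometric functions in
integral form. We find (Exercise 13.6.10)
$$M(a,c;x) = \frac{\Gamma(c)}{\Gamma(a)\,\Gamma(c-a)}\int_0^1 e^{xt}t^{a-1}(1-t)^{c-a-1}\,dt, \qquad \mathrm{Re}(c) > \mathrm{Re}(a) > 0, \tag{13.140}$$
$$U(a,c;x) = \frac{1}{\Gamma(a)}\int_0^{\infty}e^{-xt}t^{a-1}(1+t)^{c-a-1}\,dt, \qquad \mathrm{Re}(x) > 0,\ \mathrm{Re}(a) > 0. \tag{13.141}$$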
Three important techniques for deriving or verifying integral representations
are as follows:
1. Transformation of generating function expansions
and Rodrigues representations: The Bessel and
Legendre functions provide examples of this approach.
2. Direct integration to yield a series: This direct
technique is useful for a Bessel function representation
(Exercise 11.1.18) and a hypergeometric integral
(Exercise 13.5.7).
3. (a) Verification that the integral representation
satisfies the differential equation, (b) Exclusion of the
other solution, (c) Verification of normalization.
This is the method used in Section 11.6 to establish
an integral representation of the modified Bessel
function, Kv(z). It will work here to establish Eqs.
13.140 and 13.141.
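A numerical spot check is a useful complement to these three techniques. The Python sketch below (an illustration added here, not the author's method) compares a midpoint-rule estimate of the integral representation of M(a,c;x) with the power series for one real parameter set with c > a > 0. The names `kummer_m` and `kummer_m_integral`, the panel count, and the choice a = 2, c = 4 are all assumptions made for the example; for a < 1 or c - a < 1 the integrand has endpoint singularities and the simple midpoint rule would converge slowly.

```python
import math

def kummer_m(a, c, x, max_terms=1000, tol=1e-16):
    """Power series M(a,c;x) = sum_n (a)_n/(c)_n x^n/n!."""
    term = 1.0
    total = 1.0
    for n in range(max_terms):
        term *= (a + n) * x / ((c + n) * (n + 1))
        total += term
        if term == 0.0 or abs(term) <= tol * abs(total):
            break
    return total

def kummer_m_integral(a, c, x, panels=20000):
    """Midpoint-rule estimate of
    Gamma(c)/(Gamma(a)Gamma(c-a)) * int_0^1 e^{xt} t^{a-1} (1-t)^{c-a-1} dt,
    assuming real c > a > 0."""
    h = 1.0 / panels
    acc = 0.0
    for i in range(panels):
        t = (i + 0.5) * h  # midpoint of the i-th panel
        acc += math.exp(x * t) * t ** (a - 1) * (1 - t) ** (c - a - 1)
    prefactor = math.gamma(c) / (math.gamma(a) * math.gamma(c - a))
    return prefactor * h * acc

if __name__ == "__main__":
    a, c, x = 2.0, 4.0, 1.5
    print(kummer_m(a, c, x), kummer_m_integral(a, c, x))
```

Agreement of the two numbers checks the normalization (step c of technique 3) as well as the functional form.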
Bessel and Modified Bessel Functions
Kummer's first formula,
$$M(a,c;x) = e^{x}M(c-a,\,c;\,-x), \tag{13.142}$$
is useful in representing the Bessel and modified Bessel functions. The formula
may be verified by series expansion or use of an integral representation (compare
Exercise 13.6.10).
² Slater refers to these as associated functions.
³ The recurrence relations for Bessel, Hermite, and Laguerre functions are
special cases of these equations.
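Kummer's first formula, M(a,c;x) = e^x M(c-a,c;-x), can also be checked numerically by series expansion. The following Python sketch is an added illustration under simple assumptions: `kummer_m` and `kummer_first_formula_gap` are hypothetical names, and the direct series sum is adequate only for moderate |x|.

```python
import math

def kummer_m(a, c, x, max_terms=1000, tol=1e-16):
    """Power series M(a,c;x) = sum_n (a)_n/(c)_n x^n/n!."""
    term = 1.0
    total = 1.0
    for n in range(max_terms):
        term *= (a + n) * x / ((c + n) * (n + 1))
        total += term
        if term == 0.0 or abs(term) <= tol * abs(total):
            break
    return total

def kummer_first_formula_gap(a, c, x):
    """Difference between the two sides of M(a,c;x) = e^x M(c-a,c;-x)."""
    return kummer_m(a, c, x) - math.exp(x) * kummer_m(c - a, c, -x)

if __name__ == "__main__":
    for a, c, x in [(0.5, 1.5, 2.0), (1.25, 3.0, -0.7), (2.0, 2.5, 1.0)]:
        print(a, c, x, kummer_first_formula_gap(a, c, x))
```

The printed gaps should be at the level of floating-point rounding.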
As expected from the form of the confluent hypergeometric equation and the
character of its singularities, the confluent hypergeometric functions are useful
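in representing a number of the special functions of mathematical physics. For
the Bessel functions
$$J_\nu(x) = \frac{e^{-ix}}{\nu!}\left(\frac{x}{2}\right)^{\nu}M\left(\nu+\tfrac12,\,2\nu+1;\,2ix\right), \tag{13.143}$$
whereas for the modified Bessel functions of the first kind,
$$I_\nu(x) = \frac{e^{-x}}{\nu!}\left(\frac{x}{2}\right)^{\nu}M\left(\nu+\tfrac12,\,2\nu+1;\,2x\right). \tag{13.144}$$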
Hermite Functions
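The Hermite functions are given by
$$H_{2n}(x) = (-1)^n\frac{(2n)!}{n!}M\left(-n,\,\tfrac12;\,x^2\right), \tag{13.145}$$
$$H_{2n+1}(x) = (-1)^n\frac{2(2n+1)!}{n!}\,x\,M\left(-n,\,\tfrac32;\,x^2\right), \tag{13.146}$$
using Eq. 13.139.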
Comparing the Laguerre differential equation with the confluent hypergeometric
equation, we have
$$L_n(x) = M(-n,\,1;\,x). \tag{13.147}$$
The constant is fixed as unity by noting Eq. 13.35 for x = 0. For the associated
Laguerre functions
$$L_n^m(x) = (-1)^m\frac{d^m}{dx^m}L_{n+m}(x) = \frac{(n+m)!}{n!\,m!}M(-n,\,m+1;\,x). \tag{13.148}$$
Alternate verification is obtained by comparing Eq. 13.148 with the power-series
solution (Eq. 13.41 of Section 13.2). Note that in the hypergeometric form, as
distinct from a Rodrigues representation, the indices n and m need not be
integers and, if they are not integers, $L_n^m(x)$ will not be a polynomial.
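The Laguerre relations can be illustrated with a short Python sketch (added here as an example; the names `kummer_m` and `laguerre_via_m` are hypothetical). The factorials are written with gamma functions, which is an assumption made so that the same code accepts nonintegral n and m, in the spirit of the closing remark above.

```python
import math

def kummer_m(a, c, x, max_terms=1000, tol=1e-16):
    """Power series M(a,c;x); terminates exactly when a = 0, -1, -2, ..."""
    term = 1.0
    total = 1.0
    for n in range(max_terms):
        term *= (a + n) * x / ((c + n) * (n + 1))
        total += term
        if term == 0.0 or abs(term) <= tol * abs(total):
            break
    return total

def laguerre_via_m(n, m, x):
    """L_n^m(x) = (n+m)!/(n! m!) M(-n, m+1; x), with z! taken as Gamma(z+1)."""
    prefactor = math.gamma(n + m + 1) / (math.gamma(n + 1) * math.gamma(m + 1))
    return prefactor * kummer_m(-n, m + 1, x)

if __name__ == "__main__":
    x = 0.7
    # Explicit low-order polynomials for comparison.
    print(laguerre_via_m(2, 0, x), 1 - 2 * x + 0.5 * x * x)  # L_2(x)
    print(laguerre_via_m(1, 1, x), 2 - x)                    # L_1^1(x)
    # Nonintegral index: the series no longer terminates.
    print(laguerre_via_m(0.5, 0, x))
```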
Miscellaneous Cases
There are certain advantages in expressing our special functions in terms of
hypergeometric and confluent hypergeometric functions. If the general behavior
of the latter functions is known, the behavior of the special functions we have
investigated follows as a series of special cases. This may be useful in determining
asymptotic behavior or evaluating normalization integrals. The asymptotic
behavior of M(a,c;x) and U(a,c;x) may be conveniently obtained from integral
representations of these functions, Eqs. 13.140 and 13.141. The further advantage
is that the relations between the special functions are clarified. For instance, an
examination of Eqs. 13.145, 13.146, and 13.148 suggests that the Laguerre and
Hermite functions are related.
The confluent hypergeometric equation (13.132) is clearly not self-adjoint.
For this and other reasons it is convenient to define
$$M_{k\mu}(x) = e^{-x/2}x^{\mu+1/2}M\left(\mu-k+\tfrac12,\,2\mu+1;\,x\right). \tag{13.149}$$
This new function $M_{k\mu}(x)$ is a Whittaker function which satisfies the self-adjoint
equation
$$M''_{k\mu}(x) + \left(-\frac14 + \frac{k}{x} + \frac{\tfrac14-\mu^2}{x^2}\right)M_{k\mu}(x) = 0. \tag{13.150}$$
The corresponding second solution is
$$W_{k\mu}(x) = e^{-x/2}x^{\mu+1/2}U\left(\mu-k+\tfrac12,\,2\mu+1;\,x\right). \tag{13.151}$$
EXERCISES
13.6.1 Verify the confluent hypergeometric representation of the error function
$$\operatorname{erf}(x) = \frac{2x}{\pi^{1/2}}M\left(\tfrac12,\,\tfrac32;\,-x^2\right).$$
13.6.2 Show that the Fresnel integrals C(x) and S(x) of Exercise 5.10.2 may be expressed
in terms of the confluent hypergeometric function as
$$C(x) + iS(x) = x\,M\left(\tfrac12,\,\tfrac32;\,\frac{i\pi x^2}{2}\right).$$
13.6.3 By direct differentiation and substitution verify that
$$y = a\,x^{-a}\int_0^x e^{-t}t^{a-1}\,dt = a\,x^{-a}\gamma(a,x)$$
actually does satisfy
$$x\,y'' + (a + 1 + x)\,y' + a\,y = 0.$$
13.6.4 Show that the modified Bessel function of the second kind $K_\nu(x)$ is given by
$$K_\nu(x) = \pi^{1/2}e^{-x}(2x)^{\nu}U\left(\nu+\tfrac12,\,2\nu+1;\,2x\right).$$
13.6.5 Show that the cosine and sine integrals of Section 10.5 may be expressed in
terms of confluent hypergeometric functions as
$$\mathrm{Ci}(x) + i\,\mathrm{si}(x) = -e^{ix}U(1,\,1;\,-ix).$$
This relation is useful in numerical computation of Ci(x) and si(x) for large
values of x.
13.6.6 Verify the confluent hypergeometric form of the Hermite polynomial $H_{2n+1}(x)$
(Eq. 13.146) by showing that
(a) $H_{2n+1}(x)/x$ satisfies the confluent hypergeometric equation with a = -n,
c = 3/2 and argument $x^2$,
(b) $\lim_{x\to 0} H_{2n+1}(x)/x = (-1)^n\,\dfrac{2(2n+1)!}{n!}$.
13.6.7 Show that the contiguous confluent hypergeometric function equation,
$$(c - a)M(a-1,c;x) + (2a - c + x)M(a,c;x) - a\,M(a+1,c;x) = 0,$$
leads to the associated Laguerre function recurrence relation (Eq. 13.44).
13.6.8 Verify the Kummer transformations:
(a) M(a,c;x) = exM(c — a,c;—x)
(b) U(a,c;x) = x1~cU(a-c+ 1,2-c;x).
13.6.9 Prove that
(a) $\dfrac{d^n}{dx^n}M(a,c;x) = \dfrac{(a)_n}{(c)_n}M(a+n,\,c+n;\,x)$,
(b) $\dfrac{d^n}{dx^n}U(a,c;x) = (-1)^n(a)_n\,U(a+n,\,c+n;\,x)$.
13.6.10 Verify the following integral representations:
(a) $M(a,c;x) = \dfrac{\Gamma(c)}{\Gamma(a)\,\Gamma(c-a)}\displaystyle\int_0^1 e^{xt}t^{a-1}(1-t)^{c-a-1}\,dt$, Re(c) > Re(a) > 0,
(b) $U(a,c;x) = \dfrac{1}{\Gamma(a)}\displaystyle\int_0^{\infty}e^{-xt}t^{a-1}(1+t)^{c-a-1}\,dt$, Re(x) > 0, Re(a) > 0.
Under what conditions can you accept Re(x) = 0 in part (b)?
13.6.11 From the integral representation of M(a,c;x), Exercise 13.6.10(a), show that
M{a,c;x) = exM(c — a,c; —x).
Hint. Replace the variable of integration t by 1 — s to release a factor ex from
the integral.
13.6.12 From the integral representation of U(a,c;x), Exercise 13.6.10(b), show that
the exponential integral is given by
$$E_1(x) = e^{-x}U(1,\,1;\,x).$$
Hint. Replace the variable of integration t in $E_1(x)$ by x(1 + s).
13.6.13 From the integral representations of M(а,с;х) and U(a, с;х) in Exercise 13.6.10
develop asymptotic expansions of
(a) M{a,c;x),
(b) U(a,c;x).
Hint. You can use the technique that was employed with Kv(z), Section 11.6.
W + + +
т^/ л с — а I { 1
1 [ аA + а - с) а(а + 1)A + а - с)B + а - с)
(b) |1 ++ , +|.
13.6.14 Show that the Wronskian of the two confluent hypergeometric functions,
M(a,c;x) and U(a,c;x), is given by
$$MU' - M'U = -\frac{(c-1)!}{(a-1)!}\frac{e^x}{x^c}.$$
What happens if a is 0 or a negative integer?
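13.6.15 The Coulomb wave equation (radial part of the Schrodinger wave equation
with Coulomb potential) is
$$\frac{d^2y}{d\rho^2} + \left[1 - \frac{2\eta}{\rho} - \frac{L(L+1)}{\rho^2}\right]y = 0.$$
Show that a regular solution, $y = F_L(\eta,\rho)$, is given by
$$F_L(\eta,\rho) = C_L(\eta)\,\rho^{L+1}e^{-i\rho}M(L+1-i\eta,\,2L+2;\,2i\rho).$$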
13.6.16 (a) Show that the radial part of the hydrogen wave function, Eq. 13.60, may
be written as
$$e^{-\alpha r/2}(\alpha r)^L L_{n-L-1}^{2L+1}(\alpha r) = \frac{(n+L)!}{(n-L-1)!\,(2L+1)!}\,e^{-\alpha r/2}(\alpha r)^L M(L+1-n,\,2L+2;\,\alpha r).$$
(b) It was assumed previously that the total (kinetic + potential) energy E
of the electron was negative. Rewrite the (unnormalized) radial wave
function for the free electron E > 0.
ANS. $e^{i\alpha r/2}(\alpha r)^L M(L+1-in,\,2L+2,\,-i\alpha r)$, outgoing wave. This
representation provides a powerful alternative technique for
the calculation of photoionization and recombination
coefficients.
13.6.17 Show that the Laplace transform of M(a,c;x) is
$$\mathcal{L}\{M(a,c;x)\} = \frac{1}{s}\,{}_2F_1\left(a,\,1,\,c;\,\frac{1}{s}\right).$$
13.6.18 Evaluate
(а) Г\_МкA(х)]Чх
Jo
(b)
Jo x
where 2ц = 0, 1, 2, ... ,k - ц - { = 0, 1, 2, ..., a > -2ц - 1.
ANS. (a) Bfx)\2k.
(b) Bfx)l
(c) B(х)\Bку.
REFERENCES
Abramowitz, M., and I. A. Stegun, eds., Handbook of Mathematical Functions.
Washington, D.C.: National Bureau of Standards, Applied Mathematics Series-55 (1964).
Paperback edition, New York: Dover (1964).
Chapter 22 is a detailed summary of the properties and representations of orthogonal
polynomials. Other chapters summarize properties of Bessel, Legendre, hypergeometric,
and confluent hypergeometric functions and much more.
Buchholz, H., The Confluent Hypergeometric Function. New York: Springer-Verlag (1952,
translated 1969).
Buchholz strongly emphasizes the Whittaker rather than the Kummer forms.
Applications to a variety of other transcendental functions.
Erdelyi, A., W. Magnus, F. Oberhettinger, and F. G. Tricomi, Higher Transcendental
Functions, Three vols. New York: McGraw-Hill (1953; reprinted 1981).
A detailed, almost exhaustive listing of the properties of the special functions of
mathematical physics.
Fox, L., and I. B. Parker, Chebyshev Polynomials in Numerical Analysis. Oxford: Oxford
University Press (1968).
A detailed, thorough but very readable account of Chebyshev polynomials and their
applications in numerical analysis.
Lebedev, N. N., Special Functions and their Applications. Translated by R. A. Silverman.
Englewood Cliffs, N.J.: Prentice-Hall (1965). Paperback, New York: Dover (1972).
Luke, Y. L., The Special Functions and Their Approximations. New York: Academic Press
(1969).
Two volumes; Volume 1 is a thorough theoretical treatment of gamma functions,
hypergeometric functions, confluent hypergeometric functions, and related functions.
Volume 2 develops approximations and other techniques for numerical work.
Luke, Y. L., Mathematical Functions and Their Approximations. New York: Academic
Press (1975).
This is an updated supplement to Handbook of Mathematical Functions with Formulas,
Graphs and Mathematical Tables (AMS-55).
Magnus, W., F. Oberhettinger, and R. P. Soni, Formulas and Theorems for the Special
Functions of Mathematical Physics. New York: Springer (1966).
This is a new and enlarged edition. An excellent summary of just what the title says,
including the topics of Chapters 10 to 13.
Rainville, E. D., Special Functions. New York: Macmillan (1960).
This book is a coherent, comprehensive account of almost all the special functions of
mathematical physics that the reader is likely to encounter.
Sansone, G., Orthogonal Functions. Translated by A. H. Diamond. New York: Interscience
Publishers (1959; reprinted 1977).
Slater, L. J., Confluent Hypergeometric Functions. Cambridge: Cambridge University
Press (1960).
This is a clear and detailed development of the properties of the confluent
hypergeometric functions and of relations of the confluent hypergeometric equation to other
differential equations of mathematical physics.
Sneddon, I. N., Special Functions of Mathematical Physics and Chemistry, 3rd ed. New
York: Longman (1980).
14 FOURIER SERIES
14.1 GENERAL PROPERTIES
Fourier Series
A Fourier series may be defined as an expansion of a function or representation
of a function in a series of sines and cosines such as
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty}a_n\cos nx + \sum_{n=1}^{\infty}b_n\sin nx. \tag{14.1}$$
The coefficients $a_0$, $a_n$, and $b_n$ are related to the given function f(x) by definite
integrals: Eqs. 14.11 and 14.12. You will notice that $a_0$ is singled out for special
treatment by the inclusion of the factor 1/2. This is done so that Eq. 14.11 will apply
to all $a_n$, n = 0 as well as n > 0.
The conditions imposed on f(x) to make Eq. 14.1 valid are that f(x) has only
a finite number of finite discontinuities and only a finite number of extreme
values, maxima, and minima.¹ Functions satisfying these conditions may be
called piecewise regular. The conditions themselves are known as the Dirichlet
conditions. Although there are some functions that do not obey these Dirichlet
conditions, they may well be labeled pathological for purposes of Fourier
expansions. In the vast majority of physical problems involving a Fourier series these
conditions will be satisfied. In most physical problems we shall be interested in
functions that are square integrable (in the Hilbert space L² of Section 9.4). In
this space the sines and cosines form a complete orthogonal set. And this in turn
means that Eq. 14.1 is valid in the sense of convergence in the mean.
Expressing cos nx and sin nx in exponential form, we may rewrite Eq. 14.1 as
$$f(x) = \sum_{n=-\infty}^{\infty}c_n e^{inx} \tag{14.2}$$
in which
$$c_n = \tfrac12(a_n - ib_n), \qquad c_{-n} = \tfrac12(a_n + ib_n), \qquad n > 0, \tag{14.3}$$
and
$$c_0 = \tfrac12 a_0.$$
¹ These conditions are sufficient but not necessary.
Completeness
The problem of establishing completeness may be approached in a number
of different ways. One way is to transform the trigonometric Fourier series into
exponential form and compare it with a Laurent series. If we expand f(z) in a
Laurent series² (assuming f(z) is analytic),
$$f(z) = \sum_{n=-\infty}^{\infty}d_n z^n. \tag{14.4}$$
On the unit circle $z = e^{i\theta}$ and
$$f(z) = f(e^{i\theta}) = \sum_{n=-\infty}^{\infty}d_n e^{in\theta}. \tag{14.5}$$
The Laurent expansion on the unit circle (Eq. 14.5) has the same form as the
complex Fourier series (Eq. 14.2), which shows the equivalence between the two
expansions. Since the Laurent series as a power series has the property of
completeness, we see that the Fourier functions, $e^{inx}$, form a complete set. There is a
significant limitation here. Laurent series and power series cannot handle
discontinuities such as a square wave or the sawtooth wave of Fig. 14.1.
The theory of linear vector spaces provides a second approach to the
completeness of the sines and cosines. Here completeness is established by the
Weierstrass theorem for two variables.
The Fourier expansion and the completeness property may be expected, for
the functions sin nx, cos nx, $e^{inx}$ are all eigenfunctions of a self-adjoint linear
differential equation,
$$y'' + n^2 y = 0. \tag{14.6}$$
We obtain orthogonal eigenfunctions for different values of the eigenvalue n by
choosing the interval [0, pπ], p an integer, to satisfy the boundary conditions in
the Sturm-Liouville theory (Chapter 9). If we further choose p = 2, the different
eigenfunctions for the same eigenvalue n may be orthogonal. We have
$$\int_0^{2\pi}\sin mx\,\sin nx\,dx = \begin{cases}\pi\delta_{mn}, & m \neq 0,\\ 0, & m = 0,\end{cases} \tag{14.7}$$
$$\int_0^{2\pi}\cos mx\,\cos nx\,dx = \begin{cases}\pi\delta_{mn}, & m \neq 0,\\ 2\pi, & m = n = 0,\end{cases} \tag{14.8}$$
$$\int_0^{2\pi}\sin mx\,\cos nx\,dx = 0 \quad \text{for all integral } m \text{ and } n. \tag{14.9}$$
Note carefully that any interval x0 < x < x0 + 2π will be equally satisfactory.
Frequently, we shall use x0 = -π to obtain the interval -π < x < π. For the
complex eigenfunctions $e^{\pm inx}$ orthogonality is usually defined in terms of the
complex conjugate of one of the two factors,
$$\int_0^{2\pi}(e^{imx})^{*}e^{inx}\,dx = 2\pi\delta_{mn}. \tag{14.10}$$
² Section 6.5.
This agrees with the treatment of the spherical harmonics (Section 12.6).
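The orthogonality integrals over [0, 2π] are easy to confirm numerically. The Python sketch below is an added illustration: the helper `integrate` is a plain midpoint rule (an assumption made for simplicity; for smooth periodic integrands it is very accurate), and the function names are hypothetical.

```python
import math

def integrate(f, a, b, panels=2000):
    """Midpoint-rule estimate of the integral of f over [a, b]."""
    h = (b - a) / panels
    return h * sum(f(a + (i + 0.5) * h) for i in range(panels))

def sin_sin(m, n):
    return integrate(lambda x: math.sin(m * x) * math.sin(n * x), 0.0, 2 * math.pi)

def cos_cos(m, n):
    return integrate(lambda x: math.cos(m * x) * math.cos(n * x), 0.0, 2 * math.pi)

def sin_cos(m, n):
    return integrate(lambda x: math.sin(m * x) * math.cos(n * x), 0.0, 2 * math.pi)

if __name__ == "__main__":
    print(sin_sin(3, 3), math.pi)      # equal indices: pi
    print(sin_sin(2, 5))               # different indices: 0
    print(cos_cos(0, 0), 2 * math.pi)  # m = n = 0: 2*pi
    print(sin_cos(4, 4))               # sine against cosine: 0
```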
Sturm-Liouville Theory
The Sturm-Liouville theory guarantees the validity of Eq. 14.1 (for functions
satisfying the Dirichlet conditions) and, by use of the orthogonality relations,
Eqs. 14.7, 14.8, and 14.9, allows us to compute the expansion coefficients
$$a_n = \frac{1}{\pi}\int_0^{2\pi}f(t)\cos nt\,dt, \tag{14.11}$$
$$b_n = \frac{1}{\pi}\int_0^{2\pi}f(t)\sin nt\,dt, \qquad n = 0, 1, 2, \ldots. \tag{14.12}$$
This, of course, is subject to the requirement that the integrals exist. They do if
f(t) is piecewise continuous (or square integrable). Substituting Eqs. 14.11 and
14.12 into Eq. 14.1, we write our Fourier expansion as
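$$f(x) = \frac{1}{2\pi}\int_0^{2\pi}f(t)\,dt + \frac{1}{\pi}\sum_{n=1}^{\infty}\left(\cos nx\int_0^{2\pi}f(t)\cos nt\,dt + \sin nx\int_0^{2\pi}f(t)\sin nt\,dt\right)$$
$$\qquad = \frac{1}{2\pi}\int_0^{2\pi}f(t)\,dt + \frac{1}{\pi}\sum_{n=1}^{\infty}\int_0^{2\pi}f(t)\cos n(t - x)\,dt, \tag{14.13}$$
the first (constant) term being the average value of f(x) over the interval [0, 2π].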
Equation 14.13 offers one approach to the development of the Fourier integral
and Fourier transforms, Section 15.1.
Another way of describing what we are doing here is to say that f(x) is part of
an infinite-dimensional Hilbert space, with the orthogonal cos nx and sin nx as
the basis. (They can always be renormalized to unity if desired.) The statement
that cos nx and sin nx (n = 0,1,2, • • •) span this Hilbert space is equivalent to
saying that they form a complete set. Finally, the expansion coefficients an and
bn correspond to the projections of f(x) with the integral inner products (Eqs.
14.11 and 14.12) playing the role of the dot product of Section 1.3. These points
are outlined in Section 9.4.
Sawtooth Wave
An idea of the convergence of a Fourier series and the error in using only a
finite number of terms in the series may be obtained by considering the
expansion of
$$f(x) = \begin{cases}x, & 0 \le x < \pi,\\ x - 2\pi, & \pi < x \le 2\pi.\end{cases} \tag{14.14}$$
This is a sawtooth wave, and for convenience we shall shift our interval from
[0, 2π] to [-π, π]. In this interval we have simply f(x) = x. Using Eqs. 14.11 and
14.12, we may show the expansion to be
$$f(x) = x = 2\left[\sin x - \frac{\sin 2x}{2} + \frac{\sin 3x}{3} - \cdots + (-1)^{n+1}\frac{\sin nx}{n} + \cdots\right]. \tag{14.15}$$
FIG. 14.1 Fourier representation of sawtooth wave
Figure 14.1 shows f(x) for 0 < x < π for the sum of 4, 6, and 10 terms of the
series. Three features deserve comment.
1. There is a steady increase in the accuracy of the
representation as the number of terms included is
increased.
2. All the curves pass through the midpoint y = 0 at
x = π.
3. In the vicinity of x = π there is an overshoot that
persists and shows no sign of diminishing.
As a matter of incidental interest, setting x = π/2 in Eq. 14.15 provides an
alternate derivation of Leibnitz's formula, Exercise 5.7.6.
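The three features listed above can be reproduced with a few lines of Python. This is an added sketch, not part of the original text: the function names are hypothetical, the series used is the sawtooth sine series x = 2[sin x - (sin 2x)/2 + (sin 3x)/3 - ...], and the peak is located on a uniform grid, so it is only an approximation to the true maximum.

```python
import math

def sawtooth_partial_sum(x, terms):
    """Partial sum 2 * sum_{n=1}^{terms} (-1)^(n+1) sin(nx)/n of the sawtooth series."""
    return 2.0 * sum((-1) ** (n + 1) * math.sin(n * x) / n
                     for n in range(1, terms + 1))

def overshoot_peak(terms, samples=2000):
    """Largest partial-sum value on a uniform grid over [0, pi]."""
    return max(sawtooth_partial_sum(math.pi * i / samples, terms)
               for i in range(samples + 1))

if __name__ == "__main__":
    for terms in (4, 6, 10, 50):
        print(terms,
              sawtooth_partial_sum(1.0, terms),      # approaches f(1) = 1
              sawtooth_partial_sum(math.pi, terms),  # midpoint value, 0 up to rounding
              overshoot_peak(terms))                 # peak near x = pi
    # Setting x = pi/2 gives 2(1 - 1/3 + 1/5 - ...) = pi/2 (Leibnitz's formula).
    print(sawtooth_partial_sum(math.pi / 2, 20001), math.pi / 2)
```

For the larger term counts the grid peak exceeds π, the largest value the sawtooth itself takes on this interval, which is the persistent overshoot described in item 3.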
Behavior of Discontinuities
The behavior at x = π is an example of a general rule that at a finite
discontinuity the series converges to the arithmetic mean. For a discontinuity at
x = x0 the series yields
$$f(x_0) = \tfrac12\left[f(x_0 + 0) + f(x_0 - 0)\right], \tag{14.16}$$
the arithmetic mean of the right and left approaches to x = x0. A general proof
using partial sums, as in Section 14.5, is given by Jeffreys and by Carslaw. The
proof may be simplified by the use of Dirac delta functions (Exercise 14.5.1).
The overshoot just before x = π is an example of the Gibbs phenomenon,
discussed in Section 14.5.
Summation of a Fourier Series
Usually in this chapter we shall be concerned with finding the coefficients of
the Fourier expansion of a known function. Occasionally, we may wish to
reverse this process and determine the function represented by a given Fourier
series.
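Consider the series $\sum_{n=1}^{\infty}(1/n)\cos nx$, x in (0, 2π). Since this series is only
conditionally convergent (and diverges at x = 0), we take
$$\sum_{n=1}^{\infty}\frac{\cos nx}{n} = \lim_{r\to 1}\sum_{n=1}^{\infty}\frac{r^n\cos nx}{n}, \tag{14.17}$$
absolutely convergent for |r| < 1. Our procedure is to try forming power series
by transforming the trigonometric functions into exponential form:
$$\sum_{n=1}^{\infty}\frac{r^n\cos nx}{n} = \frac12\sum_{n=1}^{\infty}\frac{r^n e^{inx}}{n} + \frac12\sum_{n=1}^{\infty}\frac{r^n e^{-inx}}{n}. \tag{14.18}$$
Now these power series may be identified as Maclaurin expansions of
-ln(1 - z), $z = re^{ix}$, $re^{-ix}$ (Eq. 5.95), and
$$\sum_{n=1}^{\infty}\frac{r^n\cos nx}{n} = -\tfrac12\left[\ln(1 - re^{ix}) + \ln(1 - re^{-ix})\right] = -\ln\left[(1 + r^2) - 2r\cos x\right]^{1/2}. \tag{14.19}$$
Letting r = 1,
$$\sum_{n=1}^{\infty}\frac{\cos nx}{n} = -\ln(2 - 2\cos x)^{1/2} = -\ln\left(2\sin\frac{x}{2}\right), \qquad (0, 2\pi).^3 \tag{14.20}$$
Both sides of this expression diverge as x → 0 and 2π.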
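The summation procedure can be checked numerically. The Python sketch below is an added illustration with hypothetical names: `abel_sum` evaluates the r-weighted series Σ rⁿ cos(nx)/n directly, and `closed_form` evaluates -½ ln(1 + r² - 2r cos x), the closed form the procedure leads to. For r < 1 the agreement is essentially exact; at r = 1 the series converges only conditionally, so many terms are needed for modest accuracy.

```python
import math

def abel_sum(r, x, terms=5000):
    """Direct partial sum of sum_{n>=1} r^n cos(nx)/n."""
    return sum(r ** n * math.cos(n * x) / n for n in range(1, terms + 1))

def closed_form(r, x):
    """-(1/2) ln(1 + r^2 - 2 r cos x); at r = 1 this is -ln(2 sin(x/2)) on (0, 2*pi)."""
    return -0.5 * math.log(1.0 + r * r - 2.0 * r * math.cos(x))

if __name__ == "__main__":
    x = 1.0
    for r in (0.5, 0.9, 0.99):
        print(r, abel_sum(r, x), closed_form(r, x))
    # r = 1: slow, conditional convergence of the original series.
    print(abel_sum(1.0, x, terms=200000), -math.log(2.0 * math.sin(x / 2.0)))
```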
EXERCISES
14.1.1 A function f(x) (quadratically integrable) is to be represented by a finite Fourier
series. A convenient measure of the accuracy of the series is given by the integrated
square of the deviation
$$\Delta_p = \int_0^{2\pi}\left[f(x) - \frac{a_0}{2} - \sum_{n=1}^{p}(a_n\cos nx + b_n\sin nx)\right]^2 dx.$$
Show that the requirement that $\Delta_p$ be minimized, that is,
$$\frac{\partial\Delta_p}{\partial a_n} = 0, \qquad \frac{\partial\Delta_p}{\partial b_n} = 0,$$
for all n, leads to choosing $a_n$ and $b_n$, as given in Eqs. 14.11 and 14.12.
Note. Your coefficients $a_n$ and $b_n$ are independent of p. This independence is
a consequence of orthogonality and would not hold for powers of x, fitting a
curve with polynomials.
14.1.2 In the analysis of a complex waveform (ocean tides, earthquakes, musical tones,
etc.) it might be more convenient to have the Fourier series written as
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\alpha_n\cos(nx - \theta_n).$$
Show that this is equivalent to Eq. 14.1 with
$$a_n = \alpha_n\cos\theta_n, \qquad \alpha_n^2 = a_n^2 + b_n^2,$$
$$b_n = \alpha_n\sin\theta_n, \qquad \tan\theta_n = b_n/a_n.$$
Note. The coefficients $\alpha_n^2$ as a function of n define what is called the power
spectrum. The importance of $\alpha_n^2$ lies in its invariance under a shift in the phase $\theta_n$.
³ The limits may be shifted to [-π, π] (and x ≠ 0) using |x| on the right-hand
side.
14.1.3 A function f(x) is expanded in an exponential Fourier series
$$f(x) = \sum_{n=-\infty}^{\infty}c_n e^{inx}.$$
If f(x) is real, f(x) = f*(x), what restriction is imposed on the coefficients $c_n$?
14.1.4 Assuming that $\int_{-\pi}^{\pi}f(x)\,dx$ and $\int_{-\pi}^{\pi}[f(x)]^2\,dx$ are finite, show that
$$\lim_{m\to\infty}a_m = 0, \qquad \lim_{m\to\infty}b_m = 0.$$
Hint. Integrate $[f(x) - s_n(x)]^2$, where $s_n(x)$ is the nth partial sum, and use Bessel's
inequality, Section 9.4. For our finite interval the assumption that f(x) is square
integrable ($\int_{-\pi}^{\pi}|f(x)|^2\,dx$ is finite) implies that $\int_{-\pi}^{\pi}|f(x)|\,dx$ is also finite. The
converse does not hold.
FIG. 14.2 $f(x) = \sum_{n=1}^{\infty}\frac{1}{n}\sin nx$
14.1.5 Apply the summation technique of this section to show that
$$\sum_{n=1}^{\infty}\frac{\sin nx}{n} = \begin{cases}\tfrac12(\pi - x), & 0 < x \le \pi,\\ -\tfrac12(\pi + x), & -\pi \le x < 0\end{cases}$$
(Fig. 14.2).
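14.1.6 Sum the trigonometric series
$$\sum_{n=1}^{\infty}(-1)^{n+1}\frac{\sin nx}{n}$$
and show that it equals x/2.
14.1.7 Sum the trigonometric series
$$\sum_{n=0}^{\infty}\frac{\sin(2n+1)x}{2n+1}$$
and show that it equals
$$\begin{cases}\pi/4, & 0 < x < \pi,\\ -\pi/4, & -\pi < x < 0.\end{cases}$$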
14.1.8 Calculate the sum of the finite Fourier sine series for the sawtooth wave, f(x) = x,
(-π, π), Eq. 14.15. Use 4-, 6-, 8-, and 10-term series and x/π = 0.00(0.02)1.00.
If a plotting routine is available, plot your results and compare with Fig. 14.1.
14.2 ADVANTAGES, USES OF FOURIER SERIES
Discontinuous Function
One of the advantages of a Fourier representation over some other representation,
such as a Taylor series, is that it may represent a discontinuous function.
An example is the sawtooth wave in the preceding section. Other examples are
considered in Section 14.3 and in the exercises.
Periodic Functions
Related to this advantage is the usefulness of a Fourier series in representing
a periodic function. If f(x) has a period of 2π, perhaps it is only natural that we
expand it in a series of functions with period 2π, 2π/2, 2π/3, .... This guarantees
that if our periodic f(x) is represented over one interval [0, 2π] or [-π, π] the
representation holds for all finite x.
At this point we may conveniently consider the properties of symmetry. Using
the interval [-π, π], sin x is odd and cos x is an even function of x. Hence by
Eqs. 14.11 and 14.12,¹ if f(x) is odd, all $a_n = 0$ and if f(x) is even all $b_n = 0$. In
other words,
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty}a_n\cos nx, \qquad f(x)\ \text{even}, \tag{14.21}$$
$$f(x) = \sum_{n=1}^{\infty}b_n\sin nx, \qquad f(x)\ \text{odd}. \tag{14.22}$$
Frequently these properties are helpful in expanding a given function.
¹ With the range of integration -π < x < π.
FIG. 14.3 Comparison of Fourier cosine series, Fourier sine series, and Taylor
series
We have noted that the Fourier series is periodic. This is important in
considering whether Eq. 14.1 holds outside the initial interval. Suppose we are given
only that
$$f(x) = x, \qquad 0 < x < \pi, \tag{14.23}$$
and are asked to represent f(x) by a series expansion. Let us take three of the
infinite number of possible expansions.
1. If we assume a Taylor expansion, we have
$$f(x) = x, \tag{14.24}$$
a one-term series. This (one-term) series is defined for
all finite x.
2. Using the Fourier cosine series (Eq. 14.21), we predict
that
$$f(x) = -x, \quad -\pi < x < 0; \qquad f(x) = 2\pi - x, \quad \pi < x < 2\pi. \tag{14.25}$$
3. Finally, from the Fourier sine series (Eq. 14.22), we
have
$$f(x) = x, \quad -\pi < x < 0; \qquad f(x) = x - 2\pi, \quad \pi < x < 2\pi. \tag{14.26}$$
These three possibilities, Taylor series, Fourier cosine series, and Fourier sine
series, are each perfectly valid in the original interval [0, π]. Outside, however,
their behavior is strikingly different (compare Fig. 14.3). Which of the three, then,
is correct? This question has no answer, unless we are given more information
about f(x). It may be any of the three or none of them. Our Fourier expansions
are valid over the basic interval. Unless the function f(x) is known to be periodic
with a period equal to our basic interval, or (1/n)th of our basic interval, there is
no assurance whatever that the representation (Eq. 14.1) will have any meaning
outside the basic interval.
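The different behavior of the expansions outside [0, π] can be seen numerically. In the Python sketch below (an added illustration) the cosine series coefficients are an assumption in the sense that they are not written out in the text: they are the standard ones for the even extension |x|, namely π/2 - (4/π) Σ cos((2k+1)x)/(2k+1)². The sine series is the sawtooth series. Function names and the number of terms are hypothetical choices.

```python
import math

def cosine_series(x, terms=2000):
    """Fourier cosine series of f(x) = x on [0, pi] (even extension)."""
    total = math.pi / 2.0
    for k in range(terms):
        n = 2 * k + 1  # only odd harmonics contribute
        total -= 4.0 / math.pi * math.cos(n * x) / n ** 2
    return total

def sine_series(x, terms=2000):
    """Fourier sine series of f(x) = x on [0, pi] (odd extension)."""
    return 2.0 * sum((-1) ** (n + 1) * math.sin(n * x) / n
                     for n in range(1, terms + 1))

def taylor_series(x):
    """The one-term Taylor expansion f(x) = x."""
    return x

if __name__ == "__main__":
    for x in (1.0, -1.0, 4.0):
        print(x, taylor_series(x), cosine_series(x), sine_series(x))
```

Inside the interval (x = 1) all three agree to within the truncation error; at x = -1 and x = 4 the cosine series returns -x and 2π - x, while the sine series returns x and x - 2π.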
It should be noted that the set of functions cos nx, n = 0, 1, 2, ..., forms a
complete orthogonal set over [0, π]. Similarly, the set of functions sin nx,
n = 1, 2, 3, ..., forms a complete orthogonal set over this same interval. Unless
forced by boundary conditions or a symmetry restriction, the choice of which
set to use is arbitrary.
In addition to the advantages of representing discontinuous and periodic
functions, there is a third very real advantage in using a Fourier series. Suppose
that we are solving the equation of motion of an oscillating particle subject to a
periodic driving force. The Fourier expansion of the driving force then gives us
the fundamental term and a series of harmonics. The (linear) differential
equation may be solved for each of these harmonics individually, a process that may
be much easier than dealing with the original driving force. Then, as long as the
differential equation is linear, all the solutions may be added together to obtain
the final solution.² This is more than just a clever mathematical trick. It
corresponds to finding the response of the system to the fundamental frequency and
to each of the harmonic frequencies.
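A concrete illustration of this superposition, under assumptions that go beyond the text: take the oscillator to be a damped linear one with unit mass, x'' + 2γx' + ω0²x = Σ b_n sin(nωt), and keep only the steady-state response. The Python sketch below solves for each harmonic separately and adds the results; the names `harmonic_response`, `steady_state`, and `residual`, and all parameter values, are hypothetical.

```python
import math

def harmonic_response(b_n, omega_n, omega0, gamma):
    """Steady-state amplitude and phase lag for
    x'' + 2*gamma*x' + omega0**2 * x = b_n * sin(omega_n * t)."""
    real = omega0 ** 2 - omega_n ** 2
    imag = 2.0 * gamma * omega_n
    return b_n / math.hypot(real, imag), math.atan2(imag, real)

def steady_state(t, coeffs, omega, omega0, gamma):
    """Sum of the single-harmonic responses; returns (x, x', x'') at time t."""
    x = v = acc = 0.0
    for n, b_n in enumerate(coeffs, start=1):
        w = n * omega
        amp, lag = harmonic_response(b_n, w, omega0, gamma)
        phase = w * t - lag
        x += amp * math.sin(phase)
        v += amp * w * math.cos(phase)
        acc -= amp * w * w * math.sin(phase)
    return x, v, acc

def residual(t, coeffs, omega, omega0, gamma):
    """Left-hand side minus driving force; zero if superposition holds."""
    x, v, acc = steady_state(t, coeffs, omega, omega0, gamma)
    force = sum(b_n * math.sin(n * omega * t)
                for n, b_n in enumerate(coeffs, start=1))
    return acc + 2.0 * gamma * v + omega0 ** 2 * x - force

if __name__ == "__main__":
    # First five sawtooth coefficients b_n = 2(-1)^(n+1)/n as the driving force.
    coeffs = [2.0 * (-1) ** (n + 1) / n for n in range(1, 6)]
    print(residual(0.7, coeffs, omega=1.0, omega0=2.5, gamma=0.1))
```

The residual vanishes to rounding error because the equation is linear; for a nonlinear equation the individually solved harmonics could not simply be added.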
One question that is sometimes raised is, "Were the harmonics there all along
or were they created by our Fourier analysis?" One answer compares the
functional resolution into harmonics with the resolution of a vector into rectangular
components. The components may have been present in the sense that they may
be isolated and observed, but the resolution is certainly not unique. Hence many
authorities prefer to say that the harmonics were created by our choice of
expansion. Other expansions in other sets of orthogonal functions would give different
results. For further discussion the reader should consult a series of notes and
letters in the American Journal of Physics.³
Change of Interval
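So far attention has been restricted to an interval of length $2\pi$. This restriction
may easily be relaxed. If $f(x)$ is periodic with a period $2L$, we may write
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left[a_n\cos\frac{n\pi x}{L} + b_n\sin\frac{n\pi x}{L}\right], \tag{14.27}$$
with
$$a_n = \frac{1}{L}\int_{-L}^{L} f(t)\cos\frac{n\pi t}{L}\,dt, \quad n = 0, 1, 2, 3, \ldots, \tag{14.28}$$
$$b_n = \frac{1}{L}\int_{-L}^{L} f(t)\sin\frac{n\pi t}{L}\,dt, \quad n = 1, 2, 3, \ldots, \tag{14.29}$$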
2 One of the nastier features of nonlinear differential equations is that this
principle of superposition is not valid.
3 B. L. Robinson, "Concerning frequencies resulting from distortion," Am. J.
Phys. 21, 391 (1953).
F. W. Van Name, Jr., "Concerning frequencies resulting from distortion," Am. J.
Phys. 22, 94 (1954).
replacing $x$ in Eq. 14.1 with $\pi x/L$ and $t$ in Eqs. 14.11 and 14.12 with $\pi t/L$. (For
convenience the interval in Eqs. 14.11 and 14.12 is shifted to $-\pi \le t \le \pi$.) The
choice of the symmetric interval $(-L, L)$ is not essential. For $f(x)$ periodic with a
period of $2L$, any interval $(x_0, x_0 + 2L)$ will do. The choice is a matter of
convenience or literally personal preference.
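As a quick numerical check of the change of interval, the following sketch evaluates Eqs. 14.28 and 14.29 by a midpoint rule for the sample function f(x) = x on (-L, L) and compares b_n with the closed form 2L(-1)^(n+1)/(n pi). The function name `fourier_coefficients`, the choice L = 3, and the sample count are illustrative assumptions, not part of the text.

```python
import math

def fourier_coefficients(f, L, n_max, samples=4000):
    """Approximate a_n and b_n of Eqs. 14.28-14.29 on (-L, L) by the midpoint rule."""
    dx = 2.0 * L / samples
    xs = [-L + (j + 0.5) * dx for j in range(samples)]
    a, b = [], []
    for n in range(n_max + 1):
        a.append(sum(f(x) * math.cos(n * math.pi * x / L) for x in xs) * dx / L)
        b.append(sum(f(x) * math.sin(n * math.pi * x / L) for x in xs) * dx / L)
    return a, b

L = 3.0                      # hypothetical half-period chosen for illustration
a, b = fourier_coefficients(lambda x: x, L, 4)

for n in range(1, 5):
    exact = 2.0 * L * (-1) ** (n + 1) / (n * math.pi)
    print(n, b[n], exact)    # numerical and analytic b_n should agree closely
```

Because f(x) = x is odd, the cosine coefficients should vanish to rounding error; only the sine terms survive.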
EXERCISES
14.2.1 The boundary conditions (such as $\psi(0) = \psi(l) = 0$) may suggest solutions of the
form $\sin(n\pi x/l)$ and eliminate the corresponding cosines.
(a) Verify that the boundary conditions used in the Sturm-Liouville theory
are satisfied for the interval $(0, l)$. Note that this is only half the usual Fourier
interval.
(b) Show that the set of functions $\varphi_n(x) = \sin(n\pi x/l)$, $n = 1, 2, 3, \ldots$ satisfies
an orthogonality relation
$$\int_0^l \varphi_m(x)\varphi_n(x)\,dx = \frac{l}{2}\delta_{mn}, \quad n > 0.$$
14.2.2 (a) Expand $f(x) = x$ in the interval $(0, 2L)$. Sketch the series you have found
(right-hand side of Ans.) over $(-2L, 2L)$.
ANS. $x = L - \dfrac{2L}{\pi}\sum_{n=1}^{\infty}\dfrac{1}{n}\sin\left(\dfrac{n\pi x}{L}\right)$.
(b) Expand $f(x) = x$ as a sine series in the half interval $(0, L)$. Sketch the series
you have found (right-hand side of Ans.) over $(-2L, 2L)$.
ANS. $x = \dfrac{2L}{\pi}\sum_{n=1}^{\infty}(-1)^{n+1}\dfrac{1}{n}\sin\left(\dfrac{n\pi x}{L}\right)$.
14.2.3 In some problems it is convenient to approximate $\sin \pi x$ over the interval $[0, 1]$
by a parabola $ax(1 - x)$, where $a$ is a constant. To get a feeling for the accuracy
of this approximation, expand $4x(1 - x)$ in a Fourier sine series:
$$f(x) = \begin{cases} 4x(1 - x), & 0 \le x \le 1 \\ 4x(1 + x), & -1 \le x \le 0 \end{cases} \; = \sum_{n=1}^{\infty} b_n \sin n\pi x.$$
ANS. $b_n = \dfrac{32}{\pi^3}\dfrac{1}{n^3}$, $n$ odd; $b_n = 0$, $n$ even
(Fig. 14.4).
FIG. 14.4
FIG. 14.5 Square wave
14.3 APPLICATIONS OF FOURIER SERIES
EXAMPLE 14.3.1 Square Wave—High Frequencies
One simple application of Fourier series, the analysis of a "square" wave
(Fig. 14.5) in terms of its Fourier components, may occur in electronic circuits
designed to handle sharply rising pulses. Suppose that our wave is defined by
$$f(x) = 0, \quad -\pi < x < 0,$$
$$f(x) = h, \quad 0 < x < \pi. \tag{14.30}$$
From Eqs. 14.11 and 14.12 we find
$$a_0 = \frac{1}{\pi}\int_0^{\pi} h\,dt = h, \tag{14.31}$$
$$a_n = \frac{1}{\pi}\int_0^{\pi} h\cos nt\,dt = 0, \quad n = 1, 2, 3, \ldots, \tag{14.32}$$
$$b_n = \frac{1}{\pi}\int_0^{\pi} h\sin nt\,dt = \frac{h}{n\pi}(1 - \cos n\pi); \tag{14.33}$$
$$b_n = \frac{2h}{n\pi}, \quad n \text{ odd}, \tag{14.34}$$
$$b_n = 0, \quad n \text{ even}. \tag{14.35}$$
The resulting series is
$$f(x) = \frac{h}{2} + \frac{2h}{\pi}\left(\frac{\sin x}{1} + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots\right). \tag{14.36}$$
Except for the first term, which represents an average of $f(x)$ over the interval
$[-\pi, \pi]$, all the cosine terms have vanished. Since $f(x) - h/2$ is odd, we have a
Fourier sine series. Although only the odd terms in the sine series occur, they
fall only as $n^{-1}$. This is similar to the convergence (or lack of convergence)
of the harmonic series. Physically this means that our square wave contains a
lot of high-frequency components. If the electronic apparatus will not pass these
components, our square wave input will emerge more or less rounded off,
perhaps as an amorphous blob.
FIG. 14.6 Full wave rectifier
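The slow 1/n fall-off of the square-wave harmonics can be seen by summing the series directly. The sketch below evaluates a truncated form of the square-wave series (constant term h/2 plus odd sine harmonics with coefficients 2h/(n pi)); the function name `square_wave_partial_sum` and the term counts are illustrative assumptions.

```python
import math

def square_wave_partial_sum(x, h, terms):
    """h/2 + (2h/pi) * sum of sin((2k+1)x)/(2k+1) over the first `terms` odd harmonics."""
    s = sum(math.sin((2 * k + 1) * x) / (2 * k + 1) for k in range(terms))
    return h / 2.0 + 2.0 * h / math.pi * s

h = 1.0
for terms in (5, 50, 500):
    # In the middle of the upper plateau the sum creeps toward h only slowly.
    print(terms, square_wave_partial_sum(math.pi / 2, h, terms))
```

Dropping the high harmonics (small `terms`) mimics an apparatus that does not pass high frequencies: the corners of the wave are rounded off.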
EXAMPLE 14.3.2 Full Wave Rectifier
As a second example, let us ask how well the output of a full wave rectifier
approaches pure direct current (Fig. 14.6). Our rectifier may be thought of as
having passed the positive peaks of an incoming sine wave and inverting the
negative peaks. This yields
$$f(t) = \sin\omega t, \quad 0 < \omega t < \pi,$$
$$f(t) = -\sin\omega t, \quad -\pi < \omega t < 0. \tag{14.37}$$
Since $f(t)$ defined here is even, no terms of the form $\sin n\omega t$ will appear.
Again, from Eqs. 14.11 and 14.12, we have
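$$a_0 = -\frac{1}{\pi}\int_{-\pi}^{0}\sin\omega t\,d(\omega t) + \frac{1}{\pi}\int_{0}^{\pi}\sin\omega t\,d(\omega t) = \frac{2}{\pi}\int_{0}^{\pi}\sin\omega t\,d(\omega t) = \frac{4}{\pi}, \tag{14.38}$$
$$a_n = \frac{2}{\pi}\int_{0}^{\pi}\sin\omega t\cos n\omega t\,d(\omega t) = \begin{cases} -\dfrac{2}{\pi}\dfrac{2}{n^2 - 1}, & n \text{ even} \\ 0, & n \text{ odd}. \end{cases} \tag{14.39}$$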
Note carefully that $[0, \pi]$ is not an orthogonality interval for both sines and
cosines together and we do not get zero for even $n$. The resulting series is
$$f(t) = \frac{2}{\pi} - \frac{4}{\pi}\sum_{n=2,4,6,\ldots}^{\infty}\frac{\cos n\omega t}{n^2 - 1}. \tag{14.40}$$
The original frequency $\omega$ has been eliminated. The lowest frequency oscillation
is $2\omega$. The high-frequency components fall off as $n^{-2}$, showing that the full
wave rectifier does a fairly good job of approximating direct current. Whether
this good approximation is adequate depends on the particular application.
If the remaining ac components are objectionable, they may be further
suppressed by appropriate filter circuits.
These two examples bring out two features characteristic of Fourier
expansions.1
1. If $f(x)$ has discontinuities (as in the square wave in
Example 14.3.1), we can expect the $n$th coefficient to
be decreasing as $1/n$. Convergence is relatively slow.2
2. If $f(x)$ is continuous (although possibly with
discontinuous derivatives as in the full wave rectifier of
Example 14.3.2), we can expect the $n$th coefficient to
be decreasing as $1/n^2$.
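The two decay rates can be checked by direct quadrature of the coefficient integrals of Examples 14.3.1 and 14.3.2. This is only a sketch: the helper names `midpoint`, `square_b`, and `rectifier_a` are hypothetical, and the midpoint rule with 4000 samples is an arbitrary choice.

```python
import math

def midpoint(g, lo, hi, samples=4000):
    """Midpoint-rule approximation to the integral of g over (lo, hi)."""
    dx = (hi - lo) / samples
    return sum(g(lo + (j + 0.5) * dx) for j in range(samples)) * dx

def square_b(n, h=1.0):
    # b_n for the square wave: (1/pi) * integral_0^pi h sin(nt) dt
    return midpoint(lambda t: h * math.sin(n * t), 0.0, math.pi) / math.pi

def rectifier_a(n):
    # a_n for the full wave rectifier: (2/pi) * integral_0^pi sin(u) cos(nu) du, u = omega*t
    return midpoint(lambda u: math.sin(u) * math.cos(n * u), 0.0, math.pi) * 2.0 / math.pi

for n in (1, 3, 5, 7):
    print(n, n * square_b(n))            # n * b_n stays near 2h/pi: decay like 1/n
for n in (2, 4, 6, 8):
    print(n, n * n * rectifier_a(n))     # n^2 * a_n tends toward -4/pi: decay like 1/n^2
```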
EXAMPLE 14.3.3 Infinite Series, Riemann Zeta Function
As a final example, we consider the purely mathematical problem of
expanding $x^2$. Let
$$f(x) = x^2, \quad -\pi < x < \pi. \tag{14.41}$$
By symmetry all $b_n = 0$. For the $a_n$'s we have
$$a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} x^2\,dx = \frac{2\pi^2}{3}, \tag{14.42}$$
$$a_n = \frac{2}{\pi}\int_{0}^{\pi} x^2\cos nx\,dx = \frac{2}{\pi}\cdot(-1)^n\frac{2\pi}{n^2} = (-1)^n\frac{4}{n^2}. \tag{14.43}$$
From this we obtain
$$x^2 = \frac{\pi^2}{3} + 4\sum_{n=1}^{\infty}(-1)^n\frac{\cos nx}{n^2}. \tag{14.44}$$
As it stands, Eq. 14.44 is of no particular importance, but if we set $x = \pi$,
$$\cos n\pi = (-1)^n \tag{14.45}$$
and Eq. 14.44 becomes3
1 G. Raisbeck, "Order of Magnitude of Fourier Coefficients." Am. Math.
Monthly 62, 149-155 (1955).
2 A technique for improving the rate of convergence is developed in the
exercises of Section 14.4.
3 Note that the point $x = \pi$ is not a point of discontinuity.
$$\pi^2 = \frac{\pi^2}{3} + 4\sum_{n=1}^{\infty}\frac{1}{n^2} \tag{14.46}$$
or
$$\frac{\pi^2}{6} = \sum_{n=1}^{\infty}\frac{1}{n^2} \equiv \zeta(2), \tag{14.47}$$
thus yielding the Riemann zeta function, $\zeta(2)$, in closed form (in agreement with
the Bernoulli number result of Section 5.9). From our expansion of $x^2$ and
expansions of other powers of $x$ numerous other infinite series can be evaluated.
A few are included in the subsequent list of exercises.
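A short numerical sketch of this result: the truncated cosine series for x^2 (constant term pi^2/3 plus 4(-1)^n cos(nx)/n^2) is compared with x^2 itself, and the partial sums of 1/n^2 are compared with pi^2/6. The function names and truncation points are illustrative assumptions.

```python
import math

def x_squared_series(x, terms):
    """Partial sum pi^2/3 + 4 * sum_{n=1}^{terms} (-1)^n cos(nx)/n^2."""
    tail = sum((-1) ** n * math.cos(n * x) / n ** 2 for n in range(1, terms + 1))
    return math.pi ** 2 / 3.0 + 4.0 * tail

def zeta2_partial(terms):
    """Partial sum of sum 1/n^2, which tends to zeta(2) = pi^2/6."""
    return sum(1.0 / n ** 2 for n in range(1, terms + 1))

print(x_squared_series(1.0, 2000), 1.0 ** 2)
print(zeta2_partial(100000), math.pi ** 2 / 6.0)
```

The remainder of the zeta(2) sum after N terms is roughly 1/N, so the convergence is slow, consistent with the 1/n^2 coefficients.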
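Fourier Series | Reference
1. $\sum_{n=1}^{\infty}\dfrac{1}{n}\sin nx = \begin{cases} -\tfrac{1}{2}(\pi + x), & -\pi \le x < 0 \\ \tfrac{1}{2}(\pi - x), & 0 < x \le \pi \end{cases}$ | Exercise 14.1.5, Exercise 14.3.3
2. $\sum_{n=1}^{\infty}(-1)^{n+1}\dfrac{1}{n}\sin nx = \tfrac{1}{2}x, \quad -\pi < x < \pi$ | Exercise 14.1.6, Exercise 14.3.2
3. $\sum_{n=0}^{\infty}\dfrac{1}{2n+1}\sin(2n+1)x = \begin{cases} -\pi/4, & -\pi < x < 0 \\ +\pi/4, & 0 < x < \pi \end{cases}$ | Exercise 14.1.7, Eq. 14.36
4. $\sum_{n=1}^{\infty}\dfrac{1}{n}\cos nx = -\ln\left[2\sin\left(\dfrac{|x|}{2}\right)\right], \quad -\pi < x < \pi$ | Eq. 14.20, Exercise 14.3.15
5. $\sum_{n=1}^{\infty}(-1)^n\dfrac{1}{n}\cos nx = -\ln\left[2\cos\left(\dfrac{x}{2}\right)\right], \quad -\pi < x < \pi$ | Exercise 14.3.15
6. $\sum_{n=0}^{\infty}\dfrac{1}{2n+1}\cos(2n+1)x = \dfrac{1}{2}\ln\left[\cot\dfrac{|x|}{2}\right], \quad -\pi < x < \pi$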
Complex Variables—Abel's Theorem
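Consider a function $f(z)$ represented by a convergent power series
$$f(z) = \sum_{n=0}^{\infty} c_n z^n = \sum_{n=0}^{\infty} c_n r^n e^{in\theta}. \tag{14.48}$$
This is our Fourier exponential series, Eq. 14.2. Separating real and imaginary
parts (taking the $c_n$ real),
$$u(r,\theta) = \sum_{n=0}^{\infty} c_n r^n\cos n\theta, \qquad v(r,\theta) = \sum_{n=1}^{\infty} c_n r^n\sin n\theta, \tag{14.49}$$
the Fourier cosine and sine series. Abel's theorem asserts that if $u(1,\theta)$ and
$v(1,\theta)$ are convergent for a given $\theta$, then
$$u(1,\theta) + iv(1,\theta) = \lim_{r\to 1} f(re^{i\theta}). \tag{14.50}$$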
An application of this appears as Exercise 14.3.15.
EXERCISES
14.3.1 Develop the Fourier series representation of
$$f(t) = \begin{cases} 0, & -\pi \le \omega t \le 0 \\ \sin\omega t, & 0 \le \omega t \le \pi. \end{cases}$$
This is the output of a simple half-wave rectifier. It is also an approximation
of the solar thermal effect that produces "tides" in the atmosphere.
ANS. $f(t) = \dfrac{1}{\pi} + \dfrac{1}{2}\sin\omega t - \dfrac{2}{\pi}\sum_{n=2,4,6,\ldots}^{\infty}\dfrac{\cos n\omega t}{n^2 - 1}$.
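14.3.2 A sawtooth wave is given by
$$f(x) = x, \quad -\pi < x < \pi.$$
Show that
$$f(x) = 2\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n}\sin nx.$$
14.3.3 A different sawtooth wave is described by
$$f(x) = \begin{cases} -\tfrac{1}{2}(\pi + x), & -\pi \le x < 0 \\ +\tfrac{1}{2}(\pi - x), & 0 < x \le \pi. \end{cases}$$
Show that $f(x) = \sum_{n=1}^{\infty}(\sin nx/n)$.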
14.3.4 A triangular wave (Fig. 14.7) is represented by
$$f(x) = \begin{cases} x, & 0 < x < \pi \\ -x, & -\pi < x < 0. \end{cases}$$
Represent $f(x)$ by a Fourier series.
ANS. $f(x) = \dfrac{\pi}{2} - \dfrac{4}{\pi}\sum_{n=1,3,5,\ldots}\dfrac{\cos nx}{n^2}$.
FIG. 14.7 Triangular wave
14.3.5 Expand
$$f(x) = \begin{cases} 1, & x^2 < x_0^2 \\ 0, & x^2 > x_0^2 \end{cases}$$
in the interval $[-\pi, \pi]$.
Note. This variable width square wave is of some importance in electronic
music.
FIG. 14.8
14.3.6 A metal cylindrical tube of radius a is split lengthwise into two nontouching
halves. The top half is maintained at a potential + V, the bottom half at a
potential — V (Fig. 14.8). Separate the variables in Laplace's equation and solve
for the electrostatic potential for r < a. Observe the resemblance between your
solution for r = a and the Fourier series for a square wave.
14.3.7 A metal cylinder is placed in a (previously) uniform electric field, $E_0$, the axis
of the cylinder perpendicular to that of the original field.
(a) Find the perturbed electrostatic potential.
(b) Find the induced surface charge on the cylinder as a function of angular
position.
14.3.8 Transform the Fourier expansion of a square wave, Eq. 14.36, into a power
series. Show that the coefficients of $x^1$ form a divergent series. Repeat for the
coefficients of $x^3$.
A power series cannot handle a discontinuity. These infinite coefficients are
the result of attempting to beat this basic limitation on power series.
14.3.9 (a) Show that the Fourier expansion of $\cos ax$ is
$$\cos ax = \frac{2a\sin a\pi}{\pi}\left\{\frac{1}{2a^2} - \frac{\cos x}{a^2 - 1^2} + \frac{\cos 2x}{a^2 - 2^2} - \cdots\right\},$$
$$a_n = (-1)^n\frac{2a\sin a\pi}{\pi(a^2 - n^2)}.$$
(b) From the preceding result show that
$$a\pi\cot a\pi = 1 - 2\sum_{p=1}^{\infty}\zeta(2p)a^{2p}.$$
This provides an alternate derivation of the relation between the Riemann
zeta function and the Bernoulli numbers, Eq. 5.151.
14.3.10 Derive the Fourier series expansion of the Dirac delta function $\delta(x)$ in the
interval $-\pi < x < \pi$.
(a) What significance can be attached to the constant term?
(b) In what region is this representation valid?
(c) With the identity
$$\sum_{n=1}^{N}\cos nx = \frac{\sin(Nx/2)}{\sin(x/2)}\cos\left[(N + 1)\frac{x}{2}\right],$$
show that your Fourier representation of $\delta(x)$ is consistent with Eq. 8.83d.
14.3.11 Expand $\delta(x - t)$ in a Fourier series. Compare your result with the bilinear
form of Eq. 9.83.
ANS. $\delta(x - t) = \dfrac{1}{2\pi} + \dfrac{1}{\pi}\sum_{n=1}^{\infty}(\cos nx\cos nt + \sin nx\sin nt) = \dfrac{1}{2\pi} + \dfrac{1}{\pi}\sum_{n=1}^{\infty}\cos n(x - t)$.
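14.3.12 Verify that
$$\delta(\varphi_1 - \varphi_2) = \frac{1}{2\pi}\sum_{m=-\infty}^{\infty} e^{im(\varphi_1 - \varphi_2)}$$
is a Dirac delta function by showing that it satisfies the definition of a Dirac
delta function:
$$\int_{-\pi}^{\pi} f(\varphi_1)\,\frac{1}{2\pi}\sum_{m=-\infty}^{\infty} e^{im(\varphi_1 - \varphi_2)}\,d\varphi_1 = f(\varphi_2).$$
Hint. Represent $f(\varphi_1)$ by an exponential Fourier series.
Note. The continuum analog of this expression is developed in Section 15.2.
The most important application of this expression is in the determination of
Green's functions, Section 16.6.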
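14.3.13 (a) Using
$$f(x) = x^2, \quad -\pi < x < \pi,$$
show that
$$\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n^2} = \frac{\pi^2}{12} = \eta(2).$$
(b) Using the Fourier series for a triangular wave developed in Exercise 14.3.4,
show that
$$\sum_{n=1}^{\infty}\frac{1}{(2n-1)^2} = \frac{\pi^2}{8} = \lambda(2).$$
(c) Using
$$f(x) = x^4, \quad -\pi < x < \pi,$$
show that
$$\sum_{n=1}^{\infty}\frac{1}{n^4} = \frac{\pi^4}{90} = \zeta(4), \qquad \sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n^4} = \frac{7\pi^4}{720} = \eta(4).$$
(d) Using
$$f(x) = \begin{cases} x(\pi - x), & 0 < x < \pi, \\ x(\pi + x), & -\pi < x < 0, \end{cases}$$
derive
$$f(x) = \frac{8}{\pi}\sum_{n=1,3,5,\ldots}\frac{\sin nx}{n^3}$$
and show that
$$\sum_{n=1,3,5,\ldots}(-1)^{(n-1)/2}n^{-3} = 1 - \frac{1}{3^3} + \frac{1}{5^3} - \frac{1}{7^3} + \cdots = \frac{\pi^3}{32} = \beta(3).$$
(e) Using the Fourier series for a square wave, show that
$$\sum_{n=1,3,5,\ldots}(-1)^{(n-1)/2}n^{-1} = 1 - \frac{1}{3} + \frac{1}{5} - \cdots = \frac{\pi}{4} = \beta(1).$$
This is Leibnitz's formula for $\pi$, obtained by a different technique in Exercise
5.7.6.
Note. The $\eta(2)$, $\eta(4)$, $\lambda(2)$, $\beta(1)$, and $\beta(3)$ functions are defined by the
indicated series. General definitions appear in Section 5.9.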
14.3.14 (a) Find the Fourier series representation of
$$f(x) = \begin{cases} 0, & -\pi < x \le 0 \\ x, & 0 \le x < \pi. \end{cases}$$
(b) From your Fourier expansion show that
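14.3.15 Let $f(z) = \ln(1 + z) = \sum_{n=1}^{\infty}(-1)^{n+1}z^n/n$. (This series converges to $\ln(1 + z)$ for
$|z| \le 1$, except at the point $z = -1$.)
(a) From the real parts show that
$$\ln\left(2\cos\frac{\theta}{2}\right) = \sum_{n=1}^{\infty}(-1)^{n+1}\frac{\cos n\theta}{n}, \quad -\pi < \theta < \pi.$$
(b) Using a change of variable, transform part (a) into
$$-\ln\left(2\sin\frac{\varphi}{2}\right) = \sum_{n=1}^{\infty}\frac{\cos n\varphi}{n}, \quad 0 < \varphi < 2\pi.$$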
14.3.16 A symmetric triangular pulse of adjustable height and width is described by
$$f(x) = \begin{cases} a(1 - |x|/b), & 0 \le |x| \le b \\ 0, & b \le |x| \le \pi. \end{cases}$$
(a) Show that the Fourier coefficients are
$$a_0 = \frac{ab}{\pi}, \qquad a_n = \frac{2ab}{\pi}(1 - \cos nb)/(nb)^2.$$
Sum the finite Fourier series through $n = 10$ and through $n = 100$ for
$x/\pi = 0(1/9)1$. Take $a = 1$ and $b = \pi/2$.
(b) Call a Fourier analysis subroutine (if available) to calculate the Fourier
coefficients of $f(x)$, $a_0$ through $a_{10}$.
14.3.17 (a) Using a Fourier analysis subroutine, calculate the Fourier cosine
coefficients $a_0$ through $a_{10}$ of
(b) Spot check by calculating some of the preceding coefficients by direct
numerical quadrature.
Check values. a0 = 0.785, a2 = 0.284.
14.3.18 Using a Fourier analysis subroutine, calculate the Fourier coefficients through
$a_{10}$ and $b_{10}$ for
(a) a full-wave rectifier, Example 14.3.2,
(b) a half-wave rectifier, Exercise 14.3.1. Check your results against the analytic
forms given (Eq. 14.39 and Exercise 14.3.1).
14.4 PROPERTIES OF FOURIER SERIES
Convergence
It might be noted, first, that our Fourier series should not be expected to be
uniformly convergent if it represents a discontinuous function. A uniformly
convergent series of continuous functions ($\sin nx$, $\cos nx$) always yields a
continuous function (compare Section 5.5). If, however, (a) $f(x)$ is continuous,
$-\pi \le x \le \pi$, (b) $f(-\pi) = f(+\pi)$, and (c) $f'(x)$ is sectionally continuous, the
Fourier series for $f(x)$ will converge uniformly. These restrictions do not demand
that $f(x)$ be periodic, but they will be satisfied by continuous, differentiable,
periodic functions (period of $2\pi$). For a proof of uniform convergence the
reader is referred to the literature.1 With or without a discontinuity in $f(x)$,
the Fourier series will yield convergence in the mean, Section 9.4.
Integration
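Term-by-term integration of the series
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\cos nx + \sum_{n=1}^{\infty} b_n\sin nx \tag{14.51}$$
yields
$$\int_{x_0}^{x} f(x)\,dx = \left.\frac{a_0 x}{2}\right|_{x_0}^{x} + \left.\sum_{n=1}^{\infty}\frac{a_n}{n}\sin nx\right|_{x_0}^{x} - \left.\sum_{n=1}^{\infty}\frac{b_n}{n}\cos nx\right|_{x_0}^{x}. \tag{14.52}$$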
Clearly, the effect of integration is to place an additional power of n in the
denominator of each coefficient. This results in more rapid convergence than
before. Consequently, a convergent Fourier series may always be integrated
term by term, the resulting series converging uniformly to the integral of the
original function. Indeed, term-by-term integration may be valid even if the
original series (Eq. 14.51) is not itself convergent! The function/(x) need only
be integrable. A discussion will be found in Jeffreys and Jeffreys, Section 14.06.
Strictly speaking, Eq. 14.52 may not be a Fourier series; that is, if $a_0 \ne 0$,
there will be a term $\tfrac{1}{2}a_0 x$. However,
$$\int_{x_0}^{x} f(x)\,dx - \tfrac{1}{2}a_0 x \tag{14.53}$$
will still be a Fourier series.
1 See, for instance, R. V. Churchill, Fourier Series and Boundary Value
Problems. New York: McGraw-Hill (1941), Section 38.
Differentiation
The situation regarding differentiation is quite different from that of
integration. Here the word is caution. Consider the series for
$$f(x) = x, \quad -\pi < x < \pi. \tag{14.54}$$
We readily find (compare Exercise 14.3.2) that the Fourier series is
$$x = 2\sum_{n=1}^{\infty}(-1)^{n+1}\frac{\sin nx}{n}, \quad -\pi < x < \pi. \tag{14.55}$$
Differentiating term by term, we obtain
$$1 = 2\sum_{n=1}^{\infty}(-1)^{n+1}\cos nx, \tag{14.56}$$
which is not convergent! Warning. Check your derivative.
For a triangular wave (Exercise 14.3.4), in which the convergence is more
rapid (and uniform),
$$f(x) = \frac{\pi}{2} - \frac{4}{\pi}\sum_{n=1,\,\text{odd}}^{\infty}\frac{\cos nx}{n^2}. \tag{14.57}$$
Differentiating term by term
$$f'(x) = \frac{4}{\pi}\sum_{n=1,\,\text{odd}}^{\infty}\frac{\sin nx}{n}, \tag{14.58}$$
which is the Fourier expansion of a square wave
$$f'(x) = \begin{cases} 1, & 0 < x < \pi, \\ -1, & -\pi < x < 0. \end{cases} \tag{14.59}$$
Inspection of Fig. 14.7 verifies that this is indeed the derivative of our triangular
wave.
As the inverse of integration, the operation of differentiation has placed an
additional factor n in the numerator of each term. This reduces the rate of
convergence and may, as in the first case mentioned, render the differentiated
series divergent.
In general, term-by-term differentiation is permissible under the same
conditions listed for uniform convergence.
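The contrast between the two differentiated series can be seen numerically. The sketch below sums the term-by-term derivative of the sawtooth series, 2 sum (-1)^(n+1) cos(nx), and that of the triangular wave, (4/pi) sum over odd n of sin(nx)/n; the function names and truncation points are illustrative assumptions.

```python
import math

def sawtooth_derivative_partial(x, terms):
    """Partial sum 2 * sum_{n=1}^{terms} (-1)^(n+1) cos(nx); its terms do not decay."""
    return 2.0 * sum((-1) ** (n + 1) * math.cos(n * x) for n in range(1, terms + 1))

def triangle_derivative_partial(x, terms):
    """Partial sum (4/pi) * sum of sin((2k+1)x)/(2k+1) over the first `terms` odd harmonics."""
    return 4.0 / math.pi * sum(math.sin((2 * k + 1) * x) / (2 * k + 1) for k in range(terms))

# At x = 0 the differentiated sawtooth series just flips between 2 and 0.
print([sawtooth_derivative_partial(0.0, t) for t in range(1, 7)])

# The differentiated triangular wave settles toward the square-wave value +1.
print([triangle_derivative_partial(math.pi / 2, t) for t in (10, 100, 1000)])
```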
EXERCISES
14.4.1 Show that integration of the Fourier expansion of $f(x) = x$, $-\pi < x < \pi$, leads
to
$$\frac{\pi^2}{12} = \sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n^2} = 1 - \frac{1}{4} + \frac{1}{9} - \frac{1}{16} + \cdots.$$
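14.4.2 Parseval's identity.
(a) Assuming that the Fourier expansion of $f(x)$ is uniformly convergent, show
that
$$\frac{1}{\pi}\int_{-\pi}^{\pi}[f(x)]^2\,dx = \frac{a_0^2}{2} + \sum_{n=1}^{\infty}(a_n^2 + b_n^2).$$
This is Parseval's identity. It is actually a special case of the completeness
relation, Eq. 9.72.
(b) Given
$$x^2 = \frac{\pi^2}{3} + 4\sum_{n=1}^{\infty}\frac{(-1)^n\cos nx}{n^2}, \quad -\pi \le x \le \pi,$$
apply Parseval's identity to obtain $\zeta(4)$ in closed form.
(c) The condition of uniform convergence is not necessary. Show this by
applying the Parseval identity to the square wave
$$f(x) = \begin{cases} -1, & -\pi < x < 0 \\ 1, & 0 < x < \pi \end{cases} \; = \frac{4}{\pi}\sum_{n=1}^{\infty}\frac{\sin(2n-1)x}{2n-1}.$$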
14.4.3 Show that integrating the Fourier expansion of the Dirac delta function
(Exercise 14.3.10) leads to the Fourier representation of the square wave, Eq.
14.36, with $h = 1$.
Note. Integrating the constant term $(1/2\pi)$ leads to a term $x/2\pi$. What are you
going to do with this?
14.4.3A Integrate the Fourier expansion of the unit step function
$$f(x) = \begin{cases} 0, & -\pi < x < 0 \\ 1, & 0 < x < \pi. \end{cases}$$
Show that your integrated series agrees with Exercise 14.3.14.
14.4.4 In the interval $(-\pi, \pi)$,
$$\delta_n(x) = \begin{cases} n, & |x| < \dfrac{1}{2n}, \\ 0, & |x| > \dfrac{1}{2n} \end{cases}$$
(Fig. 14.9).
(a) Expand $\delta_n(x)$ as a Fourier cosine series.
(b) Show that your Fourier series agrees with a Fourier expansion of $\delta(x)$ in
the limit as $n \to \infty$.
14.4.5 Confirm the delta function nature of your Fourier series of Exercise 14.4.4 by
showing that for any $f(x)$ that is finite in the interval $[-\pi, \pi]$ and continuous
at $x = 0$,
$$\int_{-\pi}^{\pi} f(x)\,[\text{Fourier expansion of } \delta_\infty(x)]\,dx = f(0).$$
FIG. 14.9 Rectangular pulse
14.4.6 (a) Show that the Dirac delta function $\delta(x - a)$, expanded in a Fourier sine
series in the half interval $(0, L)$, $(0 < a < L)$, is given by
$$\delta(x - a) = \frac{2}{L}\sum_{n=1}^{\infty}\sin\left(\frac{n\pi a}{L}\right)\sin\left(\frac{n\pi x}{L}\right).$$
Note that this series actually describes
$-\delta(x + a) + \delta(x - a)$ in the interval $(-L, L)$.
(b) By integrating both sides of the preceding equation from 0 to $x$, show that
the cosine expansion of the square wave
$$f(x) = \begin{cases} 0, & 0 \le x < a \\ 1, & a < x < L, \end{cases}$$
is
$$f(x) = \frac{2}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\sin\left(\frac{n\pi a}{L}\right) - \frac{2}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\sin\left(\frac{n\pi a}{L}\right)\cos\left(\frac{n\pi x}{L}\right), \quad 0 \le x < L.$$
(c) Verify that the term
$$\frac{2}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\sin\left(\frac{n\pi a}{L}\right) \text{ is } \langle f(x)\rangle.$$
14.4.7 Verify the Fourier cosine expansion of the square wave, Exercise 14.4.6(b), by
direct calculation of the Fourier coefficients.
14.4.8 (a) A string is clamped at both ends $x = 0$ and $x = L$. Assuming small
amplitude vibrations, we find that the amplitude $y(x, t)$ satisfies the wave equation
$$\frac{\partial^2 y}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2 y}{\partial t^2}.$$
Here $v$ is the wave velocity. The string is set in vibration by a sharp blow
at $x = a$. Hence we have
$$\frac{\partial y(x,t)}{\partial t} = Lv_0\,\delta(x - a) \quad \text{at } t = 0.$$
The constant $L$ is included to compensate for the dimensions (inverse
length) of $\delta(x - a)$. With $\delta(x - a)$ given by Exercise 14.4.6(a), solve the
wave equation subject to these initial conditions.
ANS. $y(x,t) = \dfrac{2v_0 L}{\pi v}\sum_{n=1}^{\infty}\dfrac{1}{n}\sin\dfrac{n\pi a}{L}\sin\dfrac{n\pi x}{L}\sin\dfrac{n\pi vt}{L}$.
(b) Show that the transverse velocity of the string $\partial y(x,t)/\partial t$ is given by
$$\frac{\partial y(x,t)}{\partial t} = 2v_0\sum_{n=1}^{\infty}\sin\frac{n\pi a}{L}\sin\frac{n\pi x}{L}\cos\frac{n\pi vt}{L}.$$
14.4.9 A string, clamped at $x = 0$ and at $x = l$, is vibrating freely. Its motion is described
by the wave equation
$$\frac{\partial^2 u(x,t)}{\partial t^2} = v^2\frac{\partial^2 u(x,t)}{\partial x^2}.$$
Assume a Fourier expansion of the form
$$u(x,t) = \sum_{n=1}^{\infty} b_n(t)\sin\frac{n\pi x}{l}$$
and determine the coefficients $b_n(t)$. The initial conditions are
$$u(x, 0) = f(x) \quad \text{and} \quad \frac{\partial}{\partial t}u(x, 0) = g(x).$$
Note. This is only half the conventional Fourier orthogonality integral interval.
However, as long as only the sines are included here, the Sturm-Liouville
boundary conditions are still satisfied and the functions are orthogonal.
ANS. $b_n(t) = A_n\cos\dfrac{n\pi vt}{l} + B_n\sin\dfrac{n\pi vt}{l}$,
$A_n = \dfrac{2}{l}\int_0^l f(x)\sin\dfrac{n\pi x}{l}\,dx$, $\quad B_n = \dfrac{2}{n\pi v}\int_0^l g(x)\sin\dfrac{n\pi x}{l}\,dx$.
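14.4.10 (a) Continuing the vibrating string problem, Exercise 14.4.9, the presence
of a resisting medium will damp the vibrations according to the equation
$$\frac{\partial^2 u(x,t)}{\partial t^2} = v^2\frac{\partial^2 u(x,t)}{\partial x^2} - k\frac{\partial u(x,t)}{\partial t}.$$
Assume a Fourier expansion
$$u(x,t) = \sum_{n=1}^{\infty} b_n(t)\sin\frac{n\pi x}{l}$$
and again determine the coefficients $b_n(t)$. Take the initial and boundary
conditions to be the same as in Exercise 14.4.9. Assume the damping to
be small.
(b) Repeat but assume the damping to be large.
ANS. (a) $b_n(t) = e^{-kt/2}\{A_n\cos\omega_n t + B_n\sin\omega_n t\}$,
$A_n = \dfrac{2}{l}\int_0^l f(x)\sin\dfrac{n\pi x}{l}\,dx$,
$B_n = \dfrac{2}{\omega_n l}\int_0^l g(x)\sin\dfrac{n\pi x}{l}\,dx + \dfrac{k}{2\omega_n}A_n$, $\quad \omega_n^2 = \left(\dfrac{n\pi v}{l}\right)^2 - \left(\dfrac{k}{2}\right)^2 > 0$.
(b) $b_n(t) = e^{-kt/2}\{A_n\cosh\sigma_n t + B_n\sinh\sigma_n t\}$,
$A_n = \dfrac{2}{l}\int_0^l f(x)\sin\dfrac{n\pi x}{l}\,dx$,
$B_n = \dfrac{2}{\sigma_n l}\int_0^l g(x)\sin\dfrac{n\pi x}{l}\,dx + \dfrac{k}{2\sigma_n}A_n$, $\quad \sigma_n^2 = \left(\dfrac{k}{2}\right)^2 - \left(\dfrac{n\pi v}{l}\right)^2 > 0$.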
14.4.11 Find the charge distribution over the interior surfaces of the semicircles of
Exercise 14.3.6.
Note. You obtain a divergent series and this Fourier approach fails. Using
conformal mapping techniques, we may show the charge density to be
proportional to $\csc\theta$. Does $\csc\theta$ have a Fourier expansion?
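14.4.12 Given
$$\varphi_1(x) = \sum_{n=1}^{\infty}\frac{\sin nx}{n} = \begin{cases} -\tfrac{1}{2}(\pi + x), & -\pi \le x < 0 \\ \tfrac{1}{2}(\pi - x), & 0 < x \le \pi, \end{cases}$$
show by integrating that
$$\varphi_2(x) \equiv \sum_{n=1}^{\infty}\frac{\cos nx}{n^2} = \begin{cases} \dfrac{(\pi + x)^2}{4} - \dfrac{\pi^2}{12}, & -\pi \le x \le 0 \\ \dfrac{(\pi - x)^2}{4} - \dfrac{\pi^2}{12}, & 0 \le x \le \pi. \end{cases}$$
14.4.13 Given
$$\psi_{2s}(x) = \sum_{n=1}^{\infty}\frac{\sin nx}{n^{2s}}, \qquad \psi_{2s+1}(x) = \sum_{n=1}^{\infty}\frac{\cos nx}{n^{2s+1}}.$$
Develop the following recurrence relations:
(a) $\psi_{2s}(x) = \int_0^x \psi_{2s-1}(x)\,dx$
(b) $\psi_{2s+1}(x) = \zeta(2s + 1) - \int_0^x \psi_{2s}(x)\,dx$.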
Note. These functions $\psi_s(x)$ and the $\varphi_s(x)$ of the preceding exercise are known
as Clausen functions. In theory they may be used to improve the rate of
convergence of a Fourier series. As with the series of Chapter 5, there is always the
question of how much analytical work we do and how much arithmetic work
we demand that the computing machine do. As machines become steadily
more powerful, the balance progressively shifts so that we are doing less and
demanding that the machines do more.
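14.4.14 Show that
$$f(x) = \sum_{n=1}^{\infty}\frac{\cos nx}{n + 1}$$
may be written as
$$f(x) = \psi_1(x) - \varphi_2(x) + \sum_{n=1}^{\infty}\frac{\cos nx}{n^2(n + 1)}.$$
Note. $\psi_1(x)$ and $\varphi_2(x)$ are defined in the preceding exercises.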
14.5 GIBBS PHENOMENON
The Gibbs phenomenon is an overshoot, a peculiarity of the Fourier series
and other eigenfunction series at a simple discontinuity. An example is seen in
Fig. 14.1.
Summation of Series
In Section 14.1 the sum of the first several terms of the Fourier series for a
sawtooth wave was plotted (Fig. 14.1). Now we develop an analytic method of
summing the first r terms of our Fourier series.
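From Eq. 14.13
$$a_n\cos nx + b_n\sin nx = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos n(t - x)\,dt. \tag{14.60}$$
Then the $r$th partial sum, with the constant term $\tfrac{1}{2}a_0$ included, becomes1
$$s_r(x) = \frac{a_0}{2} + \sum_{n=1}^{r}(a_n\cos nx + b_n\sin nx) = \Re\left\{\frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\left[\frac{1}{2} + \sum_{n=1}^{r} e^{-i(t-x)n}\right]dt\right\}. \tag{14.61}$$
Summing the finite series of exponentials (geometric progression),2 we obtain
$$s_r(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,\frac{\sin[(r + \frac{1}{2})(t - x)]}{\sin\frac{1}{2}(t - x)}\,dt. \tag{14.62}$$
This is convergent at all points, including $t = x$. The factor
$$\frac{\sin[(r + \frac{1}{2})(t - x)]}{2\pi\sin\frac{1}{2}(t - x)}$$
is the Dirichlet kernel mentioned in Section 8.7 as a Dirac delta distribution.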
Square Wave
For convenience of numerical calculation we consider the behavior of the
Fourier series that represents the periodic square wave
$$f(x) = \begin{cases} \dfrac{h}{2}, & 0 < x < \pi, \\ -\dfrac{h}{2}, & -\pi < x < 0. \end{cases} \tag{14.63}$$
This is essentially the square wave used in Section 14.3, and we see immediately
that the solution is
$$f(x) = \frac{2h}{\pi}\left(\frac{\sin x}{1} + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots\right). \tag{14.64}$$
Applying Eq. 14.62 to our square wave (Eq. 14.63), we have the sum of the
first $r$ terms (plus $\tfrac{1}{2}a_0$, which is zero here).
1 It is of some interest to note that this series also occurs in the analysis of the
diffraction grating ($r$ slits).
2 Compare Exercise 6.1.7 with initial value $n = 1$.
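$$s_r(x) = \frac{h}{4\pi}\int_{0}^{\pi}\frac{\sin(r + \frac{1}{2})(t - x)}{\sin\frac{1}{2}(t - x)}\,dt - \frac{h}{4\pi}\int_{-\pi}^{0}\frac{\sin(r + \frac{1}{2})(t - x)}{\sin\frac{1}{2}(t - x)}\,dt$$
$$= \frac{h}{4\pi}\int_{0}^{\pi}\frac{\sin(r + \frac{1}{2})(t - x)}{\sin\frac{1}{2}(t - x)}\,dt - \frac{h}{4\pi}\int_{0}^{\pi}\frac{\sin(r + \frac{1}{2})(t + x)}{\sin\frac{1}{2}(t + x)}\,dt. \tag{14.65}$$
This last result follows from the transformation
$t \to -t$ in the second integral.
Replacing $t - x$ in the first term with $s$ and $t + x$ in the second term with $s$,
we obtain
$$s_r(x) = \frac{h}{4\pi}\int_{-x}^{\pi - x}\frac{\sin(r + \frac{1}{2})s}{\sin\frac{1}{2}s}\,ds - \frac{h}{4\pi}\int_{x}^{\pi + x}\frac{\sin(r + \frac{1}{2})s}{\sin\frac{1}{2}s}\,ds. \tag{14.66}$$
FIG. 14.10 Intervals of integration, Eq. 14.66
The intervals of integration are shown in Fig. 14.10 (top). Because the
integrands have the same mathematical form, the integrals for $x$ to $\pi - x$ cancel
leaving the integral ranges shown in the bottom portion of Fig. 14.10.
$$s_r(x) = \frac{h}{4\pi}\int_{-x}^{x}\frac{\sin(r + \frac{1}{2})s}{\sin\frac{1}{2}s}\,ds - \frac{h}{4\pi}\int_{\pi - x}^{\pi + x}\frac{\sin(r + \frac{1}{2})s}{\sin\frac{1}{2}s}\,ds. \tag{14.67}$$
Consider the partial sum in the vicinity of the discontinuity at $x = 0$. As
$x \to 0$, the second integral becomes negligible, and we associate the first integral
with the discontinuity at $x = 0$. Using $(r + \tfrac{1}{2}) = p$ and $ps = \xi$, we obtain
$$s_r(x) = \frac{h}{2\pi}\int_{0}^{px}\frac{\sin\xi}{\sin(\xi/2p)}\,\frac{d\xi}{p}. \tag{14.68}$$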
Calculation of Overshoot
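Our partial sum, $s_r(x)$, starts at zero when $x = 0$ (in agreement with Eq. 14.16)
and increases until $\xi = ps = \pi$, at which point the numerator, $\sin\xi$, goes negative.
For large $r$, and therefore for large $p$, our denominator remains positive. We get
the maximum value of the partial sum by taking the upper limit $px = \pi$. Right
here we see that $x$, the location of the overshoot maximum, is inversely
proportional to the number of terms taken
$$x_{\max} = \frac{\pi}{p} \approx \frac{\pi}{r}.$$
The maximum value of the partial sum is then
$$s_r(x)_{\max} = \frac{h}{2}\cdot\frac{1}{\pi}\int_{0}^{\pi}\frac{\sin\xi\,d\xi}{\sin(\xi/2p)\,p} \approx \frac{h}{2}\cdot\frac{2}{\pi}\int_{0}^{\pi}\frac{\sin\xi}{\xi}\,d\xi. \tag{14.69}$$
In terms of the sine integral, si($x$) of Section 10.5,
$$\int_{0}^{\pi}\frac{\sin\xi}{\xi}\,d\xi = \frac{\pi}{2} + \mathrm{si}(\pi). \tag{14.70}$$
The integral is clearly greater than $\pi/2$, since it can be written as
$$\left(\int_{0}^{\infty} - \int_{\pi}^{3\pi} - \int_{3\pi}^{5\pi} - \cdots\right)\frac{\sin\xi}{\xi}\,d\xi = \int_{0}^{\pi}\frac{\sin\xi}{\xi}\,d\xi. \tag{14.71}$$
We saw in Section 7.2 that the integral from 0 to $\infty$ is $\pi/2$. From this integral
we are subtracting a series of negative terms. A Gaussian quadrature (Appendix
2) or a power-series expansion and term-by-term integration yields
$$\frac{2}{\pi}\int_{0}^{\pi}\frac{\sin\xi}{\xi}\,d\xi = 1.1789797\ldots, \tag{14.72}$$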
which means that the Fourier series tends to overshoot the positive corner by
some 18 percent and to undershoot the negative corner by the same amount,
as suggested in Fig. 14.11. The inclusion of more terms (increasing r) does
nothing to remove this overshoot but merely moves it closer to the point of
discontinuity. The overshoot is the Gibbs phenomenon, and because of it the
Fourier series representation may be highly unreliable for precise numerical
work, especially in the vicinity of a discontinuity.
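The size of the overshoot can be reproduced with a few lines of arithmetic. The sketch below evaluates the integral of sin(t)/t from 0 to pi by term-by-term integration of the power series (one of the two routes mentioned above) and compares the result with a directly summed partial sum of the square-wave series evaluated near its first maximum. The function names, the series length of 40 terms, and the choice r = 199 are illustrative assumptions.

```python
import math

def sine_integral(x, terms=40):
    """Integral of sin(t)/t from 0 to x, from term-by-term integration of the Maclaurin series."""
    total, term = 0.0, x          # term holds (-1)^k x^(2k+1) / (2k+1)!
    for k in range(terms):
        total += term / (2 * k + 1)
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total

def square_wave_partial_sum(x, h, r):
    """(2h/pi) * sum of sin(nx)/n over odd n <= r, for the wave taking values +h/2 and -h/2."""
    return 2.0 * h / math.pi * sum(math.sin(n * x) / n for n in range(1, r + 1, 2))

overshoot_ratio = 2.0 / math.pi * sine_integral(math.pi)
print(overshoot_ratio)            # about 1.17898, i.e. an overshoot of some 18 percent

r = 199
peak = square_wave_partial_sum(math.pi / (r + 0.5), 1.0, r)
print(peak / 0.5)                 # partial sum near its first maximum, relative to h/2
```

Raising r moves the evaluation point pi/(r + 1/2) closer to the discontinuity but leaves the ratio essentially unchanged.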
The Gibbs phenomenon is not limited to the Fourier series. It occurs with
other eigenfunction expansions. Exercise 12.3.27 is an example of the Gibbs
phenomenon for a Legendre series.
EXERCISES
14.5.1 With the partial sum summation techniques of this section, show that at a
discontinuity in $f(x)$ the Fourier series for $f(x)$ takes on the arithmetic mean of
the right- and left-hand limits:
$$f(x_0) = \tfrac{1}{2}[f(x_0 + 0) + f(x_0 - 0)].$$
In evaluating $\lim_{r\to\infty} s_r(x_0)$ you may find it convenient to identify part of the integrand
as a Dirac delta function.
FIG. 14.11 Square wave: Gibbs phenomenon (partial sums for 20 to 100 terms)
14.5.2 Determine the partial sum, $s_n$, of the series in Eq. 14.64 by using
(a) $\dfrac{\sin mx}{m} = \int_0^x \cos my\,dy$
and
(b) $\sum_{p=1}^{n}\cos(2p - 1)y = \dfrac{\sin 2ny}{2\sin y}$.
Do you agree with the result given in Eq. 14.68?
14.5.3 Evaluate the finite step function series, Eq. 14.64, $h = 2$, using 100, 200, 300,
400, and 500 terms for $x = 0.0000(0.0005)0.0200$. Sketch your results (five curves)
or if a plotting routine is available, plot your results.
14.5.4 (a) Calculate the value of the Gibbs phenomenon integral
$$I = \frac{2}{\pi}\int_{0}^{\pi}\frac{\sin t}{t}\,dt$$
by numerical quadrature accurate to 12 significant figures.
(b) Check your result by (1) expanding the integrand as a series, (2) integrating
term by term, and (3) evaluating the integrated series. This calls for double
precision calculation.
ANS. $I = 1.178979744472$.
14.6 DISCRETE ORTHOGONALITY—DISCRETE
FOURIER TRANSFORM
For many physicists the Fourier transform is automatically the continuous
Fourier transform of Chapter 15. The use of the electronic digital computer,
however, necessarily replaces a continuum of values by a discrete set; an
integration is replaced by a summation. The continuous Fourier transform becomes
the discrete Fourier transform and an appropriate topic for this chapter.
Orthogonality Over Discrete Points
The orthogonality of the trigonometric functions and the imaginary
exponentials is expressed in Eqs. 14.7 to 14.10. This is the usual orthogonality for
functions: integration of a product of functions over the orthogonality interval.
The sines, cosines, and imaginary exponentials have the remarkable property
that they are also orthogonal over a series of discrete, equally spaced points
over the period (the orthogonality interval).
Consider a set of $2N$ time values
$$t_k = 0, \frac{T}{2N}, \frac{2T}{2N}, \ldots, \frac{(2N - 1)T}{2N} \tag{14.73}$$
for the time interval $(0, T)$. Then
$$t_k = \frac{kT}{2N}, \quad k = 0, 1, 2, \ldots, 2N - 1. \tag{14.74}$$
We shall prove that the exponential functions $\exp(2\pi i p t_k/T)$ and $\exp(2\pi i q t_k/T)$
satisfy an orthogonality relation over the discrete points $t_k$:
$$\sum_{k=0}^{2N-1}[\exp(2\pi i p t_k/T)]^*\exp(2\pi i q t_k/T) = 2N\,\delta_{p,\,q \pm 2nN}. \tag{14.75}$$
Here $n$, $p$, and $q$ are all integers.
Replacing $q - p$ by $s$, we find that the left-hand side of Eq. 14.75 becomes
$$\sum_{k=0}^{2N-1}\exp(2\pi i s t_k/T) = \sum_{k=0}^{2N-1}\exp(2\pi i s k/2N).$$
This right-hand side is obtained by using Eq. 14.74 to replace $t_k$. This is a finite
geometric series with an initial term 1 and a ratio
$$r = \exp(\pi i s/N).$$
From Eq. 5.7
$$\sum_{k=0}^{2N-1} r^k = \begin{cases} \dfrac{1 - r^{2N}}{1 - r} = 0, & r \ne 1 \\ 2N, & r = 1, \end{cases} \tag{14.76}$$
establishing Eq. 14.75, our basic orthogonality relation. The upper value, zero,
is a consequence of
$$r^{2N} = \exp(2\pi i s) = 1$$
for $s$ an integer. The lower value, $2N$, for $r = 1$ corresponds to $p = q$.
The orthogonality of the corresponding trigonometric functions is left as
Exercise 14.6.1.
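The discrete orthogonality relation is easy to confirm numerically. In the sketch below the sum over the 2N sample points is evaluated directly; T cancels because t_k/T = k/2N. The function name `discrete_inner_product` and the choice N = 4 are illustrative assumptions.

```python
import cmath

def discrete_inner_product(p, q, N):
    """Sum over k = 0..2N-1 of conj(exp(2*pi*i*p*k/2N)) * exp(2*pi*i*q*k/2N)."""
    M = 2 * N
    return sum(
        cmath.exp(-2j * cmath.pi * p * k / M) * cmath.exp(2j * cmath.pi * q * k / M)
        for k in range(M)
    )

N = 4
print(abs(discrete_inner_product(1, 1, N)))   # p = q: the sum is 2N
print(abs(discrete_inner_product(1, 3, N)))   # p != q: the sum vanishes
print(abs(discrete_inner_product(1, 9, N)))   # q = p + 2N: again 2N
```

The third case, q differing from p by a multiple of 2N, is the seed of the aliasing discussed in Example 14.6.1.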
Discrete Fourier Transform
To simplify the notation slightly and to make more direct contact with
physics, we introduce the (reciprocal) $\omega$-space, angular frequency, with
$$\omega_p = 2\pi p/T, \quad p = 0, 1, 2, \ldots, 2N - 1. \tag{14.77}$$
We make $p$ range over the same integers as $k$. The exponential $\exp(\pm 2\pi i p t_k/T)$
of Eq. 14.75 becomes $\exp(\pm i\omega_p t_k)$. The choice of whether to use the $+$ or the $-$
sign is a matter of convenience or convention. In quantum mechanics the
negative sign is selected when expressing the time dependence.
Consider a function of time defined (measured) at the discrete time values $t_k$.
We may construct
$$F(\omega_p) = \frac{1}{2N}\sum_{k=0}^{2N-1} f(t_k)e^{i\omega_p t_k}. \tag{14.78}$$
Employing the orthogonality relation, we obtain
$$\frac{1}{2N}\sum_{p=0}^{2N-1}(e^{i\omega_p t_m})^* e^{i\omega_p t_k} = \delta_{mk}, \tag{14.78a}$$
and then replacing subscript $m$ by $k$, we find that the amplitudes, $f(t_k)$, become
$$f(t_k) = \sum_{p=0}^{2N-1} F(\omega_p)e^{-i\omega_p t_k}. \tag{14.79}$$
The time function $f(t_k)$, $k = 0, 1, 2, \ldots, 2N - 1$, and the frequency function
$F(\omega_p)$, $p = 0, 1, 2, \ldots, 2N - 1$, are discrete Fourier transforms of each other.1
Compare Eqs. 14.78 and 14.79 with the corresponding continuous Fourier
transforms Eqs. 15.22 and 15.23 of Chapter 15.
Limitations
Taken as a pair of mathematical relations, the discrete Fourier transforms
are exact. We can say that the $2N$ $2N$-component vectors $\exp(-i\omega_p t_k)$, $k = 0$,
$1, 2, \ldots, 2N - 1$, form a complete set2 spanning the $t_k$-space. Then $f(t_k)$ in
Eq. 14.79 is simply a particular linear combination of these vectors.
Alternatively, we may take the $2N$ measured components $f(t_k)$ as defining a $2N$-
component vector in $t_k$-space. Then, Eq. 14.78 yields the $2N$-component vector
$F(\omega_p)$ in the reciprocal $\omega_p$-space. Equations 14.78 and 14.79 become matrix
equations with $\exp(i\omega_p t_k)/(2N)^{1/2}$ the elements of a unitary matrix.
The limitations of the discrete Fourier transform arise when we apply
Eqs. 14.78 and 14.79 to physical systems and attempt physical interpretation
and the generalization $F(\omega_p) \to F(\omega)$. Example 14.6.1 illustrates the problem that
can occur. The most important precaution to be taken to avoid trouble is to
take $N$ sufficiently large so that there is no angular frequency component of a
higher angular frequency than $\omega_N = 2\pi N/T$. For details on errors and
limitations in the use of the discrete Fourier transform the reader is referred to
Bergland and Hamming.
1 The two transform equations may be symmetrized with a resulting $(2N)^{-1/2}$
in each equation if desired.
2 By Eq. 14.76 these vectors are orthogonal and are therefore linearly
independent.
EXAMPLE 14.6.1 Discrete Fourier Transform—Aliasing
Consider the relatively simple case of $T = 2\pi$, $N = 2$, and $f(t_k) = \cos t_k$. From

$$t_k = kT/4 = k\pi/2, \qquad k = 0, 1, 2, 3, \tag{14.80}$$

$f(t_k) = \cos(t_k)$ is represented by the four-component vector

$$f(t_k) = (1, 0, -1, 0). \tag{14.81}$$

The frequencies $\omega_p$ are given by Eq. 14.77:

$$\omega_p = 2\pi p/T = p. \tag{14.82}$$

Clearly, $\cos t_k$ implies a $p = 1$ component and no other frequency components.
The transformation matrix

$$(2N)^{-1}\exp(i\omega_p t_k) = (2N)^{-1}\exp(ipk\pi/2)$$

becomes

$$\frac{1}{4}\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & i & -1 & -i \\ 1 & -1 & 1 & -1 \\ 1 & -i & -1 & i \end{pmatrix}. \tag{14.83}$$

Note that the $2N \times 2N$ matrix has only $2N$ independent components. It is the repetition of values that makes the fast Fourier transform technique possible.
Operating on column vector $f(t_k)$, we find that this matrix yields a column vector

$$F(\omega_p) = (0, \tfrac{1}{2}, 0, \tfrac{1}{2}). \tag{14.84}$$

Apparently, there is a $p = 3$ frequency component present. We reconstruct $f(t_k)$ by Eq. 14.79, obtaining

$$f(t_k) = \tfrac{1}{2}e^{-it_k} + \tfrac{1}{2}e^{-3it_k}. \tag{14.85}$$

Taking real parts, we can rewrite the equation as

$$f(t_k) = \tfrac{1}{2}\cos t_k + \tfrac{1}{2}\cos 3t_k. \tag{14.86}$$

Obviously, this result, Eq. 14.86, is not identical with our original $f(t_k) = \cos t_k$. But $\cos t_k = \tfrac{1}{2}\cos t_k + \tfrac{1}{2}\cos 3t_k$ at $t_k = 0$, $\pi/2$, $\pi$, and $3\pi/2$. The $\cos t_k$ and $\cos 3t_k$ mimic each other because of the limited number of data points (and the particular choice of data points). This error of one frequency mimicking another is known as aliasing. The problem can be minimized by taking more data points.
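The arithmetic of Example 14.6.1 can be replayed numerically. The following is a minimal sketch, not part of the original text: it uses only the Python standard library, and the names `F`, `f_back`, and `alias` are illustrative choices. It applies the sums of Eqs. 14.78 and 14.79 directly for $N = 2$, $T = 2\pi$, and then compares $\cos t$ with the aliased form of Eq. 14.86 at a point that is not a sample point.

```python
import cmath
import math

N = 2                      # 2N = 4 sample points, as in Example 14.6.1
T = 2 * math.pi
n_pts = 2 * N

t = [k * T / n_pts for k in range(n_pts)]            # t_k = k*pi/2
omega = [2 * math.pi * p / T for p in range(n_pts)]  # omega_p = p
f = [math.cos(tk) for tk in t]                       # close to (1, 0, -1, 0)

# Eq. 14.78: F(omega_p) = (1/2N) * sum_k f(t_k) exp(+i omega_p t_k)
F = [sum(f[k] * cmath.exp(1j * omega[p] * t[k]) for k in range(n_pts)) / n_pts
     for p in range(n_pts)]

# Eq. 14.79: f(t_k) = sum_p F(omega_p) exp(-i omega_p t_k)
f_back = [sum(F[p] * cmath.exp(-1j * omega[p] * t[k]) for p in range(n_pts))
          for k in range(n_pts)]

def alias(x):
    """Right-hand side of Eq. 14.86."""
    return 0.5 * math.cos(x) + 0.5 * math.cos(3 * x)

print("F(omega_p):", [complex(round(c.real, 12), round(c.imag, 12)) for c in F])
print("off-grid comparison at pi/4:", math.cos(math.pi / 4), alias(math.pi / 4))
```

On the four sample points the reconstruction agrees with $\cos t_k$, while off the grid (for instance at $t = \pi/4$) the two expressions differ, which is the aliasing described above.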
Fast Fourier Transform
The fast Fourier transform is a particular way of factoring and rearranging
the terms in the sums of the discrete Fourier transform. Brought to the attention
of the scientific community by Cooley and Tukey,³ its importance lies in the
drastic reduction in the number of numerical operations required. Because of
the tremendous increase in speed achieved (and reduction in cost), the fast
Fourier transform has been hailed as one of the few really significant advances
in numerical analysis in the past few decades.
For $N$ time values (measurements) a direct calculation of a discrete Fourier transform would mean about $N^2$ multiplications. For $N$ a power of 2 the fast Fourier transform technique of Cooley and Tukey cuts the number of multiplications required to $(N/2)\log_2 N$. If $N = 1024$ $(= 2^{10})$, the fast Fourier transform achieves a computational reduction by a factor of over 200. This is why the fast Fourier transform is called fast and why it has literally revolutionized the digital processing of waveforms.
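The operation counts quoted above, and the idea of factoring the sum, can be illustrated with a short sketch. This is an assumption-laden illustration rather than the Cooley and Tukey paper's own presentation: it is a textbook recursive radix-2 (even/odd splitting) version, written with the $e^{+i\omega t}$ sign convention of Eq. 14.78 and without the $1/2N$ normalization. The function names are hypothetical.

```python
import cmath
import math

def dft_direct(x):
    """Direct evaluation of the sum: about len(x)**2 multiplications."""
    n = len(x)
    return [sum(x[k] * cmath.exp(2j * math.pi * p * k / n) for k in range(n))
            for p in range(n)]

def fft_radix2(x):
    """Recursive radix-2 FFT sketch; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])   # transform of even-indexed samples
    odd = fft_radix2(x[1::2])    # transform of odd-indexed samples
    out = [0j] * n
    for p in range(n // 2):
        twiddle = cmath.exp(2j * math.pi * p / n) * odd[p]
        out[p] = even[p] + twiddle            # the repeated values are reused
        out[p + n // 2] = even[p] - twiddle   # for the second half
    return out

n = 1024
direct_mults = n * n                          # N^2
fft_mults = (n // 2) * int(math.log2(n))      # (N/2) log2 N
ratio = direct_mults / fft_mults

samples = [math.sin(0.3 * k) + 0.5 * math.cos(1.1 * k) for k in range(64)]
max_diff = max(abs(a - b)
               for a, b in zip(dft_direct(samples), fft_radix2(samples)))
print(direct_mults, fft_mults, ratio, max_diff)
```

The two routines return the same numbers to rounding error; only the amount of work differs.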
The fast Fourier transform should be available at every computation center.
It is included in the SSP. Details on the internal operation will be found in the
paper by Cooley and Tukey and in the paper by Bergland.⁴
EXERCISES
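14.6.1 Derive the trigonometric forms of discrete orthogonality corresponding to Eq. 14.75:

$$\sum_{k=0}^{2N-1}\cos(2\pi p t_k/T)\sin(2\pi q t_k/T) = 0,$$

$$\sum_{k=0}^{2N-1}\cos(2\pi p t_k/T)\cos(2\pi q t_k/T) = \begin{cases} 0, & p \neq q, \\ N, & p = q \neq 0, N, \\ 2N, & p = q = 0, N, \end{cases}$$

$$\sum_{k=0}^{2N-1}\sin(2\pi p t_k/T)\sin(2\pi q t_k/T) = \begin{cases} 0, & p \neq q, \\ N, & p = q \neq 0, N, \\ 0, & p = q = 0, N. \end{cases}$$

Hint. Trigonometric identities such as
$$\sin A\cos B = \tfrac{1}{2}[\sin(A + B) + \sin(A - B)]$$
are useful.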
14.6.2 Equation 14.75 exhibits orthogonality summing over time points. Show that we have the same orthogonality summing over frequency points.

$$\frac{1}{2N}\sum_{p=0}^{2N-1}\left(e^{i\omega_p t_m}\right)^{*} e^{i\omega_p t_k} = \delta_{mk}.$$
³ J. W. Cooley and J. W. Tukey, Math. Computation 19, 297 (1965).
⁴ G. D. Bergland, A Guided Tour of the Fast Fourier Transform, IEEE Spectrum, pp. 41-52 (July 1969).
14.6.3 Show, in detail, how to go from

$$F(\omega_p) = \frac{1}{2N}\sum_{k=0}^{2N-1} f(t_k)e^{i\omega_p t_k}$$

to

$$f(t_k) = \sum_{p=0}^{2N-1} F(\omega_p)e^{-i\omega_p t_k}.$$
14.6.4 The functions $f(t_k)$ and $F(\omega_p)$ are discrete Fourier transforms of each other. Derive the following symmetry relations:
(a) If $f(t_k)$ is real, $F(\omega_p)$ is Hermitian symmetric; that is,
$$F(\omega_p) = F^{*}\!\left(\frac{4\pi N}{T} - \omega_p\right).$$
(b) If $f(t_k)$ is pure imaginary,
$$F(\omega_p) = -F^{*}\!\left(\frac{4\pi N}{T} - \omega_p\right).$$
Note. The symmetry of part (a) is an illustration of aliasing. The frequency $4\pi N/T - \omega_p$ masquerades as the frequency $\omega_p$.
14.6.5 Given $N = 2$, $T = 2\pi$, and $f(t_k) = \sin t_k$.
(a) Find $F(\omega_p)$, $p = 0, 1, 2, 3$.
(b) Reconstruct $f(t_k)$ from $F(\omega_p)$ and exhibit the aliasing of $\omega_1 = 1$ and $\omega_3 = 3$.
ANS. (a) $F(\omega_p) = (0, i/2, 0, -i/2)$
(b) $f(t_k) = \tfrac{1}{2}\sin t_k - \tfrac{1}{2}\sin 3t_k$.
14.6.6 Show that the Chebyshev polynomials $T_n(x)$ satisfy a discrete orthogonality relation
$0, \quad m \neq n$
Here $x_s = \cos\theta_s$, where the $(N + 1)$ $\theta_s$'s are equally spaced along the $\theta$ axis:
REFERENCES
Carslaw, H. S., Introduction to the Theory of Fourier's Series and Integrals, 2nd ed. London: Macmillan (1921); 3rd ed., paperback, New York: Dover (1952).
This is a detailed and classic work, which includes a considerable discussion of Gibbs phenomenon in Chapter IX.
Hamming, R. W., Numerical Methods for Scientists and Engineers, 2nd ed. New York: McGraw-Hill (1973).
Chapter 33 provides an excellent description of the fast Fourier transform.
Jeffreys, H. and B. S. Jeffreys, Methods of Mathematical Physics, 3rd ed. Cambridge: Cambridge University Press (1966).
Kufner, A. and J. Kadlec, Fourier Series. London: Iliffe (1971).
This book is a clear account of Fourier series in the context of Hilbert space.
Lanczos, C., Applied Analysis. Englewood Cliffs, N.J.: Prentice-Hall (1956).
The book gives a well-written presentation of the Lanczos convergence technique (which suppresses the Gibbs phenomenon oscillations). This and several other topics are presented from the point of view of a mathematician who wants useful numerical results and not just abstract existence theorems.
Oberhettinger, F., Fourier Expansions, A Collection of Formulas. New York and London: Academic Press (1973).
Zygmund, A., Trigonometric Series. Cambridge: Cambridge University Press (1977).
The volume contains an extremely complete exposition, including relatively recent results in the realm of pure mathematics.
15 INTEGRAL TRANSFORMS
15.1 INTEGRAL TRANSFORMS
Frequently in mathematical physics we encounter pairs of functions related
by an expression of the following form:
$$g(\alpha) = \int_a^b f(t)K(\alpha, t)\,dt. \tag{15.1}$$

The function $g(\alpha)$ is called the (integral) transform of $f(t)$ by the kernel $K(\alpha, t)$. The operation may also be described as mapping a function $f(t)$ in $t$-space into another function $g(\alpha)$ in $\alpha$-space. This interpretation takes on physical significance in the time-frequency relation of Example 15.3.1 and in the real space-momentum space relations of Section 15.6.
Fourier Transform
One of the most useful of the infinite number of possible transforms is the
Fourier transform given by
$$g(\alpha) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(t)e^{i\alpha t}\,dt. \tag{15.2}$$

Two modifications of this form, developed in Section 15.3, are the Fourier cosine and Fourier sine transforms:

$$g_c(\alpha) = \sqrt{\frac{2}{\pi}}\int_0^{\infty} f(t)\cos\alpha t\,dt, \tag{15.3}$$

$$g_s(\alpha) = \sqrt{\frac{2}{\pi}}\int_0^{\infty} f(t)\sin\alpha t\,dt. \tag{15.4}$$

The Fourier transform is based on the kernel $e^{i\alpha t}$ and its real and imaginary parts taken separately, $\cos\alpha t$ and $\sin\alpha t$. Because these kernels are the functions used to describe waves, Fourier transforms appear frequently in studies of waves and the extraction of information from waves, particularly when phase information is involved. The output of a stellar interferometer, for instance, involves a Fourier transform of the brightness across a stellar disk. The electron distribution in an atom may be obtained from a Fourier transform of the amplitude of scattered X-rays. In quantum mechanics the physical origin of the Fourier relations of Section 15.6 is the wave nature of matter and our description of matter in terms of waves.
Laplace, Mellin, and Hankel Transforms
Three other useful kernels are
$$e^{-\alpha t}, \qquad tJ_n(\alpha t), \qquad t^{\alpha - 1}.$$

These give rise to the following transforms:

$$g(\alpha) = \int_0^{\infty} f(t)e^{-\alpha t}\,dt, \qquad \text{Laplace transform} \tag{15.5}$$

$$g(\alpha) = \int_0^{\infty} f(t)\,tJ_n(\alpha t)\,dt, \qquad \text{Hankel transform (Fourier-Bessel)} \tag{15.6}$$

$$g(\alpha) = \int_0^{\infty} f(t)\,t^{\alpha - 1}\,dt, \qquad \text{Mellin transform.} \tag{15.7}$$

Clearly, the possible types are unlimited. These transforms have been useful in mathematical analysis and in physical applications. We have actually used the Mellin transform without calling it by name; that is, $g(\alpha) = (\alpha - 1)!$ is the Mellin transform of $f(t) = e^{-t}$. Of course, we could just as well say $g(\alpha) = n!/\alpha^{n+1}$ is the Laplace transform of $f(t) = t^n$. Of the three, the Laplace transform is by far the most used. It is discussed at length in Sections 15.8 to 15.12. The Hankel transform, a Fourier transform for a Bessel function expansion, represents a limiting case of a Fourier-Bessel series. It occurs in potential problems in cylindrical coordinates and has been applied extensively in acoustics.
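The two closed forms just quoted are easy to spot-check by quadrature. The sketch below is illustrative only: `simpson`, `mellin_of_exp`, `laplace_of_power`, and the cutoff `T_MAX` (an assumed finite stand-in for the infinite upper limit) are names introduced here, not part of the text.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

T_MAX = 60.0   # assumed cutoff; the integrands are negligible beyond it

def mellin_of_exp(alpha):
    """Mellin transform (Eq. 15.7) of f(t) = exp(-t), truncated at T_MAX."""
    return simpson(lambda t: math.exp(-t) * t ** (alpha - 1), 0.0, T_MAX)

def laplace_of_power(n, alpha):
    """Laplace transform (Eq. 15.5) of f(t) = t**n, truncated at T_MAX."""
    return simpson(lambda t: t ** n * math.exp(-alpha * t), 0.0, T_MAX)

mellin_val = mellin_of_exp(4.0)          # compare with (4 - 1)! = 6
laplace_val = laplace_of_power(3, 2.0)   # compare with 3!/2**4 = 0.375
print(mellin_val, math.factorial(3))
print(laplace_val, math.factorial(3) / 2 ** 4)
```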
Linearity
All these integral transforms are linear; that is,
$$\int_a^b [c_1 f_1(t) + c_2 f_2(t)]K(\alpha, t)\,dt = \int_a^b c_1 f_1(t)K(\alpha, t)\,dt + \int_a^b c_2 f_2(t)K(\alpha, t)\,dt, \tag{15.8}$$

$$\int_a^b c f(t)K(\alpha, t)\,dt = c\int_a^b f(t)K(\alpha, t)\,dt, \tag{15.9}$$

where $c_1$ and $c_2$ are constants and $f_1(t)$ and $f_2(t)$ are functions for which the transform operation is defined.
Representing our linear integral transform by the operator $\mathcal{L}$, we obtain

$$g(\alpha) = \mathcal{L}f(t). \tag{15.10}$$

We expect an inverse operator $\mathcal{L}^{-1}$ exists such that¹

$$f(t) = \mathcal{L}^{-1}g(\alpha). \tag{15.11}$$

¹ Expectation is not proof, and here proof of existence is complicated because we are actually in an infinite-dimensional Hilbert space. We shall prove existence in the special cases of interest by actual construction.

[FIG. 15.1 Schematic: the original problem is carried by the integral transform to a problem in transform space, which has a relatively easy solution; the inverse transform carries the solution in transform space back to the solution of the original problem, in place of the difficult direct solution.]

For our three Fourier transforms $\mathcal{L}^{-1}$ is given in Section 15.3. In general, the determination of the inverse transform is the main problem in using integral transforms. The inverse Laplace transform is discussed in Section 15.12. For details of the inverse Hankel and inverse Mellin transforms the reader is referred to the references at the end of the chapter.
Integral transforms have many special physical applications and interpretations that are noted in the remainder of this chapter. The most common application is outlined in Fig. 15.1. Perhaps an original problem can be solved only with difficulty, if at all, in the original coordinates (space). It often happens that the transform of the problem can be solved relatively easily. Then, the inverse transform returns the solution from the transform coordinates to the original system. Example 15.4.1 and Exercise 15.4.1 illustrate this technique.
EXERCISES
15.1.1 The Fourier transforms for a function of two variables are

$$F(u, v) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)e^{i(ux + vy)}\,dx\,dy,$$

$$f(x, y) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} F(u, v)e^{-i(ux + vy)}\,du\,dv.$$

Using $f(x, y) = f([x^2 + y^2]^{1/2})$, show that the zero-order Hankel transforms

$$F(\rho) = \int_0^{\infty} r f(r)J_0(\rho r)\,dr,$$

$$f(r) = \int_0^{\infty} \rho F(\rho)J_0(\rho r)\,d\rho,$$

are a special case of the Fourier transforms.
This technique may be generalized to derive the Hankel transforms of order $\nu$, $\nu = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots$ (compare Sneddon, Fourier Transforms). A more general approach, valid for $\nu > -\tfrac{1}{2}$, is presented in Sneddon's The Use of Integral Transforms. It might also be noted that the Hankel transforms of nonintegral order $\nu = \pm\tfrac{1}{2}$ reduce to Fourier sine and cosine transforms.
15.1.2 Assuming the validity of the Hankel transform-inverse transform pair of equations

$$g(\alpha) = \int_0^{\infty} f(t)J_n(\alpha t)\,t\,dt,$$

$$f(t) = \int_0^{\infty} g(\alpha)J_n(\alpha t)\,\alpha\,d\alpha,$$

show that the Dirac delta function has a Bessel integral representation

$$\delta(t - t') = t\int_0^{\infty} J_n(\alpha t)J_n(\alpha t')\,\alpha\,d\alpha.$$

This expression is useful in developing Green's functions in cylindrical coordinates, where the eigenfunctions are Bessel functions.
15.1.3 From the Fourier transforms, Eqs. 15.22 and 15.23, show that the transformation
$$t \to \ln x, \qquad i\omega \to \alpha - \gamma$$
leads to
$$G(\alpha) = \int_0^{\infty} F(x)x^{\alpha - 1}\,dx$$
and
$$F(x) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} G(\alpha)x^{-\alpha}\,d\alpha.$$
These are the Mellin transforms. A similar change of variables is employed in Section 15.12 to derive the inverse Laplace transform.
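15.1.4 Verify the following Mellin transforms:
(a) $\displaystyle\int_0^{\infty} x^{\alpha - 1}\sin(kx)\,dx = k^{-\alpha}(\alpha - 1)!\,\sin\frac{\pi\alpha}{2}, \qquad -1 < \alpha < 1.$
(b) $\displaystyle\int_0^{\infty} x^{\alpha - 1}\cos(kx)\,dx = k^{-\alpha}(\alpha - 1)!\,\cos\frac{\pi\alpha}{2}, \qquad 0 < \alpha < 1.$
Hint. You can force the integrals into a tractable form by inserting a convergence factor $e^{-bx}$ and (after integrating) letting $b \to 0$. Also, $\cos kx + i\sin kx = \exp ikx$.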
15.2 DEVELOPMENT OF THE FOURIER INTEGRAL
In Chapter 14 it was shown that Fourier series are useful in representing certain functions (1) over a limited range $[0, 2\pi]$, $[-L, L]$, and so on, or (2) for the infinite interval $(-\infty, \infty)$, if the function is periodic. We now turn our attention to the problem of representing a nonperiodic function over the infinite range. Physically this means resolving a single pulse or wave packet into sinusoidal waves.
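We have seen (Section 14.2) that for the interval $[-L, L]$ the coefficients $a_n$ and $b_n$ could be written as

$$a_n = \frac{1}{L}\int_{-L}^{L} f(t)\cos\frac{n\pi t}{L}\,dt, \tag{15.12}$$

$$b_n = \frac{1}{L}\int_{-L}^{L} f(t)\sin\frac{n\pi t}{L}\,dt. \tag{15.13}$$

The resulting Fourier series is

$$f(x) = \frac{1}{2L}\int_{-L}^{L} f(t)\,dt + \frac{1}{L}\sum_{n=1}^{\infty}\cos\frac{n\pi x}{L}\int_{-L}^{L} f(t)\cos\frac{n\pi t}{L}\,dt + \frac{1}{L}\sum_{n=1}^{\infty}\sin\frac{n\pi x}{L}\int_{-L}^{L} f(t)\sin\frac{n\pi t}{L}\,dt \tag{15.14}$$

or

$$f(x) = \frac{1}{2L}\int_{-L}^{L} f(t)\,dt + \frac{1}{L}\sum_{n=1}^{\infty}\int_{-L}^{L} f(t)\cos\frac{n\pi}{L}(t - x)\,dt. \tag{15.15}$$

We now let the parameter $L$ approach infinity, transforming the finite interval $[-L, L]$ into the infinite interval $(-\infty, \infty)$. We set

$$\frac{n\pi}{L} = \omega, \qquad \frac{\pi}{L} = \Delta\omega, \qquad \text{with } L \to \infty.$$

Then we have

$$f(x) \to \frac{1}{\pi}\sum_{n=1}^{\infty}\Delta\omega\int_{-\infty}^{\infty} f(t)\cos\omega(t - x)\,dt \tag{15.16}$$

or

$$f(x) = \frac{1}{\pi}\int_0^{\infty} d\omega\int_{-\infty}^{\infty} f(t)\cos\omega(t - x)\,dt, \tag{15.17}$$

replacing the infinite sum by the integral over $\omega$. The first term (corresponding to $a_0$) has vanished, assuming that $\int_{-\infty}^{\infty} f(t)\,dt$ exists.
It must be emphasized that this result (Eq. 15.17) is purely formal. It is not intended as a rigorous derivation, but it can be made rigorous (compare I. N. Sneddon, Fourier Transforms, Section 3.2). We take Eq. 15.17 as the Fourier integral. It is subject to the conditions that $f(x)$ is (1) piecewise continuous, (2) differentiable, and (3) absolutely integrable; that is, $\int_{-\infty}^{\infty} |f(x)|\,dx$ is finite.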
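As a concrete check of Eq. 15.17 (a worked illustration added here, with the test function $f(t) = e^{-|t|}$ chosen purely for convenience), the inner and outer integrals can both be done in closed form. The inner integral splits into an even part and an odd part, and the odd part integrates to zero; the outer integral is the standard result $\int_0^\infty \cos\omega x\,(1+\omega^2)^{-1}\,d\omega = (\pi/2)e^{-|x|}$.

```latex
\begin{align*}
\int_{-\infty}^{\infty} e^{-|t|}\cos\omega(t-x)\,dt
  &= \cos\omega x\int_{-\infty}^{\infty} e^{-|t|}\cos\omega t\,dt
   + \sin\omega x\int_{-\infty}^{\infty} e^{-|t|}\sin\omega t\,dt \\
  &= \frac{2\cos\omega x}{1+\omega^{2}} + 0, \\[4pt]
\frac{1}{\pi}\int_{0}^{\infty} d\omega\,\frac{2\cos\omega x}{1+\omega^{2}}
  &= \frac{2}{\pi}\cdot\frac{\pi}{2}\,e^{-|x|} = e^{-|x|} = f(x).
\end{align*}
```

So for this function the Fourier integral reproduces $f(x)$ for every $x$.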
Fourier Integral—Exponential Form
Our Fourier integral (Eq. 15.17) may be put into exponential form by noting that

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega\int_{-\infty}^{\infty} f(t)\cos\omega(t - x)\,dt, \tag{15.18}$$

whereas

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega\int_{-\infty}^{\infty} f(t)\sin\omega(t - x)\,dt = 0; \tag{15.19}$$

$\cos\omega(t - x)$ is an even function of $\omega$ and $\sin\omega(t - x)$ is an odd function of $\omega$. Adding Eqs. 15.18 and 15.19 (with a factor $i$), we obtain

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-i\omega x}\,d\omega\int_{-\infty}^{\infty} f(t)e^{i\omega t}\,dt. \tag{15.20}$$

The variable $\omega$ introduced here is an arbitrary mathematical variable. In many physical problems, however, it corresponds to the angular frequency $\omega$. We may then interpret Eq. 15.18 or 15.20 as a representation of $f(x)$ in terms of a distribution of infinitely long sinusoidal wave trains of angular frequency $\omega$ in which this frequency is a continuous variable.
Dirac Delta Function Derivation
If the order of integration of Eq. 15.20 is reversed, we may rewrite it as

$$f(x) = \int_{-\infty}^{\infty} f(t)\left\{\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega(t - x)}\,d\omega\right\}dt. \tag{15.20a}$$

Apparently the quantity in curly brackets behaves as a delta function, $\delta(t - x)$. We might take Eq. 15.20a as presenting us with a representation of the Dirac delta function. Alternatively, we take it as a clue to a new derivation of the Fourier integral theorem.
From Eq. 8.114 (shifting the singularity from $t = 0$ to $t = x$)

$$f(x) = \lim_{n\to\infty}\int_{-\infty}^{\infty} f(t)\,\delta_n(t - x)\,dt, \tag{15.21a}$$

where $\delta_n(t - x)$ is a sequence defining the distribution $\delta(t - x)$. Note that Eq. 15.21a assumes that $f(t)$ is continuous at $t = x$.
We take $\delta_n(t - x)$ to be

$$\delta_n(t - x) = \frac{\sin n(t - x)}{\pi(t - x)} = \frac{1}{2\pi}\int_{-n}^{n} e^{i\omega(t - x)}\,d\omega, \tag{15.21b}$$

using Eq. 8.111. Substituting into Eq. 15.21a, we have

$$f(x) = \lim_{n\to\infty}\frac{1}{2\pi}\int_{-\infty}^{\infty} f(t)\int_{-n}^{n} e^{i\omega(t - x)}\,d\omega\,dt. \tag{15.21c}$$

Interchanging the order of integration and then taking the limit as $n \to \infty$, we have Eq. 15.20, the Fourier integral theorem.
With the understanding that it belongs under an integral sign as in Eq. 15.21a, the identification

$$\delta(t - x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega(t - x)}\,d\omega, \tag{15.21d}$$

provides a very useful representation of the delta function. It is used to great advantage in Sections 15.5 and 15.6.
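The limiting process of Eq. 15.21a can be watched numerically. The following sketch is illustrative and rests on assumptions that are not in the text: the smooth test function $f(t) = e^{-t^2}$, a finite integration window, and the helper names `simpson`, `delta_n`, and `sift`. It integrates $f(t)\,\delta_n(t - x)$ with the kernel of Eq. 15.21b for a small and a larger $n$.

```python
import math

def simpson(func, a, b, n):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def delta_n(u, n):
    """delta_n(u) = sin(n u)/(pi u), Eq. 15.21b, with its u -> 0 limit n/pi."""
    if abs(u) < 1e-12:
        return n / math.pi
    return math.sin(n * u) / (math.pi * u)

def f(t):
    return math.exp(-t * t)   # assumed smooth test function

def sift(x, n, half_width=10.0, steps=20000):
    """Approximate the integral of f(t) * delta_n(t - x) over the real line."""
    return simpson(lambda t: f(t) * delta_n(t - x, n),
                   -half_width, half_width, steps)

x0 = 0.5
coarse = sift(x0, 1)    # broad kernel: noticeably different from f(x0)
fine = sift(x0, 20)     # narrow kernel: close to f(x0)
print(coarse, fine, f(x0))
```

For this rapidly decaying $f$ the convergence with $n$ is fast; a function with more high-frequency content would need larger $n$.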
15.3 FOURIER TRANSFORMS—INVERSION
THEOREM
Let us define $g(\omega)$, the Fourier transform of the function $f(t)$, by

$$g(\omega) \equiv \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(t)e^{i\omega t}\,dt. \tag{15.22}$$

Exponential Transform
Then from Eq. 15.20 we have the inverse relation

$$f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} g(\omega)e^{-i\omega x}\,d\omega. \tag{15.23}$$

It will be noted that Eqs. 15.22 and 15.23 are almost but not quite symmetrical, differing in the sign of $i$.
Here two points deserve comment. First, the $1/\sqrt{2\pi}$ symmetry is a matter of choice, not of necessity. Many authors will attach the entire $1/2\pi$ factor of Eq. 15.20 to one of the two equations: to Eq. 15.22 or Eq. 15.23. Second, although the Fourier integral Eq. 15.20 has received much attention in the mathematics literature, we shall be primarily interested in the Fourier transform and its inverse. They are the equations with physical significance.
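A round trip through Eqs. 15.22 and 15.23 makes the sign and normalization conventions concrete. This is a minimal sketch under stated assumptions: the Gaussian test function, the finite cutoff replacing the infinite limits, and all function names are choices made here. For $f(t) = e^{-t^2/2}$ the transform with these conventions is $g(\omega) = e^{-\omega^2/2}$, which gives an independent check.

```python
import cmath
import math

def simpson(func, a, b, n):
    """Composite Simpson rule on [a, b]; n must be even. Complex values are fine."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

CUTOFF = 10.0   # assumed stand-in for the infinite limits
STEPS = 400     # assumed grid size; adequate for the Gaussian used here
NORM = 1 / math.sqrt(2 * math.pi)

def f(t):
    return math.exp(-0.5 * t * t)   # assumed test function

def g(omega):
    """Eq. 15.22: forward transform, kernel exp(+i omega t)."""
    return NORM * simpson(lambda t: f(t) * cmath.exp(1j * omega * t),
                          -CUTOFF, CUTOFF, STEPS)

def f_recovered(x):
    """Eq. 15.23: inverse transform, kernel exp(-i omega x)."""
    return NORM * simpson(lambda w: g(w) * cmath.exp(-1j * w * x),
                          -CUTOFF, CUTOFF, STEPS)

g_at = g(1.3)            # compare with exp(-1.3**2 / 2)
back = f_recovered(0.7)  # compare with f(0.7)
print(g_at, math.exp(-0.845))
print(back, f(0.7))
```

Moving the whole $1/2\pi$ into one of the two functions, as some authors do, would change `g` by a constant factor but leave the round trip intact.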
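When we move the Fourier transform pair to three-dimensional space, it becomes

$$g(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}}\int f(\mathbf{r})e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3x, \tag{15.23a}$$

$$f(\mathbf{r}) = \frac{1}{(2\pi)^{3/2}}\int g(\mathbf{k})e^{-i\mathbf{k}\cdot\mathbf{r}}\,d^3k. \tag{15.23b}$$

The integrals are over all space. Verification, if desired, follows immediately by substituting the left-hand side of one equation into the integrand of the other equation and using the three-dimensional delta function.¹ Equation 15.23b may be interpreted as an expansion of a function $f(\mathbf{r})$ in a continuum of plane wave eigenfunctions. $g(\mathbf{k})$ then becomes the amplitude of the wave $\exp(-i\mathbf{k}\cdot\mathbf{r})$.

¹ The three-dimensional delta function is
$$\delta(\mathbf{r}_1 - \mathbf{r}_2) = \delta(x_1 - x_2)\,\delta(y_1 - y_2)\,\delta(z_1 - z_2)$$
$$= \frac{1}{2\pi}\int_{-\infty}^{\infty}\exp[ik_1(x_1 - x_2)]\,dk_1\cdot\frac{1}{2\pi}\int_{-\infty}^{\infty}\exp[ik_2(y_1 - y_2)]\,dk_2\cdot\frac{1}{2\pi}\int_{-\infty}^{\infty}\exp[ik_3(z_1 - z_2)]\,dk_3$$
$$= \frac{1}{(2\pi)^3}\int\exp[i\mathbf{k}\cdot(\mathbf{r}_1 - \mathbf{r}_2)]\,d^3k.$$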
Cosine Transform
If $f(x)$ is odd or even, these transforms may be expressed in a somewhat different form. Consider, first, $f(x) = f(-x)$, even. Writing the exponential of Eq. 15.22 in trigonometric form, we have

$$g_c(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f_c(t)(\cos\omega t + i\sin\omega t)\,dt = \sqrt{\frac{2}{\pi}}\int_0^{\infty} f_c(t)\cos\omega t\,dt, \tag{15.24}$$

the $\sin\omega t$ dependence vanishing on integration over the symmetric interval $(-\infty, \infty)$. Similarly, since $\cos\omega t$ is even, Eq. 15.23 transforms to

$$f_c(x) = \sqrt{\frac{2}{\pi}}\int_0^{\infty} g_c(\omega)\cos\omega x\,d\omega. \tag{15.25}$$

Equations 15.24 and 15.25 are known as Fourier cosine transforms.
Sine Transform
The corresponding pair of Fourier sine transforms is obtained by assuming that $f(x) = -f(-x)$, odd, and applying the same symmetry arguments. The equations are

$$g_s(\omega) = \sqrt{\frac{2}{\pi}}\int_0^{\infty} f_s(t)\sin\omega t\,dt,^{2} \tag{15.26}$$

$$f_s(x) = \sqrt{\frac{2}{\pi}}\int_0^{\infty} g_s(\omega)\sin\omega x\,d\omega. \tag{15.27}$$

From the last equation we may develop the physical interpretation that $f(x)$ is being described by a continuum of sine waves. The amplitude of $\sin\omega x$ is given by $\sqrt{2/\pi}\,g_s(\omega)$, in which $g_s(\omega)$ is the Fourier sine transform of $f_s(x)$. It will be seen that Eq. 15.27 is the integral analog of the summation (Eq. 14.18). Similar interpretations hold for the cosine and exponential cases.
If we take Eqs. 15.22, 15.24, and 15.26 as the direct integral transforms, described by $\mathcal{L}$ in Eq. 15.10 (Section 15.1), the corresponding inverse transforms, $\mathcal{L}^{-1}$ of Eq. 15.11, are given by Eqs. 15.23, 15.25, and 15.27.
The reader will note that the Fourier cosine transforms and the Fourier sine transforms each involve only positive values (and zero) of the arguments. We use the parity of $f(x)$ to establish the transforms, but once the transforms are established, the behavior of the functions $f$ and $g$ for negative argument is irrelevant. In effect, the transform equations themselves impose a definite parity; even for the Fourier cosine transform and odd for the Fourier sine transform.
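Because the cosine and sine pairs use only the half range $[0, \infty)$, they are convenient to evaluate numerically. The sketch below is illustrative: the two Gaussian-type test functions, the cutoff, and the helper names are assumptions made here. With these conventions $e^{-t^2/2}$ is its own cosine transform and $t\,e^{-t^2/2}$ is its own sine transform, so the expected values are known in closed form.

```python
import math

def simpson(func, a, b, n):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

CUTOFF = 12.0   # assumed stand-in for the infinite upper limit
STEPS = 600     # assumed grid size
ROOT = math.sqrt(2 / math.pi)

def cosine_transform(func, omega):
    """Eq. 15.24 (and, with the roles of t and omega exchanged, Eq. 15.25)."""
    return ROOT * simpson(lambda t: func(t) * math.cos(omega * t),
                          0.0, CUTOFF, STEPS)

def sine_transform(func, omega):
    """Eq. 15.26 (and, with the roles exchanged, Eq. 15.27)."""
    return ROOT * simpson(lambda t: func(t) * math.sin(omega * t),
                          0.0, CUTOFF, STEPS)

def f_even(t):
    return math.exp(-0.5 * t * t)        # assumed even test function

def f_odd(t):
    return t * math.exp(-0.5 * t * t)    # assumed odd test function

w = 1.5
gc = cosine_transform(f_even, w)   # compare with exp(-w**2 / 2)
gs = sine_transform(f_odd, w)      # compare with w * exp(-w**2 / 2)

# Applying the cosine transform twice should return the original function.
x = 0.4
round_trip = cosine_transform(lambda om: cosine_transform(f_even, om), x)
print(gc, gs, round_trip, f_even(x))
```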
EXAMPLE 15.3.1 Finite Wave Train
An important application of the Fourier transform is the resolution of a finite pulse into sinusoidal waves. Imagine that an infinite wave train $\sin\omega_0 t$ is clipped by Kerr cell or saturable dye cell shutters so that we have

$$f(t) = \begin{cases} \sin\omega_0 t, & |t| < \dfrac{N\pi}{\omega_0}, \\[6pt] 0, & |t| > \dfrac{N\pi}{\omega_0}. \end{cases} \tag{15.28}$$

² Note that a factor $-i$ has been absorbed into this $g(\omega)$.

This corresponds to $N$ cycles of our original wave train (Fig. 15.2). Since $f(t)$ is odd, we may use the Fourier sine transform (Eq. 15.26) to obtain

$$g_s(\omega) = \sqrt{\frac{2}{\pi}}\int_0^{N\pi/\omega_0}\sin\omega_0 t\,\sin\omega t\,dt. \tag{15.29}$$

FIG. 15.2 Finite wave train

Integrating, we find our amplitude function

$$g_s(\omega) = \sqrt{\frac{2}{\pi}}\left[\frac{\sin[(\omega_0 - \omega)(N\pi/\omega_0)]}{2(\omega_0 - \omega)} - \frac{\sin[(\omega_0 + \omega)(N\pi/\omega_0)]}{2(\omega_0 + \omega)}\right]. \tag{15.30}$$

It is of some considerable interest to see how $g_s(\omega)$ depends on frequency. For large $\omega_0$ and $\omega \approx \omega_0$ only the first term will be of any importance. It is plotted in Fig. 15.3. This is the amplitude curve for the single slit diffraction pattern. There are zeroes at

$$\frac{\omega_0 - \omega}{\omega_0} = \frac{\Delta\omega}{\omega_0} = \pm\frac{1}{N}, \pm\frac{2}{N}, \text{ and so on.} \tag{15.31}$$

FIG. 15.3 Fourier transform of finite wave train

$g_s(\omega)$ may also be interpreted as a Dirac delta distribution as in Section 8.7. Since the contributions outside the central maximum are small, we may take

$$\Delta\omega = \frac{\omega_0}{N} \tag{15.32}$$

as a good measure of the spread in frequency of our wave pulse. Clearly, if $N$ is large (a long pulse), the frequency spread will be small. On the other hand, if our pulse is clipped short, $N$ small, the frequency distribution will be wider.
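The amplitude function of Eq. 15.30 and the zeros of Eq. 15.31 can be checked against a direct quadrature of Eq. 15.29. In this sketch the values $\omega_0 = 10$ and $N = 5$, the probe frequency, and the function names are arbitrary illustrative choices; the closed form is not evaluated at $\omega = \omega_0$, where the first term must be taken as a limit.

```python
import math

def simpson(func, a, b, n):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def g_s_closed(omega, omega0, n_cycles):
    """Eq. 15.30 (valid for omega != omega0)."""
    t_end = n_cycles * math.pi / omega0
    term1 = math.sin((omega0 - omega) * t_end) / (2 * (omega0 - omega))
    term2 = math.sin((omega0 + omega) * t_end) / (2 * (omega0 + omega))
    return math.sqrt(2 / math.pi) * (term1 - term2)

def g_s_numeric(omega, omega0, n_cycles, steps=20000):
    """Eq. 15.29 evaluated by direct quadrature."""
    t_end = n_cycles * math.pi / omega0
    integral = simpson(lambda t: math.sin(omega0 * t) * math.sin(omega * t),
                       0.0, t_end, steps)
    return math.sqrt(2 / math.pi) * integral

omega0, n_cycles = 10.0, 5
probe = 9.3
closed_val = g_s_closed(probe, omega0, n_cycles)
numeric_val = g_s_numeric(probe, omega0, n_cycles)

first_zero = omega0 * (1 - 1 / n_cycles)    # Eq. 15.31 with +1/N
at_zero = g_s_closed(first_zero, omega0, n_cycles)
print(closed_val, numeric_val, at_zero)
```

Increasing `n_cycles` moves `first_zero` closer to `omega0`, which is the narrowing of the frequency spread described by Eq. 15.32.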
Uncertainty Principle
Here is a classical analog of the famous uncertainty principle of quantum
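mechanics. If we are dealing with electromagnetic waves,

$$\frac{h\omega}{2\pi} = E, \qquad \text{energy (of our wave pulse or photon)}$$

$$\frac{h\,\Delta\omega}{2\pi} = \Delta E, \tag{15.33}$$

$h$ being Planck's constant, which represents an uncertainty in the energy of our pulse. There is also an uncertainty in the time, for our wave of $N$ cycles requires $2N\pi/\omega_0$ seconds to pass. Taking

$$\Delta t = \frac{2N\pi}{\omega_0}, \tag{15.34}$$

we have the product of these two uncertainties:

$$\Delta E\cdot\Delta t = \frac{h\,\Delta\omega}{2\pi}\cdot\frac{2\pi N}{\omega_0} = \frac{h\omega_0}{2\pi N}\cdot\frac{2\pi N}{\omega_0} = h. \tag{15.35}$$

The Heisenberg uncertainty principle actually states

$$\Delta E\cdot\Delta t \geq \frac{h}{4\pi}, \tag{15.36}$$

and this is clearly satisfied in our example.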
EXERCISES
15.3.1 (a) Show that $g(-\omega) = g^{*}(\omega)$ is a necessary and sufficient condition for $f(x)$ to be real.
(b) Show that $g(-\omega) = -g^{*}(\omega)$ is a necessary and sufficient condition for $f(x)$ to be pure imaginary.
Note. The condition of part (a) is used in the development of the dispersion relations of Section 7.3.
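15.3.2 Let $F(\omega)$ be the Fourier (exponential) transform of $f(x)$ and $G(\omega)$ the Fourier transform of $g(x) = f(x + a)$. Show that
$$G(\omega) = e^{-ia\omega}F(\omega).$$
15.3.3 The function
$$f(x) = \begin{cases} 1, & |x| < 1, \\ 0, & |x| > 1 \end{cases}$$
is a symmetrical finite step function.
(a) Find the $g_c(\omega)$, Fourier cosine transform of $f(x)$.
(b) Taking the inverse cosine transform, show that
$$f(x) = \frac{2}{\pi}\int_0^{\infty}\frac{\sin\omega\cos\omega x}{\omega}\,d\omega.$$
(c) From part (b) show that
$$\int_0^{\infty}\frac{\sin\omega\cos\omega x}{\omega}\,d\omega = \begin{cases} 0, & |x| > 1, \\[2pt] \dfrac{\pi}{4}, & |x| = 1, \\[6pt] \dfrac{\pi}{2}, & |x| < 1. \end{cases}$$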
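15.3.4 (a) Show that the Fourier sine and cosine transforms of $e^{-at}$ are
$$g_s(\omega) = \sqrt{\frac{2}{\pi}}\,\frac{\omega}{\omega^2 + a^2}, \qquad g_c(\omega) = \sqrt{\frac{2}{\pi}}\,\frac{a}{\omega^2 + a^2}.$$
Hint. Each of the transforms can be related to the other by integration by parts.
(b) Show that
$$\int_0^{\infty}\frac{\omega\sin\omega x}{\omega^2 + a^2}\,d\omega = \frac{\pi}{2}e^{-ax}, \qquad x > 0,$$
$$\int_0^{\infty}\frac{\cos\omega x}{\omega^2 + a^2}\,d\omega = \frac{\pi}{2a}e^{-ax}, \qquad x > 0.$$
These results may also be obtained by contour integration (Exercise 7.2.14).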
15.3.5 Find the Fourier transform of the triangular pulse
Note. This function provides another delta sequence with $h = a$ and $a \to \infty$.
15.3.6 We may define a sequence
$$\delta_n(x) = \begin{cases} n, & |x| < 1/2n, \\ 0, & |x| > 1/2n. \end{cases}$$
(This is Eq. 8.108.) Express $\delta_n(x)$ as a Fourier integral (via the Fourier integral theorem, inverse transform, etc.). Finally, show that we may write
$$\delta(x) = \lim_{n\to\infty}\delta_n(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-ikx}\,dk.$$
15.3.7 Using the sequence
$$\delta_n(x) = \frac{n}{\sqrt{\pi}}\exp(-n^2x^2),$$
show that
$$\delta(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-ikx}\,dk.$$
Note. Remember that $\delta(x)$ is defined in terms of its behavior as part of an integrand (Section 8.7, especially Eqs. 8.114 and 8.115).
15.3.8 Derive sine and cosine representations of $\delta(t - x)$ that are comparable to the exponential representation, Eq. 15.21d.
ANS. $\displaystyle\frac{2}{\pi}\int_0^{\infty}\sin\omega t\,\sin\omega x\,d\omega, \qquad \frac{2}{\pi}\int_0^{\infty}\cos\omega t\,\cos\omega x\,d\omega.$
15.3.9 In a resonant cavity an electromagnetic oscillation of frequency $\omega_0$ dies out as
$$A(t) = A_0 e^{-\omega_0 t/2Q}e^{-i\omega_0 t}, \qquad t > 0.$$
(Take $A(t) = 0$ for $t < 0$.)
The parameter $Q$ is a measure of the ratio of stored energy to energy loss per cycle. Calculate the frequency distribution of the oscillation, $a^{*}(\omega)a(\omega)$, where $a(\omega)$ is the Fourier transform of $A(t)$.
Note. The larger $Q$ is, the sharper your resonance line will be.
ANS. $\displaystyle a^{*}(\omega)a(\omega) = \frac{A_0^2}{2\pi}\,\frac{1}{(\omega - \omega_0)^2 + (\omega_0/2Q)^2}.$
15.3.10 Prove that
$$\frac{\hbar}{2\pi i}\int_{-\infty}^{\infty}\frac{e^{-i\omega t}\,d\omega}{E_0 - i\Gamma/2 - \hbar\omega} = \begin{cases} \exp(-\Gamma t/2\hbar)\exp(-iE_0 t/\hbar), & t > 0, \\ 0, & t < 0. \end{cases}$$
This Fourier integral appears in a variety of problems in quantum mechanics: WKB barrier penetration, scattering, time-dependent perturbation theory, and so on.
Hint. Try contour integration.
15.3.11 Verify that the following are Fourier integral transforms of one another:
(a) $\sqrt{2/\pi}\,(a^2 - x^2)^{-1/2}$ for $|x| < a$, $0$ for $|x| > a$, and $J_0(ay)$,
(b) $0$ for $|x| < a$, $-\sqrt{2/\pi}\,(x^2 - a^2)^{-1/2}$ for $|x| > a$, and $N_0(a|y|)$,
(c) $\sqrt{\pi/2}\,(x^2 + a^2)^{-1/2}$ and $K_0(a|y|)$.
(d) Can you suggest why $I_0(ay)$ is not included in this list?
Hint. $J_0$, $N_0$, and $K_0$ may be transformed most easily by using an exponential representation, reversing the order of integration, and employing the Dirac delta function exponential representation (Section 15.2). These cases can be treated equally well as Fourier cosine transforms.
Note. The $K_0$ relation appears as a consequence of a Green's function equation in Exercise 16.6.14.
15.3.12 A calculation of the magnetic field of a circular current loop in circular cylindrical coordinates leads to the integral
$$\int_0^{\infty}\cos kz\;k\,K_1(ka)\,dk.$$
Show that this integral is equal to
$$\frac{\pi a}{2(z^2 + a^2)^{3/2}}.$$
Hint. Try differentiating Exercise 15.3.11(c).
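15.3.13 As an extension of Exercise 15.3.11, show that
(a) $\displaystyle\int_0^{\infty} J_0(y)\,dy = 1,$
(b) $\displaystyle\int_0^{\infty} N_0(y)\,dy = 0,$
(c) $\displaystyle\int_0^{\infty} K_0(y)\,dy = \frac{\pi}{2}.$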
15.3.14 The Fourier integral, Eq. 15.18, has been held meaningless for $f(t) = \cos\alpha t$. Show that the Fourier integral can be extended to cover $f(t) = \cos\alpha t$ by use of the Dirac delta function.
15.3.15 Show that
$$\int_0^{\infty}\sin ka\,J_0(k\rho)\,dk = \begin{cases} (a^2 - \rho^2)^{-1/2}, & \rho < a, \\ 0, & \rho > a. \end{cases}$$
Here $a$ and $\rho$ are positive. The equation comes from the determination of the distribution of charge on an isolated conducting disk, radius $a$.
Note that the function on the right has an infinite discontinuity at $\rho = a$.
Note. A Laplace transform approach appears in Exercise 15.10.8.
15.3.16 The function $f(\mathbf{r})$ has a Fourier exponential transform
Determine $f(\mathbf{r})$.
Hint. Use spherical polar coordinates in $k$-space.
ANS. $f(\mathbf{r}) =$
15.3.17 (a) Calculate the Fourier exponential transform of $f(x) = e^{-a|x|}$.
(b) Calculate the inverse transform by employing the calculus of residues (Section 7.2).
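15.3.18 Show that the following are Fourier transforms of each other:
$$i^n J_n(t) \quad\text{and}\quad \begin{cases} \sqrt{\dfrac{2}{\pi}}\,T_n(x)(1 - x^2)^{-1/2}, & |x| < 1, \\[6pt] 0, & |x| > 1. \end{cases}$$
$T_n(x)$ is the $n$th-order Chebyshev polynomial.
Hint. With $T_n(\cos\theta) = \cos n\theta$, the transform of $T_n(x)(1 - x^2)^{-1/2}$ leads to an integral representation of $J_n(t)$.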
15.3.19 Show that the Fourier exponential transform of
$$f(\mu) = \begin{cases} P_n(\mu), & |\mu| \leq 1, \\ 0, & |\mu| > 1 \end{cases}$$
is $(2i^n/\sqrt{2\pi})\,j_n(kr)$. Here $P_n(\mu)$ is a Legendre polynomial and $j_n(kr)$ is a spherical Bessel function.
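15.3.20 Show that the three-dimensional Fourier exponential transform of a radially symmetric function may be rewritten as a Fourier sine transform:
$$\frac{1}{(2\pi)^{3/2}}\int f(r)e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3x = \frac{1}{k}\sqrt{\frac{2}{\pi}}\int_0^{\infty}[r f(r)]\sin kr\,dr.$$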
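15.3.21 (a) Show that $f(x) = x^{-1/2}$ is a self-reciprocal under both Fourier cosine and sine transforms; that is,
$$\sqrt{\frac{2}{\pi}}\int_0^{\infty} x^{-1/2}\cos xt\,dx = t^{-1/2},$$
$$\sqrt{\frac{2}{\pi}}\int_0^{\infty} x^{-1/2}\sin xt\,dx = t^{-1/2}.$$
(b) Use the preceding results to evaluate the Fresnel integrals $\int_0^{\infty}\cos(y^2)\,dy$ and $\int_0^{\infty}\sin(y^2)\,dy$.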
15.4 FOURIER TRANSFORM OF DERIVATIVES
In Section 15.1 Fig. 15.1 outlines the overall technique of using Fourier
transforms and inverse transforms to solve a problem. Here we take an initial
step in solving a differential equation—obtaining the Fourier transform of a
derivative.
Using the exponential form, we determine that the Fourier transform of $f(x)$ is

$$g(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(x)e^{i\omega x}\,dx \tag{15.37}$$

and for $df(x)/dx$

$$g_1(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\frac{df(x)}{dx}e^{i\omega x}\,dx. \tag{15.38}$$

Integrating Eq. 15.38 by parts, we obtain

$$g_1(\omega) = \frac{e^{i\omega x}}{\sqrt{2\pi}}f(x)\bigg|_{-\infty}^{\infty} - \frac{i\omega}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(x)e^{i\omega x}\,dx. \tag{15.39}$$

If $f(x)$ vanishes¹ as $x \to \pm\infty$, we have

$$g_1(\omega) = -i\omega\,g(\omega); \tag{15.40}$$

that is, the transform of the derivative is $(-i\omega)$ times the transform of the original function. This may readily be generalized to the $n$th derivative to yield

$$g_n(\omega) = (-i\omega)^n g(\omega), \tag{15.41}$$

provided all the integrated parts vanish as $x \to \pm\infty$. This is the power of the Fourier transform, the reason it is so useful in solving (partial) differential equations. The operation of differentiation has been replaced by a multiplication.
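The rule of Eq. 15.40 can be confirmed numerically for a function whose derivative is known. The following sketch is illustrative only; the Gaussian test function, the finite integration window, the value of $\omega$, and the helper names are assumptions made here.

```python
import cmath
import math

def simpson(func, a, b, n):
    """Composite Simpson rule on [a, b]; n must be even. Complex values are fine."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def f(x):
    return math.exp(-0.5 * x * x)        # assumed test function

def df(x):
    return -x * math.exp(-0.5 * x * x)   # its derivative, df/dx

def fourier(func, omega, half_width=12.0, steps=2400):
    """Eq. 15.37 with the infinite limits replaced by an assumed finite window."""
    integral = simpson(lambda x: func(x) * cmath.exp(1j * omega * x),
                       -half_width, half_width, steps)
    return integral / math.sqrt(2 * math.pi)

omega = 1.7
g = fourier(f, omega)       # transform of f
g1 = fourier(df, omega)     # transform of df/dx
residual = abs(g1 - (-1j * omega) * g)   # Eq. 15.40 predicts zero
print(g, g1, residual)
```

The test function vanishes at both ends of the window, so the integrated term of Eq. 15.39 drops out, as the derivation requires.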
EXAMPLE 15.4.1 Wave Equation
This technique may be used to advantage in handling partial differential
equations. To illustrate the technique let us derive a familiar expression of
elementary physics. An infinitely long string is vibrating freely. The amplitude
$y$ of the (small) vibrations satisfies the wave equation

$$\frac{\partial^2 y}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2 y}{\partial t^2}. \tag{15.42}$$

We shall assume an initial condition

$$y(x, 0) = f(x). \tag{15.43}$$

Applying our Fourier transform, which means multiplying by $e^{i\alpha x}$ and integrating over $x$, we obtain

$$\int_{-\infty}^{\infty}\frac{\partial^2 y(x, t)}{\partial x^2}e^{i\alpha x}\,dx = \frac{1}{v^2}\int_{-\infty}^{\infty}\frac{\partial^2 y(x, t)}{\partial t^2}e^{i\alpha x}\,dx \tag{15.44}$$

or

$$(-i\alpha)^2\,Y(\alpha, t) = \frac{1}{v^2}\frac{\partial^2 Y(\alpha, t)}{\partial t^2}. \tag{15.45}$$

Here we have used

$$Y(\alpha, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} y(x, t)e^{i\alpha x}\,dx \tag{15.46}$$

and Eq. 15.41 for the second derivative. Note that the integrated part of Eq. 15.39 vanishes. The wave has not yet gone to $\pm\infty$. Since no derivatives with respect to $\alpha$ appear, Eq. 15.45 is actually an ordinary differential equation; in fact, the linear oscillator equation. This transformation, from a partial to an ordinary differential equation, is a significant achievement. We solve Eq. 15.45 subject to the appropriate initial conditions. At $t = 0$, applying Eq. 15.43, Eq. 15.46 reduces to

$$Y(\alpha, 0) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(x)e^{i\alpha x}\,dx = F(\alpha). \tag{15.47}$$

¹ Apart from cases such as Exercise 15.3.6, $f(x)$ must vanish as $x \to \pm\infty$ in order for the Fourier transform of $f(x)$ to exist.

The general solution of Eq. 15.45 in exponential form is

$$Y(\alpha, t) = F(\alpha)e^{\pm iv\alpha t}. \tag{15.48}$$

Using the inversion formula (Eq. 15.23), we have

$$y(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} Y(\alpha, t)e^{-i\alpha x}\,d\alpha \tag{15.49}$$

and, by Eq. 15.48,

$$y(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} F(\alpha)e^{-i\alpha(x \mp vt)}\,d\alpha. \tag{15.50}$$

Since $f(x)$ is the Fourier inverse transform of $F(\alpha)$,

$$y(x, t) = f(x \mp vt), \tag{15.51}$$

corresponding to waves advancing in the $+x$- and $-x$-directions, respectively.
The particular linear combination of waves is given by the boundary condition of Eq. 15.43 and some other boundary condition such as a restriction on $\partial y/\partial t$.
The accomplishment of the Fourier transform here deserves special emphasis.
Our Fourier transform converted a partial differential equation into an ordinary
differential equation, where the "degree of transcendence" of the problem was
reduced. In Section 15.9 Laplace transforms are used to convert ordinary
differential equations (with constant coefficients) into algebraic equations.
Again, the degree of transcendence is reduced. The problem is simplified—as
outlined in Fig. 15.1.
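As a cross-check on the result of Example 15.4.1 (a short worked verification added here, not part of the original derivation), the traveling-wave form can be substituted back into the wave equation with the chain rule. The last line is an illustrative special case: it assumes the extra condition $\partial y/\partial t = 0$ at $t = 0$, which is one possible choice for the "other boundary condition" mentioned in the example.

```latex
\begin{align*}
y(x,t) &= f(u), \qquad u = x \mp vt, \\
\frac{\partial^{2} y}{\partial x^{2}} &= f''(u), \qquad
\frac{\partial^{2} y}{\partial t^{2}} = (\mp v)^{2} f''(u) = v^{2} f''(u), \\
\frac{\partial^{2} y}{\partial x^{2}} - \frac{1}{v^{2}}\frac{\partial^{2} y}{\partial t^{2}}
  &= f''(u) - f''(u) = 0, \\[4pt]
\text{assuming } \left.\frac{\partial y}{\partial t}\right|_{t=0} = 0:\qquad
y(x,t) &= \tfrac{1}{2}\bigl[f(x - vt) + f(x + vt)\bigr],
\quad y(x,0) = f(x).
\end{align*}
```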
EXERCISES
15.4.1 The one-dimensional Fermi age equation for the diffusion of neutrons slowing down in some medium (such as graphite) is
$$\frac{\partial^2 q(x, \tau)}{\partial x^2} = \frac{\partial q(x, \tau)}{\partial\tau}.$$
Here $q$ is the number of neutrons that slow down, falling below some given energy per second per unit volume. The Fermi age, $\tau$, is a measure of the energy loss.
If $q(x, 0) = S\delta(x)$, corresponding to a plane source of neutrons at $x = 0$, emitting $S$ neutrons per unit area per second, derive the solution
$$q = S\,\frac{e^{-x^2/4\tau}}{\sqrt{4\pi\tau}}.$$
Hint. Replace $q(x, \tau)$ with
$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} q(x, \tau)e^{ikx}\,dx.$$
This is analogous to the diffusion of heat in an infinite medium.
15.4.2 Equation 15.41 yields
$$g_2(\omega) = -\omega^2 g(\omega)$$
for the Fourier transform of the second derivative of $f(x)$. The condition $f(x) \to 0$ for $x \to \pm\infty$ may be relaxed slightly. Find the least restrictive condition for the preceding equation for $g_2(\omega)$ to hold.
ANS. $\displaystyle\left[\frac{df(x)}{dx} - i\omega f(x)\right]e^{i\omega x}\bigg|_{-\infty}^{\infty} = 0.$
15.4.3 The one-dimensional neutron diffusion equation with a (plane) source is
$$-D\frac{d^2\varphi(x)}{dx^2} + K^2 D\varphi(x) = Q\delta(x),$$
where $\varphi(x)$ is the neutron flux, $Q\delta(x)$ is the (plane) source at $x = 0$, and $D$ and $K^2$ are constants. Apply a Fourier transform. Solve the equation in transform space. Transform your solution back into $x$-space.
ANS. $\displaystyle\varphi(x) = \frac{Q}{2KD}e^{-|Kx|}.$
15.4.3A For a point source at the origin the three-dimensional neutron diffusion equation becomes
$$-D\nabla^2\varphi(\mathbf{r}) + K^2 D\varphi(\mathbf{r}) = Q\delta(\mathbf{r}).$$
Apply a three-dimensional Fourier transform. Solve the transformed equation. Transform the solution back into $\mathbf{r}$-space.
15.4.4 (a) Given that $F(\mathbf{k})$ is the three-dimensional Fourier transform of $f(\mathbf{r})$ and $\mathbf{F}_1(\mathbf{k})$ is the three-dimensional Fourier transform of $\nabla f(\mathbf{r})$, show that
$$\mathbf{F}_1(\mathbf{k}) = (-i\mathbf{k})F(\mathbf{k}).$$
This is a three-dimensional generalization of Eq. 15.40.
(b) Show that the three-dimensional Fourier transform of $\nabla\cdot\nabla f(\mathbf{r})$ is
$$F_2(\mathbf{k}) = (-i\mathbf{k})^2 F(\mathbf{k}).$$
Note. Vector $\mathbf{k}$ is not the unit vector along the $z$-axis. It is a vector in the transform space. In Section 15.6 we shall have $\hbar\mathbf{k} = \mathbf{p}$, linear momentum.
15.5 CONVOLUTION THEOREM
We shall employ convolutions to solve differential equations, to normalize
momentum wave functions (Section 15.6), and to investigate transfer functions
(Section 15.7).
FIG. 15.4
Let us consider two functions $f(x)$ and $g(x)$ with Fourier transforms $F(t)$ and
$G(t)$, respectively. We define the operation
$$f * g = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} g(y)\,f(x-y)\,dy \tag{15.52}$$
as the convolution of the two functions $f$ and $g$ over the interval $(-\infty, \infty)$. This
form of an integral appears in probability theory in the determination of the
probability density of two random, independent variables. Our solution of
Poisson's equation, Eq. 8.99, may be interpreted as a convolution of a charge
distribution, $\rho(\mathbf{r}_2)$, and a weighting function, $(4\pi\varepsilon_0|\mathbf{r}_1 - \mathbf{r}_2|)^{-1}$. In other works
this is sometimes referred to as the Faltung, to use the German term for
"folding."¹ We now transform the integral in Eq. 15.52 by introducing the
Fourier transforms:
$$\begin{aligned}
\int_{-\infty}^{\infty} g(y)\,f(x-y)\,dy
&= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} g(y)\int_{-\infty}^{\infty} F(t)\,e^{-it(x-y)}\,dt\,dy \\
&= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} F(t)\left[\int_{-\infty}^{\infty} g(y)\,e^{ity}\,dy\right]e^{-itx}\,dt \\
&= \int_{-\infty}^{\infty} F(t)\,G(t)\,e^{-itx}\,dt,
\end{aligned} \tag{15.53}$$
interchanging the order of integration and transforming $g(y)$. This result may be
interpreted as follows: The Fourier inverse transform of a product of Fourier
transforms is the convolution of the original functions, $f * g$.
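As a rough numerical illustration of Eq. 15.53, the sketch below evaluates both sides by trapezoidal quadrature. The two Gaussian test functions, the integration limits, and all function names are assumptions chosen for this example; they do not come from the text. The transform convention is the one used in this chapter, $F(t) = (2\pi)^{-1/2}\int f(x)e^{itx}\,dx$.

```python
import cmath
import math


def trapezoid(func, a, b, n):
    """Composite trapezoidal rule; works for real or complex integrands."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h


def fourier(func, t, half_width=10.0, n=400):
    """F(t) = (2*pi)**-0.5 * integral of func(x) * exp(+i*t*x) dx (truncated)."""
    integral = trapezoid(lambda x: func(x) * cmath.exp(1j * t * x),
                         -half_width, half_width, n)
    return integral / math.sqrt(2.0 * math.pi)


def f(x):
    # Illustrative test function (an assumption of this sketch).
    return math.exp(-0.5 * x * x)


def g(x):
    # Second illustrative test function (also an assumption).
    return math.exp(-x * x)


def convolution_side(x):
    """Left side of Eq. 15.53: integral of g(y) f(x - y) dy."""
    return trapezoid(lambda y: g(y) * f(x - y), -10.0, 10.0, 400)


def transform_side(x):
    """Right side of Eq. 15.53: integral of F(t) G(t) exp(-i*t*x) dt."""
    def integrand(t):
        return fourier(f, t) * fourier(g, t) * cmath.exp(-1j * t * x)
    return trapezoid(integrand, -10.0, 10.0, 200)


x0 = 1.0
print(convolution_side(x0), transform_side(x0))
```

For these Gaussians the convolution integral can also be done in closed form, $\sqrt{\pi/1.5}\,e^{-x^2/3}$, which gives a third, independent value to compare against.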
For the special case $x = 0$ we have
$$\int_{-\infty}^{\infty} F(t)\,G(t)\,dt = \int_{-\infty}^{\infty} f(-y)\,g(y)\,dy. \tag{15.54}$$
The minus sign in $-y$ suggests that modifications be tried. We now do this with
$g^*$ instead of $g$ using a different technique.
¹ For $f(y) = e^{-y}$, $f(y)$ and $f(x - y)$ are plotted in Fig. 15.4. Clearly, $f(y)$ and
$f(x - y)$ are mirror images of each other in relation to the vertical line $y = x/2$,
that is, we could generate $f(x - y)$ by folding over $f(y)$ on the line $y = x/2$.
Parseval's Relation
Results analogous to Eqs. 15.53 and 15.54 may be derived for the Fourier sine
and cosine transforms (Exercises 15.5.1 and 15.5.2). Equation 15.54 and the
corresponding sine and cosine convolutions are often labeled "Parseval's relations"
by analogy with Parseval's theorem for Fourier series (Chapter 14, Exercise
14.4.2).
The Parseval relation²,³
$$\int_{-\infty}^{\infty} F(\omega)\,G^*(\omega)\,d\omega = \int_{-\infty}^{\infty} f(t)\,g^*(t)\,dt, \tag{15.55}$$
may be derived very beautifully using the Dirac delta function representation,
Eq. 15.2W. We have
$$\int_{-\infty}^{\infty} f(t)\,g^*(t)\,dt = \int_{-\infty}^{\infty}\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} F(\omega)\,e^{-i\omega t}\,d\omega\right]\cdot\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} G^*(x)\,e^{ixt}\,dx\right]dt, \tag{15.56}$$
with attention to the complex conjugation in the $G^*(x)$ to $g^*(t)$ transform.
Integrating over $t$ first, and using Eq. 15.2W, we obtain
$$\int_{-\infty}^{\infty} f(t)\,g^*(t)\,dt = \int_{-\infty}^{\infty} F(\omega)\int_{-\infty}^{\infty} G^*(x)\,\delta(x-\omega)\,dx\,d\omega = \int_{-\infty}^{\infty} F(\omega)\,G^*(\omega)\,d\omega, \tag{15.57}$$
our desired Parseval relation. If $f(t) = g(t)$, then the integrals in the Parseval
relation are normalization integrals (Section 9.4). Equation 15.57 guarantees
that if a function $f(t)$ is normalized to unity, its transform $F(\omega)$ is likewise
normalized to unity. This is extremely important in quantum mechanics as
developed in the next section.
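The Parseval relation, Eq. 15.55, and the normalization statement that follows from it can be checked numerically. The sketch below is only an illustration: the test functions (a normalized Gaussian with a phase factor, and a shifted Gaussian), the truncation limits, and the function names are assumptions made here.

```python
import cmath
import math


def trapezoid(func, a, b, n):
    """Composite trapezoidal rule; works for real or complex integrands."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h


def transform(func, omega, half_width=12.0, n=480):
    """F(omega) = (2*pi)**-0.5 * integral of func(t) * exp(+i*omega*t) dt."""
    integral = trapezoid(lambda t: func(t) * cmath.exp(1j * omega * t),
                         -half_width, half_width, n)
    return integral / math.sqrt(2.0 * math.pi)


def f(t):
    # Normalized to unity: the integral of |f|^2 dt equals 1 (assumed test function).
    return math.pi ** -0.25 * cmath.exp(-0.5 * t * t + 3j * t)


def g(t):
    # Shifted Gaussian, returned as a complex number (assumed test function).
    return cmath.exp(-(t - 1.0) ** 2 + 0j)


def time_side():
    """Right side of Eq. 15.55: integral of f(t) g*(t) dt."""
    return trapezoid(lambda t: f(t) * g(t).conjugate(), -12.0, 12.0, 480)


def frequency_side():
    """Left side of Eq. 15.55: integral of F(omega) G*(omega) d(omega)."""
    return trapezoid(
        lambda w: transform(f, w) * transform(g, w).conjugate(),
        -12.0, 12.0, 240)


def norm_in_frequency():
    """Integral of |F(omega)|^2 d(omega); should equal 1 if f is normalized."""
    return trapezoid(lambda w: abs(transform(f, w)) ** 2, -12.0, 12.0, 240)


print(time_side(), frequency_side(), norm_in_frequency())
```

Because $g$ carries no phase while $f$ does, the two sides are genuinely complex, so the check also exercises the complex conjugation in Eq. 15.55.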
It may be shown that the Fourier transform is a unitary operation (in the
Hilbert space $L^2$, square integrable functions). The Parseval relation is a
reflection of this unitary property, analogous to Exercise 4.5.26 for matrices.
In Fraunhofer diffraction optics the diffraction pattern (amplitude) appears
as the transform of the function describing the aperture (compare Exercise
15.5.5). With intensity proportional to the square of the amplitude the Parseval
relation implies that the energy passing through the aperture seems to be
somewhere in the diffraction pattern, a statement of the conservation of energy.
Parseval's relations may be developed independently of the inverse Fourier
transform and then used rigorously to derive the inverse transform. Details are
given by Morse and Feshbach,⁴ Section 4.8 (see also Exercise 15.5.4).
² Note that all arguments are positive in contrast to Eq. 15.54.
³ Some authors prefer to restrict Parseval's name to series and refer to Eq.
15.55 as Rayleigh's theorem.
⁴ P. M. Morse and H. Feshbach, Methods of Theoretical Physics. New York:
McGraw-Hill (1953).
EXERCISES
15.5.1 Work out the convolution equation corresponding to Eq. 15.53 for
(a) Fourier sine transforms
$$\frac{1}{2}\int_{-\infty}^{\infty} g(y)\,f(x-y)\,dy = -\int_0^{\infty} F_s(s)\,G_s(s)\cos sx\,ds,$$
where $f$ and $g$ are odd functions.
(b) Fourier cosine transforms
$$\frac{1}{2}\int_{-\infty}^{\infty} g(y)\,f(x-y)\,dy = \int_0^{\infty} F_c(s)\,G_c(s)\cos sx\,ds,$$
where $f$ and $g$ are even functions.
15.5.2 $F(\rho)$ and $G(\rho)$ are the Hankel transforms of $f(r)$ and $g(r)$, respectively (Exercise
15.1.1). Derive the Hankel transform Parseval relation:
$$\int_0^{\infty} F^*(\rho)\,G(\rho)\,\rho\,d\rho = \int_0^{\infty} f^*(r)\,g(r)\,r\,dr.$$
15.5.3 Show that for both Fourier sine and Fourier cosine transforms Parseval's relation
has the form
$$\int_0^{\infty} F(t)\,G(t)\,dt = \int_0^{\infty} f(y)\,g(y)\,dy.$$
15.5.4 Starting from Parseval's relation (Eq. 15.54), let $g(y) = 1$, $0 < y < a$, and zero
elsewhere. From this derive the Fourier inverse transform (Eq. 15.23).
Hint. Differentiate with respect to $a$.
15.5.5 (a) A rectangular pulse is described by
$$f(x) = \begin{cases} 1, & |x| < a, \\ 0, & |x| > a. \end{cases}$$
Show that the Fourier exponential transform is
$$F(t) = \sqrt{\frac{2}{\pi}}\,\frac{\sin at}{t}.$$
Here is the single slit diffraction problem of physical optics. The slit is
described by $f(x)$. The diffraction pattern amplitude is given by the Fourier
transform $F(t)$.
(b) Use the Parseval relation to evaluate
$$\int_{-\infty}^{\infty} \frac{\sin^2 t}{t^2}\,dt.$$
This integral may also be evaluated by using the calculus of residues,
Exercise 7.2.12.
ANS. (b) $\pi$.
15.5.6 Solve Poisson's equation $\nabla^2\psi(\mathbf{r}) = -\rho(\mathbf{r})/\varepsilon_0$ by the following sequence of
operations:
(a) Take the Fourier transform of both sides of this equation. Solve for the
Fourier transform of $\psi(\mathbf{r})$.
(b) Carry out the Fourier inverse transform by using a three-dimensional
analog of the convolution theorem, Eq. 15.53.
15.5.7 (a) Given $f(x) = 1 - |x/2|$, $-2 \le x \le 2$, and zero elsewhere, show that the
Fourier transform of $f(x)$ is
$$F(t) = \sqrt{\frac{2}{\pi}}\left(\frac{\sin t}{t}\right)^2.$$
(b) Using the Parseval relation, evaluate
$$\int_{-\infty}^{\infty}\left(\frac{\sin t}{t}\right)^4 dt.$$
ANS. (b) $\dfrac{2\pi}{3}$.
15.5.8 With $F(t)$ and $G(t)$ the Fourier transforms of $f(x)$ and $g(x)$, respectively, show that
$$\int_{-\infty}^{\infty} |f(x) - g(x)|^2\,dx = \int_{-\infty}^{\infty} |F(t) - G(t)|^2\,dt.$$
If $g(x)$ is an approximation to $f(x)$, the preceding relation indicates that the
mean square deviation in t-space is equal to the mean square deviation in x-space.
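15.5.9 Use the Parseval relation to evaluate
(a) $\displaystyle\int_0^{\infty} \frac{d\omega}{(\omega^2 + a^2)^2}$,
(b) $\displaystyle\int_0^{\infty} \frac{\omega^2\,d\omega}{(\omega^2 + a^2)^2}$.
Hint. Compare Exercise 15.3.4.
ANS. (a) $\dfrac{\pi}{4a^3}$, (b) $\dfrac{\pi}{4a}$.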
15.6 MOMENTUM REPRESENTATION
In advanced dynamics and in quantum mechanics linear momentum and
spatial position occur on an equal footing. In this section we shall start with the
usual space distribution and derive the corresponding momentum distribution.
For the one-dimensional case our wave function $\psi(x)$, a solution of the
Schrödinger wave equation, has the following properties:
1. $\psi^*(x)\psi(x)\,dx$ is the probability of finding the quantum particle between $x$ and $x + dx$ and
2. $$\int_{-\infty}^{\infty} \psi^*(x)\,\psi(x)\,dx = 1, \tag{15.58}$$
corresponding to one particle (along the x-axis).
In addition, we have
3. $$\langle x\rangle = \int_{-\infty}^{\infty} \psi^*(x)\,x\,\psi(x)\,dx \tag{15.59}$$
for the average position of the particle along the x-axis. This is often called an expectation value.
We want a function $g(p)$ that will give the same information about the
momentum.
1. $g^*(p)g(p)\,dp$ is the probability that our quantum
particle has a momentum between $p$ and $p + dp$.
2. $$\int_{-\infty}^{\infty} g^*(p)\,g(p)\,dp = 1. \tag{15.60}$$
3. $$\langle p\rangle = \int_{-\infty}^{\infty} g^*(p)\,p\,g(p)\,dp. \tag{15.61}$$
As subsequently shown, such a function is given by the Fourier transform of our
space function $\psi(x)$. Specifically,¹
$$g(p) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} \psi(x)\,e^{-ipx/\hbar}\,dx, \tag{15.62}$$
$$g^*(p) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} \psi^*(x)\,e^{ipx/\hbar}\,dx. \tag{15.63}$$
The corresponding three-dimensional momentum function is
$$g(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}}\int \psi(\mathbf{r})\,e^{-i\mathbf{r}\cdot\mathbf{p}/\hbar}\,d^3r.$$
To verify Eqs. 15.62 and 15.63, let us check on properties 2 and 3.
Property 2, the normalization, is automatically satisfied as a Parseval
relation, Eq. 15.55. If the space function $\psi(x)$ is normalized to unity, the momentum
function $g(p)$ is also normalized to unity.
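To check on property 3, we must show that
$$\langle p\rangle = \int_{-\infty}^{\infty} g^*(p)\,p\,g(p)\,dp = \int_{-\infty}^{\infty} \psi^*(x)\,\frac{\hbar}{i}\frac{d}{dx}\psi(x)\,dx, \tag{15.64}$$
where $(\hbar/i)(d/dx)$ is the momentum operator in the space representation. We
replace the momentum functions by Fourier transformed space functions, and
the first integral becomes
$$\frac{1}{2\pi\hbar}\iiint_{-\infty}^{\infty} p\,e^{-ip(x-x')/\hbar}\,\psi^*(x')\,\psi(x)\,dp\,dx'\,dx. \tag{15.65}$$
Now
$$p\,e^{-ip(x-x')/\hbar} = \frac{d}{dx}\left[-\frac{\hbar}{i}\,e^{-ip(x-x')/\hbar}\right]. \tag{15.66}$$
Substituting into Eq. 15.65 and integrating by parts, holding $x'$ and $p$ constant,
we obtain
$$\langle p\rangle = \iint_{-\infty}^{\infty}\left[\frac{1}{2\pi\hbar}\int_{-\infty}^{\infty} e^{-ip(x-x')/\hbar}\,dp\right]\psi^*(x')\,\frac{\hbar}{i}\frac{d}{dx}\psi(x)\,dx'\,dx. \tag{15.67}$$
Here we assume $\psi(x)$ vanishes as $x \to \pm\infty$, eliminating the integrated part.
Again using the Dirac delta function, Eq. 15.21c, Eq. 15.67 reduces to Eq. 15.64
to verify our momentum representation. The reader will note that technically we
have employed the inverse Fourier transform in Eq. 15.62. This was chosen
deliberately to yield the proper sign in Eq. 15.67.
¹ The $\hbar$ may be avoided by using the wave number $k$, $p = k\hbar$ (and $\mathbf{p} = \mathbf{k}\hbar$),
so that
$$\varphi(k) = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} \psi(x)\,e^{-ikx}\,dx.$$
An example of this notation appears in Section 16.1.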
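Property 3 can be illustrated numerically for one concrete wave function. The sketch below assumes a normalized Gaussian packet $\psi(x) \propto e^{-x^2/2a^2}e^{ip_0x/\hbar}$, units with $\hbar = 1$, and the momentum function in the form $g(p) = (2\pi\hbar)^{-1/2}\int\psi(x)e^{-ipx/\hbar}\,dx$; the packet, the constants, and the function names are all choices made for this example. It evaluates $\langle p\rangle$ once in momentum space and once with the operator $(\hbar/i)(d/dx)$ in position space.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)
A = 1.0     # width of the illustrative packet
P0 = 1.5    # mean momentum built into the packet


def trapezoid(func, a, b, n):
    """Composite trapezoidal rule; works for real or complex integrands."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h


def psi(x):
    """Normalized Gaussian packet carrying the phase exp(i*P0*x/hbar)."""
    norm = math.pi ** -0.25 / math.sqrt(A)
    return norm * cmath.exp(-0.5 * (x / A) ** 2 + 1j * P0 * x / HBAR)


def dpsi_dx(x):
    """Analytic derivative of psi with respect to x."""
    return (-x / A ** 2 + 1j * P0 / HBAR) * psi(x)


def g(p):
    """Momentum function: (2*pi*hbar)**-0.5 * integral psi(x) exp(-i*p*x/hbar) dx."""
    integral = trapezoid(lambda x: psi(x) * cmath.exp(-1j * p * x / HBAR),
                         -12.0, 12.0, 480)
    return integral / math.sqrt(2.0 * math.pi * HBAR)


def mean_p_momentum_space():
    """<p> as the integral of g*(p) p g(p) dp."""
    return trapezoid(lambda p: p * abs(g(p)) ** 2, P0 - 8.0, P0 + 8.0, 160)


def mean_p_position_space():
    """<p> as the integral of psi*(x) (hbar/i) d(psi)/dx dx."""
    value = trapezoid(lambda x: psi(x).conjugate() * (HBAR / 1j) * dpsi_dx(x),
                      -12.0, 12.0, 480)
    return value.real


print(mean_p_momentum_space(), mean_p_position_space())
```

Both numbers should reproduce the built-in value `P0`, which is what Eq. 15.64 asserts for this packet.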
EXAMPLE 15.6.1 Hydrogen Atom
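The hydrogen atom ground state² may be described by the spatial wave
function
$$\psi(\mathbf{r}) = \left(\frac{1}{\pi a_0^3}\right)^{1/2} e^{-r/a_0}, \tag{15.68}$$
$a_0$ being the Bohr radius, $\hbar^2/me^2$. We now have a three-dimensional wave
function. The transform corresponding to Eq. 15.62 is
$$g(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}}\int \psi(\mathbf{r})\,e^{-i\mathbf{p}\cdot\mathbf{r}/\hbar}\,d^3r. \tag{15.69}$$
Substituting Eq. 15.68 into Eq. 15.69 and using
$$\int e^{-ar + i\mathbf{b}\cdot\mathbf{r}}\,d^3r = \frac{8\pi a}{(a^2 + b^2)^2}, \tag{15.70}$$
we obtain the hydrogenic momentum wave function
$$g(\mathbf{p}) = \frac{2^{3/2}}{\pi}\,\frac{a_0^{3/2}\hbar^{5/2}}{(a_0^2 p^2 + \hbar^2)^2}. \tag{15.71}$$
Such momentum functions have been found useful in problems like Compton
scattering from atomic electrons, the wavelength distribution of the scattered
radiation, depending on the momentum distribution of the target electrons.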
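A quick numerical check of the normalization of the hydrogenic momentum function is sketched below. It assumes the form $g(p) = (2^{3/2}/\pi)\,a_0^{3/2}\hbar^{5/2}/(a_0^2p^2 + \hbar^2)^2$ and integrates $4\pi p^2 g^2$ over $0 \le p < \infty$ with the substitution $p = (\hbar/a_0)\tan\theta$ and a midpoint rule. The substitution, the default parameter values, and the function names are choices made for this illustration.

```python
import math


def g(p, a0=1.0, hbar=1.0):
    """Hydrogen ground-state momentum function in the assumed form above."""
    return (2.0 ** 1.5 / math.pi) * a0 ** 1.5 * hbar ** 2.5 / (
        a0 ** 2 * p ** 2 + hbar ** 2) ** 2


def normalization(a0=1.0, hbar=1.0, n=2000):
    """4*pi * integral_0^inf p^2 g(p)^2 dp, using p = (hbar/a0) * tan(theta)."""
    scale = hbar / a0
    h = 0.5 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h          # midpoints avoid theta = pi/2 exactly
        p = scale * math.tan(theta)
        dp_dtheta = scale / math.cos(theta) ** 2
        total += p * p * g(p, a0, hbar) ** 2 * dp_dtheta
    return 4.0 * math.pi * total * h


print(normalization(), normalization(a0=0.529, hbar=1.0))
```

The result should be unity for any positive `a0` and `hbar`, as required by the Parseval relation when the space wave function is normalized.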
² See E. V. Ivash, "A momentum representation treatment of the hydrogen
atom problem," Am. J. Phys. 40, 1095 (1972) for a momentum representation
treatment of the hydrogen atom, $l = 0$ states.
The relation between the ordinary space representation and the momentum
representation may be clarified by considering the basic commutation relations
of quantum mechanics. We can go from a classical Hamiltonian to the
Schrödinger wave equation by requiring that momentum $p$ and position $x$ not commute.
Instead, we require that
$$[p, x] = (px - xp) = -i\hbar. \tag{15.72}$$
For the multidimensional case Eq. 15.72 is replaced by
$$[p_i, x_j] = -i\hbar\,\delta_{ij}. \tag{15.73}$$
The Schrödinger (space) representation is obtained by using
$$x_j \to x_j, \qquad p_i \to -i\hbar\frac{\partial}{\partial x_i}, \qquad (x)$$
replacing the momentum by a partial space derivative. The reader will easily
see that
$$[p, x]\,\psi(x) = -i\hbar\,\psi(x). \tag{15.74}$$
However, Eq. 15.72 can equally well be satisfied by using
$$x_j \to i\hbar\frac{\partial}{\partial p_j}, \qquad p_i \to p_i. \qquad (p)$$
This is the momentum representation. Then
$$[p, x]\,g(p) = -i\hbar\,g(p). \tag{15.75}$$
Hence the representation (x) is not unique; (p) is an alternate possibility.
In general, the Schrödinger representation (x) leading to the Schrödinger
wave equation is more convenient because the potential energy $V$ is generally
given as a function of position $V(x, y, z)$. The momentum representation (p)
usually leads to an integral equation (compare Chapter 16 for the pros and cons
of the integral equations). For an exception, consider the harmonic oscillator.
EXAMPLE 15.6.2 Harmonic Oscillator
The classical Hamiltonian (kinetic energy + potential energy = total energy)
is
$$H(p, x) = \frac{p^2}{2m} + \frac{1}{2}kx^2 = E, \tag{15.76}$$
where $k$ is the Hooke's law constant.
In the Schrödinger representation we obtain
$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + \frac{1}{2}kx^2\,\psi(x) = E\,\psi(x). \tag{15.77}$$
For total energy $E$ equal to $\sqrt{k/m}\,\hbar/2$ there is a solution (Section 13.1)
$$\psi(x) = e^{-(\sqrt{mk}/2\hbar)x^2}. \tag{15.78}$$
The momentum representation leads to
$$\frac{p^2}{2m}\,g(p) - \frac{\hbar^2 k}{2}\frac{d^2 g(p)}{dp^2} = E\,g(p). \tag{15.79}$$
Again, for
$$E = \sqrt{\frac{k}{m}}\,\frac{\hbar}{2} \tag{15.80}$$
the momentum wave equation (15.79) is satisfied by
$$g(p) = e^{-p^2/(2\hbar\sqrt{mk})}. \tag{15.81}$$
Either representation, space or momentum (and an infinite number of other
possibilities), may be used, depending on which is more convenient for the
particular problem under attack.
The demonstration that $g(p)$ is the momentum wave function corresponding
to Eq. 15.78 (that it is the Fourier inverse transform of Eq. 15.78) is left as
Exercise 15.6.3.
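The two oscillator wave equations can be spot-checked with finite differences. The sketch below assumes the space equation $-(\hbar^2/2m)\psi'' + \tfrac12kx^2\psi = E\psi$ and the momentum equation $(p^2/2m)g - (\hbar^2k/2)g'' = Eg$ (the latter follows from the replacement $x \to i\hbar\,d/dp$), with $E = \tfrac12\hbar\sqrt{k/m}$. The numerical values of $\hbar$, $m$, and $k$, the step size, and the function names are arbitrary choices for this illustration.

```python
import math

HBAR, M, K = 1.0, 2.0, 3.0                 # arbitrary illustrative constants
E = 0.5 * HBAR * math.sqrt(K / M)          # ground-state energy


def psi(x):
    """Space wave function exp(-(sqrt(m k) / (2 hbar)) x^2), unnormalized."""
    return math.exp(-math.sqrt(M * K) * x * x / (2.0 * HBAR))


def g(p):
    """Momentum wave function exp(-p^2 / (2 hbar sqrt(m k))), unnormalized."""
    return math.exp(-p * p / (2.0 * HBAR * math.sqrt(M * K)))


def second_derivative(func, x, h=1e-3):
    """Central-difference estimate of the second derivative."""
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)


def space_residual(x):
    """Left side minus right side of the space wave equation."""
    return (-HBAR ** 2 / (2.0 * M) * second_derivative(psi, x)
            + 0.5 * K * x * x * psi(x) - E * psi(x))


def momentum_residual(p):
    """Left side minus right side of the momentum wave equation."""
    return (p * p / (2.0 * M) * g(p)
            - 0.5 * HBAR ** 2 * K * second_derivative(g, p) - E * g(p))


for point in (0.0, 0.7, 1.5):
    print(point, space_residual(point), momentum_residual(point))
```

Both residuals should be zero up to the discretization error of the difference formula, which is of order $h^2$.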
EXERCISES
15.6.1 The function $e^{i\mathbf{k}\cdot\mathbf{r}}$ describes a plane wave of momentum $\mathbf{p} = \hbar\mathbf{k}$ normalized to
unit density. (Time dependence of $e^{-i\omega t}$ is assumed.) Show that these plane wave
functions satisfy an orthogonality relation
$$\int (e^{i\mathbf{k}\cdot\mathbf{r}})^*\,e^{i\mathbf{k}'\cdot\mathbf{r}}\,dx\,dy\,dz = (2\pi)^3\,\delta(\mathbf{k} - \mathbf{k}').$$
15.6.2 An infinite plane wave in quantum mechanics may be represented by the
function
$$\psi(x) = e^{ip'x/\hbar}.$$
Find the corresponding momentum distribution function. Note that it has an
infinity and that $\psi(x)$ is not normalized.
15.6.3 A linear quantum oscillator in its ground state has a wave function
$$\psi(x) = a^{-1/2}\pi^{-1/4}e^{-x^2/2a^2}.$$
Show that the corresponding momentum function is
$$g(p) = a^{1/2}\pi^{-1/4}\hbar^{-1/2}e^{-a^2p^2/2\hbar^2}.$$
15.6.4 The nth excited state of the linear quantum oscillator is described by
$$\psi_n(x) = a^{-1/2}\,2^{-n/2}\,\pi^{-1/4}\,(n!)^{-1/2}\,e^{-x^2/2a^2}\,H_n(x/a),$$
where $H_n(x/a)$ is the nth Hermite polynomial, Section 13.1. As an extension
of Exercise 15.6.3, find the momentum function corresponding to $\psi_n(x)$.
Hint. $\psi_n(x)$ may be represented by $\mathcal{L}_+^n\,\psi_0(x)$, where $\mathcal{L}_+$ is the raising operator,
Exercise 13.1.16.
15.6.5 A free particle in quantum mechanics is described by a plane wave
$$\psi_k(x, t) = e^{i[kx - (\hbar k^2/2m)t]}.$$
Combining waves of adjacent momentum with an amplitude weighting factor
$\varphi(k)$, we form a wave packet
$$\Psi(x, t) = \int_{-\infty}^{\infty} \varphi(k)\,e^{i[kx - (\hbar k^2/2m)t]}\,dk.$$
(a) Solve for $\varphi(k)$ given that
$$\Psi(x, 0) = e^{-x^2/2a^2}.$$
(b) Using the known value of $\varphi(k)$, integrate to get the explicit form of $\Psi(x, t)$.
Note that this wave packet diffuses or spreads out with time.
ANS. $\Psi(x, t) = \dfrac{e^{-\{x^2/2[a^2 + (i\hbar/m)t]\}}}{[1 + (i\hbar t/ma^2)]^{1/2}}$.
Note. An interesting discussion of this problem from the evolution operator
point of view is given by S. M. Blinder, "Evolution of a Gaussian wavepacket,"
Am. J. Phys. 36, 525 (1968).
15.6.6 Find the time-dependent momentum wave function $g(k, t)$ corresponding to
$\Psi(x, t)$ of Exercise 15.6.5. Show that the momentum wave packet $g^*(k, t)\,g(k, t)$
is independent of time.
15.6.7 The deuteron, Example 9.1.2, may be described reasonably well with a Hulthén
wave function
$$\psi(r) = \frac{A}{r}\left[e^{-\alpha r} - e^{-\beta r}\right],$$
with $A$, $\alpha$, and $\beta$ constants. Find $g(\mathbf{p})$, the corresponding momentum function.
Note. The Fourier transform may be rewritten as Fourier sine and cosine
transforms or as a Laplace transform, Section 15.8.
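15.6.8 The nuclear form factor $F(\mathbf{k})$ and the charge distribution $\rho(\mathbf{r})$ are
three-dimensional Fourier transforms of each other:
$$F(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}}\int \rho(\mathbf{r})\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3r.$$
If the measured form factor is
$$F(\mathbf{k}) = (2\pi)^{-3/2}\left(1 + \frac{k^2}{a^2}\right)^{-1},$$
find the corresponding charge distribution.
ANS. $\rho(r) = \dfrac{a^2}{4\pi}\,\dfrac{e^{-ar}}{r}$.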
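15.6.9 Check the normalization of the hydrogen momentum wave function
$$g(\mathbf{p}) = \frac{2^{3/2}}{\pi}\,\frac{a_0^{3/2}\hbar^{5/2}}{(a_0^2 p^2 + \hbar^2)^2}$$
by direct evaluation of the integral
$$\int g^*(\mathbf{p})\,g(\mathbf{p})\,d^3p.$$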
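15.6.10 With $\psi(\mathbf{r})$ a wave function in ordinary space and $\varphi(\mathbf{p})$ the corresponding
momentum function, show that
(a) $$\frac{1}{(2\pi\hbar)^{3/2}}\int \mathbf{r}\,\psi(\mathbf{r})\,e^{-i\mathbf{r}\cdot\mathbf{p}/\hbar}\,d^3x = i\hbar\,\nabla_p\,\varphi(\mathbf{p}),$$
(b) $$\frac{1}{(2\pi\hbar)^{3/2}}\int r^2\,\psi(\mathbf{r})\,e^{-i\mathbf{r}\cdot\mathbf{p}/\hbar}\,d^3x = (i\hbar\,\nabla_p)^2\,\varphi(\mathbf{p}).$$
Note. $\nabla_p$ is the gradient in momentum space:
$$\nabla_p = \mathbf{i}\frac{\partial}{\partial p_x} + \mathbf{j}\frac{\partial}{\partial p_y} + \mathbf{k}\frac{\partial}{\partial p_z}.$$
These results may be extended to any positive integer power of $r$ and therefore
to any (analytic) function that may be expanded as a Maclaurin series in $r$.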
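15.6.11 The ordinary space wave function $\psi(\mathbf{r}, t)$ satisfies the time-dependent
Schrödinger equation
$$i\hbar\frac{\partial\psi(\mathbf{r}, t)}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V(\mathbf{r})\,\psi.$$
Show that the corresponding time-dependent momentum function satisfies the
analogous equation
$$i\hbar\frac{\partial\varphi(\mathbf{p}, t)}{\partial t} = \frac{p^2}{2m}\,\varphi + V(i\hbar\nabla_p)\,\varphi.$$
Note. Assume that $V(\mathbf{r})$ may be expressed by a Maclaurin series and use Exercise
15.6.10.
$V(i\hbar\nabla_p)$ is the same function of the variable $i\hbar\nabla_p$ that $V(\mathbf{r})$ is of the variable $\mathbf{r}$.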
15.6.12 The one-dimensional time-independent Schrödinger wave equation is
$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + V(x)\,\psi(x) = E\,\psi(x).$$
For the special case of $V(x)$ an analytic function of $x$, show that the corresponding
momentum wave equation is
$$V\!\left(i\hbar\frac{d}{dp}\right)g(p) + \frac{p^2}{2m}\,g(p) = E\,g(p).$$
Derive this momentum wave equation from the Fourier transform, Eq. 15.62,
and its inverse. Do not use the substitution $x \to i\hbar(d/dp)$ directly.
15.7 TRANSFER FUNCTIONS
A time-dependent electrical pulse may be regarded as built up as a
superposition of plane waves of many frequencies. For angular frequency $\omega$ we have a
contribution
$$F(\omega)\,e^{i\omega t}.$$
Then the complete pulse may be written as
$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\,e^{i\omega t}\,d\omega. \tag{15.82}$$
Because the angular frequency $\omega$ is related to the linear frequency $\nu$ by
$$\nu = \frac{\omega}{2\pi},$$
it is customary to associate the entire $1/2\pi$ factor with this integral.
FIG. 15.5 Input $f(t)$, output $g(t)$
But if $\omega$ is a frequency, what about the negative frequencies? The negative
$\omega$'s may be looked on as a mathematical device to avoid dealing with two
functions ($\cos\omega t$ and $\sin\omega t$) separately (compare Section 14.1).
Because Eq. 15.82 has the form of a Fourier transform, we may solve for $F(\omega)$
by writing the inverse transform
$$F(\omega) = \int_{-\infty}^{\infty} f(t)\,e^{-i\omega t}\,dt. \tag{15.83}$$
Equation 15.83 represents a resolution of the pulse $f(t)$ into its angular frequency
components. Equation 15.82 is a synthesis of the pulse from its components.
Consider some device such as a servomechanism or a stereo amplifier (Fig.
15.5) with an input $f(t)$ and an output $g(t)$. For an input of a single frequency $\omega$,
$f_\omega(t) = e^{i\omega t}$, the amplifier will alter the amplitude and may also change the phase.
The changes will probably depend on the frequency. Hence
$$g_\omega(t) = \varphi(\omega)\,f_\omega(t). \tag{15.84}$$
This amplitude and phase modifying function, $\varphi(\omega)$, is called a transfer function.
It usually will be complex:
$$\varphi(\omega) = u(\omega) + iv(\omega), \tag{15.85}$$
where the functions $u(\omega)$ and $v(\omega)$ are real.
In Eq. 15.84 we assume that the transfer function $\varphi(\omega)$ is independent of input
amplitude and of the presence or absence of any other frequency components.
That is, we are assuming a linear mapping of $f(t)$ onto $g(t)$. Then the total output
may be obtained by integrating over the entire input, as modified by the amplifier
$$g(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \varphi(\omega)\,F(\omega)\,e^{i\omega t}\,d\omega. \tag{15.86}$$
The transfer function is characteristic of the amplifier. Once the transfer
function is known (measured or calculated), the output $g(t)$ can be calculated for
any input $f(t)$. Let us consider $\varphi(\omega)$ as the Fourier (inverse) transform of some
function $\Phi(t)$:
$$\varphi(\omega) = \int_{-\infty}^{\infty} \Phi(t)\,e^{-i\omega t}\,dt. \tag{15.87}$$
Then Eq. 15.86 is the Fourier transform of two inverse transforms. From Section
15.5 we obtain the convolution
$$g(t) = \int_{-\infty}^{\infty} f(\tau)\,\Phi(t - \tau)\,d\tau. \tag{15.88}$$
Interpreting Eq. 15.88, we have an input (a "cause") $f(\tau)$, modified by
$\Phi(t - \tau)$, producing an output (an "effect") $g(t)$. Adopting the concept of
causality (that the cause precedes the effect), we must require $\tau < t$. We do
this by requiring
$$\Phi(t - \tau) = 0, \qquad \tau > t. \tag{15.89}$$
Then Eq. 15.88 becomes
$$g(t) = \int_{-\infty}^{t} f(\tau)\,\Phi(t - \tau)\,d\tau. \tag{15.90}$$
The adoption of Eq. 15.89 has profound consequences here and equivalently
in the dispersion theory, Section 7.3.
Significance
To see the significance of $\Phi$, let $f(\tau)$ be a sudden impulse starting at $\tau = 0$,
$$f(\tau) = \delta(\tau),$$
where $\delta(\tau)$ is a Dirac delta distribution on the positive side of the origin. Then
Eq. 15.90 becomes
$$g(t) = \int_{-\infty}^{t} \delta(\tau)\,\Phi(t - \tau)\,d\tau = \begin{cases} \Phi(t), & t > 0, \\ 0, & t < 0. \end{cases} \tag{15.91}$$
This identifies $\Phi(t)$ as the output function corresponding to a unit impulse at
$t = 0$. Equation 15.91 also serves to establish that $\Phi(t)$ is real. Our original
transfer function gives the steady-state output corresponding to a unit amplitude
single frequency input. $\Phi(t)$ and $\varphi(\omega)$ are Fourier transforms of each other.
From Eq. 15.87 we now have
$$\varphi(\omega) = \int_0^{\infty} \Phi(t)\,e^{-i\omega t}\,dt, \tag{15.92}$$
with the lower limit set equal to zero by causality (Eq. 15.89). With $\Phi(t)$ real from
Eq. 15.91 we separate real and imaginary parts and write
$$\begin{aligned}
u(\omega) &= \int_0^{\infty} \Phi(t)\cos\omega t\,dt, \\
v(\omega) &= -\int_0^{\infty} \Phi(t)\sin\omega t\,dt, \qquad t > 0.
\end{aligned} \tag{15.93}$$
From this we see that the real part of $\varphi(\omega)$, $u(\omega)$, is even, whereas the imaginary part of
$\varphi(\omega)$, $v(\omega)$, is odd:
$$u(-\omega) = u(\omega), \qquad v(-\omega) = -v(\omega).$$
Compare this result with Exercise 15.3.1.
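A simple causal impulse response makes these relations concrete. The sketch below assumes $\Phi(t) = e^{-t/T}/T$ for $t \ge 0$ (and zero for $t < 0$), for which the transfer function is $\varphi(\omega) = 1/(1 + i\omega T)$; this particular response, the time constant, the Simpson quadrature, and the function names are assumptions introduced for illustration and are not taken from the text. It evaluates the cosine and sine integrals for $u(\omega)$ and $v(\omega)$ numerically and compares them with the closed form.

```python
import math

T = 0.5  # time constant of the assumed impulse response


def impulse_response(t):
    """Causal response: exp(-t/T)/T for t >= 0, zero for t < 0."""
    return math.exp(-t / T) / T if t >= 0.0 else 0.0


def simpson(func, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def u(omega):
    """Real part: integral_0^inf Phi(t) cos(omega t) dt (truncated at 40 T)."""
    return simpson(lambda t: impulse_response(t) * math.cos(omega * t),
                   0.0, 40.0 * T, 4000)


def v(omega):
    """Imaginary part: minus integral_0^inf Phi(t) sin(omega t) dt."""
    return -simpson(lambda t: impulse_response(t) * math.sin(omega * t),
                    0.0, 40.0 * T, 4000)


def transfer_exact(omega):
    """Closed form 1 / (1 + i omega T) for the assumed response."""
    return 1.0 / (1.0 + 1j * omega * T)


for omega in (-3.0, 3.0):
    print(omega, u(omega), v(omega), transfer_exact(omega))
```

Evaluating at $\pm\omega$ shows the parity directly: `u` is unchanged and `v` changes sign.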
Interpreting Eq. 15.93 as Fourier cosine and sine transforms, we have
$$\begin{aligned}
\Phi(t) &= \frac{2}{\pi}\int_0^{\infty} u(\omega)\cos\omega t\,d\omega \\
&= -\frac{2}{\pi}\int_0^{\infty} v(\omega)\sin\omega t\,d\omega, \qquad t > 0.
\end{aligned} \tag{15.94}$$
Combining Eqs. 15.93 and 15.94, we obtain
$$v(\omega) = -\int_0^{\infty} \sin\omega t\left\{\frac{2}{\pi}\int_0^{\infty} u(\omega')\cos\omega' t\,d\omega'\right\}dt, \tag{15.95}$$
showing that if our transfer function has a real part, it will also have an imaginary
part (and vice versa). Of course, this assumes that the Fourier transforms exist,
thus excluding cases such as $\Phi(t) = 1$.
The imposition of causality has led to a mutual interdependence of the real
and imaginary parts of the transfer function. The reader should compare this
with the results of the dispersion theory of Section 7.3, also involving causality.
It may be helpful to show that the parity properties of $u(\omega)$ and $v(\omega)$ require
$\Phi(t)$ to vanish for negative $t$. Inverting Eq. 15.87, we have
$$\Phi(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} [u(\omega) + iv(\omega)][\cos\omega t + i\sin\omega t]\,d\omega. \tag{15.96}$$
With $u(\omega)$ even and $v(\omega)$ odd, Eq. 15.96 becomes
$$\Phi(t) = \frac{1}{\pi}\int_0^{\infty} u(\omega)\cos\omega t\,d\omega - \frac{1}{\pi}\int_0^{\infty} v(\omega)\sin\omega t\,d\omega, \qquad t > 0. \tag{15.97}$$
From Eq. 15.94
$$\int_0^{\infty} u(\omega)\cos\omega t\,d\omega = -\int_0^{\infty} v(\omega)\sin\omega t\,d\omega, \qquad t > 0. \tag{15.98}$$
If we reverse the sign of $t$, $\sin\omega t$ reverses sign and from Eq. 15.97
$$\Phi(t) = 0, \qquad t < 0$$
(demonstrating the internal consistency of our analysis).
EXERCISE
15.7.1 Derive the convolution
$$g(t) = \int_{-\infty}^{\infty} f(\tau)\,\Phi(t - \tau)\,d\tau.$$
15.8 ELEMENTARY LAPLACE TRANSFORMS
Definition
The Laplace transform $f(s)$ or $\mathcal{L}$ of a function $F(t)$ is defined by¹
$$f(s) = \mathcal{L}\{F(t)\} = \lim_{a\to\infty}\int_0^{a} e^{-st}F(t)\,dt = \int_0^{\infty} e^{-st}F(t)\,dt. \tag{15.99}$$
A few comments on the existence of the integral might be in order. The infinite
integral of $F(t)$,
$$\int_0^{\infty} F(t)\,dt,$$
need not exist. For instance, $F(t)$ may diverge exponentially for large $t$. However,
if there is some constant $s_0$ such that
$$|e^{-s_0 t}F(t)| \le M, \tag{15.100}$$
a positive constant for sufficiently large $t$, $t > t_0$, the Laplace transform (Eq.
15.99) will exist for $s > s_0$; $F(t)$ is said to be of exponential order. As a
counterexample, $F(t) = e^{t^2}$ does not satisfy the condition given by Eq. 15.100 and is not
of exponential order. $\mathcal{L}\{e^{t^2}\}$ does not exist.
The Laplace transform may also fail to exist because of a sufficiently strong
singularity in the function $F(t)$ as $t \to 0$; that is,
$$\int_0^{\infty} e^{-st}\,t^n\,dt$$
diverges at the origin for $n \le -1$. The Laplace transform $\mathcal{L}\{t^n\}$ does not exist
for $n \le -1$.
Since, for two functions $F(t)$ and $G(t)$, for which the integrals exist,
$$\mathcal{L}\{aF(t) + bG(t)\} = a\,\mathcal{L}\{F(t)\} + b\,\mathcal{L}\{G(t)\}, \tag{15.101}$$
the operation denoted by $\mathcal{L}$ is linear.
Elementary Functions
To introduce the Laplace transform, let us apply the operation to some of the
elementary functions. In all cases we assume that $F(t) = 0$ for $t < 0$.
$$F(t) = 1, \qquad t > 0.$$
Then
$$\mathcal{L}\{1\} = \int_0^{\infty} e^{-st}\,dt = \frac{1}{s}, \qquad \text{for } s > 0. \tag{15.102}$$
Again, let
$$F(t) = e^{kt}, \qquad t > 0.$$
The Laplace transform becomes
$$\mathcal{L}\{e^{kt}\} = \int_0^{\infty} e^{-st}e^{kt}\,dt = \frac{1}{s - k}, \qquad \text{for } s > k. \tag{15.103}$$
Using this relation, we may easily obtain the Laplace transform of certain
other functions. Since
$$\cosh kt = \tfrac{1}{2}(e^{kt} + e^{-kt}), \qquad \sinh kt = \tfrac{1}{2}(e^{kt} - e^{-kt}), \tag{15.104}$$
we have
$$\begin{aligned}
\mathcal{L}\{\cosh kt\} &= \frac{1}{2}\left(\frac{1}{s-k} + \frac{1}{s+k}\right) = \frac{s}{s^2 - k^2}, \\
\mathcal{L}\{\sinh kt\} &= \frac{1}{2}\left(\frac{1}{s-k} - \frac{1}{s+k}\right) = \frac{k}{s^2 - k^2},
\end{aligned} \tag{15.105}$$
both valid for $s > k$. We have the relations
$$\cos kt = \cosh ikt, \qquad \sin kt = -i\sinh ikt. \tag{15.106}$$
Using Eqs. 15.105 with $k$ replaced by $ik$, we find that the Laplace transforms are
$$\mathcal{L}\{\cos kt\} = \frac{s}{s^2 + k^2}, \qquad \mathcal{L}\{\sin kt\} = \frac{k}{s^2 + k^2}, \tag{15.107}$$
both valid for $s > 0$. Another derivation of this last transform is given in the
next section. Note that $\lim_{s\to 0}\mathcal{L}\{\sin kt\} = 1/k$. The Laplace transform assigns a
value of $1/k$ to $\int_0^{\infty}\sin kt\,dt$.
Finally, for $F(t) = t^n$, we have
$$\mathcal{L}\{t^n\} = \int_0^{\infty} e^{-st}\,t^n\,dt,$$
which is just the factorial function. Hence
$$\mathcal{L}\{t^n\} = \frac{n!}{s^{n+1}}, \qquad s > 0, \quad n > -1. \tag{15.108}$$
The reader will note that in all these transforms we have the variable $s$ in the
denominator (negative powers of $s$). In particular, $\lim_{s\to\infty} f(s) = 0$. The
significance of this point is that if $f(s)$ involves positive powers of $s$ ($\lim_{s\to\infty} f(s) \to \infty$),
then no inverse transform exists.
¹ This is sometimes called a one-sided Laplace transform; the integral from
$-\infty$ to $+\infty$ is referred to as a two-sided Laplace transform. Some authors
introduce an additional factor of $s$. This extra $s$ appears to have little advantage
and continually gets in the way (compare Jeffreys and Jeffreys, Section 14.13
for additional comments). Generally, we take $s$ to be real and positive. It is
possible to have $s$ complex provided $\Re(s) > 0$.
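The elementary transforms can be spot-checked by evaluating the defining integral, Eq. 15.99, numerically. The sketch below truncates the integral at a finite upper limit and uses Simpson's rule; the truncation point, the sample values of $s$, $k$, and $n$, and the function names are assumptions made for this illustration.

```python
import math


def simpson(func, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def laplace(F, s, t_max=40.0, n=8000):
    """Truncated numerical version of integral_0^inf exp(-s t) F(t) dt."""
    return simpson(lambda t: math.exp(-s * t) * F(t), 0.0, t_max, n)


s, k, n_power = 2.0, 0.5, 3   # sample values with s > k > 0

# Each entry pairs the numerical transform with the closed form from the text.
checks = {
    "1": (laplace(lambda t: 1.0, s), 1.0 / s),
    "exp(kt)": (laplace(lambda t: math.exp(k * t), s), 1.0 / (s - k)),
    "cosh(kt)": (laplace(lambda t: math.cosh(k * t), s), s / (s * s - k * k)),
    "sinh(kt)": (laplace(lambda t: math.sinh(k * t), s), k / (s * s - k * k)),
    "cos(kt)": (laplace(lambda t: math.cos(k * t), s), s / (s * s + k * k)),
    "sin(kt)": (laplace(lambda t: math.sin(k * t), s), k / (s * s + k * k)),
    "t^n": (laplace(lambda t: t ** n_power, s),
            math.factorial(n_power) / s ** (n_power + 1)),
}

for name, (numeric, closed_form) in checks.items():
    print(name, numeric, closed_form)
```

The truncation is harmless here because every integrand decays at least as fast as $e^{-(s-k)t}$; for $s \le k$ the hyperbolic and exponential cases would not converge, in line with the stated ranges of validity.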
Inverse Transform
There is little importance to these operations unless we can carry out the
inverse transform, as in Fourier transforms. That is, with
$$\mathcal{L}\{F(t)\} = f(s),$$
then
$$\mathcal{L}^{-1}\{f(s)\} = F(t). \tag{15.109}$$
Taken literally, this inverse transform is not unique. Two functions $F_1(t)$ and
$F_2(t)$ may have the same transform, $f(s)$. However, in this case
$$F_1(t) - F_2(t) = N(t),$$
where $N(t)$ is a null function (Fig. 15.6) indicating that
$$\int_0^{t_0} N(t)\,dt = 0$$
for all positive $t_0$. This result is known as Lerch's theorem. Therefore to the
physicist and engineer $N(t)$ may almost always be taken as zero and the inverse
operation becomes unique.
FIG. 15.6 A possible null function ($N(t)$ nonzero at a single point)
The inverse transform can be determined in various ways. (1) A table of
transforms can be built up and used to carry out the inverse transformation
exactly as a table of logarithms can be used to look up antilogarithms. The
preceding transforms constitute the embryonic beginnings of such a table. For a
more complete set of Laplace transforms see Table 15.2 or AMS-55, Chapter 29.
Employing partial fraction expansions and various operational theorems, which
are considered in succeeding sections, may facilitate use of the tables. There is
some justification for suspecting that these tables are probably of more value in
solving textbook exercises than in solving real-world problems. (2) A general
technique for $\mathcal{L}^{-1}$ will be developed in Section 15.12 by using the calculus of
residues. (3) The difficulties and the possibilities of a numerical approach
(numerical inversion) are considered at the end of this section.
Partial Fraction Expansion
Utilization of a table of transforms (or inverse transforms) is facilitated by
expanding $f(s)$ in partial fractions.
Frequently $f(s)$, our transform, occurs in the form $g(s)/h(s)$, where $g(s)$ and
$h(s)$ are polynomials with no common factors, $g(s)$ being of lower degree than
$h(s)$. If the factors of $h(s)$ are all linear and distinct, then by the theory of partial
fractions we may write
$$f(s) = \frac{c_1}{s - a_1} + \frac{c_2}{s - a_2} + \cdots + \frac{c_n}{s - a_n}, \tag{15.110}$$
where the $c_i$'s are independent of $s$. The $a_i$'s are the roots of $h(s)$. If any one of the
roots, say $a_1$, is multiple (occurring $m$ times), then $f(s)$ has the form
$$f(s) = \frac{c_{1,m}}{(s - a_1)^m} + \frac{c_{1,m-1}}{(s - a_1)^{m-1}} + \cdots + \frac{c_{1,1}}{s - a_1} + \sum_{i=2}^{n}\frac{c_i}{s - a_i}. \tag{15.111}$$
Finally, if one of the factors is quadratic, $(s^2 + ps + q)$, the numerator, instead of
being a simple constant, will have the form
$$\frac{as + b}{s^2 + ps + q}.$$
There are various ways of determining the constants introduced. For
instance, in Eq. 15.110 we may multiply through by $(s - a_i)$ and obtain
$$c_i = \lim_{s\to a_i}(s - a_i)\,f(s). \tag{15.112}$$
In elementary cases a direct solution is often the easiest.
EXAMPLE 15.8.1 Partial Fraction Expansion
Let
$$f(s) = \frac{k^2}{s(s^2 + k^2)} = \frac{c}{s} + \frac{as + b}{s^2 + k^2}. \tag{15.113}$$
Putting the right side of the equation over a common denominator and equating
like powers of $s$ in the numerator, we obtain
$$\frac{k^2}{s(s^2 + k^2)} = \frac{c(s^2 + k^2) + s(as + b)}{s(s^2 + k^2)}, \tag{15.114}$$
$$c + a = 0 \quad (s^2), \qquad b = 0 \quad (s^1), \qquad ck^2 = k^2 \quad (s^0).$$
Solving these ($s \ne 0$), we have
$$c = 1, \qquad b = 0, \qquad a = -1,$$
giving
$$f(s) = \frac{1}{s} - \frac{s}{s^2 + k^2} \tag{15.115}$$
and
$$\mathcal{L}^{-1}\{f(s)\} = 1 - \cos kt \tag{15.116}$$
by Eqs. 15.102 and 15.107.
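For distinct linear factors, multiplying through by $(s - a_i)$ and letting $s \to a_i$ isolates each coefficient. When $h(s) = \prod_j (s - a_j)$, this gives $c_i = g(a_i)/\prod_{j\ne i}(a_i - a_j)$. The sketch below applies that formula to the example above, treating the roots of $s^2 + k^2$ as the complex pair $\pm ik$ and inverting term by term with $\mathcal{L}^{-1}\{1/(s - a)\} = e^{at}$. The use of complex roots in place of the quadratic factor, the sample values, and the function names are assumptions of this illustration, not the procedure used in the example.

```python
import cmath
import math


def partial_fraction_coefficients(numerator, roots):
    """c_i = g(a_i) / prod_{j != i} (a_i - a_j), for monic h(s) with distinct roots."""
    coefficients = []
    for i, a_i in enumerate(roots):
        denominator = 1.0 + 0j
        for j, a_j in enumerate(roots):
            if j != i:
                denominator *= (a_i - a_j)
        coefficients.append(numerator(a_i) / denominator)
    return coefficients


def inverse_laplace(coefficients, roots, t):
    """Sum of c_i * exp(a_i * t), inverting each term c_i / (s - a_i)."""
    return sum(c * cmath.exp(a * t) for c, a in zip(coefficients, roots))


k = 2.0
roots = [0.0, 1j * k, -1j * k]            # zeros of s (s^2 + k^2)
coeffs = partial_fraction_coefficients(lambda s: k * k, roots)

t = 0.7
print(coeffs)
print(inverse_laplace(coeffs, roots, t).real, 1.0 - math.cos(k * t))
```

The two complex terms combine into $-\cos kt$, so the result agrees with the real form obtained by keeping the quadratic factor intact.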
EXAMPLE 15.8.2 A Step Function
As one application of Laplace transforms, consider the evaluation of
$$F(t) = \int_0^{\infty} \frac{\sin tx}{x}\,dx. \tag{15.117}$$
Suppose we take the Laplace transform of this definite (and improper)
integral:
$$\mathcal{L}\left\{\int_0^{\infty} \frac{\sin tx}{x}\,dx\right\} = \int_0^{\infty} e^{-st}\int_0^{\infty} \frac{\sin tx}{x}\,dx\,dt. \tag{15.118}$$
Now interchanging the order of integration (which must be justified!),² we get
$$\int_0^{\infty} \frac{1}{x}\left[\int_0^{\infty} e^{-st}\sin tx\,dt\right]dx = \int_0^{\infty} \frac{dx}{s^2 + x^2}, \tag{15.119}$$
since the factor in square brackets is just the Laplace transform of $\sin tx$. From
the integral tables
$$\int_0^{\infty} \frac{dx}{s^2 + x^2} = \frac{1}{s}\tan^{-1}\left(\frac{x}{s}\right)\Big|_0^{\infty} = \frac{\pi}{2s} = f(s). \tag{15.120}$$
By Eq. 15.102 we carry out the inverse transformation to obtain
$$F(t) = \frac{\pi}{2}, \qquad t > 0, \tag{15.121}$$
in agreement with an evaluation by the calculus of residues (Section 7.2). It has
been assumed that $t > 0$ in $F(t)$. For $F(-t)$ we need note only that $\sin(-tx) =
-\sin tx$, giving $F(-t) = -F(t)$. Finally, if $t = 0$, $F(0)$ is clearly zero. Therefore
$$\int_0^{\infty} \frac{\sin tx}{x}\,dx = \begin{cases} \dfrac{\pi}{2}, & t > 0, \\ 0, & t = 0, \\ -\dfrac{\pi}{2}, & t < 0. \end{cases} \tag{15.122}$$
FIG. 15.7 $F(t) = \int_0^{\infty} (\sin tx/x)\,dx$, a step function
Note that $\int_0^{\infty} (\sin tx/x)\,dx$, taken as a function of $t$, describes a step function
(Fig. 15.7), a step of height $\pi$ at $t = 0$. This is consistent with Eq. 8.111.
The technique in the preceding example was to (1) introduce a second
integration, the Laplace transform, (2) reverse the order of integration and integrate,
and (3) take the inverse Laplace transform. There are many opportunities where
this technique of reversing the order of integration can be applied and proved
very useful. Exercise 15.8.6 is a variation of this.
² See Jeffreys and Jeffreys, Chapter 1 (uniform convergence of integrals).
Numerical Inversion
As an integration the Laplace transform is a highly stable operation, stable
in the sense that small fluctuations (or errors) in $F(t)$ are averaged out in the
determination of the area under a curve. Also, the weighting factor, $e^{-st}$, means
that the behavior of $F(t)$ at large $t$ is effectively ignored, unless $s$ is small. As a
result of these two effects, a large change in $F(t)$ at large $t$ indicates a very small,
perhaps insignificant, change in $f(s)$. In contrast to the Laplace transform
operation, going from $f(s)$ to $F(t)$ is highly unstable. A tiny change in $f(s)$ may result
in a wild variation of $F(t)$. All significant figures may disappear. In a matrix
formulation the matrix is ill-conditioned with respect to inversion.
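The instability can be made concrete with closed-form transforms alone. In the sketch below, a smooth function $e^{-t}$ is perturbed by a rapid ripple $\sin\Omega t$ of unit amplitude; by Eqs. 15.103 and 15.107 the transform changes only by $\Omega/(s^2 + \Omega^2) \le 1/\Omega$. The particular functions, the value of $\Omega$, the sample points, and the function names are assumptions chosen for this illustration.

```python
import math

OMEGA = 1000.0  # ripple frequency (illustrative value)


def F_smooth(t):
    """Smooth original function exp(-t)."""
    return math.exp(-t)


def F_rippled(t):
    """Same function plus a unit-amplitude, high-frequency ripple."""
    return math.exp(-t) + math.sin(OMEGA * t)


def f_smooth(s):
    """Laplace transform of exp(-t): 1 / (s + 1)."""
    return 1.0 / (s + 1.0)


def f_rippled(s):
    """Transform of the rippled function: 1/(s + 1) + OMEGA/(s^2 + OMEGA^2)."""
    return 1.0 / (s + 1.0) + OMEGA / (s * s + OMEGA * OMEGA)


def max_transform_gap(s_values):
    """Largest difference between the two transforms over the sample points."""
    return max(abs(f_rippled(s) - f_smooth(s)) for s in s_values)


s_samples = [0.1 * i for i in range(1, 101)]      # s from 0.1 to 10
t_peak = math.pi / (2.0 * OMEGA)                   # first crest of the ripple

print(max_transform_gap(s_samples))                # of order 1/OMEGA
print(F_rippled(t_peak) - F_smooth(t_peak))        # of order 1
```

Two transforms that agree to about three decimal places thus correspond to originals that differ by order unity, which is the sense in which numerical inversion loses significant figures.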
There is no general, completely satisfactory numerical method for inverting
Laplace transforms. However, if we are willing to restrict attention to relatively
smooth functions, various possibilities open up. Bellman, Kalaba, and Lockett³
convert the Laplace transform to a Mellin transform ($x = e^{-t}$) and use numerical
quadrature based on shifted Legendre polynomials, $P_n^*(x) = P_n(1 - 2x)$. The
key step is analytic inversion of the resulting matrix. Krylov and Skoblya⁴ focus
on evaluation of the Bromwich integral (Section 15.12). As one technique, they
replace the integrand with an interpolating polynomial of negative powers and
integrate analytically.
³ R. Bellman, R. E. Kalaba, and J. A. Lockett, Numerical Inversion of the
Laplace Transforms. New York: American Elsevier (1966).
⁴ V. I. Krylov and N. S. Skoblya, Handbook of Numerical Inversion of Laplace
Transforms. Translated by D. Louvish. Jerusalem: Israel Program for
Scientific Translations (1969).
EXERCISES
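15.8.1 Prove that
$$\lim_{s\to\infty} s\,f(s) = \lim_{t\to +0} F(t).$$
Hint. Assume that $F(t)$ can be expressed as $F(t) = \sum_{n=0}^{\infty} a_n t^n$.
15.8.2 Show that
$$\frac{1}{\pi}\lim_{s\to 0}\mathcal{L}\{\cos xt\} = \delta(x).$$
15.8.3 Verify that
$$\mathcal{L}\left\{\frac{\cos at - \cos bt}{b^2 - a^2}\right\} = \frac{s}{(s^2 + a^2)(s^2 + b^2)}, \qquad a^2 \ne b^2.$$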
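15.8.4 Using partial fraction expansions, show that
(a) $$\mathcal{L}^{-1}\left\{\frac{1}{(s + a)(s + b)}\right\} = \frac{e^{-at} - e^{-bt}}{b - a}, \qquad a \ne b,$$
(b) $$\mathcal{L}^{-1}\left\{\frac{s}{(s + a)(s + b)}\right\} = \frac{a\,e^{-at} - b\,e^{-bt}}{a - b}, \qquad a \ne b.$$
15.8.5 Using partial fraction expansions, show that
(a) $$\mathcal{L}^{-1}\left\{\frac{1}{(s^2 + a^2)(s^2 + b^2)}\right\} = -\frac{1}{a^2 - b^2}\left\{\frac{\sin at}{a} - \frac{\sin bt}{b}\right\}, \qquad a^2 \ne b^2,$$
(b) $$\mathcal{L}^{-1}\left\{\frac{s^2}{(s^2 + a^2)(s^2 + b^2)}\right\} = \frac{1}{a^2 - b^2}\left\{a\sin at - b\sin bt\right\}, \qquad a^2 \ne b^2.$$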
15.8.6 The electrostatic potential of a charged conducting disk is known to have the
general form (circular cylindrical coordinates)
$$\psi(\rho, z) = \int_0^{\infty} e^{-k|z|}\,J_0(k\rho)\,f(k)\,dk,$$
with $f(k)$ unknown. At large distances ($z \to \infty$) the potential must approach the
Coulomb potential $Q/4\pi\varepsilon_0 z$. Show that
$$\lim_{k\to 0} f(k) = \frac{Q}{4\pi\varepsilon_0}.$$
Hint. You may set $\rho = 0$ and assume a Maclaurin expansion of $f(k)$ or, using
$e^{-kz}$, construct a delta sequence.
15.8.7 Show that
(a) $$\int_0^{\infty} \frac{\cos s}{s^{\nu}}\,ds = \frac{\pi}{2(\nu - 1)!\,\cos(\nu\pi/2)}, \qquad 0 < \nu < 1,$$
(b) $$\int_0^{\infty} \frac{\sin s}{s^{\nu}}\,ds = \frac{\pi}{2(\nu - 1)!\,\sin(\nu\pi/2)}, \qquad 0 < \nu < 2.$$
Why is $\nu$ restricted to (0, 1) for (a), to (0, 2) for (b)? These integrals may be
interpreted as Fourier transforms of $s^{-\nu}$ and as Mellin transforms of $\sin s$ and
$\cos s$.
Hint. Replace $s^{-\nu}$ by a Laplace transform integral: $\mathcal{L}\{t^{\nu-1}\}/(\nu - 1)!$. Then
integrate with respect to $s$. The resulting integral can be treated as a beta function
(Section 10.4).
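15.8.8 A function $F(t)$ can be expanded in a power series (Maclaurin); that is,
$$F(t) = \sum_{n=0}^{\infty} a_n t^n.$$
Then
$$\mathcal{L}\{F(t)\} = \int_0^{\infty} e^{-st}\sum_{n=0}^{\infty} a_n t^n\,dt = \sum_{n=0}^{\infty} a_n\int_0^{\infty} e^{-st}\,t^n\,dt.$$
Show that $f(s)$, the Laplace transform of $F(t)$, contains no powers of $s$ greater
than $s^{-1}$. Check your result by calculating $\mathcal{L}\{\delta(t)\}$ and comment intelligently
on this fiasco.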
15.9 LAPLACE TRANSFORM OF DERIVATIVES
Perhaps the main application of Laplace transforms is in converting
differential equations into simpler forms that may be solved more easily. It will be seen,
for instance, that coupled differential equations with constant coefficients
transform to simultaneous linear algebraic equations.
Let us transform the first derivative of $F(t)$:
$$\mathcal{L}\{F'(t)\} = \int_0^{\infty} e^{-st}\,\frac{dF(t)}{dt}\,dt.$$
Integrating by parts, we obtain
$$\mathcal{L}\{F'(t)\} = e^{-st}F(t)\Big|_0^{\infty} + s\int_0^{\infty} e^{-st}F(t)\,dt = s\,\mathcal{L}\{F(t)\} - F(0). \tag{15.123}$$
Strictly speaking, $F(0) = F(+0)$¹ and $dF/dt$ is required to be at least piecewise
continuous for $0 \le t < \infty$. Naturally, both $F(t)$ and its derivative must be such
that the integrals do not diverge. Incidentally, Eq. 15.123 provides another proof
of Exercise 15.8.1.
An extension gives
$$\mathcal{L}\{F^{(2)}(t)\} = s^2 f(s) - s\,F(+0) - F'(+0), \tag{15.124}$$
$$\mathcal{L}\{F^{(n)}(t)\} = s^n f(s) - s^{n-1}F(+0) - \cdots - F^{(n-1)}(+0). \tag{15.125}$$
¹ Zero is approached from the positive side.
The Laplace transform like the Fourier transform replaces differentiation with
multiplication. In the following examples differential equations become alge-
algebraic equations. The degree of transcendence is reduced, and the solution is
simplified. Here is the power and the utility of the Laplace transform. But see
Example 15.10.3 for what may happen if the coefficients are not constant.
Note carefully how the initial conditions, F(+0), F'(+0), and so on, are
incorporated into the transform. Equation 15.124 may be used to derive
$\mathcal{L}\{\sin kt\}$. We use the identity
$$-k^2\sin kt = \frac{d^2}{dt^2}\sin kt. \tag{15.126}$$
Then, applying the Laplace transform operation, we have
$$-k^2\mathcal{L}\{\sin kt\} = \mathcal{L}\left\{\frac{d^2}{dt^2}\sin kt\right\} = s^2\mathcal{L}\{\sin kt\} - s\sin(0) - \frac{d}{dt}\sin kt\Big|_{t=0}. \tag{15.127}$$
Since $\sin(0) = 0$ and $d/dt\,\sin kt|_{t=0} = k$,
$$\mathcal{L}\{\sin kt\} = \frac{k}{s^2+k^2}, \tag{15.128}$$
verifying Eq. 15.107.
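As a numerical cross-check of Eqs. 15.124 and 15.128, the following Python sketch approximates the Laplace integral by Simpson's rule on a truncated interval. The helper name `laplace`, the truncation point `t_max`, and the sample values k = 2, s = 1.5 are illustrative assumptions, not part of the text.

```python
import math

def laplace(F, s, t_max=40.0, n=40000):
    """Approximate the Laplace integral of F over [0, t_max] with Simpson's rule.

    Assumes exp(-s*t)*F(t) is negligible beyond t_max; n must be even.
    """
    h = t_max / n
    total = F(0.0) + math.exp(-s * t_max) * F(t_max)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-s * t) * F(t)
    return total * h / 3.0

k, s = 2.0, 1.5  # illustrative values

# Transform of F(t) = sin kt, to be compared with k/(s^2 + k^2) (Eq. 15.128).
f = laplace(lambda t: math.sin(k * t), s)

# Transform of F''(t) = -k^2 sin kt, computed directly ...
lhs = laplace(lambda t: -k * k * math.sin(k * t), s)
# ... and from Eq. 15.124 with F(+0) = 0 and F'(+0) = k.
rhs = s * s * f - s * 0.0 - k

print(f, k / (s * s + k * k))
print(lhs, rhs)
```

The two printed pairs should agree to within the quadrature error.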
EXAMPLE 15.9.1 Simple Harmonic Oscillator
As a simple but reasonably physical example, consider a mass m oscillating
under the influence of an ideal spring, spring constant k. As usual, friction is
neglected. Then Newton's second law becomes
$$m\frac{d^2X(t)}{dt^2} + kX(t) = 0; \tag{15.129}$$
also
$$X(0) = X_0, \qquad X'(0) = 0.$$
Applying the Laplace transform, we obtain
$$m\,\mathcal{L}\left\{\frac{d^2X}{dt^2}\right\} + k\,\mathcal{L}\{X(t)\} = 0, \tag{15.130}$$
and by use of Eq. 15.124 this becomes
$$ms^2x(s) - msX_0 + k\,x(s) = 0, \tag{15.131}$$
$$x(s) = X_0\frac{s}{s^2+\omega_0^2}, \qquad \text{with } \omega_0^2 = \frac{k}{m}. \tag{15.132}$$
From Eq. 15.107 this is seen to be the transform of $\cos\omega_0 t$, which gives
$$X(t) = X_0\cos\omega_0 t, \tag{15.133}$$
as expected.
EXAMPLE 15.9.2 Earth's Nutation
A somewhat more involved example is provided by the nutation of the earth's
poles (force-free precession). Treating the earth as a rigid (oblate) spheroid, the
Euler equations of motion reduce to
$$\frac{dX}{dt} = -aY, \qquad \frac{dY}{dt} = +aX, \tag{15.134}$$
where $a = [(I_z - I_x)/I_z]\,\omega_z$, $X = \omega_x$, $Y = \omega_y$, with angular
velocity vector $\boldsymbol{\omega} = (\omega_x, \omega_y, \omega_z)$ (Fig. 15.8),
$I_z$ = moment of inertia about the z-axis, and $I_y = I_x$ = moment of inertia
about the x- (or y-) axis.
FIG. 15.8
The z-axis coincides with the axis of symmetry of the earth. It differs from the
axis for the earth's daily rotation, $\boldsymbol{\omega}$, by some 15 meters, measured at the poles.
Transformation of these coupled differential equations yields
$$s\,x(s) - X(0) = -a\,y(s), \qquad s\,y(s) - Y(0) = a\,x(s). \tag{15.135}$$
Combining to eliminate y(s), we have
$$s^2x(s) - sX(0) + aY(0) = -a^2x(s)$$
or
$$x(s) = X(0)\frac{s}{s^2+a^2} - Y(0)\frac{a}{s^2+a^2}. \tag{15.136}$$
Hence
$$X(t) = X(0)\cos at - Y(0)\sin at. \tag{15.137}$$
Similarly,
$$Y(t) = X(0)\sin at + Y(0)\cos at. \tag{15.138}$$
This is seen to be a rotation of the vector (X, Y) counterclockwise (for a > 0)
about the z-axis with angle $\theta = at$ and angular velocity a.
A direct interpretation may be found by choosing the time axis so that
Y(0) = 0. Then
$$X(t) = X(0)\cos at, \qquad Y(t) = X(0)\sin at, \tag{15.139}$$
which are the parametric equations for rotation of (X, Y) in a circular orbit of
radius X(0), with angular velocity a in the counterclockwise sense.
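The rotation described by Eqs. 15.137 and 15.138 can be checked against a direct numerical integration of the coupled equations 15.134. The sketch below uses a classical fourth-order Runge-Kutta step; the function names and the sample values (a period of 300 days, X(0) = 15, Y(0) = 4) are illustrative assumptions only.

```python
import math

def rk4_nutation(x0, y0, a, t_end, steps):
    """Integrate dX/dt = -a*Y, dY/dt = +a*X from t = 0 to t_end with RK4."""
    def deriv(x, y):
        return -a * y, a * x

    h = t_end / steps
    x, y = x0, y0
    for _ in range(steps):
        k1x, k1y = deriv(x, y)
        k2x, k2y = deriv(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
        k3x, k3y = deriv(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
        k4x, k4y = deriv(x + h * k3x, y + h * k3y)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
    return x, y

def closed_form(x0, y0, a, t):
    """Eqs. 15.137 and 15.138: rotation of (X, Y) through the angle a*t."""
    return (x0 * math.cos(a * t) - y0 * math.sin(a * t),
            x0 * math.sin(a * t) + y0 * math.cos(a * t))

a = 2.0 * math.pi / 300.0          # assumed period of 300 time units
numeric = rk4_nutation(15.0, 4.0, a, 100.0, 1000)
exact = closed_form(15.0, 4.0, a, 100.0)
print(numeric, exact)
```

Because the motion is a pure rotation, the radius $\sqrt{X^2 + Y^2}$ should also stay constant along the numerical trajectory.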
In the case of the earth's angular velocity vector, X(0) is about 15 meters,
whereas a, as defined here, corresponds to a period ($2\pi/a$) of some 300 days.
Actually, because of departures from the idealized rigid body assumed in setting
up Euler's equations, the period is about 427 days.²
If in Eq. 15.134 we set
$$X(t) = L_x, \qquad Y(t) = L_y,$$
where $L_x$ and $L_y$ are the x- and y-components of the angular momentum L,
$$a = -g_L B_z,$$
$g_L$ = gyromagnetic ratio,
$B_z$ = magnetic field (along the z-axis),
Eq. 15.134 describes the Larmor precession of charged bodies in a uniform
magnetic field, $B_z$.
Dirac Delta Function
For use with differential equations one further transform is helpful, that of
the Dirac delta function:³
$$\mathcal{L}\{\delta(t-t_0)\} = \int_0^\infty e^{-st}\delta(t-t_0)\,dt = e^{-st_0}, \quad \text{for } t_0 > 0, \tag{15.140}$$
and for $t_0 = 0$
$$\mathcal{L}\{\delta(t)\} = 1, \tag{15.141}$$
where it is assumed that we are using a representation of the delta function
such that
$$\int_0^\infty \delta(t)\,dt = 1, \qquad \delta(t) = 0, \quad \text{for } t > 0. \tag{15.142}$$
² D. Menzel, ed., Fundamental Formulas of Physics, p. 695. Englewood Cliffs,
NJ: Prentice-Hall (1955).
³ Strictly speaking, the Dirac delta function is undefined. However, the integral
over it is well defined. This approach is developed in Section 8.7 using delta
sequences.
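As an alternate method, $\delta(t)$ may be considered the limit as $\varepsilon \to 0$ of F(t), where
$$F(t) = \begin{cases} 0, & t < 0,\\ \varepsilon^{-1}, & 0 < t < \varepsilon,\\ 0, & t > \varepsilon. \end{cases} \tag{15.143}$$
By direct calculation
$$\mathcal{L}\{F(t)\} = \frac{1 - e^{-\varepsilon s}}{\varepsilon s}. \tag{15.144}$$
Taking the limit of the integral (instead of the integral of the limit), we have
$$\lim_{\varepsilon\to 0}\mathcal{L}\{F(t)\} = 1,$$
or Eq. 15.141.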
This delta function is frequently called the impulse function because it is so
useful in describing impulsive forces, that is, forces lasting only a short time.
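The limit in Eq. 15.144 is easy to observe numerically. The short sketch below evaluates the transform of the rectangular pulse of width ε and height 1/ε for shrinking ε; the function name `pulse_transform` and the chosen value s = 2 are assumptions made for illustration.

```python
import math

def pulse_transform(eps, s):
    """Eq. 15.144: transform of a pulse of width eps and height 1/eps.

    -expm1(-x) equals 1 - exp(-x) but avoids cancellation for small x.
    """
    x = eps * s
    return -math.expm1(-x) / x

s = 2.0  # illustrative value
for eps in (1.0, 0.1, 0.01, 1e-4, 1e-6):
    print(eps, pulse_transform(eps, s))
```

The printed values approach 1, the transform of the delta function (Eq. 15.141), as ε decreases.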
EXAMPLE 15.9.3 Impulsive Force
Newton's second law for impulsive force acting on a particle of mass m
becomes
$$m\frac{d^2X}{dt^2} = P\,\delta(t), \tag{15.145}$$
where P is a constant.
Transforming, we obtain
$$ms^2x(s) - msX(0) - mX'(0) = P. \tag{15.146}$$
For a particle starting from rest X'(0) = 0.⁴ We shall also take X(0) = 0. Then
$$x(s) = \frac{P}{ms^2}, \tag{15.147}$$
and
$$X(t) = \frac{P}{m}t, \tag{15.148}$$
$$\frac{dX(t)}{dt} = \frac{P}{m}, \quad \text{a constant.} \tag{15.149}$$
The effect of the impulse $P\delta(t)$ is to transfer (instantaneously) P units of
linear momentum to the particle.
⁴ This really should be X'(+0). To include the effect of the impulse, consider
that the impulse will occur at $t = \varepsilon$ and let $\varepsilon \to 0$.
A similar analysis applies to the ballistic galvanometer. The torque on the
galvanometer is given initially by ki, in which i is a pulse of current and k is a
proportionality constant. Since i is of short duration, we set
$$ki = kq\,\delta(t), \tag{15.150}$$
where q is the total charge carried by the current i. Then, with I the moment
of inertia,
$$I\frac{d^2\theta}{dt^2} = kq\,\delta(t), \tag{15.151}$$
and transforming as before, we find that the effect of the current pulse is a
transfer of kq units of angular momentum to the galvanometer.
EXERCISES
15.9.1 Use the expression for the transform of a second derivative to obtain the transform
of cos kt.
15.9.2 A mass m is attached to one end of an unstretched spring, spring constant k.
At time t = 0 the free end of the spring experiences a constant acceleration a,
away from the mass. Using Laplace transforms,
(a) Find the position x of m as a function of time.
(b) Determine the limiting form of x(t) for small t.
ANS. (a) $x = \tfrac{1}{2}at^2 - \dfrac{a}{\omega^2}(1 - \cos\omega t)$, $\quad\omega^2 = \dfrac{k}{m}$,
(b) $x = \dfrac{a\omega^2}{4!}t^4$, $\quad\omega t \ll 1$.
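15.9.3 Radioactive nuclei decay according to the law
$$\frac{dN}{dt} = -\lambda N,$$
N being the concentration of a given nuclide and $\lambda$, the particular decay constant.
This equation may be interpreted as stating that the rate of decay is proportional
to the number of these radioactive nuclei present. They all decay independently.
In a radioactive series of n different nuclides, starting with $N_1$,
$$\frac{dN_1}{dt} = -\lambda_1 N_1,$$
$$\frac{dN_2}{dt} = \lambda_1 N_1 - \lambda_2 N_2, \quad \text{and so on,}$$
$$\frac{dN_n}{dt} = \lambda_{n-1}N_{n-1}, \quad \text{stable.}$$
Find $N_1(t)$, $N_2(t)$, and $N_3(t)$, n = 3, with $N_1(0) = N_0$, $N_2(0) = N_3(0) = 0$.
ANS. $N_1(t) = N_0 e^{-\lambda_1 t}$,
$N_2(t) = N_0\dfrac{\lambda_1}{\lambda_2 - \lambda_1}\left(e^{-\lambda_1 t} - e^{-\lambda_2 t}\right)$,
$N_3(t) = N_0\left(1 - \dfrac{\lambda_2}{\lambda_2 - \lambda_1}e^{-\lambda_1 t} + \dfrac{\lambda_1}{\lambda_2 - \lambda_1}e^{-\lambda_2 t}\right)$.
Find an approximate expression for $N_2$ and $N_3$, valid for small t when $\lambda_1 \approx \lambda_2$.
ANS. $N_2 \approx N_0\lambda_1 t$,
$N_3 \approx \dfrac{N_0}{2}\lambda_1\lambda_2 t^2$.
Find approximate expressions for $N_2$ and $N_3$, valid for large t, when
(a) $\lambda_1 \gg \lambda_2$,
(b) $\lambda_1 \ll \lambda_2$.
ANS. (a) $N_2 \approx N_0 e^{-\lambda_2 t}$,
$N_3 \approx N_0(1 - e^{-\lambda_2 t})$, $\quad\lambda_1 t \gg 1$.
(b) $N_2 \approx N_0\dfrac{\lambda_1}{\lambda_2}e^{-\lambda_1 t}$,
$N_3 \approx N_0(1 - e^{-\lambda_1 t})$, $\quad\lambda_2 t \gg 1$.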
15.9.4 The formation of an isotope in a nuclear reactor is given by
$$\frac{dN_2}{dt} = nv\sigma_1 N_{10} - \lambda_2 N_2(t) - nv\sigma_2 N_2(t).$$
Here the product nv is the neutron flux, neutrons per cubic centimeter, times
centimeters per second mean velocity; $\sigma_1$ and $\sigma_2$ (cm²) are measures of the
probability of neutron absorption by the original isotope, concentration $N_{10}$, which
is assumed constant, and the newly formed isotope, concentration $N_2$, respectively.
The radioactive decay constant for the isotope is $\lambda_2$.
(a) Find the concentration $N_2$ of the new isotope as a function of time.
(b) If the original element is Eu¹⁵³, $\sigma_1$ = 400 barns = $400 \times 10^{-24}$ cm², $\sigma_2$ =
1000 barns = $1000 \times 10^{-24}$ cm², and $\lambda_2 = 1.4 \times 10^{-9}$ sec⁻¹. If $N_{10} = 10^{20}$
and $(nv) = 10^9$ cm⁻² sec⁻¹, find $N_2$, the concentration of Eu¹⁵⁴ after one
year of continuous irradiation. Is the assumption that $N_1$ is constant
justified?
15.9.5 In a nuclear reactor Xe¹³⁵ is formed as both a direct fission product and a decay
product of I¹³⁵, half-life, 6.7 hours. The half-life of Xe¹³⁵ is 9.2 hours. As Xe¹³⁵
strongly absorbs thermal neutrons, thereby "poisoning" the nuclear reactor, its
concentration is a matter of great interest. The relevant equations are
$$\frac{dN_I}{dt} = \gamma_I\varphi\sigma_f N_U - \lambda_I N_I,$$
$$\frac{dN_X}{dt} = \lambda_I N_I + \gamma_X\varphi\sigma_f N_U - \lambda_X N_X - \varphi\sigma_X N_X.$$
Here $N_I$ = concentration of I¹³⁵ ($N_X$ for Xe¹³⁵, $N_U$ for U²³⁵). Assume $N_U$ = constant.
$\gamma_I$ = yield of I¹³⁵ per fission = 0.060,
$\gamma_X$ = yield of Xe¹³⁵ direct from fission = 0.003,
$\lambda_I$ ($\lambda_X$) = I¹³⁵ (Xe¹³⁵) decay constant = $\dfrac{\ln 2}{t_{1/2}} = \dfrac{0.693}{t_{1/2}}$,
$\sigma_f$ = thermal neutron fission cross section for U²³⁵,
$\sigma_X$ = thermal neutron absorption cross section for Xe¹³⁵ = $3.5 \times 10^6$ barns
= $3.5 \times 10^{-18}$ cm².
($\sigma_I$, the absorption cross section of I¹³⁵, is negligible.)
$\varphi$ = neutron flux = neutrons/cm³ × mean velocity (cm/sec).
(a) Find $N_X(t)$ in terms of neutron flux $\varphi$ and the product $\sigma_f N_U$.
(b) Find $N_X(t \to \infty)$.
(c) After $N_X$ has reached equilibrium, the reactor is shut down, $\varphi = 0$. Find
$N_X(t)$ following shutdown. Notice the increase in $N_X$, which may for a few
hours interfere with starting the reactor up again.
15.10 OTHER PROPERTIES
Substitution
If we replace the parameter s by s − a in the definition of the Laplace
transform (Eq. 15.99), we have
$$f(s-a) = \int_0^\infty e^{-(s-a)t}F(t)\,dt = \int_0^\infty e^{-st}e^{at}F(t)\,dt = \mathcal{L}\{e^{at}F(t)\}. \tag{15.152}$$
Hence the replacement of s with s − a corresponds to multiplying F(t) by $e^{at}$
and conversely. This result can be used to good advantage in extending our
table of transforms. From Eq. 15.107 we find immediately that
$$\mathcal{L}\{e^{at}\sin kt\} = \frac{k}{(s-a)^2+k^2}, \tag{15.153}$$
also
$$\mathcal{L}\{e^{at}\cos kt\} = \frac{s-a}{(s-a)^2+k^2}, \quad s > a.$$
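A quick numerical check of the substitution property (Eqs. 15.152 and 15.153) is sketched below. The Laplace integral is approximated by Simpson's rule on a truncated interval; the helper `laplace` and the sample values a = 0.5, k = 2, s = 2 are assumptions chosen for illustration.

```python
import math

def laplace(F, s, t_max=40.0, n=40000):
    """Simpson approximation of the Laplace integral on [0, t_max] (n even)."""
    h = t_max / n
    total = F(0.0) + math.exp(-s * t_max) * F(t_max)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-s * t) * F(t)
    return total * h / 3.0

a, k, s = 0.5, 2.0, 2.0  # illustrative values with s > a

# Direct transforms of exp(at) sin kt and exp(at) cos kt ...
num_sin = laplace(lambda t: math.exp(a * t) * math.sin(k * t), s)
num_cos = laplace(lambda t: math.exp(a * t) * math.cos(k * t), s)

# ... compared with the table entries evaluated at the shifted argument s - a.
shift_sin = k / ((s - a) ** 2 + k ** 2)
shift_cos = (s - a) / ((s - a) ** 2 + k ** 2)

print(num_sin, shift_sin)
print(num_cos, shift_cos)
```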
EXAMPLE 15.10.1 Damped Oscillator
These expressions are useful when we consider an oscillating mass with
damping proportional to the velocity. Equation 15.129, with such damping
added, becomes
$$mX''(t) + bX'(t) + kX(t) = 0, \tag{15.154}$$
in which b is a proportionality constant. Let us assume that the particle starts
from rest at $X(0) = X_0$, $X'(0) = 0$. The transformed equation is
$$m[s^2x(s) - sX_0] + b[s\,x(s) - X_0] + k\,x(s) = 0 \tag{15.155}$$
and
$$x(s) = X_0\frac{ms + b}{ms^2 + bs + k}. \tag{15.156}$$
This may be handled by completing the square of the denominator,
$$s^2 + \frac{b}{m}s + \frac{k}{m} = \left(s + \frac{b}{2m}\right)^2 + \left(\frac{k}{m} - \frac{b^2}{4m^2}\right). \tag{15.157}$$
If the damping is small, $b^2 < 4km$, the last term is positive and will be denoted
by $\omega_1^2$.
$$x(s) = X_0\frac{s + b/m}{(s + b/2m)^2 + \omega_1^2} = X_0\frac{s + b/2m}{(s + b/2m)^2 + \omega_1^2} + X_0\frac{(b/2m\omega_1)\,\omega_1}{(s + b/2m)^2 + \omega_1^2}. \tag{15.158}$$
By Eq. 15.153
$$X(t) = X_0e^{-(b/2m)t}\left(\cos\omega_1 t + \frac{b}{2m\omega_1}\sin\omega_1 t\right) = X_0\frac{\omega_0}{\omega_1}e^{-(b/2m)t}\cos(\omega_1 t - \varphi), \tag{15.159}$$
where
$$\tan\varphi = \frac{b}{2m\omega_1}, \qquad \omega_0^2 = \frac{k}{m}.$$
Of course, as $b \to 0$, this solution goes over to the undamped solution
(Section 15.9).
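The closed form of Eq. 15.159 can be compared with a direct numerical integration of Eq. 15.154. The sketch below assumes the underdamped case $b^2 < 4km$ and uses a fourth-order Runge-Kutta step; the function names and the sample values m = 1, b = 0.4, k = 4 are illustrative assumptions.

```python
import math

def damped_closed_form(t, x0, m, b, k):
    """First form of Eq. 15.159 (requires b*b < 4*k*m)."""
    g = b / (2.0 * m)
    w1 = math.sqrt(k / m - g * g)
    return x0 * math.exp(-g * t) * (math.cos(w1 * t) + (g / w1) * math.sin(w1 * t))

def damped_phase_form(t, x0, m, b, k):
    """Second form of Eq. 15.159, with amplitude w0/w1 and phase phi."""
    g = b / (2.0 * m)
    w0 = math.sqrt(k / m)
    w1 = math.sqrt(k / m - g * g)
    phi = math.atan(g / w1)
    return x0 * (w0 / w1) * math.exp(-g * t) * math.cos(w1 * t - phi)

def damped_rk4(t_end, x0, m, b, k, steps=5000):
    """Integrate m X'' + b X' + k X = 0 with X(0) = x0, X'(0) = 0."""
    def deriv(x, v):
        return v, -(b * v + k * x) / m

    h = t_end / steps
    x, v = x0, 0.0
    for _ in range(steps):
        k1x, k1v = deriv(x, v)
        k2x, k2v = deriv(x + 0.5 * h * k1x, v + 0.5 * h * k1v)
        k3x, k3v = deriv(x + 0.5 * h * k2x, v + 0.5 * h * k2v)
        k4x, k4v = deriv(x + h * k3x, v + h * k3v)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return x

params = dict(x0=1.0, m=1.0, b=0.4, k=4.0)  # illustrative values
print(damped_rk4(5.0, **params), damped_closed_form(5.0, **params))
```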
RLC Analog
It is worth noting the similarity between this damped simple harmonic
oscillation of a mass on a spring and an RLC circuit (resistance, inductance,
and capacitance) (Fig. 15.9). At any instant the sum of the potential differences
around the loop must be zero (Kirchhoff's law, conservation of energy). This
gives
$$L\frac{dI}{dt} + RI + \frac{1}{C}\int I\,dt = 0. \tag{15.160}$$
Differentiating with respect to time (to eliminate the integral),
we have
$$L\frac{d^2I}{dt^2} + R\frac{dI}{dt} + \frac{1}{C}I = 0. \tag{15.161}$$
FIG. 15.9 RLC circuit
If we replace I(t) with X(t), L with m, R with b, and $C^{-1}$ with k, Eq. 15.161 is identical
with the mechanical problem. It is but one example of the unification of diverse
branches of physics by mathematics. A more complete discussion will be found
in Olson's book.¹
Translation
This time let f(s) be multiplied by $e^{-bs}$, b > 0.
$$e^{-bs}f(s) = e^{-bs}\int_0^\infty e^{-st}F(t)\,dt = \int_0^\infty e^{-s(t+b)}F(t)\,dt. \tag{15.162}$$
Now let $t + b = \tau$. Equation 15.162 becomes
$$e^{-bs}f(s) = \int_b^\infty e^{-s\tau}F(\tau - b)\,d\tau = \int_0^\infty e^{-s\tau}F(\tau - b)\,u(\tau - b)\,d\tau, \tag{15.163}$$
where $u(\tau - b)$ is the unit step function. This relation is often called the
"Heaviside shifting theorem" (Fig. 15.10).
FIG. 15.10 Translation
Since F(t) is assumed to be equal to zero for t < 0, $F(\tau - b) = 0$ for $0 \le \tau < b$.
Therefore we can extend the lower limit to zero without changing the value of
the integral. Then, noting that $\tau$ is only a variable of integration, we obtain
$$e^{-bs}f(s) = \mathcal{L}\{F(t - b)\}. \tag{15.164}$$
¹ H. F. Olson, Dynamical Analogies. New York: Van Nostrand (1943).
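The translation property (Eq. 15.164) can be verified numerically for a particular choice of F(t). The sketch below takes F(t) = sin t, whose transform is $1/(s^2+1)$, and delays it by b; the helper names and the values b = 1.5, s = 1 are assumptions for illustration.

```python
import math

def laplace(F, s, t_max=40.0, n=40000):
    """Simpson approximation of the Laplace integral on [0, t_max] (n even)."""
    h = t_max / n
    total = F(0.0) + math.exp(-s * t_max) * F(t_max)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-s * t) * F(t)
    return total * h / 3.0

b, s = 1.5, 1.0  # illustrative delay and transform variable

def delayed_sine(t):
    """F(t - b) u(t - b) with F(t) = sin t: zero until t = b, then sin(t - b)."""
    return math.sin(t - b) if t > b else 0.0

lhs = laplace(delayed_sine, s)               # transform of the delayed function
rhs = math.exp(-b * s) / (s * s + 1.0)       # exp(-bs) f(s), Eq. 15.164
print(lhs, rhs)
```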
EXAMPLE 15.10.2 Electromagnetic Waves
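The electromagnetic wave equation with $E = E_y$ or $E_z$, a transverse wave
propagating along the x-axis, is
$$\frac{\partial^2E(x,t)}{\partial x^2} - \frac{1}{v^2}\frac{\partial^2E(x,t)}{\partial t^2} = 0. \tag{15.165}$$
Transforming this equation with respect to t, we get
$$\frac{\partial^2}{\partial x^2}\mathcal{L}\{E(x,t)\} - \frac{s^2}{v^2}\mathcal{L}\{E(x,t)\} + \frac{s}{v^2}E(x,0) + \frac{1}{v^2}\frac{\partial E(x,t)}{\partial t}\Big|_{t=0} = 0. \tag{15.166}$$
If we have the initial condition E(x, 0) = 0 and
$$\frac{\partial E(x,t)}{\partial t}\Big|_{t=0} = 0,$$
then
$$\frac{\partial^2}{\partial x^2}\mathcal{L}\{E(x,t)\} = \frac{s^2}{v^2}\mathcal{L}\{E(x,t)\}. \tag{15.167}$$
The solution (of this ordinary differential equation) is
$$\mathcal{L}\{E(x,t)\} = c_1e^{-(s/v)x} + c_2e^{+(s/v)x}. \tag{15.168}$$
The "constants" $c_1$ and $c_2$ are obtained by additional boundary conditions.
They are constant with respect to x but may depend on s. If our wave remains
finite as $x \to \infty$, $\mathcal{L}\{E(x,t)\}$ will also remain finite. Hence $c_2 = 0$.
If E(0, t) is denoted by F(t), then $c_1 = f(s)$ and
$$\mathcal{L}\{E(x,t)\} = e^{-(s/v)x}f(s). \tag{15.169}$$
From the translation property (Eq. 15.164) we find immediately that
$$E(x,t) = \begin{cases} F\left(t - \dfrac{x}{v}\right), & t \ge \dfrac{x}{v},\\[4pt] 0, & t < \dfrac{x}{v}. \end{cases} \tag{15.170}$$
Differentiation and substitution into Eq. 15.165 verifies Eq. 15.170. Our solution
represents a wave (or pulse) moving in the positive x-direction with velocity v.
Note that for x > vt the region remains undisturbed; the pulse has not had
time to get there. If we had wanted a signal propagated along the negative
x-axis, $c_1$ would have been set equal to 0 and we would have obtained
$$E(x,t) = \begin{cases} F\left(t + \dfrac{x}{v}\right), & t \ge -\dfrac{x}{v},\\[4pt] 0, & t < -\dfrac{x}{v}, \end{cases} \tag{15.171}$$
a wave along the negative x-axis.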
Derivative of a Transform
When F(t), which is at least piecewise continuous, and s are chosen so that
$e^{-st}F(t)$ converges exponentially for large s, the integral
$$\int_0^\infty e^{-st}F(t)\,dt$$
is uniformly convergent and may be differentiated (under the integral sign)
with respect to s. Then
$$f'(s) = \int_0^\infty (-t)e^{-st}F(t)\,dt = \mathcal{L}\{-tF(t)\}. \tag{15.172}$$
Continuing this process, we obtain
$$f^{(n)}(s) = \mathcal{L}\{(-t)^nF(t)\}. \tag{15.173}$$
All the integrals so obtained will be uniformly convergent because of the
decreasing exponential behavior of $e^{-st}F(t)$.
This same technique may be applied to generate more transforms. For
example,
$$\mathcal{L}\{e^{kt}\} = \int_0^\infty e^{-st}e^{kt}\,dt = \frac{1}{s-k}, \quad s > k. \tag{15.174}$$
Differentiating with respect to s (or with respect to k), we obtain
$$\mathcal{L}\{te^{kt}\} = \frac{1}{(s-k)^2}, \quad s > k. \tag{15.175}$$
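Equations 15.172 and 15.175 can be illustrated numerically: the transform of $te^{kt}$ should equal minus the s-derivative of $1/(s-k)$. In the sketch below the derivative is approximated by a central difference; the helper `laplace`, the step size, and the values k = 0.5, s = 2 are assumptions for illustration.

```python
import math

def laplace(F, s, t_max=40.0, n=40000):
    """Simpson approximation of the Laplace integral on [0, t_max] (n even)."""
    h = t_max / n
    total = F(0.0) + math.exp(-s * t_max) * F(t_max)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-s * t) * F(t)
    return total * h / 3.0

k, s = 0.5, 2.0  # illustrative values with s > k

def f(x):
    """Transform of exp(kt), Eq. 15.174."""
    return 1.0 / (x - k)

numeric = laplace(lambda t: t * math.exp(k * t), s)   # transform of t exp(kt)
closed = 1.0 / (s - k) ** 2                           # Eq. 15.175
step = 1e-5
minus_fprime = -(f(s + step) - f(s - step)) / (2.0 * step)  # -f'(s), Eq. 15.172

print(numeric, closed, minus_fprime)
```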
EXAMPLE 15.10.3 Bessel's Equation
An interesting application of a differentiated Laplace transform appears in
the solution of Bessel's equation with n = 0. From Chapter 11 we have
$$x^2y''(x) + xy'(x) + x^2y(x) = 0. \tag{15.176}$$
Dividing by x and substituting t = x and F(t) = y(x) to agree with the present
notation, we see that the Bessel equation becomes
$$tF''(t) + F'(t) + tF(t) = 0. \tag{15.177}$$
We need a regular solution, in particular, F(0) = 1. From Eq. 15.177 with t = 0,
F'(+0) = 0. Also, we assume that our unknown F(t) has a transform. Then,
transforming and using Eqs. 15.124 and 15.172, we have
$$-\frac{d}{ds}\left[s^2f(s) - s\right] + sf(s) - 1 - \frac{d}{ds}f(s) = 0. \tag{15.178}$$
Rearranging Eq. 15.178, we obtain
$$(s^2 + 1)f'(s) + sf(s) = 0 \tag{15.179}$$
or
$$\frac{df}{f} = -\frac{s\,ds}{s^2+1}, \tag{15.180}$$
a first-order differential equation. By integration,
$$\ln f(s) = -\tfrac{1}{2}\ln(s^2 + 1) + \ln C, \tag{15.181}$$
which may be rewritten as
$$f(s) = \frac{C}{\sqrt{s^2+1}}. \tag{15.182}$$
To make use of Eq. 15.108, we expand f(s) in a series of negative powers of s,
convergent for s > 1:
$$f(s) = \frac{C}{s}\left(1 + \frac{1}{s^2}\right)^{-1/2} = \frac{C}{s}\left[1 - \frac{1}{2s^2} + \frac{1\cdot 3}{2^2\cdot 2!\,s^4} - \cdots + \frac{(-1)^n(2n)!}{(2^nn!)^2s^{2n}} + \cdots\right]. \tag{15.183}$$
Inverting, term by term, we obtain
$$F(t) = C\sum_{n=0}^{\infty}\frac{(-1)^nt^{2n}}{(2^nn!)^2}. \tag{15.184}$$
When C is set equal to 1, as required by the initial condition F(0) = 1, F(t) is
just $J_0(t)$, our familiar Bessel function of order zero. Hence
$$\mathcal{L}\{J_0(t)\} = \frac{1}{\sqrt{s^2+1}}. \tag{15.185}$$
Note that we assumed s > 1. The proof for s > 0 is left as a problem.
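The two series of this example (Eqs. 15.183 and 15.184, with C = 1) are easy to sum numerically. The sketch below checks that the series in negative powers of s reproduces $1/\sqrt{s^2+1}$ for s > 1 and that the power series in t vanishes at the first zero of $J_0$; the function names, the numbers of terms, and the quoted value of that zero (2.404825557695773) are assumptions supplied for the check.

```python
import math

def j0_series(t, terms=40):
    """Partial sum of Eq. 15.184 with C = 1 (the Maclaurin series of J0)."""
    return sum((-1) ** n * t ** (2 * n) / (2 ** n * math.factorial(n)) ** 2
               for n in range(terms))

def transform_series(s, terms=60):
    """Partial sum of Eq. 15.183 with C = 1; converges for s > 1."""
    return sum((-1) ** n * math.factorial(2 * n)
               / ((2 ** n * math.factorial(n)) ** 2 * s ** (2 * n + 1))
               for n in range(terms))

s = 2.0  # illustrative value inside the region of convergence
print(transform_series(s), 1.0 / math.sqrt(s * s + 1.0))
print(j0_series(0.0), j0_series(2.404825557695773))
```

Each term of `transform_series` is the transform of the matching term of `j0_series`, since the transform of $t^{2n}$ is $(2n)!/s^{2n+1}$.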
It is perhaps worth noting that this application was successful and relatively
easy because we took n = 0 in Bessel's equation. This made it possible to divide
out a factor of x (or t). If this had not been done, the terms of the form $t^2F(t)$
would have introduced a second derivative of f(s). The resulting equation
would have been no easier to solve than the original one.
When we go beyond linear differential equations with constant coefficients,
the Laplace transform may still be applied, but there is no guarantee that it
will be helpful.
The application to Bessel's equation, $n \ne 0$, will be found in the references.
Alternatively, we can show that
$$\mathcal{L}\{J_n(at)\} = \frac{a^{-n}\left(\sqrt{s^2+a^2} - s\right)^n}{\sqrt{s^2+a^2}} \tag{15.186}$$
by expressing $J_n(t)$ as an infinite series and transforming term by term.
Integration of Transforms
Again, with F(t) at least piecewise continuous and x large enough so that
$e^{-xt}F(t)$ decreases exponentially (as $x \to \infty$), the integral
$$f(x) = \int_0^\infty e^{-xt}F(t)\,dt \tag{15.187}$$
is uniformly convergent with respect to x. This justifies reversing the order of
integration in the following equation:
$$\int_s^b f(x)\,dx = \int_s^b\int_0^\infty e^{-xt}F(t)\,dt\,dx = \int_0^\infty \frac{F(t)}{t}\left(e^{-st} - e^{-bt}\right)dt, \tag{15.188}$$
on integrating with respect to x. The lower limit s is chosen large enough so that
f(s) is within the region of uniform convergence. Now letting $b \to \infty$, we have
$$\int_s^\infty f(x)\,dx = \int_0^\infty \frac{F(t)}{t}e^{-st}\,dt = \mathcal{L}\left\{\frac{F(t)}{t}\right\}, \tag{15.189}$$
provided that F(t)/t is finite at t = 0 or diverges less strongly than $t^{-1}$ (so that
$\mathcal{L}\{F(t)/t\}$ will exist).
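As an illustration of Eq. 15.189, take F(t) = sin t, for which $f(x) = 1/(x^2+1)$. The integral of f from s to infinity is $\pi/2 - \tan^{-1}s$, so the transform of (sin t)/t should equal $\tan^{-1}(1/s)$ for s > 0. The sketch below checks this numerically; the helper names and the value s = 1 are illustrative assumptions.

```python
import math

def laplace(F, s, t_max=40.0, n=40000):
    """Simpson approximation of the Laplace integral on [0, t_max] (n even)."""
    h = t_max / n
    total = F(0.0) + math.exp(-s * t_max) * F(t_max)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-s * t) * F(t)
    return total * h / 3.0

def sin_over_t(t):
    """F(t)/t for F(t) = sin t, with the finite limit 1 at t = 0."""
    return 1.0 if t == 0.0 else math.sin(t) / t

s = 1.0  # illustrative value
numeric = laplace(sin_over_t, s)        # left side: transform of F(t)/t
integrated = math.atan(1.0 / s)         # right side: integral of 1/(x^2+1) from s up
print(numeric, integrated)
```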
Limits of Integration—Unit Step Function
The actual limits of integration for the Laplace transform may be specified
with the (Heaviside) unit step function
$$u(t-k) = \begin{cases} 0, & t < k,\\ 1, & t > k. \end{cases}$$
For instance,
$$\mathcal{L}\{u(t-k)\} = \int_k^\infty e^{-st}\,dt = \frac{1}{s}e^{-ks}.$$
A rectangular pulse of width k and unit height is described by F(t) =
u(t) − u(t − k). Taking the Laplace transform, we obtain
$$\mathcal{L}\{u(t) - u(t-k)\} = \int_0^k e^{-st}\,dt = \frac{1}{s}\left(1 - e^{-ks}\right).$$
The unit step function is also used in Eq. 15.163 and could be invoked in
Exercise 15.10.13.
EXERCISES
15.10.1 Solve Eq. 15.154, which describes a damped simple harmonic oscillator for
$X(0) = X_0$, $X'(0) = 0$, and
(a) $b^2 = 4km$ (critically damped),
(b) $b^2 > 4km$ (overdamped).
ANS. (a) $X(t) = X_0e^{-(b/2m)t}\left(1 + \dfrac{b}{2m}t\right)$.
15.10.2 Solve Eq. 15.154, which describes a damped simple harmonic oscillator for
$X(0) = 0$, $X'(0) = v_0$, and
(a) $b^2 < 4km$ (underdamped),
(b) $b^2 = 4km$ (critically damped),
(c) $b^2 > 4km$ (overdamped).
ANS. (a) $X(t) = \dfrac{v_0}{\omega_1}e^{-(b/2m)t}\sin\omega_1t$,
(b) $X(t) = v_0te^{-(b/2m)t}$.
15.10.3 The motion of a body falling in a resisting medium may be described by
$$m\frac{d^2X(t)}{dt^2} = mg - b\frac{dX(t)}{dt}$$
when the retarding force is proportional to the velocity. Find X(t) and dX(t)/dt
for the initial conditions
$$X(0) = \frac{dX}{dt}\Big|_{t=0} = 0.$$
15.10.4 Ringing circuit. In certain electronic circuits resistance, inductance, and
capacitance are placed in the plate circuit in parallel (Fig. 15.11). A constant
voltage is maintained across the parallel elements, keeping the capacitor
charged. At time t = 0 the circuit is disconnected from the voltage source.
FIG. 15.11 Ringing circuit
Find the voltages across the parallel elements R, L, and C as a function of
time. Assume R to be large.
Hint. By Kirchhoff's laws
$$I_R + I_C + I_L = 0 \quad\text{and}\quad E_R = E_C = E_L,$$
where
$$E_R = I_RR, \qquad E_C = \frac{q_0}{C} + \frac{1}{C}\int_0^t I_C\,dt, \qquad E_L = L\frac{dI_L}{dt},$$
$q_0$ = initial charge of capacitor.
With the DC impedance of L = 0, let $I_L(0) = I_0$, $E_L(0) = 0$. This means $q_0 = 0$.
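15.10.5 With $J_0(t)$ expressed as a contour integral, apply the Laplace transform
operation, reverse the order of integration, and thus show that
$$\mathcal{L}\{J_0(t)\} = (s^2 + 1)^{-1/2}, \quad\text{for } s > 0.$$
15.10.6 Develop the Laplace transform of $J_n(t)$ from $\mathcal{L}\{J_0(t)\}$ by using the Bessel
function recurrence relations.
Hint. Here is a chance to use mathematical induction.
15.10.7 A calculation of the magnetic field of a circular current loop in circular
cylindrical coordinates leads to the integral
$$\int_0^\infty e^{-kz}kJ_1(ka)\,dk, \quad \Re(z) > 0.$$
Show that this integral is equal to $a/(z^2 + a^2)^{3/2}$.
15.10.8 The electrostatic potential of a point charge q at the origin in circular
cylindrical coordinates is
$$\frac{q}{4\pi\varepsilon_0}\int_0^\infty e^{-kz}J_0(k\rho)\,dk = \frac{q}{4\pi\varepsilon_0}\cdot\frac{1}{(\rho^2+z^2)^{1/2}}, \quad \Re(z) > 0.$$
From this relation show that the Fourier cosine and sine transforms of $J_0(k\rho)$
are
(a) $\sqrt{\dfrac{\pi}{2}}\,F_c\{J_0(k\rho)\} = \displaystyle\int_0^\infty J_0(k\rho)\cos k\zeta\,dk = \begin{cases} (\rho^2 - \zeta^2)^{-1/2}, & \rho > \zeta,\\ 0, & \rho < \zeta, \end{cases}$
(b) $\sqrt{\dfrac{\pi}{2}}\,F_s\{J_0(k\rho)\} = \displaystyle\int_0^\infty J_0(k\rho)\sin k\zeta\,dk = \begin{cases} 0, & \rho > \zeta,\\ (\zeta^2 - \rho^2)^{-1/2}, & \rho < \zeta. \end{cases}$
Hint. Replace z by $z + i\zeta$ and take the limit as $z \to 0$.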
15.10.9 Show that
$$\mathcal{L}\{I_0(at)\} = (s^2 - a^2)^{-1/2}, \quad s > a.$$
15.10.10 Verify the following Laplace transforms:
(a) $\mathcal{L}\{j_0(at)\} = \mathcal{L}\left\{\dfrac{\sin at}{at}\right\} = \dfrac{1}{a}\cot^{-1}\left(\dfrac{s}{a}\right)$,
(b) $\mathcal{L}\{n_0(at)\}$ does not exist,
(c) $\mathcal{L}\{i_0(at)\} = \mathcal{L}\left\{\dfrac{\sinh at}{at}\right\} = \dfrac{1}{2a}\ln\dfrac{s+a}{s-a} = \dfrac{1}{a}\coth^{-1}\left(\dfrac{s}{a}\right)$,
(d) $\mathcal{L}\{k_0(at)\}$ does not exist.
15.10.11 Develop a Laplace transform solution of Laguerre's equation
$$tF''(t) + (1 - t)F'(t) + nF(t) = 0.$$
Note that you need a derivative of a transform and a transform of derivatives.
Go as far as you can with n = n; then (and only then) set n = 0.
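15.10.12 Show that the Laplace transform of the Laguerre polynomial $L_n(at)$ is given
by
$$\mathcal{L}\{L_n(at)\} = \frac{(s-a)^n}{s^{n+1}}, \quad s > 0.$$
15.10.13 Show that
$$\mathcal{L}\{E_1(t)\} = \frac{1}{s}\ln(s + 1), \quad s > 0,$$
where
$$E_1(t) = \int_t^\infty \frac{e^{-\tau}}{\tau}\,d\tau = \int_1^\infty \frac{e^{-xt}}{x}\,dx.$$
$E_1(t)$ is the exponential-integral function.
15.10.14 (a) From Eq. 15.189 show that
$$\int_0^\infty f(x)\,dx = \int_0^\infty \frac{F(t)}{t}\,dt,$$
provided the integrals exist.
(b) From the preceding result show that
$$\int_0^\infty \frac{\sin t}{t}\,dt = \frac{\pi}{2},$$
in agreement with Eqs. 15.122 and 7.41.
15.10.15 (a) Show that
$$\mathcal{L}\left\{\frac{\sin kt}{t}\right\} = \cot^{-1}\left(\frac{s}{k}\right).$$
(b) Using this result (with k = 1), prove that
$$\mathcal{L}\{\mathrm{si}(t)\} = -\frac{1}{s}\tan^{-1}s,$$
where
$$\mathrm{si}(t) = -\int_t^\infty \frac{\sin x}{x}\,dx, \quad\text{the sine integral.}$$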
15.10.16 If F(t) is periodic (Fig. 15.12) with a period a so that F(t + a) = F(t) for all
$t \ge 0$, show that
$$\mathcal{L}\{F(t)\} = \frac{\displaystyle\int_0^a e^{-st}F(t)\,dt}{1 - e^{-as}},$$
with the integration now over only the first period of F(t).
FIG. 15.12 Periodic function
15.10.17 Find the Laplace transform of the square wave (period a) defined by
$$F(t) = \begin{cases} 1, & 0 < t < a/2,\\ 0, & a/2 < t < a. \end{cases}$$
ANS. $f(s) = \dfrac{1}{s}\cdot\dfrac{1 - e^{-as/2}}{1 - e^{-as}}$.
15.10.18 Show that
(a) $\mathcal{L}\{\cosh at\cos at\} = \dfrac{s^3}{s^4 + 4a^4}$,
(b) $\mathcal{L}\{\cosh at\sin at\} = \dfrac{as^2 + 2a^3}{s^4 + 4a^4}$,
(c) $\mathcal{L}\{\sinh at\cos at\} = \dfrac{as^2 - 2a^3}{s^4 + 4a^4}$,
(d) $\mathcal{L}\{\sinh at\sin at\} = \dfrac{2a^2s}{s^4 + 4a^4}$.
15.10.19 Show that
(a) $\mathcal{L}^{-1}\{(s^2 + a^2)^{-2}\} = \dfrac{1}{2a^3}\sin at - \dfrac{1}{2a^2}t\cos at$,
(b) $\mathcal{L}^{-1}\{s(s^2 + a^2)^{-2}\} = \dfrac{1}{2a}t\sin at$,
(c) $\mathcal{L}^{-1}\{s^2(s^2 + a^2)^{-2}\} = \dfrac{1}{2a}\sin at + \dfrac{1}{2}t\cos at$,
(d) $\mathcal{L}^{-1}\{s^3(s^2 + a^2)^{-2}\} = \cos at - \dfrac{a}{2}t\sin at$.
15.10.20 Show that
$$\mathcal{L}\{(t^2 - k^2)^{-1/2}u(t - k)\} = K_0(ks).$$
Hint. Try transforming an integral representation of $K_0(ks)$ into the Laplace
transform integral.
15.10.21 The Laplace transform
$$\int_0^\infty e^{-xs}xJ_0(x)\,dx = \frac{s}{(s^2+1)^{3/2}}$$
may be rewritten as
$$\frac{1}{s^2}\int_0^\infty e^{-y}yJ_0\left(\frac{y}{s}\right)dy = \frac{s}{(s^2+1)^{3/2}},$$
which is in Gauss-Laguerre quadrature form. Evaluate this integral for
s = 1.0, 0.9, 0.8, ..., decreasing s in steps of 0.1 until the relative error rises to
10 percent. (The effect of decreasing s is to make the integrand oscillate more
rapidly per unit length of y, thus decreasing the accuracy of the numerical
quadrature.)
15.10.22 (a) Evaluate
$$\int_0^\infty e^{-kz}kJ_1(ka)\,dk$$
by the Gauss-Laguerre quadrature. Take a = 1 and z = 0.1(0.1)1.0.
(b) From the analytic form, Exercise 15.10.7, calculate the absolute error
and the relative error.
15.11 CONVOLUTION OR FALTUNG THEOREM
One of the most important properties of the Laplace transform is that given
by the convolution or faltung theorem.¹ We take two transforms
$$f_1(s) = \mathcal{L}\{F_1(t)\} \quad\text{and}\quad f_2(s) = \mathcal{L}\{F_2(t)\} \tag{15.190}$$
and multiply them together. To avoid complications when changing variables,
we hold the upper limits finite:
$$f_1(s)\cdot f_2(s) = \lim_{a\to\infty}\int_0^a e^{-sx}F_1(x)\,dx\int_0^{a-x} e^{-sy}F_2(y)\,dy. \tag{15.191}$$
The upper limits are chosen so that the area of integration, shown in Fig. 15.13a,
is the shaded triangle, not the square. If we integrate over a square in the xy-plane,
we have a parallelogram in the tz-plane, which simply adds complications.
This modification is permissible because the two integrands are assumed to
decrease exponentially. In the limit $a \to \infty$ the integral over the unshaded
triangle will give zero contribution. Substituting x = t − z, y = z, the region
of integration is mapped into the triangle shown in Fig. 15.13b. To verify the
mapping, map the vertices: t = x + y, z = y. Using Jacobians to transform the
element of area, we have
$$dx\,dy = \begin{vmatrix} \dfrac{\partial x}{\partial t} & \dfrac{\partial y}{\partial t}\\[6pt] \dfrac{\partial x}{\partial z} & \dfrac{\partial y}{\partial z} \end{vmatrix}dt\,dz = \begin{vmatrix} 1 & 0\\ -1 & 1 \end{vmatrix}dt\,dz \tag{15.192}$$
or dx dy = dt dz. With this substitution Eq. 15.191 becomes
$$f_1(s)\cdot f_2(s) = \lim_{a\to\infty}\int_0^a e^{-st}\int_0^t F_1(t-z)F_2(z)\,dz\,dt = \mathcal{L}\left\{\int_0^t F_1(t-z)F_2(z)\,dz\right\}. \tag{15.193}$$
For convenience this integral is represented by the symbol
$$\int_0^t F_1(t-z)F_2(z)\,dz \equiv F_1 * F_2 \tag{15.194}$$
and referred to as the convolution, closely analogous to the Fourier convolution
(Section 15.5). If we substitute w = t − z, we find
$$F_1 * F_2 = F_2 * F_1, \tag{15.195}$$
showing that the relation is symmetric.
Carrying out the inverse transform, we also find
$$\mathcal{L}^{-1}\{f_1(s)\cdot f_2(s)\} = \int_0^t F_1(t-z)F_2(z)\,dz. \tag{15.196}$$
This can be useful in the development of new transforms or as an alternative to
a partial fraction expansion. One immediate application is in the solution of
integral equations (Section 16.2). Since the upper limit t is variable, this Laplace
convolution is useful in treating Volterra integral equations. The Fourier
convolution with fixed (infinite) limits would apply to Fredholm integral
equations.
FIG. 15.13 Change of variables, (a) xy-plane (b) zt-plane
¹ An alternate derivation employs the Bromwich integral (Section 15.12).
This is Exercise 15.12.3.
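The convolution theorem (Eq. 15.196) can be tested on a simple pair. The sketch below assumes $F_1(t) = e^{-t}$ and $F_2(t) = \sin t$, so that $f_1f_2 = 1/[(s+1)(s^2+1)]$, whose partial fraction inverse is $\tfrac{1}{2}(e^{-t} - \cos t + \sin t)$. The function names and the choice of this pair are illustrative assumptions.

```python
import math

def convolve(F1, F2, t, n=2000):
    """Simpson approximation of the integral of F1(t - z) F2(z) over 0 <= z <= t."""
    if t == 0.0:
        return 0.0
    h = t / n
    total = F1(t) * F2(0.0) + F1(0.0) * F2(t)
    for i in range(1, n):
        z = i * h
        total += (4 if i % 2 else 2) * F1(t - z) * F2(z)
    return total * h / 3.0

def F1(t):
    return math.exp(-t)          # transform 1/(s + 1)

def F2(t):
    return math.sin(t)           # transform 1/(s^2 + 1)

def inverse_of_product(t):
    """Inverse of 1/((s+1)(s^2+1)) obtained by partial fractions."""
    return 0.5 * (math.exp(-t) - math.cos(t) + math.sin(t))

t = 2.0  # illustrative value
print(convolve(F1, F2, t), convolve(F2, F1, t), inverse_of_product(t))
```

The first two printed numbers illustrate the symmetry of Eq. 15.195; the third is the partial fraction result.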
EXAMPLE 15.11.1 Driven Oscillator with Damping
As one illustration of the use of the convolution theorem, let us return to
the mass m on a spring, with damping and a driving force F(t). The equation
of motion (15.129) now becomes
$$mX''(t) + bX'(t) + kX(t) = F(t). \tag{15.197}$$
Initial conditions X(0) = 0, X'(0) = 0 are used to simplify this illustration,
and the transformed equation is
$$ms^2x(s) + bs\,x(s) + k\,x(s) = f(s) \tag{15.198}$$
or
$$x(s) = \frac{f(s)}{m}\cdot\frac{1}{(s + b/2m)^2 + \omega_1^2}, \tag{15.199}$$
where $\omega_1^2 = k/m - b^2/4m^2$, as before.
By the convolution theorem (Eq. 15.193 or 15.196),
$$X(t) = \frac{1}{m\omega_1}\int_0^t F(t-z)e^{-(b/2m)z}\sin\omega_1z\,dz. \tag{15.200}$$
If the force is impulsive, $F(t) = P\delta(t)$,²
$$X(t) = \frac{P}{m\omega_1}e^{-(b/2m)t}\sin\omega_1t. \tag{15.201}$$
P represents the momentum transferred by the impulse and the constant P/m
takes the place of an initial velocity X'(0).
If $F(t) = F_0\sin\omega t$, Eq. 15.200 may be used, but a partial fraction expansion
is perhaps more convenient. With
$$f(s) = \frac{F_0\omega}{s^2+\omega^2},$$
Eq. 15.199 becomes
$$x(s) = \frac{F_0\omega}{m}\cdot\frac{1}{s^2+\omega^2}\cdot\frac{1}{(s + b/2m)^2 + \omega_1^2} = \frac{F_0\omega}{m}\left[\frac{a's + b'}{s^2+\omega^2} + \frac{c's + d'}{(s + b/2m)^2 + \omega_1^2}\right]. \tag{15.202}$$
The coefficients a', b', c', and d' are independent of s. Direct calculation shows
$$-\frac{1}{a'} = \frac{b}{m}\omega^2 + \frac{m}{b}(\omega_0^2 - \omega^2)^2,$$
$$-\frac{1}{b'} = -\frac{b}{m}\,\frac{1}{\omega_0^2 - \omega^2}\left[\frac{b}{m}\omega^2 + \frac{m}{b}(\omega_0^2 - \omega^2)^2\right].$$
Since c' and d' will lead to exponentially decreasing terms (transients), they
will be discarded here. Carrying out the inverse operation, we find for the
steady-state solution
$$X(t) = \frac{F_0}{[b^2\omega^2 + m^2(\omega_0^2 - \omega^2)^2]^{1/2}}\sin(\omega t - \varphi),$$
where
$$\tan\varphi = \frac{b\omega}{m(\omega_0^2 - \omega^2)}.$$
Differentiating the denominator, we find that the amplitude has a maximum
when
$$\omega^2 = \omega_0^2 - \frac{b^2}{2m^2} = \omega_1^2 - \frac{b^2}{4m^2}.$$
This is the resonance condition.³ At resonance the amplitude becomes $F_0/b\omega_1$,
showing that the mass m goes into infinite oscillation at resonance if damping
is neglected (b = 0). It is worth noting that we have had three different
characteristic frequencies:
$$\omega_2^2 = \omega_0^2 - \frac{b^2}{2m^2},$$
resonance for forced oscillations, with damping,
$$\omega_1^2 = \omega_0^2 - \frac{b^2}{4m^2},$$
free oscillation frequency, with damping,
$$\omega_0^2 = \frac{k}{m},$$
free oscillation frequency, no damping. They coincide only if the damping is
zero.
² Note that $\delta(t)$ lies inside the interval [0, t].
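The three characteristic frequencies and the resonance amplitude $F_0/b\omega_1$ can be checked directly from the steady-state amplitude $F_0/[b^2\omega^2 + m^2(\omega_0^2 - \omega^2)^2]^{1/2}$. The following sketch does so for an assumed set of values (m = 1, b = 0.4, k = 4, $F_0$ = 1); the function names are hypothetical.

```python
import math

def characteristic_frequencies(m, b, k):
    """Return (w0^2, w1^2, w2^2): undamped, damped free, and resonance values."""
    w0_sq = k / m
    w1_sq = w0_sq - b * b / (4.0 * m * m)
    w2_sq = w0_sq - b * b / (2.0 * m * m)
    return w0_sq, w1_sq, w2_sq

def amplitude(w, F0, m, b, k):
    """Steady-state amplitude of the driven, damped oscillator at frequency w."""
    w0_sq = k / m
    return F0 / math.sqrt(b * b * w * w + m * m * (w0_sq - w * w) ** 2)

m, b, k, F0 = 1.0, 0.4, 4.0, 1.0  # illustrative values
w0_sq, w1_sq, w2_sq = characteristic_frequencies(m, b, k)
w_res = math.sqrt(w2_sq)

print(w0_sq, w1_sq, w2_sq)
print(amplitude(w_res, F0, m, b, k), F0 / (b * math.sqrt(w1_sq)))
```

Evaluating `amplitude` slightly above or below `w_res` gives smaller values, consistent with a maximum at the resonance frequency.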
Returning to Eqs. 15.197 and 15.199, Eq. 15.197 is our differential equation
for the response of a dynamical system to an arbitrary driving force. The final
response clearly depends on both the driving force and the characteristics of
our system. This dual dependence is separated in the transform space. In
Eq. 15.199 the transform of the response (output) appears as the product of
two factors, one describing the driving force (input) and the other describing
the dynamical system. This latter part, which modifies the input and yields
the output, is often called a transfer function. Specifically, $[(s + b/2m)^2 + \omega_1^2]^{-1}$
is the transfer function corresponding to this damped oscillator. The concept
of a transfer function is of great use in the field of servomechanisms. Often the
characteristics of a particular servomechanism are described by giving its
transfer function. The convolution theorem then yields the output signal for
a particular input signal.
³ The amplitude (squared) has the typical resonance denominator, the Lorentz
line shape, Exercise 15.3.9.
EXERCISES
15.11.1 From the convolution theorem show that
15.11.2 If $F(t) = t^a$ and $G(t) = t^b$, a > −1, b > −1:
(a) Show that the convolution
$$F * G = t^{a+b+1}\int_0^1 y^a(1-y)^b\,dy.$$
(b) By using the convolution theorem, show that
$$\int_0^1 y^a(1-y)^b\,dy = \frac{a!\,b!}{(a+b+1)!}.$$
When replacing a by a − 1 and b by b − 1, we have the Euler formula for the
beta function (Eq. 10.60).
15.11.3 Using the convolution integral, calculate
15.11.4 An undamped oscillator is driven by a force $F_0\sin\omega t$. Find the displacement as
a function of time. Notice that it is a linear combination of two simple harmonic
motions, one with the frequency of the driving force and one with the frequency
$\omega_0$ of the free oscillator. (Assume X(0) = X'(0) = 0.)
ANS. $X(t) = \dfrac{F_0/m}{\omega^2 - \omega_0^2}\left(\dfrac{\omega}{\omega_0}\sin\omega_0t - \sin\omega t\right)$.
Other exercises involving the Laplace convolution appear in Section 16.2.
15.12 INVERSE LAPLACE TRANSFORMATION
Bromwich Integral
We now develop an expression for the inverse Laplace transform, $\mathcal{L}^{-1}$,
appearing in the equation
$$F(t) = \mathcal{L}^{-1}\{f(s)\}. \tag{15.205}$$
One approach lies in the Fourier transform for which we know the inverse
relation. There is a difficulty, however. Our Fourier transformable function
had to satisfy the Dirichlet conditions. In particular, we required that
$$\lim_{\omega\to\infty} G(\omega) = 0 \tag{15.206}$$
so that the infinite integral would be well defined.¹ Now we wish to treat
functions, F(t), that may diverge exponentially. To surmount this difficulty,
we extract an exponential factor, $e^{\gamma t}$, from our (possibly) divergent Laplace
function and write
$$F(t) = e^{\gamma t}G(t). \tag{15.207}$$
If F(t) diverges as $e^{\alpha t}$, we require $\gamma$ to be greater than $\alpha$ so that G(t) will be
convergent. Now, with G(t) = 0 for t < 0 and otherwise suitably restricted so
that it may be represented by a Fourier integral (Eq. 15.20),
$$G(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{iut}\,du\int_0^\infty G(v)e^{-iuv}\,dv. \tag{15.208}$$
Using Eq. 15.207, we may rewrite (15.208) as
$$F(t) = \frac{e^{\gamma t}}{2\pi}\int_{-\infty}^{\infty} e^{iut}\,du\int_0^\infty F(v)e^{-\gamma v}e^{-iuv}\,dv. \tag{15.209}$$
Now with the change of variable,
$$s = \gamma + iu, \tag{15.210}$$
the integral over v is thrown into the form of a Laplace transform
$$\int_0^\infty F(v)e^{-sv}\,dv = f(s); \tag{15.211}$$
s is now a complex variable and $\Re(s) > \gamma$ to guarantee convergence. Notice that
the Laplace transform has mapped a function specified on the positive real axis
onto the complex plane, $\Re(s) > \gamma$.²
With $\gamma$ as a constant, ds = i du. Substituting Eq. 15.211 into Eq. 15.209, we
obtain
$$F(t) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} e^{st}f(s)\,ds. \tag{15.212}$$
Here is our inverse transform. We have rotated the line of integration through
90° (by using ds = i du). The path has become an infinite vertical line in the
complex plane, the constant $\gamma$ having been chosen so that all the singularities of f(s)
are on the left-hand side (Fig. 15.14).
FIG. 15.14 Singularities of $e^{st}f(s)$
Equation 15.212, our inverse transformation, is usually known as the
Bromwich integral, although sometimes it is referred to as the Fourier-Mellin
theorem or Fourier-Mellin integral. This integral may now be evaluated by the
regular methods of contour integration (Chapter 7). If t > 0, the contour may be
closed by an infinite semicircle in the left half-plane. Then by the residue theorem
(Section 7.2)
$$F(t) = \sum(\text{residues included for } \Re(s) < \gamma). \tag{15.213}$$
Possibly this means of evaluation with $\Re(s)$ ranging through negative values
seems paradoxical in view of our previous requirement that $\Re(s) > \gamma$. The
paradox disappears when we recall that the requirement $\Re(s) > \gamma$ was imposed
to guarantee convergence of the Laplace transform integral that defined f(s).
Once f(s) is obtained, we may then proceed to exploit its properties as an
analytical function in the complex plane wherever we choose.³ In effect we are
employing analytical continuation to get $\mathcal{L}\{F(t)\}$ in the left half-plane exactly
as the recurrence relation for the factorial function was used to extend the Euler
integral definition (Eq. 10.5) to the left half-plane.
Perhaps a pair of examples may clarify the evaluation of Eq. 15.212.
¹ If delta functions are included, $G(\omega)$ may be a cosine. Although this does
not satisfy Eq. 15.206, $G(\omega)$ is still bounded.
² For a derivation of the inverse Laplace transform using only real variables
see C. L. Bohn and R. W. Flynn, "Real Variable Inversion of Laplace
Transforms: An Application in Plasma Physics." Am. J. Phys. 46, 1250 (1978).
EXAMPLE 15.12.1 Inversion via Calculus of Residues
If $f(s) = a/(s^2 - a^2)$, then
$$e^{st}f(s) = \frac{ae^{st}}{s^2 - a^2} = \frac{ae^{st}}{(s+a)(s-a)}. \tag{15.214}$$
The residues may be found by using Exercise 7.1.1 or various other means. The
first step is to identify the singularities, the poles. Here we have one simple pole
at s = a and another simple pole at s = −a. By Exercise 7.1.1 the residue at
s = a is $(\tfrac{1}{2})e^{at}$ and the residue at s = −a is $(-\tfrac{1}{2})e^{-at}$. Then
$$\text{Residues} = (\tfrac{1}{2})(e^{at} - e^{-at}) = \sinh at = F(t), \tag{15.215}$$
in agreement with Eq. 15.105.
EXAMPLE 15.12.2
If
$$f(s) = \frac{1 - e^{-as}}{s},$$
then we have
$$e^{st}f(s) = \frac{e^{st}}{s} - \frac{e^{-as}e^{st}}{s}. \tag{15.216}$$
The first term on the right has a simple pole at s = 0, residue = 1. Then by Eq.
15.213
$$F_1(t) = \begin{cases} 1, & t > 0,\\ 0, & t < 0, \end{cases} = u(t), \tag{15.217}$$
where u(t) is the unit step function. Neglecting the minus sign and the $e^{-as}$, we
find that the second term on the right also has a simple pole at s = 0, residue = 1.
Noting the translation property (Eq. 15.164), we have
$$F_2(t) = \begin{cases} 1, & t - a > 0,\\ 0, & t - a < 0, \end{cases} = u(t - a). \tag{15.218}$$
Therefore
$$F(t) = F_1(t) - F_2(t) = \begin{cases} 0, & t < 0,\\ 1, & 0 < t < a,\\ 0, & t > a, \end{cases} = u(t) - u(t - a), \tag{15.219}$$
a step function of unit height and length a (Fig. 15.15).
FIG. 15.15 Finite-length step function u(t) − u(t − a)
³ In numerical work f(s) may well be available only for discrete real, positive
values of s. Then numerical procedures are indicated. See Section 15.8 and
the reference to Krylov and Skoblya.
Two general comments may be in order. First, these two examples hardly
begin to show the usefulness and power of the Bromwich integral. It is always
available for inverting a complicated transform when the tables prove inadequate.
Second, this derivation is not presented as a rigorous one. Rather, it is given
more as a plausibility argument, although it can be made rigorous. The
determination of the inverse transform is somewhat similar to the solution of a
differential equation. It makes little difference how you get the solution. Guess
at it if you want. The solution can always be checked by substitution back into
the original differential equation. Similarly, $F(t)$ can (and, to check on careless
errors, should) be checked by determining whether by Eq. 15.99
$$\mathcal{L}\{F(t)\} = f(s).$$
Two alternate derivations of the Bromwich integral are the subjects of Exercises
15.12.1 and 15.12.2.
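The check recommended above, transforming a proposed $F(t)$ forward and comparing with $f(s)$, can be sketched numerically. The following is only an illustration under stated assumptions: the helper name `laplace_numeric`, the Simpson quadrature, the truncation of the upper limit, and the sample values $a = 1$, $s = 3$ are choices made here, not part of the text.

```python
import math

def laplace_numeric(F, s, t_max, n=20000):
    """Approximate the Laplace integral of F over [0, t_max] by Simpson's rule.

    n must be even. t_max truncates the infinite upper limit, which is an
    assumption that is harmless only when exp(-s*t) * F(t) has decayed by then.
    """
    h = t_max / n
    total = F(0.0) + math.exp(-s * t_max) * F(t_max)
    for i in range(1, n):
        t = i * h
        weight = 4 if i % 2 else 2
        total += weight * math.exp(-s * t) * F(t)
    return total * h / 3.0

a, s = 1.0, 3.0

# Example 15.12.1: F(t) = sinh(at) should transform to a / (s^2 - a^2).
sinh_numeric = laplace_numeric(lambda t: math.sinh(a * t), s, t_max=40.0)
sinh_exact = a / (s * s - a * a)

# Example 15.12.2: F(t) = u(t) - u(t - a) equals 1 on (0, a) and 0 elsewhere,
# so the integral only needs to run from 0 to a.
step_numeric = laplace_numeric(lambda t: 1.0, s, t_max=a, n=2000)
step_exact = (1.0 - math.exp(-a * s)) / s

print(sinh_numeric, sinh_exact)
print(step_numeric, step_exact)
```

Agreement at one value of $s$ is of course not a proof; repeating the comparison for several real $s$ above the convergence abscissa gives a reasonable guard against careless errors.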
As a final illustration of the use of the Laplace inverse transform, we have
some results from the work of Brillouin and Sommerfeld (1914) in
electromagnetic theory.
EXAMPLE 15.12.3 Velocity of Electromagnetic Waves in a Dispersive
Medium
The group velocity $u$ of traveling waves is related to the phase velocity $v$ by
the equation
$$u = v - \lambda\frac{dv}{d\lambda}. \tag{15.220}$$
Here $\lambda$ is the wavelength. In the vicinity of an absorption line (resonance) $dv/d\lambda$
may be sufficiently negative so that $u > c$ (Fig. 15.16). The question immediately
arises whether a signal can be transmitted faster than $c$, the velocity of light in
vacuum. This question, which assumes that such a group velocity is meaningful,
is of fundamental importance to the theory of special relativity.
FIG. 15.16 Optical dispersion (axis labels: increasing wavelength $\lambda$, increasing frequency; anomalous region marked where $dv/d\lambda$ is negative)
We need a solution to the wave equation
$$\frac{\partial^2\psi}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2\psi}{\partial t^2}, \tag{15.221}$$
corresponding to a harmonic vibration starting at the origin at time zero. Since
our medium is dispersive, $v$ is a function of the angular frequency. Imagine, for
instance, a plane wave, angular frequency $\omega$, incident on a shutter at the origin.
At $t = 0$ the shutter is (instantaneously) opened, and the wave is permitted to
advance along the positive $x$-axis.
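Let us then build up a solution starting at $x = 0$. It is convenient to use the
Cauchy integral formula, Eq. 6.43,
$$\psi(0, t) = \frac{1}{2\pi i}\oint\frac{e^{-izt}}{z - z_0}\,dz = e^{-iz_0t}$$
(for a contour encircling $z = z_0$ in the positive sense). Using $s = -iz$ and $z_0 = \omega$,
we obtain
$$\psi(0, t) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty}\frac{e^{st}}{s + i\omega}\,ds = \begin{cases} 0, & t < 0, \\ e^{-i\omega t}, & t > 0. \end{cases} \tag{15.222}$$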
FIG. 15.17 Possible closed contours
To be complete, the loop integral is over the vertical line $\Re(s) = \gamma$ and an infinite
semicircle as shown in Fig. 15.17. The location of the infinite semicircle is chosen
so that the integral over it vanishes. This means a semicircle in the left half-plane
for $t > 0$ and the residue is enclosed. For $t < 0$ we pick the right half-plane and
no singularity is enclosed. The fact that this is just the Bromwich integral may
be verified by noting that
$$F(t) = \begin{cases} 0, & t < 0, \\ e^{-i\omega t}, & t > 0, \end{cases} \tag{15.223}$$
and applying the Laplace transform. The transformed function $f(s)$ becomes
$$f(s) = \frac{1}{s + i\omega}. \tag{15.224}$$
Our Cauchy-Bromwich integral provides us with the time dependence of a
signal leaving the origin at $t = 0$. To include the space dependence, we note that
$$e^{s(t - x/v)}$$
satisfies the wave equation. With this as a clue, we replace $t$ by $t - x/v$ and write
a solution
$$\psi(x, t) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty}\frac{e^{s(t - x/v)}}{s + i\omega}\,ds. \tag{15.225}$$
It was seen in the derivation of the Bromwich integral that our variable $s$
replaces the $\omega$ of the Fourier transformation. Hence the wave velocity $v$ becomes
a function of $s$, that is, $v(s)$. Its particular form need not concern us here. We need
only the property
$$\lim_{|s| \to \infty} v(s) = \text{constant}, \; c. \tag{15.226}$$
This is suggested by the asymptotic behavior of the curve on the right side of
Fig. 15.16.4
Evaluating Eq. 15.225 by the calculus of residues, we may close the path of
integration by a semicircle in the right half-plane, provided
$$t - \frac{x}{c} < 0.$$
Hence
$$\psi(x, t) = 0, \qquad t - \frac{x}{c} < 0, \tag{15.227}$$
which means that the velocity of our signal cannot exceed the velocity of light in
vacuum c. This simple but very significant result was extended by Sommerfeld
and Brillouin to show just how the wave advanced in the dispersive medium.
Summary—Inversion of Laplace Transform
1. Direct use of tables, Table 15.2, and references; use
of partial fractions (Section 15.8) and the operational
theorems of Table 15.1.
2. Bromwich integral, Eq. 15.212, and the calculus of
residues.
3. Numerical inversion, Section 15.8, and references.
EXERCISES
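15.12.1 Derive the Bromwich integral from Cauchy's integral formula.
Hint. Apply the inverse transform $\mathcal{L}^{-1}$ to
$$f(s) = \frac{1}{2\pi i}\lim_{\alpha \to \infty}\int_{\gamma - i\alpha}^{\gamma + i\alpha}\frac{f(z)}{s - z}\,dz,$$
where $f(z)$ is analytic for $\Re(z) \geq \gamma$.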
4 Equation 15.226 follows rigorously from the theory of anomalous
dispersion. See also the Kronig-Kramers optical dispersion relations of Section 7.3.
15.12.2 Starting with
$$\frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty}e^{st}f(s)\,ds,$$
show that by introducing
$$f(s) = \int_0^\infty e^{-sz}F(z)\,dz,$$
we can convert one integral into the Fourier representation of a Dirac delta
function. From this derive the inverse Laplace transform.
15.12.3 Derive the Laplace transformation convolution theorem by use of the
Bromwich integral.
15.12.4 Find
\s2 - k2
(a) by a partial fraction expansion,
(b) repeat, using the Bromwich integral.
15.12.5 Find
(a) by using a partial fraction expansion,
(b) repeat using the convolution theorem,
(c) repeat using the Bromwich integral.
ANS. $F(t) = 1 - \cos kt$.
15.12.6 Use the Bromwich integral to find the function whose transform is $f(s) = s^{-1/2}$.
Note that $f(s)$ has a branch point at $s = 0$. The negative x-axis may be taken
as a cut line.
ANS. $F(t) = (\pi t)^{-1/2}$.
15.12.7 Show that
by evaluation of the Bromwich integral.
Hint. Convert your Bromwich integral into an integral representation of J0(t).
Figure 15.18 shows a possible contour.
15.12.8 Evaluate the inverse Laplace transform
by each of the following methods:
(a) Expansion in a series and term-by-term inversion,
(b) Direct evaluation of the Bromwich integral,
(c) Change of variable in the Bromwich integral: s = ~(z + z~l).
15.12.9 Show that
$$\mathcal{L}^{-1}\left\{\frac{\ln s}{s}\right\} = -\ln t - \gamma,$$
where $\gamma = 0.5772\ldots$, the Euler-Mascheroni constant.
FIG. 15.18 A possible contour for the inversion of $J_0(t)$
15.12.10 Evaluate the Bromwich integral for
Jy' (з2 + а2Г
15.12.11 Heaviside expansion theorem. If the transform $f(s)$ may be written as a ratio
$$f(s) = \frac{g(s)}{h(s)},$$
where $g(s)$ and $h(s)$ are analytic functions, $h(s)$ having simple, isolated zeros
at $s = s_i$, show that
$$F(t) = \mathcal{L}^{-1}\left\{\frac{g(s)}{h(s)}\right\} = \sum_i\frac{g(s_i)}{h'(s_i)}e^{s_it}.$$
Hint. See Exercise 7.1.2.
15.12.12 Using the Bromwich integral, invert $f(s) = s^{-2}e^{-ks}$. Express $F(t) = \mathcal{L}^{-1}\{f(s)\}$
in terms of the (shifted) unit step function $u(t - k)$.
ANS. $F(t) = (t - k)u(t - k)$.
15.12.13 You have a Laplace transform:
$$f(s) = \frac{1}{(s + a)(s + b)}, \qquad a \neq b.$$
Invert this transform by each of three methods:
(a) Partial fractions and use of tables,
(b) Convolution theorem,
(c) Bromwich integral.
ANS. $F(t) = \dfrac{e^{-bt} - e^{-at}}{a - b}$, $a \neq b$.
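TABLE 15.1 Laplace Transform Operations

Operations                        Equation
1. Laplace transform              $f(s) = \mathcal{L}\{F(t)\} = \int_0^\infty e^{-st}F(t)\,dt$   (15.99)
2. Transform of derivative        $sf(s) - F(+0) = \mathcal{L}\{F'(t)\}$   (15.123)
                                  $s^2f(s) - sF(+0) - F'(+0) = \mathcal{L}\{F''(t)\}$
3. Transform of integral          $\frac{1}{s}f(s) = \mathcal{L}\left\{\int_0^t F(x)\,dx\right\}$   (Exercise 15.11.1)
4. Substitution                   $f(s - a) = \mathcal{L}\{e^{at}F(t)\}$   (15.152)
5. Translation                    $e^{-bs}f(s) = \mathcal{L}\{F(t - b)\}$   (15.164)
6. Derivative of transform        $f^{(n)}(s) = \mathcal{L}\{(-t)^nF(t)\}$   (15.173)
7. Integral of transform          $\int_s^\infty f(x)\,dx = \mathcal{L}\left\{\frac{F(t)}{t}\right\}$   (15.189)
8. Convolution                    $f_1(s)f_2(s) = \mathcal{L}\left\{\int_0^t F_1(t - z)F_2(z)\,dz\right\}$   (15.193)
9. Inverse transform,             $\frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty}e^{st}f(s)\,ds = F(t)$   (15.212)
   Bromwich integral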
TABLE 15.2 Laplace Transforms

     $f(s)$                                   $F(t)$                  Limitation            Equation
 1.  $1$                                      $\delta(t)$             Singularity at $+0$   (15.141)
 2.  $1/s$                                    $1$                     $s > 0$               (15.102)
 3.  $n!/s^{n+1}$                             $t^n$                   $s > 0$, $n > -1$     (15.108)
 4.  $1/(s - k)$                              $e^{kt}$                $s > k$               (15.103)
 5.  $1/(s - k)^2$                            $te^{kt}$               $s > k$               (15.175)
 6.  $s/(s^2 - k^2)$                          $\cosh kt$              $s > k$               (15.105)
 7.  $k/(s^2 - k^2)$                          $\sinh kt$              $s > k$               (15.105)
 8.  $s/(s^2 + k^2)$                          $\cos kt$               $s > 0$               (15.107)
 9.  $k/(s^2 + k^2)$                          $\sin kt$               $s > 0$               (15.107)
10.  $(s - a)/[(s - a)^2 + k^2]$              $e^{at}\cos kt$         $s > a$               (15.153)
11.  $k/[(s - a)^2 + k^2]$                    $e^{at}\sin kt$         $s > a$               (15.153)
12.  $(s^2 - k^2)/(s^2 + k^2)^2$              $t\cos kt$              $s > 0$               (15.172)
13.  $2ks/(s^2 + k^2)^2$                      $t\sin kt$              $s > 0$               (15.172)
14.  $(s^2 + a^2)^{-1/2}$                     $J_0(at)$               $s > 0$               (15.185)
15.  $(s^2 - a^2)^{-1/2}$                     $I_0(at)$               $s > a$               (Exercise 15.10.10)
16.  $\frac{1}{a}\cot^{-1}(s/a)$              $j_0(at)$               $s > 0$               (Exercise 15.10.11)
17.  $\frac{1}{2a}\ln\frac{s + a}{s - a} = \frac{1}{a}\coth^{-1}(s/a)$   $i_0(at)$   $s > a$   (Exercise 15.10.11)
18.  $(s - a)^n/s^{n+1}$                      $L_n(at)$               $s > 0$               (Exercise 15.10.13)
19.  $\frac{1}{s}\ln(s + 1)$                  $E_1(x) = -Ei(-x)$      $s > 0$               (Exercise 15.10.14)
20.  $(\ln s)/s$                              $-\ln t - C$            $s > 0$               (Exercise 15.12.9)
A more extensive table of Laplace transforms appears in Chapter 29 of AMS-55.
REFERENCES
Champeney, D. C., Fourier Transforms and Their Physical Applications. New York:
Academic Press (1973).
Fourier transforms are developed in a careful, easy to follow manner. Approximately
60 percent of the book is devoted to applications of interest in physics and engineering.
Erdelyi, A., Ed., Tables of Integral Transforms, Vols. I and II. Bateman Manuscript
Project. New York: McGraw-Hill. Vol. I: Fourier, Laplace, Mellin Transforms. Vol. II:
Hankel Transforms and Special Functions.
An encyclopedic compilation of transforms, special functions, and their properties,
this book is useful primarily as a reference.
Erdelyi, A., W. Magnus, F. Oberhettinger, and F. G. Tricomi, Tables of Integral
Transforms, two vols. New York: McGraw-Hill (1954).
This text contains extensive tables of Fourier sine, cosine, and exponential transforms,
Laplace and inverse Laplace transforms, Mellin and inverse Mellin transforms, Hankel
transforms, and other more specialized integral transforms.
Hanna, J. R., Fourier Series and Integrals of Boundary Value Problems. Somerset, N.J.:
Wiley (1982).
This book is a broad treatment of the Fourier solution of boundary value problems.
The concepts of convergence and completeness are given careful attention.
Jeffreys, H., and B. S. Jeffreys, Methods of Mathematical Physics, 3rd ed. Cambridge:
Cambridge University Press (1966).
Krylov, V. I., and N. S. Skoblya, Handbook of Numerical Inversion of Laplace Transform.
Jerusalem: Israel Program for Scientific Translations (1969).
LePage, W. R., Complex Variables and the Laplace Transform for Engineers. New York:
McGraw-Hill (1961); New York: Dover (1980).
A complex variable analysis which is carefully developed and then applied to Fourier
and Laplace transforms. It is written to be read by students, but intended for the serious
student.
McCollum, P. A., and B. F. Brown, Laplace Transform Tables and Theorems. New York:
Holt, Rinehart and Winston (1965).
Miles, J. W., Integral Transforms in Applied Mathematics. Cambridge: Cambridge
University Press (1971).
This is a brief but interesting and useful treatment for the advanced undergraduate.
It emphasizes applications rather than abstract mathematical theory.
Papoulis, A., The Fourier Integral and Its Applications. New York: McGraw-Hill (1962).
This is a rigorous development of Fourier and Laplace transforms and has extensive
applications in science and engineering.
Roberts, G. E., and H. Kaufman, Table of Laplace Transforms. Philadelphia: W. B.
Saunders (1966).
Sneddon, I. N., Fourier Transforms. New York: McGraw-Hill (1951).
A detailed comprehensive treatment, this book is loaded with applications to a wide
variety of fields of modern and classical physics.
Sneddon, I. N., The Use of Integral Transforms. New York: McGraw-Hill (1972).
Written for students in science and engineering in terms they can understand, this book
covers all the integral transforms mentioned in this chapter as well as in several others.
Many applications are included.
Van der Pol, B., and H. Bremmer, Operational Calculus Based on the Two-sided Laplace
Integral, 2nd ed. Cambridge: Cambridge University Press (1955).
Here is a development based on the integral range $-\infty$ to $+\infty$, rather than the useful
0 to $+\infty$. Chapter V contains a detailed study of the Dirac delta function (impulse
function).
Wolf, K. B., Integral Transforms in Science and Engineering. New York: Plenum Press
(1979).
This book is a very comprehensive treatment of integral transforms and their
applications.
16 INTEGRAL
EQUATIONS
16.1 INTRODUCTION
With the exception of the integral transforms of the last chapter, we have
been considering equations with relations between the unknown function $\varphi(x)$
and one or more of its derivatives. We now proceed to investigate equations
containing the unknown function within an integral. As with differential
equations, we shall largely confine our attention to linear relations, linear integral
equations. Integral equations are classified in two ways:
1. If the limits of integration are fixed, we call the
equation a Fredholm equation; if one limit is variable,
it is a Volterra equation.
2. If the unknown function appears only under the
integral sign, we label it "first kind." If it appears
both inside and outside the integral, it is labeled
"second kind."
Definitions
Symbolically, we have Fredholm equation of the first kind:
$$f(x) = \int_a^b K(x, t)\varphi(t)\,dt. \tag{16.1}$$
Fredholm equation of the second kind:
$$\varphi(x) = f(x) + \lambda\int_a^b K(x, t)\varphi(t)\,dt. \tag{16.2}$$
Volterra equation of the first kind:
$$f(x) = \int_a^x K(x, t)\varphi(t)\,dt. \tag{16.3}$$
Volterra equation of the second kind:
$$\varphi(x) = f(x) + \int_a^x K(x, t)\varphi(t)\,dt. \tag{16.4}$$
In all four cases $\varphi(t)$ is the unknown function. $K(x, t)$, which we call the kernel,
and $f(x)$ are assumed to be known. When $f(x) = 0$, the equation is said to be
homogeneous.
The reader may wonder, with some justification, why we bother about
integral equations. After all, the differential equations have done a rather good
job of describing our physical world so far. There are several reasons for
introducing integral equations here.
We have placed considerable emphasis on the solution of differential
equations subject to particular boundary conditions. For instance, the boundary
condition at $r = 0$ determines whether the Neumann function $N_n(r)$ is present when
Bessel's equation is solved. The boundary condition for $r \to \infty$ determines
whether the $I_n(r)$ is present in our solution of the modified Bessel equation. The
integral equation relates the unknown function not only to its values at
neighboring points (derivatives) but also to its values throughout a region, including
the boundary. In a very real sense the boundary conditions are built into the
integral equation rather than imposed at the final stage of the solution. It will be
seen later, when we construct kernels (Section 16.5), that the form of the kernel
depends on the values on the boundary. The integral equation, then, is compact
and may turn out to be a more convenient or powerful form than the differential
equation. Mathematical problems such as existence, uniqueness, and
completeness may often be handled more easily and elegantly in integral form. Finally,
whether or not we like it, there are some problems, such as some diffusion and
transport phenomena, that cannot be represented by differential equations. If
we wish to solve such problems, we are forced to handle integral equations. A
most important example of this sort of physical situation follows.
EXAMPLE 16.1.1 Neutron Transport Theory—Boltzmann Equation
The fundamental equation of neutron transport theory is an expression of the
equation of continuity for neutrons:
Production = losses + leakage.
Under production we have sources
$$S(v, \boldsymbol{\Omega}, \mathbf{r})\,dv\,d\Omega$$
representing the introduction of $S$ neutrons per cubic centimeter per second with
speeds between $v$ and $v + dv$ and direction of motion $\boldsymbol{\Omega}$ within a solid angle $d\Omega$.
An additional source is provided by scattering collisions that scatter neutrons
into the ranges just listed. The rate of scattering is given by
where $\Sigma_s$ is the (macroscopic) probability that a neutron of speed $v'$, direction
$\boldsymbol{\Omega}'$, will be scattered with resultant speed $v$, direction $\boldsymbol{\Omega}$. The quantity $\varphi(v', \boldsymbol{\Omega}', \mathbf{r})$
is the neutron flux. Expressed as a vector, $\boldsymbol{\varphi} = \boldsymbol{\Omega}\varphi$ has the direction of the
neutron velocity and a magnitude equal to the number of neutrons per second
of speed $v$ crossing a unit area at position $\mathbf{r}$ and in a direction $\boldsymbol{\Omega}$ (Fig. 16.1).
FIG. 16.1 Neutron flux
Integrating over available initial speeds ($v'$) and over all directions ($\boldsymbol{\Omega}'$), we
obtain
' V, fl, П')ф', Cl, r) dv' dil'
for the second production term.
Losses come from leakage given by
and from absorption and scattering into another (lower) velocity range. These
are
2» +Loo
(p(v,Cl,r).
If the medium is not homogeneous and isotropic, the $\Sigma$'s may have position
and direction-dependence in addition to the indicated speed or energy
dependence.
Our equation of continuity finally becomes
, v', n, ii')(p(v', a, r) dv' da + s(v, a,
<p(v, Q r).
A6.5)
This is the steady-state Boltzmann equation, an integro-differential equation. In
this form the Boltzmann equation is almost impossible to handle. Most of
neutron transport theory is a development of methods that are compromises
between physical accuracy and mathematical feasibility.1
An integral equation may also appear as a matter of deliberate choice based
on convenience or the need for the mathematical power of an integral equation
formulation.
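1 Compare H. Soodak, Ed., Reactor Handbook, 2nd ed., vol. III, part A,
Physics. New York: Interscience Publishers (1962). Compare Chapter 3.
EXAMPLE 16.1.2 Momentum Representation in Quantum Mechanics
The Schrödinger equation (in ordinary space representation) is
$$-\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r}) + V(\mathbf{r})\psi(\mathbf{r}) = E\psi(\mathbf{r}) \tag{16.6}$$
or
$$(-\nabla^2 + a^2)\psi(\mathbf{r}) = v(\mathbf{r})\psi(\mathbf{r}), \tag{16.7}$$
where
$$a^2 = -\frac{2m}{\hbar^2}E, \qquad v(\mathbf{r}) = -\frac{2m}{\hbar^2}V(\mathbf{r}).$$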
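We may generalize Eq. 16.7 to
$$(-\nabla^2 + a^2)\psi(\mathbf{r}) = \int v(\mathbf{r}, \mathbf{r}')\psi(\mathbf{r}')\,d^3r'. \tag{16.8}$$
For the special case of
$$v(\mathbf{r}, \mathbf{r}') = v(\mathbf{r}')\delta(\mathbf{r} - \mathbf{r}'), \tag{16.9}$$
which represents local interaction, Eq. 16.8 reduces to Eq. 16.7. Equation 16.8
is now subject to the Fourier transform (compare Section 15.6),
$$\Phi(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}}\int\psi(\mathbf{r})e^{-i\mathbf{k}\cdot\mathbf{r}}\,d^3r, \qquad \psi(\mathbf{r}) = \frac{1}{(2\pi)^{3/2}}\int\Phi(\mathbf{k})e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3k. \tag{16.10}$$
Here the abbreviation
$$\frac{\mathbf{p}}{\hbar} = \mathbf{k} \quad \text{(wave number)} \tag{16.11}$$
has been introduced. Developing Eq. 16.10, we obtain
$$\int(-\nabla^2 + a^2)\psi(\mathbf{r})e^{-i\mathbf{k}\cdot\mathbf{r}}\,d^3r = \iint v(\mathbf{r}, \mathbf{r}')\psi(\mathbf{r}')e^{-i\mathbf{k}\cdot\mathbf{r}}\,d^3r'\,d^3r. \tag{16.12}$$
Note that the $\nabla^2$ on the left operates only on the $\psi(\mathbf{r})$. Integrating the left-hand
side by parts and substituting Eq. 16.10 for $\psi(\mathbf{r}')$ on the right, we get
$$\int(k^2 + a^2)\psi(\mathbf{r})e^{-i\mathbf{k}\cdot\mathbf{r}}\,d^3r = (2\pi)^{3/2}(k^2 + a^2)\Phi(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}}\iiint v(\mathbf{r}, \mathbf{r}')\Phi(\mathbf{k}')e^{-i(\mathbf{k}\cdot\mathbf{r} - \mathbf{k}'\cdot\mathbf{r}')}\,d^3r'\,d^3r\,d^3k'. \tag{16.13}$$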
If we use
$$f(\mathbf{k}, \mathbf{k}') = \frac{1}{(2\pi)^3}\iint v(\mathbf{r}, \mathbf{r}')e^{-i(\mathbf{k}\cdot\mathbf{r} - \mathbf{k}'\cdot\mathbf{r}')}\,d^3r'\,d^3r, \tag{16.14}$$
Eq. 16.13 becomes
$$(k^2 + a^2)\Phi(\mathbf{k}) = \int f(\mathbf{k}, \mathbf{k}')\Phi(\mathbf{k}')\,d^3k', \tag{16.15}$$
a homogeneous Fredholm equation of the second kind in which the parameter
$a^2$ corresponds to the eigenvalue.
For our special but important case of local interaction, application of
Eq. 16.9 leads to
$$f(\mathbf{k}, \mathbf{k}') = f(\mathbf{k} - \mathbf{k}'). \tag{16.16}$$
This is our momentum representation equivalent to an ordinary static
interaction potential in ordinary space. Our momentum function $\Phi(\mathbf{k})$ satisfies
the integral equation (Eq. 16.15). It must be emphasized that all through here
we have assumed that the required Fourier integrals exist. For a linear oscillator
potential, $V(r) = r^2$, the required integrals would not exist. Equation 16.10
would lead to divergent oscillations and we would have no Eq. 16.15.
Transformation of a Differential Equation into
an Integral Equation
Often we find that we have a choice. The physical problem may be represented
by a differential or an integral equation. Let us assume that we have the
differential equation and wish to transform it into an integral equation. Starting
with a linear second-order differential equation
$$y'' + A(x)y' + B(x)y = g(x) \tag{16.17}$$
with initial conditions
$$y(a) = y_0, \qquad y'(a) = y_0',$$
we integrate to obtain
$$y' = -\int_a^x Ay'\,dx - \int_a^x By\,dx + \int_a^x g\,dx + y_0'. \tag{16.18}$$
Integrating the first integral on the right by parts yields
$$y' = -Ay - \int_a^x (B - A')y\,dx + \int_a^x g\,dx + A(a)y_0 + y_0'. \tag{16.19}$$
Notice how the initial conditions are being absorbed into our new version.
Integrating a second time, we obtain
$$y = -\int_a^x Ay\,dx - \int_a^x\!\!\int_a^x [B(t) - A'(t)]y(t)\,dt\,dx + \int_a^x\!\!\int_a^x g(t)\,dt\,dx + [A(a)y_0 + y_0'](x - a) + y_0. \tag{16.20}$$
To transform this equation into a neater form, we use the relation
$$\int_a^x\!\!\int_a^x f(t)\,dt\,dx = \int_a^x (x - t)f(t)\,dt. \tag{16.21}$$
This may be verified by differentiating both sides. Since the derivatives are equal,
the original expressions can differ only by a constant. Letting $x \to a$, the constant
vanishes and Eq. 16.21 is established. Applying it to Eq. 16.20, we obtain
$$y(x) = -\int_a^x \{A(t) + (x - t)[B(t) - A'(t)]\}y(t)\,dt + \int_a^x (x - t)g(t)\,dt + [A(a)y_0 + y_0'](x - a) + y_0. \tag{16.22}$$
If we now introduce the abbreviations
$$K(x, t) = (t - x)[B(t) - A'(t)] - A(t),$$
$$f(x) = \int_a^x (x - t)g(t)\,dt + [A(a)y_0 + y_0'](x - a) + y_0, \tag{16.23}$$
Eq. 16.22 becomes
$$y(x) = f(x) + \int_a^x K(x, t)y(t)\,dt, \tag{16.24}$$
which is a Volterra equation of the second kind. This reformulation as a Volterra
integral equation offers certain advantages in investigating questions of existence
and uniqueness.
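The verification of Eq. 16.21 by differentiation, mentioned above, can be written out as a short worked step. This is a sketch of the argument, with $x'$ introduced here as a dummy name for the outer integration variable:

```latex
% Right-hand side: Leibniz rule; the boundary term (x - x) f(x) vanishes.
\frac{d}{dx}\int_a^x (x - t)\,f(t)\,dt
  = (x - x)\,f(x) + \int_a^x \frac{\partial}{\partial x}(x - t)\,f(t)\,dt
  = \int_a^x f(t)\,dt .
% Left-hand side: fundamental theorem of calculus on the outer integral.
\frac{d}{dx}\int_a^x\!\left[\int_a^{x'} f(t)\,dt\right]dx'
  = \int_a^x f(t)\,dt .
% Both sides vanish at x = a, so the constant of integration is zero.
```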
EXAMPLE 16.1.3 Linear Oscillator Equation
As a simple illustration, consider the linear oscillator equation
$$y'' + \omega^2y = 0 \tag{16.25}$$
with
$$y(0) = 0, \qquad y'(0) = 1.$$
This yields
$$A(x) = 0, \qquad B(x) = \omega^2, \qquad g(x) = 0.$$
Substituting into Eq. 16.22 (or Equations 16.23 and 16.24), we find that the
integral equation becomes
$$y(x) = x + \omega^2\int_0^x (t - x)y(t)\,dt. \tag{16.26}$$
This integral equation, Eq. 16.26, is equivalent to the original differential
equation plus the initial conditions. The reader may show that each form is indeed
satisfied by $y(x) = (1/\omega)\sin\omega x$.
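As a numerical illustration of the Volterra form, the equation $y(x) = x + \omega^2\int_0^x (t - x)y(t)\,dt$ can be marched forward on a grid and compared with $(1/\omega)\sin\omega x$. The sketch below is one possible discretization, not a method given in the text: the function name `solve_volterra_oscillator`, the trapezoid rule, and the parameter values are assumptions made here. Because the kernel $(t - x)$ vanishes at $t = x$, each new value depends only on values already computed.

```python
import math

def solve_volterra_oscillator(omega, x_max, n):
    """March y(x) = x + omega^2 * int_0^x (t - x) y(t) dt on a uniform grid.

    The integral is approximated by the composite trapezoid rule. The term at
    t = x carries the factor (t - x) = 0, so the scheme is explicit.
    """
    h = x_max / n
    xs = [i * h for i in range(n + 1)]
    ys = [0.0] * (n + 1)  # y(0) = 0 follows from the equation itself
    for i in range(1, n + 1):
        x = xs[i]
        acc = 0.5 * (xs[0] - x) * ys[0]  # endpoint t = 0 with weight 1/2
        for j in range(1, i):
            acc += (xs[j] - x) * ys[j]
        ys[i] = x + omega ** 2 * h * acc
    return xs, ys

omega = 2.0
xs, ys = solve_volterra_oscillator(omega, x_max=3.0, n=400)
max_error = max(abs(y - math.sin(omega * x) / omega) for x, y in zip(xs, ys))
print(max_error)
```

The error shrinks roughly as the square of the step size, as expected for the trapezoid rule.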
Let us reconsider the linear oscillator equation (16.25) but now with the
boundary conditions
$$y(0) = 0, \qquad y(b) = 0.$$
Since $y'(0)$ is not given, we must modify the procedure. The first integration
gives
$$y' = -\omega^2\int_0^x y\,dx + y'(0). \tag{16.27}$$
Integrating a second time and again using Eq. 16.21, we have
$$y = -\omega^2\int_0^x (x - t)y(t)\,dt + y'(0)x. \tag{16.28}$$
To eliminate the unknown $y'(0)$, we now impose the condition $y(b) = 0$. This
gives
$$\omega^2\int_0^b (b - t)y(t)\,dt = by'(0). \tag{16.29}$$
Substituting this back into Eq. 16.28, we obtain
$$y(x) = -\omega^2\int_0^x (x - t)y(t)\,dt + \omega^2\frac{x}{b}\int_0^b (b - t)y(t)\,dt. \tag{16.30}$$
Now let us break the interval $[0, b]$ into two intervals $[0, x]$ and $[x, b]$.
Since
$$\frac{x}{b}(b - t) - (x - t) = \frac{t}{b}(b - x), \tag{16.31}$$
we find
$$y(x) = \omega^2\int_0^x \frac{t}{b}(b - x)y(t)\,dt + \omega^2\int_x^b \frac{x}{b}(b - t)y(t)\,dt. \tag{16.32}$$
Finally, if we define a kernel (Fig. 16.2)
$$K(x, t) = \begin{cases} \dfrac{t}{b}(b - x), & t < x, \\[1ex] \dfrac{x}{b}(b - t), & x < t, \end{cases} \tag{16.33}$$
we have
$$y(x) = \omega^2\int_0^b K(x, t)y(t)\,dt, \tag{16.34}$$
a homogeneous Fredholm equation of the second kind.
FIG. 16.2 The kernel $K(x, t)$ as a function of $t$
Our new kernel, $K(x, t)$, has some interesting properties.
1. It is symmetric, $K(x, t) = K(t, x)$.
2. It is continuous in the sense that
$$\left.\frac{t}{b}(b - x)\right|_{t = x} = \left.\frac{x}{b}(b - t)\right|_{t = x}.$$
3. Its derivative with respect to $t$ is discontinuous. As $t$
increases through the point $t = x$, there is a
discontinuity of $-1$ in $\partial K(x, t)/\partial t$.
We shall return to these properties in Section 16.5 in which we identify K(x, t)
as a Green's function.
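The three kernel properties, and the Fredholm equation $y(x) = \omega^2\int_0^b K(x, t)y(t)\,dt$ itself, can be spot-checked numerically. The sketch below assumes the trial solution $y = \sin(n\pi x/b)$ with $\omega = n\pi/b$ (the familiar solutions of the oscillator equation vanishing at both ends, supplied here for illustration); the function names and sample points are hypothetical.

```python
import math

def kernel(x, t, b):
    """Kernel of Eq. 16.33: (t/b)(b - x) for t < x, (x/b)(b - t) for x <= t."""
    return t * (b - x) / b if t < x else x * (b - t) / b

def fredholm_rhs(y, x, omega, b, n=4000):
    """omega^2 * integral_0^b K(x, t) y(t) dt by the composite trapezoid rule."""
    h = b / n
    total = 0.5 * (kernel(x, 0.0, b) * y(0.0) + kernel(x, b, b) * y(b))
    for j in range(1, n):
        t = j * h
        total += kernel(x, t, b) * y(t)
    return omega ** 2 * h * total

b, mode = 2.0, 3
omega = mode * math.pi / b
y = lambda t: math.sin(omega * t)
x = 0.7

# The integral equation should reproduce y(x) for this trial solution.
residual = abs(fredholm_rhs(y, x, omega, b) - y(x))

# Property 1: symmetry under interchange of the two arguments.
symmetry_gap = abs(kernel(0.3, 1.1, b) - kernel(1.1, 0.3, b))

# Property 3: one-sided slopes in t on either side of t = x differ by -1.
eps = 1e-6
slope_right = (kernel(x, x + eps, b) - kernel(x, x, b)) / eps
slope_left = (kernel(x, x, b) - kernel(x, x - eps, b)) / eps
jump = slope_right - slope_left

print(residual, symmetry_gap, jump)
```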
In the transformation of a linear, second-order differential equation into an
integral equation, the initial or boundary conditions play a decisive role. If we
have initial conditions (only one end of our interval), the differential equation
transforms into a Volterra integral equation. For the case of the linear oscillator
equation with boundary conditions (both ends of our interval), the differential
equation leads to a Fredholm integral equation with a kernel that will be a
Green's function.
It might be noted that the reverse transformation (integral equation to
differential equation) is not always possible. There exist integral equations for
which no corresponding differential equation is known.
EXERCISES
16.1.1 Starting with the differential equation, integrate twice and derive the Volterra
integral equation corresponding to
(a) $y''(x) - y(x) = 0$; $y(0) = 0$, $y'(0) = 1$.
ANS. $y = \int_0^x (x - t)y(t)\,dt + x$.
(b) $y''(x) - y(x) = 0$; $y(0) = 1$, $y'(0) = -1$.
ANS. $y = \int_0^x (x - t)y(t)\,dt - x + 1$.
Check your results with Eq. 16.23.
16.1.2 Derive a Fredholm integral equation corresponding to
$$y''(x) - y(x) = 0; \qquad y(1) = 1, \quad y(-1) = 1,$$
(a) by integrating twice,
(b) by forming the Green's function.
ANS. $y(x) = 1 - \int_{-1}^{1} K(x, t)y(t)\,dt$,
$$K(x, t) = \begin{cases} \tfrac{1}{2}(1 - x)(t + 1), & x > t, \\ \tfrac{1}{2}(1 - t)(x + 1), & x < t. \end{cases}$$
16.1.3 (a) Starting with the given answers of Exercise 16.1.1, differentiate and recover
the original differential equations and the boundary conditions.
(b) Repeat for Exercise 16.1.2.
16.1.4 The general second-order linear differential equation with constant coefficients
is
$$y''(x) + a_1y'(x) + a_2y(x) = 0.$$
Given the boundary conditions
$$y(0) = y(1) = 0,$$
integrate twice and develop the integral equation
$$y(x) = \int_0^1 K(x, t)y(t)\,dt,$$
with
$$K(x, t) = \begin{cases} a_2t(1 - x) + a_1(x - 1), & t < x, \\ a_2x(1 - t) + a_1x, & x < t. \end{cases}$$
Note that $K(x, t)$ is symmetric and continuous if $a_1 = 0$. How is this related
to self-adjointness of the differential equation?
16.1.5 Verify that $\int_a^x\!\int_a^x f(t)\,dt\,dx = \int_a^x (x - t)f(t)\,dt$ for all $f(t)$ (for which the integrals
exist).
16.1.6 Given $\varphi(x) = x - \int_0^x (t - x)\varphi(t)\,dt$. Solve this integral equation by converting it
to a differential equation (plus boundary conditions) and solving the differential
equation (by inspection).
16.1.7 Show that the homogeneous Volterra equation of the second kind
has no solution (apart from the trivial $\psi = 0$).
Hint. Develop a Maclaurin expansion of $\psi(x)$. Assume $\psi(x)$ and $K(x, t)$ are
differentiable with respect to $x$ as needed.
16.2 INTEGRAL TRANSFORMS, GENERATING
FUNCTIONS
To put the problem of solving integral equations in perspective, we compare
differentiation and integration:

Differentiation                         Integration
Rules, systematic procedures            Often no integrated function exists in closed form
Computing machine can be instructed     Numerical integration may have to be used
to do analytic differentiation

Analogous to differentiation, linear differential equations are solved
completely in Chapter 8. Analogous to integration, there is no general method
available for inverting integral equations. However, certain special cases may
be treated with our integral transforms (Chapter 15). For convenience these are
listed here. If
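$$\psi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{ixt}\varphi(t)\,dt,$$
then
$$\varphi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-ixt}\psi(t)\,dt \quad \text{(Fourier)}. \tag{16.35}$$
If
$$\psi(x) = \int_0^\infty e^{-xt}\varphi(t)\,dt,$$
then
$$\varphi(x) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} e^{xt}\psi(t)\,dt \quad \text{(Laplace)}. \tag{16.36}$$
If
$$\psi(x) = \int_0^\infty t^{x - 1}\varphi(t)\,dt,$$
then
$$\varphi(x) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} x^{-t}\psi(t)\,dt \quad \text{(Mellin)}. \tag{16.37}$$
If
$$\psi(x) = \int_0^\infty t\varphi(t)J_\nu(xt)\,dt,$$
then
$$\varphi(x) = \int_0^\infty t\psi(t)J_\nu(xt)\,dt \quad \text{(Hankel)}. \tag{16.38}$$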
Actually the usefulness of the integral transform technique extends a bit
beyond these four rather specialized forms.
EXAMPLE 16.2.1 Fourier Transform Solution
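Let us consider a Fredholm equation of the first kind with a kernel of the
general type $k(x - t)$:
$$f(x) = \int_{-\infty}^{\infty} k(x - t)\varphi(t)\,dt, \tag{16.39}$$
in which $\varphi(t)$ is our unknown function. Assuming that the needed transforms exist,
we apply the Fourier convolution theorem (Section 15.5) to obtain
$$f(x) = \int_{-\infty}^{\infty} K(\omega)\Phi(\omega)e^{-i\omega x}\,d\omega. \tag{16.40}$$
The functions $K(\omega)$ and $\Phi(\omega)$ are the Fourier transforms of $k(x)$ and $\varphi(x)$,
respectively. Inverting, by Eq. 16.35, we have
$$K(\omega)\Phi(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(x)e^{i\omega x}\,dx = \frac{F(\omega)}{\sqrt{2\pi}}. \tag{16.41}$$
Then
$$\Phi(\omega) = \frac{1}{\sqrt{2\pi}}\,\frac{F(\omega)}{K(\omega)}, \tag{16.42}$$
and again inverting we have
$$\varphi(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{F(\omega)}{K(\omega)}e^{-i\omega x}\,d\omega. \tag{16.43}$$
For a rigorous justification of this result the reader is invited to follow Morse
and Feshbach across complex planes. An extension of this transformation
solution appears as Exercise 16.2.1.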
EXAMPLE 16.2.2 Generalized Abel Equation, Convolution Theorem
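The generalized Abel equation is
$$f(x) = \int_0^x \frac{\varphi(t)}{(x - t)^\alpha}\,dt, \qquad 0 < \alpha < 1, \quad \text{with } f(x) \text{ known}, \; \varphi(t) \text{ unknown}. \tag{16.44}$$
Taking the Laplace transform of both sides of this equation, we obtain
$$\mathcal{L}\{f(x)\} = \mathcal{L}\left\{\int_0^x \frac{\varphi(t)}{(x - t)^\alpha}\,dt\right\} = \mathcal{L}\{x^{-\alpha}\}\,\mathcal{L}\{\varphi(x)\}, \tag{16.45}$$
the last step following by the Laplace convolution theorem (Section 15.11).
Then
$$\mathcal{L}\{\varphi(x)\} = \frac{s^{1 - \alpha}\mathcal{L}\{f(x)\}}{(-\alpha)!}. \tag{16.46}$$
Dividing by $s$,1 we obtain
$$\frac{1}{s}\mathcal{L}\{\varphi(x)\} = \frac{s^{-\alpha}\mathcal{L}\{f(x)\}}{(-\alpha)!} = \frac{\mathcal{L}\{x^{\alpha - 1}\}\,\mathcal{L}\{f(x)\}}{(\alpha - 1)!\,(-\alpha)!}. \tag{16.47}$$
Combining the factorials (Eq. 10.32), and applying the Laplace convolution
theorem again, we discover that
$$\frac{1}{s}\mathcal{L}\{\varphi(x)\} = \frac{\sin\pi\alpha}{\pi}\,\mathcal{L}\left\{\int_0^x \frac{f(t)}{(x - t)^{1 - \alpha}}\,dt\right\}. \tag{16.48}$$
Inverting with the aid of Exercise 15.11.1, we get
$$\int_0^x \varphi(t)\,dt = \frac{\sin\pi\alpha}{\pi}\int_0^x \frac{f(t)}{(x - t)^{1 - \alpha}}\,dt,$$
and finally, by differentiating,
$$\varphi(x) = \frac{\sin\pi\alpha}{\pi}\,\frac{d}{dx}\int_0^x \frac{f(t)}{(x - t)^{1 - \alpha}}\,dt.$$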
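The inversion formula $\varphi(x) = (\sin\pi\alpha/\pi)\,(d/dx)\int_0^x f(t)(x - t)^{\alpha - 1}\,dt$ can be tried on a simple case. The sketch below assumes $\alpha = 1/2$ and $f(x) = x$, for which the formula gives $\varphi(x) = (2/\pi)\sqrt{x}$. The substitution $t = x\sin^2\theta$, the midpoint rule, the central difference, and the name `abel_half` are all choices made here for illustration, not part of the text.

```python
import math

def abel_half(g, x, n=2000):
    """Return integral_0^x g(t) / sqrt(x - t) dt.

    The substitution t = x sin^2(theta) removes the endpoint singularity:
    the integrand becomes 2 sqrt(x) sin(theta) g(x sin^2(theta)) on [0, pi/2],
    which is then integrated with the midpoint rule.
    """
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        sin_theta = math.sin(theta)
        total += 2.0 * math.sqrt(x) * sin_theta * g(x * sin_theta ** 2)
    return total * h

f = lambda t: t          # known left-hand side, f(x) = x
x, delta = 0.8, 1e-4

# Inversion formula with alpha = 1/2: sin(pi/2)/pi = 1/pi, and the
# x-derivative is approximated by a central difference.
phi_numeric = (abel_half(f, x + delta) - abel_half(f, x - delta)) / (2.0 * delta) / math.pi
phi_exact = (2.0 / math.pi) * math.sqrt(x)

# Forward check: inserting phi back into the Abel equation should return f(x).
f_back = abel_half(lambda t: (2.0 / math.pi) * math.sqrt(t), x)

print(phi_numeric, phi_exact, f_back)
```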
1 $s^{1 - \alpha}$ does not have an inverse for $0 < \alpha < 1$.
Generating Functions
Occasionally, the reader may encounter integral equations that involve
generating functions. Suppose we have the admittedly special case
$$f(x) = \int_{-1}^{1}\frac{\varphi(t)}{(1 - 2xt + x^2)^{1/2}}\,dt.$$
We notice two important features:
1. $(1 - 2xt + x^2)^{-1/2}$ generates the Legendre
polynomials.
2. $[-1, 1]$ is the orthogonality interval for the Legendre
polynomials.
If we now expand the denominator (property 1) and assume that our unknown
$\varphi(t)$ may be written as a series of these same Legendre polynomials, we have
$$f(x) = \int_{-1}^{1}\sum_{n=0}^{\infty} a_nP_n(t)\sum_{r=0}^{\infty} P_r(t)x^r\,dt. \tag{16.52}$$
Utilizing the orthogonality of the Legendre polynomials (property 2), we obtain
$$f(x) = \sum_{r=0}^{\infty}\frac{2a_r}{2r + 1}x^r.$$
We may identify the $a_n$'s by differentiating $n$ times and then setting $x = 0$:
$$f^{(n)}(0) = n!\,\frac{2}{2n + 1}\,a_n.$$
Hence
$$\varphi(t) = \sum_{n=0}^{\infty}\frac{2n + 1}{2}\,\frac{f^{(n)}(0)}{n!}\,P_n(t). \tag{16.54}$$
Similar results may be obtained with the other generating functions (compare
Exercise 15.2.9). Actually the technique of expanding in a series of special
functions is always available. It is worth a try whenever the expansion is possible
(and convenient) and the interval is appropriate.
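A quick numerical check of the generating-function approach: for $f(x) = x^2$ the only nonvanishing coefficient is $a_2 = 5/2$, so the candidate solution is $\varphi(t) = \tfrac{5}{2}P_2(t)$. The sketch below inserts it into $\int_{-1}^{1}\varphi(t)(1 - 2xt + x^2)^{-1/2}\,dt$ and compares the result with $x^2$. The function names, the Simpson quadrature, and the sample point $x = 0.4$ are assumptions made for illustration.

```python
import math

def legendre_p2(t):
    """Legendre polynomial P_2(t) = (3 t^2 - 1) / 2."""
    return 0.5 * (3.0 * t * t - 1.0)

def generating_integral(phi, x, n=2000):
    """integral_{-1}^{1} phi(t) / sqrt(1 - 2 x t + x^2) dt by Simpson's rule.

    Requires |x| < 1 so that the square root stays positive; n must be even.
    """
    h = 2.0 / n
    g = lambda t: phi(t) / math.sqrt(1.0 - 2.0 * x * t + x * x)
    total = g(-1.0) + g(1.0)
    for i in range(1, n):
        weight = 4 if i % 2 else 2
        total += weight * g(-1.0 + i * h)
    return total * h / 3.0

x = 0.4
phi = lambda t: 2.5 * legendre_p2(t)   # (2n + 1)/2 with n = 2
lhs_value = generating_integral(phi, x)
rhs_value = x ** 2

print(lhs_value, rhs_value)
```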
EXERCISES
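16.2.1 The kernel of a Fredholm equation of the second kind,
$$\varphi(x) = f(x) + \lambda\int_{-\infty}^{\infty} K(x, t)\varphi(t)\,dt,$$
is of the form $k(x - t)$.2 Assuming that the required transforms exist, show that
$$\varphi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\frac{F(t)e^{-ixt}}{1 - \sqrt{2\pi}\,\lambda K(t)}\,dt.$$
$F(t)$ and $K(t)$ are the Fourier transforms of $f(x)$ and $k(x)$, respectively.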
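16.2.2 The kernel of a Volterra equation of the first kind,
$$f(x) = \int_0^x K(x, t)\varphi(t)\,dt,$$
has the form $k(x - t)$. Assuming that the required transforms exist, show that
$$\varphi(x) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty}\frac{F(s)}{K(s)}e^{xs}\,ds.$$
$F(s)$ and $K(s)$ are the Laplace transforms of $f(x)$ and $k(x)$, respectively.
16.2.3 The kernel of a Volterra equation of the second kind,
$$\varphi(x) = f(x) + \lambda\int_0^x K(x, t)\varphi(t)\,dt,$$
has the form $k(x - t)$. Assuming that the required transforms exist, show that
$$\varphi(x) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty}\frac{F(s)}{1 - \lambda K(s)}e^{xs}\,ds.$$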
16.2.4 Using the Laplace transform solution (Exercise 16.2.3), solve
(a) $\varphi(x) = x + \int_0^x (t - x)\varphi(t)\,dt$,
ANS.
(b) $\varphi(x) = x - \int_0^x (t - x)\varphi(t)\,dt$.
ANS.
Check your results by substituting back into the original integral equations.
2 This kernel and a range $0 \leq x < \infty$ are the characteristics of integral
equations of the Wiener-Hopf type. Details will be found in Chapter 8 of Morse
and Feshbach.
16.2.5 Reformulate the equations of Example 16.2.1 (Eqs. 16.39 to 16.43), using Fourier
cosine transforms.
16.2.6 Given the Fredholm integral equation,
$$e^{-x^2} = \int_{-\infty}^{\infty} e^{-(x - t)^2}\varphi(t)\,dt,$$
apply the Fourier convolution technique of Example 16.2.1 to solve for $\varphi(t)$.
16.2.7 Solve Abel's equation,
by the following method:
(a) Multiply both sides by (z — x)* and integrate with respect to x over the
range 0 < x < z.
(b) Reverse the order of integration and evaluate the integral on the right-hand
side (with respect to x) by the beta function.
Note.
Г^ = B(l -a,a)
Г
sin not
16.2.8 Given the generalized Abel equation with f(x) = 1,
Solve for cp(t) and verify that cp(t) is a solution of the preceding equation.
16.2.9 A Fredholm equation of the first kind has a kernel e
Лео
f(x) = «Г*"' "
J — oo
Show that the solution is
ANS.
<p(t) =
smna
n
in which Я„(х) is an nth-order Hermite polynomial.
16.2.10 Solve the integral equation
for the unknown function $\varphi(t)$ if
(a) $f(x) = x^{2s}$,
(b) $f(x) = x^{2s + 1}$.
ANS. (a) $\varphi(t) = \dfrac{4s + 1}{2}P_{2s}(t)$,
(b) $\varphi(t) = \dfrac{4s + 3}{2}P_{2s + 1}(t)$.
16.2.11 A Kirchhoff diffraction theory analysis of a laser leads to the integral equation
K(r1,r2)v(rl)dA.
The unknown u(rj) gives the geometric distribution of the radiation field over
one mirror surface; the range of integration is over the surface of that mirror.
For square confocal spherical mirrors the integral equation becomes
in which b is the centerline distance between the laser mirrors. This can be put
in a somewhat simpler form by the substitutions
$$\frac{k x_i^2}{b} = \xi_i^2, \qquad \frac{k y_i^2}{b} = \eta_i^2, \qquad\text{and}\qquad \frac{k a^2}{b} = \frac{2\pi a^2}{\lambda b} = \alpha^2.$$
(a) Show that the variables separate and we get two integral equations.
(b) Show that the new limits $\pm\alpha$ may be approximated by $\pm\infty$ for a mirror dimension $a \gg \lambda$.
(c) Solve the resulting integral equations.
16.3 NEUMANN SERIES, SEPARABLE
(DEGENERATE) KERNELS
Many and probably most integral equations cannot be solved by the specialized integral transform techniques of the preceding section. Here we develop three rather general techniques for solving integral equations. The first, due largely to Neumann, Liouville, and Volterra, develops the unknown function $\varphi(x)$ as a power series in $\lambda$, where $\lambda$ is a given constant. The method is applicable whenever the series converges.
The second method is somewhat restricted because it requires that the two variables appearing in the kernel $K(x, t)$ be separable. However, there are two major rewards: (1) the relation between an integral equation and a set of simultaneous linear algebraic equations is shown explicitly, and (2) the method leads to eigenvalues and eigenfunctions, in close analogy to Section 4.6.
Third, a technique for numerical solution of Fredholm equations of both the
first and second kind is outlined. The problem posed by ill-conditioned matrices
is emphasized.
Neumann Series
We solve a linear integral equation of the second kind by successive approximations; our integral equation is the Fredholm equation
$$\varphi(x) = f(x) + \lambda\int_a^b K(x,t)\varphi(t)\,dt, \tag{16.56}$$
in which $f(x) \neq 0$. If the upper limit of the integral is a variable (Volterra equation), the following development will still hold, but with minor modifications. Let us try (there is no guarantee that it will work) to approximate our unknown function by
$$\varphi(x) \approx \varphi_0(x) = f(x). \tag{16.57}$$
This choice is not mandatory. If you can make a better guess, go ahead and guess. The choice here is equivalent to saying that the integral or the constant $\lambda$ is small. To improve this first crude approximation, we feed $\varphi_0(x)$ back into the integral, Eq. 16.56, and get
$$\varphi_1(x) = f(x) + \lambda\int_a^b K(x,t)f(t)\,dt. \tag{16.58}$$
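Repeating this process of substituting the new $\varphi_n(x)$ back into Eq. 16.56, we develop the sequence
$$\varphi_2(x) = f(x) + \lambda\int_a^b K(x,t_1)f(t_1)\,dt_1 + \lambda^2\int_a^b\!\!\int_a^b K(x,t_1)K(t_1,t_2)f(t_2)\,dt_2\,dt_1 \tag{16.59}$$
and
$$\varphi_n(x) = \sum_{i=0}^{n}\lambda^i u_i(x), \tag{16.60}$$
where
$$\begin{aligned}
u_0(x) &= f(x),\\
u_1(x) &= \int_a^b K(x,t_1)f(t_1)\,dt_1,\\
u_2(x) &= \int_a^b\!\!\int_a^b K(x,t_1)K(t_1,t_2)f(t_2)\,dt_2\,dt_1,\\
u_n(x) &= \int_a^b\!\cdots\!\int_a^b K(x,t_1)K(t_1,t_2)\cdots K(t_{n-1},t_n)f(t_n)\,dt_n\cdots dt_1.
\end{aligned} \tag{16.61}$$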
We expect that our solution $\varphi(x)$ will be
$$\varphi(x) = \lim_{n\to\infty}\varphi_n(x) = \lim_{n\to\infty}\sum_{i=0}^{n}\lambda^i u_i(x), \tag{16.62}$$
provided that our infinite series converges.
We may conveniently check the convergence by the Cauchy ratio test, Section 5.2, noting that
$$|\lambda^n u_n(x)| \le |\lambda|^n\,|f|_{\max}\,|K|_{\max}^n\,|b - a|^n, \tag{16.63}$$
using $|f|_{\max}$ to represent the maximum value of $|f(x)|$ in the interval $[a,b]$ and $|K|_{\max}$ to represent the maximum value of $|K(x,t)|$ in its domain in the $x$, $t$-plane. We have convergence if
$$|\lambda|\,|K|_{\max}\,|b - a| < 1. \tag{16.64}$$
Note that $\lambda^n u_n(\max)$ is being used as a comparison series. If it converges, our
actual series must converge. If this condition is not satisfied, we may or may not
have convergence. A more sensitive test is required. Of course, even if the
Neumann series diverges, there still may be a solution obtainable by another
method.
To see what has been done with this iterative manipulation, we may find it
helpful to rewrite the Neumann series solution, Eq. 16.59, in operator form.
We start by rewriting Eq. 16.56 as
$$\varphi = \lambda\mathcal{K}\varphi + f,$$
where $\mathcal{K}$ represents the integration operator $\int_a^b K(x,t)[\ ]\,dt$. Solving for $\varphi$, we obtain
$$\varphi = (1 - \lambda\mathcal{K})^{-1}f.$$
Binomial expansion leads to Eq. 16.59. The convergence of the Neumann series is a demonstration that the inverse operator $(1 - \lambda\mathcal{K})^{-1}$ exists.
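Written out term by term, the binomial (geometric) expansion of the inverse operator takes the following form. This is only a formal sketch: it assumes the operator series converges, and it identifies the repeated action of $\mathcal{K}$ on $f$ with the functions $u_n(x)$ of Eq. 16.61.

```latex
\begin{align*}
\varphi &= (1 - \lambda\mathcal{K})^{-1} f \\
        &= \bigl(1 + \lambda\mathcal{K} + \lambda^{2}\mathcal{K}^{2} + \cdots\bigr) f \\
        &= \sum_{n=0}^{\infty} \lambda^{n}\,\mathcal{K}^{n} f ,
\qquad \mathcal{K}^{n} f = u_{n}(x).
\end{align*}
```

Truncating the sum after the $\lambda^2$ term reproduces $\varphi_2(x)$ of Eq. 16.59.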
EXAMPLE 16.3.1 Neumann Series Solution
To illustrate the Neumann method, we consider the integral equation
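$$\varphi(x) = x + \frac{1}{2}\int_{-1}^{1}(t - x)\varphi(t)\,dt. \tag{16.65}$$
To start the Neumann series, we take
$$\varphi_0(x) = x. \tag{16.66}$$
Then
$$\varphi_1(x) = x + \frac{1}{2}\int_{-1}^{1}(t - x)\,t\,dt = x + \frac{1}{3}.$$
Substituting $\varphi_1(x)$ back into Eq. 16.65, we get
$$\varphi_2(x) = x + \frac{1}{2}\int_{-1}^{1}(t - x)\,t\,dt + \frac{1}{2}\int_{-1}^{1}(t - x)\frac{1}{3}\,dt = x + \frac{1}{3} - \frac{x}{3}.$$
Continuing this process of substituting back into Eq. 16.65, we obtain
$$\varphi_3(x) = x + \frac{1}{3} - \frac{x}{3} - \frac{1}{3^2},$$
and by induction
$$\varphi_{2n}(x) = x + \sum_{s=1}^{n}(-1)^{s-1}3^{-s} - x\sum_{s=1}^{n}(-1)^{s-1}3^{-s}. \tag{16.67}$$
Letting $n \to \infty$, we get
$$\varphi(x) = \tfrac{3}{4}x + \tfrac{1}{4}. \tag{16.68}$$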
This solution can (and should) be checked by substituting back into the original
equation, Eq. 16.65.
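The successive substitutions of this example can also be carried out numerically. The following is a minimal sketch, not part of the original text: it assumes NumPy, replaces the integral by an 8-point Gauss-Legendre sum (exact here because the integrand is a low-order polynomial), and uses the hypothetical helper name `neumann`.

```python
import numpy as np

# Gauss-Legendre nodes and weights on [-1, 1]; the integrand is a
# low-order polynomial, so this quadrature is exact up to rounding.
t, w = np.polynomial.legendre.leggauss(8)

lam = 0.5                      # lambda of Eq. 16.65
f = t.copy()                   # f(x) = x sampled at the nodes
K = t[None, :] - t[:, None]    # K[i, k] = t_k - x_i, i.e. K(x, t) = t - x


def neumann(n_iter):
    """Return phi_n at the quadrature nodes after n_iter substitutions."""
    phi = f.copy()             # phi_0 = f
    for _ in range(n_iter):
        # phi_{n+1}(x_i) = f(x_i) + lam * sum_k w_k K(x_i, t_k) phi_n(t_k)
        phi = f + lam * (K * w) @ phi
    return phi


phi = neumann(60)
exact = 0.75 * t + 0.25        # Eq. 16.68
max_error = float(np.max(np.abs(phi - exact)))
print(max_error)
```

Calling `neumann(1)` and `neumann(2)` gives the sampled values of $\varphi_1$ and $\varphi_2$ found above, which is a convenient check on the hand calculation.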
It is interesting to note that our series converged easily even though Eq. 16.64 is not satisfied in this particular case. Actually Eq. 16.64 is a rather crude upper bound on $\lambda$. It can be shown that a necessary and sufficient condition for the convergence of our series solution is that $|\lambda| < |\lambda_e|$, where $\lambda_e$ is the eigenvalue of smallest magnitude of the corresponding homogeneous equation [$f(x) = 0$]. For this particular example $|\lambda_e| = \sqrt{3}/2$. Clearly, $\lambda = \tfrac{1}{2} < |\lambda_e| = \sqrt{3}/2$.
One approach to the calculation of time-dependent perturbations in quantum mechanics starts with the integral equation for the evolution operator
$$U(t,t_0) = 1 - \frac{i}{\hbar}\int_{t_0}^{t}V(t_1)U(t_1,t_0)\,dt_1. \tag{16.69a}$$
Iteration leads to
$$U(t,t_0) = 1 - \frac{i}{\hbar}\int_{t_0}^{t}V(t_1)\,dt_1 + \left(\frac{i}{\hbar}\right)^{2}\int_{t_0}^{t}\!\!\int_{t_0}^{t_1}V(t_1)V(t_2)\,dt_2\,dt_1 + \cdots. \tag{16.69b}$$
The evolution operator is obtained as a series of multiple integrals of the perturbing potential $V(t)$, closely analogous to the Neumann series, Eq. 16.60. For $V = V_0$, independent of $t$, the evolution operator becomes
$$U(t,t_0) = \exp[-i(t - t_0)V_0/\hbar].$$
A second and similar relationship between the Neumann series and quantum
mechanics appears when the Schrodinger wave equation for scattering is
reformulated as an integral equation. The first term in a Neumann series
solution is the incident (unperturbed) wave. The second term is the Born
approximation, Eq. 16.191 of Section 16.6.
The Neumann method may also be applied to Volterra integral equations
of the second kind, Eq. 16.4 or Eq. 16.56 with the fixed upper limit b replaced
by a variable $x$. In the Volterra case the Neumann series converges for all $\lambda$ as long as the kernel is square integrable.
Separable Kernel
The technique of replacing our integral equation by simultaneous algebraic equations may also be used whenever our kernel $K(x,t)$ is separable in the sense that
$$K(x,t) = \sum_{j=1}^{n}M_j(x)N_j(t), \tag{16.70}$$
where $n$, the upper limit of the sum, is finite. Such kernels are sometimes called degenerate. Our class of separable kernels includes all polynomials and many of the elementary transcendental functions; for example,
$$\cos(t - x) = \cos t\cos x + \sin t\sin x. \tag{16.70a}$$
If Eq. 16.70 is satisfied, substitution into the Fredholm equation of the second kind, Eq. 16.2, yields
$$\varphi(x) = f(x) + \lambda\sum_{j=1}^{n}M_j(x)\int_a^b N_j(t)\varphi(t)\,dt, \tag{16.71}$$
interchanging integration and summation. Now the integral with respect to $t$ is a constant,
$$\int_a^b N_j(t)\varphi(t)\,dt = c_j. \tag{16.72}$$
Hence Eq. 16.71 becomes
$$\varphi(x) = f(x) + \lambda\sum_{j=1}^{n}c_jM_j(x). \tag{16.73}$$
This gives us $\varphi(x)$, our solution, once the constants $c_j$ have been determined. Equation 16.73 further tells us the form of $\varphi(x)$: $f(x)$, plus a linear combination of the $x$-dependent factors of the separable kernel.
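We may find $c_i$ by multiplying Eq. 16.73 by $N_i(x)$ and integrating to eliminate the $x$-dependence. Use of Eq. 16.72 yields
$$c_i = b_i + \lambda\sum_{j=1}^{n}a_{ij}c_j, \tag{16.74}$$
where
$$b_i = \int_a^b N_i(x)f(x)\,dx, \qquad a_{ij} = \int_a^b N_i(x)M_j(x)\,dx. \tag{16.75}$$
It is perhaps helpful to write Eq. 16.74 in matrix form, with $\mathsf{A} = (a_{ij})$:
$$\mathbf{b} = \mathbf{c} - \lambda\mathsf{A}\mathbf{c} = (1 - \lambda\mathsf{A})\mathbf{c}, \tag{16.76a}$$
or1
$$\mathbf{c} = (1 - \lambda\mathsf{A})^{-1}\mathbf{b}. \tag{16.76b}$$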
Equation 16.76a is equivalent to a set of simultaneous linear algebraic equations
$$\begin{aligned}
(1 - \lambda a_{11})c_1 - \lambda a_{12}c_2 - \lambda a_{13}c_3 - \cdots &= b_1,\\
-\lambda a_{21}c_1 + (1 - \lambda a_{22})c_2 - \lambda a_{23}c_3 - \cdots &= b_2,\\
-\lambda a_{31}c_1 - \lambda a_{32}c_2 + (1 - \lambda a_{33})c_3 - \cdots &= b_3, \quad\text{and so on.}
\end{aligned} \tag{16.77}$$
1 Notice the similarity to the operator form of the Neumann series.
If our integral equation is homogeneous, [$f(x) = 0$], then $\mathbf{b} = 0$. To get a solution, we set the determinant of the coefficients of $c_i$ equal to zero,
$$|1 - \lambda\mathsf{A}| = 0, \tag{16.78}$$
exactly as in Section 4.6. The roots of Eq. 16.78 yield our eigenvalues. Substituting into $(1 - \lambda\mathsf{A})\mathbf{c} = 0$, we find the $c_i$'s and then Eq. 16.73 gives our solution.
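EXAMPLE 16.3.2
To illustrate this technique for determining eigenvalues and eigenfunctions of the homogeneous Fredholm equation, we consider the simple case
$$\varphi(x) = \lambda\int_{-1}^{1}(t + x)\varphi(t)\,dt. \tag{16.79}$$
Here
$$M_1 = 1, \quad M_2(x) = x, \qquad N_1(t) = t, \quad N_2 = 1.$$
Equation 16.75 yields
$$a_{11} = a_{22} = 0, \qquad a_{12} = \tfrac{2}{3}, \qquad a_{21} = 2.$$
Equation 16.78, our secular equation, becomes
$$\begin{vmatrix} 1 & -\dfrac{2\lambda}{3} \\ -2\lambda & 1 \end{vmatrix} = 0. \tag{16.80}$$
Expanding, we obtain
$$1 - \frac{4\lambda^2}{3} = 0, \qquad \lambda = \pm\frac{\sqrt{3}}{2}. \tag{16.81}$$
Substituting the eigenvalues $\lambda = \pm\sqrt{3}/2$ into Eq. 16.76, we have
$$c_1 \mp \frac{c_2}{\sqrt{3}} = 0. \tag{16.82}$$
Finally, with a choice of $c_1 = 1$, Eq. 16.73 gives
$$\begin{aligned}
\varphi_1(x) &= \frac{\sqrt{3}}{2}\bigl(1 + \sqrt{3}\,x\bigr), & \lambda &= \frac{\sqrt{3}}{2},\\
\varphi_2(x) &= -\frac{\sqrt{3}}{2}\bigl(1 - \sqrt{3}\,x\bigr), & \lambda &= -\frac{\sqrt{3}}{2}.
\end{aligned} \tag{16.83}$$
Since our equation is homogeneous, the normalization of $\varphi(x)$ is arbitrary.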
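The separable kernel recipe of this example translates directly into a small matrix computation. The sketch below is an illustration rather than the text's own procedure: it assumes NumPy, builds the matrix of Eq. 16.75 by Gauss-Legendre quadrature, and uses the fact that $|1 - \lambda\mathsf{A}| = 0$ makes $1/\lambda$ an eigenvalue of $\mathsf{A}$. The lists `M` and `N` are hypothetical names for the kernel factors.

```python
import numpy as np

# Separable kernel K(x, t) = t + x = M_1(x) N_1(t) + M_2(x) N_2(t)
# with M_1 = 1, N_1 = t, M_2 = x, N_2 = 1 on the interval [-1, 1].
M = [lambda x: np.ones_like(x), lambda x: x]
N = [lambda x: x, lambda x: np.ones_like(x)]

x, w = np.polynomial.legendre.leggauss(6)

# a_ij = integral of N_i(x) M_j(x) dx over [-1, 1], Eq. 16.75
A = np.array([[np.sum(w * N[i](x) * M[j](x)) for j in range(2)]
              for i in range(2)])

# |1 - lambda A| = 0 means that 1/lambda is an eigenvalue of A.
mu, C = np.linalg.eig(A)
lams = np.sort((1.0 / mu).real)

print(A)      # matrix of Eq. 16.75
print(lams)   # eigenvalues lambda of the integral equation
```

The columns of `C` hold the coefficient vectors $(c_1, c_2)$ up to normalization, from which Eq. 16.73 gives the eigenfunctions.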
If the kernel is not separable in the sense of Eq. 16.70, there is still the possibility that it may be approximated by a kernel that is separable. Then we can get the exact solution of an approximate equation, an equation that approximates the original equation. The solution of the separable approximate kernel problem can then be checked by substituting back into the original, unseparable kernel problem.
Numerical Solution
There is extensive literature on the numerical solution of integral equations, much of it concerning special techniques for certain situations. One method of fair generality is the replacement of the single integral equation by a set of simultaneous algebraic equations. And again matrix techniques are invoked. This simultaneous algebraic equation (matrix) approach is applied here to two different cases. For the homogeneous Fredholm equation of the second kind this method works well. For the Fredholm equation of the first kind the method is a disaster. First we deal with the disaster.
We consider the Fredholm integral equation of the first kind
$$f(x) = \int_a^b K(x,t)\varphi(t)\,dt, \tag{16.84a}$$
with $f(x)$ and $K(x, t)$ known and $\varphi(t)$ unknown. The integral can be evaluated (in principle) by quadrature techniques. For maximum accuracy the Gaussian method (Appendix 2) is recommended (if the kernel is continuous and has continuous derivatives). The numerical quadrature replaces the integral by a summation:
$$f(x_i) = \sum_{k=1}^{n}A_kK(x_i,t_k)\varphi(t_k), \tag{16.84b}$$
with $A_k$ the quadrature coefficients. We abbreviate $f(x_i)$ as $f_i$, $\varphi(t_k)$ as $\varphi_k$, and $A_kK(x_i,t_k)$ as $B_{ik}$. In effect we are changing from a function description to a vector-matrix description with the $n$ components of the vector $(f_i)$ defined as the values of the function at the $n$ discrete points $[f(x_i)]$. Equation 16.84b becomes
$$f_i = \sum_{k=1}^{n}B_{ik}\varphi_k,$$
a matrix equation. Inverting $(B_{ik})$, we obtain
$$\varphi(t_k) = \varphi_k = \sum_{i=1}^{n}B^{-1}_{ki}f_i, \tag{16.84c}$$
and Eq. 16.84a is solved—in principle. In practice, the quadrature coefficient-
kernel matrix is often "ill-conditioned" (with respect to inversion). This means
that in the inversion process small (numerical) errors are multiplied by large
factors. In the inversion process all significant figures may be lost and Eq. 16.84c
becomes numerical nonsense.
This disaster should not be entirely unexpected. Integration is essentially a
smoothing operation. $f(x)$ is relatively insensitive to local variation of $\varphi(t)$.
Conversely, $\varphi(t)$ may be exceedingly sensitive to small changes in $f(x)$. Small
errors in f(x) or in B are magnified and accuracy disappears. This same
behavior shows up in attempts to invert Laplace transforms numerically—
Section 15.8.
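The loss of accuracy described here is easy to observe numerically. The following sketch is an illustration under stated assumptions, not a calculation from the text: it assumes NumPy, picks the smooth kernel $e^{xt}$ on $[0,1]$ and a hypothetical "unknown" function $\sin \pi t$, builds the matrix $B_{ik} = A_kK(x_i,t_k)$ with 10-point Gauss-Legendre quadrature, perturbs the right-hand side at the $10^{-8}$ level, and solves the linear system.

```python
import numpy as np

n = 10
z, wz = np.polynomial.legendre.leggauss(n)
t = 0.5 * (z + 1.0)            # nodes mapped from [-1, 1] to [0, 1]
A = 0.5 * wz                   # weights scaled with the interval length

# Illustrative smooth kernel (an assumption): K(x, t) = exp(x t)
B = np.exp(np.outer(t, t)) * A     # B[i, k] = A_k K(x_i, t_k)

phi_true = np.sin(np.pi * t)       # hypothetical "unknown" function
f = B @ phi_true                   # exact discrete right-hand side

rng = np.random.default_rng(0)
f_noisy = f * (1.0 + 1e-8 * rng.standard_normal(n))   # tiny relative errors

phi_rec = np.linalg.solve(B, f_noisy)   # numerical version of Eq. 16.84c
cond = float(np.linalg.cond(B))
rel_error = float(np.linalg.norm(phi_rec - phi_true)
                  / np.linalg.norm(phi_true))
print(cond, rel_error)
```

The condition number `cond` measures how strongly the inversion can amplify relative errors; for a smooth kernel such as this one it is very large, and the recovered `phi_rec` should be expected to differ from `phi_true` by far more than the size of the perturbation.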
When the quadrature-matrix technique is applied to the integral equation eigenvalue problem, the symmetric kernel, homogeneous Fredholm equation of the second kind,2
$$\lambda\varphi(x) = \int_a^b K(x,t)\varphi(t)\,dt, \tag{16.84d}$$
the technique is far more successful. Replacing the integral by a set of simultaneous algebraic equations (numerical quadrature, Appendix 2), we have
$$\lambda\varphi_i = \sum_{k=1}^{n}A_kK_{ik}\varphi_k, \tag{16.84e}$$
with $\varphi_i = \varphi(x_i)$ as before. The points $x_i$, $i = 1, 2, \ldots, n$ are taken to be the same (numerically) as $t_k$, $k = 1, 2, \ldots, n$, so that $K_{ik}$ will be symmetric. The system is symmetrized by multiplying by $A_i^{1/2}$ so that
$$\lambda\bigl(A_i^{1/2}\varphi_i\bigr) = \sum_{k=1}^{n}\bigl(A_i^{1/2}K_{ik}A_k^{1/2}\bigr)\bigl(A_k^{1/2}\varphi_k\bigr). \tag{16.84f}$$
Replacing $A_i^{1/2}\varphi_i$ by $\psi_i$ and $A_i^{1/2}K_{ik}A_k^{1/2}$ by $S_{ik}$, we obtain
$$\lambda\boldsymbol{\psi} = \mathsf{S}\boldsymbol{\psi}, \tag{16.84g}$$
with $\mathsf{S}$ symmetric (since the kernel $K(x, t)$ was assumed symmetric). $\boldsymbol{\psi}$, of course, has components $\psi_i = \psi(x_i)$. Equation 16.84g is our matrix eigenvalue equation, Eq. 4.146. The eigenvalues are readily obtained by calling the SSP EIGEN.3 For kernels such as those of Exercise 16.3.15 and using a 10-point Gauss-Legendre quadrature, EIGEN determines the largest eigenvalue to within about 0.5 percent for the cases where the kernel has discontinuities in its derivatives. If the derivatives are continuous, the accuracy is much better.
2 The eigenvalue $\lambda$ has been written on the left side, multiplying the eigenfunction, as is customary in matrix analysis (Section 4.6). In this form $\lambda$ will take on a maximum value.
3 The corresponding subroutine in the PL/I Scientific Subroutine Package is MSDU.
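The symmetrized quadrature scheme of Eqs. 16.84e to 16.84g can be sketched in a few lines. This is a modern stand-in, not the SSP routine named in the text: it assumes NumPy, uses `numpy.linalg.eigh` for the symmetric matrix eigenvalue problem, and the function name `largest_eigenvalue` is hypothetical. The two kernels are taken from Exercise 16.3.18, whose listed values are 1.35303 for $e^{xt}$ and 0.34741 for $|x - t|$.

```python
import numpy as np


def largest_eigenvalue(kernel, n=10):
    """Largest eigenvalue of lambda*phi(x) = int_0^1 K(x,t) phi(t) dt.

    Uses n-point Gauss-Legendre quadrature and the symmetrization
    S_ik = A_i^(1/2) K_ik A_k^(1/2) of Eq. 16.84f.
    """
    z, wz = np.polynomial.legendre.leggauss(n)
    x = 0.5 * (z + 1.0)                     # nodes mapped to [0, 1]
    A = 0.5 * wz                            # weights scaled to [0, 1]
    K = kernel(x[:, None], x[None, :])      # K_ik = K(x_i, x_k)
    root = np.sqrt(A)
    S = root[:, None] * K * root[None, :]   # symmetric matrix S
    vals, vecs = np.linalg.eigh(S)          # eigenvalues in ascending order
    psi = vecs[:, -1]                       # eigenvector of the largest one
    phi = psi / root                        # back to phi_i = psi_i / A_i^(1/2)
    return vals[-1], x, phi


lam_exp, _, _ = largest_eigenvalue(lambda x, t: np.exp(x * t))
lam_abs, _, _ = largest_eigenvalue(lambda x, t: np.abs(x - t))
print(lam_exp, lam_abs)
```

The returned `phi` is the eigenfunction sampled at the quadrature points, with arbitrary normalization and sign.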
Linz4 has described an interesting variational refinement in the determination of $\lambda_{\max}$ to high accuracy. The key to his method is Exercise 17.8.7. The components of the eigenfunction vector are obtained from Eq. 16.84d with $\varphi(t_k)$ now known and $\varphi_i = \varphi(x_i)$ generated as required. (The $x_i$ are no longer tied to the $t_k$.)
EXERCISES
16.3.1 Using the Neumann series, solve
(a) $\varphi(x) = 1 - 2\int_0^x t\,\varphi(t)\,dt$,
ANS. (a) $\varphi(x) = e^{-x^2}$.
(b) $\varphi(x) = x + \int_0^x (t - x)\varphi(t)\,dt$,
(c) $\varphi(x) = x - \int_0^x (t - x)\varphi(t)\,dt$.
16.3.2 Solve the equation
$$\varphi(x) = x + \frac{1}{2}\int_{-1}^{1}(t + x)\varphi(t)\,dt$$
by the separable kernel method. Compare with the Neumann method solution of Section 16.3.
ANS. $\varphi(x) = \tfrac{1}{2}(3x + 1)$.
16.3.3 Find the eigenvalues and eigenfunctions of
$$\varphi(x) = \lambda\int_{-1}^{1}(t - x)\varphi(t)\,dt.$$
16.3.4 Find the eigenvalues and eigenfunctions of
$$\varphi(x) = \lambda\int_0^{2\pi}\cos(x - t)\varphi(t)\,dt.$$
ANS. $\lambda_1 = \lambda_2 = \dfrac{1}{\pi}$,
$\varphi(x) = A\cos x + B\sin x$.
16.3.5 Find the eigenvalues and eigenfunctions of
$$y(x) = \lambda\int_{-1}^{1}(x - t)^2\,y(t)\,dt.$$
Hint. This problem may be treated by the separable kernel method or by a
Legendre expansion.
16.3.6 If the separable kernel technique of this section is applied to a Fredholm equation of the first kind (Eq. 16.1), show that Eq. 16.76 is replaced by
$$\mathbf{c} = \mathsf{A}^{-1}\mathbf{b}.$$
In general the solution for the unknown $\varphi(t)$ is not unique.
4 Peter Linz, "On the numerical computation of eigenvalues and eigenvectors of symmetric integral equations." Math. Computation, 24, 905 (1970).
16.3.7 Solve
$$\psi(x) = x + \int_0^1 (1 + xt)\psi(t)\,dt$$
by each of the following methods:
(a) the Neumann series technique,
(b) the separable kernel technique,
(c) educated guessing.
16.3.8 Use the separable kernel technique to show that
$$\psi(x) = \lambda\int_0^{\pi}\cos x\,\sin t\,\psi(t)\,dt$$
has no solution (apart from the trivial $\psi = 0$). Explain this result in terms of separability and symmetry.
16.3.9 Solve
$$\varphi(x) = 1 + \lambda^2\int_0^x (x - t)\varphi(t)\,dt$$
by each of the following methods:
(a) Reduction to a differential equation (including establishment of boundary conditions),
(b) The Neumann series,
(c) The use of Laplace transforms.
ANS. $\varphi(x) = \cosh\lambda x$.
16.3.10 (a) In Eq. 16.69a take $V = V_0$, independent of $t$. Without using Eq. 16.69b, show that Eq. 16.69a leads directly to
$$U(t - t_0) = \exp[-i(t - t_0)V_0/\hbar].$$
(b) Repeat for Eq. 16.69b without using Eq. 16.69a.
16.3.11 Given $\varphi(x) = \lambda\int_0^1 (1 + xt)\varphi(t)\,dt$, solve for the eigenvalues and the eigenfunctions by the separable kernel technique.
16.3.12 Knowing the form of the solutions can be a great advantage. For the integral equation
$$\varphi(x) = \lambda\int_0^1 (1 + xt)\varphi(t)\,dt,$$
assume $\varphi(x)$ to have the form $1 + bx$. Substitute into the integral equation. Integrate and solve for $b$ and $\lambda$.
16.3.13 The integral equation
$$\varphi(x) = \lambda\int_0^1 J_0(\alpha xt)\varphi(t)\,dt, \qquad J_0(\alpha) = 0,$$
is approximated by
$$\varphi(x) = \lambda\int_0^1 [1 - x^2t^2]\varphi(t)\,dt.$$
Find the minimum eigenvalue $\lambda$ and the corresponding eigenfunction $\varphi(t)$ of the approximate equation.
ANS. $\lambda_{\min} = 1.112486$,
$\varphi(x) = 1 - 0.303337x^2$.
16.3.14 You are given the integral equation
$$\varphi(x) = \lambda\int_0^1 \sin \pi xt\;\varphi(t)\,dt.$$
Approximate the kernel by
$$K(x, t) = 4xt(1 - xt) \approx \sin \pi xt.$$
Find the positive eigenvalue and the corresponding eigenfunction for the approximate integral equation.
Note. For $K(x, t) = \sin \pi xt$, $\lambda = 1.6334$.
ANS. $\lambda = 1.5678$,
$\varphi(x) = x - 0.6955x^2$.
4,Я^ = -V3T-4)
16.3.15 The equation
$$f(x) = \int_a^b K(x,t)\varphi(t)\,dt$$
has a degenerate kernel $K(x, t) = \sum_{i=1}^{n}M_i(x)N_i(t)$.
(a) Show that this integral equation has no solution unless $f(x)$ can be written as
$$f(x) = \sum_{i=1}^{n}f_iM_i(x),$$
with the $f_i$ constants.
(b) Show that to any solution $\varphi(x)$ we may add $\psi(x)$, provided $\psi(x)$ is orthogonal to all $N_i(x)$:
$$\int_a^b N_i(x)\psi(x)\,dx = 0 \quad\text{for all } i.$$
16.3.16 Using numerical quadrature, convert
$$\varphi(x) = \lambda\int_0^1 J_0(\alpha xt)\varphi(t)\,dt, \qquad J_0(\alpha) = 0,$$
to a set of simultaneous linear equations.
(a) Find the minimum eigenvalue $\lambda$.
(b) Determine $\varphi(x)$ at discrete values of $x$ and plot $\varphi(x)$ versus $x$. Compare with the approximate eigenfunction of Exercise 16.3.13.
ANS. (a) $\lambda_{\min} = 1.14502$.
16.3.17 Using numerical quadrature, convert
$$\varphi(x) = \lambda\int_0^1 \sin \pi xt\;\varphi(t)\,dt$$
to a set of simultaneous linear equations.
(a) Find the minimum eigenvalue $\lambda$.
(b) Determine $\varphi(x)$ at discrete values of $x$ and plot $\varphi(x)$ versus $x$. Compare with the approximate eigenfunction of Exercise 16.3.14.
ANS. (a) $\lambda_{\min} = 1.6334$.
16.3.18 Given a homogeneous Fredholm equation of the second kind
$$\lambda\varphi(x) = \int_0^1 K(x,t)\varphi(t)\,dt.$$
(a) Calculate the largest eigenvalue $\lambda_0$. Use the 10-point Gauss-Legendre quadrature technique. For comparison the eigenvalues listed by Linz are given as $\lambda_{\text{exact}}$.
(b) Tabulate $\varphi(x_k)$, where the $x_k$ are the 10 evaluation points in $[0,1]$.
(c) Tabulate the ratio
$$\frac{1}{\lambda_0\varphi(x)}\int_0^1 K(x,t)\varphi(t)\,dt \qquad\text{for } x = x_k.$$
This is the test of whether or not you really have a solution.
(a) $K(x,t) = e^{xt}$.
ANS. $\lambda_{\text{exact}} = 1.35303$.
(b) $K(x,t) = \begin{cases}\tfrac{1}{2}x(2 - t), & x < t,\\ \tfrac{1}{2}t(2 - x), & x > t.\end{cases}$
ANS. $\lambda_{\text{exact}} = 0.24296$.
(c) $K(x,t) = |x - t|$.
ANS. $\lambda_{\text{exact}} = 0.34741$.
(d) $K(x,t) = \begin{cases}x, & x < t,\\ t, & x > t.\end{cases}$
ANS. $\lambda_{\text{exact}} = 0.40528$.
Note. (1) The evaluation points $x_i$ of Gauss-Legendre quadrature for $[-1,1]$ may be linearly transformed into $[0,1]$,
$$x_i[0,1] = \tfrac{1}{2}\bigl(x_i[-1,1] + 1\bigr).$$
Then the weighting factors $A_i$ are reduced in proportion to the length of the interval
$$A_i[0,1] = \tfrac{1}{2}A_i[-1,1].$$
16.3.19 Using the matrix variational technique of Exercise 17.8.7, refine your calculation of the eigenvalue of Exercise 16.3.18(c) [$K(x, t) = |x - t|$]. Try a 40 x 40 matrix.
Note. Your matrix should be symmetric so that the (unknown) eigenvectors will be orthogonal.
ANS. (40-point Gauss-Legendre quadrature) 0.34727.
16.4 HILBERT-SCHMIDT THEORY
Symmetrization of Kernels
This is the development of the properties of linear integral equations (Fredholm type) with symmetric kernels,
$$K(x,t) = K(t,x). \tag{16.85}$$
Before plunging into the theory, we note that some important nonsymmetric kernels can be symmetrized. If we have the equation
$$\varphi(x) = f(x) + \lambda\int_a^b K(x,t)\rho(t)\varphi(t)\,dt, \tag{16.86}$$
the total kernel is actually $K(x,t)\rho(t)$, clearly not symmetric if $K(x, t)$ alone is symmetric. However, if we multiply Eq. 16.86 by $\sqrt{\rho(x)}$ and substitute
$$\sqrt{\rho(x)}\,\varphi(x) = \psi(x), \tag{16.87}$$
we obtain
$$\psi(x) = \sqrt{\rho(x)}\,f(x) + \lambda\int_a^b \bigl[K(x,t)\sqrt{\rho(x)\rho(t)}\bigr]\psi(t)\,dt, \tag{16.88}$$
with a symmetric total kernel, $K(x, t)\sqrt{\rho(x)\rho(t)}$. We shall meet $\rho(x)$ later as a weighting factor in this integral equation Sturm-Liouville theory.
Orthogonal Eigenfunctions
We now focus on the homogeneous Fredholm equation of the second kind:
$$\varphi(x) = \lambda\int_a^b K(x,t)\varphi(t)\,dt. \tag{16.89}$$
We assume that the kernel K(x,t) is symmetric and real. Perhaps one of the
first questions the mathematician might ask about the equation is, "Does it
make sense?" or more precisely, "Does an eigenvalue A satisfying this equation
exist?" With the aid of the Schwarz and Bessel inequalities, Courant and Hilbert
(Chapter III, Section 4) show that if K(x,t) is continuous, there is at least one
such eigenvalue and possibly an infinite number of them.
We show that the eigenvalues, $\lambda$, are real and that the corresponding eigenfunctions, $\varphi_i(x)$, are orthogonal. Let $\lambda_i$, $\lambda_j$ be two different eigenvalues and $\varphi_i(x)$, $\varphi_j(x)$, the corresponding eigenfunctions. Equation 16.89 then becomes
$$\varphi_i(x) = \lambda_i\int_a^b K(x,t)\varphi_i(t)\,dt, \tag{16.90a}$$
$$\varphi_j(x) = \lambda_j\int_a^b K(x,t)\varphi_j(t)\,dt. \tag{16.90b}$$
If we multiply Eq. 16.90a by $\lambda_j\varphi_j(x)$, Eq. 16.90b by $\lambda_i\varphi_i(x)$, and then integrate with respect to $x$, the two equations become1
$$\lambda_j\int_a^b \varphi_i(x)\varphi_j(x)\,dx = \lambda_i\lambda_j\int_a^b\!\!\int_a^b K(x,t)\varphi_i(t)\varphi_j(x)\,dt\,dx, \tag{16.91a}$$
$$\lambda_i\int_a^b \varphi_i(x)\varphi_j(x)\,dx = \lambda_i\lambda_j\int_a^b\!\!\int_a^b K(x,t)\varphi_j(t)\varphi_i(x)\,dt\,dx. \tag{16.91b}$$
1 We assume that the necessary integrals exist. For an example of a simple pathological case, see Exercise 16.4.3.
Since we have demanded that $K(x, t)$ be symmetric, Eq. 16.91b may be rewritten as
$$\lambda_i\int_a^b \varphi_i(x)\varphi_j(x)\,dx = \lambda_i\lambda_j\int_a^b\!\!\int_a^b K(x,t)\varphi_i(t)\varphi_j(x)\,dt\,dx. \tag{16.92}$$
Subtracting Eq. 16.92 from Eq. 16.91a, we obtain
$$(\lambda_j - \lambda_i)\int_a^b \varphi_i(x)\varphi_j(x)\,dx = 0. \tag{16.93}$$
This has the same form as Eq. 9.33 in the Sturm-Liouville theory. Since $\lambda_i \neq \lambda_j$,
$$\int_a^b \varphi_i(x)\varphi_j(x)\,dx = 0, \qquad i \neq j, \tag{16.94}$$
proving orthogonality. Note that with a real symmetric kernel no complex conjugates are involved in Eq. 16.94. For the self-adjoint or Hermitian kernel see Exercise 16.4.1.
If the eigenvalue $\lambda_i$ is degenerate,2 the eigenfunctions for that particular eigenvalue may be orthogonalized by the Gram-Schmidt method (Section 9.3). Our orthogonal eigenfunctions may, of course, be normalized, and we assume that this has been done. The result is
$$\int_a^b \varphi_i(x)\varphi_j(x)\,dx = \delta_{ij}. \tag{16.95}$$
To demonstrate that the $\lambda_i$ are real, we need to get into complex conjugates. Taking the complex conjugate of Eq. 16.90a, we have
$$\varphi_i^*(x) = \lambda_i^*\int_a^b K(x,t)\varphi_i^*(t)\,dt, \tag{16.96}$$
provided the kernel $K(x,t)$ is real. Now, using Eq. 16.96 instead of Eq. 16.90b, we see that the analysis leads to
$$(\lambda_i^* - \lambda_i)\int_a^b \varphi_i^*(x)\varphi_i(x)\,dx = 0. \tag{16.97}$$
This time the integral cannot vanish (unless we have the trivial solution, $\varphi_i(x) = 0$) and
$$\lambda_i^* = \lambda_i, \tag{16.98}$$
or $\lambda_i$, our eigenvalue, is real.
If readers feel that somehow this state of affairs is vaguely familiar, they are right. This is the third time we have passed this way, first with Hermitian matrices, then with Sturm-Liouville (self-adjoint) equations, and now with Hilbert-Schmidt integral equations. The correspondence between the Hermitian matrices and the self-adjoint differential equations shows up in modern physics as the two outstanding formulations of quantum mechanics: the Heisenberg matrix approach and the Schrodinger differential operator approach. In Section 16.5 we shall explore further the correspondence between the Hilbert-Schmidt symmetric kernel integral equations and the Sturm-Liouville self-adjoint differential equations.
2 If more than one distinct eigenfunction corresponds to the same eigenvalue (satisfying Eq. 16.89), that eigenvalue is said to be degenerate.
The eigenfunctions of our integral equation form a complete set3 in the sense that any function $g(x)$ that can be generated by the integral
$$g(x) = \int_a^b K(x,t)h(t)\,dt, \tag{16.99}$$
in which $h(t)$ is any piecewise continuous function, can be represented by a series of eigenfunctions,
$$g(x) = \sum_{n=1}^{\infty}a_n\varphi_n(x). \tag{16.100}$$
The series converges uniformly and absolutely.
Let us extend this to the kernel, $K(x, t)$, by asserting that
$$K(x,t) = \sum_{n=1}^{\infty}a_n\varphi_n(t), \tag{16.101}$$
and $a_n = a_n(x)$. Substituting into the original integral equation (Eq. 16.89) and using the orthogonality integral, we obtain
$$\varphi_i(x) = \lambda_ia_i(x). \tag{16.102}$$
Therefore for our homogeneous Fredholm equation of the second kind the kernel may be expressed in terms of the eigenfunctions and eigenvalues by
$$K(x,t) = \sum_{n=1}^{\infty}\frac{\varphi_n(x)\varphi_n(t)}{\lambda_n} \qquad\text{(zero not an eigenvalue)}. \tag{16.103}$$
Here we have a bilinear expansion, a linear expansion in $\varphi_n(x)$ and linear in $\varphi_n(t)$. Similar bilinear expansions appear in Section 8.7. It is possible that the expansion given by Eq. 16.101 may not exist. As an illustration of the sort of pathological behavior that may occur, the reader is invited to apply this analysis to
$$\varphi(x) = \lambda\int_0^{\infty}e^{-xt}\varphi(t)\,dt$$
(compare Exercise 16.4.3).
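For a well-behaved kernel the bilinear expansion, Eq. 16.103, is easy to check numerically. The sketch below is an illustration, not part of the text: it assumes NumPy and uses the kernel $K(x,t) = 1 + xt$ on $[0,1]$ of Exercises 16.3.11 and 16.4.6, for which the separable kernel method reduces the eigenvalue problem to a 2 x 2 symmetric matrix. The helper name `phi` is hypothetical.

```python
import numpy as np

# Kernel K(x, t) = 1 + x t on [0, 1] with M = N = (1, x), so the matrix
# of Eq. 16.75 is symmetric: a_ij = int_0^1 x^(i+j) dx for i, j = 0, 1.
A = np.array([[1.0, 1.0 / 2.0],
              [1.0 / 2.0, 1.0 / 3.0]])

mu, C = np.linalg.eigh(A)     # A c = mu c, and |1 - lambda A| = 0 gives
lams = 1.0 / mu               # lambda = 1 / mu


def phi(n, x):
    """Eigenfunction n, normalized so that int_0^1 phi_n(x)^2 dx = 1."""
    c = C[:, n]
    norm2 = c @ A @ c         # int_0^1 (c_0 + c_1 x)^2 dx
    return (c[0] + c[1] * x) / np.sqrt(norm2)


x = np.linspace(0.0, 1.0, 7)[:, None]
t = np.linspace(0.0, 1.0, 7)[None, :]
K_exact = 1.0 + x * t
K_sum = sum(phi(n, x) * phi(n, t) / lams[n] for n in range(2))
print(np.sort(lams))
print(float(np.max(np.abs(K_sum - K_exact))))
```

Because this kernel is degenerate, the sum in Eq. 16.103 has only two terms and reproduces $K(x,t)$ to rounding error; for a general kernel the sum is infinite.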
It should be emphasized that this Hilbert-Schmidt theory is concerned with the establishment of properties of the eigenvalues (real) and eigenfunctions (orthogonality, completeness), properties that may be of great interest and value. The Hilbert-Schmidt theory does not solve the homogeneous integral equation for us any more than the Sturm-Liouville theory of Chapter 9 solved the differential equations. The solutions of the integral equation come from Sections 16.2 and 16.3 (including numerical analysis).
3 For a proof of this statement see Courant and Hilbert, Chapter III, Section 5.
Nonhomogeneous Integral Equation
We need a solution of the nonhomogeneous equation
$$\varphi(x) = f(x) + \lambda\int_a^b K(x,t)\varphi(t)\,dt. \tag{16.104}$$
Let us assume that the solutions of the corresponding homogeneous integral equation are known,
$$\varphi_n(x) = \lambda_n\int_a^b K(x,t)\varphi_n(t)\,dt, \tag{16.105}$$
the solution $\varphi_n(x)$ corresponding to the eigenvalue $\lambda_n$. We expand both $\varphi(x)$ and $f(x)$ in terms of this set of eigenfunctions
$$\varphi(x) = \sum_{n=1}^{\infty}a_n\varphi_n(x) \qquad (a_n \text{ unknown}), \tag{16.106}$$
$$f(x) = \sum_{n=1}^{\infty}b_n\varphi_n(x) \qquad (b_n \text{ known}). \tag{16.107}$$
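Substituting into Eq. 16.104, we obtain
$$\sum_{n=1}^{\infty}a_n\varphi_n(x) = \sum_{n=1}^{\infty}b_n\varphi_n(x) + \lambda\int_a^b K(x,t)\sum_{n=1}^{\infty}a_n\varphi_n(t)\,dt. \tag{16.108}$$
By interchanging the order of integration and summation, we may evaluate the integral by Eq. 16.105, and we get
$$\sum_{n=1}^{\infty}a_n\varphi_n(x) = \sum_{n=1}^{\infty}b_n\varphi_n(x) + \lambda\sum_{n=1}^{\infty}\frac{a_n\varphi_n(x)}{\lambda_n}. \tag{16.109}$$
If we multiply by $\varphi_i(x)$ and integrate from $x = a$ to $x = b$, the orthogonality of our eigenfunctions leads to
$$a_i = b_i + \lambda\frac{a_i}{\lambda_i}. \tag{16.110}$$
This can be rewritten as
$$a_i = b_i + \frac{\lambda}{\lambda_i - \lambda}\,b_i, \tag{16.111}$$
which brings us to our solution
$$\varphi(x) = f(x) + \lambda\sum_{i=1}^{\infty}\frac{\int_a^b f(t)\varphi_i(t)\,dt}{\lambda_i - \lambda}\,\varphi_i(x). \tag{16.112}$$
Here it is assumed that the eigenfunctions, $\varphi_i(x)$, are normalized to unity.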
Note that if $f(x) = 0$ there is no solution unless $\lambda = \lambda_i$. This means that our homogeneous equation has no solution (except the trivial $\varphi(x) = 0$) unless $\lambda$ is an eigenvalue, $\lambda_i$.
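Equation 16.112 can be tried out on a kernel with a single eigenfunction. The following sketch is an illustration under stated assumptions, not a calculation from the text: it assumes NumPy and the equation $y(x) = x + \lambda\int_0^1 xt\,y(t)\,dt$ of Exercise 16.4.4, whose kernel $xt$ has the one normalized eigenfunction $\sqrt{3}\,x$ with eigenvalue 3. The value `lam = 1.2` is arbitrary (any value other than 3 will do), and the function name `solution` is hypothetical.

```python
import numpy as np

# Gauss-Legendre quadrature mapped to [0, 1]
z, wz = np.polynomial.legendre.leggauss(8)
t = 0.5 * (z + 1.0)
w = 0.5 * wz

lam = 1.2                              # any value other than the eigenvalue 3
f = lambda x: x                        # inhomogeneous term f(x) = x
phi1 = lambda x: np.sqrt(3.0) * x      # normalized eigenfunction of K = x t
lam1 = 3.0                             # its eigenvalue

b1 = np.sum(w * f(t) * phi1(t))        # int_0^1 f(t) phi_1(t) dt


def solution(x):
    """Eq. 16.112 with the single eigenfunction of this rank-one kernel."""
    return f(x) + lam * b1 / (lam1 - lam) * phi1(x)


x = np.linspace(0.0, 1.0, 5)
y = solution(x)

# Substitute back into y(x) = x + lam * int_0^1 x t y(t) dt
residual = y - (x + lam * x * np.sum(w * t * solution(t)))
print(y, residual)
```

The result agrees with the closed form $3x/(3 - \lambda)$ quoted in Exercise 16.4.4(d), and the denominator $\lambda_1 - \lambda$ shows directly what goes wrong as $\lambda \to 3$.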
In the event that $\lambda$ for the nonhomogeneous equation (16.104) is equal to one of the eigenvalues, $\lambda_p$, of the homogeneous equation, our solution (Eq. 16.112) blows up. To repair the damage we return to Eq. 16.110 and give the value
$$a_p = b_p + \lambda_p\frac{a_p}{\lambda_p} = b_p + a_p \tag{16.113}$$
special attention. Clearly, $a_p$ drops out and is no longer determined by $b_p$, whereas $b_p = 0$. This implies that $\int_a^b f(x)\varphi_p(x)\,dx = 0$, that is, $f(x)$ is orthogonal to the eigenfunction $\varphi_p(x)$. If this is not the case, we have no solution.
Equation 16.111 still holds for $i \neq p$, so we multiply by $\varphi_i(x)$ and sum over $i$ ($i \neq p$) to obtain
$$\varphi(x) = f(x) + a_p\varphi_p(x) + \lambda_p{\sum_{i=1}^{\infty}}'\,\frac{\int_a^b f(t)\varphi_i(t)\,dt}{\lambda_i - \lambda_p}\,\varphi_i(x); \tag{16.114}$$
the prime emphasizes that the value $i = p$ is omitted. In this solution the $a_p$ remains as an undetermined constant.4
EXERCISES
16.4.1 In the Fredholm equation
$$\varphi(x) = \lambda\int_a^b K(x,t)\varphi(t)\,dt$$
the kernel $K(x, t)$ is self-adjoint or Hermitian,
$$K(x,t) = K^*(t,x).$$
Show that
(a) the eigenfunctions are orthogonal in the sense
$$\int_a^b \varphi_m^*(x)\varphi_n(x)\,dx = 0, \qquad m \neq n \ (\lambda_m \neq \lambda_n),$$
(b) the eigenvalues are real.
16.4.2 Solve the integral equation
$$\varphi(x) = x + \frac{1}{2}\int_{-1}^{1}(t + x)\varphi(t)\,dt$$
(compare Exercise 16.3.2) by the Hilbert-Schmidt method.
The application of the Hilbert-Schmidt technique here is somewhat like using
a shotgun to kill a mosquito, especially when the equation can be solved in about
15 seconds by expanding in Legendre polynomials.
4This is like the inhomogeneous linear differential equation. We may add to
its solution any constant times a solution of the corresponding homogeneous
differential equation.
16.4.3 Solve the Fredholm integral equation
$$\varphi(x) = \lambda\int_0^{\infty}e^{-xt}\varphi(t)\,dt.$$
Note. A series expansion of the kernel $e^{-xt}$ would permit a separable kernel-type solution (Section 16.3), except that the series is infinite. This suggests an infinite number of eigenvalues and eigenfunctions. If you stop with
$$\varphi(x) = x^{-1/2}, \qquad \lambda = \pi^{-1/2},$$
you will have missed most of the solutions! Show that the normalization integrals
of the eigenfunctions do not exist. A basic reason for this anomalous behavior
is that the range of integration is infinite, making this a "singular" integral
equation.
16.4.4 Given
$$y(x) = x + \lambda\int_0^1 xt\,y(t)\,dt.$$
(a) Determine $y(x)$ as a Neumann series.
(b) Find the range of $\lambda$ for which your Neumann series solution is convergent. Compare with the value obtained from
$$|\lambda|\cdot|K|_{\max} < 1.$$
(c) Find the eigenvalue and the eigenfunction of the corresponding homogeneous integral equation.
(d) By the separable kernel method show that the solution is
$$y(x) = \frac{3x}{3 - \lambda}.$$
(e) Find y(x) by the Hilbert-Schmidt method.
16.4.5 In Exercise 16.3.4
K(x,t) = cos(x - t).
The (unnormalized) eigenfunctions are cos x and sin x.
(a) Show that there is a function $h(t)$ such that $K(x, s)$, considered as a function of $s$ alone, may be written as
$$K(x,s) = \int_0^{2\pi}K(s,t)h(t)\,dt.$$
(b) Show that $K(x, t)$ may be expanded as
$$K(x,t) = \sum_{n=1}^{2}\frac{\varphi_n(x)\varphi_n(t)}{\lambda_n}.$$
16.4.6 The integral equation $\varphi(x) = \lambda\int_0^1 (1 + xt)\varphi(t)\,dt$ has eigenvalues $\lambda_1 = 0.7889$ and $\lambda_2 = 15.211$ and eigenfunctions $\varphi_1 = 1 + 0.5352x$ and $\varphi_2 = 1 - 1.8685x$.
(a) Show that these eigenfunctions are orthogonal over the interval $[0,1]$.
(b) Normalize the eigenfunctions to unity.
(c) Show that
$$K(x,t) = \frac{\varphi_1(x)\varphi_1(t)}{\lambda_1} + \frac{\varphi_2(x)\varphi_2(t)}{\lambda_2}.$$
ANS. (b) $\varphi_1(x) = 0.7831 + 0.4191x$,
$\varphi_2(x) = 1.8403 - 3.4386x$.
16.4.7 An alternate form of the solution to the nonhomogeneous integral equation, Eq. 16.104, is
$$\varphi(x) = \sum_{i=1}^{\infty}\frac{b_i\lambda_i}{\lambda_i - \lambda}\,\varphi_i(x).$$
(a) Derive this form without using Eq. 16.112.
(b) Show that this form and Eq. 16.112 are equivalent.
16.4.8 (a) Show that the eigenfunctions of Exercise 16.3.5 are orthogonal.
(b) Show that the eigenfunctions of Exercise 16.3.11 are orthogonal.
16.5 GREEN'S FUNCTIONS—ONE DIMENSION
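As part of the investigation of differential operators in Section 8.7, we see that Poisson's equation of electrostatics
$$\nabla^2\varphi(\mathbf{r}) = -\frac{\rho(\mathbf{r})}{\varepsilon_0} \tag{16.115}$$
has a solution
$$\varphi(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\int\frac{\rho(\mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|}\,d\tau_2. \tag{16.116}$$
Here we have the infinite case in which the range of integration covers all space. If desired, the potential $\varphi(\mathbf{r}_1)$ may be developed for a finite case by using appropriate charge and dipole layer distributions on the boundaries.1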
Equation 16.116 may be given two interpretations.
1. If the potential function $\varphi(\mathbf{r}_1)$ is known and we seek the charge distribution $\rho(\mathbf{r}_2)$, which produces the given potential, Eq. 16.116 is an integral equation for $\rho(\mathbf{r}_2)$.
2. If the charge distribution $\rho(\mathbf{r}_2)$ is known, Eq. 16.116 yields the electrostatic potential $\varphi(\mathbf{r}_1)$ as a definite integral.
Following up this second (and more frequently encountered) situation, we may use the physicists' customary cause and effect vocabulary. We might label $\rho(\mathbf{r}_2)$ the "cause" that gives rise to the "effect" $\varphi(\mathbf{r}_1)$; that is, the charge distribution produces a potential field. However, the effectiveness of the charge in producing this potential depends on the distance between the element of charge $\rho(\mathbf{r}_2)\,d\tau_2$ and the point of interest given by $\mathbf{r}_1$. This effectiveness or, let us say, the influence of the element of charge is given by the function $(4\pi\varepsilon_0|\mathbf{r}_1 - \mathbf{r}_2|)^{-1}$. For this reason $(4\pi\varepsilon_0|\mathbf{r}_1 - \mathbf{r}_2|)^{-1}$ is often called an influence function. Although we relabel it a Green's function, the physical basis for the term influence function remains important and may well be helpful in determining the form of other Green's functions.
1 Compare J. A. Stratton, Electromagnetic Theory. New York: McGraw-Hill (1941).
Also in Section 8.7, the Green's function (for the operator $\nabla^2$) is described as satisfying the point source equation
$$\nabla^2G(\mathbf{r}_1,\mathbf{r}_2) = -\delta(\mathbf{r}_1 - \mathbf{r}_2). \tag{8.122}$$
A detailed discussion of the Dirac delta function in terms of sequences is included. Using Eq. 8.122 and Green's theorem, Section 1.11, the Green's function is shown to be symmetric:
$$G(\mathbf{r}_1,\mathbf{r}_2) = G(\mathbf{r}_2,\mathbf{r}_1). \tag{8.139}$$
In Section 9.4 the Dirac delta and Green's functions are expanded in series
of eigenfunctions. These expansions make the symmetry properties explicit.
Moving into this chapter, in Section 16.1 it is seen that the integral equation
corresponding to a differential equation and certain boundary conditions may
lead to a peculiar kernel. This kernel is our Green's function.
The development of Green's functions from Eq. 8.122 for two- and three-
dimensional systems is the topic of Section 16.6. Here, for simplicity, we restrict
ourselves to one-dimensional cases and follow a somewhat different approach.2
Defining Properties
In our one-dimensional analysis we consider first the nonhomogeneous
Sturm-Liouville equation (Chapter 9)
A6.117)
in which J2? is the self-adjoint differential operator
As in Section 9.1, y(x) is required to satisfy certain boundary conditions at the
end points a and b of our interval [a, b]. Indeed, the interval may well be chosen
so that appropriate boundary conditions can be satisfied. We now proceed to
define a rather strange and arbitrary function G over the interval [a, b]. At this
stage the most that can be said in defense of G is that the defining properties
are legitimate, or mathematically acceptable.3 Later, it is hoped, G may appear
reasonable if not obvious.
1. The interval a < x < b is divided by a parameter t.
2 Equation 8.122 can be used for one-dimensional systems. The relationship
between these two different approaches to Green's functions is shown at the
end of this section.
3 Note, however, that these properties are just those of the kernel of the
Fredholm equation that had been derived from a self-adjoint differential
equation, Example 16.1.3.
GREEN'S FUNCTIONS—ONE DIMENSION 899
We label $G(x) = G_1(x)$ for $a < x < t$ and $G(x) = G_2(x)$ for $t < x < b$.
2. The functions $G_1(x)$ and $G_2(x)$ each satisfy the
homogeneous4 Sturm-Liouville equation; that is,
$$\mathcal{L}G_1(x) = 0, \quad a < x < t, \qquad \mathcal{L}G_2(x) = 0, \quad t < x < b. \tag{16.119}$$
3. At $x = a$, $G_1(x)$ satisfies the boundary conditions we
impose on y(x). At $x = b$, $G_2(x)$ satisfies the boundary
conditions imposed on y(x) at this end point of the
interval. For convenience in renormalizing, the
boundary conditions are taken to be homogeneous;
that is, at x = a
$$y(a) = 0, \quad\text{or}\quad y'(a) = 0, \quad\text{or}\quad \alpha y(a) + \beta y'(a) = 0,$$
and similarly for x = b.
4. We demand that G(x) be continuous,5
$$\lim_{x \to t^-} G_1(x) = \lim_{x \to t^+} G_2(x). \tag{16.120}$$
5. We require that G'(x) be discontinuous, specifically
that5
$$\frac{d}{dx}G_2(x)\Big|_t - \frac{d}{dx}G_1(x)\Big|_t = -\frac{1}{p(t)}, \tag{16.121}$$
where p(t) comes from the self-adjoint operator, Eq.
16.118. Note that with the first derivative discontinuous
the second derivative does not exist.
These requirements, in effect, make G a function of two variables, G(x, t).
Also, we note that G(x, t) depends on both the form of the differential operator
$\mathcal{L}$ and the boundary conditions that y(x) must satisfy.
Now, assuming that we can find a function G(x, t) that has these properties,
we label it a Green's function and proceed to show that a solution of Eq. 16.117
is
4 Homogeneous with respect to the unknown function. The function f(x) in
Eq. 16.117 is set equal to zero.
5 Strictly speaking, this is the limit as x -> t.
$$y(x) = \int_a^b G(x,t)f(t)\,dt. \tag{16.122}$$
To do this we first construct the Green's function, G(x, t). Let u(x) be a solution
of the homogeneous Sturm-Liouville equation that satisfies the boundary
conditions at x = a, and let v(x) be a solution that satisfies the boundary conditions
at x = b. Then we may take6
$$G(x,t) = \begin{cases} c_1 u(x), & a < x < t, \\ c_2 v(x), & t < x < b. \end{cases} \tag{16.123}$$
Continuity at x = t (Eq. 16.120) requires
$$c_2 v(t) - c_1 u(t) = 0. \tag{16.124}$$
Finally, the discontinuity in the first derivative (Eq. 16.121) becomes
$$c_2 v'(t) - c_1 u'(t) = -\frac{1}{p(t)}. \tag{16.125}$$
There will be a unique solution for our unknown coefficients $c_1$ and $c_2$ if the
Wronskian determinant
$$\begin{vmatrix} u(t) & v(t) \\ u'(t) & v'(t) \end{vmatrix} = u(t)v'(t) - v(t)u'(t)$$
does not vanish. We have seen in Section 8.6 that the nonvanishing of this
determinant is a necessary condition for linear independence. Let us consider
u(x) and v(x) to be independent. The contrary, which occurs when u(x) satisfies
the boundary conditions at both end points, requires a generalized Green's
function. Strictly speaking, no Green's function exists when u(x) and v(x) are
linearly dependent. This is also true when $\lambda = 0$ is an eigenvalue of the
homogeneous equation. However, a "generalized Green's function" may be defined.
This situation, which occurs with Legendre's equation, is discussed in Courant
and Hilbert and other references. For independent u(x) and v(x) we have the
Wronskian (again from Section 8.6 or Exercise 9.1.4)
$$u(t)v'(t) - v(t)u'(t) = \frac{A}{p(t)}, \tag{16.126}$$
in which A is a constant. Equation 16.126 is sometimes called Abel's formula.
Numerous examples have appeared in connection with Bessel and Legendre
functions. Now, from Eq. 16.125, we identify
$$c_1 = -\frac{v(t)}{A}, \qquad c_2 = -\frac{u(t)}{A}. \tag{16.127}$$
6 The "constants" $c_1$ and $c_2$ are independent of x, but they may (and do)
depend on the other variable, t.
Equation 16.124 is clearly satisfied. Substitution into Eq. 16.123 yields our
Green's function.
$$G(x,t) = \begin{cases} -\dfrac{1}{A}\,u(x)v(t), & a < x < t, \\[6pt] -\dfrac{1}{A}\,u(t)v(x), & t < x < b. \end{cases} \tag{16.128}$$
Note carefully that G(x, t) = G(t, x). This is the symmetry property that was
proved earlier in Section 8.7. Its physical interpretation is given by the
reciprocity principle (via our influence function): a cause at t yields the same effect at
x as a cause at x produces at t. In terms of our electrostatic analogy this is
obvious, the influence function depending only on the magnitude of the distance
between the two points.
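The construction in Eqs. 16.123 to 16.128 can be sketched in a few lines of Python. This is only an illustration: the helper name `make_green` and the test operator ($\mathcal{L} = d^2/dx^2$ on [0, 1] with u = x, v = 1 - x) are assumptions chosen for the sketch, not part of the text above.

```python
def make_green(u, du, v, dv, p):
    """Return G(x, t) assembled as in Eq. 16.128.

    u, v   : homogeneous solutions meeting the conditions at x = a and x = b
    du, dv : their first derivatives
    p      : the coefficient p(x) of the self-adjoint operator
    """
    def wronskian_constant(t):
        # A = p(t) [u(t) v'(t) - v(t) u'(t)], Eq. 16.126 (independent of t)
        return p(t) * (u(t) * dv(t) - v(t) * du(t))

    def G(x, t):
        A = wronskian_constant(t)
        if x <= t:
            return -u(x) * v(t) / A   # branch a < x < t
        return -u(t) * v(x) / A       # branch t < x < b

    return G


# Assumed test case: L = d^2/dx^2 on [0, 1] with y(0) = y(1) = 0
G = make_green(lambda x: x, lambda x: 1.0,
               lambda x: 1.0 - x, lambda x: -1.0,
               lambda x: 1.0)
print(G(0.3, 0.7), G(0.7, 0.3))  # the two values agree: G(x, t) = G(t, x)
```

Evaluating A at t inside `G` keeps the sketch close to Eq. 16.126; for a genuine Sturm-Liouville pair the value does not depend on t, so it could equally be computed once.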
Green's Function Integral—Differential Equation
We have constructed G(x, t), but there still remains the task of showing that
the integral (Eq. 16.122) with our new Green's function is indeed a solution of
the original differential equation (16.117). This we do by direct substitution.
With G(x, t) given by Eq. 16.128,7 Eq. 16.122 becomes
$$y(x) = -\frac{1}{A}\int_a^x v(x)u(t)f(t)\,dt - \frac{1}{A}\int_x^b u(x)v(t)f(t)\,dt. \tag{16.129}$$
Differentiating, we obtain
$$y'(x) = -\frac{1}{A}\int_a^x v'(x)u(t)f(t)\,dt - \frac{1}{A}\int_x^b u'(x)v(t)f(t)\,dt, \tag{16.130}$$
the derivatives of the limits canceling. A second differentiation yields
$$y''(x) = -\frac{1}{A}\int_a^x v''(x)u(t)f(t)\,dt - \frac{1}{A}\int_x^b u''(x)v(t)f(t)\,dt - \frac{1}{A}\left[u(x)v'(x) - v(x)u'(x)\right]f(x). \tag{16.131}$$
By Eqs. 16.125 and 16.127 this may be rewritten as
$$y''(x) = -\frac{v''(x)}{A}\int_a^x u(t)f(t)\,dt - \frac{u''(x)}{A}\int_x^b v(t)f(t)\,dt - \frac{f(x)}{p(x)}. \tag{16.132}$$
Now, by substituting into Eq. 16.118, we have
$$\mathcal{L}y(x) = -\frac{[\mathcal{L}v(x)]}{A}\int_a^x u(t)f(t)\,dt - \frac{[\mathcal{L}u(x)]}{A}\int_x^b v(t)f(t)\,dt - f(x). \tag{16.133}$$
7 In the first integral a < t < x. Hence $G(x,t) = G_2(x,t) = -(1/A)\,u(t)v(x)$.
Similarly, the second integral requires $G = G_1$.
Since u(x) and v(x) were chosen to satisfy the homogeneous Sturm-Liouville
equation, the factors in brackets are zero and the integral terms vanish.
Transposing f(x), we see that Eq. 16.117 is satisfied.
We must also check that y(x) satisfies the required boundary conditions.
At point x = a
$$y(a) = -\frac{u(a)}{A}\int_a^b v(t)f(t)\,dt = c\,u(a), \tag{16.134}$$
$$y'(a) = -\frac{u'(a)}{A}\int_a^b v(t)f(t)\,dt = c\,u'(a), \tag{16.135}$$
since the definite integral is a constant. We chose u(x) to satisfy
$$\alpha u(a) + \beta u'(a) = 0. \tag{16.136}$$
Multiplying by the constant c, we verify that y(x) also satisfies Eq. 16.136. This
illustrates the utility of the homogeneous boundary conditions: The
normalization does not matter. In quantum mechanical problems the boundary condition
on the wave function is often expressed in terms of the ratio
$$\frac{\psi'(x)}{\psi(x)} = \frac{d}{dx}\ln\psi(x),$$
equivalent to Eq. 16.136. The advantage is that the wave function need not be
normalized.
Summarizing, we have Eq. 16.122
$$y(x) = \int_a^b G(x,t)f(t)\,dt,$$
which satisfies the differential equation (Eq. 16.117)
$$\mathcal{L}y(x) + f(x) = 0$$
and the boundary conditions, these boundary conditions having been built into
the Green's function G(x, t).
Basically, what we have done is to use the solutions of the homogeneous
Sturm-Liouville equation to construct a solution of the nonhomogeneous
equation. Again, Poisson's equation is an illustration. The solution (Eq. 16.116)
represents a weighted [$\rho(\mathbf{r}_2)$] combination of solutions of the corresponding
homogeneous Laplace's equation. (We did this same sort of thing in Section
16.4).
It should be noted that our y(x), Eq. 16.122, is actually the particular solution
of the differential equation, Eq. 16.117. Our boundary conditions exclude the
addition of solutions of the homogeneous equation. In an actual physical
problem we may well have both types of solutions. In electrostatics, for instance
(compare Section 8.7), the Green's function solution of Poisson's equation gives
the potential created by the given charge distribution. In addition, there may
be external fields superimposed. These would be described by solutions of the
homogeneous equation, Laplace.
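A small numerical check may make the statement "the integral of G times f solves the differential equation" concrete. The sketch below assumes the simplest test case, $\mathcal{L} = d^2/dx^2$ on [0, 1] with y(0) = y(1) = 0 and f(x) = 1, for which the exact solution of y'' + 1 = 0 is y = x(1 - x)/2. The function names and the midpoint quadrature are illustrative choices only.

```python
def green(x, t):
    # Green's function for L = d^2/dx^2 with y(0) = y(1) = 0 (assumed test case)
    return x * (1.0 - t) if x <= t else t * (1.0 - x)


def solve(f, x, n=2000):
    """Approximate y(x) = integral_0^1 G(x, t) f(t) dt with the midpoint rule."""
    h = 1.0 / n
    return sum(green(x, (j + 0.5) * h) * f((j + 0.5) * h) for j in range(n)) * h


def unit_source(t):
    return 1.0  # f(t) = 1


for x in (0.25, 0.5, 0.75):
    # numerical integral next to the exact solution x(1 - x)/2
    print(x, solve(unit_source, x), x * (1.0 - x) / 2.0)
```

The boundary values come out as zero automatically, because G(0, t) = G(1, t) = 0: the boundary conditions are built into the Green's function.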
Eigenfunction, Eigenvalue Equation
The preceding analysis placed no special restrictions on our f(x). Let us
now assume that $f(x) = \lambda\rho(x)y(x)$.8 Then we have
$$y(x) = \lambda\int_a^b G(x,t)\rho(t)y(t)\,dt \tag{16.137}$$
as a solution of
$$\mathcal{L}y(x) + \lambda\rho(x)y(x) = 0 \tag{16.138}$$
and its boundary conditions. Equation 16.137 is a homogeneous Fredholm
equation of the second kind and Eq. 16.138 is the Sturm-Liouville eigenvalue
equation of Chapter 9 [with the weighting function w(x) replaced by $\rho(x)$].
Notice the change from Eqs. 16.117 and 16.122 to 16.137 and 16.138. There
is a corresponding change in the interpretation of our Green's function. It
started as an importance or influence function, a weighting function giving the
importance of the charge $\rho(\mathbf{r}_2)$ in producing the potential $\varphi(\mathbf{r}_1)$. The charge $\rho$
was the nonhomogeneous term in the nonhomogeneous differential equation
16.117. Now the differential equation and the integral equation are both
homogeneous. G(x, t) has become a link relating the two equations, differential
and integral.
To complete the discussion of the equivalence between this differential equation
and the integral equation, let us now show that Eq. 16.138 implies Eq. 16.137; that is, a
solution of our differential equation (16.138) with its boundary conditions
satisfies the integral equation (16.137). We multiply Eq. 16.138 by G(x, t), the
appropriate Green's function, and integrate from x = a to x = b to obtain
$$\int_a^b G(x,t)\mathcal{L}y(x)\,dx + \lambda\int_a^b G(x,t)\rho(x)y(x)\,dx = 0. \tag{16.139}$$
The first integral is split in two (x < t,x > t), according to the construction of
our Green's function, giving
$$-\int_a^t G_1(x,t)\mathcal{L}y(x)\,dx - \int_t^b G_2(x,t)\mathcal{L}y(x)\,dx = \lambda\int_a^b G(x,t)\rho(x)y(x)\,dx. \tag{16.140}$$
Note that t is the upper limit for the $G_1$ integrals and the lower limit for the $G_2$
integrals. We are going to reduce the left-hand side of Eq. 16.140 to y(t). Then,
with G(x, t) = G(t, x), we have Eq. 16.137 (with x and t interchanged).
Applying Green's theorem to the left-hand side or, equivalently, integrating
by parts, we obtain
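$$-\int_a^t G_1(x,t)\mathcal{L}y(x)\,dx = -\int_a^t G_1(x,t)\left\{\frac{d}{dx}\left[p(x)y'(x)\right] + q(x)y(x)\right\}dx$$
$$= -\Big[G_1(x,t)p(x)y'(x)\Big]_a^t + \int_a^t G_1'(x,t)p(x)y'(x)\,dx - \int_a^t G_1(x,t)q(x)y(x)\,dx, \tag{16.141}$$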
8 The function $\rho(x)$ is a weighting function, not a charge density.
with an equivalent expression for the second integral. A second integration by
parts yields
$$-\int_a^t G_1(x,t)\mathcal{L}y(x)\,dx = -\int_a^t y(x)\mathcal{L}G_1(x,t)\,dx - \Big[G_1(x,t)p(x)y'(x)\Big]_a^t + \Big[G_1'(x,t)p(x)y(x)\Big]_a^t. \tag{16.142}$$
The integral on the right vanishes because $\mathcal{L}G_1 = 0$. By combining the
integrated terms with those from integrating $G_2$, we have
$$-p(t)\left[G_1(t,t)y'(t) - G_1'(t,t)y(t) - G_2(t,t)y'(t) + G_2'(t,t)y(t)\right]$$
$$+\; p(a)\left[G_1(a,t)y'(a) - G_1'(a,t)y(a)\right] - p(b)\left[G_2(b,t)y'(b) - G_2'(b,t)y(b)\right]. \tag{16.143}$$
Each of the last two expressions vanishes, for G(x,t) and y(x) satisfy the same
boundary conditions. The first expression, with the help of Eqs. 16.120 and
16.121, reduces to y(t). Substituting into Eq. 16.140, we have Eq. 16.137, thus
completing the demonstration of the equivalence of the integral equation and
the differential equation plus boundary conditions.
EXAMPLE 16.5.1. Linear Oscillator
As a simple example, consider the linear oscillator equation (for a vibrating
string)
$$y''(x) + \lambda y(x) = 0. \tag{16.144}$$
We impose the conditions y(0) = y(1) = 0, which correspond to a string clamped
at both ends. Now, to construct our Green's function, we need solutions of the
homogeneous Sturm-Liouville equation, $\mathcal{L}y(x) = 0$, which is y''(x) = 0. To
satisfy the boundary conditions, we must have one solution vanish at x = 0,
the other at x = 1. Such solutions (unnormalized) are
$$u(x) = x, \qquad v(x) = 1 - x. \tag{16.145}$$
We find that
$$uv' - vu' = -1 \tag{16.146}$$
or, by Eq. 16.126 with p(x) = 1, A = -1. Our Green's function becomes
$$G(x,t) = \begin{cases} x(1 - t), & 0 < x < t, \\ t(1 - x), & t < x < 1. \end{cases} \tag{16.147}$$
Hence by Eq. 16.137 our clamped vibrating string satisfies
$$y(x) = \lambda\int_0^1 G(x,t)y(t)\,dt. \tag{16.148}$$
This is Eq. 16.34 with b = 1 and $\omega^2 = \lambda$.
FIG. 16.3 A linear oscillator Green's function
The reader may show that the known solutions of Eq. 16.144,
$$y = \sin n\pi x, \qquad \lambda = n^2\pi^2,$$
do indeed satisfy Eq. 16.148. Note that our eigenvalue $\lambda$ is not the wavelength.
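The check left to the reader can also be done numerically. The sketch below evaluates the right-hand side of Eq. 16.148 for y = sin nπt with a midpoint rule and compares it with sin nπx. The function names and the quadrature rule are illustrative assumptions, and the agreement is only to quadrature accuracy.

```python
import math


def green(x, t):
    # Eq. 16.147: Green's function of the clamped string on [0, 1]
    return x * (1.0 - t) if x <= t else t * (1.0 - x)


def rhs(n_mode, x, n=4000):
    """lambda * integral_0^1 G(x, t) sin(n pi t) dt, with lambda = (n pi)^2."""
    lam = (n_mode * math.pi) ** 2
    h = 1.0 / n
    total = 0.0
    for j in range(n):
        t = (j + 0.5) * h          # midpoint of the j-th subinterval
        total += green(x, t) * math.sin(n_mode * math.pi * t)
    return lam * total * h


for n_mode in (1, 2, 3):
    x = 0.3
    print(n_mode, rhs(n_mode, x), math.sin(n_mode * math.pi * x))
```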
Green's Function and the Dirac Delta Function
One more approach to the Green's function may shed additional light on
our formulation and particularly on its relation to physical problems. Let us
refer once more to Poisson's equation, this time for a point charge
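$$\nabla^2\varphi(\mathbf{r}) = -\frac{\rho_{\text{point}}}{\varepsilon_0}. \tag{16.149}$$
The Green's function solution of this equation was developed in Section 8.7.
This time let us take a one-dimensional analog
$$\mathcal{L}y(x) + f(x)_{\text{point}} = 0. \tag{16.150}$$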
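Here $f(x)_{\text{point}}$ refers to a unit point "charge" or a point force. We may represent it
by a number of forms, but perhaps the most convenient is
$$f(x)_{\text{point}} = \begin{cases} \dfrac{1}{2\varepsilon}, & t - \varepsilon < x < t + \varepsilon, \\[6pt] 0, & \text{elsewhere}, \end{cases} \tag{16.151}$$
which is essentially the same as Eq. 8.108. Then, integrating Eq. 16.150, we have
$$\int_{t-\varepsilon}^{t+\varepsilon}\mathcal{L}y(x)\,dx = -\int_{t-\varepsilon}^{t+\varepsilon} f(x)_{\text{point}}\,dx = -1 \tag{16.152}$$
from the definition of $f(x)_{\text{point}}$. Let us examine $\mathcal{L}y(x)$ more closely. We have
$$\int_{t-\varepsilon}^{t+\varepsilon}\frac{d}{dx}\left[p(x)y'(x)\right]dx + \int_{t-\varepsilon}^{t+\varepsilon} q(x)y(x)\,dx = \Big[p(x)y'(x)\Big]_{t-\varepsilon}^{t+\varepsilon} + \int_{t-\varepsilon}^{t+\varepsilon} q(x)y(x)\,dx = -1. \tag{16.153}$$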
In the limit $\varepsilon \to 0$ we may satisfy this relation by permitting y'(x) to have a
discontinuity of $-1/p(x)$ at x = t, y(x) itself remaining continuous.9 These,
9 The functions p(x) and q(x) appearing in the operator $\mathcal{L}$ are continuous
functions. With y(x) remaining continuous, $\int q(x)y(x)\,dx$ is certainly
continuous. Hence this integral over an interval $2\varepsilon$ (Eq. 16.153) vanishes as $\varepsilon$ vanishes.
however, are just the properties used to define our Green's function, G(x, t).
In addition, we note that in the limit $\varepsilon \to 0$
$$f(x)_{\text{point}} = \delta(x - t), \tag{16.154}$$
in which $\delta(x - t)$ is our Dirac delta function, defined in this manner in Section
8.7. Hence Eq. 16.150 has become
$$\mathcal{L}G(x,t) = -\delta(x - t). \tag{16.155}$$
This is Eq. 8.132, which we exploit for the development of Green's functions
in two and three dimensions—Section 16.6. It will be recalled that we used
this relation in Section 8.7 to determine our Green's functions.
Equation 16.155 could have been expected since it is actually a consequence
of our differential equation, Eq. 16.117, and Green's function integral solution,
Eq. 16.122. If we let $\mathcal{L}_x$ (subscript to emphasize that it operates on the
x-dependence) operate on both sides of Eq. 16.122, then
$$\mathcal{L}_x y(x) = \mathcal{L}_x\int_a^b G(x,t)f(t)\,dt.$$
By Eq. 16.117 the left-hand side is just $-f(x)$. On the right $\mathcal{L}_x$ is independent
of the variable of integration t, so we may write
$$-f(x) = \int_a^b \left[\mathcal{L}_x G(x,t)\right]f(t)\,dt.$$
By definition of Dirac delta function, Eqs. 8.107 and 8.117, we have Eq. 16.155.
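Equation 16.155 has a simple discrete counterpart that can be checked directly. In the sketch below (an illustration under assumed choices: $\mathcal{L} = d^2/dx^2$ on [0, 1], the Green's function of Eq. 16.147, and a uniform grid of spacing h), the centered second difference of G(x, t_j) in x vanishes at every interior node except x = t_j, where it equals -1/h, the grid version of $-\delta(x - t)$.

```python
def green(x, t):
    # Eq. 16.147, used here as the assumed test case
    return x * (1.0 - t) if x <= t else t * (1.0 - x)


def second_difference(n, j):
    """Centered second difference in x of G(x, t_j) on a grid of n intervals.

    Returns the values at the interior nodes i = 1, ..., n - 1.
    """
    h = 1.0 / n
    t = j * h
    out = []
    for i in range(1, n):
        x = i * h
        out.append((green(x + h, t) - 2.0 * green(x, t) + green(x - h, t)) / h**2)
    return out


n, j = 10, 4
values = second_difference(n, j)
print(values)  # near zero everywhere except at index j - 1, where it is close to -n
```

Because G is piecewise linear in this case, the only nonzero entry comes from the jump of -1/p = -1 in the first derivative at x = t, spread over one grid cell.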
EXERCISES
16.5.1 Show that
$$G(x,t) = \begin{cases} x, & 0 \le x < t, \\ t, & t < x \le 1, \end{cases}$$
is the Green's function for the operator $\mathcal{L} = d^2/dx^2$ and the boundary
conditions
$$y(0) = 0, \qquad y'(1) = 0.$$
16.5.2 Find the Green's function for
—
ax
(b) $\mathcal{L}y(x) = \dfrac{d^2y(x)}{dx^2} - y(x)$, y(x) finite for $-\infty < x < \infty$.
16.5.3 Find the Green's function for the operators
ANS. (a) G(x,t) = \
l—lnx, t < x < 1,
EXERCISES 907
16.5.5
with y(Q) finite and y(l) = 0.
(b) G(x,t) =
'X
0 < x < t,
t <x< 1.
The combination of operator and interval specified in Exercise 16.5.3(a) is
pathological in that one of the end points of the interval (zero) is a singular
point of the operator. As a consequence, the integrated part (the surface integral
of Green's theorem) does not vanish. The next four exercises explore this
situation.
16.5.4 (a) Show that the particular solution of
$$\frac{d}{dx}\left[x\frac{d}{dx}y(x)\right] = -1$$
is $y_P(x) = -x$.
(b) Show that
$$y_P(x) = -x \neq \int_0^1 G(x,t)(-1)\,dt,$$
where G(x, t) is the Green's function of Exercise 16.5.3(a).
Show that Green's theorem, Eq. 1.97 in one dimension with a Sturm-Liouville
type operator —p(t)— replacing V • V, may be rewritten as
dt dt
dt wrw dt
16.5.6 Using the one-dimensional form of Green's theorem of Exercise 16.5.5, let
v(t) = y(t) and
dt
dt J
/ч ^/ч i d( . .dG(x,t)\ -.
u(t) = G(x, t) and — p{t)—y—L-L = -6(x- t).
Show that Green's theorem yields
y(x)= [b-G{x,t)f{t)dt
■+
G(x,t)p(t)^-y(t)p(t)j-tG(x,t)
^
jt
16.5.7 For p(t) = t, y(t)=-t,
\ - In x t < x < Г
verify that the integrated part does not vanish.
908 INTEGRAL EQUATIONS
16.5.8 Construct the Green's function for
subject to the boundary conditions
У@) = 0,
y(l) = 0.
16.5.9 Given
dx2 dx
and
G( ± 1, t) remains finite.
Show that no Green's function can be constructed by the techniques of this
section. (u(x) and v(x) are linearly dependent.)
16.5.10 Construct the infinite one-dimensional Green's function for the Helmholtz
equation
$$(\nabla^2 + k^2)\psi(x) = g(x).$$
The boundary conditions are those for a wave advancing in the positive x
direction, assuming a time dependence $e^{-i\omega t}$.
ANS. $G(x_1, x_2) = \dfrac{i}{2k}\exp(ik|x_1 - x_2|)$.
16.5.11 Construct the infinite one-dimensional Green's function for the modified
Helmholtz equation
$$(\nabla^2 - k^2)\psi(x) = f(x).$$
The boundary conditions are that the Green's function must vanish for $x \to \infty$
and $x \to -\infty$.
ANS. $G(x_1, x_2) = \dfrac{1}{2k}\exp(-k|x_1 - x_2|)$.
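16.5.12 From the eigenfunction expansion of the Green's function show that
(a) $$\frac{2}{\pi^2}\sum_{n=1}^{\infty}\frac{\sin n\pi x\,\sin n\pi t}{n^2} = \begin{cases} x(1 - t), & 0 < x < t, \\ t(1 - x), & t < x < 1. \end{cases}$$
(b) $$\frac{2}{\pi^2}\sum_{n=0}^{\infty}\frac{\sin(n + \tfrac{1}{2})\pi x\,\sin(n + \tfrac{1}{2})\pi t}{(n + \tfrac{1}{2})^2} = \begin{cases} x, & 0 < x < t, \\ t, & t < x < 1. \end{cases}$$
Note. In Section 9.4 the Green's function of $\mathcal{L} + \lambda$ is expanded in eigenfunctions.
The $\lambda$ there is an adjustable parameter, not an eigenvalue.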
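A numerical sanity check of the first series (not a proof, and not part of the exercise) can be sketched as follows. The function names and the number of terms are arbitrary illustrative choices; the terms decay like 1/n², so the partial sum converges slowly.

```python
import math


def series(x, t, terms=20000):
    """Partial sum of (2/pi^2) * sum_{n>=1} sin(n pi x) sin(n pi t) / n^2."""
    s = sum(math.sin(n * math.pi * x) * math.sin(n * math.pi * t) / n**2
            for n in range(1, terms + 1))
    return 2.0 * s / math.pi**2


def closed_form(x, t):
    # Right-hand side of part (a); the two branches agree at x = t
    return x * (1.0 - t) if x <= t else t * (1.0 - x)


for x, t in ((0.3, 0.7), (0.8, 0.2), (0.5, 0.5)):
    print(x, t, series(x, t), closed_form(x, t))
```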
16.5.13 In the Fredholm equation,
f(x) = X2 I" G{x,t)(p(t)dt,
Ja
G(x, t) is a Green's function given by
byx,t) — 2_ ~~r^ t^-
GREEN'S FUNCTIONS—TWO AND THREE DIMENSIONS 909
Show that the solution is
oo " 2 i2 ГЬ
n=l ^2 Jfl
16.5.14 Show that the Green's function integral transform operator
$$\int_a^b G(x,t)[\ \ ]\,dt$$
is equal to $-\mathcal{L}^{-1}$ in the sense that
(a) $\mathcal{L}_x\displaystyle\int_a^b G(x,t)y(t)\,dt = -y(x)$,
(b) $\displaystyle\int_a^b G(x,t)\mathcal{L}_t y(t)\,dt = -y(x)$.
Note. Take $\mathcal{L}y(x) + f(x) = 0$, Eq. 16.117.
16.6 GREEN'S FUNCTIONS—TWO AND THREE DIMENSIONS
As in the preceding section (and in Section 8.7), we consider a nonhomogeneous
differential equation
$$\mathcal{L}y(\mathbf{r}_1) = -f(\mathbf{r}_1). \tag{16.156}$$
We seek a solution that might be represented by
$$y(\mathbf{r}_1) = -\mathcal{L}^{-1}f(\mathbf{r}_1). \tag{16.156a}$$
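It might be expected that with $\mathcal{L}$ a differential operator, the inverse operator
$\mathcal{L}^{-1}$ will involve integration. To proceed further, we define the Green's function
corresponding to the differential operator $\mathcal{L}$ as a solution of the point source
nonhomogeneous equation1
$$\mathcal{L}_1 G(\mathbf{r}_1, \mathbf{r}_2) = -\delta(\mathbf{r}_1 - \mathbf{r}_2), \tag{16.156b}$$
which satisfies the required boundary conditions. Here the subscript 1 on $\mathcal{L}$
emphasizes that $\mathcal{L}$ operates on $\mathbf{r}_1$.
Let us assume that $\mathcal{L}_1$ is a self-adjoint differential operator of the general
form2
$$\mathcal{L}_1 = \nabla_1\cdot\left[p(\mathbf{r}_1)\nabla_1\right] + q(\mathbf{r}_1). \tag{16.156c}$$
Then, as a simple generalization of Green's theorem, Eq. 1.97, we have
$$\int\left(v\mathcal{L}_2 u - u\mathcal{L}_2 v\right)d\tau_2 = \int p\left(v\nabla_2 u - u\nabla_2 v\right)\cdot d\boldsymbol{\sigma}_2, \tag{16.156d}$$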
1 This equation appears in different forms in different references. Some authors
write the right-hand side as $-4\pi\delta(\mathbf{r}_1 - \mathbf{r}_2)$, others use $+\delta(\mathbf{r}_1 - \mathbf{r}_2)$. As stressed
in Section 8.7, the delta function will be part of an integrand.
2 $\mathcal{L}_1$ may be in 1, 2, or 3 dimensions (with appropriate interpretation of $\nabla_1$).
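in which all quantities have $\mathbf{r}_2$ as their argument. (To verify Eq. 16.156d, take
the divergence of the integrand of the surface integral.) We let $u(\mathbf{r}_2) = y(\mathbf{r}_2)$
so that Eq. 16.156 applies and $v(\mathbf{r}_2) = G(\mathbf{r}_1, \mathbf{r}_2)$ so that Eq. 16.156b applies.
(Remember $G(\mathbf{r}_1, \mathbf{r}_2) = G(\mathbf{r}_2, \mathbf{r}_1)$, Section 8.7.) Substituting into Green's theorem
$$\int\left\{-G(\mathbf{r}_1, \mathbf{r}_2)f(\mathbf{r}_2) + y(\mathbf{r}_2)\delta(\mathbf{r}_1 - \mathbf{r}_2)\right\}d\tau_2 = \int p(\mathbf{r}_2)\left\{G(\mathbf{r}_1, \mathbf{r}_2)\nabla_2 y(\mathbf{r}_2) - y(\mathbf{r}_2)\nabla_2 G(\mathbf{r}_1, \mathbf{r}_2)\right\}\cdot d\boldsymbol{\sigma}_2. \tag{16.156e}$$
Integrating over the Dirac delta function
$$y(\mathbf{r}_1) = \int G(\mathbf{r}_1, \mathbf{r}_2)f(\mathbf{r}_2)\,d\tau_2 + \int p(\mathbf{r}_2)\left\{G(\mathbf{r}_1, \mathbf{r}_2)\nabla_2 y(\mathbf{r}_2) - y(\mathbf{r}_2)\nabla_2 G(\mathbf{r}_1, \mathbf{r}_2)\right\}\cdot d\boldsymbol{\sigma}_2. \tag{16.156f}$$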
Our solution to Eq. 16.156 appears as a volume integral plus a surface integral.
If y and G both satisfy Dirichlet boundary conditions, or if both satisfy Neumann
boundary conditions, the surface integral vanishes and we regain Eq. 16.122.
The volume integral is a weighted integral over the source term $f(\mathbf{r}_2)$ with our
Green's function $G(\mathbf{r}_1, \mathbf{r}_2)$ as the weighting function.
Form of Green's Functions
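For the special case of $p(\mathbf{r}_1) = 1$ and $q(\mathbf{r}_1) = 0$, $\mathcal{L}$ is $\nabla^2$, the Laplacian.
Let us integrate
$$\nabla_1^2 G(\mathbf{r}_1, \mathbf{r}_2) = -\delta(\mathbf{r}_1 - \mathbf{r}_2) \tag{16.157}$$
over a small volume including the point source. Then
$$\int\nabla_1\cdot\nabla_1 G(\mathbf{r}_1, \mathbf{r}_2)\,d\tau_1 = -\int\delta(\mathbf{r}_1 - \mathbf{r}_2)\,d\tau_1 = -1. \tag{16.157a}$$
The volume integral on the left may be transformed by Gauss's theorem as in
the development of Gauss's law (Section 1.14). We find that
$$\int\nabla_1 G(\mathbf{r}_1, \mathbf{r}_2)\cdot d\boldsymbol{\sigma}_1 = -1. \tag{16.158}$$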
This shows, incidentally, that it may not be possible to impose a Neumann
boundary condition, that the normal derivative of the Green's function, dG/dn,
vanishes over the entire surface.
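If we are in three-dimensional space, Eq. 16.158 is satisfied by taking
$$\frac{\partial}{\partial r_{12}}G(\mathbf{r}_1, \mathbf{r}_2) = -\frac{1}{4\pi}\cdot\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|^2}, \qquad r_{12} = |\mathbf{r}_1 - \mathbf{r}_2|. \tag{16.158a}$$
The integration is over the surface of a sphere centered at $\mathbf{r}_2$. The integral
of Eq. 16.158a is
$$G(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{4\pi}\cdot\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|}, \tag{16.159}$$
in agreement with Section 1.14.
If we are in two-dimensional space, Eq. 16.158 is satisfied by taking
$$\frac{\partial}{\partial\rho_{12}}G(\boldsymbol{\rho}_1, \boldsymbol{\rho}_2) = -\frac{1}{2\pi}\cdot\frac{1}{|\boldsymbol{\rho}_1 - \boldsymbol{\rho}_2|}, \tag{16.160}$$
with r being replaced by $\rho$, $\rho = (x^2 + y^2)^{1/2}$, and the integration being over the
circumference of a circle centered on $\boldsymbol{\rho}_2$. Here $\rho_{12} = |\boldsymbol{\rho}_1 - \boldsymbol{\rho}_2|$. Integrating Eq.
16.160, we obtain
$$G(\boldsymbol{\rho}_1, \boldsymbol{\rho}_2) = -\frac{1}{2\pi}\ln|\boldsymbol{\rho}_1 - \boldsymbol{\rho}_2|. \tag{16.161}$$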
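To $G(\boldsymbol{\rho}_1, \boldsymbol{\rho}_2)$ (and to $G(\mathbf{r}_1, \mathbf{r}_2)$) we may add any multiple of the regular
solution of the homogeneous equation as needed to satisfy boundary conditions.
The behavior of the Laplace operator Green's function in the vicinity of the
source point $\mathbf{r}_1 = \mathbf{r}_2$ shown by Eqs. 16.159 and 16.161 facilitates the identification
of the Green's functions for the other cases, such as the Helmholtz and modified
Helmholtz equations.
1. For $\mathbf{r}_1 \neq \mathbf{r}_2$, $G(\mathbf{r}_1, \mathbf{r}_2)$ must satisfy the homogeneous
differential equation
$$\mathcal{L}_1 G(\mathbf{r}_1, \mathbf{r}_2) = 0, \qquad \mathbf{r}_1 \neq \mathbf{r}_2. \tag{16.162}$$
2. As $\mathbf{r}_1 \to \mathbf{r}_2$ (or $\boldsymbol{\rho}_1 \to \boldsymbol{\rho}_2$),
$$G(\boldsymbol{\rho}_1, \boldsymbol{\rho}_2) \approx -\frac{1}{2\pi}\ln|\boldsymbol{\rho}_1 - \boldsymbol{\rho}_2|, \quad\text{two-dimensional space}, \tag{16.163}$$
$$G(\mathbf{r}_1, \mathbf{r}_2) \approx \frac{1}{4\pi}\cdot\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|}, \quad\text{three-dimensional space}. \tag{16.163a}$$
The term $\pm k^2$ in the operator does not affect the behavior of G near the singular
point $\mathbf{r}_1 = \mathbf{r}_2$. For convenience, the Green's functions for the Laplace,
Helmholtz, and modified Helmholtz operators are listed in Table 16.1.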
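Condition 1 is easy to test numerically for the three-dimensional entries of Table 16.1. The sketch below relies on the fact that for a function of $r = |\mathbf{r}_1 - \mathbf{r}_2|$ alone the Laplacian reduces to $(1/r)\,d^2(rG)/dr^2$, and approximates that derivative by a centered second difference. The wave number k = 2, the test radius, and the step size are arbitrary illustrative values.

```python
import cmath
import math


def radial_laplacian(G, r, h=1e-3):
    """(1/r) d^2(r G)/dr^2 by a centered second difference (spherical symmetry)."""
    def w(s):
        return s * G(s)
    return (w(r + h) - 2.0 * w(r) + w(r - h)) / (h * h * r)


k = 2.0  # arbitrary wave number chosen for the check


def laplace(r):
    return 1.0 / (4.0 * math.pi * r)


def helmholtz(r):
    return cmath.exp(1j * k * r) / (4.0 * math.pi * r)   # outgoing wave


def modified(r):
    return math.exp(-k * r) / (4.0 * math.pi * r)


r = 1.5  # any point away from the source r = 0
res_laplace = radial_laplacian(laplace, r)
res_helmholtz = radial_laplacian(helmholtz, r) + k**2 * helmholtz(r)
res_modified = radial_laplacian(modified, r) - k**2 * modified(r)
print(abs(res_laplace), abs(res_helmholtz), abs(res_modified))  # all small
```

All three functions share the same $1/(4\pi r)$ behavior as r tends to zero, which is condition 2.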
Spherical Polar Coordinate Expansion
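As an alternate determination of the Green's function of the Laplace operator,
let us assume a spherical harmonic expansion of the form
$$G(\mathbf{r}_1, \mathbf{r}_2) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} g_l(r_1, r_2)\,Y_l^m(\theta_1, \varphi_1)\,Y_l^{m*}(\theta_2, \varphi_2). \tag{16.164}$$
We will determine $g_l(r_1, r_2)$. From Exercises 8.6.7 and 12.6.6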
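TABLE 16.1 Green's Functions^a

| | Laplace $\nabla^2$ | Helmholtz $\nabla^2 + k^2$ | Modified Helmholtz $\nabla^2 - k^2$ |
|---|---|---|---|
| One-dimensional space | No solution for $(-\infty, \infty)$ | $\dfrac{i}{2k}\exp(ik|x_1 - x_2|)$ | $\dfrac{1}{2k}\exp(-k|x_1 - x_2|)$ |
| Two-dimensional space | $-\dfrac{1}{2\pi}\ln|\boldsymbol{\rho}_1 - \boldsymbol{\rho}_2|$ | $\dfrac{i}{4}H_0^{(1)}(k|\boldsymbol{\rho}_1 - \boldsymbol{\rho}_2|)$ | $\dfrac{1}{2\pi}K_0(k|\boldsymbol{\rho}_1 - \boldsymbol{\rho}_2|)$ |
| Three-dimensional space | $\dfrac{1}{4\pi}\cdot\dfrac{1}{|\mathbf{r}_1 - \mathbf{r}_2|}$ | $\dfrac{\exp(ik|\mathbf{r}_1 - \mathbf{r}_2|)}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|}$ | $\dfrac{\exp(-k|\mathbf{r}_1 - \mathbf{r}_2|)}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|}$ |

^a These are the Green's functions satisfying the boundary condition $G(\mathbf{r}_1, \mathbf{r}_2) = 0$ as $r_1 \to \infty$
for the Laplace and modified Helmholtz operators. For the Helmholtz operator, $G(\mathbf{r}_1, \mathbf{r}_2)$
corresponds to an outgoing wave. $H_0^{(1)}$ is the Hankel function of Section 11.4. $K_0$ is the
modified Bessel function of Section 11.5.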
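$$\delta(\mathbf{r}_1 - \mathbf{r}_2) = \frac{1}{r_1^2}\,\delta(r_1 - r_2)\,\delta(\cos\theta_1 - \cos\theta_2)\,\delta(\varphi_1 - \varphi_2) = \frac{1}{r_1^2}\,\delta(r_1 - r_2)\sum_{l=0}^{\infty}\sum_{m=-l}^{l} Y_l^m(\theta_1, \varphi_1)\,Y_l^{m*}(\theta_2, \varphi_2). \tag{16.165}$$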
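Substituting Eqs. 16.164 and 16.165 into the Green's function differential
equation, Eq. 16.157, and making use of the orthogonality of the spherical harmonics,
we obtain a radial equation:
$$\frac{d}{dr_1}\left(r_1^2\frac{dg_l}{dr_1}\right) - l(l + 1)\,g_l(r_1, r_2) = -\delta(r_1 - r_2). \tag{16.166}$$
This is now a one-dimensional problem. The solutions3 of the corresponding
homogeneous equation are $r_1^l$ and $r_1^{-l-1}$. If we demand that $g_l$ remain finite as
$r_1 \to 0$ and vanish as $r_1 \to \infty$, the technique of Section 16.5 leads to
$$g_l(r_1, r_2) = \frac{1}{2l + 1}\begin{cases} \dfrac{r_1^l}{r_2^{l+1}}, & r_1 < r_2, \\[8pt] \dfrac{r_2^l}{r_1^{l+1}}, & r_1 > r_2, \end{cases} \tag{16.167}$$
or
$$g_l(r_1, r_2) = \frac{1}{2l + 1}\cdot\frac{r_<^l}{r_>^{l+1}}. \tag{16.168}$$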
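The step "the technique of Section 16.5 leads to" can be spelled out in code. The sketch below applies Eqs. 16.126 and 16.128 with $u = r^l$, $v = r^{-l-1}$ and $p = r^2$, for which the Wronskian constant is $A = -(2l + 1)$. The function name is an illustrative assumption, and positive radii are assumed throughout.

```python
def radial_green(l, r1, r2):
    """g_l(r1, r2) built from Eq. 16.128 with u = r^l, v = r^(-l-1), p = r^2."""
    def u(r):
        return r**l

    def du(r):
        return l * r**(l - 1)

    def v(r):
        return r**(-l - 1)

    def dv(r):
        return -(l + 1) * r**(-l - 2)

    # A = p (u v' - v u'), Eq. 16.126; it evaluates to -(2l + 1) for any r2 > 0
    A = r2**2 * (u(r2) * dv(r2) - v(r2) * du(r2))
    r_less, r_greater = min(r1, r2), max(r1, r2)
    return -u(r_less) * v(r_greater) / A


# Compare with the closed form (1/(2l+1)) r_<^l / r_>^(l+1) of Eq. 16.168
l, r1, r2 = 2, 0.5, 2.0
print(radial_green(l, r1, r2), (0.5**l / 2.0**(l + 1)) / (2 * l + 1))
```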
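Hence our Green's function is
$$G(\mathbf{r}_1, \mathbf{r}_2) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\frac{1}{2l + 1}\cdot\frac{r_<^l}{r_>^{l+1}}\,Y_l^m(\theta_1, \varphi_1)\,Y_l^{m*}(\theta_2, \varphi_2). \tag{16.169a}$$
Since we already have $G(\mathbf{r}_1, \mathbf{r}_2)$ in closed form, Eq. 16.159, we may write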
3 Compare Table 8.1.
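$$\frac{1}{4\pi}\cdot\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\frac{1}{2l + 1}\cdot\frac{r_<^l}{r_>^{l+1}}\,Y_l^m(\theta_1, \varphi_1)\,Y_l^{m*}(\theta_2, \varphi_2). \tag{16.169b}$$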
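One immediate use for this spherical harmonic expansion of the Green's
function is in the development of an electrostatic multipole expansion. The
potential for an arbitrary charge distribution is
$$\psi(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\int\frac{\rho(\mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|}\,d\tau_2$$
(which is Eq. 8.81). Substituting Eq. 16.169b, we get
$$\psi(\mathbf{r}_1) = \frac{1}{\varepsilon_0}\sum_{l=0}^{\infty}\sum_{m=-l}^{l}\frac{1}{2l + 1}\cdot\frac{Y_l^m(\theta_1, \varphi_1)}{r_1^{l+1}}\left[\int\rho(\mathbf{r}_2)\,Y_l^{m*}(\theta_2, \varphi_2)\,r_2^l\,d\varphi_2\,\sin\theta_2\,d\theta_2\,r_2^2\,dr_2\right], \qquad\text{for } r_1 > r_2.$$
This is the multipole expansion. The relative importance of the various terms
in the double sum depends on the form of the source $\rho(\mathbf{r}_2)$.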
FIG. 16.4
Legendre Polynomial Addition Theorem
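From the generating expression for Legendre polynomials, Eq. 12.4a,
$$\frac{1}{4\pi}\cdot\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} = \frac{1}{4\pi}\sum_{l=0}^{\infty}\frac{r_<^l}{r_>^{l+1}}\,P_l(\cos\gamma), \tag{16.170}$$
where $\gamma$ is the angle included between vectors $\mathbf{r}_1$ and $\mathbf{r}_2$, Fig. 16.4. Equating
Eqs. 16.169 and 16.170, we have the Legendre polynomial addition theorem
$$P_l(\cos\gamma) = \frac{4\pi}{2l + 1}\sum_{m=-l}^{l} Y_l^m(\theta_1, \varphi_1)\,Y_l^{m*}(\theta_2, \varphi_2). \tag{16.171}$$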
Compare the simplicity (once Green's functions are understood) of this
derivation with the relatively cumbersome derivation of Section 12.8.
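The Legendre series of Eq. 16.170 can be compared numerically with the closed form of Eq. 16.159. The sketch below generates $P_l(\cos\gamma)$ from the standard three-term recurrence; the function names, the truncation at l = 60 and the sample point are illustrative assumptions. Convergence is geometric in the ratio $r_</r_>$, so the truncation is only adequate when that ratio is well below 1.

```python
import math


def legendre(l_max, x):
    """P_0(x), ..., P_lmax(x) via (l+1) P_{l+1} = (2l+1) x P_l - l P_{l-1}."""
    P = [1.0, x]
    for l in range(1, l_max):
        P.append(((2 * l + 1) * x * P[l] - l * P[l - 1]) / (l + 1))
    return P[:l_max + 1]


def expansion(r1, r2, cos_gamma, l_max=60):
    """Truncated right-hand side of Eq. 16.170."""
    r_less, r_greater = min(r1, r2), max(r1, r2)
    P = legendre(l_max, cos_gamma)
    total = sum(r_less**l / r_greater**(l + 1) * P[l] for l in range(l_max + 1))
    return total / (4.0 * math.pi)


def closed_form(r1, r2, cos_gamma):
    """1 / (4 pi |r1 - r2|), with the distance from the law of cosines."""
    distance = math.sqrt(r1**2 + r2**2 - 2.0 * r1 * r2 * cos_gamma)
    return 1.0 / (4.0 * math.pi * distance)


print(expansion(1.0, 2.0, 0.3), closed_form(1.0, 2.0, 0.3))
```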
Circular Cylindrical Coordinate Expansion
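In analogy with the preceding spherical polar coordinate expansion, we
write
$$\delta(\mathbf{r}_1 - \mathbf{r}_2) = \frac{1}{\rho_1}\,\delta(\rho_1 - \rho_2)\,\delta(\varphi_1 - \varphi_2)\,\delta(z_1 - z_2) = \frac{1}{\rho_1}\,\delta(\rho_1 - \rho_2)\,\frac{1}{2\pi}\sum_{m=-\infty}^{\infty} e^{im(\varphi_1 - \varphi_2)}\,\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik(z_1 - z_2)}\,dk, \tag{16.172}$$
using Exercise 12.6.5 and Eq. 15.21d. But why this choice? Why a summation
for the $\varphi$-dependence and an integration for the z-dependence? The requirement
that the azimuthal dependence be single-valued quantizes m, hence the
summation. No such restriction is expected on k.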
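To avoid problems later with negative values of k, we rewrite Eq. 16.172 as
$$\delta(\mathbf{r}_1 - \mathbf{r}_2) = \frac{1}{\rho_1}\,\delta(\rho_1 - \rho_2)\,\frac{1}{2\pi}\sum_{m=-\infty}^{\infty} e^{im(\varphi_1 - \varphi_2)}\,\frac{1}{\pi}\int_0^{\infty}\cos k(z_1 - z_2)\,dk, \tag{16.172a}$$
using the Cauchy principal value. We assume a similar expansion of the Green's
function
$$G(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{2\pi^2}\sum_{m=-\infty}^{\infty}\int_0^{\infty} g_m(\rho_1, \rho_2)\,e^{im(\varphi_1 - \varphi_2)}\cos k(z_1 - z_2)\,dk, \tag{16.173}$$
with the $\rho$-dependent coefficients $g_m(\rho_1, \rho_2)$ to be determined. Substituting into
Eq. 16.157, now in circular cylindrical coordinates, we find that if $g_m(\rho_1, \rho_2)$
satisfies
$$\frac{d}{d\rho_1}\left[\rho_1\frac{dg_m}{d\rho_1}\right] - \left[k^2\rho_1 + \frac{m^2}{\rho_1}\right]g_m = -\delta(\rho_1 - \rho_2), \tag{16.174}$$
then Eq. 16.157 is satisfied.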
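The operator in Eq. 16.174 is identified as the modified Bessel operator (in
self-adjoint form). Hence the solutions of the corresponding homogeneous
equation are $u_1 = I_m(k\rho)$, $u_2 = K_m(k\rho)$. As in the spherical polar coordinate
case, we demand that G be finite at $\rho_1 = 0$ and vanish as $\rho_1 \to \infty$. Then the
technique of Section 16.5 yields
$$g_m(\rho_1, \rho_2) = -\frac{1}{A}\,I_m(k\rho_<)\,K_m(k\rho_>). \tag{16.175}$$
This corresponds to Eq. 16.128. The constant A comes from the Wronskian:
$$I_m(k\rho)K_m'(k\rho) - I_m'(k\rho)K_m(k\rho) = \frac{A}{p(k\rho)}. \tag{16.175a}$$
From Exercise 11.5.10, A = -1 and
$$g_m(\rho_1, \rho_2) = I_m(k\rho_<)\,K_m(k\rho_>). \tag{16.176}$$
Therefore our circular cylindrical coordinate Green's function is
$$G(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{4\pi}\cdot\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} = \frac{1}{2\pi^2}\sum_{m=-\infty}^{\infty}\int_0^{\infty} I_m(k\rho_<)\,K_m(k\rho_>)\,e^{im(\varphi_1 - \varphi_2)}\cos k(z_1 - z_2)\,dk. \tag{16.177}$$
Exercise 16.6.14 is a special case of this result.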
EXAMPLE 16.6.1 Quantum Mechanical Scattering—Neumann Series Solution
The quantum theory of scattering provides a nice illustration of integral
equation techniques and an application of a Green's function. Our physical
picture of scattering is as follows. A beam of particles moves along the negative
z-axis toward the origin. A small fraction of the particles is scattered by the
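potential $V(\mathbf{r})$ and goes off as an outgoing spherical wave. Our wave function
$\psi(\mathbf{r})$ must satisfy the time-independent Schrodinger equation
$$-\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r}) + V(\mathbf{r})\psi(\mathbf{r}) = E\psi(\mathbf{r}) \tag{16.178a}$$
or
$$\nabla^2\psi(\mathbf{r}) + k^2\psi(\mathbf{r}) = -\left[-\frac{2m}{\hbar^2}V(\mathbf{r})\psi(\mathbf{r})\right], \tag{16.178b}$$
with
$$k^2 = 2mE/\hbar^2.$$
From the physical picture just presented we look for a solution having an
asymptotic form
$$\psi(\mathbf{r}) \sim e^{i\mathbf{k}_0\cdot\mathbf{r}} + f_k(\theta, \varphi)\,\frac{e^{ikr}}{r}. \tag{16.179}$$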
Here $e^{i\mathbf{k}_0\cdot\mathbf{r}}$ is the incident plane wave4 with $\mathbf{k}_0$ the propagation vector carrying
the subscript 0 to indicate that it is in the $\theta = 0$ (z-axis) direction. The
magnitudes $k_0$ and k are equal. $e^{ikr}/r$ is the outgoing spherical wave with an angular-
(and energy) dependent amplitude factor $f_k(\theta, \varphi)$.5 Vector $\mathbf{k}$ has the direction of
the outgoing scattered wave. In quantum mechanics texts it is shown that the
4 For simplicity we assume a continuous incident beam. In a more
sophisticated and more realistic treatment Eq. 16.179 would be one component of
a Fourier wave packet.
5 If $V(\mathbf{r})$ represents a central force, $f_k$ will be a function of $\theta$ only, independent
of azimuth.
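differential probability of scattering, $d\sigma/d\Omega$, the scattering cross section per
unit solid angle, is given by $|f_k(\theta, \varphi)|^2$.
Identifying
$$\left[-\frac{2m}{\hbar^2}V(\mathbf{r})\psi(\mathbf{r})\right]$$
with $f(\mathbf{r})$ of Eq. 16.156, we have
$$\psi(\mathbf{r}_1) = -\int\frac{2m}{\hbar^2}V(\mathbf{r}_2)\psi(\mathbf{r}_2)G(\mathbf{r}_1, \mathbf{r}_2)\,d^3r_2 \tag{16.180}$$
by Eq. 16.156f. This does not have the desired asymptotic form of Eq. 16.179,
but we may add to Eq. 16.180 $e^{i\mathbf{k}_0\cdot\mathbf{r}_1}$, a solution of the homogeneous equation,
and put $\psi(\mathbf{r})$ into the desired form:
$$\psi(\mathbf{r}_1) = e^{i\mathbf{k}_0\cdot\mathbf{r}_1} - \int\frac{2m}{\hbar^2}V(\mathbf{r}_2)\psi(\mathbf{r}_2)G(\mathbf{r}_1, \mathbf{r}_2)\,d^3r_2. \tag{16.181}$$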
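Our Green's function is the Green's function of the operator $\mathcal{L} = \nabla^2 + k^2$
(Eq. 16.178b) satisfying the boundary condition that it describe an outgoing
wave. Then, from Table 16.1, $G(\mathbf{r}_1, \mathbf{r}_2) = \exp(ik|\mathbf{r}_1 - \mathbf{r}_2|)/(4\pi|\mathbf{r}_1 - \mathbf{r}_2|)$ and
$$\psi(\mathbf{r}_1) = e^{i\mathbf{k}_0\cdot\mathbf{r}_1} - \int\frac{2m}{\hbar^2}V(\mathbf{r}_2)\psi(\mathbf{r}_2)\,\frac{e^{ik|\mathbf{r}_1 - \mathbf{r}_2|}}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|}\,d^3r_2. \tag{16.182}$$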
This integral equation analog of the original Schrodinger wave equation is
exact.
Employing the Neumann series technique of Section 16.3 (remember, the
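scattering probability is very small), we have
$$\psi_0(\mathbf{r}_1) = e^{i\mathbf{k}_0\cdot\mathbf{r}_1}, \tag{16.183a}$$
which has the physical interpretation of no scattering.
Substituting $\psi_0(\mathbf{r}_2) = e^{i\mathbf{k}_0\cdot\mathbf{r}_2}$ into the integral, we obtain the first correction
term
$$\psi_1(\mathbf{r}_1) = e^{i\mathbf{k}_0\cdot\mathbf{r}_1} - \int\frac{2m}{\hbar^2}V(\mathbf{r}_2)\,\frac{e^{ik|\mathbf{r}_1 - \mathbf{r}_2|}}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|}\,e^{i\mathbf{k}_0\cdot\mathbf{r}_2}\,d^3r_2. \tag{16.183b}$$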
This is the famous Born approximation. It is expected to be most accurate for
weak potentials and high incident energy. If a more accurate approximation
is desired the Neumann series may be continued.6
EXAMPLE 16.6.2 Quantum Mechanical Scattering—Green's Function
Again, we consider the Schrodinger wave equation (Eq. 16.178b) for the
scattering problem. This time we use Fourier transform techniques and derive
the desired form of the Green's function by contour integration. Substituting
the desired asymptotic form of the solution (with k replaced by $k_0$)
$$\psi(\mathbf{r}) \sim e^{ik_0 z} + f_{k_0}(\theta, \varphi)\,\frac{e^{ik_0 r}}{r} = e^{ik_0 z} + \Phi(\mathbf{r}) \tag{16.179a}$$
6This assumes the Neumann series is convergent. In some physical situations
it is not convergent and then other techniques are needed.
into the Schrodinger wave equation, Eq. 16.178b, yields
$$(\nabla^2 + k_0^2)\Phi(\mathbf{r}) = U(\mathbf{r})e^{ik_0 z} + U(\mathbf{r})\Phi(\mathbf{r}). \tag{16.184a}$$
Here
$$\frac{\hbar^2}{2m}U(\mathbf{r}) = V(\mathbf{r}),$$
the scattering (perturbing) potential. Since the probability of scattering is
much less than one, the second term on the right-hand side of Eq. 16.184a is
expected to be negligible (relative to the first term on the right-hand side) and
thus we drop it. Note that we are approximating our differential equation with
$$(\nabla^2 + k_0^2)\Phi(\mathbf{r}) = U(\mathbf{r})e^{ik_0 z}. \tag{16.184b}$$
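We now proceed to solve Eq. 16.184b, a nonhomogeneous differential
equation. The differential operator $\nabla^2$ generates a continuous set of eigenfunctions
$$\nabla^2\psi_{\mathbf{k}}(\mathbf{r}) = -k^2\psi_{\mathbf{k}}(\mathbf{r}), \tag{16.185}$$
where
$$\psi_{\mathbf{k}}(\mathbf{r}) = (2\pi)^{-3/2}e^{i\mathbf{k}\cdot\mathbf{r}}.$$
These eigenfunctions form a continuous but orthonormal set in the sense that
$$\int\psi_{\mathbf{k}_1}^*(\mathbf{r})\,\psi_{\mathbf{k}_2}(\mathbf{r})\,d^3r = \delta(\mathbf{k}_1 - \mathbf{k}_2)$$
(compare Eq. 15.21d).7 We use these eigenfunctions to derive a Green's function.
We expand the unknown function $\Phi(\mathbf{r}_1)$ in these eigenfunctions,
$$\Phi(\mathbf{r}_1) = \int A_{\mathbf{k}_1}\psi_{\mathbf{k}_1}(\mathbf{r}_1)\,d^3k_1, \tag{16.186}$$
a Fourier integral with $A_{\mathbf{k}_1}$ the unknown coefficients. Substituting Eq. 16.186
into Eq. 16.184b and using Eq. 16.185, we obtain
$$\int A_{\mathbf{k}_1}(k_0^2 - k_1^2)\,\psi_{\mathbf{k}_1}(\mathbf{r})\,d^3k_1 = U(\mathbf{r})e^{ik_0 z}. \tag{16.187}$$
Using the now familiar technique of multiplying by $\psi_{\mathbf{k}_2}^*(\mathbf{r})$ and integrating over
the space coordinates, we have
$$\int A_{\mathbf{k}_1}(k_0^2 - k_1^2)\,d^3k_1\int\psi_{\mathbf{k}_2}^*(\mathbf{r})\,\psi_{\mathbf{k}_1}(\mathbf{r})\,d^3r = A_{\mathbf{k}_2}(k_0^2 - k_2^2) = \int\psi_{\mathbf{k}_2}^*(\mathbf{r})\,U(\mathbf{r})e^{ik_0 z}\,d^3r. \tag{16.188}$$
Solving for $A_{\mathbf{k}_2}$ and substituting into Eq. 16.186, we have
$$\Phi(\mathbf{r}_2) = \int\left[(k_0^2 - k_2^2)^{-1}\int\psi_{\mathbf{k}_2}^*(\mathbf{r}_1)\,U(\mathbf{r}_1)e^{ik_0 z_1}\,d^3r_1\right]\psi_{\mathbf{k}_2}(\mathbf{r}_2)\,d^3k_2. \tag{16.189}$$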
7 $d^3r = dx\,dy\,dz$, a (three-dimensional) volume element in r-space.
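Hence
$$\Phi(\mathbf{r}_1) = \int\psi_{\mathbf{k}_1}(\mathbf{r}_1)(k_0^2 - k_1^2)^{-1}\,d^3k_1\int\psi_{\mathbf{k}_1}^*(\mathbf{r}_2)\,U(\mathbf{r}_2)e^{ik_0 z_2}\,d^3r_2, \tag{16.190}$$
replacing $\mathbf{k}_2$ by $\mathbf{k}_1$ and $\mathbf{r}_1$ by $\mathbf{r}_2$ to agree with Eq. 16.186. Reversing the order
of integration, we have
$$\Phi(\mathbf{r}_1) = -\int G_{k_0}(\mathbf{r}_1, \mathbf{r}_2)\,U(\mathbf{r}_2)e^{ik_0 z_2}\,d^3r_2, \tag{16.191}$$
where $G_{k_0}(\mathbf{r}_1, \mathbf{r}_2)$, our Green's function, is given by
$$G_{k_0}(\mathbf{r}_1, \mathbf{r}_2) = \int\frac{\psi_{\mathbf{k}_1}^*(\mathbf{r}_2)\,\psi_{\mathbf{k}_1}(\mathbf{r}_1)}{k_1^2 - k_0^2}\,d^3k_1, \tag{16.192}$$
analogous to Eq. 9.91 of Section 9.4 for discrete eigenfunctions. Equation 16.191
should be compared with the Green's function solution of Poisson's equation
(16.116).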
It is perhaps worth evaluating this integral to emphasize once more the vital
role played by the boundary conditions. Using the eigenfunctions from Eq.
16.185 and
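$$d^3k = k^2\,dk\,\sin\theta\,d\theta\,d\varphi, \tag{16.193}$$
we obtain
$$G_{k_0}(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{(2\pi)^3}\int_0^{\infty}\int_0^{\pi}\int_0^{2\pi}\frac{e^{ik\rho\cos\theta}}{k^2 - k_0^2}\,d\varphi\,\sin\theta\,d\theta\,k^2\,dk. \tag{16.194}$$
Here $k\rho\cos\theta$ has replaced $\mathbf{k}\cdot(\mathbf{r}_1 - \mathbf{r}_2)$, with $\boldsymbol{\rho} = \mathbf{r}_1 - \mathbf{r}_2$ indicating the polar
axis in k-space. Integrating over $\varphi$ by inspection, we pick up a $2\pi$. The
$\theta$-integration then leads to
$$G_{k_0}(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{4\pi^2\rho i}\int_0^{\infty}\frac{e^{ik\rho} - e^{-ik\rho}}{k^2 - k_0^2}\,k\,dk, \tag{16.195}$$
and since the integrand is an even function of k, we may set
$$G_{k_0}(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{8\pi^2\rho i}\int_{-\infty}^{\infty}\frac{e^{i\kappa} - e^{-i\kappa}}{\kappa^2 - \sigma^2}\,\kappa\,d\kappa. \tag{16.196}$$
The latter step is taken in anticipation of the evaluation of $G_{k_0}(\mathbf{r}_1, \mathbf{r}_2)$ as a contour
integral. The symbols $\kappa$ and $\sigma$ ($\sigma > 0$) represent $k\rho$ and $k_0\rho$, respectively.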
If the integral in Eq. 16.196 is interpreted as a Riemann integral, the integral
does not exist. This implies that $\mathcal{L}^{-1}$ does not exist, and in a literal sense it
does not. $\mathcal{L} = \nabla^2 + k^2$ is singular since there exist nontrivial solutions $\psi$ for
which the homogeneous equation $\mathcal{L}\psi = 0$ holds (compare Exercise 4.6.6). We avoid
this problem by introducing a parameter $\gamma$, defining a different operator $\mathcal{L}_\gamma^{-1}$,
and taking the limit as $\gamma \to 0$.
Splitting the integral into two parts so each part may be written as a suitable
contour integral gives us
GREEN'S FUNCTIONS—TWO AND THREE DIMENSIONS 919
$$G_{k_0}(\mathbf{r}_1,\mathbf{r}_2) = \frac{1}{8\pi^2\rho i}\oint_{C_1}\frac{\kappa\,e^{i\kappa}\,d\kappa}{\kappa^2 - \sigma^2} + \frac{-1}{8\pi^2\rho i}\oint_{C_2}\frac{\kappa\,e^{-i\kappa}\,d\kappa}{\kappa^2 - \sigma^2}.$$ (16.197)
Contour $C_1$ is closed by a semicircle in the upper half-plane, $C_2$ by a semicircle
in the lower half-plane. These integrals were evaluated in Chapter 7 by using
appropriately chosen infinitesimal semicircles to go around the singular points
$\kappa = \pm\sigma$. As an alternative procedure, let us first displace the singular points
from the real axis by replacing $\sigma$ by $\sigma + i\gamma$ and then, after evaluation, taking the
limit as $\gamma \to 0$ (Fig. 16.5).
FIG. 16.5 Possible Green's function contours of integration
For $\gamma$ positive, contour $C_1$ encloses the singular point $\kappa = \sigma + i\gamma$ and the
first integral contributes
$$2\pi i\cdot\tfrac{1}{2}\,e^{i(\sigma + i\gamma)}.$$
From the second integral we also obtain
$$2\pi i\cdot\tfrac{1}{2}\,e^{i(\sigma + i\gamma)},$$
the enclosed singularity being $\kappa = -(\sigma + i\gamma)$. Returning to Eq. 16.197 and
letting $\gamma \to 0$, we have
$$G_{k_0}(\mathbf{r}_1,\mathbf{r}_2) = \frac{e^{i\sigma}}{4\pi\rho} = \frac{e^{ik_0|\mathbf{r}_1 - \mathbf{r}_2|}}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|},$$ (16.198)
in full agreement with Exercise 8.7.16. This result depends on starting with $\gamma$
positive. Had we chosen $\gamma$ negative, our Green's function would have included
$e^{-i\sigma}$, which corresponds to an incoming wave. The choice of positive $\gamma$ is dictated
by the boundary conditions we wish to satisfy.
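As a rough numerical cross-check of Eq. 16.198, the sketch below tests that the outgoing-wave form $e^{ik_0\rho}/(4\pi\rho)$ satisfies the homogeneous Helmholtz equation away from the source point. It is only an illustration: the function names, the sample values of $k_0$ and $\rho$, and the finite-difference step are assumptions, not part of the text. For a spherically symmetric function the Laplacian reduces to $(1/\rho)\,d^2(\rho G)/d\rho^2$, which is what the code differences.

```python
import cmath
import math


def green_outgoing(rho, k0):
    """Outgoing-wave Green's function exp(i*k0*rho) / (4*pi*rho), Eq. 16.198."""
    return cmath.exp(1j * k0 * rho) / (4.0 * math.pi * rho)


def helmholtz_residual(green, rho, k0, h=1.0e-3):
    """Finite-difference estimate of (nabla^2 + k0^2) G at a point rho > 0.

    For a spherically symmetric G the Laplacian is (1/rho) d^2(rho*G)/d rho^2,
    approximated here by a central second difference with step h.
    """
    def u(r):
        return r * green(r, k0)

    second = (u(rho + h) - 2.0 * u(rho) + u(rho - h)) / (h * h)
    return second / rho + k0 * k0 * green(rho, k0)


# Illustrative values only (not taken from the text).
k0, rho = 2.0, 1.5
print(abs(helmholtz_residual(green_outgoing, rho, k0)))
```

The printed residual should be small compared with $k_0^2|G|$. The incoming-wave form $e^{-ik_0\rho}/(4\pi\rho)$ passes the same test, which is why the differential equation alone cannot select between them and the sign of $\gamma$ (the boundary condition) has to.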
Equations 16.191 and 16.198 reproduce the scattered wave in Eq. 16.183b
and constitute an exact solution of the approximate Eq. 16.184b. Exercises
16.6.18 to 16.6.20 extend these results.
EXERCISES
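16.6.1 Verify Eq. 16.156d,
$$\int (v\,\mathcal{L}_2 u - u\,\mathcal{L}_2 v)\,d\tau_2 = \int p\,(v\,\nabla_2 u - u\,\nabla_2 v)\cdot d\boldsymbol{\sigma}_2.$$
16.6.2 Show that the terms $+k^2$ in the Helmholtz operator and $-k^2$ in the modified
Helmholtz operator do not affect the behavior of $G(\mathbf{r}_1,\mathbf{r}_2)$ in the immediate
vicinity of the singular point $\mathbf{r}_1 = \mathbf{r}_2$. Specifically, show that
$$\lim_{|\mathbf{r}_1 - \mathbf{r}_2| \to 0} \int k^2\,G(\mathbf{r}_1,\mathbf{r}_2)\,d\tau_2 = 0.$$
16.6.3 Show that
$$\frac{\exp(ik|\mathbf{r}_1 - \mathbf{r}_2|)}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|}$$
satisfies the two appropriate criteria and therefore is a Green's function for the
Helmholtz equation.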
16.6.4 (a) Find the Green's function for the three-dimensional Helmholtz equation,
Exercise 8.7.16, when the wave is a standing wave,
(b) How is this Green's function related to the spherical Bessel functions?
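16.6.5 The homogeneous Helmholtz equation
$$\nabla^2\varphi + \lambda^2\varphi = 0$$
has eigenvalues $\lambda_i^2$ and eigenfunctions $\varphi_i$. Show that the corresponding Green's
function that satisfies
$$\nabla^2 G(\mathbf{r}_1,\mathbf{r}_2) + \lambda^2 G(\mathbf{r}_1,\mathbf{r}_2) = -\delta(\mathbf{r}_1 - \mathbf{r}_2)$$
may be written as
$$G(\mathbf{r}_1,\mathbf{r}_2) = \sum_i \frac{\varphi_i(\mathbf{r}_1)\,\varphi_i(\mathbf{r}_2)}{\lambda_i^2 - \lambda^2}.$$
An expansion of this form is called a bilinear expansion. If the Green's function
is available in closed form, this provides a means of generating functions.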
16.6.6 An electrostatic potential (mks units) is
$$\varphi(r) = \frac{Z}{4\pi\varepsilon_0}\,\frac{e^{-ar}}{r}.$$
Reconstruct the electrical charge distribution that will produce this potential.
Note that $\varphi(r)$ vanishes exponentially for large r, showing that the net charge is
zero.
ANS. $\rho(r) = Z\,\delta(\mathbf{r}) - \dfrac{Z a^2}{4\pi}\,\dfrac{e^{-ar}}{r}$.
16.6.7 Transform the differential equation
$$\frac{d^2 y(r)}{dr^2} - k^2 y(r) + V_0\,\frac{e^{-r}}{r}\,y(r) = 0,$$
and the boundary conditions $y(0) = y(\infty) = 0$ into a Fredholm integral
equation of the form
$$y(r) = V_0\int_0^\infty G(r,t)\,\frac{e^{-t}}{t}\,y(t)\,dt.$$
The quantities $V_0$ and $k^2$ are constants. The differential equation is derived from
the Schrodinger wave equation with a meson potential.
ANS.
$$G(r,t) = \begin{cases} \dfrac{1}{k}\,e^{-kt}\sinh kr, & 0 \le r < t, \\[4pt] \dfrac{1}{k}\,e^{-kr}\sinh kt, & t < r < \infty. \end{cases}$$
16.6.8 A charged conducting ring of radius a (Example 12.3.3) may be described by
Using the known Green's function for this system, find the electrostatic
potential.
Hint. Exercise 12.6.3 will be helpful.
16.6.9 Changing a separation constant from k2 to — k2 and putting the discontinuity
of the first derivative into the z-dependence, show that
1 1 со Л°°
= — У £»«(«>,-<p2)j (kp1)Jm(kp2)e~klz<~z^dk.
An Tj - r2 Anm^aa}0
Hint. The required ё(р1 — р2) may be obtained from Exercise 15.1.2.
16.6.10 Derive the expansion
exp[i/c|r! —
An\vx -
l h
' Г\ <Г2
m=-l Г2>Г2.
Hint. The left side is a known Green's function. Assume a spherical harmonic
expansion and work on the remaining radial dependence. The spherical
harmonic closure relation, Exercise 12.6.6, covers the angular dependence.
16.6.11 Show that the modified Helmholtz operator Green's function, exp( — /c|Г! — r2|)/
Dтг|г1 — r2|) has the spherical polar coordinate expansion
gxn( к\т г I) °° '
— Г2| /=0 m=_,
e. The modified spherical Bessel functions i,(kr) and k,(kr) are defined in
Exercise 11.7.15.
16.6.12 From the spherical Green's function of Exercise 16.6.10, derive the plane wave
expansion
00
where у is the angle included between к and r. This is the Rayleigh equation of
Exercise 12.4.7.
Hint. Take r2 » Tj so that
I I kri
|rl r21 *■ r2 ~~ г20 Г1 "~ '2 "~ i •
Let r2 -*■ oo and cancel a factor of e'kr2/r2.
16.6.13 From the results of Exercises 16.6.10 and 16.6.12, show that
eix = У i'Bl + 1)/ (x)
16.6.14 (a) From the circular cylindrical coordinate expansion of the Laplace Green's
function (Eq. 16.177), show that
—2——j-tjj = - K0(kp) cos kz dk.
This same result is obtained directly in Exercise 15.3.11.
(b) As a special case of part (a) show that
16.6.15 Noting that
*,
is an eigenfunction of
(Eqs. 16.183 and 16.184), show that the infinite Green's function of S£ = V2
may be expanded as
1 = 1 Г ft1ri_ri) fk
4тг|Г1-г2| BтгK J ^ /c2'
16.6.16 Using Fourier transforms, show that the Green's function satisfying the non-
homogeneous Helmholtz equation
is
in agreement with Eq. 16.192.
16.6.17 The basic equation of the scalar Kirchhoff diffraction theory is
where $\psi$ satisfies the homogeneous Helmholtz equation and $r = |\mathbf{r}_1 - \mathbf{r}_2|$.
Derive this equation. Assume that $\mathbf{r}_1$ is interior to the closed surface $S_2$.
Hint. Use Green's theorem.
16.6.18 The Born approximation for the scattered wave is given by Eq. 16.183b (and
Eq. 16.191). From the asymptotic form, Eq. 16.179,
ik 2 Г Ик!
,^__2m
г " h2 )У(Г2Lп\г-г2\е"°'2аГ2-
For the scattering potential V(r2) independent of angles and for
that
fk@, q>) = -
Here k0 is in the в = 0 (original z-axis) direction, whereas к is in the @, cp)
direction. The magnitudes are equal: |ko| = |k|. m is the reduced mass.
Hint. You have Exercise 16.6.12 to simplify the exponential and Exercise 15.3.20
to transform the three-dimensional Fourier exponential transform into a one-
dimensional Fourier sine transform.
16.6.19 Calculate the scattering amplitude fk@, q>) for a meson potential V(r) = Vo .
car
Hint. This particular potential permits the Born integral, Exercise 16.6.18 to
be evaluated as a Laplace transform.
ANS. /k@,0=_^>-_J _
ft a a2 + (k0 - kf
16.6.20 The meson potential $V(r) = V_0(e^{-\alpha r}/\alpha r)$ may be used to describe the Coulomb
scattering of two charges $q_1$ and $q_2$. We let $\alpha \to 0$ and $V_0 \to 0$ but take the ratio
$V_0/\alpha$ to be $q_1 q_2/4\pi\varepsilon_0$. (For Gaussian units omit the $4\pi\varepsilon_0$.) Show that the
differential scattering cross section $d\sigma/d\Omega = |f_k(\theta,\varphi)|^2$ is given by
$$\frac{d\sigma}{d\Omega} = \left(\frac{q_1 q_2}{4\pi\varepsilon_0}\right)^2 \frac{1}{16E^2\sin^4(\theta/2)}, \qquad E = \frac{p^2}{2m} = \frac{\hbar^2 k^2}{2m}.$$
It happens (coincidentally) that this Born approximation is in exact agreement
with both the exact quantum mechanical calculations and the classical
Rutherford calculation.
REFERENCES
Bocher, M., An Introduction to the Study of Integral Equations. Cambridge Tracts in
Mathematics and Mathematical Physics, No. 10. New York: Hafner (1960).
This is a very helpful introduction to integral equations.
Cochran, J. A., The Analysis of Linear Integral Equations. New York: McGraw-Hill
(1972).
This is a comprehensive treatment of linear integral equations which is intended for
applied mathematicians and mathematical physicists. It assumes a moderate to high
level of mathematical competence on the part of the reader.
Courant, R., and D. Hilbert, Methods of Mathematical Physics, vol. 1 (English ed.).
New York: Interscience (1953).
This is one of the classic works of mathematical physics. Originally published in German
in 1924, the revised English edition is an excellent reference for a rigorous treatment of
integral equations, Green's functions, and a wide variety of other topics on mathematical
physics.
Golberg, M. A., Ed., Solution Methods of Integral Equations. New York: Plenum Press
(1979).
This is a set of papers from a conference on integral equations. The initial chapter is
excellent for up-to-date orientation and a wealth of current references.
Kanwal, R. P., Linear Integral Equations. New York: Academic Press (1971).
This book is a detailed but readable treatment of a variety of techniques for solving
linear integral equations.
Morse, P. M., and H. Feshbach, Methods of Theoretical Physics. New York: McGraw-Hill
(1953).
Chapter 7 is a particularly detailed, complete discussion of Green's functions from the
point of view of mathematical physics. Note, however, that Morse and Feshbach
frequently choose a source of $4\pi\delta(\mathbf{r} - \mathbf{r}')$ in place of our $\delta(\mathbf{r} - \mathbf{r}')$. Considerable
attention is devoted to bounded regions.
Stakgold, I., Green's Functions and Boundary Value Problems. New York: Wiley (1979).
17 CALCULUS OF
VARIATIONS
Uses of the Calculus of Variations
Before plunging into this new and rather different branch of mathematical
physics, let us summarize some of its uses in both physics and mathematics.
1. Existing physical theories:
a. Unification of diverse areas of physics—using
energy as a key concept.
b. Convenience in analysis—Lagrange equations,
Section 17.3.
c. Convenient introduction of constraints, Section
17.7.
2. Starting point for new, complex areas of physics and
engineering. In general relativity the geodesic is taken
as the minimum path of a light pulse in curved
Riemannian space. Variational principles appear in
modern quantum field theory. Variational principles
have been applied extensively in modern control
theory.
3. Mathematical unification. Variational analysis
provides a proof of the completeness of the Sturm-
Liouville eigenfunctions, Chapter 9, and establishes
a lower bound for the eigenvalues. Similar results
follow for the eigenvalues and eigenfunctions of the
Hilbert-Schmidt integral equation, Section 16.4.
4. Calculation techniques, Section 17.8. Calculation of
the eigenfunctions and eigenvalues of the Sturm-
Liouville equation. Integral equation eigenfunctions
and eigenvalues may be calculated using numerical
quadrature and matrix techniques, Section 16.3.
17.1 ONE-DEPENDENT AND ONE-INDEPENDENT
VARIABLE
Concept of Variation
The calculus of variations involves problems in which the quantity to be
minimized (or maximized) appears as an integral. As the simplest case, let
$$J = \int_{x_1}^{x_2} f(y, y_x, x)\,dx.$$ (17.1)
Here J is the quantity that takes on an extreme value. Under the integral sign,
f is a known function of the indicated variables $y(x)$, $y_x(x) = dy(x)/dx$ and x,
but the dependence of y on x is not fixed; that is, y(x) is unknown. This means
that although the integral is from $x_1$ to $x_2$, the exact path of integration is not
known (Fig. 17.1).
FIG. 17.1 A varied path
We are to choose the path of integration through points $(x_1,y_1)$ and $(x_2,y_2)$
to minimize J. Strictly speaking, we determine stationary values of J: minima,
maxima, or saddle points. In most cases of physical interest the stationary value
will be a minimum.
This problem is considerably more difficult than the corresponding problem
in differential calculus. Indeed, there may be no solution. In differential calculus
the minimum is determined by comparing y(x0) with y(x), where x ranges over
neighboring points. Here we assume the existence of an optimum path, that is,
an acceptable path for which J is stationary, and then compare J for our
(unknown) optimum path with that obtained from neighboring paths. In Fig.
17.1 two possible paths are shown. (There are an infinite number of possibilities,
of course.) The difference between these two for a given x is called the variation
of y, $\delta y$, and is conveniently described by introducing a new function $\eta(x)$ to
define the arbitrary deformation of the path and a scale factor $\alpha$ to give the
magnitude of the variation. The function $\eta(x)$ is arbitrary except for two
restrictions. First,
$$\eta(x_1) = \eta(x_2) = 0,$$ (17.2)
which means that all varied paths must pass through the fixed end points.
Second, as will be seen shortly, $\eta(x)$ must be differentiable; that is, we may not
use
$$\eta(x) = \begin{cases} 1, & x = x_0, \\ 0, & x \neq x_0, \end{cases}$$ (17.3)
but we can choose $\eta(x)$ to have a form similar to the functions used to represent
the Dirac delta function (Chapters 8 and 16) so that $\eta(x)$ differs from zero only
over an infinitesimal region.1 Then, with the path described with $\alpha$ and $\eta(x)$,
$$y(x,\alpha) = y(x,0) + \alpha\,\eta(x)$$ (17.4)
and
$$\delta y = y(x,\alpha) - y(x,0) = \alpha\,\eta(x).$$ (17.5)
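Let us choose $y(x, \alpha = 0)$ as the unknown path that will minimize J. Then
$y(x,\alpha)$ describes a neighboring path. In Eq. 17.1 J is now a function2 of our new
parameter $\alpha$:
$$J(\alpha) = \int_{x_1}^{x_2} f[y(x,\alpha),\,y_x(x,\alpha),\,x]\,dx,$$ (17.6)
and our condition for an extreme value is that
$$\left[\frac{\partial J(\alpha)}{\partial\alpha}\right]_{\alpha=0} = 0,$$ (17.7)
analogous to the vanishing of the derivative dy/dx in differential calculus.
Now the $\alpha$-dependence of the integral is contained in $y(x,\alpha)$ and $y_x(x,\alpha) =
(\partial/\partial x)\,y(x,\alpha)$. Therefore3
$$\frac{\partial J(\alpha)}{\partial\alpha} = \int_{x_1}^{x_2}\left[\frac{\partial f}{\partial y}\frac{\partial y}{\partial\alpha} + \frac{\partial f}{\partial y_x}\frac{\partial y_x}{\partial\alpha}\right]dx.$$ (17.8)
From Eq. 17.4
$$\frac{\partial y(x,\alpha)}{\partial\alpha} = \eta(x),$$ (17.9)
$$\frac{\partial y_x(x,\alpha)}{\partial\alpha} = \frac{d\eta(x)}{dx}.$$ (17.10)
Equation 17.8 becomes
$$\frac{\partial J(\alpha)}{\partial\alpha} = \int_{x_1}^{x_2}\left[\frac{\partial f}{\partial y}\,\eta(x) + \frac{\partial f}{\partial y_x}\frac{d\eta(x)}{dx}\right]dx.$$ (17.11)
Integrating the second term by parts, we obtain
$$\int_{x_1}^{x_2}\frac{d\eta(x)}{dx}\frac{\partial f}{\partial y_x}\,dx = \eta(x)\frac{\partial f}{\partial y_x}\bigg|_{x_1}^{x_2} - \int_{x_1}^{x_2}\eta(x)\frac{d}{dx}\frac{\partial f}{\partial y_x}\,dx.$$ (17.12)
The integrated part vanishes by Eq. 17.2 and Eq. 17.11 becomes
$$\int_{x_1}^{x_2}\left[\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y_x}\right]\eta(x)\,dx = 0.$$ (17.13)
1 Compare H. Jeffreys and B. S. Jeffreys, Methods of Mathematical Physics,
3rd ed. Cambridge: Cambridge University Press (1966), Chapter 10, for a
more complete discussion of this point.
2 Technically, J is a functional, depending on the functions $y(x,\alpha)$ and $y_x(x,\alpha)$.
3 Note that y and $y_x$ are being treated as independent variables.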
In this form $\alpha$ has been set equal to zero and, in effect, is no longer part of the
problem.
Occasionally we will see Eq. 17.13 multiplied by $\alpha$, which gives
$$\int_{x_1}^{x_2}\left[\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y_x}\right]\delta y\,dx = \alpha\left[\frac{\partial J}{\partial\alpha}\right]_{\alpha=0} = \delta J = 0.$$ (17.14)
Since $\eta(x)$ is arbitrary (as already discussed), we may choose it to have the same
sign as the bracketed expression whenever the latter differs from zero. Hence
the integrand is always nonnegative. Equation 17.13, our condition for the
existence of a stationary value, can then be satisfied only if the bracketed term
itself is identically zero. The condition for our stationary value is thus a partial
differential equation,4
$$\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y_x} = 0,$$ (17.15)
known as the Euler equation, which can be expressed in various other forms.
Alternate Forms of Euler Equations
One other form (Exercise 17.1.1), which is often useful, is
$$\frac{\partial f}{\partial x} - \frac{d}{dx}\left(f - y_x\frac{\partial f}{\partial y_x}\right) = 0.$$ (17.16)
In problems in which $f = f(y, y_x)$ and x does not appear explicitly, Eq. 17.16
reduces to
$$\frac{d}{dx}\left(f - y_x\frac{\partial f}{\partial y_x}\right) = 0$$ (17.17)
or
$$f - y_x\frac{\partial f}{\partial y_x} = \text{constant}.$$ (17.18)
It is clear that Eq. 17.15 or 17.16 must be satisfied for J to take on a stationary
value, that is, for Eq. 17.14 to be satisfied. Equation 17.15 is necessary, but it
is by no means sufficient.5 Courant and Robbins illustrate this very nicely by
considering the distance over a sphere between points on the sphere, A and B
(Fig. 17.2). Path (1), a great circle route, is found from Eq. 17.15. But path (2),
the remainder of the great circle through points A and B, also satisfies the Euler
equation. Path (2) is a maximum but only if we demand that it be a great circle
and then only if we make less than one circuit; that is, path (2) + n complete
revolutions is also a solution. If the path is not required to be a great circle,
any deviation from (2) will increase the length. This is hardly the property of
a local maximum, and that is why it is important to check the properties of
solutions of Eq. 17.15 to see if they satisfy the physical conditions of the given
problem.
FIG. 17.2 Stationary paths over a sphere
4 It is important to watch the meaning of $\partial/\partial x$ and $d/dx$ closely. For example,
for $f = f[y(x), x]$,
$$\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}.$$
The first term on the right gives the explicit x-dependence. The second term
gives the implicit x-dependence.
5 For a discussion of sufficiency conditions and the development of the calculus
of variations as a part of modern mathematics see G. M. Ewing, Calculus of
Variations with Applications, Norton, New York (1969). Sufficiency conditions
are also covered by Sagan (reference listed at the end of this chapter).
EXERCISES
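17.1.1 Show the equivalence of the two forms of Euler's equation:
$$\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y_x} = 0$$
and
$$\frac{\partial f}{\partial x} - \frac{d}{dx}\left(f - y_x\frac{\partial f}{\partial y_x}\right) = 0.$$
17.1.2 Derive Euler's equation by expanding the integrand of
$$J(\alpha) = \int_{x_1}^{x_2} f[y(x,\alpha),\,y_x(x,\alpha),\,x]\,dx$$
in powers of $\alpha$, using a Taylor (Maclaurin) expansion with y and $y_x$ as the two
variables (Section 5.6).
Note. The stationary condition is $\partial J(\alpha)/\partial\alpha = 0$, evaluated at $\alpha = 0$. The terms
quadratic in $\alpha$ may be useful in establishing the nature of the stationary solution
(maximum, minimum, or saddle point).
17.1.3 Find the Euler equation corresponding to Eq. 17.15 if $f = f(y_{xx}, y_x, y, x)$.
ANS.
$$\frac{d^2}{dx^2}\left(\frac{\partial f}{\partial y_{xx}}\right) - \frac{d}{dx}\left(\frac{\partial f}{\partial y_x}\right) + \frac{\partial f}{\partial y} = 0,$$
with $\eta(x_1) = \eta(x_2) = 0$, $\eta_x(x_1) = \eta_x(x_2) = 0$.
17.1.4 The integrand $f(y, y_x, x)$ of Eq. 17.1 has the form
$$f(y, y_x, x) = f_1(x,y) + f_2(x,y)\,y_x.$$
(a) Show that the Euler equation leads to
$$\frac{\partial f_1}{\partial y} - \frac{\partial f_2}{\partial x} = 0.$$
(b) What does this imply for the dependence of the integral J upon the choice
of path?
17.1.5 Show that the condition that
$$J = \int f(x,y)\,dx$$
have a stationary value
(a) leads to f(x, y) independent of y and
(b) yields no information about any x-dependence.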
We get no (continuous, differentiable) solution. To be a meaningful variational
problem dependence on yx or higher derivatives is essential.
Note. The situation will change when constraints are introduced (compare
Exercise 17.7.7).
17.2 APPLICATIONS OF THE EULER
EQUATION
EXAMPLE 17.2.1 Straight Line
Perhaps the simplest application of the Euler equation is in the determination
of the shortest distance between two points in the xy-plane. Since the element of
distance is
$$ds = [(dx)^2 + (dy)^2]^{1/2} = [1 + y_x^2]^{1/2}\,dx,$$ (17.19)
the distance J may be written as
$$J = \int_{x_1,y_1}^{x_2,y_2} ds = \int_{x_1}^{x_2}[1 + y_x^2]^{1/2}\,dx.$$ (17.20)
Comparison with Eq. 17.1 shows that
$$f(y, y_x, x) = (1 + y_x^2)^{1/2}.$$ (17.21)
Substituting into Eq. 17.16, we obtain
$$-\frac{d}{dx}\left[\frac{1}{(1 + y_x^2)^{1/2}}\right] = 0$$ (17.22)
or
$$\frac{1}{(1 + y_x^2)^{1/2}} = C, \quad \text{a constant}.$$ (17.23)
This is satisfied by
$$y_x = a, \quad \text{a second constant},$$ (17.24)
and
$$y = ax + b,$$ (17.25)
which is the familiar equation for a straight line. The constants a and b, of course,
are chosen so that the line passes through the two points $(x_1,y_1)$ and $(x_2,y_2)$.
Hence the Euler equation predicts that the shortest6 distance between two fixed
points is a straight line.
The generalization of this in curved four-dimensional space-time leads to the
important relativity concept, the geodesic.
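The straight-line result can be probed numerically with a one-parameter family of varied paths of the form of Eq. 17.4. The sketch below is an illustration under assumed choices that are not in the text: end points (0, 0) and (1, 1), the deformation $\eta(x) = \sin\pi x$ (which vanishes at both end points, as Eq. 17.2 requires), and a polyline approximation of the arc-length integral of Eq. 17.20.

```python
import math


def path_length(alpha, n=2000):
    """Polyline estimate of Eq. 17.20 for y(x) = x + alpha*sin(pi*x), 0 <= x <= 1.

    alpha = 0 is the straight line through (0, 0) and (1, 1); the sine term
    plays the role of eta(x) and vanishes at both end points.
    """
    total = 0.0
    for i in range(n):
        xa, xb = i / n, (i + 1) / n
        ya = xa + alpha * math.sin(math.pi * xa)
        yb = xb + alpha * math.sin(math.pi * xb)
        total += math.hypot(xb - xa, yb - ya)
    return total


for alpha in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(alpha, path_length(alpha))
```

Within this family the length is smallest at $\alpha = 0$, where it equals $\sqrt{2}$, and it grows for either sign of $\alpha$. This checks stationarity only along one assumed direction of deformation; it does not replace the general argument.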
EXAMPLE 17.2.2 Soap Film
As a second illustration (Fig. 17.3), consider a surface of revolution generated
by revolving a curve y(x) about the x-axis. The curve is required to pass through
fixed end points $(x_1,y_1)$ and $(x_2,y_2)$. The variational problem is to choose
the curve y(x) so that the area of the resulting surface will be a minimum.
FIG. 17.3 Surface of rotation: soap film problem
For the element of area shown in Fig. 17.3
$$dA = 2\pi y\,ds = 2\pi y\,(1 + y_x^2)^{1/2}\,dx.$$ (17.26)
The variational equation is then
$$J = \int_{x_1}^{x_2} 2\pi y\,(1 + y_x^2)^{1/2}\,dx.$$ (17.27)
Neglecting the $2\pi$, we obtain
$$f(y, y_x, x) = y\,(1 + y_x^2)^{1/2}.$$ (17.28)
Since $\partial f/\partial x = 0$, we may apply Eq. 17.18 directly and get
$$y\,(1 + y_x^2)^{1/2} - y\,y_x^2\,\frac{1}{(1 + y_x^2)^{1/2}} = c_1,$$ (17.29)
or
$$\frac{y}{(1 + y_x^2)^{1/2}} = c_1.$$ (17.30)
Squaring, we get
$$\frac{y^2}{1 + y_x^2} = c_1^2 \quad \text{with } c_1^2 \le y_{\min}^2,$$ (17.31)
and
$$(y_x)^{-1} = \frac{dx}{dy} = \frac{c_1}{\sqrt{y^2 - c_1^2}}.$$ (17.32)
This may be integrated to give
$$x = c_1\cosh^{-1}\frac{y}{c_1} + c_2.$$ (17.33)
Solving for y, we have
$$y = c_1\cosh\left(\frac{x - c_2}{c_1}\right),$$ (17.34)
and again $c_1$ and $c_2$ are determined by requiring the hyperbolic cosine to pass
through the points $(x_1,y_1)$ and $(x_2,y_2)$. Our "minimum" area surface is a
catenary of revolution or a catenoid.
6 Technically, we have a stationary value. From the $\alpha^2$ terms it can be identified
as a minimum (Exercise 17.2.1).
Soap Film—Minimum Area
This calculus of variations contains many pitfalls for the unwary. (Remember
the Euler equation is a necessary condition assuming a differentiable solution.
The sufficiency conditions are quite involved. See the references for details.)
Perhaps respect for some of these hazards may be developed by considering
a specific physical problem, for example, the minimum area problem with
$(x_1,y_1) = (-x_0, 1)$, $(x_2,y_2) = (+x_0, 1)$. The minimum surface is a soap film
stretched over the two rings of unit radius at $x = \pm x_0$. The problem is to predict
the curve y(x) assumed by the soap film.
By referring to Eq. 17.34, we find that $c_2 = 0$ by the symmetry of the problem.
Then
$$y = c_1\cosh\left(\frac{x}{c_1}\right).$$ (17.34a)
If we take $x_0 = \tfrac{1}{2}$, we obtain the transcendental equation for $c_1$,
$$1 = c_1\cosh\left(\frac{1}{2c_1}\right).$$ (17.35)
We find that this equation has two solutions; $c_1 = 0.2350$, leading to a "deep"
curve, and $c_1 = 0.8483$, leading to a "flat" curve. Which is our minimum? Which
curve is assumed by the soap film? Before answering these questions, consider
the physical situation with the rings moved apart so that $x_0 = 1$. Then Eq.
17.34a becomes
$$1 = c_1\cosh\left(\frac{1}{c_1}\right),$$ (17.36)
which has no real solutions! The physical significance is that as the unit radius
rings were moved out from the origin a point was reached at which the soap
film could no longer maintain the same horizontal force over each vertical
section. Stable equilibrium was no longer possible. The soap film broke
(irreversible process) and formed a circular film over each ring (with a total area
of $2\pi = 6.2832\ldots$). This is the Goldschmidt discontinuous solution.
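The two roots quoted for Eq. 17.35 and the absence of roots for Eq. 17.36 can be reproduced with a simple scan-and-bisect search on $c_1\cosh(x_0/c_1) - 1$. The sketch below is one possible way to do it, not the method used in the text; the function names, the scan range $0.05 \le c_1 \le 1$ (any root must satisfy $c_1 \le 1$ because $\cosh \ge 1$), and the step count are assumptions.

```python
import math


def residual(c, x0):
    """c*cosh(x0/c) - 1, which vanishes when the catenary passes through (x0, 1)."""
    return c * math.cosh(x0 / c) - 1.0


def catenoid_constants(x0, c_lo=0.05, c_hi=1.0, steps=1900, tol=1e-12):
    """Roots c1 of 1 = c1*cosh(x0/c1) found by scanning for sign changes."""
    roots = []
    width = (c_hi - c_lo) / steps
    for i in range(steps):
        a = c_lo + i * width
        b = a + width
        if residual(a, x0) * residual(b, x0) > 0.0:
            continue  # no sign change in this sub-interval
        while b - a > tol:  # plain bisection inside the bracketing interval
            mid = 0.5 * (a + b)
            if residual(a, x0) * residual(mid, x0) <= 0.0:
                b = mid
            else:
                a = mid
        roots.append(0.5 * (a + b))
    return roots


print(catenoid_constants(0.5))  # deep and flat curves of Eq. 17.35
print(catenoid_constants(1.0))  # Eq. 17.36: empty list, no real solution
```

For $x_0 = \tfrac{1}{2}$ the search returns two values close to the quoted 0.2350 and 0.8483; for $x_0 = 1$ it finds no sign change at all.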
FIG. 17.4 Solutions of Eq. 17.34a for unit radius rings at $x = \pm x_0$ (c plotted against $x_0$ for $1 = c\cosh(x_0/c)$, showing the shallow curve and the deep curve)
The next question is: how large may $x_0$ be and still give a real solution
for Eq. 17.34a?7 Letting $c_1^{-1} = p$, Eq. 17.34a becomes
$$p = \cosh px_0.$$ (17.37)
To find $x_{0\,\max}$ we could solve for $x_0$ (as in Eq. 17.33) and then differentiate with
respect to p. Finally, with an eye on Fig. 17.4, $dx_0/dp$ would be set equal to zero.
Alternatively, direct differentiation of Eq. 17.37 with respect to p yields
$$1 = \sinh px_0\left[x_0 + p\,\frac{dx_0}{dp}\right].$$
The requirement that $dx_0/dp$ vanish leads to
$$1 = x_0\sinh px_0.$$ (17.38)
Equations 17.37 and 17.38 may be combined to form
$$px_0 = \coth px_0$$ (17.39)
with the root
$$px_0 = 1.1997.$$ (17.40)
Substituting into Eqs. 17.37 or 17.38, we obtain
$$p = 1.810, \qquad c_1 = 0.5524$$ (17.41)
and
$$x_{0\,\max} = 0.6627.$$ (17.42)
7 From a numerical point of view it is easier to invert the problem. Pick a
value of $c_1$ and solve for $x_0$. Equation 17.34a becomes $x_0 = c_1\cosh^{-1}(1/c_1)$.
This has numerical solutions in the range $0 < c_1 \le 1$.
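The numbers in Eqs. 17.40 to 17.42 follow from a single root of Eq. 17.39. The sketch below solves the equivalent form $u\tanh u = 1$ with $u = px_0$ by bisection and then applies Eq. 17.37; the helper name `bisect` and the bracketing interval [1, 2] are assumptions made for this illustration.

```python
import math


def bisect(func, a, b, tol=1e-12):
    """Bisection root finder; assumes func(a) and func(b) have opposite signs."""
    fa = func(a)
    while b - a > tol:
        mid = 0.5 * (a + b)
        fm = func(mid)
        if fa * fm <= 0.0:
            b = mid
        else:
            a, fa = mid, fm
    return 0.5 * (a + b)


# Eq. 17.39, u = coth(u), rewritten as u*tanh(u) - 1 = 0 with u = p*x0.
u = bisect(lambda t: t * math.tanh(t) - 1.0, 1.0, 2.0)
p = math.cosh(u)      # Eq. 17.37
c1 = 1.0 / p          # definition p = 1/c1
x0_max = u / p        # largest half-separation with a real catenoid

print(u, p, c1, x0_max)
```

The printed values agree with the rounded figures 1.1997, 1.810, 0.5524, and 0.6627 given above.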
Returning to the question of the solution of Eq. 17.35 that describes the
soap film, let us calculate the area corresponding to each solution. We have
$$A = 4\pi\int_0^{x_0} y\,(1 + y_x^2)^{1/2}\,dx = \frac{4\pi}{c_1}\int_0^{x_0} y^2\,dx \quad \text{(by Eq. 17.30)}$$
$$= \pi c_1^2\left[\sinh\left(\frac{2x_0}{c_1}\right) + \frac{2x_0}{c_1}\right].$$ (17.43)
For $x_0 = \tfrac{1}{2}$, Eq. 17.35 leads to
$$c_1 = 0.2350 \to A = 6.8456,$$
$$c_1 = 0.8483 \to A = 5.9917,$$
showing that the former can at most be only a local minimum. A more detailed
investigation (compare Bliss, Calculus of Variations, Chapter IV) shows that
this surface is not even a local minimum. For $x_0 = \tfrac{1}{2}$ the soap film will be
described by the flat curve
$$y = 0.8483\cosh\left(\frac{x}{0.8483}\right).$$ (17.44)
This flat or shallow catenoid (catenary of revolution) will be an absolute
minimum for $0 < x_0 < 0.528$. However, for $0.528 < x_0 < 0.6627$ its area is
greater than that of the Goldschmidt discontinuous solution (6.2832) and it
is only a relative minimum (Fig. 17.5).
For an excellent discussion of both the mathematical problems and
experiments with soap films, the reader is referred to Courant and Robbins.
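The area formula of Eq. 17.43 and the changeover near $x_0 = 0.528$ can be checked with a short calculation. The sketch below is an illustration only: the function names are assumptions, the constant `U_CRIT` is the rounded root of Eq. 17.39 and is used solely to bracket the shallow root, and the bracket [0.5, 0.6] for the changeover is chosen by hand.

```python
import math

U_CRIT = 1.19968  # rounded root of u = coth(u); used only to bracket the shallow root


def shallow_constant(x0, tol=1e-13):
    """Larger root c1 of 1 = c1*cosh(x0/c1), valid for 0 < x0 < 0.6627."""
    a, b = x0 / U_CRIT, 1.0  # residual is negative at a and positive at b
    while b - a > tol:
        mid = 0.5 * (a + b)
        if mid * math.cosh(x0 / mid) - 1.0 < 0.0:
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)


def catenoid_area(x0, c1):
    """Area of the catenoid between the rings, Eq. 17.43."""
    t = 2.0 * x0 / c1
    return math.pi * c1 * c1 * (math.sinh(t) + t)


def crossover(lo=0.5, hi=0.6, tol=1e-10):
    """Separation x0 at which the shallow-catenoid area equals 2*pi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if catenoid_area(mid, shallow_constant(mid)) < 2.0 * math.pi:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(catenoid_area(0.5, shallow_constant(0.5)))  # flat curve, x0 = 1/2
print(crossover())                                 # where the area reaches 2*pi
```

The first number should land close to the quoted 5.9917, and the second close to 0.528, the separation beyond which the two separate disks have the smaller total area.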
EXERCISES
17.2.1 A soap film is stretched across the space between two rings of unit radius
centered at $\pm x_0$ on the x-axis and perpendicular to the x-axis. Using the solution
developed in Section 17.2, set up the transcendental equations for the condition
that $x_0$ is such that the area of the curved surface of rotation equals the area of
the two rings (Goldschmidt discontinuous solution). Solve for $x_0$ (Fig. 17.6).
FIG. 17.5 Catenoid area (unit radius rings at $x = \pm x_0$): area plotted against $x_0$ for the deep curve, the shallow curve, and the Goldschmidt discontinuous solution
FIG. 17.6 Surface of rotation
17.2.2 In Example 17.2.1 expand $J[y(x,\alpha)] - J[y(x,0)]$ in powers of $\alpha$. The term
linear in $\alpha$ leads to the Euler equation and to the straight-line solution Eq.
17.25. Investigate the $\alpha^2$ term and show that the stationary value of J, the
straight-line distance, is a minimum.
17.2.3 (a) Show that the integral
$$J = \int_{x_1}^{x_2} f(y, y_x, x)\,dx, \quad \text{with } f = y(x),$$
has no extreme values.
(b) If $f(y, y_x, x) = y^2(x)$ find a discontinuous solution similar to the
Goldschmidt solution for the soap film problem.
17.2.4 Fermat's principle of optics states that a light ray will follow the path, y(x),
for which
$$\int_{x_1,y_1}^{x_2,y_2} n(y,x)\,ds$$
is a minimum when n is the index of refraction. For $y_2 = y_1 = 1$, $-x_1 = x_2 = 1$
find the ray path if
(a) $n = e^y$,
(b) $n = a(y - y_0)$, $y > y_0$.
17.2.5 A frictionless particle moves from point A on the surface of the earth to point
B by sliding through a tunnel. Find the differential equation to be satisfied if
the transit time is to be a minimum.
Note. Assume the earth to be a nonrotating sphere of uniform density.
ANS. (Eq. 17.15) $r_{\varphi\varphi}(r^3 - ra^2) + r_\varphi^2(2a^2 - r^2) + a^2r^2 = 0$,
$r(\varphi = 0) = r_0$,
$r_\varphi(\varphi = 0) = 0$,
$r(\varphi = \varphi_A) = a$,
$r(\varphi = \varphi_B) = a$.
(Eq. 17.18) $r_\varphi^2 = \dfrac{a^2r^2}{r_0^2}\cdot\dfrac{r^2 - r_0^2}{a^2 - r^2}$.
The solution of these equations is a hypocycloid, generated by a circle of radius
$\tfrac{1}{2}(a - r_0)$ rolling inside the circle of radius a. The student might like to show
that the transit time is
For details see P. W. Cooper, Am. J. Phys. 34, 68 (1966); G. Venezian et al.,
Am. J. Phys. 34, 701-704 (1966).
17.2.6 A ray of light follows a straight-line path in a first homogeneous medium, is
refracted at an interface, and then follows a new straight-line path in the second
medium. Use Fermat's principle of optics to derive Snell's law of refraction:
$$n_1\sin\theta_1 = n_2\sin\theta_2.$$
Hint. Keep the points $(x_1,y_1)$ and $(x_2,y_2)$ fixed and vary $x_0$ to satisfy Fermat
(Fig. 17.7). This is not an Euler equation problem. (The light path is not
differentiable at $x_0$.)
17.2.7 A second soap film configuration for the unit radius rings at $x = \pm x_0$ consists
of a circular disk, radius a, in the x = 0 plane and two catenoids of revolution,
one joining the disk and each ring. One catenoid may be described by
$$y = c_1\cosh\left(\frac{x}{c_1} + c_3\right).$$
(a) Impose boundary conditions at x = 0 and at $x = x_0$.
(b) Although not necessary, it is convenient to require that the catenoids
form an angle of 120° where they join the central disk. Express this third
boundary condition in mathematical terms.
(c) Show that the total area of catenoids plus central disk is
$$A = \pi c_1^2\left[\sinh\left(\frac{2x_0}{c_1} + 2c_3\right) + \frac{2x_0}{c_1}\right].$$
Note. Although this soap film configuration is physically realizable and stable,
the area is larger than that of the simple catenoid for all ring separations for
which both films exist.
FIG. 17.7
ANS. (a) $1 = c_1\cosh\left(\dfrac{x_0}{c_1} + c_3\right)$,
$a = c_1\cosh c_3$,
(b) $\dfrac{dy}{dx} = \tan 30^\circ = \sinh c_3$.
17.2.8 For the soap film described in Exercise 17.2.7 find (numerically) the maximum
value of $x_0$.
Note. This calls for a hand computer with hyperbolic functions or a table of
hyperbolic cotangents.
ANS. $x_{0\,\max} = 0.4078$.
17.2.9 Find the root of $px_0 = \coth px_0$ (Eq. 17.39) and determine the corresponding
values of p and $x_0$ (Eqs. 17.41 and 17.42). Calculate your values to five significant
figures.
Hint. Try one of the root-determining subroutines listed in Appendix 1.
17.2.10 For the two-ring soap film problem of this section calculate and tabulate $x_0$,
$p$, $p^{-1}$, and A, the soap film area for $px_0 = 0.00(0.02)1.30$.
17.2.11 Find the value of $x_0$ (to five significant figures) that leads to a soap film area,
Eq. 17.43, equal to $2\pi$, the Goldschmidt discontinuous solution.
ANS. $x_0 = 0.52770$.
17.3 GENERALIZATIONS, SEVERAL DEPENDENT
VARIABLES
Our original variational problem, Eq. 17.1, may be generalized in
several respects. In this section we consider the integrand, f, to be a function
of several dependent variables, $y_1(x), y_2(x), y_3(x), \ldots$, all of which depend on x,
the independent variable. In Section 17.4 f again will contain only one unknown
function y, but y will be a function of several independent variables (over which
we integrate). In Section 17.5 these two generalizations are combined. Finally,
in Section 17.7 the stationary value is restricted by one or more constraints.
For more than one dependent variable Eq. 17.1 becomes
$$J = \int_{x_1}^{x_2} f[y_1(x), y_2(x), \ldots, y_n(x),\,y_{1x}(x), y_{2x}(x), \ldots, y_{nx}(x),\,x]\,dx.$$ (17.45)
As in Section 17.1, we determine the extreme value of J by comparing
neighboring paths. Let
$$y_i(x,\alpha) = y_i(x,0) + \alpha\,\eta_i(x), \quad i = 1, 2, \ldots, n,$$ (17.46)
with the $\eta_i$ independent of one another, but subject to the restrictions discussed
in Section 17.1. By differentiating Eq. 17.45 with respect to $\alpha$ and setting $\alpha = 0$,
since Eq. 17.7 still applies, we obtain
$$\int_{x_1}^{x_2}\sum_i\left(\frac{\partial f}{\partial y_i}\,\eta_i + \frac{\partial f}{\partial y_{ix}}\,\eta_{ix}\right)dx = 0,$$ (17.47)
the subscript x denoting differentiation with respect to x; that is, $y_{ix} = dy_i/dx$,
and so on. Again, each of the terms $(\partial f/\partial y_{ix})\,\eta_{ix}$ is integrated by parts. The
integrated part vanishes and Eq. 17.47 becomes
$$\int_{x_1}^{x_2}\sum_i \eta_i\left(\frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial y_{ix}}\right)dx = 0.$$ (17.48)
Since the $\eta_i$ are arbitrary and independent of one another,1 each of the terms in
the sum must vanish independently. We have
$$\frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial(dy_i/dx)} = 0, \quad i = 1, 2, \ldots, n,$$ (17.49)
a whole set of Euler equations, each of which must be satisfied for an extreme
value.
Hamilton's Principle
The most important application of Eq. 17.45 occurs when the integrand f is
taken to be the Lagrangian L. The Lagrangian is defined as the difference of
kinetic and potential energies of a system,
$$L = T - V.$$ (17.50)
Using time as an independent variable instead of x and $x_i(t)$ as the dependent
variables,
$$x \to t, \qquad y_i \to x_i(t),$$
$x_i(t)$ is the location and $\dot{x}_i = dx_i/dt$ the velocity of particle i as a function of
time. The equation $\delta J = 0$ is then a mathematical statement of Hamilton's
principle of classical mechanics,
$$\delta\int_{t_1}^{t_2} L(x_1, x_2, \ldots, x_n, \dot{x}_1, \dot{x}_2, \ldots, \dot{x}_n; t)\,dt = 0.$$ (17.51)
In words, Hamilton's principle asserts that the motion of the system from time
$t_1$ to $t_2$ is such that the time integral of the Lagrangian L has a stationary value.
The resulting Euler equations are usually called the Lagrangian equations of
motion,
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{x}_i} - \frac{\partial L}{\partial x_i} = 0.$$ (17.52)
1 For example, we could set $\eta_2 = \eta_3 = \eta_4 = \cdots = 0$, eliminating all but one term
of the sum, and then treat $\eta_1$ exactly as in Section 17.1.
These Lagrangian equations can be derived from Newton's equations of motion
and Newton's equations can be derived from Lagrange's. The two sets of
equations are equally "fundamental."
The Lagrangian formulation has certain valuable advantages over the
conventional Newtonian laws. Whereas Newton's equations are vector
equations, we see that Lagrange's equations involve only scalar quantities. The
coordinates $x_1, x_2, \ldots$ need not be any standard set of coordinates or lengths.
They can be selected to match the conditions of the physical problem. The
Lagrange equations are invariant with respect to the choice of coordinate
system. Newton's equations (in component form) are not invariant. Exercise
2.5.10 shows what happens to F = ma resolved in spherical polar coordinates.
Exploiting the concept of energy, we may easily extend the Lagrangian
formulation from mechanics to diverse fields such as electrical networks and
acoustical systems. Extensions to electromagnetism appear in the exercises. The
result is a unity of otherwise separate areas of physics. In the development of
new areas the quantization of Lagrangian particle mechanics provided a model
for the quantization of electromagnetic fields and led to the modern theory of
quantum electrodynamics.
One of the most valuable advantages of the Hamilton principle—Lagrange
equation formulation—is the ease in seeing a relation between a symmetry and
a conservation law. As an example, let $x_i = \varphi$, an azimuthal angle. If our
Lagrangian is independent of $\varphi$ (i.e., $\varphi$ is an ignorable coordinate), there are
two consequences: (1) an axial (rotational) symmetry and (2) from Eq. 17.52
$\partial L/\partial\dot{\varphi}$ = constant. Physically, this corresponds to the conservation or
invariance of a component of angular momentum. Similarly, invariance under
translation leads to conservation of linear momentum. Noether's theorem is a
generalization of this invariance (symmetry)—the conservation law relation.
EXAMPLE 17.3.1. Moving Particle—Cartesian Coordinates
Consider Eq. 17.50, which describes one particle with kinetic energy
$$ T = \tfrac{1}{2} m \dot{x}^2 \tag{17.53} $$
and potential energy $V(x)$, in which, as usual, the force is given by the negative
gradient of the potential,
$$ F(x) = -\frac{dV(x)}{dx}. \tag{17.54} $$
From Eq. 17.52
$$ \frac{d}{dt}(m\dot{x}) + \frac{dV}{dx} = m\ddot{x} - F(x) = 0, \tag{17.55} $$
which is simply Newton's second law of motion.
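As a numerical illustration of Hamilton's principle for this example, the following Python sketch evaluates the action $\int L\,dt$ along the Newtonian path and along neighboring paths with the same end points. It assumes a uniform field, $V(x) = mgx$, and a deviation $\eta(t) = \sin(\pi t/T)$. The values of m, g, and T, the function name `action`, and the Simpson-rule integration are illustrative choices that do not come from the text.

```python
import math

def action(alpha, m=1.0, g=9.8, T=1.0, n=2000):
    """Action along x(t) = x_newton(t) + alpha * sin(pi t / T), by Simpson's rule."""
    v0 = 0.5 * g * T  # launch speed for which the Newtonian path returns to x = 0 at t = T

    def lagrangian(t):
        x = v0 * t - 0.5 * g * t * t + alpha * math.sin(math.pi * t / T)
        xdot = v0 - g * t + alpha * (math.pi / T) * math.cos(math.pi * t / T)
        return 0.5 * m * xdot * xdot - m * g * x  # L = T - V with V = m g x

    h = T / n  # n must be even for Simpson's rule
    total = lagrangian(0.0) + lagrangian(T)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * lagrangian(k * h)
    return total * h / 3.0

s0 = action(0.0)
for alpha in (-0.2, -0.1, 0.0, 0.1, 0.2):
    # The deviation vanishes at both end points, so only the path in between changes.
    print(f"alpha = {alpha:+.1f}   S(alpha) - S(0) = {action(alpha) - s0:.6f}")
```

For this linear potential the first-order change in the action vanishes and the remainder grows as $\alpha^2 m\pi^2/4T$, so in this particular case the Newtonian path gives a minimum of the action.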
EXAMPLE 17.3.2. Moving Particle—Circular Cylindrical Coordinates
Now let us describe a moving particle in cylindrical coordinates, confined to the
$z = 0$ plane. The kinetic energy is
$$ T = \tfrac{1}{2} m(\dot{x}^2 + \dot{y}^2) = \tfrac{1}{2} m(\dot{\rho}^2 + \rho^2\dot{\varphi}^2), \tag{17.56} $$
and we take $V = 0$.
The transformation of $\dot{x}^2 + \dot{y}^2$ into circular cylindrical coordinates could be
carried out by taking $x(\rho, \varphi)$ and $y(\rho, \varphi)$, Eq. 2.28, and differentiating with
respect to time and squaring. It is much easier to interpret $\dot{x}^2 + \dot{y}^2$ as $v^2$ and
just write down the components of $\mathbf{v}$ as $\boldsymbol{\rho}_0(ds_\rho/dt) = \boldsymbol{\rho}_0\dot{\rho}$, and so on. (The $ds_\rho$
is an increment of length, $\rho$ changing by $d\rho$, $\varphi$ remaining constant. See Sections
2.1 and 2.4.)
The Lagrangian equations yield
$$ \frac{d}{dt}(m\dot{\rho}) - m\rho\dot{\varphi}^2 = 0, \qquad \frac{d}{dt}(m\rho^2\dot{\varphi}) = 0. \tag{17.57} $$
The second equation is a simple statement of conservation of angular
momentum. The first may be interpreted as radial acceleration¹ equated to centrifugal
force. In this sense the centrifugal force is a real force. It is of some interest
that this interpretation of centrifugal force as a real force is supported by the
general theory of relativity.
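Equations 17.57 can be checked numerically. The Python sketch below integrates them with a fourth-order Runge-Kutta step for a force-free particle that starts at $\rho = 1$, $\varphi = 0$ with unit velocity in the $\varphi$ direction. The initial conditions, step size, and function names are assumptions made for illustration. If the equations are right, the particle should follow the straight line $x = 1$, $y = t$, and $\rho^2\dot{\varphi}$ should stay constant.

```python
import math

def deriv(state):
    rho, phi, rho_dot, phi_dot = state
    # Eq. 17.57 with m cancelled:
    #   rho_ddot = rho * phi_dot**2
    #   d/dt(rho**2 * phi_dot) = 0  ->  phi_ddot = -2 * rho_dot * phi_dot / rho
    return (rho_dot, phi_dot, rho * phi_dot ** 2, -2.0 * rho_dot * phi_dot / rho)

def rk4_step(state, h):
    def shifted(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))
    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, h / 2))
    k3 = deriv(shifted(state, k2, h / 2))
    k4 = deriv(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(t_end=2.0, steps=2000):
    state = (1.0, 0.0, 0.0, 1.0)  # rho, phi, rho_dot, phi_dot at t = 0
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

rho, phi, rho_dot, phi_dot = integrate()
x, y = rho * math.cos(phi), rho * math.sin(phi)
ang_mom = rho ** 2 * phi_dot  # angular momentum per unit mass
print(f"x = {x:.6f}, y = {y:.6f}, rho^2 phi_dot = {ang_mom:.6f}")
```

The $\rho\dot{\varphi}^2$ term is what keeps the Cartesian path straight: without it the radial coordinate would not grow as $\sqrt{1 + t^2}$.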
¹ Here is a second method of attacking Exercise 2.4.8.
EXERCISES
17.3.1 (a) Develop the equations of motion corresponding to $L = \tfrac{1}{2} m(\dot{x}^2 + \dot{y}^2)$.
(b) In what sense do your solutions minimize the integral $\int_{t_1}^{t_2} L\,dt$?
Compare the result for your solution with $x = \text{const.}$, $y = \text{const.}$
17.3.2 From the Lagrangian equations of motion, Eq. 17.52, show that a system in
stable equilibrium has a minimum potential energy.
17.3.3 Write out the Lagrangian equations of motion of a particle in spherical
coordinates for potential $V$ equal to a constant. Identify the terms corresponding to
(a) centrifugal force and (b) Coriolis force.
17.3.4 The spherical pendulum consists of a mass on a wire of length $l$, free to move
in polar angle $\theta$ and azimuth angle $\varphi$ (Fig. 17.8).
(a) Set up the Lagrangian for this physical system.
(b) Develop the Lagrangian equations of motion.
FIG. 17.8 Spherical pendulum
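17.3.5 Show that the Lagrangian
$$ L = m_0 c^2\left(1 - \sqrt{1 - \frac{v^2}{c^2}}\right) - V(\mathbf{r}) $$
leads to a relativistic form of Newton's second law of motion,
$$ \frac{d}{dt}\left(\frac{m_0 v_i}{\sqrt{1 - v^2/c^2}}\right) = F_i, $$
in which $F_i = -\partial V/\partial x_i$.
17.3.6 The Lagrangian for a particle with charge $q$ in an electromagnetic field
described by scalar potential $\varphi$ and vector potential $\mathbf{A}$ is
$$ L = \tfrac{1}{2} m v^2 - q\varphi + q\mathbf{A}\cdot\mathbf{v}. $$
Find the equation of motion of the charged particle.
Hint. $\dfrac{d}{dt}A_j = \dfrac{\partial A_j}{\partial t} + \sum_i \dfrac{\partial A_j}{\partial x_i}\dot{x}_i$. The dependence of the force fields $\mathbf{E}$ and $\mathbf{B}$ upon
the potentials $\varphi$ and $\mathbf{A}$ is developed in Section 1.13 (compare Exercise 1.13.10).
ANS. $m\ddot{x}_i = q[\mathbf{E} + \mathbf{v}\times\mathbf{B}]_i$.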
17.3.7 Consider a system in which the Lagrangian is given by
$$ L(q_i, \dot{q}_i) = T(q_i, \dot{q}_i) - V(q_i), $$
where $q_i$ and $\dot{q}_i$ represent sets of variables. The potential energy $V$ is independent
of velocity and neither $T$ nor $V$ has any explicit time dependence.
(a) Show that
$$ \frac{d}{dt}\left(\sum_j \dot{q}_j\frac{\partial L}{\partial\dot{q}_j} - L\right) = 0. $$
(b) The constant quantity
$$ \sum_j \dot{q}_j\frac{\partial L}{\partial\dot{q}_j} - L $$
defines the Hamiltonian $H$. Show that under the preceding assumed
conditions, $H = T + V$, the total energy.
Note. The kinetic energy $T$ is a quadratic function of the $\dot{q}_i$'s.
17.4 SEVERAL INDEPENDENT VARIABLES
Sometimes the integrand $f$ of Eq. 17.1 will contain one unknown function $u$
which is a function of several independent variables, $u = u(x, y, z)$, for the
three-dimensional case. Equation 17.1 becomes
$$ J = \iiint f[u, u_x, u_y, u_z, x, y, z]\,dx\,dy\,dz, \tag{17.58} $$
$u_x$ indicating $\partial u/\partial x$, and so on. The variational problem is to find the function
$u(x, y, z)$ for which $J$ is stationary,
$$ \delta J = \alpha\left.\frac{\partial J}{\partial\alpha}\right|_{\alpha = 0} = 0. \tag{17.59} $$
Generalizing Section 17.1, we let
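$$ u(x, y, z, \alpha) = u(x, y, z, 0) + \alpha\eta(x, y, z). \tag{17.60} $$
$u(x, y, z, \alpha = 0)$ represents the (unknown) function for which Eq. 17.59 is satisfied,
whereas again $\eta(x, y, z)$ is the arbitrary deviation that describes the varied
function $u(x, y, z, \alpha)$. This deviation, $\eta(x, y, z)$, is required to be differentiable and
to vanish at the end points. Then from Eq. 17.60,
$$ u_x(x, y, z, \alpha) = u_x(x, y, z, 0) + \alpha\eta_x, \tag{17.61} $$
and similarly for $u_y$ and $u_z$.
Differentiating the integral (Eq. 17.58) with respect to the parameter $\alpha$ and
then setting $\alpha = 0$, we obtain
$$ \left.\frac{\partial J}{\partial\alpha}\right|_{\alpha = 0} = \iiint\left(\frac{\partial f}{\partial u}\eta + \frac{\partial f}{\partial u_x}\eta_x + \frac{\partial f}{\partial u_y}\eta_y + \frac{\partial f}{\partial u_z}\eta_z\right)dx\,dy\,dz = 0. \tag{17.62} $$
Again, we integrate each of the terms $(\partial f/\partial u_i)\eta_i$ by parts. The integrated part
vanishes at the end points (because the deviation $\eta$ is required to go to zero at
the end points) and¹
$$ \iiint\left(\frac{\partial f}{\partial u} - \frac{\partial}{\partial x}\frac{\partial f}{\partial u_x} - \frac{\partial}{\partial y}\frac{\partial f}{\partial u_y} - \frac{\partial}{\partial z}\frac{\partial f}{\partial u_z}\right)\eta(x, y, z)\,dx\,dy\,dz = 0. \tag{17.63} $$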
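¹ Again, it is imperative that the precise meaning of partial derivatives be
understood fully. Specifically, in Eq. 17.63 $\partial/\partial x$ is a partial derivative, in that
$y$ and $z$ are constant. But $\partial/\partial x$ is also a total derivative in that it acts on implicit
$x$-dependence as well as on explicit $x$-dependence. In this sense
$$ \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial u_x}\right) = \frac{\partial^2 f}{\partial x\,\partial u_x} + \frac{\partial^2 f}{\partial u\,\partial u_x}u_x + \frac{\partial^2 f}{\partial u_x^2}u_{xx} + \frac{\partial^2 f}{\partial u_y\,\partial u_x}u_{xy} + \frac{\partial^2 f}{\partial u_z\,\partial u_x}u_{xz}. $$
Since the variation $\eta(x, y, z)$ is arbitrary, the term in large parentheses may be set
equal to zero. This yields the Euler equation for (three) independent variables,
$$ \frac{\partial f}{\partial u} - \frac{\partial}{\partial x}\frac{\partial f}{\partial u_x} - \frac{\partial}{\partial y}\frac{\partial f}{\partial u_y} - \frac{\partial}{\partial z}\frac{\partial f}{\partial u_z} = 0. \tag{17.64} $$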
EXAMPLE 17.4.1 Laplace's Equation
An example of this sort of variational problem is provided by electrostatics.
The energy of an electrostatic field is
$$ \text{energy density} = \tfrac{1}{2}\varepsilon E^2, \tag{17.65} $$
in which $\mathbf{E}$ is the usual electrostatic force field. In terms of the static potential $\varphi$,
$$ \text{energy density} = \tfrac{1}{2}\varepsilon(\nabla\varphi)^2. \tag{17.66} $$
Now let us impose the requirement that the electrostatic energy (associated
with the field) in a given volume be a minimum. (Boundary conditions on $\mathbf{E}$
and $\varphi$ must still be satisfied.) We have the volume integral²
$$ J = \iiint(\nabla\varphi)^2\,dx\,dy\,dz = \iiint(\varphi_x^2 + \varphi_y^2 + \varphi_z^2)\,dx\,dy\,dz. \tag{17.67} $$
With
$$ f(\varphi, \varphi_x, \varphi_y, \varphi_z, x, y, z) = \varphi_x^2 + \varphi_y^2 + \varphi_z^2, \tag{17.68} $$
the function $\varphi$ replacing the $u$ of Eq. 17.64, Euler's equation (Eq. 17.64) yields
$$ -2(\varphi_{xx} + \varphi_{yy} + \varphi_{zz}) = 0 \tag{17.69} $$
or
$$ \nabla^2\varphi(x, y, z) = 0, \tag{17.70} $$
which is just Laplace's equation of electrostatics.
Closer investigation shows that this stationary value is indeed a minimum.
Thus the demand that the field energy be minimized leads to Laplace's equation.
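A discrete analogue makes the link between minimum field energy and Laplace's equation concrete. In the Python sketch below the potential lives on a square grid with fixed boundary values, the field energy is replaced by a sum of squared nearest-neighbor differences, and each interior value is repeatedly replaced by the mean of its four neighbors. The grid size, the number of sweeps, the boundary data (taken from the harmonic function $x^2 - y^2$), and the function names are all assumptions chosen for illustration.

```python
def dirichlet_energy(phi):
    """Sum of squared nearest-neighbour differences: a discrete stand-in for the
    integral of (grad phi)^2 over the square."""
    n = len(phi)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                total += (phi[i + 1][j] - phi[i][j]) ** 2
            if j + 1 < n:
                total += (phi[i][j + 1] - phi[i][j]) ** 2
    return total

def relax(n=11, sweeps=400):
    h = 1.0 / (n - 1)
    # Boundary data taken from the harmonic function x^2 - y^2 (an assumed test case).
    exact = [[(i * h) ** 2 - (j * h) ** 2 for j in range(n)] for i in range(n)]

    def on_edge(i, j):
        return i in (0, n - 1) or j in (0, n - 1)

    phi = [[exact[i][j] if on_edge(i, j) else 0.0 for j in range(n)] for i in range(n)]
    energies = [dirichlet_energy(phi)]
    for _ in range(sweeps):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Replacing a value by the mean of its neighbours minimizes the
                # energy with respect to that one value (discrete Laplace equation).
                phi[i][j] = 0.25 * (phi[i + 1][j] + phi[i - 1][j]
                                    + phi[i][j + 1] + phi[i][j - 1])
        energies.append(dirichlet_energy(phi))
    max_err = max(abs(phi[i][j] - exact[i][j]) for i in range(n) for j in range(n))
    return energies, max_err

energies, max_err = relax()
print(f"initial energy = {energies[0]:.6f}")
print(f"final energy   = {energies[-1]:.6f}")
print(f"largest deviation from x^2 - y^2 = {max_err:.2e}")
```

Each sweep can only lower the discrete energy, and the values it settles on satisfy the five-point form of Laplace's equation, which $x^2 - y^2$ obeys exactly on the grid.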
² Remember that the subscript $x$ indicates the $x$-partial derivative, not an
$x$-component.
EXERCISES
17.4.1 The Lagrangian for a vibrating string (small amplitude vibrations) is
$$ L = \int\left(\tfrac{1}{2}\rho u_t^2 - \tfrac{1}{2}\tau u_x^2\right)dx, $$
where $\rho$ is the (constant) linear mass density and $\tau$ is the (constant) tension. The
$x$-integration is over the length of the string. Show that application of Hamilton's
principle to the Lagrangian density (the integrand), now with two independent
variables, leads to the classical wave equation
$$ \frac{\partial^2 u}{\partial x^2} = \frac{\rho}{\tau}\frac{\partial^2 u}{\partial t^2}. $$
17.4.2 Show that the stationary value of the total energy of the electrostatic field of
Example 17.4.1 is a minimum.
Hint. Use Eq. 17.61 and investigate the $\alpha^2$ terms.
17.5 MORE THAN ONE DEPENDENT, MORE THAN
ONE INDEPENDENT VARIABLE
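In some cases our integrand $f$ contains more than one dependent variable
and more than one independent variable. Consider
$$ f = f[p(x, y, z), p_x, p_y, p_z, q(x, y, z), q_x, q_y, q_z, r(x, y, z), r_x, r_y, r_z, x, y, z]. \tag{17.71} $$
We proceed as before with
$$ \begin{aligned} p(x, y, z, \alpha) &= p(x, y, z, 0) + \alpha\xi(x, y, z), \\ q(x, y, z, \alpha) &= q(x, y, z, 0) + \alpha\eta(x, y, z), \\ r(x, y, z, \alpha) &= r(x, y, z, 0) + \alpha\zeta(x, y, z), \text{ and so on.} \end{aligned} \tag{17.72} $$
Keeping in mind that $\xi$, $\eta$, and $\zeta$ are independent of one another, as were the
$\eta_i$ in Section 17.3, the same differentiation and then integration by parts leads to
$$ \frac{\partial f}{\partial p} - \frac{\partial}{\partial x}\frac{\partial f}{\partial p_x} - \frac{\partial}{\partial y}\frac{\partial f}{\partial p_y} - \frac{\partial}{\partial z}\frac{\partial f}{\partial p_z} = 0, \tag{17.73} $$
with similar equations for functions $q$ and $r$. Replacing $p, q, r, \ldots$ with $y_i$ and
$x, y, z, \ldots$ with $x_j$, we can put Eq. 17.73 in a more compact form:
$$ \frac{\partial f}{\partial y_i} - \sum_j \frac{\partial}{\partial x_j}\left(\frac{\partial f}{\partial y_{ij}}\right) = 0, \qquad i = 1, 2, \ldots, \tag{17.73a} $$
in which
$$ y_{ij} \equiv \frac{\partial y_i}{\partial x_j}. $$
An application of Eq. 17.73 appears in Section 17.7.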
Relation to Physics
The calculus of variations as developed so far provides a convenient and
perhaps elegant description of a wide variety of physical phenomena. The
physics includes ordinary mechanics, Section 17.3; relativistic mechanics,
Exercise 17.3.5; electrostatics, Example 17.4.1; and electromagnetic theory in
Exercise 17.5.1. The convenience and elegance should not be minimized, but at
the same time the student should be aware that in these cases the calculus of
variations has only provided an alternate description of what was already
known. It has not provided any new physics.
The situation does change with the challenging and incomplete theories
of modern particle and field physics. Here the basic physics is not yet known
and a postulated variational principle can be a useful starting point.
EXERCISE
17.5.1 The Lagrangian (per unit volume) of an electromagnetic field with a charge
density $\rho$ is given by
$$ \mathcal{L} = \frac{1}{2}\left(\varepsilon_0 E^2 - \frac{1}{\mu_0}B^2\right) - \rho\varphi + \rho\mathbf{v}\cdot\mathbf{A}. $$
Show that Lagrange's equations lead to two of Maxwell's equations. (The
remaining two are a consequence of the definition of $\mathbf{E}$ and $\mathbf{B}$ in terms of $\mathbf{A}$ and $\varphi$.)
This Lagrangian density comes from a scalar expression in Section 3.7.
Hint. Take $A_1$, $A_2$, $A_3$, and $\varphi$ as dependent variables, $x$, $y$, $z$, and $t$ as independent
variables. $\mathbf{E}$ and $\mathbf{B}$ are given in terms of $\mathbf{A}$ and $\varphi$ by Eq. 3.104.
17.6 LAGRANGIAN MULTIPLIERS
In this section the concept of a constraint is introduced. To simplify the
treatment, the constraint appears as a simple function rather than as an integral.
In this section we are not concerned with the calculus of variations, but in
Section 17.7 the constraints, with our newly developed Lagrangian multipliers,
are incorporated into the calculus of variations.
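Consider a function of three independent variables, $f(x, y, z)$. For the
function $f$ to be a maximum (or extreme),¹
$$ df = 0. \tag{17.74} $$
The necessary and sufficient condition for this is
$$ \frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = \frac{\partial f}{\partial z} = 0, \tag{17.75} $$
in which
$$ df = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy + \frac{\partial f}{\partial z}dz. \tag{17.76} $$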
Often in physical problems, the variables x, y, z are subjected to constraints
so that they are no longer all independent. It is possible, at least in principle,
to use each constraint to eliminate one variable and to proceed with a new and
smaller set of independent variables.
¹ Including a four-dimensional saddle point.
The use of Lagrangian multipliers is an alternate technique that may be
applied when this elimination of variables is inconvenient or undesirable. Let
our equation of constraint be
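$$ \varphi(x, y, z) = 0, \tag{17.77} $$
from which
$$ d\varphi = \frac{\partial\varphi}{\partial x}dx + \frac{\partial\varphi}{\partial y}dy + \frac{\partial\varphi}{\partial z}dz = 0. \tag{17.78} $$
Returning to Eq. 17.74, we see that Eq. 17.75 no longer follows because there
are now only two independent variables. If we take $x$ and $y$ as these independent
variables, $dz$ is no longer arbitrary. However, we may add Eq. 17.76 and a
multiple of Eq. 17.78 to obtain
$$ df + \lambda\,d\varphi = \left(\frac{\partial f}{\partial x} + \lambda\frac{\partial\varphi}{\partial x}\right)dx + \left(\frac{\partial f}{\partial y} + \lambda\frac{\partial\varphi}{\partial y}\right)dy + \left(\frac{\partial f}{\partial z} + \lambda\frac{\partial\varphi}{\partial z}\right)dz = 0. \tag{17.79} $$
Our Lagrangian multiplier $\lambda$ is chosen so that
$$ \frac{\partial f}{\partial z} + \lambda\frac{\partial\varphi}{\partial z} = 0, \tag{17.80} $$
assuming that $\partial\varphi/\partial z \neq 0$. Equation 17.79 now becomes
$$ \left(\frac{\partial f}{\partial x} + \lambda\frac{\partial\varphi}{\partial x}\right)dx + \left(\frac{\partial f}{\partial y} + \lambda\frac{\partial\varphi}{\partial y}\right)dy = 0. \tag{17.81} $$
However, we took $dx$ and $dy$ to be arbitrary and the quantities in parentheses
must vanish,
$$ \frac{\partial f}{\partial x} + \lambda\frac{\partial\varphi}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} + \lambda\frac{\partial\varphi}{\partial y} = 0. \tag{17.82} $$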
When Eqs. 17.80 and 17.82 are satisfied, $df = 0$ and $f$ is an extremum. Notice
that there are now four unknowns: $x$, $y$, $z$, and $\lambda$. The fourth equation is, of course,
the constraint (17.77). Actually we want only $x$, $y$, and $z$; $\lambda$ need not be
determined. For this reason $\lambda$ is sometimes called Lagrange's undetermined
multiplier. This method will fail if all the coefficients of $\lambda$ vanish at the extremum,
$\partial\varphi/\partial x = \partial\varphi/\partial y = \partial\varphi/\partial z = 0$. It is then impossible to solve for $\lambda$.
The reader might note that from the form of Eqs. 17.80 and 17.82, we could
identify $f$ as the function taking an extreme value subject to $\varphi$, the constraint,
or identify $f$ as the constraint and $\varphi$ as the function.
If we have a set of constraints $\varphi_k$, then Eqs. 17.80 and 17.82 become
$$ \frac{\partial f}{\partial x_i} + \sum_k \lambda_k\frac{\partial\varphi_k}{\partial x_i} = 0, $$
with a separate Lagrange multiplier $\lambda_k$ for each $\varphi_k$.
EXAMPLE 17.6.1 Particle in a Box
As an example of the use of Lagrangian multipliers, consider the quantum
mechanical problem of a particle (mass $m$) in a box. The box is a rectangular
parallelepiped with sides $a$, $b$, and $c$. The ground state energy of the particle is
given by
$$ E = \frac{h^2}{8m}\left(\frac{1}{a^2} + \frac{1}{b^2} + \frac{1}{c^2}\right). $$
We seek the shape of the box that will minimize the energy $E$, subject to
the constraint that the volume is constant,
$$ V(a, b, c) = abc = k. \tag{17.84} $$
With $f(a, b, c) = E(a, b, c)$ and $\varphi(a, b, c) = abc - k = 0$, we obtain
$$ \frac{\partial E}{\partial a} + \lambda\frac{\partial\varphi}{\partial a} = -\frac{h^2}{4ma^3} + \lambda bc = 0. \tag{17.85} $$
Also,
$$ -\frac{h^2}{4mb^3} + \lambda ac = 0, \qquad -\frac{h^2}{4mc^3} + \lambda ab = 0. $$
Multiplying the first of these expressions by $a$, the second by $b$, and the third
by $c$, we have
$$ \lambda abc = \frac{h^2}{4ma^2} = \frac{h^2}{4mb^2} = \frac{h^2}{4mc^2}. \tag{17.86} $$
Therefore our solution is
$$ a = b = c, \quad \text{a cube}. \tag{17.87} $$
Notice that $\lambda$ has not been determined. It remains an undetermined multiplier.
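The cube can also be found by brute force, which gives a quick check on the multiplier calculation. The Python sketch below scans $a$ and $b$ on a grid, fixes $c$ from $abc = k$, and keeps the box of lowest ground state energy. The units ($h = m = k = 1$), the grid limits, and the function names are assumptions made for illustration.

```python
H_PLANCK = 1.0  # units with h = 1 (illustrative assumption)
MASS = 1.0      # units with m = 1 (illustrative assumption)

def ground_energy(a, b, c):
    """E = (h^2 / 8m)(1/a^2 + 1/b^2 + 1/c^2) for a box with sides a, b, c."""
    return H_PLANCK ** 2 / (8.0 * MASS) * (1.0 / a ** 2 + 1.0 / b ** 2 + 1.0 / c ** 2)

def best_box(volume=1.0, lo=0.5, hi=2.0, steps=150):
    """Scan a and b on a grid, fix c from abc = volume, and keep the lowest energy."""
    best = None
    for i in range(steps + 1):
        a = lo + (hi - lo) * i / steps
        for j in range(steps + 1):
            b = lo + (hi - lo) * j / steps
            c = volume / (a * b)
            e = ground_energy(a, b, c)
            if best is None or e < best[0]:
                best = (e, a, b, c)
    return best

energy, a, b, c = best_box()
print(f"a = {a:.3f}, b = {b:.3f}, c = {c:.3f}, E = {energy:.5f}")

# The multiplier is not needed for the shape, but Eq. 17.86 gives it once a is known:
# lambda * abc = h^2 / (4 m a^2).
lam = H_PLANCK ** 2 / (4.0 * MASS * a ** 2 * a * b * c)
print(f"lambda = {lam:.5f}")
```

The scan needs no multiplier at all because the constraint was used to eliminate $c$. This is the elimination route mentioned at the start of the section, and it is practical here only because the constraint is easy to solve.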
EXAMPLE 17.6.2 Cylindrical Nuclear Reactor
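A further example is provided by nuclear reactor theory. Suppose a
(thermal) nuclear reactor is to have the shape of a right circular cylinder of
radius $R$ and height $H$. Neutron diffusion theory supplies a constraint:²
$$ \varphi(R, H) = \left(\frac{2.4048}{R}\right)^2 + \left(\frac{\pi}{H}\right)^2 = \text{constant}. \tag{17.88} $$
² 2.4048... is the lowest root of the Bessel function $J_0(R)$ (compare Section 11.1).
We wish to minimize the volume of the reactor
$$ f(R, H) = \pi R^2 H. \tag{17.89} $$
Application of Eq. 17.82 leads to
$$ \frac{\partial f}{\partial R} + \lambda\frac{\partial\varphi}{\partial R} = 2\pi RH - 2\lambda\frac{(2.4048)^2}{R^3} = 0, \qquad \frac{\partial f}{\partial H} + \lambda\frac{\partial\varphi}{\partial H} = \pi R^2 - 2\lambda\frac{\pi^2}{H^3} = 0. \tag{17.90} $$
By multiplying the first of these equations by $R/2$ and the second by $H$, we obtain
$$ \pi R^2 H = \lambda\frac{(2.4048)^2}{R^2} = 2\lambda\frac{\pi^2}{H^2} \tag{17.91} $$
or
$$ H = \frac{\sqrt{2}\,\pi}{2.4048}R = 1.847R $$
for the minimum volume right-circular cylindrical reactor.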
Strictly speaking, we have found only an extremum. Its identification as a
minimum follows from a consideration of the original equations.
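The claim that the extremum is a minimum can be checked numerically. The Python sketch below uses the constraint to express $H$ as a function of $R$, then minimizes the volume over $R$ with a golden-section search. The value chosen for the constant in Eq. 17.88 (here 1), the search interval, and the function names are assumptions made for illustration.

```python
import math

J0_ROOT = 2.4048  # lowest root of the Bessel function J0, as quoted in the text

def height(radius, b2):
    """Solve the constraint (J0_ROOT/R)^2 + (pi/H)^2 = b2 for H."""
    return math.pi / math.sqrt(b2 - (J0_ROOT / radius) ** 2)

def volume(radius, b2):
    return math.pi * radius ** 2 * height(radius, b2)

def golden_section_min(f, lo, hi, tol=1e-9):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    x1 = hi - inv_phi * (hi - lo)
    x2 = lo + inv_phi * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:          # minimum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - inv_phi * (hi - lo)
            f1 = f(x1)
        else:                # minimum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + inv_phi * (hi - lo)
            f2 = f(x2)
    return 0.5 * (lo + hi)

b2 = 1.0                           # assumed value of the constant in Eq. 17.88
r_min = J0_ROOT / math.sqrt(b2)    # below this radius the constraint has no solution
r_opt = golden_section_min(lambda r: volume(r, b2), 1.0001 * r_min, 10.0 * r_min)
ratio = height(r_opt, b2) / r_opt
print(f"R = {r_opt:.4f}, H = {height(r_opt, b2):.4f}, H/R = {ratio:.4f}")
```

The volume diverges both as $R$ approaches its smallest allowed value and as $R$ grows without bound, so the single stationary point found by the multiplier method must be the minimum.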
EXERCISES
The following problems are to be solved by using Lagrangian multipliers.
17.6.1 The ground state energy of a particle in a pillbox (right-circular cylinder) is
given by
$$ E = \frac{\hbar^2}{2m}\left[\left(\frac{2.4048}{R}\right)^2 + \left(\frac{\pi}{H}\right)^2\right], $$
in which $R$ is the radius and $H$, the height of the pillbox. Find the ratio of $R$
to $H$ that will minimize the energy for a fixed volume.
17.6.2 Find the ratio of $R$ (radius) to $H$ (height) that will minimize the total surface
area of a right-circular cylinder of fixed volume.
17.6.3 The U.S. Post Office limits first class mail to Canada to a total of 36 inches,
length plus girth. Using a Lagrange multiplier, find the maximum volume and
the dimensions of a (rectangular parallelepiped) package subject to this
constraint.
17.6.4 A thermal nuclear reactor is subject to the constraint
$$ \varphi(a, b, c) = \left(\frac{\pi}{a}\right)^2 + \left(\frac{\pi}{b}\right)^2 + \left(\frac{\pi}{c}\right)^2 = B^2, \quad \text{a constant}. $$
Find the ratios of the sides of the rectangular parallelepiped reactor of minimum
volume.
ANS. $a = b = c$, cube.
17.6.5 For a simple lens of focal length $f$ the object distance $p$ and the image distance
$q$ are related by $1/p + 1/q = 1/f$.
Find the minimum object-image distance $(p + q)$ for fixed $f$. Assume real object
and image ($p$ and $q$ both positive).
17.6.6 You have an ellipse $(x/a)^2 + (y/b)^2 = 1$. Find the inscribed rectangle of maximum
area. Show that the ratio of the area of the maximum area rectangle to the area
of the ellipse is $2/\pi = 0.6366$.
17.6.7 A rectangular parallelepiped is inscribed in an ellipsoid of semiaxes $a$, $b$, and $c$.
Maximize the volume of the inscribed rectangular parallelepiped. Show that
the ratio of the maximum volume to the volume of the ellipsoid is $2/(\pi\sqrt{3}) \approx 0.367$.
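17.6.8 A deformed sphere has a radius given by $r = r_0\{\alpha_0 + \alpha_2 P_2(\cos\theta)\}$, where $\alpha_0 \approx 1$
and $\alpha_2 \approx 0$. From Exercise 12.5.14 the area and volume are
$$ A = 4\pi r_0^2\left\{\alpha_0^2 + \tfrac{4}{5}\alpha_2^2\right\}, \qquad V = \frac{4\pi r_0^3}{3}\left\{\alpha_0^3 + \tfrac{3}{5}\alpha_0\alpha_2^2\right\}. $$
Terms of order $\alpha_2^3$ have been neglected.
(a) With the constraint that the enclosed volume be held constant, that is,
$V = 4\pi r_0^3/3$, show that the bounding surface of minimum area is a sphere
($\alpha_0 = 1$, $\alpha_2 = 0$).
(b) With the constraint that the area of the bounding surface be held constant,
that is, $A = 4\pi r_0^2$, show that the enclosed volume is a maximum when the
surface is a sphere.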
17.6.9 Find the maximum value of the directional derivative of $\varphi(x, y, z)$,
$$ \frac{d\varphi}{ds} = \frac{\partial\varphi}{\partial x}\cos\alpha + \frac{\partial\varphi}{\partial y}\cos\beta + \frac{\partial\varphi}{\partial z}\cos\gamma, $$
subject to the constraint
$$ \cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1. $$
Note concerning the following exercises:
In a quantum-mechanical system there are $g_i$ distinct quantum states
between energy $E_i$ and $E_i + dE_i$. The problem is to describe how $n_i$ particles are
distributed among these states subject to two constraints:
(a) Fixed number of particles:
$$ \sum_i n_i = n. $$
(b) Fixed total energy:
$$ \sum_i n_i E_i = E. $$
17.6.10 For identical particles obeying the Pauli exclusion principle the probability of
a given arrangement is
$$ W_{FD} = \prod_i \frac{g_i!}{n_i!\,(g_i - n_i)!}. $$
Show that maximizing $W_{FD}$ subject to a fixed number of particles and fixed
total energy leads to
$$ n_i = \frac{g_i}{e^{\lambda_1 + \lambda_2 E_i} + 1}. $$
With $\lambda_1 = -E_0/kT$ and $\lambda_2 = 1/kT$, this yields Fermi-Dirac statistics.
Hint. Try working with $\ln W$ and using Stirling's formula, Section 10.3. The
justification for differentiation with respect to $n_i$ is that we are dealing here with
a large number of particles, $\Delta n_i/n_i \ll 1$.
17.6.11 For identical particles but no restriction on the number in a given state the
probability of a given arrangement is
$$ W_{BE} = \prod_i \frac{(n_i + g_i - 1)!}{n_i!\,(g_i - 1)!}. $$
Show that maximizing $W_{BE}$, subject to a fixed number of particles and fixed
total energy, leads to
$$ n_i = \frac{g_i}{e^{\lambda_1 + \lambda_2 E_i} - 1}. $$
With $\lambda_2 = 1/kT$, this yields Bose-Einstein statistics.
Note. Assume that $g_i \gg 1$.
17.6.12 Photons satisfy $W_{BE}$ and the constraint that total energy is constant. They
clearly do not satisfy the fixed number constraint. Show that eliminating the
fixed number constraint leads to the foregoing result but with $\lambda_1 = 0$.
17.7 VARIATION SUBJECT TO CONSTRAINTS
As in the preceding sections, we seek the path that will make the integral
$$ J = \int f\left(y_i, \frac{\partial y_i}{\partial x_j}, x_j\right)dx_j \tag{17.93} $$
stationary. This is the general case in which $x_j$ represents a set of independent
variables and $y_i$, a set of dependent variables. Again,
$$ \delta J = 0. \tag{17.94} $$
Now, however, we introduce one or more constraints. This means that the
$y_i$'s are no longer independent of each other. Not all the $\eta_i$'s may be varied
arbitrarily and Eqs. 17.62 or 17.73a would not apply. The constraint may have
the form
$$ \varphi_k(y_i, x_j) = 0, \tag{17.95} $$
as in Section 17.6. In this case we may multiply by a function of $x_j$, say, $\lambda_k(x_j)$,
and integrate over the same range as in Eq. 17.93 to obtain
$$ \int \lambda_k(x_j)\varphi_k(y_i, x_j)\,dx_j = 0. \tag{17.96} $$
Then clearly
$$ \delta\int \lambda_k(x_j)\varphi_k(y_i, x_j)\,dx_j = 0. \tag{17.97} $$
Alternatively, the constraint may appear in the form of an integral
$$ \int \varphi_k\left(y_i, \frac{\partial y_i}{\partial x_j}, x_j\right)dx_j = \text{constant}. \tag{17.98} $$
We may introduce any constant Lagrangian multiplier and again Eq. 17.97
follows, now with $\lambda$ a constant.
In either case, by adding Eqs. 17.94 and 17.97, possibly with more than one
constraint, we obtain
$$ \delta\int\left[f\left(y_i, \frac{\partial y_i}{\partial x_j}, x_j\right) + \sum_k \lambda_k\varphi_k\right]dx_j = 0. \tag{17.99} $$
The Lagrangian multiplier $\lambda_k$ may depend on $x_j$ when $\varphi_k(y_i, x_j)$ is given in the
form of Eq. 17.95.
Treating the entire integrand as a new function
$g(y_i, \partial y_i/\partial x_j, x_j)$,
we obtain
$$ g = f + \sum_k \lambda_k\varphi_k. \tag{17.100} $$
If we have $N$ $y_i$'s ($i = 1, 2, \ldots, N$) and $m$ constraints ($k = 1, 2, \ldots, m$), $N - m$
of the $\eta_i$'s may be taken as arbitrary. For the remaining $m$ $\eta_i$'s, the $\lambda$'s may, in
principle, be chosen so that the remaining Euler-Lagrange equations are
satisfied, completely analogous to Eq. 17.80. The result is that our composite
function $g$ must satisfy the usual Euler-Lagrange equations
$$ \frac{\partial g}{\partial y_i} - \sum_j \frac{\partial}{\partial x_j}\frac{\partial g}{\partial(\partial y_i/\partial x_j)} = 0, \tag{17.101} $$
with one such equation for each dependent variable $y_i$ (compare Eqs. 17.64
and 17.73). These Euler equations and the equations of constraint are then solved
simultaneously to find the function yielding a stationary value.
Lagrangian Equations
In the absence of constraints Lagrange's equations of motion (Eq. 17.52)
were found to be¹
$$ \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_i} - \frac{\partial L}{\partial q_i} = 0, $$
with $t$ (time) the one independent variable and $q_i(t)$ (particle position) a set of
dependent variables. Usually the generalized coordinates $q_i$ are chosen to
eliminate the forces of constraint, but this is not necessary and not always
desirable. In the presence of constraints $\varphi_k$ Hamilton's principle is
$$ \delta\int\left[L(q_i, \dot{q}_i, t) + \sum_k \lambda_k(t)\varphi_k(q_i, t)\right]dt = 0, \tag{17.102} $$
and the constrained Lagrangian equations of motion are
$$ \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_i} - \frac{\partial L}{\partial q_i} = \sum_k a_{ik}\lambda_k. \tag{17.103} $$
Usually $\varphi_k = \varphi_k(q_i, t)$, independent of the generalized velocities $\dot{q}_i$. In this case
the coefficient $a_{ik}$ is given by
$$ a_{ik} = \frac{\partial\varphi_k}{\partial q_i}. \tag{17.104} $$
If $q_i$ is a length, then $a_{ik}\lambda_k$ (no summation) represents the force of the $k$th
constraint in the $q_i$-direction, appearing in Eq. 17.103 in exactly the same way as
$-\partial V/\partial q_i$.
¹ The symbol $q$ is customary in advanced mechanics. It serves to emphasize
that the variable is not necessarily a cartesian variable (and not necessarily
a length).
EXAMPLE 17.7.1 Simple Pendulum
To illustrate, consider the simple pendulum, a mass $m$, constrained by a
wire of length $l$ to swing in an arc (Fig. 17.9). In the absence of the one constraint
$$ \varphi_1 = r - l = 0 \tag{17.105} $$
there are two generalized coordinates $r$ and $\theta$ (motion in vertical plane). The
Lagrangian is
$$ L = T - V = \tfrac{1}{2}m(\dot{r}^2 + r^2\dot{\theta}^2) + mgr\cos\theta, \tag{17.106} $$
taking the potential $V$ to be zero when the pendulum is horizontal, $\theta = \pi/2$.
FIG. 17.9 Simple pendulum
By Eq. 17.103 the equations of motion are
$$ \frac{d}{dt}\frac{\partial L}{\partial\dot{r}} - \frac{\partial L}{\partial r} = \lambda_1, \qquad \frac{d}{dt}\frac{\partial L}{\partial\dot{\theta}} - \frac{\partial L}{\partial\theta} = 0 \qquad (a_{r1} = 1,\ a_{\theta 1} = 0), \tag{17.107} $$
or
$$ \frac{d}{dt}(m\dot{r}) - mr\dot{\theta}^2 - mg\cos\theta = \lambda_1, \qquad \frac{d}{dt}(mr^2\dot{\theta}) + mgr\sin\theta = 0. \tag{17.108} $$
Substituting in the equation of constraint ($r = l$, $\dot{r} = 0$), we have
$$ ml\dot{\theta}^2 + mg\cos\theta = -\lambda_1, \qquad ml^2\ddot{\theta} + mgl\sin\theta = 0. \tag{17.109} $$
The second equation may be solved for $\theta(t)$ to yield simple harmonic motion
if the amplitude is small ($\sin\theta \approx \theta$), whereas the first equation expresses $-\lambda_1$,
the tension in the wire, in terms of $\theta$ and $\dot{\theta}$.
Note that since the equation of constraint, Eq. 17.105, is in the form of
Eq. 17.95, the Lagrange multiplier $\lambda$ may be (and here is) a function of $t$ (or of $\theta$).
EXAMPLE 17.7.2 Sliding Off a Log
Closely related to this is the problem of a particle sliding on a cylindrical
surface. The object is to find the critical angle $\theta_c$ at which the particle flies off
from the surface. This critical angle is the angle at which the radial force of
constraint goes to zero (Fig. 17.10).
FIG. 17.10 A particle sliding on a cylindrical surface
We have
$$ L = T - V = \tfrac{1}{2}m(\dot{r}^2 + r^2\dot{\theta}^2) - mgr\cos\theta \tag{17.110} $$
and the one equation of constraint
$$ \varphi_1 = r - l = 0. \tag{17.111} $$
Proceeding as in Example 17.7.1 with $a_{r1} = 1$,
$$ m\ddot{r} - mr\dot{\theta}^2 + mg\cos\theta = \lambda_1(\theta), \qquad mr^2\ddot{\theta} + 2mr\dot{r}\dot{\theta} - mgr\sin\theta = 0, \tag{17.112} $$
in which the constraining force $\lambda_1(\theta)$ is a function of the angle $\theta$.² Since $r = l$,
$\ddot{r} = \dot{r} = 0$, Eq. 17.112 reduces to
$$ -ml\dot{\theta}^2 + mg\cos\theta = \lambda_1(\theta), \tag{17.113a} $$
$$ ml^2\ddot{\theta} - mgl\sin\theta = 0. \tag{17.113b} $$
Differentiating Eq. 17.113a with respect to time and remembering that
$$ \frac{df(\theta)}{dt} = \frac{df(\theta)}{d\theta}\dot{\theta}, \tag{17.114} $$
we obtain
$$ -2ml\ddot{\theta} - mg\sin\theta = \frac{d\lambda_1(\theta)}{d\theta}. \tag{17.115} $$
Using Eq. 17.113b to eliminate the $\ddot{\theta}$ term and then integrating, we have
$$ \lambda_1(\theta) = 3mg\cos\theta + C. \tag{17.116} $$
Since
$$ \lambda_1(0) = mg, \tag{17.117} $$
$$ C = -2mg. \tag{17.118} $$
The particle $m$ will stay on the surface as long as the force of constraint is
nonnegative, that is, as long as the surface has to push outward on the particle:
$$ \lambda_1(\theta) = 3mg\cos\theta - 2mg \geq 0. \tag{17.119} $$
The critical angle lies where $\lambda_1(\theta_c) = 0$, the force of constraint going to zero.
From Eq. 17.119
$$ \cos\theta_c = \tfrac{2}{3}, \quad \text{or} \quad \theta_c = 48^\circ 11' \tag{17.120} $$
from the vertical. At this angle (neglecting all friction) our particle takes off.
It must be admitted that this result can be obtained more easily by
considering a varying centripetal force furnished by the radial component of the
gravitational force. The example was chosen to illustrate the use of Lagrange's
undetermined multiplier without confusing the reader with a complicated
physical system.
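The critical angle can also be found by direct integration, without the analytic step from Eq. 17.115 to Eq. 17.116. The Python sketch below integrates Eq. 17.113b with a fourth-order Runge-Kutta step, evaluates the constraint force from Eq. 17.113a at each step, and reports the angle at which it first reaches zero. The particle is released at rest from a small assumed angle (it would never leave $\theta = 0$ exactly). The values of g, l, m, the step size, and the function name are likewise illustrative assumptions.

```python
import math

def critical_angle(g=9.8, l=1.0, m=1.0, theta0=1e-3, h=1e-4):
    """Angle (radians) at which the constraint force lambda_1 first drops to zero."""
    def accel(theta):
        return (g / l) * math.sin(theta)                   # Eq. 17.113b

    def constraint_force(theta, omega):
        return m * g * math.cos(theta) - m * l * omega ** 2  # Eq. 17.113a

    theta, omega = theta0, 0.0
    lam = constraint_force(theta, omega)
    while theta < math.pi / 2:
        # One RK4 step for theta' = omega, omega' = accel(theta).
        k1t, k1w = omega, accel(theta)
        k2t, k2w = omega + 0.5 * h * k1w, accel(theta + 0.5 * h * k1t)
        k3t, k3w = omega + 0.5 * h * k2w, accel(theta + 0.5 * h * k2t)
        k4t, k4w = omega + h * k3w, accel(theta + h * k3t)
        new_theta = theta + h / 6.0 * (k1t + 2 * k2t + 2 * k3t + k4t)
        new_omega = omega + h / 6.0 * (k1w + 2 * k2w + 2 * k3w + k4w)
        new_lam = constraint_force(new_theta, new_omega)
        if new_lam <= 0.0:
            # Linear interpolation between the last two steps for the zero crossing.
            frac = lam / (lam - new_lam)
            return theta + frac * (new_theta - theta)
        theta, omega, lam = new_theta, new_omega, new_lam
    return None

theta_c = critical_angle()
degrees = math.degrees(theta_c)
whole = int(degrees)
minutes = (degrees - whole) * 60.0
print(f"theta_c = {degrees:.3f} degrees = {whole} deg {minutes:.0f} min")
```

The result should not depend on g, l, or m, in agreement with $\cos\theta_c = 2/3$; only the small starting angle shifts it slightly.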
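² Note carefully that $\lambda_1$ is the radial force exerted by the cylinder on the
particle. Consideration of the physical problem should show that $\lambda_1$ must
depend on the angle $\theta$. We permitted $\lambda = \lambda(t)$. Now we are replacing the
time-dependence by an (unknown) angular dependence.
EXAMPLE 17.7.3 The Schrödinger Wave Equation
As a final illustration of a constrained minimum, let us find the Euler
equations for the quantum mechanical problem
$$ \delta\iiint \psi^*(x, y, z)H\psi(x, y, z)\,dx\,dy\,dz = 0, \tag{17.121} $$
with the constraint
$$ \iiint \psi^*\psi\,dx\,dy\,dz = 1. \tag{17.122} $$
Equation 17.121 is a statement that the energy of the system is stationary, $H$
being the quantum mechanical Hamiltonian for a particle of mass $m$, a
differential operator,
$$ H = -\frac{\hbar^2}{2m}\nabla^2 + V(x, y, z). \tag{17.123} $$
Equation 17.122, the constraint, is the condition that there will be exactly one
particle present; $\psi$ is the usual wave function, a dependent variable, and $\psi^*$,
its complex conjugate, is treated as a second³ dependent variable.
The integrand in Eq. 17.121 involves second derivatives, which can be
converted to first derivatives by integrating by parts:
$$ \int \psi^*\frac{\partial^2\psi}{\partial x^2}\,dx = \psi^*\frac{\partial\psi}{\partial x}\bigg| - \int \frac{\partial\psi^*}{\partial x}\frac{\partial\psi}{\partial x}\,dx. \tag{17.124} $$
We assume either periodic boundary conditions (as in the Sturm-Liouville
theory, Chapter 9) or that the volume of integration is so large that $\psi$ and $\psi^*$
vanish strongly at the boundary. Then the integrated part vanishes and Eq.
17.121 may be rewritten as
$$ \delta\iiint\left[\frac{\hbar^2}{2m}\nabla\psi^*\cdot\nabla\psi + V\psi^*\psi\right]dx\,dy\,dz = 0. \tag{17.125} $$
The function $g$ of Eq. 17.100 is
$$ g = \frac{\hbar^2}{2m}\nabla\psi^*\cdot\nabla\psi + V\psi^*\psi - \lambda\psi^*\psi = \frac{\hbar^2}{2m}(\psi^*_x\psi_x + \psi^*_y\psi_y + \psi^*_z\psi_z) + V\psi^*\psi - \lambda\psi^*\psi, \tag{17.126} $$
again using the subscript $x$ to denote $\partial/\partial x$. For $y_i = \psi^*$ Eq. 17.101 becomes
$$ \frac{\partial g}{\partial\psi^*} - \frac{\partial}{\partial x}\frac{\partial g}{\partial\psi^*_x} - \frac{\partial}{\partial y}\frac{\partial g}{\partial\psi^*_y} - \frac{\partial}{\partial z}\frac{\partial g}{\partial\psi^*_z} = 0. $$
This yields
$$ V\psi - \lambda\psi - \frac{\hbar^2}{2m}(\psi_{xx} + \psi_{yy} + \psi_{zz}) = 0 $$
or
$$ -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi = \lambda\psi. \tag{17.127} $$
³ Compare Section 6.1.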
Reference to Eq. 17.123 enables us to identify $\lambda$ physically as the energy of the
quantum mechanical system. With this interpretation, Eq. 17.127 is the
celebrated Schrödinger wave equation. This variational approach is more than
just a matter of academic curiosity. It provides a very powerful method of
obtaining approximate solutions of the wave equation (Rayleigh-Ritz
variational method, Section 17.8).
EXERCISES
17.7.1 A particle, mass $m$, is on a frictionless horizontal surface. It is constrained to move
so that $\theta = \omega t$ (rotating radial arm, no friction). With the initial conditions
$t = 0$, $r = r_0$, $\dot{r} = 0$,
(a) find the radial position as a function of time. ANS. $r(t) = r_0\cosh\omega t$.
(b) find the force exerted on the particle by the constraint.
ANS. $F^{(c)} = 2m\dot{r}\omega = 2mr_0\omega^2\sinh\omega t$.
17.7.2 A point mass $m$ is moving over a flat, horizontal, frictionless plane. The mass is
constrained by a string to move radially inward at a constant rate. Using plane
polar coordinates $(\rho, \varphi)$, $\rho = \rho_0 - kt$,
(a) Set up the Lagrangian.
(b) Obtain the constrained Lagrange equations.
(c) Solve the $\varphi$-dependent Lagrange equation to obtain $\omega(t)$, the angular
velocity. What is the physical significance of the constant of integration
that you get from your "free" integration?
(d) Using the $\omega(t)$ from part (c), solve the $\rho$-dependent (constrained) Lagrange
equation to obtain $\lambda(t)$. In other words, explain what is happening to the
force of constraint as $\rho \to 0$.
17.7.3 A flexible cable is suspended from two fixed points. The length of the cable is
fixed. Find the curve that will minimize the total gravitational potential energy
of the cable.
ANS. Hyperbolic cosine.
17.7.4 A fixed volume of water is rotating in a cylinder with constant angular velocity
$\omega$. Find the curve of the water surface that will minimize the total potential
energy of the water in the combined gravitational-centrifugal force field.
ANS. Parabola.
17.7.5 (a) Show that for a fixed-length perimeter the figure with maximum area is a
circle.
(b) Show that for a fixed area the curve with minimum perimeter is a circle.
Hint. The radius of curvature $R$ is given by
$$ R = \frac{(r^2 + r_\theta^2)^{3/2}}{r r_{\theta\theta} - 2r_\theta^2 - r^2}. $$
Note. The problems of this section, variation subject to constraints, are often
called isoperimetric. The term arose from problems of maximizing area subject
to a fixed perimeter, as in Exercise 17.7.5(a).
17.7.6 Show that requiring $J$, given by
$$ J = \int_a^b\left(p(x)y_x^2 - q(x)y^2\right)dx, $$
to have a stationary value subject to the normalizing condition
$$ \int_a^b y^2 w(x)\,dx = 1 $$
leads to the Sturm-Liouville equation of Chapter 9:
$$ \frac{d}{dx}\left(p\frac{dy}{dx}\right) + qy + \lambda wy = 0. $$
Note. The boundary condition
$$ p y_x y\,\Big|_a^b = 0 $$
is used in Section 9.1 in establishing the Hermitian property of the operator.
17.7.7 Show that requiring $J$, given by
$$ J = \int_a^b\int_a^b K(x, t)\varphi(x)\varphi(t)\,dx\,dt, $$
to have a stationary value subject to the normalizing condition
$$ \int_a^b \varphi^2(x)\,dx = 1 $$
leads to the Hilbert-Schmidt integral equation, Eq. 16.89.
Note. The kernel $K(x, t)$ is symmetric.
17.8 RAYLEIGH-RITZ VARIATIONAL TECHNIQUE
Exercise 17.7.6 opens up a relation between the calculus of variations and
eigenfunction-eigenvalue problems. We may rewrite the expression of Exercise
17.7.6 as
$$ F[y(x)] = \frac{\int_a^b (p y_x^2 - q y^2)\,dx}{\int_a^b y^2 w\,dx}, \tag{17.128} $$
in which the constraint appears in the denominator as a usual normalizing
condition. The quantity $F$, a function of the function $y(x)$, is sometimes called
a functional. Since the denominator is constant (for normalized functions), the
stationary values of $J$ correspond to the stationary values of $F$. Then from
Exercise 17.7.6 when $y(x)$ is such that $J$ and $F$ take on a stationary value, the
optimum function $y(x)$ satisfies the Sturm-Liouville equation
$$ \frac{d}{dx}\left(p\frac{dy}{dx}\right) + qy + \lambda wy = 0, \tag{17.129} $$
with $\lambda$ the eigenvalue (not a Lagrangian multiplier). Integrating the numerator
of Eq. 17.128 by parts and using the boundary condition,
$$ p y_x y\,\Big|_a^b = 0, \tag{17.130} $$
we obtain
$$ F[y(x)] = -\frac{\int_a^b y\left\{\frac{d}{dx}\left(p\frac{dy}{dx}\right) + qy\right\}dx}{\int_a^b y^2 w\,dx}. \tag{17.131} $$
Then substituting in Eq. 17.129, the stationary values of $F[y(x)]$ are given by
$$ F[y_n(x)] = \lambda_n, \tag{17.132} $$
with $\lambda_n$ the eigenvalue corresponding to the eigenfunction $y_n$. Equation 17.132
with $F$ given by either Eq. 17.128 or 17.131 forms the basis of the Rayleigh-Ritz
method for the computation of eigenfunctions and eigenvalues.
Ground State Eigenfunction
Suppose that we seek to compute the ground state eigenfunction $y_0$ and
eigenvalue¹ $\lambda_0$ of some complicated atomic or nuclear system. The classical
example for which no exact solution exists is the helium atom problem. The
eigenfunction $y_0$ is unknown, but we shall assume we can make a pretty good
guess at an approximate function $y$, so that mathematically we may write²
$$ y = y_0 + \sum_{i=1}^{\infty} c_i y_i. \tag{17.133} $$
The $c_i$'s are small quantities. (How small depends on how good our guess was.)
The $y_i$'s are normalized eigenfunctions (also unknown), and therefore our trial
function $y$ is not normalized.
Substituting the approximate function $y$ into Eq. 17.131 and noting that
$$ \int_a^b y_i\left\{\frac{d}{dx}\left(p\frac{dy_j}{dx}\right) + qy_j\right\}dx = -\lambda_i\delta_{ij}, \tag{17.134} $$
we obtain
$$ F[y(x)] = \frac{\lambda_0 + \sum_{i=1}^{\infty} c_i^2\lambda_i}{1 + \sum_{i=1}^{\infty} c_i^2}. \tag{17.135} $$
Here we have taken the eigenfunctions to be orthogonal, since they are
solutions of the Sturm-Liouville equation, Eq. 17.129. We also assume that $y_0$ is
nondegenerate. Now, if we expand the denominator of Eq. 17.135 by the
binomial theorem and discard terms of order $c_i^4$,
$$ F[y(x)] = \lambda_0 + \sum_{i=1}^{\infty} c_i^2(\lambda_i - \lambda_0). \tag{17.136} $$
Equation 17.136 contains two important results.
1. Whereas the error in the eigenfunction $y$ was $O(c_i)$,
the error in $\lambda$ is $O(c_i^2)$. Even a poor approximation of
the eigenfunctions may yield an accurate calculation
of the eigenvalue.
2. If $\lambda_0$ is the lowest eigenvalue (ground state), then since
$\lambda_i - \lambda_0 > 0$,
$$ F[y(x)] = \lambda \geq \lambda_0, \tag{17.137} $$
or our approximation is always on the high side,
becoming lower, converging on $\lambda_0$ as our approximate
eigenfunction $y$ improves ($c_i \to 0$). Note that Eq.
17.137 is a direct consequence of Eq. 17.135,
independent of our binomial approximation.
¹ This means that $\lambda_0$ is the lowest eigenvalue. It is clear from Eq. 17.128 that
if $p(x) > 0$ and $q(x) \leq 0$ (compare Table 9.1), then $F[y(x)]$ has a lower bound
and this lower bound is nonnegative. Recall from Section 9.1 that $w(x) > 0$.
² We are guessing at the form of the function. The normalization is irrelevant.
EXAMPLE 17.8.1 Vibrating String
A vibrating string, clamped at $x = 0$ and 1, satisfies the eigenvalue equation
$$\frac{d^2 y}{dx^2} + \lambda y = 0, \tag{17.138}$$
and the boundary condition $y(0) = y(1) = 0$. For this simple example the student
will recognize immediately that $y_0(x) = \sin \pi x$ (unnormalized) and $\lambda_0 = \pi^2$.
But let us try out the Rayleigh-Ritz technique.
With one eye on the boundary conditions, we try
$$y(x) = x(1 - x). \tag{17.139}$$
Then with $p = 1$ and $w = 1$, Eq. 17.128 yields
$$F[y(x)] = \frac{\int_0^1 (1 - 2x)^2\,dx}{\int_0^1 x^2 (1 - x)^2\,dx} = \frac{1/3}{1/30} = 10. \tag{17.140}$$
This result, $\lambda = 10$, is a fairly good approximation (1.3% error)³ of $\lambda_0 = \pi^2 =
9.8696$. The reader may have noted that $y(x)$, Eq. 17.139, is not normalized
to unity. The denominator in $F[y(x)]$ compensates for the lack of unit
normalization.
In the usual scientific calculation the eigenfunction would be improved by
introducing more terms and adjustable parameters such as
$$y = x(1 - x) + a_2 x^2 (1 - x)^2. \tag{17.141}$$
It is convenient to have the additional terms orthogonal, but it is not necessary.
The parameter $a_2$ is adjusted to minimize $F[y(x)]$. In this case, choosing $a_2 =
1.1353$ drives $F[y(x)]$ down to 9.8697, very close to the exact eigenvalue.
3 The closeness of the fit may be checked by a Fourier sine expansion (compare
Exercise 14.2.3 over the half interval [0, 1] or, equivalently, over the interval
[-1, 1], with $y(x)$ taken to be odd). Because of the even symmetry relative
to $x = 1/2$, only odd $n$ terms appear:
$$y(x) = x(1 - x) = \frac{8}{\pi^3}\left[ \sin \pi x + \frac{\sin 3\pi x}{3^3} + \frac{\sin 5\pi x}{5^3} + \cdots \right].$$
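The minimization over $a_2$ can be sketched in a few lines of Python. This is a hedged illustration, not the procedure used in the text: the closed-form integrals inside `F` were worked out for this sketch from $\int_0^1 x^m (1-x)^m\,dx = (m!)^2/(2m+1)!$, and the golden-section search is simply one convenient one-dimensional minimizer. The names `F` and `golden_min` are hypothetical.

```python
import math

def F(a2):
    """Rayleigh quotient (Eq. 17.128 with p = w = 1, q = 0) for the trial
    function y = x(1-x) + a2*x^2*(1-x)^2 of Eq. 17.141."""
    num = 1.0 / 3.0 + 2.0 * a2 / 15.0 + 2.0 * a2**2 / 105.0   # integral of (dy/dx)^2
    den = 1.0 / 30.0 + a2 / 70.0 + a2**2 / 630.0              # integral of y^2
    return num / den

def golden_min(f, lo, hi, tol=1e-6):
    """Golden-section search for the minimum of a unimodal f on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):          # minimum lies in [lo, d]
            hi, d = d, c
            c = hi - g * (hi - lo)
        else:                    # minimum lies in [c, hi]
            lo, c = c, d
            d = lo + g * (hi - lo)
    return 0.5 * (lo + hi)

a_min = golden_min(F, 0.0, 3.0)
print(f"a2 = {a_min:.4f}, F = {F(a_min):.4f}, pi^2 = {math.pi**2:.4f}")
print(f"F(0) = {F(0.0):.4f}, F(1.1353) = {F(1.1353):.4f}")
```

In this sketch the search settles near $a_2 \approx 1.133$, slightly different from the 1.1353 quoted above. The minimum is very flat, so both values give $F \approx 9.8697$, still on the high side of $\pi^2$ as Eq. 17.137 demands.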
EXERCISES
17.8.1 From Eq. 17.128 develop in detail the argument that $\lambda \ge 0$. Explain the
circumstances under which $\lambda = 0$ and illustrate with several examples.
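17.8.2 An unknown function satisfies the differential equation
$$y'' + \left( \frac{\pi}{2} \right)^2 y = 0$$
and the boundary conditions
$y(0) = 1$, $y(1) = 0$.
(a) Calculate the approximation
$\lambda = F[y_{\text{trial}}]$
for
$y_{\text{trial}} = 1 - x^2$.
(b) Compare with the exact eigenvalue.
ANS. (a) $\lambda = 2.5$
(b) $\lambda / \lambda_{\text{exact}} = 1.013$.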
17.8.3 In Exercise 17.8.2 use a trial function
$y = 1 - x^n$.
(a) Find the value of $n$ that will minimize $F[y_{\text{trial}}]$.
(b) Show that the optimum value of $n$ drives the ratio $\lambda / \lambda_{\text{exact}}$ down to 1.003.
ANS. (a) $n = 1.7247$.
17.8.4 A quantum mechanical particle in a sphere (Example 11.7.1) satisfies
$$\nabla^2 \psi + k^2 \psi = 0,$$
with $k^2 = 2mE/\hbar^2$. The boundary condition is that $\psi(r = a) = 0$, where $a$ is the
radius of the sphere. For the ground state [where $\psi = \psi(r)$] try an approximate
wave function
$$\psi_a(r) = 1 - \left( \frac{r}{a} \right)^2$$
and calculate an approximate eigenvalue $k_a^2$.
Hint. To determine $p(r)$ and $w(r)$, put your equation in self-adjoint form (in
spherical polar coordinates).
ANS. $k_a^2 = 10.5/a^2$, $k_{\text{exact}}^2 = \pi^2/a^2$.
17.8.5 The wave equation for the quantum mechanical oscillator may be written as
$$\frac{d^2 \psi(x)}{dx^2} + (\lambda - x^2)\psi(x) = 0,$$
with $\lambda = 1$ for the ground state (Eq. 13.18). Take
$$\psi_{\text{trial}} = \begin{cases} 1 - (x^2/a^2), & x^2 \le a^2 \\ 0, & x^2 > a^2 \end{cases}$$
for the ground-state wave function (with $a^2$ an adjustable parameter) and calculate
the corresponding ground-state energy. How much error do you have?
Note. Your parabola is really not a very good approximation to a Gaussian
exponential. What improvements can you suggest?
17.8.6 The Schrödinger equation for a central potential may be written as
$$\mathcal{L}u(r) + \frac{\hbar^2 l(l + 1)}{2Mr^2} u(r) = E u(r).$$
The $l(l + 1)$ term comes from splitting off the angular dependence (Section 2.5).
Treating this term as a perturbation, use your variational technique to show that
$E > E_0$, where $E_0$ is the energy eigenvalue of $\mathcal{L}u_0 = E_0 u_0$ corresponding to
$l = 0$. This means that the minimum energy state will have $l = 0$, zero angular
momentum.
Hint. You can expand $u(r)$ as $u_0(r) + \sum_{i=1}^{\infty} c_i u_i$, where $\mathcal{L}u_i = E_i u_i$, $E_i > E_0$.
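17.8.7 In the matrix eigenvector, eigenvalue equation
$$A\mathbf{r}_i = \lambda_i \mathbf{r}_i,$$
A is an $n \times n$ Hermitian matrix. For simplicity, assume that its $n$ real
eigenvalues (Section 4.6) are distinct, $\lambda_1$ being the largest. If $\mathbf{r}$ is an approximation
to $\mathbf{r}_1$,
$$\mathbf{r} = \mathbf{r}_1 + \sum_{i=2}^{n} \delta_i \mathbf{r}_i,$$
show that
$$\frac{\mathbf{r}^\dagger A \mathbf{r}}{\mathbf{r}^\dagger \mathbf{r}} \le \lambda_1$$
and that the error in $\lambda_1$ is of the order $|\delta_i|^2$. Take $|\delta_i| \ll 1$.
Hint. The $n$ $\mathbf{r}_i$ form a complete orthogonal set spanning the $n$-dimensional
(complex) space.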
17.8.8 The variational solution of Example 17.8.1 may be refined by taking $y = x(1 - x)
+ a_2 x^2 (1 - x)^2$. Using numerical quadrature, calculate $\lambda_{\text{approx}} = F[y(x)]$,
Eq. 17.128, for a fixed value of $a_2$. Vary $a_2$ to minimize $\lambda$. Calculate the value of
$a_2$ that minimizes $\lambda$ and $\lambda$ itself to five significant figures. Compare your eigenvalue
$\lambda$ with $\pi^2$.
REFERENCES
Bliss, G. A., Calculus of Variations. The Mathematical Association of America, Open
Court Publishing Co. LaSalle, Ill. (1925).
As one of the older texts, this is still a valuable reference for details of problems such
as minimum area problems.
Courant, R., and H. Robbins, What Is Mathematics? 2nd ed. New York: Oxford
University Press (1979).
Chapter VII contains a fine discussion of the calculus of variations, including soap
film solutions to minimum area problems.
Lanczos, C., The Variational Principles of Mechanics, 4th ed. Toronto: University of
Toronto Press (1970).
This book is a very complete treatment of variational principles and their applications
to the development of classical mechanics.
Sagan, H., Boundary and Eigenvalue Problems in Mathematical Physics. New York:
Wiley (1961).
This delightful text could also be listed as a reference for Sturm-Liouville theory,
Legendre and Bessel functions, and Fourier Series. Chapter 1 is an introduction to
the calculus of variations with applications to mechanics. Chapter 7 picks up the
calculus of variations again and applies it to eigenvalue problems.
Sagan, H., Introduction to the Calculus of Variations. New York: McGraw-Hill (1969).
This is an excellent introduction to the modern theory of the calculus of variations
which is more sophisticated and complete than his 1961 text. Sagan covers sufficiency
conditions and relates the calculus of variations to problems of space technology.
Weinstock, R., Calculus of Variations. New York: McGraw-Hill (1952). (Also in paper,
Dover)
A detailed, systematic development of the calculus of variations and applications to
Sturm-Liouville theory and physical problems in elasticity, electrostatics, and quantum
mechanics.
Yourgrau, W., and S. Mandelstam, Variational Principles in Dynamics and Quantum
Theory, 3rd ed. Philadelphia: Saunders (1968). (Also in Dover, 1979)
This is a comprehensive, authoritative treatment of variational principles. The
discussions of the historical development and the many metaphysical pitfalls are of particular
interest.
APPENDIX 1
REAL ZEROS OF A
FUNCTION
The demand for the values of the real zeros of a function occurs frequently in
mathematical physics. Examples include the boundary conditions on the
solution of a coaxial wave guide problem, Example 11.3.1, eigenvalue problems in
quantum mechanics such as the deuteron with a square well potential, Example
9.1.2, and the location of the evaluation points in Gaussian quadrature
(Appendix 2).
The IBM Scientific Subroutine Package (SSP) offers three subroutines for
determining the real zeros of functions. These are (1) RTWI, an iteration
technique due to Wegstein, (2) RTMI, Mueller's bisection iteration technique and
(3) RTNI, Newton's method, hallowed in introductory calculus. All three
methods require close initial guesses of the zero or root. How close depends on
how wildly your function is varying and what accuracy you demand. All are
methods for refining a good initial value. To obtain the good initial value and
to locate pathological features that must be avoided (such as discontinuities or
singularities), you should make a reasonably detailed graph of the function.
There is no real substitute for a graph. Exercise 11.3.12 emphasizes this point.
Newton's Method
This is commonly presented in differential calculus because it illustrates
differential calculus. It may sometimes be a good method—if you know exactly
what your function is doing.
Newton's method assumes the function $f(x)$ to have a continuous first
derivative. From the geometrical interpretation of a derivative as the slope of the tangent to the
curve, Fig. 1,
$$\frac{f(x_0)}{x_1 - x_0} = -f'(x_0) \tag{A1.1}$$
or
$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}. \tag{A1.2}$$
With $x_0$ as the initial guess, calculate $x_1$ from Eq. A1.2. Iterating, from $x_1$ you
calculate $x_2$ and hopefully converge rapidly on the root.
FIG. 1 Newton's root-finding method
FIG. 2 Newton's method: local minimum, no convergence
Newton's method does require computation of the derivative. This may or
may not be a handicap. Calculation of the derivative in Exercise 11.3.12 would
be messy. But the real objection to Newton's method is that it is extremely
treacherous. It may fail to converge, oscillating in the vicinity of a local
maximum or minimum (Fig. 2), or it may diverge in the vicinity of an inflection point.
Or, if your initial guess is not close enough, Newton's method may converge to
the wrong root. Unless you know exactly what your function is doing, this is a
method to avoid.
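The iteration of Eq. A1.2 and two of the failure modes just described can be tried out with a short Python sketch. It is not the SSP routine RTNI: the function name `newton`, its arguments `eps` and `max_iter`, and the stopping test are assumptions chosen for this illustration.

```python
import math

def newton(f, df, x0, eps=1e-10, max_iter=50):
    """Iterate Eq. A1.2; return (x, converged, iterations used)."""
    x = x0
    for k in range(1, max_iter + 1):
        slope = df(x)
        if slope == 0.0:                  # horizontal tangent: no next estimate
            return x, False, k
        x_new = x - f(x) / slope          # Eq. A1.2
        if abs(x_new - x) <= eps * max(1.0, abs(x_new)):
            return x_new, True, k
        x = x_new
    return x, False, max_iter             # iteration limit reached

# f(x) = sin x: a nearby start finds x = 0; a slightly poorer start ends elsewhere.
for x0 in (1.0, 1.2):
    root, ok, n = newton(math.sin, math.cos, x0)
    print(f"sin x, x0 = {x0}: x = {root:.8f}, converged = {ok}, iterations = {n}")

# f(x) = x^2 + 1 has no real zero, so the iteration can only wander.
root, ok, n = newton(lambda x: x * x + 1.0, lambda x: 2.0 * x, 0.9)
print(f"x^2 + 1, x0 = 0.9: converged = {ok} after {n} iterations")
```

Starting from $x_0 = 1.2$ the first step overshoots and the iteration settles on a different zero of $\sin x$ than the one nearest the starting point, which is the "wrong root" hazard. Printing each iterate inside the loop makes the sequence of extrapolations easy to trace.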
Bisection Method
This method assumes only that $f(x)$ is continuous. It requires that initial values
$x_l$ and $x_r$ straddle the zero being sought. Thus $f(x_l)$ and $f(x_r)$ will have opposite
signs, making the product $f(x_l) * f(x_r)$ negative. In the simplest form of the
bisection method, take the midpoint $x_m = \frac{1}{2}(x_l + x_r)$ and test to see which interval
$[x_l, x_m]$ or $[x_m, x_r]$ contains the zero. The easiest test is to see if one product, say,
$f(x_m) * f(x_r) < 0$. If this product is negative, then the root is in the upper half
interval $[x_m, x_r]$; if positive, then the root must be in the lower half interval
$[x_l, x_m]$. Remember, we are assuming $f(x)$ to be continuous. The interval
containing the zero is relabeled $[x_l, x_r]$ and the bisecting continues (as in Fig. 3)
until the root is located to the desired degree of accuracy. Of course, the better
the initial choice of $x_l$ and $x_r$ is, the fewer will be the bisections required.
However, as explained subsequently, it is important to specify the maximum number
of bisections that will be permitted.
FIG. 3 Bisection root-finding method
This bisection technique may not have the elegance of Newton's method, but
it is reasonably fast and much more reliable—almost foolproof if you avoid
discontinuous functions, such as $f(x) = 1/(x - a)$, shown in Fig. 4. Again, there
is no substitute for knowing the detailed local behavior of your function in the
vicinity of your supposed root.
In general, the bisection method (RTMI) is recommended.
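A minimal Python sketch of the bisection procedure follows. It is not RTMI (which, per the SSP description, is more elaborate); the function name `bisect` and the arguments `eps` and `max_iter` are assumptions of this sketch, standing in for a tolerance and an iteration limit.

```python
import math

def bisect(f, xl, xr, eps=1e-10, max_iter=100):
    """Bisection on [xl, xr]; return (midpoint estimate, bisections used)."""
    fl, fr = f(xl), f(xr)
    if fl == 0.0:
        return xl, 0
    if fr == 0.0:
        return xr, 0
    if fl * fr > 0.0:
        raise ValueError("f(xl) and f(xr) must have opposite signs")
    for k in range(1, max_iter + 1):
        xm = 0.5 * (xl + xr)
        fm = f(xm)
        if fm == 0.0 or 0.5 * (xr - xl) <= eps:
            return xm, k
        if fm * fr < 0.0:       # sign change in the upper half [xm, xr]
            xl = xm
        else:                   # otherwise keep the lower half [xl, xm]
            xr, fr = xm, fm
    raise RuntimeError("tolerance not reached within max_iter bisections")

root, n = bisect(lambda x: math.cos(x) - x, 0.0, 1.0)
print(f"cos x = x at x = {root:.9f} after {n} bisections")

# A pole also produces a sign change: tan x on [1, 2] has no zero,
# yet bisection happily closes in on the singularity at pi/2.
pole, n = bisect(math.tan, 1.0, 2.0)
print(f"'root' of tan x: x = {pole:.9f}, |f(x)| = {abs(math.tan(pole)):.3e}")
```

The second call is the situation of Fig. 4 with a different function: the sign test is satisfied, the interval shrinks as usual, and only the enormous value of $|f(x)|$ at the returned point reveals that no zero was found.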
FIG. 4 A simple pole, $f(x_l) \cdot f(x_r) < 0$ but no root
Two Warnings
1. Since the computer carries only a finite number of
significant figures we cannot expect to calculate a
zero with infinite precision. It is necessary to specify
some tolerance. All three SSP subroutines RTWI,
RTMI, and RTNI require that some tolerance be
specified (input parameter EPS). When the root is
located to within this tolerance the subroutine
returns control to the main calling program.
2. All the approaches mentioned here are iteration
techniques. How many times do you iterate? How
do you decide to stop? It is possible to program the
iteration so that it continues until the desired
accuracy is obtained. The danger is that some factor
may prevent reasonable convergence. Then your
tolerance is never achieved and you have an infinite
loop. It is far safer to specify in advance a maximum
number of iterations. Again, this is the approach of
all three SSP subroutines (input parameter IEND).
Thus these subroutines will stop when either a zero
is determined to within your specified tolerance or
the number of iterations reaches your specified
maximum, whichever occurs first. With a simple
bisection technique the selection of a number of
iterations depends on the initial spread $x_r - x_l$ and
on the precision you demand. Each iteration will
cut the range by a factor of 2. Since $2^{10} = 1024 \approx
10^3$, 10 iterations should add 3 significant figures,
20 should add 6 significant figures to the location of
the root.
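The factor-of-2 argument translates directly into an estimate of the iteration limit for simple bisection. The helper below is a hypothetical illustration in Python (the name `bisections_needed` is an assumption); it returns the smallest $n$ for which the initial spread divided by $2^n$ falls within the tolerance.

```python
import math

def bisections_needed(xl, xr, tol):
    """Smallest n with (xr - xl) / 2**n <= tol."""
    return max(0, math.ceil(math.log2((xr - xl) / tol)))

for tol in (1e-3, 1e-6, 1e-12):
    print(f"spread 1.0, tolerance {tol:g}: {bisections_needed(0.0, 1.0, tol)} bisections")
```

For a unit initial spread this reproduces the rule of thumb above: about 10 bisections per 3 significant figures. A value somewhat larger than this estimate is a reasonable choice for the maximum number of iterations.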
EXERCISES
1.1 Given $f(x) = x - ax^3$. How small must $|x_0|$ be for Newton's method to converge to
$x = 0$?
1.2 Try Newton's method (RTNI or your own program) to locate a root of the following
functions
(a) $f(x) = x^2 + 1$, $x_0 = 0.9, 1.0$
(b) $f(x) = (x^2 + 1)^{1/2}$, $x_0 = 0.9, 1.0$
(c) $f(x) = \sin x$, $x_0 = 1.0, 1.1, 1.2$
(d) $f(x) = \tanh x$, $x_0 = 0.9, 1.0, 1.1$.
RTNI demands that you write a subroutine to supply RTNI with $f(x)$ and its
derivative. Write out $x$ and $f(x)$ every time the subroutine is called, so that you can trace the
sequence of extrapolations.
1.3 As an example of what Newton's method can do, call RTNI to find the largest root
of the Chebyshev polynomial $T_{10}(x)$. Try a succession of initial values $x = 0.95$, 0.96,
0.97, and 0.98. Explain in detail what has happened.
Note. RTNI demands a subprogram that will supply the function ($T_{10}(x)$) and its
derivative. SSP subroutine CNP will provide $T_{10}(x)$ and lower index $T$'s. $T'_{10}(x)$ may
be calculated from Eq. 13.77 ($x \ne \pm 1$).
ANS. maximum root = 0.98769.
1.4 Write a simple bisection root determination subroutine that will determine a simple
real root once you have straddled it. Test your subroutine by determining the roots
of one or more polynomials or elementary transcendental functions.
1.5 The theory of free radial oscillations of a homogeneous earth leads to an equation
$$\tan x = \frac{x}{1 - a^2 x^2}.$$
The parameter $a$ depends on the velocities of the primary and secondary waves. For
$a = 1.0$, find the first three positive roots of this equation.
ANS. $x_1 = 2.7437$, $x_2 = 6.1168$, $x_3 = 9.3166$.
1.6 (a) Using the Bessel function $J_0(x)$ generated by SSP subroutine BESJ, locate
consecutive roots of $J_0(x)$: $\alpha_n$ and $\alpha_{n+1}$ for $n = 5, 10, 15, \ldots, 30$. Tabulate $\alpha_n$,
$\alpha_{n+1}$, $(\alpha_{n+1} - \alpha_n)$ and $(\alpha_{n+1} - \alpha_n)/\pi$. Note how this last ratio is approaching unity.
Hint. RTMI will pinpoint the root once you have straddled it.
(b) Compare your values of $\alpha_n$ with values calculated from McMahon's expansion,
AMS-55, Eq. 9.5.12.
REFERENCES
Hamming, R. W., Introduction to Applied Numerical Analysis. New York: McGraw-Hill
(1971), especially Chap. 2.
In terms of the author's insight into numerical computation and his ability to
communicate to the average reader, this book is unexcelled.
APPENDIX 2
GAUSSIAN
QUADRATURE
Interpolatory Formulas
The problem is to find the numerical value of a definite integral
$$I = \int_a^b f(x) w(x)\,dx.$$
We approximate our integral by a finite sum
$$\int_a^b f(x) w(x)\,dx \approx \sum_{k=1}^{n} A_k f(x_k). \tag{A2.1}$$
The sum in Eq. A2.1 contains $2n + 1$ parameters:
$n$ $x_k$'s, points for evaluating $f(x)$
$n$ $A_k$'s, coefficients
and
1 the choice of $n$ itself.
We proceed by replacing $f(x)$ by an interpolating polynomial $P(x)$ of degree
$n - 1$ and a remainder term:
$$f(x) = P(x) + r(x). \tag{A2.2}$$
$P(x)$ is fitted to $f(x)$ at the $n$ points $x_k$ [$P(x_k) = f(x_k)$] by the choice
$$P(x) = \sum_{k=1}^{n} \frac{\alpha(x)}{(x - x_k)\,\alpha'(x_k)} f(x_k), \tag{A2.3}$$
where $\alpha(x)$ is a completely factored $n$th-degree polynomial,
$$\alpha(x) = (x - x_1)(x - x_2) \cdots (x - x_n). \tag{A2.4}$$
Note that
$$\lim_{x \to x_k} \frac{\alpha(x)}{(x - x_k)\,\alpha'(x_k)} = 1. \tag{A2.5}$$
For $f(x)$ a polynomial of degree $n - 1$ the remainder term $r(x)$ is zero and
Eq. A2.3 becomes an identity. Specifically (using Eq. A2.5), $P(x_k) = f(x_k)$, the
$(n - 1)$-degree polynomial is fitted to $f(x)$ at the $n$ points $x_k$.
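The interpolating polynomial of Eq. A2.3 is easy to evaluate directly, because $\alpha(x)/[(x - x_k)\alpha'(x_k)]$ is just the product of $(x - x_j)/(x_k - x_j)$ over all $j \ne k$. The Python sketch below assumes that form; the function name `interpolating_polynomial` and the sample nodes are illustrative choices only.

```python
def interpolating_polynomial(xs, fs, x):
    """Evaluate P(x) of Eq. A2.3, given the nodes xs and the values fs[k] = f(xs[k])."""
    total = 0.0
    for k, xk in enumerate(xs):
        # alpha(x) / ((x - x_k) * alpha'(x_k)) = product over j != k of (x - x_j)/(x_k - x_j)
        term = fs[k]
        for j, xj in enumerate(xs):
            if j != k:
                term *= (x - xj) / (xk - xj)
        total += term
    return total

nodes = [0.0, 0.5, 1.0]                      # n = 3 sample points
for power in (2, 3):
    values = [t**power for t in nodes]
    p = interpolating_polynomial(nodes, values, 0.3)
    print(f"f(x) = x^{power}: P(0.3) = {p:.6f}, f(0.3) = {0.3**power:.6f}")
```

With three nodes the fit reproduces $x^2$ (degree $n - 1$) exactly, while for $x^3$ the remainder term $r(x)$ is no longer zero.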
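When the integral of the remainder term is small
$$\int_a^b f(x) w(x)\,dx \approx \int_a^b P(x) w(x)\,dx = \int_a^b \sum_{k=1}^{n} \frac{\alpha(x)}{(x - x_k)\,\alpha'(x_k)} f(x_k)\, w(x)\,dx \tag{A2.6}$$
using Eq. A2.3. Interchanging summation and integration, we obtain
$$\int_a^b f(x) w(x)\,dx \approx \sum_{k=1}^{n} A_k f(x_k), \qquad A_k = \int_a^b \frac{\alpha(x)\, w(x)}{(x - x_k)\,\alpha'(x_k)}\,dx. \tag{A2.7}$$
Quadrature formulas of this type are labeled interpolatory. Since every
polynomial $f(x)$ of degree $n - 1$ may be represented exactly [$r(x) = 0$] by our
$n$-point-fit interpolating polynomial $P(x)$, Eq. A2.7 is exact for such polynomial
functions $f(x)$.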
The locations of the $x_k$, the zeros of $\alpha(x)$ in Eq. A2.7, have not been specified.
Taking them to be equally spaced leads to the various Newton-Cotes formulas.
Of these Simpson's rule (Eq. A2.8) is probably the best known and, among the
simpler formulas, it is the most accurate.
$$\int_a^b f(x)\,dx \approx \frac{h}{3}\{ f(a) + 4f(a + h) + 2f(a + 2h) + 4f(a + 3h) + 2f(a + 4h) + \cdots + 4f(b - h) + f(b) \}. \tag{A2.8}$$
Here $h$ is the distance between the equally spaced points, $h = x_2 - x_1 = x_3 - x_2$,
and so on. Equation A2.8 may be considered a sum of three-point fits
$$\int_c^{c+2h} f(x)\,dx \approx \frac{h}{3}\{ f(c) + 4f(c + h) + f(c + 2h) \}, \tag{A2.9}$$
which is expected to be exact if $f(x)$ is of degree $\le 2$ over the interval $[c, c + 2h]$.
Actually Simpson's rule is better than this. An analysis of the error shows
that the error in Simpson's rule is given by $-h^5 f^{(4)}(\xi)/90$, where $\xi$ is a point in
$[c, c + 2h]$. For $f(x) = x^3$, $f^{(4)}(x) = 0$ and Simpson's rule is exact for cubic
polynomials. The reader may verify this by showing that the integral of $x^3$ is given exactly
by Eq. A2.8.
This result may be interpreted as a consequence of symmetry principles:
(1) the coefficients in Simpson's rule are symmetric with respect to the middle
$x_k$; 1, 4, 1 for Eq. A2.9. (2) For Simpson's rule $n = 3$, odd, and $x^3$ is an odd
function. If we set $c = -h$, $c + h = 0$, then both sides of Eq. A2.9 vanish by
(anti)symmetry. This additional degree of precision appears for each of the
Newton-Cotes formulas where $n$ is odd.
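The exactness of Simpson's rule for cubics, and the quoted error term for a quartic, can be confirmed with a short Python sketch of Eq. A2.8. The function name `simpson` and the choice of test integrands over [0, 1] are assumptions made for this illustration.

```python
def simpson(f, a, b, n):
    """Composite Simpson's rule, Eq. A2.8, with n (even) intervals of width h."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)   # weights 4, 2, 4, ..., 4
    return total * h / 3.0

# One three-point fit (Eq. A2.9) on [0, 1], so h = 1/2.
cubic = simpson(lambda x: x**3, 0.0, 1.0, 2)
quartic = simpson(lambda x: x**4, 0.0, 1.0, 2)
print(f"x^3: Simpson = {cubic}, exact = {1 / 4}")
print(f"x^4: Simpson = {quartic:.7f}, exact = {1 / 5}, "
      f"exact - Simpson = {1 / 5 - quartic:.7f}, -h^5 f''''/90 = {-0.5**5 * 24 / 90:.7f}")
```

For $x^4$ the fourth derivative is the constant 24, so the error term $-h^5 f^{(4)}(\xi)/90$ can be evaluated without knowing $\xi$ and matches the observed discrepancy.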
Gaussian Quadrature
It was pointed out by Gauss that the locations of $x_k$ represent unused
parameters that may be used to improve the accuracy of Eq. A2.7, that greater
precision can be obtained if the zeros of $\alpha(x)$ are not equally spaced but are
chosen as follows.
Take the $x_k$ so that our completely factored $n$th-degree polynomial $\alpha(x)$ is
the $n$th-degree polynomial which is orthogonal to all lower degree polynomials
over $[a, b]$ with respect to the weighting factor $w(x)$. The most frequently
encountered combinations of interval and weighting factor are those in Table
9.3.¹ The $x_k$'s therefore are the $n$ zeros of the $n$th-degree polynomials: Legendre,
Hermite, Laguerre, Chebyshev, and so on. Both the $x_k$'s and the corresponding
coefficients $A_k$ are tabulated in AMS-55, Chapter 25. Computing subroutines
exist in both single and double precision for the Legendre, Laguerre, and
Hermite cases.
We shall prove that this choice of $x_k$ (zeros of the appropriate $n$th-degree
orthogonal polynomial) makes the quadrature formula A2.7 exact for $f(x)$
a polynomial of degree $\le 2n - 1$. Here is the power of this Gaussian choice.
(Taking the $x_k$ equally spaced (Newton-Cotes) is exact only for $f(x)$ a
polynomial of degree $\le n - 1$, $n$ even, or $\le n$, $n$ odd.)
Proofs of the necessity and sufficiency of this choice of orthogonal
polynomial roots follow.
Theorem. A necessary and sufficient condition that an interpolatory
formula of the form of Eq. A2.7 be exact for all polynomials of degree $\le 2n - 1$
is that $\alpha(x)$ be orthogonal with respect to $w(x)$ over the interval $[a, b]$ to all
polynomials of degree $\le n - 1$.
Necessity. Assume Eq. A2.7 is exact for $f(x)$ any polynomial of degree $\le 2n - 1$.
Let $Q_1(x)$ be any polynomial of degree $\le n - 1$. Then $f(x) = \alpha(x) Q_1(x)$ is a
polynomial of degree $\le 2n - 1$. By simple substitution, we have
$$\int_a^b f(x) w(x)\,dx = \int_a^b \alpha(x) Q_1(x) w(x)\,dx, \tag{A2.10a}$$
and since Eq. A2.7 is assumed exact for this degree polynomial integrand,
$$\int_a^b \alpha(x) Q_1(x) w(x)\,dx = \sum_{k=1}^{n} A_k \alpha(x_k) Q_1(x_k) = 0. \tag{A2.10b}$$
The final $= 0$ follows because $\alpha(x_k) = 0$, Eq. A2.4. But this is a statement that our
$n$th-degree polynomial $\alpha(x)$ is orthogonal to all polynomials $Q_1(x)$ of degree
$\le n - 1$.
Sufficiency. Assume the orthogonality of $\alpha(x)$ to all polynomials of degree
$\le n - 1$. Let $f(x)$ be a polynomial of degree $\le 2n - 1$. Dividing $f(x)$ by $\alpha(x)$,
we obtain
$$\frac{f(x)}{\alpha(x)} = Q_2(x) + \frac{\rho(x)}{\alpha(x)} \tag{A2.11}$$
or
$$f(x) = \alpha(x) Q_2(x) + \rho(x), \tag{A2.12}$$
with $Q_2(x)$ and $\rho(x)$ polynomials of degree $\le n - 1$. Integrating yields
$$\int_a^b f(x) w(x)\,dx = \int_a^b \alpha(x) Q_2(x) w(x)\,dx + \int_a^b \rho(x) w(x)\,dx. \tag{A2.13}$$
The first integral on the right vanishes because of our postulated orthogonality.
Then, because the degree of $\rho(x)$ is $\le n - 1$, Eq. A2.7 (which is interpolatory)
is exact and we have
$$\int_a^b f(x) w(x)\,dx = \sum_{k=1}^{n} A_k \rho(x_k). \tag{A2.14}$$
Since $\alpha(x_k) = 0$, Eq. A2.12 yields
$$\rho(x_k) = f(x_k).$$
Therefore
$$\int_a^b f(x) w(x)\,dx = \sum_{k=1}^{n} A_k f(x_k), \tag{A2.15}$$
exact. This is Eq. A2.7, exact for $f(x)$, any polynomial of degree $\le 2n - 1$.
1 If $a$ and $b$ are finite, the interval $[a, b]$ can always be transformed to $[-1, 1]$
by the linear transformation $t = [2x - (a + b)]/(b - a)$, $x = [(b - a)t +
(b + a)]/2$. Then $\int_a^b f(x)\,dx = \frac{b - a}{2} \int_{-1}^{1} f(x(t))\,dt$.
As a specific example of Eq. A2.15, consider the case where $[a, b] = [-1, 1]$
with $w(x) = 1$. The polynomials orthogonal over this interval with respect to
this weighting function are the Legendre polynomials of Chapter 12. For the
choice $n = 10$ the $x_k$ are the 10 roots of $P_{10}(x)$. The values of $A_k$ are given in
principle by Eq. A2.7. A more convenient expression is derived by Krylov.²
Finally, with the numerical values of $A_k$ and $x_k$, Eq. A2.15 becomes
$$
\begin{aligned}
\int_{-1}^{1} f(x)\,dx = {} & 0.0666\,7134\, f(+0.9739\,0652) + 0.1494\,5134\, f(+0.8650\,6336) \\
& + 0.2190\,8636\, f(+0.6794\,0956) + 0.2692\,6671\, f(+0.4333\,9539) \\
& + 0.2955\,2422\, f(+0.1488\,7433) + 0.2955\,2422\, f(-0.1488\,7433) \\
& + 0.2692\,6671\, f(-0.4333\,9539) + 0.2190\,8636\, f(-0.6794\,0956) \\
& + 0.1494\,5134\, f(-0.8650\,6336) + 0.0666\,7134\, f(-0.9739\,0652),
\end{aligned} \tag{A2.16}
$$
exact (to the number of digits listed) for $f(x)$ a polynomial of degree $\le 19$.
2 Tabulations of the $A_k$ and $x_k$ are found in the references that follow and in
AMS-55 (Chapter 25).
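Equation A2.16 can be exercised directly. The Python sketch below uses only the eight-digit $x_k$ and $A_k$ listed in Eq. A2.16, together with the standard linear change of variable from $[a, b]$ to $[-1, 1]$; the function name `gauss_legendre_10` and the test integrands are assumptions of this illustration.

```python
import math

# Positive evaluation points x_k and coefficients A_k of Eq. A2.16;
# the negative points carry the same coefficients by symmetry.
NODES = (0.97390652, 0.86506336, 0.67940956, 0.43339539, 0.14887433)
WEIGHTS = (0.06667134, 0.14945134, 0.21908636, 0.26926671, 0.29552422)

def gauss_legendre_10(f, a=-1.0, b=1.0):
    """10-point Gauss-Legendre estimate of the integral of f over [a, b]."""
    half, mid = 0.5 * (b - a), 0.5 * (b + a)   # x = mid + half * t, dx = half * dt
    total = 0.0
    for t, w in zip(NODES, WEIGHTS):
        total += w * (f(mid + half * t) + f(mid - half * t))
    return half * total

for n in (18, 20):
    approx = gauss_legendre_10(lambda x: x**n)
    print(f"x^{n} on [-1, 1]: quadrature = {approx:.9f}, exact = {2 / (n + 1):.9f}")

print(f"exp(x) on [0, 1]: quadrature = {gauss_legendre_10(math.exp, 0.0, 1.0):.8f}, "
      f"exact = {math.e - 1:.8f}")
```

With the eight-digit values listed, the coefficients sum to 1.99999994 rather than 2, so agreement is limited to about seven figures. Within that limit $x^{18}$ (degree $\le 19$) is integrated exactly, while $x^{20}$ shows a visibly larger discrepancy.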
The actual usefulness of Gaussian integration is contingent upon two factors:
(1) the availability of computers and (2) the availability of the values of $f(x)$
at $x = x_k$. This generally means that $f(x)$ be expressed in closed form or
approximated in some convenient form so that $f(x_k)$ may readily be calculated. If $f(x)$
is given only as equally spaced tabulated values, Simpson's rule is probably
the best choice for the numerical integration.
Warning. Our fundamental assumption is that $f(x)$ can be accurately
represented by a $(2n - 1)$-degree polynomial with $n$ reasonably small. If $f(x)$ has a
singularity in the integration interval, this assumption of a polynomial
representation is obviously not valid. Even if $f(x)$ remains finite, the presence of an
infinite slope means our assumption is poor and that numerical accuracy will
be relatively low. Exercise A2.7 illustrates these points.
EXERCISES
2.1 (a) Verify Eq. A2.5.
(b) With $P(x)$ a polynomial of degree $\le n - 1$ and $\alpha(x)$ given by Eq. A2.4, verify that
$$\sum_{k=1}^{n} \frac{\alpha(x)}{(x - x_k)\,\alpha'(x_k)} P(x_k) = P(x).$$
2.2 Using a 10-point Gauss-Legendre subroutine, evaluate
$$\int_0^1 x^n\,dx \quad \text{for } n = 0(1)40.$$
Tabulate the computed value of the integral, the exact value, and the relative error.
Plot log (relative error) versus $n$.
2.3 Using a 10-point Gauss-Laguerre subroutine, evaluate
$$\int_0^{\infty} x^n e^{-x}\,dx \quad \text{for } n = 0(1)25.$$
Tabulate the computed value of the integral, the exact value, and the relative error.
Plot log (relative error) versus $n$.
2.4 Using a 10-point Gauss-Hermite subroutine, evaluate
$$\int_{-\infty}^{\infty} x^n e^{-x^2}\,dx \quad \text{for } n = 0(2)22.$$
Tabulate the computed value of the integral, the exact value, and the relative error.
Plot log (relative error) versus $n$.
2.5 (a) Write a double precision Gauss-Chebyshev subroutine that will evaluate
integrals of the form
$$\int_{-1}^{1} \frac{f(x)}{(1 - x^2)^{1/2}}\,dx$$
using 20 points, the 20 roots of the Chebyshev polynomial, $T_{20}(x)$. These roots
and the coefficients $A_k$ are tabulated by Stroud and Secrest.
(b) Check your subroutine by using it to compute
$$\int_{-1}^{1} x^{2n} (1 - x^2)^{-1/2}\,dx$$
for $n = 0(2)30$. Tabulate the computed value of the integral, the exact value, and
the relative error. Plot log (relative error) versus $n$.
2.6 Evaluate
. f1 dx
using Gauss-Legendre quadrature. How many evaluation points are needed to
obtain a result accurate to 5 significant figures? to 12 significant figures?
ANS. 4 point Gauss-Legendre quadrature => 5 significant figures
12 points => 12 significant figures
2.7 From Exercise 10.2.11 the Euler-Mascheroni constant $\gamma$ may be written as
1. $\gamma = -\int_0^{\infty} \ln r\, e^{-r}\,dr$
2. $\gamma = 1.0 - \int_0^{\infty} r \ln r\, e^{-r}\,dr$
[32 points => 3 significant figures]
3. $\gamma = 1.5 - 0.5 \int_0^{\infty} r^2 \ln r\, e^{-r}\,dr$.
(a) Explain why Gauss-Laguerre quadrature should not be attempted on the first
integral.
(b) Evaluate (2) and (3) using a 32-point Gauss-Laguerre quadrature and explain
the very limited accuracy of your results.
2.8 (a) Evaluate the integral
, Г e~x2dx
Ljw\
using Gauss-Hermite quadrature formulas for several values of n (number of
evaluation points),
(b) Rewrite the integral as
7 = 2 Tz-e~xdx>
Jo !+*
and evaluate by Gauss-Laguerre quadrature for several values of n.
ANS. (b) 1.2103.
REFERENCES
Davis, P. J., and P. Rabinowitz. Methods of Numerical Integration. Orlando: Academic
(1975).
Krylov, V. I. (translated by A. H. Stroud), Approximate Calculation of Integrals. New
York: Macmillan (1962).
This is a very clearly written book, which covers virtually all aspects of the approximate
calculation of integrals, and is an excellent discussion of Gaussian and other methods of
numerical quadrature. Tables of evaluation points and weighting factors are also
included.
Stroud, A. H. Numerical Quadrature and Solution of Ordinary Differential Equations,
Applied Mathematics Series, vol. 10. New York: Springer-Verlag (1974).
As an excellent discussion of Gaussian and other methods of numerical quadrature,
this volume also includes tables of evaluation points and weighting factors.
Stroud, A. H., and D. Secrest, Gaussian Quadrature Formulas. Englewood, N.J.: Prentice-Hall
(1966).
This is a valuable book primarily because it contains extensive tables of $x_k$ and $A_k$ for
a wide variety of intervals and weighting factors.
GENERAL REFERENCES
1. E. T. Whittaker and G. N. Watson, A Course of Modern Analysis, 4th ed.
Cambridge: Cambridge University Press (1962) paperback.
Although this is the oldest (original edition 1902) of the references, it
still is the classic reference. It leans strongly to pure mathematics, as of
1902, with full mathematical rigor.
2. P. M. Morse and H. Feshbach, Methods of Theoretical Physics (2 vols.).
New York: McGraw-Hill Book Company (1953).
This work presents the mathematics of much of theoretical physics in
detail but at a rather advanced level. It is recommended as the outstanding
source of information for supplementary reading and advanced study.
3. H. S. Jeffreys and B. S. Jeffreys, Methods of Mathematical Physics,
3rd ed. Cambridge, England: Cambridge Univ. Press (1956).
This is a scholarly treatment of a wide range of mathematical analysis, in
which considerable attention is paid to mathematical rigor. Applications
are to classical physics and to geophysics.
4. R. Courant and D. Hilbert, Methods of Mathematical Physics, Vol. I,
1st English ed. New York: Wiley (Interscience) (1953).
As a reference book for mathematical physics, it is particularly valuable
for existence theorems and discussions of areas such as eigenvalue problems,
integral equations, and calculus of variations.
5. F. W. Byron, Jr. and R. W. Fuller, Mathematics of Classical and
Quantum Physics. Reading, Mass.: Addison-Wesley (1969).
This is an advanced text that presupposes a moderate knowledge of
mathematical physics.
6. C. M. Bender and S. A. Orszag, Advanced Mathematical Methods for
Scientists and Engineers. New York: McGraw-Hill (1978).
7. Handbook of Mathematical Functions with Formulas, Graphs, and
Mathematical Tables. Applied Mathematics Series 55 (AMS-55), National
Bureau of Standards, U.S. Department of Commerce (1964).
As a tremendous compilation of just what the title says, this is an
extremely useful reference.
Additional more specialized references are listed at the end of each chapter.
INDEX
Abel's equation, 875, 878
Addition theorem
Bessel functions, 585
Legendre polynomial, spherical harmonic,
261, 693-698, 913
Adjoint operator, 498
Analytic continuation, 378-380
Bromwich integral, 855
factorial reflection relation, 413, 542
gamma function, 545
Analytic functions, 362
Cauchy integral formula and, 371-376
Cauchy integral theorem and, 365-371
conformal mapping and, 392-394
Angular momentum operator, 42, 46, 108, 109,
451, 684-692
associated Legendre equation, 451
vector spherical harmonics, 711
Anomalous dispersion, 857-859
Antisymmetry, determinants, 170
Associated Laguerre equation, polynomials. See
Laguerre equation; Laguerre functions
Associated Legendre equation, functions. See
Legendre equation; Legendre functions
Asymptotic series, 339-346
applications to computing, 344
Bessel functions, 617-620
confluent hypergeometric functions, 757
cosine, sine integrals, 342
incomplete gamma function, 339
integral representation expansion, 339-343,
616-618, 757
steepest descent, 431-436
Stokes' method, 622
Axial vector. See Pseudovector
Bernoulli functions, 330
Bernoulli numbers, 327-330, 350, 413, 414,
555, 775
Bessel equation, 114, 577
Laplace transform solution, 842-844
self-adjoint form, 500
series solution, 459, 474
singularities, 453
spherical, 116, 622
Bessel functions, 573-636
asymptotic expansion, 616-622
Bessel series, 592
confluent hypergeometric representation, 755,
756
cylindrical waveguide, 600
first kind, 573-591
Fourier transform, 805, 806, 847
generating function, 573, 575, 585
Hankel functions. See Hankel functions
integral representation, 578-580, 588
Laplace transform, 843, 846
modified. See Modified Bessel functions
nonintegral order, 584
orthogonality, 591-596
recurrence relations, 576
second kind. See Neumann functions
series form, 459, 575
spherical, 622-636
asymptotic forms, 627
definitions, 623
orthogonality, 628, 629
recurrence relations, 627
spherical modified, 633, 634
Wronskian relation, 631, 634
Wronskian formulas, 599, 602, 605, 608, 622
zeros, 581
Bessel inequality, 526, 533
Bessel integral, 588
Beta function, 560-565
incomplete, 292, 319, 562, 752
Laplace convolution, 853
Binomial theorem, 307
Biot and Savart law, 32, 674
Bisection (root finding), 964
Boltzmann equation, 866
Born approximation, 916, 923
Bose-Einstein statistics, 950
Boundary conditions, 502
hollow cylinder, 592, 614
integral equations and, 871, 899, 902
magnetic field of current loop, 672-676
ring of charge, 658
sphere in uniform electric field, 656
Sturm-Liouville theory, 503
waveguide, coaxial cable, 600
Branch points, 397
Bromwich integral, 419, 853-861
Calculus of residues, 396-421
evaluation of definite integrals, 403-421, 856
Jordan's lemma, 408
Calculus of variations, 925-974
constraints, 945, 950
Euler equation, 928
Hamilton's principle, 938
Hilbert-Schmidt integral equation, 957
integral equations, applications to, 887
Lagrangian equations, 939, 951
Lagrangian multipliers, 945-950
Rayleigh-Ritz variational technique, 957-961
soap films, 931-937
Sturm-Liouville equation, 957
surface of revolution, 931
Catalan's constant, 292, 320, 334, 551
Cauchy convergence tests, 281-283
Cauchy principal value, 401, 422, 490, 914
Cauchy-Riemann conditions, 360-365, 429
fluid flow, 47, 364
Laplace's equation, 363
polar coordinates, 364
Cauchy's integral formula, 371-376
calculus of residues, 374
derivatives of analytic functions, 373
Cauchy's integral theorem, 365-371
Cauchy-Goursat proof, 368, 369
Causality, 425, 822
Cayley-Klein parameters, 10, 253
Cavity, cylindrical resonant, 582, 590
Chebyshev equations, 735
convergence of series solution, 292
self-adjoint form, 500
singularities, 454
Chebyshev functions, 731-748
discrete orthogonality, 792
Fourier transform, 807
generating functions, 731, 737
Gram-Schmidt construction, 522, 523
hypergeometric representations, 751
orthogonality, 737
recurrence relations, 732
series of
numerical applications, 740-748
truncation, telescoping, 744-746
shifted, 746
trigonometric form, 736, 741
Christoffel symbol, 160, 162
Circular cylindrical coordinates, 95-101
Circular membrane, Bessel functions, 589
Clausen functions, 783
Closure, 529, 536
Bessel functions, 594, 635
spherical harmonics, 684
Completeness
eigenfunctions of Hilbert-Schmidt integral
equation, 893
Fourier series, 761
Sturm-Liouville eigenfunctions, 523-538
Complex variables, 352-395, 396-436
calculus of residues, 396-421
Cauchy-Riemann conditions, 360-365
Cauchy's integral formula, 371-376
Cauchy's integral theorem, 365-371
complex algebra, 353-360
contour integrals, 365, 919
mapping, 384-392
conformal, 392-394
Confluent hypergeometric equation, 753
second solution, 753
singularities, 454
Confluent hypergeometric functions, 753-758
asymptotic expansions, 757
Bessel functions, 755, 756
Hermite functions, 755
Laguerre functions, 755
Whittaker functions, 756
Wronskian relation, 757
Conformal mapping, 392-394
Continuity equation, 40
Contraction of tensors, 124
Contravariant tensor, 120
Convergence, 280-293. See also Infinite series
analytic continuation and, 378-380
improvement of, 288, 296, 334
series solution of
Chebyshev equation, 292
Legendre equation, 288, 291, 464
ultraspherical equation, 292
Convolution theorem
Fourier transforms, 810-814
Laplace transforms, 849-853, 860
Coordinate system. See specific coordinate system
Cosine x, infinite product representation, 348
Cosine integral, 565
asymptotic expansion, 342, 343
confluent hypergeometric representation, 756
Covariant differentiation, 161
Covariant tensor, 120
Cross product of vectors. See Vector product of
vectors
Crossing conditions, 423
Curl
coordinates, cartesian, 42-47
circular cylindrical, 97
curvilinear, 92
spherical polar, 104
integral definition, 55
irrotational, 44, 49, 67, 79, 150
tensor, 166
Curvilinear coordinates, 86-90
differential vector operations, 90-94
metric, 87
scale factors, 87
D'Alembertian, 126, 152, 437
Degeneracy, eigenvalues, 220, 223, 228, 513
Del, 33
successive applications, 47-51
Delta function, Dirac, 81, 481-484
Bessel representation, 797
eigenfunction expansion, 528, 684
Fourier integral, 799, 800
Green's function and, 485, 905, 909
impulse force, 835, 836
Laplace transform, 834
point source, 905, 909
quantum theory, 816
sequences, 483, 484, 488, 490, 780, 804, 805
sine, cosine representations, 805
spherical polar coordinates, 484
theory of distributions, 483, 484
De Moivre's formula, 356
Deuteron, eigenfunction-eigenvalue, 500-502,
819
Descending power series solution, 320
Determinants, 168-176
antisymmetry, 170
Laplacian development by minors, 169
representation of a vector product, 21, 93, 97,
104
secular equation, 221
solution of set
of homogeneous equations, 171, 222, 884
of nonhomogeneous equations, 172
Gauss elimination, 172
Gauss-Jordan elimination, 173
Differential equations, 437-496. See also
specific differential equation
eigenfunctions, eigenvalues, 499
first order, 440-447
exact, 441
linear, 442
separable, 440
Fuchs's theorem, 462, 472
nonhomogeneous
Green's function solution, 480-491
particular solution, 455, 479
numerical solutions, 491-496
second order, absence of third solution, 477
second solution, 467-480, 507
logarithmic term, 473
self-adjoint, 497-509
separation of variables, 111-117, 440, 448-451
series solution (Frobenius), 454-467
singular points, 451-454
Diffusion equation, solutions, 450
Dipoles. See also Electric dipole; Magnetic dipole
interaction energy, 18
radiation fields, 110
Dirac delta function. See Delta function, Dirac
Dirac matrices, 211-213
Direct product
matrices, 179
tensors, 124
Direction cosines, 4, 191. See also Matrices,
orthogonal
identities, 22
orthogonality condition, 11, 194, 195
Dirichlet integral, 563
Dirichlet kernel, 482
Dispersion, anomalous, 857-859
Dispersion theory, 421-428, 803
crossing relations, 423
Hilbert transform, 423
sum rules, 424
symmetry, 423
Divergence
coordinates, cartesian, 37-42
circular cylindrical, 97
curvilinear, 90
spherical polar, 104
integral definition, 55
solenoidal, 41, 49, 79, 150
tensor, 164
Dot product of vectors. See Scalar product of
vectors
Dual tensor, 132, 157. See also Pseudotensor
Duplication formula for factorial functions. See
Legendre duplication formula
Dyadics, 137-140
Eigenfunctions, 499
completeness of, 523-538, 893
degeneracy, 513
expansion of Dirac delta function, 528
of Green's function, 529
of square wave, 512
Hermitian differential operators, 511
integral equations, 891
orthogonality, 511
variational calculation, 958
Eigenvalues, 499, 892
Hermitian differential operators, 510
Hermitian matrices, 219
Hilbert-Schmidt integral equations, 892
normal matrices, 229
real, 219, 892
variational principle for, 958
Eigenvectors
Hermitian matrices, 219
normal matrices, 229
Eight-fold way, 270
Einstein velocity addition law, 157, 275
Elasticity, 140-150
cubic symmetry, 149
Hooke's law, 146, 148
isotropic solid, 149
stress, 142
strain, 140
Electric dipole, 47, 641
Legendre expansion, 641, 676
Electromagnetic invariants, 155, 157
Elliptic integrals, 321-327
first kind, 322
hypergeometric representations, 749
second kind, 323
Error integrals, 568
asymptotic expansion, 345
confluent hypergeometric representation, 756
Essential singularity, 396, 400, 452
Euler angles, 10, 198-200, 204, 256-259
Euler equation, 928
Euler identity, 350
Euler-Maclaurin integration formula, 330-332,
555
Euler-Mascheroni constant, 284, 291, 310, 338,
346, 550, 571, 860
Exponential integral function, 339-341, 566
Laplace transform, 847
Factorial function, 433, 539-572. See also
Gamma function
complex argument, 548
contour integrals, 545
digamma function, 549-556, 597
double factorial notation, 292, 544
infinite product, 541
integral representation, 540, 545
Legendre duplication formula, 556, 561
Maclaurin expansion, 551
polygamma functions, 550
reflection relation, 544
contour integrals, 412, 413, 418
infinite products, 348, 349
relation to gamma function, 543
steepest descent asymptotic formula, 433
Stirling's series, 555-560
Fast Fourier transform, 791
Fermat's principle, 936
Fermi age equation, 809
Fermi-Dirac statistics, 950
Force-potential relation, 36, 64-69
Fourier-Bessel series, 592
Fourier-Mellin integral, 854
Fourier series, 760-793
advantages, 766-769
completeness, 761
differentiation, 779
Gibbs phenomenon, 783-787
integration, 778
interval, change of, 768
orthogonality, 512, 761
square wave, 512, 770, 784
Sturm-Liouville theory, 512, 762
summation of, 763
uniform convergence, 303
Fourier transform, 794-823
aliasing, 790
convolution theorem, 810-814
delta function derivation, 799
discrete transform, 787-792
fast Fourier transform, 791
finite wave train, 801-803
Fourier integral, 797-799
inversion theorem, 800-807
momentum representation, 814-820
solution of integral equation, 875
transfer functions, 820-823
transform of derivatives, 807-810
Fraunhofer diffraction, Bessel functions, 580
Fredholm integral equation, 865. See also
Integral equations
Fresnel integrals, 419, 632, 756, 807
Frobenius' method. See Series solution of
differential equations
Fuchs's theorem, 462, 472
Gamma function, 433, 539-572. See also
Factorial function
complex argument, 548, 551
definite integral (Euler) definition, 540
digamma function, 549-555
infinite limit (Euler) definition, 539
infinite product (Weierstrass) definition, 541
polygamma functions, 550
recurrence relation, 539, 546
reflection identity, 542
Gauge transformation, 74, 156
Gauss' differential equation. See Hypergeometric
differential equation
Gauss' error function, asymptotic expansion, 345
Gauss' law, 74-77, 485
two dimensional case, 78
Gauss' theorem, 57-61, 75, 90
dyadics, 139
Gaussian quadrature, 968-974. See also
Quadrature
Gegenbauer polynomials. See Ultraspherical
polynomials
Generating function, for
Bernoulli numbers, 327
Bessel functions, 416, 573, 575, 585
modified Bessel functions, 416, 613
Chebyshev polynomials, 416, 731
Hermite polynomials, 416, 712
Laguerre polynomials, 416, 723
associated Laguerre polynomials, 725
Legendre polynomials, 416, 637
associated Legendre functions, 668
ultraspherical polynomials, 731
Generators, group, 261-267
Gibbs phenomenon, 783-787
Gradient
constrained derivative, 949
coordinates, cartesian, 33-37
circular cylindrical, 97
curvilinear, 90
spherical polar, 104
integral definition, 55
force-potential relationship, 64-69
Gram-Schmidt orthogonalization, 516-523
Green's functions, 897-923
construction of
one dimension, 898-901
two, three dimensions, 910, 911
delta function, 905, 909
eigenfunction expansion, 529
electrostatic analog, 480, 897
Helmholtz equation, 529, 908, 912
integral equation-differential equation
equivalence, 901-904
Laplace operator, 910-912
circular cylindrical expansion, 914
spherical polar expansion, 911-913
modified Bessel function, 915
modified Helmholtz equation, 908, 912
Poisson's equation, 485, 897
spherical Bessel functions, 921
symmetry property, 486, 901
Green's theorem, 58, 79
Group theory, 237-275
character, 241
continuous groups, 251-275
generator, 261-267
homomorphism, SU(2)-O3+, 255-258
Lorentz group, 271-275
orthogonal group, O3+, 252
rotation matrix, 258, 259
special unitary group, SU(2), 253
definitions, 238
discrete groups, 243-251
dihedral groups, 244
irreducible representations, 240
isomorphism, 239, 242
permutation groups, 249
vierergruppe, 239, 245
Hadamard product, 205
Hamilton's principle and Lagrange equations of
motion, 938-940
Hankel functions, 603-610
asymptotic forms, 431, 618, 619
integral representations, 606-610
series expansion, 604
spherical, 623
Wronskian formulas, 605, 608
Hankel transforms, 795-797
Harmonics. See also Spherical harmonics
sectoral, tesseral, zonal, 685
tensor spherical, 140
vector spherical, 707-711
Heaviside expansion theorem, 400, 861
Heaviside shifting theorem, 840
Heaviside unit step function, 484, 490. See also
Step function
Helmholtz equation, 85, 437, 610, 622, 666
Green's function, 490, 908, 912
solutions, 450
Helmholtz theorem, 78-84
Hermite equation, 714
convergence of series solution, 464
self-adjoint form, 500
singularities, 454
Hermite functions, 712-721
confluent hypergeometric representation, 755
generating function, 712
Gram-Schmidt construction, 522
orthogonality, 714
recurrence relations, 712
Hermitian differential operator, 504, 510-516
completeness of eigenfunctions, 523-538
eigenfunctions, orthogonal, 511
eigenvalues, real, 510
integration interval, 504
in quantum mechanics, 505
Hermitian matrices, 209
real eigenvalues, orthogonal eigenvectors,
219-221
Hilbert matrix, determinant, 175, 233, 535
Hilbert space, 13, 534, 760, 795, 812
Hilbert-Schmidt integral equation, 890-897
variational analog, 957
Hilbert transforms, 423
Hubble's law, 7
Hydrogen atom, 726-728
associated Laguerre equation, 727
electrostatic potentials, 569, 697
momentum representation, 816
Hydrogen molecular ion, 466
Hypergeometric equation, 748
alternate forms, 750
second independent solution, 750
singularities, 454
Hypergeometric functions, 748-752
Chebyshev functions, 751
Legendre functions, 750
Ill-conditioned systems, 233, 535, 829
Impulse function. See Delta function, Dirac
Incomplete gamma function, 319, 339-341,
565-571
confluent hypergeometric representation, 753
recurrence relations, 569
Indicial equation, 456
Inertia, moment of, 217-219
Infinite products, 346-350
convergence, 347
cosine, 347
gamma function, 347, 541
sine, 347
Infinite series, 277-351
algebra of series, 295-299
double series, 297, 298
alternating series, 293, 294
Cauchy criterion, 277
Chebyshev truncation, 744-746
convergence
absolute, 294
conditional, 294
improvement of, 288, 296, 334
uniform, 299
convergence, tests for, 280-293
Abel's, 302
Cauchy integral, 283
Cauchy ratio, 282
Cauchy root, 281
comparison, 280
D'Alembert ratio, 282
Gauss', 287, 290
Kummer's, 285
Maclaurin integral, 283
Raabe's, 286
Weierstrass M, 301
functions, series of, 299-303
geometric series, 278
harmonic series, 279
Leibnitz criterion, 293
partial sums, 277, 340
power series, 313-321
Riemann's theorem, 295
telescoping, 744-746
Integral equations, 865-924
delta function, Dirac, 905, 909
differential equation-integral equation
transformation, 869, 901-904
Fredholm equation, 865
Hilbert-Schmidt theory, 890-897
integral transforms, 874, 875
Neumann series, 879-882, 915
nonhomogeneous integral equation, 894
numerical solution, 885-887
orthogonal eigenfunctions, 891
separable kernel, 882-885
solution by generating function, 876
Volterra equation, 865
Integral transforms, 794-864. See also Fourier
transform; Hankel transform; Laplace
transform; Mellin transform
Fourier, 794-823
Hankel, 795-797
Laplace, 795, 824-863
Mellin, 795, 797
Integrals
contour, 365, 919
differentiation of, 478
evaluation by
beta functions, 560-565
contour integration, 403-421
Lebesgue, 524
Riemann, 365, 511
Stieltjes, 484
Integration, vector, 51-57
line integrals, 51
surface integrals, 53
volume integrals, 54
Interpolating polynomial, 190, 968
Inverse operator, uniqueness of, 826
Inversion of power series, 316
Irreducible
groups, 240
tensors, 134
Isomorphic, 4, 184, 239
Jacobi-Anger expansion, 585
Jacobi identity, 185
Jacobian, 37, 89
Jordan's lemma, 408
Kernels, of integral equations of form k(x - t),
875, 877
separable, 882-886
Kirchhoff diffraction theory, 879, 922
Kronecker delta, 11
mixed second-rank tensor, 122
Kronig-Kramers dispersion relations, 421, 424
Kummer's equation. See Confluent
hypergeometric equation
Kummer's first formula, 754
Lagrangian, 939
Lagrangian multipliers, 945-950
Laguerre equation, 721
associated Laguerre equation, 116, 500, 725
self-adjoint form, 500
self-adjoint form, 500
singularities, 454
Laguerre functions, 721-731
associated Laguerre polynomials, 725-728
orthogonality, 726
recurrence relations, 725
Rodrigues' representation, 726
confluent hypergeometric representation, 755
generating function, 723
Gram-Schmidt construction, 522
integral representation, 722
Laplace transform, 847
orthogonality, 724
recurrence relations, 724
Lame's constants, 147
Laplace equation, 48, 79, 437, 480
minimum energy, 943
solutions, 450, 451
uniqueness of, 79
Laplace transform, 795, 824-863
convolution theorem, 849-853, 860, 875
derivative of transform, 842, 843
integration of transforms, 844, 845
inverse transformation, 826-830, 853-861
solution of integral equation, 875
substitution, 838
table of operations, 862
of transforms, 863
transform of derivatives, 831-838
translation, 840
Laplacian
scalar, 48, 60
coordinates, cartesian, 48
circular cylindrical, 97
curvilinear, 92
spherical polar, 104
tensor, 165
vector, 49
coordinates, cartesian, 49
circular cylindrical, 97
spherical polar, 105
Laurent expansion, 380-383, 573, 761
Legendre duplication formula, 556, 561, 623
Legendre equation, 448, 647
associated Legendre equation, 116, 666
self-adjoint form, 500
convergence of series solution, 291, 464
self-adjoint form, 500
singularities, 454
Legendre functions, 637-711
associated Legendre functions, 106, 666-680
orthogonality, 670-672
parity, 669
recurrence relations, 668
relation between +M and -M, 668, 677
electric multipoles, 641-644, 651, 676
Fourier transform, 807
generating function, 637, 876
Gram-Schmidt construction, 519
hypergeometric representations, 750
Legendre differential equation, 647
Legendre polynomials, 637
Legendre series, 654
orthogonality, 652-663
parity, 649
polarization of dielectric, 661
recurrence relations, 645, 707
ring of electric charge, 658
Rodrigues' formula, 663, 670, 691
Schlaefli integral, 664
second kind, 701-707
closed form solutions, 704-706
series solution of Legendre equation, 464,
701-703
shifted Legendre functions, 522
sphere in uniform electric field, 656
spherical harmonics. See Spherical harmonics
Legendre polynomial addition theorem
derivation from Green's function, 913
group theory, 261
two spherical polar coordinate systems,
693-698
Leibnitz formula, for
differentiating an integral, 478
differentiating a product, 667, 670
π, 318, 763, 777
Lerch's theorem, 826
Levi-Civita symbol, 132, 133, 168
Lie groups, 252
generators, 252, 273
L'Hospital's rule, 310, 567, 596
Linear independence, 5, 468
Linear operator, 113, 184, 188
differential operator, 113
integral transforms, 795, 824
Liouville's theorem, 375, 399
Liquid drop model, 679, 949
Logarithmic integral function, 567
Lommel integrals, 594
Lorentz covariance of Maxwell's equations,
150-164
Lorentz relation, 151
Lorentz transformation, 154, 273
Maclaurin series, 40, 551
Maclaurin theorem, 305
Madelung constant, 298
Magnetic dipole, 47, 110, 676
Magnetic field of current loop, 325, 672-676,
806, 846
Magnetic vector potential. See Potential theory,
vector potential
Mapping, conformal. See Conformal mapping
Matrices, 176-237
adjoint, 210
angular momentum matrices, 186, 187, 262
anticommuting sets, 211-213
antihermitian, 221
definition, 177
direct product, 179
diagonalization, 217-229
Dirac, 211-213
Euler angle rotation, 198-200
Hermitian, 210, 219
unitary relation, 215, 236, 261
ill-conditioned, 233
inverse, 181, 196
Gauss-Jordan matrix inversion, 182
ladder operators, 187
matrix multiplication, 178
moment of inertia, 217
normal, 229-231
orthogonal, 191-205
Pauli spin, 186, 211
quaternions, 185
relation to tensors, 203
similarity transformation, 201-203
trace, 181, 188
transpose, 196
unitary, 210
vector transformation law, 194
Maxwell's equations
derivation of wave equation, 49
dual transformation, 157
Gauss' law relation, 77
Lagrangian for, 945
Lorentz covariance, 150-158
Mellin transforms, 795, 797
Metric, 87, 127, 158, 162, 208
Minkowski space, 134, 152, 272
Mixed tensor, 120
Modified Bessel functions, 610-616
asymptotic expansion, 618, 619
Fourier transform, 806
generating function, 613
I_ν, K_ν, 610, 612
integral representation, 613-616
Laplace transform, 846
recurrence relations, 611, 614
series form, 611
Wronskian relation, 615
Momentum representation, Schrödinger wave
equation, 814-820, 867-869
Morera's theorem, 373, 374
Navier-Stokes equation, 50, 98, 100
Neumann functions, 596-604
asymptotic form, 619
Fourier transform, 806
recurrence relations, 599
series form, 474, 597
spherical Neumann functions, 623
Wronskian formulas, 599, 602
Neutron diffusion theory, 319, 810
Boltzmann transport equation, 866
Newton's root finding formula, 310, 963
Normal matrices, 229-231
Normal modes of vibration, 231-233
Numerical analysis
asymptotic series, 339-346
Bessel, modified Bessel functions, 620, 621
cosine, sine integrals, 342, 343
exponential integral, 339-341
Gauss error function, 345
Stirling's series, 344
Chebyshev truncation, telescoping, 744-746
computation
Bessel functions, 577, 621
Chebyshev polynomials, 734
factorial functions, 556-558
Hermite polynomials, 713
Laguerre polynomials, 724
Legendre polynomials, 647
spherical Bessel functions, 628
convergence of series, improvement of, 288,
296, 334
differential equations, 491-496
first-order, 491
predictor-corrector methods, 493
Runge-Kutta method, 492
second-order, 494
factorial function, 557, 558
integral equations, 885-887
inverse Laplace transform, 829
Rayleigh-Ritz variational technique, 957-961
Nutation, earth's, 833
Oblique coordinates, 164, 206-209, 563
Olbers' paradox, 291
Operators. See also Angular momentum operator
adjoint, 498
del, 33
integral. See Integral transforms
ladder (raising, lowering)
Hermite functions, 716
matrices, 187
spherical harmonics, 687-691
linear differential, 113, 497
Optical dispersion, 424, 857-859
Orthogonal eigenfunctions
Hilbert-Schmidt integral equations, 891
Sturm-Liouville differential equations, 511
Orthogonal polynomials, 520
Orthogonality
curvilinear coordinates, 87
functions, 511
vectors, 15
Orthogonality condition, 11, 194, 197
Orthogonalization, Gram-Schmidt method,
516-523
Oscillator, linear
damping
driving force included, Laplace transform
solution, 851
Laplace transform solution, 838, 845
Green's function, 904
integral equations for, 870, 904
Laplace transform solutions, 832
momentum wave function, 818
quantum mechanical development, 715, 716
scalar potential, 68
self-adjoint equation, 500
series solution of equation, 455
singularities in equation, 454
Parity, 107
Bessel functions, 585
Chebyshev functions, 735
differential operator, 459
Fourier cosine, sine transforms, 801
Hermite functions, 713
Legendre functions, 649
associated, 669
second kind, 706
spherical harmonics, 684
spherical modified Bessel functions, 633
spherical polar coordinates, 107
vector spherical harmonics, 709
Parseval's identity, 780
Parseval's relation, 425, 812
Partial differential equations, 437
boundary conditions, 502
Partial fractions, 827
Particle, quantum mechanical
Lagrangian multipliers, 947
in rectangular box, 117, 947
in right circular cylinder, 587, 948
in sphere, 628, 634, 960
Pauli spin matrices, 186, 211
special unitary group, SU(2), 265-267
Pi (π)
Leibnitz formula, 318, 763, 777
Wallis formula, 348, 565
Pochhammer symbol, 533, 749, 753
Poisson equation, 77, 437, 813
Green's function, 480, 485, 897
Poisson's ratio, 146, 149
Polar vectors, 128
Potential theory, 64-74
conservative force, 65
scalar potential, 64, 80
electrostatic, 151, 592, 596, 614, 656-659,
830, 846
gravitational, 67, 655, 683
vector potential, 47, 69, 80, 105, 110, 151, 325,
678
current loop, 105, 325, 672-676, 708-710
Power series, 313-321
differentiation, integration, 314, 315
inversion, 316
solution of differential equations, 454-467,
473
uniqueness theorem, 315, 320
Principal axes, 219
Projection operators, 518, 538, 654
Pseudoscalar, 131
Pseudotensor, 128-137
definition of, 131
Pseudovector, 131
Quadrature
Gaussian, 968-974
interpolatory formulas, 968
Simpson's rule, 969
Quantum mechanics
angular momentum. See Angular momentum
operator
deuteron, 500-502
expectation values, 506, 815
hydrogen atom
associated Laguerre polynomials, 726-728
momentum representation, 816, 817
hydrogen molecular ion, 466
momentum representation, 814-820, 867
particle. See Particle, quantum mechanical
scattering, 409-411, 660, 915-920
Schrödinger representation, 817
Schrödinger wave equation, 954
sum rules, 427
wave packet, 819
Quaternions, 10, 20, 185
Quotient rule, 126, 127, 135
Radioactive decay, 837
Rayleigh equation, 665, 922
Rayleigh formulas, 628
Rayleigh-Ritz variational method, 957-961
Reciprocity principle, Green's functions, 488,
901
Recurrence relations
Bessel functions, 576
spherical Bessel functions, 627
Chebyshev functions, 732
confluent hypergeometric functions, 757
exponential integral function, 570
factorial functions, 544
gamma functions, 539
Hankel functions, 605
Hermite functions, 712
hypergeometric functions, 750
incomplete gamma function, 569
Laguerre functions, 724
associated Laguerre functions, 725
Legendre functions, 645
associated Legendre functions, 668
second kind, 705
modified Bessel functions, 611, 614
spherical modified Bessel functions, 634
Neumann functions, 599
polygamma functions, 553
Relativistic particle, Lagrangian, 941
Residues
Bromwich integral, 853-861
calculus of residues, 396-421
residue theorem, 400
Riemann-Christoffel curvature tensor, 123
Riemann zeta function, 284, 289, 332, 333, 550
Fourier series evaluation, 772, 773, 775
table of values, 332
Rodrigues representation
Chebyshev polynomials, 735, 738
Hermite polynomials, 713
Laguerre polynomials, 723
associated Laguerre polynomials, 726, 728
Legendre polynomials, 663
associated Legendre polynomials, 681
Rotation
angular momentum and, 261-264
of coordinates, 8-12, 119, 120, 191-203,
261-264
of functions, 264
of vectors, 201, 202
Runge-Kutta solution, 492
Saddle point. See Steepest descent, method of
Scalar, definition of, 1, 9, 16, 119
Scalar potential, 64, 80
Scalar product of vectors, 13-18, 511
Scattering, quantum mechanical, Green's function,
Schmidt orthogonalization. See Gram-Schmidt
orthogonalization
Schrödinger wave equation
hydrogen atom, 726
momentum representation, 820, 867
particle in a sphere, 928
scattering, 915-920
variational approach, 954
Schwarz inequality, 527, 533
generalized, 536
Schwarz reflection principle, 377, 378, 553
Secular equation, 221, 884
Self-adjoint differential equations, 497-509
Self-adjoint differential operator. See Hermitian
differential operator
Semiconvergent series. See Asymptotic series
Separation of variables, 111-117, 440, 448-451
Series solution of differential equations, 451-467
Bessel's equation, 459
Chebyshev series, range of convergence, 292
Hermite's equation, 464
hypergeometric series, range of convergence,
291
incomplete beta function, 292
Legendre's equation, 464, 701-704
range of convergence, 288, 291
recurrence relation, 456
ultraspherical equation, range of convergence,
292
Shifted polynomials
Chebyshev, 746, 747
Legendre, 522
Sine x, infinite product representation, 348
Sine integral, 567
asymptotic representation, 342, 343
confluent hypergeometric representation, 756
Laplace transform, 847
Singularity, 396-400
branch point, 397
differential equation, 451-454, 461
Laurent series, 396
on contour of integration, 408-411
pole, 396
Special unitary group, SU(2), 253, 267
O3+ homomorphism, 255-258
Pauli spin matrices, 265-267
Special unitary group, SU(3), 269
Spherical Bessel functions. See Bessel functions
Spherical harmonics, 680-685
addition theorem, 261, 693-698
Condon-Shortley phase, 682, 692
harmonics; sectoral, tesseral, zonal, 685
tensor spherical, 140, 710
vector spherical, 707-711
integrals, 698-700
ladder operators, 687-691
Laplace series, 682, 685
orthogonality, 681
Spherical polar coordinates, 102-111
Spherical tensor, 135
Spinors, 123, 214
Stark effect, 465
Steepest descent, method of, 428-436
factorial functions, 433
Hankel functions, 431
modified Bessel functions, 435
Step function, 415, 484, 490, 804, 828, 840, 844
Stirling's series, 434, 555-559
Stokes' theorem, 61-64, 92
application to Cauchy integral theorem,
366-368
Stress-strain tensors, 140-145
Sturm-Liouville theory, 497-538, 652, 762, 903
variational analog, 957
Summation convention, 121, 125
Symmetry
differential operators, 459
dispersion relations, 423
dyadics, 138
functions, 458
Green's function, 486, 901
kernels, 890
matrices, 201
tensors, 122
Taylor expansion, 43, 303-313, 376, 377, 491,
767
more than one variable, 309
Tensor analysis, 118-167
contravariant vector, 119
covariant vector, 119
definition of second rank tensor, 120
differential operations, 164-167
isotropic tensor, 122, 123, 136
noncartesian tensors, 158-164
scalar quantity, 119
symmetry-antisymmetry, 122
tensor transformation law, 120
Tensor density. See Pseudotensor
Thermodynamics, exact differentials, 69
Thomas precession, 275
Titchmarsh theorem, 426
Transfer functions, 820-823
Triple scalar product of vectors, 26-28
Triple vector product of vectors, 28-30
BAC-CAB rule, 29, 45, 49, 50
Tschebycheff. See Chebyshev
Ultraspherical equation, 735
polynomials, 643, 731
self-adjoint form, 500
Uncertainty principle in quantum theory, 629,
716, 803
Uniqueness
descending power series, 320, 675
differential equation solution, 463
inverse operator, 826
Laurent expansion, 384
power series, 315, 320, 456
solutions of Laplace's equation, 79
Unit vectors
coordinates, cartesian, 5
circular cylindrical, 96
spherical polar, 103
Variational principles. See Calculus of variations
Vector analysis, 1-84. See also Tensor analysis
components, 4
normal vectors, 15
orthogonal vectors, 15
reciprocal lattice, 28, 32, 207
rotation of coordinates, 8, 193
scalars, 1, 16
triangle law of addition, 1
vector, definitions of, 1, 7-13
vector components, 4
vector transformation law, 10, 119, 194
Vector Laplacian. See Laplacian, vector
Vector potential, 47, 69, 110, 325, 672
Vector product of vectors, 18-26
Vector space, 12, 530-534
Vector spherical harmonics, 707-711
Vierergruppe, 185, 239, 240, 242, 243
Volterra integral equation, 865. See also Integral
equations
Wallis formula for π, 348, 565
Wave equation
anomalous dispersion, 857-859
derivation from Maxwell's equations, 49
Fourier transform solution, 808, 809
Laplace transform solution, 841, 842
Waveguide, coaxial, 101, 600, 603
Whittaker functions, 756
Work, potential, 66
Wronskian
absence of third solution, 477
Bessel functions, 599, 602, 605, 608, 622
spherical, 631, 634
Chebyshev functions, 738
confluent hypergeometric functions, 757
Green's function, construction of, 900
linear independence of functions, 468
second solution of differential equation, 469,
507
solutions of self-adjoint differential equation,
469-471, 507
Young's modulus, 146, 149
Zeta function. See Riemann zeta function
Zeros, of functions, 636, 652, 963-967
of Bessel functions, 581