Chapter 1

Basic Mathematics

While it is desirable to formulate the theories of the physical sciences in terms of the most lucid and simple language, this language often turns out to be mathematics. An equation, with its economy of symbols and its power to avoid misinterpretation, communicates concepts and ideas more precisely than words, provided an agreed mathematical vocabulary exists. In the spirit of this observation, the purpose of this introductory chapter is to review the interpretation of mathematical concepts that feature in the definition of important chemical theories. It is not a substitute for mathematical studies and does not strive to achieve mathematical rigour. It is assumed that the reader is already familiar with algebra, geometry, trigonometry and calculus, but not necessarily with their use in science. An important objective has been a book that is sufficiently self-contained to allow a first reading without consulting too many primary sources. The introductory material should elucidate most mathematical concepts and applications not familiar to the reader. For more detailed assistance one should refer to specialized volumes treating mathematical methods for scientists, e.g. [5, 6, 7, 8, 9]. It may be unavoidable to consult appropriate texts in pure mathematics, of which liberal use has been made here without reference.

1.1 Elementary Vector Algebra

1.1.1 Vectors

Quantities that have both magnitude and direction are called vectors. Examples of vectors in chemistry include the dipole moments of molecules, magnetic and electric fields, velocities and angular momenta. It is convenient to represent vectors geometrically, and the simplest example is a vector, ρ, in two dimensions. It has the magnitude ρ and its direction can be specified by the polar angle φ.

[Figure: a two-dimensional vector ρ with cartesian components (x, y) and polar angle φ.]

It has two components, (x, y), in cartesian coordinates, and these can be transformed into plane polar coordinates to give

x = ρ cos φ
y = ρ sin φ

The magnitude of the vector is ρ = √(x² + y²).
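These transformations are easy to verify numerically. A minimal Python sketch (the function names are illustrative, not from the text):

```python
import math

def to_polar(x, y):
    """Cartesian (x, y) -> plane polar (rho, phi)."""
    rho = math.hypot(x, y)        # rho = sqrt(x^2 + y^2)
    phi = math.atan2(y, x)        # polar angle in radians
    return rho, phi

def to_cartesian(rho, phi):
    """Plane polar (rho, phi) -> cartesian: x = rho cos(phi), y = rho sin(phi)."""
    return rho * math.cos(phi), rho * math.sin(phi)
```

For the 3-4-5 right triangle, `to_polar(3, 4)` gives ρ = 5, and transforming back recovers (3, 4).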

1.1.2 Sum of Vectors

A position vector like ρ above is usually represented as starting at the origin, but it remains unchanged when moved to another position in the plane, as long as its length and direction do not change. Such an operation is used to form the sum of two vectors, A and B. By moving either A or B to start from the end point of the other vector, the same vector sum is obtained. It is easy to see how the components add up to

Cx = Ax + Bx
Cy = Ay + By

1.1.3 Scalar Product

The product of two vectors, A · B, is the simple product of the magnitudes AB only if the two vectors are parallel.


If the angle between the vectors is α, their product, called the scalar or dot product, is defined as

A · B = AB cos α,

which is the magnitude of either vector times the projection of the other (B′ or A′) on the direction of the first:

cos α = B′/B = A′/A
A′B = B′A = AB cos α
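The two forms of the definition, AB cos α and the component sum introduced below, can be checked against each other in a few lines of Python (a sketch; function names are illustrative):

```python
import math

def dot(a, b):
    """Scalar product A . B as the sum of component products."""
    return sum(ai * bi for ai, bi in zip(a, b))

def angle_between(a, b):
    """Angle alpha recovered from A . B = A B cos(alpha)."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    return math.acos(dot(a, b) / (norm_a * norm_b))
```

Perpendicular vectors give a zero dot product, and the recovered angle is π/2.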

1.1.4 Three-dimensional Vectors

Another way to represent vectors is by using unit vectors. If i is a unit vector then any vector v parallel to i can be expressed as a multiple of i. If v has the magnitude v, then iv is equal to v in both magnitude and direction. A vector of unit length is now defined in the direction of each of the cartesian axes.

A two-dimensional vector A is represented as A = iAx + jAy and any other vector as B = iBx + jBy. The scalar product is then written as

A · B = (iAx + jAy) · (iBx + jBy) = i·i AxBx + i·j AxBy + j·i AyBx + j·j AyBy


From the definitions of i and j it follows that

i·i = j·j = 1,  i·j = j·i = 0

so that A · B = AxBx + AyBy. A position vector in three dimensions has components r(x, y, z) in cartesian coordinates, r = ix + jy + kz, with magnitude r following from

r² = ρ² + z² = x² + y² + z²

The transformation between cartesian and spherical polar coordinates is

x = r sin θ cos φ
y = r sin θ sin φ
z = r cos θ

The sum and scalar product of two three-dimensional vectors are similar to those quantities in two dimensions, as seen from the following relationships:

A = iAx + jAy + kAz
B = iBx + jBy + kBz
A = √(Ax² + Ay² + Az²)


The scalar product of the two vectors is still given by

A · B = AB cos α = AxBx + AyBy + AzBz

where α is the angle between the two vectors. The dot product of two unit vectors is equal to the direction cosine relating the two directions. The value of the dot product is a measure of the coalignment of two vectors and is independent of the coordinate system. The dot product therefore is a true scalar, the simplest invariant which can be formed from the two vectors. It provides a useful form for expressing many physical properties: the work done in moving a body equals the dot product of the force and the displacement; the electrical energy density in space is proportional to the dot product of electrical intensity and electrical displacement; quantum-mechanical observations are dot products of an operator and a state vector; the invariants of special relativity are the dot products of four-vectors. The invariants involving a set of quantities may be used to establish whether these quantities are the components of a vector. For instance, if Σᵢ AᵢBᵢ forms an invariant and the Bᵢ are the components of a vector, then the Aᵢ must be the components of another vector.

1.1.5 Vector Product

In three dimensions there is another kind of product between two vectors. Consider the same two vectors A and B together with a third, C, perpendicular to the plane that contains A and B. Now form the scalar products

C · A = CxAx + CyAy + CzAz = 0
C · B = CxBx + CyBy + CzBz = 0


When trying to solve these equations one finds that

Cx = m(AyBz − AzBy)
Cy = m(AzBx − AxBz)
Cz = m(AxBy − AyBx)

where m is some arbitrary constant. If it is defined as m = +1, then one has

Cx² + Cy² + Cz² = (Ax² + Ay² + Az²)(Bx² + By² + Bz²) − (AxBx + AyBy + AzBz)²
               = (A²)(B²) − (A · B)²

Hence

C² = A²B² − A²B² cos² α = A²B²(1 − cos² α) = A²B² sin² α

The vector C can therefore be considered to be the product of A and B, with the magnitude AB sin α. This is the basis of defining the vector product, or cross product, of two vectors, with magnitude

|A × B| = AB sin α

By definition A × B = −B × A, but it may be shown that A × (B + C) = A × B + A × C. The vector products of the unit cartesian vectors are seen to obey the following relations:

i × i = j × j = k × k = 0
i × j = −j × i = k
j × k = −k × j = i
k × i = −i × k = j

Further,

A × B = (iAx + jAy + kAz) × (iBx + jBy + kBz)
      = i(AyBz − AzBy) + j(AzBx − AxBz) + k(AxBy − AyBx)

which may be written as the determinant

        | i   j   k  |
A × B = | Ax  Ay  Az |
        | Bx  By  Bz |


Two vectors commonly represented in terms of cross products are: the angular momentum of a particle about some point, equal to the cross product of the momentum vector of the particle and the radius vector from the origin to the particle; and torque, equal to the cross product of the force vector and the vector representing the lever arm.
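The component formula for the cross product, and the identity C² = A²B² − (A · B)² derived above, can be checked numerically. A Python sketch with illustrative names:

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    """A x B from the determinant expansion of the components."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by,
            az * bx - ax * bz,
            ax * by - ay * bx)
```

The result is perpendicular to both factors, changes sign when the factors are swapped, and satisfies the magnitude identity exactly in integer arithmetic.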

1.1.6 Three-vector Products

The triple scalar product A · (B × C), interpreted geometrically, is the volume of the parallelepiped with sides A, B, C, and is represented by the determinant

              | Ax  Ay  Az |
A · (B × C) = | Bx  By  Bz |
              | Cx  Cy  Cz |

[Figure: parallelepiped with sides A, B, C; its base has area |B × C| and its height is A cos θ.]

Since a determinant changes sign when two of its rows are interchanged it is readily shown that

A · (B × C) = −A · (C × B) = −B · (A × C) = −C · (B × A) = B · (C × A) = C · (A × B)

Since the order of the scalar product is unimportant, it means that

A · (B × C) = (B × C) · A = (A × B) · C = (C × A) · B    (1.1)

The vector triple product A × (B × C) is a vector since it is the product of two vectors, A and B × C. It may be shown by expansion that

A × (B × C) = B(A · C) − C(A · B)    (1.2)

which permits decomposition into two scalar products.
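Both identities (1.1) and (1.2) can be verified numerically; the sketch below uses illustrative function names:

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def triple_scalar(a, b, c):
    """A . (B x C): the signed volume of the parallelepiped with sides A, B, C."""
    return dot(a, cross(b, c))
```

For integer components the checks are exact: cyclic permutations leave the triple scalar product unchanged, and the expansion B(A · C) − C(A · B) reproduces A × (B × C) term by term.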

1.1.7 Complex Numbers

The most important and fundamental difference between quantum mechanics and classical mechanics is the appearance of complex quantities as an essential ingredient of the former. A complex number consists of two parts, a real and a so-called imaginary part, c = a + ib. The imaginary part always contains the quantity i, which represents the square root of −1, i = √−1. The real and imaginary parts of c are often denoted by a = R(c) and b = I(c). All the common rules of ordinary arithmetic apply to complex numbers, which in addition allow extraction of the square root of any negative number. If z = x + iy then the reciprocal of z, called z⁻¹, is given by

z⁻¹ = x/(x² + y²) − i y/(x² + y²)

To prove this, form the product

z(z⁻¹) = (x + iy)(x − iy)/(x² + y²) = (x² + ixy − ixy − i²y²)/(x² + y²) = (x² + y²)/(x² + y²) = 1

Since the real and imaginary parts of a complex number are independent of each other, a complex number is always specified in terms of two real numbers, like the coordinates of a point in a plane, or the two components of a two-dimensional vector. In an Argand diagram a complex number is represented as a point in the complex plane, with a real and an imaginary axis.

[Argand diagram: the point z = x + iy, with real part x and imaginary part y.]


When expressed in polar coordinates, the quantity r is the magnitude (absolute value or modulus) and φ is the argument or phase of the complex number. It follows immediately that

x + iy = r cos φ + i r sin φ

A standard way to represent trigonometric functions is in terms of infinite series:

sin x = Σₙ₌₀^∞ (−1)ⁿ x^(2n+1)/(2n+1)! = x − x³/3! + x⁵/5! − ⋯

cos x = Σₙ₌₀^∞ (−1)ⁿ x^(2n)/(2n)! = 1 − x²/2! + x⁴/4! − ⋯

while the exponential function

eˣ = Σₙ₌₀^∞ xⁿ/n! = 1 + x + x²/2! + x³/3! + ⋯

It is left as an exercise to the reader to show that e^(ix) = cos x + i sin x, which is known as Euler's theorem. In the present notation

r e^(iφ) = r cos φ + i r sin φ = x + iy

where e is the base of natural logarithms, e = 2.7182818⋯. Viewed as an operator, this quantity rotates any vector z = x + iy by an angle φ and increases its length by a factor r. In the polar representation, the product of two complex numbers z₁ = r₁e^(iφ₁) and z₂ = r₂e^(iφ₂) is

z₁z₂ = r₁r₂ e^(i(φ₁+φ₂))

and their quotient

z₁/z₂ = (r₁/r₂) e^(i(φ₁−φ₂))

In all of these formulae, φ must be measured in radians. De Moivre's formula for raising a complex number to a given power is

(r e^(iφ))ⁿ = rⁿ e^(inφ) = rⁿ[cos(nφ) + i sin(nφ)]

The magnitude and phase of a complex number z = x + iy are calculated as

r = √(x² + y²),  φ = arctan(y/x)

The complex conjugate of z = x + iy is defined as z* = (x + iy)* = x − iy, as shown in the following Argand diagram:


z=x-\-iy

z=x-iy

The phase of the complex conjugate¹ is −φ. The magnitude is the same, so

z* = r e^(−iφ)

The magnitude of a complex quantity can be obtained from the product

zz* = r e^(iφ) · r e^(−iφ) = r²

so that r = √(zz*), where the positive square root is taken. Note that zz* is always real and nonnegative.
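Python's built-in complex type and the cmath module implement all of these operations, so the relations above are easy to confirm. A brief sketch:

```python
import cmath
import math

z = 3 + 4j
r, phi = abs(z), cmath.phase(z)      # modulus and argument (in radians)

# Euler: r e^{i phi} reconstructs x + iy
z_back = r * cmath.exp(1j * phi)

# De Moivre: (r e^{i phi})^n = r^n e^{i n phi}
n = 3
de_moivre = (r ** n) * cmath.exp(1j * n * phi)

# the conjugate has the same modulus, and z z* = r^2 is real and nonnegative
zz_star = z * z.conjugate()
```

Here r = 5 for z = 3 + 4i, and zz* = 25 with zero imaginary part.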

1.1.8 N-dimensional Vectors

In the course of this work there will be occasion to use vectors defined in more than three dimensions. Although an n-dimensional vector is difficult to visualize geometrically, the algebraic definition is no more complicated than for three-dimensional vectors. An n-dimensional vector is considered to have n orthogonal components and the vector is defined by specifying all of these as an array. There are two conventions, defining either a column vector, like

c = {c₁, c₂, ⋯, cₙ}

or a row vector, like

c = [c₁, c₂, ⋯, cₙ]

¹The complex conjugate is obtained by changing the sign in front of every i that occurs.


The scalar product of two n-dimensional vectors is only defined between a row vector and a column vector, i.e.

w v = [w₁, w₂, ⋯, wₙ]{v₁, v₂, ⋯, vₙ} = w₁v₁ + w₂v₂ + ⋯ + wₙvₙ = Σᵢ wᵢvᵢ

The Hermitian conjugate c† (dagger) of a column vector c is a row vector with the components cᵢ*. The scalar product of a row vector w† and a column vector v is

w†v = Σᵢ₌₁ⁿ wᵢ*vᵢ

If w†v = 0 the column vectors v and w (or equivalently the row vectors v† and w†) are orthogonal. The value of v†v for any vector is always positive, since

v†v = Σᵢ₌₁ⁿ |vᵢ|²

The length (norm) of the column vector v is the positive root √(v†v). A vector is normalized if its length is 1. Two vectors vᵢ and vⱼ of an n-dimensional set are said to be linearly independent of each other if one is not a constant multiple of the other, i.e., it is impossible to find a scalar c such that vᵢ = cvⱼ. In simple words, this means that vᵢ and vⱼ are not parallel. In general, m vectors constitute a set of linearly independent vectors if and only if the equation

Σᵢ₌₁ᵐ aᵢvᵢ = 0

is satisfied only when all the scalars aᵢ = 0 for 1 ≤ i ≤ m. In other words, the set of m vectors is linearly independent if none of them can be expressed as a linear combination of the remaining m − 1 vectors. A simple test for the linear independence of a set of vectors is to construct the determinant of their scalar products with each other,

    | r₁·r₁  r₁·r₂  ⋯  r₁·rₘ |
Γ = | r₂·r₁  r₂·r₂  ⋯  r₂·rₘ |
    | ⋮      ⋮          ⋮    |
    | rₘ·r₁  rₘ·r₂  ⋯  rₘ·rₘ |


known as the Gram determinant. If Γ = 0 (see 1.2.3), it follows that one of the vectors can be expressed as a linear combination of the remaining m − 1; if Γ ≠ 0, the vectors are linearly independent.
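The Gram-determinant test is straightforward to implement; a Python sketch with the determinant evaluated by cofactor expansion (function names are illustrative):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def gram(vectors):
    """Gram determinant of the vectors' mutual scalar products."""
    return det([[dot(u, v) for v in vectors] for u in vectors])
```

A pair of parallel vectors gives Γ = 0; independent vectors give Γ ≠ 0.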

1.1.9 Quaternions

There is similarity between two-dimensional vectors and complex numbers, but also subtle differences. One striking difference is between the product functions of complex numbers and vectors respectively. The product of two complex numbers is

z₁z₂ = (x₁ + iy₁)(x₂ + iy₂) = (x₁x₂ − y₁y₂) + i(x₁y₂ + x₂y₁) = x₃ + iy₃ = z₃

or in polar form

z₁z₂ = r₁r₂{cos(θ₁ + θ₂) + i sin(θ₁ + θ₂)} = r₃(cos θ₃ + i sin θ₃) = z₃

The product of two vectors is either a scalar,

z₃ = z₁ · z₂ = z₁z₂ cos θ = x₁x₂ + y₁y₂

or a vector,

z₃ = z₁ × z₂,  of magnitude z₁z₂ sin θ = y₁x₂ − y₂x₁

The complex product appears to be made up of two terms, not too unlike the scalar and vector products, from which it seems to differ only because of a sign convention. The extension of vector methods to more dimensions suggests the definition of related hypercomplex numbers. When the multiplication of two three-dimensional vectors is performed without defining the mathematical properties of the unit vectors i, j, k, the formal result is

q = (ix₁ + jy₁ + kz₁)(ix₂ + jy₂ + kz₂)
  = i²x₁x₂ + j²y₁y₂ + k²z₁z₂ + ij x₁y₂ + ji y₁x₂ + ik x₁z₂ + ki z₁x₂ + jk y₁z₂ + kj z₁y₂

By introducing the following definitions

i² = j² = k² = −1
ij = −ji = k
jk = −kj = i
ki = −ik = j


a physically meaningful result,

q = −(x₁x₂ + y₁y₂ + z₁z₂) + i(y₁z₂ − y₂z₁) + j(z₁x₂ − z₂x₁) + k(x₁y₂ − x₂y₁)

is obtained. The first term on the right is the negative of the scalar product of the two vectors, and the remaining three terms define the vector product in terms of the unit vectors i, j, k. The formal multiplication therefore yields the scalar and vector products in one blow. This solution first occurred to Sir William Hamilton as a eureka experience on a bridge in Dublin, into which he carved the result

i² = j² = k² = ijk = −1.

He called the product function a quaternion. In terms of the Hamilton formalism a hypercomplex number can now be defined in the form

a₀ + Σᵢ aᵢeᵢ

where the eᵢ are generalizations of √−1. In the case of three eᵢ the hypercomplex number is called a quaternion and the eᵢ obey the following rules of multiplication:

eᵢ² = −1,  for i = 1, 2, 3
eᵢeⱼ = ±eₖ  (i ≠ j),  with the sign + or − according as ijk is an even or odd permutation of 1, 2, 3.

These numbers do not obey all of the laws of the algebra of complex numbers. They add like complex numbers, but their multiplication is not commutative. The general rules of multiplication of n-dimensional hypercomplex numbers were investigated by Grassmann, who found a number of laws of multiplication, including Hamilton's rule. These methods still await full implementation in physical theory.
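Hamilton's multiplication rules translate directly into code. A sketch representing a quaternion as a tuple (w, x, y, z), with w the scalar part (the representation and names are illustrative):

```python
def qmul(p, q):
    """Hamilton product, using i^2 = j^2 = k^2 = ijk = -1."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)
```

For two pure quaternions (0, v₁) and (0, v₂) the product is (−v₁·v₂, v₁×v₂), reproducing the result above, and the non-commutativity shows up as ij = k but ji = −k.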

1.2 Determinants and Matrices

1.2.1 Introduction

The first successful formulation of quantum mechanics by Heisenberg was achieved in terms of matrices, which have not diminished in importance since then. Matrices are rectangular or, more important in the present context, square arrays of numbers, manipulated only in terms of the well-defined rules of matrix algebra. These procedures probably developed from techniques, familiar to any schoolboy, for solving a set of linear equations, like

a₁₁x₁ + a₁₂x₂ + a₁₃x₃ = b₁
a₂₁x₁ + a₂₂x₂ + a₂₃x₃ = b₂
a₃₁x₁ + a₃₂x₂ + a₃₃x₃ = b₃

In shorthand notation this is written as

| a₁₁  a₁₂  a₁₃ | | x₁ |   | b₁ |
| a₂₁  a₂₂  a₂₃ | | x₂ | = | b₂ |
| a₃₁  a₃₂  a₃₃ | | x₃ |   | b₃ |

or Ax = b, where x and b are recognized as column vectors and A is a 3 × 3 matrix.

1.2.2 Matrix Operations

To reproduce the system of equations it is noted that each row of the matrix is multiplied into the column vector, adding elementary products together to yield the elements of the product vector b. This is a special case of matrix multiplication: the ith row of the pre-factor is multiplied term-by-term with the jth column of the post-factor and the products added to yield the ijth element of the product. The product of two matrices is therefore similar to the scalar product of two vectors. C is the product AB according to

cᵢⱼ = Σₖ₌₁ⁿ aᵢₖbₖⱼ

where n is the number of columns in A; B is required to have as many rows as A has columns. C will have as many rows as A and as many columns as B.
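The row-into-column rule is a triple loop; a minimal sketch:

```python
def matmul(a, b):
    """C = AB with c_ij = sum_k a_ik b_kj.

    B must have as many rows as A has columns; C inherits
    A's number of rows and B's number of columns.
    """
    assert len(a[0]) == len(b), "inner dimensions must agree"
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]
```

Multiplying a matrix into a column vector (an n × 1 matrix) reproduces the system of equations above as a special case.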


If two matrices are square, they can be multiplied together in either order. In general the multiplication is not commutative; that is, AB ≠ BA, except in some special cases. It is said that the matrices do not commute, and this is the property of major importance in quantum mechanics, where it is common practice to define the commutator of two matrices as

[A, B] = AB − BA

However, matrix multiplication is associative,

A(BC) = (AB)C

and matrix multiplication and addition are distributive,

A(B + C) = AB + AC

Two matrices are equal to each other if and only if every element of one is equal to the corresponding element of the other. The two matrices must, of course, have the same numbers of rows and of columns. The sum of two matrices is defined by C = A + B if and only if cᵢⱼ = aᵢⱼ + bᵢⱼ for all i and j. The product of a matrix and a scalar is defined by B = cA if and only if bᵢⱼ = caᵢⱼ for all i and j. An identity matrix is defined to have the property AI = IA = A. Since it gives the same result for pre- and post-multiplication it must be a square matrix, with the form

    | 1  0  ⋯  0 |
I = | 0  1  ⋯  0 |
    | ⋮  ⋮      ⋮ |
    | 0  0  ⋯  1 |

The nonzero elements are those with both indices the same and are called the diagonal elements. The sum of the diagonal elements of a square matrix is called the trace,

Tr A = Σᵢ₌₁ⁿ Aᵢᵢ

The trace of the product of two or more matrices is invariant under cyclic permutation of the factors. The proof is simple:

Tr AB = Σᵢ (AB)ᵢᵢ = Σᵢ Σⱼ AᵢⱼBⱼᵢ = Σⱼ (BA)ⱼⱼ = Tr BA


For the direct product (see 1.2.6), if A ⊗ B = C, then Tr C = Tr A · Tr B. Another important matrix is the transpose, which is obtained by interchanging rows and columns. It is denoted by Ã (tilde) and defined by

(Ã)ᵢⱼ = aⱼᵢ

If a matrix is equal to its transpose, it is said to be a symmetric matrix. If the elements of A are complex numbers, the complex conjugate of A is defined as

A* = [aᵢⱼ*]

The transpose of the complex conjugate,

A† = (Ã*),  with (A†)ᵢⱼ = aⱼᵢ*,

is called the hermitian conjugate. A matrix that is equal to its hermitian conjugate is called hermitian, and these are the matrices used in matrix mechanics, A† = A. A matrix is antihermitian if A† = −A. An orthogonal matrix is one whose transpose is equal to its inverse, Ã = A⁻¹. A unitary matrix is one whose inverse is equal to its hermitian conjugate, A⁻¹ = A† = (Ã)*.
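Non-commutativity and the invariance of the trace are easy to demonstrate numerically (a sketch; names are illustrative):

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def commutator(a, b):
    """[A, B] = AB - BA."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(ab[0]))]
            for i in range(len(ab))]
```

A nonzero commutator shows that AB ≠ BA, while Tr AB = Tr BA always holds.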

1.2.3 Inverse of a Matrix

In order to solve the set of equations it is necessary to find the inverse of A (denoted by A⁻¹) and multiply it into Ax = b, noting that AA⁻¹ = A⁻¹A = I, the unit matrix.

    | 1  0  0 |
I = | 0  1  0 |
    | 0  0  1 |

Hence

A⁻¹Ax = x = A⁻¹b

produces the required solutions. The unit matrix is also represented by the Kronecker delta,

δᵢⱼ = 1 for i = j,  δᵢⱼ = 0 for i ≠ j


The systematic high-school procedure to solve the set of three equations therefore amounts to obtaining the inverse matrix A⁻¹. To follow this procedure it is useful to define the determinant associated with the square matrix of interest, written as

      | a₁₁  a₁₂  a₁₃ |
|A| = | a₂₁  a₂₂  a₂₃ |
      | a₃₁  a₃₂  a₃₃ |

    = a₁₁ | a₂₂  a₂₃ |  −  a₁₂ | a₂₁  a₂₃ |  +  a₁₃ | a₂₁  a₂₂ |
          | a₃₂  a₃₃ |         | a₃₁  a₃₃ |         | a₃₁  a₃₂ |

    = a₁₁a₂₂a₃₃ − a₁₁a₂₃a₃₂ − a₁₂a₂₁a₃₃ + a₁₂a₂₃a₃₁ + a₁₃a₂₁a₃₂ − a₁₃a₂₂a₃₁

The sub-determinants, with appropriate signs, as they appear in the first stage of expansion of the determinant, are examples of cofactors. The minor Mᵢⱼ of an element aᵢⱼ is obtained by deleting the ith row and the jth column from the determinant, whereby the cofactor

Aᵢⱼ = (−1)^(i+j) Mᵢⱼ

The evaluation of a determinant is a tedious process, which for an n × n determinant is summarized by

      | a₁₁  a₁₂  ⋯  a₁ₙ |
|A| = | a₂₁  a₂₂  ⋯  a₂ₙ |  = Σₘ₌₁ⁿ aᵢₘAᵢₘ  (expansion along any row i)
      | ⋮    ⋮        ⋮  |
      | aₙ₁  aₙ₂  ⋯  aₙₙ |

More generally,

Σₘ₌₁ⁿ aᵢₘAⱼₘ = |A| for i = j,  0 for i ≠ j

If the rows and columns of the cofactor matrix are transposed, a matrix Aᵃᵈʲ, called the adjugate of A, is obtained, such that

            | |A|   0    0  |
A · Aᵃᵈʲ =  |  0   |A|   0  |
            |  0    0   |A| |

It follows that

A × (Aᵃᵈʲ/|A|) = I = A A⁻¹

and hence the definition of the inverse matrix as

A⁻¹ = Aᵃᵈʲ/|A|


To calculate the inverse of a matrix by this procedure is equally tedious, and probably more work than solving a set of equations by the brute-force high-school technique. However, the procedure is readily converted into computer code, and this is now the only recommended way to perform matrix inversion. It is noted that the inverse of a matrix only exists if |A| ≠ 0. Any matrix with |A| = 0 is called singular.
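A direct transcription of the cofactor procedure for a 3 × 3 matrix is sketched below (for larger systems, practical code would use a library routine based on elimination rather than cofactors):

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def minor(m, i, j):
    """Delete row i and column j of a 3x3 matrix."""
    return [[m[r][c] for c in range(3) if c != j] for r in range(3) if r != i]

def det3(m):
    return sum((-1) ** j * m[0][j] * det2(minor(m, 0, j)) for j in range(3))

def inverse3(m):
    """A^{-1} = adjugate(A) / |A|; exists only when |A| != 0."""
    d = det3(m)
    if d == 0:
        raise ValueError("singular matrix has no inverse")
    cof = [[(-1) ** (i + j) * det2(minor(m, i, j)) for j in range(3)]
           for i in range(3)]
    # adjugate = transposed cofactor matrix
    return [[cof[j][i] / d for j in range(3)] for i in range(3)]
```

Multiplying A into the result recovers the unit matrix, and a singular matrix is rejected.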

1.2.4 Linear Homogeneous Equations

It is a common problem to solve a set of homogeneous equations of the form Ax = 0. If the matrix is non-singular the only solutions are the trivial ones, x₁ = x₂ = ⋯ = xₙ = 0. It follows that the set of homogeneous equations has non-trivial solutions only if |A| = 0. This means that the matrix has no inverse and a new strategy is required in order to get a solution. Since A is singular, one of the consequences is that the columns (or rows) of the matrix are linearly dependent. This means that a set of non-zero coefficients cᵢ can be found for which an identity relation

| a₁₁ |      | a₁₂ |            | a₁ₙ |
| a₂₁ | c₁ + | a₂₂ | c₂ + ⋯ + | a₂ₙ | cₙ = 0
| ⋮   |      | ⋮   |            | ⋮   |
| aₙ₁ |      | aₙ₂ |            | aₙₙ |

exists between the columns (rows) of the matrix. On the other hand, if xⱼ (1 ≤ j ≤ n) is a set of n linearly independent column vectors (matrices of order n × 1), then any column matrix (vector) y can be expressed as a linear combination of the vectors xⱼ; that is, coefficients cⱼ exist such that

y = c₁x₁ + c₂x₂ + ⋯ + cₙxₙ

If these n column vectors are assembled into a square matrix

    | x₁₁  x₁₂  ⋯  x₁ₙ |
X = | x₂₁  x₂₂  ⋯  x₂ₙ |
    | ⋮    ⋮        ⋮  |
    | xₙ₁  xₙ₂  ⋯  xₙₙ |

then, since the columns of the matrix are linearly independent, the matrix X is non-singular. It follows that any column matrix y of order n can be


expressed in the form y = Xc, and the required values of the cⱼ are given by

c = X⁻¹y.

This result can be generalized into the statement that any arbitrary vector in n dimensions can always be expressed as a linear combination of n basis vectors, provided these are linearly independent. It will be shown that the latent solutions of a singular matrix provide an acceptable set of basis vectors, just like the eigen-solutions of certain differential equations provide an acceptable set of basis functions.

The Latent Roots of a Matrix

A problem of particular importance in quantum theory is the calculation of the special column vectors x and the eigenvalues λ associated with a given square matrix A by the equation Ax = λx, or

(A − λI)x = 0

The solution x = 0 is excluded. There may be several different column vectors x, each with a different value of λ, and each satisfying the equation. The numbers λ are called the latent roots or eigenvalues of A. The vectors x are the latent solutions or eigenvectors of A. The equation, written out in full, is a homogeneous system of linear equations, and will have a solution other than x = 0 if and only if

ai2

^21

<^22 ~ ^

Hi

Q>n2

^2n

= 0

This is an algebraic equation in λ of degree n and will have n roots λ₁, λ₂, ⋯, λₙ (possibly including repeated and complex roots). To each value of λ there will correspond in general a distinct solution x. Let x(1), ⋯, x(n) correspond to λ₁, λ₂, ⋯, λₙ; then

Ax(1) = λ₁x(1)
Ax(2) = λ₂x(2)
⋮
Ax(n) = λₙx(n)

These solutions are not unique; for example, if x(1) is a solution, then kx(1) is also a solution.


To show that the eigenvectors are linearly independent, assume that a linear relationship does exist. Then

c₁x(1) + c₂x(2) + ⋯ + cₖx(k) = 0

If this is multiplied by the matrix (A − λᵢI), terms like

(A − λᵢI)x(j) = Ax(j) − λᵢx(j) = (λⱼ − λᵢ)x(j)

will be generated for all j ≠ i, giving a new linear combination with x(i) missing. Multiplying in succession by (A − λᵢI) (i = 2, ⋯, k) transforms the expression into

c₁(λ₁ − λ₂)(λ₁ − λ₃) ⋯ (λ₁ − λₖ)x(1) = 0

If all the λ's are different, as assumed, it follows that c₁ = 0. In the same way all of the cᵢ can be shown to be zero. The linear relationship therefore cannot exist, and it follows that if A has n distinct eigenvalues the n corresponding eigenvectors provide a set of basis vectors as discussed before.

Diagonalization of a Matrix

A similarity transformation is effected by multiplying a matrix by another matrix and its inverse to produce yet another matrix, according to

Q⁻¹AQ = B

Applied to the characteristic matrix of A,

Q⁻¹[A − λI]Q = [Q⁻¹AQ − λI] = [B − λI]

Moreover, if B = Q⁻¹AQ, then

|B| = |Q⁻¹AQ| = |Q⁻¹| · |A| · |Q| = |Q⁻¹| · |Q| · |A| = (1/|Q|) · |Q| · |A| = |A|

Therefore |A − λI| = |B − λI|, whereby two matrices related by a similarity transformation have the same eigenvalues. Now suppose that B is a diagonal matrix (all off-diagonal elements equal to zero); then the roots of its characteristic equation (eigenvalues) are identical with its diagonal elements. If A is not a diagonal matrix but is related to B by a similarity transformation, it follows that it has the same characteristic equation and roots as B. The problem of finding the eigenvalues


of A is therefore related to reducing the matrix to diagonal form by means of similarity transformations. The aim is to find a matrix X such that X⁻¹AX = Λ = [λᵢδᵢⱼ]. Suppose that the n eigenvectors of A are assembled as the n columns of a matrix X, i.e. the ith component of x(j) is xᵢⱼ, and that the eigenvalues of A are λ₁, λ₂, ⋯, λₙ. Then

AX = | a₁₁  a₁₂  ⋯  a₁ₙ | | x₁₁  x₁₂  ⋯  x₁ₙ |
     | a₂₁  a₂₂  ⋯  a₂ₙ | | x₂₁  x₂₂  ⋯  x₂ₙ |
     | ⋮    ⋮        ⋮  | | ⋮    ⋮        ⋮  |
     | aₙ₁  aₙ₂  ⋯  aₙₙ | | xₙ₁  xₙ₂  ⋯  xₙₙ |

   = | λ₁x₁₁  λ₂x₁₂  ⋯  λₙx₁ₙ |
     | λ₁x₂₁  λ₂x₂₂  ⋯  λₙx₂ₙ |
     | ⋮      ⋮           ⋮   |
     | λ₁xₙ₁  λ₂xₙ₂  ⋯  λₙxₙₙ |

since Ax(j) = λⱼx(j). It follows that

AX = | x₁₁  x₁₂  ⋯  x₁ₙ | | λ₁  0   ⋯  0  |
     | x₂₁  x₂₂  ⋯  x₂ₙ | | 0   λ₂  ⋯  0  |
     | ⋮    ⋮        ⋮  | | ⋮   ⋮       ⋮  |
     | xₙ₁  xₙ₂  ⋯  xₙₙ | | 0   0   ⋯  λₙ |

   = XΛ′, where Λ′ is a diagonal matrix whose elements are the eigenvalues of A. Finally, since X is non-singular its inverse X⁻¹ exists, and premultiplication by X⁻¹ yields the desired result

X⁻¹AX = Λ′ = [λᵢδᵢⱼ]

A related result is often stated in words as: the trace of a matrix is independent of the representation of the matrix, i.e.

Tr A = Tr(SS⁻¹A) = Tr(S⁻¹AS)
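For a 2 × 2 matrix the characteristic equation is simply the quadratic λ² − (Tr A)λ + |A| = 0, which makes the machinery easy to demonstrate. A sketch assuming real roots (names illustrative):

```python
import math

def eig2(m):
    """Eigenvalues of a 2x2 matrix from |A - lambda I| = 0,
    i.e. lambda^2 - (tr A) lambda + |A| = 0."""
    (a, b), (c, d) = m
    tr = a + d
    det = a * d - b * c
    disc = math.sqrt(tr * tr - 4 * det)   # assumes real eigenvalues
    return (tr + disc) / 2, (tr - disc) / 2
```

A diagonal or triangular matrix has its diagonal elements as eigenvalues, as stated above for the diagonal case.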

1.2.5 Linear Transformations

One important application of matrix algebra is in formulating the transformations of points or vectors which define a geometrical entity in space. In ordinary three-dimensional space, with its three axes, any point is located by means of three coordinates measured along these axes. Similarly, in n-dimensional vector space a set of n independent vectors is required to span the whole space. A linear transformation in general is brought about by an operation such as translation, twisting, rotation, stretching or some other kind of distortion of a vector to produce another vector in the same space. The operation is denoted by Tu = v and the space is said to be invariant under the action of T. If the outcome is unique for all u, and the inverse transformation is also uniquely defined, T is said to be a one-to-one mapping of the space onto itself. An operator such as T can be represented by some matrix A, with appropriate elements, chosen in such a way as to mimic the operation of T when multiplied into the vector that represents u, e.g.

x = e₁x₁ + e₂x₂ + … + eₙxₙ

where the eᵢ are unit vectors. Hence

x′ = Σᵢ,ₖ eᵢAᵢₖxₖ

The transformed vector can also be expressed in terms of the unit vectors as

x′ = e₁x′₁ + e₂x′₂ + … + eₙx′ₙ

It follows that the transformation is represented by Ax = x′. If, in addition, x = Bx″, then AB = C is a matrix which transforms x″ directly into x′. If P and Q are non-singular matrices, then A and B are said to be equivalent when B = PAQ. An important special case arises when PQ = I, when

B = Q⁻¹AQ

which is known as a similarity transformation. The matrices A and B, in this case, are said to be transforms of each other. If the matrix elements are complex the transformation is called unitary. The special class of transformation known as a symmetry (or unitary) transformation preserves the shape of geometrical objects, and in particular the norm (length) of individual vectors. For this class of transformation the symmetry operation becomes equivalent to a transformation of the coordinate system. Rotation, translation, reflection and inversion are obvious examples of such transformations. If the discussion is restricted to real vector space the transformations are called orthogonal. The procedure to find symmetry transformation matrices will be demonstrated here for two-dimensional rotation.


[Figure: rotation of the point (r, φ) through an angle θ to (r, φ + θ).]

The radial coordinate r remains unchanged during rotation of the vector through an angle θ about the z-axis, while the polar angle φ becomes φ + θ. In terms of cartesian components,

x′ = r cos(φ + θ) = r(cos φ cos θ − sin φ sin θ) = x cos θ − y sin θ
y′ = r sin(φ + θ) = r(sin φ cos θ + cos φ sin θ) = y cos θ + x sin θ

Thus

| x′ |   | cos θ  −sin θ | | x |
| y′ | = | sin θ   cos θ | | y |

This operation corresponds to a counterclockwise rotation of the vector r. Since cos θ = cos(−θ) and sin θ = −sin(−θ), the matrix for clockwise rotation is

| cos θ   sin θ |
| −sin θ  cos θ |

The matrix equation for clockwise rotation through θ about the z axis becomes

| x′ |   | cos θ   sin θ  0 | | x |
| y′ | = | −sin θ  cos θ  0 | | y |
| z′ |   | 0       0      1 | | z |

An improper rotation, which has the same effect as a proper rotation but in addition changes the sign of z, is represented by the matrix

| cos θ   sin θ   0 |
| −sin θ  cos θ   0 |
| 0       0      −1 |
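The counterclockwise rotation matrix acting on (x, y) can be sketched as:

```python
import math

def rotate_ccw(theta, v):
    """Counterclockwise rotation of (x, y) through theta about the z-axis."""
    x, y = v
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))
```

Rotating through π/2 carries (1, 0) to (0, 1), and the norm of any vector is preserved, as expected for an orthogonal transformation.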

Other symmetry operations and their matrices are as follows. Reflection in the xy cartesian plane:

| 1  0   0 | | x |   |  x |
| 0  1   0 | | y | = |  y |
| 0  0  −1 | | z |   | −z |

Inversion through the origin:

| −1   0   0 | | x |   | −x |
|  0  −1   0 | | y | = | −y |
|  0   0  −1 | | z |   | −z |

The identity operation:

| 1  0  0 | | x |   | x |
| 0  1  0 | | y | = | y |
| 0  0  1 | | z |   | z |

1.2.6 Direct Sums and Products

The direct sum of two square matrices A = [Aᵢⱼ] of order m and B = [Bᵢⱼ] of order n is a square matrix C of order m + n defined by

C = A ⊕ B = | A   O₁ |
            | O₂  B  |

where O₁ and O₂ are null matrices of order m × n and n × m respectively. This idea can easily be extended to more than two matrices to yield a matrix with non-vanishing elements in square blocks along the main diagonal and zeros elsewhere. Such a block-diagonal matrix (e.g. D = A ⊕ B ⊕ C) has the self-evident important properties:

det D = (det A)(det B)(det C)
trace D = trace A + trace B + trace C

The direct product of two matrices is best explained in terms of an example.

The direct product of $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ and $B = \begin{bmatrix} e & f & g \\ h & k & n \\ r & s & t \end{bmatrix}$ is

$$C = A \otimes B = \begin{bmatrix} aB & bB \\ cB & dB \end{bmatrix} = \begin{bmatrix} ae & af & ag & be & bf & bg \\ ah & ak & an & bh & bk & bn \\ ar & as & at & br & bs & bt \\ ce & cf & cg & de & df & dg \\ ch & ck & cn & dh & dk & dn \\ cr & cs & ct & dr & ds & dt \end{bmatrix}$$
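The direct product above can be checked numerically: in NumPy it is available as `numpy.kron`. A small sketch (the numerical matrices are arbitrary stand-ins for the symbolic $A$ and $B$, not from the text):

```python
import numpy as np

# Hypothetical 2x2 and 3x3 matrices standing in for A and B in the example.
A = np.array([[1., 2.], [3., 4.]])
B = np.array([[0., 1., 2.], [1., 0., 1.], [2., 1., 0.]])

C = np.kron(A, B)          # direct (Kronecker) product, a 6x6 matrix
assert C.shape == (6, 6)
# The upper-left 3x3 block is a*B with a = A[0, 0].
assert np.allclose(C[:3, :3], A[0, 0] * B)

# Mixed-product property: (A1 x B1)(A2 x B2) = (A1 A2) x (B1 B2)
A2 = np.array([[2., 0.], [1., 1.]])
B2 = np.eye(3) + B
assert np.allclose(np.kron(A, B) @ np.kron(A2, B2),
                   np.kron(A @ A2, B @ B2))
```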

The concept can once again be extended to the direct product of more than two matrices. If $A_1$, $A_2$, $B_1$, and $B_2$ are any matrices whose dimensions are such that the ordinary matrix products $A_1A_2$ and $B_1B_2$ are defined, then the direct product has the property

$$(A_1 \otimes B_1)(A_2 \otimes B_2) = (A_1A_2) \otimes (B_1B_2)$$

Further, if $F$ is the direct product of matrices $A$, $B$, $C$, ..., that is, $F = A \otimes B \otimes C \otimes \ldots$, then

$$\operatorname{trace} F = (\operatorname{trace} A)(\operatorname{trace} B)(\operatorname{trace} C)\ldots$$

The operation of direct product of matrices is both associative and also distributive with respect to matrix addition, and hence finally

$$(AB) \otimes (AB) \otimes (AB) = (AB) \otimes [(A \otimes A)(B \otimes B)] = (A \otimes A \otimes A)(B \otimes B \otimes B)$$

i.e.

$$(AB)^{[k]} = A^{[k]}B^{[k]}$$

where $A^{[k]} = A \otimes A \otimes \ldots \otimes A$ ($k$ times).

1.3 Vector Fields

In formulating physical problems it is often necessary to associate with every point $(x,y,z)$ of a region $R$ of space some vector $a(x,y,z)$. It is usual to call $a(x,y,z)$ a vector function and to say that a vector field exists in $R$. If a scalar $\phi(x,y,z)$ is defined at every point of $R$ then a scalar field is said to exist in $R$.

1.3.1 The Gradient

If the scalar field in $R$ is differentiable the gradient of $\phi(x,y,z)$ is defined as

$$\operatorname{grad}\phi = \nabla\phi = i\frac{\partial\phi}{\partial x} + j\frac{\partial\phi}{\partial y} + k\frac{\partial\phi}{\partial z}$$

(The symbol $\nabla$ is pronounced nabla or del.) Clearly $\operatorname{grad}\phi$ is a vector function whose $(x,y,z)$ components are the first partial derivatives of $\phi$. The gradient of a vector function is undefined. Consider an infinitesimal vector displacement such that

$$\frac{dr}{dt} = i\frac{dx}{dt} + j\frac{dy}{dt} + k\frac{dz}{dt}$$

where $t$ is some parameter. Then

$$\nabla\phi \cdot \frac{dr}{dt} = \frac{\partial\phi}{\partial x}\frac{dx}{dt} + \frac{\partial\phi}{\partial y}\frac{dy}{dt} + \frac{\partial\phi}{\partial z}\frac{dz}{dt} = \frac{d\phi}{dt},$$

the total differential coefficient of $\phi(x,y,z)$ with respect to $t$. Hence, considering a surface defined by $\phi(x,y,z) = \text{constant}$, it follows from (5) that $\nabla\phi$ must be perpendicular to $dr/dt$ (since $d\phi/dt = 0$). In other words $\nabla\phi$ is a vector normal to the surface $\phi(x,y,z) = \text{constant}$ at every point. If $n$ is a unit vector normal to this surface in the direction of increasing $\phi(x,y,z)$ and $d\phi/dn$ is the derivative in this direction, then

$$\nabla\phi = n\frac{d\phi}{dn}$$

1.3.2 The Laplacian

The $\nabla$ operator may be used to define further quantities, such as $\nabla^2\phi$ (nabla squared $\phi$), defined as the scalar product

$$\nabla\cdot\nabla\phi = \left(i\frac{\partial}{\partial x} + j\frac{\partial}{\partial y} + k\frac{\partial}{\partial z}\right)\cdot\left(i\frac{\partial\phi}{\partial x} + j\frac{\partial\phi}{\partial y} + k\frac{\partial\phi}{\partial z}\right) = \frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} + \frac{\partial^2\phi}{\partial z^2}$$

which is the three-dimensional Laplacian, so called after Laplace's equation, which in this notation becomes

$$\nabla^2\phi = 0 \qquad (1.6)$$

1.3.3 The Divergence

The operator $\nabla\cdot$ may be applied to any vector function and the result is called the divergence, e.g.

$$\nabla\cdot a(x,y,z) = \operatorname{div} a = \left(i\frac{\partial}{\partial x} + j\frac{\partial}{\partial y} + k\frac{\partial}{\partial z}\right)\cdot(ia_x + ja_y + ka_z) = \frac{\partial a_x}{\partial x} + \frac{\partial a_y}{\partial y} + \frac{\partial a_z}{\partial z}$$

The divergence operator is the three-dimensional analogue of the differential $du$ of the scalar function $u(x)$ of one variable. The analogue of the derivative is the net outflow integral that describes the flux of a vector field across a surface $S$,

$$\oint_S F\cdot d\sigma$$

The flux per unit volume at the point $(x,y,z)$ defines the divergence

$$\operatorname{div} F = \lim_{\text{volume}\to 0}\frac{\oint F\cdot d\sigma}{\text{volume}} = \nabla\cdot F$$

The outflow is therefore described equally well by the total divergence inside $S$, and hence¹

$$\int_V \nabla\cdot F\,d\tau = \oint_S F\cdot d\sigma$$

where $F = \rho v$ is the mass flux density, defined in terms of density $\rho$ and velocity of flow $v$. For a fluid in a region of space where there is neither a source nor a sink, the total mass flow out of the region must balance the rate of decrease of the total mass in the volume. This balance means that

$$\int_V \nabla\cdot(\rho v)\,d\tau = -\frac{\partial}{\partial t}\int_V \rho\,d\tau$$

and hence the continuity condition

$$\frac{\partial\rho}{\partial t} + \operatorname{div}(\rho v) = 0 \qquad (1.7)$$

should hold everywhere in the volume.

¹This integral relation is known as Gauss's theorem. The most familiar example is in electrostatics.

Unlike $\operatorname{grad}\phi$, $\operatorname{div} a$ is seen to be a scalar function. The Laplacian in terms of this notation is written

$$\nabla^2\phi = \nabla\cdot\nabla\phi = \operatorname{div}\operatorname{grad}\phi$$
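The identity $\operatorname{div}\operatorname{grad}\phi = \nabla^2\phi$ can be illustrated with central differences. A short Python sketch (the test field $\phi = x^2+y^2+z^2$, for which $\nabla^2\phi = 6$ everywhere, and the step size are arbitrary choices, not from the text):

```python
# Central-difference check that div(grad(phi)) equals the Laplacian.
# phi = x^2 + y^2 + z^2 has grad(phi) = (2x, 2y, 2z) and Laplacian 6.
h = 1e-4

def phi(x, y, z):
    return x*x + y*y + z*z

def laplacian(f, x, y, z):
    return ((f(x+h, y, z) - 2*f(x, y, z) + f(x-h, y, z)) / h**2
          + (f(x, y+h, z) - 2*f(x, y, z) + f(x, y-h, z)) / h**2
          + (f(x, y, z+h) - 2*f(x, y, z) + f(x, y, z-h)) / h**2)

assert abs(laplacian(phi, 0.3, -1.2, 0.7) - 6.0) < 1e-5
```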

1.3.4 The Curl

Instead of the dot product, the vector product of the nabla operator can also be formed, to produce a function called curl or rot,

$$\nabla\times a = \operatorname{curl} a = \left(i\frac{\partial}{\partial x} + j\frac{\partial}{\partial y} + k\frac{\partial}{\partial z}\right)\times(ia_x + ja_y + ka_z)$$

Noting that $i\times j = k$, $j\times k = i$, $k\times i = j$ and $i\times j = -j\times i$, etc., it follows that

$$\operatorname{curl} a = i\left(\frac{\partial a_z}{\partial y} - \frac{\partial a_y}{\partial z}\right) + j\left(\frac{\partial a_x}{\partial z} - \frac{\partial a_z}{\partial x}\right) + k\left(\frac{\partial a_y}{\partial x} - \frac{\partial a_x}{\partial y}\right) = \begin{vmatrix} i & j & k \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ a_x & a_y & a_z \end{vmatrix}$$

The vector $\operatorname{curl} F$ is a measure of the vorticity of the field at $P(x,y,z)$ and is related to the net circulation integral per unit area around an element of area at $P$. The circulation of a vector field along a closed curve $c$ is defined by the line integral

$$\Gamma_c = \oint_c (F_x\,dx + F_y\,dy + F_z\,dz)$$

The direction of integration decides the sign of the integral, $\oint_c = -\oint_{-c}$. Consider circulation at constant speed $v$ around a closed rectangular loop on the plane $z = \text{constant}$:

[Figure: a rectangular loop with corners numbered 1 to 4 and sides $\Delta x$ and $\Delta y$.]

$$\Gamma_z = \oint_c (v_x\,dx + v_y\,dy)$$
$$\simeq \Delta x\,[v_x(x,y) - v_x(x, y+\Delta y)] + \Delta y\,[v_y(x+\Delta x, y) - v_y(x, y)]$$
$$\simeq \Delta x\left(-\Delta y\,\frac{\partial v_x}{\partial y}\right) + \Delta y\left(\Delta x\,\frac{\partial v_y}{\partial x}\right)$$

so that

$$\frac{\Gamma_z}{\Delta x\,\Delta y} = \frac{\partial v_y}{\partial x} - \frac{\partial v_x}{\partial y}$$

If $\Gamma_z$ is interpreted as the $z$-component of a vector, the three-dimensional circulation integral per unit area defines the vector $\operatorname{curl} v$. Since $\operatorname{curl} a$ is a vector function one can form its divergence,

$$\operatorname{div}\operatorname{curl} a = \nabla\cdot(\nabla\times a) = \begin{vmatrix} \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ a_x & a_y & a_z \end{vmatrix} = 0$$

since two rows of the determinant are identical. Hence $\operatorname{div}\operatorname{curl} a = 0$ for all $a$. Conversely, if $b$ is a vector function such that $\operatorname{div} b = 0$, then it may be inferred that $b$ is the curl of some vector function $a$. Vector functions with identically zero divergence are said to be solenoidal. Likewise, since $\operatorname{grad}\phi$ is a vector function it is allowed to take its curl, i.e.

$$\nabla\times\nabla\phi = \operatorname{curl}\operatorname{grad}\phi = \begin{vmatrix} i & j & k \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ \dfrac{\partial\phi}{\partial x} & \dfrac{\partial\phi}{\partial y} & \dfrac{\partial\phi}{\partial z} \end{vmatrix} = 0$$

Hence $\operatorname{curl}\operatorname{grad}\phi = 0$ for all $\phi$. Again, conversely, it may be inferred that if $b$ is a vector function with identically zero curl, then $b$ must be the gradient of some scalar function. Vector functions with identically zero curl are said to be irrotational. Using the definitions above many vector identities can be derived, and these are listed in many sources, e.g. [10]. Some useful identities are

$$\nabla\cdot(\phi a) = \phi(\nabla\cdot a) + a\cdot\nabla\phi \qquad (1.8)$$
$$\nabla\times(\nabla\times a) = \nabla(\nabla\cdot a) - \nabla^2 a \qquad (1.9)$$
$$\nabla(a\cdot b) = a\times\operatorname{curl} b + (a\cdot\nabla)b + b\times\operatorname{curl} a + (b\cdot\nabla)a \qquad (1.10)$$
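The identity $\operatorname{curl}\operatorname{grad}\phi = 0$ can also be illustrated numerically with central differences. A Python sketch (the field $\phi = xyz$, the step size and the evaluation point are arbitrary choices, not from the text):

```python
# Finite-difference illustration that curl(grad(phi)) vanishes.
h = 1e-4

def grad(f, x, y, z):
    return ((f(x+h, y, z) - f(x-h, y, z)) / (2*h),
            (f(x, y+h, z) - f(x, y-h, z)) / (2*h),
            (f(x, y, z+h) - f(x, y, z-h)) / (2*h))

def curl(F, x, y, z):
    # F maps (x, y, z) to a 3-tuple (Fx, Fy, Fz)
    dFz_dy = (F(x, y+h, z)[2] - F(x, y-h, z)[2]) / (2*h)
    dFy_dz = (F(x, y, z+h)[1] - F(x, y, z-h)[1]) / (2*h)
    dFx_dz = (F(x, y, z+h)[0] - F(x, y, z-h)[0]) / (2*h)
    dFz_dx = (F(x+h, y, z)[2] - F(x-h, y, z)[2]) / (2*h)
    dFy_dx = (F(x+h, y, z)[1] - F(x-h, y, z)[1]) / (2*h)
    dFx_dy = (F(x, y+h, z)[0] - F(x, y-h, z)[0]) / (2*h)
    return (dFz_dy - dFy_dz, dFx_dz - dFz_dx, dFy_dx - dFx_dy)

phi = lambda x, y, z: x * y * z
grad_phi = lambda x, y, z: grad(phi, x, y, z)

c = curl(grad_phi, 0.5, -1.0, 2.0)   # should be (0, 0, 0) to rounding
assert all(abs(v) < 1e-6 for v in c)
```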

1.3.5 Orthogonal Curvilinear Coordinates

It is often necessary to express $\operatorname{div} A$, $\operatorname{grad} V$, $\operatorname{curl} A$, and $\nabla^2 V$ in terms of coordinates other than rectangular. Of special importance for problems with spherical symmetry are spherical polar coordinates, already introduced before. Any three-dimensional coordinate system consists of three sets of surfaces that intersect each other. On each surface a certain quantity, the coordinate, is constant. This coordinate has a different value for each surface of the set, and it will be assumed that there is a continuum of surfaces, represented by all possible values of the coordinate. In Cartesian coordinates the surfaces are planes intersecting each other at right angles. The intersection of three of these planes determines a point, and the point is designated by the values of the three coordinates that specify the planes. Similarly, in spherical polar coordinates the surfaces are a set of concentric spheres specified by the value of $r$, a set of planes which all pass through the polar axis and are specified by the values of $\varphi$, and a set of circular cones, with vertices at the origin, specified by the values of the variable $\theta$. In these two examples the surfaces intersect each other at right angles, and consequently these coordinate systems are examples of orthogonal coordinates.

Any three-dimensional orthogonal coordinate system may be specified in terms of the three coordinates $q_1$, $q_2$ and $q_3$. Because of the orthogonality of the coordinate surfaces, it is possible to set up, at any point, an orthogonal set of three unit vectors $e_1$, $e_2$, $e_3$, in the directions of increasing $q_1$, $q_2$, $q_3$, respectively. It is important to select the $q_i$ such that the unit vectors define a right-handed system of axes. The set of three unit vectors defines a Cartesian coordinate system that coincides with the curvilinear system in the immediate neighbourhood of this one point. In the diagram above the polar unit vectors are shown at a point $P$, together with the cartesian unit vectors and axes. The differentials of the curvilinear system, i.e. the orthogonal elements of length, are connected with the differentials of the $q_i$ by the relations

$$ds_1 = h_1\,dq_1 \qquad ds_2 = h_2\,dq_2 \qquad ds_3 = h_3\,dq_3$$

where the $h_i$ are functions of the coordinates $q_i$ and vary from point to point. In cylindrical coordinates

$$ds_1 = dr \qquad ds_2 = r\,d\theta \qquad ds_3 = dz$$

so that

$$h_1 = 1 \qquad h_2 = r \qquad h_3 = 1$$

In polar coordinates

$$ds_1 = dr \qquad ds_2 = r\,d\theta \qquad ds_3 = r\sin\theta\,d\varphi$$

$$h_1 = 1 \qquad h_2 = r \qquad h_3 = r\sin\theta$$

In general, the $q_i$ do not have the dimensions of length, and the $h_i$ are necessary to translate a change in $q$ into a length. The three surface elements are $h_1h_2\,dq_1dq_2$, $h_1h_3\,dq_1dq_3$ and $h_2h_3\,dq_2dq_3$, while the typical volume element is $h_1h_2h_3\,dq_1dq_2dq_3$. An expression for $\operatorname{grad} w$ follows immediately as

$$\operatorname{grad} w = \frac{1}{h_1}\frac{\partial w}{\partial q_1}e_1 + \frac{1}{h_2}\frac{\partial w}{\partial q_2}e_2 + \frac{1}{h_3}\frac{\partial w}{\partial q_3}e_3$$

where the $e_i$ are unit vectors in the directions of the elements $h_i\,dq_i$. Substitution of the proper line elements gives the components of the gradient in cylindrical and in polar coordinates as

$$\operatorname{grad}_r w = \frac{\partial w}{\partial r} \qquad \operatorname{grad}_\varphi w = \frac{1}{r}\frac{\partial w}{\partial\varphi} \qquad \operatorname{grad}_z w = \frac{\partial w}{\partial z}$$

and

$$\operatorname{grad}_r w = \frac{\partial w}{\partial r} \qquad \operatorname{grad}_\theta w = \frac{1}{r}\frac{\partial w}{\partial\theta} \qquad \operatorname{grad}_\varphi w = \frac{1}{r\sin\theta}\frac{\partial w}{\partial\varphi}$$

respectively.

To derive an expression for $\operatorname{div} A$ in curvilinear coordinates, the net outward flow through the surface bounding the volume element $dv$ is defined as $\operatorname{div} A\,dv$.

[Figure: a curvilinear volume element with edges $h_1dq_1$, $h_2dq_2$, $h_3dq_3$.]

It is important to note that opposite faces of elementary volumes no longer have equal areas. Suppose that the surface on the left has area $a_1 = h_2h_3\,dq_2dq_3$ and that the component of the vector $A$ in the direction of $h_1dq_1$ has the value $A_1$ at this surface. The outward flow through this surface is $-a_1A_1$. The contribution from the opposite face will be

$$a_1A_1 + \frac{\partial}{\partial q_1}(a_1A_1)\,dq_1$$

The net outward flow through the pair of faces therefore is

$$\frac{\partial}{\partial q_1}(a_1A_1)\,dq_1 = \frac{\partial}{\partial q_1}(A_1h_2h_3)\,dq_1dq_2dq_3$$

Similar contributions from the other two pairs of faces yield the net flow across the entire surface as

$$\left[\frac{\partial}{\partial q_1}(A_1h_2h_3) + \frac{\partial}{\partial q_2}(A_2h_3h_1) + \frac{\partial}{\partial q_3}(A_3h_1h_2)\right]dq_1dq_2dq_3$$

Hence

$$\operatorname{div} A = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial q_1}(A_1h_2h_3) + \frac{\partial}{\partial q_2}(A_2h_3h_1) + \frac{\partial}{\partial q_3}(A_3h_1h_2)\right]$$

The expression for $\operatorname{curl} A$ is obtained in an analogous manner as

$$\operatorname{curl} A = \frac{1}{h_1h_2h_3}\begin{vmatrix} h_1e_1 & h_2e_2 & h_3e_3 \\ \dfrac{\partial}{\partial q_1} & \dfrac{\partial}{\partial q_2} & \dfrac{\partial}{\partial q_3} \\ h_1A_1 & h_2A_2 & h_3A_3 \end{vmatrix}$$

The Laplacian in curvilinear coordinates follows from the definition

$$\nabla^2 V = \operatorname{div}\operatorname{grad} V = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial q_1}\left(\frac{h_2h_3}{h_1}\frac{\partial V}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\left(\frac{h_3h_1}{h_2}\frac{\partial V}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\left(\frac{h_1h_2}{h_3}\frac{\partial V}{\partial q_3}\right)\right]$$

Appropriate substitutions show that in cylindrical coordinates

$$\nabla^2 V = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial V}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2 V}{\partial\theta^2} + \frac{\partial^2 V}{\partial z^2}$$

and in polar coordinates

$$\nabla^2 V = \operatorname{div}\operatorname{grad} V = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial V}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial V}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 V}{\partial\varphi^2} \qquad (1.11)$$
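The radial term of (1.11) is easy to exercise numerically for angle-independent fields. A Python sketch (the test functions $V = r^2$, with $\nabla^2 V = 6$, and $V = 1/r$, which solves Laplace's equation away from the origin, are standard checks, not taken from the text):

```python
h = 1e-4

def laplacian_radial(V, r):
    """(1/r^2) d/dr (r^2 dV/dr), the radial part of the spherical (polar)
    Laplacian, evaluated by nested central differences."""
    dV = lambda s: (V(s+h) - V(s-h)) / (2*h)
    g  = lambda s: s*s * dV(s)
    return (g(r+h) - g(r-h)) / (2*h) / (r*r)

# V = r^2 is angle-independent, so the full Laplacian reduces to the
# radial term and equals 6 everywhere.
assert abs(laplacian_radial(lambda r: r*r, 1.7) - 6.0) < 1e-3

# V = 1/r solves Laplace's equation away from the origin.
assert abs(laplacian_radial(lambda r: 1.0/r, 1.7)) < 1e-3
```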

Another curvilinear coordinate system of importance in two-centre problems, such as the diatomic molecule, derives from the more general system of confocal elliptical coordinates. The general discussion as represented, for instance, by Margenau and Murphy [5] will not be repeated here. Of special interest is the case of prolate spheroidal coordinates. In this system each point lies at the intersection of an ellipsoid, a hyperboloid and a half-plane through the polar axis, such that

$$h_1^2 = h_2^2 = a^2(\sinh^2 u + \sin^2 v) \qquad h_3^2 = a^2\sinh^2 u\,\sin^2 v$$

where $a = R$, half the distance between the two foci $A$ and $B$, and $r_A$, $r_B$ are the distances of the point from the foci:

$$r_A + r_B = 2R\cosh u = 2R\xi \qquad r_A - r_B = 2R\cos v = 2R\eta$$

In terms of these variables the volume element takes the form

$$d\tau = R^3(\xi^2 - \eta^2)\,d\xi\,d\eta\,d\varphi$$

and

$$\nabla^2 = \frac{1}{R^2(\xi^2-\eta^2)}\left\{\frac{\partial}{\partial\xi}\left[(\xi^2-1)\frac{\partial}{\partial\xi}\right] + \frac{\partial}{\partial\eta}\left[(1-\eta^2)\frac{\partial}{\partial\eta}\right]\right\} + \frac{1}{R^2(\xi^2-1)(1-\eta^2)}\frac{\partial^2}{\partial\varphi^2} \qquad (1.12)$$

1.3.6 Tensor Analysis

In an isotropic medium, vectors such as stress $S$ and strain $X$ are related by vector equations such as $S = kX$, where $S$ and $X$ have the same direction. If the medium is not isotropic the use of vectors to describe the response may be too restrictive, and the scalar $k$ may need to be replaced by a more general operator, capable of changing not only the magnitude of the vector $X$, but also its direction. Such a construct is called a tensor. An important purpose of tensor analysis is to describe any physical or geometrical quantity in a form that remains invariant under a change of coordinate system. The simplest type of invariant is a scalar. The square of the line element $ds$ of a space is an example of a scalar, or a tensor of rank zero. In a space of $\nu$ dimensions two coordinate systems may be defined in such a way that the same point has coordinates $(x^1, x^2, \ldots, x^\nu)$ and $(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^\nu)$ in terms of the two systems respectively. The two coordinate systems are related in such a way that a transformation can be effected from one to the other, i.e.

$$\bar{x}^j = f^j(x^1, x^2, \ldots, x^\nu) \quad ; \quad x^j = h^j(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^\nu), \qquad j = 1, \ldots, \nu$$

Suppose that an infinitesimal displacement moves point $A$ (coordinates $x^i$) to position $B$ (coordinates $x^i + dx^i$). To describe the same displacement in the other coordinate system, it is necessary to differentiate the expression for $\bar{x}^j$, i.e.

$$d\bar{x}^j = \sum_i \frac{\partial f^j}{\partial x^i}\,dx^i = \sum_i \frac{\partial\bar{x}^j}{\partial x^i}\,dx^i$$

Any set of quantities $A^i$ that transform in this way, i.e.

$$\bar{A}^j = \sum_i \frac{\partial\bar{x}^j}{\partial x^i}A^i$$

are the contravariant elements of a vector, or a tensor of first rank. To simplify the notation, it is customary to omit the summation sign and sum over indices which are repeated on the same side of the equation. An index which is not repeated is understood to take successively the values $1, 2, \ldots, \nu$, so that there are altogether $\nu$ different equations. With these conventions the transformation equations become

$$\bar{A}^j = \frac{\partial\bar{x}^j}{\partial x^i}A^i$$

Consider a function (or scalar field) $\phi(x^i)$ of the point $M$ (coordinates $x^i$), defined in the neighbourhood of $M$. Being a function of a point, the value of $\phi$ does not change when described in a different coordinate system. By the rules of differentiation

$$\frac{\partial\phi}{\partial\bar{x}^i} = \sum_j \frac{\partial x^j}{\partial\bar{x}^i}\frac{\partial\phi}{\partial x^j}$$

Any set of quantities that transform according to this prescription,

$$\bar{A}_i = \frac{\partial x^j}{\partial\bar{x}^i}A_j$$

are known as the covariant components of a vector, and represented by subscripted indices¹. These ideas may be extended to define tensors of any rank. There are three varieties of second-rank tensors, defined by the transformations

$$\bar{A}^{mn} = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial\bar{x}^n}{\partial x^j}A^{ij} \qquad \bar{A}_{mn} = \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}A_{ij} \qquad \bar{A}^m_n = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^j}{\partial\bar{x}^n}A^i_j$$

They are called contravariant, covariant and mixed tensors, respectively. A useful mixed tensor of the second rank is the Kronecker delta

$$\delta^m_n = 1 \ (m = n), \qquad \delta^m_n = 0 \ (m \neq n).$$

This is demonstrated as follows. If $\delta^i_j$ is a tensor in the coordinate system $x^i$,

$$\bar{\delta}^m_n = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^j}{\partial\bar{x}^n}\delta^i_j = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^i}{\partial\bar{x}^n} = \frac{\partial\bar{x}^m}{\partial\bar{x}^n} = \delta^m_n$$

¹Although it is customary to refer to covariant and contravariant vectors, this may be misleading. Any vector can be described in terms of its contravariant or its covariant components with equal validity. There is no reason other than numerical simplicity for the preference of one set of components over the other.


It follows that $\delta^m_n$ has the same components in all coordinate systems. Tensors of higher rank are defined in the same way; for example, a mixed tensor of rank four is

$$\bar{A}^{mn}_{pq} = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial\bar{x}^n}{\partial x^j}\frac{\partial x^k}{\partial\bar{x}^p}\frac{\partial x^l}{\partial\bar{x}^q}A^{ij}_{kl}$$

If $\nu$ is the number of dimensions of the coordinate systems, then a tensor of rank $a$ has $\nu^a$ components.

Tensor Algebra

The sum or difference of two or more tensors of the same rank and type is a tensor of the same rank and type, e.g.

$$A^m_n + B^m_n = C^m_n$$

If the components of a tensor satisfy the relation $A^{mn} = A^{nm}$, such a tensor is called symmetric. If $A^{mn} = -A^{nm}$, the tensor is skew-symmetric. There is an important relationship between vectors and skew-symmetric tensors. Suppose $A$ and $B$ are two vectors in a three-dimensional rectangular coordinate system whose components are connected by

$$\begin{bmatrix} B_x \\ B_y \\ B_z \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}\begin{bmatrix} A_x \\ A_y \\ A_z \end{bmatrix}$$

If the coefficients were components of a skew-symmetric tensor, $a_{ij} = -a_{ji}$, $a_{ii} = 0$, then

$$\begin{bmatrix} B_x \\ B_y \\ B_z \end{bmatrix} = \begin{bmatrix} 0 & -a_{21} & a_{13} \\ a_{21} & 0 & -a_{32} \\ -a_{13} & a_{32} & 0 \end{bmatrix}\begin{bmatrix} A_x \\ A_y \\ A_z \end{bmatrix}$$

which may be written, in terms of $t_x = a_{32}$, $t_y = a_{13}$, $t_z = a_{21}$, as

$$B_x = t_yA_z - t_zA_y \qquad B_y = t_zA_x - t_xA_z \qquad B_z = t_xA_y - t_yA_x$$

and recognized as the components of the vector product

$$B = t\times A = \begin{vmatrix} i & j & k \\ t_x & t_y & t_z \\ A_x & A_y & A_z \end{vmatrix}$$


Tensors and matrices are evidently closely related, and the components of a tensor can always be written as the elements of a matrix. Multiplication of $A^m$ by $B_n$ yields a mixed tensor $A^mB_n = C^m_n$, called the outer product, which may be formed with tensors of any rank or type. As an example of the reverse process, let $m = q$ in the mixed tensor $\bar{A}^{mn}_{pq}$ defined before, and write

$$\bar{A}^{mn}_{pm} = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^l}{\partial\bar{x}^m}\frac{\partial\bar{x}^n}{\partial x^j}\frac{\partial x^k}{\partial\bar{x}^p}A^{ij}_{kl} = \delta^l_i\,\frac{\partial\bar{x}^n}{\partial x^j}\frac{\partial x^k}{\partial\bar{x}^p}A^{ij}_{kl} = \frac{\partial\bar{x}^n}{\partial x^j}\frac{\partial x^k}{\partial\bar{x}^p}A^{ij}_{ki}$$

This result shows that, by its transformation properties, $A^{ij}_{ki}$ is equivalent to a mixed tensor of rank two. This process of summing over a pair of contravariant and covariant indices is called contraction. It always reduces the rank of a mixed tensor by two and thus, when applied to a mixed tensor of rank two, the result is a scalar:

$$\bar{A}^m_m = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^j}{\partial\bar{x}^m}A^i_j = \delta^j_iA^i_j = A^i_i$$
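The invariance of the contracted pair of indices is easy to demonstrate with `numpy.einsum`. A sketch (the rotation and the tensor components are arbitrary choices, not from the text; for an orthogonal change of coordinates the transformation matrix is the rotation matrix itself and its inverse is its transpose):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))      # components of a mixed rank-2 tensor

# Orthogonal change of coordinates: a rotation about the z-axis.
t = 0.4
R = np.array([[np.cos(t), -np.sin(t), 0.],
              [np.sin(t),  np.cos(t), 0.],
              [0.,         0.,        1.]])

# Mixed-tensor transformation A'^m_n = R[m,i] (R^-1)[j,n] A^i_j,
# with R^-1 = R^T for an orthogonal R.
Ap = np.einsum('mi,nj,ij->mn', R, R, A)    # equals R A R^T

# The contraction A^m_m (the trace) is a scalar: the same in both systems.
assert np.isclose(np.trace(Ap), np.trace(A))
```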

When two tensors are multiplied together and then contracted, the process is known as inner multiplication, thus

$$A^mB_{mpq} = C_{pq} \quad ; \quad A^mB_m = \text{a scalar}.$$

The last example is clearly equivalent to the scalar product in rectangular coordinates, and $l$ may be viewed as the length of $A^m$, $l^2 = A^mA_m$. The angle between two vectors $A_m$ and $B_m$ is defined by

$$\cos\theta = \frac{A_mB^m}{\sqrt{(A_mA^m)(B_mB^m)}}$$

and if $A_m$ and $B_m$ are perpendicular, $A_mB^m = 0$. The most important second-order tensor is the metric tensor $g$, whose components in a Riemann space are defined by the relations

$$g_{ij} = (a_i\cdot a_j) \quad ; \quad ds^2 = g_{ij}\,dx^idx^j.$$

In terms of this tensor the length of any vector is defined by any of

$$l^2 = g_{mn}A^mA^n = g^{mn}A_mA_n = A^mA_m.$$


It is often of value to define a unit vector $\lambda$ in the direction of a vector $x$. Clearly $\lambda_m = x_m/|x|$ will be covariant, and its length follows from

$$|\lambda|^2 = g^{mn}\lambda_m\lambda_n = \lambda^m\lambda_m = 1.$$

The angle between two such unit vectors $\lambda$ and $\mu$ will be

$$\cos\theta = g^{mn}\lambda_m\mu_n = \lambda^m\mu_m.$$

1.4 Differential Equations

One of the most useful formulations of physical laws is in terms of differential equations. The main reason for this is that a differential description is, like experimentally obtained knowledge, highly localized. Experimental regularities on which scientific knowledge depends refer to a limited neighbourhood, since a complete knowledge of the entire universe is not needed in order to make objective predictions of local events. However, doing science in relative ignorance of the actual nature of the universe means that the regularities that can be discovered and quantified are relatively simple relationships. They tend to be smooth functions of space-time and do not involve high-order derivatives or complicated functions of low-order derivatives. As a result, physical laws are often differential equations. An ordinary differential equation has only one variable; those with more variables are partial differential equations. In most applications to be considered here the differential equations are of the homogeneous type. This means that if $y_1(x)$ and $y_2(x)$ are two solutions of the equation $\mathcal{L}y(x) = 0$, then

$$y(x) = c_1y_1(x) + c_2y_2(x), \qquad c_i = \text{constant}$$

is also a solution. This is because the linear operator $\mathcal{L}$ has the property

$$\mathcal{L}(c_1y_1 + c_2y_2) = c_1\mathcal{L}y_1 + c_2\mathcal{L}y_2 = 0$$

The complete determination of a solution of a partial differential equation requires the specification of a suitable set of boundary and initial conditions. The boundaries could have a variety of forms depending on the nature of the problem; the role of boundary conditions should become clear from their usage later on. The solution of differential equations is a specialized pursuit, and the precise method is often unique to a specific problem. A common equation with numerous applications will be solved by way of demonstration:

In

$$\frac{d^2y}{dx^2} + k^2y = 0, \quad \text{let} \quad \frac{d}{dx} = D, \quad \text{hence} \quad (D^2 + k^2)y = 0 \qquad (1.13)$$

which is a special case of a more general family of equations

$$(D - p_1)(D - p_2)y = 0 \quad \text{i.e.} \quad D^2y - (p_1 + p_2)Dy + p_1p_2y = 0$$

This form is the same as (13) provided that $p_1 = -p_2$ and $p_1p_2 = k^2$. It follows that $p_1 = \pm ik$. For each factor, $(D - p_1)y = 0$, i.e. $\frac{dy}{dx} = p_1y$ or $\frac{dy}{y} = p_1\,dx$. Integration gives

$$\ln y = p_1x + \text{const} \ (= \ln A)$$

Thus $\ln(y/A) = p_1x$ with $p_1 = \pm ik$, which is

$$y = A\exp(ikx) + B\exp(-ikx)$$
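That this solution satisfies the original equation can be confirmed with a central second difference. A minimal Python sketch (the constants $k$, $A$, $B$ and the evaluation point are arbitrary choices, not from the text):

```python
import math

# Check that a real combination of cos(kx) and sin(kx) (equivalent to
# A exp(ikx) + B exp(-ikx)) satisfies y'' + k^2 y = 0.
k, h = 3.0, 1e-4

def y(x):
    return 2.0 * math.cos(k*x) - 0.5 * math.sin(k*x)

def residual(x):
    ypp = (y(x+h) - 2*y(x) + y(x-h)) / h**2   # central second difference
    return ypp + k*k*y(x)

assert abs(residual(0.8)) < 1e-4
```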

1.4.1 Series Solution of Differential Equations

It is often possible to find a solution of homogeneous differential equations in the form of a power series. According to Frobenius, the power series should have the general form

$$y = \sum_{n=0}^{\infty} A_nx^{m+n}$$

Such a series is completely described by a knowledge of the initial term, the relations between exponents of $x$, and the relations between coefficients. To illustrate the procedure, equation (13) is considered once more, assuming a Frobenius solution with $A_0 \neq 0$, for which

$$\frac{dy}{dx} = \sum_n (m+n)A_nx^{m+n-1} \qquad \frac{d^2y}{dx^2} = \sum_n (m+n)(m+n-1)A_nx^{m+n-2}$$

Substitution into (13) gives

$$\sum_n (m+n)(m+n-1)A_nx^{m+n-2} + k^2\sum_n A_nx^{m+n} = 0$$

To be a solution this series must identically equal zero, and hence the coefficient of each power of $x$ therein must be zero. The lowest power of $x$,


namely $x^{m-2}$, occurs in the first term only, for $n = 0$. To make this term vanish requires $m(m-1)A_0 = 0$. This equation, which determines the powers of the Frobenius series, is called the indicial equation, and it has solutions $m = 0$ or $m = 1$. The coefficient of the next power must also vanish:

$$A_1(m+1)m = 0$$

This expression shows that if $m = 0$, $A_1$ can be non-zero, but for $m = 1$ it is required that $A_1 = 0$. A general power $x^{m+r}$ occurs when $n = r+2$ in the first term and $n = r$ in the second. For the general term to vanish

$$A_{r+2}(m+r+2)(m+r+1) = -k^2A_r$$

defining the recursion formula

$$A_{r+2} = -\frac{k^2}{(m+r+2)(m+r+1)}A_r$$

In this way all the coefficients $A_r$ of the Frobenius series can be determined step by step. The recursion formula generates two independent series for odd and even values of $r$. For

$$r = 0: \quad A_2 = -\frac{k^2}{(m+2)(m+1)}A_0$$

there are two possibilities:

$$m = 0: \quad A_2 = -\frac{k^2}{2}A_0 \qquad m = 1: \quad A_2 = -\frac{k^2}{3\cdot 2}A_0$$

For

$$r = 1: \quad A_3 = -\frac{k^2}{(m+3)(m+2)}A_1$$

$$m = 0: \quad A_3 = -\frac{k^2}{3\cdot 2}A_1 \qquad m = 1: \quad \text{no solution}, \ A_1 = 0.$$

The most general solution is obtained for $m = 0$, $r = 0, 1$, in the form of two series:

$$y_{\text{even}}(x) = A_0\left[1 - \frac{(kx)^2}{2!} + \frac{(kx)^4}{4!} - \ldots\right] \qquad (1.14)$$

$$y_{\text{odd}}(x) = A_1\left[x - \frac{k^2x^3}{3!} + \frac{k^4x^5}{5!} - \ldots\right] \qquad (1.15)$$

The series (14) and (15) with $A_0 = A_1 = 1$ are recognized as the Taylor expansions about $x = 0$ of the functions

$$y_e(x) = \cos kx \qquad y_o(x) = \frac{1}{k}\sin kx$$

which are the linearly independent solutions of (13). Putting $A_0 = 1$, $A_1 = \pm ik$,

$$y_e(x) + y_o(x) = \cos kx \pm i\sin kx = \exp(\pm ikx),$$

as before.
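The Frobenius recursion itself can be run numerically and compared with the closed form it converges to. A Python sketch for the even series with $m = 0$ (the values of $k$, $x$ and the number of terms are arbitrary choices, not from the text):

```python
import math

# Build the even Frobenius series for y'' + k^2 y = 0 from the recursion
# A_{r+2} = -k^2 A_r / ((r+2)(r+1)), with m = 0 and A_0 = 1, and compare
# the partial sum with cos(kx).
k, x = 2.0, 0.6

def y_even(terms=20):
    A, total = 1.0, 0.0
    for r in range(0, 2*terms, 2):
        total += A * x**r
        A = -k*k * A / ((r + 2) * (r + 1))   # the recursion formula
    return total

assert abs(y_even() - math.cos(k*x)) < 1e-12
```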

Legendre's Equation

The equation

$$(1-x^2)\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + cy = 0 \qquad (1.16)$$

can be solved by the same method, to demonstrate a different aspect of the Frobenius procedure. The equation (16) has the special feature that the first term disappears for $x = \pm 1$, the singular points of the differential equation. It is solved by the same Frobenius series as before, giving

$$\sum_n A_n(m+n)(m+n-1)x^{m+n-2} - \sum_n A_n(m+n)(m+n-1)x^{m+n} - 2\sum_n A_n(m+n)x^{m+n} + c\sum_n A_nx^{m+n} = 0$$

The lowest power $x^{m-2}$ occurs in the first term only, for $n = 0$. Hence for this power to vanish

$$A_0m(m-1)x^{m-2} = 0$$

which yields $m = 0$ or $m = 1$. The next power occurs in the first term for $n = 1$, i.e.

$$A_1(m+1)m = 0$$


If $m = 1$ it is required that $A_1 = 0$. A general power $x^{m+r}$ occurs for $n = r+2$ in the first term and for $n = r$ in the others. The recursion formula so defined (for $m = 0$),

$$A_{r+2} = \frac{r(r+1) - c}{(r+2)(r+1)}A_r,$$

generates the two series:

$$y_e(x) = A_0\left[1 - \frac{c}{2}x^2 - \frac{c(6-c)}{24}x^4 - \ldots\right]$$

$$y_o(x) = A_1\left[x + \frac{2-c}{6}x^3 + \frac{(2-c)(12-c)}{120}x^5 + \ldots\right]$$

The Frobenius series can be shown not to converge for $|x| > 1$. An important special case occurs when $c = l(l+1)$, with $l = 0, 1, 2, \ldots$ any non-negative integer. For each of these special values one of the two chains of coefficients terminates at $A_l$, because

$$A_{l+2} = \left[\frac{l(l+1) - c}{(l+2)(l+1)}\right]A_l = 0$$

In particular the even chain terminates when $l$ is even, while the odd chain terminates when $l$ is odd. The corresponding Frobenius solution then simplifies to a polynomial. It is called the Legendre polynomial of degree $l$, or $P_l(x)$. The modified form of equation (16) becomes

$$(1-x^2)\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + l(l+1)y = 0 \qquad (1.17)$$

and is known as Legendre's equation. Choosing the arbitrary coefficients such that $P_l(x=1) = 1$ has become standard practice, whereby the Legendre polynomials are

$$P_0(x) = 1 \qquad P_1(x) = x \qquad P_2(x) = \tfrac{1}{2}(3x^2-1)$$
$$P_3(x) = \tfrac{1}{2}(5x^3-3x) \qquad P_4(x) = \tfrac{1}{8}(35x^4-30x^2+3)$$

etc.
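The listed polynomials, the normalization $P_l(1) = 1$, and the fact that each solves Legendre's equation can all be verified directly. A Python sketch (the interior test point is an arbitrary choice):

```python
# The first few Legendre polynomials as listed above.
P = [
    lambda x: 1.0,
    lambda x: x,
    lambda x: 0.5 * (3*x**2 - 1),
    lambda x: 0.5 * (5*x**3 - 3*x),
    lambda x: 0.125 * (35*x**4 - 30*x**2 + 3),
]

# Standard normalization: P_l(1) = 1 for every l.
for p in P:
    assert abs(p(1.0) - 1.0) < 1e-12

# Each P_l solves (1 - x^2) y'' - 2x y' + l(l+1) y = 0; check l = 3 at an
# interior point using central differences.
h, x, l = 1e-5, 0.4, 3
y = P[l]
yp  = (y(x+h) - y(x-h)) / (2*h)
ypp = (y(x+h) - 2*y(x) + y(x-h)) / h**2
assert abs((1 - x*x)*ypp - 2*x*yp + l*(l+1)*y(x)) < 1e-4
```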

The importance of the special case $c = l(l+1)$ lies therein that it avoids the function becoming infinite at $x = \pm 1$, and it is this form that can therefore be used to describe bounded physical properties at these points.

Laguerre's Equation

The equation

$$x\frac{d^2y}{dx^2} + (1-x)\frac{dy}{dx} + ny = 0$$

is solved by assuming a series solution

$$y = \sum_{k=0}^{\infty} A_kx^{m+k}$$

The indicial equation gives $m = 0$ and the recursion formula

$$A_{r+1} = \frac{r-n}{(r+1)^2}A_r$$

so that the solution, for integral $n$,

$$y = A_0\sum_{r=0}^{n} \frac{(-1)^r\,n!}{(n-r)!\,(r!)^2}x^r$$

is a polynomial of order $n$; with $A_0 = n!$ it is called the $n$th Laguerre polynomial,

$$L_n(x) = \sum_{r=0}^{n} \frac{(-1)^r(n!)^2}{(n-r)!\,(r!)^2}x^r$$

The $s$th derivative of the $n$th Laguerre polynomial is the associated Laguerre polynomial of degree $n-s$ and order $s$,

$$L_n^s(x) = \frac{d^s}{dx^s}L_n(x)$$

By differentiating Laguerre's equation $s$ times, the equation satisfied by $L_n^s$ follows as

$$x\frac{d^2}{dx^2}L_n^s(x) + (s+1-x)\frac{d}{dx}L_n^s(x) + (n-s)L_n^s(x) = 0 \qquad (1.18)$$

Bessel's Equation

Bessel functions are prominent in theoretical chemistry and physics. These functions were first obtained as solutions of Bessel's equation:

$$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + (x^2 - n^2)y = 0$$

This equation can be solved by the Frobenius method, assuming a series solution

$$y = \sum_{m=0}^{\infty} A_mx^{m+k}$$

to give

$$\sum_m A_m(m+k)(m+k-1)x^{m+k} + \sum_m A_m(m+k)x^{m+k} + \sum_m A_mx^{m+k+2} - n^2\sum_m A_mx^{m+k} = 0$$

The lowest power occurs for $m = 0$. It has to vanish independently, giving rise to the indicial equation

$$A_0k(k-1)x^k + A_0kx^k - n^2A_0x^k = 0$$

If $A_0 \neq 0$, then $k = \pm n$. A general power $x^{m+k}$ arises from the third term with index $m-2$ and from the other terms with index $m$. For this power to vanish independently it is therefore required that

$$A_m\left[(m+k)(m+k-1) + (m+k) - n^2\right] + A_{m-2} = 0$$

For $k = n$ this equation defines the recursion formula

$$A_m = -\frac{A_{m-2}}{m(2n+m)}$$

and the series

$$y = \sum_{m=0}^{\infty} A_mx^{m+n}, \qquad A_1 = 0.$$

This is

$$y = A_0x^n\left[1 - \frac{x^2}{2(2n+2)} + \frac{x^4}{2\cdot 4(2n+2)(2n+4)} - \ldots + \frac{(-1)^rx^{2r}}{2\cdot 4\cdots 2r\,(2n+2)(2n+4)\cdots(2n+2r)} + \ldots\right]$$

$$= A_0\,2^nn!\sum_{r=0}^{\infty} \frac{(-1)^r}{r!\,(n+r)!}\left(\frac{x}{2}\right)^{n+2r}$$

By choosing $A_0 = \dfrac{1}{2^nn!}$,

$$J_n(x) = \sum_{r=0}^{\infty} \frac{(-1)^r}{r!\,(n+r)!}\left(\frac{x}{2}\right)^{n+2r}$$

is known as a Bessel function of order $n$. When $n$ is non-integral the factorial function is represented by a $\Gamma$ (gamma) function. For integer $n$, $\Gamma(n+1) = n!$.

In general

$$\Gamma(x) = \lim_{n\to\infty} \frac{1\cdot 2\cdot 3\cdots(n-1)}{x(x+1)(x+2)\cdots(x+n-1)}\,n^x$$

Thus

$$\Gamma(x+1) = \lim_{n\to\infty} \frac{1\cdot 2\cdot 3\cdots(n-1)}{(x+1)(x+2)\cdots(x+n)}\,n^{x+1}$$

whereby

$$\Gamma(x+1) = x\Gamma(x)$$

Other important properties of $\Gamma$ functions, not too difficult to demonstrate, include

$$\Gamma(x)\Gamma(1-x) = \frac{\pi}{\sin(\pi x)}$$

$$\Gamma(x)\Gamma(x+\tfrac{1}{2}) = 2^{1-2x}\sqrt{\pi}\,\Gamma(2x)$$

$$\Gamma(1) = \lim_{n\to\infty} \frac{1\cdot 2\cdots(n-1)}{1\cdot 2\cdots n}\,n = 1$$

It therefore follows that

$$J_n(x) = \sum_{r=0}^{\infty} \frac{(-1)^r}{\Gamma(r+1)\Gamma(r+n+1)}\left(\frac{x}{2}\right)^{n+2r}$$

Bessel functions have many interesting properties that will be presented here without proof, e.g. the recursion formula

$$J_{n-1}(x) + J_{n+1}(x) = \frac{2n}{x}J_n(x)$$

and the related derivative relation

$$\frac{d}{dx}\left[x^{-n}J_n(x)\right] = -x^{-n}J_{n+1}(x)$$

Using these formulae it can be shown that when $n$ is half an odd integer, e.g. $l + \frac{1}{2}$, then $J_n(x)$ takes a particularly simple form and is related to trigonometric functions. By definition, for instance,

$$J_{\frac{1}{2}}(x) = \sum_{r=0}^{\infty} \frac{(-1)^r}{\Gamma(r+1)\Gamma(r+\frac{3}{2})}\left(\frac{x}{2}\right)^{2r+\frac{1}{2}}$$

To evaluate the factor $\Gamma(r+\frac{3}{2})$ the relationship $\Gamma(x+1) = x\Gamma(x)$ is applied $r$ times, by writing the factor in the form $\Gamma(x+1)$ for each step. For the first

step, set $x = (2r+1)/2$, so that

$$\Gamma\left(r+\frac{3}{2}\right) = \Gamma\left(\frac{2r+3}{2}\right) = \Gamma(x+1) = x\Gamma(x) = \frac{2r+1}{2}\,\Gamma\left(\frac{2r+1}{2}\right) = \frac{2r+1}{2}\cdot\frac{2r-1}{2}\,\Gamma\left(\frac{2r-1}{2}\right), \quad etc.$$

After $r$ passes,

$$\Gamma\left(r+\frac{3}{2}\right) = \frac{2r+1}{2}\cdot\frac{2r-1}{2}\cdots\frac{3}{2}\,\Gamma\left(\frac{3}{2}\right) = \frac{(2r+1)!}{2^{2r+1}\,r!}\sqrt{\pi}$$

using $\Gamma(\frac{3}{2}) = \frac{1}{2}\Gamma(\frac{1}{2}) = \frac{1}{2}\sqrt{\pi}$. When this is substituted into the series for $J_{\frac{1}{2}}(x)$, there results

$$J_{\frac{1}{2}}(x) = \sum_{r=0}^{\infty} \frac{(-1)^r\,2^{2r+1}}{(2r+1)!\,\sqrt{\pi}}\left(\frac{x}{2}\right)^{2r+\frac{1}{2}} = \sqrt{\frac{2}{\pi x}}\sum_{r=0}^{\infty} \frac{(-1)^rx^{2r+1}}{(2r+1)!} = \sqrt{\frac{2}{\pi x}}\,\sin x$$

It follows immediately from the derivative relation that

$$J_{\frac{3}{2}}(x) = -x^{\frac{1}{2}}\frac{d}{dx}\left[x^{-\frac{1}{2}}J_{\frac{1}{2}}(x)\right] = \sqrt{\frac{2}{\pi x}}\left(\frac{\sin x}{x} - \cos x\right)$$
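The half-odd-integer reduction above is easy to confirm by summing the gamma-function series directly. A Python sketch using the standard-library `math.gamma` (the evaluation point and truncation length are arbitrary choices, not from the text):

```python
import math

def J(n, x, terms=30):
    """Bessel function of (possibly non-integral) order n from the series
    J_n(x) = sum_r (-1)^r / (Gamma(r+1) Gamma(r+n+1)) (x/2)^(n+2r)."""
    return sum((-1)**r / (math.gamma(r+1) * math.gamma(r+n+1))
               * (x/2)**(n + 2*r) for r in range(terms))

# Half-odd-integer order reduces to trigonometric form:
# J_{1/2}(x) = sqrt(2/(pi x)) sin(x).
x = 1.3
assert abs(J(0.5, x) - math.sqrt(2/(math.pi*x)) * math.sin(x)) < 1e-12
```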

The so-called spherical Bessel functions are now defined as

$$j_n(x) = \sqrt{\frac{\pi}{2x}}\,J_{n+\frac{1}{2}}(x) \qquad (1.20)$$

By using the recursion formula it follows that

$$j_0(x) = \frac{\sin x}{x}$$
$$j_1(x) = \frac{\sin x}{x^2} - \frac{\cos x}{x}$$
$$j_2(x) = \left(\frac{3}{x^3} - \frac{1}{x}\right)\sin x - \frac{3\cos x}{x^2}$$
$$j_3(x) = \left(\frac{15}{x^4} - \frac{6}{x^2}\right)\sin x - \left(\frac{15}{x^3} - \frac{1}{x}\right)\cos x$$
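These closed forms obey simple consistency checks. A Python sketch (the identity $j_0'(x) = -j_1(x)$ and the small-argument limit $j_n(x) \approx x^n/(2n+1)!!$ are standard spherical-Bessel properties, not derived in the text):

```python
import math

def j0(x): return math.sin(x) / x
def j1(x): return math.sin(x) / x**2 - math.cos(x) / x
def j2(x): return (3/x**3 - 1/x) * math.sin(x) - 3*math.cos(x) / x**2

# j0' = -j1 (standard identity), checked with a central difference.
h, x = 1e-6, 2.1
assert abs((j0(x+h) - j0(x-h)) / (2*h) + j1(x)) < 1e-8

# Small-argument behaviour: j2(x) ~ x^2 / 15 (i.e. x^n/(2n+1)!! for n = 2).
xs = 0.05
assert abs(j2(xs) - xs**2 / 15) < 1e-6
```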

1.4.2 Separation of Variables

It is often possible to write the solution of a partial differential equation as a sum of terms, each of which is a function of one of the variables only. This procedure is called solution by separation of variables. The one-dimensional wave equation

$$\frac{\partial^2u(x,t)}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2u(x,t)}{\partial t^2}$$

is a typical example¹. To solve the equation one tries the possibility of solutions in separable form,

$$u(x,t) = X(x)\cdot T(t)$$

Direct substitution of this product into the wave equation gives

$$\frac{\partial^2u}{\partial x^2} = T\frac{d^2X}{dx^2} = \frac{X}{c^2}\frac{d^2T}{dt^2}$$

Thus

$$\frac{1}{X}\frac{d^2X}{dx^2} = \frac{1}{c^2T}\frac{d^2T}{dt^2}$$

where the l.h.s. is a function of $x$ only, while the r.h.s. is a function of $t$ only. These two sides must be equal for all values of $x$ and $t$, and this is possible only if they are separately equal to the same constant $\lambda$. The result is two separated equations

$$\frac{d^2X(x)}{dx^2} = \lambda X(x) \qquad \frac{d^2T(t)}{dt^2} = c^2\lambda T(t)$$

The $X$ equation has the general solution

$$X(x) = A\cos(x\sqrt{-\lambda}) + \frac{B}{\sqrt{-\lambda}}\sin(x\sqrt{-\lambda})$$

¹This equation is second order in time, and therefore remains invariant under time reversal, that is, the transformation $t \to -t$. A movie of a wave propagating to the left, run backwards, therefore pictures a wave propagating to the right. In diffusion or heat conduction, the field equation (for the concentration or temperature field) is only first order in time. The equation is not invariant under time reversal, supporting the observation that diffusion and heat flow are irreversible processes.

The solution for $T$ is rather similar to $X$, with the constant $c^2\lambda$ replacing $\lambda$. Boundary conditions are used to select acceptable solutions from the infinite set. It can happen that one or more of these boundary conditions can be satisfied only when the separation constant takes on some special values. The subset so generated contains only permissible values, or eigenvalues, for the problem. The corresponding solutions are called eigenfunctions. A reasonable restriction for a wave is that acceptable solutions should be periodic with period $2\pi$, i.e. the solutions $1$, $\cos nx$, $\sin nx$, where $n$ is a positive integer. This condition implies $\lambda_n = -n^2$, $n = 0, 1, 2, \ldots$. For $\lambda = \lambda_n$

$$X_n(x)T_n(t) = a_n\cos nx\cos cnt + b_n\sin nx\cos cnt + c_n\cos nx\sin cnt + d_n\sin nx\sin cnt \qquad (1.21)$$

Since the one-dimensional wave equation is linear, the general solution periodic in $x$ with period $2\pi$ is the linear superposition

$$u(x,t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} X_n(x)T_n(t)$$

of all possible solutions¹. Equation (21) can be rewritten in the form

$$X_n(x)T_n(t) = A_n\cos n(x-ct) + B_n\sin n(x-ct) + C_n\cos n(x+ct) + D_n\sin n(x+ct)$$

where the linearly independent terms $f(x-ct)$ and $g(x+ct)$ represent waves propagating along the $+x$ and $-x$ directions, respectively. A general solution in complex notation is

$$u(x,t) = a\exp in(x-ct) \qquad (1.22)$$

as before.
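That a travelling wave of this form satisfies the wave equation can be checked with central second differences. A Python sketch (the values of $n$, $c$ and the evaluation point are arbitrary choices, not from the text):

```python
import math

# u(x, t) = cos(n(x - ct)) should satisfy u_xx = (1/c^2) u_tt.
n, c, h = 3, 1.5, 1e-4

def u(x, t):
    return math.cos(n * (x - c*t))

def u_xx(x, t):
    return (u(x+h, t) - 2*u(x, t) + u(x-h, t)) / h**2

def u_tt(x, t):
    return (u(x, t+h) - 2*u(x, t) + u(x, t-h)) / h**2

x, t = 0.7, 0.2
assert abs(u_xx(x, t) - u_tt(x, t) / c**2) < 1e-4
```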

1.4.3 Special Functions

The solutions of differential equations often define series of related functions that can be obtained from simple generating functions or formulae. As an example consider the Legendre polynomials

$$P_n(x) = \sum_{r=0}^{L} \frac{(-1)^r(2n-2r)!}{2^n\,r!\,(n-2r)!\,(n-r)!}x^{n-2r}$$

¹The function is called a double Fourier series.

where $L = (n-1)/2$ for $n$ odd and $L = n/2$ for $n$ even. According to the binomial theorem

$$(x^2-1)^n = \sum_r \frac{n!}{r!\,(n-r)!}(-1)^r(x^2)^{n-r}$$

and by direct differentiation it follows that

$$\left(\frac{d}{dx}\right)^n x^{2n-2r} = \frac{(2n-2r)!}{(n-2r)!}x^{n-2r}$$

Combining these results leads to the Rodrigues formula for Legendre polynomials:

$$P_n(x) = \frac{1}{2^nn!}\left(\frac{d}{dx}\right)^n(x^2-1)^n$$

The Legendre polynomials are one example of a family of polynomials, said to be orthogonal on the closed interval $[a,b]$ with respect to a weight function $w(x)$ if

$$\int_a^b w(x)f_m(x)f_n(x)\,dx = h_n\delta_{mn}.$$

Further examples are shown in the following table.

f_n(x)    Name        a     b     w(x)        h_n
P_n(x)    Legendre    -1    1     1           2/(2n+1)
L_n(x)    Laguerre    0     oo    e^{-x}      (n!)^2
H_n(x)    Hermite     -oo   oo    e^{-x^2}    \sqrt{\pi}\, 2^n n!
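The Legendre entries of this table can be checked exactly (a sketch, not part of the original text): since the P_n are polynomials on [-1, 1], both the Rodrigues formula and the orthogonality integrals can be evaluated with exact rational arithmetic.

```python
from fractions import Fraction as F
import math

# Exact check of the Rodrigues formula P_n = 1/(2^n n!) (d/dx)^n (x^2-1)^n
# and of the orthogonality relation  int_{-1}^{1} P_m P_n dx = 2/(2n+1) d_mn.
# Polynomials are coefficient lists: p[k] is the coefficient of x^k.

def pmul(p, q):
    r = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def pdiff(p):
    return [k * a for k, a in enumerate(p)][1:] or [F(0)]

def legendre(n):
    p = [F(1)]
    for _ in range(n):
        p = pmul(p, [F(-1), F(0), F(1)])     # multiply by (x^2 - 1)
    for _ in range(n):
        p = pdiff(p)
    return [a / (2**n * math.factorial(n)) for a in p]

def integrate(p, a, b):                      # exact int_a^b p(x) dx
    return sum(c * (F(b)**(k+1) - F(a)**(k+1)) / (k+1) for k, c in enumerate(p))

assert legendre(2) == [F(-1, 2), F(0), F(3, 2)]            # (3x^2 - 1)/2
assert integrate(pmul(legendre(2), legendre(3)), -1, 1) == 0
assert integrate(pmul(legendre(3), legendre(3)), -1, 1) == F(2, 7)
```

Using `Fraction` keeps every coefficient and integral exact, so the orthogonality constants h_n come out as the precise rationals 2/(2n+1) rather than floating-point approximations.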

The first four terms for each series are given in the next table.

n         0    1       2             3
P_n(x)    1    x       (3x^2-1)/2    (5x^3-3x)/2
L_n(x)    1    -x+1    x^2-4x+2      -x^3+9x^2-18x+6
H_n(x)    1    2x      4x^2-2        8x^3-12x

Each of these has a Rodrigues formula,

f_n(x) = \frac{1}{a_n w(x)} \left(\frac{d}{dx}\right)^n \left[ w(x)\, s(x)^n \right],

with the parameters given by:

f_n(x)    w(x)        s(x)      a_n
P_n(x)    1           1-x^2     (-1)^n 2^n n!
L_n(x)    e^{-x}      x         1
H_n(x)    e^{-x^2}    1         (-1)^n
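For the Laguerre and Hermite rows the weight is an exponential, so the n-fold derivative in the Rodrigues formula reduces to a repeated polynomial map: d/dx[p(x)e^{-x}] = (p'-p)e^{-x} and d/dx[p(x)e^{-x^2}] = (p'-2xp)e^{-x^2}. The following sketch (not part of the original text) uses this to verify the table entries exactly.

```python
from fractions import Fraction as F

# Verify the Rodrigues data for Laguerre and Hermite polynomials by
# repeated application of the polynomial maps p -> p' - p and p -> p' - 2xp.
# Polynomials are coefficient lists: p[k] is the coefficient of x^k.

def pdiff(p):
    return [k * a for k, a in enumerate(p)][1:] or [F(0)]

def padd(p, q):
    n = max(len(p), len(q))
    p, q = p + [F(0)] * (n - len(p)), q + [F(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def laguerre(n):                 # L_n = e^x (d/dx)^n (x^n e^{-x}),  a_n = 1
    p = [F(0)] * n + [F(1)]      # start from x^n
    for _ in range(n):
        p = padd(pdiff(p), [-a for a in p])
    return p

def hermite(n):                  # H_n = (-1)^n e^{x^2} (d/dx)^n e^{-x^2}
    p = [F(1)]
    for _ in range(n):
        p = padd(pdiff(p), [F(0)] + [-2 * a for a in p])
    return [(-1)**n * a for a in p]

assert laguerre(2) == [F(2), F(-4), F(1)]                  # x^2 - 4x + 2
assert laguerre(4) == [F(24), F(-96), F(72), F(-16), F(1)]
assert hermite(3) == [F(0), F(-12), F(0), F(8)]            # 8x^3 - 12x
```

Note that `laguerre(4)` already reproduces the L_4 quoted later in the text, x^4 - 16x^3 + 72x^2 - 96x + 24.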

Formulae of this type are especially useful when looking for solutions of more complicated differential equations related to the simpler classical ones. Two important examples are the associated Legendre and associated Laguerre (18) equations, respectively

(1-x^2)\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + \left[ l(l+1) - \frac{m^2}{1-x^2} \right] y = 0    (1.23)

x\frac{d^2y}{dx^2} + (s+1-x)\frac{dy}{dx} + (n-s)\, y = 0    (1.24)

The relationship between (23) and Legendre's equation can be demonstrated in two steps. First, substitute

y = (1-x^2)^{m/2}\, z

into (23), transforming it into

(1-x^2)\frac{d^2z}{dx^2} - 2(m+1)x\frac{dz}{dx} + \left[ l(l+1) - m(m+1) \right] z = 0    (1.25)

The next step is to differentiate Legendre's equation m times. Using the Leibniz [6] theorem one gets

(1-x^2)D^{m+2}w - 2(m+1)xD^{m+1}w + \left[ l(l+1) - m(m+1) \right] D^m w = 0    (1.26)

where D = d/dx and w solves Legendre's equation. Equation (26) is identical to (25) for z = D^m w. The solution of (23) follows immediately as

P_l^m(x) = (1-x^2)^{m/2} \left(\frac{d}{dx}\right)^m P_l(x), \qquad 0 \le m \le l.

Substitution into the Rodrigues formula gives the associated Legendre functions in the form

P_l^m(x) = \frac{(1-x^2)^{m/2}}{2^l\, l!} \left(\frac{d}{dx}\right)^{l+m} (x^2-1)^l.
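Apart from the factor (1-x^2)^{m/2}, the associated Legendre function is a polynomial obtained by differentiating (x^2-1)^l a total of l+m times, which can be checked exactly (a sketch, not part of the original text):

```python
from fractions import Fraction as F
import math

# Compute the polynomial factor of P_l^m, i.e. P_l^m(x) / (1-x^2)^{m/2},
# directly from the Rodrigues form (1/(2^l l!)) (d/dx)^{l+m} (x^2-1)^l.
# Polynomials are coefficient lists: p[k] is the coefficient of x^k.

def pmul(p, q):
    r = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def pdiff(p):
    return [k * a for k, a in enumerate(p)][1:] or [F(0)]

def assoc_legendre_poly(l, m):
    p = [F(1)]
    for _ in range(l):
        p = pmul(p, [F(-1), F(0), F(1)])       # build (x^2 - 1)^l
    for _ in range(l + m):
        p = pdiff(p)
    return [a / (2**l * math.factorial(l)) for a in p]

assert assoc_legendre_poly(2, 1) == [F(0), F(3)]                 # 3x
assert assoc_legendre_poly(3, 1) == [F(-3, 2), F(0), F(15, 2)]   # (3/2)(5x^2-1)
```

The two asserted cases match P_2^1 = 3(1-x^2)^{1/2}x and P_3^1 = (3/2)(1-x^2)^{1/2}(5x^2-1) from the list that follows.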

For m an even integer the P_l^m(x) are polynomials. The functions P_l^0(x) are identical with the Legendre polynomials. The first few associated Legendre functions for x = \cos\theta (a common form of Legendre's equation) are:

P_1^1 = (1-x^2)^{1/2} = \sin\theta
P_2^1 = 3(1-x^2)^{1/2}x = 3\sin\theta\cos\theta
P_3^1 = \frac{3}{2}(1-x^2)^{1/2}(5x^2-1) = \frac{1}{2}\sin\theta\,(15\cos^2\theta - 3)
P_2^2 = 3(1-x^2) = 3\sin^2\theta
P_3^2 = 15(1-x^2)x = 15\sin^2\theta\cos\theta

To normalize a function such as P_l^m(\cos\theta) it is necessary to require that the integral \int_{-1}^{1} [\bar{P}_l^m(\cos\theta)]^2\, dx = 1. Starting from the Rodrigues formula and integrating by parts it can be shown that the normalized associated Legendre functions are

\bar{P}_l^m(x) = \sqrt{\frac{(2l+1)(l-m)!}{2\,(l+m)!}}\; P_l^m(x).
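The squares of the listed functions are ordinary polynomials, so the normalization constant can be verified exactly (a sketch, not part of the original text):

```python
from fractions import Fraction as F
import math

# Exact check of  int_{-1}^{1} [P_l^m(x)]^2 dx = 2(l+m)! / ((2l+1)(l-m)!)
# for the tabulated cases P_2^1 = 3(1-x^2)^{1/2} x and P_3^2 = 15(1-x^2) x.
# Polynomials are coefficient lists: p[k] is the coefficient of x^k.

def integrate(p, a, b):              # exact int_a^b p(x) dx
    return sum(c * (F(b)**(k+1) - F(a)**(k+1)) / (k+1) for k, c in enumerate(p))

def h(l, m):                         # predicted value of the integral
    return F(2 * math.factorial(l + m), (2*l + 1) * math.factorial(l - m))

# [P_2^1]^2 = 9 x^2 (1 - x^2) = 9x^2 - 9x^4
assert integrate([F(0), F(0), F(9), F(0), F(-9)], -1, 1) == h(2, 1)
# [P_3^2]^2 = 225 x^2 (1 - x^2)^2 = 225x^2 - 450x^4 + 225x^6
assert integrate([F(0), F(0), F(225), F(0), F(-450), F(0), F(225)], -1, 1) == h(3, 2)
```

The normalized function is then obtained by dividing P_l^m by the square root of this integral, in agreement with the displayed formula.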

From its Rodrigues formula the next Laguerre polynomial is L_4(x) = x^4 - 16x^3 + 72x^2 - 96x + 24. The associated Laguerre polynomials derived by differentiation are:

L_1^1(x) = -1
L_2^1(x) = 2x - 4
L_2^2(x) = 2!
L_3^1(x) = -3x^2 + 18x - 18
L_3^2(x) = -6x + 18
L_3^3(x) = -(3!)
L_4^1(x) = 4x^3 - 48x^2 + 144x - 96
L_4^2(x) = 12x^2 - 96x + 144
L_4^3(x) = 24x - 96
L_4^4(x) = 4!    (1.27)

To demonstrate that the Laguerre polynomials in the interval 0 \le x < \infty may be obtained from the generating function

U(u,x) = \sum_{r=0}^{\infty} \frac{L_r(x)}{r!}\, u^r = \frac{\exp[-xu/(1-u)]}{1-u}
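Before differentiating, the generating function itself can be checked by brute-force series expansion (a sketch, not part of the original text): expand exp[-xu/(1-u)]/(1-u) as a power series in u with exact rational coefficients and compare r! times the coefficient of u^r with the tabulated L_r(x).

```python
from fractions import Fraction as F
import math

# Expand U(u,x) = exp[-xu/(1-u)]/(1-u) as a series in u (truncated at u^4).
# A series is a list of polynomials in x; a polynomial is a list of Fractions.

N = 5

def padd(p, q):
    n = max(len(p), len(q))
    p, q = p + [F(0)] * (n - len(p)), q + [F(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def pmul(p, q):
    r = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def smul(s, t):                               # series product, truncated at u^N
    r = [[F(0)] for _ in range(N)]
    for i, p in enumerate(s):
        for j, q in enumerate(t):
            if i + j < N:
                r[i + j] = padd(r[i + j], pmul(p, q))
    return r

# S = -xu/(1-u) = -x(u + u^2 + ...): the coefficient of u^k (k >= 1) is -x.
S = [[F(0)]] + [[F(0), F(-1)] for _ in range(N - 1)]
E = [[F(1)]] + [[F(0)] for _ in range(N - 1)]  # will hold exp(S) = sum S^k/k!
P = list(E)                                    # running power S^k
for k in range(1, N):
    P = smul(P, S)
    E = [padd(e, [c / math.factorial(k) for c in p]) for e, p in zip(E, P)]
U = smul(E, [[F(1)] for _ in range(N)])        # multiply by 1/(1-u)

L = [[a * math.factorial(r) for a in U[r]] for r in range(N)]
assert L[2] == [F(2), F(-4), F(1)]             # x^2 - 4x + 2
assert L[3] == [F(6), F(-18), F(9), F(-1)]     # -x^3 + 9x^2 - 18x + 6
```

The exponential is computed as a finite sum of powers of S, which is legitimate because S has no constant term, so only powers up to S^{N-1} contribute below u^N.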


this function is differentiated with respect to both u and x. In the first instance

\frac{\partial U}{\partial u} = \sum_{r=1}^{\infty} \frac{L_r(x)}{(r-1)!}\, u^{r-1} = \frac{(1-u-x)}{(1-u)^3} \exp[-xu/(1-u)],

i.e.

(1-u)^2 \sum_r \frac{L_r(x)}{(r-1)!}\, u^{r-1} = (1-u-x) \sum_r \frac{L_r(x)}{r!}\, u^r

or

\sum_r \frac{L_r(x)}{(r-1)!} u^{r-1} - 2\sum_r \frac{L_r(x)}{(r-1)!} u^r + \sum_r \frac{L_r(x)}{(r-1)!} u^{r+1} = \sum_r \frac{L_r(x)}{r!} u^r - \sum_r \frac{L_r(x)}{r!} u^{r+1} - x\sum_r \frac{L_r(x)}{r!} u^r.

Equating coefficients of similar powers (e.g. u^r) on the l.h.s. and the r.h.s. gives the recursion formula

\frac{L_{r+1}(x)}{r!} - \frac{2L_r(x)}{(r-1)!} + \frac{L_{r-1}(x)}{(r-2)!} = \frac{L_r(x)}{r!} - \frac{L_{r-1}(x)}{(r-1)!} - \frac{x L_r(x)}{r!},

i.e.

L_{r+1}(x) + (x-1-2r)L_r(x) + r^2 L_{r-1}(x) = 0    (1.28)
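The recursion (1.28) can be used to generate the whole family from L_0 = 1 and L_1 = 1 - x; the following sketch (not part of the original text) reproduces the tabulated L_2, L_3 and L_4 exactly.

```python
from fractions import Fraction as F

# Check the recursion L_{r+1}(x) = -(x - 1 - 2r) L_r(x) - r^2 L_{r-1}(x),
# starting from L_0 = 1 and L_1 = 1 - x.
# Polynomials are coefficient lists: p[k] is the coefficient of x^k.

def padd(p, q):
    n = max(len(p), len(q))
    p, q = p + [F(0)] * (n - len(p)), q + [F(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def next_laguerre(r, Lr, Lr1):
    # -(x - 1 - 2r) L_r: multiply by -x, then add (1 + 2r) L_r
    t = padd([F(0)] + [-a for a in Lr], [(1 + 2*r) * a for a in Lr])
    return padd(t, [-r**2 * a for a in Lr1])

L = [[F(1)], [F(1), F(-1)]]                     # L_0, L_1
for r in range(1, 4):
    L.append(next_laguerre(r, L[r], L[r - 1]))

assert L[2] == [F(2), F(-4), F(1)]              # x^2 - 4x + 2
assert L[3] == [F(6), F(-18), F(9), F(-1)]      # -x^3 + 9x^2 - 18x + 6
assert L[4] == [F(24), F(-96), F(72), F(-16), F(1)]
```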

Similarly

\frac{\partial U}{\partial x} = \sum_r \frac{u^r}{r!} \frac{dL_r(x)}{dx} = -\frac{u}{1-u}\, U

or, equating coefficients of u^r,

\frac{dL_r(x)}{dx} - r\frac{dL_{r-1}(x)}{dx} + rL_{r-1}(x) = 0.    (1.29)

Equivalent forms of (29) are

\frac{dL_{r+1}(x)}{dx} - (r+1)\frac{dL_r(x)}{dx} + (r+1)L_r(x) = 0

\frac{dL_{r+2}(x)}{dx} - (r+2)\frac{dL_{r+1}(x)}{dx} + (r+2)L_{r+1}(x) = 0

differentiating to the expressions

\frac{d^2L_{r+1}(x)}{dx^2} - (r+1)\frac{d^2L_r(x)}{dx^2} + (r+1)\frac{dL_r(x)}{dx} = 0

etc., which can be used to turn (28) into an ordinary differential equation. First replace r by (r+1) in (28) and differentiate twice, such that

\frac{d^2L_{r+2}(x)}{dx^2} + (x-3-2r)\frac{d^2L_{r+1}(x)}{dx^2} + 2\frac{dL_{r+1}(x)}{dx} + (r+1)^2\frac{d^2L_r(x)}{dx^2} = 0.


Substituting from the previous expressions transforms this equation into one in L_r(x) alone:

x\frac{d^2L_r(x)}{dx^2} + (1-x)\frac{dL_r(x)}{dx} + rL_r(x) = 0,

which is the differential equation of the r-th Laguerre polynomial. Since L_r^s(x) = \left(\frac{d}{dx}\right)^s L_r(x), the generating function for the associated Laguerre polynomials follows as

U_s(x,u) = \sum_{r=s}^{\infty} \frac{L_r^s(x)}{r!}\, u^r = \left(\frac{\partial}{\partial x}\right)^s \frac{\exp[-xu/(1-u)]}{1-u} = (-1)^s \frac{u^s \exp[-xu/(1-u)]}{(1-u)^{s+1}}
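Both the ordinary Laguerre equation and the associated equation (1.24) can be verified term by term for a concrete case (a sketch, not part of the original text): with s = 0, (1.24) reduces to x y'' + (1-x)y' + ny = 0, and with s = 1, y = L_3^1 must satisfy x y'' + (2-x)y' + 2y = 0.

```python
from fractions import Fraction as F

# Verify that L_3 = -x^3 + 9x^2 - 18x + 6 satisfies x L'' + (1-x) L' + 3L = 0
# and that L_3^1 = dL_3/dx satisfies (1.24) with n = 3, s = 1.
# Polynomials are coefficient lists: p[k] is the coefficient of x^k.

def pdiff(p):
    return [k * a for k, a in enumerate(p)][1:] or [F(0)]

def padd(p, q):
    n = max(len(p), len(q))
    p, q = p + [F(0)] * (n - len(p)), q + [F(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def xmul(p):                         # multiply a polynomial by x
    return [F(0)] + p

def laguerre_residual(y, n, s):      # x y'' + (s+1-x) y' + (n-s) y
    y1, y2 = pdiff(y), pdiff(pdiff(y))
    r = padd(xmul(y2), [(s + 1) * a for a in y1])
    r = padd(r, [-a for a in xmul(y1)])
    return padd(r, [(n - s) * a for a in y])

L3  = [F(6), F(-18), F(9), F(-1)]    # L_3(x)
L31 = pdiff(L3)                      # L_3^1(x) = -3x^2 + 18x - 18

assert all(c == 0 for c in laguerre_residual(L3, 3, 0))   # ordinary Laguerre eq
assert all(c == 0 for c in laguerre_residual(L31, 3, 1))  # associated eq (1.24)
```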

To normalize radial hydrogenic eigenfunctions it will be necessary to evaluate the integral

N = \int_0^{\infty} e^{-x} x^{s+1} [L_r^s(x)]^2\, dx,

which can be considered in terms of a product of two generating functions,

I = \int_0^{\infty} e^{-x} x^{s+1} U_s(x,u)\, U_s(x,v)\, dx = \sum_{r=s}^{\infty} \sum_{t=s}^{\infty} \frac{u^r v^t}{r!\, t!} \int_0^{\infty} e^{-x} x^{s+1} L_r^s(x) L_t^s(x)\, dx

and also

I = \frac{(uv)^s}{[(1-u)(1-v)]^{s+1}} \int_0^{\infty} x^{s+1} \exp\!\left[ -\frac{(1-uv)\,x}{(1-u)(1-v)} \right] dx.

Using

\int_0^{\infty} x^{s+1} e^{-\alpha x}\, dx = \frac{(s+1)!}{\alpha^{s+2}},

this becomes

I = \frac{(s+1)!\,(uv)^s (1-u)(1-v)}{(1-uv)^{s+2}}.

The factor (1-uv)^{-(s+2)} is expanded by the binomial theorem^2 to give

I = (s+1)!\,(1-u-v+uv) \sum_{k=0}^{\infty} \frac{(s+k+1)!}{(s+1)!\,k!}\, (uv)^{s+k}    (1.30)

^2 (1-a)^{-n} = 1 + na + \frac{n(n+1)}{2!}a^2 + \dots + \frac{(n+k-1)!}{(n-1)!\,k!}a^k + \dots
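The footnote's binomial coefficients are the combinatorial numbers C(n+k-1, k); a quick numeric sketch (not part of the original text) confirms the expansion converges to (1-a)^{-n}:

```python
import math

# Numeric check of (1-a)^{-n} = sum_k (n+k-1)!/((n-1)! k!) a^k for n = 4, a = 0.1.
n, a = 4, 0.1
series = sum(math.comb(n + k - 1, k) * a**k for k in range(60))
print(abs(series - (1 - a)**(-n)))   # tiny truncation/rounding residual
```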


Also, for r = t,

I = \sum_{r=s}^{\infty} \frac{(uv)^r}{(r!)^2} \int_0^{\infty} e^{-x} x^{s+1} [L_r^s(x)]^2\, dx.

The normalization integral is therefore given by (r!)^2 times the coefficient of (uv)^r in (30), i.e.

N = (r!)^2 \left[ \frac{(r+1)!}{(r-s)!} + \frac{r!}{(r-s-1)!} \right] = \frac{(r!)^3\,(2r-s+1)}{(r-s)!}    (1.31)
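Because L_r^s is a polynomial, the normalization integral can be evaluated exactly with \int_0^{\infty} x^k e^{-x}\,dx = k!; the following sketch (not part of the original text) confirms (1.31) for r = 3, s = 1.

```python
from fractions import Fraction as F
import math

# Exact check of (1.31):
#   int_0^inf e^{-x} x^{s+1} [L_r^s(x)]^2 dx = (r!)^3 (2r-s+1)/(r-s)!
# for r = 3, s = 1, using int_0^inf x^k e^{-x} dx = k!.
# Polynomials are coefficient lists: p[k] is the coefficient of x^k.

def pmul(p, q):
    r = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def laplace_integral(p):             # int_0^inf p(x) e^{-x} dx = sum_k p[k] k!
    return sum(c * math.factorial(k) for k, c in enumerate(p))

r, s = 3, 1
L31 = [F(-18), F(18), F(-3)]         # L_3^1(x) = -3x^2 + 18x - 18
integrand = pmul([F(0)] * (s + 1) + [F(1)], pmul(L31, L31))   # x^{s+1} [L_r^s]^2
N = laplace_integral(integrand)

expected = F(math.factorial(r)**3 * (2*r - s + 1), math.factorial(r - s))
assert N == expected
```

Both sides evaluate to the same rational number, as (1.31) requires.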