Volume 93A, number 9
PHYSICS LETTERS
14 February 1983
ON PHYSICAL PROOFS OF MATHEMATICAL THEOREMS
J. PERDANG
Institute of Astronomy, Cambridge CB3 0HA, UK and Institut d'Astrophysique, B-4200 Cointe-Ougrée, Belgium ¹
Received 10 November 1982
We argue that the alleged "proofs" of mathematical results from physical models logically follow from the mathematics, rather than from the specific physical laws used in working out the behaviour of the model. To support this view, we present an optical derivation of Brouwer's fixed point theorem and illustrate how the physical model can be discarded to obtain a strict mathematical proof. We also analyse an elementary version of Landsberg's model which exhibits more clearly the mathematical steps involved in the physical proof.
Following a somewhat startling recent derivation of a mathematical theorem from physics, namely of the inequality between the geometric and the arithmetic mean from thermodynamic principles, and several generalisations thereof [1-6], there has been some argument about the logical status of such "proofs" [7,8]. The main aim of the present note is to help clarify the latter issue.

To place Landsberg's contribution [1-5] into the right historical perspective, we start by observing that generating mathematical results from empirical sciences is by no means a novel technique. In fact, this approach has operated in the past at two levels.

(i) It served the purpose of suggesting new mathematical results. In several instances the subsequent step towards proving these results in a strict mathematical framework actually gave birth to entirely new mathematical fields. The oldest well-documented illustration of this procedure dates back to Archimedes (third century B.C.). In the introductory sentences of his treatise "Quadratura parabolae" [9] Archimedes states that he discovered the area of the parabolic segment from mechanical laws, namely from an application of his principle of the lever, before he could provide a geometrical proof ([theorema] prius per mechanica inventum). The mechanical procedure is further generalised and applied in his subsequent work "De Mechanicis propositionibus ad Eratosthenem methodus", in which he obtains the area or volume of some 13 different geometrical figures [9,10] ‡1.

The first paper on variational calculus makes use of an optical model. Bernoulli [13] determines the curve of fastest descent (brachistochrone) by observing that it is equivalent to the path of a light ray in a stratified medium of variable index of refraction (cf. ref. [14]). According to Courant [15], Riemann's geometric function theory was developed from an electrical model (although the physical analogy is not explicitly spelled out in Riemann's dissertation [16]). The existence theorems given by Riemann are direct consequences of physically natural existence properties of electrical flows.

(ii) Besides suggesting new results, physical reasoning also provided simpler and shorter proofs of already known theorems. Lietzmann [17] gives an optical proof of the theorem: Among all triangles inscribed in an acute-angled triangle the pedal triangle (the triangle whose vertices coincide with the feet of the altitudes) has least perimeter. The actual purely geometrical proof which parallels the optical derivation is due to Schwarz [18].

¹ Permanent address.
‡1 We recall that ref. [10] is regarded as an anticipation of modern integral calculus (cf. refs. [11,12]).
0 031-9163/83/0000-0000/$ 03.00 © 1983 North-Holland
Uspenskii [19] indicates, among other instances of application of physics to mathematics, a thermodynamic proof of the theorem: Any convex polyhedron (polygon) possesses at least one face f (one edge e) such that the foot of the normal to it through the centre of gravity lies in the interior of f (e). The underlying physical argument had already been hinted at in ref. [20]; Uspenskii's more explicit proof hinges on the principle of impossibility of perpetual motion. Gutfreund and Little [21], following Golomb [22], prove Fermat's theorem of primes by resorting to the statistical mechanics of Ising-spin systems. Pólya [23] lists a variety of mathematical properties which are directly conjectured by physics.

Apart from its specific originality (first physical proof to lean on both principles of thermodynamics), Landsberg's contribution [1] has raised the general issue of the logical link between heuristic and mathematical truth, the latter being seemingly entailed by the former in the procedures under discussion. An a priori plausible explanation of this puzzle was offered by Abriata [7] and Deakin and Troup [8]. These authors suggest that the empirical laws on which the derivation of the mathematical property rests might be redundant; the specific logico-mathematical redundancy then becomes responsible for the actual implication of the mathematical result at stake. This conjecture, while perhaps true in some instances, cannot, however, be taken as logically compulsory.

We suggest in this note that the actual physical model plays the part of a dummy in the physical proof of a mathematical truth; the latter basically flows from the mathematical rules of manipulation of the physics. By offering a specific visualisation of an otherwise highly abstract setting, the physical method serves, however, the useful purpose of sometimes providing a concrete support, if not a mold, for the mathematical reasoning.
At the end of the steps guided by physics, the model can be thrown away and the abstract context can be reestablished. To formalise these ideas we shall translate the logical mechanism of the physical proofs into the concise language of Propositional Calculus. We shall clarify it further by taking a closer look at a simple illustration. Finally we shall show how Landsberg's proof (in its simplest version [2]) fits into this picture. We should observe that strictly speaking a logically satisfying analysis would require a full axiomatisation of
the physical model (as propounded by Bunge [24]). For the present more modest purposes we feel that a partial logical characterisation of the model already enables us to unscramble the structure of the physical proof.

The physical model, $\mathcal{P}$, which is conducive to the proof is made up of an aggregate of "elementary physical systems" specified in some "space"; the "behaviour" or alternatively the "function" of these subsystems is ruled by a set of (empirically established) "physical laws". Of the concepts listed (elementary physical system, law, ...) we keep only those appendages which admit a mathematical description; the formal structure of the model is then specified by the collection of these mathematical formulations. More explicitly, let $p_1, p_2, \dots, p_n$ be the list of the individual (true) mathematical statements about the physics of the model. Then we understand by the specific physics $P_0$ of the model $\mathcal{P}$ the conjunction of the mathematical statements $p_i$, $i = 1, 2, \dots, n$:

$$p_1 \wedge p_2 \wedge \dots \wedge p_n = P_0 . \qquad (1)$$
Needless to say, $P_0$ has to be logically consistent. In order to handle the mathematical statements $p_i$, $i = 1, 2, \dots, n$, we necessarily have to appeal to pure mathematics, and in particular to those mathematical fields $M_1, M_2, \dots, M_m$ which allow a meaningful analysis of these statements; thus if $p_1$ and $p_2$ are statements about a numerical property of a physical concept of $\mathcal{P}$, then we clearly have to include the area of algebra, $M_1$, among the pure mathematics needed to investigate our model; if $p_3$ is a topological statement, the field of topology, $M_2$, is required in our mathematical discussion of the model, etc. Let then $M_1, M_2, \dots, M_m$ denote the collection of areas of pure mathematics we have to resort to. More specifically we shall understand by $M_1, M_2, \dots, M_m$ the totality of (true) statements within these particular mathematical branches; we then denote by $M$ the conjunction of all of these mathematical truths:

$$M_1 \wedge M_2 \wedge \dots \wedge M_m = M . \qquad (1')$$
Physically we can interpret $M$ as the laws of manipulation of the physics. Any permissible behaviour of the model, or more abstractly any true statement about the model, must be compatible with

$$P_0 \wedge M = P , \qquad (2)$$
which we shall refer to as the model truths of the physical model $\mathcal{P}$. Again (2) has to be logically consistent. It is already clear at this stage that the admittedly incomplete formalisation attempted above has transformed the physical model problem into a piece of mathematics. If we ignore the physical semantics usually associated with the statements $p_1, p_2, \dots, p_n$ and keep to their bare mathematical contents, then these statements just play the part of axioms or "boundary conditions" which complete the axiomatics of $M_1, M_2, \dots, M_m$.
We should perhaps also stress in this context that the specific mathematical fields $M_j$, $j = 1, 2, \dots, m$, the physicist chooses to include in his treatment largely depend on the latter's taste; where one physicist merely appeals to elementary geometry, another one may find differential topology a better-suited framework. In the formal setting introduced above this implies that both physicists are dealing with different formal structures. Let $\mu$ now be a particular truth of one of the (mutually compatible) fields $M_j$, $j = 1, 2, \dots, m$; then the implication

$$M \to \mu \qquad (3)$$
holds. But in the context of model $\mathcal{P}$ the model truths hold, so that in particular $M$ is true; therefore we have the implication

$$P \to \mu , \qquad (4)$$
i.e. theorem $\mu$ is derivable from the model truths. Suppose that to actually prove $\mu$ we make use of the method of contradiction. To this end we substitute formula (2) into formula (4) and rewrite the latter in the logically equivalent form

$$\neg\mu \to \neg P_0 \vee \neg M , \qquad (4')$$

(the sign $\neg$ stands for denial). Assume now that $M$ holds true. Then the latter formula, together with (1), becomes

$$\neg\mu \to \neg p_1 \vee \neg p_2 \vee \dots \vee \neg p_n , \qquad (5)$$

which reads: If theorem $\mu$ does not hold, then the mathematical statement $p_1$ about the physics of the model, or the mathematical statement $p_2$ about the physics of the model, or ... statement $p_n$ about the model is not true. In this paradoxical formulation deductive science (mathematics) leads to inferences on empirical truths.

Of course, the origin of this paradox merely lies in the unjustified assumption that $M$ holds true: if this assumption holds, we have from formula (3) that the theorem holds as well. Both logical schemata (4) and (5) tell us that the specific model $\mathcal{P}$ we have chosen enables us to derive theorem $\mu$; the theorem itself is tautologously true as a consequence of the laws of manipulation of the physical laws $P_0$ governing the model, i.e. as a consequence of the mathematics underlying the model; the precise physics plays no part in the logical structure of the proof.

To illustrate this point we present and discuss a physical proof of the one-dimensional version of Brouwer's fixed point theorem: If $f(x) = x'$ is a continuous function that transforms any point $x$ defined in the closed interval $0 \le$ [$x \le 1$ into a point $x'$ of the same interval, then there exists at least one point $x_*$ such that $x_*$] $= f(x_*)$.

Fig. 1. The optical model.

The optical device shown in fig. 1 makes this theorem virtually self-evident: A beam of parallel light rays r in an optically homogeneous medium passes through a slit CD and illuminates a second slit AB into which a mirror M is fitted. The profile of the mirror depends continuously on the position Q. The rays falling on the mirror are reflected back on to a transparent screen S clamped in slit CD. Denote by $x$
and $x'$ the dimensionless heights (normalised such that $x = x' = 0$ on the line AC and $x = x' = 1$ on the line BD) of a point Q on the mirror and its optical image Q' on the screen (i.e. the intersection of the light ray reflected at Q with the vertical line CD). Given any continuous function $f(x)$, it follows from the law of reflection on the mirror and the rectilinear propagation of light rays in a homogeneous medium that we can construct a mirror whose profile depends continuously on the height $x$, and whose normal $n(x)$ hence also varies continuously with $x$. We choose this profile such that to any point Q of height $x$ on M there corresponds a unique image point Q' on S at the preassigned height $x' = f(x)$. The theorem is proved if there is always a point $Q_*$, at some height $x_*$, on M such that the normal $n(x_*)$ is parallel to the incident beam. But this is obviously true: if we denote by $\alpha(x)$ the angle of incidence at the mirror point at height $x$ [$\alpha(x)$ positive for clockwise rotation], then $\alpha(0) \ge 0$ at the lower end-point A and $\alpha(1) \le 0$ at the upper end-point B of M; but since $\alpha(x)$ is a continuous function of $x$, there must be at least one point $x_*$, $0 \le$ …
(2) Topology (concept of continuity and related properties); (3) Calculus. (Calculus is needed to establish the connection between the mirror profile and the function $f(x)$; it also enters the definition of the angles of incidence and reflection.) The reader can now easily repeat the previous proof stripped of its optical interpretation; he will appreciate further that he can introduce an auxiliary function $\alpha(x) = \tfrac{1}{2} \tan^{-1}\{[x - f(x)]/d\}$ (angle of incidence in the optical setup; $d$, distance between M and S) without reference to a profile function; finally, since all that matters in the proof is the 1-1 correspondence between $\alpha(x)$ and $x - f(x)$, he can replace the optically suggested correspondence by the simpler relation $\alpha(x) = x - f(x)$. In this fashion a strict mathematical proof is obtained which is in fact simpler than the proof resulting from a direct translation of the model physics into mathematics.

A further remark is in order. A proof by contradiction conforming to the logical schema (5) can be constructed as follows: Assume that (a) no fixed point $x_*$ exists, and that (b) the image points Q' all have heights $x'$ with $0 \le$ [$x' \le 1$. Then $\alpha(0) >$] $0$, otherwise the image A' of A would have height $x' < 0$. By continuity, $\alpha(x) > 0$ for any height $0 < x \le 1$, in particular $\alpha(1) > 0$; but by assumption (b) the image point B' of B (at height $x = 1$) must have a height $x' < 1$, so that the reflection law cannot hold at point B.

It is perhaps interesting to point out that the proof by contradiction can be given a different form, if we replace the mathematical assumption (b) by the physical assumption: (b') the reflection law holds. Within this setting the proof then conforms to the more conventional schema $\neg\mu \to \neg M$, which is a logical consequence of formula (3). Under those circumstances the incompatibility lies in the mathematical condition $0 \le f(x) \le 1$.
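The purely mathematical argument, with the optical model discarded, can be illustrated numerically. The following is only a sketch under stated assumptions: it uses the simpler auxiliary function $\alpha(x) = x - f(x)$, for which $\alpha(0) \le 0 \le \alpha(1)$ whenever $f$ maps the unit interval into itself, and it locates a zero of $\alpha$ by interval halving. The function name `fixed_point`, the tolerance `tol` and the test function `math.cos` are illustrative choices, not part of the original note.

```python
import math
from typing import Callable


def fixed_point(f: Callable[[float], float], tol: float = 1e-12) -> float:
    """Approximate a point x* in [0, 1] with f(x*) = x*.

    Assumes f is continuous and maps [0, 1] into [0, 1], so that
    alpha(x) = x - f(x) satisfies alpha(0) <= 0 <= alpha(1).
    """
    def alpha(x: float) -> float:
        return x - f(x)

    lo, hi = 0.0, 1.0
    # An end-point may itself be a fixed point.
    if alpha(lo) == 0.0:
        return lo
    if alpha(hi) == 0.0:
        return hi

    # Invariant of the loop: alpha(lo) < 0 < alpha(hi).
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        a = alpha(mid)
        if a == 0.0:
            return mid
        if a < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative choice: cos maps [0, 1] into [cos 1, 1], a subset of [0, 1].
x_star = fixed_point(math.cos)
residual = abs(x_star - math.cos(x_star))
```

Nothing in this sketch refers to mirrors, rays or the reflection law; only continuity of $\alpha$ and the sign change at the end-points are used, which is the point made in the text.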
We finally discuss Landsberg's thermodynamic proof of the inequality between the arithmetic and the geometric means from a slightly different point of view, which brings this proof directly within the context of the above ideas. To this end we refer to Landsberg's simplest device [2], an isolated thermodynamic system $\Sigma$ made up of two incompressible solids $\Sigma_1$ and $\Sigma_2$ of identical and constant specific heat $C$ ($> 0$).
From the entropy one-form and the assumptions of incompressibility and constant and equal specific heat in both subsystems we immediately calculate that the second variation of the entropy around the mutual equilibrium state (brought about through the exchange of energy between the two subsystems) obeys
$$\delta^2 S = -C\,T_{\mathrm{eq}}^{-2}\,(\delta T_1)^2 - C\,T_{\mathrm{eq}}^{-2}\,(\delta T_2)^2 , \qquad (6)$$
where $T_{\mathrm{eq}}$ is the equilibrium temperature given by

$$T_{\mathrm{eq}} = (T_1 + T_2)/2 , \qquad (7)$$
where $T_1$ and $T_2$ are the instantaneous temperatures in $\Sigma_1$ and $\Sigma_2$ respectively when slightly out of mutual equilibrium [(7) is a consequence of the energy conservation within $\Sigma$]. The small temperature excesses $\delta T_i$, $i = 1, 2$, are defined by

$$\delta T_i = T_i - T_{\mathrm{eq}} . \qquad (8)$$
Substituting (7) and (8) into the entropy variation (6), which by the entropy law must be negative or zero, we obtain

$$\delta^2 S = -(C/2)\,(T_1 - T_2)^2 / T_{\mathrm{eq}}^2 \le 0 . \qquad (9)$$
We can view this inequality as a physical proof that the square of $(T_1 - T_2)$ must be non-negative (assuming that $T_{\mathrm{eq}}^2$ is positive). From the algebraic expansion of the relation $(T_1 - T_2)^2 \ge 0$ we generate the inequality between the arithmetic and the geometric means. But the latter step is precisely the conventional mathematical proof of the inequality under discussion.
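For completeness, the algebraic expansion referred to here can be written out. This is a sketch of the standard two-variable argument, under the assumption (natural for absolute temperatures, but not spelled out in this passage) that $T_1, T_2 > 0$, so that the square roots are real:

```latex
\begin{align*}
(T_1 - T_2)^2 \ge 0
  &\;\Longleftrightarrow\; T_1^2 + 2\,T_1 T_2 + T_2^2 \ge 4\,T_1 T_2 \\
  &\;\Longleftrightarrow\; \left(\frac{T_1 + T_2}{2}\right)^{2} \ge T_1 T_2 \\
  &\;\Longrightarrow\; \frac{T_1 + T_2}{2} \ge \sqrt{T_1 T_2}\,.
\end{align*}
```

The first equivalence adds $4 T_1 T_2$ to both sides of the expanded square; the last step takes positive square roots. No thermodynamic input enters at any stage.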
This note was prepared while the author was staying at the Institute of Astronomy, Cambridge. He wishes to thank the members of this institution for the hospitality extended to him. He is particularly indebted to Professor P.T. Landsberg for his comments on an earlier version of the paper. He also gratefully acknowledges a NATO (1981) travel grant operated through the Luxembourg Ministère de l'Éducation Nationale.
References
[1] P.T. Landsberg, Phys. Lett. 67A (1978) 1.
[2] P.T. Landsberg, Thermodynamics and statistical mechanics (Oxford Univ. Press, London, 1978).
[3] P.T. Landsberg, Phys. Lett. 78A (1980) 29.
[4] P.T. Landsberg, J. Math. An. Appl. 76 (1980) 209.
[5] P.T. Landsberg and G. Tonge, J. Appl. Phys. 51 (1980) R1.
[6] S.S. Sidhu, Phys. Lett. 76A (1979) 107.
[7] J.P. Abriata, Phys. Lett. 71A (1979) 309.
[8] M.A.B. Deakin and G.J. Troup, Phys. Lett. 83A (1981) 239.
[9] J.L. Heiberg, Archimedis Opera Omnia, Vol. II (Teubner, Leipzig, 1913) pp. 261-315 and 425-507.
[10] T.L. Heath, The Method of Archimedes, recently discovered by Heiberg (Cambridge Univ. Press, London, 1912).
[11] B.L. van der Waerden, Science Awakening (Noordhoff, Groningen, 1954) p. 212.
[12] O. Toeplitz, The calculus: a genetic approach (University of Chicago Press, Chicago, 1963).
[13] J. Bernoulli, Acta Eruditorum, Leipzig (June 1696) 269; (May 1697) 206; reprinted in: Ostwald's Klassiker der exacten Naturwissenschaften, Vol. 46, ed. P. Stäckel (Engelmann, Leipzig, 1894).
[14] E. Mach, The science of mechanics (Open Court, La Salle, IL, 1960) Ch. IV.
[15] R. Courant, Dirichlet's principle, conformal mapping, and minimal surfaces (Introduction) (Interscience, New York, 1950).
[16] B. Riemann, Gesammelte mathematische Werke und wissenschaftlicher Nachlass, eds. R. Dedekind and H. Weber (Leipzig, 1876).
[17] W. Lietzmann, Experimentelle Geometrie (Teubner, Stuttgart, 1959) pp. 55-56, also pp. 92-108.
[18] H.A. Schwarz, Gesammelte mathematische Abhandlungen, 2. Band (Berlin, 1890) pp. 344, 345.
[19] V.A. Uspenskii, Some applications of mechanics to mathematics (Pergamon Press, Oxford, 1961).
[20] G. Pólya and G. Szegő, Aufgaben und Lehrsätze aus der Analysis, Band 2 (Springer, Berlin, 1925) p. 162.
[21] H. Gutfreund and W.A. Little, Am. J. Phys. 50 (1980) 219.
[22] S.W. Golomb, Am. Math. Mont. 63 (1956) 718.
[23] G. Pólya, Induction and analogy in mathematics, Vol. I (Princeton University Press, Princeton, 1971) Ch. IX, see also Ch. VIII and X.
[24] M. Bunge, Foundations of physics (Springer, Berlin, 1967) Ch. 2.