Entropy
CARLO BIANCIARDI and SERGIO ULGIATI
University of Siena, Siena, Italy
1. Introduction
2. Entropy as a Thermodynamic State Function
3. Statistical Meaning of Entropy and Boltzmann's H-Theorem
4. Entropy and Information Theory
5. The Free-Energy Concept
6. Conclusion
Glossary

energy  Energy is usually defined as the capacity for doing mechanical work—that is, displacing an object from one position to another by means of the application of a force. This textbook definition is sometimes criticized for being reductive and not always easily applicable to processes other than purely physical ones. A broader definition of energy could be the ability to drive a system transformation, which clearly includes all kinds of physical, chemical, and biological transformations.

enthalpy  Enthalpy, H, is a thermodynamic property expressing the total energy content of a system. It is defined as H = U + pV, where U is the internal energy of the system (any kind of energy form of its molecules) and pV is the system's energy in the form of expansion work (in fact pV, the product of pressure and volume, has the physical dimensions of energy). In a chemical reaction carried out in the atmosphere the pressure remains constant and the enthalpy of reaction (ΔH) is equal to ΔH = ΔU + pΔV. Enthalpy is a state function, because it only depends on the initial and final states of the system, no matter how the process develops.

macroscopic/microscopic  Any system is composed of parts, and the parts contribute to the functioning of the whole. Matter is composed of molecules, and an economy is made up of production sectors and firms, for example. The dynamics of the whole system may be, or appear to be, different from the behavior of the component parts. A car may be stopping in a parking lot, yet its molecules possess a chaotic motion, which cannot be seen when we look at the car. A national economy may be in good health, on average, yet some sectors or local firms may face a difficult survival. Macroscopic therefore concerns the system's dynamics, considered as a whole, while microscopic applies to the behavior of the lower scale components.

perfect gas  A perfect or ideal gas is a gas whose molecules are infinitely small and exert no force on each other. This means that their motion is completely chaotic and does not depend on any mutual interaction, but only on completely elastic collisions, which cause a transfer of energy from one molecule to another and a change of their direction and velocity.

probability  The probability of an event is the ratio of the number of times the event occurs to a large number of trials that take place. The number of trials (all events that may occur, including those different from the ones fitting the request) must be very large for the probability to be reliable. As an example, assume we cast a die. If the die is perfectly constructed, the probability of a five is 1 out of 6 possible different castings. However, in a real case, we must cast it many times to verify that a five actually occurs, relative to all possible outcomes, in a ratio of about 1:6.

reversible/irreversible  A reversible change in a process is a change that can be reversed by an infinitesimal modification of any driving force acting on the system. The key word infinitesimal sharpens the everyday meaning of the word reversible as something that can change direction. In other words, the state of the system should always be infinitesimally close to the equilibrium state and the process be described as a succession of quasi-equilibrium states. A system is in equilibrium with its surroundings if an infinitesimal change in the conditions in opposite directions results in opposite changes in its state. It can be proved that a system does maximum work when it is working reversibly. Real processes are irreversible; therefore there is always a loss of potential work due to chemical and physical irreversibility.

self-organization  All systems in nature self-organize—that is, use the available energy and matter inputs in order to achieve higher levels of structure and functioning under natural selection constraints. In so doing, they may maintain their present level of complexity or become more and more complex by also increasing the number of components, functions, and interactions.

statistic  The science dealing with sampling and recording events, which occur in different ways depending on a given parameter (time, temperature, etc.). It estimates
events by recording and analyzing a given behavior verified by an appropriately large number of random samplings. The statistical analysis may lead to hypotheses about the behavior of a given set of events.

temperature  Temperature is a property of an object that tells us the direction of the flow of heat. If heat flows from A to B, we can say that the temperature of body A is higher than that of body B. In a way it measures the concentration of heat within matter and can be linked to the average kinetic energy of the molecules in a body. A thermometric scale (centigrade or Celsius scale of temperature) can be defined by dividing into 100 equal parts the temperature interval between the temperature of melting ice (0°C) and the temperature of boiling water (100°C) at sea level. Other existing scales depend on the reference parameters chosen as the zero level. A scale of absolute temperatures (Kelvin), defined as described in this article, is linked to the Celsius temperature, t, by means of the relationship T (K) = t (°C) + 273.15. A unit absolute degree (K) coincides with a unit centigrade degree (°C).

work  The transfer of energy occurring when a force is applied to a body that is moving in such a way that the force has a component in the direction of the body's motion. It is equal to the line integral of the force over the path taken by the body, ∫ F · ds. Its meaning is different from the everyday sense of the word, which is often associated with activities not always involving visible motion. Physical work requires a body motion under the action of the force.
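As a minimal numerical sketch of two of the formulas quoted above (the enthalpy definition and the Celsius-to-Kelvin conversion), with arbitrary illustrative values:

```python
# Minimal numerical sketch of two glossary formulas; all values are arbitrary.

def celsius_to_kelvin(t_celsius: float) -> float:
    """T (K) = t (degrees C) + 273.15, as stated in the temperature entry."""
    return t_celsius + 273.15

def enthalpy(internal_energy_j: float, pressure_pa: float, volume_m3: float) -> float:
    """H = U + pV; the pV term has the dimensions of energy (Pa x m^3 = J)."""
    return internal_energy_j + pressure_pa * volume_m3

print(celsius_to_kelvin(25.0))            # 298.15 K
print(enthalpy(1000.0, 101325.0, 0.01))   # 1000 J of U plus about 1013 J of pV work
```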
Entropy, a measure of energy and resource degradation, is one of the basic concepts of thermodynamics, widely used in almost all sciences, from pure to applied thermodynamics, from information theory to transmission technology, from ecology to economics. In spite of the mathematical difficulty that its definition and application present for nonspecialists, the word entropy is often used in everyday life, where it assumes less constrained, looser meanings than in the scientific community, being generally associated with disorder, lack of organization, indefiniteness, and physical and social degradation. In this article, we first give the mathematical definition of entropy and discuss its meaning as a basic thermodynamic function in reversible, ideal processes. The relationship of the entropy concept with real irreversible processes, statistical thermodynamics, information theory, and other sciences is then reviewed.
1. INTRODUCTION

The development of thermodynamics, as well as the introduction of the concept of entropy, is deeply rooted in the fertile technological ground of the Industrial
Revolution, which took place in England between the beginning of the 18th and the second half of the 19th century. The development of steam-powered machines (first used to pump water out of coal mines and then to convert into work the heat generated by coal combustion) amplified both technological and purely scientific research, in an interplay of mutual reinforcement. The "miner's friend," an engine built by Thomas Savery in 1698, had a very low efficiency. Other more efficient engines (such as Newcomen's engine in 1712) opened the way to Watt's steam engine (1765), the efficiency of which was much higher than that of Newcomen's engine. The Scottish technician James Watt had to solve several technological and theoretical problems before his machine could effectively replace the previous steam engines. However, at the beginning of the 19th century, hundreds of Watt's machines were used in Europe and constituted the basis for the significant increase of production that characterized the Industrial Revolution. The conversion of energy from one form to another, and in particular the conversion of heat into work, was investigated by many scientists and technicians in order to understand the nature of heat more deeply and to increase efficiency. These studies yielded a general framework for energy conversion processes as well as a set of laws, the so-called laws of thermodynamics, describing the main principles underlying any energy transformation. It is worth noting that the investigation of thermal engines and of the limits to the conversion of heat into mechanical work (second law; Carnot, 1824, "Réflexions sur la puissance motrice du feu et sur les machines propres à développer cette puissance") preceded the statement of the first law of thermodynamics. This is because the conversion of heat into mechanical work (steam engine, second half of the 18th century) first drew attention to the existing constraints and only later led to a much clearer understanding of the nature of heat and to a quantitative equivalence between heat and mechanical work (middle of the 19th century). Clausius then provided Carnot's "Réflexions" with a stronger mathematical basis and introduced the concept of entropy, S, thereby developing a clear statement of the second law. The entropy concept is perhaps the most complex one in physical science, also because it was gradually charged with many meanings belonging not only to thermodynamics but also to a variety of other fields and disciplines. A further definition of the entropy concept came from L. Boltzmann (1844–1906) in the second half of the 19th century, by means of the so-called H-theorem,
in which the statistical meaning of entropy was definitively established.
2. ENTROPY AS A THERMODYNAMIC STATE FUNCTION

A consequence of the first and second laws of thermodynamics is that the efficiency, η_R, of any reversible machine only depends on the two temperatures, T1 and T2, between which the machine operates. In short, after adequately choosing the thermometric scale in order to have T = t + 273.15 (t is the Celsius temperature), it is

Q2/Q1 = T2/T1    (1)

and

η_R = (T1 − T2)/T1,    (2)

independently of the fluid or the amount of heat exchanged. The temperature T is called absolute temperature and is measured in absolute degrees (or Kelvin degrees, K). Since it does not depend on the particular thermometric substance used, it coincides with the one obtained by using an ideal fluid as the thermometric substance. Equation (1) can be rewritten as

Q1/T1 − Q2/T2 = 0    (3a)

and generalized to the case of n sources in a reversible cycle, as

Σ_i Qi/Ti = 0,    (3b)

where Qi has a positive sign if heat is provided to the fluid and a negative sign if heat is released. Eq. (3b) can be further generalized to the case of infinitesimal amounts of heat dQ exchanged by each source in any reversible process with continuous temperature changes:

∮_cycle dQ/T = 0.    (4a)

For any irreversible process S, it would instead be

∮_cycle dQ/T < 0    (4b)

because η_S < η_R.

Let us now consider the most general case of a reversible cycle as the one shown in Fig. 1, with continuous temperature changes. It can always be simplified as the succession of two reversible transformations from A to B (along the pathway 1) and vice versa (along the pathway 2).

FIGURE 1  Diagram of a reversible cycle in the pressure–volume (P–V) plane, with continuous temperature changes, simplified as the succession of two reversible transformations from A to B (along the pathway 1) and vice versa (along the pathway 2).

According to Eq. (4a), we may write

∫_A^B dQ/T |_1 + ∫_B^A dQ/T |_2 = 0,    (5)

where the subscripts indicate the paths from A to B and vice versa. Due to the reversibility of the two transformations A→B and B→A, we may also write

∫_A^B dQ/T |_1 = −∫_B^A dQ/T |_2 = ∫_A^B dQ/T |_2.    (6)

This result is of paramount importance, because it demonstrates that the integral does not depend on the specific reversible transformation from A to B, but only on the actual states A and B. We are therefore able to define a new function that depends only on the system state (such as the internal energy U). Clausius (1850) called this function entropy, defined as follows: Given an arbitrary initial system state O, an arbitrary final state A, and an arbitrary reversible transformation from O to A, the value of the integral

S(A) = ∫_O^A dQ/T + S(O),    (7)

calculated over any reversible transformation O→A, is a function only of the states O and A. This value is called entropy.
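The relations of Eqs. (1) through (3a) are easy to check numerically. The following minimal sketch, with arbitrary reservoir temperatures chosen only for illustration, computes the efficiency of a reversible machine and verifies that the Clausius sum of Eq. (3a) vanishes:

```python
# Reversible engine operating between two heat reservoirs; temperatures are
# arbitrary illustrative values (in kelvin), not data from the article.
T1, T2 = 600.0, 300.0      # hot source and cold sink temperatures
Q1 = 1000.0                # heat taken from the hot source, in joules

eta_R = (T1 - T2) / T1     # Eq. (2): efficiency of the reversible machine
Q2 = Q1 * T2 / T1          # Eq. (1): heat released to the cold sink
W = Q1 - Q2                # work delivered by the cycle

clausius_sum = Q1 / T1 - Q2 / T2   # Eq. (3a): vanishes for a reversible cycle

print(eta_R)          # 0.5
print(W)              # 500.0 (joules)
print(clausius_sum)   # 0.0
```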
Entropy is measured in cal/K or J/K. It clearly appears that S(A) is defined only up to an arbitrary constant S(O); this is of no practical consequence, because we are always interested in entropy differences between states and not in absolute values. We can
therefore equate this constant to zero. For any reversible transformation from a state A to a further state B, the entropy variation ΔS will be

ΔS = S(B) − S(A) = ∫_A^B dQ/T,    (8)

independent of S(O), the entropy of the arbitrary initial state O. Unlike dS, which is an exact differential independent of the path, dQ is path dependent and is therefore often indicated by means of the symbol δQ. The expression on the right side of Eq. (8) is called the Clausius integral. As a consequence, in order to calculate the variation of entropy between the initial state A and the final state B of a system undergoing a real and irreversible process, we must first consider the corresponding reversible transformation of the system between the same two states and calculate ΔS over this pathway. If we then consider a globally irreversible cycle (Fig. 2), made up of an irreversible transformation from A to B and a reversible transformation from B back to A, it must be, according to Eq. (4b) and Eq. (8),

∫_A^B dQ/T |_irr + ∫_B^A dQ/T |_rev < 0    (9a)

∫_A^B dQ/T |_irr + S(A) − S(B) < 0    (9b)

S(B) − S(A) > ∫_A^B dQ/T |_irr.    (9c)
This famous inequality, due to Clausius and named after him, states that the variation of entropy between two states in an irreversible transformation is always higher than the Clausius integral.

FIGURE 2  Diagram of a globally irreversible cycle in the pressure–volume (P–V) plane, with continuous temperature changes, simplified as the succession of one irreversible transformation from A to B (along the pathway 1, dotted line) and one reversible transformation from B to A (along the pathway 2).

From Eqs. (8) and (9c) we have that in a generic transformation,

S(B) − S(A) ≥ ∫_A^B dQ/T,    (10)

where the equality applies only to reversible processes. If a system is isolated (i.e., it has no exchange of energy or matter with the surrounding environment), we have

S(B) − S(A) ≥ 0    (11a)

S(B) ≥ S(A)    (11b)
that is, "If an isolated system undergoes a transformation from an initial state A to a final state B, the entropy of the final state is never lower than the entropy of the initial state." This statement is the so-called entropy law. Therefore, when an isolated system reaches the state of maximum entropy consistent with the constraints imposed upon it, it cannot undergo any further transformation. The state characterized by the maximum entropy is the most stable equilibrium state for the system. From a strictly thermodynamic point of view, the increase of a system's entropy measures the amount of energy no longer usable to support any further process evolution. When all the energy of a system becomes unusable (degraded heat with no temperature gradients to drive heat flows), no more system transformations are possible. In fact, heat cannot be further transformed into other forms of energy, nor can it be transferred from one body to another within the system, because all bodies have the same temperature. It is important to underline that the law of entropy states in an unequivocal way that in an isolated system all real transformations proceed irreversibly towards the state of maximum entropy. The whole universe is assumed to be an isolated system, and therefore we may expect it to reach the maximum entropy state in which its energy is completely converted into heat at the same temperature and no further transformation can be supported. This state would be the state of maximum equilibrium for the universe and has been defined as its thermal death. Finally, unlike in classical mechanics, no time symmetry is observed in thermodynamics. While the movement of a point (i.e., a virtual object) may always occur from A to B and vice versa (time symmetry between t and −t over the time axis), a real
process in an isolated system cannot reverse its evolution unless this change is accompanied by an increase of entropy.

In 1906, Nernst demonstrated that the entropy of any thermodynamic system at the absolute temperature of 0 K is equal to zero, whatever the state of the system (Nernst's theorem). As a consequence of Nernst's statement, also known as the third law of thermodynamics, it is convenient to choose one of these states as the initial state O and to define the entropy of state A (at a temperature T ≠ 0 K) as

S(A) = ∫_O^A dQ/T,    (12)
where the arbitrary constant S(O) of Eq. (7) is set equal to 0, in order to be in agreement with Nernst's theorem, and where the integral must be calculated over any reversible transformation from O (at T = 0 K) to A (at T ≠ 0 K). It is worth noting that, at T = 0 K, it is impossible to have a mixture of finely ground substances. In fact, if it were, it would be possible to diminish the entropy of the system by separating them out of the mixture. This yields an equivalent statement of the third law: The entropy of a pure substance is equal to zero at T = 0 K. As a consequence, the substances that are in a mixture will spontaneously separate from each other when their temperature drops in the proximity of T = 0 K.
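Before moving to the statistical picture, the entropy law of Eqs. (11a) and (11b) can be illustrated by a simple numerical case: a small amount of heat flowing irreversibly from a hot body to a cold body inside an isolated system. The following sketch uses arbitrary illustrative values:

```python
# A small amount of heat Q flows irreversibly from a hot body to a cold body
# inside an isolated system. Both bodies are assumed large enough that their
# temperatures stay practically constant, so dS = Q/T can be applied to each.
# All numbers are arbitrary illustrative values.
Q = 100.0                     # joules transferred
T_hot, T_cold = 400.0, 300.0  # kelvin

dS_hot = -Q / T_hot           # the hot body loses entropy
dS_cold = Q / T_cold          # the cold body gains more entropy
dS_isolated_system = dS_hot + dS_cold

print(dS_isolated_system)     # about +0.083 J/K, i.e., S(B) >= S(A) as in Eq. (11b)
```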
3. STATISTICAL MEANING OF ENTROPY AND BOLTZMANN'S H-THEOREM

Dice players know very well that if we cast two dice it is much easier to get a seven than a twelve. This is because a seven can be obtained as 1 + 6, 2 + 5, 3 + 4, 4 + 3, 5 + 2, and finally 6 + 1, where the first number is the value obtained from the first die, and the second number is the value of the second die. There are therefore six ways to obtain a seven, and only one (6 + 6) to obtain a twelve. We may state that the macroscopic configuration seven has six microscopic states accomplishing it, and is therefore more probable than the configuration twelve. A similar result can be achieved if we consider a box divided into two compartments (Fig. 3A). There are several ways of putting four balls in it. We may put four balls in its left compartment, leaving the right compartment empty. Or we may do the
opposite. The two states are identical from the microscopic and statistical points of view. In fact, if we try to extract one ball from the box, in the dark, we have a 50% probability of putting our hand in the empty part and a 50% probability of putting our hand in the compartment with the four balls. The probability is the same for both compartments, which are therefore perfectly equivalent from a statistical point of view. Therefore, the macroscopic state "four balls in a compartment and no balls in the other" can be achieved by means of two perfectly equivalent microscopic configurations. Things are different if we want to distribute the four balls in both compartments, two in each one of them. We have now six possible ways of distributing our balls, each of them equivalent from a statistical point of view (Fig. 3B). The macroscopic state having two balls per compartment has more ways to occur (more microstates), which means that it is more probable than the one in Fig. 3A. It is easy to understand now that a macroscopic configuration having three balls in a compartment and only one in the other is more probable than the two previously considered configurations (Fig. 3C), since it can be achieved by means of eight different microstates, all perfectly equivalent.

FIGURE 3  Macrostates and microstates in a system made up of four balls in two separated compartments. Each macrostate may correspond to a different number of equivalent microstates. Figures A, B, and C show the different numbers of equivalent microstates that may correspond to a given macrostate, characterized by a measurable temperature, pressure, and volume.

Let us now consider a macroscopic state of a perfect gas, defined by given values of the quantities P (pressure), V (volume), and T (temperature). It may result from a given number N of microstates—that is, a given number N of different positions and velocities of the billions of molecules that compose the gas. In each microstate there will be different values of the position and the velocity of each molecule, but the global result of the behavior of all the molecules will be exactly that macrostate characterized by the measured values of P, V, and T. Of course, the larger the number of possible equivalent microstates N, the higher the probability of the corresponding macrostate. We know from the previous section that an isolated perfect gas tends to and will sooner or later achieve the state characterized by the maximum entropy consistent with the energy U of the gas. This state is also the macrostate characterized by the largest number of equivalent microstates.

Ludwig Boltzmann (1844–1906) understood the strong link between the thermodynamic and the statistical interpretation of the entropy concept. In 1872 he published a milestone paper in which he introduced the basic equations and concepts of statistical thermodynamics. As the first step, Boltzmann assumed that all the n molecules constituting the gas had the same velocity and divided the volume V into a very large number of
cells, M, each having the smallest experimentally measurable size. It is then easy to demonstrate that the number of distinct microstates N corresponding to the macrostate of volume V is

N = n!/(n1! n2! ⋯ nM!),    (13)

where ni is the number of molecules contained in the i-th cell and n = Σ_i ni. Boltzmann also assumed that the entropy of the state is linked to the number N of microstates by the logarithmic expression

S = k log N,    (14)

where k = 1.38 × 10⁻²³ joule/K is the so-called Boltzmann constant. The logarithm base used is arbitrary, since the conversion from one base to another would simply require the introduction of a multiplicative constant. For a reason that will clearly appear in the next section, it is preferable to use logarithms in base two.

According to Nernst's theorem and to Eq. (14), it must be N = 1 when T = 0 K, in order to have S = 0—that is, at the absolute temperature T = 0 K, a system's state is always composed of only one microstate. By applying to Eqs. (13) and (14) the properties of natural logarithms and the Stirling formula for the mathematical expansion of n!, Boltzmann derived the expressions

S = k log N = k log [n!/(n1! ⋯ nM!)] = k H    (15a)

H = −Σ_i pi log pi, with Σ_i pi = 1,    (15b)

where pi is the probability of finding one molecule in the i-th cell. It is not difficult to show that the maximum entropy state is that state for which all the pi's have the same value (i.e., that state in which all the cells are occupied by the same number of molecules). It can also be said that this is the most disordered
state, since the distribution of the molecules among the cells is the most uniform and therefore the one showing the lowest level of organization.

As the second step, Boltzmann dealt with the most general case in which the positions and velocities of the molecules are all different. He introduced a distribution function, f(x, y, z; vx, vy, vz), to account for the different positions and velocities and expressed the pi's as functions of f(x, y, z; vx, vy, vz). He also transformed the summation of Eq. (15b) into an integral, due to the very high number of molecules:

H = −∫ f log f dx dy dz dvx dvy dvz.    (16)

Boltzmann finally demonstrated in 1872 his well-known H-theorem—that is, that it must be

dH/dt ≥ 0.    (17)

According to Eq. (17), H (and therefore S = kH as well) never decreases over time. Actually, Boltzmann defined H without the minus sign in Eq. (16) and then demonstrated that dH/dt ≤ 0; as a consequence, Eq. (15a) became S = −kH. We prefer, in agreement with other authors, the introduction of the minus sign in order to have H increasing over time according to Eq. (17). In so doing, H and S show exactly the same trend. Boltzmann therefore restated by means of statistical considerations the law of entropy in isolated systems—that is, he gave a statistical reading of the second law of thermodynamics and the entropy concept. He also demonstrated that his velocity distribution function f coincided with the one hypothesized by Maxwell in 1859, from which the most probable state corresponding to the maximum entropy could be derived.

Boltzmann's seminal work generated a heated debate about mechanical reversibility and thermodynamic irreversibility. Several critics questioned whether thermodynamic irreversibility could be described and explained in terms of mechanical reversibility. It is not the goal of this article to go into the details of this fascinating debate, which soon became a philosophical investigation about the mechanistic conception of reality as well as the so-called arrow of time (i.e., the spontaneous direction of natural processes as suggested by their irreversibility). The most difficult part of this concept is trying to figure out two enormous numbers—that is, the number of molecules constituting the system (the ideal gas or any other) and the number of microstates contributing to a given macrostate of the system. We would like to underline that Boltzmann only stated
the statistical irreversibility of the evolution of an isolated system. After a sufficiently (infinitely) long time, the system will certainly achieve the most stable equilibrium state, characterized by the maximum entropy, although among the billions of microstates contributing to a given macrostate at any instant there is a small (but nonzero) percentage of microstates capable of driving the system to a macrostate of lower entropy. This means that future lower entropy states are not impossible for the system, but only extremely unlikely.
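The counting behind Eqs. (13) and (14) can be reproduced for the four-ball system of Fig. 3. The following minimal sketch, with the base-2 logarithm adopted in the text, counts the microstates of each occupation of the two compartments and evaluates the corresponding Boltzmann entropy:

```python
import math

k = 1.38e-23  # Boltzmann's constant, joule/K

def number_of_microstates(occupation):
    """Eq. (13): N = n! / (n1! n2! ... nM!) for distinguishable particles in M cells."""
    n = sum(occupation)
    N = math.factorial(n)
    for ni in occupation:
        N //= math.factorial(ni)
    return N

def boltzmann_entropy(occupation):
    """Eq. (14): S = k log N, using the base-2 logarithm adopted in the text."""
    return k * math.log2(number_of_microstates(occupation))

# The macrostates of Fig. 3, written as (balls in left compartment, balls in right).
# Note that Fig. 3 lumps each arrangement with its mirror image, so it quotes
# 2 and 8 microstates where the ordered counts below give 1 and 4.
for occupation in [(4, 0), (3, 1), (2, 2)]:
    print(occupation, number_of_microstates(occupation), boltzmann_entropy(occupation))
# (4, 0) -> 1, (3, 1) -> 4, (2, 2) -> 6: the most uniform occupation has the
# largest number of microstates and therefore the largest entropy.
```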
4. ENTROPY AND INFORMATION THEORY

Information theory is relatively recent. It took its first steps in 1948, when C. E. Shannon published his seminal paper, "A Mathematical Theory of Communication." This paper was the starting point of modern communication technologies, which have so deeply modified all aspects of our life. Shannon's problem was how to measure and transmit a message from a source (transmitter) to a recipient (receiver) through a transmission channel generally characterized by a background noise capable of altering the message itself. We will focus here on the problem of measuring information. First of all, it is important to underline the fact that the concept of message is very broad. However, the set of messages that each source is able to transmit must be specified without any possible ambiguity. A letter of the alphabet, the ring of a bell, a word, or a particular gesture are all examples of possible messages. Shannon considered the n elementary messages that a source could transmit (i.e., the basic alphabet of the source itself). He first assumed that these messages were equally probable (i.e., that the probability for each of the n options to be transmitted was p = 1/n). If two consecutive elementary messages are transmitted, the possible options (i.e., the number of possible different sequences of two messages out of n available ones) are n². In general, if m messages are transmitted in a row, the possible different sequences become N = nᵐ (i.e., the number of all the possible permutations of the n elementary messages in groups of m elements, also allowing for repetitions). At the same time, the information transmitted (i.e., the events or descriptions that characterize the source and that are not yet known by the recipient) should be twice, three times, …, m times the one transmitted by means of only one elementary message. In any case, this
information should be somehow proportional to the number m of messages that make up the transmitted sequence of messages. Shannon assumed it was proportional to the logarithm of the number of possible options N, which easily yielded

I = log N = log nᵐ = m log n = −m log (1/n) = −m log p.    (18a)

The information conveyed by each elementary message is therefore

i = −log p = log (1/p).    (18b)

From Eq. (18b) it appears that the information conveyed by a message is proportional to the logarithm of the reciprocal of its probability. If p < 1, then i > 0. This also means that the less probable the message, the higher the information content transferred from the source to the recipient. If we already know that a given message will be transmitted (p = 1), it will not carry any additional information. In fact, we would not switch the TV on if we already knew which news was going to be announced. Instead, an unlikely and therefore unexpected message carries a significant piece of information. Finally, no information is carried by a message that cannot be transmitted (p = 0).

Shannon then generalized his equation to the case of messages showing different transmission probabilities. He assumed that the source could transmit n elementary messages in groups of m elements, with probabilities pi, where i = 1, …, n. If m is large enough, the i-th message with probability pi will appear m pi times and the total information carried by the sequence of m messages will be

I = −Σ_i m pi log pi.    (19a)

As a consequence, the average information conveyed by each of the m messages will be

H = −Σ_i pi log pi, with Σ_i pi = 1.    (19b)
Equation (19b) has the same mathematical form as Eq. (15b), thus indicating a surprising similarity between Boltzmann’s entropy and Shannon’s information. For this reason, the same symbol H is used and Shannon’s expression is very often called entropy of a source. Actually, the entropy of the source should be indicated as in Eq. (15a), where k is Boltzmann’s constant and H would be Shannon’s expression for the average information of a message. The entropy H of a source, intended as the average information content of a given message, also indicates
the recipient's ignorance about the source, or in other words his a priori uncertainty about which message will be transmitted. This fact should not be surprising, since it is exactly by receiving the message that the uncertainty is removed. In 1971, M. Tribus and E. C. McIrvine provided an interesting anecdote about Shannon, who is reported to have said:

My greatest concern was what to call it. I thought of calling it "information," but the word was overly used, so I decided to call it "uncertainty." When I discussed it with John von Neumann, he had a better idea. Von Neumann told me, "You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, nobody knows what entropy really is, so in a debate you will always have the advantage."
True or not, the story points out the links between the two concepts. Equation (19b) also allows the definition of a unit to measure information. Let us assume that the source releases information by means of a binary alphabet—that is, it is able to transmit only two elementary messages, both having a probability p = 1/2. It is common practice to indicate these messages by means of the two digits, 0 and 1, which therefore become the basis of the so-called digital alphabet and transmissions. The average information carried by an elementary message is, in such a case, from Eq. (19b):

H1 = −(1/2) log₂ (1/2) − (1/2) log₂ (1/2) = −(1/2)(−1) − (1/2)(−1) = 1/2 + 1/2 = 1.    (20)
This unit quantity is called a bit (an acronym for binary digit), which indicates a pure number, with no physical dimensions. One bit of information is, by definition, the information carried by each of the two elementary messages released by a binary source with equal probability 1/2. The reason for the choice of logarithms in base two is now evident. For the sake of completeness, we may also add that another very common unit is the byte (1 byte = 8 bits). Eight bits allow for the construction of 2⁸ = 256 different sets of elementary messages, or different 8-bit strings, to each of which we may, for instance, assign a given symbol on the keyboard of a personal computer.
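Equation (19b) is straightforward to evaluate. The following minimal sketch computes the average information H, in bits, for a fair binary source, recovering the 1 bit of Eq. (20), and for two other illustrative sources:

```python
import math

def average_information_bits(probabilities):
    """Eq. (19b): H = -sum(p_i log2 p_i), the average information per message in bits."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(average_information_bits([0.5, 0.5]))   # 1.0 bit, as in Eq. (20)
print(average_information_bits([0.9, 0.1]))   # about 0.47 bits: a predictable source says less
print(average_information_bits([0.25] * 4))   # 2.0 bits for four equally likely messages
```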
4.1 Maxwell’s Demon
The link between S and H was deeply investigated soon after the introduction of Shannon's theory. Each isolated system can be considered a source of information, and vice versa. For example, an isolated gas with totally thermal energy and uniform temperature is in a high-entropy, disordered state (from a thermodynamic point of view). It is also in a macrostate to which a high number of microstates may correspond, and therefore it is a highly probable state (from a statistical point of view). If so, the uncertainty about the actually occurring microstate is huge and a large number of bits (much information) would be needed to ascertain it (from the point of view of information theory).

A clear example of the link between entropy and information is the so-called Maxwell's demon, a problem that fascinated scientists for a long time. Assume that we have two rooms at the same temperature, due to the fact that in each room there is the same distribution of molecular velocities and kinetic energies. If we could separate the slowly moving molecules from the fast ones, the temperatures of the two rooms would change. Maxwell assumed that this task could be performed by a demon quickly opening or closing a frictionless window in the separation wall, in order to pass fast (hot) molecules to one room and slow (cold) molecules to the other (Fig. 4). The final effect would be the separation of hot from cold molecules and the generation of a temperature gradient (one room would be at a higher temperature than the other), without spending energy from outside, due to the absence of friction in the process. In so doing, the entropy of the system would decrease and the second law would be violated.

This paradox was only solved in 1956 by L. Brillouin. He pointed out that the demon needed information about each approaching molecule in order to decide whether or not to open the window. He could, for instance, light them up. The process of generating information would cause an increase of the overall entropy of the system plus environment, larger than or equal to the entropy decrease of the system alone. From Eq. (15a) and the result of Eq. (20), we have

S = k H = 1.38 × 10⁻²³ joule/K × 1 bit = 1.38 × 10⁻²³ joule/K.    (21)

Therefore, 1 bit of information requires an entropy generation equal to 1.38 × 10⁻²³ joule/K. This entropy cost of information is so low, due to the small value of k, that people thought information was free, as in the case of Maxwell's demon. Of course, if H is calculated by means of natural logarithms, then 1 bit corresponds to k ln 2 entropy units.

FIGURE 4  Maxwell's demon. The demon seems to be able to violate the second law of thermodynamics by selecting the molecules and sorting the hot (gray) from the cold ones (black). The two compartments are initially at the same temperature (A: T1 = T2). After some time, all hot molecules have migrated to the left compartment and cold ones to the right compartment. As a consequence of the selection, the temperature of the left side is higher than the temperature of the right side (B: T1 > T2).

5. THE FREE-ENERGY CONCEPT
What happens when entropy increases? What does it mean in practical terms? Lots of pages have been written, linking entropy increase with disorder, disorder with less organization, less organization with lower quality, lower quality with lower usefulness.
Sometimes these links have been a little arbitrary, because things are different for isolated and open systems. How can we quantify these concepts? We know from Section 2 that if an isolated system undergoes a transformation from a state A to a state B, its entropy never decreases. The entropy law states in an unequivocal way that in an isolated system all real transformations tend irreversibly towards a state of maximum entropy. This statement applies to any kind of energy conversion, not only to heat-to-work conversions. An entropy increase means that a fraction of the energy provided to the system is transformed into degraded heat, no longer usable to support any further process evolution. The energy actually usable is less, and less work can be done. Many feel confused by the use of increasing entropy to mean decreasing ability to do work, because we often look at increasing quantities with positive feelings in our mind; for the entropy concept it is the opposite. The chemist J. W. Gibbs (1839–1903) actually "solved" this problem by introducing a new quantity, which he called available energy, linked to the entropy concept by means of a fundamental equation named after him. Available or free energy was then recognized as being an important property not only in chemistry but in every energy transformation, in which the dissipation of a gradient (temperature, pressure, altitude, concentration, etc.) yields a product or delivers work as well as releasing some degraded energy or matter. Several slightly different expressions exist for available energy, so we will only refer to one of them, the so-called Gibbs energy, or Gibbs free energy, defined for reversible system transformations at constant pressure. This is the most frequent case, because the majority of processes on Earth occur at constant pressure.
5.1 Gibbs Energy

In general, the available energy is that fraction of a system's energy that is actually usable to drive a transformation process. It can be calculated by subtracting from the total system's energy the fraction that becomes unavailable due to the entropy increase. Assuming a process in which heat is transferred at constant pressure p, let us first define the Gibbs-energy function, G, as

G = H − TS,    (22a)

where

H = U + pV    (22b)

is the system's enthalpy, a measure of the total heat content of a system, while

U = Q + W    (22c)

is the well-known expression of the first law of thermodynamics. For a measurable change of the system's state at constant temperature T, from Eq. (22a) we have

Gf − Gi = ΔG = ΔH − TΔS = (Hf − Hi) − T(Sf − Si),    (23)
which quantifies the amount of the available energy ΔG after enthalpy and entropy changes have been accounted for. The subscripts i and f indicate the initial and final states. This result shows the importance of the entropy concept also in practical applications. It indicates that the energy of a system is not always totally available to be transformed into work. Part of this energy is not converted but instead is released as high-entropy waste (degraded heat), no longer able to support a change of state of the system. The terms available and free can now be understood, as opposed to the bound, unavailable fraction of energy present in the entropy term of Eq. (23). A process with ΔG < 0 (i.e., a process whose initial state has higher available energy than the final state) may evolve spontaneously towards the latter and deliver work. This result has a general validity and clearly shows the importance of the entropy-related free energy concept. Heat conversion into work and chemical reactions are clear examples of how the second law and the entropy concept work. In both cases we have spontaneous processes: (1) the heat flows from a source at higher temperature to a sink at lower temperature, and (2) the reaction proceeds from the higher to the lower free energy state. In both cases we have a process driven by a gradient of some physical quantity (respectively, a difference between two temperatures and a difference between two free energy states) and some work delivered to the outside at the expense of the free energy of the system. In general terms, whenever a gradient of some physical quantity (altitude, pressure, temperature, chemical potential, etc.) exists, it can drive a transformation process. The free energy of the system is used whenever these spontaneous processes occur. The result is a lowered gradient and an increased total entropy of the process-plus-environment system, in addition to the work done.
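The spontaneity test based on Eq. (23) is easy to carry out once ΔH and ΔS are known. The following sketch uses invented values, chosen only for illustration, to show how the sign of ΔG = ΔH − TΔS decides whether a transformation at constant temperature and pressure may proceed spontaneously:

```python
def gibbs_energy_change(delta_H, delta_S, T):
    """Eq. (23): dG = dH - T dS, with dH in joules, dS in joule/K, T in kelvin."""
    return delta_H - T * delta_S

# Hypothetical transformation, exothermic and entropy-increasing; the numbers
# are invented for illustration only.
delta_H = -40000.0   # joules
delta_S = 50.0       # joule/K
for T in (200.0, 298.15, 1000.0):
    dG = gibbs_energy_change(delta_H, delta_S, T)
    print(T, dG, "may proceed spontaneously" if dG < 0 else "not spontaneous")
# With these values dG < 0 at every temperature considered, so the process may
# evolve spontaneously and deliver at most -dG of work.
```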
6. CONCLUSION

The entropy concept is fascinating and rich in significance. At the same time, it has appeared scary and confusing to many. Many have focused on the gloomy picture of increasing disorder and thermal death, characteristic of equilibrium thermodynamics and isolated systems. In this regard, entropy is used to point out and measure the decreasing availability of high-quality resources (i.e., resources with low entropy and high free energy), the increase of pollution due to the release of waste, chemicals, and heat into the environment, the increase of social disorder due to degraded conditions of life in megacities all around the world, the "collapse" of the economy, and so on. Others identify entropy as the style of nature: since the biosphere is an open system supported by a constant inflow of solar energy, its structures and life phenomena undergo a continuous process of self-organization. Geologic processes, atmospheric systems, ecosystems, and societies are interconnected through a series of infinitely different and changing relationships, each receiving energy and materials from the other, returning the same, and acting through feedback mechanisms to self-organize the whole in a grand interplay of space, time, energy, and information. During this self-organization process, entropy is produced and then released towards outer space. Living structures do not violate the second law, but feed on input sources, which continuously supply low entropy material and energy (free energy) for system development. While the first point of view calls for increased attention to avoid misuse of resources and prevent degradation of both natural and human environments (i.e., prevent the loss of stored information), the second point of view focuses on the creation of organization and new structures out of disordered materials, thanks to a flow of resources from outside. This point of view calls for adaptation to the style of nature, by recognizing the existence of oscillations (growth and descent) and resource constraints, within which many options and new patterns are however possible. Both points of view are interesting and stimulating and account for the richness of the entropy concept. The latter, first introduced in the field of energy conversion, very soon acquired citizenship in several other fields. We have already mentioned information theory in Section 4, but many other examples can easily be provided.
Environmentally concerned economists have focused on the entropy tax—that is, on the fact that the development of economic systems is always dependent on the degradation of natural resources, the amount of which is limited. Some pointed out that the notion of entropy helps "to establish relationships between the economic system and the environmental system," as noted by Faber et al. in 1983. In 1971, Nicholas Georgescu-Roegen underlined that "The Entropy Law … emerges as the most economic in nature of all natural laws." The ecologist Ramón Margalef introduced in 1968 a quantity, called ecological diversity, D, as a measure of the organization level of an ecosystem. This quantity is defined exactly in the same way as entropy, namely,

D = −Σ_i pi log pi,    (24)
where pi is the probability for an organism to belong to a given species (a short numerical sketch is given at the end of this section). The higher the ecological diversity, the lower the production of entropy per unit of biomass, because resources are better utilized and support the growth of the whole spectrum of the ecosystem hierarchy. Instead, simplified ecosystems with fewer species (such as agricultural monocultures) do not use available resources and coproducts as effectively, resulting in a higher production of entropy per unit of time and biomass.

Finally, entropic styles have been adopted by several artists to express in their works their desire for novelty and their rejection of previous, static views on life. Some understood and meant entropy alternatively as disorder, indefiniteness, motion, rebellion—the opposite of the ordered and well-defined, Newtonian structures of Renaissance artists. In 1974, Rudolf Arnheim published an interesting essay on disorder and order in art, as well as on the need for complexity, which moves artists toward investigating new forms to express their creativity.

In the end, we would do a disservice to the entropy concept by limiting its significance to the concept of disorder, which is, however, only one of its meanings, or even to the practical application of the concept of free energy. Entropy, a word now very common in everyday language, may be suggested as the other side of the coin of life itself, with its development, its richness, and its complexity. Life processes (such as photosynthesis) build organization, add structure, reassemble materials, upgrade energy, and create new information by also degrading input resources. Degraded resources
become again available for new life cycles, ultimately driven by the entropic degradation of solar energy.
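Since Eq. (24) has the same mathematical form as Shannon's H, the ecological diversity D can be evaluated with the same few lines. The following sketch compares hypothetical species abundances (not data from any real ecosystem): a species-rich community and a near-monoculture:

```python
import math

def ecological_diversity(proportions):
    """Eq. (24): D = -sum(p_i log2 p_i), with p_i the fraction of organisms in species i."""
    return -sum(p * math.log2(p) for p in proportions if p > 0)

species_rich = [0.25, 0.25, 0.25, 0.25]      # four equally abundant species
near_monoculture = [0.97, 0.01, 0.01, 0.01]  # one strongly dominant species

print(ecological_diversity(species_rich))      # 2.0
print(ecological_diversity(near_monoculture))  # about 0.24, a much lower diversity
```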
SEE ALSO THE FOLLOWING ARTICLES

Conservation of Energy Concept, History of • Energy in the History and Philosophy of Science • Entropy and the Economic Process • Exergy • Exergy Analysis of Energy Systems • Exergy Analysis of Waste Emissions • Exergy: Reference States and Balance Conditions • Thermodynamics and Economics, Overview • Thermodynamic Sciences, History of • Thermodynamics, Laws of
Further Reading

Arnheim, R. (1974). "Entropy and Art: An Essay on Disorder and Order." University of California Press, Berkeley, CA.
Boltzmann, L. (1872). Further Studies on the Thermal Equilibrium of Gas Molecules (from Sitzungsberichte der kaiserlichen Akademie der Wissenschaften), Vienna.
Brillouin, L. (1962). "Science and Information Theory." Academic Press, New York.
Clausius, R. G. (1850). Ueber die bewegende Kraft der Wärme. Annalen der Physik und Chemie 79, 368–397, 500–524 [translated and excerpted in William Francis Magie, A Source Book in Physics, McGraw-Hill, New York, 1935].
Faber, M., Niemes, H., and Stephan, G. (1983). "Entropy, Environment, and Resources: An Essay in Physico-Economics." Springer-Verlag, Berlin.
Georgescu-Roegen, N. (1971). "The Entropy Law and the Economic Process." Harvard University Press, Cambridge, MA.
Gibbs, J. W. (1931). "The Collected Works of J. Willard Gibbs." Longmans, Green and Co., London.
Margalef, R. (1968). "Perspectives in Ecological Theory." The University of Chicago Press, Chicago.
Maxwell, J. C. (1871). "Theory of Heat." Longmans, Green, & Co., London.
Nernst, W. (1906). Ueber die Berechnung chemischer Gleichgewichte aus thermischen Messungen. Nachr. Kgl. Ges. Wiss. Gött. 1, 1–40.
Prigogine, I. (1978). Time, structure, and fluctuations. Science 201, 777–785. Text of the lecture delivered in Stockholm on 8 December 1977, when Prigogine was awarded the Nobel Prize in chemistry.
Shannon, C. E. (1948). A mathematical theory of communication. Bell System Tech. J. 27, 379–423.
Tribus, M., and McIrvine, E. C. (1971). Energy and information. Scientific American 225(3), 179–184.