
Contents

1 Schroeder Chapter 1 Introduction & Thermal Equilibrium
2 The ideal gas
3 Heat and Work
4 Compression Work
5 Heat Capacity
6 Schroeder Chapter 2 The second law
7 Two-state systems
8 Einstein Solid
9 Interacting Systems
10 Large systems and large numbers
11 Ideal gas
12 Entropy
13 Supplemental: Combinatorics
  13.1 Permutation with repetition
  13.2 Permutation without repetition
  13.3 Combination without repetition
  13.4 Combination with repetition
  13.5 Hypergeometric
14 Supplemental: N_A ≠ N_B
15 Schroeder Chapter 3 Interactions and Implications
16 Entropy and Heat
17 Paramagnetism
18 Supplemental: Gosper's approximation of N!
19 Schroeder Chapter 4 Engines and Refrigerators
20 Heat Engines
21 Refrigerator
22 Real Heat Engines
23 Schroeder Chapter 5 Free energy and chemical thermodynamics
24 Free Energy
25 Free energy as a force towards Equilibrium
26 Phase transformation of Pure Substances
27 Phase transition of Mixtures
28 Uses of thermodynamic potentials
29 Schroeder Chapter 6 Boltzmann Statistics
30 Quantum Statistics
  30.1 Degenerate Fermi Gas

Schroeder Chapter 1 Introduction & Thermal Equilibrium


Thermal physics is the study of the behavior of many-body systems as a function of temperature (hence "thermal").
Statistical mechanics is the microscopic theory that uses statistical ideas to analyze the macroscopic properties of many-body systems.
Thermal physics does not have to be derived from statistical mechanics. It is, like anything in physics, an empirical science based on a small number of basic principles such as energy conservation.
One can measure the pressure and energy of a particular system as a function of temperature.
One can measure susceptibilities of a particular system without knowing any details of the microscopic interactions inside.
Once these are known, the interaction of such systems with other macroscopic systems can be readily calculated.
For instance, you don't have to know much about the interactions among water molecules to calculate how much ice is needed to cool a cup of boiling water to a reasonable drinking temperature. Just look up a table of latent heats in a book (more about that later).
The real fun is to see if we can actually calculate what's measured in an experiment, and also to predict, from stat mech, the properties of physical systems yet to be studied experimentally.
We study thermal equilibrium (statics) and small deviations from it.

Everyday materials are made up of many molecules and atoms. For instance, a mole of gas (that's about 22.4 liters at 1 atm and 0 °C) contains about 6.02×10^23 molecules. That's a huge number, and it goes by the name of Avogadro's number: N_A = 6.02×10^23. If we want to describe a system that contains that many particles, it is impossible to give detailed information about the motion of each particle. First of all, we can't really do that: not even the fastest computer in this day and age can possibly track the motion of 10^23 particles. Second, to solve Newton's equations, we need to know the initial position and velocity of all particles. Suppose each number takes 8 bytes to specify. We need 6 such numbers, so the initial condition of each particle takes about 50 bytes. Therefore we need

M ≈ 50 × 6×10^23 = 3×10^25 bytes     (1)

A good hard disk holds about 10^11 bytes. So one would need about 3×10^14 such hard disks to store the information about the initial condition alone. One hard disk takes up about 0.02 cubic meters. So the volume of the hard disks alone (not to mention the computers) would be about V ≈ 6×10^12 cubic meters. That's roughly 100 kilometers by 100 kilometers by a kilometer.
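The storage estimate above is easy to reproduce. A minimal sketch, using the round numbers assumed in the text (8 bytes per number, 10^11-byte disks, 0.02 m^3 per disk):

```python
# Storage needed for the initial conditions of one mole of particles.
N_AVOGADRO = 6.02e23

bytes_per_particle = 8 * 6                      # 3 positions + 3 velocities, 8 bytes each (~50 B)
total_bytes = bytes_per_particle * N_AVOGADRO   # ~3e25 bytes

n_disks = total_bytes / 1e11                    # disks of ~1e11 bytes each -> ~3e14 disks
total_volume = n_disks * 0.02                   # 0.02 m^3 per disk -> ~6e12 m^3

print(f"{total_bytes:.1e} B, {n_disks:.1e} disks, {total_volume:.1e} m^3")
```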
Fortunately, we are not really interested in the details of such a system. What we are interested in are:
Macroscopic quantities
Intensive quantities: don't depend on the size of the system
Temperature T
Pressure P
Chemical potential μ
Density n
...
Extensive quantities: do depend on the size of the system
Volume V
Number N
Energy U
Entropy S
Heat Q
...
Responses
Coefficient of expansion ∂V/∂T
Compressibility ∂V/∂P
Heat capacity ∂Q/∂T
Magnetic susceptibility ∂M/∂H
...
These are all average quantities, which are averaged not only over the particles in the system but also over all possible initial states. Therefore the physical equations we are interested in are not the microscopic Newton's equations

m d²r_i/dt² = F_i({r_j})     (2)

but the equations that govern the behavior of pressure, energy, temperature, etc. In this regard, the large number actually helps us because it lets us use statistical ideas.
There aren't that many things in physics that are exactly solvable, even if you have the greatest computer ever built at your disposal. Usually solvable systems are simple systems. For instance, any single-particle or two-particle problem in mechanics in 1-D is ultimately solvable. But as soon as you increase the number of particles or the dimensions, things get complicated. Then again, systems that can be simplified due to symmetries, fundamental or accidental, can be solved. An example is the Kepler problem, that is, the motion of a planet or an asteroid with respect to the Sun. With the introduction of computers, the calculation of orbits became so advanced that any deviation from the calculated orbits is taken as the sign of a new object, such as a 10th planet.
However, these kinds of problems are few and far between. As soon as the number of bodies (of similar sizes) becomes three, there isn't much theoretical physics can do about it. One has to resort to a computer calculation. But then, when the number of bodies becomes realistic, say 10^10, even the fastest computer available can't do much about that.
In the late 19th century, physicists started to realize that there is another limit where analytic calculation is possible. This is the extremely-large-number limit. The reasoning is as follows. Suppose you have one mole of a certain gas. You know that there are about 6×10^23 gas molecules in the container. It is not only impossible but absurd to keep track of the motion of every individual molecule, with some 10^24 microscopic degrees of freedom. What one is interested in is just a few average macroscopic quantities such as the pressure, energy density, number density, etc. The idea is then to use statistics to analyze the many-body system. From statistics, we know that the relative error in measuring an average quantity behaves like 1/√N. Now if N is Avogadro's number, this is about 10^−12, which is surely negligible.
Therefore, if we can formulate the many-body problem in terms of average quantities using concepts borrowed from statistics, we may be able to go far in solving for the characteristics of the system.
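The 1/√N behavior can be checked with a toy experiment: count heads in N fair coin flips and look at the spread of the count. This is an illustrative simulation, not from the text:

```python
# Relative fluctuation (std/mean) of the number of heads in N fair coin flips.
# Statistics predicts it falls like 1/sqrt(N).
import random

random.seed(0)

def relative_fluctuation(N, trials=1000):
    counts = [sum(random.random() < 0.5 for _ in range(N)) for _ in range(trials)]
    mean = sum(counts) / trials
    var = sum((c - mean) ** 2 for c in counts) / trials
    return var ** 0.5 / mean

for N in (100, 400, 1600):
    print(N, round(relative_fluctuation(N), 4))   # ~0.10, ~0.05, ~0.025
```

Quadrupling N halves the relative spread, which is the whole point: at N ~ 10^23 the fluctuations are utterly negligible.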
Let me give you a quick example.
Suppose you have 3 particles interacting with a potential that attracts at long distances but repels at short distances. Put them in a large box and ask yourself: what is the density of this small box as a function of time?

My box
Figure 1: 3 bodies in a big box

Well, most of the time, it would be zero. But to know the density as a function of time, we have to know the trajectories of all three particles, and that's hard no matter how simple the interaction is.
However, now suppose that instead of 3 particles, we have 6×10^23 particles in the box.

Figure 2: 10,000 bodies in a box

(Actually, there are only 10,000 dots in this figure.) However, it is clear that


unless clumping happens for some reason (it does: condensation of water droplets, but that involves changing the temperature), the density of this small box in the corner as a function of time is just n = N/V, no matter how complicated the interactions among the molecules are, as long as they remain a gas.
The question then is: can clumping happen? That is, how likely is it that a large deviation from n = N/V occurs in this small volume? Well, it clearly depends on the size of the volume. If the size is so small as to be comparable to the molecular volume, then the answer could be "very frequently." However, that's not what we are interested in. We often talk about a macroscopically small but microscopically large volume. That is, we would like to think that our system is made up of a large enough number of boxes that calculus applies, but each box is big enough to include many particles. This, of course, is an approximation. The question is: how good is this approximation?
Suppose we have N particles in a volume V. We divide the volume into boxes of volume v each. So on average, there are

N_B = (N/V) v     (3)

particles in each box. Now we ask: how likely is it for the number of particles in a box to deviate from N_B by ε percent?
Since things are distributed almost randomly, we can use the binomial distribution to approximate the real situation. For a single particle, the probability that it is in this particular box is p = v/V, which we take to be a small number. Therefore, the probability that there are n particles in this box is given by

P(n) = [N! / (n!(N−n)!)] p^n (1−p)^(N−n)     (4)

Now since v ≪ V, typically n ≪ N. We know that the mean is N_B = pN and the variance is

⟨n²⟩ − ⟨n⟩² = Np(1−p) ≈ Np = N_B     (5)
If N_B is large enough, we can also approximate P(n) with a normal distribution with mean N_B and width √N_B:

P(n) dn ≈ dn (1/√(2πN_B)) exp[−(n − N_B)²/(2N_B)]
        = dx (1/√(2π)) exp[−x²/2]     (6)

where we defined x = (n − N_B)/√N_B.
Let's think about the probability that the number is within N_B(1 ± 0.01), that is, the probability that the actual number in the box is within 1% of N_B. This is

P = Σ_{n=N_B(1−ε)}^{N_B(1+ε)} P(n) ≈ ∫_{−x_ε}^{+x_ε} dx (1/√(2π)) e^(−x²/2)     (7)

where x_ε = εN_B/√N_B = ε√N_B.

Now suppose we divide 1 mole of gas into boxes of 1 μm³ each. 1 mole of gas occupies about 22.4 litres at STP, so that's about 2.24×10^16 boxes. In that case, on average each box has N_B ≈ 6.02×10^23/(2.24×10^16) ≈ 3×10^7 molecules. The square root of that is about 5×10^3, and one percent of that is about 50. The limits of the integral are therefore about −50 to +50. The integral deviates from 1 by about

e^(−1250) ≈ 10^(−540)     (8)

which is practically never. The same goes for 0.1%, 0.01%, and so on. Therefore, as long as N_B is large enough, we have practically no deviation from the average values.
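These numbers can be reproduced directly. A minimal sketch, assuming the 1 μm³ box size and the Gaussian approximation of eq. (7); the true tail probability (around 10^−500s) underflows double precision and simply prints as 0.0:

```python
# Probability that the count in a 1-micron^3 box deviates from N_B by more
# than 1%, via the Gaussian approximation: two-sided tail beyond eps*sqrt(N_B).
import math

N = 6.02e23                    # molecules in one mole
box_volume = 1e-18             # 1 micron^3, in m^3
n_boxes = 0.0224 / box_volume  # 22.4 litres -> ~2.24e16 boxes
N_B = N / n_boxes              # ~3e7 molecules per box

eps = 0.01
x_eps = eps * math.sqrt(N_B)   # ~50 standard deviations

tail = math.erfc(x_eps / math.sqrt(2))  # true value is absurdly tiny; underflows to 0.0
print(f"N_B = {N_B:.2e}, x_eps = {x_eps:.0f}, tail = {tail}")
```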
What did we learn here? We learned that for some quantities in a many-body system, the details of the particle interactions don't matter much. In particular, unless something dramatic happens (we'll get to that later), clumping (practically) never happens. In this sense, the problem of keeping track of 10^23 particles reduces to a much simpler problem of keeping track of only a few average quantities. That's the idea of statistical mechanics.
In this course, we are going to study thermal physics from the viewpoint of statistical mechanics. Stat mech, however, is not the answer to all questions. If you think about it, stat mech is the ultimate theory of matter: all macroscopic systems can in principle be dealt with using stat mech. However, systems like living cells, or the motion of high-speed wind passing over a wing for that matter, are notoriously hard to describe in terms of stat mech. This is because these are dynamic problems. In these problems, system properties change macroscopically all the time, sometimes drastically. Stat mech is hard pressed to solve such problems, this time due to the sheer complexity of the system itself. For instance, suppose that the box of gas we were thinking about is actually part of a wind which sometimes rotates or suddenly changes direction or encounters a brick wall, etc. Yes, the basic equations may be derived from stat mech, but the problem of solving for the properties as a function of time is far from simple.
Now things get calmer if one thinks about static systems. These are systems which were left alone in an isolated box for a long time. In that case, all the turbulence, gusts, vortices, etc. have calmed down and the system has become uniform. This is what we refer to as the Equilibrium State: the system has come to an equilibrium with its environment.
Studying the equilibrium state is much simpler than the non-equilibrium state. Of course, that doesn't mean that we can solve all problems in equilibrium. But we know a lot. Also, in many cases, the answer can be guessed
well before any actual calculation.
In fact, to know the answer beforehand, there are only a few things you really need to know. And I am going to tell you right now what they are. You can take it as a mini summary of what this course is about:
Extensive and Intensive quantities

Extensive quantities are the ones that grow like the system size. These are volume V, number of particles N, total energy U, entropy S, heat Q, Helmholtz free energy F, Gibbs free energy G, enthalpy H, etc.
Intensive quantities are the ones that are independent of the system size. These are temperature T, pressure P, chemical potential μ, density n = N/V; in fact, any ratio of two extensive quantities, or the derivative of one with respect to another, is an intensive quantity.

Energy is conserved.
The amount of energy that enters the system through thermal contact (in other words, via a temperature difference) is the heat Q.
The amount of energy that enters the system through non-thermal contact is the work W.
Total energy change: ΔU = Q + W. In many cases, this reduces to T dS = dU + P dV − μ dN.
There is also this relationship: TS = U + PV − μN.
Total entropy always increases.
Equilibrium means T, μ, and P are the same.
Temperature is proportional to the energy per particle: E ∼ kB T.

This goes by the name of the equipartition theorem. kB = 1.38×10^−23 J/K is the Boltzmann constant. Hard to remember. Easier to remember:
300 K × kB ≈ (1/40) eV (more precisely, 290 K × kB ≈ (1/40.016) eV)
1 eV ≈ 12,000 K × kB

The amounts of kinetic energy and potential energy in a bound system are of the same order of magnitude.
Kinetic energy means pressure.
If you have a large number, a sum and an integral don't differ that much.
A particle can occupy a phase-space volume of d³x d³p/h³.
The probability to have energy E is proportional to the Boltzmann factor: p ∝ e^(−E/kB T).
Fermions are like cats: Fermi energy.
Bosons are like dogs: Bose-Einstein condensate.
Stirling's formula: N! ≈ √(2πN) N^N e^(−N)
ħc ≈ 2000 eV·Å
ħc ≈ 200 eV·nm
me = 0.511 MeV/c² ≈ 0.5 MeV/c²
mN = 940 MeV/c² ≈ 1 GeV/c²
Potential energy tries to organize.
Thermal energy tries to randomize.
These aren't that many, and they are mostly qualitative. However, a large number of qualitative answers can be obtained from these facts. And getting the qualitative answer is just as important as getting the quantitative answer, because getting the qualitative answer right shows that you understand the problem and what is actually going on.
For instance, if you know that the temperature is proportional to the energy and that the pressure is too, then you can easily guess

kB T ∼ P     (9)

Now to make up the dimensions, you need V:

kB T ∼ P V     (10)

But the left-hand side is intensive and the right-hand side is extensive. Since the dimensions already match up, we should use the dimensionless N to get

N kB T = c P V     (11)

where c must be an order-1 number, which in our case turns out to be just 1, or

P V = N kB T     (12)

That's the ideal gas law.
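This guess can be checked numerically against one mole of gas at STP; a minimal sketch with rounded constants:

```python
# Compare PV with N*k*T for one mole at STP (0 C, 1 atm, 22.4 litres).
k = 1.381e-23      # Boltzmann constant, J/K
N = 6.02e23        # Avogadro's number
T = 273.15         # K
P = 1.013e5        # Pa
V = 0.0224         # m^3

print(P * V, N * k * T, P * V / (N * k * T))   # ratio is 1 to within ~0.1%
```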


Another system we are going to study later is a system of spin-1/2 particles, which act like tiny magnets. The question to ask is: suppose you have N spin-1/2 particles in a magnetic field B. What is the average magnetization? First of all, we need to know the energy of each particle. Let's suppose that each particle has magnetic moment μ. Then if the spin lines up with the magnetic field, its energy is −μB. If the spin is anti-parallel to B, its energy is +μB. So naturally, when left alone, each particle would like to align itself with the magnetic field. In that case, the magnetization would be simply Nμ in the direction of B.
But if the system is at finite temperature, then what thermal energy does is randomize the orientation of the spins. The magnets are colliding with each other and with the other particles in the system, getting agitated all the time. Now, the typical thermal energy scale is kT. So we can make the following guess.
If the particles are left alone, that is at T = 0, the magnetization would be simply M = Nμ, since all particles line up with the magnetic field at T = 0.
If the temperature is very high, so that the thermal energy is much greater than the magnetic energy μB, M will be very small because the orientation of the spins will be practically random.
In between, the magnetization would be a function of the ratio μB/kT, so that

M = Nμ f(μB/kT)     (13)

This function f(x) should go to 0 when x goes to zero (tiny B or large T limit) and go to 1 when x becomes large (tiny T or large B limit).

Furthermore, if the direction of B is reversed, M should also reverse. That is, f(x) should be an odd function of x.
There are a few elementary functions that exhibit such behavior: one is the arctangent and the other is the hyperbolic tangent.

Figure 3: Arctangent and Hyperbolic Tangent (plots of atan(x)/(π/2) and tanh(x))


As you can see here, the arctangent approaches ±1 in a power-law way and the hyperbolic tangent approaches ±1 in an exponential way. Now, if you go back to the list we made, you will see that the combination of energy and temperature naturally occurs in an exponential way: that's the Boltzmann factor. So we would guess that the magnetization should behave like

M ≈ Nμ tanh(c μB/kT)     (14)

where c is again an as-yet-unknown number of order 1. Again, in this case c turns out to be 1, and in fact,

M = Nμ tanh(μB/kT)     (15)
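The limiting behavior of eq. (15) is easy to check numerically. A minimal sketch; the values of N, μ, and B below are arbitrary illustrative numbers, not from the text:

```python
# M = N*mu*tanh(mu*B/kT): saturates at low T, vanishes at high T, odd in B.
import math

def magnetization(N, mu, B, kT):
    return N * mu * math.tanh(mu * B / kT)

N, mu, B = 1e22, 9.27e-24, 1.0   # 1e22 spins, ~Bohr magneton (J/T), 1 tesla

print(magnetization(N, mu, B, kT=1e-26) / (N * mu))   # low T: -> 1 (saturated)
print(magnetization(N, mu, B, kT=1e-20) / (N * mu))   # high T: small (randomized)
print(magnetization(N, mu, -B, kT=1e-23))             # reversed field: M flips sign
```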

A similar analogy can be made about melting and boiling. Ordinary materials such as iron are held together by molecular bonding. As you heat up the material, the atoms in the crystal become more and more agitated. This means that, first of all, each one needs more room. Think of a harmonic oscillator: the more energy an SHO has, the bigger its amplitude. So it is with the atoms in a crystal. Now, the atomic potential is not really a simple harmonic potential. So unlike the SHO case, when the amplitude becomes too large (the kinetic energy is too large), the bonding will break down. That is, as the atoms agitate more and more, the thermal kinetic energy overcomes the binding potential energy, and the solid melts or the water boils.
There you have it. What we are going to do from now on is learn how to make more quantitative calculations of these quantities and many related ones. But the spirit is the same.
Temperature
We are going to study thermal physics. Naturally, then, the most important concept is Temperature. We kind of intuitively know what temperature is. For instance, we know that boiling water is much hotter than ice. But what exactly is temperature? How do we define it the way we can define other physically measurable quantities such as the mass or the volume of an object?
There exists a precise definition of what temperature is. However, to talk about that we need to introduce the concept of entropy first, and that can wait. For now, let's think about how we measure temperature practically. Well, we use thermometers, of course. But what exactly are thermometers? What's happening when you stick a thermometer in boiling water and say that the temperature is 100 °C?
To begin with, the thermometer would be at room temperature, about 20 °C. When you stick it in boiling water, it starts to heat up. That is, the temperature of the thermometer gradually becomes the same as the temperature of the boiling water, and that will show up on the scale. This is the operational definition of temperature.
A more theoretical definition would be:
Temperature is the thing that's the same for two objects, after they've been in contact long enough.
That's intuitive. But what do all these words mean exactly? What does "in contact" mean? In the context of temperature, this means that the two objects can exchange energy in some form. What about "long enough"?

Well, this is different from system to system. It depends on the rate of heat transfer, or the heat conductivity. For instance, steel conducts heat fairly quickly. So if you build a house out of steel, your house will become cold very quickly when winter comes, and heating it will take a lot of energy. In this case, we say that the relaxation time is short.

Figure 4: You start with this

On the other hand, if you put styrofoam between you and the steel wall, it will take a long time for the air inside your house to become as cold as the outside air once it has been heated up. But eventually, without an additional heat source, it will become just as cold. It just takes much longer than with the steel wall alone. In this case, we say the relaxation time is long.
"Short" and "long", however, are relative terms. The above examples measure time on a human scale. But that's good enough. All we want to get out of this is that there is a characteristic time for each system to become acclimatized to its surroundings. "Long enough" means longer than this characteristic relaxation time.
Now, when two systems are in contact for long enough, they'll come to the state of Thermal Equilibrium. This is the state in which, on average, there is no energy exchange between the two systems. That is, on average, things become static, or independent of time.
Remember our example of 3 particles and 10,000 particles in a box? Even if the particles are still actively moving around, the density of the system

Figure 5: end up with this.

remains (practically) the same for all time. If you think of each particle carrying a certain amount of energy, then you can say that the temperature of any small box is the same as the temperature of the whole box. That is, they have come to the state of thermal equilibrium.
In this example, there is another quantity that remains the same: the average number of particles in the box. You can start with an initial state where all particles are in the right half of the box, but quickly the system will become homogenized and it can never go back to the initial state. This is another kind of equilibrium, called diffusive equilibrium. In it, there is no net exchange of particles between the systems. Now, if there is a movable wall between two systems, then depending on the pressures the wall can move around, changing the volumes of the two systems in contact. When the two pressures become the same, the forces on the wall balance and the wall stops moving.
This is called mechanical equilibrium, and in this case, what ceases to be exchanged is the volume.
In all these examples of equilibrium, something is flowing, such as energy or number of particles. When two objects are brought into contact with each other, usually one has more of a tendency to give up energy than the other. This has nothing to do with the absolute amount of energy each system has. The atmosphere has a lot more energy than a hot piece of steel, but still it

Figure 6: Mechanical Equilibrium

is the hot steel that gives up the energy.
Therefore something makes the energy flow from one system to another. Looking ahead, this is ultimately the role entropy plays. However, we will just say here that:
Temperature is a measure of the tendency of an object to spontaneously give up energy to its surroundings.
Now that we have a fair idea of what temperature means, we need a unit. In everyday life, we use Celsius (centigrade) or Fahrenheit. The official SI unit, however, is the kelvin (not "degrees Kelvin"). A 1 kelvin difference is the same as a 1 °C difference, but the zero point is different. In Celsius, 0 degrees is defined by the freezing point of water. In kelvin, 0 (often called absolute zero) is defined as the point at which the pressure of a low-density gas goes to zero. In Celsius, zero kelvin is −273.15 °C. Please note that unless otherwise stated, all formulas in thermodynamics work with temperature in kelvin. (Cf. the triple point of water: 273.16 K = 0.01 °C.)
Operationally, we use the fact that certain properties of materials are well known as functions of temperature, such as the expansion of mercury or alcohol (see also Fig. 1.3 of the textbook), to measure the temperature. More sophisticated instruments that measure extremely cold or hot temperatures may use the change in resistance as a function of temperature, or the spectrum of infrared radiation emitted by the surface.
Standard temperature and pressure (STP)
This is 0 °C and 1 atm (= 1.013×10^5 Pa). 1 mole of gas occupies 22.4

litres at STP. At room temperature, it occupies

V_300 = V_STP × (300/273) = 24.6 litres     (16)

The ideal gas

Summary of Lecture 1
Temperature: energy flows from higher to lower
Relaxation time: characteristic time to achieve equilibrium
Thermal equilibrium: no net exchange of energy
Diffusive equilibrium: no net exchange of particles
Mechanical equilibrium: no net exchange of volume
Unit of temperature: kelvin
0 K = −273 °C: the pressure of a low-density gas goes to zero there.
Low density gas and Ideal gas law
Empirically, we know that the properties of low-density gases can be well described by the ideal gas law

P V = nRT     (17)

where
P: pressure, measured in pascal: Pa = N/m²
V: volume, measured in m³
n: number of moles of gas
R: a universal constant, 8.31 J/(mol·K)
T: temperature, measured in kelvin
1 mole is defined to contain one Avogadro's number of molecules:

N_A = 6.02×10^23     (18)

Other measures of pressure include
bar = 10^5 Pa
atm = 1.013×10^5 Pa = 1013 mbar

This is the form often used in chemistry. In physics, it is more useful to rewrite it as

P V = (nN_A)(R/N_A)T = N kT     (19)

where N is the total number of particles (molecules) in the system and

k ≡ R/N_A = 1.381×10^−23 J/K     (20)

is the Boltzmann constant.


This constant is one of the most important in physics because it provides the connection between macroscopic physics and microscopic physics. Notice the unit of k: joule per kelvin, or energy per temperature. Therefore, the existence of this constant indicates that energy can be converted into temperature and temperature into energy. An analogy is the speed of light c, which is another such constant: it provides a way to convert time to length and vice versa, and ultimately the existence of the constant c gave birth to Einstein's relativity. In the case of the Boltzmann constant, it gave birth to statistical mechanics.
The above value of k in joules and kelvin is, however, often inconvenient when considering microscopic physics. The joule is simply too big. The energy unit most often used in atomic and subatomic physics is the electron-volt (eV). This is defined to be the potential energy gained by an electron when it traverses a potential difference of 1 volt. In terms of eV, k is easier to remember:

k (300 K) ≈ (1/40) eV     (21)

or

1 eV ≈ k (12,000 K)     (22)

or, if you have memorized the surface temperature of the sun, ~6,000 K,

k (6,000 K) ≈ 0.5 eV     (23)

These values are fine for rough estimates, but for more quantitative work, you may memorize:

k (290 K) ≈ (1/40.02) eV     (24)
Now, when we started this section, we said:
The ideal gas law is valid for low-density gas. It is an approximation.
What do we mean by that? What does "low density" mean? Let's think about what really happens when the temperature becomes very small. The ideal gas law dictates that in this limit, the product PV goes to zero. Suppose we keep the pressure constant. Now, we know that real molecules and atoms have a finite size. Therefore the volume, however small the temperature is, can't shrink below N v_molecule, where v_molecule is the volume of the molecule itself. In other words, there is a maximum density that a gas can reach:

ρ_maximum = 1/v_molecule     (25)

This happens when there is no room whatsoever between the molecules.


Now let me rewrite the ideal gas law in this way:

P/(kT) = ρ     (26)

where ρ = N/V is the density. If P is held constant, then as T becomes smaller and smaller, the left-hand side becomes larger and larger and eventually would exceed ρ_maximum. But that can't happen. Therefore we have this condition for the validity of the ideal gas law:

ρ ≪ ρ_maximum = 1/v_molecule     (27)

That is, the average space between molecules must be much larger than the size of the molecule. Another way of saying it is that the point-particle approximation must be a good approximation. At constant pressure, this also means that the temperature must be high enough. This makes sense: if the temperature becomes low enough, any gas liquefies and the ideal gas law of course breaks down.
Microscopic Model of Ideal Gas and Equipartition of Energy
Now let's see if we can get any more information out of the ideal gas law. The ideal gas law itself is an empirical law that has been verified many times in laboratory experiments with low-density gases. To get any more information, we need to add some more physical intuition/ingredients. In this case, what we add is our knowledge that all rarefied gases are made up of weakly interacting molecules.

Let's consider how pressure arises in this microscopic picture.
Suppose a single molecule hits a wall and bounces off. If we take the direction of the cylinder to be the x direction, then the initial velocity is

~v_init = (vx, vy, vz)     (28)

and the final velocity is

~v_final = (−vx, vy, vz)     (29)

Actually, there is the recoil of the wall to consider, but since a molecule is so small compared to the wall, we can ignore that. So the momentum change in the bounce is

Δ~p = m(~v_final − ~v_init) = −2m(vx, 0, 0)     (30)

Now, if the size of the container is L, then this will happen every

Δt = 2L/vx     (31)
(31)

On average, the force on the particle while bouncing off a wall is

⟨F~⟩_ptcl = Δ~p/Δt = −2m⟨(vx, 0, 0)/(2L/vx)⟩ = −m(⟨vx²⟩/L, 0, 0)     (32)

By Newton's third law (the action is equal to the reaction), the wall experiences a force

⟨F~⟩_wall = −⟨F~⟩_ptcl     (33)

when a particle bounces off of it.
Then the average force on the wall due to N such particles is

⟨F~⟩_total = N⟨F~⟩_wall = 2Nm⟨(vx, 0, 0)/(2L/vx)⟩ = Nm(⟨vx²⟩/L, 0, 0)     (34)

Pressure is the perpendicular force per unit area:

P = |F|/A = Nm⟨vx²⟩/(AL) = Nm⟨vx²⟩/V     (35)

or

P V = Nm⟨vx²⟩     (36)

This is what I meant when I said that pressure is kinetic energy.


Comparing this with the ideal gas law

P V = N kT     (37)

we conclude

m⟨vx²⟩ = kT     (38)

Now, there is nothing special about the x direction. Therefore

m⟨vx²⟩ = m⟨vy²⟩ = m⟨vz²⟩ = kT     (39)

The average kinetic energy of a molecule is then

⟨K⟩ = (1/2)m(⟨vx²⟩ + ⟨vy²⟩ + ⟨vz²⟩) = (3/2)kT     (40)

This is what I meant when I said that energy is temperature.
This is a remarkable formula. We started with an empirical ideal gas law, threw in some basic microscopic physics, and got a profound result: each momentum degree of freedom contributes kT/2 to the total energy of a particle.
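The bookkeeping in eqs. (30)-(36) can be checked by literally counting wall hits over a time interval. A minimal sketch with illustrative numbers (an N2-like mass, a 1 m box):

```python
# A particle with x-speed vx in a box of length L hits the right wall every
# 2L/vx seconds, delivering impulse 2*m*vx per hit. The time-averaged force
# should come out to m*vx**2/L, as in eq. (32).
m, L, vx = 4.65e-26, 1.0, 500.0      # kg, m, m/s

t_total = 1.0                        # watch the wall for 1 second
hits = int(t_total / (2 * L / vx))   # round trips completed in t_total
force = hits * 2 * m * vx / t_total  # impulse delivered per unit time

print(force, m * vx ** 2 / L)        # the two agree
```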

From the above formula, we can also get the root-mean-square speed of a molecule at a temperature T:

m v_rms²/2 = 3kT/2     (41)

or

v_rms = √(3kT/m)     (42)

Let's plug in some numbers. At room temperature, we know that kT ≈ 1/40 eV. The air is mostly made up of nitrogen molecules, each of which is made up of 2 nitrogen atoms. Each nitrogen atom carries 14 nucleons, and each nucleon weighs about 940 MeV/c², or roughly 10^9 eV/c². Here we are using energy as a unit of mass, via Einstein's famous E = mc². Therefore

m = 28×10^9 eV/c² ≈ 3×10^10 eV/c²     (43)

and then

v_rms = √( 3 × (1/40) eV / (3×10^10 eV/c²) ) ≈ 1.6×10^−6 c ≈ 480 m/s     (44)

That's slightly larger than the speed of sound.
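The same estimate can be cross-checked in SI units; this is an illustrative computation, taking the N2 mass as 28 atomic mass units:

```python
# v_rms = sqrt(3kT/m) for N2 near room temperature.
import math

k = 1.381e-23          # Boltzmann constant, J/K
T = 290.0              # K (kT ~ 1/40 eV)
m = 28 * 1.66e-27      # kg: 28 nucleons of ~1.66e-27 kg each

v_rms = math.sqrt(3 * k * T / m)
print(v_rms)           # ~5e2 m/s, a bit above the speed of sound (~340 m/s)
```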


This division of energy in equal amounts among degrees of freedom goes by the name of equipartition of energy, or simply the equipartition theorem. We'll get to the "theorem" part later, but what it states is that any quadratic term in the energy, be it kinetic, rotational or potential, contributes kT/2 to the total energy. This includes the translational kinetic energy

$K_{\rm tr} = \frac{p^2}{2m}$  (45)

for any value of m, and the rotational kinetic energy

$K_{\rm rot} = \frac{L^2}{2I}$  (46)

where L is the angular momentum and I is the moment of inertia, and any simple harmonic potential energy

$V_{\rm SHO} = \frac{1}{2} m\omega^2 x^2$  (47)
or vibration energy. Oftentimes, when the whole system is in a structurally stable configuration (such as in a crystal), the potential energy near the equilibrium point of each molecule or atom can be approximated by an SHO potential. So this is not as artificial as it first looks.
If a molecule has f such degrees of freedom, then the total energy of the system is

$U = Nf \frac{kT}{2}$  (48)

However, not all degrees of freedom contribute at all temperatures. The translational kinetic energy is always there, so f is at least 3. For rotational energy, quantum mechanics dictates that there is a minimum energy. So unless kT reaches this minimum energy, this degree of freedom does not contribute. This is called freeze-out. When kT does reach it, however, each rotational degree of freedom very quickly comes to contribute kT/2 to the energy. For the vibrational energy, again, there is a minimum energy dictated by quantum mechanics (zero-point energy, if you remember) that's required to excite this sort of motion. So again, unless kT is above the minimum energy, vibrations do not contribute to the total energy. But once they do, they quickly contribute kT/2 per d.o.f.
Note that we are already talking about quantum mechanics here. Many phenomena easily observed in nature are impossible to explain without quantum mechanics. Now, we are not going to use any heavy machinery of QM. But as the opportunities arise, we won't shy away from it either. Having said that, let's consider a simple example where classical and quantum considerations give very different results.
Monatomic gas: f = 3

Diatomic gas with two identical atoms (O2, N2, ...):
- 3 translational d.o.f.
- 2 rotational d.o.f. (rotation around the symmetry axis doesn't count)
- 2 vibrational d.o.f. (kinetic and potential)
Total f = 7

Polyatomic molecule without axial symmetry:
- 3 translational d.o.f.
- 3 rotational d.o.f.
Subtotal f = 6, plus many different kinds of vibrational modes: stretching, bending, ...

Crystal lattice:
- 3 translational d.o.f.
- 3 quadratic potential energies
Total f = 6

Figure 7: Different energies of a diatomic molecule: translational kinetic $p^2/2m$, rotational $L^2/2I$, and vibrational $p^2/2m + m\omega^2 x^2/2$.
Again, some of these can be frozen out at low temperatures. For instance, the air molecules around room temperature only exhibit 5 degrees of freedom, missing the vibrational ones.
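As a rough illustration of eq. (48) and freeze-out, here is a sketch comparing U for one mole of gas for a few values of f (taking 300 K as "room temperature" is an assumption for the example):

```python
k_B = 1.380649e-23   # Boltzmann constant, J/K
N_A = 6.02214076e23  # Avogadro's number, 1/mol
T = 300.0            # room temperature, K

# U = N f kT/2, eq. (48); air near room temperature behaves as f = 5
for f, label in [(3, "monatomic"),
                 (5, "diatomic, vibrations frozen out"),
                 (7, "diatomic, all d.o.f. active")]:
    U = N_A * f * k_B * T / 2.0
    print(f"f = {f} ({label}): U = {U:.0f} J/mol")
```

With f = 5 this gives about 6240 J/mol, which is the right scale for the thermal energy of air.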

Heat and Work

There are a few fundamental principles of physics which are never violated so far as we know. One of them is the conservation of total energy. Others are the conservation of total momentum and the conservation of electric charge. If you are only concerned about non-relativistic physics (chemistry, for instance), then you may add the conservation of mass to the list. Since these laws are obeyed by the most fundamental particles and their interactions, macroscopic systems must also obey the same laws. Trouble is, unlike electric charge, energy can assume many different forms: kinetic energy, potential energy, rotational energy, vibrational energy, ...
If you are concerned about a system of gas in a static or near-static situation, you don't really care about all these forms of energy. Most of the time, what you are concerned about is:
- How much energy did I put into the system? Conversely, how much energy is spent by the system?
- What's the accompanying temperature change?
- How much mechanical work did the system do?
For instance, if you are designing a refrigerator, the temperature is what you most care about. But as we will soon learn, to make the temperature go down, you need to make a volume of gas do work. And if you are designing an engine, what you really care about is the amount of energy put in versus the amount of mechanical work the system has done.
In equation form, we express this as

$\Delta U = Q + W$  (49)

where $\Delta U$ is the total change of energy for the system, Q is the amount of energy that entered the system from thermal contacts with other systems, and W is the amount of energy that entered the system from non-thermal contacts (e.g. mechanical, electrical, etc.). Negative Q or W means the energy was taken out of the system through thermal contacts and non-thermal contacts, respectively. This is referred to as the first law of thermodynamics. But that's just another way of saying that total energy is conserved.

Now, in the textbook, the change in the energy is denoted with the symbol $\Delta$ while the heat and the work do not carry such a symbol. Mathematically, this is because dU is a perfect differential whose integral does not depend on the path of integration. In other words, for energy, if you are at a certain point in the phase space, it doesn't matter how you got there. The energy is determined by the point you occupy. However, things like mechanical work can and will depend on the path that led to the final point.
This is nothing mysterious. In geometrical terms, a perfect differential such as the energy is like the vector displacement. It doesn't matter how you got to the final position. The displacement is always

$\Delta \vec{x} = \int_{\vec{x}_{\rm init}}^{\vec{x}_{\rm final}} d\vec{x} = \vec{x}_{\rm final} - \vec{x}_{\rm init}$  (50)

However, the length of your journey is a totally different matter. The length of your journey

$L = \int_{t_i}^{t_f} \left| \frac{d\vec{x}}{dt} \right| dt$  (51)

depends on the path you take even when you are in 1-D. So $d\vec{x}$ is a perfect differential while the line element $dL = \left| \frac{d\vec{x}}{dt} \right| dt$ is not.
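The distinction is easy to see numerically. Below is a sketch of two 1-D journeys with the same endpoints; the displacement (the integral of the perfect differential dx) agrees, while the path length does not. The particular detour path, overshooting to x = 2 before returning, is an invented example:

```python
N = 100_000
t = [i / N for i in range(N + 1)]

# Two 1-D journeys with the same endpoints x(0) = 0 and x(1) = 1
x_straight = [ti for ti in t]                               # go straight there
x_detour = [4 * ti if ti < 0.5 else 3 - 2 * ti for ti in t] # overshoot to x = 2, return

for name, x in [("straight", x_straight), ("detour", x_detour)]:
    displacement = x[-1] - x[0]                        # path independent
    length = sum(abs(b - a) for a, b in zip(x, x[1:])) # integral of |dx/dt| dt
    print(f"{name}: displacement = {displacement:.3f}, length = {length:.3f}")
```

Both paths report displacement 1, but the lengths are 1 and 3.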
Note that for thermal equilibrium to be established, heat Q must be exchanged between the two systems brought into contact. For mechanical equilibrium, W is the relevant quantity. There are different ways heat can be transferred between systems:
- Conduction: the systems are in contact and kinetic energy is exchanged.
- Convection: circulation of gas or liquid driven by the temperature difference and the resulting density changes.
- Radiation: emission of photons.

Compression Work

In mechanics, work is defined by

$W = \int \vec{F} \cdot d\vec{r}$  (52)

If the force is conservative, that is, if a potential energy can be found so that

$\vec{F} = -\nabla V$  (53)

then the change in W when a particle moves from one point to another does not depend on the path it took. However, if no such potential exists, then the change in W does depend on the path. That's why the book doesn't write $\Delta W$.

Figure 8: Compression work. A force F pushes the piston a distance dx against the gas at pressure P.


Now suppose you have a cylinder full of air with a piston at one end. If you push the piston in, you know from everyday experience that you need a certain amount of force to do so, especially as the piston goes deeper into the cylinder. Now, from the definition of pressure, we know that

$F_n = PA$  (54)

where $F_n$ is the component of the force normal to the surface and A is the area of the surface. Surface in our case, of course, refers to the surface of the piston. Plugging this into the first equation gives

$W = F_n\, dr = PA\, dr = -P\, dV$  (55)

where dV is the amount of volume displaced by the piston moving a small distance dr (the distance in the normal direction). The minus sign indicates that the system got smaller by this amount.

Now, for this formula to apply, the movement of the piston has to be slow so that the system always has time to adjust to the new volume and establish an equilibrium accordingly. This sort of slow movement is called quasi-static movement. Usually, this is a good approximation for an everyday object (translation: size of O(1 m)). For this not to be a good approximation, the piston would have to move close to the speed of sound (330 m/s).
Now, before we plunge into some calculations, let's stop here and think about why pressing the piston requires force. Not only that, why does it get harder as the volume becomes smaller? To see this, we go back to our simple picture of lots of balls bouncing around the room.

Figure 9: Microscopic view of pressure

Now remember that when a ball bounces off of a wall, the momentum changes by

$\Delta |p| = 2m|v_x|$  (56)

and the time between bounces is

$\Delta t = \frac{2L}{|v_x|}$  (57)

so that the average force exerted by a single particle is

$\langle f_x \rangle = \left\langle \frac{\Delta |p|}{\Delta t} \right\rangle = m \frac{\langle v_x^2 \rangle}{L}$  (58)

The average force due to N such particles is

$\langle F_x \rangle = mN \frac{\langle v_x^2 \rangle}{L}$  (59)

The pressure due to N such particles is therefore

$P = \langle F_x \rangle / A = mN \frac{\langle v_x^2 \rangle}{LA}$  (60)

Now suppose the average speed of the particles, or the average kinetic energy of the particles, does not change during the course of the volume change. In other words, suppose the cylinder is in contact with a big system at a temperature T. Since we are talking about a quasi-static change, the temperature in the system is maintained. This sort of change is called isothermal, "iso" being Greek for "the same". In that case, we can see from the force expression that if L gets reduced by 1/2, then the force doubles, because the rate of collisions doubles.
On the other hand, let's consider the other extreme case where the system is totally isolated from the outside. That is, put some big chunk of insulator (styrofoam will do) around the cylinder so that no heat can escape from it. What happens then?
In purely macroscopic terms, we can get the result in the following way. If the process is adiabatic, there is no heat entering or leaving the system. So the first law says

$\Delta U = W = -P \Delta V$  (61)

Note again the sign. However, we also know that

$U = \frac{f}{2} N k_B T$  (62)

and hence

$\Delta U = \frac{f}{2} N k_B \Delta T$  (63)

Equating the two, we get

$\frac{f}{2} N k_B \Delta T = -P \Delta V$  (64)
If the gas obeys the ideal gas law, we then get

$\frac{f}{2} N k_B \Delta T = -N k_B T \frac{\Delta V}{V}$  (65)

or

$\frac{f}{2} \frac{\Delta T}{T} = -\frac{\Delta V}{V}$  (66)

Since

$d \ln x = \frac{dx}{x}$  (67)

we get

$\ln \left( V T^{f/2} \right) = \text{const.}$  (68)

or

$V T^{f/2} = \text{const.}$  (69)

From

$PV = NkT$  (70)

we have $T \propto PV$, so we also get

$P V^{(f+2)/f} = \text{const.}$  (71)
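One can check eq. (69) numerically by compressing an ideal gas in many small quasi-static adiabatic steps, integrating eq. (66) directly; the starting values below are arbitrary:

```python
f = 5                 # diatomic gas, vibrations frozen out
V, T = 1.0, 300.0     # initial volume (arbitrary units) and temperature (K)
invariant_0 = V * T ** (f / 2)

# Integrate (f/2) dT/T = -dV/V, eq. (66), step by step down to half the volume
steps = 100_000
dV = -0.5 / steps
for _ in range(steps):
    T += -(2.0 / f) * T * dV / V
    V += dV

print(f"T = {T:.1f} K")   # exact answer: 300 * 2**(2/5), about 395.9 K
print(f"V T^(f/2) ratio = {V * T ** (f / 2) / invariant_0:.6f}")  # stays ~1
```

Halving the volume adiabatically heats the diatomic gas from 300 K to about 396 K, and the product V T^{f/2} stays constant along the way.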

Heat Capacity

O.K. So compressing or expanding a gas can raise or lower the temperature of the gas by pumping energy into or out of the system through mechanical work. Another way of changing the temperature of the system is, of course, to put it in thermal contact with another system at a different temperature.
Now, experience shows that some systems can soak up a lot of energy before their temperature is substantially raised, while for some other systems it doesn't take much energy to raise/lower the temperature. This property of the system/material under study is called heat capacity. It is defined as

$C = \frac{Q}{\Delta T}$  (72)

In other words, it is the amount of heat needed to raise the temperature by 1 kelvin.
Now, before we do any calculation, let's see if we can guess what C should be. What should it depend on? First of all, consider one litre of water and 10 litres of water. Which takes more energy before the temperature can be raised by 1 degree? The 10 litres of water, of course. And you would expect that the more the water, the more energy you need to raise its temperature. In other words,

$C_V \propto N$  (73)

Now think about a gas of monatomic molecules and one of diatomic molecules. Monatomic molecules can have only 3 degrees of freedom, but we saw that diatomic molecules can have 7 degrees of freedom. Now, the equipartition theorem states that the energy is equally shared among these degrees of freedom. Since each degree of freedom takes kB T/2 worth of energy no matter what the situation, we can guess that it takes more energy to raise the temperature if there are more degrees of freedom. Hence

$C \propto f$  (74)

What should its unit be? Well, since Q is an energy, C must have the unit of energy/temperature. But this is precisely the unit of the Boltzmann constant. Therefore, we can guess that

$C = \text{const.} \times k_B N f$  (75)

where const. should be an O(1) number.


A more fundamental quantity is the specific heat capacity, defined by

$c \equiv \frac{C}{m}$  (76)

where m is the mass of the system.


One thing to notice is that the above definition is ambiguous, for precisely the same reason that we don't write $\Delta Q$. That is, the heat is a process-dependent quantity. In other words, it is a function of how the energy entered the system. Since there are many different ways for heat to enter the system, this is not a well-defined quantity. In mathematical terms, again, Q is not a perfect differential and therefore its integral is path-dependent.
But put that aside for a while and let's think about this for a bit. Before doing any analytic work, what can we say about the specific heat? Think about a monatomic gas and a diatomic gas, and remember the equipartition theorem. Any amount of energy entering the system will be shared equally among the degrees of freedom. In a monatomic gas, the energy will be shared by 3 translational degrees of freedom. But for the diatomic gas, the energy must be shared by up to 7 degrees of freedom. So given an equal amount of heat and all else being equal, it is easier to heat up a monatomic gas than a diatomic gas. In other words, we need less heat to raise the temperature of the monatomic gas. That is, the heat capacity of a monatomic gas must be smaller than that of a diatomic gas. In fact, more degrees of freedom to excite means that the heat capacity will be larger. That is, there are more sponges in each molecule to soak up the heat.
To see all this more explicitly, use

$Q = \Delta U - W$  (77)

and write

$C = \frac{\Delta U - W}{\Delta T}$  (78)

In case of compressional work,

$C = \frac{\Delta U + P \Delta V}{\Delta T}$  (79)

Just as in the consideration of the compressional work, it is the P dV term that is the source of the trouble.
We can consider two extreme cases. First, consider that the volume doesn't change. In that case, there is no mechanical work and

$C_V = \left( \frac{\partial U}{\partial T} \right)_V$  (80)

where the subscript V is there to remind us that the volume is held fixed. Naturally, this is called the heat capacity at constant volume.
On the other hand, we can consider fixing the pressure but not the volume. In this case,

$C_P = \left( \frac{\partial U}{\partial T} \right)_P + P \left( \frac{\partial V}{\partial T} \right)_P$  (81)

where again the subscript P is there to remind us that the pressure is held fixed. Naturally, this is called the heat capacity at constant pressure.
Which one should be larger? If you just look at the formulas, it looks like $C_P$ must be larger than $C_V$ due to the extra term. But is it true? Is the sign of $(\partial V/\partial T)_P$ positive? Well, yes. Higher temperature means a bigger volume is needed to have the same pressure. If you keep the same volume, then the pressure is going to rise as the temperature goes up (remember $P \propto kT$). So to let the steam out, the volume must increase.
O.K. But the question still remains. Why is it reasonable to expect that $C_P$ is larger than $C_V$? This is simply a consequence of energy conservation. If the volume is held fixed, all energy goes into raising the temperature of the system. On the other hand, if the pressure is held fixed, some energy must be spent enlarging the system volume against the external pressure. Therefore it takes more energy to raise the temperature of the system at constant P than at constant V.
How much more then? This depends on the detailed properties of the gas molecules. For an ideal gas with f degrees of freedom,

$C_V = \left( \frac{\partial U}{\partial T} \right)_V = \frac{d}{dT} \frac{N f k T}{2} = \frac{N f k}{2}$  (82)

and

$C_P = \left( \frac{\partial U}{\partial T} \right)_P + P \left( \frac{\partial V}{\partial T} \right)_P = \frac{d}{dT} \frac{N f k T}{2} + \frac{d}{dT} (N k T) = C_V + N k$  (83)
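Eqs. (82) and (83) for one mole of gas, as a quick sketch:

```python
k_B = 1.380649e-23   # Boltzmann constant, J/K
N_A = 6.02214076e23  # Avogadro's number, 1/mol

def molar_heat_capacities(f):
    """Molar C_V and C_P of an ideal gas with f quadratic degrees of freedom."""
    C_V = N_A * f * k_B / 2.0   # eq. (82)
    C_P = C_V + N_A * k_B       # eq. (83)
    return C_V, C_P

for f in (3, 5, 7):
    C_V, C_P = molar_heat_capacities(f)
    print(f"f = {f}: C_V = {C_V:.2f}, C_P = {C_P:.2f} J/(mol K), "
          f"gamma = {C_P / C_V:.3f}")
```

For f = 3 this gives C_V ≈ 12.5 J/(mol K) and the familiar ratio γ = C_P/C_V = 5/3; for f = 5 it gives γ = 7/5.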

Latent Heat
For some systems, it is possible to pump heat in or out without changing the temperature. It may sound odd, but this is an everyday phenomenon. If you leave a glass of ice and water on the table, the temperature of the ice-water system remains at 273 kelvin until all the ice has melted. After that, the water temperature will rise some more to eventually equilibrate with the atmospheric temperature of the room. But this does not mean that no energy was pumped into the ice-water system while the ice was melting. The ice was melting, after all.
This example teaches us the following:
- This sort of thing happens during a phase transition. We'll get to phase transitions later. For now it suffices to have an intuition about it. That is, you know that H2O has three phases, ice, water and vapor, depending on the temperature and pressure.
- The amount of energy put into the system must have been spent to change one phase of matter to another. In the above example, the heat from the atmosphere was used to break up the bonds between water molecules in the ice and make them runny: that's water.
To quantify the amount of energy used in such a phase transition, we define the latent heat

$L \equiv \frac{Q}{m}$  (84)

where m is the mass of the substance undergoing the transition (in the above example, the water).
Note again that since this definition involves heat, it is again ambiguous. One must specify the exact circumstance in which L is measured. The tables in textbooks usually list L values at P = 1 atm. For melting ice,

$L = 333~{\rm J/g}$  (85)

For boiling water,

$L = 2260~{\rm J/g}$  (86)

Where do these numbers come from? Are they natural? Well, we know that a typical atomic energy scale is

$1~{\rm eV} = 1.6 \times 10^{-19}~{\rm J}$  (87)

A water molecule has 2 hydrogens and 1 oxygen, 18 nucleons in total. Therefore

$m_{\rm H_2O} \approx 30 \times 10^{-24}~{\rm g}$  (88)

So the ratio is

$\frac{1~{\rm eV}}{m_{\rm H_2O}} \approx 5 \times 10^3~{\rm J/g}$  (89)

We are in the right ballpark. The above numbers for water mean that the energies involved in breaking the ice into water and the water into vapor must be in the range of about 0.1 eV to 1 eV per molecule. This is, of course, a very rough estimate. But we got it about right to within an order of magnitude, and that means that we are that much closer to actually understanding what goes on at the molecular level.
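The same estimate in code, converting the tabulated latent heats to an energy per molecule (the molecular mass 30 × 10⁻²⁴ g and 1 eV = 1.6 × 10⁻¹⁹ J are the rough values used above):

```python
m_H2O = 30e-24   # mass of one water molecule, g (rough value from the text)
eV = 1.6e-19     # J

for process, L in [("melting", 333.0), ("boiling", 2260.0)]:  # latent heats, J/g
    E_per_molecule = L * m_H2O / eV   # energy per molecule, in eV
    print(f"{process}: {E_per_molecule:.2f} eV per molecule")
```

Melting comes out near 0.06 eV per molecule and boiling near 0.42 eV, i.e. a reasonable fraction of the atomic eV scale.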
Enthalpy
The energy conservation in its first law form is

$\Delta U = Q + W$  (90)

This is the law of nature. You can't argue with that. In some situations, however, it is convenient to rewrite it. One such situation is when the system is under a constant pressure. In that case, the compressional work done on the system while its volume changes by $\Delta V$ is simply

$W_{\rm compressional} = -P \Delta V = -\Delta (PV)$  (91)

Again, note the minus sign. If the volume of the system decreases, work was done on the system. The inclusion of P under the $\Delta$ sign is possible here because P is constant. Otherwise the last step is in general not permitted. In this case, one can rewrite the first law as

$\Delta (U + PV) = Q + W_{\rm others}$  (92)

where $W_{\rm others}$ represents work done on the system by contacts other than thermal and mechanical. This could be magnetic, electric, gravitational, etc.
Let's define the enthalpy

$H = U + PV$  (93)

and rewrite the first law as

$\Delta H = Q + W_{\rm others}$  (94)

Up to now, all we have done is to take P constant and rewrite the energy conservation law. The question is, why are we doing this? Why is this definition useful?
First of all, a lot of everyday phenomena happen under approximately constant pressure, i.e. 1 atm. Second, if there is no other work done on the system, then the above equation simplifies to

$\Delta H = Q$  (95)

By writing it this way, we eliminated the pressure and volume dependence from our consideration. This means that if we can measure the enthalpy just like we measure energy, then all we need to know about energy flow under constant pressure is just that, the enthalpy. In other words, in the absence of other types of work, enthalpy is heat.
For instance, suppose you are boiling some water. To calculate how much heat you need, you can do two things. You can explicitly use

$\Delta U + P \Delta V = Q$  (96)

and look up the needed energy change and the change of volume when, say, a mole of liquid water becomes a mole of water vapor at 100 °C.
On the other hand, if you just know the enthalpy of liquid water and of water vapor, you can just subtract the two and come up with the answer. This is, of course, much easier. Chemistry books are full of tables of enthalpy for different materials. The reason is exactly that it makes a chemist's life that much easier.
O.K. That's fine. But what is this mysterious quantity called enthalpy? What is the meaning of it? Well, what is PV anyway? We had

$W_{\rm compressional} = -P \Delta V = -\Delta (PV)$  (97)

Remember that this is work done on the system. Now think of the atmosphere as the system. Then $-P\Delta V$ is the amount of work done on the atmosphere system to reduce its volume by $|\Delta V|$. In other words, in this case, something or somebody must do this amount of work on the atmosphere to create something other than air with a volume $|\Delta V|$. Or one may say that PV (note that V itself is positive while the change $\Delta V$ can be of either sign) is the amount of work something or somebody must do to push the atmosphere away to make way for something else in its place, water vapor for instance.
In other words, in the expression

$H = U + PV$  (98)

U is analogous to the mechanical kinetic energy, PV is analogous to the mechanical potential energy, and H is analogous to the total energy. Think of pressing against a plate attached to a spring. To make room for yourself, you push the wall away. By doing so, you have increased your potential energy by kx²/2. Water molecules from boiling water in a way need to do the same thing. They have to push away the air molecules to make room for themselves. However, CAUTION: don't take this analogy too far. Although there is some truth to it, it's for illustration only. U in general contains both the kinetic and the potential energy of the molecules.
Now, just as the absolute amount of total energy has no meaning (you can always redefine what you mean by zero by adding a constant), the absolute amount of H has no meaning. The only thing that matters is the difference in enthalpy when something changes into something else. For instance, the change in enthalpy when liquid water changes into water vapor is

$\Delta H_{\rm H_2O} = 40{,}660~{\rm J}$  (99)

per mole of water. Now, a mole of water is about 18 grams. That means the enthalpy change per gram of water is

$\Delta H_{\rm H_2O}/m = 2260~{\rm J/g}$  (100)

which is the same as the latent heat. Of the 40,660 J,

$P \Delta V = N k T = R T = (8.31~{\rm J/K})(373~{\rm K}) = 3100~{\rm J}$  (101)

is spent working against the atmospheric pressure. That's about 8%. The rest of it is spent breaking up the molecular bonds between water molecules.
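The 8% figure follows directly, as a quick check:

```python
R = 8.31         # gas constant, J/(mol K)
T = 373.0        # boiling point of water, K
dH = 40660.0     # enthalpy of vaporization of water, J/mol

P_dV = R * T     # work against the atmosphere per mole of vapor, eq. (101)
print(f"P dV = {P_dV:.0f} J/mol")             # about 3100 J
print(f"fraction = {100 * P_dV / dH:.1f} %")  # about 7.6 %
```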
Another example is burning hydrogen:

$\rm H_2 + \frac{1}{2} O_2 \to H_2O$  (102)

For each mole of water produced,

$\Delta H = -286~{\rm kJ}$  (103)

Huh? Negative enthalpy? Well, this has two explanations. One, you burned approximately one and a half units of gas (1 for hydrogen and 1/2 for oxygen) and got one unit of gas (water vapor), which reduced the volume. Therefore the second term in

$\Delta H = \Delta U + P \Delta V$  (104)

is negative. Two, the potential energy between hydrogen and oxygen is reduced when they bind together to form water. This energy is then released as kinetic energy, thereby raising the temperature. That's burning. In terms of energy put into the system, $\Delta U < 0$. Therefore, overall, heat is released from the system. In terms of heat entering the system, that's a negative quantity. This is a good thing. Otherwise, Montreal winter would be unbearable.

Schroeder Chapter 2 The second law

In this chapter, we are going to study the second law of thermodynamics. The first law

$\Delta U = Q + W$  (105)

is an absolute law of nature. The equality is the equality. The second law is a bit different, although in the end it doesn't really matter. The second law of thermodynamics states:

The entropy always increases.

Stated in this way, it sounds mysterious. But this is not so strange. In everyday language, it sounds something like this:

Suppose you have a system of many particles, say a boxful. The particles inside the box fly around more or less randomly. Therefore, you can consider the probability that the particles in the box are in the phase space volume

$\Delta \Gamma_N = \prod_{i=1}^{N} \Delta x_i\, \Delta p_i$  (106)

around a particular configuration

$\Gamma_N = \{ (x_1, p_1), (x_2, p_2), \ldots, (x_N, p_N) \}$  (107)

The second law of thermodynamics states that a very small fraction of such configurations is overwhelmingly more likely than all the others. The system practically never leaves the vicinity of those most likely configurations. And if the system started out at a point far from those configurations, given enough time (usually very short) the system will always end up near the most probable configurations.

Figure 10: Phase space volume
Now notice here that we are starting to talk about probability. This is the key concept in statistical mechanics. When do you need probability? Well, if you know exactly how a single particle behaves, for instance the movement of a pendulum, then you don't need probability. You know the position and the momentum of that pendulum absolutely. There is no uncertainty. However, if you are watching a fly darting around the room with no detectable pattern, you can't be absolutely sure where the fly will be 2 minutes later. But by observing the motion of the fly long enough, you can guess where it probably will be, i.e. at the garbage can. But you can't be certain, because you don't know what the fly is thinking.
That's it. When you know something about the system but not everything, all you can have is the probability. This could be correlated (since the fly is at the garbage can right now, it will most likely still be there 2 minutes later) or uncorrelated (since there is no garbage can in the room, the probability that it will stay at this corner is just as likely as that it will be at another corner two minutes later), but in any case, you must consider the probability.
Now consider a typical example of a thermodynamic system: a box full of gas molecules. You can't know the exact position and momentum of each of the 10²⁴ particles in the box, and frankly you don't want to know. But this means that you can't absolutely predict what's going to happen to the system two minutes later. The question is, can we then talk about the probability?
O.K. Suppose we want to do that. Then the next question is: how do you define probability anyway?
In this case, we proceed as follows. First we specify the global conditions of the system. Usually, we specify the total energy of the system and the total number of particles. Suppose we do that. Now, that's only 2 constraints among 10²³ degrees of freedom. This means that a lot of different configurations (states) of those 10²³ degrees of freedom can have the same U and N. Now suppose we prepare many, many systems with the same U and N but don't specify anything else. The whole is called an ensemble.
The probability to have any particular configuration (state) C (for instance, configurations where 1/4 of the particles have energy smaller than, say, U/N) is then given by

$P(C) = \frac{\text{Number of systems satisfying } C}{\text{Total number of systems in the ensemble}}$  (108)

In the limit of a large total number of systems, the total number of systems can be thought of as the number of all possible states, and the numerator can be thought of as the number of states satisfying the condition C. Therefore one of the most important problems in stat mech is the counting problem. You need to know how to count, first of all, all possible states, and then all possible states under certain conditions.

Now, in classical mechanics, the state of a particle at any given instance is completely specified by its phase space coordinates and the energy {x, p, E}. All these variables are continuous, and there are 7 of them.
However, in reality we know that the microscopic world is governed not by classical mechanics but by quantum mechanics. The most important fact in quantum mechanics is the particle-wave duality. Fundamentally, a particle obeys a wave equation. Only in the macroscopic limit can one approximate it with a classical equation of motion. You will learn more about it in a quantum mechanics course. Here we'll just briefly state the facts we need to proceed with the rest of the course.
If particles are fundamentally waves, there are many non-trivial consequences. For us, the following are needed:
A wave cannot have zero size as a particle can. It must have a finite extent in phase space. The consequence is the Heisenberg uncertainty principle:

$\Delta x\, \Delta p \geq \hbar / 2$  (109)

which is to say that one cannot measure the position and the momentum of a particle simultaneously with arbitrary precision. That is to say that there is a minimum phase space volume that a particle must occupy. In contrast, a classical particle occupies a point in phase space, which by definition has zero volume.
If we specify p, then the uncertainty principle tells us that we have no idea whatsoever where the particle is actually located. So there is no point in worrying about the position of the particle. All one has to specify is either x or p. In our applications, it is much more convenient to specify p.
When confined either in a box or in a potential, we can have standing waves. These are the only stable forms of waves in a confined condition. Remember that to form standing waves, a certain relationship between the size of the box (or potential) and the wavelength has to be satisfied. The consequence is that the energy levels are discrete. In classical mechanics, x and p can have any real value, so that the energy E can have any real value even if the particle is confined in a potential. That is, the energy levels are continuous. In quantum mechanics, this is no longer true when particles are confined in some way, either in a box or in a potential.
It is possible that different states (labeled by some other quantities such as momentum or angular momentum) can have the same energy. The number of such states for a given energy level is referred to as the multiplicity or the degeneracy of the energy level. In classical mechanics, this number is infinite, because any finite interval of the real number line contains as many points as the whole real line. In this case, one would speak of volume instead of multiplicity. In quantum mechanics, this is of course not true. Not only are the energy levels discrete, but other quantities are discrete as well. Therefore, we can count how many different states share the same energy.
If two particles are identical, there is no distinction between the state where one particle has energy E1 and the other one E2 and the state where the energies are exchanged. This may sound trivial, but it is not. Remember, we have a counting problem. If these were classical particles, we could distinguish two particles even if they had identical properties. We could always mark them with a marker. If you exchange the position and the momentum of two classical particles, you end up with a different state. In other words, the ordering of particles is important. The list (1, 2) and the list (2, 1) are different and hence represent two different states. On the other hand, in quantum mechanics, identical means identical. You can't label two identical particles in any way. So the lists (1, 2) and (2, 1) are the same. That is, quantum mechanics corresponds to orderless sets.
Actually, modern quantum mechanics grew out of a crisis in classical statistical mechanics. One of the reasons quantum mechanics was discovered was 19th century physics' inability to explain the momentum spectrum of black body radiation. Planck's brilliant contribution was to assume that the energy of photons was quantized. This led to a completely different counting for the low energy photons. The crisis was averted and quantum mechanics was born.
In summary:
1. There is a minimum phase space volume that a particle must occupy.
2. Energy levels of a confined particle are discrete.
3. The multiplicity or degeneracy of each energy level is countable.
4. Identical particles are absolutely identical. There can't be any ordering for them.

Two-state systems

Now let's first think about classical counting. In this case, we can label each particle even if they are identical in all other properties, i.e. they are distinguishable.
A prototype classical counting problem is the coin toss. The question is:

If you toss a coin N times, what is the probability of getting n heads?

To answer this question, we need to know the total number of possibilities and the number of possibilities where there are n heads.
The textbook has an example where you toss a penny, a nickel and a dime, in that order. The reason for having different coins is to get away from the issue of identical particles (well, coins). We don't have to use different coins. We might as well toss a single coin three times, remembering that the order is important. Here is a reproduction of Table 2.1 from the textbook:
Penny   Nickel   Dime        Macrostate
  H       H       H          3 heads
  H       H       T
  H       T       H          2 heads (= 1 tail)
  T       H       H
  H       T       T
  T       H       T          1 head (= 2 tails)
  T       T       H
  T       T       T          0 heads (= 3 tails)
There are a total of 8 possibilities according to this table. Intuitively,
then, we have a 1/8 chance of getting either no tail or no head, and a 3/8 chance
of getting either one tail or one head.
This may look like a quite artificial example with no physics
analogy. This is not so. The binary problem happens in physics all the time.
This is because, once the number of states is countable, the simplest
non-trivial problem one can think of involves 2 states. Often enough, at
low temperatures, the most important energy levels are the ground state
and the first excited state.
In physics, each line of the table above corresponds to a microstate. A
microstate is specified if you know all the details of the system. On
the other hand, if you are only concerned with how many tails you have,
but not when and how they appeared, these correspond to macrostates.
In our case these are the states with 3 heads, 2 heads, 1 head and no head. Now,
since we are ignoring details in macrostates, each macrostate corresponds to
many microstates. The number of microstates grouped under a macrostate is called
the multiplicity or the degeneracy of the macrostate. In stat mech, we
mainly use the term multiplicity. The name degeneracy is usually reserved
for multiple states with the same energy in the quantum mechanical sense.
However, this is not a strict rule. You need to be able to tell what means
what from the context. This, however, is usually quite clear. We will in
general denote the multiplicity with the Greek letter Omega, Ω. For instance,
the multiplicity of the 2-head macrostate in the above example is

Ω(2) = 3    (110)
Note that, the way we defined probability, the probability for this state can
be written as

P = Ω(2)/Ω(all) = 3/8    (111)
Now, it is tedious but quite easy to enumerate all the possibilities of a 3-coin
toss. But what if we want to toss a coin many, many times, say, 10^23 times?
Writing down all the possibilities and counting them is out of the question.
Luckily, there is a branch of mathematics that deals precisely with this sort of
thing. It is called combinatorics. At the end of this note for chapter 2,
you will find a summary of often-used counting rules.
Let me quickly summarize it.
Permutation with Repetition: You are picking out numbers for a lottery. To win, you not only need to pick the right numbers (s of them)
but also in the right order. There are N numbers to choose from.
However, repetition is allowed. That is, you can pick 1, 1, 1, ... if you
want to. There are a total of

N^s    (112)

possibilities.

[Figure 11: Permutation with repetition. There are s slots, and each slot can be filled with any of N symbols. For instance, with the 26-letter alphabet there are 26 possibilities per slot, so the multiplicity is N^s.]
Permutation without Repetition: You are picking out numbers for a
lottery. To win, you not only need to pick the right numbers (s of
them), but also in the right order. There are N numbers to choose
from. However, this time, no repetition is allowed. There are a
total of

NPs = N(N−1)(N−2)⋯(N−(s−1)) = N!/(N−s)!    (113)

possibilities.
Combination without repetition: You are picking out numbers for a
lottery. This time the rule is more lenient. You only have to pick the
right numbers (again s of them) regardless of the order. Again there
are N numbers to choose from. No repetition is allowed. There are
a total of

(Permutation of s without repetition out of N) / (Permutation of s without repetition out of s)    (114)

NCs = N(N−1)(N−2)⋯(N−(s−1)) / s! = (N choose s) = N!/((N−s)! s!)    (115)

possibilities.
Combination with repetition: You are picking out numbers for a lottery. This time the rule changed again. You only have to pick the right
numbers (again s of them) regardless of the order. Again there are N
numbers to choose from and, this time, repetition is allowed. There
are a total of

NHs = (N+s−1 choose s)    (116)

possibilities.
For the problem at hand, we can reformulate it this way: suppose you
have N slots, each to be filled with H or T.

[Figure 12: Number of words of length N in a 2-letter alphabet, from HHHH...HHHH at one end, through mixed words such as THHH...HHHH, down to TTTT...TTTT at the other.]

That is, you want to write down all possible words of length N in a 2-letter
alphabet. The alphabet in this world consists of only two letters, H
and T. Systematically, you would start out with the all-heads configuration

Cfirst = HHHH⋯HHH    (117)

and end with the all-tails configuration

Clast = TTTT⋯TTTT    (118)

How many such states are there? Well, for each slot you have 2 possibilities
and you have N slots. Therefore

Ω(all) = 2^N    (119)

This is an example of permutation with repetition.
If you want to know how many words have only one T (or one H), that's just
N, because you can choose any one of the N slots to put the T in and fill all the
other slots with H, and vice versa:

C1 = THHH⋯HH
     HTHH⋯HH
     HHTH⋯HH
     ⋮
     HHHH⋯TH
     HHHH⋯HT    (120)
Now, if you want to know how many words contain two and only two
T's, you need to be able to count the number of different ways of picking out 2
different slots out of N:

C2 = TTHH⋯HHH
     THTH⋯HHH
     THHT⋯HHH
     ⋮
     HHHH⋯HTT    (121)

How many different possibilities are there? Well, for the first slot you have
N choices. For the second slot you have N−1 different choices because one
slot is already occupied. So you have N(N−1) ordered choices for 2 slots,
in which, say, (1, 2) and (2, 1) are counted as different choices. This is an
example of permutation without repetition. But that's not right. These
result in an identical word. So you must divide by 2. Therefore the
multiplicity associated with the 2-tail macrostate is

Ω(2 tails) = N(N−1)/2    (122)

That is, the counting problem becomes combination without repetition.
CAUTION: These identical states have nothing to do with identical
particles. The particles, or coins, here are still distinguishable. This is
purely a matter of counting different words.
You can go on like this. For three heads, you have N(N−1)(N−2)
ordered choices. But states like (1, 2, 3) and (3, 1, 2) lead to the same
word. Now, if you have 3 different objects there are 3! = 6 different ways of
ordering them.
So in general, the number of possible events with s heads is

Ω(s) = (N choose s) = N!/((N−s)! s!)    (123)

This formula is good for s = 0 and s = N with the definition 0! = 1. In both
cases

Ω(0) = Ω(N) = 1    (124)
For large N, this is a huge number. The probability for such a state is

P(s) = Ω(s)/Ω(all) = [N!/((N−s)! s!)] / 2^N    (125)
Since this is a very typical and also important problem in counting, let me
do it once more. Each state can be a head ↑ or a tail ↓. So if you have 2
such particles, all possible combinations appear in the expression

(↑ + ↓)(↑ + ↓) = ↑↑ + ↑↓ + ↓↑ + ↓↓    (126)

If order is important, each state gets a multiplicity of 1. If order is not important, ↑↓ and ↓↑ are two microstates under the same macrostate.
Therefore we should write the above as

(↑ + ↓)(↑ + ↓) = ↑↑ + 2 ↑↓ + ↓↓    (127)

and read off the multiplicity of each macrostate as 1 for the two-heads
macrostate, 2 for the one-head-one-tail macrostate and 1 for the two-tails
macrostate. Likewise, if you have three such particles,
(↑ + ↓)(↑ + ↓)(↑ + ↓) = ↑↑↑ + ↑↑↓ + ↑↓↑ + ↓↑↑ + ↑↓↓ + ↓↑↓ + ↓↓↑ + ↓↓↓
                      = ↑↑↑ + 3 ↑↑↓ + 3 ↑↓↓ + ↓↓↓    (128)

and read off the multiplicity of each macrostate as 1 for three heads, 3
for two heads, 3 for two tails and 1 for three tails.
We can continue like this indefinitely. In general, if you have N particles
which can occupy binary states, all possible states appear in the expansion
of

∏_{i=1}^{N} (↑ + ↓) = (↑ + ↓)(↑ + ↓) ⋯ (↑ + ↓)    (129)

where there are N (↑ + ↓) factors. This is nothing but the well-known binomial
expansion. Therefore, if order is not important, we can write this as
∏_{i=1}^{N} (↑ + ↓) = Σ_{N↑=0}^{N} (N choose N↑) ↑^{N↑} ↓^{N−N↑}    (130)

and read off the multiplicity as

Ω(N↑) = (N choose N↑)    (131)
You can easily extend this to the multinomial expansion. Suppose that the particles in the system can have 3 states, labelled a, b, c. If you have N such
particles in the system, then all possible states of the system itself appear
in

(a + b + c)^N = Σ_{na, nb} T_{N:na,nb,nc} a^{na} b^{nb} c^{nc}    (132)

where na + nb + nc = N and

T_{N:na,nb,nc} = N!/(na! nb! nc!)    (133)
is the trinomial coefficient, which gives the multiplicity of a macrostate with
na particles in the a state and nb particles in the b state.
The justification of this formula is as follows. First, think of b and c as
the same. Then the multiplicity for the macrostate with na particles in the
a state is

N!/(na! (N−na)!)    (134)

Now consider the b and c. There are N−na of them. Now, if I want a
particular state with nb particles in the b state, there are

(N−na)!/(nb! (N−na−nb)!)    (135)

possibilities. So the total multiplicity for a macrostate with na, nb, nc particles
in the a, b, c states is

Ω(na, nb) = [N!/(na!(N−na)!)] × [(N−na)!/(nb!(N−na−nb)!)] = N!/(na! nb! nc!)    (136)

using N−na = nb + nc.
You can continue on. In general, if you have N particles and k states,

Ω(n1, n2, ⋯, n_{k−1}) = N!/(n1! n2! ⋯ n_{k−1}! nk!)    (137)

with n1 + n2 + ⋯ + n_{k−1} + nk = N.
The Two-State Paramagnet
O.K. So what is this good for? Is there any physical situation where this
coin-flipping is relevant? One very practical problem is that of a magnet.
Magnetism stems from the spin of the constituents. A subatomic particle with
a non-zero spin acts like a tiny magnet. If all these tiny magnets tend to line
up with an applied magnetic field, we call the material a paramagnet. If the
line-up persists even after we turn off the external magnetic field, we call such
a material a ferromagnet.
You know that magnets always come as dipoles. That is, there is no
known (to humans, anyway) particle or material in the universe that has only
an S pole or an N pole. Each magnet always comes with both poles; hence the
name dipole. Now, if quantum mechanics allows the dipoles of the constituents
to have only two states (parallel or anti-parallel to the magnetic field), then
we have a two-state paramagnet. This happens, for instance, when the relevant
degree of freedom is the electron spin. An electron has spin one-half, which means
it can have only two states: up or down. You will learn a lot more about
it in a QM course. For now, let's accept it as fact. The problem is: what is
the multiplicity of a state where N↑ of the dipoles are parallel to the
field? If we have a total of N particles, the answer is

Ω(N↑) = (N choose N↑) = N!/(N↑! N↓!)    (138)

where N↓ = N − N↑.
When a magnetic field B is applied, the energy of a dipole parallel to it is

E↑ = −μB    (139)

where μ is the magnetic moment of the particle, and

E↓ = +μB    (140)

for the anti-parallel orientation. The total energy is therefore

U = N↑E↑ + N↓E↓ = μB(N↓ − N↑) = μB(N − 2N↑)    (141)

So specifying the number of up spins is the same as specifying the total
energy, and nothing else. So one can also say that Ω(N↑) is the multiplicity
of the macrostate with energy U = μB(N − 2N↑). We'll learn more about
paramagnetic materials later.
What about identical particles?
One thing you should be careful about is the question of identical particles. Suppose the paramagnetism here is caused by electrons. An electron
is an electron. Any electron that's pointing up is as good as any other. They
are identical. But the formula we used came from coin tossing, where all the
coins were distinguishable! What's going on here? Why isn't there only a
single state when there are N↑ up-spins?
In this case, we are allowed to distinguish the electrons because the
paramagnetic materials are usually in a crystalline structure. That is, each
electron has an assigned lattice site. In that sense, we can say that this
electron belongs to the site (0, 0, 0), this one belongs to (1, 0, 0), and so on. This
is possible because the lattice sites are well separated and the electrons well
localized. If the atoms/electrons are not well separated (for instance, if we are
dealing with a dense liquid of something), then this is not strictly true. We
have to use the full machinery of many-body quantum mechanics with built-in
identical-particle considerations. For now, unless I say otherwise (or the book
says otherwise), we deal with well-separated paramagnetic materials.
Einstein Solid

[Figure 13: Jiggling atoms in a crystal, with neighboring atoms connected by springs of constant k.]

Now let's think about a somewhat more elaborate counting problem: counting
the multiplicity of a particular macrostate with a fixed
energy U for a crystal with L cubic lattice sites. That's a mouthful. Let me
do this again. Suppose you have a crystal that contains L atoms. Further
suppose that these atoms are arranged so that they form a regular
cubic lattice. That's what you get if you draw bars (parallel to the axes)
between integer points in a Cartesian space, or if you build a big cube out
of many small cubes.
Now, at a finite temperature, the atoms don't stay in the same place.
Thermal energy makes them jiggle around their equilibrium positions. If the
amplitude of the oscillation is small (so that the crystal doesn't melt), then
it is always a good approximation to consider each link as a spring, or
simple harmonic oscillator.
A simple harmonic oscillator is the most important quantum mechanical system. This is because, oftentimes, small amplitude motions around an
equilibrium position can be approximated by a simple harmonic oscillator.
Also, the SHO problem is exactly solvable. Another important property is that,
for a quantum mechanical SHO, the energy levels are equally spaced. That
is, the first excited state has an energy of ℏω = hf (ω = √(k/m), ω = 2πf)
above the ground state, the second excited state has an energy of 2ℏω above
the ground state, and so on. Here h is called Planck's constant. This is a
fundamental constant of nature. When we talked about quantum mechanics
a little bit, we said that due to the wave-particle duality, each particle must
occupy a phase-space volume larger than h in each dimension. On the other
hand, a classical particle can occupy a point in the phase space, that is,
volume 0. Hence, if h were zero, we wouldn't have quantum mechanics. On
the other hand, if h were too big, then we would see all kinds of weird stuff
in everyday life (well, we wouldn't think of it as weird, just natural). For
now, just think of it as a conversion constant between frequency and
energy, just as you can think of the Boltzmann constant k as the conversion
constant between temperature and energy. In your regular QM course, all
this will be extensively discussed. For now, let's accept it as fact.
Now consider a crystal with L atoms. Since we live in a 3-D world, each
atom can oscillate in 3 different directions (x, y, z). Therefore, each atom
corresponds to 3 distinct oscillators. The total number of oscillators is
therefore

N = 3L    (142)

Suppose that each of these oscillators has the same frequency ω. The question we
ask is:
What is the multiplicity of a macrostate that has the total energy
U = qℏω?
In other words, we want to distribute q units of energy among N different
locations. A single site can have any number of units of energy. In other words, we
want to know the number of ways to pick q numbers out of N possible numbers, including repetition. Is the order important? No. So the counting
problem reduces to that of combination with repetition. The answer is

Ω(q) = (N+q−1 choose q)    (143)
Interacting Systems
When we started this course, I told you that thermodynamics is the study of
equilibrium. Intuitively, equilibrium is the state a system reaches if it is left alone
for a very long time. For instance, suppose you pour boiling water into a cup
and let it sit for a while. While the water cools, the temperature constantly
changes. After something like an hour, the water temperature becomes the
same as the room temperature. From then on, it doesn't matter when you
measure the temperature of the water. It'll always be the same. That is,
the water has reached thermal equilibrium with the air around it. (Most
likely it didn't reach diffusive equilibrium with the air and will eventually
dry up.)
The question is: why? Why does a system tend to reach equilibrium with
its surroundings?
In this section, we start to answer that question. The full answer will come
later, but already we can get a pretty good basic understanding of why
this happens. It is all about probability and the law of large numbers.
Intuitively, what happens is like this. A prototypical example of the approach to
equilibrium is a box that is half-full of particles at time t = 0, so that initially
the density in the other half of the box is 0.
However, very quickly, the particles fill up the whole box and the density
becomes homogeneous. That is, after a very short time, you wouldn't know
that this system started out half-filled.
SHOW THE MOVIE
Now, Newton's laws are time reversible. That is, you shouldn't be able to
tell whether a movie of the particles is run forward or backward. But if you run
the movie of the box filling up backwards, you can tell.
SHOW THE MOVIE
So somehow nature conspires in such a way that, even though the underlying
dynamics is time reversible, things that start out in a non-equilibrium state
evolve into the equilibrium state and stay there, losing all memory of the
initial state except for a few quantities such as the total energy.
So what's so special about the equilibrium state, so that even if you start
out with quite different states they are all driven to the same equilibrium
state? Well, the answer is that the multiplicity of the equilibrium state is
overwhelmingly larger than that of any other state. For a large number of particles,
say 10^23 particles, overwhelming means not 10 times or 100 times, or even
1000 times, but more like at least ten billion times larger than any other.
Therefore, once a system gets there, it stays there. Very rarely, the system
strays outside the equilibrium state, and this happens only as very small fluctuations.
Large fluctuations are very, very rare. Practically never.
That's fine. Now let's quantify this statement. To do so, let's consider a
simple system made up of 2 identical Einstein solids.
We consider this system to be weakly coupled. This means that the
energy exchange between the two solids is much slower than the relaxation
time within each solid. This is convenient for us because we can then meaningfully define the energy of each solid. If the energy exchange between the two
solids were rapid, then the energy of one solid at any given moment would change
by the next moment, and the energy of each solid wouldn't have a
good definition. But this is a practical concern. There isn't really that big a
need for it. For now, let's suppose so.
The example we are going to consider consists of two Einstein solids, each
with 3 oscillators (NA = NB = 3), and a total of q = qA + qB = 6 units of energy.
The numbers NA and NB don't change with time. The total energy q doesn't
change with time, either. However, qA and qB will fluctuate as the two systems
exchange energy more or less randomly.
The question we ask is this:
What is the multiplicity of a configuration where A has qA units of
energy?
We will also ask:
What is the most likely configuration?
So, let's count. First of all, the number of all possible configurations is
given by, as before,

Ω(all) = 6H6 = (6+6−1 choose 6) = 11!/(5! 6!) = 462    (144)

since we have a total of 6 oscillators and 6 units of energy. Note that it doesn't
matter that we are regarding 3 of them as one unit and the other 3 as a separate unit.
As long as they can exchange energy, we can treat them together as one total
system and apply the formulas we got before.
In general, the multiplicity of a configuration (qA, qB) is given by

Ω(qA, qB) = Ω(qA) Ω(qB)    (145)
O.K. Now consider a situation where A has no energy and B has 6. In
this case, the multiplicity is the same as if B were a stand-alone system:

Ω(0, 6) = 3H6 = (3+6−1 choose 6) = 8!/(6! 2!) = 28    (146)

This, of course, is the same as if A hogs all the energy and B has none:

Ω(6, 0) = Ω(0, 6) = 28    (147)
Now suppose A has 1 unit and B has 5. In that case

Ω(qA = 1) = 3H1 = (3+1−1 choose 1) = 3    (148)

Ω(qB = 5) = 3H5 = (3+5−1 choose 5) = 21    (149)

so that

Ω(1, 5) = 3 × 21 = 63    (150)
It gets really tedious. So let's do algebra. We have

Ω(qA) = (3+qA−1 choose qA) = (2+qA)!/(qA! 2!)    (151)

Ω(qB) = (3+qB−1 choose qB) = (2+qB)!/(qB! 2!)    (152)

so that, with qB = 6 − qA,

Ω(qA, qB) = (1/4) (qA+2)!(8−qA)! / (qA! (6−qA)!)
          = (1/4)(qA+2)(qA+1)(8−qA)(7−qA)    (153)
This gives

Ω(0, 6) = 28     (154)
Ω(1, 5) = 63     (155)
Ω(2, 4) = 90     (156)
Ω(3, 3) = 100    (157)
Ω(4, 2) = 90     (158)
Ω(5, 1) = 63     (159)
Ω(6, 0) = 28     (160)
and that's the answer to the first part of the question. To answer the second
question, which asks which is the most likely state, we need to make
one assumption, which is called the fundamental assumption of statistical
mechanics. It states:
All accessible states are equally probable.
This is an assumption because we can't prove it. It is very likely that if one
fixes only the global quantities, such as the total energy, any state that
has the same total energy is accessible. The assumption here is stronger than
that. We assume that each such state is equally likely. This is a very
fruitful assumption, and it underlies all the derivations of thermodynamics
from statistical mechanics. So, memorize it.
Given this assumption, we can then say that (3, 3) is the most probable
macrostate, which takes up about 1/4 of all possibilities. For the case of 6 oscillators, this is not very impressive. However, as the number of particles grows,
the probability of the most probable state quickly outruns any other.
To see this a little more clearly, consider the next case, where we have

NA = 300,  NB = 200    (161)

oscillators and qtotal = 100 units of energy.
The total number of accessible microstates is

Ω(all) = (N+q−1 choose q) = (599 choose 100) ≈ 9.3 × 10^115    (162)
This is huge. To see how big this number is, think about this: the age of the
universe is about 10 billion years. A year is approximately

1 year ≈ 3 × 10^7 s    (163)

So 10 billion years is about

10^10 years ≈ 3 × 10^17 s    (164)

So only if you count about 10^98 states per second can you count all of the above
states within the lifetime of the universe, and that's with a meager 500 oscillators
and a miserly 100 units of energy! To compare, these days a good CPU can
count up to about 10^9 times per second (that's the Giga in GHz).
To count the multiplicity of each macrostate, again we use

Ω(qA, qB) = ΩA(qA) ΩB(qB)
          = (NA+qA−1 choose qA) (NB+qB−1 choose qB)
          = (NA+qA−1 choose qA) (NB+qtotal−qA−1 choose qtotal−qA)
          = [(NA+qA−1)! / ((NA−1)! qA!)] × [(NB+qtotal−qA−1)! / ((NB−1)! (qtotal−qA)!)]    (165)

There isn't much simplification to be done with this. Time to fire up your
computer and calculate it. Now, I don't know about your computer, but for
most calculators, 69! is the limit. This is because there are fewer than 100 digits
in 69! ≈ 1.7 × 10^98, while 70! ≈ 1.2 × 10^100 exceeds 100 digits. So how are you going
to calculate something like 500!? We'll get to the real trick of calculating the
factorial of large numbers later. For the problem at hand, the trick is not
to calculate the factorials directly.
Think about this:

(N choose s) = N!/(s!(N−s)!)
             = [N(N−1)(N−2)⋯(N−s+1)] / [1·2·3⋯(s−1)·s]
             = (N/s) × ((N−1)/(s−1)) × ((N−2)/(s−2)) × ⋯ × ((N−s+1)/1)
             = ∏_{k=0}^{s−1} (N−k)/(s−k)    (166)

In this way, you only need to calculate the ratios (N−k)/(s−k) and multiply them
together. There is no need to calculate factorials of large numbers and divide
them to get the combinatorics.
Still, calculating the multiplicity with hundreds of oscillators is too tedious and
time consuming for humans. Here is a short C program:

#include <math.h>
#include <stdio.h>

/* calculates combination without repetition, or "N choose s":
 *
 *     / N \
 *     \ s /
 */
int main(void)
{
    double n, s, prod;
    int in, is, k;

    printf("Enter N : ");
    scanf("%d", &in);
    printf("You entered: N = %d\n", in);

    printf("Enter s : ");
    scanf("%d", &is);
    printf("You entered: s = %d\n", is);

    n = (double) in; /* convert in to double */
    s = (double) is; /* convert is to double */

    prod = 1.0;
    for (k = 0; k <= is - 1; k++)
    {
        prod *= (n - k)/(s - k); /* the same as             */
                                 /* prod = prod*(n-k)/(s-k) */
    } /* k loop */

    printf("%d Choose %d equals : %e\n", in, is, prod);
    return 0;
} /* end of main */
Try it out.
Using this sort of program, you can easily reproduce the table in the textbook.
I am not going to reproduce the table here. There are a few important things
to notice about it.
First, the maximum of the multiplicity is reached when

qA/NA = qB/NB    (167)
That is, when the energy is equally distributed among the degrees of freedom.
This is another instance of equipartition of energy. Equilibrium tends
to do that.
Second, note that, compared to the maximum, the multiplicity (and hence
the probability) of a configuration like (qA, qB) = (2, 98) is a factor of about 10^29
smaller. That is, if you prepared 10^29 systems with NA = 300, NB =
200 and qtotal = 100, fewer than 1 system would be in such a configuration. If the
system can visit a million configurations per second, it will take about
10^23 seconds to get to (2, 98). The age of the universe is only about 10^17 seconds.
This is practically never.
Third, notice that configurations like (59, 41) or (61, 39) have about the same
chance as (60, 40). This sort of thing is called thermal fluctuation. In our
case, this is within about 1/60 ≈ 1.7 %, which is noticeable and measurable.
However, this is because the number of degrees of freedom in this case is
rather small, only 500 or so. When this number becomes something like
10^23, even this sort of fluctuation becomes negligible.
To show that, however, we need to know how to deal with the multiplicity of
a system with a truly large number of degrees of freedom and a large energy.
The program given above can handle N and s of about 1000. Even
it, however, fails when N becomes larger than about 1000. On my machine, this
program gives

1000C500 ≈ 2.7 × 10^299    (168)

The largest number my machine can handle is about 10^308, so my machine
fails with

1030C515    (169)

The real value of this is about 2.9 × 10^308.
This means: we need to use our brains and do some math.
Large systems and large numbers
The universe we live in is made of many, many particles which have been
going about their business for many, many years.
A gram of hydrogen contains about 6 × 10^23 protons. Our Sun weighs
about 2 × 10^33 g. That means it has about 10^57 protons. Our galaxy has about
a hundred billion suns, so there are about 10^69 protons in our galaxy.
There are about a hundred billion galaxies. So there are roughly 10^81
protons in the galaxies of our universe.
The number of possible states with this many protons goes roughly like
the factorial of 10^81. That's mind-bogglingly big. But we can deal with it with
the help of a fellow named Stirling.
Before we do that, let's talk about what the textbook calls small, large,
and very large numbers.
Small numbers are numbers we can ordinarily count to in our lifetime. For
instance, if you start counting 1 number each second, you'll reach a million
in less than a year without a problem, even if you sleep and eat. So that's not
a very big number.
Large numbers are numbers you can't ordinarily count to in your lifetime.
For instance, Avogadro's number 6 × 10^23, or the number of protons in
this universe. That's large but not impossibly large. A normal computer
these days can handle up to 10^308. A neat property of a large number is that
if you add or subtract a small number, it doesn't (well, hardly) change, unless
you are interested in that small change. That doesn't happen often,
but be mindful of the possibility.
Suppose you have 6 × 10^23 molecules. If you take away a million
molecules, that's just 1/10^17 of the total. That is, it will change a digit in
the 17th decimal place in scientific notation. That's peanuts. No, less
than peanuts. A peanut weighs a few grams. An elephant weighs about a ton.
That's about 1 part in a million.
Very large numbers are almost impossibly large numbers, like the
number of possible microstates of one mole of gas molecules. You can't
count things like this in any reasonable way: not a human way, not a machine
way, never. These are numbers like

10^(10^23)    (170)

That's a 1 followed by 10^23 zeros.
Adding and subtracting small or large numbers from very large numbers doesn't
do anything to them. Even multiplying doesn't do much. If you multiply
10^(10^23) by 10^23, that gives

10^23 × 10^(10^23) = 10^(10^23 + 23) ≈ 10^(10^23)    (171)

Therefore, if something takes this much time, it doesn't really matter whether you
count the time in seconds, years, or even ages of the universe as your unit of time.
Again, this is true unless you happen to be interested in the ratio of two
big numbers. Again, this happens sometimes. So be mindful.
So how do we handle this sort of thing? Well, there are two tools of the trade.
One is the logarithm and the other is Stirling's formula.
As you know, the natural logarithm is defined as the inverse function of
the exponential. That is,

ln exp(x) = x    (172)

Now since

ln a^x = x ln a    (173)

you can always write

10^x = exp(x ln 10)    (174)

or

ln 10^x = x ln 10    (175)

So instead of 10^23 you deal with 23 ln 10 ≈ 53. That is, ln transforms a
large number into a small number and a very large number into a merely large
number.
That should be useful. It is especially useful when combined with
Stirling's formula:

N! ≈ N^N e^(−N) √(2πN)    (176)
Why is this useful? First of all, to calculate N! for N = 1000, say, you
don't have to multiply 1 through 1000. Just use this formula and you'll be
approximately right. Second, it's much easier to think about the log of N! this
way.
How does one justify this formula? Well, one quick way is actually to use
the log. Take the log of N! and you get

ln N! = ln(1 · 2 · 3 ⋯ N) = ln 1 + ln 2 + ln 3 + ⋯ + ln N = Σ_{n=1}^{N} ln n    (177)

O.K. So the product became a sum. How does that help? Well, if you have a
sum, you may approximate the sum with an integral. And if you can do the
integral, then you have a formula. In this case,

ln N! = Σ_{n=1}^{N} ln n
      ≈ ∫_1^N dx ln x
      = x ln x − x |_1^N
      = N ln N − N + 1
      ≈ N ln N − N    (178)

or

N! ≈ exp(N ln N − N) = N^N e^(−N)    (179)
Now, this doesn't get every factor right. But if N is a large number, N! is a
very large number, and the missing factor of √(2πN) doesn't really matter that
much.
Stirling's formula is actually pretty good even for small N.
Multiplicity of a large Einstein Solid
O.K. Now we are ready to tackle the problem of calculating the multiplicity
(and therefore the probability) of a macrostate of a large Einstein solid.
In this case the number of oscillators N is a large number. We first consider
the case where q is also a large number and q ≫ N, that is, q is much, much
larger than N.
The exact formula for the multiplicity is

Ω(N, q) = (q+N−1 choose q) = (q+N−1)!/(q! (N−1)!)    (180)
Using Stirling's formula, this becomes

ln Ω(N, q) ≈ (q+N−1) ln(q+N−1) − (q+N−1)
             − q ln q + q − (N−1) ln(N−1) + (N−1)
           = (q+N−1) ln(q+N−1) − q ln q − (N−1) ln(N−1)
           ≈ (q+N) ln(q+N) − q ln q − N ln N    (181)

where in the last line we used N ≫ 1. Therefore

Ω(N, q) ≈ (q+N)^(q+N) q^(−q) N^(−N) = (q+N)^q (q+N)^N q^(−q) N^(−N)
        = ((q+N)/q)^q ((q+N)/N)^N    (182)

Up to here we didn't use the fact that q ≫ N.
Now we use q ≫ N and make a further approximation:

ln(q+N) = ln q + ln(1 + N/q) ≈ ln q + N/q    (183)

so that

ln Ω(N, q) ≈ (q+N) [ln q + N/q] − q ln q − N ln N
           = (q+N) ln q + (q+N)N/q − q ln q − N ln N
           ≈ N ln q + N + N²/q − N ln N
           = N ln(q/N) + N + N²/q
           ≈ N ln(q/N) + N    (184)
since q ≫ N. Therefore

Ω(N, q) ≈ (q/N)^N exp(N) = (eq/N)^N    (185)

This is a very large number, since the exponent is a large number.
Now consider putting together two large Einstein solids. For simplicity,
let NA = NB = N. The more general case of NA ≠ NB can be found in the
appendix. Again,

Ω = ΩA ΩB = Ω(N, qA) Ω(N, qB)
  ≈ (e qA/N)^N (e qB/N)^N
  = (e qA/N)^N (e (qtotal − qA)/N)^N
  = (e/N)^(2N) qA^N (q − qA)^N    (186)

where we used q = qA + qB.
In the case of the smaller solids, we saw that the most likely value of qA was
determined by

qA/NA = qB/NB    (187)

So, following that, let's guess that this is what will happen in this case too, and
define

qA = q/2 + x    (188)

qB = q − qA = q/2 − x    (189)
This yields

Ω ≈ ((1/2) q + x)^N ((1/2) q − x)^N G = ((1/4) q² − x²)^N G    (190)

where G = e^(2N) / N^(2N).
For large N , we can use a formula for the exponential
lim (1 + x/N )N = ex

67

(191)

to get

Ω ≈ G (q²/4)^N (1 − x²/(q²/4))^N ≈ G (q²/4)^N exp(−x² / (q²/4N))    (192)

or

Ω(x) ≈ Ω_max exp(−x² / 2σ²)    (193)

with

σ² = q² / 8N    (194)

The maximum of this gaussian is at x = 0, or q_A = q/2, as promised. The width of this gaussian is

σ = q / (2√(2N))    (195)

since we have q ≫ N_A, N_B, this can be a large number. However, the width is smaller than the mean,

⟨q_A⟩ = q/2    (196)

by a factor of 1/√(2N). If N is 10²³, then even 10 times the width is about one billionth of the mean.
From the theory of the normal distribution, you know that if you integrate from −10σ to 10σ, the answer is

∫_{−10σ}^{10σ} dx (1/√(2πσ²)) exp(−x²/2σ²) = 1 − 1.5 × 10⁻²³    (197)
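The tail probability is easy to reproduce with the complementary error function: the probability that a gaussian variable falls outside ±10σ is erfc(10/√2). A Python sketch (not from the notes):

```python
import math

# Probability that a gaussian variable falls outside +-10 sigma
p_outside = math.erfc(10 / math.sqrt(2))
print(p_outside)  # about 1.5e-23
```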

What this means is the following. Suppose you prepare a system where all the energy is in system B. After thermal contact is established, the total system A + B starts to explore the combined states. Now the most likely state is located at q_A = q/2. If you have N ≈ 10²³, most of the accessible states are located within a relative fluctuation of 1/10⁹ of this value. Only about 1 in 10²³ states is out of this range. Therefore, almost immediately, the combined system will reach the very small neighborhood of this most likely state, and furthermore, it will stay there forever, practically. There is only about a 1 in 10²³ chance of q_A becoming more or less than q/2 by more than one billionth of q/2.
This is the meaning of reaching equilibrium. For a large system, the most probable state is so overwhelmingly probable that:
1. It doesn't matter which state the system started out with. It'll quickly get to the equilibrium state.
2. And once it gets there, it will stay there.
Also note that in the above example, the equilibrium state is where each oscillator has the same average energy. This means that there is no net energy flow between the two systems. If all the energy were in B, then the energy (heat) would quickly flow into A, and when the net flows to and from each system cancel each other, we have equilibrium.

11 Ideal gas

Thinking about Einstein solids is in a sense easy because you can think of them as a collection of simple harmonic oscillators which are fixed at lattice sites.
Now let's think about the somewhat more complicated system of an ideal gas. The questions we ask are the same:
1. What is the multiplicity of a macrostate whose energy is fixed at a certain value, say, U?
2. If two such systems are brought together, what is the most likely state of the combined system?
Again, we have a counting problem. Now, for the Einstein solids, counting was easy once we accepted that Quantum Mechanics dictates that each oscillator has equally spaced discrete energy levels. Now we have gas molecules in a box. What to do? What are the energy levels and how do we count them?
We can follow the textbook and argue ad hoc. But let's do it right. I said before that if you have any form of confinement, the wave nature of a particle manifests itself through discrete energy levels. Particles put in a finite box are confined in a definite sense. Therefore they must also have discrete energy levels. Once we have that, counting becomes easy. We could proceed with elementary quantum mechanics. But since that's not what this course is about, I'll just make an analogy. For simplicity, consider 1-D. In this case, the box is just a piece of line segment.
Figure 16: Odd n modes (sin(πx), sin(3πx), cos(πx), cos(3πx)). ψ(0) = ψ(L) is O.K. but ∂_x ψ(0) ≠ ∂_x ψ(L).
What should be the form of the waves that represent particle energies? As I said before, confinement produces discreteness because the only stable modes are the stationary waves. So we can say that in this case the stationary waves must be used to represent the particle motions.
As you have learned in the mechanics course, there are several different kinds of stationary waves depending on the conditions at the boundaries: open end, closed end, etc. For us, the most intuitive choice of boundary condition is what is called the periodic boundary condition. This is the condition that says that the value of the wavefunction and the value of its slope at both ends must be the same. That is,

ψ(x = 0, t) = ψ(x = L, t)    (198)
∂_x ψ(x = 0, t) = ∂_x ψ(x = L, t)    (199)

Figure 17: Even n modes (sin(2πx), sin(4πx), cos(2πx), cos(4πx)).


for all t.
As you can see in the figure, the possible stationary modes that satisfy these conditions are

ψ_n(x) = A sin(2πnx/L) = A sin k_n x    (200)
ψ_n(x) = A cos(2πnx/L) = A cos k_n x    (201)

where A is the amplitude and k_n = 2πn/L with n = 0, 1, 2, 3, ... is the allowed wavenumber.
So for each n, there are 2 modes that satisfy the periodic boundary condition. Now I don't want to say "n for sine" or "n for cosine" every time I talk about the modes. Is there any better way to represent the two modes? Well, there is. If you know complex numbers, you know that we can combine sine and cosine in these two ways:

cos k_n x + i sin k_n x = exp(i k_n x)    (202)
cos k_n x − i sin k_n x = exp(−i k_n x)    (203)

So instead of talking about n for sine or cosine, we can just talk about k_n = 2πn/L with n = 0, ±1, ±2, ±3, .... What's the interpretation? Well,

the periodic boundary condition is a perfect boundary condition if you have a 1-D circle. The two exponential modes above correspond to a wave moving in the right direction and a wave moving in the left direction.
O.K. That's all good. But how does that relate to energy levels? Remember that when we talked about the SHO for the Einstein solid, we said that there is a relation between the energy and the frequency? That went

E = ℏω = hf    (204)

In other words, the Planck constant h (or Dirac's version ℏ = h/2π) is a conversion constant between the frequency and the energy.
It turns out that there is a similar relationship between the momentum of a particle and the wavenumber:

p = ℏk    (205)

In our case, therefore, the particles are only allowed to have discrete momenta given by

p_n = ℏ k_n = 2πnℏ/L    (206)

Now the energy of a particle (without any potential energy) is given by

E = p²/2m    (207)

Therefore, the particles in this 1-D box are only allowed to have the discrete energy levels given by

E_n = p_n²/2m = n² (2π²ℏ²)/(L²m)    (208)

But wait a minute. The momentum can be positive or negative! So one energy level corresponds to two momentum states. In that case, we might as well say that each momentum state (either sign) corresponds to one single state for a single particle. The number of states available to a single particle with a fixed energy is then simply 2.
Let me repeat: in the case of a single monatomic particle in 1-D, each microstate is labeled by an integer corresponding to the momentum.

Now we ask our standard question: Given the energy E, how many microstates are there? In this case, the answer is easy.
1. Unless E satisfies E = n² (2π²ℏ²)/(L²m) for some integer n, there are none.
2. If E does satisfy E = n² (2π²ℏ²)/(L²m), then there are 2 states corresponding to ±|n|. That is, the multiplicity of the macrostate is 2.
Mathematically we can represent this as

Ω(E) = Σ_{n=−∞}^{∞} δ_{E, 2n²π²ℏ²/(L²m)} = 2    (209)

where δ_{a,b} is the Kronecker delta. It is somewhat silly in this case to write 2 this way. But this formula is useful and general enough that it is worth it.
If you ask how many states there are if you allow the energy up to U, then you have

Ω(0 < E < U) = Σ_{n=−n_U}^{n_U} 1 = 1 + 2n_U    (210)

where n_U satisfies

U = (2n_U² π²ℏ²)/(L²m) = n_U² h²/(2mL²)    (211)

or

n_U = √(2mU) L/h = p_U L/h    (212)

since ℏ = h/(2π) and we defined p_U through

U = p_U²/2m    (213)

We also assume here that

n_U ≫ 1    (214)

so that we can say

Ω(0 < E < U) ≈ 2n_U = 2 p_U L/h    (215)

Now consider the 2-particle case. First, let's say that they are distinguishable. And furthermore, let's assume that they don't interact with each other, so that there is no potential energy between them. How do you count the multiplicity? Well, since each particle state can be labeled by an integer, a microstate can be labeled by a pair of integers (n₁, n₂). Now since the two particles are distinguishable, this is an ordered pair. So suppose that each particle is allowed to have energy up to U. For each particle, there are 1 + 2n_U states to choose from. If we allow repetition, that is, we allow the two particles to occupy the same momentum state, the answer is

Ω(0 < E₁, E₂ ≤ U) = (1 + 2n_U)² ≈ (2n_U)²    (216)

which can also be written as

Ω(E₁, E₂) = Σ_{n₁=−n_U}^{n_U} Σ_{n₂=−n_U}^{n_U} 1 = (1 + 2n_U)² ≈ (2n_U)²    (217)

Now suppose the particles are identical. This means that the microstate label (n₁, n₂) is an unordered pair. Again, suppose we allow each particle to have energy up to U. If we do allow repetition, we then get

Ω(0 < E₁, E₂ ≤ U) = (1+2n_U)H₂ = (1+2n_U+2−1 choose 2) = (2n_U+2)!/(2! (2n_U)!) = (2n_U+2)(2n_U+1)/2! ≈ (2n_U)²/2!    (218)

Note that we can interpret this as the single-particle multiplicity squared, divided by 2-factorial.
We can go on like this. Suppose we have N particles. If they are all distinguishable, then again with each particle energy restricted within 0 < E < U, we get

Ω(0 < E_i < U) = (1 + 2n_U)^N ≈ (2n_U)^N    (219)

If the N particles are identical, we get

Ω_N(0 < E_i < U) = (2n_U+1)H_N = (2n_U+1+N−1 choose N) = (2n_U+N)!/(N! (2n_U)!) = (1/N!) (2n_U+N)(2n_U+N−1)(2n_U+N−2) ⋯ (2n_U+1)    (220)

Now suppose 2n_U ≫ N. In that case,

Ω_N(0 < E_i < U) ≈ (2n_U)^N / N!    (221)

Again the interpretation is that this is the single-particle multiplicity raised to the N-th power, divided by N-factorial.
Note that this formula works only if the number of available states is much greater than the number of particles. If this is not the case (for instance, a very cold gas), then we have to go back to the full combinatoric result.
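The dilute-limit approximation (221) can be checked against the full combination-with-repetition count (a Python sketch, not from the notes; the numbers are arbitrary, chosen so the number of states far exceeds the number of particles):

```python
import math

def omega_exact(n_states, N):
    # Identical particles with repetition: C(n_states + N - 1, N)
    return math.comb(n_states + N - 1, N)

def omega_dilute(n_states, N):
    # Dilute limit n_states >> N: n_states^N / N!
    return n_states**N / math.factorial(N)

n_states, N = 10_000, 5  # n_states plays the role of 2 n_U >> N
print(omega_exact(n_states, N) / omega_dilute(n_states, N))  # close to 1
```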
O.K. Now let's see if we can rewrite the above result in terms of the known characteristics of the system, like the volume. For the single particle,

Ω(0 < E < U) = Σ_{n=−n_U}^{n_U} 1    (222)

The trick is to approximate the sum with an integral,

Σ_{n=a}^{b} ≈ ∫_a^b dn    (223)

which works reasonably well in most cases. In our case

Ω(0 < E < U) = Σ_{n=−n_U}^{n_U} 1 ≈ ∫_{−n_U}^{n_U} dn = 2n_U    (224)

Now n is related to the momentum,

p = ℏk = ℏ (2πn/L) = nh/L    (225)

or

dn = L dp/h    (226)

Therefore, the multiplicity can be written as

Ω₁(0 < E < U) = Σ_{n=−n_U}^{n_U} 1 ≈ ∫_{−n_U}^{n_U} dn = L ∫_{−p_U}^{p_U} dp/h = 2 p_U L/h    (227)

Now note that L is the volume (in 1-D) of the coordinate space and 2p_U is the volume of the momentum space. So we can write the above as

Ω₁(0 < E < U) ≈ V V_p / h    (228)

If you have N identical particles in 1-D, this becomes

Ω_N(0 < E < U) ≈ (1/N!) (V V_p / h)^N    (229)

We can easily generalize this result to 3-D. Just do it 3 times for each particle. So all we have to do is h → h³, with the understanding that V and V_p now refer to the 3-D volumes:

Ω_N(0 < E < U) ≈ (1/N!) (V V_p / h³)^N    (230)

Pause. Think.
Now let's pause a little bit and think about what we just did. What we just did is actually very profound. Recall that when we first started this course I said that one of the consequences of the wave nature of a particle is that there is a minimum phase-space volume it needs to occupy, and that's given by Δx Δp = h. In fact, that's just what we have shown here. In 3-D, the statement is that a particle needs to occupy at least

(Δx Δy Δz)(Δp_x Δp_y Δp_z) = ΔV ΔV_p = h³    (231)

Therefore, if you want to count the number of available microstates Ω, it's simply

Ω = V V_p / h³    (232)

where V is the total spatial volume and V_p is the total momentum-space volume.
76

Now if particles are distinguishable, the total Ω is just the product of the individual Ω's. If they are identical, then we need to divide that by N!.
I can't emphasize enough the importance of the fact that a particle must occupy a certain minimum phase-space volume. It is crucial in many ways to how and why quantum systems behave the way they do.
Resume.
Now notice that so far we have been avoiding one question. That is, what we want to ask is the multiplicity of the macrostate with the total energy fixed at U. But what we considered above is mostly energy allowed up to U. What to do?
First of all, the total energy for N identical particles is given by

U = Σ_{a=1}^{N} E_{n^a} = (2π²ℏ²/(mL²)) Σ_{a=1}^{N} [ (n_x^a)² + (n_y^a)² + (n_z^a)² ]    (233)

where n_{x,y,z}^a are the momentum labels of the a-th particle, corresponding to

(p_x, p_y, p_z) = (2πℏ n_x/L, 2πℏ n_y/L, 2πℏ n_z/L)    (234)

So fixing U is equivalent to fixing

Σ_{a=1}^{N} [ (n_x^a)² + (n_y^a)² + (n_z^a)² ] = (2mU)L²/h² = (p_U L/h)²    (235)

If the n's were all continuous, this would define a sphere in 3N dimensions with the radius given by p_U L/h. What we are asked to do is then to figure out the surface area of this sphere. If this were 3-D, we would be talking about the 2-D surface area of a ball. Since this is 3N dimensions, we are talking about a volume in 3N − 1 dimensions.
Mathematically, we have

Ω_N(U) = (1/N!) Σ_{n_x¹,n_y¹,n_z¹=−∞}^{∞} Σ_{n_x²,n_y²,n_z²=−∞}^{∞} ⋯ Σ_{n_x^N,n_y^N,n_z^N=−∞}^{∞} δ_{Σ_{a=1}^N n⃗_a², (p_U L/h)²}    (236)
Let's use our trick of changing n to p and rewrite

Ω_N(U) ≈ (1/N!) ∫ (V d³p₁/h³) ∫ (V d³p₂/h³) ⋯ ∫ (V d³p_N/h³) δ_{Σ_{a=1}^N p_a², p_U²}
       = (V^N/N!) ∫ (d³p₁/h³) ∫ (d³p₂/h³) ⋯ ∫ (d³p_N/h³) δ_{Σ_{a=1}^N p_a², p_U²}    (237)

Everything looks fine except that we have a Kronecker delta inside an integral. What shall we do? Well, we could do a more refined mathematical treatment of this function. But at this point, we invoke the law of very large numbers. Ω_N(U) for large U is going to be a very large number. I claim that the proportionality factors coming from converting the Kronecker delta to a more suitable form are just large numbers. Therefore, it doesn't really matter. So if we can just calculate the surface area (volume) of a sphere in 3N dimensions, we'll be fine. Furthermore, the volume of the sphere itself and the surface area of the sphere differ by a factor of p_U (times some small numbers), which is merely a large number. Therefore, it turns out that it doesn't even matter whether we calculate the surface of the sphere or the volume of the sphere. Since it is more convenient, we'll calculate the volume.
There are many ways to calculate the volume of a sphere in M dimensions. But the simplest way is as follows. Consider the following integral:

I_M = ∫ dx₁ exp(−x₁²) ∫ dx₂ exp(−x₂²) ⋯ ∫ dx_M exp(−x_M²)    (238)

Each one of the integrals is a gaussian integral. We know that

∫ dx exp(−x²/2σ²) = √(2πσ²)    (239)

so the answer is

I_M = π^(M/2)    (240)
On the other hand, we can rewrite

I_M = ∫ dx₁ exp(−x₁²) ∫ dx₂ exp(−x₂²) ⋯ ∫ dx_M exp(−x_M²)
    = ∫ dx₁ dx₂ ⋯ dx_M exp(−(x₁² + x₂² + ⋯ + x_M²))    (241)

In a spherical coordinate system, this is simply

I_M = S_M ∫₀^∞ r^(M−1) dr exp(−r²)    (242)

where S_M is the surface area of the M-sphere with r = 1. If this were 3-D,

S₂ = ∫₀^π sin θ dθ ∫₀^{2π} dφ = 4π    (243)


Now we use x = r 2 (dx = 2rdr or dr = dx/2 x and rewrite
IM

SM Z (M 1)/21/2
=
x
dx exp(x)
2 0
SM Z M/21
=
x
dx exp(x)
2 0
SM
(M/2 1)!
=
2

(244)

where we used the fact that 0 xn dx = n!. This ordinarily works if M is


even. But adding one more particle to 6 1023 cant do anything. So well
assume that M is even. Having this then yields
R

SM

2 M/2
=
(M/2 1)!

(245)

and the volume of the M-sphere with the radius R is

V_M = S_M R^M/M = (π^(M/2)/(M/2)!) R^M    (246)
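The M-sphere volume formula can be spot-checked in low dimensions, both against the familiar results and by Monte Carlo (a Python sketch, not from the notes; math.gamma(M/2 + 1) generalizes (M/2)! so that odd M works too):

```python
import math
import random

def sphere_volume(M, R=1.0):
    # V_M = pi^(M/2) R^M / (M/2)!
    return math.pi**(M / 2) * R**M / math.gamma(M / 2 + 1)

def mc_volume(M, samples=100_000, seed=1):
    # Estimate the unit-sphere volume as a fraction of the cube [-1, 1]^M
    rng = random.Random(seed)
    hits = sum(
        sum(rng.uniform(-1, 1)**2 for _ in range(M)) <= 1.0
        for _ in range(samples)
    )
    return 2.0**M * hits / samples

for M in (2, 3, 4):
    print(M, sphere_volume(M), mc_volume(M))
```

For M = 2 and M = 3 this reproduces the familiar πR² and (4/3)πR³.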

In our case this means

Ω_N(U) ∝ (V^N/(N! h^(3N))) (π^(3N/2)/(3N/2)!) p_U^(3N) = f(N) V^N U^(3N/2)    (247)

It can be shown that in general, if the system has N f degrees of freedom,

Ω(U) ∝ U^(N f/2)    (248)

Interacting Ideal Gas
Now that we have answered all we can answer (for now) for a single system, we can ask the next question: what happens when two systems are brought together? For simplicity, I'll keep the number of particles in each system equal to N.
Then, when the two systems are brought together, the multiplicity of a macrostate with the energy division U_A and U_B is

Ω(U_A, U_B) = [f(N)]² (V_A V_B)^N (U_A U_B)^(3N/2)    (249)

With a fixed total energy U = U_A + U_B, we can guess that the most likely state should have U_A = U_B = U/2. So write

U_A = U/2 + x    (250)
U_B = U/2 − x    (251)

and get

Ω(U_A, U_B) = [f(N)]² (V_A V_B)^N (U²/4 − x²)^(3N/2)    (252)

and again invoke

(1 + x/N)^N ≈ e^x    (253)

to get

Ω(U_A, U_B) = [f(N)]² (V_A V_B)^N (U/2)^(3N) (1 − 4x²/U²)^(3N/2)
            ≈ [f(N)]² (V_A V_B)^N (U/2)^(3N) exp(−x² (4/U²)(3N/2))
            = [f(N)]² (V_A V_B)^N (U/2)^(3N) exp(−x²/(2(U²/12N)))    (254)

so the root-mean-square width is

√(⟨x²⟩ − ⟨x⟩²) = σ_U = U/√(12N)    (255)

Compared to the mean value U/2, this is tiny if N is large.


Therefore we can again say this: when two containers of gas are brought into thermal contact, equilibrium is established when each particle has the same average energy. And once the system gets there, it never leaves.
You can apply the above argument to any factor that looks like

(AB)^N    (256)

where N is a large number and A + B is fixed. In particular, you can easily apply the same argument to the volume and conclude that there is an equilibrium volume (in this case V/2), and once the equilibrium is established, it is never interrupted.
The same goes for N. But since f(N) is more complicated, we'll wait until later so that we can develop enough machinery to deal with that.
80

12 Entropy

For the previous few weeks, we have been asking the following questions over and over again.
1. How do we figure out the multiplicity for a single isolated system with energy U_A?
2. If we bring two such systems with different energies into thermal contact, what happens?
Our conclusion has always been this: when two systems are brought into thermal contact, it is overwhelmingly likely that the combined system will end up in a very small neighborhood of the most likely state. This is simply because the most likely state has a multiplicity that's tens of orders of magnitude larger than anything else.
By now, you should feel that it has sort of become your intuition that this must be so. In that case, we should formalize it as the second law of thermodynamics: entropy tends to increase.
The concept of entropy seems somewhat mysterious when one first encounters it. But it is just another way of saying multiplicity. Now, multiplicity is a very large number and awkward to handle. So we define the entropy as the log of the multiplicity:

S(U) = k ln Ω(U)    (257)

where k is again the Boltzmann constant. Since ln is a monotonically increasing function, there is a 1-to-1 mapping between the multiplicity and the entropy. So instead of saying the most probable state is the one with the most multiplicity, we can say that it is the one with the most entropy.
The above definition also makes the multiplicity behave a bit more like ordinary energy, since

ln AB = ln A + ln B    (258)

So if the multiplicity of the macrostate of A is Ω_A and that of the macrostate of B is Ω_B, the multiplicity of the combined system is Ω = Ω_A Ω_B and the entropy is

S = k ln Ω = k ln Ω_A + k ln Ω_B = S_A + S_B    (259)

That is, entropies add. In particular, entropy must be an extensive variable which scales like the volume, or the size of the system.
Intuitively, entropy may be thought of as the degree of disorder or, more precisely, the inverse of the amount of useful information. Think of it this way. Suppose you have 5000 scrabble pieces. Now blindfold yourself and arrange them on a piece of paper.
What is the chance that it will end up getting an A in an essay test? Not much. As a matter of fact, the chance is almost nil. On the other hand, if you spend a few days researching and arrange the scrabble pieces so that they make coherent sense, the chances are excellent that you will get an A.
What does this mean? Well, it means that 5000 randomly arranged characters have much bigger entropy than 5000 well-thought-out and carefully arranged characters. Essentially, the well written paper is unique: it has zero entropy. Conversely, it means that the information content of the well written paper is much higher than that of the random text.
Why am I telling you this? For several reasons. This is a prototypical entropy consideration. If you become familiar with this example, you can apply it to many, many situations, not just in physics, but in communications, genetics, you name it. Whenever an organized behavior of something is involved, entropy appears. The book has the example of crushed ice vs. a glass of water. If you intuit the disorder part visually, yes, some of you might say that crushed ice looks more disorderly. However, think about the information content. Once the position of the ice is fixed, you are reasonably sure where each water molecule is. Well, you may not be able to specify them all in a long list of 10²³ lines, but in principle it is possible. At least you know that a water molecule stays inside the small ice chunk it is stuck in. For water, although sitting water may look more orderly, or peaceful, if you envision what's happening inside the water, you know that a water molecule can be anywhere in the cup, and that goes for every one of them.
In simple terms, you can write a message with crushed ice cubes by cleverly arranging each piece. But you can never do such a thing with a glass of water. Less information, more entropy.
The second reason I am telling you this story is the fact that to compose a well researched, well written paper, you need to put in a lot of energy yourself. When you compose that paper, it seems like you have produced something that has a very low entropy. If you start out with a bag of scrabble pieces, then indeed you have lowered the entropy of the system composed of the scrabble pieces. What's going on? Isn't entropy supposed to increase? Well, yes, it did. The point is that to arrange it carefully, so that there is a lot of information in that arrangement of characters, you have to spend a lot of energy. By doing so, you have increased the entropy in your environment (ate a hamburger and digested it, for instance) more than enough to compensate and overcome the entropy you lowered for the bag of scrabble pieces.
Modern cryptography, the game of encryption and decryption, is one of the important applications of the entropy concept. In essence, encryption is a transformation of a zero-entropy text (a message to the head of the CIA, for instance) into what looks like a very-large-entropy text (as random as one can make it). The key phrase here is "looks like". The encrypted text actually contains all the original information. The security of the encryption program depends on how seemingly random the encrypted text is. If any pattern is detectable after encryption, it becomes rather easy to decode the whole thing. If you, however, make an encryption program that produces a text without any discernible pattern, then for all practical purposes the text is random. That is, without the right decryption program and the keys, the encrypted text yields no information.
In this sense, the entropy is added by the key. If you have a small key, say one 3-letter word, then it is pretty easy to decrypt the message; this is because the available phase space for the three-letter word is small: Ω = 26³ = 17576. On the other hand, modern cryptography uses 128-bit keys. That is, the available states number 2¹²⁸ ≈ 3.4 × 10³⁸. This is merely a large number. However, for present computer technology, this is sufficient to prevent any real-time decryption by a 3rd party.
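The key-space numbers above are easy to verify, and comparing them in bits makes the gap vivid (a quick Python sketch, not part of the notes):

```python
import math

small_key = 26**3   # all possible three-letter keys
big_key = 2**128    # all possible 128-bit keys

print(small_key)              # 17576
print(f"{big_key:.2e}")       # about 3.40e+38
print(math.log2(small_key))   # ~14.1 bits of key entropy, vs 128 bits
```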
So why do we need the Boltzmann constant at all in the definition of entropy? The answer is, we don't. This is just a historical relic. Originally, entropy was thought of as the measure of the energy that can be extracted out of a system, in an inverse way. That is, the more the entropy, the less useful work a system can do. For instance, if you have a gas in a cylinder pressed by a piston, then by releasing the piston, the gas will do work by expanding and pushing the piston against the atmospheric pressure outside. Now if you hold the total energy constant, then the smaller the volume, the higher the pressure and the more work the system (that is, the piston) can do. But that means that the initial entropy was small, since the volume was small.
What are we trying to say here? This: if you want a system to do some interesting, useful work, you need to create the

system as non-equilibrium as possible against the outside environment be this


mechanical, thermal or chemical. Equilibrium systems are boring. Nothing
much happens inside such a system. Everything is static and fluctuations
are small. If the air is totally equilibriated, then there is no wind, no rain,
no weather and no hydro power, no hydro quebec. If all the materials are
in chemical equilibrium with another, then the universe is one uniform soup,
no galaxies, no stars, no planets, no burning log, nothing. So the maximum
entropy means no change in any form of energy into anything else and that
means equilibrium.
This is why we still carry k around. It connects the microscopical concept
of entropy the measure of the available phase space to the macroscopical
concept of useful energy.
Entropy: quantity specifying the amount of disorder or randomness in a system bearing energy or information. Originally defined in thermodynamics in terms of heat and temperature, entropy indicates the degree to which a given quantity of thermal energy is available for doing useful work; the greater the entropy, the less available the energy. For example, consider a system composed of a hot body and a cold body; this system is ordered because the faster, more energetic molecules of the hot body are separated from the less energetic molecules of the cold body. If the bodies are placed in contact, heat will flow from the hot body to the cold one. This heat flow can be utilized by a heat engine (a device which turns thermal energy into mechanical energy, or work), but once the two bodies have reached the same temperature, no more work can be done. Furthermore, the combined lukewarm bodies cannot unmix themselves into hot and cold parts in order to repeat the process. Although no energy has been lost by the heat transfer, the energy can no longer be used to do work. Thus the entropy of the system has increased. According to the second law of thermodynamics, during any process the change in entropy of a system and its surroundings is either zero or positive. In other words, the entropy of the universe as a whole tends toward a maximum. This means that although energy cannot vanish because of the law of conservation of energy (see conservation laws), it tends to be degraded from useful forms to useless ones. It should be noted that the second law of thermodynamics is statistical rather than exact; thus there is nothing to prevent the faster molecules from separating from the slow ones. However, such an occurrence is so improbable as to be impossible from a practical point of view. In information theory, the term entropy is used to represent the sum of the predicted values of the data in a message.
Entropy of an ideal gas
In a previous lecture, we derived the multiplicity of an ideal gas in a box of volume V with total energy U as

Ω_N = (V^N/(N! h^(3N))) (π^(3N/2)/(3N/2)!) (2mU)^(3N/2)    (260)

The entropy is therefore

S/k = ln Ω_N = N ln V − ln N! − 3N ln h + (3N/2) ln π − ln(3N/2)! + (3N/2) ln(2mU)    (261)

Using Stirling's formula, we can simplify this as

S/k ≈ N ln V − N ln N + N − 3N ln h + (3N/2) ln π − (3N/2) ln(3N/2) + (3N/2) + (3N/2) ln(2mU)
    = N [ ln( (V/N) (4πmU/(3Nh²))^(3/2) ) + 5/2 ]    (262)

This is called the Sackur-Tetrode equation.
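As a concrete check of the Sackur-Tetrode equation (a Python sketch, not from the notes; the scenario of one mole of helium at 300 K and atmospheric pressure, with U = (3/2)NkT, is my own choice):

```python
import math

# Physical constants (SI)
k = 1.380649e-23      # Boltzmann constant, J/K
h = 6.62607015e-34    # Planck constant, J s
N_A = 6.02214076e23   # Avogadro's number

def sackur_tetrode(N, V, U, m):
    # S = N k [ ln( (V/N) (4 pi m U / (3 N h^2))^(3/2) ) + 5/2 ]
    return N * k * (math.log((V / N) * (4 * math.pi * m * U / (3 * N * h**2))**1.5) + 2.5)

# One mole of helium at T = 300 K and P = 101325 Pa
m_He = 6.6465e-27            # mass of a helium atom, kg
T, P = 300.0, 101325.0
N = N_A
V = N * k * T / P            # ideal-gas volume
U = 1.5 * N * k * T          # monatomic gas: U = (3/2) N k T
print(sackur_tetrode(N, V, U, m_He))  # about 126 J/K
```

The result, about 126 J/K, is very close to helium's measured standard molar entropy.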


How do we understand this formula? Well, let's consider the multiplicity itself:

Ω_N = exp(S/k) ≈ [ (V/N) (4πm/(3h²))^(3/2) (U/N)^(3/2) e^(5/2) ]^N    (263)

The crucial point to notice is that what's inside the square bracket are average, or intensive, quantities that do not depend on the size of the system. It is written in terms of the average energy per particle (U/N) and the average space per particle (V/N). Therefore, in a way, what's inside the square bracket represents the effective single-particle multiplicity, and the total multiplicity has the form

Ω_N = (Ω₁^effective)^N    (264)
Now suppose we forgot about the identical-particle factor N!. Then what happens? In that case, there is no −N ln N + N in Eq. (262) and hence

Ω_N^distinguishable ≈ [ V (4πm/(3h²))^(3/2) (U/N)^(3/2) e^(3/2) ]^N    (265)

In this case, what's inside the square bracket is the single-particle multiplicity in the sense that if a single particle were left alone in a box of volume V with energy U/N, this would be its multiplicity. The crucial difference between the above two formulas is that for identical particles, the effective single-particle multiplicity contains

V/N

and for distinguishable particles, it becomes

V.
Why should this make sense? Well, think of it this way. Suppose you have 2 particles. If they are identical, it doesn't matter whether particle 1 is in the right half of the box and particle 2 is in the left half of the box, or vice versa. Therefore, if you have to make the single-particle analogy, it is as if particle 1 never leaves the right half of the box. If it does, then most likely particle 2 is not in the right half of the box, but that's the same as before anyway! Therefore, effectively, if one has to make a single-particle analogy, the available distinguishable volume for each particle is V/2.
If the particles are distinguishable, pink ball in the left and blue ball in the right is different from pink ball in the right and blue ball in the left. Therefore, each particle occupies the whole volume of the box.
This formula shows what to do if you want to change the entropy. Increases in N, V, and U all lead to increases in entropy, although some are more efficient than others. The most efficient means of increasing the entropy is to increase N, the number of particles. Increases in V and U only lead to logarithmic increases, whereas an increase in N can result in a nearly

Figure 18: Effective volume for identical particles. Two particles sharing a volume V might as well each have a volume V/2 to themselves.


linear increase. Why is that? Why is increasing the number so much more efficient than increasing the energy or the volume?
Physically, this is because if the particles are independent, the phase space is the product of N single-particle phase spaces, or

Ω_N = Ω₁^N    (266)

while Ω₁ itself is just the volume of the single-particle phase space, and its volume and energy dependence is only polynomial. Even if we take into account the fact that the particles are identical, the basic counting rule that N appears in the exponent does not change that much (only a log correction).
In other words, if you let the volume change by 50%, then for each particle the available phase space grows by 50%. That is, Ω_N gets multiplied by (1.5)^N. Now this is merely a small number raised to a large number. On the other hand, if the number of particles increases by 50%, then there are more particles to explore the given phase space, and Ω grows by a factor of Ω₁^(N/2) = (√Ω₁)^N. Usually, there are more than 3 states available to a single particle. Therefore, this is a much better way to increase the entropy, but it is usually the most expensive way.

Figure 19: Effective volume for distinguishable particles. Two distinguishable particles sharing a volume V each occupy the whole volume V.

Note that if we fix U and N, then as the volume changes, the entropy changes as

ΔS/k = N ln(V_final/V_initial)    (267)

Now I don't want to write 1/k all the time. So let me define the dimensionless entropy

σ ≡ S/k = ln Ω    (268)

Volume Expansion & Entropy of Mixing
So far we have sometimes been at pains to say that we are dealing with identical gas molecules. One of the reasons is that without this bit of quantum mechanical knowledge, entropy does not make sense. Suppose we start with 2 identical boxes containing identical gas molecules, all at the same U/N and V/N, i.e. the same average energy and density. They are identical. So if you put the two boxes together and remove the partition, the entropy should simply double, because apart from that artificial partition, there is nothing distinguishing this situation from a single large box. Without the symmetry factor of N!, this is not the case.
Conversely, we should expect that if two boxes of unlike molecules mix, the entropy should more than double. So let's see if this is true.
If you mix two identical boxes of identical molecules, the entropy formula
becomes

σ_{2 id. boxes} = (2N) [ ln( (2V/2N) (4πm(2U)/(3(2N)h²))^{3/2} ) + 5/2 ]
                = (2N) [ ln( (V/N) (4πmU/(3Nh²))^{3/2} ) + 5/2 ]
                = 2 σ_{1 box}        (269)

Now if you mix two un-identical boxes (we'll keep U, V and N the same for both,
for simplicity), we should add the entropies of the two substances separately, but with
twice the volume available for each particle after removing the partition:

σ_{2 diff. boxes} = N [ ln( (2V/N) (4πmU/(3Nh²))^{3/2} ) + 5/2 ]
                  + N [ ln( (2V/N) (4πmU/(3Nh²))^{3/2} ) + 5/2 ]
                  = 2 σ_{1 box} + 2N ln 2        (270)

So the difference is

ΔS = k Δσ = 2kN ln 2        (271)

and this is called the entropy of mixing.
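As a numerical sanity check (not part of the original notes; the mass, volume and energy values below are arbitrary placeholders chosen only to keep the logarithm's argument positive), one can evaluate the dimensionless Sackur-Tetrode entropy directly and confirm both the exact doubling for identical gases and the extra 2N ln 2 for unlike gases:

```python
import math

# Dimensionless Sackur-Tetrode entropy sigma = S/k for a monatomic ideal gas:
# sigma = N [ ln( (V/N) (4*pi*m*U / (3*N*h^2))^(3/2) ) + 5/2 ].
h = 6.626e-34          # Planck constant, J*s
m = 6.6e-27            # particle mass, kg (roughly helium; illustrative)

def sigma(N, V, U):
    return N * (math.log((V / N) * (4 * math.pi * m * U / (3 * N * h**2))**1.5) + 2.5)

N, V, U = 1e22, 1e-3, 40.0   # arbitrary sample state

# Removing a partition between two identical boxes doubles everything,
# and the entropy exactly doubles:
assert math.isclose(sigma(2*N, 2*V, 2*U), 2 * sigma(N, V, U), rel_tol=1e-9)

# Mixing two *different* gases instead gives each species the doubled volume,
# which adds the entropy of mixing 2*N*ln(2):
mixed = 2 * sigma(N, 2*V, U)          # two unlike species, each now in volume 2V
delta = mixed - 2 * sigma(N, V, U)
assert math.isclose(delta, 2 * N * math.log(2), rel_tol=1e-9)
```

The same check works for any positive state values, since only the V/N and U/N dependence matters.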


Reversible and Irreversible Processes
There are many ways to change the state of a system. As we have studied,
it is highly probable that any of those changes will increase the entropy. And
since it is improbable to reduce the entropy, once the change is made, there
is no way to go back to the original state spontaneously: you can make
a Martini, but you can't un-make the Martini by un-shaking it. Well, in
principle you can, just like the movie I showed you. But to have that
special initial state out of infinitely many possible initial states is just so
improbable as to make it impossible. I could do it because I went the other
way and just reversed the clock. But if you are given an ensemble of states


and pick one at random, the chances are that you are never gonna pick that
exact state. These processes, where entropy inevitably increases, are called
irreversible for an obvious reason.

There are also reversible processes. These are special processes, defined
as the limit of very, very slow change, that neither increase nor
decrease the entropy. Let's take a look at the Sackur-Tetrode formula and
see if we can figure out how to perform such a feat.
"

V
= N ln
N

4mU
3N h2

3/2 !

5
+
2

(272)

First of all, note that we cant fix two variables among U, V, N . Since we
want to keep fixed, fixing 2 automatically fixes the other. That is, no
change in anything.
So lets fix N first. How do we change U and V so that remains fixed?
Well, inside the logarithm, U and V appears in this combination:
f = V U 3/2

Or in terms of momentum pU = 2mU ,


f 0 = V p3U

(273)

(274)

So if the change is such that it keeps V U^{3/2} fixed, then we have a reversible
process. How is this possible? Well, for this we should go back to
quantum mechanics. Recall that if you confine a particle on a line of length
L, the momenta are discrete, p_n = ħ k_n = ħ (2πn/L), and the energies are discrete,

E_n = p_n²/(2m) = ħ² (2πn)² / (2mL²)        (275)

If you have a three-dimensional box, this generalizes to

E_n = p_n²/(2m) = ħ² (2π)² (n_x² + n_y² + n_z²) / (2mL²)        (276)

where n = (n_x, n_y, n_z) is a collection of three integers. The total energy is
just the sum of all such energies,

U = ∑_n E_n g_n        (277)
where g_n counts the number of particles in the energy level labelled by n.
Since every E_n has a factor of 1/L², U must too, or

U ∝ 1/L²        (278)

Since the volume is V = L³, we can also say that

U ∝ 1/V^{2/3}        (279)

Therefore

U V^{2/3} = ∑_n g_n ħ² (2π)² (n_x² + n_y² + n_z²) / (2m)        (280)

But fixing U V^{2/3} is equivalent to fixing V U^{3/2}, since (U V^{2/3})^{3/2} = V U^{3/2}.


What does this mean? Well, this means that if we want to keep U V^{2/3}
fixed, so that σ is fixed, we had better not disturb g_n, the occupancy of each
energy level. Think of it this way. Suppose you have a rubber string that
is vibrating with 2 fixed ends. Now suppose you suddenly move one end
to another position. If you do that, all sorts of nasty things happen. Lots
of different harmonics will be excited and there will be waves traveling up
and down the string where, before this rude interruption, there was a
single nice standing wave. This is irreversible. You can't now go back to the
original length and recover the nice standing wave.

On the other hand, suppose you slowly, very slowly move one end of
the string, giving the wave time to readjust its wavelength to maintain the
standing wave. If you do that, you will have the same shape of the wave
even when the length of the string is doubled. And you can do the reverse: you
can start to shorten the string in the same slow way and finally recover the
original state. This is reversible.

Note that this concept of fixing the entropy by proportionally changing the
volume and energy is almost impossible to formulate within classical mechanics.

Almost all processes in nature are, of course, irreversible. But it is a good
theoretical tool to have the possibility of a reversible process.

Figure 20: Irreversible vs Reversible


Zeroth Law: If two systems are in thermal equilibrium with a third
system, they must be in thermal equilibrium with each other.

First Law: An equilibrium macrostate of a system can be characterized
by a quantity U which has the property that,

for an isolated system, U = constant        (281)

If the system is allowed to interact and thus goes from one macrostate
to another, the resulting change in U can be written in the form

ΔU = Q + W        (282)

Second Law: An equilibrium macrostate of a system can be characterized by a quantity S which has the properties that, in any process in which a thermally isolated system goes from one
macrostate to another, the entropy tends to increase,

ΔS ≥ 0        (283)

and if the system is not isolated and undergoes a quasi-static infinitesimal process in which it absorbs heat δQ, then

dS = δQ/T        (284)

Third Law: The entropy S of a system has the limiting property that,

as T → 0+,  S → 0        (285)

13 Supplemental: Combinatorics

13.1 Permutation with repetition

Problem: You have N objects {a_1, a_2, …, a_N} in a bag. You want to pick
an ordered list of s objects from the bag. That is, (a_1, a_2, a_3) is a different
possibility from, say, (a_3, a_2, a_1). After writing down the object you picked,
you put it back in the bag. Therefore s does not have to be larger or smaller
than N. How many such lists are there?

Answer: There are always N ways of picking the next one. Hence, the
answer is

N^s        (286)

Physical situation: You have N well separated energy levels. You want
to distribute s particles among these energy levels. Each particle is of a
different species and there is no limit on the occupation of an energy level.
Then N^s is the total number of possible configurations. Think of it
this way. Suppose you have particles labeled C, H, O, N and 3 energy levels
labeled 1, 2, 3. Let's label a configuration with the ordered list

(n_C, n_H, n_O, n_N)        (287)

where n_x is the energy level of the particle x. Since each particle is independent, n_C can range from 1 to 3 regardless of what the others are doing. Likewise
for the others. Therefore there are

Ω = 3⁴ = N^s        (288)

distinct configurations.
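This count is easy to verify by brute force (a supplemental sketch; the level and particle labels match the example above):

```python
from itertools import product

# Permutation with repetition: s = 4 distinguishable particles (C, H, O, N),
# each independently assigned one of N = 3 energy levels -> N**s configurations.
levels = [1, 2, 3]
particles = ["C", "H", "O", "N"]

configs = list(product(levels, repeat=len(particles)))  # all (nC, nH, nO, nN)
print(len(configs))   # 81 = 3**4
```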

13.2 Permutation without repetition

Problem: You have N distinct objects {a_1, a_2, …, a_N} in a bag. You want
to pick an ordered list of s < N objects without putting the objects back
once picked. How many ways are there to do so?

Answer: There are N ways of picking up the first one, N − 1
possibilities for the second one, and so on. Hence,

_N P_s = N(N−1)⋯(N−s+1) = N!/(N−s)!        (289)

Physical situation: Suppose I have N well separated energy levels. I also
have s particles which are all of different species. Each energy level can be
occupied by only one particle. The number of different ways to distribute
these particles among the energy levels is _N P_s. Note that the difference
between this and the previous case is the possibility of having more than one
particle in one energy level. This has to do with the fermionic and bosonic
nature of particles. We'll get to that later.
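A quick enumeration (supplemental; N and s are arbitrary small values) confirms the N!/(N−s)! count:

```python
from itertools import permutations
from math import factorial

# Permutation without repetition: ordered choices of s objects out of N,
# no object reused -> N!/(N-s)! arrangements.
N, s = 5, 3
count = len(list(permutations(range(N), s)))
print(count)                                       # 60
assert count == factorial(N) // factorial(N - s)   # 5!/2! = 60
```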

13.3 Combination without repetition

Problem: You have N distinct objects {a_1, a_2, …, a_N} in a bag. You want
to pick s objects, disregarding the order, without putting the objects back
into the bag. How many ways of doing so?

Answer: There are s! ways of ordering s distinct objects. Therefore, disregarding the order means

_N C_s = _N P_s / s! = N! / ((N−s)! s!) = (N choose s)        (290)

Physical situation: Any binary system of identical particles. A prototypical system is a system of N identical spin-1/2 particles in a magnetic
field. The number of configurations where s of them are parallel to
the magnetic field is given by _N C_s.
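Again this is easy to check by enumeration (supplemental; the particle numbers are arbitrary):

```python
from math import comb, factorial
from itertools import combinations

# Combination without repetition: N spin-1/2 particles, s of them "up".
N, s = 6, 2
count = len(list(combinations(range(N), s)))   # choose which s particles are up
print(count)                                   # 15
assert count == comb(N, s) == factorial(N) // (factorial(N - s) * factorial(s))
```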

13.4 Combination with repetition

Problem: You have N distinct objects {a_1, a_2, …, a_N} in a bag. You want
to pick s objects, disregarding the order. But this time, whenever you pick
something, you write it down and then put it back and draw again. How
many ways of doing so?

Answer: It's the same as the number of sets

{s_1, s_2, …, s_N}        (291)

where s_i is the number of a_i's in the list, with ∑_{i=1}^N s_i = s.

One can think of this as a partitioning problem. In general, a partitioning
problem can be thought of as a dots-and-bars problem. If you want to partition s dots into N lots (some of which can be empty), then you
need N − 1 bars to mark the divisions. In all you need s + N − 1 slots for
the symbols (dots and bars), and the number of different ways to distribute s dots
among s + N − 1 slots (the rest are taken up by the bars) is (I'll denote it as
_N H_s)

_N H_s = (s + N − 1 choose s)        (292)

Sometimes this goes by the name of the negative binomial coefficient. The
reason is as follows. Consider the expansion of

f(x) = (1 − x)^{−N}        (293)

Its derivatives at x = 0 are

f^(1)(0) = N(1 − x)^{−N−1} |_{x=0} = N        (294)

f^(2)(0) = N(N+1)(1 − x)^{−N−2} |_{x=0} = N(N+1)        (295)

so in general

f^(s)(0) = N(N+1)⋯(N+s−1)        (296)

So

f(x) = 1 + Nx + N(N+1) x²/2! + N(N+1)(N+2) x³/3! + ⋯
     = ∑_{s=0}^∞ [N(N+1)⋯(N+s−1)/s!] x^s
     = ∑_{s=0}^∞ [(N+s−1)!/(s!(N−1)!)] x^s
     = ∑_{s=0}^∞ (N+s−1 choose s) x^s        (297)
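The claimed coefficients can be checked numerically by multiplying out N copies of the geometric series 1/(1−x) = 1 + x + x² + ⋯ (a supplemental sketch; N and the number of retained terms are arbitrary):

```python
from math import comb

# The Taylor coefficients of (1 - x)^(-N) are C(N+s-1, s). Build the series
# by multiplying N truncated geometric series together.
N, terms = 4, 8
coeffs = [1]                                # series for the constant 1
geom = [1] * terms                          # 1 + x + x^2 + ... (truncated)
for _ in range(N):                          # multiply by 1/(1-x), N times
    new = [0] * terms
    for i, a in enumerate(coeffs):
        for j in range(terms - i):
            new[i + j] += a * geom[j]
    coeffs = new

print(coeffs[:5])    # [1, 4, 10, 20, 35]
assert all(coeffs[s] == comb(N + s - 1, s) for s in range(terms))
```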

Physical situation: the Einstein solid. You have N oscillators. The total
energy is fixed at

U = s ħω        (298)

Since the energy levels are equally spaced, this means that if the first oscillator has energy s_1 ħω, the second one s_2 ħω, and so on, then

s = ∑_{i=1}^N s_i        (299)

The multiplicity of such a system is given by _N H_s.
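A direct count of the occupation lists (supplemental; small N and s) agrees with the stars-and-bars formula:

```python
from math import comb

# Einstein solid: count ways to write s = s1 + ... + sN with si >= 0
# (energy units among N oscillators) and compare with NHs = C(s + N - 1, s).
def compositions(s, N):
    if N == 1:
        return 1
    return sum(compositions(s - s1, N - 1) for s1 in range(s + 1))

N, s = 3, 4
print(compositions(s, N))                         # 15
assert compositions(s, N) == comb(s + N - 1, s)   # C(6, 4) = 15
```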

13.5 Hypergeometric

Problem: You have two kinds of objects: N_a a's and N_b b's. You randomly
pick k from this set. What is the probability of getting s a's?

Answer: The total number of possibilities is

N_total = (N_a + N_b choose k)        (300)

The total number of ways to pick k objects with s a's among them is

N_s^a = (N_a choose s)(N_b choose k−s)        (301)

So the probability is

P_a(s) = (N_a choose s)(N_b choose k−s) / (N_a + N_b choose k)        (302)

Physical situation: Suppose you have a mixture of gases, say 80% N₂ and
20% O₂. The total number of gas molecules in your box is N = N_{N₂} + N_{O₂}.
Now you let some of it out into another container. As the gas molecules
get into the new container, you count them and find that there are now k
molecules in the new container. Now ask: what is the probability of having
s oxygen molecules in the new container?

One can also think of putting in a material that can absorb nitrogen and
oxygen with equal probability and ask: what is the probability that s oxygen
molecules have been absorbed when a total of k molecules are absorbed?
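A small-numbers check (supplemental; the molecule counts are arbitrary) confirms that the formula reproduces brute-force counting and that the probabilities sum to 1:

```python
from math import comb
from itertools import combinations

# Hypergeometric distribution: Na "a" molecules, Nb "b" molecules, draw k.
Na, Nb, k = 4, 6, 5

def P(s):
    return comb(Na, s) * comb(Nb, k - s) / comb(Na + Nb, k)

# Probabilities over all s must sum to 1 (math.comb returns 0 when s > Na):
total = sum(P(s) for s in range(k + 1))
print(round(total, 12))    # 1.0

# Brute-force check: enumerate all 5-subsets of the 10 molecules.
mols = ["a"] * Na + ["b"] * Nb
counts = {}
for pick in combinations(range(Na + Nb), k):
    s = sum(1 for i in pick if mols[i] == "a")
    counts[s] = counts.get(s, 0) + 1
assert all(abs(counts[s] / comb(Na + Nb, k) - P(s)) < 1e-12 for s in counts)
```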

14 Supplemental: N_A ≠ N_B

For two Einstein solids in thermal contact,

Ω = Ω_A Ω_B = Ω(N_A, q_A) Ω(N_B, q_B)
  ≈ (e q_A/N_A)^{N_A} (e q_B/N_B)^{N_B}
  = (e q_A/N_A)^{N_A} (e(q − q_A)/N_B)^{N_B}        (303)

where we used q = q_A + q_B.
In the case of the smaller solids, we saw that the most likely value of q_A was
determined by

q_A/q_B = N_A/N_B        (304)

So, following that, let's guess that this is what will happen in this case too, and define

q_A = [N_A/(N_A + N_B)] q + x = αq + x        (305)
q_B = q − q_A = [N_B/(N_A + N_B)] q − x = βq − x        (306)

with α = N_A/(N_A + N_B), β = N_B/(N_A + N_B). Note that α + β = 1.


Ω ∝ (αq + x)^{N_A} (βq − x)^{N_B} G,   where G = (e/N_A)^{N_A} (e/N_B)^{N_B}        (307)

Taking the logarithm,

ln Ω ≈ N_A ln(αq + x) + N_B ln(βq − x) + ln G
     ≈ N_A ln(αq) + N_A (x/αq) − N_A (x/αq)²/2
       + N_B ln(βq) − N_B (x/βq) − N_B (x/βq)²/2 + ln G
     = Constant − [ N_A (N_A+N_B)²/N_A² + N_B (N_A+N_B)²/N_B² ] x²/(2q²)
     = Constant − (N_A+N_B)² [ 1/N_A + 1/N_B ] x²/(2q²)
     = Constant − [ (N_A+N_B)³/(N_A N_B) ] x²/(2q²)        (308)

where we used ln(1 + x) ≈ x − x²/2; the linear terms cancel since N_A/α = N_B/β = N_A + N_B. Therefore

Ω(x) ≈ Ω_max exp(−x²/2σ²)        (309)

with

σ² = q² N_A N_B / (N_A + N_B)³        (310)

The maximum of this Gaussian is at x = 0, or q_A = N_A q/(N_A + N_B), as promised. The
width of this Gaussian is

σ = q √(N_A N_B) / (N_A + N_B)^{3/2}        (311)

Since we have q ≫ N_A, N_B, this can be a large number. However, the width
is smaller than the mean,

⟨q_A⟩ = N_A q / (N_A + N_B)        (312)

by a factor of roughly 1/√(N_A + N_B). If N_A and N_B are ∼ 10²³, then even 10
times the width is about one billionth of the mean.

From the theory of the normal distribution, you know that if you integrate
over x from −10σ to 10σ, the answer is

∫_{−10σ}^{10σ} (1/√(2πσ²)) exp(−x²/2σ²) dx ≈ 1 − 1.5×10⁻²³        (313)

What this means is the following. Suppose you prepare a system where
all the energy is in system B. After thermal contact is established,
the total system A + B starts to explore the combined states. Now the
most likely state is located at q_A = q N_A/(N_A + N_B). If you have N ∼ 10²³, most of
the accessible states are located within a relative fluctuation of 1/10⁹ of this
value. Only about 1 in 10²³ states is out of this range. Therefore, almost
immediately, the combined system will reach the very small neighborhood of
this most likely state, and furthermore, it will stay there, practically forever.
There is only about a 1 in 10²³ chance of q_A deviating from the value
specified above by more than about one part in a billion.
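Even at sizes far below 10²³, the peak is already sharp. The following supplemental sketch uses the exact Einstein-solid multiplicity Ω(N, q) = C(q+N−1, q), evaluated with log-gamma to avoid huge integers; the values of N_A, N_B and q are arbitrary small numbers:

```python
import math

# Combined multiplicity of two Einstein solids with NA != NB should peak
# at qA ≈ q*NA/(NA+NB), with width sigma = sqrt(q^2*NA*NB/(NA+NB)^3).
NA, NB, q = 300, 200, 10_000

def log_omega(N, k):
    # ln C(k + N - 1, k) via log-gamma
    return math.lgamma(k + N) - math.lgamma(k + 1) - math.lgamma(N)

logs = [log_omega(NA, qA) + log_omega(NB, q - qA) for qA in range(q + 1)]
peak = max(range(q + 1), key=lambda qA: logs[qA])
print(peak)                                       # near 10_000*300/500 = 6000
assert abs(peak - q * NA / (NA + NB)) < 10

sigma = math.sqrt(q**2 * NA * NB / (NA + NB)**3)
# ln Omega should drop by about 1/2 one sigma away from the peak:
drop = logs[peak] - logs[peak + round(sigma)]
assert 0.3 < drop < 0.7
```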


15 Schroeder Chapter 3 Interactions and Implications

Now that we have become familiar with the concept of entropy, we can talk
about temperature. Initially, we defined the temperature in 2 ways.
The operational way of defining it was "whatever the thermometer tells me."
The other way was to say that it is the quantity that is the same when two bodies
in contact are in thermal equilibrium. The second way is more precise, but
it was also rather vague.

Now that we have a definition of thermal equilibrium in terms of entropy,
we can make it more concrete. To do so, we need a little bit of math.
Thermal equilibrium between two systems is defined to be the state for which
the multiplicity of the combined system is the greatest. In math, this sort of
thing is called an optimization problem, and the tool for such problems is
partial differentiation. You are familiar with ordinary differentiation.
For instance, if you are given the trajectory of a particle as a function of
time, x(t), then the velocity is just
v(t) = dx/dt        (314)

and the acceleration is just

a(t) = dv/dt        (315)

If your function depends on more than one variable, then you can define the
partial derivative as

∂f(x, y)/∂x = lim_{ε→0} [f(x + ε, y) − f(x, y)] / ε        (316)

That is, when you take the partial derivative w.r.t. x, you hold y constant. For ∂f/∂y, you do the same, only with x and y exchanged. Why is
this useful? Well, partial differentiation tells you the rate of change in one
particular direction in a many-dimensional space. Think of a hill.

How do you characterize the top of the hill? Well, this is the point from which,
in any direction you go, you go down. What about the deepest part of the
valley? Well, this is the point from which, in any direction you go, you can only go
up.
Figure 21: Mountain

Figure 22: Valley


That's fine. But how do we characterize this property mathematically?
Well, for a hill or a valley, which is a 2-D surface, you only have to say that
this is true for 2 independent directions. The nominal choice is of course x
and y, or N-S and W-E. But they don't even have to be orthogonal directions.
So think about moving south to north. This could be your x direction. If
you go straight, then your y coordinate stays fixed. Therefore you are
following this shape of path:

Figure 23: X path

Figure 24: Y path

So if you calculate the slope as a function of x, it is 0 when you get to
the top. However, this happens even if you don't really reach the top. If you
think about it, it will happen if you just follow any x line near the top. If you
think about a y line that crosses the x line, the story is the same. However,
there is only 1 point where both the slope in x and the slope in y are
zero, and that's the summit, or the deepest part of the valley.

In any case, what does this have to do with us? Well, we want to say
that thermal equilibrium is reached when the combined multiplicity (or
entropy) of the two systems is the highest. The combined multiplicity is

Ω(U_A) = Ω_A(U_A) Ω_B(U − U_A)        (317)

where we fixed the total energy U = U_A + U_B. This is a function of the energy
U_A and the volumes. For simplicity, let's fix the volumes. The extremum of
this function with V fixed is located at the U_A which satisfies

∂Ω(U_A)/∂U_A = [∂Ω_A(U_A)/∂U_A] Ω_B(U − U_A) + Ω_A(U_A) ∂Ω_B(U − U_A)/∂U_A
             = [∂Ω_A(U_A)/∂U_A] Ω_B(U_B) − Ω_A(U_A) [∂Ω_B(U_B)/∂U_B]|_{U_B = U−U_A}
             = 0        (318)

with the restriction that U_A + U_B = U is fixed, and we used the product rule

∂(fg)/∂x = (∂f/∂x) g + f (∂g/∂x)        (319)

Since

∂ ln f/∂x = (1/f) ∂f/∂x        (320)

the above condition is equivalent to

∂ ln Ω_A(U_A)/∂U_A = ∂ ln Ω_B(U_B)/∂U_B        (321)

with U_A + U_B = U fixed. But we know that the entropy of a system is given
by

S(U) = k ln Ω(U)        (322)

so the above can be rewritten as

∂S_A(U_A)/∂U_A = ∂S_B(U_B)/∂U_B        (323)

Therefore equilibrium between the two systems is reached when this
equality is fulfilled. That is, when the rates of change of the entropy with
respect to the energy match, you get equilibrium. Note that the
left-hand side is entirely a property of system A and the
right-hand side entirely a property of B.

What does it mean? Isn't this exactly what we were looking for? A
property of a system that becomes equal when two systems are brought
together? Yes! This acts very much like the temperature. But is it proportional to the temperature itself, or is it merely a function of temperature?
What is the exact connection?

To make a more concrete connection, let's check the expectation
that the equilibrium temperature must be smaller than the larger
of the two and larger than the smaller of the two. That is, if
you mix hot water and cold water, you get lukewarm water. Let's give this
quantity a name:

β ≡ ∂ ln Ω(U) / ∂U        (324)

Suppose that at the initial time β_A < β_B, or

∂ ln Ω_A(U_A)/∂U_A < ∂ ln Ω_B(U_B)/∂U_B        (325)

The only way for this condition to evolve into equality is for β_A to increase and β_B to
decrease. That's kinda trivial to see and doesn't help much.

What is not so trivial is to figure out the direction of the flow of energy.
This is not so trivial because we fixed U = U_A + U_B. Starting with
the above non-equilibrium condition, the only way to reach equilibrium is to
reach a value of β that is somewhere between β_A and β_B. But this has to be
done under the fixed-energy condition. That is, if the energy in A increases,
the energy in B must decrease. Therefore, to go from the condition Eq. (325)
to equality, not only does S have to be a monotonic function of U, but ∂S/∂U also has to be a
monotonic function of U.

For instance, here is an example of a monotonically increasing S for which
the two sides of Eq. (325) can never become equal:

S_A = a U_A    (bad)
S_B = b U_B    (bad)        (326)

where a and b are different constants. Sure, these functions are monotonically increasing. But the derivatives

∂S_A/∂U_A = a,   ∂S_B/∂U_B = b        (327)

can never be the same, no matter what the energies are.

So to have a notion of equilibrium, β = ∂ ln Ω/∂U itself must be a
monotonic function of U. But which way?

Figure 25: Piggyback

Suppose U_A is very small compared to U_B. If β is monotonically increasing in U,
then system A must gain energy from B so that U_A becomes larger and β_A
becomes larger. Now suppose system B is much, much smaller than A.
System B can give up all of its energy to A, but there is no guarantee that
this would be sufficient to bring the system to equilibrium, since β_A can be
arbitrary. But physically, we know that two systems brought into contact will
always equilibrate given enough time. So this can't happen.

If β is monotonically decreasing, then system A must lose energy to B so
that U_A becomes smaller and β_A becomes larger. From the B side this is
O.K. too, since as B gains energy, β_B becomes smaller, and eventually the
two will coincide and live happily ever after in equilibrium.

But this means that β can't be the temperature itself. Remember that
the energy per particle is always something like kT. Therefore an increase in
total energy without increasing the number of particles means an increase in
temperature. But β behaves in exactly the opposite way. It turns out that,
actually,

1/T = ∂S/∂U        (328)

The big question is then: does the entropy defined as k ln Ω for physical systems satisfy these conditions? Since β = ∂ ln Ω/∂U has to be a monotonically
decreasing function of U while S itself is a monotonically increasing function of U, the
dependence of S on U must be either

S ∝ ln U        (329)

or

S ∝ U^s   with s < 1        (330)

Recall that the many-body phase space could be written as either

Ω_N ∝ (V/N)^N (U/N)^{αN}        (331)

or

Ω_N ∝ V^N (U/N)^{αN}        (332)

Since the energy part comes from a phase-space integral which has no exponential factor, it is highly unlikely that the single-particle multiplicity goes
like

exp((U/N)^s)        (333)

so most likely the dependence is of the ln U type.
Let's look at the two entropies we have calculated so far. For the ideal gas,
the Sackur-Tetrode formula is

ln Ω = N [ ln( (V/N) (4πmU/(3Nh²))^{3/2} ) + 5/2 ]        (334)

This is certainly a monotonically increasing function of U. Furthermore,

∂ ln Ω/∂U = 3N/(2U)        (335)

or, with β = 1/(kT),

U = (3/2) N k T        (336)

which recovers the equipartition theorem and gives us further confidence
that the identification

β = 1/(kT)        (337)

is indeed the correct one. For the Einstein solid, we have

Ω(N, q) ≈ (qe/N)^N        (338)

where the total energy and q are related by

U = q ħω        (339)

so that

Ω(N, U) ≈ (Ue/(Nħω))^N        (340)

Then

S/k = ln Ω = N [1 + ln U − ln N − ln(ħω)]        (341)

Therefore

1/T = ∂S/∂U = kN/U        (342)

or

U = N k T        (343)

which again recovers the equipartition theorem.
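As a supplemental numerical check (units chosen so that k = ħω = 1, and N, q arbitrary), the exact multiplicity reproduces U = NkT in the high-temperature limit:

```python
import math

# Exact Einstein-solid multiplicity Omega(N, q) = C(q + N - 1, q).
# The slope of S/k = ln Omega gives beta = 1/(kT); with k = hbar*omega = 1
# we have U = q, so U/(N*T) should approach 1 for q >> N.
N = 500

def S(q):   # entropy/k from the exact multiplicity, via log-gamma
    return math.lgamma(q + N) - math.lgamma(q + 1) - math.lgamma(N)

q = 50_000                        # high-temperature regime, q >> N
beta = (S(q + 1) - S(q - 1)) / 2  # discrete dS/dU = 1/T
T = 1 / beta

print(T * N / q)                  # ratio U/(N T); approaches 1 for q >> N
assert abs(T - q / N) / T < 0.02
```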


When applicable, that is, as long as the thermometers are within their reliable temperature range, this theoretical definition of temperature and the
operational definition of temperature match up. And since there is no restriction of operational tolerance on the theoretical definition, it is a much
better and really fundamental definition of temperature. You can think of
thermometers as detectors of the temperature.

Now step back and think. That's all proper and good, but what does all this really mean? I mean,
what is the intuitive understanding of the temperature defined as the energy
derivative of the entropy? How do we understand it?

Entropy is a state function that characterizes the state of the given
system. Therefore, one can think of the temperature as answering: how much does

the system change if a small amount of energy is added or subtracted? High
temperature means a small

β = ∂σ/∂U        (344)

That means little change in the system. Is this reasonable? High temperature means that the average energy of the particles is large, ⟨E⟩ = U/N ∼ kT.
If you introduce a small change in the energy, that doesn't do much to the
average energy of the particles, whether it is added or subtracted. In terms of energy
levels, the highest energy available to a particle is high enough that adding
or subtracting a small energy does very little to reduce or enhance the limit.

On the other hand, low temperature means that the average energy is
small. Therefore a relatively small change in energy can induce a large change
in the average energy. Think of really low temperature. In that case, the
particles will mostly occupy (if bosons) the ground level. Add a little energy
and suddenly the 1st energy level, 2nd energy level and so on are available.
This is a big change. Liquid helium goes from superfluid to normal fluid by
doing just this.
Another way to get intuition about temperature is again to think about
two systems. Suppose you have two really cold blocks at different temperatures, or better said, different average energies U/N. Now put them together.
Now ask: if you want to increase the number of accessible states, which would
be better, to transfer energy from small to big or from big to small?

Now, on average, the atoms in one block have ⟨E⟩_A = U_A/N_A, and we
know that the amount of phase space available to a single atom in the block
is a function of (U/N)_A. A likely form is Ω_1 ∝ (U_A/N_A)^{α_A}, where α_A = O(1)
is proportional to the number of momentum degrees of freedom. The multiplicity of the
total system is therefore

Ω_A ∝ (Ω_1)^{N_A} ∝ (U_A/N_A)^{α_A N_A}        (345)

Suppose that ΔU = ν U_A/N_A. That is, the total energy changes by a
few times the average single-particle energy. If you have N ∼ 10²³ particles,
this is next to nothing. The multiplicity changes to

Ω_A(U_A + ΔU) ∝ (U_A/N_A)^{α_A N_A} (1 + ΔU/U_A)^{α_A N_A}
             = (U/N)_A^{α_A N_A} (1 + ν/N_A)^{α_A N_A}
             ≈ (U/N)_A^{α_A N_A} e^{α_A ν}        (346)

On the other hand, for system B, similar reasoning leads to

Ω_B(U_B − ΔU) ∝ (U/N)_B^{α_B N_B} (1 − ΔU/U_B)^{α_B N_B}
             = (U/N)_B^{α_B N_B} (1 − ν(U_A/N_A)/U_B)^{α_B N_B}
             ≈ (U/N)_B^{α_B N_B} e^{−α_B ν (N_B/U_B)(U_A/N_A)}        (347)

So the combined total multiplicity changes by

Ω′/Ω_init = e^{ν [α_A − α_B (N_B/U_B)(U_A/N_A)]}        (348)

If this is to increase, we must have

α_A N_A/U_A > α_B N_B/U_B        (349)

or

U_A/(α_A N_A) < U_B/(α_B N_B)        (350)

That is, the average energy per degree of freedom must be larger if you want
to donate your energy to increase the multiplicity. And as long as U_B/(α_B N_B)
is larger than U_A/(α_A N_A), system B will keep giving up energy to get
to the most probable state. This process of course stops when the average
energies per degree of freedom become the same, or

U_A/(α_A N_A) = U_B/(α_B N_B)        (351)

What does this have to do with temperature? Well, the equipartition theorem, of course. The average energy is proportional to the temperature. Therefore, the two systems are in equilibrium when the temperature is the same, or
equivalently, when the energy per degree of freedom is the same.

16 Entropy and Heat

Heat Capacity

Definition:

C_V = (∂U/∂T)_{N,V}        (352)

That is, how much energy do you need to raise the temperature of the system
by one unit of temperature?

If we know the multiplicity as a function of U, we can calculate T from

1/T = k ∂ ln Ω/∂U        (353)

Then, by solving for U, we can get U as a function of T. Once we know that,
we can calculate C_V. For the Einstein solid,

C_V = N k        (354)

For the monatomic ideal gas,

C_V = (3/2) N k        (355)

Why do we bother with C_V at all? Well, this is because we can measure
C_V rather easily for many systems, and furthermore, C_V can show some very
dramatic behavior when the system goes through a phase transition (see Figure 26).

However, going through the multiplicity to calculate C_V is rather awkward. Besides, there aren't that many physical systems for which we know
how to count the number of states. There is a better method, called the partition function. We'll get to that later.

Measuring Entropies

In general, however, physical systems are complicated enough that explicit calculation of the entropy, or equivalently the multiplicity, is impossible.
However, we can still measure it.

Remember the definition:

1/T = (∂S/∂U)_{N,V}        (356)

In differential form we can say

T dS = dU        (357)

Figure 26: Behavior of the heat capacity for Helium 4


with N and V fixed. That is, if the energy of the system increases by dU,
the entropy increases by dS = dU/T, at fixed N and V.

Now remember the first law,

ΔU = Q + W        (358)

If N and V are fixed, there is no work done on or by the system, so W = 0,
and we can say

dU = δQ = T dS        (359)

That is, T dS is the amount of heat absorbed. As we will show later, this
relation between the heat and the entropy,

δQ = T dS        (360)

is very general and applies even when N and V are changing too.

An integral relation between U, T and S is

∫_{S_i}^{S_f} dS = ∫_{U_i}^{U_f} dU/T        (361)

where S_i and U_i refer to the initial quantities and S_f and U_f to the final
quantities, all at fixed N and V.

We can use the definition

C_V = (∂U/∂T)_{N,V}        (362)

to say

∫_{S_i}^{S_f} dS = ∫_{U_i}^{U_f} dU/T = ∫_{T_i}^{T_f} (∂U/∂T)_{N,V} dT/T = ∫_{T_i}^{T_f} C_V dT/T        (363)

Now if C_V is fairly constant over the temperature range (T_i, T_f), as for the
Einstein solid or the ideal gas, we can pull C_V out of the integral and say

ΔS = C_V ln(T_f/T_i)        (364)

using

∫_a^b dx/x = ln x |_a^b = ln(b/a)        (365)

Example: Heat a cup of water (200 g) from 20 °C to 100 °C. The heat capacity
of 200 g of water is just 200 cal/K, or about 840 J/K, since 1 cal ≈ 4.2 J.
Therefore

ΔS = (840 J/K) ∫_{293 K}^{373 K} dT/T = (840 J/K) ln(373/293) ≈ 200 J/K        (366)

NOTICE that we used the temperature in kelvin. Remember: in all the
formulas in thermodynamics and statistical mechanics, temperature means absolute temperature.
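The arithmetic of this example in code (supplemental; constant C_V assumed, as above):

```python
import math

# Entropy change for heating 200 g of water from 20 C to 100 C,
# assuming a constant heat capacity C_V ≈ 840 J/K (1 cal/(g K), 4.2 J/cal).
C_V = 0.200 * 1000 * 4.2      # J/K  (200 g * 1 cal/(g K), converted to joules)
Ti, Tf = 293.0, 373.0         # absolute temperatures, K

dS = C_V * math.log(Tf / Ti)  # Delta S = C_V ln(Tf/Ti)
print(round(dS))              # ≈ 203 J/K
```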
So how much multiplicity change does this correspond to? Well,

ΔΩ = exp(S_f/k) − exp(S_i/k) = exp(S_i/k) [exp(ΔS/k) − 1]        (367)

Now what's the value of k?

If you can recall the size of Avogadro's number more easily, you can
use

k N_A ≈ 8 J/K        (368)

and

N_A ≈ 6×10²³        (369)

which gives you

k ≈ (8/6)×10⁻²³ J/K        (370)

I think this is natural to remember in the sense that

k N_A ≈ 8 J/K        (371)

is a small number in everyday units.


Well, the way I remember it goes something like this. First of all, I know
that the unit of k has to be (energy)/(temperature), since the combination
kT has to be an energy, and I know

k · 300 K ≈ (1/40) eV        (372)

or

k ≈ (1/12,000) eV/K        (373)

In microscopic studies this is the preferred way of memorizing k, since
the units here are all natural to atomic scales.

To divide 200 J/K by k we additionally need to know

1 eV = 1.6×10⁻¹⁹ J        (374)

which is just the value of the electron charge in coulombs (so you had better
remember that), so that

ΔS/k = (200 J/K)/k ≈ (200 J/K) × (12,000 K/eV) × (1 eV / 1.6×10⁻¹⁹ J) ≈ 1.5×10²⁵        (375)

This is a large number. At this point the precise number does not matter much.
The multiplicity then grows by a factor of

exp(ΔS/k) − 1 ≈ exp(ΔS/k) ≈ exp(10²⁵)        (376)

This is a very large number! And remember that this is a factor, not an additional term. How can we understand such a change? The number 10²⁵ looks
suspiciously close to the number of molecules in 200 g of water. Remember, 1
mole of water is 18 g, since the atomic weight of O is 16 and that of hydrogen
is 1.
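A supplemental numerical comparison (using k ≈ 1.38×10⁻²³ J/K) shows that ΔS/k and the molecule count are indeed the same order of magnitude:

```python
import math

# Compare Delta S / k for the water-heating example with the number of
# molecules in the cup (200 g of water, 18 g per mole).
k = 1.38e-23                        # Boltzmann constant, J/K
dS = 840 * math.log(373 / 293)      # ≈ 203 J/K

print(f"{dS / k:.1e}")              # ≈ 1.5e+25, dimensionless units of S/k
N_molecules = 200 / 18 * 6.02e23
print(f"{N_molecules:.1e}")         # ≈ 6.7e+24, same order of magnitude
```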
Well, suppose the water is sufficiently close to an ideal gas (which it
is not, but for illustration purposes this will do), with slightly different
degrees of freedom. We know from the Sackur-Tetrode formula that

Ω ∝ V^N (U/N)^{αN}        (377)

where α is proportional to the number of degrees of freedom and is O(1). The equipartition theorem says that

U/N ∝ T        (378)

so that

Ω ∝ V^N T^{αN}        (379)

So if the temperature changes from T_i to T_f, the multiplicity changes to

Ω_f ∝ V^N T_f^{αN} = (T_f/T_i)^{αN} V^N T_i^{αN}        (380)

or

Ω_f/Ω_i = exp(ΔS/k) = (T_f/T_i)^{αN}        (381)

Indeed, that 10²⁵ is related to the total number of molecules in the water,
and the precise number in front of it would give us information on how
many effective degrees of freedom a water molecule has.
Note that since the equipartition-theorem estimate

U/N ≈ kT        (382)

is quite general, and the dependence

Ω ∝ (U/N)^{αN}        (383)

is also quite general, the numbers appearing in the above calculation should be
quite typical.
Now let's think about some limiting cases. What happens to the entropy
as T → 0? Since lim_{x→0} ln x = −∞, the above formula would say that at a certain
temperature the entropy becomes 0 and continues to decrease until it explodes
to negative infinity at absolute 0. But this is absurd. Remember that we defined

S/k = ln Ω        (384)

Now Ω counts the number of states a system can be in. If the system exists
at all, there should be at least one state the system can occupy. Therefore
the minimum value of Ω is 1 and the minimum value of S/k is 0. This is the
third law of thermodynamics.

In practice, there is usually a residual entropy that prevents the measured
entropy from going to zero. This usually has to do with orientations of molecules
or nuclei that take very little energy to change. In a mechanical analogy, this
corresponds to a very nearly flat surface. If you drop a ball somewhere, it will
roll, since the chance of hitting the valley directly is very small. However, since the surface
is nearly flat, the speed of the ball is very small and it may take a very long
time for the ball to find the true minimum and settle down. Another issue
is mixing due to isotopes.
Due to all these effects and some more, usually the measured entropy does not go to 0 as
T → 0. However, the integral

∫_0^T C_V dT/T        (385)

had better be finite. That implies C_V ∝ T^s for some s > 0 at small T, and
that means

C_V → 0 as T → 0        (386)

Sometimes this is referred to as the third law.

What's wrong with our Einstein solid formula and the ideal gas formula,
then? For our Einstein solid we had

S/k = N ln( eU/(Nħω) )        (387)

So when U/N → 0, S seems to blow up. For the ideal gas we had

S/k = N [ ln( (V/N) (4πm/(3h²))^{3/2} (U/N)^{3/2} ) + 5/2 ]        (388)

Again, as U/N → 0, it seems to blow up.


Well, the reason is that in deriving those formulas we made some approximations. For the Einstein solid, we assumed

    q ≫ N    (389)

and N ≫ 1. The original formula is

    Ω(N, q) = C(q+N−1, q) = (q+N−1)! / (q!(N−1)!)    (390)

If q = 0, this becomes

    Ω(N, 0) = (N−1)! / (0!(N−1)!) = 1    (391)

since 0! = 1. Therefore, in the zero-temperature limit, the entropy of the Einstein solid comes out correct: S/k = ln 1 = 0.
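As a numerical sanity check (a Python sketch of my own, not from the text): the exact count gives Ω(N, 0) = 1 and hence S = 0, while the approximate high-temperature entropy S/k = N ln(eq/N) only matches the exact result when q ≫ N:

```python
from math import comb, e, log

def omega_exact(N, q):
    # Exact Einstein-solid multiplicity: (q + N - 1)! / (q! (N - 1)!)
    return comb(q + N - 1, q)

def S_approx(N, q):
    # Approximate entropy S/k = N ln(e q / N), valid only for q >> N >> 1
    return N * log(e * q / N)

N = 50
print(log(omega_exact(N, 0)))   # 0.0: exactly one state at q = 0, third law OK
q = 5000
print(log(omega_exact(N, q)), S_approx(N, q))  # close when q >> N
```

For N = 50 oscillators, the approximation is already within a few percent at q = 5000 but is meaningless at q = 0, where only the exact formula gives S = 0.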
It's a little trickier for the ideal gas. We made several assumptions in deriving the Sackur-Tetrode formula: the continuum approximation, the indistinguishability factor 1/N!, and so on. We are not going to patch them up here to get the right answer; we'll wait until we have developed enough machinery to deal with all the complications. But as the Einstein solid case indicates, there is no contradiction once all the factors are taken into account.
It also indicates that the naive counting of degrees of freedom works only in certain limits, namely the high-temperature limit. Many interesting things can happen at low temperature, and this is what low-temperature physics (superconductivity, Bose-Einstein condensation, slowing the speed of light to a crawl, etc.) is all about.

17 Paramagnetism

Remember:

Paramagnetism: tendency to line up with B.
Ferromagnetism: tendency to line up with B and to stay aligned even after B is turned off.
The two examples we have been using extensively so far, the Einstein solid and the ideal gas, have many things in common. It may not look like that at first glance; one is about a solid and the other about a gas. But there really are many things in common. One common feature is the availability of an infinite phase space. That is, there is no limit on the maximum energy and hence no limit on the entropy. Even though an Einstein oscillator is fixed at a lattice point, at high energy the typical amplitude of oscillation is big enough that it isn't hard to imagine it resembling a free gas. In reality, of course, the solid first melts to the liquid phase and eventually boils to the vapor phase as the temperature goes up. But that's another story altogether.
Now let's consider a very different system, one where there is a limit on the maximum energy and, furthermore, higher energy per particle does not necessarily mean higher entropy.
Consider a set of magnetic dipoles, each fixed at a lattice point. Take them to be independent, so that they don't interact with each other. This system is called a 2-state paramagnet.
We take each magnet to have intrinsic spin 1/2. What does that mean? Operationally, it means that the magnet can point in only 2 directions, up or down. This is a little weird, since if you think of a bar magnet, which is certainly a dipole magnet, you can point it in any direction you like. Well, this comes from the quantum mechanics of atoms and electrons. The intuition you have from the macroscopic world does not always apply.
Quantum mechanics tells us that particles like the electron or the nucleus of the hydrogen atom (the proton) have an intrinsic property called spin, and it causes the electron to behave like a tiny magnet. Since the electron is microscopic, the rules of quantum mechanics must be applied. The most basic rule of quantum mechanics is the uncertainty principle. In very general form, it states that if you know the momentum you don't know the position, and if you know the position you don't know the momentum. Mathematically this is stated as

    Δx Δp > ℏ    (392)

The same principle applies to angles and angular momentum. Spin is a kind of angular momentum. Using spherical coordinates, you can easily derive

    Δφ ΔS_z > ℏ    (393)

where φ is the azimuthal angle. Now suppose you know the z component of the angular momentum precisely, that is, ΔS_z = 0. In that case you have no idea at all what the value of φ should be. All you are certain about is the value of S_z. So it is pointless to try to define the direction of the angular momentum vector in all 3 directions once you know the value of S_z; it only makes sense to say up or down. But why two states? Why not 3, 4, 5, or 100 for that matter?
This is because of the wave nature of a quantum particle. Remember that to confine a particle in a box, the wavefunction must be a stationary wave. That meant imposing boundary conditions; in particular, we imposed the periodic boundary conditions

    ψ(0) = ψ(L)    (394)
    ψ′(0) = ψ′(L)    (395)

A similar argument works here. Spinning means that the particle is, in some sense, rotating. Consider this as the motion of a particle confined to a circle. If this were really a classical particle, there would be no condition on its motion. However, to confine a wave to a circle, the wave must satisfy the stationarity conditions

    ψ(φ) = ψ(φ + 2π)    (396)
    ψ′(φ) = ψ′(φ + 2π)    (397)

Just like the box conditions quantized the spatial momentum, this condition quantizes the angular momentum, because only some special values of the angular momentum are compatible with the above conditions.
In the case of the box, we had

    k_n = 2πn/L    (398)

as the wavevector, where L was the size of the box. This meant that the momentum is

    p_n = ℏ k_n = hn/L    (399)

In the case of angular motion, the size of the "box" is 2π, the circumference of the unit circle. So the analogue of the wavevector is

    m = 2πn/(2π) = n    (400)

where n is an integer that can be positive or negative. This means that the angular momentum is

    (L_z)_m = ℏm    (401)

The value of m is of course limited by the size of the total angular momentum. This is very roughly how it goes.
Now, this consideration tells you that the z component of the angular momentum is quantized, and that its value can only be an integer multiple of ℏ.
But hang on a minute. For the 2-state paramagnets we are considering, I said that the value of the spin is one half of ℏ. What's going on here? Well, this is the magic of relativity. To fully understand it, you need relativistic quantum mechanics. We don't need that here, beyond the fact that the elementary particles electron, proton and neutron all carry spin 1/2. Let's accept that as a fact and be satisfied that we have a heuristic understanding of why the spin, or the angular momentum, should be quantized.
It turns out that if a particle has intrinsic spin 1/2, the only possible values of L_z are ±ℏ/2, and that means up or down.
Now suppose we have N such spins.

    Figure 27: Many spin-1/2 particles; N = N↑ + N↓
If we apply a magnetic field B, the energy of a magnetic dipole is

    E = −μ·B    (402)

Since each dipole can only be aligned or anti-aligned with the magnetic field, there are only two possible energy states for each particle:

    E = ∓μB    (403)

where the minus sign (lower energy) corresponds to the parallel spin and the positive sign to the anti-parallel spin. The total energy of a system with N↑ parallel (spin-up) particles is

    U = μB(N↓ − N↑)    (404)

The magnetization of the system is given by

    M = −U/B = μ(N↑ − N↓)    (405)
The multiplicity of states with the same energy U is determined by the number of possible arrangements of N = N↑ + N↓ arrows. This is just a binomial coefficient, N-choose-N↑:

    Ω(U) = Ω(N↑) = C(N, N↑) = N!/(N↑!(N − N↑)!) = N!/(N↑!N↓!)    (406)

The lowest energy state, U = −NμB, corresponds to

    Ω(−NμB) = N!/(N!0!) = 1    (407)

so the entropy will work out fine.


The entropy is of course

    S(U)/k = ln Ω(U) = ln N! − ln N↑! − ln N↓!
           = ln N! − ln N↑! − ln(N − N↑)!    (408)
Since we know the entropy as an explicit function of U, we know everything about this system. To get an estimate, suppose we have 100 such spin halves. The largest multiplicity happens when exactly half of the spins are up:

    Ω_{N/2} = N!/((N/2)!(N/2)!) = 100!/(50! 50!)    (409)

These numbers are big enough for Stirling's formula:

    Ω_{N/2} ≈ [√(2π·100) 100^100 e^{−100}] / [√(2π·50) 50^50 e^{−50}]²
            = 2^100 √(1/(50π))
            ≈ 2^100/12.5 ≈ 1.0 × 10^29    (410)

To calculate the temperatures and the corresponding heat capacity, we need to

1. Calculate

    1/T = ∂S/∂U    (411)

2. Then solve for U = U(T), and then

3. Calculate

    C_V = ∂U/∂T    (412)

The first step may be possible, but with the full formula the second step is impossible to carry out exactly. To do this numerically, we follow these steps:

1. Calculate

    1/T = ΔS/ΔU = −(1/(2μB)) ΔS/ΔN↑    (413)

as a function of N↑.

2. Make a table of the values known so far:

    N↑     U/μB    S/k      kT/μB
    99     −98     4.61     0.47
    98     −96     8.51     0.54
    97     −94     11.99    0.60
    ...    ...     ...      ...
3. Calculate C_V as

    C_V = ΔU/ΔT ≈ [U(n+1) − U(n)] / [T_{n+1} − T_n]    (414)

Note that in our table ΔU between adjacent rows is always 2μB in magnitude, but the temperature difference between rows changes. For instance, suppose you want to calculate C_V at N↑ = 98. Then

    C_V(N↑ = 98) ≈ k (−98 − (−96)) / (0.47 − 0.54) = 28.6 k    (415)

or

    C_V/N ≈ 0.286 k    (416)

This is slightly different from the value listed in the book. The value in the book is calculated this way:

    C_V = 1/(ΔT/ΔU)    (417)

The advantage of using this formula is that ΔU is fixed. In that case we can use the second-order centered-difference formula

    f′(x_n) = [f(x_{n+1}) − f(x_{n−1})] / (2Δx) + O(Δx²)    (418)

For our case, this gives

    C_V(N↑ = 98) ≈ 1/(ΔT/ΔU) = k (−4) / (0.47 − 0.60) = (4/0.13) k ≈ 31 k    (419)

4. Continue until you run out of rows.
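These steps are easy to automate. A short Python sketch (my own, not from the notes) reproduces the table above using centered differences, with energies in units of μB and temperatures in units of μB/k:

```python
from math import comb, log

N = 100  # number of spin-1/2 dipoles

# Table columns indexed by N_up: U in units of mu*B, S in units of k
U = {n: -(2 * n - N) for n in range(N // 2, N + 1)}
S = {n: log(comb(N, n)) for n in range(N // 2, N + 1)}

# kT/(mu*B) from the centered difference 1/T = Delta S / Delta U
T = {n: (U[n - 1] - U[n + 1]) / (S[n - 1] - S[n + 1])
     for n in range(N // 2 + 1, N)}

# C_V/k = Delta U / Delta T, again centered (the book's method, eqs. 417-418)
C = {n: (U[n - 1] - U[n + 1]) / (T[n - 1] - T[n + 1])
     for n in range(N // 2 + 2, N - 1)}

print(T[99], T[98], T[97])  # about 0.47, 0.54, 0.60, matching the table
print(C[98] / N)            # about 0.31 per spin, close to eq. (419)
```

The centered differences reproduce the tabulated kT/μB values and the C_V ≈ 31k estimate at N↑ = 98.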

Now let's look at Table 3.2. There are lots of funny things about this table. For instance, look at the temperature. At N↑ = 50 it is infinite! And below that it is negative. Now, if this were Celsius, there would be nothing weird about a negative temperature; we have that outside right now. But this is absolute temperature! There is supposed to be a limiting temperature called absolute zero! What's happening here? Well, what's happening is that we have a system that is finite in every respect. That is, the phase space available to the system is limited. In particular, there is a maximum energy the system can have, and moreover the multiplicity of the maximum-energy state is 1. This is very different from the cases we have discussed so far, i.e. the Einstein solid and the ideal gas. In those systems, having larger energy meant being able to access larger regions of phase space, because in principle each individual particle can have any amount of energy and the phase space volume V_x V_p / h can grow as big as one wants. Here, however, this volume is strictly confined: there are only two energy levels available to each particle. It is no wonder, then, that the temperature, defined as the derivative of the log of the multiplicity (the entropy) w.r.t. the energy, goes a bit crazy.
But what does it mean to have a negative temperature? For that matter, an infinite temperature?
Note that only a finite amount of energy is needed to bring the system to T = ∞. Does that mean that by creating this system we have made the hottest matter in the Universe? Will anything that comes in contact with it instantly melt/evaporate/explode?
Nope. Not in the ordinary sense, anyway.
Our intuition about something really hot should be used with caution here. Our intuition about hot things like hot water, hot steam, or a hot pot is all about kinetic energy: something that runs around fast and hits things fast. Infinite temperature in that picture would mean that the average energy per degree of freedom is infinite. For the spins, that is not the case. We are talking only about the energetics of the spins here, not the temperature of the underlying structure. Furthermore, at infinite temperature the average spin energy per particle is actually, well, zero.
All this comes about because we insisted on using the concept of temperature. If we just talk about entropy and multiplicity, there is no confusion. This sort of system, however, does exist in nature.
Analytic

The temperature is

    1/T = ∂S/∂U    (420)

Since

    U = μB(N↓ − N↑) = μB(N − 2N↑),    ΔU = −2μB ΔN↑    (421)

we can say

    1/T = ∂S/∂U = −(1/(2μB)) ∂S/∂N↑    (422)

If we use Stirling's formula,

    ln N! ≈ N ln N − N    (423)

so that

    (d/dx) ln x! ≈ ln x    (424)

and

    (d/dx) ln(y − x)! ≈ −ln(y − x)    (425)

we get

    1/T = ∂S/∂U = −(1/(2μB)) ∂S/∂N↑
        = −(k/(2μB)) [ −ln N↑ + ln(N − N↑) ]
        = (k/(2μB)) ln( N↑/N↓ )    (426)

or

    N↑/N↓ = e^{2μB/kT}    (427)

Since N↓ = N − N↑, we can solve for N↑. Abbreviating E ≡ e^{2μB/kT},

    N↑/(N − N↑) = E    (428)
    N↑ = NE − N↑E    (429)
    (1 + E) N↑ = NE    (430)
    N↑ = NE/(1 + E)    (431)

that is,

    N↑ = N e^{2μB/kT} / (1 + e^{2μB/kT})    (432)

and, since 1 − E/(1 + E) = (1 + E − E)/(1 + E) = 1/(1 + E),

    N↓ = N − N↑ = N / (1 + e^{2μB/kT})    (433)

Therefore

    U = μB(N↓ − N↑) = −NμB (e^{2μB/kT} − 1)/(1 + e^{2μB/kT})    (434)
      = −NμB tanh(μB/kT)    (435)

The magnetization is given by

    M = −U/B = Nμ tanh(μB/kT)    (436)

For small B this can be approximated as

    M ≈ Nμ²B/kT    (437)

using tanh x ≈ x for x ≪ 1.
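As a consistency check (a Python sketch of my own, in units with μ = k = 1), the tanh formula exactly inverts the relation N↑/N↓ = e^{2μB/kT}, and the Curie limit emerges for μB ≪ kT:

```python
from math import log, tanh

N = 100

def kT_of(n_up):
    # Invert N_up/N_down = exp(2 mu B / kT), eq. (427), with mu = B = 1
    return 2.0 / log(n_up / (N - n_up))

for n_up in (98, 90, 75, 60):
    kT = kT_of(n_up)
    U_exact = float(N - 2 * n_up)        # mu B (N_down - N_up)
    U_analytic = -N * tanh(1.0 / kT)     # eq. (435)
    assert abs(U_exact - U_analytic) < 1e-9  # identity holds exactly

# Curie limit: for mu B << kT, M = N tanh(mu B / kT) ~ N mu^2 B / kT
kT, B = 25.0, 0.1
M = N * tanh(B / kT)
print(M, N * B / kT)   # nearly equal in this limit
```

The loop verifies that tanh((1/2) ln r) = (r − 1)/(r + 1) with r = N↑/N↓ turns eq. (427) back into U = μB(N↓ − N↑).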


So how big is a typical μ? Remember that μ is the typical dipole moment of an atom or an electron.
To estimate it, we start from the fact that the angular momentum is about the size of ℏ:

    L ≈ ℏ    (438)

    Figure 28: An electron of charge e making a loop of radius R with angular momentum L

The typical size of the charge is of course just e. Now suppose you have a charge going around a loop with linear speed v. Assuming circular motion, the angular momentum

    L = r × p    (439)

is constant. The magnetic moment of the resulting current loop is

    μ = (I/2) ∮_C r × dl
      = (I/2) ∮_C r × (dl/dt) dt
      = (I/2m) ∫₀^{T=2πR/v} r × mv dt
      = (1/2m) I T L
      = (e/2m) L    (440)

where in the last step we used the fact that the charge carried around in one period is IT = e.

Now, we know that if you have a loop with current I flowing through it, you get a magnetic moment of

    μ = IA    (441)

where I is the current and A is the area enclosed by the loop.
Since we know that an electron has a magnetic moment, let's model it as a current loop with radius r_e. Don't take this to be an exact description; it is just to get an estimate.
The area is then

    A = π r_e²    (442)

The current is

    I = dQ/dt = (dQ/dl)(dl/dt) = (e/(2πr_e)) v    (443)

where v is the speed in the tangential direction. So

    |μ| = (e/(2πr_e)) v π r_e²
        = e r_e v / 2
        = (e/2m) r_e (m v)
        = (e/2m) |r_e × p_e|
        = (e/2m) L    (444)

where we used the fact that for circular motion

    L = |r × p| = rp    (445)

Now, quantum mechanics tells us that the typical size of the angular momentum in the atomic world is ℏ. It then follows that

    |μ| ∼ (e/m) ℏ    (446)

Now don't get me wrong: this is NOT exactly what happens. But the order of magnitude is right on the bang. How big is μB, then? To estimate this sort of thing, you need a few unit-conversion tricks. My favorites are:

    ℏc = 200 MeV·fm    (447)
    m_e c² = 0.5 MeV    (448)
    e²/(ℏc) = 1/137    (449)

(in Gaussian units), or equivalently

    e² = 1.44 MeV·fm    (450)

In SI units the last relation reads k_C e² = 1.44 MeV·fm, where k_C ≈ 8.99 × 10⁹ N·m²/C² is the Coulomb constant.
Now, the typical size of a strong magnetic field we can generate is about 1 tesla. A tesla is defined by

    [B] = N/(C·m/s) = (kg·m/s²)/(C·m/s) = kg/(C·s)    (451)

which comes from the definition of the Lorentz force,

    F = q v × B    (452)

So for B = 1 tesla we need to evaluate

    μB ∼ eℏB/m_e ≈ 10⁻⁴ eV    (453)

The bookkeeping behind this number goes as follows.

1 mole of hydrogen is 1 g of hydrogen, so 1 kg of hydrogen is 10³ moles, i.e. 6 × 10²⁶ protons. Roughly, 1 kg ≈ 6 × 10²⁶ GeV/c².
1 coulomb = (1/1.6) × 10¹⁹ e = 6.25 × 10¹⁸ e.
1 second = 3 × 10⁸ m (times c) = 3 × 10²³ fm.
So 1 tesla, inserting factors of c,

    1 kg/(C·s) × (c²/c) = (6 × 10²⁶ GeV) / (6.25 × 10¹⁸ e) / (3 × 10²³ fm)
                        ≈ 0.3 × 10⁻⁶ eV/(e·fm) = 0.3 × 10⁻⁶ V/fm    (454)

So with B = 1 tesla,

    eBℏ/m_e ≈ 0.3 × 10⁻⁶ eV/fm × (200 MeV·fm)/(0.5 MeV) ≈ 10⁻⁴ eV    (455)

Room temperature, 300 K, is roughly kT ≈ 1/40 eV ≈ 0.025 eV. So μB/kT ≪ 1 is a pretty good approximation.
The resulting approximate relation between the magnetization, the magnetic field and the temperature was discovered by Pierre Curie and goes by the name of Curie's law. Empirically it holds quite well unless you have a really low temperature or a very high magnetic field. Most materials, however, can't withstand a magnetic field of more than about 10 tesla; the world's largest magnetic field is currently 25 tesla (at Florida State University's National High Magnetic Field Laboratory). So it is mostly at low temperature that this approximation fails badly.
Now let's try to understand Curie's law. First, write

    M ≈ Nμ²B/kT    (456)

How do we interpret this? We know that the maximum magnetization is Nμ, which happens when all the particles are aligned with the magnetic field. Without a magnetic field, there is no reason one direction should be preferred, so M = 0. When there is a nonzero magnetic field, there is a preference. The question is: on average, how many dipoles are aligned and how many are still random? Think of kT as the tendency to randomize, that is, to increase the entropy. This part of the system would prefer that things be random. But there is also the tendency of the system to seek the lowest possible energy state. That is achieved when all the particles are aligned with B, but in that case the entropy would be zero.
These two tendencies of the system, to maximize the entropy and to minimize the energy, compete. Therefore the net alignment must be a function of the ratio of the two energy scales involved: μB, the characteristic scale of the energy-minimization requirement, and kT, the characteristic scale of the entropy-maximization requirement.
This almost always happens in many-body systems: there is always competition between entropy maximization and energy minimization. Later, we'll learn that such systems seek to minimize the combination

    F = U − TS    (457)

which goes by the name of the Helmholtz free energy. But that's looking ahead.
Note that in Curie's law, M depends quadratically on the magnetic moment of the individual particles. Equivalently, since μ ∝ e/m,

    M ∝ e²    (458)

That is, the bigger the charge of the individual particles, the bigger the magnetization. How do we understand that? Recall the Lorentz force F = qv × B: the larger the charge, the larger the force on the individual particle. In turn, μ depends inversely on the mass of the particle, so

    M ∝ 1/m²    (459)

Therefore, if the underlying particles are protons instead of electrons, the magnetization will be smaller by a factor of

    (m_p/m_e)² ≈ 2000²    (460)

or about 4 million, assuming everything else is the same. Why is that? You can think of it in terms of inertia: the heavier the particle, the harder it is to move around.

18 Supplemental: Gosper's approximation of N!

Gosper came up with a better approximation than Stirling's:

    N! ≈ √((2N + 1/3)π) N^N e^{−N}    (461)

which is good even for 0!. This then gives

    Ω(N↑) = N!/(N↑!N↓!)
          = √[ (2N + 1/3) / (π (2N↑ + 1/3)(2N↓ + 1/3)) ] (N/N↑)^{N↑} (N/N↓)^{N↓}    (462)

In Matlab, this is coded as:
    function a = omega(m, n)
    if m < n
        a = NaN;
    else
        s = sqrt( (2*m + 1/3)/pi/(2*n + 1/3)/(2*(m-n) + 1/3) );
        t = (m/n)^n * (m/(m-n))^(m-n);
        a = s*t;
    end

which gives

    >> omega(100, 1)
    ans = 100.3994
    >> omega(100, 2)
    ans = 4.9565e+03
    >> omega(100, 3)
    ans = 1.6180e+05
    >> omega(100, 48)
    ans = 9.3207e+28
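For readers without Matlab, here is an equivalent Python sketch (my own port, with the exact binomial coefficient alongside for comparison):

```python
from math import comb, pi, sqrt

def omega_gosper(m, n):
    # Gosper-approximated binomial C(m, n), mirroring the Matlab omega(m, n)
    if m < n:
        return float("nan")
    s = sqrt((2 * m + 1/3) / pi / (2 * n + 1/3) / (2 * (m - n) + 1/3))
    t = (m / n) ** n * (m / (m - n)) ** (m - n)
    return s * t

for n in (1, 2, 3, 48):
    print(n, omega_gosper(100, n), comb(100, n))
```

Gosper's formula stays within a fraction of a percent of the exact binomial across the whole range, even at small n where plain Stirling would fail.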
You may want to compare these with the values in Table 3.2. The entropy is then

    S/k = ln Ω(N↑)
        = (1/2) ln[ (2N + 1/3) / (π (2N↑ + 1/3)(2N↓ + 1/3)) ]
          + N↑ ln(N/N↑) + (N − N↑) ln(N/(N − N↑))    (463)

Let's change variables to

    N↑ = N/2 + x    (464)
    N↓ = N/2 − x    (465)

Then

    S/k = (1/2) ln[ (2N + 1/3) / (π (N + 2x + 1/3)(N − 2x + 1/3)) ]
          + (N/2 + x) ln( N/(N/2 + x) ) + (N/2 − x) ln( N/(N/2 − x) )    (466)

which shows that the entropy is symmetric under x → −x.
Mechanical Equilibrium and Pressure

Now let's generalize our concept of equilibrium a little. So far we have considered equilibrium from the point of view of energy alone, but we know that mechanical work can be involved in reaching equilibrium. Blow up a balloon quickly, for instance.

    Figure 29: System that can exchange energy and volume (two chambers separated by a moving wall)

So how do two systems reach mechanical equilibrium, where there is no relative volume change and hence no work involved? Look at the picture; it is quite obvious how this happens. If the right-hand side has the bigger pressure, it pushes the wall to the left. This lowers the pressure on the right and raises the pressure on the left, and it continues until the pressures in the two systems are equal.
So, in this case, the pressure plays the role of the temperature. In the
energy case, when the temperature becomes the same, there is no more net
energy exchange and the systems are in thermal equilibrium. In the mechanical case, when the pressure becomes the same, there is no more volume
exchange and the systems are in mechanical equilibrium.
Now, the temperature was defined by the condition that the total entropy is at its maximum with respect to the energy of one system:

    (∂S_total/∂U_A)_{N,V} = 0    (467)

which, upon using S_total = S_A + S_B and U = U_A + U_B, turned into

    (∂S_A/∂U_A)_{N,V} = (∂S_B/∂U_B)_{N,V} = 1/T    (468)

In the case of mechanical equilibrium, the volume plays the role of the total energy, so that

    (∂S_total/∂V_A)_{N,U} = 0    (469)

which, upon using S_total = S_A + S_B and V = V_A + V_B, turns into

    (∂S_A/∂V_A)_{N,U} = (∂S_B/∂V_B)_{N,U}    (470)

How does the quantity

    (∂S/∂V)_{N,U}    (471)

behave?

behave?
First of all, what is its unit? Well, the entropy has units of k, or J/K. Therefore ∂S/∂V has units of

    [∂S/∂V] = J/(K·m³)    (472)

We know that a joule is a newton-meter, so

    [∂S/∂V] = J/(K·m³) = N·m/(K·m³) = N/(K·m²) = Pa/K    (473)

where Pa is the unit of pressure (force per area) called the pascal (1 newton per square meter).
So, apart from the temperature unit, it has the right units to be a pressure. So let's write

    (∂S/∂V)_{U,N} = P/T    (474)

Let's think about some extreme cases. Suppose the box on the left has a much higher

    (∂S/∂V)_{U,N}

than the one on the right. This means that a little change in its volume will produce a large change in its entropy, while a little change in volume does nothing much to the entropy of the box on the right. Now, we are trying to maximize the total entropy; therefore we should increase the volume of the box on the left. This increases the entropy of the left box a lot but doesn't decrease the entropy of the right box that much, so overall the entropy goes up. Therefore the box with the higher (∂S/∂V)_{U,N} expands until this quantity becomes the same in the two systems. This is exactly how high- and low-pressure systems should behave.
The question is: is this quantity really the pressure, i.e. force per unit area? Perhaps we are missing a factor of 2? Perhaps there is a dimensionless function of U/NkT in front?
Well, there is no easy answer to these questions here. For now, let's settle for the fact that the Sackur-Tetrode formula

    S = kN ln[ Const. (V/N)(U/N)^{3/2} ] + Const.    (475)

gives

    S = Nk ln V + ...    (476)

so that the above formula gives us back the ideal gas law:

    P V = NkT    (477)

This is not a proof that the above definition really gives the force per unit area, but it is a very good indication that we are on the right track.
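This can also be checked numerically: differentiating the Sackur-Tetrode entropy with respect to V at fixed U and N reproduces P = NkT/V. A Python sketch (my own; k = 1, and the constant inside the log is arbitrary since it drops out of the V-derivative):

```python
from math import log

def S(V, U, N, const=1.0):
    # Sackur-Tetrode shape: S/k = N [ ln(const (V/N)(U/N)^{3/2}) + 5/2 ]
    return N * (log(const * (V / N) * (U / N) ** 1.5) + 2.5)

N, U, V, h = 1000.0, 1500.0, 50.0, 1e-5
# (dS/dV)_{U,N} = P/T by a centered difference
P_over_T = (S(V + h, U, N) - S(V - h, U, N)) / (2 * h)
T = 2 * U / (3 * N)          # equipartition: U = (3/2) N k T with k = 1
P = P_over_T * T
print(P * V, N * T)          # the two agree: P V = N k T
```

Changing `const` leaves P unchanged, which is exactly the point: only the V-dependence of S matters for the pressure.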
The Thermodynamic Identity

In mathematics there is this relation: if f(x, y) is a function of x and y, we have

    df = (∂f/∂x)_y dx + (∂f/∂y)_x dy    (478)

or, in vector form,

    df = dx · ∇f    (479)

We have been talking about the temperature, given by

    1/T = (∂S/∂U)_{N,V}    (480)

and the pressure,

    P/T = (∂S/∂V)_{N,U}    (481)

If N is constant, then S is a function of U and V. Therefore

    dS = (∂S/∂U)_V dU + (∂S/∂V)_U dV = (1/T) dU + (P/T) dV    (482)

or

    T dS = dU + P dV    (483)

What does this mean? First of all, it means that the natural variables for the entropy function are the energy U and the volume V. We can also write

    dU = T dS − P dV    (484)

So the natural variables for the total energy are the entropy and the volume. This is called the thermodynamic identity. Note that it also means

    (∂U/∂S)_V = T    (485)
    (∂U/∂V)_S = −P    (486)

This sort of relationship is called a conjugate relationship: T is the conjugate variable for S w.r.t. the total energy U, and −P is the conjugate variable for V w.r.t. the total energy U.
Why is this relationship useful? Eq. (478) seems trivial, but it is not as trivial as it looks. A priori, there is no reason that the temperature and the pressure should be related to partial derivatives of the same function. Why is that a significant fact? Well, because there are many, many ways to connect two dots on a 2-D surface; you can draw any old curve you want that connects them. The good thing about a perfect differential like dU above is that it doesn't matter how the system got to the final state: the value of U is a function only of the values of S and V, not of how those values got to where they are. Think of a conservative potential energy. It behaves exactly the same way, because the force is the gradient of the potential. The same story here.
The same cannot, however, be said for terms like T dS. The change in this quantity does depend on the path taken from the initial state to the final state.
Entropy and Heat

Now remember the first law:

    dU = Q + W    (487)

This looks a lot like

    dU = T dS − P dV    (488)

So can we say

    T dS =? Q    (489)
    −P dV =? W    (490)

Well, these equations work if the change is slow (read: quasi-static) and nothing else changes but the volume. In that case the work is

    dW = −P dV    (491)

and that leaves

    Q = T dS    (492)

However, quasi-static processes are pretty special. More often than not, physical processes are fast enough that the quasi-static argument doesn't apply. For instance, consider Fig. 3.16. Suppose you push the piston really fast. In that case the gas molecules don't have time to rearrange themselves, which means that near the surface of the piston the gas is denser. That means the pressure exerted on the piston is larger than in the quasi-static case. Therefore the work you have to do is greater than −P dV, where P is the pressure of the quasi-static process. (Remember, since the volume is decreasing, dV < 0 and −P dV > 0.) Now we have

    dU = Q + W    (493)

with W > −P dV. So for the same change in energy as in the quasi-static case, we must have

    Q < T dS    (494)

compared to the quasi-static case, or

    dS > Q/T    (495)

The entropy increase in the quick-push case must surely be larger than in the quasi-static case. Therefore one cannot say that

    Q = T dS    (496)

in general.
You can also think about lifting a partition in a gas container. If the system is insulated, there is no energy exchange, so dU = 0. But the volume increases, so in this case P dV > 0. If the volume increase is small enough, the thermodynamic identity

    T dS = dU + P dV    (497)

still applies. Therefore

    dS = P dV / T > 0    (498)

Entropy still increases.


Diffusive Equilibrium and Chemical Potential

So far we have considered energy exchange and volume exchange, assuming the number of particles to be constant. But that's not always the case. Open a perfume bottle: suddenly the room smells nicer than before. This is called diffusion, and the equilibrium resulting from it is called diffusive equilibrium. In terms of the perfume: if you open the bottle for 2 seconds and then close the cap, there is initially a certain concentration of perfume molecules in the vicinity of the bottle, and a person at the far end of the room may not yet smell it. But given time, the perfume molecules disperse (or diffuse) by colliding with the air molecules. You can imagine that the diffusion ends when the density of perfume molecules is the same everywhere in the room. Absent other factors, such as another opened bottle of perfume, this is right: you have achieved diffusive equilibrium.
(Show movie.)
So how do we quantify this? Well, just like before: equilibrium means maximum entropy. And the entropy, by the Sackur-Tetrode formula,

    S/k = ln Ω = N [ ln( (V/N)(4πmU/(3Nh²))^{3/2} ) + 5/2 ]    (499)

is a function of U, V and N. So far we have thought about maximizing the entropy in terms of U and V. We can do that with N too.
If two systems are in contact so that particles as well as energy can be exchanged, the conditions for maximum entropy read

    (∂S_total/∂U_A)_{N_A,V_A} = 0    (500)
    (∂S_total/∂N_A)_{U_A,V_A} = 0    (501)

You should pause and think. The entropy appearing above is the total entropy, because that is what is being maximized, NOT the individual entropies S_A and S_B. One may go up and the other may go down; the important thing is that the increase outweighs the decrease, so that the net entropy is maximized.

    Figure 30: Number and energy can be exchanged between system A (U_A, N_A, S_A) and system B (U_B, N_B, S_B).

Also notice that the derivative is w.r.t. one of the variables, NOT the total energy or total number, which don't change.
Now, upon applying U = U_A + U_B and N = N_A + N_B, where U and N are the (constant) total energy and total number, the first equation turns into the condition

    T_A = T_B    (502)

The second equation turns into the condition

    (∂S_A/∂N_A)_{U_A,V_A} = (∂S_B/∂N_B)_{U_B,V_B}    (503)

just like the case of pressure (remember V = V_A + V_B) and temperature.


We define the quantity

    μ_A ≡ −T_A (∂S_A/∂N_A)_{U_A,V_A}    (504)

and call μ_A the chemical potential. Since in equilibrium the temperatures are the same, we must also have

    μ_A = μ_B    (505)

The minus sign is there for the following reason. If a particle leaves system A, the entropy decreases by

    (∂S_A/∂N_A)_{U_A,V_A}    (506)

If this particle enters system B, system B's entropy increases by

    (∂S_B/∂N_B)_{U_B,V_B}    (507)

If

    (∂S_A/∂N_A)_{U_A,V_A} < (∂S_B/∂N_B)_{U_B,V_B}    (508)

there will be a flow of particles from A to B. Intuitively, however, we want something to flow from high to low, like heat flowing from high T to low T. If we defined the chemical potential without the minus sign, it would mean that particles flow from low to high. That's why the minus sign is there.
Now that we have one more partial derivative, we can again write the differential of the entropy:

    dS = (∂S/∂U)_{N,V} dU + (∂S/∂V)_{N,U} dV + (∂S/∂N)_{U,V} dN
       = (1/T) dU + (P/T) dV − (μ/T) dN    (509)

This shows that the natural variables for the entropy function are U, V, N. Or

    dU = T dS − P dV + μ dN    (510)

which shows that the natural variables for the total energy are S, V, N. This actually gives other formulas for T, P and μ:

    T = (∂U/∂S)_{V,N}    (511)
    P = −(∂U/∂V)_{S,N}    (512)
    μ = (∂U/∂N)_{S,V}    (513)

Let's calculate the chemical potential for the ideal gas. The Sackur-Tetrode formula is

    S = Nk [ ln( (4πm/3h²)^{3/2} (V/N)(U/N)^{3/2} ) + 5/2 ]    (514)

Differentiating, we get

    μ = −T (∂S/∂N)_{U,V}
      = −kT [ ln( (4πm/3h²)^{3/2} (V/N)(U/N)^{3/2} ) + 5/2 + N (−5/(2N)) ]
      = −kT ln[ (4πm/3h²)^{3/2} (V/N)(U/N)^{3/2} ]    (515)

If I use U = (3/2)NkT,

    μ = −kT ln[ (4πm/3h²)^{3/2} (V/N)(3kT/2)^{3/2} ]
      = −kT ln[ (V/N)(2πmkT/h²)^{3/2} ]    (516)

Note that we can write this as

    TS = −μN + (5/2)NkT    (517)

or

    TS + μN = U + PV    (518)

Let's estimate how big this is. At room temperature, we know that

    kT ≈ (1/40) eV    (519)

One mole of gas occupies 22.4 litres at standard conditions. (That is at zero degrees Celsius, but it's good enough for an estimate.) One litre is 10 cm × 10 cm × 10 cm, or 10⁻³ m³. If we want to use our knowledge that

    ℏc = 200 eV·nm    (520)

we had better convert 22.4 litres into nm³:

    22.4 l = 22.4 × 10⁻³ m³ = 2.24 × 10⁻² (10⁹ nm)³ = 2.24 × 10²⁵ nm³    (521)

That yields

    N/V = 6 × 10²³ / (2.24 × 10²⁵ nm³) ≈ 1/(37 nm³)    (522)

or

    (N/V)(ℏc)³ ≈ (60 eV)³    (523)

If you are going to do anything related to chemistry, this is a handy number to remember: the density of 1 mole of gas at standard conditions is equivalent to approximately (60 eV)³ in these units.
Aside:

    P = NkT/V ≈ (1/40) eV × (60 eV)³/(ℏc)³ ≈ (8.6 eV)⁴/(ℏc)³    (524)

The other factor in the log is

    2πmkT/h² = mkT/(2πℏ²)    (525)

Air is mostly nitrogen, and an N₂ molecule contains 28 nucleons. One nucleon weighs about 1 GeV, so

    m ≈ 30 GeV    (526)

Hence

    kT m/(2π) ≈ (1/40) × 30 × 10⁹/(2π) eV² ≈ 10⁸ eV²    (527)

and

    √(kT m/(2π)) ≈ 10⁴ eV    (528)

(Remember: giga = 10⁹.) So the argument of the log is

    (V/N)(mkT/2πℏ²)^{3/2} = [ (mkT/2π)^{3/2}/(ℏc)³ ] / (N/V)
                          ≈ (10⁴ eV)³/(60 eV)³ ≈ 160³ ≈ 4 × 10⁶    (529)

So

    ln(4 × 10⁶) ≈ 14–15    (530)

and

    μ ≈ −(14/40) eV ≈ −(1/3) eV    (531)
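The same estimate can be redone in SI units. A Python sketch (my own, using standard CODATA-level constants) that evaluates eq. (516) for nitrogen at standard conditions:

```python
from math import log, pi

k  = 1.380649e-23       # Boltzmann constant, J/K
h  = 6.62607015e-34     # Planck constant, J s
u  = 1.66053907e-27     # atomic mass unit, kg
eV = 1.602176634e-19    # J per eV

T = 273.15                            # 0 C, where 1 mol occupies 22.4 L
m = 28 * u                            # N2 molecule mass
V_per_N = 22.4e-3 / 6.02214076e23     # volume per molecule, m^3

nQ = (2 * pi * m * k * T / h**2) ** 1.5   # (2 pi m k T / h^2)^{3/2}, 1/m^3
mu = -k * T * log(V_per_N * nQ)           # eq. (516)
print(mu / eV)   # about -0.36 eV, consistent with the -1/3 eV estimate
```

The full SI evaluation lands at about −0.36 eV, confirming the back-of-the-envelope −1/3 eV above.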
If there are several species of molecules in the gas, the thermodynamic identity generalizes to

    dU = T dS − P dV + Σᵢ μᵢ dNᵢ    (532)

Chemists define

    μ_chemistry ≡ −T (∂S/∂n)_{U,V}    (533)

where n is the number of moles. The conversion factor is just Avogadro's number.
When adding particles while keeping the entropy constant, U must go down.
Example: Einstein solid. Take 3 units of energy (q = 3) and 3 oscillators (N = 3). Then, in the repeated-combination notation of Section 13.4,

    Ω = N Hq = 3 H3 = C(3+3−1, 3) = C(5, 3) = 10    (534)

Add one more oscillator and choose q′ to keep the entropy constant:

    Ω = N+1 Hq′ = C(N+1+q′−1, q′)    (535)
      = 4 H2 = C(4+2−1, 2) = C(5, 2) = 10    (536)

So at constant entropy, U changed by −1 unit, or

    μ = (ΔU/ΔN)_S = −hf    (537)

In general, q′ is determined by

    N Hq = N+1 Hq′    (538)

Summary of Terms
Heat : Energy flow due to a temperature difference.
Work : Energy flow due to everything else.
Isothermal : Temperature kept constant. Heat can flow in and out; the heat flow compensates any work so that T stays fixed.
Adiabatic : No heat flows in or out. The temperature can change, since work can still be done on or by the system.
Quasistatic : The process of volume change is slow enough that the interior of the system is always in equilibrium. In this case, and only in this case,
W = -∫ P dV
Isentropic : Entropy kept constant. Adiabatic + Quasistatic.
Reversible : Process that leaves the total entropy unchanged. Must be slow. Same as Isentropic.
Irreversible : Total entropy has increased.

dS = Q/T = dU/T    (539)

works if the volume is constant and no other work is done.

dS = Q/T    (540)

is valid even if the volume changes, provided the process is quasistatic. More generally,

T dS = dU + P dV - μ dN    (541)

Isobaric : Pressure kept constant.

19 Schroeder Chapter 4 Engines and Refrigerators

20 Heat Engines

So what are these good for? Can we figure out something practical? In our everyday life, engines and refrigerators are everywhere. In short, remember that one of the definitions of the entropy was the negative index of available work. Also, remember that maximum entropy signifies the equilibrium condition. Therefore, the situation you want to create to get the maximum work out of is one that is maximally out of equilibrium. That is, in terms of temperature, you want to have as much difference as possible. Or you want to have as much pressure difference as possible, and so on. In this chapter, we formalize this intuitive reasoning.

Heat is an energy flow. If you create a big temperature difference, energy spontaneously flows from the high-temperature side to the low-temperature side. What you want to do is to siphon off some of that energy flow and use it to do some useful work, such as running your car or turning the electricity generator and so on.

Now, to make matters simple, consider two heat reservoirs. The term reservoir is often used in thermal physics and statmech. It refers to a very large system (ideally infinite) that is already in equilibrium. Since the system is so large, it does not matter if you siphon off some energy from it or add some energy to it: its temperature will not change. This sort of reservoir is called a thermal reservoir. A reservoir can also provide other quantities, such as the molecules themselves. For instance, if you put a highly concentrated small system onto a large but dilute system, eventually the density of the small system will become the same as the density of the dilute system. Yes, since some more material was added to the whole system, the overall density went up a little, but if the particle reservoir is big enough, this change is negligible.

Now, to think about engines and refrigerators, it is convenient to consider energy flow between two reservoirs, one hot and one cold. Here is a schematic diagram:
Due to the temperature difference, heat flows from the hot reservoir at temperature Th to the cold reservoir at Tc.

Figure 31: Schematic diagram of an engine. Heat Qh flows from the hot reservoir at Th into the engine, which delivers work W and dumps heat Qc into the cold reservoir at Tc.

The amount of heat that can flow per unit time (that is, the power) is what determines whether the engine is powerful or weak. But that depends on a lot of details. For now, let's think about some general things we can figure out.

First of all, let's think about how an engine might work.
(a) Take in some heat from the hot reservoir thru some process. For this to happen, the temperature of the engine of course has to be less than Th.
(b) Use the heat to do some work. In general, the temperature of the engine will now go down. But it should not go down lower than the temperature of the cold reservoir.
(c) Transfer the residual heat to the cold reservoir thru some process. Essentially, reverse process (a).
(d) Go back to step (a) by essentially reversing process (b). However, since the temperature is now lower, the work needed to accomplish this is smaller than the work output by (b).

The first thing we need to consider is energy conservation:

Qh = Qc + W    (542)

where Qh is the amount of heat that flows out of the hot reservoir and Qc is the amount of heat that actually enters the cold reservoir. The difference is the work that can be extracted.
We would like to convert Qh to W as much as possible. Ideally, all of it. But is that possible? Well, not really. Remember, the heat flows because that increases the overall entropy. When the hot reservoir loses energy, its entropy decreases. So unless the cold reservoir's entropy increases as much or more, heat does not flow. So enough heat must enter the cold reservoir that the total entropy is at least the same as before.

Let's define the efficiency of an engine as

e = W/Qh = (Qh - Qc)/Qh = 1 - Qc/Qh    (543)

What we have just argued is that this can never be 1. But can we say more than that? Yes we can.

Remember that heat and entropy are related by

dS ≥ Q/T    (544)

where the equality holds only if the process is quasistatic. To make matters simple, let's assume that the processes can be thought of as quasistatic. In that case, the entropy of the hot reservoir is decreased by

ΔSh = -Qh/Th    (545)

On the other hand, the entropy of the cold reservoir is increased by

ΔSc = Qc/Tc    (546)

Now the sum must be non-negative:

ΔSh + ΔSc ≥ 0    (547)

or

Qc/Tc ≥ Qh/Th    (548)

or

Qc/Qh ≥ Tc/Th    (549)

so

e = 1 - Qc/Qh ≤ 1 - Tc/Th    (550)

Remember that all temperatures here are in kelvin, measured from absolute zero. So suppose you have a hot reservoir that's at 300 °C and a cold reservoir that's at 20 °C. In that case, the efficiency cannot exceed

1 - (20 + 273)/(300 + 273) ≈ 0.49    (551)
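As a sanity check on (551), the bound is a one-liner (the function name is mine):

```python
def carnot_limit(t_hot_celsius, t_cold_celsius):
    """Maximum efficiency for reservoir temperatures given in Celsius."""
    Th = t_hot_celsius + 273.0
    Tc = t_cold_celsius + 273.0
    return 1 - Tc / Th

print(round(carnot_limit(300, 20), 2))  # 0.49
```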

So to make an engine run more efficiently, you need to have a hotter reservoir or a colder reservoir. But there is a limit to the coldness, and usually you want to use water or air as a coolant so that not much energy is spent cooling them to a really cold temperature. That means that usually you need a hotter reservoir to make the engine run more efficiently. From then on, the problem migrates from thermodynamics to materials science. You can create thousands of degrees; but if no material can contain and withstand that high a temperature, then there is not much point in making an engine using it.
Carnot Cycle
In the 19th century, Carnot thought up an ideal process for which the efficiency is the maximum achievable. That is,

e = 1 - Tc/Th    (552)

Now remember that we made an entropy argument to get the inequality part. So you can easily guess that some parts of the Carnot cycle must be isentropic processes.

So let's start with a cylinder, a piston and an ideal gas. The first step is for this engine to absorb some heat. Now, to absorb heat, the temperature of the engine must be less than the temperature of the reservoir. If the temperature of the engine is much less than the temperature of the reservoir, then transferring heat increases the overall entropy. This, you want to avoid. But then, if there is no temperature difference, how is it going to absorb heat? Well, if the temperature of the engine is very slightly less than the temperature of the hot reservoir, transferring heat is still possible, but it will generate only very slightly more entropy. In the limit of infinitesimal difference, the entropy generated is infinitesimal, so we can live with that. The problem with this is that the transfer of heat will take an infinite amount of time. That is, the power associated with this phase of the cycle is infinitesimal. But we are talking about an idealization, so let's forgive that.

So by taking in heat, the system expands from V0 to V1.

Figure 32: Isothermal expansion stage of the Carnot engine. The volume increases from V0 to V1 at T = Th while heat Qh flows in from the hot reservoir at Th.


Now the system has more energy than before. So we want to use some of this energy to do some work. Let's say that this is accomplished by letting the system expand further, adiabatically. Remember, adiabatic means that no heat comes or goes in or out of the system. Now, if this process is too quick, it will generate entropy, just as a quick push of the piston generates more entropy. So we want this to be an isentropic process: adiabatic and quasistatic. The system volume changes from V1 to V2 > V1.

What happens when a gas expands adiabatically? The quick answer is that it cools down. Why? Well, since there is no entropy change, the mechanical work that's done by the system spends the internal energy of the system. That is,

dU = -P dV    (553)

So by letting the volume grow (dV > 0) we let U go down. Since U ∝ NkT, this means that the temperature will have to go down. In this phase, we do the expansion up to when the system reaches a temperature slightly higher than that of the cold reservoir.

The reason we don't want the engine temperature to be much higher is again the same entropy argument: we don't want new entropy generated when the excess (useless) energy is drained out of the engine.

Figure 33: Adiabatic expansion stage of the Carnot engine. The volume increases from V1 to V2 while the temperature decreases from Th to Tc + ε.

This is the second stage. The third stage involves the cold reservoir. In this stage, we want to drain away the useless excess heat from the engine, but without generating entropy. Again, this can be done if the temperature of the reservoir is only just slightly lower than the engine temperature, but it will take forever. But then, we decided to forgive that in the spirit of idealism. As the heat is drained away from the system, the system gets cooler, from Tc + ε towards Tc. That means that the volume of the system goes down from V2 to V3. This volume V3 should be larger than the initial volume V0 by design.

Figure 34: Isothermal compression stage of the Carnot engine. The volume decreases from V2 to V3 at T = Tc + ε while heat Qc flows out into the cold reservoir at Tc.


To get back to the original state, we then need to compress isentropically. That process raises the temperature, because work is done on the system. Since dS = 0 by design,

dU = -P dV    (554)

So by decreasing the volume, you increase the internal energy, and in doing so raise the temperature. In this way, we get the system back to the original temperature and the original volume. The cycle then continues.

Figure 35: Adiabatic compression stage of the Carnot engine. The volume decreases from V3 to V0 while the temperature increases from Tc + ε to Th.


Note that the temperature difference between the reservoirs and the
carnot engine is infinitesimal. That means that the system can expand and
compress only infinitesimally. Therefore the work you get out of is also infinitesimal. To get more than infinitesimal amount of work, the temperature
of the engine has to be between Tc and Th . This way, the expansions can be
finite and finite amount of work can be obtained. But that means that you
are going to have to generate more entropy. So the efficiency of real engines
is of course less than the ideal Carnot engine.

155
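We can check e = 1 - Tc/Th numerically for a quasistatic ideal-gas Carnot cycle by computing the heat exchanged along each isotherm. A sketch in units where Nk = 1; the temperatures and volumes are arbitrary choices of mine:

```python
import math

# Quasistatic Carnot cycle for an ideal gas, in units where N*k = 1.
f = 3.0                  # degrees of freedom (monatomic gas)
Th, Tc = 500.0, 300.0    # reservoir temperatures, K
V0, V1 = 1.0, 2.0        # volumes bracketing the isothermal expansion

# Along an adiabat, T * V^(2/f) is constant, so V grows by (Th/Tc)^(f/2)
ratio = (Th / Tc) ** (f / 2)
V2, V3 = V1 * ratio, V0 * ratio

Qh = Th * math.log(V1 / V0)   # heat absorbed during isothermal expansion at Th
Qc = Tc * math.log(V2 / V3)   # heat expelled during isothermal compression at Tc

efficiency = 1 - Qc / Qh
print(round(efficiency, 3), round(1 - Tc / Th, 3))  # both 0.4
```

Note that V2/V3 = V1/V0, which is why the logarithms cancel and only the temperature ratio survives.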

21 Refrigerator

A refrigerator makes things colder than their environment. That is, it reduces the entropy of one part of the system at the expense of increasing the entropy of another part.

Figure 36: Schematic diagram of a refrigerator. External work W is used to pull heat Qc out of the cold reservoir at Tc and dump heat Qh into the hot reservoir at Th.
The idea is to pull heat out of the cold reservoir and dump it into the hot reservoir by means of external work. So the end goal is to have the colder temperature; to do so, we need to do work.

The simplest way to achieve a colder temperature is adiabatic expansion. You can easily experiment with this with any aerosol spray. If you put your finger in front of the can and spray, you will feel that the temperature of the liquid that's coming out of the nozzle is much colder than you would have expected. But be careful! You can actually be frostbitten this way.

Anyway, why are we able to cool things down if heat only flows from hotter temperatures to colder temperatures? Well, that's because energy can take many forms. Basically, you first want to make your refrigerator (the cooling part) colder than the cold reservoir. This can be done by adiabatic expansion. Then you bring it into contact with the cold reservoir. The cold reservoir then loses some heat to the refrigerator. You then want to break the contact with the cold reservoir and make the temperature of the refrigerator higher than the hot reservoir. This can be accomplished by adiabatic compression. You then bring it into contact with the hot reservoir and dump the excess heat into the hot reservoir. You can then again use adiabatic expansion to bring the temperature of the refrigerator down below the temperature of the cold reservoir. The process then continues. The work part is in the adiabatic compression and expansion stages. We, or some other means, must supply that work.

The most efficient refrigerator, again, is the one that does not waste any energy generating entropy. The efficiency of a refrigerator is defined by the amount of heat you extract from the cold reservoir vs. the amount of work you have to do to get it:

COP = Qc/W    (555)

Here COP means coefficient of performance. Energy conservation tells us

Qc = Qh - W    (556)

so

COP = Qc/(Qh - Qc) = 1/(Qh/Qc - 1)    (557)

In this process, the entropy of the cold reservoir went down by

ΔSc = -Qc/Tc    (558)

The entropy of the hot reservoir went up by

ΔSh = Qh/Th    (559)

The sum must be non-negative:

ΔSc + ΔSh ≥ 0    (560)

or

Qh/Th ≥ Qc/Tc    (561)

This means that

COP = Qc/(Qh - Qc) = 1/(Qh/Qc - 1) ≤ 1/(Th/Tc - 1)    (562)

or

COP ≤ Tc/(Th - Tc)    (563)

Again, the equality holds only if no entropy is generated by the process of extracting heat from the cold reservoir.

Note that, unlike the efficiency, the COP can easily become larger than 1. To make it larger, all one has to do is to have very similar Th and Tc. But if Th is too close to Tc, that's not much of a refrigerator. To get to really low temperatures, Th - Tc must be relatively large. But that also means that the temperature drop per unit of work you put in becomes smaller. Well, there is no free lunch.
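The bound (563) is just as easy to evaluate; for example, for temperatures roughly like a kitchen refrigerator (the numbers and the function name are illustrative):

```python
def cop_limit(Th, Tc):
    """Ideal-refrigerator bound COP <= Tc/(Th - Tc), temperatures in kelvin."""
    return Tc / (Th - Tc)

# A kitchen at 295 K with the inside of the fridge at 275 K:
print(cop_limit(295.0, 275.0))  # 13.75, comfortably larger than 1
```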

22 Real Heat Engines

The above discussion is all very fancy, but it lacks practicality. There is a joke that a good enough physicist can make even hell comfortable, because there has to be a temperature gradient in hell. If there is a temperature gradient, he or she can use it to make a heat engine, as well as a refrigerator, to make his or her corner of hell more comfortable. So knowing how to build practical machines may help you a bit, not only in your next life, wherever it may be, but also in this life.

Our most direct experience with engines has to be the internal combustion engine. In other words, the one in your car, motorcycle, lawn mower, chain saw, ... The modern world cannot function without them, even if they spew out some unsavory molecules into the environment.

In general, gasoline engines are divided into 2 classes: two stroke engines and 4 stroke engines. Car engines are all 4 stroke engines. This kind of engine needs gravity for lubricant flow and valves, so they are not very portable.

Figure 37: PV diagram of the Carnot cycle: the Th isotherm and the Tc isotherm, joined by two adiabats.

On the other hand, 2 stroke engines don't have valves. So they tend to be more portable and easily built. Motorcycle engines, chain saw engines, etc. are therefore mostly 2 stroke engines.

To do anything at all, you need to put some energy into the flywheel to start the whole cycle. That's where your battery and the ignition motor come in. So you turn the ignition key and put some energy into the flywheel. It turns. In olden days, one had to do this with a hand-crank.

The real cycle begins with the intake of fuel. This happens, of course, during a down stroke of the piston. When the piston reaches the lowest position, the intake valve is closed and the piston starts to compress the air-gasoline mixture. This process is quick enough to be adiabatic. It is of course not quasistatic, so entropy is generated.

When the piston reaches the highest position, the spark plug sparks

Figure 38: Intake of fuel

Figure 39: Compression


and ignites the air-gasoline mixture. The resulting explosion creates a very hot gas, which naturally would like to expand. This then pushes the piston downward, transferring more energy to the flywheel than it had before. Some of this energy is then transferred to your car's wheels so that the whole thing goes.

In the meantime, the expansion is fast enough to be adiabatic, so

Figure 40: Ignition

Figure 41: Expansion


that the gas inside the engine cools, and it is now useless for turning the wheel. So in the next up-stroke, the exhaust valve is opened and the spent gas is pushed out. When the piston reaches the highest point, the exhaust valve closes, the fuel valve opens, and intake begins on the next down stroke.

So there is one power stage during 4 up and down strokes. Hence the

Figure 42: Exhaust


name. In olden days, all the timings were managed by mechanical means thru belts, chains and gears. These days, of course, they are all managed by on-board computers (multiple: there is more than a single processor in today's cars).

There is an interesting variation on the same theme. Some cars made in the 70s had what's called a rotary engine. In effect, you have a triangular-shaped piston in a cocoon-shaped cylinder. The motions are, however, not linear; they are circular. Also, all 4 stages of the 4 stroke cycle happen at the same time. Furthermore, there are 3 ignitions per revolution. This is supposed to create less waste due to friction, more power, etc. It has not been very popular so far, but it shows that there is more than one way to skin a cat.

Figure 43: PV diagram of the Otto cycle: adiabatic compression (1→2), constant-volume ignition (2→3), adiabatic power stroke (3→4) and constant-volume exhaust (4→1).


Let's see if we can calculate the efficiency of the Otto cycle. The first stage is the compression of the air-fuel mixture. Since the stroke is fairly fast, we can take this to be an adiabatic process. Adiabatic means no heat exchange. Therefore

dU = Q + W = W = -P dV    (564)

Now, since this is the compression phase, some external agent such as the flywheel has to supply the work. If the temperature changed from T1 to T2 during this phase, then the work done on the engine is

W12 = ΔU = (fNk/2)(T2 - T1)    (565)

Note that the work is indeed done on the engine, since T2 > T1. The next phase is ignition. In this case, there is no expansion nor compression.

Figure 44: Rotary Engine


Only heat transfer. Therefore

ΔU = Q    (566)

and the amount of heat transferred to the engine is

Qh = ΔU = (fNk/2)(T3 - T2)    (567)

The next phase is expansion. Again, this is fairly adiabatic. So the work done on the engine is

W34 = ΔU = (fNk/2)(T4 - T3)    (568)

which is negative: here the engine does work. The next phase is exhaust. No work is done here, only heat transfer. The amount of heat extracted from the engine is

Qc = -ΔU = (fNk/2)(T4 - T1)    (569)

Therefore the total amount of work done by the engine is

W = -(W12 + W34) = (fNk/2)(T1 + T3 - T2 - T4)    (570)

which is the same as

Qh - Qc = (fNk/2)(T3 - T2 - T4 + T1)    (571)

as it should be. The efficiency is then

e = (Qh - Qc)/Qh = 1 - Qc/Qh = 1 - (T4 - T1)/(T3 - T2)    (572)

To go further, we need to know how these temperatures are related. Among the 4 temperatures, T1 and T3 are given. The equations to use are (i) energy conservation, (ii) equipartition and (iii) the ideal gas law.

First, energy conservation says

dU = Q + W    (573)

During the 1→2 phase, there is no heat transfer. Therefore work equals the energy change. If we can assume a quasistatic process,

dU = W = -P dV    (574)

Now suppose that the gas mixture is dilute enough that we can use the ideal gas equation of state

PV = NkT    (575)

which means

P = NkT/V    (576)

Now we know that the equipartition theorem says

U = (f/2) NkT    (577)

where f is the number of degrees of freedom. So

dU = (fNk/2) dT    (578)

Combining, we get

(fNk/2) dT = -P dV = -(NkT/V) dV    (579)

or

(f/2) dT/T = -dV/V    (580)

Now we know that

dx/x = d ln x    (581)

so

(f/2) ∫_{T1}^{T2} dT/T = -∫_{V1}^{V2} dV/V    (582)

or

(f/2) ln(T2/T1) = -ln(V2/V1)    (583)

or

V2 = V1 (T1/T2)^{f/2}    (584)

Remember, this is compression, so V2 < V1. That means

T2 = T1 (V1/V2)^{2/f}    (585)

Now, the expansion phase 3→4 is again adiabatic. All the above relationships apply with the appropriate changes:

T4 = T3 (V3/V4)^{2/f} = T3 (V2/V1)^{2/f}    (586)

since V3 = V2 and V4 = V1. This means that

T2/T1 = T3/T4    (587)

and the efficiency becomes

e = 1 - (T4 - T1)/(T3 - T2)
  = 1 - (T3 T1/T2 - T1)/(T3 - T2)
  = 1 - T1 (T3 - T2)/(T2 (T3 - T2))
  = 1 - T1/T2
  = 1 - (V2/V1)^{2/f}    (588)

Remember, however, that we used the quasistatic condition. This is, of course, not strictly true.
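The final result depends only on the compression ratio V1/V2. A quick sketch (the function name and the example compression ratio of 8 are my choices; f = 5 for a diatomic gas):

```python
def otto_efficiency(compression_ratio, f=5):
    """Ideal Otto-cycle efficiency e = 1 - (V2/V1)^(2/f).
    compression_ratio is V1/V2; f = 5 for a diatomic gas."""
    return 1 - compression_ratio ** (-2 / f)

# A typical gasoline-engine compression ratio of about 8:
print(round(otto_efficiency(8), 2))  # 0.56
```

Real engines fall well short of this, both because the strokes are not quasistatic and because of friction and heat loss to the cylinder walls.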

Steam engine
The Industrial Revolution started with the invention of the steam engine. Without it, factories could not have run as they did, and modern civilization as we know it may have been quite different. But it was invented, and here we are. Today, a version of the steam engine is still used in power plants.

Modern power plants work basically this way: you have a really hot reservoir, be it from burning coal or from nuclear reactions. Use that to boil water at high temperature and high pressure. The steam is then channelled onto turbines, which turn and generate electricity. The steam is then collected at the condenser, cooled, and becomes water again. It is then pumped back into the boiler and the cycle continues.

This cycle is called the Rankine cycle. The crucial element in the cycle is the fact that we are not only dealing with gas here but also with liquid. That is, we are actually using the fact that water boils and then condenses back into water.

Among the 4 components of this cycle, the roles of the boiler, turbine and condenser are pretty obvious. But what is the role of the pump? If you look at the diagram, it completes the cycle by making high-pressure water. But why is this necessary? Why can't we just go directly from 1 to 3?

Early thermodynamic developments were centered around improving the performance of contemporary steam engines. It was desirable to construct a cycle that was as close to being reversible as possible and would better lend itself to the characteristics of steam and process control than the Carnot cycle did. Towards this end, the Rankine cycle was developed. Its main feature is that it confines the isentropic compression process to the liquid phase only (points 1 to 2). This minimizes the amount of work required to attain operating pressures and avoids the mechanical problems associated with pumping a two-phase mixture. The compression between points 1 and 2 is greatly exaggerated in diagrams; in reality, a temperature rise of only about 1 °F occurs in compressing water from 14.7 psig at a saturation temperature of 212 °F to 1000 psig. (The constant-pressure lines converge rapidly in the subcooled or compressed liquid region, and it is difficult to distinguish them from the saturated liquid line without artificially expanding them away from it.) On a T-s diagram, the available and unavailable energy of a Rankine cycle, like that of a Carnot cycle, is represented by the areas under the curves. The larger the unavailable energy, the less efficient the cycle.


The definition of efficiency says

e = 1 - Qc/Qh    (589)

How do we calculate Qc and Qh? Remember that Qc and Qh are involved in boiling or condensing water. Ideally, the boiling and the condensing each happen at a fixed temperature and pressure.

Now let's look at energy conservation again:

dU = Q + W    (590)

Assuming that the quasistatic condition is maintained, we can say

W = -P dV    (591)

so that

dU = Q - P dV    (592)

Now, if P remains constant, we can say

d(U + PV) = Q    (593)

That is, the heat is equal to the change in the enthalpy

H = U + PV    (594)

In the boiling phase,

Qh = H3 - H2    (595)

In the condensing phase,

Qc = H4 - H1    (596)

so

e = 1 - Qc/Qh = 1 - (H4 - H1)/(H3 - H2) ≈ 1 - (H4 - H1)/(H3 - H1)    (597)

The last approximation,

H1 = U1 + P1 V1 ≈ H2 = U2 + P2 V2    (598)

is a good approximation in the following sense. Compared to the gas, the volume of a liquid is very small, so the PV term in the enthalpy is really negligible. Also, the pump does not add much energy to the molecules.

So let's take a look at the table and figure out the efficiency. Take point 1. This could be water or saturated water. We can look that up in the table. Take point 3. It's in the steam phase, so we can look it up. Point 4 is tricky: it's in the water+steam phase. How do we find it? We go on this way. First, we may assume that the expansion phase (turbine) of the cycle is adiabatic. Now, if we can also assume that the process is quasistatic, then the whole thing is isentropic. So between 3 and 4, the entropy should not change. So we look up the table for the line that has the specified low pressure and see what combination of the water and steam entropies corresponds to this entropy. For this to work, one needs to know the pressures at 3 and 4, which are the same as the pressures at 2 and 1 respectively.
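The table-lookup procedure just described can be sketched in a few lines. The enthalpy and entropy numbers below are made-up stand-ins for real steam-table values (in kJ/kg and kJ/kg/K), and the function name is mine; only the structure of the calculation is the point:

```python
def rankine_efficiency(h1, h3, s3, sw, sg, hw, hg):
    """e ~ 1 - (H4 - H1)/(H3 - H1), with point 4 (wet steam leaving the
    turbine) found by matching its entropy to s3, since the turbine is
    taken to be isentropic. sw/sg and hw/hg are the saturated-water and
    saturated-steam entropies and enthalpies at the condenser pressure."""
    x = (s3 - sw) / (sg - sw)   # quality: fraction of steam in the mixture
    h4 = hw + x * (hg - hw)     # enthalpy of the water+steam mixture at point 4
    return 1 - (h4 - h1) / (h3 - h1)

# Illustrative numbers only, standing in for table lookups:
e = rankine_efficiency(h1=190, h3=3320, s3=6.8, sw=0.65, sg=8.15, hw=190, hg=2590)
print(round(e, 2))  # 0.37
```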

Figure 45: Rankine cycle diagram. The pump (work Win) feeds high-pressure water into the boiler, which takes heat Qh from the hot reservoir; the steam drives the turbine (work Wout) and is then condensed, dumping heat Qc into the cold reservoir.

Figure 46: PV diagram of the Rankine cycle: pump (water), boiler, turbine and condenser (water + steam).

Real Refrigerator
Real refrigerators, the kind you find in your kitchen, operate on more or less the same principle as the steam engine: the reverse Rankine cycle. That means it involves a substance that turns into liquid and then gas within easily attainable temperatures and pressures. Most of you know that the most common substance that used to be used in commercial refrigerators, CFC, is no longer in use because it destroys the ozone layer.

Let's see if we can understand this diagram. First, a gas is compressed adiabatically. This raises the temperature of the gas as well as the pressure. The temperature must be higher than the hot-reservoir temperature. The gas is then sent to the condenser. The condenser keeps two things constant. One, the temperature: this is accomplished by having contact with the hot reservoir. Two, the pressure: this has to be done thru (unspecified) mechanical devices. The hot reservoir is not really that hot: its temperature is lower than the gas-liquid transition temperature of the substance at that pressure. So the gas condenses to the liquid phase. But the crucial thing is that the pressure has to be kept constant.

The high-pressure liquid is then sent to the throttle. What the throttle does is expand the high-pressure liquid quickly. This process lowers the pressure as well as the temperature. At this stage, the temperature is now lower than the temperature of the cold reservoir.

The cold liquid-gas mixture is sent to the evaporator, where it evaporates into gas by extracting heat from the cold reservoir. The temperature of the cold reservoir must be higher than the liquid-gas transition temperature at the given pressure. The gas is then sent back to the compressor and the cycle continues.

Note that the actual cooling occurs thru the throttling process. But that is only possible if we can make a high-pressure liquid. So the real work is done by the compressor, which builds up the necessary pressure.

In the case of the steam engine, we asked: why is there a pump? In this case, an analogous question is: why is there a condenser? Can't we directly use the hot gas from 2 in the throttle process?

Again, this is a matter of efficiency. The answer is yes, we could, but it won't be very efficient, since the gas is hot. It is much better if we can dissipate some of that heat into the atmosphere (or any other coolant; look at the back of your refrigerator) while keeping the pressure high.

Figure 47: Reverse Rankine cycle diagram. The compressor sends hot, high-pressure gas to the condenser, which dumps heat Qh into the hot reservoir; the throttle lowers the pressure and temperature, and the evaporator extracts heat Qc from the cold reservoir.

Figure 48: PV diagram of the reverse Rankine cycle: compressor (gas), condenser (liquid), throttle and evaporator (liquid + gas).

The coefficient of performance is

COP = Qc/(Qh - Qc) = (H1 - H4)/(H2 - H3 - H1 + H4)    (599)

Again, enthalpy is used because the heat is drawn from and dumped into the reservoirs at constant pressure.
Throttling
In the steam engine, there was the pump that at first seemed unnecessary. In the refrigerator, we have the throttle. What the throttle does is lower the pressure and the temperature of the liquid so that it can extract heat out of the cold reservoir. So it is an integral part of the refrigerator.

Let's consider this a little more. Since the process is quick, it is adiabatic (but we take it to be slow enough to be quasistatic). So there is no heat transfer.

To keep things simple, suppose we have mechanical devices that keep the pressures on the two sides of the throttle constant. Now the first law says

ΔU = Q + W    (600)

There is no heat transfer. So the only thing that matters is the work:

ΔU = W    (601)

So the energy change must be equal to the work done on the system. On the left-hand side, the piston does work on the system; in total, it inputs Wi = Pi Vi. On the right-hand side, the gas pushes against the piston, so there is work done by the system; in total, this amount is

Wf = Pf Vf    (602)

Therefore

Uf - Ui = Wi - Wf = Pi Vi - Pf Vf    (603)

Written another way, this means

Ui + Pi Vi = Uf + Pf Vf    (604)

Figure 49: The throttle process. Gas at pressure Pi is pushed through the throttle: the initial volume Vi shrinks to zero, while the volume on the low-pressure side Pf grows to Vf, which is much larger than Vi.
or

Hi = Hf    (605)

That is, during this process the enthalpy remains the same.

With a pure ideal gas, then, this process does nothing, since

H = U + PV = (fNk/2) T + NkT = ((f+2)/2) NkT    (606)

and constant H means constant T. No change in temperature! The whole reason for the throttle is to cool the liquid, so for an ideal gas it is useless.

The reason this works in a real system is that the energy of a molecule in a liquid phase consists of potential as well as kinetic energy, and in a dense liquid phase the potential energy is not negligible compared to the kinetic energy.

As the liquid goes thru the throttle, the distance between the molecules becomes larger, which means that kinetic energy has to be converted into potential energy. To hold the liquid together, the potential energy must correspond to an attractive force. But that means that the energy associated with it is negative; remember, if something is in a bound state, the total energy is less than it would be with the potential energy at zero. Therefore energy conservation demands that when these bonds are stretched and broken, the average kinetic energy must go down. So the temperature goes down.

Now the coefficient of performance can be written

COP = (H1 - H3)/(H2 - H1)    (607)
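Since the throttle conserves enthalpy (H4 = H3), the COP of (599) reduces to the form above. A sketch with made-up refrigerant enthalpies standing in for real table values:

```python
def fridge_cop(h1, h2, h3):
    """COP = (H1 - H3)/(H2 - H1), using H4 = H3 across the throttle."""
    return (h1 - h3) / (h2 - h1)

# Illustrative refrigerant enthalpies in kJ/kg (not from a real table):
# h1: gas leaving the evaporator, h2: gas after the compressor,
# h3: liquid leaving the condenser.
print(fridge_cop(h1=400, h2=440, h3=250))  # 3.75
```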

Liquefaction of Gases

Figure 50: Schematic diagram of a gas liquefier. The compressor raises the pressure (and with it the temperature) of the gas; a cooler then removes the heat of compression while keeping the pressure constant; a heat exchanger pre-cools the incoming gas with the returning cold stream (a negative-feedback mechanism for the temperature); and the throttle produces the liquid.

For air this process suffices. For helium or hydrogen, even this won't work that well. In these small inert molecules the attraction between molecules is very weak, while the hard-core collisions keep the repulsion sizable. Throttling lowers the collision rate, and lowering the collision rate converts potential energy into kinetic energy, so the temperature goes up. For throttling to cool, the attraction must be larger than the repulsion. This happens for helium and for hydrogen only at already pretty low temperatures: for hydrogen the maximum such temperature is 204 K, and for helium, 43 K.
In other words, these gases are too much like an ideal gas. One can see that from figure 4.12. At temperatures above 200 K (−73 °C), the enthalpy is a function only of temperature (it doesn't depend on pressure: the curves are flat lines), just like we argued that

H = ((f+2)/2) N k T    (608)

for an ideal gas.


Since throttling follows a constant-H curve, it lowers the temperature only if the slope of the constant-enthalpy curve in the PT graph is positive; that is, the lower the pressure, the lower the temperature. At some point along the curve, then, the slope has to change sign. This point is called the inversion point, and the curve that joins these points at different enthalpies is called the inversion curve.
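The maximum inversion temperature can be estimated from the van der Waals model, where it comes out as T_inv ≈ 2a/(Rb). Here is a sketch of that estimate in Python; the van der Waals constants a and b are standard textbook values and are assumptions of this illustration, not taken from these notes:

```python
# Rough estimate of the maximum inversion temperature from the van der Waals
# model, T_inv ~ 2a/(R b). The a, b values are standard textbook constants
# (an assumption of this sketch).
R = 8.314  # J/(mol K)

vdw = {            # a in Pa m^6/mol^2, b in m^3/mol
    "N2": (0.1370, 3.87e-5),
    "H2": (0.02476, 2.661e-5),
    "He": (0.00346, 2.38e-5),
}

T_inv = {gas: 2 * a / (R * b) for gas, (a, b) in vdw.items()}
for gas, T in T_inv.items():
    print(gas, round(T), "K")
```

The estimate gives roughly 220 K for hydrogen and 35 K for helium, the right ballpark for the quoted 204 K and 43 K, while nitrogen comes out far higher, which is why plain throttling suffices for air.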
Really, really cold temperatures:

Helium dilution refrigerator

Paramagnetic cooling: sudden lowering of the magnetic field

Laser cooling: Doppler shift of resonance; BE condensates were achieved this way

23 Schroeder Chapter 5 Free energy and chemical thermodynamics

One place where the consideration of thermodynamic potentials is essential is chemistry. Look around you. Physicists may figure out how to make cool devices, but chemists and chemical engineers are the ones who figure out how to make them in a usable way. The questions to ask are: How much energy do I need to make a certain reaction happen? What kind of environment do I need to maintain to get the maximum yield?

All these considerations are the realm of thermodynamics. Chemical thermodynamics, that is. What's so special about chemistry? Well, the most important fact is that you end up with something completely different from what you started with. That's chemistry.

You can then see right away that we are going to have to talk about the change in the number of particles or molecules, not only the energy, volume, pressure and temperature.

Let's start with a few definitions first.

24 Free Energy

We already defined the enthalpy

H ≡ U + PV    (609)

This is the amount of energy you need to create a system that has total energy U and volume V in a constant-pressure environment. The extra PV term is there because you need to push the environment away to make room for the new system. So not only do you need to supply the energy that ends up in the created system, you also have to supply the energy needed to make room for it.
Now consider another situation. Suppose the new system is made in a volume already cut out for it, so that there is no need to push against the environment. Further suppose that the environment maintains a constant temperature T. In that case there is no PV term to add to U. Moreover, a certain amount of the energy can come from the environment as heat. Therefore the amount of energy you have to supply is only

F ≡ U − TS    (610)

where TS is the heat entering from the environment. Remember that heat is just another name for energy that enters or leaves due to a temperature difference. F is called the Helmholtz Free Energy.

On the other hand, if you annihilate the system, the recoverable energy is only F = U − TS, since you have to dump some entropy, and with it the energy TS, into the environment.

F is the useful quantity when the volume V is fixed.

We can also consider an environment where P and T are both constant, say anything happening in the atmosphere. In this case both of the above considerations apply, and the amount of energy we have to supply is

G ≡ U + PV − TS    (611)

This is called the Gibbs Free Energy, and in chemistry it is the most useful quantity.
We can also think about the combination

Φ ≡ U − TS − μN    (612)

This is called the grand free energy and is useful in an environment where T and μ are maintained constant.
Most of the time, we don't create the whole system out of nothing. That would require an enormous amount of energy. For instance, if you wanted to create 1 mole of hydrogen molecules out of nothing, the rest mass energy alone would cost

mc² = (2 × 10⁻³ kg)(3.0 × 10⁸ m/s)² ≈ 2 × 10¹⁴ J    (613)

That is tens of kilotons of TNT, the energy release of a large nuclear bomb. That's too much. So in most circumstances we supply only the difference between the stuff we started with and the stuff we want to end up with.
So consider a constant-temperature environment. Then you want to consider the difference in the Helmholtz free energy F = U − TS, or

ΔF = ΔU − TΔS    (614)

Now we know that the change in the energy is

ΔU = W + Q    (615)

so that

ΔF = W + Q − TΔS    (616)

If no new overall entropy is generated, then Q = TΔS. Otherwise TΔS > Q; that is, the entropy increase of the system is larger than the thermal energy transferred from the environment divided by T. Therefore

ΔF = W − (TΔS − Q) ≤ W    (617)

Remember that when writing

ΔU = W + Q    (618)

W is the amount of work done on the system. Therefore

ΔF ≤ W    (619)

means that the increase in the Helmholtz free energy is at most the work done on the system. Now, the equilibrium state is characterized by W = 0 and Q = 0. Therefore equilibrium is achieved when F becomes minimum.
If the environment is at constant temperature and pressure, we need to think about the Gibbs free energy

G = U + PV − TS    (620)

Since P and T are constants,

ΔG = ΔU + PΔV − TΔS    (621)

Again using ΔU = W + Q,

ΔG = W + Q + PΔV − TΔS    (622)

As before,

TΔS − Q ≥ 0    (623)

Now remember that W is the work done on the system. So if the volume expands, W is negative, because in this case the system has done the work. So we can write

W + PΔV = Wother    (624)

where Wother is the amount of work done on the system that is not the P dV work. So

ΔG = Wother − (TΔS − Q) ≤ Wother    (625)
Since ΔG is such a useful quantity, it has been measured and tabulated for a large number of reactions. One can also calculate it as

ΔG = ΔH − TΔS    (626)

from the tabulated enthalpies and entropies of the final and initial states. Let's consider a few examples.
Electrolysis, Fuel Cells and Batteries

Consider the chemical reaction

H2O → H2 + (1/2) O2    (627)

This is electrolysis: you send a current through water, and oxygen comes out of the positive electrode and hydrogen out of the negative electrode. The enthalpy difference ΔH of this reaction is listed in the table as

Substance (form) | ΔfH (kJ) | ΔfG (kJ) | S (J/K) | CP (J/K) | V (cm³)
H2O (l)          | −285.83  | −237.13  | 69.91   | 75.29    | 18.068

The f here indicates that this is the difference measured from the most stable form of the ingredients, which in our case is just H2 and O2. What is the actual work one has to supply for this to happen? Since this is happening at fixed T and P, write

ΔG ≤ Wother    (628)

so the minimum work we need to supply is minus ΔfG, or

ΔG = +237 kJ    (629)

The difference between this and the enthalpy change is the heat, since

ΔG = ΔH − TΔS    (630)

This is

TΔS = ΔH − ΔG = 286 − 237 = 49 kJ    (631)

Does this make sense? Well, T = 298 K. To calculate ΔS, we need to know the entropies of water and of the gases:

S_H2O = 70 J/K    (632)
S_H2 = 131 J/K    (633)
(1/2) S_O2 = 103 J/K    (634)

so

ΔS = 131 + 103 − 70 = 164 J/K    (635)

and

TΔS = 298 K × 164 J/K ≈ 49 kJ    (636)

So this is the amount of heat that enters from the environment. The amount of energy that remains in the system is

ΔU = ΔH − PΔV    (637)

Now, one mole of a gas occupies 22.4 litres at 0 °C. That means it occupies

22.4 × (298/273) = 24.5 litres    (638)

at 25 °C. One and a half moles of gas are generated, so they occupy about 37 litres. One atm is about 10⁵ pascal. So

PΔV = 37 × 10⁻³ m³ × 10⁵ Pa = 3.7 × 10³ J ≈ 4 kJ    (639)

This is the amount of work the system has to do against the atmosphere. So

ΔU = 286 − 4 = 282 kJ    (640)
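The bookkeeping above can be collected in a few lines; the inputs are the tabulated ΔH and ΔG for one mole of water quoted in the text:

```python
# Energy bookkeeping for the electrolysis H2O -> H2 + 1/2 O2 at 298 K, 1 atm,
# using the tabulated values quoted in the text.
R, T = 8.314, 298.0
dH = 285.8e3   # J, enthalpy cost per mole of water
dG = 237.1e3   # J, minimum electrical work (minus the Delta_f G table entry)

TdS = dH - dG            # heat absorbed from the environment (~49 kJ)
n_gas = 1.5              # moles of gas produced: 1 of H2 plus 1/2 of O2
PdV = n_gas * R * T      # work done pushing back the atmosphere (~3.7 kJ)
dU = dH - PdV            # energy that remains in the system (~282 kJ)
eta_fuel_cell = dG / dH  # ideal efficiency of the reverse (fuel-cell) reaction
```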

The reverse process is the fuel cell. Inside a fuel cell, the following happens:

H2 + (1/2) O2 → H2O    (641)

Remember that ΔG is the amount of work we had to supply (via a battery) to dissociate water into hydrogen and oxygen. The same amount of electrical work is what we can get from this reverse process. If we simply burn the hydrogen and the oxygen, the amount of heat we can get out is the difference in the enthalpy, 286 kJ. Of this, we must waste TΔS = 49 kJ as heat and can convert the rest into electricity. So the efficiency of an ideal fuel cell is 237/286 ≈ 83 %.¹
Similar things happen in a battery; the only difference is that a battery holds a finite amount of fuel. In a car battery the reaction is

Pb + PbO2 + 4H⁺ + 2SO4²⁻ → 2PbSO4 + 2H2O    (645)

The table says that

ΔG = G_final − G_initial = −394 kJ/mol    (646)

and

ΔH = −316 kJ/mol    (647)

So the energy that comes out of the battery, −ΔG, is larger than the energy change between the substances, −ΔH. What's going on here? Note that

G = H − TS    (648)

so that, if T is constant,

ΔG = ΔH − TΔS    (649)

The extra energy that you get out is actually supplied by the environment as heat.

¹ 1.013 bar corresponds to a 10-meter column of water on 1 square meter    (642)

V = 10 m × 1 m² = 10 m³    (643)

We know that 10⁻³ m³ of water is 1 kg, so this is 10⁴ kg. The force is then

F = mg ≈ 10⁵ N    (644)

so 1 bar is about 10⁵ Pa.
To figure out the voltage, we need to know a bit more chemistry. The reaction takes place in three steps:

in solution:     2SO4²⁻ + 2H⁺ → 2HSO4⁻
at − electrode:  Pb + HSO4⁻ → PbSO4 + H⁺ + 2e⁻
at + electrode:  PbO2 + HSO4⁻ + 3H⁺ + 2e⁻ → PbSO4 + 2H2O    (650)

So per reaction, two electrons travel around the circuit, and the electrical work produced per electron is

−ΔG/2 moles = 394 kJ/(2 × 6.02 × 10²³) = 3.27 × 10⁻¹⁹ J = 2.04 eV    (651)

So each cell provides about 2 volts; to get 12 volts, you need a six-pack.
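The per-cell voltage follows directly from ΔG and the electron count:

```python
# Cell voltage of the lead-acid battery: |Delta G| divided by the charge
# transferred per mole of reaction (2 moles of electrons).
N_A = 6.022e23       # 1/mol, Avogadro's number
e = 1.602e-19        # C, elementary charge

dG = 394e3           # J per mole of reaction, from the table
n_electrons = 2      # electrons around the circuit per reaction

V_cell = dG / (n_electrons * N_A * e)   # ~2.0 V per cell
V_battery = 6 * V_cell                  # six cells in series: the familiar ~12 V
```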


Thermodynamic identities

Note: at constant T,

TΔS ≥ Q    (652)

The equality is valid if no extra entropy is generated. Remember that Q is energy (heat); that is what enters the first law.

The first law says:

dU = W + Q    (653)

Here Q is the energy transferred into the system due to a temperature difference, so a negative Q implies that the system has a higher temperature than the environment. W here is the work done on the system, so a negative W means that the system has done work.

In section 3.5 we showed that

T dS = dU + P dV − μ dN    (654)

is just an identity, using the definitions

1/T = (∂S/∂U)_{N,V}    (655)
P/T = (∂S/∂V)_{N,U}    (656)
μ/T = −(∂S/∂N)_{U,V}    (657)

and the mathematical identity

df(x₁, x₂, ..., x_N) = Σᵢ (∂f/∂xᵢ) dxᵢ    (658)

So the above identity is true for any infinitesimal changes.


What's the relation between

dU = W + Q    (659)

and

dU = T dS − P dV + μ dN ?    (660)

Is it possible to equate Q = T dS? Only if the process is quasistatic. Entropy can be generated in many ways: increasing the energy, the volume, and the number all increase the entropy. Only if no process other than the heat transfer contributes to the entropy change can we say Q = T dS. In general T dS ≥ Q, and the difference means that entropy has been generated by processes other than the heat transfer.
Now consider what the thermodynamic identity looks like if we use the enthalpy or the free energies instead of U. For the enthalpy,

dH = d(U + PV)
   = (T dS − P dV + μ dN) + (V dP + P dV)
   = T dS + V dP + μ dN    (661)

where we used dU = T dS − P dV + μ dN. What does this mean? First of all, it says that the natural variables that H depends on are S, P, N. It also says

T = (∂H/∂S)_{P,N}    (662)
V = (∂H/∂P)_{S,N}    (663)
μ = (∂H/∂N)_{S,P}    (664)

For the Helmholtz free energy,

dF = d(U − TS)
   = (T dS − P dV + μ dN) − (T dS + S dT)
   = −S dT − P dV + μ dN    (665)

So

F = F(T, V, N)    (666)

and

S = −(∂F/∂T)_{V,N}    (667)
P = −(∂F/∂V)_{T,N}    (668)
μ = (∂F/∂N)_{T,V}    (669)
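These derivative relations can be checked numerically once you have a concrete F. Below is a sketch using the Helmholtz free energy of a monatomic ideal gas (the form that leads to the Sackur-Tetrode entropy); the helium-4 mass is just an assumed example:

```python
import math

kB = 1.380649e-23    # J/K
h = 6.62607015e-34   # J s
m = 6.6e-27          # kg, roughly a helium-4 atom (assumed example)

def F(T, V, N):
    # Monatomic ideal gas: F = -N k T [ln((V/N) n_Q) + 1], n_Q the quantum concentration
    nQ = (2 * math.pi * m * kB * T / h**2) ** 1.5
    return -N * kB * T * (math.log(V / N * nQ) + 1.0)

T, V, N = 300.0, 1.0e-3, 1.0e22

# P = -(dF/dV)_{T,N} by central finite difference; should reproduce P = NkT/V
dV = V * 1e-6
P_fd = -(F(T, V + dV, N) - F(T, V - dV, N)) / (2 * dV)

# S = -(dF/dT)_{V,N}; should reproduce the Sackur-Tetrode entropy
dT = T * 1e-6
S_fd = -(F(T + dT, V, N) - F(T - dT, V, N)) / (2 * dT)
S_exact = N * kB * (math.log(V / N * (2 * math.pi * m * kB * T / h**2) ** 1.5) + 2.5)
```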

For the Gibbs free energy,

dG = d(U + PV − TS)
   = (T dS − P dV + μ dN) + (P dV + V dP) − (T dS + S dT)
   = −S dT + V dP + μ dN    (670)

So

G = G(T, P, N)    (671)

and

S = −(∂G/∂T)_{P,N}    (672)
V = (∂G/∂P)_{T,N}    (673)
μ = (∂G/∂N)_{T,P}    (674)

We can also define the grand free energy

Φ = U − TS − μN    (675)

For this,

dΦ = d(U − TS − μN)
   = (T dS − P dV + μ dN) − (T dS + S dT) − (μ dN + N dμ)
   = −S dT − P dV − N dμ    (676)

So

Φ = Φ(T, V, μ)    (677)

and

S = −(∂Φ/∂T)_{V,μ}    (678)
P = −(∂Φ/∂V)_{T,μ}    (679)
N = −(∂Φ/∂μ)_{T,V}    (680)

25

Free energy as a force towards Equilibrium

In quantum mechanics, we know that a system makes transitions toward its ground state. If left alone to interact with the vacuum, a hydrogen atom in an excited state will make a transition to its ground state by emitting a photon. That is, the system seeks to minimize its energy.

This is a very useful thing. Whenever a principle can be formulated as an optimization problem, we can use a very powerful approximation technique called the variational method. For instance, even if you didn't know how to solve the hydrogen atom problem exactly, you could still get a reasonably close answer by guessing a reasonable wavefunction shape and minimizing the energy under the constraint that the normalization of the wavefunction is fixed.

The question is: is there something similar that can be said about a many-body system? Certainly minimizing the energy is not the answer. Often we want to fix the energy of the system, either exactly or on average. Also, if minimizing energy were the whole story, everything would end up at absolute zero.
Then what? What about entropy? A system surely wants to maximize its own entropy, right? Well, yes and no. This is the distinction between an isolated system and a system in contact with a reservoir. An analogy in quantum mechanics is as follows. If you just write down the Hamiltonian for the hydrogen atom, the energy states are eigenstates of the Hamiltonian. That means, among other things, that they are stable. By itself, an electron in the 2S state cannot possibly make a transition to the 1S state. But it happens. What gives? This is because the hydrogen atom is not a truly isolated system. There is always the vacuum. The vacuum is not simple: it has a lot of structure, and if you want to describe the hydrogen atom more accurately, you need to consider the role of the environment, which is the vacuum. So if you put a hydrogen atom in an excited state, it will eventually go down to the ground state, but only because it is put in an environment. We don't usually want to think about the hydrogen atom and the vacuum separately, so we say things like "the system will seek its minimum energy configuration". But that's because it is put in an environment where the permeating temperature is, well, zero.

If the system is put in a finite-temperature environment, what happens is that the environment keeps providing energy to the atom, so that even if it loses energy by making a transition to a lower level, it quickly goes up to an excited state again because of collisions with the surrounding particles, fields, etc. So on average the system is not in its ground state, but at some other energy level corresponding to the temperature of the surroundings.
Remember that what is important is that the overall entropy is maximized, not that the entropy of the system is maximized. If the system starts out hotter than the environment, the system's entropy will surely go down as it cools and loses energy to the environment as heat.

But if you focus solely on the system itself, then it's not the system's entropy alone that has to be maximized. You also need to consider the energy, which, if left alone, will seek a minimum. Why? Because losing energy to the environment tends to increase the entropy of the environment. So from the point of view of the system, maximizing the total entropy requires it to give up some of its energy to the environment, if that's more profitable in increasing the overall entropy, or to absorb more energy from the environment, if that's more profitable. There has to be a balance.
So you can guess that the system seeks to minimize a quantity like

F = U − TS    (681)

Let's figure out whether this is indeed the case.

[Figure 51: A system at temperature T_S, receiving work W_S, in contact with a finite-temperature reservoir at T_R, exchanging heat Q.]


The overall entropy is

S_total = S_S + S_R    (682)

where the subscript R means the reservoir and S the system we are interested in. The condition for any change is

dS_total = dS_S + dS_R ≥ 0    (683)

The energy of the system changes by

dU_S = Q_S + W_S    (684)

In this case, only heat is exchanged between the system and the reservoir. From the reservoir's point of view, during this change the temperature remained the same, and all other parameters remained the same as well. So the heat is directly related to the entropy change:

dS_R = −Q_S/T_R    (685)

so

dS_total = dS_S + dS_R = dS_S − Q_S/T_R ≥ 0    (686)

Using dU_S = Q_S + W_S, this is

dS_total = dS_S − Q_S/T_R
         = dS_S − (dU_S − W_S)/T_R
         = (T_R dS_S − dU_S + W_S)/T_R ≥ 0    (687)

Now, since T_R remains constant no matter what, we can say

d(T_R S_S − U_S) + W_S ≥ 0    (688)

or

dF_0 ≤ W_S    (689)

where we have defined the free energy

F_0 = U_S − T_R S_S    (690)

Note that this is not the free energy of the system,

F_S = U_S − T_S S_S    (691)

since the system temperature in general need not be the same as the reservoir temperature to begin with. It will eventually become so, but then the system becomes boring.

From the above inequality we can draw two conclusions. Remember that we defined W to be the work done on the system, so the inequality implies that the maximum work the system can do is −ΔF_0. Second, if no work is involved,

dF_0 ≤ 0    (692)

that is, any spontaneous change tends to decrease the free energy if the system does no work.

What does this mean? Why would a system do that? How do we understand this behavior? Well, if no work is involved,

ΔS_total = −ΔF_0/T_R    (693)

Now remember one of the fundamental assumptions of stat-mech:

P(y) ∝ Ω(y) = exp(S(y)/k)    (694)

that is, the probability of a system having a parameter value y (which could be energy, volume, number, etc.) is proportional to the exponential of the total entropy. And remember that equilibrium happens when the most probable state has an overwhelmingly large relative probability compared to any other.
Now we ask: what is the probability for the system to have energy U_S? In our case we know that

ΔS_total = −ΔF_0/T_R    (695)

Moreover F_0 = U_S − T_R S_S is a function of U_S only, since everything else is fixed. We know that

P(U_S) ∝ exp(S_total(U_S)/k)    (696)

but what we have is the difference formula. What to do? Well, we can pick some fixed value of U_S, call it U_O, and write

ΔS_total = S_total(U_S) − S_total(U_O)
         = −[F_0(U_S) − F_0(U_O)]/T_R    (697)

But since U_O is fixed, we can say

S_total(U_S) = −F_0(U_S)/T_R + constant    (698)

so that

P(U_S) ∝ exp(S_total(U_S)/k) ∝ exp(−F_0(U_S)/(k T_R))    (699)

Therefore, the most probable state is the state with minimum F_0.
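This can be checked explicitly with a toy model. Below, a small Einstein solid (the system) shares energy units with a much larger Einstein solid (the reservoir); the exact most probable system energy is compared with the minimum of F_0 = U_S − T_R S_S. The model sizes and the units ε = k = 1 are assumptions of this sketch:

```python
import math

def log_mult(N, q):
    # Einstein solid multiplicity: Omega(N, q) = C(q + N - 1, q), computed in log form
    return math.lgamma(q + N) - math.lgamma(q + 1) - math.lgamma(N)

N_S, N_R, q_total = 50, 1000, 1000   # assumed toy sizes

# Exact probability of the system holding q units: P(q) proportional to Omega_S(q) Omega_R(q_total - q)
logP = [log_mult(N_S, q) + log_mult(N_R, q_total - q) for q in range(q_total + 1)]
q_star = max(range(q_total + 1), key=lambda q: logP[q])

# Reservoir temperature from 1/T = dS/dU (finite difference, k = epsilon = 1)
q_R = q_total - q_star
T_R = 1.0 / (log_mult(N_R, q_R + 1) - log_mult(N_R, q_R))

# F0(q) = U_S - T_R * S_S(q); its minimum should sit at the most probable q
F0 = [q - T_R * log_mult(N_S, q) for q in range(q_total + 1)]
q_min = min(range(q_total + 1), key=lambda q: F0[q])
# q_star and q_min agree to within a unit or two
```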

Initially we followed the argument in Reif; that argument is wrong. Here is the correct one.

Now suppose the reservoir keeps a constant temperature as well as a constant pressure. Then the reservoir entropy can increase in two ways: by taking heat in, or by getting more room for itself. From the point of view of the system, therefore, you would expect that it likes to minimize something like

G = U + PV − TS    (700)

The overall entropy is

S_total = S_S + S_R    (701)

where the subscript R means the reservoir and S the system we are interested in. The condition for any change is

dS_total = dS_S + dS_R ≥ 0    (702)

The entropy change of the reservoir is

T_R dS_R = dU_R + P_R dV_R    (703)

so

dS_total = dS_S + (dU_R + P_R dV_R)/T_R    (704)

[Figure 52: A system in contact with a reservoir at constant temperature T_R and pressure P_R.]


Energy conservation implies

dU_R = −dU_S    (705)

and the constant total volume implies

dV_R = −dV_S    (706)

so

dS_total = dS_S − (dU_S + P_R dV_S)/T_R
         = (T_R dS_S − dU_S − P_R dV_S)/T_R ≥ 0    (707)

Now, since T_R and P_R remain constant no matter what, we can say

d(T_R S_S − U_S − P_R V_S) ≥ 0    (708)

or

dG_0 ≤ 0    (709)

where we have defined the free energy

G_0 = U_S + P_R V_S − T_R S_S    (710)

Note that this is not the free energy of the system,

G_S = U_S + P_S V_S − T_S S_S    (711)

since the system temperature and pressure in general need not be the same as the reservoir's to begin with. They will eventually become so, but then the system becomes boring.

Now consider this time a reservoir that keeps the temperature and the pressure of the system constant at T and P while the system does some work. Again

S_total = S_S + S_R    (712)

We now have

dS_R = Q_R/T_R    (713)

The heat that leaves the reservoir is the heat that enters the system. Hence

dS_total = dS_S − Q_S/T_R    (714)

We have

Q_S = dU_S + P_S dV_S − W_other    (715)

Hence

dS_total = dS_S − (dU_S + P_S dV_S − W_other)/T_R ≥ 0    (716)

But T_R = T_S and P_R = P_S by assumption. Hence

Δ(U + PV − TS) ≤ W_other    (717)

or equivalently

−Δ(U + PV − TS) ≥ W_other, by the system    (718)

Rederivation ends.
From the above inequality we can draw two conclusions. Remember that we defined W to be the work done on the system, so the inequality implies that the maximum non-PdV work the system can do is −ΔG_0. Second, if no other type of work is involved,

dG_0 ≤ 0    (719)

that is, any spontaneous change tends to decrease the free energy if the system does no work other than the P dV work.

What does this mean? Why would a system do that? How do we understand this behavior? Well, if no other work is involved,

ΔS_total = −ΔG_0/T_R    (720)

Now remember one of the fundamental assumptions of stat-mech:

P(y) ∝ Ω(y) = exp(S(y)/k)    (721)

that is, the probability of a system having a parameter value y is proportional to the exponential of the total entropy. And remember that equilibrium happens when the most probable state has an overwhelmingly large relative probability compared to any other.
Now we ask: what is the probability for the system to have energy U_S? In our case we know that

ΔS_total = −ΔG_0/T_R    (722)

Moreover G_0 = U_S + P_R V_S − T_R S_S is a function of U_S only, since everything else is fixed. We know that

P(U_S) ∝ exp(S_total(U_S)/k)    (723)

but what we have is the difference formula. What to do? Again, we pick some fixed value U_O and write

ΔS_total = S_total(U_S) − S_total(U_O)
         = −[G_0(U_S) − G_0(U_O)]/T_R    (724)

But since U_O is fixed, we can say

S_total(U_S) = −G_0(U_S)/T_R + constant    (725)

so that

P(U_S) ∝ exp(S_total(U_S)/k) ∝ exp(−G_0(U_S)/(k T_R))    (726)

Therefore, the most probable state is the state with minimum G_0.


Now it looks like we have two different definitions of equilibrium: one in terms of the entropy and one in terms of the free energies. Are they the same? Yes, of course. To see this, consider the Gibbs free energy. At equilibrium, our current definition requires

0 = dG_0
  = d(U_S + P_R V_S − T_R S_S)
  = dU_S + P_R dV_S − T_R dS_S    (727)

The thermodynamic identity for the system is

dU_S = T_S dS_S − P_S dV_S    (728)

so

0 = dU_S + P_R dV_S − T_R dS_S
  = T_S dS_S − P_S dV_S + P_R dV_S − T_R dS_S
  = (T_S − T_R) dS_S − (P_S − P_R) dV_S    (729)

Since S and V are independent variables, the only way this vanishes is

T_S = T_R    (730)
P_S = P_R    (731)

Extensive and intensive quantities, and Gibbs

We have a bewildering number of symbols and concepts now. Here is the list:

U : internal energy
V : volume
N : number of particles
S : entropy
T : temperature
P : pressure
μ : chemical potential
H = U + PV : enthalpy
F = U − TS : Helmholtz free energy
G = U + PV − TS : Gibbs free energy
Φ = U − TS − μN : grand free energy
ΔU = Q + W
ΔS ≥ Q/T
T dS = dU + P dV − μ dN
Q : heat
W : work

The question you should ask: Is there an organizing principle? Do I have to memorize all these things? Fortunately, there is an organizing principle. Do you need to memorize these? Yes. Does it have to be hard? No, not if you understand the principle.

To talk about the organizing principle, we first notice the following fact. The quantities above naturally divide into two classes. There are quantities that double if you double the system; that is, quantities that are proportional to the volume on average. These are the extensive quantities. Fundamentally, we have 4 such quantities: V, U, N, S. These are things that add when you have two systems. There are also quantities that remain the same even if you double the system. These are the intensive quantities. Fundamentally, there are 3 such quantities: T, P, μ. Any ratio of two extensive quantities is also intensive: for instance, the energy density U/V and the number density n = N/V are intensive, as are the entropy per particle S/N and the entropy density s = S/V.
The thermodynamic potentials are also extensive quantities, since they are of the form

Σ (intensive) × (extensive)    (732)

In particular, consider the Gibbs free energy

G = U + PV − TS    (733)

Its thermodynamic identity is

dG = dU + P dV + V dP − T dS − S dT
   = (T dS − P dV + μ dN) + P dV + V dP − T dS − S dT
   = −S dT + V dP + μ dN    (734)

That is, naturally

G = G(T, P, N)    (735)

and

μ = (∂G/∂N)_{T,P}    (736)

This implies

G = μ(T, P) N + f(T, P)    (737)

where f(T, P) is an arbitrary function of T and P. Now, we know that G is an extensive quantity, so it must behave like N. But the second term does not depend on any extensive quantity and remains the same no matter how N changes. That cannot be. So f = 0. This implies

G = Nμ    (738)

This is deceptively simple. The power of this equation comes from the fact that by definition

G = U + PV − TS    (739)

so that

TS + μN = U + PV
This is a very useful formula, and you should memorize it. If you remember this formula and the thermodynamic identity

T dS = dU + P dV − μ dN    (740)

you can figure out most things without memorizing all the details.
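The formula TS + μN = U + PV can be sanity-checked with the monatomic ideal gas, plugging in the Sackur-Tetrode entropy and the ideal-gas chemical potential; the helium-4 mass below is just an assumed example:

```python
import math

kB = 1.380649e-23   # J/K
h = 6.62607015e-34  # J s
m = 6.6e-27         # kg, roughly a helium-4 atom (assumed example)

T, V, N = 300.0, 1.0e-3, 1.0e22
nQ = (2 * math.pi * m * kB * T / h**2) ** 1.5   # quantum concentration

U = 1.5 * N * kB * T                            # equipartition, monatomic gas
PV = N * kB * T                                 # ideal-gas law
S = N * kB * (math.log(V / N * nQ) + 2.5)       # Sackur-Tetrode entropy
mu = -kB * T * math.log(V / N * nQ)             # ideal-gas chemical potential

# Euler relation: T S + mu N = U + P V (equivalently G = mu N)
lhs = T * S + mu * N
rhs = U + PV
```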
Why is this useful? Well, suppose you have a system that is most easily described in terms of the volume V, the chemical potential μ, and the temperature T. What would be the most useful combination?

You start with the thermodynamic identity:

dU = T dS + μ dN − P dV    (741)

The right-hand side already has −P dV, so the volume part is O.K., but U is a function of N and S instead of μ and T. No fear. Consider subtracting μN from U, and call the result Ψ:

Ψ ≡ U − μN    (742)

Its differential is

dΨ = dU − μ dN − N dμ    (743)

But then, since dU involves μ dN, dΨ naturally involves −N dμ, so

dΨ = T dS − P dV − N dμ    (744)

We still have T dS. So, by the same token, subtract TS from Ψ and get

Φ = Ψ − TS    (745)

Then

dΦ = dΨ − T dS − S dT
   = T dS − P dV − N dμ − T dS − S dT
   = −S dT − P dV − N dμ    (746)

This sort of procedure is called a Legendre transformation. It is the same transformation one makes when changing from Hamiltonian dynamics to Lagrangian dynamics and vice versa, by adding or subtracting a pq̇ term. Of course, one cannot have a function that is simultaneously a function of independent P and V, or of T and S, etc., just as we don't have velocities any more in Hamiltonian dynamics.
As an example of another usage, consider the fact that the Helmholtz free energy can now be written in two ways:

F = U − TS = μN − PV    (747)

Now we know that if T, V, N are constant, F remains constant. By looking at this formula, one can immediately say that the change in μ and the change in P must then compensate each other as

Δμ = (1/n) ΔP    (748)

where n = N/V is constant. Try to figure that out in some other way.
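For the ideal gas this compensation can be checked directly: the P-dependent part of μ at fixed T is kT ln P, so (∂μ/∂P)_T = kT/P = V/N = 1/n. A minimal sketch:

```python
import math

kB = 1.380649e-23  # J/K
T = 300.0
P = 1.0e5          # Pa
n = P / (kB * T)   # number density N/V from the ideal-gas law

def mu_of_P(P):
    # P-dependent part of the ideal-gas chemical potential at fixed T;
    # the P-independent piece mu0(T) drops out of any difference
    return kB * T * math.log(P)

# (d mu / d P)_T by central finite difference; should equal 1/n
dP = P * 1e-6
dmu_dP = (mu_of_P(P + dP) - mu_of_P(P - dP)) / (2 * dP)
```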
The example in the book is more subtle. Since the equation of state of the ideal gas is

PV = NkT    (749)

the Gibbs free energy for the ideal gas looks simple:

G(T, P, N) = μN = μ PV/(kT)    (750)

We also know that

dG = d(U + PV − TS) = −S dT + V dP + μ dN    (751)

so

μ = (∂G/∂N)_{T,P} = G/N    (752)

Then

(∂μ/∂P)_{N,T} = (1/N)(∂G/∂P)_{N,T} = V/N = kT/P    (753)

Integrating at fixed N and T yields

μ(T, P) − μ(T, P°) = kT ln(P/P°)    (754)

or

μ(T, P) = μ°(T) + kT ln(P/P°)    (755)

where the reference point P° is usually chosen to be the standard condition of 1 atm and 0 °C.

26

Phase transformation of Pure Substances

Phase transformation: transition between the gas, liquid and solid forms of matter.

Phase diagram: the stable phases as a function of T and P. The Gibbs free energy is the most natural thermodynamic potential here.

Why do we even have phase transitions? Well, the Gibbs free energy is

G = U + PV − TS    (756)

Now we know that U ≈ (f/2) N k T. We would like to minimize G.

Suppose we are at a low temperature. Then the entropy term doesn't really matter; the system should give up energy to the environment to maximize the overall entropy. The equipartition theorem then tells us that the kinetic energy of the particles in the system is small, and if it becomes smaller than the attractive potential energies between molecules (van der Waals and such), then it becomes energetically favorable for the molecules to arrange themselves in a regular manner and become a solid.

Now suppose we raise the temperature. Then what happens? Well, at a certain point the entropy gained by a phase change (for instance, if water vaporizes, suddenly the whole room becomes available!) becomes more favorable than the energy cost. So ice creams melt, water boils, and dry ice sublimates.

(Why does alcohol vaporize so readily at room temperature when its boiling temperature is about 80 °C? That is, what does volatility have to do with the phase diagram?)

As for the pressure term, high pressure means that the volume should be minimized. So at higher pressure, things will more readily liquefy and solidify. At lower pressure, PV doesn't matter much, and U and TS have to do battle.
So we can define:

Vapor pressure: defines the line between the liquid/solid phases and the gas. The gas phase can coexist with the liquid or solid phase along this line.

Triple point: the point where all three phases coexist.

Critical point: beyond it the distinction between liquid and dense gas disappears; the latent heat goes to zero.

Not all substances behave the same way.
For instance, water and carbon dioxide both have three phases with a critical point and a triple point. But the signs of the slope of the solid-liquid boundary are not the same. For water, higher pressure means a lower melting point, while for CO2 higher pressure means a higher melting point. Why would that be? Well, take a look at the Gibbs free energy

G = U + PV − TS    (757)

We want to minimize this. Higher pressure prefers smaller volume. For most substances, the solid has less volume than the liquid. For water, however, this is not true: ice floats, since the density of ice is lower than that of liquid water. That means that by forming ice you incur a higher PV term, which is not good for lowering G.

On the other hand, the slope of the boundary between the liquid and gas phases is always positive for any substance; this is again due to the fact that condensation always reduces the volume.

An interesting point in the phase diagram is the critical point. Beyond this point, liquid and gas are not distinguishable, and the change from one form to the other is smooth instead of discontinuous (such as boiling).
204

[Figure 53: The phase diagram of water. The ice-water boundary has negative slope; the water-steam boundary always has positive slope and ends at the critical point, beyond which water and steam are not distinguishable. At the triple point all three forms coexist.]

For a pure substance, helium perhaps possesses the most interesting phase structure. What's so special about helium? Well, it becomes superfluid at very low temperature, for starters. A superfluid is a fluid with no frictional resistance to anything. Normally, if you set a liquid in motion, say in a ring, sooner or later the motion dies away due to friction between the ring and the liquid, as well as friction within the fluid. Not so for a superfluid: if you make such a device with superfluid liquid helium, it will basically rotate forever. Another interesting thing about helium is that while the more abundant isotope ⁴He is a boson, the less abundant isotope ³He is a fermion. At low temperature quantum mechanics is very important: bosons want to get together, fermions want to get away from each other. Superfluidity is a collective phenomenon where almost all the atoms in the liquid move coherently together in exactly the same quantum state. This, of course, is possible only if the particles are bosons. Hence, although chemically almost identical,

[Figure 54: The phase diagram of CO2. Both the solid-liquid and liquid-gas boundaries have positive slopes; gas and liquid are not distinguishable beyond the critical point, and all three forms coexist at the triple point.]

[Figure 55: The phase diagrams of ⁴He and ³He: solid, liquid and gas, with the ⁴He liquid split into He I (normal fluid) and He II (superfluid).]


the phase behaviors of the two isotopes are very different.

From this point of view, we would conclude that ³He does not have a superfluid phase. This would be correct if more complicated things like pairing did not happen at really, really low temperatures. Due to some quantum magic, superfluidity for helium-3 does happen, but at less than 3 mK.
A superfluid moves without friction or resistance. There is a similar phenomenon called superconductivity, in which the electrical resistance of a material goes to zero; that is, it becomes a perfect conductor. Normally, while a current goes through a wire, it loses energy by heating the wire at the rate P = I²R. For superconductors R = 0, so there is no heating of the wire. So if you set up a superconducting ring and start a current flowing, the current will flow through the ring indefinitely without any outside help such as a battery. In a normal circuit, a battery has to provide energy to compensate for the heat loss; for a superconductor there is none, so no need.
If you apply a sufficiently large magnetic field to a superconductor, it can disrupt the collective motion of the charge-carrying particles inside and destroy the superconductivity. The field at which this happens is called the critical field strength. One can then draw a phase diagram using B and T as parameters instead of P and T. Basically, any external parameter that can change the behavior of the molecules can be used to plot a phase diagram.
Diamonds and Graphite
At low pressure, graphite is more stable, since its G is lower than that of diamond. So at standard conditions, diamond will (eventually) become graphite, although the rate is extremely small. As the pressure increases, the Gibbs energy increases at the rate

(∂G/∂P)_{T,N} = V    (758)

Now, diamond is more compact than graphite. If you ignore their compressibility, you can say that each slope is constant. Per mole, V = 3.4 × 10⁻⁶ m³ for diamond and V = 5.3 × 10⁻⁶ m³ for graphite. At around P = 15 kbar, the lines cross. This pressure is reached around 50 km below the earth's surface.
Rough estimate:
1. Water pressure increases by 1 bar for every 10 meters.
2. Rock density is about 3 times water density.

Figure 56: Carbons — G versus P for diamond and graphite. The slope equals the volume; diamond has less volume per mole; the lines, offset by about 2.9 kJ at low pressure, cross near 15 kbar.


3. Pressure then increases by 1 bar for every 10/3 ≈ 3 meters of rock.
4. For 15 kbar, that means roughly 45–50 km.
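As a sanity check, the crossing pressure follows from balancing the low-pressure Gibbs offset (about 2.9 kJ/mol, read off Figure 56; treat it as approximate) against the slope difference, since (∂G/∂P)_{T,N} = V. A minimal sketch:

```python
# Values quoted in the text (per mole): the Gibbs-energy offset between
# diamond and graphite at low pressure, and the two molar volumes.
dG0 = 2.9e3          # J/mol, G_diamond - G_graphite near P = 0 (from Fig. 56)
V_diamond = 3.4e-6   # m^3/mol
V_graphite = 5.3e-6  # m^3/mol

# Since (dG/dP)_{T,N} = V, the two G(P) lines cross when dG0 = (V_g - V_d) P
P_cross = dG0 / (V_graphite - V_diamond)   # Pa
print(f"crossing pressure ~ {P_cross / 1e8:.1f} kbar")

# Rough depth: ~1 bar per 10/3 m of rock (rock is ~3x as dense as water)
depth_km = (P_cross / 1e5) * (10.0 / 3.0) / 1e3
print(f"depth ~ {depth_km:.0f} km")
```

Both numbers land right where the text's rough estimate puts them: about 15 kbar, reached some 45–50 km down.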
For the temperature dependency,

(∂G/∂T)_{P,N} = −S    (759)

Diamond has less entropy than graphite; it is more organized, more rigid. So raising the temperature reduces G for graphite faster than for diamond, and at high temperature graphite is the more stable form. (You should then make sure no oxygen is present, or both will simply burn.)
Clausius-Clapeyron Relation
The Gibbs energy depends on P, T and N.
Along any boundary between two phases, G must be the same on both sides.

Between gas and liquid,

G_l = G_g    (760)

This defines the phase boundary. If T and P change, this relation no longer holds in general. But if you follow along the phase boundary, the changes in P and T must be related in such a way that the relation continues to hold.
Equivalently,

dG_l = dG_g    (761)

or

−S_l dT + V_l dP = −S_g dT + V_g dP    (762)
Note that dT_l = dT_g and dP_l = dP_g, and we assume dN = 0. So the coexistence curve in the (T, P) plane is characterized by

dP/dT = (S_g − S_l)/(V_g − V_l) = ΔS/ΔV    (763)
Large entropy change across the boundary: P(T) is a steeply rising curve.
Large volume change: P(T) is a slowly rising curve.
The latent heat

L = T ΔS    (764)

is the amount of heat that must be provided to convert one phase into the other. It is measurable, and tables exist.
dP/dT = L/(T ΔV)    (765)

This is the Clausius-Clapeyron relation.
Diamond-Graphite
Diamond and graphite coexist at about 15 kbar near room temperature. If the temperature is raised by ΔT, the pressure has to increase by

ΔP = [L/(T ΔV)] ΔT    (766)

This turns out to be about 1.8 kbar for every 100-degree increase in temperature.
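The 1.8 kbar figure can be reproduced with one line of arithmetic, if we supply an entropy difference between graphite and diamond. The value ΔS ≈ 3.4 J/(K·mol) used below is a standard tabulated number assumed here, not taken from the text:

```python
# Clausius-Clapeyron slope for the diamond-graphite boundary, dP/dT = dS/dV
dS = 3.4      # J/(K mol), S_graphite - S_diamond near room T (tabulated value, assumed)
dV = 1.9e-6   # m^3/mol, V_graphite - V_diamond from the molar volumes in the text

dPdT = dS / dV                   # Pa/K
dP_100K = dPdT * 100.0 / 1e8     # kbar per 100 K  (1 kbar = 1e8 Pa)
print(f"{dP_100K:.1f} kbar per 100 K")
```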

Van der Waals Model


Suppose you are the largest and fastest computer in the universe (pick your name: Hector, Deep Thought, or the Earth). And suppose you want to calculate the behavior of one mole of a certain substance. What should you do?
Well, if Newtonian mechanics were adequate (which it is not), you would start with 6 × 10²³ equations,

m a_i = −∇_i Σ_j V(x_i − x_j)    (767)

and solve them numerically. This is amazingly complicated. Currently the best we can do is to simulate about 1000 atoms. More than that exceeds the CPU, memory, time — you name it, it will exceed any reasonable capability and longevity of a physicist.
What's amazing, though, is that the study of this many degrees of freedom can boil down to the study of a handful of macroscopic variables such as pressure, volume, and temperature. This is because we are interested in the average behavior; the behavior of individual atoms is of no particular concern.
A great example of this kind is the equation of state. For an ideal gas,

P V = N k T    (768)

This simple equation can nonetheless be used in a variety of real situations as a good first-order approximation.
A drawback of this simple relationship, though, is that such a gas does not exhibit phase transformations. The P–T diagram of an ideal gas is extremely simple: only the gas phase exists, no matter how high the pressure or how low the temperature.
We then have two obvious choices. One: go back to the microscopic description and try to re-derive the equation of state, including the effect of the interactions among the particles. Two: make a reasonable and minimal modification of the ideal gas law so that the modified equation of state exhibits the desired features of a phase transition.
The two choices are, of course, not mutually exclusive. In the course of any normal investigation of physical matter, guesswork and analytical or numerical work always go hand in hand.

Anyway, since we don't have the tools yet, let's try the second approach. What did we neglect when we derived the ideal gas law? Interactions, of course. But what sort of interactions? We know that atoms have finite sizes: if two atoms come too close together, they must repel. On the other hand, induced electric dipole moments in atoms can result in a long-range (albeit weak) attraction called the van der Waals interaction.
So the two major effects of having interactions are short-range repulsion and long-range attraction.
The two effects can be empirically incorporated in the following way:

(P + aN²/V²)(V − bN) = N k T    (769)
To get an idea how this may arise, let's first write down the Helmholtz free energy for the ideal gas. Here is the Sackur-Tetrode formula:

S_ideal = N k { ln[ (4πm/3h²)^{3/2} (V/N) (U/N)^{3/2} ] + 5/2 }
        = N k { ln[ (4πm/3h²)^{3/2} (V/N) (3kT/2)^{3/2} ] + 5/2 }    (770)

So

F_ideal = U − T S
        = U − N k T { ln[ (4πm/3h²)^{3/2} (V/N) (3kT/2)^{3/2} ] + 5/2 }    (771)

First of all, recall that we interpreted the argument of the logarithm as the available phase-space volume, which is the spatial volume for a single particle,

(V/N)    (772)

times the momentum-space volume for a single particle,

(U/N)^{3/2} = (3kT/2)^{3/2}    (773)

Now, if a single particle has a size of its own — call the corresponding volume b — then the available spatial volume is reduced:

(V/N) → (V/N) − b    (774)

That takes care of the repulsion part.


What about the attraction part? Having an attractive force reduces the total energy. But by how much?

To a first approximation, suppose that the range of the attraction is small and finite. In that case, the potential energy each particle can have is proportional to the volume corresponding to that range — call it v_a — times the number of other particles in that volume, which is v_a(N/V). If we denote the average interaction energy by ε and define

a = ε v_a    (775)

then the average energy per molecule is reduced:

U_ideal/N → U_ideal/N − a(N/V)    (776)

where N/V = n is the density. Should we make this change in both places where U appears? Not really. At temperature T, the average kinetic energy is still 3kT/2 — the equipartition theorem still works here. And since we derived the Sackur-Tetrode formula as a pure phase-space integral, that part is not really affected by having potential energy.
With this modification, the free energy now looks like

F_vdW = (3/2) N k T − a(N²/V) − N k T { ln[ (4πm/3h²)^{3/2} (V/N − b) (3kT/2)^{3/2} ] + 5/2 }    (777)

We know that

dF = −S dT − P dV + μ dN    (778)

or

(∂F/∂V)_{T,N} = −P    (779)

Carrying out the derivative, we get

P = −a N²/V² + N k T/(V − bN)    (780)

which results in

(P + a(N/V)²)(V − bN) = N k T    (781)

The van der Waals formula is a very crude approximation to the very complicated behavior of a real fluid. Qualitatively, however, it can explain a lot of things — that's why we use it here.
For different substances, the molecular size and the interaction strength differ, so it is natural to assign different a and b to different substances. Let's see if we can estimate how big a and b should be.
We said that b is the volume each molecule occupies. A typical gas molecule has a size of a few Å — it cannot be much smaller, since even a hydrogen molecule is about 2 Å across. So a typical volume should be

b ∼ (0.5 nm)³ ≈ 10⁻²⁸ m³    (782)

This should be O.K. as an order-of-magnitude estimate.


How about a? Well, we defined

a = ε v_a    (783)

where ε is the average interaction energy and v_a the interaction volume. Typically, between atoms,

ε ∼ 0.01–0.1 eV    (784)

and

v_a ∼ b    (785)

so that

a ∼ 1 eV·Å³    (786)

The value of a depends very much on the details of the molecular interactions. If a molecule has a permanent polarization, as H2O does, the value of a is big. If a molecule is very inert, like helium, the value of a is fairly small.
Now let's see what we can figure out from van der Waals. The first question: how are P and V related? Well,

P = N k T/(V − Nb) − aN²/V²    (787)

So, first of all, if N/V becomes small (the dilute limit), the system behaves like an ideal gas. That's good. Second, as V approaches Nb, the pressure blows up. This is because you are reaching the packing limit: there is no room to maneuver among the molecules, the forces between them become enormous, and hence the pressure diverges.
On the other hand, if T is small enough, the second term can dominate and the pressure can become negative. Huh? What do we mean by that? Doesn't that mean the theory is sick? What's happening here?
To see what is happening, we need to look at the Gibbs energy more closely. Given the equation of state, how do you calculate G? Consider this:

F = U − T S    (788)

and

dF = −S dT − P dV + μ dN    (789)

So if we fix T and N,

F = −∫_{V₀}^{V} P dV    (790)

This yields

F = −N k T ln[(V − Nb)/V₀] − aN²/V + aN²/V₀ + f(T, N)    (791)

where f(T, N) is an arbitrary function of T and N. So


G = F + P V
  = −N k T ln[(V − Nb)/V₀] − aN²/V + aN²/V₀ + f(T, N) + N k T V/(V − Nb) − aN²/V
  = −N k T ln[(V − Nb)/V₀] − aN²/V + aN²/V₀ + f(T, N) + N k T (V − Nb + Nb)/(V − Nb) − aN²/V
  = −N k T ln[(V − Nb)/V₀] + N k T Nb/(V − Nb) − 2aN²/V + g(T, N)    (792)

The Gibbs energy is a function of P, N, T. So the V in this formula actually represents the solution of

(P + aN²/V²)(V − Nb) = N k T    (793)

for V in terms of P, T, N. This is a cubic equation in V. You can find the solutions analytically, but they are not very illuminating.
A computer can, of course, do this easily — essentially, you need a means of solving a cubic equation. In MATLAB, fzero does the job.
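In the same spirit as MATLAB's fzero, here is a minimal pure-Python sketch that finds all volume roots of the van der Waals equation by scanning for sign changes and bisecting. It uses the reduced form (p + 3/v²)(3v − 1) = 8t, with pressure, volume, and temperature in units of their critical values — equivalent to eq. (781), but it avoids picking particular a and b:

```python
# Reduced van der Waals equation of state: p(v) = 8t/(3v-1) - 3/v^2
def p_vdw(v, t):
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v**2

def volumes_at(p, t, vmin=0.4, vmax=8.0, steps=4000):
    """All reduced volumes v with p_vdw(v, t) == p, via grid scan + bisection."""
    roots = []
    dv = (vmax - vmin) / steps
    v0, f0 = vmin, p_vdw(vmin, t) - p
    for i in range(1, steps + 1):
        v1 = vmin + i * dv
        f1 = p_vdw(v1, t) - p
        if f0 * f1 < 0:                    # sign change brackets a root
            a, b = v0, v1
            for _ in range(60):            # plain bisection
                m = 0.5 * (a + b)
                if (p_vdw(m, t) - p) * (p_vdw(a, t) - p) <= 0:
                    b = m
                else:
                    a = m
            roots.append(0.5 * (a + b))
        v0, f0 = v1, f1
    return roots

# Below the critical temperature there can be three volumes for one pressure;
# above it, the isotherm is monotonic and there is only one.
print(volumes_at(p=0.45, t=0.85))
print(volumes_at(p=1.50, t=1.20))
```

The three-root case is exactly the regime where three values of G compete for a single P, which is what drives the phase transition discussed next.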
So what does it look like? Well, it looks like Schroeder's Fig. 5.21. Why does it look so funny? Because the equation above is cubic: for a certain range of P there can be three real solutions, that is, three values of V associated with a given value of P. Three values of V mean, in general, three values of G. That is why, starting from point 5, there are three values of G associated with one value of P.
The funny thing is that at a certain value of P the equation still gives three different volumes, but only two G values are associated with them — that's points 2 and 6. As P increases further, the number of real solutions drops back to two and then one.
What does this mean? How do we interpret this sort of behavior? We know that the stable state is the one in which G takes its minimum value. Ah — that settles it, then: the physical G is whatever branch is lowest. In the diagram, that is the curve 1-2 and then 6-7. The loop connecting 2-3-4-5-6 is fictitious; the system does not actually go there, and G is a single-valued function of P. But that means there is a cusp in the G–P graph where the slope dG/dP changes discontinuously! What does that mean? Remember that the slope is

(∂G/∂P)_{T,N} = V    (794)

so as the system goes through that point, the volume suddenly decreases while the pressure changes very little. Is that what we want?
Yes! That is exactly what a condensing gas does. As the pressure increases, the gas molecules are packed more closely, and eventually the attractive force between them takes over and makes them condense into liquid. For volumes between points 2 and 6, the two phases can coexist.
What did we learn here?
A phase transition happens when something discontinuous happens to the thermodynamic potential G.
This happens if, for a given P, there can be multiple V associated with it.


So how do we determine the points 2 and 6 in the P–V diagram? We use the fact that the integral of a perfect differential around a loop is zero:

0 = ∮_loop dG = ∮_loop (∂G/∂P)_T dP = ∮_loop V dP    (795)
Turning the figure on its side (reading V as a function of P), we see that this condition is the same as requiring the area 4-5-6 to equal the area 2-3-4. This is called the Maxwell construction.
At high temperatures, the equation (P + aN²/V²)(V − bN) = NkT has only one real solution for a given value of P. Then everything is smooth — no phase transition. The temperature and pressure at which multiple solutions first appear are called the critical temperature T_c and the critical pressure P_c, and this point in the phase diagram is called the critical point.
The van der Waals model is qualitatively O.K., but quantitatively not so good: it is only the first approximation in what is called the virial expansion.

27 Phase transition of Mixtures

Mixing entropy of two gases. Before mixing:

S_init = S_A^init + S_B^init    (796)

where

S_A^init = N_A k ln[ C_A (V_A/N_A)(3kT/2)^{3/2} ]    (797)
S_B^init = N_B k ln[ C_B (V_B/N_B)(3kT/2)^{3/2} ]    (798)

After mixing, V_A → V and V_B → V:

S_A^fin = N_A k ln[ C_A (V/N_A)(3kT/2)^{3/2} ]    (799)
S_B^fin = N_B k ln[ C_B (V/N_B)(3kT/2)^{3/2} ]    (800)

S_fin − S_init = N_A k ln(V/V_A) + N_B k ln(V/V_B)    (801)

Now, since n = (N_A + N_B)/(V_A + V_B) = N_A/V_A = N_B/V_B, we get

V_A/V = (nV_A)/(nV) = N_A/N = x_A    (802)

ΔS = −N k [ x_A ln x_A + x_B ln x_B ]    (803)

where x_B = 1 − x_A.
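Eq. (803) is easy to evaluate. A small sketch showing that the mixing entropy per molecule (in units of k) is symmetric in x ↔ 1 − x and maximal at x = 1/2, where it equals ln 2:

```python
from math import log

def mixing_entropy(x):
    """Entropy of mixing per molecule, in units of k: eq. (803)."""
    if x in (0.0, 1.0):
        return 0.0           # a pure substance gains nothing by "mixing"
    return -(x * log(x) + (1.0 - x) * log(1.0 - x))

print(mixing_entropy(0.5))                             # maximum: ln 2 ~ 0.693
print(mixing_entropy(0.1) == mixing_entropy(0.9))      # symmetric in x <-> 1-x
```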
Free Energy of a Mixture
Suppose we have two different substances — say, two gases with the same density. What is the entropy of mixing? Per mole of mixture it is

ΔS_mixing = −R [ x ln x + (1 − x) ln(1 − x) ]    (804)

Since 0 < x < 1, the mixture certainly has the bigger entropy. If this were an isolated system, that alone would be sufficient to determine the final state. But since we are talking about constant pressure and a constant-temperature reservoir, we need to think about how the entropy of the reservoir is affected.
To do so, you consider the free energies — in this case, the Gibbs energy:

G = U + P V − T S    (805)

We are adding PV and subtracting TS, so we must have

dG = V dP − S dT + μ dN    (806)

Now, if we mix the two substances, the energy, volume, and entropy all change. For simplicity, let's assume that the changes in the energy and the volume are negligible compared to the entropy term. In that case, all we have to do is add −TΔS_mixing to the unmixed G:

G = (1 − x) G_A + x G_B + R T [ x ln x + (1 − x) ln(1 − x) ]    (807)

A mixture behaving like this is called an ideal mixture.


Take the derivative:

dG/dx = −G_A + G_B + R T [ ln x − ln(1 − x) ]    (808)

So

lim_{x→0} dG/dx = −∞    (809)
lim_{x→1} dG/dx = +∞    (810)

What does that mean? Remember x = N_B/N. Adding a very small amount of B to pure A increases the entropy greatly — that is, decreases G greatly. So some mixing always happens: it is very hard to get a 100% pure substance.


28 Uses of thermodynamic potentials

1. Enthalpy: H = U + P V
(a) Under constant P:

dH = dU + P dV    (811)

Use the first law (with W = −P dV + W_other):

dH = dU + P dV = Q + W + P dV = Q + W_other    (812)

(b) Using the thermodynamic identity: start from

dU = T dS − P dV + μ dN    (813)

Enthalpy is energy with +PV:

dH = T dS + V dP + μ dN    (814)

Note: change the sign and exchange the intensive and extensive variables. Under constant P and N,

dH = T dS    (815)

and, if quasi-static,

dH = Q    (816)

2. Helmholtz free energy: F = U − T S
(a) Under constant T:

dF = dU − T dS    (817)

Use the first law:

dF = Q + W − T dS = (Q − T dS) + W    (818)

Use the second law (Q ≤ T dS):

dF ≤ W    (819)

(b) Using the thermodynamic identity: start from

dU = T dS − P dV + μ dN    (820)

Helmholtz is energy with −TS:

dF = −S dT − P dV + μ dN    (821)

Note: change the sign and exchange the intensive and extensive variables. Under constant T and N,

dF = −P dV    (822)

and, if quasi-static,

dF = W    (823)

3. Gibbs free energy: G = U + P V − T S
(a) Under constant T and P:

dG = dU + P dV − T dS    (824)

Use the first law:

dG = Q + W + P dV − T dS = (Q − T dS) + (W + P dV)    (825)

Use the second law:

dG ≤ W + P dV    (826)

or

dG ≤ W_other    (827)

(b) Using the thermodynamic identity: start from

dU = T dS − P dV + μ dN    (828)

Gibbs is energy with +PV − TS:

dG = −S dT + V dP + μ dN    (829)

Note: change the sign and exchange the intensive and extensive variables. Under constant T and P,

dG = μ dN    (830)

and, if quasi-static,

μ dN = W_other    (831)


29 Schroeder Chapter 6 Boltzmann Statistics

Ensembles

From this chapter on, we study the microscopic description of many-body systems. What do I mean by that? Consider the thermodynamic quantities U, P, S, T, V, μ, N. For any given substance, these can be measured and tabulated; if you need them for some process, you just look them up in a table to calculate the enthalpy, free energy, etc.
But do you understand why they have those particular values? No. Can you actually calculate those values? No. To do so, you need to know how particles behave when there are a lot of them, so that from the knowledge of elementary interactions — such as the van der Waals interaction among atoms — you can calculate macroscopic quantities such as the pressure and the entropy.
O.K. So suppose you know the interaction among the particles, say in the form of an interparticle potential

V_ij = V(|r_i − r_j|)    (832)

Then what? Well, classical mechanics tells you that the next thing to do is to write down the Hamiltonian

H = Σ_{i=1}^{N} p_i²/2m + Σ_{i<j} V(|r_i − r_j|)    (833)

where i labels the particle, and solve the equations of motion

dp_i/dt = −∂H/∂q_i    (834)
dq_i/dt = ∂H/∂p_i    (835)
One can certainly try to do that. But with 10²³ particles, you can't even write those equations down.
So we need tools other than directly solving the equations of motion. Practically, solving them directly would be silly anyway, because what we want to know are macroscopic parameters such as P, μ, T, whose number is tiny compared to the total number of degrees of freedom in the system. All we care about are a few averages and their fluctuations; all other details are irrelevant. Whenever this sort of situation arises, one should immediately say: "Ah-ha, we'd better use statistics."
Statistics is the branch of mathematics that deals with probabilities, averages, fluctuations, and inference. This tool is ideal for studying a system of many particles, and the branch of physics that combines the ideas of statistics and physics is called Statistical Mechanics, or stat-mech for short. If you think about it, almost all of condensed matter is stat-mech: almost every real system contains enough particles to warrant the statistical approach. So you can guess how important this topic is in physics.
So what you want are things like

U = ⟨H⟩    (836)

where H is given above. So from 10²³ terms, you only want one number to come out. That's nice. But what's that bracket? What do we mean by average? What are we averaging over?
To make this more precise, we need the concept of thermodynamic ensembles. First of all, there are two distinct classes of systems to think about. One is isolated systems.
Isolated systems are literally that: isolated from their environment. Physically, it means that conserved quantities such as the energy and the number of particles, and also the volume, are fixed. There is no exchange of heat or particles, and no expansion. It is natural to characterize such a system by those fixed quantities U, N, V. Note that these are all extensive quantities, and they are NOT averages — they are fixed, God-given for the given isolated system.
Do these three numbers completely fix the state of the system? Not really — not even remotely. We have 10²³ particles and that many degrees of freedom; fixing three numbers does nothing to pin down the position of particle 7, for instance. There are many, many, many possible states of the system — far more than the number of particles itself.
So think of the collection of all possible systems with only those three numbers fixed. This collection is called the micro-canonical ensemble, and the average we spoke of is the average over this ensemble. For instance, suppose

we want to consider the average kinetic energy. This is different from the total energy and is not fixed.
In this case the averaging, in principle, goes like this. We measure the kinetic energy of each system in the ensemble; call them K_α, where α labels each member. The average is then

⟨K⟩ = (1/N_ensemble) Σ_{α=1}^{N_ensemble} K_α    (837)

Is it totally transparent that this is the right thing to do? Did we make any assumptions? In fact we did: we assumed that the probability for each member of the ensemble is the same. This is actually the same assumption we made to derive the second law.
Another thing to notice is that this problem looks intractable: we would need to know all possible systems in the ensemble to evaluate such averages exactly. For some systems we can do it — we did exactly this sort of thing for the Einstein solid. But what about more complicated systems? Well, that's complicated. What to do then? Do we give up? No!
Here is where the concept of a reservoir comes in.
Here is the situation we would like to consider: a reservoir that is much, much larger than the system in all respects.
We then ask this question: what is the probability for the system to have energy E?
Let the total energy of the combined system be

U_total = U_R + E    (838)

This is a fixed number. The multiplicity of the combined system is

Ω_total(U_R, E) = Ω_R(U_R) Ω_S(E) = Ω_R(U_total − E) Ω_S(E)    (839)

And it is our fundamental assumption that the probability is proportional to the multiplicity:

P(E) = N Ω(U_R, E) = N Ω_R(U_total − E) Ω_S(E)    (840)

where N is the normalization factor.

(840)

Figure 57: A small system (energy E) in contact with a reservoir (energy U_R), exchanging heat.


Now, since the reservoir is much larger than the system S,

U_total ≫ E    (841)

we can expand Ω_R(U_total − E) with respect to E. However, since Ω_R is a very large number, it is much better to expand its logarithm, a.k.a. the entropy:²

ln Ω_R(U_total − E) = ln Ω_R(U_total) − E (∂ ln Ω_R/∂U)|_{U=U_total} + O(E²)
                    = ln Ω_R(U_total) − E/kT + O(E²)    (843)

² Expanding Ω directly would give Ω(U − E) = Ω(U) − E Ω′(U) + E² Ω″(U)/2 + ...    (842). Since Ω is a very large number, of order 10^N, it does not matter much if you divide or multiply it by a large number U or a small number E: the terms in that expansion are all about the same size, so the expansion is no good. On the other hand, since S is merely a large number, dividing it by another large number U and multiplying by a small number E does yield a small number. So that expansion is good.

Here,

1/T = k (∂ ln Ω_R(U)/∂U)|_{U=U_total}    (844)

is the temperature of the reservoir at the total energy U_total.


This means that

P(E) = N Ω_R(U_total − E) Ω_S(E)
     = N exp(S_R(U_total − E)/k) Ω_S(E)
     = N Ω_R(U_total) exp(−E/kT) Ω_S(E)
     = (1/Z) exp(−E/kT) Ω_S(E)    (845)

where we defined

1/Z = N Ω_R(U_total)    (846)

since Ω_R(U_total) is constant. Z is fixed by the normalization condition

1 = Σ_E P(E) = (1/Z) Σ_E Ω_S(E) e^{−E/kT}    (847)

or

Z = Σ_E Ω_S(E) e^{−E/kT}    (848)

Note that we didn't assume anything about our system S — it could be a relatively big system itself, or as small as one atom. So what does this mean? We have

P(E) = (1/Z) Ω_S(E) e^{−E/kT}    (849)

First of all, this is now written in terms of the properties of S only, plus one gross property of the reservoir: the temperature.
1. This is a general behavior. The exponential factor

P(E) ∝ e^{−E/kT}    (850)

is called the Boltzmann factor, and it applies to any system attached to a reservoir. This is the most important factor in stat-mech. You should remember it.

2. Each possible system configuration with the same energy — that is, each configuration counted in Ω_S(E) — gets the same probability as before.
3. Ω_S(E) could be as simple as the degeneracy of an energy level of an atomic system, or as complicated as the degeneracy of interacting molecules in a liquid.
4. Remember that the exponential factor comes from the multiplicity of the reservoir,

Ω_R(U_total − E)    (851)

So it does not matter what the reservoir is made of, or how it got there. The only things that matter are the temperature of the reservoir and the fact that it is big.
5. The multiplicity is a fast-varying function of energy: a small change in the energy means a large change in the multiplicity. This is reflected in the factor e^{−E/kT}, which says that the probability for the reservoir to give up an amount E of energy is exponentially small. So large transfers of energy rarely happen. Turning the argument around, the probability for the system to have large E is exponentially small.
6. If the system itself is large enough, then

Ω_S(E)    (852)

is also a large number that grows fast with E, roughly Ω_S(E) ∼ e^{N}. The Boltzmann factor e^{−E/kT} decreases fast with E, so there is a balance between the two functions where the product has a maximum. If the system is large enough, this maximum is overwhelmingly more probable than anything else, just as we argued before.
Now, what did we miss? Take a look at the thermodynamic identity again:

dS = (1/T)(dU + P dV − μ dN)    (853)

It says that the entropy is a function of U, V, and N. But above we only expanded in U to get the entropy. This is O.K. if the volume and the particle number are fixed, or if for some reason μ = 0 — and in many cases this holds.

If, however, particles can be exchanged, we should have used

S(U_total − E, N_total − N) = S(U_total, N_total) − E (∂S/∂U)_{V,N} − N (∂S/∂N)_{V,U}
                            = S(U_total, N_total) − E/T + μN/T    (854)

and we get

P(E, N) ∝ Ω_S(E, N) e^{−E/kT + μN/kT}    (855)

We'll use this later, but for now let's keep V and N fixed.
The Partition Function

Z = Σ_s e^{−E(s)/kT}    (856)

This is useful for calculating everything.


Thermal Excitation of Atoms

P(s₂)/P(s₁) = e^{−E₂/kT}/e^{−E₁/kT} = e^{−(E₂−E₁)/kT}    (857)

Average value:

⟨E⟩ = (1/Z) Σ_s E(s) e^{−E(s)/kT}    (858)
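A minimal numerical sketch of eqs. (857) and (858) for a hypothetical two-level atom — the 0.1 eV level spacing is made up for illustration:

```python
from math import exp

k = 8.617e-5  # Boltzmann constant in eV/K

def boltzmann_stats(levels, T):
    """levels: list of (energy_eV, degeneracy). Returns (Z, <E>) per eq. (858)."""
    Z = sum(g * exp(-E / (k * T)) for E, g in levels)
    E_avg = sum(E * g * exp(-E / (k * T)) for E, g in levels) / Z
    return Z, E_avg

# Hypothetical atom: ground state plus one level 0.1 eV up, both non-degenerate
levels = [(0.0, 1), (0.1, 1)]
Z, E_avg = boltzmann_stats(levels, T=300.0)
ratio = exp(-0.1 / (k * 300.0))   # eq. (857): P(s2)/P(s1)
print(Z, E_avg, ratio)
```

At room temperature, kT ≈ 0.026 eV, so even a modest 0.1 eV gap leaves the excited state only a few percent populated.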

Paramagnetism

For a single dipole μ in a field B, with β = 1/kT:

Z = e^{βμB} + e^{−βμB} = 2 cosh(βμB)    (859)

P↑ = e^{βμB}/2 cosh(βμB)    (860)

P↓ = e^{−βμB}/2 cosh(βμB)    (861)

⟨E⟩ = (−μB) P↑ + (μB) P↓ = −μB (e^{βμB} − e^{−βμB})/2 cosh(βμB) = −μB tanh(βμB)    (862)

U = N ⟨E⟩ = −N μB tanh(βμB)    (863)
Rotation of Molecules

E(j) = j(j + 1) ε    (864)

For distinguishable atoms:

Z_rot = Σ_{j=0}^{∞} (2j + 1) e^{−j(j+1)ε/kT}    (865)

At high T, turn the sum into an integral:

Z_rot ≈ kT/ε    (866)

E_rot = kT    (867)

This is the equipartition theorem at work.
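The high-temperature limit (866) can be checked by summing the series (865) numerically; here t = kT/ε:

```python
from math import exp

def z_rot(t):
    """Z_rot = sum_j (2j+1) exp(-j(j+1)/t), with t = kT/epsilon: eq. (865)."""
    total, j = 0.0, 0
    while True:
        term = (2 * j + 1) * exp(-j * (j + 1) / t)
        total += term
        if j > 2 and term < 1e-12 * total:   # terms decay super-exponentially
            return total
        j += 1

# As t grows, the sum approaches the integral result Z_rot ~ kT/eps
for t in (10.0, 30.0, 100.0):
    print(t, z_rot(t), z_rot(t) / t)
```

The ratio Z_rot/t approaches 1 from above (the leftover is the familiar +1/3 Euler–Maclaurin correction to replacing a sum by an integral).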
Equipartition Theorem

Free particle (one particle in a box of volume V):

Z₁ = Σ_p e^{−E(p)/kT} = (V/h³) ∫ d³p e^{−p²/2mkT}
   = (4πV/h³) ∫₀^∞ dp p² e^{−p²/2mkT}

Let p²/2mkT = x, so that

p dp/(mkT) = dx    (871)

p² dp = p · p dp = mkT √(2mkT) x^{1/2} dx    (872)

Then

Z₁ = (4πV/h³) mkT √(2mkT) ∫₀^∞ dx x^{1/2} e^{−x}
   = (4πV/h³) mkT √(2mkT) Γ(3/2)
   = V (2πmkT/h²)^{3/2} = V (mkT/2πℏ²)^{3/2}    (868)

Quantum volume:

v_Q = (2πℏ²/mkT)^{3/2}    (869)

Quantum length:

l_Q = (2πℏ²/mkT)^{1/2}    (870)

so that Z₁ = V/v_Q. From this,

ln Z₁ = (3/2) ln T + ...    (873)

⟨E⟩ = (3/2) kT    (874)

using

⟨E⟩ = −∂ ln Z/∂β    (875)

Maxwell–Boltzmann distribution

D(v) = (m/2πkT)^{3/2} 4πv² e^{−mv²/2kT}    (876)
Free Energy

One can identify

F = −kT ln Z    (877)

since this is consistent with S = −∂F/∂T:

∂(kT ln Z)/∂T = k ln Z + kT ∂(ln Z)/∂T = −F/T + U/T = S    (878)

(using U = kT² ∂ln Z/∂T). To see this more directly, write the partition function as a sum over energies:

Z = Σ_E Ω(E) e^{−βE} = Σ_E e^{S(E)/k} e^{−E/kT} = Σ_E e^{(1/kT)(T S(E) − E)}    (879)

Expand the entropy about the equilibrium energy U:

S(E) = S(U + E′) = S(U) + E′/T + (E′²/2) ∂²S/∂U²    (880)

E = U + E′    (881)

Note S″ = (∂/∂U)(1/T) = −(1/T²)(∂T/∂U) = −1/(T²C). So

Z = Σ_E exp[ (1/k)( S(U) + E′/T + (E′²/2) S″ ) − E/kT ]
  = e^{(S − U/T)/k} Σ_{E′} exp[ −E′²/(2kT²C) ]    (882)
  ≈ e^{(S − U/T)/k} ∫ dE′ exp[ −E′²/(2kT²C) ]
  = e^{(S − U/T)/k} √(2πkT²C)    (883)

so −kT ln Z ≈ U − TS = F, the last (logarithmic) factor being negligible for a large system. NOTE:

∂U/∂T = C    (884)

is the heat capacity, and

dF = −S dT − P dV + μ dN    (885)

Composite systems

The partition function factorizes:

Z = Z₁ Z₂ ··· Z_N    (886)

for distinguishable, non-interacting particles, and

Z = Z₁ Z₂ ··· Z_N / N!    (887)

for indistinguishable, non-interacting particles.
Ideal gas:

Z = (1/N!) Z₁^N    (888)

F = −kT ln Z
  = −kT (N ln Z₁ − N ln N + N)
  = −kT [ N ln( V (mkT/2πℏ²)^{3/2} ) − N ln N + N ]
  = −kT [ N ln( (V/N)(mkT/2πℏ²)^{3/2} ) + N ]    (889)

S = −(∂F/∂T)_{V,N} = N k [ ln(V/(N v_Q)) + 5/2 ]    (890)

30 Quantum Statistics

Start from the thermodynamic identity:

dS = (1/T)(dU + P dV − μ dN)    (891)

The Boltzmann factor becomes the Gibbs factor,

e^{−(E(s) − μN(s))/kT}    (892)

and the partition function becomes the grand partition function,

Z = Σ_s e^{−(E(s) − μN(s))/kT}    (893)

Bosons and Fermions

Take the system to be a single-particle state of energy ε.
Fermions:

Z = 1 + e^{−(ε−μ)/kT}    (894)

Bosons:

Z = 1/(1 − e^{−(ε−μ)/kT})    (895)

Average occupation numbers:

n_FD = 1/(e^{(ε−μ)/kT} + 1)    (896)

n_BE = 1/(e^{(ε−μ)/kT} − 1)    (897)

n_MB = e^{−(ε−μ)/kT}    (898)

For (ε − μ)/kT ≫ 1 all three agree; near ε = μ they differ drastically.
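The three occupation numbers (896)–(898) are easy to compare directly:

```python
from math import exp

def n_FD(x):   # x = (eps - mu)/kT
    return 1.0 / (exp(x) + 1.0)

def n_BE(x):
    return 1.0 / (exp(x) - 1.0)

def n_MB(x):
    return exp(-x)

# Far above the chemical potential ((eps-mu)/kT >> 1) all three agree;
# near eps = mu the Bose-Einstein occupation blows up while Fermi-Dirac
# saturates below 1.
for x in (0.1, 1.0, 5.0):
    print(x, n_FD(x), n_BE(x), n_MB(x))
```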

30.1 Degenerate Fermi Gas

At T = 0, there is one particle per state up to the Fermi energy: n_FD becomes a step function and μ = ε_F. Counting states (the factor of 2 is for spin):

N = 2 (V/h³) ∫_{p<p_F} d³p = 2 (V/h³)(4πp_F³/3) = (8πV/3h³) p_F³    (899)

Fermi momentum:

p_F = h (3N/8πV)^{1/3}    (900)

Fermi energy:

ε_F = p_F²/2m = (1/2m) h² (3N/8πV)^{2/3} = (h²/8m)(3N/πV)^{2/3}    (901)

Average energy:

U = (3/5) N ε_F    (902)
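As an application of eq. (901): the Fermi energy of the conduction electrons in copper. The electron density n ≈ 8.47 × 10²⁸ m⁻³ is a standard value assumed here, not taken from the text; the formula is used in the equivalent ℏ form ε_F = (ℏ²/2m)(3π²n)^{2/3}:

```python
from math import pi

hbar = 1.0546e-34   # J s
m_e = 9.109e-31     # electron mass, kg
eV = 1.602e-19      # J per eV
n = 8.47e28         # conduction electrons per m^3 in copper (assumed standard value)

# eq. (901), rewritten: (h^2/8m)(3n/pi)^(2/3) = (hbar^2/2m)(3 pi^2 n)^(2/3)
eps_F = hbar**2 / (2 * m_e) * (3 * pi**2 * n)**(2.0 / 3.0)
print(eps_F / eV)   # ~ 7 eV
```

Note that ε_F ≫ kT at room temperature (kT ≈ 0.026 eV), which is why the electron gas in a metal is "degenerate."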

Pressure:

P = −(∂U/∂V)_{S,N} = 2U/3V    (903)

This is the degeneracy pressure — it is what holds up a neutron star.


Density of states. The total energy at T = 0 is

U = 2 (V/h³) ∫_{p<p_F} d³p (p²/2m) = 2 (V/h³) 4π ∫₀^{p_F} dp p² (p²/2m)    (904)

Set

ε = p²/2m    (905)

then

dε = p dp/m    (906)

and

p = √(2mε)    (907)

so that

N = ∫₀^{ε_F} dε g(ε)    (908)

and

U = ∫₀^{ε_F} dε g(ε) ε    (909)

with the density of states

g(ε) = (8√2 π V/h³) m^{3/2} √ε    (910)

If T ≠ 0,

N = ∫₀^∞ dε g(ε) n_FD(ε) = ∫₀^∞ dε g(ε)/(e^{(ε−μ)/kT} + 1)    (911), (913)

and

U = ∫₀^∞ dε g(ε) ε n_FD(ε) = ∫₀^∞ dε g(ε) ε/(e^{(ε−μ)/kT} + 1)    (912), (914)

Consider N. With V = L³, the density of states is g(ε) = (π/2)(8mL²/h²)^{3/2} √ε, so

N = ∫₀^∞ g(ε) n_FD(ε) dε = (π/2)(8mL²/h²)^{3/2} ∫₀^∞ dε √ε/(e^{(ε−μ)/kT} + 1)    (915)

First, shift ε → ε + μ:

N = (π/2)(8mL²/h²)^{3/2} ∫₀^∞ dε √ε/(e^{(ε−μ)/kT} + 1)
  = (π/2)(8mL²/h²)^{3/2} ∫_{−μ}^∞ dε √(ε + μ)/(e^{ε/kT} + 1)    (916)

Then scale, setting ε/kT = x:

N = (π/2)(8mL²/h²)^{3/2} (kT)^{3/2} ∫_{−μ/kT}^∞ dx √(x + μ/kT)/(e^x + 1)    (917)

Integrate by parts:

N = (π/2)(8mL²/h²)^{3/2} (kT)^{3/2} { (2/3)(x + μ/kT)^{3/2} / (e^x + 1) |_{−μ/kT}^{∞}
    + (2/3) ∫_{−μ/kT}^∞ dx (x + μ/kT)^{3/2} e^x/(e^x + 1)² }
  = (π/2)(8mL²/h²)^{3/2} (kT)^{3/2} (2/3) ∫_{−μ/kT}^∞ dx (x + μ/kT)^{3/2} e^x/(e^x + 1)²    (918)

(The boundary term vanishes at both ends.)

Note that

e^x/(e^x + 1)² = 1/[(e^x + 1)(1 + e^{−x})]    (919)

peaks at x = 0.
Consider the case μ/kT ≫ 1. Expand

(μ/kT + x)^{3/2} = (μ/kT)^{3/2} [ 1 + (3/2)(x kT/μ) + (1/2!)(3/2)(1/2)(x kT/μ)² + ... ]    (920)

and extend the lower limit of the integral to −∞.

Using ∫_{−∞}^{∞} dx e^x/(e^x + 1)² = 1, ∫_{−∞}^{∞} dx x e^x/(e^x + 1)² = 0 (odd integrand), and ∫_{−∞}^{∞} dx x² e^x/(e^x + 1)² = π²/3:

N = (π/2)(8mL²/h²)^{3/2} (kT)^{3/2} (2/3) ∫_{−∞}^∞ dx [e^x/(e^x + 1)²] (μ/kT)^{3/2} [ 1 + (3/2)(x kT/μ) + (3/8)(x kT/μ)² + ... ]
  = (π/3)(8mL²/h²)^{3/2} μ^{3/2} + (π/3)(8mL²/h²)^{3/2} μ^{3/2} (3/8)(kT/μ)² (π²/3) + ...
  = (π/3)(8mL²/h²)^{3/2} μ^{3/2} + (π³/24)(8mL²/h²)^{3/2} (kT)²/√μ + ...
  = N (μ/ε_F)^{3/2} + N (π²/8) (kT)²/(ε_F^{3/2} μ^{1/2}) + ...    (921)

where in the last line we used N = (π/3)(8mL²/h²)^{3/2} ε_F^{3/2} from the T = 0 result.

Solving for μ/ε_F, we get

μ/ε_F = 1 − (π²/12)(kT/ε_F)² + ...    (922)
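Eq. (922) says the chemical potential barely moves below ε_F at ordinary temperatures — a one-line check:

```python
from math import pi

def mu_over_eF(kT_over_eF):
    """Leading Sommerfeld correction to the chemical potential, eq. (922)."""
    return 1.0 - (pi**2 / 12.0) * kT_over_eF**2

# For a metal at room temperature kT/eF is a few thousandths, so mu ~ eF;
# even at kT = 0.1 eF the shift is under one percent.
print(mu_over_eF(0.0037))
print(mu_over_eF(0.1))
```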

In the same way,

U = (3/5) N μ^{5/2}/ε_F^{3/2} + (3π²/8) N (kT)² √μ/ε_F^{3/2} + ...
  = (3/5) N ε_F + (π²/4) N (kT)²/ε_F + ...    (923)
