Archive for statistical physics

A Little Bit of Quantum

Posted in The Universe and Stuff on January 16, 2010 by telescoper

I’m trying to avoid getting too depressed by writing about the ongoing funding crisis for physics in the United Kingdom, so by way of a distraction I thought I’d post something about physics itself rather than the way it is being torn apart by short-sighted bureaucrats. A number of Cardiff physics students are currently looking forward (?) to their Quantum Mechanics examinations next week, so I thought I’d try to remind them of what a fascinating subject it really is…

The development of the kinetic theory of gases in the latter part of the 19th Century represented the culmination of a mechanistic approach to Natural Philosophy that had begun with Isaac Newton two centuries earlier. So successful had this programme been by the turn of the 20th century that it was a fairly common view among scientists of the time that there was virtually nothing important left to be “discovered” in the realm of natural philosophy. All that remained were a few bits and pieces to be tidied up, but nothing could possibly shake the foundations of Newtonian mechanics.

But shake they certainly did. In 1905 the young Albert Einstein – surely the greatest physicist of the 20th century, if not of all time – single-handedly overthrew the underlying basis of Newton’s world with the introduction of his special theory of relativity. Although it took some time before this theory was tested experimentally and gained widespread acceptance, it blew an enormous hole in the mechanistic conception of the Universe by drastically changing the conceptual underpinning of Newtonian physics. Out were the “commonsense” notions of absolute space and absolute time, and in was a more complex “space-time” whose measurable aspects depended on the frame of reference of the observer.

Relativity, however, was only half the story. Another, perhaps even more radical shake-up was also in train at the same time. Although Einstein played an important role in this advance too, it led to a theory he was never comfortable with: quantum mechanics. A hundred years on, the full implications of this view of nature are still far from understood, so maybe Einstein was correct to be uneasy.

The birth of quantum mechanics partly arose from the developments of kinetic theory and statistical mechanics that I discussed briefly in a previous post. Inspired by such luminaries as James Clerk Maxwell and Ludwig Boltzmann, physicists had inexorably increased the range of phenomena that could be brought within the descriptive framework furnished by Newtonian mechanics and the new modes of statistical analysis that they had founded. Maxwell had also been responsible for another major development in theoretical physics: the unification of electricity and magnetism into a single system known as electromagnetism. Out of this mathematical tour de force came the realisation that light was a form of electromagnetic wave, an oscillation of electric and magnetic fields through apparently empty space. Optical light forms just part of the possible spectrum of electromagnetic radiation, which ranges from very long-wavelength radio waves at one end to extremely short-wavelength gamma rays at the other.

With Maxwell’s theory in hand, it became possible to think about how atoms and molecules might exchange energy and reach equilibrium states not just with each other, but with light. Everyday experience that hot things tend to give off radiation, together with a number of experiments – by Wilhelm Wien and others – had shown that there were well-defined rules that determined what type of radiation (i.e. what wavelength) and how much of it were given off by a body held at a certain temperature. In a nutshell, hotter bodies give off more radiation (proportional to the fourth power of their temperature), and the peak wavelength is shorter for hotter bodies. At room temperature, bodies give off infra-red radiation; stars have surface temperatures measured in thousands of degrees, so they give off predominantly optical and ultraviolet light. Our Universe is suffused with microwave radiation corresponding to just a few degrees above absolute zero.
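To put rough numbers on those two rules, here is a minimal Python sketch. It assumes the standard forms of the Stefan-Boltzmann law (emitted power per unit area equal to σT⁴) and Wien's displacement law (peak wavelength equal to b/T), neither of which is written out above; the constants are the usual tabulated values and the three temperatures are only illustrative choices.

```python
# Illustrative sketch only: the two empirical black-body rules in numbers.
SIGMA = 5.670374419e-8    # Stefan-Boltzmann constant, W m^-2 K^-4
WIEN_B = 2.897771955e-3   # Wien displacement constant, m K


def radiated_power_per_area(temperature_k):
    """Total power emitted per unit area: grows as the fourth power of T."""
    return SIGMA * temperature_k ** 4


def peak_wavelength(temperature_k):
    """Wavelength of peak emission: shorter for hotter bodies."""
    return WIEN_B / temperature_k


# Example temperatures are assumptions chosen to mirror the cases in the text.
examples = {
    "room temperature": 300.0,
    "Sun-like stellar surface": 5800.0,
    "cosmic microwave background": 2.725,
}

for label, temperature in examples.items():
    print(
        f"{label:28s} T = {temperature:8.3f} K   "
        f"peak = {peak_wavelength(temperature):.3e} m   "
        f"flux = {radiated_power_per_area(temperature):.3e} W/m^2"
    )
```

The peaks come out at roughly ten microns (infra-red) for room temperature, about half a micron (optical) for the star, and about a millimetre (microwave) for the background radiation.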

The name given to a body in thermal equilibrium with a bath of radiation is a “black body”, not because it is black – the Sun is quite a good example of a black body and it is not black at all – but because it is simultaneously a perfect absorber and perfect emitter of radiation. In other words, it is a body which is in perfect thermal contact with the light it emits. Surely it would be straightforward to apply classical Maxwell-style statistical reasoning to a black body at some temperature?

It did indeed turn out to be straightforward, but the result was a catastrophe. One can see the nature of the disaster very straightforwardly by taking a simple idea from classical kinetic theory. In many circumstances there is a “rule of thumb” that applies to systems in thermal equilibrium. Roughly speaking, the idea is that energy becomes divided equally between every possible “degree of freedom” the system possesses. For example, if a box of gas consists of particles that can move in three dimensions then, on average, each component of the velocity of a particle will carry the same amount of kinetic energy. Molecules are able to rotate and vibrate as well as move about inside the box, and the equipartition rule can apply to these modes too.

Maxwell had shown that light was essentially a kind of vibration, so it appeared obvious that what one had to do was to assign the same amount of energy to each possible vibrational degree of freedom of the ambient electromagnetic field. Lord Rayleigh and Sir James Jeans did this calculation and found that the amount of energy radiated by a black body as a function of wavelength should vary proportionally to the temperature T and inversely as the fourth power of the wavelength λ, as shown in the diagram for an example temperature of 5000K:

Even without doing any detailed experiments it is clear that this result just has to be nonsense. The Rayleigh-Jeans law predicts that even very cold bodies should produce infinite amounts of radiation at infinitely short wavelengths, i.e. in the ultraviolet. It also predicts that the total amount of radiation – the area under the curve in the above figure – is infinite. Even a very cold body should emit infinitely intense electromagnetic radiation. Infinity is bad.
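The divergence can be seen in a couple of lines. The sketch below assumes the standard prefactor for the Rayleigh-Jeans spectral radiance (the text above only states the proportionality to T and to 1/λ⁴) and introduces a shortest wavelength λ_min purely as a bookkeeping device:

```latex
% Rayleigh-Jeans spectral radiance (standard prefactor assumed)
B_\lambda(T) = \frac{2 c k_B T}{\lambda^{4}}

% Total emission from all wavelengths longer than \lambda_{\min}
\int_{\lambda_{\min}}^{\infty} B_\lambda(T)\, d\lambda
  = 2 c k_B T \left[ -\frac{1}{3\lambda^{3}} \right]_{\lambda_{\min}}^{\infty}
  = \frac{2 c k_B T}{3 \lambda_{\min}^{3}}
  \;\longrightarrow\; \infty \quad \text{as } \lambda_{\min} \to 0
```

The temperature only appears as an overall factor, so the total diverges for any T above zero, however cold the body.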

Experiments show that the Rayleigh-Jeans law does work at very long wavelengths but in reality the radiation reaches a maximum (at a wavelength that depends on the temperature) and then declines at short wavelengths, as shown also in the above Figure. Clearly something is very badly wrong with the reasoning here, although it works so well for atoms and molecules.

It wouldn’t be accurate to say that physicists all stopped in their tracks because of this difficulty. It is amazing the extent to which people are able to carry on despite the presence of obvious flaws in their theory. It takes a great mind to realise when everyone else is on the wrong track, and a considerable time for revolutionary changes to become accepted. In the meantime, the run-of-the-mill scientist tends to carry on regardless.

The resolution of this particular fundamental conundrum is accredited to Karl Ernst Ludwig “Max” Planck, who was born in 1858. He was the son of a law professor, and himself went to university at Berlin and Munich, receiving his doctorate in 1879. He became professor at Kiel in 1885, and moved to Berlin in 1889. In 1930 he became president of the Kaiser Wilhelm Society, but resigned in 1937 in protest at the behaviour of the Nazis towards Jewish scientists. His life was blighted by family tragedies: his elder son died in the First World War; both daughters died in childbirth; and his younger son was executed in 1945 for his part in a plot to assassinate Adolf Hitler. After the Second World War Planck briefly returned as its president, and the society was renamed the Max Planck Society. He died in 1947, by then such a famous scientist that his likeness appeared on the two Deutschmark coin issued in 1958.

Planck had taken some ideas from Boltzmann’s work but applied them in a radically new way. The essence of his reasoning was that the ultraviolet catastrophe basically arises because Maxwell’s electromagnetic field is a continuous thing and, as such, appears to have an infinite variety of ways in which it can absorb energy. When you are allowed to store energy in whatever way you like in all these modes, and add them all together you get an infinite power output. But what if there was some fundamental limitation in the way that an atom could exchange energy with the radiation field? If such a transfer can only occur in discrete lumps or quanta – rather like “atoms” of radiation – then one could eliminate the ultraviolet catastrophe at a stroke. Planck’s genius was to realize this, and the formula he proposed contains a constant that still bears his name. The energy of a light quantum E is related to its frequency ν via E=hν, where h is Planck’s constant, one of the fundamental constants that occur throughout theoretical physics.

Boltzmann had shown that if a system possesses discrete energy states labelled by j, each lying an energy Ej above the lowest state, then at a given temperature the likely relative occupation of each state is determined by a “Boltzmann factor” of the form:

n_{j} \propto \exp\left(-\frac{E_{j}}{k_BT}\right),

so that the higher energy state is exponentially less probable than the lower energy state if the energy difference is much larger than the typical thermal energy kB T ; the quantity kB is Boltzmann’s constant, another fundamental constant. On the other hand, if the states are very close in energy compared to the thermal level then they will be roughly equally populated in accordance with the “equipartition” idea I mentioned above.
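A quick numerical illustration of those two regimes, as a minimal Python sketch. The room temperature of 300 K and the two energy gaps (1 eV and 0.001 eV) are assumptions picked for illustration, not values from the discussion above.

```python
import math

K_B = 1.380649e-23            # Boltzmann's constant, J/K
ELECTRON_VOLT = 1.602176634e-19  # one electron-volt in joules


def boltzmann_ratio(energy_gap_j, temperature_k):
    """Occupation of the upper state relative to the lower one."""
    return math.exp(-energy_gap_j / (K_B * temperature_k))


TEMPERATURE = 300.0  # assumed room temperature, K
thermal_energy_ev = K_B * TEMPERATURE / ELECTRON_VOLT
print(f"k_B T at {TEMPERATURE:.0f} K is about {thermal_energy_ev:.4f} eV")

# Hypothetical gaps: one far above the thermal energy, one far below it.
for gap_ev in (1.0, 0.001):
    ratio = boltzmann_ratio(gap_ev * ELECTRON_VOLT, TEMPERATURE)
    print(f"gap = {gap_ev:6.3f} eV   upper/lower occupation = {ratio:.3e}")
```

With a gap of 1 eV the upper state is essentially empty at room temperature, while with a gap of 0.001 eV the two states are almost equally populated, which is the equipartition limit.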

The trouble with the classical treatment of an electromagnetic field is that it makes it too easy for the field to store infinite energy in short wavelength oscillations: it can put a little bit of energy in each of a lot of modes in an unlimited way. Planck realised that his idea would mean ultra-violet radiation could only be emitted in very energetic quanta, rather than in lots of little bits. Building on Boltzmann’s reasoning, he deduced that the probability of exciting a quantum with very high energy is exponentially suppressed. This in turn leads to an exponential cut-off in the black-body curve at short wavelengths. Triumphantly, he was able to calculate the exact form of the black-body curve expected in his theory: it matches the Rayleigh-Jeans form at long wavelengths, but turns over and decreases at short wavelengths just as the measurements require. The theoretical Planck curve matches measurements perfectly over the entire range of wavelengths that experiments have been able to probe.
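The agreement at long wavelengths and the cut-off at short ones can be checked numerically. The sketch below is only illustrative: it assumes the standard textbook expressions for the Planck and Rayleigh-Jeans spectral radiance (neither formula is written out above), uses the example temperature of 5000 K, and the wavelength grid is an arbitrary choice.

```python
import math

H = 6.62607015e-34    # Planck's constant, J s
C = 2.99792458e8      # speed of light, m/s
K_B = 1.380649e-23    # Boltzmann's constant, J/K


def rayleigh_jeans(wavelength_m, temperature_k):
    """Classical spectral radiance: proportional to T / wavelength^4."""
    return 2.0 * C * K_B * temperature_k / wavelength_m ** 4


def planck(wavelength_m, temperature_k):
    """Planck spectral radiance, exponentially cut off at short wavelengths."""
    x = H * C / (wavelength_m * K_B * temperature_k)  # quantum energy / k_B T
    # expm1(x) = exp(x) - 1, computed accurately when x is small.
    return (2.0 * H * C ** 2 / wavelength_m ** 5) / math.expm1(x)


TEMPERATURE = 5000.0  # K
for wavelength in (1e-7, 5e-7, 1e-6, 1e-5, 1e-4, 1e-3):
    ratio = planck(wavelength, TEMPERATURE) / rayleigh_jeans(wavelength, TEMPERATURE)
    print(f"wavelength = {wavelength:.0e} m   Planck / Rayleigh-Jeans = {ratio:.3e}")
```

The ratio tends to one at millimetre wavelengths, where a quantum carries far less energy than k_B T, and collapses by many orders of magnitude in the ultraviolet, where a single quantum costs far more than the thermal energy available.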

Curiously perhaps, Planck stopped short of the modern interpretation of this: that light (and other electromagnetic radiation) is composed of particles which we now call photons. He was still wedded to Maxwell’s description of light as a wave phenomenon, so he preferred to think of the exchange of energy as being quantised rather than the radiation itself. Einstein’s work on the photoelectric effect in 1905 further vindicated Planck, but also demonstrated that light travelled in packets. After Planck’s work, and the development of the quantum theory of the atom pioneered by Niels Bohr, quantum theory really began to take hold of the physics community and eventually it became acceptable to conceive of not just photons but all matter as being part particle and part wave. Photons are examples of a kind of particle known as a boson, and the atomic constituents such as electrons and protons are fermions. (This classification arises from their spin: bosons have spin which is an integer multiple of the reduced Planck constant ħ = h/2π, whereas fermions have half-integer spin.)

You might have expected that the radical step made by Planck would immediately have led to a drastic overhaul of the system of thermodynamics put in place in the preceding half-a-century, but you would be wrong. In many ways the realization that discrete energy levels were involved in the microscopic description of matter if anything made thermodynamics easier to understand and apply. Statistical reasoning is usually most difficult when the space of possibilities is complicated. In quantum theory one always deals fundamentally with a discrete space of possible outcomes. Counting discrete things is not always easy, but it’s usually easier than counting continuous things. Even when they’re infinite.

Much of modern physics research lies in the arena of condensed matter physics, which deals with the properties of solids and gases, often at the very low temperatures where quantum effects become important. The statistical thermodynamics of these systems is based on a very slight modification of Boltzmann’s result:

n_{j} \propto \left[\exp\left(\frac{E_{j}}{k_BT}\right)\pm 1\right]^{-1},

which gives the equilibrium occupation of states at an energy level Ej; the difference between bosons and fermions manifests itself as the sign in the denominator. Fermions take the upper “plus” sign, and the resulting statistical framework is based on the so-called Fermi-Dirac distribution; bosons have the minus sign and obey Bose-Einstein statistics. This modification of the classical theory of Maxwell and Boltzmann is simple, but leads to a range of fascinating phenomena, from neutron stars to superconductivity.
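To see how slight the modification is, and where it matters, here is a small Python sketch comparing the three occupation factors as functions of x = Ej/(kB T). It follows the form of the expression above, treats the proportionality as an equality for simplicity, and the sample values of x are arbitrary; all of that is assumption for the sake of illustration.

```python
import math


def maxwell_boltzmann(x):
    """Classical Boltzmann factor, with x = E_j / (k_B T)."""
    return math.exp(-x)


def fermi_dirac(x):
    """Fermions: plus sign in the denominator, so never more than one per state."""
    return 1.0 / (math.exp(x) + 1.0)


def bose_einstein(x):
    """Bosons: minus sign in the denominator (needs x > 0)."""
    return 1.0 / math.expm1(x)


print("    x   Maxwell-Boltzmann   Fermi-Dirac   Bose-Einstein")
for x in (0.1, 1.0, 5.0, 10.0):
    print(
        f"{x:5.1f}   {maxwell_boltzmann(x):17.4e}   "
        f"{fermi_dirac(x):11.4e}   {bose_einstein(x):13.4e}"
    )
```

When the energy is large compared with k_B T the three agree closely, which is why the classical theory works so well at high temperature. At low energies the fermion occupation saturates below one, while the boson occupation grows without limit.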

Moreover, the nature of the ultraviolet catastrophe for black-body radiation at the start of the 20th Century perhaps also holds lessons for modern physics. One of the fundamental problems we have in theoretical cosmology is how to calculate the energy density of the vacuum using quantum field theory. This is a more complicated thing to do than working out the energy in an electromagnetic field, but the net result is a catastrophe of the same sort. All straightforward ways of computing this quantity produce a divergent answer unless a high-energy cut-off is introduced. Although cosmological observations of the accelerating universe suggest that vacuum energy is there, its actual energy density is way too small for any plausible cut-off.

So there we are. A hundred years on, we have another nasty infinity. It’s a fundamental problem, but its answer will probably open up a new way of understanding the Universe.



Ergodic Means…

Posted in The Universe and Stuff on October 19, 2009 by telescoper

The topic of this post is something I’ve been wondering about for quite a while. This afternoon I had half an hour spare after a quick lunch so I thought I’d look it up and see what I could find.

The word ergodic is one you will come across very frequently in the literature of statistical physics, and in cosmology it also appears in discussions of the analysis of the large-scale structure of the Universe. I’ve long been puzzled as to where it comes from and what it actually means. Turning to the excellent Oxford English Dictionary Online, I found the answer to the first of these questions. Well, sort of. Under etymology we have

ad. G. ergoden (L. Boltzmann 1887, in Jrnl. f. d. reine und angewandte Math. C. 208), f. Gr.

I say “sort of” because it does attribute the origin of the word to Ludwig Boltzmann, but the Greek roots (εργον and οδος) appear to suggest it means “workway” or something like that. I don’t think I follow an ergodic path on my way to work so it remains a little mysterious.

The actual definitions of ergodic given by the OED are

Of a trajectory in a confined portion of space: having the property that in the limit all points of the space will be included in the trajectory with equal frequency. Of a stochastic process: having the property that the probability of any state can be estimated from a single sufficiently extensive realization, independently of initial conditions; statistically stationary.

As I had expected, it has two meanings which are related, but which apply in different contexts. The first is to do with paths or orbits, although in physics this is usually taken to mean trajectories in phase space (including both positions and velocities) rather than just three-dimensional position space. However, I don’t think the OED has got it right in saying that the system visits all positions with equal frequency. I think an ergodic path is one that must visit all positions within a given volume of phase space rather than being confined to a lower-dimensional piece of that space. For example, the path of a planet under the inverse-square law of gravity around the Sun is confined to a one-dimensional ellipse. If the force law is modified by external perturbations then the path need not be as regular as this, in extreme cases wandering around in such a way that it never joins back on itself but eventually visits all accessible locations. As far as my understanding goes, however, it doesn’t have to visit them all with equal frequency. The ergodic property of orbits is intimately associated with the presence of chaotic dynamical behaviour.
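The difference between a path that closes on itself and one that eventually gets everywhere can be illustrated with a deliberately simple toy, far simpler than a perturbed planetary orbit: a point hopping around a circle of unit circumference by a fixed step. Everything in this Python sketch (the map, the step sizes, the 100 bins) is an assumption made for illustration.

```python
import math


def bins_visited(step, n_steps, n_bins=100, start=0.0):
    """Count how many of n_bins equal arcs of the circle the path ever enters."""
    x = start
    visited = set()
    for _ in range(n_steps):
        # The modulo guards against x * n_bins rounding up to n_bins.
        visited.add(int(x * n_bins) % n_bins)
        x = (x + step) % 1.0  # hop around the circle by a fixed amount
    return len(visited)


# A rational step: the path closes after four hops and revisits the same points.
closed_path = bins_visited(step=0.25, n_steps=10_000)

# An irrational step (golden-ratio based): the path never exactly repeats.
golden_step = (math.sqrt(5.0) - 1.0) / 2.0
filling_path = bins_visited(step=golden_step, n_steps=10_000)

print(f"rational step:   {closed_path} of 100 bins visited")
print(f"irrational step: {filling_path} of 100 bins visited")
```

With the rational step only four bins are ever touched, however long the run; with the irrational step every bin is reached. This particular toy happens to spread its visits almost evenly and is not chaotic, so it says nothing either way about the equal-frequency question or about the link with chaos.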

The other definition relates to stochastic processes, i.e. processes involving some sort of random component. These could either consist of a discrete collection of random variables {X1…Xn} (which may or may not be correlated with each other) or a continuously fluctuating function of some parameter such as time t, i.e. X(t) or spatial position (or perhaps both).

Stochastic processes are quite complicated measure-valued mathematical entities because they are specified by probability distributions. What the ergodic hypothesis means in the second sense is that measurements extracted from a single realization of such a process have a definite relationship to analogous quantities defined by the probability distribution.

I always think of a stochastic process being like a kind of algorithm (whose workings we don’t know). Put it on a computer, press “go” and it spits out a sequence of numbers. The ergodic hypothesis means that by examining a sufficiently long run of the output we could learn something about the properties of the algorithm.

An alternative way of thinking about this for those of you of a frequentist disposition is that the probability average is taken over some sort of statistical ensemble of possible realizations produced by the algorithm, and this must match the appropriate long-term average taken over one realization.

This is actually quite a deep concept and it can apply (or not) in various degrees. A simple example is to do with properties of the mean value. Given a single run of the program over some long time T we can compute the sample average

\bar{X}_T\equiv \frac{1}{T} \int_0^Tx(t) dt

the probability average is defined differently over the probability distribution, which we can call p(x)

\langle X \rangle \equiv \int x p(x) dx

If these two are equal for sufficiently long runs, i.e. as T goes to infinity, then the process is said to be ergodic in the mean. A process could, however, be ergodic in the mean but not ergodic with respect to some other property of the distribution, such as the variance. Strict ergodicity would require that the entire frequency distribution defined from a long run should match the probability distribution to some accuracy.
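A minimal sketch of what "ergodic in the mean" looks like in practice, in Python. The process used here, a discrete-time autoregressive sequence with Gaussian noise, is an assumption chosen because it is easy to simulate; the time integral becomes a sum over steps, and the parameter values, run lengths and seed are all arbitrary.

```python
import random


def ar1_run(n_steps, mean, phi, noise_sd, rng):
    """One realization of x[n+1] = mean + phi * (x[n] - mean) + Gaussian noise."""
    # Start from the stationary distribution so the statistics do not drift.
    stationary_sd = noise_sd / (1.0 - phi ** 2) ** 0.5
    x = rng.gauss(mean, stationary_sd)
    path = []
    for _ in range(n_steps):
        x = mean + phi * (x - mean) + rng.gauss(0.0, noise_sd)
        path.append(x)
    return path


rng = random.Random(42)            # fixed seed so the sketch is repeatable
MEAN, PHI, NOISE_SD = 2.0, 0.9, 1.0

# Sample average: one long run, averaged over time.
long_run = ar1_run(200_000, MEAN, PHI, NOISE_SD, rng)
time_average = sum(long_run) / len(long_run)

# Probability average: many independent short runs, averaged at one fixed time.
ensemble = [ar1_run(50, MEAN, PHI, NOISE_SD, rng)[-1] for _ in range(5_000)]
ensemble_average = sum(ensemble) / len(ensemble)

print(f"time average over one long run:   {time_average:.3f}")
print(f"ensemble average over many runs:  {ensemble_average:.3f}")
print(f"mean built into the process:      {MEAN:.3f}")
```

Both estimates should land close to the mean built into the process, which is what ergodicity in the mean asserts. Agreement of the means alone says nothing about the variance or any other property of the distribution.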

Now we have a problem with the OED again. According to the defining quotation given above, ergodic can be taken to mean statistically stationary. Actually that’s not true…

In the one-parameter case, “statistically stationary” means that the probability distribution controlling the process is independent of time, i.e. that p(x,t)=p(x,t+Δt). It’s fairly straightforward to see that the ergodic property requires that a process X(t) be stationary, but the converse is not the case. Not every stationary process is necessarily ergodic. Ned Wright gives an example here. For a higher-dimensional process, such as a spatially-fluctuating random field, the analogous property is statistical homogeneity, rather than stationarity, but otherwise everything carries over.
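One standard textbook illustration of a stationary process that is not ergodic (not necessarily the example Ned Wright gives) is a process whose value is drawn at random once and then never changes. The Python sketch below is a toy under that assumption; the run lengths, seed and unit Gaussian are arbitrary choices.

```python
import random


def frozen_realization(n_steps, rng):
    """X(t) = A for all t, where A is drawn once from a unit Gaussian."""
    level = rng.gauss(0.0, 1.0)
    return [level] * n_steps


rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Time averages: each long run simply returns its own frozen level.
runs = [frozen_realization(1_000, rng) for _ in range(5)]
time_averages = [sum(run) / len(run) for run in runs]

# Ensemble average: many independent realizations sampled at a single time.
ensemble = [frozen_realization(1, rng)[0] for _ in range(20_000)]
ensemble_average = sum(ensemble) / len(ensemble)

for index, average in enumerate(time_averages):
    print(f"run {index}: time average = {average:+.3f}")
print(f"ensemble average = {ensemble_average:+.3f}")
```

The distribution of X at any instant is the same unit Gaussian, so the process is stationary, yet no single run ever reveals that distribution: each time average just reproduces the level that run happened to draw, while the ensemble average sits near zero.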

Ergodic theorems are very tricky to prove in general, but there are well-known results that rigorously establish the ergodic properties of Gaussian processes (which is another reason why theorists like myself like them so much). However, it should be mentioned that even if the ergodic assumption applies, its usefulness depends critically on the rate of convergence. In the time-dependent example I gave above, it’s no good if the averaging period required is much longer than the age of the Universe; in that case even ergodicity makes it difficult to make inferences from your sample. Likewise the ergodic hypothesis doesn’t help you analyse your galaxy redshift survey if the averaging scale needed is larger than the depth of the sample.

Moreover, it seems to me that many physicists resort to ergodicity when there isn’t any compelling mathematical reason to think that it is true. In some versions of the multiverse scenario, it is hypothesized that the fundamental constants of nature describing our low-energy physics turn out “randomly” to take on different values in different domains owing to some sort of spontaneous symmetry breaking, perhaps associated with a phase transition generating cosmic inflation. We happen to live in a patch within this structure where the constants are such as to make human life possible. There’s no need to assert that the laws of physics have been designed to make us possible if this is the case, as most of the multiverse doesn’t have the fine tuning that appears to be required to allow our existence.

As an application of the Weak Anthropic Principle, I have no objection to this argument. However, behind this idea lies the assertion that all possible vacuum configurations (and all related physical constants) do arise ergodically. I’ve never seen anything resembling a proof that this is the case. Moreover, there are many examples of physical phase transitions for which the ergodic hypothesis is known not to apply.  If there is a rigorous proof that this works out, I’d love to hear about it. In the meantime, I remain sceptical.