Archive for entropy

A Question of Distributions and Entropies

Posted in mathematics on November 28, 2022 by telescoper

I thought I’d use the medium of this blog to pick the brains of my readers about some general questions I have about probability and entropy as described on the chalkboard above in order to help me with my homework.

Imagine that p_x(x) and p_y(y) are one-point probability density functions and p_xy(x,y) is a two-point (joint) probability density function defined so that its marginal distributions are p_x(x) and p_y(y), as shown on the left-hand side of the board. These functions are all non-negative and integrate to unity as shown.

Note that, unless x and y are independent, in which case p_xy(x,y) = p_x(x) p_y(y), the joint probability cannot be determined from the marginals alone.

On the right-hand side of the board we have S_x, S_y and S_xy, defined by integrating p log p for the two univariate distributions and the bivariate distribution respectively. These would be proportional to the Gibbs entropy of the distributions concerned but that isn’t directly relevant.

My question is: what can be said in general terms (i.e. without making any further assumptions about the distributions involved) about the relationship between S_x, S_y and S_xy?
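One way to get a feel for the question is to evaluate the three integrals numerically for a concrete case. The sketch below is only an illustration, and it rests on assumptions that are not on the board: it takes each S to be the integral of p log p with no minus sign, uses natural logarithms, and picks a zero-mean, unit-variance bivariate Gaussian with correlation rho as the joint density. The function names and grid settings are hypothetical choices.

```python
import math

def gaussian_joint(x, y, rho):
    """Bivariate normal density: zero means, unit variances, correlation rho."""
    one_minus_r2 = 1.0 - rho * rho
    norm = 1.0 / (2.0 * math.pi * math.sqrt(one_minus_r2))
    q = (x * x - 2.0 * rho * x * y + y * y) / one_minus_r2
    return norm * math.exp(-0.5 * q)

def plogp(p):
    """Integrand p*log(p), with the usual convention 0*log(0) = 0."""
    return p * math.log(p) if p > 0.0 else 0.0

def plogp_integrals(rho, half_width=8.0, n=201):
    """Return (S_x, S_y, S_xy) from Riemann sums on an n-by-n grid."""
    h = 2.0 * half_width / (n - 1)
    grid = [-half_width + i * h for i in range(n)]
    joint = [[gaussian_joint(x, y, rho) for y in grid] for x in grid]
    # Marginals: integrate the joint density over the other variable.
    p_x = [sum(row) * h for row in joint]
    p_y = [sum(joint[i][j] for i in range(n)) * h for j in range(n)]
    s_x = sum(plogp(p) for p in p_x) * h
    s_y = sum(plogp(p) for p in p_y) * h
    s_xy = sum(plogp(p) for row in joint for p in row) * h * h
    return s_x, s_y, s_xy

for rho in (0.0, 0.6):
    s_x, s_y, s_xy = plogp_integrals(rho)
    print(f"rho={rho}: S_x={s_x:.4f} S_y={s_y:.4f} S_xy={s_xy:.4f} "
          f"S_xy-S_x-S_y={s_xy - s_x - s_y:.4f}")
```

With this sign convention the difference S_xy − S_x − S_y equals the integral of p_xy log[p_xy/(p_x p_y)], which is the mutual information of x and y. That quantity is never negative and vanishes when the two variables are independent, so the Gaussian case is at least consistent with S_xy ≥ S_x + S_y. If the board includes a minus sign or a different normalisation, the inequality changes accordingly.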

Answers on a postcard through the comments block please!

A Mini-Introduction To Information Theory

Posted in The Universe and Stuff on June 5, 2018 by telescoper

The last link to an arXiv paper I posted here seems to have proved rather popular so here’s another that I think is well worth reading, this time by Ed Witten:

This article consists of a very short introduction to classical and quantum information theory. Basic properties of the classical Shannon entropy and the quantum von Neumann entropy are described, along with related concepts such as classical and quantum relative entropy, conditional entropy, and mutual information. A few more detailed topics are considered in the quantum case.

It’s not really ‘very short’ as it is nearly 40 pages long, but it does tackle a very big topic so I won’t quibble about that. You can download a PDF of the full paper here.

As always, comments are welcome through the comments box.

A Question of Entropy

Posted in Bad Statistics on August 10, 2015 by telescoper

We haven’t had a poll for a while so here’s one for your entertainment.

An article has appeared on the BBC Website entitled Web’s random numbers are too weak, warn researchers. The piece is about the techniques used to encrypt data on the internet. It’s a confusing piece, largely because of the use of the word “random” which is tricky to define; see a number of previous posts on this topic. I’ll steer clear of going over that issue again. However, there is a paragraph in the article that talks about entropy:

An unshuffled pack of cards has a low entropy, said Mr Potter, because there is little surprising or uncertain about the order the cards would be dealt. The more a pack was shuffled, he said, the more entropy it had because it got harder to be sure about which card would be turned over next.

I won’t prejudice your vote by saying what I think about this statement, but here’s a poll so I can try to see what you think.

Of course I also welcome comments via the box below…
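For anyone who wants numbers to hand while pondering the quotation, here is a minimal sketch of the Shannon entropy of a pack's ordering under two assumed states of knowledge. It is not taken from the BBC article. Both the modelling of a thorough shuffle as all 52! orderings being equally likely and the function names are my own illustrative assumptions.

```python
import math

def shannon_entropy_bits(probabilities):
    """Shannon entropy sum(-p*log2(p)) of a discrete distribution, in bits."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0.0)

def uniform_permutation_entropy_bits(n_cards):
    """Entropy in bits when all n_cards! orderings are equally likely."""
    return math.log2(math.factorial(n_cards))

# An ordering known with certainty: one outcome with probability 1.
known_order = shannon_entropy_bits([1.0])
# A thoroughly shuffled pack, modelled as a uniform distribution over orderings.
shuffled = uniform_permutation_entropy_bits(52)
print(f"known order: {known_order:.1f} bits; uniform shuffle: {shuffled:.1f} bits")
```

In this calculation the entropy is attached to the probability distribution over orderings rather than to any particular arrangement of the cards.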

Universality in Space Plasmas?

Posted in Astrohype, The Universe and Stuff on June 16, 2013 by telescoper

It’s been a while since I posted anything reasonably technical, largely because I’ve been too busy, so I thought I’d spend a bit of time today on a paper (by Livadiotis & McComas in the journal Entropy) that provoked a Nature News item a couple of weeks ago and caused a mild flutter around the internet.

Here’s the abstract of the paper:

In plasmas, Debye screening structures the possible correlations between particles. We identify a phase space minimum h* in non-equilibrium space plasmas that connects the energy of particles in a Debye sphere to an equivalent wave frequency. In particular, while there is no a priori reason to expect a single value of h* across plasmas, we find a very similar value of h* ≈ (7.5 ± 2.4)×10⁻²² J·s using four independent methods: (1) Ulysses solar wind measurements, (2) space plasmas that typically reside in stationary states out of thermal equilibrium and spanning a broad range of physical properties, (3) an entropic limit emerging from statistical mechanics, (4) waiting-time distributions of explosive events in space plasmas. Finding a quasi-constant value for the phase space minimum in a variety of different plasmas, similar to the classical Planck constant but 12 orders of magnitude larger may be revealing a new type of quantization in many plasmas and correlated systems more generally.

It looks an interesting claim, so I thought I’d have a look at the paper in a little more detail to see whether it holds up, and perhaps to explain a little to others who haven’t got time to wade through it themselves. I will assume a basic background knowledge of plasma physics, though, so turn away now if that puts you off!

For a start it’s probably a good idea to explain what this mysterious h* is. The authors define it via ½h* = ε_c t_c, where ε_c is defined to be “the smallest particle energy that can transfer information” and t_c is “the correlation lifetime of Debye Sphere” (i.e. volumes of radius the Debye Length for the plasma in question). The second of these can be straightforwardly defined in terms of the ratio between the Debye Length and the thermal sound speed; the authors argue that the first is given by ε_c = ½(m_i + m_e)u², involving the electron and ion masses in the plasma and the information speed u, which is taken to be the speed of a magnetosonic wave.

You might wonder why the authors decided to call their baby h*. Perhaps it’s because the definition looks a bit like the energy-time version of Heisenberg’s Uncertainty Principle, but I can’t be sure of that. In any case the resulting quantity has the same dimensions as Planck’s constant and is therefore measured in the same units (Js in the SI system).

Anyway, the claim is that h* is constant across a wide range of astrophysical plasmas. I’ve taken the liberty of copying the relevant Figure here:


I have to say at this point I had the distinct sense of a damp squib going off. The panel on the right purports to show the constancy of h* (y-axis) for plasmas of a wide range of number-densities (x-axis). However, both are shown on logarithmic scales and have enormously large error bars. To be sure, the behaviour looks roughly constant but to use this as a basis for claims of universality is, in my opinion, rather unjustified, especially since there may also be some sort of selection effect arising from the specific observational data used.

One of the authors is quoted in the Nature piece:

“We went into this thinking we’d find one value in one plasma, and another value in another plasma,” says McComas. “We were shocked and slightly horrified to find the same value across all of them. This is really a major deal.”

Perhaps it will turn out to be a major deal. But I’d like to see a lot more evidence first.

Plasma (astro)physics is a fascinating but very difficult subject, not because the underlying equations governing plasmas are especially complicated, but because the resulting behaviour is so sensitively dependent on small details; plasmas therefore provide an excellent exemplar of what we mean by a complex physical system. As is the case in other situations where we lack the ability to do detailed calculations at the microscopic level, we do have to rely on more coarse-grained descriptions, so looking for patterns like this is a good thing to do, but I think the jury is out.

Finally, I have to say I don’t approve of the authors talking about this in terms of “quantization”. Plasma physics is confusing enough as classical physics without confusing it with quantum theory. Opening the door to that is a big mistake, in my view. Who knows what sort of new age crankery might result?

Arrows and Demons

Posted in The Universe and Stuff on April 12, 2009 by telescoper

My recent post about randomness and non-randomness spawned a lot of comments over on Cosmic Variance about the nature of entropy. I thought I’d add a bit about that topic here, mainly because I don’t really agree with most of what is written in textbooks on this subject.

The connection between thermodynamics (which deals with macroscopic quantities) and statistical mechanics (which explains these in terms of microscopic behaviour) is a fascinating but troublesome area. Although James Clerk Maxwell (right) did much to establish the microscopic meaning of the first law of thermodynamics, he never tried to develop the second law from the same standpoint. Those that did were faced with a conundrum.


The behaviour of a system of interacting particles, such as the particles of a gas, can be expressed in terms of a Hamiltonian H which is constructed from the positions and momenta of its constituent particles. The resulting equations of motion are quite complicated because every particle, in principle, interacts with all the others. They do, however, possess a simple yet important property. Everything is reversible, in the sense that the equations of motion remain the same if one changes the direction of time and changes the direction of motion for all the particles. Consequently, one cannot tell whether a movie of atomic motions is being played forwards or backwards.

This means that the Gibbs entropy is actually a constant of the motion: it neither increases nor decreases during Hamiltonian evolution.

But what about the second law of thermodynamics? This tells us that the entropy of a system tends to increase. Our everyday experience tells us this too: we know that physical systems tend to evolve towards states of increased disorder. Heat never passes from a cold body to a hot one. Pour milk into coffee and everything rapidly mixes. How can this directionality in thermodynamics be reconciled with the completely reversible character of microscopic physics?

The answer to this puzzle is surprisingly simple, as long as you use a sensible interpretation of entropy that arises from the idea that its probabilistic nature represents not randomness (whatever that means) but incompleteness of information. I’m talking, of course, about the Bayesian view of probability.

First you need to recognize that experimental measurements do not involve describing every individual atomic property (the “microstates” of the system), but large-scale average things like pressure and temperature (these are the “macrostates”). Appropriate macroscopic quantities are chosen by us as useful things to use because they allow us to describe the results of experiments and measurements in a robust and repeatable way. By definition, however, they involve a substantial coarse-graining of our description of the system.

Suppose we perform an idealized experiment that starts from some initial macrostate. In general this will be consistent with a number – probably a very large number – of initial microstates. As the experiment continues the system evolves along a Hamiltonian path so that the initial microstate will evolve into a definite final microstate. This is perfectly symmetrical and reversible. But the point is that we can never have enough information to predict exactly where in the final phase space the system will end up because we haven’t specified all the details of which initial microstate we were in. Determinism does not in itself allow predictability; you need information too.

If we choose macro-variables so that our experiments are reproducible it is inevitable that the set of microstates consistent with the final macrostate will usually be larger than the set of microstates consistent with the initial macrostate, at least in any realistic system. Our lack of knowledge means that the probability distribution of the final state is smeared out over a larger phase space volume at the end than at the start. The entropy thus increases, not because of anything happening at the microscopic level but because our definition of macrovariables requires it.


This is illustrated in the Figure. Each individual microstate in the initial collection evolves into one state in the final collection: the narrow arrows represent Hamiltonian evolution.


However, given only a finite amount of information about the initial state these trajectories can’t be as well defined as this. This requires the set of final microstates to acquire a sort of “buffer zone” around the strictly Hamiltonian core; this is the only way to ensure that measurements on such systems will be reproducible.

The “theoretical” Gibbs entropy remains exactly constant during this kind of evolution, and it is precisely this property that requires the experimental entropy to increase. There is no microscopic explanation of the second law. It arises from our attempt to shoe-horn microscopic behaviour into the framework furnished by macroscopic experiments.
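The distinction between the two entropies can be seen in a toy calculation. The sketch below is an illustration under loud assumptions, not a model of a real gas. It replaces Hamiltonian evolution with Arnold's cat map, which is an invertible, area-preserving map of the unit square, and it defines the "experimental" entropy as −Σ P log P over the occupation fractions of a 10×10 grid of cells. The map, the grid, the number of points and all the function names are my own choices.

```python
import math
import random

def cat_map(x, y):
    """One step of Arnold's cat map: invertible and area-preserving on the unit square."""
    return (2.0 * x + y) % 1.0, (x + y) % 1.0

def coarse_grained_entropy(points, bins):
    """Entropy sum(-P log P) of the occupation fractions P of a bins-by-bins grid."""
    counts = {}
    for x, y in points:
        cell = (min(int(x * bins), bins - 1), min(int(y * bins), bins - 1))
        counts[cell] = counts.get(cell, 0) + 1
    total = len(points)
    return sum(-(c / total) * math.log(c / total) for c in counts.values())

def run(n_points=20000, steps=10, bins=10, seed=1):
    """Evolve an ensemble and record the coarse-grained entropy at each step."""
    rng = random.Random(seed)
    # Initial "macrostate": every sampled microstate lies inside one coarse cell.
    points = [(0.05 * rng.random(), 0.05 * rng.random()) for _ in range(n_points)]
    entropies = [coarse_grained_entropy(points, bins)]
    for _ in range(steps):
        points = [cat_map(x, y) for x, y in points]
        entropies.append(coarse_grained_entropy(points, bins))
    return entropies

entropies = run()
for step, s in enumerate(entropies):
    print(f"step {step:2d}: coarse-grained entropy = {s:.3f}")
print(f"maximum for a 10x10 grid: {math.log(100):.3f}")
```

Each point follows a perfectly deterministic and reversible trajectory, and the area occupied by the ensemble never changes, yet the entropy computed from the coarse cells climbs from zero towards its maximum as the initial patch is stretched into filaments finer than the grid.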

Another, perhaps even more compelling demonstration of the so-called subjective nature of probability (and hence entropy) is furnished by Maxwell’s demon. This little imp first made its appearance in 1867 or thereabouts and subsequently led a very colourful and influential life. The idea is extremely simple: imagine we have a box divided into two partitions, A and B. The wall dividing the two sections contains a tiny door which can be opened and closed by a “demon” – a microscopic being “whose faculties are so sharpened that he can follow every molecule in its course”. The demon wishes to play havoc with the second law of thermodynamics so he looks out for particularly fast moving molecules in partition A and opens the door to allow them (and only them) to pass into partition B. He does the opposite thing with partition B, looking out for particularly sluggish molecules and opening the door to let them into partition A when they approach.

The net result of the demon’s work is that the fast-moving particles from A are preferentially moved into B and the slower particles from B are gradually moved into A. Consequently, the average kinetic energy of A molecules steadily decreases while that of B molecules increases. In effect, heat is transferred from a cold body to a hot body, something that is forbidden by the second law.

All this talk of demons probably makes this sound rather frivolous, but it is a serious paradox that puzzled many great minds until it was resolved in 1929 by Leo Szilard. He showed that the second law of thermodynamics would not actually be violated if the entropy of the entire system (i.e. box + demon) increased by an amount every time the demon measured the speed of a molecule so he could decide whether to let it out from one side of the box into the other. This amount of entropy is precisely enough to balance the apparent decrease in entropy caused by the gradual migration of fast molecules from A into B. This illustrates very clearly that there is a real connection between the demon’s state of knowledge and the physical entropy of the system.

By now it should be clear why there is some sense of the word subjective that does apply to entropy. It is not subjective in the sense that anyone can choose entropy to mean whatever they like, but it is subjective in the sense that it is something to do with the way we manage our knowledge about nature rather than about nature itself. I know from experience, however, that many physicists feel very uncomfortable about the idea that entropy might be subjective even in this sense.

On the other hand, I feel completely comfortable about the notion. I even think it’s obvious. To see why, consider the example I gave above about pouring milk into coffee. We are all used to the idea that the nice swirly pattern you get when you first pour the milk in is a state of relatively low entropy. The parts of the phase space of the coffee + milk system that contain such nice separations of black and white are few and far between. It’s much more likely that the system will end up as a “mixed” state. But then how well mixed the coffee is depends on your ability to resolve the size of the milk droplets. An observer with good eyesight would see less mixing than one with poor eyesight. And an observer who couldn’t perceive the difference between milk and coffee would see perfect mixing. In this case entropy, like beauty, is definitely in the eye of the beholder.
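The eyesight argument can be put into a small toy calculation. The sketch below is hypothetical throughout: the "cup" is a one-dimensional row of sites in alternating stripes of milk and coffee, an observer's eyesight is modelled as the number of adjacent sites blurred into one cell, and the mixing entropy of a cell with milk fraction f is taken to be −f ln f − (1−f) ln(1−f). None of these choices comes from anything above other than the milk-and-coffee picture itself.

```python
import math

def binary_entropy(f):
    """Mixing entropy -f ln f - (1-f) ln(1-f) of a cell with milk fraction f."""
    if f <= 0.0 or f >= 1.0:
        return 0.0
    return -f * math.log(f) - (1.0 - f) * math.log(1.0 - f)

def striped_cup(n_sites=960, stripe=16):
    """Row of sites: 1 marks milk, 0 marks coffee, in alternating stripes."""
    return [1 if (i // stripe) % 2 == 0 else 0 for i in range(n_sites)]

def perceived_mixing(cup, resolution):
    """Average mixing entropy per cell for an observer who blurs `resolution` sites together."""
    cells = [cup[i:i + resolution] for i in range(0, len(cup), resolution)]
    return sum(binary_entropy(sum(c) / len(c)) for c in cells) / len(cells)

cup = striped_cup()
for resolution in (1, 8, 24, 96):
    print(f"resolution {resolution:2d} sites: mixing entropy per cell = "
          f"{perceived_mixing(cup, resolution):.3f}")
```

The arrangement of milk and coffee is identical in every case. An observer who resolves individual sites sees no mixing at all, while one whose cells span several stripes sees every cell as half milk and half coffee, which gives the maximum value ln 2.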

The refusal of many physicists to accept the subjective nature of entropy arises, as do so many misconceptions in physics, from the wrong view of probability.