Throwing a Fit

Posted on February 18, 2009

I’ve just been to a very interesting and stimulating seminar by Subir Sarkar from Oxford, who spoke about Cosmology Beyond the Standard Model, a talk into which he packed a huge number of provocative comments and interesting arguments. His abstract is here:

Precision observations of the cosmic microwave background and of the large-scale clustering of galaxies have supposedly confirmed the indication from the Hubble diagram of Type Ia supernovae that the universe is dominated by some form of dark energy which is causing the expansion rate to accelerate. Although hailed as having established a ‘standard model’ for cosmology, this raises a profound problem for fundamental physics. I will discuss whether the observations can be equally well explained in alternative inhomogeneous cosmological models that do not require dark energy and will be tested by forthcoming observations.

He made no attempt to be balanced and objective, but it was a thoroughly enjoyable polemic making the point that it is possible that the dark energy whose presence we infer from cosmological observations might just be an artifact of using an oversimplified model to interpret the data. I actually agreed with quite a lot of what he said, and certainly think the subject needs people willing to question the somewhat shaky foundations on which the standard concordance cosmology is built.

But near the end, Subir almost spoiled the whole thing with a comment that has earned another entry in my Room 101 of statistical horrors. He was talking about the spectrum of fluctuations in the temperature of the Cosmic Microwave Background as measured by the Wilkinson Microwave Anisotropy Probe (WMAP):



I’ve mentioned the importance of this plot in previous posts. In his talk, Subir wanted to point out that the measured spectrum isn’t actually fit all that well by the concordance cosmology prediction shown by the solid line.

A simple way of measuring goodness-of-fit is to work out the value of chi-squared, which is the sum of the squares of the residuals between the data and the fit, each scaled by its measurement error. If you do this with the WMAP data you will find that the value of chi-squared is actually a bit high, so high indeed that there is only a 7 per cent chance of a value at least that large arising in a concordance Universe. The reason is probably to do with the behaviour at low harmonics (i.e. large scales) where there are some points that do appear to lie off the model curve. This means that the best-fit concordance model isn’t a really brilliant fit, but since 7 per cent is above the usual 5% significance level it can’t be rejected on these grounds.
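To make the calculation concrete, here is a minimal sketch in Python. The numbers are toy values invented purely for illustration (they are not WMAP data), and the function names are my own. It assumes independent Gaussian errors and a model with no fitted parameters, so the number of degrees of freedom is just the number of data points. The tail probability is estimated by brute-force simulation rather than from the analytic chi-squared distribution.

```python
import random

def chi_squared(data, model, sigma):
    """Sum of squared residuals, each scaled by its measurement error."""
    return sum(((d - m) / s) ** 2 for d, m, s in zip(data, model, sigma))

def tail_probability(chi2_obs, n_points, n_trials=20000, seed=1):
    """Estimate P(chi2 >= chi2_obs | model) by simulation.

    Assumes independent, unit-variance Gaussian errors and no fitted
    parameters; correlated data points would need a different treatment.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_trials):
        chi2_sim = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_points))
        if chi2_sim >= chi2_obs:
            exceed += 1
    return exceed / n_trials

# Toy numbers, invented for illustration only (NOT the WMAP spectrum).
data = [1.2, 0.7, 2.9, 4.4, 4.6]
model = [1.0, 2.0, 3.0, 4.0, 5.0]
sigma = [0.5, 0.5, 0.5, 0.5, 0.5]

chi2_obs = chi_squared(data, model, sigma)
p = tail_probability(chi2_obs, len(data))
print(f"chi-squared = {chi2_obs:.2f}, tail probability = {p:.3f}")
```

What comes out is the probability of a chi-squared at least this large given that the model is true. It says nothing, on its own, about the probability that the model is true.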

I won’t quibble with this number, although strictly speaking the data points aren’t entirely independent so the translation of chi-squared into a probability is not quite as easy as it may seem. I’d also stress that I think it is valuable to show that the concordance model isn’t by any means perfect. However, in Subir’s talk the chi-squared result morphed into a statement that the probability of the concordance model being right is only 7 per cent.

No! The probability of chi-squared given the model is 7%, but that’s quite different to the probability of the model given the value of chi-squared…
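A small sketch shows how far apart the two can be. Going from one to the other requires Bayes’ theorem, which needs two extra ingredients: a prior probability for the model and the probability of the same result under the alternatives. In the toy calculation below only the 7% comes from the discussion above; the priors and the alternative’s probabilities are hypothetical values chosen for illustration, and lumping everything else into a single rival model is a simplifying assumption.

```python
def posterior(p_data_given_model, p_data_given_alt, prior_model):
    """Bayes' theorem for two mutually exclusive, exhaustive hypotheses."""
    numerator = p_data_given_model * prior_model
    evidence = numerator + p_data_given_alt * (1.0 - prior_model)
    return numerator / evidence

p_d_m = 0.07  # P(chi-squared this high | concordance model)

# Hypothetical (P(data | alternative), prior for concordance) pairs.
scenarios = [(0.07, 0.5), (0.01, 0.5), (0.50, 0.5), (0.50, 0.9)]

for p_d_alt, prior in scenarios:
    post = posterior(p_d_m, p_d_alt, prior)
    print(f"P(D|alt)={p_d_alt:.2f}, prior={prior:.1f} -> P(model|D)={post:.3f}")
```

With the same 7% in every case, the probability of the model given the data ranges from about 12% to nearly 88% depending on what else is assumed. None of the four scenarios gives 7%.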

This is a thinly disguised example of the prosecutor’s fallacy, which came up in my post about Sir Roy Meadow and his testimony in the case against Sally Clark that resulted in her wrongful conviction for the murder of her two children.

Of course the consequences of this polemicist’s fallacy aren’t so drastic. The Universe won’t go to prison. And it didn’t really spoil what was a fascinating talk. But it did confirm in my mind that statistics is like alcohol. It makes clever people say very silly things.