Archive for extreme value statistics

The Biggest Things in the Universe

Posted in The Universe and Stuff on November 12, 2011 by telescoper

I’ve never really thought of this blog as a vehicle for promoting my own research in cosmology, but it’s been a while since I posted anything very scientific so I thought I’d put up a brief advertisement for a paper that appeared on the arXiv this week by myself and Ian Harrison (who is a PhD student of mine). Here is the abstract, which I think is pretty informative about the contents of the paper; would that were always the case!

Motivated by recent suggestions that a number of observed galaxy clusters have masses which are too high for their given redshift to occur naturally in a standard model cosmology, we use Extreme Value Statistics to construct confidence regions in the mass-redshift plane for the most extreme objects expected in the universe. We show how such a diagram not only provides a way of potentially ruling out the concordance cosmology, but also allows us to differentiate between alternative models of enhanced structure formation. We compare our theoretical prediction with observations, placing currently observed high and low redshift clusters on a mass-redshift diagram and find – provided we consider the full sky to avoid a posteriori selection effects – that none are in significant tension with concordance cosmology.

The background to this paper is that, according to standard cosmological theory, galaxies and other large-scale structures such as galaxy clusters form hierarchically. That is to say that they are built from the bottom up, from a population of smaller objects that progressively merge into larger and larger structures as the Universe evolves. At any given time there is a broad distribution of masses, but the average mass increases as time goes on. Looking out into the distant Universe we should therefore see fewer high-mass objects at high redshift than at low redshift.

Recent observations – I refer you to our paper for references – have revealed evidence for the existence of some very massive galaxy clusters at redshifts around unity or larger, which corresponds to a look-back time of greater than 7 Gyr. Actually these are not at high redshift compared to galaxies, which have been found at redshifts around 10, where the look-back time is more like 12 Gyr, but those galaxies are at least a thousand times less massive than large clusters, so their existence in the early Universe is not surprising in the framework of the standard cosmological model. On the other hand, clusters of the masses we’re talking about – about 1,000,000,000,000,000 times the mass of the Sun – should form pretty late in cosmic history, so they have the potential to challenge the standard theory.

In the paper we approach the issue in a different manner to other analyses and apply Extreme Value Statistics to ask how massive we should expect the largest cluster in the observable universe to be as a function of redshift. If we see one larger than the limits imposed by this calculation then we really need to consider modifying the standard theory. This way of tackling the problem attempts to finesse a number of biases in the usual approach, which is to attempt to estimate the number-density n(M) of clusters as a function of mass M, because it does not require a correction for a posteriori selection effects; it is not obvious, for example, precisely what volume is being probed by the surveys yielding these cluster candidates.

Anyway, the results are summarised in our Figure 1, which shows some estimated cluster masses, together with their uncertainties, superimposed on the theoretical distribution of the mass of the most massive cluster at that redshift:

If you’re wondering why the curves turn down at very low redshift, it’s just because the volume available to be observed at low redshift is small: although objects are generally more massive at low redshift, the chance of getting a really big one is reduced by the fact that one is observing a much smaller part of space-time.
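A toy version of this volume effect, which is emphatically not the calculation in the paper, can be sketched in a few lines. I assume here that the objects are independent draws from a single fixed distribution (a unit Gaussian, chosen purely for convenience) and that the number of draws n stands in for the volume surveyed; the function name is made up for the illustration. The evolution of the mass function with redshift, which is what pulls the real curves down at high redshift, is ignored entirely.

```python
from statistics import NormalDist

def median_of_maximum(n, dist=NormalDist()):
    """Median of the largest of n independent draws from dist.

    The largest value has distribution [F(z)]^n, so its median
    solves [F(z)]^n = 1/2, i.e. F(z) = 0.5 ** (1 / n).
    """
    return dist.inv_cdf(0.5 ** (1.0 / n))

# Fewer objects (a smaller volume) means a less extreme largest object,
# even though the underlying distribution is unchanged.
for n in (10, 1000, 100000):
    print(n, round(median_of_maximum(n), 2))
```

The median of the largest value creeps upwards as n grows, so shrinking the sample pulls the expected extreme down, which is all that is happening at the low-redshift end of the curves.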

The results show: (a) that, contrary to some claims, the current observations are actually entirely consistent with the standard concordance model; but also (b) that the existence of clusters at redshifts around 1.5 with masses much bigger than 10^{15} M_{\odot} would require the tabling of an amendment to the standard theory.

Of course this is a very conservative approach and it yields what is essentially a null result, but I take the view that while theorists should be prepared to consider radical new theoretical ideas, we should also be conservative when it comes to the interpretation of data.


The Laws of Extremely Improbable Things

Posted in Bad Statistics, The Universe and Stuff on June 9, 2011 by telescoper

After a couple of boozy nights in Copenhagen during the workshop which has just finished, I thought I’d take things easy this evening and make use of the free internet connection in my hotel to post a short item about something I talked about at the workshop here.

Actually I’ve been meaning to mention a nice bit of statistical theory called Extreme Value Theory on here for some time, because not so many people seem to be aware of it, but somehow I never got around to writing about it. People generally assume that statistical analysis of data revolves around “typical” quantities, such as averages or root-mean-square fluctuations (i.e. “standard” deviations). Sometimes, however, it’s not the typical points that are interesting, but those that appear to be drawn from the extreme tails of a probability distribution. This is particularly the case in planning for floods and other natural disasters, but this field also finds a number of interesting applications in astrophysics and cosmology. What should be the mass of the most massive cluster in my galaxy survey? How bright the brightest galaxy? How hot the hottest hotspot in the distribution of temperature fluctuations on the cosmic microwave background sky? And how cold the coldest? Sometimes just one anomalous event can be enormously useful in testing a theory.

I’m not going to go into the theory in any great depth here. Instead I’ll just give you a simple idea of how things work. First imagine you have a set of n observations labelled X_i. Assume that these are independent and identically distributed with a distribution function F(x), i.e.

\Pr(X_i\leq x)=F(x)

Now suppose you locate the largest value in the sample, X_{\rm max}. What is the distribution of this value? The answer is not F(x), but it is quite easy to work out because the probability that the largest value is less than or equal to, say, z is just the probability that each one is less than or equal to that value, i.e.

F_{\rm max}(z) = \Pr \left(X_{\rm max}\leq z\right)= \Pr \left(X_1\leq z, X_2\leq z, \ldots, X_n\leq z\right)

Because the variables are independent and identically distributed, this means that

F_{\rm max} (z) = \left[ F(z) \right]^n

The probability density function associated with this is then just

f_{\rm max}(z) = n f(z) \left[ F(z) \right]^{n-1}

In a situation in which F(x) is known and the other assumptions (independence and identical distribution) apply, this simple result offers the best way to proceed in analysing extreme values.
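The two formulae above are easy to check numerically. What follows is only a minimal sketch: the unit Gaussian is my illustrative choice for F(x), the function names are invented for the purpose, and the sample size, threshold and number of trials are arbitrary. It compares the exact distribution of the maximum with a brute-force simulation.

```python
import random
from statistics import NormalDist

UNIT_GAUSSIAN = NormalDist()  # illustrative choice of F(x); any known F would do

def exact_max_cdf(z, n, dist=UNIT_GAUSSIAN):
    """F_max(z) = [F(z)]^n for n independent, identically distributed draws."""
    return dist.cdf(z) ** n

def exact_max_pdf(z, n, dist=UNIT_GAUSSIAN):
    """f_max(z) = n f(z) [F(z)]^(n-1), the derivative of F_max."""
    return n * dist.pdf(z) * dist.cdf(z) ** (n - 1)

def simulated_max_cdf(z, n, trials=5000, seed=1):
    """Fraction of simulated samples of size n whose largest value is <= z."""
    rng = random.Random(seed)
    hits = sum(
        max(rng.gauss(0.0, 1.0) for _ in range(n)) <= z
        for _ in range(trials)
    )
    return hits / trials

n, z = 50, 2.25
print("exact    :", exact_max_cdf(z, n))
print("simulated:", simulated_max_cdf(z, n))
```

The two numbers should agree to within the Monte Carlo noise, which for 5000 trials is a little under one per cent.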

The mathematical interest in extreme values, however, derives from a paper in 1928 by Fisher & Tippett which paved the way towards a general theory of extreme value distributions. I don’t want to go too much into details about that, but I will give a flavour by mentioning a historically important, perhaps surprising, and in any case rather illuminating example.

It turns out that for any distribution F(x) of exponential type, which means that

\lim_{x\rightarrow\infty} \frac{1-F(x)}{f(x)} = 0

then there is a stable asymptotic distribution of extreme values as n \rightarrow \infty, which is independent of the underlying distribution F(x) and which has the form

G(z) = \exp \left(-\exp \left( -\frac{(z-a_n)}{b_n} \right)\right)

where a_n and b_n are location and scale parameters; this is called the Gumbel distribution. It’s not often you come across functions of the form e^{-e^{-y}}!

This result, and others, has established a robust and powerful framework for modelling extreme events. One of course has to be particularly careful if the variables involved are not independent (e.g. part of correlated sequences) or if they are not identically distributed (e.g. if the distribution is changing with time). One also has to be aware of the possibility that an extreme data point may simply be some sort of glitch (e.g. a cosmic ray hit on a pixel, to give an astronomical example). It should also be mentioned that the asymptotic theory is what it says on the tin – asymptotic. Some distributions of exponential type converge extremely slowly to the asymptotic form. A notable example is the Gaussian, which converges at the pathetically slow rate of \sqrt{\ln(n)}! This is why I advocate using the exact distribution resulting from a fully specified model whenever this is possible.
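To get a feel for how slowly the Gaussian settles down, one can compare the exact distribution of the maximum, [F(z)]^n, with the Gumbel form directly. This is a sketch under stated assumptions rather than anything definitive: the location and scale parameters a_n and b_n below are the usual textbook choice for unit-Gaussian maxima, not something taken from this post, and the function names and the grid are mine.

```python
import math

def normal_max_cdf(z, n):
    """Exact CDF of the largest of n independent unit Gaussians, [Phi(z)]^n."""
    tail = 0.5 * math.erfc(z / math.sqrt(2.0))  # 1 - Phi(z), accurate far out in the tail
    return math.exp(n * math.log1p(-tail))

def gumbel_cdf(z, a_n, b_n):
    """Asymptotic form G(z) = exp(-exp(-(z - a_n) / b_n))."""
    return math.exp(-math.exp(-(z - a_n) / b_n))

def gaussian_norming_constants(n):
    """Assumed textbook a_n (location) and b_n (scale) for unit-Gaussian maxima, n > 1."""
    root = math.sqrt(2.0 * math.log(n))
    a_n = root - (math.log(math.log(n)) + math.log(4.0 * math.pi)) / (2.0 * root)
    b_n = 1.0 / root
    return a_n, b_n

def worst_case_gap(n, y_min=-3.0, y_max=6.0, steps=900):
    """Largest |exact - Gumbel| over a grid in the scaled variable y = (z - a_n) / b_n."""
    a_n, b_n = gaussian_norming_constants(n)
    gap = 0.0
    for i in range(steps + 1):
        y = y_min + (y_max - y_min) * i / steps
        z = a_n + b_n * y
        gap = max(gap, abs(normal_max_cdf(z, n) - gumbel_cdf(z, a_n, b_n)))
    return gap

for n in (10**2, 10**4, 10**6):
    print(n, worst_case_gap(n))
```

Increasing n by a factor of ten thousand shrinks the worst-case discrepancy only modestly, which is the practical reason for preferring the exact result whenever F(x) is actually known.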

The pitfalls are dangerous and have no doubt led to numerous misapplications of this theory, but, done properly, it’s an approach that has enormous potential.

I’ve been interested in this branch of statistical theory for a long time, since I was introduced to it while I was a graduate student by a classic paper written by my supervisor. In fact I contributed to the classic old literature on this topic myself, with a paper on extreme temperature fluctuations in the cosmic microwave background way back in 1988.

Of course there weren’t any CMB maps back in 1988, and if I had thought more about it at the time I should have realised that since this was all done using Gaussian statistics, there was a 50% chance that the most interesting feature would actually be a negative rather than positive fluctuation. It turns out that twenty-odd years on, people are actually discussing an anomalous cold spot in the data from WMAP, proving that Murphy’s law applies to extreme events…