## Hubble Tension: an “Alternative” View?

Posted in Bad Statistics, The Universe and Stuff with tags , , , , , on July 25, 2019 by telescoper

There was a new paper last week on the arXiv by Sunny Vagnozzi about the Hubble constant controversy (see this blog passim). I was going to refrain from commenting but I see that one of the bloggers I follow has posted about it so I guess a brief item would not be out of order.

Here is the abstract of the Vagnozzi paper: I posted this picture last week which is relevant to the discussion: The point is that if you allow the equation of state parameter w to vary from the value of w=-1 that it has in the standard cosmology then you get a better fit. However, it is one of the features of Bayesian inference that if you introduce a new free parameter then you have to assign a prior probability over the space of values that parameter could hold. That prior penalty is carried through to the posterior probability. Unless the new model fits observational data significantly better than the old one, this prior penalty will lead to the new model being disfavoured. This is the Bayesian statement of Ockham’s Razor.

The Vagnozzi paper represents a statement of this in the context of the Hubble tension. If a new floating parameter w is introduced the data prefer a value less than -1 (as demonstrated in the figure) but on posterior probability grounds the resulting model is less probable than the standard cosmology for the reason stated above. Vagnozzi then argues that if a new fixed value of, say, w = -1.3 is introduced then the resulting model is not penalized by having to spread the prior probability out over a range of values but puts all its prior eggs in one basket labelled w = -1.3.

This is of course true. The problem is that the value of w = -1.3 does not derive from any ab initio principle of physics but by a posteriori of the inference described above. It’s no surprise that you can get a better answer if you know what outcome you want. I find that I am very good at forecasting the football results if I make my predictions after watching Final Score

Indeed, many cosmologists think any value of w < -1 should be ruled out ab initio because they don’t make physical sense anyway.

## On Probability and Cosmology

Posted in The Universe and Stuff with tags , , , on December 12, 2018 by telescoper

I just noticed a potentially interesting paper by Martin Sahlén on the arXiv. I haven’t actually read it yet, so don’t know if I agree with it, but thought I’d point it out here for those interested in cosmology and things Bayesian.

Here is the abstract:

Modern scientific cosmology pushes the boundaries of knowledge and the knowable. This is prompting questions on the nature of scientific knowledge. A central issue is what defines a ‘good’ model. When addressing global properties of the Universe or its initial state this becomes a particularly pressing issue. How to assess the probability of the Universe as a whole is empirically ambiguous, since we can examine only part of a single realisation of the system under investigation: at some point, data will run out. We review the basics of applying Bayesian statistical explanation to the Universe as a whole. We argue that a conventional Bayesian approach to model inference generally fails in such circumstances, and cannot resolve, e.g., the so-called ‘measure problem’ in inflationary cosmology. Implicit and non-empirical valuations inevitably enter model assessment in these cases. This undermines the possibility to perform Bayesian model comparison. One must therefore either stay silent, or pursue a more general form of systematic and rational model assessment. We outline a generalised axiological Bayesian model inference framework, based on mathematical lattices. This extends inference based on empirical data (evidence) to additionally consider the properties of model structure (elegance) and model possibility space (beneficence). We propose this as a natural and theoretically well-motivated framework for introducing an explicit, rational approach to theoretical model prejudice and inference beyond data.

## Have you got a proper posterior?

Posted in Bad Statistics, The Universe and Stuff with tags , , , , on December 12, 2017 by telescoper

There’s an interesting paper on the arXiv today by Tak et al. with the title How proper are Bayesian models in the astronomical literature?’ The title isn’t all that appropriate, because the problem is not really with models’, but with the choice of prior (which should be implied by the model and other information known or assumed to be true). Moreover, I’m not sure whether the word Bayesian’ applies to the model in any meaningful way.

Anyway, The abstract is as follows:

The well-known Bayes theorem assumes that a posterior distribution is a probability distribution. However, the posterior distribution may no longer be a probability distribution if an improper prior distribution (non-probability measure) such as an unbounded uniform prior is used. Improper priors are often used in the astronomical literature to reflect on a lack of prior knowledge, but checking whether the resulting posterior is a probability distribution is sometimes neglected. It turns out that 24 articles out of 75 articles (32\%) published online in two renowned astronomy journals (ApJ and MNRAS) between Jan 1, 2017 and Oct 15, 2017 make use of Bayesian analyses without rigorously establishing posterior propriety. A disturbing aspect is that a Gibbs-type Markov chain Monte Carlo (MCMC) method can produce a seemingly reasonable posterior sample even when the posterior is not a probability distribution (Hobert and Casella, 1996). In such cases, researchers may erroneously make probabilistic inferences without noticing that the MCMC sample is from a non-existent probability distribution. We review why checking posterior propriety is fundamental in Bayesian analyses when improper priors are used and discuss how we can set up scientifically motivated proper priors to avoid the pitfalls of using improper priors.

This paper makes a point that I have wondered about on a number of occasions. One of the problems, in my opinion, is that astrophysicists don’t think enough about their choice of prior. An improper prior is basically a statement of ignorance about the result one expects in advance of incoming data. However, very often we know more than we think we do. I’ve lost track of the number of papers I’ve seen in which the authors blithely assume a flat prior when that makes no sense whatsoever on the basis of what information is available and, indeed, on the structure of the model within which the data are to be interpreted. I discuss a simple example here.

In my opinion the prior is not (as some frequentists contend) some kind of aberration. It plays a clear logical role in Bayesian inference. It can build into the analysis constraints that are implied by the choice of model framework. Even if it is used as a subjective statement of prejudice, the Bayesian approach at least requires one to put that prejudice on the table where it can be seen.

There are undoubtedly situations where we don’t know enough to assign a proper prior. That’s not necessarily a problem. Improper priors can – and do – lead to proper posterior distributions if (and it’s an important if) they include, or the  likelihood subsequently imposes, a cutoff on the prior space. The onus should be on the authors of a paper to show that their likelihood is such that it does this and produces a posterior which is well-defined probability measure (specifically that it is normalisable, ie can be made to integrate to unity). It seems that astronomers don’t always do this!

## Yellow Stars, Red Stars and Bayesian Inference

Posted in Bad Statistics, The Universe and Stuff with tags , , , , , , on May 25, 2017 by telescoper

I came across a paper on the arXiv yesterday with the title Why do we find ourselves around a yellow star instead of a red star?’.  Here’s the abstract:

M-dwarf stars are more abundant than G-dwarf stars, so our position as observers on a planet orbiting a G-dwarf raises questions about the suitability of other stellar types for supporting life. If we consider ourselves as typical, in the anthropic sense that our environment is probably a typical one for conscious observers, then we are led to the conclusion that planets orbiting in the habitable zone of G-dwarf stars should be the best place for conscious life to develop. But such a conclusion neglects the possibility that K-dwarfs or M-dwarfs could provide more numerous sites for life to develop, both now and in the future. In this paper we analyze this problem through Bayesian inference to demonstrate that our occurrence around a G-dwarf might be a slight statistical anomaly, but only the sort of chance event that we expect to occur regularly. Even if M-dwarfs provide more numerous habitable planets today and in the future, we still expect mid G- to early K-dwarfs stars to be the most likely place for observers like ourselves. This suggests that observers with similar cognitive capabilities as us are most likely to be found at the present time and place, rather than in the future or around much smaller stars.

Athough astrobiology is not really my province,  I was intrigued enough to read on, until I came to the following paragraph in which the authors attempt to explain how Bayesian Inference works:

We approach this problem through the framework of Bayesian inference. As an example, consider a fair coin that is tossed three times in a row. Suppose that all three tosses turn up Heads. Can we conclude from this experiment that the coin must be weighted? In fact, we can still maintain our hypothesis that the coin is fair because the chances of getting three Heads in a row is 1/8. Many events with a probability of 1/8 occur every day, and so we should not be concerned about an event like this indicating that our initial assumptions are flawed. However, if we were to flip the same coin 70 times in a row with all 70 turning up Heads, we would readily conclude that the experiment is fixed. This is because the probability of flipping 70 Heads in a row is about 10-22, which is an exceedingly unlikely event that has probably never happened in the history of the universe. This
informal description of Bayesian inference provides a way to assess the probability of a hypothesis in light of new evidence.

Obviously I agree with the statement right at the end that Bayesian inference provides a way to assess the probability of a hypothesis in light of new evidence’. That’s certainly what Bayesian inference does, but this informal description’ is really a frequentist rather than a Bayesian argument, in that it only mentions the probability of given outcomes not the probability of different hypotheses…

Anyway, I was so unconvinced by this description’ that I stopped reading at that point and went and did something else. Since I didn’t finish the paper I won’t comment on the conclusions, although I am more than usually sceptical. You might disagree of course, so read the paper yourself and form your own opinion! For me, it goes in the file marked Bad Statistics!

## The Bayesian Second Law of Thermodynamics

Posted in The Universe and Stuff with tags , , , on April 3, 2017 by telescoper

I post occasionally about Bayesian probability, particularly with respect to Bayesian inference, and related applications to physics and other things, such as thermodynamics, so in that light here’s a paper I stumbled across yesterday. It’s not a brand new paper – it came out on the ArXiv in 2015 – but it’s of sufficiently long-term interest to warrant sharing on here. Here’s the abstract: You can download the full paper here. There’s also an accessible commentary by one of the authors here.

The interface between thermodynamics, statistical mechanics, information theory  and probability is a fascinating one, but too often important conceptual questions remain unanswered, or indeed unasked, while the field absorbs itself in detailed calculations. Refreshingly, this paper takes the opposite approach.

## Falisifiability versus Testability in Cosmology

Posted in Bad Statistics, The Universe and Stuff with tags , , , , , on July 24, 2015 by telescoper

A paper came out a few weeks ago on the arXiv that’s ruffled a few feathers here and there so I thought I would make a few inflammatory comments about it on this blog. The article concerned, by Gubitosi et al., has the abstract: I have to be a little careful as one of the authors is a good friend of mine. Also there’s already been a critique of some of the claims in this paper here. For the record, I agree with the critique and disagree with the original paper, that the claim below cannot be justfied.

…we illustrate how unfalsifiable models and paradigms are always favoured by the Bayes factor.

If I get a bit of time I’ll write a more technical post explaining why I think that. However, for the purposes of this post I want to take issue with a more fundamental problem I have with the philosophy of this paper, namely the way it adopts “falsifiablity” as a required characteristic for a theory to be scientific. The adoption of this criterion can be traced back to the influence of Karl Popper and particularly his insistence that science is deductive rather than inductive. Part of Popper’s claim is just a semantic confusion. It is necessary at some point to deduce what the measurable consequences of a theory might be before one does any experiments, but that doesn’t mean the whole process of science is deductive. As a non-deductivist I’ll frame my argument in the language of Bayesian (inductive) inference.

Popper rejects the basic application of inductive reasoning in updating probabilities in the light of measured data; he asserts that no theory ever becomes more probable when evidence is found in its favour. Every scientific theory begins infinitely improbable, and is doomed to remain so. There is a grain of truth in this, or can be if the space of possibilities is infinite. Standard methods for assigning priors often spread the unit total probability over an infinite space, leading to a prior probability which is formally zero. This is the problem of improper priors. But this is not a killer blow to Bayesianism. Even if the prior is not strictly normalizable, the posterior probability can be. In any case, given sufficient relevant data the cycle of experiment-measurement-update of probability assignment usually soon leaves the prior far behind. Data usually count in the end.

I believe that deductvism fails to describe how science actually works in practice and is actually a dangerous road to start out on. It is indeed a very short ride, philosophically speaking, from deductivism (as espoused by, e.g., David Hume) to irrationalism (as espoused by, e.g., Paul Feyeraband).

The idea by which Popper is best known is the dogma of falsification. According to this doctrine, a hypothesis is only said to be scientific if it is capable of being proved false. In real science certain “falsehood” and certain “truth” are almost never achieved. The claimed detection of primordial B-mode polarization in the cosmic microwave background by BICEP2 was claimed by some to be “proof” of cosmic inflation, which it wouldn’t have been even if it hadn’t subsequently shown not to be a cosmological signal at all. What we now know to be the failure of BICEP2 to detect primordial B-mode polarization doesn’t disprove inflation either.

Theories are simply more probable or less probable than the alternatives available on the market at a given time. The idea that experimental scientists struggle through their entire life simply to prove theorists wrong is a very strange one, although I definitely know some experimentalists who chase theories like lions chase gazelles. The disparaging implication that scientists live only to prove themselves wrong comes from concentrating exclusively on the possibility that a theory might be found to be less probable than a challenger. In fact, evidence neither confirms nor discounts a theory; it either makes the theory more probable (supports it) or makes it less probable (undermines it). For a theory to be scientific it must be capable having its probability influenced in this way, i.e. amenable to being altered by incoming data “i.e. evidence”. The right criterion for a scientific theory is therefore not falsifiability but testability. It follows straightforwardly from Bayes theorem that a testable theory will not predict all things with equal facility. Scientific theories generally do have untestable components. Any theory has its interpretation, which is the untestable penumbra that we need to supply to make it comprehensible to us. But whatever can be tested can be regared as scientific.

So I think the Gubitosi et al. paper starts on the wrong foot by focussing exclusively on “falsifiability”. The issue of whether a theory is testable is complicated in the context of inflation because prior probabilities for most observables are difficult to determine with any confidence because we know next to nothing about either (a) the conditions prevailing in the early Universe prior to the onset of inflation or (b) how properly to define a measure on the space of inflationary models. Even restricting consideration to the simplest models with a single scalar field, initial data are required for the scalar field (and its time derivative) and there is also a potential whose functional form is not known. It is therfore a far from trivial task to assign meaningful prior probabilities on inflationary models and thus extremely difficult to determine the relative probabilities of observables and how these probabilities may or may not be influenced by interactions with data. Moreover, the Bayesian approach involves comparing probabilities of competing theories, so we also have the issue of what to compare inflation with…

The question of whether cosmic inflation (whether in general concept or in the form of a specific model) is testable or not seems to me to boil down to whether it predicts all possible values of relevant observables with equal ease. A theory might be testable in principle, but not testable at a given time if the available technology at that time is not able to make measurements that can distingish between that theory and another. Most theories have to wait some time for experiments can be designed and built to test them. On the other hand a theory might be untestable even in principle, if it is constructed in such a way that its probability can’t be changed at all by any amount of experimental data. As long as a theory is testable in principle, however, it has the right to be called scientific. If the current available evidence can’t test it we need to do better experiments. On other words, there’s a problem with the evidence not the theory.

Gubitosi et al. are correct in identifying the important distinction between the inflationary paradigm, which encompasses a large set of specific models each formulated in a different way, and an individual member of that set. I also agree – in contrast to many of my colleagues – that it is actually difficult to argue that the inflationary paradigm is currently falsfiable testable. But that doesn’t necessarily mean that it isn’t scientific. A theory doesn’t have to have been tested in order to be testable.

## Bayes, Laplace and Bayes’ Theorem

Posted in Bad Statistics with tags , , , , , , , , on October 1, 2014 by telescoper

A  couple of interesting pieces have appeared which discuss Bayesian reasoning in the popular media. One is by Jon Butterworth in his Grauniad science blog and the other is a feature article in the New York Times. I’m in early today because I have an all-day Teaching and Learning Strategy Meeting so before I disappear for that I thought I’d post a quick bit of background.

One way to get to Bayes’ Theorem is by starting with $P(A|C)P(B|AC)=P(B|C)P(A|BC)=P(AB|C)$

where I refer to three logical propositions A, B and C and the vertical bar “|” denotes conditioning, i.e. $P(A|B)$ means the probability of A being true given the assumed truth of B; “AB” means “A and B”, etc. This basically follows from the fact that “A and B” must always be equivalent to “B and A”.  Bayes’ theorem  then follows straightforwardly as $P(B|AC) = K^{-1}P(B|C)P(A|BC) = K^{-1} P(AB|C)$

where $K=P(A|C).$

Many versions of this, including the one in Jon Butterworth’s blog, exclude the third proposition and refer to A and B only. I prefer to keep an extra one in there to remind us that every statement about probability depends on information either known or assumed to be known; any proper statement of probability requires this information to be stated clearly and used appropriately but sadly this requirement is frequently ignored.

Although this is called Bayes’ theorem, the general form of it as stated here was actually first written down not by Bayes, but by Laplace. What Bayes did was derive the special case of this formula for “inverting” the binomial distribution. This distribution gives the probability of x successes in n independent “trials” each having the same probability of success, p; each “trial” has only two possible outcomes (“success” or “failure”). Trials like this are usually called Bernoulli trials, after Daniel Bernoulli. If we ask the question “what is the probability of exactly x successes from the possible n?”, the answer is given by the binomial distribution: $P_n(x|n,p)= C(n,x) p^x (1-p)^{n-x}$

where $C(n,x)= \frac{n!}{x!(n-x)!}$

is the number of distinct combinations of x objects that can be drawn from a pool of n.

You can probably see immediately how this arises. The probability of x consecutive successes is p multiplied by itself x times, or px. The probability of (n-x) successive failures is similarly (1-p)n-x. The last two terms basically therefore tell us the probability that we have exactly x successes (since there must be n-x failures). The combinatorial factor in front takes account of the fact that the ordering of successes and failures doesn’t matter.

The binomial distribution applies, for example, to repeated tosses of a coin, in which case p is taken to be 0.5 for a fair coin. A biased coin might have a different value of p, but as long as the tosses are independent the formula still applies. The binomial distribution also applies to problems involving drawing balls from urns: it works exactly if the balls are replaced in the urn after each draw, but it also applies approximately without replacement, as long as the number of draws is much smaller than the number of balls in the urn. I leave it as an exercise to calculate the expectation value of the binomial distribution, but the result is not surprising: E(X)=np. If you toss a fair coin ten times the expectation value for the number of heads is 10 times 0.5, which is five. No surprise there. After another bit of maths, the variance of the distribution can also be found. It is np(1-p).

So this gives us the probability of x given a fixed value of p. Bayes was interested in the inverse of this result, the probability of p given x. In other words, Bayes was interested in the answer to the question “If I perform n independent trials and get x successes, what is the probability distribution of p?”. This is a classic example of inverse reasoning, in that it involved turning something like P(A|BC) into something like P(B|AC), which is what is achieved by the theorem stated at the start of this post.

Bayes got the correct answer for his problem, eventually, but by very convoluted reasoning. In my opinion it is quite difficult to justify the name Bayes’ theorem based on what he actually did, although Laplace did specifically acknowledge this contribution when he derived the general result later, which is no doubt why the theorem is always named in Bayes’ honour.

This is not the only example in science where the wrong person’s name is attached to a result or discovery. Stigler’s Law of Eponymy strikes again! So who was the mysterious mathematician behind this result? Thomas Bayes was born in 1702, son of Joshua Bayes, who was a Fellow of the Royal Society (FRS) and one of the very first nonconformist ministers to be ordained in England. Thomas was himself ordained and for a while worked with his father in the Presbyterian Meeting House in Leather Lane, near Holborn in London. In 1720 he was a minister in Tunbridge Wells, in Kent. He retired from the church in 1752 and died in 1761. Thomas Bayes didn’t publish a single paper on mathematics in his own name during his lifetime but was elected a Fellow of the Royal Society (FRS) in 1742.

The paper containing the theorem that now bears his name was published posthumously in the Philosophical Transactions of the Royal Society of London in 1763. In his great Philosophical Essay on Probabilities Laplace wrote:

Bayes, in the Transactions Philosophiques of the Year 1763, sought directly the probability that the possibilities indicated by past experiences are comprised within given limits; and he has arrived at this in a refined and very ingenious manner, although a little perplexing.

The reasoning in the 1763 paper is indeed perplexing, and I remain convinced that the general form we now we refer to as Bayes’ Theorem should really be called Laplace’s Theorem. Nevertheless, Bayes did establish an extremely important principle that is reflected in the title of the New York Times piece I referred to at the start of this piece. In a nutshell this is that probabilities of future events can be updated on the basis of past measurements or, as I prefer to put it, “one person’s posterior is another’s prior”.