## Bayes in the dock (again)

This morning on Twitter there appeared a link to a blog post reporting that the Court of Appeal had rejected the use of Bayesian probability in legal cases. I recommend that anyone interested in probability read it, as it gives a fascinating insight into how poorly the concept is understood.

Although this is a new report about a new case, it’s actually not an entirely new conclusion. I blogged about a similar case a couple of years ago, in fact. The earlier story concerned an erroneous argument given during a trial about the significance of a match between a footprint found at a crime scene and footwear belonging to a suspect. The judge took exception to the fact that the figures being used were not known sufficiently accurately to make a reliable assessment, and thus decided that Bayes’ theorem shouldn’t be used in court unless the data involved in its application were “firm”.
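To see why “firm” data matter, here is a minimal sketch of how strongly the answer depends on an input that is only roughly known. Every number in it is made up for illustration and has nothing to do with the figures used in either case; the function name `posterior_from_lr` and the choice of prior are likewise just assumptions of the sketch.

```python
def posterior_from_lr(prior, likelihood_ratio):
    """Posterior probability from a prior and a likelihood ratio (odds form of Bayes' theorem)."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical prior probability that the suspect's shoe made the print.
prior = 0.01

# Hypothetical likelihood ratios: suppose the evidence is known to favour
# a match only to within a factor of ten either way.
results = {lr: posterior_from_lr(prior, lr) for lr in (10.0, 100.0, 1000.0)}

for lr, p in results.items():
    print(f"likelihood ratio {lr:6.0f} -> posterior {p:.3f}")
```

With these invented inputs the posterior runs from under 10% to over 90% as the likelihood ratio moves across its plausible range, which is roughly the kind of instability the judge seems to have been worried about.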

If you read the Guardian article to which I’ve provided a link you will see that there’s a lot of reaction from the legal establishment and statisticians about this, focussing on the forensic use of probabilistic reasoning. This all reminds me of the tragedy of the Sally Clark case and what a disgrace it is that nothing has been done since then to stop the misrepresentation of statistical arguments in trials. Some of my Bayesian colleagues have expressed dismay at the judge’s opinion.

My reaction to this affair is more muted than you would probably expect. First thing to say is that this is really not an issue relating to the Bayesian versus frequentist debate at all. It’s about a straightforward application of Bayes’ theorem which, as its name suggests, is a *theorem*; actually it’s just a straightforward consequence of the sum and product laws of the calculus of probabilities. No-one, not even the most die-hard frequentist, would argue that Bayes’ theorem is false. What happened in this case is that an “expert” applied Bayes’ theorem to unreliable data and by so doing obtained misleading results. The issue is not Bayes’ theorem *per se*, but the application of it to inaccurate data. Garbage in, garbage out. There’s no place for garbage in the courtroom, so in my opinion the judge was quite right to throw this particular argument out.
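For anyone who wants to see how little is involved, the derivation can be sketched in a couple of lines. The notation is my choice for this sketch: $H$ is a hypothesis (say, “the suspect made the footprint”) and $E$ is the evidence.

```latex
\begin{align}
  % Product rule, written both ways round
  P(H, E) &= P(H \mid E)\, P(E) = P(E \mid H)\, P(H) \\
  % Divide by P(E), assuming P(E) > 0: Bayes' theorem
  P(H \mid E) &= \frac{P(E \mid H)\, P(H)}{P(E)} \\
  % Sum rule over H and its negation gives the denominator
  P(E) &= P(E \mid H)\, P(H) + P(E \mid \neg H)\, P(\neg H)
\end{align}
```

Nothing here depends on how the probabilities are interpreted, only on the rules for manipulating them. The trouble starts when numbers of doubtful provenance are substituted for $P(E \mid H)$, $P(E \mid \neg H)$ and $P(H)$.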

But while I’m on the subject of using Bayesian logic in the courts, let me add a few wider comments. First, I think that Bayesian reasoning provides a rigorous mathematical foundation for the process of assessing quantitatively the extent to which evidence supports a given theory or interpretation. As such it describes accurately how scientific investigations proceed by updating probabilities in the light of new data. It also describes how a criminal investigation works.

What Bayesian inference is *not* good at is achieving closure in the form of a definite verdict. There are two sides to this. One is that the maxim “innocent until proven guilty” cannot be incorporated in Bayesian reasoning. If one assigns a zero *prior* probability of guilt then no amount of evidence will be able to change this into a non-zero posterior probability; the required burden is infinite. On the other hand, there is the problem that the jury must decide guilt in a criminal trial “beyond reasonable doubt”. But how much doubt is reasonable, exactly? And will a jury understand a probabilistic argument anyway?
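The point about a zero prior is easy to check numerically. The following is a minimal sketch, not a model of any real trial: the evidence values are invented, the function names are hypothetical, and the repeated updating assumes the pieces of evidence are independent given guilt or innocence.

```python
def posterior(prior, p_e_given_guilty, p_e_given_innocent):
    """Posterior probability of guilt after one piece of evidence (Bayes' theorem)."""
    numerator = p_e_given_guilty * prior
    evidence = numerator + p_e_given_innocent * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return numerator / evidence


def update_all(prior, evidence_items):
    """Apply Bayes' theorem once per item, assuming conditional independence."""
    p = prior
    for p_guilty, p_innocent in evidence_items:
        p = posterior(p, p_guilty, p_innocent)
    return p


# Hypothetical evidence: five items, each 1000 times more likely if guilty.
items = [(0.9, 0.0009)] * 5

zero_prior_result = update_all(0.0, items)   # stays exactly 0.0
tiny_prior_result = update_all(1e-9, items)  # ends up very close to 1

print(zero_prior_result, tiny_prior_result)
```

A prior of exactly zero is never moved, however damning the evidence, whereas a prior of one in a billion is overwhelmed by the same five items. So whatever “innocent until proven guilty” means, it cannot sensibly mean a prior of zero.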

In pure science we never really need to achieve this kind of closure, collapsing the broad range of probability into a simple “true” or “false”, because this is a process of continual investigation. It’s a reasonable inference, for example, based on supernovae and other observations that the expansion of the Universe is accelerating. But is it *proven* that this is so? I’d say “no”, and don’t think my doubts are at all unreasonable…

So what I’d say is that while statistical arguments are extremely important for investigating crimes – narrowing down the field of suspects, assessing the reliability of evidence, establishing lines of inquiry, and so on – I don’t think they should *ever* play a central role once the case has been brought to court unless there’s much clearer guidance given to juries and stricter monitoring of so-called “expert” witnesses.

I’m sure various readers will wish to express diverse opinions on this case so, as usual, please feel free to contribute through the box below!

February 28, 2013 at 1:23 pm

There’s a nice summary of similar cases and the use of Bayes’ theorem in court here:

February 28, 2013 at 2:44 pm

One should ask a judge exactly what is meant by “innocent until proven guilty” as he or she will be forced to waffle and will soon realise that fact and hopefully take the opportunity to learn. If you begin by supposing somebody is DEFINITELY innocent of a crime then you will ascribe all evidence to the contrary to coincidence and/or alternative explanations, however implausible. That is intuitive and is exactly what Bayes’ theorem says when applied to the situation. (If you begin with zero probability of guilt before learning the evidence then you end with zero probability of guilt after learning the evidence, since the latter is proportional to the former according to Bayes.) Once again Bayes, applied correctly, matches intuition. But does “beginning with zero probability of guilt” correspond to what lawyers mean by “innocent until proven guilty”? Therein lies the question…

I don’t blame the judges who made this present ruling; like most people today they are confused over the definition of probability and have got hung up over alleged differences in application of the concept to the past and the future. But if Bayes’ theorem is to be thrown out of court then it is time to use a rhetorical trick I have been advocating for some years. What you actually need in any problem involving uncertainty is a number representing how strongly one thing would imply another. More formally, the number i(A|B) is a measure of how strongly one binary proposition ‘B’ would, if true, imply the truth of a second binary proposition ‘A’, according to relations known between the referents of the two propositions. From the (Boolean) algebra of propositions it can be shown that the corresponding algebra of these numbers related to propositions is just the sum and product rules. On this basis I like to call the number “probability” but if anybody objects – traditionally frequentist statisticians, but today perhaps judges – then let’s call it something else (implicability?) and then we can use it and solve the problem anyway. As for Bayes’ theorem, don’t use the name if judges dislike it; just use the sum and product rules, from which it trivially follows.

In another recent case (that of Chris Huhne’s ex-wife) a jury was discharged after seeking clarification from the judge on what phrases like “reasonable doubt” meant. The judge refused to answer, and I have some sympathy with both sides of that exchange. This is all about the bolting of decision theory on to probability theory in order to come up with a verdict.

The appearance of Bayes in court in recent years has been driven mainly by DNA evidence. I regard this as a fleeting phase, because DNA testing will soon be so good as to identify people uniquely except in the case of identical twins.

February 28, 2013 at 10:44 pm

If this had been a criminal case I would have understood; beyond reasonable doubt means just that, beyond any doubt that a ‘reasonable man’ might have. But this is a civil case which is supposed to be decided on the ‘balance of probabilities’ FFS!!!!

March 8, 2013 at 7:18 pm

Well, the very fact that a person is being called an accused shows that the prior on the possibility of him/her having committed a crime is non-zero. Several billion people do not stand accused in any given case – it is these people who strictly have a prior probability of zero. The maxim “innocent until proven guilty” should perhaps be taken to mean “innocent if posterior probability doesn’t exceed the prior by a certain factor”. The question then boils down to quantifying this factor. Circumstantial evidence, motivation for crime, etc. may be used (are implicitly used?) in estimating the prior.