Guest Post – Bayesian Book Review

Posted in Bad Statistics, Books, Talks and Reviews on May 30, 2011 by telescoper

My regular commenter Anton circulated this book review by email yesterday and it stimulated quite a lot of reaction. I haven’t read the book myself, but I thought it would be fun to post his review on here to see whether it provokes similar responses. You can find the book on Amazon here (UK) or here (USA). If you’re not completely au fait with Bayesian probability and the controversy around it, you might try reading one of my earlier posts about it, e.g. this one. I hope I can persuade some of the email commenters to upload their contributions through the box below!

-0-

The Theory That Would Not Die: How Bayes’ Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy

by Sharon Bertsch McGrayne

I found reading this book, which is a history of Bayes’ theorem written for the layman, to be deeply frustrating. The author does not really understand what probability IS – which is the key to all cogent writing on the subject. She never mentions the sum and product rules, or that Bayes’ theorem is an easy consequence of them. She notes, correctly, that Bayesian methods or something equivalent to them have been rediscovered advantageously again and again in an amazing variety of practical applications, and says that this is because they are pragmatically better than frequentist sampling theory. But she never asks the obvious question: why do they work better, and what deeper rationale explains this? R.T. Cox is not mentioned. Ed Jaynes is mentioned only in passing as someone whose Bayesian fervour supposedly put people off.

The author is correct that computer applications have catalysed the Bayesian revolution, but in the pages on image processing and other general inverse problems (pp. 218-21) she manages to miss the key work through the 1980s of Steve Gull and John Skilling, and you will not find “Maximum entropy” in the index. She does get the key role of Markov Chain Monte Carlo methods in computer implementation of Bayesian methods, however. But I can’t find David MacKay either, who deserves to be in the relevant section about modern applications.

On the other hand, as a historian of Bayesianism from Bayes himself to about 1960, she is full of superb anecdotes and information about people who are to us merely names on the top of papers, or whose personalities are mentioned tantalisingly briefly in Jaynes’ writing. For this material alone I recommend the book to Bayesians of our sort and am glad that I bought it.

A Dutch Book

Posted in Bad Statistics on October 28, 2009 by telescoper

When I was a research student at Sussex University I lived for a time in Hove, close to the local Greyhound track. I soon discovered that going to the dogs could be both enjoyable and instructive. The card for an evening would usually consist of ten races, each involving six dogs. It didn’t take long for me to realise that it was quite boring to watch the greyhounds unless you had a bet, so I got into the habit of making small investments on each race. In fact, my usual bet would involve trying to predict both first and second place, the kind of combination bet which has longer odds and therefore generally has a better return if you happen to get it right.

The simplest way to bet is through a totalising pool system (called “The Tote”) in which the return on a successful bet is determined by how much money has been placed on that particular outcome; the higher the amount staked, the lower the return for an individual winner. The Tote accepts very small bets, which suited me because I was an impoverished student in those days. The odds at any particular time are shown on the giant Tote Board you can see in the picture above.

However, every now and again I would place bets with one of the independent trackside bookies who set their own odds. Here the usual bet is for one particular dog to win, rather than on 1st/2nd place combinations. Sometimes these odds were much more generous than those that were showing on the Tote Board so I gave them a go. When bookies offer long odds, however, it’s probably because they know something the punters don’t and I didn’t win very often.

I often watched the bookmakers in action, chalking the odds up, sometimes lengthening them to draw in new bets or sometimes shortening them to discourage bets if they feared heavy losses. It struck me that they have to be very sharp when they change odds in this way because it’s quite easy to make a mistake that might result in a combination bet guaranteeing a win for a customer.

With six possible winners it takes a while to work out if there is such a strategy but, to explain what I mean, consider a race with three competitors. The bookie assigns odds as follows: (1) even money; (2) 3/1 against; and (3) 4/1 against. The quoted odds imply probabilities to win of 50% (1 in 2), 25% (1 in 4) and 20% (1 in 5) respectively. These add up to only 95%, which is the source of the trouble.

Now suppose you place three different bets: £100 on (1) to win, £50 on (2) and £40 on (3). Your total stake is then £190. If (1) succeeds you win £100 and also get your stake back; you lose the other stakes, but you have turned £190 into £200 so are up £10 overall. If (2) wins you also come out with £200: your £50 stake plus £150 for the bet. Likewise if (3) wins: your £40 stake plus £160 for the bet. You win whatever the outcome of the race. It’s not a question of being lucky, just that the odds have been designed inconsistently.
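The arithmetic of this example can be checked with a short script. This is only a sketch: the function and variable names are invented for illustration, and it assumes that odds of n/1 against correspond to an implied probability of 1/(n+1).

```python
from fractions import Fraction

def implied_probability(odds_against):
    """Probability implied by odds of `odds_against`/1 (even money is 1)."""
    return Fraction(1, 1 + odds_against)

# Odds and stakes from the three-runner example (names are illustrative).
odds = {1: 1, 2: 3, 3: 4}         # even money, 3/1 against, 4/1 against
stakes = {1: 100, 2: 50, 3: 40}   # pounds staked on each runner

implied = {runner: implied_probability(o) for runner, o in odds.items()}
total_implied = sum(implied.values())   # 1/2 + 1/4 + 1/5
total_stake = sum(stakes.values())      # 100 + 50 + 40

def profit_if_wins(winner):
    """Net gain if `winner` comes first: winnings plus returned stake, minus all stakes."""
    returned = stakes[winner] * (1 + odds[winner])
    return returned - total_stake

profits = {runner: profit_if_wins(runner) for runner in odds}

print(total_implied)  # 19/20, i.e. the implied probabilities sum to less than 1
print(profits)        # {1: 10, 2: 10, 3: 10}
```

The stakes here are in proportion to the implied probabilities, which is what makes the payout the same (£200) whichever runner wins.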

I stress that I never saw a bookie actually do this. If one did, he’d soon go out of business. An inconsistent set of odds like this is called a Dutch Book, and a bet which guarantees the bettor a positive return is often called a lock. It’s also the principle behind many share-trading schemes based on the idea of arbitrage.

It was only much later that I realised that there is a nice way of turning the Dutch Book argument around to derive the laws of probability from the principle that the odds be consistent, i.e. so that they do not lead to situations where a Dutch Book arises.

To see this, I’ll just generalise the above discussion a bit. Imagine you are a gambler interested in betting on the outcome of some event. If the game is fair, you would expect to pay a stake px to win an amount x if the probability of the winning outcome is p.

Now imagine that there are several possible outcomes, each with different probabilities, and you are allowed to bet a different amount on each of them. Clearly, the bookmaker has to be careful that there is no combination of bets that guarantees that you (the punter) will win.

Now consider a specific example. Suppose there are three possible outcomes; call them A, B, and C. Your bookie will accept the following bets: a bet on A with a payoff $x_A$, for which the stake is $P_A x_A$; a bet on B for which the return is $x_B$ and the stake $P_B x_B$; and a bet on C with stake $P_C x_C$ and payoff $x_C$.

Think about what happens in the special case where the events A and B are mutually exclusive (which just means that they can’t both happen) and C is just given by A “OR” B, i.e. the event that either A or B happens. There are then three possible outcomes.

First, if A happens but B does not happen the net return to the gambler is

$R=x_A(1-P_A)-x_BP_B+x_C(1-P_C).$

The first term represents the difference between the stake and the return for the successful bet on A, the second is the lost stake corresponding to the failed bet on the event B, and the third term arises from the successful bet on C. The bet on C succeeds because if A happens then A “OR” B must happen too.

Alternatively, if B happens but A does not happen, the net return is

$R=-x_A P_A -x_B(1-P_B)+x_C(1-P_C),$

in a similar way to the previous result except that the bet on A loses, while those on B and C succeed.

Finally there is the possibility that neither A nor B succeeds: in this case the gambler does not win at all, and the return (which is bound to be negative) is

$R=-x_AP_A-x_BP_B -x_C P_C.$

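Notice that A and B can’t both happen because I have assumed that they are mutually exclusive. If the matrix of coefficients multiplying the stakes in these three expressions could be inverted, the gambler could pick stakes (with a negative stake meaning that he takes the bookie’s side of that bet) to make R positive in every case. For the game to be consistent (in the sense I’ve discussed above) we therefore need to have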

$\textrm{det} \left( \begin{array}{ccc} 1- P_A & -P_B & 1-P_C \\ -P_A & 1-P_B & 1-P_C\\ -P_A & -P_B & -P_C \end{array} \right)=P_A+P_B-P_C=0.$

This means that

$P_C=P_A+P_B$

so, since C is the event A “OR” B, this means that the probability that one or other of two mutually exclusive events A and B happens is the sum of the separate probabilities of A and B. This is usually taught as one of the axioms from which the calculus of probabilities is derived, but what this discussion shows is that it can itself be derived in this way from the principle of consistency. It is the only way to combine probabilities that is consistent from the point of view of betting behaviour. Similar logic leads to the other rules of probability, including those for events which are not mutually exclusive.
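One way to check the value of the determinant used in this argument is to subtract the third row from each of the first two, which leaves the determinant unchanged:

$\textrm{det} \left( \begin{array}{ccc} 1 & 0 & 1 \\ 0 & 1 & 1\\ -P_A & -P_B & -P_C \end{array} \right)$

Expanding along the first row then gives

$1\cdot(-P_C+P_B) - 0 + 1\cdot(0+P_A) = P_A+P_B-P_C.$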

Notice that this kind of consistency has nothing to do with averages over a long series of repeated bets: if the rules are violated then the game itself is rigged.
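To see what “rigged” means in practice, here is a minimal sketch that takes a set of quoted probabilities with P_C not equal to P_A + P_B and finds stakes that return the same positive amount whatever happens. The function names and the example numbers are invented for illustration. It assumes the gambler may stake a negative amount, meaning that he takes the bookie’s side of that bet, and it uses Cramer’s rule simply as a convenient way to solve three equations exactly.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def payoff_matrix(pa, pb, pc):
    """Coefficients of (x_A, x_B, x_C) in the net return R for each outcome."""
    return [
        [1 - pa, -pb, 1 - pc],  # A happens (so C happens too)
        [-pa, 1 - pb, 1 - pc],  # B happens (so C happens too)
        [-pa, -pb, -pc],        # neither A nor B happens
    ]

def stakes_for_returns(pa, pb, pc, target):
    """Solve for stakes giving net return target[k] in outcome k (Cramer's rule)."""
    m = payoff_matrix(pa, pb, pc)
    d = det3(m)
    if d == 0:
        raise ValueError("determinant is zero: returns cannot be chosen freely")
    stakes = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = target[r]
        stakes.append(det3(replaced) / d)
    return stakes

def net_returns(pa, pb, pc, stakes):
    """Net return R in each of the three outcomes for the given stakes."""
    m = payoff_matrix(pa, pb, pc)
    return [sum(m[r][c] * stakes[c] for c in range(3)) for r in range(3)]

# Hypothetical inconsistent quotes: P_A + P_B = 3/4 but P_C = 3/5.
pa, pb, pc = Fraction(1, 2), Fraction(1, 4), Fraction(3, 5)
stakes = stakes_for_returns(pa, pb, pc, [Fraction(1)] * 3)
returns = net_returns(pa, pb, pc, stakes)

print(stakes)   # x_A = x_B = -20/3, x_C = 20/3
print(returns)  # a net return of 1 in every outcome
```

In this example C is priced too cheaply relative to A and B, so the winning combination is to back C and take the bookie’s side on A and B. No appeal to long-run frequencies is needed: the profit is locked in on a single event.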

A much more elegant and complete derivation of the laws of probability has been set out by Cox, but I find the Dutch Book argument a nice practical way to illustrate the important difference between being unlucky and being irrational.

P.S. For legal reasons I should point out that, although I was a research student at the University of Sussex, I do not have a PhD. My doctorate is a DPhil.