## Test Odds

I’m very grateful to Daniel Mortlock for sending me this fascinating plot. It comes from the cricket pages of The Times Online and it shows how the probability of the various possible outcomes of the Final Ashes Test at the Oval evolved with time according to their “Hawk-Eye Analysis”.

I think I should mention that Daniel is an Australian supporter, so this graph must make painful viewing for him! Anyway, it’s a fascinating plot, which I read as an application of Bayesian probability.

At the beginning of the match, a prior probability is assigned to each of the three possible outcomes: England win (blue); Australia win (yellow); and Draw (grey). It looks like these are roughly in the ratio 1:2:2. No details are given as to how these were arrived at, but they must have taken into account the fact that Australia thrashed England in the previous match at Headingley. Information from previous Tests at the Oval was presumably also included. I don’t know if the fact that England won the toss and decided to bat first altered the prior odds significantly, but it should have.

Anyway, what happens next depends on how sophisticated a model is used to determine the subsequent evolution of the probabilities. In good Bayesian fashion, information is incorporated in a likelihood function determined by the model and this is used to update the prior to produce a posterior probability. This is passed on as a prior for the next time step. And so it goes on until the end of the match where, regardless of what prior is chosen, the data force the model to the correct conclusion.
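To make the updating step concrete, here is a minimal sketch in Python. It is emphatically not the Hawk-Eye algorithm, whose details are unknown: the prior is my rough 1:2:2 reading of the plot, and the likelihood values and the name `bayes_update` are invented purely for illustration.

```python
def bayes_update(prior, likelihood):
    """Return the posterior over outcomes: prior times likelihood, renormalised."""
    unnormalised = {k: prior[k] * likelihood[k] for k in prior}
    total = sum(unnormalised.values())
    return {k: v / total for k, v in unnormalised.items()}

# Prior read off the plot as roughly 1:2:2 (England : Australia : Draw).
prior = {"England": 0.2, "Australia": 0.4, "Draw": 0.4}

# Hypothetical likelihoods P(observation | outcome) for two successive
# observations, e.g. "England score quickly" and "Australia lose quick wickets".
# These numbers are made up; a real model would derive them from match data.
observations = [
    {"England": 0.5, "Australia": 0.3, "Draw": 0.2},
    {"England": 0.6, "Australia": 0.1, "Draw": 0.3},
]

belief = dict(prior)
for likelihood in observations:
    # The posterior from one step becomes the prior for the next.
    belief = bayes_update(belief, likelihood)
```

With these made-up numbers the first observation lifts England from 0.2 to one third, and each later observation is folded in the same way.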

The red dots show the fall of wickets, but the odds fluctuate continually in accord with variables such as scoring rate, number of wickets, and, presumably, the weather. Some form of difference equation is clearly being used, but we don’t know the details.

England got off to a pretty good start, so their probability to win started to creep up, but not by all that much, presumably because the model didn’t think their first-innings total of 332 was enough against a good batting side like Australia. However, the odds of a draw fell more significantly as a result of fairly quick scoring and the lack of any rain delays.

When the Australians batted they were going well at the start so England’s probability to win started to fall and theirs to rise. But when they started to lose quick wickets (largely to Stuart Broad), the blue and yellow trajectories swap over and England became favourites by a large margin. Despite a wobble when they lost 3 early wickets and some jitters when Australia’s batsmen put healthy partnerships together, England remained the more probable to win from that point to the end.

Although it all basically makes some sense, there are some curiosities. Daniel Mortlock asked, for example, whether Australia were really as likely to win at about 200 for 2 on the fourth day as England were when Australia were 70 without loss in the first innings. That’s what the graph seems to say. His reading of this is that too much stock is placed in the difficulty of breaking a big (100+ runs) partnership, as the curves seem to “accelerate” when the batsmen seem to be doing well.

I wonder how new information is included in general terms. Australia’s poor first innings batting (160 all out) in any case only reduced their win probability to about the level that England started at. How was their batting in the first innings balanced against their performance in the last match?

I’d love to know more about the algorithm used in this analysis, but I suspect it is copyright. There may be a good reason for not disclosing it. I have noticed in recent years that bookmakers have been setting extremely parsimonious odds for cricket outcomes. Gone are the days (Headingley 1981) when bookmakers offered 500-1 against England to beat Australia, which they then proceeded to do. In those days the bookmakers relied on expert advisors to fix their odds. I believe it was the late Godfrey Evans who persuaded them to offer 500-1. I’m not sure if they ever asked him again!

The system on which Hawk-Eye is based is much more conservative. Even on the last day of the Test, odds against an Australian victory remained around 4-1 until they were down to their last few wickets. Notice also that the odds against a draw were never as long as they should have been, even when that outcome was clearly virtually impossible. On the morning of the final day I could only find 10-1 against the draw, which I think is remarkably ungenerous. However, even with an England victory a near certainty you could still find odds like 1-4. It seems like the system doesn’t like to produce extremely long or extremely short odds.
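For anyone who wants to compare quoted odds with the probabilities on the plot, the conversion is simple: odds of a-b against correspond to an implied probability of b/(a+b). The sketch below uses a function name of my own choosing, `implied_probability`, and ignores any bookmaker's margin, so the results are only indicative.

```python
def implied_probability(against, on):
    """Implied probability for fractional odds of `against`-`on` against an outcome."""
    return on / (against + on)

# Odds mentioned in the post.
p_australia = implied_probability(4, 1)      # 4-1 against an Australian win
p_draw = implied_probability(10, 1)          # 10-1 against the draw
p_england = implied_probability(1, 4)        # 1-4, i.e. odds-on for England
p_headingley = implied_probability(500, 1)   # the famous 500-1 of 1981

for label, p in [("Australia", p_australia), ("Draw", p_draw),
                 ("England", p_england), ("Headingley 1981", p_headingley)]:
    print(f"{label}: {p:.3f}")
```

So 4-1 against is a probability of 0.2, 1-4 is 0.8, and 500-1 is about 0.002.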

Perhaps the bookies are now using analyses like this to set their odds, which would explain why betting on cricket isn’t as much fun as it used to be. On the other hand, if the system is predisposed against very short odds then maybe that’s the kind of bet to make in order to win. Things like this may be why the algorithm behind Hawk-Eye isn’t published…

### 9 Responses to “Test Odds”

1. Hi Peter,

Nice to see this graphic (which combines two loves of mine, cricket and probability) included in one of your blog posts. The mysteries you identify are generally the same ones I’ve been intrigued by, but I haven’t pursued these questions with much enthusiasm as I don’t think there’s enough information to unravel the algorithm.

That said, there are some fascinating little passages where the probabilities bounce around in a most exciting way while not much was changing (e.g., the jags during the big partnership at the start of day 4). In part this might stem from not knowing what the independent variable is (time? overs? fraction of match remaining?). This relates to your comment about weather, which is demonstrably included, because the draw probability in the third Test leapt up when a day was lost to rain.

Maybe when we all have lots of time on our hands we can extract the data from these figures and reverse engineer this intriguing algorithm . . .

Daniel

2. telescoper Says:

It would be interesting to match the time series up with a TV broadcast of the action. If the model is very sophisticated one might imagine that it takes into account plays-and-misses or close decisions which in the end just appear as dot balls. Certainly as a spectator one can sense when the batsmen are struggling. One can sense what the rest of the crowd are thinking too!

It may also be very sensitive to small-scale flurries of activity like expensive overs. As a first stab one could look at the aggregate score and its first derivative to see if they correlate with the odds in a straightforward way.
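As a rough sketch of that first stab, one could difference the aggregate score over by over and correlate the result with the change in win probability. The numbers below are made up for illustration (nobody has yet extracted the real data from the plot), and the helper names `first_difference` and `pearson` are my own.

```python
def first_difference(values):
    """Discrete analogue of the first derivative: change between successive samples."""
    return [b - a for a, b in zip(values, values[1:])]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical data: aggregate score and batting side's win probability,
# sampled at the end of successive overs. NOT taken from the real plot.
score = [0, 4, 9, 9, 21, 30, 31, 45]
win_prob = [0.20, 0.21, 0.22, 0.21, 0.24, 0.26, 0.25, 0.29]

runs_per_over = first_difference(score)
prob_change = first_difference(win_prob)
r = pearson(runs_per_over, prob_change)
print(f"correlation between scoring rate and change in odds: {r:.2f}")
```

A strong correlation in the real data would suggest the model leans heavily on scoring rate; a weak one would point to other inputs such as wickets or time remaining.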

3. nit picker Says:

A small nit to pick – there are actually four possible outcomes to a cricket test match – a win for either side, a draw and a tie 🙂

• telescoper Says:

True, but the tie would have the same effect as a draw in terms of the overall Ashes series and is in any case a very rare event in Test cricket.

4. Maybe Nit Picker should also have added in the possibility of forfeit, which is presumably higher any time Pakistan is playing.

5. Tom Shanks Says:

Have now repositioned from Rio to just south of Craster
in Northumberland. We’re staying in the Bathing House right
on the Coastal Path. As you will know Peter the beaches
up here can easily compete with Copacabana and we have
been very lucky with the weather – at least until Hurricane
Bill strikes later tonight. Therefore although it’s tempting to
get into a debate on whether you really do need subjective
priors to predict cricket probabilities – I think they should
be objectively using maximum likelihood and the multinomial
distribution personally – I think the sun and sea at
Embleton beach are going to win out here – at least till
my stocks of caipirinha run oot!

• telescoper Says:

My physics teacher at School always used to say, if a pupil gave an answer such as “ten” without giving it in proper units (e.g. metres per second), “Ten what? Craster kippers?”

Have a good breakfast.

6. When watching (or, in these days of Sky, reading about) Test cricket I often come across statements about “game having been won in this session” or about “a match-winning innings”. These are nominally rather subjective notions that could now be put on a more objective footing by applying the HawkEye algorithm. In the case of the fifth Test shown above it’s clear that if the match was “won” anywhere it was on that horrible second afternoon when Australia lost 7/80-odd. And that, in turn, was precipitated by the bowling of Stuart Broad, so here we have a nice graphical representation of the fact that he bowled a match-winning spell.

Another thought that occurs to me is that this algorithm could be applied retrospectively to all matches for which sufficiently detailed data exist. One could then provide the – or at least an – answer to questions like “What was the lowest winning probability for a team that went on to win?” I’m guessing that England at Headingley in 1981 would be the answer and that the probability would be at around the per cent level, roughly consistent with the bookies’ odds.