## Archive for statistics

Posted in Bad Statistics, Science Politics, The Universe and Stuff on July 2, 2015 by telescoper

I saw an interesting article in Nature the opening paragraph of which reads:

The past few years have seen a slew of announcements of major discoveries in particle astrophysics and cosmology. The list includes faster-than-light neutrinos; dark-matter particles producing γ-rays; X-rays scattering off nuclei underground; and even evidence in the cosmic microwave background for gravitational waves caused by the rapid inflation of the early Universe. Most of these turned out to be false alarms; and in my view, that is the probable fate of the rest.

The piece goes on to berate physicists for being too trigger-happy in claiming discoveries, the BICEP2 fiasco being a prime example. I agree that this is a problem, but it goes far beyond physics. In fact it’s endemic throughout science. A major cause of it is abuse of statistical reasoning.

Anyway, I thought I’d take the opportunity to re-iterate why statistics and statistical reasoning are so important to science. In fact, I think they lie at the very core of the scientific method, although I am still surprised how few practising scientists are comfortable with even basic statistical language. A more important problem is the popular impression that science is about facts and absolute truths. It isn’t. It’s a process. In order to advance it has to question itself. Getting this message wrong – whether by error or on purpose – is immensely dangerous.

Statistical reasoning also applies to many facets of everyday life, including business, commerce, transport, the media, and politics. Probability even plays a role in personal relationships, though mostly at a subconscious level. It is a feature of everyday life that science and technology are deeply embedded in every aspect of what we do each day. Science has given us greater levels of comfort, better health care, and a plethora of labour-saving devices. It has also given us unprecedented ability to destroy the environment and each other, whether through accident or design.

Civilized societies face rigorous challenges in this century. We must confront the threat of climate change and forthcoming energy crises. We must find better ways of resolving conflicts peacefully lest nuclear or conventional weapons lead us to global catastrophe. We must stop large-scale pollution or systematic destruction of the biosphere that nurtures us. And we must do all of these things without abandoning the many positive things that science has brought us. Abandoning science and rationality by retreating into religious or political fundamentalism would be a catastrophe for humanity.

Unfortunately, recent decades have seen a wholesale breakdown of trust between scientists and the public at large. This is due partly to the deliberate abuse of science for immoral purposes, and partly to the sheer carelessness with which various agencies have exploited scientific discoveries without proper evaluation of the risks involved. The abuse of statistical arguments has undoubtedly contributed to the suspicion with which many individuals view science.

There is an increasing alienation between scientists and the general public. Many fewer students enrol for courses in physics and chemistry than a few decades ago. Fewer graduates mean fewer qualified science teachers in schools. This is a vicious cycle that threatens our future. It must be broken.

The danger is that the decreasing level of understanding of science in society means that knowledge (as well as its consequent power) becomes concentrated in the minds of a few individuals. This could have dire consequences for the future of our democracy. Even as things stand now, very few Members of Parliament are scientifically literate. How can we expect to control the application of science when the necessary understanding rests with an unelected “priesthood” that is hardly understood by, or represented in, our democratic institutions?

Very few journalists or television producers know enough about science to report sensibly on the latest discoveries or controversies. As a result, important matters that the public needs to know about do not appear at all in the media, or if they do it is in such a garbled fashion that they do more harm than good.

Years ago I used to listen to radio interviews with scientists on the Today programme on BBC Radio 4. I even did such an interview once. It is a deeply frustrating experience. The scientist usually starts by explaining what the discovery is about in the way a scientist should, with careful statements of what is assumed, how the data is interpreted, and what other possible interpretations might be and the likely sources of error. The interviewer then loses patience and asks for a yes or no answer. The scientist tries to continue, but is badgered. Either the interview ends as a row, or the scientist ends up stating a grossly oversimplified version of the story.

Some scientists offer the oversimplified version at the outset, of course, and these are the ones that contribute to the image of scientists as priests. Such individuals often believe in their theories in exactly the same way that some people believe religiously. Not with the conditional and possibly temporary belief that characterizes the scientific method, but with the unquestioning fervour of an unthinking zealot. This approach may pay off for the individual in the short term, in popular esteem and media recognition – but when it goes wrong it is science as a whole that suffers. When a result that has been proclaimed certain is later shown to be false, the result is widespread disillusionment.

The worst example of this tendency that I can think of is the constant use of the phrase “Mind of God” by theoretical physicists to describe fundamental theories. This is not only meaningless but also damaging. As scientists we should know better than to use it. Our theories do not represent absolute truths: they are just the best we can do with the available data and the limited powers of the human mind. We believe in our theories, but only to the extent that we need to accept working hypotheses in order to make progress. Our approach is pragmatic rather than idealistic. We should be humble and avoid making extravagant claims that can’t be justified either theoretically or experimentally.

The more that people get used to the image of “scientist as priest” the more dissatisfied they are with real science. Most of the questions asked of scientists simply can’t be answered with “yes” or “no”. This leaves many with the impression that science is very vague and subjective. The public also tend to lose faith in science when it is unable to come up with quick answers. Science is a process, a way of looking at problems not a list of ready-made answers to impossible problems. Of course it is sometimes vague, but I think it is vague in a rational way and that’s what makes it worthwhile. It is also the reason why science has led to so many objectively measurable advances in our understanding of the World.

I don’t have any easy answers to the question of how to cure this malaise, but do have a few suggestions. It would be easy for a scientist such as myself to blame everything on the media and the education system, but in fact I think the responsibility lies mainly with ourselves. We are usually so obsessed with our own research, and the need to publish specialist papers by the lorry-load in order to advance our own careers that we usually spend very little time explaining what we do to the public or why.

I think every working scientist in the country should be required to spend at least 10% of their time working in schools or with the general media on “outreach”, including writing blogs like this. People in my field – astronomers and cosmologists – do this quite a lot, but these are areas where the public has some empathy with what we do. If only biologists, chemists, nuclear physicists and the rest were viewed in such a friendly light. Doing this sort of thing is not easy, especially when it comes to saying something on the radio that the interviewer does not want to hear. Media training for scientists has been a welcome recent innovation for some branches of science, but most of my colleagues have never had any help at all in this direction.

The second thing that must be done is to improve the dire state of science education in schools. Over the last two decades the national curriculum for British schools has been dumbed down to the point of absurdity. Pupils that leave school at 18 having taken “Advanced Level” physics do so with no useful knowledge of physics at all, even if they have obtained the highest grade. I do not at all blame the students for this; they can only do what they are asked to do. It’s all the fault of the educationalists, who have done the best they can for a long time to convince our young people that science is too hard for them. Science can be difficult, of course, and not everyone will be able to make a career out of it. But that doesn’t mean that it should not be taught properly to those that can take it in. If some students find it is not for them, then so be it. I always wanted to be a musician, but never had the talent for it.

I realise I must sound very gloomy about this, but I do think there are good prospects that the gap between science and society may gradually be healed. The fact that the public distrust scientists leads many of them to question us, which is a very good thing. They should question us and we should be prepared to answer them. If they ask us why, we should be prepared to give reasons. If enough scientists engage in this process then what will emerge is an understanding of the enduring value of science. I don’t just mean through the DVD players and computer games science has given us, but through its cultural impact. It is part of human nature to question our place in the Universe, so science is part of what we are. It gives us purpose. But it also shows us a way of living our lives. Except for a few individuals, the scientific community is tolerant, open, internationally-minded, and imbued with a philosophy of cooperation. It values reason and looks to the future rather than the past. Like anyone else, scientists will always make mistakes, but we can always learn from them. The logic of science may not be infallible, but it’s probably the best logic there is in a world so filled with uncertainty.

## Still Not Significant

Posted in Bad Statistics on May 27, 2015 by telescoper

I just couldn’t resist reblogging this post because of the wonderful list of meaningless convoluted phrases people use when they don’t get a “statistically significant” result. I particularly like:

“a robust trend toward significance”.

It’s scary to think that these were all taken from peer-reviewed scientific journals…

What to do if your p-value is just over the arbitrary threshold for ‘significance’ of p=0.05?

You don’t need to play the significance testing game – there are better methods, like quoting the effect size with a confidence interval – but if you do, the rules are simple: the result is either significant or it isn’t.

So if your p-value remains stubbornly higher than 0.05, you should call it ‘non-significant’ and write it up as such. The problem for many authors is that this just isn’t the answer they were looking for: publishing so-called ‘negative results’ is harder than ‘positive results’.

The solution is to apply the time-honoured tactic of circumlocution to disguise the non-significant result as something more interesting. The following list is culled from peer-reviewed journal articles in which (a) the authors set themselves the threshold of 0.05 for significance, (b) failed to achieve that threshold value for…


## One More for the Bad Statistics in Astronomy File…

Posted in Bad Statistics, The Universe and Stuff on May 20, 2015 by telescoper

It’s been a while since I last posted anything in the file marked Bad Statistics, but I can remedy that this morning with a comment or two on the following paper by Robertson et al. which I found on the arXiv via the Astrostatistics Facebook page. It’s called Stellar activity mimics a habitable-zone planet around Kapteyn’s star and its abstract is as follows:

Kapteyn’s star is an old M subdwarf believed to be a member of the Galactic halo population of stars. A recent study has claimed the existence of two super-Earth planets around the star based on radial velocity (RV) observations. The innermost of these candidate planets–Kapteyn b (P = 48 days)–resides within the circumstellar habitable zone. Given recent progress in understanding the impact of stellar activity in detecting planetary signals, we have analyzed the observed HARPS data for signatures of stellar activity. We find that while Kapteyn’s star is photometrically very stable, a suite of spectral activity indices reveals a large-amplitude rotation signal, and we determine the stellar rotation period to be 143 days. The spectral activity tracers are strongly correlated with the purported RV signal of “planet b,” and the 48-day period is an integer fraction (1/3) of the stellar rotation period. We conclude that Kapteyn b is not a planet in the Habitable Zone, but an artifact of stellar activity.

It’s not really my area of specialism but it seemed an interesting conclusion so I had a skim through the rest of the paper. Here’s the pertinent figure, Figure 3:

It looks like difficult data to do a correlation analysis on and there are lots of questions to be asked about the form of the errors and how the bunching of the data is handled, to give just two examples. I’d like to have seen a much more comprehensive discussion of this in the paper. In particular the statistic chosen to measure the correlation between variates is the Pearson product-moment correlation coefficient, which is intended to measure linear association between variables. There may indeed be correlations in the plots shown above, but it doesn’t look to me that a straight line fit characterizes it very well. It looks to me in some of the cases that there are simply two groups of data points…

However, that’s not the real reason for flagging this one up. The real reason is the following statement in the text:

Aargh!

No matter how the p-value is arrived at (see comments above), it says nothing about the “probability of no correlation”. This is an error which is sadly commonplace throughout the scientific literature, not just astronomy. The point is that the p-value relates to the probability that the given value of the test statistic (in this case the Pearson product-moment correlation coefficient, r) would arise by chance in the sample if the null hypothesis H (in this case that the two variates are uncorrelated) were true. In other words it relates to P(r|H). It does not tell us anything directly about the probability of H. That would require the use of Bayes’ Theorem. If you want to say anything at all about the probability of a hypothesis being true or not you should use a Bayesian approach. And if you don’t want to say anything about the probability of a hypothesis being true or not then what are you trying to do anyway?
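To make the distinction concrete, here is a little Python sketch of my own (not taken from the paper) that estimates P(r|H) directly: it simulates many pairs of genuinely uncorrelated variates and asks how often the Pearson coefficient exceeds some observed value purely by chance. Note that nothing in the calculation refers to the probability of H itself.

```python
import numpy as np

rng = np.random.default_rng(42)

def null_pvalue(r_obs, n, trials=20000):
    """Monte Carlo estimate of P(|r| >= r_obs | H), where H is the
    hypothesis that the two variates are uncorrelated Gaussians."""
    x = rng.standard_normal((trials, n))
    y = rng.standard_normal((trials, n))
    # Pearson r for each simulated pair of samples
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    r = (xc * yc).sum(axis=1) / np.sqrt((xc**2).sum(axis=1) * (yc**2).sum(axis=1))
    return np.mean(np.abs(r) >= r_obs)

# For n = 30 points, |r| >= 0.361 arises by chance roughly 5% of the time;
# that statement concerns the distribution of r given H, nothing more.
p = null_pvalue(0.361, n=30)
```

However small the number that comes out, it remains a statement about the behaviour of r under the null hypothesis, not about the probability that the null hypothesis is true.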

If I had my way I would ban p-values altogether, but if people are going to use them I do wish they would be more careful about the statements they make about them.

## Social Physics & Astronomy

Posted in The Universe and Stuff on January 25, 2015 by telescoper

When I give popular talks about Cosmology,  I sometimes look for appropriate analogies or metaphors in television programmes about forensic science, such as CSI: Crime Scene Investigation which I watch quite regularly (to the disdain of many of my colleagues and friends). Cosmology is methodologically similar to forensic science because it is generally necessary in both these fields to proceed by observation and inference, rather than experiment and deduction: cosmologists have only one Universe;  forensic scientists have only one scene of the crime. They can collect trace evidence, look for fingerprints, establish or falsify alibis, and so on. But they can’t do what a laboratory physicist or chemist would typically try to do: perform a series of similar experimental crimes under slightly different physical conditions. What we have to do in cosmology is the same as what detectives do when pursuing an investigation: make inferences and deductions within the framework of a hypothesis that we continually subject to empirical test. This process carries on until reasonable doubt is exhausted, if that ever happens.

Of course there is much more pressure on detectives to prove guilt than there is on cosmologists to establish the truth about our Cosmos. That’s just as well, because there is still a very great deal we do not know about how the Universe works. I have a feeling that I’ve stretched this analogy to breaking point but at least it provides some kind of excuse for writing about an interesting historical connection between astronomy and forensic science by way of the social sciences.

The gentleman shown in the picture on the left is Lambert Adolphe Jacques Quételet, a Belgian astronomer who lived from 1796 to 1874. His principal research interest was in the field of celestial mechanics. He was also an expert in statistics. In Quételet’s time it was by no means unusual for astronomers to be well-versed in statistics, but he was exceptionally distinguished in that field. Indeed, Quételet has been called “the father of modern statistics” and, amongst other things, he was responsible for organizing the first ever international conference on statistics in Paris in 1853.

His fame as a statistician owed less to his applications of statistics to astronomy, however, than to the fact that in 1835 he had written a very influential book which, in English, was titled A Treatise on Man but whose somewhat more verbose original French title included the phrase physique sociale (“social physics”). I don’t think modern social scientists would see much of a connection between what they do and what we do in the physical sciences. Indeed the philosopher Auguste Comte was annoyed that Quételet appropriated the phrase “social physics” because he did not approve of the quantitative, statistics-based approach that it had come to represent. For that reason Comte ditched the term from his own work and invented the modern subject of sociology…

Quételet had been struck not only by the regular motions performed by the planets across the sky, but also by the existence of strong patterns in social phenomena, such as suicides and crime. If statistics was essential for understanding the former, should it not be deployed in the study of the latter? Quételet’s first book was an attempt to apply statistical methods to the development of man’s physical and intellectual faculties. His follow-up book Anthropometry, or the Measurement of Different Faculties in Man (1871) carried these ideas further, at the expense of a much clumsier title.

This foray into “social physics” was controversial at the time, for good reason. It also made Quételet extremely famous in his lifetime and his influence became widespread. For example, Francis Galton wrote about the deep impact Quételet had on a person who went on to become extremely famous:

Her statistics were more than a study, they were indeed her religion. For her Quételet was the hero as scientist, and the presentation copy of his “Social Physics” is annotated on every page. Florence Nightingale believed – and in all the actions of her life acted on that belief – that the administrator could only be successful if he were guided by statistical knowledge. The legislator – to say nothing of the politician – too often failed for want of this knowledge. Nay, she went further; she held that the universe – including human communities – was evolving in accordance with a divine plan; that it was man’s business to endeavour to understand this plan and guide his actions in sympathy with it. But to understand God’s thoughts, she held we must study statistics, for these are the measure of His purpose. Thus the study of statistics was for her a religious duty.

The person in question was of course Florence Nightingale. Not many people know that she was an adept statistician who was an early advocate of the use of pie charts to represent data graphically; she apparently found them useful when dealing with dim-witted army officers and dimmer-witted politicians.

The type of thinking described in the quote also spawned a number of highly unsavoury developments in pseudoscience, such as the eugenics movement (in which Galton himself was involved), and some of the vile activities related to it that were carried out in Nazi Germany. But an idea is not responsible for the people who believe in it, and Quételet’s work did lead to many good things, such as the beginnings of forensic science.

A young medical student by the name of Louis-Adolphe Bertillon was excited by the whole idea of “social physics”, to the extent that he found himself imprisoned for his dangerous ideas during the revolution of 1848, along with one of his professors, Achille Guillard, who later invented the subject of demography, the study of racial groups and regional populations. When they were both released, Bertillon became a close confidant of Guillard and eventually married his daughter Zoé. Their second son, Alphonse Bertillon, turned out to be a prodigy.

Young Alphonse was so inspired by Quételet’s work, which had no doubt been introduced to him by his father, that he hit upon a novel way to solve crimes. He would create a database of measured physical characteristics of convicted criminals. He chose 11 basic measurements, including length and width of head, right ear, forearm, middle and ring fingers, left foot, height, length of trunk, and so on. On their own none of these individual characteristics could be probative, but it ought to be possible to use a large number of different measurements to establish identity with a very high probability. Indeed, after two years’ study, Bertillon reckoned that the chances of two individuals having all 11 measurements in common were about four million to one. He further improved the system by adding photographs, in portrait and from the side, and a note of any special marks, like scars or moles.

Bertillonage, as this system became known, was rather cumbersome but proved highly successful in a number of high-profile criminal cases in Paris. By 1892, Bertillon was exceedingly famous but nowadays the word bertillonage only appears in places like the Observer’s Azed crossword.

The main reason why Bertillon’s fame subsided and his system fell into disuse was the development of an alternative and much simpler method of criminal identification: fingerprints. The first systematic use of fingerprints on a large scale was implemented in India in 1858 in an attempt to stamp out electoral fraud.

The name of the British civil servant who had the idea of using fingerprinting in this way was Sir William James Herschel (1833-1917), the eldest child of Sir John Herschel, the astronomer, and thus the grandson of Sir William Herschel, the discoverer of Uranus. Another interesting connection between astronomy and forensic science.

## Bayes, Laplace and Bayes’ Theorem

Posted in Bad Statistics on October 1, 2014 by telescoper

A  couple of interesting pieces have appeared which discuss Bayesian reasoning in the popular media. One is by Jon Butterworth in his Grauniad science blog and the other is a feature article in the New York Times. I’m in early today because I have an all-day Teaching and Learning Strategy Meeting so before I disappear for that I thought I’d post a quick bit of background.

One way to get to Bayes’ Theorem is by starting with

$P(A|C)P(B|AC)=P(B|C)P(A|BC)=P(AB|C)$

where I refer to three logical propositions A, B and C and the vertical bar “|” denotes conditioning, i.e. $P(A|B)$ means the probability of A being true given the assumed truth of B; “AB” means “A and B”, etc. This basically follows from the fact that “A and B” must always be equivalent to “B and A”.  Bayes’ theorem  then follows straightforwardly as

$P(B|AC) = K^{-1}P(B|C)P(A|BC) = K^{-1} P(AB|C)$

where

$K=P(A|C).$

Many versions of this, including the one in Jon Butterworth’s blog, exclude the third proposition and refer to A and B only. I prefer to keep an extra one in there to remind us that every statement about probability depends on information either known or assumed to be known; any proper statement of probability requires this information to be stated clearly and used appropriately but sadly this requirement is frequently ignored.
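As a quick sanity check (my own illustration, not part of the original post), the identity can be verified numerically on a toy joint distribution over three binary propositions, where every conditional probability is just a ratio of sums over a probability table:

```python
from itertools import product

# A toy (unnormalized) joint probability table over three binary
# propositions A, B, C: a positive weight for each truth assignment.
weights = {(a, b, c): 1 + a + 2*b + 3*c + a*b*c
           for a, b, c in product([0, 1], repeat=3)}
total = sum(weights.values())

def P(pred):
    """Probability that predicate pred holds under the joint table."""
    return sum(w for abc, w in weights.items() if pred(*abc)) / total

def Pcond(pred, given):
    """Conditional probability P(pred | given)."""
    return P(lambda a, b, c: pred(a, b, c) and given(a, b, c)) / P(given)

A  = lambda a, b, c: a == 1
B  = lambda a, b, c: b == 1
C  = lambda a, b, c: c == 1
AC = lambda a, b, c: a == 1 and c == 1
BC = lambda a, b, c: b == 1 and c == 1
AB = lambda a, b, c: a == 1 and b == 1

lhs = Pcond(A, C) * Pcond(B, AC)   # P(A|C) P(B|AC)
mid = Pcond(B, C) * Pcond(A, BC)   # P(B|C) P(A|BC)
rhs = Pcond(AB, C)                 # P(AB|C)
```

All three expressions come out equal, as the product rule requires, and dividing through by P(A|C) gives Bayes’ theorem in the form stated above.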

Although this is called Bayes’ theorem, the general form of it as stated here was actually first written down not by Bayes, but by Laplace. What Bayes did was derive the special case of this formula for “inverting” the binomial distribution. This distribution gives the probability of x successes in n independent “trials” each having the same probability of success, p; each “trial” has only two possible outcomes (“success” or “failure”). Trials like this are usually called Bernoulli trials, after Jacob Bernoulli. If we ask the question “what is the probability of exactly x successes from the possible n?”, the answer is given by the binomial distribution:

$P_n(x|n,p)= C(n,x) p^x (1-p)^{n-x}$

where

$C(n,x)= \frac{n!}{x!(n-x)!}$

is the number of distinct combinations of x objects that can be drawn from a pool of n.

You can probably see immediately how this arises. The probability of x consecutive successes is p multiplied by itself x times, or $p^{x}$. The probability of (n-x) successive failures is similarly $(1-p)^{n-x}$. The last two terms basically therefore tell us the probability that we have exactly x successes (since there must be n-x failures). The combinatorial factor in front takes account of the fact that the ordering of successes and failures doesn’t matter.

The binomial distribution applies, for example, to repeated tosses of a coin, in which case p is taken to be 0.5 for a fair coin. A biased coin might have a different value of p, but as long as the tosses are independent the formula still applies. The binomial distribution also applies to problems involving drawing balls from urns: it works exactly if the balls are replaced in the urn after each draw, but it also applies approximately without replacement, as long as the number of draws is much smaller than the number of balls in the urn. I leave it as an exercise to calculate the expectation value of the binomial distribution, but the result is not surprising: E(X)=np. If you toss a fair coin ten times the expectation value for the number of heads is 10 times 0.5, which is five. No surprise there. After another bit of maths, the variance of the distribution can also be found. It is np(1-p).
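The exercise is easy to check numerically; a short Python sketch (my addition, not from the original post) sums the distribution directly for the coin-tossing example:

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(exactly x successes in n Bernoulli trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.5  # ten tosses of a fair coin
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = sum(x * pmf[x] for x in range(n + 1))             # should be np = 5
var = sum((x - mean)**2 * pmf[x] for x in range(n + 1))  # should be np(1-p) = 2.5
```

The sums reproduce E(X)=np and Var(X)=np(1-p) exactly, as the algebraic derivation predicts.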

So this gives us the probability of x given a fixed value of p. Bayes was interested in the inverse of this result, the probability of p given x. In other words, Bayes was interested in the answer to the question “If I perform n independent trials and get x successes, what is the probability distribution of p?”. This is a classic example of inverse reasoning, in that it involved turning something like P(A|BC) into something like P(B|AC), which is what is achieved by the theorem stated at the start of this post.

Bayes got the correct answer for his problem, eventually, but by very convoluted reasoning. In my opinion it is quite difficult to justify the name Bayes’ theorem based on what he actually did, although Laplace did specifically acknowledge this contribution when he derived the general result later, which is no doubt why the theorem is always named in Bayes’ honour.

This is not the only example in science where the wrong person’s name is attached to a result or discovery. Stigler’s Law of Eponymy strikes again!

So who was the mysterious mathematician behind this result? Thomas Bayes was born in 1702, son of Joshua Bayes, who was a Fellow of the Royal Society (FRS) and one of the very first nonconformist ministers to be ordained in England. Thomas was himself ordained and for a while worked with his father in the Presbyterian Meeting House in Leather Lane, near Holborn in London. In 1720 he was a minister in Tunbridge Wells, in Kent. He retired from the church in 1752 and died in 1761. Thomas Bayes didn’t publish a single paper on mathematics in his own name during his lifetime but was elected a Fellow of the Royal Society (FRS) in 1742.

The paper containing the theorem that now bears his name was published posthumously in the Philosophical Transactions of the Royal Society of London in 1763. In his great Philosophical Essay on Probabilities Laplace wrote:

Bayes, in the Transactions Philosophiques of the Year 1763, sought directly the probability that the possibilities indicated by past experiences are comprised within given limits; and he has arrived at this in a refined and very ingenious manner, although a little perplexing.

The reasoning in the 1763 paper is indeed perplexing, and I remain convinced that the general form we now refer to as Bayes’ Theorem should really be called Laplace’s Theorem. Nevertheless, Bayes did establish an extremely important principle that is reflected in the title of the New York Times piece I referred to at the start of this piece. In a nutshell this is that probabilities of future events can be updated on the basis of past measurements or, as I prefer to put it, “one person’s posterior is another’s prior”.

## Politics, Polls and Insignificance

Posted in Bad Statistics, Politics on July 29, 2014 by telescoper

In between various tasks I had a look at the news and saw a story about opinion polls that encouraged me to make another quick contribution to my bad statistics folder.

The piece concerned (in the Independent) includes the following statement:

A ComRes survey for The Independent shows that the Conservatives have dropped to 27 per cent, their lowest in a poll for this newspaper since the 2010 election. The party is down three points on last month, while Labour, now on 33 per cent, is up one point. Ukip is down one point to 17 per cent, with the Liberal Democrats up one point to eight per cent and the Green Party up two points to seven per cent.

The link added to ComRes is mine; the full survey can be found here. Unfortunately, the report, as is sadly almost always the case in surveys of this kind, neglects any mention of the statistical uncertainty in the poll. In fact the last point is based on a telephone poll of a sample of just 1001 respondents. Suppose the fraction of the population having the intention to vote for a particular party is $p$. For a sample of size $n$ with $x$ respondents indicating that intention, one can straightforwardly estimate $p \simeq x/n$. So far so good, as long as there is no bias induced by the form of the question asked nor in the selection of the sample, which for a telephone poll is doubtful.

A little bit of mathematics involving the binomial distribution yields an answer for the uncertainty in this estimate of $p$ in terms of the sampling error:

$\sigma = \sqrt{\frac{p(1-p)}{n}}$

For the sample size given, and a value $p \simeq 0.33$ this amounts to a standard error of about 1.5%. About 95% of samples drawn from a population in which the true fraction is $p$ will yield an estimate within $p \pm 2\sigma$, i.e. within about 3% of the true figure. In other words the typical variation between two samples drawn from the same underlying population is about 3%.
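The arithmetic takes a couple of lines of Python (my own check, using the figures quoted above; the worst-case formula in the last line is the convention pollsters usually quote):

```python
from math import sqrt

n = 1001   # sample size of the telephone poll
p = 0.33   # estimated fraction intending to vote for the party

sigma = sqrt(p * (1 - p) / n)   # standard error, about 0.0149, i.e. 1.5%
two_sigma = 2 * sigma           # ~95% interval half-width, about 3%

# The conventional "margin of error" takes the worst case p = 0.5
# and a 95% (1.96 sigma) interval:
margin = 1.96 * sqrt(0.25 / n)  # about 0.031, i.e. 3.1%
```

The worst-case figure of 3.1% is what a margin-of-error calculator returns for this sample size, and either way the month-on-month changes quoted in the report are comfortably inside the noise.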

If you don’t believe my calculation then you could use ComRes’ own “margin of error calculator”. The UK electorate as of 2012 numbered 46,353,900 and a sample size of 1001 returns a margin of error of 3.1%. This figure is not quoted in the report, however.

Looking at the figures quoted in the report will tell you that all of the changes reported since last month’s poll are within the sampling uncertainty and are therefore consistent with no change at all in underlying voting intentions over this period.

A summary of the report posted elsewhere states:

A ComRes survey for the Independent shows that Labour have jumped one point to 33 per cent in opinion ratings, with the Conservatives dropping to 27 per cent – their lowest support since the 2010 election.

No! There’s no evidence of support for Labour having “jumped one point”, even if you could describe such a marginal change as a “jump” in the first place.

Statistical illiteracy is as widespread amongst politicians as it is amongst journalists, but the fact that silly reports like this are commonplace doesn’t make them any less annoying. After all, the idea of sampling uncertainty isn’t all that difficult to understand. Is it?

And with so many more important things going on in the world that deserve better press coverage than they are getting, why does a “quality” newspaper waste its valuable column inches on this sort of twaddle?

## A Keno Game Problem

Posted in Cute Problems on July 25, 2014 by telescoper

It’s been a while since I posted anything in the Cute Problems category so, given that I’ve got an unexpected gap of half an hour today, I thought I’d return to one of my side interests, the mathematics of games and gambling.

There is a variety of gambling games called Keno games in which a player selects (or is given) a set of numbers, some or all of which the player hopes to match with numbers drawn without replacement from a larger set of numbers. A common example of this type of game is Bingo. These games mostly originate in the 19th Century when travelling carnivals and funfairs often involved booths in which customers could gamble in various ways; similar things happen today, though perhaps with more sophisticated games.

In modern Casino Keno (sometimes called Race Horse Keno) a player receives a card with the numbers from 1 to 80 marked on it. He or she then marks a selection of between 1 and 15 numbers and indicates the amount of a proposed bet; if n numbers are marked then the game is called `n-spot Keno’. Obviously, in 1-spot Keno, only one number is marked. Twenty numbers are then drawn without replacement from a set comprising the integers 1 to 80, using some form of randomizing device. If an appropriate proportion of the marked numbers are in fact drawn the player gets a payoff calculated by the House. Below you can see the usual payoffs for 10-spot Keno:

If fewer than five of your numbers are drawn, you lose your £1 stake. The expected gain on a £1 bet can be calculated by working out the probability of each of the outcomes listed above multiplied by the corresponding payoff, adding these together and then subtracting the probability of losing your stake (which corresponds to a gain of -£1). If this overall expected gain is negative (which it will be for any competently run casino) then the expected loss is called the house edge. In other words, if you can expect to lose £X on a £1 bet then X is the house edge.
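The probabilities themselves come from the hypergeometric distribution: the chance of matching exactly k of your 10 marked numbers when 20 are drawn from 80 is C(20,k)C(60,10-k)/C(80,10). Here is a Python sketch of how the expected gain is assembled; note that the payoff figures in it are placeholders of my own invention, since the actual casino table is not reproduced above, so the number it produces is illustrative rather than the answer to the problem.

```python
from math import comb

def p_match(k, spots=10, drawn=20, total=80):
    """Probability of matching exactly k of the marked numbers
    when `drawn` numbers are drawn without replacement from `total`."""
    return comb(drawn, k) * comb(total - drawn, spots - k) / comb(total, spots)

probs = [p_match(k) for k in range(11)]

# Hypothetical payoff table (net gain in pounds on a £1 bet for k matches);
# the real casino figures are NOT given here, so these are placeholders.
payoff = {5: 2, 6: 20, 7: 80, 8: 500, 9: 2000, 10: 10000}

expected_gain = (sum(probs[k] * payoff.get(k, 0) for k in range(11))
                 - sum(probs[k] for k in range(5)))  # stake lost below 5 matches
```

With a genuine payoff table substituted in, minus `expected_gain` is the house edge asked for below; with any realistic table the number comes out negative for the player.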

What is the house edge for 10-spot Keno?