Archive for the Bad Statistics Category

Bad Statistics and COVID-19

Posted in Bad Statistics on March 27, 2020 by telescoper

It’s been a while since I posted anything in the Bad Statistics folder. It’s not that the present Covid-19 outbreak hasn’t provided plenty of examples; it’s that I’ve had my mind on other things. I couldn’t resist, however, sharing this cracker that I found on Twitter:

The paper concerned can be found here, and the key figure from it is this:

This plots the basic reproductive rate R against temperature for Coronavirus infections from 100 Chinese cities. The argument is that the trend means that higher temperatures correspond to weakened transmission of the virus (as happens with influenza). I don’t know if this paper has been peer-reviewed. I sincerely hope not!

I showed this plot to a colleague of mine the other day who remarked “well, at least all the points lie on a plane”. It looks to me that if you removed just one point – the one with R>4.5 – then the trend would vanish completely.

The alleged correlation is deeply unimpressive on its own, quite apart from the assumption that any correlation present represents a causative effect due to temperature – there could be many confounding factors.

 

P.S. Among the many hilarious responses on Twitter was this:

 

Taxing Figures

Posted in Bad Statistics, Politics on January 29, 2020 by telescoper

Following the campaign for the forthcoming General Election in Ireland has confirmed (not entirely unexpectedly) that politicians over here are not averse to peddling demonstrable untruths.

One particular example came up in a recent televised debate during which Fine Gael leader Leo Varadkar talked about his party’s plans for tax cuts achieved by raising the salary at which workers start paying the higher rate of income tax. Here’s a summary of the proposal from the Irish Times:

Fine Gael wants to increase the threshold at which people hit the higher rate of income tax from €35,300 to €50,000, which it says will be worth €3,000 to the average earner if the policy is fully implemented.

Three thousand (per year) to the average earner! Sounds great!

But let’s look at the figures. There are two tax rates in Ireland. The first part of your income up to a certain amount is taxed at 20% – this is known as the Standard Rate. The remainder of your income is taxed at 40% which is known as the Higher Rate. The cut-off point for the standard rate depends on circumstances, but for a single person it is currently €35,300.

According to official statistics the average salary is €38,893 per year, as has been widely reported. Let’s call that €38,900 for round figures. Note that figure includes overtime and other earnings, not just basic wages.

It’s worth pointing out that in Ireland (as practically everywhere else) the distribution of earnings is very skewed. Here is an example showing weekly earnings in Ireland a few years ago to demonstrate the point.

 

This means that there are more people earning less than the average salary (also known as the mean) than there are earning more than it. In Ireland over 60% of people earn less than the average. Using the mean in examples like this* is rather misleading – the median would be less influenced by a few very high salaries – but let’s continue with it for the sake of argument.

So how much will a person earning €38,900 actually benefit from raising the higher rate tax threshold to €50,000? For clarity I’ll consider this question in isolation from other proposed changes.

Currently such a person pays tax at 40% on the portion of their salary exceeding the threshold which is €38,900 – €35,300 = €3600. Forty per cent of that figure is €1440. If the higher rate threshold is raised above their earnings level this €3600 would instead be taxed at the Standard rate of 20%, which means that €720 would be paid instead of €1440. The net saving is therefore €720 per annum. This is a saving, but it’s nowhere near €3000. Fine Gael’s claim is therefore demonstrably false.

If you look at the way the tax bands work it is clear that a person earning over €50,000 would save an amount which is equivalent to 20% of the difference between €35,300 and €50,000 which is a sum close to €3000, but that only applies to people earning well over the average salary. For anyone earning less than €50,000 the saving is much less.
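The arithmetic above can be checked with a short Python sketch. It models only the two income tax bands quoted in this post (20% up to the threshold, 40% above it) for a single person and, like the discussion above, ignores tax credits and every other charge or proposed change. The function names are made up for illustration.

```python
def income_tax(salary, threshold, standard_pct=20, higher_pct=40):
    """Two-band income tax in euros; rates are whole percentages."""
    standard_part = min(salary, threshold)      # taxed at the Standard Rate
    higher_part = max(salary - threshold, 0)    # taxed at the Higher Rate
    return (standard_pct * standard_part + higher_pct * higher_part) / 100


def saving(salary, old_threshold=35_300, new_threshold=50_000):
    """Annual saving from raising the higher-rate threshold."""
    return income_tax(salary, old_threshold) - income_tax(salary, new_threshold)


for salary in (30_000, 38_900, 50_000, 60_000):
    print(f"salary €{salary:,}: saving €{saving(salary):,.0f}")
```

On these assumptions the saving is nothing below the current threshold, €720 at the average salary of €38,900, and it only reaches its maximum of €2,940 for salaries of €50,000 or more.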

The untruth lies therefore in the misleading use of the term `average salary’.

Notice furthermore that anyone earning less than the higher rate tax threshold will not benefit in any way from the proposed change, so it favours the better off. That’s not unexpected for Fine Gael. A fairer change (in my view) would involve increasing the higher rate threshold and also the higher rate itself.

All this presupposes of course that you think cutting tax is a good idea at this time. Personally I don’t. Ireland is crying out for greater investment in public services and infrastructure so I think it’s inadvisable to make less money available for these purposes, which is what cutting tax would do.

 

*Another example is provided by the citation numbers for papers in the Open Journal of Astrophysics. The average number of citations for the 12 papers published in 2019 was around 34 but eleven of the twelve had fewer citations than this: the average is dragged up by one paper with >300 citations.

 

Phase Correlations and the LIGO Data Analysis Paper

Posted in Bad Statistics, The Universe and Stuff on September 1, 2019 by telescoper

I have to admit I haven’t really kept up with developments in the world of gravitational waves this summer, though a number of candidate events have been reported in the third observing run (O3) of Advanced LIGO, which began in April 2019, to which I refer you if you’re interested.

I did notice, however, that late last week a new paper from the LIGO Scientific Collaboration and Virgo Collaboration appeared on the arXiv. This is entitled A guide to LIGO-Virgo detector noise and extraction of transient gravitational-wave signals and has the following abstract:

The LIGO Scientific Collaboration and the Virgo Collaboration have cataloged eleven confidently detected gravitational-wave events during the first two observing runs of the advanced detector era. All eleven events were consistent with being from well-modeled mergers between compact stellar-mass objects: black holes or neutron stars. The data around the time of each of these events have been made publicly available through the Gravitational-Wave Open Science Center. The entirety of the gravitational-wave strain data from the first and second observing runs have also now been made publicly available. There is considerable interest among the broad scientific community in understanding the data and methods used in the analyses. In this paper, we provide an overview of the detector noise properties and the data analysis techniques used to detect gravitational-wave signals and infer the source properties. We describe some of the checks that are performed to validate the analyses and results from the observations of gravitational-wave events. We also address concerns that have been raised about various properties of LIGO-Virgo detector noise and the correctness of our analyses as applied to the resulting data.

It’s an interesting paper that gives quite a lot of detail, especially about signal extraction and parameter-fitting, so it’s very well worth reading.

Two particular things caught my eye about this. One is that there’s no list of authors anywhere in the paper, which seems a little strange. This policy may not be new, of course. I did say I haven’t really been keeping up.

The other point I’ll mention relates to this Figure, the caption of which refers to paper [41], the famous `Danish paper‘:

The Fourier phase is plotted vertically (between 0 and 2π) and the frequency horizontally. A random-phase distribution should have the phases uniformly distributed at each frequency. I think we can agree, without further statistical analysis,  that the blue points don’t have that property!  Of course nobody denies that the strongly correlated phases  in the un-windowed data are at least partly an artifact of the application of a Fourier transform to a non-stationary time series.

I suppose that showing that a window function used to apodize the data removes the phase correlations is meant to represent some form of rebuttal of the claims made in the Danish paper. If so, it’s not very convincing.

For a start the caption just says that after windowing the resulting `phases appear randomly distributed‘. Could they not provide some more meaningful statistical statement than a simple eyeball impression? The text says little more:

In addition to causing spectral leakage, improper windowing of the data can result in spurious phase correlations in the Fourier transform. Figure 4 shows a scatter plot of the Fourier phase as a function of frequency … both with and without the application of a window function. The un-windowed data shows a strong phase correlation, while the windowed data does not.

(I added the link to the explanation of `spectral leakage’.)

As I have mentioned before on this blog, the human eye is very poor at distinguishing pattern from randomness. There are some subtleties involved in testing for correlated phases (e.g. because they are periodic) but there are various techniques available: I’ve worked on this myself (see, e.g., here and here). The phases shown may well be consistent with a uniform random distribution, but I’m surprised the LIGO authors didn’t present a proper statistical analysis of the windowed phases to prove beyond doubt the point they seem to be trying to make.

Then again, later on in the caption, there is a statement that `the phases show some clustering around the 60 Hz power line’. So, on the one hand the phases `appear random’, but on the other hand they’re not. There are other plausible clusters elsewhere too. What about them?

I’m afraid the absence of quantitative detail means I don’t find this a very edifying discussion!

 

Hubble Tension: an “Alternative” View?

Posted in Bad Statistics, The Universe and Stuff on July 25, 2019 by telescoper

There was a new paper last week on the arXiv by Sunny Vagnozzi about the Hubble constant controversy (see this blog passim). I was going to refrain from commenting but I see that one of the bloggers I follow has posted about it so I guess a brief item would not be out of order.

Here is the abstract of the Vagnozzi paper:

I posted this picture last week which is relevant to the discussion:

The point is that if you allow the equation of state parameter w to vary from the value of w=-1 that it has in the standard cosmology then you get a better fit. However, it is one of the features of Bayesian inference that if you introduce a new free parameter then you have to assign a prior probability over the space of values that parameter could hold. That prior penalty is carried through to the posterior probability. Unless the new model fits observational data significantly better than the old one, this prior penalty will lead to the new model being disfavoured. This is the Bayesian statement of Ockham’s Razor.

The Vagnozzi paper represents a statement of this in the context of the Hubble tension. If a new floating parameter w is introduced the data prefer a value less than -1 (as demonstrated in the figure) but on posterior probability grounds the resulting model is less probable than the standard cosmology for the reason stated above. Vagnozzi then argues that if a new fixed value of, say, w = -1.3 is introduced then the resulting model is not penalized by having to spread the prior probability out over a range of values but puts all its prior eggs in one basket labelled w = -1.3.

This is of course true. The problem is that the value of w = -1.3 does not derive from any ab initio principle of physics but is chosen a posteriori from the inference described above. It’s no surprise that you can get a better answer if you know what outcome you want. I find that I am very good at forecasting the football results if I make my predictions after watching Final Score.

Indeed, many cosmologists think values of w < -1 should be ruled out ab initio because they don’t make physical sense anyway.
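The prior penalty discussed above can be made explicit with a rough worked example. The assumptions here are mine, made for illustration rather than taken from the Vagnozzi paper: a flat prior of width Δw on w, and a likelihood L(w) that is approximately Gaussian with peak at ŵ and width σ_w lying well inside the prior range.

```latex
\begin{aligned}
Z_{\Lambda} &= \mathcal{L}(w=-1), \\
Z_{w} &= \int \mathcal{L}(w)\,\pi(w)\,\mathrm{d}w
        \;\approx\; \mathcal{L}(\hat{w})\,\frac{\sqrt{2\pi}\,\sigma_w}{\Delta w}, \\
B &= \frac{Z_{w}}{Z_{\Lambda}}
   \;\approx\; \underbrace{\frac{\mathcal{L}(\hat{w})}{\mathcal{L}(-1)}}_{\text{better fit}}
   \times
   \underbrace{\frac{\sqrt{2\pi}\,\sigma_w}{\Delta w}}_{\text{Ockham factor}\,<\,1}.
\end{aligned}
```

The floating-w model is preferred only if the improvement in fit outweighs the Ockham factor. Fixing w = -1.3 by hand gives an evidence of simply L(-1.3) with no Ockham factor at all, which is why it looks so good, and why choosing that value after seeing the data is the problem.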

 

 

 

Statistical Analysis of the 1919 Eclipse Measurements

Posted in Bad Statistics, The Universe and Stuff on May 27, 2019 by telescoper

So the centenary of the famous 1919 Eclipse measurements is only a couple of days away and to mark it I have a piece on RTÉ Brainstorm published today in advance of my public lecture on Wednesday.

I thought I’d complement the more popular piece by posting a very short summary of how the measurements were analyzed for those who want a bit more technical detail.

The idea is simple. Take a photograph during a solar eclipse during which some stars are visible in the sky close enough to the Sun to be deflected by its gravity. Take a similar photograph of the same stars at night at some other time when the Sun is elsewhere. Compare the positions of the stars on the two photographs and the star positions should have shifted slightly on the eclipse plates compared to the comparison plate. This gravitational shift should be radially outwards from the centre of the Sun.
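Written out using the symbols defined in the paragraphs that follow, the plate equations take this form (a reconstruction from those definitions, so the exact layout is an assumption):

```latex
\begin{aligned}
a x + b y + c + \alpha E_x(x,y) &= D_x, \\
d x + e y + f + \alpha E_y(x,y) &= D_y.
\end{aligned}
```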

One can measure the coordinates of the stars in two directions: Right Ascension (x) and Declination (y) and the corresponding (small) differences between the positions in each direction are Dx and Dy on the right hand side of the equations above.

In the absence of any other effects these deflections should be equal to the deflection in each component calculated using either Einstein’s theory or the Newtonian value. This is represented by the two terms Ex(x,y) and Ey(x,y) which give the calculated components of the deflection in both x and y directions scaled by a parameter α which is the object of interest – α should be precisely a factor two larger in Einstein’s theory than in the `Newtonian’ calculation.

The problem is that there are several other things that can cause differences between positions of stars on the photographic plate, especially if you remember that the eclipse photographs have to be taken out in the field rather than at an observatory.  First of all there might be an offset in the coordinates measured on the two plates: this is represented by the terms c and f in the equations above. Second there might be a slightly different magnification on the two photographs caused by different optical performance when the two plates were exposed. These would result in a uniform scaling in x and y which is distinguishable from the gravitational deflection because it is not radially outwards from the centre of the Sun. This scale factor is represented by the terms a and e. Third, and finally, the plates might be oriented slightly differently, mixing up x and y as represented by the cross-terms b and d.

Before one can determine a value for α from a set of measured deflections one must estimate and remove the other terms represented by the parameters a-f. There are seven unknowns (including α) so one needs at least seven measurements to get the necessary astrometric solution.
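For readers who like to see this sort of thing in code, here is a minimal Python sketch of an astrometric solution obtained by linear least squares. This is a modern convenience, not the elimination procedure Eddington used. Everything numerical here is synthetic and hypothetical: the star positions, the plate constants, and the choice to measure x and y in solar radii so that the radial deflection has components Ex = x/r² and Ey = y/r², with α then being the deflection at the solar limb in arcseconds.

```python
import numpy as np


def design_matrix(x, y):
    """Rows for Dx then Dy; columns ordered (a, b, c, d, e, f, alpha)."""
    r2 = x**2 + y**2
    ex, ey = x / r2, y / r2               # radial 1/r deflection, assumed units
    zeros, ones = np.zeros_like(x), np.ones_like(x)
    rows_x = np.column_stack([x, y, ones, zeros, zeros, zeros, ex])
    rows_y = np.column_stack([zeros, zeros, zeros, x, y, ones, ey])
    return np.vstack([rows_x, rows_y])


def fit_plate(x, y, dx, dy):
    """Least-squares estimate of (a, b, c, d, e, f, alpha) for one plate."""
    params, *_ = np.linalg.lstsq(design_matrix(x, y),
                                 np.concatenate([dx, dy]), rcond=None)
    return params


# Synthetic example: seven hypothetical stars between 2 and 8 solar radii.
rng = np.random.default_rng(1919)
n_stars = 7
radius = rng.uniform(2.0, 8.0, n_stars)
angle = rng.uniform(0.0, 2.0 * np.pi, n_stars)
x, y = radius * np.cos(angle), radius * np.sin(angle)

# Made-up plate constants, with alpha set to the Einstein value of 1.75 arcsec.
true_params = np.array([1e-3, -5e-4, 0.2, 5e-4, 1e-3, -0.1, 1.75])
shifts = design_matrix(x, y) @ true_params
dx, dy = shifts[:n_stars], shifts[n_stars:]

estimate = fit_plate(x, y, dx, dy)
print("recovered alpha:", estimate[-1])
```

With noise-free synthetic shifts the fit simply returns the input value of α; adding measurement errors to dx and dy shows how the estimate scatters from plate to plate.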

The approach Eddington wanted to use to solve this problem involved setting up simultaneous equations for these parameters and eliminating variables to yield values for α for each plate. Repeating this over many plates allows one to beat down the measurement errors by averaging and return a final overall value for α. The 1919 eclipse was particularly suitable for this experiment because (a) there were many bright stars positioned close to the Sun on the sky during totality and (b) the duration of totality was rather long – around 7 minutes – allowing many exposures to be taken.

This was indeed the approach he did use to analyze the data from the Sobral plates, but for the plates taken at Principe during poor weather he didn’t have enough star positions to do this: he therefore used estimates of the scale parameters (a and e) taken entirely from the comparison plates. This is by no means ideal, though he didn’t really have any choice.

If you ask me, a conceptually better approach would be the Bayesian one: set up priors on the seven parameters then marginalize over a-f to leave a posterior distribution on α. This task is left as an exercise to the reader.

 

 

Dos and Don’ts of reduced chi-squared

Posted in Bad Statistics, The Universe and Stuff on April 26, 2019 by telescoper

Yesterday I saw a tweet about an arXiv paper and thought I’d share it here. The paper, I mean. It’s not new but I’ve never seen it before and I think it’s well worth reading. The abstract of the paper is:

Reduced chi-squared is a very popular method for model assessment, model comparison, convergence diagnostic, and error estimation in astronomy. In this manuscript, we discuss the pitfalls involved in using reduced chi-squared. There are two independent problems: (a) The number of degrees of freedom can only be estimated for linear models. Concerning nonlinear models, the number of degrees of freedom is unknown, i.e., it is not possible to compute the value of reduced chi-squared. (b) Due to random noise in the data, also the value of reduced chi-squared itself is subject to noise, i.e., the value is uncertain. This uncertainty impairs the usefulness of reduced chi-squared for differentiating between models or assessing convergence of a minimisation procedure. The impact of noise on the value of reduced chi-squared is surprisingly large, in particular for small data sets, which are very common in astrophysical problems. We conclude that reduced chi-squared can only be used with due caution for linear models, whereas it must not be used for nonlinear models at all. Finally, we recommend more sophisticated and reliable methods, which are also applicable to nonlinear models.

I added the link at the beginning; you can download a PDF of the paper here.

I’ve never really understood why this statistic (together with related frequentist-inspired ideas) is treated with such reverence by astronomers, so this paper offers a valuable critique to those tempted to rely on it blindly.
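Point (b) of the abstract is easy to illustrate with a small simulation. The Python sketch below is my own toy, not code from the paper: it draws data sets of unit-variance Gaussian residuals about the true model, with no parameters fitted so that the number of degrees of freedom is simply the number of points N, and looks at how much the reduced chi-squared scatters from one realisation to the next. For a chi-squared distribution with N degrees of freedom that scatter should be √(2/N).

```python
import numpy as np


def reduced_chi2_samples(n_points, n_trials, rng):
    """Reduced chi-squared of n_trials data sets drawn from the true model."""
    residuals = rng.normal(size=(n_trials, n_points))  # (data - model) / sigma
    return np.sum(residuals**2, axis=1) / n_points


rng = np.random.default_rng(42)
for n_points in (10, 100, 1000):
    samples = reduced_chi2_samples(n_points, 5000, rng)
    print(f"N = {n_points:4d}: mean = {samples.mean():.3f}, "
          f"scatter = {samples.std():.3f}, "
          f"sqrt(2/N) = {np.sqrt(2 / n_points):.3f}")
```

For N = 10 the scatter is about 0.45, so even a perfectly correct model can easily return a reduced chi-squared of 0.5 or 1.5.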

 

 

Bad Statistics and the Gender Gap

Posted in Bad Statistics on April 3, 2019 by telescoper

So there’s an article in Scientific American called How to Close the Gender Gap in the Labo(u)r Force (I’ve added a `u’ to `Labour’ so that it can be understood in the UK).

I was just thinking the other day that it’s been a while since I added any posts to the `Bad Statistics’ folder, but this Scientific American article offers a corker:

That parabola is a `Regression line’? Seriously? Someone needs a lesson in how not to over-fit data! It’s plausible that the orange curve might be the best-fitting parabola to the blue points, but that doesn’t mean that it provides a sensible description of the data…

I can see a man walking a dog in the pattern of points to the top right: can I get this observation published in Scientific American?