There was a new paper last week on the arXiv by Sunny Vagnozzi about the Hubble constant controversy (see this blog passim). I was going to refrain from commenting but I see that one of the bloggers I follow has posted about it so I guess a brief item would not be out of order.

Here is the abstract of the Vagnozzi paper:

I posted this picture last week which is relevant to the discussion:

The point is that if you allow the equation of state parameter w to vary from the value of w=-1 that it has in the standard cosmology then you get a better fit. However, it is one of the features of Bayesian inference that if you introduce a new free parameter then you have to assign a prior probability over the space of values that parameter could hold. That prior penalty is carried through to the posterior probability. Unless the new model fits observational data significantly better than the old one, this prior penalty will lead to the new model being disfavoured. This is the Bayesian statement of Ockham’s Razor.

The Vagnozzi paper represents a statement of this in the context of the Hubble tension. If a new floating parameter w is introduced, the data prefer a value less than -1 (as demonstrated in the figure), but on posterior probability grounds the resulting model is less probable than the standard cosmology, for the reason stated above. Vagnozzi then argues that if a new fixed value of, say, w = -1.3 is introduced, the resulting model is not penalized by having to spread the prior probability out over a range of values: instead it puts all its prior eggs in one basket labelled w = -1.3.
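To see how the Occam penalty works in practice, here is a toy numerical sketch. All the numbers (the measured value of w, its uncertainty, and the prior range) are invented for illustration and are not taken from the Vagnozzi paper:

```python
import numpy as np

def gauss(x, mu, sigma):
    """Gaussian probability density."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Hypothetical measurement: suppose the data constrain w to -1.3 +/- 0.2.
w_hat, sigma = -1.3, 0.2

# The evidence for a model that fixes w is just the likelihood at that value.
Z_lcdm = gauss(w_hat, -1.0, sigma)   # standard cosmology, w = -1
Z_tuned = gauss(w_hat, -1.3, sigma)  # fixed value chosen after seeing the data

# The evidence for a model with w free is the likelihood *averaged* over the
# prior -- here a uniform prior on [-3, 0], i.e. density 1/3.
w_grid = np.linspace(-3.0, 0.0, 3001)
dw = w_grid[1] - w_grid[0]
Z_free = np.sum(gauss(w_hat, w_grid, sigma) * (1.0 / 3.0)) * dw

print(Z_tuned, Z_lcdm, Z_free)
```

For these made-up numbers Z_tuned > Z_lcdm > Z_free: spreading the prior dilutes the evidence, so the free-w model loses to the standard cosmology even though its best-fit likelihood is higher, while the fixed value tuned a posteriori wins by construction – because it was told the answer in advance.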

This is of course true. The problem is that the value w = -1.3 does not derive from any ab initio principle of physics; it is chosen a posteriori, on the basis of the inference described above. It’s no surprise that you can get a better answer if you know what outcome you want. I find that I am very good at forecasting the football results if I make my predictions after watching Final Score…

Indeed, many cosmologists think any value of w < -1 should be ruled out ab initio because such values don’t make physical sense anyway.

I thought I’d complement the more popular piece by posting a very short summary of how the measurements were analyzed for those who want a bit more technical detail.

The idea is simple. Take a photograph during a solar eclipse during which some stars are visible in the sky close enough to the Sun to be deflected by its gravity. Take a similar photograph of the same stars at night at some other time when the Sun is elsewhere. Compare the positions of the stars on the two photographs and the star positions should have shifted slightly on the eclipse plates compared to the comparison plate. This gravitational shift should be radially outwards from the centre of the Sun.

One can measure the coordinates of the stars in two directions, Right Ascension (x) and Declination (y), and the corresponding (small) differences between the positions in each direction are D_{x} and D_{y} on the right-hand side of the equations above.

In the absence of any other effects these deflections should be equal to the deflections calculated in each component using either Einstein’s theory or the Newtonian value. This is represented by the two terms E_{x}(x,y) and E_{y}(x,y), which give the calculated components of the deflection in the x and y directions, scaled by a parameter α which is the object of interest – α should be precisely a factor of two larger in Einstein’s theory than in the `Newtonian’ calculation.

The problem is that there are several other things that can cause differences between positions of stars on the photographic plate, especially if you remember that the eclipse photographs have to be taken out in the field rather than at an observatory. First of all there might be an offset in the coordinates measured on the two plates: this is represented by the terms c and f in the equations above. Second there might be a slightly different magnification on the two photographs caused by different optical performance when the two plates were exposed. These would result in a uniform scaling in x and y which is distinguishable from the gravitational deflection because it is not radially outwards from the centre of the Sun. This scale factor is represented by the terms a and e. Third, and finally, the plates might be oriented slightly differently, mixing up x and y as represented by the cross-terms b and d.

Before one can determine a value for α from a set of measured deflections one must estimate and remove the other terms represented by the parameters a-f. There are seven unknowns (including α) so one needs at least seven measurements to get the necessary astrometric solution.
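For what it’s worth, here is a toy sketch of that astrometric solution in Python. The star positions, deflection pattern, plate parameters and noise level are all invented for illustration; only the structure of the linear model follows the description above:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated stand-in for real plate measurements (all numbers hypothetical).
n_stars = 10
x, y = rng.uniform(-1, 1, n_stars), rng.uniform(-1, 1, n_stars)
r2 = x**2 + y**2
Ex, Ey = x / r2, y / r2   # radial deflection pattern, Sun at the origin

# True plate parameters a-f and deflection parameter alpha (Einstein: 1.75").
true = dict(a=0.01, b=0.002, c=0.05, d=-0.001, e=0.012, f=-0.03, alpha=1.75)
noise = 0.001
Dx = true['a']*x + true['b']*y + true['c'] + true['alpha']*Ex
Dy = true['d']*x + true['e']*y + true['f'] + true['alpha']*Ey
Dx += rng.normal(0, noise, n_stars)
Dy += rng.normal(0, noise, n_stars)

# Stack the x- and y-equations into one linear system in (a,b,c,d,e,f,alpha):
# D_x = a x + b y + c + alpha E_x,  D_y = d x + e y + f + alpha E_y.
ones, zeros = np.ones(n_stars), np.zeros((n_stars, 3))
A = np.block([[np.c_[x, y, ones], zeros, Ex[:, None]],
              [zeros, np.c_[x, y, ones], Ey[:, None]]])
b = np.concatenate([Dx, Dy])

params, *_ = np.linalg.lstsq(A, b, rcond=None)
alpha_hat = params[-1]
print(alpha_hat)   # should come out close to the input value of 1.75
```

Each star contributes two equations, so ten stars give twenty constraints on the seven unknowns – comfortably over-determined, which is why a field rich in bright stars was so important.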

The approach Eddington wanted to use to solve this problem involved setting up simultaneous equations for these parameters and eliminating variables to yield values for α for each plate. Repeating this over many plates allows one to beat down the measurement errors by averaging and return a final overall value for α. The 1919 eclipse was particularly suitable for this experiment because (a) there were many bright stars positioned close to the Sun on the sky during totality and (b) the duration of totality was rather long – around 7 minutes – allowing many exposures to be taken.

This was indeed the approach he did use to analyze the data from the Sobral plates, but for the plates taken at Principe during poor weather he didn’t have enough star positions to do this: he therefore used estimates of the scale parameters (a and e) taken entirely from the comparison plates. This is by no means ideal, though he didn’t really have any choice.

If you ask me a conceptually better approach would be the Bayesian one: set up priors on the seven parameters then marginalize over a-f to leave a posterior distribution on α. This task is left as an exercise to the reader.
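In fact, for a linear model with Gaussian noise and broad flat priors the marginalization can be done analytically: the joint posterior is Gaussian, so marginalizing over a–f amounts to reading off the α entry of the posterior mean and covariance. A toy sketch, with entirely invented numbers:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup (hypothetical numbers): seven-parameter linear model
# d = A @ theta + noise, with theta = (a, b, c, d, e, f, alpha).
n = 12
x, y = rng.uniform(-1, 1, n), rng.uniform(-1, 1, n)
r2 = x**2 + y**2
Ex, Ey = x / r2, y / r2                       # radial deflection pattern
ones, zeros = np.ones(n), np.zeros((n, 3))
A = np.block([[np.c_[x, y, ones], zeros, Ex[:, None]],
              [zeros, np.c_[x, y, ones], Ey[:, None]]])
theta_true = np.array([0.01, 0.002, 0.05, -0.001, 0.012, -0.03, 1.75])
sigma = 0.001
d = A @ theta_true + rng.normal(0, sigma, 2 * n)

# With Gaussian noise and broad flat priors the posterior is Gaussian:
# mean = least-squares solution, covariance = sigma^2 (A^T A)^(-1).
# Marginalizing over (a..f) just means reading off the alpha row/column.
mean = np.linalg.solve(A.T @ A, A.T @ d)
cov = sigma**2 * np.linalg.inv(A.T @ A)

alpha_mean, alpha_std = mean[-1], np.sqrt(cov[-1, -1])
print(f"alpha = {alpha_mean:.4f} +/- {alpha_std:.4f}")
```

With informative (e.g. Gaussian) priors on the plate parameters – which is what Eddington’s use of the comparison plates amounts to – the algebra changes only slightly, but the posterior on α then honestly reflects the uncertainty in a–f instead of treating them as known.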

Yesterday I saw a tweet about an arXiv paper and thought I’d share it here. The paper, I mean. It’s not new but I’ve never seen it before and I think it’s well worth reading. The abstract of the paper is:

Reduced chi-squared is a very popular method for model assessment, model comparison, convergence diagnostic, and error estimation in astronomy. In this manuscript, we discuss the pitfalls involved in using reduced chi-squared. There are two independent problems: (a) The number of degrees of freedom can only be estimated for linear models. Concerning nonlinear models, the number of degrees of freedom is unknown, i.e., it is not possible to compute the value of reduced chi-squared. (b) Due to random noise in the data, also the value of reduced chi-squared itself is subject to noise, i.e., the value is uncertain. This uncertainty impairs the usefulness of reduced chi-squared for differentiating between models or assessing convergence of a minimisation procedure. The impact of noise on the value of reduced chi-squared is surprisingly large, in particular for small data sets, which are very common in astrophysical problems. We conclude that reduced chi-squared can only be used with due caution for linear models, whereas it must not be used for nonlinear models at all. Finally, we recommend more sophisticated and reliable methods, which are also applicable to nonlinear models.

I added the link at the beginning; you can download a PDF of the paper here.

I’ve never really understood why this statistic (together with related frequentist-inspired ideas) is treated with such reverence by astronomers, so this paper offers a valuable critique to those tempted to rely on it blindly.
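Point (b) of the abstract is easy to demonstrate with a quick simulation. Here I fit a straight line (a linear model, so the number of degrees of freedom is well defined) to many small synthetic data sets drawn from the true model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fit a straight line (k = 2 parameters) to small Gaussian data sets and
# record the reduced chi-squared when the model is exactly correct.
n_data, k, trials = 10, 2, 20000
dof = n_data - k
x = np.linspace(0, 1, n_data)
A = np.c_[np.ones(n_data), x]

red_chi2 = np.empty(trials)
for i in range(trials):
    y = 1.0 + 2.0 * x + rng.normal(0, 1, n_data)   # true model + unit noise
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    red_chi2[i] = np.sum(resid**2) / dof

print(red_chi2.mean(), red_chi2.std())
```

Even though the model is exactly right, reduced chi-squared scatters around 1 with standard deviation sqrt(2/dof) – here sqrt(2/8) = 0.5 – which is far too large to discriminate between models on a single small data set.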

I was just thinking the other day that it’s been a while since I added any posts to the `Bad Statistics’ folder, but this Scientific American article offers a corker:

That parabola is a `Regression line’? Seriously? Someone needs a lesson in how not to over-fit data! It’s plausible that the orange curve might be the best-fitting parabola to the blue points, but that doesn’t mean it provides a sensible description of the data…
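To make the point concrete, here is a toy demonstration using made-up random scatter (not the actual data from the article): you can always find a `best-fitting’ parabola, but the fraction of variance it explains shows that it describes essentially nothing:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical stand-in for the scatter plot: x and y essentially uncorrelated.
x = rng.uniform(0, 10, 60)
y = rng.normal(5, 2, 60)

# You can always fit a "best" parabola...
coef = np.polyfit(x, y, 2)
yhat = np.polyval(coef, x)

# ...but R^2 shows how little of the scatter it actually explains.
ss_res = np.sum((y - yhat)**2)
ss_tot = np.sum((y - y.mean())**2)
r2 = 1 - ss_res / ss_tot
print(r2)   # close to zero: the curve describes almost none of the variance
```

A fit being the best of its class is a statement about the class, not about the data.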

I can see a man walking a dog in the pattern of points to the top right: can I get this observation published in Scientific American?

I noticed this morning that this week’s New Scientist cover feature (by Michael Brooks) is entitled Exclusive: Grave doubts over LIGO’s discovery of gravitational waves. The article is behind a paywall – and I’ve so far been unable to locate a hard copy in Maynooth – so I haven’t read it yet, but it is about the so-called `Danish paper’, which pointed out various unexplained features in LIGO data associated with the first detection of gravitational waves from a binary black hole merger.

I did know this piece was coming, however, as I spoke to the author on the phone some time ago to clarify some points I made in previous blog posts on this issue (e.g. this one and that one). I even ended up being quoted in the article:

Not everyone agrees the Danish choices were wrong. “I think their paper is a good one and it’s a shame that some of the LIGO team have been so churlish in response,” says Peter Coles, a cosmologist at Maynooth University in Ireland.

I stand by that comment, as I think certain members – though by no means all – of the LIGO team have been uncivil in their reaction to the Danish team, implying that they consider it somehow unreasonable that the LIGO results should be subject to independent scrutiny. I am not convinced that the unexplained features in the data released by LIGO really do cast doubt on the detection, but unexplained features there undoubtedly are. Surely it is the job of science to explain the unexplained?

An important aspect of the way science works is that when a given individual or group publishes a result, it should be possible for others to reproduce it (or not, as the case may be). In normal-sized laboratory physics it suffices to explain the experimental set-up in the published paper in sufficient detail for another individual or group to build an equivalent replica experiment if they want to check the results. In `Big Science’, e.g. with LIGO or the Large Hadron Collider, it is not practically possible for other groups to build their own copy, so the best that can be done is to release the data coming from the experiment. A basic problem with reproducibility obviously arises when this does not happen.

In astrophysics and cosmology, results in scientific papers are often based on very complicated analyses of large data sets. This is also the case for gravitational wave experiments. Fortunately, in astrophysics these days, researchers are generally pretty good at sharing their data, but there are a few exceptions in that field.

Even allowing open access to data doesn’t always solve the reproducibility problem. Often extensive numerical codes are needed to process the measurements and extract meaningful output. Without access to these pipeline codes it is impossible for a third party to check the path from input to output without writing their own version, assuming that there is sufficient information to do that in the first place. That researchers should publish their software as well as their results is quite a controversial suggestion, but I think it’s the best practice for science. In any case there are often intermediate stages between `raw’ data and scientific results, as well as ancillary data products of various kinds. I think these should all be made public. Doing that could well entail a great deal of effort, but I think in the long run that it is worth it.

I’m not saying that scientific collaborations should not have a proprietary period, just that this period should end when a result is announced, and that any such announcement should be accompanied by a release of the data products and software needed to subject the analysis to independent verification.

Given that the detection of gravitational waves is one of the most important breakthroughs ever made in physics, I think this is a matter of considerable regret. I also find it difficult to understand the reasoning that led the LIGO consortium to think it was a good plan only to go part of the way towards open science, by releasing only part of the information needed to reproduce the processing of the LIGO signals and their subsequent statistical analysis. There may be good reasons that I know nothing about, but at the moment it seems to me to represent a wasted opportunity.

CLARIFICATION: The LIGO Consortium released data from the first observing run (O1) – you can find it here – early in 2018, but this data set was not available publicly at the time of publication of the first detection, nor when the team from Denmark did their analysis.

I know I’m an extremist when it comes to open science, and there are probably many who disagree with me, so here’s a poll I’ve been running for a year or so on this issue:

Any other comments welcome through the box below!

UPDATE: There is a (brief) response from LIGO (& VIRGO) here.

As I wait in Cardiff Airport for a flight back to civilization, I thought I’d briefly mention a paper that appeared on the arXiv this summer. The abstract of this paper (by Daniel An, Krzysztof A. Meissner and Roger Penrose) reads as follows:

This paper presents powerful observational evidence of anomalous individual points in the very early universe that appear to be sources of vast amounts of energy, revealed as specific signals found in the CMB sky. Though seemingly problematic for cosmic inflation, the existence of such anomalous points is an implication of conformal cyclic cosmology (CCC), as what could be the Hawking points of the theory, these being the effects of the final Hawking evaporation of supermassive black holes in the aeon prior to ours. Although of extremely low temperature at emission, in CCC this radiation is enormously concentrated by the conformal compression of the entire future of the black hole, resulting in a single point at the crossover into our current aeon, with the emission of vast numbers of particles, whose effects we appear to be seeing as the observed anomalous points. Remarkably, the B-mode location found by BICEP 2 is at one of these anomalous points.

The presence of Roger Penrose in the author list is no doubt a factor that contributed to the substantial amount of hype surrounding this paper. Although he is the originator of Conformal Cyclic Cosmology, however, I suspect he didn’t have much to do with the data analysis presented in the paper: great mathematician though he is, data analysis is not his forte.

I have to admit that I am very skeptical of the claims made in this paper – as I was in the previous case of claims of evidence in favour of the Penrose model. In that case the analysis was flawed because it did not properly calculate the probability of the claimed anomalies in the standard model of cosmology. Moreover, the addition of a reference to BICEP2 at the end of the abstract doesn’t strengthen the case. The detection claimed by BICEP2 was (a) in polarization, not in temperature, and (b) is now known to be consistent with galactic foregrounds.

I will, however, hold my tongue on these claims, at least for the time being. I have an MSc student at Maynooth who is going to try to reproduce the analysis (which is not trivial, as the description in the paper is extremely vague). Watch this space.

There is a new polling agency on the block, called DeltaPoll.

I had never heard of them until last week, when they had a strange poll published in the Daily Mail (which, obviously, I’m not going to link to).

I think we need new pollsters like we need a hole in the head. These companies are forever misrepresenting the accuracy of their surveys and they confuse more than they inform. I was intrigued, however, so I looked up their Twitter profile and found this:

They don’t have a big Twitter following, but the names behind it have previously been associated with other polling agencies, so perhaps it’s not as dodgy as I assumed.

On the other hand, what on Earth does `emotional and mathematical measurement methods’ mean?
