## Oh what a tangled web we weave…

…when first we practice frequentist statistics!

I couldn’t resist a quick post directing you to a short paper on the arXiv with the following abstract:

I use archival data to measure the mass of the central black hole in NGC 4526, M = (4.70 ± 0.14) × 10^8 Msun. This 3% error bar is the most precise for an extra-galactic black hole and is close to the precision obtained for Sgr A* in the Milky Way. The factor 7 improvement over the previous measurement is entirely due to correction of a mathematical error, an error that I suggest may be common among astronomers.

The “mathematical error” quoted in the abstract involves using chi-squared-per-degree-of-freedom instead of chi-squared instead of the full likelihood function instead of the proper, Bayesian, posterior probability. The best way to avoid such confusion is to do things properly in the first place. That way you can also fold in errors on the distance to the black hole, etc etc…
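To see why confusing the two quantities matters so much, here is a minimal sketch assuming a single parameter and a parabolic chi-squared, chi^2(M) = chi^2_min + ((M - M0)/sigma)^2. The 1-sigma interval sits where chi^2 rises by 1; reading the same contour off a chi^2/dof map means chi^2 has actually risen by dof. The value `dof = 49` is hypothetical, chosen only so that its square root matches the factor 7 quoted in the abstract; the real number of degrees of freedom is not given in this post.

```python
import math

def half_width(sigma, delta_chi2):
    """Half-width of the interval where chi^2 - chi^2_min <= delta_chi2,
    assuming chi^2(M) = chi^2_min + ((M - M0) / sigma) ** 2."""
    return sigma * math.sqrt(delta_chi2)

sigma = 0.14   # error bar from the quoted abstract, in units of 10^8 Msun
dof = 49       # HYPOTHETICAL: picked so that sqrt(dof) = 7

# Correct reading: the 1-sigma interval is where chi^2 rises by 1.
correct = half_width(sigma, 1.0)

# Mistaken reading: a rise of 1 in chi^2/dof is a rise of dof in chi^2.
mistaken = half_width(sigma, float(dof))

inflation = mistaken / correct
print(f"correct half-width : {correct:.2f}")
print(f"mistaken half-width: {mistaken:.2f}")
print(f"inflation factor   : {inflation:.1f}")
```

Under these assumptions the error bar is overstated by sqrt(dof), however good the data are.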

March 11, 2013 at 7:55 pm

While I am not a fan of chi^2, it seems that Gould is making a fool of himself. The paper states they are chi^2 contours (not chi^2/dof as Gould claims), and, as it was pointed out in a discussion group yesterday, it does not look like 3% data.

But I agree, they should have done it properly in the first place.

March 12, 2013 at 11:42 am

I think Gould is probably correct (at least in his re-evaluation of Figure 2 of Davis et al., even if probably not in terms of his final black hole mass estimate) for the following reasons.

The caption of Figure 2 and the text both state that the contours are of chi^2 – chi^2_min, but the two example “bad” models shown in Figure 1 are labelled as having chi^2_red (which I take to be the same as chi^2/dof) that differ by ~28 from the best fit model. Both these models lie just outside the “chi^2 – chi^2_min = 25” contour in Figure 2, which implies that either i) it’s not chi^2_red in Figure 1 or ii) it is chi^2_red in Figure 2.

The data shown in Figure 1 might be able to resolve this ambiguity. The “overweight” black hole model has 14 central data-points which each appear to be ~100 km/s off the predicted values; the Supplementary Information implies uncertainties of ~10 km/s on each point (which roughly matches the error bars on the plot), and doesn’t say anything about correlations being taken into account. So that implies a chi^2 of ~1400, implying a reduced chi^2 of ~20 (but both values are very rough). At any rate, I think it makes it pretty clear that Figure 1 does list chi^2_red, and hence that Figure 2 shows chi^2_red as well, implying Gould’s re-evaluation is correct.
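The back-of-the-envelope numbers above are easy to check. This is only a sketch using the rough figures quoted in the comment (14 points, ~100 km/s residuals, ~10 km/s uncertainties) and assuming uncorrelated errors; the number of degrees of freedom is not stated in the comment, so the last line simply backs out the value that a reduced chi^2 of ~20 would imply.

```python
n_points = 14        # central data points that miss the model
residual = 100.0     # km/s, approximate offset of each point
sigma = 10.0         # km/s, approximate uncertainty on each point

# Each point contributes (residual / sigma)^2 to chi^2 if errors are independent.
chi2 = n_points * (residual / sigma) ** 2

# Degrees of freedom implied by a reduced chi^2 of ~20 (an inference, not a quoted value).
reduced_chi2 = 20.0
implied_dof = chi2 / reduced_chi2

print(f"chi^2 from the 14 central points: {chi2:.0f}")
print(f"implied degrees of freedom      : {implied_dof:.0f}")
```

Either way, a chi^2 of order a thousand cannot sit just outside a genuine "chi^2 - chi^2_min = 25" contour, which is the point of the argument.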

That said, the comment from the discussion group seems reasonable, and I suspect is in part because the trace data must be significantly correlated. The correct uncertainty on the black hole mass would then be greater than that obtained by Gould, but I still suspect it should be considerably lower than that reported by Davis et al.

I’m not about to try modelling this system for myself, but the data should be made available by the authors – it’s a condition of publishing in Nature. And even without doing any modelling, just having the trace and the uncertainties would be enough to at least calculate chi^2 for the two bad models, hence resolving the chi^2 vs. chi^2_red ambiguity . . .

March 11, 2013 at 9:15 pm

Why does this alleged error have any particular link with frequentist statistics? Bayesians can make silly mistakes too, I believe.

Or do you go by the ‘true Scotsman’ criterion: anyone who makes a silly mistake must not be a true Bayesian?

March 11, 2013 at 10:12 pm

The problem is that chi-squared is used by too many people who don’t know (or care) what it is they’re actually trying to do. Set it up as a Bayesian exercise and you won’t be tempted to use an inappropriate recipe.

March 13, 2013 at 9:09 am

The arXiv article has the comment “Submitted to ApJ”. Is this an example of the dangers of posting papers to the archive before refereeing?

Perhaps readers of this blog would appreciate it if Peter could find the time to post here a simple Baysian analysis of the problem as a worked example?

March 13, 2013 at 10:52 pm

And apologies for mistyping Bayesian.

(I do know how to spell Bayes’s name – I even know where he’s buried.)

March 13, 2013 at 3:40 pm

Since posting my comment above I have exchanged e-mails with Davis and Gould. Both confirm that the contours in Figure 2 of Davis et al. are in fact chi^2_red or chi^2/dof, so the caption is wrong and Gould’s re-evaluation of the uncertainty in M_BH is correct . . .

. . . at least formally. The reason for the caveat is that, according to Davis, there are strong correlations between the data-points, so the constraints implied by ignoring them (as both Davis and Gould have done so far) are clearly too tight. I suggested that taking every second measurement would produce (nearly) uncorrelated results and Davis agreed; hence there’s an immediate increase in the M_BH uncertainties of sort(2) = 1.4. But reading between the lines I believe Davis suspects this would still be too accurate given the data at hand.
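The factor of 1.4 follows from the usual 1/sqrt(N) scaling. A minimal sketch, assuming N independent measurements with equal uncertainties (the values of `sigma` and `n_all` below are hypothetical and cancel in the ratio):

```python
import math

def param_uncertainty(sigma, n):
    """Uncertainty on a fitted quantity from n independent, equally
    weighted measurements, each with uncertainty sigma."""
    return sigma / math.sqrt(n)

sigma = 10.0   # HYPOTHETICAL per-point uncertainty
n_all = 70     # HYPOTHETICAL number of data points

full = param_uncertainty(sigma, n_all)
thinned = param_uncertainty(sigma, n_all // 2)   # keep every second point

ratio = thinned / full
print(f"uncertainty grows by a factor of {ratio:.2f}")
```

This only captures the loss of information from thinning; if the retained points are still correlated, the true uncertainty is larger again.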

As for Bryn’s request for the Bayesian version of this, the ingredients (bar the as-yet unknown correlations) are already there in the original paper. If one assumes a broad, uniform prior on M_BH and M/L then the shrunken contours that Gould argued for are recovered. The contours are elliptical and evenly spaced for the constant increments of sort(chi^2), so the posterior is well approximated as a bivariate normal. As such, the marginal posterior in M_BH is just a normal with the mean and standard deviation argued for by Gould. (It’s one of those simple cases where the numbers that come out of a Bayesian calculation and a frequentist/classical calculation match, even if the philosophical interpretation differs.)
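For anyone who wants to see that argument run, here is a sketch rather than a reanalysis. It assumes a flat prior, so the posterior is proportional to exp(-chi^2/2), and an exactly quadratic chi^2 surface (elliptical contours). The M_BH mean and width are taken from the quoted abstract; the M/L centre, its width and the correlation `rho` are hypothetical, since the post does not give them.

```python
import math

m0, sig_m = 4.70, 0.14   # M_BH in 10^8 Msun, from the quoted abstract
u0, sig_u = 1.0, 0.05    # M/L centre and width: HYPOTHETICAL
rho = -0.5               # correlation between M_BH and M/L: HYPOTHETICAL

def grid(centre, sigma, n=201, width=6.0):
    """Uniform grid spanning centre +/- width * sigma."""
    step = 2.0 * width * sigma / (n - 1)
    return [centre - width * sigma + i * step for i in range(n)]

def delta_chi2(m, u):
    """chi^2 - chi^2_min for a quadratic surface with elliptical contours."""
    zm = (m - m0) / sig_m
    zu = (u - u0) / sig_u
    return (zm * zm - 2.0 * rho * zm * zu + zu * zu) / (1.0 - rho * rho)

m_grid = grid(m0, sig_m)
u_grid = grid(u0, sig_u)

# Flat prior: posterior is proportional to exp(-chi^2 / 2).
# Marginalise over M/L by summing along the (uniform) M/L grid.
marginal = [sum(math.exp(-0.5 * delta_chi2(m, u)) for u in u_grid) for m in m_grid]
norm = sum(marginal)
marginal = [p / norm for p in marginal]

post_mean = sum(m * p for m, p in zip(m_grid, marginal))
post_std = math.sqrt(sum((m - post_mean) ** 2 * p for m, p in zip(m_grid, marginal)))

print(f"marginal posterior for M_BH: {post_mean:.2f} +/- {post_std:.2f}")
```

Whatever value is chosen for `rho`, the marginal width comes back as `sig_m`, which is why the Bayesian and classical numbers coincide in this simple case.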

March 13, 2013 at 3:40 pm

(Bloodly auto-correct has turned “sort” into “sort” . . .)

March 13, 2013 at 3:41 pm

(And yet it doesn’t sort out “bloodily”! Grrr . . .)

March 13, 2013 at 4:52 pm

Of course I could correct these errors, but I rather like them.

March 13, 2013 at 8:27 pm

I happily take back my comment about Gould. Clearly the correlations between the data points are important (and could be quite easily built into modelling), so why wasn’t this done? It’s not much of an excuse for carrying on regardless.

March 25, 2013 at 6:14 pm

I find it surprising that Gould describes his data-gathering process as “I used a Xerox^(TM) machine to enlarge Figure 2…” rather than “I used a pdf viewer to enlarge Figure 2…”.

Doesn’t Ohio State have electronic access to Nature?