## Cerebral Asymmetry: is it all in the Mind?

After blogging a few days ago about the possibility that our entire Universe might be asymmetric, I found out today that a short comment of mine about a completely different form of asymmetry has been published in the *Proceedings of the National Academy of Sciences of the United States of America*.

Earlier this summer a paper by Ivanka Savic & Per Lindstrom concerning gender and sexuality differences in brain structure received widespread press coverage and the odd blog comment. They had analysed a group of 90 volunteers divided into four classes based on gender and sexual orientation: male heterosexual, male homosexual, female heterosexual and female homosexual.

They studied the brain structure of these volunteers using Magnetic Resonance Imaging and used their data to look for differences between the different classes. In particular they measured the asymmetry between left and right hemispheres for their samples. The right side of the brain for heterosexual men was found to be typically about 2% larger than the left; homosexual women also had an asymmetry, but slightly smaller than this at about 1%. Gay men and heterosexual women showed no discernible cerebral asymmetry. These claims are obviously very interesting and potentially important if they turn out to be true. It is in the nature of the scientific method that such results should be subjected to rigorous scrutiny in order to check their credibility.

As someone who knows nothing about neurobiology but one or two things about statistics, I dug out the research paper by Savic & Lindstrom and looked at the analysis it presents. I very quickly began to suspect there might be a problem. For each volunteer, the authors obtain measurements of the left and right cerebral volumes (call these L and R respectively). Each pair of measurements is then combined to form an asymmetry index (AI) as (L-R)/(L+R). There is then a set of values for AI, one for each volunteer. The claim is that these are systematically different for the different gender and orientation groups, based on a battery of tests including Analysis of Variance (ANOVA) and t-tests based on sample means.
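To make the definition concrete, here is a minimal sketch of how the asymmetry index would be computed for each volunteer. The volumes below are made-up numbers in arbitrary units, chosen only for illustration; they are not taken from the paper, and the function name is my own.

```python
def asymmetry_index(left, right):
    """Return AI = (L - R) / (L + R) for one pair of hemisphere volumes."""
    return (left - right) / (left + right)


# Hypothetical (left, right) volumes in arbitrary units; NOT data from the paper.
volunteers = [(490.0, 510.0), (505.0, 495.0), (500.0, 500.0)]

# One AI value per volunteer, then the sample mean that a t-test would compare.
ai_values = [asymmetry_index(left, right) for left, right in volunteers]
group_mean = sum(ai_values) / len(ai_values)

print(ai_values, group_mean)
```

Note that AI is negative when the right hemisphere is the larger one, and that both L and R appear in the denominator, which is what makes this a ratio variable.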

Of course, it would be better to do this using a consistent, Bayesian, approach because this would make explicit the dependence of the results on an underlying model of the data. Sadly, the statistical methodology available off-the-shelf is of inferior frequentist type and this is what researchers tend to do when they don’t really know what they’re doing. They also don’t bother to read the health warnings that state the assumptions behind the results.

The problem in this case is that the tests done by Savic & Lindstrom all depend on the quantity being analysed (AI) having a normal (Gaussian) distribution. This is very often a reasonable hypothesis for biometric data, but unfortunately in this case the construction of the asymmetry index is such that it is expected to have a very non-Gaussian shape, as is commonly the case for distributions of variables formed as ratios. In fact, the ratio of two normal variates has a peculiar distribution with very long tails. Many statistical analyses appeal to the Central Limit Theorem to justify the assumption of normality, but distributions with very long tails (such as the Cauchy distribution) violate the conditions of this Theorem, namely that the distribution must have finite variance. The asymmetry index is probably therefore an inappropriate choice of variable for the tests that Savic & Lindstrom perform. In particular the significance levels (or p-values) quoted in their paper are very low (of order 0.0008, for example, in the ANOVA test), which is surprising for such small samples. These probabilities are obtained by assuming the observations have Gaussian statistics, and they would be much higher for a distribution with longer tails.
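The long-tail point is easy to see numerically. The sketch below takes the extreme textbook case, the ratio of two independent zero-mean, unit-variance normal variates (which is exactly a standard Cauchy variate), and compares how often it lands beyond ±10 with how often a Gaussian does. This is an illustration of the general danger only; it is not a model of the actual brain-volume data, and the function names and the threshold of 10 are arbitrary choices of mine.

```python
import math
import random


def tail_fraction(samples, threshold):
    """Fraction of samples whose absolute value exceeds the threshold."""
    return sum(abs(x) > threshold for x in samples) / len(samples)


def normal_ratio(rng):
    """Ratio of two independent N(0, 1) draws (a standard Cauchy variate)."""
    denominator = 0.0
    while denominator == 0.0:  # guard against an (astronomically unlikely) zero
        denominator = rng.gauss(0.0, 1.0)
    return rng.gauss(0.0, 1.0) / denominator


def simulate_tails(n=100_000, threshold=10.0, seed=1):
    rng = random.Random(seed)
    ratios = [normal_ratio(rng) for _ in range(n)]
    normals = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return tail_fraction(ratios, threshold), tail_fraction(normals, threshold)


ratio_tail, normal_tail = simulate_tails()

# Exact tail probability of a standard Cauchy beyond +/-10, for comparison.
cauchy_exact = 1.0 - (2.0 / math.pi) * math.atan(10.0)

print("ratio of normals:", ratio_tail)
print("Cauchy, exact:   ", cauchy_exact)
print("Gaussian:        ", normal_tail)
```

A Gaussian essentially never strays ten standard deviations from the mean, whereas roughly one draw in sixteen from the ratio distribution does. Any p-value computed on the Gaussian assumption would badly understate how often such extreme values turn up by chance.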

Being a friendly chap I emailed Dr Savic drawing this problem to her attention and asking if she knew about this problem and the possible implications it might have for the analysis she had presented. If not, I offered to do an independent (private) check on the data to see how reliable the claimed statistical results actually were. I never received a reply.

Worried that the world might be jumping to all kinds of far-reaching conclusions about gay genes based on these questionable statistics, I wrote instead to the editor of the journal *Proceedings of the National Academy of Sciences of the United States of America*, Randy Schekman, who suggested I submit a written comment to the journal. I did, it was accepted by the editorial committee, and it came out in the 11th November issue. What I didn’t realise was that Savic & Lindstrom had actually prepared a reply and that this was published alongside my comment. I find it strange that I wasn’t told about this before publication but, that aside, it is in principle quite reasonable to let the authors respond to criticisms like mine. Their response reveals that they completely missed the point of the danger of long-tailed distributions I mentioned above. They state that “*when the sample size n is big the sampling distribution of the mean becomes approximately normal regardless of the distribution of the original variable*”. Not if the distribution of the original variable has such a long tail it doesn’t! In fact, if the observations have a Cauchy distribution then so does the sampling distribution of the mean, whatever the size of sample. You can find this caveat spelled out in many places. Savic & Lindstrom seem oblivious to this pitfall, even after I specifically pointed it out to them.
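The claim about the Cauchy distribution can be checked with a few lines of simulation. The sketch below draws many samples of a given size, takes the mean of each, and measures the spread of those means by their interquartile range (the variance is useless here because it does not exist for a Cauchy). The sample sizes, replicate count and helper names are my own illustrative choices.

```python
import math
import random
import statistics


def cauchy_draw(rng):
    """Standard Cauchy variate via the inverse-CDF method."""
    return math.tan(math.pi * (rng.random() - 0.5))


def normal_draw(rng):
    """Standard normal variate."""
    return rng.gauss(0.0, 1.0)


def interquartile_range(values):
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


def iqr_of_sample_means(draw, sample_size, replicates=2000, seed=2):
    """Spread (IQR) of the sample mean over many independent samples."""
    rng = random.Random(seed)
    means = [
        sum(draw(rng) for _ in range(sample_size)) / sample_size
        for _ in range(replicates)
    ]
    return interquartile_range(means)


cauchy_iqr_1 = iqr_of_sample_means(cauchy_draw, 1)
cauchy_iqr_100 = iqr_of_sample_means(cauchy_draw, 100)
normal_iqr_1 = iqr_of_sample_means(normal_draw, 1)
normal_iqr_100 = iqr_of_sample_means(normal_draw, 100)

print("Cauchy means, n=1 vs n=100:  ", cauchy_iqr_1, cauchy_iqr_100)
print("Gaussian means, n=1 vs n=100:", normal_iqr_1, normal_iqr_100)
```

For Gaussian data the spread of the mean shrinks by a factor of ten when the sample size goes from 1 to 100, as the Central Limit Theorem promises. For Cauchy data it stays close to 2 (the interquartile range of a single standard Cauchy draw) however large the sample: averaging buys you nothing.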

They also claim that a group size of n=30 is sufficient to be confident that the central limit theorem holds. A pity, then, that none of their groups is of that size. The overall sample is 90, but it is broken down into two groups of 20 and two of 25.

They also say that the measured AI distribution is actually normal anyway and give a plot (above). This shows all the AI values binned into one histogram. Since they don’t give any quantitative measures of goodness of fit, it’s hard to tell whether this has a normal distribution or not. One can, however, easily identify a group of five or six individuals that seem to form a separate group with larger AI values (the small peak to the right of the large peak). Since they don’t give histograms broken down by group it is impossible to be sure, but I would hazard a guess that these few individuals might be responsible for the entire result; remember that the entire sample has n of only 90.

More alarmingly, Savic & Lindstrom state in their reply that “one outlier” is omitted from this graph. Really? On what basis was the outlier rejected? The existence of outliers could be evidence of exactly the sort of problem I am worried about! Unless there was a known mistake in the measurement, this outlier should never have been omitted. They claim that the “recalculation of the data excluding this outlier does not change the results”. I find it difficult to believe that the removal of an outlier from such a small sample could not change the p-values!

In my note I made a few constructive suggestions as to how the difficulty might be circumvented, but Savic & Lindstrom have not followed any of them. Instead they report (without details of the p-values) having done some alternative, non-parametric, tests. These are all very well, but they don’t add very much if their p-values also assume Gaussian statistics. A better way to do this sort of thing robustly would be using Monte Carlo simulations.
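One simple way to do a Monte Carlo check, and I stress that this is just one possible reading of the suggestion rather than the specific procedure proposed in the published comment, is a permutation test: shuffle the group labels many times and see how often a difference in mean AI as large as the observed one arises by chance. No Gaussian assumption enters the p-value. The AI values below are invented for illustration and the function name is hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_perm=2000, seed=3):
    """Two-sided Monte Carlo permutation test on the difference of means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of group labels
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for rounding error
            hits += 1

    # Add-one correction so the estimated p-value can never be exactly zero.
    return (hits + 1) / (n_perm + 1)


# Made-up AI values for two groups; NOT data from the paper.
group_a = [-0.021, -0.015, -0.030, -0.008, -0.019, -0.025, -0.012, -0.017]
group_b = [0.002, -0.004, 0.006, -0.001, 0.003, -0.007, 0.005, 0.000]

print("permutation p-value:", permutation_p_value(group_a, group_b))
```

With the real measurements in place of the invented ones, the same function would show directly whether the quoted significance levels survive once the Gaussian assumption is dropped.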

The bottom line is that after this exchange of comments we haven’t really got anywhere and I still don’t know if the result is significant. I don’t really think it’s useful to go backwards and forwards through the journal, so I’ve emailed Dr Savic again asking for access to the numbers so I can check the statistics privately. In astronomy it is quite normal for people to make their data sets publicly available, but that doesn’t seem to be the case in neurobiology. I’m not hopeful that they will reply, especially since they branded my comments “harsh” and “inappropriate”. Scientists should know how to take constructive criticism.

Their conclusion may eventually turn out to be right, but the analysis done so far is certainly not robust and it needs further checking. In the meantime I don’t just have doubts about the claimed significance of this specific result, which merely serves to illustrate the extremely poor level of statistical understanding displayed by large numbers of professional researchers. This was one of the things I wrote about in my book *From Cosmos to Chaos*. I’m very confident that a large fraction of claimed results in biosciences are based on bogus analyses.

I’ve long thought that scientific journals that deal with subjects like this should employ panels of statisticians to do the analysis independently of the authors and also that publication of the paper should require publication of the raw data. Science advances when results are subject to open criticism and independent analysis. I sincerely hope that Savic & Lindstrom will release their data in order for their conclusions to be checked in this way.

It’s no wonder that there is so much public distrust of science, when such important claims are rushed into the public domain without proper scrutiny.

November 12, 2008 at 3:33 pm

I think I need to go on a statistics course….

November 12, 2008 at 4:28 pm

Good explanation of the underlying statistics in this neurobiology study, which presumably displays a pitfall. I need a statistics course as well.

Nevertheless, brain asymmetry, and in particular asymmetry among individuals with different sexual orientations (male homosexuals having the same degree of asymmetry as female heterosexuals), is confirmed even by anatomical studies and visual inspection of brain samples after staining.

In other words, brain asymmetry seems to be an undisputed neurobiological dogma.

But what you describe is really bad for science, because you are making a constructive criticism and the authors are avoiding confronting it.

Why don’t you think about shifting to neurobiology? The brain is as interesting theoretically as galaxies, or even more so.

November 12, 2008 at 5:25 pm

But if someone who knows about statistics goes into biology we’re all screwed…. My supervisor will have to retract every paper we’ve ever published. You stick to physics Peter. We don’t need your robust statistical analyses here.

November 12, 2008 at 6:34 pm

Anibal. It wouldn’t work. Neuroscientists get on my nerves.

November 12, 2008 at 8:38 pm

You’re clearly right, Peter. For the sake of the record it is important that you continue the dialogue in the relevant journal.

More worrying is the prevalence of ad hoc statistical tests in areas such as manufacturing quality control of aeroplane parts, and drug safety tests…

Anton

November 15, 2008 at 10:04 am

Peter,

Thinking more about this: I expect you are right that if L and R conform to Gaussian (normal) distributions then (L-R)/(L+R) conforms to a Cauchy distribution. (I’ve not checked the integrations.) But:

1. It won’t change conclusions very much provided that the difference (L-R) is much smaller in magnitude than the sum (L+R), so that the distribution of (L-R)/(L+R) is tight; and

2. If the distribution is not tight, then you can’t assume normal distributions for L and R because these quantities cannot be negative. (The normal distribution runs from minus infinity to infinity.)

Did the authors say anything like this in their published reply?

Anton

November 16, 2008 at 3:45 pm

Anton,

You’re right about this. It is true that the mean differences are much smaller than the means so, if the populations are Gaussian, (L-R)/(L+R) may not be too badly behaved. I gave the exact form of the distribution in my comment. My point was that ratio distributions have the potential to have a sting in their tails and that this should be checked (and can be checked quite easily). They did not comment on these issues in their reply. They did show the histogram but that doesn’t really settle much.
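As a rough sketch of the kind of check I mean, one can simulate (L-R)/(L+R) for Gaussian L and R and look at the excess kurtosis, which is zero for a Gaussian and large for a long-tailed distribution. The means and standard deviations below are invented for illustration and are not taken from the paper; the zero-mean case is deliberately unphysical (volumes cannot be negative) and is there only to show the Cauchy limit.

```python
import random


def excess_kurtosis(values):
    """Sample excess kurtosis: about 0 for Gaussian data, large for long tails."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n
    m4 = sum((v - centre) ** 4 for v in values) / n
    return m4 / (m2 * m2) - 3.0


def simulate_ai(mean_volume, sd, n=20_000, seed=4):
    """Draw independent Gaussian L and R and return the (L-R)/(L+R) values."""
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        left = rng.gauss(mean_volume, sd)
        right = rng.gauss(mean_volume, sd)
        values.append((left - right) / (left + right))
    return values


# Tight case: scatter is small compared with the mean volume (hypothetical numbers).
tight = simulate_ai(mean_volume=500.0, sd=25.0)

# Extreme case: zero mean, so the ratio is exactly Cauchy-distributed.
loose = simulate_ai(mean_volume=0.0, sd=25.0)

print("tight:", excess_kurtosis(tight))
print("loose:", excess_kurtosis(loose))
```

When the scatter is only a few per cent of the mean, the ratio is close to Gaussian, which is your point 1. The check is cheap enough that there is little excuse for not running it on the real numbers.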

Peter
