GAA Clustering

Posted in Bad Statistics, GAA, The Universe and Stuff on July 25, 2022 by telescoper
The distribution of GAA pitches in Ireland

The above picture was doing the rounds on Twitter yesterday ahead of this year’s All-Ireland Football Final at Croke Park (won by favourites Kerry despite a valiant effort from Galway, who led for much of the game and didn’t play at all like underdogs).

The picture above shows the distribution of Gaelic Athletic Association (GAA) grounds around Ireland. In case you didn’t know, Hurling and Gaelic Football are played on the same pitch with the same goals and markings on the field. The first thing you notice is that the grounds are plentiful! Obviously the distribution is clustered around major population centres – Dublin, Cork, Limerick and Galway are particularly clear – but other than that the distribution is quite uniform, though in less populated areas the grounds tend to be less densely packed.

The eye is also drawn to filamentary features, probably related to major arterial roads. People need to be able to get to the grounds, after all. Or am I reading too much into these apparent structures? The eye is notoriously keen to see patterns where none really exist, a point I’ve made repeatedly on this blog in the context of galaxy clustering.

The statistical description of clustered point patterns is a fascinating subject, because it makes contact with the way in which our eyes and brain perceive pattern. I’ve spent a large part of my research career trying to figure out efficient ways of quantifying pattern in an objective way and I can tell you it’s not easy, especially when the data are prone to systematic errors and glitches. I can only touch on the subject here, but to see what I am talking about look at the two patterns below:

You will have to take my word for it that one of these is a realization of a two-dimensional Poisson point process and the other contains correlations between the points. One therefore has a real pattern to it, and one is a realization of a completely unstructured random process.

random or non-random?

I show this example in popular talks and get the audience to vote on which one is the random one. The vast majority usually think that the one on the right is random and the one on the left is the one with structure to it. It is not hard to see why. The right-hand pattern is very smooth (what one would naively expect for a constant probability of finding a point at any position in the two-dimensional space), whereas the left-hand one seems to offer a profusion of linear, filamentary features and densely concentrated clusters.

In fact, it’s the picture on the left that was generated by a Poisson process using a Monte Carlo random number generator. All the structure that is visually apparent is imposed by our own sensory apparatus, which has evolved to be so good at discerning patterns that it finds them when they’re not even there!
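
For readers who want to generate a pattern like the left-hand one, here is a minimal Python sketch of a homogeneous Poisson point process on a rectangle. It is not the code behind the picture: the function names, the `intensity` parameter and the default unit square are all assumptions made for illustration. The recipe is to draw the number of points from a Poisson distribution whose mean is the intensity times the area, and then to scatter that many points uniformly and independently.

```python
import random


def poisson_count(mean, rng):
    """Draw a Poisson-distributed integer with the given mean.

    Uses the fact that the number of unit-rate exponential arrivals
    falling in the interval [0, mean] is Poisson(mean).
    """
    count = 0
    t = rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def poisson_points(intensity, width=1.0, height=1.0, seed=None):
    """Return a list of (x, y) points from a homogeneous Poisson process.

    `intensity` is the expected number of points per unit area
    (a hypothetical parameter name chosen for this sketch).
    """
    rng = random.Random(seed)
    n = poisson_count(intensity * width * height, rng)
    # Given the count, positions are independent and uniform.
    return [(rng.uniform(0.0, width), rng.uniform(0.0, height)) for _ in range(n)]


if __name__ == "__main__":
    points = poisson_points(500, seed=1)
    print(len(points), "points generated")
```

Plotting the output of `poisson_points(500)` with any plotting package should show clumps and voids of the kind described above, even though no correlations were put in.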

The right-hand process is also generated by a Monte Carlo technique, but the algorithm is more complicated. In this case the presence of a point at some location suppresses the probability of having other points in the vicinity. Each event has a zone of avoidance around it; the points are therefore anticorrelated. The result of this is that the pattern is much smoother than a truly random process should be. In fact, this simulation has nothing to do with galaxy clustering really. The algorithm used to generate it was meant to mimic the behaviour of glow-worms, which tend to eat each other if they get too close. That’s why they spread themselves out in space more uniformly than in the random pattern.
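
The exact algorithm behind the glow-worm picture is not given here, so the following is only a sketch of one simple way to build in a zone of avoidance: propose points uniformly at random and reject any that land within a chosen distance of a point already accepted (often called simple sequential inhibition). The names `inhibited_points`, `min_dist` and `clark_evans_ratio` are illustrative assumptions. The second function computes the Clark-Evans nearest-neighbour ratio, which is close to 1 for a Poisson pattern and larger than 1 for a pattern that is smoother than random. It ignores edge effects, so treat it as a rough diagnostic only.

```python
import math
import random


def inhibited_points(n_points, min_dist, seed=None, max_tries=100_000):
    """Place up to n_points in the unit square, no two closer than min_dist.

    Candidates are drawn uniformly; a candidate is rejected if it falls
    inside the zone of avoidance of any point accepted so far. Fewer than
    n_points may be returned if max_tries proposals are exhausted.
    """
    rng = random.Random(seed)
    points = []
    tries = 0
    while len(points) < n_points and tries < max_tries:
        tries += 1
        candidate = (rng.random(), rng.random())
        if all(math.dist(candidate, p) >= min_dist for p in points):
            points.append(candidate)
    return points


def clark_evans_ratio(points, area=1.0):
    """Observed mean nearest-neighbour distance over the Poisson expectation.

    For a Poisson process of density rho the expected nearest-neighbour
    distance is 1 / (2 * sqrt(rho)), neglecting boundary effects.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_nn = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    expected = 0.5 / math.sqrt(n / area)
    return mean_nn / expected


if __name__ == "__main__":
    pts = inhibited_points(100, 0.06, seed=2)
    print(len(pts), "points, Clark-Evans ratio", clark_evans_ratio(pts))
```

The rejection step is the whole story: it is what makes the points anticorrelated, and a larger `min_dist` gives a smoother pattern, up to the point where the square fills and no further candidates can be accepted.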

Incidentally, I got both pictures from Stephen Jay Gould’s collection of essays Bully for Brontosaurus and used them, with appropriate credit and copyright permission, in my own book From Cosmos to Chaos.

The tendency to find things that are not there is quite well known to astronomers. The constellations which we all recognize so easily are not physical associations of stars, but are just chance alignments on the sky of things at vastly different distances in space. That is not to say that they are random, but the pattern they form is not caused by direct correlations between the stars. Galaxies form real three-dimensional physical associations through their direct gravitational effect on one another.

People are actually pretty hopeless at understanding what “really” random processes look like, probably because the word random is used so often in very imprecise ways and they don’t know what it means in a specific context like this. The point about random processes, even simpler ones like repeated tossing of a coin, is that coincidences happen much more frequently than one might suppose.
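
The coin-tossing remark is easy to check numerically. The sketch below simulates many sequences of fair coin tosses and estimates how often a sequence contains a long run of identical outcomes. The function names and the default choices (100 tosses, runs of five or more, 2000 trials) are assumptions picked purely for illustration.

```python
import random


def longest_run(tosses):
    """Length of the longest run of identical consecutive outcomes."""
    if not tosses:
        return 0
    longest = current = 1
    for prev, nxt in zip(tosses, tosses[1:]):
        # Extend the current run if the outcome repeats, else start afresh.
        current = current + 1 if nxt == prev else 1
        longest = max(longest, current)
    return longest


def fraction_with_long_run(n_tosses=100, run_length=5, trials=2000, seed=None):
    """Estimate the fraction of fair-coin sequences of n_tosses tosses
    that contain a run of at least run_length heads or tails."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        tosses = [rng.random() < 0.5 for _ in range(n_tosses)]
        if longest_run(tosses) >= run_length:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(fraction_with_long_run(seed=0))
```

With these settings the large majority of sequences contain a run of five or more heads or tails in a row, which is exactly the kind of streak that people asked to write down a "random" sequence by hand tend to avoid.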

I suppose there is an evolutionary reason why our brains like to impose order on things in a general way. More specifically, scientists often use perceived patterns in order to construct hypotheses. However, these hypotheses must be tested objectively, and often the initial impressions turn out to be figments of the imagination, like the canals on Mars.