Phase Correlations and Cosmic Structure

Posted in Biographical, The Universe and Stuff on July 9, 2022 by telescoper

I’m indebted to a friend for tipping me off about a nice paper that appeared recently on the arXiv by Franco et al. with the title First measurement of projected phase correlations and large-scale structure constraints. The abstract is here:

Phase correlations are an efficient way to extract astrophysical information that is largely independent from the power spectrum. We develop an estimator for the line correlation function (LCF) of projected fields, given by the correlation between the harmonic-space phases at three equidistant points on a great circle. We make a first, 6.5σ measurement of phase correlations on data from the 2MPZ survey. Finally, we show that the LCF can significantly improve constraints on parameters describing the galaxy-halo connection that are typically degenerate using only two-point data.

I’ve worked on phase correlations myself (with a range of collaborators) – you can see a few of the papers here. Indeed I think it is fair to say I was one of the first people to explore ways of quantifying phase information in cosmology. Although I haven’t done anything on this recently (by which I mean in the last decade or so), other people have been developing very promising-looking approaches, including the Line Correlation Function (LCF) explored in the above paper. In my view there is a lot of potential in this approach, and as we await even more cosmological data I hope more people will look at it in future. In my opinion we still haven’t found the optimal way to exploit phase information statistically, so there’s a lot of work to be done in this field.

Anyway, I thought I’d try to explain what phase correlations are and why they are important.

One of the challenges we cosmologists face is how to quantify the patterns we see in, for example, galaxy redshift surveys. In the relatively recent past the small size of the available data sets meant that only relatively crude descriptors could be used; anything sophisticated would be rendered useless by noise. For that reason, statistical analysis of galaxy clustering tended to be limited to the measurement of autocorrelation functions, usually constructed in Fourier space in the form of power spectra; you can find a nice review here.

Because it is so robust and contains a great deal of important information, the power spectrum has become ubiquitous in cosmology. But I think it’s important to realize its limitations.

Take a look at these two N-body computer simulations of large-scale structure:

The one on the left is a proper simulation of the “cosmic web” which is at least qualitatively realistic, in that it contains filaments, clusters and voids pretty much like what is observed in galaxy surveys.

To make the picture on the right I first took the Fourier transform of the original simulation. This approach follows the best advice I ever got from my thesis supervisor: “if you can’t think of anything else to do, try Fourier-transforming everything.”

Anyway each Fourier mode is complex and can therefore be characterized by an amplitude and a phase (the modulus and argument of the complex quantity). What I did next was to randomly reshuffle all the phases while leaving the amplitudes alone. I then performed the inverse Fourier transform to construct the image shown on the right.

What this procedure does is to produce a new image which has exactly the same power spectrum as the first. You might be surprised by how little the pattern on the right resembles that on the left, given that they share this property; the distribution on the right is much fuzzier. In fact, the sharply delineated features are produced by mode-mode correlations and are therefore not well described by the power spectrum, which involves only the amplitude of each separate mode.
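For anyone who wants to try this at home, here is a minimal sketch of the idea in Python with NumPy. It is an illustration under stated assumptions, not the code used to make the images above: the toy `field` (a few sharp spikes on a 64×64 grid) stands in for a real simulation, the function name `randomize_phases` is made up for this example, and instead of literally permuting the existing phases it borrows fresh random phases from a white-noise field, which is a convenient way of keeping the output real-valued.

```python
import numpy as np

def randomize_phases(field, rng):
    """Return a real field with the Fourier amplitudes of `field` but random phases."""
    transform = np.fft.fft2(field)
    amplitudes = np.abs(transform)
    # The phases of a real white-noise field are random and already obey the
    # Hermitian symmetry needed for the inverse transform to be real.
    noise_phases = np.angle(np.fft.fft2(rng.standard_normal(field.shape)))
    # Keep the original phase of the zero-frequency mode so the mean is unchanged.
    noise_phases[0, 0] = np.angle(transform[0, 0])
    return np.fft.ifft2(amplitudes * np.exp(1j * noise_phases)).real

rng = np.random.default_rng(0)

# Toy "structured" field (an assumption for illustration): three sharp spikes.
field = np.zeros((64, 64))
field[10, 12] = field[40, 45] = field[30, 8] = 1.0

shuffled = randomize_phases(field, rng)

# Power in each Fourier mode, before and after scrambling the phases.
power_original = np.abs(np.fft.fft2(field)) ** 2
power_shuffled = np.abs(np.fft.fft2(shuffled)) ** 2
```

The two power arrays agree mode by mode to rounding error, yet `shuffled` has lost the sharp spikes: the same variance is now smeared across the whole grid as low-amplitude fuzz.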

If you’re confused by this, consider the Fourier transforms of (a) white noise and (b) a Dirac delta-function. Both produce flat power-spectra, but they look very different in real space because in (b) all the Fourier modes are correlated in such a way that they are in phase at the one location where the pattern is not zero; everywhere else they interfere destructively. In (a) the phases are distributed randomly.
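The contrast is easy to check numerically. The following is a small sketch in Python with NumPy, using a one-dimensional grid of 256 points and a spike at an arbitrarily chosen position (both are assumptions made for the example). Note that the power spectrum of a single white-noise realization is flat only on average, whereas that of the spike is exactly flat.

```python
import numpy as np

n = 256
rng = np.random.default_rng(1)

# (b) discrete stand-in for a delta function: one spike at an arbitrary position
spike_position = 37
delta = np.zeros(n)
delta[spike_position] = 1.0

# (a) one realization of white noise
noise = rng.standard_normal(n)

D = np.fft.rfft(delta)
W = np.fft.rfft(noise)

def phase_steps(transform):
    """Phase difference between neighbouring Fourier modes, wrapped to (-pi, pi]."""
    return np.angle(transform[1:] * np.conj(transform[:-1]))

# For the spike every mode has unit amplitude and the phase advances by the
# same amount, -2*pi*spike_position/n, from each mode to the next.
steps_delta = phase_steps(D)

# For the noise the steps are scattered all over the circle.
steps_noise = phase_steps(W)
```

The constant phase step is what makes all the modes line up at the position of the spike and cancel everywhere else; the noise has no such relationship between neighbouring modes.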

The moral of this is that there is much more to the pattern of galaxy clustering than meets the power spectrum…

Phase Correlations and the LIGO Data Analysis Paper

Posted in Bad Statistics, The Universe and Stuff on September 1, 2019 by telescoper

I have to admit I haven’t really kept up with developments in the world of gravitational waves this summer, though a number of candidate events have been reported in the third observing run (O3) of Advanced LIGO, which began in April 2019; I refer you to those reports if you’re interested.

I did notice, however, that late last week a new paper from the LIGO Scientific Collaboration and Virgo Collaboration appeared on the arXiv. This is entitled A guide to LIGO-Virgo detector noise and extraction of transient gravitational-wave signals and has the following abstract:

The LIGO Scientific Collaboration and the Virgo Collaboration have cataloged eleven confidently detected gravitational-wave events during the first two observing runs of the advanced detector era. All eleven events were consistent with being from well-modeled mergers between compact stellar-mass objects: black holes or neutron stars. The data around the time of each of these events have been made publicly available through the Gravitational-Wave Open Science Center. The entirety of the gravitational-wave strain data from the first and second observing runs have also now been made publicly available. There is considerable interest among the broad scientific community in understanding the data and methods used in the analyses. In this paper, we provide an overview of the detector noise properties and the data analysis techniques used to detect gravitational-wave signals and infer the source properties. We describe some of the checks that are performed to validate the analyses and results from the observations of gravitational-wave events. We also address concerns that have been raised about various properties of LIGO-Virgo detector noise and the correctness of our analyses as applied to the resulting data.

It’s an interesting paper that gives quite a lot of detail, especially about signal extraction and parameter-fitting, so it’s very well worth reading.

Two particular things caught my eye about this. One is that there’s no list of authors anywhere in the paper, which seems a little strange. This policy may not be new, of course. I did say I haven’t really been keeping up.

The other point I’ll mention relates to this Figure, the caption of which refers to paper [41], the famous ‘Danish paper’:

The Fourier phase is plotted vertically (between 0 and 2π) and the frequency horizontally. A random-phase distribution should have the phases uniformly distributed at each frequency. I think we can agree, without further statistical analysis, that the blue points don’t have that property! Of course nobody denies that the strongly correlated phases in the un-windowed data are at least partly an artifact of the application of a Fourier transform to a non-stationary time series.
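The qualitative effect is easy to reproduce with a toy time series. The sketch below (Python with NumPy) has nothing to do with real LIGO strain data: the series is a made-up example consisting of a large, slow oscillation that does not complete a whole number of cycles, so its two ends do not match, plus a little white noise. The Hann window, the frequency band and the summary statistic (the mean resultant length of the phases, which is near 1 for tightly clustered phases and near 0 for uniform ones) are all choices made for illustration.

```python
import numpy as np

def mean_resultant_length(phases):
    """Length of the average unit vector exp(i*phase): ~1 if clustered, ~0 if uniform."""
    return np.abs(np.mean(np.exp(1j * phases)))

n = 4096
t = np.arange(n)
rng = np.random.default_rng(2)

# Toy series (an assumption): 3.3 cycles of a strong slow oscillation, so the
# start and end values disagree, plus unit-variance white noise.
series = 1000.0 * np.sin(2 * np.pi * 3.3 * t / n) + rng.standard_normal(n)

# Look only at modes well above the frequency of the slow oscillation.
band = slice(100, n // 2)

# No window: leakage from the end mismatch dominates every mode in the band.
phases_raw = np.angle(np.fft.rfft(series))[band]

# Hann window: the series is tapered to zero at both ends before transforming.
phases_windowed = np.angle(np.fft.rfft(series * np.hanning(n)))[band]

r_raw = mean_resultant_length(phases_raw)
r_windowed = mean_resultant_length(phases_windowed)
```

In this toy case `r_raw` should come out large and `r_windowed` small, mirroring the blue and red points in the Figure. One caveat: a window also correlates neighbouring Fourier modes with each other, so a small value of this one-number summary does not by itself show that the windowed phases are independent.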

I suppose that showing that a window function used to apodize the data removes the phase correlations is meant to represent some form of rebuttal of the claims made in the Danish paper. If so, it’s not very convincing.

For a start, the caption just says that after windowing the resulting phases ‘appear randomly distributed’. Could they not provide some more meaningful statistical statement than a simple eyeball impression? The text says little more:

In addition to causing spectral leakage, improper windowing of the data can result in spurious phase correlations in the Fourier transform. Figure 4 shows a scatter plot of the Fourier phase as a function of frequency … both with and without the application of a window function. The un-windowed data shows a strong phase correlation, while the windowed data does not.

(I added the link to the explanation of ‘spectral leakage’.)

As I have mentioned before on this blog, the human eye is very poor at distinguishing pattern from randomness. There are some subtleties involved in testing for correlated phases (e.g. because they are periodic) but there are various techniques available: I’ve worked on this myself (see, e.g., here and here). The phases shown may well be consistent with a uniform random distribution, but I’m surprised the LIGO authors didn’t present a proper statistical analysis of the windowed phases to prove beyond doubt the point they seem to be trying to make.
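To give a flavour of what a quantitative statement might look like, here is a minimal sketch of one textbook option from circular statistics, the Rayleigh test for uniformity of angles, in Python with NumPy. It is not necessarily the technique used in the papers linked above, the function name `rayleigh_test` is just a label for this example, the p-value uses the simple large-sample approximation exp(-Z), and the two sets of phases are synthetic.

```python
import numpy as np

def rayleigh_test(phases):
    """Rayleigh test of uniformity for angles in radians.

    Returns the statistic Z = n * R**2, where R is the mean resultant length,
    and the large-sample approximation exp(-Z) to the p-value.
    """
    phases = np.asarray(phases)
    n = phases.size
    r = np.abs(np.mean(np.exp(1j * phases)))  # respects the 2*pi periodicity
    z = n * r ** 2
    return z, np.exp(-z)

rng = np.random.default_rng(3)

# Synthetic phases (assumptions for illustration): one uniform set and one
# mildly clustered around pi, drawn from a von Mises distribution.
uniform_phases = rng.uniform(0.0, 2.0 * np.pi, size=2000)
clustered_phases = rng.vonmises(mu=np.pi, kappa=0.5, size=2000) % (2.0 * np.pi)

z_uniform, p_uniform = rayleigh_test(uniform_phases)
z_clustered, p_clustered = rayleigh_test(clustered_phases)
```

The clustered set is rejected decisively even though, with a concentration this mild, a scatter plot of it would not look dramatically non-random. The test does have blind spots: it treats the phases as independent draws and is sensitive mainly to a single preferred direction, so it would miss clustering confined to a narrow range of frequencies or correlations between neighbouring modes.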

Then again, later on in the caption, there is a statement that the phases ‘show some clustering around the 60 Hz power line’. So, on the one hand the phases ‘appear random’, but on the other hand they’re not. There are other plausible clusters elsewhere too. What about them?

I’m afraid the absence of quantitative detail means I don’t find this a very edifying discussion!