## An Informational Approach to Cosmological Parameter Estimation

In order to avoid having to make a start on examination marking, I was having a trawl through the arXiv this morning when I found an interesting paper by Stephens & Gleiser called *An Informational Approach to Cosmological Parameter Estimation*. The abstract is:

You can download a PDF of the full paper here.

I haven’t had time to go through the manuscript in detail, but while it doesn’t seem to say very much that is specific about the Hubble constant tension, it does introduce an approach which is new to me. The Jensen-Shannon Divergence is a variation on the familiar Kullback-Leibler Divergence.

Anyway, I’d be interested in comments on this from experts!

May 22, 2019 at 7:57 pm

I’m commenting here exclusively on the Jensen-Shannon divergence, not its application to cosmology (or anything else). The Jensen-Shannon expression is a quantifier of the difference between two probability distributions, and is simply a symmetrised version of the Kullback-Leibler expression. For two probability distributions p_i and q_i defined on the same space {i}, the Kullback-Leibler expression is

\sum_i p_i \log (p_i / q_i)

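and the Jensen-Shannon version symmetrises this by comparing each of p and q with their average, (p_i + q_i)/2, and taking half the sum of the two resulting Kullback-Leibler expressions. (Simply interchanging the p’s and q’s and taking half the sum of the original and interchanged expressions gives a different symmetrised form, usually called the Jeffreys divergence.)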
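For readers who want to experiment, here is a minimal Python sketch of the quantities under discussion. It assumes discrete distributions given as plain lists, and the numbers in `p` and `q` are made up for illustration, not taken from the paper. The function names are hypothetical. The Jensen-Shannon divergence is implemented in its usual form, which compares each distribution with the mixture (p + q)/2. The plain half-sum of the interchanged Kullback-Leibler expressions, usually called the Jeffreys divergence, is included for comparison.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler expression sum_i p_i log(p_i / q_i), in nats.

    Terms with p_i == 0 contribute zero. Assumes q_i > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jeffreys_half_sum(p, q):
    """Half the sum of the KL expression and its p <-> q interchange."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def jensen_shannon(p, q):
    """Jensen-Shannon divergence: average KL of p and q from their mixture."""
    mix = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mix) + 0.5 * kl_divergence(q, mix)


# Illustrative distributions on three outcomes (hypothetical values).
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]

print("KL(p||q), KL(q||p):", kl_divergence(p, q), kl_divergence(q, p))
print("Jeffreys half-sum: ", jeffreys_half_sum(p, q))
print("Jensen-Shannon:    ", jensen_shannon(p, q))
```

The two Kullback-Leibler values differ, which is the asymmetry in question, while both symmetrised forms are unchanged when `p` and `q` are swapped. The Jensen-Shannon value also never exceeds log 2 (in nats), whereas the Kullback-Leibler expression is unbounded.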

John Skilling has shown using some basic criteria of consistency that any expression quantifying the information in a probability distribution, which is to be used in an optimising process, *must* be of Kullback-Leibler form. The asymmetry in this form is not something to be squeamish about – you have to decide what is fundamental and what you are going to optimise. The authors should have done that, rather than seek an ad hoc symmetrical expression.

May 22, 2019 at 8:00 pm

Do you have a reference to John Skilling’s proof?

P. S. Must get him here for a talk…

May 22, 2019 at 11:45 pm

He’s published it lots of times, often as part of a paper that uses it. I’ll scan and email you the clearest version I have on my shelves. Also, Ed Jaynes wrote about the continuum version of Shannon entropy being

-\int p(x) \log [ p(x) / m(x) ] \, dx

where m(x) is the measure on x-space, and derived it as the continuum limit of the discrete version

-\sum_i p_i \log (p_i / m_i)

where m_i is a degeneracy factor which becomes a density of states in the continuum limit. When you understand it like that, you see that m is not something that can be tampered with in the way a probability distribution can: it is fixed by factors relating to the space on which probabilities are defined, and therefore more fundamental than the factors that determine probabilities.
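The role of m in the continuum limit can be checked numerically. The sketch below is only an illustration under stated assumptions: p(x) is taken to be a unit Gaussian, the measure is taken to be uniform, m(x) = 1, and the cell weights are p_i = p(x_i) Δx and m_i = m(x_i) Δx on a uniform grid. The function names and the grid range are hypothetical choices, not anything from Jaynes or Skilling.

```python
import math


def gaussian(x, mu=0.0, sigma=1.0):
    """Unit-normalised Gaussian density (assumed form of p(x) for this demo)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def discrete_entropies(n_bins, lo=-8.0, hi=8.0):
    """Return (-sum p_i log p_i, -sum p_i log(p_i / m_i)) on a uniform grid.

    p_i = p(x_i) * dx and m_i = m(x_i) * dx, with the assumed measure m(x) = 1.
    """
    dx = (hi - lo) / n_bins
    plain = 0.0
    relative = 0.0
    for i in range(n_bins):
        x = lo + (i + 0.5) * dx  # midpoint of cell i
        p_i = gaussian(x) * dx
        m_i = 1.0 * dx
        if p_i > 0:
            plain -= p_i * math.log(p_i)
            relative -= p_i * math.log(p_i / m_i)
    return plain, relative


for n in (10, 100, 1000):
    plain, relative = discrete_entropies(n)
    print(n, plain, relative)
```

As the grid is refined, the expression without m keeps growing, roughly like the log of the number of cells, while the expression with m settles down to the value of the integral, which for a unit Gaussian is (1/2) log(2πe).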