## Ergodic Means…

The topic of this post is something I’ve been wondering about for quite a while. This afternoon I had half an hour spare after a quick lunch so I thought I’d look it up and see what I could find.

The word *ergodic* is one you will come across very frequently in the literature of statistical physics, and in cosmology it also appears in discussions of the analysis of the large-scale structure of the Universe. I’ve long been puzzled as to where it comes from and what it actually means. Turning to the excellent Oxford English Dictionary Online, I found the answer to the first of these questions. Well, sort of. Under *etymology* we have

> ad. G. *ergoden* (L. Boltzmann 1887, in *Jrnl. f. d. reine und angewandte Math.* C. 208), f. Gr.

I say “sort of” because it does attribute the origin of the word to Ludwig Boltzmann, but the Greek roots (ἔργον, “work”, and ὁδός, “way” or “path”) appear to suggest it means “work-way” or something like that. I don’t think I follow an ergodic path on my way to work, so it remains a little mysterious.

The actual definitions of *ergodic* given by the OED are

> Of a trajectory in a confined portion of space: having the property that in the limit all points of the space will be included in the trajectory with equal frequency. Of a stochastic process: having the property that the probability of any state can be estimated from a single sufficiently extensive realization, independently of initial conditions; statistically stationary.

As I had expected, it has two meanings which are related, but which apply in different contexts. The first is to do with paths or orbits, although in physics this is usually taken to mean trajectories in phase space (including both positions and velocities) rather than just three-dimensional position space. However, I don’t think the OED has got it right in saying that the system visits all positions with equal frequency. I think an ergodic path is one that must visit all positions within a given volume of phase space rather than being confined to a lower-dimensional piece of that space. For example, the path of a planet under the inverse-square law of gravity around the Sun is confined to a one-dimensional ellipse. If the force law is modified by external perturbations then the path need not be as regular as this, in extreme cases wandering around in such a way that it never joins back on itself but eventually visits all accessible locations. As far as my understanding goes, however, it doesn’t have to visit them all with equal frequency. The ergodic property of orbits is intimately associated with the presence of chaotic dynamical behaviour.
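To make the orbit picture concrete, here is a toy illustration of my own (not from the OED or any textbook derivation): rotation of a point around a circle. With a rational rotation angle the orbit is periodic and stays on a lower-dimensional subset of the circle; with an irrational angle the orbit is ergodic and eventually fills the whole circle. The function names and the choice of 100 bins are just illustrative.

```python
import math

def rotation_orbit(alpha, n_steps):
    """Iterate the circle map theta -> (theta + alpha) mod 1, starting at 0."""
    theta, orbit = 0.0, []
    for _ in range(n_steps):
        orbit.append(theta)
        theta = (theta + alpha) % 1.0
    return orbit

def fraction_of_bins_visited(orbit, n_bins=100):
    """How much of the circle (split into equal bins) the orbit has covered."""
    visited = {int(theta * n_bins) for theta in orbit}
    return len(visited) / n_bins

# Rational angle: the orbit is periodic, stuck on 4 points of the circle.
print(fraction_of_bins_visited(rotation_orbit(0.25, 10_000)))            # 0.04

# Irrational angle: the orbit is dense and eventually covers every bin.
print(fraction_of_bins_visited(rotation_orbit(math.sqrt(2) % 1, 10_000)))  # 1.0
```

Note that even the irrational orbit need not visit every bin with *equal* frequency at any finite time, which is the distinction drawn above.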

The other definition relates to stochastic processes, i.e. processes involving some sort of random component. These could either consist of a discrete collection of random variables {*X₁, …, Xₙ*} (which may or may not be correlated with each other) or a continuously fluctuating function of some parameter such as time *t*, i.e. *X(t)*, or spatial position (or perhaps both).

Stochastic processes are quite complicated mathematical entities because they are specified by probability measures. What the ergodic hypothesis means in the second sense is that measurements extracted from a single *realization* of such a process have a definite relationship to analogous quantities defined by the probability distribution.

I always think of a stochastic process being like a kind of algorithm (whose workings we don’t know). Put it on a computer, press “go” and it spits out a sequence of numbers. The ergodic hypothesis means that by examining a sufficiently long run of the output we could learn something about the properties of the algorithm.

An alternative way of thinking about this for those of you of a frequentist disposition is that the probability average is taken over some sort of statistical ensemble of possible realizations produced by the algorithm, and this must match the appropriate long-term average taken over one realization.

This is actually quite a deep concept and it can apply (or not) in various degrees. A simple example is to do with properties of the mean value. Given a single run of the program over some long time *T* we can compute the sample average

$$\bar{X}_T = \frac{1}{T}\int_0^T X(t)\,dt;$$

the probability average is defined differently, over the probability distribution, which we can call *p(x)*:

$$\langle X \rangle = \int x\,p(x)\,dx.$$

If these two are equal for sufficiently long runs, i.e. as *T* goes to infinity, then the process is said to be *ergodic in the mean*. A process could, however, be ergodic in the mean but not ergodic with respect to some other property of the distribution, such as the variance. *Strict* ergodicity would require that the entire frequency distribution defined from a long run should match the probability distribution to some accuracy.
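As a quick numerical sketch of being ergodic in the mean (my own illustration, not from the post): take a simple AR(1) process, which is ergodic in the mean, and compare the time average of one long run with the probability average, which is zero for this process. The parameter values are arbitrary.

```python
import random

def ar1_path(m, sigma, n_steps, seed):
    """Simulate the AR(1) recursion X(t+1) = m*X(t) + e(t), with e ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    x, path = 0.0, []
    for _ in range(n_steps):
        x = m * x + rng.gauss(0.0, sigma)
        path.append(x)
    return path

# Sample average over ONE long realization...
path = ar1_path(m=0.5, sigma=1.0, n_steps=200_000, seed=1)
time_avg = sum(path) / len(path)

# ...should approach the probability average, which is 0 for this process.
print(time_avg)  # typically within a few hundredths of zero
```

Making the run longer shrinks the discrepancy, which is where the rate of convergence becomes the practical issue.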

Now we have a problem with the OED again. According to the defining quotation given above, *ergodic* can be taken to mean *statistically stationary*. Actually, that’s not true.

In the one-parameter case, “statistically stationary” means that the probability distribution controlling the process is independent of time, i.e. that *p(x,t)=p(x,t+Δt)* . It’s fairly straightforward to see that the ergodic property requires that a process *X(t)* be stationary, but the converse is not the case. Not every stationary process is necessarily ergodic. Ned Wright gives an example here. For a higher-dimensional process, such as a spatially-fluctuating random field the analogous property is statistical *homogeneity*, rather than stationarity, but otherwise everything carries over.

Ergodic theorems are very tricky to prove in general, but there are well-known results that rigorously establish the ergodic properties of Gaussian processes (which is another reason why theorists like myself like them so much). However, it should be mentioned that even if the ergodic assumption applies, its usefulness depends critically on the rate of convergence. In the time-dependent example I gave above, it’s no good if the averaging period required is much longer than the age of the Universe; in that case ergodicity alone won’t let you draw reliable inferences from your sample. Likewise, the ergodic hypothesis doesn’t help you analyse your galaxy redshift survey if the averaging scale needed is larger than the depth of the sample.

Moreover, it seems to me that many physicists resort to ergodicity when there aren’t any compelling mathematical grounds for thinking it is true. In some versions of the multiverse scenario, it is hypothesized that the fundamental constants of nature describing our low-energy world turn out “randomly” to take on different values in different domains, owing to some sort of spontaneous symmetry breaking, perhaps associated with a phase transition generating cosmic inflation. We happen to live in a patch within this structure where the constants are such as to make human life possible. There’s no need to assert that the laws of physics have been designed to make us possible if this is the case, as most of the multiverse doesn’t have the fine tuning that appears to be required to allow our existence.

As an application of the Weak Anthropic Principle, I have no objection to this argument. However, behind this idea lies the assertion that all possible vacuum configurations (and all related physical constants) do arise ergodically. I’ve never seen anything resembling a proof that this is the case. Moreover, there are many examples of physical phase transitions for which the ergodic hypothesis is *known* not to apply. If there is a rigorous proof that this works out, I’d love to hear about it. In the meantime, I remain sceptical.

October 19, 2009 at 4:12 pm

It took Ya. Sinai, a world-class mathematician, more than 50 dense pages to prove ergodicity for a billiard ball on a table with rounded ends, and the result was not generalisable if the sides proved to be even fractionally off-parallel. This suggests that the basis of statistical mechanics does not lie in ergodicity but in something else. That something else is “macroscopic reproducibility” – that if you consistently get the same result when doing an experiment relating thermodynamic variables (even though the 10^23 atoms are not identically disposed each time) then you can exploit that fact via the so-called Principle of Maximum Entropy in order to generate a predictive formalism.

Of course, macroscopic reproducibility is an axiom. If you do not take temperature into account then you find the pressure–density relationship for an (imperfect) gas is not macroscopically reproducible. Take temperature into account and the P-V-T relationship *is* macroscopically reproducible (unless you are playing with a polar gas like NO2 in a laboratory pervaded by a fluctuating electric field… and so on). Once you declare you have Macroscopic Reproducibility then you can invoke the MaxEnt procedure and get statistical mechanics up and running. You know it makes sense…

Anton

November 15, 2009 at 1:02 am

Nice intuitive approach to ergodicity! Thanks!


August 1, 2012 at 3:34 pm

Hello, thank you for your explanations. Could you please help: how does one calculate the ergodic mean of a variable Z(t) which follows an AR(1) process in logarithms, i.e. log Z(t+1) = m*log Z(t) + e(t), where e(t) is normal with zero mean and some variance sigma? This question is from macroeconomics.

August 1, 2012 at 3:48 pm

If you just think of this as the usual AR(1) process, i.e. X(t+1) = mX(t) + e(t), then (assuming |m| < 1) X(t) has zero mean and variance S^2 = sigma^2/(1-m^2), and if e(t) is normal then so is X(t).

But X(t)=logZ(t), so Z(t)=exp(X(t)) so Z(t) is lognormal. In this case its mean is exp(S^2/2) with S given by the above expression.
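For what it’s worth, this can be checked by simulation; a quick sketch of mine, with arbitrary parameter values m = 0.5 and sigma = 0.2:

```python
import math
import random

def simulate_log_ar1(m, sigma, n_steps, seed, burn_in=1_000):
    """Simulate log Z(t+1) = m*log Z(t) + e(t) and return Z(t) samples,
    discarding an initial burn-in so the chain is close to stationary."""
    rng = random.Random(seed)
    x, zs = 0.0, []
    for t in range(n_steps + burn_in):
        x = m * x + rng.gauss(0.0, sigma)
        if t >= burn_in:
            zs.append(math.exp(x))
    return zs

m, sigma = 0.5, 0.2
zs = simulate_log_ar1(m, sigma, n_steps=500_000, seed=0)
sample_mean = sum(zs) / len(zs)

# Theoretical ergodic mean: exp(S^2/2) with S^2 = sigma^2/(1 - m^2).
s2 = sigma**2 / (1 - m**2)
print(sample_mean, math.exp(s2 / 2))  # the two should roughly agree
```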