Archive for multiverse

Get thee behind me, Plato

Posted in The Universe and Stuff on September 4, 2010 by telescoper

The blogosphere, even the tiny little bit of it that I know anything about, has a habit of summoning up strange coincidences between things so, following E.M. Forster’s maxim “only connect”, I thought I’d spend a lazy Saturday lunchtime trying to draw a couple of them together.

A few days ago I posted what was intended to be a fun little item about the wave-particle duality in quantum mechanics. Basically, what I was trying to say is that there’s no real problem about thinking of an electron as behaving sometimes like a wave and sometimes like a particle because, in reality (whatever that is), it is neither. “Particle” and “wave” are useful abstractions but they are not in an exact one-to-one correspondence with natural phenomena.

Before going on I should point out that the vast majority of physicists are well aware of the distinction between, say, the “theoretical” electron and whatever the “real thing” is. We physicists tend to live in theory space rather than in the real world, so we tend to teach physics by developing the formal mathematical properties of the “electron” (or “electric field”) or whatever, and working out what experimental consequences these entail in certain situations. Generally speaking, the theory works so well in practice that we often talk about the theoretical electron that exists in the realm of mathematics and the electron-in-itself as if they were one and the same thing. As long as this is just a pragmatic shorthand, it’s fine. However, I think we need to be careful to keep this sort of language under control. Pushing theoretical ideas out into the ontological domain is a dangerous game. Physics – especially quantum physics – is best understood as a branch of epistemology. “What is known?” is safer ground than “What is there?”

Anyway, my little piece sparked a number of interesting comments on Reddit, including a thread that went along the lines of “of course an electron is neither a particle nor a wave, it’s actually a spin-1/2 projective representation of the Lorentz Group on a Hilbert space”. That description, involving more sophisticated mathematical concepts than those found in bog-standard quantum mechanics, undoubtedly provides a more complete account of the natural phenomena associated with electrons and electric fields, but I’ll stick to my guns and maintain that it still introduces a deep confusion to assert that the electron “is” something mathematical, whether that’s a “spin-1/2 projective representation” or a complex function or anything else. That’s saying something physical is something mathematical. Both entities have some sort of existence, of course, but not the same sort, and the one cannot “be” the other. “Certain aspects of an electron’s behaviour can be described by certain mathematical structures” is as far as I’m prepared to go.

Pushing deeper than quantum mechanics, into the realm of quantum field theory, there was the following contribution:

The electron field is a quantum field as described in quantum field theories. A quantum field covers all of space-time, and at each point the quantum field is in some state; it could be the ground state or it could be an excitation above the ground state. The excitations of the electron field are the so-called electrons. The mathematical object that describes the electron field possesses, amongst others, certain properties that deal with transformations of the space-time coordinates. If, when performing a transformation of the space-time coordinates, the mathematical object changes in a way that is compatible with the physics of the quantum field, then one says that the mathematical object of the field (also called the field) is represented by a spin-1/2 (in the electron case) representation of a certain group of transformations (the Poincaré group, in this example). I understand your quibbling; it seems natural to think that “spin 1/2″ is a property of the mathematical tool used to describe something, not of the something itself. If you press on with that distinction, however, you should be utterly puzzled as to why physics should follow, step by step, the path laid down by mathematics.

For example, one speaks about “invariance under the local action of the group SU(3)” as a fundamental property of the fields that feel the strong nuclear force. This has two implications: the mathematical object that represents quarks must have 3 “strong” degrees of freedom (the so-called color), and there must be 3² − 1 = 8 carriers of the force (the gluons), because an SU(N) group of transformations has N² − 1 generators. And this is precisely what is observed.

So an extremely abstract mathematical principle correctly accounts for the dynamics of an immensely large quantity of phenomena. Why then does physics follow the derivations of mathematics if its true nature is somewhat different?

No doubt this line of reasoning is why so many theoretical physicists seem to adopt a view of the world that regards mathematical theories as being, as it were,  “built into” nature rather than being things we humans invented to describe nature. This is a form of Platonic realism.

I’m no expert on matters philosophical, but I’d say that I find this stance very difficult to understand, although I am prepared to go part of the way. I used to work in a Mathematics department many years ago and one of the questions that came up at coffee time occasionally was “Is mathematics invented or discovered?”. In my experience, pure mathematicians always answered “discovered” while others (especially astronomers) said “invented”. For what it’s worth, I think mathematics is a bit of both. Of course we can invent mathematical objects, endow them with certain attributes and prescribe rules for manipulating them and combining them with other entities. However, once they are invented, anything that is worked out from them is “discovered”. In fact, one could argue that all mathematical theorems arising within such a system are simply tautological expressions of the rules you started with.

Of course physicists use mathematics to construct models that describe natural phenomena. Here the process is different from mathematical discovery: what we’re trying to do is work out which, if any, of the possible theories actually accounts best for whatever empirical data we have. While it’s true that this programme requires us to accept that there are natural phenomena that can be described in mathematical terms, I do not accept that it requires us to accept that nature “is” mathematical. It requires that there be some sort of law governing some aspects of nature’s behaviour, but not that such laws account for everything.

Of course, mathematical ideas have been extremely successful in helping physicists build new physical descriptions of reality. On the other hand, however, there is a great deal of mathematical formalism that is not useful in this way. Physicists have had to select those mathematical objects that we can use to represent natural phenomena, like selecting words from a dictionary. The fact that we can assemble a sentence using words from the Oxford English Dictionary that conveys some information about something we see does not mean that what we see “is” English. A whole load of grammatically correct sentences can be constructed that don’t make any sense in terms of observable reality, just as there is a great deal of mathematics that is internally self-consistent but makes no contact with physics.

Moreover, to the commenter I quoted above, I’d agree that the properties of the SU(3) gauge group have indeed accounted for many phenomena associated with the strong interaction, which is why the standard model of particle physics contains 8 gluons and quarks carrying a three-fold colour charge, as described by quantum chromodynamics. Leaving aside the fact that QCD is a terribly difficult theory to work with – in practice it involves nightmarish lattice calculations on a scale to make even the most diehard enthusiast cringe – what I would ask is whether this description is in any case sufficient for us to assert that it describes “true nature”. Many physicists will no doubt disagree with me, but I don’t think so. It’s a map, not the territory.

So why am I boring you all with this rambling dissertation? Well, it brings me to my other post – about Stephen Hawking’s comments about God. I don’t want to go over that issue again – frankly, I was bored with it before I’d finished writing my own blog post – but it does relate to the bee that I often find in my bonnet about the tendency of many modern theoretical physicists to assign the wrong category of existence to their mathematical ideas. The prime example that springs to my mind is the multiverse. I can tolerate certain versions of the multiverse idea, in fact. What I can’t swallow, however, is the identification of the possible landscape of string theory vacua – essentially a huge set of possible solutions of a complicated set of mathematical equations – with a realised set of “parallel universes”. That particular ontological step just seems absurd to me.

I’m just about done, but one more thing I’d like to finish with concerns the (admittedly overused) metaphor of maps and territories. Maps are undoubtedly useful in helping us find our way around, but we have to remember that there are always things that aren’t on the map at all. If we rely too heavily on one, we might miss something of great interest that the cartographer didn’t think important. Likewise, if we fool ourselves into thinking our descriptions of nature are so complete that they “are” all that nature is, then we might miss the road to a better understanding.


The Monkey Complex

Posted in Bad Statistics, The Universe and Stuff on November 15, 2009 by telescoper

There’s an old story that if you leave a set of monkeys hammering on typewriters for a sufficiently long time then they will eventually reproduce the entire text of Shakespeare’s play Hamlet. It comes up in a variety of contexts, but the particular generalisation of this parable in cosmology is to argue that if we live in an enormously big universe (or “multiverse“), in which the laws of nature (as specified by the relevant fundamental constants) vary “sort of randomly” from place to place, then there will be a domain in which they have the right properties for life to evolve. This is one way of explaining away the apparent fine-tuning of the laws of physics: they’re not finely tuned, but we just live in a place where they allowed us to evolve. Although it may seem an easy step from monkeys to the multiverse, it always seemed to me a very shaky one.

For a start, let’s go back to the monkeys. The supposition that, given an infinite time, the monkeys must produce everything that’s possible in a finite sequence is not necessarily true. It depends on how they type. If the monkeys were always to hit two adjoining keys at the same time then they would never produce a script for Hamlet, no matter how long they typed for, as the combinations QW or ZX do not appear anywhere in that play. To guarantee the result we need, their typing has to be ergodic, a very specific requirement not possessed by all “random” sequences.

A more fundamental problem is what is meant by randomness in the first place. I’ve actually commented on this before, in a post that still seems to be collecting readers so I thought I’d develop one or two of the ideas a little.

 It is surprisingly easy to generate perfectly deterministic mathematical sequences that behave in the way we usually take to characterize indeterministic processes. As a very simple example, consider the following “iteration” scheme:

 X_{j+1}= 2 X_{j} \mod(1)

If you are not familiar with the notation, the term mod(1) just means “drop the integer part”.  To illustrate how this works, let us start with a (positive) number, say 0.37. To calculate the next value I double it (getting 0.74) and drop the integer part. Well, 0.74 does not have an integer part so that’s fine. This value (0.74) becomes my first iterate. The next one is obtained by putting 0.74 in the formula, i.e. doubling it (1.48) and dropping  the integer part: result 0.48. Next one is 0.96, and so on. You can carry on this process as long as you like, using each output number as the input state for the following step of the iteration.

Now to simplify things a little bit, notice that, because we drop the integer part each time, all iterates must lie in the range between 0 and 1. Suppose I divide this range into two bins, labelled “heads” for X less than ½ and “tails” for X greater than or equal to ½. In my example above the first value of X is 0.37, which is “heads”. Next is 0.74 (tails); then 0.48 (heads), 0.96 (tails), and so on.

This sequence now mimics quite accurately the tossing of a fair coin. It produces a pattern of heads and tails with roughly 50% frequency in a long run. It is also difficult to predict the next term in the series given only the classification as “heads” or “tails”.

However, given the seed number which starts off the process, and of course the algorithm, one could reproduce the entire sequence. It is not random, but in some respects  looks like it is.
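For the curious, here’s a little Python sketch (mine, not part of the original discussion) that carries out the iteration exactly, using rational arithmetic to dodge floating-point round-off, and reports the resulting heads-and-tails pattern:

```python
from fractions import Fraction

def coin_sequence(seed, n):
    """Iterate X_{j+1} = 2*X_j mod 1 exactly, labelling each iterate
    'H' if X < 1/2 and 'T' otherwise."""
    x = Fraction(seed)          # exact rational arithmetic, no round-off
    labels = []
    for _ in range(n):
        labels.append('H' if x < Fraction(1, 2) else 'T')
        x = (2 * x) % 1         # double and drop the integer part
    return ''.join(labels)

print(coin_sequence('0.37', 12))  # begins H, T, H, T as in the example above
```

Run it with different seeds and you’ll get different, but equally coin-like, patterns; run it twice with the same seed and you’ll get exactly the same one, which is the whole point.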

One can think of “heads” or “tails” in more general terms, as indicating the “0” or “1” states in the binary representation of a number. This method can therefore be used to generate any sequence of digits. In fact, algorithms like this one are used in computers for generating what are called pseudorandom numbers. They are not precisely random because computers can only do arithmetic to a finite number of decimal places. This means that only a finite number of possible sequences can be computed, so some repetition is inevitable, but these limitations are not always important in practice.
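To see why repetition is inevitable, here’s a deliberately tiny toy of my own (not a generator anyone would actually use): a linear congruential generator with only 16 possible states. It must revisit a state within 16 steps, after which the whole sequence recycles; real pseudorandom generators behave the same way, just with astronomically more states.

```python
def tiny_lcg(seed, a=5, c=1, m=16):
    """Linear congruential generator X_{j+1} = (a*X_j + c) mod m.
    With only m possible states, the sequence must eventually cycle."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

gen = tiny_lcg(seed=3)
seq = [next(gen) for _ in range(20)]
print(seq)  # the first 16 values are all distinct; then the cycle restarts
```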

The ability to generate pseudorandom numbers accurately and rapidly in a computer has led to an entirely new way of doing science. Instead of doing real experiments with measuring equipment and the inevitable errors, one can now do numerical experiments with pseudorandom numbers in order to investigate how an experiment might work if we could do it. If we think we know what the result would be, and what kind of noise might arise, we can do a random simulation to discover the likelihood of success with a particular measurement strategy. This is called the “Monte Carlo” approach, and it is extraordinarily powerful. Observational astronomers and particle physicists use it a great deal in order to plan complex observing programmes and convince the powers that be that their proposal is sufficiently feasible to be allocated time on expensive facilities. In the end there is no substitute for real experiments, but in the meantime the Monte Carlo method can help avoid wasting time on flawed projects:

…in real life mistakes are likely to be irrevocable. Computer simulation, however, makes it economically practical to make mistakes on purpose.

(John McLeod and John Osborne, in Natural Automata and Useful Simulations).
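As a concrete illustration of the Monte Carlo idea (my sketch, not anything from the original post), here is the textbook example: estimating π by scattering seeded pseudorandom points in the unit square and counting those that land inside the quarter-circle of unit radius.

```python
import random

def estimate_pi(n, seed=42):
    """Monte Carlo estimate of pi: the fraction of pseudorandom points
    in the unit square falling inside the quarter-circle is pi/4."""
    rng = random.Random(seed)   # seeded, so the "experiment" is repeatable
    hits = sum(1 for _ in range(n)
               if rng.random()**2 + rng.random()**2 <= 1.0)
    return 4.0 * hits / n

print(estimate_pi(100_000))  # close to 3.14159
```

Because the generator is seeded, the “experiment” is exactly repeatable, which is just what you want when debugging a simulation pipeline.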

So is there a way to tell whether a set of numbers is really random? Consider the following sequence:

1415926535 8979323846 2643383279 5028841971 6939937510…
Is this a random string of numbers? There doesn’t seem to be a discernible pattern, and each possible digit seems to occur with roughly the same frequency. It doesn’t look like anyone’s phone number or bank account. Is that enough to make you think it is random?

Actually this is not at all random. If I had started it with a three and a decimal place you might have cottoned on straight away. “3.1415926..” gives the first few digits in the decimal representation of π. The full representation goes on forever without repeating. This is a sequence that satisfies most naïve definitions of randomness. It does, however, provide something of a hint as to how we might construct an operational definition, i.e. one that we can apply in practice to a finite set of numbers.

The key idea originates from the Russian mathematician Andrei Kolmogorov, who wrote the first truly rigorous mathematical work on probability theory in 1933. Kolmogorov’s approach was considerably ahead of its time, because it used many concepts that belong to the era of computers. In essence, what he did was to provide a definition of the complexity of an N-digit sequence in terms of the smallest amount of computer memory it would take to store a program capable of generating the sequence. Obviously one can always store the sequence itself, which means that there is always a program that occupies about as many bytes of memory as the sequence itself, but some numbers can be generated by codes much shorter than the numbers themselves. For example the sequence

11111111111111111111111111111111111
can be generated by the instruction to “print 1 35 times”, which can be stored in much less memory than the original string of digits. Such a sequence is therefore said to be algorithmically compressible.

There are many ways of calculating the digits of π numerically also, so although it may look superficially like a random string it is most definitely not random. It is algorithmically compressible.

The complexity of a sequence can be defined to be the length of the shortest program capable of generating it. If no algorithm can be found that compresses the sequence into a program shorter than itself then it is maximally complex and can suitably be defined as random. This is a very elegant description, and has good intuitive appeal.  

I’m not sure how compressible Hamlet is, but it’s certainly not entirely random. At any rate, when I studied it at school, I certainly wished it were a little shorter…

However, this still does not provide us with a way of testing rigorously whether a given finite sequence has been produced “randomly” or not.
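There is, however, a crude practical stand-in that illustrates the idea: a general-purpose compressor such as zlib. It certainly doesn’t compute the Kolmogorov complexity (nothing can, in general), but it shows the same qualitative behaviour: a regular sequence shrinks dramatically, while a “typical” pseudorandom one barely shrinks at all.

```python
import random
import zlib

rng = random.Random(0)
regular = b"1" * 1000                                   # "print 1, 1000 times"
noisy = bytes(rng.randrange(256) for _ in range(1000))  # pseudorandom bytes

print(len(zlib.compress(regular)))  # a handful of bytes
print(len(zlib.compress(noisy)))    # roughly 1000: essentially incompressible
```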

If an algorithmic compression can be found, then we declare the given sequence not to be random. However, we can never be sure whether the next term in the sequence would fit with what our algorithm predicts. We have to argue, inferentially, that if we have fitted a long sequence with a simple algorithm then it is improbable that the sequence was generated randomly.

On the other hand, if we fail to find a suitable compression that doesn’t mean it is random either. It may just mean we didn’t look hard enough or weren’t clever enough.

Human brains are good at finding patterns. When we can’t see one we usually take the easy way out and declare that none exists. We often model a complicated system as a random process because it is  too difficult to predict its behaviour accurately even if we know the relevant laws and have  powerful computers at our disposal. That’s a very reasonable thing to do when there is no practical alternative. 

It’s quite another matter, however,  to embrace randomness as a first principle to avoid looking for an explanation in the first place. For one thing, it’s lazy, taking the easy way out like that. And for another it’s a bit arrogant. Just because we can’t find an explanation within the framework of our current theories doesn’t mean more intelligent creatures than us won’t do so. We’re only monkeys, after all.

Ergodic Means…

Posted in The Universe and Stuff on October 19, 2009 by telescoper

The topic of this post is something I’ve been wondering about for quite a while. This afternoon I had half an hour spare after a quick lunch so I thought I’d look it up and see what I could find.

The word ergodic is one you will come across very frequently in the literature of statistical physics, and in cosmology it also appears in discussions of the analysis of the large-scale structure of the Universe. I’ve long been puzzled as to where it comes from and what it actually means. Turning to the excellent Oxford English Dictionary Online, I found the answer to the first of these questions. Well, sort of. Under etymology we have

ad. G. ergoden (L. Boltzmann 1887, in Jrnl. f. d. reine und angewandte Math. C. 208), f. Gr.

I say “sort of” because it does attribute the origin of the word to Ludwig Boltzmann, but the Greek roots (εργον and οδοσ) appear to suggest it means “work-way” or something like that. I don’t think I follow an ergodic path on my way to work, so it remains a little mysterious.

The actual definitions of ergodic given by the OED are

Of a trajectory in a confined portion of space: having the property that in the limit all points of the space will be included in the trajectory with equal frequency. Of a stochastic process: having the property that the probability of any state can be estimated from a single sufficiently extensive realization, independently of initial conditions; statistically stationary.

As I had expected, it has two meanings which are related, but which apply in different contexts. The first is to do with paths or orbits, although in physics this is usually taken to mean trajectories in phase space (including both positions and velocities) rather than just three-dimensional position space. However, I don’t think the OED has got it right in saying that the system visits all positions with equal frequency. I think an ergodic path is one that must visit all positions within a given volume of phase space, rather than being confined to a lower-dimensional piece of that space. For example, the path of a planet under the inverse-square law of gravity around the Sun is confined to a one-dimensional ellipse. If the force law is modified by external perturbations then the path need not be as regular as this, in extreme cases wandering around in such a way that it never joins back on itself but eventually visits all accessible locations. As far as my understanding goes, however, it doesn’t have to visit them all with equal frequency. The ergodic property of orbits is intimately associated with the presence of chaotic dynamical behaviour.

The other definition relates to stochastic processes, i.e. processes involving some sort of random component. These could either consist of a discrete collection of random variables {X1…Xn} (which may or may not be correlated with each other) or a continuously fluctuating function of some parameter such as time t, i.e. X(t), or of spatial position (or perhaps both).

Stochastic processes are quite complicated measure-valued mathematical entities, because they are specified by probability distributions. What the ergodic hypothesis means in this second sense is that measurements extracted from a single realization of such a process have a definite relationship to analogous quantities defined by the probability distribution.

I always think of a stochastic process being like a kind of algorithm (whose workings we don’t know). Put it on a computer, press “go” and it spits out a sequence of numbers. The ergodic hypothesis means that by examining a sufficiently long run of the output we could learn something about the properties of the algorithm.

An alternative way of thinking about this for those of you of a frequentist disposition is that the probability average is taken over some sort of statistical ensemble of possible realizations produced by the algorithm, and this must match the appropriate long-term average taken over one realization.

This is actually quite a deep concept and it can apply (or not) in various degrees.  A simple example is to do with properties of the mean value. Given a single run of the program over some long time T we can compute the sample average

\bar{X}_T \equiv \frac{1}{T} \int_0^T X(t)\, dt

The probability average, by contrast, is defined over the probability distribution, which we can call p(x):

\langle X \rangle \equiv \int x p(x) dx

If these two are equal for sufficiently long runs, i.e. as T goes to infinity, then the process is said to be ergodic in the mean. A process could, however, be ergodic in the mean but not ergodic with respect to some other property of the distribution, such as the variance. Strict ergodicity would require that the entire frequency distribution defined from a long run should match the probability distribution to some accuracy.
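Here’s a quick numerical sketch of ergodicity in the mean (my own, with arbitrarily chosen parameters): a stationary AR(1) process, whose stationary distribution has mean zero, has a time average over one long realization that settles down to that ensemble mean.

```python
import random

def ar1_time_average(n, a=0.5, seed=1):
    """Time average (1/n) * sum over one realization of the stationary
    AR(1) process X_{t+1} = a*X_t + e_t, with e_t ~ N(0,1).
    The ensemble mean of the stationary distribution is 0."""
    rng = random.Random(seed)
    x, total = 0.0, 0.0
    for _ in range(n):
        x = a * x + rng.gauss(0.0, 1.0)
        total += x
    return total / n

print(ar1_time_average(200_000))  # close to the ensemble mean, 0
```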

Now we have a problem with the OED again. According to the defining quotation given above, ergodic can be taken to mean statistically stationary. Actually, that’s not true…

In the one-parameter case, “statistically stationary” means that the probability distribution controlling the process is independent of time, i.e. that p(x,t)=p(x,t+Δt) . It’s fairly straightforward to see that the ergodic property requires that a process X(t) be stationary, but the converse is not the case. Not every stationary process is necessarily ergodic. Ned Wright gives an example here. For a higher-dimensional process, such as a spatially-fluctuating random field the analogous property is statistical homogeneity, rather than stationarity, but otherwise everything carries over.
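For what it’s worth, here is a minimal toy version of such a counterexample (sketched with assumed details; Ned Wright’s example is in the same spirit): draw a random level A once, then set X(t) = A for all t. The distribution of X(t) is the same at every time, so the process is stationary, but the time average of any single realization is stuck at that realization’s A and never converges to the ensemble mean.

```python
import random

def frozen_level_time_average(seed, n=10_000):
    """One realization of X(t) = A, with A ~ N(0,1) drawn once per
    realization. Stationary, but the time average never forgets A."""
    rng = random.Random(seed)
    a = rng.gauss(0.0, 1.0)               # drawn once, frozen for all t
    return sum(a for _ in range(n)) / n   # equals a, however large n is

print(frozen_level_time_average(seed=1))
print(frozen_level_time_average(seed=2))  # a different level: no common limit
```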

Ergodic theorems are very tricky to prove in general, but there are well-known results that rigorously establish the ergodic properties of Gaussian processes (which is another reason why theorists like myself like them so much). However, it should be mentioned that even if the ergodic assumption applies, its usefulness depends critically on the rate of convergence. In the time-dependent example I gave above, it’s no good if the averaging period required is much longer than the age of the Universe; in that case even an ergodic process won’t allow you to make reliable inferences from your sample. Likewise, the ergodic hypothesis doesn’t help you analyse your galaxy redshift survey if the averaging scale needed is larger than the depth of the sample.

Moreover, it seems to me that many physicists resort to ergodicity when there are no compelling mathematical grounds for thinking that it is true. In some versions of the multiverse scenario, it is hypothesized that the fundamental constants of nature describing our low-energy physics turn out “randomly” to take on different values in different domains, owing to some sort of spontaneous symmetry breaking perhaps associated with a phase transition generating cosmic inflation. We happen to live in a patch within this structure where the constants are such as to make human life possible. There’s no need to assert that the laws of physics have been designed to make us possible if this is the case, as most of the multiverse doesn’t have the fine tuning that appears to be required to allow our existence.

As an application of the Weak Anthropic Principle, I have no objection to this argument. However, behind this idea lies the assertion that all possible vacuum configurations (and all related physical constants) do arise ergodically. I’ve never seen anything resembling a proof that this is the case. Moreover, there are many examples of physical phase transitions for which the ergodic hypothesis is known not to apply.  If there is a rigorous proof that this works out, I’d love to hear about it. In the meantime, I remain sceptical.

Cranks Anonymous

Posted in Biographical, Books, Talks and Reviews, The Universe and Stuff on September 22, 2009 by telescoper

Sean Carroll, blogger-in-chief at Cosmic Variance, has ventured abroad from his palatial Californian residence and is currently slumming it in a little town called Oxford where he is attending a small conference in celebration of the 70th birthday of George Ellis. In fact he’s been posting regular live commentaries on the proceedings which I’ve been following with great interest. It looks an interesting and unusual meeting because it involves both physicists and philosophers and it is based around a series of debates on topics of current interest. See Sean’s posts here, here and here for expert summaries of the three days of the meeting.

Today’s dispatches included an account of George’s own talk, which appears to have involved delivering a polemic against the multiverse, something he has been known to do from time to time. I posted something on it myself, in fact. I don’t think I’m as fundamentally opposed as George to the idea that we might live in a bit of space-time that may belong to some sort of larger collection in which other bits have different properties, but it does bother me how many physicists talk about the multiverse as if it were an established fact. There certainly isn’t any observational evidence that this is true, and the theoretical arguments usually advanced are far from rigorous. The multiverse certainly is a fun thing to think about; I just don’t think it’s really needed.

There is one red herring that regularly floats into arguments about the multiverse, and that concerns testability. Different bits of the multiverse can’t be observed directly by an observer in a particular place, so it is often said that the idea isn’t testable. I don’t think that’s the right way to look at it. If there is a compelling physical theory that can account convincingly for a realised multiverse then that theory really should have other necessary consequences that are testable, otherwise there’s no point. Test the theory in some other way and you test whether the  multiverse emanating from it is sound too.

However, that fairly obvious statement isn’t really the point of this piece. As I was reading Sean’s blog post for today you could have knocked me down with a feather when I saw my name crop up:

Orthodoxy is based on the beliefs held by elites. Consider the story of Peter Coles, who tried to claim back in the 1990’s that the matter density was only 30% of the critical density. He was threatened by a cosmological bigwig, who told him he’d be regarded as a crank if he kept it up. On a related note, we have to admit that even scientists base beliefs on philosophical agendas and rationalize after the fact. That’s often what’s going on when scientists invoke “beauty” as a criterion.

George was actually talking about a paper we co-wrote for Nature in which we went through the different arguments that had been used to estimate the average density of matter in the Universe, tried to weigh up which were the more reliable, and came to the conclusion that the answer was in the range 20 to 40 percent of the critical density. There was a considerable theoretical prejudice at the time, especially from adherents of  inflation, that the density should be very close to the critical value, so we were running against the crowd to some extent. I remember we got quite a lot of press coverage at the time and I was invited to go on Radio 4 to talk about it, so it was an interesting period for me. Working with George was a tremendous experience too.

I won’t name the “bigwig” George referred to, although I will say it was a theorist; it’s more fun for those working in the field to guess for themselves! Opinions among other astronomers and physicists were divided. One prominent observational cosmologist was furious that we had criticized his work (which had yielded a high value of the density). On the other hand, Martin Rees (now “Lord” but then just plain “Sir”) said that he thought we were pushing at an open door and was surprised at the fuss.

Later on, in 1996, we expanded the article into a book in which we covered the ground more deeply but came to the same conclusion as before.  The book and the article it was based on are now both very dated because of the huge advances in observational cosmology over the last decade. However, the intervening years have shown that we were right in our assessment: the standard cosmology has about 30% of the critical density.

Of course there was one major thing we didn’t anticipate which was the discovery in the late 1990s of dark energy which, to be fair, had been suggested by others more prescient than us as early as 1990. You can’t win ’em all.

So that’s the story of my emergence as a crank, a title to which I’ve tried my utmost to do justice since then. Actually, I would have liked to have had the chance to go to George’s meeting in Oxford, primarily to greet my ertswhile collaborator whom I haven’t seen for ages. But it was invitation-only. I can’t work out whether these days I’m too cranky or not cranky enough to get to go to such things. Looking at the reports of the talks, I rather think it could be the latter.

Now, anyone care to risk the libel laws and guess who Professor BigWig was?


Posted in The Universe and Stuff with tags , , on June 17, 2009 by telescoper

The word “cosmology” is derived from the Greek κόσμος (“cosmos”) which means, roughly speaking, “the world as considered as an orderly system”. The other side of the coin to “cosmos” is Χάος (“chaos”). In one world-view the Universe comprised two competing aspects: the orderly part that was governed by laws and which could (at least in principle) be predicted, and the “random” part which was disordered and unpredictable. To make progress in scientific cosmology we do need to assume that the Universe obeys laws. We also assume that these laws apply everywhere and for all time or, if they vary, then they vary in accordance with another law.  This is the cosmos that makes cosmology possible.  However, with the rise of quantum theory, and its applications to the theory of subatomic particles and their interactions, the field of cosmology has gradually ceded some of its territory to chaos.

In the early twentieth century, the first mathematical world models were constructed based on Einstein’s general theory of relativity. This is a classical theory, meaning that it describes a system that evolves smoothly with time. It is also entirely deterministic. Given sufficient information to specify the state of the Universe at a particular epoch, it is possible to calculate with certainty what its state will be at some point in the future. In a sense the entire evolutionary history described by these models is not a succession of events laid out in time, but an entity in itself. Every point along the space-time path of a particle is connected to past and future in an unbreakable chain. If ever the word cosmos applied to anything, this is it.

But as the field of relativistic cosmology matured it was realised that these simple classical models could not be regarded as complete, and consequently that the Universe was unlikely to be as predictable as was first thought. The Big Bang model gradually emerged as the favoured cosmological theory during the middle of the last century, between the 1940s and the 1960s. It was not until the 1960s, with the work of Hawking and Penrose, that it was realised that expanding world models based on general relativity inevitably involve a breakdown of known physics at their very beginning. The so-called singularity theorems demonstrate that in any plausible version of the Big Bang model, the physical parameters describing the Universe (such as its density, pressure and temperature) all become infinite at the instant of the Big Bang. The existence of this "singularity" means that we do not know what laws, if any, apply at that instant. The Big Bang contains the seeds of its own destruction as a complete theory of the Universe. Although we might be able to explain how the Universe subsequently evolves, we have no idea how to describe the instant of its birth. This is a major embarrassment. Lacking any knowledge of the laws, we don't even have any rational basis to assign probabilities. We are marooned with a theory that lets in water.

The second important development was the rise of quantum theory and its incorporation into the description of the matter and energy contained within the Universe. Quantum mechanics (and its development into quantum field theory) entails elements of unpredictability. Although we do not know how to interpret this feature of the theory, it seems that any cosmological theory based on quantum theory must include things that can’t be predicted with certainty.

As particle physicists built ever more complete descriptions of the microscopic world using quantum field theory, they also realised that the approaches they had been using for other interactions just wouldn’t work for gravity. Mathematically speaking, general relativity and quantum field theory just don’t fit together. It might have been hoped that quantum gravity theory would help us plug the gap at the very beginning of the Universe, but that has not happened yet because there isn’t such a theory. What we can say about the origin of the Universe is correspondingly extremely limited and mostly speculative, but some of these speculations have had a powerful impact on the subject.

One thing that has changed radically since the early twentieth century is the possibility that our Universe may actually be part of a much larger “collection” of Universes. The potential for semantic confusion here is enormous. The Universe is, by definition, everything that exists. Obviously, therefore, there can only be one Universe. The name given to a Universe that consists of bits and pieces like this is the multiverse.

 There are various ways a multiverse can be realised. In the “Many Worlds” interpretation of quantum mechanics there is supposed to be a plurality of versions of our Universe, but their ontological status is far from clear (at least to me). Do we really have to accept that each of the many worlds is “out there”, or can we get away with using them as inventions to help our calculations?

On the other hand, some plausible models based on quantum field theory do admit the possibility that our observable Universe is part of a collection of mini-universes, each of which "really" exists. It's hard to explain precisely what I mean by that, but I hope you get my drift. These mini-universes form a classical ensemble in different domains of a single space-time, which is not what happens in quantum multiverses.

According to the Big Bang model, the Universe (or at least the part of it we know about) began about fourteen billion years ago. We do not know whether the Universe is finite or infinite, but we do know that if it has only existed for a finite time we can only observe a finite part of it. We can’t possibly see light from further away than fourteen billion light years because any light signal travelling further than this distance would have to have set out before the Universe began. Roughly speaking, this defines our “horizon”: the maximum distance we are in principle able to see. But the fact that we can’t observe anything beyond our horizon does not mean that such remote things do not exist at all. Our observable “patch” of the Universe might be a tiny part of a colossal structure that extends much further than we can ever hope to see. And this structure might be not at all homogeneous: distant parts of the Universe might be very different from ours, even if our local piece is well described by the Cosmological Principle.
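The back-of-envelope arithmetic behind the horizon can be sketched in a few lines (a toy calculation of my own, using round numbers and deliberately ignoring the expansion of space, which in fact pushes the true comoving horizon out to something like 46 billion light years; the finite-age logic is unchanged either way):

```python
# Naive horizon estimate: light covers one light year per year, so in a
# universe of finite age the farthest light can have travelled to reach us
# is simply (speed of light) x (age). Expansion enlarges the real comoving
# horizon, but a finite age still implies a finite observable patch.
age_of_universe_years = 14e9                      # approximate age, in years
horizon_light_years = age_of_universe_years * 1.0 # distance light covers in that time

print(f"Naive horizon: {horizon_light_years:.1e} light years")
```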

Some astronomers regard this idea as pure metaphysics, but it is motivated by plausible physical theories. The key idea was provided by the theory of cosmic inflation, which I have blogged about already. In the simplest versions of inflation the Universe expands by an enormous factor, perhaps 10^60, in a tiny fraction of a second. This may seem ridiculous, but the energy available to drive this expansion is inconceivably large. Given this phenomenal energy reservoir, it is straightforward to show that such a boost is not at all unreasonable. With inflation, our entire observable Universe could thus have grown from a truly microscopic pre-inflationary region. It is sobering to think that every galaxy, star and planet we can see might have grown from a seed that was smaller than an atom. But the point I am trying to make is that the idea of inflation opens up one's mind to the idea that the Universe as a whole may be a landscape of unimaginably immense proportions within which our little world may be little more than a pebble. If this is the case then we might plausibly imagine that this landscape varies haphazardly from place to place, producing what may amount to an ensemble of mini-universes. I say "may" because there is as yet no theory that tells us precisely what determines the properties of each hill and valley or the relative probabilities of the different types of terrain.

Many theorists believe that such an ensemble is required if we are to understand how to deal probabilistically with the fundamentally uncertain aspects of modern cosmology. I don’t think this is the case. It is, at least in principle, perfectly possible to apply probabilistic arguments to unique events like the Big Bang using Bayesian inference. If there is an ensemble, of course, then we can discuss proportions within it, and relate these to probabilities too. Bayesians can use frequencies if they are available but do not require them. It is one of the greatest fallacies in science that probabilities need to be interpreted as frequencies.
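To make the Bayesian point concrete, here is a minimal sketch (my own toy example, nothing from the paper): Bayes' theorem applied to a single, unrepeatable datum, with no ensemble or long-run frequency anywhere in sight.

```python
# Bayesian updating for a one-off event: two rival models for a single
# observed outcome. All we need are prior degrees of belief and the
# probability each model assigns to the datum -- no ensemble required.
def bayes_update(priors, likelihoods):
    """Return posterior probabilities for each model given one datum."""
    unnorm = {m: priors[m] * likelihoods[m] for m in priors}
    total = sum(unnorm.values())
    return {m: p / total for m, p in unnorm.items()}

# Start indifferent between models A and B; suppose A predicts the
# observed datum with probability 0.8 and B with probability 0.2.
posterior = bayes_update({"A": 0.5, "B": 0.5}, {"A": 0.8, "B": 0.2})
print(posterior)  # posterior mass shifts to A: roughly {'A': 0.8, 'B': 0.2}
```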

At the crux of many related arguments is the question of why the Universe appears to be so well suited to our existence within it. This fine-tuning appears surprising based on what (little) we know about the origin of the Universe and the many other ways it might apparently have turned out. Does this suggest that it was designed to be so or do we just happen to live in a bit of the multiverse nice enough for us to have evolved and survived in?  

Views on this issue are often boiled down into a choice between a theistic argument and some form of anthropic selection.  A while ago I gave a talk at a meeting in Cambridge called God or Multiverse? that was an attempt to construct a dialogue between theologians and cosmologists. I found it interesting, but it didn't alter my view that science and religion don't really overlap very much at all on this, in the sense that if you believe in God it doesn't mean you have to reject the multiverse, or vice versa. If God can create a Universe, he could create a multiverse too. As it happens, I'm agnostic about both.

So having, I hope, opened up your mind to the possibility that the Universe may be amenable to a frequentist interpretation, I should confess that I think one can actually get along quite nicely without it.  In any case, you will probably have worked out that I don’t really like the multiverse. One reason I don’t like it is that it accepts that some things have no fundamental explanation. We just happen to live in a domain where that’s the way things are. Of course, the Universe may turn out to be like that –  there definitely will be some point at which our puny monkey brains  can’t learn anything more – but if we accept that then we certainly won’t find out if there is really a better answer, i.e. an explanation that isn’t accompanied by an infinite amount of untestable metaphysical baggage. My other objection is that I think it’s cheating to introduce an infinite thing to provide an explanation of fine tuning. Infinity is bad.

Maps, Territories and Landscapes

Posted in The Universe and Stuff with tags , , , , , , , , on January 10, 2009 by telescoper

I was looking through recent posts on cosmic variance and came across an interesting item featuring a map from another blog (run by Samuel Arbesman) which portrays the Milky Way in the style of  a public transport map:


This is just a bit of fun, of course, but I think maps like this are quite fascinating, not just as practical guides to navigating a transport system but also because they often stand up very well as works of art. It’s also interesting how they evolve with time  because of changes to the network and also changing ideas about stylistic matters.

A familiar example is the London Underground or Tube map. There is a fascinating website depicting the evolutionary history of this famous piece of graphic design. Early versions simply portrayed the railway lines inset into a normal geographical map which made them rather complicated, as the real layout of the lines is far from regular. A geographically accurate depiction of the modern tube network is shown here which makes the point:


A revolution occurred in 1933 when Harry Beck compiled the first “modern” version of the map. His great idea was to simplify the representation of the network around a single unifying feature. To this end he turned the Central Line (in red) into a straight line travelling left to right across the centre of the page, only changing direction at the extremities. All other lines were also distorted to run basically either North-South or East-West and produce a much more regular pattern, abandoning any attempt to represent the “real” geometry of the system but preserving its topology (i.e. its connectivity).  Here is an early version of his beautiful construction:
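Beck's trick can be put in data-structure terms: keep the adjacency relation between stations, discard the coordinates. A small sketch of my own (using a simplified fragment of the Central Line) shows the idea; any drawing with the same connectivity counts as "the same" map, however distorted its geometry.

```python
# A fragment of the Central Line as an adjacency list: all that Beck's
# map promises to preserve is which stations are joined to which.
central_line = {
    "Notting Hill Gate": ["Queensway"],
    "Queensway": ["Notting Hill Gate", "Lancaster Gate"],
    "Lancaster Gate": ["Queensway", "Marble Arch"],
    "Marble Arch": ["Lancaster Gate", "Bond Street"],
    "Bond Street": ["Marble Arch", "Oxford Circus"],
    "Oxford Circus": ["Bond Street"],
}

def connected(graph, start, goal):
    """Depth-first search: can we reach goal from start by following edges?"""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return False

# Connectivity survives any amount of geometric distortion:
print(connected(central_line, "Notting Hill Gate", "Oxford Circus"))  # True
```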

Note that although this a “modern” map in terms of how it represents the layout, it does look rather dated in terms of other design elements such as the border and typefaces used. We tend not to notice how much we surround the essential things with embellishments that date very quickly.

More modern versions of this map that you can get at tube stations and the like rather spoil the idea by introducing a kink in the central line to accommodate the complexity of the interchange between Bank and Monument stations as well as generally buggering about with the predominantly  rectilinear arrangement of the previous design:

I quite often use this map when I'm giving popular talks about physics. I think it illustrates quite nicely some of the philosophical issues related to theoretical representations of nature. I think of theories as being like maps, i.e. as attempts to make a useful representation of some aspects of external reality. By useful, I mean that they allow us to make predictions we can test. However, there is a persistent tendency for some scientists to confuse the theory and the reality it is supposed to describe, especially a tendency to assert there is a one-to-one relationship between all elements of reality and the corresponding elements in the theoretical picture. This confusion was stated most succinctly by the Polish scientist Alfred Korzybski in his memorable aphorism:

The map is not the territory.

I see this problem written particularly large with those physicists who persistently identify the landscape of string-theoretical possibilities with a multiverse of physically existing domains in which all these are realised. Of course, the Universe might be like that but it’s by no means clear to me that it has to be. I think we just don’t know what we’re doing well enough to know as much as we like to think we do.

A theory is also surrounded by a penumbra of non-testable elements, including those concepts that we use to translate the mathematical language of physics into everyday words. We shouldn't forget that many equations of physics have survived for a long time, but their interpretation has changed radically over the years.

The inevitable gap that lies between theory and reality does not mean that physics is a useless waste of time; it just means that its scope is limited. The Tube map is not complete or accurate in all respects, but it's excellent for what it was made for. Physics goes down the tubes when it loses sight of its key requirement: to be testable.

In any case, an attempt to make a grand unified theory of the London Underground system would no doubt produce a monstrous thing so unwieldy that it would be useless in practice. I think there's a lesson there for string theorists too…

Now, anyone for a game of Mornington Crescent?