## The Monkey Complex

Posted in Bad Statistics, The Universe and Stuff with tags , , , , , on November 15, 2009 by telescoper

There’s an old story that if you leave a set of monkeys hammering on typewriters for a sufficiently long time then they will eventually reproduce the entire text of Shakespeare’s play Hamlet. It comes up in a variety of contexts, but the particular generalisation of this parable in cosmology is to argue that if we live in an enormously big universe (or “multiverse“), in which the laws of nature (as specified by the relevant fundamental constants) vary “sort of randomly” from place to place, then there will be a domain in which they have the right properties for life to evolve. This is one way of explaining away the apparent fine-tuning of the laws of physics: they’re not finely tuned, but we just live in a place where they allowed us to evolve. Although it may seem an easy step from monkeys to the multiverse, it always seemed to me a very shaky one.

For a start, let’s go back to the monkeys. The supposition that given an infinite time the monkeys must produce everything that’s possible in a finite sequence, is not necessarily true even if one does allow an infinite time. It depends on how they type. If the monkeys were always to hit two adjoining keys at the same time then they would never produce a script for Hamlet, no matter how long they typed for, as the combinations QW or ZX do not appear anywhere in that play. To guarantee what we need the kind their typing has to be ergodic, a very specific requirement not possessed by all “random” sequences.

A more fundamental problem is what is meant by randomness in the first place. I’ve actually commented on this before, in a post that still seems to be collecting readers so I thought I’d develop one or two of the ideas a little.

It is surprisingly easy to generate perfectly deterministic mathematical sequences that behave in the way we usually take to characterize indeterministic processes. As a very simple example, consider the following “iteration” scheme:

$X_{j+1}= 2 X_{j} \mod(1)$

If you are not familiar with the notation, the term mod(1) just means “drop the integer part”.  To illustrate how this works, let us start with a (positive) number, say 0.37. To calculate the next value I double it (getting 0.74) and drop the integer part. Well, 0.74 does not have an integer part so that’s fine. This value (0.74) becomes my first iterate. The next one is obtained by putting 0.74 in the formula, i.e. doubling it (1.48) and dropping  the integer part: result 0.48. Next one is 0.96, and so on. You can carry on this process as long as you like, using each output number as the input state for the following step of the iteration.

Now to simplify things a little bit, notice that, because we drop the integer part each time, all iterates must lie in the range between 0 and 1. Suppose I divide this range into two bins, labelled “heads” for X less than ½ and “tails” for X greater than or equal to ½. In my example above the first value of X is 0.37 which is “heads”. Next is 0.74 (tails); then 0.48 (heads), 0.96(heads), and so on.

This sequence now mimics quite accurately the tossing of a fair coin. It produces a pattern of heads and tails with roughly 50% frequency in a long run. It is also difficult to predict the next term in the series given only the classification as “heads” or “tails”.

However, given the seed number which starts off the process, and of course the algorithm, one could reproduce the entire sequence. It is not random, but in some respects  looks like it is.

One can think of “heads” or “tails” in more general terms, as indicating the “0” or “1” states in the binary representation of a number. This method can therefore be used to generate the any sequence of digits. In fact algorithms like this one are used in computers for generating what are called pseudorandom numbers. They are not precisely random because computers can only do arithmetic to a finite number of decimal places. This means that only a finite number of possible sequences can be computed, so some repetition is inevitable, but these limitations are not always important in practice.

The ability to generate  random numbers accurately and rapidly in a computer has led to an entirely new way of doing science. Instead of doing real experiments with measuring equipment and the inevitable errors, one can now do numerical experiments with pseudorandom numbers in order to investigate how an experiment might work if we could do it. If we think we know what the result would be, and what kind of noise might arise, we can do a random simulation to discover the likelihood of success with a particular measurement strategy. This is called the “Monte Carlo” approach, and it is extraordinarily powerful. Observational astronomers and particle physicists use it a great deal in order to plan complex observing programmes and convince the powers that be that their proposal is sufficiently feasible to be allocated time on expensive facilities. In the end there is no substitute for real experiments, but in the meantime the Monte Carlo method can help avoid wasting time on flawed projects:

…in real life mistakes are likely to be irrevocable. Computer simulation, however, makes it economically practical to make mistakes on purpose.

(John McLeod and John Osborne, in Natural Automata and Useful Simulations).

So is there a way to tell whether a set of numbers is really random? Consider the following sequence:

1415926535897932384626433832795028841971

Is this a random string of numbers? There doesn’t seem to be a discernible pattern, and each possible digit seems to occur with roughly the same frequency. It doesn’t look like anyone’s phone number or bank account. Is that enough to make you think it is random?

Actually this is not at all random. If I had started it with a three and a decimal place you might have cottoned on straight away. “3.1415926..” is the first few digits in the decimal representation of p. The full representation goes on forever without repeating. This is a sequence that satisfies most naïve definitions of randomness. It does, however, provide something of a hint as to how we might construct an operational definition, i.e. one that we can apply in practice to a finite set of numbers.

The key idea originates from the Russian mathematician Andrei Kolmogorov, who wrote the first truly rigorous mathematical work on probability theory in 1933. Kolmogorov’s approach was considerably ahead of its time, because it used many concepts that belong to the era of computers. In essence, what he did was to provide a definition of the complexity of an N-digit sequence in terms of the smallest amount of computer memory it would take to store a program capable of generating the sequence. Obviously one can always store the sequence itself, which means that there is always a program that occupies about as many bytes of memory as the sequence itself, but some numbers can be generated by codes much shorter than the numbers themselves. For example the sequence

111111111111111111111111111111111111

can be generated by the instruction to “print 1 35 times”, which can be stored in much less memory than the original string of digits. Such a sequence is therefore said to be algorithmically compressible.

There are many ways of calculating the digits of π numerically also, so although it may look superficially like a random string it is most definitely not random. It is algorithmically compressible.

I’m not sure how compressible Hamlet is, but it’s certainly not entirely random. When I studied it at school I certainly wished it were a little shorter…

The complexity of a sequence can be defined to be the length of the shortest program capable of generating it. If no algorithm can be found that compresses the sequence into a program shorter than itself then it is maximally complex and can suitably be defined as random. This is a very elegant description, and has good intuitive appeal.

I’m not sure how compressible Hamlet is, but it’s certainly not entirely random. At any rate, when I studied it at school, I certainly wished it were a little shorter…

However, this still does not provide us with a way of testing rigorously whether a given finite sequence has been produced “randomly” or not.

If an algorithmic compression can be found then that means we declare the given sequence not to be  random. However we can never be sure if the next term in the sequence would fit with what our algorithm would predict. We have to argue, inferentially, that if we have fit a long sequence with a simple algorithm then it is improbable that the sequence was generated randomly.

On the other hand, if we fail to find a suitable compression that doesn’t mean it is random either. It may just mean we didn’t look hard enough or weren’t clever enough.

Human brains are good at finding patterns. When we can’t see one we usually take the easy way out and declare that none exists. We often model a complicated system as a random process because it is  too difficult to predict its behaviour accurately even if we know the relevant laws and have  powerful computers at our disposal. That’s a very reasonable thing to do when there is no practical alternative.

It’s quite another matter, however,  to embrace randomness as a first principle to avoid looking for an explanation in the first place. For one thing, it’s lazy, taking the easy way out like that. And for another it’s a bit arrogant. Just because we can’t find an explanation within the framework of our current theories doesn’t mean more intelligent creatures than us won’t do so. We’re only monkeys, after all.