The Monkey Complex

There’s an old story that if you leave a set of monkeys hammering on typewriters for a sufficiently long time then they will eventually reproduce the entire text of Shakespeare’s play Hamlet. It comes up in a variety of contexts, but the particular generalisation of this parable in cosmology is to argue that if we live in an enormously big universe (or “multiverse”), in which the laws of nature (as specified by the relevant fundamental constants) vary “sort of randomly” from place to place, then there will be a domain in which they have the right properties for life to evolve. This is one way of explaining away the apparent fine-tuning of the laws of physics: they’re not finely tuned, but we just live in a place where they allowed us to evolve. Although it may seem an easy step from monkeys to the multiverse, it always seemed to me a very shaky one.

For a start, let’s go back to the monkeys. The supposition that, given an infinite time, the monkeys must produce everything that’s possible in a finite sequence is not necessarily true, even if one does allow an infinite time. It depends on how they type. If the monkeys were always to hit two adjoining keys at the same time then they would never produce a script for Hamlet, no matter how long they typed for, as the combinations QW or ZX do not appear anywhere in that play. To guarantee what we need, their typing has to be ergodic, a very specific requirement not possessed by all “random” sequences.

A more fundamental problem is what is meant by randomness in the first place. I’ve actually commented on this before, in a post that still seems to be collecting readers so I thought I’d develop one or two of the ideas a little.

 It is surprisingly easy to generate perfectly deterministic mathematical sequences that behave in the way we usually take to characterize indeterministic processes. As a very simple example, consider the following “iteration” scheme:

 X_{j+1}= 2 X_{j} \mod(1)

If you are not familiar with the notation, the term mod(1) just means “drop the integer part”.  To illustrate how this works, let us start with a (positive) number, say 0.37. To calculate the next value I double it (getting 0.74) and drop the integer part. Well, 0.74 does not have an integer part so that’s fine. This value (0.74) becomes my first iterate. The next one is obtained by putting 0.74 in the formula, i.e. doubling it (1.48) and dropping  the integer part: result 0.48. Next one is 0.96, and so on. You can carry on this process as long as you like, using each output number as the input state for the following step of the iteration.
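
For anyone who wants to try the iteration, here is a minimal Python sketch. The function name `doubling_map` is invented for illustration, and exact fractions are used (rather than floating-point numbers) so that no digits are silently lost along the way.

```python
from fractions import Fraction

def doubling_map(seed, steps):
    """Return the first `steps` iterates of X -> 2X mod 1, starting from `seed`."""
    x = Fraction(seed)
    iterates = []
    for _ in range(steps):
        x = (2 * x) % 1   # double, then drop the integer part
        iterates.append(x)
    return iterates

orbit = doubling_map(Fraction(37, 100), 3)
print([float(x) for x in orbit])   # the 0.74, 0.48, 0.96 of the worked example
```

Each output is fed straight back in as the next input, which is all that “iteration” means here.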

Now to simplify things a little bit, notice that, because we drop the integer part each time, all iterates must lie in the range between 0 and 1. Suppose I divide this range into two bins, labelled “heads” for X less than ½ and “tails” for X greater than or equal to ½. In my example above the first value of X is 0.37 which is “heads”. Next is 0.74 (tails); then 0.48 (heads), 0.96 (tails), and so on.

This sequence now mimics quite accurately the tossing of a fair coin. It produces a pattern of heads and tails with roughly 50% frequency in a long run. It is also difficult to predict the next term in the series given only the classification as “heads” or “tails”.
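
The heads-and-tails bookkeeping can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `coin_flips` is a made-up name, and the seed 37/100 is a rational number, so its sequence eventually cycles (here with period 20 after the first two terms). A seed whose binary expansion never repeats would be needed for a sequence that never cycles.

```python
from fractions import Fraction

def coin_flips(seed, n):
    """Label the seed and its successors under X -> 2X mod 1 as heads/tails."""
    x = Fraction(seed)
    flips = []
    for _ in range(n):
        flips.append("heads" if x < Fraction(1, 2) else "tails")
        x = (2 * x) % 1   # move on to the next iterate
    return flips

# 22 terms: the two start-up values plus one full cycle of 20
flips = coin_flips(Fraction(37, 100), 22)
print(flips[:4])
print(flips.count("heads"), "heads out of", len(flips))
```

For this particular seed the count comes out at exactly half heads over those 22 terms, which is the “roughly 50%” behaviour in miniature.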

However, given the seed number which starts off the process, and of course the algorithm, one could reproduce the entire sequence. It is not random, but in some respects  looks like it is.

One can think of “heads” or “tails” in more general terms, as indicating the “0” or “1” states in the binary representation of a number. This method can therefore be used to generate any sequence of digits. In fact algorithms like this one are used in computers for generating what are called pseudorandom numbers. They are not precisely random because computers can only do arithmetic to a finite number of decimal places. This means that only a finite number of possible sequences can be computed, so some repetition is inevitable, but these limitations are not always important in practice.
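
Both points can be seen in a short Python sketch (the function names are invented for illustration). Run with exact fractions, the heads/tails labels are just the binary digits of the seed. Run with ordinary floating-point numbers, the stored seed has only finitely many binary digits, so the iteration uses them up and then sticks at zero.

```python
from fractions import Fraction

def binary_digits(seed, n):
    """First n binary digits of a seed in [0, 1), read off via X -> 2X mod 1."""
    x = Fraction(seed)
    digits = []
    for _ in range(n):
        digits.append(0 if x < Fraction(1, 2) else 1)   # heads -> 0, tails -> 1
        x = (2 * x) % 1
    return digits

def float_orbit_end(seed, steps):
    """Run the same map in ordinary floating point and return the final value."""
    x = float(seed)
    for _ in range(steps):
        x = (2.0 * x) % 1.0
    return x

print(binary_digits(Fraction(37, 100), 8))   # leading binary digits of 0.37
print(float_orbit_end(0.37, 60))             # the stored seed has run out of digits
```

Practical pseudorandom generators use more elaborate recipes than this one, but they share the same limitation of a finite internal state.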

The ability to generate  random numbers accurately and rapidly in a computer has led to an entirely new way of doing science. Instead of doing real experiments with measuring equipment and the inevitable errors, one can now do numerical experiments with pseudorandom numbers in order to investigate how an experiment might work if we could do it. If we think we know what the result would be, and what kind of noise might arise, we can do a random simulation to discover the likelihood of success with a particular measurement strategy. This is called the “Monte Carlo” approach, and it is extraordinarily powerful. Observational astronomers and particle physicists use it a great deal in order to plan complex observing programmes and convince the powers that be that their proposal is sufficiently feasible to be allocated time on expensive facilities. In the end there is no substitute for real experiments, but in the meantime the Monte Carlo method can help avoid wasting time on flawed projects:

…in real life mistakes are likely to be irrevocable. Computer simulation, however, makes it economically practical to make mistakes on purpose.

(John McLeod and John Osborne, in Natural Automata and Useful Simulations).
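
To make the idea concrete, here is a toy Monte Carlo experiment in Python. Everything in it is an assumption made for illustration: a constant signal, Gaussian noise, a fixed number of samples per observation, and a simple threshold on the sample mean as the “measurement strategy”. It is a sketch of the approach, not a model of any real instrument.

```python
import math
import random

def detection_probability(signal, noise_sigma, n_samples, threshold,
                          trials=20000, seed=1):
    """Monte Carlo estimate of how often the mean of noisy samples exceeds a threshold."""
    rng = random.Random(seed)          # seeded, so the "experiment" is repeatable
    hits = 0
    for _ in range(trials):
        total = sum(signal + rng.gauss(0.0, noise_sigma) for _ in range(n_samples))
        if total / n_samples > threshold:
            hits += 1
    return hits / trials

def exact_probability(signal, noise_sigma, n_samples, threshold):
    """Closed-form answer for Gaussian noise, used here only as a cross-check."""
    z = (threshold - signal) / (noise_sigma / math.sqrt(n_samples))
    return 0.5 * math.erfc(z / math.sqrt(2.0))

estimate = detection_probability(signal=1.0, noise_sigma=2.0, n_samples=10, threshold=0.5)
exact = exact_probability(1.0, 2.0, 10, 0.5)
print(estimate, exact)
```

In this toy case the answer can be worked out on paper, which makes it a useful check. The simulation earns its keep when the noise or the strategy is too messy for a formula.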

So is there a way to tell whether a set of numbers is really random? Consider the following sequence:

1415926535897932384626433832795028841971

Is this a random string of numbers? There doesn’t seem to be a discernible pattern, and each possible digit seems to occur with roughly the same frequency. It doesn’t look like anyone’s phone number or bank account. Is that enough to make you think it is random?

Actually this is not at all random. If I had started it with a three and a decimal point you might have cottoned on straight away. “3.1415926..” are the first few digits in the decimal representation of π. The full representation goes on forever without repeating. This is a sequence that satisfies most naïve definitions of randomness. It does, however, provide something of a hint as to how we might construct an operational definition, i.e. one that we can apply in practice to a finite set of numbers.

The key idea originates from the Russian mathematician Andrei Kolmogorov, who wrote the first truly rigorous mathematical work on probability theory in 1933. Kolmogorov’s approach was considerably ahead of its time, because it used many concepts that belong to the era of computers. In essence, what he did was to provide a definition of the complexity of an N-digit sequence in terms of the smallest amount of computer memory it would take to store a program capable of generating the sequence. Obviously one can always store the sequence itself, which means that there is always a program that occupies about as many bytes of memory as the sequence itself, but some numbers can be generated by codes much shorter than the numbers themselves. For example the sequence

11111111111111111111111111111111111

can be generated by the instruction to “print 1 35 times”, which can be stored in much less memory than the original string of digits. Such a sequence is therefore said to be algorithmically compressible.
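
In Python the instruction really is that short. The sketch below also runs a general-purpose compressor (zlib, from the standard library) over a longer run of 1s. That is an assumption of convenience: a compressor is not the shortest possible program, and gives only a crude upper bound on the complexity in the sense described above.

```python
import zlib

ones = "1" * 35                      # the whole "program": print 1 35 times
print(ones, len(ones))

# A general-purpose compressor is a crude, computable stand-in for
# "shortest program": it only gives an upper bound on the true complexity.
long_run = b"1" * 3500
packed = zlib.compress(long_run, 9)
print(len(long_run), "bytes shrink to", len(packed))
assert zlib.decompress(packed) == long_run   # nothing was lost
```

The longer the run of 1s, the bigger the saving, since the recipe barely grows while the string does.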

There are also many ways of calculating the digits of π numerically, so although it may look superficially like a random string it is most definitely not random. It is algorithmically compressible.
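
As one illustration of how short such a recipe can be, here is a Python transcription of a streaming “spigot” algorithm due to Jeremy Gibbons. It is just one of the many known methods, chosen here because it fits in a dozen lines; it needs nothing beyond integer arithmetic.

```python
from itertools import islice

def pi_digits():
    """Yield the decimal digits of pi one at a time (Gibbons's streaming spigot)."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n   # the next digit is now certain, so emit it
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            # not enough information yet: absorb another term of the series
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

first_digits = list(islice(pi_digits(), 10))
print(first_digits)
```

A few hundred characters of program will, given enough time and memory, churn out as many digits as you care to ask for.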

I’m not sure how compressible Hamlet is, but it’s certainly not entirely random. When I studied it at school I certainly wished it were a little shorter…

The complexity of a sequence can be defined to be the length of the shortest program capable of generating it. If no algorithm can be found that compresses the sequence into a program shorter than itself then it is maximally complex and can suitably be defined as random. This is a very elegant description, and has good intuitive appeal.  

However, this still does not provide us with a way of testing rigorously whether a given finite sequence has been produced “randomly” or not.

If an algorithmic compression can be found then that means we declare the given sequence not to be  random. However we can never be sure if the next term in the sequence would fit with what our algorithm would predict. We have to argue, inferentially, that if we have fit a long sequence with a simple algorithm then it is improbable that the sequence was generated randomly.

On the other hand, if we fail to find a suitable compression that doesn’t mean it is random either. It may just mean we didn’t look hard enough or weren’t clever enough.
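
A small Python sketch shows how easily a search for compression can fail. The bytes below come from Python’s own seeded pseudorandom generator, so a short recipe for them certainly exists, yet a general-purpose compressor (zlib, standing in here for “us, looking for a pattern”) finds essentially nothing to squeeze. The helper name and the seed are arbitrary choices made for illustration.

```python
import random
import zlib

def pseudo_random_bytes(seed, n):
    """n bytes from Python's seeded generator: a short recipe for a long string."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))

data = pseudo_random_bytes(seed=2024, n=10000)
packed = zlib.compress(data, 9)

print(len(data), "bytes;", len(packed), "after zlib")
# ...and yet the seed plus the recipe rebuild the data exactly:
assert pseudo_random_bytes(2024, 10000) == data
```

The compressor’s failure says something about the compressor, not about the data.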

Human brains are good at finding patterns. When we can’t see one we usually take the easy way out and declare that none exists. We often model a complicated system as a random process because it is  too difficult to predict its behaviour accurately even if we know the relevant laws and have  powerful computers at our disposal. That’s a very reasonable thing to do when there is no practical alternative. 

It’s quite another matter, however,  to embrace randomness as a first principle to avoid looking for an explanation in the first place. For one thing, it’s lazy, taking the easy way out like that. And for another it’s a bit arrogant. Just because we can’t find an explanation within the framework of our current theories doesn’t mean more intelligent creatures than us won’t do so. We’re only monkeys, after all.


29 Responses to “The Monkey Complex”

  1. A very well written post.

  2. Mr Physicist Says:

    Sir, I admire your gargantuan post this Sunday evening on a topic of such a fundamental nature. Having partaken of too much liquor, I find it too difficult to focus on the essential nature. But… a simple question emerges…has anybody ever performed numerous Monte-Carlo simulations and ever arrived at a Hamlet-type script? I think not, and yet some expend their lives looking for life beyond our own Universe!

    Somehow, we have something wrong – don’t you think?

  3. Anton Garrett Says:

    I think that the only meaningful definition of a “random” number or a “random” process is:- one that I can’t predict. I don’t accept the definition “a process that isn’t predictable” because although *I* might not be able to predict it, somebody else might. To take Peter’s example, I would be stumped for the next digit after 14159265 if I hadn’t twigged that the sequence comprises digits out of pi in base 10, whereas Dr Bright who noticed this fact would immediately be able to write down the next digit. So it is all about what information you have about the generating process. Otherwise, you get the absurdity that the number “3” (the next digit) is random to me but not to Dr Bright. I think the word “random” is actively unhelpful, because it hides the fact that randomness is in the eye of the beholder, and misleadingly suggests that randomness is to do with the generating process rather than to do with knowledge about that process. And don’t get me started on the meaning of “pseudo-random”…

    In contrast I haven’t looked into the notion of complexity very closely, but the following point worries me: If the complexity of an N-digit sequence is defined via the smallest amount of computer memory needed to store a program capable of generating the sequence, how can you find this? Doesn’t every finite sequence of numbers crop up far enough down the line when pi is written down, playing havoc with this definition?


  4. Well if we’re all monkeys like I claim, then Shakespeare was a monkey and he managed to produce Hamlet. Evidently, elsewhere in the multiverse people go to the theatre to listen to random gibberish.

  5. Anton Garrett Says:

    PS See the Branagh Hamlet film. He doesn’t cut the script of this majestic play (as is usually done with Shakespeare on film) and he understands both the medium of cinema and Shakespeare – normally you get one or the other when a film is made of his plays. It has its imperfections but I regard it as the best Shakespeare on film yet.

  6. Mr Physicist Says:

    PS. I very much like this cultured blog. The e-Astronomer has become boring and apparently sold out for the King’s Shilling! I find this worrying for astronomical observation, comment and debate.

  7. Anton,

    I’m not sure I understand your objection, but I guess you are saying that every finite sequence can be generated using the algorithm for pi, so they are all compressible. Actually, I don’t think that works because you also need to specify the precise starting point in the infinite chain, which needs an infinite amount of information in addition to the algorithm for pi.

    One of the other problems of the complexity interpretation is that although you might be able to write a short program to generate the sequence it may take an excessively long time to run….


  8. Mr Physicist,

    I agree about the e-astronomer. That’s only interesting when Mrs Trellis posts there.


  9. Anton Garrett Says:


    I’m not objecting to the definition of complexity so much as saying I don’t get it, and implicitly appealing for enlightenment by using what appears to my unenlightened mind to be a counter-example. If you happen to know where in pi a sequence comes, you can specify that sequence very concisely. If not, it takes longer. So, is complexity not also dependent on what you know, like “randomness”?


  10. Mr Physicist Says:

    I am almost talking sense under the influence. Does this count as complexity arising from randomness? This is what happens when you unleash such discussions on a Sunday night – you don’t know what you have started with your apparently random topic of “randomness”. What happened to that e-Astronomer chap?

  11. Anton,

    All I was saying was that if you know that your sequence starts at a position 124639856 digits along the sequence for pi then you have to store that in addition to the algorithm for pi in order to obtain it. If you move it further along the storage requirement for the number needed to index the position will far exceed that needed for the pi-generator.

    But I do agree with you that this definition is operational, so assumes an operator with some, perhaps limited, knowledge. You might not have a suitable algorithm, but that does not mean there isn’t one.


  12. “See the Branagh Hamlet film.”

    This film moves the setting to 19th century Italy, right? I could never get why people do that. Seems like a cheap way to get some novelty. There is a film of A Midsummer Night’s Dream which is also in the 19th century where bicycles exist.

    One can argue for the Elizabethan (or Jacobite) setting since that is what Shakespeare used. On the other hand, for him, that was modern, so one could argue for setting the plays in the present. Perhaps one can argue for historical accuracy, e.g. setting Julius Caesar in ancient Rome. But what is the point of A Winter’s Tale taking place in the US during the depression or casting Falstaff as a Wall Street broker?

    And don’t even get me started on modern opera productions. (Like Bach, I am not an opera fan, but even if I were, I wouldn’t watch La Traviata taking place in a coal mine, or Madame Butterfly on a battleship during WW II.)

  13. Anton Garrett Says:

    Phillip: You have been misinformed; Branagh’s Hamlet is still set in Denmark (not Italy). Shakespeare also set it in mediaeval times well before his own day, but Branagh moves it forward to about 200 years ago, probably to make the most of the spectacular interiors of the stately home which doubled as Elsinore (it was filmed at Blenheim in Oxfordshire). Do remember that cinema is a visual medium and that cinema-goers expect royalty to live in opulence, which to them means the interior style of 200 years ago, not draughty, dark and far smaller mediaeval halls. Branagh’s film is still in the era when armies and people moved around by foot or horse, so there is no damage to the plausibility of the plot.

    Apart from that, I agree with you. I’ve often thought that second-rate directors try to get themselves noticed by gimmicky productions of operas and plays written by better men. In particular I share the ambivalence of the audience at the ridiculous Boulez/Chereau Marxist interpretation of Wagner’s ring cycle at Bayreuth a generation ago, set in a power station on the Rhine. The audience wanted to cheer the singers but boo the production, and I would have felt the same. And I thought it daft of Branagh to transpose As You Like It, a play replete with English pastorale, to feudal Japan in his recent film. I knew personally a Scottish playwright (also a distinguished thermodynamicist and engineer, now deceased, who invented desalination of seawater as a continuous-flow process) whose masterpiece, about Robert the Bruce, was partly wrecked by a director who insisted on having TWO Bruces on stage simultaneously. Unlike Shakespeare and Wagner, Bob Silver could answer back, and he made eloquent mincemeat of the director – but he couldn’t budge him, for the director was appointed by the festival in which the play appeared.


  14. Anton Garrett Says:

    @Mr Physicist: Hope you don’t have a random hangover.

  15. OK, I was right about the time shift but wrong about moving it to Italy (probably confusing it with another Shakespeare adaptation). I have seen many of Branagh’s films but not this one (yet).

  16. Anton and Philip

    It’s interesting how the trajectory of this discussion is nearly independent of its initial conditions. I must post about chaos sometime.

    However, I should say that I think modern settings of operas can be very effective. I enjoyed the 1920s setting of The Magic of Figaro by WNO and Miller’s Mafia-style Rigoletto at ENO, for example. There is no reason to stay rooted in the period if the story is timeless. But of course it matters a lot HOW it is done. Some altered settings are just daft.

    Another kind of staging is one without obvious period, with minimal sets and costumes. That can be effective too, but also disorienting if done badly.

    To the question why do it, the answer is simple. To keep the drama relevant and to demonstrate the power of a story that it is universal. I don’t think performance art should be treated as a thing to be preserved as it was the time it was written.


  17. “The Magic of Figaro”. Wow, every few years another “lost” Mozart opera turns up. I also enjoyed Giovanni fan tutte.

  18. That’s what happens when I type without paying due attention. The Marriage of Figaro, is what I meant, although the Flute of Figaro would probably be good too! How about Don Tutte?

  19. “the Flute of Figaro”, starring Peter O’Toole.

  20. Anton Garrett Says:

    What about a cast list for this new Mozart/da Ponte opera? For a start I’d have Cappucino and Fettucine as young lovers, and Marscapone as the baddie. Further contributions invited…

  21. Anton Garrett Says:

    Mortadella – Marscapone’s wife
    Espresso – the messenger
    Mozzarella – friend to Fettucine

  22. telescoper Says:

    I think you forgot

    Pomodoro, married to Mozzarella.
    Olio and Aceto, the dressers.
    Ravioli, the strong man
    Funghi, a fun guy.
    Conto, the Bill.

    • telescoper Says:

      Actually, just to return to Philip’s point for a moment. The Magic Flute is an example of an Opera which can be staged in any historical period or any geographical setting. It’s nonsense whatever you do with it! Great fun though.

  23. Anton Garrett Says:

    Peter: Freemasons might disagree (NB I am not now, and have never been…)

  24. Alan Penny Says:

    Getting back to the multiverse.

    As a keen Anthropicist, I am not clear about Peter’s doubts on randomness. It doesn’t have to be truly random, as long as the constants of nature, or even the laws themselves, are scattered in some way that does once in a while allow life to exist, then the anthropic argument holds. What will be more complex is trying to deduce from the scatter of the values in our universe about some ‘ideal’ mean as to whether they are consistent with a random generation and thus give credence to the anthropic idea.

    As to his comment about an audience listening to a random play, this would not of course occur. Life forms content with such a performance would not in fact evolve to the status of playgoers.

    But if there is a true infinity of universes, then somewhere at this moment an audience is being enthralled by a new and innovative performance of ‘The Magic of Figaro’. In fact an infinity of audiences are.

  25. telescoper Says:


    My point is that “randomness” is a word often used to avoid thinking. In that respect “infinity” is even worse. The mere fact that we might have an infinite ensemble does not guarantee that life has to be present somewhere within it. It is also necessary that whatever process generates the constants of nature happens to have a non-zero probability of producing the sequence of coincidences that makes life possible in one domain. Not all “random” processes can do that.


