
The Inductive Detective

Posted in Bad Statistics, Literature, The Universe and Stuff on September 4, 2009 by telescoper

I was watching an old episode of Sherlock Holmes last night – from the classic Granada TV series featuring Jeremy Brett’s brilliant (and splendidly camp) portrayal of the eponymous detective. One of the things that fascinates me about these and other detective stories is how often they use the word “deduction” to describe the logical methods involved in solving a crime.

As a matter of fact, what Holmes generally uses is not really deduction at all, but inference (a process which is predominantly inductive).

In deductive reasoning, one tries to tease out the logical consequences of a premise; the resulting conclusions are, generally speaking, more specific than the premise. “If these are the general rules, what are the consequences for this particular situation?” is the kind of question one can answer using deduction.

The kind of reasoning Holmes employs, however, is essentially the opposite of this. The question being answered is of the form: “From a particular set of observations, what can we infer about the more general circumstances relating to them?”. The following example from A Study in Scarlet is exactly of this type:

From a drop of water a logician could infer the possibility of an Atlantic or a Niagara without having seen or heard of one or the other.

The word “possibility” makes it clear that no certainty is attached to the actual existence of either the Atlantic or Niagara, but the implication is that observations of (and perhaps experiments on) a single water drop could allow one to infer enough of the general properties of water to deduce the possible existence of other phenomena. The fundamental process is inductive rather than deductive, although deductions do play a role once general rules have been established.

In the example quoted there is an inductive step between the water drop and the general physical and chemical properties of water, and then a deductive step that shows that these laws could describe the Atlantic Ocean. Deduction involves going from theoretical axioms to observations, whereas induction is the reverse process.

I’m probably labouring this distinction, but the main point of doing so is that a great deal of science is fundamentally inferential and, as a consequence, it entails dealing with inferences (or guesses or conjectures) whose application to real facts is inherently uncertain. Dealing with these uncertain aspects requires a more general kind of logic than the simple Boolean form employed in deductive reasoning. This side of the scientific method is sadly neglected in most approaches to science education.

In physics, the attitude is usually to establish the rules (“the laws of physics”) as axioms (though perhaps giving some experimental justification). Students are then taught to solve problems which generally involve working out particular consequences of these laws. This is all deductive. I’ve got nothing against this: it is what a great deal of theoretical research in physics is actually like, and it forms an essential part of the training of a physicist.

However, one of the aims of physics – especially fundamental physics – is to try to establish what the laws of nature actually are from observations of particular outcomes. It would be simplistic to say that this was entirely inductive in character. Sometimes deduction plays an important role in scientific discoveries. For example, Albert Einstein deduced his Special Theory of Relativity from a postulate that the speed of light was constant for all observers in uniform relative motion. However, the motivation for this entire chain of reasoning arose from previous studies of electromagnetism, which involved a complicated interplay between experiment and theory that eventually led to Maxwell’s equations. Deduction and induction are both involved at some level in a kind of dialectical relationship.

The synthesis of the two approaches requires an evaluation of the evidence the data provide concerning the different theories. This evidence is rarely conclusive, so a wider range of logical possibilities than “true” or “false” needs to be accommodated. Fortunately, there is a quantitative and logically rigorous way of doing this. It is called Bayesian probability. In this way of reasoning, the probability (a number between 0 and 1 attached to a hypothesis, model, or anything that can be described as a logical proposition of some sort) represents the extent to which a given set of data supports the given hypothesis. The calculus of probabilities only reduces to Boolean algebra when the probabilities of all hypotheses involved are either unity (certainly true) or zero (certainly false). In between “true” and “false” there are varying degrees of “uncertain”, represented by a number between 0 and 1, i.e. the probability.
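To make this concrete, here is a minimal sketch (not from the original post) of Bayesian updating for two rival hypotheses; the prior probabilities and likelihoods are made up purely for illustration.

```python
# Minimal illustration of Bayesian updating for two rival hypotheses, H1 and H2.
# The priors and likelihoods below are invented for the example, not taken from
# any real experiment.

# Prior probabilities: degree of belief in each hypothesis before seeing data D.
prior = {"H1": 0.5, "H2": 0.5}

# Likelihoods P(D | H): how probable the observed data are under each hypothesis.
likelihood = {"H1": 0.8, "H2": 0.2}

# Evidence P(D): total probability of the data, summed over the hypotheses.
evidence = sum(prior[h] * likelihood[h] for h in prior)

# Bayes' theorem: P(H | D) = P(D | H) * P(H) / P(D).
posterior = {h: prior[h] * likelihood[h] / evidence for h in prior}

print(posterior)  # H1 rises to 0.8, H2 falls to 0.2: the data favour H1
```

Neither hypothesis is proved or refuted here; the data simply make one more probable than the other, which is exactly the kind of graded verdict that Boolean logic cannot express.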

Overlooking the importance of inductive reasoning has led to numerous pathological developments that have hindered the growth of science. One example is the widespread and remarkably naive devotion that many scientists have towards the philosophy of the anti-inductivist Karl Popper; his doctrine of falsifiability has led to an unhealthy neglect of an essential fact of probabilistic reasoning, namely that data can make theories more probable. More generally, the rise of the empiricist philosophical tradition that stems from David Hume (another anti-inductivist) spawned the frequentist conception of probability, with its regrettable legacy of confusion and irrationality.

My own field of cosmology provides the largest-scale illustration of this process in action. Theorists make postulates about the contents of the Universe and the laws that describe it, and try to calculate what measurable consequences their ideas might have. Observers make measurements as best they can, but these are inevitably restricted in number and accuracy by technical considerations. Over the years, theoretical cosmologists deductively explored the possible ways Einstein’s General Theory of Relativity could be applied to the cosmos at large. Eventually a family of theoretical models was constructed, each of which could, in principle, describe a universe with the same basic properties as ours. But determining which, if any, of these models applied to the real thing required more detailed data. For example, observations of the properties of individual galaxies led to the inferred presence of cosmologically important quantities of dark matter. Inference also played a key role in establishing the existence of dark energy as a major part of the overall energy budget of the Universe. The result is that we have now arrived at a standard model of cosmology which accounts pretty well for most relevant data.

Nothing is certain, of course, and this model may well turn out to be flawed in important ways. All the best detective stories have twists in which the favoured theory turns out to be wrong. But although the puzzle isn’t exactly solved, we’ve got good reasons for thinking we’re nearer to at least some of the answers than we were 20 years ago.

I think Sherlock Holmes would have approved.