Ligatures, Diphthongs and Supernovae

At the weekend I noticed a nice article by John Butterworth on his Grauniad blog about where Gold comes from. Regular readers of this blog (Sid and Doris Bonkers) know that I am not at all pedantic but my attention was drawn to the plural of supernova in the preamble:


I have to confess that I much prefer the Latin plural “supernovae” to the modernised “supernovas”, although most dictionaries (including the One True Chambers) give both as valid forms. In the interest of full disclosure I will point out that I did five years of Latin at school, and very much enjoyed it…

When I tweeted about my dislike for supernovas and preference for supernovae some replied that English words should have English plurals so that supernovas was preferred (although I wonder if that logic extends to, e.g., datums and phenomenons). Others said that supernovae was fine among experts but for science communication purposes it was better to say “supernovas” as this more obviously means “more than one supernova”. That’s a reasonable argument, but I have to admit I find it a little condescending to assume that an audience can cope with the idea of a massive star exploding as a consequence of gravitational collapse but be utterly bewildered by a straightforward Latin plural.

One of the reasons I prefer the Latin plural – along with some other forms that may appear archaic, e.g. Nebulae – is that Astronomy is unique among sciences for having such a long history. Many astronomical terms derive from very ancient sources and in my view we should celebrate this fact because it’s part of the subject’s fascination. That’s just my opinion, of course. You are welcome to disagree with that too.

Anyway, you might be interested to know a couple of things. One is that the first use of “super-nova” recorded in the Oxford English Dictionary was in 1932 in a paper by Swedish astronomer Knut Lundmark. This word is however formed from “nova” (which means “new” in Latin) and the first use of this term in an astronomical setting was in a book by Tycho Brahe, published in 1573:

[Image: Brahe_book]

(I’ll leave it as an exercise to the student to translate the full title.)

Nowadays a nova is taken to be a much lower budget feature than a supernova but the “nova” described in Tycho’s book was actually a supernova, SN1572, which he, along with many others, had observed the previous year. Historical novae were very often supernovae, in fact, because supernovae are much brighter than mere novae. The real difference between these two classes of object wasn’t understood until the 20th Century, however, which is why the term supernova was coined much later than nova.

Anyway, back to pedantry.

A subsequent tweet from Roberto Trotta asserted  that in fact supernovae and supernovas are both wrong; the correct plural should be supernovæ, in which the two letters of the digraph “ae” are replaced with a single glyph known as a ligature. Often, as in this case, a ligature stands for a diphthong, a sort of composite vowel sound made by running two vowels together.   It’s one of the peculiarities of English that there are only five vowels, but these can represent quite different sounds depending on the context (and on the regional accent). This  means that English has many hidden diphthongs. For example,  the “o” in “no” is a diphthong in English. In languages such as Italian, in which the vowels are very pure, “no” is pronounced quite differently from English. The best test of whether a vowel is pure or not is whether your mouth changes shape as you pronounce it: your mouth moves as you say an English “no”, closing the vowel that stays open in the Italian “no”…

So, not all diphthongs are represented by ligatures. It’s also the case that not all ligatures represent diphthongs. Indeed some are composed entirely of consonants. My current employer’s logo features a ligature formed from the letters U and S:


The use of the ligature æ arose in Mediaeval Latin (or should I say Mediæval?). In fact if you look at the frontispiece of the Brahe book shown above you will see a number of examples of it in its upper-case form Æ. I’m by no means an expert in such things but my guess is that the use of such ligatures in printed works was favoured simply to speed up the typesetting process – which was very primitive – by allowing the compositor to use a single piece of type to set two characters. However, it does appear in handwritten documents e.g. in Old English, long before printing was invented so easier typesetting doesn’t explain it all.

Use of the specific ligature in question caught on particularly well in Scandinavia where it eventually became promoted to a letter in its own right (“aesc”) and is listed as a separate vowel in the modern Danish and Norwegian alphabets. Early word-processing and computer typesetting software generally couldn’t render ligatures because they were just too complicated, so their use fell out of favour in the Eighties, though there are significant exceptions to this rule. LaTeX, for example, always allowed ligatures to be created quite easily. Software – even Microsoft Word – is much more sophisticated than it used to be, so it’s now not so much of a problem to use ligatures in digital text. Maybe they will make a comeback!
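For the curious, the promotion of æ from ligature to letter can be seen in how modern text encodings treat it. Here is a minimal sketch using Python’s standard unicodedata module. The choice of Python, the helper name describe and the use of the “ﬁ” ligature as a comparison are all mine, purely for illustration. Unicode gives æ a name as a letter and no decomposition, whereas a purely typographic ligature such as ﬁ falls apart into its two letters under compatibility normalisation.

```python
import unicodedata

ASH = "\u00e6"          # æ, as in supernovæ
FI_LIGATURE = "\ufb01"  # ﬁ, a purely typographic ligature (chosen for comparison)


def describe(ch):
    """Return (Unicode name, NFKD compatibility decomposition) for one character."""
    return unicodedata.name(ch), unicodedata.normalize("NFKD", ch)


ash_name, ash_decomposed = describe(ASH)
fi_name, fi_decomposed = describe(FI_LIGATURE)

# Building the disputed plural with a single glyph instead of the digraph "ae".
plural = "supernov" + ASH

print(ash_name, repr(ash_decomposed))  # named as a letter; stays one character
print(fi_name, repr(fi_decomposed))    # named as a ligature; splits into "f" + "i"
print(plural, len(plural))
```

So as far as the encoding is concerned, “supernovæ” is nine characters long, one shorter than “supernovae”.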

Anyway, the use of æ was optional even in Mediaeval Latin so I don’t think it can be argued that supernovæ is really more correct than supernovae, though to go back to a point I made earlier, I do admit that a rambling discussion of ligatures and diphthongs would not add much to a public lecture on exploding stars.


No referenda, please..

One of the most interesting topics under discussion after the announcement of the results of Thursday’s referendum on Scottish independence is whether there will be another one. That, in turn, leads to the question: what is the proper plural of “referendum”?

Regular readers of this blog know that I’m never pedantic about such matters. Well, maybe a little bit, sometimes. Latin was my best subject at O-level, though, so I can’t resist making a comment.

Any dictionary will tell you that “referendum” is obtained from the Latin verb referre, which is itself formed as re- (prefix meaning “back”) + ferre (to carry); thus its literal meaning is “carry back” or, more relevantly, “to refer”. Ferre is actually an irregular verb, so I’ll use simpler examples of regular verbs below.

Latin grammar includes two related concepts derived from a verb, the gerund and the gerundive. The gerund is a verbal noun; such things exist in English in forms that mean the act of something, e.g. running, eating, loving. In the last case the relevant Latin verb is the first-conjugation amare and the gerund is amandum. You can find a similar construction surviving in such English words as “graduand”. Note however that a gerund has no plural form because that would make no sense.

Related to the gerund is the gerundive which, as its name suggests, is an adjectival form related to the gerund, specifically expressing necessity.

In Latin, an adjective takes an ending that depends on the gender of the noun it describes; the gerundive also follows this pattern. In the example given above, the gerundive form is amandus in the masculine or, if referring to a female entity, amanda (hence the name, which means “deserving or requiring love”), or amandum for a neuter noun. In cases where the noun is plural the forms would be amandi, amandae, and amanda. Endings for other verbs are formed in a similar fashion depending on their conjugation.
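For anyone who likes their declensions tabulated, the pattern just described can be sketched in a few lines of Python. This is only an illustration under some loud assumptions: the function gerundive_forms and the ENDINGS table are names I have made up, the stem (e.g. “amand”) is supplied by hand rather than derived from the verb, and only the nominative forms discussed above are covered.

```python
# Nominative endings that the gerundive shares with ordinary adjectives of
# this type: (singular, plural) for each gender. Other cases are ignored here.
ENDINGS = {
    "masculine": ("us", "i"),
    "feminine": ("a", "ae"),
    "neuter": ("um", "a"),
}


def gerundive_forms(stem):
    """Attach the nominative endings to a gerundive stem supplied by the caller."""
    return {
        gender: (stem + singular, stem + plural)
        for gender, (singular, plural) in ENDINGS.items()
    }


forms = gerundive_forms("amand")
for gender, (singular, plural) in forms.items():
    print(f"{gender:>9}: {singular}, {plural}")
```

Feeding in the stem “referend” gives referendum and referenda in the neuter row, which is where the trouble starts.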

From this example you can see that in Latin amandum could mean either “loving” (gerund) or “a thing to be loved” (gerundive). Latin grammar is sufficiently precise, however, that the actual meaning will be clear from the context.

Now, to referendum. It seems clear to me that this is a gerundive and thus means “a thing to be referred” (the thing concerned being of no gender, as is normal in such cases in Latin). So what should be the word for more than one referendum?

Think about it and you’ll realise that referenda would imply “more than one thing to be referred”. The familiar word agenda is formed precisely this way and it means “(a list of things) to be done”. But this is not the meaning we want, i.e. “more than one example of a thing being referred”.

I would therefore argue that referenda is clearly wrong, in that it means something quite different from that needed to describe more than one of what a referendum is.

So what should we use? This is a situation where there isn’t a precise correspondence between Latin and English grammatical forms so it seems to me that we should just treat referendum as an English noun and give it the corresponding English plural. So “referendums” it is.

Any questions?

Pluralia Tantum

Meanwhile, over on the e-astronomer, Andy Lawrence recently posted an item about the lamentable tendency of astronomers to abuse the English language. The focus of his venom was “extincted”, a word used by many astro-types as an adjective to describe the state of affairs when light from a source (e.g. a quasar) has suffered “extinction” by intervening matter. “Extinction” is formed from the verb “extinguish” in the same way that “distinction” is formed from “distinguish”. Nobody would describe a professor as “distincted” (certainly not if it is Andy Lawrence) so, clearly, “extincted” is inappropriate. Actually if you really want to nit-pick you could object to “extinction” being applied to an object such as a  quasar, when it isn’t actually the object that is suffering from it but the light it has emitted.

But as a gripe, this is fair enough I’d say. Andy went on to encourage his legions of adoring readers to contribute their own pet hates, preferably with an astronomical orientation. My contribution was “decimate”, which means “to remove the tenth part” or “to reduce by ten percent”, from the Roman practice of punishing disobedient legions by killing every tenth man, but is often regrettably now used to mean “annihilate” or “obliterate”. You might think this hasn’t got much to do with astronomy but, sadly, it does. Indeed, a press release from STFC discussing the recent ten percent cuts to its grants budget states that the consequent reduction in PDRAs

..will not cause the decimation of physics departments as has been speculated in media reports.

I would expect a civil servant to have done a bit better, so presumably this was written by an astronomer too. At any rate, it is precisely wrong.

You might argue that things like this don’t matter. Language evolves, and if modern usage deviates from its previous meanings then we should just let it change. I fully accept the dynamic nature of language and do not by any means object to all such changes. Society changes and so must the words we use. But if a change (a) is a result of sloppiness and (b) results in the loss of a very good usage, to be replaced by a bad one, then I think educated people should stand their ground and fight it. If we don’t do that, language doesn’t just change, it decays.

Most of us practising scientists have to spend a lot of our time writing scientific papers, departmental memos, grant applications and even books. I think many astronomers see this activity as a chore, take no pleasure from it, and invest the minimum care in it. I was fortunate to have a really excellent writer, John Barrow, as my thesis supervisor and he convinced me that it was worth making the effort to write the best prose I could whatever the context. Not only does this attitude eliminate the ambiguity which is the bane of scientific writing; taking pains over style and grammar also allows one to feel the pleasure of craftsmanship for its own sake. With John’s guidance and encouragement, I learned to enjoy writing through the satisfaction experienced by finding neat forms of words or nice turns of phrase. You never really feel good about what you do if you scrape through at the minimum acceptable level. Try to make the effort and you will be more fulfilled, and the long hours of slog you spend putting together a complicated paper will at least be enlivened by a genuine sense of delight when things fall neatly into place, and a warm glow of achievement when you read it back and it sounds not just acceptable but actually good.

But I digress.

One of the other contributors to Andy’s list of examples of bad grammar was a chap called Norman Gray who objected to astronomers’ use of the word “data” as a plural noun, as in “the data indicate” rather than “the data indicates”. I was taken aback by this because I was expecting the opposite objection.

He has a lengthy rant about this on his own blog so I won’t repeat his arguments in detail here, merely a synopsis. The word “data” is formed from the Latin plural of the word “datum” (itself formed from the past participle of the Latin verb “dare”, meaning “to give”) hence meaning “things given” or words to that effect. The usage of “data” that we use now (to refer to measurements or quantitative information) seems not to have been present in Roman or mediaeval times so Norman argues that it is a deliberate archaism to treat it as a Latin plural now. He also argues that “data” in modern usage is a “mass noun” so should on those grounds also be treated as singular.

For those of you who aren’t up with such things, English nouns can be of two forms: “count” and “non-count” (or “mass”). Count nouns are those that can be enumerated and therefore have both plural and singular forms: one eye, two eyes, etc. Non-count nouns (which is a better term than “mass nouns”) are those which describe something which is not enumerable, such as “furniture” or “cutlery”. Such things can’t be counted and they don’t have different singular and plural forms. You can have two chairs (count noun) but can’t have two furnitures (non-count noun).

Count and non-count nouns require different grammatical treatment. You can ask “how much furniture do you have?” but not “how many”. The answer to a “how much” question usually requires a unit or measure word (e.g. “a vanload of furniture”) but the answer to a “how many” question would be just a number. Next time you are in a supermarket queue where it says “ten items or less” you will appreciate that the sign is grammatically incorrect. “Item” is most definitely a count noun, so the correct form should be “ten items or fewer”..

Anyway, Norman Gray asserts that (a) “data” is a non-count noun and that (b) it should therefore be singular. Forms such as “the data are..” are out (“a vile anacoluthon”) and “the data is…” is in.

So is he right?

Not really.  Unkind though it may be to dismantle a carefully constructed obsession, I think his arguments have quite a few problems with them.

For a start, it seems clear to me that there are (at least) two distinct uses of the word data. One is clearly of non-count type. This is the use of “data” to describe an undifferentiated, unspecified or unlimited quantity of information such as that stored on a computer disk. Of such stuff you might well ask “how much data do you have?” and the answer would be in some units (e.g. Gbytes). This clearly identifies it as a mass noun.

But there is another meaning, which is that ascribed to specified pieces of information either given (as per the original Latin) or obtained from a measurement. Such things are precisely defined, enumerable and clearly therefore of count-noun form. Indeed one such entity could reasonably be called a datum and the plural would be data. This usage applies when the context defines the relevant quantum of information so no unit is required. This is the usage that arises in most scientific papers, as opposed to software manuals. “In Figure 1, the data are plotted…” is correct. Although it sounds clumsy you could well ask in such a situation “how many data do you have?” (meaning how many measurements do you have) and the answer would just be a number. Archaism? No. It’s just right.

To labour the point still further,  here are another two sentences that show the different uses:

“If I had less data my disk would have more free space on it.” (Non-count)

“If I had fewer data I would not be able to obtain an astrometric solution.” (Count).
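If it helps, the rule at work in these two sentences can be written down as a toy Python sketch. The function names comparative and how_question are hypothetical ones of my own, and whether a given use of a noun is count or non-count has to be supplied by the writer, since that is exactly the judgement the argument is about.

```python
def comparative(countable):
    """Pick 'fewer' for a count noun and 'less' for a non-count noun."""
    return "fewer" if countable else "less"


def how_question(noun, countable):
    """Build the question that fits a count or non-count use of a noun."""
    return f"How many {noun}?" if countable else f"How much {noun}?"


# "data" as stuff on a disk (non-count) versus "data" as measurements (count).
disk_sense = (how_question("data", countable=False), comparative(False))
measurement_sense = (how_question("data", countable=True), comparative(True))

print(disk_sense)
print(measurement_sense)
```

The same noun goes in both times; only the sense in which it is used changes the answer.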

Contrary to Norman’s claims, it is not unusual for the same words (if they’re nouns) to have both count and non-count forms in different contexts. I give the example of “whisky” as in “my glass is full of whisky” (non-count) versus “two whiskies, please, barman”. His objection to this was that in the second case a whisky is an artefact of a metonymic shift which takes the word “whisky” to refer to the glass containing it.

Metonymy involves using a word related to a thing rather than the word for the thing itself, as in “I have hungry mouths to feed”; it’s not really the mouths that are fed, but the people the mouths belong to. In fact there’s a bit of this going on when people talk about sources being “extincted” rather than their light.

This invalidates the example because, Norman alleges, the resulting meaning is different. This objection is a bit silly because the whole point is that the two forms should have different meanings, otherwise why have them? In any case the example simply involves me asking for two well-defined quantities of whisky. I’m not convinced of the relevance of metonymy here. What I care about is the whisky, not what it comes in, and when I drink the whisky I don’t drink the glass anyway. Metonymy would apply if I talked about drinking a couple of glasses. Consider “I drank two whiskies, one after the other” versus “I drank two glasses, one after the other”. In both cases what has actually been drunk?

There are countless other examples (pun intended). “Fire” can be a mass noun (“fire is dangerous”) but also a count noun (“the firemen were fighting three fires simultaneously”). Another nice one is “hair” which is non-count when it is on someone’s head (“my hair is going grey”) but count when they, in the plural, are being split.

Interestingly, though, the  non-count forms of these nouns are all singular. Indeed, many non-count nouns exist only in the singular: such nouns are called singularia tantum. Examples include “dust” and “wealth”. So,  if we accept that “data” can be a non-count noun, does that mean that it should necessarily be treated as singular when it does take on that role?

An example that might be taken to support this view could be “statistics” (the field thereof) which is a non-count noun. Although it appears to be derived from a plural, you would certainly say “statistics is a hard subject”  rather than “statistics are a hard subject”.  On the other hand “statistics” can refer to a set, each element of which is a statistic (i.e. a number), thus giving another example of a noun that can be of either count or non-count form; you might reasonably say “the statistics are impressive” in the count case.  The non-count form “statistics” is a better  example of metonymy than the example above, as it refers to the study of the (count) statistics rather than to the things themselves.

In fact there are also mass nouns, described as pluralia tantum, which exist only in the plural. A (not entirely accurate) list is given here. Examples include scissors and pants, for which the normal measure  is a “pair”. Although these are technically non-count nouns (in the sense that you can’t have one scissor, etc) they don’t shed much light on the example in front of us. Perhaps more pertinent is the word “clothes” which is of non-count type but which is certainly plural. You can’t have one “clothe” (or any other number for that matter) but you would definitely say “your clothes are dirty”.

A more subtle example with relevance to the Latin root of “data” is “media” which can refer to broadcast media (non-count) or plural of medium (count). “The media are out to get me” seems a correct construction to me, so the non-count form of this noun is a plurale tantum (singular of pluralia tantum).

So,  just because a word may be a non-count noun, it doesn’t necessarily have to be singular.

To summarise, my argument is that it is not correct to assert that “data” is a mass noun. It may or may not be, depending on the context. If it is acting as a count noun (which I contend is the case in most science writing) then it is definitely plural. Furthermore, even in cases where it is clearly a mass noun, and especially if you reject the alternative meaning as a count noun, it is still by no means obvious that it must be treated as singular (because of the existence of the plurale tantum). In fact I would go a bit further and argue that you can only justify the singular non-count form at all if you accept that there is a count alternative. To be honest, though, I think I prefer the singular interpretation in the non-count case, as in “statistics”. It just sounds better.

If anyone has managed to read all the way through this exercise in pedantry I’d be interested to see any comments on my analysis of data.