Returning to Lognormality

I’m off later today for a short trip to Copenhagen, a place I always enjoy visiting. I particularly remember a very nice time I had there back in 1990 when I was invited by Bernard Jones, who used to work at the Niels Bohr Institute. I stayed there several weeks over the May/June period, which is the best time of year for Denmark; it’s sufficiently far north that the summer days are very long, and when it’s light until almost midnight it’s very tempting to spend a lot of time out late at night.

As well as being great fun, that little visit also produced my most-cited paper. I’ve never been very good at grabbing citations – I’m more likely to fall off bandwagons rather than jump onto them – but this little paper seems to keep getting citations. It hasn’t got that many by the standards of some papers, but it’s carried on being referred to for almost twenty years, which I’m quite proud of; you can see the citations per year statistics are fairly flat. The model we proposed turned out to be extremely useful in a range of situations, hence the long half-life.

[Figure: citations per year for the paper]

I don’t think this is my best paper, but it’s definitely the one I had most fun working on. I remember we had the idea of doing something with lognormal distributions over coffee one day,  and just a few weeks later the paper was  finished. In some ways it’s the most simple-minded paper I’ve ever written – and that’s up against some pretty stiff competition – but there you go.


The lognormal seemed an interesting idea to explore because it applies to non-linear processes in much the same way as the normal distribution does to linear ones. What I mean is that if you have a quantity Y which is the sum of n independent effects, Y=X1+X2+…+Xn, then the distribution of Y tends to be normal by virtue of the Central Limit Theorem, regardless of what the distribution of the Xi is (as long as it has finite variance). If, however, the process is multiplicative, so that Y=X1×X2×…×Xn with positive Xi, then log Y = log X1 + log X2 + … + log Xn, and the Central Limit Theorem tends to make log Y normal, which is what the lognormal distribution means.
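For anyone who wants to see the multiplicative argument in action, here is a minimal sketch using only the Python standard library. The choice of uniform factors on (0.5, 1.5), the number of factors and the helper names are all my own assumptions for illustration; nothing here comes from the paper. It multiplies together many independent positive factors and compares the skewness of Y with that of log Y.

```python
import math
import random
import statistics


def skewness(values):
    """Sample skewness (third standardized moment)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def simulate_products(n_factors=50, n_samples=20000, seed=1):
    """Draw Y = X1 * X2 * ... * Xn with Xi uniform on (0.5, 1.5).

    The uniform factors are an arbitrary illustrative choice; any positive
    distribution whose logarithm has finite variance should behave similarly.
    """
    rng = random.Random(seed)
    products = []
    for _ in range(n_samples):
        # Accumulate in log space to avoid underflow in long products.
        log_y = sum(math.log(rng.uniform(0.5, 1.5)) for _ in range(n_factors))
        products.append(math.exp(log_y))
    return products


products = simulate_products()
logs = [math.log(y) for y in products]

skew_y = skewness(products)      # strongly positive: Y has a long right tail
skew_log_y = skewness(logs)      # close to zero: log Y is roughly normal

print(f"skewness of Y:     {skew_y:.2f}")
print(f"skewness of log Y: {skew_log_y:.2f}")
```

The product itself is heavily skewed, while its logarithm is close to symmetric, which is what one would expect if log Y is approximately normal.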

The lognormal is a good distribution for things produced by multiplicative processes, such as hierarchical fragmentation or coagulation processes: the distribution of sizes of the pebbles on Brighton beach  is quite a good example. It also crops up quite often in the theory of turbulence.

I’ll mention one other thing about this distribution, just because it’s fun. The lognormal distribution is an example of a distribution that’s not completely determined by knowledge of its moments. Most people assume that if you know all the moments of a distribution then that has to specify the distribution uniquely, but it ain’t necessarily so.
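A standard way to illustrate this (not something from the paper, just the usual textbook counterexample) is to multiply the standard lognormal density f(x) by 1 + ε sin(2π ln x) with |ε| ≤ 1. The result is a different density with exactly the same integer moments. The sketch below checks this numerically; the function names, the value ε = 0.5 and the simple trapezoid integration are my own assumptions.

```python
import math


def lognormal_pdf(x):
    """Density of a standard lognormal variable (log X ~ N(0, 1))."""
    return math.exp(-0.5 * math.log(x) ** 2) / (x * math.sqrt(2.0 * math.pi))


def perturbed_pdf(x, eps=0.5):
    """Perturbed density f(x) * (1 + eps * sin(2*pi*ln x)), valid for |eps| <= 1."""
    return lognormal_pdf(x) * (1.0 + eps * math.sin(2.0 * math.pi * math.log(x)))


def moment(pdf, n, half_width=12.0, step=0.001):
    """n-th moment of pdf, integrating over y = ln x with the trapezoid rule.

    For the lognormal the integrand in y is a Gaussian centred on y = n,
    so the grid is centred there.
    """
    steps = int(round(2.0 * half_width / step))
    total = 0.0
    for i in range(steps + 1):
        y = n - half_width + i * step
        x = math.exp(y)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x ** n * pdf(x) * x  # dx = x dy
    return total * step


for n in range(5):
    m_plain = moment(lognormal_pdf, n)
    m_perturbed = moment(perturbed_pdf, n)
    print(n, m_plain, m_perturbed, math.exp(n * n / 2.0))
```

The two columns of moments agree (and match the analytic value exp(n²/2)), even though the densities themselves differ by as much as 50 per cent at some points, for instance at x = exp(0.25).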

If you’re wondering why I mentioned citations, it’s because it looks like they’re going to play a big part in the Research Excellence Framework, yet another new bureaucratic exercise to attempt to measure the quality of research done in UK universities. Unfortunately, using citations isn’t straightforward. Different disciplines have hugely different citation rates, for one thing. Should one count self-citations? Also, how do you apportion citations to multi-author papers? Suppose a paper with a thousand citations has 25 authors. Does each of them get the thousand citations, or should each get 1000/25? Or, to put it another way, how does a single-author paper with 100 citations compare to a 50-author paper with 101?

Or perhaps the REF should use the logarithm of the number of citations instead?
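To make the arithmetic concrete, here is a tiny sketch of three ways one might score the papers mentioned above. The scheme names and the function are hypothetical, purely for illustration; as far as I know none of them is what the REF actually proposes.

```python
import math


def score(citations, n_authors, scheme):
    """Per-author credit under three illustrative (hypothetical) schemes."""
    if scheme == "whole":        # every author gets the full count
        return citations
    if scheme == "fractional":   # the count is shared equally among authors
        return citations / n_authors
    if scheme == "log":          # logarithm of the full count
        return math.log10(citations)
    raise ValueError(f"unknown scheme: {scheme}")


# (citations, number of authors) for the examples in the text
papers = [(1000, 25), (100, 1), (101, 50)]

for citations, n_authors in papers:
    scores = {s: score(citations, n_authors, s) for s in ("whole", "fractional", "log")}
    print(citations, n_authors, scores)
```

Under whole counting the 50-author paper with 101 citations narrowly beats the single-author paper with 100; under fractional counting it gets about 2 per author against 100.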


7 Responses to “Returning to Lognormality”

  1. Citations vary within sub-disciplines as well. As you suggest, if numbers are taken at face value there is going to be a big discrepancy between fields that tend to pile on the authors and those that do not.

    That said, recent news from Research Fortnight hints that the bibliometrics will not play as huge a role as originally thought.

    • telescoper Says:

      Kav,

      I did try to keep up with the machinations of REF, but when it became obvious that they were going around in circles I lost interest. At some point I’ll try to figure out what they’re doing but to be honest at the moment it looks like they haven’t any idea themselves.

      Peter

  2. Anton Garrett Says:

    Citationism is a Bad Thing because it is over-optimistic about the correlation between quality and quantity, and it neglects hysteresis (i.e., that it takes time for a result to be properly appreciated). Expert opinion is better, but for that you need experts who are broad as well as deep, genuinely impartial, and widely trusted. Not easy to find. To tackle this problem we perhaps need to consider supply side as well as demand side of science funding.

    American rowing moved, a while ago, from picking squads by coaches’ intuition to picking the best oarsmen on the ergometer. They did this because people who were not picked were resorting to the law. Their results immediately went downhill.

    Anton

    PS Any function that is non-analytic will not be reconstructible from its power-law moments alone.

  3. What are the odds that the REF will end up looking very similar to the RAE?

  4. Apart from the usual criticism of bibliometry, let me mention three more. First, even within a subdiscipline, policies vary from institute to institute and even within an institute. For example, some professors want to be on every paper written by someone paid for by money they were awarded, the justification being that the effectiveness of the grant will be judged (partly) on the number of papers it produces. Second, people have different ideas as to when a contribution becomes large enough to be moved from acknowledgement to co-authorship. Third, it is a subjective decision when to cite something. Consider Bernhard Schmidt. He had only one refereed-journal paper. His invention of the Schmidt camera was, along with photography and the CCD, one of the most important contributions to optical astronomy. (Photography was invented for reasons which had nothing to do with astronomy. CCDs as well, though their development was heavily influenced by their use in astronomy. The Schmidt camera was Schmidt applying his genius to an astronomical problem.) However, few papers which make use of Schmidt-camera observations cite his paper. This is also a good example of the importance of indirect citations. I don’t know if the original POSS paper cited Schmidt’s paper (Ein lichtstarkes komafreies Spiegelsystem. in Centralzeitung für Optik und Mechanik 52 (1931) S. 25-26), but even if it did, most users of the POSS—and the POSS has been involved in a huge number of astronomy papers—don’t. Suppose that Schmidt only had one citation, in the POSS paper. That would greatly underestimate his importance.

    Schmidt was a very colourful character. He never had a paid position in astronomy, preferring instead to make his living grinding lenses and mirrors commercially—by hand. And he had only one hand (having lost the other in an experiment with gunpowder). Legend has it that he didn’t need optical tests to determine where more polishing was needed (say, changing a spherical to a parabolic surface), but could feel the difference with his fingers. The director of the Hamburg Observatory had the sense to send him along with Walter Baade on an eclipse expedition (a hugely expensive endeavour, including some booze disguised as photographic chemicals to fool the customs inspectors) which led to his development of the Schmidt camera. What director of a modern-day observatory would allow the equivalent of a one-armed lens grinder to work at his institute, even as a volunteer?

  5. […] linear, additive behaviour in order to hold. I posted about an example where this is not the case here. Theorists love to make the Gaussian assumption when dealing with phenomena that they want to model […]

  6. […] bragged blogged already about my most popular paper citation-wise, which has 287 citations on Google Scholar, which […]
