A “Mr Smith” from Portugal drew my attention to this post. I’ve posted from time to time about my scepticism about bibliometricism and this piece suggests some radical alternatives to the way citations are handled. I’m not sure I agree with it, but it’s well worth reading.
Here’s a contribution to the discussion of citation rates in Astronomy (see this blog passim) by the estimable Paul Crowther who in addition to being an astronomer also maintains an important page about issues relating to STFC funding.
At last week’s Sheffield astrophysics group journal club I gave a talk on astronomical bibliometrics, motivated in part by Stuart Lowe’s H-R diagram of astronomers blog entry from last year, and the subsequent Seattle AAS 217 poster with Alberto Conti. These combined various versions of Google search results with numbers of ADS publications. The original one was by far the most fun.
The poster also included Hirsch’s h-index for American Astronomical Society members, which is defined as the number of papers cited at least h times. Conti and Lowe presented the top ten of AAS members, with Donald Schneider in pole position, courtesy of SDSS. Kevin Pimbblet has recently compiled the h-index for (domestic) members of the Astronomical Society of Australia, topped by Ken Freeman and Jeremy Mould.
Even though many rightly treat bibliometrics with disdain, these studies naturally got me curious about comparable UK statistics. The last attempt to look into this was by Alex Blustin for Astronomy and Geophysics in 2007, but he (perhaps wisely) kept his results anonymous. For the talk I put together my attempt at an equivalent UK top ten, including those working overseas. Mindful of the fact that scientists could achieve a high h-index through heavily cited papers with many coauthors, I also looked into using normalised citations from ADS for an alternative, so-called hI,norm-index. I gather there is a myriad of such indices, but I stuck with just these two.
Still, I worried that my UK top ten would only be objective if I were to put together a ranked list of the h-index for every UK-based astronomy academic. In fact, given the various pros and cons of the raw and hI,norm-indices, I thought it best to use an average of these scores when ranking individual astronomers.
For my sample I looked through the astrophysics group web pages for each UK institution represented at the Astronomy Forum, including academics and senior fellows, but excluding emeritus staff where apparent. I also tried to add cosmology, solar physics, planetary science and gravitational wave groups, producing a little over 500 in total. Refereed ADS citations were used to calculate the h-index and hI,norm-index for each academic, taking care to avoid citations to academics with the same surname and initial wherever possible. The results are presented in the chart.
Andy Fabian, George Efstathiou and Carlos Frenk occupy the top three spots for UK astronomy. Beyond these, and although I’m no great football fan, I’d like to use a footballing analogy to rate the other academics, with the top ten worthy of a hypothetical Champions League. Others within this illustrious group include John Peacock, Rob Kennicutt and Stephen Hawking.
If these few are the creme de la creme, I figured that others within the top 40 could be likened to Premier League teams, including our current RAS president Roger Davies, plus senior members of STFC committees and panels, including Andy Lawrence, Ian Smail and Andrew Liddle.
For the 60 or so others within the top 20 percent, I decided to continue the footballing analogy with reference to the Championship. At present these include Nial Tanvir, Matthew Bate, Steve Rawlings and Tom Marsh, although some will no doubt challenge for promotion to the Premier League in due course. The remainder of the top 40 per cent or so, forming the next two tiers, each again numbering about 60 academics, would then represent Leagues 1 and 2 – Divisions 3 and 4 from my youth – with Stephen Serjeant and Peter Coles, respectively, amongst their membership.
The majority of astronomers, starting close to the half-way point, represent my fantasy non-league teams, with many big names in the final third, in part due to a lower citation rate within certain sub-fields, notably solar and planetary studies. This week’s Times Higher Ed noted that molecular biology citation rates are 7 times higher than for mathematics, so comparisons across disciplines or sub-disciplines should be taken with a large pinch of salt.
It’s only the final 10 percent that could be thought of as Sunday League players. Still, many of these have a low h-index since they’re relatively young and so will rapidly progress through the leagues in due course, with some of the current star names dropping away once they retire. Others include those who have dedicated much of their careers to building high-impact instruments and so fall outside the mainstream criteria for jobbing astronomers.
This exercise isn’t intended to be taken too seriously by anyone, but finally, to give a little international context, I’ve carried out the same exercise for a few astronomers based outside the UK. Champions League players include Richard Ellis, Simon White, Jerry Ostriker, Michel Mayor and Reinhard Genzel, with Mike Dopita, Piero Madau, Simon Lilly, Mario Livio and Rolf Kudritzki in the Premier League, so my ball-park league divisions seem to work out reasonably well beyond these shores.
Oh, I did include myself but am too modest to say which league I currently reside in…
Apparently last year the United Kingdom Infra-Red Telescope (UKIRT) beat its own personal best for scientific productivity. In fact here’s a graphic showing the number of publications resulting from UKIRT to make the point:
The plot also demonstrates that a large part of the recent burst of productivity has been associated with UKIDSS (the UKIRT Infrared Deep Sky Survey), which a number of my colleagues are involved in. Excellent chaps. Great project. Lots of hard work done very well. Take a bow, the UKIDSS team!
Now I hope I’ve made it clear that I don’t in any way want to pour cold water on the achievements of UKIRT, and particularly not UKIDSS, but this does provide an example of how difficult it is to use bibliometric information in a meaningful way.
A paper is listed as a UKIDSS paper if it is already published in a journal (with one exception) and satisfies one of the following criteria:
1. It is one of the core papers describing the survey (e.g. calibration, archive, data releases). The DR2 paper is included, and is the only paper listed not published in a journal.
2. It includes science results that are derived in whole or in part from UKIDSS data directly accessed from the archive (analysis of data published in another paper does not count).
3. It contains science results from primary follow-up observations in a programme that is identifiable as a UKIDSS programme (e.g. The physical properties of four ~600K T dwarfs, presenting Spitzer spectra of cool brown dwarfs discovered with UKIDSS).
4. It includes a feasibility study of science that could be achieved using UKIDSS data (e.g. The possibility of detection of ultracool dwarfs with the UKIRT Infrared Deep Sky Survey by Deacon and Hambly).
Papers are identified by a full-text search for the string ‘UKIDSS’, and then compared against the above criteria.
That all seems to me to be quite reasonable, and it’s certainly one way of defining what a UKIDSS paper is. According to that measure, UKIDSS scores 226.
The Warren measure does, however, include a number of papers that don’t directly use UKIDSS data, and many written by people who aren’t members of the UKIDSS consortium. Being picky you might say that such papers aren’t really original UKIDSS papers, but are more like second-generation spin-offs. So how could you count UKIDSS papers differently?
I just tried one alternative way, which is to use ADS to identify all refereed papers with “UKIDSS” in the title, assuming – possibly incorrectly – that all papers written by the UKIDSS consortium would have UKIDSS in the title. The number returned by this search was 38.
Now I’m not saying that this is more reasonable than the Warren measure. It’s just different, that’s all. According to my criterion however UKIDSS measures 38 rather than 226. It sounds less impressive (if only because 38 is a smaller number than 226), but what does it mean about UKIDSS productivity in absolute terms?
Not very much, I think is the answer.
Yet another way you might try to judge UKIDSS using bibliometric means is to look at its citation impact. After all, any fool can churn out dozens of papers that no-one ever reads. I know that for a fact. I am that fool.
But citation data also provide another way of doing what Steve Warren was trying to measure. Presumably the authors of any paper that uses UKIDSS data in any significant way would cite the main UKIDSS survey paper led by Andy Lawrence (Lawrence et al. 2007). According to ADS, the number of times this has been cited since publication is 359. That’s higher than the Warren measure (226), and much higher than the UKIDSS-in-the-title measure (38).
So there we are, three different measures, all in my opinion perfectly reasonable measures of, er, something or other, but each giving a very different numerical value. I am not saying any is misleading or that any is necessarily better than the others. My point is simply that it’s not easy to assign a numerical value to something that’s intrinsically difficult to define.
Unfortunately, it’s a point few people in government seem to be prepared to acknowledge.
Andy Lawrence is 57.
Following on from yesterday’s post about the forthcoming Research Excellence Framework that plans to use citations as a measure of research quality, I thought I would have a little rant on the subject of bibliometrics.
Recently one particular measure of scientific productivity has established itself as the norm for assessing job applications, grant proposals and for other related tasks. This is called the h-index, named after the physicist Jorge Hirsch, who introduced it in a paper in 2005. This is quite a simple index to define and to calculate (given an appropriately accurate bibliographic database). The definition is that an individual has an h-index of h if that individual has published h papers with at least h citations. If the author has published N papers in total then the other N-h must have no more than h citations. This is a bit like the Eddington number. A citation, as if you didn’t know, is basically an occurrence of that paper in the reference list of another paper.
To calculate it is easy. You just go to the appropriate database – such as the NASA ADS system – search for all papers with a given author and request the results to be returned sorted by decreasing citation count. You scan down the list until the number of citations falls below the position in the ordered list.
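That scan-down-the-list procedure takes only a few lines to script. Here’s a minimal Python sketch, using a made-up citation list rather than a real ADS export:

```python
def h_index(citations):
    """h is the largest number such that at least h papers have >= h citations."""
    ranked = sorted(citations, reverse=True)  # ADS can return this order for you
    h = 0
    for position, cites in enumerate(ranked, start=1):
        if cites >= position:
            h = position
        else:
            break  # citations have fallen below the position in the list
    return h

# Five (invented) papers with these citation counts:
print(h_index([50, 18, 6, 5, 1]))  # prints 4: four papers have at least 4 citations
```

The moment the citation count drops below the paper’s rank in the sorted list, you stop: that rank minus one is h.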
Incidentally, one of the issues here is whether to count only refereed journal publications or all articles (including books and conference proceedings). The argument in favour of the former is that the latter are often of lower quality. I think that is an illogical argument because good papers will get cited wherever they are published. Related to this is the fact that some people would like to count “high-impact” journals only, but if you’ve chosen citations as your measure of quality the choice of journal is irrelevant. Indeed a paper that is highly cited despite being in a lesser journal should if anything be given a higher weight than one with the same number of citations published in, e.g., Nature. Of course it’s just a matter of time before the hideously overpriced academic journals run by the publishing mafia go out of business anyway so before long this question will simply vanish.
The h-index has some advantages over more obvious measures, such as the average number of citations, as it is not skewed by one or two publications with enormous numbers of hits. It also, at least to some extent, represents both quantity and quality in a single number. For whatever reasons in recent times h has undoubtedly become common currency (at least in physics and astronomy) as being a quick and easy measure of a person’s scientific oomph.
Incidentally, it has been claimed that this index can be fitted well by a formula h ~ sqrt(T)/2 where T is the total number of citations. This works in my case. If it works for everyone, doesn’t it mean that h is actually of no more use than T in assessing research productivity?
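Taking that rule of thumb at face value, it is easy to invert: an h-index of 27 would correspond to roughly T = (2 × 27)² = 2916 total citations. A trivial sketch:

```python
import math

def h_from_total(total_citations):
    # The claimed empirical rule of thumb: h ~ sqrt(T) / 2
    return math.sqrt(total_citations) / 2

# Inverting the rule: T = (2 * 27)**2 = 2916 gives back h = 27
print(round(h_from_total(2916)))  # prints 27
```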
Typical values of h vary enormously from field to field – even within each discipline – and vary a lot between observational and theoretical researchers. In extragalactic astronomy, for example, you might expect a good established observer to have an h-index around 40 or more whereas some other branches of astronomy have much lower citation rates. The top dogs in the field of cosmology are all theorists, though. People like Carlos Frenk, George Efstathiou, and Martin Rees all have very high h-indices. At the extreme end of the scale, string theorist Ed Witten is in the citation stratosphere with an h-index well over a hundred.
I was tempted to put up examples of individuals’ h-numbers but decided instead just to illustrate things with my own. That way the only person to get embarrassed is me. My own index value is modest – to say the least – at a meagre 27 (according to ADS). Does that mean Ed Witten is four times the scientist I am? Of course not. He’s much better than that. So how exactly should one use h as an actual metric, for allocating funds or prioritising job applications, and what are the likely pitfalls? I don’t know the answer to the first one, but I have some suggestions for other metrics that avoid some of its shortcomings.
One of these addresses an obvious deficiency of h. Suppose we have an individual who writes one brilliant paper that gets 100 citations and another who is one author amongst 100 on another paper that has the same impact. In terms of total citations, both papers register the same value, but there’s no question in my mind that the first case deserves more credit. One remedy is to normalise the citations of each paper by the number of authors, essentially sharing citations equally between all those that contributed to the paper. This is quite easy to do on ADS also, and in my case it gives a value of 19. Trying the same thing on various other astronomers, astrophysicists and cosmologists reveals that the h index of an observer is likely to reduce by a factor of 3-4 when calculated in this way – whereas theorists (who generally work in smaller groups) suffer less. I imagine Ed Witten’s index doesn’t change much when calculated on a normalized basis, although I haven’t calculated it myself.
Observers complain that this normalized measure is unfair to them, but I’ve yet to hear a reasoned argument as to why this is so. I don’t see why 100 people should get the same credit for a single piece of work: it seems like obvious overcounting to me.
Another possibility – if you want to measure leadership too – is to calculate the h index using only those papers on which the individual concerned is the first author. This is a bit more of a fiddle to do but mine comes out as 20 when done in this way. This is considerably higher than most of my professorial colleagues even though my raw h value is smaller. Using first author papers only is also probably a good way of identifying lurkers: people who add themselves to any paper they can get their hands on but never take the lead. Mentioning no names of course. I propose using the ratio of unnormalized to normalized h-indices as an appropriate lurker detector…
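Both the first-author index and the lurker detector are easy to mock up, again assuming an exported list of papers with first-author names attached (the names and numbers below are entirely hypothetical):

```python
def h_index(citations):
    ranked = sorted(citations, reverse=True)
    return max((i for i, c in enumerate(ranked, start=1) if c >= i), default=0)

def first_author_h(papers, name):
    # papers: (first_author, citations) pairs, e.g. from an ADS query
    return h_index([c for author, c in papers if author == name])

def lurker_ratio(h_raw, h_normalised):
    # Proposed detector: a large ratio suggests the raw index is propped up
    # by appearances on other people's long author lists
    return h_raw / h_normalised

papers = [("Coles", 10), ("Smith", 50), ("Coles", 3)]
print(first_author_h(papers, "Coles"))  # prints 2
```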
Finally in this list of bibliometrica is the so-called g-index. This is defined in a slightly more complicated way than h: given a set of articles ranked in decreasing order of citation numbers, g is defined to be the largest number such that the top g articles altogether received at least g² citations. This is a bit like h but takes extra account of the average citations of the top papers. My own g-index is about 47. Obviously I like this one because my number looks bigger, but I’m pretty confident others go up even more than mine!
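A sketch of the g-index under the same made-up-numbers caveat:

```python
def g_index(citations):
    # g is the largest number such that the top g papers together
    # have received at least g**2 citations
    ranked = sorted(citations, reverse=True)
    running_total, g = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        running_total += cites
        if running_total >= rank * rank:
            g = rank
    return g

# One heavily cited paper drags g above h: for these papers h would be 2,
# but the cumulative totals push g up to 4
print(g_index([10, 5, 1, 1, 1, 1]))  # prints 4
```

Because g is driven by cumulative citations, it always comes out at least as large as h for the same list, which is exactly why heavily-cited top papers inflate it.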
Of course you can play with these things to your heart’s content, combining ideas from each definition: the normalized g-factor, for example. The message is, though, that although h definitely contains some information, any attempt to condense such complicated information into a single number is never going to be entirely successful.
Comments, particularly with suggestions of alternative metrics, are welcome via the box. Even from lurkers.