(Guest Post) The Astronomical Premiership

Here’s a contribution to the discussion of citation rates in Astronomy (see this blog passim) by the estimable Paul Crowther who in addition to being an astronomer also maintains an important page about issues relating to STFC funding.

–0–

At last week’s Sheffield astrophysics group journal club I gave a talk on astronomical bibliometrics, motivated in part by Stuart Lowe’s H-R diagram of astronomers blog entry from last year, and the subsequent Seattle AAS 217 poster with Alberto Conti. These combined various versions of google search results with numbers of ADS publications. The original one was by far the most fun.

The poster also included Hirsch’s h-index for American Astronomical Society members, which is defined as the largest number h such that h of an author’s papers have each been cited at least h times. Conti and Lowe presented the top ten of AAS members, with Donald Schneider in pole position, courtesy of SDSS. Kevin Pimblett has recently compiled the h-index for (domestic) members of the Astronomical Society of Australia, topped by Ken Freeman and Jeremy Mould.
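For readers who want the definition in concrete terms, here is a minimal Python sketch. It is not the code used for any of the studies mentioned; the function name and the citation counts are made up for illustration.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited paper first
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank   # the top `rank` papers all have >= rank citations
        else:
            break      # counts only fall from here on, so we can stop
    return h


# Invented citation counts for six papers
example = [25, 8, 5, 3, 3, 1]
print(h_index(example))  # prints 3: three papers have at least 3 citations each
```

Note that one very heavily cited paper (the 25 here) does nothing for the index beyond counting once.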

Even though many rightly treat bibliometrics with disdain, these studies naturally got me curious about comparable UK statistics. The last attempt to look into this was by Alex Blustin for Astronomy and Geophysics in 2007, but he (perhaps wisely) kept his results anonymous. For the talk I put together my attempt at an equivalent UK top ten, including those working overseas. Mindful of the fact that scientists could achieve a high h-index through heavily cited papers with many coauthors, I also looked into using normalised citations from ADS as an alternative, the so-called hl,norm-index. I gather there are a myriad of such indices but I stuck with just these two.

Still, I worried that my UK top ten would only be objective if I were to put together a ranked list of the h-index for every UK-based astronomy academic. In fact, given the various pros and cons of the raw and hl,norm-indexes, I thought it best to use an average of these scores when ranking individual astronomers.
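The averaging described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes "normalised citations" to mean each paper's citation count divided by its number of authors, and the two astronomers, their papers and the function names are all hypothetical.

```python
def h_index(values):
    """Largest h such that h entries are each at least h."""
    ranked = sorted(values, reverse=True)
    h = 0
    for rank, value in enumerate(ranked, start=1):
        if value >= rank:
            h = rank
        else:
            break
    return h


def combined_score(papers):
    """papers: list of (citations, n_authors) tuples for one person.

    Returns (raw h-index, normalised h-index, average of the two).
    Assumption: normalised citations = citations / number of authors.
    """
    raw = h_index([cites for cites, _ in papers])
    norm = h_index([cites / n_authors for cites, n_authors in papers])
    return raw, norm, (raw + norm) / 2


# Hypothetical publication records: (citations, number of authors)
records = {
    "Astronomer A": [(120, 40), (60, 3), (30, 2), (12, 1), (4, 4)],
    "Astronomer B": [(9, 1), (7, 2), (2, 1)],
}

# Rank by the averaged score, highest first
ranking = sorted(records, key=lambda name: combined_score(records[name])[2],
                 reverse=True)
for name in ranking:
    print(name, combined_score(records[name]))
```

In this toy case Astronomer A's 40-author paper counts towards the raw index but contributes only 3 normalised citations, which is the effect the second index is meant to capture.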

For my sample I looked through the astrophysics group web pages for each UK institution represented at the Astronomy Forum, including academics and senior fellows, but excluding emeritus staff where apparent. I also tried to add cosmology, solar physics, planetary science and gravitational wave groups, producing a little over 500 in total. Refereed ADS citations were used to calculate the h-index and hl,norm-index for each academic, taking care to avoid citations to academics with the same surname and initial wherever possible. The results are presented in the chart.

Andy Fabian, George Efstathiou and Carlos Frenk occupy the top three spots for UK astronomy. Beyond these, and although no great football fan, I’d like to use a footballing analogy to rate other academics, with the top ten worthy of a hypothetical Champions League. Others within this illustrious group include John Peacock, Rob Kennicutt and Stephen Hawking.

If these few are the creme de la creme, I figured that others within the top 40 could be likened to Premier League teams, including our current RAS president Roger Davies, plus senior members of STFC committees and panels, including Andy Lawrence, Ian Smail and Andrew Liddle.

For the 60 or so others within the top 20 percent, I decided to continue the footballing analogy with reference to the Championship. At present these include Nial Tanvir, Matthew Bate, Steve Rawlings and Tom Marsh, although some will no doubt challenge for promotion to the Premier League in due course. The remainder of the top 40 per cent or so, forming the next two tiers, each again numbering about 60 academics, would then represent Leagues 1 and 2 – Divisions 3 and 4 from my youth – with Stephen Serjeant and Peter Coles, respectively, amongst their membership.

The majority of astronomers, starting close to the half-way point, represent my fantasy non-league teams, with many big names in the final third, in part due to a lower citation rate within certain sub-fields, notably solar and planetary studies. This week’s Times Higher Ed noted that molecular biology citation rates are 7 times higher than for mathematics, so comparisons across disciplines or sub-disciplines should be taken with a large pinch of salt.

It’s only the final 10 percent that could be thought of as Sunday League players. Still, many of these have a low h-index since they’re relatively young and so will rapidly progress through the leagues in due course, with some of the current star names dropping away once they retire. Others include those who have dedicated much of their careers to building high-impact instruments and so fall outside the mainstream criteria for jobbing astronomers.

This exercise isn’t intended to be taken too seriously by anyone, but finally, to give a little international context, I’ve carried out the same exercise for a few astronomers based outside the UK. Champions League players include Richard Ellis, Simon White, Jerry Ostriker, Michel Mayor and Reinhard Genzel, with Mike Dopita, Piero Madau, Simon Lilly, Mario Livio and Rolf Kudritzki in the Premier League, so my ball-park league divisions seem to work out reasonably well beyond these shores.

Oh, I did include myself but am too modest to say which league I currently reside in…


40 Responses to “(Guest Post) The Astronomical Premiership”

  1. telescoper Says:

    I have a hypothesis, which is that the mean citation rate in a field is proportional to the mean number of authors per paper…

    ..is there any evidence for or against?

  2. peter – isn’t that dealt with by looking at the ranking in terms of normalised citations (depending upon precise normalisation algorithm)? although my guess is that the normalisation breaks down for very large numbers of co-authors (when there is negligible effort for each new paper for most co-authors). hence membership of SDSS or 2dFGRS (or similarly large, more-recent collaborations) cost less effort per publication for many of the members, relative to smaller collaborative projects.

    i agree with paul’s concern about relative citation rates between sub-areas. these are so severe that we shouldn’t use citation rates as a proxy for research “quality” outside very narrow subject boundaries (from my personal experience different parts of extragalactic astronomy have ~3-5x different citation rates for papers which i thought were equally insightful). in part i think this derives from an N^2 dependence in citations – in that fields which are “sexy” for some reason will attract more researchers and as a result ensure more activity (as they seek to differentiate themselves) and thus more citations…

    (and i should out paul as a championship player… just in case anyone was wondering).

    • telescoper Says:

      I was thinking about the reason why, say, Mathematics has much lower citation rates overall than, say, biosciences rather than looking at individual scores.

  3. using a closed-box approximation there are only two variables: numbers of papers being written in a field and the number of other papers being referenced by each publication.

    …i just did an experiment – i asked ADS for the 500 most recent papers which had the keywords “stars: abundances” and “galaxies: evolution” in them (assuming that these might represent two fields where citation rates might differ) and then asked how many distinct papers those 500 papers referenced – it came out close (11%): 13,200 and 14,800 respectively. however, the total number of references were 28,700 and 36,100 – so the galaxy evolution papers cited the same references more frequently (25%).

    the average numbers of co-authors on these 500 papers were 5.99 and 7.94 (33%), so perhaps this is driving the referencing, but it doesn’t seem likely, isn’t it just more likely that (for some reason) certain fields cite supporting papers more frequently?
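As a check on the arithmetic in the comment above, here is a short Python sketch that recomputes the differences from the quoted ADS counts. Only the numbers come from the comment; the helper name and the "references per distinct paper" ratio are additions for illustration.

```python
def percent_more(a, b):
    """How much larger b is than a, as a percentage of a."""
    return 100.0 * (b - a) / a


# Figures quoted in the comment: "stars: abundances" vs "galaxies: evolution"
distinct_refs = (13_200, 14_800)   # distinct papers referenced by the 500 papers
total_refs = (28_700, 36_100)      # total references made by the 500 papers
mean_authors = (5.99, 7.94)        # average co-authors per paper

print(round(percent_more(*distinct_refs), 1))   # about 12 percent
print(round(percent_more(*total_refs), 1))      # about 26 percent
print(round(percent_more(*mean_authors), 1))    # about 33 percent

# Average number of times each distinct referenced paper is cited in the sample
reuse = [total / distinct for total, distinct in zip(total_refs, distinct_refs)]
print([round(r, 2) for r in reuse])
```

On these numbers each distinct referenced paper is cited roughly 2.2 times in the abundances sample and 2.4 times in the galaxy evolution sample.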

    • telescoper Says:

      Again I think you’re taking a definition of “field” which is much narrower than I intended. I was thinking about different fields entirely, such as biosciences and mathematics, rather than different branches within astronomy.

    • …and i’m saying that even *within* a field there is a large variation – so we shouldn’t be surprised about large variations between fields.

      my guess is that maths doesn’t cite a lot of supporting papers because the field itself has a well-structured framework (maths/logic), whereas more empirical fields require more demonstration of the truth of a statement by reference to supporting papers.

      who do you cite for 2+2=4, as compared to “massive elliptical galaxies formed through mergers”?

  4. Ken Rice Says:

    I have always found it somewhat interesting that, with 20 teams in the premier league and with something like 30 players per team, there has to be approximately the same number of premier league footballers as professional astronomers. It seems that we could, therefore, directly relate professional astronomers to premier league footballers.

    With nothing better to do on a Saturday night than mess around with Excel, I decided to see if it was possible to rank the premier league footballers in a manner similar to that shown in the post. A number of us at the ROE play a fantasy premier league game, so it seemed easiest to use the fantasy score of each player. Using these scores, there are 477 players who have a score greater than 0, so quite similar to the ~ 500 professional astronomers. Assuming Peter allows me to include a link in this post, the resulting figure is linked below.

    The fantasy score is divided by 2 simply to give a magnitude similar to the h index values. Using this analogy, Andy Fabian, George Efstathiou and Carlos Frenk are represented by Nani, Tevez and Berbatov. The next 10 include Drogba, Malouda, and Cech, while Rooney falls within the next 40 together with Bent, Kuyt, and Terry. Unfortunately, I appear to be represented by Aston Villa’s Luke Young, who hasn’t had a particularly good season.

    I hope nobody thinks that I’m in any way implying that ranking professional astronomers using something like the h-index is equivalent to ranking professional footballers using fantasy scores … 🙂

    • telescoper Says:

      I should point out that the Scottish Premier League is quite a different ball game to the actual Premiership…

    • Rob Ivison Says:

      Ken – your post had me in stitches – straight out of left field (so unlike Luke Young). You’re more of an Ian Holloway – eccentric genius.

  5. Alan Heavens Says:

    The 100 astronomers with the highest indices seem to be divided into two distinct categories: relatively high h-index with low hl norm-index, or vice-versa. Does this reflect a bi-modal distribution of author numbers, representing small collaborations and large consortia? Interesting that the Premier League would look quite different if determined by hl norm-index alone, or h-index alone (the latter is less different, presumably because it has higher weight in the averaging than simple ranking would). Finally, I wonder who are the outlier astronomers, at ~87 for example?

    • telescoper Says:

      Alan

      In a desperate attempt to salvage my tattered reputation, I asked Paul to produce a picture which shows the “lurker index”, l= (h-h_norm)/(h+h_norm) as a function of a position in the list.

      You can find the picture here; sorry it doesn’t embed well in the comments section.

      Values of this quantity close to zero indicate that most of the “credit” for the publications being counted actually belongs to the individual concerned, whereas values close to unity mean that the individual researcher is largely benefitting from belonging to large consortia (with many authors). The lurker index is constructed to remove any overall effect of citation numbers.
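For anyone wanting to work out their own value, the formula above translates directly into Python. This is a minimal sketch: the function name is an assumption, and the example uses the median h-index (28) and hl-norm-index (12) for UK astronomers that Paul quotes elsewhere in this thread.

```python
def lurker_index(h, h_norm):
    """l = (h - h_norm) / (h + h_norm), as defined in the comment above."""
    total = h + h_norm
    if total == 0:
        # Undefined when both indices are zero (no cited papers at all)
        raise ValueError("lurker index is undefined when h + h_norm is zero")
    return (h - h_norm) / total


# Median UK values quoted in the thread: h = 28, hl-norm = 12
print(lurker_index(28, 12))  # prints 0.4
```

Because the difference is divided by the sum, doubling both indices leaves l unchanged, which is what removes the overall effect of citation numbers.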

      I believe the least lurking Premiership player (with an index of 0.10) is Stephen Hawking, and that most of the researchers lying low in the graph would also be theorists of one sort or another. Those with high lurker indices are more likely to be observers. Note that even some Premiership players do seem to spend a lot of their time lurking in the goalmouth, with l >0.6, relying on others to do much of the hard work, but it’s only well down the list that you find individuals who really stick out with l ~ 0.9.

      There is no discernible trend with position in the ranked order, which is also Quite Interesting.

    • Alan,

      Of the 10 overall Champions League players, 8 would have qualified from their h-index and normalized h-index grades, albeit with some reversal in sequence.
      If individuals wanted to compare their individual grades against the norm, the median for UK astronomers is: 91.5 refereed papers, 2600 citations, 28 h-index and 12 hl-norm-index, which conspire to produce an overall average score of 19.

      Astro’s with especially high `lurker’ index (such as the outlier at 87) are members of very large consortia (gravitational waves in this instance).

      Paul

    • telescoper Says:

      It’s probably worth mentioning also that some of the top folks have wider impact too. John Peacock’s excellent Cosmological Physics book is cited all over the place, for example, but doesn’t contribute if you include only refereed journal papers. However, if there’s only one or such items per individual it won’t make a huge difference to the

      I have ranted before about why I think journals are basically a waste of time, and it would be interesting to see how these things change if you include everything on ADS, not just refereed papers. You could argue that it’s all relevant.

      A lot of the big hitters will also have many (presumably unrefereed) conference papers, so it might make the top end even more impressive.

      I feel the people who really miss out, however, are instrument-builders. They get their names on papers of course, but not in a way that satisfactorily measures their contribution to the subject.

  6. Matt Burleigh Says:

    As someone who mainly works in a relatively small field, I have a natural skepticism with the use of citation indices (as pointed out by Ian Smail). Reading this post and the comments, two things spring to mind.

    Many times over the years, and I can think of two instances very recently, papers have been published by others that have failed to cite my own extremely relevant work. Unfortunately, I find the American community to be particularly guilty in this regard. Have other readers of this blog suffered similarly, and does anyone have good suggestions for countering and minimising this ignorance? As you might appreciate, it is potentially important in smaller fields like mine.

    The increasing use of citation indices also seems to me to be encouraging ill behaviour, eg minimising the number of authors on a paper by dropping those who may have had relatively minor roles in its production, ignoring those who are in less of a position to defend themselves – eg phd students, or bullying behaviour in order to gain first authorship.

    • i’m surprised to hear worries about over-pruning of co-author lists… if anything i thought the tendency was in the other direction – the co-author lists seem to be growing – partly on the grounds that once you’re past 3 it’s all “et al” anyway (the impact on normalised citation rates is a bit more sophisticated concern), while the acknowledgements are shrinking.

      there are certainly several people in various large consortia who are flagged as “remora”-like co-authors – they send a few minor comments and on the basis of those expect to be a co-author on the paper. indeed, many years ago i suggested that most projects need no more than 3 people to actually do the work – and so most co-authors beyond number 3 (in a non-alphabetical list) are likely to be on there for other reasons. in the subsequent ~15 years i’ve found little evidence to refute this rule.

    • telescoper Says:

      I think if we introduced a rule that said that you have to have at least read the paper to be counted as an author then author lists would shrink dramatically!

    • Matt Burleigh Says:

      I’ve seen “pruning” in action in both UK-led and US-led collaborations. I think it’s a terrible way to behave, but encouraged by concerns over metrics. Or at least, that’s how those who have indulged in it have justified it to me.

  7. telescoper Says:

    Matt

    I agree that citation counts are of limited utility. The fundamental reason is that it’s papers, not people, that get citations. The Hirsch index was invented for use in theoretical physics where the problem of huge author lists is less pressing, but it really is a problem in other fields.

    I used to get annoyed when other papers failed to cite my work, but I’m much more sanguine about it now that I’m middle-aged. Of course there are many people who email every time the arXiv is updated to demand citations to their papers. My feeling is that good work will out eventually, and that life’s too short to worry about such petty injustices.

    The important thing is to ensure that we don’t end up with a system in which bibliometric measures are the only thing being considered by grant panels and selection boards. The world is increasingly full of dubious attempts to quantify the unquantifiable, so I hope we stick with good old-fashioned peer review in science.

    But of course then you get complaints like “why didn’t I get the job? My h-index is way bigger than hers!”

    Peter

    • Matt Burleigh Says:

      I get annoyed by the serial arXiv emailers too, and so am not keen on indulging in it myself. But there are authors who may genuinely have missed my work, and then there are the two recent cases I referred to where I suspect (but of course couldn’t prove) deliberate ignorance. And when you are in a small field where 20-30 citations in 5 years indicates a pretty influential paper, then these omissions count.

      I wish my more senior colleagues would pay less attention to metrics, but I’ve seen them be far too influential in appointments and ranking exercises.

    • Matt,

      For reasons set out above and elsewhere, peer review, although far from perfect, is much preferred to generic metrics by all concerned. This is no doubt why HEFCE have largely backed away from widespread use of metrics for REF, such that it’s much closer to RAE than originally intended (the biggest change is esteem -> impact).

      Still, I was struck by the break in the overall index between my 40 (English) Premier League players and the remaining 470. I’d argue that – at least for these individuals – this reflects a major influence on their respective, fairly diverse, fields (e.g. Andrew King, Max Pettini, Rob Ivison, Jim Pringle, Keith Horne, Martin Ward).

      Paul
      p.s. I’m not surprised you’re hacked off over those pesky Americans basing their “coolest Brown Dwarf candidate” ApJ letter (+ Press Release!) on your IRAC data without any reference to Burleigh et al.

    • telescoper Says:

      Paul,

      I definitely agree. I think however you try to measure their contributions, you’ll always come to the conclusion that there’s a large group of excellent folk right at the top.

      Peter

    • Matt Burleigh Says:

      Paul (and everyone else)

      Come to my talk at NAM for more!

      Matt

  8. Jim Geach Says:

    Is there such a thing as a differential h-/g-index?

    I was thinking that for early career researchers, a potentially useful statistic would be the average yearly increase in h (over some time window, or years since the start of their PhD).

    This might identify the potential future premier league players (it would be interesting to see the correlation between dh/dt averaged over career and current h-index) and those who are just starting to bloom, without having to rely on absolute number?
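A rough sketch of the statistic suggested above, in Python. Everything here is hypothetical: the function name, the idea of supplying a year-by-year record of someone's h-index, and the numbers themselves.

```python
def mean_h_growth(h_by_year, window=None):
    """Average yearly increase in h over the whole record, or over the
    last `window` years if a window is given.

    h_by_year: dict mapping year -> h-index at the end of that year.
    """
    years = sorted(h_by_year)
    if len(years) < 2:
        raise ValueError("need h-index values for at least two years")
    end = years[-1]
    start = years[0] if window is None else max(years[0], end - window)
    if start not in h_by_year:
        raise ValueError("no h-index recorded for the start of the window")
    if start == end:
        raise ValueError("window must span at least one year")
    return (h_by_year[end] - h_by_year[start]) / (end - start)


# Invented record for an early-career researcher
record = {2006: 2, 2007: 3, 2008: 5, 2009: 8, 2010: 12}
print(mean_h_growth(record))            # whole record: (12 - 2) / 4 = 2.5
print(mean_h_growth(record, window=2))  # last two years: (12 - 5) / 2 = 3.5
```

Comparing the windowed value with the whole-record value is one simple way to spot someone whose h-index is accelerating rather than just large.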

    • telescoper Says:

      When assessing applicants for PDRA and junior faculty positions, I think it’s a useful guide to look at recent citation data. I usually look at the normalised h-index over the last 5 years for all the candidates. That’s a useful thing to compare over a finite set of applicants, but it’s not something you can draw global conclusions from.

    • Jim

      The Australian paper did provide useful stats on how the h-index increases each decade after a PhD is completed. I would have liked to separately look into early-career researchers’ statistics, but 500 was about all I could manage without losing the will to live. Besides, figuring out which year individuals obtained their PhD was a step too far (at least over the course of a few evenings). It was tough enough trying to assess which academics should be included (Visiting academics, those on leave of absence or secondment, ATC research staff, what to do about Keith Mason even though he’s been purged from MSSL’s staff listing)??

      Paul

    • telescoper Says:

      I’m reliably informed that Keith Mason’s current affiliation is in fact “University of Wales, Aberystwyth” where he is a Fellow.

  9. The question I have, is who is on the list of top players but was not expected to be there (or not have even been heard of)?

    • Non-league membership probably more surprising than those at the top, but for what it’s worth:

      1: Fabian, 2: Efstathiou, 3: Frenk, 4: Silk* 5: Peacock, 6: Kennicutt, 7: Hawking, 8: Pringle, 9: Rowan-Robinson, 10=Gilmore, Nichol

      *probably disqualified since Oxford -> Paris, but still on Oxford website. Martin Rees would have taken up 4th position, but was excluded since Emeritus.

    • telescoper Says:

      Hasn’t Rowan-Robinson also retired?

    • i believe MRR and hawking are officially “retired” – although both still hold paid research positions.

  10. “who do you cite for 2+2=4,”

    Whitehead and Russell.

  11. Some general thoughts: If, and that’s a big “if”, one wants to use bibliometry to assess a candidate, one should use the g-index, not the h-index. However, instead of raw citations, one should a) multiply by the number of pages, b) divide by the number of authors and c) correct for self-citations. One can also compare only people at a) the same stage of their careers (for obvious reasons) and b) at the same point in time (since citation styles change, electronic access makes citing easier etc).
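For comparison with the h-index, here is a minimal Python sketch of the g-index (the largest g such that the g most-cited papers together have at least g squared citations), plus one literal reading of the adjustments suggested above. The adjustment formula, the function names and all the numbers are illustrative assumptions, not an established recipe.

```python
def g_index(citations):
    """Largest g such that the top g papers have at least g**2 citations in total."""
    ranked = sorted(citations, reverse=True)
    g = 0
    running_total = 0
    for rank, cites in enumerate(ranked, start=1):
        running_total += cites
        if running_total >= rank * rank:
            g = rank
    return g


def adjusted_citations(cites, self_cites, pages, n_authors):
    """One literal reading of the comment: drop self-citations,
    multiply by page count, divide by number of authors."""
    return (cites - self_cites) * pages / n_authors


# Invented raw citation counts for seven papers
print(g_index([10, 5, 3, 2, 1, 0, 0]))  # prints 4 (top four papers total 20 >= 16)

# A hypothetical 10-page, 3-author paper with 40 citations, 4 of them self-citations
print(adjusted_citations(40, 4, 10, 3))  # prints 120.0
```

Unlike the h-index, the g-index gives credit for citations beyond the threshold, so a few very highly cited papers raise it; the same list above has an h-index of only 3.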

    Even this won’t affect what is one of the main factors: what determines whether one is on the author list at all? Conventions vary enormously, from country to country, field to field, institute to institute, person to person etc. The boundary between being mentioned in the acknowledgments and becoming one of the et al. is rather fuzzy.

  12. Indeed, lots of fuzziness with bibliometrics. FWIW, I concluded my group talk by encouraging early career students and staff *not* to worry about bibliometrics, but simply to write lots of high quality, first author papers#.

    #not applicable in all sub-fields.

  13. Paul

    given the discussion on sub-discipline and out of sheer curiosity, could you reproduce your graph with data-points colour coded by sub discipline? I’d be interested to see what the distributions look like.

    • Me too, but I’m afraid not. At least not for the moment since I didn’t tag individuals with sub-disciplines, although it would be a worthwhile exercise for someone to do.

      It is a relatively easy game in some cases (e.g. solar physics, gravitational waves) but non-trivial for others. Some work across both extragalactic and galactic astronomy, so should they be tagged by cosmology or stellar astrophysics, while others may study both exoplanets and CVs, so would they fall under planetary or stellar?

  14. In defence of Peter Coles I’d say I’d personally rank him more highly. In fact, you could regard his position as a reductio ad absurdum for that particular metric.

    But this exposes a serious underlying flaw: we choose metrics that fit our preconceptions, and then we’re tempted to regard our preconceptions as proved when we quote them. If you think objective scientists couldn’t possibly be so unconscious of their own methods, remember that confirmation biases in the early 20th century are what some believed “proved” that women were less intelligent than men (a bogus argument about brain size) and ultimately provided the false evidence base for eugenics and its horrors.

    More controversially, I genuinely wonder about claims that humans are the most intelligent species, because we choose to define intelligence in more-or-less exclusively human terms (e.g. advanced tool-making). A cat watching a garden would probably regard a human watching the same scene as stupidly inattentive and unaware. If you were ET and saw our prejudices, would you want to make contact?

    Dragging myself back on topic, we do have to have metrics, but let’s not forget that academic impact is not a single parameter, so we should have a variety of measures. We can then use a choice of metrics to illustrate a point, but let’s not forget that having the initial freedom to pose the question changes how we should interpret the answer.

  15. Dennis Crabtree Says:

    While I’ve mostly tracked citations for larger aggregates such as telescopes or countries, I have looked a bit at individual ‘rankings’ based upon citations. As pointed out there are large variations between fields (astronomy vs biology), within fields and between sub-fields.

    Another useful discriminant is first author papers. If you plot H-index (or some variant) vs # of first author papers, you get very good separation based on ‘performance’. One could label the two axes as ‘impact’ and leadership.

    I’ve created a database of all papers published in the major astronomy journals for 2000-2009 (downloaded from Web of Science) and I’ve hooked this into my machinery for getting citations and other information from ADS. The data also includes the country of first author so I hope to look at trends between countries over the past decade as well as sub-fields based on keywords.

    • telescoper Says:

      I have heard it argued that one should only count first-author papers in the h-index. I think that’s a bit extreme, but it would be illuminating to see its effect on the Premiership. The trouble is that many people – including myself – tend to put students and PDRAs first unless there’s a good reason not to, so the only papers that will contribute to this index would be those written before the researcher had students or PDRAs.

    • “The trouble is that many people – including myself – tend to put students and PDRAs first unless there’s a good reason not to, so the only papers that will contribute to this index would be those written before the researcher had students or PDRAs.”

      Unless you had the misfortune of working for a big cheese in your earlier career, who insisted that his name go on first. 😦

      I mentioned above that, IF one wants to use a metric, the g-index makes a bit more sense than the h-index (though the h-index is a big step forward from more traditional metrics), but taking into account the length of the paper (a 10-page paper is usually more involved than a 2-page paper) as well as the number of authors. Correcting for self-citations is less important in the h-index and g-index than with other metrics. However, in correcting for the number of authors, one needs to weight by the amount of the contribution. In general, this information is unavailable and cannot be reliably guessed. This, with the related question of the threshold between acknowledgments and being one of the et al., is probably the biggest problem in using metrics to try to measure some objective quantity.

      Of course, there are other metrics not directly related to publications, such as successful students. Folks like Sciama score high here, even though they don’t score that high with bibliometry. (Some, like Lord Rees, score high with most schemes one can think of.)

  16. […] particle physics lecture on the Cabibbo mechanism for quark mixing, which inspired me to go back to Paul Crowther’s guest post of a couple of days ago to present the data in a slightly different […]
