What Counts as Productivity?

Posted in Bad Statistics, Science Politics, The Universe and Stuff on March 18, 2011 by telescoper

Apparently last year the United Kingdom Infra-Red Telescope (UKIRT) beat its own personal best for scientific productivity. In fact here’s a graphic showing the number of publications resulting from UKIRT to make the point:

The plot also demonstrates that a large part of the recent burst of productivity has been associated with UKIDSS (the UKIRT Infrared Deep Sky Survey), which a number of my colleagues are involved in. Excellent chaps. Great project. Lots of hard work done very well. Take a bow, the UKIDSS team!

Now I hope I’ve made it clear that I don’t in any way want to pour cold water on the achievements of UKIRT, and particularly not UKIDSS, but this does provide an example of how difficult it is to use bibliometric information in a meaningful way.

Take the UKIDSS papers used in the plot above. There are 226 of these listed by Steve Warren at Imperial College. But what is a “UKIDSS paper”? Steve states the criteria he adopted:

A paper is listed as a UKIDSS paper if it is already published in a journal (with one exception) and satisfies one of the following criteria:

1. It is one of the core papers describing the survey (e.g. calibration, archive, data releases). The DR2 paper is included, and is the only paper listed not published in a journal.
2. It includes science results that are derived in whole or in part from UKIDSS data directly accessed from the archive (analysis of data published in another paper does not count).
3. It contains science results from primary follow-up observations in a programme that is identifiable as a UKIDSS programme (e.g. The physical properties of four ~600K T dwarfs, presenting Spitzer spectra of cool brown dwarfs discovered with UKIDSS).
4. It includes a feasibility study of science that could be achieved using UKIDSS data (e.g. The possiblity of detection of ultracool dwarfs with the UKIRT Infrared Deep Sky Survey by Deacon and Hambly).

Papers are identified by a full-text search for the string ‘UKIDSS’, and then compared against the above criteria.
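The two-stage procedure described above (a full-text search for the string, then a check against the criteria) can be sketched roughly as follows. This is only an illustration, not Steve Warren's actual method: the `Paper` record, its fields, and the toy entries are all hypothetical, and the judgement of which criteria a paper meets is assumed to have been made by a human reader and simply recorded.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    # Hypothetical record; none of these field names come from the source.
    title: str
    full_text: str
    in_journal: bool
    # Criteria (1-4) a human reader judged the paper to satisfy; empty if none.
    criteria_met: frozenset = frozenset()
    # Marks the single allowed exception (the DR2 paper, not in a journal).
    is_dr2_exception: bool = False


def candidate_papers(papers):
    """Stage 1: keep papers whose full text contains the string 'UKIDSS'."""
    return [p for p in papers if "UKIDSS" in p.full_text]


def is_ukidss_paper(paper):
    """Stage 2: published (or the one exception) and at least one criterion met."""
    published = paper.in_journal or paper.is_dr2_exception
    return published and len(paper.criteria_met) > 0


def count_ukidss_papers(papers):
    return sum(is_ukidss_paper(p) for p in candidate_papers(papers))


# Invented toy entries, purely to exercise the logic.
papers = [
    Paper("Survey overview", "The UKIDSS survey design", True, frozenset({1})),
    Paper("Data release two", "UKIDSS DR2 contents", False, frozenset({1}),
          is_dr2_exception=True),
    Paper("Unrelated galaxy study", "mentions UKIDSS in passing", True),
    Paper("Unpublished archive analysis", "uses UKIDSS archive data", False,
          frozenset({2})),
    Paper("No mention at all", "nothing relevant here", True, frozenset({2})),
]

print(len(candidate_papers(papers)), count_ukidss_papers(papers))
```

Even in this toy version, the count depends entirely on choices made at each stage: what the search string is, what counts as published, and how a reader interprets each criterion.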

That all seems to me to be quite reasonable, and it’s certainly one way of defining what a UKIDSS paper is. According to that measure, UKIDSS scores 226.

The Warren measure does, however, include a number of papers that don’t directly use UKIDSS data, and many written by people who aren’t members of the UKIDSS consortium. Being picky you might say that such papers aren’t really original UKIDSS papers, but are more like second-generation spin-offs. So how could you count UKIDSS papers differently?

I just tried one alternative way, which is to use ADS to identify all refereed papers with “UKIDSS” in the title, assuming – possibly incorrectly – that all papers written by the UKIDSS consortium would have UKIDSS in the title. The number returned by this search was 38.

Now I’m not saying that this is more reasonable than the Warren measure. It’s just different, that’s all. According to my criterion, however, UKIDSS measures 38 rather than 226. It sounds less impressive (if only because 38 is a smaller number than 226), but what does it mean about UKIDSS productivity in absolute terms?

Not very much, I think, is the answer.

Yet another way you might try to judge UKIDSS using bibliometric means is to look at its citation impact. After all, any fool can churn out dozens of papers that no-one ever reads. I know that for a fact. I am that fool.

But citation data also provide another way of doing what Steve Warren was trying to measure. Presumably the authors of any paper that uses UKIDSS data in any significant way would cite the main UKIDSS survey paper led by Andy Lawrence (Lawrence et al. 2007). According to ADS, the number of times this has been cited since publication is 359. That’s higher than the Warren measure (226), and much higher than the UKIDSS-in-the-title measure (38).
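To see how far apart these figures are, here is a minimal sketch that simply compares the three numbers quoted in this post. The dictionary labels are informal names chosen for illustration; only the values 226, 38 and 359 come from the text.

```python
# The three counts quoted in the post; the labels are informal.
measures = {
    "Warren list": 226,
    "UKIDSS in title (ADS)": 38,
    "Citations of Lawrence et al. 2007 (ADS)": 359,
}

smallest = min(measures.values())

# Express each count as a multiple of the smallest one.
for name, value in measures.items():
    print(f"{name}: {value} ({value / smallest:.1f}x the smallest)")

# Ratio between the largest and smallest counts.
spread = max(measures.values()) / smallest
print(f"Largest / smallest: {spread:.2f}")
```

The largest count is more than nine times the smallest, even though all three purport to measure the output of the same survey.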

So there we are, three different measures, all in my opinion perfectly reasonable measures of, er, something or other, but each giving a very different numerical value. I am not saying any of them is misleading or that any is necessarily better than the others. My point is simply that it’s not easy to assign a numerical value to something that’s intrinsically difficult to define.

Unfortunately, it’s a point few people in government seem to be prepared to acknowledge.

Andy Lawrence is 57.