Archive for arXiv: 2110.14115

Citation Metrics and “Judging People’s Careers”

Posted in Bad Statistics, The Universe and Stuff on October 29, 2021 by telescoper

There’s a paper on the arXiv by John Kormendy entitled Metrics of research impact in astronomy: Predicting later impact from metrics measured 10-15 years after the PhD. The abstract is as follows.

This paper calibrates how metrics derivable from the SAO/NASA Astrophysics Data System can be used to estimate the future impact of astronomy research careers and thereby to inform decisions on resource allocation such as job hires and tenure decisions. Three metrics are used, citations of refereed papers, citations of all publications normalized by the numbers of co-authors, and citations of all first-author papers. Each is individually calibrated as an impact predictor in the book Kormendy (2020), “Metrics of Research Impact in Astronomy” (Publ Astron Soc Pac, San Francisco). How this is done is reviewed in the first half of this paper. Then, I show that averaging results from three metrics produces more accurate predictions. Average prediction machines are constructed for different cohorts of 1990-2007 PhDs and used to postdict 2017 impact from metrics measured 10, 12, and 15 years after the PhD. The time span over which prediction is made ranges from 0 years for 2007 PhDs to 17 years for 1990 PhDs using metrics measured 10 years after the PhD. Calibration is based on perceived 2017 impact as voted by 22 experienced astronomers for 510 faculty members at 17 highly-ranked university astronomy departments world-wide. Prediction machinery reproduces voted impact estimates with an RMS uncertainty of 1/8 of the dynamic range for people in the study sample. The aim of this work is to lend some of the rigor that is normally used in scientific research to the difficult and subjective job of judging people’s careers.
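For anyone unsure what a figure like "an RMS uncertainty of 1/8 of the dynamic range" amounts to, here is a minimal sketch of one way such a number could be computed. It is not the paper's actual machinery. The function names are hypothetical, the numbers are made up for four imaginary people, and I am assuming that "dynamic range" means the spread between the highest and lowest voted values.

```python
import math


def average_predictions(*per_metric):
    """Average the predictions from several metrics, person by person."""
    return [sum(vals) / len(vals) for vals in zip(*per_metric)]


def rms_fraction_of_range(predicted, voted):
    """RMS of (predicted - voted), divided by (max - min) of the voted values.

    Assumption: 'dynamic range' is taken to be max(voted) - min(voted).
    """
    if len(predicted) != len(voted) or not voted:
        raise ValueError("need two equal-length, non-empty sequences")
    dynamic_range = max(voted) - min(voted)
    if dynamic_range == 0:
        raise ValueError("voted values must not all be identical")
    mean_sq = sum((p - v) ** 2 for p, v in zip(predicted, voted)) / len(voted)
    return math.sqrt(mean_sq) / dynamic_range


# Toy, made-up numbers for four hypothetical people (NOT taken from the paper).
voted = [1.0, 3.0, 5.0, 9.0]        # panel's "voted impact"
metric_a = [2.0, 3.0, 4.0, 8.0]     # prediction from a first metric
metric_b = [1.0, 4.0, 6.0, 10.0]    # prediction from a second metric
metric_c = [1.5, 2.0, 5.0, 9.0]     # prediction from a third metric

combined = average_predictions(metric_a, metric_b, metric_c)
single = rms_fraction_of_range(metric_a, voted)
averaged = rms_fraction_of_range(combined, voted)
print(f"single metric: {single:.3f}, averaged: {averaged:.3f}")
```

In this toy case the averaged prediction sits closer to the voted values than the single metric does, but all that means is that it reproduces the panel's votes more closely. It says nothing about whether the votes themselves are a sound yardstick.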

This paper has understandably generated a considerable reaction on social media, especially from early-career researchers dismayed at how senior astronomers apparently think they should be judged. Presumably “judging people’s careers” means deciding whether or not they should get tenure (or equivalent), although the phrase is not a pleasant one to use.

My own opinion is that while citations and other bibliometric indicators do contain some information, they are extremely difficult to apply in the modern era in which so many high-impact results are generated by large international teams. Note also the extreme selectivity of this exercise: just 22 “experienced astronomers” provide the “calibration”, which is for faculty in just 17 “highly-ranked” university astronomy departments. No possibility of any bias there, obviously. Subjectivity doesn’t turn into objectivity just because you make it quantitative.

If you’re interested here are the names of the 22:

Note that the author of the paper is himself on the list. I find that deeply inappropriate.

Anyway, the overall level of statistical gibberish in this paper is such that I am amazed it has been accepted for publication, but then it is in the Proceedings of the National Academy of Sciences, a journal that has form when it comes to dodgy statistics. If I understand correctly, PNAS has a route that allows “senior” authors to publish papers without passing through peer review. That’s the only explanation I can think of for this.

As a rejoinder I’d like to mention this paper by Adler et al. from 12 years ago, which has the following abstract:

This is a report about the use and misuse of citation data in the assessment of scientific research. The idea that research assessment must be done using “simple and objective” methods is increasingly prevalent today. The “simple and objective” methods are broadly interpreted as bibliometrics, that is, citation data and the statistics derived from them. There is a belief that citation statistics are inherently more accurate because they substitute simple numbers for complex judgments, and hence overcome the possible subjectivity of peer review. But this belief is unfounded.

O brave new world that has such metrics in it.

Update: John Kormendy has now withdrawn the paper; you can see his statement here.