Archive for World University Rankings

More Worthless University Rankings

Posted in Bad Statistics, Education on September 6, 2017 by telescoper

The Times Higher World University Rankings were released this week. The main table can be found here and the methodology used to concoct them here.

Here I wish to reiterate the objection I made last year, and the year before that, to the way these tables are manipulated year on year to create an artificial “churn” that renders them unreliable and impossible to interpret in any objective way. In other words, they’re worthless. This year the narrative text includes:

This year’s list of the best universities in the world is led by two UK universities for the first time. The University of Oxford has held on to the number one spot for the second year in a row, while the University of Cambridge has jumped from fourth to second place.

Overall, European institutions occupy half of the top 200 places, with the Netherlands and Germany joining the UK as the most-represented countries. Italy, Spain and the Netherlands each have new number ones.

Another notable trend is the continued rise of China. The Asian giant is now home to two universities in the top 30: Peking and Tsinghua. The Beijing duo now outrank several prestigious institutions in Europe and the US. Meanwhile, almost all Chinese universities have improved, signalling that the country’s commitment to investment has bolstered results year-on-year.

In contrast, two-fifths of the US institutions in the top 200 (29 out of 62) have dropped places. In total, 77 countries feature in the table.

These comments are all predicated on the assumption that any changes since the last tables represent changes in data (which in turn are assumed to be relevant to how good a university is) rather than changes in the methodology used to analyse that data. Unfortunately, every single year the Times Higher changes its methodology. This time we are told:

This year, we have made a slight improvement to how we handle our papers per academic staff calculation, and expanded the number of broad subject areas that we use.

What has been the effect of these changes? We are not told. The question that must be asked is: how can we be sure that any change in league table position for an institution from year to year represents a change in “performance”, rather than a change in the way the metrics are constructed and/or combined? Would you trust the outcome of a medical trial in which the responses of two groups of patients (e.g. one given medication and the other a placebo) were assessed with two different measurement techniques?

There is an obvious and easy way to test for the size of this effect: construct a parallel set of league tables, with this year’s input data but last year’s methodology, which would make it easy to isolate changes in methodology from changes in the performance indicators. The Times Higher – along with other purveyors of similar statistical twaddle – refuses to do this. No scientifically literate person would accept the results of this kind of study unless the systematic effects could be shown to be under control. All the Times Higher needs to do is publish a set of league tables using, say, the 2016/17 methodology and the 2017/18 data, for comparison with those constructed using this year’s methodology on the 2017/18 data. Any differences between these two tables would give a clear indication of the reliability (or otherwise) of the rankings.
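To show the kind of check this involves, here is a minimal sketch in Python of how one could quantify the ranking “churn” attributable to methodology alone. Everything in it is invented: the institutions, the indicator scores and both sets of weights are hypothetical, and the methodology change is reduced, for simplicity, to a change of weights. The point is only that the input data are held fixed, so any movement between the two tables is an artefact of the method.

```python
# Toy illustration: how much ranking "churn" can come purely from a change of
# methodology. The institutions, indicator scores and weights are all invented,
# and the methodology change is reduced, for simplicity, to a change of weights.

indicators = {  # one year's (fixed) indicator scores, each on a 0-100 scale
    "Uni A": {"teaching": 60, "research": 92, "citations": 95},
    "Uni B": {"teaching": 90, "research": 85, "citations": 70},
    "Uni C": {"teaching": 75, "research": 80, "citations": 88},
}

old_weights = {"teaching": 0.30, "research": 0.30, "citations": 0.40}
new_weights = {"teaching": 0.40, "research": 0.30, "citations": 0.30}

def league_table(weights):
    """Order institutions by weighted overall score, best first."""
    def overall(uni):
        return sum(weights[k] * indicators[uni][k] for k in weights)
    return sorted(indicators, key=overall, reverse=True)

old_table = league_table(old_weights)
new_table = league_table(new_weights)

# The input data are identical in both tables, so any movement seen here is an
# artefact of the methodology change alone (positive = moved up the table).
for uni in indicators:
    before, after = old_table.index(uni) + 1, new_table.index(uni) + 1
    print(f"{uni}: {before} -> {after} ({before - after:+d})")
```

Publishing the real-data equivalent of old_table alongside new_table is all that is being asked for here.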

I challenged the Times Higher to do this last year, and they refused. You can draw your own conclusions about why.

P.S. For the record, Cardiff University is 162nd in this year’s table, a rise of 20 places on last year. My former institution, the University of Sussex, is up two places to joint 147th. Whether these changes are anything other than artifacts of the data analysis I very much doubt.

The Worthless University Rankings

Posted in Bad Statistics, Education on September 23, 2016 by telescoper

The Times Higher World University Rankings were released this week. The main table can be found here and the methodology used to concoct them here.

Here I wish to reiterate the objection I made last year to the way these tables are manipulated year on year to create an artificial “churn” that renders them unreliable and impossible to interpret in an objective way. In other words, they’re worthless. This year, editor Phil Baty has written an article entitled Standing still is not an option in which he states that “the overall rankings methodology is the same as last year”. Actually, it isn’t. In the page on methodology you will find this:

In 2015-16, we excluded papers with more than 1,000 authors because they were having a disproportionate impact on the citation scores of a small number of universities. This year, we have designed a method for reincorporating these papers. Working with Elsevier, we have developed a new fractional counting approach that ensures that all universities where academics are authors of these papers will receive at least 5 per cent of the value of the paper, and where those that provide the most contributors to the paper receive a proportionately larger contribution.

So the methodology just isn’t “the same as last year”. In fact, every year that I’ve seen these rankings there has been some change in methodology. The change above at least attempts to improve on the absurd decision taken last year to eliminate from the citation count any papers arising from large collaborations. In my view, membership of large world-wide collaborations is in itself an indicator of international research excellence, and such papers should, if anything, be given greater rather than lesser weight. But whether you agree with the motivation for the change or not is beside the point.
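For concreteness, here is a minimal sketch of one way a “floor plus proportional share” split of the kind quoted above might work. The function, the floor value and the author counts are hypothetical illustrations of the idea, not the actual Elsevier/Times Higher algorithm.

```python
# Hedged sketch of fractional counting for a huge multi-author paper: every
# participating university gets a floor of 5% of the paper's value, and the
# remainder is shared in proportion to the number of authors each contributed.
# Illustration only; not the actual Elsevier/Times Higher algorithm.

def fractional_credit(author_counts, floor=0.05):
    """author_counts maps each university to its number of authors on the paper."""
    universities = list(author_counts)
    if len(universities) * floor > 1.0:
        raise ValueError("floor too large for this many universities")
    remainder = 1.0 - floor * len(universities)
    total_authors = sum(author_counts.values())
    return {u: floor + remainder * author_counts[u] / total_authors
            for u in universities}

# Example: a 1,000-author paper spread across three (made-up) institutions.
shares = fractional_credit({"Uni X": 700, "Uni Y": 250, "Uni Z": 50})
for uni, share in shares.items():
    print(f"{uni}: {share:.1%}")   # every participant keeps at least 5%
```

The more institutions share a paper, the more the floor dominates the split, which is why the treatment of these huge collaborative papers makes such a difference to citation-based indicators.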

The real question is how we can tell whether any change in league table position for an institution from year to year is caused by methodological tweaks rather than by changes in “performance”, i.e. not by changes in the metrics themselves but by changes in the way they are combined. Would you trust the outcome of a medical trial in which the responses of two groups of patients (e.g. one given medication and the other a placebo) were assessed with two different measurement techniques?

There is an obvious and easy way to test for the size of this effect: construct a parallel set of league tables, with this year’s input data but last year’s methodology, which would make it easy to isolate changes in methodology from changes in the performance indicators. The Times Higher – along with other purveyors of similar statistical twaddle – refuses to do this. No scientifically literate person would accept the results of this kind of study unless the systematic effects could be shown to be under control. All the Times Higher needs to do is publish a set of league tables using, say, the 2015/16 methodology and the 2016/17 data, for comparison with those constructed using this year’s methodology on the 2016/17 data. Any differences between these two tables would give a clear indication of the reliability (or otherwise) of the rankings.

I challenged the Times Higher to do this last year, and they refused. You can draw your own conclusions about why.

An Open Letter to the Times Higher World University Rankers

Posted in Education, The Universe and Stuff on October 5, 2015 by telescoper

Dear Rankers,

Having perused your latest set of league tables along with the published methodology, a couple of things puzzle me.

First, I note that you have made significant changes to your methodology for combining metrics this year. How, then, can you justify making statements such as

US continues to lose its grip as institutions in Europe up their game

when it appears that any changes could well be explained not by changes in performance, as gauged by the metrics you use, but by changes in the way those metrics are combined?

I assume, as intelligent and responsible people, that you did the obvious test for this effect, i.e. constructed a parallel set of league tables, with this year’s input data but last year’s methodology, which would make it easy to isolate changes in methodology from changes in the performance indicators. Your failure to publish such a set, to illustrate how seriously your readers should take statements such as that quoted above, must then simply have been an oversight. Had you deliberately withheld evidence of the unreliability of your conclusions you would have left yourselves open to an accusation of gross dishonesty, which I am sure would be unfair.

Happily, however, there is a very easy way to allay the fears of the global university community that the world rankings are being manipulated: all you need to do is publish a set of league tables using the 2014 methodology and the 2015 data. Any difference between that table and the one you published would then simply be an artefact, and the new ranking could be ignored. I’m sure you are as anxious as anyone else to prove that the changes this year are not simply artificially induced “churn”, and I look forward to seeing the results of this straightforward calculation published in the Times Higher as soon as possible.

Second, I notice that one of the changes to your methodology is explained thus:

This year we have removed the very small number of papers (649) with more than 1,000 authors from the citations indicator.

You are presumably aware that this primarily affects papers relating to experimental particle physics, which is mostly conducted through large international collaborations (chiefly, but not exclusively, based at CERN). This change at a stroke renders such fundamental scientific breakthroughs as the discovery of the Higgs Boson completely worthless for the purposes of your rankings. This is a strange thing to do, because this is exactly the type of research that inspires prospective students to study physics, as well as being a direct measure in itself of the global standing of a university.

My current institution, the University of Sussex, is heavily involved in experiments at CERN. For example, Dr Iacopo Vivarelli has just been appointed coordinator of all supersymmetry searches using the ATLAS experiment on the Large Hadron Collider. This involvement demonstrates the international standing of our excellent Experimental Particle Physics group, but if evidence of supersymmetry is found at the LHC your methodology will simply ignore it. A similar fate will also befall any experiment that requires large international collaborations: searches for dark matter, dark energy, and gravitational waves to name but three, all exciting and inspiring scientific adventures that you regard as unworthy of any recognition at all but which draw students in large numbers into participating departments.

Your decision to downgrade collaborative research to zero is not only strange but also extremely dangerous, for it tells university managers that participating in world-leading collaborative research will jeopardise their rankings. How can you justify such a deliberate and premeditated attack on collaborative science? Surely it is exactly the sort of thing you should be rewarding? Physics departments not participating in such research are the ones that should be downgraded!

Your answer might be that excluding “superpapers” only damages the rankings of smaller universities, because they might owe a larger fraction of their total citation count to collaborative work. Well, so what if this is true? It’s not a reason for excluding them. Perhaps small universities are better anyway, especially when they emphasize small group teaching and provide opportunities for students to engage in learning that’s led by cutting-edge research. Or perhaps you have decided otherwise and have changed your methodology to confirm your prejudice…

I look forward to seeing your answers to the above questions through the comments box or elsewhere – though you have ignored my several attempts to raise these questions via social media. I also look forward to seeing you correct your error of omission by demonstrating – by the means described above – which changes in league table positions are produced by your design rather than by any change in performance. If it turns out that the former is the case, as I think it will, at least your own journal provides you with a platform from which you can apologize to the global academic community for wasting their time.

Yours sincerely,

Telescoper

The League of Extraordinary Gibberish

Posted in Bad Statistics on October 13, 2009 by telescoper

After a very busy few days I thought I’d relax yesterday by catching up with a bit of reading. In last week’s Times Higher I found there was a supplement giving this year’s World University Rankings.

I don’t really approve of league tables but somehow can’t resist looking in them to see where my current employer, Cardiff University, lies. There we are at number 135 in the list of the top 200 universities. That’s actually not bad for an institution that’s struggling with a Welsh funding system that seriously disadvantages it compared to our English colleagues. We’re a long way down compared to Cambridge (2nd), UCL (4th), and Imperial and Oxford (joint 5th). Compared to places I’ve worked at previously we’re significantly below Nottingham (91st) but still above Queen Mary (164th) and Sussex (166th). Number 1 in the world is Harvard, which is apparently somewhere near Boston (the American one).

Relieved that we’re in the top 200 at all, I decided to have a look at how the tables were drawn up. I wish I hadn’t bothered, because I was horrified at the methodological garbage that lies behind them. You can find a full account of the travesty here. In essence, however, the ranking is arrived at by adding together six distinct indicators, weighted differently but with weights assigned for no obvious reason, each of which is obtained by dubious means and is highly unlikely to measure what it purports to. Each indicator is magically turned into a score out of 100 before being added to all the other ones (with the appropriate weighting factors), as sketched in the code after the list below.

The indicators are:

  1. Academic Peer Review. This accounts for 40% of the overall score for each institution and is obtained by asking a sample of academics (selected in a way that is not explained) to name the institutions they regard as the best in their field. This year 9386 people were involved. This sample is a tiny fraction of the global academic population and it would amaze me if it were representative of anything at all!
  2. Employer Survey. The pollsters asked 3281 graduate employers for their opinions of the different universities. This was weighted 10%.
  3. Staff-Student Ratio. Counting for 20%, this is supposed to be a measure of “teaching quality”! Good teaching = large numbers of staff? Not if most of them don’t teach, as at many research universities. A high staff-student ratio could even mean the place is really unpopular!
  4. International Faculty. This measures the proportion of overseas staff on the books. Apparently a large number of foreign lecturers makes for a good university and shows “how attractive an institution is around the world”. Or perhaps it just shows that the institution finds it difficult to recruit its own nationals. This one counts for only 5%.
  5. International Students. Another 5% goes to the fraction of the student body that comes from overseas.
  6. Research Excellence. This is measured solely on the basis of citations – I’ve discussed some of the issues with that before – and counts for 20%. They choose to use an unreliable database called SCOPUS, run by the profiteering academic publisher Elsevier. The total number of citations is divided by the number of faculty to “give a sense of the density of research excellence” at the institution.
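To make the arithmetic concrete, here is a minimal sketch in Python of how a composite score of this kind is assembled. The weights are the ones quoted in the list above; the institutions, the raw indicator values and the choice of min-max rescaling to put each indicator on a 0-100 scale are invented for illustration, and the real tables may well normalise their indicators differently.

```python
# Sketch of the composite scoring described above: each raw indicator is
# rescaled to a 0-100 score, then the six scores are added with the quoted
# weights. The raw values and the rescaling choice are illustrative only.

WEIGHTS = {
    "peer_review": 0.40, "employer_survey": 0.10, "staff_student": 0.20,
    "intl_faculty": 0.05, "intl_students": 0.05, "citations_per_staff": 0.20,
}

raw = {  # hypothetical raw indicator values for three made-up institutions
    "Uni A": {"peer_review": 320, "employer_survey": 45, "staff_student": 0.11,
              "intl_faculty": 0.30, "intl_students": 0.22, "citations_per_staff": 48},
    "Uni B": {"peer_review": 180, "employer_survey": 60, "staff_student": 0.07,
              "intl_faculty": 0.18, "intl_students": 0.35, "citations_per_staff": 65},
    "Uni C": {"peer_review": 240, "employer_survey": 30, "staff_student": 0.15,
              "intl_faculty": 0.05, "intl_students": 0.10, "citations_per_staff": 20},
}

def to_score(indicator):
    """Min-max rescale one raw indicator to 0-100 across all institutions."""
    values = [raw[u][indicator] for u in raw]
    lo, hi = min(values), max(values)
    return {u: 100.0 * (raw[u][indicator] - lo) / (hi - lo) for u in raw}

scores = {ind: to_score(ind) for ind in WEIGHTS}
overall = {u: sum(WEIGHTS[ind] * scores[ind][u] for ind in WEIGHTS) for u in raw}

for uni in sorted(overall, key=overall.get, reverse=True):
    print(f"{uni}: {overall[uni]:.1f}")
```

Note that under min-max rescaling the best and worst institution on each indicator automatically score 100 and 0, so a single outlier can stretch or compress everyone else’s scores on that indicator.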

Well, I hope by now you’ve got a sense of the density of the idiots who compiled this farrago. Even if you set aside the issue of the accuracy of the input data, there is still the issue of how on Earth anyone could have thought it sensible to pick such silly ways of measuring what makes a good university, assign essentially arbitrary weights to them, and then claim to have achieved something useful. They probably got paid a lot for doing it too. Talk about money for old rope. I’m in the wrong business.

What gives the game away entirely is the enormous variance from one indicator to another. This means that changing the weights even slightly would produce a drastically different list. And who is to say that the variables should be added linearly anyway? Is a score of 100 really worth precisely twice as much as a score of 50? What do the distributions look like? How significant are the differences in score from one institution to another? And what are we actually trying to measure anyway?

Here’s an example. The University of California at Berkeley scores 100/100 for indicators 1, 2 and 4, and 86 for 5. However, for Staff-Student Ratio (3) it gets a lowly 25/100, and for Research Excellence (6) it gets only 34; these combine to take it down to 39th in the table. Exclude this curiously-chosen proxy for teaching quality and Berkeley would rocket up the table.
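Taking the quoted component scores at face value and combining them with the weights listed earlier gives a feel for how much that one indicator drags Berkeley down. The short key names below are just labels, and this is back-of-the-envelope arithmetic, not a reconstruction of the published calculation, which may involve further normalisation.

```python
# The Berkeley figures quoted above, combined with the stated weights, and the
# same figures re-combined with the staff-student indicator left out (remaining
# weights rescaled to sum to one). Illustrative arithmetic only.

weights = {"peer": 0.40, "employer": 0.10, "staff_student": 0.20,
           "intl_faculty": 0.05, "intl_students": 0.05, "citations": 0.20}
berkeley = {"peer": 100, "employer": 100, "staff_student": 25,
            "intl_faculty": 100, "intl_students": 86, "citations": 34}

full = sum(weights[k] * berkeley[k] for k in weights)                  # about 71

reduced = {k: w for k, w in weights.items() if k != "staff_student"}
total = sum(reduced.values())                                          # 0.80
rescaled = sum((w / total) * berkeley[k] for k, w in reduced.items())  # about 83

print(f"with staff-student indicator:    {full:.1f}")
print(f"without staff-student indicator: {rescaled:.1f}")
```

The gap between the two numbers is driven entirely by the choice of weights, which is exactly the sensitivity complained about above.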

Of course you can laugh these things off as unimportant trivia to be looked at with mild amusement over a glass of wine, but such things have increasingly found their way into the minds of managers and politicians. The fact that they are based on flawed assumptions, use a daft methodology, and produce utterly meaningless results seems to be irrelevant. Because they are based on numbers they must represent some kind of absolute truth.

There’s nothing at all wrong with collating and publishing information about schools and universities. Such facts should be available to the public. What is wrong is the manic obsession with condensing disparate sets of conflicting data into a single number just so things can be ordered in lists that politicians can understand.

You can see the same thing going on in the national newspapers’ lists of university rankings. Each one uses a different weighting and different data, and the resulting lists are drastically different. They give different answers because nobody has even bothered to think about what the question is.