Citation-weighted Wordles
Someone who clearly has too much time on his hands emailed me this morning with the results of an in-depth investigation into trends in the titles of highly cited astronomy papers from the past 30 years, and how these reflect the changing ‘hot topics’.
The procedure adopted was to query ADS for the 100 most-cited papers in each of three ten-year intervals: 1980-1990, 1990-2000, and 2000-2010. He then took all the words from the titles of these papers and weighted each word by the summed citation counts of all the articles whose titles contain it. So if the word ‘galaxy’ appears in two papers with 100 and 300 citations, it gets a weighting of 400, and so on.
After getting these lists, he used the online ‘Wordle’ tool to generate word clouds of these words, using those citation weightings in the word-sizing calculation. Common words, numbers, etc. are excluded. There may be some cases where non-astronomy papers have crept in, but as much as possible has been done to keep these to a minimum.
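For anyone who wants to try this at home, the weighting step might look something like the sketch below. This is only my guess at the logic, not the actual script: the function name `citation_weights`, the tiny stop-word list, and the two example titles are all made up for illustration, and the real lists would come from an ADS query.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; the real exclusion list is not known.
STOPWORDS = {"the", "of", "a", "an", "and", "in", "for", "on", "to", "from", "with"}


def citation_weights(papers, stopwords=STOPWORDS):
    """Weight each title word by the summed citations of the papers it appears in.

    `papers` is an iterable of (title, citation_count) pairs.
    """
    weights = defaultdict(int)
    for title, citations in papers:
        # Keep alphabetic words only, so numbers drop out; use a set so a
        # word repeated within one title is counted once for that paper.
        words = set(re.findall(r"[a-z]+", title.lower()))
        for word in words - stopwords:
            weights[word] += citations
    return dict(weights)


# Made-up titles and citation counts, mirroring the 'galaxy' example above.
papers = [
    ("The galaxy luminosity function", 100),
    ("Galaxy clustering in redshift surveys", 300),
]
weights = citation_weights(papers)
print(weights["galaxy"])
```

Note that this sketch does no stemming, so ‘star’ and ‘stars’ would be counted as separate words.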
There’s probably some bias, since older papers have had longer to accumulate citations, but I think the changing hot topics on ~10-year time-scales take care of this.
Anyway, here are the rather interesting results. First is 1980-1990
Followed by 1990-2000
and, lastly, we have 2000-2010
It’s especially interesting to see the extent to which cosmology has elbowed all the other less interesting stuff out of the way…and how the word “observations” has come to the fore in the last decade.
ps. Here’s the last one again with the WMAP papers taken out:
December 12, 2011 at 11:22 am
A simple explanation would be that pretty much every extra-galactic paper that uses a particular value for H0 cites a paper with “[N]-year Wilkinson Microwave Anisotropy Probe (WMAP) observations … cosmological …” as the title. With that in mind (and removing those outliers from the Wordle), it is clear that cosmology only really exists to support research into galaxies 😉
December 12, 2011 at 12:51 pm
…i think jim is stuck at a snowy/icy mauna kea – so he has time on his hands.
December 13, 2011 at 7:11 am
dome is closed again, so here’s another one – perhaps a little more controversial:
http://www.physics.mcgill.ca/~jimgeach/wordcloud/authors.html
it’s co-authors on the top 500 papers from 1995-present, where the weights are given by the sum of (cites/co-author position)/time since publication.
there’s also a penalization factor for the big data release papers (WMAP, SDSS, etc.) to prevent those from overwhelming the cloud.
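a rough sketch of how that author weighting could be computed is below. it is one reading of the description above, not the actual code: each author on a paper gets cites divided by their position in the author list, divided again by the paper's age in years, and the result is summed over papers. the function name `author_weights`, the `data_release` flag, the penalty value of 0.5 and the example papers are all hypothetical.

```python
from collections import defaultdict


def author_weights(papers, current_year, release_penalty=0.5):
    """Sum (cites / author position) / (years since publication) per author.

    `release_penalty` is a hypothetical down-weighting factor applied to
    papers flagged as big data releases; the real factor is not known.
    """
    weights = defaultdict(float)
    for paper in papers:
        # Guard against division by zero for papers from the current year.
        age = max(current_year - paper["year"], 1)
        penalty = release_penalty if paper.get("data_release") else 1.0
        for position, author in enumerate(paper["authors"], start=1):
            weights[author] += penalty * paper["cites"] / position / age
    return dict(weights)


# Made-up example papers.
papers = [
    {"authors": ["Author A", "Author B"], "cites": 1200, "year": 2001},
    {"authors": ["Author B"], "cites": 600, "year": 2005, "data_release": True},
]
result = author_weights(papers, current_year=2011)
for name, weight in sorted(result.items(), key=lambda item: -item[1]):
    print(name, round(weight, 1))
```

dividing by author position means first authors get full credit and later co-authors progressively less, while dividing by age keeps older papers from dominating just because they have had longer to collect citations.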
December 12, 2011 at 6:55 pm
I am definitely a ‘1990-2000’ person. Interesting to see how ‘star’ in the first decade became ‘stars’ and lately ‘stellar’. Shifting emphasis.
Did I miss ‘planetary nebulae’ in the cloud? Probably needs a microscope to see.