Thirty Years of Preprints

I thought I’d share an interesting paper (by Xie, Shen & Wang) that I found on the arXiv with the title “Is preprint the future of science? A thirty year journey of online preprint services”. The abstract reads:

Preprint is a version of a scientific paper that is publicly distributed preceding formal peer review. Since the launch of arXiv in 1991, preprints have been increasingly distributed over the Internet as opposed to paper copies. It allows open online access to disseminate the original research within a few days, often at a very low operating cost. This work overviews how preprint has been evolving and impacting the research community over the past thirty years alongside the growth of the Web. In this work, we first report that the number of preprints has exponentially increased 63 times in 30 years, although it only accounts for 4% of research articles. Second, we quantify the benefits that preprints bring to authors: preprints reach an audience 14 months earlier on average and associate with five times more citations compared with a non-preprint counterpart. Last, to address the quality concern of preprints, we discover that 41% of preprints are ultimately published at a peer-reviewed destination, and the published venues are as influential as papers without a preprint version. Additionally, we discuss the unprecedented role of preprints in communicating the latest research data during recent public health emergencies. In conclusion, we provide quantitative evidence to unveil the positive impact of preprints on individual researchers and the community. Preprints make scholarly communication more efficient by disseminating scientific discoveries more rapidly and widely with the aid of Web technologies. The measurements we present in this study can help researchers and policymakers make informed decisions about how to effectively use and responsibly embrace a preprint culture.

The paper makes a number of good arguments, backed up with evidence, as to why preprints are a good idea. I recommend reading it.

Here is Figure 1 from the paper:

(Parts of the chart are difficult to read, so see the paper for details).

This shows that about 50% of all preprints are in the areas of physics and mathematics and that they are distributed predominantly through the arXiv. Other scientific disciplines, e.g. biology, have a much lower prevalence of preprints. I’ve been putting my papers on arXiv since the early Nineties, i.e. for most of the period covered by the paper. I don’t know why other fields are so backward.

It’s standard practice in my own field of astrophysics to put preprints of articles on the arXiv but younger readers will probably not realize that preprints were not always produced in the electronic form they are today. We all used to make large numbers of paper copies and post them, at great expense, to (potentially) interested colleagues before publication in order to get comments. That was extremely useful because a paper could take over a year to be published after being refereed for a journal: that’s too long a timescale when a PhD or PDRA position is only a few years in duration. The first papers I was given to read as a new graduate student in 1985 were all preprints that were not published until well into the following year. In some cases I had more or less figured out what they were about by the time they appeared in a journal!

The practice of circulating preprints persisted well into the 1990s. Usually these were produced by institutions with a distinctive design, logo, etc., which gave them a professional look and made it easier to distinguish ‘serious’ papers from crank material (which was also in circulation). This also suggested that some internal refereeing inside an institution had taken place before an “official” preprint was produced, thus lending it an air of trustworthiness. Smaller institutions couldn’t afford all this, so were somewhat excluded from the preprint business.

With the arrival of the arXiv the practice of circulating hard copies of preprints in astrophysics gradually died out, to be replaced by ever-increasing numbers of electronic articles. The arXiv does have some gatekeeping – in the sense that there are some controls on who can deposit a preprint there – but it is definitely far easier to circulate a preprint now than it was.

It is still the case that big institutions and collaborations insist on quite strict internal refereeing before publishing a preprint – and some even insist on waiting for a paper to be accepted by a journal before adding it to the arXiv – but there’s no denying that among the wheat there is quite a lot of chaff, some of which attracts media coverage that it does not deserve. It must be admitted, however, that the same can be said of some papers that have passed peer review and appeared in high-profile journals! No system that is operated by human beings will ever be flawless, and peer review is no different.

Nowadays, in astrophysics, the single most important point of access to scientific literature is through the arXiv, which is why the Open Journal of Astrophysics was set up as an overlay journal to provide a level of rigorous peer review for preprints, not only to provide a sort of quality mark but also to improve the paper through the editorial process.

So is the preprint the future of science? I think that depends on how far ahead you are willing to look. In my opinion we are currently in an era of transition trying to shoehorn old publishing practices into a digital world. At some point in the future people will realize that the scientific paper itself – whether a preprint or not – is an outmoded 18th Century concept and there are far more effective ways of disseminating scientific ideas and information at our fingertips if only we stopped living in the past.

8 Responses to “Thirty Years of Preprints”

  1. As to why not everyone uses arXiv, and why its use varies from field to field, the best way to find out is to ask colleagues, in your fields and other fields, why they don’t use it.

    Since The Open Journal of Astrophysics allows papers to be submitted and accepted even if they are not on arXiv, but cannot guarantee that a paper accepted by TOJA will be allowed by arXiv to appear in astro-ph, and since TOJA already has a website, disks are cheap, and the size of a PDF of a paper is very small compared to the size of a modern disk, why not offer local hosting in addition to arXiv? For those who don’t use arXiv it would be the only way of official publication, but others might want to use it as well. TOJA already has the accepted PDF version of the paper in its system, so it is literally just copying it to some obviously named directory (if that) and adding a link to it in addition to the link to arXiv (if the latter exists for the given paper).

    The above would be a very minor addition which would make The Open Journal of Astrophysics attractive to a much larger pool of authors.

  2. “At some point in the future people will realize that the scientific paper itself – whether a preprint or not – is an outmoded 18th Century concept and there are far more effective ways of disseminating scientific ideas and information at our fingertips if only we stopped living in the past.”

    The key reason we haven’t moved beyond the traditional journal article is that the scientific community has “outsourced” the assessment of research and researchers to the journals. Published in Nature or Science? It’s got to be good, right? Until we stop assessing research via the brand name of the journal, we’re not going to see the demise of the traditional publishing industry any time soon.

    …and I’m just as complicit as anyone in this: “Addicted To The Brand: The Hypocrisy Of A Publishing Academic”

    • There are several issues here. First, the main function of journals is quality control. If you think that that can happen without journals, spend a couple of days at viXra and let me know whether you found anything interesting. They are a useful filter. Having an article published in a good journal shouldn’t be the last word in your judgement of it, but rather the first, at least in many cases.

      Second, suppose 100 people apply for a job and have 100 papers each. Will you read 10,000 papers and form your own opinion? No. Even if the people are all in your field, you won’t have read all 10,000 papers.

      Third, there is a huge difference between journals owned by publishers (who also have the copyright on the articles), often with inflated prices, and professional-society journals, where the author (or some non-profit organization, such as ESO in the case of A&A) retains copyright.

      Fourth, if traditional journals are so bad, why hasn’t anyone come up with an alternative concept which actually works? The Open Journal of Astrophysics comes close, and having published an article there I can vouch for the high quality and ease of processing. However, its reliance on arXiv limits its usefulness in a big way, and is perhaps the main reason more people haven’t published there. I would certainly submit to TOJA again if proper publication could be guaranteed after acceptance (which is not too much to ask of a journal), but reliance on arXiv prevents that. http://www.astro.multivax.de:8000/helbig/research/publications/arxiv/why_no_arxiv.html

    • True. But the assessment by referees is very important for junior scientists. Non-refereed preprints by senior, well-respected scientists will be read with interest by the community. An unknown PhD student trying the same is more likely to be ignored.

  3. Some institutions had circulation lists, and you could also ask authors to place you on their distribution list for preprints. Also when requesting an offprint (the hard copy of the paper that authors would obtain from the publisher), you could ask the author to send you copies of any other relevant papers, including preprints. (Getting offprints saved on photocopying!)

    I wonder if conferences were more important back then? There were fewer of them (perhaps because flights were so expensive?), but they provided a venue to get the most up-to-date information on research by other people/groups.

    • The purposes of conferences have certainly shifted. In the old days, one could get really new information; these days, someone might say: This is an old plot; I put the newest one on arXiv this morning. That allows for two other reasons to take up more time: getting an overview of fields in which one is interested but does not work, and the personal contact with colleagues. Both are important.

      Were there really fewer conferences back then? And when was then? In the second half of the 1990s, I used to go to about 5 or 6 conferences per year, and there were usually a couple more which I couldn’t make. These days, there are only about 2 or 3 per year which interest me, and I usually go to all of them.

      A quick web search reveals many conferences today, but many are fake. Yes, really; it’s apparently a big industry.

      • I’m really talking about the 1980s (I’m old), when e.g. a flight to the US was around the same price as now, if not more expensive. Flying has become a lot cheaper since then.

  4. “exponentially increased 63 times in 30 years”. I might query this. e^63 times (presumably) 1 paper is quite a big number. If each preprint consists of only 1 page of A4 (5 g), that is about twice the mass of the Earth.

    I have noticed another way preprints were used in the olden days. They were added to the publication numbers of an institute, and since normally the journal paper would appear in the year after the preprint, they could be included twice. An easy way to double the institute’s publication rate in the annual report.

    I think arXiv is essential now. So is open access for the final published version. But I am happy to (and prefer to) wait until a paper has been reviewed before putting it out on arXiv.
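
The arithmetic in comment 4 can be checked with a short Python sketch. It assumes, as the comment does, one 5 g sheet of A4 per preprint, and takes the mass of the Earth to be roughly 5.97 × 10^24 kg; the function names are illustrative only. It also works out what the abstract's figure would imply if it is read as a 63-fold increase over 30 years rather than a factor of e^63, which is an interpretation and not something stated in the text quoted above.

```python
import math

# Assumptions (not taken from the paper): reference Earth mass, and the
# comment's figure of 5 g for a single sheet of A4.
EARTH_MASS_KG = 5.972e24
PAGE_MASS_G = 5.0


def literal_reading_in_earth_masses(pages_per_preprint=1):
    """Mass of e^63 one-page preprints, expressed in Earth masses."""
    n_preprints = math.exp(63)
    mass_kg = n_preprints * pages_per_preprint * PAGE_MASS_G / 1000.0
    return mass_kg / EARTH_MASS_KG


def annual_growth_rate(factor=63.0, years=30.0):
    """Constant yearly growth rate that gives `factor`-fold growth in `years`."""
    return factor ** (1.0 / years) - 1.0


def doubling_time_years(factor=63.0, years=30.0):
    """Doubling time implied by `factor`-fold exponential growth in `years`."""
    return years * math.log(2) / math.log(factor)


if __name__ == "__main__":
    print(f"e^63 preprints: {literal_reading_in_earth_masses():.2f} Earth masses")
    print(f"63-fold in 30 years: {annual_growth_rate():.1%} per year")
    print(f"Doubling time: {doubling_time_years():.1f} years")
```

Under these assumptions the literal reading comes out at a little under two Earth masses, in line with the comment, while the 63-fold reading corresponds to growth of roughly 15% per year, i.e. a doubling about every five years.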
