A major impediment to the scientific endeavor today is a lack of transparency, communication, and public visibility. The process and products of scientific research are often hidden behind paywalls at major academic journals and are therefore inaccessible to the public. The current paradigm of scientific communication is peer review: a scientist or a group of scientists submits findings to an academic journal so that the knowledge may be disseminated. However, before the findings are published, the journal contacts a small number of other experts in the field concerned, who evaluate the submitted work for rigor and novelty. These reviewers in turn provide critiques and suggestions to the original scientists, who must then build upon the feedback to improve the body of work. Although peer review is an integral and crucial step of quality control in the scientific process, it often takes months, and sometimes years, slowing down overall scientific progress.
In 1991, the fields of mathematics, physics, and computer science came up with a partial solution to this problem: arXiv.org, an online repository and forum to store, disseminate, and discuss preprints, which are scientific manuscripts and communications prior to peer review. While there is increasing recognition of the role preprints play in the future of scientific communication, the life sciences have been indisputably behind the curve [1, 2]. However, this is rapidly changing, and at the forefront of the revolution is Jessica Polka, Ph.D. She is currently a visiting scholar at the Whitehead Institute and director of ASAPbio, a biologist-driven project to promote the productive use of preprints in the life sciences. I recently had the opportunity to speak with her about the rise of preprints in the life sciences. The following interview has been edited for clarity.
Srivats Venkataramanan (SV): On the webpage, you state “ASAPbio is a scientist-driven initiative to promote the productive use of preprints in the life sciences.” Firstly, what is a preprint, and how does it differ from a traditionally peer-reviewed article?
Jessica Polka, Ph.D. (JP): A preprint is a mechanism that researchers use to communicate with each other more quickly. When a researcher prepares a manuscript to submit to a journal, they typically have to wait months or sometimes even years for what can be a very long, opaque and convoluted editorial process. The manuscript, and the results therein, are more or less hidden from the scientific community during the entire process, which includes multiple rounds of peer review, editorial rejection, manuscript revision, and manuscript resubmission. This slows down communication of research and results, and ultimately the progress of science. A preprint is a solution to this problem. It is usually the same manuscript that is going to be, or has been, submitted to a journal, but posted online to a preprint server.
SV: How did you get involved with ASAPbio and preprints?
JP: During my postdoc, I became increasingly interested in the factors that contribute to the efficiency of the scientific enterprise, especially as it pertains to the training of early career researchers, such as graduate students and postdocs. When thinking about all these questions, it is impossible to avoid the topic of publications. Publications dictate the entire reward system and incentive structure of the scientific enterprise. I am a member of the steering committee of a group called ‘Rescuing Biomedical Research’ (RBR), which was born from an article in 2014. The goal of the group is to find ways in which biomedical research can be improved.
Ron Vale, whom I knew from my time as a graduate student at UCSF, was also a member of that group. He had analyzed the time to publication and graduation for graduate students at UCSF. [SV: He found that the time to publication (and therefore, time to graduation) for trainees in the life sciences had increased significantly over the last 30-odd years, likely as a consequence of increases in the amount of data expected per ‘unit’ of publishable material, conventionally a peer-reviewed journal article.] He wrote an article, first posted as a preprint itself and then published in PNAS [the scientific journal Proceedings of the National Academy of Sciences], about these phenomena, and proposed preprints as a solution to this problem and as a way to accelerate the communication of results in the life sciences [4, 5]. He recruited three other members of the RBR committee, including myself, to organize an ASAPbio meeting in February 2016, with the express purpose of beginning a conversation about preprints in the communication of life sciences research. During that meeting, there was an enthusiastic response among many different groups, and a consensus was reached that preprints were, by and large, a benefit to the biomedical and life science research community. We laid out a roadmap of the work to be done to help preprints develop in a productive way.
SV: Speed of communication is an obvious benefit of preprints. Are there any potential pitfalls?
JP: There are certainly concerns. Preprints do not offer the same quality control that peer review does. This is a delicate problem. The quality of peer review is not the same between journals. It may not even be similar between two manuscripts within the same journal. Some manuscripts may undergo extensive and rigorous peer review that makes them more solid, whereas others that enter the literature may have gone through a less rigorous process. Preprints are not intended as a replacement for journals. They are a complement to the journal system; we need to have a mechanism to select work, curate it, and provide structured feedback. That is the peer review process.
There is an internal concern among researchers about getting scooped. That has begun to diminish, in my opinion, and will continue to do so over time. There is also the concern that posting a preprint will mean that the article won’t subsequently be accepted by a journal. That concern is also diminishing as journal policies adapt to this new environment. However, there is a more serious concern about the quality of preprints. I believe that peer review improves manuscripts, so the quality of a manuscript initially posted as a preprint will improve as it goes through the peer review process. But the benefit of communicating early is worth the need for readers to do their own due diligence. Peer review itself is not a perfect filter. We should therefore be aware that preprints are unrefereed and unedited when we look at them. Hopefully, through that lens, those of us who can gain some benefit by reading these preprints can do so. Any reader should approach a preprint with the recognition that it is a communication between scientists that hasn’t gone through the rigorous process of peer review.
SV: To go back to one of the concerns you mentioned earlier, is it accepted within the scientific community that preprints establish priority? Traditionally, the scientific community is concerned with who was the first to come up with an innovative idea or novel discovery. While some may argue that this shouldn’t matter within the broader context of the scientific enterprise, the reality is that priority and credit are still important to the careers of scientists, since funding decisions are still largely made along those metrics.
JP: Recent policy changes at funding agencies have created more recognition of the value of preprints. For example, at the end of March, the NIH released a statement indicating that preprints and other “intermediate research products” can be cited in a grant application in the same manner as a paper. The popularity of preprints will definitely increase with this kind of formal acknowledgement of their scholarly legitimacy, or their establishment as part of the scientific currency.
The question about priority is an interesting one. On one hand, the preprint provides authors with the power to transparently disclose to the community what they know at a certain time and date. We’ve all heard about cases where authors submit competing manuscripts to two different journals; one might get rejected and have to be resubmitted. In that case, the manuscript gets a new “submitted on” date, which can make it seem that the work was done later than it was, and the editorial process can in general obscure what work was done when. Having a preprint provides an indelible history. However, I am wary of this idea of priority. Focusing on who was first makes science unnecessarily competitive, and neglects the nuances of when an assertion was actually backed up with solid evidence and high-quality data. I would hate to see this become a medium that allows people to stake premature priority claims. I would rather that we read and evaluate preprints when assigning credit. This is actually very similar to the problem we have now with journals. Someone can publish a journal article with less information in it, in a journal that is less stringent, and thereby ‘scoop’ someone who publishes more rigorous and complete information later. The problem of obsessing over priority and who’s first is not unique to preprints, but that medium provides more control and transparency to a process that can sometimes be clouded by the vagaries of the editorial process.
SV: While I am largely sold on the idea of preprints, I do have one concern. They differ from peer-reviewed journal articles in two major ways. First, and you’ve mentioned this, they haven’t gone through peer review and are thus more likely to contain mistakes or oversights. Secondly, unlike most journals, they aren’t behind a paywall, which makes them more accessible. Is there a possibility that this combination of decreased rigor and increased accessibility could be potentially dangerous in terms of disseminating misinformation? Are there safeguards against this in terms of moderation within the community?
JP: Preprints are moderated before posting. On servers such as PeerJ, bioRxiv and others, community members (mostly other scientists) cursorily screen preprints before they’re posted to ensure that they are scientific papers and don’t contain egregious, potentially damaging information. It is true that the public has more access to preprints and less to the peer-reviewed, published paper. I would argue that the solution is in fact increasing access to the final paper, not decreasing access to preprints, which would prevent information from flowing freely where it needs to within the scientific community. Preprints on some servers (such as bioRxiv) contain a link to the subsequently published paper. Even people who don’t have access to the material behind the paywall will be able to at least read the abstract. If there are any major changes to the conclusion of the paper, that should be reflected in the title or abstract. We all need to draw our own conclusions, but I would hope that the broader public could have access to the same high-quality information that we as scientists do.
There are other ways to address the challenge of highlighting the difference between a published paper and a preprint. One way to do that would be to create a section or a notice on a preprint server accompanying every published preprint that highlights what those differences are. Basically, an authors’ note that highlights the changes between the preprint and the published version.
SV: What do you see the paradigm of biological science literature moving to 10 years from now? What would you like it to be, and what do you think it will really be?
JP: 10 years is an interesting timeframe. In the longer-term future, one could envision a system where researchers post their scientific contributions (a paper, a single figure, a method, a hypothesis), where we have the potential to make smaller contributions to the global knowledge base and get credit for those contributions in a manner that is more rapid and incremental. This would allow multiple scientists to collaborate and contribute to what we now know as a single paper. Part of the challenge of the next 10 years is the problem of increasing information overload. Journals in the life sciences are aware that preprints have been around in physics for 25 years, and that the existence of preprints does not diminish the need for journals in that field. It is already impossible for a person to read all the relevant literature in their area, and this will only get harder. We need better tools to read and comprehend the literature, and a lot of these tools will come from innovations in software and machine learning. My hope is that more of the literature becomes accessible to text and data mining, which will enhance our ability to understand the literature beyond that of a single human reader.
Final thoughts (SV):
The rise of preprints has certainly made the process of science more accessible to the public because preprints are not hidden behind a paywall, as most conventional academic journals are. However, as this movement gathers momentum, it would serve us well to keep Dr. Polka’s words of caution in mind. While preprints are a wonderful means of rapidly communicating one’s science, they are not peer-reviewed, and the potential for misuse is high. It is conceivable that preprints could be used to willfully disseminate pseudoscientific misinformation under the semblance of legitimacy; currently, the vigilance of the scientific community prevents this from occurring. However, it is also entirely possible that the information within preprints is incomplete or misleading due to honest mistakes or misinterpretations by the scientists concerned. Peer review, while by no means perfect in this regard, does provide an additional level of scrutiny that preprints lack. Therefore, while the contribution of preprints to transparency of the scientific process and rapid communication of results should rightly be applauded, the information within them should be handled with equal amounts of enthusiasm and due caution. With that caveat, the ball is well and truly rolling. Where it goes next depends on all of us.
Srivats Venkataramanan (@srivatsv)
Guest Contributor, Signal to Noise
Ph.D. Candidate, Molecular, Cell and Developmental Biology
Biology preprints over time. http://asapbio.org/preprint-info/biology-preprints-over-time/ (Accessed: 16 July 2017).
Alberts, B., Kirschner, M. W., Tilghman, S. & Varmus, H. Rescuing US biomedical research from its systemic flaws. PNAS 111, 5773–5777 (2014).
Vale, R. D. Accelerating scientific publication in biology. PNAS 112, 13439–13446 (2015).
SHERPA/RoMEO - RoMEO Statistics. http://www.sherpa.ac.uk/romeo/statistics.php?la=en&fIDnum=|&mode=simple (Accessed: 14 July 2017).
Reporting preprints and other interim research products. https://grants.nih.gov/grants/guide/notice-files/NOT-OD-17-050.html (Accessed: 16 July 2017).