Barring any major systemic changes, the cost of irreproducibility has probably gone up since 2015. The American pharmaceutical industry spent an estimated $83 billion on research and development (R&D) in 2019 (Congressional Budget Office 2021), and if half of that research was irreproducible, that would equate to more than $40 billion in excess costs. Considering the U.S. accounts for roughly 45% of global early-stage R&D (IQVIA Institute 2021), and assuming reproducibility rates are similar around the world, a back-of-the-napkin calculation indicates that some $90 billion is spent on irreproducible research globally each year. Some variation is unavoidable, and there are studies that by nature cannot be repeated. But if a study cannot be reproduced because of poor design, execution, or reporting, real people can be put at risk without giving us any useful information (Relias Media 2022).
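The arithmetic behind that estimate can be laid out in a few lines. This is only a sketch of the calculation described above: the 50% irreproducibility share is the hypothetical used in the text rather than a measured value, the function and variable names are illustrative, and the 45% early-stage share is applied to total spending as a simplifying assumption.

```python
# Back-of-the-napkin estimate using only the figures quoted in the text.
US_RD_SPEND_BN = 83.0        # U.S. pharma R&D spend in 2019, $ billions (CBO 2021)
IRREPRODUCIBLE_SHARE = 0.5   # "if half" of research is irreproducible (hypothetical)
US_SHARE_OF_GLOBAL = 0.45    # U.S. share of global early-stage R&D (IQVIA 2021)


def irreproducible_cost(us_spend_bn, irreproducible_share, us_share_of_global):
    """Return (U.S. cost, global cost) of irreproducible research in $ billions."""
    us_cost = us_spend_bn * irreproducible_share
    # Scale the U.S. figure up to a global one, assuming the same
    # irreproducibility rate everywhere.
    global_cost = us_cost / us_share_of_global
    return us_cost, global_cost


us_cost, global_cost = irreproducible_cost(
    US_RD_SPEND_BN, IRREPRODUCIBLE_SHARE, US_SHARE_OF_GLOBAL
)
print(f"U.S.: ${us_cost:.1f} billion; global: ${global_cost:.1f} billion")
```

Changing any of the three inputs shows how sensitive the headline number is to the assumptions behind it.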
Meanwhile, a 2019 report from Deloitte found that the biopharmaceutical industry is facing major productivity challenges and diminishing returns. Biopharma companies are taking longer than ever to develop new drugs, and as medicine becomes more and more personalized, these drugs are becoming more expensive while reaching smaller groups of patients (Deloitte 2019).
Most preclinical research is replicated before it moves on to a clinical context, and each of these preclinical studies takes anywhere from three months to two years to complete (Freedman et al. 2015). These timelines, as well as the clinical trials that follow, are lengthening as drug development and drugs themselves continue to become more complex (Deloitte 2019).
Experts agree that the current high-risk, high-cost R&D model for biopharma is unsustainable. Together with the reproducibility problem, many would argue that we are in a research crisis (Baker 2016; Stupple et al. 2019).
The 2015 PLoS Biology study estimated that 27.6% of failures are due to flawed study design, 25.5% to data analysis and reporting, and 10.8% to poor laboratory protocols (Freedman et al. 2015). But poor tools were the biggest culprit. The study estimated that over a third of irreproducible research – 36.1% – fails because subpar biological reagents and reference materials are used.
Improving research reproducibility, especially in the age of large-scale drug development, is an area of intense study. Several individual and systemic solutions have been proposed to combat the reproducibility crisis. These solutions serve both to prevent researchers from pursuing irreproducible studies and to help them improve the quality of their own work.
Combating Data Misuse
Some failure is inevitable. That is why researchers accept a conventional 5% false-positive rate: in other words, a significance threshold, or p-value cutoff, of 0.05 (Vidgen & Yasseri 2016).
But in the “publish or perish” culture of research (Everett & Earp, 2015), scientists, especially early in their careers, feel the pressure to squeeze out novel, groundbreaking results with low p-values. Noteworthy studies are more profitable, both in academia and in industry. And while flat-out scientific fraud is probably uncommon, there are other research behaviors that fall in a gray area. These include data dredging (or p-hacking) (Parry 2021), selective reporting (Department of Health & Human Services 2015) and hypothesizing after results are known (HARKing) (Wilson 2021).
Cherry-picking data – intentionally or otherwise – is especially prevalent in biopharmaceutical research, where, during the exploration of huge swathes of data, it can be easy to find associations that may appear statistically significant but are not actually meaningful.
Some experts think more stringent cutoffs are necessary in deciding whether a finding is noteworthy. For example, lowering conventional significance levels from 0.05 to, say, 0.005 could make p-hacking more difficult (Benjamin, et al., 2017). Others disagree, saying that rather than redefining the threshold, p-values and other ways of calculating significance should be justified on a case-by-case basis (Lakens et al. 2018).
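To see why the threshold matters when many associations are examined at once, consider the following sketch. It assumes that every tested association is truly null and that the tests are independent, which real data sets rarely satisfy exactly; the function names are illustrative.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of 'significant' results when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance of at least one false positive across independent null tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


for alpha in (0.05, 0.005):
    for n_tests in (1, 20, 100):
        expected = expected_false_positives(n_tests, alpha)
        at_least_one = prob_at_least_one_false_positive(n_tests, alpha)
        print(
            f"alpha={alpha}, tests={n_tests}: "
            f"expected false positives={expected:.2f}, "
            f"P(at least one)={at_least_one:.1%}"
        )
```

Under these assumptions, 20 comparisons at the 0.05 level produce at least one spurious "finding" roughly 64% of the time, while the same 20 comparisons at 0.005 do so less than 10% of the time. A stricter cutoff reduces the problem without eliminating it, which is part of why some argue for case-by-case justification instead.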
Many experts and organizations have also proposed a paradigm shift to prioritize replication studies rather than just original ones. Concrete solutions include the provision of more funding specifically for replication studies (De Vrieze 2017) and repeat clinical trials (National Institutes of Health 2014), as well as requiring that more research students perform high-quality replication studies for their coursework (Frank & Saxe 2012), thesis (Quintana 2021), and early-career publications (Everett & Earp 2015).
Improving Study Design
Experts are calling for more rigorous training programs at academic institutions for teaching best practices in basic research skills and experimental planning, as well as requiring continuing education and certifications for principal investigators to receive funding. Some life science technology companies also offer teaching resources, webinars, tutorials, practical guides, and protocols to improve research literacy in science students and investigators alike.
Data misuse or manipulation can be a part of flawed study design. But a potentially irreproducible paper may also lack the appropriate controls, use the wrong statistical tests altogether, fail to repeat an experiment within a given study, or omit outlier experimental runs without justification (Begley 2013). Investigators should look carefully for these hallmarks of questionable studies on a case-by-case basis before they try to build on that research.
Eliminating bias is another important component. Studies have shown that when the same investigators have tried to reproduce their own experiment, they often could not. In many cases, the only difference was that, the second time around, they were blinded to which samples were test samples and which were controls (Begley & Ioannidis, 2015). Experts think that making blind experiments the standard would have a big impact on the reproducibility crisis.
Bias towards positive results also factors into improving peer review. Some experts support the concept of a “result-blind” peer evaluation process (Locascio, 2017), where reviewers would first be given only the introduction and methodology for a given paper, without the results. In theory, this would ensure that reviewer recommendations are based purely on the rigor of the paper’s experimental design and the importance of the research question itself, not on whether the results were positive or “newsworthy.”
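One simple way to blind an experiment is to have someone outside the bench work relabel the samples with neutral codes and hold the key until the data are collected. The sketch below illustrates that idea only; the function name, the code format, and the sample IDs are all hypothetical.

```python
import random


def blind_samples(sample_groups, seed=None):
    """Assign neutral codes to samples.

    sample_groups maps each sample ID to its group ("test" or "control").
    Returns (codes, key): `codes` is the list handed to the experimenter,
    and `key` maps each code back to (sample ID, group). The key should be
    held by a third party until the measurements are complete.
    """
    rng = random.Random(seed)
    sample_ids = list(sample_groups)
    rng.shuffle(sample_ids)  # break any link between code order and group
    key = {
        f"S{i:03d}": (sample_id, sample_groups[sample_id])
        for i, sample_id in enumerate(sample_ids, start=1)
    }
    return list(key), key


# Hypothetical sample sheet.
samples = {"mouse-A": "test", "mouse-B": "control",
           "mouse-C": "test", "mouse-D": "control"}
codes, key = blind_samples(samples, seed=42)
print(codes)  # the experimenter sees only these neutral labels
```

Because the codes carry no group information, the person running the assay cannot tell test samples from controls until the key is released.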
A lack of standardization can also cause major inconsistencies in pharmacological research. Experts have estimated that a more stringent use of standards and best practices could save billions per year (Freedman, Cockburn, & Simcoe, 2015). This includes adopting standard practices not just for protocols and reagents, but for methods of analysis as well (Haibe-Kains, et al., 2013).
The intense pressure to produce attention-grabbing, positive findings has resulted in what psychologists call “the file drawer effect” – the filing away of results that do not support researchers’ hypotheses (Apple 2017). More transparency from the very beginning, experts say, would make results more difficult to manipulate, experiments easier to replicate, and fruitless efforts less likely to be duplicated (Vidgen & Yasseri 2016).
These experts have suggested making proposed hypotheses, methods, and analyses open access via the Open Science Framework data repository, for instance, before studies or clinical trials are initiated. This change would hold researchers accountable if, for example, they preregistered a significance threshold of 0.01 but later reported results against a threshold of 0.05, or if they failed to publish the results of a preregistered clinical trial simply because they didn’t find anything noteworthy (Dickersin & Rennie, 2003).
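The kind of accountability check that preregistration makes possible can be sketched as a comparison between what was registered and what was reported. The record types and field names below are hypothetical and do not reflect the data model of the Open Science Framework or any other registry.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Preregistration:
    study_id: str
    alpha: float  # significance threshold committed to before the study


@dataclass(frozen=True)
class Report:
    study_id: str
    alpha: float  # threshold actually used in the published analysis
    results_published: bool


def audit(prereg: Preregistration, report: Optional[Report]) -> List[str]:
    """List the discrepancies between a preregistration and its report."""
    issues = []
    if report is None or not report.results_published:
        issues.append("results not published")
    if report is not None and report.alpha > prereg.alpha:
        issues.append(
            f"threshold loosened from {prereg.alpha} to {report.alpha}"
        )
    return issues


prereg = Preregistration(study_id="trial-001", alpha=0.01)
print(audit(prereg, Report("trial-001", alpha=0.05, results_published=True)))
print(audit(prereg, None))  # a preregistered trial that never reported
```

A check like this is only possible when the plan was made public before the study began, which is the core of the argument for preregistration.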
Furthermore, when multiple teams of researchers can access the same data sets through data sharing platforms, their research becomes a collaborative effort rather than a competitive one. In fact, both the National Science Foundation and the National Institutes of Health have issued firm statements that investigators should disclose data sets, but these statements are not strictly enforced (Begley & Ioannidis 2015).
Bringing Reagents Up to Par
Even with meticulous methodology, a chef is only as good as their ingredients.
Antibodies, for example, are the foundation of a huge proportion of life sciences research. This is especially true now that the market for therapeutic antibodies has boomed in recent years (Deloitte 2019). Experts have called antibodies a major driver of the reproducibility crisis (Baker 2015), both because so many poor-quality products on the market lead to inconsistent results and because many labs are not validating antibodies after purchasing them.
Cell lines are another huge part of preclinical research. However, they can often be misidentified or cross-contaminated. The use of a faulty or incorrect cell line can derail an entire study, yet many labs still do not authenticate their cell lines, despite the fact that it only costs a few hundred dollars per assay – a low cost for a potentially critical safeguard (Freedman et al. 2015).
Investigators should use only those vendors that offer validated reagents, including antibodies, primers, and other assays and kits. Investigators should also use validated equipment that is held to U.S. Food and Drug Administration standards, so that its software is regulated and reliable.
Vendors should also provide quality control reports and certificates of analysis so that investigators can keep track of batches and lots. When producing biologics, which comprise a fast-growing proportion of the drug development market, maintaining purity and minimizing batch-to-batch variability can be difficult (Deloitte 2019). When attempting to scale up the production of these complex therapies, consistency and reliability of reagents are critical to minimize unnecessary cost.
Automation and Digital Technology
As technology advances, experts expect automation to become a bigger part of R&D. As pharmaceuticals – which make up some 60% of life sciences research spending in the U.S. – lean more towards huge data sets, automation and digital technologies can significantly reduce timelines (Deloitte 2019).
Take therapeutic antibodies, for example. The development of antibodies can be streamlined at several stages with the right equipment. Digital PCR can be used to more efficiently develop stable cell lines while using fewer supplies, and flow cytometers can be upgraded to allow automated, high-throughput antibody screening. Digital PCR and automation also have applications in the development of other biologic stalwarts of the biopharmaceutical market, including cell and gene therapies and vaccines (Deloitte 2019).
Digital technology also applies to data curation and everyday laboratory life. Replacing or supplementing paper lab notebooks with digital ones allows investigators to easily find and reproduce data, connect it to their instruments using cloud-based software, work remotely, and consolidate resources as well as easily share them with collaborators.
Using cloud-based instruments also reduces the need to keep upgrading computers’ operating systems to keep up with cutting-edge software. Streamlining the lab and reducing human error through digital technology have big long-term implications both for research quality and for efficient spending.
Downstream Effects of Experimental Failures
A reproducibility crisis affects more than the bottom line. Each time a promising life science paper is released, it gives hope to clinicians, patients, and their families who are waiting for disease cures. But flawed experimental designs, subpar reagents, and overinflated results send other scientists on a wild goose chase. The more time we spend pursuing irreproducible experiments, the more we delay the release of a lifesaving drug.
Experimental failures also undermine public trust in the value of research, the strength of peer review, and the soundness of the scientific process in general. Furthermore, even with the successful release of a product, the more time and resources are spent developing it, the more expensive it will be for hospitals and patients alike – adding to the already critical problem of healthcare affordability.
With reliable equipment, dedication to better study design, and careful selection of high-quality reagents, researchers can play a big part in improving life sciences research and, ultimately, deliver the best products to patients in need.
BIO-RAD is a trademark of Bio-Rad Laboratories, Inc. All trademarks used herein are the property of their respective owners. © Bio-Rad Laboratories, Inc.