September 10, 2022

Small steps toward improving research reliability

Peter Attia

In last week’s newsletter, I highlighted some of the implications of the recent exposé of fabricated data in a heavily cited paper in the field of Alzheimer’s disease research. The flagrant misconduct evident in that case has naturally left many asking the same question: “How does this happen?” Though retracted papers account for only about 0.04% of all scientific research, the Alzheimer’s case shows how even a fairly small number of fraudulent works can impose significant costs by misdirecting future research and funding, and how they can, understandably, undermine public trust in science. So can anything be done to prevent these problems in the future?

Unfortunately, scientific publishing is riddled with myriad problems, many of which likely can’t be solved without completely rethinking current processes and the underlying research culture. However, there are still small, short-term changes – such as fraud-detection practices and data/methodology sharing – that are relatively easy to implement and can yield meaningful improvements in research integrity. As a case in point, let’s take a closer look at image falsification, the type of data fabrication at issue in the paper discussed last week, to see how a particular problem can be characterized and how technology and publishing policies might be used to minimize it.

The Image Problem: Honest Mistake or Falsification?

It’s impossible to know exactly what share of research misconduct involves image manipulation, but 2-3% of researchers self-report having fabricated or falsified data of any kind. Though image falsification accounts for only a fraction of misconduct, a database analysis attributed an average of 15% of all papers retracted from 2010 to 2015 to problematic images. As exemplified by the retracted Alzheimer’s study, doctored or misused images are a common issue in published research, particularly in fields like cell and molecular biology, where images such as Western blots and cell photographs are easily duplicated or modified. A 2016 analysis of more than 20,000 papers in 40 journals from 1995 to 2014 estimated that 3.8% of publications contained a problematic image. Some inappropriate images result from an honest mistake, such as inserting the wrong image while assembling the manuscript. Others are intentional falsifications intended to present a desired – rather than true – research outcome.

So how many problematic images are accidental, and how many are intentional misrepresentations? A more focused follow-up study, which reviewed images from 960 papers published in the journal Molecular and Cellular Biology from 2009 to 2016, provides some answers. This study found that 6% of papers contained inappropriately duplicated images. However, the majority of these problematic images (>69%) were simply mistakes made during figure assembly, and an image correction was issued after the publisher reviewed the primary data. Only 5 of the 960 reviewed papers (0.5%) were ultimately retracted, due to either the number of errors or a suspected intention to mislead.

Are there ways to prevent image falsification?

Several journals have implemented policies over the past 20 years to deter falsified images. These policies range from spot checks to scrutiny of every published image. Despite these additional fraud detection measures, the number of manipulated images in accepted papers has remained the same or increased in recent years. Increasingly sophisticated image manipulation software has created a virtual competition between production and detection of doctored images, as the task of noticing problems is non-trivial, even for a well-trained eye. (For some examples – or to check your own skills at spotting fakes – check out the manipulated images in this article.)  

Ideally, publishers would have software that can scan for problematic images automatically. Most publishers already employ such software to scan texts for plagiarism, another common form of misconduct in scientific publishing. For example, submitted manuscripts are run through artificial intelligence software that scans for text lifted from other works without citation of the original. But comparing sentences is, in practice, much more straightforward than comparing images, even for a computer.
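To give a feel for why sentence comparison is the easier problem, here is a minimal sketch in Python. It is not how any publisher's plagiarism software actually works; it is a toy illustration that assumes a simple word-trigram overlap measure, and the function names (shingles, overlap_fraction) are hypothetical ones chosen for this example.

```python
import re


def shingles(text, n=3):
    """Return the set of overlapping n-word sequences in the text."""
    # Lowercase and keep only word-like tokens so punctuation is ignored.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_fraction(submitted, source, n=3):
    """Fraction of the submitted text's n-word sequences found in the source."""
    submitted_shingles = shingles(submitted, n)
    if not submitted_shingles:
        # Text shorter than n words has nothing to compare.
        return 0.0
    shared = submitted_shingles & shingles(source, n)
    return len(shared) / len(submitted_shingles)


source = "Doctored images are a common issue in published research"
submitted = ("We note that doctored images are a common issue "
             "in published research today")
print(round(overlap_fraction(submitted, source), 2))
```

A high fraction only flags a passage for a human to review; properly quoted and cited text would score high too.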

Image analysis software can detect duplicate images even if they have been rotated or mirrored. However, automated image detection still requires human confirmation because even the simplest case, exact duplication, has acceptable uses. For example, an image of a Western blot control band may appear multiple times within a paper if each appearance refers to the same experiment. In contrast, reusing the same control band for two different experiments would be problematic. More complex image manipulations, such as splicing together multiple images, are more difficult to detect with software.
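The simplest case described above, an exact duplicate that has been rotated or mirrored, can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not the method used by real image-forensics tools: images are represented as small grids of pixel values, and only exact pixel-for-pixel matches count. Real images would also need tolerance for compression, resizing, and cropping. The function names are hypothetical.

```python
def to_grid(image):
    """Normalize a nested list of pixel values into a hashable tuple of tuples."""
    return tuple(tuple(row) for row in image)


def rotate90(grid):
    """Rotate the grid 90 degrees clockwise."""
    return tuple(zip(*grid[::-1]))


def mirror(grid):
    """Flip the grid left to right."""
    return tuple(row[::-1] for row in grid)


def orientations(image):
    """All versions of the image reachable by 90-degree rotations and mirroring."""
    variants = set()
    current = to_grid(image)
    for _ in range(4):
        variants.add(current)
        variants.add(mirror(current))
        current = rotate90(current)
    return variants


def is_duplicate(image_a, image_b):
    """True if image_b is an exact rotated and/or mirrored copy of image_a."""
    return to_grid(image_b) in orientations(image_a)


original = [[1, 2, 3],
            [4, 5, 6]]
rotated_copy = [[4, 1],
                [5, 2],
                [6, 3]]
print(is_duplicate(original, rotated_copy))
```

Note that a match here says nothing about intent. As with the control-band example, a person still has to decide whether the reuse is legitimate.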

Universal publishing standards

At the end of 2021, eight major publishers collectively issued long-overdue guidelines for spotting manipulated images, along with standard procedures for how to proceed depending on the level of image doctoring. The intention is to detect more problematic images during peer review and editing, before publication. By applying a higher level of upfront scrutiny, publishers aim to reduce the number of images that the scientific community has to report after publication.

Even though only a small number of publications contain falsified data, it is imperative to keep improving the scientific publishing process and its standards in order to uphold scientific integrity. While the results of any single study won’t hold much weight over time if they can’t be repeated, a lot of time and money is wasted when other researchers try to reproduce fraudulent results. Some journals and funding sources are starting to require sharing of all raw data as well as the code used in published data analysis. The overhead for reproducing results would be that much lower if making both raw data and code available became a universal requirement of scientific publication.

Looking forward

As long as the scientific research community continues to live by the “publish or perish” maxim and the scientific publishing community continues to prioritize flashy, positive results, there will be some small number of researchers who find it irresistible to cheat the system with falsified data. Ideas for improving scientific culture and the publication process have included shifting the peer-review process toward evaluating papers based on methodology instead of results, adapting the purpose of journals toward curation, and publishing “living notebooks” rather than static articles, the last of which I want to dive into further in the future.

However, changing the whole system would require overcoming the inertia of the current publishing process, on the part of both the publishers and the researchers preparing the manuscripts. While that change might someday take place, there are intermediate steps that can improve the current process in the meantime, such as increased scrutiny of published images and greater transparency through detailed methodology and data sharing. We can’t completely overhaul the scientific publishing process overnight, but incremental changes and fraud-detection measures are achievable short-term steps toward improving research reliability.

– Peter
