Check out more content on this topic:
- (Oct 17, 2021) Ten errors in randomized experiments
- (Feb 26, 2022) Nutritional epidemiology: abolition vs defending the status quo
- (Feb 28, 2022) The science of obesity & how to improve nutritional epidemiology
Anyone who’s read my stuff with any regularity is acutely aware of my disdain for the way many observational studies are conducted and interpreted in health and nutrition research, as well as my admiration for randomized controlled trials (RCTs). Randomization, a method by which study participants are assigned to treatment groups by chance alone, is a critical component in establishing cause and effect. It helps to prevent investigators from introducing systematic (and often hidden) biases between experimental groups.
But there are also many ways in which randomized experiments can fall short. Recently, David Allison and his colleagues published an excellent review discussing ten errors in the implementation, analysis, and reporting of randomized experiments — and outlined best practices to avoid them. David is the Dean of Public Health at Indiana University, where he conducts research on obesity and practices psychology. He is also one of the best statisticians in the world, and will be joining me soon as a guest on the podcast. I’ve provided a brief summary of his review below, but to anyone interested in improving their ability to read and understand research, I suggest reading the original text in its entirety. Here, I focus on the general point that while RCTs may be considered the gold standard for establishing reliable knowledge, they are also prone to error and bias.
A) Errors in implementing group allocation
1 | Representing nonrandom allocation methods as random
Occasionally, in studies styled as “randomized,” participants are allocated into treatment groups by use of methods that are not, in fact, random.
The review authors provide the example of a vitamin D supplementation trial in which the control group came from a nonrandomized cohort from another hospital.
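Lack of appropriate randomization can introduce selection bias: participants are selected into the study, or into one of its groups, in a way that is not representative of the target population, so the groups may differ systematically before any treatment begins.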
A 2017 analysis by John Carlisle suggested that nonrandom allocation may be a concern in many studies labeled as “randomized.” One of the trials flagged was the well-known PREDIMED trial. Study participants at high cardiovascular risk were randomly assigned to a Mediterranean diet supplemented with mixed nuts or olive oil, or to a low-fat diet. In some cases, whole households were collectively assigned to the same diet. Even more problematic, one of the sites in the trial assigned entire clinics to the same diet. However, the investigators did not initially report this, and they analyzed their data at the level of individual participants rather than at the level of household or clinic. After discovering these problems in a post-publication audit, PREDIMED investigators retracted and reanalyzed the study, leading to various changes in findings.
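As a rough illustration of what allocation by chance alone looks like in practice, here is a minimal Python sketch. The function name `randomize_units`, the arm labels, and the participant and household IDs are all hypothetical, and this is not the procedure used in PREDIMED or any other trial mentioned here. The point it tries to make is that whatever gets passed in as a unit (a person, a household, a clinic) is the unit of randomization, and the analysis needs to know which one it was.

```python
import random

def randomize_units(unit_ids, arms, seed):
    """Assign each unit to an arm by chance alone (simple randomization).

    A seeded generator makes the allocation list reproducible for auditing.
    The name and arguments are illustrative assumptions, not a standard API.
    """
    rng = random.Random(seed)
    return {unit: rng.choice(arms) for unit in unit_ids}

# Hypothetical arm labels, loosely modeled on the three diets described above.
arms = ["mediterranean_nuts", "mediterranean_olive_oil", "low_fat"]

# Case 1: individuals are the unit of randomization.
participants = [f"P{i:03d}" for i in range(1, 13)]
by_person = randomize_units(participants, arms, seed=2021)

# Case 2: households are the unit. Everyone in a household shares one
# assignment, so the household (not the person) is what was randomized.
households = {"H1": ["P001", "P002"], "H2": ["P003"], "H3": ["P004", "P005"]}
by_household = randomize_units(list(households), arms, seed=2021)
person_to_arm = {
    person: by_household[house]
    for house, members in households.items()
    for person in members
}
```

Simple randomization like this does not guarantee equal group sizes, which is one reason real trials often use blocked or stratified schemes instead.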
2 | Failing to adequately conceal allocation
Allocation concealment hides the sorting of trial participants into treatment groups, preventing researchers from knowing the allocation of the next participant, and participants from knowing their assignment ahead of time. Allocation concealment is different from blinding. Allocation concealment ensures that the treatment to be allocated is not known before a participant is entered into the study, while blinding ensures that the participant or the investigator (or both, in the case of double-blinding) remains unaware of the treatment allocation after the participant is enrolled. Studies with poor allocation concealment are prone to selection bias.
Poor allocation concealment from participants can lead to bias when, for example, certain study participants prefer one possible treatment over another. Those participants may drop out of the study if they become aware that they will not receive their preferred treatment, potentially skewing the group populations.
Poor allocation concealment from investigators can also lead to bias. Researchers may — consciously or unconsciously — place participants expected to have the best outcomes in the treatment group and those expected to have poorer outcomes in the control group.
3 | Not accounting for changes in allocation ratios
When designing an RCT, one step in the process is determining the ratio in which subjects are allocated to each group. It’s not always 1:1 (that is, one subject assigned to treatment for every subject assigned to placebo). Sometimes it’s necessary from a statistical standpoint to assign twice (2:1) or three times (3:1) as many individuals to the treatment group as to the placebo group. Further, investigators may choose to change the ratios in the middle of a study for various reasons. However, changing the allocation ratio partway through a study requires corresponding changes to statistical analyses, which doesn’t always happen.
Dr. Allison gives the example of a study investigating body weight changes associated with daily intake of sucrose or one of four low-calorie sweeteners. Participants were initially randomly allocated evenly among the five treatment groups (1:1:1:1:1). Because one group had a high attrition rate, the investigators changed to a 2:1:1:1:1 ratio halfway through the study, but they did not account for these different study phases in their statistical analyses.
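A toy calculation shows why the analysis has to know about the phases. The numbers below are invented, with two arms instead of five to keep things short, and the sample-size weighting is just one reasonable choice rather than the method recommended in the review. In this made-up data the treatment does nothing within either phase, but outcomes happen to run higher in phase 2, which is also when the treatment arm was enrolled at twice the rate.

```python
# Hypothetical summary data: (phase, arm) -> (number of participants, mean outcome).
# Phase 1 used 1:1 allocation; phase 2 used 2:1 in favor of treatment.
summary = {
    ("phase1", "treatment"): (10, 0.0),
    ("phase1", "control"): (10, 0.0),
    ("phase2", "treatment"): (20, 2.0),
    ("phase2", "control"): (10, 2.0),
}

def pooled_mean(arm):
    """Mean outcome for one arm, ignoring which phase people were enrolled in."""
    cells = [(n, m) for (_, a), (n, m) in summary.items() if a == arm]
    total_n = sum(n for n, _ in cells)
    return sum(n * m for n, m in cells) / total_n

def stratified_difference():
    """Average of within-phase differences, weighted by phase sample size."""
    phases = sorted({phase for phase, _ in summary})
    weighted_sum = 0.0
    total_weight = 0
    for phase in phases:
        n_t, mean_t = summary[(phase, "treatment")]
        n_c, mean_c = summary[(phase, "control")]
        weight = n_t + n_c
        weighted_sum += weight * (mean_t - mean_c)
        total_weight += weight
    return weighted_sum / total_weight

# Naive analysis: lump both phases together and compare arms.
naive_difference = pooled_mean("treatment") - pooled_mean("control")

# Phase-aware analysis: compare arms within each phase, then combine.
phase_aware_difference = stratified_difference()
```

The naive comparison reports a difference of about 0.33 in favor of treatment, purely because the treatment arm is over-represented in the phase with higher outcomes. The phase-aware comparison returns zero, which matches how the data were constructed.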
4 | Replacements are not randomly selected
In virtually all RCTs, some participants will inevitably drop out. One way that investigators try to mitigate this problem is by using intention-to-treat (ITT) analysis, which we discussed in more depth in this article on the efficacy vs. effectiveness of a time-restricted eating trial. In ITT analyses, every participant who is assigned to a treatment group must be included in outcome analyses, regardless of whether that participant followed the protocol or dropped out of the study.
In some cases, investigators replace dropouts with more participants to ensure the study remains adequately powered. These replacements must be randomized to avoid another form of Error #3: changing allocation ratios. (For more information on statistical power, which represents the probability that a study will correctly identify a genuine effect, read Part V of our Studying Studies series.)
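To get a feel for why dropouts worry investigators, here is a back-of-the-envelope power calculation in Python. It is only a sketch: it assumes a simple two-group comparison of means, equal group sizes, a two-sided test, and a normal approximation, and the function name `approx_power` and the example numbers are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(n_per_group, effect_size, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    effect_size is the standardized difference (mean difference / SD).
    Uses a normal approximation and ignores the negligible far tail,
    so treat the result as a rough guide rather than an exact figure.
    """
    standard_normal = NormalDist()
    z_critical = standard_normal.inv_cdf(1 - alpha / 2)
    z_effect = abs(effect_size) * sqrt(n_per_group / 2)
    return standard_normal.cdf(z_effect - z_critical)

# Hypothetical trial planned with 64 participants per group
# to detect a standardized effect of 0.5.
planned = approx_power(n_per_group=64, effect_size=0.5)

# The same trial if half of each group drops out and is not replaced.
after_dropout = approx_power(n_per_group=32, effect_size=0.5)
```

Under these assumptions, the planned trial has roughly an 80% chance of detecting the effect, and losing half the participants drops that to roughly a coin flip.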
B) Errors in the analysis of randomized experiments
5 | Failing to account for non-independence
Sometimes groups of subjects are randomly assigned to a treatment together, but are analyzed as if they were randomized individually. For instance, an entire classroom might be randomized to one group while a separate classroom is assigned to another. These types of studies are referred to as cluster RCTs and are subject to error when they are powered and analyzed at the individual level instead of the group level. The PREDIMED study exemplifies this error, as groups of individuals within certain households or clinics were assigned to a treatment together, but the authors did not initially adjust their statistical analysis to account for clustering.
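One common way to quantify the cost of clustering is the design effect, 1 + (m - 1) x ICC, where m is the cluster size and ICC is the intraclass correlation (how similar people in the same cluster are to each other). The Python sketch below applies that formula to a hypothetical classroom trial. It assumes equal cluster sizes, and the classroom counts and ICC value are invented for illustration, not taken from PREDIMED or the review.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from randomizing clusters instead of individuals.

    Assumes every cluster has the same size; icc is the intraclass correlation.
    """
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent individuals the clustered sample is 'worth'."""
    total_participants = n_clusters * cluster_size
    return total_participants / design_effect(cluster_size, icc)

# Hypothetical trial: 20 classrooms of 25 students, modest within-class similarity.
deff = design_effect(cluster_size=25, icc=0.05)
n_effective = effective_sample_size(n_clusters=20, cluster_size=25, icc=0.05)
```

With these made-up inputs, the design effect is 2.2, so 500 enrolled students carry about as much information as 227 independently randomized ones. Analyzing them as 500 independent individuals would overstate the precision of the result.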
6 | Basing conclusions on within-group statistical tests instead of between-groups tests
The strength of an RCT lies in its ability to compare the results between two or more groups. For example, I recently wrote about a study that randomized men to morning exercise, evening exercise, or no exercise. The investigators reported that nocturnal glucose profiles improved only in men who exercised in the evening. The improvement, however, was “within-group,” meaning that nocturnal glucose levels had improved relative to baseline values, not compared to the other groups in the study. The authors’ conclusion that evening exercise conferred greater benefit for glycemic control than morning or no exercise is thus an example of the Difference in Nominal Significance (DINS) error. This error occurs when differences in “within-group” effects are used to draw conclusions about differences in “between-group” effects, rather than directly comparing groups to each other.
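The DINS error is easy to reproduce with a few invented numbers. The Python sketch below uses hypothetical improvement scores (baseline minus follow-up) for two groups of eight, and it swaps in exact sign-flip and permutation tests so it runs with the standard library alone; these are not the tests or the data from the exercise study. The function names are likewise assumptions made for this example.

```python
from itertools import combinations, product

def sign_flip_p(changes):
    """Two-sided exact sign-flip test of 'mean change is zero' within one group."""
    observed = abs(sum(changes))
    magnitudes = [abs(c) for c in changes]
    extreme = 0
    arrangements = 0
    for signs in product((1, -1), repeat=len(magnitudes)):
        arrangements += 1
        if abs(sum(s * m for s, m in zip(signs, magnitudes))) >= observed:
            extreme += 1
    return extreme / arrangements

def permutation_p(group_a, group_b):
    """Two-sided exact permutation test of 'mean change is equal in both groups'."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        difference = abs(sum_a / n_a - (total - sum_a) / n_b)
        splits += 1
        if difference >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
    return extreme / splits

# Hypothetical improvement scores (positive = better than baseline).
evening = [-1, 2, 3, 4, 5, 6, 7, 8]
comparison = [-4, -2, 1, 3, 4, 5, 6, 7]

p_within_evening = sign_flip_p(evening)           # 4/256, about 0.016
p_within_comparison = sign_flip_p(comparison)     # 34/256, about 0.13
p_between = permutation_p(evening, comparison)    # the test that answers the question
```

Here the evening group "significantly" improves from baseline and the comparison group does not, yet the direct between-group test does not reach the conventional 0.05 threshold. Only the last test speaks to whether one intervention outperformed the other.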
7 | Improper pooling of data
Pooling data under the umbrella of one study without accounting for it in statistical analyses can introduce bias. Dr. Allison cites an example of a trial on the effects of weight loss on telomere length in women with breast cancer. Data were pooled from two different phases of an RCT with different allocation ratios (see Error #3), which wasn’t taken into account in the analysis.
The different sites, subgroups, or phases of a study need to be taken into account during analysis. Otherwise, any differences in the subsets of data being pooled together can bias the estimation of an effect in the trial.
8 | Failing to account for missing data
Missing data — whether due to dropouts, errors in measurement, or other reasons — may not occur completely at random, breaking the randomization component of the study and introducing bias.
The review authors provide the example of a trial of intermittent energy restriction vs. continuous energy restriction on body composition and resting metabolic rate. The study had a 50% dropout rate, yet only data from participants who completed the protocol were analyzed. (This is an example of “per protocol” analysis, in which data from noncompliant subjects are removed from analyses.) Reanalysis of the study including all participants halved the magnitude of effect estimates compared with the originally reported results.
Investigators may mitigate this problem by reporting both per protocol and ITT results: efficacy and effectiveness, respectively. However, Dr. Allison suggests that this isn’t a perfect fix: “ITT can estimate the effect of assignment, not treatment per se, in an unbiased manner, whereas the per protocol analysis can only estimate in a way that allows the possibility for bias.”
(As noted earlier, this article details efficacy vs. effectiveness of time-restricted eating.)
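A small worked example shows how much the choice of analysis population can matter. The Python sketch below uses invented weight-change data with `None` marking a dropout, and it fills in missing outcomes with "no change from baseline," which is a deliberately crude assumption chosen only to keep the arithmetic visible. It is not the reanalysis method used in the review, and the function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def completers_only_effect(treatment, control):
    """Difference in mean change using only participants with observed outcomes."""
    observed_t = [x for x in treatment if x is not None]
    observed_c = [x for x in control if x is not None]
    return mean(observed_t) - mean(observed_c)

def all_randomized_effect(treatment, control, assumed_change=0.0):
    """Difference in mean change keeping everyone who was randomized.

    Missing outcomes are replaced with assumed_change (0.0 = back to baseline).
    This single-value fill-in is an illustrative assumption, not a recommendation.
    """
    filled_t = [assumed_change if x is None else x for x in treatment]
    filled_c = [assumed_change if x is None else x for x in control]
    return mean(filled_t) - mean(filled_c)

# Hypothetical weight change in kg; None marks a participant who dropped out.
# Dropout is deliberately unequal: 5 of 10 in treatment, 2 of 10 in control.
treatment = [-6.0, -5.0, -7.0, -6.0, -6.0, None, None, None, None, None]
control = [-2.0, -1.0, -3.0, -2.0, -2.0, -2.0, -1.0, -3.0, None, None]

effect_completers = completers_only_effect(treatment, control)
effect_all_randomized = all_randomized_effect(treatment, control)
```

Among completers, the treatment group appears to lose 4 kg more than control. Once every randomized participant is counted under the stated assumption, the difference shrinks to 1.4 kg. The true answer depends on why people dropped out, which is exactly what the data cannot tell us.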
C) Errors in the reporting of randomization
9 | Failing to fully describe randomization
Investigators must provide sufficient information so that readers can fully comprehend and evaluate the methods used for randomization. The review authors themselves admit to having a history of inadequate reporting of randomization methods.
10 | Failing to properly communicate inferences from randomized studies
When following the ITT principle, an RCT tests the effect of assigning participants to a treatment on the outcome of interest, but investigators often communicate results as the effect of the treatment itself (meaning, how well the treatment works if followed exactly as it’s prescribed). Avoiding this error depends on conscientious framing of the precise causal question addressed by the study.
For example, in the article I wrote reviewing a time-restricted eating trial, I highlighted the investigators’ statement that, “Time-restricted eating, in the absence of other interventions, is not more effective in weight loss than eating throughout the day.” In actuality, the investigators found that being assigned to time-restricted eating, in the absence of other interventions, is not more effective in weight loss than being assigned to eating throughout the day.
The review from David Allison and his colleagues highlights that while randomized controlled trials are powerful tools for examining cause-and-effect relationships, they are not immune to errors and bias. The paper is a great reminder of the high level of rigor involved in designing, conducting, and reporting randomized experiments, as well as a useful guide for investigators and readers alike for avoiding many pitfalls associated with this study design.