October 18, 2020

Understanding science

Time-restricted eating: efficacy versus effectiveness

What happens when people are prescribed a treatment vs what happens when people take the treatment are two different questions.


Many of you have probably seen the recent study on time-restricted eating (TRE) I’m about to discuss, or the video I posted shortly after its publication discussing it. (You can also read the transcript here.) As I mentioned in the video, my team and I wanted to do a more comprehensive evaluation of the study and get into some of the technical stuff: the nuances of the design, what this study tried to answer, and how confident we can be in its results and conclusions.

Due to the length and depth of the analysis below, let me give you a roadmap for what is covered. First, we’ll look at the outcome that the investigators deemed most important in this study (weight loss) and what they found. Then, we’ll look at a couple of important issues with this study: adherence to the diet and the type of statistics the investigators used to analyze the results. From there we’ll put the primary outcome into context by comparing it to the existing literature on the subject. After that, we’ll dig into the secondary outcomes (body composition and metabolic health measures), how the results fit into the bigger picture of TRE, and some concluding thoughts.

This post is longer than usual, and we get into the weeds a little bit, but if you take away anything from it, I hope it includes two points: (1) Knowing how well participants in a free-living-conditions study adhered to the treatment they were assigned to is critical for understanding how effective (or ineffective) the treatment is, as opposed to how efficacious the treatment is (meaning, how well the treatment works if followed exactly as it’s prescribed); and (2) A study reporting how effective a treatment is and a study reporting how efficacious a treatment is are telling you two entirely different (and yet equally important) things. It is crucial to understand which type of result the study is reporting. OK, let’s get to the study.

§

What is the effect of time-restricted eating on weight loss and metabolic health in overweight and obese individuals? This was the question Ethan Weiss and his colleagues asked, and tried to answer, in a 12-week randomized controlled trial, recently published in JAMA Internal Medicine.

The Study of Time-restricted Eating on Weight Loss, or the TREAT trial, for short, was carried out on a custom mobile study app (called Eureka) where 141 participants were randomly assigned to either a consistent-meal timing (CMT) group or a time-restricted eating (TRE) group. The CMT group was instructed to eat 3 structured meals per day and the TRE group was instructed to eat as much food as they wanted from noon until 8:00 PM each day and fast from 8:00 PM until noon the following day. In other words, these participants were assigned to practice what’s probably the most popular form of TRE, known as the 16:8 diet: eat within an 8-hour window then fast for the next 16 hours.

The participants, aged 47 on average (of whom 60% were men), all received an at-home Bluetooth weight scale to use daily. The scale measurements were used to assess the primary outcome in the study: change in body weight between the TRE and CMT groups. (Secondary outcomes were measured in a subset of 50 participants, who underwent in-person testing at the beginning and end of the study, which we’ll discuss below.)

When the results were analyzed, 57 participants in the CMT group lost an estimated 1.5 lbs. This group initially weighed about 219 lbs on average before the intervention, so that means a loss of less than 1% of total body weight after 12 weeks. The 59 participants analyzed in the TRE group lost an estimated 2 lbs, also losing less than 1% of their initial body weight. Both groups actually had a statistically significant reduction in weight relative to their baselines, but the difference in weight loss between the groups was not statistically significant. Results like these highlight a couple of considerations you should always keep in mind when reading medical literature: (1) The importance of a control group to help eliminate biases. If there weren’t a control group in this study, we might’ve instead read headlines about a recently published study that found TRE results in significant weight loss. (2) Results can be statistically significant, but clinically insignificant. Whether statistically significant or not, it was a trivial amount of weight loss for a 12-week intervention in overweight participants. Teaching point: never confuse statistical significance with clinical significance in medical literature.

Now, what are we to make of these results? It might be tempting just to stop here and conclude that 16:8 TRE confers no weight loss advantage in overweight adults. That may be true, and it’s what the investigators seem to conclude in the paper, but there are a few things that warrant consideration before we jump to the same conclusion.

There are two important factors in this study that impact both the results and how we should interpret them: (1) participant adherence, and (2) investigator analysis.

One of the biggest challenges in diet studies is objectively measuring adherence in free-living conditions. As the investigator, how can you be sure that study participants perfectly adhere to their assigned treatments as they go about their daily lives? Outside of participants strapping on a GoPro and investigators reviewing the 24/7 videos, the investigators aren’t going to know exactly how well the participants adhered to the diet. This is what makes interpreting the results of these types of studies so difficult.

As we alluded to above, there are two important pieces of information we want to get from a study: (1) How efficacious is the treatment? To answer this, we need to see how well people do when they take the treatment as prescribed. (2) How effective is the prescription? To answer this, we need to see how well people do who received the prescription, regardless of how they adhered to it.

Here is the best example I have of the difference between efficacy and effectiveness in my life: fish oil liquid versus fish oil capsules. The fish oil liquid is more potent and more concentrated in a higher dose, and if I take it every single day, I get higher levels in my red blood cell membranes (how we measure levels of EPA and DHA), which is the desired effect. But on average I forget to take it at least 2 times per week because the bottle needs to sit in the fridge and I forget to take it. Conversely, the fish oil capsules are easy for me to put in my pill pack next to my sink, which I never forget to take, but they are not as potent. So, while the liquid is more efficacious, the capsules may be more effective.
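To make the fish oil example concrete, here is a minimal sketch. It assumes, purely for illustration, that the realized benefit scales linearly with potency times the fraction of days a dose is taken. The relative potency of the capsules (0.8) is a made-up number, not a measurement; only the roughly 2 missed days per week comes from the example above.

```python
# Toy model: realized benefit = potency per dose x fraction of days taken.
# The linear model and the capsule potency are illustrative assumptions.

def effectiveness(potency: float, days_taken_per_week: float) -> float:
    """Average realized benefit, relative to the liquid taken every day."""
    return potency * days_taken_per_week / 7


# Liquid: full potency (1.0 by definition), but forgotten about 2 days a week.
liquid = effectiveness(potency=1.0, days_taken_per_week=5)

# Capsules: hypothetical 80% potency, never forgotten.
capsules = effectiveness(potency=0.8, days_taken_per_week=7)

# Capsule potency at which the two options tie under this toy model.
break_even_potency = 5 / 7

print(f"liquid:   {liquid:.2f}")
print(f"capsules: {capsules:.2f}")
print(f"break-even capsule potency: {break_even_potency:.2f}")
```

Under these assumptions the capsules come out ahead whenever they are more than about 71% as potent as the liquid, which is the sense in which the less efficacious option can be the more effective one.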

Participant adherence

At first glance, the self-reported adherence of 83.5% in the TRE group looks pretty good. Participants were sent the following daily adherence survey question on their app: “Did you adhere to your eating plan on the previous day?” (Yes/No). When participants completed the daily survey question asking if they adhered to their eating plan, 1128 out of 1351 times, or just under 6 out of 7 times on average, the answer was yes. But if we take a closer look at what this actually means, there’s reason to believe this adherence rate is considerably inflated compared with the true adherence rate across all participants assigned to TRE.

One of the figures in the paper (Figure 2A) provides a chart of the self-reported daily adherence for both groups and responses from all completed surveys were analyzed. The figure notes that 44 participants assigned to TRE were analyzed. These 44 participants received the daily adherence survey a sum total of 3696 times (44 participants x 84 days) over the course of the study. If you recall above, the survey was completed a total of 1351 times. For these 44 participants, their adherence to responding to the adherence survey was about 37% (1351/3696). If you also recall from the primary outcome results above, 59 participants were analyzed in the TRE group. If there are 59 participants, and responses from all completed surveys were analyzed, it suggests that 15 participants (59-44) did not respond to the daily adherence survey even once. If that’s the case, it means there are another 1260 surveys that weren’t answered. On top of this, there were 10 participants assigned to TRE that were excluded from the primary outcome analysis (a total of 69 participants were assigned to TRE and 59 were analyzed) because they did not log a single weight measurement with their Bluetooth scale. This last point may seem irrelevant since these 10 participants were excluded from the study’s results, but it can factor into the equation if we want to know how well a group of participants adhere to a treatment that’s assigned to them.
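The arithmetic in this section can be checked in a few lines. This is just a sketch that reuses the figures quoted above; the variable names are mine, and the count of 15 never-responding participants is the inference made in the text, not a number reported by the paper.

```python
# Figures quoted above for the group assigned to TRE.
yes_answers = 1128        # "yes" responses to the daily adherence question
completed_surveys = 1351  # surveys actually completed
respondents = 44          # participants included in Figure 2A
analyzed = 59             # participants in the primary outcome analysis
study_days = 84           # 12 weeks x 7 days

surveys_sent = respondents * study_days                    # 44 x 84
self_reported_adherence = yes_answers / completed_surveys  # share of "yes" among answers
response_rate = completed_surveys / surveys_sent           # share of surveys answered

# Inferred in the text: analyzed participants who never answered a survey.
never_responded = analyzed - respondents
unanswered_by_them = never_responded * study_days

print(f"surveys sent to the 44 respondents: {surveys_sent}")
print(f"self-reported adherence: {self_reported_adherence:.1%}")
print(f"survey response rate: {response_rate:.1%}")
print(f"additional unanswered surveys: {unanswered_by_them}")
```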

What’s my point in all this? What’s the probability that the self-reported adherence rate in the TRE group of 83.5% when participants answer the question is the same as the actual adherence rate when they don’t? I suspect there’s a very large selection bias here. I think when people are more adherent to a program, they are more likely to be engaged in it, and thus more likely to complete the adherence survey through the study app. By the same token, when people are less adherent to a program they are less likely to be engaged in it, and less likely to complete the adherence survey. The data used to determine the adherence rate likely selects the times in which the participants are more engaged and misses all of the data in which the participants are less engaged. In short, the self-reported adherence of the group assigned to TRE is likely higher, maybe a lot higher, than the true adherence of participants. We just don’t know, which is a problem if we want to get closer to understanding the true effect of the treatment itself.
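One way to see how much the missing surveys matter is to bound the true adherence rate under different assumptions about the unanswered days. The sketch below is not from the paper. It assumes all 59 analyzed participants were sent a survey on each of the 84 days (the reading used above), and the function name and the 50% scenario are hypothetical choices for illustration.

```python
# Figures quoted in the text for the group assigned to TRE.
YES_ANSWERS = 1128
COMPLETED_SURVEYS = 1351
ANALYZED = 59
STUDY_DAYS = 84

# Assumption: every analyzed participant was surveyed on every study day.
TOTAL_DAYS = ANALYZED * STUDY_DAYS
UNANSWERED = TOTAL_DAYS - COMPLETED_SURVEYS


def overall_adherence(adherence_on_unanswered_days: float) -> float:
    """Adherence across all participant-days, given an assumed adherence
    rate (0.0 to 1.0) on the days when no survey was completed."""
    adherent_days = YES_ANSWERS + adherence_on_unanswered_days * UNANSWERED
    return adherent_days / TOTAL_DAYS


lower_bound = overall_adherence(0.0)  # nobody adhered on unanswered days
upper_bound = overall_adherence(1.0)  # everybody adhered on unanswered days
midpoint = overall_adherence(0.5)     # hypothetical middle scenario

print(f"lower bound: {lower_bound:.1%}")
print(f"50% scenario: {midpoint:.1%}")
print(f"upper bound: {upper_bound:.1%}")
```

The width of that interval is the point: with roughly three out of four participant-days unobserved, the 83.5% figure only pins down adherence on the days people chose to answer.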

A brief aside on participant selection

From previous discussions of clinical studies you’ll likely recall that an important part of making sense of published work is understanding the population studied. In this study, that population was clinically overweight and obese (BMI range: 27 to 43, mean 33). This suggests, almost assuredly, they were also very metabolically impaired. As such, the challenges of reversing their metabolic milieu, which would be necessary to achieve meaningful weight loss, are far greater than in people who are, say, 10 to 15 pounds overweight and looking for a little nudge in the right direction. In other words, this study used a very challenging subset of patients to study a relatively mild intervention.

Investigator analysis

This study used what is called an intention-to-treat (ITT) analysis. An ITT analysis is a method for analyzing results of all randomized participants in the groups to which they were randomly assigned. According to the ITT principle, once a participant is assigned to a treatment group you must include the outcome data for that person regardless of whether that person followed the protocol, or was even lost to follow-up or dropped out of the study. To give you a sense of the principle, ITT is described as “once randomized, always analyzed.” ITT is intended to reduce biases that are introduced when study investigators analyze only those participants who adhered to and completed the treatment originally allocated, which is often referred to as per-protocol analysis. ITT tries to inform us about what happens in the “real-world” when we prescribe a drug or a diet to a group of people. In other words, an ITT trial is an effectiveness trial.

Now, you may be thinking to yourself, with ITT, how do you include the outcome data of all participants if you didn’t actually collect all of their actual outcomes (e.g., drop-outs)? With ITT, their outcomes are estimated from other information collected from the trial. This means the outcomes in an ITT study with incomplete information provide an estimate of the results. But the point of doing an ITT analysis is to try to understand what you might expect when you prescribe a treatment to a person without knowing their ability to adhere to it. With per-protocol, you’re doing an efficacy analysis. You’re only including the participants who completed the trial. The bottom line is we need to know both efficacy and effectiveness, and in order to do that, we need to perform both types of analyses.
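Here is a minimal sketch of how the two analyses can diverge on the same data. The eight participants and their weight changes are entirely made up for illustration and are not from the TREAT trial. To keep it short, every participant has a measured outcome, so the estimation step for drop-outs described above is not shown.

```python
from statistics import mean
from typing import NamedTuple


class Participant(NamedTuple):
    assigned: str      # group assigned at randomization: "TRE" or "CMT"
    adhered: bool      # whether the participant followed the assigned protocol
    change_lbs: float  # weight change over the study (negative = weight loss)


# Hypothetical data, invented for this example.
participants = [
    Participant("TRE", True, -6.0),
    Participant("TRE", True, -5.0),
    Participant("TRE", False, -0.5),
    Participant("TRE", False, 0.5),
    Participant("CMT", True, -2.0),
    Participant("CMT", True, -1.0),
    Participant("CMT", False, -1.0),
    Participant("CMT", False, 0.0),
]


def mean_change(group: str, adherent_only: bool) -> float:
    """Mean weight change in a group, optionally restricted to adherers."""
    changes = [
        p.change_lbs
        for p in participants
        if p.assigned == group and (p.adhered or not adherent_only)
    ]
    return mean(changes)


def between_group_difference(adherent_only: bool) -> float:
    """TRE minus CMT mean weight change."""
    return mean_change("TRE", adherent_only) - mean_change("CMT", adherent_only)


# ITT: everyone is analyzed in the group they were assigned to.
itt_difference = between_group_difference(adherent_only=False)

# Per-protocol: only participants who followed their assignment.
per_protocol_difference = between_group_difference(adherent_only=True)

print(f"ITT difference (TRE - CMT): {itt_difference:+.2f} lbs")
print(f"Per-protocol difference (TRE - CMT): {per_protocol_difference:+.2f} lbs")
```

In this invented data set the per-protocol difference is more than twice the ITT difference. Neither number is wrong; they answer different questions, and the per-protocol number gives up the protection of randomization because adherers may differ from non-adherers in ways that also affect weight.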

What does this mean in the context of the TREAT trial? It’s asking, what is the effect of being assigned to time-restricted eating on weight loss and metabolic health in patients with overweight and obesity? What probably causes confusion for the reader is that this distinction isn’t made anywhere in the abstract of the paper. In the Conclusions and Relevance section of the abstract, the investigators write, “Time-restricted eating, in the absence of other interventions, is not more effective in weight loss than eating throughout the day.” But the investigators actually found that being assigned to time-restricted eating, in the absence of other interventions, is not more effective in weight loss than being assigned to eating throughout the day. In other words, it’s an effectiveness trial and not an efficacy trial. The investigators include the appropriate distinction in the Conclusions section at the end of the paper: “In this RCT, a prescription of [emphasis added] TRE did not result in weight loss when compared with a control prescription of [emphasis added] 3 meals per day.”

TRE studies on weight loss

So how does this study fit into the existing literature on (specifically) TRE and weight loss? This year alone, there have been a couple of studies published that were similar in scope to the TREAT trial. In one study, investigators randomized 22 overweight or obese participants to 16:8 ad libitum TRE (n = 13) or unrestricted eating (non-TRE) (n = 9) for 12 weeks. The participants in the TRE group that completed the study (n = 11) lost a significant 3.8% of their body weight compared to baseline and also lost a significant amount of weight compared to the non-TRE group. Another study published this year randomized 58 participants into 1 of 3 groups: 20:4 ad libitum TRE (n = 19), 18:6 ad libitum TRE (n = 20), or control (n = 19). After 10 weeks, the participants completing the trial in the 20:4 TRE group (n = 16) lost 3.9% and the 18:6 TRE group (n = 19) lost 3.4% of their body weight, both losing significantly more weight than the control group. Three other studies over the last 5 years on TRE in participants who were obese, overweight, or had metabolic syndrome used eating windows of 8 to 10 hours and lasted 12 to 16 weeks. All three lacked a true control group, and weight loss (a reduction of 3.2% to 3.4%) was statistically significant compared to baseline in each. While this is by no means a comprehensive and rigorous review, the data suggests TRE in the 16:8-range can result in significant but modest weight loss in controlled (and uncontrolled) trials. Most of these trials used a per-protocol analysis, that is, they only looked at the participants that completed the trial. It’s likely that, had the investigators used an ITT analysis, the effect size would have been smaller.

Secondary outcomes

Secondary outcomes were measured in a subset of 50 participants, who underwent in-person testing at the beginning and end of the study, and included changes in weight, body fat, lean mass, fasting insulin, fasting glucose, hemoglobin A1c (HbA1c) levels, resting metabolic rate (RMR), and total energy expenditure (TEE).

There were no significant differences between the groups for any of the secondary outcomes. There was, however, a significant reduction in weight, lean mass, and TEE within the group assigned to TRE. The reduction of lean mass drew concern from the investigators and attention from the media. The headline in the New York Times covering the study read: “A Potential Downside of Intermittent Fasting — A rigorous three-month study found that people lost little weight, and much of that may have been from muscle.”

Of the average weight loss of 3.7 lbs for in-person participants assigned to TRE, approximately 65% of the weight, about 2.4 lbs, was lean mass. Loss of lean mass during weight loss typically accounts for about 25% of total weight loss according to the investigators.

How much of a concern is it that after 12 weeks, the group assigned to TRE went from an estimated 132 lbs of lean mass to 130 lbs as measured by dual-energy X-ray absorptiometry (DXA)? If the results of this study can be generalized to the broader population and someone trying to shed a lot of excess fat should expect to lose 2 lbs of lean mass for every 1 lb of fat lost if prescribed TRE, you bet I’d be concerned. But there are reasons to question whether this would be the case. One is that DXA measurements of lean mass reflect the body compartment that is non-fat and non-bone. This means that DXA provides only an approximation of skeletal muscle mass because it will include contributions from skin, connective tissues, and the water contained within these tissues. DXA analysis assumes a constant hydration of lean tissue, but hydration can vary. So one possibility is that the DXA measurements did not accurately reflect the changes in skeletal muscle mass in the participants. Repeatability is reportedly excellent for lean tissue measurements for DXA, with a reported range of 0.5-2%, but the estimated lean mass loss in the group assigned to TRE was less than 2%, within even this small margin of error. The investigators did acknowledge the potential confounding hydration can introduce for the lean mass calculations, but they took measures to help control for this (participants fasted for more than 12 hours and voided their bladder prior to DXA scans), and they reported that the change in lean mass was much greater than the loss of body water, therefore it’s unlikely that the differences in muscle hydration would account for all of the lean mass loss.
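The lean mass numbers can be worked through as follows. This sketch uses only figures quoted in this post and, as a simplification, treats all non-lean weight loss as fat. Comparing a group-average change to a repeatability range is the same rough comparison made in the paragraph above, not a formal test.

```python
# Figures quoted in the post for in-person participants assigned to TRE.
total_loss_lbs = 3.7            # average weight loss
lean_fraction = 0.65            # approximate share of the loss that was lean mass
baseline_lean_lbs = 132.0       # estimated lean mass at baseline
dxa_repeatability_max_pct = 2.0 # upper end of the reported 0.5-2% range

lean_loss_lbs = total_loss_lbs * lean_fraction

# Simplifying assumption: whatever was not lean mass was fat.
fat_loss_lbs = total_loss_lbs - lean_loss_lbs
lean_to_fat_ratio = lean_loss_lbs / fat_loss_lbs

lean_loss_pct = lean_loss_lbs / baseline_lean_lbs * 100
within_repeatability = lean_loss_pct <= dxa_repeatability_max_pct

print(f"lean mass lost: {lean_loss_lbs:.1f} lbs")
print(f"fat lost: {fat_loss_lbs:.1f} lbs")
print(f"lean lost per lb of fat: {lean_to_fat_ratio:.1f} lbs")
print(f"lean loss as % of baseline lean mass: {lean_loss_pct:.1f}%")
print(f"within DXA repeatability range: {within_repeatability}")
```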

Another thing to consider, as the investigators did in their paper, is the possibility that protein intake was reduced in the group assigned to TRE, which may have contributed to the decrease in lean mass. Unfortunately, due to technical issues, the investigators were not able to collect dietary food recalls from the participants, which may have provided some clues here.

Exercise, particularly resistance training, is another way to preserve or even gain lean mass while losing fat mass. Understandably, there were no recommendations for macronutrient intake or physical activity for participants in the study in order to keep the prescription as simple as possible, but these are a couple of variables that can help people mitigate skeletal muscle loss when fasting or trying to shed fat. In fact, a randomized controlled trial found that when protein was matched to pre-study consumption in resistance-trained males assigned to 3 weekly resistance training sessions and to either a 16:8 or a control diet, the TRE group showed a significant decrease in fat mass compared to the control group while fat-free mass was preserved.

While participants in the TREAT trial did not report their physical activity, there were a couple of indications that the participants assigned to the TRE group reduced their TEE. The investigators found a statistically significant 7% decrease in TEE (2718 kcal/d to 2541 kcal/d, measured by doubly-labeled water pre- and post-intervention) within the in-person participants assigned to TRE. Also, some of the participants in this cohort (17 in the TRE group and 17 in the CMT group) received and wore an Oura ring to track activity and sleep habits. Data from the Oura ring reported about a 30% decrease (8555 to 6057) in steps from baseline to post-intervention in the group assigned to TRE compared to a 3% decrease (8871 to 8614) in the group assigned to CMT. The reported RMR in the in-person cohort assigned to TRE did not decrease significantly (1920 kcal/d to 1892 kcal/d). Taken together, this data suggests that participants assigned to TRE decreased their activity during the intervention, and this reduction in physical activity might also explain an unexpected decrease in lean mass.
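For reference, the percentage changes quoted above follow directly from the before and after values. The helper below is a small sketch; the function and dictionary names are mine, and the inputs are the figures given in the paragraph above.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from baseline, in percent (negative = decrease)."""
    return (after - before) / before * 100


# (baseline, post-intervention) values quoted in the text.
measures = {
    "TEE, TRE (kcal/d)": (2718, 2541),
    "RMR, TRE (kcal/d)": (1920, 1892),
    "Steps, TRE": (8555, 6057),
    "Steps, CMT": (8871, 8614),
}

changes = {
    name: percent_change(before, after)
    for name, (before, after) in measures.items()
}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

The step count in the group assigned to TRE fell about ten times as much, in relative terms, as in the group assigned to CMT, while RMR barely moved.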

TRE studies on body composition and metabolic health

The effect of TRE on lean mass is currently unclear: One study reported a decrease in lean mass and a second study did not report a significant difference. One study reported that TRE reduced lean mass (assessed by DXA), although two different TRE interventions that included high protein intake and resistance training reported that TRE did not adversely affect lean mass. Of the previous studies under review that reported body fat, one study did not report a significant change while two studies reported a significant decrease in body fat measured post-intervention.

TRE studies on fasting plasma glucose and insulin have produced conflicting results: Three different studies did not report a significant change in fasting blood glucose (FBG), one study reported a trend toward a decrease in FBG, and another study reported a significant decrease in morning FBG. One study that compared TRE meal timing windows reported a decrease in postprandial plasma glucose (glycemic response) for both early (8:00 AM-5:00 PM) and later (12:00 PM-9:00 PM) eating window treatment groups compared to their respective baseline measures. Two TRE studies reported a decrease in insulin and Homeostatic Model Assessment of Insulin Resistance (HOMA-IR) while two other studies did not report a significant change. There is some suggestion in the literature that TRE may improve insulin sensitivity in people with prediabetes: during a 5-week intervention, fasting plasma insulin was decreased without a change in plasma glucose levels. Two studies did not report a significant difference in HbA1c as a result of TRE intervention while a third study reported a trending decrease, although not statistically significant.

One study on TRE and RMR likewise did not report a significant difference between treatment groups. A second study that evaluated TEE also did not report a significant difference between treatment groups comparing meal timing. And while the course of treatment was only 4 days, one study reported that a 6-hour eating window did not affect 24-hour and resting energy expenditure (REE; using whole-room indirect calorimetry). Altogether, the effect of TRE on energy expenditure requires further study.

Conclusion

I must admit that when I first heard about this study—ad libitum 16:8 TRE in overweight participants seemingly showing no effect on weight loss and other metabolic markers compared to a control group—it seemingly confirmed my bias and I didn’t read past the abstract. “Tell me something I don’t already know,” I thought. In my clinical experience, when metabolically challenged patients restrict only when they eat, but not what they eat, or how much they eat, a 16:8 window usually doesn’t cut it. (Conversely, I’ve seen reasonable results with this limited approach in people who are reasonably metabolically healthy, merely looking to lose a few pounds.) If metabolically ill patients want to see meaningful results with respect to weight loss and metabolic health with TRE, they usually need to pull harder on the other levers as I explain in the video. In practical terms that means some combination of (1) a greater fasting window (more TR), (2) an improvement in dietary quality or restriction (more DR), and/or (3) a reduction of caloric intake (more CR). Overall, there may be modest benefits of TRE compared to a habitual diet on average, and there are some individuals that see more benefits than the average, some less. But I think there are far better strategies to move the needle than an ad libitum 16:8 with no focus on food quality or quantity.

A null finding from the TREAT trial, namely that this type of TRE in overweight participants did not confer weight loss or metabolic benefits compared to CMT, is what I would’ve expected. But what I didn’t initially realize is that’s not precisely the finding observed in this trial, if you dig deeper.

Understanding what happens to people who truly follow the diet or take the pill is important. How effective is doing TRE compared to CMT? This is a question of efficacy. This is a good question to ask, but it’s not a question the TREAT trial was designed to address.

Understanding what happens to people when they are instructed to follow a diet or take a pill we think might result in beneficial effects is also important. How effective is prescribing TRE compared to CMT in overweight participants for weight loss and metabolic health? This, too, is a good question to ask, and this is what the TREAT trial was designed to address. As a practitioner, you not only want to know the efficacy of the drug you’re thinking about prescribing to your patient under ideal circumstances, you also want to know what typically happens to patients in the real world when the drug is prescribed. In other words, what’s the effectiveness of prescribing the treatment under real-world conditions? The TREAT trial addressed a very important question about effectiveness; just don’t confuse it with efficacy.

 

– Peter

Disclaimer: This blog is for general informational purposes only and does not constitute the practice of medicine, nursing or other professional health care services, including the giving of medical advice, and no doctor/patient relationship is formed. The use of information on this blog or materials linked from this blog is at the user's own risk. The content of this blog is not intended to be a substitute for professional medical advice, diagnosis, or treatment. Users should not disregard, or delay in obtaining, medical advice for any medical condition they may have, and should seek the assistance of their health care professionals for any such conditions.
  1. “Secondary outcomes were measured in a subset of 50 participants, who underwent in-person testing at the beginning and end of the study, and included changes in weight, body fat, lean mass, fasting insulin, fasting glucose, hemoglobin A1C (hbA1c) levels, resting metabolic rate (RMR), and total energy expenditure (TEE). There were no significant differences between the groups for any of the secondary outcomes.”

    Would you expect to see / have you seen any other metabolic or perhaps immunologic benefits with 16:8 TRE compared to CMT?

    I see 16:8 advocated for health benefits in normal weight healthy individuals, but is there any persuasive evidence to recommend it?

  2. When reading about diet studies I think of this analogy.

    An investigator performs a study to test the effectiveness of a fuel additive treatment.
    100 vehicles are selected at random
    The results were that the fuel additive was effective in 80% of the vehicles
    But 20% of the vehicles died (quit running)
    The results reported were that the fuel additive treatment has high risk of death, therefore not recommended

    Further analysis of the data indicated that the 80 vehicles with gas engines all improved, the 20 vehicles with diesel engines all died. Apparently, the ignorant investigator didn’t understand that engine type could be a confounder.

    Are people who are fat adapted diesels and the SAD people gas engines? Therefore, all the studies on people following SAD are just noise and should be ignored by people who are fat adapted, low carb, and getting reasonable exercise, sleep and stress reduction?

  3. My initial reaction was, ‘an eating window that goes to 8 pm’? That is a ridiculous eating window, for at least two reasons. The first reason is that the hours between 6 and 8 are the hours when we tend to eat the most; these people were keeping the hours when they did most of their eating. The second reason why an eating window that goes to 8 pm is ridiculous is that it is too close to sleep time. Part of the purpose of time restricted eating is to create more time between your last meal and sleep. I’m out of my area of expertise here, but Peter’s friend Mathew might be able to support this idea. And thirdly, my understanding is that food that you eat after 6 pm will produce a larger blood glucose peak. I’ll let Peter be the judge of that.

    In summary, these people studied time restricted eating with a terrible eating window.

  4. Word choices matter. Understanding the meaning of those words matters even more. Effective vs. efficacious. Illiteracy abounds.

    “The difference between the right word and the almost right word is the difference between lightning and a lightning bug.” – Mark Twain

  5. Hi Peter,

    What about studies demonstrating an increase in FGL and PPGR? I’m experiencing those personally, when on a 20:4 regime w/ weekly 24h fasts.
