#269 – Good vs. bad science: how to read and understand scientific studies

“I think epidemiology has a place, but I think the pendulum has swung a little too far, and it has been asserted as being more valuable than I think it probably is.” —Peter Attia

by Peter Attia

September 4, 2023

Read Time 42 minutes

This special episode is a rebroadcast of AMA #30, now made available to everyone, in which Peter and Bob Kaplan dive deep into all things related to studying studies to help one sift through the noise to find the signal. They define various types of studies, how a study progresses from idea to execution, and how to identify study strengths and limitations. They explain how clinical trials work, as well as biases and common pitfalls to watch out for. They dig into key factors that contribute to the rigor (or lack thereof) of an experiment, and they discuss how to measure effect size, differentiate relative risk from absolute risk, and what it really means when a study is statistically significant. Finally, Peter lays out his personal process when reading through scientific papers.


We discuss:

The ever-changing landscape of scientific literature [2:30];
The process for a study to progress from idea to design to execution [5:00];
Various types of studies and how they differ [8:00];
The different phases of clinical trials [19:45];
Observational studies and the potential for bias [27:00];
Experimental studies: Randomization, blinding, and other factors that make or break a study [44:30];
Power, p-values, and statistical significance [56:45];
Measuring effect size: Relative risk vs. absolute risk, hazard ratios, and “Number Needed to Treat” [1:08:15];
How to interpret confidence intervals [1:18:00];
Why a study might be stopped before its completion [1:24:00];
Why only a fraction of studies are ever published and how to combat publication bias [1:32:00];
Why certain journals are more respected than others [1:41:00];
Peter’s process when reading a scientific paper [1:44:15]; and
More.


The ever-changing landscape of scientific literature [2:30]

Peter was on The Tim Ferriss Show in June, where the topic of studies came up
Peter has also written a five-part series on Studying Studies
This podcast is designed to:
- i) help people make sense of the “ever changing landscape of scientific literature and how to distinguish between the signal and the noise of the research news cycle”; and
- ii) be a primer for people to really understand the process of scientific experiments, from how studies are designed and published to what some of their limitations are

The process for a study to progress from idea to design to execution [5:00]

Broad steps:

Hypothesis: The default position in an experiment is that there is no relationship between two phenomena. This is called the null (i.e., zero) hypothesis.
Experimental design
Power analysis
IRB
Primary and secondary outcomes, protocol, stats plan, and preregistration

Overview:

Step 1: Null hypothesis

In theory, it should start with a hypothesis: “Good science is generally hypothesis-driven.”
Null hypothesis: Take the position that there is no relationship between two phenomena
For instance, the hypothesis might be that drinking coffee makes your eyes turn darker
Then it must be framed in a way that says the null hypothesis is that when you drink coffee, your eyes do not change in color in any way
That would imply that the alternative hypothesis is that when you drink coffee, your eyes do change color
But there’s nuance to this… because am I specifying what color it changes to? Does it get darker? Does it get lighter? Does it change to blue, green? Does it just get a darker shade?
To be able to formulate that cleanly is the first step here
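
As a sketch of what formulating it cleanly can look like, let $\Delta$ stand for the average change in some measure of eye darkness after drinking coffee. The symbol and the choice of measure are assumptions made here for illustration; they are not specified in the episode. The null hypothesis and the two candidate alternatives could then be written as:

```latex
\begin{align*}
H_0 &: \Delta = 0    && \text{(coffee does not change eye color)} \\
H_1 &: \Delta \neq 0 && \text{(two-sided: eye color changes, darker or lighter)} \\
H_1' &: \Delta > 0   && \text{(one-sided: eyes get darker)}
\end{align*}
```

Choosing between the two-sided and one-sided alternative is exactly the nuance raised above, and it needs to be settled before the experiment is designed.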

Step 2: Conduct an experimental design

How are you going to test that hypothesis?
- A really elegant way to test this is using a randomized controlled experiment and even better if it’s possible to blind it
Other design questions would be:
- How long should we make people drink coffee?
- How frequently should they drink coffee?
- How are we going to measure eye color?

Step 3: Power analysis

A very important variable is how many subjects you will have
That will depend on a number of things, including how many arms you will have in this study
It comes down to doing something that’s called a power analysis
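
To make the idea concrete, here is a minimal sketch of one common kind of power analysis: the normal-approximation sample size for comparing the means of two equally sized arms with a two-sided test. The function name `subjects_per_arm`, the default alpha of 0.05, and the default power of 0.80 are illustrative assumptions, not values taken from the episode, and a real trial would use a calculation matched to its own design and outcome type.

```python
import math
from statistics import NormalDist


def subjects_per_arm(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of subjects needed in each arm of a two-arm study.

    Normal approximation for a two-sided comparison of two means.
    effect_size is the standardized difference: the difference in means
    divided by the (assumed common) standard deviation.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must be between 0 and 1")

    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value, two-sided test
    z_power = standard_normal.inv_cdf(power)          # quantile for the desired power

    n = 2 * (z_alpha + z_power) ** 2 / effect_size ** 2
    return math.ceil(n)  # round up: you cannot enroll a fraction of a person


if __name__ == "__main__":
    # Smaller expected effects require many more subjects per arm.
    for d in (0.2, 0.5, 0.8):
        print(f"effect size {d}: {subjects_per_arm(d)} subjects per arm")
```

The point of the sketch is the shape of the relationship: halving the expected effect size roughly quadruples the number of subjects needed, which is why the expected effect has to be estimated before enrollment.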

Step 4: IRB approval

If this study involves human subjects or animal subjects, you will have to get something called an Institutional Review Board to approve the ethics of the study

Step 5: Determine your primary and secondary outcomes

In short…
- Get the protocol approved, develop a plan for statistics, and then pre-register the study
- And in parallel to this, you have to have funding

*Note: there are some studies that are not experimental where some of these steps are obviously skipped

The various types of studies and how they differ [8:00]

Three main categories of studies/papers:

1) observational studies
2) experimental studies
3) papers that analyze/review those studies

Figure 1. The evidence-based medicine pyramid. Credit: Purdue University

NOTE: One thing Peter doesn’t like about the pyramid above is that it puts a hierarchy in place that suggests a meta-analysis is better than a randomized controlled trial, which is not necessarily true

Individual case report

Example: Peter wrote one while he was at the NIH —
- It was about a patient who had come into the clinic with metastatic melanoma and their calcium was dangerously high
- The first assumption was that this patient had metastatic disease to their bone and that they were lysing bone and calcium was leaching into their bloodstream
- That turned out to not be the case. Instead, they had something that had not been previously reported in patients with melanoma: they had developed a parathyroid hormone-related-like hormone in response to their melanoma
- This is a hormone that exists normally, but it doesn’t exist in this format
- So their cancer was causing them to have more of this hormone, which was raising their calcium level
- It was interesting because it had never been reported before in the literature, so Peter wrote up an individual case report
- What’s the value in that? Well, the next time a patient with melanoma shows up to clinic and their calcium is sky-high and someone goes to the literature to search for it, they’ll see that report and it will hopefully save them time in getting to the diagnosis
On the podcast with Steve Rosenberg, he emphasizes the importance of these types of observations
- To start the process for a study—from idea, to design, to execution—you must have a hypothesis that begins with an observation
- In other words, an interesting observation is hypothesis generating and it might kickstart a larger trial
- That said, with one observation/case report, you can’t make any statement about the frequency of this in the broader subset of patients or make any comment about any intervention that may or may not change the outcome of this

Case series or set of studies [11:15]

Here you’re basically doing the same thing, but you would look at more than one patient
So for example, looking back at one’s clinical practice and noticing that 27 patients over the last 40 years have demonstrated a very unusual finding
For instance, one could write a paper that looks at all spontaneous regressions of cancer—this is incredibly rare, but there are certainly enough of them that one could write a case series.

Cohort studies [12:00]

Cohort studies are larger studies and they can be retrospective or they can be prospective
A retrospective observational cohort study would be looking backwards in time
- For instance, go back and look at all the people who have used saunas for the last 10 years and look at how they’re doing today relative to people who didn’t use saunas over the last 10 years
- It’s observational, no intervention, and the hope when you do this is that you’re going to see some sort of pattern
- Undoubtedly, you will see a pattern, but the real question is, will you be able to establish causality in that pattern?
A prospective cohort study would be looking forward in time
- For instance, one would follow people over the next 5-10 years who use saunas and compare them to a similar number of people who don’t
- In a forward looking fashion, we’re going to be examining the other behaviors of these people and ultimately what their outcomes are—Do they have different rates of death, heart disease, cancer, Alzheimer’s disease, other metrics of health that we might be interested in?
- Again, not intervening and not an experiment per se, just observing

Experimental studies [13:30]

A non-randomized trial sometimes gets referred to as an open label trial
- This is where you take two groups of people and give one of them a treatment and the other either a placebo or a different treatment, but you don’t randomize them (i.e., there’s a reason that they’re in that group)
- For example, you may want to study the effect of a certain antibiotic on a person that comes in the ER
  - So you take all the people that come in who look a certain way (fever, white blood cell count, etc.) and you’re going to give them the antibiotic
  - And the people who come in but they don’t have those exact signs or symptoms, you do NOT give an antibiotic and you’re just going to follow them
- In summary, there are enormous limitations to non-randomized trials because presumably there’s a reason you’re making that decision, and that reason will undoubtedly introduce bias
A randomized controlled trial is referred to as the “gold standard”
- This is where whatever question you want to study, you study it, but you attempt to take all bias out of it by randomly assigning people into the treatment groups
  - “Blinding” is another important aspect that Peter will discuss later in the podcast
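
The random assignment described above can be sketched in a few lines. This is a minimal illustration of simple balanced randomization, not a description of how any particular trial does it; the function name `randomize`, the arm labels, and the use of a seed are all assumptions made for the example. Real trials often use block or stratified schemes and keep the allocation concealed from the people enrolling patients.

```python
import random


def randomize(participants, arms=("treatment", "control"), seed=None):
    """Randomly assign participants to arms, keeping arm sizes balanced.

    The participant list is shuffled, then participants are dealt out to
    the arms in turn, so arm sizes differ by at most one.
    """
    rng = random.Random(seed)  # a seed is used only to make the example reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)

    assignment = {arm: [] for arm in arms}
    for index, person in enumerate(shuffled):
        arm = arms[index % len(arms)]
        assignment[arm].append(person)
    return assignment


if __name__ == "__main__":
    groups = randomize([f"patient-{i}" for i in range(1, 11)], seed=42)
    for arm, members in groups.items():
        print(arm, members)
```

Because chance alone decides who lands in which arm, no characteristic of the patient (or preference of the investigator) can systematically steer people toward the treatment, which is the bias the non-randomized design cannot rule out.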

Meta-analysis [16:15]

This is a statistical technique where you can combine data from multiple studies that are attempting to look at the same question
Each study gets a relative weighting and the weighting of a study is a function of its precision (sample size, other events in the study)
Larger studies, which have smaller standard errors, are given more weight than smaller studies with larger standard errors.
You’ll know you’re looking at a meta-analysis because there will usually be a figure somewhere in it that shows all of the studies across rows.

Let’s say there’s 10 studies included in the meta-analysis, and then they’ll have the hazard ratios for each of the studies
They’ll represent them usually as little triangles, the triangle will represent the 95% confidence interval of what the hazard ratio is (basically a marker of the risk)
You’ll see all 10 studies and then they’ll show you the final summation of them at the bottom, which of course you wouldn’t be able to deduce by looking at the figure, but it takes into account that mathematical weighting
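
To show how precision can turn into a weight, here is a minimal sketch of one standard approach, inverse-variance (fixed-effect) pooling of hazard ratios on the log scale. The episode does not say which weighting scheme a given meta-analysis uses, so treat the method, the function name `pool_fixed_effect`, and the 1.96 multiplier for an approximate 95% interval as assumptions for illustration.

```python
import math


def pool_fixed_effect(hazard_ratios, standard_errors):
    """Pool hazard ratios using inverse-variance (fixed-effect) weights.

    standard_errors are the standard errors of the log hazard ratios.
    Returns (pooled_hr, ci_low, ci_high, weights), with weights summing to 1.
    """
    if not hazard_ratios or len(hazard_ratios) != len(standard_errors):
        raise ValueError("need one standard error per hazard ratio")

    log_hrs = [math.log(hr) for hr in hazard_ratios]
    raw_weights = [1 / se ** 2 for se in standard_errors]  # more precise => heavier
    total = sum(raw_weights)

    pooled_log = sum(w * y for w, y in zip(raw_weights, log_hrs)) / total
    pooled_se = math.sqrt(1 / total)

    z = 1.96  # approximate normal quantile for a 95% confidence interval
    ci_low = math.exp(pooled_log - z * pooled_se)
    ci_high = math.exp(pooled_log + z * pooled_se)
    weights = [w / total for w in raw_weights]
    return math.exp(pooled_log), ci_low, ci_high, weights
```

A study whose standard error is half that of another receives four times the weight, so one large, precise study can dominate the pooled estimate. The arithmetic says nothing about whether the underlying studies were well designed.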

The truth about meta-analyses

On the surface, meta-analyses seem really, really great because if one trial, one randomized trial is good, 10 must be better
But James Yang once said something to this effect during a journal club about a meta-analysis that was being presented: “1000 sows’ ears makes not a pearl necklace.” It is just an eloquent way of saying garbage in, garbage out.
If you do a meta-analysis of a bunch of garbage studies, you get a garbage meta-analysis
Conversely, a meta-analysis of great randomized controlled trials will produce a great meta-analysis.

The different phases of a clinical trial [19:45]

When talking about human clinical trials, the different “phases” are the phraseology used by the FDA here in the United States

Origins of this process…

Say you have an interesting idea, e.g., a colon cancer drug or molecule that you think will have some benefit
Say you’ve done some interesting experiments in animals, maybe started with some mice and you went up to some rats and maybe even you’ve done something in primates
Now you’re really committed to the success of this, and it’s both safe and efficacious in animals
You now decide you want to foray into the human space
Step 1 is you have to file for something called an IND, an Investigational New Drug application that basically sets your intention of testing this as a drug in humans
If you get the IND, you go to phase 1

Figure 2. Different phases of a trial.

Phase 1

Phase I is geared specifically to dose escalate this drug from a very, very low level to determine what the toxicity is across a range of doses that will hopefully have efficacy
Very small studies, usually fewer than 100 people, typically done in cohorts
E.g., the first 12 people are going to be at 0.1 mg/kg, and assuming we see no adverse effects there, we’ll go up to 0.15 mg/kg for the next 12 people. If we have no issues there, we’ll escalate it to 0.25 mg/kg and so on
Notice there is nothing in there about whether the drug works or not
The people in the study are going to be patients that all have colon cancer—often metastatic colon cancer
- They have progressed through all other standard treatments without much luck
If the drug gets through phase I safely, then it goes to phase II
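
The cohort-by-cohort escalation described above can be sketched as a simple loop. This is a deliberately simplified illustration of the "no issues, so go up a dose" rule in the example; real phase I designs (the 3+3 design, for instance) use more nuanced stopping and de-escalation rules. The function `escalate` and the callback `count_adverse_events` are hypothetical names introduced for the example.

```python
def escalate(dose_levels, cohort_size, count_adverse_events):
    """Walk up the dose levels one cohort at a time.

    count_adverse_events(dose, cohort_size) is supplied by the caller and
    returns how many people in that cohort had an unacceptable adverse event.
    Returns the highest dose cleared with no such events, or None if even
    the lowest dose caused a problem.
    """
    highest_cleared = None
    for dose in dose_levels:
        events = count_adverse_events(dose, cohort_size)
        if events > 0:
            break  # stop escalating at the first dose with a problem
        highest_cleared = dose
    return highest_cleared


if __name__ == "__main__":
    # Doses in mg/kg from the example above; the outcome function is made up
    # purely to exercise the loop and is not real trial data.
    def pretend_outcome(dose, cohort_size):
        return 1 if dose >= 0.25 else 0

    print(escalate([0.1, 0.15, 0.25], 12, pretend_outcome))
```

Note that nothing in the loop asks whether the drug works, which mirrors the point that phase I is about toxicity across doses, not efficacy.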

Phase 2

The goal of phase II is to continue to evaluate for safety, but also to start to look for efficacy
This is done in an open label fashion—they’re not randomizing patients to one drug versus the other typically
Usually it’s at the point where the investigators think that one or two doses are going to produce efficacy
The dose levels were deemed safe in phase I, so now we’re going to take patients and give them this drug and look for an effect
A lot of times, if there’s no control arm in this study, you’re going to compare the drug to the natural history
For instance, let’s assume that we know that patients with metastatic colon cancer on standard of care have a median survival of X months
Now we’re going to give these patients this drug and see if that extends it any further
Typically these are very small studies — Can be in the 20-50 range, maybe up to a few hundred people
There are lots of things that can introduce bias to a phase II if it does not have randomization
The goal would be to still randomize in phase II, because you really do want to tease out efficacy
A compound succeeds in phase II if it continues to show no significant adverse safety effects. Note that no adverse events doesn’t mean no side effects; it means the drug doesn’t have side effects that are deemed unacceptable for the risk profile of the patient
If it shows efficacy without such adverse events, you then proceed to phase III

Phase 3

Phase III is a really rigorous trial
It’s typically a log step up in the number of patients to thousands of patients
This is absolutely either a placebo-controlled trial or it can be standard of care versus standard of care plus this new agent
It will be randomized and whenever possible, it is blinded
And these are typically longer studies
And because you have so much more sample size, you’re going to potentially pick up side effects that weren’t detected in the earlier phases
Now you really have that gold standard for measuring efficacy
And it’s on the basis of the phase I, phase II and mostly phase III data that a drug will get approved or not approved for broad use, which leads to a fourth phase

Phase 4

This is a post marketing study
Phase IV studies take place after the drug has been approved, and they’re used to basically get additional information because once a drug is approved, you now have more people taking it
And they may also be using this to look at other indications for the drug
In AMA #29, a phase IV trial with semaglutide was discussed
- It was a trial that used semaglutide to look at obesity, versus its original phase III trials, which were looking at diabetes
- Semaglutide is a drug that’s already been approved — the trial is basically expanding the indication for semaglutide, in this case so that insurance companies would actually pay for it for a new indication
In phase 4, you’re also looking for whether there is another side effect that was missed in the phase 3 trial

Observational studies and the potential for bias [27:00]

Are there things you look for in an observational study that increase or decrease your confidence in it?

