September 4, 2023

Understanding science

#269 – Good vs. bad science: how to read and understand scientific studies

“I think epidemiology has a place, but I think the pendulum has swung a little too far, and it has been asserted as being more valuable than I think it probably is.” - Peter Attia

Read Time 42 minutes

This special episode is a rebroadcast of AMA #30, now made available to everyone, in which Peter and Bob Kaplan dive deep into all things related to studying studies to help one sift through the noise to find the signal. They define various types of studies, how a study progresses from idea to execution, and how to identify study strengths and limitations. They explain how clinical trials work, as well as biases and common pitfalls to watch out for. They dig into key factors that contribute to the rigor (or lack thereof) of an experiment, and they discuss how to measure effect size, differentiate relative risk from absolute risk, and what it really means when a study is statistically significant. Finally, Peter lays out his personal process when reading through scientific papers.


We discuss:

  • The ever-changing landscape of scientific literature [2:30];
  • The process for a study to progress from idea to design to execution [5:00];
  • Various types of studies and how they differ [8:00];
  • The different phases of clinical trials [19:45];
  • Observational studies and the potential for bias [27:00];
  • Experimental studies: Randomization, blinding, and other factors that make or break a study [44:30];
  • Power, p-values, and statistical significance [56:45];
  • Measuring effect size: Relative risk vs. absolute risk, hazard ratios, and “Number Needed to Treat” [1:08:15];
  • How to interpret confidence intervals [1:18:00];
  • Why a study might be stopped before its completion [1:24:00];
  • Why only a fraction of studies are ever published and how to combat publication bias [1:32:00];
  • Why certain journals are more respected than others [1:41:00];
  • Peter’s process when reading a scientific paper [1:44:15]; and
  • More.


The ever-changing landscape of scientific literature [2:30]

  • Peter was on The Tim Ferriss Show in June where the topic of studies came up
  • Peter has also written a five part series on Studying Studies
  • This podcast is designed to:
    • i)  help people make sense of the “ever changing landscape of scientific literature and how to distinguish between the signal and the noise of the research news cycle”; and
    • ii) be a primer for people to really understand the process of scientific experiments, from how studies are published to what some of their limitations are


The process for a study to progress from idea to design to execution [5:00]

Broad steps:

  • Hypothesis: The default position in an experiment is that there is no relationship between two phenomena. This is called the null (i.e., zero) hypothesis.
  • Experimental design
  • Power analysis
  • IRB approval
  • Primary and secondary outcomes, protocol, stats plan, and preregistration


Step 1: Null hypothesis

  • In theory, it should start with a hypothesis: “Good science is generally hypothesis-driven.”
  • Null hypothesis: Take the position that there is no relationship between two phenomena
  • For instance, the hypothesis might be that drinking coffee makes your eyes turn darker
  • Then it must be framed in a way that says the null hypothesis is that when you drink coffee, your eyes do not change in color in any way
  • That would imply that the alternative hypothesis is that when you drink coffee, your eyes do change color
  • But there’s nuance to this, because am I specifying what color it changes to? Does it get darker? Does it get lighter? Does it change to blue or green? Or does it just become a darker shade?
  • To be able to formulate that cleanly is the first step here

Step 2: Conduct an experimental design

  • How are you going to test that hypothesis? 
    • A really elegant way to test this is using a randomized controlled experiment and even better if it’s possible to blind it
  • Other design questions would be:
    • How long should we make people drink coffee? 
    • How frequently should they drink coffee? 
    • How are we going to measure eye color? 

Step 3: Power analysis 

  • A very important variable is how many subjects you will have
  • That will depend on a number of things, including how many arms you will have in this study
  • It comes down to doing something that’s called a power analysis
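
To make the idea of a power analysis concrete, here is a minimal sketch, not the method discussed on the podcast. It assumes a two-arm trial comparing means, a two-sided test, and the standard normal approximation. The function name `sample_size_per_arm`, the standardized effect size input, and the default values for alpha and power are illustrative assumptions.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of subjects needed in EACH arm of a two-arm trial.

    Assumption: normal approximation n = 2 * ((z_alpha + z_beta) / d)^2,
    where d is the standardized effect size (difference in means / SD).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# Hypothetical inputs: a medium effect (d = 0.5) versus a small effect (d = 0.2).
print(sample_size_per_arm(0.5))
print(sample_size_per_arm(0.2))
```

The smaller the effect you expect to detect, the more subjects each arm needs, which is why the number of arms and the expected effect drive the size of the study.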

Step 4: IRB approval 

  • If this study involves human subjects or animal subjects, you will have to get something called an Institutional Review Board to approve the ethics of the study

Step 5: Determine your primary and secondary outcomes

  • In short…
    • Get the protocol approved, develop a plan for statistics, and then pre-register the study
    • And in parallel to this, you have to have funding

*Note: there are some studies that are not experimental where some of these steps are obviously skipped


The various types of studies and how they differ [8:00]

Three main categories of studies/papers:

  • 1) observational studies
  • 2) experimental studies
  • 3) papers that analyze/review those studies

Figure 1. The evidence-based medicine pyramid. Credit: Purdue University

  • NOTE: One thing Peter doesn’t like about the pyramid above is that it puts a hierarchy in place that suggests a meta-analysis is better than a randomized controlled trial, which is not necessarily true

Individual case report

  • Example: Peter wrote one while he was at the NIH:
    • It was about a patient who had come into the clinic with metastatic melanoma and their calcium was dangerously high
    • The first assumption was that this patient had metastatic disease to their bone and that they were lysing bone and calcium was leaching into their bloodstream
    • That turned out not to be the case. Instead, they had something that had not been previously reported in patients with melanoma: they had developed a parathyroid hormone-related-like hormone in response to their melanoma
    • This is a hormone that exists normally, but it doesn’t exist in this format
    • So their cancer was causing them to have more of this hormone, which was raising their calcium level
    • It was interesting because it had never been reported before in the literature, so Peter wrote up an individual case report
    • What’s the value in that? Well, the next time a patient with melanoma shows up to clinic and their calcium is sky-high and someone goes to the literature to search for it, they’ll see that report and it will hopefully save them time in getting to the diagnosis
  • On the podcast with Steve Rosenberg, he emphasizes the importance of these types of observations
    • To start the process for a study—from idea, to design, to execution—you must have a hypothesis that begins with an observation
    • In other words, an interesting observation is hypothesis generating and it might kickstart a larger trial 
    • That said, with one observation/case report, you can’t make any statement about the frequency of this in the broader subset of patients or make any comment about any intervention that may or may not change the outcome of this

Case series or set of studies [11:15]

  • Here you’re basically doing the same thing, but you would look at more than one patient
  • So for example, looking back at one’s clinical practice and noticing that 27 patients over the last 40 years have demonstrated a very unusual finding
  • For instance, one could write a paper that looks at all spontaneous regressions of cancer—this is incredibly rare, but there are certainly enough of them that one could write a case series.

Cohort studies [12:00]

  • Cohort studies are larger studies and they can be retrospective or they can be prospective
  • A retrospective observational cohort study would be looking backwards in time
    • For instance, go back and look at all the people who have used saunas for the last 10 years and look at how they’re doing today relative to people who didn’t use saunas over the last 10 years
    • It’s observational, no intervention, and the hope when you do this is that you’re going to see some sort of pattern
    • Undoubtedly, you will see a pattern, but the real question is, will you be able to establish causality in that pattern?
  • A prospective cohort study would be looking forward in time
    • For instance, one would follow people over the next 5-10 years who use saunas and compare them to a similar number of people who don’t
    • In a forward looking fashion, we’re going to be examining the other behaviors of these people and ultimately what their outcomes are—Do they have different rates of death, heart disease, cancer, Alzheimer’s disease, other metrics of health that we might be interested in?
    • Again, not intervening and not an experiment per se, just observing

Experimental studies [13:30]

  • A non-randomized trial sometimes gets referred to as an open label trial
    • This is where you take two groups of people and give one of them a treatment and the other either a placebo or a different treatment, but you don’t randomize them (i.e., there’s a reason that they’re in that group)
    • For example, you may want to study the effect of a certain antibiotic on a person that comes into the ER
      • So you take all the people that come in who look a certain way (fever, white blood cell count, etc.) and you’re going to give them the antibiotic
      • And the people who come in but they don’t have those exact signs or symptoms, you do NOT give an antibiotic and you’re just going to follow them
    • In summary, there are enormous limitations to non-randomized trials because presumably there’s a reason you’re making that decision, and that reason will undoubtedly introduce bias
  • A randomized controlled trial is referred to as the “gold standard”
    • This is where whatever question you want to study, you study it, but you attempt to take all bias out of it by randomly assigning people into the treatment groups
      • “Blinding” is another important aspect that Peter will discuss later in the podcast
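
As a rough illustration of random assignment, the sketch below shuffles a list of participants and deals them into arms. It is a simplified example under stated assumptions, not a description of how any particular trial was run. The function name `randomize`, the arm labels, and the participant IDs are all hypothetical.

```python
import random


def randomize(participant_ids, arms=("treatment", "control"), seed=None):
    """Randomly assign participants to arms of (nearly) equal size.

    Assumption: simple shuffled allocation; real trials often use
    stratified or blocked randomization instead.
    """
    rng = random.Random(seed)          # a seed makes the allocation reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    # Deal shuffled participants round-robin so arm sizes differ by at most one.
    return {pid: arms[i % len(arms)] for i, pid in enumerate(shuffled)}


# Hypothetical usage with 20 made-up participant IDs.
assignments = randomize(range(1, 21), seed=42)
for arm in ("treatment", "control"):
    print(arm, sum(1 for a in assignments.values() if a == arm))
```

Because the assignment depends only on the shuffle and not on anything about the participants, there is no "reason" for a person being in one group rather than the other, which is the point of randomizing.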

Meta-analysis [16:15]

  • This is a statistical technique where you can combine data from multiple studies that are attempting to look at the same question 
  • Each study gets a relative weighting and the weighting of a study is a function of its precision (sample size, other events in the study)
  • For instance, larger studies, which have smaller standard errors, are given more weight than smaller studies with larger standard errors
  • You’ll know you’re looking at a meta-analysis because usually there will be a figure somewhere in there that shows all of the studies across rows
  • Let’s say there’s 10 studies included in the meta-analysis, and then they’ll have the hazard ratios for each of the studies
  • They’ll usually represent them as little triangles; each triangle represents the 95% confidence interval of the hazard ratio (basically a marker of the risk)
  • You’ll see all 10 studies, and then at the bottom they’ll show you the final summation of them, which of course you wouldn’t be able to deduce just by looking at the figure, because it takes into account that mathematical weighting
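
The weighting described above can be sketched with one common approach, inverse-variance (fixed-effect) pooling on the log hazard ratio scale. This is an assumption about the technique, since the show notes do not say which weighting scheme a given meta-analysis uses. The function name `pool_fixed_effect` and the three example studies are made up for illustration.

```python
import math
from statistics import NormalDist


def pool_fixed_effect(hazard_ratios, standard_errors):
    """Pool hazard ratios with inverse-variance weights (fixed-effect model).

    standard_errors are the standard errors of log(hazard ratio).
    Returns (pooled_hr, (ci_low, ci_high), relative_weights).
    """
    weights = [1 / se ** 2 for se in standard_errors]   # more precise study -> larger weight
    total = sum(weights)
    log_hrs = [math.log(hr) for hr in hazard_ratios]
    pooled_log = sum(w * x for w, x in zip(weights, log_hrs)) / total
    pooled_se = math.sqrt(1 / total)
    z = NormalDist().inv_cdf(0.975)                     # multiplier for a 95% interval
    ci = (math.exp(pooled_log - z * pooled_se), math.exp(pooled_log + z * pooled_se))
    return math.exp(pooled_log), ci, [w / total for w in weights]


# Hypothetical studies: one large (small standard error) and two small ones.
pooled, ci, rel_weights = pool_fixed_effect([0.80, 1.10, 0.95], [0.05, 0.20, 0.25])
print(round(pooled, 3), [round(x, 3) for x in ci], [round(w, 3) for w in rel_weights])
```

In this made-up example the large study dominates the pooled estimate, which is also why pooling low-quality studies cannot rescue them: the arithmetic only reweights whatever went in.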

The truth about meta-analyses

  • On the surface, meta-analyses seem really, really great because if one trial, one randomized trial is good, 10 must be better
  • But as James Yang once said during a journal club about a meta-analysis that was being presented, “1000 sows’ ears makes not a pearl necklace,” which is just an eloquent way of saying garbage in, garbage out
  • If you do a meta-analysis of a bunch of garbage studies, you get a garbage meta-analysis
  • Conversely, a meta-analysis of great randomized controlled trials will produce a great meta-analysis


The different phases of a clinical trial [19:45]

  • When talking about human clinical trials, “phases” is the phraseology used by the FDA here in the United States

Origins of this process…

  • Say you have an interesting idea (e.g., a colon cancer drug or molecule that you think will have some benefit)
  • Say you’ve done some interesting experiments in animals, maybe started with some mice and you went up to some rats and maybe even you’ve done something in primates
  • Now you’re really committed to the success of this, and it’s both safe and efficacious in animals
  • You now decide you want to foray into the human space
  • Step 1 is you have to file for something called an IND, an Investigational New Drug application that basically sets your intention of testing this as a drug in humans
  • If you get the IND, you go to phase 1

Figure 2. Different phases of a trial.

Phase 1

  • Phase I is geared specifically to dose escalate this drug from a very, very low level to determine what the toxicity is across a range of doses that will hopefully have efficacy
  • Very small studies, usually fewer than 100 people, typically done in cohorts
  • E.g., the first 12 people are going to be at 0.1 mg/kg, and assuming we see no adverse effects there, we’ll go up to 0.15 mg/kg for the next 12 people. If we have no issues there, we’ll escalate it to 0.25 mg/kg and so on
  • Notice there is nothing in there about whether the drug works or not
  • The people in the study are going to be patients that all have colon cancer—often metastatic colon cancer
    • They have progressed through all other standard treatments without much luck
  • If the drug gets through phase I safely, then it goes to phase II
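
The cohort-by-cohort escalation described above can be sketched as a simple loop. This is a deliberately simplified illustration: the stopping rule (halt at the first cohort with any adverse event) is an assumption made for the example, and real phase I designs use more nuanced rules. The function name `highest_cleared_dose` and the event counts are hypothetical; the dose levels are the ones from the example above.

```python
def highest_cleared_dose(dose_levels_mg_per_kg, adverse_events_per_cohort):
    """Return the highest dose reached with no adverse events at or below it.

    Assumption: escalation stops at the first cohort reporting any adverse
    event. Returns None if even the lowest dose had an adverse event.
    """
    cleared = None
    for dose, events in zip(dose_levels_mg_per_kg, adverse_events_per_cohort):
        if events > 0:
            break          # do not escalate past a cohort with adverse events
        cleared = dose
    return cleared


# Doses from the example above; the adverse event counts are made up.
doses = [0.1, 0.15, 0.25]
print(highest_cleared_dose(doses, [0, 0, 1]))
```

Note that nothing in this loop measures whether the drug works; it only tracks safety across doses, which mirrors the purpose of phase I.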

Phase 2

  • The goal of phase II is to continue to evaluate for safety, but also to start to look for efficacy
  • This is done in an open label fashion—they’re not randomizing patients to one drug versus the other typically
  • Usually it’s at the point where the investigators think that one or two doses are going to produce efficacy
  • Those dose levels were deemed safe in phase I, so now we’re going to take patients and give them this drug and look for an effect
  • A lot of times, if there’s no control arm in this study, you’re going to compare the drug to the natural history
  • For instance, let’s assume that we know that patients with metastatic colon cancer on standard of care have a median survival of X months
  • Now we’re going to give these patients this drug and see if that extends it any further
  • Typically these are very small studies: they can be in the 20-50 range, maybe up to a few hundred people
  • There are lots of things that can introduce bias to a phase II if it does not have randomization
  • The goal would be to still randomize in phase II, because you really do want to tease out efficacy
  • A compound succeeds in phase II if it continues to show no significant adverse safety effects. Note that no adverse events doesn’t mean no side effects; it’s just that it doesn’t have side effects that are deemed unacceptable for the risk profile of the patient
  • So if it shows efficacy without adverse events, you then proceed to phase III

Phase 3

  • Phase III is a really rigorous trial
  • It’s typically a log step up in the number of patients to thousands of patients
  • This is absolutely either a placebo-controlled trial or it can be standard of care versus standard of care plus this new agent
  • It will be randomized and whenever possible, it is blinded
  • And these are typically longer studies
  • And because you have a much larger sample size, you’re potentially going to pick up side effects that weren’t detected in the earlier phases
  • Now you really have that gold standard for measuring efficacy
  • And it’s on the basis of the phase I, phase II and mostly phase III data that a drug will get approved or not approved for broad use, which leads to a fourth phase

Phase 4

  • This is a post marketing study
  • Phase IV studies take place after the drug has been approved, and they’re used to basically get additional information because once a drug is approved, you now have more people taking it
  • And they may also be using this to look at other indications for the drug
  • In AMA #29, a phase IV trial with semaglutide was discussed
    • It was a trial that used semaglutide to look at obesity versus its original phase III trials, which were looking at diabetes
    • Semaglutide is a drug that’s already been approved — the trial is basically expanding the indication for semaglutide, in this case so that insurance companies would actually pay for it for a new indication
  • In phase 4, you’re also looking for whether there is another side effect that was missed in the phase 3 trial


Observational studies and the potential for bias [27:00]

Are there things you look for in an observational study that increase or decrease your confidence in it?


{end of show notes preview}


Disclaimer: This blog is for general informational purposes only and does not constitute the practice of medicine, nursing or other professional health care services, including the giving of medical advice, and no doctor/patient relationship is formed. The use of information on this blog or materials linked from this blog is at the user's own risk. The content of this blog is not intended to be a substitute for professional medical advice, diagnosis, or treatment. Users should not disregard, or delay in obtaining, medical advice for any medical condition they may have, and should seek the assistance of their health care professionals for any such conditions.