I recently read an, shall we say, interesting headline from ScienceDaily: “Positive effect of music and dance on dementia proven by New Zealand study.” (Emphasis mine) The “positive effect” was reported as a statistically significant improvement in quality of life (QoL) in this pilot study.
The aim of this intervention was to promote a better QoL for people with dementia using familiar, reminiscent music and the natural gestures of the group of 22 participants to create an original series of dance exercises. Performed over 10 weekly sessions, the study showed that “participants reported significant improvements in their quality of life after session six.” (We’ll come back to this point, below.)
So what, exactly, was “proven” in this study?
This story is a copy & paste (including the headline) of a press release from the lead author’s institution, the University of Otago. It takes a village to promulgate this nonsense. Presumably, someone at the university had to approve the story and headline before its release, and an editor at ScienceDaily also had to publish it. So nobody in the village gets a pass on this one in my book.
First of all—and if you remember not one other point from this email, remember this one—nothing is ever completely “proven” in scientific endeavors. Proofs are for mathematics, not science. (**) Something can be more (or less) likely to be true, but to say something is proven essentially says that there is 100% certainty in the assertion. That’s not how science works. There is no absolute truth in science.
As if that weren’t enough to set me off, consider the overall weakness of this study. Given that weakness, the choice of the word “proven” borders on journalistic manslaughter.
Proven? This was a pilot study, and pilot studies, by design, are not meant to test hypotheses and are not equipped to determine cause-and-effect relationships. Their purpose is basically to help determine if there is a signal in a sea of data noise and to work out the operational kinks to reduce risk for the larger study.
Proven? There was no control group in this study. The study was not randomized. How can we possibly know that any effect, however small, was not the result of some other factor associated with being in a study, such as the attention of the investigators? (See more on why randomization and a control group are so important for trying to establish reliable knowledge in my article “Studying Studies: Part IV.”)
Proven? Did you notice that the article stated that participants reported significant improvements in their QoL after session six, which was 6 weeks into a 10-week study? (QoL here was self-reported. I’ll leave this alone for now, but suffice it to say, there are plenty of problems with self-reporting.) It turns out that after the 10 weekly sessions, the reported improvements in the QoL of participants were not statistically significant. If this isn’t a classic case of “p-hacking,” I don’t know what is. (You can find relevant links to this type of data dredging in “Studying Studies: Part V.”)
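To see why highlighting one significant time point out of many is a problem, here is a minimal sketch in Python. It is not the study's data or analysis. It simulates 22 participants whose QoL change is pure noise (no true effect), tests for a change at each of 10 sessions, and counts how often at least one session comes out "significant" at p < 0.05. The function names, the known standard deviation of 1, the z-test, and the independence of the 10 looks are all simplifying assumptions of mine. Repeated measurements on the same people are correlated, so the real inflation would be smaller than shown here, but it does not go away.

```python
import random
from statistics import NormalDist, mean


def p_value_two_sided(changes, sd=1.0):
    """Two-sided z-test p-value for 'mean change is zero'.

    Assumes the standard deviation is known (sd=1.0), purely for illustration.
    """
    n = len(changes)
    z = mean(changes) * n ** 0.5 / sd
    return 2 * (1 - NormalDist().cdf(abs(z)))


def any_significant_rate(n_participants=22, n_looks=10, alpha=0.05,
                         n_studies=5000, seed=0):
    """Fraction of simulated null studies with p < alpha at one or more looks."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        for _ in range(n_looks):
            # No true effect: every change score is noise centered on zero.
            changes = [rng.gauss(0.0, 1.0) for _ in range(n_participants)]
            if p_value_two_sided(changes) < alpha:
                hits += 1
                break  # one "significant" session is enough for a headline
    return hits / n_studies


if __name__ == "__main__":
    print("one look: ", any_significant_rate(n_looks=1))
    print("ten looks:", any_significant_rate(n_looks=10))
```

Under these assumptions, a single pre-specified look produces a false positive about 5% of the time, while ten independent looks produce at least one about 1 − 0.95^10 ≈ 40% of the time. That is why a significant result at session six, with nothing significant at session ten, deserves suspicion, not a headline.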
Proven? Thought experiment: Imagine if a draft of the aforementioned “proven” headline reached the desk of the editor, and she deleted the p-word from the headline and replaced it with something more unassuming. “Positive effect of music and dance on dementia observed by New Zealand study.” This, too, is an insane headline, given all of the limitations of this study. It still confuses association with causality. The headline still implies a cause-and-effect relationship when no such relationship can be established from the study. (This is covered in more detail in “Studying Studies: Part I,” including an example of a similar mistaken headline published on the AAAS website: “Cholesterol-fighting drugs lower risk of Alzheimer’s disease.”)
In summary, a 10-week pilot study, which by design cannot test hypotheses let alone establish cause-and-effect relationships, found that self-reported improvements in QoL after 10 weekly sessions of music and dance were not statistically significant, and this prompted the University of Otago to publish a press release claiming that the positive effect of music and dance on QoL in dementia patients is now proven. If you think this study/story combo is an exception, I suggest you subscribe to this type of science newsletter. I think you’ll be surprised at the prevalence of this type of scientific and journalistic toxicity.
(**) Richard Feynman discusses this important concept in The Pleasure of Finding Things Out:
“For the student, when he learns about science, there are two sources of difficulty in trying to weld science and religion together. The first source of difficulty is this–that it is imperative in science to doubt; it is absolutely necessary, for progress in science, to have uncertainty as a fundamental part of your inner nature. To make progress in understanding, we must remain modest and allow that we do not know. Nothing is certain or proved beyond all doubt. You investigate for curiosity, because it is unknown, not because you know the answer. And as you develop more information in the sciences, it is not that you are finding out the truth, but that you are finding out that this or that is more or less likely.
“That is, if we investigate further, we find that the statements of science are not of what is true and what is not true, but statements of what is known to different degrees of certainty: ‘It is very much more likely that so and so is true than that it is not true’; or ‘such and such is almost certain but there is still a little bit of doubt’; or–at the other extreme–‘well, we really don’t know.’ Every one of the concepts of science is on a scale graduated somewhere between, but at neither end of, absolute falsity or absolute truth.”