We all have secret futures, and stories never told.
I could have zigged, but instead I zagged. Turned left instead of right.
What if I never tried to hold her hand?
Believe it or not, our understanding of how medicines work is rooted in counterfactuals like these. Imagine two universes, completely identical, except in one you were treated and in the other you weren’t. The true effect of the treatment is then the difference in outcomes observed in the two universes. It is hopefully obvious to you that this treatment effect is not something we can actually ever know, since it’s no more possible to go back in time to change your treatment than it is to go back and zig instead of zag. The past is lost to us, and there is but one universe. But knowing that effect, for you in particular, is exactly the promise of personalized medicine, a promise that nobody can keep, no matter how many times they make it. Please let me explain.
To understand the false promise of personalized medicine, we should start with the very real problem it’s intended to solve. We generally accept that randomized controlled trials are a good way to understand the effects of medical treatments. However, trials are typically designed to help us understand the average treatment effect (ATE), which, like most averages, is useful but limited.
Let’s imagine a treatment that does different things to different people, i.e. the effects of the treatment vary. As with any other variable, we can summarize the distribution of those effects in different ways, including by calculating their average. If the individual treatment effects are fairly similar, then the average would probably be a good summary of them. But imagine if most people didn’t benefit from the treatment at all, while a few people benefited a lot. The resulting ATE from a trial of this treatment might suggest a mildly beneficial effect, perhaps leading us to use it widely. However, this would result in treating lots of people who wouldn’t actually benefit, needlessly wasting resources and exposing patients to side-effects and other costs. Clearly it would be better if we could target the treatment to those who actually stand to benefit.
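To make that scenario concrete, here is a minimal simulation sketch. Everything in it is made up for illustration: the function name, the 10% share of "responders", and the effect size of 10 are my assumptions, not numbers from any trial. Note that the simulation can only list individual effects because we invented them; in real life they are exactly the thing we never get to see.

```python
import random
import statistics


def simulate_effects(n=10_000, responder_share=0.1, responder_effect=10.0, seed=0):
    """Hypothetical individual treatment effects: most are 0, a few are large.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    return [
        responder_effect if rng.random() < responder_share else 0.0
        for _ in range(n)
    ]


effects = simulate_effects()

# The average hides the fact that most people get nothing from the treatment.
ate = statistics.mean(effects)
share_no_benefit = sum(e == 0.0 for e in effects) / len(effects)

print(f"ATE: {ate:.2f}")
print(f"Share with no benefit: {share_no_benefit:.1%}")
```

The average comes out as a modest positive number even though the large majority of the simulated people have an effect of exactly zero, which is the mismatch described above.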
So how do we actually do it? Some think that we can observe individual treatment effects by comparing a participant’s status at the end of the trial to their status at the beginning - so if someone gets better during the trial, then they “responded” to the treatment. The problem with this kind of thinking (which is more common than you might expect) is that you have no earthly idea if the reason they got better was the treatment or not. Maybe it explains some of the improvement, or maybe it explains none of it. Maybe the person would have gotten even better without the treatment. There is no way to know, since we can only ever observe one of the two outcomes we would need to compare (the outcome for a particular person when they were treated vs their outcome when they weren’t); the other, the counterfactual, is never seen. This is the fundamental problem of causal inference.
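For readers who like notation, here is one common way to write this down. It is a sketch in standard potential-outcomes notation, and the symbols are my additions rather than anything defined above: Y_i(1) and Y_i(0) are person i’s outcomes with and without treatment, T_i is the treatment they actually received, and τ_i is their individual treatment effect.

```latex
\begin{align}
\tau_i &= Y_i(1) - Y_i(0) \\
\text{ATE} &= \mathbb{E}[\tau_i]
  = \mathbb{E}[Y_i(1)] - \mathbb{E}[Y_i(0)]
  = \mathbb{E}[Y_i \mid T_i = 1] - \mathbb{E}[Y_i \mid T_i = 0]
\end{align}
```

For any one person we only ever see Y_i(1) or Y_i(0), never both, so τ_i is out of reach. The last equality in the second line relies on treatment being randomized, which is why a trial can recover the average even though it can’t recover anyone’s individual effect.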
However, hope is not lost. While we can’t observe individual treatment effects, we can observe subgroup effects - that is, the average treatment effect for a specific subgroup of your overall population. We can then ask smart questions about whether the effect in one subgroup is appreciably different from that in another - or whether the effect meaningfully varies as a function of some continuously measured characteristic.
And then we can look at the ATE in subgroups of the subgroups. And then in the subgroups of the subgroups of the subgroups. By age and by sex and by ethnicity. By neighborhood, occupation and education. By height and by weight and by adiposity. By blood pressure, heart rate and biomarkers. And, finally, by this gene and that gene and those genes (or anything else we can measure about a person). And so we use all the information we can possibly get our hands on to describe effects in more and more precisely defined subgroups, hoping to traverse from medicine to stratified medicine to precision medicine to finally arrive at, perhaps, personalized medicine.
But there is a big problem with this. Estimates of effects always come with some uncertainty. We typically try to overcome that uncertainty through replication, i.e. by estimating the effect in large samples. But as we attempt to estimate the treatment effect in more and more specific subgroups, the sample size underlying the estimate must shrink, and so the error of that estimate gets larger and larger, rendering it less and less useful. This tradeoff is a rocket ship fighting against gravity. It is death and taxes.
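A back-of-the-envelope sketch shows how quickly this bites. It assumes a simple two-arm trial with equal allocation, a continuous outcome with a common standard deviation of 1, and subgroups that split the sample exactly in half each time; the function name and the starting sample size of 8,000 are hypothetical.

```python
import math


def se_difference(n_total, sd=1.0):
    """Standard error of a difference in means with two equal arms.

    Assumes a common outcome standard deviation `sd` in both arms.
    """
    n_per_arm = n_total / 2
    return math.sqrt(sd**2 / n_per_arm + sd**2 / n_per_arm)


n = 8000  # hypothetical overall trial size

# Each additional binary characteristic halves the subgroup (at best).
for splits in range(6):
    subgroup_n = n / 2**splits
    se = se_difference(subgroup_n)
    print(
        f"{splits} splits: n = {subgroup_n:.0f}, "
        f"SE = {se:.3f}, 95% CI half-width = {1.96 * se:.3f}"
    )
```

Under these assumptions the standard error is 2 × sd / √n, so every time the subgroup shrinks to a quarter of its size the uncertainty doubles. Real subgroups are rarely this evenly sized, which only makes the smallest ones worse.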
Let’s add one final challenge for personalized medicine. We usually run clinical trials to generate evidence about whether a treatment works well enough (and is safe enough) in a relatively small sample of patients before we start using it in some larger, target population. However, if the target population is the unique and beautiful snowflake that is you, then we can’t trial the treatment in you before giving it to you. Obviously!
So there is no such thing as personalized medicine. Of course, many of the legitimate scientists who talk about personalized treatment effects know that they aren’t actually personalized. It’s just a more exciting word than targeted or stratified or precision. “This is semantics, Darren, move on.” That’s probably good advice, so I’m just going to end this little rant with a few papers that I’ve found particularly helpful when trying to think about how we might realistically improve on the average treatment effect.
Statistical pitfalls of personalized medicine. Stephen Senn
Can we learn individual-level treatment policies from clinical data? Uri Shalit