ICYMI, our article on estimands recently appeared in the BMJ [1]. I’ll post soon about a response we received, but here I want to address some general objections – not to our paper but to the Addendum’s framework.
Addressing some objections from Twitter
I’ve stopped engaging on Twitter these days (I’m on Bluesky), but a few people have sent me various bits of a conversation going on there. Two main, closely related, objections to estimands seem to be (paraphrasing):
‘I’m more interested in estimation’
‘Estimands don’t change what people actually do, so why bother’
Here are a couple of things you might not know:
The ICH E9(R1) Addendum [2] was partly prompted by the National Research Council report on the handling of missing data in trials [3]. The report talked about ‘bias’ of methods like last observation carried forward. The question is: biased for which estimand?
Another reason for the Addendum was to improve communication between trial statisticians and clinicians. I think it’s helping to an extent, but there is still a way to go, which was part of the motivation for our BMJ paper.
Hey, if you’re interested in estimation, great! Estimands are not the only thing we need. ‘Using estimands’ does not mean defining the estimand and then doing whatever you want; it’s just the first step. You can be a methodologist who’s interested in estimation – I am. You just can’t do things like derive estimators unless you have a clear estimand in mind. It’s a crazy criticism of the Addendum that it covers estimands but not the whole of estimation as well.
For the ‘why bother, it changes nothing’ objection, I’ll use an argument from my colleague Ian White. Someone I greatly respect complained to Ian and me about estimands in general because a hypothetical strategy could represent an impossible scenario. For example, a hypothetical estimand might be ‘if, counterfactually, no patients had suffered adverse events’. Ian responded that the whole point is to bring this out into the open. Someone might write in a statistical analysis plan (SAP) ‘we censor and reweight at occurrence of AEs’ – an estimation procedure. The point of estimands is to force us to first say ‘we aim to estimate the effect if AEs never happened’. People who are more interested in estimation, take note.
So, stating the estimand up front achieves two things:
it makes the crazy aim transparent;
it makes clear the need to explain how AEs would be avoided in order to realise this.
I call the latter the realisability criterion. This relates to a bugbear of mine with hypothetical strategies in general, even when they are used to handle non-adherence to an initiated treatment: it’s fine to say you would like to know the effect if everyone had adhered, but estimating this is of little use unless you can then ensure that everyone offered the treatment in future actually does adhere.
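To make the ‘censor and reweight’ procedure concrete, here is a minimal sketch of inverse-probability-of-censoring weighting (IPCW) on simulated data. Everything in it – the covariate, the AE model, the outcome model – is hypothetical and invented purely for illustration. The point is that the weighted analysis targets the hypothetical ‘if AEs never happened’ estimand, which is exactly the aim the estimand statement forces into the open.

```python
import random

random.seed(1)

# Hypothetical toy data: each patient has a baseline covariate x,
# may suffer an AE (after which their outcome is censored), and has
# an outcome y that would be observed in the 'no AE' world.
n = 10_000
data = []
for _ in range(n):
    x = random.random()                        # baseline covariate in [0, 1)
    ae = random.random() < 0.2 + 0.5 * x       # AE risk increases with x
    y = 1.0 + 2.0 * x + random.gauss(0, 0.1)   # outcome in the 'no AE' world
    data.append((x, ae, y))

# Step 1: model P(no AE | x). Here we use the known form for simplicity;
# in a real analysis this would be estimated (e.g. logistic regression).
def p_uncensored(x):
    return 1 - (0.2 + 0.5 * x)

# Step 2: reweight uncensored patients by 1 / P(no AE | x), so they
# stand in for similar patients who were censored at an AE.
num = sum(y / p_uncensored(x) for x, ae, y in data if not ae)
den = sum(1 / p_uncensored(x) for x, ae, y in data if not ae)
ipcw_mean = num / den

# Naive mean among uncensored patients: biased for the hypothetical
# estimand, because high-x patients are censored more often.
uncensored = [y for x, ae, y in data if not ae]
naive_mean = sum(uncensored) / len(uncensored)

# True 'no AE' mean is E[y] = 1 + 2 * E[x] = 2.0
print(f"IPCW: {ipcw_mean:.2f}, naive: {naive_mean:.2f}, truth: 2.00")
```

Note that the code only answers the estimation question; it says nothing about whether a world in which no patient ever has an AE could actually be brought about, which is precisely the realisability point above.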
The I-prefer-another-framework objection
Another odd objection to the Addendum has come from various causal epidemiologists with strong views on trials but no skin in the trials game. Despite that jab, I’ve learned a lot from the field of causal epidemiology (post forthcoming about one such point). However, there is something very strange here. There seems to be this sort of hurt, like, ‘Ugh, trials people have discovered estimands and think they’ve invented them, and have all this annoying new terminology when we have been doing this for years’ (yes, everyone agrees intercurrent event is a terrible term, and so are several of the terms for intercurrent event strategies). See above for what actually prompted the Addendum.
One person said ‘I think it is a mistake to advocate for this framework when, in my opinion, there are established frameworks that address “intercurrent events”.’ Then, no joke, they pointed us to the following passage [4].
Several causal effects can be of interest in true randomized trials. Two common ones are the intention-to-treat effect (i.e., the comparative effect of being assigned to the treatment strategies at baseline, regardless of whether the individuals continue following the strategies after baseline) and the per-protocol effect (i.e., the comparative effect of following the treatment strategies specified in the study protocol). Often, both effects are of interest. If the intention-to-treat and per-protocol effects are of interest in the target trial, we would try to estimate analogs of both effects from our observational data.
I assume neither of the authors of this paragraph would claim it as an alternative framework, but one of their disciples does. To be clear, it is not. It does not contain a description of the estimands (the word ‘analogs’ is a clue: an estimand is a statement of what we wish to estimate, and as such does not depend on the study design). The use of the terms intention-to-treat and per-protocol does nicely highlight the need to separate what we want to estimate from how we estimate it. These terms describe analysis sets, and using them to denote estimands causes confusion; the FLO-ELA trial example in our paper shows how a treatment policy estimand and an ITT analysis can be at odds. Finally, per-protocol is itself ambiguous: it could mean a hypothetical strategy, a principal stratum strategy, or something else.
1. Kahan BC, Hindley J, Edwards M, Cro S, Morris TP. The estimands framework: a primer on the ICH E9(R1) addendum. BMJ 2024;384:e076316. doi:10.1136/bmj-2023-076316
2. European Medicines Agency. ICH E9 (R1) addendum on estimands and sensitivity analysis in clinical trials to the guideline on statistical principles for clinical trials. 2020. https://www.ema.europa.eu/en/documents/scientific-guideline/ich-e9-r1-addendum-estimands-sensitivity-analysis-clinical-trials-guideline-statistical-principles_en.pdf
3. National Research Council Panel on Handling Missing Data in Clinical Trials. The Prevention and Treatment of Missing Data in Clinical Trials. National Academies Press; 2010.
4. Hernán MA, Robins JM. Using big data to emulate a target trial when a randomized trial is not available. American Journal of Epidemiology 2016;183(8):758-64. doi:10.1093/aje/kwv254
I wonder if DAGs suffer from the same kind of critique, i.e. they aren't useful because they don't actually tell you precisely *how* to estimate things.