Random estimands?
This has come up a few times and I'm not sure how much to think or worry about it.
The first time this came up was a few years ago when we were working on our simulation tutorial. We were reviewing papers so we could summarise what people were actually doing in their simulation studies, and two of those papers involved what I term random estimands. For example, the paper by Kimani, Todd and Stallard (reference below) considered subpopulation selection followed by estimation conditional on the selected subpopulation. If you are a sensible person and regard the target population as a key attribute of an estimand, then it's reasonable to say that the chosen estimand in this simulation study was random across repetitions. Interestingly, those authors focused on conditional performance (given the selection made), while the other paper, by Carreras, Gutjahr and Brannath, evaluated an average over the distribution of the random estimand.
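To make this concrete, here is a minimal, hypothetical sketch (my own toy setup, not the design of either paper): each repetition selects whichever of two subpopulations looks best at an interim stage and then estimates the effect in the selected one, so the true value being targeted varies across repetitions. Performance can then be summarised conditionally on the selection or averaged over the distribution of the random estimand.

```python
import numpy as np

rng = np.random.default_rng(2024)
true_effect = {"sub1": 0.3, "sub2": 0.5}  # hypothetical true effects
n_per_sub, n_reps = 100, 5000

selected, estimates, truths = [], [], []
for _ in range(n_reps):
    # Stage 1: estimate the effect in each subpopulation; select the larger
    est = {s: rng.normal(true_effect[s], 1 / np.sqrt(n_per_sub))
           for s in true_effect}
    s = max(est, key=est.get)
    # Naively report the estimate for the selected subpopulation; the estimand
    # (the true effect in whichever subpopulation was selected) is itself random
    selected.append(s)
    estimates.append(est[s])
    truths.append(true_effect[s])

selected = np.array(selected)
err = np.array(estimates) - np.array(truths)
print("Bias averaged over the random estimand:", err.mean())
for s in true_effect:
    print(f"Conditional bias given {s} selected:", err[selected == s].mean())
```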
The second time this came up was in a discussion with Karla DiazOrdaz when we were working on Mia Tackney's paper. We were discussing overlap weighting, and Karla described the estimand as random: random allocation means the overlap population can change from one set of treatment assignments to the next, and Karla argued that the implicitly targeted population is then sample-specific (a bit like the sample average treatment effect, I guess).
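Below is a rough sketch of what I mean, assuming scikit-learn for the propensity model (the function and variable names are mine, not anything from Mia's paper). The covariates are held fixed and only the allocation is re-randomised; the covariate mean in the overlap population, defined by the weights e(x)(1 − e(x)), still varies across allocations.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=(n, 1))  # baseline covariate, fixed across re-randomisations

def overlap_target_mean(x, rng):
    """Re-randomise, fit a propensity model and return the covariate mean in
    the implied overlap population (weights e(x) * (1 - e(x)))."""
    a = rng.binomial(1, 0.5, size=len(x))            # fresh 1:1 random allocation
    e = LogisticRegression().fit(x, a).predict_proba(x)[:, 1]
    w = e * (1 - e)                                  # overlap weights
    return np.average(x[:, 0], weights=w)

targets = [overlap_target_mean(x, rng) for _ in range(1000)]
# Nonzero spread: the implicitly targeted population is allocation-specific
print("SD of overlap-population covariate mean across allocations:", np.std(targets))
```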
The third time this came up was when Ian White and I were talking about how to combine multiple imputation of partially-observed covariates with standardisation. Our first thought was what Angela Wood calls the pool-last principle, which is to say 'do Rubin's rules outside of everything else'*. That means you do standardisation within each imputed dataset and then combine inference using Rubin's rules on the scale of the estimand. However, when you multiply impute covariates, the imputed distribution of covariates changes from one imputation to the next, meaning that each imputation is standardised to a different target distribution. That is, there is some heterogeneity induced by having a random estimand. I don't know what the impact of this is. Possibly it just drifts into the between-imputation variance.
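Here is a numpy-only sketch of the pool-last mechanics, with a deliberately crude imputation model (a real analysis would impute from a model including treatment and outcome). Because the outcome model has a treatment-by-covariate interaction, the standardised estimate in each imputation depends explicitly on that imputation's covariate distribution, which is the source of the heterogeneity.

```python
import numpy as np

rng = np.random.default_rng(7)
n, m = 500, 20                           # sample size, number of imputations

# Hypothetical data: covariate x partially missing, treatment a, outcome y
x = rng.normal(size=n)
a = rng.binomial(1, 0.5, size=n)
y = a + 0.5 * x + 0.3 * a * x + rng.normal(size=n)
miss = rng.random(n) < 0.3
x_obs = np.where(miss, np.nan, x)

ests = []
for _ in range(m):
    # Crude normal imputation from the observed mean/sd (illustration only)
    x_imp = x_obs.copy()
    x_imp[miss] = rng.normal(np.nanmean(x_obs), np.nanstd(x_obs), miss.sum())

    # Pool-last: fit and standardise *within* this imputed dataset
    X = np.column_stack([np.ones(n), a, x_imp, a * x_imp])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    X1 = np.column_stack([np.ones(n), np.ones(n), x_imp, x_imp])         # all treated
    X0 = np.column_stack([np.ones(n), np.zeros(n), x_imp, np.zeros(n)])  # all control
    # Standardised over THIS imputation's covariate distribution
    ests.append((X1 @ beta - X0 @ beta).mean())

# Rubin's rules would pool these (with within-imputation variances); note that
# each estimate targets a slightly different covariate distribution
print("First few estimates:", np.round(ests[:5], 3))
print("Pooled point estimate:", np.mean(ests))
```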
Anyway, it’s come up other times that I’ve forgotten so I thought I’d write it down to return to later. References below.
*The pool-last principle doesn't always make sense: see, for example, Jonathan Bartlett and Rachael Hughes' paper on combining bootstrapping with multiple imputation, and Stephen Burgess and colleagues' paper on combining multiple imputation with meta-analysis.
REFERENCES
Morris TP, White IR, Crowther MJ. Using simulation studies to evaluate statistical methods. Statistics in Medicine. 2019; 38: 2074–2102.
Kimani PK, Todd S, Stallard N. Estimation after subpopulation selection in adaptive seamless trials. Statistics in Medicine. 2015; 34: 2581–2601.
Carreras M, Gutjahr G, Brannath W. Adaptive seamless designs with interim treatment selection: a case study in oncology. Statistics in Medicine. 2015; 34: 1317–1333.
Tackney MS, Morris T, White I, et al. A comparison of covariate adjustment approaches under model misspecification in individually randomized trials. Trials. 2023; 24: 14.
Bartlett JW, Hughes RA. Bootstrap inference for multiple imputation under uncongeniality and misspecification. Statistical Methods in Medical Research. 2020; 29: 3533–3546.
Burgess S, White IR, Resche-Rigon M, Wood AM. Combining multiple imputation and meta-analysis with individual participant data. Statistics in Medicine. 2012; 32: 4499–4514.
Another example: the switch relative risk. This is defined to be the relative risk if treatment reduces risk, or the relative survival if treatment increases risk. Suppose that in truth treatment does nothing: then, in a simulation, the switch relative risk will be estimated as a relative risk in some repetitions and as a relative survival in others.
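A quick sketch of that point, under a null where both arms share the same true risk: in each repetition the estimated scale flips depending on which arm happens to show the lower risk.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 200, 0.3                      # per-arm size; equal true risk (null)
scales = []
for _ in range(5000):
    p1 = rng.binomial(n, p) / n      # estimated risk, treated arm
    p0 = rng.binomial(n, p) / n      # estimated risk, control arm
    if p1 <= p0:
        est = p1 / p0                # risk apparently reduced: relative risk
        scales.append("RR")
    else:
        est = (1 - p1) / (1 - p0)    # risk apparently increased: relative survival
        scales.append("RS")

# Roughly half the runs end up on each scale under the null
print({s: scales.count(s) / len(scales) for s in ("RR", "RS")})
```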