A collaborator recently[1] asked an interesting question regarding missing baseline data in a randomised trial where we analyse change-from-baseline.
There are plenty of warnings against analysing change-from-baseline, but sometimes people want to do it and sometimes we pick our battles. So if you are about to object to the premise of change-from-baseline in randomised trials, know that this post is written in a similar spirit to Stephen Senn’s classic title, ‘How to perform the two-stage analysis if you can’t be persuaded that you shouldn’t’[2].
Missing baselines in RCTs with covariate adjustment
First off, let’s talk about handling missing baseline covariates in trials where the analysis adjusts for baseline. As far as I know, the first paper to consider this issue was by White and Thompson[3], and its results may surprise you. Important note: their results are relevant when the population-level summary measure of interest is marginal. So-called complete-case analysis – restricting to individuals with an observed baseline – is pretty much the worst thing you can do. Thanks to randomisation[4], we can use methods that would otherwise give people like White and Thompson a heart attack: mean imputation or missing indicators, for example. In this post I’m going to focus on mean imputation[5].
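To make ‘overall mean imputation’ concrete, here is a minimal sketch – in Python rather than Stata, with a made-up function name and data: missing baselines are replaced by the mean of the observed baselines, pooled across both arms, and a missing indicator is recorded in case the analysis wants to adjust for it.

```python
import statistics

def mean_impute(x):
    """Overall mean imputation for a baseline covariate.

    Missing values (None) are replaced by the mean of the observed
    baselines, pooled across both arms; a missing indicator is also
    returned in case the analysis wants to adjust for it.
    """
    observed = [v for v in x if v is not None]
    xbar = statistics.mean(observed)
    imputed = [xbar if v is None else v for v in x]
    indicator = [1 if v is None else 0 for v in x]
    return imputed, indicator

baseline = [1.0, 2.0, None, 3.0]
x_imp, miss = mean_impute(baseline)  # the None becomes 2.0, the observed mean
```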
Covariate-adjusted analysis of outcome or change returns the same inference about the treatment effect
(Skip this section if the point in its heading is already familiar to you.)
Suppose you wish to estimate a marginal treatment effect on outcome Y and, for various reasons, plan to adjust for baseline X. Your collaborator wants you to instead analyse change-from-baseline, Y–X. Lots of us have been in this position.
You know what’s cool? You can do both: analyse change-from-baseline as they wish, but adjust for baseline as you wish! The reason it’s cool is that, although this sounds like you met in the middle, the resulting inference on the treatment effect is the same as if you had just adjusted for baseline as you wanted. The intercept and covariate coefficient will change, but not the treatment effect or its variance. This is true both for ancova1 and ancova2[6].
By doing this, you have simultaneously compromised and got your own way. If you want to see it for yourself, here is some Stata code simulating a dataset and showing that the two analyses return identical inference on the treatment effect (look at the lines beginning 1.trt).
clear
set seed 1
matrix mean = (0, 0) // mean x and y are both 0
matrix sd = (2, 2) // variance of x and y are both 4
matrix corr = (1, .3 \ .3, 1) // Corr(x,y) is .3
drawnorm x y , mean(mean) sd(sd) corr(corr) n(50) // Potential outcome without trt
gen byte trt = _n>_N/2
replace y = 1+y if trt // Trt effect (common) is 1
gen float change = y-x
* Mean-centre x
quietly summarize x
quietly replace x = x-r(mean)
* Ancova1
regress y i.trt x
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
1.trt | .7406375 .552541 1.34 0.187 -.3709317 1.852207
x | .647904 .1385978 4.67 0.000 .3690812 .9267268
_cons | .389234 .3906564 1.00 0.324 -.3966654 1.175133
------------------------------------------------------------------------------
regress change i.trt x
------------------------------------------------------------------------------
change | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
1.trt | .7406375 .552541 1.34 0.187 -.3709317 1.852207
x | -.352096 .1385978 -2.54 0.014 -.6309188 -.0732732
_cons | .418356 .3906564 1.07 0.290 -.3675434 1.204255
------------------------------------------------------------------------------
* Ancova2
regress y i.trt##c.x
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
1.trt | .7404068 .5585028 1.33 0.191 -.383801 1.864615
x | .6544894 .1749995 3.74 0.001 .3022337 1.006745
|
trt#c.x |
1 | -.0183349 .292002 -0.06 0.950 -.6061046 .5694347
|
_cons | .3889397 .3948908 0.98 0.330 -.4059342 1.183814
------------------------------------------------------------------------------
regress change i.trt##c.x
------------------------------------------------------------------------------
change | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
1.trt | .7404068 .5585028 1.33 0.191 -.383801 1.864615
x | -.3455106 .1749995 -1.97 0.054 -.6977663 .0067451
|
trt#c.x |
1 | -.0183349 .292002 -0.06 0.950 -.6061046 .5694347
|
_cons | .4180617 .3948908 1.06 0.295 -.3768121 1.212936
------------------------------------------------------------------------------
So…
The point of the above is that, if analysis of change adjusted for baseline is equivalent to analysis of outcome adjusted for baseline, then the missing-baseline methods that work well for the latter (from White & Thompson) are also fine for the former.
I want you to realise that, if you are comfortable with mean imputation of X before analysing outcome adjusted for baseline, you must also be comfortable with mean imputation before analysis of change adjusted for baseline. Because remember, the inference will be identical.
Without baseline adjustment
Now let’s remove the covariate adjustment part. Are we suddenly unhappy with mean imputation before analysing change?
Denote outcome as Y and covariate as X. Then the variance of the estimator when we analyse outcome (ignoring X) is proportional to the variance of the outcome, Var(Y). If we analyse change-from-baseline, Y–X, the variance of our estimator is proportional to Var(Y–X)=Var(Y)+Var(X)–2Cov(Y,X). For Var(Y–X) to be smaller than Var(Y), we must have 2Cov(Y,X)>Var(X). When Var(X)=Var(Y), this amounts to Corr(Y,X)>0.5.
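These variance formulae are easy to check numerically. Here is a small simulation (a sketch in Python rather than Stata) with Var(X)=Var(Y)=1 and Corr(Y,X)=0.3 – below the 0.5 threshold – so the variance of change should exceed the variance of outcome:

```python
import random
import statistics

random.seed(1)
n = 100_000
rho = 0.3  # Corr(Y, X), below the 0.5 threshold

# X and Y standard normal with correlation rho (no treatment effect needed here)
x = [random.gauss(0, 1) for _ in range(n)]
y = [rho * xi + (1 - rho**2) ** 0.5 * random.gauss(0, 1) for xi in x]

var_y = statistics.variance(y)                                     # about 1
var_change = statistics.variance(yi - xi for yi, xi in zip(y, x))  # about 2(1 - rho) = 1.4
```

With Corr(Y,X) above 0.5 the inequality flips, and change-from-baseline beats raw outcome.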
Suppose now that, instead of analysing change from baseline, Y–X, we analyse change from the baseline mean, that is Y–Xbar. The baseline mean is a constant, so Var(Xbar)=0 and Cov(Y,Xbar)=0, and the variance of change-from-baseline-mean is proportional to Var(Y)+0–(2×0)=Var(Y). Because the mean of X is just a constant, it’s obvious that subtracting it will not change the variance of Y. On this basis, I assume you have no objection to analysing change-from-baseline-mean.
Finally, if a trial has some participants with X observed and others with X missing, we could calculate change-from-baseline for the former and change-from-baseline-mean for the latter. The missingness mechanism could feasibly be related to X but not to treatment: X is measured before randomisation, so the missing indicator is independent of treatment, just like any other covariate.
Essentially, mean imputation of X is ok when we analyse change because we’re only imputing the part of the outcome that cannot be affected by randomisation.
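Putting the pieces together, here is a sketch (again in Python, and purely illustrative – the effect size, correlation, sample size and missingness fraction are all made up, and missingness is taken as completely at random for simplicity) of mean-imputing missing baselines and then analysing change with an unadjusted difference in means; incomplete cases simply contribute change-from-baseline-mean:

```python
import random
import statistics

random.seed(2)
n = 2_000
trt = [i >= n // 2 for i in range(n)]       # randomised 1:1
x = [random.gauss(0, 2) for _ in range(n)]  # baseline
y = [0.3 * xi + random.gauss(0, 2) + (1.0 if t else 0.0)  # true effect = 1
     for xi, t in zip(x, trt)]

# Baselines go missing at random, independently of treatment
x_obs = [xi if random.random() > 0.25 else None for xi in x]

# Overall mean imputation using the observed baselines only
xbar = statistics.mean(v for v in x_obs if v is not None)
x_imp = [xbar if v is None else v for v in x_obs]

# Unadjusted analysis of change: difference in mean change between arms
change = [yi - xi for yi, xi in zip(y, x_imp)]
effect = (statistics.mean(c for c, t in zip(change, trt) if t)
          - statistics.mean(c for c, t in zip(change, trt) if not t))
# effect should be close to the true value of 1
```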
Self-inefficiency example?
An interesting point arising from the above is that, if 2Cov(Y,X)<Var(X), an incomplete case contributes more statistical information than a complete case. That is, unless Y and X are sufficiently highly correlated, having less data (on X) improves our estimator.
This is a bit like self-inefficiency. Loosely, an estimation procedure is said to be ‘self-inefficient’ if it produces more efficient estimators with less data. In the case above, this is true if 2Cov(Y,X)<Var(X), but not otherwise. Here’s the technical definition ICYI[7].
1. At the time of posting, this wasn’t particularly recent. I thought it was such a nice question that we set it as an assessment question for our students, which meant I held off posting this until they had all submitted.
2. SJ Senn. The AB/BA cross-over: how to perform the two-stage analysis if you can’t be persuaded that you shouldn’t. In Liber Amicorum Roel van Strik, edited by B Hansen and M de Ridder. Erasmus University, Rotterdam. 1996: 93–100.
3. IR White, S Thompson. Adjusting for partially missing baseline measurements in randomized trials. Statistics in Medicine. 2005; 24:993–1007. doi:10.1002/sim.1981
4. See, it is possible not to say we’re exploiting or leveraging randomisation 😱
5. By the way, this is overall mean imputation (the expected baseline mean is equal across arms thanks to randomisation), not within-arm mean imputation (which would carry chance imbalance in the observed baseline data through to the missing baseline data).
6. The ancova1 and ancova2 terminology originates, I think, with Yang & Tsiatis (they used ‘I’ and ‘II’, which had evolved to 1 and 2 by the time of Tsiatis et al.’s famous ‘principled-yet-flexible approach’ Stat Med paper): L Yang, AA Tsiatis. Efficiency study for a treatment effect in a pretest–posttest trial. The American Statistician. 2001; 55:314–321.
7. X-L Meng. Multiple-Imputation Inferences with Uncongenial Sources of Input. Statistical Science. 1994; 9(4):538–558. www.jstor.org/stable/2246252
I love the statement "By doing this, you have simultaneously compromised and got your own way." I had an approximately 5-year running argument with a former manager about this that drove all our colleagues crazy. I find that statisticians sometimes shoot themselves in the foot arguing about what's technically most methodologically correct rather than focusing on whether the end-result is correct.