Skip to contents

Assessing balance for time-varying exposures is a topic that has not been extensively examined in the MSM literature. However, Jackson (2016) has established guidance that we draw on for this package, implemented in the calcBalStats() function (drawn on by the assessBalance() function) in devMSMs.

Calculating Raw Balance Statistics

calcBalStats() assesses balance using correlations between confounder and exposure for continuous exposures and standardized mean differences of the confounder between exposure groups for binary exposures. These balancing statistics are created based on the diagnostics for confounding of time-varying exposures outlined by Jackson (2016). More specifically, at each user-specified exposure time point, raw balance statistics are calculated to quantify the association between exposure at that time point and each confounder, for each exposure history up until that time point. This is accomplished using the col_w_cov() function to calculate covariance for continuous exposures and the col_w_smd() function to calculate mean difference for binary exposures from the cobalt package (Greifer, 2023). Factor covariates are split into separate dummy variables using the splitfactor() function from cobalt. Any covariate interactions present in the formulas are created in the data by multiplying the constituent variables, with interactions including factor covariates implemented by constituent dummy variables.

Weighting Balance Statistics By Exposure History

In the calcBalStats() function, exposure histories comprise all combinations of the presence and absence of exposure (delineated using a median split for continuous exposures) at all prior time points (note that these may be different from the user-specified, substantive exposure histories used to compare dose and timing in the final model; Step 5). Subsequently, a weighted mean of the balance statistics, weighted by the proportion of individuals in that exposure history, provides a raw estimate of balance between exposure at that time point for each confounder. On the occasion that there are balancing statistics for only one person or no people in a given exposure history, the user is alerted and that balance statistic is dropped from inclusion. These raw (unstandardized) balance statistics are calculated without IPTW weights for pre-balance checking and using the IPTW balancing weights (that have been multiplied across time points) for weighted analyses.

Standardizing Balance Statistics

For the purpose of comparison to each other and to a common balance threshold, balancing statistics are then similarly standardized in the following manner. For continuous exposures, the raw balance statistic (covariance) is divided by the product of the standard deviation of the unweighted confounder and the standard deviation of the exposure at that time point to calculate the correlation. Given this, it is possible that the correlation may be greater than 1.

For binary exposures, the raw balance statistic (mean difference) is divided by a pooled standard deviation estimate, or the square root of the average of the standard deviations of the unweighted confounder levels for the exposed and unexposed groups at the first exposure time point. All standard deviations are computed using the unadjusted sample (as recommended by Stuart, 2008; 2010; Greifer, 2022) given the possibility that balancing changes the variation in a confounder and leads to misleading standardized values. At the first exposure time point (when there are no histories), standardized balance statistics are computed directly using the same cobalt functions mentioned above.

Imputed Data

If imputed data are supplied, the above process is conducted on each imputed dataset separately. The function then takes the average of the absolute values of the balance statistics across imputed datasets to account for sampling error (Pishgar et al., 2021 preprint). These averaged average balance statistics are then compared to the balance threshold(s) to summarize which covariates are balanced. Recommended specification of separate balance thresholds for relatively more and less important confounders can be found in the Workflows vignettes. The specification of balance threshold(s) should be kept consistent throughout the use of the devMSMs package.

References

Greifer, N. (2024). cobalt: Covariate Balance Tables and Plots (4.4.1) [Computer software]. https://CRAN.R-project.org/package=cobalt

Jackson, J. W. (2016). Diagnostics for Confounding of Time-varying and Other Joint Exposures. Epidemiology, 27(6), 859. https://doi.org/10.1097/EDE.0000000000000547

Pishgar, F., Greifer, N., Leyrat, C., & Stuart, E. (2021). MatchThem: Matching andWeighting after Multiple Imputation. R Journal, 13(2), 292–305. https://doi.org/10.32614/RJ-2021-073

Stuart, E. A. (2008). Developing practical recommendations for the use of propensity scores: Discussion of ‘A critical appraisal of propensity score matching in the medical literature between 1996 and 2003’ by Peter Austin, Statistics in Medicine. Statistics in Medicine, 27(12), 2062–2065. https://doi.org/10.1002/sim.3207

Stuart, E. A. (2010). Matching methods for causal inference: A review and a look forward. Statistical Science : A Review Journal of the Institute of Mathematical Statistics, 25(1), 1–21. https://doi.org/10.1214/09-STS313