Version 3.36 of the survey package and version 2.4 of mitools are up on CRAN.
There’s one notable new feature in both of them: handling ‘plausible values’, where you have some sets of multiply-imputed variables just as additional columns in a largely non-imputed data set.
There are two implementations behind withPV, controlled by the rewrite= option. You have variables PV1MATH, PV2MATH, …, PV5MATH and some code with a variable maths that you want to run with maths being each of the plausible values in turn. The default implementation rewrites your code to point to each of the plausible values in turn (quasiquotation!). The backup implementation rewrites the data, creating new data sets with a variable maths that’s assigned one of the plausible values.
The default is faster and uses less memory, but you could probably break its rewriting if you tried (e.g. by feeding it tidyeval code). The backup option is slower and uses more memory, but should be unbreakable.
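To make the two strategies concrete, here is a minimal base-R sketch of the idea. It is not the code inside withPV: the toy data frame, the use of lm, and the names dat, pv_names, fits_code and fits_data are all made up for illustration, and the real implementation works on survey design objects.

```r
# Toy data with two plausible-value columns (made-up numbers, illustration only).
dat <- data.frame(PV1MATH = c(500, 520, 480, 510),
                  PV2MATH = c(505, 515, 470, 530),
                  x       = c(1, 2, 3, 4))
pv_names <- c("PV1MATH", "PV2MATH")

# The analysis is written once, in terms of the placeholder variable `maths`.
code <- quote(lm(maths ~ x, data = dat))

# Strategy 1: rewrite the code. Replace the symbol `maths` in the quoted
# call with each plausible-value name, then evaluate the rewritten call.
fits_code <- lapply(pv_names, function(pv) {
  new_call <- do.call(substitute, list(code, list(maths = as.name(pv))))
  eval(new_call)
})

# Strategy 2: rewrite the data. Copy the data, add a real column called
# `maths`, and evaluate the unchanged code against the copy.
fits_data <- lapply(pv_names, function(pv) {
  dat_i <- dat
  dat_i$maths <- dat_i[[pv]]
  eval(code, list(dat = dat_i))
})

# Both strategies should give the same coefficients for each plausible value.
same <- mapply(function(a, b) isTRUE(all.equal(coef(a), coef(b))),
               fits_code, fits_data)
print(same)
```

The first strategy never copies the data, which is where the speed and memory advantage comes from. The second does not need to understand the code at all, which is why it is harder to break.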
options(width=100)
suppressMessages(library(survey))
library(mitools)
data(pisamaths, package="mitools")
des<-svydesign(id=~SCHOOLID+STIDSTD, strata=~STRATUM, nest=TRUE,
               weights=~W_FSCHWT+condwt, data=pisamaths)
oo<-options(survey.lonely.psu="remove")
results<-withPV(list(maths~PV1MATH+PV2MATH+PV3MATH+PV4MATH+PV5MATH),
                data=des,
                action=quote(svyglm(maths~ST04Q01*(PCGIRLS+SMRATIO)+MATHEFF+OPENPS, design=des)),
                rewrite=TRUE)
results[[1]]
## Stratified 2 - level Cluster Sampling design (with replacement)
## With (151, 2363) clusters.
## svydesign(id = ~SCHOOLID + STIDSTD, strata = ~STRATUM, nest = TRUE,
## weights = ~W_FSCHWT + condwt, data = pisamaths)
##
## Call: svyglm(formula = PV1MATH ~ ST04Q01 * (PCGIRLS + SMRATIO) + MATHEFF +
## OPENPS, design = des)
##
## Coefficients:
##         (Intercept)          ST04Q01Male              PCGIRLS              SMRATIO
##           4.703e+02            5.443e+01            6.171e+01            5.224e-02
##             MATHEFF               OPENPS  ST04Q01Male:PCGIRLS  ST04Q01Male:SMRATIO
##           4.750e+01            1.344e+01           -1.110e+02           -3.435e-03
##
## Degrees of Freedom: 2362 Total (i.e. Null); 140 Residual
## (1928 observations deleted due to missingness)
## Null Deviance: 22420000
## Residual Deviance: 14110000 AIC: 27340
summary(MIcombine(results))
## Multiple imputation results:
## withPV.survey.design(list(maths ~ PV1MATH + PV2MATH + PV3MATH +
## PV4MATH + PV5MATH), data = des, action = quote(svyglm(maths ~
## ST04Q01 * (PCGIRLS + SMRATIO) + MATHEFF + OPENPS, design = des)),
## rewrite = TRUE)
## MIcombine.default(results)
##                           results         se       (lower      upper) missInfo
## (Intercept)          4.729436e+02 18.0393855  437.5845852 508.3025560      2 %
## ST04Q01Male          5.268461e+01 21.0694092   11.3661653  94.0030492      4 %
## PCGIRLS              5.974293e+01 15.4147851   29.4989531  89.9869166      6 %
## SMRATIO              3.552268e-02  0.1072370   -0.1747109   0.2457563      3 %
## MATHEFF              4.736517e+01  2.7066937   42.0474147  52.6829207      9 %
## OPENPS               1.317289e+01  2.6812880    7.9012159  18.4445630     11 %
## ST04Q01Male:PCGIRLS -1.109811e+02 28.6338917 -167.1340465 -54.8282283      4 %
## ST04Q01Male:SMRATIO  4.391909e-03  0.1179117   -0.2267333   0.2355171      2 %
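For readers who want to see what the pooling step amounts to, here is a rough base-R sketch of Rubin's rules for a single coefficient. The five estimates and standard errors are made-up numbers, not output from the model above, and the sketch leaves out the degrees-of-freedom and missInfo calculations that MIcombine also reports.

```r
# Made-up estimates and standard errors of one coefficient from m = 5 analyses,
# one per plausible value (illustration only, not output from the model above).
est <- c(54.4, 52.1, 53.0, 51.8, 52.2)
se  <- c(20.5, 20.9, 21.0, 20.7, 20.8)
m   <- length(est)

pooled_est <- mean(est)    # pooled point estimate: average of the estimates
within     <- mean(se^2)   # average within-imputation variance
between    <- var(est)     # between-imputation variance of the estimates

# Rubin's rules: total variance is within plus an inflated between component.
total_var <- within + (1 + 1/m) * between
pooled_se <- sqrt(total_var)

print(c(estimate = pooled_est, se = pooled_se))
```

When the plausible values mostly agree, as the small missInfo percentages above suggest they do here, the between component is small and the pooled standard error is close to what a single analysis would report.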