Version 4.0 of the survey package is on its way to CRAN.
There are two main updates, which improve the estimation of contrasts
First, a couple of improvements to the handling of replicates. When svycontrast
is used on an object that includes replicate estimates, the estimates will now be transformed and then used to estimate a variance, rather than using the delta method. I think that’s the right thing to do, though you might also want to compute a confidence interval on the original scale and transform the interval. Also, there’s a function to estimate variances from a set of arbitrarily transformed replicates, eg correlation matrices
dstrat <- svydesign(id=~1,strata=~stype, weights=~pw, data=apistrat, fpc=~fpc)
bstrat<- as.svrepdesign(dstrat,type="subbootstrap")
v <- svyvar(~api00+api99, bstrat, return.replicates=TRUE)
vcor<-cov2cor(as.matrix(v))[2,1]
vreps<-v$replicates
correps<-apply(vreps,1, function(v) v[2]/sqrt(v[1]*v[4]))
vcov(bstrat,correps, centre=vcor)
## [1] 1.486932e-05
## attr(,"means")
## [1] 0.9751789
Second, some functions (so far, svymean
, svytotal
, svyratio
, svykappa
,svyglm
, svymle
) have the option to return influence functions as an attribute of the estimate. This allows svyby
to estimate a covariance matrix including between-domain covariances, so that confidence intervals and standard errors are available for between-domain contrasts. Previously, they have only been available for replicate-weight designs (though for totals and means you could get them with an appropriate regression model). Returning influence functions is not the default, because they can take up a lot of space.
For example
dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
totals<-svyby(~enroll, ~stype, svytotal,
design=dclus1, covmat=TRUE)
svycontrast(totals, quote(E/H))
## nlcon SE
## contrast 3.939 1.4319
svycontrast(totals, quote(E-H))
## nlcon SE
## contrast 1574122 543747
The only important function where this does not look feasible is svyquantile
, because of the lack of smoothness (though there are tests for equality of quantiles between groups available with svyranktest
.)
Other changes
There is a function svynls()
to do non-linear least-squares with probability weights. It does most of its work by calling nls
, so if you know how to fit your model with nls
you’ll be ok.
There are also a couple of significant documentation notes
- Lots of people have asked how to get ‘robust’ standard errors for quasipoisson models fitted with
svyglm
, so there is a note saying that there isn’t any choice: the standard error estimates are generalisations of the ‘sandwich’ standard errors - I had an interesting exchange about standard errors in
predict.svyglm
. I implemented these the obvious way: \[\mathrm{var}[x\hat\beta]=x^T\mathrm{var}[\hat\beta]x.\] For the special case where the new value of \(x\) is the true population mean, most people use a different standard error estimator that’s a rescaling of the weighted residual mean square (especially in simple random samples in textbooks, but also in real life). If I’m reading Chapter 2 of Fuller’s Sampling Statistics correctly, my estimator is noisier, but is valid under weaker assumptions.