Version 4.4-1 of the `survey` package for R is percolating through CRAN. There are some important additions, visible and invisible.

The main invisible addition is from Ben Schneider, who has written a set of C++ routines that do the multistage stratified variance calculations previously done by `svyrecvar`. The compiled versions are the default; use `options(survey.use_rcpp=FALSE)` to disable them. The C++ code is faster; perhaps more important is that it gives the same answers independently and so is a check on the central routine of the package.
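As a rough illustration of that cross-check, the sketch below computes the same standard error with the compiled routines and with the original R code, then compares them. It assumes the `survey.use_rcpp` option is read each time a variance is computed; the `apiclus2` two-stage sample that ships with the package is used only as a convenient example.

```r
library(survey)

# Two-stage cluster sample shipped with the survey package
data(api)
dclus2 <- svydesign(id = ~dnum + snum, fpc = ~fpc1 + fpc2, data = apiclus2)

# Standard error under the current (default) setting
se_default <- SE(svymean(~api00, dclus2))

# Switch to the original R implementation, remembering the old setting
old <- options(survey.use_rcpp = FALSE)
se_r <- SE(svymean(~api00, dclus2))

# Restore whatever setting was in place before
options(old)

# The two implementations should agree up to floating-point error
agree <- isTRUE(all.equal(as.numeric(se_default), as.numeric(se_r)))
print(agree)
```

If the two values ever disagreed, that would point to a bug in one of the two implementations, which is the sense in which the C++ code acts as an independent check.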

The most important visible addition is the functions `svysmoothUnit` and `svysmoothArea` for small-area estimation. These are just an interface to the `SUMMER` package, but they make a wider range of analyses available. I’ll write a separate post about `svysmoothArea`, which fits Bayesian versions of the Fay-Herriot model to smooth the direct survey estimates for small areas. The `svysmoothUnit` function doesn’t use sampling weights; it assumes that sampling is ignorable given a set of unit-level covariates and fits generalised linear models with area random effects. There’s more background here.
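For readers who haven't met it, the basic Fay-Herriot model can be written as below. This is the standard textbook form, not a description of exactly what `SUMMER` fits: the priors and any spatial structure on the area effects are not shown, and the notation is my own.

```latex
\hat\theta_i \mid \theta_i \sim N(\theta_i, V_i), \qquad
\theta_i = x_i^\top \beta + u_i, \qquad
u_i \sim N(0, \sigma_u^2)
```

Here $\hat\theta_i$ is the direct survey estimate for area $i$, $V_i$ is its estimated sampling variance (treated as known), $x_i$ is a vector of area-level covariates, and $u_i$ is an area random effect. The smoothing comes from shrinking noisy direct estimates towards the regression prediction $x_i^\top \beta$.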

If you want to use the small-area estimation functions you need to install the `SUMMER` package (which is suggested by `survey`) and also install `INLA` (which is needed for the SUMMER models). The small-area estimation vignette describes how to do this. The `INLA` system isn’t an explicit dependency of `survey` because many users won’t need it and the fact that it doesn’t live on CRAN might make some institutions more reluctant to install it.
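Because both packages are optional, it can be worth checking that they are present before calling the small-area functions. A minimal sketch using only base R follows; it installs nothing and just reports what is missing.

```r
# Optional packages needed by svysmoothArea() and svysmoothUnit()
needed <- c("SUMMER", "INLA")

# requireNamespace() returns TRUE or FALSE without attaching the package
available <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

if (all(available)) {
  message("Small-area estimation functions should be usable.")
} else {
  message("Missing: ", paste(needed[!available], collapse = ", "),
          "; see the small-area estimation vignette for installation steps.")
}
```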

There are also other changes: it’s now possible to have arbitrary designs at phase two of a `twophase` object by specifying a matrix of pairwise sampling probabilities or sampling covariances. The primary motivation for this was to allow Poisson sampling at phase two as a model for non-response, but it will have other uses. There are also some fixes to standard error estimation for some raked two-phase design objects. And there’s a miscellany of smaller bug fixes: for example, `confint` would sometimes fail to find a profile confidence interval for generalised linear model objects with replicate weights because it was using bad values for the search limits.
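To make the Poisson-sampling case mentioned above concrete, here is a sketch in base R of what the pairwise probability and covariance matrices look like. The response probabilities are made up for illustration, and the sketch deliberately stops short of showing how such a matrix is handed to `twophase`; check the function's help page for the argument it expects.

```r
# Hypothetical phase-two inclusion (response) probabilities for five units
p <- c(0.9, 0.8, 0.5, 0.7, 0.6)

# Under Poisson sampling units are selected independently, so
# pi_ij = pi_i * pi_j for i != j, and pi_ii = pi_i on the diagonal
pij <- outer(p, p)
diag(pij) <- p

# Sampling covariances of the inclusion indicators: pi_ij - pi_i * pi_j.
# Off-diagonal entries are zero; the diagonal is pi_i * (1 - pi_i)
delta <- pij - outer(p, p)

print(round(pij, 2))
print(round(delta, 2))
```

The covariance matrix is diagonal because Poisson sampling treats each unit's response as an independent event, which is what makes it a natural simple model for non-response.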