svycontrast - Biased and Inefficient

I got asked for more detail about the svycontrast() function, so I thought I’d post it here too. The function is related to the CONTRASTS you get in SAS, but focused on estimation rather than testing.

The input to svycontrast() is a $p$-vector of estimates $\hat{θ}$ (which I’ll consider as a column vector) and an estimated $p \times p$ covariance matrix $\hat{Ξ}$.

There are two main cases:

Linear

Given a $p$-vector of coefficients $b$, the function computes $b^{T} \hat{θ}$ and $b^{T} \hat{Ξ} b$. Given a list of $k$ $p$-vectors of coefficients, the function pastes them into a $p \times k$ matrix $B$ and computes $B^{T} \hat{θ}$ and $B^{T} \hat{Ξ} B$.

For example, from the help page

library(survey)
data(api)
dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
a <- svymean(~api00+enroll+api99, dclus1)
b<-svycontrast(a, list(avg=c(0.5,0,0.5), diff=c(1,0,-1)))
b

##      contrast      SE
## avg   625.574 23.8362
## diff   37.191  3.0852

and doing it step by step

coef(a)

##    api00   enroll    api99 
## 644.1694 549.7158 606.9781

crossprod(c(0.5,0,0.5),coef(a))

##          [,1]
## [1,] 625.5738

crossprod(c(1,0,-1),coef(a))

##          [,1]
## [1,] 37.19126

sqrt(crossprod(c(0.5,0,0.5), vcov(a))%*%c(0.5,0,0.5))

##          [,1]
## [1,] 23.83622

sqrt(crossprod(c(1,0,-1), vcov(a))%*%c(1,0,-1))

##          [,1]
## [1,] 3.085197

These might be rounded differently when they print, but they are the same, e.g.

crossprod(c(0.5,0,0.5),coef(a))-coef(b)[1]

##      [,1]
## [1,]    0

sqrt(crossprod(c(1,0,-1), vcov(a))%*%c(1,0,-1)) - SE(b)[2]

##      [,1]
## [1,]    0

And the covariance term:

crossprod(c(0.5,0,0.5), vcov(a))%*%c(1,0,-1)

##           [,1]
## [1,] -16.30776

vcov(b)[1,2]

## [1] -16.30776
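The individual pieces above are the entries of $B^{T} \hat{θ}$ and $B^{T} \hat{Ξ} B$, so they can also be computed in one go. Here is a minimal base-R sketch of that matrix form; the estimates and covariance matrix are made-up numbers chosen for easy arithmetic, not values from apiclus1, and the names theta, Xi, est, V and se are just for illustration.

```r
# Hypothetical estimates and covariance matrix (made up, NOT from apiclus1)
theta <- c(x = 10, y = 20, z = 30)
Xi <- matrix(c(4, 1,  2,
               1, 9,  3,
               2, 3, 16), nrow = 3, byrow = TRUE,
             dimnames = list(names(theta), names(theta)))

# Each column of B is one contrast: a p x k matrix with p = 3, k = 2
B <- cbind(avg = c(0.5, 0, 0.5), diff = c(1, 0, -1))

est <- drop(crossprod(B, theta))   # B^T theta: the k contrast estimates
V   <- crossprod(B, Xi %*% B)      # B^T Xi B: their k x k covariance matrix
se  <- sqrt(diag(V))               # standard errors of the contrasts
```

The off-diagonal entry V["avg", "diff"] plays the same role as the covariance term computed above.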

You can also use names to indicate coefficients; any coefficient you don’t name (here enroll) gets a coefficient of zero, so this is the same

svycontrast(a, list(avg=c(api00=0.5,api99=0.5), diff=c(api00=1,api99=-1)))

##      contrast      SE
## avg   625.574 23.8362
## diff   37.191  3.0852

Non-linear

Given a quoted expression in which the free variables are the names of the coefficients, svycontrast() treats it as a function $f()$ and computes $f(\hat{θ})$ and $f'(\hat{θ})^{T} \hat{Ξ} f'(\hat{θ})$, where $f'$ is the gradient of $f$, using deriv() to do the symbolic differentiation.

As a trivial case, you can, of course, do linear combinations this way and get the same as above

svycontrast(a, list(quote(api00/2+api99/2), quote(api00-api99)))

##        nlcon      SE
## [1,] 625.574 23.8362
## [2,]  37.191  3.0852
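To make the non-linear recipe concrete, here is a rough base-R imitation of the calculation for a single quoted expression. It is a sketch of the delta-method arithmetic only, not the package's actual code; the ratio x/y, the estimates and the covariance matrix are all made up for illustration.

```r
# Hypothetical estimates and covariance matrix (made up for illustration)
theta <- c(x = 2, y = 4)
Xi <- matrix(c(0.04, 0.01,
               0.01, 0.09), nrow = 2,
             dimnames = list(names(theta), names(theta)))

# The contrast, written as a quoted expression in the coefficient names
f <- quote(x / y)

# deriv() builds an expression that evaluates f and attaches its gradient
df   <- deriv(f, names(theta))
val  <- eval(df, as.list(theta))
grad <- drop(attr(val, "gradient"))             # c(x = 1/y, y = -x/y^2)

estimate <- as.numeric(val)                     # f(theta)
variance <- drop(crossprod(grad, Xi %*% grad))  # f'(theta)^T Xi f'(theta)
se       <- sqrt(variance)
```

With these numbers the estimate is 0.5, the gradient is (0.25, -0.125) and the variance works out to 0.00328125.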

Less trivially: geometric means, where $f(θ) = \exp θ$ is applied to each component, and so $f'(\hat{θ})$ is a diagonal $2 \times 2$ matrix with diagonal entries $\exp(\hat{θ})$

meanlogs <- svymean(~log(api00)+log(api99), dclus1)
meanlogs

##              mean     SE
## log(api00) 6.4541 0.0378
## log(api99) 6.3905 0.0411

geomeans<-svycontrast(meanlogs,
         list(api00=quote(exp(`log(api00)`)), api99=quote(exp(`log(api99)`))))
exp(coef(meanlogs))-coef(geomeans)

## log(api00) log(api99) 
##          0          0

B<-diag(exp(coef(meanlogs)))
crossprod(B, vcov(meanlogs))%*%B - vcov(geomeans)

##       api00 api99
## api00     0     0
## api99     0     0
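Written out, the check above is the following calculation. The subscripts $1$ and $2$ (for the api00 and api99 log-means) are just labels introduced here.

$$
f(θ) = \begin{pmatrix} \exp θ_1 \\ \exp θ_2 \end{pmatrix}, \qquad
B = \begin{pmatrix} \exp \hat{θ}_1 & 0 \\ 0 & \exp \hat{θ}_2 \end{pmatrix}, \qquad
(B^{T} \hat{Ξ} B)_{ij} = \exp(\hat{θ}_i) \exp(\hat{θ}_j) \hat{Ξ}_{ij}
$$

In particular, the standard error of each geometric mean is $\exp(\hat{θ}_i)$ times the standard error of the corresponding mean of logs. Using the rounded values printed above, that is roughly $\exp(6.4541) \times 0.0378 \approx 24$ for api00.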

Even less trivially, svykappa() does a bit of quasiquotation as well

svycontrast(probs, list(kappa = bquote((.(obs) - .(expect))/(1 - .(expect)))))

where obs is the sum of the diagonal entries of a square contingency table (referred to by name) and expect is the sum of the products of the margins, which gives the expected agreement under independence. For a $2 \times 2$ table the expression comes out as

(.a_..A_ + .b_..B_ - (.a_ * .A_ + .b_ * .B_))/(1 - (.a_ * .A_ + 
    .b_ * .B_))

and svycontrast() uses the deriv() function to differentiate that expression with respect to all its entries.
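For the curious, here is a sketch of how an expression like that can be assembled and differentiated in base R for the $2 \times 2$ case. It follows the naming pattern shown above but is not copied from svykappa(); the table probabilities are hypothetical, and the names rows, cols, diag_cells and p are made up for this example.

```r
# Names for the row margins, column margins and diagonal cells of a 2 x 2 table
rows <- c(".a_", ".b_")
cols <- c(".A_", ".B_")
diag_cells <- paste(rows, cols, sep = ".")   # ".a_..A_" and ".b_..B_"

# Build the pieces as unevaluated calls, then splice them in with bquote()
obs    <- parse(text = paste(diag_cells, collapse = "+"))[[1]]
expect <- parse(text = paste(rows, cols, sep = "*", collapse = "+"))[[1]]
kappa_expr <- bquote((.(obs) - .(expect)) / (1 - .(expect)))

# Hypothetical proportions: cells 0.4, 0.1 / 0.2, 0.3 (only the diagonal
# cells and the margins enter the formula)
p <- setNames(c(0.5, 0.5, 0.6, 0.4, 0.4, 0.3), c(rows, cols, diag_cells))

# Differentiate with respect to every name and evaluate at p
val   <- eval(deriv(kappa_expr, names(p)), as.list(p))
kappa <- as.numeric(val)
grad  <- drop(attr(val, "gradient"))
```

Here observed agreement is 0.7 and expected agreement is 0.5, so kappa is 0.4; grad is the vector that would be combined with the covariance matrix of the proportions to get a standard error.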