Ok, so to recap, we have a large model (such as ‘we know the marginal sampling probabilities’) and a small model (such as the subset of the large model where the sampling probabilities are unrelated to the outcome). Under the large model we would use the estimator $\hat\theta_{\text{big}}$, but under the small model there is a more efficient estimator $\hat\theta_{\text{small}}$. That is, under the small model
$$\sqrt{n}\,(\hat\theta_{\text{small}}-\theta_0)\to N(0,\sigma^2_{\text{small}})$$
and
$$\sqrt{n}\,(\hat\theta_{\text{big}}-\theta_0)\to N(0,\sigma^2_{\text{big}}),$$
with $\sigma^2_{\text{small}}\leq\sigma^2_{\text{big}}$.
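To make that concrete, here is a small simulation sketch of my own (the setup and names are assumptions, not from the post): Poisson sampling with known probabilities $\pi_i$, where the small model holds because the outcome is independent of $\pi_i$, so the unweighted mean beats the weighted one.

```python
# Minimal illustration (assumed setup): the outcome y is independent of
# the known sampling probabilities pi, so the small model holds and the
# unweighted mean is more efficient than the weighted (Hajek) mean.
import numpy as np

rng = np.random.default_rng(0)
N, reps = 20_000, 2_000
pi = rng.uniform(0.02, 0.20, size=N)   # known marginal sampling probabilities

small, big = [], []
for _ in range(reps):
    y = rng.normal(size=N)             # outcome unrelated to pi: small model true
    s = rng.random(N) < pi             # Poisson sampling
    w = 1.0 / pi[s]                    # inverse-probability weights
    small.append(y[s].mean())                  # unweighted: hat_theta_small
    big.append(np.sum(w * y[s]) / np.sum(w))   # weighted (Hajek): hat_theta_big

print("var(small):", np.var(small))    # smaller variance
print("var(big):  ", np.var(big))      # larger: the price of the large model
```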
We’re worried that the small model might be slightly misspecified. One test of model misspecification is based on $\Delta=\hat\theta_{\text{big}}-\hat\theta_{\text{small}}$. Under the small model, $\sqrt{n}\,\Delta\to N(0,\tau^2)$ for some $\tau^2>0$. This test isn’t a straw man – for example, DuMouchel and Duncan recommended it in the context of survey regression in a 1983 JASA paper.
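As a sketch of how the test works in the scalar case (my construction, not necessarily DuMouchel and Duncan’s exact statistic, and the names are hypothetical):

```python
# Sketch of the Delta-based (Hausman-style) test, assuming a scalar
# parameter and consistent variance estimates, with hat_theta_small
# efficient under the small model so that Var(Delta) = var_big - var_small.
from scipy.stats import norm

def delta_test_pvalue(theta_big, theta_small, var_big, var_small):
    """Two-sided p-value for H0: the small model is correctly specified."""
    tau2 = var_big - var_small          # variance of Delta under efficiency
    z = (theta_big - theta_small) / tau2 ** 0.5
    return 2 * norm.sf(abs(z))
```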
If we assume that $\hat\theta_{\text{small}}$ is (locally, semiparametric) efficient in the small model then $\tau^2=\sigma^2_{\text{big}}-\sigma^2_{\text{small}}$. Now suppose the small model is slightly untrue, so that $\hat\theta_{\text{small}}$ converges to $\theta_0+b/\sqrt{n}$ with $b\neq 0$, while $\hat\theta_{\text{big}}$ still converges to $\theta_0$. If, say, $b=\tau$, then approximately
$$\mathrm{MSE}(\hat\theta_{\text{small}})\approx\frac{\sigma^2_{\text{small}}+b^2}{n}=\frac{\sigma^2_{\text{small}}+\tau^2}{n}$$
and
$$\mathrm{MSE}(\hat\theta_{\text{big}})\approx\frac{\sigma^2_{\text{big}}}{n}=\frac{\sigma^2_{\text{small}}+\tau^2}{n},$$
so the two estimators have the same asymptotic mean squared error. Since $\hat\theta_{\text{big}}$ is asymptotically unbiased it would probably be preferred, but the test based on $\Delta$ has noncentrality parameter $b/\tau=1$ and very poor power. If we relied on the test, we would probably end up choosing $\hat\theta_{\text{small}}$.
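A quick numerical check of that arithmetic, simulating the limiting normal approximation directly (all settings are my own, chosen so that $b=\tau$ and $\theta_0=0$):

```python
# Numerical check of the b = tau case, simulating the limiting normals
# directly (values assumed for illustration).  Delta is independent of
# hat_theta_small because the latter is efficient under the small model.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, reps = 400, 200_000
sigma2_small, tau2 = 1.0, 0.25
b = tau2 ** 0.5                        # misspecification with b = tau

small = rng.normal(b / n**0.5, (sigma2_small / n) ** 0.5, reps)   # biased, efficient
delta = rng.normal(-b / n**0.5, (tau2 / n) ** 0.5, reps)          # Delta = big - small
big = small + delta                                               # unbiased, noisier

print("n * MSE(small):", n * np.mean(small**2))   # ~ sigma2_small + tau2 = 1.25
print("n * MSE(big):  ", n * np.mean(big**2))     # ~ sigma2_small + tau2 = 1.25
z = delta / (tau2 / n) ** 0.5
print("power of Delta test:", np.mean(np.abs(z) > norm.ppf(0.975)))  # about 0.17
```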
So the test based on $\Delta$ is not very useful if we want to protect against small amounts of model misspecification. We should use a better test.
But sometimes the test based on $\Delta$ is the most powerful test, or not far from it. Since we know what $\hat\theta_{\text{big}}$ and $\hat\theta_{\text{small}}$ look like as functionals of the data distribution, we could try to maliciously arrange for the model misspecification to be in the direction that maximises the noncentrality parameter $b/\tau$, and the test based on $\Delta$ would then be the Neyman-Pearson most powerful test – that’s what UMP tests look like for Gaussian shift alternatives. We can’t quite do that, but at large enough sample sizes we can come as close as we need.
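For reference, the asymptotic power of the two-sided $\Delta$ test as a function of the noncentrality parameter (a small helper of my own; at $b/\tau=1$ it reproduces the feeble power from above):

```python
# Asymptotic power of the two-sided Delta test at level 0.05, as a
# function of the noncentrality parameter ncp = b / tau.  Against a shift
# aimed along Delta, this is essentially the best achievable power for a
# Gaussian shift alternative (exactly so for the one-sided version).
from scipy.stats import norm

def delta_test_power(ncp, level=0.05):
    z = norm.ppf(1 - level / 2)                  # two-sided critical value
    return norm.sf(z - ncp) + norm.cdf(-z - ncp)

for ncp in (0.5, 1.0, 2.0, 3.0):
    print(f"ncp = {ncp}: power = {delta_test_power(ncp):.3f}")
```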