We have a population or cohort of $N$ people divided into $H$ sampling strata, with a sample of size $n_h$ taken from the population of $N_h$ people in stratum $h$. Let $\pi_{hi}$ be the sampling probability for person $i$ in stratum $h$. When we do asymptotics we usually assume the $\pi_{hi}$ are bounded away from zero. That’s not ideal for, say, case-control studies of rare diseases, where we might want asymptotic approximations based on the case incidence being small (ie, converging to zero).
In the situations where I’m interested in $\pi_{hi}$ being small, it’s usually small for a whole stratum. Since sampling is independent between strata, there should be a central limit theorem separately for each stratum, and we should be able to add up the limiting Normal approximations for the stratum totals to get a Normal limit for the population total estimate and the population mean estimate.
To formalise this, suppose $n_h\to\infty$ for every stratum (so that asymptotics makes sense), and that $\pi_{hi}N_h/n_h$ is bounded above and below, so that within each stratum the sampling probability has a finite (relative) range. As a simple example, we might have a case stratum with $\pi_{hi}=1$ and a control stratum with very small $\pi_{hi}$.
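A minimal Python sketch of the simple example, assuming simple random sampling without replacement of controls and the usual inverse-probability-weighted (Horvitz-Thompson) estimate of a total. The cohort sizes, the outcome distributions, and the helper name `ht_total` are made up for illustration and are not from the argument above.

```python
import numpy as np

def ht_total(y, pi):
    """Inverse-probability-weighted estimate of a total: sum of y_i / pi_i
    over the sampled units (hypothetical helper, for illustration only)."""
    y = np.asarray(y, dtype=float)
    pi = np.asarray(pi, dtype=float)
    return float(np.sum(y / pi))

rng = np.random.default_rng(1)

# Hypothetical cohort: a small case stratum and a large control stratum.
N_case, N_ctrl = 200, 100_000
n_ctrl = 400
y_case = rng.normal(2.0, 1.0, N_case)
y_ctrl = rng.normal(0.0, 1.0, N_ctrl)

# Cases: everyone is sampled, so the sampling probability is 1.
pi_case = np.ones(N_case)

# Controls: simple random sample without replacement, so the sampling
# probability is n/N for everyone, which is very small here.
pi_ctrl = n_ctrl / N_ctrl
sampled = rng.choice(N_ctrl, size=n_ctrl, replace=False)

# Sampling is independent between strata, so the estimated population total
# is just the sum of the estimated stratum totals.
T_hat = ht_total(y_case, pi_case) + ht_total(y_ctrl[sampled], np.full(n_ctrl, pi_ctrl))
T_true = y_case.sum() + y_ctrl.sum()
print(f"control sampling probability: {pi_ctrl}")
print(f"estimated total {T_hat:.1f}, true total {T_true:.1f}")
```

With these made-up numbers the case stratum contributes no sampling error at all; the uncertainty comes entirely from the sparsely sampled controls.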
[Update: As Stas Kolenikov points out, I’m assuming the same strata are small or large along the infinite sequence, so I need something like $N_h/N_{h'}$ and $n_h/n_{h'}$ each converging (possibly to zero or infinity) for each pair of strata $h$, $h'$. This isn’t a meaningful loss of generality since (a) the infinite sequence is an analytic fiction and we might as well set it up for our maximum convenience; and (b) even without assuming anything, every subsequence will have a subsubsequence along which the condition holds.]
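By standard results, for each stratum $h$, the scaled error $\sqrt{n_h}\,(\hat T_h-T_h)/N_h$ in the estimated stratum total $\hat T_h$ converges in distribution to a mean-zero Normal, and by the Skorohod representation theorem we can find an $H$-variate normal vector $Z=(Z_1,\dots,Z_H)$ with
$$\sqrt{n_h}\,\frac{\hat T_h-T_h}{N_h}-Z_h\stackrel{a.s.}{\to}0$$
(possibly on a different probability space), to get
$$\hat T_h-T_h=N_hn_h^{-1/2}Z_h+o_p\left(N_hn_h^{-1/2}\right).$$
The $Z_h$ will be independent, with mean zero; write $\sigma^2_h$ for the variances.
[Update: Note that $\sigma^2_h$ is just $\lim \mathrm{var}\left[\sqrt{n_h}\,(\hat T_h-T_h)/N_h\right]$, nothing more fundamental. Under stratified random sampling, $\sigma^2_h$ will be the population variance of the variable being totalled in stratum $h$ multiplied by the ‘finite population correction’ $(N_h-n_h)/N_h$, but under other sampling schemes it will be something else.]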
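The finite population correction can be checked by simulation. This is only a sketch, assuming simple random sampling without replacement within a single stratum; the stratum size, sample size, and population values are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(2024)

# One hypothetical stratum: a fixed finite population of N_h values,
# from which n_h are drawn without replacement.
N_h, n_h = 2_000, 500
y = rng.gamma(shape=2.0, scale=1.0, size=N_h)
T_h = y.sum()

def scaled_error(rng):
    """One draw of sqrt(n_h) * (T_hat_h - T_h) / N_h under simple random sampling."""
    sampled = rng.choice(N_h, size=n_h, replace=False)
    T_hat_h = (N_h / n_h) * y[sampled].sum()   # weight each sampled unit by N_h / n_h
    return np.sqrt(n_h) * (T_hat_h - T_h) / N_h

sims = np.array([scaled_error(rng) for _ in range(5_000)])

empirical = sims.var()
# Population variance (divisor N_h - 1) times the finite population correction.
theoretical = y.var(ddof=1) * (N_h - n_h) / N_h
print(f"empirical variance   {empirical:.3f}")
print(f"variance times fpc   {theoretical:.3f}")
```

The two numbers should agree up to simulation error; without the factor $(N_h-n_h)/N_h$ the second would be too large by a third in this setup, since a quarter of the stratum is sampled.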
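Now, the error in the estimated population total $\hat T$ is the sum of the stratum errors, $\hat T-T=\sum_{h=1}^H(\hat T_h-T_h)$, giving
$$\hat T-T=\sum_{h=1}^H N_hn_h^{-1/2}Z_h+o_p\left(\max_h N_hn_h^{-1/2}\right).$$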
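First, suppose $N_h/N$ converges to a non-zero constant for each $h$. Let $n=\sum_h n_h$ and define the scaled error in the estimated mean
$$\sqrt{n}\,\frac{\hat T-T}{N}=\sum_{h=1}^H\frac{N_h}{N}\sqrt{\frac{n}{n_h}}\,Z_h+o_p(1)\stackrel{d}{\to}Z$$
where $Z\sim N(0,\sigma^2)$ with
$$\sigma^2=\lim\sum_{h=1}^H\left(\frac{N_h}{N}\right)^2\frac{n}{n_h}\,\sigma^2_h,$$
assuming this limit exists (which requires $n/n_h$ to stay bounded).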
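Alternatively, for case–control sampling we may have $N_h/N\to 0$ in the case stratum, but we would have the $n_h$ all of the same order, and so of the same order as their total, $n$. The limiting distribution is dominated by the largest strata: define $\mathcal{H}=\{h:\lim N_h/N>0\}$ (which is non-empty as $H$ is finite). Then the error in the estimated mean satisfies
$$\sqrt{n}\,\frac{\hat T-T}{N}=\sum_{h\in\mathcal{H}}\frac{N_h}{N}\sqrt{\frac{n}{n_h}}\,Z_h+o_p(1)\stackrel{d}{\to}Z$$
where $Z\sim N(0,\sigma^2)$ with
$$\sigma^2=\lim\sum_{h\in\mathcal{H}}\left(\frac{N_h}{N}\right)^2\frac{n}{n_h}\,\sigma^2_h.$$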
Weaker conditions on $n_h$ and $N_h$ are clearly possible: it is only necessary to identify which terms dominate the limiting distribution of $\hat T$, since the limiting distribution of estimated stratum totals is always independent $H$-variate Normal under appropriate scaling.
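One rough way to see which terms dominate is to compare each stratum's contribution to the approximate variance of the estimated total. The sketch below assumes each stratum's error is approximately Normal with variance $N_h^2\sigma^2_h/n_h$, and that these add across strata by independence. The function name `variance_shares` and the stratum sizes are hypothetical.

```python
import numpy as np

def variance_shares(N_h, n_h, sigma2_h):
    """Share of the approximate variance of the estimated total that comes
    from each stratum, taking stratum h to contribute N_h^2 * sigma2_h / n_h
    (hypothetical helper, for illustration only)."""
    N_h = np.asarray(N_h, dtype=float)
    n_h = np.asarray(n_h, dtype=float)
    sigma2_h = np.asarray(sigma2_h, dtype=float)
    contrib = N_h**2 * sigma2_h / n_h
    return contrib / contrib.sum()

# Made-up case-control-like design: equal sample sizes, very unequal strata.
shares = variance_shares(
    N_h=[1_000, 1_000_000],   # [cases, controls]
    n_h=[500, 500],
    sigma2_h=[1.0, 1.0],
)
print(shares)
```

With equal sample sizes and a control stratum a thousand times larger, essentially all of the variance comes from the controls, which is the sense in which the largest strata dominate the limit.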