An obvious strategy is to prefilter the SNPs to cut down on the number of tests. You might pick SNPs in or near genes. Among those in genes you might pick non-synonymous variants. Among those, you might pick mutations in sites that are strongly evolutionarily conserved or ones that should have a big effect on protein structure. Or you might get really specific and just look at mutations that are predicted to cause complete loss of function. These all make sense, but they don’t work as well as you might think, for a simple mathematical reason: $$\int_t^\infty e^{-x^2}\;dx$$ gets very, very small very fast with increasing $$t$$.
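To put rough numbers on that, here is a minimal sketch in Python. It uses the identity $$\int_t^\infty e^{-x^2}\;dx = \frac{\sqrt{\pi}}{2}\,\mathrm{erfc}(t)$$ to evaluate the tail, then inverts it by bisection to ask how large $$t$$ must be for the tail to equal a per-test cutoff of `alpha / m`. The function names, `alpha = 0.05`, the test counts, and the Bonferroni-style reading of "number of tests" are all assumptions made for illustration; none of them come from the text above.

```python
import math


def gaussian_tail(t: float) -> float:
    """Integral of exp(-x^2) from t to infinity, via the complementary error function."""
    return 0.5 * math.sqrt(math.pi) * math.erfc(t)


def threshold_for_tail(target: float, lo: float = 0.0, hi: float = 20.0) -> float:
    """Find t such that gaussian_tail(t) is approximately `target`.

    Uses plain bisection, which works because the tail is strictly
    decreasing in t. Assumes gaussian_tail(hi) < target < gaussian_tail(lo).
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gaussian_tail(mid) > target:
            lo = mid  # tail still too large: move right
        else:
            hi = mid  # tail at or below target: move left
    return 0.5 * (lo + hi)


# How fast the tail shrinks as t grows.
for t in range(1, 7):
    print(f"t = {t}: tail = {gaussian_tail(t):.3e}")

# Illustrative assumption: a Bonferroni-style cutoff of alpha / m for m tests.
alpha = 0.05
for m in (1_000_000, 10_000, 100):
    t_needed = threshold_for_tail(alpha / m)
    print(f"m = {m:>9,}: tail = alpha/m at t = {t_needed:.2f}")
```

Under these assumptions, cutting the number of tests a hundredfold, from a million to ten thousand, moves the required $$t$$ by well under one unit. Because the tail shrinks so quickly, a large reduction in tests buys only a small relaxation in how extreme a signal has to be.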