One issue that doesn’t get as much attention is how a change would affect the sensitivity of p-values to analysis choices. When we started doing genome-wide association studies, we noticed that the results were much more sensitive than we had expected. If you changed the inclusion criteria or the adjustment variables in the model, the p-values for the most significant SNPs often changed a lot. Part of that is just having low power. Part of it is because the Normal tail probability gets small very quickly: small changes in a $$Z$$-statistic around -5 give you quite large changes in the tail probability. Part of it is because we were working at the edge of where the Normal approximation holds up: analyses that are equivalent to first order can start showing differences. Overall, as Ken Rice put it, “the data are very tired after getting to $$5\times 10^{-8}$$”, and you can’t expect them to do much additional work. In our (large) multi-cohort GWAS we saw these problems only at very small p-values, but with smaller datasets you’d see them a lot earlier.
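
To make the point about the Normal tail concrete, here is a minimal sketch in Python. It isn't anything from our GWAS pipeline: the function names `lower_tail` and `tail_table` are made up for illustration, and it assumes the $$Z$$-statistic really is standard Normal, which is exactly the assumption that starts to strain this far out. It just tabulates the one-sided tail probability for $$Z$$ values near -5 and compares each one with the value at -5.

```python
import math


def lower_tail(z):
    """Standard Normal lower-tail probability P(Z <= z).

    Uses erfc rather than 1 - cdf so the far tail keeps its precision.
    """
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def tail_table(z_values):
    """Return (z, tail probability) pairs for a list of Z-statistics."""
    return [(z, lower_tail(z)) for z in z_values]


if __name__ == "__main__":
    base = lower_tail(-5.0)
    # Hypothetical Z-statistics: the same SNP under slightly different
    # inclusion criteria or adjustment variables.
    for z, p in tail_table([-4.8, -4.9, -5.0, -5.1, -5.2]):
        print(f"Z = {z:5.1f}   p = {p:.3e}   ratio to Z = -5: {p / base:.2f}")
```

Moving $$Z$$ by 0.2 in either direction, which is a small shift for a change of adjustment variables, changes the tail probability by a factor of roughly three.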