From XKCD today
How can we check this calculation?
First, we need to know where the lines are on the y-axis. They are separated by 52.7% of the height of the distribution, and looks as if they are meant to exclude the same height above and below. We don’t need to worry about the mean of the distribution (obviously), or the scale (less obviously). The reason the scale is not needed is that rescaling the x axis shrinks all three of the areas under the curve by the same factor, and since they add up to 1, they stay the same. If we write and for the two heights, we have
from which we get or in R
q<- dnorm(0)*(1-0.527)/2
p<- dnorm(0)-q
Let and be the x-axis values for the lower ends of the filled strip and and be the corresponding upper ends. The area we want can be worked out at least three ways.
First, take the area above the lower line and subtract the area above the upper line
Second, define a function that’s the height of the coloured strip at : it’s the maximum of (the minimum of and 0) and :
Third, take out the rectangle in the middle of the strip, which is wide and high, and then use integration on the curvy bits on each side:
In R, first we need to solve and , for which we need a bound on and ; I’m using 5, which is clearly enough, since
a<- uniroot(function(x) dnorm(x)-q, lower=0, upper=5)$root
b<- uniroot(function(x) dnorm(x)-p, lower=0, upper=5)$root
Now do the integration three ways
integrate( function(x) dnorm(x)-q,lower=-a, upper=a)$value-
integrate( function(x) dnorm(x)-p,lower=-b, upper=b)$value
## [1] 0.5001705
integrate(function(x) dnorm(x)-q,lower=-a, upper=-b)$value+
2*b*(p-q)+
integrate(function(x) dnorm(x)-q,lower=b, upper=a)$value
## [1] 0.5001705
integrate(function(x) pmin(pmax(dnorm(x)-q,0),p-q),lower=-a, upper=a)$value
## [1] 0.5001658
And it works

[Snipe by slobirdr on Flickr]
Update: As Peter Dalgaard points out, the integrals can be replaced with calls to pnorm
, since the integrand is just and constants. For example, gives
(pnorm(-b)-pnorm(-a))-q*(-b - -a)+2*b*(p-q)+(pnorm(a)-pnorm(b))-q*(a-b)
## [1] 0.5001705
Also, if you happen to know the formula , aka dnorm(x)
, you could compute and directly
argdnorm<-function(d) sqrt(-2*log(d*sqrt(2*pi)))
a
## [1] 1.698121
argdnorm(q)
## [1] 1.698121
b
## [1] 0.7346321
argdnorm(p)
## [1] 0.7346321
I think it’s worth not doing this: while it happens to be the case that has a closed-form inverse, there isn’t any good reason I know of that it should have one, and I’m happy to invert dnorm
numerically.