While writing an obituary for George Box, I stumbled on something I thought was ingenious: a method for generating independent pairs of numbers drawn from the normal distribution.

I’ll concede: that’s not necessarily something that makes the average reader-in-the-street stop in their tracks and say “Wow!” In honesty, it would probably make the average reader-in-the-street rapidly become a reader-on-the-other-side-of-the-street. However, I thought an article on it might provide some insight into two mathematical minds: that of George Box, one of the greatest ((If not the greatest)) statisticians of the 20th century, and that of me, possibly the greatest mathematical hack of the 21st.

### How the Box-Muller transform works

If you want to apply the Box-Muller transform, you need two numbers drawn from a uniform distribution - so they’re equally likely to take any value between 0 and 1 (strictly above 0, in $U$’s case, so the logarithm in a moment makes sense). Let’s call these numbers $U$ and $V$. Box and Muller claim that if you work out

$$X = \sqrt{-2 \ln (U)} \cos (2\pi V)$$ and $$Y = \sqrt{-2 \ln (U)} \sin (2\pi V)$$

then $X$ and $Y$ are independent (information about one tells you nothing about the other) and normally distributed with a mean of 0 and a standard deviation of 1. I’m not going to prove that, because I don’t know how, but I can explain what’s happening.
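The recipe is short enough to try directly. Here’s a minimal Python sketch (my own, not from Box and Muller’s paper) that generates pairs and checks the headline claims - mean near 0, variance near 1, and no correlation between $X$ and $Y$:

```python
import math
import random

def box_muller(rng=random.random):
    """One pair of (claimed) independent standard normals from two uniforms."""
    u = 1.0 - rng()               # rng() is in [0, 1); shift to (0, 1] so log is safe
    v = rng()
    r = math.sqrt(-2.0 * math.log(u))
    theta = 2.0 * math.pi * v
    return r * math.cos(theta), r * math.sin(theta)

random.seed(1)
n = 100_000
pairs = [box_muller() for _ in range(n)]
xs = [x for x, _ in pairs]
ys = [y for _, y in pairs]
mean_x = sum(xs) / n
var_x = sum((x - mean_x) ** 2 for x in xs) / n
cov_xy = sum(x * y for x, y in pairs) / n - mean_x * (sum(ys) / n)
print(round(mean_x, 2), round(var_x, 2), round(cov_xy, 2))  # all near 0, 1 and 0
```

(Zero correlation isn’t the same as independence, of course, but it’s a cheap smoke test.)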

There’s a hint in my choices of letter: you might recognise that you could simplify these down to $X = R \cos(\theta)$ and $Y = R\sin(\theta)$ - polar coordinates, or the two perpendicular sides of a right-angled triangle with hypotenuse $R$. The $R = \sqrt{-2\ln(U)}$ is the distance from $(0,0)$: because $U$ is between 0 and 1, $\ln(U)$ is anywhere from $-\infty$ to 0 ((It’s exponentially distributed, since you ask.)). Multiplying by $-2$ turns it into a nice positive number (so you can take its square root), and taking the square root reins in the occasional huge values. For normally-distributed variables, you want the distances to clump up in the middle, which is exactly what this combination achieves; the 2 is what makes the standard deviation come out as exactly 1.

The $\theta = 2\pi V$ is much simpler: it just says ‘move in a random direction’.
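To see the clumping in action - and to back up that footnote - here’s a quick check (again, my own sketch) that $-2\ln(U)$, which is $R^2$, really does average out at 2:

```python
import math
import random

# -2 ln(U) is exponentially distributed with mean 2, so R = sqrt(-2 ln U)
# is usually modest and only occasionally large - distances clump in the middle.
random.seed(2)
n = 100_000
r2 = [-2.0 * math.log(1.0 - random.random()) for _ in range(n)]  # R^2 values
mean_r2 = sum(r2) / n
print(round(mean_r2, 1))  # close to 2
```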

### What Colin did next

My immediate thought was, ‘I wonder if I can use that to work out the probability tables for $z$-scores you get in formula books!’ What do you mean, that wasn’t your immediate thought? ((Weirdo!)) Long story short: the answer is no; I just wanted to show you my thought process and that not everything in maths works out as neatly as you’d like.

My insight was that the probability of generating an $X$ value smaller than some constant $k$ is just the probability of landing on a $(U, V)$ pair that gives such an $X$ - in other words, the area of the relevant region of the unit square. So far so obvious! In that case, it’s just a case of rearranging the formulas to get an expression for (say) $V$ in terms of $U$ and integrating to find the appropriate area.
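The insight itself is easy to confirm numerically. This sketch (mine, nothing to do with Box and Muller) estimates $P(X < k)$ by brute force over uniform $(U, V)$ pairs and compares it with the formula-book value $\Phi(k)$, computed via `math.erf`:

```python
import math
import random

def phi(k):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(k / math.sqrt(2.0)))

random.seed(3)
n = 100_000
k = 1.0
hits = 0
for _ in range(n):
    u = 1.0 - random.random()  # keep U strictly positive for the log
    v = random.random()
    x = math.sqrt(-2.0 * math.log(u)) * math.cos(2.0 * math.pi * v)
    if x < k:
        hits += 1
print(round(hits / n, 2), round(phi(k), 2))  # both close to 0.84
```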

So I tried that:

$$\sqrt{-2 \ln (U)} \cos(2\pi V) = k \\ \cos(2\pi V) = \sqrt{ \frac{k^2}{-2\ln(U)}} \\ V = \frac{1}{2\pi}\cos^{-1}\left( \sqrt{ \frac{k^2}{-2\ln(U)}} \right)$$

Yikes. I don’t fancy trying to integrate that - the arccos is bad enough, but the $\ln(U)$ on the bottom? Forget about it.

Let’s try the other way:

$$\sqrt{-2 \ln (U)} \cos(2\pi V) = k \\ -2\ln(U) = k^2 \sec^2(2\pi V) \\ U = e^{-\frac{k^2}{2}\sec^2(2\pi V)}$$

Curses! I don’t think that’s going to work, either. $e^{\sec^2 x}$ isn’t an integral I know how to do - so I’m stymied.
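There is a consolation prize, though: even if the integral resists pen and paper, the rearrangement still pins down the right region of the unit square, and a computer is happy to integrate over it. For $k > 0$ and a fixed $V$, the condition $X < k$ holds for every $U$ when $\cos(2\pi V) \le 0$, and for $U > e^{-\frac{k^2}{2}\sec^2(2\pi V)}$ otherwise - so summing those interval lengths over $V$ with a midpoint rule (my own sketch, not a standard recipe) recovers the $z$-table value:

```python
import math

def phi_via_region(k, n=100_000):
    """P(X < k) for k > 0, by integrating over V the length of the
    U-interval on which sqrt(-2 ln U) cos(2 pi V) < k (midpoint rule)."""
    total = 0.0
    for i in range(n):
        v = (i + 0.5) / n
        c = math.cos(2.0 * math.pi * v)
        if c <= 0.0:
            total += 1.0  # X <= 0 < k here, whatever U is
        else:
            # X < k  iff  U > exp(-(k^2 / 2) * sec^2(2 pi V))
            total += 1.0 - math.exp(-(k * k) / (2.0 * c * c))
    return total / n

print(round(phi_via_region(1.0), 4))  # the z-table gives 0.8413
```

Which matches the formula book - so the approach was sound, even if the closed form wasn’t forthcoming.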

Back to the drawing board, I’m afraid - this time, I didn’t get the cookie of a new maths discovery. But the difference between a poor mathematician and a decent mathematician is that the poor mathematician says “I got it wrong, I’m rubbish”, while the decent mathematician says either “Ah well. Next puzzle!” or “Ah well. Try again!”

The great mathematicians, of course, see right to the end of the puzzle before they start.