I was reminded recently about something that took me an embarrassingly long time to realise: that the binomial expansion and the binomial distribution were intrinsically linked.

I mean, sure, obviously they both have an nCr in them, but what of it? It’s just Pascal’s triangle.

And then someone mentioned generating functions in a way that made sudden sense to me, and I hope I can do the same for you.

Let’s start with a concrete example

Suppose we roll four standard six-sided dice and want to know how likely we are to throw any given number of sixes. That’s simple enough:

Sixes Calculation Probability
0 $\nCr{4}{0} \left(\frac{1}{6}\right)^0 \left(\frac{5}{6}\right)^4$ 0.482
1 $\nCr{4}{1}\left(\frac{1}{6}\right)^1\left(\frac{5}{6}\right)^3$ 0.386
2 $\nCr{4}{2}\left(\frac{1}{6}\right)^2\left(\frac{5}{6}\right)^2$ 0.116
3 $\nCr{4}{3}\left(\frac{1}{6}\right)^3\left(\frac{5}{6}\right)^1$ 0.015
4 $\nCr{4}{4}\left(\frac{1}{6}\right)^4\left(\frac{5}{6}\right)^0$ 0.001

Now, if we look at the binomial expansion of $(\left(\frac{5}{6}\right) + \left(\frac{1}{6}\right)x)^4$, we get:

  • $\nCr{4}{0}\left(\frac{1}{6}\right)^0\left(\frac{5}{6}\right)^4 x^0 +$
  • $\nCr{4}{1}\left(\frac{1}{6}\right)^1\left(\frac{5}{6}\right)^3x^1 +$
  • $\nCr{4}{2}\left(\frac{1}{6}\right)^2\left(\frac{5}{6}\right)^2x^2 +$
  • $\nCr{4}{3}\left(\frac{1}{6}\right)^3\left(\frac{5}{6}\right)^1x^3+$
  • $\nCr{4}{4}\left(\frac{1}{6}\right)^4\left(\frac{5}{6}\right)^0x^4$

… say look! Those calculations are identical. The coefficient of the $x^k$ term is the probability of rolling $k$ sixes!

And this works in general – if you’ve got $n$ trials, each with a probability of $p$, you want the coefficients of $(q + px)^n$. If the probabilities are different but independent, you can work out $(q_1 + p_1x)(q_2 + p_2x)\dots$ in just the same way.

Other distributions

It turns out, you can do just the same thing for the geometric distribution: given $X \sim Geo(p)$, $P(X=k) = pq^{k-1}$.

What would our GF look like? $g_p(x)= px + pqx^2 + pq^2x^3 + \dots$.

That looks a lot like, well, a geometric sequence. The first term is $px$ and the common ratio is $qx$, so the sum of the series is $g_p(x) = \frac{px}{1-qx}$. How lovely!

And the Poisson? Well, if $Y \sim Po(\lambda)$, then $P(Y=k)= e^{-\lambda}\frac{\lambda^k}{k!}$.

If we write out the corresponding function, we get $e^{-\lambda}\left( 1 + \frac{\lambda x}{1} + \frac{\lambda^2 x^2}{2!}+ \dots\right)$, and that bracket is just $e^{\lambda x}$. The GF is $p_\lambda(x) = e^{\lambda(x-1)}$.

Sanity checks and means

A quick sanity check for GFs is to see whether the probabilities sum to one – and that’s easy: just put in $x=1$ and check the result is 1. (It is for all of these).

There’s also a quick way to figure out means: if you differentiate the generating function and put in $x=1$, the dark magic of GFs gives you the mean. Can you figure out why?