The Echo and a simple answer
Don’t get me wrong, The Dorset Echo is one of my favourite local newspapers. They have been kind enough to feed my ego on several occasions, and even if their headlines sometimes don’t quite reflect the gist of the story, I appreciate that.
This time, though, they’ve gone too far. ((I expect a “Local tutor’s anger at Echo story” story to run in the next few days.))
The story, for those of you who can’t be bothered to click ((understandably, since the Echo website is almost unusable for intrusive ads)), involves the deadly toll of Dorset’s roads ((Dorset, as far as we’re concerned, contains Bournemouth and Poole, no matter how much they protest. Unitary authorities, schmunitary authorities.)): a dramatic 50% increase between 2014 and 2015.
Digging down further, the numbers seem to stack up: 19 fatalities on the roads in 2014, and 28 last year, a 47% rise - close enough to 50% that I’d give them a pass on it.
Assuming each fatality is an independent event ((it isn’t, of course: many collisions involve several cars)), that’s even a statistically significant change. Suppose the number of fatalities, $F$, is drawn from a Poisson distribution; our null hypothesis is that its mean is 19. We can, at a stretch, approximate that as a normal distribution with a mean of 19 and a standard deviation of $\sqrt{19} \approx 4.36$. The probability of a reading of 28 or more can be read off the z-score tables: $z = \frac{28-19}{\sqrt{19}} \approx 2.06$, giving a one-tailed probability of 0.0197, so we reject the null hypothesis at the 5% level.
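If you'd rather not dig out the tables, the same sum can be sketched in a few lines of Python. This is only a rough check of the normal approximation described above; the function names `normal_upper_tail` and `poisson_z_test` are my own inventions for illustration, and the tail probability is computed with the standard library's `math.erfc`.

```python
import math

def normal_upper_tail(z):
    """P(Z >= z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def poisson_z_test(observed, mean):
    """Normal approximation to a Poisson: standard deviation is sqrt(mean)."""
    z = (observed - mean) / math.sqrt(mean)
    return z, normal_upper_tail(z)

# Null hypothesis: mean of 19 (the 2014 figure); observed: 28 (the 2015 figure).
z, p = poisson_z_test(28, 19)
print(f"z = {z:.2f}, p = {p:.4f}")
print("significant at 5%" if p < 0.05 else "not significant at 5%")
```

Run without rounding, the probability should come out a shade under the table value of 0.0197, because the tables force $z$ down to 2.06 first. Either way it is comfortably below 0.05.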
So why my fury?
The story neatly avoided discussing the fatality figures for the preceding years. Luckily, the DfT makes those available, too ((another small black mark for the Echo story: no link to the publicly available source, naughty naughty)). In 2013, there were 28 deaths on Dorset’s roads. In 2012, there were 24. If I were less honest, I’d innocently point at the unexpected dip in the graph in 2014, rather than the large increase for 2015. However, I’m not that sort of writer: the figures for 2011 and 2010 were 19 and 18.
A more reasonable estimate for the mean would be the average of the five previous years’ figures (2010 to 2014), which is $\frac{18+19+24+28+19}{5} = 21.6$. Now the null hypothesis is that $F$ has a mean of 21.6 and a standard deviation of $\sqrt{21.6} \approx 4.65$, and the z-score for 28 is $\frac{28-21.6}{\sqrt{21.6}} \approx 1.377$; the probability of a result at least that extreme is about 0.084 (roughly 8.4%), which is not significant at the 5% level.
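Here is the five-year version as a small Python sketch, using the figures quoted above. As a sanity check it also sums the exact Poisson tail, which is a cross-check of my own rather than part of the calculation in this post; the helper names are hypothetical.

```python
import math

# Dorset road fatalities by year, as quoted from the DfT figures above.
fatalities = {2010: 18, 2011: 19, 2012: 24, 2013: 28, 2014: 19}
observed_2015 = 28

def normal_upper_tail(z):
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def poisson_upper_tail(k, lam):
    """P(F >= k) for F ~ Poisson(lam), by summing the pmf over 0..k-1."""
    term = math.exp(-lam)  # P(F = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)  # step from P(F = i) to P(F = i + 1)
    return 1.0 - cdf

mean = sum(fatalities.values()) / len(fatalities)
z = (observed_2015 - mean) / math.sqrt(mean)
p_normal = normal_upper_tail(z)
p_exact = poisson_upper_tail(observed_2015, mean)

print(f"mean = {mean:.1f}, z = {z:.3f}")
print(f"normal approximation: p = {p_normal:.3f}")
print(f"exact Poisson tail:   p = {p_exact:.3f}")
```

The exact tail should come out somewhat larger than the normal approximation, since the Poisson is skewed to the right, so the conclusion doesn't depend on the approximation: neither figure is below 0.05.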
It would be interesting to see data about the number of fatal collisions (rather than fatalities), as well (I’d expect the effect of clustering to increase the standard deviation).
In short, the answer to the Echo’s question “why the increase in Dorset?” is probably “noise”.