The Line of Best Fit (Linear Regression)
Have a look at this picture. What do you notice? “It’s a straight line, Colin!” Very good. You could get a ruler out and draw a straight line through the points. Why would you bother doing such a thing? Well, the idea is that if you can model a data set - come up with a formula that describes it - then you can predict what would happen in hypothetical situations. This process is known as linear regression.
This particular straight line has the equation
… Which is all well and good when you have an immaculate straight line, but how about this one? Less of a straight line, certainly, but still a definite trend.
You could get your ruler out, certainly, and come up with a pretty decent line through the points. But there’s something deeply unsatisfying about that for a mathematician. Surely there’s a better way - a more accurate way - of finding the single line of best fit?
Well, of course. Otherwise I wouldn’t be writing this. Duh.
There are three ways (depending on the context) of working out the line of best fit. Quick GCSE reminder: a straight line needs a gradient (that you’ll remember being
The simplest way is to do it in Excel. I’ll do a screencast on how to do that another time, because you don’t have a computer in the exam. If you ask me, that’s stupid, but I’m not in charge of the world just now.²
Linear regression on a calculator
If you have a Casio calculator, the kind with the round button in the middle at the top³, you can get it to do the heavy lifting for you. This is the way I recommend doing it, because given the choice between adding up huge lists of numbers myself and letting a machine designed to add up huge lists of numbers do it, I’d generally leave it to the specialist.
Here’s what you do:
1. Press mode and then ‘stat’, which is number 2 on my calculator. It’ll give you a table with $x$ and $y$ at the top of each column.
2. Fill in your data, and read it back to make sure you haven’t missed or mistaken anything.
3. Press ‘AC’ to get into normal calculator mode. It’ll say ‘STAT’ at the top, which is a Good Thing.
4. Press shift then 1 to bring up the statistics menu. You want ‘regression’, which is 5 on my machine.
5. It’ll give you a load of options - you want $A + Bx$, which is number 2 for me.
6. Oh look! There’s an $A$ and a $B$. I wonder what they are? Actually, I know what they are. They’re the $a$ and the $b$ from the equation. Press the number next to $A$ (1 for me) and then equals. It’ll give you the value of $a$.
7. Go back to step 4 and do the same thing but press the number for $B$ in the last step. That’ll give you (ta-da!) $b$. The calculator has done the linear regression for you!
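If you’d like to check the calculator’s answer on a computer, here’s a minimal Python sketch. It assumes the line is written $y = a + bx$ (so $a$ is the intercept and $b$ is the gradient) and uses the standard least-squares formulas. The function name `fit_line` and the data are made up purely for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x.

    Assumption: a is the intercept and b is the gradient.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    sum_x = sum(xs)
    sum_y = sum(ys)

    # Summary statistics: these are the "huge lists of numbers" to add up.
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n

    if s_xx == 0:
        raise ValueError("all x values are the same, so there is no unique line")

    b = s_xy / s_xx                  # gradient
    a = sum_y / n - b * sum_x / n    # intercept: mean of y minus b times mean of x
    return a, b


# Made-up data that lies exactly on y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = fit_line(xs, ys)
print(f"y = {a:.3f} + {b:.3f}x")  # prints: y = 1.000 + 2.000x
```

Swap in your own lists for `xs` and `ys`, and the two numbers it prints should agree with what the calculator gives you, up to rounding.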
Linear regression the hard way
That’s a lot easier than doing it the long way - which is to use the formulas in the formula book to work out
Once you’ve worked those out (
And that’s it!
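For reference, here is a sketch of what the long way usually looks like. It assumes the common formula-book notation, where the line is written $y = a + bx$ and $S_{xx}$ and $S_{xy}$ are the usual summary statistics. Your formula book may use different letters, so treat these names as assumptions and check them against your own copy.

```latex
\begin{align*}
S_{xx} &= \sum x^2 - \frac{\left(\sum x\right)^2}{n} \\
S_{xy} &= \sum xy - \frac{\left(\sum x\right)\left(\sum y\right)}{n} \\
b &= \frac{S_{xy}}{S_{xx}} \\
a &= \bar{y} - b\,\bar{x}
\end{align*}
```

As a made-up example: with $n = 5$, $\sum x = 15$, $\sum y = 35$, $\sum x^2 = 55$ and $\sum xy = 125$, you get $S_{xx} = 55 - 45 = 10$ and $S_{xy} = 125 - 105 = 20$, so $b = 2$ and $a = 7 - 2 \times 3 = 1$, giving the line $y = 1 + 2x$.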
Footnotes:
1. Which, of course, is the baby form of a straight line
2. Vote Colin for Supreme Leader if you think there should be computers in exams!
3. The proper kind.
4. You could also write the equation as