Thinking clearly about confidence intervals

28 September 2020

All survey statistics come with a degree of uncertainty, which we normally call their confidence interval or margin of error. In loose terms, this is the range within which we can be nearly sure the population's true score would have fallen if we could have spoken to all customers.
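To make the arithmetic concrete, here's a minimal sketch in Python of how a 95% margin of error might be computed from a batch of survey scores. It's a sketch, not a recipe: it assumes a simple random sample large enough for the usual normal approximation to hold, and the simulated responses at the bottom are purely hypothetical.

    import math
    import random

    def margin_of_error(scores, z=1.96):
        # Approximate 95% margin of error for the mean of the scores,
        # assuming a simple random sample and the normal approximation
        # (z = 1.96 covers roughly 95% of a normal distribution).
        n = len(scores)
        mean = sum(scores) / n
        # Sample variance with Bessel's correction.
        variance = sum((x - mean) ** 2 for x in scores) / (n - 1)
        standard_error = math.sqrt(variance / n)
        return z * standard_error

    # Hypothetical example: 96 simulated responses centred near 83.
    random.seed(1)
    scores = [random.gauss(83, 15) for _ in range(96)]
    print(f"index = {sum(scores) / len(scores):.1f}, "
          f"margin = +/-{margin_of_error(scores):.1f}")

With a spread of about 15 points and roughly a hundred responses, the margin works out at around ±3, which is the figure used in the example that follows.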

Confidence intervals are, to my mind, the most useful and most powerful tool that statistics gives us in our quest to understand the world more clearly. So why are they so rarely reported, and almost never used properly when organisations think about their survey research?

One of the problems, I think, is that the way we report statistics tends to lead our thinking astray, and it takes a lot of work to overcome this. Let's look at a concrete example.

If we have a satisfaction index of 83, with a margin of error of ±3, then there are three ways to think about this result, and the most useful of them is also the rarest and the most difficult.

1) Our customers' satisfaction index is 83

No, it's probably not. That's the single best estimate we have, but it's more likely to be a little bit wrong than exactly right.

2) Our customers' satisfaction index is almost certainly between 80 and 86

Yes, but that's a pretty wide range, isn't it?

3) Our customers' satisfaction index is almost certainly between 80 and 86, and it's most likely to be quite close to 83

To think accurately with statistics, you need to get used to the idea that the results we have are estimates with a distribution of likelihood. The sample average sits at the middle of that distribution, but it's probably at least slightly wrong. The confidence interval stretches out to the point at which 95% of that likelihood is covered, but the shape of the distribution means the true score is much more likely to be near the middle than out towards the edges.
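To put rough numbers on that, here's another small sketch. Again it rests on assumptions: that the sampling distribution is normal, and that the ±3 margin corresponds to 1.96 standard errors. It compares how much of the likelihood sits near the centre of our 80-to-86 interval with how much sits at its edges.

    import math

    def norm_cdf(x):
        # Standard normal CDF, via the error function.
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    centre, margin = 83.0, 3.0
    se = margin / 1.96  # standard error implied by a 95% margin of +/-3

    def prob_between(lo, hi):
        # Probability that the true score falls in [lo, hi] under a
        # normal sampling distribution centred on the sample average.
        return norm_cdf((hi - centre) / se) - norm_cdf((lo - centre) / se)

    print(prob_between(82, 84))                         # roughly 0.49
    print(prob_between(80, 81) + prob_between(85, 86))  # roughly 0.14

Under those assumptions, about half the likelihood sits within a single point of 83, while the outermost point at each end of the interval carries only about 14% between them. The middle really is much more likely than the edges.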

Stop thinking about single estimates, stop thinking about ranges, and start thinking about distributions.