
Causation does not always mean correlation

You’ve probably heard it before: “Correlation does not necessarily mean causation”. I’ve seen smart people respond, “But if there is causation, then you should definitely see some correlation.” I don’t think that’s true. Or at least, it’s too strong a statement.

You can have situations where there is no observed correlation, and yet there is causation.

It’s actually worse than that. It’s possible to see a negative correlation in some data while the underlying causation is positive! I know that sounds absurd, but it’s not just a theoretical technicality; it does happen in nature. It depends on the system you’re looking at, how things were measured, and what the sample was. Here’s a thought experiment:

Suppose you have a car driving over a very hilly road. Let’s assume the driver is being cautious and wants to keep their speed roughly constant. So as they go up a hill, they press the gas pedal harder. And then as they go down the other side, they release the gas. They might even do a bit of braking, who knows. This goes on for many hills. Suppose you record two numbers throughout: the speed of the car and the angle of the gas pedal.

Now suppose you give this small dataset to an alien species, without telling them any of the context. What do they see? Well, they see that when the pedal is pushed lower, the car is usually going *slower*. They don’t know it’s because the driver was going uphill. And when the pedal is not being pushed, the speed of the car seems to be faster.

You and I know there is a causal relationship between pushing the gas and the car accelerating. But you’ve got these confounding factors: the behavior of the driver, and the hills.

So even though we know there is a causal relationship, given a limited dataset, you could observe a negative correlation!
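
Here’s a rough sketch of the thought experiment in Python, in case it helps to see numbers. Everything in it is an assumption I made up for illustration: the constants, the sine-wave hills, the simple `step` physics, and a driver who just presses harder the further the speed drops below a target. It is a toy model, not a real car.

```python
import math

# All constants are made-up values for a toy model, not real vehicle data.
ENGINE = 4.0     # acceleration at full pedal, m/s^2
GRAVITY = 9.81   # m/s^2
DRAG = 0.05      # linear drag coefficient, 1/s
TARGET = 25.0    # speed the driver tries to hold, m/s
DT = 0.1         # simulation time step, s


def step(speed, pedal, grade):
    """Advance the speed by one time step. More pedal always adds acceleration."""
    accel = ENGINE * pedal - GRAVITY * grade - DRAG * speed
    return speed + DT * accel


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def hilly_drive(seconds=600):
    """Cautious driver on rolling hills: record (pedal, speed) at each step."""
    cruise_pedal = DRAG * TARGET / ENGINE  # pedal that holds TARGET on flat road
    speed = TARGET
    pedals, speeds = [], []
    for i in range(round(seconds / DT)):
        grade = 0.08 * math.sin(2 * math.pi * (i * DT) / 60)  # hills as a sine wave
        # The driver presses harder when the car is slower than the target.
        pedal = min(1.0, max(0.0, cruise_pedal + 0.1 * (TARGET - speed)))
        pedals.append(pedal)
        speeds.append(speed)
        speed = step(speed, pedal, grade)
    return pedals, speeds


def flat_road_speed(pedal, seconds=600):
    """The controlled experiment: flat road, pedal held fixed, final speed."""
    speed = 0.0
    for _ in range(round(seconds / DT)):
        speed = step(speed, pedal, 0.0)
    return speed


pedals, speeds = hilly_drive()
r = pearson(pedals, speeds)
print("correlation(pedal, speed) on the hilly road:", round(r, 3))
print("flat road, pedal 0.2:", round(flat_road_speed(0.2), 1), "m/s")
print("flat road, pedal 0.6:", round(flat_road_speed(0.6), 1), "m/s")
```

The hilly-road data is what the aliens get, and in it the correlation between pedal and speed comes out strongly negative. The flat-road runs are the experiment they can’t do: fix the pedal yourself, and more pedal means more speed. The correlation is so extreme here only because my toy driver reacts to nothing but the speed; a real driver would be noisier.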

But to be fair:



Correlation correlates with causation. Just not as much as you think. It depends on the system you’re looking at, and how you’re measuring things.

I guess it’s intuitive to me that positive causation should be more likely to show up as a positive correlation than as no correlation, and that no correlation should be more likely than an inverse correlation like the one in the hilly driving example. Suppose you had an infinite number of random systems. The ones where some other factor cancels out or reverses the causal effect are more complicated, so I’d expect them to be rarer.

And now for some chuckles:


Follow that link for spurious correlations, like between oceanic piracy and global ice cream consumption.
