


Figure 1: Causation and Correlation (source: tweet link).

“Correlation does not mean Causation”: If you had a penny for every time you heard that line, you would probably be a millionaire by now. But what does it mean? Why does correlation between two events not imply causality? What is up?

Correlation is a relationship or connection between two variables where whenever one changes, the other is likely to also change. But a change in one variable doesn’t necessarily cause the other to change. For example, if you see a lot of birds in the sky and it starts raining, it doesn’t mean that the birds caused the rain. That’s a correlation, but it’s not causation.

Mathematically, correlation is measured by the correlation coefficient, which ranges between `-1` and `1`. A correlation coefficient of `1` indicates a perfect positive correlation between two variables. In contrast, a correlation coefficient of `-1` means a perfect negative correlation between two variables. A correlation coefficient of `0` means no linear correlation between two variables.

Conversely, causation is a relationship between two variables where one variable causes the other variable to change.

At this point, you’re probably wondering: I have heard about causality a few hundred times, but do I need to care? Let’s ask you a different question, then. How many times have you looked at the result of your model and wondered, what if the data were something other than what it was trained on?

Suppose Marvel hires you as a data scientist. You could write an algorithm that predicts the sales of comic books, and your model works well and produces high-accuracy predictions, but you need to know why. There has been a recent rise in comic book sales, and you need to figure out why so the company can maintain the sales figures. Or maybe it’s the opposite: your algorithm predicts completely wrong sales figures, and you need to figure out why.
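To make the `-1`, `0`, and `1` cases of the correlation coefficient concrete, here is a minimal sketch that computes the Pearson correlation coefficient by hand. The function name `pearson_r` and the toy data are assumptions made for illustration; they do not come from any particular library or dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Deviations of each value from its mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]

    # Sum of products of deviations, and sums of squared deviations.
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")

    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data, chosen so the three textbook cases appear.
x = [1, 2, 3, 4, 5]

perfect_positive = pearson_r(x, [2, 4, 6, 8, 10])  # y rises in step with x
perfect_negative = pearson_r(x, [10, 8, 6, 4, 2])  # y falls in step with x
no_linear = pearson_r(x, [4, 1, 0, 1, 4])          # y = (x - 3) ** 2

print(f"perfect positive: {perfect_positive:.2f}")
print(f"perfect negative: {perfect_negative:.2f}")
print(f"no linear correlation: {no_linear:.2f}")
```

Note the third case: `y` is completely determined by `x`, yet the coefficient is `0`, because the coefficient only captures linear association. In the other direction, a coefficient near `1` or `-1` says nothing about whether one variable causes the other, which is exactly the gap between correlation and causation.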
