A fundamental concept in interpreting the relationship between two variables is the difference between correlation and causation. Too often, people assume that a correlation between two variables implies a causal relationship. But the two variables may move together without either one causing the other. Here are two examples:
First, let’s look at the relationship between the average hours of sunlight per day and how many hours the average person spends per day wearing sunglasses. You can see that the data follow a similar trajectory, increasing in the summer months. This is an example of a cause-and-effect relationship where the hours of sunlight directly influence the use of sunglasses. This is causation.

Now, let’s look at the rate of shark attacks and the consumption rate of ice cream. Once again, the data follow a similar path, increasing in the summer months and decreasing in the winter. This is an example of a coincidental relationship in which neither variable affects the other. Sharks are not attacking people because those people are eating more ice cream, and people are not eating more ice cream because of the increased attacks. The two variables, shark attacks and ice cream consumption, increase independently in the summer months because warmer temperatures lead to both more beach activity (and so more encounters with sharks) and more ice cream consumption.
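
To make the shark and ice cream example concrete, here is a small Python sketch that simulates it. Everything in it is made up for illustration: the temperature range, the coefficients, the noise levels, and the helper names `pearson` and `residuals` are all assumptions, not real data. Temperature drives both variables, and neither variable appears in the other's formula. The raw correlation between them still comes out strongly positive, but once the effect of temperature is removed from each variable, the leftover correlation is close to zero.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after subtracting a straight-line fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data: temperature is the hidden common cause.
temperature = [rng.uniform(5, 30) for _ in range(2000)]
# Each variable depends on temperature plus its own random noise.
# Neither one depends on the other.
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
shark_attacks = [0.3 * t + rng.gauss(0, 1.5) for t in temperature]

# Correlation as it would appear if temperature were never recorded.
raw_r = pearson(ice_cream, shark_attacks)

# Correlation after removing the temperature effect from both variables.
partial_r = pearson(residuals(ice_cream, temperature),
                    residuals(shark_attacks, temperature))

print(f"raw correlation:             {raw_r:.2f}")
print(f"after removing temperature:  {partial_r:.2f}")
```

Subtracting a straight-line fit is only one simple way to account for a third variable, and it works here because the simulated relationships are linear by construction. With real data, the hidden variable may not be measured at all, which is exactly why the question needs to be asked.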

Both of these examples are purposefully simple, but it is important to ask ‘correlation or causation?’ and think critically when presented with information about correlated variables. It is easy to assume that correlated data means one of the variables is influencing the other, but other, unrepresented variables could be responsible for the correlation.
