
All That Glitters Is Not Gold: The Illusion of Spurious Correlations

  • Writer: Karolina Dyrla-Mularczyk
  • Aug 19, 2024
  • 1 min read

Updated: Mar 17

The co-occurrence of two (or more) variables in a given situation does not necessarily imply a meaningful relationship between them. Sometimes, this correlation is purely coincidental, a result of timing, or influenced by another underlying variable. 

  

A well-known illustration of such an illusory relationship was published in 1996 by researchers Donald Redelmeier and Amos Tversky. In their study, they observed 18 patients with degenerative joint disease over a 15-month period while simultaneously recording weather data (pressure, temperature, humidity). The patients believed that weather conditions were connected to increased joint pain. However, statistical analysis did not support this belief.


In one of my recent analyses, marital status correlated with stroke. But does this reflect a meaningful relationship? No, because the variable actually associated with stroke incidence was the patient's age. As patients aged, the likelihood of being married increased. Age was therefore the so-called confounding variable: a third factor related to both marital status and stroke that creates an apparent link between them.
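The confounding pattern described above can be reproduced with a small simulation. This is only a sketch on made-up data: the function name `simulate_patients`, the probabilities, and the age range are assumptions chosen for illustration, not values from the analysis mentioned in the text. Marital status and stroke are both generated from age alone, so any link between them comes purely from the confounder.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate_patients(n=40000, seed=0):
    """Hypothetical patients: marriage and stroke each depend on age only."""
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        age = rng.uniform(20, 90)
        u = (age - 20) / 70  # age rescaled to the range 0..1
        married = int(rng.random() < 0.1 + 0.8 * u)  # assumed probability
        stroke = int(rng.random() < 0.02 + 0.4 * u)  # assumed probability
        patients.append((age, married, stroke))
    return patients

patients = simulate_patients()

# Correlation across all patients, ignoring age.
overall_r = pearson([m for _, m, _ in patients], [s for _, _, s in patients])

# Correlation inside 10-year age bands, averaged over the bands.
band_rs = []
for lo in range(20, 90, 10):
    band = [(m, s) for age, m, s in patients if lo <= age < lo + 10]
    band_rs.append(pearson([m for m, _ in band], [s for _, s in band]))
mean_band_r = sum(band_rs) / len(band_rs)

print(f"overall: {overall_r:.3f}, within age bands: {mean_band_r:.3f}")
```

Across all simulated patients, marital status and stroke are positively correlated. Once patients are compared only with others of similar age, the correlation largely disappears, which is the signature of a confounder.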




Examples of Spurious Correlations: 

  • Rising expenditures on pets are associated with an increase in deaths from falling down stairs. 

  • In years when actor Nicolas Cage appeared in more films, there was a simultaneous rise in the number of people who drowned in backyard pools. 


More examples of spurious yet strong correlations can be found here: https://plotlygraphs.medium.com/spurious-correlations-56752fcffb69
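One way purely coincidental correlations like these can arise is by comparing a short series against a large number of unrelated ones. The sketch below assumes entirely random data (10 yearly values, 2000 candidate series, both numbers chosen arbitrarily for illustration) and reports the strongest correlation found by chance alone.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(1)
n_years = 10         # assumed length of each yearly series
n_candidates = 2000  # assumed number of unrelated series to search

# A random "target" series, e.g. some yearly statistic.
target = [rng.gauss(0, 1) for _ in range(n_years)]

# Search unrelated random series for the one that best matches the target.
best_r = 0.0
for _ in range(n_candidates):
    candidate = [rng.gauss(0, 1) for _ in range(n_years)]
    r = pearson(target, candidate)
    if abs(r) > abs(best_r):
        best_r = r

print(f"strongest correlation found by chance: {best_r:.2f}")
```

Every series here is independent noise, yet the best match correlates strongly with the target. With only a few data points and many comparisons, a striking correlation is expected rather than surprising.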


How to Avoid Incorrect Conclusions about Correlations: 

  • Check whether there is a logical, consistent causal explanation linking the variables.

  • Control for confounding variables – identify and manage potential variables that may influence the analyzed variables.

  • Use advanced statistical analysis methods, such as regression analysis, to control for the impact of confounding variables.
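As a minimal sketch of the regression-based adjustment mentioned in the list above: regress each variable on the confounder, then correlate the residuals (a partial correlation). The data are simulated, and the names `confounder`, `a`, and `b` are hypothetical. With real data and several confounders, a multiple regression from a statistics package would be the usual choice.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def residuals(ys, zs):
    """Residuals of a simple least-squares regression of ys on zs."""
    n = len(ys)
    mean_y = sum(ys) / n
    mean_z = sum(zs) / n
    slope = (sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
             / sum((z - mean_z) ** 2 for z in zs))
    intercept = mean_y - slope * mean_z
    return [y - (intercept + slope * z) for y, z in zip(ys, zs)]

# Simulated data (assumption): a and b both depend on the confounder,
# but have no direct effect on each other.
rng = random.Random(42)
confounder = [rng.gauss(0, 1) for _ in range(5000)]
a = [z + rng.gauss(0, 1) for z in confounder]
b = [z + rng.gauss(0, 1) for z in confounder]

raw_r = pearson(a, b)
# Correlation between the parts of a and b not explained by the confounder.
adjusted_r = pearson(residuals(a, confounder), residuals(b, confounder))

print(f"raw: {raw_r:.3f}, adjusted for confounder: {adjusted_r:.3f}")
```

The raw correlation between `a` and `b` is substantial, while the adjusted one is close to zero, suggesting the original association was driven by the shared confounder.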

 

Sources: 

Redelmeier, D. A., & Tversky, A. (1996). On the belief that arthritis pain is related to the weather. Proceedings of the National Academy of Sciences, 93(7), 2895-2896. 






 
 