Spurious Correlation

A spurious correlation is a statistical mirage: a relationship in which two or more variables appear to be related, moving in tandem, but have no real-world causal connection. Think of it as a coincidence on a grand scale. The variables might rise and fall together, leading you to believe one is causing the other, but the connection is either purely accidental or produced by a hidden third factor. For investors, mistaking a spurious correlation for a genuine cause-and-effect relationship is a classic and costly trap. It can lead to building an entire investment thesis on a foundation of sand, believing a stock will rise because of an unrelated factor, like the price of butter in Bangladesh or the winner of the Super Bowl. A savvy value investor learns to be a skeptic, always questioning whether a statistical pattern represents a real business dynamic or is just a ghost in the data.

The human brain is a pattern-finding machine. We are hardwired to connect dots and find reasons for why things happen, a tendency often amplified by confirmation bias. In the world of big data, where computers can sift through endless information, it's easier than ever to find seemingly profound correlations that are, in reality, completely meaningless.

Imagine you plot two sets of data: monthly ice cream sales and the monthly number of shark attacks. You'd likely find a stunningly strong correlation. As ice cream sales go up, so do shark attacks! Does this mean buying a vanilla cone incites a feeding frenzy? Of course not. The culprit is a third, “lurking” variable: hot weather.

  • When it’s hot, more people buy ice cream.
  • When it’s hot, more people go swimming in the ocean, increasing the chances of a shark encounter.

Ice cream sales and shark attacks do not cause each other; both are driven by the heat. This is the essence of a spurious correlation: an illusion of a direct link that is actually explained by something else entirely.
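The hot-weather effect can be reproduced with a small simulation. The following is a minimal sketch in Python, not a real dataset: the temperature range, the coefficients, and the noise levels are all invented for illustration, and the helper names pearson and residuals are hypothetical. Temperature drives both series, and neither series influences the other. The raw correlation should come out strongly positive, while the correlation that remains after removing temperature's influence from each series should sit close to zero.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def residuals(ys, xs):
    """What is left of ys after subtracting its best straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# Invented numbers: temperature feeds both series; they never touch each other.
temperature = [rng.uniform(10, 35) for _ in range(n)]
ice_cream = [20 * t + rng.gauss(0, 60) for t in temperature]
shark_attacks = [0.3 * t + rng.gauss(0, 1.5) for t in temperature]

# Correlation as a naive chart-reader would see it.
raw = pearson(ice_cream, shark_attacks)

# Correlation once the lurking variable has been removed from both series.
controlled = pearson(residuals(ice_cream, temperature),
                     residuals(shark_attacks, temperature))

print(f"raw correlation:                   {raw:.2f}")
print(f"after controlling for temperature: {controlled:.2f}")
```

Correlating the residuals is one common way to "control for" a suspected third factor. It only works when you have already guessed what that factor is, which is why the business reasoning still has to come first.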

For investors, especially those who lean heavily on quantitative analysis or technical analysis charts without a deep understanding of the underlying business, spurious correlations are kryptonite. They promise a secret formula for beating the market but deliver only random noise.

History is filled with bizarre and ultimately useless market indicators that are classic examples of spurious correlations.

  • The Super Bowl Indicator: For decades, a strangely accurate “rule” stated that if a team from the original National Football League (today's NFC) won the Super Bowl, the market would go up that year. If a team from the old American Football League (today's AFC) won, the market would go down. This held true for years, but it was pure, unadulterated coincidence with zero economic logic. Relying on it would be no different from basing your portfolio on a coin flip.
  • Data Mining Dangers: With powerful software, analysts can hunt for patterns across millions of data points. This process, called data mining, can easily spit out “discoveries” like “companies with two vowels in their ticker symbol outperform on Tuesdays.” These are almost always spurious. Test enough variables and some will correlate with stock prices purely by chance: at a 5% significance threshold, roughly one in twenty completely unrelated variables will still look “significant.”
  • Unrelated Economic Data: You might find a high correlation between, say, cheese production in Wisconsin and the performance of the Nasdaq index. While amusing, there is no plausible economic link. Building a strategy on this would be foolish, as the correlation is bound to break down unpredictably.
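The data-mining trap is easy to demonstrate. The sketch below, in Python, is an illustration under stated assumptions rather than a real backtest: the "returns" and the 1,000 candidate "indicators" are pure random noise, the sample sizes are invented, and the cutoff of 0.254 is the approximate 5% two-sided significance threshold for a correlation measured over 60 observations. Even though nothing here is related to anything else, dozens of indicators should clear the threshold, and the best one should look like a respectable signal.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

rng = random.Random(42)  # fixed seed so the run is repeatable
n_months = 60            # five years of monthly observations (invented)
n_candidates = 1000      # meaningless indicators we "mine" through
threshold = 0.254        # approx. 5% two-sided cutoff for |r| when n = 60

# Random noise standing in for stock returns.
returns = [rng.gauss(0, 1) for _ in range(n_months)]

# Correlate the returns with one random indicator after another.
correlations = []
for _ in range(n_candidates):
    indicator = [rng.gauss(0, 1) for _ in range(n_months)]
    correlations.append(pearson(indicator, returns))

# Count how many noise series look "statistically significant".
false_hits = sum(1 for r in correlations if abs(r) > threshold)
best = max(correlations, key=abs)

print(f"indicators tested:         {n_candidates}")
print(f"'significant' by chance:   {false_hits}")
print(f"strongest correlation:     {best:.2f}")
```

About 5% of the candidates are expected to pass the test by luck alone, so the number of false discoveries grows in step with the number of things you try. A pattern found this way says more about how hard someone searched than about how markets work.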

Guarding against spurious correlations requires a healthy dose of skepticism and a commitment to first-principles thinking. It’s about being more of a business detective than a data cruncher.

Before you ever act on a statistical relationship, you must ask one simple, powerful question: Why? What is the logical, real-world mechanism that connects these two variables? If you can't come up with a sensible, evidence-based answer, you should assume the correlation is spurious until proven otherwise.

Key Questions to Ask

  1. Is there a plausible economic reason? Does it make business sense that a rise in Factor A would cause a rise in the value of Company B? For example, it's logical that rising oil prices benefit an oil exploration company. It's not logical that the same company's fortunes depend on the birth rate in Finland.
  2. Could a third factor be driving both? Always hunt for the “hot weather” variable. Are both the stock price and the indicator you're watching being driven by a larger trend, like interest rates, consumer confidence, or technological change?
  3. Is this just a coincidence? Given enough data, coincidences are inevitable. Ask yourself if the “discovery” is more likely the result of random chance than a stable, predictable relationship.

Ultimately, the best defense is a core principle of value investing: focus on what you can understand. Don't chase statistical ghosts. Instead, focus on the tangible drivers of long-term value: a company's competitive advantage (its economic moat), the quality of its management, its financial health, and its ability to generate sustainable cash flow and a high return on equity. These are the true causes of long-term investment success, and unlike a spurious correlation, you can count on them.