Correlation: Difference between revisions

From The Jolly Contrarian
 
(29 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{a|glossary|}}The idea, first articulated by statistician Karl Pearson<ref>So [https://slate.com/technology/2012/10/correlation-does-not-imply-causation-how-the-internet-fell-in-love-with-a-stats-class-cliche.html Slate Magazine argues, at any rate.</ref>, that a relationship between two variables could be characterised according to its strength and expressed in numbers.  
{{a|stats|{{Image|Quincunx|jpg|A [[quincunx]], yesterday}}}}{{Quote|
{{Script|Triago}}: If substance is family then form is the state <br>
A contrivance, precariously stack’d <br>
Bids yield our resilient bonds<br>
To th’escapements of voguish clockery <br>
To rudely declare in the interests of nation<br>
A final victory of correlation over causation<br>
{{Script|Nuncle}}: But the cleverest contraption rusts<br>
Upon immersion in snot.<br>
: —{{Buchstein}}, {{br|The Victory of Form over Substance}}}}
 
The idea, following from Sir Francis Galton’s experiments with a [[quincunx]] and first articulated by statistician Karl Pearson<ref>So [https://slate.com/technology/2012/10/correlation-does-not-imply-causation-how-the-internet-fell-in-love-with-a-stats-class-cliche.html Slate Magazine] argues, at any rate.</ref> that a relationship between two variables could be characterised according to its statistical strength and expressed in numbers, ''regardless of any perceived [[Causation|causal]] connection between them''.
 
If one can derive significance from a purely statistical correlation without a deeper mechanical theory of the universe that might tell us ''why'', we are well on our way to an [[Artificial intelligence|artificially intelligent]] future where [[Chatbot|robot]]s can wipe elderly arses, [[Rumours of our demise are greatly exaggerated - technology article|all bankers are redundant]] (good, right?), [[A World Without Work: Technology, Automation, and How We Should Respond - Book Review|so is everyone else]] (''not'' so good?) and it is only a matter of time before Skynet becomes self-aware and starts hunting down random skater kids from the 1990s.
 
''[[Spartan if|If]]''.
 
But, in some cases you ''can'' derive a significance; in some cases you ''can’t''<ref>There are whole websites devoted to spurious correlations. Like, well, http://www.spuriouscorrelations.com.</ref> but — irony upcoming — without a sophisticated theory of ''causality'', it will be hard to tell them apart. That is to say, a bare [[correlation]] won’t tell you whether there is a causal arrow at all, much less — if there is — which way it flows.
 
“Correlation”<ref>“A mutual relationship or connection between two or more things.”</ref> ''ought to be'' a synonym for “mere coincidence”<ref>“A remarkable concurrence of events or circumstances without apparent causal connection”.</ref> though in its more fashionable usages, especially among [[big data]] freaks, this tends to get, well, ''buried'' in the [[signal-to-noise ratio|noise]]. There may be something profound, reflexive and ironic about this, but it’s too early in the morning to figure it out. At any rate, the more data you have, the worse your [[signal-to-noise ratio|signal]], and the more chanting “[[correlation does not imply causation]]” in a sing-song voice whenever anyone cites a correlation will annoy the ''hell'' out of [[big data]] freaks, which is all the more reason to do it.


===Correlation and causation===
Now it is true that [[correlation]] doesn’t imply [[causation]], but it doesn’t rule it out either. And it is certainly true that a ''lack'' of correlation ''does'' imply a ''lack'' of [[causation]].
Now it is true that [[correlation]] doesn’t imply [[causation]], but it doesn’t rule it out either. And it is easy to infer from a ''lack'' of correlation that there is no [[causation]].
 
But hold your horses.
 
“[[All other things being equal]], a [[correlation]] is more likely to evidence a [[causation]] than a ''lack'' of correlation”, is one of those logical canards. As Monty Python put it, “[[universal affirmative]]s can only be partially converted: all of Alma Cogan is dead, but only some of the class of dead people are Alma Cogan.”
 
So here’s the thing, and I am straining to avoid distracting myself onto my pet subjects of transcendent truth and causal skepticism, so bear with me:
 
Even if you accept some [[reductionism|objectivist]] model where, whether we can know it or not, there ''is'' a true, unique, single cause for every effect (and down that rabbit hole are a bunch of consequences you really wouldn’t like, but let’s say), it follows that an event must have but ''one'' cause (or consistent matrix of causes) to the absolute exclusion of any other explanation. There cannot be alternative, mutually exclusive, causal explanations of the same event, for that would imply ghastly [[relativism]].<ref>Not ghastly.</ref>
 
That is to say, for every single “''true''” correlation, there are multiple ''spurious'' correlations — events that serendipitously ''seem'', by their statistical regularity, to have causal significance to each other but, in transcendent fact, don’t.
 
How many is “multiple”? ''Depends on how much data, and how much imagination, you’ve got''. Seeing as [[the portion of all data we have collected is nil|the portion of all data we have collected is necessarily nil]], the best answer is that ''there are infinite [[spurious correlation]]s and only one ''true ''correlation between a cause and its effect''. The likelihood, without better evidence,<ref>You are right. This qualification is doing ''a lot'' of work.</ref> that a given correlation is the true one is therefore 1/∞, or ''zero''.
 
So it is true to say a lack of any correlation may not increase the likelihood of events being causally related, ''but nor, without other evidence, does the presence of one''. Especially seeing as there may be some data, as yet uncollected or [[narrative|unnarratised]], that could explain how apparently uncorrelated events are, in fact, causally related.
 
Where does this leave us? Here: '''Any [[correlation]], in the absence of better evidence of [[causation]], is ''meaningless'''''.  
 
===“Better evidence of causation”===
Glomming on to a satisfying correlation dodges the hard question, which is, “what possible ''better evidence'' of true causation (a ‘necessary connexion’ between cause and effect) ''could there be''?”


[[All other things being equal]], a [[correlation]] is more likely to evidence a [[causation]] than a ''lack'' of correlation, right? This is one of those logical canards, as Monty Python put it, “[[universal affirmative]]s can only be partially converted: all of Alma Cogan is dead, but only some of the class of dead people are Alma Cogan.
This is not a new conundrum. It was first posed by {{author|David Hume}}, in 1739 — “necessary connexion” was his phrase — and he answered it in the negative. There is no better evidence of causation.


But, fortunately for the interests of narrow-minded righteousness and [[determinism]], Hume allegedly once met someone who was racist, so we can entirely ignore him and the quarter of a millennium of epistemology that he spurred. Plus, he was a Scot.<ref>Disclosure for humourless [[libtard]]s: deliberate irony, intended as a joke.</ref>


{{sa}}
*[[In God we trust, all others must bring data]]
*[[Contract]]ual [[causation]]
*{{br|The Book of Why: The New Science of Cause and Effect}} by {{author|Judea Pearl}}


{{ref}}
{{Friday Philosophy|Correlation}}

Latest revision as of 09:22, 26 June 2024

Lies, Damn Lies and Statistics
A quincunx, yesterday

Triago: If substance is family then form is the state
A contrivance, precariously stack’d
Bids yield our resilient bonds
To th’escapements of voguish clockery
To rudely declare in the interests of nation
A final victory of correlation over causation
Nuncle: But the cleverest contraption rusts
Upon immersion in snot.

Büchstein, The Victory of Form over Substance

The idea, following from Sir Francis Galton’s experiments with a quincunx and first articulated by statistician Karl Pearson,[1] that a relationship between two variables could be characterised according to its statistical strength and expressed in numbers, regardless of any perceived causal connection between them.

If one can derive significance from a purely statistical correlation without a deeper mechanical theory of the universe that might tell us why, we are well on our way to an artificially intelligent future where robots can wipe elderly arses, all bankers are redundant (good, right?), so is everyone else (not so good?) and it is only a matter of time before Skynet becomes self-aware and starts hunting down random skater kids from the 1990s.

If.

But, in some cases you can derive a significance; in some cases you can’t[2] but — irony upcoming — without a sophisticated theory of causality, it will be hard to tell them apart. That is to say, a bare correlation won’t tell you whether there is a causal arrow at all, much less — if there is — which way it flows.
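
To make that concrete, here is a minimal Python sketch of Pearson’s coefficient. The function name `pearson_r` and the ice-cream and sunburn figures are invented for illustration and come from no source. The formula treats its two inputs identically, so swapping them returns the same number: nothing in the arithmetic says which way a causal arrow points, or whether there is one at all.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product-moment correlation for two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points each")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations of each observation from its sample mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    co_deviation = sum(a * b for a, b in zip(dx, dy))
    spread_x = sum(a * a for a in dx)
    spread_y = sum(b * b for b in dy)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a sample is constant")
    return co_deviation / math.sqrt(spread_x * spread_y)


# Hypothetical figures, made up for this example only.
ice_cream_sales = [20, 25, 31, 38, 45, 52]
sunburn_cases = [3, 4, 6, 7, 9, 11]

r_forward = pearson_r(ice_cream_sales, sunburn_cases)
r_backward = pearson_r(sunburn_cases, ice_cream_sales)

print(f"r(ice cream, sunburn) = {r_forward:.3f}")
print(f"r(sunburn, ice cream) = {r_backward:.3f}")
```

Both lines print the same value. Whether ice cream causes sunburn, sunburn causes ice cream, or summer causes both is a question the coefficient cannot answer.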

“Correlation”[3] ought to be a synonym for “mere coincidence”[4] though in its more fashionable usages, especially among big data freaks, this tends to get, well, buried in the noise. There may be something profound, reflexive and ironic about this, but it’s too early in the morning to figure it out. At any rate, the more data you have, the worse your signal, and the more chanting “correlation does not imply causation” in a sing-song voice whenever anyone cites a correlation will annoy the hell out of big data freaks, which is all the more reason to do it.

Correlation and causation

Now it is true that correlation doesn’t imply causation, but it doesn’t rule it out either. And it is easy to infer from a lack of correlation that there is no causation.

But hold your horses.

“All other things being equal, a correlation is more likely to evidence a causation than a lack of correlation”, is one of those logical canards. As Monty Python put it, “universal affirmatives can only be partially converted: all of Alma Cogan is dead, but only some of the class of dead people are Alma Cogan.”

So here’s the thing, and I am straining to avoid distracting myself onto my pet subjects of transcendent truth and causal skepticism, so bear with me:

Even if you accept some objectivist model where, whether we can know it or not, there is a true, unique, single cause for every effect (and down that rabbit hole are a bunch of consequences you really wouldn’t like, but let’s say), it follows that an event must have but one cause (or consistent matrix of causes) to the absolute exclusion of any other explanation. There cannot be alternative, mutually exclusive, causal explanations of the same event, for that would imply ghastly relativism.[5]

That is to say, for every single “true” correlation, there are multiple spurious correlations — events that serendipitously seem, by their statistical regularity, to have causal significance to each other but, in transcendent fact, don’t.

How many is “multiple”? Depends on how much data, and how much imagination, you’ve got. Seeing as the portion of all data we have collected is necessarily nil, the best answer is that there are infinite spurious correlations and only one true correlation between a cause and its effect. The likelihood, without better evidence,[6] that a given correlation is the true one is therefore 1/∞, or zero.
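
The “how much data” point can be roughly simulated, assuming nothing beyond Python’s standard library. Every series below is independent random noise, so any correlation among them is spurious by construction. The helper names, the series length, the batch sizes and the seed are all arbitrary choices made for this sketch.

```python
import itertools
import math
import random


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    co_deviation = sum(a * b for a, b in zip(dx, dy))
    return co_deviation / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def strongest_chance_correlation(series):
    """Largest absolute correlation found among all pairs of series."""
    return max(abs(pearson_r(a, b)) for a, b in itertools.combinations(series, 2))


rng = random.Random(0)  # fixed seed so reruns give the same figures
POINTS = 10             # e.g. ten annual observations per series

# 100 series of pure noise: by construction, none causes any other.
all_series = [[rng.gauss(0, 1) for _ in range(POINTS)] for _ in range(100)]

# Look at ever larger batches drawn from the same pool of noise.
results = {k: strongest_chance_correlation(all_series[:k]) for k in (5, 20, 100)}

for k, r in results.items():
    pairs = k * (k - 1) // 2
    print(f"{k:>3} unrelated series, {pairs:>4} pairs: strongest |r| = {r:.2f}")
```

Because each smaller batch is a subset of the next, the strongest chance correlation can only hold or rise as series are added. The number of pairs to inspect grows roughly with the square of the number of series, which is why a big enough pile of data will always offer up something that looks impressive.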

So it is true to say a lack of any correlation may not increase the likelihood of events being causally related, but nor, without other evidence, does the presence of one. Especially seeing as there may be some data, as yet uncollected or unnarratised, that could explain how apparently uncorrelated events are, in fact, causally related.

Where does this leave us? Here: Any correlation, in the absence of better evidence of causation, is meaningless.

“Better evidence of causation”

Glomming on to a satisfying correlation dodges the hard question, which is, “what possible better evidence of true causation (a ‘necessary connexion’ between cause and effect) could there be?”

This is not a new conundrum. It was first posed by David Hume, in 1739 — “necessary connexion” was his phrase — and he answered it in the negative. There is no better evidence of causation.

But, fortunately for the interests of narrow-minded righteousness and determinism, Hume allegedly once met someone who was racist, so we can entirely ignore him and the quarter of a millennium of epistemology that he spurred. Plus, he was a Scot.[7]

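See also

  * In God we trust, all others must bring data
  * Contractual causation
  * The Book of Why: The New Science of Cause and Effect by Judea Pearl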

References

  1. So Slate Magazine argues, at any rate.
  2. There are whole websites devoted to spurious correlations. Like, well, http://www.spuriouscorrelations.com.
  3. “A mutual relationship or connection between two or more things.”
  4. “A remarkable concurrence of events or circumstances without apparent causal connection”.
  5. Not ghastly.
  6. You are right. This qualification is doing a lot of work.
  7. Disclosure for humourless libtards: deliberate irony, intended as a joke.