Signal-to-noise ratio

From The Jolly Contrarian
Revision as of 17:27, 12 March 2021 by Amwelladmin (talk | contribs)
where
“n” is the data in which you trust; and
“x” is the data you haven’t got yet.

Caught in a mesh of living veins,
In cell of padded bone,
He loneliest is when he pretends
That he is not alone.

We’d free the incarcerate race of man
That such a doom endures
Could only you unlock my skull,
Or I creep into yours.

Ogden Nash, Listen...

In God we trust, all others must bring data.

W. Edwards Deming

n/∞=0

—Mathematics

If the information content of the universe, through all time and space, is as good as infinite[1] and the data Homo sapiens has collected to date is necessarily finite[2] (even counting what we’ve lost along the way), it follows that the total value of our data (in which W. Edwards Deming would have us trust) is, like any other finite number divided by infinity, mathematically nil.

And that is before you consider the quality of our data. If 90% of all gathered data originates from the internet age,[3] a good portion of our summed human knowledge comprises cat videos, self-indulgent wikis and hot takes on Twitter, and so is shite data, even on its own terms.[4]

In any case, it follows that, should we transcend our meagre hermeneutic bubbles, and free the incarcerate race of man, so to speak, the signal of our data to the noise of all possible data out there is infinitesimal.[5]

If this is what we’re meant to trust, you might ask what is so wrong with God. We are pattern-seeking machines. We don’t take the data as we find it and coolly fashion objective axioms from it, carving nature at its joints: we bring our idiosyncratic prisms and pre-existing cognitive structures to it (our own “hot takes”) and wilfully create patterns from it to support our convictions.

This is not a criticism but an observation. This is the doom our incarcerate race endures.

It is not just the Twitterati. Science, too, has its confirmation biases at a meta-level, uncontrollable even by double-blind testing methodologies. Experiments which confirm a hypothesis are a lot more likely to be published than those which don’t.[6] Of those failed experiments that are published, far fewer are cited in other literature. Falsifications die.
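
For the sceptical, the effect is easy to reproduce in miniature. What follows is a toy sketch only, not anything drawn from Blastland or from real publication data: every name and number in it (`simulate_literature`, the 10% chance that a null result gets published, the one-sided threshold of 1.645) is an assumption invented for illustration. It runs a couple of thousand experiments on an effect that does not exist, publishes every one that “confirms” the hypothesis and only a few of the rest, then compares the average effect in the published record with the average across everything actually run.

```python
import random
import statistics


def simulate_literature(n_experiments=2000, sample_size=30, true_effect=0.0,
                        publish_null_prob=0.1, seed=0):
    """Toy model (all parameters hypothetical): experiments that 'confirm'
    the hypothesis are always published; the rest only occasionally."""
    rng = random.Random(seed)
    all_estimates, published = [], []
    for _ in range(n_experiments):
        # Each experiment draws noisy measurements around the true effect.
        sample = [rng.gauss(true_effect, 1.0) for _ in range(sample_size)]
        mean = statistics.fmean(sample)
        stderr = statistics.stdev(sample) / sample_size ** 0.5
        # Assumed rule: a one-sided test at roughly the 5% level.
        confirms = mean / stderr > 1.645
        all_estimates.append(mean)
        if confirms or rng.random() < publish_null_prob:
            published.append(mean)
    return all_estimates, published


all_estimates, published = simulate_literature()
print(f"experiments run: {len(all_estimates)}, published: {len(published)}")
print(f"mean effect, every experiment run: {statistics.fmean(all_estimates):+.3f}")
print(f"mean effect, published only:       {statistics.fmean(published):+.3f}")
```

Under these assumptions the published average should sit above the true effect of zero, even though nothing was fabricated and each experiment was honest on its own terms. The bias lives in the filter, not in any one result.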

This is neither a cause for alarm nor is it new. It is just a reminder of how important, in all human discourse, are contingency, provisionality and, above all, humility. Your data is likely bunk.

All of this is another way of attacking a familiar problem: the universe, the world, the nation, your market, your workplace and even your interpersonal relationships are complex, not merely complicated. Complication is a function of a paradigm. It is part of the game. It is within the rules. It is soluble, by sufficiently skilled application of the rules. Complication can be beaten by an algorithm. You can brute force it.

Complexity, you cannot.

Complexity describes the limits of the narrative. Complexity is the wilderness beyond the rules of the game. Complexity inhabits the noise, not the signal. Where there is complexity, algorithmic rules do not work. Here data is relegated to noise.[7]

This is why physical sciences apparently have greater success than social sciences: they ask themselves easier questions. Physical sciences generally address the behaviour of independent events: rolling balls, flipping coins, waves and/or particles of light. Rolling balls are not autonomous agents. They act independently. The behaviour of one will not influence that of another. Each coin flip is, as a condition of probability theory, independent.[8] Independent events obey Gaussian principles. They may be modelled. That is to say, they may be complicated but they remain predictable, at least in theory. When physical systems inexplicably go bang (Chernobyl, the Shuttle Challenger, the Torrey Canyon) the root cause will not be a failure of the physical science underlying the engineering, but some supervening cause invalidating the underlying assumptions on which the physical science was based. Things go bang because of non-linear interactions.

Social sciences don’t have that get-out-of-jail-free card: they address precisely that kind of supervening cause: behaviour that is, intrinsically, unpredictable. Psychology, sociology, anthropology, economics — these concern themselves with human agents, who are influenced by each other — which is why we don’t use physical science to predict their behaviour. Social sciences have to deal with the inherently complex, non-Gaussian interactions between human beings.[9]
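
The difference between independent events and agents who influence one another can be sketched in a few lines. This is a minimal illustration under loud assumptions, not a model of any real market or society: `independent_flips` and `herding_agents` are hypothetical names, and the herding rule (each newcomer copies a randomly chosen predecessor, in the manner of a Pólya urn) is simply one convenient way to make outcomes depend on each other. Both processes are run many times and the spread of their final outcomes is compared.

```python
import random
import statistics


def independent_flips(n_flips, rng):
    """Share of heads when every flip ignores all the others."""
    return sum(rng.random() < 0.5 for _ in range(n_flips)) / n_flips


def herding_agents(n_agents, rng):
    """Share of 'heads' when each new agent copies a randomly chosen
    earlier agent (a Polya-urn-style toy, seeded with one of each)."""
    heads, total = 1, 2
    for _ in range(n_agents):
        # Copying a random predecessor yields heads with probability heads/total.
        if rng.random() < heads / total:
            heads += 1
        total += 1
    return heads / total


rng = random.Random(42)
runs = 500
independent = [independent_flips(1000, rng) for _ in range(runs)]
herding = [herding_agents(1000, rng) for _ in range(runs)]

print(f"independent flips: spread of outcomes = {statistics.pstdev(independent):.3f}")
print(f"herding agents:    spread of outcomes = {statistics.pstdev(herding):.3f}")
```

On this toy, the independent runs should bunch tightly around one half, as the bell curve says they will, while the herding runs should scatter widely: early accidents get amplified rather than averaged away, and the tidy prediction goes with them.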

Behaviourism and The Ghost in the Machine

Now it wasn’t always like that. Fifty years ago psychologists were waging a battle royale against the positivist branch of their own discipline, which insisted on proceeding by reference, exclusively, to “public events” and ignoring private mental events. Can you imagine it: a psychology which ignores private mental events? Can you imagine an approach to artificially reconstructing natural intelligence which ignores private mental events?

On the strength of this doctrine, the Behaviorists proceeded to purge psychology of all intangibles and unapproachables. The terms ‘consciousness’, ‘mind’, ‘imagination’ and ‘purpose’, together with a score of others were declared to be unscientific, treated as dirty words, and banned from the vocabulary. ...

It was the first ideological purge of such a radical kind in the domain of science, predating the ideological purges in totalitarian politics, but inspired by the same single-mindedness of true fanatics.
—Arthur Koestler, The Ghost in the Machine

You might ask what has changed, for it seems that the contemporary interest in neural networks, big data and natural language processing, all of which eschew the intentional fallacy, adopts exactly the Behaviourist disposition. Doesn’t it? On one hand, they have no choice: if human psychologists are struggling to understand how consciousness works in situ, in the actual mesh of living veins, in cell of padded bone, is it any wonder people looking at its proxy in a digital network might not bother?

References

  1. This assumes there is not a finite end-point to the universe; by no means settled cosmology, but hardly a rash assumption. And given how little we have of it, the universe’s total information content might as well be infinite, when compared to our finite collection of mortal data. Even the total, ungathered-by-mortal-hand, information content generated by the whole universe to date, not even counting the unknowable future, is as good as infinite.
  2. There is no data from the future.
  3. Eric Schmidt said something like this in 2011, and it sounds totally made up, but let’s run with it, hey?
  4. Get off Twitter, okay? For all of our sakes.
  5. That means, really small.
  6. The Hidden Half: How the World Conceals its Secrets, by Michael Blastland.
  7. Provisional theory: “information” is data framed with a hypothesis.
  8. The technical term: “platykurtic”.
  9. Physical sciences set up closed logical systems within which their rules will work, and often these systems are dramatically simplified as compared with anything you see in the real world: Newton, for example, assumes a frictionless, stationary, stable, neutral frame of reference: circumstances which, in any observed environment, do not and cannot exist. Nancy Cartwright calls these structures “nomological machines”. Because of this explicit caveat, we can put any variances between Newton’s prediction and the observed outcome down not to falsification, but to the messy real world “contaminating” the idealised experimental conditions. Hence, the proverbial crisp packet blowing across St Mark’s Square.