Data modernism

From The Jolly Contrarian
The JC’s amateur guide to systems theory

The Jolly Contrarian holds forth™

I distrust all systematisers and avoid them. The will to a system is a lack of integrity.

Friedrich Nietzsche, Twilight of the Idols

Data modernism
/ˈdeɪtə ˈmɒdənɪzm/ (n.)

The belief that sufficiently powerful machines running sufficiently sophisticated algorithms over sufficiently massive quantities of unstructured data can, by themselves, solve the future.

A prelude to the great delamination.

There is a strand of high-modernist thought[1] that optimised human interaction can be derived mathematically from data science: that all that has stopped it till now is the want of a sufficiently powerful machine to run the calculations.

This is a generalisation, but it finds expression in the nearby singularity, the simulation hypothesis, the more breathless aspirations for AI, blockchain maximalism, and the slack-jawed wonder with which thought leaders regard AlphaGo.

The underlying premise: the universe is monstrously complicated but nonetheless fundamentally clockwork: bounded, finite and probabilistic. It is complicated, not complex.[2]

By this view the time is now close at hand when the means to calculate everything will be at our disposal. We now have the processing power to take colossal “noise” and from it extrapolate a “signal”. We don’t necessarily understand how the machines will do this, just that they will; and far from algorithmic inscrutability being a matter of concern, it is part of the appeal: there is no “all-too-human” bias.[3]

Nutshell: there is a belief which stretches from paid-up Randian anarcho-capitalists to certified latter-day socialists, that we can solve our problems with data.

Summary

Your attitude towards data influences how you organise — construct[4] — the world:

Determinists build from history and for efficiency: without tolerance, since where there is no doubt, tolerance implies only waste, misapprehension and error.

Pluralists build towards the future and for flexibility: with tolerance, because we don’t know what will happen next, or how we will view what happened in the past, so we need room to adjust. Tolerance implies open-mindedness, and a commitment to rebuild our world to best fit the circumstances as we find them.

Data modernism? Or post-modernism?

An initial objection to the label: in James C. Scott’s classic account of high-modernism[5] there is a top-down, beneficent, controlling human mind of some kind with a pre-existing theory of the game. That central intelligence has derived a theory from deterministic first principles; a sort of cogito ergo sum begets income tax and rice pudding begets a mechanised modernist way of life. The housing project, or five-year plan, or Ministry of Truth is an implementation of that pre-existing theory.

In “data modernism” the controlling human mind does a different job: it no longer needs a pre-existing theory of the game: it delegates — or, at any rate, yields — that responsibility to an ineffable algorithm. The problem is solved not by theory-dependent syllogism, but by a neural network operating at a scale, speed and depth that, to mortal hand and eye, is quite opaque.

The “controlling mind” need not know how, in the particular case, the algorithm works or how it gets to its conclusions. It is fixed with the conviction that, being the summed and filtered output of the collected wisdom of the crowd, the algorithm has a greater intelligence than any single “controlling mind” anyway.

High modernism (a type of top-down, controlling, conviction politics) thereby seems sufficiently different to “data modernism” (agnostic, open-minded, conditional, following the evidence rather than shaping it) that we shouldn’t use the same term for both. What I am calling data modernism is more like postmodernism, or post-postmodernism.

But this is a difference of emphasis not upshot. It is a different path to the same place. Data science is just a new conveyance to the same reductionist theory of the world.

Unstructured data as hubbub

Now data, as it comes, is an incoherent, imperfect, meaningless thing. It is the pre-cinema audience chat before the lights go down: a “hubbub” made up of millions of individual interactions, each of which may have its own meaning, or may be incoherent, wrong-headed or irrelevant, but which, when aggregated and taken as an unmoderated whole, has no particular meaning at all beyond “people are talking”.

Now imagine being asked to take that audience hubbub and, with magical tools, to condense it to the single proposition: “what was this audience thinking?” But the interactions are unstructured, as between themselves random and disconnected. Obviously, there is no thread.

But this is what the algorithm is supposedly doing when it extracts signal from noise. Selectively, it filters, limits, compresses and amplifies on the presumption that there is a signal to find among the noise; that all the conversations in that hubbub do boil down to some common sentiment, and that those which don’t are no more than noise; that the hubbub is something like a de-tuned radio, or the white noise in the SETI[6] data, buried within which are signals from pulsars, quasars and intelligent life.

But the hubbub is not like that. It can’t be reduced to prime factors. There is not a common signal. SETI is a bad metaphor: it tries to detect a single bilateral signal from a spectrum of other kinds of radiation that are not a signal, but are broadcast on the same frequency. With the human hubbub all there is is signal. It is just that all the signals conflict, or miss each other, or bear no relation to each other at all. There is a spectrum of unconnected communications and no real “signal”. We are not trying to isolate a single conversation out of all the other ones (that would be the direct analogy) but trying to extract an aggregated message that is not actually there, and to treat it as an emergent property of all those conversations. This is a different thing entirely.

There is no 2 from millions of unrelated conversations. The result is brown, warm and even: maximum entropy.

To make something out of nothing is to deliberately bias. It is to carve David out of a marble block. Bias creates meaning. There may be local meanings — maybe — based on local interactions and echo chambers but these are informal, incomplete, and impossible to delimit. But the machine nonetheless extracts one — spurious correlations or just some kind of frequency analysis pulls out some themes.

Now imagine feeding that single confabulated sentence back to all the theatre patrons to say “this is the issue which the theatre was debating. Now, which side were you on?”

It is human nature to read that against your personal situation and come to a view, as if reading a horoscope. Suddenly everyone in the cinema does have a view. They will invest in that conversation.

But the hubbub was just noise all along. None of the individual conversations had anything to do with each other. All had their own, independent meanings. They are immune to aggregation.
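The frequency analysis described above can be sketched in a few lines. This is only an illustration under loud assumptions: the vocabulary, the `simulate_hubbub` and `extract_theme` helpers and the audience sizes are all hypothetical, and every word is drawn with equal probability, so by construction the conversations share no topic. The naive analysis still crowns a "theme" every time.

```python
import random
from collections import Counter

# Hypothetical vocabulary: every word is equally likely, so no topic is built in.
VOCABULARY = ["weather", "football", "mortgage", "holiday", "dinner",
              "traffic", "school", "film", "garden", "work"]


def simulate_hubbub(n_conversations, words_each, seed):
    """Generate independent conversations, each a random handful of words."""
    rng = random.Random(seed)
    return [[rng.choice(VOCABULARY) for _ in range(words_each)]
            for _ in range(n_conversations)]


def extract_theme(conversations):
    """Naive frequency analysis: the most common word is declared the 'signal'."""
    counts = Counter(word for convo in conversations for word in convo)
    theme, _ = counts.most_common(1)[0]
    return theme, counts


if __name__ == "__main__":
    # Five different audiences, each of 1,000 unrelated conversations.
    for seed in range(5):
        theme, counts = extract_theme(simulate_hubbub(1000, 5, seed))
        share = counts[theme] / sum(counts.values())
        print(f"audience {seed}: 'theme' = {theme!r} ({share:.1%} of words)")
```

Each simulated audience yields a winner whose share sits only a whisker above the one-in-ten you would expect from chance. The "theme" is an artefact of insisting that there be one, not a property of the conversations.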

We say “we have unconscious biases and they inform our reactions”. Well, no shit.

Averages

So we tend to “extrapolate” central figures from random noise: economic growth. The intention behind expressed electoral preference. Average wages. The wage gap. Why the stock market went up. That the stock market went up: these are spectral figures. They are ghosts, gods, monsters and devils. They are no more real than religions merely because they are the product of “science” and “techne”.
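One way to see how an average can be a spectral figure is to count how many individuals actually resemble it. The sketch below rests on simplifying assumptions that are not in the text: a hypothetical population whose traits are independent and normally distributed, and an arbitrary band of 0.3 standard deviations either side of the mean as the test of being "average". Real traits are correlated, so treat the numbers as illustrative only.

```python
import random


def simulate_population(n_people, n_traits, seed):
    """Each person is a list of trait scores drawn independently from N(0, 1)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_traits)]
            for _ in range(n_people)]


def count_average_people(population, n_traits, band=0.3):
    """Count people within `band` standard deviations of the mean (0) on each
    of their first `n_traits` traits."""
    return sum(
        all(abs(score) <= band for score in person[:n_traits])
        for person in population
    )


if __name__ == "__main__":
    people = simulate_population(4000, 10, seed=1)
    for k in (1, 2, 3, 5, 10):
        matches = count_average_people(people, k)
        print(f"'average' on {k:2d} traits: {matches} of {len(people)}")
```

Under these assumptions roughly a quarter of the population is "average" on any one trait, but the count collapses towards zero as traits are added. The average of the aggregate is well defined; the average individual it seems to describe barely exists.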

We have, on occasion, some convenient proxies, but they are just proxies: for example, in an election, a manifesto. Without a manifesto, a binary vote for a single candidate in a local electorate (I am assuming FPP, but in honesty it isn’t wildly different for proportional representation) tells us nothing whatever about an individual voter’s motivation to vote as she did. A manifesto helps, by a process of deemery.

Did every Conservative voter read the party’s manifesto? Almost certainly, no. Did every Conservative voter who did read it subscribe to every line? Again, almost certainly no. Did anyone subscribe to every line in it? Perhaps, but by no means certainly. So, can we legitimately infer uniform support for the Conservatives’ manifesto from all who voted Conservative? No. We do so only by dint of the political convention that those who vote for a party are deemed to support its manifesto (if one is published). But even that convention is a spectre. And where your vote is an issue-based referendum, there is not even a manifesto. Who knows why 17.4 million people voted for Brexit? Who could possibly presume to aggregate all those individual value judgments into a single guiding principle? There were 17.4 million reasons for voting leave. They tell us nothing except... leave.

Yet in the delaminated Onworld, especially as it feeds back its simplified “signal” and thereby amplifies it, we draw our battle lines and attack based on these invented signals. We take them, and make them our own. We truck in archetypes of our own devising.[7]

So, to take the issue du jour (fools rush in, etc.), how you feel about gender identity might depend on how you envisage the quintessential gender-fluid individual: if you see an exotic, beautiful, fragile, elfin, teenaged creature of beguiling androgyny you will see trans people as harmless, vulnerable and in need of all the protections society can offer. If your personal archetype is a six-foot male self-identifying to compete in women’s sport, or to access women’s changing rooms, you will see trans people as predatory and dangerous.

The argument between people holding alternative visions will be fruitless.

Yet such patently ludicrous arguments animate the public squares in the Onworld.

Hence the delamination: the online world is a world of extruded ghoulish signals aggregated from the unfiltered noise of discourse. The offline world (can we call it the offworld?) is a world of bilateral conversations, one on one: a world of shades, nuance, detail, richness, complexity and, for the most part, civility.

Feedback loops: feeding that signal back into the memeplex, without necessarily surveilling it or taking anything out of it. This would include machine learning, AI, etc.

References

  1. For more on high-modernism see Jane Jacobs’ The Death and Life of Great American Cities and James C. Scott’s Seeing Like a State.
  2. A variety of this disposition is that a complex system is just a special type of complicated system, which in turn is just a special type of simple system. See Conway’s Game of Life.
  3. At least, until the algo goes rogue and becomes a Nazi.
  4. The JC draws a long, relativist bow here, in presuming that “the world” is something we construct from our own experiences, assumptions and language, and is not something ready made, “out there,” waiting to be discovered. There is a half-way house: “the world” is partly something out there, waiting to be discovered, and partly something we construct from our own experience, cultural heritage and so on. The pluralist approach works for either. The determinist approach depends on everything being determinate.
  5. James C. Scott, Seeing Like a State.
  6. Search for Extra-Terrestrial Intelligence. You know, Jodie Foster in Contact.
  7. Our personal conceptualisations of archetypes never quite map to the world: the “Google Disappointment Effect”, when an image search (or AI prompt) never quite returns the image you had in mind. This is a variation of the “no average fighter pilot” effect.