{{freeessay|systems|data modernism|{{i|Unnecessarily complicated gears a|gif|Explains everything}}}}{{nld}}
{{quote|I distrust all systematisers and avoid them. The will to a system is a lack of integrity.
:—Friedrich Nietzsche, ''Twilight of the Idols''}}
{{quote|if they will not understand that we are bringing them a mathematically faultless happiness, our duty will be to force them to be happy. But before we take up arms, we shall try the power of words.
:—Yevgeny Zamyatin, ''We'' (1924)}}
{{quote|
{{D|Data modernism|/ˈdeɪtə ˈmɒdənɪzm/|n|}}
The conviction that sufficiently powerful machines running sufficiently sophisticated [[algorithm]]s over sufficiently large quantities of [[data]] can, by themselves, solve the future.}}

A prelude to the [[great delamination]].
 
There is a strand of [[High modernism|high-modernist]] thought<ref>For more on high-modernism see {{br|The Death and Life of Great American Cities}} and {{br|Seeing Like a State}}.</ref> which holds that optimised human interaction can be derived mathematically from data: that all that has stopped it till now is the want of data and a powerful enough machine to crunch it.
 
This is a generalisation, but it finds expression in the [[The Singularity is Near|forthcoming singularity]], the [[simulation hypothesis]], the more breathless aspirations for [[Alpha Go]], [[Blockchain]] maximalism, and the slack-jawed wonder with which the world’s [[Thought leader|thought leaders]] regard [[AI]].
 
Data modernism’s underlying premise: the universe is a monstrously complicated but fundamentally mechanical clockwork: bounded, [[Finite and Infinite Games|finite]] and probabilistic. It is ''complicated'', not [[complex|''complex'']].<ref>A variety of this disposition holds that a [[complex system]] is just a special type of [[complicated system]], which in turn is just a special type of [[simple system]]. See [[Conway’s Game of Life]].</ref>
 
By this view the time is now close at hand when the means to calculate everything will be at our disposal. We now have the processing power to take colossal quantities of “[[noise]]” and from it extrapolate a pure, crystalline “[[Signal-to-noise ratio|signal]]”. We don’t necessarily understand ''how'' the machines will do this; just that they will, and far from [[algorithm|algorithmic]] inscrutability being a matter of concern, it is part of the appeal: there can be no “all-too-human” bias.<ref>At least, until the algo goes rogue and becomes a Nazi.</ref>
 
Nutshell: there is a belief which stretches from paid-up Randian anarcho-capitalists to certified latter-day effective altruists, that ''we can solve our problems with data''.

===Summary===
Your attitude towards data influences how you organise — construct<ref>The JC draws a long, relativist bow here, in presuming that “the world” is something we construct from our own experiences, assumptions and language, and is not something ready-made, “out there”, waiting to be discovered. There is a half-way house: “the world” is partly something out there, waiting to be discovered, and partly something we construct from our own experience, cultural heritage and so on. The pluralist approach works for either. The determinist approach depends on everything being determinate.</ref> — the world:

Determinists build out of history, from a “best available” present — wherever we happen to be is, ipso facto, the greatest accumulation of knowledge yet known by humankind, and it just keeps getting better as we are drawn towards an extrapolated, ever more complete future. They build for efficiency and without tolerance for alternative explanations, as they introduce only doubt and confusion where there should be none. Progress is the erosion of confusion, doubt and uncertainty. There is no tolerance for doubt, or for anything that would imply waste, misapprehension or error.

Pluralists build away from an unsatisfactory present, and are agnostic about the distant future as long as the nearby one improves upon the shortcomings of the now. They — we — build for flexibility: with tolerance, because we don’t know what will happen next, or how we will view what happened in the past, so we need room to adjust. Tolerance implies open-mindedness, and a commitment to rebuild our world to best fit the circumstances as we find them.
 
===Data ''modernism''? Or ''post''-modernism?===
An initial objection to the label: in [[James C. Scott]]’s classic account of [[high-modernism]]<ref>{{Br|Seeing Like A State}}</ref> there is a top-down, beneficent, controlling human mind of some kind with a pre-existing theory of the game. That central intelligence has derived a theory from deterministic first principles; a sort of [[cogito ergo sum]] begets [[income tax and rice pudding]] begets a mechanised [[High modernist|modernist]] way of life. The housing project, or five-year plan, or Ministry of Truth is an implementation of that pre-existing theory.
 
In “data modernism” the controlling human mind does a different job: it no longer needs a pre-existing theory of the game. It delegates — or, at any rate, ''yields'' — that responsibility to an ineffable ''[[algorithm]]''. The problem is solved not by theory-dependent syllogism, but by a “black box” neural network operating at a scale, speed and depth that is quite opaque to mortal hand or eye.
 
The “controlling mind” need not know how the algorithm works or how it comes to its conclusions. Data modernists are gripped by the conviction that, being the summed and filtered output of the collected [[wisdom of the crowd]], the algorithm has a greater intelligence than any single controlling mind anyway.
 
High modernism — a type of top-down, controlling, conviction politics — thereby seems sufficiently different to “data modernism” — agnostic, open-minded, conditional, following the evidence rather than shaping it — that we shouldn’t lump them together. What the JC calls “data modernism” is, on this view, more like [[Post-modernism|''post''modernism]], or ''post''-[[Post-modernism|postmodernism]].
 
But this is a difference of emphasis, not upshot. It is a different path to the same place. Data science is just a new conveyance to the same reductionist theory of the world.
 
=== Unstructured data as hubbub ===
Now data, as it comes, is an incoherent, imperfect, meaningless thing. It is the pre-movie chat before the lights go down; a “hubbub” made up of millions of individual interactions, each of which ''may'' have its own meaning — or may be incoherent, or wrong-headed, or irrelevant — but which, when aggregated and taken as an unmoderated whole, has no particular meaning at all, beyond “people are talking”.
 
Now imagine being asked to take that audience hubbub and, with magical tools, condense it to the single proposition: “what was this audience thinking?” But the interactions are unstructured, as between themselves random and disconnected. Obviously, there ''is'' no thread. There is no single meaning.
 
But this is what the algorithm is supposedly doing when it extracts signal from noise. Selectively, it filters, limits, compresses and amplifies on the presumption that there ''is'' a signal to find among the noise; that all the conversations in that hubbub do boil down to some common sentiment, and that those which don’t are no more than noise: that the hubbub is something like a de-tuned radio, or the white noise in the SETI<ref>Search for Extra-Terrestrial Intelligence. You know, Jodie Foster in ''Contact''.</ref> data, buried within which are signals from pulsars, quasars and intelligent life.
 
But the hubbub is not like that. It can’t be reduced to prime factors. There is no common signal. SETI is a bad [[metaphor]]: it tries to detect a single bilateral signal from a spectrum of other kinds of radiation that are not a signal, but are broadcast on the same frequency. With the human hubbub ''it is all signal''. It is just that all the signals conflict, or miss each other, or bear no relation to each other at all. There is a spectrum of unconnected communications and ''no'' real “signal”. We are not trying to isolate a single conversation out of all the other ones — that would be the direct analogy — but to extract an aggregated message that is not actually there, and to treat it as an [[Emergence|emergent]] property of all those conversations. This is a different thing entirely.
 
''There is no truth from millions of unrelated conversations''. The result is brown, warm and even: maximum ''entropy''.
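
For the statistically inclined, here is a toy sketch of the point (a minimal illustration; the “conversations” and their “sentiments” are invented for the purpose): average a million unrelated, individually meaningful signals and you get a number that describes none of them.

<syntaxhighlight lang="python">
import random

random.seed(42)  # reproducible noise

# A million "conversations", each with its own definite, private
# meaning, caricatured here as a sentiment between -1 and +1.
conversations = [random.uniform(-1, 1) for _ in range(1_000_000)]

# The aggregator's "signal" is just the average of the lot.
aggregate = sum(conversations) / len(conversations)

print(f"aggregate sentiment: {aggregate:+.4f}")
# Prints a number very close to +0.0000. Every conversation meant
# something; the aggregate means nothing, and no individual meaning
# can be recovered from it.
</syntaxhighlight>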
 
To make something out of nothing is to ''deliberately'' bias. It is to carve David out of a marble block. Bias ''creates'' meaning. There may be ''local'' meanings — maybe — based on local interactions and echo chambers, but these are informal, incomplete, and impossible to delimit.
But the machine nonetheless extracts one — spurious correlations or just some kind of frequency analysis pulls out some themes.
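
How might the machine manage it? A hedged sketch (the two hundred random “trends” below are invented stand-ins for topics in the hubbub): trawl enough unrelated series and strong correlations turn up by chance alone, ready to be reported as themes.

<syntaxhighlight lang="python">
import random

random.seed(0)

def pearson(x, y):
    """Plain Pearson correlation coefficient, no libraries required."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# 200 entirely unrelated "trends", 20 observations each.
series = [[random.gauss(0, 1) for _ in range(20)] for _ in range(200)]

# Trawl all ~20,000 pairs and keep the most impressive correlation.
strongest = max(
    (abs(pearson(series[i], series[j])), i, j)
    for i in range(len(series))
    for j in range(i + 1, len(series))
)
print(f"best 'theme': r = {strongest[0]:.2f} "
      f"(series {strongest[1]} vs {strongest[2]})")
# Expect an r above 0.7 or so: a striking "connection" between two
# series that are, by construction, pure noise.
</syntaxhighlight>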
 
''Now imagine'' feeding that single confabulated sentence back to all the theatre patrons to say “this is the issue which the theatre was debating. Now, which side were you on?”
 
It is human nature to read that against your personal situation and come to a view — as if reading a horoscope. Suddenly, everyone in the cinema ''does'' have a view. They will invest in that conversation.
 
But the hubbub was just noise all along. None of the individual conversations had anything to do with each other. All had their own, independent meanings. They are ''immune'' to aggregation.
 
We say “we have unconscious biases and they inform our reactions”. Well, no ''shit''.
 
===“[[Onworld]]” v “[[offworld]]” in business communications ===
[[File:Onworld and Offworld Comms.png|450px|thumb|right|A quadrant, yesterday. I’m no happier about it than you are]]
The same dynamic exists in a [[negotiation]]. The JC snookered himself into using a [[quadrant|four box quadrant]] to illustrate this, because there are two perpendicular axes at play here: ''How many'' people are you speaking to, and ''in what medium''.
 
In terms of our Onworld/Offworld distinction let us make some value judgments here: whether we like it or not, we inhabit a [[Complexity|complex]], non-linear world. In such a world, personal, immediate, and ''substantive'' communications beat impersonal, delayed, and formalistic ones. These best suit constructive, pragmatic, expert participants.
 
Now your medium of communication can take a more or less ''personal'' and ''immediate'' form. The ''least'' personal and immediate communications are ''written'' ones: here the message is, literally, removed from the sender’s personality and, even where transmitted immediately, does not have to be answered in real time. (There is a spectrum of formality in writing, too: instant messages at one end; couriered paper at the other.) The ''most'' personal and immediate is in actual, analogue person — like that ever happens these days — and failing that, a video call, where you can ''see'' and ''hear'' nuance, then an audio call, where you can just ''hear'' it. But any of these is vastly superior to written communication.
 
How ''many'' people are in your audience is just as important. The more there are, the more formal you must be, the more generalised, and the less opportunity there is for nuance and that lubricating milk of human frailty, wit. The more people there are, the less will be their common interest — cue appeals to take things off-line. Plainly, the more people there are, the higher the cultural, social and human barriers to unguarded communication rise.
 
By any gauge of communicative effectiveness, other than information dissemination, ''one-to-many'' is categorically worse than ''one-to-one''.
 
===Averages===
So we tend to “extrapolate” central figures from random noise: economic growth. The intention behind expressed electoral preference. Average wages. The wage gap. Why the stock market went up. ''That'' the stock market went up.

These are spectral figures. They are ghosts, gods, monsters and devils. They are no more real than religions, for all that they are the product of “science” and “techne”.
 
We have, on occasion, some convenient proxies, but they are just proxies: for example, in an election, a manifesto. Without a manifesto, a binary vote for a single candidate in a local electorate (I am assuming first-past-the-post, but in honesty it isn’t wildly different for proportional representation) tells us nothing whatever about the individual’s motivation to vote as she did. A manifesto helps, by a process of [[Deemery|deem]]ery.
 
Did every Conservative voter read the party’s manifesto? Almost certainly, no. Did every Conservative voter who did read it subscribe to every line? Again, almost certainly no. Did ''anyone'' subscribe to every line in it? Perhaps, but by no means certainly. So, can we legitimately infer uniform support for the Conservatives’ manifesto from all who voted Conservative? ''No''. We do so only by dint of the political convention that those who vote for a party are deemed to support its manifesto (if one is published). But even that convention is a spectre. And where your vote is an issue-based referendum, there is not even a manifesto. Who knows why 17 million people voted for Brexit? Who could possibly presume to aggregate all those individual value judgments into a single guiding principle? There were 17 million reasons for voting leave. They tell us nothing except... ''leave''.
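
A back-of-the-envelope way to see the information loss (the figure of 1,000 motivations is plucked from the air for illustration): a ballot reveals at most one bit, however rich the reasoning behind it.

<syntaxhighlight lang="python">
import math

# A referendum ballot encodes exactly one bit: leave or remain.
bits_per_ballot = math.log2(2)

# Suppose, conservatively, that each voter's actual motivation is one
# of 1,000 distinguishable positions: sovereignty, immigration,
# protest, a dislike of the other side's bus, a coin flip...
bits_per_motivation = math.log2(1000)

print(f"bits revealed by each vote: {bits_per_ballot:.0f}")     # 1
print(f"bits behind each vote:      {bits_per_motivation:.1f}") # ~10.0
# The other nine-odd bits are destroyed at the ballot box, and no
# aggregation over 17 million ballots can reconstruct what was never
# encoded in them.
</syntaxhighlight>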
 
And yet in the delaminated [[Onworld]] — especially as it feeds back its simplified “signal” and thereby amplifies it — we draw our battle lines and attack based on these invented signals. We take them, and make them our own. We truck in archetypes of our own devising.<ref>Our personal conceptualisations of archetypes never quite map to the world: hence the “Google Disappointment Effect”, when an image search (or an AI prompt) never quite returns the image you had in mind. This is a variation of the “no average fighter pilot” effect.</ref>
 
So, to take the ''issue du jour'' — fools rush in, etc. — how you feel about gender identity might depend on how you envisage the quintessential gender-fluid individual. If you see an exotic, beautiful, fragile, elfin, teenaged creature of beguiling androgyny, you will see trans people as harmless, vulnerable and in need of all the protections society can offer. If your personal archetype is a six-foot male self-identifying to compete in women’s sport, or to access women’s changing rooms, you will see trans people as predatory and dangerous.
 
The argument between people holding alternative visions will be fruitless.
 
Yet such patently ludicrous arguments animate the public squares in the [[Onworld]].
 
Hence the delamination: the online world is a world of extruded, ghoulish signals aggregated from the unfiltered noise of discourse. The offline world — can we call it the offworld? — is a world of bilateral conversations, one on one. A world of shades, nuance, detail, richness, complexity and — for the most part — civility.
 
Feedback loops compound the effect: the extracted signal is fed back into the memeplex, without anyone necessarily surveilling it or taking anything out of it. Data modernism, so understood, takes in machine learning, [[AI]] and the rest.
 
{{Sa}}
*The [[cult of the average]]
*[[Great delamination]]
*[[High modernism]]
{{Ref}}
