Template:M intro technology rumours of our demise



==== Division of labour, redux ====
So, about that “[[division of labour]]”. When it comes to mechanical tasks of the “body”, [[Turing machine]]s scale well, and humans scale badly.


===== Body scaling =====
“Scaling”, when we are talking about computational tasks, means doing them over and over again, in series or parallel, quickly and accurately. Each operation can be identical; their combined effect astronomical. Nvidia graphics chips are so good for AI because they can do 25,000 ''trillion'' basic operations per second. ''Of course'' machines are good at this: this is why we build them. They are digital: they preserve information indefinitely, however many processors we use, with almost no loss of fidelity.
 
You could try to use networked humans to replicate a [[Turing machine]], but the results would be slow, costly, disappointing and the humans would not enjoy it.<ref>{{author|Cixin Liu}} runs exactly this thought experiment in his ''Three Body Problem'' trilogy.</ref> With each touch the [[signal-to-noise ratio]] degrades: this is the premise for the parlour game “Chinese Whispers”.   
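The digital/analogue contrast is easy to simulate. The toy sketch below (the 5% per-hop error rate is a purely illustrative assumption, not a measurement of anything) relays a message along a chain of copiers: a “digital” chain reproduces it perfectly however many hops it survives, while an “analogue”, Chinese-Whispers chain corrupts a little of the signal at every touch.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def analogue_copy(message, error_rate, rng):
    """One noisy human relay: each character has a small chance of
    being misheard and replaced with a random character."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < error_rate else ch
        for ch in message
    )

def relay(message, hops, error_rate=0.05, seed=0):
    """Pass the message along a chain of relays. An error rate of 0
    models digital copying; anything above it models analogue copying."""
    rng = random.Random(seed)
    for _ in range(hops):
        message = analogue_copy(message, error_rate, rng)
    return message

def fidelity(original, received):
    """Fraction of characters that survive the chain unchanged."""
    return sum(a == b for a, b in zip(original, received)) / len(original)

msg = "the quick brown fox jumps over the lazy dog"
print(relay(msg, hops=100, error_rate=0.0) == msg)  # digital: True, however many hops
print(fidelity(msg, relay(msg, hops=1)))            # analogue: high after one hop
print(fidelity(msg, relay(msg, hops=50)))           # analogue: degrades with every hop
```

The fixed seed just makes the run repeatable; the qualitative result — fidelity falling with each analogue hop, and not at all with digital hops — holds for any seed.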


A game of Chinese Whispers among a group of [[Turing machine]]s would make for a grim evening.
In any case, you could not assign a human, or any number of humans, the task of “cataloguing the entire canon of human creative output”: this is quite beyond their theoretical, never mind practical, ability. With a machine, at least in concept, you could.<ref>Though the infinite fidelity of machines is overstated, as I discovered when trying to find the etymology of the word “[[satisfice]]”. Its modern usage was coined by Herbert Simon in a paper in 1956, but the Google ngram suggests its usage began to tick up in the late 1940s. On further examination, the records turn out to be optical character recognition errors. So there is a large part of the human oeuvre — the pre-digital bit that has had to be digitised — that does suffer from analogue copy errors.</ref>


===== Mind scaling =====
“Scaling”, for imaginative tasks, is different. Here, humans scale well. We don’t want identical, digital, high-fidelity ''duplications'': ten thousand copies of ''Finnegans Wake'' will contribute no more to the human canon than does one.<ref>Or possibly, even ''none'': Wikipedia tells us that, “due to its linguistic experiments, stream of consciousness writing style, literary allusions, free dream associations, and abandonment of narrative conventions, ''Finnegans Wake'' has been agreed to be a work largely unread by the general public.”</ref> Multiple humans doing “mind stuff” contribute precisely that idiosyncrasy, variability and difference in perspective that generates the wisdom, or brutality, of crowds. A community of readers can independently parse, analyse, explain, narratise, contextualise, extend, criticise, extrapolate, filter, amend, correct, and improvise the information ''and each other’s reactions to it''.


This community of expertise is what [[Sam Bankman-Fried]] misses in dismissing Shakespeare’s damning “[[Bayesian prior|Bayesian priors]]”. However handsome William Shakespeare’s personal contribution was to the Shakespeare canon — the text of the actual folio<ref>What counts as “canon” in Shakespeare’s own written work is a matter of debate. There are different, inconsistent editions. Without Shakespeare to tell us, we must decide for ourselves. Even were Shakespeare able to tell us, we could ''still'' decide for ourselves.</ref> — it is finite, small and historic. It was complete by 1616 and hasn’t been changed since. The rest of the body of work we think of as “Shakespeare” is a continually growing body of interpretations, editions, performances, literary criticisms, essays, adaptations and its (and their) infusion into the vernacular. This ''vastly outweighs the bard’s actual folio''.


This kind of “communal mind scaling” creates its own intellectual energy and momentum from a small, wondrous seed planted and nurtured four hundred years ago.<ref>On this view Shakespeare is rather like a “prime mover” who created a universe but has left it to develop according to its own devices.</ref>   


The wisdom of the crowd thus shapes itself: community consensus has a directed intelligence all of its own. It is not [[utopia|magically benign]], of course, as [[Sam Bankman-Fried]] might tell us, having been on both ends of it.<ref>See also Lindy Chamberlain, Peter Ellis and the sub-postmasters wrongly convicted in the Horizon debâcle.</ref>
===Bayesian priors and the canon of ChatGPT===
A last point on literary theory: the “[[Bayesian priors]]” argument that fails for Shakespeare also fails for a [[large language model]].


Just as most of the intellectual energy needed to render a text into the three-dimensional [[metaphor]]ical universe we know as ''King Lear'' comes from the surrounding cultural milieu, so it does with the output of an LLM. The source, after all, is entirely drawn from the human canon. A model trained only on randomly assembled ASCII characters would return only randomly assembled ASCII characters.


But what if the material is not random? What if the model augments its training data with its own output? Might that create an apocalyptic feedback loop, whereby LLMs bootstrap themselves into some kind of hyperintelligent super-language, beyond mortal cognitive capacity, whence the machines might dominate human discourse?


Are we inadvertently seeding ''Skynet''?
 
Just look what happened with [[AlphaGo]]. Its successor, AlphaGo Zero, didn’t require ''any'' human training data: it learned by playing millions of games against itself. Programmers just fed it the rules, switched it on and, with indecent brevity, it worked everything out and walloped even the version that had beaten the game’s ruling grandmaster.
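AlphaGo’s real machinery — deep networks guided by tree search and reinforcement learning from self-play — is far beyond a few lines, but the “given only the rules, work out the rest” idea can be sketched on a trivially small game. The toy below (an illustration of exhaustive game-tree search, not anything DeepMind built) is shown only the rules of one-pile Nim, sees no human games at all, and still rediscovers the classic winning strategy: leave your opponent a multiple of four.

```python
from functools import lru_cache

# One-pile Nim: players alternate removing 1, 2 or 3 counters;
# whoever takes the last counter wins.

@lru_cache(maxsize=None)
def wins(pile):
    """True if the player to move can force a win, given only the rules."""
    return any(not wins(pile - take) for take in (1, 2, 3) if take <= pile)

def best_move(pile):
    """A move leaving the opponent in a losing position, if one exists."""
    for take in (1, 2, 3):
        if take <= pile and not wins(pile - take):
            return take
    return 1  # every move loses against perfect play; take one and hope

# With no training data at all, the search derives the known strategy:
# positions that are multiples of 4 are lost for the player to move.
print([pile for pile in range(1, 13) if not wins(pile)])  # → [4, 8, 12]
print(best_move(10))  # → 2 (leaving 8, a multiple of 4)
```

The contrast with the surrounding argument is the point: this works precisely because the game is tiny, closed and fully determined — the “tame” environment discussed below.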
 
Could LLMs do that? This fear has been with us for a while.


{{Quote|{{rice pudding and income tax}}}}


But brute-forcing outcomes in a fully bounded, [[Zero-sum game|zero-sum]] environment with simple, fixed rules — in the jargon of [[Complexity|complexity theory]], a “tame” environment — is exactly what machines are designed to do. We should not be surprised that they are good at this, nor that humans are bad at it.


To see this as a fair comparison is to misdirect ''ourselves'': to suspend disbelief willingly. ''This is exactly where we would expect a Turing machine to excel''.

By contrast, LLMs must operate in complex, “[[wicked]]” environments. Here conditions are unbounded, ambiguous, inchoate and impermanent. ''This is where humans excel''. The situation continually changes. The components interact with each other to make the landscape dance. Here, narratising is an advantage: brute-force mathematical computation won’t do.

{{Quote|Think how hard physics would be if particles could think.
:— Murray Gell-Mann}}


And nor does it: an LLM works by compositing a synthetic output from a massive database of pre-existing text. It must pattern-match against well-formed human language. Degrading its training data will progressively degrade its output. Such “model collapse” is an observed effect.<ref>https://www.techtarget.com/whatis/feature/Model-collapse-explained-How-synthetic-training-data-breaks-AI</ref> LLMs will only work for humans if they’re fed human generated content.
{{Quote|{{AlphaGo v LLM}}}}
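Model collapse can be seen in miniature without any neural network at all. In the toy sketch below the “model” is nothing but a normal distribution fitted to its training data, and every generation is trained solely on samples drawn from the previous generation’s model — the sample size and generation count are arbitrary illustrative choices, not parameters of any real system. The spread of the data, its diversity, withers away when the model feeds on its own output.

```python
import random
import statistics

def train_and_sample(data, n_samples, rng):
    """'Train' a toy model by fitting a normal distribution to the data,
    then emit a synthetic dataset drawn from the fitted model."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    return [rng.gauss(mu, sigma) for _ in range(n_samples)]

rng = random.Random(0)

# Generation 0: 'human' data, widely spread.
data = [rng.gauss(0.0, 1.0) for _ in range(20)]
initial_spread = statistics.pstdev(data)

# Every later generation trains only on the previous generation's output.
for _ in range(200):
    data = train_and_sample(data, 20, rng)

final_spread = statistics.pstdev(data)
print(f"initial spread: {initial_spread:.3f}")
print(f"spread after 200 self-trained generations: {final_spread:.6f}")
```

Each refit loses a little of the distribution’s tails, and with no fresh human data there is nothing to restore them — the same one-way ratchet, in caricature, that the model-collapse literature describes for LLMs.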


There is another contributor to the cultural milieu surrounding any text: the ''reader''. It is the reader, and her “[[cultural baggage]]”, who must make head and tail of the text. She alone determines, for her own case, whether it stands or falls. This is true however rich the cultural milieu that supports the text.
 


Construing natural language, much less visuals or sound, is no matter of mere [[Symbol processing|symbol-processing]]. Humans are ''not'' [[Turing machine|Turing machines]]. A text only sparks meaning, and becomes art, in the reader’s head.


We know this because the overture from ''Tristan und Isolde'' can reduce different listeners to tears of joy or boredom. One contrarian can see in the Camden Cat a true inheritor of the great blues pioneers; others might see an unremarkable busker.


This is as true of magic — the conjurer’s trick is to misdirect her audience into ''imagining'' something that isn’t there: the magic is supplied by the audience — as it is of ''digital'' magic. We imbue what an LLM generates with meaning. ''The meatware is doing the heavy lifting''.


It turns out that sterile undergraduate debate about the nature of meaning and art is now critically important.


===A real challenger bank===