Go

From The Jolly Contrarian



A complicated system, so hard to brute-force solve (a lot harder than draughts or chess) that it is, in Daniel Susskind’s mind, compelling evidence that we will shortly be good for little more than pleasant intellectual onanism while the machines harvest our vital fluids for battery juice and wage wars on each other above ground in a post-apocalyptic wasteland.

But Go and chess are very different environments from the unbounded, uncertain world in which we find ourselves.

Here is a beautifully succinct post from Redditor felis-parenthesis on the differences (emphasis ours):

I see a difference between large language models and Alpha Go learning to play superhuman Go through self-play.

When Alpha Go adds one of its own self-vs-self games to its training database, it is adding a genuine game. The rules are followed. One side wins. The winning side did something right.

Perhaps the standard of play is low. One side makes some bad moves, the other side makes a fatal blunder, the first side pounces and wins. I was surprised that they got training through self-play to work; in the earlier stages the player who wins is only playing a little better than the player who loses and it is hard to work out what to learn. But the truth of Go is present in the games and not diluted beyond recovery.
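The point about self-play having an anchor in reality can be sketched with a toy game. This is a hypothetical illustration, not Alpha Go’s actual pipeline: two weak (here, random) policies play a trivial Nim variant, and the hard-coded rules alone decide who won, so every generated game carries a reliable label regardless of how badly either side played.

```python
import random

def self_play_game(policy_a, policy_b, start=10):
    """Play one game of a toy Nim variant: players alternately take
    1-3 stones; whoever takes the last stone wins. The rules, not the
    players, decide the winner -- that is the ground-truth signal."""
    stones, moves, player = start, [], 0
    policies = [policy_a, policy_b]
    while stones > 0:
        take = policies[player](stones)
        take = max(1, min(take, 3, stones))  # rules enforced: illegal moves clipped
        stones -= take
        moves.append((player, take))
        if stones == 0:
            return moves, player  # winner fixed by the rules themselves
        player = 1 - player

def random_policy(stones):
    # A deliberately weak policy: the standard of play is low,
    # but the truth of the game is still present in the record.
    return random.randint(1, 3)

random.seed(0)
# Each finished game is a genuine game: the rules were followed and
# exactly one side won, so it can join the training set with a
# trustworthy label, however poor the moves were.
training_set = [self_play_game(random_policy, random_policy)
                for _ in range(100)]
```

An LLM retraining on its own text has no analogue of that final, rules-decided label, which is the drift the post goes on to describe.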

But an LLM is playing a post-modern game of intertextuality. It doesn’t know that there is a world beyond language to which language sometimes refers. Is what an LLM writes true or false? It is unaware of either possibility. If its own output is added to the training data, that creates a fascinating dynamic. But where does it go? Without Alpha Go’s crutch of the “truth” of which player won the game according to the hard-coded rules, I think the dynamics have no anchorage in reality and would drift, first into surrealism and then psychosis.

One sees that Alpha Go is copying the moves that it was trained on and an LLM is also copying the moves that it was trained on and that these two things are not the same.[1]

References

  1. An excellent post by user:felis-parenthesis on Reddit.