Talk:The Bayesian

From The Jolly Contrarian

Cambridge brainboxes and their supercomputers

Mike Lynch and Richard Gaunt,[1] a pair of Cambridge neural networks postdocs, founded British tech darling Autonomy in 1996.

Autonomy’s flagship product was IDOL: an “intelligent data operating layer” platform using advanced — for the time — pattern recognition techniques and algorithmic Bayesian inference to extract structured data from the morass of unstructured information with which the modern networked world is inundated: documents, correspondence, contracts, instant messages, email and phone data.

Through IDOL, Autonomy promised a revolution: granular business intelligence from a previously unnavigable swamp.

Now, socially awkward brainboxes hawking monstrous machines powered by mystical tools like “Bayesian inference” to convert oceans of muck into premium brass had quite the cachet in 2005. They have quite the cachet now, come to think of it.

A major target was legal departments. One of IDOL’s obvious applications was in document management — a subject dear to the heart of any budding legal operationalist — so, unusually, legal eagles were a prime subject for the marketing blitz. In a sentence: awesome pitch, pricey product, disappointing demo.

Got tickets to White Hart Lane though, and a catbird seat as a hurricane force corporate scandal unfolded. In 2011 Hewlett Packard pounced, snapped Autonomy up, then got a serious case of buyer’s remorse. Fast.

Swooned, no doubt, by Autonomy’s honeyed presentations foretelling a future of hoverboards, DeLoreans and machines of loving grace watching over us, effortlessly converting the dreary white noise of our internal monologue into monetisable intelligence, and persuaded Autonomy was uniquely placed to sweep the board, HP paid US$11 billion — 70% over stock market valuation — to ride the train.

Plainly the market had a more sober view, but even that was optimistic, as events would demonstrate.

It may not surprise the reader — though it did surprise HP’s executive — that the acquisition did not go well. It was, indeed, a disaster: within a year Lynch had been defenestrated and HP had written down its investment by $8.8 billion. Six years later, HP had hocked the whole thing off.

Cue all manner of litigation. HP shareholders sued the HP board for negligent management. The SFO, SEC and FBI launched criminal investigations. HP sued Lynch and his management team, accusing them of misleading HP into overpaying for the company through fraudulent accounting and overstated earnings. Lynch countered that HP had misunderstood the product, integrated it poorly into the HP business and then mismanaged it after acquisition.

The various actions rumbled on for years. In 2022, having spent a further USD100m prosecuting it, HP — somewhat pyrrhically — was awarded a billion of the five billion dollars in damages it had sought from Lynch. Lynch appealed.

In the meantime the FBI brought criminal charges against Lynch and Autonomy’s former Vice President of Finance Stephen Chamberlain for wire fraud and conspiracy and, in May 2023, extradited the pair to face trial in the US. The criminal standard of proof being much higher than the civil one, things went better for the Autonomy executives and, in June 2024, Lynch and Chamberlain were acquitted on all counts.

After the acquittal, Chamberlain returned to his home in Cambridgeshire while, by way of celebration, Lynch treated his daughter and some close friends to a fortnight on his superyacht, The Bayesian, in the Mediterranean.

On 17 August 2024 Stephen Chamberlain was hit by a car while jogging in Longstanton, Cambridgeshire. He died of his injuries two days later.

Early on the morning of the same day, while at anchor half a mile off the port of Porticello on the north coast of Sicily, The Bayesian was hit by a freak storm and, in the space of about 16 minutes, capsized and sank, tragically killing Lynch, his daughter and five others.

You could not, many were inclined to believe, make it up.

Conspiracy theory

The improbable circumstances of these accidents, within two days of each other and just weeks after their acquittals, raised eyebrows. It seemed, if nothing else, to be an extraordinary coincidence, although the mainstream media quickly rationalised that an actual conspiracy here was highly unlikely — no one had anything obvious to gain, for one thing, and orchestrating any freak storm at all, let alone one powerful enough to capsize and sink a 55-metre, 550-ton yacht, is likely beyond the capacity of an organisation like Hewlett Packard, which apparently struggles with basic commercial due diligence. Plus, if you did want to “off” a business executive, there are far easier ways of doing it than summoning up a fatal storm.

Yet, still, the unfiltered maw of uninformed public speculation — from which I write, dear correspondent — found this all very, well, fishy.

That must have been some kind of storm.

What sort of typhoon causes a $40m state-of-the-art superyacht, crewed by experienced professional seamen, to sink, in sixteen minutes, while at anchor? Has this sort of thing ever happened before?

What, in other words, are the odds?

And so we find ourselves looping reflexively into one of an improbably large number of ironies — oops: there’s another one — with which this story is shot through:

The odds come down to a question of Bayesian inference.

Conspiracy v irony: Bayesian reasoning

We hear more and more about “Bayesian reasoning”. This method of statistical reasoning, derived from the work of Thomas Bayes, an (at the time) obscure 18th-century Presbyterian minister from Tunbridge Wells, now dominates modern information technology and was important enough to Mike Lynch’s business for him to name his superyacht in its honour.

Bayesian reasoning
/beɪːzˈiːən ˈriːzᵊnɪŋ/ (n.)

A method of statistical inference that updates the probability of a hypothesis as more information about it becomes available. Bayesian reasoning makes comprehensive use of available information. It:

  1. improves risk assessment and forecasting;
  2. is used in artificial intelligence to allow machine learning algorithms to learn and adapt from their data more effectively; and
  3. helps evaluate evidence and enables more informed judgments, ensuring that decisions are based on a comprehensive analysis of all available information.
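To make "updating a probability as more information becomes available" concrete, here is a minimal sketch of a single Bayesian update in Python. The function name `bayes_update` and every number in it are hypothetical, chosen only to illustrate the arithmetic; nothing here is drawn from IDOL or any real data set.

```python
from fractions import Fraction

def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) using Bayes' rule."""
    # Probability of seeing the evidence AND the hypothesis being true.
    numerator = prior * p_evidence_if_true
    # Total probability of seeing the evidence, whether or not it is true.
    evidence = numerator + (1 - prior) * p_evidence_if_false
    return numerator / evidence

# Hypothetical figures: a hypothesis we initially rate at 1 in 100, and a
# piece of evidence that is nine times likelier if the hypothesis is true.
prior = Fraction(1, 100)
first = bayes_update(prior, Fraction(9, 10), Fraction(1, 10))
# Yesterday's posterior becomes today's prior when more evidence arrives.
second = bayes_update(first, Fraction(9, 10), Fraction(1, 10))

print(first)   # 1/12
print(second)  # 9/20
```

The point of the sketch is the chaining: each posterior is fed back in as the next prior, so a long-odds hypothesis can become a near even bet after only two pieces of moderately strong evidence.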

Autonomy’s IDOL platform used Bayesian inference to manage and analyse unstructured data, enabling it to make probabilistic predictions and improve information retrieval.

Bayesian inference is not, of course, flawless: prior probabilities can still turn out to be misleading. Correct application of Bayes’ principles might have told Lynch, as they dropped anchor at Porticello, that his yacht was most unlikely to sink in the night.

Goats and Ferraris

There is no better example of Bayesian inference than the Monty Hall problem. This is either trite, or completely Earth-shattering, depending on whether you have come across it before or not. It indicates, too, how tremendously bad our instinctive grasp of probabilities is:

You are a game-show contestant. The host shows you three doors and tells you: “Behind one of those doors is a Ferrari. Behind the other two are goats. [Why goats? — Ed] You may choose one door.”

Knowing you have a ⅓ chance, you choose a door at random.

Before revealing what is behind your door, the host stops, and theatrically opens one of the other doors, to reveal a goat. There are two closed doors left. She offers you the chance to reconsider your choice.

Do you stick with your original choice, switch, or does it not make a difference?

Intuition suggests it makes no difference. At the beginning, each door carried an equal probability, ⅓, and after the reveal, the remaining doors still do: ½. So, while your odds have improved, it still doesn’t matter whether you stick or twist. The odds are now ½ for each remaining door. Right?

Wrong. Bayesian reasoning shows our intuition is wrong. The best odds are if we switch.

Staying put is to commit to a choice you made when the odds were worse.

There are two categories of door: ones you chose, and ones you didn’t. Because there is only one door in the “chosen” category and two doors in the “not chosen” category, and at the beginning all doors were equally likely to hide the car, we know the odds between the categories were: chosen: ⅓; not chosen: ⅔.

Then Monty opens one of the “not chosen” doors. You have no further information about your chosen category, but you do about the unchosen category: one of its two doors hides a goat, and so definitely has a zero chance of concealing the car. Therefore, within the unchosen category, all the chance sits behind the unopened door.

The probabilities between the chosen and not chosen categories remain ⅓ and ⅔: it is just that all of the ⅔ chance is now vested in the one remaining unchosen door. You have a better chance of winning the car (though not a certainty) if you switch.
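The category argument can also be written as an explicit application of Bayes' rule. The sketch below is illustrative only: the function name `monty_posterior` is made up, the doors are numbered 1 to 3 for convenience, and it assumes that when the host has two goats to choose between he picks one at random.

```python
from fractions import Fraction

def monty_posterior(chosen=1, opened=3):
    """Posterior probability that the car is behind each of three doors."""
    if chosen == opened:
        raise ValueError("the host never opens the contestant's door")
    doors = (1, 2, 3)
    # Before anything happens, every door is equally likely to hide the car.
    prior = {door: Fraction(1, 3) for door in doors}

    def likelihood(car):
        # Probability that the host opens `opened`, given where the car is.
        if car == opened:
            return Fraction(0)      # the host never reveals the car
        if car == chosen:
            return Fraction(1, 2)   # two goats available (assumed coin toss)
        return Fraction(1)          # only one door the host may open

    unnormalised = {door: prior[door] * likelihood(door) for door in doors}
    total = sum(unnormalised.values())
    return {door: weight / total for door, weight in unnormalised.items()}

posterior = monty_posterior(chosen=1, opened=3)
for door, probability in posterior.items():
    print(f"door {door}: {probability}")
```

With door 1 chosen and door 3 opened, the posterior comes out at ⅓ for door 1, ⅔ for door 2 and zero for door 3: the same split as the chosen and not chosen categories.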

It is true: a person who now arrives and is given the choice, without your prior knowledge, would calculate the probabilities at 50:50. But she is ignorant of your original choice and the decision the host made, based on your original choice (remember the door opened by the host depended on your choice: the rule was “open a door that the contestant did not choose and that does not conceal the Ferrari”). Without that information, from the newcomer’s perspective, the odds really are 50:50.
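For the sceptic at the pub, a simulation may persuade where argument fails. This is a rough sketch, not a proof: the names `play` and `simulate` are invented for the example, the seed and trial count are arbitrary, and the newcomer is modelled as someone who picks one of the two closed doors by tossing a coin.

```python
import random

def play(rng):
    """Play one round; report whether each strategy wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host's rule: open a door the contestant did not choose
    # and that does not conceal the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    other = next(d for d in doors if d != pick and d != opened)
    # The newcomer knows nothing, so chooses between the closed doors at random.
    newcomer = rng.choice([pick, other])
    return pick == car, other == car, newcomer == car

def simulate(trials=100_000, seed=0):
    """Return win rates for sticking, switching and the ignorant newcomer."""
    rng = random.Random(seed)
    stick = switch = newcomer = 0
    for _ in range(trials):
        stuck, switched, guessed = play(rng)
        stick += stuck
        switch += switched
        newcomer += guessed
    return stick / trials, switch / trials, newcomer / trials

stick_rate, switch_rate, newcomer_rate = simulate()
print(f"stick: {stick_rate:.3f}  switch: {switch_rate:.3f}  newcomer: {newcomer_rate:.3f}")
```

Over many rounds the sticker should win about a third of the time, the switcher about two thirds, and the newcomer about half: same doors, same car, different information.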

So you should switch doors. This proposal outrages some people, at first. Especially when explained to them at a pub, it outrages them later. But it is true.

It is easier to see if instead there are one thousand doors, not three, and after your first pick the host opens 998 of the other doors.
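The many-doors version can be checked by simply enumerating where the car might be. A small sketch, with a made-up function name, assuming the host always opens every unchosen door but one and never reveals the car:

```python
from fractions import Fraction

def switch_win_probability(n_doors, pick=0):
    """Chance that switching wins when the host opens all but one other door."""
    if n_doors < 3:
        raise ValueError("the game needs at least three doors")
    wins = 0
    for car in range(n_doors):
        # If the first pick was wrong, the host must leave the car's door
        # closed, so switching wins. If the first pick was right, the one
        # door left closed hides a goat, so switching loses.
        if car != pick:
            wins += 1
    return Fraction(wins, n_doors)

print(switch_win_probability(3))     # 2/3
print(switch_win_probability(1000))  # 999/1000
```

With a thousand doors, sticking only wins if your first blind guess was right, a one in a thousand shot, which is why the larger version feels so much more obvious.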

Bayesian inference invites us to update our assessment of probabilities based on what we have since learned. We do this quite naturally in certain environments — when playing cards, for example: once the Ace, King and Queen have been played, you know your Jack is high, and no-one can beat it — but not as a general rule.

Blame and exculpation

The blame and recrimination process started quickly after the accident. Italian police launched an investigation and raised the possibility of manslaughter or “culpable shipwreck” charges against the Bayesian’s skipper, chief engineer and the sailor on watch duty at the time of the incident. All were professional sailors.[2]

In the meantime, the yacht’s builder, The Italian Sea Group,[3] made a series of unusually strident public statements blaming the crew for mishandling the boat and thereby injuring the boat builder’s reputation. In September 2024 TISG launched, then quickly disowned, a civil suit against the Bayesian’s crew and owner — a company controlled by Mike Lynch’s widow Angela Bacares — seeking compensation for reputational damage and loss of earnings, alleging among other things that the crew was inappropriately selected, did not make necessary preparations for the storm despite advance weather warnings, and that their actions during the storm contributed to the sinking.

TISG itself is under police investigation in connection with the tragedy. This may explain its precipitate behaviour in launching formal legal proceedings: one way of getting ahead of suspicion as a wrongdoer is to cast yourself as the victim.

Linear and systems: two competing theories

A conspiracy theorist is someone who’s never tried to organise a surprise party.

— John F. Kennedy

Hold this as a provisional hypothesis: We can divide explanations of the world into linear theories and systems theories.

“Linear theories” view the world as a complicated, but essentially logical, system: one that therefore can, and would, benefit from strategic, top-down design. This is a utopian, grandiose, cybernetic perspective. Failures of the world to produce good outcomes are always attributable: the universe is essentially stable, and there are linear causes and effects which might not be expected, but are calculable. Especially in hindsight. The past is a different country.

“Complicated systems” are bounded rule-based processes: they involve interaction between autonomous agents but within fixed boundaries and according to preconfigured static rules of engagement. All relevant information is available to, even if not necessarily known by, all participants in the system.

Complicated systems may be forward- or backward-engineered. They may require great skill to navigate — in order to beat Pelé at football you will need to be very good — but they are nevertheless, in theory, predictable, in the sense that all events in them are caused, have effects and operate according to a single, consistent set of rules.

One can regard the system’s overall behaviour as entirely explicable in terms of causes and effects. The “root causes” for what happens will dominate the incidental circumstances in which they occur. The inspirational CEO, Pelé, the bad apples who ruined it for everyone, the human operator whose error caused the air crash: these are linear explanations of aberrations in linear systems.

Now the nature of complicated systems is that, while they may be integral, whole and consistent, we cannot always see them in their entirety. Sometimes our root cause analysis reaches a dead, unsatisfactory end, simply due to a lack of information.

The fact is, there are very few political, social, and especially personal problems that arise because of insufficient information. Nonetheless, as incomprehensible problems mount, as the concept of progress fades, as meaning itself becomes suspect, the Technopolist stands firm in believing that what the world needs is yet more information.[4]

The natural response, in a linear system, is simply to seek more information. The system has a complete state — there is a “bottom”:[5] we just need to find it. Where that information is available, we should get it.

Where it is not — where an event happens for which there is no apparent cause — inference is required. We must hypothesise backwards, from what we know about the system, to infer what happened.

Much of the time, this is easy to do. We are natural inference engines. All of science proceed

Conspiracy theories are typically linear explanations in that they put ultimate blame (or credit) for a given state of affairs down to the intentional actions, be they malign or well-meant, of a limited number of disproportionately influential people.

“Systems theories” view the world as a non-linear complex system.[6] They attribute outcomes to the behaviour of a wider interlocking system of relationships, where individuals’ acumen or motivations contribute to, but rarely determine, the outcomes the system produces. They are usually non-linear: in a complex system unexpectable things can and do happen. This is generally not a single person’s fault, but an unexpected consequence of the design of the system. As systems experience unexpected consequences they tend to learn and adjust to them.

Old systems, having been around for longer and having been exposed to more variations in condition, are better stress-tested and therefore throw up fewer unexpected consequences than new systems.[7]


I will grant you at once that this is a wide conception indeed of “conspiracy theory”: it includes not just gunpowder plots and Russian bots in Western elections but the general idea that great art is the product of singular genius, commercial success is the outcome of exceptional leadership, and jazz is not just a succession of happy accidents.

By contrast, systems theory says, in a nutshell, “it’s a bit more complicated than that”. In the case of great artists and great visionary business people, their input into the artistic process is not discounted altogether but instead aggregated with a great deal of other system information to generate an outcome. Shakespeare was indeed a genius but would yet have died in anonymity were it not for his sponsors, publishers, patrons, theatres, actors, critics and audience: the magnificent cultural establishment that we now know as the Shakespeare canon contains a lot of stuff that was nothing to do with William Shakespeare.

Systems and paradigms

I rabbit on a lot on this site about power structures and paradigms. These are systems of political, scientific and cultural control.

Systemantics

The best place to start with systems theory is John Gall’s short, acerbic, funny and devastatingly incisive book Systemantics: The Systems Bible. Systems theories have an acronym, “POSIWID”: the “purpose of a system is what it does”. This, Gall gently points out, is inevitably not what those who designed the system had in mind. The system tends to oppose its own intended function, so to blame conspirators who occupy positions of ostensible influence and power within the system is rather to miss the point. They are as much victims of systemantics as anyone else.

Operator error

We should not be surprised to hear operator error advanced as a cause. It is a clear, linear root cause — suitable for a linear theory of a complicated system where a trained expert fell short of a required standard — and it has the additional advantage of not challenging the overall integrity of the system. It was not management’s fault, nor failure to supervise, nor the ship’s design, but component failure. Where the component that fails is an autonomous agent there is only so much a system administrator can do. The strategic end-point is to design out the need for autonomous components in the system at all. Until very recently this has been an impossibly idealistic end point but with natural language models equipped with powers of, well, Bayesian reasoning (I told you there would be some irony in this post), perhaps that point is not so far away.

But as long as fallible human autonomous agents are still needed for edge-case tasks — for all the mindboggling technology aboard a state-of-the-art superyacht, it still needs a skipper, an engineer and a watch detail — the scope for human error is unavoidable.

Defamation

The claim was brought under Italian law, and withdrawn before anyone got much of a look at it, but it is fun to imagine it was essentially one of defamation:

Defamation
ˌdɛfəˈmeɪʃᵊn (n.)

A false statement about a person, communicated to others, that harms the person’s reputation.

The statement must be untrue, specific enough to identify the person and published or shared with third parties such that it damages the person’s reputation or standing.

Can your careless conduct — not “representational” gestures; actually non-communicative acts, like steering (or not steering) a boat — be a statement?

The defamatory “statement”, we presume, would be that the boat was designed so badly that, even at anchor, a crew of competent professional sailors could not stop it sinking in a severe storm.

That allegation would certainly be defamatory to a shipbuilder. If it were untrue.

But the problem the shipbuilder has is that the ship did sink at anchor in a storm. This does not normally happen to well-designed ships. Presuming it was well designed, then either (i) the storm’s ferocity or (ii) the sailors’ negligence was truly unprecedented.

And here, again, Bayesian reasoning can help us. We have 3,500 years of Mediterranean seafaring experience. This plays out across two dimensions. The first is our experience of the kind of weather conditions that can occur in the Mediterranean. Our “Lindy database” of freak events, which includes eruptions on Etna and the Aeolian islands, is massive. Each one of those experiences is buried deep in the pace layers of Mediterranean seafaring knowledge. We know the conditions ship builders must design for.

The other dimension is the parameters of ship design. Ships falling comfortably within the tested limits of ship design are unlikely to be unsuitable for the conditions in the Mediterranean. The Bayesian was not unusually long, nor unusually heavy, nor was its retractable keel an unusual design. While it is a relatively new innovation, it is a specific advantage in the comparatively shallow and tranquil waters of the Mediterranean.

So the Bayesian was in most respects not an unusual boat. But it was in one: at the time of its construction in 2008, the Bayesian’s mast was the tallest single yacht mast in history. The next tallest, one of the five masts on the Royal Clipper, constructed in 2000, was just 60 metres, some 12 metres shorter than the Bayesian’s 73 m mast. And the Royal Clipper is a 179-metre-long, permanently keeled, square-rigged ship. The Bayesian’s mast, relative to its hull length and given its retractable keel, was extraordinarily disproportionate.

A third angle of prior probability is the acumen of the crew. We should review the historical information for situations in which crew failure was the sole attributed cause of the sinking of a ship at anchor.

This is really another way of asking this question: is it possible to design out of a ship the risk that it will be capsized and sunk purely by inclement weather? Can you solve for human failure, or is the nature of seafaring so dependent on human decision that it is impossible to design away the risk of human error?

  1. Interesting: while the Internet is, and always has been, awash with information about Lynch, it contains almost nothing about Gaunt. No Wikipedia page, and no references at all, since he last showed up in the Sunday Times rich list in 2009. Perhaps this in itself is a testament to Mr. Gaunt’s ability to protect unstructured data.
  2. https://apnews.com/article/italy-sicily-superyacht-sinking-investigation-7b26e40e9efd69c4d08e189093f67ad9
  3. Strictly speaking, the successor to the boat’s builder: Perini Navi, which constructed the Bayesian in 2008, had been acquired by TISG in 2021.
  4. Neil Postman, Technopoly: The Surrender of Culture to Technology, 1992.
  5. In information theory, the “entropy” in a system is a function of the information capacity of the system. A given string of symbols has a maximum information capacity.
  6. Complex systems are “unbounded, interactive process. Involves interaction with autonomous agents without boundaries, without pre-agreed rules, and where information is limited and asymmetric. Rules, boundaries and each participant’s objectives are dynamic and change interactively. Impossible to predict.”
  7. This is sometimes called the “Lindy effect”, but is also explained in terms of pace layering: old systems occupy deeper layers.