System redundancy


The JC likes his pet management theories as you know, readers, and none are dearer to his heart than the idea that the high-modernists have, for forty years, held western management orthodoxy hostage.

The programme is as simple to state as it is self-serving: a distributed organisation is best controlled centrally, and from the place with the best view of the big picture: the top. All relevant information can be articulated as data — you know: “In God we trust, all others must bring data” — and, with enough data, everything about the organisation’s present can be known and its future extrapolated.

Even though one inevitably has less than perfect information, extrapolations, mathematical derivations and algorithmic pattern-matches from a large but finite data set will have better predictive value than “ineffable expertise”. The learning we ascribe to experienced experts is really a kind of anecdotal folk psychology that lacks analytical rigour. This is the lesson of Moneyball: The Art of Winning an Unfair Game: just as the data crunchers could outperform veteran baseball talent scouts, so can models and analytics outperform humans in optimising processes. Thus, from a network of operationalised but largely uncomprehending rule-followers emerges a smooth, steady and stable business revenue stream.

Since the world overflows with data, we can programmatise business. Optimisation is a mathematical problem to be solved. It is a knowable unknown. To the extent we fail, we can put it down to not enough data or computing power.

Since data quantity and computing horsepower have exploded in the last few decades, high modernists grow ever more certain that their time has come: the Singularity is nigh. It will not be long now, and all will be solved.

The pioneer of this kind of modernism was Frederick Winslow Taylor. His inheritors say things like “the singularity is near” and “software will eat the world”, but for all their millenarianism, the on-the-ground experience at the business end of this world-eating software is as grim as it ever was.

We have a theory that this owes itself to a kind of temporal reductionism: just as radical rationalists see all knowledge as reducible to, and explicable in terms of, its infinitesimally small sub-atomic essence, so the data modernists see it as explicable in terms of infinitesimally small windows of time.

This is partly because computer languages don’t do tense: they are coded in the present, and have no frame of reference for continuity. And it is partly because coping with history, and the passage of time, makes things exponentially more complex than they already are. A fine-grained snapshot of the world as data is beast enough, still well beyond the operating parameters of even the most powerful present quantum machines: that level of detail, extended into the future and back into the past, is less calculable still. But if we can rationalise that this infinitely stretching time is really just billions of infinitesimally thin, static slices, each functionally identical to any other, we have a means of handling that complexity.

That it does not have a hope of working seems beside the point.

It’s the long run, stupid

A snapshot captures the process at minimum stress: fair weather, everything operating well. But efficiency must be measured over an appropriate life cycle, one scaled to the frequency of the worst possible negative event. The efficiency of a process must take in all parts of the cycle (the whole gamut of the four seasons) and not just that nice day in July when all seems fabulous with the world. There will be other days; difficult ones, on which multiple unrelated components fail at the same moment, or the market drops, clients blow up, or tastes gradually change. There will be almost imperceptible, secular changes in the market which will demand products be refreshed, replaced, updated, reconfigured; opportunities and challenges will arise which must be met. Your window for measuring who and what is truly redundant in your organisation must be long enough to capture all of those slow-burning, infrequent things.

The skills and operations you need for these phases are different, and more expensive, but likely far more determinative of the success of your organisation over the long run.

The Simpson’s paradox effect: over a short period the efficiency curve may seem to go one way; over a longer period it may run the other way entirely.
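A stylised sketch of how that can happen, with wholly invented numbers: each short window trends upward on its own, while the stitched-together series trends down.

```python
# Quarterly "efficiency" readings in two regimes: fair weather, then stress.
# All figures are made up, purely to illustrate the reversal on aggregation.
fair_weather = [0.90, 0.92, 0.94, 0.96]   # improving, quarter on quarter
stress       = [0.60, 0.62, 0.64, 0.66]   # also improving, quarter on quarter

def trend(series):
    """Crude trend measure: last reading minus first."""
    return series[-1] - series[0]

print(trend(fair_weather))           # +0.06: the short-run curve points up
print(trend(stress))                 # +0.06: so does this one
print(trend(fair_weather + stress))  # -0.24: the long run points down
```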

The perils, therefore, of data: it is necessarily a snapshot, and in our impatient times we imagine time horizons that are far too short. A sensible time horizon should be determined not by reference to your expected regular income, but to your worst possible day. Take our old friend Archegos: it hardly matters that you can earn $20m from a client in a year, consistently, every year for twenty years if you stand to lose five billion dollars in the twenty-first.

Then your time horizon for redundancy is not one year, or twenty years, but two hundred and fifty years. A quarter of a millennium: that is how long it would take to earn back $5 billion in twenty-million-dollar clips.
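The arithmetic, on the figures above:

```python
# Figures from the text: $20m a year of client revenue against a $5bn hole.
annual_revenue = 20_000_000       # what the client pays you, per year
blow_up_loss   = 5_000_000_000    # what the client costs you in year twenty-one

payback_years = blow_up_loss / annual_revenue
print(payback_years)  # 250.0: a quarter of a millennium to earn it back
```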

In peacetime, things looked easy for Credit Suisse, so it juniorised its risk teams. This, no doubt, marginally improved its net peacetime return on the Archegos relationship. But those wage savings (even if they ran to $10m annually) were out of all proportion to the incremental risk assumed as a result.

(We are, of course, assuming that better human risk management might have averted that loss. If it would not have, then the firm should not have been in business at all.)
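A back-of-the-envelope expected-value sketch. The $10m wage saving and the $5 billion loss are from the text; the one-per-cent annual blow-up probability is our assumption, purely for illustration:

```python
# How a small, certain wage saving compares with a large, unlikely loss.
wage_saving  = 10_000_000       # annual saving from juniorising the risk desk
blow_up_loss = 5_000_000_000    # the loss better risk management might avert
p_blow_up    = 0.01             # ASSUMED annual probability of the blow-up

expected_annual_cost = p_blow_up * blow_up_loss
print(expected_annual_cost)               # $50m a year, in expectation
print(expected_annual_cost > wage_saving) # True: the saving is dwarfed
```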

Tight coupling

Redundancy is another word for “slack”: looseness in the tether between interconnected parts of a wider whole. For optimal normal operation one should minimise slack, allowing maximum responsiveness (what musicians would call “attack”): the greatest torque, the most direct transmission of power to road, minimal latency.

But, as Charles Perrow notes,[1] this is only true as long as the machine is working perfectly, in an environment where every outcome can be predicted and monitored, and bad ones avoided by rote. Generally, these are not very interesting environments.

Just-in-time systems have the lowest tolerance for component failure. Should a component misbehave, they carry the greatest risk of a chain reaction leading to catastrophe: the less “give” in the system, the less time there is to diagnose the failure and shut the system down. A system built with backup capacity, by contrast, can continue to operate while failed components are repaired or replaced. Likewise, a certain amount of “stockpiling” along a production line allows production to continue through any outages or supply-chain problems.
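A toy sketch of the point, with invented numbers: a two-stage line in which a buffer of zero is pure just-in-time, while a modest stockpile rides out a supplier outage.

```python
# A toy two-stage production line: a supplier feeding an assembler through a
# buffer ("stockpile"). With buffer_size=0 this is just-in-time: any upstream
# outage halts shipment immediately. All numbers are invented for illustration.

def run_line(outage_days, buffer_size, days=10):
    """Return units shipped when the supplier is down on `outage_days`."""
    buffer = buffer_size  # stock on hand at the start
    shipped = 0
    for day in range(days):
        if day not in outage_days:
            buffer += 1   # supplier delivers one unit of input
        if buffer > 0:
            buffer -= 1   # assembler consumes one unit...
            shipped += 1  # ...and ships one finished unit
    return shipped

outages = {3, 4}  # a two-day supplier failure
print(run_line(outages, buffer_size=0))  # 8: just-in-time loses both days
print(run_line(outages, buffer_size=2))  # 10: the stockpile rides it out
```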

The manufacturing process is nominally optimised and commoditised, but should nonetheless be in a constant state of improvement (kaizen): refining the process, adjusting for evolving demand, reacting to competition and taking advantage of new technology and knowhow. This is a valuable “background processing” function (important and valuable, but not day-to-day “urgent”) with which “redundant” personnel can be occupied, and from which they can be redeployed immediately should a crisis arise.

This has two benefits: first, this “peacetime” self-analysis should in part be aimed at identifying emerging risks and design flaws in the system; second, the personnel acquire an intimate, detailed and holistic understanding of the process, and should therefore be better able to react to a crisis should one arise.

This behaviour is a long-term, “skin in the game” commitment, best served by local, full-time, long-serving employees, not itinerant, inexperienced, outsourced labour.

The importance of employees, and the value they add, is not constant. In an operationalised workplace they pick up a penny a day on 99 days out of 100; if, on that 100th day, they save the firm a sum measured in pounds rather than pennies, it is worth paying them two pennies a day every day, even if, 99 days out of 100, you are making a loss.
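The arithmetic, on the text’s figures, with the size of the hundredth-day save (here £10, or 1,000 pence) assumed purely for illustration:

```python
# A penny of routine value on 99 days out of 100, against a wage of two
# pence a day. The size of the 100th-day save is our ASSUMPTION.
daily_wage   = 2      # pence
routine_take = 1      # pence of value produced, 99 days out of 100
big_save     = 1_000  # pence saved on the 100th day (assumed: £10)

value_over_cycle = 99 * routine_take + big_save  # 1,099 pence
cost_over_cycle  = 100 * daily_wage              # 200 pence
print(value_over_cycle > cost_over_cycle)  # True: a daily loss, a bargain overall
```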

Fragility

Another thing about a lean system is that it is fragile. Super-fragile, in fact, as it has multiple critical failure points, wired in parallel or in series, any one of which will, best case, stop the whole system functioning and, in complex systems, may itself cause further component failures.

Even in cases of some redundancy, failure of one component stresses others, and may cause them to fail.

We run the gamut from super-fragility, where component failure triggers system meltdown (these are Charles Perrow’s “system accidents”), through normal fragility, where component failure causes system disruption, to normal robustness, where there is enough redundancy in the system that it can withstand outages and component failures (though components will continue to fail in predictable ways), and finally antifragility, where the redundancy itself is able to respond to component failures and secular challenges, and redesigns the system in light of experience to reduce the risk of known failures.

The difference between robustness and antifragility here is the quality of the redundant components. If your redundancy strategy is to have lots of excess stock, lots of spare components and an inexhaustible supply of itinerant, enthusiastic but inexpert school-leavers from Bucharest, then your machine will be robust and functional, and will be able to keep operating as long as macro conditions persist, but it will not learn, it will not develop, and it will not adapt to changing circumstances.

An antifragile system requires both kinds of redundancy: plant and stock, to keep the machine going; and tools and knowhow, to tweak it. Experience, expertise and insight. The same things that can head off catastrophic events (expensive though they are) can also apprehend, and capitalise upon, outsized business opportunities. ChatGPT will not help with that.

Redundancy as a key to successful change management

Damon Centola’s research on the concentration and bunching of constituents needed to ensure change is permanent.

  1. In one of the JC’s favourite books, Normal Accidents: Living with High-Risk Technologies.