Template:M intro design System redundancy

Revision as of 14:46, 17 July 2023 by Amwelladmin

It’s the long run, stupid

Taylorism and just-in-time efficiency are judged on a snapshot of the process taken when it is at minimum stress: fair weather, with everything operating well. But efficiency must be measured over an appropriate life cycle, one set by the frequency of the worst possible negative event. The measure must take in all realistic parts of the cycle, including the difficult ones where components fail, revenue drops and clients blow up. It must also be long enough to capture slow secular changes in the market, over which products must be refreshed, replaced, updated and reconfigured, challenges must be met, and competitors develop new and better products.

The skills and operations you need for these phases are different, more expensive, but likely far more determinative of the success of your organization over the long run.

The Simpson’s paradox effect: over a short period the efficiency curve may seem to go one way; over a longer period it may run perpendicular.

The perils, therefore, of data: it is necessarily a snapshot, and we inevitably draw a “relevant time horizon” that is far too short. That time horizon is determined not by your regular income, but by your worst possible day. It does not matter that you can earn $20m a year every year for twenty years if you stand to lose $5bn in the 21st. Then your time horizon is not one year, or twenty years, but two hundred and fifty years: the time it takes to earn back $5bn at $20m a year. In peacetime, things looked easy for Credit Suisse, so they juniorised their risk teams. This, no doubt, marginally improved their net peacetime return on their relationship with Archegos. But those wage savings, even if $10m annually, were out of all proportion to the incremental risk that they assumed as a result.

(We are, of course, assuming that better human risk management might have averted that loss. If it would not have, then the firm should not have been in business at all.)
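The time-horizon arithmetic above can be sketched in a few lines. This is only an illustration using the figures in the text ($20m a year, twenty good years, a $5bn loss); the function and variable names are invented for the example and the model ignores discounting, wages and everything else a real analysis would need.

```python
def break_even_years(annual_income: float, worst_case_loss: float) -> float:
    """Years of steady income needed to pay for one worst-case loss."""
    if annual_income <= 0:
        raise ValueError("annual_income must be positive")
    return worst_case_loss / annual_income


def net_result(annual_income: float, good_years: int, worst_case_loss: float) -> float:
    """Cumulative result after the good years followed by one worst-case loss."""
    return annual_income * good_years - worst_case_loss


# Figures taken from the text above.
ANNUAL_INCOME = 20_000_000         # $20m a year
GOOD_YEARS = 20                    # twenty fair-weather years
WORST_CASE_LOSS = 5_000_000_000    # $5bn lost in year 21

horizon = break_even_years(ANNUAL_INCOME, WORST_CASE_LOSS)
net = net_result(ANNUAL_INCOME, GOOD_YEARS, WORST_CASE_LOSS)

print(f"Relevant time horizon: {horizon:.0f} years")
print(f"Net result after year 21: {net:,.0f}")
```

On these numbers, twenty good years cover less than a tenth of the single bad one, which is why a one-year or even twenty-year snapshot flatters the process.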

Tight coupling

Redundancy is another word for “slack” in the sense of looseness in the tether between interconnected parts of a wider whole. For optimum normal operation one should minimise slack — allow maximum responsiveness — what musicians would call “attack” — the greatest torque, the most direct transmission of power to road; minimal latency.

But, as Charles Perrow notes,[1] this is only true as long as the machine is working perfectly, in an environment where every outcome can be predicted and monitored, and bad ones can be avoided by rote. Generally, these are not very interesting environments.

Just-in-time systems have the lowest tolerance for component failure. Should a component misbehave, they run the greatest risk of a chain reaction leading to catastrophe. The less “give” there is, the shorter the time to diagnose the failure and shut the system down. Conversely, a system built with backup can continue to operate while failed components are repaired or replaced. Likewise, a certain amount of “stockpiling” in a production line allows production to continue should there be any outages or supply chain problems throughout the process.

The manufacturing process is nominally optimised and commoditised, but should nonetheless be in a constant state of improvement (kaizen) to refine the process, adjust for evolving demand, react to competition and take advantage of new technology and knowhow. This is a valuable “background processing” function, important but not day-to-day “urgent”, on which “redundant” personnel can be occupied, and from which they can be redeployed immediately should a crisis arise.

This has two benefits. Firstly, the process’s “peacetime” self-analysis should in part be aimed at identifying emerging risks and design flaws in the system. Secondly, the personnel should have an intimate, detailed and holistic understanding of the process and should therefore be better placed to react to a crisis should one arise.

This behaviour is a long-term, “skin in the game” commitment best served by local, full-time, long-serving employees, not itinerant, inexperienced, outsourced labour.

The importance of employees, and the value they add, is not constant. In an operationalised workplace they pick up a penny a day on 99 days out of 100; if they save the firm £ on that 100th day, it is worth paying them two pennies a day every day even if, 99 days out of 100, you are making a loss.
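The penny-a-day argument can be put as a break-even calculation. The sketch below reads “pick up a penny a day” as the value the employee generates for the firm on a normal day, which is an interpretation rather than something the text spells out. The size of the saving on the 100th day is not given above, so the figure used here is purely hypothetical, as are all the names.

```python
def net_pennies(wage_per_day: int, earned_per_normal_day: int,
                saving_on_crisis_day: int, days: int = 100) -> int:
    """Firm's net result over a cycle of `days`, the last being the crisis day."""
    normal_days = days - 1
    income = earned_per_normal_day * normal_days + saving_on_crisis_day
    wages = wage_per_day * days
    return income - wages


def break_even_saving(wage_per_day: int, earned_per_normal_day: int,
                      days: int = 100) -> int:
    """Smallest crisis-day saving at which the wage bill pays for itself."""
    return wage_per_day * days - earned_per_normal_day * (days - 1)


WAGE = 2                      # two pennies a day, from the text
EARNED = 1                    # a penny a day on normal days, from the text
HYPOTHETICAL_SAVING = 10_000  # assumption: the text leaves this amount blank

daily_loss = EARNED - WAGE    # result on each of the 99 normal days
threshold = break_even_saving(WAGE, EARNED)
cycle_net = net_pennies(WAGE, EARNED, HYPOTHETICAL_SAVING)

print(f"Result on a normal day: {daily_loss} penny")
print(f"Crisis-day saving needed to break even: {threshold} pennies")
print(f"Net over the cycle with the hypothetical saving: {cycle_net} pennies")
```

The firm loses a penny on each normal day, yet any saving above the break-even threshold on the one bad day makes the whole cycle profitable.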

Fragility

Redundancy as a key to successful change management

Damon Centola’s research about concentration and bunching of constituents to ensure change is permanent.

  1. In one of the JC’s favourite books, Normal Accidents: Living with High-Risk Technologies.