Normal Accidents: Living with High-Risk Technologies

===Normal accidents===
Where you have a complex system, we should ''expect'' accidents — and opportunities, quirks and serendipities, but here we are talking about risk — to arise from unexpected, non-linear interactions. Such accidents, says Perrow, are “normal”, not in the sense of being regular or expected,<ref>In the forty-year operating history of nuclear power stations, there had (at the time of writing!) been ''no'' catastrophic meltdowns, “... but this constitutes only an ‘industrial infancy’ for complicated, poorly understood transformation systems.” Perrow had a chilling prediction: “... the ingredients for such accidents are there, and unless we are very lucky, one or more will appear in the next decade and breach containment.” Ouch.</ref> but in the sense that it is an inherent property of the system to have this kind of accident.
 
Are financial systems [[complex]]? About as complex as any distributed system known to humankind. Are they tightly coupled? Well, you could ask the principals of [[LTCM]], [[Enron]], [[Bear Stearns]], Amaranth Advisors, [[Lehman]] Brothers or Northern Rock, if any of those venerable institutions were still around to tell you about it.
 
So, financial services [[risk controller]]s take note: if your system is a complex, tightly-coupled system — and it is — ''you cannot solve for systemic failures''. You can’t prevent them. You have to have arrangements in place to ''deal'' with them. These arrangements need to be able to deal with the unexpected outputs of a ''[[complex]]'' system, not the predictable effects of a merely ''[[complicated]]'' one.
 
Why make the distinction between complex and complicated like this? Because pre-configured devices — [[risk taxonomy|risk taxonomies]], [[playbook]]s, [[checklist]]s, [[neural networks]] — may help resolve isolated failures in ''complicated'' components, but they have ''no'' chance of helping to resolve systems failures. They are ''of'' the system. They are ''part'' of what has failed. Not only that: these safety mechanisms, by their existence, contribute to complexity in the system, and when a system failure happens they can make it ''harder'' to detect what has gone wrong.


===Inadvertent complexity===
So far, so hoopy; but here’s the rub: we can make systems and processes more or less complex and, to an extent, reduce [[tight coupling]] by careful system design and iterative improvement: air transport has become progressively less complex as it has developed. It has learned from each accident. But it is axiomatic that we can’t eliminate complexity.
 
Here is where the folly of complicated safety mechanisms comes in: adding linear safety systems to a system ''increases'' its complexity, and makes dealing with complex interactions even harder. Not only do they create potential accidents of their own, but they also afford a degree of false comfort that encourages managers — who typically have financial targets to meet, not safety ones — to run the system harder, thus increasing the coupling of unrelated components. Perrow catalogues the chain of events leading up to the meltdown at Three Mile Island.


===“Operator error” is almost always the wrong answer===