The Field Guide to Human Error Investigations

{{A|book review|{{image|field guide|jpg|}}}}{{br|The Field Guide to Human Error Investigations}}<br>{{author|Sidney Dekker}}{{c|Systems theory}}
===More on systems accidents===
Of a piece with {{author|Charles Perrow}}’s {{br|Normal Accidents}}, {{author|Sidney Dekker}}’s book is compelling in rooting the cause of accidents in poor system design and unnecessary complexity, overlaying safety features and compliance measures which only make the problem worse — that is, at the door of management and not poor benighted [[subject matter expert]]s who are expected to make sense of the [[Rube Goldberg machine]] that management expect them to operate.


There are two ways of looking at system accidents:
*'''It’s the [[meatware]]''': [[Complex]] (and [[complicated]]) systems would be fine ''were it not for the [[meatware]] screwing things up''. Human error is the main contributor to most system accidents, introducing unexpected failures into an essentially robust mechanism. Here, system design and management direction and oversight are otherwise effective strategies that are let down by unreliable human operators.
*'''It’s the system''': Accidents are an inevitable by-product of operators doing the best they can within [[complex]] systems that contain unpredictable vulnerabilities, where risks shift and change over time and priorities are unclear, conflicting and variable. Here, human operators are let down by shortcomings in system design and conflicting management pressure.
===Blame the [[meatware]]===
Those investigating accidents are motivated in ways which will favour the “[[meatware]]” theory:
*'''Resource constraints''': the wish for a simple [[narrative]] leading to a quick and inexpensive means of remediation
*'''Failure reaction''': an instinctive reaction that since an accident has happened there must have been a failure
*'''Hindsight bias''': the benefit of hindsight in that (i) the accident has happened (ii) the causal chain is now clear (thanks to the simple [[narrative]]) (iii) there has been a failure.
*'''Personal responsibility''': In an ironic self-hack, those who are paid to be safety practitioners assume great responsibility, with some pride, notwithstanding that they don’t have the power or authority (what Daniel Pink would call “mastery” and “autonomy”) to effectively ''discharge'' that responsibility. But if they too are motivated, as a matter of pride, to fall on their sword, that is an easy conclusion for an accident investigator to draw.
*'''Political imperative''': the “best” outcome for the organisation — and its management — is that there is no deep, systemic root cause requiring extensive recalibration of the system’s fundamental architecture (much less criticism of management for its design or supervision), and that superficial action (rooting out/remediating “bad apples”) is all that is needed.


*'''[[Downgrading]] employees''': ''removing'' subject matter experts and replacing them with lower calibre (i.e., cheaper) employees with ''even less'' autonomy to follow the ''even more complicated'' rules and processes now introduced.


But blaming the [[meatware]] is to ignore history and condemn yourself to repeat it. Changing the make-up of your workforce won’t help if the basic conditions under which they are obliged to operate aren’t fixed. Simply adding more, increasingly detailed, policies — “codified over-reactions to situations that are unlikely to happen again” in {{author|Jason Fried}}’s elegant words<ref>{{author|Jason Fried}}, {{br|ReWork: Change the Way You Work Forever}}</ref> — will only make the gap between theory and practice wider.
===The work to rule as falsification of policy===
{{Work to rule capsule}}
 
===Reacting to failure===
Reactions to accidents tend:
*'''Retrospective''': To be made with the benefit of hindsight and full knowledge of inputs and outputs, and with ample time to construct a [[narrative]] that neatly links events into a causal chain ''that was not at all clear to the actors at the time''. This may be an impressive feat of imagination from a [[middle manager]] not normally known for their creative skills, but it ''is'' a feat of imagination.
*'''Proximal''': To blame the [[meatware]] at the sharp end, closest to the accident — the operators, traders, [[negotiator]]s — and not so much those at the “blunt end” — the executive, its goals, target end-states, its strategic management of the process, how it balances risk and reward, what tools and equipment it provides, its rules and [[policy|policies]] and the constraints and pressures it imposes on those [[subject matter expert]]s to get the job done.
*'''Counterfactuals''': To construct alternative sequences of events — where operators “zigged”, but could have “zagged” — which might have avoided the incident. “''Forks in the road stand out so clearly to you, looking back. But when inside the tunnel, when looking forward and being pushed ahead by unfolding events, these forks were shrouded in the uncertainty and [[complexity]] of many possible options and demands; they were surrounded by time constraints and other pressures.''”
*'''Judgmental''': To explain failure by seeking failure: incorrect analyses, mistaken perceptions, misjudged actions. Again, hindsight is king. In each case, if you presented the operator with the facts as they were available to the investigator, in the same unpressurised environment, you might expect the “correct” outcome.
 
====Common canards====
*'''Cause-consequence equivalence''': The assumption that a bad outcome must have had equally ''bad'' causes and that, since management-mandated and properly governed processes are unlikely to have been the product of really bad governance — we bureaucratised the shit out of that, after all — the malignant cause ''must'' be the fault of a bad apple somewhere.
:But bad outcomes are ''not'' necessarily caused by equally bad inputs: the [[Three Mile Island]] disaster was a concatenation of seemingly insignificant and benign, but unusual, events. Indeed, that is the conclusion of [[normal accidents]] theory: you ''can’t'' prevent unexpected non-linear interactions, and these can often quickly spiral out of control.
 
===The [[root cause]]===


{{Sa}}
*[[Systems analysis]]
*{{br|Normal Accidents}}
{{Ref}}