The Field Guide to Human Error Investigations: Difference between revisions

From The Jolly Contrarian

Revision as of 17:46, 25 October 2020

The Jolly Contrarian’s book review service™


Of a piece with Charles Perrow’s Normal Accidents, Sidney Dekker’s book is compelling in rooting the cause of accidents in poor system design and unnecessary complexity, overlaid with safety features and compliance measures which only make the problem worse. That is, it lays accidents at the door of management, and not of the poor benighted subject matter experts who are expected to make sense of the Rube Goldberg machine that management expect them to operate.

There are two ways of looking at system accidents:

  • It’s the meatware: Complex (and complicated) systems would be fine were it not for the meatware screwing things up. Human error is the main contributor to most system accidents, introducing unexpected failures into an essentially robust mechanism. Here, system design and management direction and oversight are otherwise effective strategies that are let down by unreliable human operators.
  • It’s the system: Accidents are an inevitable by-product of operators doing the best they can within complex systems that contain unpredictable vulnerabilities, where risks shift and change over time and priorities are unclear, conflicting and variable. Here, human operators are let down by shortcomings in system design and conflicting management pressure.

Those investigating accidents are motivated in ways which will favour the “meatware” theory:

  • Resource constraints: the wish for a simple narrative leading to a quick and inexpensive means of remediation
  • Failure reaction: an instinctive assumption that, since an accident has happened, there must have been a failure
  • Hindsight bias: the benefit of hindsight, in that (i) the accident has happened, (ii) the causal chain is now clear (thanks to the simple narrative) and (iii) there has been a failure
  • Political imperative: the “best” outcome for the organisation, and for its management, is that there is no deep, systemic root cause requiring extensive recalibration of the system’s fundamental architecture (much less criticism of management for its design or supervision), and that superficial action (rooting out or remediating “bad apples”) is all that is needed

Recommendations thus tend to double down on the system design, further constraining employees from effectively managing situations and setting them up for blame should future accidents occur:

  • Tightening procedures: Further removing autonomy from employees, obliging them to follow even more detailed instructions, rules and policies
  • Introducing safety mechanisms: Adding more fail-safes, warning lights, system breakers and other mechanical second-order derivatives designed to eliminate meatware screw-ups but which actually further complicate the system and bury opportunities to directly observe its operation
  • Downgrading employees: removing subject matter experts and replacing them with lower-calibre (i.e., cheaper) employees, who have even less autonomy, to follow the even more complicated rules and processes now introduced


See also