Safety in mind: high-reliability organisations


Visiting the flight deck of an aircraft carrier left a strong impression on literary journalist Tom Wolfe.

This is a skillet!—a frying pan!—a short-order grill!—not gray but black, smeared with skid marks from one end to the other and glistening with pools of hydraulic fluid and the occasional jet-fuel slick, all of it still hot, sticky, greasy, runny, virulent from God knows what traumas—still ablaze!—consumed in detonations, explosions, flames, combustion, roars, shrieks, whines, blasts, cyclones, dust storms, horrible shudders, fracturing impacts, all of it taking place out on the very edge of control … and little men … are skittering about on the surface as if for their very lives (you’ve said it now!), clustering about twin-engine F-4 fighter planes like bees about the queen … and then running for cover as the two jet engines go into their shriek and a huge deflection plate rises up behind the plane because it is about to go into its explosion and quite enough gets blown … off this heaving grill as it is, and then they explode into full afterburner, 31,000 pounds of force, and a very storm of flames, heat, crazed winds …

Wolfe’s vivid verbal riff leaves no doubt that an aircraft carrier deck is a very dangerous place. The interesting thing from a safety point of view is that aircraft carriers have become much less dangerous. By the US Navy’s figures, 776 naval aircraft were destroyed in 1954; in 2015, the equivalent number was seven.

In 1984, Todd LaPorte, Karlene Roberts and Gene Rochlin from the University of California, Berkeley, began analysing organisations where hazardous activities were part of their everyday business, but which had few accidents. Among their objects of study were nuclear power stations, electricity distribution networks, air traffic control—and aircraft carriers. LaPorte, a professor of political science who had been a US Marine Corps pilot, wangled fieldwork placements for the team (Roberts was a psychologist and Rochlin a physicist) on two US Navy carriers. Their reports, published in 1987 and 1988, introduced the academic world to the concept of the high-reliability organisation (HRO).

As important as their preliminary findings was the overarching question: why do some organisations do so well at managing risk? LaPorte, Roberts and Rochlin wrote of ‘pervasive operating tension’, ‘reliability-oriented allocation of responsibilities’, and the implicit bargain that ‘in effect, operational reliability rivals short-term efficiency as a dominant organisational value’.

Karl Weick and Kathleen Sutcliffe use the concept of sensemaking as one answer to the HRO question. Their five-point summary offers useful advice for any individual or organisation seeking to increase operational reliability.

Weick and Sutcliffe found a distinctive form of organisational sensemaking in HROs, characterised by mindfulness. This is a heightened awareness of the hazards of the business. High-reliability.org defines mindfulness as ‘a mental orientation that continually evaluates the environment, as opposed to mindlessness where a simple assessment leads to choosing a plan that is continued until the plan runs its course’.

Weick and Sutcliffe say mindfulness can be broken down into two processes—anticipation and containment.

Anticipation has three elements:

  1. Preoccupation with failure. Early signs of failure are sought out, collected and analysed. In this context, minor errors and failures are viewed as valuable lessons for the organisation. Defect reporting in aircraft maintenance is the classic example of this—a system to find small failures before they become big failures. Consider this December 2016 airworthiness bulletin from Saab AB Aeronautics: ‘Circuit breakers of an unsuitable strength have been found installed on Saab SF340A aeroplanes, failing in protecting the system from an overcurrent. This condition, if not corrected, could lead to the overheating of wires, possibly resulting in smoke or fire on the aeroplane.’ These two sentences tell a story of engineers who detected a series of small fitment errors and did not merely shrug their collective shoulders and refit the correct part. Instead, they analysed the consequences of using an unsuitable part by asking themselves a simple question: ‘What’s the worst that could happen?’ The answer was chilling enough to warrant publishing an airworthiness directive. The principle can also be applied to individual performance. While most pilots would evaluate their lookout and airspace compliance practices after a near miss in midair, a high-reliability approach would be to review these in detail after a minor airspace incursion or level bust, or even if no breach took place but the aircraft had been heading in that direction.
  2. Reluctance to simplify. HROs assume that all failures are systemic rather than localised (in aviation this means avoiding the insight-destroying label of ‘pilot error’) and that all failures therefore hold important lessons about how the organisation is performing. This principle also applies to individuals. The classic example in aviation is the cockpit circuit-breaker board: a mindful, high-reliability approach would assume that a popped circuit breaker signified a real problem on the aircraft, rather than falling back on the default explanation of ‘the circuit breaker is acting up again’. An event as minor as a missed radio call before taxiing could signal a deeper problem—a person or organisation practising high-reliability principles will seek to find out whether it does.
  3. Sensitivity to operations. An HRO will seek information on how it is actually performing. This could come from front-line staff, external review, or data monitoring systems. This aspect of HRO practice is a very close match to the reporting and assurance components of a safety management system (SMS). Any form of reporting system—from a suggestion box to sophisticated data collection and analysis—is a way to make this sensitivity possible. Information is life. For example, a shrewd hospital administrator used a deliberately shortened meeting to put managers in touch with front-line staff. The meeting between the administrator and the managers was scheduled to run for several hours, but after a short time the administrator wrapped up and encouraged the managers to use the rest of the time to walk around the wards and talk to staff, telling them, ‘I know you have time, because I just freed up the afternoon you had booked for this meeting.’ In the business world this practice is decades old, and known as MBWA, or management by walking around. Done sincerely, it is a two-way exchange of information: managers find out about conditions on the front line, and front-line personnel see management commitment to safety in action.

Containment has two elements:

  1. Commitment to resilience. In practical terms, this means having back-ups and procedures in place for the unexpected. These must be developed and changed to reflect the lessons learned from previous events. At its simplest, in private aviation, this would mean having a paper map or second tablet in the cockpit when using OzRunways or Avplan for flight planning or navigation assistance.
  2. Deference to expertise. In a crisis, normal hierarchies give way to listening to those with expertise and credibility, regardless of their formal rank. Weick says, ‘What is distinctive about effective HROs is that they loosen the designation of who is the “important” decision maker in order to allow decision making to migrate along with problems … hierarchical rank is subordinated to expertise and experience.’ The captain of the aircraft carrier USS Carl Vinson put it more simply: ‘Every person can save the boat.’ LaPorte and colleagues noted that even the lowliest rating on a carrier deck had the authority to suspend launches; a sailor would never be punished for a suspension later found to be unnecessary, and would often be praised for a correct call. In airline transport, the ‘Captain, you must listen!’ phrase from a junior crew member that precedes a warning about extreme danger is a verbal device designed to make unwelcome expertise heard.

Reliably informed…

High-reliability principles can guide individuals and organisations towards more effective management of risk. They are, in effect, good habits, few enough to count on the fingers of one hand, and they form a structured way to answer the question, ‘What am I, or what are we, doing about safety?’

Further information

Gamble, M. (2013). 5 Traits of High Reliability Organizations: How to Hardwire Each in Your Organization. Becker’s Hospital Review, 29 April 2013.

Perrow, C. (1984). Normal Accidents: Living with High-Risk Technologies. New York: Basic Books.

United States Navy, Naval Safety Center. Naval Safety Center Annual Mishap Overview FY14.

Weick, K. E. (1987). Organizational culture as a source of high reliability. California Management Review, 29, pp. 112–127.

Weick, K. E. (1995). Sensemaking in Organizations. Thousand Oaks, CA: Sage.

Weick, K. E. (2001). Making Sense of the Organization. Oxford, UK: Blackwell.

Weick, K. E. and Sutcliffe, K. M. (2007). Managing the Unexpected: Resilient Performance in an Age of Uncertainty. Second edition. San Francisco: Jossey-Bass.

Wolfe, T. (1975). The Truest Sport: Jousting with Sam and Charlie. Esquire, October 1975, p. 156.