“Even a room with flammable gas will not explode unless someone strikes a match.”
– Dr. Bob Wachter
Case 1:
On March 27, 1977, two 747s collided on the runway at Tenerife in the Canary Islands, killing 583 people. On a foggy morning, the KLM 747 was waiting for clearance to take off. Captain Van Zanten was a well-known pilot with an excellent safety record. The KLM crew had spotted a Pan Am 747 taxiing toward the lone runway earlier and assumed the plane was out of the way. The fog was so thick that the crew had to rely on the air traffic controller (ATC). The KLM copilot told the ATC, “We are now at takeoff,” a nonstandard statement in aviation. The captain then added, “We are going.” The ATC assumed the KLM plane was in takeoff position (not actually taking off) and replied, “OK,” another nonstandard reply, which led the KLM captain to believe they were cleared for takeoff. Due to interference on the radio frequency, the KLM crew could not hear accurate information about the Pan Am 747. The KLM flight engineer asked the captain, “Is [the Pan Am] not clear?” The captain simply replied, “Yes,” and advanced the throttles, accelerating for takeoff. Emerging from the fog was the Pan Am plane, still on the main runway directly in front of them. The collision and the ensuing fires destroyed both planes, making this the deadliest accident in aviation history.
Case 2:
On April 21, 2013, an 86-year-old male patient recovering from pneumonia at a community hospital experienced fluctuating blood pressure throughout the day. A nurse noticed the trend in the morning yet did not voice her concern to the resident or the attending physician. The resident later noted the patient’s unstable condition and was unsure about the next steps; however, he did not escalate the issue to the attending, who was off duty. Later in the day, the patient’s daughter noticed her father’s decreasing heart rate. She pressed the nurse call button numerous times, but nobody at the nurses’ station responded. The daughter rushed to the nurses’ station and realized that all the heart monitors there were turned off. She went back to her father’s bedside and held his hand until he passed away four minutes later.
The seemingly unrelated cases above have several common underlying themes:
1. No single event or person caused either incident.
2. Authority gradients (psychological distance between a worker and supervisor) and miscommunication played huge roles in the buildup of each event.
3. Neither workplace had a system that caught human errors.
Theme #1: No single event or person caused either incident
An accumulation of lapses led to the plane collision and the patient’s death. The Swiss cheese model is often used in commercial aviation and health care to demonstrate that a single error at the “sharp end” (e.g., the pilot who operates the plane or the surgeon who makes the incision) is rarely enough to cause harm. The error must penetrate multiple incomplete layers of protection (the slices of Swiss cheese) to cause an accident. An organization’s goal is to shrink the holes in the cheese (latent errors) and maintain multiple overlapping layers of protection, decreasing the probability that the holes will align and cause harm.
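To see why overlapping layers help, here is a rough, purely hypothetical illustration (the numbers are not drawn from either case): if each of three independent layers of defense misses an error 10 percent of the time, the chance that an error slips through all three is

\[
P(\text{harm}) = 0.1 \times 0.1 \times 0.1 = 0.001,
\]

or about one in a thousand. In reality, the layers are rarely independent: a shared latent condition such as fog, radio interference, or understaffing can widen several holes at once, which is why shrinking the holes matters as much as adding more slices.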
Case 1: Plane collision at Tenerife
Multiple factors contributed to the collision.
Factor 1: weather. Dense fog severely limited visibility.
Factor 2: communication between the KLM copilot and the ATC. Both parties used nonstandard terminology when communicating the plane’s position and status. The ATC assumed the KLM plane was in takeoff position, not actually taking off.
Factor 3: technology. Interference on the radio frequency precluded the KLM crew from hearing Pan Am’s status message and the ATC’s response.
Factor 4: authority gradient. Hierarchy in the aviation industry reinforced the psychological distance between the flight engineer/copilot and the captain. The flight engineer and copilot did not press their concerns about the missed radio transmission from Pan Am, and the captain was neither receptive to nor fully aware of his colleagues’ mild expressions of concern, such as the flight engineer’s tentative question.
Case 2: Medical errors
Factor 1: authority gradient. The health care hierarchy reinforced the psychological distance between the nurse and the resident/attending physician. The nurse noticed the patient’s fluctuating blood pressure in the morning and felt concerned; however, she did not voice that concern to the resident or the attending physician.
Factor 2: cognitive and communication lapses/authority gradient. When the resident physician later noticed the patient’s condition, he was unsure of the next steps, yet he did not ask for help by escalating the issue to the attending physician.
Factor 3: alarm fatigue. Too many insignificant alarms created mental fatigue among health care workers, and the heart monitors at the nurses’ station had been turned off. As a result, providers missed the warning signals showing the patient’s dangerously low heart rate for nearly 30 minutes before his heart stopped.
Factor 4: staffing. Nobody was at the nurses’ station to respond to the nurse call button. Low nurse staffing led to a compromised safety culture.
Theme #2: Authority gradients and miscommunication played huge roles in the buildup of each event
Communication structures in both incidents were heavily influenced by authority gradients. Figure 1 shows that surgeons and nurses/residents often have very different perceptions of how well their teams communicate. While the attending surgeons in the survey felt that teamwork in their OR was solid, the rest of the team disagreed. This means that followers, not just leaders, should evaluate communication and teamwork quality. Acknowledging these gaps in perception among team members is critical because we cannot design effective solutions unless we are aware of the existing problems.
Why does this discrepancy in perception happen in the first place? Blaming people for their attitudes, such as arrogance and complacency, is easy, but attitude alone does not explain the whole picture. The core issue is a lack of systems thinking, a paradigm that looks at the relationships among parts, rather than the parts in isolation, to understand the complexity of the world. In many industries, each role is highly specialized, and individuals are often unaware of the interrelationships among roles, let alone how those relationships affect their workflow and outcomes. As a result, cross-departmental communication rarely happens, and each role relies on its own assumptions about the others.
Donald Norman, an advocate for user-centered design, points out the tendency in engineering to forget that individual elements must work together as a system. The same applies to hospital staffing and resource allocation. Everyone wants to do the right thing and may function well alone; however, most work requires teamwork, and when professionals come together without understanding how they operate as a system, things fall apart.
Theme #3: Neither workplace had a system that caught human errors
Humans err. Telling people not to slip is unrealistic because we work in a dynamic, not a static, environment. Multiple factors, including fatigue and mental state, influence our decisions and actions and cause slips and mistakes.
Slips are inadvertent, unconscious lapses that occur during automatic tasks, such as a pilot taking off or a health care provider writing a prescription. In case 1, both the KLM crew and the ATC slipped when they used nonstandard statements to communicate position and status. In case 2, slips occurred when the heart monitors at the nurses’ station were turned off and nobody was there to respond to the nurse call button. If the nurses’ station had been adequately staffed, someone might have prevented the sharp-end harm by answering the call despite the monitors being off. Another potential slip: the nurse who noticed the patient’s fluctuating blood pressure may have forgotten to bring it up to the resident because of her heavy workload, or the information may have been lost during handoff to an incoming nurse.
Slips can be prevented by relatively simple approaches. For example, built-in redundancies, cross-checks and checklists, readbacks, and safety practices such as asking patients for their name and date of birth before administering medications have been successfully implemented in the U.S. In case 2, simple, standardized communication procedures among health care providers might have helped create a complete and accurate flow of information within the team. U.S. teaching hospitals have increasingly adopted the I-PASS mnemonic (Illness Severity, Patient Summary, Action List, Situation Awareness and Contingency Planning, Synthesis by Receiver) to standardize provider-to-provider signout, which has improved safety.
So what can we do?
Crew resource management (CRM), developed after the Tenerife disaster, has been one of the most successful practices for improving safety culture in aviation, and it has also influenced health care in the U.S. Following CRM adoption, U.S. and Canadian airlines saw a remarkable reduction in the annual fatal accident rate (Figure 2). CRM trains crews in communication and teamwork and encourages them to speak up across the authority gradient. Communication tools such as SBAR (Situation, Background, Assessment, and Recommendations) and briefing/debriefing techniques have been applied in health care to improve staff communication, especially between nurses and physicians. CUS words (“I’m concerned about …,” then “I’m uncomfortable …,” and finally, “This is a safety issue!”) are another effective way for anyone lower in a hierarchy to escalate a concern. While aviation and health care each have field-specific challenges, CRM combined with multi-pronged approaches can catch inevitable human factors errors before they cause harm.
To recap, here is the message: systems thinking and the Swiss cheese model are critical to understanding the root causes of incidents, designing system-specific solutions that catch human factors errors, and developing situational awareness of how an individual’s role affects other roles and the overall workflow.
Alisa Sano is a public health research assistant.