Study after study finds that roughly 80% of industrial incidents are caused by human error. Incidents involving human error often include a failure of the operator to respond to an alarm, and that failure is often directly or indirectly caused by nuisance alarms. Poor alarm management has multiple forms and multiple causes, but nuisance alarms are probably the most prevalent and could be considered the single most important alarm management problem we face today.
So why do I believe nuisance alarms are such a significant issue?
- They frequently show up as a direct or contributing cause in accident investigations.
- They lead to normalization of deviance.
- They change the way the operator thinks about and responds to alarms.
The last one is the “kicker” and is what differentiates nuisance alarms from other alarm management issues like alarm floods. As Nick Sands, co-chair of ISA-18.2 and co-convener of IEC 62682, says, “The presence of nuisance alarms can break the operator.”
Nuisance Alarms as Causal Factors in Industrial Incidents
While there is never just one thing that goes wrong leading up to an incident, I have found nuisance alarms to be a causal factor in more than 50% of the incidents that I have analyzed. An incident at a plant in Belle, WV resulted in a methyl chloride leak that lasted five days after startup. A burst disk sensor alarm had been ignored because it was a known nuisance alarm and because maintenance was being performed (which caused the alarm). The recommendations of the US Chemical Safety Board clearly “called out” the nuisance alarm problem.
“Commission an audit to establish and identify the conditions that cause nuisance alarms at all DuPont facilities. Establish and implement a corporate alarm management program as part of the DuPont PSM Program, including measures to prevent nuisance alarms and other malfunctions in those systems.”
CSB Recommendation 2010-06-I-WV-12
The Pryor Trust well blowout in January 2018 caused a rig fire that killed five workers. The alarm system had been deactivated prior to the incident. Why?
One of the contributing causes in the Deepwater Horizon incident was that the horn telling the crew to evacuate had been disabled (inhibited) because “they did not want people woke up at 3 o’clock in the morning due to false alarms” originating from the fire and gas detectors. When the detector alarms did go off, there was a 10-minute delay before the evacuation signal was sent. Had the horn sounded automatically, the loss of life (11 deaths) might have been reduced.
“Breaking the Operator” - Changing the Way the Operator Thinks and Behaves
As portrayed in this video, nuisance alarms are often described as the “Boy Who Cried Wolf” phenomenon, because they teach the operator to ignore alarms. Once operators ignore one alarm, they realize they can ignore other alarms, typically without repercussions. This continues until one day an alarm that should not have been ignored gets ignored, leading to an incident.
The alarm system is a critical tool for the operator, so ignoring nuisance alarms has other ramifications besides “it might one day lead to some kind of an incident”. Along the way to the incident, the operator is losing situation awareness because they are ignoring potentially important information. By discrediting external cues, their mental models (understanding of how a system operates and how it behaves) become corrupted, leading to bad decisions (human error).
Figure. Impact of Nuisance Alarms on Situation Awareness and Operator’s Mental Model
So how is it that the operator can be broken? As human factors expert Mica Endsley puts it, ignoring nuisance alarms is expected and rational behavior under the circumstances.
“A person’s reluctance to respond immediately to a system that is known to have many false alarms is actually quite rational. Responding takes time and attention away from other ongoing tasks perceived as important” (Endsley)
Operators are overloaded with so much data that being able to separate the important from the unimportant is a critical job skill.
Once operators are “broken” they are not likely to stop ignoring alarms just because some nuisance alarms get fixed. It will take a lot more than that to change the operator’s behavior and decision-making process.
“Designing for Situation Awareness: An Approach to User-Centered Design”, Second Edition, Endsley and Jones, CRC Press