How to Win the Alert Fatigue Battle
IT engineers and DevOps teams inevitably experience alert fatigue when they receive after-hours alerts that lack context or relevance. A message comes in, for example, telling the on-call engineer that disk space is used up. Does this mean 60% used or 100% used? Or an after-hours message might report a downed server. Which server? Did the backup server come online as a result? Lack of context causes significant frustration among engineers and feeds alert fatigue. The remedy is to implement an IT alerting system that differentiates high-priority alerts and allows for messaging with attachments.
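To make the difference concrete, here is a minimal Python sketch of an alert that carries its own context. The `DiskAlert` fields and the `format_alert` helper are hypothetical names chosen for illustration; they are not part of any particular alerting product.

```python
from dataclasses import dataclass


@dataclass
class DiskAlert:
    # All field names are illustrative assumptions, not a real product schema.
    host: str
    mount: str
    disk_used_pct: float
    failover_active: bool


def format_alert(alert: DiskAlert) -> str:
    """Build a message that answers "which server?" and "how full?" up front."""
    failover = "backup is serving traffic" if alert.failover_active else "no backup online"
    return (
        f"Disk {alert.disk_used_pct:.0f}% full on {alert.host}:{alert.mount} "
        f"({failover})"
    )


message = format_alert(DiskAlert("db-01", "/var", 97.0, failover_active=False))
print(message)  # Disk 97% full on db-01:/var (no backup online)
```

An engineer reading this message already knows which machine is affected, how urgent the problem is, and whether a backup has taken over.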
Impact of alert fatigue
Companies shouldn’t downplay the impact of alert fatigue. Stressed-out, unhappy, sleep-deprived engineers carry significant financial implications for their employers.
For example, engineers who feel the stress of alert fatigue are likely to leave for greener pastures. Their employers then lose their knowledge and must rehire, which can cost as much as 30% of the individual’s salary.
Actionable steps for fighting alert fatigue
Companies that take alert fatigue seriously realize that they need to address false alarms and sleep deprivation among their on-call engineers. Here are some tried-and-true ways to overcome alert fatigue and take positive steps toward a happier workforce.
- Provide context. An alert that names the affected system and the severity of the problem gives the engineer something to act on.
- Differentiate alerts. Not all alerts are created equal. Some alerts are low priority and can be handled during normal work hours. Filter low-priority alerts so they don’t wake up engineers overnight.
- Alert through a priority messaging application. OnPage’s alerting app enables engineers to message one another from within the application. Engineers can also escalate alerts. This enables the group to act like team players rather than like solo warriors.
- Alert the right person and make it loud. Proper scheduling will ensure that the person who can do the most to correct the problem is alerted, and a loud, distinctive alert helps make sure that person notices it.
- Use post-mortems. Post-mortems allow your team to look back at what worked and what didn’t.
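The differentiation and scheduling steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two priority levels, the `ON_CALL` rota, the 9-to-5 weekday business hours, and the `route` function are all hypothetical and do not describe OnPage's actual behavior or API.

```python
from dataclasses import dataclass
from datetime import datetime, time
from enum import Enum


class Priority(Enum):
    LOW = 1
    HIGH = 2


@dataclass
class Alert:
    summary: str
    priority: Priority


# Hypothetical weekly rota: weekday number (Monday = 0) -> engineer on call.
ON_CALL = {0: "alice", 1: "alice", 2: "bob", 3: "bob", 4: "carol", 5: "carol", 6: "carol"}

# Assumed business hours: weekdays, 09:00 to 17:00.
WORK_START, WORK_END = time(9, 0), time(17, 0)


def in_work_hours(now: datetime) -> bool:
    return now.weekday() < 5 and WORK_START <= now.time() < WORK_END


def route(alert: Alert, now: datetime) -> tuple[str, str]:
    """Return (action, recipient) for an alert arriving at `now`."""
    recipient = ON_CALL[now.weekday()]
    if alert.priority is Priority.HIGH:
        return ("page", recipient)    # wake someone up, whatever the hour
    if in_work_hours(now):
        return ("notify", recipient)  # quiet notification during the day
    return ("queue", recipient)       # hold until the next working day


saturday_3am = datetime(2024, 1, 6, 3, 0)
print(route(Alert("Disk 60% full on db-01", Priority.LOW), saturday_3am))
print(route(Alert("db-01 is down, no backup online", Priority.HIGH), saturday_3am))
```

With this rota, the low-priority disk warning is queued for `carol` rather than waking her, while the outage pages her immediately.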
IT managers need to set expectations about what life on call looks like at their company. By using OnPage, managers can ensure that the experience, while not a cakewalk, is a manageable aspect of the job and that alert fatigue stays under control.
Experience OnPage now. See how easy OnPage’s incident management tool is to use. Sign up for a demo and start a new chapter for your on-call engineers.