You can’t prevent IT outages from happening, but you can be better prepared to deal with them
Unplanned system downtime is a reality that IT departments deal with every day. Some even see downtime as the worst thing that can happen to their IT systems. As almost everything we know has gone through a digital transformation, businesses rely more and more on IT, so an IT issue is a business issue. When critical incidents occur, business operations can quickly suffer:
- Loss of online revenue for e-retailers
- Drop in employee productivity in manufacturing
- Frustrated clinicians, increased patient safety risk, and a lower bed turnover rate in hospitals
- Impact on brand, company image and patient satisfaction
Not long ago, CloudEndure published a survey that put system downtime, and more specifically the cost of system downtime, into perspective. The online survey was conducted in January 2016, and responses were collected from 141 IT professionals from around the world who were using or looking to implement disaster recovery.1 According to the survey:
- The cost of downtime for over a third (36%) of the organizations is $100,000 per day or higher.
- Almost three quarters (73%) of the organizations surveyed indicated the cost of downtime is $10,000 per day or higher.
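To put those per-day figures in terms of a single incident, here is a minimal sketch in Python. It assumes that cost accrues evenly across a 24-hour day, which is a simplification and not something the survey states. The function name `downtime_cost` and the 3-hour outage are hypothetical, chosen only for illustration.

```python
def downtime_cost(cost_per_day: float, outage_hours: float) -> float:
    """Estimate the cost of one outage.

    Assumption (not from the survey): cost accrues evenly over a 24-hour day.
    """
    if cost_per_day < 0 or outage_hours < 0:
        raise ValueError("cost_per_day and outage_hours must be non-negative")
    return cost_per_day * outage_hours / 24.0


# The two per-day thresholds reported in the survey, applied to a
# hypothetical 3-hour outage.
for label, per_day in [("$10,000/day", 10_000), ("$100,000/day", 100_000)]:
    cost = downtime_cost(per_day, 3)
    print(f"{label}: a 3-hour outage costs about ${cost:,.0f}")
```

In practice the cost of an hour of downtime depends heavily on when it happens, so a flat daily rate like this is best treated as a rough lower-effort estimate.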
We all know that system downtime carries a high cost for most, if not all, organizations. IT and business operations are so intertwined that a failure on the IT side is almost guaranteed to hurt the business. But do we really know what downtime is, and what it means to organizations? This is where CloudEndure explored system downtime further:
- 50% of the survey respondents defined downtime as inaccessible systems.
- 25% said downtime is characterized as time when the system is accessible but performance is highly degraded.
- 25% expanded the definition to include instances when the system is accessible but some functions are not operational.
CloudEndure also asked survey respondents to share the main causes of system downtime, and here are the findings:
How often are organizations experiencing IT system downtime?
- More than half of the companies (57%) had an outage in the past 3 months.
- Almost a third (31%) had an outage in the past month or week.
Finally, when a company experiences IT system downtime how good are they at notifying their customers?
As the study shows, no matter how well prepared we are to deal with critical IT issues, they keep occurring, and usually at the worst time. When this happens, the goal for the IT team should be to keep the mean time to resolution (MTTR) to a minimum. We understand that the MTTR can never be zero, but there are ways to identify and eliminate the time wasted on inefficient activities while trying to fix the problem and bring the service back online. Several kinds of solutions already help:
- APM, UEM, NPM solutions can be used to automate the detection of issues
- IT Monitoring solutions to consolidate all alerts into a single console for faster reaction
- Event Correlation solutions to reduce the number of alerts
- Service Desk solutions to manage the incident lifecycle
But very little has been done to improve communication and collaboration between the different teams and stakeholders. This is where Everbridge IT Alerting comes in. IT Communications notifies the right on-call personnel with the right information, so they can join a conference bridge in one click and focus fully on restoring service and limiting the negative impact of incidents on end-user satisfaction and even revenue. The solution also helps reduce the number of inbound calls to the service desk by automatically notifying the impacted customers or business users of the issue.
To learn more about the Everbridge IT Alerting solution for critical IT communication and escalation, visit our website.