How to Prevent the 5 Most Common Communication Outages
While Governor Jerry Brown calls for water conservation in drought-stricken California, Oklahoma and Texas face record rainfalls, and Tropical Storm Ana hits South Carolina almost a month before the official June 1 start of the Atlantic hurricane season. The unpredictable storm ruined the Cherry Grove beaches and arrived well ahead of the season start date set by government experts.
In light of all these weather-related events, how can organizations protect their engagement/communication solutions from disasters? Weather cannot be scheduled. Nor can costly communications system outages be predicted. However, we can learn from the past to prepare for the unexpected. Now is a great time to refresh ourselves on the 5 most common causes of network outages.
To help clients manage network-related problems, Avaya emergency recovery expert Joey Fister analyzed client emergency recovery service requests in the white paper “The Essential Guide to Avoiding Network Outages.” He discovered that the top five causes of communications outages are:
- Power outage: 81 percent
- Lack of routine maintenance: 78 percent
- Hardware failure: 52 percent
- Software bug or corruption: 34 percent
- Network issue or outage: 27 percent
Can these problems be avoided?
“Nearly two-thirds of outages resulting from the top five causes and more than a third of all outages Avaya is involved with could have been avoided by using industry-leading outage prevention practices,” Fister writes. “These practices, simple to implement and sustain, can dramatically reduce costly downtime and potential impact on business results and customer confidence.”
Two practices can help you anticipate possible outage sources: making sure your company’s network is covered by a support organization with access to the latest issue-resolution and optimization solutions, and performing routine maintenance. Both reduce stress on systems and operators while saving money.
According to a 2013 Ponemon Institute study, the average cost of unplanned downtime now exceeds $8,000 per minute, and the average total cost per incident is more than $625,000.
How can your organization avoid the heavy cost of an engagement/communications solution outage? The following industry-leading suggestions can help you reduce potential outages caused by the top five problems:
#1: Power Outages
As organizations grow, so does the mix of gear relying on uninterruptible power supply (UPS) units, which are essential to keeping systems operating through lightning strikes, storms and other power disruptions.
- Conduct an audit to determine if facilities can meet power demands and ward off problems.
- Prepare a framework for periodic reviews.
- Pay particular attention to hardware that is approaching the end of manufacturer support (EoMS).
- Ensure proper grounding of UPS systems and sensitive equipment.
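As a rough illustration of the power audit described above, the following Python sketch adds up the load on a single UPS and checks it against a headroom target. The device names, wattages, UPS capacity and the 80 percent utilization ceiling are all hypothetical values chosen for the example; a real audit would use measured loads and the manufacturer’s ratings.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    load_watts: float  # illustrative draw; use measured values in a real audit


def audit_ups(devices, ups_capacity_watts, max_utilization=0.8):
    """Summarize the load on one UPS and check it against a headroom target.

    max_utilization is an assumed planning ceiling, not a vendor rule.
    """
    total = sum(d.load_watts for d in devices)
    utilization = total / ups_capacity_watts
    return {
        "total_load_watts": total,
        "utilization": utilization,
        "within_headroom": utilization <= max_utilization,
    }


# Hypothetical gear attached to a hypothetical 1,500 W UPS.
devices = [
    Device("call-server", 450.0),
    Device("media-gateway", 300.0),
    Device("core-switch", 250.0),
]
report = audit_ups(devices, ups_capacity_watts=1500.0)
print(report)
```

Running the same check during each periodic review makes it easier to spot when newly added gear has eaten into the safety margin.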
#2: Lack of Routine Maintenance
Most organizations know that poorly tended systems are a source of downtime, but why do so few maintain their systems properly?
- Manage upkeep to avoid service disruptions.
- Run a proactive health check.
- Monitor systems in real time 24/7.
- Observe and follow a maintenance schedule.
#3: Hardware Failures
Extending the life of equipment may seem like an economical use of resources, but it comes with considerable risk. “Sweating the assets” becomes an increasingly risky gamble, with significant consequences if replacement parts or equipment are not immediately available.
- Manage proactive upgrades of equipment approaching EoMS.
- Verify system redundancy.
- Update failover strategies for critical systems that can help reduce hardware-based outages.
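One way to keep proactive upgrades on the radar is to scan an equipment inventory for end of manufacturer support (EoMS) dates that are close or already past. The Python sketch below assumes a simple inventory mapping device names to EoMS dates; the device names, dates and the 365-day warning period are hypothetical.

```python
from datetime import date, timedelta


def approaching_eoms(inventory, today, warning_days=365):
    """Return (name, eoms_date) pairs at or inside the warning period.

    Devices whose EoMS date has already passed are included as well,
    and the result is sorted with the earliest date first.
    """
    cutoff = today + timedelta(days=warning_days)
    flagged = [(name, eoms) for name, eoms in inventory.items() if eoms <= cutoff]
    return sorted(flagged, key=lambda item: item[1])


# Hypothetical inventory: device name -> EoMS date.
inventory = {
    "pbx-01": date(2016, 3, 31),
    "gateway-02": date(2019, 12, 31),
    "ups-03": date(2015, 9, 30),
}
flagged = approaching_eoms(inventory, today=date(2015, 6, 1))
for name, eoms in flagged:
    print(f"{name}: EoMS {eoms.isoformat()}")
```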
#4: Software Bugs or Corruption
Software vendors may be constantly releasing fixes and upgrades into the marketplace, but organizations are not necessarily eager to apply them. Some choose to let others occupy the upgrade frontlines and endure potential rollout hiccups, then follow along at a safe interval. This strategy breaks down disastrously when an organization suffers an outage that would have been avoided with a fix that it voluntarily chose to postpone.
- Adopt a regular patch management strategy that proactively eliminates known issues, maintaining software performance and avoiding software-related outages. Schedule updates for the time that will least impact business flow, and make sure all systems are properly updated. Many managed service providers offer release management services that make it simple to implement a proactive patch management plan.
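Choosing a time that will least impact business flow can start with something as simple as looking at hourly activity. The Python sketch below picks the quietest contiguous window in a 24-hour profile; the call volumes and the two-hour window length are made-up values for illustration, and a real plan would also weigh staffing, change freezes and vendor guidance.

```python
def quietest_window(hourly_activity, window_hours=2):
    """Return (start_hour, total_activity) for the least busy window.

    hourly_activity holds 24 values, one per hour starting at midnight.
    Windows are allowed to wrap around midnight.
    """
    if len(hourly_activity) != 24:
        raise ValueError("expected 24 hourly values")
    best_start, best_load = 0, None
    for start in range(24):
        load = sum(hourly_activity[(start + i) % 24] for i in range(window_hours))
        if best_load is None or load < best_load:
            best_start, best_load = start, load
    return best_start, best_load


# Hypothetical calls handled per hour, midnight through 11 p.m.
hourly_calls = [12, 8, 5, 4, 6, 15, 40, 90, 150, 180, 190, 185,
                170, 175, 180, 160, 140, 100, 70, 50, 35, 25, 20, 15]
start, load = quietest_window(hourly_calls)
print(f"Quietest window starts at {start:02d}:00 with {load} calls")
```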
#5: Network Issues or Outages
Jitter, delay and latency can be warnings of a possible network outage.
- Conduct a simple audit of your organization’s underlying network to identify where such conditions exist.
- Prepare a network diagram to isolate an outage and speed resolution by illustrating the relationships among pieces of equipment.
- Implement rigorous configuration control processes to ensure that system changes and refinements do not inadvertently trigger outages and other problems.
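A simple network audit can begin by summarizing delay samples for each path. The Python sketch below computes average latency and a simplified jitter figure (the mean absolute difference between consecutive samples) and flags paths that exceed a threshold. The sample values and the 150 ms and 30 ms thresholds are assumptions for illustration, and this jitter calculation is a simplification rather than the smoothed estimator used by RTP tools.

```python
from statistics import mean


def summarize_delay(samples_ms):
    """Return (average latency, jitter) in milliseconds.

    Jitter here is the mean absolute difference between consecutive
    samples, a simplified measure chosen for this sketch.
    """
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    diffs = [abs(b - a) for a, b in zip(samples_ms, samples_ms[1:])]
    return mean(samples_ms), mean(diffs)


def flag_path(samples_ms, max_latency_ms=150.0, max_jitter_ms=30.0):
    """Return the list of metrics that exceed the assumed thresholds."""
    latency, jitter = summarize_delay(samples_ms)
    warnings = []
    if latency > max_latency_ms:
        warnings.append("latency")
    if jitter > max_jitter_ms:
        warnings.append("jitter")
    return warnings


# Hypothetical delay samples (ms) with one spike.
samples = [20, 22, 21, 120, 25, 23]
latency, jitter = summarize_delay(samples)
print(f"latency={latency:.1f} ms, jitter={jitter:.1f} ms, flags={flag_path(samples)}")
```

In this made-up data set the average latency looks healthy, but the single spike pushes jitter over the threshold, which is the kind of early warning the audit is meant to surface.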
Outage recovery can be painful, but if you prepare ahead of time, the solution can be simple and the damage limited. For instance, IT experts say the average downtime from an outage requiring a software reinstall is 2.4 hours when backups are available. When backups are not available, the average recovery time is 38 hours (more than 15x longer), potentially adding millions more dollars to outage-related revenue losses.
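To put rough numbers on that difference, the Python sketch below multiplies each recovery time by the $8,000-per-minute downtime cost cited earlier. Applying that average to both scenarios is an assumption made here for illustration; actual costs vary widely by organization and incident.

```python
COST_PER_MINUTE = 8_000  # USD; the per-minute figure cited earlier, used as a rough floor


def downtime_cost(hours, cost_per_minute=COST_PER_MINUTE):
    """Estimate downtime cost in whole dollars for an outage lasting `hours`."""
    return round(hours * 60 * cost_per_minute)


with_backups = downtime_cost(2.4)    # software reinstall, backups available
without_backups = downtime_cost(38)  # software reinstall, no backups
extra_cost = without_backups - with_backups
ratio = 38 / 2.4                     # how much longer recovery takes

print(f"With backups:    ${with_backups:,}")
print(f"Without backups: ${without_backups:,}")
print(f"Extra cost:      ${extra_cost:,} ({ratio:.1f}x longer recovery)")
```

Under these assumptions, the missing backup turns an outage of roughly $1.2 million into one of more than $18 million.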
Many of the issues above can be mitigated or avoided completely by leveraging a cloud solution as a backup or failover option. With older solutions being more likely to fail and limited physical access to some sites, business continuity may require hosted redundant or paired systems that can quickly come online when access to an active system is limited.
There are many different ways to weather a storm, but all require planning before the clouds arrive. Take the necessary steps so that you too can avoid weather-related communications outages.
Are you ready to weather the storm? What was your worst IT disaster and how long did it take for you to recover? How old is your network and the UPS?
Follow me on Twitter at @Pat_Patterson_V