Preventing IT outages and downtime

As businesses continue to embrace digital transformation, availability has become a company’s most valuable commodity. By Daniela Streng, VP & GM EMEA, LogicMonitor.


Availability is the state in which an organisation’s IT infrastructure, which is critical to operating a successful business, is functioning properly. When an organisation experiences a surge in demand or a catastrophic IT issue, availability drops and downtime follows. One of the biggest challenges organisations face is that availability is difficult to maintain and downtime is indiscriminate, affecting even the world’s largest enterprises.


Companies like British Airways, Facebook and Twitter have all battled through expensive outages in recent years that not only affected their businesses, but also exposed society’s growing dependence on technology for everyday tasks. As technology continues to advance, IT outages will keep occurring and will affect more than just an organisation’s bottom line.

Downtime is still a major issue 

An outage occurs when an organisation’s services or systems are unavailable, while a brownout occurs when services remain available but are not operating at an optimal level. According to a LogicMonitor survey of IT decision-makers in the UK, the US and Canada, and Australia and New Zealand, 96 percent of respondents said they experienced at least one outage in the past three years.
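The distinction between an outage and a brownout can be made concrete with a short sketch. The example below is illustrative only: the `HealthSample` fields, the `classify` function and the 500 ms latency budget are assumptions chosen for the example, not definitions taken from the survey. What counts as "optimal" will differ for every service.

```python
from dataclasses import dataclass
from enum import Enum


class ServiceState(Enum):
    HEALTHY = "healthy"
    BROWNOUT = "brownout"   # available, but below an acceptable level
    OUTAGE = "outage"       # not available at all


@dataclass
class HealthSample:
    """One hypothetical health-check result for a service."""
    reachable: bool      # did the service respond at all?
    latency_ms: float    # response time of the check, in milliseconds


def classify(sample: HealthSample, latency_budget_ms: float = 500.0) -> ServiceState:
    """Map a health check to a state.

    The latency budget is an assumed threshold for "operating at an
    optimal level"; real services would set their own.
    """
    if not sample.reachable:
        return ServiceState.OUTAGE
    if sample.latency_ms > latency_budget_ms:
        return ServiceState.BROWNOUT
    return ServiceState.HEALTHY


if __name__ == "__main__":
    for s in (HealthSample(True, 120.0), HealthSample(True, 2300.0), HealthSample(False, 0.0)):
        print(s, "->", classify(s).value)
```

In practice a single slow response would rarely be treated as a brownout; a real check would more likely look at several consecutive samples before changing state.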

Notably, 69 percent of respondents in Australia and New Zealand experienced five or more outages in the last three years, compared with an average of 50 percent of respondents in the UK, US and Canada. Only 31 percent of Australia and New Zealand-based IT decision-makers said they experienced four or fewer outages over the last three years. In comparison, approximately 50 percent of UK, US and Canada respondents said they had experienced four or fewer outages in the same timeframe.

An outage can impact more than just an organisation’s finances. The survey found that organisations that experienced frequent outages and brownouts incurred higher costs, up to 16 times more than companies that had fewer instances of downtime. Beyond the financial impact, these organisations had to double the size of their teams to troubleshoot problems, and it still took them twice as long on average to resolve them.

The industries most affected

Results from the survey also revealed that the frequency of outages and brownouts varies with the industry in which a company operates. Financial and technology organisations experienced outages and brownouts most frequently during the three-year period, followed by retail and manufacturing. According to the survey:

 

  • 41 percent of respondents from financial organisations stated that they experienced 10 or more outages over the past three years.
  • 37 percent of respondents from technology organisations said they experienced 10 or more outages over the past three years. 
  • 34 percent of respondents from retail organisations stated that they experienced 10 or more outages over the past three years.
  • 28 percent of respondents from manufacturing organisations stated that they experienced 10 or more outages over the past three years. 

These numbers highlight the sweeping nature of outages across the various industry sectors and prove that no company should consider itself immune. 

The importance of availability 

Availability matters not only to an organisation’s customers, but also to the IT decision-makers tasked with maintaining it. In fact, 80 percent of global respondents indicated that performance and availability are important issues, ranking above security and cost-effectiveness. After all, IT availability is essential to the smooth running of IT infrastructure and therefore crucial to maintaining business operations. Availability ensures, for example, that airline passengers aren’t stranded due to system outages, that food stays at safe temperatures and that customers can access their online banking applications.

Despite the importance of availability, IT decision-makers indicated that 51 percent of outages and 53 percent of brownouts are avoidable. This means that organisations could prevent this costly downtime, but do not have the means necessary – whether that involves tools, teams or other resources – to avoid it.

Concerns over the repercussions

With high-profile outages and brownouts hitting the headlines on a regular basis, concerns over the repercussions of experiencing downtime are inevitable. In the UK, 38 percent of respondents said that they will likely experience a major brownout or outage so severe that it will generate media attention, while 35 percent believe someone might lose his or her job as a result of this downtime. 

In the US and Canada, 50 percent of respondents said they will likely experience a major brownout or outage so severe that it will generate media attention. Of the same respondents, 52 percent fear someone will lose his or her job. A majority of respondents (63 percent) in Australia and New Zealand feel the same way.

The sector that feared the repercussions of downtime the most was retail, followed by manufacturing. Among respondents working in retail, 68 percent felt that they would experience a major brownout or outage so severe that it would attract national media coverage and that someone could lose his or her job. Among IT decision-makers in manufacturing, 67 percent felt it would attract national coverage, while 69 percent were concerned someone would lose his or her job.

Comprehensive monitoring is key 

To combat downtime, it’s critical that companies have a comprehensive monitoring platform that lets them view their IT infrastructure through a single pane of glass. This means potential causes of downtime are more easily identified and resolved before they can negatively impact the business. This type of visibility is invaluable, allowing organisations to focus less on problem-solving and more on optimisation and innovation.

Evaluating monitoring solutions can be an arduous but necessary task, and the importance of extensibility cannot be overstated. Companies must ensure that the selected platform integrates well with all of their IT systems and can identify and address gaps in their infrastructure that might cause outages. It is also imperative that the selected monitoring solution is not only flexible, but also gives IT teams early visibility into trends that could signify trouble ahead. Taking it a step further, intelligent monitoring solutions that use AIOps functionality like machine learning and artificial intelligence can detect the warning signs that precede issues and warn organisations accordingly.
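As a rough illustration of what "early visibility into trends" can mean, the sketch below fits a straight line to recent samples of a metric, such as disk usage, and estimates how many sampling intervals remain before it reaches a limit. This is a deliberately simple least-squares example, not a description of how any particular monitoring product or AIOps feature works. The function name `samples_until_threshold`, the sample values and the 90 percent limit are all assumptions made for the example.

```python
from typing import Optional, Sequence


def samples_until_threshold(values: Sequence[float], threshold: float) -> Optional[float]:
    """Estimate how many more sampling intervals remain before a metric
    reaches `threshold`, using an ordinary least-squares line fitted to
    equally spaced samples.

    Returns None when there are fewer than two samples or when the fitted
    trend is flat or decreasing (no crossing is forecast).
    """
    n = len(values)
    if n < 2:
        return None

    # Sample indices 0..n-1 serve as the time axis.
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))

    slope = sxy / sxx
    if slope <= 0:
        return None

    intercept = mean_y - slope * mean_x
    crossing_index = (threshold - intercept) / slope
    # Measure from the most recent sample; never report a negative wait.
    return max(0.0, crossing_index - (n - 1))


if __name__ == "__main__":
    # Hypothetical disk usage in percent, one sample per hour.
    disk_usage = [50.0, 52.0, 54.0, 56.0, 58.0]
    remaining = samples_until_threshold(disk_usage, threshold=90.0)
    if remaining is not None:
        print(f"Forecast: limit reached in about {remaining:.0f} more samples")
```

A straight-line fit only suits metrics that grow steadily. Seasonal or bursty workloads need more capable models, which is where the machine learning approaches mentioned above come in.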

Ultimately, whether adopting new technologies or moving infrastructure to the cloud, enterprises must make sure that availability is top of mind, and that their monitoring solution is able to keep up. By selecting a scalable platform that provides visibility into their systems and forecasts potential issues, businesses can rise to the next level without sacrificing availability. This type of visibility will not only help prevent downtime and system outages, but also keep organisations from hitting unwanted headlines.

 

 
