Downtime is the data delivery devil of IT: historically the most reported end-user problem and the death knell of many an IT support career.
And now, ‘thanks’ to recent and ongoing global events forcing a mass change in working habits, the dreaded “D” word has made its presence felt more than ever. At first glance, the shift those events have forced – lockdown and, for many, the need to work from home – may seem relatively trivial, given the number of homes and home offices with high-speed Internet connectivity nowadays. However, as many who are new to the experience are discovering, the home connection is only a minor part of the problem and of the potential issues it raises.
Even where an IT solution – an application, say, or a security authentication mechanism – has been designed for mass, global usage, a sudden ramp-up in users can cause outages, whether local or total. One high-profile casualty early in the lockdown period was Microsoft’s Teams unified communication and collaboration platform, proof that no company is too big to be affected. Other problems were frequently reported, such as cloud-based Internet security authentication failures and virtual desktop sessions being unavailable. Even networking giants within the IT world itself have reported massive increases in the number of help desk tickets generated as a result of the workforce shifting to homeworking almost overnight.
But such examples – regardless of the extremity of the situation – are not as inevitable as many assume; they are absolutely avoidable, though rarely resolved by the application provider themselves. Yet resolved they must be: reports in the national media suggest that the balance of office versus home working won’t go back to how it was, with many employees – at companies that sent all staff home – already questioning why they had to go into the office in the first place. And while, understandably, large technology firms were among the first to switch to remote working for all their staff, the required technology is available to any company in any line of business.
Looking beyond business and industry, education is another obvious candidate for home-based connectivity becoming a long-term option. However, such changes in usage patterns create more – and different – problems for service providers and Data Centre (DC) management. In Italy, for example, during the nationwide quarantine, peak Internet traffic rose by over 30% and overall usage increased by around 70%. Moreover, from a traffic management perspective, usage patterns shifted, so peak traffic occurred earlier in the day in the affected regions. The change in bandwidth peaks and troughs was further amplified by nationwide school closures, meaning housebound schoolchildren were competing with workers for data and application access and for Internet bandwidth in general.
The key to application and data availability, however, lies primarily at the Data Centre – or wherever those applications and data reside. Server overload – whether of CPU, memory, disk access or network access – is not a new issue, but it is still the primary cause of unavailability, user frustration and, more importantly, lost productivity. And, as noted previously, it is – in 99.999% of circumstances – completely avoidable.
Then there is still the security issue to consider – another major cause of downtime. Preventing attacks is a 24x7, millisecond-by-millisecond process: literally every few seconds a new threat is launched, on top of the millions already out there and being constantly relaunched. Endpoint security – some form of AntiVirus (AV) software or other Endpoint Detection and Response (EDR) solution – on your laptop/PC/phone/tablet is still a primary barrier, but attacks at the server/DC level are far more damaging, as they potentially impact thousands of users with a single hit, not just one endpoint.
So, what is the solution to enabling mass remote working without users suffering application and data access outages and slow, to the point of near-unusable, performance? The answer lies primarily in the use of Load Balancer/Application Delivery Controller (LB/ADC) technology, which manages and optimises traffic flow into and out of the servers and out towards the endpoints. Over our history of testing these technologies at Broadband-Testing, we’ve seen the shift from a physical to a virtual LB/ADC environment, as typified by Kemp Technologies’ Virtual LoadMaster (VLM), which we’ve been looking at recently: https://kemptechnologies.com/resource-library/industry-research/
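To make the basic idea concrete, the sketch below (our own simplification, not Kemp’s implementation) shows traffic management at its most primitive: a round-robin load balancer accepting client connections and spreading them across a pool of application servers. The backend addresses are assumptions for illustration only; a real LB/ADC adds health checks, session persistence, SSL offload, a WAF and much more.

```python
# Illustrative only: a minimal round-robin TCP load balancer sketch.
# Backend addresses are hypothetical; a production LB/ADC does far more.
import itertools
import socket
import threading

BACKENDS = [("10.0.0.11", 8080), ("10.0.0.12", 8080)]  # hypothetical app servers
backend_cycle = itertools.cycle(BACKENDS)
LISTEN_ADDR = ("0.0.0.0", 80)

def pipe(src, dst):
    """Copy bytes one way until the connection closes."""
    try:
        while (data := src.recv(4096)):
            dst.sendall(data)
    except OSError:
        pass
    finally:
        dst.close()

def handle(client):
    backend = next(backend_cycle)          # round-robin selection
    upstream = socket.create_connection(backend)
    threading.Thread(target=pipe, args=(client, upstream), daemon=True).start()
    threading.Thread(target=pipe, args=(upstream, client), daemon=True).start()

def main():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(LISTEN_ADDR)
    listener.listen()
    while True:
        client, _ = listener.accept()
        handle(client)

if __name__ == "__main__":
    main()
```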
The reason for this shift is that flexibility and scalability are now the keys to enabling 24x7 access to data and applications. Historically, we’ve seen fixed LB/ADC hardware solutions designed to deliver to a given level of performance, in terms of data throughput and server accessibility, which then hit the buffers due to the – literally – physical limitations of their design and architecture. This also meant a lot of guesswork by the customer as to what capacity they might need and, in estimating peak throughputs, ADC solutions were often over-specified and incredibly expensive – yet still limited.
A virtual solution replaces those estimations with capacity on demand and total management of a completely flexible, scalable estate, regardless of where the data and applications live. Add in automation – the ability to be proactive – and those fixed limitations can be resolved before they are ever reached. Early virtualised LB/ADC solutions came with trade-offs: a more limited feature set and performance caps. However, products such as Kemp’s VLM MAX offer a complete feature set – including SSL offload functionality (to accelerate https access), a Web Application Firewall and high availability – while being unlimited in terms of performance and scalability: if you need more, you simply add more resource, as they can be deployed on all the major hypervisor platforms and leading public cloud services, such as AWS and Azure. Another important aspect of the Kemp example is that the feature set is consistent regardless of where it is deployed, avoiding deployment issues where some sites have access to functionality that other sites cannot see.
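As an illustration of the SSL offload point, the following sketch (again ours, not any vendor’s) terminates TLS at the balancer and forwards plain HTTP to a backend, so the application servers never carry the cryptographic load. The certificate paths and backend address are assumptions for the example.

```python
# Illustrative only: TLS ("SSL offload") termination at the balancer, relaying
# plain HTTP to the backend so the app servers are spared the crypto work.
import socket
import ssl

BACKEND = ("10.0.0.11", 8080)      # hypothetical plain-HTTP app server
CERT, KEY = "lb.crt", "lb.key"     # certificate installed on the balancer

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.load_cert_chain(CERT, KEY)

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("0.0.0.0", 443))
listener.listen()

while True:
    raw_client, _ = listener.accept()
    client = None
    try:
        client = ctx.wrap_socket(raw_client, server_side=True)  # TLS handshake here
        request = client.recv(65536)                             # decrypted bytes
        with socket.create_connection(BACKEND) as upstream:
            upstream.sendall(request)                            # forward in the clear
            client.sendall(upstream.recv(65536))                 # relay the response
    except (ssl.SSLError, OSError):
        pass
    finally:
        (client or raw_client).close()
```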
Manageability is another important factor to consider. With potentially worldwide deployments, these environments must be manageable from a single platform and from essentially anywhere. The ability to see the “bigger picture” from the top down, then drill down into specifics, is crucial to ensuring a truly optimised working architecture. Similarly, optimisation, analysis and security should be fully integrated, not individual components fighting for a limited pot of IT resource, and automation should further optimise the application delivery model and reduce OpEx (Operating Expenditure) and support costs.
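A minimal sketch of that “single pane of glass” idea: poll each virtual LB instance, summarise overall health top-down, then let the operator drill into one site. The instance URLs and the /api/health endpoint here are entirely hypothetical, not a real Kemp (or any other vendor) API.

```python
# Illustrative only: aggregating health from several LB instances into one view.
# URLs and the /api/health endpoint are hypothetical examples.
import json
import urllib.request

LB_INSTANCES = {
    "eu-west": "https://lb-eu.example.internal",
    "us-east": "https://lb-us.example.internal",
    "apac":    "https://lb-apac.example.internal",
}

def fetch_health(base_url: str) -> dict:
    """Fetch a health/metrics summary from one LB instance (hypothetical endpoint)."""
    with urllib.request.urlopen(f"{base_url}/api/health", timeout=5) as resp:
        return json.load(resp)

def overview() -> None:
    """Top-down view first; an operator can then drill into any single site."""
    for site, url in LB_INSTANCES.items():
        try:
            health = fetch_health(url)
            print(f"{site:8} {health.get('status', 'unknown'):8} "
                  f"{health.get('throughput_mbps', 0):>8} Mbps "
                  f"{health.get('active_connections', 0):>8} conns")
        except OSError as exc:
            print(f"{site:8} UNREACHABLE ({exc})")

if __name__ == "__main__":
    overview()
```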
We can’t get away from performance capabilities here: the ultimate virtualised solution should be fully capable of supporting uncapped throughput, including uncapped SSL performance, dependent purely on the system resources allocated to it. That way, further expansion is always available without changing products or methodology. At the same time, this has to come at a realistic cost, unlike many of the hardware-only options of the past. As the IT world increasingly moves away from a fixed CapEx (Capital Expenditure) model to an OpEx-based budget, having a range of licensing options – perpetual, subscription-based or metered (based on usage/data throughput or VM/container instances, for example) – is an equally important consideration. The latter option in particular provides the flexibility to deploy and retire load-balancing resources on demand, simplifying key operational areas such as DevOps environments and application scaling.
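To show why that metered, on-demand model matters operationally, here is a minimal sketch of the kind of scale-out/scale-in decision it makes practical. The per-instance capacity and the utilisation thresholds are our own assumptions for the example, not vendor figures.

```python
# Illustrative only: scale-out/scale-in logic that a metered licensing model
# makes practical. Capacities and thresholds are assumed, not vendor figures.
from dataclasses import dataclass

@dataclass
class ScalingPolicy:
    capacity_per_instance_mbps: float = 1000.0  # assumed throughput per virtual LB
    scale_out_at: float = 0.75                  # add capacity above 75% utilisation
    scale_in_at: float = 0.30                   # retire capacity below 30% utilisation
    min_instances: int = 2                      # keep a pair for high availability

def desired_instances(current: int, observed_mbps: float, p: ScalingPolicy) -> int:
    """Return how many virtual LB instances should be running right now."""
    utilisation = observed_mbps / (current * p.capacity_per_instance_mbps)
    if utilisation > p.scale_out_at:
        current += 1        # deploy another instance (pay as you go)
    elif utilisation < p.scale_in_at and current > p.min_instances:
        current -= 1        # retire an instance and stop paying for it
    return current

# Example: 3 instances handling 2,600 Mbps are ~87% utilised, so scale out to 4.
print(desired_instances(3, 2600.0, ScalingPolicy()))
```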
In conclusion, the recent COVID-19 pandemic has resulted in the mass deployment of remote and home working, and significant pressure on supporting 24x7 access to critical applications and data. But what came out of necessity may now become the new norm across many industries, as most key indicators suggest. The ability of IT infrastructure to support that remote workplace shift, both at the endpoint and – critically – at the DC, is therefore not simply beneficial, but a “must have”. The role of the next generation of LB/ADC technology, with near-infinite scalability and flexibility to support the new demands on application and data availability, is crucial to the successful delivery of this “on demand” resource, both in terms of performance and optimisation.
And, as we’ve highlighted in this article, that technology does exist, and at an affordable price point. So, all of you application, cloud and managed service providers out there – no excuses for downtime, OK?