The key to managing big data

We all know that “big data” – the use of new search and discovery technologies to extract value from huge volumes of information – is coming. We all know that the opportunities and payoffs of big data can be huge. What we don’t always realise, however, is that none of this will happen without the proper network infrastructure in place. And with so much of this data crossing long-distance functional and organisational boundaries, the wide area network (WAN) plays a critical role. By Dave Greenfield, product marketing manager, Silver Peak.


Pressure on the WAN

Let’s first consider what we are dealing with. Data is growing rapidly: enterprises must manage ever-increasing volumes, and file sizes are expected to grow significantly over the next decade. At the same time, growth in enterprise storage spending will be minimal over the next few years. Budget constraints will undoubtedly be a big challenge.

What organisations need to realise is that these massive amounts of data can no longer be handled by normal processing capabilities. Coping can mean buying expensive new platforms, servers, and storage, as well as training existing staff, or hiring new people, who can take advantage of this big data. A shortage of talent will therefore be another big data problem.

However, it is the stability of the underlying network that is the biggest cause for concern. Often overlooked, this issue can render big data investments effectively pointless. Having a lot of data sitting around does not really accomplish anything; the real key to big data is being able to analyse large, diverse data sets and act on the results. Furthermore, all of this data must be backed up. As well as having to move vast amounts of big data around, it is crucial that this data be protected and kept secure, both for regulatory and compliance reasons and to maintain customer trust.

These requirements place a huge dependency on the WAN. A recent study conducted by Forrester found that a large majority of respondents, 72 percent, agree or strongly agree that they would like to replicate more of their application data than they do currently, and 62 percent would like to replicate more frequently.1 As this indicates, big data is more than just a storage and server challenge; it is a network challenge.

Defying distance
The first obstacle is geographical distance. The further the data centre is from the user, the more latency there is to contend with and the longer data takes to reach its destination. If you analyse big data held in the same city, there won’t be a problem. But if you attempt to analyse big data held on the other side of the world - a much more likely scenario - the process is likely to be slow and disjointed.
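
To put the distance problem in numbers: a single TCP connection can carry at most one window of data per round trip, so throughput falls as latency rises. The Python sketch below is a rough illustration only, assuming a standard 64 KB TCP window and zero packet loss (window scaling and parallel flows change the figures):

    # Single-flow TCP throughput ceiling = window size / round-trip time
    WINDOW_BYTES = 64 * 1024  # assumed standard 64 KB window

    for label, rtt_ms in [("same city", 2), ("cross-country", 40), ("intercontinental", 150)]:
        mbps = WINDOW_BYTES * 8 / (rtt_ms / 1000) / 1e6
        print(f"{label:16s} RTT {rtt_ms:4d} ms -> at most {mbps:6.1f} Mbit/s per flow")

At a 2 ms round trip the ceiling is around 260 Mbit/s; at 150 ms it drops to roughly 3.5 Mbit/s per flow, regardless of how much bandwidth has been purchased.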

There are also compliance reasons to take into account here. Recent natural disasters, such as Hurricane Sandy and the 2011 tsunami in Japan, have shown that it is no longer sufficient or acceptable to replicate data across town or even within the same state - you need to replicate data over a much greater distance.

Additionally, insufficient bandwidth is rarely recognised as a challenge to the success of big data, yet it can significantly slow down data transfers, leaving analysis stale and outdated by the time it is complete. Bandwidth is often limited and costly to provision, and in many environments network congestion is a fact of life. If packets are dropped or delivered out of order, it is hard to obtain and assess information in real time. When any or all of these issues arise, big data mobility can become extremely costly and be at risk of failure.
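
As a rough sketch of how loss compounds the distance problem, the well-known Mathis approximation caps steady-state TCP throughput at MSS / (RTT × √p), where p is the packet loss rate. The figures below are illustrative assumptions, not measurements:

    import math

    def mathis_throughput_mbps(mss_bytes, rtt_s, loss_rate):
        # Mathis et al. approximation: throughput <= MSS / (RTT * sqrt(p))
        return mss_bytes / (rtt_s * math.sqrt(loss_rate)) * 8 / 1e6

    # 1460-byte segments, 100 ms long-haul RTT, 0.1% packet loss
    print(mathis_throughput_mbps(1460, 0.100, 0.001))  # ~3.7 Mbit/s

Even a 10 Gbit/s link delivers only a few megabits per second to such a flow, which is why simply buying more bandwidth does not fix the problem.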


Optimising the WAN
Addressing these challenges by simply adding storage or bandwidth will not solve the problems created by latency and packet loss. Optimising the WAN, by contrast, can rectify the bandwidth, distance, and quality issues that plague long-haul data movement. By taking this network-centric approach, organisations can ensure their networks cope with accessing and moving big data while drastically improving network performance and the end-user experience, and significantly reducing costs.

Let’s take a look at WAN optimisation software, which is vital for any organisation wishing to take advantage of big data mobility over distance. Such software reduces the amount of data transmitted across the WAN, corrects the network quality issues present in many networks, and accelerates protocols to help overcome distance challenges.
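
To make the second of those capabilities concrete: one common way to correct for packet loss is forward error correction, where a small parity packet accompanies each group of packets so that a single lost packet can be rebuilt at the far end without waiting for a retransmission across the WAN. The Python sketch below shows the simplest XOR-parity form of the idea (commercial products use adaptive, more sophisticated schemes):

    from functools import reduce

    def make_parity(packets):
        # XOR equal-length packets together into one parity packet
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), packets)

    def recover_missing(survivors, parity):
        # XOR the parity with the surviving packets to rebuild the lost one
        return make_parity(survivors + [parity])

    group = [bytes([i]) * 8 for i in range(4)]  # four 8-byte packets
    parity = make_parity(group)
    lost = group.pop(2)                         # simulate one dropped packet
    assert recover_missing(group, parity) == lost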

Techniques such as deduplication can speed up data transfer times and lower ongoing costs by recognising repetitive information being sent across the network and delivering it from a local store wherever possible. Deduplication can reduce WAN traffic by as much as 99 percent, and it is just one of the many tools organisations can use to deal with these challenges.
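
Here is a minimal sketch of the deduplication principle, using fixed-size chunks and SHA-256 fingerprints (real WAN optimisers match repeated byte patterns at much finer, variable-length granularity):

    import hashlib, os

    CHUNK = 4096  # fixed-size chunking, for illustration only

    def dedup_send(data, seen):
        # "Send" a full chunk only when the far side has not stored it yet;
        # otherwise send just a short fingerprint referencing the cached copy.
        full, refs = 0, 0
        for i in range(0, len(data), CHUNK):
            digest = hashlib.sha256(data[i:i + CHUNK]).hexdigest()
            if digest in seen:
                refs += 1
            else:
                seen.add(digest)
                full += 1
        return full, refs

    cache = set()
    payload = os.urandom(120_000)          # e.g. one backup job's data stream
    print(dedup_send(payload, cache))      # first transfer: every chunk sent in full
    print(dedup_send(payload, cache))      # repeat transfer: fingerprints only

On the second transfer every chunk is already cached, so only fingerprints cross the WAN - this is where the large traffic reduction figures come from.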

Taking advantage of big data
The next generation of data movement and management challenges will be focused on data replication and the movement of data over greater distances. As data volumes grow and these requirements increase, greater strain is being placed on the wide area networking infrastructure.

Many will assume that because bandwidth is getting cheaper and line rates are going up, the WAN bandwidth bottleneck is going away, or will go away completely. However, the growth of data continues to exceed the rate at which new services, new technologies, and bandwidth upgrades are being deployed within carrier networks. The increase in traffic far outpaces the innovation and the price drops seen in enterprise WAN services. This means the WAN bottleneck is worse today than it was 10 years ago.

So big data will continue to grow in importance, and it is possible for companies of all sizes to take advantage. While larger companies will use it to deliver new products and new ways of servicing their customers, smaller companies will use big data in more innovative ways to outshine and compete with larger rivals. The challenge for many organisations will be how to accommodate big data applications on their existing network infrastructure.

This raises the question: if big data is complex, expensive, and requires people and skills that are not yet widely available, why even think about it? Simple - big data can create big value. But as with its predecessors - databases, data warehousing, data mining, data analytics, and business intelligence - you need to know what you are looking for, why you are looking at it, what it is worth to you, and how you will take advantage of it before you start. If you want to take advantage of big data, you need to make sure the WAN is included in your plans.