DATA ANALYTICS - DATABRICKS

Is your data strategy in line with your business? By Toby Balfre, Vice President Field Engineering at Databricks.

While many businesses have succeeded in capturing the true value of their data, others have yet to break free from siloed systems and therefore lack a single, unified data governance model that would let them use data in strategic and transformative ways. Although the importance of data is widely understood and the majority of businesses readily gather actionable insights from their data, only 30% of organisations have a well-articulated data strategy, and only 24% are ‘data-driven’. In what follows, Toby Balfre at Databricks shares his thoughts on how organisations can align their data strategy with their business strategy.

Why is it that most companies struggle to deliver when trying to gain results from their data strategies?

The issue often lies within the foundations of a business. Often a company did not start out with a data strategy, but instead has had data capabilities develop naturally in different parts of the business. Now those companies find that in order to gain a competitive advantage from their data, they need a coherent data strategy across multiple (if not all) business units. The challenge is that since the data capabilities within each function have grown organically, they use different technologies, governance models, access protocols, etc., which may be incompatible with one another. This makes it difficult for data to flow and for business benefits to be realised.

As companies grow and expand, this complex network of data systems across business units multiplies. Each stage of processing uses a different technology, so data cannot flow from one stage to the next, which exacerbates the problem and drives organisations further away from a single source of truth. Companies in this position will struggle to make the most of their data, as the lack of a unified data governance system prevents data teams from working collaboratively.

How can businesses maximise the value of their existing data?

There are three critical technology pillars that companies should focus on when trying to build a successful data strategy. The good news is that most of this is possible where most data already resides: in the data lakes of the major cloud providers (Microsoft Azure, Google Cloud Platform, and AWS).

1. Use a modern data architecture

Historically, business intelligence has demanded data warehouses, whereas data science and ML have demanded data lakes. A data warehouse is a repository for structured, filtered data, whereas a data lake holds large collections of raw and unstructured data. Companies should look to a modern, open data architecture that allows data to be served to all users and use cases without storing copies in different technologies for different use cases. Simplifying the data architecture will break down technology barriers and enable teams to work collaboratively, rather than trying to work across disconnected solutions.

An open, integrated, modern platform such as the data lakehouse – a combination of the best of the data warehouse and the data lake – will allow for more collaboration and innovation across all data teams.

2. Construct in the cloud

Once considered simply a “nice-to-have” accessory, cloud is now integral to successfully scaling data management. Cloud adoption has exploded in recent years, having steadily built momentum since the early 2000s to become today’s go-to approach when building a modern platform.

Many data teams are also working across several different clouds, adopting a multi-cloud approach. This allows teams to run workloads from anywhere, to integrate new solutions easily, and to stay compliant in the future.

3. Keep it open

The value of open source is set to increase as data architectures continue to evolve. Open-source technology and open standards deter teams from building in-house solutions that are overly complex. They are also relatively inexpensive and vetted by experts, which reduces operational risk. Open-source technology can also be an unlimited resource for innovation, connecting data teams to the wider open-source community and giving them full visibility into source code.

Can the use of AI and ML improve success when trying to establish a data culture?

A data culture exists when data is at the heart of every strategic business decision. This data needs to be of high quality, centrally governed, and accessible to every team. Leveraging AI and ML allows mundane data tasks to be automated, increasing efficiency and leaving more time for innovation and complex problem solving.

Introducing a data platform that is open, collaborative, and unified helps to shift data out of siloed systems, where it is often unreliable. Working from a single source of truth allows for truly effective collaboration across the entire company.

Building a data culture requires strategy and, if successfully implemented, can be highly rewarding. Rolls-Royce, for example, gathers insights from its data using an intelligent platform. Leveraging AI and digital twin technology, Rolls-Royce has managed to prevent over 200 million kilograms of carbon from entering the atmosphere by dramatically improving the efficiency of its engines. Rolls-Royce has also extended the time between its engine maintenance checks, meaning less money is spent on unnecessary inventory. Exploiting your data to its full potential can, evidently, revolutionise the way a business functions.

How can we govern who gets access to data and what risk does this pose to data protection?

Data governance is arguably the most challenging aspect of a data transformation initiative, and the culprit is often the need to maintain multiple competing data architectures. These architectures often offer varying levels of security and different approaches to data governance, leading to data that is unreliable and out of date. Organisations should look to minimise the number of data copies by moving to a single data processing layer. This will give them a view of all available data and allow them to apply data governance controls in one place, maintaining high levels of data compliance and quality.

Without an effective data strategy, organisations are missing out on a wealth of untapped insights. A modern data architecture inspires true collaboration and innovation amongst data teams: data is centralised, silos are avoided, and running analytics is far more efficient. Now is the time for businesses to invest in their most valuable asset: their data.

