Enterprise IT organisations are adopting Hadoop rapidly. In its May 2014 Big Data and Analytics Market Survey, Frost & Sullivan found that 58 percent of the large, US-based enterprises surveyed had already implemented one or more Big Data and/or analytic solutions, and that the rest were either implementing or evaluating such solutions, or planning to investigate them in the next 18-24 months.
Apache Hadoop is designed to enable very rapid time-to-insight, decision support, and operational efficiencies. But Hadoop poses many security and regulatory compliance challenges, including automatic replication of data across multiple nodes, multiple types of data concentrated in the Hadoop “Data Lake,” and access by many different users with varying analytic needs.
Sensitive data such as credit card numbers, social security numbers, bank statements and health records flows into Hadoop, which makes it an especially attractive target for cyber-attackers. Traditional IT security controls cannot fully protect an organisation from data breaches and data leakage.
“Voltage brings a unique, proven data-centric approach to protection of sensitive data in Hadoop, and the ability to significantly reduce the scope of regulatory compliance audits,” explained Mark Bower, vice president product management and solutions architecture, Voltage Security. “This data-centric security approach is very different from traditional data protection approaches that are commonly being offered in the Hadoop environment, such as storage-level or HDFS encryption and data masking.”
Storage-level or HDFS encryption prevents unauthorised personnel who have physically obtained a disk from reading the data on it. This is a useful control in a Hadoop cluster, where disks are swapped out frequently, but it does nothing to protect the data from access while the disk is running. Data masking obfuscates sensitive data, but it is intended to be irreversible, which limits its value for many analytic applications.
The data-centric security approach calls for de-identifying the data as close to its source as possible, replacing the sensitive data elements with usable, yet de-identified, equivalents that retain their format, behaviour and meaning. This protected form of the data can then be used in subsequent applications, analytic engines, data transfers and data stores, while remaining readily and securely re-identifiable for those specific applications and users that require it.
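To make the idea concrete, the following is a minimal Python sketch of format-preserving, reversible de-identification. It is not Voltage's implementation or API: it swaps in a toy in-memory lookup table (the hypothetical `ToyTokenVault` class) in place of a production technique such as format-preserving encryption, and every name in it is an assumption made for illustration.

```python
import secrets

DIGITS = "0123456789"


class ToyTokenVault:
    """Hypothetical in-memory vault mapping values to same-format tokens.

    Illustration only: a real deployment would use a vetted, key-managed
    scheme rather than a Python dict, and this toy does not handle an
    exhausted token space.
    """

    def __init__(self):
        self._to_token = {}   # original value -> token
        self._to_value = {}   # token -> original value

    def protect(self, value: str) -> str:
        """Return a token with the same length and separators as `value`."""
        if value in self._to_token:
            # Same input always yields the same token, so joins still work.
            return self._to_token[value]
        if not any(ch in DIGITS for ch in value):
            raise ValueError("value contains no digits to de-identify")
        while True:
            # Replace each digit with a random digit; keep everything else.
            token = "".join(
                secrets.choice(DIGITS) if ch in DIGITS else ch
                for ch in value
            )
            if token != value and token not in self._to_value:
                break
        self._to_token[value] = token
        self._to_value[token] = value
        return token

    def reveal(self, token: str) -> str:
        """Re-identify a token; only callers with vault access can do this."""
        return self._to_value[token]


if __name__ == "__main__":
    vault = ToyTokenVault()
    token = vault.protect("123-45-6789")
    print(token)                # same NNN-NN-NNNN shape, different digits
    print(vault.reveal(token))  # the original value
```

Because the token keeps the original shape, downstream schemas, validation rules and joins continue to work on the protected data, while re-identification remains limited to whoever can reach the vault. Unlike irreversible masking, the original value can still be recovered when an authorised application needs it.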
With the Voltage SecureData Suite for Hadoop, Voltage Security provides maximum data-centric protection and offers broad platform and application support – inside and outside Hadoop – with high performance and scalability well matched to Hadoop speeds.
Consisting of software, support and services, the Voltage SecureData Suite for Hadoop is available in two pre-configured packages: the Starter Edition, for protecting sensitive data in pilot projects and small deployments, and the Enterprise Edition, for full production-level implementations. Each package includes an unlimited number of applications running directly on Hadoop or used by an ETL or batch process transferring data directly into or out of Hadoop. Protection for additional Hadoop nodes can be added to either package to meet the exact data protection needs of a Hadoop environment.