Foundational Data Governance is Imperative for AI Adoption and Compliance

By Manuel Sanchez, Information Security and Compliance Specialist, iManage.

As AI gathers momentum, organisations should ask themselves whether they are truly AI-ready. As new regulatory frameworks like the EU Artificial Intelligence Act make clear, it is not just the AI vendors who need to get their house in order, but also the organisations that use the technology. Organisations deploying AI systems need to demonstrate compliance with a multitude of requirements in the Act, spanning risk management, transparency, and accountability, amongst other areas.

Given these new demands, data governance is more important than ever for organisations looking to adopt AI – and these data governance efforts need to be multifaceted to help avoid making any compliance missteps.

The blueprint for ethical AI and high-quality data

As a starting point, organisations should consider using a single, centralised repository (such as a document management system (DMS)) for all content, which will provide a baseline level of control over the data that feeds the generative AI model.

As a next step: Are internal policies clear on how long to keep that data? Do they account for the type of data and whether it contains personally identifiable information (PII)? When should it be archived, or permanently deleted? Organisations need an auditable way to demonstrate that data is retained and disposed of in accordance with various regulatory frameworks, ranging from GDPR to the California Consumer Privacy Act.
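To make these questions concrete, here is a minimal Python sketch of how a retention schedule could be turned into auditable decisions. Everything in it is an assumption for illustration: the category names, the retention periods, the one-year archive threshold, and the `Document` fields are hypothetical and do not reflect any specific regulation or DMS product.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical retention schedule: days to keep each category of content.
# Real periods must come from the organisation's own policies and legal advice.
RETENTION_DAYS = {"contract": 7 * 365, "hr_record": 6 * 365, "marketing": 2 * 365}
ARCHIVE_AFTER_DAYS = 365  # assumed: content older than one year is archived


@dataclass
class Document:
    doc_id: str
    category: str
    created: date
    contains_pii: bool


def disposition(doc: Document, today: date) -> str:
    """Return 'retain', 'archive', 'delete', or 'review' for one document."""
    limit = RETENTION_DAYS.get(doc.category)
    if limit is None:
        return "review"  # no policy defined for this category: escalate to a human
    age = (today - doc.created).days
    if age > limit:
        return "delete"
    if age > ARCHIVE_AFTER_DAYS:
        return "archive"
    return "retain"


def audit_log(docs, today):
    """Build a list of (doc_id, decision, contains_pii) tuples for auditors."""
    return [(d.doc_id, disposition(d, today), d.contains_pii) for d in docs]
```

Keeping the PII flag alongside each decision is a design choice in this sketch: it lets a reviewer see at a glance which disposals involve personal data.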

Now, how does this play into the EU AI Act?

Frameworks like the EU AI Act place a strong emphasis on ethical AI. Specifically, AI systems must be transparent in their operations, and organisations deploying AI must provide clear documentation on how AI models function, including disclosing the training data sources.

Having a DMS with a core set of data that serves as training resources for a generative AI tool makes it easy for organisations to point to the training data sources, without any ambiguity around what is being fed to the model. Moreover, security policies that are integrated into the DMS ensure that no one – including the model – has access to confidential documents that they shouldn’t.

Meanwhile, a strong records management process that is fully aligned with data retention policies and offers defensible disposition of eligible content ensures that the organisation is using valid, relevant, and legitimate data to feed the AI models as part of the training process, rather than data that should have been disposed of long ago and was never meant to be part of the training data.
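The combination of security policy and retention policy described above can be sketched as a simple filter in front of the training pipeline. This is an illustrative Python sketch only: the `Record` fields, the `confidential` flag, and the source names are hypothetical, and a real DMS would expose its own metadata and permissions model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    doc_id: str
    source: str           # e.g. the DMS workspace the document lives in
    confidential: bool    # assumed flag set by the DMS security policy
    dispose_after: date   # date after which retention policy requires disposal


def select_training_data(records, today):
    """Split records into an eligible list and an exclusion log with reasons."""
    eligible, excluded = [], []
    for r in records:
        if r.confidential:
            excluded.append((r.doc_id, "confidential"))
        elif r.dispose_after < today:
            excluded.append((r.doc_id, "past retention"))
        else:
            eligible.append(r)
    return eligible, excluded


def source_manifest(eligible):
    """Sorted, de-duplicated list of sources behind the training set."""
    return sorted({r.source for r in eligible})
```

The exclusion log records why each document was left out, and the manifest gives a short list of sources that can be pointed to when documenting what was fed to the model.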

In taking this approach, organisations can start to meet the EU AI Act requirements around being transparent about their data sources and ensuring their AI systems use high-quality datasets.

Locking down AI security requirements

The EU AI Act also requires organisations to comply with strict risk management and accountability measures around their AI: notably, AI models must be secure, resilient, and protected against adversarial attacks.

This means taking steps from both a preventative standpoint as well as a remediation one.

On the preventative side, having a zero-trust framework helps mitigate the risk of a data breach. Zero trust goes beyond the particular technologies or platforms used to handle data; it’s a whole strategy around how data is handled, who has access to which particular pieces of data, and so on.

Additionally, prevention is aided by having ongoing user awareness training around pervasive threats like phishing, making it clear that an action as seemingly harmless as clicking on the wrong link in an email can result in a devastating data breach.

The key word here is ongoing: user awareness training delivered in small bites, rather than as a one-and-done exercise. The latter approach is a bit like going to the gym once a year and expecting to stay fit. As everyone knows, it just doesn’t work that way.

Prevention should also include user activity monitoring, both to help detect exfiltration of data by insiders and to check whether users are storing content in a centralised repository to meet adoption targets and compliance requirements.
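As a rough illustration of both monitoring goals, the Python sketch below flags unusually heavy downloaders and measures how often users save to the central repository. The event format, the action names (`download`, `save_dms`, `save_local`), and the threshold of 100 are all assumptions made for this example; real monitoring tools use richer signals and baselines.

```python
from collections import Counter

DOWNLOAD_THRESHOLD = 100  # assumed daily limit; tune to the organisation's baseline


def flag_heavy_downloaders(events, threshold=DOWNLOAD_THRESHOLD):
    """Return users whose download count exceeds the threshold.

    `events` is an iterable of (user, action) tuples from a hypothetical
    activity log covering one day.
    """
    downloads = Counter(user for user, action in events if action == "download")
    return sorted(user for user, count in downloads.items() if count > threshold)


def repository_adoption(events):
    """Share of save actions that landed in the central repository."""
    saves = [action for _, action in events if action in ("save_dms", "save_local")]
    if not saves:
        return None  # no save activity to measure
    return saves.count("save_dms") / len(saves)
```

A flagged user is a prompt for human review rather than proof of wrongdoing, since bulk downloads often have legitimate explanations.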

Of course, even with these preventative measures in place, breaches can happen, which is why organisations need a playbook for how to handle these occurrences. From a remediation standpoint, there should be clear instructions around what steps need to be taken when a breach occurs, such as an escalation process, roles and responsibilities, how soon any regulatory bodies should be informed, and other specific details around the business continuity and remediation process.

Keep it fresh

Something to keep in mind, however: just because you have a playbook around what to do if there's an incident or a data breach doesn’t mean you’re done. You have to keep it up to date.

There should be a process in place whereby every six months or so, key stakeholders get together as a team to review the existing playbook and see if anything needs to be updated.

Depending on the size of the organisation, this meeting might involve the CIO, CISO, the head of compliance, or simply the individuals who are most focused on governance and security.

Having a refreshed, up-to-date breach recovery playbook demonstrates compliance with the spirit of the EU AI Act and its requirements around accountability for data. By contrast, a dated, years-old playbook sends quite the opposite message.

Data governance isn’t optional

Ultimately, a robust, structured approach to managing and protecting organisational data enables better personal data handling, compliant retention and defensible disposition of content, stronger risk management, and greater transparency and accountability. For organisations seeking successful AI adoption, this type of strong data governance isn’t optional – it’s a necessity.