Federal Data Protection Commissioner publishes guide for AI and data protection
The use of AI frequently leads to uncertainties for public bodies. The latest publication by the Federal Commissioner for Data Protection aims to remedy this.
With the guide "AI in Authorities – Data Protection from the Outset", the Federal Commissioner for Data Protection and Freedom of Information (BfDI), Prof. Dr. Louisa Specht-Riemenschneider, aims to support federal public bodies in the data protection-compliant use of AI. Uncertainties arise particularly from the handling of personal data when training and using Large Language Models (LLMs). The guide also focuses on challenges with data memorized in LLMs and the requirements for legality and transparency. The publication is intended to help develop a structured, solution-oriented approach to AI projects.
Large language models form the basis for chatbots and are used to manage numerous tasks in everyday work. Legally relevant for their use is the European AI Regulation (AI Act), which governs the placing on the market, the putting into service and the use of AI systems. Data protection regulations are also decisive: the GDPR, for example, governs the legality and limits of processing personal data. The AI Regulation and the GDPR "complement each other to form a coherent Union regulatory framework for AI systems", the guide explains.
Using LLMs raises numerous data protection challenges. The guide mentions, among others: the black-box character of the systems, which by their technical nature prevents the data processing from being fully understood; hallucination, which conflicts with the principle of data accuracy; memorization (imprinting) of personal data in the model, which can lead to that data being output unintentionally or through targeted attacks; and (lack of) fairness, or bias, arising from over- or under-representation in the training data.
Measures against excessive interference
The following chapters analyze the legal foundations in detail and name concrete measures with which the effects of the described challenges can be mitigated, ranging from organizational measures such as access and rights concepts to technical ones. To reduce the depth of interference when processing personal data, the guide mentions, for example, pseudonymizing the training data, removing personal data such as names, telephone and tax numbers as thoroughly as possible, applying differential privacy to anonymize the dataset as far as possible, or using filters in the AI system that minimize the extraction of personal data from the AI model – each before training.
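The pre-training removal of identifiers mentioned here can be pictured as a simple redaction pass over the training text. The sketch below is illustrative only and not the BfDI's prescribed method: the regex patterns, placeholder format and sample text are assumptions, and a production pipeline would rely on trained NER models (especially for names, which regexes cannot reliably catch) rather than hand-written patterns.

```python
import re

# Hypothetical redaction pass run over training data before model training.
# Order matters: more specific patterns (tax ID) are applied before the
# broader phone-number pattern so they are not mislabeled.
PATTERNS = {
    # Email addresses (simplified pattern, an assumption)
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    # German tax IDs are 11 digits (simplified, illustrative)
    "TAX_ID": re.compile(r"\b\d{11}\b"),
    # Phone numbers such as "0228 997799" or "+49 228 997799-0"
    "PHONE": re.compile(r"\+?\d[\d\s/-]{6,}\d"),
}

def pseudonymize(text: str) -> str:
    """Replace matched identifiers with category placeholders."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

sample = "Reach the office at max@example.org or 0228 997799, tax ID 12345678901."
print(pseudonymize(sample))
```

Replacing identifiers with category placeholders (rather than deleting them) keeps the sentence structure intact for training while removing the direct identifiers; whether this amounts to pseudonymization or anonymization in the GDPR sense depends on what other data remains linkable.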
"Especially when using Large Language Models, public bodies face considerable uncertainties," summarizes the Federal Commissioner for Data Protection. "With this guide, I want to contribute to legal certainty and show which data protection aspects should be considered when using Artificial Intelligence in the authorities under my supervision." Her office is also available for further examination of specific projects. The complete publication is available on the BfDI website (PDF, 46 pages).
(ur)