Meta's AI training: data protection officer sees fan page operators as having a duty

Operators of Facebook fan pages should object to the training of Meta's AI with user data. An EU authority is concerned about rights holders.

From May 27, Meta intends to use the data of all adult European users of Facebook and Instagram to train its AI applications, such as the large language model LLaMA. The US company reserves the right to use all future data as well as data from the past. Anyone who does not want this must expressly object to the use of their personal data and images for these purposes. Dagmar Hartge, Brandenburg's data protection officer, now warns that the opt-out only applies to data on the user's own profile. Posts and photos published on other accounts, such as Facebook fan pages, are not covered.

Hartge therefore advises operators of such fan pages in particular to file an objection themselves. In Brandenburg, for example, public authorities also use Facebook fan pages for their public relations work, the data protection officer points out, a practice that is generally highly controversial among data protection experts. Those affected should therefore act "urgently" before the deadline next week.

"If public bodies use social media on which Meta operates its AI applications, they must live up to their role model function," emphasizes Hartge. They are responsible for reducing the data protection risks for citizens as far as possible. They can only ensure that the personal data of users of their Facebook and Instagram pages is not used for AI training if they object in good time. At the same time, users are free to ask fan page operators to "file an objection".

Back in April, Hamburg's data protection commissioner Thomas Fuchs advised users to opt out in good time. He said he could well understand users' concerns "if all their images and texts shared on social networks now flow into AI models". Training data would flow irrevocably into AI models and their influence could no longer be removed from the model given the current state of technology. His authority has published a list of questions and answers on the subject.

On April 30, the North Rhine-Westphalia consumer advice center sent Meta a warning demanding that it stop its AI usage plans for Instagram and Facebook. According to the consumer advice center, urgency is required "because all the data that has once flowed into the AI can only be retrieved with difficulty". In its view, the blanket reference to legitimate interest is not enough. In addition, particularly sensitive information could also be used for AI training purposes, and data subjects would have to actively consent to this. The data protection organization Noyb is also calling for a cease-and-desist declaration. According to a legal opinion by Steffen Groß, the processing planned by Meta is incompatible with the General Data Protection Regulation (GDPR) in several key respects.

Meanwhile, the EU Intellectual Property Office (EUIPO) sees evidence in a recent study that most developers of generative artificial intelligence (GenAI) systems such as OpenAI with ChatGPT, Meta or Google with Gemini "obtain and use content available online without the prior consent of copyright holders". This makes effective opt-out solutions for rights holders all the more important.

The authors of the EUIPO analysis also call for more transparency. For example, precise information is needed on the origin of a work to identify its rights holder, as well as on permitted uses. This is the only way to determine whether copyright-protected works may be used by GenAI services. In addition, content created by an AI must be easily identifiable. All of these points have an impact on the effective application and enforcement of copyright law as well as on AI developers.

The authors also explain the legal situation: in the EU, legislators have defined exceptions to the exclusive exploitation right for text and data mining in the latest copyright amendment. Reproductions of lawfully accessible digital works are therefore permitted, for example for algorithm training to obtain information, in particular about patterns, trends and correlations. Research institutions are entitled to do so, provided they are not pursuing commercial purposes. This is intended to prevent large-scale data mining by research institutions in the service of companies.

Rights holders who wish to prevent text and data mining of their works available online despite such precautions can reserve the right of use themselves. However, such a reservation is only effective if it is made "in machine-readable form", for example via the robots.txt file. The EUIPO is calling for simple and clear solutions here. The opening of a special knowledge center at the authority by the end of 2025 offers the opportunity to tackle the identified legal complexity and provide comprehensive information resources. According to a study for the Copyright Initiative, the reproduction of works using models for generative AI constitutes copyright-relevant reproduction and is therefore illegal.
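
What a machine-readable reservation via robots.txt could look like in practice is sketched below. This is a minimal illustration, not a statement of what the law requires or of how any particular AI crawler behaves: the user agent name ExampleAIBot and the URL are hypothetical placeholders, and whether a crawler honors the file is up to its operator. The check uses Python's standard urllib.robotparser module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one (made-up) AI crawler from the
# whole site while leaving all other user agents unrestricted.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_crawling_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as an iterable of lines,
    # so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.org/article.html"  # placeholder URL
    for agent in ("ExampleAIBot", "SomeBrowser"):
        print(agent, is_crawling_allowed(ROBOTS_TXT, agent, page))
```

A rights holder would list the actual crawler names published by the respective AI providers instead of the placeholder. The file only expresses the reservation; it does not technically prevent access.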

(akn)

This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.