Penguin Random House wants to protect book authors from AI exploitation
The world's largest trade publisher has revised its copyright notices. It wants to prevent its authors' works from being misused for AI training.
Penguin Random House (PRH) has changed the copyright notice used across all its imprints worldwide in response to the boom in generative artificial intelligence (AI). The trade magazine The Bookseller reports on this move by the world's largest trade publisher, which is owned by Bertelsmann. The relevant new passage reads: "No part of this book may be used or reproduced in any way for the purpose of training artificial intelligence technologies or systems". The extended notice is to be included in all new titles and in reprints of backlist titles. PRH has confirmed that it will appear "on the imprint pages in our markets".
With the declaration, the publisher also wants to "explicitly exempt the relevant titles from the text and data mining exception" introduced by the most recent major reform of the EU Copyright Directive. In Germany, the Bundestag has implemented this provision in sections 44b and 60d of the Copyright Act. Under these rules, it is permissible to reproduce lawfully accessible digital works, for example for algorithm training, "in order to obtain information from them, in particular about patterns, trends and correlations". Research institutions are entitled to do so, provided they are not pursuing commercial purposes.
PRH wants to use AI itself "responsibly"
Authors and rights holders who want to prevent text and data mining of their works available online can reserve that use for themselves. Such a reservation is only effective if it is made "in machine-readable form". In practice, this usually means an entry in the robots.txt file. In the case of content that is not accessible online, the reservation of use can "also be declared in another way", according to the explanatory memorandum to the law.
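How a crawler might evaluate such a machine-readable reservation can be illustrated with a minimal sketch. It uses Python's standard robots.txt parser; the crawler name "ExampleAIBot", the domain publisher.example and the helper may_crawl are hypothetical and chosen purely for illustration. The sketch shows only the technical mechanism, not whether a given entry is legally sufficient as a reservation of use.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. "ExampleAIBot" is an invented crawler
# name used only for this illustration, not a real product.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Assumed example URL on a placeholder domain.
    url = "https://publisher.example/books/sample-chapter.html"
    print("ExampleAIBot allowed:", may_crawl(ROBOTS_TXT, "ExampleAIBot", url))
    print("OtherBot allowed:", may_crawl(ROBOTS_TXT, "OtherBot", url))
```

A well-behaved crawler would run a check of this kind before fetching a page. The file itself does not technically block access; it only signals the operator's wishes.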
According to the report, the head of PRH UK, Tom Weldon, emphasized to employees in August that the company would "vigorously defend the intellectual property of our authors and artists". However, the company also wants to "innovate responsibly" and use generative AI tools itself "selectively and responsibly" if "we see clear evidence that they can advance our goals".
Exemption for AI training is legally controversial
The British collecting society Authors' Licensing and Collecting Society welcomed the move. It called it encouraging that major publishers such as PRH are reaffirming the principle of copyright in their books and explicitly prohibiting tech companies from using protected works to train their AI models, and said it hoped more publishers would follow suit. The Society of Authors also praised the approach. However, it said the current wording does not go far enough, as authors' contracts would also have to be amended.
According to a study commissioned by the Copyright Initiative, the reproduction of works by generative AI models constitutes a copyright-relevant act of reproduction and is therefore unlawful. The analysis states that the training of such systems is not a case of text and data mining. However, the Hamburg Regional Court recently dismissed a photographer's lawsuit against the non-profit association LAION, which provides training datasets for AI. The court reasoned that the use of one of the plaintiff's images was covered by the text and data mining exception.
(nie)