Copyright class action against Anthropic

A significant proportion of the training data for Anthropic's AI consists of illegally copied books. That is the accusation made by book authors in the USA.


Reading is allowed, copying is often not.

(Image: Daniel AJ Sokolov)


Three book authors have filed a lawsuit against the AI provider Anthropic PBC. They accuse Anthropic of systematically and illegally copying copyright-protected books and then misusing them to train artificial intelligence. In doing so, they allege, the company is infringing US copyright law.

The complaint refers to a paper written by Anthropic employees in December 2021, according to which 32 percent of a huge training dataset consists of "internet books". According to the plaintiffs, this is a code word for books illegally downloaded from the internet. The plaintiffs are seeking class action certification on behalf of all authors of books registered with the US Copyright Office that Anthropic copied without a license. Their claims include an injunction, damages, disgorgement of unjust enrichment, legal costs and interest. The case is to be decided by a jury.

Anthropic has not yet responded to the lawsuit. The three authors leading the lawsuit are Andrea Bartz, Charles Graeber and Kirk Wallace Johnson. They filed their lawsuit, Bartz et al. v. Anthropic PBC, in the US District Court for the Northern District of California (Case No. 3:24-cv-05417). A copyright lawsuit brought by music publishers against Anthropic has been pending in the same court since June (Case No. 3:24-cv-03811). The publishers accuse Anthropic of mass unauthorized use of third-party song lyrics, both in training the generative AI model Claude and in its output. That case, Concord Music Group et al. v. Anthropic PBC, was originally filed in the US District Court for the Middle District of Tennessee (Case No. 3:23-cv-01092) but has since been transferred to California.

More than two dozen similar copyright infringement lawsuits against AI operators are pending in the USA. Known defendants include Alphabet, Bloomberg, Google, Meta Platforms, Microsoft, Mosaic ML, Nvidia, OpenAI, Ross Intelligence and Stability AI. The main venues are the US District Courts for the Northern District of California and the Southern District of New York. Two related proceedings do not raise allegations under the US Copyright Act but accuse AI operators of otherwise unlawfully using third-party works to train their generative AI models. One case involves the scraping of YouTube videos (Millette v. OpenAI, Case No. 3:24-cv-04710, Northern District of California), the other the alleged misuse of recordings of voice actors (Lehrman v. Lovo, Case No. 1:24-cv-03770, Southern District of New York).

(ds)


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.