On thin copyright ice: Apple trains AI models with web content
The company uses freely accessible content for "Apple Intelligence". Apple is only now revealing the option to opt out.
(Image: Sebastian Trepesch)
In addition to licensed content, Apple also uses publicly accessible web content to train its new AI models, as the company has now admitted. To build its own foundation models, the company draws on content collected by its web crawler "Applebot", apparently regardless of the license terms stated on the respective website.
Opt-out option only announced now
According to Apple, those who do not wish to make their texts and images available for "Apple Intelligence" training have the option of opting out. It remains unclear how much data, and which data, has already been used to train Apple's AI. To opt out, website operators and content providers must instruct the separate user agent "Applebot-Extended" to ignore their content. Even after opting out, Applebot itself continues to crawl a website unless it is also disallowed in the robots.txt file, the company notes.
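As a rough illustration of the opt-out described above, the following sketch assumes it is expressed as an ordinary robots.txt rule addressed to the "Applebot-Extended" user agent. The robots.txt contents, the helper name may_fetch and the example.com URL are assumptions made for illustration and do not come from Apple. The script uses Python's standard urllib.robotparser to check how such a file would be read: AI training use is declined, while regular crawling by Applebot stays allowed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: decline use for AI training (Applebot-Extended)
# while still allowing regular crawling by Applebot.
# Note: Python's parser matches user agents by substring and takes the first
# matching group, so the more specific "Applebot-Extended" group is listed first.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: Applebot
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"  # placeholder URL
    for agent in ("Applebot-Extended", "Applebot"):
        print(agent, "allowed:", may_fetch(ROBOTS_TXT, agent, page))
```

To block crawling by Applebot as well, the second group would need "Disallow: /" instead of "Allow: /". How Apple's crawler actually interprets these rules is determined by Apple's own documentation, not by this sketch.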
Apple's approach is similar to that of other major AI providers, who have also used freely accessible web content to train their models and have thus set themselves on a collision course with publishers and content creators. According to previous reports, Apple approached several major US publishers last year about licensing content and is already paying for image content for AI training. There was therefore speculation in the industry that Apple might limit itself entirely to licensed content.
In a recent interview, Apple CEO Tim Cook advised journalists to license their content for AI training. Cook said that this is "really smart for some people" and that it is hard to see what could be bad about licensing, unless you don't get a good deal.
Use of "free" content for AI training controversial
Other AI companies, such as Apple partner OpenAI, insist that training AI on freely accessible content is fundamentally "fair" and that AI training is practically "impossible" without access to copyright-protected content. At the same time, more and more deals are being concluded with publishers and website operators.
Among creatives and content producers, many of them regular Apple customers, there is increasing resistance to the unsolicited use of their works for AI training. Apple recently felt the extent of the anger: after a storm of indignation, the company apologized for an iPad commercial in which a giant hydraulic press crushes musical instruments, among other things.
(lbe)