Unwanted Spotify backup shows statistics and metadata
Spotify is outraged because Anna's Archive has downloaded a portion of the Spotify database and intends to publish it. 300 terabytes.
"We backed up Spotify (metadata and music files)", reports the archive project Anna's Archive. It has downloaded 86 million music tracks. According to the project, this corresponds to about 37 percent of all recordings hosted on Spotify, but 99.6 percent of all streams on Spotify. Perhaps even more valuable is the almost completely copied metadata, covering 256 million recordings and 186 million distinct ISRCs (International Standard Recording Codes).
The data collection, totaling around 300 terabytes, was essentially completed in July. It is not legal, as there was no consent from Spotify or the rights holders. The streaming provider is accordingly furious. It speaks of an "anti-copyright attack" and states that it has deactivated the user accounts used for access. New security measures are intended to prevent further mass downloads.
Spotify's Savings Package
Financially, it will make no difference for the vast majority of rights holders whether they receive no royalties from Spotify or from Anna's Archive. Since the beginning of 2024, Spotify has not paid out anything if a recording is not streamed at least 1,000 times in a year. According to Anna's Archive, this affects over 70 percent of all music tracks. This means that niche artists and newcomers in particular are left empty-handed.
In addition, Spotify has implemented measures to reduce the share of revenue paid out to music rights holders. On the one hand, Spotify is said to have secretly produced music under pseudonyms, to which it holds the rights itself and which its algorithms readily mix into what users hear.
On the other hand, the Swedish company has added audiobooks and then, citing the audiobooks, activated a contract clause to halve the payouts to the US music licensing society MLC. A US federal district court has declared this royalty trick to be contractually compliant. As a result, Spotify was able to report a net profit for the first time in 17 years.
The Metadata
Anna's Archive intends to gradually make the approximately 300 terabytes of data available online via torrents. The metadata will be released first. Insiders consider it more desirable than the music, which is available on streaming services anyway.
However, a public, central collection of all ISRC datasets has been missing so far. Even the industry association IFPI (International Federation of the Phonographic Industry), which has recommended the use of ISRCs to its members since 1988, does not maintain a directory. Thus, researchers cannot conduct market analyses, music lovers can only gain limited insight, and the creators of many circulating recordings remain in the dark.
The private company Word Collections, which collects metadata from cooperating digital music services and updates it monthly, likely comes closest to such a directory. However, its database is not public. Word Collections represents rights holders in dealings with streaming services, bypassing collecting societies, which is intended to allow artists to earn significantly more. As heise online has learned, the company's latest ISRC database contains 240 million distinct entries.
That's a bit more than the 186 million that Anna's Archive has acquired. However, Word Collections' data collection has been growing enormously for three months because digital music services are being flooded with AI-generated files from third parties. Anna's Archive's collection only goes up to July; the AI flood explains part of the difference. The other part is explained by the incomplete collection of metadata for tracks that are rarely played on Spotify.
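Figures such as "186 million" or "240 million" distinct ISRCs only mean something if the codes are counted in a uniform notation. An ISRC consists of twelve characters: a two-letter prefix, a three-character registrant code, a two-digit year, and a five-digit designation code. It is often written with hyphens. The following Python sketch shows one way such a count could be made. It is an illustration only, not the method used by Anna's Archive or Word Collections. The function names are hypothetical, and the check covers the format only, not whether a code was ever actually assigned.

```python
import re

# ISRC layout: 2-letter prefix, 3-character alphanumeric registrant code,
# 2-digit year of reference, 5-digit designation code (12 characters total).
ISRC_PATTERN = re.compile(r"[A-Z]{2}[A-Z0-9]{3}[0-9]{2}[0-9]{5}")


def normalize_isrc(raw):
    """Return the compact 12-character form of an ISRC, or None if malformed.

    Hypothetical helper: it only checks the format, not the registry.
    """
    candidate = raw.strip().upper().replace("-", "")
    return candidate if ISRC_PATTERN.fullmatch(candidate) else None


def count_distinct_isrcs(raw_codes):
    """Count distinct, well-formed ISRCs in an iterable of raw strings."""
    return len({code for code in map(normalize_isrc, raw_codes) if code})
```

With this normalization, hyphenated and compact spellings of the same code count as a single entry, and malformed strings are ignored.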