The Atlantic made a database of music used to train AI models searchable

Key facts
- The Atlantic created a searchable database of music used to train AI models
- The database includes four datasets of music, with two containing 12 million and 9 million tracks
- The datasets have been downloaded thousands of times, with Google and Stability confirming their use
- The datasets include a wide range of artists, including pop stars and experimental composers
The Atlantic has created a searchable database of music used to train AI models. The database covers four datasets: two contain 12 million and 9 million tracks, respectively, and the other two have over 100,000 songs each.
Dataset Details
The datasets were uncovered by Atlantic reporter Alex Reisner. Some of the sources, like the Free Music Archive dataset, are free to stream for personal use but require licensing for commercial applications. The datasets are freely available on the internet, but using them as training data is not straightforward.
Usage and Sources
The datasets have been downloaded thousands of times, and Google and Stability have confirmed in research papers that they used them. The artists included range from pop stars like Lady Gaga and Fred Again.. to Radiohead, Aphex Twin, Wu-Tang Clan, Bruce Springsteen, and experimental composer Hainbach.
This article was independently rewritten by ManyPress editorial AI from reporting originally published by The Verge.



