Apple Intelligence may have been unwittingly trained using data pilfered from YouTube, report reveals

Apple Developer YouTube channel
(Image credit: Future / Apple)

As Apple works to ready Apple Intelligence for a beta launch later this year, a new report claims that the company used data drawn from YouTube videos when training its AI models.

Apple is just one of several companies thought to have trained AI on data collected by a third party, with Nvidia and Anthropic also among those thought to have used the same information. The dataset, called YouTube Subtitles, was compiled by EleutherAI from the transcripts of videos made by some of the biggest names on the platform, including MKBHD and MrBeast.

While the dataset was built from transcripts rather than the videos themselves, collecting them is still thought to be against YouTube's terms of service.

It's all about the subtitles

The Wired report notes that the dataset is part of a compilation EleutherAI released called the Pile, which is openly accessible to anyone on the internet.

An investigation found that subtitles from 173,536 YouTube videos across 48,000 channels were used to train AI models, with Apple one of the companies that benefited. It's thought that Apple used the Pile to train OpenELM, a model that was announced in April, just weeks before Apple announced that Apple Intelligence would launch alongside iOS 18. Apple Intelligence is made up of multiple new AI-powered features that generate text and images across multiple apps and services.

Understandably, YouTubers are less than happy with the news. “No one came to me and said, ‘We would like to use this,’” said David Pakman, the host of The David Pakman Show. Others suggested that the use of subtitle data in this way was theft, noting that the same technology could well be used to take creators' jobs in the future.

Apple Intelligence will launch later this year, albeit in beta, alongside iOS 18 and software updates for the Mac, iPad, Apple Watch, Apple TV, and Apple Vision Pro.

Oliver Haslam
Contributor