The Verge - Artificial Intelligences, July 26, 2024
Runway’s AI video generator trained on thousands of scraped YouTube videos

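According to 404 Media, Runway used thousands of YouTube videos and pirated films to train its AI text-to-video generator. The report also revealed that Runway's training data includes links to YouTube channels belonging to major entertainment companies such as Netflix, Disney, Nintendo, and Rockstar Games, as well as creators such as MKBHD, Linus Tech Tips, and Sam Kolder. It also includes links to channels owned by news outlets such as The Verge, The New Yorker, Reuters, and Wired.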

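🤔 Runway used thousands of YouTube videos and pirated films to train its AI text-to-video generator, drawing on links to YouTube channels belonging to major entertainment companies such as Netflix, Disney, Nintendo, and Rockstar Games, as well as creators such as MKBHD, Linus Tech Tips, and Sam Kolder.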

👀 Beyond YouTube channels, Runway's training data also contains links to piracy sites such as KissCartoon, which let users watch anime and other animated content for free.

🤔 Runway says it uses “curated, internal datasets” to train its models but has not provided further detail.

🤔 Runway is not the only company to have trained AI models on YouTube videos, but its use of pirated films raises copyright and ethical concerns.

🤔 YouTube CEO Neal Mohan has said that training AI models on YouTube videos violates YouTube's policies.

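🤔 The incident is a reminder that the rapid development of artificial intelligence brings new challenges: how to balance progress in AI against copyright protection and ethical norms is a question that needs serious thought and resolution.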

Runway CEO Cristóbal Valenzuela onstage at Vox Media’s 2023 Code Conference. | Photo by Jerod Harris/Getty Images for Vox Media

Runway trained its AI text-to-video generator on thousands of YouTube videos and pirated films, according to a report from 404 Media. A spreadsheet of training data obtained by 404 Media includes links to YouTube channels belonging to major entertainment companies, such as Netflix, Disney, Nintendo, and Rockstar Games, along with creators like MKBHD, Linus Tech Tips, and Sam Kolder.

There are also links to channels owned by news outlets like The Verge, The New Yorker, Reuters, and Wired. “The channels in that spreadsheet were a company-wide effort to find good quality videos to build the model with,” a former Runway employee tells 404 Media. “This was then used as input to a massive web crawler which downloaded all the videos from all those channels, using proxies to avoid getting blocked by Google.”

Runway is an AI startup that has received millions in funding from Google parent company Alphabet and Nvidia. It has created impressive tools that allow users to make realistic-looking AI videos as well as ones that capture a particular animation type. Runway’s latest tool, Gen-3 Alpha, launched in June and can “create videos in any style you can imagine.” Like other AI models, Gen-3 Alpha needs to ingest a breadth of content when training.

In addition to YouTube channels, 404 Media also found that Runway’s dataset contains links to piracy sites like KissCartoon, which lets you watch anime and other animated content for free. It’s still not clear whether Runway used all of the videos in this spreadsheet to train its Gen-3 Alpha model — and we may never find out. In an interview with TechCrunch in June, Runway cofounder Anastasis Germanidis said the company uses “curated, internal datasets” to train its models, but he didn’t provide further detail.

When reached for comment, Google pointed The Verge to a statement from YouTube CEO Neal Mohan, who told Bloomberg in April that training AI on the platform’s videos is a “clear violation” of its policies. The Verge reached out to Runway with a request for comment but didn’t immediately hear back.

Runway isn’t the only AI company that has had its AI training data linked to YouTube. Earlier this year, OpenAI CTO Mira Murati said she “wasn’t sure” whether the company’s text-to-video generator, Sora, trained on YouTube. Meanwhile, a recent report from Proof News and Wired found that Anthropic, Apple, Nvidia, and Salesforce trained their AI models on more than 170,000 YouTube videos.
