The Verge - Artificial Intelligence, August 21, 2024
Authors sue Anthropic for training AI using pirated books

 

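A group of authors has sued Anthropic, alleging that it trained its models on pirated books. The lawsuit, filed in a California court, accuses Anthropic of using the open-source dataset “The Pile” and Books3, a library of pirated ebooks within it, to train its Claude AI chatbots. The authors are asking the court to certify the class action, award damages, and bar Anthropic from using copyrighted material in the future. Anthropic did not immediately respond.

🧐 Anthropic is accused of training its Claude AI chatbots on “The Pile,” a dataset containing a large number of pirated ebooks. Its Books3 library includes works by many authors, such as Stephen King and Michael Pollan.

📜 In the lawsuit, the authors say Anthropic downloaded and reproduced these datasets knowing that they contained a large amount of copyrighted content sourced from pirate websites. They want the court to certify the class action and require Anthropic to pay damages and stop the infringement.

👨‍✈️ The authors suing Anthropic include Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson. Although Books3 has been removed from the “most official” version of The Pile, the original version is allegedly still available elsewhere.

🕵️‍♂️ Similar lawsuits have been filed before: former Arkansas Governor Mike Huckabee and other authors sued Meta, Microsoft, and EleutherAI, and George R.R. Martin and other authors sued OpenAI over its alleged use of their copyrighted content.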


A group of authors has sued Anthropic, accusing it of training its models on pirated books, as reported by Reuters. The proposed class action lawsuit was filed in a California court on Monday and alleges Anthropic “built a multibillion-dollar business by stealing hundreds of thousands of copyrighted books.”

In the lawsuit, the authors say that Anthropic used a sprawling, open-source dataset known as “The Pile” to train its family of Claude AI chatbots. Within this dataset is something called Books3, a massive library of pirated ebooks that includes works from Stephen King, Michael Pollan, and thousands of other authors. Earlier this month, Anthropic confirmed to Vox that it used The Pile to train Claude.

“It is apparent that Anthropic downloaded and reproduced copies of The Pile and Books3, knowing that these datasets were comprised of a trove of copyrighted content sourced from pirate websites like Bibiliotik,” the lawsuit reads. The authors want the court to certify their class action lawsuit, as well as require Anthropic to pay proposed damages and prevent the company from using copyrighted material in the future. Anthropic didn’t immediately respond to The Verge’s request for comment.

The writers suing Anthropic include Andrea Bartz, the author of We Were Never Here; Charles Graeber, who wrote The Good Nurse; and Kirk Wallace Johnson, the author of The Feather Thief. While the lawsuit acknowledges that Books3 has been removed from the “most official” version of The Pile, the original version is still allegedly available elsewhere online. A recent investigation also found that companies like Anthropic and Apple trained their AI models on thousands of scraped YouTube video subtitles included in The Pile.

Last year, former Arkansas Governor Mike Huckabee and other authors filed a similar lawsuit against Meta, Microsoft, and EleutherAI (the nonprofit behind The Pile) over allegations that their work was pirated and used to train AI models. George R.R. Martin, Jodi Picoult, Michael Chabon, and several other authors have also sued OpenAI for its alleged use of their copyrighted content.
