The Verge - Artificial Intelligences, January 29
OpenAI has evidence that its models helped train China’s DeepSeek

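Chinese artificial intelligence company DeepSeek has released low-cost AI models, and OpenAI suspects they were built using its data. OpenAI and Microsoft are investigating, saying they have found signs that DeepSeek may have engaged in data distillation in violation of OpenAI's terms of service. The matter has drawn wide attention.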

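DeepSeek releases low-cost AI models that compete with OpenAI

OpenAI suspects DeepSeek used its data to train models

Microsoft finds large amounts of data were exfiltrated through OpenAI accounts

OpenAI says it will take measures to protect its intellectual property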

Sucking in data you didn’t ask permission for? Sounds familiar.

Chinese artificial intelligence company DeepSeek disrupted Silicon Valley with the release of cheaply developed AI models that compete with flagship offerings from OpenAI — but the ChatGPT maker suspects they were built upon OpenAI data.

OpenAI and Microsoft are investigating whether the Chinese rival used OpenAI’s API to integrate OpenAI’s AI models into DeepSeek’s own models, according to Bloomberg. The outlet’s sources said that in late 2024, Microsoft security researchers detected large amounts of data being exfiltrated through OpenAI developer accounts that the company believes are affiliated with DeepSeek.

OpenAI told the Financial Times that it found evidence linking DeepSeek to the use of distillation — a common technique developers use to train AI models by extracting data from larger, more capable ones. It’s an efficient way to train smaller models at a fraction of the more than $100 million that OpenAI spent to train GPT-4. While developers can use OpenAI’s API to integrate its AI with their own applications, distilling the outputs to build rival models is a violation of OpenAI’s terms of service. OpenAI has not provided details of the evidence it found.

The situation is rich with irony. After all, it was OpenAI that made huge leaps with its GPT model by sucking down the entirety of the written web without consent.

President Donald Trump’s artificial intelligence czar David Sacks said “it is possible” that IP theft had occurred. “There’s substantial evidence that what DeepSeek did here is they distilled knowledge out of OpenAI models and I don’t think OpenAI is very happy about this,” Sacks told Fox News on Tuesday.

“We know PRC (China) based companies — and others — are constantly trying to distill the models of leading US AI companies,” OpenAI said in a statement to Bloomberg. “As the leading builder of AI, we engage in countermeasures to protect our IP, including a careful process for which frontier capabilities to include in released models, and believe as we go forward that it is critically important that we are working closely with the US government to best protect the most capable models from efforts by adversaries and competitors to take US technology.”
