TechCrunch News, January 14
OpenAI’s AI reasoning model ‘thinks’ in Chinese sometimes and no one really knows why

OpenAI's reasoning model o1 sometimes abruptly switches to Chinese or another language while "thinking" through a problem, even when the question was asked in English. OpenAI has not explained the behavior, but experts have offered several theories. One holds that the model was trained on large amounts of Chinese data, particularly labels produced by third-party annotation services based in China. Another is that the model is simply reaching for whichever language it finds most efficient for the task, or that the switching is just a "hallucination." Models do not process words directly but tokens, and the choice and use of tokens can itself introduce bias; ultimately, the language inconsistency may stem from associations the model formed during training. Because these models are black boxes, the cause cannot be pinned down with certainty, which underscores the importance of transparency in AI systems.

🤔 o1 sometimes reasons in Chinese or other languages while handling English questions. The behavior has drawn wide attention, but OpenAI has yet to offer an official explanation.

🇨🇳 One theory is that o1's training data contained large amounts of Chinese, particularly label data supplied by third-party Chinese data annotation services, which may have nudged the model toward Chinese during reasoning.

💡 Another view is that the model is simply searching for the most effective language for the task at hand, or that the switching is a form of "hallucination": the model does not understand language as such, it only processes tokens, and the choice of tokens shapes its behavior.

⚙️ Models do not process words directly but tokens, which can be whole words, syllables, or individual characters; how tokens are defined and chosen can also introduce bias, for example because not all languages use spaces to separate words.

🧐 The root cause of the language inconsistency remains unclear, which highlights the importance of transparency in AI systems and the need to understand how these models work internally.

Shortly after OpenAI released o1, its first “reasoning” AI model, people began noting a curious phenomenon. The model would sometimes begin “thinking” in Chinese, Persian, or some other language — even when asked a question in English.

Given a problem to sort out — e.g. “How many R’s are in the word ‘strawberry?’” — o1 would begin its “thought” process, arriving at an answer by performing a series of reasoning steps. If the question was written in English, o1’s final response would be in English. But the model would perform some steps in another language before drawing its conclusion.

“[O1] randomly started thinking in Chinese halfway through,” one user on Reddit said.

“Why did [o1] randomly start thinking in Chinese?” a different user asked in a post on X. “No part of the conversation (5+ messages) was in Chinese.”

OpenAI hasn’t provided an explanation for o1’s strange behavior — or even acknowledged it. So what might be going on?

Well, AI experts aren’t sure. But they have a few theories.

Several on X, including Hugging Face CEO Clément Delangue, alluded to the fact that reasoning models like o1 are trained on data sets containing a lot of Chinese characters. Ted Xiao, a researcher at Google DeepMind, claimed that companies including OpenAI use third-party Chinese data labeling services, and that o1 switching to Chinese is an example of “Chinese linguistic influence on reasoning.”

“[Labs like] OpenAI and Anthropic utilize [third-party] data labeling services for PhD-level reasoning data for science, math, and coding,” Xiao wrote in a post on X. “[F]or expert labor availability and cost reasons, many of these data providers are based in China.”

Labels, also known as tags or annotations, help models understand and interpret data during the training process. For example, labels to train an image recognition model might take the form of markings around objects or captions referring to each person, place, or object depicted in an image.
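
A rough, hypothetical sketch of what such labels can look like in practice; the field names and values below are invented for illustration and do not follow any particular vendor's schema.

    # Invented example of image-annotation labels: bounding boxes plus a
    # caption, similar in spirit to common object-detection datasets.
    annotations = [
        {
            "image": "street_scene_001.jpg",
            "objects": [
                {"label": "person",  "bbox": [34, 120, 88, 260]},   # [x, y, width, height]
                {"label": "bicycle", "bbox": [150, 200, 120, 90]},
            ],
            "caption": "A person walking a bicycle down a city street.",
        },
    ]

    # During training, a model sees the image pixels alongside these
    # labels and learns to associate image regions with the label text.
    for example in annotations:
        for obj in example["objects"]:
            print(example["image"], obj["label"], obj["bbox"])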

Studies have shown that biased labels can produce biased models. For example, the average annotator is more likely to label phrases in African-American Vernacular English (AAVE), the informal grammar used by some Black Americans, as toxic, leading AI toxicity detectors trained on the labels to see AAVE as disproportionately toxic.

Other experts don’t buy the o1 Chinese data labeling hypothesis, however. They point out that o1 is just as likely to switch to Hindi, Thai, or a language other than Chinese while teasing out a solution.

Rather, these experts say, o1 and other reasoning models might simply be using languages they find most efficient to achieve an objective (or hallucinating).

“The model doesn’t know what language is, or that languages are different,” Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta, told TechCrunch. “It’s all just text to it.”

Indeed, models don’t directly process words. They use tokens instead. Tokens can be words, such as “fantastic.” Or they can be syllables, like “fan,” “tas” and “tic.” Or they can even be individual characters in words — e.g. “f,” “a,” “n,” “t,” “a,” “s,” “t,” “i,” “c.”
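
As a concrete, hedged illustration, the open-source tiktoken library (which implements OpenAI's public tokenizers; the encoding named below is an assumption and not necessarily what o1 uses internally) shows how a single word splits into subword tokens:

    # Illustrative only: how a byte-pair-encoding tokenizer splits text
    # into tokens. Requires `pip install tiktoken`; the exact splits
    # depend on the chosen encoding and are not necessarily o1's.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    for text in ["fantastic", "How many R's are in the word 'strawberry'?"]:
        token_ids = enc.encode(text)
        pieces = [enc.decode([tid]) for tid in token_ids]
        print(text)
        print("  token ids:", token_ids)
        print("  pieces:   ", pieces)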

Like labeling, tokens can introduce biases. For example, many word-to-token translators assume a space in a sentence denotes a new word, despite the fact that not all languages use spaces to separate words.
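
A minimal sketch of that assumption (the Chinese sentence is purely illustrative): naive whitespace splitting recovers word-like units in English but returns an entire sentence as a single chunk in a language written without spaces.

    # Naive whitespace splitting: reasonable for English, useless for
    # languages that do not separate words with spaces (Chinese,
    # Japanese, Thai, and others).
    english = "How many R's are in the word strawberry"
    chinese = "草莓这个词里有几个R"  # illustrative sentence; no spaces between words

    print(english.split())  # ['How', 'many', "R's", 'are', 'in', 'the', 'word', 'strawberry']
    print(chinese.split())  # ['草莓这个词里有几个R'] -- one undivided chunk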

Tiezhen Wang, a software engineer at AI startup Hugging Face, agrees with Guzdial that reasoning models’ language inconsistencies may be explained by associations the models made during training.

“By embracing every linguistic nuance, we expand the model’s worldview and allow it to learn from the full spectrum of human knowledge,” Wang wrote in a post on X. “For example, I prefer doing math in Chinese because each digit is just one syllable, which makes calculations crisp and efficient. But when it comes to topics like unconscious bias, I automatically switch to English, mainly because that’s where I first learned and absorbed those ideas.”

Wang’s theory is plausible. Models are probabilistic machines, after all. Trained on many examples, they learn patterns to make predictions, such as how “to whom” in an email typically precedes “it may concern.”
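
A toy sketch of that idea, with invented counts: a model that has seen many emails assigns most of the probability mass after "to whom" to the continuation "it may concern."

    # Toy illustration of next-phrase prediction from co-occurrence
    # counts. The counts are invented; real models estimate token-level
    # probabilities from billions of training examples.
    from collections import Counter

    continuations_after_to_whom = Counter({
        "it may concern": 970,
        "do I owe this honor": 20,
        "should I address this": 10,
    })

    total = sum(continuations_after_to_whom.values())
    for phrase, count in continuations_after_to_whom.most_common():
        print(f"P({phrase!r} | 'to whom') = {count / total:.2f}")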

But Luca Soldaini, a research scientist at the nonprofit Allen Institute for AI, cautioned that we can’t know for certain. “This type of observation on a deployed AI system is impossible to back up due to how opaque these models are,” he told TechCrunch. “It’s one of the many cases for why transparency in how AI systems are built is fundamental.”

Short of an answer from OpenAI, we’re left to muse about why o1 thinks of songs in French but synthetic biology in Mandarin.
