MarkTechPost@AI · April 14, 01:40
Reasoning Models Know When They’re Right: NYU Researchers Introduce a Hidden-State Probe That Enables Efficient Self-Verification and Reduces Token Usage by 24%

This article describes progress by a New York University research team on AI reasoning models. The team developed a lightweight probe that examines the intermediate steps of a model's reasoning process, enabling efficient self-verification. By analyzing the model's hidden states, the probe predicts whether intermediate answers are correct and can terminate reasoning early on that basis, cutting unnecessary computation. Experiments show the method substantially reduces token usage while maintaining high accuracy, offering a new direction for building smarter, more efficient AI reasoning systems.

🧠 Background: When solving complex problems, AI reasoning models typically generate a series of reasoning steps, but they sometimes keep processing even after reaching a correct conclusion, wasting computational resources.

💡 Method: The team designed a lightweight probe that analyzes the model's hidden states during reasoning. The probe predicts the correctness of intermediate answers, enabling efficient self-verification.

🔍 Results: The probe performs strongly across multiple datasets, with ROC-AUC scores above 0.9 and Expected Calibration Error (ECE) below 0.1. Using a confidence threshold to terminate reasoning early reduces token usage by up to 24% while maintaining high accuracy.

🚀 Outlook: This internal-state-based self-verification approach offers a new direction for building smarter, more efficient AI reasoning systems. Tapping into what the model already knows can meaningfully improve performance and reduce resource consumption.

Artificial intelligence systems have made significant strides in simulating human-style reasoning, particularly in mathematics and logic. These models don't just generate answers; they walk through a series of logical steps to reach conclusions, offering insight into how and why those answers are produced. This step-by-step reasoning, often called Chain-of-Thought (CoT), has become vital to how machines handle complex problem-solving tasks.

A common problem researchers encounter with these models is inefficiency during inference. Reasoning models often continue processing even after reaching a correct conclusion. This overthinking results in the unnecessary generation of tokens, increasing computational cost. Whether these models have an internal sense of correctness remains unclear—do they realize when an intermediate answer is right? If they could identify this internally, the models could halt processing earlier, becoming more efficient without losing accuracy.

Many current approaches measure a model’s confidence through verbal prompts or by analyzing multiple outputs. These black-box strategies ask the model to report how sure it is of its answer. However, they are often imprecise and computationally expensive. On the other hand, white-box methods investigate models’ internal hidden states to extract signals that may correlate with answer correctness. Prior work shows that a model’s internal states can indicate the validity of final answers, but applying this to intermediate steps in long reasoning chains is still an underexplored direction.

The research introduced by a team from New York University and NYU Shanghai tackled this gap by designing a lightweight probe—a simple two-layer neural network—to inspect a model’s hidden states at intermediate reasoning steps. The models used for experimentation included the DeepSeek-R1-Distill series and QwQ-32B, known for their step-by-step reasoning capabilities. These models were tested across various datasets involving mathematical and logical tasks. The researchers trained their probe to read the internal state associated with each chunk of reasoning and predict whether the current intermediate answer was correct.
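To make the setup concrete, here is a minimal sketch of what such a lightweight probe could look like: a small two-layer MLP that maps an intermediate-step hidden state to a probability that the current answer is correct. The hidden size and probe width below are illustrative assumptions, not values reported by the authors.

```python
import torch
import torch.nn as nn

class CorrectnessProbe(nn.Module):
    """A minimal sketch of a two-layer correctness probe (illustrative sizes)."""

    def __init__(self, d_model: int = 5120, probe_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, probe_dim),  # hidden-state vector -> probe features
            nn.ReLU(),
            nn.Linear(probe_dim, 1),        # probe features -> correctness logit
        )

    def forward(self, hidden_state: torch.Tensor) -> torch.Tensor:
        # hidden_state: (batch, d_model), the last-token hidden state of a
        # reasoning chunk. Returns (batch,) probabilities of correctness.
        return torch.sigmoid(self.net(hidden_state)).squeeze(-1)
```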

To construct their approach, the researchers first segmented each long CoT output into smaller parts or chunks, using markers like “wait” or “verify” to identify breaks in reasoning. They used the last token’s hidden state in each chunk as a representation and matched this to a correctness label, which was judged using another model. These representations were then used to train the probe on binary classification tasks. The probe was fine-tuned using grid search across hyperparameters like learning rate and hidden layer size, with most models converging to linear probes—indicating that correctness information is often linearly embedded in the hidden states. The probe worked for fully formed answers and showed the ability to predict correctness before an answer was even completed, hinting at look-ahead capabilities.
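A hypothetical sketch of this data-construction step is shown below, assuming access to each chunk's last-token hidden state and an external judge model; the marker list and function names are illustrative, not the authors' code.

```python
import re
import torch

# Split a chain-of-thought at reasoning markers such as "wait" or "verify",
# then pair each chunk's last-token hidden state with a 0/1 correctness label.
MARKERS = re.compile(r"\b(?:wait|verify)\b", flags=re.IGNORECASE)

def split_cot_into_chunks(cot_text: str) -> list[str]:
    """Split a chain-of-thought string at marker words, dropping empty pieces."""
    return [chunk.strip() for chunk in MARKERS.split(cot_text) if chunk.strip()]

def build_training_pairs(chunks, last_token_states, judge_fn):
    """Build (features, labels) for the binary correctness classifier.

    last_token_states: one (d_model,) tensor per chunk (last token's hidden state).
    judge_fn: an external judge (e.g., another model) returning 1 if the chunk's
    intermediate answer is correct, else 0.
    """
    X = torch.stack(list(last_token_states))
    y = torch.tensor([judge_fn(c) for c in chunks], dtype=torch.float32)
    return X, y
```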

Performance results were clear and quantifiable. The probes achieved ROC-AUC scores exceeding 0.9 for some datasets like AIME when using models like R1-Distill-Qwen-32B. Expected Calibration Errors (ECE) remained under 0.1, showing high reliability. For example, R1-Distill-Qwen-32B had an ECE of just 0.01 on GSM8K and 0.06 on MATH datasets. In application, the probe was used to implement a confidence-based early exit strategy during inference. The reasoning process was stopped when the probe’s confidence in an answer exceeded a threshold. At a confidence threshold of 0.85, the accuracy remained at 88.2%, while the inference token count was reduced by 24%. Even at a threshold of 0.9, accuracy stayed at 88.6%, with a 19% token reduction. Compared to static exit methods, this dynamic strategy achieved up to 5% higher accuracy using the same or fewer tokens.
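The early-exit logic itself is simple; below is a simplified sketch of such a confidence-based loop, assuming a probe like the one above and a placeholder generate_chunk function that wraps the reasoning model (not a real API).

```python
def reason_with_early_exit(generate_chunk, probe, prompt, threshold=0.85, max_chunks=64):
    """Generate reasoning chunk by chunk, stopping once the probe is confident.

    generate_chunk(context) -> (new_text, last_token_hidden_state) is assumed to
    wrap the underlying reasoning model; it is a placeholder, not a real API.
    """
    context = prompt
    confidence = 0.0
    for _ in range(max_chunks):
        chunk_text, last_hidden = generate_chunk(context)
        context += chunk_text
        # Score the last token's hidden state of the newly generated chunk.
        confidence = probe(last_hidden.unsqueeze(0)).item()
        if confidence >= threshold:
            break  # the probe believes the current intermediate answer is correct
    return context, confidence
```

Raising the threshold makes the exit more conservative, trading extra tokens for confidence, which mirrors the reported trade-off between the 0.85 and 0.9 settings.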

This study offers an efficient, integrated way for reasoning models to self-verify during inference. The researchers’ approach pinpoints a gap—while models inherently know when they’re right, they don’t act on it. The research reveals a path toward smarter, more efficient reasoning systems by leveraging internal representations through probing. It shows that tapping into what the model already “knows” can lead to meaningful performance and resource use improvements.


Check out the Paper. All credit for this research goes to the researchers of this project. Also, feel free to follow us on Twitter and don’t forget to join our 85k+ ML SubReddit.

The post Reasoning Models Know When They’re Right: NYU Researchers Introduce a Hidden-State Probe That Enables Efficient Self-Verification and Reduces Token Usage by 24% appeared first on MarkTechPost.


Related tags

AI reasoning · Self-verification · Token optimization · Hidden states · Machine learning