少点错误 · October 25, 2024
Balancing Label Quantity and Quality for Scalable Elicitation

This post presents scalable-oversight research on using low-quality labels to improve model performance on binary NLP classification tasks. The authors find that, under a limited budget, combining high- and low-quality labels can yield higher accuracy than using either alone. They also find that adding a few-shot prompt during training improves the model's sample efficiency and, in turn, its performance.

🤔 The study asks how, under a limited budget, high- and low-quality labels can be combined to improve model performance. On binary NLP classification tasks, the authors identify three distinct training regimes: quality-dominant, quantity-dominant, and mixed.

🎯 In the quality-dominant regime (ample budget), only high-quality labels are used for training. In the quantity-dominant regime (tight budget), only low-quality labels are used. In the mixed regime (intermediate budget), the model is first trained on low-quality labels and then fine-tuned on high-quality labels.

💡 The authors find that in the mixed regime, training first on low-quality labels and then fine-tuning on high-quality labels yields higher accuracy than using either label type alone.

🚀 They also find that adding a few-shot prompt during training improves the model's sample efficiency and thus its performance.

📊 These results suggest that, in practice, combining labels of different qualities with few-shot prompting can improve model performance on real-world problems.

Published on October 24, 2024 4:49 PM GMT

ArXiv paper.

Thanks to Nora Belrose, Buck Shlegeris, Jan Hendrik Kirchner, and Ansh Radhakrishnan for guidance throughout the project.

Scalable oversight studies methods of training and evaluating AI systems in domains where human judgment is unreliable or expensive, such as scientific research and software engineering in complex codebases. Most work in this area has focused on methods of improving the quality of labels. Recent work by Burns et al. (2023) considers the complementary problem of training models with low-quality labels, finding that large pretrained models often have an inductive bias towards producing correct answers. In practice, however, neither label quantity nor quality is fixed: practitioners face a quantity-quality tradeoff. In this paper, we explore the microeconomics of the quantity-quality tradeoff on binary NLP classification tasks used in Burns et al. (2023). While sample-efficient learning has been studied extensively, little public research has focused on scalable elicitation: eliciting capabilities from pretrained models subject to labeling cost constraints. We find that this setting has novel dynamics caused by the tradeoff between label quantity and quality, as well as the model's existing latent capabilities. We observe three regimes of eliciting classification knowledge from pretrained models using supervised finetuning: quantity-dominant, quality-dominant, and a mixed regime involving the use of low- and high-quality data together to attain higher accuracy at a lower cost than using either alone. We explore sample-efficient elicitation methods that make use of two datasets of differing qualities, and establish a Pareto frontier of scalable elicitation methods that optimally trade off labeling cost and classifier performance. We find that the accuracy of supervised fine-tuning can be improved by up to 5 percentage points at a fixed labeling budget by adding a few-shot prompt to make use of the model's existing knowledge of the task.

How does this help with AI safety?

Ensuring safety of capable AI systems would be a lot easier if humans had access to all of the knowledge of the AIs they’re supervising. This is the broad framing that has motivated my interest in the Eliciting Latent Knowledge agenda. In this work, we try to measure how effective various elicitation strategies are (in binary classification settings) by plotting accuracy versus cost given various assumptions about the costs of low- and high-quality labels. We attempt to investigate scalable oversight as a quantitative rather than qualitative problem (the framing laid out in the post is roughly what motivates this work).

While I think our work has somewhat generalizable insights for non-scheming models, there may be additional difficulties when trying to elicit knowledge from schemers because intentionally misgeneralizing policies may be more salient than the policies we want to elicit for some distributions of inputs.

Summary of findings

Here are two of our findings:

1. There exists a “mixed” regime where some budget should be spent on a large quantity of low-quality labels before training on some high-quality labels

Here, we arbitrarily define high-quality labels to cost $1 and weak labels to cost $0.10, so along the x-axis, one high-quality label is given up for every 10 weak labels used. The model is trained sequentially on low- then high-quality labels. Different budgets produce three regimes with distinct optimal budget allocations:

Quality-dominant (budget ≥ $1024): no budget should be allocated to weak labels

Quantity-dominant (budget ≤ $64): all budget should be allocated to weak labels

Mixed ($256 ≤ budget < $1024): the peak of the accuracy curve is somewhere in the middle
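To make the x-axis of this tradeoff concrete, here is a minimal sketch (using the post's assumed prices of $1 per high-quality label and $0.10 per weak label; the 50/50 mixed split is purely illustrative, not the optimal allocation) of how many labels of each kind a given budget buys:

```python
# Illustrative sketch of the budget allocations described above.
# Prices follow the post's assumption: $1 per high-quality label,
# $0.10 per weak label (10 weak labels per high-quality label given up).

HIGH_COST = 1.00
WEAK_COST = 0.10

def allocation(budget_usd: float, frac_on_weak: float) -> tuple[int, int]:
    """Return (n_weak, n_high): label counts when `frac_on_weak` of the
    budget goes to weak labels and the remainder to high-quality labels."""
    weak_spend = budget_usd * frac_on_weak
    n_weak = int(weak_spend / WEAK_COST)
    n_high = int((budget_usd - weak_spend) / HIGH_COST)
    return n_weak, n_high

# The three regimes correspond to different optimal splits:
print(allocation(1024, 0.0))  # quality-dominant: everything on high-quality -> (0, 1024)
print(allocation(64, 1.0))    # quantity-dominant: everything on weak labels -> (640, 0)
print(allocation(256, 0.5))   # mixed: some of each -> (1280, 128)
```

In the mixed regime, the model would then be trained sequentially: first on the weak labels, then fine-tuned on the high-quality ones.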

2. Increasing the salience of the task with a few-shot prompt consistently increases the sample-efficiency of SFT compared to either few-shot prompting or SFT alone.
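The paper describes this as prepending a few-shot prompt to fine-tuning inputs so the model's existing task knowledge is made salient during SFT. A minimal sketch of the data-formatting step (the helper names and the prompt template are hypothetical, not taken from the paper's code):

```python
# Hypothetical sketch: build SFT examples whose inputs are prefixed with a
# small few-shot prompt, so supervised fine-tuning sees the task in-context.

def build_fewshot_prefix(demos: list[tuple[str, str]]) -> str:
    """Format labeled demonstrations as an in-context prefix."""
    return "".join(f"Input: {x}\nLabel: {y}\n\n" for x, y in demos)

def make_sft_example(fewshot_prefix: str, text: str, label: str) -> dict:
    """One supervised fine-tuning example with the few-shot prefix prepended."""
    return {
        "prompt": fewshot_prefix + f"Input: {text}\nLabel:",
        "completion": f" {label}",
    }

demos = [("The movie was wonderful.", "positive"),
         ("Terrible service.", "negative")]
prefix = build_fewshot_prefix(demos)
ex = make_sft_example(prefix, "I loved it.", "positive")
print(ex["prompt"].endswith("Input: I loved it.\nLabel:"))  # True
```

The point of the prefix is only to raise task salience; the training loss is still computed on the completion as in ordinary SFT.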

We think that more research should be aimed at expanding the Pareto frontier of labeling cost and accuracy in realistic elicitation settings, and answering related questions like “How sample efficient should we expect elicitation to be?”

