MarkTechPost@AI 2024年10月03日
Integrated Value Guidance (IVG): An AI Method that Combines Implicit and Explicit Value Functions Applied to Token-Wise Sampling and Chunk-Level Beam Search
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

文章介绍了一种名为集成价值引导(IVG)的AI方法,该方法在模型训练后整合人类价值,避免反复重训模型,能实时适应用户偏好,在多项任务中表现良好。

🎯集成价值引导(IVG)结合了隐式和显式函数,可在推理时使大型语言模型与人类偏好对齐,避免传统微调的复杂性。

💻IVG的隐式函数用于令牌生成,进行逐词评估并选择概率最高的输出;显式函数用于评估较大文本块并生成概率最高的后续词序列,但存在灵活性和计算成本问题。

📈研究者通过两个实验评估IVG,包括受控情感生成和总结以及指令遵循,结果表明IVG在各项任务中表现出色,且Chunk-Level Beam Search比Emulator Fine-Tuning速度效率更高。

Integrating human values after model training using Learning-based algorithms requires fine-tuning LLMs, which requires more computational power and is time-consuming. Additionally, it generates biased and undesirable responses by the user. There is a need to develop a model that can efficiently adapt to user preferences in real time by integrating algorithms that can interfere at inference time. This method will avoid retraining the models repeatedly for desired results by freezing the base model and reducing the computational cost of fine-tuning LLMs.

Researchers developed Inference-time alignment methods to integrate human values after fine-tuning LLMs using the implicit and explicit functions without changing the base model. Implicit functions are used for token generation, which conducts word-by-word evaluations and prefers the output with the highest probability. In contrast, explicit functions require a rigid structure to evaluate larger chunks of text and generate the following sequence of words with the highest probability while maintaining overall context. The explicit function is inflexible and computationally expensive, failing to address token-level optimization, while the implicit function faces interpretability issues and requires frequent forward passes, leading to low real-time efficiency.

To tackle the disadvantages of both functions, the proposed method, Integrated Value Guidance (IVG), combines the implicit function’s token-level optimization and the explicit function’s broader perspective. It was able to ward off adaptation challenges and trade-offs in alignment efficacy, leading to decreased performance discrepancies and making it easier to implement. These advantages facilitated better performance on tasks like controlled sentiment generation and summarization. IVG, combined with the smaller models like GPT-2, could compete with higher models.

IVG incorporates the two value functions, the implicit and explicit functions, to align the model with human values. First, token-wise sampling fine-tunes individual tokens to a specific sequence length, generating multiple sequences. Then, chunk-level beam search compares the probabilities of these sequences and selects the one with the highest probability. Although this method ensures that the output is more robust, the computational power increases during the inference time due to frequent forward passes, leading to slower responses. 

Researchers have used two experimental set-ups to evaluate IVG: 1. Controlled sentiment generation and Summarization, and 2. Instruction-following. In the first one, the GPT-2 model family is used by leveraging synthetic datasets from a gold-reward model to generate positive movie reviews and summarise Reddit posts. In comparison, the second one requires an instruction-tuned model, AlpacaEval 2.0. It employs Tulu Guidance, which uses specific models for implicit function and trains a reward-based model for the explicit function, and Ultraguidance, which fine-tunes a model with Direct Preference Optimization (DPO) for both functions. GPT-4-turbo was used as a reference to assess responses in the second experiment, and IVG consistently performed well.

In addition to these two experiments, an ablation study proved that Chunk-Level Beam Search (CBS) had higher speed efficiency than Emulator Fine-Tuning (EFT), which uses the implicit function for fine-tuning. These results have proved that CBS is much better to use in practice.

In conclusion, Integrated Value Guidance (IVG) offers a novel and efficient approach to aligning large language models with human preferences purely at inference time, bypassing the complexities of traditional fine-tuning. By leveraging implicit and explicit value functions, IVG enhances performance in both token-wise sampling and chunk-level decoding, as demonstrated through significant improvements in sentiment generation, summarization, and instruction-following tasks. The results showed that IVG is a versatile method, providing strong empirical evidence of its ability to outperform existing approaches, making it a promising solution for fine-tuning large models in real-world applications.


Check out the Paper. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter and join our Telegram Channel and LinkedIn Group. If you like our work, you will love our newsletter..

Don’t Forget to join our 50k+ ML SubReddit

Want to get in front of 1 Million+ AI Readers? Work with us here

The post Integrated Value Guidance (IVG): An AI Method that Combines Implicit and Explicit Value Functions Applied to Token-Wise Sampling and Chunk-Level Beam Search appeared first on MarkTechPost.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

集成价值引导 AI方法 模型优化
相关文章