MarkTechPost@AI, November 4, 2024
Efficient Function Calling in Small-Scale LLMs: A Game-Changer for AI Reasoning Tasks

This article examines efficient function calling in small-scale LLMs, covering their natural language understanding and generation capabilities and the exploration of function calling. It also notes the problems with current large-scale LLMs, such as high resource consumption and weak performance on specific problem types, and then details a new framework addressing these issues, including its training method, pipeline stages, experimental setup, and results.

🧐 A framework for efficient function calling in small-scale LLMs is proposed to address the high resource consumption of large-scale LLMs and their limited ability to solve specific problems. The framework injects function descriptions and examples into the prompt to create a dataset of correct and incorrect reasoning-chain completions, which is then used to train the small model.

📋 The training pipeline defines the tasks and problems, sets up task-specific functions, selects a pre-trained large model to generate the dataset, and fine-tunes the small model with the DPO algorithm. Experiments were run on first-order logic and math problems, using the agent-based library Microchain to create the reasoning-chain dataset.

📈 Data augmentation extended the dataset, and fine-tuning was performed on Mistral-7B. Performance metrics show a marked accuracy improvement on first-order logic tasks and moderate gains on math tasks, with statistical significance confirmed by a Wilcoxon test.

🎉 The results show that the method reduces the need for large models and improves the small model's performance on logic- and math-related tasks, achieving near-perfect accuracy on first-order logic tasks. Future work may extend the framework to a broader range of reasoning tasks and function types.

Recent advancements in Large Language Models (LLMs) have demonstrated exceptional natural language understanding and generation capabilities. Research has explored the unexpected abilities of LLMs beyond their primary training task of text prediction. These models have shown promise in function calling for software APIs, supported by the launch of GPT-4 plugin features. Integrated tools include web browsers, translation systems, Dialogue State Tracking (DST), and robotics. While LLMs show promising results in general complex reasoning, they still face challenges in mathematical problem-solving and logical reasoning. To address this, researchers have proposed techniques such as function calls, which allow LLMs to execute provided functions and use their outputs to help complete various tasks. These functions range from basic tools like calculators that perform arithmetic operations to more advanced methods. However, tasks that use only a small portion of the available APIs highlight the inefficiency of relying solely on large models, which demand substantial computational power and incur high costs for both training and inference. This situation calls for smaller, task-specific LLMs that retain core functionality while reducing operational costs. While promising, the trend toward smaller models introduces new challenges.

Current methods involve using large-scale LLMs for reasoning tasks, which are resource-intensive and costly. Due to their generalized nature, these models often struggle with specific logical and mathematical problem-solving. 

The proposed research method introduces a novel framework for training smaller LLMs in function calling, focusing on specific reasoning tasks. This approach employs an agent that queries the LLM by injecting descriptions and examples of usable functions into the prompt, creating a dataset of correct and incorrect reasoning chain completions.

To address the drawbacks of oversized LLMs, which incur excessive training and inference costs, a group of researchers introduced a novel framework to train smaller language models starting from the function-calling abilities of large models for specific logical and mathematical reasoning tasks. Given a problem and a set of useful functions for its solution, this framework involves an agent that queries a large-scale LLM by injecting function descriptions and examples into the prompt and managing the proper function calls needed to find the solution, all in a step-by-step reasoning chain. This procedure is used to create a dataset with correct and incorrect completions. The generated dataset then trains a smaller model using Direct Preference Optimization (DPO), a preference-based alternative to Reinforcement Learning from Human Feedback (RLHF). The researchers tested this methodology on two reasoning tasks, First-Order Logic (FOL) and math, using a custom-built set of FOL problems inspired by a HuggingFace dataset.
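To illustrate how such a dataset could be assembled, here is a minimal Python sketch that injects function descriptions and a worked example into the prompt, samples several reasoning chains per problem from the large model, and pairs correct with incorrect completions; the function set, `query_llm`, and `is_correct` are hypothetical placeholders rather than the authors' implementation.

```python
# Hypothetical sketch: building preference pairs from reasoning-chain
# completions produced by a large teacher LLM.

FUNCTIONS = """
Reasoning(thought: str)      -> record an intermediate reasoning step
Calculator(expression: str)  -> evaluate an arithmetic expression
Stop(answer: str)            -> terminate the chain with a final answer
"""

EXAMPLE = 'Reasoning("2 apples plus 3 apples"); Calculator("2 + 3"); Stop("5")'

def build_prompt(problem: str) -> str:
    """Inject function descriptions and a worked example into the prompt."""
    return (
        f"You can call the following functions:\n{FUNCTIONS}\n"
        f"Example chain:\n{EXAMPLE}\n\n"
        f"Problem: {problem}\nSolve it step by step using function calls."
    )

def collect_pairs(problems, query_llm, is_correct, samples_per_problem=8):
    """Sample several chains per problem; keep (correct, incorrect) pairs."""
    pairs = []
    for problem in problems:
        prompt = build_prompt(problem)
        chains = [query_llm(prompt) for _ in range(samples_per_problem)]
        good = [c for c in chains if is_correct(problem, c)]
        bad = [c for c in chains if not is_correct(problem, c)]
        pairs += [{"prompt": prompt, "chosen": g, "rejected": b}
                  for g, b in zip(good, bad)]
    return pairs
```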

The proposed framework’s pipeline comprises four stages: first, defining tasks and problems to assess the abilities of large language models (LLMs) in various reasoning domains. Next, functions specific to each task are set up, allowing the LLM to solve reasoning steps, manage the chain flow, and verify results. A pre-trained, large-scale LLM is then chosen to generate a dataset of correct and incorrect completions using a chain-of-thought prompting approach. Finally, a smaller LLM is fine-tuned using the Direct Preference Optimization (DPO) algorithm on the created dataset. Experimentation involved testing the model on first-order logic (FOL) and mathematical problems, with results generated using an agent-based library, Microchain, which facilitates LLM querying with predefined functions to create a chain-of-thought dataset.
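As a rough sketch of the final fine-tuning stage, the snippet below applies Hugging Face TRL's DPOTrainer to a preference dataset of prompt/chosen/rejected triples; the base checkpoint, hyperparameters, and toy pairs are assumptions made for illustration, and some argument names vary between TRL versions.

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Toy preference pairs; in the paper these come from the agent-generated
# reasoning-chain dataset (see the collection sketch above).
pairs = [
    {
        "prompt": "Problem: All A are B; x is A. Is x B? Answer with function calls.",
        "chosen": 'Reasoning("x is A and all A are B, so x is B"); Stop("yes")',
        "rejected": 'Stop("no")',
    },
]
train_dataset = Dataset.from_list(pairs)

model_name = "mistralai/Mistral-7B-v0.1"  # assumed base checkpoint
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

config = DPOConfig(
    output_dir="dpo-function-calling",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # keeps the run within a single GPU
    num_train_epochs=1,
    beta=0.1,  # DPO preference temperature; set on the trainer in older TRL
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # named `tokenizer=` in older TRL releases
)
trainer.train()
```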

Data augmentation was conducted to extend the dataset, and fine-tuning was performed on Mistral-7B using a single GPU. Performance metrics demonstrated the model’s accuracy improvement in FOL tasks and moderate gains in mathematical tasks, with statistical significance confirmed through a Wilcoxon test.
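A paired Wilcoxon signed-rank test of this kind can be run with SciPy; the sketch below uses invented accuracy numbers purely to show the call, not the paper's measurements.

```python
# Hypothetical example: paired Wilcoxon signed-rank test comparing accuracies
# on matched problem sets before vs. after fine-tuning (numbers are made up).
from scipy.stats import wilcoxon

acc_base      = [0.40, 0.55, 0.35, 0.60, 0.45, 0.50, 0.30, 0.65]
acc_finetuned = [0.90, 0.95, 1.00, 0.85, 0.95, 1.00, 0.90, 0.95]

stat, p_value = wilcoxon(acc_finetuned, acc_base, alternative="greater")
print(f"W = {stat:.2f}, p = {p_value:.4f}")  # small p => significant improvement
```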

In conclusion, the researchers proposed a new framework for improving the function-calling abilities of small-scale LLMs, focusing on specific logical and mathematical reasoning tasks. The method reduces the need for large models and boosts performance on logic- and math-related tasks. The experimental results demonstrate significant improvements in the performance of the small-scale model on FOL tasks, achieving near-perfect accuracy in most cases. Future work could explore applying the introduced framework to a broader range of reasoning tasks and function types.


