MarkTechPost@AI · July 7, 2024
InternLM2.5-7B-Chat: Open Sourcing Large Language Models with Unmatched Reasoning, Long-Context Handling, and Enhanced Tool Use

InternLM2.5-7B-Chat is the latest open-source large language model from InternLM, excelling at reasoning, long-context handling, and tool use. The model has 7 billion parameters and is optimized for practical scenarios. Its reasoning ability, particularly in mathematical reasoning, surpasses competitors such as Llama3 and Gemma2-9B. It also offers a 1M-token context window, achieving near-perfect performance on long-context tasks such as LongBench. InternLM2.5-7B-Chat can also handle tool use, supporting information gathering from over 100 web pages.

🤔 InternLM2.5-7B-Chat makes notable gains in reasoning, especially mathematical reasoning, surpassing competitors such as Llama3 and Gemma2-9B. The improvement stems from refinements to the model architecture and training on a large corpus of synthetic data.

🚀 InternLM2.5-7B-Chat offers a 1M-token context window, enabling long-context tasks such as retrieving information from large collections of documents. This capability is further enhanced when paired with LMDeploy, a toolkit developed by the MMRazor and MMDeploy teams.

🧰 InternLM2.5-7B-Chat can handle tool use, supporting information gathering from over 100 web pages. The upcoming release of Lagent will further enhance this functionality, improving the model's instruction following, tool selection, and reflection.

📊 Performance evaluations with the OpenCompass tool show that InternLM2.5-7B-Chat performs strongly across the board, covering disciplinary, language, knowledge, reasoning, and comprehension competence. On benchmarks such as MMLU, CMMLU, BBH, MATH, GSM8K, and GPQA, the model consistently delivers superior performance compared to its peers.

💻 InternLM2.5-7B-Chat ships with a comprehensive installation guide, model download instructions, and examples of model inference and service deployment. Users can perform batched offline inference on the quantized model with lmdeploy, which supports INT4 weight-only quantization and deployment (W4A16). This setup delivers up to 2.4x faster inference than FP16 on compatible NVIDIA GPUs, including the 20, 30, and 40 series and the A10, A16, A30, and A100.

InternLM has unveiled its latest advancement in open large language models, the InternLM2.5-7B-Chat, available in GGUF format. The model is compatible with llama.cpp, an open-source framework for LLM inference, and can be used locally and in the cloud across various hardware platforms. The GGUF release offers half-precision and low-bit quantized versions, including q5_0, q5_k_m, q6_k, and q8_0.
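
To make this concrete, here is a minimal sketch of running one of these GGUF quants locally through the llama-cpp-python bindings; the model file name is an assumption standing in for whichever quantized variant (e.g. q5_k_m) you download:

```python
# Minimal sketch: local inference on a GGUF quant of InternLM2.5-7B-Chat
# via the llama-cpp-python bindings. The model_path is a hypothetical
# local file name; substitute the GGUF file you actually downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="internlm2_5-7b-chat-q5_k_m.gguf",  # assumed file name
    n_ctx=4096,        # context window to allocate for this session
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain the GGUF format in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```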

InternLM2.5 builds on its predecessor, offering a 7 billion parameter base model and a chat model tailored for practical scenarios. This model boasts state-of-the-art reasoning capabilities, especially in mathematical reasoning, surpassing competitors like Llama3 and Gemma2-9B. It also features an impressive 1M context window, demonstrating near-perfect performance in long-context tasks such as those assessed by LongBench.

The model’s ability to handle long contexts makes it particularly effective in retrieving information from extensive documents. This capability is enhanced when paired with LMDeploy, a toolkit developed by the MMRazor and MMDeploy teams for compressing, deploying, and serving LLMs. The InternLM2.5-7B-Chat-1M variant, designed for 1M-long context inference, exemplifies this strength. This version requires significant computational resources, such as 4xA100-80G GPUs, to operate effectively.
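
For illustration, here is a sketch of what long-context serving could look like with LMDeploy's Python pipeline API; the session length, tensor-parallel degree, and input file name below are assumptions rather than settings confirmed by the release:

```python
# Sketch: serving the 1M-context variant with LMDeploy's pipeline API.
# session_len and tp are illustrative assumptions; consult the model card
# for recommended values. Running at this context length realistically
# needs multi-GPU hardware (e.g. 4xA100-80G, per the article).
from lmdeploy import pipeline, GenerationConfig, TurbomindEngineConfig

backend_config = TurbomindEngineConfig(
    session_len=1_048_576,  # allocate a ~1M-token session
    tp=4,                   # tensor parallelism across 4 GPUs
)
pipe = pipeline("internlm/internlm2_5-7b-chat-1m", backend_config=backend_config)

long_document = open("report.txt").read()  # hypothetical long input
resp = pipe(
    [f"{long_document}\n\nSummarize the key findings."],
    gen_config=GenerationConfig(max_new_tokens=256),
)
print(resp[0].text)
```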

Performance evaluations conducted using the OpenCompass tool highlight the model’s competencies across various dimensions: disciplinary competence, language competence, knowledge competence, inference competence, and comprehension competence. In benchmarks like MMLU, CMMLU, BBH, MATH, GSM8K, and GPQA, InternLM2.5-7B-Chat consistently delivers superior performance compared to its peers. For instance, on the MMLU benchmark the model achieves a score of 72.8, outpacing models like Llama-3-8B-Instruct and Gemma2-9B-IT.

InternLM2.5-7B-Chat also excels at tool use, supporting information gathering from over 100 web pages. The upcoming release of Lagent will further enhance this functionality, improving the model’s capabilities in instruction following, tool selection, and reflection.
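
Since the post does not describe Lagent's interface, the following is only a hypothetical sketch of the propose/execute/reflect loop that tool-using agents generally follow; call_model and search_web are illustrative placeholders, not Lagent functions:

```python
# Hypothetical sketch of a tool-use loop, NOT Lagent's actual API.
# call_model() and search_web() are placeholders for a chat-model call
# and a web-search tool; the loop shows the propose/execute/reflect cycle.
import json

def call_model(messages):  # placeholder: query InternLM2.5-7B-Chat
    raise NotImplementedError

def search_web(query):     # placeholder: fetch and summarize web pages
    raise NotImplementedError

def run_agent(question, max_steps=5):
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = call_model(messages)
        messages.append({"role": "assistant", "content": reply})
        if reply.strip().startswith("{"):  # model proposed a tool call
            action = json.loads(reply)
            observation = search_web(action["query"])
            # Feed the tool result back so the model can reflect on it.
            messages.append({"role": "tool", "content": observation})
        else:
            return reply  # model answered directly; stop the loop
    return "No answer within the step budget."
```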

The model’s release includes a comprehensive installation guide, model download instructions, and model inference and service deployment examples. Users can perform batched offline inference with the quantized model using lmdeploy, a framework supporting INT4 weight-only quantization and deployment (W4A16). This setup offers up to 2.4x faster inference than FP16 on compatible NVIDIA GPUs, including the 20, 30, and 40 series and A10, A16, A30, and A100.
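
As a sketch of that workflow, the snippet below loads a 4-bit (W4A16) checkpoint with lmdeploy and runs batched offline inference; the 4-bit model identifier is an assumption, so substitute the quantized checkpoint you actually downloaded:

```python
# Sketch: batched offline inference on a W4A16 (INT4 weight-only) quant
# with lmdeploy. The 4-bit model id is an assumption; model_format="awq"
# tells the TurboMind backend to load AWQ-quantized weights.
from lmdeploy import pipeline, TurbomindEngineConfig

pipe = pipeline(
    "internlm/internlm2_5-7b-chat-4bit",  # assumed quantized model id
    backend_config=TurbomindEngineConfig(model_format="awq"),
)

prompts = [
    "What is W4A16 quantization?",
    "Name one benefit of INT4 weight-only inference.",
]
responses = pipe(prompts)  # batched offline inference over all prompts
for r in responses:
    print(r.text)
```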

InternLM2.5’s architecture retains the robust features of its predecessor while incorporating new technical innovations. These enhancements, driven by a large corpus of synthetic data and an iterative training process, result in a model with improved reasoning performance—boasting a 20% increase over InternLM2. This iteration also maintains the capability to handle 1M context windows with near-full accuracy, making it a leading model for long-context tasks.

In conclusion, with its advanced reasoning capabilities, long-context handling, and efficient tool use, InternLM2.5-7B-Chat is set to be a valuable resource for a wide range of research and practical applications.

