MarkTechPost@AI 2024年08月06日
BRAG Released: High-Performance SLMs (Small Language Models) Specifically Trained for RAG Tasks Under $25 Each


BRAG is a series of high-performance Retrieval Augmented Generation (RAG) models developed by Maximalists AI Researcher. The BRAG models are a family of small language models (SLMs) designed to offer cost-effective, high-performance alternatives in AI-driven language processing. These models have been trained at an impressively low cost of under $25 each, positioning them as efficient and economical solutions in artificial intelligence.

The BRAG models were created in response to the need for efficient and high-performing language models that do not require the extensive computational resources typically associated with large-scale models like those from Nvidia and OpenAI. The primary motivation behind BRAG was to develop a series of models that could match or exceed the performance of leading models such as Cohere’s Command R+, Qwen2, Llama 3.1, and Llama 3 Instruct while keeping the training costs minimal.

The BRAG series includes four models: 

    BRAG-Qwen2-7b-v0.1
    BRAG-Llama-3.1-8b-v0.1
    BRAG-Llama-3-8b-v0.1
    BRAG-Qwen2-1.5b-v0.1

These models are chosen based on their performance in open benchmarks and ability to balance efficiency and capability. The models underwent a two-stage fine-tuning process inspired by Nvidia’s ChatQA approach, which involves initial training on general instruction datasets followed by RAG-specific datasets.
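In the second stage, each RAG training example pairs retrieved context with a question and a grounded answer. A minimal sketch of assembling such an example (the template below is a hypothetical illustration, not BRAG's published format):

```python
def format_rag_example(context: str, question: str, answer: str) -> str:
    """Assemble a context-grounded training example.

    Hypothetical template for RAG-specific fine-tuning; the actual
    prompt format used for BRAG is not published in this post.
    """
    return (
        "System: Answer the question using only the provided context.\n\n"
        f"Context: {context}\n\n"
        f"Question: {question}\n"
        f"Answer: {answer}"
    )

example = format_rag_example(
    context="BRAG models were trained for under $25 each.",
    question="What was the training cost per BRAG model?",
    answer="Under $25.",
)
```

Conditioning the answer on an explicit context field is what teaches the model to ground its responses in retrieved documents rather than in parametric memory.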

The BRAG models are particularly noteworthy for their performance relative to their size. The 1.5B models offer an excellent balance of performance and efficiency. In comparison, the 7B and 8B models can handle more complex tasks, such as long context understanding, tabular data interpretation, and mathematical reasoning. This strategic selection of models and training methodology allowed Maximalists to optimize performance while managing costs effectively.

The BRAG model training involved LoRA (Low-Rank Adaptation) and QLoRA (quantized LoRA) techniques. LoRA enables faster training with reduced computational demands by simplifying the adaptation matrices. In contrast, QLoRA compresses weight parameters to 4-bit precision, significantly reducing memory footprint and facilitating training on consumer-grade GPUs.
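The low-rank update at the heart of LoRA can be sketched in a few lines of NumPy (dimensions, rank, and scaling below are illustrative choices, not BRAG's actual hyperparameters):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 8   # layer dimensions and LoRA rank
alpha = 16                   # LoRA scaling factor

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-init

def lora_forward(x: np.ndarray) -> np.ndarray:
    """Base path plus low-rank update, scaled by alpha / r."""
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.normal(size=(4, d_in))
# With B initialised to zero, the update is a no-op, so the adapted
# layer starts out identical to the frozen base layer.
assert np.allclose(lora_forward(x), x @ W.T)

# Only r*(d_in + d_out) parameters are trained instead of d_in*d_out.
print(r * (d_in + d_out), "trainable vs", d_in * d_out, "frozen")  # 1024 vs 4096
```

QLoRA keeps the same low-rank adapters but stores the frozen weight `W` in 4-bit precision, dequantizing it on the fly, which is what shrinks the memory footprint enough for consumer-grade GPUs.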

The models were evaluated using the ChatRAG-Bench, a benchmark designed to assess conversational QA and RAG capabilities across various document types and question formats. The evaluation metrics included F1-Score and Exact Match Accuracy, which provided insights into the models’ ability to generate precise and contextually relevant responses.
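Token-level F1 and Exact Match can be computed roughly as follows (the normalization scheme below is a common SQuAD-style convention, assumed here rather than taken from ChatRAG-Bench itself):

```python
import re
from collections import Counter

def normalize(text: str) -> list[str]:
    """Lowercase, strip punctuation, and tokenize on whitespace."""
    return re.sub(r"[^\w\s]", "", text.lower()).split()

def exact_match(prediction: str, reference: str) -> bool:
    """True only if the normalized token sequences are identical."""
    return normalize(prediction) == normalize(reference)

def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred, ref = normalize(prediction), normalize(reference)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

print(exact_match("Under $25.", "under $25"))                    # True
print(round(f1_score("trained for under $25", "under $25"), 2))  # 0.67
```

The second example shows the metric's known weakness: the prediction fully contains the reference answer yet scores only 0.67, because F1 counts token overlap rather than semantic equivalence.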

During the training process, several challenges were encountered, including handling long documents, interpreting tabular data, and addressing domain-specific queries. These issues were mitigated through careful dataset selection and experimentation with various data combinations. For instance, including datasets like DROP, Quoref, and SQuAD helped improve the models’ capabilities in handling complex and diverse data types. The F1 score metric, while widely accepted, was noted to have limitations in capturing semantic nuances and context. This highlighted the need for more holistic and context-aware evaluation metrics to better gauge model performance.

In conclusion, the Maximalists plan to enhance the BRAG models by improving RAG performance and tabular data handling, and by introducing citation generation for better interpretability. They also aim to refine query rewriting techniques to improve search accuracy and relevance. The development of BRAG was supported by credits from Modal Labs, which facilitated cost-effective experimentation. By leveraging innovative training techniques and strategic model selection, BRAG has demonstrated that top-tier performance can be achieved with minimal resource expenditure, paving the way for more accessible and efficient AI solutions.


Check out the Models and Details. All credit for this research goes to the researchers of this project.


