AWS Machine Learning Blog, July 3, 2024
AI21 Labs Jamba-Instruct model is now available in Amazon Bedrock

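Amazon Bedrock now offers Jamba-Instruct, a large language model (LLM) built by AI21 Labs. Jamba-Instruct supports a 256,000-token context window, which makes it especially useful for processing large documents and complex Retrieval Augmented Generation (RAG) applications. It is based on the Jamba base model, previously open sourced by AI21 Labs, which combines a production-grade model, Structured State Space (SSM) technology, and Transformer architecture. With the SSM approach, Jamba-Instruct achieves the largest context window length in its model size class while also delivering the performance that traditional Transformer-based models provide. These models yield a performance boost over AI21's previous generation of models, the Jurassic-2 family. For more information about the hybrid SSM/Transformer architecture, refer to the Jamba: A Hybrid Transformer-Mamba Language Model whitepaper.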

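🎯 **What sets Jamba-Instruct apart:** Jamba-Instruct has a 256,000-token context window, a significant advantage for processing large documents and complex Retrieval Augmented Generation (RAG) applications. It can handle tasks that involve large amounts of text, such as analyzing long documents, comparing documents against each other, and complex semantic understanding.

🚀 **Use cases:** Jamba-Instruct's long context length is particularly well-suited for complex RAG workloads or potentially complex document analysis. For example, it is suitable for detecting contradictions between different documents or analyzing one document in the context of another. It is also suitable for query augmentation, a technique where an original query is transformed into related queries to optimize RAG applications.

💡 **Performance:** Compared to other models in a similar size class, Jamba-Instruct's hybrid SSM/Transformer architecture provides benefits in model throughput. For context window lengths exceeding 128,000 tokens, it can deliver up to three times more tokens per second.

🌐 **Availability:** AI21 Labs Jamba-Instruct in Amazon Bedrock is available in the US East (N. Virginia) AWS Region and can be accessed through the on-demand consumption model.

💻 **How to use it:** You can access Jamba-Instruct on the Amazon Bedrock console and test the model in the Amazon Bedrock Text or Chat playground. You can also access Jamba-Instruct through an API, using Amazon Bedrock and AWS SDK for Python (Boto3).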

We are excited to announce the availability of the Jamba-Instruct large language model (LLM) in Amazon Bedrock. Jamba-Instruct is built by AI21 Labs, and most notably supports a 256,000-token context window, making it especially useful for processing large documents and complex Retrieval Augmented Generation (RAG) applications.

What is Jamba-Instruct

Jamba-Instruct is an instruction-tuned version of the Jamba base model, previously open sourced by AI21 Labs, which combines a production-grade model, Structured State Space (SSM) technology, and Transformer architecture. With the SSM approach, Jamba-Instruct is able to achieve the largest context window length in its model size class while also delivering the performance that traditional Transformer-based models provide. These models yield a performance boost over AI21’s previous generation of models, the Jurassic-2 family of models. For more information about the hybrid SSM/Transformer architecture, refer to the Jamba: A Hybrid Transformer-Mamba Language Model whitepaper.

Get started with Jamba-Instruct

To get started with Jamba-Instruct models in Amazon Bedrock, first you need to get access to the model.

1. On the Amazon Bedrock console, choose Model access in the navigation pane.
2. Choose Modify model access.
3. Select the AI21 Labs models you want to use and choose Next.
4. Choose Submit to request model access.

For more information, refer to Model access.
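If you prefer to check from code which AI21 Labs models your Region offers, the following is a minimal sketch rather than part of the original walkthrough. The helper name `find_ai21_model_ids` is hypothetical. It assumes a Boto3 `bedrock` (control plane) client whose `list_foundation_models` response contains `modelSummaries` entries with `providerName` and `modelId` fields. Note that this lists what the Region offers; it does not tell you whether your access request was granted.

```python
def find_ai21_model_ids(bedrock_client):
    """Return the IDs of AI21 Labs models listed for the client's Region.

    `bedrock_client` is assumed to be boto3.client("bedrock") or any object
    exposing a compatible list_foundation_models() method.
    """
    response = bedrock_client.list_foundation_models()
    return [
        summary["modelId"]
        for summary in response.get("modelSummaries", [])
        # Filter client-side on the provider name (assumed to contain "AI21").
        if "AI21" in summary.get("providerName", "")
    ]
```

With valid AWS credentials, you could call it as `find_ai21_model_ids(boto3.client("bedrock", region_name="us-east-1"))` and look for `ai21.jamba-instruct-v1:0` in the result.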

Next, you can test the model in either the Amazon Bedrock Text or Chat playground.

Example use cases for Jamba-Instruct

Jamba-Instruct’s long context length is particularly well-suited for complex Retrieval Augmented Generation (RAG) workloads, or potentially complex document analysis. For example, it would be suitable for detecting contradictions between different documents or analyzing one document in the context of another. The following is an example prompt suitable for this use case:

    You are an expert research assistant; you are to note any contradictions between the first document and second document provided:

    Document 1:
    {the document content}

    Document 2:
    {the document content}

    Contradictions:
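To fill this prompt from two documents in code, a small template helper is enough. The following is a minimal sketch; the names `CONTRADICTION_TEMPLATE` and `build_contradiction_prompt` are hypothetical, and the exact line breaks between sections are an assumption rather than a documented requirement.

```python
# Template mirroring the example prompt; {doc1} and {doc2} are placeholders.
CONTRADICTION_TEMPLATE = (
    "You are an expert research assistant; you are to note any contradictions "
    "between the first document and second document provided:\n\n"
    "Document 1:\n{doc1}\n\n"
    "Document 2:\n{doc2}\n\n"
    "Contradictions:"
)


def build_contradiction_prompt(doc1: str, doc2: str) -> str:
    """Insert both documents into the contradiction-detection prompt."""
    return CONTRADICTION_TEMPLATE.format(doc1=doc1.strip(), doc2=doc2.strip())


if __name__ == "__main__":
    print(build_contradiction_prompt("The launch was in May.", "The launch was in June."))
```

The returned string can be used as the user message content when you call the model.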

You can also use Jamba for query augmentation, a technique where an original query is transformed into related queries, for purposes of optimizing RAG applications. For example:

    You are a curious and novel researcher, who is highly interested in getting all the relevant information on a specific topic. Given an original query, you would like to generate up to 10 related queries. These queries should be grounded in the original query, but nevertheless new:

    Original Query:
    {Original Query}

    New Queries:
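In a RAG pipeline, the generated queries usually need to be turned back into a list before retrieval. The following is a minimal sketch of both sides: building the prompt and parsing the reply. The function names are hypothetical, and the parser assumes the model returns one query per line, optionally numbered or bulleted, which the source does not guarantee.

```python
import re


def build_query_augmentation_prompt(original_query: str, max_queries: int = 10) -> str:
    """Fill the query-augmentation prompt with the user's original query."""
    return (
        "You are a curious and novel researcher, who is highly interested in "
        "getting all the relevant information on a specific topic. Given an "
        f"original query, you would like to generate up to {max_queries} related "
        "queries. These queries should be grounded in the original query, but "
        "nevertheless new:\n\n"
        f"Original Query:\n{original_query.strip()}\n\n"
        "New Queries:"
    )


def parse_related_queries(model_output: str, max_queries: int = 10) -> list[str]:
    """Split the model's reply into unique queries (assumes one per line)."""
    queries = []
    for line in model_output.splitlines():
        # Drop leading numbering or bullets such as "1.", "2)", "-", "*".
        cleaned = re.sub(r"^\s*(?:\d+[.)]|[-*•])\s*", "", line).strip()
        if cleaned and cleaned not in queries:
            queries.append(cleaned)
    return queries[:max_queries]
```

Each parsed query can then be sent to the retriever alongside the original query, and the results merged before generation.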

You can also use Jamba for standard LLM operations, such as summarization and entity extraction.

Prompt guidance for Jamba-Instruct can be found in the AI21 model documentation. For more information about Jamba-Instruct, including relevant benchmarks, refer to Built for the Enterprise: Introducing AI21’s Jamba-Instruct Model.

Programmatic access

You can also access Jamba-Instruct through an API, using Amazon Bedrock and AWS SDK for Python (Boto3). For installation and setup instructions, refer to the quickstart. The following is an example code snippet:

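    import json

    import boto3

    # Create a Bedrock Runtime client (uses your configured AWS credentials and Region)
    bedrock = boto3.client(service_name="bedrock-runtime")

    prompt = "INSERT YOUR PROMPT HERE"

    # Build the request body in the chat-style messages format
    body = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
        "top_p": 0.8,
        "temperature": 0.7,
    })

    modelId = "ai21.jamba-instruct-v1:0"
    accept = "application/json"
    contentType = "application/json"

    # Invoke the model and read the streamed response body
    response = bedrock.invoke_model(
        body=body,
        modelId=modelId,
        accept=accept,
        contentType=contentType
    )

    # Parse the JSON payload and print the first completion
    result = json.loads(response.get('body').read())
    print(result['choices'][0]['message']['content'])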
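For reuse across prompts, the same call can be wrapped in a function that takes the client as an argument, which also makes it easy to exercise without AWS credentials. The following is a minimal sketch under the same request and response shape as the snippet above; the name `ask_jamba` and the empty-response check are additions for illustration, not part of the original post.

```python
import json

MODEL_ID = "ai21.jamba-instruct-v1:0"


def ask_jamba(client, prompt, max_tokens=256, temperature=0.7, top_p=0.8):
    """Send one user message to Jamba-Instruct and return the reply text.

    `client` is expected to behave like boto3.client("bedrock-runtime"),
    that is, to expose invoke_model(body=..., modelId=..., ...).
    """
    body = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "top_p": top_p,
        "temperature": temperature,
    })
    response = client.invoke_model(
        body=body,
        modelId=MODEL_ID,
        accept="application/json",
        contentType="application/json",
    )
    result = json.loads(response["body"].read())
    choices = result.get("choices") or []
    if not choices:
        # Fail loudly rather than raising an opaque IndexError.
        raise ValueError("response contained no choices")
    return choices[0]["message"]["content"]
```

With Boto3 installed and model access granted, a call would look like `ask_jamba(boto3.client("bedrock-runtime"), "Summarize this document: ...")`.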

Conclusion

AI21 Labs Jamba-Instruct in Amazon Bedrock is well-suited for applications where a long context window (up to 256,000 tokens) is required, like producing summaries or answering questions that are grounded in long documents, avoiding the need to manually segment document sections to fit the smaller context windows of other LLMs. The new SSM/Transformer hybrid architecture also provides benefits in model throughput. It can provide a performance boost of up to three times more tokens per second for context window lengths exceeding 128,000 tokens, compared to other models in a similar size class.

AI21 Labs Jamba-Instruct in Amazon Bedrock is available in the US East (N. Virginia) AWS Region and can be accessed through the on-demand consumption model. To learn more, refer to Supported foundation models in Amazon Bedrock. To get started with AI21 Labs Jamba-Instruct in Amazon Bedrock, visit the Amazon Bedrock console.


About the Authors

Joshua Broyde, PhD, is a Principal Solution Architect at AI21 Labs. He works with customers and AI21 partners across the generative AI value chain, including enabling generative AI at an enterprise level, using complex LLM workflows and chains for regulated and specialized environments, and using LLMs at scale.

Fernando Espigares Caballero is a Senior Partner Solutions Architect at AWS. He creates joint solutions with strategic Technology Partners to deliver value to customers. He has more than 25 years of experience working in IT platforms, data centers, and cloud and internet-related services, holding multiple Industry and AWS certifications. He is currently focusing on generative AI to unlock innovation and creation of novel solutions that solve specific customer needs.
