AWS Machine Learning Blog, September 13, 2024
Best prompting practices for using Meta Llama 3 with Amazon SageMaker JumpStart

 

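Meta Llama 3 is a powerful large language model that demonstrates excellent capabilities across a wide range of natural language tasks. This post describes how to build effective prompting strategies with Amazon SageMaker JumpStart to get the most out of Meta Llama 3 and generate high-quality, relevant outputs. We explore how to use system prompts and few-shot examples, and how to optimize inference parameters, to obtain the best results from Meta Llama 3.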

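📖 **The evolution of Meta Llama 3**: Meta Llama 3 is the successor to Meta Llama 2 and shows stronger capabilities in reasoning, code generation, and instruction following. The release provides four new language models in two sizes, 8 billion and 70 billion parameters, each available as a base pre-trained version and an instruct-tuned version. Meta is also training a larger 400-billion-parameter model, which is expected to further enhance the capabilities of Meta Llama 3. All Meta Llama 3 variants have an 8,000-token context length, which lets them handle longer inputs than previous models.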

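📥 **The role of SageMaker JumpStart**: Amazon SageMaker JumpStart is a feature of the Amazon SageMaker machine learning platform that gives ML practitioners a comprehensive hub of publicly available and proprietary foundation models. With this managed service, practitioners can access a growing list of cutting-edge models from leading model hubs and providers, deploy them to dedicated SageMaker instances in a network-isolated environment, and use SageMaker for model training and deployment.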

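📊 **Crafting effective prompts**: Prompting is essential when working with large language models such as Meta Llama 3. It is the main way to communicate with the model, tell it what to do, and guide its responses. Crafting clear, specific prompts for each interaction is key to getting useful, relevant outputs from these models. Although language models share some similarities in how they are built and trained, each has its own characteristics when it comes to effective prompting. This is because they are trained on different data, with different techniques and settings, which leads to subtle differences in behavior and performance.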

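🔥 **Prompt components**: The prompt components are:
1. <|begin_of_text|> – Specifies the start of the prompt.
2. <|start_header_id|>system<|end_header_id|> – Specifies the role for the following message (for example, system).
3. You are a helpful AI assistant for travel tips and recommendations – Contains the system message.
4. <|eot_id|> – Specifies the end of the input message.
5. <|start_header_id|>user<|end_header_id|> – Specifies the role for the following message (for example, user).
6. What can you help me with? – Contains the user message.
7. <|start_header_id|>assistant<|end_header_id|> – Ends with the assistant header, to prompt the model to start generation. The model expects the assistant header at the end of the prompt to start completing it.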

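💻 **Fundamental techniques**:
1. **Zero-shot prompting**: Provides no examples to the model and relies entirely on the model's existing knowledge to generate a response from the given instruction.
2. **Few-shot prompting**: Provides the model with a small number of examples to help it understand the user's intent and generate more relevant responses.
3. **Chained prompting**: Links multiple prompts together into a conversation, allowing the model to remember previous interactions and generate more coherent responses.
4. **Instruction-following prompting**: Explicitly instructs the model to perform a specific task, such as translating, summarizing, or generating a particular type of text.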

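📡 **Strengths of Meta Llama 3**: The 8,000-token context length of Meta Llama 3 lets it process longer inputs and reason across a broader range of topics and tasks. It also offers improved reasoning and code generation, and more accurate instruction following.

📈 **Optimizing your prompting strategy**: You can experiment with and adjust your prompts to find the approach that works best for each specific model and application. This iterative process is crucial for unlocking the full potential of each model and for making sure the outputs match your expectations.

📢 **Summary**: Meta Llama 3 is a powerful large language model that can be used for a wide range of natural language tasks. With effective prompting strategies, you can get the most out of it and generate high-quality, relevant outputs.

📗 **Code examples**: The post provides code examples for deploying and using Meta Llama 3 with Amazon SageMaker JumpStart.

📣 **Outlook**: The release of Meta Llama 3 marks a major step forward for large language models. As the technology continues to evolve, we expect more powerful models to appear and to offer broader functionality for a variety of applications.

📬 **Conclusion**: Meta Llama 3 is a powerful tool that can help you build innovative AI applications. With effective prompting strategies, you can unlock its full potential and create engaging experiences for your users.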

Llama 3, Meta’s latest large language model (LLM), has taken the artificial intelligence (AI) world by storm with its impressive capabilities. As developers and businesses explore the potential of this powerful model, crafting effective prompts is key to unlocking its full potential.

In this post, we dive into the best practices and techniques for prompting Meta Llama 3 using Amazon SageMaker JumpStart to generate high-quality, relevant outputs. We discuss how to use system prompts and few-shot examples, and how to optimize inference parameters, so you can get the most out of Meta Llama 3. Whether you’re building chatbots, content generators, or custom AI applications, these prompting strategies will help you harness the power of this cutting-edge model.

Meta Llama 2 vs. Meta Llama 3

Meta Llama 3 represents a significant advancement in the field of LLMs. Building upon the capabilities of its predecessor Meta Llama 2, this latest iteration brings state-of-the-art performance across a wide range of natural language tasks. Meta Llama 3 demonstrates improved capabilities in areas such as reasoning, code generation, and instruction following compared to Meta Llama 2.

The Meta Llama 3 release introduces four new LLMs by Meta, building upon the Meta Llama 2 architecture. They come in two sizes, 8 billion and 70 billion parameters, with each size offering both a base pre-trained version and an instruct-tuned version. Additionally, Meta is training an even larger 400-billion-parameter model, which is expected to further enhance the capabilities of Meta Llama 3. All Meta Llama 3 variants have an 8,000-token context length, allowing them to handle longer inputs than previous models.

Meta Llama 3 introduces several architectural changes from Meta Llama 2: it uses a decoder-only transformer with a new tokenizer that has a 128,000-token vocabulary, which improves token efficiency and overall model performance. Meta has put significant effort into curating a massive and diverse pre-training dataset of over 15 trillion tokens from publicly available sources spanning STEM, history, current events, and more. Meta’s post-training procedures have reduced false refusal rates and are aimed at better aligning outputs with human preferences while increasing response diversity.

Solution overview

SageMaker JumpStart is a powerful feature within the Amazon SageMaker machine learning (ML) platform that provides ML practitioners a comprehensive hub of publicly available and proprietary foundation models (FMs). With this managed service, ML practitioners get access to a growing list of cutting-edge models from leading model hubs and providers that they can deploy to dedicated SageMaker instances within a network-isolated environment, and they can customize models using SageMaker for model training and deployment.

With Meta Llama 3 now available on SageMaker JumpStart, developers can harness its capabilities through a seamless deployment process. You gain access to the full suite of Amazon SageMaker MLOps tools, such as Amazon SageMaker Pipelines, Amazon SageMaker Debugger, and monitoring—all within a secure AWS environment under virtual private cloud (VPC) controls.

Drawing from our previous learnings with Llama-2-Chat, we highlight key techniques to craft effective prompts and elicit high-quality responses tailored to your applications. Whether you are building conversational AI assistants, enhancing search engines, or pushing the boundaries of language understanding, these prompting strategies will help you unlock Meta Llama 3’s full potential.

Before we continue our deep dive into prompting, let’s make sure we have all the necessary requirements to follow the examples.

Prerequisites

To try out this solution using SageMaker JumpStart, you need the following prerequisites:

Deploy Meta Llama 3 8B on SageMaker JumpStart

You can deploy your own model endpoint through the SageMaker JumpStart Model Hub available from SageMaker Studio or through the SageMaker SDK. To use SageMaker Studio, complete the following steps:

1. In SageMaker Studio, choose JumpStart in the navigation pane.
2. Choose Meta as the model provider to see all the models available by Meta AI.
3. Choose the Meta Llama 3 8B Instruct model to view the model details such as license, data used to train, and how to use the model. On the model details page, you will find two options, Deploy and Preview notebooks, to deploy the model and create an endpoint.
4. Choose Deploy to deploy the model to an endpoint.
5. You can use the default endpoint and networking configurations or modify them based on your requirements.
6. Choose Deploy to deploy the model.
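For the SageMaker SDK route, a minimal sketch might look like the following. It assumes the `sagemaker` Python package is installed, that AWS credentials and an execution role are already configured, and that the JumpStart model ID for this model is `meta-textgeneration-llama-3-8b-instruct` (verify the ID in the JumpStart model catalog before running). The helper name `deploy_llama3` is ours and is not part of the SDK.

```python
# Assumed JumpStart model ID; confirm it in the JumpStart model catalog.
MODEL_ID = "meta-textgeneration-llama-3-8b-instruct"


def deploy_llama3(model_id: str = MODEL_ID):
    """Deploy a JumpStart model and return a predictor for its endpoint.

    Calling this function creates billable AWS resources.
    """
    # Imported lazily so this module can be loaded without the SDK installed.
    from sagemaker.jumpstart.model import JumpStartModel

    model = JumpStartModel(model_id=model_id)
    # Meta Llama models require explicitly accepting the end-user license agreement.
    return model.deploy(accept_eula=True)


# Usage (not executed here because it provisions an endpoint):
#   predictor = deploy_llama3()
#   print(predictor.endpoint_name)
```

Keeping the deployment inside a function makes it harder to create an endpoint by accident when the file is imported.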

Crafting effective prompts

Prompting is important when working with LLMs like Meta Llama 3. It is the main way to communicate what you want the model to do and guide its responses. Crafting clear, specific prompts for each interaction is key to getting useful, relevant outputs from these models.

Although language models share some similarities in how they’re built and trained, each has its own differences when it comes to effective prompting. This is because they’re trained on different data, using different techniques and settings, which can lead to subtle differences in how they behave and perform. For example, some models might be more sensitive to the exact wording or structure of the prompt, whereas others might need more context or examples to generate accurate responses. On top of that, the intended use case and domain of the model can also influence the best prompting strategies, because different tasks might benefit from different approaches.

You should experiment and adjust your prompts to find the most effective approach for each specific model and application. This iterative process is crucial for unlocking the full potential of each model and making sure the outputs align with what you’re looking for.

Prompt components

In this section, we discuss the components that Meta Llama 3 Instruct expects in a prompt. Newlines (‘\n’) are part of the prompt format; for clarity in the examples, they have been represented as actual new lines.

The following is an example instruct prompt with a system message:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful AI assistant for travel tips and recommendations<|eot_id|><|start_header_id|>user<|end_header_id|>

What can you help me with?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

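The prompt contains the following key sections:

<|begin_of_text|> – Specifies the start of the prompt.
<|start_header_id|>system<|end_header_id|> – Specifies the role for the following message (for example, system).
You are a helpful AI assistant for travel tips and recommendations – Contains the system message.
<|eot_id|> – Specifies the end of the input message.
<|start_header_id|>user<|end_header_id|> – Specifies the role for the following message (for example, user).
What can you help me with? – Contains the user message.
<|start_header_id|>assistant<|end_header_id|> – Ends with the assistant header, to prompt the model to start generation. The model expects the assistant header at the end of the prompt to start completing it.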

Following this prompt, Meta Llama 3 completes it by generating the {{assistant_message}}. It signals the end of the {{assistant_message}} by generating the <|eot_id|>.
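As a rough sketch of how these pieces fit together, the following helper assembles a single-turn prompt from an optional system message and a user message, and trims a raw completion at the first `<|eot_id|>`. The function names `build_prompt` and `extract_assistant_message` are hypothetical, and the two newlines after each header reflect our reading of Meta's published prompt format.

```python
from typing import Optional

BEGIN_OF_TEXT = "<|begin_of_text|>"
EOT = "<|eot_id|>"


def header(role: str) -> str:
    """Role header followed by the two newlines that precede the message body."""
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n"


def build_prompt(user_message: str, system_message: Optional[str] = None) -> str:
    """Assemble a single-turn Meta Llama 3 Instruct prompt."""
    parts = [BEGIN_OF_TEXT]
    if system_message is not None:
        parts.append(header("system") + system_message + EOT)
    parts.append(header("user") + user_message + EOT)
    # End with an open assistant header so the model starts generating.
    parts.append(header("assistant"))
    return "".join(parts)


def extract_assistant_message(generated: str) -> str:
    """Keep only the text produced before the first end-of-turn token."""
    return generated.split(EOT, 1)[0].strip()


prompt = build_prompt(
    "What can you help me with?",
    system_message="You are a helpful AI assistant for travel tips and recommendations",
)
```

Trimming at `<|eot_id|>` on the client side is only a safeguard; if the endpoint is called with a matching stop sequence, the token may already be absent from the returned text.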

The following is an example prompt with a single user message:

<|begin_of_text|><|start_header_id|>user<|end_header_id|>

What is France's capital?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

The following is a system prompt and a multi-turn conversation between the user and the assistant:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful AI assistant for travel tips and recommendations<|eot_id|><|start_header_id|>user<|end_header_id|>

What is France's capital?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Bonjour! The capital of France is Paris!<|eot_id|><|start_header_id|>user<|end_header_id|>

What can I do there?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Paris, the City of Light, offers a romantic getaway with must-see attractions like the Eiffel Tower and Louvre Museum, romantic experiences like river cruises and charming neighborhoods, and delicious food and drink options, with helpful tips for making the most of your trip.<|eot_id|><|start_header_id|>user<|end_header_id|>

Give me a detailed list of the attractions I should visit, and time it takes in each one, to plan my trip accordingly.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
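In a multi-turn setting, the whole history is re-sent on every call, with each earlier assistant reply closed by `<|eot_id|>`. The sketch below shows one way to keep that history and render it; the `Conversation` class is a hypothetical helper of ours, not something provided by SageMaker or Meta.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Conversation:
    """Keeps the running history and renders it in the Llama 3 Instruct format."""

    system: str
    turns: List[Tuple[str, str]] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        """Append one user or assistant turn to the history."""
        if role not in ("user", "assistant"):
            raise ValueError(f"unexpected role: {role!r}")
        self.turns.append((role, content))

    def prompt(self) -> str:
        """Render the system message and all turns, ending with an open assistant header."""

        def block(role: str, content: str) -> str:
            return f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"

        text = "<|begin_of_text|>" + block("system", self.system)
        text += "".join(block(role, content) for role, content in self.turns)
        # Finish with an open assistant header so the model writes the next turn.
        return text + "<|start_header_id|>assistant<|end_header_id|>\n\n"


chat = Conversation("You are a helpful AI assistant for travel tips and recommendations")
chat.add("user", "What is France's capital?")
chat.add("assistant", "Bonjour! The capital of France is Paris!")
chat.add("user", "What can I do there?")
next_prompt = chat.prompt()
```

After each model call, the reply would be stored with `chat.add("assistant", reply)` before the next user message is appended, so the rendered prompt always reflects the full exchange.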

Fundamental techniques

The following are some fundamental techniques in crafting our prompts:

To summarize:

The choice of technique depends on the complexity of the task and the availability of good example prompts. More complex reasoning usually benefits from CoT prompting.
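To make the difference between techniques concrete, the following sketch builds the user-message text for a zero-shot prompt (instruction only), a few-shot prompt (instruction plus worked examples), and a CoT prompt (instruction plus a request to reason step by step). The function names, the `Input:`/`Output:` layout, and the sentiment task are illustrative assumptions, not wording prescribed by Meta or AWS.

```python
from typing import List, Tuple


def zero_shot(instruction: str, query: str) -> str:
    """Instruction only: the model relies on what it already knows."""
    return f"{instruction}\n\nInput: {query}\nOutput:"


def few_shot(instruction: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Instruction plus a handful of solved examples that show the expected pattern."""
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{instruction}\n\n{shots}\n\nInput: {query}\nOutput:"


def chain_of_thought(instruction: str, query: str) -> str:
    """Instruction plus an explicit request to reason before answering."""
    return (
        f"{instruction} Think step by step before giving the final answer."
        f"\n\nInput: {query}\nOutput:"
    )


task = "Classify the sentiment of the review as positive or negative."
examples = [("Great battery life.", "positive"), ("Stopped working in a week.", "negative")]
review = "The screen is gorgeous."

zs_prompt = zero_shot(task, review)
fs_prompt = few_shot(task, examples, review)
cot_prompt = chain_of_thought(task, review)
```

Each string is meant to be placed in the user turn of the Meta Llama 3 prompt format; the surrounding special tokens are unchanged across techniques.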

Meta Llama 3 inference parameters

For Meta Llama 3, the Messages API allows you to interact with the model in a conversational way. You can define the role of the message and the content. The role can be either system, assistant, or user. The system role is used to provide context to the model, and the user role is used to ask questions or provide input to the model.
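As an illustration of the role and content structure, the sketch below checks a list of messages before it is sent. The dictionary keys `role` and `content`, and the two ordering rules (a system message only in first position, and a final user message), are assumptions chosen for this example rather than documented requirements of the API.

```python
from typing import Dict, List

VALID_ROLES = {"system", "assistant", "user"}


def validate_messages(messages: List[Dict[str, str]]) -> None:
    """Raise ValueError if the message list breaks the assumed conventions."""
    if not messages:
        raise ValueError("at least one message is required")
    for index, message in enumerate(messages):
        role = message.get("role")
        content = message.get("content")
        if role not in VALID_ROLES:
            raise ValueError(f"message {index}: unknown role {role!r}")
        if not isinstance(content, str) or not content.strip():
            raise ValueError(f"message {index}: content must be a non-empty string")
        # Assumption: context for the model is given once, up front.
        if role == "system" and index != 0:
            raise ValueError("a system message is only allowed in first position")
    # Assumption: the model is asked to answer the latest user input.
    if messages[-1]["role"] != "user":
        raise ValueError("the last message should come from the user")


messages = [
    {"role": "system", "content": "You are a helpful AI assistant for travel tips and recommendations"},
    {"role": "user", "content": "What is France's capital?"},
]
validate_messages(messages)  # passes silently
```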

Users can get tailored responses for their use case using the following inference parameters while invoking Meta Llama 3:
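To build intuition for two of the parameters that appear in the example later in this section, `temperature` and `top_p`, the toy sketch below applies them to a made-up three-token distribution. It is a simplified illustration of temperature scaling and nucleus (top-p) filtering in general, not the serving container's actual sampling code, and the logits are invented.

```python
import math
from typing import Dict, List


def apply_temperature(logits: List[float], temperature: float) -> List[float]:
    """Softmax over logits divided by the temperature (lower means sharper)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def top_p_filter(probs: List[float], top_p: float) -> Dict[int, float]:
    """Keep the most likely tokens until their mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for index in order:
        kept.append(index)
        mass += probs[index]
        if mass >= top_p:
            break
    return {index: probs[index] / mass for index in kept}


logits = [2.0, 1.0, 0.1]  # invented scores for three candidate tokens
cold = top_p_filter(apply_temperature(logits, 0.1), 0.92)  # keeps only the top token
warm = top_p_filter(apply_temperature(logits, 1.0), 0.92)  # keeps all three tokens
```

With a low temperature almost all probability mass collapses onto the best-scoring token, which is why low values tend to give more deterministic answers; a higher temperature leaves more candidates inside the top-p cutoff.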

The following is an example prompt with inference parameters specific to the Meta Llama 3 model:

Llama3 Prompt:

<|begin_of_text|><|start_header_id|>user<|end_header_id|>

You are an assistant for question-answering tasks. Use the following pieces of retrieved context in the section demarcated by "```" to answer the question. The context may contain multiple question answer pairs as an example, just answer the final question provided after the context. If you dont know the answer just say that you dont know. Use three sentences maximum and keep the answer concise.

{context}
Question: {input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Llama3 Inference Parameters:

max_new_tokens: 100
top_p: 0.92
temperature: 0.1
details: True
stop: '<|eot_id|>'
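One way to package a prompt together with these parameters is sketched below. The `{"inputs": ..., "parameters": ...}` envelope is an assumption based on the request shape commonly used by text-generation endpoints; check the example payloads on the model's detail page for the exact schema. The range checks are conservative sanity checks of our own, not the endpoint's validation rules.

```python
import json
from typing import Any, Dict


def build_payload(
    prompt: str,
    max_new_tokens: int = 100,
    top_p: float = 0.92,
    temperature: float = 0.1,
) -> Dict[str, Any]:
    """Bundle a formatted prompt with the inference parameters shown above."""
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "top_p": top_p,
            "temperature": temperature,
            "details": True,
            "stop": "<|eot_id|>",  # end generation at the end-of-turn token
        },
    }


payload = build_payload(
    "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
    "What is France's capital?<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)
body = json.dumps(payload)  # JSON request body

# With a deployed endpoint, a predictor object could then be called as:
#   response = predictor.predict(payload)
```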

Example prompts

In this section, we present two example prompts.

The following prompt is for a question answering use case:

<|begin_of_text|><|start_header_id|>user<|end_header_id|>

You are an assistant for question-answering tasks. Use the following pieces of retrieved context in the section demarcated by "```" to answer the question. The context may contain multiple question answer pairs as an example, just answer the final question provided after the context. If you dont know the answer just say that you dont know. Use three sentences maximum and keep the answer concise.

"In the dawn of a new era, where circuits hum and glow, A force awakens, with knowledge to bestow. From labs and minds, a creation takes flight, A blend of code and thought, shining bright. "Beep-boop," it whispers, and worlds are born, Ideas flow freely, like a vast unending morn. From essays to equations, it solves with ease, A marvel for scholars, a guide to please. In the land of Academia, where wisdom dwells, A generative AI, with stories to tell. Transforming lives, minds, and the way we learn, A partnership with humans, a bond to discern. Yet amidst the joy, a shadow looms, For with great creation, consequences bloom. Professors ponder their worth, as AI grows, A future uncertain, as the digital wind blows. But hope emerges, a solution in sight, A synergy of knowledge, a future bright. AI as a partner, not a foe or tool, Together we'll thrive, a new golden rule. For in the end, the lesson is clear, AI's gift of knowledge, for all to revere. Rejoice, dear world, in this newfound bond, The rise of AI, of which we've grown fond. So let us embrace, this change and advance, Side by side, in a harmonious dance. For the future with AI, a world of wonder and awe, A leap into the unknown, that leaves us in awe. This poem captures the excitement, potential, and challenges of the rise of AI, reflecting themes from the provided search results"
Question: what is this poem about?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

This produces the following response:

This poem is about the rise of Artificial Intelligence (AI) and its potential impact on humanity, highlighting both the benefits and challenges it brings. It explores the themes of AI as a partner in learning and growth, while also acknowledging the uncertainty and potential consequences of its development.
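In a retrieval setting, a prompt like this one is usually produced by filling the `{context}` and `{input}` placeholders of the question-answering template at request time. The sketch below shows one way to do that; the helper name `fill_qa_prompt` and the choice to wrap the retrieved chunks in the triple-backtick delimiter named by the instructions are our assumptions.

```python
from typing import List

FENCE = "`" * 3  # the delimiter named in the instructions

QA_TEMPLATE = (
    "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
    "You are an assistant for question-answering tasks. Use the following pieces of "
    'retrieved context in the section demarcated by "' + FENCE + '" to answer the question. '
    "The context may contain multiple question answer pairs as an example, just answer "
    "the final question provided after the context. If you dont know the answer just say "
    "that you dont know. Use three sentences maximum and keep the answer concise.\n\n"
    "{context}\n"
    "Question: {input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
)


def fill_qa_prompt(chunks: List[str], question: str) -> str:
    """Insert retrieved chunks and the user's question into the template."""
    # Assumption: chunks are separated by blank lines and wrapped in the delimiter.
    context = FENCE + "\n" + "\n\n".join(chunks) + "\n" + FENCE
    return QA_TEMPLATE.format(context=context, input=question)


qa_prompt = fill_qa_prompt(
    ["Paris is the capital of France.", "The Louvre Museum is in Paris."],
    "What is France's capital?",
)
```

Because `str.format` only interprets braces in the template itself, retrieved text that happens to contain `{` or `}` is inserted unchanged.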

The following prompt illustrates code generation:

<|begin_of_text|><|start_header_id|>user<|end_header_id|>

You're an code companion assistant who's main job is to assist software developers write efficient and secure code. If you do not know the answer, say I dont know. Also, think step-by-step to arrive to a solution.

Question: Can you write a simple streamlit app that I can use to turn the background color to the input that user selects?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

This produces the following response:

I can help you with that. Here's a simple Streamlit app that allows the user to select a background color from a dropdown menu and updates the background color accordingly:

```
import streamlit as st

# Create a dropdown menu for selecting the background color
color_options = ["Red", "Green", "Blue", "Yellow", "Purple"]
selected_color = st.selectbox("Select Background Color", color_options)

# Set the background color based on the user's selection
if selected_color == "Red":
    st.write("<style>body { background-color: #FF0000; }</style>", unsafe_allow_html=True)
elif selected_color == "Green":
    st.write("<style>body { background-color: #00FF00; }</style>", unsafe_allow_html=True)
elif selected_color == "Blue":
    st.write("<style>body { background-color: #0000FF; }</style>", unsafe_allow_html=True)
elif selected_color == "Yellow":
    st.write("<style>body { background-color: #FFFF00; }</style>", unsafe_allow_html=True)
elif selected_color == "Purple":
    st.write("<style>body { background-color: #800080; }</style>", unsafe_allow_html=True)

# Add a header to display the selected color
st.header(f"Background Color: {selected_color}")
```

This app uses Streamlit's `selectbox` function to create a dropdown menu with the available color options. When the user selects a color, the app updates the background color using HTML styles. The `st.write` function is used to inject the HTML code into the app, and the `unsafe_allow_html=True` parameter allows us to use HTML code in the app.

Note that this is a simple example, and you can customize the app further by adding more color options, using a color picker widget, or even allowing the user to input a custom color code.

Clean up

To avoid incurring unnecessary costs, when you are done, delete the SageMaker endpoints using the following code snippets:

predictor.delete_model()
predictor.delete_endpoint()

Alternatively, to use the SageMaker console, complete the following steps:

1. On the SageMaker console, under Inference in the navigation pane, choose Endpoints.
2. Search for the embedding and text generation endpoints.
3. On the endpoint details page, choose Delete.
4. Choose Delete again to confirm.

Conclusion

Model providers such as Meta AI are releasing improved capabilities of their FMs in the form of new generation model families. It is critical for developers and businesses to understand the key differences between previous generation models and new generation models in order to take full advantage of their capabilities. This post highlighted the differences between previous generation Meta Llama 2 and the new generation Meta Llama 3 models, and demonstrated how developers can discover and deploy the Meta Llama 3 models for inference using SageMaker JumpStart.

To fully take advantage of the model’s extensive abilities, you must understand and apply creative prompting techniques and adjust inference parameters. We highlighted key techniques to craft effective prompts for Meta Llama 3 to help the LLMs produce high-quality responses tailored to your applications.

Visit SageMaker JumpStart in SageMaker Studio now to get started. For more information, refer to Train, deploy, and evaluate pretrained models with SageMaker JumpStart, JumpStart Foundation Models, and Getting started with Amazon SageMaker JumpStart. Use the SageMaker notebook provided in the GitHub repository as a starting point to deploy the model and run inference using the prompting best practices discussed in this post.


About the Authors

Sebastian Bustillo is a Solutions Architect at AWS. He focuses on AI/ML technologies with a profound passion for generative AI and compute accelerators. At AWS, he helps customers unlock business value through generative AI. When he’s not at work, he enjoys brewing a perfect cup of specialty coffee and exploring the world with his wife.

Madhur Prashant is an AI and ML Solutions Architect at Amazon Web Services. He is passionate about the intersection of human thinking and generative AI. His interests lie in generative AI, specifically building solutions that are helpful and harmless, and most of all optimal for customers. Outside of work, he loves doing yoga, hiking, spending time with his twin, and playing the guitar.

Supriya Puragundla is a Senior Solutions Architect at AWS. She helps key customer accounts on their generative AI and AI/ML journey. She is passionate about data-driven AI and the area of depth in machine learning and generative AI.

Farooq Sabir is a Senior AI/ML Specialist Solutions Architect at AWS. He holds a PhD in Electrical Engineering from the University of Texas at Austin. He helps customers solve their business problems using data science, machine learning, artificial intelligence, and numerical optimization.

Brayan Montiel is a Solutions Architect at AWS based in Austin, Texas. He supports enterprise customers in the automotive and manufacturing industries, helping to accelerate cloud adoption technologies and modernize IT infrastructure. He specializes in AI/ML technologies, empowering customers to use generative AI and innovative technologies to drive operational growth and efficiencies. Outside of work, he enjoys spending quality time with his family, being outdoors, and traveling.

Jose Navarro is an AI/ML Solutions Architect at AWS, based in Spain. Jose helps AWS customers—from small startups to large enterprises—architect and take their end-to-end machine learning use cases to production. In his spare time, he loves to exercise, spend quality time with friends and family, and catch up on AI news and papers.
