AWS Machine Learning Blog, November 26, 2024
How 123RF saved over 90% of their translation costs by switching to Amazon Bedrock

123RF, a leading provider of royalty-free digital content, used Amazon Bedrock and Anthropic's Claude 3 Haiku model to dramatically improve content moderation efficiency and content translation speed, enhancing the experience for its global users. This article details the challenges 123RF faced in multilingual content translation, including cost, quality, and contextual understanding, and how technologies such as Amazon Bedrock, Claude 3 Haiku, prompt engineering, and a vector database enabled more efficient, more accurate, and more cost-effective multilingual translation, improving global content discovery.

🤔 **Challenge: balancing quality and cost in mass translation**: 123RF found success among English-speaking users, but the metadata for its vast library of digital assets (millions of images, audio files, and motion graphics) was predominantly in English and needed to be translated into 15 languages to serve global users, raising prohibitive translation costs and quality-control concerns.

🚀 **Solution: Amazon Bedrock, Claude 3 Haiku, prompt engineering, and a vector store**: 123RF chose Amazon Bedrock and Anthropic's Claude 3 Haiku for their scalability, cost-effectiveness, and integration capabilities, and combined them with prompt engineering techniques, such as assigning the model a role and supplying contextual information, to solve the translation problem, cutting costs while raising translation quality.

💡 **Prompt engineering: guiding the model toward high-quality translations**: Carefully designed prompts, such as assigning the model the role of an "AI language translation assistant" and providing task-specific instructions and context, steered the model toward accurate, context-appropriate translations and reduced output variability.

🌐 **Vector database: making translation dynamic**: 123RF used a vector database to store and retrieve past translations and combined them with dynamic prompts, further improving translation accuracy and efficiency across different contexts.

💰 **Cost optimization: a dramatic reduction in translation costs**: With Amazon Bedrock and Claude 3 Haiku, 123RF achieved a significant cost reduction, keeping translation costs under control and making its multilingual content strategy sustainable.

In the rapidly evolving digital content industry, multilingual accessibility is crucial for global reach and user engagement. 123RF, a leading provider of royalty-free digital content, is an online resource for creative assets, including AI-generated images from text. In 2023, they used Amazon OpenSearch Service to improve discovery of images by using vector-based semantic search. Building on this success, they have now implemented Amazon Bedrock and Anthropic’s Claude 3 Haiku to improve their content moderation efficiency a hundredfold and speed up content translation, further enhancing their global reach and efficiency.

Although the company achieved significant success among English-speaking users with its generative AI-based semantic search tool, it faced content discovery challenges in 15 other languages because of English-only titles and keywords. The cost of using Google Translate for continuous translations was prohibitive, and other models such as Anthropic’s Claude Sonnet and OpenAI GPT-4o weren’t cost-effective. Although OpenAI GPT-3.5 met cost criteria, it struggled with consistent output quality. This prompted 123RF to search for a more reliable and affordable solution to enhance multilingual content discovery.

This post explores how 123RF used Amazon Bedrock, Anthropic’s Claude 3 Haiku, and a vector store to efficiently translate content metadata, significantly reduce costs, and improve their global content discovery capabilities.

The challenge: Balancing quality and cost in mass translation

After implementing generative AI-based semantic search and text-to-image generation, they saw significant traction among English-speaking users. This success, however, cast a harsh light on a critical gap in their global strategy: their vast library of digital assets—comprising millions of images, audio files, and motion graphics—needed a similar overhaul for non-English speaking users.

The crux of the problem lay in the nature of their content. User-generated titles, keywords, and descriptions—the lifeblood of searchability in the digital asset world—were predominantly in English. To truly serve a global audience and unlock the full potential of their library, 123RF needed to translate this metadata into 15 different languages. But as they quickly discovered, the path to multilingual content was filled with financial and technical challenges.

The translation conundrum: Beyond word-for-word

[Figure: Idioms don’t always translate well]

As 123RF dove deeper into the challenge, they uncovered layers of complexity that went beyond simple word-for-word translation. The preceding figure shows one particularly difficult example: idioms. A literal translation of a phrase like “The early bird gets the worm” would not convey its meaning nearly as well as the equivalent Spanish idiom, “A quien madruga, Dios le ayuda.” Another significant hurdle was named entity resolution (NER)—a critical aspect for a service dealing with diverse visual and audio content.

NER involves correctly identifying and handling proper nouns, brand names, specific terminology, and culturally significant references across languages. For instance, a stock photo of the Eiffel Tower should retain its name in all languages, rather than being literally translated. Similarly, brand names like Coca-Cola or Nike should remain unchanged, regardless of the target language.

This challenge is particularly acute in the realm of creative content. Consider a hypothetical stock image titled Young woman using MacBook in a Starbucks. An ideal translation system would need to do the following:

- Recognize MacBook and Starbucks as brand names and leave them untranslated
- Translate the rest of the title naturally, respecting the target language’s grammar and word order
- Preserve the descriptive intent that makes the title searchable

These nuances highlighted the inadequacy of simple machine translation tools and underscored the need for a more sophisticated, context-aware solution.

Turning to language models: Large models compared to small models

In their quest for a solution, 123RF explored a spectrum of options, each with its own set of trade-offs:

- Google Translate: reliable for ad hoc use, but prohibitively expensive for continuous translation at catalog scale
- Anthropic’s Claude Sonnet and OpenAI GPT-4o: strong quality, but not cost-effective at 123RF’s volume
- OpenAI GPT-3.5: met the cost criteria, but struggled with consistent output quality

This exploration laid bare a fundamental challenge in the AI translation space: the seemingly unavoidable trade-off between cost and quality. High-quality translations from top-tier models were financially unfeasible, whereas more affordable options couldn’t meet the standard of accuracy and consistency that 123RF’s business demanded.

Solution: Amazon Bedrock, Anthropic’s Claude 3 Haiku, prompt engineering, and a vector store

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI.

Throughout this transformative journey, Amazon Bedrock proved to be the cornerstone of 123RF’s success. Several factors contributed to making it the provider of choice:

- Scalability to translate metadata across millions of assets
- Cost-effectiveness, particularly with smaller models such as Anthropic’s Claude 3 Haiku
- Access to a choice of high-performing foundation models through a single API
- Built-in security, privacy, and responsible AI capabilities for production workloads

Cracking the code: Prompt engineering techniques

The first breakthrough in 123RF’s translation journey came through a collaborative effort with the AWS team, using the power of Amazon Bedrock and Anthropic’s Claude 3 Haiku. The key to their success lay in the innovative application of prompt engineering techniques—a set of strategies designed to coax the best performance out of LLMs, especially important for cost effective models.

Prompt engineering is crucial when working with LLMs because these models, while powerful, can produce non-deterministic outputs—meaning their responses can vary even for the same input. By carefully crafting prompts, we can provide context and structure that helps mitigate this variability. Moreover, well-designed prompts serve to steer the model towards the specific task at hand, ensuring that the LLM focuses on the most relevant information and produces outputs aligned with the desired outcome. In 123RF’s case, this meant guiding the model to produce accurate, context-aware translations that preserved the nuances of the original content.

Let’s dive into the specific techniques employed.

Assigning a role to the model

The team began by assigning the AI model a specific role—that of an AI language translation assistant. This seemingly simple step was crucial in setting the context for the model’s task. By defining its role, the model was primed to approach the task with the mindset of a professional translator, considering nuances and complexities that a generic language model might overlook.

For example:

You are an AI language translation assistant. Your task is to accurately translate a passage of text from English into another specified language.

Separation of data and prompt templates

A clear delineation between the text to be translated and the instructions for translation was implemented. This separation served two purposes:

- It reduced the risk of the model confusing the content to be translated with the instructions themselves
- It made the prompt template reusable across languages and content types, with only the data changing between requests

For example:

Here is the text to translate:
<text>
{{TEXT}}
</text>
Please translate the above text into this language: {{TARGET_LANGUAGE}}
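
To make the separation concrete, here is a minimal sketch, assuming a Python implementation, of keeping the instruction template fixed while substituting the data at request time. The template mirrors the example above; the helper name build_translation_prompt is our own illustration, not 123RF's code:

```python
# A minimal sketch of separating the prompt template from the data.
# build_translation_prompt is a hypothetical helper, not 123RF's code.
TRANSLATION_TEMPLATE = """Here is the text to translate:
<text>
{text}
</text>
Please translate the above text into this language: {target_language}"""


def build_translation_prompt(text: str, target_language: str) -> str:
    # Only the data slots change between requests; the instructions stay fixed.
    return TRANSLATION_TEMPLATE.format(text=text, target_language=target_language)


print(build_translation_prompt("Sunset over a calm lake", "Spanish"))
```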

Chain of thought

One of the most innovative aspects of the solution was the implementation of a scratchpad section. This allowed the model to externalize its thinking process, mimicking the way a human translator might work through a challenging passage.

The scratchpad prompted the model to consider the following:

- The overall meaning and intent of the passage
- Idioms and expressions that may not translate literally
- Tone, formality, and style of the writing
- Proper nouns, such as names and places, that should not be translated
- Grammatical differences between English and the target language

K-shot examples

The team incorporated multiple examples of high-quality translations directly into the prompt. This technique, known as K-shot learning, provided the model with a number (K) of concrete examples of the desired output quality and style.

By carefully selecting diverse examples that showcased different translation challenges (such as idiomatic expressions, technical terms, and cultural references), the team effectively trained the model to handle a wide range of content types.

For example:

Examples:
<text>
The early bird catches the worm.
</text>
<translated_text>
El que madruga, Dios le ayuda.
</translated_text>

The magic formula: Putting it all together

The culmination of these techniques resulted in a prompt template that encapsulated the elements needed for high-quality, context-aware translation. The following is an example prompt incorporating the preceding steps; the actual production prompt is not shown here.

You are an AI language translation assistant. Your task is to accurately translate a passage of text from English into another specified language.

Here is the text to translate:
<text>
{{TEXT}}
</text>
Please translate the above text into this language: {{TARGET_LANGUAGE}}

Think carefully. In the <scratchpad> section below, think through how you will translate the text while preserving its full meaning and nuance. Consider:
- The overall meaning and intent of the passage
- Idioms and expressions that may not translate literally
- Tone, formality, and style of the writing
- Proper nouns like names and places that should not be translated
- Grammatical differences between English and {{TARGET_LANGUAGE}}

Examples:
<text>
The software update is scheduled for next Tuesday.
</text>
<translated_text>
La actualización del software está programada para el próximo martes.
</translated_text>
<text>
Breaking news: Elon Musk acquires Twitter for $44 billion.
</text>
<translated_text>
Última hora: Elon Musk adquiere Twitter por 44 mil millones de dólares.
</translated_text>
... [8 more diverse examples] ...

Now provide your final translated version of the text inside <translated_text> tags. Ensure the translation is as accurate and natural-sounding as possible in {{TARGET_LANGUAGE}}. Do not translate any names, places or other proper nouns.

<translated_text>

This template provided a framework for consistent, high-quality translations across a wide range of content types and target languages.
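
As a reference for readers who want to experiment, the following is a minimal sketch of sending such a prompt to Anthropic's Claude 3 Haiku through the Amazon Bedrock Converse API with boto3. The model ID is the publicly documented Claude 3 Haiku identifier; the tag parsing and the translate helper are our own illustration, not 123RF's production code:

```python
import re

import boto3

# Assumes AWS credentials and Bedrock model access are already configured.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")


def translate(prompt: str) -> str:
    # Send the assembled prompt to Claude 3 Haiku on Amazon Bedrock.
    response = bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 1024, "temperature": 0.2},
    )
    completion = response["output"]["message"]["content"][0]["text"]
    # Keep only the final translation; the scratchpad reasoning is discarded.
    match = re.search(r"<translated_text>(.*?)</translated_text>", completion, re.DOTALL)
    return match.group(1).strip() if match else completion.strip()
```

The temperature is set low here because translation benefits from near-deterministic output; the exact value is a tuning choice.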

Further refinement: Dynamic prompting for grounding models

Although the initial implementation yielded impressive results, the AWS team suggested further enhancements through dynamic prompting techniques. This advanced approach aimed to make the model even more adaptive and context-aware. They adopted the Retrieval Augmented Generation (RAG) technique to create a dynamic prompt template with K-shot examples relevant to each phrase, rather than generic examples for each language. This also allowed 123RF to take advantage of their existing catalog of high-quality translations to further align the model.

Vector database of high-quality translations

The team proposed creating a vector database for each target language, populated with previous high-quality translations. This database would serve as a rich repository of translation examples, capturing nuances and domain-specific terminologies.

The implementation included the following components:

1. Embedding generation: Use embedding models such as Amazon Titan or Cohere’s offerings on Amazon Bedrock to convert both source texts and their translations into high-dimensional vectors.
2. Chunking strategy: To maintain context and ensure meaningful translations, the team implemented a careful chunking strategy:
   - Each source text (in English) was paired with its corresponding translation in the target language.
   - These pairs were stored as complete sentences or logical phrases, rather than individual words or arbitrary character lengths.
   - For longer content, such as paragraphs or descriptions, the text was split into semantically meaningful chunks, ensuring that each chunk contained a complete thought or idea.
   - Each chunk pair (source and translation) was assigned a unique identifier to maintain the association.
3. Vector storage: The vector representations of both the source text and its translation were stored together in the database. The storage structure included:
   - The original source text chunk
   - The corresponding translation chunk
   - The vector embedding of the source text
   - The vector embedding of the translation
   - Metadata such as the content type, domain, and any relevant tags
4. Database organization: The database was organized by target language, with separate indices or collections for each language pair (for example, English-Spanish and English-French). Within each language pair, the vector pairs were indexed to allow for efficient similarity searches.
5. Similarity search: For each new translation task, the system would perform a hybrid search to find the most semantically similar sentences from the vector database:
   - The new text to be translated was converted into a vector using the same embedding model.
   - A similarity search was performed in the vector space to find the closest matches in the source language.
   - The corresponding translations of these matches were retrieved, providing relevant examples for the translation task.

This structured approach to storing and retrieving text-translation pairs allowed for efficient, context-aware lookups that significantly improved the quality and relevance of the translations produced by the LLM.
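
To make the lookup concrete, here is a minimal, self-contained sketch of steps 1, 3, and 5, assuming Amazon Titan Text Embeddings V2 on Amazon Bedrock and an in-memory store with cosine similarity. The post doesn't name 123RF's actual vector database, so the storage layer below is purely illustrative:

```python
import json

import boto3
import numpy as np

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")


def embed(text: str) -> np.ndarray:
    # Step 1: embedding generation with Amazon Titan Text Embeddings V2.
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": text}),
    )
    return np.array(json.loads(response["body"].read())["embedding"])


# Step 3 (illustrative): store source/translation pairs with their embeddings.
# A real deployment would use a vector database rather than a Python list.
store = []


def add_pair(source: str, translation: str) -> None:
    store.append({"source": source, "translation": translation, "vector": embed(source)})


def top_matches(new_text: str, k: int = 2) -> list[tuple[str, str]]:
    # Step 5: cosine-similarity search over the source-language embeddings.
    query = embed(new_text)

    def cosine(vector: np.ndarray) -> float:
        return float(np.dot(query, vector) / (np.linalg.norm(query) * np.linalg.norm(vector)))

    ranked = sorted(store, key=lambda entry: cosine(entry["vector"]), reverse=True)
    return [(entry["source"], entry["translation"]) for entry in ranked[:k]]


add_pair("The early bird catches the worm.", "El que madruga, Dios le ayuda.")
add_pair("Sunset over a calm lake.", "Atardecer sobre un lago tranquilo.")
print(top_matches("Sunrise over a quiet lake.", k=1))
```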

Putting it all together

The top matching examples from the vector database would be dynamically inserted into the prompt, providing the model with highly relevant context for the specific translation task at hand.

This offered the following benefits:

- Each request was grounded in examples that closely matched the source text’s domain and phrasing, rather than generic examples for the language
- 123RF’s existing catalog of high-quality translations was put to work steering the model
- Translation quality could continuously improve as the database of approved translations grew

The following is an example of a dynamically generated prompt:

[Standard prompt preamble]
...
Examples:
<text>
{{Dynamically inserted similar source text 1}}
</text>
<translated_text>
{{Corresponding high-quality translation 1}}
</translated_text>
<text>
{{Dynamically inserted similar source text 2}}
</text>
<translated_text>
{{Corresponding high-quality translation 2}}
</translated_text>
...
[Rest of the standard prompt]
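
Assembling that prompt from retrieved pairs might look like the following sketch, which reuses the hypothetical top_matches helper from the earlier example; the preamble is abbreviated here for space:

```python
def build_dynamic_prompt(new_text: str, target_language: str) -> str:
    # Splice the most similar stored pairs into the K-shot section of the prompt.
    examples = "".join(
        f"<text>\n{source}\n</text>\n<translated_text>\n{translation}\n</translated_text>\n"
        for source, translation in top_matches(new_text)
    )
    return (
        "You are an AI language translation assistant. Your task is to accurately "
        "translate a passage of text from English into another specified language.\n\n"
        f"Examples:\n{examples}\n"
        f"Here is the text to translate:\n<text>\n{new_text}\n</text>\n"
        f"Please translate the above text into this language: {target_language}"
    )
```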

This dynamic approach allowed the model to continuously improve and adapt, using the growing database of high-quality translations to inform future tasks.
The following diagram illustrates the process workflow.

[Diagram: How to ground translations with a vector store]

The process includes the following steps:

1. Convert the new text to be translated into a vector using the same embeddings model.
2. Compare the text and its embedding against a database of high-quality existing translations.
3. Combine the most similar translations with an existing prompt template of generic translation examples for the target language.
4. Send the new augmented prompt, with the initial text to be translated, to Amazon Bedrock.
5. Store the output of the translation in an existing database, or save it for human-in-the-loop evaluation.
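
Tying the earlier sketches together, one pass over a single metadata record might look like this; translate and build_dynamic_prompt are the hypothetical helpers defined above, and the persistence step is left as a comment because the post doesn't specify the downstream store:

```python
def translate_metadata(new_text: str, target_language: str) -> str:
    # Steps 1-3: retrieve similar pairs and build the augmented prompt.
    prompt = build_dynamic_prompt(new_text, target_language)
    # Step 4: send the augmented prompt to Claude 3 Haiku on Amazon Bedrock.
    translation = translate(prompt)
    # Step 5: persist the result, or queue it for human-in-the-loop review.
    # save_translation(new_text, translation, target_language)  # hypothetical
    return translation


print(translate_metadata("Young woman using MacBook in a Starbucks", "Spanish"))
```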

The results: A 95% cost reduction and beyond

The impact of implementing these advanced techniques on Amazon Bedrock with Anthropic’s Claude 3 Haiku, together with the engineering effort alongside AWS account teams, was nothing short of transformative for 123RF. By working with AWS, 123RF was able to achieve a staggering 95% reduction in translation costs. But the benefits extended far beyond cost savings.

Looking ahead: Continuous improvement and expansion

The success of this project has opened new horizons for 123RF and set the stage for further advancements.

Conclusion

123RF’s success story with Amazon Bedrock and Anthropic’s Claude is more than just a tale of cost reduction—it’s a blueprint for how businesses can use cutting-edge AI to break down language barriers and truly globalize their digital content. This case study demonstrates the transformative power of innovative thinking, advanced prompt engineering, and the right technological partnership.

123RF’s journey offers the following key takeaways:

- Careful prompt engineering can draw near-premium translation quality out of smaller, cost-effective models such as Anthropic’s Claude 3 Haiku
- Grounding prompts with a vector store of existing high-quality translations improves accuracy, consistency, and context awareness
- The cost-quality trade-off in machine translation is not fixed; the right combination of model choice and technique can break it

As we look to the future, it’s clear that the combination of cloud computing, generative AI, and innovative prompt engineering will continue to reshape the landscape of multilingual content management. The barriers of language are crumbling, opening up new possibilities for global communication and content discovery.

For businesses facing similar challenges in global content discovery, 123RF’s journey offers valuable insights and a roadmap to success. It demonstrates that with the right technology partner and a willingness to innovate, even the most daunting language challenges can be transformed into opportunities for growth and global expansion. If you have a similar use case and want help implementing this technique, reach out to your AWS account teams, or sharpen your prompt engineering skills through our prompt engineering workshop available on GitHub.


About the Author

Fahim Surani is a Solutions Architect at Amazon Web Services who helps customers innovate in the cloud. With a focus in Machine Learning and Generative AI, he works with global digital native companies and financial services to architect scalable, secure, and cost-effective products and services on AWS. Prior to joining AWS, he was an architect, an AI engineer, a mobile games developer, and a software engineer. In his free time he likes to run and read science fiction.

Mark Roy is a Principal Machine Learning Architect for AWS, helping customers design and build generative AI solutions. His focus since early 2023 has been leading solution architecture efforts for the launch of Amazon Bedrock, AWS’ flagship generative AI offering for builders. Mark’s work covers a wide range of use cases, with a primary interest in generative AI, agents, and scaling ML across the enterprise. He has helped companies in insurance, financial services, media and entertainment, healthcare, utilities, and manufacturing. Prior to joining AWS, Mark was an architect, developer, and technology leader for over 25 years, including 19 years in financial services. Mark holds six AWS certifications, including the ML Specialty Certification.
