MarkTechPost@AI November 16, 2024
Marqo Releases Advanced E-commerce Embedding Models and Comprehensive Evaluation Datasets to Revolutionize Product Search, Recommendation, and Benchmarking for Retail AI Applications
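Marqo has launched advanced e-commerce embedding models and comprehensive evaluation datasets aimed at improving product search, retrieval, and recommendation in e-commerce. The models show marked improvements in accuracy and relevance and have been evaluated and validated on multiple datasets.

🎯 Marqo has launched the e-commerce embedding models Marqo-Ecommerce-B and Marqo-Ecommerce-L to strengthen the capabilities of e-commerce platforms

📊 Four evaluation datasets have been released for benchmarking and model comparison

💪 The models perform strongly across multiple evaluation metrics

🛠️ The models can be loaded through Hugging Face and similar tooling for use in e-commerce applications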

Marqo has introduced four groundbreaking datasets and state-of-the-art e-commerce embedding models designed to advance product search, retrieval, and recommendation capabilities in e-commerce. These models, Marqo-Ecommerce-B and Marqo-Ecommerce-L, offer substantial improvements in accuracy and relevance for e-commerce platforms by delivering high-quality embedding representations of product data. Alongside these models, Marqo has released a series of evaluation datasets, including AmazonProducts-3m, GoogleShopping-1m, AmazonProducts-Eval-100k, and GoogleShopping-General-Eval-100k, to provide a robust foundation for benchmarking and model comparison.

The newly introduced Marqo-Ecommerce-B and Marqo-Ecommerce-L embedding models represent a significant stride in e-commerce search and recommendation systems. Marqo-Ecommerce-B, with 203 million parameters, and Marqo-Ecommerce-L, with 652 million parameters, are optimized for capturing complex features within product images and text descriptions. These models leverage extensive training on diverse product data to facilitate nuanced comparisons and enhance the contextual understanding of various product attributes.

To illustrate the performance of these models, Marqo employed two key datasets for evaluation: AmazonProducts-3m and GoogleShopping-1m. These datasets enable users to test and validate the models’ capabilities across many e-commerce scenarios, simulating the diversity and complexity of a real-world e-commerce platform.

The benchmarking results underscore the impressive performance of Marqo’s models. Marqo-Ecommerce-L, the larger of the two models, demonstrated an average improvement of 17.6% in Mean Reciprocal Rank (MRR) and 20.5% in nDCG@10 compared to the best open-source model, ViT-SO400M-14-SigLIP, on all tasks within the Marqo-Ecommerce-Hard dataset. When compared to Amazon’s proprietary model, Amazon-Titan-Multimodal, Marqo-Ecommerce-L achieved an even more pronounced improvement: 38.9% in MRR, 45.1% in nDCG@10, and 35.9% in Recall across the text-to-image tasks. These metrics highlight Marqo-Ecommerce-L’s proficiency in accurately ranking relevant products and its superior performance in understanding complex textual and visual inputs.
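For readers less familiar with the ranking metrics quoted above, the following is a minimal sketch of how MRR, Recall@k, and nDCG@k are commonly computed. It assumes binary relevance (a product is either relevant to a query or not), and the product IDs and queries are made up for illustration; the benchmark's own scripts may define relevance differently.

```python
import math


def reciprocal_rank(ranked_ids, relevant_ids):
    """Return 1/rank of the first relevant item, or 0.0 if none was retrieved."""
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant_ids:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the relevant items that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for item in ranked_ids[:k] if item in relevant_ids)
    return hits / len(relevant_ids)


def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """Binary-relevance nDCG@k: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked_ids[:k], start=1)
        if item in relevant_ids
    )
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical toy data: (ranked results, set of relevant product IDs) per query.
queries = [
    (["p3", "p1", "p7"], {"p1"}),        # first relevant hit at rank 2
    (["p2", "p9", "p4"], {"p2", "p4"}),  # relevant hits at ranks 1 and 3
]

mrr = sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)
mean_recall = sum(recall_at_k(r, rel) for r, rel in queries) / len(queries)
mean_ndcg = sum(ndcg_at_k(r, rel) for r, rel in queries) / len(queries)
```

MRR rewards placing the first relevant product near the top, while nDCG@10 also accounts for where every other relevant product lands within the top ten.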

The Four Released Datasets

To support model evaluation, Marqo has released four datasets, each serving a unique purpose in e-commerce-related research and development:

AmazonProducts-3m: This large-scale dataset of three million Amazon products is designed for high-quality model evaluation. It provides various product data, including images and text descriptions, that challenge models to accurately capture the nuances in product features across diverse categories.

GoogleShopping-1m: This dataset comprises one million entries from Google Shopping and provides an alternative perspective to the AmazonProducts dataset, offering products that may have distinct attributes or branding. This dataset enables comprehensive testing of a model’s adaptability to various e-commerce platforms and product categories.

AmazonProducts-Eval-100k: A more compact version of AmazonProducts-3m, AmazonProducts-Eval-100k is tailored for researchers who may require a smaller sample for initial testing or model refinement. It maintains the diversity of product attributes found in AmazonProducts-3m, allowing quick yet thorough evaluations of a model’s performance.

GoogleShopping-General-Eval-100k: GoogleShopping-General-Eval-100k is a condensed version of GoogleShopping-1m, allowing efficient benchmarking with fewer computational resources. This dataset provides access to the essential characteristics of Google Shopping data, making it ideal for quick evaluations and iterative model tuning.

Marqo’s embedding models are available on Hugging Face, so developers can easily load them for text- and image-based e-commerce applications. Through Hugging Face’s Transformers library, users can load Marqo-Ecommerce-L or Marqo-Ecommerce-B with the AutoModel and AutoProcessor classes in a few lines of code. The loaded model can then process product images and text to produce embeddings for product search and recommendation.
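As a rough sketch of that loading step, the helper below wraps the two `from_pretrained` calls. The repository IDs in `MODEL_IDS` and the need for `trust_remote_code=True` are assumptions not stated in the article, so check them against the model pages on Hugging Face before relying on them. `load_marqo_model` is a hypothetical helper name.

```python
# Assumed Hugging Face repository IDs; verify against the published model pages.
MODEL_IDS = {
    "B": "Marqo/marqo-ecommerce-embeddings-B",
    "L": "Marqo/marqo-ecommerce-embeddings-L",
}


def load_marqo_model(size="L"):
    """Load one of the two models and its processor via Transformers.

    Calling this downloads the model weights, so it needs network access
    and the `transformers` package installed.
    """
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import AutoModel, AutoProcessor

    model_id = MODEL_IDS[size]
    # trust_remote_code=True is assumed to be required because the repository
    # is expected to ship custom model code.
    model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    return model, processor
```

A caller would write `model, processor = load_marqo_model("B")` for the smaller model and then pass product images and text through the processor before requesting embeddings from the model.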

Alternatively, Marqo’s models can be loaded using open_clip for users working with OpenCLIP. This framework enables users to preprocess product images and tokenize text inputs, optimizing them for Marqo’s model architecture. The results produced through OpenCLIP provide label probabilities that indicate how relevant a given image or text input is to specific product labels, aiding in the accurate categorization and recommendation of products.
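To make the idea of label probabilities concrete, here is a small sketch of the usual CLIP-style computation: normalize the embeddings, take the cosine similarity between one image and each candidate label, scale, and apply a softmax. The three-dimensional embeddings, the labels, and the scale factor of 100 are illustrative assumptions; real embeddings would come from the model and have far more dimensions.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_probabilities(image_embedding, text_embeddings, scale=100.0):
    """Probability of each text label for one image, from cosine similarity."""
    image = l2_normalize(image_embedding)
    sims = [
        sum(a * b for a, b in zip(image, l2_normalize(text)))
        for text in text_embeddings
    ]
    # The scale factor sharpens the distribution; 100 is an assumed value.
    return softmax([scale * s for s in sims])


# Made-up embeddings standing in for real model outputs.
labels = ["dining chairs", "a laptop", "toothbrushes"]
image_embedding = [0.9, 0.1, 0.0]
text_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

probs = label_probabilities(image_embedding, text_embeddings)
best_label = labels[probs.index(max(probs))]
```

Because the scores are scaled before the softmax, even a modest gap in cosine similarity produces a sharply peaked distribution, which is why the best-matching label typically receives a probability close to 1.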

A central component of Marqo’s model evaluation is Generalized Contrastive Learning (GCL), a technique that enhances the effectiveness of text-to-image and image-to-text matching. By employing GCL, Marqo ensures its models identify nuanced relationships between textual and visual data. This capability is crucial for any e-commerce platform that aims to provide reliable recommendations and robust product search.

Marqo has included the necessary evaluation scripts, making it straightforward for developers to replicate the benchmarking results and experiment with additional data. With GCL as the core evaluation methodology, Marqo’s models are optimized for real-world e-commerce applications that require highly accurate embeddings across varied and complex data inputs.

Marqo’s release of these models and datasets presents multiple practical applications for e-commerce businesses and researchers. Retailers can leverage Marqo’s models to implement precise product recommendations, facilitate faster and more accurate product searches, and improve customer satisfaction by enhancing their platforms’ relevance. Researchers can also benefit from the datasets’ breadth and diversity, using them as benchmarks to compare their models or to push the boundaries of e-commerce recommendation systems further.

In conclusion, Marqo’s new embedding models and datasets mark an important milestone in the evolution of e-commerce AI. By offering robust, high-performance models and carefully curated datasets, Marqo provides e-commerce businesses and the research community with invaluable tools to drive product search and recommendation innovation. These resources underscore the growing importance of AI in transforming e-commerce and set a new benchmark for what AI models in this sector can achieve.



