EnterpriseAI 2024年07月26日
Nvidia’s Superlove: First Superchip, Now Supermodel

Nvidia has introduced the “supermodel” concept, which combines multiple large language models (LLMs) with customization techniques to deliver highly personalized AI applications. The approach draws on Meta’s Llama 3.1 and Nvidia’s own Nemotron model, using fine-tuning, guardrails, and adapters to build AI models tailored to customer needs. The “supermodel” breaks from the traditional one-size-fits-all AI model and moves toward more flexible AI applications that adapt to user requirements.

📈 **Customization advantages of the supermodel:** Nvidia’s “supermodel” combines Llama 3.1 and Nemotron with fine-tuning, guardrails, and adapters to create AI models tailored to customer needs. This customization can deliver more precise and effective AI services for specific industries or application scenarios. In healthcare, for example, a “supermodel” could power an application dedicated to diagnosing disease or predicting treatment outcomes, without the laborious retraining a general-purpose model would require.

📡 **Integration and coordination of multiple tools:** Nvidia’s “supermodel” integrates AI tools such as LlamaGuard and NeMo Guardrails to keep chatbot answers accurate and safe. In addition, RAG systems and LoRA adapters help fine-tune models to generate more accurate results. Combining these tools can significantly improve model performance and efficiency while meeting different users’ individual needs.

📢 **Combining cloud services and open-source technology:** Nvidia’s “supermodel” is offered in partnership with multiple cloud providers and actively incorporates open source. For example, Nvidia tunes the open-source Llama 3.1 model to run efficiently on its GPUs and links it with its other technology, including GPUs and CUDA. This open, collaborative approach can accelerate the adoption and spread of AI.

📣 **Launch of the NIM factory:** Nvidia has introduced the NIM factory, which gives enterprises the tooling to build their own AI models and infrastructure. With it, companies can create customized AI models for their own needs and deploy them across different application scenarios, further driving AI adoption and delivering more business value.

Nvidia is tempting fate with its generous use of the term “super” to describe new products—the latest is a “supermodel” that uses innovative techniques to create fine-looking AI models.

The company this week announced support for Meta’s Llama 3.1 AI model with 405 billion parameters on its GPUs. When used alongside its homegrown model called Nemotron, voila, it produces a “supermodel.”

The “supermodel” term refers to highly customized models built from multiple LLMs, fine-tuning, guardrails, and adapters into an AI application that suits customer requirements.

(Source: Nvidia)

The “supermodel” may represent how LLMs are customized to meet organizational needs. Nvidia is trying to break away from the one-size-fits-all AI model and move toward complementary AI models and tools that work together.

The Llama 3.1-Nemotron technique resembles a good cop-bad cop routine. Llama 3.1 generates output, which then passes through Nemotron, a reward model that judges whether the output is good or bad. The reward is a fine-tuned model with more accurate responses.

“You can use those together to create synthetic data. So … create synthetic data, and the reward model says yes, that’s good data or not,” said Kari Briski, vice president at Nvidia, during a press briefing.
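
The generate-then-score loop Briski describes might be sketched as follows. This is a minimal illustration with stand-in functions for the two models; in the real pipeline, Llama 3.1 produces the candidates and the Nemotron reward model scores them.

```python
# Sketch of synthetic-data generation with a reward-model filter.
# generate_candidates() and reward_score() are hypothetical stand-ins
# for Llama 3.1 (generation) and Nemotron (reward scoring).

def generate_candidates(prompt, n=4):
    # Stand-in for the generator model: return n candidate responses.
    return [f"{prompt} -> candidate {i}" for i in range(n)]

def reward_score(prompt, response):
    # Stand-in for the reward model: score each response from 0.0 to 1.0.
    return 1.0 if "candidate 0" in response else 0.3

def build_synthetic_dataset(prompts, threshold=0.5):
    # Keep only the responses the reward model judges as good data.
    dataset = []
    for prompt in prompts:
        for response in generate_candidates(prompt):
            if reward_score(prompt, response) >= threshold:
                dataset.append({"prompt": prompt, "response": response})
    return dataset

pairs = build_synthetic_dataset(["Summarize the quarterly report"])
print(len(pairs))  # prints 1: only the high-reward pair survives
```

The surviving prompt-response pairs then become training data for fine-tuning, which is how the reward model’s judgment feeds back into a more accurate model.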

Nvidia is also tacking on more makeup for supermodels to look better. The AI factory backend includes many tools that can be mixed and matched to create a finely tuned model.

The added tooling provides faster responses and efficient use of computing resources.

“We’ve seen almost a 10-point increase in accuracy by simply customizing models,” Briski said.

An important component is NIM (Nvidia inference microservices), a downloadable container that provides the interface for customers to interact with AI. The model fine-tuning with multiple LLMs, guardrails, and optimizations happens in the background as users interact via the NIM.

Developers can now download the Llama 3.1 NIMs and fine-tune them with adapters that tailor the model to local data, generating more relevant results.
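
From the client’s side, interacting with a downloaded NIM looks like an ordinary HTTP chat-completion call; NIMs expose an OpenAI-compatible API. The endpoint URL and model name below are assumptions for a local deployment, not values from the article.

```python
# Minimal sketch of a client calling a locally deployed Llama 3.1 NIM.
# The URL and model identifier are assumed for illustration.
import json
import urllib.request

NIM_URL = "http://localhost:8000/v1/chat/completions"  # assumed local NIM

def build_chat_request(user_message, model="meta/llama-3.1-405b-instruct"):
    # OpenAI-style chat completion request body.
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": 256,
    }

def ask_nim(user_message):
    body = json.dumps(build_chat_request(user_message)).encode()
    req = urllib.request.Request(
        NIM_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The fine-tuning, guardrails, and optimizations all sit behind this interface, which is the point: the customer sees one endpoint, not the assembly line.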

Creating an AI supermodel is a complicated process. First, users need to figure out the ingredients, which could include Llama 3.1 with adapters to pull their own data into AI inferencing.

Customers can attach guardrails such as LlamaGuard or NeMo Guardrails to ensure chatbot answers remain relevant. In many cases, RAG systems and LoRA adapters help fine-tune models to generate more accurate results.
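
A guardrail of this kind is conceptually a checkpoint between the model’s draft answer and the user. The sketch below shows the pattern only; the topic list and the classifier are hypothetical stand-ins, not the LlamaGuard or NeMo Guardrails APIs.

```python
# Sketch of a relevance guardrail: the chatbot's draft answer is
# checked before it reaches the user. ALLOWED_TOPICS and
# classify_topic() are hypothetical stand-ins for a guardrail model.

ALLOWED_TOPICS = {"billing", "shipping", "returns"}
REFUSAL = "Sorry, I can only help with billing, shipping, or returns."

def classify_topic(text):
    # Stand-in for a guardrail model that labels the draft answer.
    for topic in ALLOWED_TOPICS:
        if topic in text.lower():
            return topic
    return "off_topic"

def guarded_reply(draft_answer):
    # Only on-topic answers pass through the rail.
    if classify_topic(draft_answer) == "off_topic":
        return REFUSAL
    return draft_answer
```

Real guardrail frameworks run model-based classifiers rather than keyword checks, but the control flow, judge the draft and either pass it through or substitute a refusal, is the same.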

The pipeline also involves extracting relevant data and pushing it to a vector database, where information is evaluated and responses are funneled to users. Companies typically hold such information in databases, and Nvidia provides plugins that can interpret stored data for AI use.

“We’ve got models. We’ve got the compute. We’ve got the tooling and expertise,” Briski said.

Nvidia is partnering with many cloud providers to offer this service. The company is also building a sub-factory within its AI factory, called NIM factory, which provides the tooling for companies to build their own AI models and infrastructure.

The support for Llama 3.1 offers insight into how the company will integrate open-source technology into its proprietary AI offerings. Like with Linux, the company is taking open-source models, tuning them to its GPUs, and then linking them to its proprietary tech, including GPUs and CUDA.

