NVIDIA Developer, February 16
One-Click Deployments for the Best of NVIDIA AI with NVIDIA Launchables

 

NVIDIA Launchables aim to streamline AI development: through preconfigured GPU computing environments, developers can deploy reference workflows with one click and start building immediately. Launchables include NVIDIA GPUs, Python, CUDA, Docker containers, development frameworks (such as NVIDIA NIM, NeMo, and Omniverse), SDKs, and assorted dependencies and environment configurations. They can also automatically mount GitHub repositories or Jupyter notebooks. Launchables are designed to boost development efficiency, reduce the complexity of environment setup, and foster team collaboration through consistent, reproducible development environments. With Launchables, developers can quickly access NVIDIA GPUs, configure environments, and easily share and deploy them, accelerating the development of AI applications.

🚀 **One-click deployment**: Launchables provide preconfigured environments that simplify development setup, collapsing hours of dependency debugging, GPU driver configuration, and framework compatibility testing into a single click, so developers can start writing code immediately.

🔄 **Environment reproducibility**: By packaging the entire development stack, including CUDA drivers and framework versions, into a versioned, reproducible configuration, Launchables eliminate environment inconsistency and ensure that every user gets an identical development environment.

⚙️ **Flexible configuration options**: Launchables support fine-grained environment customization, letting users select specific NVIDIA GPUs (from T4 to H100) based on vRAM requirements, define container configurations with precise Python and CUDA versions, and include specific GitHub repositories or Jupyter notebooks that are automatically mounted into the GPU instance.

🤝 **Built for collaboration**: By sharing a complete development environment through a single URL, Launchables streamline collaboration. This is valuable for open source maintainers, educators, and teammates sharing internal projects, and deployment metrics let creators see how others use their environments.

AI development has become a core part of modern software engineering, and NVIDIA is committed to finding ways to bring optimized accelerated computing to every developer who wants to start experimenting with AI. To address this, we've been working on making the accelerated computing stack more accessible with NVIDIA Launchables: preconfigured GPU computing environments that enable you to deploy reference workflows and start building immediately, with the required compute provided.

**What are NVIDIA Launchables?**

NVIDIA Launchables are one-click deployable GPU development environments with predefined configurations that can help you get up and running with a workflow. They function as templates that contain all the essential components necessary to achieve a purpose:

- NVIDIA GPUs
- Python
- CUDA
- Docker containers
- Development frameworks, including NVIDIA NIM, NVIDIA NeMo, and NVIDIA Omniverse
- SDKs
- Dependencies
- Environment configurations

They can also contain GitHub repos or Jupyter notebooks automatically set up and mounted in a GPU instance. For teams collaborating on projects or individual developers working across multiple environments, Launchables ensure consistent and reproducible setups without manual configuration and overhead:

- **On-demand access to NVIDIA GPUs**: Start evaluating a reference workflow even without a GPU by spinning up an environment as specified by the preset variables to get to value faster.
- **Community**: Configure an environment for others to easily deploy. Useful for sharing demos, demonstrating training and inference pipelines, and teaching with reference code examples.
Creators receive metrics on how a Launchable is viewed or deployed.

**Launchable examples**

Here are a few scenarios where Launchables come in handy:

- Setting up Megatron-LM for GPU-optimized training
- Running NVIDIA AI Blueprint for multimodal PDF data extraction
- Deploying Llama3-8B for inference with NVIDIA TensorRT-LLM

**Setting up Megatron-LM for GPU-optimized training**

Before tinkering with different parallelism techniques like tensor or pipeline parallelism, you must have PyTorch, CUDA, and a beefy GPU setup to have a reasonable training pipeline. With the Megatron-LM Launchable, you get access to an 8xH100 GPU node environment from a cloud partner that comes with PyTorch, CUDA, and Megatron-LM set up. Now you can immediately adjust different parameters, such as `--tensor-model-parallel-size` and `--pipeline-model-parallel-size`, to determine which parallelism technique is most suitable for your specific model size and pretraining requirements.

**Running NVIDIA AI Blueprint for multimodal PDF data extraction**

Unstructured PDF sources can often contain text, tables, charts, and images that must be extracted to run RAG and other downstream generative AI use cases. The pdf-ingest-blueprint Launchable comes with a Jupyter notebook that sets up a PDF data extraction pipeline for enterprise partners. With the NVIDIA-Ingest microservice and various NIM microservices deployed through the Launchable, you can set up a production-grade pipeline to parallelize document splitting and test retrieval on massive corpuses of PDF data.

**Deploying Llama3-8B for inference with NVIDIA TensorRT-LLM**

The Run Llama3 Inference with TRT-LLM Launchable comes with a Jupyter notebook guide and is used as documentation.
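To make the Megatron-LM flags above concrete, here is a minimal, hypothetical launch sketch for a single 8xH100 node. The script name `pretrain_gpt.py` and the parallelism flags are standard Megatron-LM; all paths, model dimensions, and batch sizes are illustrative placeholders, not values shipped with the Launchable.

```shell
# Hypothetical sketch: Megatron-LM pretraining on one 8-GPU node.
# tensor-parallel 4 x pipeline-parallel 2 = 8 GPUs: each layer's weights are
# split four ways, and the layer stack is split into two pipeline stages.
torchrun --nproc_per_node=8 pretrain_gpt.py \
    --tensor-model-parallel-size 4 \
    --pipeline-model-parallel-size 2 \
    --num-layers 32 \
    --hidden-size 4096 \
    --num-attention-heads 32 \
    --seq-length 4096 \
    --max-position-embeddings 4096 \
    --micro-batch-size 1 \
    --global-batch-size 128 \
    --train-iters 1000 \
    --data-path /path/to/dataset \
    --tokenizer-type GPT2BPETokenizer \
    --vocab-file gpt2-vocab.json \
    --merge-file gpt2-merges.txt
```

Rerunning with, say, `--tensor-model-parallel-size 8 --pipeline-model-parallel-size 1` (or 2x4) and comparing throughput is the kind of experiment the Launchable lets you start immediately.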
It demonstrates how to deploy Llama3 with TensorRT-LLM for low-latency inference by converting the model into an ONNX intermediate representation, creating an underlying runtime through a build config (which implements optimization plugins for attention mechanisms using `--gpt_attention_plugin` and matrix multiplication operations using `--gemm_plugin`), and deploying the TensorRT engine to run inference on input tokens.

**Launchable benefits**

After collecting feedback from early users, here are some core technical capabilities that have developers excited about using Launchables for reproducible workflows:

- True one-click deployment
- Environment reproducibility
- Flexible configuration options
- Built for collaboration

**True one-click deployment**

Development environment setup typically involves hours of debugging dependencies, configuring GPU drivers, and testing framework compatibility. Launchables reduce this to a one-click deployment process by providing preconfigured environments with frameworks, CUDA versions, and hardware configurations. This means that you can start writing code immediately instead of wrestling with infrastructure.

**Environment reproducibility**

Environment inconsistency remains a major source of debugging overhead in AI development teams. Launchables solve this by packaging your entire development stack, from CUDA drivers to framework versions, into a versioned, reproducible configuration. When you share a Launchable URL, you're guaranteeing that any end consumer gets an identical development environment, eliminating "works on my machine" scenarios.

**Flexible configuration options**

Different AI workloads require different hardware and software configurations.
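As a rough sketch of the TensorRT-LLM build-and-deploy flow just described: the `trtllm-build` CLI and its `--gpt_attention_plugin`/`--gemm_plugin` options are real, but the conversion and run scripts, their locations, and the exact flags vary across TensorRT-LLM releases, and the directories below are placeholders.

```shell
# Hypothetical sketch of the TensorRT-LLM flow; check your TensorRT-LLM
# version's examples for the exact script names and flags.

# 1. Convert the Llama3-8B checkpoint into TensorRT-LLM's checkpoint format.
python convert_checkpoint.py \
    --model_dir ./Meta-Llama-3-8B \
    --output_dir ./tllm_ckpt \
    --dtype float16

# 2. Build the TensorRT engine, enabling the attention and GEMM plugins
#    mentioned in the article.
trtllm-build \
    --checkpoint_dir ./tllm_ckpt \
    --output_dir ./llama3_engine \
    --gpt_attention_plugin float16 \
    --gemm_plugin float16

# 3. Run inference on input tokens against the built engine.
python run.py \
    --engine_dir ./llama3_engine \
    --tokenizer_dir ./Meta-Llama-3-8B \
    --input_text "What is a Launchable?" \
    --max_output_len 64
```

The build step is where the latency optimizations land: the plugins replace generic graph ops with fused GPU kernels before the engine is serialized.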
Launchables support this through granular environment customization:

- Select specific NVIDIA GPUs (T4 to H100) based on your vRAM requirements.
- Define container configurations with precise Python and CUDA version requirements.
- Include specific GitHub repositories or Jupyter notebooks to be automatically mounted in your GPU instance.

**Built for collaboration**

Launchables streamline collaboration by enabling anyone to share complete development environments through a single URL. For open source maintainers, educational instructors, or even teammates sharing an internal project, you can track deployment metrics to understand how others are using your environment. This is also particularly valuable for ensuring reproducibility in research settings and maintaining consistent training environments across distributed teams.

**Creating a Launchable**

Creating a Launchable is straightforward:

1. **Choose your compute**: Select from a range of NVIDIA GPUs and customize your compute resources.
2. **Configure your environment**: Pick a VM or container configuration with specific Python and CUDA versions.
3. **Add your code**: Connect your Jupyter notebooks or GitHub repositories to be added to your end GPU environment.
4. **Share and deploy**: Generate a shareable link that others can use to instantly deploy the same environment.

Video 1. How to Create an NVIDIA Launchable

After you create a Launchable, you get the following:

- **A shareable URL**: Share with others directly or through an asset like a YouTube video or blog post so that anyone can visit the Launchable.
Save it in your notes to come back to a preconfigured setup from the past.

- **Markdown code for a badge**: Embed a one-click deployment badge in your GitHub README, Jupyter notebook, and so on.

As you share the URL with others or save it for your own reproducible setup, you can view metrics on how many times your Launchable has been viewed and deployed.

**Get started with one-click deployments today**

Launchables drastically reduce the traditional friction of sharing and reproducing GPU development environments by letting you package, version, and instantly deploy exact configurations. Teams spend less time on infrastructure setup and more time building AI applications.

We are actively expanding the readily available Launchables on build.nvidia.com as new NIM microservices and other NVIDIA software, SDKs, and libraries are released. Explore them today!
