Nvidia Blog, May 1, 21:15
Control the Composition of AI-Generated Images With the NVIDIA AI Blueprint for 3D-Guided Generative AI

NVIDIA has introduced an AI Blueprint that uses 3D-guided generative AI to give users finer control over AI-generated images. The workflow builds a 3D scene in Blender, generates a depth map from it, and feeds that map, together with the user's text prompt, to the FLUX.1-dev model to generate the image. This approach lowers the bar for object detail and texture quality and makes it easy to adjust object placement and camera angles. The blueprint integrates ComfyUI to connect Blender to the FLUX.1-dev model and uses NVIDIA NIM microservices for optimized performance. It gives AI artists and developers a prebuilt foundation that fast-tracks access to advanced AI capabilities.

🎨 The NVIDIA AI Blueprint uses a depth map generated from a Blender 3D scene, combined with the user's text prompt, to drive image generation with the FLUX.1-dev model, enabling precise control over the composition of AI-generated images while reducing the dependence on object detail and high-quality textures.

🧩 The workflow integrates ComfyUI, a powerful tool that lets creators chain generative AI models in interesting ways. Through the ComfyUI Blender plug-in, users can connect Blender seamlessly to ComfyUI for more advanced image generation workflows.

🚀 An NVIDIA NIM microservice deploys and runs the FLUX.1-dev model at peak performance on GeForce RTX GPUs, using the NVIDIA TensorRT software development kit and optimized FP4 and FP8 formats to significantly improve image generation efficiency.

🛠️ The AI Blueprint provides a prebuilt foundation, including Blender, ComfyUI, the Blender plug-in, the FLUX.1-dev NIM microservice and the required ComfyUI nodes, along with detailed deployment instructions, sample assets and a preconfigured environment, making it easier for AI artists and developers to get started with advanced image generation workflows.

AI-powered image generation has progressed at a remarkable pace — from early examples of models creating images of humans with too many fingers to now producing strikingly photorealistic visuals. Even with such leaps, one challenge remains: achieving creative control.

Creating scenes using text has gotten easier, no longer requiring complex descriptions — and models have improved alignment to prompts. But describing finer details like composition, camera angles and object placement with text alone is hard, and making adjustments is even more complex. Advanced workflows using ControlNets — tools that enhance image generation by providing greater control over the output — offer solutions, but their setup complexity limits broader accessibility.

To help overcome these challenges and fast-track access to advanced AI capabilities, NVIDIA announced the NVIDIA AI Blueprint for 3D-guided generative AI for RTX PCs at the CES trade show earlier this year. This sample workflow includes everything needed to start generating images with full composition control. Users can download the new blueprint today.

Harness 3D to Control AI-Generated Images

The NVIDIA AI Blueprint for 3D-guided generative AI controls image generation by using a draft 3D scene in Blender to provide a depth map to the image generator — FLUX.1-dev, from Black Forest Labs — which together with a user’s prompt generates the desired images.

The depth map helps the image model understand where things should be placed. The advantage of this technique is that it doesn’t require highly detailed objects or high-quality textures, since they’ll be converted to grayscale. And because the scenes are in 3D, users can easily move objects around and change camera angles.
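The conversion from depth buffer to control image can be sketched in a few lines. This is a minimal illustration, not the blueprint's actual code: it normalizes raw depth values to an 8-bit grayscale image in the near-is-bright convention that depth-conditioned image models commonly expect.

```python
import numpy as np

def depth_to_control_image(depth: np.ndarray) -> np.ndarray:
    """Normalize a raw depth buffer to an 8-bit grayscale control image.

    Near surfaces map to bright values and far surfaces to dark ones.
    Geometry detail and textures are irrelevant here -- only relative
    distance survives, which is why draft-quality scenes work fine.
    """
    finite = depth[np.isfinite(depth)]
    near, far = finite.min(), finite.max()
    # Clamp background (inf / missing) pixels to the far plane.
    clipped = np.clip(np.nan_to_num(depth, nan=far, posinf=far), near, far)
    normalized = (far - clipped) / max(far - near, 1e-8)  # 1.0 = near, 0.0 = far
    return (normalized * 255).astype(np.uint8)

# A toy 2x2 "scene": a close object (1 m) in front of a far wall (10 m).
depth = np.array([[1.0, 1.0], [10.0, 10.0]])
control = depth_to_control_image(depth)
print(control)  # near pixels -> 255, far pixels -> 0
```

Because only relative distance matters, any untextured gray-box layout in Blender produces the same control signal as a fully dressed scene.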

Under the hood of the blueprint is ComfyUI, a powerful tool that allows creators to chain generative AI models in interesting ways. For example, the ComfyUI Blender plug-in lets users connect Blender to ComfyUI. Plus, an NVIDIA NIM microservice lets users deploy the FLUX.1-dev model and run it at the best performance on GeForce RTX GPUs, tapping into the NVIDIA TensorRT software development kit and optimized formats like FP4 and FP8. The AI Blueprint for 3D-guided generative AI requires an NVIDIA GeForce RTX 4080 GPU or higher.
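ComfyUI represents a chained workflow as a JSON graph of nodes, where each node names a class and wires its inputs to other nodes' outputs. The sketch below shows the general shape of such a graph; the `FluxNIMCheckpointLoader` node name is hypothetical (the blueprint ships its own FLUX.1-dev NIM nodes), while `LoadImage`, `CLIPTextEncode` and `KSampler` are standard ComfyUI node classes.

```python
import json

# Each node is keyed by an id and declares a class_type plus inputs;
# an input may reference another node's output as [node_id, output_index].
workflow = {
    "1": {"class_type": "LoadImage",
          "inputs": {"image": "blender_depth_map.png"}},
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "a cozy reading nook at golden hour",
                     "clip": ["3", 1]}},
    "3": {"class_type": "FluxNIMCheckpointLoader",  # hypothetical node name
          "inputs": {"model": "flux.1-dev"}},
    "4": {"class_type": "KSampler",
          "inputs": {"model": ["3", 0], "positive": ["2", 0],
                     "latent_image": ["1", 0], "steps": 30, "seed": 42}},
}

payload = json.dumps({"prompt": workflow})
# A running ComfyUI instance accepts this graph via its HTTP API, e.g.:
#   requests.post("http://127.0.0.1:8188/prompt", data=payload)
print(payload[:60])
```

The Blender plug-in effectively automates this step, re-submitting the graph with a fresh depth render whenever the user moves an object or the camera.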

A Prebuilt Foundation for Generative AI Workflows

The blueprint for 3D-guided generative AI includes everything necessary for getting started with an advanced image generation workflow: Blender, ComfyUI, the Blender plug-ins to connect the two, the FLUX.1-dev NIM microservice and the ComfyUI nodes required to run it. For AI artists, it also comes with an installer and detailed deployment instructions.

The blueprint offers a structured way to dive into image generation, providing a working pipeline that can be tailored to specific needs. Step-by-step documentation, sample assets and a preconfigured environment provide a solid foundation that makes the creative process more manageable and the results more powerful.

For AI developers, the blueprint can act as a foundation for building similar pipelines or expanding existing ones. It comes with source code, sample data, documentation and a working sample for getting started.

Real-Time Generation Powered by RTX AI 

AI Blueprints run on NVIDIA RTX AI PCs and workstations, harnessing recent performance breakthroughs from the NVIDIA Blackwell architecture.

The FLUX.1-dev NIM microservice included in the blueprint for 3D-guided generative AI is optimized with TensorRT and quantized to FP4 precision for Blackwell GPUs, more than doubling inference speed compared with native PyTorch FP16.

For users on NVIDIA Ada Lovelace generation GPUs, the FLUX.1-dev NIM microservice comes with FP8 variants, also accelerated by TensorRT. These improvements make high-performance workflows more accessible for rapid iteration and experimentation. Quantization also helps run models with less VRAM. With FP4, for instance, model sizes are reduced by more than 2x compared with FP16.

Customize and Create With RTX AI

There are 10 NIM microservices currently available for RTX, supporting use cases spanning image and language generation to speech AI and computer vision — with more blueprints and services on the way.

Available now at https://build.nvidia.com/nvidia/genai-3d-guided, AI Blueprints and NIM microservices provide powerful foundations for those ready to create, customize and push the boundaries of generative AI on RTX PCs and workstations.

Each week, the RTX AI Garage blog series features community-driven AI innovations and content for those looking to learn more about NIM microservices and AI Blueprints, as well as building AI agents, creative workflows, digital humans, productivity apps and more on AI PCs and workstations.

Plug in to NVIDIA AI PC on Facebook, Instagram, TikTok and X — and stay informed by subscribing to the RTX AI PC newsletter.

Follow NVIDIA Workstation on LinkedIn and X.

See notice regarding software product information.
