MarkTechPost@AI, September 3, 2024
What If Game Engines Could Run on Neural Networks? This AI Paper from Google Unveils GameNGen and Explores How Diffusion Models Are Revolutionizing Real-Time Gaming

Researchers from Google and Tel Aviv University have released GameNGen, a new approach that uses an augmented Stable Diffusion model to simulate complex interactive environments, such as the game DOOM, in real time. GameNGen overcomes the limitations of existing methods with a two-phase training pipeline: first, a reinforcement learning agent is trained to play the game, producing a dataset of gameplay trajectories; then a generative diffusion model is trained to predict the next game frame from past actions and observations.

🤖 GameNGen uses a two-phase training pipeline: first, a reinforcement learning agent is trained to play DOOM, generating a dataset of gameplay trajectories; then a generative diffusion model (a modified version of Stable Diffusion) is trained to predict the next game frame from past actions and observations.

🚀 GameNGen produces visuals that are nearly indistinguishable from the original DOOM game, even over long sequences. It reaches a Peak Signal-to-Noise Ratio (PSNR) of 29.43, comparable to lossy JPEG compression, and a Learned Perceptual Image Patch Similarity (LPIPS) score of 0.249, indicating high visual fidelity.

💡 GameNGen marks a breakthrough in AI-driven game simulation, demonstrating that neural models can effectively simulate complex interactive environments such as DOOM in real time while maintaining high visual quality.

💪 GameNGen runs at 20 frames per second on a single TPU while delivering visuals comparable to the original game, hinting at a shift in game development in which games are created and driven by neural models rather than traditional code-based engines.

🎮 GameNGen builds on Stable Diffusion, a generative diffusion model originally designed for image generation, repurposing it for game simulation. The model generates realistic game frames and predicts the game state from the game's rules and the player's inputs.

🌐 GameNGen also illustrates the potential of AI in game development, helping developers build more realistic and interactive games.

🎮 GameNGen uses a technique called noise augmentation to counter autoregressive drift, the gradual degradation of frame quality over time. Injecting noise into the conditioning frames during training helps the model keep predicting clean frames.

🎮 The developers also fine-tuned the model's latent decoder to improve image quality, particularly for the in-game HUD (heads-up display).

A significant challenge in AI-driven game simulation is the ability to accurately simulate complex, real-time interactive environments using neural models. Traditional game engines rely on manually crafted loops that gather user inputs, update game states, and render visuals at high frame rates, crucial for maintaining the illusion of an interactive virtual world. Replicating this process with neural models is particularly difficult due to issues such as maintaining visual fidelity, ensuring stability over extended sequences, and achieving the necessary real-time performance. Addressing these challenges is essential for advancing the capabilities of AI in game development, paving the way for a new paradigm where game engines are powered by neural networks rather than manually written code.
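To make the contrast concrete, the hand-written loop that a traditional engine runs looks roughly like the sketch below; `poll_input`, `update_state`, and `render` are illustrative placeholders, not any real engine's API.

```python
import time

TARGET_FPS = 60
FRAME_TIME = 1.0 / TARGET_FPS


def poll_input():
    """Placeholder: read controller/keyboard state."""
    return {}


def update_state(state, user_input, dt):
    """Placeholder: advance game logic (physics, AI, scoring) by dt seconds."""
    return state


def render(state):
    """Placeholder: draw the current state to the screen."""
    pass


def game_loop():
    state = {}
    while True:
        start = time.time()
        user_input = poll_input()
        state = update_state(state, user_input, FRAME_TIME)
        render(state)
        # Sleep off any remaining budget to hold a steady frame rate.
        time.sleep(max(0.0, FRAME_TIME - (time.time() - start)))
```

GameNGen's claim is that the `update_state` and `render` steps of such a loop can be replaced by a single neural model that predicts the next frame directly.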

Current approaches to simulating interactive environments with neural models include methods like Reinforcement Learning (RL) and diffusion models. Techniques such as World Models by Ha and Schmidhuber (2018) and GameGAN by Kim et al. (2020) have been developed to simulate game environments using neural networks. However, these methods face significant limitations, including high computational costs, instability over long trajectories, and poor visual quality. For instance, GameGAN, while effective for simpler games, struggles with complex environments like DOOM, often producing blurry and low-quality images. These limitations make these methods less suitable for real-time applications and restrict their utility in more demanding game simulations.

The researchers from Google and Tel Aviv University introduce GameNGen, a novel approach that utilizes an augmented version of the Stable Diffusion v1.4 model to simulate complex interactive environments, such as the game DOOM, in real-time. GameNGen overcomes the limitations of existing methods by employing a two-phase training process: first, an RL agent is trained to play the game, generating a dataset of gameplay trajectories; second, a generative diffusion model is trained on these trajectories to predict the next game frame based on past actions and observations. This approach leverages diffusion models for game simulation, enabling high-quality, stable, and real-time interactive experiences. GameNGen represents a significant advancement in AI-driven game engines, demonstrating that a neural model can match the visual quality of the original game while running interactively.
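Conceptually, the first phase reduces to rolling out an agent in the game and logging (frame, action) pairs. The sketch below assumes a gym-style environment object with `reset`/`step` methods and a `num_actions` attribute, and uses a random policy as a stand-in for the trained RL agent; none of these names come from the paper.

```python
import random


def collect_trajectories(env, num_episodes, max_steps=1000):
    """Roll out a policy and record (frame, action) pairs for later diffusion training.

    `env` is assumed to expose a gym-style interface: reset() -> frame,
    step(action) -> (frame, reward, done, info). The random action choice
    below stands in for the trained RL agent described in the paper.
    """
    trajectories = []
    for _ in range(num_episodes):
        frames, actions = [env.reset()], []
        for _ in range(max_steps):
            action = random.randrange(env.num_actions)  # placeholder policy
            frame, reward, done, info = env.step(action)
            frames.append(frame)
            actions.append(action)
            if done:
                break
        trajectories.append({"frames": frames, "actions": actions})
    return trajectories
```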

GameNGen’s development involves a two-stage training process. Initially, an RL agent is trained to play DOOM, creating a diverse set of gameplay trajectories. These trajectories are then used to train a generative diffusion model, a modified version of Stable Diffusion v1.4, to predict subsequent game frames based on sequences of past actions and observations. The model is trained with a velocity-parameterized diffusion loss over these frame sequences. To address autoregressive drift, which degrades frame quality over time, noise augmentation is introduced during training. Additionally, the researchers fine-tuned a latent decoder to improve image quality, particularly for the in-game HUD (heads-up display). The model was trained and evaluated in a VizDoom environment on a dataset of 900 million frames, with a batch size of 128 and a learning rate of 2e-5.
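The core of the second phase can be sketched as follows, assuming latent frames, a discrete action space, and a toy CNN in place of the Stable Diffusion v1.4 UNet. The names and hyperparameters here (`TinyDenoiser`, `CONTEXT_LEN`, the 0.3 augmentation range, the sine/cosine schedule) are illustrative assumptions, not the paper's configuration; the sketch only shows the ingredients the paragraph mentions: conditioning on past frames and actions, a velocity-parameterized loss, and noise augmentation of the context frames.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal sketch only: a toy CNN stands in for the Stable Diffusion v1.4 UNet,
# and all sizes below are illustrative assumptions, not the paper's values.
NUM_ACTIONS = 8   # assumed size of the discrete action space
CONTEXT_LEN = 4   # assumed number of past frames used as conditioning
LATENT_CH = 4     # Stable Diffusion-style latent channels


class TinyDenoiser(nn.Module):
    """Predicts the velocity target v from the noisy latent, past frames, and the last action."""

    def __init__(self):
        super().__init__()
        in_ch = LATENT_CH * (1 + CONTEXT_LEN)  # noisy target frame + context frames
        self.action_embed = nn.Embedding(NUM_ACTIONS, 32)
        self.net = nn.Sequential(
            nn.Conv2d(in_ch + 32, 64, 3, padding=1), nn.SiLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.SiLU(),
            nn.Conv2d(64, LATENT_CH, 3, padding=1),
        )

    def forward(self, noisy, context, actions):
        b, _, h, w = noisy.shape
        ctx = context.flatten(1, 2)  # (B, CONTEXT_LEN * C, H, W)
        # Broadcast the embedding of the most recent action over the spatial grid.
        act = self.action_embed(actions[:, -1]).view(b, 32, 1, 1).expand(b, 32, h, w)
        return self.net(torch.cat([noisy, ctx, act], dim=1))


def training_step(model, optimizer, frames, actions):
    """One v-prediction step with noise augmentation of the conditioning frames.

    frames:  (B, CONTEXT_LEN + 1, C, H, W) past frames followed by the target frame
    actions: (B, CONTEXT_LEN) discrete actions taken between those frames
    """
    context, target = frames[:, :-1], frames[:, -1]

    # Noise augmentation: corrupt the context so the model learns to tolerate
    # its own imperfect autoregressive outputs at inference time.
    aug = torch.rand(context.shape[0], 1, 1, 1, 1, device=context.device) * 0.3
    context = context + aug * torch.randn_like(context)

    # Noise the target frame at a random diffusion time t in [0, 1].
    t = torch.rand(target.shape[0], 1, 1, 1, device=target.device)
    alpha, sigma = torch.cos(t * math.pi / 2), torch.sin(t * math.pi / 2)
    noise = torch.randn_like(target)
    noisy = alpha * target + sigma * noise
    v_target = alpha * noise - sigma * target  # velocity parameterization

    loss = F.mse_loss(model(noisy, context, actions), v_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

At inference time the same conditioning path is used autoregressively: each newly generated frame is appended to the context window for the next prediction, which is exactly where drift would accumulate without the noise augmentation above.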

GameNGen demonstrates impressive simulation quality, producing visuals nearly indistinguishable from the original DOOM game, even over extended sequences. The model achieves a Peak Signal-to-Noise Ratio (PSNR) of 29.43, on par with lossy JPEG compression, and a low Learned Perceptual Image Patch Similarity (LPIPS) score of 0.249, indicating strong visual fidelity. The model maintains high-quality output across multiple frames, even when simulating long trajectories, with only minimal degradation over time. Moreover, the approach shows robustness in maintaining game logic and visual consistency, effectively simulating complex game scenarios in real-time at 20 frames per second. These results underline the model’s ability to deliver high-quality, stable performance in real-time game simulations, offering a significant step forward in the use of AI for interactive environments.
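For reference, the two reported metrics can be computed per frame roughly as follows. This is a sketch assuming frames stored as uint8 RGB NumPy arrays; it relies on the third-party `lpips` package for the perceptual score, which is not something the paper itself provides.

```python
import numpy as np
import torch
import lpips  # third-party package: pip install lpips

# The LPIPS network (AlexNet backbone) downloads pretrained weights on first use.
_lpips_model = lpips.LPIPS(net="alex")


def psnr(reference: np.ndarray, generated: np.ndarray, max_val: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio between two same-shaped uint8 RGB frames (higher is better)."""
    mse = np.mean((reference.astype(np.float64) - generated.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)


def lpips_distance(reference: np.ndarray, generated: np.ndarray) -> float:
    """LPIPS perceptual distance between two uint8 RGB frames (lower is better)."""
    def to_tensor(img: np.ndarray) -> torch.Tensor:
        # HWC uint8 in [0, 255] -> NCHW float in [-1, 1], the range LPIPS expects.
        return (torch.from_numpy(img).permute(2, 0, 1).float() / 127.5 - 1.0).unsqueeze(0)

    with torch.no_grad():
        return _lpips_model(to_tensor(reference), to_tensor(generated)).item()
```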

GameNGen presents a breakthrough in AI-driven game simulation by demonstrating that complex interactive environments like DOOM can be effectively simulated using a neural model in real-time while maintaining high visual quality. This proposed method addresses critical challenges in the field by combining RL and diffusion models to overcome the limitations of previous approaches. With its ability to run at 20 frames per second on a single TPU while delivering visuals on par with the original game, GameNGen signifies a potential shift towards a new era in game development, where games are created and driven by neural models rather than traditional code-based engines. This innovation could revolutionize game development, making it more accessible and cost-effective.


Check out the Paper and Project. All credit for this research goes to the researchers of this project.

