AI News · 18:22, two days ago
Odyssey’s AI model transforms video into interactive worlds

London-based AI lab Odyssey has launched a research preview showcasing its model that turns video into interactive worlds. The model was originally built as a world model for film and game production, and in the process the team unexpectedly discovered an entirely new entertainment medium. The interactive video generated by Odyssey's AI model responds to user input in real time: users can interact with it via keyboard, phone, controller, or even voice commands. The technology can generate realistic video frames every 40 milliseconds, letting users influence the digital world almost instantly. While the experience is not yet polished, Odyssey sees it as an early version of the Holodeck, hinting at the enormous potential of interactive video in entertainment, education, advertising, and beyond.

💡Odyssey's AI model turns traditional video into an interactive experience by responding to user input in real time; users can interact with the video content through devices such as a keyboard or phone, creating an entirely new form of entertainment.

⚙️The core of the model is its "world model". Unlike traditional video models that generate an entire clip in one pass, it predicts subsequent frames one at a time and adjusts them according to user input, much like a large language model predicting the next word, except that it operates on high-resolution video frames.

🚧To address the stability problems of AI-generated interactive video, Odyssey uses a "narrow distribution model": the AI is first pre-trained on general video footage and then fine-tuned on a smaller set of environments, reducing error accumulation and keeping the video from degenerating into chaos.

💰The infrastructure behind the experience currently costs £0.80-£1.60 ($1-$2) per user-hour, relying on H100 GPU clusters distributed across the US and EU. While not cheap, this still represents a significant cost advantage over producing traditional game or film content, and Odyssey expects costs to fall further as the model becomes more efficient.

London-based AI lab Odyssey has launched a research preview of a model transforming video into interactive worlds. Initially focusing on world models for film and game production, the Odyssey team has stumbled onto potentially a completely new entertainment medium.

The interactive video generated by Odyssey’s AI model responds to inputs in real-time. You can interact with it using your keyboard, phone, controller, or eventually even voice commands. The folks at Odyssey are billing it as an “early version of the Holodeck.”

The underlying AI can generate realistic-looking video frames every 40 milliseconds. That means when you press a button or make a gesture, the video responds almost instantly—creating the illusion that you’re actually influencing this digital world.

“The experience today feels like exploring a glitchy dream—raw, unstable, but undeniably new,” according to Odyssey. We’re not talking about polished, AAA-game quality visuals here, at least not yet.

Not your standard video tech

Let’s get a bit technical for a moment. What makes this AI-generated interactive video tech different from, say, a standard video game or CGI? It all comes down to something Odyssey calls a “world model.”

Unlike traditional video models that generate entire clips in one go, world models work frame-by-frame to predict what should come next based on the current state and any user inputs. It’s similar to how large language models predict the next word in a sequence, but far more complex, because we’re talking about high-resolution video frames rather than words.

“A world model is, at its core, an action-conditioned dynamics model,” as Odyssey puts it. Each time you interact, the model takes the current state, your action, and the history of what’s happened, then generates the next video frame accordingly.

The result is something that feels more organic and unpredictable than a traditional game. There’s no pre-programmed logic saying “if a player does X, then Y happens”—instead, the AI is making its best guess at what should happen next based on what it’s learned from watching countless videos.
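Odyssey hasn't published its architecture, but the action-conditioned loop described above can be sketched in a few lines. Everything here is hypothetical: the `predict_next_frame` function stands in for the learned dynamics model, and the returned dict stands in for a rendered frame.

```python
# Hypothetical sketch of an action-conditioned world model loop.
# Odyssey has not released code, so predict_next_frame is a stand-in
# for the real learned model p(next_frame | history, action).
FRAME_INTERVAL_MS = 40  # the article's stated budget: one frame every 40 ms

def predict_next_frame(history, action):
    """A real model would render pixels; this placeholder just returns
    a tiny dict describing what the frame was conditioned on."""
    return {"t": len(history), "conditioned_on": action}

def run_interactive_video(get_user_action, num_frames=3):
    history = []                            # everything generated so far
    for _ in range(num_frames):
        action = get_user_action()          # keyboard / controller / voice
        frame = predict_next_frame(history, action)
        history.append((frame, action))     # model conditions on both next step
        yield frame

frames = list(run_interactive_video(lambda: "move_forward"))
print(frames[0])  # {'t': 0, 'conditioned_on': 'move_forward'}
```

The key structural point, mirroring the quote above, is that each frame depends on the current state, the user's action, and the accumulated history, with no hand-written "if X then Y" game logic anywhere in the loop.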

Odyssey tackles historic challenges with AI-generated video

Building something like this isn’t exactly a walk in the park. One of the biggest hurdles with AI-generated interactive video is keeping it stable over time. When you’re generating each frame based on previous ones, small errors can compound quickly (a phenomenon AI researchers call “drift”).
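A toy calculation shows why drift is so punishing at this frame rate. The error figure below is invented purely for illustration; the point is that even a tiny per-frame deviation compounds into nonsense within a minute.

```python
# Toy illustration of drift (the numbers are assumptions, not Odyssey's):
# if each generated frame adds a small relative error, the deviation
# compounds multiplicatively with every step.
per_frame_error = 0.002    # hypothetical 0.2% deviation per frame
frames_per_second = 25     # one frame every 40 ms, per the article
seconds = 60

frames = frames_per_second * seconds   # 1500 frames in one minute
accumulated = (1 + per_frame_error) ** frames - 1
print(f"{frames} frames -> ~{accumulated:.0%} accumulated deviation")
```

At 1,500 frames per minute, a deviation most metrics would round to zero balloons past 1,000%, which is why reducing per-frame error via fine-tuning matters so much.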

To tackle this, Odyssey has used what they term a “narrow distribution model”—essentially pre-training their AI on general video footage, then fine-tuning it on a smaller set of environments. This trade-off means less variety but better stability so everything doesn’t become a bizarre mess.

The company says they’re already making “fast progress” on their next-gen model, which apparently shows “a richer range of pixels, dynamics, and actions.”

Running all this fancy AI tech in real-time isn’t cheap. Currently, the infrastructure powering this experience costs between £0.80-£1.60 ($1-$2) per user-hour, relying on clusters of H100 GPUs scattered across the US and EU.
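A back-of-the-envelope check makes that figure plausible. The rental price and concurrency below are assumptions for illustration, not numbers Odyssey has disclosed.

```python
# Illustrative cost arithmetic (assumed figures, not Odyssey's own):
# if an H100 rents for roughly $3/hour and can sustain a couple of
# concurrent streams, per-user-hour cost lands in the quoted $1-$2 band.
h100_hourly_rate = 3.00        # assumed cloud rental price, USD
concurrent_users_per_gpu = 2   # assumed streams one GPU can serve

cost_per_user_hour = h100_hourly_rate / concurrent_users_per_gpu
print(f"${cost_per_user_hour:.2f} per user-hour")  # $1.50
```

The same arithmetic also shows where the expected savings come from: doubling the streams a single GPU can serve halves the per-user cost.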

That might sound expensive for streaming video, but it’s remarkably cheap compared to producing traditional game or film content. And Odyssey expects these costs to tumble further as models become more efficient.

Interactive video: The next storytelling medium?

Throughout history, new technologies have given birth to new forms of storytelling—from cave paintings to books, photography, radio, film, and video games. Odyssey believes AI-generated interactive video is the next step in this evolution.

If they’re right, we might be looking at the prototype of something that will transform entertainment, education, advertising, and more. Imagine training videos where you can practice the skills being taught, or travel experiences where you can explore destinations from your sofa.

The research preview available now is obviously just a small step towards this vision and more of a proof of concept than a finished product. However, it’s an intriguing glimpse at what might be possible when AI-generated worlds become interactive playgrounds rather than just passive experiences.

You can give the research preview a try here.



