The Verge - Artificial Intelligences, April 10, 01:26
You can now give Google’s AI video model camera directions

 

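Google has rolled out a series of AI tool updates aimed at improving video and image editing and enhancing audio processing. Veo 2 gains new inpainting and outpainting features that make video editing easier, similar to Adobe’s Generative Expand feature. Imagen 3 improves object removal, the Lyria model enters private preview, and Chirp 3 adds an “Instant Custom Voice” feature. Google has also updated its Gemini 2.5 Flash model and Agentic AI tools to improve efficiency and cross-platform task handling. The updates are intended to help businesses create content and complete tasks more efficiently; for example, Kraft Heinz’s digital experience leader said that a task that once took eight weeks now takes only eight hours.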

🎬 Veo 2 adds inpainting and outpainting features that let users remove unwanted elements from videos and extend the video frame, similar to Adobe’s Generative Expand.

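🎥 Veo 2 also introduces cinematic technique presets that users can select when generating video to guide shot composition, camera angles, and pacing, such as timelapse effects and simulated camera panning.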

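🖼️ Imagen 3’s editing features have been updated to significantly improve automatic object removal, with the aim of producing more natural results.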

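🎵 Google has launched a private preview of its text-to-music model, Lyria, and Chirp 3 adds an “Instant Custom Voice” feature that can generate realistic custom voices from just 10 seconds of audio input.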

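🚀 Google has also updated its Gemini 2.5 Flash model and its enterprise-focused Agentic AI tools; the latter allow AI agents to communicate and perform tasks across platforms such as PayPal and Salesforce.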

Google is trying to make it easier for users of its video AI model Veo 2 to make cinematic-looking generations and edit real footage. The new Veo 2 capabilities are available to preview via Google Cloud’s Vertex AI platform, alongside other updates to improve Google’s text-to-image generator, Imagen 3, and audio-related AI models.

New Veo 2 features include inpainting, which can automatically remove “unwanted background images, logos, or distractions from your videos,” according to Google, and outpainting, which extends the frame of the original video into a different format. The latter tool will fill the new space with AI-generated video footage that blends into the original clip, similar to Adobe’s Generative Expand feature for images.

The update also lets Veo 2 users select cinematic technique presets to include alongside their text descriptions when generating footage, which can be used to help guide shot composition, camera angles, and pacing in the final results. Example presets include timelapse effects, drone-style POV, and simulated camera panning in different directions.

A new interpolation feature has also been added that can create a video transition between two still images, filling in the beginning and end sequences with new frames.

Adobe’s competing Firefly video model has some similar capabilities, with a generative AI video extending feature launching in Premiere Pro last week. Google also adds SynthID digital attribution watermarks into its AI-generated outputs, much like Adobe’s Content Credentials system, but Adobe goes a step further by pledging that its tools are fully commercially safe because they’re trained on licensed and public domain content — something Google can’t match after inhaling the web to train its AI models.

Editing capabilities in Google’s text-to-image model Imagen 3 have also been updated to “significantly” improve automatic object removal, according to Google, providing what are supposed to be more natural results when removing distractions. Both Veo 2 and Imagen 3 are already being used by companies like L’Oreal and Kraft Heinz for marketing content production, with Kraft Heinz’s digital experience leader Justin Thomas saying the type of task that “once took us eight weeks is now only taking eight hours.”

On the audio side, Google has released its text-to-music model, Lyria, in a private preview and rolled out an “Instant Custom Voice” feature for its synthetic speech model, Chirp 3. Google says that Chirp 3 can now generate “realistic custom voices from 10 seconds of audio input,” and that a new transcription feature is launching in preview that can identify and separate individual speakers to provide clearer transcriptions for calls where multiple people are talking.

These updates are just a handful of AI-related announcements that Google made today. Gemini 2.5 Flash, the latest version of the company’s efficiency-optimized Flash model, will soon be available on Vertex AI. Google says that Gemini 2.5 Flash “automatically adjusts processing time” based on the complexity of the task to provide faster results for simple requests.

Google is also updating its enterprise-focused Agentic AI tools this week to allow AI agents to communicate with each other and perform tasks across platforms like PayPal and Salesforce. Meanwhile, a new section is being launched on Google’s Cloud Marketplace for companies to browse and purchase AI agents built by third-party Google partners.
