Last Week in AI, April 9
Last Week in AI #306: Astrocade, Llama 4, Nova Act

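This article reviews recent major developments in AI. Meta released Llama 4, its new generation of AI models, which has drawn considerable attention. Amazon launched Nova Act, an AI agent that controls a web browser. Adobe updated Premiere Pro with AI-powered video editing features. OpenAI's paid ChatGPT user base surged, and revenue rose with it. The article also covers Runway's Gen-4 video generation model, Midjourney's V7 image model, and new AI moves from Microsoft, Google, ByteDance, Alibaba, and others, spanning model releases, technical updates, market competition, and commercial applications.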

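🚀 **Meta releases Llama 4:** Meta has launched a new suite of AI models, Llama 4, comprising Llama 4 Scout, Llama 4 Maverick, and Llama 4 Behemoth. The models use a mixture of experts (MoE) architecture that improves computational efficiency, but their use is restricted in the EU because of regional AI and data privacy laws, and they have drawn criticism for performance that may fall short of expectations.

💻 **Amazon unveils Nova Act:** Amazon has released Nova Act, a general-purpose AI agent that can control a web browser and carry out simple tasks. Nova Act will be a key component of the upcoming Alexa+ upgrade, and an accompanying SDK lets developers build agent prototypes that navigate web pages, fill out forms, and schedule dates on a calendar.

🎬 **AI enhancements in Adobe Premiere Pro:** Adobe has added Generative Extend, a Firefly-powered tool, to Premiere Pro, letting users extend video clips and background audio. The update also includes an AI-powered search panel that recognizes clip content and automatic translation of video captions into multiple languages, along with improved speed and performance.

💰 **OpenAI revenue growth:** OpenAI's paid ChatGPT subscriber base grew 30%, and monthly revenue rose from $333 million to $415 million. The company projects total revenue of $12.7 billion in 2025 and $29.4 billion in 2026.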

Editorial note: my startup just launched a public demo of what we’ve been working on!

Would appreciate if you go check it out over at Astrocade.com :)

Make games with AI!

Please follow us on X here and consider joining our Discord here.

Top News

Meta releases Llama 4, a new crop of flagship AI models

Meta has launched a new suite of AI models, Llama 4, which includes Llama 4 Scout, Llama 4 Maverick, and Llama 4 Behemoth. These models were trained on large volumes of unlabeled text, image, and video data to enhance their visual understanding capabilities. They are the first Llama models to use a mixture of experts (MoE) architecture, which improves computational efficiency by routing each input to a few smaller, specialized "expert" sub-models rather than running the whole network. However, use of the models is restricted in the EU due to regional AI and data privacy laws, and companies with over 700 million monthly active users must obtain a special license from Meta.
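To make the MoE idea concrete, here is a minimal sketch of generic top-k expert routing in plain Python. It is not Meta's implementation: the source does not describe how Llama 4 routes tokens, and the function names, the toy "experts" (simple scaling functions), and the choice of `top_k=2` are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(scale):
    """Hypothetical toy expert: multiplies every input value by a constant."""
    return lambda x: [scale * v for v in x]

def moe_forward(x, router_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts and mix their outputs.

    router_weights holds one weight vector per expert; an expert's score is
    the dot product of its weight vector with x. Only the selected experts
    are evaluated, which is where the compute saving comes from.
    """
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    probs = softmax(scores)
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)  # renormalize over the selected experts
    out = [0.0] * len(x)
    for i in chosen:
        gate = probs[i] / norm
        out = [o + gate * y for o, y in zip(out, experts[i](x))]
    return out, chosen

# Three toy experts; with this input the router scores are [2, 1, -2],
# so experts 0 and 1 are selected and expert 2 is never evaluated.
experts = [make_expert(1.0), make_expert(10.0), make_expert(100.0)]
router_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
output, used = moe_forward([2.0, 1.0], router_weights, experts)
print("experts used:", used)
print("output:", output)
```

In a real model the experts are feed-forward networks and the router is learned during training, but the shape of the computation is the same: score, pick a few experts, and blend their outputs.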

Although Meta touted strong benchmark results, Llama 4 has been criticized within the AI community for seemingly underwhelming performance. There were even rumors of cheating, to the extent that a Meta exec had to explicitly deny that the company artificially boosted Llama 4’s benchmark scores.

Amazon unveils Nova Act, an AI agent that can control a web browser

Amazon has introduced Nova Act, a general-purpose AI agent capable of controlling a web browser and performing simple tasks independently. Developed by Amazon's AGI lab, Nova Act will be a key component of the upcoming Alexa+ upgrade, an AI-enhanced version of Amazon's voice assistant. The Nova Act SDK, also released by Amazon, allows developers to build agent prototypes, enabling AI agents to navigate web pages, fill out forms, and schedule dates on a calendar. Despite the crowded market, Amazon claims that Nova Act outperforms similar agents from OpenAI and Anthropic in several internal tests.

Adobe launches Premiere Pro’s generative AI video extender

Adobe has released version 25.2 of Premiere Pro, introducing AI-powered features designed to enhance video editing. The most significant addition is Generative Extend, a tool powered by Adobe's Firefly generative AI video model, which allows users to extend video clips by up to two seconds and ambient background audio by up to ten seconds. This feature is free for a limited time, after which users will need to spend Firefly generative credits. The update also includes an AI-powered Search panel that recognizes the content of clips, enabling users to search for footage using text descriptions. Additionally, Premiere Pro can now automatically translate video captions into 27 languages and offers improved speed and performance on both Apple silicon and Windows devices.

OpenAI's Revenue Set to Triple After Jump in Paid ChatGPT Users

OpenAI has seen a roughly 30% increase in its paid subscriber base for ChatGPT, rising from 15.5 million to over 20 million in the last quarter. Monthly revenue rose alongside it, from $333 million to $415 million; this was reported as a corresponding 30% increase, although the quoted figures work out to about 25%. The company, recently valued at $300 billion following a $40 billion funding round led by SoftBank and supported by Microsoft, also revealed that ChatGPT is used by over 500 million people weekly. OpenAI projects a significant revenue expansion, expecting total revenue to more than triple to $12.7 billion in 2025 from $3.7 billion in 2024, and anticipates generating $29.4 billion in 2026.
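The growth rates implied by the quoted figures can be recomputed in a few lines. This is only a sanity check on the numbers in the paragraph above; treating "over 20 million" as exactly 20 million is an assumption, and the helper name `pct_change` is ours.

```python
def pct_change(old, new):
    """Percentage change going from old to new."""
    return (new - old) / old * 100.0

subscribers = pct_change(15.5, 20.0)    # paid subscribers, in millions
monthly_revenue = pct_change(333, 415)  # monthly revenue, in USD millions
growth_2025 = 12.7 / 3.7                # projected 2025 revenue vs. 2024, USD billions
growth_2026 = 29.4 / 12.7               # projected 2026 revenue vs. 2025, USD billions

print(f"Paid subscribers: +{subscribers:.1f}%")        # +29.0%
print(f"Monthly revenue:  +{monthly_revenue:.1f}%")    # +24.6%
print(f"2025 vs. 2024:    {growth_2025:.2f}x")         # 3.43x
print(f"2026 vs. 2025:    {growth_2026:.2f}x")         # 2.31x
```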

Other News

Tools

Runway releases an impressive new video-generating AI model - Runway's new Gen-4 video-generating AI model offers high-fidelity video creation with consistent characters and environments, but faces legal challenges over its training data and potential industry disruption.

Runway Introduces Gen-4 Turbo Video AI Model With Faster Generation Speeds - Runway's Gen-4 Turbo AI model significantly enhances video generation speed and efficiency, offering improved consistency and realism in video scenes while being more credit-efficient than its predecessor.

OpenAI prepares reasoning slider and memory update for ChatGPT users - OpenAI is enhancing ChatGPT with features like improved memory for context-aware interactions, a reasoning slider for task complexity, and a notification feed to keep users informed about updates.

Google’s AI Mode can now see and search with images - Google's AI Mode now integrates Gemini AI and Lens technology to enhance image-based search capabilities, providing detailed responses and recommendations by analyzing the context and relationships within images.

ByteDance’s DreamActor-M1 Turns Images into Stunningly Real Human Videos - DreamActor-M1, a new framework by ByteDance, uses a Diffusion Transformer architecture to create realistic human animations from images, outperforming existing models while addressing ethical concerns and limitations like dynamic camera movements.

Microsoft updates Copilot with the greatest hits from other AIs - Microsoft's Copilot is being enhanced with features like memory, personalization, web actions, and podcast creation to better compete with AI alternatives such as ChatGPT and Claude.

Midjourney launches its new V7 AI image model that can process text prompts better - Midjourney's V7 AI image model introduces enhanced text prompt processing, improved image quality, and new features like Draft Mode for faster, cost-effective iterations, while personalization options allow users to tailor the AI to their visual preferences.

Microsoft has created an AI-generated version of Quake - Microsoft's Muse AI model is being showcased through an AI-generated Quake II tech demo, highlighting its potential to assist game developers in prototyping and preserving classic games for modern platforms.

Alibaba Preparing for Flagship AI Model Release as Soon as April - Alibaba Group Holding Ltd. is planning to release Qwen 3, an upgraded version of its flagship AI model, as soon as this month with competition from rivals including OpenAI and DeepSeek heating up.

China’s Zhipu Offers Free AI Agent in Riposte to DeepSeek, Manus - China’s Zhipu is making its new AI agent free to use as domestic competition to build emerging artificial intelligence technologies heats up. The Beijing-based startup on Monday unveiled AutoGLM, an artificial intelligence agent that can conduct deep research.

Business

Nvidia H20 Chips: $16 Billion Orders from ByteDance, Alibaba, and Tencent - Chinese tech giants ByteDance, Alibaba, and Tencent have placed substantial orders for Nvidia's H20 server chips, driven by China's rapidly expanding AI industry despite U.S. export restrictions.

Intel and TSMC are reportedly launching a joint chipmaking venture - Intel and TSMC are forming a joint venture to operate Intel's chipmaking facilities, with TSMC contributing expertise and training instead of capital, amid efforts to revitalize Intel under new CEO Lip-Bu Tan.

Google-backed Isomorphic Labs raises $600m to advance AI drug discovery - Isomorphic Labs has raised $600 million to accelerate the development of its AI drug design engine and advance its programs into clinical development, amidst a growing trend of AI integration in the pharmaceutical industry.

AI Video Startup Runway Valued at $3 Billion in Funding Round - Runway AI Inc. has raised $308 million in a new round of funding that more than doubles the company’s valuation — a sign of investor enthusiasm for startups building artificial intelligence software that can generate videos.

Spotify debuts Gen AI ads, programmatic ad buying - Spotify is enhancing its advertising business with Gen AI ads and the Spotify Ad Exchange, enabling real-time auctions and AI-generated scripts and voiceovers to better target its extensive Gen Z user base.

Google Gemini is shaking up its AI leadership ranks - Google is undergoing a leadership change in its AI division, with Josh Woodward taking over from Sissie Hsiao to focus on advancing the Gemini app as the AI race emphasizes product development alongside model innovation.

DeepMind is holding back release of AI research to give Google an edge - DeepMind has implemented stricter publication policies to maintain a competitive advantage for Google in the AI industry, delaying the release of strategic research papers.

Is the CEO of the heavily funded humanoid robot startup Figure AI exaggerating his startup’s work with BMW? - Questions arise about the accuracy of the CEO's claims regarding Figure AI's collaboration with BMW.

Research

This A.I. Forecast Predicts Storms Ahead - The A.I. Futures Project, led by former OpenAI researcher Daniel Kokotajlo, is forecasting potential global disruptions caused by increasingly powerful artificial intelligence systems by 2027.

An Approach to Technical AGI Safety and Security - Google DeepMind outlines a roadmap for mitigating severe risks from AGI, focusing on technical safety and security through strategies addressing misuse and misalignment, while emphasizing robust training, monitoring, and security measures.

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems - The survey explores the modular architecture of intelligent agents inspired by human brain functions, their self-enhancement and adaptive evolution, collaborative multi-agent systems, and the importance of building safe and secure AI systems.

Large Language Models Pass the Turing Test - Recent studies demonstrate that large language models like GPT-4.5 and LLaMa-3.1-405B can pass the Turing test when prompted to adopt a humanlike persona, suggesting their potential to convincingly imitate human conversation and raising implications for their use in social and economic contexts.

Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad - Current large language models struggle with rigorous mathematical reasoning and proof generation, achieving less than 5% accuracy on the 2025 USA Math Olympiad problems, indicating a need for significant advancements in these areas.

RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning - RoboVerse introduces a comprehensive framework with a simulation platform, synthetic dataset, and unified benchmarks to address the challenges of data scaling and evaluation in robot learning, enhancing performance and sim-to-real transfer.

PaperBench: Evaluating AI's Ability to Replicate AI Research - PaperBench is a benchmark designed to evaluate AI agents' ability to autonomously replicate state-of-the-art machine learning research papers, featuring a comprehensive grading system and an auxiliary evaluation method using LLM-based judges.

Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead - Inference-time scaling enhances model performance on complex tasks by allocating more computational resources, but its effectiveness varies across domains and tasks, with diminishing returns as complexity increases, highlighting the need for more efficient and purposeful scaling approaches.

AI masters Minecraft: DeepMind program finds diamonds without being taught - DeepMind's Dreamer AI successfully learned to collect diamonds in Minecraft using reinforcement learning and a world model to generalize knowledge and predict future scenarios without prior instruction.

MoCha: Towards Movie-Grade Talking Character Synthesis - MoCha is a novel diffusion transformer model designed to generate movie-grade talking characters with synchronized speech, realistic emotions, and full-body actions, outperforming existing methods in lip-sync quality, naturalness, and visual coherence.

Policy

AI’s $4.8 trillion future: UN warns of widening digital divide without urgent action - The UNCTAD report highlights the urgent need for international cooperation and investment in infrastructure, data, and skills to address the growing digital divide and ensure AI benefits are equitably distributed worldwide.

Major publishers call on the US government to ‘Stop AI Theft’ - Major publishers are urging the US government to implement regulations that require Big Tech companies to compensate creators for using their content in AI training, as part of a campaign called Support Responsible AI.

Judge doesn’t buy OpenAI argument NYT’s own reporting weakens copyright suit - A US district judge denied OpenAI's motion to dismiss The New York Times' copyright claims, ruling that OpenAI failed to prove the newspaper knew about potential copyright violations by ChatGPT before its release.

