Artificial Ignorance, December 28, 2024
10 AI stories that shaped 2024

 

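AI innovation continued in 2024. It drew less attention than in 2023, but AI has worked its way into daily life. The landscape changed sharply, from GPT-4's lone dominance to the arrival of Claude 3.5, Llama 3, and Gemini 1.5. Google pushed back against claims that it had fallen behind, and OpenAI turned ChatGPT into a multimodal platform. The rise of xAI, progress in world models, the warning from deepfakes, the growth of data deals, the fight for talent, the arrival of AI agents, new AI regulation, the release of Llama 3, the spread of multimodal AI, and breakthroughs in reasoning models together shaped a new chapter for AI in 2024, signaling that AI is moving from novelty to necessity.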

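🚀 xAI's sudden rise: Backed by deep funding and its Colossus supercomputer cluster, xAI rose quickly to become a new star in AI, showing the enormous potential of AI infrastructure build-out.

🌍 World models emerge: Models such as Sora and Genie 2 show an understanding of how the world works. They can generate physically plausible video and interactive 3D environments, marking major progress in AI's ability to simulate the real world.

🎭 Deepfakes raise alarms: The Taylor Swift deepfake incident exposed the risk of AI misuse, pushed regulators and companies to strengthen oversight and technical defenses, and prompted serious debate about content moderation and legislation.

🤝 Data deals reshape the internet: Publishers signed data licensing agreements with AI companies, prompting debate about data privacy, fair compensation, and data ownership, and reshaping the social contract of the internet.

🧠 Reasoning models break through: OpenAI's o3 model made breakthrough progress on complex programming, high-level mathematics, and the ARC-AGI benchmark, showing that AI reasoning has improved markedly and no longer relies solely on training with massive amounts of data.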

It's been the second year in a row of seemingly non-stop AI news, though perhaps fewer mainstream headlines than in 2023. The innovations have been more subtle but no less significant - AI has become more integrated into our daily lives, moving from novelty to necessity.

A year ago, GPT-4 stood alone at the top of the benchmarks (Claude 3.5, Llama 3, and Gemini 1.5 didn't exist). (Almost) nobody had heard of o1, or NotebookLM, or Sora. And most people still thought about LLMs as chatbots - not as agents or “AI employees.”


But the landscape has shifted dramatically. Google countered the narrative that they were falling behind, and OpenAI transformed ChatGPT into a multimodal platform. There were also plenty of controversies along the way, from copyright infringement to deepfakes to for-profit statuses.

So as we head into the final stretch of 2024, let's look at some of the year's biggest stories.


Honorable mention: xAI

Despite the constant stream of AI headlines, xAI burst onto the scene with a bang. The company raised $6 billion in funding from heavyweight investors like BlackRock and Sequoia Capital, doubling its valuation to $40 billion in just six months.

At the heart of xAI's ambitions is Colossus, a supercomputer cluster in Memphis that would make even the most hardened tech enthusiast's jaw drop. Built in partnership with NVIDIA in just 122 days, Colossus houses 100,000 of NVIDIA's most powerful chips. And they're not stopping there - the plan is to expand to a whopping one million GPUs.

10. World models

When it was first previewed, Sora left everyone speechless with the quality of its video generation. What's interesting about Sora, though, isn't that it can make slick videos - it seems to have a genuine understanding of how objects move and interact. This is known as a "world model": i.e. a model capable of "understanding" how the world works, physics and all.

But Sora isn't alone in pushing the boundaries of world modeling. Google's Genie 2 represents a parallel breakthrough, creating interactive 3D environments that you can explore. What's remarkable about Genie 2 is its object permanence - if you hide an object behind another one, the model remembers it's there. Walk around to the other side, and that object will remain exactly where you left it. This might seem obvious to us humans, but it's a huge leap forward for AI.

9. Deepfakes

The Taylor Swift deepfake incident in January was a wake-up call for millions not yet acquainted with deepfakes. When sexually explicit AI-generated images of the pop star went viral on X (formerly Twitter), reaching tens of millions of views, the platform's content moderation system completely broke down.

The White House got involved, actors' unions sounded the alarm, and tech companies scrambled to patch their systems. Microsoft had to admit that people were exploiting their AI image tools and rushed to add new safeguards. And even though Congress has yet to pass meaningful deepfake legislation, California passed a suite of new laws targeting deepfakes, from election misinformation to sexually explicit content.

8. Data deals

The numbers tell the story: $250 million to News Corp, $60 million to Reddit, and dozens more deals in between. From Le Monde to The Atlantic to Vox Media, publishers rushed to sign licensing deals with OpenAI and other AI startups.

There was also a push from UGC (user-generated content) platforms - Reddit's deal to let AI companies train on user content, for example, sparked immediate controversy, with the FTC getting involved. Other platforms followed suit - Stack Overflow partnered with Google, and Automattic (parent to Tumblr and WordPress) started working with AI companies.

This gold rush is transforming the social contract of the internet. Publishers are setting boundaries about how their content can be used, and some, like HarperCollins, are sharing the wealth with authors. But as more platforms rush to monetize their users' content, we're facing tough questions about privacy, fair compensation, and who truly owns the data that trains our AI models.

7. Talent "acquisitions"

In 2024, Big Tech found a creative new way to gobble up AI talent. Instead of buying entire companies, they struck unusual deals that looked more like talent raids. Google paid billions to effectively kneecap Character.AI, taking its founders and a chunk of its workforce. Microsoft and Amazon pulled similar moves with Inflection and Adept, snagging their best people while leaving the companies as hollow shells.

These weren't typical arrangements - in fact, they seemed carefully crafted to fly under the regulatory radar. The tech giants called them "licensing deals," but the FTC launched investigations into both Amazon's and Microsoft's deals. After all, if it walks like an acquisition and talks like an acquisition, maybe it is an acquisition.

This talent consolidation is happening on top of Big Tech's massive investments into AI startups. Amazon alone has put $6.25 billion into Anthropic this year - that's out of $8 billion total, and still less than the nearly $14 billion Microsoft has put into OpenAI. Judging by this trend, the future of AI is increasingly in the hands of a very small group of very large companies.

6. The age of agents

AI agents have arrived, and they're not just fancy chatbots anymore. Companies are starting to use them for everything from customer service to data analysis, though everyone seems to have a different idea of what exactly an "agent" is. Of course, some of these are just regular automation tools with an "agent" label slapped on top - but others are genuinely pushing the boundaries of what autonomous AI can do.

Salesforce and other tech giants are building agent platforms, while a wave of YC companies are launching "AI employees" for various business functions. We're still in the early days, and the hype around agents is still ahead of reality. But the direction is clear: AI agents are graduating from science fiction into the mainstream business world.

5. AI Acts

The EU made history this year by passing the world's first comprehensive AI regulation. The AI Act takes a tiered approach, treating different AI systems differently based on their potential risks. If you're building AI systems that might touch EU users, you need to pay attention - even if your company isn't based in Europe. The rules have teeth, and the fines for breaking them can be massive.

Meanwhile, in California, Governor Gavin Newsom vetoed SB 1047, a bill that would have required strict oversight of large AI models. Newsom's argument was that the bill was too broad and might actually make us less safe by focusing on the wrong things.

But Newsom wasn't completely against regulation - far from it. The same month he vetoed SB 1047, he signed 17 other AI-related bills into law. These new rules tackle everything from AI watermarking to preventing AI-generated misinformation. It's a sign that even if we're still figuring out exactly how to regulate AI, governments aren't sitting on their hands anymore.

4. Llama 3

Llama 2 was one of the most impactful AI stories last year, and Llama 3 continues that trend. While proprietary models from OpenAI and Google often hog the spotlight, Meta continuously works towards state-of-the-art AI that isn't locked behind closed doors. The crown jewel was Llama 3.3 70B, released in December, which matched the performance of models nearly 6 times its size.

But the real value wasn't just about raw performance - it was about accessibility. These models can run on consumer hardware, meaning you can now run a GPT-4-class model on your laptop. When others added vision capabilities with Llama 3V, they showed that even advanced features like image understanding didn't need to be locked away in expensive API calls.

3. Multimodality

2024 was the year AI learned to see, hear, and speak. While early AI models were limited to text, the latest versions can handle pretty much anything you throw at them - images, audio, video, and more. And these capabilities are quickly becoming table stakes: it's hard to imagine a ChatGPT competitor that doesn't offer multimodal capabilities.

This shift to multimodal AI isn't just a cool tech demo - it's fundamentally changing how we interact with computers. For example, OpenAI's Advanced Voice Mode turned ChatGPT into something that could hold a conversation, complete with different personalities and voices. These capabilities are making things like AI companions or fully automated call centers a potential reality.

Likewise, Anthropic's Computer Use mode adapts Claude to use the same UIs humans do - by taking screenshots of your browser and sending back keyboard and mouse events. Slowly but surely, LLMs are growing from just a single sense to experiencing the (digital) world in many of the same ways we do.

2. Reasoning models

After months of rumors about a project called Q* (or "Strawberry"), OpenAI released o1 - a "reasoning" model designed to think through problems step-by-step. Based on chain-of-thought prompting, the process doesn't lock models into an answer upfront and instead allows them to explore different lines of thinking.

But o3, announced in December, marked a major breakthrough. It achieved unprecedented results across the board: solving complex programming tasks, competing at high-level mathematics, and even surpassing human performance on the ARC-AGI benchmark - a test that had seen minimal AI progress for years. This wasn't just an incremental improvement; it represented a fundamental shift in what AI could accomplish.

This shift toward "reasoning" models represents a different direction for AI development. Instead of focusing solely on larger training runs, OpenAI found that allowing models more computation time at inference could produce better results across technical domains. And the approach has improved performance on an incredibly compressed timeline - there were only 3 months between o1 and o3, versus 3 years between GPT-3 and GPT-4.
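
For readers who want a feel for why extra inference-time computation can help, here is a toy sketch. It is not how o1 or o3 work (OpenAI has not published those details); it swaps in a much simpler, well-known idea, sampling several independent answers and taking a majority vote. The `noisy_solver` function, the 60% per-sample accuracy, and the answer values are all made-up assumptions for illustration.

```python
import random
from collections import Counter

CORRECT = 42  # hypothetical ground-truth answer for the toy problem


def noisy_solver(rng, p_correct=0.6):
    """Hypothetical stand-in for one sampled reasoning attempt.

    Returns the right answer with probability p_correct,
    otherwise one of a few wrong answers chosen at random.
    """
    if rng.random() < p_correct:
        return CORRECT
    return rng.choice([41, 43, 44])


def majority_vote(rng, n_samples):
    """Spend more inference compute: sample n answers, keep the most common."""
    answers = [noisy_solver(rng) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]


def accuracy(n_samples, trials=2000, seed=0):
    """Fraction of trials in which the voted answer is correct."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    hits = sum(majority_vote(rng, n_samples) == CORRECT for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for n in (1, 5, 15):
        print(f"{n:>2} samples per question -> accuracy {accuracy(n):.2f}")
```

In this toy setup, the same underlying "model" gets more reliable simply by being allowed more attempts per question, with no additional training. Real reasoning models are far more sophisticated, but the trade of compute for accuracy at inference time is the same basic idea.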

1. AI infrastructure

Tech giants have evolved from spending billions on AI companies to spending hundreds of billions on AI infrastructure. They’re building massive data centers, buying up every GPU they can find, and preparing for an AI-powered future. Microsoft, Amazon, Meta and Google combined are on pace to pour over $100 billion into data center expansions and related infrastructure costs this year alone. Much of this spend is targeted for the US, but there are also multibillion-dollar data center investments happening in France, Malaysia, Finland, Singapore, and the UK.

We're seeing entirely new data center designs specifically for AI workloads, with some facilities dedicating most of their power to running NVIDIA chips. Meta and xAI are building GPU clusters that would make a supercomputer blush, with tens of thousands of chips working in parallel. And Microsoft and Amazon have even committed billions towards nuclear power projects to help meet energy demand.

This massive build-out shows how much faith (and money) the tech industry is putting into AI. It's an unfathomably large bet that AI will reshape not just the Internet but also the physical infrastructure that powers it.

Over to you

There was so, so much that happened this year, and I left a lot of different threads on the cutting room floor. What’s on your top ten list? What stories do you think went underreported this year? Let me know in the comments!
