MetaDevo AI Blog, November 26, 2024
AI: Get Rich or Die Tryin'

This article examines the technology and user experience behind the current wave of generative AI, particularly large language models (LLMs). The author argues that these models may create an illusion of "intelligence" through clever interface design and backend engineering, while actually relying on pattern matching and prediction over massive training data. The article analyzes OpenAI's o1 model and the Chain of Thought (CoT) technique, asks whether AI models genuinely reason, considers the possibility of AI becoming the dominant user interface of the future, and weighs the industry's risk of collapse, urging users and the industry alike to stay alert to AI's potential risks and limitations.

🤔 **OpenAI's o1 model introduces the Chain of Thought (CoT) technique and claims improved "reasoning," but it may really be guessing and making predictions via pattern matching.** The technique creates "reasoning" tokens at runtime and bills users for them, yet users cannot access the raw chains of thought.

💡 **Generative AI is marketed as having "artificial intelligence" or even "strong AI" capabilities, but the author finds this framing misleading.** Current AI models rest more on pattern recognition and data association than on genuine reasoning and understanding.

💻 **AI backends keep evolving while the user interface stays relatively stable, which lets developers add all kinds of features behind the scenes, including human assistance.** Such behind-the-curtain operations can mislead users into crediting the AI with stronger capabilities, such as "real-time reasoning."

🌐 **AI could become the dominant interaction interface of the future, displacing desktops, web apps, and mobile apps.** That would fundamentally change how users interact with technology and could widen the technology gap between different groups of users.

🥶 **The AI industry carries bubble risk and could repeat history by falling into another "AI winter."** Inflated expectations and unrealistic promises can drain funding and stall research, so the industry's latent risks deserve vigilance.

The Upside of Being Tricked?

By upsetting the apple cart, pranksters aspire to change the world—or at least to inspire visions of a better one.

Kembrew McLeod, Pranksters (2014)

Magic Tricks

Ed Zitron calls recent big AI events “A big, stupid magic trick,” implying that these tricks are indicators of the AI bubble popping soon.

These events include OpenAI launching o1 (aka “strawberry”) which Zitron described as not impressive despite Chain of Thought, a big new feature and selling point:

It has described the reinforcement training process as “thinking” and “reasoning,” when, in fact, it’s making guesses, and then guessing on the correctness of those guesses at each step, where the end destination is often something that can be known in advance.

It’s an insult to people…

Chain of Thought (CoT) expands the previous architecture. As Simon Willison explained in Notes on OpenAI’s new o1 chain-of-thought models:

One way to think about these new models is as a specialized extension of the chain of thought prompting pattern—the “think step by step” trick that we’ve been exploring as a community for a couple of years now, first introduced in the paper Large Language Models are Zero-Shot Reasoners in May 2022.

The OpenAI o1 model has CoT support baked in. This is complemented by a new post-training runtime component that creates invisible “reasoning” tokens, which of course the user is charged for even though they can’t access the raw chains of thought.
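To make the pattern concrete, here is a minimal sketch of the zero-shot chain-of-thought trick Willison describes, from the paper he cites. The `complete()` function is a hypothetical stand-in for whatever LLM completion call you use; it is not a real library API.

```python
# A minimal sketch of zero-shot chain-of-thought prompting, per
# "Large Language Models are Zero-Shot Reasoners" (Kojima et al., 2022).
# `complete()` is a hypothetical placeholder, not a real API call.

def complete(prompt: str) -> str:
    """Hypothetical LLM call; swap in a real API client here."""
    return f"<model output for: {prompt!r}>"

question = (
    "A juggler can juggle 16 balls. Half of the balls are golf balls, "
    "and half of the golf balls are blue. How many blue golf balls are there?"
)

# Direct prompting: the model is expected to answer immediately.
direct_answer = complete(f"Q: {question}\nA:")

# Zero-shot CoT: the appended phrase nudges the model to emit
# intermediate steps before the final answer, which tends to improve
# accuracy on multi-step problems.
cot_answer = complete(f"Q: {question}\nA: Let's think step by step.")
```

The difference is purely in the prompt text; o1's twist is moving that step-by-step generation into hidden, billed "reasoning" tokens rather than visible output.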

According to OpenAI’s o1 System Card document, o1 is better for safety when tested on existing benchmarks, but its “heightened intelligence” leads to new risks. Always fighting the previous war, as they say.

The OpenAI document doesn’t mention any impending financial doom had they not released it immediately. But it does conveniently say that they have to just let it into the wild right now so that we the people can fix the safety issues as unpaid workers:

iterative real-world deployment is the most effective way to bring everyone who is affected by this technology into the AI safety conversation

Nothing to worry about except this ominous tweet on Sept 17, 2024:

Did you know that the knowledge cutoff of o1 is October 2023? Exactly the time when all the rumors about Q* emerged, the fire letter from the OpenAI employees that a new, dangerous algorithm had been developed, and at the same time the time we jokingly refer to as the “what did Ilya see” time.

Generative AI is the bulk of this current era of popular AI, with LLMs typically doing text-only content and other models doing imagery. Ed Zitron thinks Generative AI is sold on lies—the lies being that it’s “going to get better” and that it is in fact Artificial Intelligence and/or going to become AI.

I disagree with that terminology—the lie is that modern AI is going to become AGI or Strong AI. But of course that terminology is probably lost or muddled for the mainstream and abused by business and marketing folks. So I see why Zitron uses the terms that way: the mainstream, along with a lot of tech bro cheerleaders, talks about “artificial intelligence” as if there’s only one kind, as if there always has been, and by golly we’ve got it now, or at least a junior version of it that will keep getting smarter.

If the Trick Works, Is It Still a Trick?

As the models have changed—and the overall architectures themselves, both at training time and during actual use—the interfaces have largely remained continuously online and without radical, abrupt changes. People may complain about the quality of the output, but it’s not like they suddenly have to record themselves doing interpretive dance as the prompt (although that might be interesting—free product idea, y’all).

This is good in some ways because it means they can keep adding all kinds of stuff behind the scenes. Sure, the nerds care, and the alignment people sort of care about the backends, but normal users don’t care; they’re just using it for the task at hand. The corporate buyers probably don’t care either—I don’t know what they care about, but I’m guessing it’s a matter of cost vs. potential savings, and also feeling modern.

Changing the backend without crippling the front end is a common enough goal, but it’s sort of new for AI aside from things like the old internet search engines. And it could enable current big AIs to go beyond their flat and narrow architectures. Expand, bring in some of the good ol’ fashioned AI ideas maybe. I could see pipelines to specialized delegations that are polymorphic, so other implementations of those delegations can be swapped in. Interfaces in the backend are just as important as user interfaces.
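As a concrete illustration of that delegation idea, here’s a minimal sketch under assumed names (`Delegation`, `Assistant`, and the rest are hypothetical, not any real framework): the user-facing interface stays fixed while backend implementations can be swapped freely.

```python
# A sketch of polymorphic backend delegations behind a stable front end.
# All class names here are hypothetical illustrations.
from typing import Protocol


class Delegation(Protocol):
    def handle(self, query: str) -> str: ...


class LLMDelegation:
    """Backend that would call a large language model."""
    def handle(self, query: str) -> str:
        return f"[LLM response to: {query}]"  # placeholder for a model call


class SymbolicMathDelegation:
    """Backend that would call a good ol' fashioned symbolic solver."""
    def handle(self, query: str) -> str:
        return f"[solver result for: {query}]"  # placeholder for a solver call


class Assistant:
    """The stable user-facing interface; routing can change behind it."""
    def __init__(self, delegations: dict[str, Delegation]):
        self.delegations = delegations

    def ask(self, query: str) -> str:
        # Crude routing for illustration; a real router would be smarter.
        kind = "math" if any(ch.isdigit() for ch in query) else "chat"
        return self.delegations[kind].handle(query)


assistant = Assistant({"chat": LLMDelegation(), "math": SymbolicMathDelegation()})
print(assistant.ask("What is 17 * 23?"))   # routed to the symbolic backend
print(assistant.ask("Tell me a story."))   # routed to the LLM backend
```

Swap in a human-powered `Delegation` and, as discussed next, the user would never know.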

The downside of that is they could be hiring networks of humans at lower cost than running GPUs to pretend to be your AI, Mechanical Turk style. This is already done for the training side of AI models. And you would never know unless the latency is suspiciously high. But even now when OpenAI’s o1 takes a long time to respond, people just believe that it’s the AI “reasoning.”

In fact, people believe a lot of things that I find really flimsy. For instance, Alberto Romero wrote in “OpenAI o1: A New Paradigm For AI”:

That’s what o1 models do. They have learned to reason with a reinforcement learning mechanism…can spend resources to provide slow, deliberate, rational (system 2 thinking, per Daniel Kahneman’s definition) answers to questions they consider require such an approach. That’s how humans do it—we’re quick for simple problems and slow for harder ones.

it’s not far-fetched to say these models—what differentiates them from the previous generation—is that they can reason in real-time like a human would.

It’s an interesting comparison. But the realities of computation, and the possibility that this is a million-levels-dull-crayon abstraction of human thought, have me skeptical. And most people are interfacing with this outside of the box. The other thing is that they are definitely not “reasoning in real-time”—it’s all slow time or even slower time. Embody and embed these in unplanned real-world environments and then we’ll see.

Over a year ago, regarding the state of things with CoT, Melanie Mitchell wrote in “Can Large Language Models Reason?”:

If it turns out that LLMs are not actually reasoning in order to solve the problems we give them, how else could they be solving them? Several researchers have shown that LLMs are substantially better at solving problems that involve terms or concepts that appear more frequently in their training data, leading to the hypothesis that LLMs do not perform robust abstract reasoning to solve problems, but instead solve problems (at least in part) by identifying patterns in their training data that match, or are similar to, or are otherwise related to the text of the prompts they are given.
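That hypothesis is easy to caricature in code. Here is a deliberately crude sketch, purely illustrative, of “solving” a problem by retrieving the most textually similar training example instead of reasoning:

```python
# A caricature of the pattern-matching hypothesis: answer by echoing
# the most similar item "seen in training." Purely illustrative.
from difflib import SequenceMatcher

training_data = {
    "What is the capital of France?": "Paris",
    "If I have 3 apples and eat 1, how many are left?": "2",
}

def pattern_match_solve(prompt: str) -> str:
    # Find the stored example whose text most resembles the prompt...
    best = max(
        training_data,
        key=lambda seen: SequenceMatcher(None, seen, prompt).ratio(),
    )
    # ...and return its answer. No abstraction, no reasoning.
    return training_data[best]

# Looks like "reasoning" about oranges, but it's surface similarity:
print(pattern_match_solve("If I have 3 oranges and eat 1, how many are left?"))  # "2"
```

Real LLMs are vastly more sophisticated than a string matcher, but the question Mitchell raises is whether the underlying mechanism differs in kind or only in degree.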

Are we being tricked that there’s any reasoning? Is part of it that we have user interfaces and examples that are by design trying to make the AI seem more human?

There have always been tricks in user interfaces. For instance, there’s no such thing as files or folders. There’s not really even such a thing as a file system—that’s an abstraction too, even though technically there are data structures and code that programmers use to create the illusion. Meanwhile, apps are written by developers who often operate at only a few top layers of a huge stack of layers, yet the deployed programs necessarily cause chains of reactions going all the way down to the silicon (aka the microchips) and often throughout the seven layers of the internet.
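To make the file-system point concrete, here is a toy sketch (nothing here corresponds to a real file-system implementation): the “folders” users see are just a naming convention layered over flat storage.

```python
# A toy "file system": the folder hierarchy users see is an illusion
# maintained by data structures and code. Purely illustrative.

class ToyFS:
    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}  # flat storage; no folders anywhere

    def write(self, path: str, data: bytes) -> None:
        self.blobs[path] = data  # a "path" is just a dictionary key

    def read(self, path: str) -> bytes:
        return self.blobs[path]

    def listdir(self, folder: str) -> list[str]:
        # "Listing a folder" is just filtering keys by prefix.
        prefix = folder.rstrip("/") + "/"
        return [p for p in self.blobs if p.startswith(prefix)]


fs = ToyFS()
fs.write("/home/user/notes.txt", b"hello")
print(fs.listdir("/home/user"))  # ['/home/user/notes.txt'], yet no folder exists
```

Real file systems do the same kind of bookkeeping, just with inodes and disk blocks instead of a dictionary.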

Skeuomorphic design gave us tricks and lies for many decades—but they’ve mostly been useful lies and tricks. Icons, desktops, trash bins, early smart phone apps imitating real world objects. Even a text terminal is not real—it’s an interface made out of pixels and code. Around 2012, popular design moved a bit away from skeuomorphism towards flat and minimal designs.

But regardless of design trends, it’s all tricks. There’s no such thing as the icon you click on. Or the button you press on a screen. When you talk to a device either by voice or texting, it’s just more tricks. But who cares as long as you get the results you want?

It’s tricks all the way down.

AI as the Everything Interface?

If AI continues on this current path, will it become the dominant user interface? Or perhaps a UX template of sorts that tech companies shift to more and more across the board?

In other words, we still have the desktops, the web apps, the iPhones / Androids, the concept of Search, and the concept of the older voice activated agents like Siri and Google Assistant. Chat bots have been around forever—but now with these big LLM backends, as well as improved speech-to-text for when you want to talk instead of type, the AI interfaces we have now could turn into the new Search and even the new Everything for UX.

At least for casual or normal people. There have always been Power Users, or whatever you want to call them. And coders, nerds, professionals who know what they want to do and don’t need the hassle of a certain level of interface. And there is some adaptation to any new interface no matter how easy it is, especially if you’re an engineer or some other non-normal person for whom it might actually be less intuitive.

All depends on the user and the context and so forth. So we’ll have a huge gap in the future, potentially, in how people use tech. Arguably we’ve always had that gap though, and I’m not sure if this really increases the gap or if it’s just a little shift.

Of course this may not come to fruition if the industry collapses…

Road to Perdition

Previously, I have warned about another AI Winter. We’ve had AI crashes in history before (AI goes back all the way to the 1950s) causing funding dips—the “winters.”

In The AI Winter Shit-Winds Are Coming: Where Were You When the Modern AI Hype Train Derailed? (2022), I suggested the industry invest in broader and more robust approaches to AI to avoid a new winter.

In 2023 I posed the question, How Long Will Hot AI Summer Last?:

Tech and finance seem to hunger for this…they need something and this is it as things like crypto and web3 float away in the midst of the past couple years of economic problems.

In Potemkin Villages and AI History (2024) I said “The townsfolk are slowly grabbing the pitchforks and torches again…”

If AI crashes now, it would be the biggest AI crash ever. And that’s aside from also affecting a huge amount of the general tech industry and the stock market.

Here’s Zitron again, from his article “The Subprime AI Crisis”:

I am deeply concerned that this entire industry is built on sand. Large Language Models at the scale of ChatGPT, Claude, Gemini and Llama are unsustainable, and do not appear to have a path to profitability due to the compute-intensive nature of generative AI. Training them necessitates spending hundreds of millions — if not billions — of dollars, and requires such a large amount of training data that these companies have effectively stolen from millions of artists and writers and hoped they’d get away with it.

Last month Gary Marcus claimed in “Why the collapse of the Generative AI bubble may be imminent”:

I just wrote a hard-hitting essay for WIRED predicting that the AI bubble will collapse in 2025 — and now I wish I hadn’t.

Clearly, I got the year wrong. It’s going to be days or weeks from now, not months.

He was kind of right about the timeline if you apply it to Nvidia (quoting Investopedia):

Nvidia’s stock has been plagued by turbulence for months as Wall Street’s once boundless optimism about artificial intelligence has moderated. Its shares, pressured throughout July and early August by a shift in interest rate expectations, rebounded throughout August before tumbling again at the end of the month after the company’s second-quarter earnings report disappointed investors despite exceeding expectations on paper.

Amid that slump, the stock recorded the largest ever single-day loss of market value in history earlier this month when its shares tumbled nearly 10% amid a semiconductor sell-off.

In general, yikes!

I don’t know if the Odd Couple juxtaposition of real AI people and financial people will ever be normal. David Cahn wrote in “AI’s $600B Question”:

Those who remain level-headed through this moment have the chance to build extremely important companies. But we need to make sure not to believe in the delusion that has now spread from Silicon Valley to the rest of the country, and indeed the world. That delusion says that we’re all going to get rich quick, because AGI is coming tomorrow, and we all need to stockpile the only valuable resource, which is GPUs. 

We may not be able to stop AI from coming into our lives. I predict this popular form of AI will soon be stacked onto a shelf of types of user interfaces. But as the meme goes, the real engineers don’t trust any new tech and keep a loaded pistol to shoot the printer in case it makes a weird noise.

And I don’t predict, but I hope, that serious investigations can get funded—as a side effect—into other AI algorithms and the core problems of AI, and into Strong AI and AGI, which in my opinion the current AI is not achieving.
