Ghiblification is good, actually


Published on April 2, 2025 10:48 AM GMT

Epistemic status: confident in the fundamentals; the informal tone is a deliberate choice to publish this take faster. The title is a bit of clickbait, but in good faith. I don't think much context is needed here: ghiblification and native image generation were (and still are) an all-encompassing phenomenon.

No, not in the way you might have thought. Of course it is terrible for artists; it's a spit in the face of Miyazaki, who publicly disavowed image generation way back when.
A lot of people hate it, and image generation in general.

It is good, however, for AGI timelines.

It is good because of this:
 

https://x.com/sama/status/1905296867145154688

 

And this:
 

https://x.com/sama/status/1906771292390666325

 

And this:

https://x.com/sama/status/1907098207467032632

 

Why, you might ask? Isn't all this making people and investors "feel the AGI" and pour more money into it? Doesn't it make more money for OpenAI and let them buy more GPUs?

Well, yeah. But...


Resource constraints


Resources are very tight right now.
That might change, of course, but OpenAI only has so much compute. Where do you think they suddenly found the capacity to serve all these generations for millions of users?
Mind you, I work in the field: image generation is much more computationally expensive than text generation.
A single image can take anywhere from 10 to 120 seconds to generate, depending on the model and the quality you expect.
That means that to sustain 1 RPS (request per second) with a model that generates an image in 10 seconds, you need 10 GPUs. You only have 9? Too bad: the wait time for each new request will grow, and will keep growing unless people stop asking for generations.
If you have far fewer GPUs than your RPS demands, your image-generation queue will spike from tens of seconds to hours and days quite fast. Millions of daily users and SOTA quality... Well. Take a guess where they got the GPUs to support all that.
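The arithmetic above can be sketched in a few lines. This is a back-of-the-envelope model, not anything OpenAI has published: I assume a fixed service time per image and a steady arrival rate, and ignore batching and other serving tricks.

```python
# Toy capacity model: each image occupies one GPU for a fixed number of
# seconds, and requests arrive at a steady rate (RPS).

def gpus_needed(rps: float, seconds_per_image: float) -> float:
    """Minimum GPU count to keep up with demand (utilization = 100%)."""
    return rps * seconds_per_image

def backlog_after(seconds: float, rps: float,
                  seconds_per_image: float, gpus: int) -> float:
    """Requests stuck in the queue after `seconds` of sustained load."""
    arrived = rps * seconds
    served = (gpus / seconds_per_image) * seconds  # throughput * time
    return max(0.0, arrived - served)

print(gpus_needed(1, 10))             # 1 RPS at 10 s/image -> 10 GPUs
print(backlog_after(3600, 1, 10, 9))  # one GPU short: 360 requests behind after an hour
print(backlog_after(3600, 1, 10, 10)) # exactly enough: queue stays at 0
```

The point of the toy model: the moment capacity dips below the arrival rate, the backlog grows linearly and without bound, which is why serving millions of users at SOTA quality demands a large, dedicated GPU pool.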

My guess: from the pretraining cluster.

(Potential) priority shift


Image generation requires different pretraining, different post-training, different analytics, different people, and so on. And the same GPUs.

That's a major priority shift for a company (and maybe for the field overall), while the compute they have stays pretty much the same and won't change anytime soon.

I mentioned I work in the field, and there's something of an open secret: text generation barely earns money. Most of the revenue (to my knowledge) comes from private deployments for organizations that need to host models within their own infrastructure, rather than from cloud API requests.
Image generation? There are very, very lucrative use cases in B2B, and B2C, as you can see, draws a lot of interest too.

Seeing how Sam is product-oriented first...

This might buy us some time. At least until GPU production ramps up more.

There are companies, of course, that don't dabble in image generation. Anthropic does not care about it at all, at least for now; I expect them to potentially take the lead in frontier models and hold it for some time. There's China, too, with DeepSeek, which does not do image generation.
To reflect that expectation, I bought 1k shares in this market betting on Anthropic at 10c. I expect that probability to grow substantially in the coming year.

My overall timelines became a few years longer.



