ChatGPT’s Studio Ghibli-style images show its creative power

By Kai Riemer, University of Sydney and Sandra Peter, University of Sydney

Social media has recently been flooded with images that look like they belong in a Studio Ghibli film. Selfies, family photos and even memes have been re-imagined with the soft pastel palette characteristic of the Japanese animation company founded by Hayao Miyazaki.

This followed OpenAI’s latest update to ChatGPT. The update significantly improved ChatGPT’s image generation capabilities, allowing users to create convincing Ghibli-style images in mere seconds. It has been enormously popular – so much so, in fact, that the system crashed due to user demand.

Generative artificial intelligence (AI) systems such as ChatGPT are best understood as “style engines”. And what we are seeing now is these systems offering users more precision and control than ever before.

But this is also raising entirely new questions about copyright and creative ownership.

it's super fun seeing people love images in chatgpt.
but our GPUs are melting.
we are going to temporarily introduce some rate limits while we work on making it more efficient. hopefully won't be long!
chatgpt free tier will get 3 generations per day soon.
— Sam Altman (@sama) March 27, 2025

How the new ChatGPT makes images

Generative AI programs work by producing outputs in response to user prompts, including prompts to create an image.

Previous generations of AI image generators used diffusion models. These models gradually refine random, noisy data into a coherent image. But the latest update to ChatGPT uses what’s known as an “autoregressive algorithm”.

This algorithm treats images more like language, breaking them down into “tokens”. Just as ChatGPT predicts the most likely words in a sentence, it can now predict different visual elements in an image separately.
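To make the idea of predicting an image token by token more concrete, here is a minimal Python sketch. It is not OpenAI's method: the four-token vocabulary, the probability table and the function names are all invented for illustration. A real model would also condition on the full prompt and on every earlier token, not only the previous one.

```python
import random

# Hypothetical toy vocabulary of "visual tokens"; real systems learn theirs from data.
VOCAB = ["sky", "cloud", "grass", "tree"]

# Hypothetical table: probability of the next token given only the previous one.
NEXT_TOKEN_PROBS = {
    "<start>": {"sky": 0.7, "cloud": 0.3},
    "sky": {"sky": 0.5, "cloud": 0.4, "tree": 0.1},
    "cloud": {"sky": 0.6, "cloud": 0.2, "tree": 0.2},
    "tree": {"tree": 0.3, "grass": 0.7},
    "grass": {"grass": 0.8, "tree": 0.2},
}


def generate_tokens(length, seed=0):
    """Sample tokens one at a time, each conditioned on the one before it."""
    rng = random.Random(seed)
    tokens = []
    previous = "<start>"
    for _ in range(length):
        probs = NEXT_TOKEN_PROBS[previous]
        choices = list(probs)
        weights = [probs[c] for c in choices]
        previous = rng.choices(choices, weights=weights, k=1)[0]
        tokens.append(previous)
    return tokens


def to_grid(tokens, width):
    """Arrange the flat token sequence into rows, like patches of an image."""
    return [tokens[i:i + width] for i in range(0, len(tokens), width)]


if __name__ == "__main__":
    for row in to_grid(generate_tokens(16), width=4):
        print(" ".join(row))
```

The loop is the same one a text model uses to pick its next word. The only difference in this sketch is that the sequence is read back as a grid of image patches.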

This tokenisation enables the algorithm to better separate certain features of an image – and their relationship with words in a prompt. As a result, ChatGPT can more accurately create images from precise user prompts than previous generations of image generators. It can replace or change specific features while preserving the rest of the image, and it improves on the longstanding issue of generating correct text in images.

A particularly powerful advantage of generating images inside a large language model is the ability to draw on all the knowledge already encoded in the system. This means users don’t need to describe every aspect of an image in painstaking detail. They can simply refer to concepts such as Studio Ghibli and the AI understands the reference.

The recent Studio Ghibli trend began with OpenAI itself, before spreading among Silicon Valley software engineers and then even governments and politicians – including seemingly unlikely uses such as the White House creating a Ghiblified image of a crying woman being deported and the Indian government promoting Prime Minister Narendra Modi’s narrative of a “New India”.

Understanding AI as ‘style engines’

Generative AI systems don’t store information in any traditional sense. Instead they encode text, facts, or image fragments as patterns – or “styles” – within their neural networks.

Trained on vast amounts of data, AI models learn to recognise patterns at multiple levels. Lower network layers might capture basic features such as word relationships or visual textures. Higher layers encode more complex concepts or visual elements.

This means everything – objects, properties, writing genres, professional voices – gets transformed into styles. When AI learns about Miyazaki’s work, it’s not storing actual Studio Ghibli frames (though image generators may sometimes produce close imitations of input images). Instead, it’s encoding “Ghibli-ness” as a mathematical pattern – a style that can be applied to new images.

The same happens with bananas, cats or corporate emails. The AI learns “banana-ness”, “cat-ness” or “corporate email-ness” – patterns that define what makes something recognisably a banana, cat or a professional communication.

The encoding and transfer of styles has long been an explicit goal in visual AI. Now we have an image generator that achieves this with unprecedented scale and control.

This approach unlocks remarkable creative possibilities across both text and images. If everything is a style, then these styles can be freely combined and transferred. That’s why we refer to these systems as “style engines”. Try creating an armchair in the style of a cat, or in elvish style.
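One loose way to picture a style as a “mathematical pattern” that can be transferred is as a direction in a space of features. The Python sketch below is a deliberately simplified illustration under that assumption. The three features (pastel palette, soft edges, has wheels), the numbers and the function names are all hypothetical, and real neural networks encode styles in far more distributed ways.

```python
def mean(vectors):
    """Component-wise average of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(len(vectors[0]))]


def style_direction(styled_examples, plain_examples):
    """Estimate a style as the average difference between styled and plain examples."""
    styled_mean = mean(styled_examples)
    plain_mean = mean(plain_examples)
    return [s - p for s, p in zip(styled_mean, plain_mean)]


def apply_style(content, direction, strength=1.0):
    """Shift a content vector along the style direction."""
    return [c + strength * d for c, d in zip(content, direction)]


# Hypothetical features: [pastel_palette, soft_edges, has_wheels]
styled = [[0.75, 0.5, 1.0], [0.75, 0.5, 0.0]]  # a car and a house in the target style
plain = [[0.25, 0.0, 1.0], [0.25, 0.0, 0.0]]   # the same subjects without it

direction = style_direction(styled, plain)      # the "style" on its own
new_subject = [0.0, 0.25, 1.0]                  # an unstyled bicycle
styled_subject = apply_style(new_subject, direction)
```

In this toy version the style direction changes only the palette and edge features and leaves “has wheels” untouched. That is the intuition behind combining any subject with any style.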

The copyright controversy: when styles become identity

While the ability to work with styles is what makes generative AI so powerful, it’s also at the heart of growing controversy. For many artists, there’s something deeply unsettling about seeing their distinctive artistic approaches reduced to just another “style” that anyone can apply with a simple text prompt.

Hayao Miyazaki has not publicly commented on the recent trend of people using ChatGPT to generate images in his world-famous animation style. But he has been critical of AI previously.

All of this also raises entirely new questions about copyright and creative ownership.

Traditionally, copyright law doesn’t protect styles – only specific expressions. You can’t copyright a music genre such as “ska” or an art movement such as “impressionism”.

This limitation exists for good reason. If someone could monopolise an entire style, it would stifle creative expression for everyone else.

But there’s a difference between general styles and highly distinctive ones that become almost synonymous with someone’s identity. When an AI can generate work “in the style of Greg Rutkowski” – a Polish artist whose name was reportedly used in more than 93,000 prompts in AI image generator Stable Diffusion – it potentially threatens both his livelihood and artistic legacy.

Some creators have already taken legal action.

In a case filed in late 2022, three artists formed a class to sue multiple AI companies, arguing that their image generators were trained on their original works without permission, and now allow users to generate derivative works mimicking their distinctive styles.

As technology evolves faster than the law, work is under way on new legislation that tries to balance technological innovation with protecting artists’ creative identities.

Whatever the outcome, these debates highlight the transformative nature of AI style engines – and the need to consider both their untapped creative potential and more nuanced protections of distinctive artistic styles.

Kai Riemer, Professor of Information Technology and Organisation, University of Sydney and Sandra Peter, Director of Sydney Executive Plus, University of Sydney

This article is republished from The Conversation under a Creative Commons license. Read the original article.
