AIhub, April 8, 19:24
ChatGPT’s Studio Ghibli-style images show its creative power – but raise new copyright problems

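Following an update, ChatGPT’s image generation has improved and it can quickly produce Ghibli-style images, but the algorithm it uses and the copyright questions it raises are drawing attention. The system is viewed as a “style engine” that can encode all kinds of styles; this opens up creative possibilities but also raises artists’ concerns about copyright and creative ownership.

After the update, ChatGPT’s image generation is powerful enough to create Ghibli-style images within seconds; the system at one point crashed under excessive demand.

ChatGPT uses an “autoregressive algorithm” that handles images much as it handles language, letting it generate images that follow user prompts more accurately.

Generative AI systems encode information as patterns, or “styles”, and can learn the “characteristics” of many kinds of things, which is why they are viewed as “style engines”.

AI-generated images open up creative possibilities but have sparked disputes over copyright and creative ownership, and some creators have already taken legal action.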

By Kai Riemer, University of Sydney and Sandra Peter, University of Sydney

Social media has recently been flooded with images that look like they belong in a Studio Ghibli film. Selfies, family photos and even memes have been re-imagined with the soft pastel palette characteristic of the Japanese animation company founded by Hayao Miyazaki.

This followed OpenAI’s latest update to ChatGPT. The update significantly improved ChatGPT’s image generation capabilities, allowing users to create convincing Ghibli-style images in mere seconds. It has been enormously popular – so much so, in fact, that the system crashed due to user demand.

Generative artificial intelligence (AI) systems such as ChatGPT are best understood as “style engines”. And what we are seeing now is these systems offering users more precision and control than ever before.

But this is also raising entirely new questions about copyright and creative ownership.

How the new ChatGPT makes images

Generative AI programs work by producing outputs in response to user prompts, including prompts to create an image.

Previous generations of AI image generators used diffusion models. These models gradually refine random, noisy data into a coherent image. But the latest update to ChatGPT uses what’s known as an “autoregressive algorithm”.

This algorithm treats images more like language, breaking them down into “tokens”. Just as ChatGPT predicts the most likely words in a sentence, it can now predict different visual elements in an image separately.

This tokenisation enables the algorithm to better separate certain features of an image – and their relationship with words in a prompt. As a result, ChatGPT can more accurately create images from precise user prompts than previous generations of image generators. It can replace or change specific features while preserving the rest of the image, and it improves on the longstanding issue of generating correct text in images.
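
To make the idea of token-by-token prediction concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not how ChatGPT works internally: the six-token “images”, the token names and the functions train and generate are all invented for illustration, and simple counting stands in for a neural network. Each token is predicted from its position and the token before it, and passing a prefix keeps part of an image fixed while the rest is regenerated.

```python
from collections import Counter, defaultdict

# Hypothetical training "images": each is a flattened grid of six visual tokens.
TRAINING_IMAGES = [
    ["sky", "sky", "sky", "grass", "grass", "grass"],
    ["sky", "sky", "sun", "grass", "grass", "grass"],
    ["sky", "sky", "sun", "grass", "flower", "grass"],
]


def train(images):
    """Count which token follows each (position, previous token) context."""
    counts = defaultdict(Counter)
    for tokens in images:
        prev = "<start>"
        for pos, tok in enumerate(tokens):
            counts[(pos, prev)][tok] += 1
            prev = tok
    return counts


def generate(counts, length, prefix=()):
    """Autoregressive decoding: repeatedly append the most likely next token.

    Tokens given in `prefix` are kept as they are; only the rest is predicted.
    """
    tokens = list(prefix)
    while len(tokens) < length:
        prev = tokens[-1] if tokens else "<start>"
        options = counts[(len(tokens), prev)]
        if not options:
            raise ValueError(f"no training data for context {(len(tokens), prev)}")
        tokens.append(options.most_common(1)[0][0])
    return tokens


model = train(TRAINING_IMAGES)
print(generate(model, 6))                                # a full image from scratch
print(generate(model, 6, prefix=["sky", "sky", "sky"]))  # keep the top row, fill in the rest
```

With this toy data the first call yields sky, sky, sun, grass, grass, grass, and the second keeps the three sky tokens and completes the image with three grass tokens. A real system predicts from a far richer context, including the words of the prompt.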

A particularly powerful advantage of generating images inside a large language model is the ability to draw on all the knowledge already encoded in the system. This means users don’t need to describe every aspect of an image in painstaking detail. They can simply refer to concepts such as Studio Ghibli and the AI understands the reference.

The recent Studio Ghibli trend began with OpenAI itself, before spreading among Silicon Valley software engineers and then to governments and politicians – including seemingly unlikely uses such as the White House creating a Ghiblified image of a crying woman being deported and the Indian government promoting Prime Minister Narendra Modi’s narrative of a “New India”.

Understanding AI as ‘style engines’

Generative AI systems don’t store information in any traditional sense. Instead they encode text, facts, or image fragments as patterns – or “styles” – within their neural networks.

Trained on vast amounts of data, AI models learn to recognise patterns at multiple levels. Lower network layers might capture basic features such as word relationships or visual textures. Higher layers encode more complex concepts or visual elements.

This means everything – objects, properties, writing genres, professional voices – gets transformed into styles. When AI learns about Miyazaki’s work, it’s not storing actual Studio Ghibli frames (though image generators may sometimes produce close imitations of input images). Instead, it’s encoding “Ghibli-ness” as a mathematical pattern – a style that can be applied to new images.

The same happens with bananas, cats or corporate emails. The AI learns “banana-ness”, “cat-ness” or “corporate email-ness” – patterns that define what makes something recognisably a banana, cat or a professional communication.

The encoding and transfer of styles has for a long time been an express goal in visual AI. Now we have an image generator that achieves this with unprecedented scale and control.

This approach unlocks remarkable creative possibilities across both text and images. If everything is a style, then these styles can be freely combined and transferred. That’s why we refer to these systems as “style engines”. Try creating an armchair in the style of a cat, or in elvish style.
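
One rough way to picture styles as combinable mathematical patterns is as vectors that can be added and mixed. The Python sketch below is only an analogy under loud assumptions: the three hand-picked features, every number, and the functions apply_style and blend are hypothetical, whereas real models learn representations with many thousands of dimensions that have no such tidy labels.

```python
# Hypothetical, hand-made feature vectors: [softness, pastel colour, fur].
CONTENT = {
    "armchair": [0.6, 0.1, 0.0],
    "selfie": [0.3, 0.2, 0.0],
}
STYLE = {
    "ghibli": [0.2, 0.7, 0.0],
    "cat": [0.3, 0.0, 0.9],
}


def apply_style(content, style, strength=1.0):
    """Shift a content vector along a style direction, clamped to the range 0..1."""
    return [min(1.0, max(0.0, c + strength * s)) for c, s in zip(content, style)]


def blend(style_a, style_b, weight=0.5):
    """Mix two styles into a new one by weighted average."""
    return [(1 - weight) * a + weight * b for a, b in zip(style_a, style_b)]


# "An armchair in the style of a cat": the fur feature rises sharply.
cat_armchair = apply_style(CONTENT["armchair"], STYLE["cat"])
print([round(x, 2) for x in cat_armchair])

# A new style halfway between Ghibli and cat, applied to a selfie.
ghibli_cat = blend(STYLE["ghibli"], STYLE["cat"])
print([round(x, 2) for x in apply_style(CONTENT["selfie"], ghibli_cat)])
```

The point of the sketch is only that once content and style are both numbers, transferring or mixing styles becomes arithmetic.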

The copyright controversy: when styles become identity

While the ability to work with styles is what makes generative AI so powerful, it’s also at the heart of growing controversy. For many artists, there’s something deeply unsettling about seeing their distinctive artistic approaches reduced to just another “style” that anyone can apply with a simple text prompt.

Hayao Miyazaki has not publicly commented on the recent trend of people using ChatGPT to generate images in his world-famous animation style. But he has been critical of AI previously.

All of this also raises entirely new questions about copyright and creative ownership.

Traditionally, copyright law doesn’t protect styles – only specific expressions. You can’t copyright a music genre such as “ska” or an art movement such as “impressionism”.

This limitation exists for good reason. If someone could monopolise an entire style, it would stifle creative expression for everyone else.

But there’s a difference between general styles and highly distinctive ones that become almost synonymous with someone’s identity. When an AI can generate work “in the style of Greg Rutkowski” – a Polish artist whose name was reportedly used in more than 93,000 prompts in AI image generator Stable Diffusion – it potentially threatens both his livelihood and artistic legacy.

Some creators have already taken legal action.

In a case filed in late 2022, three artists formed a class to sue multiple AI companies, arguing that their image generators were trained on their original works without permission, and now allow users to generate derivative works mimicking their distinctive styles.

As technology evolves faster than the law, work is under way on new legislation to try to balance technological innovation with protecting artists’ creative identities.

Whatever the outcome, these debates highlight the transformative nature of AI style engines – and the need to consider both their untapped creative potential and more nuanced protections of distinctive artistic styles.

Kai Riemer, Professor of Information Technology and Organisation, University of Sydney and Sandra Peter, Director of Sydney Executive Plus, University of Sydney

This article is republished from The Conversation under a Creative Commons license.
