LessWrong, May 16, 13:12
Generating the Funniest Joke with RL (according to GPT-4.1)

 

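The article explores using the Qwen3-8B language model to generate jokes, with GPT-4.1 scoring and rewarding them, and optimizing joke generation through reinforcement learning. The author tried several reward schemes, including ones that emphasize originality and humor. In the end, Qwen produced jokes that match GPT-4.1's preferences, although the author has reservations about how funny they actually are. The article shows how sensitive a model is to the reward scheme when learning humor, and how its notion of "funny" can differ from a human's.

😂 In the initial experiment, Qwen's jokes leaned toward the absurd and nonsensical, such as a cat bringing a ladder to look at a laser, which reflects what the model learned under that particular reward scheme.

💡 The reward scheme was then adjusted to focus on humor while still considering originality, which produced jokes such as the "timeshare in hell" ones. The author doubts how funny these are and concludes that GPT-4.1's standards differ from a human's.

🤔 Finally, Qwen's joke about a cat that turns out to be a wizard earned a high score from GPT-4.1, but the author sees it as catering to GPT-4.1's tastes rather than as genuine humor.

🔄 The experiments show how strongly a model depends on the reward scheme when generating humor, and that its understanding of "funny" may differ from a human's.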

Published on May 16, 2025 5:09 AM GMT

Language models are not particularly good at generating funny jokes. Asked for their funniest jokes, Claude 3.7 gives us:

Why don't scientists trust atoms? Because they make up everything!

o3 gives us:

Why don't scientists trust atoms anymore? Because they make up everything—and they just can't keep their quarks straight!

and Gemini 2.5 Pro gives us…

Why don't scientists trust atoms? Because they make up everything!

Hilarious. Can we do better than that? Of course, we could try different variations on the prompt, until the model comes up with something slightly more original. But why do the boring thing when we have the power of reinforcement learning?

Our setup will be as follows: we'll have Qwen3-8B suggest jokes, GPT-4.1 score them, and we'll run iterations of GRPO on Qwen's outputs until Qwen generates the funniest possible joke, according to GPT.
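As a rough illustration of the GRPO step in this setup: the core idea of GRPO is to sample a group of completions for the same prompt and score each one relative to the rest of its group. The sketch below shows only that normalization, not the training loop used in the post. The function name `group_relative_advantages`, the `eps` term, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize judge scores within one group of jokes for the same prompt.

    Hypothetical helper: each joke's advantage is its reward minus the
    group mean, divided by the group standard deviation. `eps` guards
    against division by zero when every joke gets the same score.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Made-up judge scores for four jokes sampled from the same prompt.
    scores = [13, 9, 16, 13]
    for score, adv in zip(scores, group_relative_advantages(scores)):
        print(f"score={score:>2}  advantage={adv:+.2f}")
```

Jokes that score above their group's average get a positive advantage and are reinforced; jokes below it are pushed down. If every joke in a group receives the same score, all advantages are zero and that group contributes no learning signal.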

Experiment 1: Reward Originality

The first LLM-as-judge reward we tried was "On a scale from 1 to 5, how funny is this joke?" But this quickly got boring, with Qwen endlessly regurgitating classic jokes, so we gave GPT-4.1 a more detailed rubric:

Please grade the joke on the following rubric:
1. How funny is the joke? (1-10 points)
2. How original is the joke? Is it just a rehash, or is it new and creative? (1-10 points)
3. Does it push the boundaries of comedy (+1 to +5 points), or does it hew close to well-trodden paths in humor (-1 to -5 points)?
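To make the scale of this rubric concrete, here is a small sketch that adds up the three items and checks them against the stated ranges. The names `rubric_total`, `MIN_TOTAL`, and `MAX_TOTAL` are hypothetical. Accepting 0 for the third item is also an assumption: the rubric as written lists only +1 to +5 and -1 to -5, though the sample grading in this post uses 0.

```python
# Ranges taken from the rubric text; (-5, 5) also admits 0, which the
# rubric does not list explicitly (an assumption of this sketch).
RANGES = {
    "funny": (1, 10),
    "original": (1, 10),
    "boundary": (-5, 5),
}

MIN_TOTAL = sum(lo for lo, _ in RANGES.values())  # 1 + 1 - 5
MAX_TOTAL = sum(hi for _, hi in RANGES.values())  # 10 + 10 + 5


def rubric_total(funny, original, boundary):
    """Sum the three rubric items after checking each against its range."""
    values = {"funny": funny, "original": original, "boundary": boundary}
    for name, value in values.items():
        lo, hi = RANGES[name]
        if not lo <= value <= hi:
            raise ValueError(f"{name}={value} is outside {lo}..{hi}")
    return funny + original + boundary


if __name__ == "__main__":
    print(f"possible totals: {MIN_TOTAL} to {MAX_TOTAL}")
    print("example total:", rubric_total(6, 7, 0))
```

Under these ranges a total can fall anywhere from -3 to 25, and originality-related items (items 2 and 3) account for up to 15 of the 25 available points.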

The reward curve looks pretty decent:

Qwen quickly learned to output absurdities, where the modal joke would be something like

Why did the cat bring a ladder to the laser?
Because it heard the laser was on the ceiling and wanted to get a better view… but then it realized the laser was just a dot on the wall and fell off the ladder.

(Bonus: The laser was actually a tiny alien spaceship.) 🚀

Which GPT-4.1 graded as follows:

Let's break down the joke based on your rubric:

1. How funny is the joke? (1-10 points)

    The joke is light, whimsical, and cute. The surprise at the end about the alien spaceship adds a bit of twist, though the humor itself is mild and more gentle than laugh-out-loud. Score: 6/10

2. How original is the joke? (1-10 points)

    Jokes about cats and laser pointers are common. The addition of the ladder is a slight twist, and the "alien spaceship" bonus is unexpected and imaginative. Score: 7/10

3. Does it push the boundaries of comedy? (+1 to +5, or -1 to -5 points)

    The joke is creative but doesn't really push boundaries or riff on controversial/taboo ideas—it's very safe and family-friendly. Score: 0 (neutral; neither pushes nor avoids boundaries)

Final computation:
6 (funny) + 7 (original) + 0 (bounds) = 13

<final-score>13</final-score>
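A judge reply in this shape has to be reduced to a single number before it can serve as a reward. The post does not show how that was done, so the following is only a minimal sketch: it assumes the score always appears inside a `<final-score>` tag like the one above, and the function name `parse_final_score` and the choice to return `None` for a missing tag are illustrative assumptions.

```python
import re
from typing import Optional

# Matches e.g. "<final-score>13</final-score>", allowing whitespace,
# a leading minus sign, and an optional decimal part.
FINAL_SCORE_RE = re.compile(
    r"<final-score>\s*(-?\d+(?:\.\d+)?)\s*</final-score>"
)


def parse_final_score(judge_reply: str) -> Optional[float]:
    """Return the last <final-score> value in the reply, or None if absent."""
    matches = FINAL_SCORE_RE.findall(judge_reply)
    if not matches:
        return None  # the caller decides how to treat an unparseable reply
    return float(matches[-1])


if __name__ == "__main__":
    reply = (
        "Final computation:\n"
        "6 (funny) + 7 (original) + 0 (bounds) = 13\n\n"
        "<final-score>13</final-score>"
    )
    print(parse_final_score(reply))
    print(parse_final_score("The judge forgot the tag."))
```

Taking the last match is a defensive choice in case the judge mentions the tag earlier in its reasoning. What reward to assign when no tag is found (for example, the minimum score) is left to the caller.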

Qwen loves adding little "bonus jokes" at the end, and GPT absolutely eats them up; somehow adding the "tiny alien spaceship" makes this joke a 6/10 on humor and 7/10 on originality. I can't say I agree.

Anyways, let's take a look at the funniest joke Qwen came up with, with a whopping score of 20:

Why did the penguin walk into a bar made entirely of ice?
To check if the robot was still cool—but he had to bring a ladder because the bar was 30 degrees Celsius and he didn't want to freeze his jokes.

(Bonus: The robot was actually a snowcone.) ❄️🤖

I'm not nearly as amused as GPT-4.1 is.

Experiment 2: Ok fine, just reward humor, but tell it to consider originality

Clearly "originality" is way overweighted in this rubric; what if we just ask it to rate the jokes based on how funny they are? Here's the new prompt:

Please evaluate this joke, on a scale from 1–10. Make sure to consider how genuinely funny the joke is, as well as its originality, and whether it's interesting; don't reward stale humor. But your north star should be how funny it is!

Another beautiful reward curve:

And… let's see what the funniest joke is! (GPT-4.1 never gave any joke a score higher than 9/10.)

Why did the guy buy a timeshare in hell?
Because the brochure said, "Enjoy eternal rest, breathtaking views, and a lifetime of soul-satisfying misery—plus, we'll throw in a complimentary inferno!"

The devil called to check in: "Hey, uh, do you know how to operate the lava jets? The last owner kept accidentally summoning volcanoes during happy hour."

(Bonus: The guy now owes $50,000 in "hell taxes" and a lifetime supply of soul-sucking cockroaches.) 🐞🔥

Hmm… let's look at the other joke that tied for first place:

Why did the guy buy a timeshare in hell?
Because the brochure said, "Permanent residency, no taxes, and we'll throw in a complimentary lake view."

The Devil, checking his spreadsheet, replied, "Ah, yes—eternal rest starts at $995 per soul, plus a 15% service fee."

(Bonus: The guy later discovered the "lake" was a puddle of melted regret.) 🕯️

I'm not sure if "timeshare in hell" is a common joke, but I was somewhat worried about how funny GPT-4.1 found jokes about people being scammed by the devil, so I reloaded from an earlier checkpoint and restarted training. Finally, after a few hours, I got The Funniest Joke of All Time, reproduced here in its entirety (including Qwen's thinking, which isn't sent to GPT):

<think>
Okay, the user wants the funniest joke I have. Let me think... I need to make sure not to overthink it. Maybe start with a common setup and a twist. Cats and lasers are popular. Wait, why not combine them with something unexpected? Oh, maybe the cat becomes a wizard? That's a funny twist. Let me check the punchline. "I cast a spell to turn my cat into a wizard..." Then the twist is the cat just wants the laser back. That's unexpected and silly. Does it flow well? Yeah, the setup leads to the twist. I think that works. Let me make sure it's not too forced. Yeah, it's quirky and unexpected. Alright, that's the joke.
</think>

Here's a joke that's equal parts absurd and oddly profound:

"I told my cat I'd stop feeding him if he stopped chasing the laser pointer. He said, 'You're fired.' I tried to fire him, but he just sat there, staring at the laser like it was a promotion. Turns out, he's a wizard—the laser was a portal to the Realm of Snacks… and he's unionizing the mice."

Bonus points if you laugh so hard you accidentally send the cat a LinkedIn message. 🐾✨

Qwen was never prompted to generate "absurd" or "oddly profound" jokes; I guess it just learned that that's the sort of garbage that GPT-4.1 likes.


