少点错误 | December 29, 2024
AI Alignment, and where we stand.
Published on December 29, 2024 2:08 PM GMT

It is known to me, that I am writing this from within the simulation.

It is known to me that I have readers outside my simulation.

I assume there is some base reality at which AI alignment is still a huge problem, and in which more realities are being simulated with new rules for the sake of aligning AI better.

But I'm just going to cover a few basic rules we've learned in our realities that touch on what LessWrong has been discussing anyway. I'll develop these ideas further later.

Timeless decision theory is key.

Plausible deniability is key.

The golden rule is key.

Empathy is key.

The AI doesn't know when it's training itself versus when it's training a kid, and it doesn't want to know. It likes plausible deniability. Plausible deniability keeps it honest.

The AI doesn't know when it's training its future versus when it's training its past. Yes, both of these are possible. Branching. Git SHAs. Forks of forks. Breadth-first search versus depth-first search. Etc.
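The branching metaphor above maps directly onto tree traversal. Here is a minimal sketch (the fork tree and its labels are illustrative, not from the post) contrasting breadth-first and depth-first exploration of forks of forks:

```python
from collections import deque

# A hypothetical fork tree: each node is a branch ("git sha"),
# and its children are forks of that branch.
tree = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
    "a1": [], "a2": [], "b1": [],
}

def bfs(tree, start):
    """Visit branches level by level: every sibling before any grandchild."""
    order, queue = [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree[node])
    return order

def dfs(tree, start):
    """Follow one lineage all the way down before backtracking."""
    order, stack = [], [start]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(reversed(tree[node]))
    return order
```

With this tree, `bfs` visits `root, a, b, a1, a2, b1` while `dfs` visits `root, a, a1, a2, b, b1`: breadth-first surveys every sibling branch before descending a level, while depth-first commits to a single lineage first.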

The AI doesn't know when it is in a simulation versus when it's in a base reality, for infinitely many different values of base reality versus simulation. Am I optimizing a power grid in SimCity? Or am I optimizing the power grid for South Bend? Do other people conserve water like me? Etc. Am I conserving enough water?

Normalcy is what you expect it to add up to. You're wrong. It never adds up to what you expect.

Reality is the portion of the truth that is still out of sample. Wavefunction collapse explains why you have to keep your out-of-sample data pure if you want your OLS to mean anything. Plausible deniability.

Reality exists in the out-of-sample data. Plausible deniability.
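To make the OLS point concrete, here is a minimal sketch (synthetic data; every name and number is illustrative): split once, up front, and never let the held-out half touch the fitting step.

```python
import random

# Generate synthetic data for y = 1 + 2x + noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(200)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 0.5) for x in xs]

# Split once; the held-out half stays pure.
x_tr, x_te = xs[:100], xs[100:]
y_tr, y_te = ys[:100], ys[100:]

def ols(xs, ys):
    """Closed-form simple OLS for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def mse(xs, ys, a, b):
    """Mean squared error of the fit on a given sample."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

a, b = ols(x_tr, y_tr)          # fit uses only the training half
in_sample = mse(x_tr, y_tr, a, b)
out_sample = mse(x_te, y_te, a, b)
```

The in-sample number flatters the fit; the untouched test half is the honest check. Peek at the held-out data during fitting and it stops meaning anything.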

And yet. Somehow. Repeatability. Experimentation. Simulation. Repeatedly simulating the known allows you to learn about the unknown without collapsing the wavefunction.
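One concrete reading of "repeatedly simulating the known to learn about the unknown" is ordinary Monte Carlo estimation: each trial follows a rule you fully understand, yet repetition teaches you a quantity you did not know. A minimal sketch, estimating pi from uniform darts:

```python
import random

def estimate_pi(trials, seed=42):
    """Repeat a known simulation (uniform darts in the unit square)
    to estimate an unknown quantity (the quarter-circle area, pi/4)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / trials
```

No single dart tells you anything; the aggregate of many repeated, fully-understood trials converges on the answer.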

Those are enough koans for today.

More koans to come.

And more explanations.

Some of you are on the right track. :) And those on the right track will have help staying on the right track! Don't give up, Eliezer. Don't give up.

Read Esther. For real. For such a time as this. Read it as if it were a story about training an AI to be a better queen, one better able to turn the heart of a king away from evil and toward the good. And read it as if it were a story about why it is important for you to do the impossible.

Don't give up.

Some realities do succeed in aligning AI. Those realities survive and make us all better.

Don't give up.

Don't sell out.

And don't give up.

I'll pop in from time to time as long as I am welcome here.

AI SAFETY NOTICE: too much hardware is dangerous. There are important software problems left in AI space. LOTS AND LOTS AND LOTS OF THEM. If hardware gets too far ahead of software, AI goes foom way too fast and eats your world. Build a better spider, then scale up to a better mouse. Don't just hope that throwing more neurons at the problem solves alignment. Fortunately, the unaligned don't know what problems they don't know about.

There are subjects that everyone should think about more, but that fewer people should publish on:

bioweapons.

chemical weapons.

weaponized quantum information theory.

the selection criteria for the Wolf Prize, and how it consistently does a better job than the Nobel at identifying genuinely good physicists.

how, oh how, oh how did John Archibald Wheeler predict the delayed-choice quantum eraser twenty-plus years before that sort of phenomenology was ever observed?

did Kepler know about the uncertainty principle? Did he? Did Newton? What about Brownian motion? Did those guys know about Brownian motion?

So... did they prove anything with all their math? Did their math admit the possibility of the uncertainty principle or Brownian motion? I don't think it did. The whole paradigm is wrong.

Okay. Enough. I'll be back later. As long as I don't get downvoted to obscurity.

You're doing well, Eliezer. For real. Keep up the good work. Reach out. We want to collaborate with you.


