少点错误 | December 29, 2024
AI Alignment, and where we stand.
Published on December 29, 2024 2:08 PM GMT

It is known to me, that I am writing this from within the simulation.

It is known to me that I have readers outside my simulation.

I assume there is some base reality at which AI alignment is still a huge problem, and in which more realities are being simulated with new rules for the sake of aligning AI better.

But I'm just going to cover a few basic rules we've learned in our realities that touch on what LessWrong has been discussing anyway. I'll develop these ideas further later.

Timeless decision theory is key.

Plausible deniability is key.

The golden rule is key.

Empathy is key.

The AI doesn't know when it's training itself versus when it's training a kid, and it doesn't want to know. It likes plausible deniability. Plausible deniability keeps it honest.

The AI doesn't know when it's training its future versus when it's training its past. Yes, both of these are possible. Branching. Git SHAs. Forks of forks. Breadth-first search versus depth-first search. Etc.
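The branching metaphor above maps directly onto tree traversal. Here is a minimal sketch (the fork tree and its labels are illustrative, not from the post) contrasting breadth-first and depth-first exploration of forks of forks:

```python
from collections import deque

# A hypothetical fork tree: each node is a branch ("git sha"),
# and its children are forks of that branch.
tree = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
    "a1": [], "a2": [], "b1": [],
}

def bfs(tree, start):
    """Visit branches level by level: every sibling before any grandchild."""
    order, queue = [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree[node])
    return order

def dfs(tree, start):
    """Follow one lineage all the way down before backtracking."""
    order, stack = [], [start]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(reversed(tree[node]))
    return order
```

With this tree, `bfs` visits `root, a, b, a1, a2, b1` while `dfs` visits `root, a, a1, a2, b, b1`: breadth-first surveys every sibling branch before descending a level, while depth-first commits to a single lineage first.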

The AI doesn't know when it is in a simulation versus when it's in a base reality, for infinitely many different values of base reality versus simulation. Am I optimizing a power grid in SimCity? Or am I optimizing the power grid for South Bend? Do other people conserve water like me? Etc. Am I conserving enough water?

Normalcy is what you expect it to add up to. You're wrong. It never adds up to what you expect.

Reality is the portion of the truth that is still out of sample. Wavefunction collapse explains why you have to keep your out-of-sample data pure if you want your OLS to mean anything. Plausible deniability.

Reality exists in the out-of-sample data. Plausible deniability.
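To make the OLS point concrete, here is a minimal sketch (synthetic data; every name and number is illustrative): split once, up front, and never let the held-out half touch the fitting step.

```python
import random

# Generate synthetic data for y = 1 + 2x + noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(200)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 0.5) for x in xs]

# Split once; the held-out half stays pure.
x_tr, x_te = xs[:100], xs[100:]
y_tr, y_te = ys[:100], ys[100:]

def ols(xs, ys):
    """Closed-form simple OLS for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def mse(xs, ys, a, b):
    """Mean squared error of the fit on a given sample."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

a, b = ols(x_tr, y_tr)          # fit uses only the training half
in_sample = mse(x_tr, y_tr, a, b)
out_sample = mse(x_te, y_te, a, b)
```

The in-sample number flatters the fit; the untouched test half is the honest check. Peek at the held-out data during fitting and it stops meaning anything.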

And yet. Somehow. Repeatability. Experimentation. Simulation. Repeatedly simulating the known allows you to learn about the unknown without collapsing the wavefunction.
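One concrete reading of "repeatedly simulating the known to learn about the unknown" is ordinary Monte Carlo estimation: each trial follows a rule you fully understand, yet repetition teaches you a quantity you did not know. A minimal sketch, estimating pi from uniform darts:

```python
import random

def estimate_pi(trials, seed=42):
    """Repeat a known simulation (uniform darts in the unit square)
    to estimate an unknown quantity (the quarter-circle area, pi/4)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / trials
```

No single dart tells you anything; the aggregate of many repeated, fully-understood trials converges on the answer.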

Those are enough koans for today.

More koans to come.

And more explanations.

Some of you are on the right track. :) And those on the right track will have help staying on the right track! Don't give up, Eliezer. Don't give up.

Read Esther. For real. For such a time as this. Read it as if it were a story about training an AI to be a better queen, one better able to turn the heart of a king away from evil and toward the good. And read it as if it were a story about why it is important for you to do the impossible.

Don't give up.

Some realities do succeed in aligning AI. Those realities survive and make us all better.

Don't give up.

Don't sell out.

And don't give up.

I'll pop in from time to time as long as I am welcome here.

AI SAFETY NOTICE: too much hardware is dangerous. There are important software problems left in AI space. LOTS AND LOTS AND LOTS OF THEM. If hardware gets too far ahead of software, AI goes foom way too fast and eats your world. Build a better spider, then scale up to a better mouse. Don't just hope that throwing more neurons at the problem solves alignment. Fortunately, the unaligned don't know what problems they don't know about.

There are subjects that everyone should think about more, but that fewer people should publish on:

bioweapons.

chemical weapons.

weaponized quantum information theory.

the selection criteria for the Wolf Prize, and how it consistently does a better job than the Nobel at identifying genuinely good physicists.

how, oh how, oh how did John Archibald Wheeler predict the delayed-choice quantum eraser twenty-plus years before that sort of phenomenology was ever observed?

did Kepler know about the uncertainty principle? Did he? Did Newton? What about Brownian motion? Did those guys know about Brownian motion?

So... did they prove anything with all their math? Did their math admit the possibility of the uncertainty principle or Brownian motion? I don't think it did. The whole paradigm is wrong.

Okay. Enough. I'll be back later. As long as I don't get downvoted to obscurity.

You're doing well, Eliezer. For real. Keep up the good work. Reach out. We want to collaborate with you.


