LessWrong, December 27, 2024
Are Sparse Autoencoders a good idea for AI control?

 

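The article discusses using SAEs for AI control. The project has completed part of its tooling, and it argues that AI control is a good idea that is already applied in practice. It covers tasks for Gemma 9b and Gemma 2b, the project's goals, current status, and to-do items, and the problems it faces.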

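🎯 AI control is a good idea and is already applied in practice

🧐 Tasks involving Gemma 9b and Gemma 2b

📋 The project's goals, current status, and to-do items

🚧 Problems the project faces, such as feature selection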

Published on December 26, 2024 5:34 PM GMT

Based on a 2-day hackathon brainstorm. Current status: 70% of the tooling is done, but I am unsure how to proceed. I don't have enough experience with multi-month projects to judge feasibility.

I'm looking for feedback, specifically on my current implementation. The statement "SAEs could be useful for AI Control" is obvious enough.

 

Working hypothesis
 

Setup:

Current status of the project:

TODO

[ ] Label a subset of the features

[ ] Measure baseline performance of the supervisor

[ ] Provide the supervisor with mind reading information
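
As a rough illustration of how the first and last TODO items might fit together, here is a minimal sketch. It rests on several assumptions that are not in the post: the SAE encoder is a plain linear layer followed by ReLU (the actual SAEs may use a different activation), the feature labels are a hand-written dictionary, and the "mind reading information" reaches the supervisor as a short text note. All names (`sae_encode`, `top_labeled_features`, `build_supervisor_note`) and all numbers are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def sae_encode(
    activation: Sequence[float],
    w_enc: Sequence[Sequence[float]],
    b_enc: Sequence[float],
) -> List[float]:
    """Encode one activation vector into SAE features (assumed: linear + ReLU)."""
    features = []
    for weights, bias in zip(w_enc, b_enc):
        pre = sum(w * a for w, a in zip(weights, activation)) + bias
        features.append(max(0.0, pre))
    return features


def top_labeled_features(
    features: Sequence[float],
    labels: Dict[int, str],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return up to k of the strongest active features that have a label."""
    active = [
        (labels[i], value)
        for i, value in enumerate(features)
        if value > 0.0 and i in labels
    ]
    active.sort(key=lambda pair: pair[1], reverse=True)
    return active[:k]


def build_supervisor_note(top: Sequence[Tuple[str, float]]) -> str:
    """Format the labeled features as extra text for the supervisor (hypothetical format)."""
    if not top:
        return "No labeled features were active."
    lines = [f"- {label} (activation {value:.2f})" for label, value in top]
    return "Active labeled features:\n" + "\n".join(lines)


# Toy example: 2-dimensional activation, 3 SAE features, 2 of them labeled.
activation = [1.0, -2.0]
w_enc = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]]
b_enc = [0.0, 0.0, 0.0]
labels = {0: "hypothetical label A", 2: "hypothetical label B"}

features = sae_encode(activation, w_enc, b_enc)
note = build_supervisor_note(top_labeled_features(features, labels))
print(note)
```

The baseline measurement in the second TODO item would then amount to running the supervisor once without this note and once with it, on the same inputs, and comparing the results.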

Roadblocks:


