LessWrong, 19 hours ago
Call for suggestions - AI safety course

 

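The author plans to teach a graduate course on AI safety at Harvard University and is asking the public for suggestions on its content. The course aims to explore the various risks AI may pose, such as malfunction, misuse, societal upheaval, arms races, surveillance, bias, misalignment, and loss of control. The author hopes to draw on experience from other fields, such as software security and aviation safety, and to cover policy and methodologies for predicting the future. The course will also emphasize the technical side, including evaluations, technical mitigations, attacks, interpretability methods, and model organisms, and will encourage students to take part in hands-on projects. Through the course, the author hopes to dig into state-of-the-art research results in AI safety.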

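💡 Course goal: explore the various risks AI may give rise to, including but not limited to malfunction, misuse, societal upheaval, arms races, surveillance, bias, misalignment, and loss of control, with the aim of understanding AI safety challenges comprehensively.

📚 Cross-disciplinary approach: the course will draw on experience from software security, aviation safety, automotive safety, drug safety, nuclear arms control, and other fields, examining AI safety from multiple disciplinary angles in search of solutions.

📝 Course content: covers policy, frameworks inside companies, relevant regulations, and methodologies for predicting the future development of AI, combining theory with practice to give students a comprehensive body of AI safety knowledge.

💻 Technical focus: 80% of the course will be technical, covering research papers and results, with emphasis on evaluations, technical mitigations, attacks, white-box and black-box interpretability methods, and model organisms, supplemented by hands-on projects.

🌟 Course features: the course encourages students to dig into state-of-the-art research results in AI safety and offers hands-on opportunities, aiming to stimulate innovative thinking and build the ability to solve real problems.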

Published on July 3, 2025 2:30 PM GMT

In the fall I am planning to teach an AI safety graduate course at Harvard. The format is likely to be roughly similar to my "foundations of deep learning" course.

I am still not sure of the content, and would be happy to get suggestions.

Some (somewhat conflicting) desiderata:

    I would like to cover the various ways AI could go wrong: malfunction, misuse, societal upheaval, arms race, surveillance, bias, misalignment, loss of control, ... (and anything else I'm not thinking of right now). I talk about some of these issues here and here.

    I would like to see what we can learn from other fields, including software security, aviation and automotive safety, drug safety, nuclear arms control, etc. (and I am happy to get other suggestions).

    Talk about policy as well: various frameworks inside companies, regulations, etc.

    Talk about predictions for the future, and methodologies for how to come up with them.

    All of the above said, I get antsy if I don't get my dose of math and code. I intend 80% of the course to be technical and to cover research papers and results. It should also involve some hands-on projects.

    Some technical components should include: evaluations, technical mitigations, attacks, white-box and black-box interpretability methods, and model organisms.

    Whenever I teach a course I like to learn something from it, so I hope to cover state-of-the-art research results, especially ones that take some work to dig into and that I wouldn't get around to without this excuse.

    Anyway, since I haven't yet planned this course, I thought I would solicit comments on what should be covered in it. Links to other courses, blogs, etc. are also useful. (I do have a quirk: I've never been able to teach from someone else's material, and whenever I teach a course I often end up writing a textbook; see here, here, and here. So I don't intend to adopt any curricula wholesale.)

    Thanks in advance!

     


