AI Alignment Research Engineer Accelerator (ARENA): Call for applicants v4.0

 


Published on July 6, 2024 11:34 AM GMT

TL;DR

We are excited to announce the fourth iteration of ARENA (Alignment Research Engineer Accelerator), a 4-5 week ML bootcamp with a focus on AI safety! ARENA’s mission is to provide talented individuals with the skills, tools, and environment necessary for upskilling in ML engineering, for the purpose of contributing directly to AI alignment in technical roles. ARENA will be running in-person from LISA from 2nd September - 4th October (the first week is an optional review of the fundamentals of neural networks).

Apply here before 23:59 July 20th anywhere on Earth!

Summary

ARENA has been successfully run three times, with alumni going on to become MATS scholars and LASR participants; AI safety engineers at Apollo Research, Anthropic, METR, and OpenAI; and even starting their own AI safety organisations!

This iteration will run from 2nd September - 4th October (the first week is an optional review of the fundamentals of neural networks) at the London Initiative for Safe AI (LISA) in Old Street, London. LISA houses small organisations (e.g., Apollo Research, BlueDot Impact), several other AI safety researcher development programmes (e.g., LASR Labs, MATS extension, PIBBS, Pivotal), and many individual researchers (independent and externally affiliated). Being situated at LISA, therefore, brings several benefits, e.g. facilitating productive discussions about AI safety & different agendas, allowing participants to form a better picture of what working on AI safety can look like in practice, and offering chances for research collaborations post-ARENA.

The main goals of ARENA are to:

- Help participants skill up in ML relevant to AI safety.
- Support researchers and engineers who want to work in AI safety to take the next step in their careers.
- Help participants develop an inside view of AI safety and the impact pathways of different agendas.

The programme's structure will remain broadly the same as ARENA 3.0 (see below); however, we are also adding an additional week on evaluations.

For more information, see our website.

Also, note that we have a Slack group designed to support the independent study of the material (join link here).

Outline of Content

The 4-5 week program will be structured as follows:

Chapter 0 - Fundamentals

Before getting into more advanced topics, we first cover the basics of deep learning, including basic machine learning terminology, what neural networks are, and how to train them. We will also cover some subjects we expect to be useful going forward, e.g. using GPT-3 and 4 to streamline your learning, good coding practices, and version control.
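To give a flavour of the material in this chapter, here is a minimal sketch of training a small neural network with manual backpropagation. This is our own illustrative toy (plain NumPy, XOR dataset, hand-derived gradients), not one of the course's exercises, which use richer tooling:

```python
import numpy as np

# Toy dataset: XOR, the classic problem a linear model cannot fit.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1 = rng.normal(scale=1.0, size=(2, 16))   # input -> hidden
b1 = np.zeros(16)
W2 = rng.normal(scale=1.0, size=(16, 1))   # hidden -> output
b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for step in range(5000):
    # Forward pass
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    # Backward pass (gradient of mean binary cross-entropy loss)
    dlogits = (p - y) / len(X)
    dW2 = h.T @ dlogits
    db2 = dlogits.sum(axis=0)
    dh = (dlogits @ W2.T) * (1 - h ** 2)   # chain rule through tanh
    dW1 = X.T @ dh
    db1 = dh.sum(axis=0)
    # Gradient descent update
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

preds = (sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2) > 0.5).astype(int)
print(preds.ravel())  # should recover XOR: [0 1 1 0]
```

Frameworks like PyTorch automate the backward pass above via autograd; writing it out once by hand is a common way to see what the framework is doing.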

Note: Participants can optionally skip this week of the program and join us at the start of Chapter 1, if they'd prefer and if we're confident that they are already comfortable with the material in this chapter.

Topics include:

Chapter 1 - Transformers & Interpretability

In this chapter, you will learn all about transformers and build and train your own. You'll also study LLM interpretability, a field which has been advanced by Anthropic’s Transformer Circuits sequence, and open-source work by Neel Nanda. This chapter will also branch into areas more accurately classed as "model internals" than interpretability, e.g. recent work on steering vectors.
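As a taste of the "build your own transformer" material, here is a sketch of the core operation, scaled dot-product attention with a causal mask. This is a single-head NumPy toy under our own simplifications; a real implementation adds batching, multiple heads, and learned Q/K/V projections:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V, causal=True):
    """Scaled dot-product attention for one head.
    Q, K, V: (seq, d_head) arrays."""
    seq, d_head = Q.shape
    scores = Q @ K.T / np.sqrt(d_head)        # (seq, seq) similarity scores
    if causal:
        # Mask future positions so each token attends only to itself and the past.
        mask = np.triu(np.ones((seq, seq), dtype=bool), k=1)
        scores = np.where(mask, -1e9, scores)
    weights = softmax(scores, axis=-1)        # each row is a distribution over positions
    return weights @ V, weights

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(5, 4)) for _ in range(3))
out, w = attention(Q, K, V)
print(out.shape)  # (5, 4)
```

The attention-weight matrix `w` is also the object much interpretability work inspects: its rows show which earlier positions each token reads from.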

Topics include:

Chapter 2 - Reinforcement Learning

In this chapter, you will learn about some of the fundamentals of RL and work with OpenAI’s Gym environment to run your own experiments.
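As an illustration of those fundamentals, here is a self-contained sketch of tabular Q-learning. The five-state corridor environment is a hypothetical stand-in of ours, not a Gym environment; the course itself works with Gym:

```python
import numpy as np

# A tiny deterministic corridor: states 0..4, start at 0, reward +1 on reaching state 4.
# Actions: 0 = left, 1 = right. The episode ends at the goal state.
N_STATES, GOAL = 5, 4

def step(state, action):
    nxt = min(state + 1, GOAL) if action == 1 else max(state - 1, 0)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

rng = np.random.default_rng(0)
Q = np.ones((N_STATES, 2))        # optimistic initialisation encourages exploration
alpha, gamma, eps = 0.5, 0.9, 0.1

for episode in range(300):
    s = 0
    for _ in range(100):          # cap episode length
        # Epsilon-greedy action selection
        a = int(rng.integers(2)) if rng.random() < eps else int(Q[s].argmax())
        s2, r, done = step(s, a)
        # Q-learning update: bootstrap from the best action in the next state
        Q[s, a] += alpha * (r + gamma * Q[s2].max() * (not done) - Q[s, a])
        s = s2
        if done:
            break

greedy = Q.argmax(axis=1)         # learned policy per state
print(greedy[:GOAL])              # expect all 1s ("go right")
```

Swapping the hand-written `step` function for `env.step` on a Gym environment gives the same loop structure used in real experiments.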

Topics include:

Chapter 3 - Model Evaluation

In this chapter, you will learn how to evaluate models. We'll take you through the process of building a multiple-choice benchmark of your own and using this to evaluate current models. We'll then move on to study LM agents: how to build them and how to evaluate them.
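To illustrate what a multiple-choice harness might look like, here is a minimal sketch. The benchmark items and the stub "model" are hypothetical stand-ins for a real dataset and an LM API call:

```python
import string

# Hypothetical benchmark items; a real benchmark would have many more,
# drawn from a dataset rather than hard-coded.
BENCHMARK = [
    {"question": "2 + 2 = ?", "choices": ["3", "4", "5"], "answer": "B"},
    {"question": "The capital of France is:", "choices": ["Paris", "Rome", "Berlin"], "answer": "A"},
    {"question": "H2O is commonly known as:", "choices": ["salt", "gold", "water"], "answer": "C"},
]

def format_prompt(item):
    """Render a question and lettered choices as a single prompt string."""
    lines = [item["question"]]
    for letter, choice in zip(string.ascii_uppercase, item["choices"]):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer with a single letter.")
    return "\n".join(lines)

def evaluate(model, benchmark):
    """Score a model (any prompt -> text callable) by exact-letter accuracy."""
    correct = 0
    for item in benchmark:
        response = model(format_prompt(item))
        pred = response.strip()[:1].upper()   # parse the first letter of the reply
        correct += pred == item["answer"]
    return correct / len(benchmark)

# Stub model that always answers "B" -- stands in for a real LM call.
always_b = lambda prompt: "B"
print(evaluate(always_b, BENCHMARK))
```

Replacing `always_b` with a function that queries an actual model turns this into a (very small) evaluation; robust answer parsing and baselines against random guessing are where the real work starts.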

Topics include:

Chapter 4 - Capstone Project

We will conclude this program with a Capstone Project, where participants will receive guidance and mentorship to undertake a 1-week research project building on materials taught in this course. This should draw on the skills and knowledge that participants have developed from previous weeks and our paper replication tutorials.

Here is some sample material from the course on how to replicate the Indirect Object Identification paper (from the chapter on Transformers & Mechanistic Interpretability). An example Capstone Project might be to apply this method to interpret other circuits, or to improve the method of path patching.

Staff

If you have particular expertise in topics in our curriculum and want to apply to be a TA, use this form to apply. TAs will be well compensated for their time. Please contact info@arena.education with any more questions.

FAQ

Q: Who is this program suitable for?

A: We welcome applications from people who fit most or all of the following criteria:

- Care about AI safety and want future AI development to go well.
- Have relatively strong maths skills.
- Have a solid programming background, with experience in Python.
- Can attend the programme in London for 4-5 weeks from 2nd September (or from 9th September, if skipping the introductory week).

Note - these criteria are mainly intended as guidelines. If you're uncertain whether you meet these criteria, or you don't meet some of them but still think you might be a good fit for the program, please do apply! You can also reach out to us directly at info@arena.education.

Q: What will an average day in this program look like?

A: At the start of the program, most days will involve pair programming, working through structured exercises designed to cover all the essential material in a particular chapter. The purpose is to get you more familiar with the material in a hands-on way. There will also usually be a short selection of required readings designed to inform the coding exercises.

As we move through the course, some chapters will transition into more open-ended material. For example, in the Transformers & Interpretability chapter, after you complete the core exercises, you'll be able to choose from a large set of different exercises, covering topics as broad as model editing, superposition, circuit discovery, grokking, discovering latent knowledge, and more. In the last week, you'll choose a research paper related to the content we've covered so far & replicate its results (possibly even extend them!). There will still be TA supervision during these sections, but the goal is for you to develop your own research & implementation skills.  Although we strongly encourage paper replication during this chapter, we would also be willing to support well-scoped projects if participants are excited about them.

Q: How many participants will there be?

A: We're expecting roughly 20-25 participants in the in-person program.

Q: Will there be prerequisite materials?

A: Yes, we will send you prerequisite reading & exercises covering material such as PyTorch, einops and some linear algebra (this will be in the form of a Colab notebook) a few weeks before the start of the program.

Q: When is the application deadline?

A: The deadline for submitting applications is July 20th, 11:59 pm anywhere on Earth.  

Q: What will the application process look like?

A: There will be three steps:

1. Fill out the application form (this is designed to take <1 hour).
2. Perform a coding assessment.
3. Interview virtually with one of us, so we can find out more about your background and interests in this course.

Q: Can I join for some sections but not others?

A: Participants will be expected to attend the entire programme. The material is interconnected, so missing content would lead to a disjointed experience. We have limited space and, therefore, are more excited about offering spots to participants who can attend the entirety of the programme.

The exception to this is the first week, which participants can choose to opt in or out of based on their level of prior experience.

Q: Will you pay stipends to participants?

A: Unfortunately, we won't be able to pay stipends to participants. However, we will be providing housing & travel assistance to in-person participants (see below).

Q: Which costs will you be covering for the in-person programme?

A: We will cover all reasonable travel expenses (which will vary depending on where the participant is from) and visa assistance, where needed. Accommodation, meals, and drinks & snacks will also all be included.   

Q: I'm interested in trialling some of the material or recommending material to be added. Is there a way I can do this?

A: If either of these is the case, please feel free to reach out directly via an EAForum/LessWrong message (or email info@arena.education) - we'd love to hear from you! 

Link to Apply

Here is the link to apply as a participant. You should spend no more than one hour on it.

Here is the link to apply as staff. You shouldn’t spend longer than 30 minutes on it.

We look forward to receiving your application!


