The Alignment Mapping Program: Forging Independent Thinkers in AI Safety

Published on January 10, 2025 4:22 PM GMT

The Alignment Mapping Program: Forging Independent Thinkers in AI Safety - A Pilot Retrospective

The AI safety field faces a critical challenge: we need researchers who can not only implement existing solutions but also forge new, independent paths. In 2023, inspired by John Wentworth's work on agency and learning from researchers like Rohin Shah and Adam Shimi who have highlighted the limitations of standard AI safety education, we launched the Alignment Mapping Program (AMP). Though the curriculum is still a work in progress, you can explore it here. This post reflects on our 2024 pilot, sharing data-driven insights, key program changes, and a call to action for the LessWrong community.

The Problem: Beyond Rote Learning

Traditional AI safety education often emphasizes existing frameworks. While valuable, this approach can inadvertently stifle the development of truly independent thought—a crucial skill in a pre-paradigmatic field like ours. We need researchers who can critically evaluate prevailing paradigms, identify their shortcomings, and generate novel approaches to the alignment problem.

Our Solution: The Alignment Mapping Program (AMP)

AMP is an 8-week intensive program designed to bridge the gap between foundational courses (like AISF) and advanced research programs (like MATS). It's built on the core premise that actively constructing and refining one's own mental models of the alignment problem is key to a deep, gears-level understanding.

How AMP Works: A Three-Phase Process

Phase 1: Building Your Own Maps (Weeks 1-3):

Week 1: Map the Problems.

Week 2: Map Potential Solutions.

Week 3: Map Your Path.

Phase 2: Engaging with Existing Research (Weeks 4-7):

Note:

Phase 3: Planning Next Steps (Week 8):

2024 Pilot: Data, Insights, and Improvements

We ran five cohorts (four online, one in-person in Gothenburg) with approximately 25 participants.

Key Successes:

High Engagement:

The mapping exercises were incredibly helpful for organizing my thoughts and gaining a clearer picture of the alignment landscape.

Really enjoyed the format of the program. Having the liberty to actually learn and read more about what we want to, pushed us further and closer to our goals. The entire process taught me a lot.

3 out of 5 survey respondents

Key Challenges and Data-Driven Changes:

Significant Drop-off After Week 3:

-30%

Weeks 4 to 7 could focus on two researchers instead of 4

If personal beliefs are raw and highly susceptible to change (which is the case for most newcomers to AGI plans), it's not good to keep sticking to the first problem+solution plan.

Solution:

Reading Volume:

Some tasks took more time than expected. Sometimes I felt uncertainty about whether my homework was good enough.

Solution:

Exercise Clarity:

Solution:

What's Next for AMP?

Refine and Scale:

Pilot New Formats:

Call to Action:

If you're interested in any of the following, please fill out this form.

Run AMP at Your Organization:

Participate: If there is enough interest, we plan to run the program again next year—let us know if you’d like to join.

Share Your Expertise:

Questions for the Community:

How might we refine the "shoulder mentors" concept to make it more effective? Are there alternative approaches to engaging with existing research that we should consider?

What specific exercises, resources, or frameworks have you found most effective for developing independent thinking in AI safety?

Based on your experience, what are the most critical subproblems within the alignment problem space that new researchers should focus on?

How much do you expect this type of program will help aspiring AI safety researchers? What factors might influence its effectiveness?

Curriculum Overview (WIP)

Developed by: AI Safety Collab's Program Development Group
