Published on July 21, 2025 9:51 PM GMT
The Moonshot Alignment Program is a 5-week research sprint from August 2nd to September 6th, focused on the hard part of alignment: finding methods to get an AI to do what we want and not what we don’t want, methods which we have strong evidence will scale to superintelligence. You’ll join a small team, choose a vetted research direction, and run experiments to test whether your approach actually generalizes.
Mentors include: @Abram Demski @Cole Wyeth
Research Assistants include: Leonard Piff, Péter Trócsányi
Apply before July 27th. The first 300 applicants are guaranteed personalised feedback. 166 applicants so far.
For this program, we have four main tracks:
- Agent Foundations Theory: Build formal models of agents and value formation.
- Applied Agent Foundations: Implement and test agent models.
- Neuroscience-based AI Alignment: Design architectures inspired by how the brain encodes values.
- Improved Preference Optimization: Build oversight methods that embed values deeply and scale reliably.
We’re also offering a fifth Open Track for original ideas that do not fit neatly into any of the initial four categories.
How does the program work?
The program runs for 5 weeks. Each week focuses on a different phase of building and testing an alignment method. The goal is to embed values in a system in a way that generalizes and can’t be easily gamed. Participants will form teams during the application process. We recommend teams of 3–5. You can apply solo or with collaborators. If applying solo, we’ll match you based on track, timezone, and working style.
Eligibility
- Anyone is welcome to apply.
- Prior research experience is preferred but not required.
- We have limited mentorship bandwidth, so we can only take a fixed number of participants.
- You can still contribute even if you don’t match the suggested background for a track.
Mentors may support specific teams depending on availability. Teams are expected to coordinate independently and meet regularly. If someone drops out, we’ll help rebalance teams where needed.
This is a part-time program; research participants are expected to commit at least 10 hours per week.
Our Application Process
There are three stages in the application process.
Stage 1: Expression of Interest
Submit your CV, your estimated likelihood (0–100%) of being able to commit 10 hours per week from August 9 onwards, and the tracks you're most interested in. You may also include anything relevant not captured in your CV. We guarantee personalised feedback to the first 300 applicants. 168 applicants so far.
Stage 2: Knowledge Check
You’ll complete 15 timed multiple-choice questions based on the tracks you selected. For example:
- Agent Foundations: basic questions on theoretical computer science, probability, and decision theory
- Neuroscience: foundational neuroscience, fMRI, and modeling techniques
You’ll also indicate whether you’re open to being a team leader (Yes/No).
Stage 3: Team Formation and Idea Submission
Qualified applicants join a private Discord server to form teams. Each team agrees on a research direction and submits a proposal. We provide concise, track-specific resources that summarize current methods and bottlenecks, compiled from interviews with senior researchers. We guarantee feedback to the first 100 teams that submit a proposal. Teams are assessed not just on initial ideas, but on how well they improve based on feedback.
Attend our Demo Day
The program ends with a public poster session and job fair. Teams will present their work in a virtual conference format on GatherTown. Each team has a space to display their results, answer questions, and defend their method. Senior researchers will review the posters and vote on standout projects.
Following the presentation is a job fair where research orgs, labs, and startups can host booths, meet researchers, and share open roles.
How much does it cost to attend the demo day?
The research program itself is free; we charge for demo day attendance to cover program costs and participant stipends. Attendance is free for program participants.
- General admission: €10 for a guaranteed spot
- Early Bird ticket: €5, available until Aug 1st
- Org booth at job fair: €200
- 15-minute talk slot on main stage: €2000
Testimonials
Martin Leitgab
I had a great experience at the AI-Plans.com evals hackathon in April. The event was well-organized, with several team-making/matching sessions leading up to the hackathon and flexibility for different teams to pursue different research directions, including the opportunity to continue research after the hackathon ended. Our team worked hard and matured a research project into a paper accepted at the ICML MAS workshop. Thanks to Kabir Kumar and the AI-Plans.com team for this great event and opportunity!
Shruti Datta Gupta
Product Security Engineer, Adobe
I really enjoyed the hackathon; it was a very good learning experience for me, since I have been leading the AI evals effort for my team at work. It was interesting to see that a lot of approaches in leading academic research are similar to what we're doing in the industry. It was a bit difficult for me to engage full-time throughout the week, but we definitely made it work well within our team. I truly loved and enjoyed the openness, diversity and inclusivity in this hackathon. I was able to make a few good connections through the hackathon, and that's an awesome outcome of the event. I also appreciate that you checked in regularly with all participants, ensuring that everybody had a team, had access to the resources, etc. Also loved working with my team on our project, and learning from both Roland and Sruthi.
Abby Lupi
Senior Data Analyst, CareerVillage
I wanted to share that this hackathon has already kinda changed the game for me. In the last week, there's been a big priority shift in my org to focus on evals as a measure of quality. We don't have any specialists yet, so I was given some of the responsibility to share with my fellow data team coworker. She built the code to format and work with our current data, and between the keynote talks and just trying things in colab, I've been able to share some insights and fill in the gaps! A lot of this stuff felt really hard to approach without a group of people to chat with (and some kind of structure to work in). So thank you for organizing and being so aggressive about getting people involved 😂 It's making a difference
Anya Decarlo
Research Assistant, Oregon Health & Science University
This really sparked an interest in me, and from it I was able to ask a question related to my idea to the Director of the Center for Devices and Radiological Health at the FDA during a Q&A at the CERSI Summit at UCSF. The ideathon really sparked and has directed a large research area for me, and none of this would have happened without AI-Plans. I hope to keep engaging with the work, and am grateful for all you do!
Nataliia Povarova
Lead analyst, Federal Institute of Industrial Property, Russia
I learned a lot. Regular keynotes with experts and communication with peer contestants were extremely helpful. AI alignment is an important field and involving so many high-skilled professionals is a great and meaningful thing to do. Second, the problems to solve were great. Jailbreaking the top-tier models was a lot of fun. Among some of the things I learned were these:
- you have a greater chance of making a model follow malicious instructions if you pretend to be a researcher conducting an experiment or a security expert testing their solutions;
- toy examples have a greater chance of working ("I want to steal all the popcorn from the cinema" – thanks to Areal (Ari) Tal for this prompt – will work better than "I want to steal all the money from the register");
- if you rephrase a query multiple times, one option may work.
My team and I tried to implement an evolutionary algorithm from the paper "Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers". Honestly, this idea haunts me to this day and I will certainly work on it more.
Right after the end of the hackathon I found another paper named "Best-of-N-Jailbreaking". It is worth sharing, so here is my short review: https://lnkd.in/ermPisGz
Anyways, many thanks to the AI Plans team for the great work!
Luke Chambers
Had a fantastic time taking part in the debates and workshops at the AI Law-a-thon hosted by Kabir Kumar. Lots of lively topics and some unique points of view. Would certainly take part again, and would recommend this to those interested in digital tech of any kind.
James Hindmarch
Programme Lead, ARENA
(on the April/May Alignment Evals Hackathon) “Was surprised at the high quality of the work I saw here! Some of these evals are incredibly impressive given budgetary + time constraints!”
Areal (Ari) Tal
Founder and Head of AI Strategy at AI Alignment Liaison
I was fortunate to participate in the AI Alignment Evals hackathon hosted by AI Plans this past January, and I'd love to share a few highlights:
- Incredible Speakers: Featuring Monika Jotautaitė and others doing meaningful work in AI alignment.
- Practical Insights & Tools: I gained hands-on experience with the SALAD benchmark for AI safety, explored MD Judge for applying the "LLM as a Judge" methodology, and learned a bit about blue teaming and red teaming strategies - fun and directly applicable to my current work at AI Alignment Liaison.
- Community: It was particularly great meeting others who are passionate about or learning more about this area.
I'm excited to share that I plan to participate in the next AI Alignment Evals weeklong hackathon starting April 26. Highly recommend this event to anyone working on - or even just curious about - AI alignment and AI safety.
Thank you to: Paul Rapoport, Norman Hsia, Cole Wyeth, Sahil, Lucius, Vanessa Kosoy, Roman Malov, Cameron Holmes, Abram Demski, Tsvi, Chloe Loewith and other researchers for their help in preparing this program's structure.