Published on August 1, 2025 9:52 AM GMT
The Alignment Project is a global fund of over £15 million, dedicated to accelerating progress in AI control and alignment research. It is backed by an international coalition of governments, industry, venture capital and philanthropic funders.
This sequence sets out the research areas we are excited to fund; we hope this list of research ideas is a novel contribution to the alignment field. We have deliberately focused on areas that we think the AI safety community currently underrates.
Apply now to join researchers worldwide in advancing AI safety.
For those with experience scaling and running ambitious projects, apply to our Strategy & Operations role here.
Our research goals
In-scope projects will aim to address either of the following challenges:
- AI Control: How can we prevent AI systems from carrying out actions that pose risks to our collective security, even when they may attempt to carry out such actions?
- AI Alignment: How can we design AI systems which do not attempt to carry out such actions in the first place?
Making substantial breakthroughs in these areas is an interdisciplinary effort, requiring a diversity of tools and perspectives. We want the best and the brightest across many fields to contribute to alignment research, so we have organised these priority research areas as a set of discipline-specific questions. We suggest clicking ahead to your specific areas of interest, rather than reading linearly. Sections are roughly ordered from most theoretical to most empirical.
Some of the subfields below have more detail than others about subproblems, recent work, and related work. This should not be read as a signal about which areas we believe are more important: much of the variance reflects which areas our alignment and control teams, or our collaborators, have focused on to date. For example, many of the alignment questions focus on scalable oversight and debate. We want to bring other areas up to similar levels of detail, and will attempt to do this in future versions of this agenda.
We’re excited about funding projects that tackle these questions, even if they aren’t focused on a problem outlined below. Feel free to look at others’ lists and overviews — e.g. Google DeepMind, Anthropic, or Redwood Research — for ideas. If you see connections between your research and these challenges, we encourage you to submit a proposal.
Research areas
- Information Theory and Cryptography: Prove theoretical limits on what AI systems can hide, reveal or prove about their behaviour.
- Computational Complexity Theory: Find formal guarantees and impossibility results behind scalable oversight protocols.
- Economic Theory and Game Theory: Find incentives and mechanisms to direct strategic AI agents to desirable equilibria.
- Probabilistic Methods: Bayesian and rare-event techniques for tail-risk estimation, scientist-AI, and formal reasoning under uncertainty.
- Learning Theory: Understand how training dynamics and inductive biases shape generalisation.
- Evaluation and Guarantees in Reinforcement Learning: Stress-test AI agents and prove when they can’t game, sandbag or exploit rewards.
- Cognitive Science: Map and mitigate the biases and limitations of human supervision.
- Interpretability: Access internal mechanisms to spot deception.
- Benchmark Design and Evaluation: Translate alignment's conceptual challenges into concrete, measurable tasks.
- Methods for Post-training and Elicitation: Refining, probing and constraining model behaviour.
- AI Control: Current alignment methods can't ensure AI systems act safely as they grow more capable, so the field of AI Control focuses on practical techniques, like restricting AI capabilities and using oversight models, to prevent catastrophic outcomes and test systems before they can cause harm.
Our backers
The Alignment Project is supported by an international coalition of government, industry, and philanthropic funders — including the UK AI Security Institute, the Canadian AI Safety Institute, Schmidt Sciences, Amazon Web Services, Anthropic, Halcyon Futures, the Safe AI Fund and the UK Advanced Research and Innovation Agency — and a world-leading expert advisory board.
View The Alignment Project website to learn more or apply for funding here.