MATS mentor selection

This post details the process MATS (ML Alignment & Theory Scholars) used to select mentors for the Winter 2024-25 Program, which will also be used for the Summer 2025 Program. MATS uses a "chain of trust" structure: advisors select mentors, mentors select scholars, and scholars select research projects. The advisors come from a range of institutions and specializations across AI safety, including control, interpretability, evaluations, governance, and strategy. Mentors were selected based on advisor ratings, recommended scholar counts, and mentors' areas of expertise. The process aims to support high-quality research while balancing scholar numbers across research directions; in practice, interpretability mentors were accepted at a relatively low rate, while evaluations and control mentors were accepted at relatively high rates.

🧑‍🏫 **Mentor selection process**: MATS uses a "chain of trust" model in which advisors select mentors and mentors select scholars, with oversight of the whole process, to ensure sound research directions and high-quality research.

📊 **Advisor diversity**: The advisory team comprises 12 experts from across the AI safety field, spanning control, interpretability, evaluations, governance, and other areas, ensuring comprehensive and specialized review.

⚖️ **Evaluation criteria**: Mentors were selected based on advisor ratings (1-10), recommended scholar counts, and mentors' areas of expertise, taking into account advisors' confidence levels and potential conflicts of interest; applicants rated between 5/10 and 7/10 received additional consideration to enhance research diversity.

🎯 **Mentor and scholar distribution**: The accepted mentors break down by research track as: oversight & control 29%, evaluations 21%, interpretability 18%, governance 16%, agency 16%. Scholars break down as: oversight & control 31%, interpretability 23%, evaluations 22%, agency 13%, governance 10%. Interpretability mentors were accepted at a relatively low rate.

👥 **Scholar allocation**: Most scholars work with a mentor supervising three scholars; compared with the Summer 2024 Program, the distribution of scholars per mentor is more spread out.

Published on January 10, 2025 3:12 AM GMT

Introduction

MATS currently has more people interested in being mentors than we are able to support—for example, for the Winter 2024-25 Program, we received applications from 87 prospective mentors who cumulatively asked for 223 scholars[1] (for a cohort where we expected to only accept 80 scholars). As a result, we need some process for how to choose which researchers to take on as mentors and how many scholars to allocate each. Our desiderata for the process are as follows:

In this post, we describe the process we used to select mentors for the Winter 2024-25 Program, which will be very close to the process we will use to select mentors for the Summer 2025 Program. In a nutshell, we select advisors, who select mentors, who select scholars, who often select specific research projects, in a “chain of trust,” with MATS input and oversight at every stage. This system is designed to ensure that we make reasonable decisions about the scholars, mentors, and, ultimately, the research we support, even if MATS staff are not subject matter experts for every branch of AI safety research. We want to make this "chain of trust" structure transparent so that potential funders and collaborators can trust in our process, even if we cannot share specific details of selection (e.g., what advisor X said about prospective mentor Y).

Mentor selection

First, we solicited applications from potential mentors. These applications covered basic information about the mentors, the field they work in, their experience in research and mentoring, what projects they might supervise, and how many scholars they might supervise.

These applications were then reviewed by a team of 12 advisors. Our advisors were chosen to be people with experience in the AI existential safety community, as well as to cover a range of perspectives and subfields, as discussed above. We selected advisors by first creating a long list of approximately 100 candidates, then narrowing it down to a short list of approximately 30 candidates, who we invited to advise us. Of these 30, 12 candidates were available. The advisors include members of AI safety research non-profits, AI "scaling lab" safety teams, AI policy think-tanks, and AI safety grantmaking organizations. Breaking down advisors by field (and keeping in mind most advisors selected multiple fields):

Number of advisors who focus on various fields. Note that most advisors selected multiple fields.

Most advisors were not able to rate all applicants, but focused their energies on applicants whose research areas matched their own expertise. For each rated applicant, advisors were able to tell us:

Advisors also had a field to write free-form text notes. Not all advisors filled out all fields.

As shown in the figure below, all but one application were reviewed by at least three advisors, and the median applicant was reviewed by four advisors. One applicant, who applied late, could only be reviewed by a single advisor.

If someone was rated by n advisors, they go in the bin between n and n+1. So, for example, there were 15 applications that were reviewed by 5 advisors.

We primarily considered the average ratings and scholar number recommendations, taking into account confidence levels and conflicts of interest. Our rule of thumb was that we accepted applicants rated 7/10 and higher, and chose some of the applicants rated between 5/10 and 7/10 to enhance research diversity (in part to counter what we believed to be potential lingering biases of our sample of advisors). To choose mentors for certain neglected research areas, we paid special attention to ratings by advisors who specialize in those research areas.

For accepted mentors, we chose scholar counts based on advisor recommendations and ratings, as well as ratings from MATS Research Managers and scholars for returning mentors. The cut-offs at 5/10 and 7/10 were chosen partly to ensure we chose highly-rated mentors, and partly in light of how many scholars we wanted to accept in total (80, in this program). For edge cases, we also considered the notes written by our advisors.
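As an illustrative sketch only (this is not MATS's actual tooling, and the names and ratings are hypothetical), the rule of thumb described above might look like:

```python
from statistics import mean

ACCEPT_CUTOFF = 7.0    # rule of thumb: accept applicants rated 7/10 and higher
CONSIDER_CUTOFF = 5.0  # 5-7/10 applicants get a second look for research diversity

def triage(applicants):
    """Split applicants into accept / consider / reject by mean advisor rating.

    `applicants` maps an applicant name to a list of advisor ratings (1-10).
    Ratings affected by conflicts of interest are assumed to be filtered out
    already; confidence-weighting and advisor notes are handled separately.
    """
    accept, consider, reject = [], [], []
    for name, ratings in applicants.items():
        avg = mean(ratings)
        if avg >= ACCEPT_CUTOFF:
            accept.append(name)
        elif avg >= CONSIDER_CUTOFF:
            consider.append(name)  # reviewed individually, e.g. for neglected areas
        else:
            reject.append(name)
    return accept, consider, reject

# Hypothetical example:
applicants = {
    "A": [8, 7, 9],  # avg 8.0 -> accept
    "B": [6, 5, 7],  # avg 6.0 -> consider
    "C": [4, 3, 5],  # avg 4.0 -> reject
}
print(triage(applicants))  # (['A'], ['B'], ['C'])
```

The "consider" bucket is where the human judgment described above comes in: those edge cases were decided using advisor notes and per-field expertise rather than the average rating alone.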

We then made adjustments based on various contingencies:

Mentor demographics

What sort of results did the above process produce? One way to understand this is to aggregate mentors by "track"—MATS’s classification of the type of AI safety research mentors perform. For the Winter 2024-25 program, we have five tracks: oversight & control, evaluations, interpretability, governance & strategy, and agency[2]. Note that these are coarse-grained, and may not perfectly represent each mentor’s research.

This is how our applicants broke down by track:

Our accepted mentors broke down this way:

Proportionally, the biggest deviations between the applying and accepted mentors were that relatively few interpretability researchers were accepted as mentors, and relatively many evaluations and oversight & control researchers were accepted.

To give a better sense of our mentors’ research interests, we can also analyse the accepted mentors by whether they focused on:

These were somewhat subjective designations and, for the latter two distinctions, some mentors did not neatly fall either way.

The yellow portion of the bar is mentors who did not neatly fall into either category.

Scholar demographics

Another way to measure the cohort's research portfolio is to look at the breakdown of scholar count assigned to each mentor.[3] Firstly, we can distinguish scholars by their mentors' research track:

This is somewhat more weighted towards interpretability and away from governance than our mentor count.
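The per-track scholar breakdown is a scholar-count-weighted version of the mentor breakdown. A minimal sketch of that aggregation (the mentor list below is hypothetical, not the actual Winter 2024-25 data):

```python
from collections import Counter

# Hypothetical (mentor_track, scholar_count) pairs -- not the actual data.
mentors = [
    ("interpretability", 3),
    ("oversight & control", 2),
    ("interpretability", 2),
    ("governance & strategy", 1),
]

# Each mentor contributes their scholar count to their track's total,
# so a track with a few large streams can outweigh one with many small ones.
scholars_by_track = Counter()
for track, n_scholars in mentors:
    scholars_by_track[track] += n_scholars

print(dict(scholars_by_track))
```

This is why the scholar breakdown can lean towards interpretability even though relatively few interpretability mentors were accepted: what matters here is scholars per track, not mentors per track.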

Another relevant factor is how many scholars each mentor has, shown in the histogram below. The median scholar will be working with a three-scholar mentor—that is, with two other scholars under the same mentor. Note that for the purpose of these statistics, if two mentors are co-mentoring some scholars, they are counted as one "mentor."

Numbers are provisional and subject to change. If a mentor has n scholars, they go in the bin between n and n+1. So, for example, there are 15 mentors with 2 scholars. Total scholar count will be lower than in this histogram, since mentors who have not yet determined the division of scholars between them were assigned more scholars in aggregate than they accepted.

This can be compared to the distribution of scholars per mentor in the Summer 2024 Program. In that program, the distribution was more concentrated: more scholars were working in streams of one or two scholars (the median scholar was working with a 2-scholar mentor, i.e. with only one other scholar), and there were fewer mentors with 3-5 scholars.
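The distinction between the median *mentor's* scholar count and the median *scholar's* stream size can be made concrete with a short sketch (the scholar counts below are hypothetical, not MATS's actual data):

```python
from statistics import median

# Hypothetical scholars-per-mentor counts (not the actual program data).
scholars_per_mentor = [1, 1, 2, 2, 3, 3, 3, 5]

# Median from the mentor's perspective: each mentor is one data point.
mentor_median = median(scholars_per_mentor)

# Median from the scholar's perspective: each *scholar* is one data point,
# so a 5-scholar stream contributes five copies of "5".
per_scholar = [n for n in scholars_per_mentor for _ in range(n)]
scholar_median = median(per_scholar)

print(mentor_median, scholar_median)  # 2.5 3
```

Because large streams contain more scholars, the scholar-weighted median is always at least as large as the mentor median; this is why "the median scholar works with a three-scholar mentor" can hold even when most mentors have fewer than three scholars.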

As with the mentors, we can also break down scholar assignments by their mentors’ research focus.

The yellow portion of the bar is scholars whose mentor did not neatly fall into either category.

Acknowledgements

This report was produced by the ML Alignment & Theory Scholars Program. Daniel Filan was the primary author of this report and Ryan Kidd scoped, managed, and edited the project. Huge thanks to the many people who volunteered to give their time to mentor scholars at MATS! We would also like to thank our 2024 donors, without whom MATS would not be possible: Open Philanthropy, Foresight Institute, the Survival and Flourishing Fund, the Long-Term Future Fund, Craig Falls, and several donors via Manifund.

  1. ^

    More precisely: when people applied to mentor, they answered the question “What is the average number of scholars you expect to accept?”. 223 (or more precisely, 222.6) is the sum of all applicants’ answers.

  2. ^

By “agency”, we mean modeling optimal agents, how those agents interact with each other, and how some agents can be aligned with each other. In practice, this covers cooperative AI, agent foundations, value learning, and "shard theory" work.

  3. ^

    Note that scholar counts are not yet finalized—some co-mentoring researchers have not yet assigned scholars between themselves. This means that the per-track numbers will be correct, since those mentors are all in the same track, but the statistics about number of scholars per mentor will not be precisely accurate.


