Epoch: What is Epoch?

 

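Epoch AI is a nonprofit research organization dedicated to improving society's understanding of the trajectory of AI. By collecting and analyzing data on AI trends, developing benchmarks that measure AI capabilities, and providing independent model evaluations, Epoch AI aims to ground decisions about AI in the best possible evidence. It shares its research publicly, reaching a broad audience that includes AI policy experts, journalists, and developers. Epoch AI also works with government agencies and AI companies, providing consultations and commissioned research, to deepen understanding of AI and support better-informed decisions about it.

📊 Epoch AI's core task is to understand the present and future of AI. It tracks AI models, hardware, and supercomputers to build and analyze data on AI trends. This data provides key information for decisions in investment, governance, and other areas, and aims to close the information gap between those who hold information about AI development and the public.

💡 Epoch AI develops benchmarks to measure advanced AI capabilities. Its best-known work is FrontierMath, a benchmark commissioned by OpenAI to assess AI's ability to solve complex reasoning problems. Epoch AI is committed to improving the quality and transparency of its benchmarking work and, for future benchmarks, to providing AI companies with equitable access.

🔍 Epoch AI provides independent evaluations of publicly available models. It maintains a public dashboard of benchmark evaluations that aims to show trends in AI capabilities. It is also considering pre-release evaluation work with AI labs to build its evaluation expertise, while remaining wary of confidentiality restrictions that would cut against its mission.

🤝 Epoch AI partners with organizations, including large AI companies and government agencies, providing consultations and commissioned research. These partnerships help it learn more about AI and give partners a deeper understanding of AI, supporting better-informed strategic decisions. Epoch AI maintains a high level of transparency in these partnerships and reinvests profits in its mission.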

Published on June 27, 2025 4:45 PM GMT

Our director explains Epoch AI’s mission and how we decide our priorities. In short, we work on projects to understand the trajectory of AI, share this knowledge publicly, and inform important decisions about AI.


Since we started Epoch three years ago, we have engaged in hundreds of projects and reached a wide audience. Yet, one question I often get asked is, ‘What is Epoch?’

In a way, this is an easy question to answer. We are a nonprofit research organization with the mission of improving society’s understanding of the trajectory of AI. Simply put, we are doing what we can so that decisions about AI are informed by the best possible evidence.

To achieve this, we are curating data and conducting high-quality research into some of the most significant trends in AI. We share most of this work publicly, aimed at a broad audience, including AI policy experts, journalists and AI developers. Importantly, we are committed to always sharing what the data says, rather than tailoring it to fit a narrative.

We work on this mission because we believe that if we all collectively know more about AI, we will make better decisions on average. I will not agree with all the decisions that our work will inform — but I believe that we can have a smarter conversation about AI if it is grounded in data, and I am pleased with the level of success that we have achieved in this mission.

However, while helpful, this brief description misses many nuances of our culture and ethos. In this post, I will expand on what we do, why we do it, and what we are not. My goal is to let you know more about how we make decisions at Epoch, so you can better understand our motivations.

What we do

Our primary focus is on whatever we believe will be most helpful in understanding the present and future of AI. We are committed to sharing this knowledge publicly, because we think doing so will help inform important decisions society will make in the coming years regarding AI.

AI is a rapidly evolving field, and our mission requires us to quickly adapt to trends and opportunities. This naturally leads us to change our focus fairly often, making it hard for outsiders to understand our priorities.

Below, I talk about the projects we have chosen to work on. This is not meant to be an update on our projects – instead, I want to focus on why we chose to work on what we did. This will give you a better understanding of how we choose what to work on.

We curate and analyze data on AI trends

Epoch AI began as an effort to curate and organize data on AI models, exemplified by our most cited paper to date, Compute Trends Across Three Eras of Machine Learning.

Today, we run a full-on open-source intelligence program, tracking AI models, hardware and supercomputers. To help interpret this data, we release frequent short analyses that illustrate important trends in AI.

We run this service both as a public good, funded by our revenue and philanthropic funding, and as a paid service. For example, UK ARIA has previously commissioned work on compute trends, and we analyzed biological sequence models for Sentinel Bio. For this work, we delivered brief, private reports to our clients to address their strategic questions, and publicly released the data we collected through the course of the project to advance our mission.

The rationale behind this data work is straightforward: access to up-to-date data on AI trends is crucial for decisions in investments, governance and elsewhere. By default, most information about AI will be held in the hands of a few individuals closely working in the technology, with most other stakeholders (e.g., policymakers, stakeholders from other industries, journalists, and researchers outside AI companies) having access to only a limited and delayed view of AI’s development. Epoch strives to improve this situation, so that more people have a clear strategic picture of AI.

We develop benchmarks to measure advanced AI capabilities

Our best-known work to date is FrontierMath — a private benchmark commissioned by OpenAI to measure advanced AI math capabilities. We have also run some other pilots, including benchmarks related to software engineering and remote work, and are currently exploring partnerships to develop more such benchmarks.

Our main criterion for deciding whether to work on a benchmark is whether it will improve the public’s understanding of AI’s trajectory. We chose to work on FrontierMath because we thought that a challenging math benchmark would clarify the degree to which AI is capable of solving novel and difficult reasoning problems. This bet has paid off, as FrontierMath has become a popular measure of AI capabilities.

We are committed to improving the quality and transparency of our benchmarking work. OpenAI is the only AI company with access to FrontierMath (and the upcoming Tier 4 of this benchmark, an even more difficult set of problems), which has diminished confidence in FrontierMath results for OpenAI models. For future benchmarks we develop, we are committed to retaining benchmark ownership and providing equitable access to AI companies (for example, by releasing benchmarks publicly or providing structured access to any company for a fee). We are also committed to proactive transparency — during FrontierMath, we made the mistake of only disclosing the OpenAI funding relationship to the contributors who explicitly asked. Going forward, we plan to inform all contributors proactively and not release a benchmark without clearly disclosing the funding parties.

We provide independent evaluations of AI models

Another important aspect of our work to measure AI capabilities is our model evaluation work. We maintain a public dashboard with benchmark evaluations of publicly available models. This dashboard aims to illustrate trends in AI capabilities, and we have been pleased with the positive reception of this work. We intend to continue releasing evaluations of notable and publicly available models, at least when technically feasible.

We are currently considering whether to pursue pre-release evaluation work with AI labs. We would see this as a good opportunity to improve our evaluation expertise and work closely with AI companies, but we are concerned about NDAs preventing us from sharing what we learn, which would cut against our mission.

We provide consultations and commissioned research

As part of our work, we routinely partner with other organizations that either work on AI or are affected by AI. These include large AI companies (e.g. Google), government agencies (e.g. the UK Department for Science, Innovation and Technology), organizations working in adjacent sectors (e.g. the Electric Power Research Institute), and companies that are affected by AI (including hardware, energy and investment firms, consultancies and others).

The services we offer our partners cover the types of work described above (data collection, benchmark development, and model evaluations), as well as consulting and commissioned research. Our work with partners includes, for example, investigations into the power demand for AI, summaries of our work on AI scaling, and novel research into aggregating results from different benchmarks to measure AI capabilities.

Partnerships subsidise our public-facing work, allow us to identify projects that are more urgent and relevant, and provide feedback that improves our outputs. We also aim to deliver value to our clients, providing them with a deeper understanding of AI through consultations and private reports that explain how our work applies to their strategic situation.

In choosing who to work with, we ask ourselves the following questions:

Throughout these partnerships, we are committed to a high level of transparency — while acknowledging that we cannot, for example, divulge details of upcoming regulation or company secrets. A list of partners we’ve worked with is available on our website.

What we are not

Another way of understanding Epoch is to compare us to organizations doing similar work, like Our World In Data (which curates and shares data about important world trends), the AI Index (which publishes yearly compilations of data on AI) and Artificial Analysis (an independent AI benchmarking & analysis company).

However, none of them is quite the right analogue. To my knowledge, there are simply no other organizations in our niche. Instead, I can discuss what we are not.

We are not an AI development company

Many of our research projects may help advance the state of the art in artificial intelligence. We partnered with OpenAI to create the best math AI benchmark today. We have gone to great lengths to study bottlenecks to AI scaling. And we have advanced research on AI scaling laws.

However, our goal is not to contribute to AI progress per se. In choosing to work on these projects, we are prioritizing our mission of improving societal understanding of the trajectory of AI.

Our staff, and the AI community more broadly, is split on whether advancing AI will ultimately benefit society. As an organization, we are decidedly neutral on this question. We will continue working on projects that advance (or slow down) AI, as long as their primary purpose is to advance the public understanding of AI.

We are not an AI policy think tank

We have partnered with government agencies worldwide, such as the UK Department for Science, Innovation and Technology. And we do see it as an important part of our mission to inform governments of the state of the art in AI, so they can enact wiser policies.

However, we do not push for any particular stance on AI policy as an organisation. While our staff have their own (diverse) opinions on how AI should be handled, we see Epoch as providing a unique service of informing everyone with trustworthy data and evidence about AI, without pushing for one agenda.

We might point out the consequences of a policy, while adhering to our usual standards of rigour and transparency. For example, we might publish estimates on the number of developers affected by a compute threshold regulation, and point out that keeping the scope limited will require elevating the threshold. And our staff members are encouraged to offer their personal opinions. But we won’t outright make official policy recommendations.

We are not a company incubator

Epoch employs a team of AI experts with a variety of opinions about AI. Some of our staff have gone on to create new organizations, premised on very different beliefs about what should be done about AI.

For instance, Epoch co-founder Marius Hobbhahn left to create Apollo Research — an AI safety organization focused on scheming, evaluations, control and AI governance. More recently, Epoch co-founder and associate director Tamay Besiroglu and three other employees left to create Mechanize — a start-up focused on creating the data to enable AI automation of work tasks.

Incubating these initiatives is not the purpose of Epoch AI, but we see it as natural for our staff to sometimes grow in directions beyond our organization. We are excited to host people on our staff with different beliefs about AI, and work with them for as long as their paths are aligned with our mission, then part ways when it no longer makes sense to continue our work together.

In cases where our staff want to monetize work they did while at Epoch, we might negotiate deals with them at a fair price, prioritising our mission.

Closing words

When we started Epoch AI three years ago, we saw an important role to be filled. We foresaw AI taking off, and the stakes growing. We knew that understanding what was happening would be key to good decisions. And yet, little public analysis existed documenting these trends.

We bet we could become the organization to fill that gap, and I am pleased with how far we’ve come. We will continue striving toward this mission: producing quality analysis on the trajectory of AI, so we can all make better decisions regarding this very important technology.


