Published on April 28, 2025 7:04 PM GMT

tl;dr

This post is an update on the Proceedings of ILIAD, a conference journal for AI alignment research intended to bridge the gap between the Alignment Forum and academia. Following our successful first issue with 9 workshop papers from last year's ILIAD conference, we're launching a second issue in association with ILIAD 2: ODYSSEY. The conference is August 25-29, 2025 at Lighthaven in Berkeley, CA. Submissions to the Proceedings are open now and due June 25. Our goal is to support impactful, rapid, and readable research, carefully rationing scarce researcher time, using features like public submissions, partial anonymity, partial confidentiality, reviewer-written abstracts, reviewer compensation, and open licensing. We are soliciting community feedback and suggestions for reviewers and editorial board members.

Motivation

Prior to the deep learning explosion, much early work on AI alignment occurred at MIRI, the Alignment Forum, and LessWrong (and their predecessors). Although there is now vastly more alignment and safety work happening at ML conferences and inside industry labs, it's heavily slanted toward near-term concerns and ideas that are tractable with empirical techniques. This is partly for good reasons: we now have much more capable models which guide theory and allow extremely useful empirical testing.

However, conceptual, mathematically abstract, and long-term research on alignment still doesn't have a good home in traditional academic journals and conferences. Much of it is still done on the AI Alignment Forum and here on LessWrong, or is done informally (private discussion, Twitter, blogs, etc.) by academic researchers without a good venue for attracting the best constructive criticism.

As a result, there remains a gulf between more traditional academic work and much of the most important alignment work:

- Some traditional academics consider alignment work to be sloppy or unsophisticated, often re-inventing the wheel, neglecting prior academic literature, failing to engage with the strongest criticism, lapsing into hubris, and ignoring valuable academic norms.
- Some alignment researchers consider traditional academic work to be unacceptably slow, unwilling to confront the hard problems, incremental, credentialist, reluctant to publicly criticize misleading work, unable to abandon failed approaches, and prone to fetishizing the trappings of academia.

There is substantial truth in many of these criticisms, but they are often misaimed. There is value in the techniques and philosophy of both communities, and we would like to get the best of both worlds by building a central venue to bridge them.

There is currently nothing like an academic journal on alignment. Several scientific fields (e.g., game theory, cybernetics) were meaningfully accelerated or incubated by an initial conference followed up with a journal, so we decided to take this route. We were warned several times that a journal/proceedings is an enormous amount of work, but we're stubborn and decided to run a trial version at the first ILIAD conference (LW announcement), which took place August 28 - September 3, 2024 at Lighthaven.

Experience with first issue of Proceedings

We have just released 9 workshop papers in the first issue of Proceedings of ILIAD. First, the bad:

- We took way too long. It's been >8 months since ILIAD. This is silly, doubly so in a short-timeline world. Turnaround should be two months at most.
- We lacked experience with the process of reviewing.
- We randomly allocated submitters to do two reviews each. This means people often reviewed work outside their expertise.
- It was a lot of work, and since it wasn't our primary priority, this was an onerous burden.
- It was harder than expected to maintain a consistent standard.

And the good:

- There was a sufficient number of submissions (this was our main uncertainty).
- The overall quality was good.

Overall we were satisfied with this experiment and have decided to push forward. We have just opened up submissions for the second issue of the Proceedings in association with the second annual conference, ILIAD 2: ODYSSEY, taking place August 25-29, 2025 at Lighthaven. The soft deadline for submission is June 25th. We are continuing to experiment with mechanisms for running these proceedings, especially the review process. If we succeed in getting good community engagement and adding value, we may start an archival journal dedicated to AI alignment.

In the rest of this post we first describe our general philosophy and then how we expect that to cash out in terms of the design of the second issue of the Proceedings and a possible alignment journal.

General philosophy

We want to accelerate impactful research and make it more readable to other researchers. Our hope is to combine the best features of academic journals and internet forums.

An idealized traditional journal...

someone

On the other hand, an idealized internet forum...

- ...enables near-instant and near-frictionless dissemination of results.
- ...generates rapid feedback from forum comments. Review takes hours or days, not months.
- ...solicits feedback from whoever happens to be most interested, rather than guessing with editor-picked reviewers. This can get more engagement, and many eyes on a work can surface problems.
- ...allows updating of posts (“living”).

We want to combine all the above advantages as much as possible. We are inspired by the Distill Journal while keeping in mind its reasons for shutting down.

A central lens through which we think about all decisions is that, when it comes to the economics of academic discussion and review, the scarce resource is researcher time. Researcher time is best used when:

- Reviewers are matched with papers that interest them and for which they have the relevant expertise.
- Reviewers are not forced to read low-quality papers, and authors are not forced to respond to low-quality reviews.
- Neither authors nor reviewers waste time on papers that shouldn't be written in the first place: incremental, deceptive, or boring research.
- Reviews offer constructive feedback.
- The review process is rapid, and does not get derailed by bickering or minutiae.
- The public output of the review process is not a single bit (publish/reject), but rather a distillation of the reviewer's insight about a paper that can aid future readers.

Design of the second issue of the Proceedings

Some simple and relatively uncontroversial features:

- Rapid review
- Google-scholar indexing
- No public comments
- Manuscript submissions in any format
- Prior- and post-publication allowed
- Web-first formatting: Distill version of R Markdown
- Formatting assistance
- Limited mentorship

Here are some less-trivial ideas we are tentatively planning to use:

Publicly visible submissions

- Submitted manuscripts will be publicly posted while under review.
- Anyone may submit an unsolicited review to the editor. (See "self-nominated reviewer" below.) If constructive, editors will include these in the review process.

Dual abstracts

write the abstract they wish they could have read

Confidential and semi-anonymous review

OpenReview

Licensing

Creative Commons Attribution (CC-BY) 4.0

exact words and figures

Reviewer payments

- Authors of useful reviews will receive ~$200, and unusually excellent reviews will get double (~$400). (Quality is judged by the editor, and unsolicited reviews are eligible for compensation.) An additional ~$100 will go to the reviewer who writes the reviewer abstract.
- The amounts are subject to revision based on funding availability before the review process starts.
- Reviewers will be paid for positive and negative reviews alike.
- We are hoping that payments, though modest, will spur reviewers to review quickly, thoroughly, and professionally.
- We're of course cognizant of the various ways payments can negatively distort motivations. But we think it's worth trying.
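As a rough illustration of how the payment tiers above combine, here is a minimal Python sketch. The function name, the tier labels, and the treatment of the approximate ("~") amounts as exact dollar figures are assumptions made for this example only; the real amounts are subject to revision, and quality is judged by the editor.

```python
# Hypothetical sketch of the tentative reviewer payment schedule.
# The "~" amounts from the post are treated as exact, for illustration only.
BASE_PAYMENT = {"useful": 200, "excellent": 400}  # tier is assigned by the editor
ABSTRACT_BONUS = 100  # extra for the reviewer who writes the reviewer abstract


def reviewer_payment(quality, wrote_abstract=False):
    """Return the approximate payment in USD for one review.

    Whether the review is positive or negative is deliberately not an
    input: both are paid alike.
    """
    if quality not in BASE_PAYMENT:
        raise ValueError(f"unknown quality tier: {quality!r}")
    bonus = ABSTRACT_BONUS if wrote_abstract else 0
    return BASE_PAYMENT[quality] + bonus


print(reviewer_payment("useful"))                          # 200
print(reviewer_payment("excellent", wrote_abstract=True))  # 500
```

Under these assumed figures, a reviewer who writes a useful review plus the reviewer abstract would receive about $300, and one who writes an unusually excellent review plus the abstract about $500.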

Possible design for an alignment journal

Towards our ultimate goal of combining as many advantages of academic journals and internet forums as possible into one venue, here are some ideas we're strongly considering for an alignment journal (but not for the next issue of Proceedings):

- Living
- Re-review
- Test of time
- Self-nominated reviewer

The last bullet point is potentially the most powerful mechanism for obtaining most of the benefits of internet forums while retaining the essentials of peer review. In essence, the review process would be a conventional internet forum discussion, except that (1) participants are filtered for expertise, (2) the discussion is confidential and lightly moderated by the editor, and (3) the detailed results of the discussion are released as a reviewer abstract. Importantly, the moderation workload is greatly reduced compared to an open forum because the editor doesn’t have to monitor the comments in real time; new commenters are hidden by default until approved by the editor.

Asks for readers

We're seeking constructive criticism of the above ideas. Please also let us know:

email us

Acknowledgements

We thank Oliver Habryka for discussion.
