Communications of the ACM - Artificial Intelligence
Generative Artificial Intelligence Policies under the Microscope

This article examines the use of generative artificial intelligence (GenAI) in computer science (CS) scholarly writing and peer review, focusing on the GenAI usage policies of major CS conferences and the ACM, IEEE, and AAAI societies. It finds that although GenAI is becoming common in CS, policies vary considerably across conferences: policies for authors are relatively lenient, while policies for reviewers are stricter. The article also compares policy adoption across CS areas (AI, Interdisciplinary, Systems, and Theory) and stresses the importance of clear, ethical guidelines to promote fair and responsible GenAI practices in scholarly writing.

🤖 Many CS conferences have yet to establish GenAI usage policies, and existing policies differ in leniency, disclosure, and sanctions. Author policies are more common than reviewer policies, and some address code writing and documentation.

💡 Conferences in the AI area are the most active in adopting GenAI policies, followed by the Interdisciplinary area, with Systems gradually catching up. The Theory area has not adopted any such policies.

📝 Conferences are generally lenient toward authors' GenAI use, with average leniency ratings of 3.50 (Year 1) and 3.61 (Year 2), both above 3. Reviewer policies are stricter, with the average rating falling from 3.00 (Year 1) to 2.18 (Year 2).

📈 Overall, conferences are gradually introducing new GenAI policies: in Year 1, only 18 of 64 conferences (28.1%) had a GenAI policy; by Year 2, that number rose to 32 (50%).

Since the rise of ChatGPT, generative artificial intelligence (GenAI) technologies have gained widespread popularity, impacting academic research and everyday communication.5,10 While GenAI offers benefits in task automation,9 it can also be misused and abused for nefarious applications,7 with significant risks to long-tail populations.6 Professionals in fields such as journalism and law remain cautious due to concerns about hallucinations and ethical issues, but scholars in computer science (CS), the field where GenAI originated, appear to be cautiously yet actively exploring its use. For instance, Liang, W. et al.3 report greater use of large language models (LLMs) in CS scholarly articles (up to 17.5%) than in mathematics articles (up to 6.3%), and Liang, W. et al.2 report that between 6.5% and 16.9% of peer reviews at ICLR 2024, NeurIPS 2023, CoRL 2023, and EMNLP 2023 may have been altered by LLMs beyond minor revisions.

Considering researchers' increasing adoption of GenAI, it is crucial to establish usage policies that promote fair and ethical practices in scholarly writing and peer review. Some conferences, such as ICML 2023, have highlighted the confusion and questions surrounding GenAI use, including concerns about the novelty and ownership of generated content. Previous research examined the GenAI policies of major publishers such as Elsevier and Springer,5 but there is still no clear understanding of how CS conferences are adapting to this paradigm shift. Hence, this Opinion column provides a summarized overview of the scholarly-writing policies of CS conferences and major computing societies (ACM, IEEE, and AAAI), covering 2024 and 2025 where available and otherwise 2023 and 2024, and offers recommendations. Conferences studied in this analysis were selected based on CSRankings,a including leading conferences in each CS subfield. Our analysis shows that many CS conferences have not established GenAI policies, and those that have vary in leniency, disclosure, and sanctions. Policies for authors are more prevalent than policies for reviewers, and some address code writing and documentation. These policies are evolving, as demonstrated by ICML 2023, which initially prohibited LLM-generated text but later clarified that LLMs may be used to edit author-written content. By and large, however, adoption remains inconsistent across conferences, creating uncertainty about how the policies apply.

Table 1 lists the conferences considered, alongside their leniency ratings on a 5-point Likert scale. Three authors independently annotated the policies after agreeing on rating criteria. Given the ordinal nature of the ratings, we used Krippendorff’s alpha (α = 0.832, satisfactory) to assess interrater reliability. Final ratings were assigned through majority voting. Nevertheless, we acknowledge that variations in ratings may arise from the subjective nature of the task, largely attributable to the ambiguity in policy language.
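The column reports interrater reliability (Krippendorff's α = 0.832 on ordinal ratings, with majority-vote final ratings) but does not show the computation. A minimal, self-contained sketch, using hypothetical annotation data rather than the study's actual ratings, might look like:

```python
import statistics
from collections import Counter
from itertools import permutations

def krippendorff_alpha_ordinal(units):
    """Krippendorff's alpha with the ordinal distance metric.

    `units` is a list of rating lists, one per rated item (here: one per
    conference policy); each inner list holds the ratings that item received.
    """
    # Coincidence matrix: every ordered pair of ratings within a unit
    # contributes 1 / (m - 1), where m is the number of ratings in the unit.
    coincidence = Counter()
    for unit in units:
        m = len(unit)
        if m < 2:
            continue  # a single rating carries no agreement information
        for c, k in permutations(unit, 2):
            coincidence[(c, k)] += 1.0 / (m - 1)

    values = sorted({v for pair in coincidence for v in pair})
    n_c = {v: sum(coincidence[(v, k)] for k in values) for v in values}
    n = sum(n_c.values())

    def delta2(c, k):
        # Ordinal distance: marginal mass lying between the two ranks, squared.
        lo, hi = (c, k) if c <= k else (k, c)
        between = sum(n_c[g] for g in values if lo <= g <= hi)
        return (between - (n_c[lo] + n_c[hi]) / 2.0) ** 2

    d_observed = sum(w * delta2(c, k)
                     for (c, k), w in coincidence.items() if c != k)
    d_expected = sum(n_c[c] * n_c[k] * delta2(c, k)
                     for c in values for k in values if c != k) / (n - 1)
    return 1.0 - d_observed / d_expected

# Hypothetical data: three annotators rating five policies on the 1-5
# Likert scale (the actual study rated 64 conferences' policies).
annotations = [[3, 3, 3], [4, 4, 3], [2, 2, 2], [4, 4, 4], [1, 2, 1]]
alpha = krippendorff_alpha_ordinal(annotations)
final_ratings = [statistics.mode(unit) for unit in annotations]  # majority vote
```

Alpha equals 1.0 under perfect agreement and decreases (possibly below 0) as disagreement grows; the ordinal metric penalizes a 1-versus-5 disagreement far more than a 3-versus-4 one, which suits Likert-scale ratings.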

Table 1.  Leniency ratings on a 5-point Likert scale: "1"="Extremely restrictive" (restricts almost all types of use), "2"="Somewhat restrictive" (restricts most types of use, but allows a few, for example, grammar/spelling edits), "3"="Neither lenient nor restrictive" (allows some types of use, restricts others, for example, editing or polishing author-written content, possibly including rewrites such as changing writing style, summarizing, and so forth), "4"="Somewhat lenient" (allows most types of use with detailed reporting, for example, in text, figures, code, and so forth, with disclosures), "5"="Extremely lenient" (almost all types of use, no reporting). Conferences that adopted society-level policies for authors are marked with an asterisk (*) and those with no policies are marked "-". Ratings are listed in the order Year 1 (2023/2024) author, Year 1 reviewer, Year 2 (2024/2025) author, Year 2 reviewer, with absent policies omitted. Conference policies were initially evaluated between September 6–12, 2024, and subsequently reviewed and revised between December 12–17, 2024.

| Area | Conference/Society | Year(s) | Leniency rating(s) |
|---|---|---|---|
| Society (3) | AAAI | 2024 | 4, 2 |
| | ACM | 2024 | 4, 3 |
| | IEEE | 2024 | 4 |
| AI (13) | AAAI | 2024, 2025 | 3, 3 |
| | IJCAI | 2024, 2025 | 3, 3 |
| | CVPR | 2024, 2025 | 5, 3, 5, 2 |
| | ECCV | 2024 | 5, 3 |
| | ICCV | 2023 | 1 |
| | ICLR | 2024, 2025 | 3, 3, 3 |
| | ICML | 2024, 2025 | 3, 3 |
| | NeurIPS | 2023, 2024 | 4, 4 |
| | ACL | 2024, 2025 | 4, 4 |
| | EMNLP | 2023, 2024 | 4 |
| | WWW | 2024, 2025 | 4, 3 |
| | NAACL | 2025 | 4 |
| | SIGIR* | 2024, 2025 | 4 |
| Interdisciplinary (15) | SIGCSE* | 2024, 2025 | 4, 3 |
| | CHI | 2024, 2025 | 4, 3, 4, 3 |
| | UbiComp / Pervasive / IMWUT* | 2023, 2024 | 4, 2 |
| | UIST | 2024, 2025 | 1, 1 |
| | RSS | 2024, 2025 | 3, 3 |
| | VIS* | 2023, 2024 | 4, 2 |
| | VR | 2024, 2025 | 2, 2 |
| | ICRA | 2024, 2025 | 4, 1 |
| | IROS | 2024, 2025 | 4, 1 |
| | ISMB, RECOMB, SIGGRAPH, SIGGRAPH Asia, EC, WINE | 2023, 2024, 2025 | - |
| Systems (29) | SIGCOMM* | 2024, 2025 | 4, 4 |
| | SIGMOD* | 2024, 2025 | 4 |
| | DAC* | 2024, 2025 | 4, 4 |
| | HPDC | 2024, 2025 | 4, 3 |
| | SC | 2023, 2024 | 4, 3 |
| | IMC* | 2024, 2025 | 4, 3, 4, 3 |
| | FSE* | 2024, 2025 | 4, 4 |
| | ICSE* | 2024, 2025 | 4 |
| | NSDI | 2024, 2025 | 3 |
| | MobiSys* | 2024, 2025 | 4 |
| | ASPLOS, ISCA, MICRO, CCS, IEEE S&P ("Oakland"), USENIX Security, VLDB, ICCAD, EMSOFT, RTAS, ICS, MobiCom, SenSys, SIGMETRICS, OSDI, SOSP, PLDI, POPL, RTSS | 2023, 2024, 2025 | - |
| Theory (7) | FOCS, SODA, STOC, CRYPTO, EuroCrypt, CAV, LICS | 2023, 2024, 2025 | - |

Most conferences with GenAI policies were somewhat lenient toward authors (ratings of '3' or '4'). The 2025 author policy for WWW was rated '3' because it allowed rephrasing, and ICRA was rated '4' because it allowed many types of use with disclosure. Interdisciplinary conferences such as UIST and VR had lower leniency ratings of '1' (no AI-generated content allowed) and '2' (for example, allowing grammar/spelling edits), respectively. In contrast, several AI conferences, including CVPR, were highly permissive of GenAI use with a rating of '5' (that is, no restrictions), likely reflecting the AI field's greater familiarity with LLMs and their perceived benefits. In comparison, conferences adopted more restrictive policies for reviewers. In particular, ICRA, VIS, and IMC were rated '1', '2', and '3', respectively, likely stemming from concerns over content leakage and the limitations of LLMs in handling complex tasks.1

Area-Level Trends

We compared GenAI policy adoption trends across CS areas, following the classification from CSRankings: AI, Interdisciplinary, Systems, and Theory. For AI, 61.5% of conferences had GenAI policies for authors in Year 1, increasing to 92.3% in Year 2 (Figure 1a), indicating that conferences in the AI field are the most active in adopting GenAI policies for authors. Conferences in the Interdisciplinary area went from 26.7% in Year 1 to 60% in Year 2, making them the second most proactive in introducing GenAI policies for authors. While these two areas lead in author-policy adoption, the Systems area lagged behind in Year 1 (20.7%) but saw a 13.8 percentage point increase in Year 2 (34.5%), implying that it is gradually aligning with the overall trend. A similar pattern held for reviewers, with AI (Year 1: 7.7%, Year 2: 30.8%) and Interdisciplinary (Year 1: 6.7%, Year 2: 40%) leading adoption, while the Systems area stayed flat (Year 1: 3.4%, Year 2: 3.4%). In contrast, no conference in the Theory area had GenAI policies for authors or reviewers. This may be due to the conservative nature of the area or a lack of active GenAI usage in writing theory articles.

Figure 1.  Area-wise percentage (%) of conferences with GenAI policies for authors and reviewers in Years 1 and 2. The number in parentheses indicates the number of conferences per area.

In terms of author policies, the AI area shows higher leniency ratings than the overall average (see Figure 2a). The Systems area also shows high leniency ratings (Year 1: 4.00, Year 2: 3.70) compared with the overall average (Year 1: 3.50, Year 2: 3.61). In contrast, leniency ratings in the Interdisciplinary area are relatively lower (Year 1: 2.50, Year 2: 3.33). This may imply that, despite being more proactive in adopting policies than Systems, the Interdisciplinary area tends to be more cautious, imposing stricter guidelines on the use of GenAI.

Figure 2.  Area-wise average leniency ratings of conferences with GenAI policies for authors and reviewers in Years 1 and 2. The ratings were calculated based only on conferences that adopted GenAI policies for each year in each area. Theory field was excluded as those conferences did not adopt GenAI policies.

While conferences with GenAI policies tend to be more lenient toward authors, with an average leniency rating of 3.50 in Year 1 and 3.61 in Year 2 (both above 3), the policies for reviewers are notably more restrictive (see Figure 2b). The average leniency rating for reviewers dropped from 3.00 in Year 1 to 2.18 in Year 2, falling below 3, and no conference granted reviewers a leniency rating above 3 (see Table 1). This likely reflects concerns about the risks of using AI tools in the peer-review process, such as the possibility of leaking sensitive or unpublished information and the limited capability of LLMs on tasks requiring deep expertise and nuanced judgment.1 Since reviewers have access to confidential work, conferences may be more cautious to ensure data security, privacy, and the protection of intellectual property.

Temporal Trends

We examined the GenAI policies for two consecutive years, 2024 and 2025, and when unavailable, 2023 and 2024. Overall, conferences are moving toward introducing new GenAI policies. In Year 1, only 18 (28.1%) out of 64 conferences had GenAI policies, either for authors or reviewers, which increased to 32 (50%) in Year 2. Specifically, for authors (see Figure 1a), only 18 (28.1%) had GenAI policies in Year 1, increasing to 31 (48.4%) in Year 2, a 20.3 percentage point increase. In contrast, for reviewers (see Figure 1b), only 3 (4.7%) conferences had GenAI policies in Year 1, rising to 11 (17.2%) in Year 2. While many conferences are increasingly adopting and publicly sharing GenAI policies for authors, they are slower in providing clear guidelines for reviewers. Conferences may be unaware of reviewers’ needs or could be implementing GenAI policies for reviewers internally without making them publicly available. In either case, there is a noticeable disparity in how GenAI policies are adopted and communicated for authors versus reviewers.
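The adoption figures above are straightforward shares of the 64 conferences studied; a short sketch restating them also illustrates why the author-policy growth is reported in percentage points rather than percent:

```python
# Adoption counts reported above (N = 64 conferences studied).
N = 64

def pct(count: int, total: int = N) -> float:
    """Share of conferences, as a percentage."""
    return 100.0 * count / total

authors_y1, authors_y2 = 18, 31     # conferences with author policies
reviewers_y1, reviewers_y2 = 3, 11  # conferences with reviewer policies

print(f"Authors, Year 1:   {pct(authors_y1):.1f}%")   # 28.1%
print(f"Authors, Year 2:   {pct(authors_y2):.1f}%")   # 48.4%
# The growth is a difference of two percentages: percentage points (pp),
# not a 20.3% relative increase.
print(f"Growth: {pct(authors_y2) - pct(authors_y1):.1f} pp")  # 20.3 pp
print(f"Reviewers, Year 2: {pct(reviewers_y2):.1f}%")  # 17.2%
```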

Several conferences adopted GenAI policies in the latter year, including SIGMOD and EMNLP for authors, and SIGCSE and VIS for both authors and reviewers. Once a policy was established, a similar degree of leniency was maintained in the latter year, with a few exceptions (for example, SC shifted to a more restrictive policy in 2024). These findings suggest a cautious, evolving approach to GenAI policies, with some conferences adopting clearer guidelines as GenAI use increases.

Society Versus Conference-Level Trends

We reviewed the GenAI policies of ACM, IEEE, and AAAI, as many CS conferences are affiliated with these societies. While conference-specific policies differ in disclosure requirements and flexibility, society-level policies permit authors' GenAI use with disclosure. ACM and AAAI also permit reviewers to enhance their reviews using GenAI, as long as the submissions remain unexposed to these systems. In contrast, IEEE has no policies for conference reviewers. None of these policies mention sanctions for authors or reviewers. As Table 1 summarizes, some conferences (for example, SIGCSE, VIS, and IMC) do not have conference-specific GenAI policies for authors and instead refer to the society-level policies, while others, including CVPR (IEEE) and UIST (ACM), have established their own. Still, many conferences appear unaware of society-level GenAI policies, as their websites do not mention them. For reviewers, a few conferences adhere to society-level policies (for example, IMC), while many follow their own (for example, CVPR).

Recommendations

Among the 64 conferences analyzed, 32 (50%) adopted GenAI policies for authors or reviewers within two years, but only 11 (17.2%) addressed reviewers. The Theory and Systems areas lagged behind, with no Theory conferences and only 34.5% of Systems conferences adopting author policies by Year 2, compared with 92.3% in AI and 60% in Interdisciplinary. Notably, 14 of the 32 policies emerged in Year 2, and no conference allowed GenAI authorship. Many ACM and IEEE conferences appear unaware of society-level policies, showing inconsistent adoption. Policies also lack sanctions for non-compliance, which are essential to enforce rules and prevent misuse.

A notable gap exists in reviewer GenAI policies, perhaps stemming from concerns about exposing sensitive information. Yet this risk can be mitigated by configuring LLM settings so that models neither retain nor learn from user-provided data.4 LLM services such as OpenAI's ChatGPT and Google's Gemini allow users to opt out of model training, providing added safeguards.4 Moreover, providers offer enterprise versions with data-security guarantees (for example, ChatGPTb and Anthropic's Claude 3c) or accommodate privacy requests.d With the rise of GenAI, conferences without policies should establish guidelines for authors and reviewers, including for evaluating AI use in submissions. In addition, GenAI policies at the area or society level would ensure consistency, benefiting both authors and reviewers, especially for resubmissions. Finally, implementing professional development initiatives, training programs, and balanced enforcement strategies can promote responsible AI use.4

GenAI is a transformative technology that enhances efficiency across disciplines, and fields that fail to adapt risk falling behind. We recommend lenient use of GenAI in conferences, especially to support non-native English speakers. GenAI should enhance, not alter, the author's work,4 while reviewers' judgments must remain independent. Both authors and reviewers should disclose GenAI use transparently and take full responsibility for their content. Ultimately, scholars must verify content for accuracy and ethical compliance, as GenAI tools are imperfect and prone to hallucinations.8
