Communications of the ACM - Artificial Intelligence, March 7
The Challenge of Consistency in Generative AI: Will We Adapt or Fix the System?

 

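Generative AI shows strong capabilities in academic research, but its output is also uncertain. Taking ScholarGPT as an example, its definition of "stressor" differs slightly each time. This variation in wording is an advantage for summarization and explanation, but it poses a challenge in fields that need precise terminology, such as law, medicine, and academic writing. The article proposes that, rather than expecting GenAI to behave like a static database, we should embrace its fluidity while staying vigilant, verifying outputs, and relying on peer-reviewed literature. Researchers should interpret AI output cautiously and treat the system as an evolving conversational partner rather than a static reference. Whether we adapt to AI or push for more structured AI development will shape the future of human-AI collaboration.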

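🧠 Generative AI offers varied information but also introduces uncertainty in definitions. For example, GenAI explains "stressor" slightly differently each time, which poses a challenge in fields that need precise terminology, such as academic writing.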

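📚 Data provenance is one proposed answer to the GenAI reliability problem. Embedding citations and sources in AI-generated content makes it possible to trace information back to reliable references, which improves the reliability of that content.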

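🤝 Researchers should treat GenAI as an evolving conversational partner rather than a static reference. In practice this means staying vigilant, verifying outputs, and relying on peer-reviewed literature as the final source for citations and definitions. Whether we adapt to AI or push for more structured AI development will shape the future of human-AI collaboration.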

We are all familiar with the idiom: “Doing the same thing and expecting a different result: this is the definition of insanity.” [1] While the Oxford English Dictionary defines insanity as “a state of mind that impedes the ability to think, reason, or behave in ways that are considered normal,” [2] the sentiment behind the idiom is widely understood. Yet, paradoxically, this is precisely how interactions with generative AI (GenAI) unfold: when we ask the same question multiple times, we get different results.

For example, when a GenAI system like ScholarGPT is asked to define the term stressor (that is, a condition that causes stress [3]), it produces a different definition with each iteration. GenAI defines the term first as a stimulus, event, or condition, then as a demand, event, or circumstance, and finally as anything that causes stress. While these paraphrased responses differ only subtly, they can carry distinct conceptual implications, particularly in the common scholarly use case [4] of defining central concepts, where linguistic precision matters [5].

This observation highlights both the strengths and weaknesses of GenAI. On the one hand, the ability to rephrase and recombine information is one of its most intriguing features, making it valuable for tasks like summarization and explanation [6]. On the other hand, this variability poses challenges in disciplines where exact terminology is crucial, such as law, medicine, and academic writing. Moreover, this challenge seems to stem from an inherent property of generative AI, which generates responses through probabilistic recombination rather than deterministic retrieval [7, 8].
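The contrast between probabilistic recombination and deterministic retrieval can be illustrated with a toy sketch. The candidate phrasings are the three paraphrases from the stressor example; the weights, the function names `retrieve` and `generate`, and the seed are illustrative assumptions and do not come from any real model.

```python
import random

# Candidate definitions of "stressor", taken from the example in the text.
# The weights are made-up illustrative probabilities, not model outputs.
CANDIDATES = {
    "a stimulus, event, or condition": 0.40,
    "a demand, event, or circumstance": 0.35,
    "anything that causes stress": 0.25,
}


def retrieve(candidates: dict[str, float]) -> str:
    """Deterministic retrieval: always return the highest-weighted entry."""
    return max(candidates, key=candidates.get)


def generate(candidates: dict[str, float], rng: random.Random) -> str:
    """Probabilistic generation: sample one entry in proportion to its weight."""
    phrases = list(candidates)
    weights = list(candidates.values())
    return rng.choices(phrases, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # hypothetical seed, fixed only for repeatability
    # Repeated retrieval yields a single distinct answer.
    print("retrieve:", {retrieve(CANDIDATES) for _ in range(5)})
    # Repeated sampling typically yields several distinct answers.
    print("generate:", {generate(CANDIDATES, rng) for _ in range(20)})
```

Under these assumptions, asking `retrieve` the same question repeatedly gives one answer, while `generate` moves among the paraphrases, which mirrors the behavior described above in miniature.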

However, this also raises the question: How can researchers rely on GenAI to assist with literature reviews, definitions, or conceptual frameworks? One proposed solution is assurance through data provenance [9], that is, embedding citations and sources within AI-generated content so that information can be traced back to reliable references. Some emerging AI models attempt to do this, but it remains difficult given that large language models encode the recombinatorial logic of digital innovation [10, 11]. Instead of expecting GenAI to behave like a static database, we might need to embrace its fluid, generative nature while remaining vigilant about verification: final citations and definitions should come only from peer-reviewed literature and should be checked against the original manuscript.
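As a rough illustration of the provenance idea, the sketch below attaches source keys to each generated statement and flags any statement that cannot be traced to a vetted reference list. The `Statement` class, the `unverified` helper, the reference keys, and the example sentences are all hypothetical; this is one possible shape for such a check, not a description of how any existing system implements provenance.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """One AI-generated sentence plus the sources it claims to rest on."""
    text: str
    sources: list[str] = field(default_factory=list)  # hypothetical source keys


def unverified(statements: list[Statement], trusted: set[str]) -> list[Statement]:
    """Return statements with no source, or with any source outside `trusted`."""
    return [
        s for s in statements
        if not s.sources or any(src not in trusted for src in s.sources)
    ]


# Hypothetical keys standing in for a vetted, peer-reviewed reference list.
TRUSTED = {"lepine2005", "wacker2004"}

# Made-up draft sentences, used only to exercise the check.
draft = [
    Statement("A stressor is a condition that causes stress.", ["lepine2005"]),
    Statement("Stressors always reduce performance.", []),
    Statement("Definitions need formal rules.", ["someblog2023"]),
]

for s in unverified(draft, TRUSTED):
    print("needs manual check:", s.text)
```

A check like this only confirms that a citation is present and on the trusted list. Whether the cited work actually supports the statement still requires reading the original manuscript.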

Ultimately, as GenAI becomes an integral part of research and knowledge work, researchers must become more cautious when interpreting its outputs. AI systems are not static reference materials but evolving conversational partners: valuable, but requiring careful oversight. Whether we adapt to them or push for more structured AI development will shape the next phase of human-AI collaboration [12, 13].

References

    1. Brown, R.M. Sudden Death (Bantam Books, 1983).
    2. Oxford University Press, Insanity. Oxford English Dictionary (2024).
    3. Lepine, J.A., Podsakoff, N.P., and Lepine, M.A. A Meta-Analytic Test of the Challenge Stressor–Hindrance Stressor Framework: An Explanation for Inconsistent Relationships Among Stressors and Performance. Academy of Management Journal 48, 764–775 (2005).
    4. Wacker, J.G. A theory of formal conceptual definitions: Developing theory-building measurement instruments. Journal of Operations Management 22, 629–650 (2004).
    5. Schwartz, R.M. and Raphael, T.E. Concept of Definition: A Key to Improving Students’ Vocabulary. Reading Teacher 39 (2), 198–205 (1985).
    6. Dwivedi, Y.K. et al. ‘So what if ChatGPT wrote it?’ Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy. Int J Inf Manage 71 (2023).
    7. Ding, N. et al. Parameter-efficient fine-tuning of large-scale pre-trained language models. Nat Mach Intell 5, 220–235 (2023).
    8. Henfridsson, O. and Bygstad, B. The Generative Mechanisms of Digital Infrastructure Evolution. MIS Quarterly 37, 907–931 (2013).
    9. Werder, K., Ramesh, B., and Zhang, R. Establishing data provenance for responsible artificial intelligence systems. ACM Trans Manag Inf Syst 13, 1–23 (2022).
    10. Yoo, Y., Henfridsson, O., and Lyytinen, K. Research Commentary: The New Organizing Logic of Digital Innovation: An Agenda for Information Systems Research. Information Systems Research 21, 724–735 (2010).
    11. Baiyere, A., Grover, V., Lyytinen, K.J., Woerner, S., and Gupta, A. Digital “x”: Charting a Path for Digital-Themed Research. Information Systems Research 34, 463 (2023).
    12. Hillebrand, L., Raisch, S., and Schad, J. Managing with Artificial Intelligence: An Integrative Framework. Academy of Management Annals (2025). https://doi.org/10.5465/annals.2022.0072
    13. Rahwan, I. et al. Machine behaviour. Nature 568, 477–486 (2019).

Karl Werder is an associate professor in the Section Digital Business Innovation, IT University of Copenhagen, Denmark. His research interests focus on systems development for performance, artificial intelligence for decision making, and organizing for digital innovation.
