cs.AI updates on arXiv.org 07月29日 12:22
CrossPL: Evaluating Large Language Models on Cross Programming Language Code Generation
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

本文介绍CrossPL,首个评估大型语言模型生成跨语言互操作代码能力的基准。通过分析多语言GitHub仓库,构建包含1982个任务的基准,涵盖六种编程语言和七种跨语言技术。评估结果显示,现有模型在跨语言场景中表现不佳,需要更多针对性研究。

arXiv:2507.19904v1 Announce Type: cross Abstract: As large language models (LLMs) become increasingly embedded in software engineering workflows, a critical capability remains underexplored: generating correct code that enables cross-programming-language (CPL) interoperability. This skill is essential for building complex systems that integrate components written in multiple languages via mechanisms like inter-process communication (IPC). To bridge this gap, we present CrossPL, the first benchmark designed to systematically evaluate LLMs' ability to generate CPL-interoperating code. CrossPL comprises 1,982 tasks centered around IPC, covering six widely-used programming languages and seven representative CPL techniques. We construct this benchmark by (i) analyzing 19,169 multi-language GitHub repositories using 156 hand-crafted finite state machines (FSMs), and (ii) developing an LLM-based pipeline that automatically extracts CPL code snippets, generates task instructions, and validates functional correctness. We evaluate 14 state-of-the-art general-purpose LLMs and 6 code-oriented LLMs released in the past three years on CrossPL via FSM-based validation. Results reveal that even the best-performing models struggle with CPL scenarios, underscoring the need for more targeted research in this space. Our benchmark and code are available at: https://anonymous.4open.science/r/crosspl-2814.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

大型语言模型 跨语言编程 基准测试
相关文章