cs.AI updates on arXiv.org 07月25日 12:28
A Differentiated Reward Method for Reinforcement Learning based Multi-Vehicle Cooperative Decision-Making Algorithms
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

本文提出基于稳态转移系统的差异化奖励方法,将交通流特性分析引入奖励设计,以优化多车协同决策中的动作选择和政策学习,在RL算法中验证了该方法的有效性。

arXiv:2502.00352v2 Announce Type: replace Abstract: Reinforcement learning (RL) shows great potential for optimizing multi-vehicle cooperative driving strategies through the state-action-reward feedback loop, but it still faces challenges such as low sample efficiency. This paper proposes a differentiated reward method based on steady-state transition systems, which incorporates state transition gradient information into the reward design by analyzing traffic flow characteristics, aiming to optimize action selection and policy learning in multi-vehicle cooperative decision-making. The performance of the proposed method is validated in RL algorithms such as MAPPO, MADQN, and QMIX under varying autonomous vehicle penetration. The results show that the differentiated reward method significantly accelerates training convergence and outperforms centering reward and others in terms of traffic efficiency, safety, and action rationality. Additionally, the method demonstrates strong scalability and environmental adaptability, providing a novel approach for multi-agent cooperative decision-making in complex traffic scenarios.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

强化学习 多车协同 驾驶策略 奖励设计 交通流
相关文章