cs.AI updates on arXiv.org, Jul 25, 12:28
Optimising Call Centre Operations using Reinforcement Learning: Value Iteration versus Proximal Policy Optimisation

This paper studies reinforcement learning for optimising call routing in call centres, aiming to reduce client waiting time and staff idle time. It compares a model-based Value Iteration approach with a model-free approach that learns from experience, evaluating random, Value Iteration, and Proximal Policy Optimisation policies in a simulation model; the PPO policy performed best in testing.

arXiv:2507.18398v1 Announce Type: new Abstract: This paper investigates the application of Reinforcement Learning (RL) to optimise call routing in call centres, minimising client waiting time and staff idle time. Two methods are compared: a model-based approach using Value Iteration (VI) under known system dynamics, and a model-free approach using Proximal Policy Optimisation (PPO) that learns from experience. The model-based approach uses a theoretical model, while a simulation model combining Discrete Event Simulation (DES) with the OpenAI Gym environment is developed for model-free learning. Both models frame the problem as a Markov Decision Process (MDP) within a Skills-Based Routing (SBR) framework, with Poisson client arrivals and exponentially distributed service and abandonment times. Random, VI, and PPO policies are then evaluated using the simulation model. Across 1,000 test episodes, PPO consistently achieves the highest rewards, along with the lowest client waiting time and staff idle time, despite requiring longer training time.
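As a concrete illustration of the kind of environment the abstract describes, the sketch below wraps a toy routing queue in the Gymnasium API (the maintained successor to OpenAI Gym). The class name CallRoutingEnv, the rates, the reward weights, and the single-skill simplification are illustrative assumptions, not the paper's actual simulation model, which also includes abandonment times.

```python
# Minimal sketch of a DES-style call-routing environment in the Gymnasium API.
# All names and parameters here are illustrative assumptions, not the paper's code.
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class CallRoutingEnv(gym.Env):
    """Route each arriving call to one of n_agents servers; the reward
    penalises client waiting time and staff idle time, as in the paper."""

    def __init__(self, n_agents=3, arrival_rate=1.0, service_rate=0.5,
                 horizon=200):
        super().__init__()
        self.n_agents = n_agents
        self.arrival_rate = arrival_rate   # Poisson arrival intensity
        self.service_rate = service_rate   # exponential service intensity
        self.horizon = horizon
        # Observation: residual busy time of each agent.
        self.observation_space = spaces.Box(
            low=0.0, high=np.inf, shape=(n_agents,), dtype=np.float64)
        # Action: index of the agent to route the new call to.
        self.action_space = spaces.Discrete(n_agents)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.busy = np.zeros(self.n_agents)  # residual work per agent
        self.t = 0
        return self.busy.copy(), {}

    def step(self, action):
        # Discrete-event step: jump the clock to the next Poisson arrival.
        dt = self.np_random.exponential(1.0 / self.arrival_rate)
        self.busy = np.maximum(self.busy - dt, 0.0)
        wait = self.busy[action]                # the new call waits this long
        idle = float(np.sum(self.busy == 0.0))  # agents with nothing to do
        self.busy[action] += self.np_random.exponential(1.0 / self.service_rate)
        reward = -(wait + 0.1 * idle)           # trade off waiting vs. idling
        self.t += 1
        return self.busy.copy(), reward, self.t >= self.horizon, False, {}
```

A model-free policy can then be trained on such an environment with an off-the-shelf PPO implementation, e.g. PPO("MlpPolicy", CallRoutingEnv()).learn(100_000) from Stable-Baselines3; whether this matches the authors' training setup is an assumption.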

Related tags

Reinforcement Learning · Call Centres · Routing Optimisation · Model Evaluation · PPO Policy