Online Learning with Probing for Sequential User-Centric Selection

cs.AI updates on arXiv.org, July 29, 12:22

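This paper proposes PUCS, a framework for sequential decision-making with information acquisition. It presents algorithms for the offline and online settings together with a regret lower bound, and validates them experimentally.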

arXiv:2507.20112v1 Announce Type: cross Abstract: We formalize sequential decision-making with information acquisition as the probing-augmented user-centric selection (PUCS) framework, where a learner first probes a subset of arms to obtain side information on resources and rewards, and then assigns $K$ plays to $M$ arms. PUCS covers applications such as ridesharing, wireless scheduling, and content recommendation, in which both resources and payoffs are initially unknown and probing is costly. For the offline setting with known distributions, we present a greedy probing algorithm with a constant-factor approximation guarantee $\zeta = (e-1)/(2e-1)$. For the online setting with unknown distributions, we introduce OLPA, a stochastic combinatorial bandit algorithm that achieves a regret bound $\mathcal{O}(\sqrt{T} + \ln^{2} T)$. We also prove a lower bound $\Omega(\sqrt{T})$, showing that the upper bound is tight up to logarithmic factors. Experiments on real-world data demonstrate the effectiveness of our solutions.
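
The abstract names a greedy probing algorithm for the offline setting but does not spell it out. The following is only a minimal sketch of the general idea under strong simplifying assumptions that are not taken from the paper: each arm has a known two-point reward distribution, probing an arm reveals its realized reward, each of the $K$ plays goes to a distinct arm, and every probe has the same fixed cost. The names `expected_value`, `greedy_probe`, and `ZETA` are hypothetical and chosen for illustration.

```python
from itertools import product
from math import e

# Approximation factor quoted in the abstract: (e - 1) / (2e - 1), roughly 0.387.
ZETA = (e - 1) / (2 * e - 1)


def mean(arm):
    """Expected reward of an arm given as (low, high, p_high). Toy model, assumed."""
    lo, hi, p = arm
    return lo * (1 - p) + hi * p


def expected_value(arms, probed, k):
    """Expected total reward of the best k arms after probing the set `probed`.

    Probed arms contribute their realized reward; unprobed arms contribute
    their mean. The expectation is computed by exact enumeration, so this is
    only practical for small probing sets.
    """
    probed = sorted(probed)
    means = [mean(a) for a in arms]
    total = 0.0
    for outcome in product((0, 1), repeat=len(probed)):
        prob = 1.0
        values = list(means)
        for i, bit in zip(probed, outcome):
            lo, hi, p = arms[i]
            prob *= p if bit else (1 - p)
            values[i] = hi if bit else lo
        total += prob * sum(sorted(values, reverse=True)[:k])
    return total


def greedy_probe(arms, k, budget, cost=0.0):
    """Greedily add the arm with the largest marginal gain net of probing cost.

    Stops when the budget is used up or no arm has a positive net gain.
    Returns the probing set and the gross expected reward (cost not deducted).
    """
    probed = set()
    current = expected_value(arms, probed, k)
    while len(probed) < budget:
        best_arm, best_gain = None, 0.0
        for i in range(len(arms)):
            if i in probed:
                continue
            gain = expected_value(arms, probed | {i}, k) - current - cost
            if gain > best_gain + 1e-12:
                best_arm, best_gain = i, gain
        if best_arm is None:
            break
        probed.add(best_arm)
        current += best_gain + cost
    return probed, current


if __name__ == "__main__":
    # Three arms with equal means but different spread; one play, one probe.
    toy_arms = [(0.0, 1.0, 0.5), (0.5, 0.5, 1.0), (0.4, 0.6, 0.5)]
    print(greedy_probe(toy_arms, k=1, budget=1))
    print(round(ZETA, 4))
```

In this toy instance all three arms have the same mean, so probing only helps for the arm whose outcome could change the assignment the most, which is the high-variance arm. Whether the paper's greedy rule uses the same marginal-gain criterion is not stated in the abstract.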
