Trending
Articles related to "automatic reward labeling"
Real-World Offline Reinforcement Learning from Vision Language Model Feedback
cs.AI updates on arXiv.org 2025-08-07T04:12:50.000000Z