Scaling Value Iteration Networks to 5000 Layers for Extreme Long-Term Planning

cs.AI updates on arXiv.org, July 8, 14:58

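This article introduces a new planning network called Dynamic Transition VIN (DT-VIN), which achieves a breakthrough in long-term, large-scale planning by augmenting the latent Markov Decision Process and addressing the vanishing gradient problem.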

arXiv:2406.08404v2 Announce Type: replace-cross

Abstract: The Value Iteration Network (VIN) is an end-to-end differentiable neural network architecture for planning. It exhibits strong generalization to unseen domains by incorporating a differentiable planning module that operates on a latent Markov Decision Process (MDP). However, VINs struggle to scale to long-term and large-scale planning tasks, such as navigating a 100x100 maze, a task that typically requires thousands of planning steps to solve. We observe that this deficiency is due to two issues: the representation capacity of the latent MDP and the planning module's depth. We address the first by augmenting the latent MDP with a dynamic transition kernel, dramatically improving its representational capacity. To address the second and mitigate the vanishing gradient problem, we introduce an "adaptive highway loss" that constructs skip connections to improve gradient flow. We evaluate our method on 2D/3D maze navigation environments, continuous control, and a real-world lunar rover navigation task. We find that our new method, named Dynamic Transition VIN (DT-VIN), scales to 5000 layers and solves challenging versions of the above tasks. Altogether, we believe that DT-VIN represents a concrete step forward in performing long-term large-scale planning in complex environments.
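To make the link between planning horizon and network depth concrete, here is a minimal sketch of classical tabular value iteration on a small grid maze. It is not DT-VIN or the VIN architecture itself: there is no learned latent MDP, no dynamic transition kernel, and no highway loss. The function name `value_iteration_steps`, the maze layout, and the reward of -1 per step are all illustrative assumptions. The sketch only shows that each synchronous sweep propagates value information one cell further from the goal, so the number of sweeps needed grows with the longest shortest path in the maze.

```python
def value_iteration_steps(grid, goal, max_iters=10_000):
    """Synchronous value iteration on a 4-connected grid maze.

    grid: list of equal-length strings; '#' is a wall, '.' is free space.
    goal: (row, col) of the goal cell.
    Returns (values, sweeps). values[r][c] is the negated shortest-path
    length to the goal (-inf for walls and unreachable cells). sweeps is
    the number of sweeps run, including the final one that changed nothing.
    """
    rows, cols = len(grid), len(grid[0])
    neg_inf = float("-inf")
    values = [[neg_inf] * cols for _ in range(rows)]
    values[goal[0]][goal[1]] = 0.0
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]

    for sweep in range(1, max_iters + 1):
        new_values = [row[:] for row in values]
        changed = False
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] == "#" or (r, c) == goal:
                    continue
                # Bellman backup: reward of -1 per move (an assumption),
                # deterministic transitions, no discounting.
                best = values[r][c]
                for dr, dc in moves:
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                        best = max(best, -1.0 + values[nr][nc])
                if best != values[r][c]:
                    new_values[r][c] = best
                    changed = True
        values = new_values
        if not changed:
            return values, sweep
    return values, max_iters


# Hypothetical 5x5 "snake" maze: the only route from the top-left corner
# to the goal in the bottom-right corner winds through every corridor.
maze = [
    ".....",
    "####.",
    ".....",
    ".####",
    ".....",
]
values, sweeps = value_iteration_steps(maze, goal=(4, 4))
print("value at start:", values[0][0])
print("sweeps until convergence:", sweeps)
```

In a VIN-style planner each such sweep corresponds roughly to one layer of the planning module, which is why a winding 100x100 maze can call for thousands of layers.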

Related tags

Dynamic Transition VIN, long-term planning, large-scale planning