Learning Safe Numeric Planning Action Models

cs.AI updates on arXiv.org, 19 hours ago

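This article introduces N-SAM, an algorithm for learning safe numeric action models, which addresses the difficulty of obtaining accurate models for real-world planning problems. Safe action models are essential in mission-critical domains where trial and error is not an option. N-SAM learns safe numeric preconditions and effects from observations in time linear in the number of observations. To overcome N-SAM's need for a large number of observations before it can build a model, the researchers propose N-SAM*, which guarantees that even an action observed only once is applicable in some states, without compromising the model's safety. N-SAM and N-SAM* are evaluated on numeric planning domains and compared against an existing algorithm.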

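💡 At its core, N-SAM learns safe numeric action models, which is essential in mission-critical domains where trial and error is not an option. The algorithm learns numeric preconditions and effects from observations, so that plans generated with the learned model are applicable and achieve their goals.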

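🚀 N-SAM runs in time linear in the number of observations, so it can learn action models efficiently from large amounts of data. To preserve its safety guarantee, however, N-SAM needs many observations of each action before it includes that action in the learned model.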

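✨ To address this limitation, the researchers propose N-SAM*. Even when an action has been observed only once, N-SAM* ensures that the action is applicable in at least some states, without compromising the model's safety. This makes the algorithm more flexible and practical in real applications.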

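🔬 N-SAM and N-SAM* were evaluated on an extensive benchmark of numeric planning domains and compared with a state-of-the-art numeric action model learning algorithm. The experimental results demonstrate the effectiveness and advantages of N-SAM and its extension in numeric planning.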

arXiv:2312.10705v2 Announce Type: replace-cross Abstract: A significant challenge in applying planning technology to real-world problems lies in obtaining a planning model that accurately represents the problem's dynamics. Obtaining a planning model is even more challenging in mission-critical domains, where a trial-and-error approach to learning how to act is not an option. In such domains, the action model used to generate plans must be safe, in the sense that plans generated with it must be applicable and achieve their goals. Learning safe action models for planning has been mostly explored for domains in which states are sufficiently described with Boolean variables. In this work, we go beyond this limitation and present the Numeric Safe Action Models Learning (N-SAM) algorithm, an action model learning algorithm capable of learning safe numeric preconditions and effects. We prove that N-SAM runs in time linear in the number of observations and, under certain conditions, is guaranteed to return safe action models. However, to preserve this safety guarantee, N-SAM must observe a substantial number of examples for each action before including it in the learned model. We address this limitation of N-SAM and propose N-SAM*, an extension to the N-SAM algorithm that always returns an action model where every observed action is applicable at least in some states, even if it was observed only once. N-SAM* does so without compromising the safety of the returned action model. We prove that N-SAM* is optimal in terms of sample complexity compared to any other algorithm that guarantees safety. N-SAM and N-SAM* are evaluated over an extensive benchmark of numeric planning domains, and their performance is compared to a state-of-the-art numeric action model learning algorithm. We also provide a discussion on the impact of numerical accuracy on the learning process.
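
To make the notion of a "safe" learned precondition concrete, here is a minimal Python sketch. It is not the authors' algorithm: the abstract does not say how N-SAM represents preconditions, so the sketch assumes a single numeric fluent and a true precondition that is convex (an interval), in which case the range between the smallest and largest observed values can only admit states where the action really is applicable. The names `IntervalPrecondition`, `true_precondition`, and the fuel threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IntervalPrecondition:
    """Conservative precondition over ONE numeric fluent (illustrative only)."""
    low: Optional[float] = None
    high: Optional[float] = None

    def observe(self, value: float) -> None:
        # Constant work per observation, so learning is linear in their number.
        self.low = value if self.low is None else min(self.low, value)
        self.high = value if self.high is None else max(self.high, value)

    def allows(self, value: float) -> bool:
        # With no observations the action is never allowed.
        if self.low is None or self.high is None:
            return False
        return self.low <= value <= self.high


def true_precondition(fuel: float) -> bool:
    # Hypothetical ground truth, unknown to the learner: a convex condition.
    return fuel >= 10.0


# Observations are fluent values in states where the action was executed.
learned = IntervalPrecondition()
for fuel in (25.0, 40.0, 12.5):
    learned.observe(fuel)

# An action seen only once is still usable, but only in the observed state.
once = IntervalPrecondition()
once.observe(30.0)
```

The learned model is deliberately pessimistic: `learned.allows(11.0)` is false even though the action would really be applicable there, which is the price paid for never allowing the action in a state where it would fail. The single-observation case loosely echoes the guarantee the abstract attributes to N-SAM*: the action is applicable in at least some state, here exactly the observed one.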
