A Continuous Off-Policy Reinforcement Learning Scheme for Optimal Motion Planning in Simply-Connected Workspaces
Panagiotis Rousseas, Charalampos Bechlioulis, Kostas Kyriakopoulos
Abstract
In this work, an Integral Reinforcement Learn- ing (RL) framework is employed to provide provably safe, convergent and almost globally optimal policies in a novel Off-Policy Iterative method for simply-connected workspaces. This restriction stems from the impossibility of strictly global navigation in multiply connected manifolds, and is necessary for formulating continuous solutions. The current method general- izes and improves upon previous results, where parametrized controllers hindered the method in scope and results. Through enhancing the traditional reactive paradigm with RL, the proposed scheme is demonstrated to outperform both previous reactive methods as well as an RRT⋆method in path length, cost function values and execution times, indicating almost global optimality.