Perturbation-Based Best Arm Identification for Efficient Task Planning with Monte-Carlo Tree Search
Jin Daejong, Juhan Park, Kyungjae Lee
Abstract
Combining task and motion planning (TAMP) is crucial for intelligent robots to perform complex and long- horizon tasks. In TAMP, many approaches generally employ Monte-Carlo tree search (MCTS) with upper confidence bound (UCB) for task planning to handle exploration-exploitation trade-off and find globally optimal solutions. However, since UCB basically considers the estimation error caused by noise, the error caused by insufficient optimization of the sub-tree is not represented. Hence, UCB-based approaches have the disadvantage of not exploring underestimated sub-trees. To alleviate this issue, we propose a novel tree search method using perturbation-based best-arm identification (PBAI). We theoretically prove the bound of the simple regret of our method and empirically verify that PBAI finds the optimal task plans faster and more efficiently than the existing algorithms. The source code of our proposed algorithm is available at https://github.com/jdj2261/pytamp.