Integrated Exploration and Sequential Manipulation on Scene Graph with LLM-based Situated Replanning
Heqing Yang, Ziyuan Jiao, Shu Wang, Yida Niu, Si Liu, Hangxin Liu
AI summary
Problem
Robots in partially known environments struggle to balance information gathering with complex task execution due to incomplete prior knowledge, dynamic scene changes, and the limitations of hand-engineered heuristics.
Approach
EPoG pairs a graph-based global planner with an LLM-based situated local planner. It continuously updates a belief graph from LLM predictions and real-time observations, generates action sequences from the graph edit operations between the belief graph and the goal graph, and uses the LLM for situated replanning when execution exceptions occur.
Key results
- 91.3% success rate across 46 household scenes and 5 long-horizon tasks
- 36.1% average reduction in robot travel distance
- Outperforms purely LLM-based planners on long-horizon object transportation
- Successfully validated on a physical mobile manipulator in dynamic environments
Why it matters
Provides a scalable, robust framework for real-world robots to execute complex, long-horizon tasks in unknown and changing environments without manual domain engineering.
Abstract
In partially known environments, robots must combine exploration to gather information with task planning for efficient execution. To address this challenge, we propose EPoG, an Exploration-based sequential manipulation Planning framework on Scene Graphs. EPoG integrates a graph-based global planner with a Large Language Model (LLM)-based situated local planner, continuously updating a belief graph using observations and LLM predictions to represent known and unknown objects. Action sequences are generated by computing graph edit operations between the goal and belief graphs, ordered by temporal dependencies and movement costs. This approach seamlessly combines exploration and sequential manipulation planning. In ablation studies across 46 realistic household scenes and 5 long-horizon daily object transportation tasks, EPoG achieved a success rate of 91.3%, reducing travel distance by 36.1% on average. Furthermore, a physical mobile manipulator successfully executed complex tasks in unknown and dynamic environments, demonstrating EPoG’s potential for real-world applications.
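The idea of deriving actions from graph edit operations, then ordering them by temporal dependencies and movement costs, can be illustrated with a minimal sketch. This is not EPoG's implementation: the scene-graph encoding (each object mapped to its supporting parent), the `Move` type, the greedy nearest-first ordering, and all object names and coordinates below are assumptions made for illustration only.

```python
import math
from dataclasses import dataclass

# Hypothetical encoding: object name -> name of the node that supports it.
SceneGraph = dict[str, str]
Point = tuple[float, float]


@dataclass(frozen=True)
class Move:
    obj: str  # object to transport
    src: str  # believed current parent
    dst: str  # parent required by the goal graph


def edit_operations(belief: SceneGraph, goal: SceneGraph) -> list[Move]:
    """One Move per object whose parent differs between belief and goal.

    Assumes every object in the goal graph also appears in the belief graph.
    """
    return [Move(o, belief[o], dst) for o, dst in goal.items() if belief[o] != dst]


def locate(node: str, graph: SceneGraph, positions: dict[str, Point]) -> Point:
    """Follow parent links until reaching a node with a known fixed position."""
    while node not in positions:
        node = graph[node]
    return positions[node]


def order_moves(moves, belief, positions, start):
    """Greedy ordering (an assumed heuristic, not the paper's optimizer).

    Temporal dependency: a move is ready only once its destination is no
    longer waiting to be moved itself. Among ready moves, pick the one with
    the smallest travel cost from the robot's current position.
    """
    state = dict(belief)
    pending = list(moves)
    plan = []
    here = start
    while pending:
        waiting = {m.obj for m in pending}
        ready = [m for m in pending if m.dst not in waiting]
        if not ready:
            raise ValueError("cyclic dependencies between moves")

        def cost(m: Move) -> float:
            pick = locate(m.src, state, positions)
            place = locate(m.dst, state, positions)
            return math.dist(here, pick) + math.dist(pick, place)

        best = min(ready, key=cost)
        plan.append(best)
        pending.remove(best)
        state[best.obj] = best.dst  # apply the edit to the working graph
        here = locate(best.dst, state, positions)
    return plan


# Illustrative data only: fixed furniture positions and a tiny task.
positions = {"kitchen_table": (0.0, 0.0), "shelf": (4.0, 0.0), "sink": (1.0, 3.0)}
belief = {"tray": "shelf", "cup": "sink", "plate": "kitchen_table"}
goal = {"tray": "kitchen_table", "cup": "tray", "plate": "sink"}

plan = order_moves(edit_operations(belief, goal), belief, positions, start=(0.0, 0.0))
```

In this toy task the cup cannot be placed on the tray until the tray itself has been moved, so the dependency check defers it; the plate goes first because it is closest to the robot's starting position. A full system would also have to handle objects whose believed locations are only predictions, which is where exploration and replanning come in.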