ICRA 2026

GRATE: A Graph Transformer-Based Deep Reinforcement Learning Approach for Time-Efficient Autonomous Robot Exploration

Haozhan Ni, Jingsong Liang, Chenyu He, Yuhong Cao, Guillaume Adrien Sartoretti

AI summary

GRATE reduces travel distance by up to 21.5% and exploration time by up to 21.3% by pairing a Graph Transformer policy with Kalman-filter waypoint smoothing.

Autonomous exploration, Graph Transformer, Deep reinforcement learning, Kinodynamic feasibility, Trajectory smoothing, Robot planning

Problem

Current RL-based exploration methods reason poorly over graph-structured data and give too little consideration to robot motion, so their policies minimize travel distance while neglecting time efficiency and kinodynamic feasibility.

Approach

GRATE pairs a Graph Transformer policy, which captures local and global dependencies in the informative graph to select waypoints, with a Kalman filter that smooths the selected waypoints into a kinodynamically feasible trajectory.
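
The smoothing step can be illustrated with a minimal sketch. It assumes a constant-velocity motion model, planar (x, y) waypoints, a diagonal process noise, and hand-picked noise values `q` and `r`; none of these come from the paper, and the function names `kalman_filter_1d` and `smooth_waypoints` are hypothetical. The sketch runs a forward Kalman filter independently on each axis and does not enforce velocity or acceleration limits, so it may differ from how GRATE configures its filter.

```python
from typing import List, Sequence, Tuple


def kalman_filter_1d(zs: Sequence[float], dt: float = 1.0,
                     q: float = 0.01, r: float = 0.25) -> List[float]:
    """Forward constant-velocity Kalman filter over scalar measurements.

    State is [position, velocity]; only position is measured.
    q (process noise) and r (measurement noise) are illustrative guesses.
    """
    p, v = zs[0], 0.0                         # initial state estimate
    P00, P01, P10, P11 = 1.0, 0.0, 0.0, 1.0   # initial covariance (identity)
    filtered = []
    for z in zs:
        # Predict: x = F x and A = F P F^T + Q, with F = [[1, dt], [0, 1]]
        # and Q = q * I (a simplifying assumption).
        p = p + dt * v
        a00 = P00 + dt * (P01 + P10) + dt * dt * P11 + q
        a01 = P01 + dt * P11
        a10 = P10 + dt * P11
        a11 = P11 + q
        # Update with measurement matrix H = [1, 0].
        s = a00 + r                   # innovation variance
        k0, k1 = a00 / s, a10 / s     # Kalman gain
        y = z - p                     # innovation
        p += k0 * y
        v += k1 * y
        # Covariance update: P = (I - K H) A
        P00, P01 = (1.0 - k0) * a00, (1.0 - k0) * a01
        P10, P11 = a10 - k1 * a00, a11 - k1 * a01
        filtered.append(p)
    return filtered


def smooth_waypoints(waypoints: Sequence[Tuple[float, float]], dt: float = 1.0,
                     q: float = 0.01, r: float = 0.25) -> List[Tuple[float, float]]:
    """Filter the x and y coordinates of a waypoint sequence independently."""
    xs = kalman_filter_1d([w[0] for w in waypoints], dt, q, r)
    ys = kalman_filter_1d([w[1] for w in waypoints], dt, q, r)
    return list(zip(xs, ys))


# Example input: a zigzag path such as a graph-based planner might output.
zigzag = [(float(i), 0.5 if i % 2 == 0 else -0.5) for i in range(10)]
smoothed = smooth_waypoints(zigzag)
```

Raising `r` relative to `q` makes the filter trust the motion model more than the raw waypoints, which gives a smoother path that strays further from the selected waypoints.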

Key results

Up to 21.5% reduction in travel distance and 21.3% in exploration time versus SOTA baselines
Generation of kinodynamically feasible paths via Kalman filter smoothing
Successful validation in high-fidelity Gazebo simulations and real-world ground robot tests
Outperforms conventional (TARE, FAEL, HPHS) and learning-based (ARiADNE) planners

Why it matters

Enables faster, physically realistic autonomous exploration for ground robots in complex environments.

Abstract

Autonomous robot exploration (ARE) is the process of a robot autonomously navigating and mapping an unknown environment. Recent Reinforcement Learning (RL)-based approaches typically formulate ARE as a sequential decision-making problem defined on a collision-free informative graph. However, these methods often demonstrate limited reasoning ability over graph-structured data. Moreover, due to the insufficient consideration of robot motion, the resulting RL policies are generally optimized to minimize travel distance, while neglecting time efficiency. To overcome these limitations, we propose GRATE, a Deep Reinforcement Learning (DRL)-based approach that leverages a Graph Transformer to effectively capture both local structure patterns and global contextual dependencies of the informative graph, thereby enhancing the model’s reasoning capability across the entire environment. In addition, we deploy a Kalman filter to smooth the waypoint outputs, ensuring that the resulting path is kinodynamically feasible for the robot to follow. Experimental results demonstrate that our method exhibits better exploration efficiency (up to 21.5% in distance and 21.3% in time to complete exploration) than state-of-the-art conventional and learning-based baselines in various simulation benchmarks. We also validate our planner in real-world scenarios.
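
The distinction between local structure and global context on a graph can be illustrated with a toy sketch. It assumes single-head dot-product attention with no learned projections, where a mask decides which nodes each node may attend to: a neighbor mask restricts attention to graph edges (local), and an all-true mask lets every node see every other node (global). This is not the GRATE architecture; the helper names `attend`, `local_mask`, and `global_mask` and the example graph are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features: Sequence[Sequence[float]],
           allowed: Sequence[Sequence[bool]]) -> List[List[float]]:
    """Single-head dot-product attention over graph nodes.

    allowed[i][j] says whether node i may attend to node j.
    Queries, keys, and values are the raw features (no learned weights).
    """
    dim = len(features[0])
    out = []
    for i, query in enumerate(features):
        idx = [j for j in range(len(features)) if allowed[i][j]]
        scores = [sum(a * b for a, b in zip(query, features[j])) / math.sqrt(dim)
                  for j in idx]
        weights = softmax(scores)
        out.append([sum(w * features[j][d] for w, j in zip(weights, idx))
                    for d in range(dim)])
    return out


def local_mask(n: int, edges: Sequence[Tuple[int, int]]) -> List[List[bool]]:
    """Each node attends to itself and its direct neighbors only."""
    mask = [[i == j for j in range(n)] for i in range(n)]
    for u, v in edges:
        mask[u][v] = mask[v][u] = True
    return mask


def global_mask(n: int) -> List[List[bool]]:
    """Every node attends to every node."""
    return [[True] * n for _ in range(n)]


# Hypothetical 4-node path graph 0-1-2-3 with 2-D node features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
edges = [(0, 1), (1, 2), (2, 3)]
local_out = attend(features, local_mask(4, edges))
global_out = attend(features, global_mask(4))
```

Under the local mask, node 0 is unaffected by changes at node 3, which is three hops away; under the global mask, the same change alters node 0's output.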

Index terms

View Planning for SLAM, Reinforcement Learning