ICRA 2026

TreeIRL: Safe Urban Driving with Tree Search and Inverse Reinforcement Learning

Momchil Tomov, Sang Uk Lee, Hansford Hendargo, Jinwook Huh, Teawon Han, Forbes Howington, Rafael Rodrigues da Silva, Gianmarco Bernasconi, Marc Heim, Samuel Findler, Xiaonan Ji, Alexander Boule, Michael Napoli, Kuo Chen, Jesse Miller, Boaz Cornelis Floor, Yunqing Hu

AI summary

TreeIRL combines Monte Carlo tree search with inverse reinforcement learning to achieve state-of-the-art safety, comfort, and human-like behavior in real-world urban autonomous driving.

Autonomous driving · Monte Carlo tree search · Inverse reinforcement learning · Motion planning · Real-world evaluation · Hybrid planning

Problem

Classical motion planners often produce unnatural or uncomfortable behavior, while machine learning-based planners struggle to guarantee safety and generalize to rare, critical scenarios.

Approach

The method repurposes Monte Carlo tree search to efficiently generate a diverse set of safe candidate trajectories, then uses a deep scoring function trained via inverse reinforcement learning to select the most human-like option for execution.
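The two-stage structure described above (generate safe candidates, then pick the most human-like one) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the candidate list is hard-coded rather than produced by a tree search, the `Trajectory` features and the safety threshold are hypothetical, and `linear_score` is a hand-weighted stand-in for the deep scoring function that the paper trains with inverse reinforcement learning.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Trajectory:
    """A candidate plan summarized by a few hypothetical features."""
    name: str
    min_gap_m: float   # smallest distance to any other agent along the plan
    progress_m: float  # distance travelled along the route
    max_jerk: float    # peak jerk, used here as a rough discomfort proxy


def is_safe(traj: Trajectory, min_gap_threshold_m: float = 2.0) -> bool:
    # Stand-in for the safety property the search stage is meant to provide.
    # The 2 m threshold is an assumption made for this example.
    return traj.min_gap_m >= min_gap_threshold_m


def linear_score(traj: Trajectory) -> float:
    # Stand-in for the learned scorer: a fixed linear reward over features.
    # The weights are invented for illustration, not learned from data.
    return 0.1 * traj.progress_m - 1.0 * traj.max_jerk


def select_trajectory(
    candidates: Sequence[Trajectory],
    score: Callable[[Trajectory], float],
) -> Trajectory:
    # Stage 1 (placeholder): keep only candidates that pass the safety check.
    safe = [t for t in candidates if is_safe(t)]
    if not safe:
        raise ValueError("no safe candidate trajectory available")
    # Stage 2: return the safe candidate with the highest score.
    return max(safe, key=score)


candidates = [
    Trajectory("aggressive", min_gap_m=1.0, progress_m=60.0, max_jerk=3.0),
    Trajectory("smooth", min_gap_m=4.0, progress_m=45.0, max_jerk=0.5),
    Trajectory("timid", min_gap_m=8.0, progress_m=20.0, max_jerk=0.2),
]
best = select_trajectory(candidates, linear_score)
print(best.name)
```

In this toy setup the "aggressive" plan is rejected by the safety check before scoring, so the scorer only ever ranks plans that are already considered safe. That separation is the point of the hybrid design: the scoring function shapes how the vehicle drives, while safety is handled by the candidate-generation stage.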

Key results

First real-world demonstration of MCTS-based planning in dense urban traffic
Outperforms classical and ML-based planners in safety, comfort, and progress across simulation and 500+ miles of on-road testing
Zero safety-driver takeovers due to adaptive cruise control (ACC) or cut-in failures across 268 autonomous miles
Demonstrates that hybrid classical/learning architectures effectively bridge the sim-to-real gap for discretionary driving metrics

Why it matters

Provides a scalable, safe, and human-like planning framework for autonomous vehicles, highlighting the need for real-world evaluation across diverse metrics to advance self-driving technology.

Abstract

We present TreeIRL, a novel planner for autonomous driving that combines Monte Carlo tree search (MCTS) and inverse reinforcement learning (IRL) to achieve state-of-the-art performance in simulation and in real-world driving. The key idea is to use MCTS to find a promising set of safe candidate trajectories and a deep scoring function trained with IRL to select the most human-like among them. We evaluate TreeIRL against classical and state-of-the-art planners on large-scale simulations and on 500+ miles of real-world autonomous driving in the Las Vegas metropolitan area. Scenarios include navigating heavy urban traffic, adaptive cruise control, cut-ins, and traffic lights. TreeIRL achieves the best overall performance, striking a balance between safety, progress, comfort, and human-likeness. To the best of our knowledge, our work is the first public-road demonstration of MCTS-based planning and underscores the importance of evaluating planners across a diverse set of metrics and in real-world environments. TreeIRL is highly extensible and could be further improved with reinforcement learning and imitation learning, providing a framework for exploring different combinations of classical and learning-based approaches to solve the planning bottleneck in autonomous driving.

Index terms

Autonomous Vehicle Navigation · Motion and Path Planning · Reinforcement Learning