ICRA 2026

Imitation-BT: Automating Behavior Tree Generation by Echoing Reinforcement Learning Agents

Shailendra Sekhar Bathula, Ramviyas Parasuraman

AI summary

Imitation-BT successfully distills opaque deep reinforcement learning policies into compact, interpretable Behavior Trees while retaining near-expert performance.

Behavior Trees, Imitation Learning, Explainable RL, Policy Distillation, Autonomous Systems, Knowledge Transfer

Problem

Deep reinforcement learning policies are highly performant but operate as opaque black boxes, making them unsafe and inflexible for deployment in high-stakes environments. Existing methods for automating Behavior Tree generation from RL policies suffer from distributional shift, lack scalability, or fail to produce truly transparent architectures.

Approach

The framework uses an interactive imitation learning pipeline to distill expert RL policies into decision trees, then applies logic optimization and rule extraction to automatically synthesize a compact, modular, and transparent Behavior Tree.
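The rule-extraction step can be illustrated with a minimal sketch. It assumes a distilled decision tree is already available and turns each root-to-leaf path into a Sequence (conditions followed by an action), with a Fallback over all Sequences. The classes `Leaf` and `Split`, the feature names `battery` and `distance`, and the action labels are hypothetical, chosen only for illustration; this is not the paper's synthesis algorithm and it omits the logic-optimization step.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple, Union

State = Dict[str, float]
Condition = Tuple[str, str, float]   # (feature, "<=" or ">", threshold)
Rule = Tuple[List[Condition], str]   # (conditions, action): one Sequence


@dataclass
class Leaf:
    action: str


@dataclass
class Split:
    feature: str
    threshold: float
    left: "Node"    # branch taken when state[feature] <= threshold
    right: "Node"   # branch taken otherwise


Node = Union[Leaf, Split]


def extract_rules(node: Node, path: Tuple[Condition, ...] = ()) -> List[Rule]:
    """Turn every root-to-leaf path into one (conditions, action) rule."""
    if isinstance(node, Leaf):
        return [(list(path), node.action)]
    le = (node.feature, "<=", node.threshold)
    gt = (node.feature, ">", node.threshold)
    return (extract_rules(node.left, path + (le,))
            + extract_rules(node.right, path + (gt,)))


def holds(cond: Condition, state: State) -> bool:
    feature, op, threshold = cond
    value = state[feature]
    return value <= threshold if op == "<=" else value > threshold


def tick(rules: List[Rule], state: State) -> Optional[str]:
    """Fallback over Sequences: return the action of the first rule
    whose conditions all hold, or None if no rule applies."""
    for conditions, action in rules:
        if all(holds(c, state) for c in conditions):
            return action
    return None


def tree_predict(node: Node, state: State) -> str:
    """Reference decision-tree evaluation, used to check equivalence."""
    while isinstance(node, Split):
        go_left = state[node.feature] <= node.threshold
        node = node.left if go_left else node.right
    return node.action


# Hypothetical distilled tree for a delivery robot.
toy_tree = Split("battery", 0.2,
                 Leaf("recharge"),
                 Split("distance", 1.0, Leaf("deliver"), Leaf("navigate")))
rules = extract_rules(toy_tree)

for conditions, action in rules:
    print(conditions, "->", action)
```

Because the paths of a decision tree are mutually exclusive, the order of the Sequences under the Fallback does not change the selected action here; a real synthesis step would additionally merge redundant conditions to keep the tree compact.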

Key results

Achieves high performance retention across diverse continuous and discrete environments
Generates significantly more compact and interpretable Behavior Trees than existing synthesis baselines
Effectively mitigates distributional shift through interactive policy distillation and weighted resampling
Provides a modular, transparent alternative to opaque deep RL policies for safety-critical deployment
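The idea behind interactive distillation can be sketched in a few lines, under strong simplifying assumptions. The student is rolled out, the expert labels the states the student actually visits, and the student is refit on the aggregated data (a DAgger-style loop). The 1-D dynamics in `step`, the threshold `expert`, and the one-split student fitted by `fit_stump` are all hypothetical stand-ins, and the weighted resampling mentioned above is left out because its weighting scheme is not described in this summary.

```python
from typing import Callable, List, Tuple

Sample = Tuple[float, str]   # (state, action label given by the expert)


def expert(x: float) -> str:
    # Hypothetical stand-in for a trained RL policy.
    return "left" if x > 0.5 else "right"


def step(x: float, action: str) -> float:
    # Hypothetical deterministic 1-D dynamics.
    return x - 0.3 if action == "left" else x + 0.2


def fit_stump(data: List[Sample]) -> float:
    """Fit a one-split student: predict "left" if x > t else "right".
    Returns the candidate threshold t with the fewest training errors."""
    xs = sorted({x for x, _ in data})
    candidates = [xs[0] - 1.0] + xs

    def errors(t: float) -> int:
        return sum(1 for x, y in data
                   if ("left" if x > t else "right") != y)

    return min(candidates, key=errors)


def make_student(t: float) -> Callable[[float], str]:
    return lambda x: "left" if x > t else "right"


def rollout(policy: Callable[[float], str], x0: float, horizon: int) -> List[float]:
    """Return the states visited when following `policy` from x0."""
    states, x = [], x0
    for _ in range(horizon):
        states.append(x)
        x = step(x, policy(x))
    return states


def dagger(x0: float = 0.0, horizon: int = 10,
           iterations: int = 3) -> Tuple[float, List[Sample]]:
    # Round 0: plain behavior cloning on the expert's own trajectory.
    data = [(x, expert(x)) for x in rollout(expert, x0, horizon)]
    threshold = fit_stump(data)
    for _ in range(iterations):
        # Roll out the student, but label its states with the expert.
        visited = rollout(make_student(threshold), x0, horizon)
        data += [(x, expert(x)) for x in visited]
        threshold = fit_stump(data)
    return threshold, data


threshold, data = dagger()
student = make_student(threshold)
```

Labeling the student's own state distribution is what addresses distributional shift: the student is trained on the states it will encounter at deployment rather than only on the expert's trajectories.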

Why it matters

It enables the safe deployment of high-performing RL agents in critical domains by converting black-box policies into transparent, easily modifiable control architectures.

Abstract

Understanding an autonomous agent’s decision-making prowess is of paramount importance, as it increases trust and guarantees safety. Although agent policies learned through reinforcement learning (RL) and machine learning (ML) paradigms have demonstrated their dominance in various domains, they struggle with deployment in high-stakes environments due to their algorithmic opacity. A structured and transparent representation of a policy helps us understand, evaluate, and modify it if necessary. Due to their inherent reactivity, modularity, and transparent hierarchical representation, the Behavior Tree (BT) is an ideal solution to represent control policies. In this paper, we focus on building a knowledge representation transfer framework in which knowledge of trained RL agents is captured through imitation learning and then utilized to form a compact BT. Our primary focus is to retain maximum performance while improving the interpretability of the BTs. In combination with planning and learning, we automate the formation of a BT and offer an alternative, transparent architecture for policy representation. In an extensive analysis with a variety of Gymnasium environments and the Robotics Package Delivery domain simulations, we demonstrate the significant performance retention capability and superior interpretability of the proposed Imitation-BT.

Index terms

Behavior-Based Systems, Imitation Learning, Reinforcement Learning