ICRA 2026

Learning Sidewalk Autopilot from Multi-Scale Imitation with Corrective Behavior Expansion

Honglin He, Yukai Ma, Brad Squicciarini, Wayne Wu, Bolei Zhou

AI summary

Synthesizing corrective recovery behaviors and multi-scale supervision from offline teleoperation data significantly boosts the robustness and generalization of sidewalk navigation policies.

sidewalk navigation, imitation learning, corrective behavior, multi-scale prediction, micromobility, data augmentation

Problem

Standard imitation learning for sidewalk navigation fails under closed-loop deployment due to compounding errors and a lack of recovery demonstrations in fixed offline datasets.

Approach

The MIMIC framework augments teleoperation logs with synthesized failure-recovery trajectories and employs a multi-scale imitation architecture that supervises policies across short- and long-term horizons simultaneously.
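To make the idea of a synthesized failure-recovery trajectory concrete, here is a minimal sketch of one way such augmentation could work: displace the start of a logged path sideways and blend it back onto the demonstration. This is an illustration under stated assumptions, not the paper's pipeline. The function name `synthesize_recovery`, the (x, y) waypoint format, the cosine blend, and the use of a single normal taken from the initial heading are all hypothetical choices.

```python
import math


def synthesize_recovery(demo, lateral_offset, blend_steps):
    """Return a copy of `demo` that starts off-path and merges back onto it.

    demo: list of (x, y) waypoints from a teleoperation log (assumed format).
    lateral_offset: signed displacement in metres, to the left of the
        initial heading when positive.
    blend_steps: number of waypoints over which the offset decays to zero.
    """
    if len(demo) < 2:
        raise ValueError("need at least two waypoints")
    if blend_steps < 1:
        raise ValueError("blend_steps must be positive")

    # Unit normal to the initial heading (heading rotated 90 degrees left).
    dx = demo[1][0] - demo[0][0]
    dy = demo[1][1] - demo[0][1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("first two waypoints must differ")
    nx, ny = -dy / norm, dx / norm

    recovered = []
    for i, (x, y) in enumerate(demo):
        # Cosine ramp: weight 1 at the first waypoint, 0 from blend_steps on.
        if i < blend_steps:
            w = 0.5 * (1.0 + math.cos(math.pi * i / blend_steps))
        else:
            w = 0.0
        recovered.append((x + w * lateral_offset * nx,
                          y + w * lateral_offset * ny))
    return recovered


# A straight 6-waypoint demonstration along the x-axis, perturbed 0.5 m left.
demo = [(float(i), 0.0) for i in range(6)]
recovery = synthesize_recovery(demo, lateral_offset=0.5, blend_steps=4)
```

The perturbed trajectory rejoins the original path after `blend_steps` waypoints, so the later waypoints stay valid as supervision targets. A fixed normal is only a reasonable approximation for short, nearly straight segments.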

Key results

Synthesized corrective behavior pipeline expands teleoperation data coverage
Multi-scale architecture captures both short-horizon interactions and long-horizon planning
Real-world robot deployment demonstrates improved robustness and generalization
Effective policy learning achieved using only offline teleoperation data without reinforcement learning

Why it matters

Enables safer, more adaptable autonomous micromobility and delivery robots to navigate complex urban sidewalks using only existing teleoperation datasets.

Abstract

Sidewalk micromobility is a promising solution for last-mile transportation, but current learning-based control methods struggle in complex urban environments. Imitation learning (IL) learns policies from human demonstrations, yet its reliance on fixed offline data often leads to compounding errors, limited robustness, and poor generalization. To address these challenges, we propose a framework that advances IL through corrective behavior expansion and multi-scale imitation learning. On the data side, we augment teleoperation datasets with diverse corrective behaviors and sensor augmentations to enable the policy to learn to recover from its own mistakes. On the model side, we introduce a multi-scale IL architecture that captures both short-horizon interactive behaviors and long-horizon goal-directed intentions via horizon-based trajectory clustering and hierarchical supervision. Real-world experiments show that our approach significantly improves robustness and generalization in diverse sidewalk scenarios. Demo video and additional information are available on the project page.
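As a rough illustration of supervising a policy at more than one horizon, the sketch below combines a short-horizon and a long-horizon waypoint error into one weighted loss. It is an assumption-laden toy, not the architecture described in the abstract: the function names, the horizons `(2, 8)`, the weights, and the squared-error metric are hypothetical, and the horizon-based trajectory clustering is not modeled.

```python
def horizon_loss(pred, target, horizon):
    """Mean squared waypoint error over the first `horizon` steps."""
    n = min(horizon, len(pred), len(target))
    if n == 0:
        raise ValueError("need at least one waypoint to compare")
    total = 0.0
    for (px, py), (tx, ty) in zip(pred[:n], target[:n]):
        total += (px - tx) ** 2 + (py - ty) ** 2
    return total / n


def multi_scale_loss(pred, target, horizons=(2, 8), weights=(1.0, 0.5)):
    """Weighted sum of per-horizon losses (horizons and weights are assumed)."""
    if len(horizons) != len(weights):
        raise ValueError("horizons and weights must have the same length")
    return sum(w * horizon_loss(pred, target, h)
               for h, w in zip(horizons, weights))


# Expert path along the x-axis; the prediction drifts 1 m sideways after step 3.
target = [(float(i), 0.0) for i in range(8)]
pred = [(float(i), 0.0 if i < 4 else 1.0) for i in range(8)]

short_term = horizon_loss(pred, target, 2)   # first two steps match exactly
long_term = horizon_loss(pred, target, 8)    # drift shows up only at this scale
combined = multi_scale_loss(pred, target)
```

In this example the short-horizon term is zero while the long-horizon term is not, which shows why a single short window could miss errors that only appear over longer stretches.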

Index terms

Vision-Based Navigation, Imitation Learning, Wheeled Robots