X-Nav: Learning End-To-End Cross-Embodiment Navigation for Mobile Robots
Haitong Wang, Aaron Hao Tan, Angus Fung, Goldie Nejat
AI summary
Problem
Existing navigation methods rely on embodiment-specific kinematics, dynamics, or controllers, preventing policies trained on one robot from generalizing to others. This limits scalability and requires costly manual tuning or separate planners for each new platform.
Approach
X-Nav uses a two-stage framework. It first trains multiple expert policies with deep reinforcement learning on randomly generated robot embodiments, then distills them into a single transformer-based policy that maps visual and proprioceptive inputs directly to low-level control commands.
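The two stages can be illustrated with a toy sketch. It is not the authors' implementation: the hand-written `expert_policy` stands in for an RL-trained expert, a two-weight linear model stands in for the transformer, and the single embodiment parameter `turn_gain` and all function names are hypothetical. It shows only the data flow: experts act using privileged embodiment knowledge, and the student is fit to their actions from non-privileged observations.

```python
import random


def sample_embodiment(rng):
    """Stage 1 input: a randomly generated embodiment (hypothetical single parameter)."""
    return {"turn_gain": rng.uniform(0.5, 2.0)}


def expert_policy(embodiment, heading_error):
    """Stand-in for an RL-trained expert; it reads the privileged turn_gain directly."""
    return embodiment["turn_gain"] * heading_error


def student_features(heading_error, proprio):
    """The student sees only the heading error and a proprioceptive reading."""
    return [heading_error, heading_error * proprio]


def collect_demonstrations(rng, num_embodiments, samples_per_embodiment):
    """Pool (student observation, expert action) pairs across all sampled embodiments."""
    data = []
    for _ in range(num_embodiments):
        embodiment = sample_embodiment(rng)
        for _ in range(samples_per_embodiment):
            heading_error = rng.uniform(-1.0, 1.0)
            # Assumption: proprioception reveals the gain without noise.
            proprio = embodiment["turn_gain"]
            features = student_features(heading_error, proprio)
            data.append((features, expert_policy(embodiment, heading_error)))
    return data


def distill(data, epochs=50, lr=0.1):
    """Stage 2: fit the student to the expert actions with plain SGD on squared error."""
    weights = [0.0, 0.0]
    for _ in range(epochs):
        for features, target in data:
            prediction = sum(w * f for w, f in zip(weights, features))
            error = prediction - target
            weights = [w - lr * error * f for w, f in zip(weights, features)]
    return weights


def student_policy(weights, heading_error, proprio):
    """General policy: no access to the embodiment dictionary."""
    features = student_features(heading_error, proprio)
    return sum(w * f for w, f in zip(weights, features))


if __name__ == "__main__":
    rng = random.Random(0)
    weights = distill(collect_demonstrations(rng, 20, 20))
    unseen = {"turn_gain": 1.3}  # an embodiment not drawn during training
    print(student_policy(weights, 0.4, unseen["turn_gain"]), expert_policy(unseen, 0.4))
```

In this toy setting the student can match the experts on an unseen `turn_gain` because the proprioceptive reading exposes the quantity the experts used as privileged information. Real embodiments are far less cleanly observable.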
Key results
- Zero-shot transfer to unseen wheeled and quadrupedal robot embodiments
- Successful navigation in photorealistic simulated and real-world environments
- Performance scales positively with the number of randomly generated training embodiments
- Ablation study confirms the effectiveness of the Nav-ACT transformer and distillation pipeline
Why it matters
Enables developers and researchers to deploy a single navigation model across diverse mobile robot platforms without costly embodiment-specific tuning or retraining.
Abstract
Existing navigation methods are primarily designed for specific robot embodiments, limiting their generalizability across diverse robot platforms. In this paper, we introduce X-Nav, a novel framework for end-to-end cross-embodiment navigation in which a single unified policy can be deployed across various embodiments of both wheeled and quadrupedal robots. X-Nav consists of two learning stages: 1) multiple expert policies are trained using deep reinforcement learning with privileged observations on a wide range of randomly generated robot embodiments; and 2) a single general policy is distilled from the expert policies via navigation action chunking with transformer (Nav-ACT). The general policy directly maps visual and proprioceptive observations to low-level control commands, enabling generalization to novel robot embodiments. Simulated experiments demonstrated that X-Nav achieved zero-shot transfer to both unseen embodiments and photorealistic environments. A scalability study showed that the performance of X-Nav improves as the number of randomly generated training embodiments increases. An ablation study confirmed the design choices of X-Nav. Furthermore, real-world experiments validated the generalizability of X-Nav.
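Action chunking, the idea behind the Nav-ACT name, means the policy predicts a short sequence of future actions per inference instead of a single action. The sketch below is illustrative only: the 1-D `ToyRobot`, the hand-coded `make_chunk_policy`, and the choice to execute each chunk in full before re-querying are assumptions made here, not details taken from the abstract.

```python
class ToyRobot:
    """Hypothetical 1-D robot whose state is a position on a line."""

    def __init__(self, position=0.0):
        self.position = position

    def observe(self):
        return self.position

    def apply(self, velocity, dt=0.1):
        self.position += velocity * dt


def make_chunk_policy(goal, max_speed=1.0, dt=0.1):
    """Hand-coded stand-in for a learned policy that outputs a chunk of velocities."""
    calls = {"count": 0}

    def policy(position, chunk_size):
        calls["count"] += 1
        chunk = []
        for _ in range(chunk_size):
            # Roll the predicted position forward so later actions in the chunk stay consistent.
            velocity = max(-max_speed, min(max_speed, (goal - position) / dt))
            chunk.append(velocity)
            position += velocity * dt
        return chunk

    return policy, calls


def run_with_action_chunks(policy, observe, apply_action, total_steps, chunk_size):
    """Query the policy once per chunk, execute the chunk, then re-observe."""
    executed = []
    while len(executed) < total_steps:
        chunk = policy(observe(), chunk_size)
        for action in chunk[: total_steps - len(executed)]:
            apply_action(action)
            executed.append(action)
    return executed


if __name__ == "__main__":
    robot = ToyRobot()
    policy, calls = make_chunk_policy(goal=1.0)
    run_with_action_chunks(policy, robot.observe, robot.apply, total_steps=12, chunk_size=5)
    print(robot.position, calls["count"])
```

The policy is queried once per chunk rather than once per control step, which is the main practical effect this sketch is meant to show.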