LAMP: Implicit Language Map for Robot Navigation
Sibaek Lee, Hyeonwoo Yu, Giseop Kim, Sunwook Choi
AI summary
Problem
Existing language-based navigation methods explicitly store language vectors in grids or graphs, causing excessive memory usage and coarse resolution that hinder fine-grained planning in large environments.
Approach
The authors introduce LAMP, which learns a continuous implicit neural field from RGB images to encode language features, paired with a sparse topological graph for coarse path planning and gradient-based optimization for precise goal refinement, all regularized by a Bayesian uncertainty model.
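The two-stage idea described above (coarse graph search, then gradient-based refinement in a continuous field) can be illustrated with a small sketch. Everything here is an assumption made for illustration: `toy_field` is a hand-written stand-in for the learned neural field, the four-node graph is made up, the query is a pretend text embedding, and the gradient is taken by finite differences rather than backpropagation. It is not the authors' implementation.

```python
import heapq
import numpy as np

def dijkstra(adj, start, goal):
    """Shortest path on a graph given as {node: [(neighbor, cost), ...]}.
    Assumes the goal is reachable from the start."""
    dist, prev, heap = {start: 0.0}, {}, [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]

def toy_field(xy):
    """Hypothetical stand-in for the implicit field: 2D position -> unit feature."""
    f = np.array([np.cos(xy[0]), np.sin(xy[0]), np.tanh(xy[1])])
    return f / np.linalg.norm(f)

def refine(xy, query, field, steps=200, lr=0.05, eps=1e-4):
    """Gradient ascent on cosine similarity between field(xy) and the query,
    using central finite differences for the gradient."""
    xy = np.asarray(xy, dtype=float).copy()
    for _ in range(steps):
        grad = np.zeros_like(xy)
        for i in range(xy.size):
            d = np.zeros_like(xy)
            d[i] = eps
            grad[i] = (field(xy + d) @ query - field(xy - d) @ query) / (2 * eps)
        xy += lr * grad
    return xy

# Made-up sparse graph: node id -> position, plus weighted edges.
nodes = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (2.0, 1.0)}
adj = {0: [(1, 1.0)], 1: [(0, 1.0), (2, 1.0)],
       2: [(1, 1.0), (3, 1.0)], 3: [(2, 1.0)]}

# Pretend text embedding: the field value at an off-graph location.
query = toy_field(np.array([2.3, 1.4]))

# Stage 1: pick the graph node most similar to the query and plan to it.
goal_node = max(nodes, key=lambda n: toy_field(np.array(nodes[n])) @ query)
path = dijkstra(adj, 0, goal_node)

# Stage 2: refine the goal position in the continuous field.
refined = refine(nodes[goal_node], query, toy_field)
```

In this toy setup the refined position moves away from the nearest graph node toward a location whose feature better matches the query, which is the role the summary assigns to the gradient-based stage.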
Key results
- First implicit language map enabling fine-grained path generation from RGB input
- Bayesian von Mises–Fisher framework improves generalization to unobserved regions
- Score-based graph sampling reduces computational overhead while preserving semantic coverage
- Outperforms explicit methods in memory efficiency and goal-reaching accuracy in simulation and real-world tests
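For reference, the von Mises–Fisher distribution named above has the following standard density on the unit sphere. The symbols are the textbook ones (mean direction, concentration, modified Bessel function of the first kind); how LAMP parameterizes or trains these quantities is not stated in this summary.

```latex
% Standard von Mises-Fisher density for a unit vector x in R^d
% with mean direction \mu (\|\mu\| = 1) and concentration \kappa \ge 0.
f(\mathbf{x};\, \boldsymbol{\mu}, \kappa)
  = C_d(\kappa)\, \exp\!\left(\kappa\, \boldsymbol{\mu}^{\top} \mathbf{x}\right),
\qquad
C_d(\kappa)
  = \frac{\kappa^{d/2 - 1}}{(2\pi)^{d/2}\, I_{d/2 - 1}(\kappa)}
```

A larger concentration means the embedding is tightly clustered around its mean direction, so the concentration can be read as a confidence value, while values near zero approach a uniform distribution on the sphere.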
Why it matters
Provides a scalable, memory-efficient alternative to explicit language mapping, enabling precise natural-language-driven navigation in large-scale environments.
Abstract
Recent advances in vision-language models have made zero-shot navigation feasible, enabling robots to interpret and follow natural language instructions without requiring labeling. However, existing methods that explicitly store language vectors in grid or node-based maps struggle to scale to large environments due to excessive memory requirements and limited resolution for fine-grained planning. We introduce LAMP (Language Map), a novel neural language field-based navigation framework that learns a continuous, language-driven map and directly leverages it for fine-grained path generation. Unlike prior approaches, our method encodes language features as an implicit neural field rather than storing them explicitly at every location. By combining this implicit representation with a sparse graph, LAMP supports efficient coarse path planning and then performs gradient-based optimization in the learned field to refine poses near the goal. Our two-stage pipeline of coarse graph search followed by language-driven, gradient-guided optimization is the first application of an implicit language map for precise path generation. This refinement is particularly effective at selecting goal regions not directly observed by leveraging semantic similarities in the learned feature space. To further enhance robustness, we adopt a Bayesian framework that models embedding uncertainty via the von Mises–Fisher distribution, thereby improving generalization to unobserved regions. To scale to large environments, LAMP employs a graph sampling strategy that prioritizes spatial coverage and embedding confidence, retaining only the most informative nodes and substantially reducing computational overhead. Our experimental results, both in NVIDIA Isaac Sim and on a real multi-floor building, demonstrate that LAMP outperforms existing explicit methods in both memory efficiency and fine-grained goal-reaching accuracy, opening new possibilities for scalable, language-driven robot navigation.
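The graph sampling strategy mentioned in the abstract, which favors spatial coverage and embedding confidence, might look roughly like the greedy sketch below. The scoring rule (distance to the nearest kept node multiplied by a per-node confidence), the use of a vMF-style concentration `kappa` as that confidence, and the names `sample_nodes` and `alpha` are all assumptions made for illustration; the abstract does not give the actual formula.

```python
import numpy as np

def sample_nodes(positions, kappa, k, alpha=1.0):
    """Greedily keep k nodes, trading spatial coverage against confidence.

    Illustrative score (an assumption, not the paper's rule):
        score(i) = (distance from node i to nearest kept node) * kappa[i] ** alpha
    """
    positions = np.asarray(positions, dtype=float)
    kappa = np.asarray(kappa, dtype=float)
    kept = [int(np.argmax(kappa))]  # seed with the most confident node
    min_dist = np.linalg.norm(positions - positions[kept[0]], axis=1)
    while len(kept) < min(k, len(positions)):
        score = min_dist * kappa ** alpha
        score[kept] = -np.inf  # never re-select a kept node
        nxt = int(np.argmax(score))
        kept.append(nxt)
        # Update each node's distance to its nearest kept node.
        min_dist = np.minimum(
            min_dist, np.linalg.norm(positions - positions[nxt], axis=1)
        )
    return kept

# Made-up example: two nearly coincident nodes, one far node with low confidence.
positions = [[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [5.0, 5.0], [0.0, 5.0]]
kappa = [10.0, 50.0, 20.0, 1.0, 20.0]
kept = sample_nodes(positions, kappa, k=3)
print(kept)  # [1, 4, 2]
```

In this example the near-duplicate node 0 is dropped because node 1 already covers its location, and the distant node 3 is dropped because its confidence is low, so the kept set stays both spread out and reliable.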