ICRA 2026

TopoNav: Topological Graphs As a Key Enabler for Advanced Object Navigation

Yihao Qin, Hang Zhou, Jun Ma, Renjing Xu, Yiding Ji

AI summary

Dynamic topological memory graphs significantly boost long-horizon object navigation success and efficiency by bridging transient visual inputs with persistent spatial understanding.

object navigation, topological memory, spatial reasoning, vision-language models, embodied AI, dynamic mapping

Problem

Current LLM-driven ObjectNav methods lack robust spatial memory, causing fragmented reasoning, goal confusion, and inefficient paths in complex, long-horizon tasks.

Approach

TopoNav constructs a dynamic topological graph that encodes room-level nodes and connectivity, integrating it with semantic point clouds and LLM reasoning to guide exploration and strategic backtracking.
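
As a rough illustration of the idea of room-level nodes, connectivity, and backtracking, the following minimal sketch keeps a small topological memory and plans a route back to a room where a target object was previously seen. It is not the paper's implementation: the class names `RoomNode` and `TopoGraph`, the methods `observe`, `connect`, `rooms_with`, and `path`, and the use of breadth-first search are all assumptions made for this example, and the semantic point clouds and LLM reasoning are left out entirely.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass, field


@dataclass
class RoomNode:
    """A room-level node. All field names are hypothetical."""
    name: str
    objects: set[str] = field(default_factory=set)  # semantic labels seen in this room
    explored: bool = False


class TopoGraph:
    """Hypothetical topological memory: rooms as nodes, adjacency as undirected edges."""

    def __init__(self) -> None:
        self.nodes: dict[str, RoomNode] = {}
        self.edges: dict[str, set[str]] = {}

    def connect(self, a: str, b: str) -> None:
        """Record that rooms a and b are directly connected."""
        for room in (a, b):
            self.nodes.setdefault(room, RoomNode(room))
            self.edges.setdefault(room, set())
        self.edges[a].add(b)
        self.edges[b].add(a)

    def observe(self, room: str, objects=(), came_from: str | None = None) -> None:
        """Update the graph with what the agent currently sees in `room`."""
        node = self.nodes.setdefault(room, RoomNode(room))
        self.edges.setdefault(room, set())
        node.objects.update(objects)
        node.explored = True
        if came_from is not None and came_from != room:
            self.connect(room, came_from)

    def rooms_with(self, label: str) -> list[str]:
        """Recall every room in which `label` has been observed."""
        return sorted(n.name for n in self.nodes.values() if label in n.objects)

    def path(self, start: str, goal: str) -> list[str] | None:
        """Fewest-hops route between two rooms via breadth-first search."""
        if start not in self.nodes or goal not in self.nodes:
            return None
        parent: dict[str, str | None] = {start: None}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            if current == goal:
                route = []
                while current is not None:
                    route.append(current)
                    current = parent[current]
                return route[::-1]
            for neighbor in sorted(self.edges[current]):
                if neighbor not in parent:
                    parent[neighbor] = current
                    queue.append(neighbor)
        return None  # goal is not reachable through known connections


# Usage: the agent explores three rooms, then backtracks to the fridge.
graph = TopoGraph()
graph.observe("hallway")
graph.observe("kitchen", {"sink", "fridge"}, came_from="hallway")
graph.observe("bedroom", {"bed"}, came_from="hallway")

target_room = graph.rooms_with("fridge")[0]
route = graph.path("bedroom", target_room)
print(route)  # ['bedroom', 'hallway', 'kitchen']
```

Keeping the memory at room granularity means the route is a short sequence of room transitions rather than a dense metric path, which is one plausible reason such a graph suits long-horizon recall; a full system would still need a local planner to move between adjacent rooms.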

Key results

State-of-the-art performance on standard ObjectNav benchmarks
Higher success rates and shorter navigation paths compared to baselines
Effective long-horizon planning via topological memory recall and backtracking
Successful real-world robotic deployment validating simulation results

Why it matters

Enables embodied agents to overcome memory bottlenecks for robust, scalable navigation in large-scale, dynamic real-world environments.

Abstract

Object Navigation (ObjectNav) has made great progress with large language models (LLMs), but still faces challenges in memory management, especially in long-horizon tasks and dynamic scenes. To address this, we propose TopoNav, a new framework that leverages topological structures as spatial memory. By building and updating a topological graph that captures scene connections, adjacency, and semantic meaning, TopoNav helps agents accumulate spatial knowledge over time, retrieve key information, and reason effectively toward distant goals. Our experiments show that TopoNav achieves state-of-the-art performance on benchmark ObjectNav datasets, with higher success rates and more efficient paths. It particularly excels in diverse and complex environments, as it connects temporary visual inputs with lasting spatial understanding.

Index terms

Vision-Based Navigation, AI-Based Methods, AI-Enabled Robotics