ICRA 2026

ApexNav: An Adaptive Exploration Strategy for Zero-Shot Object Navigation with Target-Centric Semantic Fusion

Mingjie Zhang, Yuheng Du, Chengkai Wu, Jinni ZHOU, Zhenchao Qi, Jun Ma, Boyu Zhou

AI summary

ApexNav improves the efficiency and robustness of zero-shot object navigation by adaptively switching between semantic and geometric exploration and by fusing multi-frame detections to filter noise.
Zero-shot object navigation, Adaptive exploration, Semantic fusion, Autonomous agents, Robotic navigation, Target-centric memory

Problem

Current zero-shot object navigation methods explore inefficiently in environments with weak semantic cues, and they identify targets unreliably because they depend on noisy single-frame detections or rigid fusion strategies.

Approach

The framework switches dynamically between semantic reasoning and geometry-based exploration according to the strength of environmental cues, and it uses a target-centric fusion method to accumulate multi-frame evidence for reliable object identification.
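
The mode switch can be illustrated with a minimal sketch. It is not the authors' implementation: the `Frontier` type, the use of the standard deviation of frontier scores as the measure of cue strength, and the `spread_threshold` value are all assumptions made for illustration. When the scores are well separated, the sketch follows the most promising frontier; when they are nearly uniform, it falls back to the nearest frontier.

```python
from dataclasses import dataclass
from math import hypot
from statistics import pstdev


@dataclass
class Frontier:
    # Hypothetical frontier record: a 2D position plus a semantic
    # relevance score in [0, 1] for the current target object.
    x: float
    y: float
    semantic_score: float


def choose_frontier(frontiers, robot_xy, spread_threshold=0.1):
    """Return (mode, frontier) for the next exploration goal.

    Assumption: cue strength is approximated by the population standard
    deviation of the semantic scores. The threshold is illustrative.
    """
    if not frontiers:
        raise ValueError("no frontiers left to explore")

    spread = pstdev(f.semantic_score for f in frontiers)
    if spread >= spread_threshold:
        # Strong cues: some frontier clearly stands out, so follow it.
        return "semantic", max(frontiers, key=lambda f: f.semantic_score)

    # Weak cues: scores are nearly uniform, so pick the closest frontier.
    rx, ry = robot_xy
    return "geometric", min(frontiers, key=lambda f: hypot(f.x - rx, f.y - ry))
```

With scores of 0.9, 0.2, and 0.25 the sketch returns the 0.9 frontier in semantic mode; with scores of 0.30, 0.31, and 0.29 it returns the frontier nearest the robot in geometric mode.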

Key results

  • Adaptive exploration strategy that switches between semantic and geometric modes based on cue distribution
  • Target-centric semantic fusion that aggregates multi-frame detections with confidence weighting
  • State-of-the-art zero-shot navigation performance on HM3Dv1, HM3Dv2, and MP3D datasets
  • Successful real-world deployment validating sim-to-real transfer
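
The confidence-weighted aggregation in the second bullet can be sketched as a running weighted vote per object. This is only an illustration under stated assumptions: the `TargetCentricMemory` class, its method names, and the 0.5 decision threshold are hypothetical, and the paper's actual fusion rule may differ.

```python
from collections import defaultdict


class TargetCentricMemory:
    """Hypothetical per-object memory of target / non-target evidence."""

    def __init__(self):
        self._weighted_votes = defaultdict(float)  # sum of confidence * vote
        self._total_weight = defaultdict(float)    # sum of confidence

    def update(self, object_id, is_target, confidence):
        # Each detection votes 1.0 (target) or 0.0 (similar, non-target
        # object), weighted by the detector's confidence.
        self._weighted_votes[object_id] += confidence * (1.0 if is_target else 0.0)
        self._total_weight[object_id] += confidence

    def target_score(self, object_id):
        # Confidence-weighted mean of all votes seen so far for this object.
        weight = self._total_weight.get(object_id, 0.0)
        if weight == 0.0:
            return 0.0
        return self._weighted_votes.get(object_id, 0.0) / weight

    def is_target(self, object_id, threshold=0.5):
        return self.target_score(object_id) >= threshold
```

In this sketch a single low-confidence false positive (target, 0.4) followed by two confident rejections (0.8 and 0.9) yields a score of about 0.19, so the object is not accepted as the target, whereas a single-frame decision would have accepted it.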

Why it matters

Advances practical autonomous navigation for search-and-rescue and service robots by enabling reliable, efficient target search in complex, unknown environments.

Abstract

Navigating unknown environments to find a target object is a significant challenge. While semantic information is crucial for navigation, relying solely on it for decision-making may not always be efficient, especially in environments with weak semantic cues. Additionally, many methods are susceptible to misdetections, especially in environments with visually similar objects. To address these limitations, we propose ApexNav, a zero-shot object navigation framework that is both more efficient and reliable. For efficiency, ApexNav adaptively utilizes semantic information by analyzing its distribution in the environment, guiding exploration through semantic reasoning when cues are strong, and switching to geometry-based exploration when they are weak. For reliability, we propose a target-centric semantic fusion method that preserves long-term memory of the target and similar objects, enabling robust object identification even under noisy detections. We evaluate ApexNav on the HM3Dv1, HM3Dv2, and MP3D datasets, where it outperforms state-of-the-art methods in both SR and SPL metrics. Comprehensive ablation studies further demonstrate the effectiveness of each module. Furthermore, real-world experiments validate the practicality of ApexNav in physical environments. The code will be released at https://github.com/Robotics-STAR-Lab/ApexNav.
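
The abstract does not define SR and SPL. Under the definitions commonly used in object navigation benchmarks, SR (success rate) is the fraction of successful episodes, and SPL (success weighted by path length) scales each success by the ratio of the shortest path length to the path the agent actually took. The sketch below assumes those standard definitions; the episode dictionary keys are hypothetical names chosen for illustration.

```python
def success_rate(episodes):
    """Fraction of episodes in which the agent reached the target."""
    return sum(1 for e in episodes if e["success"]) / len(episodes)


def spl(episodes):
    """Success weighted by path length, averaged over all episodes.

    A failed episode contributes 0. A successful one contributes
    shortest_path / max(agent_path, shortest_path), so a detour lowers
    the score and an optimal path scores 1.
    """
    total = 0.0
    for e in episodes:
        if e["success"]:
            total += e["shortest_path"] / max(e["agent_path"], e["shortest_path"])
    return total / len(episodes)


# Illustrative data only, not results from the paper.
episodes = [
    {"success": True, "shortest_path": 5.0, "agent_path": 10.0},
    {"success": False, "shortest_path": 4.0, "agent_path": 12.0},
    {"success": True, "shortest_path": 6.0, "agent_path": 6.0},
]
print(success_rate(episodes), spl(episodes))
```

For these three made-up episodes, two of three succeed, and the per-episode SPL terms are 0.5, 0, and 1, giving an SPL of 0.5.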

Index terms

Search and Rescue Robots, Vision-Based Navigation, Autonomous Agents
