TrustNavGPT: Trust-Driven Audio-Guided Robot Navigation under Uncertainty with Large Language Models
Xingpeng Sun, Yiran Zhang, Xindi Tang, Amrit Singh Bedi, Aniket Bera
Abstract
Large language models (LLMs) exhibit a wide range of promising capabilities – from step-by-step planning to commonsense reasoning –that provide utility for robot navigation. However, as humans communicate with robots in the real world, ambiguity and uncertainty may be embedded inside spoken instructions. While LLMs are proficient at processing text in human conversations, they often encounter difficulties with the nuances of verbal instructions and, thus, remain prone to hallucinate trust in human command. In this work, we present TrustNavGPT, an LLM-based audio-guided navigation agent that uses affective cues in spoken communication—elements such as tone and inflection that convey meaning beyond words—allowing it to assess the trustworthiness of human commands and make effective, safe decisions. Experiments across a variety of simulation and real-world setups show a 70.46% success rate in catching command uncertainty and an 80% success rate in finding the target, 48.30%, and 55% outperform existing LLM-based navigation methods, respectively. Additionally, TrustNavGPT shows remarkable resilience against adversarial attacks, highlighted by a 22%+ less decrease ratio than the existing LLM navigation method in success rate. Our approach provides a lightweight yet effective approach that extends existing LLMs to model audio vocal features embedded in the voice command and model uncertainty for safe robotic navigation. For more information, visit the TrustNav project page.