SII 2025

Open Vocabulary Object Search Utilizing Large Language Models and Fuzzy Inferencing

Akash Chikhalikar, Ankit A. Ravankar, Jose Victorio Salazar Luces, Yasuhisa Hirata

Abstract

Open vocabulary task execution is crucial in autonomous robotics, particularly for indoor service robots operating in dynamic, human-centric environments. Conventional dictionary-based approaches either fail to capture the diversity of interactions between objects and humans or face scalability issues in memory and computation over time. Thus, a framework that can execute high-level tasks with robust open-set capabilities is desirable. We consider Object Search, the task of searching for dynamic objects in an indoor environment. While state-of-the-art approaches focus on the most effective ways to search for a closed set of objects, we propose a framework capable of generalizing to unknown, unseen, and ultimately an open set of objects. Our framework leverages priors of a fixed set of objects to generate task-driven priors for an open set of objects, using Large Language Models (LLMs) and fuzzy logic to facilitate this prior generation. The framework also captures the physical layout of the environment to inform task-driven prior generation. Finally, we validate our framework through extensive real-world experiments and comparisons with competitive methods, demonstrating its effectiveness in generalizing to an open set of objects. The results show that our framework outperforms related methods in reducing search time, distance, and the number of visited landmarks.
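To make the idea of deriving an open-set prior from closed-set priors concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes that each known object has a prior over landmarks, that an LLM has already returned a similarity score in [0, 1] between the novel object and each known object (hard-coded here as hypothetical values), and that a simple ramp-shaped fuzzy membership for "highly similar" serves as the weight in a weighted average. The landmark names, the thresholds 0.3 and 0.8, and the function names are all assumptions made for illustration.

```python
def high_similarity(score, low=0.3, high=0.8):
    """Ramp fuzzy membership for 'highly similar' (thresholds are assumed)."""
    if score <= low:
        return 0.0
    if score >= high:
        return 1.0
    return (score - low) / (high - low)


def open_set_prior(known_priors, similarity):
    """Blend known-object priors into a prior for a novel object.

    known_priors: {object: {landmark: probability}}
    similarity:   {object: score in [0, 1]}, e.g. obtained from an LLM
    """
    landmarks = sorted({lm for prior in known_priors.values() for lm in prior})
    weights = {obj: high_similarity(similarity.get(obj, 0.0))
               for obj in known_priors}
    total = sum(weights.values())
    if total == 0.0:
        # No known object is similar enough: fall back to a uniform prior.
        return {lm: 1.0 / len(landmarks) for lm in landmarks}
    return {
        lm: sum(weights[obj] * known_priors[obj].get(lm, 0.0)
                for obj in known_priors) / total
        for lm in landmarks
    }


# Hypothetical closed-set priors over three landmarks.
known_priors = {
    "mug": {"kitchen_counter": 0.6, "desk": 0.3, "sofa": 0.1},
    "remote": {"kitchen_counter": 0.05, "desk": 0.15, "sofa": 0.8},
}
# Hypothetical similarity scores for the unseen object "teacup".
similarity = {"mug": 0.9, "remote": 0.4}

teacup_prior = open_set_prior(known_priors, similarity)
search_order = sorted(teacup_prior, key=teacup_prior.get, reverse=True)
print(teacup_prior)
print(search_order)
```

In this sketch the robot would visit landmarks in `search_order`, so a novel object inherits most of its prior from the known objects it most resembles. A layout-aware variant could additionally reweight each landmark by travel cost, which is not modeled here.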

Index terms

Decision Making Systems; Human-Robot/System Interaction