ICRA 2026

OVITA: Open-Vocabulary Interpretable Trajectory Adaptations

Anurag Maurya, Tashmoy Ghosh, Anh Nguyen, Ravi Prakash

AI summary

OVITA lets non-expert users adapt robot trajectories in real time using open-vocabulary natural language commands, backed by an interpretable code-based policy and an iterative feedback loop.

Open-vocabulary control; Trajectory adaptation; Large language models; Human-robot collaboration; Interpretable robotics; Code-as-policy

Problem

Adapting robot trajectories to dynamic environments and user preferences remains difficult for non-experts due to the need for complex parameterization and rigid command structures. Existing methods often lack interpretability, require extensive training data, or fail to support precise, open-ended instructions.

Approach

OVITA uses multiple pre-trained LLMs to translate open-vocabulary natural language instructions into executable Python code that modifies trajectory waypoints. A quadratic programming (QP) module then processes the adapted trajectory to ensure safety and smoothness. An integrated code explainer and an iterative feedback loop let non-expert users understand and refine the adaptations.
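To make the code-as-policy idea concrete, here is a minimal sketch of the kind of Python an LLM might emit for an instruction such as "go 10 cm higher when passing near the cup". The function name `adapt_trajectory`, the `(x, y, z, speed)` waypoint layout, and all numeric values are assumptions made for illustration; none of them is taken from the paper.

```python
from math import dist

# Hypothetical waypoint layout: (x, y, z, speed), positions in metres.

def adapt_trajectory(trajectory, obstacle_xyz, radius=0.3, lift=0.10):
    """Raise every waypoint within `radius` of the obstacle by `lift`.

    Waypoints outside the radius, and all speeds, are left unchanged.
    """
    adapted = []
    for x, y, z, speed in trajectory:
        if dist((x, y, z), obstacle_xyz) <= radius:
            z += lift
        adapted.append((x, y, z, speed))
    return adapted


# Illustrative straight-line trajectory along x at a height of 0.5 m.
trajectory = [
    (0.0, 0.0, 0.5, 0.2),
    (0.2, 0.0, 0.5, 0.2),
    (0.4, 0.0, 0.5, 0.2),
    (0.8, 0.0, 0.5, 0.2),
]

# Assumed object position; only the two middle waypoints fall inside the radius.
adapted = adapt_trajectory(trajectory, obstacle_xyz=(0.3, 0.0, 0.5),
                           radius=0.15, lift=0.10)

for before, after in zip(trajectory, adapted):
    print(before, "->", after)
```

Because the policy is plain code operating on individual waypoints, a user (or a second LLM acting as explainer) can read exactly which waypoints change and by how much. The abrupt height step this sketch produces is the sort of artifact a later smoothing stage would need to remove.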

Key results

Supports exact numerical, open-ended, and multi-step language commands without fine-tuning
Generates executable Python code as an interpretable adaptation policy
Validated across diverse tasks on heterogeneous platforms including manipulators, ground robots, and drones
Integrates a QP-based constraint module to guarantee physically feasible and smooth trajectories
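The QP-based module in the last result can be illustrated with a deliberately simplified sketch. The summary does not state the paper's cost function or constraints, so the formulation below is an assumption: for one coordinate, minimise the squared deviation from the adapted waypoints plus a penalty `lam` on squared second differences, with the first and last waypoints pinned. With only equality constraints the QP reduces to a linear system, solved here in pure Python; inequality constraints such as velocity or workspace limits would require an actual QP solver. The names `smooth_1d`, `solve_linear`, and `roughness` are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def smooth_1d(ref, lam=10.0):
    """Minimise sum (x_i - ref_i)^2 + lam * sum (x_{i-1} - 2 x_i + x_{i+1})^2
    subject to x_0 = ref_0 and x_{n-1} = ref_{n-1}."""
    n = len(ref)
    if n < 3:
        return [float(v) for v in ref]
    # A = I + lam * D^T D, where D is the (n-2) x n second-difference operator.
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    stencil = (1.0, -2.0, 1.0)
    for k in range(n - 2):  # row k of D touches columns k, k+1, k+2
        for p in range(3):
            for q in range(3):
                A[k + p][k + q] += lam * stencil[p] * stencil[q]
    rhs = [float(v) for v in ref]
    # Equality constraints: replace the first and last equations with x_i = ref_i.
    for i in (0, n - 1):
        A[i] = [1.0 if j == i else 0.0 for j in range(n)]
    return solve_linear(A, rhs)


def roughness(values):
    """Sum of squared second differences, the smoothness term of the cost."""
    return sum((values[i - 1] - 2 * values[i] + values[i + 1]) ** 2
               for i in range(1, len(values) - 1))


# Illustrative heights after a code-based adaptation lifted two waypoints.
z_ref = [0.5, 0.5, 0.6, 0.6, 0.5, 0.5]
z_smooth = smooth_1d(z_ref, lam=10.0)
print("roughness before:", roughness(z_ref))
print("roughness after: ", roughness(z_smooth))
```

In this sketch the same routine would be applied to each coordinate independently. Raising `lam` trades fidelity to the user's adaptation for smoothness, which is the basic tension any such post-processing stage has to balance.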

Why it matters

OVITA bridges the gap between high-level human intent and low-level robot control, enabling scalable, interpretable, real-time trajectory adaptation for non-expert users in dynamic environments.

Abstract

Adapting trajectories to dynamic situations and user preferences is crucial for robot operation in unstructured environments with non-expert users. Natural language enables users to express these adjustments in an interactive manner. We introduce OVITA, an interpretable, open-vocabulary, language-driven framework designed for adapting robot trajectories in dynamic and novel situations based on human instructions. OVITA leverages multiple pre-trained Large Language Models (LLMs) to integrate user commands into trajectories generated by motion planners or those learned through demonstrations. OVITA employs code as an adaptation policy generated by an LLM, enabling users to adjust individual waypoints, thus providing flexible control. Another LLM, which acts as a code explainer, removes the need for expert users, enabling intuitive interactions. The efficacy and significance of the proposed OVITA framework are demonstrated through extensive simulations and real-world environments with diverse tasks involving spatiotemporal variations on heterogeneous robotic platforms such as a KUKA IIWA robot manipulator, Clearpath Jackal ground robot, and CrazyFlie drone.

Index terms

Motion and Path Planning; Human-Robot Collaboration; Big Data in Robotics and Automation