ICRA 2026

Mixed-Initiative Dialog for Human-Robot Collaborative Manipulation

Albert Yu, Chengshu Li, Luca Macesanu, Arnav Balaji, Ruchira Ray, Raymond Mooney, Roberto Martín-Martín

AI summary

MICoBot enables bidirectional natural-language negotiation to dynamically allocate physical task steps between a human and a robot, improving task success rate by 50% over a pure LLM baseline and earning higher user preference than the baselines.

Mixed-initiative dialog, human-robot collaboration, task allocation, natural language negotiation, robotic manipulation, adaptive planning

Problem

Current human-robot collaboration systems rely on fixed plans or one-directional dialog, failing to adapt to diverse human partners, changing willingness to help, and dynamic task contexts.

Approach

MICoBot uses a three-level framework that estimates human helpfulness from dialog history, evaluates robot capabilities via simulation, and applies constrained optimization to dynamically allocate task steps and negotiate via natural language.
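The allocation step can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes each step has a robot success probability (standing in for the simulation-based capability estimate), a human success probability, and a human effort cost, and it searches exhaustively for the assignment with the least human effort whose joint success probability meets a threshold. The function name `allocate`, its parameters, and the numbers in the usage lines are all hypothetical.

```python
from itertools import product
from math import prod


def allocate(robot_success, human_success, human_effort, min_success):
    """Pick who does each step (hypothetical sketch, not the paper's solver).

    Minimizes total human effort subject to the constraint that the product
    of per-step success probabilities is at least min_success.
    Returns (assignment, effort, success), or None if nothing is feasible.
    """
    n = len(robot_success)
    best = None
    # Exhaustive search over all 2^n assignments; fine for short task plans.
    for assignment in product(("robot", "human"), repeat=n):
        success = prod(
            robot_success[i] if who == "robot" else human_success[i]
            for i, who in enumerate(assignment)
        )
        if success < min_success:
            continue  # violates the success constraint
        effort = sum(
            human_effort[i] for i, who in enumerate(assignment) if who == "human"
        )
        # Prefer lower effort; break ties by higher success.
        candidate = (effort, -success, assignment)
        if best is None or candidate < best:
            best = candidate
    if best is None:
        return None
    effort, neg_success, assignment = best
    return assignment, effort, -neg_success


# Assumed example values: the robot is weak at the middle step.
result = allocate(
    robot_success=[0.9, 0.4, 0.95],
    human_success=[1.0, 1.0, 1.0],
    human_effort=[2.0, 1.0, 3.0],
    min_success=0.8,
)
print(result[0])  # ('robot', 'human', 'robot')
```

In this toy instance the robot keeps the two steps it is good at, and only the step it is likely to fail is handed to the human. A real system would also need to account for whether the human is willing to accept the request, which this sketch leaves out.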

Key results

50% improvement in task success rate over a pure LLM baseline
Preferred by over 75% of human participants in physical trials
Novel constrained optimization framework balancing success and human effort
Collaborative simulation environment with LLM-controlled virtual humans

Why it matters

Enables flexible, adaptive human-robot teamwork for long-horizon physical tasks, with broader implications for proactive AI agents and collaborative assistants.

Abstract

Effective robotic systems for long-horizon human-robot collaboration must adapt to a wide range of human partners, whose physical behavior, willingness to assist, and understanding of the robot’s capabilities may change over time. This demands a tightly coupled communication loop that grants both agents the flexibility to propose, accept, or decline requests as they coordinate toward completing the task effectively. We propose MICoBot, a system that enables the human and robot, both using natural language, to take initiative in formulating, accepting, or rejecting proposals on who can best complete different steps of a task. To handle diverse, task-directed dialog, and find successful collaborative strategies that minimize human effort, MICoBot makes decisions at three levels: (1) a meta-planner considers human dialog to formulate and code a high-level collaboration strategy, (2) a planner optimally allocates the remaining steps to either agent based on the robot’s capabilities (measured by a simulation-pretrained affordance model) and the estimated human’s willingness to help, and (3) an action executor decides the low-level actions to perform or words to say to the human. In physical robot trials with 18 unique human participants, MICoBot significantly improves task success and user experience over a pure LLM baseline and standard agent allocation models. See additional videos and materials at our project site.
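How the three levels described in the abstract might chain together can be sketched as follows. This is only an illustrative skeleton: keyword rules stand in for the language-model meta-planner, a fixed threshold stands in for the planner's optimization, and every name (`Strategy`, `meta_planner`, `planner`, `executor`), phrase pattern, and probability is a hypothetical placeholder rather than part of MICoBot.

```python
from dataclasses import dataclass, field


@dataclass
class Strategy:
    # Steps the dialog has pinned to one agent: step name -> "robot" or "human".
    forced: dict = field(default_factory=dict)


def meta_planner(utterances, steps):
    """Level 1 stand-in: turn human dialog into a high-level strategy.

    Uses simple keyword rules (an assumption); the paper's meta-planner
    is described only at a high level in the abstract.
    """
    strategy = Strategy()
    for text in utterances:
        lowered = text.lower()
        for step in steps:
            if step not in lowered:
                continue
            if "i can't" in lowered or "i won't" in lowered:
                strategy.forced[step] = "robot"  # human declined this step
            elif "i'll" in lowered or "i will" in lowered:
                strategy.forced[step] = "human"  # human took the initiative
    return strategy


def planner(steps, robot_success, strategy, threshold=0.5):
    """Level 2 stand-in: allocate each step, honoring the strategy."""
    plan = {}
    for step in steps:
        if step in strategy.forced:
            plan[step] = strategy.forced[step]
        else:
            # Simplification: the robot keeps steps it is likely to complete.
            plan[step] = "robot" if robot_success[step] >= threshold else "human"
    return plan


def executor(step, plan):
    """Level 3 stand-in: either act on the step or say something to the human."""
    if plan[step] == "robot":
        return ("act", step)
    return ("say", f"Could you help me with this step: {step}?")


steps = ["pick up gift", "tie ribbon"]
robot_success = {"pick up gift": 0.9, "tie ribbon": 0.2}  # assumed values
strategy = meta_planner(["Sorry, I can't tie ribbon right now."], steps)
plan = planner(steps, robot_success, strategy)
print([executor(step, plan) for step in steps])
```

With the declining utterance above, the robot ends up attempting both steps itself. With no dialog at all, the same code would instead ask the human for help with the step the robot is unlikely to complete.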

Index terms

Human-Robot Collaboration, Natural Dialog for HRI, Machine Learning for Robot Control