ICRA 2026

Mixed-Initiative Dialog for Human-Robot Collaborative Manipulation

Albert Yu, Chengshu Li, Luca Macesanu, Arnav Balaji, Ruchira Ray, Raymond Mooney, Roberto Martín-Martín

AI summary

MICoBot enables bidirectional natural-language negotiation to dynamically allocate physical tasks, improving task success rate by 50% over a pure LLM baseline and earning higher user preference than the baselines.
Mixed-initiative dialog, human-robot collaboration, task allocation, natural language negotiation, robotic manipulation, adaptive planning

Problem

Current human-robot collaboration systems rely on fixed plans or one-directional dialog, failing to adapt to diverse human partners, changing willingness to help, and dynamic task contexts.

Approach

MICoBot uses a three-level framework that estimates human helpfulness from dialog history, evaluates robot capabilities via simulation, and applies constrained optimization to dynamically allocate task steps and negotiate via natural language.
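The allocation step can be illustrated with a small sketch. This is not the paper's implementation: it assumes a brute-force search over assignments, a hypothetical per-step robot success estimate (standing in for the affordance model), a hypothetical per-step human effort cost, and an effort budget standing in for the human's estimated willingness to help. The names Step, allocate, min_success, and effort_budget are all invented for illustration.

```python
from dataclasses import dataclass
from itertools import product
from math import prod


@dataclass(frozen=True)
class Step:
    name: str
    robot_success: float  # assumed affordance estimate in [0, 1]
    human_effort: float   # assumed cost of asking the human to do this step


def allocate(steps, min_success, effort_budget, human_success=1.0):
    """Return (assignment, effort, success) with minimal human effort.

    Returns None when no assignment meets both constraints.
    """
    best = None
    # Enumerate every way of giving each step to the robot or the human.
    for agents in product(("robot", "human"), repeat=len(steps)):
        # Plan success is the product of per-step success probabilities.
        success = prod(
            s.robot_success if a == "robot" else human_success
            for s, a in zip(steps, agents)
        )
        effort = sum(s.human_effort for s, a in zip(steps, agents) if a == "human")
        if success < min_success or effort > effort_budget:
            continue  # violates a constraint
        # Prefer lower human effort; break ties by higher success.
        key = (effort, -success)
        if best is None or key < best[0]:
            assignment = {s.name: a for s, a in zip(steps, agents)}
            best = (key, assignment, effort, success)
    return None if best is None else best[1:]


steps = [
    Step("pick up bowl", robot_success=0.9, human_effort=1.0),
    Step("open drawer", robot_success=0.4, human_effort=2.0),
    Step("place bowl", robot_success=0.8, human_effort=1.0),
]
plan = allocate(steps, min_success=0.7, effort_budget=3.0)
```

With these made-up numbers, the search hands only the drawer step to the human, since the robot's low success estimate on that step is what pushes the all-robot plan below the threshold. When the budget is too small for any feasible plan, allocate returns None, which in a system like the one described would be a natural point to negotiate with the human rather than proceed.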

Key results

  • 50% improvement in task success rate over a pure LLM baseline
  • Preferred by over 75% of human participants in physical trials
  • Novel constrained optimization framework balancing success and human effort
  • Collaborative simulation environment with LLM-controlled virtual humans

Why it matters

Enables flexible, adaptive human-robot teamwork for long-horizon physical tasks, with broader implications for proactive AI agents and collaborative assistants.

Abstract

Effective robotic systems for long-horizon human-robot collaboration must adapt to a wide range of human partners, whose physical behavior, willingness to assist, and understanding of the robot’s capabilities may change over time. This demands a tightly coupled communication loop that grants both agents the flexibility to propose, accept, or decline requests as they coordinate toward completing the task effectively. We propose MICoBot, a system that enables the human and robot, both using natural language, to take initiative in formulating, accepting, or rejecting proposals on who can best complete different steps of a task. To handle diverse, task-directed dialog, and find successful collaborative strategies that minimize human effort, MICoBot makes decisions at three levels: (1) a meta-planner considers human dialog to formulate and code a high-level collaboration strategy, (2) a planner optimally allocates the remaining steps to either agent based on the robot’s capabilities (measured by a simulation-pretrained affordance model) and the estimated human’s willingness to help, and (3) an action executor decides the low-level actions to perform or words to say to the human. In physical robot trials with 18 unique human participants, MICoBot significantly improves task success and user experience over a pure LLM baseline and standard agent allocation models. See additional videos and materials at our project site.

Index terms

Human-Robot Collaboration, Natural Dialog for HRI, Machine Learning for Robot Control
