MICA: Multi-Agent Industrial Coordination Assistant
Di Wen, Kunyu Peng, Junwei Zheng, Yufan Chen, Yitian Shi, Jiale Wei, Ruiping LIU, Kailun Yang, Rainer Stiefelhagen
AI summary
Problem
Existing multi-agent LLM assistants lack perception grounding, real-time responsiveness, and privacy-preserving edge deployment required for safety-critical factory workflows.
Approach
MICA integrates depth-guided egocentric vision with a lightweight multi-agent reasoning core and Adaptive Step Fusion (ASF) to dynamically blend workflow rules, visual retrieval, and online speech feedback under strict safety auditing.
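As a rough illustration of the blending idea behind ASF, the sketch below fuses per-step scores from two hypothetical expert sources (workflow rules and visual retrieval) and shifts the source weights whenever a spoken confirmation identifies the true step. This is not the authors' formulation: the class name `StepFusion`, the source names, the learning rate, and the multiplicative-weights update are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StepFusion:
    """Hypothetical sketch: weighted blend of step-evidence sources."""

    num_steps: int
    lr: float = 0.2  # assumed adaptation rate, not taken from the paper
    weights: dict = field(
        default_factory=lambda: {"rules": 0.5, "vision": 0.5}
    )

    def fuse(self, scores):
        """Blend per-step scores (source name -> list of probabilities)."""
        fused = [
            sum(self.weights[s] * scores[s][i] for s in self.weights)
            for i in range(self.num_steps)
        ]
        total = sum(fused)
        if total == 0:
            # No evidence from any source: fall back to a uniform guess.
            return [1.0 / self.num_steps] * self.num_steps
        return [p / total for p in fused]

    def update(self, scores, confirmed_step):
        """Reweight sources after speech feedback confirms a step.

        Uses a multiplicative-weights rule (an assumption): sources that
        gave the confirmed step a higher score gain weight.
        """
        for s in self.weights:
            self.weights[s] *= 1.0 + self.lr * scores[s][confirmed_step]
        norm = sum(self.weights.values())
        for s in self.weights:
            self.weights[s] /= norm


fusion = StepFusion(num_steps=3)
scores = {"rules": [0.6, 0.3, 0.1], "vision": [0.2, 0.7, 0.1]}
fused = fusion.fuse(scores)
# Suppose the worker says "I'm on the second step": confirm index 1.
fusion.update(scores, confirmed_step=1)
```

In this toy run the visual source scored the confirmed step higher than the rule source, so its weight grows and later fusions lean more on vision.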
Key results
- Adaptive Step Fusion improves step recognition accuracy and calibration via online speech feedback
- Role-specialized multi-agent routing consistently outperforms baseline coordination topologies in task success and reliability
- System achieves real-time, safety-audited guidance while running fully offline on practical edge hardware
- New benchmark with Knowledge Base Alignment and Energy-per-success metrics enables standardized evaluation of industrial assistance
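The summary does not define Energy-per-success. One plausible reading, assumed here purely for illustration, is the total energy spent across all task attempts divided by the number of successful attempts. The function name and the joule unit below are hypothetical.

```python
import math


def energy_per_success(energies_joules, successes):
    """Total energy over all attempts divided by the success count.

    Assumed definition for illustration; the paper's exact metric may
    differ. Returns infinity when no attempt succeeded.
    """
    if len(energies_joules) != len(successes):
        raise ValueError("one success flag is required per attempt")
    num_success = sum(1 for ok in successes if ok)
    if num_success == 0:
        return math.inf
    return sum(energies_joules) / num_success


# Three attempts, two of which succeeded.
eps = energy_per_success([10.0, 20.0, 30.0], [True, False, True])
```

Under this reading, energy spent on failed attempts still counts against the system, so a topology that fails often is penalized even if each individual run is cheap.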
Why it matters
Enables deployable, privacy-preserving multi-agent assistance for dynamic factory environments where cloud offloading and large annotated datasets are infeasible.
Abstract
Industrial workflows demand adaptive and trustworthy assistance that can operate under limited computing, connectivity, and strict privacy constraints. In this work, we present MICA (Multi-Agent Industrial Coordination Assistant), a perception-grounded and speech-interactive system that delivers real-time guidance for assembly, troubleshooting, part queries, and maintenance. MICA coordinates five role-specialized language agents, audited by a safety checker, to ensure accurate and compliant support. To achieve robust step understanding, we introduce Adaptive Step Fusion (ASF), which dynamically blends expert reasoning with online adaptation from natural speech feedback. Furthermore, we establish a new multi-agent coordination benchmark across representative task categories and propose evaluation metrics tailored to industrial assistance, enabling systematic comparison of different coordination topologies. Our experiments demonstrate that MICA consistently improves task success, reliability, and responsiveness over baseline structures, while remaining deployable on practical offline hardware. Together, these contributions highlight MICA as a step toward deployable, privacy-preserving multi-agent assistants for dynamic factory environments. The source code will be made publicly available at https://github.com/Kratos-Wen/MICA.