ICRA 2026

Fast ECoT: Efficient Embodied Chain-of-Thought via Thoughts Reuse

Zhekai Duan, Yuan Zhang, Shikai Geng, Gaowen Liu, Joschka Boedecker, Chris Xiaoxuan Lu

AI summary

Fast ECoT cuts embodied chain-of-thought inference latency by up to 7.5× with comparable or improved task success and preserved interpretability, bringing reasoning-based robotic policies closer to real-time deployment.

Embodied reasoning, Inference acceleration, Chain-of-thought, Robotic control, Parallel decoding, Real-time robotics

Problem

Sequential autoregressive generation in Embodied Chain-of-Thought (ECoT) creates high inference latency, making real-time deployment of interpretable robotic policies impractical.

Approach

Fast ECoT caches stable high-level reasoning across timesteps, parallelizes modular reasoning steps, and decouples reasoning from action decoding via an asynchronous scheduler.
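
The caching idea can be illustrated with a minimal Python sketch. This is not the authors' implementation: the field names (`plan`, `subtask`, `move`, `gripper`), the fixed `refresh_every` interval, and the `toy_generate` stand-in for the VLA model are all assumptions made for illustration. It only shows how reusing slowly changing high-level thoughts reduces the number of generation calls per timestep.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical split of reasoning fields (names are illustrative only).
HIGH_LEVEL: List[str] = ["plan", "subtask"]   # assumed to change slowly
LOW_LEVEL: List[str] = ["move", "gripper"]    # assumed to change every step


@dataclass
class ThoughtCache:
    """Reuses cached high-level thoughts between periodic refreshes."""
    refresh_every: int = 5                    # assumed interval, in timesteps
    thoughts: Dict[str, str] = field(default_factory=dict)
    calls: int = 0                            # number of generator invocations

    def step(self, t: int, generate: Callable[[str, int], str]) -> Dict[str, str]:
        missing = any(name not in self.thoughts for name in HIGH_LEVEL)
        refresh = missing or t % self.refresh_every == 0
        # Regenerate high-level fields only on refresh; low-level fields always.
        for name in (HIGH_LEVEL if refresh else []) + LOW_LEVEL:
            self.thoughts[name] = generate(name, t)
            self.calls += 1
        return dict(self.thoughts)


def toy_generate(name: str, t: int) -> str:
    """Stand-in for an expensive autoregressive generation call."""
    return f"{name}@{t}"


cache = ThoughtCache(refresh_every=5)
history = [cache.step(t, toy_generate) for t in range(10)]
print(history[3])    # high-level fields still date from t=0
print(cache.calls)   # 24 calls, versus 40 if every field were regenerated
```

In this toy setting the saving comes purely from skipping regeneration of the high-level fields; a real system would also need a rule for deciding when a cached thought has gone stale.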

Key results

Up to 7.5× inference latency reduction
Comparable or improved task success rate on LIBERO and real-world robot tasks
Preserved reasoning faithfulness and interpretability
No additional training or model changes required

Why it matters

Enables practical, real-time deployment of interpretable, reasoning-capable robotic policies without sacrificing performance or requiring retraining.

Abstract

Embodied Chain-of-Thought (ECoT) reasoning enhances vision-language-action (VLA) models by improving performance and interpretability through intermediate reasoning steps. However, its sequential autoregressive token generation introduces significant inference latency, limiting real-time deployment. We propose Fast ECoT, an inference-time acceleration method that exploits the structured and repetitive nature of ECoT to (1) cache and reuse high-level reasoning across timesteps and (2) parallelise the generation of modular reasoning steps. Additionally, we introduce an asynchronous scheduler that decouples reasoning from action decoding, further boosting responsiveness. Fast ECoT requires no model changes or additional training and easily integrates into existing VLA pipelines. Experiments in both simulation (LIBERO) and real-world robot tasks show up to a 7.5× reduction in latency with comparable or improved task success rate and reasoning faithfulness, bringing ECoT policies closer to practical real-time deployment. Code is available at https://github.com/kevinDuan1/Fast-ECoT.
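
The decoupling of reasoning from action decoding can be sketched as follows. This is a simplified illustration under stated assumptions, not the scheduler from the paper: `AsyncReasoner`, `slow_reason`, and the single-worker thread pool are hypothetical names and design choices. The action loop always reads the most recent finished thought and never blocks on reasoning that is still running.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Optional


class AsyncReasoner:
    """Runs reasoning in the background; callers get the latest finished thought."""

    def __init__(self, reason_fn: Callable[[int], str], initial: str) -> None:
        self._reason_fn = reason_fn
        self._latest = initial
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending: Optional[Future] = None

    def latest(self, observation: int) -> str:
        # Collect a finished result, if any, without blocking.
        if self._pending is not None and self._pending.done():
            self._latest = self._pending.result()
            self._pending = None
        # Start a new reasoning pass only when none is in flight.
        if self._pending is None:
            self._pending = self._pool.submit(self._reason_fn, observation)
        return self._latest

    def wait_for_pending(self) -> None:
        # Demo helper: block until the in-flight reasoning pass finishes.
        if self._pending is not None:
            self._pending.result()

    def close(self) -> None:
        self._pool.shutdown(wait=True)


gate = threading.Event()  # keeps the toy reasoning "slow" until released


def slow_reason(obs: int) -> str:
    gate.wait(timeout=5)
    return f"thought for obs {obs}"


reasoner = AsyncReasoner(slow_reason, initial="initial thought")
a0 = reasoner.latest(0)      # reasoning for obs 0 starts in the background
a1 = reasoner.latest(1)      # still running, so the stale thought is reused
gate.set()                   # let the background reasoning finish
reasoner.wait_for_pending()
a2 = reasoner.latest(2)      # the finished thought for obs 0 is now visible
reasoner.close()
print(a0, "|", a1, "|", a2)
```

The trade-off in this sketch is staleness: actions at steps 0 and 1 are conditioned on an older thought, which is acceptable only when the reasoning content changes slowly relative to the control rate.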

Index terms

Imitation Learning, Engineering for Robotic Systems