ICRA 2026

Robot Control Stack: A Lean Ecosystem for Robot Learning at Scale

Tobias Thomas Jülg, Pierre Krack, Seongjin Bien, Yannik Blei, Khaled Gamal, Ken Nakahara, Johannes Hechtl, Roberto Calandra, Wolfram Burgard, Florian Walter

AI summary

RCS enables scalable, data-driven robot learning by providing a lightweight, unified software stack that seamlessly bridges simulation and real-world hardware for Vision-Language-Action models.

Robot learning, Vision-Language-Action models, Sim-to-real transfer, Robot control stack, MuJoCo, Imitation learning

Problem

Traditional robotics frameworks and simulators lack the flexibility and scalability needed for modern, data-centric robot learning workflows, creating bottlenecks for training and transferring Vision-Language-Action models to physical robots.

Approach

The authors present RCS, a lean, wrapper-based ecosystem that unifies hardware and MuJoCo simulation under a single Gymnasium-compatible Python API and a high-performance C++ backend to streamline policy training and deployment.
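To make the wrapper idea concrete, here is a minimal sketch of a Gymnasium-style `reset`/`step` interface with one wrapper layered on top. It is not the RCS API: `ToySimArm`, `ClipAction`, and `rollout` are hypothetical names invented for illustration, and the toy backend merely integrates joint deltas. The sketch deliberately avoids importing Gymnasium and only mirrors its five-value `step` return convention.

```python
from typing import Any


class ToySimArm:
    """Hypothetical simulated backend (not part of RCS).

    Joint positions simply accumulate the commanded deltas.
    """

    def __init__(self, num_joints: int = 3) -> None:
        self.num_joints = num_joints
        self.q = [0.0] * num_joints

    def reset(self) -> tuple[list[float], dict[str, Any]]:
        self.q = [0.0] * self.num_joints
        return list(self.q), {"backend": "sim"}

    def step(self, action: list[float]):
        self.q = [q + a for q, a in zip(self.q, action)]
        reward = -sum(abs(q) for q in self.q)  # toy reward: stay near zero
        # Gymnasium convention: obs, reward, terminated, truncated, info
        return list(self.q), reward, False, False, {"backend": "sim"}


class ClipAction:
    """Wrapper that limits every action component to [-limit, limit]."""

    def __init__(self, env, limit: float) -> None:
        self.env = env
        self.limit = limit

    def reset(self):
        return self.env.reset()

    def step(self, action: list[float]):
        clipped = [max(-self.limit, min(self.limit, a)) for a in action]
        return self.env.step(clipped)


def rollout(env, actions: list[list[float]]) -> list[float]:
    """Run a fixed action sequence and return the last observation."""
    obs, _ = env.reset()
    for action in actions:
        obs, _, terminated, truncated, _ = env.step(action)
        if terminated or truncated:
            break
    return obs


env = ClipAction(ToySimArm(), limit=0.1)
final_obs = rollout(env, [[0.5, -0.5, 0.05]] * 2)
```

Because `rollout` only relies on `reset` and `step`, a backend that talks to physical hardware could in principle be swapped in for `ToySimArm` without changing the wrapper or the rollout code, which is the property a unified interface is meant to provide.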

Key results

Modular wrapper-based architecture supporting Python and C++ robot control
Cross-embodiment evaluation across four physical robot setups and matched MuJoCo simulations
Extensive benchmarking of Octo, OpenVLA, and π0 policies on a standardized picking task
Demonstration that mixing synthetic and real-world data significantly boosts real-world policy performance
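As an illustration of the last point, the sketch below shows one common way to combine synthetic and real samples in a training batch, using a fixed per-batch fraction. This summary does not state the mixing ratio or sampling scheme the authors used, so `mixed_batch`, `synthetic_fraction`, and the placeholder datasets are assumptions made purely for the example.

```python
import random


def mixed_batch(real, synthetic, batch_size, synthetic_fraction, rng):
    """Draw one batch containing a fixed share of synthetic samples.

    Samples are drawn with replacement, so either source may be
    smaller than the number of items requested from it.
    """
    n_syn = round(batch_size * synthetic_fraction)
    n_real = batch_size - n_syn
    batch = rng.choices(synthetic, k=n_syn) + rng.choices(real, k=n_real)
    rng.shuffle(batch)  # avoid ordering the batch by data source
    return batch


# Placeholder datasets: (source, index) tuples stand in for trajectories.
real_data = [("real", i) for i in range(20)]
synthetic_data = [("sim", i) for i in range(200)]

batch = mixed_batch(real_data, synthetic_data, batch_size=8,
                    synthetic_fraction=0.25, rng=random.Random(0))
```

With `batch_size=8` and `synthetic_fraction=0.25`, each batch holds two simulated and six real samples; the fraction is a tunable choice, not a value taken from the paper.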

Why it matters

It provides robotics researchers with a scalable, low-overhead toolchain that accelerates the development, training, and deployment of foundation-model-based robot policies.

Abstract

Vision-Language-Action models (VLAs) mark a major shift in robot learning. They replace specialized architectures and task-tailored components of expert policies with large-scale data collection and setup-specific fine-tuning. In this machine learning-focused workflow that is centered around models and scalable training, traditional robotics software frameworks become a bottleneck, while robot simulations offer only limited support for transitioning from and to real-world experiments. In this work, we close this gap by introducing Robot Control Stack (RCS), a lean ecosystem designed from the ground up to support research in robot learning with large-scale generalist policies. At its core, RCS features a modular and easily extensible layered architecture with a unified interface for simulated and physical robots, facilitating sim-to-real transfer. Despite its minimal footprint and dependencies, it offers a complete feature set, enabling both real-world experiments and large-scale training in simulation. Our contribution is twofold: First, we introduce the architecture of RCS and explain its design principles. Second, we evaluate its usability and performance along the development cycle of VLA and RL policies. Our experiments also provide an extensive evaluation of Octo, OpenVLA, and π0 on multiple robots and shed light on how simulation data can improve real-world policy performance. Our code, datasets and videos are available at https://robotcontrolstack.github.io

Index terms

Software Architecture for Robotic and Automation; Imitation Learning; Transfer Learning