ICRA 2023

EMS®: A Massive Computational Experiment Management System towards Data-Driven Robotics

Qinjie Lin, Guo Ye, Han Liu

Abstract

We propose EMS®, a cloud-enabled massive computational experiment management system supporting high-throughput computational robotics research. Compared to existing systems, EMS® features a sky-based pipeline orchestrator that lets us exploit heterogeneous computing environments (e.g., on-premise clusters, public clouds, edge devices) painlessly and deploy large-scale computational jobs (e.g., jobs requiring millions of computational hours) optimally and in an integrated fashion. Built on this sky-based pipeline orchestrator, the EMS® software architecture has three abstraction layers, which this paper introduces: (i) a configuration management layer that automatically enumerates experimental configurations; (ii) a dependency management layer that manages the complex task dependencies within each experimental configuration; and (iii) a computation management layer that optimally executes the computational tasks using the given computing resources. This architectural design greatly increases the scalability and reproducibility of data-driven robotics research, leading to much-improved productivity. To demonstrate this point, we compare EMS® with more traditional approaches on an offline reinforcement learning problem for training mobile robots. Our results show that EMS® outperforms the traditional approaches by two orders of magnitude (in terms of experimental throughput and cost) with only a few lines of code changed. We also use EMS® to develop mobile robot, robot arm, and bipedal applications, demonstrating its applicability to a wide range of robot applications.
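
The three layers described above can be illustrated with a small, self-contained Python sketch. It is not the EMS® API: every name here (enumerate_configs, task_order, run_all, the toy GRID, DEPENDENCIES and TASKS) is hypothetical, and a local thread pool stands in for the sky-based orchestrator, which the abstract does not specify in detail.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from itertools import product


def enumerate_configs(grid):
    """Configuration layer (sketch): expand a parameter grid into one dict per experiment."""
    keys = sorted(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


def task_order(dependencies):
    """Dependency layer (sketch): order tasks so each one follows its prerequisites."""
    # The mapping is task name -> set of tasks it depends on.
    return list(TopologicalSorter(dependencies).static_order())


def run_experiment(config, dependencies, tasks):
    """Run the tasks of one configuration in dependency order, sharing earlier results."""
    results = {}
    for name in task_order(dependencies):
        results[name] = tasks[name](config, results)
    return results


def run_all(grid, dependencies, tasks, max_workers=4):
    """Computation layer (sketch): run independent configurations concurrently."""
    configs = enumerate_configs(grid)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outputs = list(pool.map(lambda c: run_experiment(c, dependencies, tasks), configs))
    return list(zip(configs, outputs))


# Hypothetical toy pipeline (assumption, not from the paper): collect -> train -> evaluate.
GRID = {"learning_rate": [0.001, 0.01], "seed": [0, 1, 2]}
DEPENDENCIES = {"collect": set(), "train": {"collect"}, "evaluate": {"train"}}
TASKS = {
    "collect": lambda cfg, res: list(range(cfg["seed"], cfg["seed"] + 3)),
    "train": lambda cfg, res: cfg["learning_rate"] * sum(res["collect"]),
    "evaluate": lambda cfg, res: round(res["train"], 6),
}

if __name__ == "__main__":
    for config, output in run_all(GRID, DEPENDENCIES, TASKS):
        print(config, output["evaluate"])
```

In this sketch, adding a value to GRID multiplies the number of experiments without touching the task code, which is the kind of separation between configuration, dependencies, and execution that the abstract attributes to the layered design.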

Index terms

Software Architecture for Robotic and Automation; AI-Enabled Robotics; Distributed Robot Systems