ICRA 2026

Uncovering Communication Bottlenecks in Scalable ROS 2 Deployments on Kubernetes for Cloud/Edge Robotics

Yongzhou Zhang, Oliver Waldhorst, Björn Hein

AI summary

Scalable ROS 2 deployments on Kubernetes are feasible but require strategic CNI selection, payload optimization, and topology-aware networking to minimize latency and jitter.
Tags: ROS 2, Kubernetes, Cloud/Edge Robotics, Communication Performance, CNI Plugins, Latency & Throughput

Problem

Integrating ROS 2 with Kubernetes introduces a complex, multi-layered communication stack that impacts application-level performance, yet comprehensive guidance on managing these bottlenecks across cloud/edge robotic deployments is lacking.

Approach

The authors constructed a Kubernetes-based testbed spanning robots, edge, and cloud nodes to systematically measure throughput and one-way latency while varying CNI plugins, QoS settings, encryption, and physical network conditions.
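As a rough illustration of the two metrics named above, the following Python sketch derives one-way latency and throughput from a list of timestamped messages. It is not the authors' measurement tooling: the function name `summarize`, the tuple layout, and the assumption that sender and receiver clocks are synchronized (which one-way latency requires) are all illustrative.

```python
from statistics import mean, median


def summarize(samples):
    """Summarize one-way latency and throughput.

    samples: list of (send_time_s, recv_time_s, payload_bytes) tuples.
    Assumption: send and receive timestamps come from synchronized clocks.
    """
    if not samples:
        raise ValueError("no samples")

    # One-way latency per message, in milliseconds.
    latencies_ms = [(recv - send) * 1e3 for send, recv, _ in samples]

    # Throughput over the whole run: bits delivered divided by elapsed time.
    total_bytes = sum(size for _, _, size in samples)
    first_send = min(send for send, _, _ in samples)
    last_recv = max(recv for _, recv, _ in samples)
    duration_s = last_recv - first_send
    throughput_mbit_s = (
        total_bytes * 8 / 1e6 / duration_s if duration_s > 0 else float("nan")
    )

    return {
        "latency_mean_ms": mean(latencies_ms),
        "latency_median_ms": median(latencies_ms),
        "latency_max_ms": max(latencies_ms),
        "throughput_mbit_s": throughput_mbit_s,
    }


if __name__ == "__main__":
    # Hypothetical data: three 1000-byte messages sent 100 ms apart.
    demo = [(0.0, 0.002, 1000), (0.1, 0.103, 1000), (0.2, 0.204, 1000)]
    print(summarize(demo))
```

In a real deployment the timestamps would be recorded at the publisher and subscriber, and any clock offset between the two hosts would show up directly as latency error.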

Key results

  • Bandwidth reduction from containerization overhead remains acceptable for common robotic payloads
  • Overlay networks maintain stable latency but increase jitter, particularly for small payloads
  • eBPF-based CNI routing significantly boosts throughput for small payloads and reduces latency jitter compared to iptables
  • Communication bottlenecks shift dynamically across physical, virtual, and middleware layers depending on deployment topology
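
The distinction between stable latency and increased jitter in the results above can be made concrete with a small Python sketch. It uses one common definition of jitter, the mean absolute difference between consecutive latency samples; this summary does not state which definition the paper uses, so treat the function `jitter_ms` and the sample values as assumptions.

```python
def jitter_ms(latencies_ms):
    """Mean absolute difference between consecutive latency samples (ms).

    Assumption: this is one common jitter definition, not necessarily
    the one used in the paper.
    """
    if len(latencies_ms) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:])]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Hypothetical traces with the same mean latency (2 ms) but different jitter.
    steady = [2.0, 2.0, 2.0, 2.0]
    noisy = [1.0, 3.0, 1.0, 3.0]
    print(jitter_ms(steady), jitter_ms(noisy))
```

Both traces average 2 ms, yet the second varies by 2 ms between consecutive messages, which is why mean latency alone can hide the effect of an overlay network or routing path.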

Why it matters

It provides roboticists and cloud-native engineers with quantitative, layer-aware guidelines for designing robust and scalable ROS 2 infrastructure.

Abstract

Containerization and orchestration using cloud-native technologies enable scalable deployment of robotic software. Integrating ROS 2 with Kubernetes offers a flexible infrastructure, but also introduces a complex, multi-layered communication stack, from DDS middleware to container networks and the physical layer. Each layer adds overhead and variability that impact application-level performance. This paper presents a comprehensive analysis of communication performance across the cloud–edge–robot continuum, focusing on throughput and one-way latency in scalable ROS 2 deployments. We evaluate communication across intra-robot, edge, and cloud segments using wired and wireless connections, including emerging technologies like Wi-Fi 7 and high-speed LAN. Using a Kubernetes-based testbed, we investigate various ROS 2 middlewares, CNI plugins, QoS configurations, and encryption options. Our experiments reveal the impact of network overlays, routing paths, and middleware choices on latency and bandwidth. Despite the inherent complexity, the results confirm the feasibility of deploying ROS 2 in orchestrated, scalable environments. We summarize key insights as practical takeaways, many of which apply beyond Kubernetes, to guide the design of robust cloud/edge robotic systems.

Index terms

Networked Robots; Software, Middleware and Programming Environments; Engineering for Robotic Systems
