AI summary
Problem
Centralized robot training limits scalability and collaboration while overloading servers, whereas federated learning offers a distributed alternative but degrades under the non-IID data inherent to multi-robot environments.
Approach
The authors simulate a multi-robot manipulation task to compare centralized and federated training using SAC+HER and DDPG+HER, specifically measuring how heterogeneous goal distributions affect learning stability and success rates.
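The two ingredients of this comparison, server-side aggregation and heterogeneous goal sampling, can be illustrated with a minimal sketch. The summary does not state which aggregation rule or goal partitioning the authors used, so the example assumes FedAvg-style weighted averaging and a one-dimensional goal range split into disjoint per-client slices. The names `fed_avg` and `sample_goal` and all parameters are hypothetical.

```python
import random


def fed_avg(client_weights, client_sizes):
    """Average per-client parameter vectors, weighted by sample count.

    Assumption: FedAvg-style aggregation; the source does not name the rule.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def sample_goal(client_id, num_clients, non_iid, rng, lo=-1.0, hi=1.0):
    """Sample a 1-D goal coordinate for one client.

    IID: every client draws from the full range [lo, hi].
    Non-IID (assumed scheme): each client only sees its own disjoint slice.
    """
    if not non_iid:
        return rng.uniform(lo, hi)
    width = (hi - lo) / num_clients
    start = lo + client_id * width
    return rng.uniform(start, start + width)


# Two clients with flattened parameters; the second holds three times more data.
global_weights = fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3])

# Client 0 of 4 under the non-IID scheme only draws goals from [-1.0, -0.5].
rng = random.Random(0)
goals = [sample_goal(0, 4, True, rng) for _ in range(5)]
```

In the centralized baseline there is no aggregation step: a single learner would draw goals from the full range, which corresponds to the `non_iid=False` branch.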
Key results
- SAC+HER demonstrates stable training and higher success rates than DDPG+HER in both centralized and federated settings
- DDPG+HER shows high sensitivity to exploration noise and lower overall success rates
- Non-IID goal distributions degrade performance for both algorithms, causing increased instability in DDPG+HER
- Varying exploration noise across federated clients improves DDPG+HER performance by enabling diverse exploration
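The last result, giving each federated client its own exploration noise, can be sketched as follows. This is an illustration under stated assumptions, not the authors' configuration: it assumes additive Gaussian action noise (a common choice for DDPG) with standard deviations spaced linearly across clients. The helper names and the default range of 0.05 to 0.3 are hypothetical.

```python
import random


def client_noise_scales(num_clients, low=0.05, high=0.3):
    """Return one Gaussian noise std per client, spaced linearly in [low, high].

    Assumption: the spacing and range are illustrative, not from the source.
    """
    if num_clients == 1:
        return [low]
    step = (high - low) / (num_clients - 1)
    return [low + i * step for i in range(num_clients)]


def noisy_action(action, sigma, rng, limit=1.0):
    """Add zero-mean Gaussian noise to each action component and clip to bounds."""
    return [max(-limit, min(limit, a + rng.gauss(0.0, sigma))) for a in action]


# Four clients explore with increasing noise around the same deterministic action.
rng = random.Random(0)
scales = client_noise_scales(4)
actions = [noisy_action([0.2, -0.4, 0.0], s, rng) for s in scales]
```

With a shared noise level all clients perturb the policy in the same way; spreading the scale lets low-noise clients refine the current policy while high-noise clients cover more of the action space before aggregation.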
Why it matters
Provides foundational insights for deploying scalable, collaborative robot learning systems in real-world environments where data heterogeneity is unavoidable.
Abstract
Robot learning primarily relies on centralized training. While it provides the infrastructure, centralization limits parallel and collaborative learning among robots and places a significant computational load on the central server, indicating the need for federated learning (FL) in the context of multi-robot training. However, robots trained in a federated setup are subjected to non-independent and identically distributed (non-IID) data, resulting in degraded model performance. This extended abstract presents the current state of research aimed at improving robot learning under non-IID conditions in FL. In this regard, this work provides an initial comparative analysis of robot learning methods in centralized and federated training setups, with an emphasis on the impact of non-IID data on learning behaviour in a simulation environment. The results highlight the differences in learning stability across algorithms and present the influence of non-IID goal distributions on performance.