ICRA 2026

Continual-RL for Generalization in Autonomous Racing on the RoboRacer Platform

Joel Siegert, Edoardo Ghignone, Michele Magno

AI summary

A continual RL framework combining SAC and Continual Backpropagation enables rapid real-world adaptation, outperforming classical controllers after just 15 minutes of fine-tuning on an unseen track.
Continual Learning, Reinforcement Learning, Autonomous Racing, Real-World Robotics, Soft Actor-Critic, Plasticity

Problem

Real-world reinforcement learning struggles with sample efficiency and catastrophic forgetting when adapting to new, unseen environments. Autonomous racing in particular demands rapid policy adaptation to novel track layouts and tire-floor combinations from minimal physical data.

Approach

The authors augment the sample-efficient Soft Actor-Critic (SAC) algorithm with Continual Backpropagation (CBP) and L2 initialization to maintain neural plasticity while learning from multiple real-world tracks. They also benchmark this approach against an offline RL pre-training method based on Implicit Q-Learning (IQL).
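To make the two plasticity-preserving ingredients more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes that "L2 initialization" means an L2 penalty pulling weights toward their initial values, and that CBP tracks a running per-unit utility and occasionally reinitializes the least useful mature hidden units. The class name `CBPLayer`, the utility definition, the uniform reinitialization, and the hyperparameters `decay`, `maturity`, `replace_rate`, and `coeff` are all illustrative assumptions.

```python
import numpy as np


def l2_init_penalty(params, init_params, coeff=1e-2):
    """L2 penalty pulling each parameter array toward its value at initialization."""
    return coeff * sum(float(np.sum((p - p0) ** 2)) for p, p0 in zip(params, init_params))


class CBPLayer:
    """Bookkeeping for one hidden layer with CBP-style selective reinitialization.

    w_in has shape (n_inputs, n_hidden), w_out has shape (n_hidden, n_outputs).
    Both arrays are modified in place. All hyperparameter values are assumptions.
    """

    def __init__(self, w_in, w_out, rng, decay=0.99, maturity=100, replace_rate=1e-4):
        self.w_in = w_in
        self.w_out = w_out
        self.rng = rng
        self.decay = decay
        self.maturity = maturity
        self.replace_rate = replace_rate
        n_hidden = w_in.shape[1]
        self.utility = np.zeros(n_hidden)        # running usefulness of each unit
        self.age = np.zeros(n_hidden, dtype=int)  # steps since last reinitialization
        self.pending = 0.0                        # fractional count of units to replace

    def step(self, hidden):
        """Update utilities from a batch of activations, shape (batch, n_hidden).

        Returns the indices of the hidden units that were reinitialized.
        """
        # Contribution of a unit: mean |activation| times summed |outgoing weight|.
        contribution = np.abs(hidden).mean(axis=0) * np.abs(self.w_out).sum(axis=1)
        self.utility = self.decay * self.utility + (1.0 - self.decay) * contribution
        self.age += 1

        # Only units older than the maturity threshold are eligible.
        mature = np.flatnonzero(self.age > self.maturity)
        self.pending += self.replace_rate * mature.size
        n_replace = int(self.pending)
        if n_replace == 0:
            return np.array([], dtype=int)
        self.pending -= n_replace

        # Pick the lowest-utility mature units.
        worst = mature[np.argsort(self.utility[mature])[:n_replace]]
        n_inputs = self.w_in.shape[0]
        bound = 1.0 / np.sqrt(n_inputs)
        # Fresh incoming weights; zero outgoing weights so the output is undisturbed.
        self.w_in[:, worst] = self.rng.uniform(-bound, bound, size=(n_inputs, worst.size))
        self.w_out[worst, :] = 0.0
        self.utility[worst] = 0.0
        self.age[worst] = 0
        return worst
```

In an actor-critic setup such as SAC, one would presumably call `step` on each hidden layer after every gradient update and add `l2_init_penalty` to the training loss. How the paper wires these pieces into SAC is not stated in this summary.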

Key results

  • CBP-enhanced SAC surpasses classical controllers after 15 minutes of fine-tuning on unseen tracks
  • Offline RL pre-training shows promising plasticity but lower final performance than continual learning
  • Simulation analysis confirms continual techniques improve fine-tuning over buffer management alone
  • Tracks, simulation models, and RL frameworks open-sourced for replication
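The offline baseline in the results above relies on Implicit Q-Learning, which fits its state-value function with an asymmetric (expectile) squared loss instead of querying out-of-dataset actions. The sketch below shows only that loss in NumPy. The function name `expectile_loss` and the default `tau=0.7` are illustrative assumptions, not values taken from the paper.

```python
import numpy as np


def expectile_loss(q_values, v_values, tau=0.7):
    """Expectile regression loss between Q targets and V predictions.

    Positive errors (Q above V) are weighted by tau, negative ones by 1 - tau,
    so tau > 0.5 pushes V toward the upper range of Q over dataset actions.
    """
    diff = np.asarray(q_values, dtype=float) - np.asarray(v_values, dtype=float)
    weight = np.where(diff < 0, 1.0 - tau, tau)
    return float(np.mean(weight * diff ** 2))
```

With `tau=0.5` the loss reduces to half the mean squared error, which makes the role of the asymmetry easy to check.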

Why it matters

Provides a practical, sample-efficient pathway for deploying adaptable RL controllers on physical robots in non-stationary environments, directly benefiting autonomous racing and real-world robotics research.

Abstract

A key challenge in modern robotics is to adapt to changing environments, a challenge that is exacerbated when simulations cannot encompass every possible real-world configuration, and therefore Reinforcement Learning (RL) in the physical world becomes necessary. Continual RL provides the tools to address this challenge; however, both the frameworks and the methods remain underexplored. Autonomous Racing (AR) and in particular the RoboRacer competition provide a testing ground for such methods, as learning to drive on a new track-floor combination with the least amount of new experience naturally frames a continual learning problem. This work aims to address this gap by proposing a continual RL framework based on Continual Backpropagation (CBP) that is able, with only real-world data, to train a generalist policy on a set of tracks and then fine-tune it within 15 minutes to outperform classical controllers. Furthermore, a comparison method based on offline RL is proposed, and a simulation analysis of the plasticity properties of the methods is conducted.

Index terms

Wheeled Robots, Continual Learning, Reinforcement Learning
