ICRA 2026

Learning Conservative Neural Control Barrier Functions from Demonstrations

Ihab Tabbara, Hussein Sibai

AI summary

Conservative Control Barrier Functions (CCBFs) enable reliable, offline-trained safety filters that prevent unsafe states while preserving task performance in high-dimensional systems.

Control barrier functions, offline reinforcement learning, safety filters, neural networks, conservative learning, autonomous control

Problem

Correct-by-construction synthesis of control barrier functions (CBFs) does not scale to high-dimensional systems, and existing data-driven methods lack guarantees or suffer from distribution shift when trained offline.

Approach

The authors train neural CBFs on offline datasets of safe and unsafe trajectories, using a conservative loss inspired by Conservative Q-learning that discourages overestimating safety in unseen states. The learned CBFs then define constraints for quadratic programs that serve as safety filters.
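
The idea of a conservative loss can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the hinge margin, the batch-level log-sum-exp term, and the weighting are all assumptions chosen to mirror the structure of Conservative Q-learning, where a soft maximum over sampled inputs is pushed down relative to values on dataset inputs. Here `h_safe`, `h_unsafe`, `h_sampled`, and `h_data` stand for barrier-function outputs that a network would produce.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def hinge_classification_loss(h_safe, h_unsafe, margin=0.1):
    """Push h above +margin on safe samples and below -margin on unsafe ones.

    The margin value is a hypothetical choice for illustration.
    """
    safe_term = sum(max(0.0, margin - h) for h in h_safe) / len(h_safe)
    unsafe_term = sum(max(0.0, margin + h) for h in h_unsafe) / len(h_unsafe)
    return safe_term + unsafe_term


def conservative_penalty(h_sampled, h_data):
    """CQL-style regularizer (assumed form, not taken from the paper).

    Soft maximum of h over sampled, possibly out-of-distribution states,
    minus the mean of h over states that appear in the dataset.
    """
    return logsumexp(h_sampled) - sum(h_data) / len(h_data)


def total_loss(h_safe, h_unsafe, h_sampled, h_data, weight=1.0):
    """Classification loss plus a weighted conservative penalty."""
    return (hinge_classification_loss(h_safe, h_unsafe)
            + weight * conservative_penalty(h_sampled, h_data))
```

Minimizing the penalty lowers the barrier value on sampled states that the dataset does not cover while keeping it high on dataset states, which is the sense in which the learned function avoids overestimating safety.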

Key results

Outperforms baselines in safety and task performance across four environments
Successfully learns safety filters from offline datasets without online interaction
Effectively penalizes out-of-distribution states to improve reliability
Requires minimal hyperparameter tuning and is straightforward to train

Why it matters

Enables scalable, theoretically grounded safety filtering for high-dimensional autonomous systems using only offline demonstration data.

Abstract

Safety filters, particularly those based on control barrier functions, have gained increased interest as effective tools for safe control of dynamical systems. Existing correct-by-construction synthesis algorithms for such filters, however, suffer from the curse of dimensionality. Deep learning approaches have been proposed in recent years to address this challenge. In this paper, we add to this set of approaches an algorithm for training neural control barrier functions from offline datasets. Such functions can be used to design constraints for quadratic programs that are then used as safety filters. Our algorithm trains these functions so that the system is not only prevented from reaching unsafe states but is also disincentivized from reaching out-of-distribution ones, at which they would be less reliable. It is inspired by Conservative Q-learning, an offline reinforcement learning algorithm. We call its outputs Conservative Control Barrier Functions (CCBFs). Our empirical results demonstrate that CCBFs outperform existing methods in maintaining safety while minimally affecting task performance. Code is available at https://github.com/tabz23/CCBF.
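
To make the quadratic-program safety filter concrete, the following is a minimal sketch under stated assumptions, not the code from the repository above. It assumes control-affine dynamics x' = f(x) + g(x)u and the standard CBF condition grad_h . (f + g u) + alpha * h >= 0 with a linear class-K term. With a single affine constraint, the QP that minimizes ||u - u_ref||^2 reduces to a projection onto a half-space, so no solver is needed. The function name `cbf_qp_filter` and all argument names are hypothetical.

```python
def cbf_qp_filter(u_ref, grad_h, f, g, h_value, alpha=1.0):
    """Minimally modify u_ref so that grad_h . (f + g u) + alpha * h >= 0.

    u_ref:   reference control, list of m floats
    grad_h:  gradient of the barrier function at the state, list of n floats
    f:       drift term f(x), list of n floats
    g:       control matrix g(x), list of n rows with m floats each
    h_value: barrier value h(x)
    alpha:   gain of the linear class-K term (assumed form)
    """
    n, m = len(f), len(u_ref)
    # Rewrite the constraint as a . u + b >= 0.
    a = [sum(grad_h[i] * g[i][j] for i in range(n)) for j in range(m)]
    b = sum(grad_h[i] * f[i] for i in range(n)) + alpha * h_value
    slack = sum(a[j] * u_ref[j] for j in range(m)) + b
    if slack >= 0.0:
        # The reference control already satisfies the constraint.
        return list(u_ref)
    a_norm_sq = sum(x * x for x in a)
    if a_norm_sq == 0.0:
        raise ValueError("infeasible: the control does not affect h here")
    # Closed-form projection of u_ref onto the half-space a . u + b >= 0.
    scale = -slack / a_norm_sq
    return [u_ref[j] + scale * a[j] for j in range(m)]


# Toy usage: 1-D single integrator x' = u with h(x) = 1 - x, at x = 0.5.
# The constraint becomes u <= 0.5, so an aggressive reference is clipped.
safe_u = cbf_qp_filter(u_ref=[2.0], grad_h=[-1.0], f=[0.0], g=[[1.0]],
                       h_value=0.5)
```

In the setting of the paper, h would be a trained neural network and its gradient would come from automatic differentiation; the toy barrier here only illustrates how the filter leaves safe controls unchanged and minimally corrects unsafe ones.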

Index terms

Robot Safety, Collision Avoidance, Reinforcement Learning