Transfer Your Safety: Learning Transferable Model-Free Safety Filters from a Single Policy to Enhance Safety across Diverse Tasks
Junjun Xie, Siru Li, Shuhao Zhao, Xiaochen Xie and Liang Hu∗
AI summary
Problem
Existing safety methods rely on task-specific training, predefined dynamics models, or costly manual safety labeling, making them inflexible and poorly generalizable across different tasks and environments.
Approach
The authors learn perception-based control barrier functions (CBFs) directly from a single policy using a zero-one reward scheme, then derive a model-free safety filter that modifies actions based on the learned value functions, without needing a dynamics model.
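The idea of filtering actions with a learned value function can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the reward convention (1 for a safe step, 0 for an unsafe one), the discrete candidate-action set, the threshold, and the helper names `discounted_safety_return` and `safety_filter` are all hypothetical. The paper's actual filter and training procedure may differ.

```python
import numpy as np


def discounted_safety_return(rewards, gamma):
    """Discounted sum of zero-one safety rewards along one rollout.

    Assumed convention (hypothetical): reward 1.0 for a safe step,
    0.0 for an unsafe one. Such returns could serve as regression
    targets for a safety value function.
    """
    rewards = np.asarray(rewards, dtype=float)
    discounts = gamma ** np.arange(len(rewards))
    return float(np.sum(discounts * rewards))


def safety_filter(q_values, nominal_action, candidate_actions, threshold):
    """Pick the candidate closest to the nominal action that looks safe.

    q_values[i] is the learned safety value of candidate_actions[i] at the
    current observation. No dynamics model is queried: only the learned
    values are compared against a threshold (an assumed design choice).
    """
    q_values = np.asarray(q_values, dtype=float)
    candidates = np.asarray(candidate_actions, dtype=float)
    safe = q_values >= threshold
    if not safe.any():
        # No candidate passes the threshold: fall back to the safest one.
        return float(candidates[np.argmax(q_values)])
    # Distance to the task policy's action; unsafe candidates are excluded.
    dist = np.abs(candidates - nominal_action)
    dist[~safe] = np.inf
    return float(candidates[np.argmin(dist)])


if __name__ == "__main__":
    actions = [-1.0, 0.0, 1.0]   # hypothetical 1-D action candidates
    q = [0.9, 0.6, 0.1]          # hypothetical learned safety values
    # The task policy wants +1.0, which has a low safety value.
    print(safety_filter(q, 1.0, actions, threshold=0.5))  # -> 0.0
```

Because the filter only needs safety values for candidate actions, the same function could wrap any task policy that proposes a nominal action, which is the sense in which such a filter is reusable across tasks.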
Key results
- Robust, model-free perception-based safety filter transferable across tasks without system dynamics models
- Theoretical guarantees that the filter improves initial policy safety and relaxes CBF construction requirements
- Successful transfer of LiDAR-visual multimodal CBFs from a LiDAR-only policy to three downstream tasks
- Effective safety enhancement and generalizability across diverse safety-critical tasks in random environments
Why it matters
Enables robots to rapidly deploy safe behaviors in new tasks or environments by reusing safety knowledge from a single policy, significantly reducing data collection and retraining costs.
Abstract
Safety is a fundamental and pervasive requirement in robotics, yet most existing approaches rely on task-specific training or predefined models, necessitating redesign or retraining from scratch when tasks or systems change. In this paper, we propose a novel approach for constructing model-free safety filters that learns perception-based control barrier functions (CBFs) from one initial policy for arbitrary tasks and then derives task-independent safety filters in terms of CBF-based constraints in a model-free manner. The safety filters can be flexibly integrated into policies for diverse tasks and remain robust to mild environmental variations. We further theoretically prove that the safety filters can improve the safety of the initial policy itself, relaxing the safety requirements on initial policies used for CBF construction. The proposed method is systematically evaluated over multiple safety-critical tasks and random environments, validating its effectiveness and generalizability. Notably, starting with an initial LiDAR-only navigation policy, our approach successfully learns multimodal CBFs that take both LiDAR and vision inputs, and applies them to different downstream tasks.
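For readers unfamiliar with CBF-based constraints, a commonly used discrete-time form is sketched below. This is the standard textbook condition, stated here as background under our own notation ($h$, $x_t$, $\alpha$); it is not necessarily the exact constraint used in the paper.

```latex
% Safe set defined by a barrier function h:
\mathcal{C} = \{\, x : h(x) \ge 0 \,\}

% Discrete-time CBF condition on consecutive states:
h(x_{t+1}) - h(x_t) \ge -\alpha\, h(x_t), \qquad \alpha \in (0, 1]

% Equivalently, h may shrink by at most a factor (1 - \alpha) per step:
h(x_{t+1}) \ge (1 - \alpha)\, h(x_t)
```

If $h(x_0) \ge 0$ and the condition holds at every step, then $h(x_t) \ge (1-\alpha)^t h(x_0) \ge 0$, so the state stays in $\mathcal{C}$. Checking the condition directly requires predicting $x_{t+1}$ for a candidate action, which normally calls for a dynamics model; a model-free filter has to evaluate the constraint without that prediction.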