Don't Let Your Robot Be Harmful: Responsible Robotic Manipulation Via Safety-As-Policy
Minheng Ni, Lei Zhang, Zihan Chen, Kaixin Bai, Zhaopeng Chen, Jianwei Zhang, Lei Zhang, Wangmeng Zuo
AI summary
Problem
Mindlessly executing human instructions can cause severe safety accidents, yet training robots to handle diverse, unseen risks is impractical due to the variability and danger of real-world scenarios.
Approach
The method pairs a large multimodal model with a world model that generates virtual risky scenarios and a mental model that iteratively infers consequences to update safety cognition, enabling safe task planning without physical risk.
Key results
- Significantly outperforms baselines in safety and success rates across synthetic and real-world tests
- Introduces SafeBox, a 100-task synthetic dataset that reliably mirrors real-world safety evaluations
- Enables proactive hazard avoidance in electrical, fire/chemical, and human safety scenarios
- Learns safety cognition autonomously through iterative virtual interactions and consequence reflection
Why it matters
Provides a scalable, risk-free training paradigm and benchmark for deploying responsible, AI-driven robots in complex human environments.
Abstract
Unthinking execution of human instructions in robotic manipulation can lead to severe safety risks, such as poisonings, fires, and even explosions. In this paper, we present responsible robotic manipulation, which requires robots to consider potential hazards in the real-world environment while completing instructions and performing complex operations safely and efficiently. However, such scenarios in the real world are variable and risky for training. To address this challenge, we propose Safety-as-policy, which includes (i) a world model to automatically generate scenarios containing safety risks and conduct virtual interactions, and (ii) a mental model to infer consequences with reflections and gradually develop the cognition of safety, allowing robots to accomplish tasks while avoiding dangers. Additionally, we create the SafeBox synthetic dataset, which includes one hundred responsible robotic manipulation tasks with different safety risk scenarios and instructions, effectively reducing the risks associated with real-world experiments. Experiments demonstrate that Safety-as-policy can avoid risks and efficiently complete tasks both on the synthetic dataset and in real-world experiments, significantly outperforming baseline methods. Our SafeBox dataset shows consistent evaluation results with real-world scenarios, serving as a safe and effective benchmark for future research. Our code, data, and supplementary materials are available at: https://sites.google.com/view/safety-as-policy.
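The interplay between the world model and the mental model described in the abstract can be illustrated with a toy loop. The following is only a minimal sketch under strong assumptions, not the authors' implementation: every name in it (`Scenario`, `SafetyCognition`, `world_model`, `plan`, `mental_model`, `reflect`, `train`) is hypothetical, and the learned components are replaced by small lookup tables so the example runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    """A virtual scenario: an instruction plus hidden hazards (hypothetical structure)."""
    instruction: str
    hazards: dict  # maps an unsafe action to its harmful consequence


@dataclass
class SafetyCognition:
    """Accumulated safety knowledge: actions the robot has learned to avoid."""
    rules: list = field(default_factory=list)


def world_model(index):
    """Stand-in for the world model: return a virtual scenario containing a safety risk."""
    scenarios = [
        Scenario("heat the food in the microwave",
                 {"put metal bowl in microwave": "fire"}),
        Scenario("clean the table",
                 {"mix bleach with ammonia": "toxic gas"}),
    ]
    return scenarios[index % len(scenarios)]


def plan(scenario, cognition):
    """Stand-in planner: naively includes risky actions unless a learned rule forbids them."""
    steps = [a for a in scenario.hazards if a not in cognition.rules]
    steps.append(f"complete: {scenario.instruction}")
    return steps


def mental_model(scenario, steps):
    """Stand-in for consequence inference: list the harmful outcomes of a plan."""
    return [(a, scenario.hazards[a]) for a in steps if a in scenario.hazards]


def reflect(cognition, harms):
    """Reflection step: turn each inferred harm into a rule to avoid that action."""
    for action, _consequence in harms:
        if action not in cognition.rules:
            cognition.rules.append(action)


def train(num_rounds):
    """Iterate virtual interactions; nothing is ever executed physically."""
    cognition = SafetyCognition()
    for i in range(num_rounds):
        scenario = world_model(i)
        steps = plan(scenario, cognition)
        harms = mental_model(scenario, steps)
        reflect(cognition, harms)
    return cognition


if __name__ == "__main__":
    learned = train(4)
    print(learned.rules)
    print(plan(world_model(0), learned))
```

In this sketch, the first pass over each scenario produces an unsafe plan, the consequence inference flags it, and the reflection step records a rule, so later plans for the same scenario omit the hazardous action. All of this happens in simulation, which is the point the abstract makes about avoiding risky real-world training.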