Diffusing Trajectory Optimization Problems for Recovery During Multi-Finger Manipulation
Abhinav Kumar, Fan Yang, Sergio Aguilera, Soshi Iba, Rana Soltani Zarrin, Dmitry Berenson
AI summary
Problem
Environmental perturbations during fine manipulation can push robot states out of the trained distribution, often leading to catastrophic task failures like dropping a tool.
Approach
The framework detects out-of-distribution states using a task diffusion model and employs a second diffusion model to sample contact modes and initializations for trajectory optimization.
Key results
- Improved hardware screwdriver-turning performance by 96%
- Only evaluated method that could attempt recovery without causing catastrophic task failure
- Successfully projected OOD states back into the task distribution using diffusion sampling
- Reduced online computation time by distilling planning into a diffusion model
Why it matters
Enables robust, high-precision multi-finger manipulation in real-world environments where perturbations are inevitable and safety constraints are strict.
Abstract
Multi-fingered hands are emerging as powerful platforms for performing fine manipulation tasks, including tool use. However, environmental perturbations or execution errors can impede task performance, motivating the use of recovery behaviors that enable normal task execution to resume. In this work, we take advantage of recent advances in diffusion models to construct a framework that autonomously identifies when recovery is necessary and optimizes contact-rich trajectories to recover. We use a diffusion model trained on the task to estimate when states are not conducive to task execution, framed as an out-of-distribution detection problem. We then use diffusion sampling to project these states in-distribution and use trajectory optimization to plan contact-rich recovery trajectories. We also propose a novel diffusion-based approach that distills this process to efficiently diffuse the full parameterization, including constraints, goal state, and initialization, of the recovery trajectory optimization problem, saving time during online execution. We compare our method to a reinforcement learning baseline and other methods that do not explicitly plan contact interactions, including on a hardware screwdriver-turning task where we show that recovering using our method improves task performance by 96% and that ours is the only method evaluated that can attempt recovery without causing catastrophic task failure. Videos can be found at https://dtourrecovery.github.io/.
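The detect-then-project loop described in the abstract can be illustrated with a deliberately simplified toy. The sketch below is not the authors' implementation: it assumes a per-dimension Gaussian as a stand-in for the trained task diffusion model, a normalized squared distance as a stand-in for its likelihood estimate, and a shrink-toward-the-mean update as a stand-in for diffusion sampling. All function names, the 0.99 quantile, and the example states are hypothetical. The projected state plays the role of the goal that would then be handed to a contact-rich trajectory optimizer, which is omitted here.

```python
import random
import statistics

# Toy stand-ins only: a diagonal Gaussian replaces the task diffusion model,
# and all names and constants here are hypothetical, not taken from the paper.

def fit_gaussian(states):
    """Fit per-dimension mean and std to in-distribution states."""
    dims = list(zip(*states))
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.pstdev(d) for d in dims]
    return means, stds

def ood_score(state, means, stds):
    """Normalized squared distance; higher means less likely under the toy model."""
    return sum(((x - m) / s) ** 2 for x, m, s in zip(state, means, stds))

def calibrate_threshold(states, means, stds, quantile=0.99):
    """Pick the score threshold as an empirical quantile of training scores."""
    scores = sorted(ood_score(s, means, stds) for s in states)
    idx = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[idx]

def needs_recovery(state, means, stds, threshold):
    """Flag a state as out-of-distribution when its score exceeds the threshold."""
    return ood_score(state, means, stds) > threshold

def project_in_distribution(state, means, stds, threshold, step=0.2, max_iters=100):
    """Shrink the state toward the mean until it is no longer flagged
    (a crude placeholder for projecting via diffusion sampling)."""
    x = list(state)
    for _ in range(max_iters):
        if ood_score(x, means, stds) <= threshold:
            break
        x = [xi + step * (m - xi) for xi, m in zip(x, means)]
    return x

# Synthetic "task" data: 2-D states drawn around a nominal operating point.
rng = random.Random(0)
train = [[rng.gauss(0.0, 1.0), rng.gauss(1.0, 0.5)] for _ in range(2000)]
means, stds = fit_gaussian(train)
threshold = calibrate_threshold(train, means, stds)

nominal = [0.1, 1.1]      # close to the training data
perturbed = [6.0, -3.0]   # far from the training data

if needs_recovery(perturbed, means, stds, threshold):
    # In the full framework this goal would parameterize a recovery
    # trajectory optimization problem; here it is only computed.
    recovery_goal = project_in_distribution(perturbed, means, stds, threshold)
```

In this toy, the threshold is calibrated on the training states themselves, so roughly 1% of nominal states would be flagged; how the actual method sets its detection threshold is not stated in the abstract.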