ICRA 2026

Inference-Stage Adaptation-Projection Strategy Adapts Diffusion Policy to Cross-Manipulators Scenarios

Xiangtong Yao, Yirui Zhou, Yuan Meng, Yanwen Liu, Liangyu Dong, Zitao Zhang, Zhenshan Bing, Kai Huang, Fuchun Sun, Alois Knoll

AI summary

An inference-stage adaptation-projection strategy enables diffusion policies to adapt to novel manipulators and task requirements without retraining or fine-tuning.

Diffusion Policy, Cross-manipulator Adaptation, Inference-stage Optimization, Trajectory Projection, Robotic Manipulation, Zero-shot Generalization

Problem

Diffusion policies fail to generalize to unseen manipulators, end-effectors, or dynamic task constraints at inference time, typically requiring costly data recollection and policy retraining for each new hardware or task configuration.

Approach

The method has two parts, neither of which requires retraining. First, it adjusts the policy's observation inputs to compensate for kinematic offsets such as tool-center-point (TCP) shifts. Second, during the denoising process, it projects the generated trajectories onto the set that satisfies the safety and task constraints, using quadratic programming.
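The observation adjustment can be illustrated with a minimal sketch. It assumes the controller reports the flange pose and that each gripper is described by a fixed TCP offset expressed in the tool frame. The function names, the 0.12 m offset, and the poses are hypothetical and not taken from the paper. Orientation is left unchanged because the assumed offset is a pure translation.

```python
import math

def rot_x(theta):
    """Rotation matrix for a rotation of theta radians about the x-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def mat_vec(R, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

def flange_to_tcp(flange_pos, flange_rot, tcp_offset):
    """Observation side: TCP position of the mounted gripper in the world frame."""
    shift = mat_vec(flange_rot, tcp_offset)
    return [p + s for p, s in zip(flange_pos, shift)]

def tcp_to_flange(tcp_pos, flange_rot, tcp_offset):
    """Action side: flange position that places the TCP at tcp_pos."""
    shift = mat_vec(flange_rot, tcp_offset)
    return [p - s for p, s in zip(tcp_pos, shift)]

# Assumed setup: tool pointing straight down, new gripper 0.12 m long.
R_down = rot_x(math.pi)
new_offset = [0.0, 0.0, 0.12]      # hypothetical TCP offset in the tool frame (m)
flange_pos = [0.40, 0.00, 0.30]    # hypothetical flange position (m)

# What the policy would be shown: roughly [0.40, 0.00, 0.18].
tcp_obs = flange_to_tcp(flange_pos, R_down, new_offset)

# A TCP target produced by the policy, mapped back to a flange command.
tcp_target = [0.40, 0.00, 0.05]
flange_cmd = tcp_to_flange(tcp_target, R_down, new_offset)
```

Because the policy only ever sees and outputs TCP poses in this sketch, swapping the gripper changes `new_offset` and nothing else.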

Key results

Zero-shot adaptation to novel grippers and robots at inference time
Dynamic satisfaction of new task requirements via constrained trajectory projection
Temporal consistency and safety enforced through cumulative denoising optimization
High success rates across real-world pick-and-place, pushing, and pouring tasks
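The constrained trajectory projection mentioned above can be sketched in its simplest form. This is an assumption-laden illustration, not the paper's formulation. It projects only the height profile of a trajectory, and the only constraints are per-waypoint bounds. In that special case the quadratic program "minimise the squared distance to the predicted trajectory subject to bounds" has a closed-form solution, namely clipping each waypoint. All names and numbers are hypothetical. Constraints that couple waypoints, such as velocity limits, would need a general QP solver.

```python
def project_heights(z_pred, z_min, z_max):
    """Least-squares projection of waypoint heights onto z_min[t] <= z[t] <= z_max.

    With independent bounds on each waypoint, the QP
        minimise sum_t (z[t] - z_pred[t])**2  subject to the bounds
    separates per waypoint, so the solution is an element-wise clip.
    """
    if any(lo > z_max for lo in z_min):
        raise ValueError("infeasible bounds: a lower bound exceeds z_max")
    return [min(max(z, lo), z_max) for z, lo in zip(z_pred, z_min)]

# Hypothetical 6-waypoint height profile (metres) from one denoising step.
z_pred = [0.10, 0.12, 0.15, 0.14, 0.11, 0.08]

# Assumed task constraint: waypoints 2 and 3 pass over an obstacle of
# height 0.18 m and must keep 0.02 m clearance; elsewhere stay above 0.05 m.
z_min = [0.05, 0.05, 0.20, 0.20, 0.05, 0.05]

z_proj = project_heights(z_pred, z_min, z_max=0.60)
# Only the two waypoints over the obstacle are lifted, to 0.20 m.
```

In the approach summarised here the projection is applied during denoising rather than once at the end. The sketch shows a single projection step only.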

Why it matters

It provides a cost-effective, flexible deployment framework for diffusion policies in real-world robotics, eliminating the need for hardware-specific retraining.

Abstract

Diffusion policies are powerful visuomotor models for robotic manipulation, yet they often fail to generalize to manipulators or end-effectors unseen during training and struggle to accommodate new task requirements at inference time. Addressing this typically requires costly data recollection and policy retraining for each new hardware or task configuration. To overcome this, we introduce an adaptation-projection strategy that enables a diffusion policy to perform cost-effective adaptation to novel manipulators and dynamic task settings, entirely at inference time and without retraining or fine-tuning the policy. Our method first trains a diffusion policy in SE(3) space using demonstrations from a base manipulator. During online deployment, it projects the policy’s generated trajectories to satisfy the kinematic and task-specific constraints imposed by the new hardware and objectives. Moreover, this projection dynamically adapts to physical differences (e.g., tool-center-point offsets, jaw widths) and task requirements (e.g., obstacle heights), ensuring robust and successful execution. We validate our approach on real-world pick-and-place, pushing, and pouring tasks across multiple manipulators, including the Franka Panda and Kuka iiwa 14, equipped with a diverse array of end-effectors like flexible grippers, Robotiq 2F/3F grippers, and various 3D-printed designs. Our results demonstrate consistently high success rates in these cross-manipulator scenarios, proving the effectiveness and practicality of our adaptation-projection strategy.

Affiliations

1 School of Computation, Information and Technology, Technical University of Munich, Garching, Germany. Email: xiangtong.yao@tum.de.
2 State Key Laboratory for Novel Software Technology and the School of Science and Technology, Nanjing University (Suzhou Campus), China.
3 Key Laboratory of Machine Intelligence and Advanced Computing, School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China.
4 Department of Computer Science and Technology, Tsinghua University, Beijing, China.
† Corresponding author: Zhenshan Bing (bing@nju.edu.cn, bing@in.tum.de).

Index terms

Imitation Learning, Transfer Learning, Deep Learning in Grasping and Manipulation