DexKnot: Generalizable Visuomotor Policy Learning for Dexterous Bag-Knotting Manipulation
Jiayuan Zhang, Ruihai Wu, Haojun Chen, Yuran Wang, Yifan Zhong, Ceyao Zhang, Yaodong Yang, Yuanpei Chen
AI summary
Problem
Robots struggle to generalize knotting across different plastic bag instances and initial deformations, because highly deformable objects have effectively infinite degrees of freedom and complex physical dynamics.
Approach
The framework collects real-world manual deformation data to train a PointNet++ encoder for shape-agnostic keypoint correspondence, then uses these identified keypoints as low-dimensional inputs to a diffusion transformer policy for generalizable manipulation.
Key results
- High success rates across unseen bag instances and novel deformations
- Outperforms DP3 and standard Diffusion Policy on out-of-distribution states
- Enables cross-instance and cross-deformation generalization with few demonstrations
- Real-world keypoint correspondence pipeline bypasses simulation and heavy annotation
Why it matters
It provides a practical, generalizable framework for dexterous manipulation of highly deformable objects, advancing real-world robotic automation for everyday tasks like waste management and retail.
Abstract
Knotting plastic bags is a common task in daily life, yet it is challenging for robots due to the bags’ infinite degrees of freedom and complex physical dynamics. Existing methods often struggle to generalize to unseen bag instances or deformations. To address this, we present DexKnot, a framework that combines keypoint affordance with diffusion policy to learn a generalizable bag-knotting policy. Our approach learns a shape-agnostic representation of bags from keypoint correspondence data collected through real-world manual deformation. For an unseen bag configuration, the keypoints can be identified by matching the representation to a reference. These keypoints are then provided to a diffusion transformer, which generates robot actions based on a small number of human demonstrations. DexKnot enables effective policy generalization by reducing the observation space to a sparse set of keypoints. Experiments show that DexKnot achieves reliable and consistent knotting performance across a variety of previously unseen instances and deformations.
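
The matching step described in the abstract can be illustrated with a minimal sketch. It assumes that a per-point descriptor has already been computed for every point of a reference bag and of a newly observed bag (for example by a point-cloud encoder), and that correspondence is found by nearest-neighbor search under cosine similarity. The function name `match_keypoints`, the array shapes, and the choice of cosine similarity are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def match_keypoints(ref_feats, ref_keypoint_idx, query_feats, query_points):
    """Transfer reference keypoints to a query point cloud (hypothetical helper).

    ref_feats:        (M, D) descriptors of the reference point cloud
    ref_keypoint_idx: (K,)   indices of the annotated keypoints in the reference
    query_feats:      (N, D) descriptors of the newly observed point cloud
    query_points:     (N, 3) xyz coordinates of the newly observed point cloud
    Returns the (K, 3) matched keypoint positions and their (K,) query indices.
    """
    # Normalize descriptors so that a dot product equals cosine similarity.
    ref = ref_feats[ref_keypoint_idx]
    ref = ref / np.linalg.norm(ref, axis=1, keepdims=True)
    qry = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)

    # Similarity between every reference keypoint and every query point: (K, N).
    sim = ref @ qry.T

    # For each keypoint, pick the query point with the most similar descriptor.
    best = sim.argmax(axis=1)
    return query_points[best], best
```

Flattening the returned `(K, 3)` array gives a `3K`-dimensional observation vector, which is the kind of sparse, low-dimensional input the abstract describes passing to the policy in place of a full point cloud.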