ICRA 2026

DAGDiff: Guiding Dual-Arm Grasp Diffusion to Stable and Collision-Free Grasps

Md Faizal Karim, Vignesh Vembar, Keshab Patra, Gaurav Singh, Madhava Krishna


AI summary

DAGDiff uses classifier-guided diffusion to directly generate stable, collision-free dual-arm grasps from point clouds, outperforming prior methods and transferring reliably to real-world robots.
Dual-arm grasping, Diffusion models, Force-closure stability, Collision avoidance, Robotic manipulation, SE(3) diffusion

Problem

Reliable dual-arm grasping of large, complex objects remains challenging because a grasp pair must be jointly stable, collision-free, and able to generalize to unseen objects. Prior methods typically decompose the task into two independent single-arm proposals, relying on region priors or heuristics that offer no principled guarantee of physically valid coordination.

Approach

The framework extends SE(3) diffusion to the dual-arm setting and steers the generative process using classifier signals for force-closure stability and collision avoidance, eliminating the need for explicit region detection or object priors.
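The idea of steering a generative sampler with classifier signals can be illustrated on a toy problem. The sketch below is not the paper's implementation: it works in a flat 2D space rather than SE(3)×SE(3), uses a standard Gaussian as a stand-in for the learned grasp prior, and replaces the learned force-closure and collision heads with a single hand-written logistic "validity" classifier. All function names and parameters (`prior_score`, `classifier_log_prob_grad`, `guidance_scale`, and so on) are hypothetical. It only shows the general mechanism: the classifier's log-probability gradient is added to the prior's score at every sampling step.

```python
import numpy as np

def prior_score(x):
    """Score (gradient of the log-density) of a standard Gaussian stand-in prior."""
    return -x

def classifier_log_prob_grad(x, w, b, sharpness=4.0):
    """Gradient w.r.t. x of log sigmoid(sharpness * (x @ w - b)).

    The logistic classifier is a hypothetical stand-in for a learned
    validity head (e.g. stability or collision).
    """
    z = sharpness * (x @ w - b)
    sig = 1.0 / (1.0 + np.exp(-z))
    return (sharpness * (1.0 - sig))[:, None] * w[None, :]

def guided_langevin(n_samples, guidance_scale, steps=200, step_size=0.05, seed=0):
    """Langevin sampling where the prior score is shifted by classifier guidance."""
    rng = np.random.default_rng(seed)
    w = np.array([1.0, 0.0])  # "valid" region is roughly x[0] > b (assumed)
    b = 1.0
    x = rng.standard_normal((n_samples, 2))
    for _ in range(steps):
        # Guided score = prior score + scale * grad log p(valid | x)
        score = prior_score(x) + guidance_scale * classifier_log_prob_grad(x, w, b)
        noise = rng.standard_normal(x.shape)
        x = x + step_size * score + np.sqrt(2.0 * step_size) * noise
    return x

unguided = guided_langevin(2000, guidance_scale=0.0)
guided = guided_langevin(2000, guidance_scale=2.0)
print("fraction in valid region, unguided:", np.mean(unguided[:, 0] > 1.0))
print("fraction in valid region, guided:  ", np.mean(guided[:, 0] > 1.0))
```

With `guidance_scale=0.0` the sampler simply draws from the prior; raising the scale pulls more samples into the region the classifier labels as valid, which is the same trade-off a guided grasp sampler would face between staying close to the learned grasp distribution and satisfying the validity signals.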

Key results

  • First end-to-end diffusion framework for dual-arm grasp generation in SE(3)×SE(3) space
  • Classifier-guided diffusion with force-closure and collision heads enforces physical validity
  • Consistent improvements over prior work in analytical force-closure checks, collision analysis, and large-scale physics-based simulations
  • Successful zero-shot sim-to-real transfer on a heterogeneous dual-arm setup with unseen objects
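As a concrete picture of what an analytical stability check can look like, the sketch below implements the classical antipodal test for a single two-contact grasp: the line joining the two contacts must lie strictly inside both friction cones. This is a standard textbook condition (sufficient for force closure with planar frictional contacts or 3D soft-finger contacts), not necessarily the check used in the paper, and it covers one gripper rather than a dual-arm pair. The function name, the inward-pointing unit-normal convention, and the friction coefficient `mu` are assumptions made for illustration.

```python
import numpy as np

def antipodal_in_friction_cones(p1, n1, p2, n2, mu):
    """Return True if the segment p1-p2 lies strictly inside both friction cones.

    p1, p2: contact points on the object surface.
    n1, n2: inward-pointing unit surface normals at those contacts (assumed).
    mu:     Coulomb friction coefficient; the cone half-angle is arctan(mu).
    """
    p1, n1, p2, n2 = (np.asarray(v, dtype=float) for v in (p1, n1, p2, n2))
    d = p2 - p1
    dist = np.linalg.norm(d)
    if dist == 0.0:
        return False  # coincident contacts cannot form a grasp
    d = d / dist
    half_angle = np.arctan(mu)
    # Angle between each inward normal and the direction to the other contact
    angle1 = np.arccos(np.clip(np.dot(n1, d), -1.0, 1.0))
    angle2 = np.arccos(np.clip(np.dot(n2, -d), -1.0, 1.0))
    return bool(angle1 < half_angle and angle2 < half_angle)

# Contacts on opposite faces of a box, directly facing each other
print(antipodal_in_friction_cones([-1, 0, 0], [1, 0, 0], [1, 0, 0], [-1, 0, 0], mu=0.5))
# Second contact shifted sideways so the connecting line leaves the cones
print(antipodal_in_friction_cones([-1, 0, 0], [1, 0, 0], [1, 2, 0], [-1, 0, 0], mu=0.5))
```

The first call describes a well-aligned pinch and passes the test; the second fails because the connecting line makes a 45° angle with the normals, wider than the cone half-angle of arctan(0.5).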

Why it matters

Enables robots to reliably manipulate large, complex objects in real-world settings by ensuring coordinated, physically valid dual-arm grasps without manual region priors.

Abstract

Reliable dual-arm grasping is essential for manipulating large and complex objects but remains a challenging problem due to stability, collision, and generalization requirements. Prior methods typically decompose the task into two independent grasp proposals, relying on region priors or heuristics that limit generalization and provide no principled guarantee of stability. We propose DAGDiff, an end-to-end framework that directly denoises to grasp pairs in the SE(3)×SE(3) space. Our key insight is that stability and collision can be enforced more effectively by guiding the diffusion process with classifier signals, rather than relying on explicit region detection or object priors. To this end, DAGDiff integrates geometry-, stability-, and collision-aware guidance terms that steer the generative process toward grasps that are physically valid and force-closure compliant. We comprehensively evaluate DAGDiff through analytical force-closure checks, collision analysis, and large-scale physics-based simulations, showing consistent improvements over previous work on these metrics. Finally, we demonstrate that our framework generates dual-arm grasps directly on real-world point clouds of previously unseen objects, which are executed on a heterogeneous dual-arm setup where two manipulators reliably grasp and lift them. Project Page: dag-diff.github.io/dagdiff/

Index terms

Deep Learning in Grasping and Manipulation; Grasping
