ICRA 2026

Learning Dexterous Manipulation with Quantized Hand State

Ying Feng, Hongjie Fang, Yinong He, Jingjing Chen, Chenxi Wang, Zihao He, Ruonan Liu, Cewu Lu

AI summary

Quantizing hand states and jointly diffusing arm actions enables more balanced, efficient learning and higher success rates in complex dexterous manipulation tasks.

dexterous manipulation, visuomotor policy, action quantization, arm-hand coordination, diffusion policy, robotic teleoperation

Problem

Existing visuomotor policies either combine arm and hand actions in a high-dimensional space—causing hand motions to dominate and degrade arm localization—or naively separate them, breaking essential arm-hand coordination.

Approach

DQ-RISE compresses dexterous hand states into discrete codes via a VQ-VAE, then applies PCA-based continuous relaxation to allow the policy to diffuse arm actions jointly with these compact hand representations, preserving coordination while simplifying the learning objective.
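The pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the codebook here is random rather than learned by a VQ-VAE, the sizes (`K`, `D`, `P`, `ARM_DIM`) are hypothetical, and "PCA-based continuous relaxation" is read as "give each discrete code low-dimensional continuous coordinates by projecting the codebook onto its top principal axes", which may differ from the paper's exact formulation. The helper names `nearest_code` and `fit_pca` are introduced only for this example.

```python
import numpy as np

def nearest_code(x, codebook):
    """Return, for each row of x, the index of the closest codebook row."""
    # Squared Euclidean distance between every input and every code: (N, K)
    d2 = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def fit_pca(codebook, n_components):
    """PCA of the codebook via SVD; returns its mean and top principal axes."""
    mean = codebook.mean(axis=0)
    _, _, vt = np.linalg.svd(codebook - mean, full_matrices=False)
    return mean, vt[:n_components]  # axes have shape (n_components, D)

rng = np.random.default_rng(0)
K, D, P, ARM_DIM = 32, 8, 3, 7        # hypothetical sizes, not from the paper

# Stand-in for learned VQ-VAE code embeddings (assumption: random values).
codebook = rng.normal(size=(K, D))

# Continuous relaxation: each discrete code gets P continuous coordinates.
mean, axes = fit_pca(codebook, P)
code_coords = (codebook - mean) @ axes.T          # (K, P)

# Encode: hand latent -> discrete code index -> relaxed continuous hand state.
hand_latent = rng.normal(size=(4, D))             # stand-in for encoder outputs
idx = nearest_code(hand_latent, codebook)
hand_relaxed = code_coords[idx]                   # (4, P)

# Joint target that a diffusion policy could denoise: arm action + hand state.
arm_action = rng.normal(size=(4, ARM_DIM))
joint_target = np.concatenate([arm_action, hand_relaxed], axis=1)

# Decode: split a (slightly noisy) prediction and snap the hand part back
# to the nearest code in the relaxed space.
predicted = joint_target + 0.01 * rng.normal(size=joint_target.shape)
decoded_idx = nearest_code(predicted[:, ARM_DIM:], code_coords)
```

In this sketch the hand contributes only `P` dimensions to the joint target instead of its full joint-angle vector, which is one way to keep the hand from dominating the action space while the arm and hand are still predicted together.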

Key results

85.83% average success rate across six diverse dexterous manipulation tasks
Outperforms joint and separate action prediction baselines by over 20% in average success rate
Enables precise arm localization and smooth arm-hand coordination without manual gesture discretization
Introduces a hybrid VR-glove teleoperation system for intuitive high-DoF data collection

Why it matters

Offers a scalable, data-efficient framework for training generalizable visuomotor policies on high-DoF robotic hands, accelerating progress toward human-like dexterous manipulation.

Abstract

Dexterous robotic hands enable robots to perform complex manipulations that require fine-grained control and adaptability. Achieving such manipulation is challenging because the high degrees of freedom tightly couple hand and arm motions, making learning and control difficult. Successful dexterous manipulation relies not only on precise hand motions, but also on accurate spatial positioning of the arm and coordinated arm-hand dynamics. However, most existing visuomotor policies represent arm and hand actions in a single combined space, which often causes high-dimensional hand actions to dominate the coupled action space and compromise arm control. To address this, we propose DQ-RISE, which quantizes hand states to simplify hand motion prediction while preserving essential patterns, and applies a continuous relaxation that allows arm actions to diffuse jointly with these compact hand states. This design enables the policy to learn arm-hand coordination from data while preventing hand actions from overwhelming the action space. Experiments show that DQ-RISE achieves more balanced and efficient learning, paving the way toward structured and generalizable dexterous manipulation. Project website: https://rise-policy.github.io/DQ-RISE/.

Index terms

Imitation Learning; Dexterous Manipulation; Deep Learning in Grasping and Manipulation