ICRA 2026

Learning Optimal Strategies for Needle Handover in Surgical Suturing

Cholin Kim, Jeonghyeon Yoon, Hakyun Lee, Sihyeoung Park, Hyojae Park, Seungjun Lee, Michael Yip, Minho Hwang

AI summary

A reinforcement learning framework achieves human-level efficiency in surgical needle handovers while improving consistency and avoiding joint limits.

surgical automation, needle handover, reinforcement learning, da Vinci robot, robotic surgery, DQN

Problem

Automating surgical needle handover is hindered by combinatorial grasping strategies, pose estimation uncertainties, and cable-driven robot inaccuracies, making it difficult to minimize unnecessary handovers and maintain reliability.

Approach

The authors formulate needle pickup and handover as a goal-oriented reinforcement learning task, training a DQN agent with simulated kinematic disturbances to learn optimal grasping sequences that minimize handovers while respecting joint limits.
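
The goal-oriented objective described above (reach the target grasp in as few handovers as possible, despite disturbances) can be illustrated on a toy problem. The sketch below is not the authors' method: it replaces the DQN with exact value iteration on a six-state table, which applies the same Bellman backup that a DQN approximates with a network. The grasp configurations, the edges between them, the slip probability, and the discount factor are all hypothetical values chosen for illustration.

```python
# Toy goal-oriented handover MDP (all states, edges, and numbers are hypothetical).
# Value iteration stands in for DQN: it applies the Bellman backup exactly on a
# small table instead of approximating it with a neural network.

# Hypothetical grasp configurations; an edge means one handover can move the
# needle from one configuration to the other.
EDGES = {
    "pickup": ["left_mid", "right_tip"],
    "left_mid": ["pickup", "right_mid"],
    "right_tip": ["pickup", "left_tip"],
    "right_mid": ["left_mid", "goal"],
    "left_tip": ["right_tip", "right_mid"],
    "goal": [],
}
GOAL = "goal"
SLIP_PROB = 0.1   # assumed chance that a disturbed handover fails (grasp unchanged)
GAMMA = 0.95      # assumed discount factor
STEP_COST = -1.0  # every attempted handover costs one unit


def q_value(values, state, nxt):
    """Expected return of attempting the handover state -> nxt."""
    success = STEP_COST + GAMMA * values[nxt]
    failure = STEP_COST + GAMMA * values[state]
    return (1.0 - SLIP_PROB) * success + SLIP_PROB * failure


def value_iteration(sweeps=200):
    """Repeat in-place Bellman backups until the table has effectively converged."""
    values = {s: 0.0 for s in EDGES}
    for _ in range(sweeps):
        for s, nbrs in EDGES.items():
            if s != GOAL:
                values[s] = max(q_value(values, s, n) for n in nbrs)
    return values


def greedy_plan(values, start, max_steps=10):
    """Follow the greedy policy, assuming every handover succeeds."""
    plan, state = [start], start
    while state != GOAL and len(plan) <= max_steps:
        state = max(EDGES[state], key=lambda n: q_value(values, state, n))
        plan.append(state)
    return plan


values = value_iteration()
plan = greedy_plan(values, "pickup")
print(plan)
```

With these made-up numbers the greedy plan is pickup, left_mid, right_mid, goal: three handovers, rather than the four needed on the route through right_tip. The per-step cost is what pushes the policy toward fewer handovers.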

Key results

First framework unifying pickup and handover planning for target grasping
DQN policy trained with joint margin rewards and disturbance modeling for robustness
Achieved human-level handover efficiency (1.65 ± 0.50 vs. 1.62 ± 0.55 attempts) on dVRK
Improved consistency and joint-limit avoidance compared to human teleoperation
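
The summary does not give the reward or the disturbance model, so the following is only a sketch of what a joint-margin reward and a kinematic disturbance might look like. The joint limits, the Gaussian noise model, the weights, and the function names (joint_margin, perturb, step_reward) are assumptions for illustration, not values or APIs from the paper or from the dVRK software.

```python
import random

# Hypothetical joint limits for a three-joint arm (placeholders, not dVRK specs).
JOINT_LIMITS = [(-1.5, 1.5), (-0.8, 0.8), (0.0, 0.24)]


def joint_margin(q, limits=JOINT_LIMITS):
    """Smallest normalised distance to a joint limit: 1 at mid-range, 0 at a limit."""
    margins = []
    for value, (lo, hi) in zip(q, limits):
        half_range = (hi - lo) / 2.0
        margins.append(min(value - lo, hi - value) / half_range)
    return min(margins)


def perturb(q, sigma, rng):
    """Add zero-mean Gaussian noise to each joint to mimic kinematic error."""
    return [value + rng.gauss(0.0, sigma) for value in q]


def step_reward(q, reached_goal, margin_weight=0.5, goal_bonus=10.0):
    """Unit cost per handover, a margin bonus, and a terminal bonus at the goal grasp."""
    margin = joint_margin(q)
    if margin < 0.0:  # a joint is outside its limits: treat as infeasible
        return -goal_bonus
    reward = -1.0 + margin_weight * margin
    return reward + goal_bonus if reached_goal else reward


rng = random.Random(0)  # fixed seed so the sketch is repeatable
nominal = [0.0, 0.0, 0.12]
disturbed = perturb(nominal, sigma=0.02, rng=rng)
print(joint_margin(nominal), step_reward(disturbed, reached_goal=False))
```

Scoring the disturbed configuration rather than the nominal one during training is one plausible way to favour grasps that stay clear of the joint limits even when the executed pose deviates from the commanded pose.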

Why it matters

Demonstrates the viability of reinforcement learning for safe, reliable automation of critical surgical subtasks, paving the way for more autonomous robotic surgery.

Abstract

Automation of suturing subtasks, such as needle handover, has the potential to reduce surgeons’ fatigue and improve surgical efficiency. Needle handover is particularly challenging due to the combinatorial nature of grasping and handover strategies, uncertainties in needle pose estimation, and inaccuracies inherent in cable-driven surgical robots such as the da Vinci system. In this work, we present a reinforcement learning framework for needle handover, spanning the process from initial pickup to a desired grasping state. We formulate the task as a goal-oriented planning problem and design a state–action representation that captures grasping and handover configurations. A DQN-based policy is trained with disturbances that reflect real-world kinematic errors to ensure robustness. The learned policy was validated on the da Vinci Research Kit (dVRK) and quantitatively compared with human teleoperation. Results demonstrate that our approach achieves human-level efficiency in terms of handover attempts (1.65 ± 0.50 vs. 1.62 ± 0.55), while improving consistency and joint-limit avoidance. The proposed framework demonstrates the potential of reinforcement learning for safe and reliable automation of surgical handover and points to opportunities for extending autonomy to more complex handover scenarios.

Index terms

Surgical Robotics: Planning; Reinforcement Learning; Task Planning