ICRA 2026

Cross-View Exocentric and Egocentric Fusion for Robust Microsurgical Anastomosis Understanding

Yuxuan Liu, Yuyang Zhuge, Xinyao Zhou, Yating Luo, Yunfei Luan, Yao Guo, Guang-Zhong Yang

AI summary

Integrating top-down microscope views with instrument-mounted eye-in-hand cameras via cross-view fusion significantly improves the accuracy and robustness of microsurgical scene understanding compared to single-view systems.

Microsurgical robotics, cross-view fusion, exocentric-egocentric vision, surgical scene understanding, dual-view system, anastomosis

Problem

Conventional top-down-view microscopes in microsurgical robotics suffer from restricted fields of view and severe self-occlusion during instrument-tissue interactions, limiting accurate scene understanding and surgical safety.

Approach

The authors developed a dual-view vision system combining exocentric (top-down) microscopes with egocentric (eye-in-hand) cameras, and designed frame-wise 2D and temporal 3D cross-view feature fusion networks to jointly predict surgical actions, gripper-object interactions, and instrument poses.
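
To make the idea of cross-view fusion with joint prediction concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the concatenation-based fusion, the time-averaging stand-in for temporal modelling, the untrained random weights, and every name and size (`fuse_frame`, `fuse_clip`, `MultiTaskHeads`, the class counts) are assumptions made for illustration only.

```python
import random


def linear(x, weights, bias):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained, illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse_frame(exo_feat, ego_feat):
    """Assumed frame-wise fusion: concatenate the two per-view feature vectors."""
    return list(exo_feat) + list(ego_feat)


def fuse_clip(exo_clip, ego_clip):
    """Assumed temporal fusion: fuse each frame pair, then average over time."""
    fused = [fuse_frame(a, b) for a, b in zip(exo_clip, ego_clip)]
    return [sum(column) / len(fused) for column in zip(*fused)]


class MultiTaskHeads:
    """Three linear heads sharing one fused feature vector."""

    def __init__(self, n_in, n_actions, n_interactions, n_pose, seed=0):
        rng = random.Random(seed)
        self.action = init_layer(n_in, n_actions, rng)
        self.interaction = init_layer(n_in, n_interactions, rng)
        self.pose = init_layer(n_in, n_pose, rng)

    def __call__(self, fused):
        return {
            "action_logits": linear(fused, *self.action),
            "interaction_logits": linear(fused, *self.interaction),
            "pose": linear(fused, *self.pose),
        }


# Hypothetical sizes: 3 frames, 4-d features per view, 5 actions,
# 2 interaction states, 3 pose angles.
exo_clip = [[0.1, 0.2, 0.3, 0.4] for _ in range(3)]
ego_clip = [[0.5, 0.6, 0.7, 0.8] for _ in range(3)]
heads = MultiTaskHeads(n_in=8, n_actions=5, n_interactions=2, n_pose=3)
outputs = heads(fuse_clip(exo_clip, ego_clip))
```

The point of the sketch is the data flow: both views contribute features to a single shared representation, and all three tasks read from it, so information visible only in one view (for example, a needle tip occluded in the top-down image) can still inform every prediction.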

Key results

Collected a comprehensive intraoperative microsurgical anastomosis dataset with multi-task annotations
Designed frame-wise 2D and temporal 3D cross-view fusion networks for joint surgical understanding
Dual-view fusion consistently outperforms single-view baselines across action recognition, interaction prediction, and pose estimation
Temporal 3D fusion achieves peak performance with 93.76% action accuracy and 6.38° pose MAE
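
For readers unfamiliar with the two reported metrics, the following sketch shows one common way to compute a classification accuracy in percent and a mean absolute error over angles in degrees. The summary does not state how the paper parameterizes pose or whether angle differences are wrapped, so the wrap-around handling and the function names here are assumptions.

```python
def action_accuracy(predicted, actual):
    """Percentage of samples whose predicted action label matches the ground truth."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


def angular_error_deg(pred_deg, true_deg):
    """Smallest absolute difference between two angles, in the range [0, 180]."""
    diff = abs(pred_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)


def pose_mae_deg(predicted, actual):
    """Mean absolute angular error in degrees (assumes wrap-around at 360)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [angular_error_deg(p, a) for p, a in zip(predicted, actual)]
    return sum(errors) / len(errors)


# Toy values, not data from the paper.
print(action_accuracy(["grasp", "pull", "tie", "cut"],
                      ["grasp", "pull", "tie", "hold"]))  # 75.0
print(pose_mae_deg([350.0, 90.0], [10.0, 80.0]))          # 15.0
```

Without the wrap-around step, the pair 350° and 10° would count as a 340° error rather than 20°, which would inflate the MAE for orientations near the angle boundary.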

Why it matters

Enhances the reliability and interpretability of visual perception for robotic microsurgery, paving the way for safer and more autonomous anastomosis procedures.

Abstract

Microsurgical anastomosis has become increasingly prevalent in surgical autonomy, requiring accurate and stable control of suturing needles and threads while enhancing the efficiency and safety of microsurgical operations. However, current systems predominantly employ top-down-view microscopes for intraoperative imaging, which are constrained by limited field-of-view and significant occlusion caused by instrument-tissue interactions. To address these challenges, we develop a dual-view vision system for microsurgical anastomosis, integrating both conventional top-down-view microscopes and eye-in-hand cameras mounted on surgical instrument tips. Our approach involves cross-view feature fusion through different schemes to improve microsurgical scene understanding, including surgical action recognition, gripper-object interaction prediction, and instrument pose estimation. Extensive anastomosis datasets are collected on our robotic platform and several experiments are conducted for detailed evaluation of the system performance. Quantitative and qualitative results demonstrate that our dual-view microsurgical system significantly outperforms single-view microscopes in terms of robust visual perception, and cross-view feature fusion improves both the accuracy and precision of anastomosis scene understanding.

Index terms

Computer Vision for Medical Robotics; Deep Learning for Visual Perception; Medical Robots and Systems