Cross-View Exocentric and Egocentric Fusion for Robust Microsurgical Anastomosis Understanding
Yuxuan Liu, Yuyang Zhuge, Xinyao Zhou, Yating Luo, Yunfei Luan, Yao Guo, Guang-Zhong Yang
AI summary
Problem
Conventional top-down-view microscopes in microsurgical robotics suffer from restricted fields of view and severe self-occlusion during instrument-tissue interactions, limiting accurate scene understanding and surgical safety.
Approach
The authors developed a dual-view vision system combining exocentric (top-down) microscopes with egocentric (eye-in-hand) cameras, and designed frame-wise 2D and temporal 3D cross-view feature fusion networks to jointly predict surgical actions, gripper-object interactions, and instrument poses.
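As a rough illustration of frame-wise cross-view fusion with multi-task outputs, the sketch below concatenates one exocentric and one egocentric feature vector and passes the result to three linear heads. It is not the authors' architecture: the concatenation scheme, the function names (`concat_fuse`, `predict`), the `heads` dictionary layout, and the feature sizes are all assumptions made for illustration, and the summary does not say how the networks actually fuse features.

```python
import math

def concat_fuse(exo_feat, ego_feat):
    """Fuse two per-frame feature vectors by concatenation (assumed scheme)."""
    return list(exo_feat) + list(ego_feat)

def linear(x, weights, bias):
    """Apply a dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(exo_feat, ego_feat, heads):
    """Run the three task heads on the fused feature.

    `heads` maps a task name to a (weights, bias) pair; this layout is
    hypothetical and chosen only to keep the example short.
    """
    fused = concat_fuse(exo_feat, ego_feat)
    action_probs = softmax(linear(fused, *heads["action"]))
    interaction_probs = softmax(linear(fused, *heads["interaction"]))
    pose = linear(fused, *heads["pose"])  # regression output, no softmax
    return action_probs, interaction_probs, pose

# Toy example: 2-dim features per view, hand-picked weights.
heads = {
    "action": ([[1, 0, 0, 0], [0, 0, 0, 1]], [0.0, 0.0]),
    "interaction": ([[0, 1, 0, 0], [0, 0, 1, 0]], [0.0, 0.0]),
    "pose": ([[0, 0, 0, 3]], [0.5]),
}
action_probs, interaction_probs, pose = predict([1.0, 0.0], [0.0, 2.0], heads)
```

A temporal variant would apply the same idea to a window of frames rather than a single frame; the sketch leaves that out because the source gives no detail on the temporal 3D design.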
Key results
- Collected a comprehensive intraoperative microsurgical anastomosis dataset with multi-task annotations
- Designed frame-wise 2D and temporal 3D cross-view fusion networks for joint surgical understanding
- Dual-view fusion consistently outperforms single-view baselines across action recognition, interaction prediction, and pose estimation
- Temporal 3D fusion achieves peak performance with 93.76% action accuracy and 6.38° pose MAE
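The two headline metrics can be illustrated with a small sketch. The summary does not state how pose MAE is computed, so the version below assumes each pose is a single angle in degrees and that errors wrap around at 360°; the function names are hypothetical.

```python
def accuracy(pred_labels, true_labels):
    """Percentage of frames whose predicted action matches the label."""
    if not pred_labels or len(pred_labels) != len(true_labels):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(p == t for p, t in zip(pred_labels, true_labels))
    return 100.0 * hits / len(pred_labels)

def angular_error_deg(pred, true):
    """Smallest absolute difference between two angles, in degrees.

    Assumes wrap-around at 360 degrees, which the source does not confirm.
    """
    diff = abs(pred - true) % 360.0
    return min(diff, 360.0 - diff)

def pose_mae_deg(pred_angles, true_angles):
    """Mean absolute angular error over all samples, in degrees."""
    if not pred_angles or len(pred_angles) != len(true_angles):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [angular_error_deg(p, t) for p, t in zip(pred_angles, true_angles)]
    return sum(errors) / len(errors)

# Toy data, not taken from the paper.
acc = accuracy(["grasp", "pull", "tie", "grasp"],
               ["grasp", "pull", "grasp", "grasp"])
mae = pose_mae_deg([10.0, 350.0], [20.0, 10.0])
```

With the wrap-around assumption, a prediction of 350° against a label of 10° counts as a 20° error rather than 340°.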
Why it matters
Enhances the reliability and interpretability of visual perception for robotic microsurgery, paving the way for safer and more autonomous anastomosis procedures.
Abstract
Microsurgical anastomosis has become increasingly prevalent in surgical autonomy, requiring accurate and stable control of suturing needles and threads while enhancing the efficiency and safety of microsurgical operations. However, current systems predominantly employ top-down-view microscopes for intraoperative imaging, which are constrained by limited field-of-view and significant occlusion caused by instrument-tissue interactions. To address these challenges, we develop a dual-view vision system for microsurgical anastomosis, integrating both conventional top-down-view microscopes and eye-in-hand cameras mounted on surgical instrument tips. Our approach involves cross-view feature fusion through different schemes to improve microsurgical scene understanding, including surgical action recognition, gripper-object interaction prediction, and instrument pose estimation. Extensive anastomosis datasets are collected on our robotic platform and several experiments are conducted for detailed evaluation of the system performance. Quantitative and qualitative results demonstrate that our dual-view microsurgical system significantly outperforms single-view microscopes in terms of robust visual perception, and cross-view feature fusion improves both the accuracy and precision of anastomosis scene understanding.