Endoscopic Spine Surgical View Enhancement Via Diffusion-Prior Contrastive and Physics-Informed Constraints for Robotic Navigation
Haojie Han, Longfei Ma, Kai Xu, Suxi Gu, Shipeng Zhang, Guochen Ning, Fang Chen, Hongen Liao
AI summary
Problem
Intraoperative spinal endoscopic views are frequently degraded by bleeding, fluids, and artifacts, yet acquiring paired clean and degraded images for model training is clinically infeasible, limiting robotic perception and surgical safety.
Approach
The authors propose DCP-Net, an unpaired restoration framework that aligns latent features via diffusion-prior contrastive learning, enforces anatomical fidelity through physics-informed constraints, and generates pixel-wise uncertainty maps to guide risk-aware robotic perception.
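To make the contrastive part of the approach concrete, here is a minimal sketch of an InfoNCE-style loss on latent feature vectors. The summary does not state which contrastive objective DCP-Net uses, so InfoNCE, the cosine similarity, the `temperature` value, and all function names are assumptions chosen for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor (hypothetical helper, not the paper's code).

    The loss is low when the anchor is closer to the positive than to
    any of the negatives, and high otherwise.
    """
    # Logit 0 is the positive pair; the rest are negative pairs.
    logits = [cosine_similarity(anchor, positive) / temperature]
    logits += [cosine_similarity(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all logits.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_sum_exp - logits[0]


# Illustrative features: a restored patch should match its clean-domain
# counterpart (positive) and differ from a degraded patch (negative).
aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In this toy case `aligned` is close to zero and `misaligned` is large, which is the behavior a contrastive objective relies on to pull matching latent features together.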
Key results
- First large-scale spine endoscopy dataset with 21,845 paired and unpaired samples for bleeding, bubbles, and artifacts
- State-of-the-art unpaired restoration performance across PSNR, SSIM, and color fidelity metrics
- 16.31% increase in mAP for downstream bleeding point detection using uncertainty-guided perception
- Effective suppression of intraoperative artifacts while preserving critical anatomical details
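For readers unfamiliar with the PSNR metric named in the results, the sketch below computes it using the standard definition. Images are represented as flat lists of pixel intensities for simplicity, and the 8-bit `max_value` default is an assumption; the authors' evaluation code is not shown in the source.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are flat sequences of pixel intensities.
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Illustrative values: every pixel is off by 10 intensity levels.
score = psnr([100, 100, 100, 100], [110, 90, 110, 90])
```

Higher PSNR means the restored image is closer to the reference; identical images give infinity.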
Why it matters
Improves visual clarity and perception reliability for robot-assisted spinal surgery, directly enhancing procedural safety and navigation accuracy for surgeons and robotic systems.
Abstract
In robot-assisted spinal endoscopy, intraoperative imaging is frequently degraded by bleeding, irrigation fluids, bubbles, smoke, and uneven illumination, which can severely compromise surgical precision, safety, and decision-making. Accurate identification of anatomical structures is particularly critical in spinal procedures, yet acquiring paired clean and degraded images in real clinical settings is infeasible. To address this challenge, we propose DCP-Net, an unpaired endoscopic image restoration framework tailored for robotic spinal surgery. DCP-Net integrates Diffusion-Prior Contrastive Learning (DPCL) to leverage generative priors and contrastive objectives for robust latent representations, and Physics-Informed Constraints (PIC) to ensure anatomically consistent restoration. Furthermore, we introduce Diffusion-Prior Uncertainty Estimation (DPUE), providing pixel-wise confidence maps that quantify restoration reliability and guide risk-aware robotic perception. We also constructed a dataset comprising 21,845 paired/unpaired samples of intraoperative visual degradations in spinal endoscopy, primarily involving bleeding, bubbles, and other artifacts. Extensive experiments show that DCP-Net outperforms existing methods in both quantitative metrics and perceptual quality, significantly improving visual clarity and supporting various robotic navigation tasks. Among these tasks, accurate bleeding point detection plays a particularly critical role in ensuring safe and precise navigation in clinical practice.
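The abstract does not say how DPUE derives its pixel-wise confidence maps. One common way to obtain such a map from a stochastic generative model is to draw several restorations of the same frame and measure how much each pixel varies across them. The sketch below illustrates that idea only; the sampling-based approach and the function name `uncertainty_map` are assumptions, not the authors' method.

```python
import statistics


def uncertainty_map(samples):
    """Per-pixel mean and standard deviation over K restored images.

    `samples` is a list of K grayscale images, each a list of rows of
    floats with identical shape. High standard deviation marks pixels
    where the restorations disagree, i.e. low confidence.
    """
    height = len(samples[0])
    width = len(samples[0][0])
    mean = [
        [statistics.fmean(s[y][x] for s in samples) for x in range(width)]
        for y in range(height)
    ]
    std = [
        [statistics.pstdev([s[y][x] for s in samples]) for x in range(width)]
        for y in range(height)
    ]
    return mean, std


# Two hypothetical restorations of a 1x2 image: they agree on the first
# pixel and disagree on the second.
mean_img, std_img = uncertainty_map([[[0.0, 1.0]], [[0.0, 3.0]]])
```

A downstream module could, for example, discount detections that fall in high-variance regions, which is one plausible reading of "risk-aware" perception.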