ICRA 2026

Endoscopic Spine Surgical View Enhancement Via Diffusion-Prior Contrastive and Physics-Informed Constraints for Robotic Navigation

Haojie Han, Longfei Ma, Kai Xu, Suxi Gu, Shipeng Zhang, Guochen Ning, Fang Chen, Hongen Liao

AI summary

DCP-Net achieves state-of-the-art unpaired endoscopic image restoration and significantly boosts downstream bleeding point detection for safer robotic spinal navigation.

Spinal endoscopy, Image restoration, Diffusion models, Robotic surgery, Unpaired learning, Bleeding detection

Problem

Intraoperative spinal endoscopic views are frequently degraded by bleeding, fluids, and artifacts, yet acquiring paired clean and degraded images for model training is clinically infeasible, limiting robotic perception and surgical safety.

Approach

The authors propose DCP-Net, an unpaired restoration framework that aligns latent features via diffusion-prior contrastive learning, enforces anatomical fidelity through physics-informed constraints, and generates pixel-wise uncertainty maps to guide risk-aware robotic perception.
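To make the contrastive part of the approach more concrete, here is a minimal sketch of a generic InfoNCE-style contrastive loss on latent feature vectors. It is not the authors' DPCL objective, whose exact form is not given here. The function names, the use of cosine similarity, and the temperature value are all illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss (hypothetical helper, not the paper's DPCL loss).

    Pulls the anchor feature toward the positive feature and pushes it
    away from the negative features. Lower is better; the minimum is 0.
    """
    # Index 0 holds the positive pair; the rest are negative pairs.
    logits = [cosine_similarity(anchor, positive) / temperature]
    logits += [cosine_similarity(anchor, n) / temperature for n in negatives]

    # Log-sum-exp with the maximum subtracted for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(v - peak) for v in logits))

    # Negative log-probability of picking the positive among all candidates.
    return log_denominator - logits[0]


if __name__ == "__main__":
    restored_feature = [1.0, 0.0]   # assumed latent of a restored image
    clean_feature = [0.9, 0.1]      # assumed latent of a clean image
    degraded_feature = [0.0, 1.0]   # assumed latent of a degraded image
    print(info_nce_loss(restored_feature, clean_feature, [degraded_feature]))
```

In a sketch like this, the loss is small when the restored feature sits close to clean features and far from degraded ones, which is the general behavior a contrastive alignment term aims for.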

Key results

First large-scale spine endoscopy dataset with 21,845 paired and unpaired samples for bleeding, bubbles, and artifacts
State-of-the-art unpaired restoration performance across PSNR, SSIM, and color fidelity metrics
16.31% increase in mAP for downstream bleeding point detection using uncertainty-guided perception
Effective suppression of intraoperative artifacts while preserving critical anatomical details
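
For readers unfamiliar with the metrics named above, the following sketch shows how PSNR is conventionally computed between a reference image and a restored image. It uses the standard definition, 10 * log10(MAX^2 / MSE). The nested-list image representation and the function name are assumptions chosen to keep the example dependency-free, and nothing here reflects the paper's evaluation code.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two grayscale images.

    Images are nested lists of pixel rows (an assumption made for a
    stdlib-only example). Returns math.inf for identical images.
    """
    same_height = len(reference) == len(restored)
    same_widths = all(len(r) == len(s) for r, s in zip(reference, restored))
    if not (same_height and same_widths):
        raise ValueError("images must have the same shape")

    squared_error = 0.0
    pixel_count = 0
    for ref_row, res_row in zip(reference, restored):
        for ref_pixel, res_pixel in zip(ref_row, res_row):
            squared_error += (ref_pixel - res_pixel) ** 2
            pixel_count += 1
    if pixel_count == 0:
        raise ValueError("images must contain at least one pixel")

    mse = squared_error / pixel_count
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    clean = [[10, 20], [30, 40]]
    noisy = [[11, 21], [31, 41]]  # every pixel off by one intensity level
    print(round(psnr(clean, noisy), 2))
```

Higher PSNR means the restored image is closer to the reference. Because the metric needs a reference, it applies only to the paired portion of a dataset.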

Why it matters

Improves visual clarity and perception reliability for robot-assisted spinal surgery, directly enhancing procedural safety and navigation accuracy for surgeons and robotic systems.

Abstract

In robot-assisted spinal endoscopy, intraoperative imaging is frequently degraded by bleeding, irrigation fluids, bubbles, smoke, and uneven illumination, which can severely compromise surgical precision, safety, and decision-making. Accurate identification of anatomical structures is particularly critical in spinal procedures, yet acquiring paired clean and degraded images in real clinical settings is infeasible. To address this challenge, we propose DCP-Net, an unpaired endoscopic image restoration framework tailored for robotic spinal surgery. DCP-Net integrates Diffusion-Prior Contrastive Learning (DPCL) to leverage generative priors and contrastive objectives for robust latent representations, and Physics-Informed Constraints (PIC) to ensure anatomically consistent restoration. Furthermore, we introduce Diffusion-Prior Uncertainty Estimation (DPUE), providing pixel-wise confidence maps that quantify restoration reliability and guide risk-aware robotic perception. We further constructed a dataset comprising 21,845 paired/unpaired samples of intraoperative visual degradations in spinal endoscopy, primarily involving bleeding, bubbles, and other artifacts. Extensive experiments show that DCP-Net outperforms existing methods in both quantitative metrics and perceptual quality, significantly improving visual clarity and supporting various robotic navigation tasks. Among these tasks, accurate bleeding point detection plays a particularly critical role in ensuring safe and precise navigation in clinical practice.
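
The abstract does not say how the pixel-wise confidence maps are computed. One common generic way to get per-pixel uncertainty from a stochastic generative model is to draw several restorations of the same input and measure how much they disagree at each pixel. The sketch below illustrates that idea only. The function name, the nested-list image format, and the use of population variance are assumptions, and this should not be read as the DPUE method.

```python
def pixelwise_uncertainty(samples):
    """Per-pixel mean and variance across several restorations of one image.

    `samples` is a list of K >= 2 grayscale images, each a nested list of
    pixel rows with identical shape (an assumed, stdlib-only format).
    Returns (mean_image, variance_map); high variance marks pixels where
    the sampled restorations disagree.
    """
    sample_count = len(samples)
    if sample_count < 2:
        raise ValueError("need at least two samples to estimate variance")

    height = len(samples[0])
    width = len(samples[0][0])

    mean_image = [
        [sum(s[y][x] for s in samples) / sample_count for x in range(width)]
        for y in range(height)
    ]
    # Population variance (divide by K) of the samples at each pixel.
    variance_map = [
        [
            sum((s[y][x] - mean_image[y][x]) ** 2 for s in samples) / sample_count
            for x in range(width)
        ]
        for y in range(height)
    ]
    return mean_image, variance_map


if __name__ == "__main__":
    # Two hypothetical restorations that agree on the left pixel only.
    restorations = [[[0.0, 1.0]], [[0.0, 3.0]]]
    mean_image, variance_map = pixelwise_uncertainty(restorations)
    print(mean_image, variance_map)
```

A downstream detector could, for instance, down-weight predictions in high-variance regions, which is one plausible reading of "risk-aware" perception.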

Index terms

Computer Vision for Medical Robotics; Data Sets for Robotic Vision; Vision-Based Navigation