Manifold Geometry-Based Feature Decoupling for Endoscopic Image Analysis
Yan Wen, Haodong Wang, Lingyu Chen, Wenbo She, Dingpei Han, Fang Chen, Tianqi Huang∗
AI summary
Problem
Endoscopic images suffer from ambiguous semantic boundaries and weak feature discriminability due to complex anatomical structures and optical limitations, which traditional Euclidean-based models struggle to represent effectively.
Approach
The proposed SFDP framework uses a Manifold Geometry-Based Feature Decoupling Module (MANDE) to project features into a manifold space, decompose them into semantically independent sub-features using geometric constraints, and process them in parallel before fusing the results.
Key results
- Reduces depth estimation RMSE by 14.2% on average
- Decreases segmentation MAE by 10.5%
- Adds only 5.10M parameters
- Demonstrates consistent performance gains across multiple backbone networks
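For reference, the two metrics named above and the notion of a percentage reduction can be written out as follows. This is a minimal sketch using the standard definitions of RMSE and MAE; the helper names and the toy numbers are illustrative assumptions, not values from the paper, and the summary does not state exactly how the reported averages were aggregated.

```python
import math

def rmse(pred, target):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))

def mae(pred, target):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)

def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline` (hypothetical helper)."""
    return 100.0 * (baseline - improved) / baseline

# Toy values only, chosen for easy arithmetic; they are not results from the paper.
toy_pred = [1.0, 2.0, 3.0]
toy_target = [1.0, 2.0, 5.0]
toy_rmse = rmse(toy_pred, toy_target)
toy_mae = mae(toy_pred, toy_target)
toy_gain = relative_reduction(2.0, 1.5)  # error drops from 2.0 to 1.5
```

A reduction of 14.2% in this sense means the error with SFDP is 0.858 times the baseline error.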
Why it matters
Provides a parameter-efficient, geometrically grounded paradigm to enhance visual perception for endoscopic surgery and surgical robotics.
Abstract
Endoscopic images suffer from ambiguous semantic boundaries and weak feature discriminability due to acquisition limitations and structural constraints of the anatomical lumen, severely limiting the performance of image analysis models. Mainstream models operate in Euclidean space, which is inherently limited in representing the non-linear geometric characteristics of endoscopic imagery. Manifold space provides a natural advantage in representing such complex structures. Inspired by the manifold hypothesis, this paper models endoscopic image features as separable semantic subspaces on a manifold and proposes a Manifold Geometry-Based Feature Decoupling Module (MANDE). Guided by novel manifold geometric constraints, MANDE adaptively decouples feature maps into multiple semantically independent sub-feature maps, effectively mitigating performance degradation caused by feature space coupling. Furthermore, this paper introduces the Semantic Feature Decoupling-and-Processing (SFDP) framework adopting a divide-and-conquer strategy: utilizing a backbone network for feature extraction, MANDE for decoupling, and a Decouple-Aggregation Head for parallel processing and fusion of sub-features. Extensive experiments demonstrate the framework’s adaptability and effectiveness. When integrated with various popular networks, SFDP significantly enhances performance on endoscopic tasks: it reduces RMSE by an average of 14.2% for depth estimation and decreases MAE by 10.5% for segmentation, with only 5.10M additional parameters. Unlike prior works, SFDP uniquely integrates manifold geometry with semantic hierarchical modeling for endoscopic images, providing a novel perspective for surgical robot scene understanding: from holistic features to semantic structures.
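The divide-and-conquer pipeline described above (backbone features, manifold projection, decoupling, parallel heads, fusion) can be illustrated with a short NumPy sketch. It is not the authors' implementation: the abstract does not specify the manifold, the geometric constraints, or the head design, so the unit-hypersphere projection, the fixed channel split, the orthogonality penalty, the linear heads, and the mean fusion below are all assumptions, and every function name is hypothetical.

```python
import numpy as np

def project_to_sphere(features, eps=1e-8):
    """Assumed stand-in for the manifold projection: L2-normalise each
    feature vector so it lies on the unit hypersphere."""
    norms = np.linalg.norm(features, axis=-1, keepdims=True)
    return features / (norms + eps)

def decouple(features, num_groups):
    """Assumed stand-in for MANDE: split the channel axis into equal groups.
    The paper describes an adaptive decoupling; this fixed split is not it."""
    return np.split(features, num_groups, axis=-1)

def orthogonality_penalty(sub_a, sub_b):
    """Hypothetical geometric constraint: mean squared cross-Gram entry,
    which is zero when the two sub-features are mutually orthogonal."""
    cross = sub_a.T @ sub_b
    return float(np.mean(cross ** 2))

def sfdp_forward(backbone_features, head_weights):
    """Project, decouple, run one linear head per sub-feature, then fuse.
    Mean fusion is an assumption standing in for the Decouple-Aggregation Head."""
    z = project_to_sphere(backbone_features)
    subs = decouple(z, len(head_weights))
    outputs = [s @ w for s, w in zip(subs, head_weights)]
    fused = np.mean(outputs, axis=0)
    return fused, subs

# Toy input: 6 spatial positions with 8 channels, standing in for backbone output.
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 8))
head_weights = [rng.normal(size=(2, 1)) for _ in range(4)]  # 4 heads, 2 channels each

fused, subs = sfdp_forward(features, head_weights)
penalty = orthogonality_penalty(subs[0], subs[1])
print(fused.shape, len(subs), penalty >= 0.0)
```

In a trainable version, a penalty of this kind would be added to the task loss so that the sub-features are pushed apart during optimisation; whether the paper's constraints take this form is not stated in the abstract.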