ICRA 2026

Multi-Modal Sensing in Colonoscopy: A Data-Driven Approach

Viola Del Bono, Emma Capaldi, Anushka Kelshiker, Ayhan Aktas, Hiroyuki Aihara, Sheila Russo

AI summary

A machine learning framework paired with an automated calibration platform enables real-time, accurate estimation of 3D shape and localized contact force in a soft optical colonoscopy sleeve.

Soft robotics, Optical sensing, Machine learning, Colonoscopy, Force estimation, Shape reconstruction

Problem

Soft optical sensors for colonoscopy exhibit complex, multi-modal responses that are difficult to model, while manual calibration creates a major bottleneck for collecting the large datasets required for machine learning.

Approach

The researchers built an automated calibration platform to rapidly collect large-scale multi-modal datasets, then trained a cascaded machine learning architecture to sequentially predict contact force and 3D shape from optical sensor readings.
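
As a rough illustration of what a cascade of this kind can look like, the sketch below chains two estimators: the first maps optical signals to contact force, and the second maps the same signals plus the first stage's force estimate to shape. Everything in it is an assumption for illustration only: the stage order, the k-nearest-neighbour regressor (swapped in for whatever models the paper uses), the `toy_sensor` forward model, and all names and value ranges are hypothetical and do not come from the paper.

```python
import math
import random


def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training rows nearest to `query`."""
    ranked = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = ranked[:k]
    n_out = len(train_y[0])
    return [sum(train_y[i][j] for i in nearest) / len(nearest) for j in range(n_out)]


class CascadedEstimator:
    """Stage 1 maps optical signals to force; stage 2 maps the signals plus
    the stage-1 force estimate to shape (curvature, orientation)."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, signals, forces, shapes):
        self.signals = [list(s) for s in signals]
        self.forces = [[f] for f in forces]
        # Stage 2 trains on signals augmented with the measured force;
        # at inference time the stage-1 prediction takes its place.
        self.augmented = [list(s) + [f] for s, f in zip(signals, forces)]
        self.shapes = [list(s) for s in shapes]
        return self

    def predict(self, signal):
        force = knn_predict(self.signals, self.forces, signal, self.k)[0]
        shape = knn_predict(self.augmented, self.shapes, list(signal) + [force], self.k)
        return force, shape


def toy_sensor(curvature, orientation, force):
    """Hypothetical forward model: three channels that mix shape and force."""
    return [
        curvature * math.cos(orientation) + 0.2 * force,
        curvature * math.sin(orientation) + 0.2 * force,
        force + 0.1 * curvature,
    ]


def make_dataset(n, seed=0):
    """Synthetic stand-in for data from an automated calibration platform."""
    rng = random.Random(seed)
    signals, forces, shapes = [], [], []
    for _ in range(n):
        curvature = rng.uniform(1.0, 10.0)
        orientation = rng.uniform(0.0, 2.0 * math.pi)
        force = rng.uniform(0.0, 2.0)
        signals.append(toy_sensor(curvature, orientation, force))
        forces.append(force)
        shapes.append([curvature, orientation])
    return signals, forces, shapes


signals, forces, shapes = make_dataset(300)
model = CascadedEstimator(k=3).fit(signals, forces, shapes)
force_hat, (curvature_hat, orientation_hat) = model.predict(toy_sensor(5.0, 1.0, 1.0))
print(f"force={force_hat:.2f} N, curvature={curvature_hat:.2f}, orientation={orientation_hat:.2f} rad")
```

Feeding the force estimate into the second stage lets the shape estimator account for signal changes caused by contact rather than bending. That is one plausible motivation for a cascade, though the paper may justify its design differently.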

Key results

Automated calibration platform generating large multi-modal datasets
Cascaded ML model achieving 4.7% curvature error, 2.37% orientation error, and 5.5% force tracking error
High-accuracy contact localization across 16 indenters (8 reaching >80% accuracy)
Conical indenters with silicone domes improving low-force sensitivity and repeatability
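
The summary does not say how the percentage errors are normalized or how localization accuracy is scored. The sketch below shows one common convention for each, purely as an assumption: mean absolute error expressed as a percentage of the range of the ground-truth values, and per-indenter accuracy as the fraction of contacts on an indenter that are assigned to that same indenter. The function names are hypothetical.

```python
def percent_error(true, pred):
    """Mean absolute error as a percentage of the range of the true values
    (an assumed normalization, not confirmed by the source)."""
    span = max(true) - min(true)
    mae = sum(abs(t - p) for t, p in zip(true, pred)) / len(true)
    return 100.0 * mae / span


def per_indenter_accuracy(true_ids, pred_ids, n_indenters=16):
    """Fraction of contacts on each indenter assigned to that same indenter."""
    hits = [0] * n_indenters
    counts = [0] * n_indenters
    for t, p in zip(true_ids, pred_ids):
        counts[t] += 1
        hits[t] += int(t == p)
    # None marks an indenter that received no test contacts.
    return [h / c if c else None for h, c in zip(hits, counts)]


def count_above(accuracies, threshold=0.8):
    """Number of indenters whose accuracy exceeds the threshold."""
    return sum(1 for a in accuracies if a is not None and a > threshold)


# Tiny hand-checkable example with two indenters.
print(percent_error([0.0, 10.0], [1.0, 9.0]))  # 10.0
acc = per_indenter_accuracy([0, 0, 1, 1], [0, 1, 1, 1], n_indenters=2)
print(acc, count_above(acc))  # [0.5, 1.0] 1
```

Under this convention, a statement such as "8 indenters reaching over 80% accuracy" corresponds to `count_above(acc, 0.8)` returning 8 for a 16-element accuracy list.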

Why it matters

Real-time shape and contact-force feedback from the sleeve could help clinicians navigate the colon more safely and reduce the risk of tissue damage during colonoscopy.

Abstract

Soft optical sensors hold potential for enhancing minimally invasive procedures like colonoscopy, yet their complex, multi-modal responses pose significant challenges. This work introduces a machine learning (ML) framework for real-time estimation of 3D shape and contact force in a soft robotic sleeve for colonoscopy. To overcome limitations of manual calibration and collect large datasets for ML, we developed an automated platform for collecting data across a range of orientations, curvatures, and contact forces. A cascaded ML architecture was implemented for sequential estimation of contact force and 3D shape, enabling an accuracy with errors of 4.7% for curvature, 2.37% for orientation, and 5.5% for force tracking. We also explored the potential of ML for contact localization by training a model to estimate contact intensity and location across 16 indenters distributed along the sleeve. The force intensity was estimated with an error ranging from 0.06 N to 0.31 N throughout the indenters. Despite the proximity of the contact points, the system achieved high localization performances, with 8 indenters reaching over 80% accuracy, demonstrating promising spatial resolution.

Index terms

Soft Sensors and Actuators; Force and Tactile Sensing; Modeling, Control, and Learning for Soft Robots