ICRA 2026

Automatic Physically-Based Sim2Real for Tactile Images through Differentiable Path-Tracing Rendering

Guillaume Duret, Anna Samsonenko, Florence ZARA, Jan Peters, Liming Chen

AI summary

A fully differentiable path-tracing pipeline automatically optimizes camera pose, lighting, and texture from just three real images to bridge the sim-to-real gap for visual tactile sensors.

Differentiable rendering, Sim2Real, Tactile sensing, Path tracing, NOCS, Robotic manipulation

Problem

High-fidelity simulation of vision-based tactile sensors suffers from a persistent sim-to-real gap due to unmodeled optical effects like glass refraction and the need for manual tuning of physical parameters such as camera pose and lighting.

Approach

The authors develop a fully differentiable rendering pipeline that optimizes critical simulation parameters directly from minimal real-world images, and pair it with a fast image-to-image translation model trained on NOCS maps to enable rapid, high-fidelity inference.
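The optimization idea can be illustrated with a deliberately tiny stand-in. The sketch below is not the authors' pipeline and uses no path tracer: it assumes a toy shading model (pixel = intensity * cosine term + ambient) and recovers two hypothetical lighting parameters from three observed pixel values by gradient descent on a mean squared error. The names `render` and `fit_lighting`, the learning rate, and the step count are all illustrative assumptions.

```python
import math


def render(intensity, ambient, shading):
    """Toy differentiable 'renderer': Lambertian term times light intensity, plus ambient."""
    return [intensity * s + ambient for s in shading]


def fit_lighting(target, shading, steps=1000, lr=0.2):
    """Recover (intensity, ambient) by gradient descent on the mean squared pixel error."""
    intensity, ambient = 1.0, 0.0  # arbitrary initial guess
    n = len(target)
    for _ in range(steps):
        pred = render(intensity, ambient, shading)
        residual = [p - t for p, t in zip(pred, target)]
        # Analytic gradients of mean((pred - target)^2) w.r.t. each parameter.
        grad_intensity = 2.0 / n * sum(r * s for r, s in zip(residual, shading))
        grad_ambient = 2.0 / n * sum(residual)
        intensity -= lr * grad_intensity
        ambient -= lr * grad_ambient
    return intensity, ambient


# Cosine shading for three surface points whose normals make 90, 60 and 0 degrees
# with the light direction (assumed known in this toy setup).
shading = [max(0.0, math.cos(math.radians(a))) for a in (90.0, 60.0, 0.0)]

# Stand-in "real" observations, generated here with known ground-truth parameters.
true_intensity, true_ambient = 0.8, 0.1
target = render(true_intensity, true_ambient, shading)

est_intensity, est_ambient = fit_lighting(target, shading)
print(est_intensity, est_ambient)
```

In a real differentiable renderer the gradients would come from automatic differentiation through light transport rather than from hand-written formulas, and the parameter set would include pose and texture as well as lighting.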

Key results

First fully differentiable rendering pipeline for visual tactile sensors
State-of-the-art sim-to-real accuracy on multi-axis deformation benchmarks
Novel inverse rendering application for single-image mesh reconstruction
Near real-time inference via NOCS-based image-to-image translation

Why it matters

Provides a scalable, automated solution for generating photorealistic tactile data, accelerating the development of data-driven robotic manipulation and tactile perception algorithms.

Abstract

High-fidelity simulation of vision-based tactile sensors is essential for developing data-driven robotic manipulation algorithms. However, a significant sim-to-real gap persists due to the difficulty of modeling complex optical effects, such as refraction through protective glass layers, and of accurately estimating physical parameters like sensor pose and lighting. To bridge this gap, we introduce a novel, fully differentiable pipeline for visual tactile simulation. Leveraging a differentiable path tracer, our method optimizes critical parameters, including camera pose, lighting conditions, and object texture, directly from just three real images. This approach achieves highly realistic simulations with physically accurate light transport and glass refraction. We validate our method through a comprehensive benchmark against real-world data, demonstrating state-of-the-art sim-to-real accuracy. We also enable novel applications, such as mesh reconstruction from a single tactile image via inverse rendering. To overcome the computational cost of path tracing, we further use an image-to-image translation model. This model takes high-fidelity simulated data alongside Normalized Object Coordinate Space (NOCS) maps as input, preserving crucial deformation information while enabling rapid inference. The code is available at https://tacdiffrend.github.io/

Index terms

Force and Tactile Sensing; Simulation and Animation; Performance Evaluation and Benchmarking