ICRA 2026

Scan, Materialize, Simulate: A Generalizable Framework for Physically Grounded Robot Planning

Amine Elhafsi, Daniel Morton, Marco Pavone

AI summary

Key figure (auto-extracted from paper)
SMS bridges 3D scene reconstruction, semantic material inference, and physics simulation to enable generalizable, physically grounded robot planning without task-specific retraining.
3D Gaussian Splatting; Physics Simulation; Robot Planning; Vision-Language Models; Scene Reconstruction; Generalizable Robotics

Problem

Autonomous robots struggle to anticipate the physical consequences of their actions in unstructured environments because existing methods rely on rigid assumptions, specialized algorithms, or learned policies that degrade out of distribution.

Approach

The framework scans scenes with 3D Gaussian Splatting, uses foundation models to segment objects and infer material properties, and leverages a physics engine to simulate and optimize actions before real-world deployment.
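
The "simulate and optimize actions" step can be illustrated with a toy sketch. This is not the authors' implementation: the friction table stands in for material properties that a vision-language model would infer, the one-dimensional sliding model stands in for a full physics engine, and the brute-force search over candidate speeds stands in for whatever optimizer the paper uses. The names FRICTION, simulate_slide, and plan_speed are hypothetical.

```python
G = 9.81  # gravitational acceleration, m/s^2

# Hypothetical friction coefficients; in the framework these would come from
# material inference rather than a hand-written table.
FRICTION = {"felt": 0.2, "wood": 0.4, "rubber": 0.8}


def simulate_slide(v0, mu, dt=0.001):
    """Roll out an object sliding under Coulomb friction.

    Returns the distance travelled (m) before the object stops,
    given an initial speed v0 (m/s) and friction coefficient mu.
    """
    x, v = 0.0, v0
    while v > 0.0:
        v_next = max(0.0, v - mu * G * dt)  # friction decelerates until rest
        x += 0.5 * (v + v_next) * dt        # trapezoidal position update
        v = v_next
    return x


def plan_speed(target, material, candidates):
    """Pick the candidate launch speed whose simulated stop is closest to target."""
    mu = FRICTION[material]
    return min(candidates, key=lambda v0: abs(simulate_slide(v0, mu) - target))


if __name__ == "__main__":
    speeds = [0.1 * i for i in range(1, 41)]  # 0.1 to 4.0 m/s
    for surface in ("felt", "rubber"):
        best = plan_speed(1.0, surface, speeds)
        print(surface, round(best, 1))
```

The point of the sketch is that the planner never learns the dynamics: changing the inferred material changes the simulated outcome, and the chosen action changes with it.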

Key results

  • Accurate 3D scene reconstruction with object-level semantic segmentation
  • Automated inference of physical material properties via vision-language models
  • Robust performance in simulated domain transfer and real-world experiments on a billiards-inspired manipulation task and a quadrotor landing scenario
  • Superior physical reasoning and generalizability over grid-search and learning-based baselines

Why it matters

It enables reliable, physics-aware planning in diverse, unstructured environments, advancing general-purpose autonomous systems for applications like disaster response and logistics.

Abstract

Autonomous robots must reason about the physical consequences of their actions to operate effectively in unstructured, real-world environments. We present Scan, Materialize, Simulate (SMS), a unified framework that combines 3D Gaussian Splatting for accurate scene reconstruction, visual foundation models for semantic segmentation, vision-language models for material property inference, and physics simulation for reliable prediction of action outcomes. By integrating these components, SMS enables generalizable physical reasoning and object-centric planning without the need to relearn foundational physical dynamics. We empirically validate SMS in a billiards-inspired manipulation task and a challenging quadrotor landing scenario, demonstrating robust performance on both simulated domain transfer and real-world experiments. Our results highlight the potential of bridging differentiable rendering for scene reconstruction, foundation models for semantic understanding, and physics-based simulation to achieve physically grounded robot planning across diverse settings. Our project page with additional materials can be found at https://sites.google.com/view/scan-materialize-simulate.

Index terms

Integrated Planning and Learning; AI-Enabled Robotics; Simulation and Animation
