ICRA 2026

Grounding Language Models with Semantic Digital Twins for Robotic Planning

Mehreen Naeem, Andrew Melnik, Michael Beetz

AI summary

Integrating Semantic Digital Twins with LLMs enables robots to ground natural language instructions in real-time environmental context, achieving reliable task completion with built-in error recovery.

Semantic Digital Twins, LLM, Robotics, Task Planning, Failure Recovery, Embodied AI, Action Triplets

Problem

LLM-based robotic planners often hallucinate, lack grounding in physical constraints, and struggle to adapt to execution failures in dynamic environments.

Approach

The framework uses a Semantic Digital Twin to provide real-time object affordances and interaction rules, grounding LLM-generated action triplets and enabling context-aware failure resolution and iterative replanning.
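The grounding step can be illustrated with a minimal sketch. It is not the authors' implementation: the triplet layout (action, object, optional target), the `SDT_AFFORDANCES` table, and the `ground` function are all hypothetical names chosen for illustration, and a real SDT would expose far richer, live-updated information than a static dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class ActionTriplet:
    """Assumed triplet layout: (action, object, optional target location)."""
    action: str
    obj: str
    target: Optional[str] = None


# Hypothetical snapshot of what an SDT might report: object -> afforded actions.
SDT_AFFORDANCES: Dict[str, Set[str]] = {
    "mug": {"pick_up", "place"},
    "fridge": {"open", "close"},
    "countertop": set(),
}


def ground(triplet: ActionTriplet, affordances: Dict[str, Set[str]]) -> List[str]:
    """Return grounding errors; an empty list means the triplet is consistent
    with the objects and affordances in the snapshot."""
    errors: List[str] = []
    if triplet.obj not in affordances:
        errors.append(f"unknown object: {triplet.obj}")
    elif triplet.action not in affordances[triplet.obj]:
        errors.append(f"{triplet.obj} does not afford {triplet.action}")
    if triplet.target is not None and triplet.target not in affordances:
        errors.append(f"unknown target: {triplet.target}")
    return errors
```

In this sketch, a triplet that names a missing object or an unsupported action is rejected before execution, and the error strings are the kind of feedback that could be handed back to the LLM for correction.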

Key results

100% task success rate on the evaluated ALFRED household tasks with SDT integration
Substantially fewer failures and replanning iterations than the baseline
Context-aware failure resolver corrects object selection and affordance errors
Training-free, real-time adaptation to dynamic environmental changes

Why it matters

Enables reliable, adaptive robotic execution in complex environments without relying on external training or static scene graphs.

Abstract

We introduce a novel framework that integrates Semantic Digital Twins (SDTs) with Large Language Models (LLMs) to enable adaptive and goal-driven robotic task execution in dynamic environments. The system decomposes natural language instructions into structured action triplets, which are grounded in contextual environmental data provided by the SDT. This semantic grounding allows the robot to interpret object affordances and interaction rules, enabling action planning and real-time adaptability. In case of execution failures, the LLM utilizes error feedback and SDT insights to generate recovery strategies and iteratively revise the action plan. We evaluate our approach using tasks from the ALFRED benchmark, demonstrating robust performance across various household scenarios. The proposed framework effectively combines high-level reasoning with semantic environment understanding, achieving reliable task completion in the face of uncertainty and failure.
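The execute, fail, and revise cycle described in the abstract can be sketched as a small loop. This is an illustrative sketch under stated assumptions, not the paper's system: `run_with_recovery`, `make_toy_world`, and `toy_replan` are hypothetical names, the world is a one-variable toy, and `toy_replan` is a hard-coded stand-in for the LLM call that would normally receive the error feedback and SDT context.

```python
from typing import Callable, List, Optional, Tuple

Triplet = Tuple[str, str, str]  # assumed layout: (action, object, location)


def run_with_recovery(
    plan: List[Triplet],
    execute: Callable[[Triplet], Optional[str]],
    replan: Callable[[List[Triplet], str], List[Triplet]],
    max_replans: int = 3,
) -> Tuple[bool, List[Triplet], int]:
    """Execute steps in order; on failure, ask `replan` to revise the
    remaining plan. Returns (success, executed steps, replans used)."""
    remaining = list(plan)
    executed: List[Triplet] = []
    replans = 0
    while remaining:
        error = execute(remaining[0])  # None signals success
        if error is None:
            executed.append(remaining.pop(0))
            continue
        if replans == max_replans:
            return False, executed, replans
        remaining = replan(remaining, error)
        replans += 1
    return True, executed, replans


def make_toy_world() -> Callable[[Triplet], Optional[str]]:
    """Toy executor: picking from the fridge fails until it has been opened."""
    state = {"fridge_open": False}

    def execute(step: Triplet) -> Optional[str]:
        action, obj, location = step
        if action == "open" and obj == "fridge":
            state["fridge_open"] = True
            return None
        if action == "pick_up" and location == "fridge" and not state["fridge_open"]:
            return "fridge is closed"
        return None

    return execute


def toy_replan(remaining: List[Triplet], error: str) -> List[Triplet]:
    """Hard-coded stand-in for an LLM that revises the plan from error feedback."""
    if error == "fridge is closed":
        return [("open", "fridge", "")] + remaining
    return remaining
```

Running `run_with_recovery([("pick_up", "apple", "fridge"), ("place", "apple", "countertop")], make_toy_world(), toy_replan)` in this toy setting fails once on the closed fridge, inserts an `open` step, and then completes the plan. The `max_replans` cap is a design choice of the sketch so that an unrecoverable error ends the loop instead of retrying indefinitely.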

Index terms

Failure Detection and Recovery, Task Planning