ICRA 2026

LLM-Driven Corrective Robot Operation Code Generation with Static Text-Based Simulation

Wenhao Wang, Yi Rong, Yanyan Li, Long Jiao, Jiawei Yuan

AI summary

An LLM replaces dynamic simulators by statically simulating robot code execution to iteratively correct and generate reliable operation code.

LLM, Robot Code Generation, Static Simulation, Corrective Framework, UAV, Semantic Observation

Problem

Current LLM-driven robot code correction relies on costly physical experiments or custom simulators, limiting deployment due to high configuration effort and long execution times.

Approach

The authors configure an LLM to act as a static text-based simulator that interprets code actions, reasons over state transitions, and generates semantic observations to iteratively correct code mismatches.
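
The loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the LLM simulator and the LLM correction step are replaced by small deterministic stand-ins, and the action format, the `State` fields, the `ERROR` marker, and every function name here are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical action format: (name, argument), e.g. ("takeoff", 5.0).
Action = Tuple[str, float]


@dataclass
class State:
    x: float = 0.0
    altitude: float = 0.0
    airborne: bool = False


def toy_static_simulator(actions: List[Action]) -> List[str]:
    """Deterministic stand-in for the LLM simulator (an assumption).

    Steps through the actions without executing them on a robot or in a
    dynamic simulator, tracks state transitions, and returns one text
    observation per step. Stops at the first invalid step.
    """
    state = State()
    observations: List[str] = []
    for name, arg in actions:
        if name == "takeoff":
            state.airborne = True
            state.altitude = arg
            observations.append(f"takeoff: climbed to {state.altitude} m")
        elif name == "forward":
            if not state.airborne:
                observations.append("forward: ERROR vehicle is on the ground")
                break
            state.x += arg
            observations.append(f"forward: moved to x={state.x} m")
        elif name == "land":
            state.airborne = False
            state.altitude = 0.0
            observations.append("land: touched down")
        else:
            observations.append(f"{name}: ERROR unknown action")
            break
    return observations


def toy_corrector(actions: List[Action], observations: List[str]) -> List[Action]:
    """Stand-in for the LLM correction step: patches one known mistake."""
    if any("on the ground" in o for o in observations):
        return [("takeoff", 5.0)] + actions
    return actions


def corrective_loop(
    actions: List[Action],
    simulate: Callable[[List[Action]], List[str]],
    correct: Callable[[List[Action], List[str]], List[Action]],
    max_rounds: int = 3,
) -> Tuple[List[Action], List[str]]:
    """Simulate, inspect the observations, and correct until no error remains."""
    for _ in range(max_rounds):
        observations = simulate(actions)
        if not any("ERROR" in o for o in observations):
            return actions, observations
        actions = correct(actions, observations)
    return actions, simulate(actions)


# Usage: a plan that tries to fly forward before taking off.
plan = [("forward", 10.0), ("land", 0.0)]
fixed, trace = corrective_loop(plan, toy_static_simulator, toy_corrector)
print(fixed)
print(trace)
```

In the framework summarized here, both `simulate` and `correct` would be LLM calls, and a mismatch would be judged against the task description rather than a fixed `ERROR` string. The sketch only shows the control flow of correcting code from text observations without running it.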

Key results

Static text-based simulation achieves over 97.5% accuracy compared to AirSim and PX4-Gazebo
Corrective framework delivers over 85% success rate and 96.9% completeness on UAV tasks
Matches state-of-the-art performance without requiring dynamic code execution or custom simulators
Demonstrates cross-platform adaptability with high success rates on both UAVs and ground vehicles

Why it matters

Enables reliable, low-overhead robot code generation and correction for diverse platforms without complex simulation setups or physical testing.

Abstract

Recent advances in large language models (LLMs) have demonstrated their promising capability to generate robot operation code, enabling LLM-driven robots. To enhance the reliability of LLM-generated operation code, existing research has increasingly adopted corrective designs that use feedback from observations of code execution. However, code execution in these designs relies on either a physical experiment or a customized simulation environment, which limits deployment because of the high effort of configuring the environment and the potentially long execution time. In this paper, we explore the possibility of directly leveraging an LLM to enable static simulation of robot operation code, and then use it to design a new, reliable LLM-driven corrective framework for robot operation code generation. Our framework configures the LLM as a static simulator with enhanced capabilities that reliably simulates robot code execution by interpreting actions, reasoning over state transitions, analyzing execution outcomes, and generating semantic observations that accurately capture trajectory dynamics. To validate the performance of our framework, we performed experiments on various operation tasks for different robots, including UAVs and small ground vehicles. The experimental results demonstrate not only the high accuracy of our static text-based simulation but also the reliable code generation of our LLM-driven corrective framework, which achieves performance comparable to state-of-the-art research without relying on dynamic code execution in physical experiments or simulators.

Index terms

Task Planning, AI-Based Methods, AI-Enabled Robotics