An LLM-Assisted Compiler for Generating Standard-Compliant Driving Scenarios from Natural Language
Majid Jegarian, Amir K. Esfahani, Katharina Bause, Tobias Düser
Abstract
This paper introduces a Scenario Compiler that converts textual concrete scenarios into executable XML files compliant with simulation standards. It serves as the final module in a structured framework for automated driving scenario generation, transforming natural language descriptions into simulation-ready XML files. The challenge addressed here is that input texts frequently omit explicit values for certain scenario parameters, which makes it difficult to generate fully specified XML through rule-based methods alone. The proposed approach therefore combines schema-guided parsing with contextual inference using a fine-tuned large language model (LLM). In the first phase, the parser generates a complete, schema-compliant XML structure, fills in all directly extractable values, and inserts placeholders for missing or ambiguous information. In the second phase, the fine-tuned LLM infers and fills in these missing values by analyzing the broader scenario context, so that the final output is both complete and plausible. An evaluation against non-specialized LLMs shows that the Scenario Compiler produces significantly more correctly instantiated XML elements while avoiding invalid tags and schema violations. By combining rule-based schema compliance with LLM-based reasoning, the approach automates the scenario generation process and reduces manual effort in simulation-based validation workflows.
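The two-phase idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the element and attribute names (`Scenario`, `Parameters`, `egoSpeedKmh`, and so on), the regex-based "schema", the `__MISSING__` marker, and the `stub_infer` function standing in for the fine-tuned LLM are all assumptions made for illustration, and the XML layout does not follow any particular simulation standard.

```python
import re
import xml.etree.ElementTree as ET
from typing import Callable, Dict

# Hypothetical marker for values the input text does not state explicitly.
PLACEHOLDER = "__MISSING__"

# Hypothetical mini-schema: required attribute -> regex that extracts it.
SCHEMA: Dict[str, str] = {
    "egoSpeedKmh": r"ego vehicle drives at (\d+) km/h",
    "leadSpeedKmh": r"lead vehicle drives at (\d+) km/h",
    "initialGapM": r"gap of (\d+) m",
}


def parse_to_skeleton(text: str) -> ET.Element:
    """Phase 1: emit every required attribute, using a placeholder if absent."""
    scenario = ET.Element("Scenario")
    params = ET.SubElement(scenario, "Parameters")
    for name, pattern in SCHEMA.items():
        match = re.search(pattern, text)
        params.set(name, match.group(1) if match else PLACEHOLDER)
    return scenario


def fill_placeholders(
    root: ET.Element, text: str, infer: Callable[[str, str], str]
) -> ET.Element:
    """Phase 2: replace each placeholder with a value from `infer` (in place)."""
    for element in root.iter():
        # Copy the items so the attribute dict is not mutated while iterating.
        for name, value in list(element.attrib.items()):
            if value == PLACEHOLDER:
                element.set(name, infer(name, text))
    return root


def stub_infer(name: str, text: str) -> str:
    """Stand-in for the fine-tuned LLM: returns fixed illustrative defaults."""
    defaults = {"egoSpeedKmh": "50", "leadSpeedKmh": "40", "initialGapM": "30"}
    return defaults[name]


if __name__ == "__main__":
    description = (
        "The ego vehicle drives at 80 km/h and follows a slower lead vehicle."
    )
    skeleton = parse_to_skeleton(description)
    completed = fill_placeholders(skeleton, description, stub_infer)
    print(ET.tostring(completed, encoding="unicode"))
```

In this sketch the structure is fixed entirely by the rule-based first phase, so the inference step can only change attribute values and cannot introduce new tags. This mirrors, in a simplified way, the division of labor described above between schema compliance and contextual inference.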