ICRA 2026

STAGE: STyle-Controllable Action GEneration for Personalized Autonomous Driving

Zihao Liu, Xing Liu, Yizhai Zhang, Panfeng Huang

AI summary

STAGE enables personalized autonomous driving by allowing users to input a continuous style value that dynamically controls driving aggressiveness and action patterns.

Autonomous driving, driving style, imitation learning, preference learning, style controllability, personalized control

Problem

Current autonomous driving systems lack personalized driving styles, often relying on discrete classifications or unstructured latent spaces that fail to align with human expectations or allow intuitive control.

Approach

STAGE combines imitation learning with a Transformer-based network and an action modality encoder, using preference learning to extract a continuous, monotonic style value that conditions action generation.
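As a rough illustration of how preference learning can yield a scalar style value that is ordered by aggressiveness, the sketch below uses a Bradley-Terry pairwise loss. This is an assumption: the summary does not say which preference model STAGE uses, and the names `log_sigmoid` and `preference_loss` are hypothetical. The loss is small when the segment labeled as more aggressive also receives the higher style value.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def preference_loss(style_a: float, style_b: float, a_more_aggressive: bool) -> float:
    """Bradley-Terry negative log-likelihood for one labeled pair.

    style_a, style_b: scalar style values predicted for two driving segments
    (hypothetical outputs of a style encoder; not the paper's actual API).
    a_more_aggressive: preference label for the pair.
    """
    diff = style_a - style_b
    return -log_sigmoid(diff if a_more_aggressive else -diff)


# A correctly ordered pair is penalized less than a wrongly ordered one.
good = preference_loss(2.0, -1.0, a_more_aggressive=True)
bad = preference_loss(-1.0, 2.0, a_more_aggressive=True)
print(good < bad)
```

Minimizing this loss over many labeled pairs pushes the predicted style values toward an ordering consistent with the labels, which is one way a continuous, monotonic scale could emerge.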

Key results

Action modality encoder extracts continuous style values and VAE latents from driving data
Aggressiveness scoring rules automate preference learning, reducing manual annotation
Real-time user input of style values dynamically controls driving aggressiveness and control commands
Style-controlled actions significantly align with human expectations while maintaining safety in typical road scenarios
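
The aggressiveness scoring rules mentioned in the key results are not spelled out in this summary. The sketch below shows one way such rules could label data pairs automatically. Everything in it is hypothetical: the features (`mean_speed`, `peak_abs_accel`, `min_headway`), the weights, and the `margin` threshold are illustrative assumptions, not the rules from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentStats:
    """Summary statistics for one driving segment (hypothetical features)."""
    mean_speed: float      # m/s
    peak_abs_accel: float  # m/s^2
    min_headway: float     # s, smallest time gap to the lead vehicle


def aggressiveness_score(seg: SegmentStats,
                         w_speed: float = 1.0,
                         w_accel: float = 1.0,
                         w_headway: float = 1.0) -> float:
    """Higher speed and acceleration raise the score; larger headway lowers it.

    The weights are assumed values chosen only for illustration.
    """
    return (w_speed * seg.mean_speed
            + w_accel * seg.peak_abs_accel
            - w_headway * seg.min_headway)


def compare_pair(a: SegmentStats, b: SegmentStats, margin: float = 0.5) -> int:
    """Return 1 if a is more aggressive, -1 if b is, 0 if too close to call."""
    diff = aggressiveness_score(a) - aggressiveness_score(b)
    if abs(diff) < margin:
        return 0
    return 1 if diff > 0 else -1


calm = SegmentStats(mean_speed=10.0, peak_abs_accel=1.0, min_headway=3.0)
sporty = SegmentStats(mean_speed=15.0, peak_abs_accel=3.0, min_headway=1.0)
print(compare_pair(sporty, calm))
```

Pairs that fall inside the margin can be skipped rather than labeled, so that only clear-cut comparisons feed the preference learning step.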

Why it matters

Enhances driver trust and comfort in autonomous vehicles by personalizing system behavior to match individual driving preferences.

Abstract

Driving style refers to the behavioral preferences that drivers maintain during driving, shaped by their diverse experiences, habits, and needs, and is typically reflected in varying levels of aggressiveness. If humans choose to use autonomous driving systems, they would expect the driving style of the systems to closely resemble their own habits. However, this is challenging for current industrial autonomous driving systems. To address this, we developed a style-controllable action generation method, STAGE, for driving tasks. Its training process is based on imitation learning, incorporating both style value and latent value action modality encoding. Preference learning is then used to identify the user’s driving style as a continuous, monotonic style value. To reduce the cost of human involvement in the preference training process, we also developed a set of rules to compare driving style in data pairs. Then, during inference, the user inputs the style value to control the generated action patterns, dynamically meeting the user’s expectations. Using the STAGE method, we verified that the style-controlled action generation results in several typical road scenarios significantly align with human expectations. Furthermore, through comparisons between the STAGE method and various other approaches, we reveal the unique functionalities of STAGE, including its style controllability, style continuity, driving style alignment capability, and driving safety.

Index terms

Autonomous Vehicle Navigation; Human Factors and Human-in-the-Loop; Imitation Learning