ICRA 2026

Temporal Transfer Learning for Traffic Optimization with Coarse-Grained Advisory Autonomy

Jung-Hoon Cho, Sirui Li, Jeongyun Kim, Cathy Wu

AI summary

Strategic temporal transfer learning enables deep reinforcement learning policies to reliably generalize across diverse advisory hold durations for human drivers.

Deep reinforcement learning, transfer learning, advisory autonomy, traffic optimization, coarse-grained control, mixed autonomy

Problem

Direct deep reinforcement learning fails to generalize across different advisory hold durations due to training brittleness, limiting the practical deployment of coarse-grained driving advisories for human drivers.

Approach

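The authors introduce Temporal Transfer Learning (TTL) algorithms that select the most suitable source tasks (traffic scenarios trained at specific hold durations) by exploiting the temporal structure shared across hold durations. Policies trained on those source tasks are then transferred zero-shot, without fine-tuning, to cover the full range of hold durations.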
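The source-selection idea can be illustrated with a minimal sketch. It is not the authors' implementation: the linear-decay performance model, the `slope` value, and the names `estimated_performance` and `greedy_select_sources` are all assumptions made for illustration. The sketch greedily adds the source hold duration that most improves the estimated best-case performance summed over all target hold durations.

```python
def estimated_performance(source, target, slope=0.02):
    """Hypothetical model (an assumption, not from the paper's text here):
    performance is 1.0 when source == target and decays linearly with the
    difference in hold duration, floored at 0."""
    return max(0.0, 1.0 - slope * abs(source - target))


def greedy_select_sources(candidates, targets, budget, slope=0.02):
    """Greedily pick up to `budget` source hold durations.

    Each target is assumed to be served by the best selected source, so the
    objective is the sum over targets of the best estimated performance.
    """
    selected = []
    best = {t: 0.0 for t in targets}  # best estimate per target so far

    for _ in range(budget):
        remaining = [s for s in candidates if s not in selected]
        if not remaining:
            break

        def total_if_added(s):
            # Objective value if source s were added to the selection.
            return sum(
                max(best[t], estimated_performance(s, t, slope)) for t in targets
            )

        choice = max(remaining, key=total_if_added)
        selected.append(choice)
        for t in targets:
            best[t] = max(best[t], estimated_performance(choice, t, slope))

    return selected, best


# Example: hold durations in seconds, spanning the 0.1 to 40 s range.
hold_durations = [0.1, 10, 20, 30, 40]
selected, best = greedy_select_sources(hold_durations, hold_durations, budget=2)
```

Under this assumed model the first source chosen is the mid-range hold duration, since it sits closest to all targets on average. A real transfer-performance estimate would replace `estimated_performance`.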

Key results

Proposes greedy and coarse-to-fine temporal transfer learning algorithms
Achieves reliable zero-shot generalization across hold durations from 0.1 to 40 seconds
Outperforms exhaustive and multitask reinforcement learning baselines in mixed-traffic simulations
Validates coarse-grained advisory autonomy as a viable near-term traffic optimization strategy

Why it matters

Offers a robust, data-efficient pathway to deploy real-time driving advisories that improve urban traffic flow without requiring full vehicle automation.

Abstract

The recent development of connected and automated vehicle (CAV) technologies has spurred investigations to optimize dense urban traffic, maximizing vehicle speed and throughput. This article explores advisory autonomy, in which real-time driving advisories are issued to human drivers, thus achieving near-term performance of automated vehicles. Due to the complexity of traffic systems, recent studies of coordinating CAVs have leveraged deep reinforcement learning (RL). Coarse-grained advisory is formalized as zero-order holds, and we consider a range of hold durations from 0.1 to 40 s. However, despite the similarity of the higher frequency tasks for CAVs, a direct application of deep RL fails to generalize to advisory autonomy tasks. To overcome this, we employ zero-shot transfer, training policies on a set of source tasks (specific traffic scenarios with designated hold durations) and then evaluating the efficacy of these policies on different target tasks. We introduce temporal transfer learning (TTL) algorithms to select source tasks for zero-shot transfer, systematically leveraging the temporal structure to solve the full range of tasks. TTL selects the most suitable source tasks to maximize the performance of the range of tasks. We validate our algorithms on diverse mixed-traffic scenarios, demonstrating that TTL more reliably solves the tasks than baselines. This article underscores the potential of coarse-grained advisory autonomy with TTL in traffic flow optimization.
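To make the zero-order hold formulation concrete, the following is a minimal sketch of how an advisory could be held constant between policy queries. The function name `rollout_with_zero_order_hold`, its parameters, and the toy `policy` and `step` callables are hypothetical and not taken from the paper; the sketch only assumes that the simulator advances in fixed steps of `sim_dt` seconds and that the policy is queried once per hold duration.

```python
def rollout_with_zero_order_hold(policy, step, initial_state,
                                 hold_duration, sim_dt, horizon):
    """Simulate a rollout in which each action is held for `hold_duration` s.

    policy(state) -> action         (hypothetical advisory policy)
    step(state, action, dt) -> state (hypothetical simulator step)
    """
    # Number of simulator steps for which one advisory stays fixed.
    steps_per_hold = max(1, round(hold_duration / sim_dt))
    n_steps = round(horizon / sim_dt)

    state = initial_state
    action = None
    applied_actions = []

    for k in range(n_steps):
        if k % steps_per_hold == 0:
            # Query the policy only at the start of each hold interval.
            action = policy(state)
        # Between queries the previous action is applied unchanged.
        state = step(state, action, sim_dt)
        applied_actions.append(action)

    return state, applied_actions


# Toy example: the "advisory" is just the current state value, and the
# simulator increments the state by one on every step.
final_state, actions = rollout_with_zero_order_hold(
    policy=lambda s: s,
    step=lambda s, a, dt: s + 1,
    initial_state=0,
    hold_duration=0.3,
    sim_dt=0.1,
    horizon=0.6,
)
# actions == [0, 0, 0, 3, 3, 3]
```

With a hold duration equal to `sim_dt` the wrapper reduces to ordinary per-step control, while longer holds give the coarse-grained advisories described in the abstract.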

Index terms

Intelligent Transportation Systems, Learning and Adaptive Systems, Deep Learning in Robotics and Automation, Transfer Learning