Distributional Reinforcement Learning with Sample-Set Bellman Update
Weijian Zhang, Jianshu Wang, Yu Yang
Abstract
Distributional reinforcement learning (DRL) aims not only to optimize expected returns but also to characterize the full distribution of those returns, which is key to risk-aware decision-making. Previous DRL implementations often treat statistical estimates as if they were concrete samples, which undermines the integrity of learning. Several studies have addressed this issue, but their remedies frequently introduce new complications, including added computational burden and diminished stochastic behavior. We present a novel DRL framework that uses a Gaussian mixture model to represent the return distribution. This approach allows genuine samples to be drawn from the modeled distribution, which is critical for robust learning, while preserving computational tractability. In an extensive evaluation on 59 Atari games, our method outperforms prior DRL algorithms and is competitive with contemporary top-tier RL algorithms.
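To make the idea of sampling from a Gaussian mixture concrete, the following is a minimal illustrative sketch, not the method described in this paper. It assumes a one-dimensional mixture with hand-picked parameters, and the helper names `sample_gmm` and `bellman_target_samples` are hypothetical. It draws return samples for a next state and applies the standard sample-wise backup r + γz' to each of them.

```python
import numpy as np

def sample_gmm(weights, means, stds, n, rng):
    """Draw n samples from a 1-D Gaussian mixture (illustrative helper)."""
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    # Pick a mixture component for each sample, then sample from that Gaussian.
    comps = rng.choice(len(weights), size=n, p=weights / weights.sum())
    return rng.normal(means[comps], stds[comps])

def bellman_target_samples(reward, gamma, next_samples, done=False):
    """Apply the backup r + gamma * z' to every sample (assumed form)."""
    next_samples = np.asarray(next_samples, dtype=float)
    # For terminal transitions the bootstrap term is dropped.
    return reward + gamma * (0.0 if done else 1.0) * next_samples

rng = np.random.default_rng(0)
# Hypothetical mixture parameters for the next-state return distribution.
z_next = sample_gmm([0.3, 0.7], [-1.0, 2.0], [0.5, 1.0], n=10000, rng=rng)
targets = bellman_target_samples(reward=1.0, gamma=0.99, next_samples=z_next)
```

Because every element of `targets` is a transformed draw from the mixture, the set can be treated as a set of genuine samples of the target return rather than as summary statistics.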