← Back ICRA 2024

Effective Representation Learning Is More Effective in Reinforcement Learning Than You Think

Jiawei Zheng, Yonghong Song

PDF

Abstract

In reinforcement learning (RL), learning directly from pixels, is commonly known as vision-based RL. Effective state representations are crucial for high performance in vision- based RL. However, in order to learn effective state repre- sentations, most current vision-based RL methods based on contrastive unsupervised learning use auxiliary tasks similar to those in computer vision, which does not guarantee the effective information interaction between representation learning and RL. To learn more efficient states, we propose a simple and effective vision-based RL method. It leverages the represen- tations acquired through contrastive learning by the Teacher Encoder and the Student Encoder to collaboratively estimate the Q-function. This cooperative process utilizes the TD error to steer updates to the Teacher Encoder, thereby ensuring effective information exchange between representation learning and RL. We refer to this approach as Reinforcement Learning with Teacher-Student Collaboration (RLTSC). RLTSC incorporates recent advancements in contrastive unsupervised learning, en- dowing it with potent representation learning capabilities. It provides a robust estimate of the Q-function with minimal variance and effectively guides the Teacher Ecoder to update and acquire a more efficient representation. RLTSC substan- tially enhances data efficiency in vision-based RL, surpassing state-of-the-art methods on various continuous and discrete control benchmarks. Remarkably, RLTSC even outperforms RL methods based on physical state features in terms of data efficiency for continuous control benchmarks. This may enlighten us: effective representation learning is more effective in reinforcement learning than you think!

Index terms

Reinforcement Learning Representation Learning Model Learning for Control