ICRA 2026

Diverse Skill Discovery in Fourier Latent Space Via Unsupervised Learning

Ruopeng Cui, Yucong Sun, Xizhou Bu, Wang Chao, Wei Li

AI summary

FLSD leverages Fourier latent features to measure skill diversity, enabling quadruped robots to autonomously discover smoother and more diverse locomotion gaits without task-specific rewards.
Unsupervised skill discovery, Fourier latent space, Quadruped locomotion, Periodic autoencoder, Reinforcement learning, Motion diversity

Problem

Existing unsupervised skill discovery methods measure diversity using single-step states, which ignores trajectory phase coherence, disrupts motion smoothness, and limits the discovery of transitional behaviors.

Approach

FLSD employs a Periodic Autoencoder to map robot motion sequences into a Fourier latent space, using phase-aware features to measure diversity and guide a mutual-information-based reward for training a versatile locomotion policy.
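As a rough illustration of what "Fourier latent features" can look like, the sketch below summarises one latent channel over a time window by its frequency, amplitude, offset, and phase, using an FFT. It is not the paper's implementation: the function name `fourier_features`, the choice of a power-weighted mean frequency, and reading the phase from the dominant FFT bin are all assumptions made for this example, and the learned encoder that would produce `latent` is omitted.

```python
import numpy as np


def fourier_features(latent, dt):
    """Summarise one latent channel over a window (hypothetical helper).

    latent: 1-D array of samples from a single latent channel.
    dt:     sampling interval in seconds.
    Returns a dict with frequency (Hz), amplitude, offset, and phase (rad).
    """
    latent = np.asarray(latent, dtype=float)
    n = latent.size
    spectrum = np.fft.rfft(latent)        # complex coefficients, bins 0..n//2
    freqs = np.fft.rfftfreq(n, d=dt)      # frequency of each bin in Hz
    power = np.abs(spectrum[1:]) ** 2     # per-bin power, DC bin excluded
    total = power.sum()

    offset = float(spectrum[0].real / n)  # window mean (DC component)
    if total < 1e-12:                     # (near-)constant signal: no oscillation
        return {"frequency": 0.0, "amplitude": 0.0, "offset": offset, "phase": 0.0}

    # Power-weighted mean frequency over the non-DC bins.
    frequency = float(np.sum(freqs[1:] * power) / total)
    # Scaled so that a pure sinusoid a*cos(...) yields amplitude a.
    amplitude = float(2.0 / n * np.sqrt(total))
    # Assumption: take the phase from the strongest non-DC bin.
    dominant = 1 + int(np.argmax(power))
    phase = float(np.angle(spectrum[dominant]))
    return {"frequency": frequency, "amplitude": amplitude,
            "offset": offset, "phase": phase}
```

For a one-second window sampled at 100 Hz containing 0.3 + 2 cos(2π·5t + 0.5), this returns approximately frequency 5, amplitude 2, offset 0.3, and phase 0.5. Features of this kind describe a whole window of motion rather than a single state, which is the property the approach relies on.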

Key results

  • Reduces high-frequency power, a proxy for motion jitter, by 73% relative to the baseline in simulation
  • Increases state space coverage by 133% relative to the baseline in simulation
  • Discovers varied gaits including three-legged locomotion
  • A high-level controller that orchestrates the learned skills improves training efficiency, and real-world experiments show reliable task execution
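The two simulation metrics in the list above can be made concrete roughly as follows. This summary does not give the paper's exact definitions, so everything here is an assumption for illustration: the helper names, the cutoff frequency `cutoff_hz`, and the grid-cell notion of coverage. Under these definitions, a 73% reduction means the new value is 0.27 times the baseline, and a 133% increase means it is 2.33 times the baseline.

```python
import numpy as np


def high_frequency_power(signal, dt, cutoff_hz):
    """Total spectral power above cutoff_hz for a 1-D signal (assumed metric)."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal - signal.mean())   # remove the mean first
    freqs = np.fft.rfftfreq(signal.size, d=dt)
    power = np.abs(spectrum) ** 2
    return float(power[freqs > cutoff_hz].sum())


def state_coverage(states, low, high, bins):
    """Number of distinct grid cells visited (assumed coverage measure).

    states: array of shape (num_samples, state_dim).
    low, high: per-dimension bounds of the region being gridded.
    bins: number of cells per dimension.
    """
    states = np.asarray(states, dtype=float)
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    scaled = (states - low) / (high - low)
    cells = np.clip(np.floor(scaled * bins).astype(int), 0, bins - 1)
    return len(np.unique(cells, axis=0))


def relative_change(new, baseline):
    """Signed fractional change of new with respect to baseline."""
    return (new - baseline) / baseline
```

With these helpers, `relative_change(0.27, 1.0)` is about -0.73 and `relative_change(2.33, 1.0)` is about 1.33, matching how the percentages above are read.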

Why it matters

It could alleviate labor-intensive reward engineering and reduce reliance on costly task-specific data, providing a scalable framework for autonomous locomotion skill acquisition in complex robotic systems.

Abstract

Unsupervised skill discovery acquires a diverse repertoire of skills through intrinsic motivation, offering the potential to alleviate the labor-intensive reward engineering in reinforcement learning and the reliance on costly task-specific data in imitation learning. However, such methods typically measure diversity based on single-step states, neglecting the trajectory phase coherence, whose absence disrupts the smoothness of state transitions. In this work, we explore skills in Fourier latent space via a simple mutual-information-based reward function, aiming to train a single versatile policy capable of executing diverse state transition patterns. Specifically, we utilize a spatio-temporal representation learned through a Periodic Autoencoder, which effectively captures the periodic or quasi-periodic nature of motion. These features, rather than raw states, are used to measure skill diversity. We validate our method on the 12-DOF quadruped robot Unitree A1, achieving varied gaits. Simulation results show that our method reduces high-frequency power by 73%, while improving state space coverage by 133% compared to the baseline. To accomplish specific tasks, we trained a high-level controller to orchestrate the learned skills, which improves training efficiency. Real-world experiments demonstrate that the learned skills can reliably execute tasks.
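The abstract does not spell out the reward, so the following is only one common way to build a mutual-information-based skill reward, not necessarily the one used in the paper. It assumes a discrete set of K skills with a uniform prior p(z) = 1/K and a hypothetical discriminator that outputs logits over skills given the Fourier latent features f. The reward log q(z | f) - log p(z) comes from the standard variational lower bound I(Z; F) ≥ E[log q(z | f) - log p(z)].

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    return shifted - np.log(np.sum(np.exp(shifted)))


def skill_reward(discriminator_logits, skill_index):
    """Intrinsic reward log q(z | f) - log p(z) for the active skill.

    discriminator_logits: length-K logits from a hypothetical discriminator
                          that predicts the skill from Fourier latent features.
    skill_index:          index of the skill the policy was conditioned on.
    Assumes a uniform prior over the K skills.
    """
    logits = np.asarray(discriminator_logits, dtype=float)
    num_skills = logits.size
    log_q = log_softmax(logits)[skill_index]
    log_p = -np.log(num_skills)
    return float(log_q - log_p)
```

The reward is zero when the discriminator cannot tell skills apart, positive when the features identify the active skill better than chance, and negative otherwise, so the policy is pushed toward skills whose features are distinguishable from one another.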

Index terms

AI-Enabled Robotics, Reinforcement Learning, Deep Learning Methods
