ICRA 2026

Multi-Task Reinforcement Learning of Drone Aerobatics by Exploiting Geometric Symmetries

Zhanyu Guo, Zikang Yin, Guobin Zhu, Shiliang Guo, and Shiyu Zhao∗

AI summary

Explicitly encoding rotational symmetry into a multi-task reinforcement learning policy enables data-efficient, robust, and unified control of diverse drone aerobatic maneuvers.

Multi-task reinforcement learning, Geometric symmetry, Drone aerobatics, Equivariant networks, Autonomous MAV control, Data-efficient learning

Problem

Conventional reinforcement learning methods struggle with low data efficiency and poor generalization when training a single policy to master multiple aggressive drone maneuvers without relying on manually designed waypoint sequences.

Approach

The authors propose GEAR, a unified multi-task reinforcement learning framework that embeds the inherent rotational symmetry of drone physics directly into its neural network architecture. By combining a symmetry-aware policy backbone with flexible task-specific modulation and separate value estimators, the model efficiently learns to execute diverse maneuvers from a single policy.
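To make two of the named components concrete, here is a minimal sketch of FiLM-style task modulation and a multi-head critic in plain Python. It is an illustration under stated assumptions, not the GEAR implementation: the layer sizes, the random weights, the per-task (gamma, beta) table, and the names film, MultiHeadCritic, and film_params are all hypothetical. The sketch also does not address how modulation is kept compatible with the symmetry-aware backbone.

```python
import random

def linear(x, weights, bias):
    """Dense layer: weights holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def film(features, gamma, beta):
    """FiLM: scale and shift each feature channel with task-dependent gamma and beta."""
    return [g * f + b for g, f, b in zip(gamma, features, beta)]

def rand_matrix(rows, cols, rng):
    """Random weights standing in for trained parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

class MultiHeadCritic:
    """Shared trunk followed by one scalar value head per task (assumed layout)."""

    def __init__(self, obs_dim, hidden_dim, num_tasks, rng):
        self.trunk_w = rand_matrix(hidden_dim, obs_dim, rng)
        self.trunk_b = [0.0] * hidden_dim
        self.heads = [(rand_matrix(1, hidden_dim, rng), [0.0]) for _ in range(num_tasks)]

    def value(self, obs, task_id):
        hidden = [max(0.0, h) for h in linear(obs, self.trunk_w, self.trunk_b)]  # ReLU
        head_w, head_b = self.heads[task_id]  # only the selected task's head is evaluated
        return linear(hidden, head_w, head_b)[0]

rng = random.Random(0)
NUM_TASKS, HIDDEN = 3, 4

# Assumption: one (gamma, beta) pair per task. A real system would typically
# generate these from a task embedding with a small network.
film_params = [
    ([rng.uniform(0.5, 1.5) for _ in range(HIDDEN)],
     [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)])
    for _ in range(NUM_TASKS)
]

features = [0.2, -1.0, 0.7, 0.0]  # stand-in for shared backbone features
modulated = [film(features, *film_params[t]) for t in range(NUM_TASKS)]

critic = MultiHeadCritic(obs_dim=4, hidden_dim=HIDDEN, num_tasks=NUM_TASKS, rng=rng)
values = [critic.value(features, t) for t in range(NUM_TASKS)]
```

The same backbone features yield a different modulated vector for each task, and each task reads its value estimate from its own head. That is the general idea behind sharing one policy while keeping task-specific value scales separate.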

Key results

Achieves 98.85% success rate across diverse aerobatic tasks in simulation
Outperforms baseline RL methods by 9.53% in final training return
Successfully deploys a single unified policy on physical MAVs for real-world maneuvers
Enables composition of basic flight primitives to execute complex aerobatics like Power Loops and Multi-Flips

Why it matters

This approach provides a scalable, data-efficient pathway for training agile, multi-maneuver drone controllers, advancing autonomous aerial robotics for applications like racing, search-and-rescue, and freestyle flight.

Abstract

Flight control for autonomous micro aerial vehicles (MAVs) is evolving from steady flight near equilibrium points toward more aggressive aerobatic maneuvers, such as flips, rolls, and Power Loops. Although reinforcement learning (RL) has shown great potential in these tasks, conventional RL methods often suffer from low data efficiency and limited generalization. This challenge becomes more pronounced in multi-task scenarios where a single policy is required to master multiple maneuvers. In this paper, we propose a novel end-to-end multi-task reinforcement learning framework, called GEAR (Geometric Equivariant Aerobatics Reinforcement), which fully exploits the inherent SO(2) rotational symmetry in MAV dynamics and explicitly incorporates this property into the policy network architecture. By integrating an equivariant actor network, FiLM-based task modulation, and a multi-head critic, GEAR achieves both efficiency and flexibility in learning diverse aerobatic maneuvers, enabling a data-efficient, robust, and unified framework for aerobatic control. GEAR attains a 98.85% success rate across various aerobatic tasks, significantly outperforming baseline methods. In real-world experiments, GEAR demonstrates stable execution of multiple maneuvers and the capability to combine basic motion primitives to complete complex aerobatics.
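The SO(2) equivariance property mentioned in the abstract can be illustrated numerically: a policy pi is equivariant if rotating its inputs by an angle theta rotates its output by the same angle, i.e. pi(R s) = R pi(s). The sketch below enforces this by canonicalization (rotate into a reference frame, apply an arbitrary map, rotate back). This is a swapped-in technique chosen for brevity, not the equivariant network layers used in GEAR. The state (position error and velocity), the action (a 3-vector), the choice of the z-axis as the symmetry axis, and the names base_policy and equivariant_policy are all assumptions made for illustration.

```python
import math

def rot_z(vec, angle):
    """Rotate a 3-vector about the z-axis (an SO(2) action on the x-y plane)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = vec
    return (c * x - s * y, s * x + c * y, z)

def base_policy(pos_err, vel):
    """Arbitrary non-equivariant map standing in for a learned network."""
    px, py, pz = pos_err
    vx, vy, vz = vel
    return (
        math.tanh(px + 0.3 * vy),
        math.tanh(0.5 * vx * py - vy),
        math.tanh(pz - vz + px * px),
    )

def equivariant_policy(pos_err, vel):
    """Wrap base_policy so that rotating the inputs rotates the output identically.

    Assumes the horizontal part of pos_err is nonzero so the heading is defined.
    """
    yaw = math.atan2(pos_err[1], pos_err[0])  # heading of the horizontal position error
    canon_pos = rot_z(pos_err, -yaw)          # rotate inputs into the canonical frame
    canon_vel = rot_z(vel, -yaw)
    return rot_z(base_policy(canon_pos, canon_vel), yaw)  # rotate the action back

pos_err, vel, theta = (1.0, 2.0, -0.5), (0.3, -0.4, 0.1), 1.1

# Equivariance check: pi(R s) versus R pi(s) for the wrapped policy.
lhs = equivariant_policy(rot_z(pos_err, theta), rot_z(vel, theta))
rhs = rot_z(equivariant_policy(pos_err, vel), theta)
equivariance_gap = max(abs(a - b) for a, b in zip(lhs, rhs))

# Same check for the unwrapped map, which has no reason to satisfy it.
baseline_gap = max(abs(a - b) for a, b in zip(
    base_policy(rot_z(pos_err, theta), rot_z(vel, theta)),
    rot_z(base_policy(pos_err, vel), theta),
))
```

The wrapped policy's gap sits at floating-point noise, while the unwrapped map's gap is large. A policy with the symmetry built in does not have to relearn the same maneuver for every heading, which is one plausible reading of the data-efficiency claim.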

Index terms

Aerial Systems: Mechanics and Control; Aerial Systems: Applications; Reinforcement Learning