ICRA 2026

High-Altitude Balloon Station-Keeping with First Order Model Predictive Control

Myles Pasetsky, Jiawei Lin, Bradley Guo, Sarah Dean

AI summary

First-order model predictive control (FOMPC) outperforms a state-of-the-art reinforcement learning policy by 24% in time within the station-keeping radius, with no offline training, showing that gradient-based planning is viable for high-altitude balloons.

high-altitude balloons, model predictive control, gradient optimization, station-keeping, reinforcement learning baseline, differentiable simulation

Problem

Prior high-altitude balloon station-keeping research relies heavily on model-free reinforcement learning and dismisses model-based approaches as impractical because of uncertain wind forecasts and complex dynamics. RL controllers are therefore often compared only against hand-crafted heuristics, leaving no rigorous model-based baseline.

Approach

The authors implement balloon and wind dynamics as differentiable functions in JAX, enabling gradient-based trajectory optimization for online receding-horizon planning.
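The idea of gradient-based receding-horizon planning through a differentiable simulator can be sketched as follows. This is a minimal toy illustration, not the authors' implementation: the wind field, the dynamics, the cost, and every name and constant (`wind`, `step`, `rollout_cost`, `plan`, `mpc_step`, `DT`, `HORIZON`, the step size) are assumptions made up for the example.

```python
import jax
import jax.numpy as jnp

DT = 1.0       # time step (hypothetical units)
HORIZON = 20   # planning horizon in steps (hypothetical)

def wind(z):
    # Toy wind field (assumption): horizontal wind direction rotates with altitude z.
    return 0.1 * jnp.array([jnp.cos(z), jnp.sin(z)])

def step(state, u):
    # state = [x, y, z]; u is an unconstrained altitude command.
    # tanh bounds the climb rate so the balloon is only vertically actuated.
    xy, z = state[:2], state[2]
    z_next = z + DT * 0.3 * jnp.tanh(u)
    xy_next = xy + DT * wind(z)
    return jnp.concatenate([xy_next, z_next[None]])

def rollout_cost(controls, state0):
    # Mean squared horizontal distance to the station (the origin) over the horizon.
    def body(state, u):
        nxt = step(state, u)
        return nxt, jnp.sum(nxt[:2] ** 2)
    _, costs = jax.lax.scan(body, state0, controls)
    return jnp.mean(costs)

# First-order information: the cost and its gradient w.r.t. the control sequence.
cost_and_grad = jax.jit(jax.value_and_grad(rollout_cost))

def plan(state0, controls, iters=50, lr=0.5):
    # Plain gradient descent on the control sequence; keep the best iterate seen.
    best, best_cost = controls, jnp.inf
    for _ in range(iters):
        cost, grad = cost_and_grad(controls, state0)
        if cost < best_cost:
            best, best_cost = controls, cost
        controls = controls - lr * grad
    return best

def mpc_step(state, controls):
    # Receding horizon: re-plan, apply only the first control, warm-start the rest.
    controls = plan(state, controls)
    next_state = step(state, controls[0])
    warm_start = jnp.concatenate([controls[1:], controls[-1:]])
    return next_state, warm_start

state = jnp.array([1.0, 0.5, 0.3])
controls = jnp.zeros(HORIZON)
state, controls = mpc_step(state, controls)
```

Calling `mpc_step` in a loop gives the online controller: each call optimizes the whole control sequence but commits only to its first element. Keeping the best iterate in `plan` is a design choice for this sketch, since fixed-step gradient descent on a nonconvex cost is not guaranteed to decrease monotonically.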

Key results

24% improvement in time-within-radius (TWR) over the Perciatelli44 RL policy
Gains an additional 1.8 hours per day within the station-keeping radius
Open-sources a fully differentiable JAX implementation of high-altitude balloon dynamics
Demonstrates online planning effectiveness across simplified wind and dynamics models via ablation studies
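To make the time-within-radius metric concrete, here is a minimal sketch of how it could be computed from a logged trajectory. It assumes TWR is the fraction of time steps at which the balloon's horizontal distance from the station is at most the radius, with the station at the origin; the function names and the sample positions and radius are hypothetical and not taken from the paper.

```python
import jax.numpy as jnp

def time_within_radius(xy, radius):
    # xy: (T, 2) horizontal positions relative to the station at the origin.
    # Returns the fraction of time steps with distance <= radius.
    dist = jnp.linalg.norm(xy, axis=-1)
    return jnp.mean(dist <= radius)

def extra_hours_per_day(twr_new, twr_base):
    # Converts a difference in TWR fractions into hours per 24-hour day.
    return 24.0 * (twr_new - twr_base)

# Hypothetical trajectory (km) and radius (km), for illustration only.
xy = jnp.array([[0.0, 10.0], [30.0, 30.0], [60.0, 0.0], [0.0, 80.0]])
twr = time_within_radius(xy, radius=50.0)
```

Under this reading of the metric, an extra 1.8 hours per day inside the radius corresponds to 1.8 / 24 = 0.075, or 7.5 percentage points of TWR.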

Why it matters

Establishes a necessary model-based baseline for validating future RL controllers and guides practical deployment of autonomous high-altitude balloons for atmospheric research.

Abstract

High-altitude balloons (HABs) are common in scientific research due to their wide range of applications and low cost. Because of their nonlinear, underactuated dynamics and the partial observability of wind fields, prior work has largely relied on model-free reinforcement learning (RL) methods to design near-optimal control schemes for station-keeping. These methods often compare only against hand-crafted heuristics, dismissing model-based approaches as impractical given the system complexity and uncertain wind forecasts. We revisit this assumption about the efficacy of model-based control for station-keeping by developing First-Order Model Predictive Control (FOMPC). By implementing the wind and balloon dynamics as differentiable functions in JAX, we enable gradient-based trajectory optimization for online planning. FOMPC outperforms a state-of-the-art RL policy, achieving a 24% improvement in time-within-radius (TWR) without requiring offline training, though at the cost of greater online computation per control step. Through systematic ablations of modeling assumptions and control factors, we show that online planning is effective across many configurations, including under simplified wind and dynamics models.

Index terms

Aerial Systems: Mechanics and Control; Planning under Uncertainty