The Trajectory Bundle Method: Unifying Sequential-Convex Programming and Sampling-Based Trajectory Optimization
Kevin Tracy, John Zhang, Jon Arrizabalaga, Stefan Schaal, Tom Erez, Yuval Tassa, Zachary Manchester
AI summary
Problem
Sequential convex programming for trajectory optimization relies on differentiable models, so it breaks down when derivatives are unavailable or expensive, as with learned simulators and contact-rich systems. Sampling-based methods such as MPPI avoid derivatives but struggle with constraints and open-loop instability.
Approach
The trajectory bundle method (TBM) replaces Taylor-series linearizations with linear interpolation over bundles of sampled trajectories. It iteratively solves convex approximations of the original problem, which lets it handle arbitrary constraints and multiple shooting without requiring gradients.
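The interpolation idea can be illustrated on a single function. The sketch below is not taken from the paper: the function `f`, the box-shaped sampling region, and the helper names `sample_bundle` and `interpolate` are assumptions chosen for illustration. It evaluates a black-box function at points sampled around a nominal input, then approximates the function at any convex combination of those points by the same convex combination of the sampled values.

```python
import numpy as np

def sample_bundle(f, x0, unit_offsets, radius):
    """Evaluate f at x0 + radius * offset for each row of unit_offsets."""
    points = x0 + radius * unit_offsets
    values = np.array([f(p) for p in points])
    return points, values

def interpolate(points, values, alpha):
    """Blend sampled inputs and outputs with weights alpha (alpha >= 0, sum = 1)."""
    return alpha @ points, alpha @ values

def f(x):
    # Hypothetical stand-in for a black-box model; only its values are used.
    return np.array([x[0] ** 2 + x[1] ** 2, np.sin(x[0])])

rng = np.random.default_rng(0)
x0 = np.array([0.3, -0.5])
unit_offsets = rng.uniform(-1.0, 1.0, size=(8, 2))  # sample directions, reused for both radii
alpha = rng.dirichlet(np.ones(8))                   # one arbitrary point on the simplex

errors = []
for radius in (0.5, 0.05):
    points, values = sample_bundle(f, x0, unit_offsets, radius)
    x_mix, y_mix = interpolate(points, values, alpha)
    # Gap between the true value at the blended input and the blended samples.
    errors.append(float(np.linalg.norm(f(x_mix) - y_mix)))
```

For an affine function the blend is exact, so any error comes from curvature inside the sampled region. Shrinking the radius shrinks that error, which is one way to see why the interpolation is paired with a trust region.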
Key results
- A unified derivative-free framework for general trajectory optimization via sequential convex programming
- Linear interpolation of sampled dynamics, cost, and constraints within a trust region
- Theoretical proof that MPPI is a special case of TBM under single shooting with entropy regularization
- Numerical validation showing fast convergence and strict constraint satisfaction for nonlinear, non-convex problems
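The MPPI result in the list above can be made plausible with a standard fact about entropy regularization. The sketch below does not reproduce the paper's proof; the cost values, the temperature `lam`, and the function names are assumptions for illustration. It checks numerically that minimizing the interpolated cost plus a negative-entropy term over the simplex yields softmax weights of the form exp(-J_i / lam), which is the weighting that MPPI applies to its sampled rollouts.

```python
import numpy as np

def mppi_weights(costs, temperature):
    """Normalized exp(-J_i / temperature), shifted by the minimum cost for stability."""
    shifted = costs - costs.min()
    w = np.exp(-shifted / temperature)
    return w / w.sum()

def regularized_objective(alpha, costs, temperature):
    """Interpolated cost plus temperature times the negative entropy of alpha."""
    safe = np.clip(alpha, 1e-300, None)  # avoid log(0); 0 * log(tiny) evaluates to 0
    return float(alpha @ costs + temperature * np.sum(alpha * np.log(safe)))

rng = np.random.default_rng(1)
costs = rng.uniform(0.0, 5.0, size=16)  # hypothetical rollout costs J_i
lam = 0.7                               # hypothetical temperature

alpha_star = mppi_weights(costs, lam)
best = regularized_objective(alpha_star, costs, lam)

# Compare against many other points on the simplex; none should score lower.
trials = rng.dirichlet(np.ones(costs.size), size=1000)
gaps = np.array([regularized_objective(a, costs, lam) for a in trials]) - best
```

Every entry of `gaps` is non-negative up to rounding, because the regularized objective is strictly convex on the simplex and the softmax weights are its unique minimizer.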
Why it matters
Enables robust optimal control for complex robotic systems with black-box, learned, or contact-rich dynamics where gradient computation is infeasible.
Abstract
We present a unified framework for solving trajectory optimization problems in a derivative-free manner through the use of sequential convex programming. Traditionally, nonconvex optimization problems are solved by forming and solving a sequence of convex optimization problems, where the cost and constraint functions are approximated locally through Taylor series expansions. This presents a challenge for functions where differentiation is expensive or unavailable. In this work, we present a derivative-free approach to form these convex approximations by computing samples of the dynamics, cost, and constraint functions and letting the solver interpolate between them. Our framework includes sample-based trajectory optimization techniques like model-predictive path integral (MPPI) control as a special case and generalizes them to enable features like multiple shooting and general equality and inequality constraints that are traditionally associated with derivative-based sequential convex programming methods. The resulting framework is simple, flexible, and capable of solving a wide variety of practical motion planning and control problems.
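To make "letting the solver interpolate" concrete, here is a minimal sketch of a sampled sequential convex loop on a static toy problem. It is not the authors' implementation and omits dynamics entirely: the cost, the unit-disk constraint, the shrinking box-shaped trust region, and the name `bundle_step` are all assumptions for illustration. Each iteration evaluates cost and constraint at sampled points, then solves a linear program over simplex weights with SciPy's `linprog`, so no derivatives of either function are used.

```python
import numpy as np
from scipy.optimize import linprog

def cost(x):
    # Hypothetical black-box cost, queried by value only.
    return (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2

def constraint(x):
    # Hypothetical black-box inequality, feasible when <= 0 (unit disk).
    return x[0] ** 2 + x[1] ** 2 - 1.0

def bundle_step(x, radius, num_samples, rng):
    """Solve one sampled convex subproblem: an LP over simplex weights alpha."""
    offsets = rng.uniform(-radius, radius, size=(num_samples, x.size))
    points = np.vstack([x, x + offsets])  # keep the current iterate in the bundle
    c = np.array([cost(p) for p in points])
    g = np.array([constraint(p) for p in points])
    res = linprog(
        c,                                        # interpolated cost: sum_i alpha_i * c_i
        A_ub=g[None, :], b_ub=[0.0],              # interpolated constraint <= 0
        A_eq=np.ones((1, len(points))), b_eq=[1.0],  # weights sum to one
        bounds=(0.0, None),                       # weights are non-negative
    )
    if not res.success:
        return x  # keep the current iterate if the subproblem fails
    return res.x @ points

rng = np.random.default_rng(0)
x = np.zeros(2)
history = [cost(x)]
radius = 0.3
for _ in range(40):
    x = bundle_step(x, radius, 16, rng)
    history.append(cost(x))
    radius *= 0.95  # simple, assumed trust-region shrink schedule
```

Because both toy functions are convex, the interpolated values overestimate the true ones at the blended point, so each accepted step stays feasible and does not increase the cost. That guarantee is a property of this toy problem, not a claim about the general nonconvex setting the paper addresses.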