RPGD: A Small-Batch Parallel Gradient Descent Optimizer with Explorative Resampling for Nonlinear Model Predictive Control
Frederik Heetmeyer, Marcin Paluch, Diego Bolliger, Florian Bolli, Xiang Deng, Ennio Filicicchia, Tobi Delbruck
Abstract
Nonlinear model predictive control often involves nonconvex optimization for which real-time control systems require fast and numerically stable solutions. This work pro- poses RPGD, a Resampling Parallel Gradient Descent optimizer designed to exploit small-batch parallelism of modern hardware like neural accelerators or multithreaded microcontrollers. After initialization, it continuously maintains a small population of good control trajectory solution candidates and improves them using gradient information, followed by selection of elite candidates and resampling of the others. In simulation on a cartpole, the OpenAI Gym mountain car, a Dubins car with obstacles, and a high input dimensional 2D arm, it produces similar or lower MPC costs than benchmark cross-entropy and path integral methods. On a physical cartpole, it performs swing-up and cart target following of the pole, using either a differential equation or multilayer perceptron as dynamics model. RPGD drives an F1TENTH simulated race car at near- optimal lap times and a real F1TENTH car in laps around a cluttered room. We study alterations of RPGD’s building blocks to justify its composition. RPGD compute time in Python with TensorFlow optimization running on CPU is 2 to 4 times slower than the FORCESPRO commercial embedded solver.