ICRA 2026

ManeuverNet: A Soft Actor-Critic Framework for Precise Maneuvering of Double-Ackermann-Steering Robots with Optimized Reward Functions

Kohio Deflesselle, Mélodie Daniel, Aly Magassouba, Miguel Aranda, Olivier Ly

AI summary

ManeuverNet enables robust, precise maneuvering for double-Ackermann robots using tailored deep reinforcement learning and custom reward functions, significantly outperforming classical planners and standard DRL baselines.

Double-Ackermann steering, Deep reinforcement learning, Soft Actor-Critic, Reward shaping, Autonomous maneuvering, Non-holonomic robots

Problem

Classical planners like TEB are highly sensitive to parameter tuning and robot dynamics, while standard deep reinforcement learning methods fail on double-Ackermann robots because conventional reward functions penalize the temporary goal deviations required for complex maneuvers.

Approach

ManeuverNet combines a Soft Actor-Critic algorithm with CrossQ and introduces four custom reward functions that prioritize lateral error and accommodate necessary goal deviations, enabling end-to-end learning without expert demonstrations or handcrafted guidance.
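To make the idea of prioritizing lateral error concrete, here is a minimal Python sketch. It is not one of the paper's four reward functions, which this summary does not specify. The function names, the weights `w_lat`, `w_lon`, `w_head`, and the pose layout `(x, y, theta)` are all assumptions chosen for illustration. The sketch expresses the pose error in the goal frame and penalizes the lateral component more heavily than the longitudinal one, so that a policy is not strongly punished for temporarily driving away from the goal while it reduces its sideways offset.

```python
import math


def goal_frame_errors(x, y, theta, gx, gy, gtheta):
    """Express the robot's pose error in the goal frame (hypothetical helper)."""
    dx, dy = x - gx, y - gy
    # Component of the position error along the goal heading.
    longitudinal = math.cos(gtheta) * dx + math.sin(gtheta) * dy
    # Component perpendicular to the goal heading; a non-holonomic robot
    # cannot remove this by sliding sideways and has to maneuver instead.
    lateral = -math.sin(gtheta) * dx + math.cos(gtheta) * dy
    # Heading error wrapped to [-pi, pi).
    heading = (theta - gtheta + math.pi) % (2.0 * math.pi) - math.pi
    return longitudinal, lateral, heading


def shaped_reward(pose, goal, w_lat=1.0, w_lon=0.2, w_head=0.5):
    """Illustrative reward: the weights are assumptions, not values from the paper."""
    lon, lat, head = goal_frame_errors(*pose, *goal)
    # Lateral error dominates the penalty; longitudinal error is cheap,
    # so backing away from the goal to realign costs relatively little.
    return -(w_lat * abs(lat) + w_lon * abs(lon) + w_head * abs(head))
```

With the goal at the origin facing along the x-axis, a robot offset one meter sideways receives a lower reward under these weights than one offset one meter along the goal heading, which is the ordering a lateral-error-first reward is meant to produce.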

Key results

>40% improvement in maneuver success rates over DRL baselines
Mitigates parameter sensitivity of classical TEB planners
Up to 90% increase in real-world trajectory efficiency
Zero-shot transfer to diverse real-world terrains without fine-tuning

Why it matters

Provides a robust, deployable control solution for heavy double-Ackermann robots in constrained environments like agriculture, where precise maneuvering is critical but traditionally difficult to automate.

Abstract

Autonomous control of double-Ackermann-steering robots is essential in agricultural applications, where robots must execute precise and complex maneuvers within a limited space. Classical methods, such as the Timed Elastic Band (TEB) planner, can address this problem, but they rely on parameter tuning, making them highly sensitive to changes in robot configuration or environment and impractical to deploy without constant recalibration. At the same time, end-to-end deep reinforcement learning (DRL) methods often fail due to unsuitable reward functions for non-holonomic constraints, resulting in sub-optimal policies and poor generalization. To address these challenges, this paper presents ManeuverNet, a DRL framework tailored for double-Ackermann systems, combining Soft Actor-Critic with CrossQ. Furthermore, ManeuverNet introduces four specifically designed reward functions to support maneuver learning. Unlike prior work, ManeuverNet does not depend on expert data or handcrafted guidance. We extensively evaluate ManeuverNet against both state-of-the-art DRL baselines and the TEB planner. Experimental results demonstrate that our framework substantially improves maneuverability and success rates, achieving more than a 40% gain over DRL baselines. Moreover, ManeuverNet effectively mitigates the strong parameter sensitivity observed in the TEB planner. In real-world trials, ManeuverNet achieved up to a 90% increase in maneuvering trajectory efficiency, highlighting its robustness and practical applicability.

Index terms

Reinforcement Learning, Machine Learning for Robot Control, Motion and Path Planning