ICRA 2026

Learning-Based Robust Control: Unifying Exploration and Distributional Robustness for Reliable Robotics Via Free Energy

Hozefa Jesawada, Giovanni Russo, Abdalla Swikir, Fares Abu-Dakka

AI summary

Integrating a distributionally robust free energy principle into Maximum Diffusion RL yields explicit robustness guarantees and enables zero-shot sim-to-real robotic manipulation without fine-tuning.
Distributional Robustness, Free Energy Principle, Maximum Diffusion RL, Sim-to-Real Transfer, Robotic Control, Epistemic Uncertainty

Problem

Learning-based robotic policies frequently fail in real-world deployments because of epistemic uncertainty in dynamics and rewards, yet existing methods do not provide explicit a priori robustness guarantees while also maintaining effective exploration.

Approach

The authors modify the Maximum Diffusion RL (MaxDiff) framework by embedding a distributionally robust free energy principle. The resulting objective drives exploration through a maximally diffusive prior and, at the same time, enforces explicit robustness bounds against model misspecification through per-state-action KL ambiguity sets.
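To make the idea of a KL ambiguity set concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a discrete nominal distribution P over outcomes with known costs and computes the worst-case expected cost over all distributions Q with KL(Q || P) <= eta, using the standard dual form inf over alpha > 0 of alpha * eta + alpha * log E_P[exp(cost / alpha)]. The function name `worst_case_expected_cost`, its parameters, and the search range for alpha are illustrative assumptions; the paper's free-energy objective and per-state-action construction are not reproduced here.

```python
import math


def worst_case_expected_cost(probs, costs, eta, iters=200):
    """Worst-case E_Q[cost] over all Q with KL(Q || P) <= eta, for discrete P.

    Illustrative helper (hypothetical name). Evaluates the dual
    inf_{alpha > 0} alpha * eta + alpha * log E_P[exp(cost / alpha)].
    """
    support = [(p, c) for p, c in zip(probs, costs) if p > 0]
    c_max = max(c for _, c in support)

    def dual(alpha):
        # Shift by c_max so every exponent is <= 0 (numerically stable).
        s = sum(p * math.exp((c - c_max) / alpha) for p, c in support)
        return alpha * eta + c_max + alpha * math.log(s)

    # The dual is convex in alpha, hence unimodal in log(alpha);
    # golden-section search over an assumed range alpha in [1e-6, 1e6].
    lo, hi = math.log(1e-6), math.log(1e6)
    inv_phi = (math.sqrt(5) - 1) / 2
    for _ in range(iters):
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if dual(math.exp(a)) < dual(math.exp(b)):
            hi = b
        else:
            lo = a
    return dual(math.exp((lo + hi) / 2))


# Nominal model: cost 0 or 1 with equal probability.
nominal_probs = [0.5, 0.5]
outcome_costs = [0.0, 1.0]
for radius in (0.0, 0.1, 5.0):
    value = worst_case_expected_cost(nominal_probs, outcome_costs, radius)
    print(f"eta={radius}: worst-case expected cost = {value:.4f}")
```

With radius 0 the bound reduces to the nominal expected cost, and it grows toward the largest cost as the radius increases, which is the sense in which a larger ambiguity set buys a more conservative guarantee.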

Key results

  • Outperforms standard MaxDiff baselines on continuous control benchmarks
  • Provides explicit a priori robustness guarantees against dynamics and cost perturbations
  • Enables zero-shot sim-to-real deployment on a Franka Research 3 arm
  • Narrows the sim-to-real gap by aligning control with epistemic risk

Why it matters

It offers a theoretically grounded, deployable control framework that bridges robust control theory and learning-based robotics for reliable real-world automation.

Abstract

A key challenge towards reliable robotic control is devising computational models that can both learn policies and guarantee robustness when deployed in the field. To address this challenge, and inspired by the free energy principle in computational neuroscience, we propose a model for policy computation that jointly learns environment dynamics and rewards while ensuring robustness to epistemic uncertainties. Expounding a distributionally robust free energy principle, we propose a modification to the maximum diffusion learning framework. After explicitly characterizing the robustness of our policies to epistemic uncertainties in both environment and reward, we validate their effectiveness on continuous-control benchmarks, via both simulations and real-world experiments involving manipulation with a Franka Research 3 arm. Across simulation and zero-shot deployment, our approach narrows the sim-to-real gap and enables repeatable tabletop manipulation without task-specific fine-tuning.

Index terms

Reinforcement Learning, Robust/Adaptive Control, Probabilistic Inference
