ICRA 2026

Online Pareto-Optimal Decision-Making for Complex Tasks Using Active Inference

Peter Amorese, Shohei Wakayama, Nisar Ahmed, Morteza Lahijanian

AI summary

A novel active inference framework enables robots to learn multiple optimal trade-offs and align with user preferences in uncertain environments, outperforming existing multi-objective reinforcement learning methods in sample efficiency and preference adherence.

Multi-objective reinforcement learning; Active inference; Formal synthesis; Pareto optimality; Uncertain environments; Temporal logic

Problem

Robots operating in uncertain environments struggle to simultaneously learn optimal trade-offs between competing objectives, adhere to user preferences, and guarantee formal task completion. Current multi-objective reinforcement learning methods lack a rigorous way to balance exploring the full Pareto front with exploiting user-specified preferences.

Approach

The framework combines a formal task planner that guarantees temporal logic satisfaction with a high-level active inference selector that minimizes surprise relative to a user-defined preference distribution, iteratively updating a Bayesian model of unknown stochastic costs.
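The selection loop described above can be illustrated with a minimal sketch. It is not the paper's expected free energy approximation: it assumes independent Gaussian costs with known noise variance, a diagonal Gaussian preference, a prior centred on the preferred cost, and a score equal to the KL divergence of the predicted cost from the preference minus the expected information gain. All names (`CostBelief`, `select`, `run`) and the three plans with their cost values are hypothetical.

```python
import math
import random


def kl_diag(mu_p, var_p, mu_q, var_q):
    """KL(p || q) between Gaussians with diagonal covariance."""
    return 0.5 * sum(
        vp / vq + (mq - mp) ** 2 / vq - 1.0 + math.log(vq / vp)
        for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q)
    )


class CostBelief:
    """Independent Normal belief over one plan's mean cost per objective.

    Assumes Gaussian observation noise with known variance (an illustrative
    simplification, not the paper's learning model).
    """

    def __init__(self, prior_mean, prior_var, noise_var=1.0):
        self.mean = list(prior_mean)
        self.var = list(prior_var)
        self.noise_var = noise_var

    def predictive(self):
        # Distribution of the next observed cost vector.
        return self.mean, [v + self.noise_var for v in self.var]

    def expected_info_gain(self):
        # Mutual information between the next observation and the mean cost.
        return 0.5 * sum(math.log(1.0 + v / self.noise_var) for v in self.var)

    def update(self, observed):
        # Conjugate Normal-Normal update, one objective at a time.
        for i, y in enumerate(observed):
            gain = self.var[i] / (self.var[i] + self.noise_var)
            self.mean[i] += gain * (y - self.mean[i])
            self.var[i] *= 1.0 - gain


def select(beliefs, pref_mean, pref_var):
    """Index of the plan with the lowest risk-minus-information-gain score."""
    def score(b):
        mu, var = b.predictive()
        return kl_diag(mu, var, pref_mean, pref_var) - b.expected_info_gain()
    return min(range(len(beliefs)), key=lambda i: score(beliefs[i]))


def run(rounds=200, seed=0):
    rng = random.Random(seed)
    # Hypothetical true mean costs (time, energy) of three trade-off plans.
    true_costs = [(2.0, 8.0), (5.0, 5.0), (8.0, 2.0)]
    pref_mean, pref_var = (5.0, 5.0), (1.0, 1.0)
    # Assumption: the prior is centred on the preferred cost with variance 4.
    beliefs = [CostBelief(pref_mean, (4.0, 4.0)) for _ in true_costs]
    counts = [0] * len(true_costs)
    for _ in range(rounds):
        k = select(beliefs, pref_mean, pref_var)
        observed = [m + rng.gauss(0.0, 1.0) for m in true_costs[k]]
        beliefs[k].update(observed)
        counts[k] += 1
    return counts


counts = run()
print(counts)  # how often each plan was executed
```

In this toy setting the selector tries plans whose costs are still uncertain, then settles on the plan whose observed costs sit closest to the preferred cost vector.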

Key results

Tractable expected free energy approximation for finite-horizon planning under cost uncertainty
Superior sample efficiency in learning Pareto-optimal behaviors versus state-of-the-art methods
Successful hardware validation on robotic dishwashing and Mars exploration simulations
Dynamic user control over exploration-exploitation balance via preference covariance
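The last point, steering exploration versus exploitation through the preference covariance, can be illustrated with a one-dimensional sketch. It assumes scalar Gaussian beliefs and the same simplified score (KL divergence from the preference minus expected information gain) rather than the paper's formulation; the two plans and their numbers are hypothetical.

```python
import math


def kl_gauss(mu_p, var_p, mu_q, var_q):
    """KL(N(mu_p, var_p) || N(mu_q, var_q)) for scalar Gaussians."""
    return 0.5 * (
        var_p / var_q + (mu_q - mu_p) ** 2 / var_q - 1.0 + math.log(var_q / var_p)
    )


def score(post_mean, post_var, pref_mean, pref_var, noise_var=1.0):
    """Risk (KL of predicted cost from preference) minus expected information gain."""
    risk = kl_gauss(post_mean, post_var + noise_var, pref_mean, pref_var)
    gain = 0.5 * math.log(1.0 + post_var / noise_var)
    return risk - gain


# Hypothetical beliefs about two plans' costs: (posterior mean, posterior variance).
known = (5.0, 0.05)    # well-learned plan whose cost sits at the preferred value
unknown = (7.0, 4.0)   # barely sampled plan, estimated to be further away


def choose(pref_var, pref_mean=5.0):
    """Name of the plan with the lower score under the given preference variance."""
    scores = {
        "known": score(*known, pref_mean, pref_var),
        "unknown": score(*unknown, pref_mean, pref_var),
    }
    return min(scores, key=scores.get)


print(choose(pref_var=1.0))   # tight preference
print(choose(pref_var=25.0))  # broad preference
```

With the tight preference the well-learned plan is selected (exploitation). With the broad preference the mismatch of the uncertain plan costs little, so it is selected and its cost model gets refined (exploration).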

Why it matters

It provides a rigorous, sample-efficient pathway for deploying safe and preference-aligned autonomous robots in complex, uncertain real-world applications.

Abstract

When a robot autonomously performs a complex task, it frequently must balance competing objectives while maintaining safety. This becomes more difficult in uncertain environments with stochastic outcomes. Enhancing transparency in the robot’s behavior and aligning with user preferences are also crucial. This paper introduces a novel framework for multi-objective reinforcement learning that ensures safe task execution, optimizes trade-offs between objectives, and adheres to user preferences. The framework has two main layers: a multi-objective task planner and a high-level selector. The planning layer generates a set of optimal trade-off plans that guarantee satisfaction of a temporal logic task. The selector uses active inference to decide which generated plan best complies with user preferences and aids learning. Operating iteratively, the framework updates a parameterized learning model based on collected data. Case studies and benchmarks on both manipulation and mobile robots show that our framework outperforms other methods and (i) learns multiple optimal trade-offs, (ii) adheres to a user preference, and (iii) allows the user to adjust the balance between (i) and (ii).
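As a small illustration of what a "set of optimal trade-off plans" means, the sketch below filters candidate plans down to the non-dominated (Pareto-optimal) ones. It only shows the dominance test on hypothetical cost vectors where lower is better; the planner in the paper additionally synthesizes plans that satisfy a temporal logic task, which is not modeled here.

```python
def dominates(a, b):
    """True if cost vector a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(plans):
    """Keep the plans whose cost vectors are not dominated by any other plan."""
    return {
        name: cost
        for name, cost in plans.items()
        if not any(dominates(other, cost) for other in plans.values())
    }


# Hypothetical (time, energy) costs of five candidate plans.
plans = {
    "A": (2.0, 8.0),
    "B": (5.0, 5.0),
    "C": (8.0, 2.0),
    "D": (6.0, 7.0),  # worse than B in both objectives
    "E": (9.0, 9.0),  # worse than every other plan
}

front = pareto_front(plans)
print(sorted(front))  # names of the non-dominated plans
```

Plans A, B, and C each win on at least one objective against every alternative, so a selector would choose among them according to the user's preference.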

Index terms

Formal Methods in Robotics and Automation; Learning and Adaptive Systems; Probability and Statistical Methods; Active Inference