ICRA 2026

CAIMAN: Causal Action Influence Detection for Sample-Efficient Loco-Manipulation

Yuanchen Yuan, Jin Cheng, Nuria Armengol Urpí, Stelian Coros

AI summary

CAIMAN enables legged robots to learn non-prehensile object pushing skills efficiently using a causal influence-based intrinsic reward, achieving superior sample efficiency and successful real-world transfer without fine-tuning.

Loco-manipulation, Reinforcement learning, Causal action influence, Intrinsic motivation, Sample efficiency, Sim-to-real transfer

Problem

Learning whole-body object pushing for legged robots typically requires complex planning, precise environmental models, or tedious task-specific reward shaping, making exploration in large state spaces difficult and sample-inefficient.

Approach

CAIMAN uses a hierarchical reinforcement learning framework where a high-level policy is guided by an intrinsic exploration bonus based on Causal Action Influence (CAI), computed from a hybrid model combining a simple kinematic prior with learned residual dynamics.
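To make the CAI bonus concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes the conditional-mutual-information definition of causal action influence from the causal influence detection literature: the expected KL divergence between the object's next-state distribution given a specific action and the same distribution averaged over actions. It also assumes a 2D object position, a fixed-variance Gaussian dynamics model, and uniformly sampled velocity commands. All function names, the contact-radius prior, and the numeric defaults are hypothetical.

```python
import math
import random


def kinematic_prior(obj, robot, action, dt=0.1, contact_radius=0.5):
    """Hypothetical prior: the object moves with the commanded velocity
    only while the robot is within contact range."""
    scale = dt if math.dist(obj, robot) < contact_radius else 0.0
    return (obj[0] + scale * action[0], obj[1] + scale * action[1])


def predict_mean(obj, robot, action, residual=None):
    """Hybrid prediction: kinematic prior plus an optional learned residual.
    `residual` stands in for a trained model (assumption); it returns (dx, dy)."""
    mean = kinematic_prior(obj, robot, action)
    if residual is not None:
        dx, dy = residual(obj, robot, action)
        mean = (mean[0] + dx, mean[1] + dy)
    return mean


def log_gauss(x, mean, sigma):
    """Log density of an isotropic 2D Gaussian with standard deviation sigma."""
    sq = (x[0] - mean[0]) ** 2 + (x[1] - mean[1]) ** 2
    return -sq / (2 * sigma**2) - 2 * math.log(sigma * math.sqrt(2 * math.pi))


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def cai_score(obj, robot, rng, n_actions=16, n_samples=32, sigma=0.05, residual=None):
    """Monte Carlo estimate of E_a[ KL( p(s'|s,a) || p(s'|s) ) ], where the
    marginal p(s'|s) is approximated by a mixture over the sampled actions."""
    actions = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n_actions)]
    means = [predict_mean(obj, robot, a, residual) for a in actions]
    total = 0.0
    for mean_i in means:
        for _ in range(n_samples):
            x = (rng.gauss(mean_i[0], sigma), rng.gauss(mean_i[1], sigma))
            log_cond = log_gauss(x, mean_i, sigma)
            log_marg = log_mean_exp([log_gauss(x, m, sigma) for m in means])
            total += log_cond - log_marg
    return total / (n_actions * n_samples)


rng = random.Random(0)
near = cai_score((0.0, 0.0), (0.2, 0.0), rng)  # robot within contact range
far = cai_score((0.0, 0.0), (3.0, 0.0), rng)   # robot far from the object
print(f"CAI near object: {near:.3f}, far from object: {far:.3f}")
```

In this toy setup the score is zero when the robot is too far away for any command to move the object and positive when the robot is in contact range, which is the property that makes such a score usable as an exploration bonus under sparse task rewards.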

Key results

Superior sample efficiency and success rates in sparse-reward pushing tasks
Accurate modeling of complex robot-object interactions via hybrid dynamics
Zero-shot sim-to-real transfer to a physical quadruped robot
General hierarchical framework for obstacle-aware object pushing

Why it matters

It provides a scalable reinforcement learning approach for legged robots to master complex physical interactions, advancing autonomous manipulation in unstructured environments.

Abstract

Enabling legged robots to perform non-prehensile loco-manipulation is crucial for enhancing their versatility. However, learning behaviors such as whole-body object pushing often necessitates sophisticated planning strategies or extensive task-specific reward shaping. In this work, we present CAIMAN, a practical reinforcement learning framework that encourages the agent to gain control over other entities in the environment. CAIMAN leverages causal action influence as an intrinsic motivation objective, allowing legged robots to efficiently acquire object pushing skills even under sparse task rewards. We employ a hierarchical control strategy, combining a low-level locomotion module with a high-level policy that generates task-relevant velocity commands and is trained to maximize the intrinsic reward. To estimate causal action influence, we learn the dynamics of the environment by integrating a kinematic prior with data collected during training. We empirically demonstrate CAIMAN’s superior sample efficiency and adaptability to diverse scenarios in simulation, as well as its successful transfer to real-world systems without further fine-tuning. A video demo is available at https://www.youtube.com/watch?v=dNyvT04Cqaw.

Index terms

Legged Robots, Mobile Manipulation, Reinforcement Learning