Hebbian Attractor Networks for Robot Locomotion
Alexander Dittrich, Fuda van Diggelen, Dario Floreano
AI summary
Problem
Traditional robot controllers use static weights, which makes them brittle in dynamic real-world environments. Biological systems adapt continuously via Hebbian plasticity, but how to harness this mechanism for stable, adaptive control without runaway dynamics remains poorly understood.
Approach
The authors introduce Hebbian Attractor Networks (HANs), which apply max-normalized Hebbian updates at a slower frequency than the control loop and use temporal averaging of pre- and postsynaptic activations to induce structured weight dynamics.
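The update scheme described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the plain Hebbian product rule, normalization by the largest absolute weight, the tanh activation, the class name HebbianLayer, and the parameters eta and update_every are all hypothetical choices made for this example.

```python
import numpy as np


class HebbianLayer:
    """One plastic layer. The update rule and all names are illustrative assumptions."""

    def __init__(self, n_in, n_out, eta=0.1, update_every=10, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.uniform(-1.0, 1.0, size=(n_out, n_in))
        self.eta = eta                      # assumed Hebbian learning rate
        self.update_every = update_every    # control steps per Hebbian update
        self.pre_sum = np.zeros(n_in)       # running sum of presynaptic activations
        self.post_sum = np.zeros(n_out)     # running sum of postsynaptic activations
        self.steps = 0                      # control steps since the last update

    def step(self, x):
        """Run one control step; update the weights only every `update_every` steps."""
        x = np.asarray(x, dtype=float)
        y = np.tanh(self.w @ x)
        self.pre_sum += x
        self.post_sum += y
        self.steps += 1
        if self.steps == self.update_every:
            self._hebbian_update()
        return y

    def _hebbian_update(self):
        # Temporal averages over the window since the last update.
        pre_avg = self.pre_sum / self.steps
        post_avg = self.post_sum / self.steps
        # Assumed plain Hebbian rule: strengthen weights between co-active units.
        self.w += self.eta * np.outer(post_avg, pre_avg)
        # Assumed max-normalization: rescale so the largest |weight| equals 1.
        max_abs = np.max(np.abs(self.w))
        if max_abs > 0:
            self.w /= max_abs
        # Start a fresh averaging window.
        self.pre_sum[:] = 0.0
        self.post_sum[:] = 0.0
        self.steps = 0


# Usage: drive the layer with a synthetic periodic observation.
layer = HebbianLayer(n_in=4, n_out=2, update_every=10)
for t in range(100):
    obs = np.sin(0.1 * t + np.arange(4))
    action = layer.step(obs)
```

Setting update_every=1 makes the weights change on every control step, while larger values slow the plasticity and lengthen the averaging window at the same time. In this sketch the two are tied to a single parameter for brevity; they could equally be exposed as separate settings.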
Key results
- Max-normalized plasticity induces stable fixed-point or oscillatory weight attractors
- Slower update frequencies with longer averaging windows promote weight convergence
- HANs outperform static evolved controllers and match gradient-based RL on standard benchmarks
- Framework generalizes to high-dimensional quadrupedal locomotion with robust online perturbation recovery
Why it matters
The work provides a bio-inspired, gradient-free framework for continuous self-modification in embodied AI, with the potential to improve the robustness of real-world robots operating in unpredictable environments.
Abstract
Biological neural networks continuously adapt and modify themselves in response to experiences throughout their lifetime, a capability largely absent in artificial neural networks. Hebbian plasticity offers a promising path toward rapid adaptation in changing environments. Here, we introduce Hebbian Attractor Networks (HANs), a class of plastic neural networks in which local weight update normalization induces emergent attractor dynamics. Unlike prior approaches, HANs employ dual-timescale plasticity and temporal averaging of pre- and postsynaptic activations to induce either co-dynamic limit cycles or fixed-point weight attractors. Using simulated locomotion benchmarks, we gain insight into how Hebbian update frequency and activation averaging influence weight dynamics and control performance. Our results show that slower updates, combined with averaged pre- and postsynaptic activations, promote convergence to stable weight configurations, while faster updates yield oscillatory co-dynamic systems. We further demonstrate that these findings generalize to high-dimensional quadrupedal locomotion with a simulated Unitree Go1 robot. These results highlight how the timing of plasticity shapes neural dynamics in embodied systems, providing a principled characterization of the attractor regimes that emerge in self-modifying networks.