ICRA 2026

CLEVER: Stream-Based Active Learning for Robust Semantic Perception from Human Instructions

Jongseok Lee, Timo Birr, Rudolph Triebel, Tamim Asfour

AI summary

Robots can robustly perceive unseen objects, or objects whose appearance has shifted, by querying humans for instructions and adapting deep neural networks online.
stream-based active learning, Bayesian neural networks, robotic perception, online adaptation, human-robot interaction, distribution shift

Problem

DNN-based semantic perception fails under real-world distribution shifts, such as encountering unseen or deformed objects. Existing systems lack efficient uncertainty-aware mechanisms to adapt online using human guidance.

Approach

CLEVER uses stream-based active learning with a Bayesian neural network framework to estimate uncertainty, query humans when unsure, and rapidly update only relevant model parameters based on human instructions.
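The loop described above (estimate uncertainty, query when unsure, update on the answer) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a small ensemble of linear softmax heads as a stand-in for a Bayesian neural network, predictive entropy as the uncertainty score, and a fixed query threshold. `TinyEnsemble`, `run_stream`, `oracle`, and all parameter values are hypothetical names and settings chosen for illustration, and the learned priors of the paper's Bayesian formulation are not modeled.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predictive_entropy(member_probs):
    """Entropy (in nats) of the class distribution averaged over ensemble members."""
    n_classes = len(member_probs[0])
    mean = [sum(p[c] for p in member_probs) / len(member_probs)
            for c in range(n_classes)]
    return -sum(p * math.log(p) for p in mean if p > 0.0)


class TinyEnsemble:
    """Assumed stand-in for a BNN: each member acts as one posterior sample
    of a linear softmax head over fixed input features."""

    def __init__(self, n_members, n_features, n_classes, seed=0):
        rng = random.Random(seed)
        self.heads = [[[rng.gauss(0.0, 0.1) for _ in range(n_features)]
                       for _ in range(n_classes)]
                      for _ in range(n_members)]

    @staticmethod
    def _logits(head, x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in head]

    def predict_samples(self, x):
        """One class distribution per ensemble member."""
        return [softmax(self._logits(head, x)) for head in self.heads]

    def update(self, x, label, lr=0.5, steps=20):
        """A few cross-entropy gradient steps on the head weights only."""
        for head in self.heads:
            for _ in range(steps):
                probs = softmax(self._logits(head, x))
                for c, row in enumerate(head):
                    grad = probs[c] - (1.0 if c == label else 0.0)
                    for j in range(len(row)):
                        row[j] -= lr * grad * x[j]


def run_stream(model, stream, oracle, threshold):
    """Process samples one at a time; query the oracle only when uncertain."""
    queries = 0
    for x in stream:
        if predictive_entropy(model.predict_samples(x)) > threshold:
            label = oracle(x)       # hypothetical human instruction
            model.update(x, label)  # adapt online before the next sample
            queries += 1
    return queries


if __name__ == "__main__":
    model = TinyEnsemble(n_members=5, n_features=2, n_classes=2)
    x = [1.0, 0.0]
    print(run_stream(model, [x, x, x], oracle=lambda _x: 1, threshold=0.5))
```

The design point this sketch tries to capture is that a stream-based learner decides per sample whether to ask, rather than ranking a pool of unlabeled data, so the uncertainty score and the threshold together determine how often the human is interrupted.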

Key results

  • First real-robot implementation of stream-based active learning with DNNs, to the authors' knowledge
  • Bayesian formulation with learned priors improves uncertainty calibration and generalization
  • Rapid online adaptation to distribution shifts in under one minute
  • Validated via user study and humanoid robot demonstrations

Why it matters

Provides a practical framework for robots to maintain reliable perception in dynamic environments by efficiently leveraging human feedback for continuous adaptation.

Abstract

We propose CLEVER, an active learning system for robust semantic perception with Deep Neural Networks (DNNs). For data arriving in streams, our system seeks human support when encountering failures and adapts DNNs online based on human instructions. In this way, CLEVER can eventually accomplish the given semantic perception tasks. Our main contribution is the design of a system that meets several desiderata of realizing the aforementioned capabilities. The key enabler herein is our Bayesian formulation that encodes domain knowledge through priors. Empirically, we not only motivate CLEVER’s design but further demonstrate its capabilities with a user validation study as well as experiments on humanoid and deformable objects. To our knowledge, we are the first to realize stream-based active learning on a real robot, providing evidence that the robustness of the DNN-based semantic perception can be improved in practice. The project website can be accessed at https://sites.google.com/view/thecleversystem.

Index terms

Object Detection, Segmentation and Categorization; Deep Learning for Visual Perception; Probability and Statistical Methods
