ICRA 2026

Quality over Quantity: Demonstration Curation via Influence Functions for Data-Centric Robot Learning

Haeone Lee, Taywon Min, Junsu Kim, Sinjae Kang, Fangchen Liu, Lerrel Pinto, Kimin Lee

AI summary

QoQ leverages influence functions to score and curate high-quality robot demonstration trajectories, significantly outperforming heuristic data selection methods in both simulation and real-world tasks.

Robot data curation, Influence functions, Demonstration quality, Behavior cloning, Data valuation, Policy training

Problem

Human teleoperation introduces noise and suboptimal behaviors into robot demonstration datasets, making effective data curation critical but currently reliant on manual, heuristic-driven proxy metrics that fail to capture true policy impact.

Approach

QoQ defines data quality by each sample's direct contribution to reducing validation loss, using influence functions enhanced by maximum influence scoring and trajectory-wise aggregation to reduce noise and improve state coverage.
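
The scoring described above can be written out roughly as follows. This is a sketch based on the standard first-order influence approximation, not the paper's exact formulation: the symbols ($\hat\theta$ for the trained policy parameters, $\ell$ for the per-sample behavior cloning loss, $H_{\hat\theta}$ for the Hessian of the training loss, $z_i$ for a training state-action pair, $z_j^{\mathrm{val}}$ for a validation pair, $\tau$ for a trajectory) are assumed notation, and the aggregation operator $\operatorname{Agg}$ (for example a mean or a sum) is left unspecified because the summary does not state it.

```latex
% Estimated reduction in loss on validation pair j from upweighting training pair i
\mathcal{I}(z_i, z_j^{\mathrm{val}})
  = \nabla_\theta \ell(z_j^{\mathrm{val}}; \hat\theta)^{\top}
    H_{\hat\theta}^{-1}
    \nabla_\theta \ell(z_i; \hat\theta)

% (i) Maximum influence across validation pairs
s(z_i) = \max_{j} \; \mathcal{I}(z_i, z_j^{\mathrm{val}})

% (ii) Trajectory-wise aggregation over the pairs in trajectory \tau
S(\tau) = \operatorname{Agg}_{z_i \in \tau} \; s(z_i)
```

Under this sign convention a positive value means the training pair is estimated to lower validation loss, so trajectories with larger $S(\tau)$ would be the ones kept.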

Key results

Achieves 99.2% simulation success rate and up to 30.0% real robot improvement over baselines
Introduces maximum influence scoring to isolate the most relevant validation state-action pairs
Implements trajectory-wise aggregation to ensure broad state coverage and reduce redundancy
Successfully curates high-quality data from in-the-wild DROID dataset and noisy teleoperation logs
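
To make the two techniques listed above concrete, here is a minimal NumPy sketch of influence-based trajectory scoring. It is an illustration under stated assumptions, not the authors' implementation: the policy is a linear map trained with squared error (so the Hessian can be formed exactly), the trajectory aggregate is a mean, and all function names (`per_sample_grads`, `influence_matrix`, `trajectory_scores`, `curate`) are hypothetical.

```python
import numpy as np


def per_sample_grads(W, states, actions):
    """Flattened gradient of 0.5 * ||W s - a||^2 w.r.t. W, one row per sample."""
    residual = states @ W.T - actions                     # (N, A)
    grads = residual[:, :, None] * states[:, None, :]     # (N, A, D)
    return grads.reshape(len(states), -1)                 # (N, A*D)


def influence_matrix(W, train_s, train_a, val_s, val_a, damping=1e-3):
    """Entry [j, i] estimates how much upweighting training pair i
    reduces the loss on validation pair j (positive = helpful)."""
    g_train = per_sample_grads(W, train_s, train_a)       # (N, P)
    g_val = per_sample_grads(W, val_s, val_a)             # (M, P)
    # Exact Hessian of the mean squared-error loss for a linear policy
    # (assumption: real policies would need an approximation instead).
    cov = train_s.T @ train_s / len(train_s)              # (D, D)
    hessian = np.kron(np.eye(W.shape[0]), cov)            # (P, P)
    hessian += damping * np.eye(hessian.shape[0])         # damping for stability
    return g_val @ np.linalg.solve(hessian, g_train.T)    # (M, N)


def trajectory_scores(influence, traj_ids):
    """Max over validation pairs, then mean over each trajectory's pairs.
    The mean is an assumed choice of aggregate."""
    pair_scores = influence.max(axis=0)                   # (N,)
    return {int(t): float(pair_scores[traj_ids == t].mean())
            for t in np.unique(traj_ids)}


def curate(scores, keep):
    """Return the ids of the `keep` highest-scoring trajectories."""
    return sorted(scores, key=scores.get, reverse=True)[:keep]
```

Here `traj_ids` is an integer array giving the trajectory each training pair belongs to, and `W` has shape (action_dim, state_dim). For neural network policies the exact Hessian solve used here would be too expensive, so an approximate inverse-Hessian-vector product would have to stand in for `np.linalg.solve`.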

Why it matters

Provides a grounded, efficient framework for data-centric robot learning that helps researchers filter noisy teleoperation data to train more robust policies with minimal manual curation.

Abstract

Learning from demonstrations has emerged as a promising paradigm for end-to-end robot control, particularly when scaled to diverse and large datasets. However, the quality of demonstration data, often collected through human teleoperation, remains a critical bottleneck for effective data-driven robot learning. Human errors, operational constraints, and teleoperator variability introduce noise and suboptimal behaviors, making data curation essential yet largely manual and heuristic-driven. In this work, we propose Quality over Quantity (QoQ), a grounded and systematic approach to identifying high-quality data by defining data quality as the contribution of each training sample to reducing loss on validation demonstrations. To efficiently estimate this contribution, we leverage influence functions, which quantify the impact of individual training samples on model performance. We further introduce two key techniques to adapt influence functions for robot demonstrations: (i) using maximum influence across validation samples to capture the most relevant state-action pairs, and (ii) aggregating influence scores of state-action pairs within the same trajectory to reduce noise and improve data coverage. Experiments in both simulated and real-world settings show that QoQ consistently improves policy performances over prior data selection methods.

Index terms

Data Sets for Robot Learning, Learning from Demonstration