ICRA 2026

SCOOP'D: Learning Mixed-Liquid-Solid Scooping via Sim2Real Generative Policy

Kuanning Wang, Yongchong Gu, Yuqian Fu, Zeyu Shangguan, Sicheng He, Xiangyang Xue, Yanwei Fu, Daniel Seita

AI summary

A Sim2Real pipeline using two diffusion policies enables zero-shot robotic scooping of liquid-solid mixtures across diverse real-world scenarios.

Robotic scooping, Sim2Real transfer, Diffusion policy, Mixed liquid-solid manipulation, Generative imitation learning

Problem

General-purpose robotic scooping of mixed liquids and solids remains challenging due to complex tool-object interactions, unreliable real-world perception, and the high cost of collecting physical demonstration data.

Approach

The method generates thousands of scooping demonstrations in simulation using an algorithmic demonstrator, then trains two separate diffusion models—one for optimal pre-scoop ladle positioning and another for fine-grained scooping actions—to enable direct zero-shot real-world deployment.
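The two-stage structure described above can be illustrated with a toy control loop. This is only a sketch under stated assumptions, not the authors' implementation: `toy_denoise` is a stand-in that pulls Gaussian noise toward a known target (it is not a trained diffusion model), and the names `pre_scoop_policy`, `scoop_policy`, `run_episode`, the 2-D observation, and the 10 cm hover height are all hypothetical. What it shows is the control flow: one open-loop call to pick a pre-scoop pose, followed by a closed loop that re-observes before every action.

```python
import random
from typing import Callable, List

Vector = List[float]


def toy_denoise(target: Vector, steps: int, rng: random.Random) -> Vector:
    """Stand-in for reverse diffusion (NOT a real diffusion model):
    start from Gaussian noise and move halfway toward `target` each step."""
    x = [rng.gauss(0.0, 1.0) for _ in target]
    for _ in range(steps):
        x = [xi + 0.5 * (ti - xi) for xi, ti in zip(x, target)]
    return x


def pre_scoop_policy(obs: Vector, rng: random.Random) -> Vector:
    """Stage 1 (hypothetical): propose a pre-scoop ladle pose from one observation."""
    # Assumed for illustration: hover 0.10 m above the observed item position.
    target_pose = [obs[0], obs[1], 0.10]
    return toy_denoise(target_pose, steps=20, rng=rng)


def scoop_policy(obs: Vector, ladle: Vector, rng: random.Random) -> Vector:
    """Stage 2 (hypothetical): output a small ladle displacement toward the item."""
    goal = [obs[0], obs[1], 0.0]
    delta = [0.2 * (g - l) for g, l in zip(goal, ladle)]
    return toy_denoise(delta, steps=20, rng=rng)


def run_episode(get_obs: Callable[[], Vector], max_steps: int = 50, seed: int = 0) -> Vector:
    """Run stage 1 once, then stage 2 in a closed loop with fresh observations."""
    rng = random.Random(seed)
    ladle = pre_scoop_policy(get_obs(), rng)      # stage 1: single open-loop call
    for _ in range(max_steps):                    # stage 2: re-observe every step
        action = scoop_policy(get_obs(), ladle, rng)
        ladle = [l + a for l, a in zip(ladle, action)]
    return ladle


if __name__ == "__main__":
    # Hypothetical static scene: the item sits at (0.3, -0.2) in the container.
    print(run_episode(lambda: [0.3, -0.2]))
```

In a real system, the two toy functions would be replaced by trained models conditioned on sensor observations, and `get_obs` would read from the robot's cameras rather than return a fixed position.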

Key results

SimScoop dataset of 6,480 simulated scooping demonstrations
Two-stage diffusion policy for pre-scoop pose estimation and closed-loop scooping
Over 80% success rate across 240 zero-shot real-world trials on Level 1 objects
Strong generalization across varying objects, occlusions, liquids, and container types

Why it matters

Provides a scalable, data-efficient framework for complex fluid-solid manipulation tasks in assistive robotics, environmental cleanup, and industrial automation.

Abstract

Scooping items with tools such as spoons and ladles is common in daily life, ranging from assistive feeding to retrieving items from environmental disaster sites. However, developing a general and autonomous robotic scooping policy is challenging since it requires reasoning about complex tool-object interactions. Furthermore, scooping often involves manipulating deformable objects, such as granular media or liquids, which is challenging due to their infinite-dimensional configuration spaces and complex dynamics. We propose a method, SCOOP’D, which uses simulation from OmniGibson (built on NVIDIA Omniverse) to collect scooping demonstrations using algorithmic procedures that rely on privileged state information. Then, we use generative policies via diffusion to imitate demonstrations from observational input. We directly apply the learned policy in diverse real-world scenarios, testing its performance on various item quantities, item characteristics, and container types. In zero-shot deployment, our method demonstrates promising results across 465 trials in diverse scenarios, including objects of different difficulty levels that we categorize as “Level 1” and “Level 2.” SCOOP’D outperforms all baselines and ablations, suggesting that this is a promising approach to acquiring robotic scooping skills. Project page: https://scoopdiff.github.io/

Index terms

Deep Learning in Grasping and Manipulation; Imitation Learning; Data Sets for Robot Learning