GRITS: A Spillage-Aware Guided Diffusion Policy for Robot Food Scooping Tasks
Yen-Ling Tai, Yi-Ru Yang, Kuan-Ting Yu, Yu-Wei Chao, Yi-Ting Chen
AI summary
Problem
Existing robotic food scooping methods struggle with diverse and dynamic food states, which leads to frequent spillage. They also generalize poorly to unseen scenarios, because diverse demonstration data is costly to collect.
Approach
GRITS integrates a differentiable spillage predictor into a diffusion policy’s denoising process, continuously steering trajectories toward safer paths at test time without requiring retraining.
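To make the guidance idea concrete, here is a minimal one-dimensional sketch in Python. It is not the authors' implementation: the logistic `spill_probability`, the toy `denoise_step`, the demonstration mode at 1.5, and the guidance scale are all hypothetical stand-ins, and the sampling noise of a real diffusion process is omitted for clarity. In a full system the predictor would presumably be a learned network over observations and action rollouts, with its gradient obtained by automatic differentiation.

```python
import math


def spill_probability(action):
    """Hypothetical stand-in for a learned spillage predictor.

    Maps a 1-D action (think: scoop speed) to a spill probability with a
    logistic curve, assuming faster motions are more likely to spill.
    """
    return 1.0 / (1.0 + math.exp(-4.0 * (action - 1.0)))


def grad_log_no_spill(action):
    """Analytic gradient of log(1 - p_spill) with respect to the action."""
    # For p = sigmoid(k * (a - a0)), d/da log(1 - p) = -k * p, with k = 4 here.
    return -4.0 * spill_probability(action)


def denoise_step(action, demo_mode=1.5, rate=0.3):
    """Toy denoiser: pull the noisy action part of the way to a demo mode."""
    return action + rate * (demo_mode - action)


def sample(initial_action, steps=50, guidance_scale=0.0):
    """Run the toy denoising loop, optionally nudged by the predictor gradient."""
    action = initial_action
    for _ in range(steps):
        action = denoise_step(action)
        # Guidance: shift each denoised sample toward lower predicted spillage.
        action += guidance_scale * grad_log_no_spill(action)
    return action


unguided = sample(0.0, guidance_scale=0.0)
guided = sample(0.0, guidance_scale=0.1)
print("unguided:", round(unguided, 3), "p_spill:", round(spill_probability(unguided), 3))
print("guided:  ", round(guided, 3), "p_spill:", round(spill_probability(guided), 3))
```

The policy itself (`denoise_step`) is never modified; only the sampling loop changes, which is why this style of guidance can be applied at test time without retraining. The guidance scale trades off fidelity to the demonstrations against predicted safety.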
Key results
- 82% task success rate and 4% spillage rate across 10 unseen food categories
- Over 40% spillage reduction compared to unguided diffusion baselines
- Effective sim-to-real transfer using primitive-shape simulated training data
- Real-world validation on a Franka Emika Panda robot platform
Why it matters
Provides a scalable, safety-focused framework for assistive feeding and food preparation robots that must handle unpredictable real-world food dynamics.
Abstract
Robotic food scooping is a critical manipulation skill for food preparation and service robots. However, existing robot learning algorithms, especially learn-from-demonstration methods, still struggle to handle diverse and dynamic food states, which often results in spillage and reduced reliability. In this work, we introduce GRITS: A Spillage-Aware Guided Diffusion Policy for RobotIc Food Scoop TaskS. This framework leverages a guided diffusion policy to minimize food spillage during scooping and to ensure reliable transfer of food items from the initial to the target location. Specifically, we design a spillage predictor that estimates the probability of spillage given the current observation and action rollout. The predictor is trained on a simulated dataset with food spillage scenarios, constructed from four primitive shapes (spheres, cubes, cones, and cylinders) with varied physical properties such as mass, friction, and particle size. At inference time, the predictor serves as a differentiable guidance signal, steering the diffusion sampling process toward safer trajectories while preserving task success. We validate GRITS on a real-world robotic food scooping platform. GRITS is trained on six food categories and evaluated on ten unseen categories with different shapes and quantities. GRITS achieves an 82% task success rate and a 4% spillage rate, reducing spillage by over 40% compared to baselines without guidance, thereby demonstrating its effectiveness. More details are available on our project website: https://hcis-lab.github.io/GRITS/.
∗Equal contribution. †Corresponding author.
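The simulated dataset described in the abstract varies shape, mass, friction, and particle size. A rough sketch of how such randomized scenarios might be sampled is shown below. The four shape names come from the abstract; the `FoodScenario` record, the function names, the units, and every numeric range are assumptions made purely for illustration and are not taken from the paper.

```python
import random
from dataclasses import dataclass

# The four primitive shapes named in the abstract.
SHAPES = ("sphere", "cube", "cone", "cylinder")


@dataclass(frozen=True)
class FoodScenario:
    """Hypothetical record for one simulated scooping scenario."""
    shape: str
    mass_kg: float
    friction: float
    particle_size_m: float


def sample_scenario(rng):
    """Draw one scenario; all numeric ranges below are illustrative guesses."""
    return FoodScenario(
        shape=rng.choice(SHAPES),
        mass_kg=rng.uniform(0.001, 0.02),          # assumed range
        friction=rng.uniform(0.2, 1.0),            # assumed range
        particle_size_m=rng.uniform(0.005, 0.03),  # assumed range
    )


def build_dataset(n, seed=0):
    """Build n scenarios from a seeded generator so runs are reproducible."""
    rng = random.Random(seed)
    return [sample_scenario(rng) for _ in range(n)]


scenarios = build_dataset(5)
for scenario in scenarios:
    print(scenario)
```

Seeding the generator keeps the sampled set reproducible, which makes it easier to compare predictors trained on the same simulated scenarios.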