Long-Horizon Visual Action Based Food Acquisition
Amisha Bhaskar, Rui Liu, Vishnu D. Sharma, Guangyao Shi, Pratap Tokekar
Abstract
Robotic Assisted Feeding (RAF) addresses the fundamental need of individuals with mobility impairments to regain autonomy in feeding themselves. The goal of RAF is to use a robot arm to acquire food from the table and transfer it to the individual. Existing RAF methods primarily focus on solid foods, leaving a gap in manipulation strategies for semisolid and deformable foods. We present Long-horizon Visual Action-based (LAVA) food acquisition of liquid, semisolid, and deformable foods. Long-horizon refers to the goal of “clearing the bowl” by sequentially acquiring the food from the bowl. LAVA is hierarchical: (1) at the highest level, we determine primitives using ScoopNet; (2) at the mid level, LAVA finds parameters for the low-level primitives; (3) at the lowest level, LAVA carries out action execution using behavior cloning. We validate LAVA on real-world acquisition trials involving granular, liquid, semisolid, and deformable foods along with fruit chunks and soup. Across 46 bowls, LAVA acquires food much more efficiently than baselines, with a success rate of 89 ± 4%, and generalizes across realistic plate variations such as varying positions, varieties, and amounts of food in the bowl. Datasets and supplementary materials can be found on our website.
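To make the three-level hierarchy concrete, the following is a minimal illustrative sketch of how such a control loop could be organized. It is not the authors' implementation: the `Bowl` class, the function names, the primitive labels (`scoop_wide`, `scoop_deep`), the `depth` parameter, and the rule-based stand-ins for ScoopNet and the behavior-cloning policy are all hypothetical placeholders chosen only to show the flow of decisions from primitive selection, to parameter selection, to execution, repeated until the bowl is cleared.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Bowl:
    """Toy bowl state (hypothetical; a real system would use camera images)."""
    food_type: str   # illustrative label, e.g. "liquid" or "semisolid"
    remaining: int   # hypothetical number of bites left in the bowl


def select_primitive(bowl: Bowl) -> str:
    """High level: pick a primitive. A fixed rule stands in for ScoopNet."""
    # Assumption: the primitive names below are placeholders.
    return "scoop_wide" if bowl.food_type == "liquid" else "scoop_deep"


def select_parameters(bowl: Bowl, primitive: str) -> Dict[str, object]:
    """Mid level: choose parameters for the selected primitive."""
    # Assumption: "depth" is an invented parameter for illustration only.
    depth = 1.0 if primitive == "scoop_deep" else 0.5
    return {"primitive": primitive, "depth": depth}


def execute(bowl: Bowl, params: Dict[str, object]) -> bool:
    """Low level: run the action. Stands in for a behavior-cloned policy."""
    if bowl.remaining > 0:
        bowl.remaining -= 1  # toy model: every attempt acquires one bite
        return True
    return False


def clear_bowl(bowl: Bowl, max_attempts: int = 50) -> List[Dict[str, object]]:
    """Long-horizon loop: keep acquiring food until the bowl is empty."""
    log: List[Dict[str, object]] = []
    for _ in range(max_attempts):
        if bowl.remaining == 0:
            break
        primitive = select_primitive(bowl)
        params = select_parameters(bowl, primitive)
        success = execute(bowl, params)
        log.append({**params, "success": success})
    return log


if __name__ == "__main__":
    bowl = Bowl(food_type="liquid", remaining=3)
    for step in clear_bowl(bowl):
        print(step)
```

In this sketch the "long-horizon" aspect is the outer loop in `clear_bowl`, which re-evaluates the bowl state before every attempt rather than committing to a fixed sequence of actions in advance.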