ICRA 2023

H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions

Kei Ota, Hsiao-Yu Tung, Kevin Smith, Anoop Cherian, Tim K. Marks, Alan Sullivan, Asako Kanezaki, Joshua Tenenbaum

Abstract

The world is filled with articulated objects whose use is difficult to determine from vision alone, e.g., a door might open inwards or outwards. Humans handle these objects with strategic trial-and-error: first pushing a door, then pulling if that doesn’t work. We enable these capabilities in autonomous agents by proposing “Hypothesize, Simulate, Act, Update, and Repeat” (H-SAUR), a probabilistic generative framework that simultaneously generates a distribution of hypotheses about how objects articulate given input observations, captures certainty over hypotheses over time, and infers plausible actions for exploration and goal-conditioned manipulation. We compare our model with existing work in manipulating objects after a handful of exploration actions, on the PartNet-Mobility dataset. We further propose a novel PuzzleBoxes benchmark that contains locked boxes that require multiple steps to solve. We show that the proposed model significantly outperforms the current state-of-the-art articulated object manipulation framework, despite using zero training data. We further improve the test-time efficiency of H-SAUR by integrating a learned prior from learning-based vision models.

1 Kei Ota is with Information Technology R&D Center, Mitsubishi Electric Corporation, Japan. Ota.Kei@ds.MitsubishiElectric.co.jp
2 Kei Ota and Asako Kanezaki are with Tokyo Institute of Technology, Japan.
3 Hsiao-Yu Tung, Kevin A. Smith, and Joshua B. Tenenbaum are with Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA.
4 Anoop Cherian, Tim K. Marks, and Alan Sullivan are with Mitsubishi Electric Research Labs, Cambridge, MA, USA.
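
The hypothesize, simulate, act, update loop described in the abstract can be illustrated with a toy version of its own door example. The sketch below is not the authors' implementation. It assumes a hypothetical two-hypothesis world (the door opens by pushing or by pulling), a trivial stand-in `simulate` function, and an assumed observation noise of 0.1. All function names and parameters are illustrative.

```python
# Toy illustration only: a discrete Bayesian belief over how a door opens.
# HYPOTHESES, simulate, likelihood, update, choose_action and explore are
# hypothetical names, not part of H-SAUR.

HYPOTHESES = ["push", "pull"]


def simulate(hypothesis, action):
    """Predict whether `action` moves the door if `hypothesis` were true."""
    return hypothesis == action


def likelihood(observed_moved, hypothesis, action, noise=0.1):
    """P(observation | hypothesis, action), with an assumed noise level."""
    predicted_moved = simulate(hypothesis, action)
    return 1.0 - noise if predicted_moved == observed_moved else noise


def update(belief, action, observed_moved):
    """Apply Bayes' rule over the discrete hypothesis set."""
    unnormalized = {
        h: p * likelihood(observed_moved, h, action) for h, p in belief.items()
    }
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


def choose_action(belief):
    """Act on the currently most probable hypothesis."""
    return max(belief, key=belief.get)


def explore(true_mechanism, max_steps=5, threshold=0.95):
    """Repeat act/update until one hypothesis is sufficiently certain."""
    # Hypothesize: start from a uniform belief.
    belief = {h: 1.0 / len(HYPOTHESES) for h in HYPOTHESES}
    for _ in range(max_steps):
        action = choose_action(belief)
        # Stand-in for the real environment's response to the action.
        observed_moved = action == true_mechanism
        belief = update(belief, action, observed_moved)
        if max(belief.values()) >= threshold:
            break
    return belief


if __name__ == "__main__":
    print(explore("pull"))
```

For a door that actually opens by pulling, this toy agent first tries pushing, sees no motion, shifts its belief toward pulling, and then confirms it with a second action. That mirrors the push-then-pull behavior the abstract attributes to humans.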

Index terms

Probabilistic Inference; Learning Categories and Concepts; Cognitive Modeling