ICRA 2024

Perception through Cognitive Emulation: A Second Iteration of NaivPhys4RP for Learningless and Safe Recognition and 6D-Pose Estimation of (Transparent) Objects

Franklin Kenghagho Kenfack, Michael Neumann, Patrick Mania, Michael Beetz


Abstract

In our previous work, we designed NaivPhys4RP, a human-like, white-box and causal generative model of perception, essentially based on cognitive emulation, to understand the past, present and future states of complex worlds from poor observations. In this paper, as recommended in that previous work, we first refine the theoretical model of NaivPhys4RP in terms of the integration of variables as well as the perceptual inference tasks to solve. Intuitively, the system is closed under the injection, update and dependency of variables. Then, we present a first implementation of NaivPhys4RP that demonstrates the learningless and safe recognition and 6D-pose estimation of objects from poor sensor data (e.g., occlusion, transparency, poor depth, in-hand). This not only represents a substantial step forward compared with classical perception systems in perceiving objects in these scenarios, but also escapes the burden of data-intensive learning and operates safely (transparency and causality: we fit sensor data into mentally constructed meaningful worlds). With respect to ChatGPT's ambitions, it can imagine physico-realistic socio-physical scenes from texts and demonstrate understanding of these texts, all without data- and resource-intensive learning.

*This work was not supported by any organization.

1 With the Institute for Artificial Intelligence, Mathematics and Computer Science, University of Bremen, Germany. fkenghag@uni-bremen.de

Index terms

Semantic Scene Understanding; Cognitive Modeling; Perception for Grasping and Manipulation