Human-Robot Collaboration through a Multi-Scale Graph Convolution Neural Network with Temporal Attention
Zhaowei Liu, Xilang Lu, Wenzhe Liu, Wen Qi, Hang Su
Abstract
For human-robot collaboration to succeed, collaborative robots must sense and understand the movements and intentions of their human partners. Human skeleton sequences are widely recognized as a data source with great potential for human action recognition. In this paper, a multi-scale skeleton-based human action recognition network is proposed, which leverages a spatio-temporal attention mechanism. The network achieves high-accuracy human action prediction by aggregating multi-level key point features of the skeleton and applying the spatio-temporal attention mechanism to extract key temporal features. In addition, a human action skeleton dataset containing eight action categories is collected for a human-robot collaboration task, in which the human activity recognition network predicts actions from skeleton sequences captured by a camera and the collaborating robot responds with collaborative actions based on the predicted actions. The performance of the proposed method is compared with state-of-the-art human action recognition methods, and ablation experiments are performed. The results show that the multi-scale spatio-temporal graph convolutional neural network achieves an action recognition accuracy of 94.16%. The effectiveness of the method is further verified through human-robot collaboration experiments on a real robot platform in a laboratory environment.
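The two ideas named in the abstract, multi-scale aggregation over the skeleton graph and attention over time, can be illustrated with a minimal sketch. It is not the authors' network: the function names, the row-normalized adjacency, the hop-based definition of "scale", and the externally supplied per-frame scores are all assumptions made for illustration, and a trained model would learn its weights and attention scores rather than receive them as inputs.

```python
import math

def normalize_adjacency(adj):
    """Add self-loops to a joint adjacency matrix, then row-normalize it."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    return [[v / sum(row) for v in row] for row in with_loops]

def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def multi_scale_aggregate(adj, feats, num_scales):
    """Concatenate each joint's own features with 1..num_scales hop averages.

    Assumption: a "scale" here means one more hop of neighborhood averaging.
    """
    a_hat = normalize_adjacency(adj)
    out = [list(row) for row in feats]  # scale 0: the joint's own features
    hop = feats
    for _ in range(num_scales):
        hop = matmul(a_hat, hop)        # spread features one hop further
        out = [o + h for o, h in zip(out, hop)]
    return out

def temporal_attention(frame_feats, scores):
    """Softmax the per-frame scores and pool the frames with those weights.

    Assumption: scores are given by the caller; a real model would learn them.
    """
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract max for stability
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frame_feats[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, frame_feats))
              for d in range(dim)]
    return weights, pooled

# Hypothetical 3-joint chain skeleton (0-1-2) with one feature per joint.
chain = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
joint_feats = [[1.0], [0.0], [0.0]]
multi_scale = multi_scale_aggregate(chain, joint_feats, num_scales=2)

# Two frames with equal scores receive equal attention weights.
weights, pooled = temporal_attention([[2.0, 0.0], [0.0, 4.0]], [0.0, 0.0])
```

In this toy example, joint 2 sees nothing of joint 0's feature at one hop and only picks it up at two hops, which is the kind of longer-range dependency that aggregating over several scales is meant to capture.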