EgoHMR: Egocentric Human Mesh Recovery Via Hierarchical Latent Diffusion Model
Yuxuan Liu, Jianxin Yang, Xiao Gu, Yao Guo, Guang-Zhong Yang
Abstract
Egocentric vision has gained increasing popularity in social robotics, demonstrating great potential for personal assistance and human-centric behavior analysis. Holistic perception of the human body is a prerequisite for downstream applications, including action recognition and anticipation. Extensive research has been performed on human mesh recovery from exocentric images captured from a third-person view, but few studies have addressed egocentric images, which are heavily distorted and occluded. In this paper, we propose Egocentric Human Mesh Recovery (EgoHMR), a novel hierarchical network based on latent diffusion models. Our method takes a single egocentric frame as input and can be trained end-to-end without 2D pose supervision. The network builds on the latent diffusion model by incorporating both global and local features in a hierarchical structure. To train the proposed network, we generate weak labels from synchronized exocentric images. The proposed method performs human mesh recovery directly from egocentric images, and detailed quantitative and qualitative experiments demonstrate its effectiveness.
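As a rough illustration of the latent diffusion idea the abstract refers to, the following minimal sketch shows a standard DDPM-style forward noising step on a toy latent vector, its inversion given a noise estimate, and one possible way of assembling a condition from global and local features. It is not the EgoHMR architecture: the abstract does not specify the schedule, the latent dimensionality, or the conditioning mechanism, so the linear schedule, the concatenation in `build_condition`, and all function names and values here are assumptions made for illustration only.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (a common DDPM default, assumed here)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def q_sample(z0, t, alpha_bars, noise):
    """Forward process: z_t = sqrt(a_bar_t) * z0 + sqrt(1 - a_bar_t) * noise."""
    a = alpha_bars[t]
    return [math.sqrt(a) * z + math.sqrt(1.0 - a) * n for z, n in zip(z0, noise)]


def predict_z0(zt, t, alpha_bars, eps_hat):
    """Invert the forward process given a noise estimate eps_hat."""
    a = alpha_bars[t]
    return [(z - math.sqrt(1.0 - a) * e) / math.sqrt(a) for z, e in zip(zt, eps_hat)]


def build_condition(global_feat, local_feats):
    """Hypothetical conditioning: concatenate a global vector with local vectors."""
    cond = list(global_feat)
    for feat in local_feats:
        cond.extend(feat)
    return cond


# Toy usage with made-up values; a trained denoiser would supply eps_hat.
rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
z0 = [0.5, -1.0, 2.0]  # stand-in for a latent encoding of body parameters
noise = [rng.gauss(0.0, 1.0) for _ in z0]
zt = q_sample(z0, 500, alpha_bars, noise)
z0_rec = predict_z0(zt, 500, alpha_bars, noise)  # oracle noise recovers z0
cond = build_condition([0.1, 0.2], [[1.0], [2.0]])
```

In a trained model, the noise estimate would come from a network that takes the noisy latent, the timestep, and the condition as inputs; here the true noise is passed back in only to check that the two formulas are mutually consistent.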