A Real-time Filter for Human Pose Estimation based on Denoising Diffusion Models for Edge Devices
Chiara Bozzini, Michele Boldo, Enrico Martini, Nicola Bombieri
Abstract
Human Pose Estimation (HPE) is increasingly utilized across various sectors, from healthcare to Industry 5.0. To address the inherent inaccuracies in CNN-based HPE systems, filtering models are commonly employed to refine and improve inference results. However, state-of-the-art filtering models often require substantial computational resources, limiting their applicability in resource-constrained environments. To overcome this limitation, we propose a real-time filtering approach based on denoising diffusion models (DM) specifically optimized for edge devices. Through a micro-benchmarking process, we analyze the DM adaptability to different types and levels of noise and determine the optimal setup for specific application scenarios. We present a real-time filter that takes advantage of the DM setup with two configurations to address different application scenarios. Using a widespread edge device, we evaluate the model’s effectiveness in handling both synthetic and real noise generated by state-of-the-art HPE systems. The results demonstrate a significant improvement in real-time filtering performance with minimal computational overhead. The code is available on github.com/PARCO-LAB/LUT-DM-filters.