Automatic Captioning Based on Visible and Infrared Images
Yan Wang, Shuli Lou, Kai Wang, Xiaohu Yuan, Huaping Liu
Abstract
In this paper, we tackle image captioning by exploiting the complementarity of visible-light (RGB) and infrared (IR) images. To this end, we propose an RGB-IR image fusion captioning model that can take full advantage of visible-light and infrared images under different conditions. We also develop a wearable environment-assisted system. In addition, we collect and annotate a new dataset containing 3510 pairs of RGB-IR images to support model training. Finally, we conduct extensive experiments to evaluate the model and the system. Experimental results show that our method and system significantly outperform baselines on multiple metrics and have potential practical value.