NeuralLabeling: A versatile toolset for labeling vision datasets using Neural Radiance Fields
Floris Marc Arden Erich, Naoya Chiba, Abdullah Mustafa, Yusuke Yoshiyasu, Noriaki Ando, Ryo Hanai, Yukiyasu Domae
Abstract
We present NeuralLabeling, a labeling approach and toolset for annotating 3D scenes using either bounding boxes or meshes and generating segmentation masks, affor- dance maps, 2D bounding boxes, 3D bounding boxes, 6DOF object poses, depth maps, and object meshes. NeuralLabeling uses Neural Radiance Fields (NeRF) as a renderer, allowing labeling to be performed using 3D spatial tools while incorpo- rating geometric clues such as occlusions, relying only on images captured from multiple viewpoints as input. To demonstrate the applicability of NeuralLabeling to a practical problem in robotics, we added ground truth depth maps to 30000 frames of transparent object RGB and noisy depth maps of glasses placed in a dishwasher captured using an RGBD sensor, yielding the Dishwasher30k dataset. We show that training a simple deep neural network with supervision using the annotated depth maps yields a higher reconstruction performance than training with the previously applied weakly supervised ap- proach. We also show how instance segmentation and depth completion datasets generated using NeuralLabeling can be incorporated into a robot application for grasping transparent objects placed in a dishwasher with an accuracy of 83.3%, compared to 16.3% without depth completion. Supplementary URI: https://florise.github.io/neural labeling web/.