ICRA 2024

Jacquard V2: Refining Datasets Using the Human in the Loop Data Correction Method

Qiuhao Li, Shenghai Yuan

Abstract

Amid rapid advances in industrial automation, vision-based robotic grasping plays an increasingly crucial role. To improve visual recognition accuracy, models must be trained on large-scale datasets so that they acquire implicit knowledge about handling various objects. Creating datasets from scratch is time- and labor-intensive. Moreover, existing datasets often contain errors because annotations were automated for expediency, which makes improving these datasets a substantial research challenge. In particular, several issues have been identified in the grasp bounding box annotations of the popular Jacquard Grasp Dataset [1]. We propose a Human-In-The-Loop (HIL) method to enhance dataset quality. The approach relies on backbone deep learning networks to predict object positions and orientations for robotic grasping. Predictions with Intersection over Union (IoU) values below 0.2 are assessed by human operators, who categorize the data into False Negatives (FN) and True Negatives (TN). FN are then subcategorized as either missing annotations or catastrophic labeling errors. Images lacking labels are augmented with valid grasp bounding box information, whereas images with catastrophic labeling errors are removed entirely. The open-source tool Labelbee was employed for 53,026 iterations of HIL dataset enhancement, leading to the removal of 2,884 images and the incorporation of ground truth information for 30,292 images. The enhanced dataset, named the Jacquard V2 Grasping Dataset, served as the training data for a range of neural networks. We empirically demonstrate that these dataset improvements significantly enhance the training and prediction performance of the same network, resulting in an increase of 7.1% across most popular detection architectures for ten iterations.
This refined dataset will be accessible on OneDrive and Baidu Netdisk, while the associated tools, source code, and benchmarks will be made available on GitHub (https://github.com/lqh12345/Jacquard V2).
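
The triage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. Only the 0.2 IoU threshold and the three outcomes (keep, add a label, remove the image) come from the abstract. The names `triage`, `Verdict`, `Action`, and `ask_human` are hypothetical, and the reading of TN as "the prediction is genuinely poor and the labels are fine" is an assumption. Grasp annotations in Jacquard are oriented rectangles; an axis-aligned IoU stands in here purely to keep the sketch self-contained.

```python
from enum import Enum
from typing import Callable, Sequence, Tuple

# Assumption: boxes are axis-aligned (x_min, y_min, x_max, y_max).
# Real grasp rectangles are oriented and would need a rotated-box IoU.
Box = Tuple[float, float, float, float]

IOU_THRESHOLD = 0.2  # from the abstract: below this, a human reviews the case


class Verdict(Enum):
    """Hypothetical labels for the human operator's decision."""
    TRUE_NEGATIVE = "tn"              # assumed: prediction is bad, labels are fine
    MISSING_ANNOTATION = "missing"    # FN: valid grasp, but no label covers it
    CATASTROPHIC_LABEL = "bad_label"  # FN: existing labels are badly wrong


class Action(Enum):
    KEEP = "keep"
    ADD_LABEL = "add_label"
    REMOVE_IMAGE = "remove_image"


def axis_aligned_iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def triage(
    pred: Box,
    gt_boxes: Sequence[Box],
    ask_human: Callable[[Box, Sequence[Box]], Verdict],
) -> Action:
    """Decide what to do with one image given the network's predicted grasp."""
    # Best overlap against any ground-truth box; 0.0 if the image has no labels.
    best_iou = max((axis_aligned_iou(pred, g) for g in gt_boxes), default=0.0)
    if best_iou >= IOU_THRESHOLD:
        return Action.KEEP  # prediction agrees with a label; no review needed

    verdict = ask_human(pred, gt_boxes)  # human-in-the-loop step
    if verdict is Verdict.MISSING_ANNOTATION:
        return Action.ADD_LABEL
    if verdict is Verdict.CATASTROPHIC_LABEL:
        return Action.REMOVE_IMAGE
    return Action.KEEP  # true negative: dataset left unchanged
```

In this sketch the human is only consulted for low-IoU predictions, which is what keeps the review effort bounded: well-matched predictions pass through without any manual work.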

Index terms

Human Factors and Human-in-the-Loop; Learning Categories and Concepts; Data Sets for Robotic Vision