Gradual Receptive Expansion Using Vision Transformer for Online 3D Bin Packing
Minjae Kang, Hogun Kee, Yoseph Park, Junseok Kim, Jaeyeon Jeong, Geunje Cheon, Jaewon Lee, Songhwai Oh
Abstract
The bin packing problem (BPP) is a challenging combinatorial optimization problem with many practical applications. This paper focuses on online 3D-BPP, in which the packer must immediately decide on a loading position for each item as items arrive one after another. We propose GREViT, a novel reinforcement learning algorithm that is the first to apply a vision transformer to online 3D-BPP. By introducing the gradual receptive expansion technique, GREViT overcomes a limitation of learning-based methods, which excel only on the bins they were trained on. As a result, GREViT surpasses existing BPP algorithms in packing ratio across various bin sizes. The effectiveness of GREViT in real-world scenarios is validated through demonstrations in which a real robot solves 3D-BPP. The attached video shows GREViT performing 3D-BPP in both simulated and real-world environments.