ICRA 2026

Learning Flexible Job Shop Scheduling under Limited Buffers and Material Kitting Constraints

Shishun Zhang, Juzhan Xu, Yidan Fan, Chenyang Zhu, Ruizhen Hu, Yongjun Wang, Kai Xu

AI summary

A deep reinforcement learning model with cost-sensitive graph message passing effectively minimizes production makespan and pallet switches under strict buffer and kitting constraints.

Flexible Job Shop Scheduling, Deep Reinforcement Learning, Heterogeneous Graph Neural Network, Limited Buffers, Material Kitting, Production Optimization

Problem

Existing FJSP studies often ignore or idealize the limited buffer zones and material kitting rules of real production lines. These constraints create bottlenecks that traditional heuristics and standard DRL methods do not model or optimize effectively.

Approach

The authors integrate a heterogeneous graph neural network into a deep reinforcement learning framework, using weighted message passing to propagate buffer state and pallet change costs to operations for proactive decision-making.
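
The weighted message passing idea can be illustrated with a minimal sketch. This is not the paper's architecture: the function name `cost_weighted_aggregate`, the `temperature` parameter, and the choice to weight neighbors by a softmax over negative pallet-switch cost are all assumptions made for illustration. It only shows how an operation node could aggregate neighbor features so that the cost attached to each edge shapes the resulting embedding.

```python
import math


def cost_weighted_aggregate(neighbor_feats, switch_costs, temperature=1.0):
    """Aggregate neighbor feature vectors for one operation node.

    Hypothetical weighting (an assumption, not the paper's formula):
    weight_i = softmax(-cost_i / temperature), so neighbors reachable
    with a cheaper pallet switch contribute more to the message.
    """
    if not neighbor_feats:
        raise ValueError("need at least one neighbor")
    if len(neighbor_feats) != len(switch_costs):
        raise ValueError("one cost per neighbor is required")

    # Numerically stable softmax over negated costs.
    logits = [-c / temperature for c in switch_costs]
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of neighbor features, one feature dimension at a time.
    dim = len(neighbor_feats[0])
    aggregated = [
        sum(w * feat[d] for w, feat in zip(weights, neighbor_feats))
        for d in range(dim)
    ]
    return aggregated, weights


# Two buffer-slot neighbors with equal cost are averaged evenly.
message, weights = cost_weighted_aggregate([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

In a trained model the weights would more plausibly come from learned attention over node and edge features; the fixed softmax here simply makes the role of the cost term visible.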

Key results

First DRL framework to solve FJSP with limited buffer and material kitting constraints
Enhanced heterogeneous GNN with cost-sensitive message passing to capture pallet switch costs
Outperforms traditional heuristics and advanced DRL baselines on both makespan and number of pallet changes
Balanced solution quality and computational efficiency across synthetic and real industrial datasets
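
The two metrics named above can be made concrete with a small sketch. Makespan is the completion time of the last operation. The pallet-change count below rests on assumptions that this summary does not confirm: the buffer is modeled as holding at most `capacity` pallets, a change is counted only when a requested pallet forces an eviction, and the least recently used pallet is the one evicted. The function names are hypothetical.

```python
from collections import OrderedDict


def makespan(schedule):
    """Return the latest end time among (start, end) operation intervals."""
    return max(end for _, end in schedule)


def count_pallet_changes(requests, capacity):
    """Count evictions in a limited buffer (illustrative model only).

    Assumptions: the buffer holds `capacity` pallets, filling an empty
    slot is free, and the least recently used pallet is swapped out.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    buffer = OrderedDict()  # pallet id -> None, ordered by recency of use
    changes = 0
    for pallet in requests:
        if pallet in buffer:
            buffer.move_to_end(pallet)  # already present, mark as recent
            continue
        if len(buffer) >= capacity:
            buffer.popitem(last=False)  # evict least recently used pallet
            changes += 1
        buffer[pallet] = None
    return changes


ops = [(0, 3), (2, 7), (3, 5)]
print(makespan(ops))                                          # 7
print(count_pallet_changes(["A", "B", "A", "C", "B"], 2))     # 2
```

With a capacity of 3 the same request sequence needs no changes, which illustrates why buffer size and operation ordering interact in this problem.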

Why it matters

It enables high-mix manufacturing lines to overcome part-sorting bottlenecks and improve overall production efficiency through scalable, constraint-aware scheduling.

Abstract

The Flexible Job Shop Scheduling Problem (FJSP) originates from real production lines, while some practical constraints are often ignored or idealized in current FJSP studies, among which the limited buffer problem has a particular impact on production efficiency. To this end, we study an extended problem that is closer to practical scenarios—the Flexible Job Shop Scheduling Problem with Limited Buffers and Material Kitting. In recent years, deep reinforcement learning (DRL) has demonstrated considerable potential in scheduling tasks. However, its capacity for state modeling remains limited when handling complex dependencies and long-term constraints. To address this, we leverage a heterogeneous graph network within the DRL framework to model the global state. By constructing efficient message passing among machines, operations, and buffers, the network focuses on avoiding decisions that may cause frequent pallet changes during long-sequence scheduling, thereby helping improve buffer utilization and overall decision quality. Experimental results on both synthetic and real production line datasets show that the proposed method outperforms traditional heuristics and advanced DRL methods in terms of makespan and pallet changes, and also achieves a good balance between solution quality and computational cost. Furthermore, a supplementary video is provided to showcase a simulation system that effectively visualizes the progression of the production line.

Index terms

Intelligent and Flexible Manufacturing