Combining Reinforcement Learning Algorithms with Graph Neural Networks to Solve Dynamic Job Shop Scheduling Problems
Abstract
1. Introduction
- A GNN combined with a DRL model is constructed to solve the DJSP with random job arrivals;
- A scheduling pool stores the processes that satisfy two conditions in the current job shop environment: (1) the process is currently in an executable state; (2) the machine that processes it is idle;
- A new reward function is designed to enhance the algorithm's ability to learn the scheduling objective.
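The scheduling-pool idea above can be sketched as a simple filter over the shop state. This is an illustrative reconstruction, not the authors' code; the data layout (per-job operation lists with a 0/1/2 scheduling state, and a machine-idle map) is our assumption.

```python
def build_scheduling_pool(jobs, machine_idle):
    """jobs: {job_id: list of (op_id, machine_id, state)} where state is
    0 = not scheduled, 1 = being scheduled, 2 = scheduled (finished).
    machine_idle: {machine_id: bool}.
    An operation enters the pool only if it is the next unfinished
    operation of its job AND its designated machine is idle."""
    pool = []
    for job_id, ops in jobs.items():
        for op_id, machine_id, state in ops:
            if state == 2:
                continue  # finished; the next operation becomes eligible
            if state == 0 and machine_idle.get(machine_id, False):
                pool.append((job_id, op_id, machine_id))
            break  # only the first unfinished operation of a job can be eligible
    return pool
```

Keeping the pool small this way restricts the agent's action space to operations that are actually dispatchable at the current decision point.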
2. Problem Modeling
2.1. Problem Introduction
- (1) All machines are in an executable state at the start of scheduling, and machine failures are not considered;
- (2) The multiple processes of the same job must be processed sequentially in their process order;
- (3) Each process of a job must be processed on its designated machine, and different jobs have different priorities to reflect their degree of urgency;
- (4) A process may not be interrupted once it has begun processing;
- (5) The transfer time of jobs between machines is not considered.
2.2. Construction of Mathematical Model
3. Model Construction of Reinforcement Learning Algorithm Combined with GNN
3.1. Graph Neural Network
3.1.1. Characterization of the Graph
- i: the job number of the current node;
- j: the process number of the current node;
- Mij: the machine number for processing the process;
- Ai: the arrival time of the job;
- OPij: the processing time of the process;
- Di: the latest submission time (deadline) of the job;
- Startij: the time at which the process begins execution;
- Stateij: the scheduling status of the process (0 = not scheduled; 1 = being scheduled; 2 = scheduled);
- wi: the urgency (weight) of the job.
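The nine attributes above form each graph node's feature vector. A minimal sketch of assembling them into a feature matrix for the GNN (attribute key names are ours, not from the paper):

```python
import numpy as np

def build_feature_matrix(operations):
    """operations: list of dicts, one per operation node, each carrying the
    nine attributes listed above. Returns an (N, 9) float array whose rows
    are the node feature vectors fed to the graph convolution layers."""
    keys = ["job", "op", "machine", "arrival", "proc_time",
            "deadline", "start", "state", "weight"]
    return np.array([[float(op[k]) for k in keys] for op in operations])
```

In practice the raw attributes would typically be normalized (e.g., times divided by a horizon) before message passing, but the ordering above mirrors the list in the text.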
3.1.2. Advantages and Methods of Job Shop Extraction of Graphs
3.2. Deep Reinforcement Learning
3.2.1. Markov Process and Reinforcement Learning
3.2.2. Double Deep Q Network
Algorithm 1. GNN + DDQN
Initialize replay memory D
Initialize the online action-value network Q with random weights θ
Initialize the target action-value network Q̂ with weights θ⁻ = θ
For episode = 1 : E do:
    Initialize state s₁
    For t = 1 : T do:
        With probability ε select a random action aₜ;
        otherwise aₜ = argmaxₐ Q(sₜ, a; θ)
        Execute aₜ, observe the next state sₜ₊₁ and the immediate reward rₜ
        Store (sₜ, aₜ, rₜ, sₜ₊₁) in D
        Sample a random minibatch of transitions (sⱼ, aⱼ, rⱼ, sⱼ₊₁) from D
        Select the optimal action with the online network: a* = argmaxₐ Q(sⱼ₊₁, a; θ)
        Set the target: yⱼ = rⱼ + γQ̂(sⱼ₊₁, a*; θ⁻), or yⱼ = rⱼ if sⱼ₊₁ is terminal
        Perform gradient descent on (yⱼ − Q(sⱼ, aⱼ; θ))²
        Every m steps, update θ⁻ ← θ
    End
End
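The key line of Algorithm 1 is the Double DQN target: the online network selects the greedy action for the next state, while the target network evaluates it, which reduces the overestimation bias of vanilla DQN. A minimal sketch of that target computation (array shapes and names are illustrative):

```python
import numpy as np

def ddqn_target(q_online_next, q_target_next, rewards, dones, gamma=0.95):
    """Compute Double DQN targets y for a minibatch.
    q_online_next, q_target_next: (batch, n_actions) Q-values for s'.
    rewards, dones: (batch,) arrays; dones is 1.0 for terminal s'.
    Online net picks the action; target net evaluates it."""
    best = q_online_next.argmax(axis=1)                       # a* from online net
    next_q = q_target_next[np.arange(len(best)), best]        # Q̂(s', a*; θ⁻)
    return rewards + gamma * (1.0 - dones) * next_q           # y = r + γ·Q̂, 0 if terminal
```

Gradient descent is then performed on the squared error between these targets and the online network's Q(s, a; θ), exactly as in the pseudocode.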
3.2.3. Action Space
3.2.4. Reward Function
4. Numerical Experiments
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Jackson, J.R. Simulation research on job shop production. Nav. Res. Logist. Q. 1957, 4, 287–295. [Google Scholar] [CrossRef]
- Zhong, J.W.; Shi, Y.Q. Smart factory job shop scheduling based on DQN. Mod. Manuf. Eng. 2021, 492, 17–23+93. [Google Scholar] [CrossRef]
- Zhang, C.; Song, W.; Cao, Z.; Zhang, J.; Tan, P.S.; Chi, X. Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning. Adv. Neural Inf. Process. Syst. (NeurIPS) 2020, 33, 1621–1632. [Google Scholar]
- Zhang, K.; Bi, L.; Jiao, S.G. Research on flexible job shop scheduling problem with integrated reinforcement learning algorithm. China Mech. Eng. 2023, 34, 201–207. [Google Scholar]
- Holthaus, O.; Rajendran, C. Efficient dispatching rules for scheduling in a job shop. Int. J. Prod. Econ. 1997, 48, 87–105. [Google Scholar] [CrossRef]
- Qu, X.H.; Ji, F.; Meng, G.J.; Ding, B.R.; Wang, J. Research on green scheduling problem of flexible job shop with superheuristic genetic algorithm. Mechatron. Eng. 2022, 39, 255–261. [Google Scholar]
- Zhang, S.; Wang, Y.; Ji, Z. A dynamic job shop scheduling method based on superheuristic genetic planning. J. Syst. Simul. 2020, 32, 2494–2506. [Google Scholar] [CrossRef]
- Burggräf, P.; Wagner, J.; Saßmannshausen, T.; Ohrndorf, D.; Subramani, K. Multi-agent-based deep reinforcement learning for dynamic flexible job shop scheduling. Procedia CIRP 2022, 112, 57–62. [Google Scholar] [CrossRef]
- Mnih, V.; Kavukcuoglu, K.; Silver, D.; Graves, A.; Antonoglou, I.; Wierstra, D.; Riedmiller, M. Playing Atari with Deep Reinforcement Learning. arXiv 2013, arXiv:1312.5602. [Google Scholar]
- Zhao, Y.; Wang, Y.; Tan, Y.; Zhang, J.; Yu, H. Dynamic jobshop scheduling algorithm based on deep q network. IEEE Access 2021, 9, 122995–123011. [Google Scholar] [CrossRef]
- Luo, S. Dynamic scheduling for flexible job shop with new job insertions by deep reinforcement learning. Appl. Soft Comput. 2020, 91, 106208. [Google Scholar] [CrossRef]
- Wang, X.; Zhang, L.; Ren, L.; Xie, K.; Wang, K.; Ye, F.; Chen, Z. A brief study on job shop scheduling problem based on reinforcement learning. J. Syst. Simul. 2021, 33, 2782–2791. [Google Scholar] [CrossRef]
- Zhou, J.; Cui, G.; Hu, S.; Zhang, Z.; Yang, C.; Liu, Z.; Wang, L.; Li, C.; Sun, M. Graph neural networks: A review of methods and applications. AI Open 2020, 1, 57–81. [Google Scholar] [CrossRef]
- Hameed, M.S.A.; Schwung, A. Reinforcement learning on job shop scheduling problems using graph networks. arXiv 2020, arXiv:2009.03836. [Google Scholar]
- Yu, B.; Xie, H.; Xu, Z. PN-GCN: Positive-negative graph convolution neural network in information system to classification. Inf. Sci. 2023, 632, 411–423. [Google Scholar] [CrossRef]
- Zhao, K.; Du, C.; Tan, G. Enhancing Basketball Game Outcome Prediction through Fused Graph Convolutional Networks and Random Forest Algorithm. Entropy 2023, 25, 765. [Google Scholar] [CrossRef]
- Zhong, H.; Wang, M.; Zhang, X. HeMGNN: Heterogeneous Network Embedding Based on a Mixed Graph Neural Network. Electronics 2023, 12, 2124. [Google Scholar] [CrossRef]
- Zeng, Y.; Liao, Z.; Dai, Y.; Wang, R.; Li, X.; Yuan, B. Hybrid intelligence for dynamic job-shop scheduling with deep reinforcement learning and attention mechanism. arXiv 2022, arXiv:2201.00548. [Google Scholar]
- Park, J.; Chun, J.; Kim, S.H.; Kim, Y.; Park, J. Learning to schedule job-shop problems: Representation and policy learning using graph neural network and reinforcement learning. Int. J. Prod. Res. 2021, 59, 3360–3377. [Google Scholar] [CrossRef]
- Chen, S.; Huang, Z.; Guo, H. An End-to-End Deep Learning Method for Dynamic Job Shop Scheduling Problem. Machines 2022, 10, 573. [Google Scholar] [CrossRef]
- Wang, W.; Luo, S. A review of research on intelligent shop floor scheduling strategies based on reinforcement learning. Comput. Appl. Res. 2022, 39, 1608–1614. [Google Scholar] [CrossRef]
- Zhang, Y.; Zhu, H.; Tang, D.; Zhou, T.; Gui, Y. Dynamic job shop scheduling based on deep reinforcement learning for multi-agent manufacturing systems. Robot. Comput.-Integr. Manuf. 2022, 78, 102412. [Google Scholar] [CrossRef]
- Luo, Z.; Jiang, C.; Liu, L.; Zheng, X.; Ma, H. Research on intelligent job shop scheduling method based on deep reinforcement learning. J. Internet Things 2022, 6, 53–64. [Google Scholar]
- Liu, C.L.; Chang, C.C.; Tseng, C.J. Actor-critic deep reinforcement learning for solving job shop scheduling problems. IEEE Access 2020, 8, 71752–71762. [Google Scholar] [CrossRef]
- Wang, H.; Sarker, B.R.; Li, J.; Li, J. Adaptive scheduling for assembly job shop with uncertain assembly times based on dual Q-learning. Int. J. Prod. Res. 2021, 59, 5867–5883. [Google Scholar] [CrossRef]
- Chang, J.; Yu, D.; Hu, Y.; He, W.; Yu, H. Deep reinforcement learning for dynamic flexible job shop scheduling with random job arrival. Processes 2022, 10, 760. [Google Scholar] [CrossRef]
- Wang, Y.; Ding, Y. Dynamic flexible job shop optimal scheduling and decision making method. J. Syst. Simul. 2020, 32, 2073–2083. [Google Scholar] [CrossRef]
Symbol | Meaning Description
---|---
i | Index of jobs, i = 1, 2, 3, …, n
j | Index of operations belonging to jobs
Oij | Operation j of job i
Ci | The finishing time of job i
Di | The deadline of job i
Mij | The machine assigned to operation Oij
Ai | The arrival time of job i
OPij | The processing time of operation Oij
 | The remaining operations of job i
Startij | The starting time of Oij
Stateij | The state of Oij
Wi | The weight of job i
OPn | The entire set of operations of all jobs
StateMij | The state of machine Mij
Hyperparameter | Value
---|---
Epoch | 400
Learning rate | 0.001
Experience pool size | 2000
Q-target update step | 20
Batch size | 10
Q network layers | 5
Graph convolution layers | 2
Discount factor | 0.95
ε-greedy | 0.01–0.6
Activation function | Tanh
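The ε-greedy entry in the table is a range rather than a single value, which suggests an exploration schedule over training. A sketch of one plausible schedule, a linear decay from 0.6 to 0.01 across the 400 epochs (the schedule shape is our assumption; the paper only states the range):

```python
def epsilon(epoch, total=400, eps_start=0.6, eps_end=0.01):
    """Linearly anneal the exploration rate from eps_start to eps_end
    over `total` training epochs, then hold it at eps_end."""
    frac = min(epoch / total, 1.0)
    return eps_start + frac * (eps_end - eps_start)
```

Early epochs then explore broadly, while late epochs almost always follow the greedy action from the online Q-network.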
Dataset | Rule1 | Rule2 | Rule3 | Rule4 | Rule5 | Rule6 | DDQN | GNN + DRL
---|---|---|---|---|---|---|---|---
Bs1 10 × 10 | 8942 | 9918 | 9656 | 8482 | 8482 | 9747 | 8482 | 8482
Bs2 10 × 10 | 7118 | 7081 | 7211 | 7100 | 6893 | 7448 | 6885 | 6885
Bs3 10 × 10 | 7170 | 7594 | 9072 | 7170 | 6726 | 7268 | 6726 | 6726
Bs4 10 × 10 | 5740 | 5647 | 5732 | 5400 | 5640 | 5845 | 5400 | 5400
Bs5 20 × 15 | 61,679 | 61,549 | 73,736 | 58,476 | 60,784 | 72,205 | 53,946 | 53,913
Bs6 20 × 15 | 47,322 | 46,992 | 47,645 | 48,128 | 45,028 | 51,541 | 43,786 | 42,970
Bs7 20 × 15 | 40,484 | 36,944 | 41,448 | 36,292 | 34,938 | 41,160 | 34,269 | 34,011
Bs8 20 × 15 | 62,649 | 61,156 | 62,272 | 60,803 | 61,800 | 63,624 | 59,565 | 59,543
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Yang, Z.; Bi, L.; Jiao, X. Combining Reinforcement Learning Algorithms with Graph Neural Networks to Solve Dynamic Job Shop Scheduling Problems. Processes 2023, 11, 1571. https://doi.org/10.3390/pr11051571