Manipulating XXY Planar Platform Positioning Accuracy by Computer Vision Based on Reinforcement Learning
Abstract
1. Introduction
2. RL Method
2.1. RL Fundamentals
2.2. Q-Learning
2.3. Deep Q-Network
3. XXY Visual Feedback Control System
3.1. XXY Platform Hardware
3.2. Vision for the XXY Motion Stage
3.3. XXY Stage Controller
4. Experimental Methodology
4.1. Experimental Setup
4.2. State Design of the DQN Model
4.3. Action Design of the DQN Model
4.4. Reward Design of the DQN Model
4.5. Neural Network Design of the DQN Model
5. Simulation and Experimental Results for the Model-Free DQN Model
5.1. DQN Training
5.2. Experimental Validation of the Results of the DQN Model
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
Position (μm) | Actual Move (μm) | Error (μm) |
---|---|---|
0~100 | 100 | 0 |
100~200 | 94 | −6 |
200~300 | 101 | 1 |
300~400 | 101 | 1 |
… | … | … |
100~0 | −97 | 3 |
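The error column above is consistent with error = actual move − commanded move, where the commanded move is the difference between the two endpoints of the position range. A minimal sketch of that arithmetic follows, assuming all quantities are in μm; the function name `positioning_error` is hypothetical and does not come from the source.

```python
def positioning_error(start_um, end_um, actual_move_um):
    """Return actual minus commanded displacement, in micrometres."""
    commanded_move_um = end_um - start_um
    return actual_move_um - commanded_move_um


# Rows copied from the table above: (range start, range end, actual move).
rows = [
    (0, 100, 100),
    (100, 200, 94),
    (200, 300, 101),
    (300, 400, 101),
    (100, 0, -97),
]

errors = [positioning_error(*row) for row in rows]
print(errors)  # [0, -6, 1, 1, 3]
```

The sign convention means a negative error is an undershoot on a forward move, as in the 100~200 row.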

Action | Command Issued |
---|---|
Action 1 | Up command |
Action 2 | Down command |
Action 3 | Hold position |
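The three-action design above can be sketched as a discrete action set indexed by the network output. This is an illustration only: the `Action` enum, the zero-based indices, and the `greedy_action` helper are assumptions, not the authors' implementation.

```python
from enum import Enum


class Action(Enum):
    UP = 0    # Action 1: up command
    DOWN = 1  # Action 2: down command
    HOLD = 2  # Action 3: hold


def greedy_action(q_values):
    """Pick the action with the largest Q-value (ties go to the lowest index)."""
    if len(q_values) != len(Action):
        raise ValueError("expected one Q-value per action")
    best_index = max(range(len(q_values)), key=lambda i: q_values[i])
    return Action(best_index)
```

For example, `greedy_action([0.1, 0.7, 0.2])` returns `Action.DOWN`, since the second Q-value is the largest.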

Parameter | Value |
---|---|
Initial Value | 1000 |
Memory Size | 3000 |
Batch Size | 12 |
[parameter name missing] | 0.8 |
[parameter name missing] | 0.01 |
[parameter name missing] | 0.999 |
Learning Rate α | 0.9 |
Discount Factor η (Gamma) | 0.95 |
State_size | 10 |
Skip | 1 |
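To show how the named values in the table might be used, here is a minimal sketch of a bounded replay memory and the standard Q-learning update. It assumes the learning rate α and the discount factor enter the textbook update rule; the table itself does not say how they are applied. All function and constant names are hypothetical, and the three values whose parameter names are missing from the table are deliberately left out.

```python
import random
from collections import deque

# Values taken from the parameter table above.
MEMORY_SIZE = 3000
BATCH_SIZE = 12
LEARNING_RATE = 0.9     # alpha
DISCOUNT_FACTOR = 0.95  # gamma

# Bounded buffer: once full, the oldest transition is dropped automatically.
replay_memory = deque(maxlen=MEMORY_SIZE)


def remember(state, action, reward, next_state, done):
    """Store one transition in the replay memory."""
    replay_memory.append((state, action, reward, next_state, done))


def sample_batch():
    """Return BATCH_SIZE random transitions, or [] if too few are stored."""
    if len(replay_memory) < BATCH_SIZE:
        return []
    return random.sample(list(replay_memory), BATCH_SIZE)


def q_learning_update(q_old, reward, max_q_next, done):
    """Apply Q <- Q + alpha * (target - Q) for a single state-action pair."""
    target = reward if done else reward + DISCOUNT_FACTOR * max_q_next
    return q_old + LEARNING_RATE * (target - q_old)
```

In a DQN the update is applied through a gradient step on the network rather than directly on a table entry, so `q_learning_update` only illustrates how the target is formed from the reward and the discounted next-state value.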

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Huang, Y.-C.; Chan, Y.-C. Manipulating XXY Planar Platform Positioning Accuracy by Computer Vision Based on Reinforcement Learning. Sensors 2023, 23, 3027. https://doi.org/10.3390/s23063027