A Self-Adaptive Vibration Reduction Method Based on Deep Deterministic Policy Gradient (DDPG) Reinforcement Learning Algorithm
Abstract
1. Introduction
2. Principle of Active Vibration Isolation
2.1. The Principle of Active Vibration Control
2.2. FB and FF Control Theory
3. MATLAB and ADAMS Co-Simulation
3.1. The Single DOF Model of AVRS in the Vertical Direction
3.2. DDPG Reinforcement Learning Algorithm
3.3. Simulation of DDPG Algorithm in Active Vibration Control
3.3.1. The Structure of the Neural Network
3.3.2. Reward Function
3.4. The Results of Simulation
4. Experiments
4.1. Reward Function Optimization
4.2. Experimental Setup
4.3. Experimental Results
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Acknowledgments
Conflicts of Interest
References
Parameter | Value | Parameter | Value |
---|---|---|---|
MATLAB | | | |
Solver | ode4 | Type | Fixed-step |
Episode time | 2 s | Fixed-step size | 0.001 s |
ADAMS Model | | | |
ADAMS Solver type | Fortran | Simulation mode | Continuous |
Animation mode | batch | Communication interval | 0.001 s |
DDPG | | | |
TargetSmoothFactor | 1 × 10⁻³ | Agent sample time | 0.001 s |
MaxEpisodes | 10,000 | MaxStepsPerEpisode | 20 |
MiniBatchSize | 64 | DiscountFactor | 0.998 |

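
Two of the DDPG settings listed above can be read more concretely with a short calculation. The sketch below assumes that TargetSmoothFactor plays the role of τ in the standard DDPG soft target update, θ′ ← τθ + (1 − τ)θ′, and that 1/(1 − γ) is an acceptable rough measure of how far ahead the discounted return looks. The weight lists and function names are hypothetical and only illustrate the arithmetic; they are not taken from the article's implementation.

```python
TAU = 1e-3          # TargetSmoothFactor from the table
GAMMA = 0.998       # DiscountFactor from the table
SAMPLE_TIME = 1e-3  # agent sample time in seconds, from the table


def soft_update(target, online, tau=TAU):
    """Blend the online weights into the target weights (Polyak averaging)."""
    return [(1.0 - tau) * t + tau * w for t, w in zip(target, online)]


def effective_horizon_steps(gamma=GAMMA):
    """Rough number of steps over which discounted rewards still matter."""
    return 1.0 / (1.0 - gamma)


# Hypothetical flat weight vectors, used only to exercise the update.
online_weights = [1.0, 1.0, 1.0]
target_weights = [0.0, 0.0, 0.0]

target_weights = soft_update(target_weights, online_weights)
horizon_steps = effective_horizon_steps()
horizon_seconds = horizon_steps * SAMPLE_TIME

print(target_weights)
print(horizon_steps, horizon_seconds)
```

With these values each update moves the target network only 0.1% of the way toward the online network, and the discount factor corresponds to a look-ahead of roughly 500 agent steps, or about 0.5 s at the listed sample time.
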
Components | Symbol | Parameter | Value |
---|---|---|---|
Payload | | Mass | 1750 kg |
Helical spring | | Stiffness | 300,000 N/m |
 | | Damping | 490 N·s/m |
Voice-coil motor | KT | Force constant | 80 N/A |
 | RM | Resistance | 34.5 Ω |
 | LM | Inductance | 28 mH |
 | | Moving distance | 28 mm |
 | | Frequency range | ≤200 Hz |
Geophone | | Natural frequency | 4.5 Hz |
 | | Sensitivity | 100.4 V/(m/s) |
 | | Internal resistance | 2450 Ω |

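
As a rough check on the passive behaviour implied by the table, the sketch below treats the payload, helical spring, and damping values as the mass, stiffness, and damping of a single-DOF mass-spring-damper (an assumption based on the section title for the vertical model, not a result reported in the article). It computes the undamped natural frequency, the damping ratio, and the textbook base-excitation transmissibility. The variable and function names are illustrative.

```python
import math

M = 1750.0     # payload mass, kg (from the table)
K = 300_000.0  # helical spring stiffness, N/m (from the table)
C = 490.0      # damping, N*s/m (from the table)

omega_n = math.sqrt(K / M)            # undamped natural frequency, rad/s
f_n = omega_n / (2.0 * math.pi)       # undamped natural frequency, Hz
zeta = C / (2.0 * math.sqrt(K * M))   # damping ratio, dimensionless


def transmissibility(freq_hz):
    """Passive base-to-payload transmissibility of a single-DOF isolator."""
    r = freq_hz / f_n
    num = 1.0 + (2.0 * zeta * r) ** 2
    den = (1.0 - r ** 2) ** 2 + (2.0 * zeta * r) ** 2
    return math.sqrt(num / den)


print(f"f_n  = {f_n:.3f} Hz")
print(f"zeta = {zeta:.4f}")
print(f"T at resonance = {transmissibility(f_n):.1f}")
```

Under these assumptions the passive mount resonates near 2.08 Hz with a damping ratio of about 0.011, so it strongly amplifies floor vibration around resonance and only begins to isolate above √2 times that frequency. This is the low-frequency region where an active controller is expected to contribute most.
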
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Jin, X.; Ma, H.; Tang, J.; Kang, Y. A Self-Adaptive Vibration Reduction Method Based on Deep Deterministic Policy Gradient (DDPG) Reinforcement Learning Algorithm. Appl. Sci. 2022, 12, 9703. https://doi.org/10.3390/app12199703