Cyclic Air Braking Strategy for Heavy Haul Trains on Long Downhill Sections Based on Q-Learning Algorithm
Abstract
1. Introduction
- (1)
- A heavy haul train model with operational constraints was constructed, accounting for vehicle characteristics on the long, steep slopes of railway lines and for heavy haul trains equipped with traditional pneumatic braking systems. With safe and efficient train operation as the optimization objectives, a Q-learning-based cyclic braking strategy for heavy haul trains on long downhill sections was developed under constraints such as interval speed limits and air-refilling time.
- (2)
- Simulations and experiments were conducted under actual heavy haul train operating conditions, and the results were compared across different parameter settings and ramp speeds. The results showed that the proposed intelligent control strategy performs well in various scenarios, demonstrating its effectiveness and practicality for train braking.
2. Heavy Haul Train Model
2.1. Dynamics Model
2.2. Running Constraints
2.3. Performance Indicators
- Safety: Safety is a prerequisite for train operation. The running speed of a heavy haul train must remain below the upper speed limit and must not fall below the minimum release speed of the air brake. The safety indicator K is defined to indicate whether the train’s speed remains within these limits.
- Air-braking distance: Prolonged engagement of the air brake causes excessive wear from the friction between the wheels and brake shoes, and replacing worn air brake equipment increases maintenance costs. Reducing the air-braking distance during operation therefore reduces maintenance costs. Accordingly, the air brake distance of a heavy haul train is defined as
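The two performance indicators above can be sketched in code. The following is a minimal illustration, not the paper's implementation; the argument names and the discretized trajectory representation are assumptions:

```python
def performance_indicators(speeds, positions, braking, v_min, v_max):
    """Evaluate one simulated run of the train.

    speeds[i]    -- train speed (km/h) at step i
    positions[i] -- cumulative distance travelled (m) at step i
    braking[i]   -- True while the air brake is engaged on step i
    """
    # Safety indicator K: 1 if the speed stays within the limits
    # at every step, 0 otherwise.
    K = 1 if all(v_min <= v <= v_max for v in speeds) else 0

    # Air-braking distance: total distance covered while the air
    # brake is engaged (minimized to reduce brake-shoe wear).
    d_brake = sum(
        positions[i + 1] - positions[i]
        for i in range(len(positions) - 1)
        if braking[i]
    )
    return K, d_brake
```

A run that ever leaves the speed band gets K = 0, so K acts as a hard safety check while the air-braking distance is the quantity to be minimized.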
3. Algorithm Design
3.1. Markov Decision Process
3.2. Q-Learning Algorithm
- (1)
- Randomly initialize Q(s, a) for all state–action pairs.
- (2)
- According to the ε-greedy policy π and the current state s, select action a from the Q-table. Execute action a; obtain the reward r by interacting with the environment and observe the next state. Update the Q-table and transition s → s’; continue until the termination state is reached.
- (3)
- By following this procedure, after multiple iterations, the optimal policy and the optimal state–action value function can both be obtained.
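The update applied in step (2) is the standard tabular Q-learning rule; written with the learning rate λ and discount rate γ used as hyperparameters later in the paper, it is:

```latex
Q(s, a) \leftarrow Q(s, a) + \lambda \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
```

The max over the next state's actions makes the update off-policy: the Q-table converges toward the optimal value function even while actions are chosen ε-greedily.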
3.2.1. Policy Design
3.2.2. Reward Function Design
Algorithm 1: The Q-learning-based control strategy for cyclic air braking of the heavy haul train.
///Initialization///
1: Initialize Q function Q(s, a) randomly.
///Training process///
2: for episode = 1, …, M do
3:     Initialize the state of the train.
4:     for k = 0, 1, …, N − 1 do
5:         Select action a according to ε-greedy policy π.
6:         Perform action a; receive reward r and the next state of the train.
7:         Update the Q-table through the Q-learning update equation.
8:         Update the state of the train, s ← s’.
9:     end for
10: end for
11: Output the well-trained Q-table.
///Online control process///
12: Initialize the state of the train.
13: for k = 0, 1, …, N − 1 do
14:     Select action a greedily according to the trained Q-table.
15:     Perform action a and obtain the next state s’.
16: end for
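The training process of Algorithm 1 can be sketched as a short tabular Q-learning loop. This is an illustrative sketch, not the paper's code: the `env` object, its `reset`/`step` interface, the binary action set (air brake released/engaged), and the linear ε decay are all assumptions standing in for the train dynamics model:

```python
import random
from collections import defaultdict

def train_q_table(env, n_episodes=1000, n_steps=100,
                  lam=0.001, gamma=0.95, eps=0.98,
                  eps_final=0.1, actions=(0, 1)):
    """Tabular Q-learning following the structure of Algorithm 1.

    `env` must provide reset() -> state and
    step(state, action) -> (reward, next_state).
    """
    Q = defaultdict(float)  # Q(s, a), implicitly initialized to 0
    decay = (eps - eps_final) / n_episodes
    for _ in range(n_episodes):
        s = env.reset()
        for _ in range(n_steps):
            # epsilon-greedy action selection (policy pi)
            if random.random() < eps:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda x: Q[(s, x)])
            r, s2 = env.step(s, a)
            # Q-learning update with learning rate lam, discount gamma
            best_next = max(Q[(s2, x)] for x in actions)
            Q[(s, a)] += lam * (r + gamma * best_next - Q[(s, a)])
            s = s2  # s <- s'
        eps = max(eps_final, eps - decay)  # anneal exploration
    return Q
```

Online control (lines 12–16 of Algorithm 1) then simply selects `max(actions, key=lambda x: Q[(s, x)])` at each step, with no further exploration or updates.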
4. Algorithm Simulation and Analysis
4.1. Experimental Parameter Settings
4.2. Simulation Experiment Verification
4.2.1. Model Training Process
4.2.2. Effectiveness Testing of Practical Application
4.2.3. Performance Comparison Experiment
5. Conclusions
- (1)
- For heavy haul trains running on long and steep downhill sections, a multi-objective optimization model under multiple constraint conditions was constructed, and a Q-learning algorithm with a finite Q-table was introduced. The train states were discretized, and an HXD1 locomotive with C80 freight cars was taken as the study object for simulation verification. The proposed method enabled the train to adapt to complex operating environments and route conditions. By tuning the Q-learning hyperparameters, the convergence speed of the algorithm was improved while ensuring safe train operation.
- (2)
- To validate the performance of the proposed Q-learning algorithm, comparative experiments were conducted under different parameter conditions. The experimental results demonstrated that the proposed Q-learning algorithm exhibits a stable optimization performance and effectively generates train speed profiles that satisfy constraints, providing a valuable reference for the intelligent assisted driving of heavy haul trains on long downhill sections.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Symbol | Description
---|---
M | Sum of the masses of all carriages
| Acceleration of the heavy haul train
R | Curve radius
| Tunnel length
| Output electric brake force
| Output air brake force
| Maximum electric brake force
| Air brake force
| Binary variable of air braking
| Relative output ratio of the electric brake force
| Resistance of train
g | Gravity acceleration
| Running resistance constant
| Air brake distance
| Running speed of train
| Minimum release speed of air brake
| Time point of engaging air brake in the (j+1)th cycle
| Time point of releasing air brake in the jth cycle
| Gradient of the track on which the train is running
| Upper limit of train running speed
Parameter | Locomotive | Freight Car
---|---|---
Model | HXD1 | C80
Mass | 200 t | 100 t
Length | 35.2 m | 13.2 m
Distance (m) | Gradient (-‰) | Distance (m) | Gradient (-‰) |
---|---|---|---|
0–1000 | 1.5 | 12,430–14,080 | 10.5 |
1000–1400 | 7.5 | 14,080–16,330 | 11.4 |
1400–6200 | 10.9 | 16,330–19,130 | 10.6 |
6200–6750 | 9 | 19,130–20,000 | 10.9 |
6750–12,430 | 11.3 |
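The gradient profile in the table above is piecewise constant, so it can be represented directly as a lookup over distance. A minimal sketch using the table's own values (function and variable names are illustrative):

```python
# Track profile from the gradient table:
# (start m, end m, downhill gradient in per mille)
SEGMENTS = [
    (0, 1000, 1.5), (1000, 1400, 7.5), (1400, 6200, 10.9),
    (6200, 6750, 9.0), (6750, 12430, 11.3), (12430, 14080, 10.5),
    (14080, 16330, 11.4), (16330, 19130, 10.6), (19130, 20000, 10.9),
]

def gradient_at(x):
    """Downhill gradient (per mille) at track position x in metres."""
    for start, end, g in SEGMENTS:
        if start <= x < end:
            return g
    raise ValueError(f"position {x} m is outside the 0-20,000 m section")
```

A simulation step at position x then reads its grade resistance term from `gradient_at(x)`; the segments cover the full 20,000 m section with no gaps.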
Parameter | Value | Parameter | Value |
---|---|---|---|
Maximum training episode M | 100,000 | Minimum air-refilling time | 50 s |
Discount rate γ | 0.95 | Learning rate λ | 0.001 |
Initial value of ε | 0.98 | Final value of ε | 0.1 |
Positive reward | 5 | Negative reward | −50 |
Minimum braking speed | 30 km/h | Maximum braking speed | 80 km/h |
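The table gives only the initial (0.98) and final (0.1) values of ε; the decay schedule between them is not specified, so a linear anneal over the training episodes is one plausible sketch (the schedule shape is an assumption):

```python
def epsilon_schedule(episode, n_episodes=100_000,
                     eps_init=0.98, eps_final=0.1):
    """Linearly anneal epsilon from eps_init to eps_final
    over n_episodes training episodes."""
    frac = min(episode / n_episodes, 1.0)  # clamp after training ends
    return eps_init + frac * (eps_final - eps_init)
```

Early episodes thus explore almost at random, while late episodes act nearly greedily on the learned Q-table.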
V0/(km/h) | Safety Indicator K | Air Braking Distance/m | Planned Running Time/s | Actual Running Time/s | Average Speed/(km/h)
---|---|---|---|---|---
30 | 1 | 9843.6 | 1000 | 1074.6 | 67
40 | 1 | 10,181.3 | 1000 | 1014 | 71
50 | 1 | 10,547.4 | 1000 | 993.2 | 72
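The average speeds in the table are consistent with the 20,000 m section length (from the gradient table) divided by the actual running time. A quick sanity check, assuming that is indeed how the column was computed:

```python
TRACK_LENGTH_M = 20_000  # total section length from the gradient table

def average_speed_kmh(actual_time_s):
    """Average speed over the full section, converted m/s -> km/h."""
    return TRACK_LENGTH_M / actual_time_s * 3.6

# Actual running times from the results table, one per initial speed V0
for t in (1074.6, 1014, 993.2):
    print(round(average_speed_kmh(t)))  # prints 67, 71, 72
```

All three rounded values match the table's Average Speed column.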
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhang, C.; Zhou, S.; He, J.; Jia, L. Cyclic Air Braking Strategy for Heavy Haul Trains on Long Downhill Sections Based on Q-Learning Algorithm. Information 2024, 15, 271. https://doi.org/10.3390/info15050271