Optimization of Bulk Cargo Terminal Unloading and Outbound Operations Based on a Deep Reinforcement Learning Framework

Li, Haijiang; Zhao, Jiapeng; Jia, Peng; Ou, Hongdong; Zhao, Weili

doi:10.3390/jmse13010105


by Haijiang Li ^1,2,*, Jiapeng Zhao ^1,2, Peng Jia ^1,2, Hongdong Ou ^1,2 and Weili Zhao ^3

^1 School of Maritime Economics and Management, Dalian Maritime University, Dalian 116026, China
^2 Collaborative Innovation Centre for Transport Study, Dalian Maritime University, Dalian 116026, China
^3 Qingdao Port International Company Limited Qiangang Branch, Qingdao 266011, China
^* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2025, 13(1), 105; https://doi.org/10.3390/jmse13010105

Submission received: 27 December 2024 / Revised: 4 January 2025 / Accepted: 6 January 2025 / Published: 8 January 2025

(This article belongs to the Special Issue Smart Seaport and Maritime Transport Management)


Abstract

This study addresses the integrated scheduling problem of dry bulk cargo terminal yards, which includes three components: transportation planning, yard selection optimization, and equipment scheduling. Additionally, the research integrates safety considerations and addresses the complexities of dynamic transportation planning. This work presents two innovations. Firstly, this study develops a sophisticated modeling framework that integrates graph structures for precise yard mapping with mixed-integer programming to enforce operational constraints. This integrated approach facilitates a more accurate and comprehensive representation of yard operations, capturing diverse operational aspects while maintaining model clarity and computational efficiency. Secondly, this study proposes an advanced solution methodology that employs a reinforcement learning technique integrating a Dueling Deep Q-Network and Double Deep Q-Network. This hybrid algorithm significantly enhances optimization performance and accelerates the learning process, thereby improving the efficiency of the solutions. The experimental results demonstrate that the proposed model effectively manages the integrated scheduling of bulk material ingress, storage, and egress within the yard. The operational plans generated by the approach outperform traditional first-come, first-served strategies, showcasing substantial improvements in port operational efficiency and reliability. This comprehensive solution underscores the potential for significant advancements in the overall management and performance of dry bulk cargo ports.

Keywords:

deep reinforcement learning; dry bulk freight yard; mixed-integer programming; operational process planning

1. Introduction

Major bulk commodities such as coal, ore, and grain are transported globally via the dry bulk market, a crucial part of the logistics sector. Dry bulk makes up about 40 percent of the world’s marine trade, and 35 to 40 percent of all maritime cargo is transported in this manner. Demand for dry bulk is still expanding: according to Clarkson’s data, as of the end of May 2024 the worldwide dry bulk shipping market comprised 13,727 ships with a combined capacity of 1.015 billion tons. According to key data from S&P Global Platts, the volume of dry bulk maritime exports worldwide in 2023 was over 5.137 billion tons, a 3.9% year-on-year rise. The United Nations Conference on Trade and Development estimates that between 30 and 40 percent of the rise in international maritime transportation can be attributed to this expansion.

Despite facing a modest economic slowdown, China’s domestic stockpile of bulk industrial finished products has decreased to unusually low levels. Concurrently, demand from neighboring countries continues to expand. When combined with favorable governmental incentives, these elements have fueled a sustained expansion in China’s dry bulk transportation volumes. According to the “China Port Operation Analysis Report (2024)”, coastal ports around the nation handled 13.17 billion tons of cargo in 2023, showing a 6.6% rise compared to the previous year. Dry bulk freight formed nearly 84% of the overall throughput, highlighting its prominent significance. These data show the crucial relevance of dry bulk transportation in China’s port operations and marine logistics, particularly in the context of international trade and cargo processing inside coastal ports.

Similar to container ports, dry bulk port loading and unloading activities entail the storage of bulk cargo and sea–land transshipment. Dry bulk transportation, on the other hand, is less standardized than container transportation; bulk material is transported directly in primary forms such as lumps, grains, and powders [1]. Bulk freight transportation is challenged by the broad range of dry bulk commodities and the considerable variation in their physical and chemical properties.

In the current bulk carrier unloading process, work efficiency is hindered due to limited yard capacity and the shortcomings of manually generated allocation plans. The unloading-to-shipment process can be summarized as follows: after the vessel arrives at the port, the unloading machine is used to discharge the cargo, and the conveyor system is selected to transport the cargo to pre-assigned positions in the yard. When necessary, auxiliary mobile equipment is used, and transport routes are selected in advance. Based on the assigned warehouse area, cargo is transported through the conveyor system or mobile equipment, and loading operations are carried out using loading platforms, loaders, or ship loaders to complete the delivery.

There are several deficiencies throughout the process. The current operation plan is mainly generated manually, which is slow, taking about 30 min per dispatch. During the scheduling process, the focus is usually on operational efficiency, with less consideration given to energy saving. Moreover, manually developed plans tend to be crude. For example, when selecting the yard, priority is often given to the nearest storage location, and deeper issues such as potential mechanical path conflicts and safety concerns are rarely considered. As a result, frequent modifications to the plan are necessary during actual operations, leading to scheduling confusion [2].

This study focuses on optimizing the entire process, from the start of bulk cargo unloading when a vessel arrives at port until the end of the departure operation. Firstly, it handles the issue of yard berth allocation by taking into account each berth’s remaining loading capacity, the vessel’s entire unloading volume, and the ease with which outgoing operations can be completed. The purpose is to optimize berth allocation and create berth assignment plans while reducing overturning and restacking operations caused by cargo distribution limits in the yard. Following that, the study focuses on conveyor flow optimization. Based on the berth allocation plan, it chooses and optimizes conveyor flows while taking into account conveyor transporter carrying capacity, operational energy consumption, and safety. These two primary optimization concerns in bulk cargo vessel unloading and outbound operations seek to optimize the bulk cargo unloading-to-shipment process, increasing operational efficiency while decreasing safety hazards and energy consumption.

To sum up, our research has produced the following contributions:

(1): Graph-theoretic mapping mechanism. Utilizing graph theory, we created an operational graph model for dry bulk terminals and provided a mapping approach to capture the physical aspects of dry bulk cargo yards. By making it simple to graphically show production elements including yards, stationary machinery, and mobile machinery, this model presents a unique solution to the collaborative operation challenge in dry bulk cargo yards.
(2): Integrated optimization in the mathematical model. This work proposes a mathematical model for operational process optimization that carefully examines elements including mechanical power, operation time, transportation safety, and material amount in order to accomplish energy conservation, pollution reduction, and increased operational efficiency. Based on the graph structure of dry bulk shipping yards, the model blends resource allocation, process optimization, and yard scheduling to obtain a high degree of energy efficiency and synergy across production operations.
(3): Advanced solution algorithm. We present the Dueling Double Deep Q-Network (DQN) technique, which combines the advantages of separate value and advantage functions, to filter states and reduce overestimation in the target network. This technique enables the more precise selection of iterative states and actions.
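
The two ingredients named in contribution (3) can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names `dueling_q` and `double_dqn_target`, the discount factor, and all numeric values are hypothetical. The dueling part combines a state value with per-action advantages (using the common mean-subtraction form), and the Double DQN part lets the online network choose the next action while the target network evaluates it, which is the usual way to reduce overestimation.

```python
def dueling_q(value, advantages):
    """Combine a state value V(s) and advantages A(s, a) into Q(s, a).

    Uses the mean-subtracted form Q = V + (A - mean(A)).
    """
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN target: the online network selects the next action,
    the target network supplies the value of that action."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical numbers for one transition with three candidate actions.
q_online_next = dueling_q(value=2.0, advantages=[0.5, 1.5, -2.0])
q_target_next = dueling_q(value=1.8, advantages=[1.0, 0.2, -1.2])

target = double_dqn_target(reward=1.0, gamma=0.9,
                           q_online_next=q_online_next,
                           q_target_next=q_target_next, done=False)

# For comparison: a plain DQN target takes the max over the target network.
plain_target = 1.0 + 0.9 * max(q_target_next)
print(target, plain_target)
```

In this toy transition the Double DQN target is lower than the plain DQN target, because the action preferred by the online network is not the one the target network scores highest.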

The structure of this paper is as follows: Section 2 reviews the related literature. Section 3 describes the operational processes of dry bulk terminals, along with the present issues and possible solutions. Section 4 develops the objective functions, sub-models for operational process parameters, and the yard structure model, anchored in the mathematical foundations of operations research. Section 5 discusses the solution strategy employing deep reinforcement learning. An empirical validation case study based on Qingdao Port is offered in Section 6. Finally, the results are analyzed, and prospective avenues for further research are given.

2. Literature Review

In recent years, researchers have made significant advancements in the cooperative scheduling of dry bulk freight operations. However, most studies have predominantly focused on aspects such as berth scheduling at dry bulk ports, with scant attention given to dry bulk yard scheduling. In contrast, research on container yards has concentrated on optimizing yard routes and selecting storage slots while considering the operational time, energy efficiency, and safety. Therefore, the insights gained from studies on container terminals can be instrumental in the development of efficient dry bulk yards.

2.1. Yard Structure Modeling

Yard structure modeling entails separating the yard into multiple parts and representing them using mathematical mappings. This method makes it easier to explain the physical structure of the yard mathematically. The majority of current research uses graph-theoretic modeling techniques. For example, Wu et al. [3] used adjacency matrices to indicate the accessibility between operational points and directed graphs to depict different yard elements based on the yard conditions of the coal–water intermodal storage and distribution base in Jing Zhou, Hubei. Robert et al. [4] suggested a yard structure modeling method to optimize scheduling for stacking and reclaiming operations. They modeled each stockpile as a collection of heterogeneous blocks of varied sizes, accurately expressing the stockpiles’ geometric shapes and variations. This model improves the quality of scheduling plans in the DBET scheduling engine, resulting in more accurate and efficient solutions. Although the relationship between mobile machinery, belt conveyors, and cargo cannot be adequately described, some studies apply mathematical restrictions to limit structural features.

Formula-based constraint solutions are more suitable for mathematical expression and algorithmic implementation compared to graph-theoretic approaches, as they enable an intuitive representation of the connections between different yard components. Mathematical formulas facilitate the articulation of arbitrary connections, thereby providing comprehensive coverage of various interactions. To achieve a mathematical model representation of terminal structures, Ozgur [5] employed constraint programming, which imposed constraints on the relationships among inventory, yards, and ships. In the context of assigning yard space for export bulk cargo, Assia et al. [6] digitized the storage of bulk goods, yard partitioning, and slot occupancy. They utilized the operational time as the objective function to develop a storage strategy that minimizes the total time, employing mixed-integer linear programming techniques and representing the allocation of bulk materials through Gantt charts and mathematical formulas. Kong et al. [7] optimized the yard layout design for vertical stacking configurations by introducing an operating cycle time formula. Building on this formula, they developed an analytical model based on queuing theory to optimize the yard layout design parameters, achieving cost savings and increased efficiency.

Operational process planning in yards relies significantly on tracking the spatial changes in dynamic elements, typically represented through arrays or route mapping that incorporate time, speed, and physical attributes. To advance the construction of port structures, Alan et al. [8] utilized coordinate systems to depict the precise locations of various elements within the terminal operation system and quantified physical properties with labels, forming arrays for berth scheduling studies with established yard structures. To facilitate heuristic algorithms in yard path planning and storage solutions, Ren-Qian et al. [9] introduced the concept of a Dynamic Container Port (DCP), representing the yard structure through S-shaped routes and indexing storage locations. While this approach effectively captures the dynamic changes within the yard, it presents certain drawbacks compared to static representations using formulas or graph theory, such as challenges with model interpretability and a greater potential for error correction issues.

When distributing storage space in bulk material yards, Defeng et al. [10] handled each material pad space as a sequence of unit slots to prevent the creation of dispersed, tiny areas that can cause mistakes in model decomposition and solving. They created a brand-new mixed-integer programming (MIP) formulation that clearly illustrates the yard space’s structure. This approach, which emphasizes the integration of dynamic and static aspects, closely resembles the real operating conditions of the majority of dry bulk port yards. It has benefits over other approaches in terms of improving model breakdown and streamlining solution procedures. Consequently, the constraint criteria for the structural model of dry bulk yards will be based on this modeling approach. Nevertheless, this approach has drawbacks, including intricate representation and very frequent element changes. We can improve later model design and algorithm solving by using a graph model to depict the relationships between items. In order to capture element properties, this study integrates the previously described approaches, using a graph model to develop the fundamental structure and a mixed-integer programming methodology to design the yard operating process model.

2.2. Port Yard Operation Scheduling Models

In the yard operation process, it is critical to consider factors such as the physical characteristics of belt conveyors, the connectivity between different stacks within the yard and key locations such as ports and stations, and the physical properties and operational contexts of internal transportation machinery [3]. To improve and widen the mathematical model’s applicability, past studies must be consulted to precisely determine influential elements and provide objective functions for the model.

Different methodologies have been employed by researchers to identify model elements and objective functions. Xin et al. [11] incorporated container allocation and crane deployment into a mixed-integer programming model. By branching based on paths rather than decision variables, the study boosted resource utilization while simultaneously improving the overall operational efficiency. Tan et al. [12] selected mixed-integer programming models according to demand while developing sub-models. Prior to elaborating two model components (container transportation distance and yard space utilization rate), they took the space allocation of container transshipment airports as the objective function. In order to finish creating the model, Kang et al. [13] studied the operation process, developed a number of decision variables, and then defined the objective function to reduce the overall duration of all ships arriving at the port. The bulk cargo port environment’s berth and yard allocation challenges were examined by Tomáš et al. [14]. They suggested a metaheuristic strategy based on branch-and-price algorithms and critical shaking neighborhood search to identify ship scheduling schemes by minimizing the total trip distance between the sites of the cargo storage yards and the designated berthing positions of ships. Shu et al. [15] tackled the issue of energy conservation and emissions reduction in the shipping industry by developing an AIS data processing system. They discovered that departing ships have better tidal entry capabilities than arriving ships, which contributes significantly to the green development of shipping.

Based on the fundamental operations of ports, De et al. [16] examined the relationships between operational systems and developed a mixed-integer programming model that simplifies the operational costs of three subsystems (path optimization, yard selection, and ship loading and unloading) into a single mathematical formula that can be solved using column generation heuristic algorithms. Huang et al. [17] addressed the yard allocation problem in automated container terminals and proposed a two-stage mathematical model with bi-objective optimization. This approach successfully assigns containers to specified places, boosting the feasibility and efficiency of the solution process.

Ship port time, transportation machinery trip distance, and delay penalty cost were the optimization goals set by Ning et al. [18]. They built a multi-objective collaborative scheduling optimization model and applied coding and multi-objective particle swarm optimization (MOPSO) techniques to solve it by constraining variables like the ship draft depth and journey distance of transportation machinery. Shu et al. [19] developed a cost function based on multi-objective optimization, suggested a ship path planning model based on optimum control, then solved and simulated the model using dynamic programming and numerical functions. This had the highly applicable impact of improving navigation efficiency and safety. Wencheng et al. [20] focused their research on container yard management challenges. They designed a two-stage stochastic optimization yard model that comprises storage space allocation, yard gantry crane deployment, and container allocation. The solution greatly increases the overall operating performance. The berth allocation and material handling challenge were split down into two smaller concerns by Saurabh et al. [21]: berth allocation and dynamic allocation of unloaders at distinct docks. To make the problem-solving process easier, they devised a two-stage multi-objective optimization model. In order to optimize ship routes in ice-covered waters, Shu et al. [22] took variables like cost and navigation efficiency into account, created an icebreaker route planning model, converted this model into a multi-objective optimization problem based on the optimal control model, and integrated dynamic programming and numerical simulation for a multi-solution. Xu [23] evaluated four container yard layouts by building mathematical models for time analysis and energy usage. 
They created time equations, logged operation times at various phases as parameters, and built a complete simulation model that included everything from identifying resource groups for operational operations to calculating out non-value-added costs. After that, they compared the unit time costs of each plan. Golsa et al. [24] developed a mixed-integer programming solution that includes storage space allocation, berth allocation, and yard crane deployment in a single framework. Based on this model, a Benders decomposition technique was designed to speed up the solution process by adding valid inequalities. The resulting solutions greatly cut yard operational expenses. Beng et al. [25] proposed a dynamic method that divides the container retrieval process within the yard into multiple time intervals and dynamically reserves yard slots within each interval in order to address issues like dynamic container arrival patterns, operational conflicts, and the efficient utilization of yard resources. Operational conflicts are successfully decreased during the handling procedure with this strategy.

By splitting down the overall process into operational subprocesses, hybrid operation modeling enables several operational processes to share yard aspects; for instance, a belt conveyor that engages in transportation, warehousing, and ex-warehousing processes. This model, which focuses on individual operations and generally uses mixed-integer programming models while taking objective function optimization into account, closely reflects the actual operational situation of dry bulk ports. It has advantages over other approaches in terms of precise modeling and the simplicity of the solution. It is currently the most extensively utilized research methodology, and this work uses it as its modeling strategy.

2.3. Solution Methods

There are various approaches to solving models, and picking one that suits the mathematical model is vital to obtaining the best results. Intelligent algorithms and simulations are the core of popular heuristic solving strategies. Their usefulness lies in their ability to solve high-dimensional models that are beyond the scope of ordinary methods. The most widely used algorithms at present are genetic algorithms; however, other research uses simulated annealing approaches to make up for their shortcomings. These designs, however, are unable to handle the complicated linkages between the several yard operational phases.

To ascertain the necessary yard area size for particular terminals, Van. et al. [26] developed a method based on the ideas of queuing theory. Using a variety of simulation techniques, they solved the model and found correlations between variables such as yard space and bulk cargo volume. In order to find Pareto-optimal solutions, Saurabh et al. [21] used a controlled elite non-dominated sorting genetic algorithm after developing a ship decision system model and running into competing multiple objectives. After creating the model, Ozgur [5] completed the solution and case validation by breaking down the complicated issue into smaller models using the Benders decomposition algorithm. Maurizio Bruglieri et al. [27] tackled the 3D yard allocation problem for break bulk cargo by creating a mixed-integer linear programming (MILP) model and a solution approach that combines variable neighborhood search (VNS) and branch-and-bound approaches. This combination method not only improves the algorithm’s capacity to explore the solution space, but it also offers a novel approach to the yard allocation problem for break bulk freight.

Li et al. [28] used a path network to model the loading mechanism and yard structure of a coal yard. They reduced the scheduling problem to straightforward inputs and outputs by breaking down the process into distinct steps and including helpful aspects of bulk cargo port operating hours in a state set. The ship loading scheduling model of bulk cargo terminals was solved using a dual deep Q-learning approach. Xing et al. [29] created a multi-objective optimization model for bulk cargo ports that have limited channels, berths, and yards. They developed the NSGA-II-DPGR allocation algorithm (non-dominated sorting genetic algorithm II with Dynamic Penalty Guided Repair), which provides a comprehensive solution for scheduling optimization in bulk cargo ports. Addressing the complexity of the multi-objective mathematical model for loading operation planning and ship traffic scheduling based on COLOPVTS, Zhang et al. [30] used a heuristic algorithm that combined non-dominated sorting genetic algorithm II (NSGA-II) and variable neighborhood search (VNS) to solve the model and obtain Pareto-optimal solutions. For a two-layer container port-based two-stage stochastic programming model, Lu et al. [31] encoded the solution (that is, the original and repair plans) into a container sequence issue. In order to find near-optimal solutions, the solution space was searched for better ship sequences using the critical shaking neighborhood search (CSNS) metaheuristic algorithm. Based on a simulation model, Vianen et al. [32] created a dynamic planner that integrates various events into processes. It helps terminal planners pre-select routes or provide alternate routes in the event that conveyors or other machinery fail by processing bulk yard network data and order information using network topology and an older planner.

Hyeong-Tak et al. [33] introduced a ship navigation path planning algorithm based on deep Q-learning, which takes into account the environmental parameters of the entire port navigation area and non-navigation zones under different conditions. This strategy boosts both the safety and efficiency of ship navigation. For the integrated “production scheduling-berth allocation-yard allocation” mixed-integer linear programming (MILP) model they established in their study, Nicolas et al. [34] created a multi-start GRASP-ILS (Greedy Randomized Adaptive Search Procedure-Iterated Local Search) metaheuristic algorithm. Both the speed and the quality of the results were greatly increased by this approach. Yang et al. [35] suggested a global path planning algorithm based on Double Deep Q-Network (Double DQN) to handle the path planning problem for amphibious unmanned vehicles. By modifying the weights of the reward function, they built an optimum path, boosting the algorithm’s performance in pathfinding. Gajević et al. [36] used an analysis of variance (ANOVA) to identify all the parameters that have a significant impact on the outcome of a multi-objective program. Alimkhanova et al. [37] developed a method for intelligent management and control of the system using visible light communication technology, which provides a theoretical basis for this study on the whole series of intelligent management of the operating system. Eghbal et al. [38] proposed a conditional convergent evolutionary algorithm, the Seed Growth Algorithm (SGA), and proved the effectiveness of the algorithm in neural networks, which provides an important theoretical basis for the solution of optimization problems.

Reinforcement learning employs machine learning algorithms to mimic yard scenarios in order to optimize the operation process. The development of intelligent systems based on artificial intelligence and big data is currently at the forefront of port planning research. These systems are intended to optimize ship management, material loading and unloading, yard management, and transportation routes in order to decrease waiting times and increase efficiency. This strategy focuses on providing a learning environment in which the agent independently determines the best planning schemes, guided by a fair system of incentives and penalties. Reinforcement learning has significant computing capabilities and is more suitable for solving intricate challenges than other methodologies. It is the solution method adopted in this paper and is currently the most extensively utilized algorithm for optimization challenges.

3. Problem Formulation

Using the bulk cargo yard at Dong Jiakou Port as an example, Figure 1 depicts the structure of a 100,000-ton yard, Figure 2 demonstrates the loading and unloading operation for dry bulk cargo, and Figure 3 shows a photograph of the dock compared to a sketch of the docking and loading process. Both stationary and mobile machinery are utilized in dry bulk port yard loading and unloading activities. When a ship arrives and berths, for example, an unloader takes the cargo from the ship’s hold and transfers it to a belt conveyor, which moves it to the stacker–reclaimer’s location in the proper yard. After extracting the material from the belt conveyor, the stacker–reclaimer uses its crane mechanism to raise it to the needed height and then puts it at the proper spot. The material is slotted in the yard with the help of an excavator.

Using vehicle loading as an example, the stacker–reclaimer and mobile machinery retrieve goods and deposit them onto the belt conveyor in the outgoing operating process before the pickup date. Any extra supplies are carried to the truck loading destination, rail loading station, or ship loading point via mobile equipment.

Port management mostly uses the following techniques for operation planning when carrying out yard operations:

(1): Proximity-Based Storage: Operators usually select the most practical local yard to store the bulk commodities when a vessel arrives and they begin unloading and warehousing operations. This frequently results in extra handling during outbound operations, which raises the operational load.
(2): Experience-Based Routing: In certain yards, belt conveyors and mobile machines encounter traffic jams when transporting bulk goods. In the past, operators have used their own experiences to allocate machinery and determine routes.
(3): Inadequate Equipment Monitoring: Excessive wear and tear of belt conveyors cannot be completely avoided because of the high workload intensity in the yard and the lack of intelligent equipment. In order to avoid emergencies, operators still have to rely on their experience to recognize dangerous locations.

The following solutions are given by this study to tackle these problems:

(1): Optimized Storage Allocation: To establish the outgoing location during warehousing, assess the cargo owner’s action in light of the material’s qualities. Based on this, determine the storage location and assign resources to eliminate the need for extra handling, which will save time and energy.
(2): Route Planning Optimization: To decrease costs and enhance efficiency, plan routes for belt conveyors and mobile equipment utilizing the maximum flow and shortest path concepts.
(3): Planning for Safety Restrictions: Use mathematical models to predict the level of belt wear and vibration conditions in response to belt conveyor wear and blockage brought on by large particles. To promote safety, use these computations as constraints during route planning.
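
As an illustration of the shortest-path half of solution (2), the sketch below runs Dijkstra's algorithm over a small directed yard graph. It is only a sketch under stated assumptions: the node names, the edge costs, and the function name `shortest_path` are hypothetical and do not come from the study, and the maximum-flow part is not shown.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a dict-of-dicts graph with non-negative costs.

    Returns (path, cost), or (None, inf) if the target is unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for nxt, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if target not in dist:
        return None, float("inf")
    # Walk the predecessor links back from the target to the source.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]


# Hypothetical yard layout; edge weights stand for a generic transport cost.
yard = {
    "berth": {"conveyor_A": 4.0, "conveyor_B": 2.0},
    "conveyor_A": {"stack_1": 3.0},
    "conveyor_B": {"conveyor_A": 1.0, "stack_2": 7.0},
    "stack_1": {"rail_station": 2.0},
    "stack_2": {"rail_station": 1.0},
}

path, cost = shortest_path(yard, "berth", "rail_station")
print(path, cost)
```

A safety restriction such as the one in solution (3) could be imitated in this toy setting by removing or penalizing an edge before the search, although how the study encodes such restrictions is not specified here.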

In conclusion, the following is a description of the research problem: Choose which stacks to employ during warehousing at a dry bulk port yard within a specific operational planning period based on the needs of the cargo owner and the qualities of the materials and then allocate resources properly. As soon as the items are shipped, distribute the transportation and operating equipment according to the features of the storage facility. Plan the operational processes for the storing and dispatching of dry bulk cargo by carefully taking into account the yard structure, belt conveyor characteristics, and mobile machinery properties. In order to decrease the operating energy consumption and transit time while maintaining the highest level of safety, the purpose is to determine the best ways to carry and store things from warehousing to dispatch.

4. Construction of the Yard Scheduling Model

4.1. Model Assumptions

(1): Only the power consumption of the stacker–reclaimer is considered during the stacking and reclaiming operations.
(2): The procedure is assumed to commence at time zero.
(3): Throughout operations, intelligent equipment is capable of transmitting relevant data in real time while functioning normally.
(4): Variable-frequency rollers are a feature of all belt conveyors that enable speed modifications.

4.2. Model Parameters

Table 1 lists the model’s state variables and input parameters.

Decision Variables

Q: total amount of bulk material;
V: set of nodes including berths, train stations, truck stations, and various yards; $V = \{v_{a}\}, a = 1, \dots, n_{1}$
E: set of directed edges; $E = \{e_{ij}\}, i = 1, \dots, v_{k}, j = 1, \dots, v_{k}$
G: port graph model; $G = (V, E)$

4.3. Model Building

The mixed-integer programming (MIP) model developed in this study is presented as follows:

\min f = λ_{1} f_{1} + λ_{2} f_{2}

(1)

f_{1} = \min \{P_{j}\}

(2)

f_{2} = \min \{T_{j}^{\min}\}

(3)

λ_{1} + λ_{2} = 1

(4)

The overarching goal of this model is represented by Equation (1), which is to minimize the total amount of energy used by a single process as well as the amount of time that a single process takes to operate. The multi-objective optimization is transformed into a single-objective optimization by using weight coefficients.
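As a minimal illustration of the weighted-sum scalarization in Equations (1)–(4), the sketch below combines an energy term and a time term into a single objective value. The function name, the default weight, and the assumption that both inputs have already been normalized to comparable scales are illustrative choices, not part of the model specification.

```python
def weighted_objective(energy, time, lambda_energy=0.5):
    """Scalarize f1 (energy) and f2 (time) with weights that sum to 1.

    Assumes `energy` and `time` are already normalized to comparable scales.
    """
    if not 0.0 <= lambda_energy <= 1.0:
        raise ValueError("lambda_energy must lie in [0, 1]")
    lambda_time = 1.0 - lambda_energy  # Equation (4): the weights sum to 1
    return lambda_energy * energy + lambda_time * time
```

Sweeping `lambda_energy` from 0 to 1 traces different trade-offs between the two objectives.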

V_{ij} = \frac{Q_{ij}}{(π \times D \times v \times ρ)}

(5)

V_{j}^{\max} = \min (f_{\max}^{\min})

(6)

V_{j}^{\max} = \min (V_{ij}) = \min (\frac{Q_{ij}}{π \times D \times v \times ρ})

(7)

V_{j}^{\max} = \min (\frac{Q_{ij}}{π \times D \times v \times ρ}, \sum_{n = 1}^{n = N} \frac{Q_{ijn}}{π \times D \times v \times ρ})

(8)

T_{j}^{\max} = \sum_{i = 1}^{i = M} \frac{S_{ij}}{V_{j}^{\max}} = \frac{Q_{ij}}{L_{j}^{\max}}

(9)

Equations (5)–(7) illustrate the relationships between the operating time, speed, and material allocation in each phase of a single operation. During the j-th process, the speeds across different path phases vary. According to the “bottleneck principle”, the maximum average transportation rate of materials is determined by the lowest maximum operating speed among all the conveying machines across the various segments of the process.

Equations (8) and (9) describe the maximum flow rate during the branching transport stage of a process line with multiple branches. Specifically, the maximum flow rate is equal to the sum of the flow rates of each branch. In other words, the branching stage is treated as a subprocess involving N branch lines, where n represents the number of distinct branch lines. When the non-branching segments are considered, Equation (8) expresses the maximum operating speed of the branch process.

Based on the above, let $S_{ij}$ represent the transportation distance of the i-th segment in the j-th process, and let $T_{j}$ denote the total transportation time for the j-th process. The total number of individual paths in the overall route is denoted as M. Consequently, the shortest transportation time for the j-th operational process is expressed by Equation (9).
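The bottleneck principle and the branch rule can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, segment rates are supplied directly rather than derived from Equation (5), and all rates share one unit.

```python
def bottleneck_rate(segment_rates, branch_rates=None):
    """Maximum sustainable rate of one process line.

    segment_rates: maximum rates of the non-branching segments (in series).
    branch_rates:  maximum rates of the parallel branch lines, if any.
    """
    rate = min(segment_rates)  # the slowest series segment limits the line
    if branch_rates:
        # Parallel branches add up; the branching stage then acts as one
        # more series element of the line.
        rate = min(rate, sum(branch_rates))
    return rate


def transport_time(segment_lengths, rate):
    """Total distance of the process divided by its bottleneck rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return sum(segment_lengths) / rate
```

For example, a line with series rates 3.0, 2.75, and 3.15 and two branches of 1.0 and 1.5 is limited to 2.5 by the branching stage, so a 500-unit route takes 200 time units.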

Q_{ij} \leq Q_{\max}

(10)

e_{ij} + e_{ji} \leq 1

(11)

\sum_{j = 1}^{j = J} \sum_{i = 1}^{i = I} Q_{ij} = Q

(12)

Equation (10) ensures that the amount of material allotted to a single path in a single process does not exceed that process’s capacity. Equation (11) ensures that a belt conveyor segment runs in only one direction in the directed graph. Equation (12) ensures that the material quantities allotted to all segments sum to the initial total material quantity.
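A feasibility check for constraints (10)–(12) might look like the following sketch. The data layout is an assumption made for illustration: `allocation` maps a (segment, process) pair to $Q_{ij}$, and `directed_edges` is the set of node pairs whose edge variable equals 1.

```python
def check_allocation(allocation, q_max, total, directed_edges, tol=1e-9):
    """Return a list of violated constraints (empty if the plan is feasible)."""
    violations = []
    # Equation (10): no single path may carry more than the capacity.
    for key, quantity in allocation.items():
        if quantity > q_max + tol:
            violations.append(f"Eq. (10): {key} carries {quantity} > {q_max}")
    # Equation (11): a conveyor segment may be active in one direction only.
    for i, j in directed_edges:
        if i != j and (j, i) in directed_edges and (i, j) < (j, i):
            violations.append(f"Eq. (11): both {i}->{j} and {j}->{i} are active")
    # Equation (12): allocated quantities must add up to the total Q.
    if abs(sum(allocation.values()) - total) > tol:
        violations.append("Eq. (12): allocated quantities do not sum to Q")
    return violations
```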

4.4. Safety Management

Evaluating the belt conveyor’s level of wear is the main step in addressing safety concerns during transit. The belt conveyor’s vibration velocity can be used to assess its operational state; Table 2 shows the vibration evaluation criteria for belt conveyors. Belt conveyors are usually categorized as Class II or Class III machines, and the ISO 20816-3:2022 [39] standard specifies their safety ranges at various vibration velocities.

Consequently, it can be concluded that the belt conveyor’s vibration velocity ought to be below 7.1 mm/s. Normal operation is permissible when the vibration is less than 4.5 mm/s, and caution is advised when operating in the 4.5–7.1 mm/s range.
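The two thresholds can be turned into a simple status check that a route planner could consult. The labels and the handling of the exact boundary values below are assumptions made for this sketch; only the 4.5 mm/s and 7.1 mm/s limits come from the text.

```python
NORMAL_LIMIT_MM_S = 4.5   # below this: normal operation
CAUTION_LIMIT_MM_S = 7.1  # at or above this: treated here as not permitted


def vibration_status(velocity_mm_s):
    """Classify a conveyor by its measured vibration velocity (mm/s)."""
    if velocity_mm_s < 0:
        raise ValueError("vibration velocity cannot be negative")
    if velocity_mm_s < NORMAL_LIMIT_MM_S:
        return "normal"
    if velocity_mm_s < CAUTION_LIMIT_MM_S:
        return "caution"
    return "unsafe"
```

A conveyor classified as "unsafe" would then be excluded from the candidate edges during route planning.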

4.5. Power Parameters

Energy consumption in the yard operation process is primarily influenced by several factors: the operation of belt conveyors without a load, horizontal transportation of materials using belt conveyors, vertical transportation of materials via belt conveyors, and the operation of loading and unloading equipment. Specifically, the power consumption of belt conveyors is composed of three main components: the energy required for lifting materials, the energy necessary for horizontal material movement, and the energy consumed when the conveying machinery is idle. These components are mathematically defined and described in Equation (13) of the model.

P_{ij} = \frac{C \times f \times L \times (3.6 G_{m} \times V_{j}^{\max} + L_{j}^{\max}) + L_{j}^{\max} \times H}{367}

(13)

Equation (14) presents the total energy consumption model for the j-th process segment.

P_{j} = \sum_{i = 1}^{i = O} P_{ij}

(14)
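Equations (13) and (14) can be transcribed directly into code. The sketch below is a literal transcription under the assumption that the parameter names map one-to-one onto the symbols of Equation (13); the physical meaning and units of each symbol follow the model's parameter table and are not restated here.

```python
def segment_power(C, f, length, G_m, V_max, L_max, H):
    """Equation (13): power term P_ij for one conveyor segment.

    `length` stands for L, `V_max` for V_j^max, and `L_max` for L_j^max.
    """
    return (C * f * length * (3.6 * G_m * V_max + L_max) + L_max * H) / 367.0


def process_power(segments):
    """Equation (14): sum of the segment terms of one process.

    `segments` is an iterable of dicts holding the keyword arguments of
    `segment_power`.
    """
    return sum(segment_power(**segment) for segment in segments)
```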

4.6. Resource Allocation

After establishing the foundational elements of a basic graph model, it is necessary to enhance its complexity to better represent real-world conditions. Each stockyard is characterized by attributes such as its stockpile capacity ($X_{a}$), the quantity of stored material ($X_{ab}$), and the distance (d) to the outbound stockyard. Each stockyard is subject to a capacity constraint, where $X_{ab}$ represents the amount of material b stored in pile a. The capacity of each stockyard is denoted by $C_{a}$, and the total sum of the storage capacities of all piles must not exceed the overall capacity of the stockyard.

The distance d between a yard location and the outgoing yard is considered when new materials arrive. To minimize the need for additional handling, it is optimal to select the pile closest to the outbound yard. Each batch of materials is assigned an additional attribute, known as the outbound mode c, in addition to its physical characteristics. When the outbound mode is unknown (c = 0), the selection of the material yard should take into account the need to balance distances to various outbound yards. The yard corresponding to the transportation method chosen by the cargo owner is then targeted to move the stockpile to the nearest pile location.

The transportation of stockpiles within yards is managed using either mobile machinery or belt conveyors. Accordingly, the proximity of yard locations to the outbound yard can be categorized into two main scenarios:

Direct Belt Conveyor Connection: For yards equipped with a direct belt conveyor connection to the outbound yard, d represents the length of this conveyor. This direct linkage enables efficient material transport with minimal energy consumption.

Indirect Connection (Requires Additional Handling): In yards without a direct belt conveyor connection, additional handling operations are required. The total energy consumption of belt conveyor transport is calculated using Equation (14). Here, E₁ denotes the energy consumed by mobile loading machinery, while E₂ represents the energy consumption of mobile transportation machinery.

The equation is formulated as follows:

E = E_{1} + E_{2} + \sum P_{0} + \sum_{i = 1}^{i = O} P_{ij}

(15)

E_{1} = \frac{F_{0} + k \times W}{η_{1}} \times D

(16)

E_{2} = \frac{F_{0} \times d}{η_{2}}

(17)

\text{s.t.} \begin{cases} \min (P_{j}, E) \\ \min d_{c}, & c \neq 0 \\ \min (d_{1} + d_{2} + d_{3}), & c = 0 \\ c = 0, 1, 2, 3 \end{cases}

(18)
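The pile selection rule in Equation (18) can be sketched as follows: a known outbound mode picks the pile closest to that outbound yard, and an unknown mode (c = 0) picks the pile with the smallest combined distance to all three. The data layout and pile names are hypothetical, and capacity checks are omitted for brevity.

```python
def select_pile(piles, outbound_mode):
    """Pick a storage pile according to the outbound mode c.

    piles: dict mapping a pile name to {1: d_1, 2: d_2, 3: d_3}, the
           distances from that pile to the three outbound yards.
    outbound_mode: 0 if unknown, otherwise 1, 2, or 3.
    """
    if outbound_mode not in (0, 1, 2, 3):
        raise ValueError("outbound_mode must be 0, 1, 2, or 3")

    def score(name):
        distances = piles[name]
        if outbound_mode == 0:
            # Unknown mode: balance the distance to all outbound yards.
            return distances[1] + distances[2] + distances[3]
        return distances[outbound_mode]

    return min(piles, key=score)
```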

When goods are transported between two storage locations using a flow conveyor, that route segment is designed as $e_{ij} + e_{ji} \leq 1$ such that if an unexpected incident occurs during the transportation process, the route $e_{ij}$/$e_{ji}$ is assigned a value of 1. This design helps prevent vehicle collisions during transit, thereby enhancing safety.

On the other hand, if an unforeseen situation causes congestion during transportation, the current route’s status should be set to non-operational. Letting y represent the operational state of the flow conveyor transportation route, we have the following:

y = \begin{cases} 0, & \text{passable} \\ 1, & \text{congested} \end{cases}

(19)

5. Numerical Experiment

5.1. Deep Q-Learning

Deep Q-Learning is an algorithm that integrates reinforcement learning and deep learning. It can tackle decision-making problems with high-dimensional state spaces in complex settings such as yard operations. Double Deep Q-Learning builds on the Deep Q-Network (DQN) by addressing its tendency to overestimate action values. To improve accuracy, it decouples action selection from action evaluation: the current (online) network selects the action, and a separate target network estimates that action’s value.

In tabular Q-learning, the agent updates a Q-table to decide the optimum course of action in a given state. However, this method’s efficiency falls as the number of states rises. DQN addresses this by replacing the Q-table with a neural network that takes a state as input and outputs the value of each action in that state. When working with complex states and actions, the state value function can be separated from the advantage function: the value function assesses the worth of the state itself, while the advantage function assesses how much better each action is than the average action in that state. Equation (20) depicts the Q-value computation in the Dueling DQN architecture:

Q (s, a; θ, α, β) = V (s; θ, β) + (A (s, a; θ, α) - \frac{1}{|A|} \sum_{a^{'}} A (s, a^{'}; θ, α))

(20)
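The aggregation in Equation (20) is easy to check numerically. The sketch below uses plain Python lists rather than network outputs; in a real model the state value and the advantages would come from the two output streams of the network, and the function name is our own.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) as in Equation (20)."""
    mean_advantage = sum(advantages) / len(advantages)
    # Subtracting the mean advantage makes the decomposition identifiable:
    # the average of the resulting Q-values equals V(s).
    return [state_value + (a - mean_advantage) for a in advantages]
```

With a state value of 2.0 and advantages [1.0, 0.0, -1.0], the Q-values are [3.0, 2.0, 1.0], and their mean recovers the state value.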

During evaluation, the target network in DQN can overestimate values. Double DQN lessens this by using two networks: the current network selects the action, the target network evaluates it, and the target network’s parameters are periodically synchronized with the current network to improve stability. Equations (21) and (22) display the action selection and evaluation functions.

In conclusion, Figure 4 depicts the neural network architecture of the Dueling Double DQN method.

The Deep Q-learning states encode yard operation conditions, such as material allocation and machinery operating status. The reward function is the normalized objective function. The iteration ends when a yard operating scheme is found that minimizes transportation time and power consumption, that is, when no other operation process is superior in both respects. Figure 5 illustrates the solution flow of the Dueling Double DQN learning algorithm used in this investigation.

Based on the above, the algorithm solving process is as follows:

(1): Initialization: Set up the parameters for the current network, target network, and experience replay buffer. The initial state is defined as the designated inbound port, “port”. The storage location closest to the outbound yard is chosen by a greedy strategy.
(2): Action Selection: In the state “port”, select the greedy action a given by Equation (21) with a probability of 1 − ε (and a random action otherwise), and proceed to evaluate its outcome.

$a = \arg \max_{a^{'}} Q (s, a^{'}; θ)$

(21)

$y = r_{k} + γ Q (s^{'}, a^{'}; θ^{-})$

(22)
(3): Execution and Memory Storage: Execute the chosen action to obtain a reward and the subsequent state. Store the experience tuple $(s, a, r, s^{'})$ in the memory pool for future reference.
(4): Termination: the training process concludes when the terminal state is reached and epsilon decreases to zero.
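Steps (2) and (3) rely on two small computations: ε-greedy action selection and the Double DQN target. The sketch below shows both on plain lists of Q-values. It is a simplified illustration with hypothetical function names; the `done` flag for terminal states is an assumption of the sketch, and the network forward passes that would produce the Q-values are left out.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def double_dqn_target(reward, gamma, next_q_online, next_q_target, done):
    """Double DQN target for one transition.

    The online network's Q-values choose the next action; the target
    network's Q-values evaluate it.
    """
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]
```

Evaluating the selected action with the target network, rather than taking the target network's own maximum, is what reduces the overestimation discussed above.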

5.2. Greedy Algorithm

When solving models or problems, the greedy algorithm always selects the best immediate alternative. This indicates that the greedy algorithm can produce a locally optimal solution rather than exploring the overall best solution. Therefore, choosing the right greedy approach is crucial to employing a greedy algorithm to find the best solution for any problem.

To compensate for the greedy algorithm’s tendency toward local optima, a suitable exploration strategy within the action space must be chosen in order to optimize the yard’s operational workflow. The path’s starting point is fixed, since the inbound mode is established when materials enter the yard. The endpoint of the inbound path is identified by greedily choosing the yard nearest to the outbound yard as the destination. The objective then becomes determining the shortest route between the two sites. Dijkstra’s algorithm, a classical single-source shortest path algorithm, can be used for this purpose by converting the constraints into edge weights based on energy consumption, time, and vibration frequency.
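One way to realize this idea is to fold energy and time into a single edge weight and to drop edges that are congested or exceed the vibration limit before running Dijkstra's algorithm. The sketch below makes these choices explicit; the edge tuple layout, the weights, and the hard exclusion rule are assumptions for illustration rather than the authors' exact formulation.

```python
import heapq


def shortest_route(edges, source, target,
                   w_energy=0.5, w_time=0.5, vibration_limit=7.1):
    """Dijkstra over edges given as (u, v, energy, time, vibration, congested)."""
    graph = {}
    for u, v, energy, time, vibration, congested in edges:
        if congested or vibration >= vibration_limit:
            continue  # unsafe or blocked edges are not candidates
        weight = w_energy * energy + w_time * time
        graph.setdefault(u, []).append((v, weight))

    inf = float("inf")
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, inf):
            continue  # stale queue entry
        for v, weight in graph.get(u, []):
            candidate = d + weight
            if candidate < dist.get(v, inf):
                dist[v] = candidate
                prev[v] = u
                heapq.heappush(heap, (candidate, v))

    if target not in dist:
        return None, inf
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]
```

Because excluded edges never enter the graph, a cheaper but unsafe conveyor cannot appear in the returned route.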

5.3. Technical Strategy

Improving the yard’s operational process involves finding dependable transportation and storage options, choosing a suitable yard based on actual yard conditions and machinery characteristics, and proactively considering outbound operation schemes to reduce overall costs. In this non-deterministic polynomial (NP) problem with multi-dimensional components, the computational cost of traditional approaches rises as the problem scale grows. Deep reinforcement learning is currently one of the most effective techniques for handling problems of this kind.

The design of the proposed algorithm is as follows:

(1): Graph Modeling: develop a basic yard graph model through mathematical mapping, summarizing the distribution and operational characteristics of the yard’s mobile and stationary equipment.
(2): Model Establishment: Construct a mixed-integer programming model by integrating dynamic characteristics into the graph model. This integration is based on the yard’s energy consumption calculation method, operation time calculation method, and safety considerations.
(3): Agent Training: Normalize the objective function and use it as the reward function. Train the agent based on the principles of the Dueling Double DQN algorithm.
(4): Optimization and Output: conclude the training process once the objective is met to execute the actual optimization of the yard’s operational process and derive the optimal design for yard operations.

In conclusion, the structural flow of the model algorithm developed in this study is depicted in Figure 6.

6. Case Study Analysis

Experimental data for this study were gathered from Qingdao Port’s dry bulk terminals and yards. We defined a number of standard port operation tasks and specified the amounts of material handled based on Qingdao Port’s actual operational requirements. To compare the optimized scheme with the original approach, we computed a number of indicators under simulated material volumes and operational scenarios using Qingdao Port’s current fixed operational processes.

PyTorch 1.12 and Python 3.8 were used to implement the suggested Dueling Double DQN algorithm. A 13th-generation Intel i9 processor with 16 GB of RAM made up the computing environment.

Using the yard operations at Dong Jiakou Port in the Qingdao Port area as a case study, we validated the proposed Dueling Double DQN algorithm with material transportation data from a single operational cycle and relevant yard machinery data. The selected stockpile data and mechanical parameters comprehensively cover most port machinery and types of dry bulk cargo, offering excellent representativeness and generality. Based on this, we trained the Dueling Double DQN algorithm to solve the mixed-integer programming model and conducted a comparative analysis with solutions obtained using the DQN algorithm and actual operational schemes. The relevant algorithm parameters are shown in Table 3.

Table 4, Table 5 and Table 6 display the yard node data, belt conveyor structural data, and cargo type data utilized in this investigation, respectively. Figure 7 shows the results of yard diagram modeling. Figure 8 and Figure 9 show the simulated reward function curves and loss function curves for the DQN method and the Dueling Double DQN algorithm, respectively, based on numerical testing.

6.1. Experimental Parameters

Taking the Qingdao Port yard as an example, there are four kinds of yards: coal, 100,000-ton, 200,000-ton, and small harbor basin. Because no dedicated material data are available for a single operational process, we constructed a case study. In the yard structure, a single belt conveyor connects numerous yards, and all stacking and reclaiming is performed by a single stacker–reclaimer. Such yards therefore do not need to be set up as independent nodes and edges in the mathematical network; they can be merged into a single node, with the edge properties adjusted accordingly. Two directed edges can be combined to form a bidirectional edge in the mathematical graph model.

In conclusion, the following are the data and experimental parameters used in this study:

Yards Merged: 9 yards.

Three turnhouses.

Rail, truck, and ship outbound yards are the three outgoing yards.

Port is the inbound yard.

Additional parameters: C = 0.35; f = 0.03; $G_{m}$ = 1 kg; belt conveyor operating speed V = 3.15 m/s; H = 3.35 m; D = 1; $v_{j}$ = 3.77 m/s; ρ = 4.9 m/s.

Three approaches were chosen for comparison analysis in the defined case.

Dueling Double DQN Technique: this approach was used to solve the proposed model.

Actual Yard Strategy: the yard’s existing operations plan.

Deep Q-Learning: the usual DQN method was used to solve the operational procedure.

The Dueling Double DQN employed the following greedy strategy calculation formula, with a batch size of 64, a replay buffer size of 10,00, a learning rate of 0.001, and a discount factor of 0.995:

ε = ε_{\text{end}} + (ε_{\text{start}} - ε_{\text{end}}) \times e^{- \frac{\text{index}}{ε_{\text{decay}}}}

(23)
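Equation (23) is an exponential decay from a starting exploration rate toward a floor. A minimal sketch follows; the default values are placeholders chosen for illustration and are not the settings used in the experiments.

```python
import math


def epsilon_at(index, eps_start=1.0, eps_end=0.05, eps_decay=200.0):
    """Exploration rate after `index` steps, following Equation (23)."""
    return eps_end + (eps_start - eps_end) * math.exp(-index / eps_decay)
```

The schedule starts at `eps_start` when the index is zero, decreases monotonically, and approaches `eps_end` as the index grows, so a larger `eps_decay` keeps the agent exploring for longer.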

Figure 7 depicts the mathematical mapping of the fundamental yard structure, with dashed lines denoting the paths used for the movement of mobile machinery.

Figure 7. Graph model mapping of yard structure.

6.2. Case Study and Results

By integrating actual data with the case study, we established indices to perform one-hot encoding on states for agent training and updated the weights and parameters. We recorded numerical changes in files and represented the results using line graphs.
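One-hot encoding of the agent's position can be sketched as below. The node list follows the counts given in Section 6.1 (one inbound port, nine merged yards, three turnhouses, and three outbound yards), but the node names are hypothetical, and the actual state representation may carry further information such as material allocation.

```python
# Hypothetical node names; only the counts come from the case description.
NODES = (
    ["port"]
    + [f"yard_{k}" for k in range(1, 10)]
    + [f"turnhouse_{k}" for k in range(1, 4)]
    + ["rail_out", "truck_out", "ship_out"]
)
NODE_INDEX = {name: i for i, name in enumerate(NODES)}


def one_hot_state(node):
    """Return a vector with a single 1.0 at the index of `node`."""
    vector = [0.0] * len(NODES)
    vector[NODE_INDEX[node]] = 1.0
    return vector
```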

Figure 8 and Figure 9 show how the reward function and the loss function vary for the deep reinforcement learning techniques compared. The shaded portion of Figure 8 indicates that the curve was obtained by averaging.

Figure 8. Reward function changes: DQN vs. Double DQN.

Figure 9. Loss function changes: DQN vs. Double DQN.

The Dueling Double DQN algorithm was used to obtain the best results and calculate the average values after the Pareto frontier produced the ideal set of weights and parameters for time and energy consumption, as seen in Figure 7. Figure 8 and Figure 9 show the results of a comparative analysis with the typical DQN model. The cumulative reward function and strategy stability variations for both algorithms are depicted in Figure 8, and the loss function curves are displayed in Figure 9.

It is evident that because of its exploration method, the Dueling Double DQN algorithm, which combines dynamic weight updates and divides value and advantage functions, initially displays a comparatively large loss function. It offers greater strategy stability, though, and converges earlier and faster.

In conclusion, both the regular DQN algorithm and the enhanced Dueling Double DQN algorithm converge to the best course of action. However, for more complicated high-dimensional parameters, the Dueling Double DQN exhibits better problem-solving ability, has stronger exploration capabilities, and converges more quickly. Fine-tuning the reward function settings makes its search for the best course of action more effective and stable.

Table 7 presents a comparison of the operational schemes developed from our model with the real yard operations based on the data obtained. Figure 10 displays the outcomes of the planning for the operational process. Table 8 shows the time comparison between the algorithm in this study and the manually obtained method.

The solution obtained by the Dueling Double DQN (DDDQN) is as follows: Based on the basic structure of the yard, when the goods are unloaded from the port, the outbound mode of the 20,000 tons of goods is the municipal depot, the outbound mode of the 30,000 tons of goods is the train depot, and the outbound mode of the 50,000 tons of goods is the ship depot. Accordingly, the storage yard for the municipal depot is slot10, and the storage yards for the train depot are slot6 and slot7. The storage yards for loading and unloading are slot2, slot3, slot4, and slot5. According to Equation (13),

P_{ij} = \frac{C \times f \times L \times (3.6 G_{m} \times V_{j}^{\max} + L_{j}^{\max}) + L_{j}^{\max} \times H}{367}

the total energy consumption of this process is 5235.36 kWh.

In the manual approach, even when the same storage yards are chosen, operators tend to favor nearer yards and repeated paths, depending on the yards’ remaining capacity, because this makes transportation and storage easier to arrange. In the train departure link, for example, the manual scheme uses the route slot2-slot1-slot10-corner9-corner8-slot7 rather than the slot2-slot3-slot7 route derived by reinforcement learning. It therefore consumes more energy than the DDDQN scheme, totaling 5423.41 kWh.
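The size of the saving follows directly from the two reported totals. The short calculation below uses only those two figures; the variable names are ours.

```python
dddqn_kwh = 5235.36   # reported total for the DDDQN scheme
manual_kwh = 5423.41  # reported total for the manual scheme

saving_kwh = manual_kwh - dddqn_kwh
saving_pct = 100.0 * saving_kwh / manual_kwh

print(f"Saving: {saving_kwh:.2f} kWh ({saving_pct:.2f}% of the manual total)")
```

The difference is about 188 kWh, or roughly 3.5% of the manual scheme's consumption for this single operational cycle.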

Based on the aforementioned results, the proposed operational scheme leads to shorter inbound and outbound times as well as reduced energy consumption. In the manual scheme, the goods are transported along repeated paths, which causes unnecessary energy consumption.

In comparison, the strategy described in this study planned the storage yard and selected different transportation paths based on the yard structure and residual capacity of each yard in advance, resulting in reduced energy consumption and transportation time.

In addition to the foregoing, the solution suggested in this study solves the anomalous vibration frequency between the conveyors of slot5 and slot6, as well as the obstacle avoidance issue in mobile equipment path planning during the route selection process. On the one hand, it avoids path conflicts; on the other, it excludes the faulty conveyor from consideration as a potential path, hence improving operational safety.

Overall, the proposed operational scheme enhances yard management efficiency and sustainability by optimizing pile allocation and minimizing unnecessary handling activities.

6.3. Model Sensitivity Analysis

The amount of storage capacity has a significant impact on the efficiency of port operations. In this work, sensitivity analysis is used to assess the efficiency of the scheduling method under varying numbers of storage slots. Four scenarios were set up, with the number of yards used for material storage set to three, four, five, and six, respectively, to investigate how the solution’s energy consumption and time vary. The specific experimental scheme and parameters are shown in Table 9:

Table 10 and Figure 11 show that when the number of storage slots changed from 3 to 4, the change in energy consumption was small compared with the subsequent steps; when the number changed from 4 to 5, however, the rate of change in energy consumption increased. In other words, operating energy consumption rises more quickly as the number of storage slots increases. This indicates that vacant areas should be consolidated before each operation, reducing the number of vacant spaces while maintaining the same vacant capacity, in order to improve efficiency. It also shows that this study can offer improved solutions to bulk storage yard staff and provides a theoretical foundation for bulk storage yard planning.

7. Conclusions

Following investigation and case validation, the following findings were made:

Novel Yard Operational Process Optimization Model: This work built a novel yard operational process optimization model based on mixed-integer programming and graph theory by carefully accounting for the yard structure, mechanical operational features, and actual port conditions. We converted the yard structure into a graph network, performed mathematical mapping of flow rates, routes, and origin–destination points, and refined the path optimization methods in accordance with the real features of dry bulk yards. Comparison with actual operational procedures showed that low operational efficiency and excessive unnecessary handling are prevalent problems in dry bulk yards. The suggested model and algorithm improved transportation and loading/unloading efficiency and supported future intelligent applications in dry bulk yards by addressing issues such as resource waste and low safety caused by the complexity of yard operational process elements.

A successful algorithm combining greedy algorithm with Dueling Double DQN: We created a solution algorithm that combines the benefits of target network optimization and state–action evaluation from the greedy algorithm, Dueling DQN, and Double DQN. In experiments, the method produced favorable results, showing increased agent learning stability and convergence speed. It demonstrated outstanding iterative learning skills in the context of the current challenge, offering a theoretical foundation for resolving difficult yard operational process optimization issues. By employing this algorithm, the resource- and time-balancing scheduling approach successfully decreased yard transportation expenses, setting the stage for later coordinated scheduling between berths and yards. The plan also lowers the overall operating risks in dry bulk yards, greatly improves the safety of cargo transportation, and provides a path toward a safe, green, and effective transformation.

Solution for complex yard operations: The difficulty of methodically developing operational schemes and coordinating operational processes in the intricate setting of dry bulk yards was the focus of this study. We optimized yard selection and mechanical scheduling modes, selecting the most efficient transportation ways and routes, by examining mechanical operating states and yard storage conditions and taking into account variables such as mechanical operational energy consumption, transportation time, and safety. This serves as a guide for thorough scheduling and optimization in dry bulk yards.

To eliminate unnecessary handling procedures, this study pre-selected yards for inbound and outbound processes based on the anticipated departure mode of the cargo. This streamlined approach encompassing the stages of grabbing, transporting, weighing, loading, and outbound processing resulted in reduced expenses and enhanced operational efficiency. The developed model provides substantial evidence for the systematic design of scheduling schemes for both mobile and fixed machinery in dry bulk ports. By calculating mechanical scheduling and transportation strategies for handling operations during three outbound processes, the model supports optimized scheduling and resource allocation, thereby improving the overall port operational performance.

Reducing energy use supports the green development of bulk cargo terminals: ports consume fewer resources, which lowers costs and improves efficiency. However, because of the particular nature of dry bulk cargo, the terminal’s environmental impact is not determined by energy use alone; the use of green energy may also support the long-term growth of bulk cargo terminals. One direction for our future research is therefore to investigate specific environmental evaluation indicators. In addition, the loading, unloading, and transportation procedures at bulk cargo terminals involve more intricate operational circumstances, such as cleaning the cargo hold when unloading. Furthermore, the overall operational model still depends on vessel scheduling, while the modes of storage entry are more diverse.

This study focuses on optimizing terminal yard operations using an algorithm based on the combined approach of “location selection, path optimization, and resource scheduling”. In future work, we intend to investigate the combined optimization of yard operation scheduling and vessel scheduling and to extend the optimization to additional scheduling scenarios. For example, container and liquid cargo terminals use different cargo transportation and unloading procedures than dry bulk cargo terminals.

Container transport is more fixed, and liquid cargo transport requires pipelines, which is entirely different from conveyor belt systems. Furthermore, reinforcement learning models capable of dynamically updating learning parameters based on scene changes are another research direction for enhancing the adaptability of the model.

Author Contributions

H.L.: Writing review and editing, supervision, funding acquisition. J.Z.: Conceptualization, writing—original draft, software, methodology, validation, supervision. P.J.: Methodology, resources, project administration. H.O.: Investigation, visualization, software. W.Z.: Validation, investigation, data curation. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key R&D Program of China, grant number 2023YFB4302200, the Key Project of the Liaoning Province Social Science Planning Fund, grant number L21ACL002, and the Fundamental Research Funds for the Central Universities, grant number 3132024284.

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

Yang, H.; Sun, W.; Zhou, B. Application of intelligent perception fusion technology in dynamic environment of dry bulk yard. Port Waterw. Eng. 2023, 9, 177–182.
Liu, H.; Wang, X. Research on optimization of intelligent production control system for bulk cargo terminal. Port Oper. 2023, 1, 52–55+59.
Wu, J.; Wang, X. Intelligent recommendation algorithm for optimal process combination of dry bulk terminal. Port Oper. 2020, 6, 32–35+55.
Burdett, R.L.; Corry, P.; Eustace, C. Stockpile scheduling with geometry constraints in dry bulk terminals. Comput. Oper. Res. 2021, 130, 105224.
Unsal, O.; Oguz, C. An exact algorithm for integrated planning of operations in dry bulk terminals. Transp. Res. Part E Logist. Transp. Rev. 2019, 126, 103–121.
Ouhaman, A.A.; Benjelloun, K.; Kenné, J.P.; Najid, N. The storage space allocation problem in a dry bulk terminal: A heuristic solution. IFAC-PapersOnLine 2020, 53, 10822–10827.
Kong, L.; Ji, M. Mathematical modeling and optimizing of yard layout in automated container terminals. Expert Syst. Appl. 2024, 258, 125117.
Dávila de León, A.; Lalla-Ruiz, E.; Melián-Batista, B.; Moreno-Vega, J.M. A machine learning-based system for berth scheduling at bulk terminals. Expert Syst. Appl. 2017, 87, 170–182.
Zhang, R.-Q.; Wang, M.; Pan, X. New model of the storage location assignment problem considering demand correlation pattern. Comput. Ind. Eng. 2019, 129, 210–219.
Sun, D.; Meng, Y.; Tang, L.; Liu, J.; Huang, B.; Yang, J. Storage space allocation problem at inland bulk material stockyard. Transp. Res. Part E Logist. Transp. Rev. 2020, 134, 101856.
Jiang, X.J.; Jin, J.G. A branch-and-price method for integrated yard crane deployment and container allocation in transshipment yards. Transp. Res. Part B Methodol. 2017, 98, 62–75.
Tan, C.; Liu, Y.; He, J.; Wang, Y.; Yu, H. Yard space allocation for container transshipment ports with mother and feeder vessels. Ocean Coast. Manag. 2024, 251, 107048.
Kang, K.; Zhang, J.; Yang, X. Study on integrated dispatching model and algorithm for loading/unloading activity systems of dry bulk cargo ports. Logist. Technol. 2014, 33, 121–125+188.
Robenek, T.; Umang, N.; Bierlaire, M.; Ropke, S. A branch-and-price algorithm to solve the integrated berth allocation and yard assignment problem in bulk ports. Eur. J. Oper. Res. 2014, 235, 399–411.
Shu, Y.; Han, B.; Song, L.; Yan, T.; Gan, L.; Zhu, Y.; Zheng, C. Analyzing the spatio-temporal correlation between tide and shipping behavior at estuarine port for energy-saving purposes. Appl. Energy 2024, 367, 123382.
de Andrade, J.L.M.; Menezes, G.C. A column generation-based heuristic to solve the integrated planning, scheduling, yard allocation and berth allocation problem in bulk ports. J. Heuristics 2023, 29, 39–76.
Huang, M.; He, J.; Yu, H.; Yan, W.; Tan, C. Improved Benders decomposition for stack-based yard template generation in an automated container terminal. Transp. Res. Part E Logist. Transp. Rev. 2024, 188, 103607.
Lu, N.; Zhou, H.; Wang, X.; Shi, H.; Zhang, Z.; Guo, Z. Scheduling synergies between berth and yard in bulk ports based on MOPSO algorithm. In Proceedings of the 2023 3rd Asia-Pacific Conference on Communications Technology and Computer Science (ACCTCS), Shenyang, China, 25–27 February 2023; pp. 391–395.
Shu, Y.; Xiong, C.; Zhu, Y.; Liu, K.; Liu, R.W.; Xu, F.; Gan, L.; Zhang, L. Reference path for ships in ports and waterways based on optimal control. Ocean Coast. Manag. 2024, 253, 107168.
Wang, W.; Lin, S.; Zhen, L. Flexible storage yard management in container terminals under uncertainty. Comput. Ind. Eng. 2023, 186, 109753.
Pratap, S.; Nayak, A.; Kumar, A.; Cheikhrouhou, N.; Tiwari, M.K. An integrated decision support system for berth and ship unloader allocation in bulk material handling port. Comput. Ind. Eng. 2017, 106, 386–399. [Google Scholar] [CrossRef]
Shu, Y.; Zhu, Y.; Xu, F.; Gan, L.; Lee, P.T.-W.; Yin, J.; Chen, J. Path planning for ships assisted by the icebreaker in ice-covered waters in the Northern Sea Route based on optimal control. Ocean Eng. 2023, 267, 113182. [Google Scholar] [CrossRef]
Xu, B.; Liang, J.; Yang, X.; Li, H.; Yang, Z. Evaluation of operation cost and energy consumption of ports: Comparative study on different container terminal layouts. Simul. Model. Pract. Theory 2023, 127, 102792. [Google Scholar] [CrossRef]
Soroushnia, G.; Alinaghian, M.; Malekahmadi, A. An accelerated Benders decomposition algorithm for the integrated storage space assignment, berth allocation, and yard crane deployment problem. Comput. Ind. Eng. 2024, 194, 110397. [Google Scholar] [CrossRef]
Xuan, B.; Liang, C.; Yang, X.; Li, H.; Yang, Z. A dynamic yard space reservation algorithm based on reward-penalty mechanism. Heliyon 2024, 10, e37817. [Google Scholar] [CrossRef]
van Vianen, T.A.V.; Ottjes, J.A.; Negenborn, R.R.; Lodewijks, G.; Mooijman, D.L. Simulation-based determination of the required stockyard size for dry bulk terminals. Simul. Model. Pract. Theory 2014, 42, 119–128. [Google Scholar] [CrossRef]
Bruglieri, M.; Gerzelj, E.; Guenzani, A.; Maja, R.; de Alvarenga Rosa, R. Solving the 3-D yard allocation problem for break bulk cargo via variable neighborhood search branching. Electron. Notes Discret. Math. 2015, 47, 237–244. [Google Scholar] [CrossRef]
Li, C.; Wu, S.; Li, Z.; Zhang, Y.; Zhang, L.; Gomes, L. Intelligent scheduling method for bulk cargo terminal loading process based on deep reinforcement learning. Electronics 2022, 11, 1390. [Google Scholar] [CrossRef]
Jiang, X.; Zhong, M.; Shi, J.; Li, W. Optimization of integrated scheduling of restricted channels, berths, and yards in bulk cargo ports considering carbon emissions. Expert Syst. Appl. 2024, 255, 124604. [Google Scholar] [CrossRef]
Zhang, X.; Li, J.; Yang, Z.; Wang, X. Collaborative optimization for loading operation planning and vessel traffic scheduling in dry bulk ports. Adv. Eng. Inform. 2022, 51, 101489. [Google Scholar] [CrossRef]
Lu, Z.; Yang, Z.; Wang, S.; Hu, H.; Chew, E.P.; Fan, T. Integrated planning model for two-story container ports. Transp. Res. Part C Emerg. Technol. 2024, 160, 104535. [Google Scholar] [CrossRef]
van Vianen, T.A.V.; Ottjes, J.A.; Negenborn, R.R.; Lodewijks, G.; Mooijman, D.L. Simulation-based operational control of a dry bulk terminal. In Proceedings of the 2012 9th IEEE International Conference on Networking, Sensing and Control, Beijing, China, 11–14 April 2012. [Google Scholar]
Lee, H.-T.; Kim, M.-K. Optimal path planning for a ship in coastal waters with deep Q network. Ocean Eng. 2024, 307, 118193. [Google Scholar] [CrossRef]
Cheimanoff, N.; Féniès, P.; Kitri, M.N.; Tchernev, N. Exact and metaheuristic approaches to solve the integrated production scheduling, berth allocation and storage yard allocation problem. Comput. Oper. Res. 2023, 153, 106174. [Google Scholar] [CrossRef]
Yang, X.; Shi, Y.; Liu, W.; Ye, H.; Zhong, W.; Xiang, Z. Global path planning algorithm based on double DQN for multi-tasks amphibious unmanned surface vehicle. Ocean Eng. 2022, 266, 112809. [Google Scholar]
Gajević, S.; Marković, A.; Milojević, S.; Ašonja, A.; Ivanović, L.; Stojanović, B. Multi-Objective Optimization of Tribological Characteristics for Aluminum Composite Using Taguchi Grey and TOPSIS Approaches. Lubricants 2024, 12, 171. [Google Scholar] [CrossRef]
Alimkhanova, A.; Grigorieva, S.; Shvets, O.; Gyorok, G. Data Transmission via Wireless Optical Communication in Indoor Climate Control Systems. Bulletin D. Serikbayev of EKTU, 22 December 2023. Available online: https://api.semanticscholar.org/CorpusID:266792571 (accessed on 5 January 2025).
Hosseini, E.; Al-Ghaili, A.M.; Kadir, D.H.; Daneshfar, F.; Gunasekaran, S.S.; Deveci, M. The Evolutionary Convergent Algorithm: A Guiding Path of Neural Network Advancement. IEEE Access 2024, 12, 127440–127459.
ISO 20816-3:2022; Mechanical Vibration—Measurement and Evaluation of Machine Vibration—Part 3: Industrial Machinery with a Power Rating Above 15 kW and Operating Speeds Between 120 r/min and 30 000 r/min. Deutsches Institut für Normung: Berlin, Germany, 2022.

Figure 1. Structure of dry bulk cargo.

Figure 2. Overall operational process of dry bulk cargo terminal.

Figure 3. A photograph of the dock compared with a sketch of the docking and loading process (source: Google Earth).

Figure 4. Neural network architecture.

Figure 5. Dueling Double DQN operation flow.

Figure 6. Computational flow of the mathematical model.

Figure 10. Operational route allocation and yard selection.

Figure 11. Energy consumption and total time changes.

Table 1. Parameter descriptions.

Parameter	Description
d	Shortest distance from the yard to the outbound yard
$x_{a}$	Rated capacity of the stockpile
$x_{a b}$	Amount of stockpile already stored
j	The j-th operation process
J	Total number of operation processes
i	The i-th path segment
I	Total number of paths in a single process
$Q_{i j}$	Material quantity allocated on each conveyor path
$V_{i j}$	Transport speed of the i-th segment in the j-th process
D	Drum diameter
v	Drum rotation speed
ρ	Material density
$S_{i j}$	Transport distance of the i-th segment in the j-th process
$T_{j}$	Total transport time of the j-th operation process
$P_{j}$	Conveying power consumption of the j-th process segment
$P_{0}$	Non-conveying power consumption for completing one operation
M	Total number of segmented paths in one transport process
N	Number of branch subprocesses in a single process
C	Resistance coefficient at conveyor belts, bearings, etc.
f	Roller resistance coefficient
$L_{j}^{\max}$	Belt conveyor transport efficiency
L	Horizontal projection of the drum center distance
$G_{m}$	Weight of rotating parts of the belt conveyor
H	Conveying height of the belt conveyor

Table 2. Belt conveyor vibration ranges.

Vibration Velocity (mm/s)	Vibration Level	Operational Status Description
0–2.3	Low vibration	Normal operation; no special attention required
2.3–4.5	Moderate vibration	Equipment status normal; continuous monitoring advised
4.5–7.1	High vibration	Maintenance or repair needed
≥7.1	Very high vibration	Equipment in dangerous state; immediate shutdown required
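
The thresholds in Table 2 can be read as a simple lookup from a measured vibration velocity to an operational status. The following Python sketch illustrates that reading; it is not the authors' implementation. The function name `classify_vibration` is hypothetical, and because adjacent ranges in the table share their boundary values, the sketch assumes that a boundary value belongs to the higher level (in line with the "≥7.1" row).

```python
from bisect import bisect_right

# Boundaries (mm/s) between the four levels listed in Table 2.
THRESHOLDS_MM_S = (2.3, 4.5, 7.1)

# (level, operational status) pairs in ascending order of severity.
LEVELS = (
    ("Low vibration", "Normal operation; no special attention required"),
    ("Moderate vibration", "Equipment status normal; continuous monitoring advised"),
    ("High vibration", "Maintenance or repair needed"),
    ("Very high vibration", "Equipment in dangerous state; immediate shutdown required"),
)


def classify_vibration(velocity_mm_s: float) -> tuple[str, str]:
    """Return the (level, status) pair for a vibration velocity in mm/s.

    Assumption: a value exactly on a boundary falls into the higher level.
    """
    if velocity_mm_s < 0:
        raise ValueError("vibration velocity must be non-negative")
    # bisect_right counts how many boundaries are <= the measured value.
    return LEVELS[bisect_right(THRESHOLDS_MM_S, velocity_mm_s)]


if __name__ == "__main__":
    for v in (1.0, 2.3, 5.0, 7.1):
        level, status = classify_vibration(v)
        print(f"{v:>4} mm/s -> {level}: {status}")
```

A scheduler could use such a check to exclude a conveyor from route selection once it reaches the "Very high vibration" level; how the paper's model applies the thresholds is described in Section 4.4.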

Table 3. Algorithm parameter settings.

Parameter	Meaning	Value
α	Learning rate	0.001
γ	Discount factor	0.95
memory_size	Capacity of replay buffer	10,000
batch_size	Batch size	64
$ε_{start}$	Initial exploration rate	1.0
$ε_{end}$	Final exploration rate	0.01
$ε_{decay}$	Exploration rate decay	0.995
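
To make the role of the exploration parameters in Table 3 concrete, the sketch below collects the settings in a configuration object and evaluates an ε-greedy schedule. It is a minimal illustration under one stated assumption: ε is multiplied by $ε_{decay}$ after each update and clipped at $ε_{end}$. The table does not say whether decay is applied per step or per episode, and the names `DQNConfig`, `epsilon_at`, and `updates_until_floor` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DQNConfig:
    """Hyperparameters as listed in Table 3."""
    learning_rate: float = 0.001   # α
    discount_factor: float = 0.95  # γ
    memory_size: int = 10_000      # replay buffer capacity
    batch_size: int = 64
    eps_start: float = 1.0
    eps_end: float = 0.01
    eps_decay: float = 0.995


def epsilon_at(update: int, cfg: DQNConfig) -> float:
    """Exploration rate after `update` decay updates.

    Assumption: multiplicative decay, clipped at eps_end.
    """
    return max(cfg.eps_end, cfg.eps_start * cfg.eps_decay ** update)


def updates_until_floor(cfg: DQNConfig) -> int:
    """Number of decay updates before epsilon first reaches eps_end."""
    update = 0
    while epsilon_at(update, cfg) > cfg.eps_end:
        update += 1
    return update


if __name__ == "__main__":
    cfg = DQNConfig()
    for n in (0, 100, 500, 1000):
        print(n, round(epsilon_at(n, cfg), 4))
    print("updates until floor:", updates_until_floor(cfg))
```

Under this assumption the agent keeps exploring for a little over 900 updates before ε settles at 0.01.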

Table 4. Yard node data.

ID	Name	Type	Storage Capacity	Connected Belt Conveyors
1	Slot1	Yard	10,000	2, 15
2	Slot2	Yard	20,000	1, 2, 3
…	…	…	…	…
8	Corner8	Turnhouse	0	9, 10, 11
…	…	…	…	…
15	TrainExit	Rail Outbound	0	11
16	Port	Ship Inbound	0	1
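
The node data in Table 4 lends itself to a graph representation in which yard nodes are vertices and belt conveyors are edges. The Python sketch below builds such a graph from the rows shown above only; the elided rows are left out rather than guessed. It assumes that two nodes listing the same conveyor ID are directly linked by that conveyor, which is an interpretation of the table rather than something it states. The helper names `build_graph` and `shortest_route` are hypothetical.

```python
from collections import defaultdict, deque
from itertools import combinations

# Visible rows of Table 4: node name -> IDs of connected belt conveyors.
NODE_CONVEYORS = {
    "Slot1": {2, 15},
    "Slot2": {1, 2, 3},
    "Corner8": {9, 10, 11},
    "TrainExit": {11},
    "Port": {1},
}


def build_graph(node_conveyors):
    """Link every pair of nodes that share a conveyor (assumed adjacency)."""
    nodes_by_conveyor = defaultdict(set)
    for node, conveyors in node_conveyors.items():
        for conveyor in conveyors:
            nodes_by_conveyor[conveyor].add(node)

    graph = defaultdict(dict)  # graph[a][b] = ID of the conveyor joining a and b
    for conveyor, nodes in nodes_by_conveyor.items():
        for a, b in combinations(sorted(nodes), 2):
            graph[a][b] = conveyor
            graph[b][a] = conveyor
    return graph


def shortest_route(graph, start, goal):
    """Breadth-first search for the route with the fewest conveyor hops."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in sorted(graph.get(path[-1], {})):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None  # goal not reachable with the rows available


if __name__ == "__main__":
    g = build_graph(NODE_CONVEYORS)
    print(shortest_route(g, "Port", "Slot1"))
    print(shortest_route(g, "Port", "TrainExit"))
```

With only the visible rows, the port reaches Slot1 through Slot2, while TrainExit is unreachable because the connecting rows are elided in the table.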

Table 5. Belt conveyor data.

Conveyor ID	Throughput (t/h)	Initially Occupied	Length (m)	Vibration Frequency
1	5000	No	1610	Adjusted dynamically during the experiments to assess model safety
2	4500	No	446
3	4500	No	702
…	…	…	…
15	4200	No	968
16	4200	No	1004
17	4200	No	968

Table 6. Bulk cargo data.

Ore Type	Inbound Mode	Outbound Mode	Density (t/m³)	Storage Duration
Iron Ore	Ship Inbound	Ship/Truck/Train Outbound	4.9–5.1	30 days

Table 7. Comparison of yard operational schemes.

Method	Inbound Operation Time (h)	Inbound Energy Consumption (kWh)	Outbound Time (h)
Dueling Double DQN	22.3	5235.36	21.9
DQN	22.3	5235.36	21.9
Traditional scheme	22.78	5423.41	21.9
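
The relative gains implied by Table 7 follow from a one-line calculation, sketched here in Python. The figures are taken directly from the table; the function name `relative_reduction` is hypothetical.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline * 100.0


# Inbound figures from Table 7 (traditional scheme vs. Dueling Double DQN).
time_saving = relative_reduction(22.78, 22.3)           # hours
energy_saving = relative_reduction(5423.41, 5235.36)    # kWh

if __name__ == "__main__":
    print(f"Inbound time reduction:   {time_saving:.2f}%")
    print(f"Inbound energy reduction: {energy_saving:.2f}%")
```

The learned schemes thus cut inbound operation time by roughly 2.1% and inbound energy consumption by roughly 3.5% relative to the traditional scheme, while outbound time is identical across all three methods.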

Table 8. Parameters related to the solution.

Metric	Dueling Double DQN	DQN	Manually Generated Scenarios
Time to obtain a solution	51.7 s	61.38 s	About 30 min
Training time	24.90 s	19.81 s	N/A

Table 9. Experimental scenario.

Scenario	The Number of Storage Slots	The Remaining Capacity of All Slots	Outbound Mode and Outbound Quantity
1	3	$R_{v 1} = R_{v 2} = R_{v 3} = R_{v 5} = R_{v 6} = R_{v 11}$ = 0; $R_{v 4}$ = 50,000 t; $R_{v 7}$ = 30,000 t; $R_{v 10}$ = 20,000 t	CityExit: 20,000 t; TrainExit: 30,000 t; ShipExit: 50,000 t
2	4	$R_{v 1} = R_{v 2} = R_{v 5} = R_{v 6} = R_{v 11}$ = 0; $R_{v 3}$ = 20,000 t; $R_{v 4}$ = 30,000 t; $R_{v 7}$ = 30,000 t; $R_{v 10}$ = 20,000 t	CityExit: 20,000 t; TrainExit: 30,000 t; ShipExit: 50,000 t
3	5	$R_{v 1} = R_{v 2} = R_{v 5} = R_{v 11}$ = 0; $R_{v 3}$ = 20,000 t; $R_{v 4}$ = 30,000 t; $R_{v 6}$ = 10,000 t; $R_{v 7}$ = 20,000 t; $R_{v 10}$ = 20,000 t	CityExit: 20,000 t; TrainExit: 30,000 t; ShipExit: 50,000 t
4	6	$R_{v 2} = R_{v 11}$ = 0; $R_{v 1}$ = 10,000 t; $R_{v 3}$ = 20,000 t; $R_{v 4}$ = 30,000 t; $R_{v 6}$ = 20,000 t; $R_{v 7}$ = 10,000 t; $R_{v 10}$ = 10,000 t	CityExit: 20,000 t; TrainExit: 30,000 t; ShipExit: 50,000 t

Table 10. The impact of warehouse storage stacking and warehouse vacancy arrangement on total inbound and outbound operation time.

Number of Locations	Energy Consumption (kWh)	Time (h)	Allocation of Cargo Storage Space and Storage Capacity
3	2479.96	44.43	v3 = 50,000 t → ShipExit; v7 = 30,000 t → TrainExit; v10 = 20,000 t → CityExit
4	3121.17	44.44	v3 = 20,000 t → ShipExit; v4 = 30,000 t → ShipExit; v7 = 30,000 t → TrainExit; v10 = 20,000 t → CityExit
5	5210.94	44.43	v3 = 10,000 t → ShipExit; v4 = 30,000 t → ShipExit; v6 = 10,000 t → ShipExit; v7 = 20,000 t → TrainExit; v3 = 10,000 t → TrainExit; v10 = 20,000 t → CityExit
6	5370.70	44.65	v1 = 10,000 t → CityExit; v3 = 20,000 t → TrainExit; v4 = 30,000 t → ShipExit; v6 = 20,000 t → ShipExit; v7 = 10,000 t → TrainExit; v10 = 10,000 t → CityExit
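
Each allocation in Table 10 has to deliver the outbound quantities set in Table 9 (20,000 t to CityExit, 30,000 t to TrainExit, and 50,000 t to ShipExit). The Python sketch below checks this by summing the tonnage per exit for every row. It is an illustrative consistency check, not part of the authors' model, and the names `totals_by_exit` and `satisfies_demand` are hypothetical.

```python
from collections import Counter

# Outbound demand per exit, as given in Table 9 (tonnes).
DEMAND_T = {"CityExit": 20_000, "TrainExit": 30_000, "ShipExit": 50_000}

# Allocations from Table 10, keyed by number of storage locations:
# (slot, tonnes, exit) triples copied from the table.
ALLOCATIONS = {
    3: [("v3", 50_000, "ShipExit"), ("v7", 30_000, "TrainExit"),
        ("v10", 20_000, "CityExit")],
    4: [("v3", 20_000, "ShipExit"), ("v4", 30_000, "ShipExit"),
        ("v7", 30_000, "TrainExit"), ("v10", 20_000, "CityExit")],
    5: [("v3", 10_000, "ShipExit"), ("v4", 30_000, "ShipExit"),
        ("v6", 10_000, "ShipExit"), ("v7", 20_000, "TrainExit"),
        ("v3", 10_000, "TrainExit"), ("v10", 20_000, "CityExit")],
    6: [("v1", 10_000, "CityExit"), ("v3", 20_000, "TrainExit"),
        ("v4", 30_000, "ShipExit"), ("v6", 20_000, "ShipExit"),
        ("v7", 10_000, "TrainExit"), ("v10", 10_000, "CityExit")],
}


def totals_by_exit(allocation):
    """Sum the allocated tonnage for each outbound exit."""
    totals = Counter()
    for _slot, tonnes, exit_name in allocation:
        totals[exit_name] += tonnes
    return dict(totals)


def satisfies_demand(allocation, demand=DEMAND_T):
    """True if the allocation ships exactly the demanded tonnage per exit."""
    return totals_by_exit(allocation) == demand


if __name__ == "__main__":
    for locations, allocation in ALLOCATIONS.items():
        print(locations, totals_by_exit(allocation), satisfies_demand(allocation))
```

All four rows meet the demand exactly, so the differences in energy consumption and time across the rows come from how the same 100,000 t is spread over the storage locations rather than from differing quantities.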

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

