1. Introduction
One of the most important operational goals of a container terminal is to minimize the vessel turnaround time by maximizing the efficiency of vessel operations with regard to loading or unloading containers onto or from the vessels. The outbound containers brought in by the trucks from inland are stored in the yard until they are loaded onto the vessels. On the other hand, the inbound containers unloaded from the vessels dwell in the storage yard until they are claimed by the external trucks for inland transportation. As buffer storage for both inbound and outbound containers, the operational efficiency in the storage yard critically affects the overall productivity of the terminal. A very important factor that affects the operational efficiency in the storage yard is the determination of the stacking locations of the containers that arrive at the yard. If, for example, a container just unloaded from a vessel is stacked on top of another to be loaded soon, the upper one has to be relocated to a different stack when retrieving the lower one. Such rehandling of containers should be minimized in order to maximize the operational efficiency in the yard. When the containers are loaded onto a vessel, they follow a predetermined sequence. The loading sequence is determined at the planning stage, taking into account the vessel stability, ports of destination, and the efficiency of operation at the storage yard. Still, rehandling is the major cause of loading delay because the retrieval schedules are usually unknown at the time the containers arrive at the yard, and thus, they can be stacked at the wrong locations. In this paper, we deal with the problem of determining good locations for stacking not only the containers that newly arrive at the yard but also those that are relocated within the yard.
Perhaps the simplest approach to determining the stacking locations in the yard is using a heuristic rule based on a simple criterion such as preferring the nearest location or preferring the stack-top of containers of the same category [1,2,3]. More recent and advanced methods use multiple criteria to evaluate candidate stacking locations from various perspectives [4,5,6]. These methods use scoring functions to calculate the score of a candidate location by a weighted sum of the results of evaluations based on various criteria. Since the scores and the resulting best stacking location depend on the weight combination, or weight vector, used in the scoring function, the weight vector can be regarded as a stacking policy. To find a good weight vector, i.e., a good stacking policy, References [4,6] use genetic algorithms (GA) in which each candidate policy is evaluated through simulations of applying the policy to various scenarios of operations in the yard and measuring the resulting performance. However, the policy found in this way is the one whose average performance in various scenarios is the best. In certain situations, there may be other policies that work better than this best on-average policy.
What we propose in this paper is a method for deriving a stacking policy, which can be adapted to changing situations by having its weight vectors adjusted. As an indicator of the situation, we use the workload of vessel operation because good stacking locations are very much dependent on it. Our method is based on the idea that a new policy may be synthesized if we are given two boundary policies: one for a very low workload and the other for a very high workload. We assume that a good policy for any intermediate situation can be derived by taking an interpolation of the weight vectors of the two boundary policies. We use a GA to search for not only the two boundary policies but also the two numeric values that quantify the two fuzzy terms ‘very low’ and ‘very high’. Our GA can be seen as a reinforcement learning algorithm conducting a search in the policy space [7] instead of learning a value function based on rewards from the environment. Experimental results show that our method performs better by dynamically adapting the stacking policy to varying situations than the previous methods that use a static best on-average policy.
The rest of the paper is organized as follows. Section 2 gives a detailed description of the operations in the storage yard of an automated container terminal. Section 3 reviews the related works, and Section 4 describes the stacking policy based on the scoring functions. Section 5 explains how we derive a situation-adaptive stacking policy by using a GA. Section 6 reports the results of experiments, and Section 7 discusses how we can extend our method to the cases with multiple situation indicators. Finally, Section 8 gives some concluding remarks.
2. Operations in the Storage Yard of an Automated Container Terminal
The descriptions in this section are mostly based on the material given in [4,6,8]. As can be seen in Figure 1, the automated container terminal can be largely divided into four regions: quay, apron, storage yard, and hinterland. The quay is where the vessels berth and a number of quay cranes (QC) load/unload containers onto/from the vessels. The apron is the area for the automated guided vehicles (AGV) to deliver containers between the quay and the storage yard. The hinterland is where the external trucks (ET) bring in containers to the storage yard or bring out the containers picked up from the storage yard. The storage yard consists of dozens of rectangular blocks that are laid out in the perpendicular direction to the quay. Each block consists of hundreds of container stacks several tiers high, where the stacks are arranged in dozens of bays in the perpendicular direction and in several rows in the horizontal direction. Since a bay is of the length of a 20 ft container, a stack of 40 ft containers spans two consecutive bays. For safety reasons, the containers cannot be stacked together if their sizes are not the same. Each block is equipped with two automated stacking cranes (ASC) to handle the containers. Since the two ASCs are of the same size, one cannot pass the other, and thus they may interfere with each other at close range. The container transfer to and from an AGV is made by the seaside ASC at a seaside handover point (HP) located at the seaside end of the block, while the transfer to and from an ET is made by the landside ASC at a landside HP located at the opposite end.
The containers in the storage blocks are categorized into three groups by their directions of the flow of logistics: inbound, outbound, and transshipment containers. The inbound and outbound containers are already mentioned in the previous section. The transshipment containers are those that are unloaded from certain vessels and stored in the yard, but unlike the inbound containers, they are to be loaded onto other vessels for further sea transportation. A container stored in a block is often relocated to some other place within the block. If container X is placed on top of container Y in a stack but Y has to be taken out earlier than X, then X has to be moved to another location before Y can be retrieved. This relocation, or rehandling, is hard to avoid because the retrieval schedule of containers is usually unknown at the time the containers arrive at the yard and are piled up. Thus, we need a good policy that can select good stacking locations for the incoming containers so that both the container handling time and the possibility of rehandling are minimized. Note that a stacking policy should also be able to recommend good stacking locations for the rehandled containers to minimize further rehandling. Rehandling is considered the biggest cause of delay of container handling in the storage yard.
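The X-on-top-of-Y condition described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the retrieval order of every container in a stack is known and is given as a rank (a lower rank leaves earlier); in practice this order is usually unknown at stacking time, which is exactly why rehandling is hard to avoid. The function name `count_blocking` is hypothetical, and the count is only a lower bound on the actual number of rehandles, because a relocated container may block another container again at its new location.

```python
def count_blocking(stack):
    """Count containers that sit above a container that leaves earlier.

    `stack` lists retrieval ranks from bottom to top (assumed known here).
    Each counted container must be relocated at least once.
    """
    blocking = 0
    earliest_below = float("inf")  # smallest rank seen so far underneath
    for rank in stack:
        if rank > earliest_below:
            # Some container underneath has to be retrieved first.
            blocking += 1
        earliest_below = min(earliest_below, rank)
    return blocking


# Y (rank 1) at the bottom, X (rank 2) on top: X must be moved first.
print(count_blocking([1, 2]))
# Reversed order: X leaves first, so no relocation is needed.
print(count_blocking([2, 1]))
```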
There arises another difficulty with container handling when the two ASCs of a block cannot move across each other. To avoid collision, an ASC sometimes has to stop and wait until the other one finishes its job and backs up. This interference deteriorates the throughput of the ASCs. Interferences are more likely to occur as the travel distance of an ASC becomes longer because it gets close to the other ASC with a higher probability. Unfortunately, the ASCs often cannot avoid long-distance travel in such a block layout as that shown in Figure 1. The outbound containers brought in by the ETs enter the block through the landside HPs and are usually stored at locations near the landside end. However, they eventually go out of the block through the seaside HPs when they are loaded onto their target vessels. On the other hand, the inbound containers unloaded from the vessels enter the block through the seaside HPs and then later exit through the landside HPs. These long-distance movements of containers in opposite directions easily lead to ASC interferences. One way of minimizing such interference is to have the containers tactically relayed through cooperation between the two ASCs. As an example, an inbound container stacked near the seaside end of the block can be moved first to a certain intermediate location by the seaside ASC and then to its final destination HP at the landside by the landside ASC. The first movement of this relay operation is called repositioning. Note that the stacking locations for the containers to be repositioned should also be determined by the stacking policy.
Given a stacking policy, container handling in a storage block is typically performed in the following way. Whenever an ASC finishes its current job, it selects the most urgent job from the job queue that contains all the jobs requested for the next horizon of length, say thirty minutes. The jobs in the queue include those to be undertaken according to the loading/unloading schedules and those requested from the ETs that have already arrived at the landside HPs but have not been serviced yet. The ETs expected to arrive at the block during the next horizon are not counted because their arrival time is highly unpredictable. If the current ASC is the landside ASC, the most urgent job would be the ET job with the longest waiting time. If the current ASC is the seaside ASC, the most urgent job would be either a loading or an unloading job with the earliest deadline according to their schedules. When the job selected is a stacking job, the stacking policy examines all the available slots in the block and recommends the best one as the stacking location. An available slot can be found at the top of every stack unless it has already reached the allowed maximum tier. However, the stack should not belong to the bay where the other ASC is currently working. When the job selected is a retrieval job, no reference to the stacking policy is necessary unless there are other containers above the target container. If there are some containers above, they all must be relocated one after another to the locations recommended by the stacking policy. When the retrieval job selected requires a travel distance longer than a given threshold, the target container should be repositioned to the location recommended by the stacking policy before it can be sent to its destination HP by the other ASC.
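The dispatching steps above can be sketched as follows. This is only an illustration under stated assumptions: the `Job` record, its field names, the job-kind labels, and the dictionary layout of the stack heights are all hypothetical and are not taken from the simulator used in this paper.

```python
from dataclasses import dataclass


@dataclass
class Job:
    kind: str                        # "ET", "LOAD", or "UNLOAD" (assumed labels)
    waiting_time: float = 0.0        # seconds an ET has already waited
    deadline: float = float("inf")   # scheduled deadline of a vessel job


def select_job(queue, asc_side):
    """Pick the most urgent job for the ASC that has just become idle."""
    if asc_side == "landside":
        # Landside ASC: the ET job with the longest waiting time.
        et_jobs = [j for j in queue if j.kind == "ET"]
        return max(et_jobs, key=lambda j: j.waiting_time, default=None)
    # Seaside ASC: the loading/unloading job with the earliest deadline.
    vessel_jobs = [j for j in queue if j.kind in ("LOAD", "UNLOAD")]
    return min(vessel_jobs, key=lambda j: j.deadline, default=None)


def available_slots(stack_heights, max_tier, other_asc_bay):
    """Stack tops that are below the maximum tier and not in the bay
    where the other ASC is currently working.

    `stack_heights` maps (bay, row) to the current height of that stack.
    """
    return [
        (bay, row)
        for (bay, row), height in stack_heights.items()
        if height < max_tier and bay != other_asc_bay
    ]
```

The slots returned by `available_slots` are the candidates that the stacking policy would then score; relocation and repositioning jobs would reuse the same candidate list.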
The efficiency of ASC operation in a block can be assessed by measuring the AGV delay and the ET waiting time, where the former is considered far more important than the latter. A good stacking policy makes the ASC operation efficient by reducing the interference, rehandling, and the overall container handling time, which makes it possible for the ASCs to provide the AGVs and ETs with faster services. However, it takes quite a long period of observation to see how good a stacking policy is. Whether or not the recommended stacking locations are good can be seen better when the containers are retrieved from the block than when they come into the block. Since the average dwell time of containers in container terminals is often longer than a week, the AGV delay and ET waiting time should be measured over a long period during which enough containers are retrieved, under the condition that the containers keep coming in and going out fairly constantly during that period.
3. Related Works
There are some previous works on container stacking that deal with the problem of allocating the storage spaces for incoming containers. Reference [9] showed how to organize the storage area to minimize the number of container handling moves given a fixed amount of space, based on simple models that capture the relationship between the handling moves and the amount of available space. Reference [10] developed a space allocation method for inbound containers so as to minimize the expected number of rehandles. References [11,12] used a mixed-integer programming model together with some heuristics to allocate storage space for outbound and transshipment containers, respectively. Reference [13] applied a constraint satisfaction technique to allocate spaces to outbound containers. Reference [14] developed a space allocation method that can cope with the uncertainties in loading/unloading times of vessels. Reference [15] used a genetic algorithm to optimize the space allocation problem to avoid bottlenecks in storage yard operations and to minimize vessel service time. All these methods allocate a bulk of storage locations for reservation prior to the arrival of containers. If the arrival plan changes, the storage space must be reallocated. In contrast, the methods discussed below designate specific storage locations for each individual container at the time of its arrival.
The majority of the previous works on stacking locations have used rules or heuristics. Reference [16] considered the configuration of the container stack and the weight distribution of containers in the yard-bay to derive a decision tree model that determines the storage location of each outbound container. The decision tree can be deemed as another representation of a set of rules. Reference [1] proposed stacking rules that recommend the containers belonging to the same category be stacked together. Containers of the same category are of the same weight class and size, have the same destination port, and are loaded onto the same vessel. Reference [2] proposed rules that consider not only the category but also the height of the stacking position. Reference [3] suggested a heuristic rule for determining the locations of relocated containers to minimize the number of relocations during the retrieval process. Reference [17] conducted simulation studies to investigate the effect of using information about container departure times and the tradeoff between stacking farther away versus stacking close to the HPs. It used simple stacking rules that are designed to work in perpendicularly laid out storage blocks in an automated container terminal. Reference [18] presented what they call a hybrid sequence stacking method that determines the stacking locations of outbound containers considering the container weights. Proposing an ideal configuration of a yard bay to avoid rehandling, this method tries to stack the incoming containers so that their positions are as close to the ideal configuration as possible.
Some previous works tried to adopt more AI (artificial intelligence) techniques based upon rules or heuristics. References [19,20] derived stacking policies for outbound containers considering the uncertainties in their weights with the purpose of minimizing rehandling. Their policy consists of three precedence rules, each for a container weight group, where the rules are optimized by a simulated annealing algorithm. Reference [21] proposed a heuristic method to stack outbound containers. This method evaluates each candidate stacking location for an incoming container through a simulation and selects the best one. In the simulation, after stacking the container at the candidate location, the remaining containers arriving in a random sequence are stacked following a heuristic priority rule and the resulting performance is measured. Reference [22] used simple rules that determine the stacking positions based on the stack height and the estimated time of retrieval. The rules adopt fuzzy logic to represent their conditions to deal with a high degree of uncertainty in the arrival of containers at the yard. Reference [23] proposed a multi-agent system for container stacking, in which the stack agent recommends a stacking position by consulting the knowledge base composed of if-then rules that check various conditions such as the container types, the configuration of the storage space, and the occurrence of exceptional events. Each stacking decision is evaluated by the evaluation agent that rejects the decision when unacceptable. If a decision turns out to be unacceptable, a learning mechanism is activated to add a new rule to the knowledge base so that the rules responsible for the wrong decision can be disabled. This learning mechanism makes the proposed system adaptive to changes, while the adaptation is mainly focused on the disturbances and unexpected events. Reference [24] investigated the impact of container stacking methods regarding how they deal with uncertainties in container terminals and reduce container handling costs. The stacking methods studied, however, determine only the best yard-bays but not the specific stacking slots.
Compared to the works discussed above, References [4,6] are much more closely related to our work. The stacking policy in these studies employs scoring functions that evaluate a candidate stacking location from various perspectives using different criteria, where the score of a location is calculated as the weighted sum of the scores for those criteria. The policy uses different scoring functions for different container types because each container type requires its own evaluation criteria for stacking. However, all those scoring functions with different weight vectors are together treated as a single policy. This policy is optimized by a search using a GA, where a candidate policy is evaluated by simulating the operations at a block under the policy for a certain period of time and measuring the resulting performance. For this simulation, Reference [4] provided a pool of operation scenarios of various kinds for a more accurate policy evaluation. To evaluate a policy, it is applied to a randomly selected subset of those scenarios, and the resulting performances are averaged. Therefore, the policy thus optimized can be said to be the best on average. Given a certain situation, there may be some other policy that works better than the on-average best policy. Another limitation of such a policy is that it cannot change as the operational environment changes. As an effort to overcome this problem, Reference [5] proposed an online search algorithm that dynamically adjusts and optimizes a stacking policy by continuously generating variants of stacking policies and evaluating them while they are actually being applied for determining the stacking positions. However, this online search cannot keep up with the rapid changes in a situation. When the situation changes, the performance of the current policy begins to deteriorate, at which time a variant of the current policy may show better performance. If this happens, the current policy would be switched to that variant, but only after experiencing some deterioration. In general, we cannot expect such a good variant to appear quickly at the right moment. Therefore, we can say the online search is not really reactive to changes, but it just gradually adapts to changes. Another drawback is that the online search cannot find really good policies because of its limited explorative capacity. Since all the variants must be simulated and tested online, it is hard to generate and test a large number of variants under a real-time constraint, which deteriorates the search performance.
The stacking policy derived by the method proposed in this paper is a significant improvement on the policy proposed in [4,6]. While [4,6] look for a policy whose average performance in various situations is the best, our method derives a policy that can quickly adapt to changing situations. When the situation changes, our policy immediately reacts to the change and provides a newly customized action. This improvement is possible because the policy that we deal with is based on scoring functions and the weight vectors used in those functions are easily adjustable. Most previous works reviewed above in the second and the third paragraphs of this section use policies based on if-then rules. Those rules are carefully crafted by the designers rather than being optimized by any algorithms. They are hard to modify automatically upon situation changes. While the method proposed in [5] looks closest to ours in that the policy can adapt to changing situations, its adaptation is slow or gradual rather than immediate or reactive. The different characteristics of the related works discussed so far are compared and summarized in Table 1. The works reviewed in the first paragraph of this section are not included in the table because they have little relevance to our work.
4. Stacking Policy Based on Scoring Functions
This section describes the stacking policies proposed by [4,6]. The stacking location of an incoming container is determined in two stages. First, the container is assigned to a block in the storage yard taking into account various operational conditions in all of the blocks. Second, a specific location within the assigned block is selected from among the candidate locations based on several criteria such as the distance to the destination, the stack height, the likelihood of rehandling, and so on. The stacking policies described in this paper are used in the second stage to determine a specific stacking location within the designated block.
To determine a stacking location within a block, all the available slots in the block are evaluated by using a scoring function and then the one with the best score is chosen. Note that the slot determination is required not only for the containers newly coming into the block but also for those that are rehandled or repositioned within the block. Furthermore, a good target slot can be different depending on whether the container is an inbound, an outbound, or a transshipment container.
Table 2 shows that the stacking policy uses different scoring functions for different container types. The score $f_i(x)$ of a slot $x$ for the $i$th container type is calculated by the weighted sum given below:

$$f_i(x) = \sum_{j} w_{ij}\, c_{ij}(x) \tag{1}$$

where $c_{ij}(x)$ is the evaluation value of slot $x$ according to the $j$th criterion for the $i$th container type and $w_{ij}$ is the weight for $c_{ij}$. The stacking policy of Table 2 consists of seven scoring functions, each of which employs a different subset of eight criteria. Notice that the decision by the policy may change as the values of the weights of the criteria change.
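The weighted-sum scoring and the choice of the best slot can be sketched as follows. This is a minimal illustration, not the implementation of [4,6]: the criterion names, the slot labels, and the numeric values are made up, and the sketch assumes that the best score is the highest one.

```python
def score(weights, values):
    """Weighted sum of normalized criterion values for one candidate slot."""
    return sum(w * values[name] for name, w in weights.items())


def best_slot(weights, candidates):
    """Return the candidate slot with the highest score (assumed 'best')."""
    return max(candidates, key=lambda slot: score(weights, candidates[slot]))


# Two hypothetical slots evaluated on a distance criterion and stack height,
# both normalized to [0, 1].
candidates = {
    "A": {"distance": 0.2, "H": 0.75},
    "B": {"distance": 0.6, "H": 0.0},
}

# A policy that penalizes distance strongly prefers the nearer slot A ...
print(best_slot({"distance": -1.0, "H": -0.5}, candidates))
# ... while one that penalizes stack height strongly prefers slot B.
print(best_slot({"distance": -0.2, "H": -1.0}, candidates))
```

The two calls use the same candidates and differ only in the weight vector, which is why the weight vector itself can be regarded as the policy.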
The first criterion is the distance to the candidate stacking location from the current location of the target container, and the second is the distance from the candidate location to the outgoing HP of the container. These two criteria are calculated differently for different types of containers, as illustrated in Figure 2. The two distances have the same value if the target container is a transshipment container.
H is the height of the stack underneath the candidate location. The higher the stack, the greater the likelihood of rehandling.
E is an indicator of whether or not the candidate location is an empty ground. The empty grounds have to be saved as much as possible in preparation for the possible shortage of stacking locations.
S is the amount of reduction in empty ground slots available for container stacking. For safety reasons, containers of different sizes cannot be stacked together. A 40 ft container occupies two adjacent stacks, as shown in Figure 1. Therefore, placing a 20 ft container on one of the two consecutive empty ground slots not only uses one ground slot for a 20 ft container but also reduces the availability of ground slots for 40 ft containers.
T indicates whether or not the container right underneath the candidate location has been temporarily repositioned to that slot. T is used only in the scoring functions for the containers to be repositioned. Since a repositioned container will soon be moved to an outgoing HP, rehandling is quite likely to occur if there is some other container on top of it. However, if the container on top of it is also a repositioned container, rehandling can be avoided by simply moving the upper one to its own outgoing HP before moving the lower one. Criterion T encourages the repositioned containers to be kept together in the same stacks. This is desirable to save stacking slots available for the containers newly coming in or for those rehandled. G is the estimated likelihood of the occurrence of rehandling when a container to be loaded to a vessel is stacked on top of others. If the container was just stacked and all the containers underneath belong to the same category, no rehandling occurs during loading. Otherwise, the underneath containers belonging to different categories can cause rehandlings. As mentioned before, the containers belonging to the same category are to be loaded onto the same vessel, have the same port of destination, are of the same size, and are of the same weight class.
P appears only in the scoring function specialized for the outbound containers brought in by the ETs. It is the preference value of a candidate location depending on which region of the block the location belongs to. It is preferable that the outbound containers to be loaded sooner are stacked closer to the seaside. Figure 3 illustrates the distributions of the preferences over different regions in a block for the outbound containers of different loading times. The block is divided into five regions, and there are three different urgency levels for loading. The preferences are distributed differently for each urgency level, resulting in fifteen preference values, one for each combination of region and urgency level. In the previous work [6], these preference values as well as all the weight values in Table 2 were determined by running a GA-based search algorithm.
Table 3 summarizes the eight criteria explained above. To calculate the weighted sum of the respective subsets of these criteria, the policy of Table 2 uses 41 weights. The evaluation values of all the criteria are normalized to [0, 1] and the weight values are constrained to [−1, 1]. Note that a weight can be negative if the value of the corresponding criterion affects the policy adversely. As the decision made by the policy depends on the weight combination or the weight vector, the weight vector is considered as the policy. When the policy is optimized by using a GA, each candidate policy is evaluated by applying it to a variety of scenarios of operations in a block and averaging the resulting performances. In this way, the optimization algorithm derives a policy that works the best on average. Given a certain situation, however, a different policy might perform better than this best on-average policy. In the next section, we explain how we derive a policy that can be dynamically adapted to changing situations.
5. Proposed Method
In this research, we consider the current workload of vessel operation as the only important indicator of the current operational situation in a storage block. The workload of vessel operation from the standpoint of a block is the workload of its seaside ASC that handles the containers to be loaded to or unloaded from the vessels. Since any delay by the seaside ASC leads to a delay of the vessel operation at the quay, its efficient operation is critical. It may not be desirable, for example, that the seaside ASC spends too much time in container stacking when the seaside workload is high. The containers arriving at the seaside HPs are better stacked at locations not far from those HPs in order not to have other vessel operations delayed. However, a desired amount of such adjustment of travel distance for stacking cannot be made in any obvious way. It is difficult to invent a formula relating the amount of adjustment with the seaside workload that is continuously changing over time. One intuitive approach to dealing with continuously changing situations would be to divide the situations into a finite number of representative situations and then to derive a specialized policy for each representative situation. Note that a coarse division would not be effective enough because each constituent policy would suffer from the same problem of showing only the best on-average performance although to a lower degree. On the other hand, a very fine division requires too many policies that all must be derived through computationally expensive optimization search.
Another approach one may think of would be to turn the criteria used in the scoring functions of the policy into functions of the seaside workload. While most criteria are clearly independent of the seaside workload, the regional preference P seems to be dependent on it. When the seaside workload is heavy, preferring the seaside regions is not desirable because the chances of interference with the seaside ASC get high. In fact, the preference as a function of the seaside workload seems necessary more for the inbound than the outbound containers because the seaside ASC may not want to travel a long distance for container stacking when its load is heavy. Furthermore, there are some weights whose desirable values seem to depend on the seaside workload, although the corresponding criteria do not. Examples are the weights for the two distance criteria. For an inbound container, the criterion measuring the travel distance to the candidate location should be considered more important (i.e., should be given a larger weight) as the seaside workload gets higher, in order to save travel time. However, we do not know how exactly the values of the criteria or weights should change as a function of the seaside workload. In our proposed method, therefore, we exclude P from the policy, and instead, we synthesize a new policy from two boundary policies whenever needed: one for a very low workload and the other for a very high workload. For the synthesis, we take an interpolation of the two boundary policies. When we use a GA to search for the two boundary policies, we simultaneously search for the two threshold values to quantify the two fuzzy terms ‘very low’ and ‘very high’.
Let s represent the current workload that is measured by adding up the estimated processing times of all the vessel jobs scheduled to be done by the seaside ASC within the next horizon of length h seconds from the current point of time. A vessel job is either a loading or an unloading job. For a loading job, the seaside ASC makes an empty trip from its current location to the location of the target container, picks up the container, makes a loaded trip to a seaside HP, and puts the container down on top of an AGV waiting there. Among these actions, container pickup can take longer if it involves rehandlings. For an unloading job, the seaside ASC undertakes an empty trip from its current location to the seaside HP where the AGV bringing the target container is parked, picks up the container from the AGV, undertakes a loaded trip to the designated stacking location, and puts the container down at that location. Since the loading and unloading schedules for each vessel are predetermined at the planning stage well before the real operation starts, the workload of vessel operations within a horizon can be easily estimated.
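The workload measure defined above can be sketched as a simple sum. The sketch assumes that each vessel job is given as a pair of its scheduled start time and its estimated processing time (both in seconds), and that a job counts toward the workload when its scheduled time falls inside the half-open window from the current time to the end of the horizon; the function name and the tuple layout are hypothetical.

```python
def seaside_workload(vessel_jobs, now, h):
    """Sum the estimated processing times of the vessel jobs scheduled
    within the next horizon of h seconds.

    `vessel_jobs` is an iterable of (scheduled_time, processing_time) pairs.
    """
    return sum(
        processing_time
        for scheduled_time, processing_time in vessel_jobs
        if now <= scheduled_time < now + h
    )


# Three hypothetical jobs and a thirty-minute horizon: the third job lies
# outside the horizon and is not counted.
jobs = [(100, 90), (1500, 120), (2000, 60)]
print(seaside_workload(jobs, now=0, h=1800))
```

The estimated processing time of a loading job would include the empty trip, the pickup (with any rehandling), the loaded trip, and the drop-off described above; how these components are estimated is not covered by this sketch.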
We use $\theta_L$ and $\theta_H$ to denote the threshold values for the extreme or boundary workloads; the workload is said to be very low if $s \le \theta_L$ and very high if $s \ge \theta_H$. The seaside ASC is said to be overloaded if $s > h$, as the time taken to finish the work planned for the horizon exceeds the length of the horizon. Let $\pi_L$ and $\pi_H$ be the policies specialized for the situations of very low workload and very high workload, respectively. Then, the policy $\pi_s$ for workload $s$ with $\theta_L < s < \theta_H$ can be synthesized from $\pi_L$ and $\pi_H$ by deriving new weight values to be used in the scoring functions of $\pi_s$ through interpolations between the corresponding weights in $\pi_L$ and $\pi_H$. The $i$th weight $w_i^s$ to be used in policy $\pi_s$ is calculated as

$$w_i^s = \frac{(\theta_H - s)\, w_i^L + (s - \theta_L)\, w_i^H}{\theta_H - \theta_L} \tag{2}$$

where $w_i^L$ and $w_i^H$ are the $i$th weights in $\pi_L$ and $\pi_H$, respectively. As $s$ gets closer to $\theta_L$, $w_i^s$ is influenced more by $w_i^L$ than $w_i^H$, or the other way around. Note that the score $f_s(x)$ for a candidate slot $x$ by the synthesized policy $\pi_s$ can be directly calculated from the scores $f_L(x)$ and $f_H(x)$ as

$$f_s(x) = \frac{(\theta_H - s)\, f_L(x) + (s - \theta_L)\, f_H(x)}{\theta_H - \theta_L} \tag{3}$$

without actually deriving the individual weights constituting $\pi_s$ because the values $\theta_L$, $\theta_H$, and $s$ in Equation (2) are independent of $i$.
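The interpolation described above can be checked numerically with a small sketch. It assumes that a scoring function is a weighted sum of criterion values, which is what makes interpolating the scores equivalent to interpolating the weights. All names and the toy weights and criterion values are hypothetical; the two thresholds are the values reported later in the experiments.

```python
def interpolate(low, high, s, theta_low, theta_high):
    """Linear interpolation between a 'very low' and a 'very high' value,
    clamped to the boundary values outside the two thresholds."""
    if s <= theta_low:
        return low
    if s >= theta_high:
        return high
    t = (s - theta_low) / (theta_high - theta_low)
    return (1.0 - t) * low + t * high


def score(weights, criteria):
    """Weighted-sum score of one candidate slot (assumed linear form)."""
    return sum(w * c for w, c in zip(weights, criteria))


# Toy numbers, for illustration only.
w_low = [0.5, -0.2, 0.8]
w_high = [-0.1, 0.6, 0.3]
criteria = [0.4, 0.9, 0.1]
s, theta_low, theta_high = 1000.0, 274.0, 1892.0

# Route 1: interpolate every weight, then score the slot.
w_s = [interpolate(a, b, s, theta_low, theta_high) for a, b in zip(w_low, w_high)]
via_weights = score(w_s, criteria)

# Route 2: score the slot with both boundary policies, then interpolate once.
via_scores = interpolate(
    score(w_low, criteria), score(w_high, criteria), s, theta_low, theta_high
)
```

Both routes give the same score up to rounding, so the second one, which needs a single interpolation per candidate slot, is the cheaper choice at stacking time.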
We use a GA for our optimization, which is basically the same as that used in [4]. We optimize not only $\pi_L$ and $\pi_H$ but also the two threshold values $\theta_L$ and $\theta_H$. Figure 4 shows the representation of the candidate solution adopted by our GA. Since each policy consists of 40 weights after dropping the criterion P, there are 82 real values to be optimized in total, where the threshold genes $\tau_L$ and $\tau_H$ are constrained to be in [0, 2] and the weight values in [−1, 1]. During the evaluation, $\tau_L$ and $\tau_H$ are decoded to $\theta_L$ and $\theta_H$, respectively, by multiplying them by the length of the horizon $h$. The reason for setting the upper bound of $\tau_L$ and $\tau_H$ to 2 is that the workload of vessel operation measured in time can exceed the length of the horizon when the ASC is overloaded.
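A minimal sketch of how such a chromosome might be decoded is given below. The ordering of the genes (40 weights of the low-workload policy, then 40 weights of the high-workload policy, then the two threshold genes) is an assumption made for illustration, and the function and variable names are hypothetical. The sketch does not address the case in which the decoded low threshold exceeds the high one.

```python
N_WEIGHTS = 40  # weights per boundary policy after dropping criterion P


def decode(chromosome, horizon):
    """Split an 82-gene chromosome into two weight vectors and two
    thresholds expressed in seconds (assumed gene layout)."""
    if len(chromosome) != 2 * N_WEIGHTS + 2:
        raise ValueError("expected 82 genes")
    weights_low = list(chromosome[:N_WEIGHTS])
    weights_high = list(chromosome[N_WEIGHTS:2 * N_WEIGHTS])
    gene_low, gene_high = chromosome[2 * N_WEIGHTS:]
    if not all(-1.0 <= w <= 1.0 for w in weights_low + weights_high):
        raise ValueError("weights must lie in [-1, 1]")
    if not (0.0 <= gene_low <= 2.0 and 0.0 <= gene_high <= 2.0):
        raise ValueError("threshold genes must lie in [0, 2]")
    # Threshold genes are fractions of the horizon; values above 1
    # correspond to an overloaded seaside ASC.
    return weights_low, weights_high, gene_low * horizon, gene_high * horizon


chromosome = [0.1] * N_WEIGHTS + [-0.1] * N_WEIGHTS + [0.25, 1.0]
w_low, w_high, theta_low, theta_high = decode(chromosome, horizon=1800.0)
```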
To evaluate a candidate policy during the search, the policy is applied through simulation to a set of scenarios randomly chosen from the provided pool and the resulting performances are averaged. The pool contains various scenarios of different difficulty levels; a scenario is difficult if the workloads of the ASCs are high. The length of a scenario is three weeks, which is long enough to measure the efficiency of the ASC operation because enough of the containers that arrived during this period are retrieved. During the first two weeks, the stacking yard, or block, is initialized starting from an empty yard without simulating the ASCs' movements. Then, from the beginning of the third week, the efficiency of the ASC operation is measured with their movements simulated realistically, reflecting acceleration, deceleration, and interferences. More details on this crane simulation can be found in [25]. For an evaluation of the performance in the third week of a scenario, a candidate policy $\pi$ is applied to the scenario. When a container has to be stacked, the workload $s$ of vessel operation for the next horizon of length $h$ seconds is estimated. If $s \le \theta_L$ or $s \ge \theta_H$, then $\pi_L$ or $\pi_H$ becomes the policy to be used, respectively. Otherwise, a new policy $\pi_s$ specialized for the workload $s$ is synthesized through interpolation and then applied for stacking. Note that we need to synthesize only one of the seven scoring functions shown in Table 2 depending on the type of container to be stacked. This synthesis and application of a new policy are repeated every time a container is stacked. When the simulation of a scenario is over, the performance of the candidate policy is measured by the following objective function:
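$$F(\pi) = \alpha_{\mathrm{AGV}}\, D_{\mathrm{AGV}}(\pi) + \alpha_{\mathrm{ET}}\, W_{\mathrm{ET}}(\pi) \tag{4}$$

where $\pi$ is the stacking policy under evaluation, $D_{\mathrm{AGV}}(\pi)$ is the average (per container) AGV delay observed under $\pi$, $W_{\mathrm{ET}}(\pi)$ is the average waiting time of ETs under $\pi$, and $\alpha_{\mathrm{AGV}}$ and $\alpha_{\mathrm{ET}}$ are the respective weights for $D_{\mathrm{AGV}}(\pi)$ and $W_{\mathrm{ET}}(\pi)$. $\alpha_{\mathrm{AGV}}$ is usually much larger than $\alpha_{\mathrm{ET}}$ because the seaside operations are considered much more important than the landside operations. The final evaluation is obtained by averaging the objective values measured from all the scenarios.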
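The evaluation of a candidate policy can be sketched as a weighted sum per scenario followed by an average over scenarios. In this sketch the function names and the per-scenario measurements are hypothetical; the default weights of 50 and 1 are the values reported for the experiments.

```python
def objective(avg_agv_delay, avg_et_wait, weight_agv=50.0, weight_et=1.0):
    """Weighted sum of the average AGV delay and the average ET waiting
    time measured in one simulated scenario (both in seconds)."""
    return weight_agv * avg_agv_delay + weight_et * avg_et_wait


def evaluate(scenario_results):
    """Average the objective over the scenarios sampled for one candidate.
    Each result is an (avg_agv_delay, avg_et_wait) pair."""
    values = [objective(delay, wait) for delay, wait in scenario_results]
    return sum(values) / len(values)


# Toy measurements for two scenarios, for illustration only.
fitness = evaluate([(2.0, 30.0), (4.0, 50.0)])
```

With a weight ratio of this size, even a small change in AGV delay moves the objective more than a much larger change in ET waiting time, which reflects the priority given to the seaside operations.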
6. Experimental Results
We used the algorithm named NTGA for policy optimization, which is the same one as that used in [4] to derive the static stacking policy described in Section 4. Our parameter setting of NTGA is shown in Table 4. Since NTGA requires a random subset of operation scenarios selected from a pool to evaluate each candidate policy, we generated 1000 scenarios to constitute a pool. Each scenario consists of container handling jobs to be completed over three weeks in a block that is 46 bays long, 8 rows wide, and 5 tiers high. The job requests are made by the AGVs and ETs that arrive at the seaside and landside HPs, respectively. They either bring in a container to be stored in the block or ask for a container to be picked up from the block. The average number of AGVs arriving per day ranges from approximately 220 to 300, and that of ETs from 70 to 110. More requests come from the AGVs than from the ETs because there are transshipment containers, whose proportion among the containers unloaded from the vessels is about 50% in our scenarios. The average daily workload of the ASCs increases with the number of requests, but the workload continuously changes within a day as the requests are not evenly distributed over time.
Using this pool of scenarios, we derived both the static stacking policy of [4] and the dynamic stacking policy proposed in this paper. Then, the two policies were applied to 100 scenarios that were separately generated following the same distribution as that used for generating the scenarios of the above pool. The two weight values $\alpha_{\mathrm{AGV}}$ and $\alpha_{\mathrm{ET}}$ of Equation (4) were empirically set to 50 and 1, respectively, not only when the policies were derived but also when they were tested. The results obtained by measuring the AGV delay and ET waiting time are shown in Table 5. We can see that the proposed policy outperforms the static policy in terms of both AGV delay and ET waiting time. The improvement is 22.3% for AGV delay and 16.3% for ET waiting time. We also confirmed that the proposed policy performs significantly better than the static policy by using a paired $t$-test with a confidence level higher than 99.99%. The data for our experiments and the execution of our program can be found in [26].
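As a rough illustration of this kind of comparison, the sketch below computes a relative improvement and a paired $t$ statistic from per-scenario measurements of two policies. The numbers are toy values invented for the example, not the experimental data, and the function names are hypothetical. Turning the statistic into a confidence level additionally requires the $t$ distribution with $n - 1$ degrees of freedom, which is left out here.

```python
import math
import statistics


def improvement_percent(baseline, proposed):
    """Relative reduction of the mean, in percent."""
    base_mean = statistics.mean(baseline)
    return 100.0 * (base_mean - statistics.mean(proposed)) / base_mean


def paired_t_statistic(baseline, proposed):
    """t statistic of the per-scenario differences (baseline - proposed)."""
    diffs = [b - p for b, p in zip(baseline, proposed)]
    std_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_error


# Toy per-scenario AGV delays in seconds, for illustration only.
static_delay = [10.0, 12.0, 9.0, 11.0]
dynamic_delay = [8.0, 9.0, 8.0, 8.0]

gain = improvement_percent(static_delay, dynamic_delay)
t_value = paired_t_statistic(static_delay, dynamic_delay)
```

Pairing is appropriate here because both policies are run on the same test scenarios, so scenario difficulty cancels out in the differences.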
To obtain some hints about how the dynamic policy works, we investigated the behaviors of the two boundary policies, i.e., $\pi_L$ for a very low workload and $\pi_H$ for a very high workload. The two threshold values $\theta_L$ and $\theta_H$ (see Figure 4) found by our search algorithm for distinguishing the very low and very high workloads were 274 s and 1892 s, respectively. Recall that the length of our horizon is 1800 s; the workload of 274 s is indeed very low, and that of 1892 s is over the capacity. To quantify the overall stack preference of a boundary policy $\pi_b$ for the $i$th container type (see Table 2 for the seven container types), we apply $\pi_b$ to a scenario as if it were a static policy and pay special attention to the moments of stacking containers of the $i$th type. Whenever we come across such a moment during the simulation, we not only apply $\pi_b$ to stack the container at the best location as usual, but also calculate the scores for all of the 46 × 8 slots as if they were all candidate locations and save these scores separately. For the latter calculation, the constraint of a maximum possible tier of five is relaxed. Furthermore, we assume that we can stack 20 ft containers even on top of 40 ft ones. When the simulation is over, we obtain the stack preference by averaging the separately saved scores for every slot in the block.
Figure 5a compares the stack preferences of $\pi_L$ and $\pi_H$ for the incoming inbound containers, where the slots of better scores are indicated by a darker shade. When the seaside workload is very low, the best locations by $\pi_L$ are distributed toward the landside end. This is quite reasonable because the inbound containers will eventually leave the block through the landside HPs. Since the seaside workload is low, the seaside ASC does not hesitate to travel a long distance to stack the containers at the landside end so that later retrieval by the ETs is expedited. However, we can see that the locations toward the seaside end of the block are considered not the worst but somewhat preferable by $\pi_L$. Note that $\pi_L$ is used not only in the situations of very low workloads but also in the situations of intermediate workloads through interpolation with $\pi_H$. If we separately derived a static policy specialized for a very low workload, it might not prefer any seaside locations at all. On the other hand, when the seaside workload is very high, $\pi_H$ prefers the locations closer to the seaside end to those farther away. When the seaside ASC is very busy, it should avoid long-distance travel so as not to delay the services to the AGVs waiting at the seaside HPs.
Figure 5b shows the overall stack preferences of $\pi_L$ and $\pi_H$ for the incoming outbound containers. We can see that the stack preferences are almost the opposite of what we have seen for the inbound containers.
Figure 5c shows the stack preferences of the static policy for the inbound (shown in the upper part) and outbound (shown in the lower part) containers. It seems that the static policy generally prefers the locations closer to the departure HPs regardless of the seaside workload.
7. Cases with Multiple Situation Indicators
Thus far, we have assumed that the seaside workload is the only indicator we consider to represent the situation. Although such cases are rare in practice in container terminals, we can imagine cases in which the situation is represented by multiple indicators. Our method described in Section 5 can be extended to cover such cases by generalizing the interpolation to a weighted average of multiple relevant terms. Consider, for simplicity of explanation, the case with two indicators $s_1$ and $s_2$. We need two threshold values for each indicator to distinguish between very low and very high values, i.e., $\theta_1^L$ and $\theta_1^H$ for $s_1$, and $\theta_2^L$ and $\theta_2^H$ for $s_2$. This leads to four extreme or boundary situations $S_{LL}$, $S_{LH}$, $S_{HL}$, and $S_{HH}$, where $S_{LL}$ is the set of situations with $s_1 \le \theta_1^L$ and $s_2 \le \theta_2^L$, $S_{LH}$ with $s_1 \le \theta_1^L$ and $s_2 \ge \theta_2^H$, $S_{HL}$ with $s_1 \ge \theta_1^H$ and $s_2 \le \theta_2^L$, and $S_{HH}$ with $s_1 \ge \theta_1^H$ and $s_2 \ge \theta_2^H$. We use $\pi_{LL}$, $\pi_{LH}$, $\pi_{HL}$, and $\pi_{HH}$ to denote the policies specialized for the boundary situations $S_{LL}$, $S_{LH}$, $S_{HL}$, and $S_{HH}$, respectively. Given an intermediate situation $s$ other than those boundary situations, the policy $\pi_s$ for $s$ can be synthesized from $\pi_{LL}$, $\pi_{LH}$, $\pi_{HL}$, and $\pi_{HH}$ by taking a weighted average after normalizing the indicator values. The normalization is necessary to compensate for the different scales of different indicators.
Figure 6a represents the space of all situations on a two-dimensional plane formed by two coordinates, one for indicator $s_1$ and the other for indicator $s_2$. We can see how the areas of the four boundary situations $S_{LL}$, $S_{LH}$, $S_{HL}$, and $S_{HH}$ are located in relation to the threshold values of the two indicators. A and B in the figure are two situations other than the boundary situations. A is an intermediate situation whose indicator values do not go over any threshold. B is not quite an intermediate situation because one of its indicators, $s_1$, takes a value below the lower threshold $\theta_1^L$.
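Figure 6b shows the situations in Figure 6a after a normalization, where each indicator value $s_k$ is transformed to $(s_k - \theta_k^L)/(\theta_k^H - \theta_k^L)$. In Figure 6b, the areas of boundary situations are marked by the corresponding policies, and the distances to those areas from A and B are indicated by $d$'s. Let $\pi_A$ denote the policy for situation A in the figure. Then, the score $f_A(x)$ for a candidate slot $x$ in situation A can be calculated by a weighted average of the scores given by $\pi_{LL}$, $\pi_{LH}$, $\pi_{HL}$, and $\pi_{HH}$:

$$f_A(x) = \frac{\sum_{j,k \in \{L,H\}} f_{jk}(x)/d_{jk}}{\sum_{j,k \in \{L,H\}} 1/d_{jk}} \tag{5}$$

where $f_{jk}(x)$ is the score given by $\pi_{jk}$, $d_{jk}$ is the distance from A to the area of $\pi_{jk}$, and $1/d_{jk}$ can be interpreted as the respective closeness. If $\pi_B$ denotes the policy for situation B, the score $f_B(x)$ for a candidate slot $x$ in situation B can be similarly calculated, with the distances now measured from B, as

$$f_B(x) = \frac{f_{LL}(x)/d_{LL} + f_{LH}(x)/d_{LH}}{1/d_{LL} + 1/d_{LH}} \tag{6}$$

where terms related to $\pi_{HL}$ and $\pi_{HH}$ are not included because $s_1$ is extremely low in B and thus they are irrelevant. Note that Equation (6) is equivalent to the linear interpolation we calculated in Equation (3).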
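The weighted average over boundary policies can be sketched as below. The sketch makes several assumptions that should be read as illustrative only: closeness is taken to be the reciprocal of the Euclidean distance in the normalized space, each boundary area is the quadrant extending outward from a corner of the unit square, and a boundary policy is treated as irrelevant when an indicator lies beyond the opposite threshold. All names and numbers are hypothetical.

```python
import math


def normalize(value, low, high):
    """Map an indicator value so that its thresholds become 0 and 1."""
    return (value - low) / (high - low)


def distance_to_area(point, corner):
    """Distance from a normalized point to the boundary area at `corner`.
    Corner coordinates are 0 (low side) or 1 (high side)."""
    total = 0.0
    for u, c in zip(point, corner):
        gap = u if c == 0 else 1.0 - u  # how far inside the threshold
        total += max(gap, 0.0) ** 2
    return math.sqrt(total)


def synthesized_score(point, corner_scores):
    """Closeness-weighted average of the boundary policies' scores."""
    weighted, total_weight = 0.0, 0.0
    for corner, f in corner_scores.items():
        # Skip policies that contradict an indicator lying beyond a threshold.
        if any((u <= 0 and c == 1) or (u >= 1 and c == 0)
               for u, c in zip(point, corner)):
            continue
        d = distance_to_area(point, corner)
        if d == 0.0:
            return f  # the point lies inside this boundary area
        weighted += f / d
        total_weight += 1.0 / d
    return weighted / total_weight


# Toy scores of one slot under the four boundary policies (LL, LH, HL, HH).
scores = {(0, 0): 10.0, (0, 1): 20.0, (1, 0): 30.0, (1, 1): 40.0}
score_a = synthesized_score((0.5, 0.5), scores)    # intermediate situation
score_b = synthesized_score((-0.2, 0.25), scores)  # first indicator very low
```

In the second call only two policies remain, their two distances add up to one, and the result coincides with a linear interpolation along the second indicator.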
The formulation given above can be easily extended to the cases with more than two situation indicators in principle. However, extensions to such cases would be practically infeasible because the number of boundary policies increases exponentially to $2^n$, where $n$ is the number of situation indicators. As we have seen in Figure 4, our chromosome for the search of the policy already consisted of 82 real-numbered genes when there was a single indicator. If there were two situation indicators, the number of genes should have increased to 164. This number doubles each time another indicator is added, resulting in a huge search space.
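The growth of the chromosome can be made concrete with a short sketch, assuming 40 weights per boundary policy and two threshold genes per indicator as in the single-indicator case. The function name is hypothetical.

```python
def chromosome_length(n_indicators, weights_per_policy=40):
    """Number of real-valued genes: one weight vector per boundary policy
    (2**n of them) plus a low and a high threshold per indicator."""
    n_policies = 2 ** n_indicators
    return n_policies * weights_per_policy + 2 * n_indicators


lengths = [chromosome_length(n) for n in range(1, 5)]
```

The weight genes dominate the count, so the length roughly doubles with every added indicator.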