1. Introduction
Multipurpose batch processes have been widely used in process industries, such as chemical production [1,2,3], pharmaceuticals [4,5,6], and food processing [7,8], as these industries are subject to frequent changes in product demands and a high degree of product specialisation. The use of multipurpose units (also called machines), such as multipurpose reactors, makes the production system adaptive to such dynamic and specialised environments.
Due to the dynamic nature of production activities, a multipurpose batch process will almost inevitably experience streaming uncertain events (also called disturbances) during the online execution of a schedule [9]. When disturbances are observed, if the scheduler does not revise the original schedule or generate a new one accordingly, production objectives such as profit or makespan may deteriorate. More seriously, the original schedule may become infeasible once the constraints imposed by the observed disturbances are added. Therefore, it is necessary to determine a rescheduling strategy that can be used to revise the current schedule to accommodate these disturbances. Essentially, a rescheduling strategy can be considered as a function whose input typically includes the current system states (such as the operational status of machines and the inventory levels of materials), the original schedule, and the disturbances, and whose output includes the when- and how-to-reschedule decisions.
When designing a rescheduling strategy, we usually expect the strategy to have several desired properties. One of them is computational efficiency. In other words, we expect the strategy to respond quickly. For example, in an online production environment, the duration of an operation is often measured at minute-level precision. In such settings, when a disturbance occurs, it is immediately detected by on-site sensors and uploaded to the Manufacturing Execution System (MES). Then, the Advanced Planning and Scheduling (APS) system, an integrated module of the MES, needs to process this disturbance information and decide whether to trigger a rescheduling process. If so, the APS system is also required to return an updated schedule before the subsequent production activities are affected. Typically, all these steps need to be completed within a time window of several minutes. Therefore, computational efficiency is one of the highest priorities when designing a rescheduling strategy.
Another desired property is the ability to achieve a good long-term objective value, such as the long-term profit or long-term makespan [10,11]. Although it seems natural to expect this property, we would like to highlight that in practice, it is difficult to measure how good a long-term objective is. This is because achieving a good long-term objective in the context of dynamic scheduling is very different from obtaining a good objective value in a static scheduling problem. Specifically, for the static case, in most situations, we can model the problem as a Mixed-Integer Linear Programming (MILP) problem and use the MIP gap to measure how far the current objective is from the theoretical optimum. However, this approach does not apply in the dynamic situation. On the one hand, during the dynamic scheduling process, naïvely solving static scheduling problems as frequently as possible does not necessarily lead to a better long-term objective. On the other hand, the look-ahead information during dynamic scheduling is finite (meaning that we do not have access to the outcome of all future disturbances), and the dimensionality of the state space is high (often hundreds to thousands). Therefore, obtaining, or even approximating, the optimal strategy is unlikely, which makes it difficult to set up a theoretical baseline for the long-term cost of a rescheduling strategy. Nevertheless, we still expect the long-term objective to be good in a comparative sense. That is, a good rescheduling strategy should achieve better, or at least not worse, long-term objectives when compared to those generated by well-known strategies, such as the periodically–completely rescheduling strategy.
The third desired property is that the rescheduling strategy should lead to a low level of system nervousness [12,13]. Here, the term "nervousness" refers to a measure of differences in schedules before and after rescheduling. The more different the original and new schedules are, or the more frequently the rescheduling process is triggered, the higher the level of system nervousness. Although system nervousness is not directly reflected in conventional objectives such as cost and makespan, it does impact actual production activities [13,14,15]. A high level of system nervousness may disrupt pre-processing procedures that precede the actual execution of a schedule, cause additional rearrangement costs, and even damage business credibility with third-party contractors [16]. Therefore, reducing system nervousness is also an important aspect of designing a rescheduling strategy.
To achieve these desired properties, the Process System Engineering (PSE) community has proposed numerous rescheduling strategies over the past decades. These strategies are often implemented by solving a series of MILP problems that are modified from static scheduling models. These modifications include modifying parameters [11,17], adding penalty terms [14,15], adding terminal constraints [18], partitioning the horizon [15,19], and re-partitioning indices and sets [20,21]. The objectives of these modifications include (1) incorporating newly observed disturbances [22], (2) minimising the difference between the new schedule and the original schedule [15], (3) ensuring that system states such as inventory levels remain stable in the long term [11], and (4) generating schedules that are preferable to others [14]. Next, we review related state-of-the-art methods. After that, we highlight the contributions of this work.
Kanakamedala et al. [23] presented a reactive heuristic for schedule modification based on a decision tree. In the decision tree, each node represents a processing step (comparable to a task category, the term more commonly used today), and each arc represents a modification decision on that processing step. They considered two types of modification decisions, namely right-shifting and replacement (equivalent to reassignment). Since the size of the decision tree grows exponentially with the number of processing steps, they also proposed a two-stage method to prune the decision tree. Their work differs from ours in that they considered only these two types of modification decisions. In our method, it is also possible to modify the batching decision (how large a batch size to initiate) to obtain a better long-term objective. Also, their heuristic relies on a manually set parameter, "batch priority", which may not be trivial to provide in practice.
Elkamel and Mohindra [15] presented a reactive scheduling method that was intended to be used in a rolling-horizon pattern. They added penalty terms to minimise the difference between the new schedule and the original schedule. They considered four types of penalty terms, including task time-shifting penalties, assignment penalties, batch size preservation penalties, and resource purchase and reallocation penalties. In addition, they proposed an empirical rule to partition the scheduling horizon into three subhorizons, namely the left, the middle, and the right subhorizons. During the reactive scheduling procedure, only operations in the middle subhorizon were allowed to be revised, while those beyond the middle subhorizon were fixed.
Honkomp et al. [24] proposed a simulation-based framework to evaluate the robustness and expected performance of a particular rescheduling strategy. They divided a scheduling horizon into three subhorizons, namely a fixed scheduling subhorizon (where operations are not allowed to be modified), a penalised scheduling subhorizon (where modifying operations is penalised), and a free scheduling subhorizon (where operations are allowed to be modified without penalty). Although their work does not directly deal with a specific rescheduling strategy, it represents an important category of rescheduling strategies, that is, the simulation-based rescheduling strategy. When- and how-to-reschedule decisions are determined based on the results of discrete event simulation of a schedule. The strength of this category of methods is that the robustness of a rescheduling strategy can be established with a statistical guarantee (given that enough simulation episodes are performed). However, in practice, there may not be sufficient reaction time to carry out these simulations when computational efficiency is highly desired. In our method, we prioritise computational efficiency and choose a purely reactive strategy.
There are several works that deal with the reactive scheduling of multipurpose batch processes subject to machine breakdown and rush orders. Vin and Ierapetritou [22] used a two-stage approach to address the reactive scheduling of a multipurpose batch process. In the first stage, they used a baseline version of a continuous-time MILP formulation to generate a deterministic schedule. In the second stage, they modified the baseline MILP formulation by (1) fixing the variables before the time point of the occurrence of a disturbance event; this step was used to label the "already-executed" tasks. (2) For machine breakdown, they fixed the assignment decisions while shifting subsequent operations on the same machine and added penalty terms to the objective function to minimise the differences between the original schedule and the new schedule. (3) For rush order arrivals, they also fixed the assignment decisions for all operations in the original schedule, while adding constraints to model the additional demands. Janak et al. [19] contributed a subsequent work which also considered machine-breakdown and rush-order disturbances. They partitioned the scheduling horizon into the already-executed subhorizon and the currently executed subhorizon. For machine breakdown, they followed a strategy that depended on the specific machine that broke down. That is, for unaffected machines, they fixed all tasks that were initiated before the shutdown event; for the affected machines, they fixed all tasks that were completed before the shutdown event. For new orders, they used a set of heuristics to determine whether a task should be revised accordingly. The criteria for these heuristics included whether a task is executed on a "busy" machine and whether a task is executed on a "special" processing unit. Li and Ierapetritou [21] used a set of parameters to model the outcomes of disturbance events in the MILP formulation. They then used a multi-parametric programming approach to preventively (or, equivalently, proactively) generate a new schedule during dynamic scheduling. For machine breakdown, they followed the same strategy as the one used in [19]. For rush orders, they fixed tasks that were executed before the rush order arrived and updated the demand parameter afterwards. These works are different from ours because (1) although they simulate the dynamic scheduling process by updating the values of a set of indicator variables (which indicate whether a task has been executed), they do not fully follow the closed-loop pattern of a dynamic system, and (2) while they generally follow a periodic rescheduling strategy, they do not explore the event-driven when-to-reschedule strategy. In our work, we develop a dynamic system formulation to simulate the dynamic scheduling process at full scale. Also, we use an event-driven strategy to reduce long-term system nervousness.
There is a series of works, published since the 2010s, that model the dynamic scheduling process using a state-space model [11,14,17,25,26]. These works treat the MILP problem as a "schedule generator" and use the state-space model to describe the system dynamics. Generally, these works follow a periodic rescheduling strategy. As identified in [11], critical parameters involved in a periodic rescheduling strategy include the frequency of rescheduling, the length of the scheduling horizon, the computational resources allocated at a particular time step, how to deal with penalty terms, and how to add terminal constraints. To address the design of these parameters, McAllister et al. [14] discussed the theoretical aspects of various penalty terms under the framework of Economic Model Predictive Control (EMPC). They proposed a form of penalty terms called Hinge Loss of Discounted Difference, where only moving operations earlier is penalised while delaying a task is not. Ikonen et al. [27] presented a reinforcement learning approach to determine the rescheduling timings and the computational resources that need to be allocated at a specific rescheduling time step. In their work, the input of the reinforcement learning agent includes deviations in parameters of the MILP formulation, discrete changes (such as machine breakdowns), and the current state of the MILP solver. However, to the best of our knowledge, the warm start technique, which uses the information of the MILP solutions from previous time steps and therefore has the potential to apply dynamic scheduling to large-sized problems, has received limited attention in these works. In this work, we also explore this technique.
As a result, the primary aspect of our contribution is that we propose an efficient rescheduling strategy that can deal with large-sized problems. As will be presented in Section 5, our method can handle the dynamic scheduling problem of a large-sized multipurpose batch process within 1500 s. Importantly, we achieve this computational efficiency with a slightly better long-term objective (makespan, in our case) and a significant reduction in long-term system nervousness compared to the periodically–completely rescheduling strategy. The second aspect of our contribution is methodological. Our rescheduling strategy is a combination of several ideas. While some of these ideas have appeared in previous works, others are new. The fundamental idea of our strategy is using a Directed Acyclic Graph (DAG) to describe the inter-operation dependencies in a schedule. In the DAG, each node represents an operation, and each arc represents a dependency of one operation on another. With this DAG, we then propose a procedure to determine how long an operation can be delayed without affecting the current makespan. To the best of our knowledge, the combination of these two ideas has not been proposed before. This information is then used in our when-to-reschedule and how-to-reschedule strategies. Our when-to-reschedule strategy is essentially a hybrid strategy, which reschedules either at periodically distributed time points or when certain conditions are met. Our how-to-reschedule strategy uses the DAG and the associated maximum delayable times to enable a warm start throughout the entire dynamic scheduling process. We then test our method by solving a medium-sized problem and a large-sized problem. The computational results show that our method achieves an order-of-magnitude reduction in both the number of operation changes and the computational effort required for solving the MILP problems, compared to the existing periodically–completely rescheduling strategy. More importantly, these improvements are achieved with a slightly better long-term makespan.
The remainder of this paper is organised as follows. In Section 2, we define the problem in detail. In Section 3, we present the detailed problem formulation, which describes the online execution of a dynamic scheduling process, and the MILP formulation for generating a new schedule. In Section 4, we propose the detailed rescheduling strategy, including the when-to-reschedule and how-to-reschedule strategies. In Section 5, we solve large-sized examples to illustrate the capability of our method. Finally, in Section 6, we draw conclusions and discuss future work.
4. Methodology
In this section, we present the detailed description of our rescheduling method. As our method is motivated by a combination of several ideas, and these ideas are reflected in different algorithm modules, it is clearer to present our method in a general-to-specific organisation. In Section 4.1, we first introduce some notations that are used throughout different algorithms. Then, in Section 4.2, we present the overall procedure of the dynamic scheduling process, including how to initialise the system, how to use the dynamic system to describe the evolution of dynamic scheduling, and how to calculate the cumulative nervousness value. In addition, we also briefly introduce the key modules of our rescheduling method, including the when- and how-to-reschedule modules, the DAG generating module, and the module that is used to determine the delayable time of an operation. After that, in Section 4.3, Section 4.4 and Section 4.5, we expand on the technical details of these algorithm modules and justify the motivation behind them.
4.1. Notations
In our method, a schedule is treated as a collection of operations. Here, an operation refers to a batch of a task. In a multipurpose batch scheduling problem, an operation can be sufficiently identified using a 5-tuple, that is, $o = (i, j, b, t_s, t_f)$. Here, the symbol $i$ represents the task, $j$ represents the machine, $b$ represents the batch size, $t_s$ represents the start time, and $t_f$ represents the completion (finish) time. For example, if there is an initiation of task $i_1$ on machine $j_1$ at time point $t_1$ with 10 units of batch size, and this operation is scheduled to complete at time point $t_2$, then we write $o = (i_1, j_1, 10, t_1, t_2)$. When necessary, we use $i(o)$ to denote the task of this operation, $j(o)$ to denote the machine of this operation, $b(o)$ to denote the batch size of this operation, $t_s(o)$ to denote the start time of this operation, and $t_f(o)$ to denote the completion (finish) time of this operation. Additionally, when we have an operation $o$ as reference, we use it as the index for related variables. For example, consider the variable $x_{i,j,t}$ that represents the initiation of an operation in the dynamic system (as presented in Section 3.1). The initiation of operation $o$ can be simply written as $x_o$, which represents $x_{i(o), j(o), t_s(o)}$.
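To make the notation concrete, the following is a minimal sketch of this operation representation in Python (the language of our implementation; see Section 5). The class name Operation, the field names, and the example values are our own illustration, not part of the formulation.

```python
from typing import NamedTuple

class Operation(NamedTuple):
    i: str      # task
    j: str      # machine
    b: float    # batch size
    ts: int     # start time point, t_s(o)
    tf: int     # completion (finish) time point, t_f(o)

# Example: task i1 initiated on machine j1 at time point 5 with a batch
# size of 10 units, scheduled to complete at time point 8 (hypothetical
# values for illustration).
o = Operation("i1", "j1", 10.0, 5, 8)
print(o.i, o.j, o.b, o.ts, o.tf)  # accessors mirror i(o), j(o), b(o), t_s(o), t_f(o)
```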
For graph-related notations, we use $G$ to denote a DAG, whose nodes represent operations and whose arcs connect one operation to another. When there exists an arc from $o$ to $o'$ in the DAG, we denote this structure by $o \to o'$. We use $C(o)$ to denote the collection of child nodes of an operation $o$; namely, for each child node $o' \in C(o)$, there exists an arc $o \to o'$ in the DAG. We also use $D(o)$ to denote the descendant nodes of an operation $o$; namely, for each descendant node $o' \in D(o)$, there exists a path from $o$ to $o'$ in the DAG.
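The following sketch illustrates these graph notations, assuming the DAG is stored as a networkx.DiGraph (an assumption for illustration; any adjacency-list representation would do). Here, $C(o)$ corresponds to the direct successors of a node and $D(o)$ to all nodes reachable from it.

```python
import networkx as nx

G = nx.DiGraph()
G.add_edge("o1", "o2")  # arc o1 -> o2: o2 depends on o1
G.add_edge("o2", "o3")

def children(G: nx.DiGraph, o) -> set:
    """C(o): nodes o' such that the arc o -> o' is in the DAG."""
    return set(G.successors(o))

def descendants(G: nx.DiGraph, o) -> set:
    """D(o): nodes o' reachable from o by a directed path."""
    return nx.descendants(G, o)

print(children(G, "o1"))     # {'o2'}
print(descendants(G, "o1"))  # {'o2', 'o3'}
```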
4.2. Overview
The central idea of our method originates from the observation of how the delay of an operation propagates through a schedule. Generally, a delay in an earlier, upstream operation propagates towards the downstream part of a schedule. However, the specific impact depends on the configuration of the production process and the timings of operations. For example, a downstream operation may be completely unaffected because the delayed upstream operation is not among its prerequisite operations. In other cases, a downstream operation can retain its original start time without being right-shifted or reassigned to another machine because the idle slots between midstream operations are sufficient to "absorb" the delay disturbance from the upstream operation. Finally, if the downstream operation depends on the delayed upstream operation and the midstream operations are arranged tightly, then the downstream operation may need to be right-shifted, which extends the original makespan.
Based on this observation, two key questions remain in developing a concrete rescheduling algorithm: (1) how to mathematically formalise the propagation of delay impacts, and (2) how to improve the existing methods using this observation. For the first question, we use a Directed Acyclic Graph (DAG) because it is a data structure that naturally represents dependence relationships among a collection of objects. For the second question, by integrating the ideas behind the observation, our method improves upon the widely used periodically–completely rescheduling framework in three main aspects. First, we reduce the computational time required to check whether a rescheduling procedure should be triggered. Second, we use the warm start technique to speed up the solution of MILP problems. Third, we reduce the system nervousness by incorporating the information from the DAG.
The overall procedure of our method is illustrated in Figure 2. At first, we initialise the system with an initial time point, an initial system state, and an empty schedule. Also, we sample the outcomes of processing time variation and intermittent demand that can be observed from the initial time point. After that, we enter the main loop of the dynamic scheduling. While the main loop structure is mostly the same as in the periodically–completely rescheduling framework, the key differences lie in the following. (1) Instead of periodically triggering a rescheduling procedure, we propose a novel when-to-reschedule module based on the ideas about the propagation of delay impacts. (2) After a new schedule is generated, rather than naïvely proceeding to the next time point, we also construct a DAG to represent the dependencies between operations in this new schedule. Based on this DAG, we calculate the maximum allowable delay time for each operation and provide this information to the rescheduling trigger at subsequent time points. (3) In the how-to-reschedule module, we incorporate the information about observed disturbances and the information from the DAG to fix unaffected operations, which can greatly speed up the solution process and eliminate unnecessary operation changes.
For completeness and ease of implementation, we also provide the pseudo-code of our method in Algorithm 1.
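For readers who prefer code to flowcharts, the following condensed Python sketch mirrors the control flow of Figure 2 and Algorithm 1. Every helper passed in through the modules dictionary is a placeholder for the corresponding algorithm module described in Sections 4.3, 4.4 and 4.5; only the loop structure is meant to reflect our method.

```python
def dynamic_scheduling(x0, t_end, modules):
    """modules: dict of callables standing in for the algorithm modules."""
    t, state, schedule, nervousness = 0, x0, [], 0
    dag, delayable = None, {}
    while t < t_end:
        disturbances = modules["observe"](t)  # on-site disturbance information
        if modules["when"](state, schedule, disturbances, delayable):
            new_schedule = modules["how"](state, schedule, disturbances, dag, delayable)
            nervousness += modules["diff"](schedule, new_schedule)  # cf. Equation (8)
            schedule = new_schedule
            dag = modules["build_dag"](schedule)                  # Section 4.3
            delayable = modules["delayable"](schedule, dag)       # Section 4.4
        state = modules["step"](state, schedule, disturbances)    # advance one period
        t += 1
    return schedule, nervousness

# Smoke test with no-op stubs (never rescheduling):
mods = {"observe": lambda t: [], "when": lambda *a: False,
        "how": None, "diff": None, "build_dag": None, "delayable": None,
        "step": lambda s, sch, d: s}
print(dynamic_scheduling(0, 3, mods))  # ([], 0)
```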
The remainder of this section is dedicated to the explanation of the technical details of each algorithm module. In Section 4.3, we first present how to construct a DAG that represents the dependencies between operations in a schedule for a multipurpose batch process. After that, in Section 4.4, we show how to derive the maximum allowable delay time (that is, the delayable time) of each operation in a given DAG. In Section 4.5, we present the when- and how-to-reschedule strategies.
4.3. Describing a Schedule Using a Directed Acyclic Graph
At the core of our rescheduling strategy is using a DAG to represent the dependence relationships between different operations. In this DAG, each node represents an operation, and each arc represents a dependency between two operations. As mentioned before, the purpose of constructing such a DAG is to quickly check whether the current makespan will be affected when one or more operations are delayed. From this perspective, the DAG constructed in this section can be viewed as a model that encodes the knowledge about the execution situation of a schedule.
Essentially, there are two types of dependence relationships between two operations in a schedule for a multipurpose batch process. The first type is temporal dependence, which means that, for two consecutive operations processed on the same machine, the latter operation depends on the earlier one. Here, the term "temporal" comes from the fact that the direction of this type of dependency is along the time axis. Specifically, in our context, by "dependence" we mean that if the earlier operation is delayed, then the latter operation will also be delayed or at least affected. To describe this type of dependency, we simply add an arc between every pair of consecutive operations on the same machine in the DAG. This procedure of adding temporal arcs is presented in Lines 7–8 of Algorithm 3.
The second type of dependence relationship is spatial dependence, which means that a downstream operation $o'$ depends on an upstream operation $o$ that produces the input material for $o'$. We call this type of dependency "spatial" because these dependencies are between different machines. In other words, the direction of spatial dependencies is along the space axis. However, compared to temporal dependencies, spatial dependencies are more subtle to identify. The reason is that in a multipurpose batch process, material splitting and mixing are allowed in the system. It is possible that the material produced by an upstream operation is used by multiple downstream operations (splitting). Also, the input material of a downstream operation may originate from two or even more upstream operations (mixing). For cases where some materials do not have an intermediate storage tank (i.e., zero storage capacity) and batches have to be transferred to downstream machines immediately after production (i.e., zero-wait policy), the situation becomes even more complex.
Despite these difficulties, in this work, we identify spatial dependencies using a backward procedure, which is simple but sufficient for a wide variety of cases. The idea of the backward procedure is as follows. Let $o$ be an operation. Then, for each material $k$ that is required for initiating $o$, we track the latest operation(s) in the schedule that produce a sufficient quantity of $k$ for initiating $o$. To formalise, we present this backward procedure for determining the spatially upstream operation(s) of an operation in Algorithm 2. An illustrative example of Algorithm 2 can be found in the Supplementary Materials.
By integrating Algorithm 2 and the aforementioned method of adding temporal arcs, we present Algorithm 3. Essentially, Algorithm 3 takes a schedule in the form of a set of operations as the input and outputs a DAG that represents the dependencies between operations in this schedule. The overall procedure of Algorithm 3 is given as follows. For each machine (Line 4), we traverse all the operations that are executed on this machine and add temporal arcs (Lines 7–8) and spatial arcs (Lines 9–10) to the DAG. Finally, we return the DAG with these arcs added (Line 14).
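The following sketch illustrates the spirit of Algorithms 2 and 3 under simplifying assumptions: finite intermediate storage and, for the spatial arcs, tracking only the single latest producer of each input material rather than accumulating a sufficient quantity as Algorithm 2 does. It reuses the Operation tuple from Section 4.1; the inputs/outputs dictionaries mapping each task to its consumed/produced materials are assumed to be given.

```python
from collections import defaultdict
import networkx as nx

def build_dag(schedule, inputs, outputs):
    """schedule: list of Operation tuples; inputs[i]/outputs[i]: materials
    consumed/produced by task i. Returns a dependency DAG over operations."""
    G = nx.DiGraph()
    G.add_nodes_from(schedule)
    by_machine = defaultdict(list)
    for o in schedule:
        by_machine[o.j].append(o)
    # Temporal arcs: every pair of consecutive operations on the same machine.
    for ops in by_machine.values():
        ops.sort(key=lambda o: o.ts)
        for prev, nxt in zip(ops, ops[1:]):
            G.add_edge(prev, nxt)
    # Spatial arcs: backward tracking of producers (cf. Algorithm 2).
    producers = defaultdict(list)  # material -> producing ops, ordered by finish time
    for o in sorted(schedule, key=lambda o: o.tf):
        for k in outputs[o.i]:
            producers[k].append(o)
    for o in schedule:
        for k in inputs[o.i]:
            # Latest producer of k finishing no later than o starts. Algorithm 2
            # instead accumulates producers until a sufficient quantity of k is
            # reached; we keep one producer here for brevity.
            cands = [p for p in producers[k] if p.tf <= o.ts]
            if cands:
                G.add_edge(cands[-1], o)
    return G
```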
4.4. Calculating the Delayable Time of an Operation
Given a schedule in the form of a set of operations, a DAG constructed using Algorithm 3, and the current makespan of this schedule, we need to determine the delayable time of each operation, that is, how long each operation can be delayed without affecting the current makespan. In other words, we are interested in by how many time periods an operation can be postponed without extending the makespan.
We determine the delayable time of an operation using a backward propagation procedure. This procedure is inspired by the concept of the critical path [30], which is widely used in the area of job shop scheduling to describe the chain of operations that is tight with respect to the makespan. In the context of multipurpose batch processes, it is also possible to find a "generalised critical path". Here, we add the term "generalised" because in multipurpose batch processes, one operation may have multiple spatially upstream or downstream operations due to the existence of batch splitting and mixing. This situation is different from the case in job shop scheduling, where one operation in a job has at most one spatially upstream and one spatially downstream operation because each job has a predetermined operation sequence.
The detailed backward propagation procedure is presented in Algorithm 4. In this procedure, we start from the set of leaf nodes in the DAG (Line 5). For each leaf node, we simply calculate its delayable time as the gap between the current completion time of the schedule and the completion time of this leaf operation (Line 6). Then, for the remaining nodes, we enter a loop (Lines 7–12). In each iteration of this loop, we select an operation and collect the delayable times of its child nodes (Line 9). We then propagate the delayable time to this operation, as presented in Line 10. This procedure is repeated until all operations in the schedule have been traversed.
In Algorithm 4, the way in which we propagate the delayable time (Line 10) is worth explaining. In this formula, we consider a pair of operations, $o$ and $o'$. Here, $o$ represents the parent operation and $o'$ represents the child operation. By saying "parent" and "child", we mean that there exists an arc $o \to o'$ in the input DAG $G$. Given the arc $o \to o'$, we calculate the sum of the delayable time of $o'$ and the time gap between the start time of $o'$ and the completion time of $o$. By traversing all the child nodes of $o$ and taking the minimum of this quantity, we obtain the delayable time of $o$, that is, $\delta(o) = \min_{o' \in C(o)} \left[ \delta(o') + t_s(o') - t_f(o) \right]$. The essence of Algorithm 4 is repeating this process from the leaf nodes and propagating the calculation of delayable times to the entire set of operations in the schedule.
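The following sketch implements this backward propagation over a reverse topological order of the DAG; it is an illustrative reading of Algorithm 4 rather than a verbatim transcription, and it assumes the networkx-based DAG of operations built above.

```python
import networkx as nx

def delayable_times(G: nx.DiGraph, makespan: int) -> dict:
    """delta[o]: how long operation o can be delayed without extending
    the current makespan."""
    delta = {}
    # Reverse topological order guarantees that all children of o are
    # processed before o itself.
    for o in reversed(list(nx.topological_sort(G))):
        kids = list(G.successors(o))
        if not kids:
            # Leaf node (Line 6 of Algorithm 4): slack against the makespan.
            delta[o] = makespan - o.tf
        else:
            # Propagation step (Line 10): the tightest child bounds delta(o).
            delta[o] = min(delta[c] + (c.ts - o.tf) for c in kids)
    return delta
```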
4.5. Rescheduling Strategy
4.5.1. When-to-Reschedule Strategy
As mentioned before, one desired property of a rescheduling strategy is computational efficiency. To achieve this, we simply use the logical OR operator to combine a set of simple conditions as the rescheduling trigger. In other words, if any of these conditions is satisfied, we trigger a rescheduling process. Otherwise, we continue to execute the current schedule.
The set of combined rescheduling conditions is stated as follows. First, based on the observed information related to processing time variation, if we can infer from the set of delayable times that the current makespan will be extended, we trigger a rescheduling procedure. In other words, we check whether there exists an operation $o$ in the current schedule whose observed delay exceeds its delayable time $\delta(o)$. This condition is used to ensure that the system is responsive to critical events that affect the short-term objective.
The second condition is based on demand response. If we observe new demands, whether baseline demands or intermittent demands, we trigger a rescheduling procedure. In other words, for every material $k$, if a new demand for $k$ is observed at the current time point, we trigger a rescheduling procedure. It should be noted that, although this heuristic is responsive to any change in demand, the schedules before and after rescheduling may not differ significantly. This is because, in our method, triggering a rescheduling procedure is actually equivalent to solving the MILP problem (presented in Section 3.2) using the how-to-reschedule strategy, where many operations are fixed (see the next subsection). As a result, when the demand change is slight, it is often the case that only a few operations are added to or removed from the original schedule, rather than a complete rescheduling being carried out. However, the cumulative system nervousness may nonetheless be slightly increased in such cases.
The third condition is used to ensure the feasibility of a schedule. Here, we introduce a concept that we call "remaining feasible time steps", denoted by $\tau$. Intuitively, $\tau$ represents the number of forthcoming time steps of a schedule that are guaranteed to be feasible. In our problem, the only source of potential infeasibility comes from processing time variation: if an upstream operation is delayed, the downstream operations may not be initiated as scheduled. Because whenever we carry out a rescheduling process, we incorporate the information of processing time variation in the form of parameter changes in the MILP problem (see Section 3.2), the first $\theta$ time points are guaranteed to be feasible, where $\theta$ denotes the look-ahead window within which processing time variations are observable. As a result, at the time points when a new schedule is generated, we set $\tau = \theta$. When the rescheduling procedure is not triggered, the schedule is advanced by one time step and we reduce $\tau$ by one. When $\tau = 0$, which means that the remaining schedule is no longer guaranteed to be feasible, we trigger a rescheduling procedure.
The formalised when-to-reschedule strategy is presented in Algorithm 5.
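A minimal sketch of this OR-combination is given below. The arguments observed_delay (per-operation delays observed so far), new_demand (newly observed demand per material), and tau (the remaining feasible time steps $\tau$) are placeholders for the quantities defined above, not the exact signature of Algorithm 5.

```python
def should_reschedule(observed_delay: dict, delta: dict,
                      new_demand: dict, tau: int) -> bool:
    # Condition 1: some operation's observed delay exceeds its delayable
    # time, so the current makespan would be extended.
    makespan_hit = any(observed_delay.get(o, 0) > d for o, d in delta.items())
    # Condition 2: a new (baseline or intermittent) demand is observed.
    demand_hit = any(q > 0 for q in new_demand.values())
    # Condition 3: the remaining feasible time steps are exhausted.
    feasibility_hit = (tau <= 0)
    return makespan_hit or demand_hit or feasibility_hit
```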
4.5.2. How-to-Reschedule Strategy
At the core of our how-to-reschedule strategy is warm start. When a rescheduling procedure is triggered, we solve the MILP presented in Section 3.2 with a specific subset of decision variables fixed. This subset includes (1) the initial system state variables, which are presented in Equations (9)–(12) in Section 3.2, and (2) an additional collection of the variables $x_o$ (recall that this variable represents the binary decision of initiating an operation) associated with the set of operations that are not affected by an event of processing time variation. To obtain collection (2), let the set $O$ denote the operations that are currently executing or still unexecuted in the original schedule. Then, we identify $O_a \subseteq O$ as the set of operations that are directly affected by an event of processing time variation. After that, for each $o \in O_a$, if we determine from the pre-calculated delayable time that the observed delay exceeds $\delta(o)$ (meaning that this operation delay event will affect the current makespan), then we allow the operations in $\{o\} \cup D(o)$ to be revised in the new schedule (recall that $D(o)$ represents the set of descendant operations of $o$ in the DAG $G$). As a formalised version, Algorithm 6 presents the procedure for generating a new schedule with the unaffected operations fixed.
It should be noted that in our how-to-reschedule strategy, for an operation $o$ that is affected by an operation delay event, even if this event will not affect the current makespan according to $\delta(o)$, we still unfix all of its descendant operations, namely $D(o)$, rather than only the operation itself. This is because the procedure used to determine the delayable time of an operation (Algorithm 4) allows all the descendant operations of $o$ to be right-shifted. As a result, although the final makespan may not be affected by right-shifting, it is not guaranteed that none of the operations in $D(o)$ will be right-shifted. In other words, it is possible that some of the descendant operations of $o$ are right-shifted without extending the current makespan. Therefore, as a conservative solution, we allow all the descendant operations of $o$ to be revised.
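The following gurobipy-flavoured sketch illustrates the variable-fixing step of Algorithm 6 under our notational assumptions: x is a dict of binary initiation variables keyed by operation, prev_values holds their values in the original schedule, and descendants(o) returns $D(o)$ from the DAG. Affected operations and their descendants are left free (with the old values supplied as a MIP start), while unaffected operations are hard-fixed by their bounds. This is an illustration, not the paper's implementation.

```python
def fix_unaffected(model, x, pending_ops, affected, descendants, prev_values):
    """pending_ops: operations currently executing or still unexecuted."""
    unfix = set()
    for o in affected:
        unfix |= {o} | descendants(o)      # unfix o and all of D(o)
    for o in pending_ops:
        v = x[o]
        if o in unfix:
            v.LB, v.UB = 0.0, 1.0          # free to be revised by the solver
            v.Start = prev_values[o]       # MIP start from the old schedule
        else:
            v.LB = v.UB = prev_values[o]   # hard-fix unaffected operations
    model.update()
```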
5. Examples
We use two benchmark examples from the literature to evaluate the effectiveness of the proposed rescheduling strategy. The first example (labelled Example 1) is a medium-sized example adopted from [31]. This example consists of 19 tasks, 8 machines, and 26 types of materials. Figure 3 presents the STN representation of Example 1. Table A1 (see Appendix A) presents the detailed task–machine configurations, which include task–machine compatibilities, the nominal processing times, the minimum and maximum batch sizes, and the probability distribution of the varying processing times. Table A2 (see Appendix A) presents the detailed material configurations, which include storage capacities, initial inventory levels, supply limits, baseline demands, and the probability distribution of uncertain intermittent demands.
The other example (labelled Example 2) is adopted from [32]; it is a large-sized example with 37 tasks, 12 machines, and 28 types of materials. The STN representation is presented in Figure 4. In the original work, the STN was used to describe a continuous process. However, to be consistent with our theme, we modified this STN into a multipurpose batch process with randomly generated task–machine and material configurations. These configurations are presented in Table A3 and Table A4 in Appendix A.
For the dynamic process, we discretise the entire time axis into uniform time intervals, each representing one hour in practice. For both examples, the time-related parameters are set in hours. We set the look-ahead window $\theta$ to 12 h, which means that the value of a varied processing time can be observed 12 h in advance. We also set the demand horizon to 48 h, which means that each schedule has to satisfy the forthcoming demand within the next 48 h plus the demands that were backlogged before.
As mentioned before, in the context of dynamic scheduling of multipurpose batch processes, it is generally challenging to obtain or even approximate the optimal rescheduling strategy. Therefore, we compare our method with the periodically–completely rescheduling method in the literature [33,34,35]. Namely, the comparative strategy solves the MILP formulation presented in Section 3.2 at every time point without fixing any values of decision variables. For a fair comparison, we execute both dynamic scheduling processes with the same realisation of disturbances. Although this comparative method is not widely applied in practice, it generally achieves a very competitive long-term objective, which is why we choose it as the baseline.
All our computational results were generated using an Apple MacBook Pro (designed by Apple Inc., Cupertino, CA, USA, assembled in China) with an M1 Pro SoC and 16 GB of unified memory. The dynamic system module and the DAG generation module were implemented in Python 3.9.16. The MILP problems were solved using Gurobi 11.0.2 [36].
Figure 5 presents the distribution of the computational time used in solving the MILP problems at each time point. As we can see from Figure 5, the total computational time used by the periodic rescheduling strategy for Example 1 is 4914 s, while our method uses 803 s in total for this example. This gap in computational time is even larger for Example 2. The periodic rescheduling strategy takes 33,554 s to complete the dynamic scheduling process for Example 2, while our method takes 1296 s. This significant difference in computational effort demonstrates the computational efficiency of our method. It is notable that in Example 2, at some time points, the periodic strategy requires a significantly longer computational time to generate a new schedule than at other time points; two pronounced peaks can be observed in Figure 5. The likely reason for these peaks is that, at certain time points, a particular realisation of disturbances results in an MILP problem that is significantly harder to solve. In contrast, there are generally no peaks in the distribution of computational time for our method. This difference in the distribution of computational time demonstrates the effectiveness of the warm start approach.
As mentioned before, the second interest of a rescheduling strategy is the long-term objective. This information is presented in Figure 6. As presented, our method generates a slightly smaller makespan for both examples than the periodic rescheduling strategy. Specifically, the periodic rescheduling strategy completes all demands that arrive within the simulation horizon for Example 1 in 525 time periods, while our method uses 520 time periods, a reduction of 0.95%. The overall makespan for Example 2 is 499 time periods for the periodic rescheduling strategy and 490 time periods for ours, a reduction of 1.80%.
As demonstrated by these two examples, our method achieves long-term makespans that are quite close to those achieved by the periodic method. While proving the general effectiveness of our method for the minimisation of long-term makespan requires more numerical experiments, we propose two possible explanations for this closeness. First, from a short-term perspective, our method often generates schedules whose makespan is equal or very close to those generated by the periodic method. This can be seen in Figure 6, where the short-term makespan (green bars) generated by our method is not significantly higher than that from the periodic method (brown bars). Recall that one of the main differences between our method and the periodic method is that our method fixes the start times of certain operations, while the periodic method is not subject to such constraints. Such fixing in our method may have a minimal impact on the short-term makespan. Based on our computational experience, this is true when tasks are not very tightly arranged on a machine in a schedule, because there are then sufficient idle slots to allow the rearrangement of operations without affecting the makespan. Second, from a long-term perspective, even when our method results in a suboptimal short-term makespan at some time points, the MILP solver can generate schedules where this short-term optimality loss is compensated for in the next few time points. Consequently, these two methods eventually achieve similar long-term makespans.
We are also interested in the cumulative system nervousness. As defined in Equation (8), each time a rescheduling procedure is triggered, we add the number of operation changes to the cumulative system nervousness.
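As a minimal sketch, one plausible reading of this count is the size of the symmetric difference between the operation sets of the old and new schedules (the precise definition is given by Equation (8)):

```python
def count_operation_changes(old_schedule, new_schedule) -> int:
    """Operations that appear in exactly one of the two schedules, i.e.,
    operations added or removed by the rescheduling step."""
    return len(set(old_schedule) ^ set(new_schedule))
```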
Figure 7 presents the distribution of the number of operation changes at each time point. In Figure 7, we observe a significant difference in the cumulative system nervousness. The total number of operation changes generated by the periodic rescheduling strategy for Example 1 is 41,977, while our method produces 5471 operation changes in total over the dynamic scheduling process, a reduction of 87.0%. Similarly, the cumulative number of operation changes generated by the periodic rescheduling strategy for Example 2 is 83,474, while our method achieves a much lower number, 5627, a reduction of 93.2%. These results show that the cumulative system nervousness can be greatly reduced using the warm start technique.
In brief, our method achieves a good long-term makespan without changing operations frequently.
6. Conclusions
In this work, we addressed the dynamic scheduling problem of a multipurpose batch process subject to processing time variation and demand uncertainty, which are commonly encountered in industrial practice. While we used a random variable to represent the processing time variation, we assumed that irregular orders arrived randomly. To model the system dynamics, we used a set of piecewise functions to describe the transition function. We also developed an MILP formulation that was used as a "schedule generator" during the dynamic scheduling process. This MILP formulation was modified from the standard discrete-time formulation for static multipurpose batch process scheduling problems. The difference is that we added a set of initial state constraints to ensure that the MILP is consistent with the current system states. Also, we solved this MILP using a two-stage hierarchical approach to reduce long-term inventory levels.
After introducing the problem formulation, we provided a detailed description of our rescheduling strategy. We first presented how to construct a DAG that describes the inter-operation dependencies encoded in a schedule. Specifically, in our method, we identified two types of dependence, namely temporal dependence and spatial dependence. Then, with this DAG established, we presented the procedure for determining how long an operation can be delayed without affecting the current makespan. This method was inspired by the concept of the "critical path" in job shop scheduling. After that, we presented the when- and how-to-reschedule strategies used in this paper. Notably, while the when-to-reschedule strategy is simply a set of conditions combined by logical ORs, the how-to-reschedule strategy is essentially solving a warm start version of the MILP.
To test our algorithm, we compared our method to the periodically–completely rescheduling strategy on two benchmark examples. The computational results show that our method satisfies the three desired properties that are often expected in dynamic scheduling practice, that is, computational efficiency, a good long-term objective, and a low level of system nervousness.
One of the key limitations of our method is that we considered solely the objective of minimising makespan. While this objective is among the most commonly considered objectives in the scheduling literature, other economic objectives, such as maximising profit and minimising cost, should be considered in future works. Another key limitation is that our method requires a set of rules to determine the dependencies between operations in a schedule. Although in the context of multipurpose batch processes, these rules are not difficult to deduce (as presented in Section 4.3), generalising to other production processes may require more elaboration from domain experts to provide these rules. The practicability of providing such rules in other production processes is worth exploring in the future.
In addition, although our method performed well in the presented cases, some remaining issues are worth exploring in the future. The first issue was mentioned in Section 4.3. That is, in this paper, we assumed that the storage policy for materials was finite intermediate storage. However, in more complex cases, such as zero intermediate storage and zero-wait policies, the heuristic for determining the spatially upstream operations of an operation does not apply. The essence of this difficulty lies in the semi-continuous nature of batch processes, that is, material mixing and splitting are allowed in multipurpose batch processes. Therefore, it is worth exploring how to design heuristics to determine the spatial dependencies between operations in these cases.
The second issue is related to the warm start. As presented in Algorithm 6, for a specific operation in a schedule, if we observe an event of processing time variation in this operation, then we unfix the start times of all the descendant operations of this affected operation. In our method, we choose this heuristic because it is sometimes unclear which descendant operations require right-shifting to "absorb" the operation delay event without affecting the makespan. This heuristic may be overly conservative: it may be possible to select a smaller subset of descendant operations to unfix without affecting the current makespan. However, identifying this subset is not a trivial task. We will explore this issue in our future work. In addition, we will also address other disturbances, such as machine breakdown, in our future work.