1. Introduction
With the vigorous development of manufacturing technology and artificial intelligence (AI) technology, intelligent manufacturing has become one of the mainstream directions of the manufacturing industry [1,2]. So far, AI technology has been effectively applied in many areas of the manufacturing industry. In operations research and management, the focus has mainly been on using intelligent optimization algorithms to solve the job shop scheduling problem (JSSP) [3].
The JSSP entails the efficient allocation of production resources for a decomposable processing task under certain constraints to optimize performance metrics such as total processing time, flow time, and tardiness [4]. The flexible job shop scheduling problem (FJSSP) allows for more flexible use of machine resources than the JSSP: each operation of a job can be processed on multiple candidate machines, which better reflects real-world situations [5,6,7]. Therefore, it is essential in the FJSSP to first assign jobs to suitable machines (i.e., the routing decisions) and then sequence the jobs in the waiting queue of each machine for processing (i.e., the sequencing decisions).
As typical NP-hard problems, the JSSP and the FJSSP have garnered widespread attention and thorough research due to their extensive application and complexity in the manufacturing domain. For small-scale problems, researchers tend to use exact methods such as mathematical programming [8] and branch-and-bound [9]. These methods guarantee the optimality of the obtained solutions but are computationally intensive and time-consuming. To tackle large-scale problems and tighter time constraints, meta-heuristic algorithms have become prominent choices. Meta-heuristics such as the genetic algorithm [10], tabu search [11], particle swarm optimization [12], teaching–learning-based optimization [13], and simulated annealing [14] can find approximately optimal solutions in a relatively short time, offering acceptable solutions for decision-making in practical production scenarios.
However, the actual manufacturing environment is usually dynamic: unexpected random events (such as the arrival of new jobs, machine failures, and the reworking of jobs) often occur in the dynamic flexible job shop scheduling problem (DFJSSP) [15]. Traditional meta-heuristic methods typically handle these dynamic events through rescheduling mechanisms, but such approaches suffer from increased computational complexity and stability issues [16,17].
As another promising approach, dispatching rules (DRs) have proven effective as a heuristic strategy for solving the DFJSSP due to their scalability, reusability, and rapid response to dynamic events [18,19,20]. In essence, DRs are priority functions built from features containing job shop-related information and mathematical operators. At each routing and sequencing decision point, a DR computes a priority value for each scheduling object (such as a waiting operation or a candidate machine) and chooses the most prioritized one for the next processing action. Over the past few decades, numerous features (e.g., the processing time of each operation, the idle time of each machine) and manually designed DRs (e.g., first in, first out) have been proposed to address various job shop scenarios and optimization objectives [21].
However, it is exceedingly challenging, and perhaps even infeasible, to rely solely on workers' expertise to clarify the potential correlations among different features and to design effective DRs. This is due to the intricate interconnection of the various production components in the job shop, as well as the distinct decision environments and information requirements associated with the different scheduling decisions in the DFJSSP. Consequently, applying a genetic programming hyper-heuristic (GPHH) to automatically design and generate appropriate DRs has become a widely used method for addressing the DFJSSP. Numerous studies have demonstrated that GPHH can automatically evolve DRs that outperform manually designed ones [22,23].
GPHH can autonomously generate efficient DRs without extensive domain-specific knowledge by continuously adapting the combination of features and mathematical operators in the DRs over evolutionary iterations, given predetermined algorithmic parameters (e.g., the maximum tree depth, the function set, and the feature set) and genetic operators (e.g., crossover and mutation). Compared with most meta-heuristic algorithms for solving the JSSP, GPHH offers a flexible encoding representation, powerful search capability, and easier application to real-world environments; the DRs produced by GPHH are therefore naturally suited to solving the large-scale DFJSSP. When solving the JSSP, GPHH first uses a set of features (corresponding to leaf nodes) representing the states of jobs, machines, and the job shop, together with a set of mathematical functions (corresponding to non-leaf nodes), to generate DRs automatically and iteratively through off-line training. During online scheduling, the generated DRs are used directly to make scheduling decisions without retraining, thus realizing real-time scheduling [24].
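To make the tree representation concrete, the following minimal sketch evaluates a GP individual as a priority function: leaf nodes look up features of the current job shop state, and non-leaf nodes apply operators from the function set (including the protected division commonly used in GP). The feature names PT, WIQ, and NOR are illustrative placeholders, not the paper's feature set:

```python
# A GP individual encoding the hypothetical rule PT + WIQ / NOR:
rule = ("+", "PT", ("/", "WIQ", "NOR"))

FUNCS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    # Protected division: returns 1.0 when the divisor is near zero.
    "/": lambda a, b: a / b if abs(b) > 1e-9 else 1.0,
}

def evaluate(node, features):
    """Recursively evaluate a tree: strings are leaf nodes (feature
    lookups), tuples are non-leaf nodes (operator applications)."""
    if isinstance(node, str):
        return features[node]
    op, left, right = node
    return FUNCS[op](evaluate(left, features), evaluate(right, features))

state = {"PT": 4.0, "WIQ": 10.0, "NOR": 2.0}
print(evaluate(rule, state))  # 9.0
```

During off-line training, GPHH searches over such trees; during online scheduling, `evaluate` is simply called at each decision point, which is why no retraining is needed.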
Previous works have shown that the maximum depth of the DRs, the function set, and the feature set are the three main factors determining the diversity of DRs and the search space of GPHH [25]. If the maximum depth of the DRs is $d$ and the DRs are generated using the "full method" (i.e., the full growth method) with binary functions, the size of the GPHH search space is $|F|^{2^{d-1}-1} \times |T|^{2^{d-1}}$, where $F$ and $T$ denote the function set and the feature set, respectively. However, the number of effective DRs is much smaller than this theoretical value, which indicates that finding good solutions through GPHH is still a difficult task.
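The combinatorial size of this search space can be illustrated with a quick calculation, assuming binary operators (so a full tree of depth $d$ has $2^{d-1}$ leaf positions and $2^{d-1}-1$ internal positions); the set sizes below are arbitrary examples, not the paper's configuration:

```python
def full_tree_space_size(d, n_functions, n_features):
    """Number of distinct full trees of depth d built from binary
    functions (internal nodes) and features (leaf nodes)."""
    leaves = 2 ** (d - 1)      # 2^(d-1) leaf positions
    internal = leaves - 1      # 2^(d-1) - 1 internal positions
    return n_functions ** internal * n_features ** leaves

# e.g., 4 functions, 10 features, maximum depth 4:
print(full_tree_space_size(4, 4, 10))  # 1638400000000 (about 1.6 * 10^12)
```

Even for this modest depth the space exceeds $10^{12}$ trees, which is why shrinking the feature set pays off so directly.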
Therefore, reducing the number of features is a viable way to narrow the search space of GPHH and enhance its search efficiency [26,27]. The feature set of GPHH can encompass numerous features describing various aspects of the job shop scheduling process, including system-related, machine-related, and job-related features. However, not all of these features contribute positively to scheduling decisions. For instance, the due date of a job is usually considered an irrelevant feature when optimizing the mean flow time [26]. In addition, many features are correlated, which can cause the same information to be double-counted. By integrating feature selection techniques with GPHH, irrelevant and redundant features can be eliminated, allowing the algorithm to purposefully explore regions containing more promising DRs and thereby improving its efficiency.
As far as we know, research on feature selection methods for GPHH and its variants is still limited and primarily focuses on single-objective problems. Initial studies measured the importance of a feature by the frequency of its occurrence in the best DRs [28]. Mei et al. [29] were the first to propose a metric that measures feature importance based on the contribution of features to the fitness values of individuals. Compared with feature frequency, this metric avoids the negative influence of redundant features and offers better accuracy. However, these methods were mainly explored in the context of the dynamic JSSP, considering only sequencing rules. Zhang et al. [30] were the pioneers in applying feature selection methods to the DFJSSP, using two feature sets for routing and sequencing decisions, respectively. In subsequent work [31], they further developed a continuous two-stage GPHH framework to fully utilize excellent individuals during the feature selection process. However, although these methods reduce the dimensionality of the feature set and enhance the interpretability of the generated DRs through feature selection, they do not improve the optimization performance of GPHH.
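The early frequency-based importance measure mentioned above can be sketched as follows, with rules flattened to token lists; the feature names are hypothetical:

```python
from collections import Counter

def feature_frequencies(best_rules, feature_set):
    """Estimate feature importance as the relative frequency of each
    feature across a set of best-performing DRs (token lists)."""
    counts = Counter()
    for rule in best_rules:
        counts.update(tok for tok in rule if tok in feature_set)
    total = sum(counts.values())
    return {f: counts[f] / total for f in feature_set}

features = {"PT", "WIQ", "NOR", "DD"}
best_rules = [["+", "PT", "WIQ"], ["-", "PT", "NOR"], ["*", "PT", "PT"]]
print(feature_frequencies(best_rules, features))
```

The weakness is visible even in this toy example: a feature sitting inside an ineffective or redundant subtree still inflates its count, which is what motivates the contribution-based metric of Mei et al. [29].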
Although feature selection has been successfully applied to single-objective problems, no feasible methods have yet been proposed for the multi-objective DFJSSP. The main challenge is that calculating feature importance in multi-objective problems must account not only for the performance indexes themselves but also for the correlation and non-domination relationships among them. As a result, the existing fitness-based feature importance measures for single-objective problems are no longer applicable, especially since the objectives in multi-objective problems are usually interrelated, and a feature may have a positive effect on one objective but a negative effect on another. Designing an effective feature importance measure for multi-objective problems is therefore very difficult. In addition, the result of feature selection depends heavily on manually set parameters. For instance, in previous studies [29,30,31], a feature is selected only if its weight is greater than or equal to half of the total weight. Such a parameter setting is too rigid, lacking not only fault tolerance and flexibility but also sufficient adaptability to different scheduling scenarios.
To address the limitations of the aforementioned feature selection methods, this paper proposes an improved GPHH with dual feature weight sets to automatically design energy-efficient DRs. In the proposed GPHH, two novel feature weight measures are introduced to calculate the feature weights in the multi-objective DFJSSP and form the dual feature weight sets. In addition, a novel hybrid population adjustment strategy is presented to guide the algorithm with the obtained dual feature weight sets. The main difference between the proposed feature weight measures and existing feature selection methods is that the proposed measures do not select features; instead, they assign each feature a weight according to its contribution/importance to the multi-objective problem, and these weights serve as the probabilities of the features being selected in subsequent iterations. In this way, the influence of manually set parameters on the feature set and on the optimization performance of the obtained DRs is avoided, and directional guidance for the search process of GPHH is provided while preserving the diversity of the generated DRs and the search space of GPHH. As a result, more excellent DRs can be obtained. The main contributions and innovations of this paper are as follows:
An improved GPHH algorithm based on dual feature weight sets is designed to guide the exploration of GPHH through the measured feature weights, thereby improving the search efficiency of the algorithm and automatically generating more promising and understandable DRs.
In order to measure feature weights (i.e., feature importance) more accurately in the multi-objective DFJSSP, two feature weight measures are proposed: one based on the fitness values of DRs and another based on the diversity of the Pareto front. Based on these two measures, the feature set for GPHH can be separated into dual feature weight sets for routing and sequencing decisions, respectively.
In order to use the obtained dual feature weight sets more effectively, a novel hybrid population adjustment strategy is also given in this paper. This strategy can adjust and refine the current population based on the feature weights so that the irrelevant and redundant features can be eliminated.
By considering the total energy consumption and the mean tardiness as two optimization objectives [32,33], the effectiveness of the proposed GPHH is demonstrated on an energy-efficient DFJSSP by comparing it with existing related algorithms. Additionally, the specific behaviors and associated impacts of the dual feature weight sets in the scheduling process are comprehensively analyzed.
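The core idea of using feature weights as selection probabilities, rather than applying a hard selection threshold, can be sketched as weighted sampling of leaf nodes during rule generation. The feature names and weight values below are invented for illustration and are not taken from the paper:

```python
import random

def sample_leaf(feature_weights, rng):
    """Choose a feature for a new leaf node with probability
    proportional to its measured weight (roulette-wheel style)."""
    features = list(feature_weights)
    weights = [feature_weights[f] for f in features]
    return rng.choices(features, weights=weights, k=1)[0]

# Hypothetical weights for a routing feature set:
routing_weights = {"MWT": 0.35, "WIQ": 0.30, "NIQ": 0.25, "DD": 0.10}
rng = random.Random(42)
counts = {f: 0 for f in routing_weights}
for _ in range(10_000):
    counts[sample_leaf(routing_weights, rng)] += 1
print(counts)
```

Unlike a fixed threshold, this scheme biases the search toward high-weight features while still letting low-weight features appear occasionally, preserving diversity and avoiding the irreversible removal of a feature on the basis of a rigid parameter.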
7. Comprehensive Discussion of the Experimental Results
As numerous experimental results have been presented in Section 5 and Section 6, this section offers a comprehensive discussion of those results.
First of all, the experimental results in Section 5.1 and Section 5.2 show that our improved GPHH algorithms (i.e., GPFW(fit) and GPFW(spa)) outperform the existing methods (i.e., GPLWT, GPDR, and GPFS) at the cost of slightly longer training and testing times. This indicates that using feature weights to guide the evolutionary process of GPHH does benefit the optimization performance compared to GPDR, and surpasses the feature selection method proposed in existing work [31], which focuses more on single-objective problems.
Additionally, taking the experimental results for the number of unique features (in Section 6.1) and the rule size (in Section 6.2) into consideration, one can see that our algorithms not only perform better but also produce more interpretable DRs. This is because the improved GPHH generates DRs with fewer unique features and smaller rule sizes after eliminating irrelevant and redundant features, making the DRs easier to understand.
Considering that the occurrence frequency of each feature in a rule can, to some extent, reflect the importance or contribution of that feature to the optimization objective, a conjoint analysis of feature weights and feature frequencies is given in Section 6.3. The experimental results show that feature frequencies and the computed feature weights mostly align, because features useful for the optimization objective contribute more within individuals/DRs and also occur more frequently. However, feature frequencies can be inflated by irrelevant and redundant features in individuals, limiting the rationality of frequency-based results.
In summary, all of the above experiments demonstrate that the proposed feature weight measures can calculate the significance or contribution of features in the multi-objective DFJSSP more effectively and rationally than existing methods. Hence, they can improve the optimization performance of GPHH and the interpretability of the generated DRs by removing irrelevant and redundant features and increasing the occurrence frequencies of high-weight features. Further experiments also show that the proposed feature weights, in contrast to feature frequencies, better capture the importance and contribution of each feature to the optimization objectives, because the feature weights account for the validity of both individuals and features with respect to the Pareto front.