1. Introduction
Multi-satellite dynamic mission planning (MSDMP) is a large-scale combinatorial optimization problem that aims to quickly accommodate multiple new tasks by adjusting the mission planning scheme in time through local or global replanning. An MSDMP system accounts for the uncertainties that may be encountered while satellites observe the Earth and performs well in both routine and emergency safeguard mission scenarios [1,2]. However, as application scenarios continue to expand, satellites face a large number of dynamic factors that interfere with in-orbit operation, which places higher demands on the timely adjustment of planning schemes and on planning timeliness. Existing MSDMP systems are restricted by their constraint formulations and struggle to meet the requirements of intelligent applications in highly dynamic situations [3]; involving satellite controllers directly in the MSDMP system for preference decision-making on complex dynamic problems can compensate for these deficiencies and lead to schemes that better match the current environmental requirements [4]. Therefore, multi-satellite dynamic mission planning based on user preference (MSDMPUP) combines the user's dynamically changing requirements with the mission planning system so that the system responds quickly to environmental changes; this is the key to constructing high-quality mission planning schemes and represents the cutting edge of current satellite mission planning research.
In recent years, researchers have conducted in-depth investigations on MSDMPUP problems and proposed many excellent models and algorithms. These approaches are classified into three main categories, namely, MSDMPUP planning models, planning strategies, and solution algorithms.
For MSDMPUP planning model research, common models include the integer programming model [5], the constraint satisfaction model [6], and the Markov decision model [7]. Qu et al. [8] developed an intelligent scheduling framework that formulates satellite mission planning as a mixed-integer programming model, solves it with an imitation-learning-based intelligent algorithm, and proposes iterative views based on user preferences to improve the performance of the training strategy. Yu et al. [9] combine the constraints of visibility, optical axis (OA) adjustment capability, cooperation duration, and immunity to establish a constraint satisfaction scheduling model based on user preference, which effectively improves the total observational benefit of the multi-satellite imaging dynamic mission planning system. Li et al. [10] introduce a Markov decision model based on user preferences in the task decision phase to effectively reduce the time consumed by the MSDMPUP system.
For research on MSDMPUP planning strategies, the main strategies include proactive planning [11], reactive planning [12], and proactive–reactive planning [13]. Gu et al. [14] propose an MSDMPUP model that considers cloud prediction, relying on predictive recurrent neural networks and up-to-date satellite cloud images for continuous cloud forecasting and user preference selection through proactive planning. Wang et al. [15] propose a reactive planning method with user-preference- and time-driven strategies for the high dynamics and uncertainty of MSDMPUP, taking into account the two key objectives of observation benefit and robustness. Dai et al. [16] design the MSDMPUP scheme through the four phases of task preprocessing, task resource matching, task conflict assessment, and task resource allocation, and propose a user preference approach with task-slacking proactive–reactive planning to maximize the guaranteed task completion rate.
For research on MSDMPUP solution algorithms, the main algorithms include heuristic algorithms [17], deterministic algorithms [18], intelligent optimization algorithms [19], and reinforcement learning [20]. Wang et al. [21] provide an in-depth analysis of user preference selection under environmental changes and propose a two-way rolling-horizon heuristic to effectively improve the scheduling efficiency of MSDMPUP systems. Wu et al. [22] propose a meta-heuristic approach for the large-scale MSDMPUP problem using a divide-and-conquer framework, improving the task execution benefit in different simulation environments with better allocation efficiency. Wang et al. [23] propose a multi-objective differential evolution algorithm based on a spatial partitioning–user preference policy that solves the MSDMPUP problem while simultaneously considering the three objectives of overall task benefit, antenna load balancing, and task completion timeliness. Lin et al. [24] propose a multi-satellite deep learning architecture that fuses user preferences, in which each satellite can train an efficient model based on deep learning algorithms, significantly improving the timeliness of satellite mission decision-making. Song et al. [25] propose a deep reinforcement learning-based solution for the large-scale MSDMPUP problem, which uses deep neural networks to estimate the value of each action in order to prioritize task scheduling.
Although researchers have achieved better planning results for the MSDMPUP problem through advanced planning models, planning strategies, and solution algorithms, there is still a gap between the output scheme of the mission planning system and the real user requirements due to a variety of dynamic factors, such as task cancellation, changes in task attributes, and changes in the allocated satellite resources in complex environments [26]. Presently, there are still many challenges in multi-satellite dynamic mission planning; the specific difficulties are described as follows:
Satellites performing on-orbit missions in space often face new emergencies, changes in user demands, and other dynamic events. Moreover, the existing target importance determination method has a limitation in that satellite resources cannot be deployed in time according to the dynamic changes in mission demand. Furthermore, it is difficult to complete the optimal matching of resource capacity and mission demand in the generated planning scheme. Therefore, the construction of a mission planning model that reasonably formulates the target importance is a major challenge for MSDMPUP.
In existing research, users are mostly involved only in the initial task stage and the final scheme selection stage. Meanwhile, dynamic tasks coexist with static tasks, task requirements exhibit preference, complexity, and conflict, and the output planning scheme often fits the actual situation poorly. How to introduce the human capacity for planning and decision-making into the MSDMPUP optimization search process, design a reasonable and reliable user preference mechanism, and construct a high-quality task planning scheme is a new challenge for applying MSDMPUP in complex environments.
The introduction of user preference requirements makes the real-world environmental constraints in MSDMP more complex and enlarges the problem space, making it difficult for the multi-satellite dynamic mission planning problem to respond quickly to environmental changes and lowering optimization efficiency. In addition, designing a simple output planning scheme is itself a difficult challenge. Therefore, how to construct MSDMPUP solution algorithms that adapt to a variety of change modes and effectively trade off solution performance against computational efficiency is another challenge to applying MSDMPUP in complex environments.
To solve the above difficult problems, this work investigates MSDMPUP with imaging satellites. The main contributions of this work are as follows:
To address the real-time requirements of MSDMPUP due to dynamic changes in user requirements in complex environments, an MSDMPUP model based on a task rolling window is proposed. The model takes into account the three optimization objectives of task benefit, task completion, and task timeliness. Moreover, the model responds to environmental changes quickly by applying both periodic triggering and event triggering to achieve timely updating of the target task importance degree, which allows the task set to be scheduled in the optimal order based on the size of the target task importance degree, thus ensuring that each task can be completed within the optimal scheduling time.
To address the poor adaptability of the planning schemes output by the MSDMPUP system, a hybrid preference interaction mechanism is proposed. In this mechanism, the pre-planning layer constructs a mapping strategy that maps the MSDMPUP allocation sequence to an evaluation matrix for each sub-objective; the preference interaction layer generates a new generation of populations that participate in the optimization search according to the instructions sent by the satellite controller; and the interaction output layer constructs an inverse mapping strategy that maps the sub-objective evaluation values back to allocation sequences and outputs the newest mission planning scheme for the satellite controller to select. This mechanism effectively deals with dynamic changes in the environment.
To address the difficulties that the introduction of user preferences creates for MSDMPUP, namely coping with irregular environmental changes and poor solution efficiency, a knowledge transfer strategy for the multi-objective evolutionary algorithm is proposed. The strategy constructs an MSDMPUP knowledge pool for knowledge storage and, in a new environment, adaptively selects a knowledge transfer mode according to the differences between the current and historical environments. It retrieves similar knowledge from the historical environment to promote rapid convergence of the algorithm, adapts to changes in mission requirements to obtain a high-quality multi-satellite imaging mission planning scheme, and enhances the generality of the solution algorithm.
To test the performance of the proposed method, we set up four sets of simulation experiments to verify the feasibility of the HPIM–KTSMOEA algorithm and compared the performance with other algorithms. The experimental results show that the method proposed in this work is superior and scalable, and provides a good reference for the multi-satellite dynamic mission planning problem in complex environments.
The rest of this work is organized as follows:
Section 2 presents the basic MSDMP scenario and the motivation for investigating MSDMPUP.
Section 3 introduces the MSDMPUP model based on a task rolling window and describes its objective function and constraints.
Section 4 details the hybrid preference interaction mechanism of the HPIM–KTSMOEA algorithm and the specifics of the knowledge transfer strategy for the multi-objective evolutionary algorithm.
Section 5 sets up simulation experiments to analyze the effectiveness and stability of the HPIM–KTSMOEA algorithm.
Section 6 provides our conclusions and future perspectives.
3. Problem Description
This section first describes the MSDMPUP model based on a task rolling window and then further describes the objective function and constraints of the model.
3.1. MSDMPUP Model Based on a Task Rolling Window
In this section, the MSDMPUP model based on a task rolling window (MSDMPUP–TRW) is constructed. The model synthesizes the satellite imaging requirements of complex environments in terms of six indicators, namely, safeguard type, application area, task requirements, resource characteristics, urgency degree, and environmental perturbations, to generate the target task importance degree, which is used to produce the initial task set and determine the observation order of the target tasks.
The target task importance degree is dynamically updated by using a task rolling window to cope with changes in the complex environment and a knowledge transfer strategy for the multi-objective evolutionary algorithm is used to generate the initial task planning scheme. The specific method of the MSDMPUP–TRW model is as follows.
Firstly, a unified mathematical description of the indicators affecting the target task importance degree is provided as follows:
where the g-th element denotes the g-th target task indicator. The indicator evaluation matrix for any target task is then constructed, in which each entry is the quantitative value of the g-th indicator for the i-th target observation task. A weight vector is obtained for the quantitative indicator values of the targets, and, according to this weight vector, the initial target task importance degree is calculated as the weighted combination of the indicator values.
Secondly, the task rolling window method is used to dynamically update the task importance degree during the task planning cycle, using both a periodic trigger method and an event trigger method. The periodic trigger is a time-interval-based mechanism that recalculates the target task importance degree according to a preset period, unaffected by external tasks. The event trigger is a mechanism based on the occurrence of specific events: when the system detects such an event (such as a change in satellite resources, a change in the demand for a target task, or another emergency event), the importance of each target task is recalculated. To avoid repeated triggering that would disturb the stability of the system, an event trigger threshold is set to achieve a dynamic balance between the number of triggers and the task importance degree.
At the initial moment, the periodic trigger method is executed, so that the dynamic change in the importance of each target task is independent of external tasks and rolls over only once per fixed period. During periodic execution, if an emergency event occurs and the value of the event trigger function exceeds the preset event trigger threshold, the event trigger method is executed; this cycle repeats until the end of task planning. The event trigger function is defined over the task planning cycle, and the quantities involved are the number of target tasks, the target task importance degree of the i-th task, the start and end times of the task planning cycle, and the ideal planning time and the current moment of the i-th task, respectively. The timeliness parameter of the i-th target task increases as the difference between the current moment and the ideal planning time decreases, and a larger timeliness parameter yields a larger event trigger function value. The target task importance degree, combined with the timeliness parameter, quantifies the urgency and importance of scheduling each target task.
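To make the trigger logic concrete, the following Python sketch computes a weighted-sum target task importance degree, a simple timeliness parameter, and the resulting roll decision. The weighted-sum form of the importance, the reciprocal timeliness function, the trigger-value definition, and all names are illustrative assumptions and not the paper's exact formulas.

```python
import numpy as np

def task_importance(indicator_matrix, weights):
    """Assumed weighted-sum importance: one row of quantified indicators
    (safeguard type, application area, task requirements, resource
    characteristics, urgency degree, environmental perturbations) per task."""
    return indicator_matrix @ weights                     # shape: (num_tasks,)

def timeliness(t_now, t_ideal):
    """Assumed timeliness parameter: grows as the current moment approaches
    each task's ideal planning time."""
    return 1.0 / (1.0 + np.abs(t_now - t_ideal))

def event_trigger_value(importance, t_now, t_ideal):
    """Assumed event trigger function: importance-weighted sum of the
    per-task timeliness parameters."""
    return float(np.sum(importance * timeliness(t_now, t_ideal)))

def should_roll(importance, t_now, t_ideal, t_last_roll, period, threshold):
    """Hybrid trigger of the task rolling window: roll either when the fixed
    period has elapsed (periodic trigger) or when the event trigger value
    exceeds the preset threshold (event trigger)."""
    periodic = (t_now - t_last_roll) >= period
    event = event_trigger_value(importance, t_now, t_ideal) > threshold
    return periodic or event
```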
Figure 2 illustrates the execution process of the task rolling window. During the task planning time, the target task importance of a group of target tasks is dynamically updated under a preset event trigger threshold; the blue and yellow squares represent the regular and emergency tasks, respectively, and dotted lines of different colors separate the different, overlapping task rolling windows. After the initial moment and the preparation time, the periodic trigger initiates the m-th task roll; subsequent task rolls are initiated either by emergency tasks as they arrive (event triggering) or by the periodic trigger, and this process loops until the end of the task planning time.
3.2. Establish Model Objective Functions and Constraints
3.2.1. Model Objective Functions and Decision Variable
The objective function construction in this study mainly considers the task benefit objective function, the task timeliness objective function, and the task completion objective function.
where Equation (6) maximizes the task benefit of the satellites performing the target tasks over all orbits, so that the satellites fulfill as many high-importance tasks as possible. The quantities involved are the set of target tasks, the set of satellites, the set of satellite execution orbits, the target importance degree matrix of the target tasks, the decision factor, the target task importance degree of each target task, the number of target tasks, the number of satellites, and the number of executable orbital circles. The decision factor is a binary variable: it equals 0 when a target mission is not carried out by a given satellite in a given orbit, and 1 when it is.
- 2. Sub-objective function 2—minimize the task timeliness objective function:
where Equation (8) minimizes the response time of the satellites performing the target tasks over all orbits, so that the satellites complete as many fast-response tasks as possible; the quantity involved is the observed response time of each target task.
- 3. Sub-objective function 3—maximize the task completion objective function:
where Equation (9) represents the number of target tasks performed in this round of mission planning as a percentage of the total number of tasks, so the satellites should be assigned as many target tasks as possible to perform. A computational sketch of the three sub-objectives is given after this list.
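As an illustration only, the sketch below evaluates the three sub-objectives for one candidate allocation, assuming a binary decision tensor indexed by (task, satellite, orbit); the array layout and the simple aggregations are assumptions rather than the paper's formal Equations (6), (8), and (9).

```python
import numpy as np

def objectives(x, importance, response_time):
    """Evaluate one candidate allocation.
    x: assumed binary decision tensor of shape (tasks, satellites, orbits);
    importance: target task importance degrees, shape (tasks,);
    response_time: observed response time of each target task, shape (tasks,)."""
    executed = x.any(axis=(1, 2))                            # tasks that are scheduled at all
    task_benefit = float(importance[executed].sum())         # Equation (6): maximize
    task_timeliness = float(response_time[executed].sum())   # Equation (8): minimize
    task_completion = float(executed.mean())                 # Equation (9): maximize
    return task_benefit, task_timeliness, task_completion
```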
3.2.2. Model Constraints
This subsection describes the MSDMPUP constraints, specifically the target task allocation constraints, the target visibility constraints, the satellite performance constraints, and the target mission assignment constraints; an illustrative feasibility check is sketched after the constraint list.
where the quantities are, respectively, the probability of successful observation when a satellite performs a given target task, the probability that a satellite successfully observes at least one target mission, and the decision variable for a satellite performing a target mission;
- 2. The target visibility constraint is as follows: the satellite is visible to the target mission within the elevated node pitch angle interval:
where the quantities are the elevation angle of the intersection point when the imaging satellite is perpendicular to the target task, the linear distance between the target task and the center of the Earth, the linear distance between the imaging satellite and the ground, and the visible angle between the imaging satellite and the target task.
- 3. The satellite performance constraints are as follows: these constraints are limited by the satellite's own resources, side-swing energy, and execution time:
where the quantities are the resource consumption per unit time in a given orbit, the side-swing energy consumption of a satellite performing a target task in that orbit, the observation start and end times of the satellite performing the target mission in that orbit, and the satellite's own energy.
- 4. The target mission assignment constraints are as follows: these constraints mainly consist of the side-swing angle constraints, the resolution constraints, and the field-of-view angle constraints:
where the quantities are the swing angle required to observe the target, the maximum lateral swing angle of the imaging satellite, the remote sensor resolution of the imaging satellite, the minimum resolution required by the target task, and the field of view of the satellite remote sensor.
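The sketch below checks a few of the named constraints (side-swing angle, resolution, and onboard energy) for a single candidate assignment. The threshold-style formulation and all variable names are assumptions for illustration; they do not reproduce the paper's exact constraint equations.

```python
def assignment_feasible(swing_angle, max_swing_angle,
                        sensor_resolution, required_resolution,
                        energy_needed, satellite_energy):
    """Illustrative feasibility check for one satellite-task pairing.
    All arguments are assumed scalars in consistent units."""
    within_swing = abs(swing_angle) <= max_swing_angle          # side-swing angle constraint
    fine_enough = sensor_resolution <= required_resolution      # sensor at least as fine as required
    enough_energy = energy_needed <= satellite_energy           # resource / energy constraint
    return within_swing and fine_enough and enough_energy

# Example: a 25 deg swing against a 40 deg limit, a 1.5 m sensor against a
# 2 m requirement, and sufficient onboard energy is accepted (values made up).
print(assignment_feasible(25.0, 40.0, 1.5, 2.0, 3.0, 10.0))     # True
```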
4. Design of HPIM–KTSMOEA Algorithm
In this section, the hybrid preference interaction mechanism and knowledge transfer strategy for the multi-objective evolutionary algorithm (HPIM–KTSMOEA) are proposed for solving the MSDMPUP model. The framework of the HPIM–KTSMOEA algorithm is shown in Algorithm 1.
Algorithm 1. The framework of HPIM–KTSMOEA
Input: the initial task set generated by the MSDMPUP–TRW model;
Output: the MSDMPUP optimal scheme;
1 Initialize the population;
2 Construct the MSDMPUP fitness functions according to Equations (6)–(8) and the MSDMPUP constraints according to Equations (9)–(12);
3 for gen = 1:n
4    Generate the sub-populations; /* pre-planning layer */
5    Apply the mapping strategy; /* mapping strategy */
6    Generate a new generation of population according to the preference instructions; /* preference interaction layer */
7    Perform the KTSMOEA optimization search; /* HPIM–KTSMOEA algorithm */
8    Apply the inverse mapping strategy and output the latest MSDMPUP scheme for the satellite management user; /* interaction output layer */
9 end
10 End
4.1. Hybrid Preference Interaction Mechanism
This study proposes the hybrid preference interaction mechanism (HPIM) based on the MSDMPUP model to address the mismatch of planning scenarios applied in real environments caused by the limitations of user preference demand input in existing mission planning methods. In various stages of the mission planning process of the solution optimization search, the planning method is based on the preference instructions of satellite controllers so that it can be applied to different mission scenarios and effectively respond to the dynamic changes in the environment. The hybrid preference interaction mechanism consists of three layers: the pre-planning layer, the preference interaction layer, and the interaction output layer.
In the pre-planning layer, the satellite controller updates the external comprehensive index that evaluates the importance of the target tasks according to actual needs, thereby intervening in the observation sequence of the target tasks. In the preference interaction layer, the satellite controller sends demand instructions to the system according to the actual application environment and user needs, thereby intervening in the optimization search over user preference task planning schemes. The interaction output layer outputs multiple imaging task planning schemes generated by the system for the satellite controllers to select from, realizing manual intervention in the final scheme selection.
The introduction of the hybrid preference mechanism allows satellite controllers to specify the allocation of resources based on preference-based instructions, ensuring that the mission planning scheme matches the actual needs under different mission scenarios and improving the efficiency and success rate of mission execution. In addition, the hybrid preference interaction mechanism emphasizes the interaction and collaboration between satellite controllers and the mission planning system, making the mission planning scheme flexible and scalable.
In addition, the existing investigation methods usually use the integer discrete set of satellite and target mission assignment sequences as the candidate solution of the MSDMPUP problem. Moreover, invalid and duplicate values of the candidate solutions are obtained during the optimization search process, which causes the failure of the MSDMPUP scheme. In this section, the physical significance of the satellite and the target mission is considered, and the individual sub-objective fitness values of the satellite and the target task are used as a unified medium. Furthermore, the mapping strategy and inverse mapping strategy are set up in the hybrid preference interaction mechanism to convert the MSDMPUP problem from a discrete space to a continuous space, which increases the reasonableness of the target task allocation and the efficiency of the solution.
4.1.1. Pre-Planning Layer
The pre-planning layer aims to obtain an initial allocation reference knowledge pool based on the initial set of tasks, and to construct a mapping strategy to map the multi-satellite imaging task allocation sequence into an evaluation matrix for each sub-objective, generating a knowledge pool of target task evaluations to inform the MSDMPUP planning allocation.
Figure 3 shows the schematic diagram of the pre-planning layer, where the gray circles indicate individuals of the initial population, the purple circles indicate individuals of the task benefit sub-population, the green circles indicate individuals of the task timeliness sub-population, and the blue circles indicate individuals of the task completion sub-population. It is assumed that six satellites carry out observations of six target tasks distributed in different regions.
Firstly, according to the initial task set generated by the MSDMPUP model based on a task rolling window, multiple sets of multi-satellite imaging task assignment sequences are randomly generated as the initial parent population, in which each set of multi-satellite imaging task assignment sequences is an individual. The NSGA-II algorithm is then applied to the MSDMPUP model to quickly obtain the initial Pareto frontier surface of the initial parent population using a non-dominated sorting method.
Secondly, the frontier surface is optimized using a reference point decomposition method to generate the optimal Pareto frontier surface for the initial allocation reference knowledge pool. Each unit of knowledge in the initial allocation reference knowledge pool is a matrix containing five parts, namely, the executing satellite number sequence, the target task sequence, the task benefit, the task response time, and the task completion, with the following expressions:
Thirdly, the fitness values of the task benefit objective function, the task timeliness objective function, and the task completion objective function are used as the evaluation criteria, and the individuals on the optimal Pareto frontier are sorted in descending order of the fitness values of the three sub-objective functions to generate the task benefit sub-population, the task timeliness sub-population, and the task completion sub-population.
Finally, a mapping strategy is constructed to map the sequence of execution satellites and target tasks in the discrete evaluation space to the continuous scheme evaluation space through the corresponding function values of each sub-objective, which is specified as follows:
where the objective function evaluation matrix is used as the replacement matrix for the spatial mapping, and the individuals in the task benefit sub-population, the task timeliness sub-population, and the task completion sub-population are mapped by the mapping strategy to the task benefit evaluation matrix, the task timeliness evaluation matrix, and the task completion evaluation matrix, respectively. The evaluation matrices of the objective functions form a continuous vector space and are stored in the target task evaluation knowledge pool.
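A minimal Python sketch of this mapping strategy follows. Each individual is evaluated once per sub-objective and stored as a row of a continuous evaluation matrix; the dictionary layout and the `evaluate` interface are assumed for illustration and are not the paper's data structures.

```python
import numpy as np

def build_evaluation_matrices(subpopulations, evaluate):
    """Map discrete assignment sequences into a continuous evaluation space.
    subpopulations: assumed dict mapping a sub-population name (e.g.
    'benefit', 'timeliness', 'completion') to a list of individuals;
    evaluate(individual): assumed callable returning the tuple
    (task benefit, task timeliness, task completion)."""
    knowledge_pool = {}
    for name, individuals in subpopulations.items():
        # one row per individual: its three sub-objective evaluation values
        knowledge_pool[name] = np.asarray([evaluate(ind) for ind in individuals])
    return knowledge_pool
```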
4.1.2. Preference Interaction Layer
The preference interaction layer aims to generate a new population generation based on the commands sent by satellite controllers and to generate user preference planning schemes by optimizing the search using a knowledge transfer strategy for the multi-objective evolutionary algorithm; these data generation steps are carried out to realize the dynamic adjustment of the user preference task planning schemes using the optimization search process. Satellite controllers can send demand instructions to the system to provide empirical knowledge according to the actual application requirements.
Figure 4 shows the schematic diagram of the preference interaction layer.
Firstly, the satellite controllers send a single command, double commands, or multiple commands according to the actual requirements. Depending on which commands are sent, the new generation population is constructed by selecting individuals according to the corresponding preferences, where the quantities involved are the new generation population, the initial parent population, and the sub-populations, namely the task benefit sub-population, the task timeliness sub-population, and the task completion sub-population.
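The sketch below shows one way such a preference-driven new generation could be assembled: the sub-populations named in the controller's commands are kept and the remainder is refilled from the parent population. This mixing rule and all names are illustrative assumptions, not the construction defined in the paper.

```python
import numpy as np

def build_new_population(parent, subpops, commands, rng=None):
    """parent: list of individuals in the initial parent population;
    subpops: assumed dict mapping command names ('benefit', 'timeliness',
    'completion') to the corresponding sub-populations;
    commands: the preference commands sent (empty for no preference)."""
    rng = rng or np.random.default_rng()
    size = len(parent)
    selected = [ind for name in commands for ind in subpops[name]]
    if len(selected) >= size:
        return selected[:size]                     # preferred individuals fill the population
    # refill the remaining slots with randomly chosen parent individuals
    fill_idx = rng.choice(size, size=size - len(selected), replace=False)
    return selected + [parent[i] for i in fill_idx]
```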
Secondly, the knowledge transfer strategy for the multi-objective evolutionary algorithm (KTSMOEA) is used to perform evolutionary operations on the new generation population by calculating the crowding distance between knowledge items as well as the otherness of the knowledge. Using the knowledge otherness as the medium of knowledge transfer between the historical environment and the current environment, the algorithm adaptively selects a knowledge transfer strategy and, when the termination condition is reached, outputs the MSDMPUP user preference planning scheme.
4.1.3. Interaction Output Layer
The interaction output layer maps the sub-objective evaluation values back through the constructed inverse mapping strategy to obtain the sequence of satellite and target mission assignments, and the system then outputs the latest mission planning schemes for the satellite controllers, achieving manual intervention in the final scheme selection. The inverse mapping strategy matches the current evaluation matrix obtained from the evolutionary operation, element by element, against the historical sub-objective evaluation values of the corresponding evaluation matrix in the target task evaluation knowledge pool. It selects the historical sub-objective evaluation value with the smallest difference from the current sub-objective evaluation value and then maps it back to the corresponding sequence relationship. The inverse mapping strategy is specified as follows:
where the first three quantities represent the task benefit value, task timeliness value, and task completion degree of the current satellite performing the target task, respectively; the inverse mapping function maps these values back; and the remaining quantities represent the corresponding historical task benefit value, historical task timeliness value, and historical task completion degree of the satellite performing the target task in the target task evaluation knowledge pool, respectively.
- 2. Satellite controllers send double instructions:
- 3. Satellite controllers send multiple commands:
Taking the reflective output of the task benefit priority command sent by the satellite controller as an example, the principle of the interaction output layer is introduced, as shown in Figure 5. The task benefit evaluation matrix obtained from the evolutionary operation of the new generation population is matched against the historical task benefit evaluation matrix, which serves as the inverse-mapping reference. The current sub-objective evaluation value of 191 is compared one by one with the values in the first column of the historical task benefit evaluation matrix; the historical value of 186, which has the smallest difference from the current value, is matched, and its inverse projection is retrieved to obtain the corresponding satellite and target task sequence relationship. The row containing the historical value of 186 is then marked as invalid and removed from the evaluation matrix, and the next current sub-objective evaluation value, 405, is matched in the same way. This loop traversal is used to output the final allocation sequence of satellite–target task pairs.
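The following Python sketch illustrates the inverse mapping on the Figure 5 example: each current evaluation value is matched to the closest remaining historical value, whose stored satellite-task assignment is recovered, and the matched row is then invalidated. The parallel-array layout and the specific (satellite, task) pairs are hypothetical.

```python
import numpy as np

def inverse_map(current_values, history_values, history_assignments):
    """current_values: sub-objective evaluation values of the evolved matrix;
    history_values: historical evaluation values in the knowledge pool;
    history_assignments: the (satellite, target task) pair stored with each
    historical value (assumed parallel list)."""
    remaining = list(range(len(history_values)))
    allocation = []
    for value in current_values:
        # index of the closest remaining historical evaluation value
        best = min(remaining, key=lambda r: abs(history_values[r] - value))
        allocation.append(history_assignments[best])
        remaining.remove(best)                      # invalidate the matched row
    return allocation

# In the spirit of Figure 5: 191 matches the closest historical value 186,
# then 405 is matched next (all pairs below are hypothetical).
hist_vals = np.array([240.0, 186.0, 405.0, 90.0])
hist_pairs = [("S1", "T3"), ("S2", "T5"), ("S4", "T1"), ("S3", "T2")]
print(inverse_map([191.0, 405.0], hist_vals, hist_pairs))
# [('S2', 'T5'), ('S4', 'T1')]
```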
4.2. Knowledge Transfer Strategy for Multi-Objective Evolutionary Algorithm
The multi-objective evolutionary algorithm (MOEA) is an intelligent search algorithm that finds optimal solutions to problems with multiple optimization objectives through continuous iterative evolution [33,34]. The MSDMPUP problem requires the simultaneous consideration of the three optimization objectives of task benefit, task completion, and task timeliness, making it a typical multi-objective optimization problem. In addition, MSDMPUP is affected by the dynamically changing requirements of users in complex environments, so the problem space is larger and the problem uncertainty much greater. How to ensure that the algorithm responds efficiently to environmental changes while reducing computational consumption is therefore a key issue to be solved.
Therefore, this section proposes a knowledge transfer strategy for the multi-objective evolutionary algorithm (KTSMOEA), which is capable of considering both regular and stochastic changes, efficiently tracking environmental changes, and accelerating the convergence of populations in new environments. The schematic diagram of the algorithm is shown in
Figure 6.
Firstly, the MSDMPUP knowledge pool is constructed for storing the current environmental knowledge and historical knowledge as a pool of solutions for solving the MSDMPUP problem as follows:
where each element denotes the knowledge of a given generation in the MSDMPUP knowledge pool. A piece of knowledge is an optimal solution of the MSDMPUP problem obtained during the optimization search: the optimal subset of each historical environment, together with the implicit features of that optimal subset, may provide valuable information for exploring future environments. The historical knowledge therefore consists of the optimal subset of each generation of the historical environment and the corresponding set of implicit features, where each implicit feature is the average of all solutions in the optimal subset of the historical environment along each dimension of the decision space; that is, it averages, dimension by dimension, the elements of the optimal subset of the historical environment.
Secondly, the MSDMPUP knowledge pool has a maximum storage capacity. To store more diverse historical knowledge in the limited space, a historical knowledge update is performed whenever the capacity is exceeded, which helps new environments quickly explore optimal mission planning schemes. The update uses the Euclidean distance between knowledge items to compute the crowding distance between them: a smaller crowding distance indicates greater similarity between knowledge items and therefore more redundant information for future generations of the optimization search. The most crowded (most similar) knowledge is removed from the MSDMPUP knowledge pool, ensuring that diverse historical knowledge is always stored within the pool.
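A minimal sketch of this capacity-limited update is given below; each knowledge item is assumed to be a real-valued vector, and the crowding distance is taken as the Euclidean distance to the nearest neighbour, both of which are illustrative choices rather than the paper's exact definitions.

```python
import numpy as np

def update_knowledge_pool(pool, new_knowledge, capacity):
    """Add a knowledge item and, while the pool exceeds its maximum storage
    capacity, remove the most redundant item (smallest nearest-neighbour
    distance) so that diverse historical knowledge is retained."""
    pool = pool + [np.asarray(new_knowledge, dtype=float)]
    while len(pool) > capacity:
        pts = np.vstack(pool)
        dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
        np.fill_diagonal(dists, np.inf)
        crowding = dists.min(axis=1)                # distance to the nearest neighbour
        pool.pop(int(np.argmin(crowding)))          # drop the most similar knowledge
    return pool
```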
Determining the otherness between the historical and current environments and identifying the most appropriate knowledge in the knowledge pool are key issues for efficient knowledge transfer. When two environments have small otherness, their corresponding optimal subsets and fronts are similar, so the algorithm's search can be accelerated by selecting a historical environment with small differences from the current environment for knowledge transfer. The environmental otherness of a knowledge item is computed from the global optimization objective function values of that knowledge in the historical environment and in the current environment, respectively.
Finally, knowledge transfer strategies are adaptively selected based on knowledge otherness to transfer knowledge from the historical environment to the current environment, including the following:
If the knowledge otherness is small, the current environment is similar to the historical environment, and the knowledge corresponding to the historical environment is transferred directly to the current environment; that is, the historical knowledge is combined with random individuals in the current environment to form the next-generation population participating in the optimization search.
If the knowledge otherness is large, the current environment differs from the historical environment, but the knowledge in the historical environment may still be closer to the optimum of the real environment than the knowledge in the current environment. In this case, the subspace distribution alignment method is used to mine the association rules between the historical environment and the current environment and then transfer the knowledge from the historical environment to the current environment. The subspace distribution alignment method operates on the eigenvectors of the two environments: it uses the orthogonal bases (and their orthogonal complements) of the historical and current subspaces, together with the corresponding diagonal matrices, to align the distribution of the historical knowledge with that of the current environment. The subspace distribution alignment method thus enables knowledge from the historical environment to be transferred to the current environment so that the algorithm quickly adapts to it. The specific description is shown in Algorithm 2.
Algorithm 2. The framework of the knowledge transfer strategy for the multi-objective evolutionary algorithm
Input: initial population P, task benefit sub-population P1, task timeliness sub-population P2, task completion sub-population P3;
Output: a series of MSDMPUP schemes;
1 while the termination criterion is not met
2    Construct the MSDMPUP knowledge pool and dynamically update the knowledge:
        if the pool is within its capacity, store the new knowledge; /* MSDMPUP knowledge pool */
        else determine the crowding distance between knowledge and remove the most similar knowledge; /* knowledge update */
        end
3    Select the knowledge transfer strategy according to the knowledge otherness: /* determine otherness between environments */
        if the otherness is small, transfer the historical knowledge directly; /* direct transfer */
        else apply the subspace distribution alignment method; /* subspace distribution alignment */
        end
4    Perform the basic multi-objective evolutionary algorithm;
5 End
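To complement Algorithm 2, the sketch below shows one concrete way the adaptive choice between direct transfer and subspace alignment could be coded; the otherness threshold, the PCA-style subspace bases, and the alignment transform are illustrative assumptions and do not reproduce the paper's subspace distribution alignment formulas.

```python
import numpy as np

def transfer_knowledge(hist_pop, curr_pop, otherness, threshold, n_components=2):
    """hist_pop, curr_pop: assumed real-valued decision matrices (one row per
    solution) for the historical and current environments."""
    hist = np.asarray(hist_pop, dtype=float)
    curr = np.asarray(curr_pop, dtype=float)
    if otherness <= threshold:
        return hist                                  # similar environments: direct transfer
    # Different environments: align the historical subspace with the current one.
    _, _, vh_h = np.linalg.svd(hist - hist.mean(axis=0), full_matrices=False)
    _, _, vh_c = np.linalg.svd(curr - curr.mean(axis=0), full_matrices=False)
    B_h = vh_h[:n_components].T                      # basis of the historical subspace
    B_c = vh_c[:n_components].T                      # basis of the current subspace
    M = B_h.T @ B_c                                  # alignment between the two bases
    aligned = (hist - hist.mean(axis=0)) @ B_h @ M @ B_c.T + curr.mean(axis=0)
    return aligned                                   # transferred knowledge for the new environment
```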
4.3. Computational Complexity of HPIM–KTSMOEA
In this section, the computational complexity of the HPIM–KTSMOEA algorithm is described along two dimensions: time complexity and space complexity. The time complexity has four components. First, task set and population initialization, whose cost is proportional to the number of target tasks and the population size. Second, calculating the fitness values of the objective functions and processing the constraints, whose cost depends on the complexity of the objective functions and constraints evaluated over the population. Third, the preference interaction mechanism, whose cost depends on the number of user preference instructions. Fourth, the knowledge-transfer-based multi-objective evolutionary algorithm, whose cost depends on the construction and updating of the MSDMPUP knowledge pool and hence on the knowledge pool size. The overall time complexity of HPIM–KTSMOEA combines these components over the number of population iterations.
Secondly, the space complexity is related to the amount of data the algorithm must store, which in HPIM–KTSMOEA is determined by three components: the input parameters and task set, whose space requirement depends on the number of tasks; the population and individual information, whose space requirement depends on the population size; and the MSDMPUP knowledge pool, whose space requirement depends on the size of the knowledge pool. The overall space complexity of HPIM–KTSMOEA is the sum of these three terms.
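Under notation that is ours rather than the paper's (with $N_T$ the number of target tasks, $N_P$ the population size, $C_f$ the cost of evaluating one individual's objectives and constraints, $N_u$ the number of user preference instructions, $K$ the knowledge pool size, and $G$ the number of population iterations), the stated complexities can be summarized as:

```latex
\begin{aligned}
T_{\mathrm{HPIM\text{-}KTSMOEA}} &= \mathcal{O}\!\left( N_T N_P + G \left( C_f N_P + N_u + K \right) \right),\\
S_{\mathrm{HPIM\text{-}KTSMOEA}} &= \mathcal{O}\!\left( N_T + N_P + K \right).
\end{aligned}
```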
5. Experimental Result and Analysis
To verify the effectiveness and stability of the HPIM–KTSMOEA algorithm for solving the MSDMPUP problem, the following four sets of simulation experiments were carried out in this work using MATLAB R2020b: Experiment 1. Effectiveness of the MSDMPUP model based on a task rolling window; Experiment 2. Effectiveness and stability of HPIM–KTSMOEA under small-scale MSDMPUP; Experiment 3. Effectiveness and stability of HPIM–KTSMOEA under large-scale MSDMPUP; and Experiment 4. Performance analysis of the HPIM–KTSMOEA algorithm.
5.1. Experiment Setting
According to the MSDMPUP mission requirements, this experiment generated 10 imaging satellites from a simulation scenario built in the Satellite Tool Kit (STK). Each satellite carried only one sensor, and the maximum satellite yaw angle did not exceed 40°. Satellite orbit parameters and payload parameters are shown in
Table 1. The target tasks were generated randomly, and the target task set satisfying the target task allocation, target visibility, satellite performance, and target mission assignment constraints was generated using the MSDMPUP model based on a task rolling window. The experimental simulation time was set as 00:00:00~24:00:00, 9 February 2024 (UTCG). In addition, the HPIM–KTSMOEA algorithm parameters were as follows: the population size was set to 100 and the maximum number of iterations was set to 1000 generations.
5.2. Experiment 1: Effectiveness of the MSDMPUP Model Based on a Task Rolling Window
To verify the effectiveness of the MSDMPUP model based on a task rolling window (MSDMPUP–TRW), the MSDMPUP–TRW model in this study was compared with the multi-satellite imaging mission planning model based on event triggering (MSIMP–ET) [35] and the multi-satellite imaging mission planning model based on periodic-triggered rolling (MSIMP–PTR) [36]. Task sets carrying the target task importance degree were generated using the three types of methods.
The MSIMP–ET model addresses the problem of high dimensionality and time-varying feature information. The model formulates the multi-satellite scheduling problem as a dynamic combinatorial optimization problem, which reduces losses due to irrational allocation of satellite resources by triggering events, which means that the rolling is triggered when a new demand arrives. The MSIMP–PTR model improves the quality of the multi-satellite imaging mission planning scheme by introducing a periodic trigger rolling window into the multi-satellite scheduling problem in response to the common problem of uncertainty in the arrival time of dynamic missions and updating the target mission importance according to a fixed period.
In this experiment, 45 target tasks were randomly generated, and at the initial moment the task sets carrying the target task importance degree were generated using the three types of models, MSDMPUP–TRW, MSIMP–ET, and MSIMP–PTR, where the MSDMPUP–TRW model sets the trigger threshold to 2 and the MSIMP–PTR model is given a fixed trigger period. The resulting target task importance degrees are shown in Table 2, where "Target" indicates the target mission number and the three importance columns denote the target task importance degree generated by the MSDMPUP–TRW, MSIMP–ET, and MSIMP–PTR models, respectively.
During the planning period of the experiment, the satellite control center monitored task changes at different moments. The target task importance degree of the target tasks at different moments is shown in
Figure 7, where “*” indicates the target task importance degree corresponding to each target task and connects the target tasks “*” with straight lines of different colors.
Figure 7a shows the target task importance degrees of the 45 target tasks generated at the initial moment by the MSDMPUP–TRW, MSIMP–ET, and MSIMP–PTR models; it can be seen from the figure that the same target task receives a slightly different importance degree under the three types of models, but the gap between the importance degrees is small.
Figure 7b–d show the changes in the importance of the target tasks for the three types of models, MSDMPUP–TRW, MSIMP–ET, and MSIMP–PTR, during the planning time, and the red boxes indicate the importance of the target tasks that changed more significantly after the dynamic update during the planning cycle.
As can be seen from
Figure 7, all three types of models are capable of generating dynamically updated target task importance degrees according to the user's task requirements. The MSIMP–ET model is more sensitive to the external environment: by completing a rolling update of the target task importance for each triggered task, it handles emergency tasks in a timely manner, but it leads to frequent scheduling when many triggering events occur. The MSIMP–PTR model performs periodic rolling at a set time interval to dynamically update the target task importance, but its planning cycle is long and cannot sufficiently satisfy the high timeliness required by emergency tasks. The MSDMPUP–TRW model combines the periodic triggering and event triggering update methods and sets a trigger threshold for the adaptive processing of multiple emergency tasks, which avoids frequent triggering while retaining the capability to process emergency tasks in a timely manner.
The simulation results of Experiment 1 show that the MSDMPUP–TRW model proposed in this work can respond quickly to environmental changes and avoid frequent calculation and updating of the planning scheme to achieve timely updating of the importance of the target mission. This, in turn, affects the observation order of the satellite performing the target mission, enabling the satellite resources to be deployed in a timely manner according to the mission requirements and improving the planning efficiency.
5.3. Experiment 2: Effectiveness and Stability of HPIM–KTSMOEA under Small-Scale MSDMPUP
To verify the effectiveness of the HPIM–KTSMOEA algorithm in solving small-scale MSDMPUP, 45 target tasks were randomly generated in the local range of latitude [3, 54] and longitude [73, 135]. It is assumed that at time T the importance of the target tasks located in the sea area (T1, T3, T4, T9, T17, T20, T24, T27, T29, T31, T35, T37, T40, and T42) is dynamically updated using the MSDMPUP–TRW model, and a task set carrying the target task importance is generated. The geographical locations of the 45 target tasks and their corresponding target task importance values are shown in
Figure 8. The geographic distribution of target tasks is shown in
Figure 8a, where red circles represent target tasks. The target task importance degree at the initial time and time T is shown in
Figure 8b. The red box indicates the importance of target tasks that changed significantly after the dynamic update process.
The experiment was performed as follows. The case in which the satellite controllers send no preference instructions is set as Instance1. The HPIM–KTSMOEA algorithm solves Instance1 to obtain the initial optimal assignment task planning scheme of MSDMPUP, as shown in
Table 3, where "T" indicates the target task number, "S" indicates the satellite number, "St" and "Et" indicate the start time (hour:minute:second) and end time (hour:minute:second) of satellite observation, respectively, and "--" indicates that the task assignment time window does not meet the target task assignment constraints and, thus, no imaging task is arranged. As can be seen from
Table 3, all 45 target tasks are effectively assigned. Therefore, the HPIM–KTSMOEA algorithm achieves better task planning results.
Thereafter, to verify the effectiveness of the hybrid preference interaction mechanism on the basis of the same target task set generated in the same scene, the HPIM–KTSMOEA algorithm was used to solve eight groups of instances according to different user preference requirements. The instruction input for each instance and the optimal simulation results of the HPIM–KTSMOEA algorithm solving different instances are shown in
Table 4.
The following evaluation indexes were used to analyze the results of this experiment (a computational sketch follows the list):
Ideal task benefit: the sum of the target task importance degrees in the ideal case where all target tasks are observed.
Actual task benefit: the sum of the target task importance degrees of the actually observed targets.
Average task response time: the average task response time of the actually observed targets.
Task completion rate: the proportion of the number of actually observed target tasks to the total number of tasks.
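A direct computation of these four indexes is sketched below; the array names and the boolean observation mask are assumptions for illustration.

```python
import numpy as np

def evaluation_indexes(importance, observed, response_time):
    """importance: target task importance degrees, shape (tasks,);
    observed: boolean mask of actually observed targets, shape (tasks,);
    response_time: per-task response times, shape (tasks,)."""
    ideal_benefit = float(importance.sum())                    # all targets observed
    actual_benefit = float(importance[observed].sum())         # observed targets only
    avg_response = float(response_time[observed].mean())       # average task response time
    completion_rate = float(observed.mean())                   # fraction of tasks observed
    return ideal_benefit, actual_benefit, avg_response, completion_rate
```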
The performance analysis of the HPIM–KTSMOEA algorithm for solving different instances is shown in
Figure 9.
Figure 9 analyzes the performance of the different instances using the ideal task benefit, actual task benefit, task response time, and task completion rate, which is significant for revealing the impact of different user preference instructions on task planning performance. In particular, Instance1 represents the no user preference input instance in the ideal case; Instance2, Instance3, and Instance4 represent the single-preference cases with the task benefit priority instruction, task timeliness priority instruction, and task completion priority instruction as input, respectively; Instance5, Instance6, and Instance7 denote the double-preference cases with task benefit and task timeliness priority instructions, task benefit and task completion priority instructions, and task timeliness and task completion priority instructions as input, respectively; and Instance8 denotes the multiple-instruction input case.
Figure 9a–c represent the convergence curves of the HPIM–KTSMOEA algorithm under the three dimensions of the task benefit objective function, task timeliness objective function, and task completion objective function, respectively, and
Figure 9d represents the average values of the HPIM–KTSMOEA algorithm under the different evaluation indexes.
In
Figure 9a, the ranking of the instances under the target task benefit evaluation index is as follows: Instance2 > Instance6 > Instance5 > Instance8 > Instance4 > Instance3 > Instance7 > Instance1. This index reflects the performance of the HPIM–KTSMOEA algorithm in terms of benefit, and comparing the target task benefits of the different instances assesses the impact of the user's task benefit priority instruction on the system benefit.
In
Figure 9b, the response time of each instance under the task timeliness evaluation index is as follows: Instance3 < Instance7 < Instance5 < Instance8 < Instance2 < Instance4 < Instance6 < Instance1. This metric reflects the performance of the HPIM–KTSMOEA algorithm in terms of timeliness, by comparing the average response time under different instances, and by evaluating how different settings of the user’s timeliness requirements affect the efficiency of task planning. Shorter response time means more efficient task planning.
From
Figure 9c, it can be seen that the ranking of the instances under the task completion evaluation index is as follows: Instance4 > Instance6 > Instance7 > Instance8 > Instance2 > Instance5 > Instance3 > Instance1. By comparing task completion under different instances, we can understand how different user requirements for task completion affect the execution of task planning. Higher task completion means that task planning better satisfies users' needs.
Combined with the experimental results in
Table 4 and
Figure 9, Experiment 2 shows that the HPIM–KTSMOEA algorithm proposed in this work is able to solve the MSDMPUP model effectively in small-scale target task scenarios, and the hybrid preference interaction mechanism allows satellite controllers to adjust the preferences of the mission planning scheme according to the actual needs to improve the adaptability of the multi-satellite dynamic mission planning scheme.
5.4. Experiment 3: Effectiveness and Stability of HPIM–KTSMOEA under Large-Scale MSDMPUP
To verify the effectiveness of the HPIM–KTSMOEA algorithm for solving MSDMPUP on a large scale, Experiment 3 randomly generates 100 target tasks to simulate user imaging task requirements on a global scale, within latitude [−90, 90] and longitude [−180, 180]; the geographical distribution of the 100 target tasks on a global scale is shown in
Figure 10.
It is assumed that, at time T, the target tasks with changed observation demands on the global scale include T10, T23, T31, T39, T49, T57, T68, T79, and T94. The target task importance degrees at the initial moment and at time T generated using the MSDMPUP–TRW model are shown in
Figure 11a. To more directly demonstrate the allocation relationship between the satellite and the target mission in the case of the no-satellite controller preference command sending, the HPIM–KTSMOEA algorithm is used to solve the visualization of the initial multi-satellite dynamic mission planning scheme in this scenario. As shown in
Figure 11b, the x-axis denotes the target mission number, the y-axis denotes the target mission corresponding to the target mission importance, and the z-axis denotes the executing satellite number. To show more clearly the target missions performed by different satellites, different colored squares are used to represent the target missions performed by different satellites.
The performance analysis of the HPIM–KTSMOEA algorithm for solving different instances of MSDMPUP at a large scale is shown in
Table 5 and
Figure 11.
The evaluation metrics in Table 5 include the ideal target task benefit, the actual task benefit, the average task response time, and the task completion rate.
As can be seen from the data in
Table 5, the actual task benefits in Instance2, Instance5, Instance6, and Instance8 are more similar to the ideal benefits of the target task. Moreover, the average task response time in Instance3, Instance5, Instance7, and Instance8 is shorter than that in other instances, and the task completion rate is greater than 90% in Instance4, Instance6, Instance7, and Instance8.
From the multiple convergence curves in
Figure 12, it can be seen that the HPIM–KTSMOEA algorithm for solving the MSDMPUP model finds as many allocation schemes as possible at the beginning of the iteration, and as the number of iterations increases, it quickly converges to the optimal scheme among different instances. This result proves the effectiveness of the knowledge transfer strategy of the HPIM–KTSMOEA algorithm, which is able to invoke similar knowledge in the historical environment to facilitate the fast convergence of the algorithm.
To further validate the scalability of the HPIM–KTSMOEA algorithm for large-scale MSDMPUP, we randomly generate 150, 200, 250, and 300 target tasks (a step size of 50) to simulate user imaging requirements over the same latitude and longitude range.
The performance analysis of the HPIM–KTSMOEA algorithm for solving different instances of MSDMPUP at a large scale is shown in
Table 6 and
Figure 13. The solving instances include no preference instruction input, single instruction input, dual instruction input, and multiple instruction input, with a total of four types and eight sets of instances, as shown in the column of instruction input in
Table 6. Meanwhile, the evaluation indexes in Table 6 include the target task ideal benefit, the actual task benefit, the task response time, and the task completion rate.
The performance analysis of the HPIM–KTSMOEA algorithm for solving MSDMPUP at four scales of 150, 200, 250, and 300 target tasks is shown in
Figure 13.
Figure 13a demonstrates the target task benefits that the HPIM–KTSMOEA algorithm can obtain under different target task sizes, reflecting that the algorithm has a good ability to optimize resource allocation and task scheduling.
Figure 13b demonstrates the time response of the HPIM–KTSMOEA algorithm under different target task sizes, and the algorithm is always able to complete the target task allocation with a low response time.
Figure 13c demonstrates that the instances of the HPIM–KTSMOEA algorithm under different target task sizes all have high task completion.
Experiment 3 shows that the HPIM–KTSMOEA algorithm proposed in this work has significant advantages when dealing with large-scale target task scenarios: it can not only effectively solve the MSDMPUP model but also shows excellent effectiveness, stability, and scalability for target tasks at all scales. At the same time, with the proposed method, satellite controllers can send preference commands to the system to steer the optimization search and adjust the preferences of the MSDMPUP scheme, and the HPIM–KTSMOEA algorithm is able to find the optimal scheme within a limited number of iterations with a high degree of task completion in all instances. In addition, for an individual preference requirement, sending a single instruction adjusts the task planning significantly more effectively than sending dual or multiple instructions.
5.5. Experiment 4: Performance Analysis of the HPIM–KTSMOEA Algorithm
This section compares the proposed HPIM–KTSMOEA algorithm with the knowledge transfer multi-objective evolutionary algorithm (KT–DMOEA [37]), the bidirectional knowledge transfer MOEA (BKT–MOEA [38]), the hybrid knowledge transfer MOEA (HKT–MOEA [39]), the bi-objective knowledge transfer MOEA (BOKT–MOEA [40]), and the helper-objective-based multifactorial evolutionary algorithm (HO–MEA [41]); in total, five algorithms are compared against it in terms of performance.
The KT–MOEA algorithm uses a weighted approach, combining knowledge of inflection points during the dynamic evolution process, to guide the generation of the new population in a changed environment, which enables effective tracking of environmental changes in the multi-satellite imaging mission planning system. The BKT–MOEA algorithm designs a bidirectional knowledge transfer strategy that can identify preferred target tasks during the knowledge-guided search, making the task planning scheme closer to the actual requirements. The HKT–MOEA algorithm evaluates task relevance based on the distributional characteristics of the evolving population and adaptively performs knowledge transfer, allowing it to retrieve more globally optimal task planning solutions. The BOKT–MOEA algorithm uses two different measures to accurately quantify the similarity between different types of tasks for knowledge transfer, improving the solution efficiency of the task planning system. The HO–MEA algorithm uses a multi-task optimization framework to transfer knowledge between the auxiliary tasks and the original problem, improving the accuracy of the dynamic mission planning scheme for multi-satellite imaging.
Eight sets of instance simulation experiments were carried out under the 100-target-task scenario, in which the interactive operations of the instances are consistent with those described in Experiments 2 and 3, and all instances use the same number of iterations and population size. The experimental results of the performance comparison between the HPIM–KTSMOEA algorithm and the other algorithms are shown in Table 7 and Figure 14.
To comprehensively evaluate the effectiveness of the HPIM–KTSMOEA algorithm on the MSDMPUP problem, this section uses the actual task benefit, the average task response time, and the task completion rate as the evaluation indexes.
Table 7 and Figure 14 show that the HPIM–KTSMOEA algorithm has better stability and clear advantages over the other algorithms in terms of the actual task benefit, the average task response time, and the task completion rate when solving the eight sets of instances under MSDMPUP. This demonstrates that the HPIM–KTSMOEA algorithm is able to effectively handle the dynamic mission planning problem of multi-satellite imaging under different user requirement inputs.
We further analyzed the results of the HPIM–KTSMOEA, KT–MOEA, BKT–MOEA, HKT–MOEA, BOKT–MOEA, and HO–MEA algorithms on Instance8 using box plots.
Figure 15a–c present the box plots of the different algorithms under the three dimensions of the task benefit objective function, the task timeliness objective function, and the task completion objective function, respectively. The top and bottom of each box represent the upper and lower quartiles, the upper and lower whisker lines represent the maximum and minimum values, the middle line segment represents the average, and the black plus signs mark the outliers.
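For readers reproducing this style of figure, a minimal matplotlib sketch with random placeholder data (not the experimental results) is shown below; showmeans/meanline draw the mean as the middle line and sym="+" marks outliers with plus signs, matching the plotting convention described above.

```python
import numpy as np
import matplotlib.pyplot as plt

algorithms = ["HPIM-KTSMOEA", "KT-MOEA", "BKT-MOEA", "HKT-MOEA", "BOKT-MOEA", "HO-MEA"]
rng = np.random.default_rng(0)
# Placeholder task-benefit samples per algorithm (random data, NOT the experimental results).
benefit_samples = [rng.normal(0.80 - 0.04 * i, 0.03, size=30) for i in range(len(algorithms))]

fig, ax = plt.subplots(figsize=(8, 4))
ax.boxplot(benefit_samples,
           sym="+",                        # mark outliers with plus signs
           showmeans=True, meanline=True)  # draw the mean as a line inside the box
ax.set_xticklabels(algorithms, rotation=20)
ax.set_ylabel("Task benefit objective")
ax.set_title("Instance8: task benefit per algorithm (illustrative data)")
plt.tight_layout()
plt.show()
```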
Figure 15 shows that the HPIM–KTSMOEA algorithm outperforms the other algorithms in the three dimensions of the task benefit objective function, the task timeliness objective function, and the task completion objective function, with a wider range of converged data and fewer outliers, indicating that the introduction of the knowledge transfer strategy makes it possible to quickly find more high-quality task planning solutions in the search space. The KT–MOEA and BOKT–MOEA algorithms show a narrower range of converged data and faster convergence, which indicates that these algorithms may fall into local optima. The medians of the task benefits and response times of the BKT–MOEA and HO–MEA algorithms are off-center and skewed, indicating that their solutions are not evenly distributed in the optimal search space. The HKT–MOEA algorithm has small task benefit values, long response times, and poor task completion, which indicates that it cannot sufficiently adapt to dynamically changing task requirements.
The simulation results of Experiment 4 show that the HPIM–KTSMOEA algorithm proposed in this study, by introducing the knowledge transfer strategy, can effectively cope with stochastic changes in the task planning environment under multiple user demands, effectively trades off solution performance against computational efficiency, and finds the optimal solution in a shorter time. Moreover, the HPIM–KTSMOEA algorithm outperforms the other five algorithms in terms of optimization objective fitness value and optimization efficiency. Therefore, the HPIM–KTSMOEA algorithm is the superior choice for solving the MSDMPUP model.
6. Conclusions
In this work, we proposed the hybrid preference interaction mechanism and knowledge transfer strategy for the multi-objective evolutionary algorithm (HPIM–KTSMOEA). Firstly, an MSDMPUP model based on a task rolling window was proposed to achieve timely updating of target task importance by responding quickly to environmental changes through the simultaneous application of periodic triggering and event triggering. Secondly, the hybrid preference interaction mechanism was proposed, constructing a mapping strategy that maps the MSDMPUP allocation sequence to a sub-objective evaluation matrix so that the command inputs from the satellite controllers guide the generation of the new population and enable preference-based adjustment of the mission planning scheme. Finally, a knowledge transfer strategy for the multi-objective evolutionary algorithm was proposed to transfer knowledge according to the difference between the current and historical environments and to invoke similar knowledge from the historical environment to promote the fast convergence of the algorithm. The simulation results show that this method can quickly solve the MSDMPUP problem under different task requirements, with high task benefit, short response time, and a high task completion rate.
The HPIM–KTSMOEA method proposed in this paper satisfies different user preference requirements through a hybrid preference mechanism; however, the hybrid preference interaction mechanism may lack the dynamic adaptability needed to cope with rapidly changing environments in real application scenarios. To overcome this limitation, future work will combine advanced prediction techniques and reinforcement learning to develop an intelligent interactive collaboration-based multi-satellite imaging dynamic mission planning system, which uses deep learning prediction models to forecast future environmental changes and enables the algorithm to quickly adjust the preference mechanism through intelligent interactive collaboration, achieving more efficient and intelligent multi-satellite imaging dynamic mission planning. Investigating such a system is therefore a significant direction for future research.