Review

Dealing with Multiple Optimization Objectives for UAV Path Planning in Hostile Environments: A Literature Review

1 Department of Cognitive Science and AI, Tilburg School of Humanities and Digital Sciences, Tilburg University, P.O. Box 90153, Tilburg, The Netherlands
2 Data Science Centre of Excellence, Faculty of Military Sciences, Netherlands Defence Academy, P.O. Box 20701, The Hague, The Netherlands
* Author to whom correspondence should be addressed.
Drones 2024, 8(12), 769; https://doi.org/10.3390/drones8120769
Submission received: 21 November 2024 / Revised: 13 December 2024 / Accepted: 16 December 2024 / Published: 19 December 2024

Abstract

As Unmanned Aerial Vehicles (UAVs) are becoming crucial in modern warfare, research on autonomous path planning is becoming increasingly important. The conflicting nature of the optimization objectives characterizes path planning as a multi-objective optimization problem. Current research has predominantly focused on developing new optimization algorithms. Although being able to find the mathematical optimum is important, one also needs to ensure this optimum aligns with the decision-maker's (DM's) most preferred solution (MPS). In particular, to align these, one needs to handle the DM's preferences on the relative importance of each optimization objective. This paper provides a comprehensive overview of all preference handling techniques employed in the military UAV path planning literature over the last two decades. It shows that most of the literature handles preferences by the overly simplistic method of scalarization via weighted sum. Additionally, the current literature neglects to evaluate the performance (e.g., cognitive validity and modeling accuracy) of the chosen preference handling technique. To aid future researchers in handling preferences, we discuss each employed preference handling technique in detail, along with its implications, advantages, and disadvantages. Finally, we identify several directions for future research, mainly related to aligning the mathematical optimum with the MPS.

1. Introduction

The first documented military use of Unmanned Aerial Vehicles (UAVs) occurred in 1849, when the Austrian forces deployed around 200 hot-air balloons to bomb Venice [1]. Half a century later, in the Great War, the first pilotless aircraft was developed [2]. Although the balloons had little effect, and the aircraft were never used operationally, both showed a glimpse of the future of warfare. It was in the Vietnam War that reconnaissance UAVs were deployed on a large scale for the first time [2], flying over 3000 missions [3]. Later, during the battle of the Beka'a valley in 1982, decoy UAVs helped the Israeli air force win a decisive victory, destroying 17 out of 19 Syrian SA-6 air defence systems in ten minutes [3,4]. In response, the Syrians sent out a large number of MiG jet fighters. However, with the help of a reconnaissance drone providing live video imagery of the jet fighters taking off, Israel was perfectly prepared for the attack. At the end of the war, Syria had lost 85 MiGs, whilst failing to destroy a single Israeli aircraft. As a result of the massive success of the operation, the Israeli Air Force increased its UAV force substantially [3]. Nowadays, UAVs have become such a crucial asset on the battlefield that the Russia-Ukraine war has been named the first drone war. Small, inexpensive, radio-controlled drones are being used on a wide scale to destroy millions of dollars of tanks and artillery systems [5]. One major threat against these radio-controlled drones is the use of Electronic Warfare (EW) systems to jam the drones' radio frequencies or to trace the signal back to the drone operator. In fact, Russia's EW systems are causing Ukraine to lose approximately 10,000 drones per month [6]. In response, Ukraine and Russia are investing heavily in Artificial Intelligence-guided drones [5], for they have no radio frequency to jam nor an operator to trace the signal back to. Although the required technology still needs to be developed further, many believe that autonomous drones will be at the heart of future drone warfare [5,7].
One key feature for UAVs to operate without direct human control is autonomous navigation and flight. It involves the formulation of a global pre-mission flight path as well as local optimization based on real-time data during flight. In military applications, missions are meticulously planned, making pre-mission flight path planning especially crucial. Therefore, the scope of this paper is pre-mission path planning, and for the remainder of the paper, the term "path planning" specifically refers to pre-mission path planning. Path planning can be defined as finding the optimal feasible route from point A to point B (possibly involving more points). A feasible route is one that meets the constraints posed by the UAV's flight dynamics. Besides feasibility, it is important for a route to achieve certain goals. For example, it needs to be short, have minimal fuel consumption, and, in a military context, be safe from hostile actors. Mathematically, these goals are quantified as optimization objectives, and they can be conflicting. Conflicting objectives cannot be optimized simultaneously, and thus trade-offs need to be made. For instance, a possible detour to evade a radar system will decrease the probability of detection but increase route length. Similarly, flying fast will reduce the time in the air but increase fuel consumption. Although the scope of the body of literature examined is military UAV path planning, civilian UAV path planning also has conflicting objectives (e.g., speed and fuel efficiency). The same methods of handling multiple objectives (i.e., preference handling) used in military applications are used in non-military applications. Therefore, the discussion below, especially the detailed discussion of the different preference handling methods, is also relevant to non-military UAV path planning applications. The presence of these conflicting optimization objectives characterizes the UAV route planning problem as a multi-objective optimization (MOO) problem, described by Equation (1). For illustrative purposes, we have avoided unnecessary mathematical rigor and described the problem loosely.
$$ \underset{x}{\text{``minimize''}} \; \left( f_1(x), \ldots, f_k(x) \right) \quad \text{subject to} \quad x \in X \tag{1} $$
The MOO problem loosely described by Equation (1) consists of five components. For each decision variable (1) $x$, a set of $k$ objectives (2), $(f_1(x), \ldots, f_k(x))$, is computed. In the context of route planning, the decision variable $x$ represents a route, which is a time-dependent function $d(t)$ over 3-dimensional spatial coordinates. From this function, parameters such as velocity, acceleration, and attitude can be derived. Even though $d(t)$ is continuous, many path planning methods work on a discretized version of the problem. The $k$ objective functions measure the performance of each route based on a certain metric. For instance, in the simplified example visualized in Figure 1, the objectives ($k = 2$) of minimizing route length and danger are included. Moreover, each solution (i.e., route) must satisfy the flight constraints (3) imposed by the UAV's flight dynamics to be included in the feasible set $X$. Based on these ingredients, an optimization algorithm (4) searches for the optimal route. However, in multi-objective optimization, optimality is less straightforward than in single-objective optimization (SOO), which is why the quotation marks around "minimize" are included. Rather than dealing with scalar objective values, multi-dimensional objective vectors are considered, meaning the minimum is not simply the lowest objective value. Consequently, a different notion of optimality is needed. Pareto optimality defines a solution $x$ as optimal when none of its objectives $f_i(x)$ can be improved without deteriorating the performance of another objective $f_j(x)$ [8]. If such an improved route $x'$ were possible, route $x'$ would be said to dominate route $x$. Of course, no rational decision-maker (DM) would ever prefer a dominated solution over the solution that dominates it. This is exactly what Pareto optimality captures; all non-dominated solutions are Pareto optimal. This (possibly infinite) set of optimal solutions is called the Pareto Front. Mathematically speaking, all routes on the Pareto Front are equally good. However, a DM will prefer some routes over others. For example, in the simplified Figure 1a, two optimal 2-dimensional routes are visualized: the short but dangerous route $x_1$, and the long but safe route $x_2$. As they are both on the Pareto Front (see Figure 1b), neither route $x_1$ nor route $x_2$ is mathematically better than the other. Yet, depending on the mission scenario, a DM might prefer $x_1$ over $x_2$ or vice versa. In particular, the relative importance of each objective can vary per mission. For instance, a high-risk, time-critical rescue mission will accept higher danger levels than a simple transport mission. Moreover, losing a small quadrotor UAV is less important than losing a large, expensive fixed-wing UAV. Essentially, the DM's preferences determine which solution on the Pareto front is the DM's most preferred solution (MPS). Preference handling (5) integrates the DM's subjective preferences into the optimization model to align the mathematical optimum with the DM's subjective optimum (i.e., the MPS). This is crucial, as the goal of the path planning problem is not so much to arrive at a mathematical optimum as to help the DMs arrive at their MPS. Commonly, a scalar composite objective function $F(f_1(x), \ldots, f_k(x)) = h(f_1(x), \ldots, f_k(x))$ is defined that aggregates the individual objectives. The specific aggregation function $h$ differs per method (e.g., product in Section 4.1 and weighted sum in Section 4.2.1).
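To make the dominance relation and the Pareto Front concrete, the minimal sketch below (our own Python illustration with made-up objective values; the function names are not from the cited literature) filters a discrete set of candidate routes down to its non-dominated subset, assuming all objectives are to be minimized.

```python
import numpy as np

def dominates(a: np.ndarray, b: np.ndarray) -> bool:
    """Objective vector `a` dominates `b` (all objectives minimized):
    a is no worse in every objective and strictly better in at least one."""
    return bool(np.all(a <= b) and np.any(a < b))

def pareto_front(objs: np.ndarray) -> np.ndarray:
    """Return the indices of the non-dominated (Pareto-optimal) candidates."""
    n = len(objs)
    return np.array([i for i in range(n)
                     if not any(dominates(objs[j], objs[i])
                                for j in range(n) if j != i)])

# Toy example with objectives (route_length, danger), both to be minimized.
routes = np.array([[10.0, 0.9],    # x1: short but dangerous
                   [25.0, 0.2],    # x2: long but safe
                   [26.0, 0.3]])   # dominated by x2
print(pareto_front(routes))        # -> [0 1]
```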
Many other methods exist, each with their own advantages and disadvantages. So far, we have only discussed the path planning of a single UAV. For cooperative applications (i.e., multiple UAVs; for a review, see Fevgas et al. [9]), the multi-objective nature of the problem is exacerbated. Not only does each UAV have its own conflicting optimization objectives, but the optimal paths of all UAVs together are likely to be conflicting as well. Therefore, MOO is even more important in the case of cooperative path planning. For the remainder of this paper, we do not differentiate between the two. However, this would be an interesting direction for future research.
Over the past decades, UAV path planning has been an active research field. Many papers have proposed novel approaches to model one (or multiple) of the five aforementioned components. So far, the literature has predominantly focused on developing new optimization algorithms. This trend is also evident in existing review papers. Indeed, as shown in Table 1, nearly every survey has extensively covered the optimization algorithms (component (4)) proposed in the literature. Component (1), how to model routes, has been covered by Aggarwal & Kumar [10], Jones et al. [11], and Allaire et al. [12]. They proposed a classification based on Cell Decomposition approaches, Roadmap approaches, and Artificial Potential Field approaches. As most of the approaches make a discrete approximation of the continuous 3D space by a grid or a graph, the planned route has a lot of sharp turns. Therefore, to ensure the route satisfies the flight constraints imposed by the UAV, the route needs to be smoothed. An overview of trajectory smoothing techniques is provided by Ravankar et al. [13]. Allaire et al. [12] cover the modeling of the optimization objectives (2) and constraints (3). They identified that most of the current literature oversimplifies the dynamical constraints. Additionally, they gave an extensive overview of the optimization objectives used in 60 papers [14,15]. Preference handling (component (5)), on the other hand, has largely been neglected. It is only briefly touched upon by Song et al. [14], who mention the weighted sum method as a potential modeling choice without going into detail. To address this gap, this literature review discusses the preference handling techniques employed in the literature. As mentioned before, preference handling aims to align the DM's subjective optimum with the mathematical optimum of the optimization model. Clearly, even if the optimization algorithm is perfect, it will never reach the MPS if preference handling is done incorrectly. Therefore, focusing on preference handling is of paramount importance. This review provides a comprehensive overview of preference handling in the military UAV path planning literature over the last twenty years. Both the methods and the implications arising from choosing these methods are discussed. By doing so, we aim to answer the following two research questions (RQs):
  • RQ1: What preference handling techniques have been used in the military UAV path planning literature?
  • RQ2: For each of the identified methods: What are the advantages and disadvantages of using these methods?
By answering RQ1, we aim to provide an overview of the preference handling methods used in the military UAV route planning literature. This overview provides insight into the state of the art, thereby identifying gaps in the research literature. In particular, we aim to identify the extent to which DMs have been included in the literature. By answering RQ2, we aim to provide future researchers with an overview of the advantages and disadvantages of the employed preference handling methods. This critical discussion intends to provide future researchers with the necessary tools to determine which preference handling method to use. In particular, this aims to help researchers ensure that their problem modeling aligns the mathematical optimum with the DM's MPS. As the same preference handling techniques are used in non-military applications, this discussion is relevant to research on non-military UAV path planning as well.
The remainder of this paper is organized as follows. Section 2 describes the research methodology. In particular, the process of selecting the papers included in this review is described. In Section 3, to answer RQ1, we provide a comprehensive overview of the different preference handling techniques employed in the literature. Furthermore, we investigate the extent to which DMs have been included in the performance evaluation of the proposed methods. Next, in Section 4, to answer RQ2, a critical discussion of the advantages and disadvantages of each preference handling method employed in the literature is provided. Finally, Section 6 concludes the paper and identifies directions for future research.

2. Methodology

This scoping review is largely based on the guidelines for PRISMA Scoping Reviews by Tricco et al. [17]. The review's scope is military UAV path planning. In particular, a military environment is characterized by the presence of hostile actors such as enemy radars and anti-air defences. These actors form danger zones in the environment. As these radars and anti-air defences are set up in such a way as to best detect or destroy the UAV, completely avoiding these danger zones is usually not possible. Even if it were possible, some missions have crucial time requirements, and therefore a decrease in flight time at the expense of an increase in danger can be warranted. To identify potentially relevant papers, the databases of the Netherlands Defence Academy (NLDA), the American Naval Postgraduate School (NPS), and the American Institute of Aeronautics and Astronautics (AIAA) were searched for the period 2003–2024. The NLDA and NPS libraries consist of a collection of multiple databases. For example, the NLDA library provides access to IEEE, ProQuest, Springer, and many more. The NPS library provides access to over 300 databases. Finally, AIAA is the world's largest aerospace technical society, with over 200,000 technical articles, including both military and civilian applications. The focus on hostile environments corresponds well to the military setting covered by the NPS and NLDA. Their extensive collection should cover most of the literature relevant to the review. The remaining gaps will be covered by the literature from AIAA, which includes a large set of technical aviation papers.
To search these databases, the keywords need to cover the route planning aspect as well as the hostile environment aspect. To include the route planning aspect, the following keywords are used: "Route planning" or "path planning". For the hostile environment, we identified that the literature predominantly includes radars or anti-air defence systems as hostile actors. Therefore, we include the combination of threat or hostile and narrow it to a military setting by adding the keywords missile or radar. Note that we have chosen not to restrict our keyword search to UAV path planning per se. As path planning for manned systems shares a lot of similarities with path planning for unmanned systems, we have chosen to include both. However, even though we did not restrict ourselves to UAV path planning, no papers focusing on pre-mission path planning for manned systems were retrieved. The resulting keyword search, which was last performed on 13 August 2024, is:
  • (“Route planning” OR “Path planning”) AND (hostile OR threat) AND (radar OR missile)
The keyword search resulted in a set of 848 papers, out of which 187 were duplicates. To filter this large set of papers to a representative and relevant set that can answer the proposed research questions, the procedure visualized in Figure 2 was performed. First, duplicates were removed. Secondly, for the remaining 661 papers, relevancy was judged based on the title and abstract. In this first check, 408 articles were removed, mainly because the papers were either not about flight path planning or had no military application. Subsequently, of the 253 reports identified, 251 papers were retrieved. For these papers, the relevance was judged based on a full-text read, and the eligibility criteria in Table 2 were used. The criterion of academic relevance was included to restrict the set of papers to a set representative of the general academic consensus. The general idea is that papers that are rarely cited have not made an impact on the route planning research community. Note that by requiring papers to be cited at least once a year, new papers with few citations are still included in the paper selection. Next, papers that completely avoided threats or ignored them altogether were removed, because in military missions it is rarely possible to completely avoid threats. Completely ignoring threats, on the other hand, simplifies the problem in such a way that it is irrelevant to actual practical applications. Similarly, by solely minimizing the distance or threat, the multi-objective nature of the problem is neglected, making such papers irrelevant to this literature review. Papers that do not focus on flight route planning clearly fall outside the scope of this review. Finally, papers where no details on the methods were reported or which only made minor changes to existing research were removed. The final set of papers included in this review consists of 76 studies that form a representative subset of military UAV route planning papers.
For each of these papers, we categorized the preference handling technique into four different classes based on the point at which the DM’s preferences were included. Furthermore, we extracted more detailed information on the preference handling technique. More particularly, the specific method used was documented for each paper. Based on this information, a general overview of the state of the art of preference handling in the military UAV path planning community was provided to answer RQ1. Additionally, to identify whether papers evaluated the practical usability of their methods, we extracted how the method was evaluated (e.g., was a subject matter expert used as DM). To answer RQ2, we explain each method in further detail. In particular, we describe the advantages and disadvantages of the modeling choices identified in the preference handling stage.

3. Preference Handling in the UAV Route Planning Literature

Based on the point in the optimization process at which the preferences of the DM are included, the multi-objective optimization (MOO) literature [19] distinguishes four categories of methods for handling conflicting objectives (see Figure 3). The preferences of the DM determine which routes from the Pareto front are preferred over others. The preferences can be integrated either not at all (no-preference methods), before (a-priori), after (a-posterior), or progressively during the optimization process (interactive). For the military UAV route planning problem, the DM can be the drone operator, an intel officer, or anyone else who plans the route.
  • No-preference Preference Handling
    Firstly, in no-preference methods, the DM is not involved in the process. Instead, a solution which is a neutral compromise between all objectives is identified. This method is commonly used when no DM is present and it usually constructs a scalar composite objective function $F(f_1(x), \ldots, f_k(x))$.
  • a-priori Preference Handling
    Secondly, in a-priori methods, a DM specifies their preference information before the optimization process is started. Most commonly, the preference information is used to construct a scalar composite objective function $F(f_1(x), \ldots, f_k(x))$. This has the advantage that the problem is simplified into an SOO problem. As a result, existing SOO algorithms can be used. The main limitation is that the DM has limited knowledge of the problem beforehand, which can result in inaccurate and misleading preferences [20,21]. Intuitively, asking the DM to specify their preferences before the optimization corresponds to asking them to select a point on the Pareto Front (e.g., by specifying objective weights) before knowing what the actual Pareto Front and its corresponding tradeoffs look like. Consequently, a set of preferences specified a-priori might lead to solution x, whereas the DM might have preferred y when presented with both options.
  • a-posterior Preference Handling
    Alternatively, a-posterior methods can be used. These methods include the DM's preferences after the optimization process. First, they approximate the Pareto Front and then allow the DM to choose a solution from said front. The main advantage is that the DM can express their preferences on the basis of the information provided by the Pareto Front. The DM has a better idea of the tradeoffs present in the problem and can thus make a better-informed choice. Furthermore, it can provide the DM with several alternatives rather than a single solution. The disadvantage of this class of methods is the large computational cost of generating an entire set of Pareto optimal solutions [22]. Additionally, when the Pareto Front is high-dimensional, the large amount of information can cognitively overload the DM and complicate their decision [20].
  • Interactive Preference Handling
    Finally, interactive preference elicitation attempts to overcome the limitations of the a-priori and a-posterior classes by actively involving the DM in the optimization process. As opposed to the sequential views of a-priori and a-posterior methods, the optimization process is an iterative cycle. Optimization is alternated with preference elicitation to iteratively update a preference model of the DM and to quickly identify regions of the Pareto Front that are preferred. Undesired parts of the Pareto Front need not be generated, which saves computational resources. Instead, these resources can be spent on better approximating the preferred regions, improving their approximation quality [23]. Lastly, being embedded in the optimization process helps the DM build trust in the model and a better understanding of the problem, the tradeoffs involved, and the feasibility of their preferences [24,25,26]. One challenge of interactive methods is that the interaction needs to be explicit enough to build an expressive and informative model of the DM's preferences whilst remaining simple enough not to cognitively overload the DM.
Within the UAV route planning literature, there is a large focus on a-priori preference handling approaches. Indeed, Figure 4 (for the raw data, see Table A1) shows that, despite its weaknesses, over 90% of the approaches use a-priori preference handling techniques. In fact, only two papers used a no-preference method, four handled preferences after optimization, and only one paper used interactive preference handling. The remaining 69 papers all used a-priori preference handling. One likely cause for this predominant focus on a-priori preference handling is that these methods simplify the problem to an SOO problem, thereby allowing the authors to use SOO techniques. Another cause might be that authors are not familiar with multi-objective optimization. This is substantiated by the fact that 70% of the papers use the simple and intuitive weighted sum method (see Section 4.2.1), often the first method that comes to mind when dealing with multiple optimization objectives. Besides a methodological bias towards a-priori preference handling techniques, many papers neglect the DM altogether. In particular, as shown in Figure 5, 50 papers (66%) simply assumed the preference information to be known. They did not experiment with different preferences, nor describe how a DM should arrive at them. On the other hand, 15 papers (20%) did provide some examples of different preference information, such as a different weight vector. However, again, they simply selected some arbitrary preference information, without specifying how a DM should go about selecting the preferences. Similarly, all four papers that provided the DM with a Pareto front did not address how a DM should select the most preferred solution from the Pareto front. Moreover, one of the papers simply replaced the DM with a heuristic, assuming it to be an accurate approximation of the DM's preferences. For the interactive preference handling method, the author assumed the role of the DM and did not provide an evaluation of the practical usability for real-world domain experts. None of the aforementioned papers included a domain expert as DM in the evaluation of the method. Therefore, it is unclear whether the chosen preference handling methods can accurately model the preferences of the DM, let alone ensure that the mathematical optimum is aligned with the MPS. Alternatively, one paper did include domain experts as DMs, extracting preference information via the Analytic Hierarchy Process [27]. However, they did not evaluate whether the optimal solution found was indeed the most preferred solution.
In conclusion, preference handling and its role in aligning the mathematical optimum with the DM’s MPS is neglected. Often preferences are simply assumed to be known, and real DMs are rarely included in the evaluation of the proposed methods.

4. Preference Handling Methods and Their Implications

To aid future researchers in handling the preferences of the DM, we provide an extensive discussion of the methods used in the literature. More particularly, for each method, we introduce the method and its mathematical interpretation. Moreover, we highlight the different modeling choices available and state their implications, advantages, and disadvantages.

4.1. No Preference—Preference Handling

When a solution needs to be chosen in the absence of a decision-maker, a neutral compromise solution is sought. Within the military route planning literature, two papers, by Flint et al. [28] and Abhishek et al. [29], have used no-preference methods. Both of these methods use a simplification of the Nash Product. In the military UAV path planning problem, the relative importance of individual objectives differs per scenario. For example, in high-risk, time-crucial missions, minimizing flight time might be more important than in low-risk transport missions. The lack of adaptability to different scenarios is therefore a crucial disadvantage of no-preference methods.

Nash Product

Flint et al. [28], and Abhishek et al. [29] maximized the product of the objective functions, as defined in Equation (2). Although the mathematical equation seems very simple, there is a lot more to it than first meets the eye.
$$ \max_x \; F(f_1(x), \ldots, f_k(x)) = \prod_{i=1}^{k} f_i(x) \tag{2} $$
In fact, this method is (a simplification of) the Nash Product [30] or Nash Social Welfare function [31]. Essentially, it finds a natural compromise between efficiency and fairness [32,33]. It aims to come close to each objective's optimum whilst ensuring no objective is degraded by an unreasonable amount. The interpretation of the Nash Product stems from the 1950s, when John Nash introduced his famous Nash Bargaining Solution [30] for bargaining games. By viewing the optimization objectives as players' utility functions, multi-objective optimization problems can actually be seen as bargaining problems. To solve bargaining problems, one can use the Nash Bargaining Solution, which requires a disagreement point $d_i$ to be provided. The disagreement point basically corresponds to the utilities under no collaboration. By collaborating, players can improve their utility. This added surplus, defined in Equation (3), is exactly what the Nash Bargaining Solution maximizes.
$$ \max_x \; \prod_{i=1}^{n} \left( f_i(x) - d_i \right) \tag{3} $$
Geometrically, as can be seen in Figure 6, this method finds the point on the Pareto front that maximizes the area of the rectangle spanned between that point and the disagreement point [34]. Since the disagreement point defines the start of the vertical and horizontal axes, it has a direct influence on the optimization problem. For example, points with an objective value $f_i$ less than the disagreement value $d_i$ cannot be optimal [35] (compare the orange rectangle to the red rectangle in Figure 6). Additionally, as can be seen in Figure 7, when the disagreement point is set too low, the Nash bargaining solution can provide an extreme solution instead of a compromise solution. Therefore, it is important to set the disagreement point properly. One common approach is to set the disagreement point's values to the worst objective values of points on the Pareto front (i.e., the nadir point) [34]. By doing so, one finds a balanced solution over the entire front. Note that for bi-objective optimization problems, the nadir point is easy to estimate. For more objectives, however, this becomes complicated. More details can be found in Section 4.3.1.
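As an illustration, the following minimal sketch (our own Python example, not taken from [28,29]) selects the Nash bargaining solution from a discrete approximation of the Pareto front, using the nadir point as the disagreement point and assuming the objectives are expressed as utilities to be maximized (e.g., negated costs).

```python
import numpy as np

def nash_bargaining_choice(pareto_utils: np.ndarray) -> int:
    """Pick the Nash bargaining solution from a discrete Pareto front.

    pareto_utils: (n_solutions, n_objectives) array of utilities (higher is
    better), containing only non-dominated points. The disagreement point is
    the nadir point, i.e. the component-wise worst utility over the front.
    Returns the index of the point maximizing prod_i (f_i(x) - d_i)."""
    d = pareto_utils.min(axis=0)          # nadir point as disagreement point
    surplus = pareto_utils - d            # per-objective gain over d
    # Points that sit at the nadir in some objective get a zero surplus
    # there and can never be selected, matching the discussion above.
    return int(np.argmax(np.prod(surplus, axis=1)))

# Toy bi-objective example with utilities (-route_length, -danger).
front = np.array([[-10.0, -9.0],   # short but dangerous
                  [-14.0, -4.0],   # balanced compromise
                  [-20.0, -2.0]])  # long but safe
print(nash_bargaining_choice(front))  # -> 1
```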
Besides the effect of the disagreement point, the Nash bargaining solution has some additional implications. Most of these properties are implied by the axioms upon which the Nash product is based. It is important to note that the four axioms below are proven for convex bargaining problems (i.e., the set of feasible objective vectors is convex). However, multiple extensions to non-convex bargaining problems with slightly different axioms exist [36].
  • Pareto Optimality—Efficiency
    The identified Nash Bargaining Solution is a Pareto-optimal solution [34], which, of course, is a desirable property.
  • Symmetry—Fairness
    When the game is symmetric, which is the case for the Nash Bargaining Solution defined in Equation (3), the solution does not favor any objective but returns a fair solution [37]. The concept of Nash Bargaining can be extended to include asymmetric games by introducing weights to the problem, as in Equation (4). These weights can be determined by the DM, where they correspond to the preferences with respect to certain objectives [38]. This extends the Nash Bargaining Solution from a no-preference method to an a-priori method.
    $$ \max_x \; \prod_{i=1}^{n} \left( f_i(x) - d_i \right)^{w_i} \tag{4} $$
  • Scale Invariance—No Normalization
    Mathematically, this property means that any positive affine transformation of the objectives does not influence the decision-theoretic properties of the optimization problem. Intuitively, this means that the Nash Bargaining Solution is independent of the unit of the objectives. It does not matter whether one is measuring route length in kilometers or miles. Instead of comparing units of improvement in one objective to units of degradation in another objective, ratios are compared. For example, a route twice as dangerous but half as long as another route is equally good in this view. One important implication of this is that there is no need to normalize the objectives before optimizing.
  • Independence of Irrelevant Alternatives
    This axiom states that removing any feasible alternative other than the optimal Nash bargaining solution does not influence the optimal solution. This axiom has often been critiqued [39], but in the rational optimization context, it makes sense. Removing a nonoptimal solution should not change what the optimal solution is.
In conclusion, the Nash Product, if properly used, finds a Pareto-efficient compromise solution that balances efficiency within objectives and fairness across objectives. In order to use the Nash Product correctly, it is important to set the disagreement point properly. A common approach is to set the disagreement point as the nadir point, for which many different approximation methods exist. By doing so, one finds a compromise solution over the entire Pareto Front. Additionally, the Nash Product is scale-invariant, meaning there is no need to normalize objectives.

4.2. A-Priori Preference Handling

The most common approach in the UAV route planning literature is to handle preferences a-priori. There are many different ways to incorporate preferences before the optimization process starts. Within the route planning literature, as visualized in Figure 8, three different general cases have been used. First, by far the most common method (54 papers) is to aggregate the objectives via a weighted sum (see Section 4.2.1). Subsequently, an SOO method can be used. Secondly, in five papers, the authors have handled the preferences of the DM by shaping a reward function to be optimized. As shown in Section 4.2.2, this is actually equivalent to the weighted sum method. Finally, in 10 papers, a DM can set goals for certain objective values and priorities in achieving them (see Section 4.2.3). Subsequently, lexicographic optimization or goal programming techniques can be used to solve the defined optimization problems.

4.2.1. Weighted Sum Method

The most widely used way to handle conflicting objectives is to scalarize the multi-dimensional objective function by taking the weighted sum [40,41]. When one encounters a problem with two competing objectives, one naturally thinks of minimizing the weighted sum of these objectives. It is thus no surprise that over 70% of the papers reviewed use this method. The main advantage is that, after aggregation, the composite objective function can be optimized using SOO algorithms. In its simplest (and most used) form, the composite function is defined by Equation (5).
$$ F(f_1(x), \ldots, f_k(x)) = \sum_{i=1}^{k} w_i f_i(x) \quad \text{s.t.} \quad \sum_{i=1}^{k} w_i = 1, \;\; w_i \geq 0 \tag{5} $$
In principle, by systematically varying the weights and solving the corresponding series of SOO problems, one could obtain the Pareto front. However, this has some disadvantages. First, a uniform distribution of weights does not necessarily result in a uniformly distributed set of solutions on the Pareto front [41]. Moreover, only solutions on convex parts of the Pareto front can be generated (see more below). Therefore, better alternatives for generating the Pareto Front, like the ϵ-constraint method described in Section 4.3.1, exist.
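The sketch below (our own Python illustration with made-up objective values) shows such a systematic weight sweep on a discrete candidate set and, in passing, the convexity limitation: a non-dominated point that lies off the convex hull of the front is never selected, no matter which weight is used.

```python
import numpy as np

def weighted_sum_sweep(candidates: np.ndarray, n_weights: int = 11) -> set:
    """Scalarize a bi-objective minimization problem with a weighted sum
    and sweep the weight vector (w, 1 - w) over a uniform grid.

    candidates: (n_solutions, 2) objective values to minimize.
    Returns the set of candidate indices that are optimal for some weight;
    only points on the convex hull of the Pareto front can be found."""
    winners = set()
    for w in np.linspace(0.0, 1.0, n_weights):
        scores = w * candidates[:, 0] + (1.0 - w) * candidates[:, 1]
        winners.add(int(np.argmin(scores)))
    return winners

# Toy example (route_length, danger): the middle point is non-dominated but
# lies in a non-convex region of the front, so no weight ever selects it.
cands = np.array([[1.0, 10.0],
                  [6.0, 6.0],
                  [10.0, 1.0]])
print(weighted_sum_sweep(cands))  # -> {0, 2}
```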
Essentially, when one uses the Weighted Sum method, there are three modeling choices to be made. First, the constraints have to be specified. Secondly, the mathematical form of the equation has to be chosen, and finally, the weights have to be decided. All of these choices have certain implications for the optimization problem.
  • Constraints—Normalization and Weakly Dominated Points
    The first constraint, $\sum_{i=1}^{k} w_i = 1$, does not impose any constraints on the optimization problem. Any weight vector could be normalized to satisfy the constraint. However, restricting the DM to choose weights according to this constraint can cause unnecessary complications. Generally, the weight vector is interpreted to reflect the relative importance of the objectives [42]. In the case where one objective is five times more important than the other, it would be much easier for a DM to set the weights as $w = (5, 1)$ than $w = (5/6, 1/6)$. This becomes especially troublesome when there are many objectives, hindering the interpretability of the weight vector. Therefore, it is better to normalize the weights automatically after weight specification rather than restricting the DM in their weight selection.
    The second constraint, $w_i \geq 0$, can pose a problem when one of the objective weights is equal to 0. When setting $w_3$ to 0, a solution with objective values $(f_1(x), f_2(x), f_3(x)) = (1, 1, 100)$ is equivalent to $(f_1(y), f_2(y), f_3(y)) = (1, 1, 1)$. The equivalence of these two objective vectors introduces weakly dominated points (i.e., points like $x$ for which a single objective can be improved without any deterioration in another objective) to the Pareto Front. When the DM truly does not care about $f_3$, this is not a problem; however, when one would still prefer $y$ over $x$, one could set the weight of $f_3$ to a very small value ϵ instead of 0 [43]. This allows the method to differentiate between $f(x)$ and $f(y)$ using the third objective $f_3$, removing weakly dominated points from the Pareto front.
  • Mathematical Form—Imposing Linearity
    The mathematical form of the equation refers to the functional composition of the composite objective function. In all papers identified, a linear combination of objective functions, as defined in Equation (5), has been employed. However, there is no reason, besides simplicity, why the equation has to be a simple linear combination. In fact, because Equation (5) is a linear combination of the objectives, it imposes a linear structure on the composite objective function. Preferences, on the other hand, generally follow a nonlinear relationship [44]. A DM might approve a 20 km detour to reduce the probability of detection from 0.2 to 0 but might be less willing to do so to reduce the probability of detection from 1 to 0.8. Furthermore, imposing a linear structure on the preferences leads to favoring extreme solutions, whereas DMs generally favor balanced solutions [42]. This is particularly troublesome when the feasible objective space is non-convex, as can be seen in Figure 9 (compare Figure 9a to Figure 9b). The linear weighted sum is not able to find any solution between the two extremes. Points on this non-convex section of the Pareto Front cannot be identified by a linear weighted sum. In discrete cases, it has been shown that up to 95% of the points can remain undetected [45].
  • Weight vector—Relative Importance of Objectives
    This very simple method poses a not-so-simple question: how does one determine the weights? There is no unique answer to this, as the weight vector $w$ needs to be determined by the DM and it differs per scenario. The most common interpretation of the weights is to assign them based on the relative importance of the individual objectives. For example, in a bi-objective problem, setting $w_1 = 1$ and $w_2 = 2$ assumes that objective 2 is twice as important. However, as shown by Marler & Arora [42], the value of a weight is significant relative to both the weights of the other objectives and all the objective functions' ranges. In particular, if objective 1's range is much larger than objective 2's range, objective 1 will still dominate regardless of the weight set. Therefore, normalizing the objectives is necessary (see the sketch after this list). Although the normalization of objectives is a necessary precondition to interpret weights as relative importance, only a very small portion (19%) of the literature has done so, as can be seen in Figure 10. Alternatively, one can interpret the ratio of the weights as the slope of the line crossing the mathematical optimum on the Pareto Front. By changing the slope, one can identify other points on the Pareto front. However, as one does not know the shape of the Pareto front a-priori, it is hard to relate the slope of the line to the final MPS. For example, in Figure 9b, a ratio of 1 does not actually result in a balanced solution.
    Additionally, it is hard for a DM to set the weights before having information about the possible trade-offs. Why should the weight vector $w = (0.6, 0.4)$ be chosen? Why not $w = (0.59, 0.41)$? How much would the solution and its objective values change when slightly changing the weights? These questions are important, as in a practical setting a DM will need to trust that the weights set lead to the most preferred solution and that that solution is robust to a small weight change.
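The following minimal Python sketch (our own illustration; values and names are made up) combines the safeguards discussed above: weights are normalized after the DM specifies them, zero weights are lifted to a small ϵ to avoid weakly dominated points, and objectives are min-max normalized so the weights can be read as relative importance.

```python
import numpy as np

def normalized_weighted_sum(objs: np.ndarray, weights, eps: float = 1e-6) -> np.ndarray:
    """Weighted-sum scalarization with the safeguards discussed above.

    objs:    (n_solutions, k) objective values to minimize.
    weights: length-k relative importances; any positive scale is allowed,
             normalization happens here rather than at the DM.
    eps:     weights of exactly 0 are lifted to eps so weakly dominated
             points are still filtered out.
    Returns one scalar score per solution (lower is better)."""
    w = np.asarray(weights, dtype=float)
    w = np.where(w == 0.0, eps, w)
    w = w / w.sum()                                  # normalize after specification
    # Min-max normalize each objective so the weights reflect relative
    # importance regardless of the objectives' ranges.
    lo, hi = objs.min(axis=0), objs.max(axis=0)
    scaled = (objs - lo) / np.where(hi > lo, hi - lo, 1.0)
    return scaled @ w

# Toy example: "route length is five times as important as danger".
objs = np.array([[12.0, 0.9], [18.0, 0.3], [30.0, 0.1]])  # (km, detection prob.)
print(int(np.argmin(normalized_weighted_sum(objs, weights=[5.0, 1.0]))))  # -> 0
```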
In conclusion, the weighted sum method provides an easy way to deal with multiple conflicting objectives. Its scalarizing nature ensures that SOO methods can be used. However, when implementing the method, several points have to be taken into account. First, weights should only be set to 0 if DMs truly do not care about the objective. Alternatively, setting the weight to a very small amount will ensure that no weakly non-dominated points are found without degrading the performance. Secondly, to allow DMs to set weights intuitively, the weights should not be restricted to add up to 1, but rather be normalized after setting the weights. Thirdly, objective values should be normalized to allow for interpreting the weights as the relative importance of the corresponding objectives. Fourthly, the mathematical form of the equation should be representative of the DM’s preferences. If a linear combination is chosen, the preferences need to be linear as well. Moreover, a linear combination will only be able to find points on the convex hull of the Pareto Front. Finally, even if all these points are taken into account correctly it is still hard to determine the weights without any information regarding the possible trade-offs and solutions of the scenario at hand.

4.2.2. Reward Shaping

In reinforcement learning, a reward function (analogous to an objective function) is maximized. At each timestep $t$, a reward $R_t$ is calculated based on the state of the agent. The state is a representation of the environment that the UAV is in at time $t$. For instance, it includes the location of the UAV, the velocity of the UAV, and the threat distribution. Both positive and negative rewards can be used. For example, to minimize the time spent in the air, a negative reward ($R_t^D$) is awarded for being in the air. Additionally, to remain undetected, a negative reward $R_t^T$ can be awarded for being within the line of sight of a radar. To drive the UAV towards the goal, reaching it can yield a large positive reward. Equation (6) is one possible, exemplary way to model this reward function.
$$ R_t^{total} = R_t^T + R_t^D \tag{6} $$
The reward values have an interpretation similar to that of the weights of the weighted sum method. For example, when $R_t^T = -10$ and $R_t^D = -1$, being in a threat zone at time $t$ is 10 times worse than being in the air in general. Rescaling the reward values to lie between $-1$ and $1$ and integrating weights into the equation gives us Equation (7) below.
$$ R_t^{total} = w_T R_t^T + w_D R_t^D, \quad \text{where } R_t^T, R_t^D \in [-1, 1] \tag{7} $$
The similarity between Equation (7) and Equation (5) is striking. In particular, one can view the composite reward function $R_t^{total}$ as a linear weighted sum of the individual rewards (in this case $R_t^T$ and $R_t^D$). Each of the individual rewards aligns with one of the optimization objectives. For example, in Equation (7), to minimize the time spent in the air, the negative reward $R_t^D$ is added. Similarly, to minimize danger, the negative reward $R_t^T$ is added. Therefore, the individual rewards are scalarized in the same way as in the linear weighted sum method described in Section 4.2.1. Indeed, using a linear weighted sum of the individual rewards, as employed by Erlandsson [46], Yan et al. [47], Yuksek et al. [48], Alpdemir [49], and Zhao et al. [50], is equivalent to the linear weighted sum method [51,52]. The implications of this method of reward shaping are thus the same as the implications of the linear weighted sum method described in Section 4.2.1. For example, because the total reward is a convex combination of the individual rewards, policies that lie in concave regions of the Pareto Front will always have a lower total reward than the policies in convex regions of the Pareto Front [53,54]. In practice, this also means that by designing the reward function, the engineer includes assumptions about the DM's preferences in the model [51]. Therefore, the DM would need to be included in the early stages of the development of the model.
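To make the equivalence tangible, the sketch below (a minimal Python illustration of Equation (7); the state representation, weights, and goal bonus are our own assumptions, not taken from [46,47,48,49,50]) computes a per-step reward as a linear weighted sum of a threat reward and a duration reward.

```python
from dataclasses import dataclass

@dataclass
class State:
    """Minimal UAV state for the example: two mission flags."""
    in_radar_los: bool   # is the UAV within line of sight of a radar?
    at_goal: bool        # has the UAV reached its goal?

def step_reward(state: State, w_threat: float = 0.8, w_duration: float = 0.2,
                goal_bonus: float = 10.0) -> float:
    """Per-step reward as a linear weighted sum of individual rewards (Eq. (7))."""
    r_threat = -1.0 if state.in_radar_los else 0.0   # R_t^T, scaled to [-1, 0]
    r_duration = -1.0                                 # R_t^D, constant cost per timestep
    reward = w_threat * r_threat + w_duration * r_duration
    if state.at_goal:                                 # large terminal reward for reaching the goal
        reward += goal_bonus
    return reward

print(step_reward(State(in_radar_los=True, at_goal=False)))   # -1.0
print(step_reward(State(in_radar_los=False, at_goal=False)))  # -0.2
```

Choosing w_threat and w_duration here is exactly the weight-selection problem of Section 4.2.1, including the need to keep the individual rewards on comparable scales.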

4.2.3. Decision-Making Based on Goals and Priorities

Another a-priori preference handling method requires the DM to categorize each objective into a priority level and allows them to set goals for each objective. This method, formalized by Fonseca & Fleming [55], is quite flexible. For example, Pareto optimality is represented by putting all objectives at the same priority level and not specifying any goals. Lexicographic optimization (first optimizing the most important objective, then the second, and so on) is represented by putting all objectives into different priority levels. Different interpretations of goal programming can be represented by setting the goals and priority levels in the desired ways. For example, in a bi-objective optimization problem, one objective can be set as a constraint (high priority with goal), and the other objective can be set as an optimization objective (low priority, no goal). Note that depending on the choice of priorities and goals, the method can result in a Pareto front rather than a single most preferred solution. Therefore, it can require additional preference information to reach the final most preferred solution.
Within the route planning literature, 10 papers have used variants of this method. Yang et al. [56] and de la Cruz et al. [57], later extended by Besada-Portas et al. [58], directly applied the method from Fonseca & Fleming [55]. In particular, they define three different priority levels: constraints, important objectives, and less important objectives. First, the solutions must satisfy all constraints. Secondly, the routes are minimized with respect to the important objectives. When they are equivalent (in a Pareto sense), they are minimized with respect to the less important objectives. Finally, a simple heuristic is used to choose the final solution from the Pareto front. Although not directly based on Fonseca & Fleming's [55] method, Ogren [59] proposed the use of Behavior Control Lyapunov Functions (BCLF). BCLF requires the DM to provide a priority table where each column is a set of goals for each objective. Each subsequent priority column imposes stronger limits on the objective values than the one before. Ultimately, the method aims to reach the highest priority column (the far right column), which is attained by satisfying each individual goal of said priority column. Kabamba et al. [60] and Savkin & Huang [61] both use lexicographic optimization, where they first minimize the maximum danger and, once it is at its minimum, minimize the route length. Finally, Dogan [62], Chaudhry et al. [63], Ogren & Winstrand [64], and Zabarankin et al. [65] set one objective as a constraint and optimize the other objective, similar to setting a goal for one objective in the highest priority level and moving the optimization objective to the lower priority level.
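As a concrete illustration of the lexicographic variant used in [60,61], the following minimal Python sketch (our own example with made-up numbers) first minimizes the highest-priority objective and only uses lower-priority objectives to break ties.

```python
import numpy as np

def lexicographic_best(objs: np.ndarray, tol: float = 1e-9) -> int:
    """Lexicographic optimization over a discrete candidate set.

    objs: (n_solutions, k) objective values to minimize, with columns ordered
    from highest to lowest priority (e.g., [max_danger, route_length]).
    Higher-priority objectives are minimized first; lower priorities only
    break ties among solutions optimal at every level above."""
    candidates = np.arange(len(objs))
    for level in range(objs.shape[1]):
        best = objs[candidates, level].min()
        candidates = candidates[objs[candidates, level] <= best + tol]
        if len(candidates) == 1:
            break
    return int(candidates[0])

# Toy example: first minimize maximum danger, then route length (km).
routes = np.array([[0.30, 12.0],
                   [0.10, 25.0],
                   [0.10, 18.0]])   # as safe as route 1, but shorter
print(lexicographic_best(routes))   # -> 2
```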
Clearly, many different variants of methods using priorities and goals exist. Which method is best suited is highly dependent on the DM and the specific problem. However, when designing a preference handling method based on goals and priorities, one should take the following implications into account:
  • Priorities—Lexicographic Ordering
    Categorizing objectives into priorities imposes an absolute ordering on the objectives. Even an infinitely small improvement in a higher-priority objective is preferred over an infinitely large improvement in a lower-priority objective. This ordering is called a lexicographic ordering, and its sequential optimization process is called lexicographic optimization [8]. First, the highest priority is optimized. Only when there are ties, or when all goals are achieved, are subsequent priority levels considered. For each lower priority level, one thus needs more ties or more achieved goals, making it less likely that the lower priority levels are even reached. Indeed, the presence of an excessive number of priority levels causes low-level priorities to be ignored [66]. Moreover, even when only two priority levels are used, such a strict lexicographic ordering has strong implications. It makes sense when a clear natural ordering is available. For example, satisfying a UAV's flight constraints is always preferred to optimizing an objective, as a route that cannot be flown is always worse than a route that can be flown. However, for optimization objectives, this is usually not the case. It is hard to argue that no trade-off between criteria is possible. Surely, a huge decrease in the route length warrants a tiny increase in the probability of detection. Therefore, when setting priority levels, objectives should only be categorized differently when one is infinitely more important than the other.
  • Goals—Satisficing vs Optimizing
    Setting goals has a profound impact on the resulting final solution; it can have as large an impact as, if not more than, the mathematical definition of the objective functions themselves [67]. In fact, the entire underlying philosophy of the optimization process can change when a goal is set to an attainable value versus an unattainable value. When the goals are set at attainable values, goal programming essentially follows a satisficing philosophy. This philosophy, whose name combines the verbs to satisfy and to suffice, was introduced by Simon [68] and argues that a decision-maker wants to reach a set of goals, and when they are reached, this suffices for them and hence they are satisfied [69]. Mathematically, this means that objectives are optimized until they reach a goal value. The DM is indifferent to any further improvements. For constraints, this makes sense: satisfying the goals imposed by the UAV's flight constraints suffices. There is no need to make even tighter turns than the turn constraint imposes. For optimization objectives, however, this contradicts the entire concept of optimizing, for which "less is always better" (i.e., a monotonic belief). Even when the goal value for the danger levels is satisfied, a solution that has less danger is preferred. Note that this does not align with the principle of Pareto optimality and can thus result in dominated solutions. This is precisely the main disadvantage of setting goals for optimization objectives. Indeed, imposing attainable goals on monotonic optimization objectives should not be done. Instead, for monotonic objectives, goals should be set to unattainable values [70] to ensure the optimization process optimizes rather than satisfices. Alternatively, techniques for Pareto-efficiency restoration can be used (e.g., see Jones & Tamiz [71]). Finally, when objectives in the same priority level all have goals, the objective will be to minimize the sum of unwanted deviations from the goals (see the sketch after this list). For this sum to weigh objectives equally, objectives need to be of the same order of magnitude (i.e., commensurable). Therefore, when they are incommensurable, the objectives need to be normalized [66].
    Clearly, setting goals properly is of the utmost importance. So, how would one reliably set goals? To determine whether a goal value is realistic and attainable, one needs information regarding the ranges of the objective functions as well as the trade-offs in objective space, and the beliefs of the DM about the objectives (e.g., is it a monotonic objective?). A-priori, such information is not available. Therefore, setting goals is usually not done a-priori; rather, it is an iterative (i.e., interactive) process.
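The minimal Python sketch below (our own illustration; goals, weights, and values are made up) shows weighted goal programming with normalized unwanted deviations, including the caveat that attainable goals lead to satisficing rather than optimizing.

```python
import numpy as np

def goal_programming_score(objs: np.ndarray, goals: np.ndarray,
                           weights: np.ndarray) -> np.ndarray:
    """Weighted goal programming over a discrete candidate set.

    objs:    (n_solutions, k) objective values to minimize.
    goals:   length-k target values; only exceeding a goal is penalized.
    weights: length-k penalty weights for the unwanted deviations.
    Objectives are min-max normalized so the deviations are commensurable.
    Returns one score per solution (lower is better). Note: with attainable
    goals this satisfices rather than optimizes, so dominated solutions can
    tie with non-dominated ones."""
    lo, hi = objs.min(axis=0), objs.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    deviations = np.maximum(0.0, (objs - goals) / span)   # only over-achievement hurts
    return deviations @ weights

# Toy example: goals of 20 km route length and 0.2 detection probability.
objs = np.array([[15.0, 0.35], [22.0, 0.15], [28.0, 0.05]])
scores = goal_programming_score(objs, goals=np.array([20.0, 0.2]),
                                weights=np.array([1.0, 1.0]))
print(int(np.argmin(scores)))  # -> 1, the route closest to meeting both goals
```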
In conclusion, handling preferences with goals and priorities can be a flexible method to deal with initial DM preferences. Priorities can be set when a strong hierarchical (i.e., lexicographic) ordering between objectives exists. This is surely the case between constraints and optimization objectives, but it is questionable whether this holds between optimization objectives. Secondly, when setting goals, it is important to understand the effect of the goal that is set. An attainable goal will lead to satisficing solutions, whereas unattainable goals will lead to optimizing solutions. Which one is preferred depends on the DM's beliefs about each objective (e.g., whether it is monotonic). Finally, setting goals requires information about the ranges and trade-offs of the objectives. Therefore, setting them should happen interactively, rather than a-priori.

4.3. A-Posterior Preference Handling

In a-posterior preference handling methods, the preferences of the DM are handled after the optimization. In particular, the Pareto front is generated, after which DMs need to choose their MPS. It thus consists of an optimization stage and a preference handling stage.
  • Optimization: Generating the Pareto front
    To generate the entire Pareto front, the UAV route planning literature has used two distinct methods. In two papers [72,73], the Pareto front is approximated directly using metaheuristics, in particular evolutionary algorithms. These optimization algorithms use an iterative population-based optimization process to obtain an approximation of the Pareto front. As the scope of this paper is on preference handling techniques, evolutionary algorithms will not be discussed. Instead, they are extensively covered in the existing surveys identified in Table 1, as well as by Yahia & Mohammed [74] and Jiang et al. [75]. The two other papers, by Dasdemir et al. [76] and Tezcaner et al. [77], reformulated the problem using the ϵ-constraint method. This method is a general technique that reformulates the problem into a series of SOO problems. Although it is not directly a preference handling method, it is a general method to deal with multiple criteria and can be combined with any optimization algorithm. Therefore, it is covered in Section 4.3.1.
  • Preference Handling: Selecting the most preferred solution
    None of the identified papers that generate the entire Pareto front cover the preference handling stage after the optimization. Instead, they simply assume that the DM can choose their most preferred solution based on the Pareto front. However, as explained at the start of Section 3, choosing a solution from the Pareto front is not a straightforward task. For example, when the Pareto front is high-dimensional, the DM will face cognitive overload. One paper [72] addresses this by first linearly aggregating multiple objectives into two composite objective functions to lower the dimensionality. Afterwards, the authors assume that the DM can choose their most preferred solution from the two-dimensional Pareto front. However, to linearly aggregate the objectives, the method requires a-priori preference information (i.e., a weight vector), thereby introducing the same challenges as a-priori preference handling, and in particular of the weighted sum. Clearly, the focus of the aforementioned papers is on the optimization process and on the creation of a Pareto front. The subsequent step of selecting the MPS, however, is neglected. The DM is not aided in this process, which hinders the usability of the method. In particular, the decision-aid aspect of optimization is neglected.

4.3.1. ϵ -Constraint Method

The ϵ-constraint method keeps one arbitrarily chosen objective as the optimization objective and converts the remaining objectives into constraints. Mathematically, this means the problem is reformulated to Equation (8).
$$\min_{x} \; f_j(x) \quad \text{s.t.} \quad f_k(x) \leq \epsilon_k, \qquad k = 1, \ldots, p, \; k \neq j \qquad (8)$$
At first, these constraints are very loose bounds. By systematically tightening the bounds, one can get the entire Pareto front. A visualization of this process can be seen in Figure 11. In this example, $f_1$ is arbitrarily chosen to be constrained and $f_2$ is minimized. The first constraint, $\epsilon_1^1$, results in the far-right solution on the Pareto front, $f(x^1)^*$, as $f_2$ is at its minimum there. By systematically moving the red line to the left (tightening the bound) one can obtain all the solutions. For example, at $\epsilon_1^2$ one gets the green solution $f(x^2)^*$, and at $\epsilon_1^3$ one gets the blue solution $f(x^3)^*$. Of course, in practice, one would not use such a coarse $\epsilon$ grid. In particular, the more fine-grained the $\epsilon$ grid is, the more detailed the approximation of the Pareto front is.
The most notable advantage of the ϵ-constraint method over the systematic weighted sum method is that non-convex parts of the Pareto front can also be generated, as can be seen in Figure 11. Besides this applicability to non-convex Pareto fronts, four components of the method have implications for the optimization problem. First, the introduction of constraints changes the problem structure. Secondly, the ϵ-constraint method can generate weakly efficient solutions. Thirdly, in order to generate the entire Pareto front, one needs to know the range of the objectives (i.e., the start and end points for varying the constraints). Finally, one needs to choose how to systematically tighten the constraints.
  • Constraint Introduction—Problem Complexity
    The introduction of constraints can greatly complicate the structure of the optimization problem. In particular, for the shortest path problem, introducing constraints changes the complexity of the problem from polynomially solvable to NP-complete [78]. Therefore, for larger problem instances, an exact ϵ-constraint method for the UAV route planning problem becomes computationally intractable. Instead, metaheuristics, for example, can be used to solve the formulated SOO problems.
  • Weakly Dominated Points
    The ϵ-constraint method can generate weakly dominated points. In particular, the method will generate all points that satisfy the constraint on $f_2(x)$ and for which $f_1(x)$ is at its minimum. When there is no unique solution (i.e., multiple solutions $x$ for which $f_1(x)$ is at its minimum), all of these solutions can be generated. Of course, in this case, one would prefer to only keep the solution with minimal $f_2(x)$. Therefore, one could add a second optimization problem that optimizes w.r.t. $f_2(x)$ given the optimal $f_1(x)$ value. Note that this is lexicographic optimization [79]. Alternatively, one could eliminate these weakly efficient solutions by modifying the objective function slightly. In particular, in the augmented ϵ-constraint method, the problem is reformulated to include a small term meant to differentiate between weakly efficient and efficient solutions [43]. For example, in the bi-objective case, we add the term $\rho f_2(x)$ to get Equation (9). When $\rho$ is sufficiently small to only differentiate between solutions with equal $f_1(x)$, the weakly efficient solutions are removed. Note that this happens without the need to formulate and solve additional optimization problems, thereby saving computational resources. Alternative ways to remove weakly efficient solutions are available (e.g., Mavrotas [80]).
    $$\min_{x} \; f_1(x) + \rho f_2(x) \quad \text{s.t.} \quad f_2(x) \leq \epsilon \qquad (9)$$
  • Start and End Points of Tightening—Objective Range
    In order to get the full Pareto front, one needs to know where to start systematically tightening the bound and when to stop. In particular, the loose bound (starting point) is the nadir point. The ending point of the method is reached when the bound is at its tightest and only one solution remains. This is the ideal point, the point with the best possible value of each objective over the Pareto front. The ideal point is easy to compute; one simply solves a single-objective minimization problem for each objective $i$, and the resulting optimal objective value is the ideal point component $I_i$ of dimension $i$ [8]. For the nadir point, one can take the worst objective values found by solving this series of SOO problems. In the bi-objective case, this yields the exact nadir point. However, when there are more than two objectives, this can greatly over- or under-estimate the nadir point [8,80]. Many different estimation methods exist (for a survey see Deb & Miettinen [81]), but no efficient general method is known [8].
  • Systematically Tightening the Bounds
    Originally, the ϵ-constraint method tightens the bound of each constraint $i$ by a fixed value $\delta_i$. The value of $\delta_i$ relative to the range of the objective $f_i$ determines how detailed the generated Pareto front will be. A large $\delta_i$ results in a coarse Pareto front, whereas a small $\delta_i$ results in a fine-grained one. In particular, $\delta_i$ can be seen as the step size of the method: increasing it means taking bigger steps, leading to a coarser Pareto front, but it also decreases the computational effort, as fewer steps are needed to reach the ideal point and hence fewer optimization problems need to be solved. Therefore, $\delta_i$ needs to be fine enough to get a good approximation of the Pareto front but coarse enough to prevent a lot of redundant computational effort [82]. Alternatively, one can tighten the bounds adaptively: when the bound is at a non-Pareto-efficient part of the objective space, larger steps can be taken, whereas when the bound is at a Pareto-efficient point, small steps can be taken. An example of this can be found in Laumanns et al. [82].
In conclusion, one can use the ϵ-constraint method to generate both convex and non-convex Pareto fronts. To do so, one needs to find (or approximate) the ideal and nadir points, and systematically tighten the bounds between these points. By changing the step size $\delta$ of the tightening, one can increase the detail of the Pareto front at the cost of increased computational effort. Alternatively, one can set the step size adaptively based on objective space information. Moreover, to guarantee efficient solutions, one can include a term $\rho_i f_i(x)$ in the optimization objective per constraint $i$ (with $\rho_i$ sufficiently small). Finally, note that the introduction of constraints changes the complexity of the shortest path problem from polynomially solvable (unconstrained single-objective case) to NP-complete (constrained single-objective case). Therefore, for large problem instances, exact methods will not be feasible and metaheuristics can be used.
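To illustrate these mechanics, the sketch below sweeps the bound on $f_2$ over a toy bi-objective problem with a discrete candidate set and solves each augmented single-objective subproblem by simple enumeration. The candidate solutions, step size, and augmentation factor $\rho$ are illustrative assumptions rather than values from the reviewed papers; in an actual route planning problem, each subproblem would be solved with an exact or metaheuristic SOO solver instead of brute force.

```python
import numpy as np

# Toy bi-objective problem over a discrete set of candidate solutions
# (stand-ins for routes): f1 = path length, f2 = exposure. Both are minimized.
rng = np.random.default_rng(0)
f1 = rng.uniform(0.0, 10.0, size=200)
f2 = 10.0 - f1 + rng.uniform(0.0, 2.0, size=200)   # roughly conflicting objectives

rho = 1e-3    # augmentation factor, removes weakly efficient solutions (Equation (9))
delta = 0.5   # step size: a coarser step gives a coarser Pareto front, fewer subproblems

# Tighten the bound on f2 from its worst (nadir-like) value toward its best value.
eps_values = np.arange(f2.max(), f2.min() - delta, -delta)
front = []
for eps in eps_values:
    feasible = f2 <= eps                       # the epsilon-constraint on f2
    if not feasible.any():
        break
    # Augmented subproblem: min f1 + rho * f2  s.t.  f2 <= eps
    idx = np.argmin(np.where(feasible, f1 + rho * f2, np.inf))
    front.append((float(f1[idx]), float(f2[idx])))

# Consecutive eps values may select the same solution; keep unique points only.
front = sorted(set(front))
print(f"{len(front)} approximate Pareto-optimal points found")
```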

4.4. Interactive Preference Handling

Interactive methods consist of an iterative cycle of preference elicitation, preference modeling, optimization, and feedback [83]. Additionally, the interaction pattern, i.e., the point at which the DM can supply preference information, is an important design choice. Essentially, the DM provides some preference information upon which a preference model is built. Based on this model, the region of interest (ROI) of the Pareto front is identified, after which the optimization model generates solutions close to the ROI. Finally, feedback (such as trade-off information) is provided to the DM so that the preferences can be updated and a new iteration can start. Alternatively, one can distinguish a learning phase and a decision phase [84]. In the learning phase, the DM learns the possibilities and limitations of the problem, and in the decision phase, the DM focuses on the ROI and tries to find their most preferred solution. Within the UAV route planning literature, one paper, by Dasdemir et al. [85], uses an interactive method. In particular, they use a reference point-based evolutionary algorithm.

Interactive Reference Point-Based Methods

Interactive reference point-based methods require the DM to supply them with a reference point. This reference point is a vector in objective space with certain goal levels (similar to the goals defined in Section 4.2.3). It provides the optimization model with information on what the ROI is. Finally, the optimization model uses this information to approximate the ROI and subsequently, it provides the DM with feedback on the possibilities, limitations, and solutions of the problem. Note that an interactive system has many more interdependent components than an a-priori or a-posterior method. Additionally, the components depend on the interaction with the DM, which is, of course, individual-specific. It is thus impossible to fully determine the implications arising from certain design choices. Therefore, the discussion below is aimed more toward explaining the general implications than toward covering very specific design choices.
  • Preference Information—Reference point
    Preference information is supplied by means of a reference point. Providing reference points is a cognitively valid way of providing preference information [86], as DMs do not have to learn to use other concepts. Instead, they can simply work in the objective space, which they are familiar with. However, as discussed in Section 4.2.3, using goals can result in dominated solutions due to the satisficing philosophy. Therefore, it is important to handle dominated solutions when using interactive reference point-based methods. Dasdemir et al. [85] prevent dominated solutions by keeping track of a set of nondominated solutions throughout the entire optimization process. After the optimization is done, the nondominated points closest to the reference point are returned as the ROI. Therefore, assuming the evolutionary algorithm has found nondominated solutions, only nondominated solutions will be returned.
    Additionally, when achieving one goal is more important than achieving another goal, it can be difficult to represent this with a reference vector. Therefore, a weighted variant [87] exists that introduces relative importance weights in the distance function (see preference model below). Note that although this gives the DM the possibility to include preferences regarding the importance of achieving certain goals, specifying weights is not a trivial task (as described in Section 4.2.1).
  • Preference Model—Distance Function
    Based on the provided reference point, a preference model is constructed. The preference model captures the subjective preferences of the DM and uses them to identify the ROI on the Pareto front. In interactive reference point-based methods, the preference model is usually fully determined by the combination of a scalar distance function and the reference point [87,88]. This distance function is sometimes called an Achievement Scalarizing Function, which essentially performs distance minimization [89]. The choice of distance function influences the identified ROI. For example, with a quadratic distance function, points far away from the reference point are penalized more heavily than with a linear function. Moreover, by choosing a nonlinear distance function, one introduces nonlinearity, alleviating the issues of the weighted sum methods described in Section 4.2.1. In Dasdemir et al. [85], the distance function is the normalized Euclidean distance (which is nonlinear) to the reference point. The solution closest to the reference point is ranked 1, the second closest is ranked 2, etc. Whichever solution is closer (based on the chosen distance function) to the reference point is thus preferred by the DM.
    Instead of finding a single most preferred solution, interactive reference point-based methods aim to identify an ROI. To identify this ROI, one needs to balance specificity and diversity. In particular, the ROI should not collapse into a single point because it is too specific, but it should also not be so diverse that it is spread around the entire Pareto front. Therefore, to tune the spread of the ROI, an ϵ-niching method is commonly used [83] (e.g., in R-NSGA-II [90] and RPSO-SS [91]). Dasdemir et al. [85] also employ such an ϵ-niching method. In particular, one representative is constructed and all solutions within an ϵ-ball around the representative are removed. Subsequently, the next representative outside of the ϵ-ball is created and the process repeats. By increasing ϵ, more solutions will be deemed equal and the ROI will be wider, and vice versa. Setting this design parameter, however, is not trivial [83]. For example, in the exploration phase a DM could want a more diverse spread of the ROI compared to when the DM tries to find the final most preferred solution. Therefore, adaptively changing ϵ based on whether the DM is exploring or selecting their MPS might be useful. (A minimal sketch of this reference-point ranking with ϵ-niching is given at the end of this subsection.)
  • Feedback to the DM—Pareto Front Information
    One of the main advantages of using interactive optimization methods is that the DM can learn about the possibilities and limitations of the problem at hand by receiving feedback from the optimization process. Based on the learned information, a DM can better identify the ROI of the Pareto front. The feedback provided to the DM can take many forms. For example, possible trade-off information or a general approximation of the Pareto front can be provided. For an interactive system to work correctly, it is vital that the interaction is not too cognitively demanding for the DM (i.e., cognitive overload). At the same time, the information needs to be rich enough to provide the DM with the necessary capabilities to correctly identify their ROI. Moreover, different DMs have different skillsets and affinities with optimization; therefore, the interaction should be designed in close collaboration with the relevant DMs.
    In Dasdemir et al. [85], the feedback provided to the DM is a general approximation of the Pareto front (although no guarantees are given that the provided points are actually nondominated). In the bi-objective case they evaluate, this is a cognitively valid way of communicating information, but when the Pareto front becomes higher-dimensional it can cognitively overload the DM [92]. Therefore, alternative visualization methods exist (e.g., Talukder et al. [93]). Besides that, the interaction is neither designed nor evaluated with the relevant DMs; therefore, it is hard to say whether, in this context, Pareto front information is an appropriate way to communicate information to the DM.
  • Interaction Pattern—During or After a Run
    Finally, when designing an interactive optimization method, one needs to decide at what point the DM is allowed to provide preference information. In particular, Xin et al. [83] have identified two interaction patterns: Interaction After a complete Run (IAR) and Interaction During a Run (IDR). Reference-point based methods generally follow the IAR method. IDR methods, on the other hand, give the DM more chances to guide the search process, thereby making it more DM-centric. Alternatively, one could allow the DM to intervene at will, either waiting for a complete run or stopping the run. Most importantly, there needs to be a balance between requesting preference information and optimization. We do not want to strain the DMs too much by asking them to provide a lot of preference information, but at the same time, we need to be able to build an accurate preference model and allow the DM to be in control.
In conclusion, interactive methods consist of an iterative cycle of preference elicitation, preference modeling, optimization, and finally feedback to the DM. Moreover, interactive methods can either request preference information after a complete optimization run or during an optimization run. Reference point-based methods use a reference point (goal vector) as preference information, which is a cognitively valid approach for the DM. Usually, they employ a nonlinear distance function that serves as the preference model. Additionally, ϵ-niching methods are used to tune the spread of the identified ROI. After a complete run (IAR), the DM is supplied feedback, usually some information on the Pareto front, on which to base their new reference point.
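The sketch below illustrates the two mechanisms discussed above: candidate solutions are ranked by their normalized Euclidean distance to a DM-supplied reference point, and the region of interest is then thinned with a simple ϵ-niching pass. The archive, reference point, and ϵ value are illustrative assumptions; the sketch is not a reproduction of the algorithm of Dasdemir et al. [85].

```python
import numpy as np

def normalized_distance(points: np.ndarray, ref: np.ndarray,
                        ideal: np.ndarray, nadir: np.ndarray) -> np.ndarray:
    """Euclidean distance to the reference point after scaling each objective by its range."""
    scale = nadir - ideal
    return np.linalg.norm((points - ref) / scale, axis=1)

def epsilon_niching(points: np.ndarray, dists: np.ndarray, eps: float) -> np.ndarray:
    """Keep one representative per epsilon-ball, starting from the best-ranked solution."""
    order = np.argsort(dists)
    kept = []
    for i in order:
        if all(np.linalg.norm(points[i] - points[j]) > eps for j in kept):
            kept.append(i)
    return points[kept]

# Hypothetical nondominated archive in a bi-objective space (both objectives minimized).
archive = np.array([[1.0, 9.0], [2.0, 7.0], [2.2, 6.8], [4.0, 4.0], [7.0, 2.0], [9.0, 1.0]])
ideal, nadir = archive.min(axis=0), archive.max(axis=0)
reference_point = np.array([3.0, 5.0])        # the DM's aspiration levels

d = normalized_distance(archive, reference_point, ideal, nadir)
roi = epsilon_niching(archive, d, eps=1.0)    # a larger eps gives a wider, sparser ROI
print(roi)
```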

5. Comparison of Preference Handling Techniques

To compare all the identified preference-handling techniques, key evaluation criteria have been established. In particular, we evaluate them by the point at which the DM includes their preferences (e.g., a-priori or a-posterior), the computational complexity, the number of solutions generated, the optimality of the solution(s), the ease of preference specification, and finally by the preference model used. These criteria have been chosen to represent both the decision-aid aspect (number of solutions, ease of preference specification, and preference model) as well as the optimization aspect (complexity, optimality). A comparison based on these evaluation criteria can be found in Table 3 and is further discussed below. Additionally, the advantages, disadvantages, and suitable application areas are highlighted. Note that for the sake of clarity, priorities and goals are discussed separately, although many possible combinations exist.
  • Nash Product
    The Nash product, a no-preference method, transforms the original MOO problem into an SOO problem by aggregating the k objectives into a single composite objective function via the product rule (see Equation (3)). Subsequently, it allows the use of SOO techniques to solve the formulated problem. The resulting single Pareto-optimal solution assumes equal importance of the objectives and represents a balanced trade-off between them. One of the advantages of using the Nash product is the low computational complexity, allowing the use of SOO methods. Additionally, the use of the product makes the method scale-invariant, which means no normalization of the objectives is needed. The main disadvantage of the method is that it cannot be tailored to different scenarios, which is much needed given the large variety of military missions. Suitable applications for this method are those where one does not have access to a DM and/or requires a single balanced compromise solution.
  • Weighted Sum Method and Reward shaping
    The weighted sum method, an a-priori technique, transforms the original MOO problem into an SOO problem by linearly aggregating the k objectives via a weighted sum (see Equation (5)). The reward shaping method, as used by the reviewed papers, handles preferences equivalently to the weighted sum method. By using regular SOO techniques, one can solve the formulated problem. The resulting solution is weakly Pareto optimal, but when all weights are set strictly positive ($w_i > 0, \; \forall i \in \{1, \ldots, k\}$) the solution is guaranteed to be Pareto optimal. The main advantage of using the weighted sum method or reward shaping is that it greatly reduces the computational complexity of the problem. A disadvantage is that it can only find points on the convex hull of the Pareto front. Besides that, it can favor extreme solutions even when the weights are set equal. Because of this, setting weights is non-trivial, and the interpretation of the weights as the relative importance of objectives might not hold. Additionally, it requires normalization of the objectives. Finally, specifying preference information a-priori might result in suboptimal results. Suitable applications are those with convex Pareto fronts (e.g., linear objective functions), requiring a single optimal solution, and having access to objective trade-off information before optimization. (A brief sketch contrasting this linear scalarization with the Nash product is given after this comparison list.)
  • Priorities
    Setting priorities transforms the original MOO problem into a series of j SOO problems (assuming j priority levels are used) via the use of lexicographic optimization. Subsequently, SOO techniques can be used to solve the formulated SOO problems. The resulting solutions are Pareto optimal, assuming the objectives all have their own priority level. The main advantage of using priorities is that it greatly reduces the computational complexity of the original MOO problem. Besides that, when a natural absolute ordering is available, it is easy to provide preferences. However, it is questionable whether such an ordering exists between optimization objectives. Applications suitable for this method are thus applications where a natural absolute ordering between objectives exists and a single optimal solution is needed.
  • Goals
    Specifying goals (i.e., reference points) transforms the MOO problem into a single SOO problem. In particular, the composite optimization function is the sum of unwanted deviations from the goal values. Traditional SOO methods can be used to solve the formulated problem. The resulting solution is not guaranteed to be Pareto optimal. The main advantage of using goals is the reduction in computational complexity. Additionally, when information on the range and trade-offs in the objective space is available, setting goals is a cognitively valid way of expressing preference information. The main disadvantage is that such information is usually not available a-priori. Besides that, solutions found by these methods are not guaranteed to be Pareto optimal. Therefore, techniques for Pareto-efficiency restoration have to be used. Suitable applications are thus characterized by having access to objective range and trade-off information a-priori and needing a single optimal solution.
  • ϵ -constraint method
    The ϵ-constraint method transforms the MOO problem into a series of n SOO problems (n depending on the step size of the tightening). The solutions to these problems form the approximated Pareto front. To formulate the SOO problems, one objective is kept as an optimization objective, and all others are set as constraints. First, the constraints are set loosely, and in every iteration, they are tightened. The main advantage of this method is that it provides the DM with Pareto front information before requiring them to make a choice. Additionally, it provides the DM with multiple solutions. Finally, as opposed to the systematic weighted sum method, this method also finds points on the concave parts of the Pareto front. The main disadvantage of the method is its computational complexity. To get a reasonable level of detail of the Pareto front, a very large number of SOO problems needs to be solved. Therefore, this method does not scale well to many objectives or to complex SOO problems, like the resource-constrained shortest path problem. Finally, when the Pareto front is high-dimensional, it is hard for the DM to select the final solution. Suitable applications for this method are thus applications where multiple alternative solutions are preferred, few optimization objectives exist, and the formulated SOO problems are easy to solve.
  • Interactive Reference point-based method
    Interactive reference point-based methods allow the DM to specify their reference points iteratively during the optimization process. By doing so they aim to approximate parts of the Pareto front close to the reference points. The resulting approximations are not guaranteed to be Pareto optimal by default. The main advantage is that the interactive nature of the method allows the DM to specify preference information under better-informed circumstances thereby simplifying the preference specification step and consequently improving the likelihood that the final solution identified is the MPS. Moreover, since preference information is supplied during the optimization process, only parts of the Pareto front have to be generated, saving computational resources. The main disadvantage of this method is that it is difficult to find the right balance between overloading the DM and getting enough preference information. Moreover, since the solutions are not guaranteed to be Pareto optimal, Pareto-efficiency restoration techniques or similar methods have to be used. Suitable areas of application are those with intensive access to DMs, who are willing to participate in the decision process.
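As a brief illustration of the first two scalarizations compared above, the sketch below applies a Nash-product aggregation (assumed here to be the plain product of positive objective values, in line with the product rule of Equation (3)) and a normalized weighted sum (Equation (5)) to the same set of hypothetical candidate solutions. The objective values and the weight vector are illustrative assumptions.

```python
import numpy as np

# Hypothetical candidate solutions evaluated on two positive objectives to be
# minimized, e.g., (flight time, detection risk). All four are mutually nondominated.
f = np.array([[10.0, 0.8], [12.0, 0.3], [11.0, 0.5], [14.0, 0.2]])

# Nash-product scalarization: product of the positive objective values.
# Rescaling a single objective rescales every score equally, so the chosen
# solution is scale-invariant and no normalization is needed.
nash_scores = f.prod(axis=1)
print("Nash product picks:", f[nash_scores.argmin()])

# Weighted-sum scalarization: objectives are first normalized to [0, 1];
# the equal weight vector below is an illustrative assumption.
weights = np.array([0.5, 0.5])
f_norm = (f - f.min(axis=0)) / (f.max(axis=0) - f.min(axis=0))
ws_scores = f_norm @ weights
print("Weighted sum picks:", f[ws_scores.argmin()])
```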

6. Conclusions and Future Research

In this paper, a comprehensive survey on preference handling in the military UAV route planning literature is presented. We provided a categorization into four classes: no-preference, a-priori, a-posterior, and interactive methods. The review of the literature has shown that, despite its weaknesses, an absolute majority (90%) of the papers use a-priori preference handling. The literature predominantly focuses on the optimization aspect of the problem, neglecting the decision-aid aspect. This is also substantiated by reviewing how the current literature has evaluated the performance of its methods. Many of the papers exclusively evaluate algorithmic performance and overlook the evaluation of the degree of decision aid provided to the DM. In particular, 66% of the papers simply assumed the preferences to be a given, known fact. Only one paper uses a subject matter expert as DM to elicit the preferences. Clearly, many of the papers focus exclusively on the optimization aspect, but fail to ensure that the resulting mathematical optimum aligns with the most preferred solution of the DM. The proposed methods neglect to validate whether the chosen preference handling methods align with the DM's capabilities and beliefs. To mitigate these issues, we have provided a critical discussion of the employed preference handling methods, stating their advantages and disadvantages as well as providing several ways of mitigating undesired implications. By reviewing the literature, we identified several future research directions:
  • Aligning the mathematical optimum with the MPS via accurate preference models
    The goal of the route planning problem is to aid the decision-maker in finding their MPS. Currently, most of the research is focused on improving optimization algorithms. No doubt, it is important to ensure that a mathematical optimum can be found. However, when this mathematical optimum does not correspond to the DM's MPS, the optimization is useless. Therefore, more effort should be put into aligning the mathematical optimum with the DM's MPS. For instance, the weighted sum method, used by 70% of the reviewed papers, employs an overly simplistic preference model, potentially leading to a mismatch between the mathematical optimum and the MPS. Lexicographic optimization methods assume an absolute ordering between optimization objectives; however, it is questionable whether such an ordering ever exists. Therefore, future research should employ accurate representations of the DM's preferences.
  • A-posterior and interactive preference handling
    As our literature review has shown, nearly 90% of the papers employ a-priori preference handling techniques. These methods have the major disadvantage that preference elicitation happens before any knowledge of the possibilities, limitations, and capabilities of the problem are known. As a result, the final identified solution might not be the MPS. Additionally, these methods provide a single optimal solution, whereas in some scenarios, particularly in military missions, providing alternative solutions is extremely useful. Therefore, it would be interesting to further investigate a-posterior and interactive preference handling techniques, as these techniques include preference elicitation in a later, better-informed stage, and provide multiple optimal solutions, mitigating the disadvantages of a-priori methods.
  • Cognitive validity of preference elicitation
    Most of the literature has neglected the cognitive validity of the preference elicitation stage. For example, providing weights as preference information is a nontrivial task, potentially misleading the DM. For the method to work in practice, a DM needs to be able to trust the method. Therefore, future research should be geared towards using cognitively valid preference elicitation techniques.
  • DM-centric evaluation
    Finally, as the goal of the route planning problem is to help DMs arrive at their MPS, the performance should be evaluated in this regard as well. In particular, methods should be evaluated based on their decision-aid capabilities and not just on algorithmic performance. Therefore, evaluation with expert DMs is an interesting direction for future research.
  • Preference Handling in cooperative path planning
    Our paper has focused on the handling of preferences w.r.t. the optimization objectives for individual UAVs. When one focuses on cooperative path planning (e.g., Pan et al. [94] and Liu et al. [95]), one not only has multiple conflicting objectives per UAV, but each UAV’s optimal path might conflict with the others. Therefore, there is another layer of multi-objective optimization. For future research, it would be interesting to further investigate the way preferences are handled in such cases.
  • Standardization of evaluation of methods
    Currently, research papers propose an optimization algorithm and evaluate it using self-defined scenarios. Since every paper defines its own scenario, it is difficult to compare the results of different methods with each other. For future research, it would be interesting to define a set of benchmark scenarios, on which each method can be tested. This would facilitate a better comparison of methods.
  • Definition of optimization objectives
    Many different mathematical definitions of the optimization objectives are employed throughout the literature. For example, to minimize the probability of detection, the literature has maximized the distance from the radar [96], minimized the radar power using various functions [97,98,99], minimized a probability defined by a Gaussian [62], minimized the time spent in a threat zone [100], etc. Similarly, for the flight time objective, many different definitions exist. All these different definitions have different effects on the resulting mathematical optimum.

Author Contributions

Conceptualization, T.Q.; methodology, T.Q.; investigation, T.Q.; writing—original draft preparation, T.Q.; writing—review and editing, T.Q., R.L., M.V., H.M. and B.Č.; visualization, T.Q.; supervision, R.L., M.V., H.M. and B.Č. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Chair of Data Science, Safety & Security at Tilburg University.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

DURC Statement

Current research is limited to the review of preference handling techniques in UAV route planning, which is beneficial for ensuring the alignment of the mathematical optimum with the decision-maker’s optimum and does not pose a threat to public health or national security. The authors acknowledge the dual-use potential of the research involving drone route planning and confirm that all necessary precautions have been taken to prevent potential misuse. As an ethical responsibility, the authors strictly adhere to relevant national and international laws about DURC. The authors advocate for responsible deployment, ethical considerations, regulatory compliance, and transparent reporting to mitigate misuse risks and foster beneficial outcomes.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
UAV    Unmanned Aerial Vehicle
EW     Electronic Warfare
DM     Decision-Maker
MPS    Most Preferred Solution
RQ     Research Question
NLDA   Netherlands Defence Academy
NPS    Naval Postgraduate School
AIAA   American Institute of Aeronautics and Astronautics
MOO    Multi-Objective Optimization
SOO    Single-Objective Optimization
ROI    Region Of Interest
IAR    Interaction After complete Run
IDR    Interaction During Run

Appendix A

Table A1. Overview of Preference Handling in the Military UAV Path Planning Literature. N = Normalized, L = Linear, WS = Weighted Sum.
Reference | Class | Method | Evaluation
Flint et al. [28] | No Preference | Nash Product | Not Applicable
Kim & Hespanha [101] | a-priori | N-L-WS | Preference Assumed
Misovec et al. [102] | a-priori | L-WS | Preference Assumed
Dogan [62] | a-priori | Goals and Priorities | Multiple preferences visualized
Chaudhry et al. [63] | a-priori | Goals and Priorities | Multiple preferences visualized
Qu et al. [103] | a-priori | L-WS | Multiple preferences visualized
McLain & Beard [104] | a-priori | L-WS | Preference Assumed
Ogren & Winstrand [64] | a-priori | Goals and Priorities | Preference Assumed
Changwen et al. [105] | a-priori | L-WS | Preference Assumed
Weiss et al. [106] | a-priori | N-L-WS | Preference Assumed
Foo et al. [107] | a-priori | L-WS | Multiple preferences visualized
Ogren et al. [59] | a-priori | Goals and Priorities | Preference Assumed
Kabamba et al. [60] | a-priori | Goals and Priorities | Not Applicable
Zabarankin et al. [65] | a-priori | Goals and Priorities | Multiple preferences visualized
Zhang et al. [108] | a-priori | L-WS | Preference Assumed
Lamont et al. [72] | a-posterior | L-WS, Pareto Front | Provide Pareto Front
Zhenhua et al. [73] | a-posterior | Pareto Front | Provide Pareto Front
de la Cruz et al. [57] | a-priori | Goals and Priorities | Preference Assumed
Xia et al. [109] | a-priori | L-WS | Multiple preferences visualized
Tulum et al. [110] | a-priori | L-WS | Heuristic approximation
Qianzhi & Xiujuan [111] | a-priori | L-WS | Preference Assumed
Wang et al. [112] | a-priori | L-WS | Preference Assumed
Zhang et al. [113] | a-priori | L-WS | Preference Assumed
Besada-Portas et al. [58] | a-priori | Goals and Priorities | Preference Assumed
Xin Yang et al. [114] | a-priori | N-L-WS | Preference Assumed
Yong Bao et al. [115] | a-priori | L-WS | Preference Assumed
Kan et al. [97] | a-priori | L-WS | Multiple preferences visualized
Lei & Shiru [98] | a-priori | L-WS | Preference Assumed
Zhou et al. [99] | a-priori | L-WS | Preference Assumed
Holub et al. [116] | a-priori | L-WS | Multiple preferences visualized
Chen et al. [117] | a-priori | L-WS | Preference Assumed
Li et al. [118] | a-priori | L-WS | Preference Assumed
Wallar et al. [119] | a-priori | N-L-WS | Preference Assumed
Yang et al. [56] | a-priori | Goals and Priorities | Preference Assumed
Qu et al. [120] | a-priori | L-WS | Multiple preferences visualized
Duan et al. [121] | a-priori | L-WS | Preference Assumed
Chen & Chen [117] | a-priori | L-WS | Preference Assumed
Wang et al. [122] | a-priori | N-L-WS | Preference Assumed
Xiaowei & Xiaoguang [123] | a-priori | L-WS | Preference Assumed
Wen et al. [124] | a-priori | L-WS | Expert DM
Zhang et al. [125] | a-priori | L-WS | Preference Assumed
Erlandsson [46] | a-priori | reward shaping | Multiple preferences visualized
Humphreys et al. [126] | a-priori | L-WS | Preference Assumed
Tianzhu et al. [127] | a-priori | L-WS | Preference Assumed
Wang & Zhang [128] | a-priori | L-WS | Preference Assumed
Jing-Lin et al. [129] | a-priori | L-WS | Preference Assumed
Savkin & Huang [61] | a-priori | Goals and Priorities | Not Applicable
Zhang et al. [130] | a-priori | L-WS | Multiple preferences visualized
Maoquan et al. [131] | a-priori | N-L-WS | Preference Assumed
Danacier et al. [132] | a-priori | L-WS | Multiple preferences visualized
Chen & Wang [133] | a-priori | L-WS | Preference Assumed
Zhang & Zhang [96] | a-priori | L-WS | Preference Assumed
Patley et al. [134] | a-priori | L-WS | Preference Assumed
Ma et al. [135] | a-priori | L-WS | Preference Assumed
Dasdemir et al. [85] | interactive | Reference Point-based | Author as DM
Zhang et al. [136] | a-priori | L-WS | Preference Assumed
Wu et al. [137] | a-priori | L-WS | Preference Assumed
Xiong et al. [138] | a-priori | N-L-WS | Preference Assumed
Zhang et al. [139] | a-priori | L-WS | Preference Assumed
Yan et al. [47] | a-priori | reward shaping | Preference Assumed
Abhishek et al. [29] | no preference | Nash product | Not Applicable
Zhou et al. [140] | a-priori | N-L-WS | Preference Assumed
Zhang et al. [141] | a-priori | L-WS | Multiple preferences visualized
Xu et al. [142] | a-priori | L-WS | Preference Assumed
Huan et al. [143] | a-priori | L-WS | Preference Assumed
Leng & Sun [100] | a-priori | N-L-WS | Multiple preferences visualized
Yuksek et al. [48] | a-priori | reward shaping | Preference Assumed
Woo et al. [144] | a-priori | L-WS | Preference Assumed
Lu et al. [145] | a-priori | L-WS | Preference Assumed
Zhang et al. [146] | a-priori | L-WS | Preference Assumed
Alpdemir [49] | a-priori | reward shaping | Preference Assumed
Dasdemir et al. [76] | a-posterior | Pareto Front | Provide Pareto Front
Zhao et al. [50] | a-priori | reward shaping | Preference Assumed
Tezcaner et al. [77] | a-posterior | Pareto Front | Provide Pareto Front
Zhang et al. [147] | a-priori | L-WS | Preference Assumed
Luo et al. [148] | a-priori | N-L-WS | Multiple preferences visualized

References

  1. Buckley, J. Air Power in the Age of Total War, 1st ed.; Routledge: Oxfordshire, UK, 2006. [Google Scholar] [CrossRef]
  2. Imperial War Museums. A Brief History of Drones; Imperial War Museums: London, UK, 2018. [Google Scholar]
  3. Central Intelligence Agency. Remotely Piloted Vehicles in the Third World: A New Military Capability; Central Intelligence Agency: McLean, VA, USA, 1986.
  4. Lambeth, B.S. Moscow’s Lessons from the 1982 Lebanon Air War. Technical Report; RAND: Boston, MA, USA, 1984. [Google Scholar]
  5. Zafra, M.; Hunder, M.; Rao, A.; Kiyada, S. How Drone Combat in Ukraine is Changing Warfare; Reuters: Hoboken, NJ, USA, 2024. [Google Scholar]
  6. Watling, J.; Reynolds, N. Meatgrinder: Russian Tactics in the Second Year of Its Invasion of Ukraine Special Report; Technical Report; Royal United Services Institute: London, UK, 2023. [Google Scholar]
  7. Milley, M.A.; Schmidt, E. America Isn’t Ready for the Wars of the Future. Foreign Aff. 2024, 103, 26. [Google Scholar]
  8. Ehrgott, M. Multicriteria Optimization, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2005; p. 7. [Google Scholar] [CrossRef]
  9. Fevgas, G.; Lagkas, T.; Argyriou, V.; Sarigiannidis, P. Coverage Path Planning Methods Focusing on Energy Efficient and Cooperative Strategies for Unmanned Aerial Vehicles. Sensors 2022, 22, 1235. [Google Scholar] [CrossRef] [PubMed]
  10. Aggarwal, S.; Kumar, N. Path planning techniques for unmanned aerial vehicles: A review, solutions, and challenges. Comput. Commun. 2020, 149, 270–299. [Google Scholar] [CrossRef]
  11. Jones, M.; Djahel, S.; Welsh, K. Path-Planning for Unmanned Aerial Vehicles with Environment Complexity Considerations: A Survey. ACM Comput. Surv. 2023, 55, 1–39. [Google Scholar] [CrossRef]
  12. Allaire, F.C.J.; Labonté, G.; Tarbouchi, M.; Roberge, V. Recent advances in unmanned aerial vehicles real-time trajectory planning. J. Unmanned Veh. Syst. 2019, 7, 259–295. [Google Scholar] [CrossRef]
  13. Ravankar, A.; Ravankar, A.A.; Kobayashi, Y.; Hoshino, Y.; Peng, C.C. Path Smoothing Techniques in Robot Navigation: State-of-the-Art, Current and Future Challenges. Sensors 2018, 18, 3170. [Google Scholar] [CrossRef]
  14. Song, B.; Qi, G.; Xu, L. A Survey of Three-Dimensional Flight Path Planning for Unmanned Aerial Vehicle. In Proceedings of the 2019 Chinese Control And Decision Conference (CCDC), Nanchang, China, 3–5 June 2019; IEEE: Piscataway, NJ, USA, 2019; Volume 6, pp. 5010–5015. [Google Scholar] [CrossRef]
  15. Ait Saadi, A.; Soukane, A.; Meraihi, Y.; Benmessaoud Gabis, A.; Mirjalili, S.; Ramdane-Cherif, A. UAV Path Planning Using Optimization Approaches: A Survey. Arch. Comput. Methods Eng. 2022, 29, 4233–4284. [Google Scholar] [CrossRef]
  16. Zhao, Y.; Zheng, Z.; Liu, Y. Survey on computational-intelligence-based UAV path planning. Knowl.-Based Syst. 2018, 158, 54–64. [Google Scholar] [CrossRef]
  17. Tricco, A.C.; Lillie, E.; Zarin, W.; O’Brien, K.K.; Colquhoun, H.; Levac, D.; Moher, D.; Peters, M.D.; Horsley, T.; Weeks, L.; et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and Explanation. Ann. Intern. Med. 2018, 169, 467–473. [Google Scholar] [CrossRef]
  18. Haddaway, N.R.; Page, M.J.; Pritchard, C.C.; McGuinness, L.A. PRISMA2020: An R package and Shiny app for producing PRISMA 2020-compliant flow diagrams, with interactivity for optimised digital transparency and Open Synthesis. Campbell Syst. Rev. 2022, 18, e1230. [Google Scholar] [CrossRef]
  19. Miettinen, K. Nonlinear Multiobjective Optimization; International Series in Operations Research & Management Science; Springer: Boston, MA, USA, 1998; Volume 12. [Google Scholar] [CrossRef]
  20. Thiele, L.; Miettinen, K.; Korhonen, P.J.; Molina, J. A Preference-Based Evolutionary Algorithm for Multi-Objective Optimization. Evol. Comput. 2009, 17, 411–436. [Google Scholar] [CrossRef] [PubMed]
  21. Wang, H.; Olhofer, M.; Jin, Y. A mini-review on preference modeling and articulation in multi-objective optimization: Current status and challenges. Complex Intell. Syst. 2017, 3, 233–245. [Google Scholar] [CrossRef]
  22. Tezcaner, D.; Köksalan, M. An Interactive Algorithm for Multi-objective Route Planning. J. Optim. Theory Appl. 2011, 150, 379–394. [Google Scholar] [CrossRef]
  23. Zajac, S.; Huber, S. Objectives and methods in multi-objective routing problems: A survey and classification scheme. Eur. J. Oper. Res. 2021, 290, 1–25. [Google Scholar] [CrossRef]
  24. Afsar, B.; Miettinen, K.; Ruiz, F. Assessing the Performance of Interactive Multiobjective Optimization Methods. ACM Comput. Surv. 2022, 54, 85. [Google Scholar] [CrossRef]
  25. Belton, V.; Branke, J.; Eskelinen, P.; Greco, S.; Molina, J.; Ruiz, F.; Słowiński, R. Interactive Multiobjective Optimization from a Learning Perspective. In Multiobjective Optimization: Interactive and Evolutionary Approaches; Branke, J., Miettinen, K., Slowinski, R., Eds.; Theoretical Computer Science and General: Strasbourg, France, 2008; Volume 5252, pp. 405–433. [Google Scholar] [CrossRef]
  26. Olson, D.L. Review of Empirical Studies in Multiobjective Mathematical Programming: Subject Reflection of Nonlinear Utility and Learning. Decis. Sci. 1992, 23, 1–20. [Google Scholar] [CrossRef]
  27. Khaira, A.; Dwivedi, R. A State of the Art Review of Analytical Hierarchy Process. Mater. Today Proc. 2018, 5, 4029–4035. [Google Scholar] [CrossRef]
  28. Flint, M.; Fernandez-Gaucherand, E.; Polycarpou, M. Cooperative control for UAV’s searching risky environments for targets. In Proceedings of the 42nd IEEE International Conference on Decision and Control (IEEE Cat. No.03CH37475), Maui, HI, USA, 9–12 December 2003; IEEE: Piscataway, NJ, USA, 2003; pp. 3567–3572. [Google Scholar] [CrossRef]
  29. Abhishek, B.; Ranjit, S.; Shankar, T.; Eappen, G.; Sivasankar, P.; Rajesh, A. Hybrid PSO-HSA and PSO-GA algorithm for 3D path planning in autonomous UAVs. SN Appl. Sci. 2020, 2, 1805. [Google Scholar] [CrossRef]
  30. Nash, J.F. The Bargaining Problem. Econometrica 1950, 18, 155. [Google Scholar] [CrossRef]
  31. Kaneko, M.; Nakamura, K. The Nash Social Welfare Function. Econometrica 1979, 47, 423. [Google Scholar] [CrossRef]
  32. Hooker, J.N. Moral Implications of Rational Choice Theories. In Handbook of the Philosophical Foundations of Business Ethics; Springer: Dordrecht, The Netherlands, 2013; pp. 1459–1476. [Google Scholar] [CrossRef]
  33. Brânzei, S.; Gkatzelis, V.; Mehta, R. Nash Social Welfare Approximation for Strategic Agents. Oper. Res. 2022, 70, 402–415. [Google Scholar] [CrossRef]
  34. Charkhgard, H.; Keshanian, K.; Esmaeilbeigi, R.; Charkhgard, P. The magic of Nash social welfare in optimization: Do not sum, just multiply! ANZIAM J. 2022, 64, 119–134. [Google Scholar] [CrossRef]
  35. Narahari, Y. Cooperative Game Theory. In The Two Person Bargaining Problem; Indian Institute of Science: Bangalore, India, 2012. [Google Scholar]
  36. Zhou, L. The Nash Bargaining Theory with Non-Convex Problems. Econometrica 1997, 65, 681. [Google Scholar] [CrossRef]
  37. Huynh, D. Bargaining Games: A Comparison of Nash’s Solution with the Coco-Value; CEREMADE: Paris, France, 2016. [Google Scholar]
  38. Binmore, K.; Rubinstein, A.; Wolinsky, A. The Nash Bargaining Solution in Economic Modelling. RAND J. Econ. 1986, 17, 176. [Google Scholar] [CrossRef]
  39. Dagan, N.; Volij, O.; Winter, E. A characterization of the Nash bargaining solution. Soc. Choice Welf. 2002, 19, 811–823. [Google Scholar] [CrossRef]
  40. Yang, X.S. Multi-Objective Optimization. In Nature-Inspired Optimization Algorithms; Elsevier: Amsterdam, The Netherlands, 2014; pp. 197–211. [Google Scholar] [CrossRef]
  41. Deb, K.; Deb, K. Multi-objective Optimization. In Search Methodologies; Springer: Boston, MA, USA, 2014; pp. 403–449. [Google Scholar] [CrossRef]
  42. Marler, R.T.; Arora, J.S. The weighted sum method for multi-objective optimization: New insights. Struct. Multidiscip. Optim. 2010, 41, 853–862. [Google Scholar] [CrossRef]
  43. Steuer, R.E. Multiple Criteria Optimization; Theory, Computation, and Application; John Wiley: Hoboken, NJ, USA, 1986; p. 425. [Google Scholar]
  44. Kaim, A.; Cord, A.F.; Volk, M. A review of multi-criteria optimization techniques for agricultural land use allocation. Environ. Model. Softw. 2018, 105, 79–93. [Google Scholar] [CrossRef]
  45. Doğan, I.; Lokman, B.; Köksalan, M. Representing the nondominated set in multi-objective mixed-integer programs. Eur. J. Oper. Res. 2022, 296, 804–818. [Google Scholar] [CrossRef]
  46. Erlandsson, T. Route planning for air missions in hostile environments. J. Def. Model. Simul. Appl. Methodol. Technol. 2015, 12, 289–303. [Google Scholar] [CrossRef]
  47. Yan, C.; Xiang, X.; Wang, C. Towards Real-Time Path Planning through Deep Reinforcement Learning for a UAV in Dynamic Environments. J. Intell. Robot. Syst. 2020, 98, 297–309. [Google Scholar] [CrossRef]
  48. Yuksek, B.; Demirezen, U.M.; Inalhan, G. Development of UCAV Fleet Autonomy by Reinforcement Learning in a Wargame Simulation Environment. In Proceedings of the AIAA Scitech 2021 Forum, Online, 11–21 January 2021; Volume 1. [Google Scholar] [CrossRef]
  49. Alpdemir, M.N. Tactical UAV path optimization under radar threat using deep reinforcement learning. Neural Comput. Appl. 2022, 34, 5649–5664. [Google Scholar] [CrossRef]
  50. Zhao, X.; Yang, R.; Zhang, Y.; Yan, M.; Yue, L. Deep Reinforcement Learning for Intelligent Dual-UAV Reconnaissance Mission Planning. Electronics 2022, 11, 2031. [Google Scholar] [CrossRef]
  51. Hayes, C.F.; Rădulescu, R.; Bargiacchi, E.; Källström, J.; Macfarlane, M.; Reymond, M.; Verstraeten, T.; Zintgraf, L.M.; Dazeley, R.; Heintz, F.; et al. A practical guide to multi-objective reinforcement learning and planning. Auton. Agents Multi-Agent Syst. 2022, 36, 26. [Google Scholar] [CrossRef]
  52. Moffaert, K.V.; Nowé, A. Multi-Objective Reinforcement Learning using Sets of Pareto Dominating Policies. J. Mach. Learn. Res. 2014, 15, 3663–3692. [Google Scholar]
  53. Vamplew, P.; Yearwood, J.; Dazeley, R.; Berry, A. On the Limitations of Scalarisation for Multi-objective Reinforcement Learning of Pareto Fronts. In AI 2008: Advances in Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2008; Volume 5360, pp. 372–378. [Google Scholar] [CrossRef]
  54. Nguyen, T.T.; Nguyen, N.D.; Vamplew, P.; Nahavandi, S.; Dazeley, R.; Lim, C.P. A multi-objective deep reinforcement learning framework. Eng. Appl. Artif. Intell. 2020, 96, 103915. [Google Scholar] [CrossRef]
  55. Fonseca, C.; Fleming, P. Multiobjective optimization and multiple constraint handling with evolutionary algorithms. I. A unified formulation. IEEE Trans. Syst. Man, Cybern. Part A Syst. Humans 1998, 28, 26–37. [Google Scholar] [CrossRef]
  56. Yang, P.; Tang, K.; Lozano, J.A. Estimation of Distribution Algorithms based Unmanned Aerial Vehicle path planner using a new coordinate system. In Proceedings of the 2014 IEEE Congress on Evolutionary Computation (CEC), Beijing, China, 6–11 June 2014; IEEE: Piscataway, NJ, USA, 2014; Volume 7, pp. 1469–1476. [Google Scholar] [CrossRef]
  57. de la Cruz, J.M.; Besada-Portas, E.; Torre-Cubillo, L.; Andres-Toro, B.; Lopez-Orozco, J.A. Evolutionary path planner for UAVs in realistic environments. In Proceedings of the 10th Annual Conference on Genetic and Evolutionary Computation, Atlanta, GA, USA, 12–16 July 2008; Volume 7, pp. 1477–1484. [Google Scholar] [CrossRef]
  58. Besada-Portas, E.; de la Torre, L.; de la Cruz, J.M.; de Andres-Toro, B. Evolutionary Trajectory Planner for Multiple UAVs in Realistic Scenarios. IEEE Trans. Robot. 2010, 26, 619–634. [Google Scholar] [CrossRef]
  59. Ogren, P.; Backlund, A.; Harryson, T.; Kristensson, L.; Stensson, P. Autonomous UCAV Strike Missions Using Behavior Control Lyapunov Functions. In Proceedings of the AIAA Guidance, Navigation, and Control Conference and Exhibit, Keystone, CO, USA, 21–24 August 2006; Volume 8. [Google Scholar] [CrossRef]
  60. Kabamba, P.T.; Meerkov, S.M.; Zeitz, F.H. Optimal Path Planning for Unmanned Combat Aerial Vehicles to Defeat Radar Tracking. J. Guid. Control. Dyn. 2006, 29, 279–288. [Google Scholar] [CrossRef]
  61. Savkin, A.V.; Huang, H. Optimal Aircraft Planar Navigation in Static Threat Environments. IEEE Trans. Aerosp. Electron. Syst. 2017, 53, 2413–2426. [Google Scholar] [CrossRef]
  62. Dogan, A. Probabilistic Path Planning for UAVs. In Proceedings of the 2nd AIAA “Unmanned Unlimited” Conference and Workshop & Exhibit, San Diego, CA, USA, 15–18 September 2003; Volume 9. [Google Scholar] [CrossRef]
  63. Chaudhry, A.; Misovec, K.; D’Andrea, R. Low observability path planning for an unmanned air vehicle using mixed integer linear programming. In Proceedings of the 2004 43rd IEEE Conference on Decision and Control (CDC) (IEEE Cat. No.04CH37601), Nassau, Bahamas, 14–17 December 2004; IEEE: Piscataway, NJ, USA, 2004; pp. 3823–3829. [Google Scholar] [CrossRef]
  64. Ogren, P.; Winstrand, M. Combining Path Planning and Target Assignment to Minimize Risk in SEAD Missions. In Proceedings of the AIAA Guidance, Navigation, and Control Conference and Exhibit, San Francisco, CA, USA, 15–18 August 2005; Volume 8. [Google Scholar] [CrossRef]
  65. Zabarankin, M.; Uryasev, S.; Murphey, R. Aircraft routing under the risk of detection. Nav. Res. Logist. (NRL) 2006, 53, 728–747. [Google Scholar] [CrossRef]
  66. Romero, C. Handbook of Critical Issues in Goal Programming, 1st ed.; Elsevier: Amsterdam, The Netherlands, 2014. [Google Scholar]
  67. Simon, H.A. On the Concept of Organizational Goal. Adm. Sci. Q. 1964, 9, 1. [Google Scholar] [CrossRef]
  68. Simon, H.A. Rational choice and the structure of the environment. Psychol. Rev. 1956, 63, 129–138. [Google Scholar] [CrossRef] [PubMed]
  69. Jones, D.; Tamiz, M. Practical Goal Programming; International Series in Operations Research & Management Science; Springer: Boston, MA, USA, 2010; Volume 141. [Google Scholar] [CrossRef]
  70. Hannan, E.L. An assessment of some criticisms of goal programming. Comput. Oper. Res. 1985, 12, 525–541. [Google Scholar] [CrossRef]
  71. Jones, D.; Tamiz, M. Detection and Restoration of Pareto Inefficiency. In Practical Goal Programming; International Series in Operations Research & Management Science; Hillier, F.S., Ed.; Springer: Boston, MA, USA, 2010; Volume 141, pp. 95–110. [Google Scholar]
  72. Lamont, G.B.; Slear, J.N.; Melendez, K. UAV Swarm Mission Planning and Routing using Multi-Objective Evolutionary Algorithms. In Proceedings of the 2007 IEEE Symposium on Computational Intelligence in Multi-Criteria Decision-Making, Honolulu, HI, USA, 1–5 April 2007; IEEE: Piscataway, NJ, USA, 2007; Volume 4, pp. 10–20. [Google Scholar] [CrossRef]
  73. Zhenhua, W.; Weiguo, Z.; Jingping, S.; Ying, H. UAV route planning using Multiobjective Ant Colony System. In Proceedings of the 2008 IEEE Conference on Cybernetics and Intelligent Systems, Chengdu, China, 21–24 September 2008; IEEE: Piscataway, NJ, USA, 2008; Volume 9, pp. 797–800. [Google Scholar] [CrossRef]
  74. Yahia, H.S.; Mohammed, A.S. Path planning optimization in unmanned aerial vehicles using meta-heuristic algorithms: A systematic review. Environ. Monit. Assess. 2023, 195, 30. [Google Scholar] [CrossRef]
  75. Jiang, Y.; Xu, X.X.; Zheng, M.Y.; Zhan, Z.H. Evolutionary computation for unmanned aerial vehicle path planning: A survey. Artif. Intell. Rev. 2024, 57, 267. [Google Scholar] [CrossRef]
  76. Dasdemir, E.; Batta, R.; Köksalan, M.; Tezcaner Öztürk, D. UAV routing for reconnaissance mission: A multi-objective orienteering problem with time-dependent prizes and multiple connections. Comput. Oper. Res. 2022, 145, 105882. [Google Scholar] [CrossRef]
  77. Tezcaner Öztürk, D.; Köksalan, M. Biobjective route planning of an unmanned air vehicle in continuous space. Transp. Res. Part B Methodol. 2023, 168, 151–169. [Google Scholar] [CrossRef]
  78. Handler, G.Y.; Zang, I. A dual algorithm for the constrained shortest path problem. Networks 1980, 10, 293–309. [Google Scholar] [CrossRef]
  79. Sáez-Aguado, J.; Trandafir, P.C. Variants of the ϵ -constraint method for biobjective integer programming problems: Application to p-median-cover problems. Math. Methods Oper. Res. 2018, 87, 251–283. [Google Scholar] [CrossRef]
  80. Mavrotas, G. Effective implementation of the ϵ-constraint method in Multi-Objective Mathematical Programming problems. Appl. Math. Comput. 2009, 213, 455–465. [Google Scholar] [CrossRef]
  81. Deb, K.; Miettinen, K. A Review of Nadir Point Estimation Procedures Using Evolutionary Approaches: A Tale of Dimensionality Reduction. Technical Report. In Proceedings of the Multiple Criterion Decision Making (MCDM-2008) Conference, Chengdu, China, 22–26 June 2009. [Google Scholar]
  82. Laumanns, M.; Thiele, L.; Zitzler, E. An efficient, adaptive parameter variation scheme for metaheuristics based on the epsilon-constraint method. Eur. J. Oper. Res. 2006, 169, 932–942. [Google Scholar] [CrossRef]
  83. Xin, B.; Chen, L.; Chen, J.; Ishibuchi, H.; Hirota, K.; Liu, B. Interactive Multiobjective Optimization: A Review of the State-of-the-Art. IEEE Access 2018, 6, 41256–41279. [Google Scholar] [CrossRef]
  84. Miettinen, K.; Ruiz, F.; Wierzbicki, A.P.; Jaszkiewicz, A.; Słowiński, R. Introduction to Multiobjective Optimization: Interactive Approaches. In Multiobjective Optimization, LNCS 5252; Springer: Berlin/Heidelberg, Germany, 2008; pp. 27–57. [Google Scholar]
  85. Dasdemir, E.; Köksalan, M.; Tezcaner Öztürk, D. A flexible reference point-based multi-objective evolutionary algorithm: An application to the UAV route planning problem. Comput. Oper. Res. 2020, 114, 104811. [Google Scholar] [CrossRef]
  86. Larichev, O.I. Cognitive validity in design of decision-aiding techniques. J. -Multi-Criteria Decis. Anal. 1992, 1, 127–138. [Google Scholar] [CrossRef]
  87. Luque, M.; Miettinen, K.; Eskelinen, P.; Ruiz, F. Incorporating preference information in interactive reference point methods for multiobjective optimization. Omega 2009, 37, 450–462. [Google Scholar] [CrossRef]
  88. Bechikh, S.; Kessentini, M.; Said, L.B.; Ghédira, K. Preference Incorporation in Evolutionary Multiobjective Optimization; Elsevier: Amsterdam, The Netherlands, 2015; pp. 141–207. [Google Scholar] [CrossRef]
  89. Nikulin, Y.; Miettinen, K.; Mäkelä, M.M. A new achievement scalarizing function based on parameterization in multiobjective optimization. OR Spectr. 2012, 34, 69–87. [Google Scholar] [CrossRef]
  90. Deb, K.; Sundar, J. Reference point based multi-objective optimization using evolutionary algorithms. In Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation, Seattle, WA, USA, 8–12 July 2006; Volume 7, pp. 635–642. [Google Scholar] [CrossRef]
  91. Allmendinger, R.; Li, X.; Branke, J. Reference Point-Based Particle Swarm Optimization Using a Steady-State Approach. In Proceedings of the Simulated Evolution and Learning, Melbourne, Australia, 7–10 December 2008; pp. 200–209. [Google Scholar] [CrossRef]
  92. Deb, K.; Jain, H. An Evolutionary Many-Objective Optimization Algorithm Using Reference-Point-Based Nondominated Sorting Approach, Part I: Solving Problems With Box Constraints. IEEE Trans. Evol. Comput. 2014, 18, 577–601. [Google Scholar] [CrossRef]
  93. Talukder, A.K.A.; Deb, K. PaletteViz: A Visualization Method for Functional Understanding of High-Dimensional Pareto-Optimal Data-Sets to Aid Multi-Criteria Decision Making. IEEE Comput. Intell. Mag. 2020, 15, 36–48. [Google Scholar] [CrossRef]
  94. Pan, Y.; Li, R.; Da, X.; Hu, H.; Zhang, M.; Zhai, D.; Cumanan, K.; Dobre, O.A. Cooperative Trajectory Planning and Resource Allocation for UAV-Enabled Integrated Sensing and Communication Systems. IEEE Trans. Veh. Technol. 2024, 73, 6502–6516. [Google Scholar] [CrossRef]
  95. Liu, S.; Lin, Z.; Wang, Y.; Huang, W.; Yan, B.; Li, Y. Three-body cooperative active defense guidance law with overload constraints: A small speed ratio perspective. Chin. J. Aeronaut. 2024, 38, 103171. [Google Scholar] [CrossRef]
  96. Zhang, W.; Zhang, B. Improvement of UAV Track Trajectory Algorithm Based on Ant Colony Algorithm. In Proceedings of the 2019 International Conference on Intelligent Transportation, Big Data & Smart City (ICITBS), Changsha, China, 12–13 January 2019; IEEE: Piscataway, NJ, USA, 2019; Volume 1, pp. 28–31. [Google Scholar] [CrossRef]
  97. Kan, E.M.; Lim, M.H.; Yeo, S.P.; Ho, J.S.; Shao, Z. Contour Based Path Planning with B-Spline Trajectory Generation for Unmanned Aerial Vehicles (UAVs) over Hostile Terrain. J. Intell. Learn. Syst. Appl. 2011, 3, 122–130. [Google Scholar] [CrossRef]
  98. Lei, L.; Shiru, Q. Path Planning For Unmanned Air Vehicles Using An Improved Artificial Bee Colony Algorithm. In Proceedings of the 31st Chinese Control Conference, Hefei, China, 25–27 July 2012; pp. 2486–2491. [Google Scholar]
  99. Zhou, S.; Wang, J.; Jin, Y. Route Planning for Unmanned Aircraft Based on Ant Colony Optimization and Voronoi Diagram. In Proceedings of the 2012 Second International Conference on Intelligent System Design and Engineering Application, Sanya, China, 6–7 January 2012; IEEE: Piscataway, NJ, USA, 2012; Volume 1, pp. 732–735. [Google Scholar] [CrossRef]
  100. Leng, S.; Sun, H. UAV Path Planning in 3D Complex Environments Using Genetic Algorithms. In Proceedings of the 2021 33rd Chinese Control and Decision Conference (CCDC), Kunming, China, 22–24 May 2021; IEEE: Piscataway, NJ, USA, 2021; Volume 5, pp. 1324–1330. [Google Scholar] [CrossRef]
  101. Jongrae, K.; Hespanha, J. Discrete approximations to continuous shortest-path: Application to minimum-risk path planning for groups of UAVs. In Proceedings of the 42nd IEEE International Conference on Decision and Control (IEEE Cat. No.03CH37475), Maui, HI, USA, 9–12 December 2003; IEEE: Piscataway, NJ, USA, 2003; Volume 2, pp. 1734–1740. [Google Scholar] [CrossRef]
  102. Misovec, K.; Inanc, T.; Wohletz, J.; Murray, R. Low-observable nonlinear trajectory generation for unmanned air vehicles. In Proceedings of the 42nd IEEE International Conference on Decision and Control (IEEE Cat. No.03CH37475), Maui, HI, USA, 9–12 December 2003; IEEE: Piscataway, NJ, USA, 2003; Volume 3, pp. 3103–3110. [Google Scholar] [CrossRef]
103. Yao-hong, Q.; Quan, P.; Jian-guon, Y. Flight path planning of UAV based on heuristically search and genetic algorithms. In Proceedings of the 31st Annual Conference of IEEE Industrial Electronics Society (IECON 2005), Raleigh, NC, USA, 6–10 November 2005; IEEE: Piscataway, NJ, USA, 2005; p. 5. [Google Scholar] [CrossRef]
  104. McLain, T.W.; Beard, R.W. Coordination Variables, Coordination Functions, and Cooperative Timing Missions. J. Guid. Control. Dyn. 2005, 28, 150–161. [Google Scholar] [CrossRef]
  105. Changwen, Z.; Lei, L.; Fanjiang, X.; Fuchun, S.; Mingyue, D. Evolutionary route planner for unmanned air vehicles. IEEE Trans. Robot. 2005, 21, 609–620. [Google Scholar] [CrossRef]
  106. Weiss, B.; Naderhirn, M.; del Re, L. Global real-time path planning for UAVs in uncertain environment. In Proceedings of the 2006 IEEE Conference on Computer Aided Control System Design, 2006 IEEE International Conference on Control Applications, 2006 IEEE International Symposium on Intelligent Control, Munich, Germany, 4–6 October 2006; IEEE: Piscataway, NJ, USA, 2006; Volume 10, pp. 2725–2730. [Google Scholar] [CrossRef]
  107. Foo, J.L.; Knutzon, J.; Oliver, J.; Winer, E. Three-Dimensional Path Planning of Unmanned Aerial Vehicles Using Particle Swarm Optimization. In Proceedings of the 11th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Portsmouth, VA, USA, 6–8 September 2006; Volume 9. [Google Scholar] [CrossRef]
  108. Zhang, R.; Zheng, C.; Yan, P. Route Planning for Unmanned Air Vehicles with Multiple Missions Using an Evolutionary Algorithm. In Proceedings of the Third International Conference on Natural Computation (ICNC 2007), Haikou, China, 24–27 August 2007; IEEE: Piscataway, NJ, USA, 2007; pp. 23–28. [Google Scholar] [CrossRef]
  109. Xia, L.; Jun, X.; Manyi, C.; Ming, X.; Zhike, W. Path planning for UAV based on improved heuristic A* algorithm. In Proceedings of the 2009 9th International Conference on Electronic Measurement & Instruments, Beijing, China, 16–19 August 2009; IEEE: Piscataway, NJ, USA, 2009; Volume 8, pp. 3–488. [Google Scholar] [CrossRef]
110. Tulum, K.; Durak, U.; Yder, S.K. Situation aware UAV mission route planning. In Proceedings of the 2009 IEEE Aerospace Conference, Big Sky, MT, USA, 7–14 March 2009; IEEE: Piscataway, NJ, USA, 2009; Volume 3, pp. 1–12. [Google Scholar] [CrossRef]
  111. Qianzhi, M.; Xiujuan, L. Application of artificial fish school algorithm in UCAV path planning. In Proceedings of the 2010 IEEE Fifth International Conference on Bio-Inspired Computing: Theories and Applications (BIC-TA), Changsha, China, 23–26 September 2010; IEEE: Piscataway, NJ, USA, 2010; Volume 9, pp. 555–559. [Google Scholar] [CrossRef]
  112. Wang, G.; Li, Q.; Guo, L. Multiple UAVs Routes Planning Based on Particle Swarm Optimization Algorithm. In Proceedings of the 2010 2nd International Symposium on Information Engineering and Electronic Commerce, Ternopil, Ukraine, 23–25 July 2010; IEEE: Piscataway, NJ, USA, 2010; Volume 7, pp. 1–5. [Google Scholar] [CrossRef]
113. Chao, Z.; Zhen, Z.; Daobo, W.; Meng, L. UAV path planning method based on ant colony optimization. In Proceedings of the 2010 Chinese Control and Decision Conference, Xuzhou, China, 26–28 May 2010; IEEE: Piscataway, NJ, USA, 2010; Volume 5, pp. 3790–3792. [Google Scholar] [CrossRef]
114. Xin, Y.; Ming-yue, D.; Cheng-ping, Z. Fast Marine Route Planning for UAV Using Improved Sparse A* Algorithm. In Proceedings of the 2010 Fourth International Conference on Genetic and Evolutionary Computing, Shenzhen, China, 13–15 December 2010; IEEE: Piscataway, NJ, USA, 2010; Volume 12, pp. 190–193. [Google Scholar] [CrossRef]
115. Yong, B.; Xiaowei, F.; Xiaoguang, G. Path planning for reconnaissance UAV based on Particle Swarm Optimization. In Proceedings of the 2010 Second International Conference on Computational Intelligence and Natural Computing, Wuhan, China, 13–14 September 2010; IEEE: Piscataway, NJ, USA, 2010; Volume 9, pp. 28–32. [Google Scholar] [CrossRef]
  116. Holub, J.; Foo, J.L.; Kalivarapu, V.; Winer, E. Three Dimensional Multi-Objective UAV Path Planning Using Digital Pheromone Particle Swarm Optimization. In Proceedings of the 53rd AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference 20th AIAA/ASME/AHS Adaptive Structures Conference 14th AIAA, Honolulu, HI, USA, 23–26 April 2012; Volume 4. [Google Scholar] [CrossRef]
  117. Chen, Y.M.; Wu, W.Y.; Rd, Y.T. Cooperative Electronic Attack for Groups of Unmanned Air Vehicles based on Multi-agent Simulation and Evaluation. Int. J. Comput. Sci. Issues 2012, 9, 1. [Google Scholar]
  118. Li, B.; Gong, L.; Zhao, C. Unmanned combat aerial vehicles path planning using a novel probability density model based on Artificial Bee Colony algorithm. In Proceedings of the 2013 Fourth International Conference on Intelligent Control and Information Processing (ICICIP), Beijing, China, 9–11 June 2013; IEEE: Piscataway, NJ, USA, 2013; Volume 6, pp. 620–625. [Google Scholar] [CrossRef]
  119. Wallar, A.; Plaku, E.; Sofge, D.A. A planner for autonomous risk-sensitive coverage (PARCov) by a team of unmanned aerial vehicles. In Proceedings of the 2014 IEEE Symposium on Swarm Intelligence, Orlando, FL, USA, 9–12 December 2014; IEEE: Piscataway, NJ, USA, 2014; Volume 12, pp. 1–7. [Google Scholar] [CrossRef]
  120. Qu, Y.; Zhang, Y.; Zhang, Y. Optimal flight path planning for UAVs in 3-D threat environment. In Proceedings of the 2014 International Conference on Unmanned Aircraft Systems (ICUAS), Orlando, FL, USA, 27–30 May 2014; IEEE: Piscataway, NJ, USA, 2014; Volume 5, pp. 149–155. [Google Scholar] [CrossRef]
  121. Duan, Y.; Ji, X.; Li, M.; Li, Y. Route planning method design for UAV under radar ECM scenario. In Proceedings of the 2014 12th International Conference on Signal Processing (ICSP), Hangzhou, China, 19–23 October 2014; IEEE: Piscataway, NJ, USA, 2014; Volume 10, pp. 108–114. [Google Scholar] [CrossRef]
122. Wang, Q.; Zhang, A.; Qi, L. Three-dimensional path planning for UAV based on improved PSO algorithm. In Proceedings of the 26th Chinese Control and Decision Conference (2014 CCDC), Changsha, China, 31 May–2 June 2014; IEEE: Piscataway, NJ, USA, 2014; Volume 5, pp. 3981–3985. [Google Scholar] [CrossRef]
  123. Xiaowei, F.; Xiaoguang, G. Effective Real-Time Unmanned Air Vehicle Path Planning in Presence of Threat Netting. J. Aerosp. Inf. Syst. 2014, 11, 170–177. [Google Scholar] [CrossRef]
  124. Wen, N.; Zhao, L.; Su, X.; Ma, P. UAV online path planning algorithm in a low altitude dangerous environment. IEEE/CAA J. Autom. Sin. 2015, 2, 173–185. [Google Scholar] [CrossRef]
  125. Zhang, D.; Xian, Y.; Li, J.; Lei, G.; Chang, Y. UAV Path Planning Based on Chaos Ant Colony Algorithm. In Proceedings of the 2015 International Conference on Computer Science and Mechanical Automation (CSMA), Hangzhou, China, 23–25 October 2015; IEEE: Piscataway, NJ, USA, 2015; Volume 10, pp. 81–85. [Google Scholar] [CrossRef]
  126. Humphreys, C.; Cobb, R.; Jacques, D.; Reeger, J. Optimal Mission Path for the Uninhabited Loyal Wingman. In Proceedings of the 16th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Dallas, TX, USA, 22–26 June 2015; Volume 6. [Google Scholar] [CrossRef]
127. Ren, T.; Zhou, R.; Xia, J.; Dong, Z. Three-dimensional path planning of UAV based on an improved A* algorithm. In Proceedings of the 2016 IEEE Chinese Guidance, Navigation and Control Conference (CGNCC), Nanjing, China, 12–14 August 2016; IEEE: Piscataway, NJ, USA, 2016; Volume 8, pp. 140–145. [Google Scholar] [CrossRef]
  128. Wang, Q.; Zhang, J. MPC and TGFC for UAV real-time route planning. In Proceedings of the 2017 36th Chinese Control Conference (CCC), Dalian, China, 26–28 July 2017; IEEE: Piscataway, NJ, USA, 2017; Volume 7, pp. 6847–6850. [Google Scholar] [CrossRef]
  129. Jing-Lin, H.; Xiu-Xia, S.; Ri, L.; Xiong-Feng, D.; Mao-Long, L. UAV real-time route planning based on multi-optimized RRT algorithm. In Proceedings of the 2017 29th Chinese Control And Decision Conference (CCDC), Chongqing, China, 28–30 May 2017; IEEE: Piscataway, NJ, USA, 2017; Volume 5, pp. 837–842. [Google Scholar] [CrossRef]
  130. Zhang, C.; Liu, H.; Tang, Y. Quantitative Evaluation of Voronoi Graph Search Algorithm in UAV Path Planning. In Proceedings of the 2018 IEEE 9th International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 23–25 November 2018; IEEE: Piscataway, NJ, USA, 2018; Volume 11, pp. 563–567. [Google Scholar] [CrossRef]
  131. Maoquan, L.; Yunfei, Z.; Shihao, L. The Gradational Route Planning for Aircraft Stealth Penetration Based on Genetic Algorithm and Sparse A-Star Algorithm. MATEC Web Conf. 2018, 151, 04001. [Google Scholar] [CrossRef]
  132. Danancier, K.; Ruvio, D.; Sung, I.; Nielsen, P. Comparison of Path Planning Algorithms for an Unmanned Aerial Vehicle Deployment Under Threats. IFAC-PapersOnLine 2019, 52, 1978–1983. [Google Scholar] [CrossRef]
  133. Chen, Y.; Wang, S. Flight Parameter Model Based Route Planning Method of UAV Using Stepped-Adaptive Improved Particle Swarm Optimization. In Proceedings of the 2019 5th International Conference on Control, Automation and Robotics (ICCAR), Beijing, China, 19–22 April 2019; IEEE: Piscataway, NJ, USA, 2019; Volume 4, pp. 524–530. [Google Scholar] [CrossRef]
134. Patley, A.; Bhatt, A.; Maity, A.; Das, K.; Ranjan Kumar, S. Modified Particle Swarm Optimization based Path Planning for Multi-UAV Formation. In Proceedings of the AIAA Scitech 2019 Forum, San Diego, CA, USA, 7–11 January 2019; Volume 1. [Google Scholar] [CrossRef]
  135. Ma, N.; Cao, Y.; Wang, X.; Wang, Z.; Sun, H. A Fast path re-planning method for UAV based on improved A* algorithm. In Proceedings of the 2020 3rd International Conference on Unmanned Systems (ICUS), Harbin, China, 27–28 November 2020; IEEE: Piscataway, NJ, USA, 2020; Volume 11, pp. 462–467. [Google Scholar] [CrossRef]
  136. Zhang, Z.; Wu, J.; Dai, J.; He, C. A Novel Real-Time Penetration Path Planning Algorithm for Stealth UAV in 3D Complex Dynamic Environment. IEEE Access 2020, 8, 122757–122771. [Google Scholar] [CrossRef]
  137. Wu, C.; Huang, X.; Luo, Y.; Leng, S. An Improved Fast Convergent Artificial Bee Colony Algorithm for Unmanned Aerial Vehicle Path Planning in Battlefield Environment. In Proceedings of the 2020 IEEE 16th International Conference on Control & Automation (ICCA), Sapporo, Japan, 9–11 October 2020; IEEE: Piscataway, NJ, USA, 2020; Volume 10, pp. 360–365. [Google Scholar] [CrossRef]
  138. Xiong, C.; Xin, B.; Guo, M.; Ding, Y.; Zhang, H. Multi-UAV 3D Path Planning in Simultaneous Attack. In Proceedings of the 2020 IEEE 16th International Conference on Control & Automation (ICCA), Sapporo, Japan, 9–11 October 2020; IEEE: Piscataway, NJ, USA, 2020; Volume 10, pp. 500–505. [Google Scholar] [CrossRef]
  139. Zhang, Z.; Wu, J.; Dai, J.; He, C. Rapid Penetration Path Planning Method for Stealth UAV in Complex Environment with BB Threats. Int. J. Aerosp. Eng. 2020, 2020, 8896357. [Google Scholar] [CrossRef]
  140. Zhou, X.; Gao, F.; Fang, X.; Lan, Z. Improved Bat Algorithm for UAV Path Planning in Three-Dimensional Space. IEEE Access 2021, 9, 20100–20116. [Google Scholar] [CrossRef]
  141. Zhang, J.; Li, N.; Zhang, D.; Wei, X.; Zhang, X. Multi-UAV cooperative Route planning based on decision variables and improved genetic algorithm. J. Phys. Conf. Ser. 2021, 1941, 012012. [Google Scholar] [CrossRef]
  142. Xu, H.; Jiang, S.; Zhang, A. Path Planning for Unmanned Aerial Vehicle Using a Mix-Strategy-Based Gravitational Search Algorithm. IEEE Access 2021, 9, 57033–57045. [Google Scholar] [CrossRef]
  143. Huan, L.; Ning, Z.; Qiang, L. UAV Path Planning Based on an Improved Ant Colony Algorithm. In Proceedings of the 2021 4th International Conference on Intelligent Autonomous Systems (ICoIAS), Wuhan, China, 14–16 May 2021; IEEE: Piscataway, NJ, USA, 2021; Volume 5, pp. 357–360. [Google Scholar] [CrossRef]
  144. Woo, J.W.; Choi, Y.S.; An, J.Y.; Kim, C.J. An Approach to Air-to-Surface Mission Planner on 3D Environments for an Unmanned Combat Aerial Vehicle. Drones 2022, 6, 20. [Google Scholar] [CrossRef]
  145. Lu, L.; Dai, J.; Ying, J. Distributed multi-UAV cooperation for path planning by an NTVPSO-ADE algorithm. In Proceedings of the 2022 41st Chinese Control Conference (CCC), Hefei, China, 25–27 July 2022; IEEE: Piscataway, NJ, USA, 2022; Volume 7, pp. 5973–5978. [Google Scholar] [CrossRef]
  146. Zhang, Z.; Wu, J.; Dai, J.; He, C. Optimal path planning with modified A-Star algorithm for stealth unmanned aerial vehicles in 3D network radar environment. Proc. Inst. Mech. Eng. Part G J. Aerosp. Eng. 2022, 236, 72–81. [Google Scholar] [CrossRef]
  147. Zhang, Z.; Jiang, J.; Wu, J.; Zhu, X. Efficient and optimal penetration path planning for stealth unmanned aerial vehicle using minimal radar cross-section tactics and modified A-Star algorithm. ISA Trans. 2023, 134, 42–57. [Google Scholar] [CrossRef]
  148. Luo, J.; Liang, Q.; Li, H. UAV penetration mission path planning based on improved holonic particle swarm optimization. J. Syst. Eng. Electron. 2023, 34, 197–213. [Google Scholar] [CrossRef]
Figure 1. Illustrative example of a simplified bi-objective military route planning problem relating the decision variable space to the objective space for two different Pareto-optimal solutions.
Figure 2. Flow chart [18] of the paper selection process.
Figure 3. Four different approaches to preference handling.
Figure 4. Preference handling classes employed in the literature.
Figure 5. Performance evaluation using preference information.
Figure 6. Geometric interpretation of the Nash Product.
Figure 7. Effect of changing the disagreement point.
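For reference, the Nash product illustrated in Figures 6 and 7 is commonly stated (in our own illustrative notation, not taken from any single surveyed paper) as $\max_{x \in X} \prod_{i=1}^{k} \bigl(d_i - f_i(x)\bigr)$ subject to $f_i(x) \leq d_i$ for all $i$, where $d$ is the disagreement point; Figure 7 depicts how moving $d$ changes which Pareto-optimal solution maximizes this product.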
Figure 8. A-priori preference handling methods in UAV path planning.
Figure 9. Geometric interpretation of the Weighted Sum method on convex and non-convex feasible sets.
Figure 10. Normalization of objective values before weight specification.
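A common way to realize the normalization sketched in Figure 10 is a min–max scaling of each objective onto $[0,1]$ using its ideal and nadir values (our notation, for illustration only): $\bar{f}_i(x) = \frac{f_i(x) - z_i^{\mathrm{ideal}}}{z_i^{\mathrm{nadir}} - z_i^{\mathrm{ideal}}}$, so that weights specified afterwards act on objectives of comparable magnitude.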
Figure 11. Geometric interpretation of the ϵ-constraint method on a non-convex Pareto front.
Table 1. Focus of existing surveys.
References | Year | OA | RR | OM | C | PH
Zhao et al. [16] | 2018
Song et al. [14] | 2019
Allaire et al. [12] | 2019
Aggarwal & Kumar [10] | 2020
Saadi et al. [15] | 2022
Jones et al. [11] | 2023
OA: Optimization Algorithm, RR: Route Representation, OM: Objective Modeling, C: Constraints, PH: Preference Handling; ✔: Concept covered, ∼: Concept introduced, ✗: Concept not covered.
Table 2. Inclusion and exclusion criteria of the paper selection.
Inclusion Criteria | Exclusion Criteria
Significant contribution to research (average yearly citations ≥ 1) | Low academic impact (average yearly citations < 1)
Threat as optimization objective | Completely avoid or ignore threats
Handle multiple objectives | Sole focus on threat or distance minimization
Route planning in the flight domain | No route planning or different domain
Sufficient explanation of methodology | No details
Propose novel work | Minor change from existing work
Table 3. Summary of Preference Handling Techniques.
Method | Class | Complexity | Number of Solutions | Optimality | Ease of Preference Specification | Preference Model
Nash Product | No-preference | 1 SOP | Single solution | Pareto optimal | Not applicable | Equal importance of objectives; favors balanced solutions
Weighted Sum/Reward Shaping | a-priori | 1 SOP | Single solution | (Weakly) Pareto optimal | Hard to provide weights and potentially misleading results | Overly simplistic; linear aggregation of preferences; favors extreme solutions
Priorities | a-priori | k SOPs | Single solution | Pareto optimal | Easy to provide an ordering, but questionable whether one exists | Imposes an absolute ordering on objectives, with very strong implications
Goals (reference points) | a-priori | 1 SOP | Single solution | Not Pareto optimal | Specifying goals is cognitively valid, but requires information on the objective space range and trade-offs, which is usually not available a-priori | Distance to reference point
ϵ-constraint | a-posteriori | n SOPs | Entire Pareto front | Pareto optimal | Hard to select the final solution from the Pareto front if it is high-dimensional | Not applicable
Interactive Reference Point | Interactive | m SOPs | ROI of Pareto front | Not Pareto optimal | Cognitively valid, and interaction allows the required information to be provided | Distance to reference point
SOP = Single-objective Optimization Problem, k = number of objectives, n = number of steps in constraints, m = number of iterations
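To make the rows of Table 3 concrete, the scalarizations underlying the main techniques can be written as single-objective problems over the feasible set $X$. These are standard formulations; the symbols $f_i$, $w_i$, $\epsilon_j$, $\bar{z}$ and $\rho$ are our own illustrative notation and are not taken from any single surveyed paper. Weighted sum: $\min_{x \in X} \sum_{i=1}^{k} w_i f_i(x)$, with $w_i \geq 0$ and $\sum_{i=1}^{k} w_i = 1$. ϵ-constraint: $\min_{x \in X} f_\ell(x)$ subject to $f_j(x) \leq \epsilon_j$ for all $j \neq \ell$; repeating this for a grid of $\epsilon$ values traces out the Pareto front. Reference point (augmented achievement scalarizing function): $\min_{x \in X} \max_{i=1,\ldots,k} w_i \bigl(f_i(x) - \bar{z}_i\bigr) + \rho \sum_{i=1}^{k} \bigl(f_i(x) - \bar{z}_i\bigr)$, where $\bar{z}$ is the DM's aspiration (goal) vector and $\rho > 0$ is a small augmentation coefficient.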