2.3. Path Optimizer
The path optimizer in this study has the same structure as the EM planner, so it is only briefly described here. For more detail, please refer to [
30]. The personalization of the path optimizer will be introduced in
Section 2.6. In general, the path optimization aims to obtain a smooth path represented as
for every candidate lane. In the E-step, the “Last Cycle Trajectory” (labeled in
Figure 1) of the ego vehicle and the predicted trajectories of obstacles are projected on the SL plane, and the relation between the ego vehicle and the obstacles will be evaluated at each time point, as depicted in
Figure 3. For motion prediction of the obstacles, the advanced methods are mainly based on machine learning, such as CNN-LSTM-based [
31] and Transformer-based [
32] methods, which take a sequence of historical states as the input of the deep neural networks.
Once the station coordinates of the ego vehicle’s bounding box intersect those of the obstacles’ bounding box given a specific time, the overlap regions will be marked on the station-lateral plane, e.g., the red region in
Figure 3.
In the M-step, the DP path search provides a rough path solution with a feasible tunnel, with which the spline-based QP path planning generates a smooth path. Specifically, during the DP path search, candidate paths for a specific lane are generated piecewise based on quintic polynomials.
Figure 4 depicts the candidate paths generated in parallel for each lane.
Then, the candidate paths are evaluated by a cost functional, which is a linear combination of smoothness, obstacle avoidance, guidance line, and road boundary cost functionals [
30]:
Here, the smoothness functional of a given path is defined as
where
, and
correspond to the heading difference between the lane and the ego vehicle, a term related to the curvature of the path, and a term associated with the derivative of the curvature, respectively. The obstacle cost functional is defined as
where
denotes the bounding box distance between the obstacle and ego vehicle at station
in a sequence of fixed station coordinates
.
is a monotonically decreasing function.
is the safety distance, while
is the nudge range.
denotes a high collision cost. The guidance line cost is measured as
to lead the vehicle to follow the centerline of the lane
. The road boundary cost is set to a high value if the bounding box of the ego vehicle intersects the road boundary. By applying the DP algorithm, a candidate path with the lowest cost can be obtained, named the DP path.
With the DP path, a feasible tunnel for QP path planning is generated. The spline-based QP path planning aims to further smooth the DP path within the feasible tunnel. To reserve the smoothness and flexibility of the path, the EM planner adopts the smoothing spline function:
where
,
, is an
m-th degree polynomial function with the coefficient
, and
are the spline knots. The parameter vector of the smoothing spline function is denoted as
. The QP path planning attempts to find a smoothing spline function
that optimizes the quadratic objective functional:
subject to linear constraints
, where
is absolutely continuous and
, and
are weights. In detail,
where
is the DP path;
describes the distance to the DP path; and
,
, and
are related to the smoothness of
f. The objective functional describes the balance between obstacle avoidance and smoothness. Additionally, the linear constraints include tunnel boundary constraints and smoothness constraints. The tunnel boundary constraints require that the bounding box of the ego vehicle is constrained within the feasible tunnel, which reduces the search space and ensures the convexity of QP. The smoothness constraints require that the polynomials of the smoothing spline are joined at spline knots with up to
m-th order derivative matching.
By solving the QP problem, a smooth path is obtained, as shown in
Figure 5. The solid blue line is the QP path, while the DP path is represented by the dashed blue line. The orange region depicts the feasible tunnel.
2.4. Personalized Speed Optimizer
Car following, which is one of the most common driving modes, reflects driver characteristics in the longitudinal direction. It refers to following the same preceding vehicle without changing lanes. The personalization of a speed optimizer requires the resulting longitudinal motion during car following to match the desired clearance, speed, and acceleration of a specific human driver. Therefore, the proposed personalized speed optimizer, consisting of a speed decider and a QP speed planner, is developed using the car-following segments of the collected human driving data. Since we aim to achieve humanlike driving with characteristics in both longitudinal and lateral directions in the EM framework, the goal of speed optimization is to generate a speed profile (the s-coordinate of the center of the rear axle as a function of time) to be combined with the QP path .
There are typically three types of driver characteristics in the longitudinal direction: (1) desired clearance: a driver tries to keep a desired distance between his car and the preceding car; (2) tolerance for acceleration and deceleration: different drivers have different tolerances; and (3) dynamic car-following characteristics: the relation between the acceleration adopted by the driver and the intervehicle states (e.g., clearance, speed of the ego vehicle). In this study, the desired clearance characteristics are modeled by the personalized speed decider, while the other two are modeled by the personalized QP speed planner.
Specifically, the desired clearance characteristics are described by the quadratic desired clearance (QDC) model [
33]:
where
denotes the desired clearance;
v is the speed of the ego vehicle; and the constant coefficients
a,
b, and
c are identified by the least square method. Based on the QDC model, the personalized speed decision-making algorithm, as summarized in Algorithm 1, is proposed to provide a reference speed profile for personalized QP speed planning. First, with all the needed information about the ego vehicle and obstacles transferred from the data center, we calculate the predicted relative lateral distance between the ego vehicle and each obstacle. Second, the speed profiles of the preceding obstacles are extracted to mark the obstacle regions and the preliminary feasible region in the station-time plane as the red regions and the orange region shown in
Figure 6, respectively. Next, given the dimensions of the ego vehicle and the minimum safety clearance (2 m in this study, less than which is considered a collision), the upper and lower bounds for a speed profile in each following segment can be calculated. The final bounds are obtained by connecting the bounds of the adjacent segments, as represented by the solid green lines in
Figure 6. Finally, the personalized speed decision
is calculated by subtracting the desired clearance from the upper bound
while satisfying
, as represented by the dashed green line in
Figure 6.
Algorithm 1: Personalized speed decision making |
- Input:
T: planning horizon; : the dimensions and predicted trajectories of the detected obstacles; : the dimensions of the ego vehicle; : the QP path generated in Section 2.3; : last cycle speed profile; : the desired speed set by the driver; the QDC model for the driver. - Output:
: lower and upper bounds for a speed profile; : personalized speed decision.
- 1
Calculate the predicted relative lateral distance between the ego vehicle and each obstacle: ; - 2
Extract the speed profiles of the preceding obstacles , according to which the obstacle regions and the preliminary feasible region are marked in the station-time plane as the red regions and the orange region shown in Figure 6, respectively; - 3
Calculate the upper and lower bounds, , for a speed profile in each following segment considering the dimensions of the ego vehicle and the minimum safety clearance:
where , denote the station of the rear edge of the preceding vehicle at time point t, the station of the front edge of the vehicle behind at time point t, and the minimum safety clearance, respectively. , , and are the front overhang, wheelbase, and rear overhang of the ego vehicle, respectively; - 4
Connect the bounds of the adjacent segments, as shown in Figure 6 by the solid green lines; - 5
Calculate the desired stations based on the QDC model:
where is the planned speed profile of the ego vehicle from the last planning cycle, and is the period of motion planning, as shown in Figure 6 by the dashed green line.
|
After the personalized speed decision making, a speed profile with the desired clearance characteristics of the driver is obtained. As for the tolerance for acceleration and deceleration, although constraining the
and
is a direct and effective way [
34], it is not considered in this study because the needed acceleration and deceleration under extreme conditions may not meet the driver’s preference. However, modeling the dynamic car-following characteristics is more difficult. Personalized speed profiles should possess the driver’s characteristics in the longitudinal direction and smoothness. Considering that the MLCF model not only mimics drivers’ longitudinal behaviors well but also has low computational cost [
9,
35], a personalized QP speed planning method based on the MLCF model is proposed to generate personalized smooth speed profiles. Mathematically, the QP speed planning aims to find a smoothing spline
defined as Equation (
5) that minimizes the quadratic objective functional that has the same form as Equation (
6) as well as the domain
of parameter vector
, subject to linear constraints
. Here,
where
is the personalized speed decision,
describes the distance to the desired station, and
, and
are related to the smoothness of
. Note that
is related to speed, which should not be minimized, so
. The linear constraints consist of boundary constraints and smoothness constraints. The boundary constraints include
where the first constraint forbids the vehicle from reversing, and
and
are the lower and upper bounds for a speed profile obtained through the personalized speed decision making. The last three constraints are related to traffic regulations and vehicle dynamics constraints, where
,
,
,
, and
are the corresponding bounds for speed, acceleration, and jerk. The determination of the above bounds involves analysis of vehicle dynamics under various scenarios. For simplicity, empirical values,
,
,
,
, and
, were employed in this study since we focus on the topic of achieving humanlike driving. Note that driver characteristics are not considered in the last three constraints in Equation (
10) because, for safety reasons, the speed, acceleration, or jerk under extreme conditions may not meet the driver’s preference. An example of the optimal solution
, which balances the driver’s preference for clearance and the smoothness of the speed profile, is depicted in
Figure 7.
Given the personalized speed decision
and the objective functionals in Equation (
9), it is intuitive that the first objective functional
penalizes the difference between the planned station
and the desired station
, while the last two terms,
and
, penalize the acceleration and jerk, respectively. Meanwhile, the closer the planned station is to the desired station, the greater the acceleration is [
30]. The balance between them, which is adjusted by the corresponding weights
and
, is actually related to the dynamic car-following characteristics. Therefore, we further explored the relation between the two weights and the resulting clearance, speed, and acceleration in car-following scenarios. Simulation results suggest that the clearance, speed, and acceleration of the resulting longitudinal motion are strongly affected by the ratio between the weights of distance to the desired station and acceleration, i.e.,
, as shown in
Figure 8. Additionally, since
concerns the jerk, which is not involved in the considered driver characteristics,
is set to the same value as
. In addition,
can be set to an arbitrary value since
. Thus, the behavior modeling becomes finding a dynamic ratio
that best describes the driver’s dynamic car-following characteristics. Then
can be easily obtained. To solve the problem, a QP speed planner customizing algorithm based on Bayesian optimization and leave-one-out cross validation (LOOCV) is proposed.
Considering the outstanding performance of the MLCF model, the task is transformed into establishing the relation between ratio
r and acceleration
a, where
a is calculated by the MLCF model:
Here,
and
are sensitivity to velocity error and sensitivity to distance error, respectively, whose reciprocal are linear functions of ego vehicle speed
v.
is the speed of the preceding vehicle.
d denotes the clearance. Constants
,
,
, and
are estimated using the driving data. Constants
and
are identified by optimizing the error between the output of the following system and the measured
a,
v, and
d. More details of the MLCF model are introduced in [
9]. Additionally, one can use other car-following models as substitutes for the MLCF model, leading to different performances of the personalized speed optimizer.
In addition, the acceleration of the planned longitudinal motion monotonously increases as
r increases due to the nature of the smoothing spline [
30]. Therefore,
r is assumed to be linear, quadratic, and logarithmic functions of
successively, whose parameters can be obtained by Bayesian optimization [
36]. Since the amount of collected data is small, LOOCV is employed to evaluate the hypothesis models. The proposed algorithm for obtaining
r is summarized in Algorithm 2, where
denotes the number of episodes of driving data, and
denotes the number of folds of LOOCV. Each episode
contains the trajectories of the ego vehicle and obstacles, and the car-following episode refers to the episode in which the driver of the ego vehicle performs no other behavior except car following.
First, the car-following episodes in the human driving data are divided into training set
, validation set
, and testing set
. Then, the parameters,
k and
b, of each hypothesis model in
is identified by Bayesian optimization (from step 2 to step 19) with the training and validation sets. Specifically,
,
,
, and
for every fold of LOOCV are obtained through the method in [
9]. For fitting SVE and SDE, the effective values of the velocity error and the distance error for each speed interval are calculated using the training data, respectively. With the effective values, the constants
,
,
, and
in Equation (
11) can be identified through the least square method. Additionally,
and
can be obtained by optimizing the error between the simulated trajectories of the MLCF model and the trajectories in
.
Next, the parameters of the hypothesis model is selected either randomly or according to the expected improvement (EI) acquisition function
[
37]:
where
denotes the observations consisting of parameters and the corresponding losses,
is the number of observations,
denotes the predictive variance function of the Gaussian process prior, and
denotes the cumulative distribution function of the standard normal distribution
.
is defined as
where
is the predictive mean function of the Gaussian process prior. During each iteration of Bayesian optimization,
,
,
,
,
k, and
b are determined, with which the QP speed planner is employed to perform car following in every episode in
. After that, the error
is calculated, which is the evaluation of the human likeness in the longitudinal direction of the parameters in the iteration. The three terms of
describe the difference between the simulated trajectories and the trajectories in
in acceleration, speed, and clearance. The weights
,
, and
are set to 0.9, 0.09, and 0.01 by trial. With the error
, the observation set is updated by adding the current parameter vector and the error. After the Bayesian optimization, the parameters of the hypothesis models are obtained. Finally,
is used to evaluate the hypothesis models on the testing set, and the best one is selected to be integrated with the QP speed planner. In this way, a QP speed optimizer with the driver characteristics in the longitudinal direction is developed.
2.5. Personalized Reference Trajectory Decider
After the path optimization and the speed optimization, a trajectory with a personalized speed profile for each candidate lane is obtained, among which a reference trajectory should be selected to be transferred to the vehicle controller. In fact, different drivers facing the same situation may make completely different decisions. For example, a driver may prefer to follow the preceding vehicle, while some drivers would overtake it. Our goal is to develop a reference trajectory decider that can make the same decisions as a specific driver, namely, a personalized reference trajectory decider. The main challenge of modeling such lateral behavioral preferences is the complexity and diversity of traffic scenes.
Algorithm 2: QP speed planner customizing based on Bayesian optimization and LOOCV using human driving data |
- Input:
: human driving data; : hypothesis models; : the maximum number of Bayesian optimization iterations; the personalized speed decider; the QP speed planner. - Output:
: The best hypothesis model and its parameters.
- 1
Divide the car-following episodes in into , , and ; - 2
foreach r in do
- 25
end - 26
- 27
|
We propose to develop the personalized reference trajectory decider based on a Bayesian network to model driver behavior in the lateral direction. There are two steps to building such a Bayesian network, including designing the structure
and identifying the parameters
of the network. In general, determining the structures of BNs is intractable. Fortunately, the BN structure of the proposed model can be determined by analyzing the reasoning patterns of human drivers. When driving on a highway, a driver needs to observe the physical states and lane-changing intentions of the surrounding vehicles, as well as the physical states of the ego vehicle, to make lateral behavioral decisions. Given the information about vehicles from sensors and V2X modules, the variables can be assumed to be context-specifically independent [
38]. Thus, the structure of the BN (represented by
) is designed as depicted in
Figure 9, where
(
) denotes whether the ego vehicle is traveling slower than expected;
(
) denotes the lane the ego vehicle belongs to;
(
) denotes whether the adjacent lane on the right side exists;
are the physical states (including the position, speed, and acceleration in
s direction relative to the ego vehicle for each surrounding vehicle) and the lateral intentions (
, representing changing to the left lane, lane-keeping, and changing to the right lane) of the surrounding vehicles; and
(
) is the lateral behavioral decisions of the ego vehicle. Here,
denotes the set of possible values of a random variable
. Lowercase letter
refers to a value of a random variable
.
The lane-changing intentions of the surrounding vehicles and the ego vehicle can be obtained through unsupervised techniques, e.g., HDP-HSMM [
39]. Note that a variable denoting the existence of the adjacent lane on the left side is not necessary. The lanes are numbered from left to right as
, and whether the adjacent lane on the left side exists is naturally known. For example, the lane on the left side does not exist if the ego vehicle drives in lane 1, while it exists if the ego vehicle drives in lane 2. Given the values of the parent variables of
, the BN returns the distribution of the lateral behavioral decisions of the ego vehicle.
After determining the structure, the next step is to identify the parameters
with human driving data, i.e., the probabilities of the distribution
, where
denotes the parent variable set of
. Since the physical states of the surrounding vehicles are continuous and other variables are discrete, the BN is a hybrid Bayesian network (HBN). Building an HBN is generally much more complex than building a BN [
40]. However, considering that the output variable
is discrete, the HBN in this study can be transformed into a BN with only discrete variables by discretizing the continuous variables. There are many techniques for discretization, which can be divided into unsupervised (e.g., equal width and equal frequency [
41]) and supervised (e.g., CAIM [
42] and CACC [
43]) groups. Among them, unsupervised techniques should be considered in this study because
are unlabeled. Meanwhile, the existing unsupervised discretizing methods make the variable space extremely large when applied to build the BN, which increases the complexity of the model because they do not utilize the underlying relationships (context-specific independence) between variables. For example, when the blue vehicle makes the lane-changing decision, it does not need to consider the vehicles in lane 1, as shown in
Figure 10.
To identify the parameters
of the BN with the structure
in an efficient and lightweight way, we propose an algorithm based on the probability tree [
44], which discretizes all the continuous variables in the BN, as summarized in Algorithm 3. A probability tree is a directed labeled tree whose inner and leaf nodes represent random variables and probability values, respectively. The branches of an inner node represent the possible values or intervals of the variable. The path of a particular node
consists of the branches from the root node to it, denoted as
. A sample in
is described as on the
when the values of the variables in the sample match the branches of the
. In addition, during the generation of a probability tree, all variables are expanded successively, and each variable is expanded on all branches generated by the last expanded variable, as shown in
Figure 11.
For building a probability tree, three core tasks need to be completed: (1) determining the discrete and continuous variables to be expanded, (2) determining the expanding order of the variables, and (3) determining where to partition and how many intervals to partition into for continuous variables. The proposed algorithm expands discrete variables preferentially in general.
is first expanded because whether the ego vehicle reaches the driver’s desired speed may influence the driver’s lateral decision in any traffic situation, as depicted in
Figure 11.
Algorithm 3: Building the BN of the personalized reference trajectory decider |
- Input:
: human driving data; : the BN with the determined structure and an undetermined parameter vector . - Output:
A BN built with .
- 1
Transform into ; - 2
Expand , , and successively; - 3
Identify the interested surrounding vehicles (ISVs) for each branch of the probability tree according to the ego lane; - 4
Expand the lateral intention variables of the ISVs successively; - 5
Let be the set of all branches generated by last expansion; - 6
Let be the set of the physical state variables of the ISVs for each branch B in : ; - 7
foreach B in do - 8
- 9
end - 10
; - 11
Calculate the probability at each leaf of the probability tree with based on the maximum likelihood estimation [ 45]: .
|
Algorithm 4: Continuous variable discretizing |
- Input:
; ; : the maximum number of intervals for a continuous variable to be partitioned. - Output:
: branches that originate from B.
- 1
Let be the set of new branches that grow out of the branches in ; - 2
Initialize ; - 3
Initialize ;
- 20
end - 21
.
|
Next,
and
are expanded for similar reasons. Then, the lateral intentions of the interested surrounding vehicles (ISVs) are expanded successively. Here, the ISVs are determined by analyzing the traffic scenario. For simplicity, a scenario of a highway with four lanes is used for explanation, as shown in
Figure 10. First, only the two closest vehicles ahead and behind in each lane (the vehicles labeled as “
f” and “
r”) need to be considered. One reason for ignoring the vehicles farther away in front is that they generally do not directly interact with the ego vehicle. Another important reason is that considering those vehicles will make the structure of the BN much more complex, which hinders its application. In addition, there is no additional threshold to exclude vehicles that are very far because it may vary from one driver to another. Algorithm 4, presented later in this paper, can automatically partition the distance variable of an obstacle vehicle according to its effect on the ego lateral intentions. In this way, the algorithm equivalently finds the distance threshold beyond which the vehicles will not affect the ego lateral intentions.
Second, the vehicle behind the ego vehicle (labeled as “1r”) does not affect the lateral intentions of the ego vehicle. Third, vehicles beyond the width of two lanes on the left or right (labeled as “4f” and “4r”) are not considered. Furthermore, the vehicle in the rear of the lane next to the adjacent lane (labeled as “3r”) does not hinder the ego vehicle. Moreover, if some of the ISVs are absent, their lateral intentions are considered ’keep’, and their physical states are set to particular values, e.g.,
. Finally, the lateral intentions of the surrounding vehicles also contribute to the determination of the ISVs; e.g., in
Figure 10, vehicle “2f" is not an ISV if it is changing to the right lane. In summary, the ISVs for the ego vehicle driving in each lane are listed in
Table 1, where “2r: keep” means that vehicle “2r” is regarded as an ISV only if it keeps the lane. Note that the ISVs are defined concerning time, and the selection of ISVs will be updated over time. In addition, the ISV selection rules presented in
Table 1 are general rules for scenarios with four traffic lanes, and rules for scenarios with more lanes can be established by similar analysis.
Actually, the determination of ISVs takes advantage of the context independence between variables. With the lateral intention variables of the ISVs expanded, continuous variables (physical states of the ISVs) are partitioned and expanded successively by executing Algorithm 4. In
Figure 11, for example, the lateral intention variable
of the first ISV is expanded, following which the lateral intention variables
,
, and
of other ISVs (represented by “...”) are expanded successively. Then the position variable
of the first ISV is partitioned and expanded, and the speed and acceleration variables
and
of the first ISV, as well as the physical state variables of other ISVs, are partitioned and expanded successively. Since the ISVs for each branch
B in
vary from those in other branches, Algorithm 4 is executed sequentially.
In Algorithm 4, the first step for discretizing a variable
at a branch
b is to find its local minimum and maximum values in the subset of
whose samples are on the
, instead of global minimum and maximum values in
. This can contribute to retaining more information when variables are partitioned into intervals. For example, assuming that the ranges of a variable in
and the subset are
and
, respectively, discretization with partition 2 contains more information about the local distribution than with partition 1, as shown in
Figure 12.
Then, the number of intervals that the variable is partitioned into is determined according to the maximum KL divergence . Obviously, the larger the is, the more information about the distribution is retained. However, the growth of increases the size of the probability tree, i.e., the number of the BN’s parameters, exponentially, which significantly increases the time complexity and space complexity of the algorithm. Thus, is employed to assess the gain of partitioning. The core idea is that more intervals are needed only if the increase in benefits maximizing the difference between the distributions of the ego lateral intentions in the partitioned intervals. Specific to a driving scenario, if the obstacle vehicle in the adjacent lane is close to the ego vehicle in the longitudinal direction, a driver would not change to that lane even though the obstacle vehicle will move away soon. In this case, there is no need to partition the speed variable of the obstacle vehicle since whether it is partitioned does not affect the lateral intentions of the driver; i.e., the distributions of the lateral intentions in the partitioned intervals are the same. To calculate , the probabilities of ego lateral intentions for each partitioned interval w are first calculated by , where denotes the number of samples that match the and the interval b, and is the number of samples in which the ego lateral intention equals in the samples just mentioned. Next, given two intervals, the KL divergences of them are . Here, and represent the distributions of the two intervals. The KL divergences are calculated for all combinations of the intervals. The maximum KL divergence for a particular number of intervals is defined as the maximum value of all the KL divergences. Note that a small negative constant is added to the KL divergence formula because smaller is preferred when the remains the same.
By executing Algorithm 4 for each
B in
, all branches
and
are obtained, according to which the probability at each leaf is estimated based on the maximum likelihood estimation (MLE) as
where
. At this point, a BN with the driver characteristics in the lateral direction is built. Once the information about the ego vehicle and the ISVs is given, the BN outputs a lateral decision with the highest probability based on the experience extracted from
.
However, the decisions of the BN cannot be adopted directly since the BN considers simplified information about the surrounding vehicles. On the contrary, the path optimizer and the speed optimizer plan path and speed profiles with complete information from the data center. Additionally, the lateral decisions made by the BN may be unstable during lane changing as long as a state has not been experienced in the collected human driving dataset. Therefore, a scheduling strategy must be designed to deal with the flaws. For AVs, safety is always the priority. Hence, the decision of the BN is adopted only when the corresponding trajectory is safe. The safety of a trajectory can be assessed from the kinematic aspect using the same method in the path optimizer and from the vehicle dynamics aspect via adhesion evaluation [
46]. Moreover, traffic regulations should take precedence over driver’s preference. As for the instability problem, theoretically, it could be solved if enough data were collected. However, it is hard to ensure the adequacy of data volume. Therefore, once a lane-changing decision is made by the BN, the ego vehicle will complete the lane-changing process unless the safety or traffic regulation requirements are not satisfied. For a situation not covered by the BN, the lateral decision will be made according to the safety assessment and traffic regulations. In addition, lateral decisions must be made after the motion planning for each candidate lane. Although making a lateral decision first and only planning motion for that decision saves computational resources, it can be dangerous in some unusual scenarios. For example, suppose a slow vehicle ahead suddenly changes lanes into the lane of the ego vehicle, while the BN decides to keep the lane. In that case, the ego vehicle may be unable to avoid the collision even with maximum deceleration. In contrast, a lane-changing decision after motion planning can be substituted for the decision of the BN and prevent the collision. With the scheduling strategy described above and the BN, the personalized reference trajectory decider is developed, which selects a trajectory among the trajectories of candidate lanes to be transferred to the vehicle controller.
2.6. Personalizing the Path Optimizer
The personalized reference trajectory decider enables personalized lateral behavioral decision making. To further consider the lateral characteristics at the trajectory level, the path optimizer introduced in
Section 2.3 needs personalization.
Remind that in the path optimizer, the DP path search provides a rough path solution, which is smoothed by the QP path planning afterward. Therefore, the DP path search should have a great impact on the generation of the final path. According to Equation (
1), when the ego vehicle overlaps with obstacles or the road boundaries,
and
dominate the total cost
. Conversely, when the ego vehicle does not overlap with obstacles or road boundaries,
and
dominate the total cost. Since safety is the priority, we can only adjust
and
to improve the human likeness of the final path. The weights,
,
, and
, in Equation (
2) determine the relative magnitude of
and
and thus need to be adjusted. In general, the greater the weights are, the smoother the lane-changing paths are. On the contrary, the smaller the weights are, the closer the paths are to the centerline of the lanes.
Figure 13 compares the simulated trajectories, speed, acceleration, and curvature of the ego vehicle generated by three motion planners with three sets of randomly selected weights. The motion planners had the same personalized speed optimizer and personalized reference trajectory decider. Parameter 1, Parameter 2, and Parameter 3 denote
,
, and
, respectively. The triangles marked on the trajectories are spaced at 0.1 s intervals, representing the locations of the vehicles and indicating the traveling direction of the vehicle. The blue rectangle represents the ego vehicle, whose trajectories, speed, acceleration, and curvature in the driving data are represented by the blue curves. Those of the three motion planners compared are represented by the orange, green, and red curves. The transparent gray rectangles are the surrounding vehicles. It is shown that the weights affect not only the path but also the speed and acceleration.
Similar to Algorithm 2, we can also use Bayesian optimization and LOOCV to adjust the weights. The difference is that the parameter vector becomes
and the optimization objective is replaced by
where
stands for the simulated trajectory, and
is the trajectory in the driving data.
is the number of time steps. Additionally, the personalized speed optimizer and the personalized reference trajectory decider are employed during the optimization. After the optimization, a personalized path optimizer is obtained.