3.1. MPC Problem Formulation
The Model Predictive Control problem is, in its essence, a recurring optimization problem that searches for the optimal control input over the receding control horizon, and, in most cases, also solves for the state variables over the prediction horizon. Only the first control input in the control horizon is realized by the actuators. In the context of our problem, attitude dynamics are coupled with the ROE-based relative translational motion, and the state vector is set to
where the dynamics of the state vector are described by (
3) and (
10). The control input, on the other hand, is set to
where
u is the magnitude of the piece-wise constant input acceleration provided by the single thruster, and
is the desired quaternion. The state vector as well as the control vector are collated over the prediction and the control horizons in the two matrices
and
respectively, such that
with
being the length of the prediction horizon, and
being the length of the control horizon, where
. It is to be noted that the notations
, where
, and
, where
, are used, and the time difference between any two consecutive instances is constant and is equal to the user-defined sampling time, i.e.,
. Moreover, after the control horizon is over, the input acceleration is set to zero (i.e.,
) and the desired attitude is left to be decided by the mission-specific mode management logic of the ADCS. For the sake of simulating the attitude evolution, the reference attitude at an arbitrary time step beyond the control horizon is set to the quaternion from the previous step. Formally,
.
It is also important to note that the adopted satellite is assumed, without loss of generality of the approach, to have its thruster directed along the z-direction of its body frame, and hence the input acceleration vectors in
and in
can be related as
After having introduced the necessary notations, the optimization problem could then be formulated as follows:
Problem 1.
where
,
, and
, are the time, dimensionless ROE vector, and quaternion vector, respectively, at the beginning of each horizon. Note that these are receding quantities that are specific to each prediction horizon. Accordingly, they coincide with the quantities at the beginning of the simulation only once, at the time of the first prediction. Moreover,
is the maximum applicable acceleration that can be provided by the thruster,
is the angular velocity vector of the satellite, and
is the maximum allowable angular rate around any axis.
Indeed, a constant value for the input acceleration vector,
, has to be fed to the ROE dynamics constraint (16). However, the attitude dynamics are much faster than those of the ROE, and since the vector
is attitude-dependent, its value can change significantly from
to
, even when
is kept constant. Therefore, a prediction of
,
had to be fixed and fed to the dynamics constraint (16). The value of
is set to that of
at time
, with its formal definition being as follows:
The attitude kinematics (10) are discretized through a symbolic Runge–Kutta fourth-order scheme and the discrete kinematics are imposed as constraint (17) in Problem 1. Moreover, the discrete angular velocity vector in constraint (19) in Problem 1, , could be obtained by applying Formula (9) using the discrete unit quaternion signal . It is also worth noting that the squared norms (, , and ) are constrained instead of the norms themselves. This formulation is adopted to facilitate the job of the optimizer since it allows the Jacobian of the constraints to be defined at all values of and .
The cost function components of Problem 1,
,
, and
, are defined as
where
is the reference ROE vector, while
,
,
, and
are positive-definite MPC gains.
It is important to note that the ultimate goal of autonomous orbit keeping is to rendezvous with a virtual target then track its absolute orbit, i.e.,
. Moreover, the matrices
and
are chosen to be diagonal matrices, which renders
a summation over the squared weighted 2-norms of the error vector,
. This ultimately means that
is a measure of distance between
and
, which is zero in our case, in a scaled-ROE space. Indeed, reasoning
in terms of the magnitude of the error vector in a scaled-ROE space does not only allow us to eventually drive
to
, and hence for the satellite to rendezvous and then track the orbit of the virtual target, but also allows us to indirectly minimize the total delta-V cost, since the distance in the ROE space, or in a scaled-ROE space for that matter (e.g., the dimensional ROE space, which scales the ROE vector by the mean semi-major axis of the chief), can be used as a measure for the total delta-V cost [
14,
15]. A closer look at
reveals that while minimizing it indeed drives the error signal to zero, it is, in fact, an implicit trade-off between fuel and time optimality, depending on the ratio between the traces of
and
, respectively. The greater the value of this ratio, the more inclined towards fuel optimality the cost function becomes, while the lower the ratio, the more the scheme leans towards time optimality.
Adding
to the compound cost function directly minimizes the total required thrust from the onboard single thruster. The adopted convex form of
is the fuel-optimal form of a cost function, which also coincides with the delta-V-optimal form for single-thruster satellites [
23].
Finally, the purpose of adding the last component of the cost function, , is to softly constraint the desired attitude to be as close as possible to the actual attitude in order to avoid unnecessary large slews, which in turn minimizes the attitude control effort. This soft constraint allows large slew maneuvers, depending on the choice of the MPC gains, only when it is meaningful from the ROE error (i.e., ) or fuel cost (i.e., ) points of view.
3.2. Parameter Tuning
In the proposed formulation of MPC, there are many parameters and gains to be chosen, which can affect the overall performance and robustness of the MPC. In many industrial applications, such parameters are chosen engineering experience. In this application, the initial guess for the prediction horizon has been formulated based on the literature review. Since the
-optimal locations for ROE changes through impulsive delta-V increments are known to occur, at most, every half-orbit [
15], and since the input acceleration of the low-thrust propulsion system should be distributed around these
-optimal locations (if
optimality is sought), the prediction horizon,
, is chosen to be at least half the orbital period. In this setting, the MPC is able to foresee all the
-optimal locations throughout a certain orbit, which hence leads to optimal allocation of the available thrust. Overly increasing the prediction horizon is expected to only overwhelm the onboard processor, while not having much effect on the optimality of the solution.
The choice of the control horizon,
, on the other hand, is decided through trial and error, which is an iterative process as all the other parameters have to be fixed before the tuning process starts. Nonetheless, regarding the choice of the
value, it has to be a small positive integer less than or equal to
. Common guidelines for choosing the prediction and control horizons can be found in [
24].
Tuning of the sampling time, , has also been performed through trial and error. In the context of Problem 1, although the dynamics of the ROE are slow and the sampling time could be chosen as a large value in order to not solve the optimization problem so frequently, the following important consideration has to be taken into account to carry out the tuning process. Since the attitude dynamics are much faster than that of the ROE, one cannot ignore that a fixed value for the input acceleration vector in the RTN frame, , is fed to the linearized ROE dynamics constraint (16), see (22), which renders the model used in the MPC not accurate unless a small value is chosen for the sampling time. Indeed, including in the compound cost function helps in slowing down the attitude dynamics. Still, the need to choose a scant value for the sampling time is meaningful. A compromise has to be found between choosing a small and having a more accurate model for the MPC on one hand, and choosing a large that allows the optimization problem to be solved less frequently on the other.
The cost function gains (i.e., , , , and in this paper) are more challenging to tune, since the cost function of the MPC is problem-dependent and no general guidelines exist for its weighting. Thus, more focus is put on analyzing the effect of choosing different cost function weights and on actually selecting an optimal set of these weights.
In order to reduce the number of tunable parameters, the components of the cost function, i.e.,
,
, and
, are related to the main weighting matrix,
, such that
where
with
being the expected value of
over the prediction horizon, which is approximated by the ROE vector at the beginning of each horizon, and
and
being predictions for the values of
u and
over the control horizon. The approximated expected values,
,
, and
, are illustrated graphically over one prediction horizon in
Figure 3. For imaging purposes, each of them is taken to be one-dimensional. Indeed, the values assumed for
and
are merely the authors’ predictions for the expected values of these quantities over the whole simulation; however, choosing other values for these variables would not affect the final MPC gains, as will be elaborated upon later in the text.
Substituting from (25) into (24), the MPC gains can be related to
,
,
, and
as
where
and
change for every prediction horizon.
It is conceivable that when
is very small (i.e., the rendezvous of the satellite with its virtual target has already taken place), the priority should be given to
in order for the ADCS not to hyper-react to the small errors in
. Indeed, the phenomenon of hyper-reaction of the ADCS when
becomes too small has been verified via numerical simulation for a system using the definition of
in (26). It is for this reason that
has to be redefined in order to steer the priority to
when
is approaching zero, such that
where
is the saturation function which is defined as follows:
Looking at Equations (24)–(27), it is clear that in order to define all the MPC gains, one has to choose , , , , and .
The choice of
is rather simple since all the ROEs can be weighted equally. However, in order to enhance the stability of the final relative orbit (i.e., to keep the obtained orbit for as long as possible without it being affected much by the perturbations), an emphasis is put on minimizing
when the ROEs are close to the target ones, and
is defined as
where
Q is a tuning parameter that controls the order of magnitude of the cost function, and
is the identity matrix with dimensions
.
Fixing the value of
Q to
, of
to 10, and of
to
, a brute force sensitivity analysis over
and
is carried out to choose adroit values for
and
. Thanks to the High-Performance Computer of the University of Luxembourg [
25], 726 simulations could be run simultaneously where, in addition to interchanging the values of
and
, the seed of the Random Number Generator (RNG), which controls the initial state at the beginning of the simulation, could also be taken into account. In these simulations, and in the sequel, warm-starting is employed at the beginning of each prediction horizon, meaning that the optimized state and control profiles from the previous prediction horizon are utilized to construct the initial guess for the current one.
Before starting the simulations, it was noted that the admissible values for and belong to the period , since the most important component of the cost function is and since the two other components are normalized with respect to (see (24)).
One simulation would have its RNG seed,
, and
as one possible combination from the following sets:
To this end, a fitness function needs to be introduced in order to asses how good the output of the simulation is. Such fitness function in our context needed to address four criteria (performance metrics):
- 1.
Driving the ROE vector to zero, which is the main goal of the MPC scheme to achieve orbit keeping. This is assessed through observing the norm of the finale of the ROE time series, denoted as .
- 2.
Enhancing the stability of the relative orbit finale which, in other words, is minimizing the Root Mean Square (RMS) of the last portion of the relative semi-major axis profile over time, denoted as .
- 3.
Minimizing the total delta-V, .
- 4.
Reducing the attitude effort, which is quantified through the mean angular rate of the satellite, .
It is necessary to state that the term “
finale” in this paper refers to the last
of the simulation time span. Furthermore, although this research is concerned with low-thrust propellers, the total delta-V is considered instead of the total thrust in order to enable comparisons between the proposed scheme and those from the literature. The total delta-V in our context can be calculated using the following formula:
where
and
are the initial and final times of the simulation, respectively. Given that the input thrust/acceleration from the single thruster is piece-wise-constant, the total delta-V can be rewritten as follows:
with
being the number of receding horizons being processed within a simulation,
j the horizon index, and
being the fixed sampling time.
The adopted overall fitness function could then be written as
where
,
,
, and
are weights that determine the importance of each of the four criterion, and the function
is defined as
with
and
being the minimum and maximum functions over all the simulations. It is clear that this form of the fitness function only shoots out values between 0 and 1.
Running the simulations for the 726 combinations in (30) for a fixed initial chief orbit, defined in
Table 1, applying the definition of the fitness function in (34) to all the 726 simulations and to the four performance metrics, and fixing the values of the weights to
,
,
, and
, which reflect the high importance of minimizing
in comparison to the other metrics, the heat maps that summarize all the simulations could be obtained and are presented in
Figure 4.
The heat maps in
Figure 4 could only be obtained after averaging the fitness over each RNG seed in (30), and after adopting a scattered data interpolant in order to smooth the heat maps.
It is clear from the figures that the smaller is, the better and become, while this relation is conceivably inverted for . Furthermore, and could be noticed to be positively correlated, and are both negatively correlated with and . Further investigations on the simulations where and are both large revealed that minimal input acceleration is provided at the current attitude of the satellite, which is almost stagnant, while the effect of this acceleration on the orbit correction is, as well, minimal. Nonetheless, the ROE vector is approaching its set point, although very slowly.
The overall fitness function,
, is depicted in
Figure 5 and the fittest point, i.e., the point with maximum overall fitness, (
and
) is marked with the gray circle on the heat map. It is clear from the fittest
combination that, for the initial chief orbit in
Table 1, it was not necessary to include the direct delta-V penalty,
, in the cost function, since
was already indirectly accounting for delta-V cost as previously mentioned. One has to acknowledge that the fittest combination of
and
in
Figure 5 is not necessarily universal for any arbitrary initial chief’s orbit, and the selection of the optimal
combination needs to be performed again for each initial orbit of the virtual chief. This does not represent a limitation of the approach, since the reference orbit of the chief is fixed once as soon as the scientific mission is designed.
The employed approach served the need to identify an adroit set of combinations of and . Future investigations may focus on the setup of a heuristic approach, e.g., genetic algorithms, to search for the combination that globally maximizes the overall fitness function. In this context, note that the definition of the fitness functions and of the related parameters impact the global optimum.
Finally, the values of the optimized and are heavily dependant on the authors’ predictions of and , and another choice of these quantities would have definitely led to different optimal and ; however, the values of and will not change since is directly divided by to obtain and the same dynamics apply for with the squared 2-norm of to obtain (see (26)).
3.3. Closing the Control Loop
The MPC described so far is one key module of the whole control loop, which, at practical implementation, requires feedback to operate properly, which is provided by the navigation system, which, in turn, relies on sensors as well as a state estimation filter. The full closed loop, which illustrates the module execution logic onboard the deputy spacecraft, is depicted in
Figure 6. There, Osc2Mean is the function that transforms osculating orbital elements to mean ones [
18], Body2Inert is the method that rotates any vector from the body frame to the inertial frame, and OE2ROE is the function that transforms the absolute orbital elements of both the chief and the deputy to a ROE vector (see (
2)). The breve accent,
, signifies a quantity which is affected by either one or a combination of the following sources of errors:
Estimation errors (e.g., );
Actuator errors (e.g., );
Physical constraints (e.g., ).
The proposed MPC is validated by means of numerical simulations, emulating the performance of the navigation module and the effects of the physical limitations of the adopted actuators.
The physical limitations are only present in the saturation block (see
Figure 6), which could be implemented in the numerical simulations. In fact, this saturation is taken into account in the MPC implementation, but the saturation block after the MPC is implemented anyway as a safeguard in case the MPC computes an infeasible solution.
The “Sensors” and “Filter” blocks in
Figure 6 are replaced by the following surrogate models, which treat the propagated state of the deputy as ground truth:
where “Cart2OE” is the method that transforms the Cartesian state to orbital elements, and
is a normally distributed random variable with
as its expected value and
as its variance. Hence,
is the covariance matrix of the random noise affecting the estimation of the Cartesian state of the deputy satellite, which is defined as
where
and
are the variances of the one-dimensional position and velocity estimation errors, respectively, which, for the numerical simulations presented in this work, were extracted from the publicly available Triton-X brochure.
It is to be noted that the relative navigation is typically more accurate than the absolute one [
4,
5]. It is for this reason that a similar model is used to emulate the behaviour of the relative navigation system within the numerical simulations as follows:
where
is the covariance matrix of the random noise affecting the estimation of the ROE vector, which is expanded as
with the diagonal elements being the variances of the random noise that affect each element of the ROE vector. These variances are taken to all be equal in the sequel, i.e.,
.
Lastly, the “Pointing error” block is replaced by the following surrogate model:
where
is the pointing error angle, which was obtained from the Triton-X brochure in the validation simulations, and
is a random three-element unit vector.
It is important to note that although the chief, in this context, is a virtual point, it is propagated using the perturbed dynamics for two main reasons:
In many applications, the tracking of the reference orbit and the absolute control of the reference orbit are handled separately;
The proposed scheme can be directly applied to the rendezvous with an actual spacecraft.
As a result, the autonomous orbit keeping compensates errors and perturbations acting on the relative dynamics.