As mentioned in the previous sections, there are several parameters that determine the performance of the gradient descent algorithm in the estimation of the rotational quaternion. These parameters are the step size , maximum allowed number of iterations , and initial rotational quaternion . In this section, we analyze the effect of these parameters on the number of iterations and principal angle error (PAE) of the estimate. Moreover, we present and evaluate different methods of initial quaternion propagation and introduce yet another parameter () as the figure of merit and termination criterion. The goal is to present the methodology and guidelines for selecting the optimal set of parameters that provides the highest accuracy of the estimate using the fewest iterations, which represent the opposing requirements in iterative algorithms. Finding the optimal parameter set leads to lower power consumption and lowers the required computational power, which may be of crucial importance, especially in low-power applications.
The simulations and results presented in this section are based on the software generated three-axes gravitational acceleration, three-axes magnetic field, and three-axes angular velocity characteristic for IMUs. In these simulations, the referent vector of gravitational acceleration is chosen to be
m/s
, and the referent vector of the magnetic field
T, according to the World Magnetic Model [
33] at the location 45
48
26
N, 15
58
3
E, representing the coordinates of Zagreb, Croatia. The software generated data are based on the ground truth rotational quaternions known for every sample, which allow for the calculation of the PAE. The PAE is extracted from the corrective rotational matrix defined as
,
, and
being the true and estimated rotational matrices, respectively. For each sample, the PAE is calculated as:
The simulations are performed in an ideal, noiseless condition as well as with the Gaussian white noise added to the software generated data. The noise levels of all three sensors (accelerometer, magnetometer, and gyroscope) correspond to the noise levels of the low cost IMU MPU9250 used as a reference. Before the noise analysis, the magnetometer calibration procedure was performed. To eliminate the unwanted artifacts, the noise levels were extracted from the measurement of more than 3500 samples with the IMU in static position. The noise variance applied to the software generated data is the highest variance extracted from the measurement for each sensor, namely, (m/s) for the accelerometer, 0.65 (T) for the meagnetometer, and (rad/s) for the gyroscope.
4.1. Selection of Step Size
The step size, denoted by
in (
10), is one of the most important parameters in the gradient descent algorithm. It is a positive real number that determines how fast the estimate converges toward the ground truth. If not chosen carefully, it may cause divergence of the algorithm. The step size is often empirically determined. In order to select the optimal step size, let us assume that the attitude of an object is represented by the rotational quaternion
defined by the arbitrarily chosen vector of rotation
and the angle of rotation
. Based on (
7), for the given angle
, the true rotational quaternion
can be calculated. Note that the vector of rotation
should be normalized before calculating the rotational quaternion. Using (
8), the rotation matrix
can be calculated, which, together with the known reference vectors of the gravitational acceleration
and the magnetic field
expressed in the inertial frame of reference, were used to calculate the representation of the same physical quantities in the object’s body frame (
and
).
In this simulation, no noise was added to the calculated body frame vectors. The body frame vectors were passed as the inputs to the gradient descent algorithm implemented in MathWorks Matlab as described in the previous section. The initial rotational quaternion was chosen to be
. The value of the cost function
for each iteration of the gradient descent algorithm is plotted in
Figure 2 for different values of step size
and angle
.
Focusing on
Figure 2a plotted for the rotation angle
, one can observe that the convergence rate of the cost function highly depends on the step size value. The values of tested step sizes are chosen in the range from 0.01 to 0.13. The initial value of the cost function is equal for each step size. As expected, the value of the cost function drops with each iteration indicating the drop of error in estimated rotational quaternion (
13), except for
. This is the largest step size in the range and causes the divergence of the gradient descent, which manifests in monotonic growth of the cost function with each iteration. For the other values of the step size, a clear difference in steepness is observable. For all step sizes (except
), the knee in each graph that represents a boundary between the regions of higher (for lower iteration numbers) and lower (for the higher iteration numbers) steepness of the cost function can be detected. The difference in steepness causes certain values of the step size to be a better choice for certain applications. For example, if the expected number of iterations is lower than 5, one should chose the step size with the steepest graph in that region (i.e.,
) since this step size yields the estimate in lowest number of iterations. On the other hand, different step size may be more suitable if an application requires larger number of iterations. If the number of iterations exceeds 15, the optimal choice represents the step size
.
Although the analysis presented above is based on the specific case of the rotational quaternion defined by the specific angle of rotation
, the same behavior is observable for different angles of rotation presented in
Figure 2b–d. By increasing the angle of rotation, the distance between the true and initial rotational quaternion in quaternion space increases. Because of that, the initial value of the cost function increases with the angle
. Moreover, in the 30 iterations presented in
Figure 2, different final values of the cost function are achieved indicating the difference in accuracy of the estimate. This behavior is characteristic for iterative algorithms. However, notice that the step sizes yielding the fastest convergence for the lower and the higher iteration numbers remain
and
, respectively. Since the number of iterations in attitude estimation based on the presented algorithm is expected to be small, the step size is set to
.
4.2. Termination of Iterative Process
In the previous examples, the step size and the cost function are analyzed with the fixed number of iterations set to 30, although, in the steady state, the number of iterations used for attitude estimation is expected to be much lower. In some publications, a single iteration is assumed to be sufficient [
30,
34]. However, this hypothesis is difficult to prove without setting a figure of merit. Here, we analyze different methods and approaches for the termination of the gradient descent iterative process. In general, the process can be terminated based on the number of iterations or desired accuracy. These two criteria represent opposing requirements.
The first and the simplest approach for the termination of the iterative process is setting the maximum number of iterations . In perfect noiseless conditions, as the number of iterations approaches infinity, an estimate approaches the ground truth. Thus, setting the finite number of iterations leads to the loss of accuracy. On the other hand, aiming for the highest accuracy requires large (theoretically infinite) number of iterations, which significantly increases the processing time. If the time constraint is the limiting factor of the application, one can reduce the duration of the gradient descent algorithm by limiting the maximum number of iterations by setting . The value of highly depends on the processing power of the targeted processing unit.
If, on the other hand, the desired accuracy of the estimate is reached, the iterative process should be terminated. However, a figure of merit is needed to properly set the termination criterion. In ideal lossless conditions, the termination criterion can be set based on the value of the cost function. Recall that, if the step size is set correctly, the value of the cost function monotonically decreases with each iteration. If the value of the cost function
drops below some a priori defined value
, the desired accuracy is reached and the iterative process can be terminated. Thus,
represents the user defined positive real number that sets the highest acceptable error and thus the desired accuracy of the estimate. The lower the
, the higher the accuracy. Unfortunately, such a criterion is impractical in real world applications. The reason for that lies in the fact that the measurements of physical quantities based on which the rotational quaternion is estimated are always affected by noise. In the presence of noise, the gradient descent algorithm is still able to find the rotational quaternion that minimizes the cost function, however, the value of the cost function at its minimum is no longer equal to zero, causing it to be practically impossible to reach the desired accuracy by setting
. However, as the estimate approaches the ground truth, the difference between the values of the cost function in current and previous iteration becomes smaller. Eventually, at the minimum of the cost function, the difference becomes equal to zero. Thus, the termination criterion can be set based on parameter
. Such a criterion is reported in [
26]. Here, we proposed yet another criterion based on the gradient of the cost function
defined by (
16). Indeed, calculating the difference between two neighboring points of the cost function is very similar to finding its derivative. This information is also contained in the gradient itself, which is calculated in each iteration of the gradient descent algorithm. If the minimum of the cost function is reached,
becomes a zero vector. Thus, to determine the accuracy of the gradient descent algorithm, we introduce the scalar representation of the gradient
G defined as:
The iterative process can be terminated in the k-th iteration if , here is an arbitrarily chosen user-defined positive real number. Similar to , by decreasing the value of , the accuracy of the quaternion estimation increases at the cost of the number of iterations.
To gain a better understanding of how
and
affect the quality of the rotational quaternion estimation, another simulation is performed. In this simulation, it is assumed that an object rotates with the constant angular speed of 20 deg/s around an arbitrary chosen vector of rotation
. The initial angle of rotation around the vector
is chosen to be
. Based on the provided data, the true rotational quaternion is calculated for each time step. The time period between the calculated quaternion is chosen to be 0.1 s. The true quaternion is then used to calculate the body frame gravitational acceleration and magnetic field that are passed to the gradient descent algorithm as the input arguments at each time step. The initial quaternion for the very first iterative process was chosen to be
. However, the rotational quaternion estimated in every other iterative process is passed as the initial quaternion for the next iterative process. This is one of the initial quaternion propagation methods that will be discussed in detail further on. In this simulation, the number of iterations and PAE were monitored with
fixed to
for different
in the range from 5 to 80. The step size is chosen to be
. The simulation results are shown in
Figure 3.
In
Figure 3a, one can observe that
has a dominant effect on the duration of the transient, i.e., the number of samples of the gravitational acceleration and the magnetic field needed for the algorithm to converge. Indeed, the higher the
, the quicker the steady state is reached. In a steady state, the number of iterations required for reaching the accuracy defined by
=
is equal to 9 for every
, except for
. Notice that, for
, the number of iterations is maxed out during the whole simulation period. This indicates that five iterations are simply not enough for reaching the required accuracy. This is also evident in
Figure 3b. Notice that, in a steady state, the same PAE is reached for each
except for
. The PAE is greater for the
, therefore indicating an insufficient number of iterations. Furthermore, one can notice that, during the transient, the number of iterations is maxed out for each
except for
for which the steady state is reached in only 0.1 s, indicating a single sample of gravitational acceleration and magnetic field.
The same simulation is repeated with
set to the fixed value of 100, while
is varied in the range from
to
. Since the number of iterations is set to a relatively high value, a quick convergence is expected for each
. This is confirmed by the results shown in
Figure 4a. However, the number of iterations in a steady state is not equal for every
. The lower the
, the higher the number of iterations required for reaching the desired accuracy. This is evident from
Figure 4b as well. If
is smaller, the small PAE is reached in a steady state. This is a clear confirmation that parameters
G defined in (
22) and
represent a valid figure of merit and termination criterion.
The simulation results presented above show that, if the
is too high, the accuracy of the estimate is too low. On the other hand, if the
is too low, the desired accuracy may be too high with respect to
. Increasing
leads to a larger number of iterations. If a targeted processing unit is capable of processing a large number of iterations in a given amount of time, this is a perfectly viable option in lossless conditions. However, in the presence of noise, a higher number of iterations does not necessarily mean higher accuracy. Thus, setting
too low may lead to wasting computational resources. Naturally, there is an optimal range of values of
, which should be determined using simulations or experiments. The optimal
minimizes the number of iterations required in a steady state. Here, we propose the method to determine the optimal
. The method is based on static simulation with noise added to the gravitational acceleration and magnetic field. The assumed attitude is represented by the rotational quaternion defined by the vector and angle of rotation,
and
, respectively. The initial quaternion is again chosen to be
. The rotation quaternion estimated for each sample is passed as the initial quaternion in the next iterative process. In a steady state, without the presence of noise, the previously estimated quaternion is the same as the current quaternion since the simulation is static. Consequently, the number of iterations per iterative process is expected to drop to zero in a steady state, regardless of
. However, in the presence of noise, zero iterations per iterative process means that the selected value of
is not optimal. In that case, higher accuracy can be achieved by decreasing the value of
.
should be decreased until the number of iterations starts to increase. Decreasing the
even further leads to the increase in the number of iterations without gaining accuracy. Since the object is static, the additional iterations appear solely due to the sensor noise. This conclusion is supported by the simulation results shown in
Figure 5 representing the number of iterations for
in a range from
to
.
All the graphs in
Figure 5 have the highest number of iterations around
s caused by the difference between the true and the initial quaternion, which quickly vanishes. This effect is detailed in the previous section.
Figure 5a,b represent results with the highest
. The average number of iterations over the simulation period of 100 s is below 0.1. Based on the presented results, one can conclude that the chosen values of
are too high and that the accuracy of the estimate can be further improved by decreasing the value of
. On the other hand,
Figure 5e,f represent the opposite scenario. Although the average number of iterations is not unreasonably high (below 2), at some time steps, a significantly larger number of iterations is needed to achieve the desired accuracy. This is an indication that the selected
is too low, which results in a noisy estimate. Thus, one concludes that optimal
yields the average number of iterations around the midpoint of the 0–1 range.
Figure 5c,d represent the results obtained for
equal to
and
, respectively. In these cases, the number of iterations does not exceed 1 at any time step. Furthermore, notice that, in both cases, the average number of iterations is closer to 0.5. Thus, the
is fixed to
for the next simulations in which we examine different methods of the initial quaternion propagation.
4.3. Propagation of Initial Quaternion
There are several approaches to selecting the initial quaternion for the gradient descent algorithm. Here, we present and examine three different propagation methods. In propagation method 1, a fixed arbitrarily selected rotation quaternion is used for each sample/iterative process. Since it is not possible to know the attitude of an object a priori, each rotational quaternion selected as an initial quaternion represents a reasonable guess. In propagation method 2, the rotational quaternion estimated in current iterative process is used as an initial quaternion of the next iterative process. This propagation method was briefly introduced in the sections above. Here, we use the assumption that the attitude of an object cannot change instantaneously (or cannot change much in a short period of time). Thus, the previously estimated quaternion is expected to be closer to the current quaternion, which reduces the number of iterations in comparison with the previous method. Finally, propagation method 3 uses an attitude estimate and information about the angular rate at the current time step to predict the rotational quaternion at the next time step. This prediction is then used as an initial quaternion in the next iterative process.
Unlike the other gradient descent based attitude estimation methods reported in the literature [
30,
34], which fix the number of iterations to 1, adaptively changing the step size with respect to the measured angular velocity, the iterative process here is performed with an initial quaternion closest to the true quaternion until the desired accuracy defined by
=
is reached. This method is expected to yield the lowest number of iterations. All three propagation methods are implemented as a part of the complementary filter. The complementary filter fuses the quaternions estimated using the gradient descent algorithm (based on the gravitational acceleration and magnetic field data) and quaternions estimated from the angular velocity. Here, the complementary filter gain is set to
, providing the same weight to each of the quaternion estimation methods. The value of the parameter
K is not optimized, as it is not part of the gradient descent algorithm per se.
In this simulation, it is assumed again that an object rotates with the constant angular speed of 20 deg/s around an arbitrary chosen vector of rotation
. The initial quaternion and angle of rotation around the vector
are chosen to be
and
, respectively. For each time step, the true rotational quaternion is calculated, which is then used to calculate the body frame gravitational acceleration and magnetic field. The angular velocity vector remains the same in both frames since it lies on the axis of rotation. Corresponding noise is added to each physical quantity and passed to the complementary filter implemented in MathWorks Matlab with the sample time of 0.1 s. The rotational quaternions estimated in each time step using all three initial quaternion propagation methods are converted in Euler angles and shown in
Figure 6. Ppropagation methods 2 and 3 show excellent results with respect to the ground truth. Propagation method 1, however, had difficulties estimating the rotational quaternion even with the maximum iteration number
, which is especially manifested in the plot of Euler angle
in
Figure 6a.
At some specific time steps, the initial quaternion is simply not good enough for the gradient descent to converge, which manifests in periodic spikes. On the other hand, Euler angles
and
are estimated reasonably well even with propagation method 1. The number of required iterations and PAE for the same simulation are shown in
Figure 7.
Figure 7 clearly shows the difference in performance between the three propagation methods. It is clear that the iteration number is maxed out at exactly the same time steps at which the Euler angle
is incorrectly estimated using propagation method 1 (
Figure 7a). This is evident even from the PAE in
Figure 7b. This indicates that propagation method 1 is non-optimal, which is also confirmed by a relatively large average number of iterations of 9.3 and PAE of 16.6
. Propagation method 2 yields much more reasonable results with respect to the number of iterations and the PAE shown in
Figure 7c,d, respectively. Notice that the average number of iterations is very close but is still below 1. This means that the iterative process is not always needed for the rotational quaternion estimation. This can also be observed at the time steps at which the number of iterations is equal to zero. It is interesting that the noise added to the simulated data actually helps to bring the initial quaternion closer to the optimal estimate, which leads to the desired accuracy without applying a single iteration of a gradient descent algorithm. This effect is very similar to the process of intentionally applying noise known as dithering [
35]. Furthermore, notice that the average PAE of 4.67
is drastically improved with respect to propagation method 1. As for propagation method 3, the results of which are shown in
Figure 7e,f, it evidently yields the best results. The average PAE is over an order of magnitude lower with respect to propagation method 2, i.e., 0.464
. Moreover, notice that the noise floor of the PAE is reached. This confirms that
defining the accuracy of the estimate is correctly set. The improvement is clearly visible in the average number of iterations, which is dropped to 0.441. Thus, one can conclude that the attitude estimation algorithm mostly relies upon the angular velocity data, which for MPU9250 have the lowest noise level of the three. This noise level is decreased even further through the numerical integration characteristic for the estimation of the rotational quaternion from angular velocity. Only now and then is the estimate automatically corrected using the gradient descent algorithm if
.