1. Introduction
Ultrasound technology is widely used for object positioning. Applications such as robot navigation [1], indoor navigation [2], human-device interface systems [3], body-tracking [4] or medical-probes tracking [5] are just a few examples of the many potential applications based on ultrasonic waves.
Positioning systems usually have a set of fixed anchor nodes that defines the infrastructure for the location. To locate the target, there are mainly two different approaches:
Locating an active object, one able to emit and/or receive ultrasonic signals [6,7].
Locating a passive object, one which just reflects the incoming ultrasonic wave emitted by the anchors [8].
Within the active-object alternative, one option is to use the anchors as receivers and the mobile node as the signal emitter. Based on the Time of Flight (ToF) or on the Received Signal Strength Indicator (RSSI), the anchors calculate their distance to the object [9]. Another alternative is using the Angle of Arrival (AoA) [10], where the position is obtained from the direction of arrival of the signal at the receiver. This work focuses on mechanisms based on ToF measurements, which are generally more robust and accurate because they rely on the predictable velocity of the ultrasonic wave in air. If a ToF-based mechanism is used, all the anchors require an additional synchronization mechanism to share a common clock reference. Similarly, the roles can be inverted: the anchors synchronously transmit beacons and the mobile node acts as the receiver, computing the distances locally. To achieve time synchronization between the nodes, a combination of different technologies may be used in the same system, such as ultra-wideband (UWB) and ultrasound [11]. Another popular approach, which avoids the need for tight synchronization between the anchors, is two-way ranging [12], in which either the mobile node or the anchors reply with another signal after a fixed amount of time and the one-to-one distances are computed from the individual round-trip times.
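As a minimal sketch of the two ranging schemes just described, the conversion from time measurements to distances could look as follows. The speed of sound (343 m/s, roughly dry air at 20 degrees Celsius) and the function names are illustrative assumptions, not values or interfaces taken from the cited works.

```python
# Assumed constant: speed of sound in dry air at about 20 degrees Celsius.
SPEED_OF_SOUND_M_S = 343.0


def one_way_distance(tof_s, c=SPEED_OF_SOUND_M_S):
    """Distance from a one-way time of flight.

    Only valid when emitter and receiver share a common clock reference.
    """
    return c * tof_s


def two_way_distance(round_trip_s, reply_delay_s, c=SPEED_OF_SOUND_M_S):
    """Distance from a two-way ranging exchange.

    The responder answers after a fixed, known delay, so no clock
    synchronization is needed: the delay is subtracted and the remaining
    time covers the path twice.
    """
    return c * (round_trip_s - reply_delay_s) / 2.0


# Hypothetical measurements: 1 ms one-way, or 12 ms round trip with a 10 ms reply delay.
print(one_way_distance(0.001))
print(two_way_distance(0.012, 0.010))
```

Both calls describe the same geometry, which illustrates that two-way ranging trades the synchronization requirement for a longer measurement cycle.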
Several works use the active-object approach, based on ultrasound technology, for positioning and tracking. Chen H. et al. [9] proposed a system in which a fixed receiver array localizes a transmitter array attached to the hand of the user. Chen J. et al. [13] combined ultrasonic and radio signals to develop a transmitting 3D pen, together with an algorithm that positions the pen based on a set of receiving nodes covering the writing plane.
In the passive-object alternative, just the echo, or reflected wave, is detected back by the anchors. This is typically feasible for very short-range applications, such as gesture recognition [14], in which the object to locate is interference-free and has a reflective surface large enough to be easily recognized. The same distance measuring techniques used in the active alternative can be used with the passive alternative [15], taking into account the characteristics of the passive approach.
Ultimately, the accuracy and robustness of the system rely on the dependability of the distance measurements. It is critical to recognize the incoming signal (either reflected or actively transmitted by another node) over the ultrasonic background noise. Several methods are proposed in the literature based on different criteria (time, frequency, phase) [16], with the most popular being the cross-correlation of the received and expected signals. It requires low computational power, introduces low delay and offers higher robustness against noise when detecting an echo [13,16].
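The cross-correlation detector can be illustrated with a short sketch. It assumes a Hann-windowed sinusoidal burst as the expected signal; the carrier frequency, sampling rate, pulse length, noise level and delay are hypothetical values chosen only for the example.

```python
import numpy as np


def make_pulse(f0, fs, n_cycles):
    """Hann-windowed sinusoidal burst used as the expected (reference) signal."""
    n = int(round(n_cycles * fs / f0))
    t = np.arange(n) / fs
    return np.hanning(n) * np.sin(2.0 * np.pi * f0 * t)


def estimate_tof(received, reference, fs):
    """Delay (in seconds) at which the reference best matches the received signal."""
    # In "valid" mode, index k of the output is the correlation of the
    # reference with received[k : k + len(reference)].
    corr = np.correlate(received, reference, mode="valid")
    return int(np.argmax(corr)) / fs


# Hypothetical acquisition parameters (assumptions, not from the paper).
fs, f0 = 200_000.0, 40_000.0
reference = make_pulse(f0, fs, n_cycles=8)

# Synthetic recording: weak background noise plus one attenuated echo.
true_delay_samples = 583
rng = np.random.default_rng(0)
received = rng.normal(0.0, 0.02, size=4000)
received[true_delay_samples:true_delay_samples + reference.size] += 0.5 * reference

tof = estimate_tof(received, reference, fs)
# For an echo the wave travels to the object and back, hence the factor 1/2.
print("ToF [ms]:", 1e3 * tof, "distance [m]:", 343.0 * tof / 2.0)
```

The resolution of this estimate is one sample period; finer resolution would need interpolation around the correlation peak, which is omitted here.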
This technique also makes it possible to emit distinguishable signals, such as Direct Sequence Code-Division Multiple Access (DS-CDMA) codes [17,18], so that incoming pulses can be told apart (i.e., in the case where the anchors play the emitter role).
Once the distances to the anchors are obtained, the location of the object can be determined using a positioning algorithm based on the trilateration concept [19,20]. Knowing the positions of three anchors, $(x_1, y_1, z_1)$, $(x_2, y_2, z_2)$ and $(x_3, y_3, z_3)$, and the pairwise distances ($d_1$, $d_2$ and $d_3$), the coordinates of the object, $(x, y, z)$, can be calculated by solving the following system of equations:

$$
(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2 = d_i^2, \qquad i = 1, 2, 3
$$

This algebraic solution corresponds to the intersection points of the three spheres with centers $(x_1, y_1, z_1)$, $(x_2, y_2, z_2)$ and $(x_3, y_3, z_3)$, and radii $d_1$, $d_2$ and $d_3$, respectively.
Three anchor nodes, and the distance from all three anchors to the object, are needed as a minimum requirement to obtain the 3D location of the object. If there are fewer than three distances to the anchor nodes (i.e., there is no direct acoustic channel between the object and one anchor), it is not possible to determine the location.
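The three-sphere intersection has a well-known closed-form solution, sketched below under the assumption of noise-free distances and non-collinear anchors. The function name and the example coordinates are hypothetical. Note that the spheres generally meet in two mirror-image points, one on each side of the anchor plane, so extra knowledge is needed to pick one.

```python
import numpy as np


def trilaterate(p1, p2, p3, d1, d2, d3):
    """Return both intersection points of three spheres (centers p*, radii d*)."""
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p1, p2, p3))

    # Local orthonormal frame: p1 at the origin, p2 on the x axis,
    # p3 in the xy plane.
    d = np.linalg.norm(p2 - p1)
    ex = (p2 - p1) / d
    i = ex @ (p3 - p1)
    ey = p3 - p1 - i * ex
    ey /= np.linalg.norm(ey)
    ez = np.cross(ex, ey)
    j = ey @ (p3 - p1)

    # Subtracting the sphere equations pairwise gives x and y directly.
    x = (d1**2 - d2**2 + d**2) / (2.0 * d)
    y = (d1**2 - d3**2 + i**2 + j**2) / (2.0 * j) - (i / j) * x
    z_squared = d1**2 - x**2 - y**2
    if z_squared < 0.0:
        raise ValueError("the three spheres do not intersect")
    z = np.sqrt(z_squared)

    base = p1 + x * ex + y * ey
    return base + z * ez, base - z * ez


# Hypothetical example: three anchors in the z = 0 plane.
anchors = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
target = np.array([0.3, 0.4, 0.5])
d1, d2, d3 = np.linalg.norm(anchors - target, axis=1)
sol_a, sol_b = trilaterate(*anchors, d1, d2, d3)
print(sol_a, sol_b)
```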
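When there are more than three anchors involved in the location, we have an overdetermined system, and the method is called multilateration. Its advantage is a potentially increased robustness against inaccurate or missing distances. With $N$ anchors, it is required to solve a system with $N$ equations, which makes recursive algorithms necessary to obtain an optimal solution [21]:

$$
(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2 = d_i^2, \qquad i = 1, \dots, N
$$

where $(x_i, y_i, z_i)$ is the position of anchor $i$, $d_i$ is its distance to the object and $(x, y, z)$ are the unknown coordinates of the object.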
To compute the coordinates of the object, and even to previously or simultaneously position the anchor infrastructure, fast and robust algorithms are required. They should adapt easily to a varying number of distances that are noisy and therefore not totally reliable. Furthermore, if trajectories are to be obtained, a further processing step is useful to smooth out the path of the object and improve the accuracy of the estimated track.
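To make the overdetermined case concrete, the sketch below solves the multilateration problem in the least-squares sense with a plain Gauss-Newton iteration. Gauss-Newton is used here only as one common iterative solver; it is not claimed to be the algorithm of [21] or the one proposed in this work. The anchor layout, starting point and function name are hypothetical.

```python
import numpy as np


def multilaterate(anchors, distances, x0, n_iter=50, tol=1e-12):
    """Least-squares position from N >= 4 anchor distances (Gauss-Newton)."""
    anchors = np.asarray(anchors, dtype=float)
    distances = np.asarray(distances, dtype=float)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iter):
        diff = x - anchors                      # shape (N, 3)
        ranges = np.linalg.norm(diff, axis=1)   # predicted distances
        residuals = ranges - distances
        jacobian = diff / ranges[:, None]       # d(range_i) / d(x)
        # Linearized least-squares step: jacobian @ step ~= residuals.
        step, *_ = np.linalg.lstsq(jacobian, residuals, rcond=None)
        x = x - step
        if np.linalg.norm(step) < tol:
            break
    return x


# Hypothetical example with five non-coplanar anchors and exact distances.
anchors = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
])
target = np.array([0.3, 0.4, 0.5])
distances = np.linalg.norm(anchors - target, axis=1)
estimate = multilaterate(anchors, distances, x0=[0.5, 0.5, 0.5])
print(estimate)
```

Dropping a row from `anchors` and `distances` leaves the code unchanged, which is the kind of adaptation to a varying number of distances that the text asks for.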
In this paper, several approaches to object location and tracking are proposed using multidimensional scaling (MDS) and optimization algorithms, and a qualitative evaluation of these algorithms is performed. In addition, the integration of the algorithms in a synthetic data generation framework is discussed. This use case shows how a dataset creation task, e.g., for an ultrasound gesture dataset, could benefit from these algorithms thanks to the flexibility to configure the desired output with different noise levels and gesture options. Moreover, since the desired data are configured by the user, the framework generates data and labels simultaneously. Applying this framework reduces both the risk of human error and the time required to generate synthetic labeled datasets.
MDS localization techniques have been researched previously, mostly for technologies such as Wireless Sensor Networks (WSN), radio or 5G [22]. However, to the best of our knowledge, these techniques have not been evaluated for emerging technologies such as airborne ultrasound. The aim of this work is therefore to apply MDS to ultrasound data for target localization.
This work is structured as follows: Section 2 presents the objectives to cover in this work. Section 3 explains the proposed new algorithms to perform both the infrastructure and target positioning. Section 3.4 explains the filter techniques studied in this work for smoothing out the trajectory, and Section 4 describes the simulation performed. Section 5 summarizes the results obtained, focusing on different parameters of each algorithm, and methods to improve the results via filtering or changing the infrastructure layout. Finally, Section 6 presents the conclusions of this work.
2. Envisioned System
The goal of the present work is to analyze the feasibility and performance of a synthetic data generation framework based on the researched algorithms, due to its capabilities to accurately generate numerical samples. The required input for the data generation is an initial selection of the followed path (equation or time series of the desired movement). At the same time, this framework would enable the user to generate a more varied dataset since the noise level can be controlled as well as different modifications of the initial data (including rotating, scaling and translating the samples) in the 3D axis, which can later be converted to different formats to fit the specific application, i.e., images or voxels.
This framework could ease data gathering tasks, as real sensors are not required for this process, and it can generate numerous relevant samples that emulate different scenarios/technologies based on the configuration selected by the user, such as the anchor distribution and noise levels.
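The rotation, scaling and translation of an initial path can be sketched as follows. The choice of rotating about the z axis around the centroid of the path, the circular gesture and all numeric values are illustrative assumptions; the framework itself may expose these modifications differently.

```python
import numpy as np


def rotation_z(angle_rad):
    """Rotation matrix about the z axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def augment(trajectory, angle_rad=0.0, scale=1.0, offset=(0.0, 0.0, 0.0)):
    """Rotate and scale a (T, 3) path about its centroid, then translate it."""
    traj = np.asarray(trajectory, dtype=float)
    centroid = traj.mean(axis=0)
    rotated = (traj - centroid) @ rotation_z(angle_rad).T
    return scale * rotated + centroid + np.asarray(offset, dtype=float)


# Hypothetical input path: a circle of radius 5 cm, 30 cm in front of the anchors.
t = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
circle = np.column_stack([0.05 * np.cos(t), 0.05 * np.sin(t), np.full_like(t, 0.30)])

# One labeled variant: same gesture class, different pose and size.
variant = augment(circle, angle_rad=np.pi / 6, scale=1.2, offset=(0.01, 0.0, 0.05))
print(variant.shape)
```

Because the label (the gesture class) is known from the input path, every variant produced this way is labeled by construction.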
This system would be beneficial for tasks such as gesture recognition based on multiple technologies, which numerous authors are researching. Most of the studies in this field are focused on radar [15,23], Wi-Fi [24] and ultrasound sensors [9,13,25]. In this paper, the framework will be evaluated for the generation of ultrasound data for gesture recognition. This technology is selected due to the emerging techniques with ultrasound sensors, which could be implemented directly on simple microcontroller-based devices, like that proposed in [25].
The system to be emulated with the proposed framework is assumed to perform the following tasks (Figure 1):
Distance estimation. The devices use ultrasound transceiver(s) to locally compute their distances to an object, e.g., the user’s hand, typically using ToF-based measurements. The pairwise distances between the anchors are also computed (with a lower frequency) to self-locate the anchor infrastructure.
Positioning algorithms. Using the pairwise distances between the anchors obtained in the previous point, the position of each anchor is computed. Then, using these positions and the distances between the user’s hand and all the anchors, the current position of the object is computed.
Tracking algorithms. The position of the object is periodically updated, effectively obtaining an estimation of its trajectory. This trajectory is filtered to improve its accuracy.
Recognition. The estimated trajectory is used as input for a gesture recognition stage, e.g., implemented with a neural network.
The current work focuses on the second and third steps, in which a time series of distances is transformed into the 3D trajectory of the object and the 3D positions of the anchors. Although the localization algorithms are tested with synthetic data in this paper, they could also be deployed in a real scenario for target positioning.
To evaluate the applicability of the proposed algorithms, the following criteria are used:
The computational requirements of the positioning and tracking algorithms must be low enough to be executed in real time on low-power devices. Furthermore, they must be flexible enough to adapt to time-varying and noisy conditions, with a potentially variable number of anchors in range.
The accuracy of both the estimated object trajectory and the anchor positions must be analyzed. The precision of the measured data directly affects the results when a classification algorithm is used to study the data. Because of this, it is important to ensure the high performance of the localization algorithms as well as of the proposed filtering techniques. To evaluate this, noise of the kind typically encountered in ultrasound systems is added to the raw distances. The positioning and tracking algorithms must provide optimal estimations and robust behavior in the presence of noise, missing distances and outliers.
4. Simulation Setup
After presenting the different alternatives for object and infrastructure positioning based on pairwise distance measurements, we estimate the performance of the proposed algorithms in terms of accuracy and execution speed. We compare the two different approaches depicted in Figure 2:
One-step approach (Figure 2a). The SMACOF MDS algorithm is used to simultaneously obtain the positions of the anchors and the moving objects. It is expected to be slower and less accurate if noisy dissimilarities (such as those between the moving objects) are introduced in the computation, but all the anchors and object positions are computed simultaneously.
Two-step approach (Figure 2b). The SMACOF MDS algorithm is used once to obtain the positions of the anchors. Then, the LM-BFGS optimization algorithm is used to compute only the coordinates of the moving object, and repeated periodically to update its position. This approach is faster, but relies on an accurate initial estimation of the anchor positions.
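The role of SMACOF in both approaches can be illustrated with a bare-bones version of its core iteration (the Guttman transform with unit weights). This is a simplified sketch, not the implementation evaluated in this work; the 3 x 3 grid with 5 cm spacing is a hypothetical stand-in for the layout of Figure 3, and the random start and stopping rule are assumptions.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix of the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def smacof(delta, n_dims=3, n_iter=500, tol=1e-12, seed=0):
    """Minimal SMACOF: returns coordinates and the stress at each iteration."""
    delta = np.asarray(delta, dtype=float)
    n = delta.shape[0]
    X = np.random.default_rng(seed).standard_normal((n, n_dims))
    history = []
    for _ in range(n_iter):
        D = pairwise_distances(X)
        # Raw stress; every pair appears twice in the matrix, hence the 0.5.
        history.append(0.5 * np.sum((D - delta) ** 2))
        if len(history) > 1 and history[-2] - history[-1] < tol:
            break
        with np.errstate(divide="ignore", invalid="ignore"):
            ratio = np.where(D > 0, delta / D, 0.0)
        # Guttman transform: off-diagonal -delta/d, diagonal = row sums.
        B = -ratio
        B[np.diag_indices(n)] = ratio.sum(axis=1)
        X = B @ X / n
    return X, history


# Hypothetical planar array of nine anchors (5 cm pitch, z = 0).
anchors = np.array([[ix * 0.05, iy * 0.05, 0.0] for ix in (-1, 0, 1) for iy in (-1, 0, 1)])
delta = pairwise_distances(anchors)
X, history = smacof(delta)
print("initial stress:", history[0], "final stress:", history[-1])
```

The recovered coordinates are only defined up to a rotation, translation and reflection, so they should be compared with the true layout through their pairwise distances or after an alignment step. Different seeds can end in different local minima, which is one reason to repeat the computation with several initializations.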
By using these two algorithms, it is possible to design the proposed framework for synthetic data generation. It executes the SMACOF MDS algorithm periodically to ensure that the position of the anchors is correct while simultaneously locating the target. Between these anchor checks, the LM-BFGS algorithm is used due to its low latency and high accuracy when the position of the anchors is known. This approach is faster and results in a smaller error, as we will discuss in Section 5.2 and Section 5.4. The SMACOF MDS and LM-BFGS optimization steps can be computed in a central processing node to which all the anchors report, or locally in the anchors or the mobile object, if they have access to all the distances. The particular communication scheme to disseminate the distances and the positions is out of the scope of this work. Finally, if a path is to be estimated and not only single positions, a smoothing filter is used to compute the trajectory of the object.
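The second step, locating the object once the anchor positions are known, can be sketched with SciPy's limited-memory BFGS solver. SciPy exposes it as the bounded variant `L-BFGS-B`; using it here, the squared-range-error cost, the lower bound on z and the anchor grid are assumptions made for illustration and may differ from the configuration evaluated in this work.

```python
import numpy as np
from scipy.optimize import minimize


def locate_target(anchors, distances, x0):
    """Position that best explains the measured anchor-to-object distances."""
    anchors = np.asarray(anchors, dtype=float)
    distances = np.asarray(distances, dtype=float)

    def cost_and_gradient(p):
        diff = p - anchors
        ranges = np.linalg.norm(diff, axis=1)
        residuals = ranges - distances
        gradient = 2.0 * (residuals / ranges) @ diff
        return np.sum(residuals ** 2), gradient

    result = minimize(
        cost_and_gradient,
        x0,
        jac=True,
        method="L-BFGS-B",
        # Assumption: anchors lie in the z = 0 plane, so the search is kept
        # in the half-space in front of them to resolve the mirror ambiguity.
        bounds=[(None, None), (None, None), (1e-6, None)],
        options={"ftol": 1e-14, "gtol": 1e-10},
    )
    return result.x


# Hypothetical planar array of nine anchors (5 cm pitch, z = 0).
anchors = np.array([[ix * 0.05, iy * 0.05, 0.0] for ix in (-1, 0, 1) for iy in (-1, 0, 1)])
target = np.array([0.02, -0.01, 0.30])
distances = np.linalg.norm(anchors - target, axis=1)

estimate = locate_target(anchors, distances, x0=[0.0, 0.0, 0.1])
print(estimate)
```

In a tracking loop the previous estimate is a natural choice for `x0`, which keeps the number of iterations per update small.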
Consequently, this framework can be used to generate synthetic trajectories for an arbitrary number of anchor configurations and gestures to fit multiple scenarios and applications, as shown in the synthetic data creation block in Figure 2. At the same time, data augmentation for a single gesture and anchor setup is possible by varying the random initialization seeds of the noise for the SMACOF MDS and the LM-BFGS optimization algorithms, as shown in the data estimation block in Figure 2. The framework can thus efficiently generate numerous samples of the desired data to cover the possible results of measurements with real devices. Furthermore, different noise models and strengths can be injected into the raw distances, emulating different disturbances and inaccuracies that the ultrasound distance gathering system can experience in a real deployment.
4.1. System Modeling
In this work, the framework emulates a system of nine ultrasound anchors sending a sinusoidal pulse (reference pulse). The echoes are then sampled, and the ToF is obtained with classical cross-correlation techniques, using a reference signal. Every pairwise distance is computed at 20 Hz, i.e., a new target position is computed every 50 ms. A two-step approach is used to compute the anchor and target positions, as depicted in Figure 2.
The infrastructure of the anchors for the proposed framework is shaped as a 2D array of nine anchors (ultrasound transceivers) located on the same surface (in the XY plane with $z = 0$), which represents a plausible configuration for future applications. Specifically, the anchors are located as seen in Figure 3. The positioning is limited to the space in front of said surface ($z > 0$) since the sign of the $z$ coordinate cannot be determined when all the anchors are in the same plane. Furthermore, the ultrasound transceivers used as an experimental support for the simulations have a detection range limited to 180 degrees in the Z-axis.
The anchor array is able to transmit and receive ultrasonic signals and locate passive objects based on ToF measurements. It has two operating modes:
To calculate the pairwise distances between the anchors, they actively exchange ultrasonic signals (two-way ranging).
To calculate the pairwise distances between the anchors and the mobile object, they actively transmit and then sense the reflected echo. Anchors can be synchronized, in which case, only one of the transceivers needs to transmit and they all can receive the echo and timestamp it based on a common clock. Otherwise, they can all transmit and sense only the echo coming from their own transmission; in such a case, time synchronization is not required.
4.2. Noise Modeling
The noise in the distances obtained with an ultrasound-based measurement system depends on the accuracy of the ToF samples. There are different factors that impact the performance, such as the bandwidth of the transmitted pulse and the sampling rate of the acquisition stage.
Based on our experimental measurements using the system of Figure 1, the noise, $N$, in the computed distances, $d$, can be modeled as unbiased (zero average) additive white Gaussian noise (AWGN), with a given standard deviation, $\sigma$, and a probability density function, $f_N(n)$, given by the following:

$$
f_N(n) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left(-\frac{n^2}{2\sigma^2}\right)
$$
In a representative x-y-z point,
, the measured equivalent noise in the Euclidean distance, after acquiring 50,000 samples, can be fitted with a
mm, as seen in
Figure 4. This provides a good estimation of the scale of the expected noise in a real system, and it is used as a reference to model the noise in the simulations.
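A minimal sketch of how such zero-mean Gaussian noise could be injected into the raw distances is shown below. The standard deviation of 1 mm is only a placeholder and not the value fitted in Figure 4; the anchor grid, the circular path and the function name are likewise hypothetical. The 20 Hz update rate follows the system model above.

```python
import numpy as np


def noisy_distances(anchors, trajectory, sigma, rng):
    """Anchor-to-object distances for every sample, plus zero-mean AWGN.

    Returns an array of shape (T, N): T trajectory samples, N anchors.
    """
    diff = trajectory[:, None, :] - anchors[None, :, :]
    clean = np.linalg.norm(diff, axis=-1)
    return clean + rng.normal(0.0, sigma, size=clean.shape)


# Hypothetical planar array of nine anchors (5 cm pitch, z = 0).
anchors = np.array([[ix * 0.05, iy * 0.05, 0.0] for ix in (-1, 0, 1) for iy in (-1, 0, 1)])

# Two seconds of a circular gesture sampled at 20 Hz (one position every 50 ms).
t = np.arange(40) / 20.0
trajectory = np.column_stack([
    0.05 * np.cos(np.pi * t),
    0.05 * np.sin(np.pi * t),
    np.full_like(t, 0.30),
])

sigma = 1e-3  # placeholder standard deviation in meters (assumption)
rng = np.random.default_rng(0)
measured = noisy_distances(anchors, trajectory, sigma, rng)
print(measured.shape)
```

Changing the seed of `rng` yields a new noisy realization of the same gesture, which is the augmentation mechanism described for the data estimation block.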
In summary, this section has described the two-step algorithm proposed and evaluated in this work, as well as the steps taken to particularize the framework in order to validate ultrasound as a technology for that algorithm.
6. Conclusions
This work presents a novel two-step technique to perform general infrastructure and moving-object positioning based on measured pairwise distances. In the first step, MDS is used to obtain the coordinates of the anchors. This step is repeated with a low frequency, e.g., to correct minor and infrequent potential displacements of the anchors. We use the SMACOF variant of the mMDS family of algorithms. Once the coordinates of the anchors are computed, a fast optimization algorithm is used to obtain the unknown coordinates of the objects. This step is repeated with a high frequency. The LM-BFGS optimization algorithm is used for this step. Its performance is thoroughly analyzed with simulations, particularized to the use case of a system with ultrasound transceivers. The distribution and shape of the anchor infrastructure, the size of the region in which the positioning takes place and the strength of the noise are realistically modeled after such a system.
The two-step approach described in this work would be optimal in scenarios where the position of the anchors does not change frequently over time. Therefore, the one-step approach described in Section 4, in which all the positions are computed at the same time, is reserved for special situations, e.g., when there are no anchors (all the objects are considered mobile) or when we want to simultaneously obtain the position of several (more than a dozen) mobile objects. For the rest of the scenarios, our approach performs the localization with low computational time, making it suitable for use in real-time systems and even in constrained edge devices.
Efficient and simple filtering techniques significantly reduce the error and improve the reconstruction of the real path followed by the mobile object. This feature can be exploited when using the proposed algorithms for synthetic data generation. The current dataset creation step for applications such as AI models is time consuming, due to the complexity of the recording and labeling tasks, which could be reduced by using the proposed system as a synthetic data generation framework. This framework is independent of the hardware and could simulate trajectories or movements from a large range of sensors. The parameters of this framework (noise, gesture, and the number and position of the anchors) are defined by the user through the initial configuration.
The use of ultrasonic signals for target positioning is widely researched, but to the best of our knowledge, our two-step approach inspired by wireless sensor network’s positioning algorithms has not been used or described. The proposed technique enables using an arbitrary number of ultrasound transceivers, and removes the constraint of knowing the position of the anchors beforehand, while providing an optimal AWGN rejection. This could drive the adoption of ultrasound technology in the positioning field and foster the research of novel applications and electronic components based on non-audible acoustic waves.