1. Introduction
With the wide-spread of the Internet of Things, there will be more and more context-aware applications that collect environmental information through sensor nodes (hereinafter referred to as nodes) and make corresponding actions in real time in the future smart society. Nodes will be deployed and connected to the Internet through technologies such as NB-IoT and LoRa [
1]. Generally, during the data collection process, the sink node (hereinafter referred to as sink) needs to receive data from each node one by one. When there are tens of thousands of nodes within the coverage of a sink, it will take a lot of time to receive all the data. In addition, the nodes share the same channel in the communication. If the network uses the CSMA (carrier-sense multiple access) protocol for channel access, the increase in the number of nodes will lead to frequent transmission collisions, which further degrades spectrum efficiency.
In some IoT applications, people only want the statistics of the data (e.g., the average temperature or the average noise level in a certain area and so on), not the individual data collected by nodes. For such applications, there is an efficient method called over-the-air computation (AirComp) [
2,
3]. In AirComp, nodes first preprocess the data, and then all nodes simultaneously transmit the preprocessed data through analog signals [
4]. All signals add together at the antenna of the sink, which corresponds to the sum operation. By using analog AirComp, the computation error can be made smaller than that of digital schemes when using the same amount of resources (power and bandwidth) [
5]. As well as the sum operation, AirComp also can support any kind of nomographic functions by proper preprocessing and postprocessing [
6,
7,
8]. Recently, deep AirComp has been studied, using federated learning in the pre-processing and post-processing stages, which enables more advanced processing of sensor data [
9].
Typically, in order to ensure unbiased data fusion, all nodes use transmission power control [
10,
11] to achieve a consistent signal magnitude when all signals arrive at the sink. Two problems may occur in the original AirComp model:
For nodes far away from the sink with low channel gain, greater transmission power is required to reach the target magnitude in order to minimize the computation error. This will result in a short lifetime of nodes and the entire network.
Under the transmission power limit, some nodes cannot reach the target magnitude even though they use the maximum transmission power. This leads to the computation error, which is measured by mean squared error (MSE).
Various methods have been studied to reduce the MSE from different aspects. The basic method is to apply transmission power control [
10,
11] using high power when the channel gain is low. However, this alone cannot well solve the problem when some nodes are in deep fading or far from the sink.
Conventional methods to channel fading can be applied in AirComp, but simultaneous transmission of multiple signals needs to be taken into account carefully. Time diversity is considered in [
12], where the computation is distributed to multiple slots, and each node can independently select one slot to transmit its signal based on its channel gain. The optimal number of slots is formulated, considering the benefit of channel gain improvement and the demerit of increased noise power due to using multiple slots. A counterpart in the frequency domain is studied in [
13], where the computation is distributed over multiple channels.
Another effective method to channel fading is path diversity, e.g., relay for network coding [
14]. Because AirComp usually works for analog signals, amplify-and-forward based relay is investigated in [
15] using a dedicated relay node to help nodes with low channel gains. Two relay policies, Simple Relay Policy (SRP) and Coherent Relay Policy (CRP), are studied. The two relay policies are further refined with a new metric, by which the transmission power is reduced while the MSE is kept low. However, the transmission power of the relay node may be much higher than that of normal nodes, leading to a surge in power consumption. In addition, the impact of the relay node on the network lifetime is not examined in this article. Under the constraint of transmission power, not all nodes can use the relay. In [
16], node scheduling, deciding which nodes can use the relay, is studied. Recently, new mechanisms, such as intelligent reflective surfaces, were studied in [
17]. The phases of reflective items can be adjusted so that signals add constructively at the sink. However, signals are not amplified at the reflective surface, and usually many reflective items are required to reach a reasonable signal strength. In addition, when multiple signals use the same reflective surface, the control of the surface phase is a non-convex problem.
It is also possible to use multiple antennas at the sink [
18]. However, the alignment in channel gain is different from beamforming [
19] because in AirComp different signals add together, not necessarily enhancing each other. The control of the antenna array is a non-convex problem, and it is difficult to find the optimal solution. Extending the distance between antennas, simultaneous receiving at multiple sinks is studied in [
20], where the interference needs to be taken into account.
The knowledge of channel gains at the sink and the synchronization between nodes and the sink usually are assumed, which, however, is non-trivial. Channel estimation is studied in [
18], using an iterative method with multiple antennas. The statistics of channel gains instead of their instantaneous values are exploited in [
12]. A more aggressive policy, blind AirComp, is explored in [
21].
Typically, it is assumed that signals from all sensors are non-correlated, which simplifies the analysis. However, when sensors are densely deployed, or in some specific applications (e.g., when AirComp is used for model aggregation in federated learning), signals from multiple nodes are correlated. This problem is explored in [
22,
23], which requires the correlation information and is not well solved yet.
As for power consumption, in [
24], Zang et al. try to minimize the sum power of the sensors under the constraint of MSE. The results show that the sensors with poor or good channel conditions should use less power than the ones with moderate channel conditions to reduce the MSE and the sum power of nodes. However, the random variation of channel gains is not taken into account.
To solve the aforementioned problems, in this paper, we investigate relay communication for AirComp, and use relay selection to both extend network lifetime and reduce MSE. To the best of our knowledge, this is the first work on relay selection for AirComp. In this method, each node and the sink are equipped with a single antenna. It is assumed that signals from all nodes are non-correlated and channel gains are known at the sink. We first calculate a set of candidate relay nodes based on channel gains. Then, an evaluation metric, considering transmission power, remaining battery capacity, and MSE, is computed for each candidate, based on which the optimal relay is selected. In addition, the relay node is periodically selected at regular intervals to avoid over-consumption of power at the relay node. This basic method is further enhanced, explicitly considering the network lifetime in the relay selection. Extensive simulation evaluations show that the basic method can extend the network lifetime and reduce MSE, while the enhanced method can further prolong the network lifetime under the same MSE.
In the rest of this paper,
Section 2 reviews the related work on the AirComp model, especially the relay transmission methods. Then,
Section 3 presents the proposed method, both the selection of candidate relay nodes and the metric for evaluating the suitability of candidate relay nodes.
Section 4 describes the simulation setting and analyzes the results and factors affecting the lifetime and MSE. Finally,
Section 5 concludes this paper and points out future works.
3. Proposed Method
In this paper, instead of using a dedicated relay node, we propose a new method to properly select an ordinary node as a relay node in the AirComp network. This method not only considers the power consumption and maximum power constraint of nodes in the network but also reduces the MSE as much as possible. Therefore, we call it relay Selection considering both Lifetime And MSE (SLAM). Here, network lifetime is defined as the difference between the beginning of communications in the network and the time when the first node in the network depletes its energy.
It is already proven that CRP works better than SRP in the amplify-and-forward-based relay [
15]. Therefore, in the following we mainly study how to dynamically select a relay for CRP.
3.1. System Model
All nodes and the sink use a single antenna. Nodes synchronize their time with the sink. The sink knows the channel coefficient () between itself and all nodes and the channel coefficients among all nodes, and performs centralized relay selection. The estimation of channel gain is left for future work. Block fading is assumed, i.e., the channel coefficient is stable within two slots.
Because AirComp is performed on analog signals, amplify-and-forward (AF)-based relay is used, in the same way as in [
15]. However, different from the dedicated relay node used in [
15], we chose an ordinary node in the network as a relay node. This choice also caused us to pay attention to the transmission power of the relay node to avoid an over-increase in its power consumption. Because the relay node has to transmit its own signal as well, its transmission power is
The transmission power of the relay node is decided by the nodes using the relay, under the maximal power constraint. Therefore, it always approaches, but is less than, the maximal power constraint.
The proposed method can be roughly divided into two modules—the calculation of candidate relay nodes and the selection of a relay node from the candidates, as shown in
Figure 4.
3.2. Calculation of Candidate Relay Nodes
Two conditions are given for deciding whether a node can be used as a relay node for over-the-air computation, as follows:
The transmission power of a node using the relay is less than that in the basic AirComp model, i.e., the relay node should help to reduce the transmission power of other nodes using relay.
The relay node’s transmission power is less than the maximum value.
We designed Algorithm 1 to select candidate relay nodes. In Algorithm 1, we regarded each ordinary node in turn as a relay node. When regarding node
as a relay node (Line 5), we computed the difference of channel gains to relay and sink as
(Line 7) and sorted nodes in an ascending order of
(Line 8). This is to let nodes far away from the sink but near to the relay use the relay node first. Then, with each possible number of nodes that can use node
as the relay, we recorded the corresponding MSE (Line 9–13). Finally, we found the optimal number of nodes using relay according to the MSE value, and recorded nodes’ power and MSE (Line 14). After all nodes were iterated, we sorted
in an ascending order (Line 16) and selected a number of top nodes as relay candidates
(Line 17).
Algorithm 1: Select candidate relay nodes |
Procedure: Find candidate relay nodes Input: : Channel coefficient between sink and node, : Channel coefficient between nodes and ; Output: : A set of candidate relay nodes; For do | Regard node as a relay node | Get (channel coefficient between relay and node ); | Calculate the difference of channel gain to relay and sink, ; | Sort in the ascending order; | For do | | Assume top j nodes in use the relay node; | | Minimize MSE of CRP, with power constraint for all nodes including relay; | | Calculate transmission power () of all nodes; | End | Find optimal number of nodes that can use node as relay and record End Sort in the ascending order; Select a number of top nodes as relay candidates based on ; Return
|
3.3. Relay Selection Considering Both MSE and Remaining Energy
Here, we present the evaluation metric for selecting a node from the candidate relay nodes as the relay.
3.3.1. Evaluation Metric for Selecting Relay
Relay selection should consider both MSE and the remaining energy of candidate relay nodes. Usually, it is preferred to select for the relay a node that both has a large remaining energy (long lifetime) and a small MSE. To this end, we designed a new metric to evaluate the suitability of each candidate as a relay, considering both factors, as follows:
where
is the remaining energy of node
,
is the normalized value of
,
is the normalized value of
, and
and
denote the weights of remaining energy and MSE, respectively. Here, we multiply the value of MSE by −1 in the evaluation metric to ensure that for both
and
, a large value means a better suitability.
Using Equation (10), we can compute the metric for each candidate relay node in
. Then, from
, the node with the largest metric
is selected as the relay, as follows:
Relay selection is divided into two steps. The remaining energy per node is not considered when selecting candidate relay nodes, but is considered when selecting one as a relay from the candidate relay nodes. This is due to the following reason: in slow channel fading, the channel gain is stable for a relatively long time, but the remaining energy per node changes with time. Then, it is possible to compute the candidate relay nodes at a long interval, while periodically selecting a relay at a short interval, so as to distribute the load and power consumption to different nodes.
3.3.2. Problem of the SLAM
We performed some simulations to evaluate the impact of parameter
, using the condition in
Table 1.
Figure 5 shows how network lifetime and MSE change with
. With a large
, the metric focuses more on remaining energy instead of MSE. Therefore, MSE increases with
. However, the lifetime does not always increase with
. It decreases when
becomes greater than 0.6, which is different from the expectation.
During the simulation process, we found that when a node in the network was far away from both the relay node and the sink, the channel gain of this node was very low, causing it to use high transmission power in the communication. As a result, among the nodes in the network, the first node that runs out of energy is not the node that acts as the relay node. The remaining energy of the relay node is not used to improve the network lifetime in this case.
3.4. Extending the Basic Method (SLAM-E)
To solve the problem of the basic method, we designed Algorithm 2 to explicitly predict the remaining lifetime of the network when a relay node is selected.
In Algorithm 2, regarding each candidate in as a relay in turn, we predicted the network lifetime based on the remaining energy and current power consumption of each node (lines 5 to 9). Here, network lifetime is the minimum of all nodes’ lifetime (line 10). By iterating over all the candidate relay nodes in turn, the network lifetimes corresponding to all the candidate relay nodes are obtained.
Then, we updated the metric in Equation (10), replacing
by
, as follows.
The other process is the same as the basic method in Algorithm 1.
Algorithm 2: Predict the remaining lifetime of the network |
Procedure: Find network lifetime Input: (Number of nodes), , (the set of all nodes’ remaining battery); Output: recording the lifetime of each node in ; For do | Regard node as relay , and get remaining energy of all nodes (); | For = 1, 2, …, , do | | Calculate the transmission power of node j (); | | ; | End | Network lifetime computation: ; End Return L;
|
3.5. Analysis of Computational Complexity
In the network, the number of nodes is , and the number of candidate relay nodes is . Here, denotes the number of the candidate relay nodes. Both the basic and extended methods use Algorithm 1 to compute the set of candidate relay nodes. In Algorithm 1, a loop with a loop count is performed from line 4 to line 15, and another inner loop with a loop count is performed from line 9 to line 13. In line 11, the CRP algorithm is used, whose time complexity is . Therefore, the time complexity of Algorithm 1 is.
In Algorithm 2, an outer loop with a loop count is performed from line 4 to line 11, and an inner loop with a loop count is performed from line 6 to line 9. Therefore, the time complexity of Algorithm 2 is.
The basic method (SLAM) directly computes the evaluation metric by Equations (10)–(11), and the extra computation is . The extended method (SLAM-E) needs to predict the network lifetime, and the extra computation is .
Then, the overall computation complexity of both methods is , decided by Algorithm 1.