1. Introduction
Wireless sensor networks (WSNs) are self-organized networks composed of many tiny, cheap, resource-constrained sensor nodes. These nodes are usually deployed in a distributed fashion to support various applications, such as healthcare monitoring, transportation systems, industrial services, and environmental measurement of humidity or temperature [1]. In each case, efficient gathering of the target data is one of the primary missions.
In a typical scenario, a WSN contains many ordinary sensor nodes and a base station called the sink node. The ordinary nodes can only perform simple measurement and communication tasks, since they are equipped with a limited power supply and it is usually difficult to replace or recharge their batteries. In contrast, the sink node is capable of performing complex operations, since it is usually supplied with abundant resources. Thus, how to balance energy consumption and develop energy-efficient data collection protocols remains a research hotspot.
To reduce the energy consumption of data gathering in WSNs, distributed source coding (DSC) [2] was proposed to compress the raw data among the ordinary nodes. DSC-based data collection protocols consist of two important procedures. The first is the collection of the spatial-temporal correlation properties of the raw data. The second is a coding step based on the Slepian-Wolf coding theorem. The coding process imposes no communication burden among sensor nodes, but the data correlation of the whole network must be calculated at the sink node before data collection, which results in a relatively high computational cost.
In recent years, compressive sensing has emerged as a new approach to signal acquisition that can guarantee exact signal reconstruction from a small number of measurements [3]. In compressive sensing-based data gathering methods, data compression and data collection are integrated into a single procedure, and the heavy computational burden is transferred to the base station, where the incomplete data can be recovered by various reconstruction algorithms. Nevertheless, the key requirement for exact reconstruction is that the signals be sparse in some base dictionary. Sparse representation expresses signals as sparse linear combinations of the basis atoms. Therefore, dictionary learning for sparse signal representation is one of the core problems of compressive sensing.
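To make the acquisition-then-reconstruction pipeline concrete, here is a minimal sketch (not the paper's ODL-CDG method): a synthetic S-sparse signal is compressed by a Gaussian measurement matrix, and the "sink" recovers it with a simple Orthogonal Matching Pursuit routine. All dimensions and names are illustrative.

```python
import numpy as np

def omp(Phi, y, sparsity):
    """Greedy Orthogonal Matching Pursuit: recover sparse x from y = Phi @ x."""
    residual = y.copy()
    support = []
    x_hat = np.zeros(Phi.shape[1])
    for _ in range(sparsity):
        idx = int(np.argmax(np.abs(Phi.T @ residual)))   # most correlated atom
        support.append(idx)
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)  # LS refit
        residual = y - Phi[:, support] @ coef
    x_hat[support] = coef
    return x_hat

rng = np.random.default_rng(0)
N, M, S = 100, 30, 3               # signal length, measurements, sparsity
x = np.zeros(N)
x[rng.choice(N, size=S, replace=False)] = rng.standard_normal(S)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # Gaussian sensing matrix
y = Phi @ x                        # M compressed measurements, M << N
x_hat = omp(Phi, y, S)
print(np.linalg.norm(x - x_hat) / np.linalg.norm(x))   # small recovery error
```

With far fewer measurements than samples (30 vs. 100), the sparse signal is still recovered almost exactly, which is the property the data gathering schemes below exploit.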
This paper presents an online dictionary learning-based compressive data gathering (ODL-CDG) algorithm. The proposed algorithm aims to reduce the energy consumption of data gathering in WSNs while remaining robust to environmental noise. The main contributions of this paper can be summarized as follows:
- (1) Inspired by the periodicity of natural signals, the learned dictionary is constrained to a sparse structure in which each atom is a sparse linear combination of atoms of a base dictionary. To our knowledge, we are the first to apply such a sparse structured dictionary in the compressive data gathering process.
- (2) The self-coherence of the learned dictionary is introduced as a penalty during the optimization procedure, which reduces the reconstruction error caused by ambient noise.
- (3) For the sparse structured dictionary D and the Gaussian observation matrix Φ, we theoretically demonstrate that the sensing matrix P = ΦD satisfies the restricted isometry property (RIP) with very high probability. Moreover, a lower bound on the number of measurements necessary for exact reconstruction is given.
- (4) With these considerations, an online dictionary learning algorithm is designed to improve the adaptability of the data gathering algorithm to a variety of practical applications. The training data are gathered in parallel with the compressive sensing procedure, which avoids the enormous energy cost of transmitting raw training data.
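As a toy illustration of contribution (1), a sparse structured dictionary can be formed as D = Ψ[I | Σ], where each additional atom is a sparse combination of base atoms. The DCT base Ψ and the random sparse mixing matrix Σ below are assumptions for illustration only, not the learned quantities of the paper.

```python
import numpy as np

N = 16
# Orthonormal DCT-II base dictionary Psi (columns are atoms)
n, k = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
Psi = np.cos(np.pi * (2 * n + 1) * k / (2 * N))
Psi[:, 0] /= np.sqrt(N)
Psi[:, 1:] *= np.sqrt(2 / N)

# Sparse mixing matrix Sigma: each extra atom combines a few base atoms
rng = np.random.default_rng(1)
Sigma = np.zeros((N, N))
for j in range(N):
    rows = rng.choice(N, size=3, replace=False)   # 3 non-zeros per atom
    Sigma[rows, j] = rng.standard_normal(3)

# Structured redundant dictionary D = Psi [I | Sigma], size N x 2N
D = Psi @ np.hstack([np.eye(N), Sigma])
print(D.shape)
```

The sparse Σ keeps each learned atom tied to a handful of base atoms, which is the structural constraint exploited in the analysis of Section 4.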
The remainder of this paper is organized as follows: Section 2 reviews previous work on dictionary learning and energy-efficient data gathering in wireless sensor networks. Section 3 presents the mathematical formulation of the problem in detail. Section 4 demonstrates the RIP of the sensing matrix. Section 5 gives the optimized solution of the proposed ODL-CDG problem and analyzes its convergence. Section 6 verifies the performance of the ODL-CDG algorithm on synthetic and real datasets. Finally, Section 7 draws conclusions and proposes future work.
2. Related Work
In the past few years, much effort has gone into designing data gathering techniques that reduce energy consumption in WSNs. Luo et al. [4] first proposed a complete design for compressive sensing-based data gathering (CDG) in large-scale wireless sensor networks. In CDG, the sensor readings are assumed to be spatially correlated; the communication cost is reduced and load balance is achieved simultaneously. Liu et al. [5] introduced a novel compressed sensing method called expanding window compressed sensing (EW-CS) to improve the recovery quality of non-uniformly compressible signals. Shen et al. [6] proposed non-uniform compressive sensing (NCS) for signal reconstruction in WSNs. NCS takes both the spatio-temporal correlation of the sensed data and the network heterogeneity into consideration, which leads to significantly fewer samples. In [7], the authors presented a quantitative analysis of the primary energy consumption of WSNs and pointed out that compressed sensing and distributed compressed sensing can act as energy-efficient sensing approaches in comparison with other compression techniques.
The abovementioned compressive sensing-based data gathering methods can relieve the energy shortage problem and prolong the network lifespan. However, they are limited in that the signals are assumed to be sparsely representable in a specified basis, e.g., a wavelet, Discrete Cosine Transform (DCT) or Fourier basis. In practice, a single predetermined basis may not sparsely represent all types of signals, since WSNs serve a wide variety of applications.
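This limitation can be seen numerically: a smooth signal compresses well in the DCT basis, while an impulsive one does not. The sketch below (with a hand-built orthonormal DCT-II matrix and an illustrative 99%-energy criterion) counts how many coefficients each signal needs.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II matrix whose columns are the basis atoms."""
    n, k = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    Psi = np.cos(np.pi * (2 * n + 1) * k / (2 * N))
    Psi[:, 0] /= np.sqrt(N)
    Psi[:, 1:] *= np.sqrt(2 / N)
    return Psi

def effective_sparsity(x, Psi, keep=0.99):
    """Fewest DCT coefficients capturing `keep` of the signal energy."""
    c = np.sort(np.abs(Psi.T @ x))[::-1] ** 2
    return int(np.searchsorted(np.cumsum(c) / c.sum(), keep)) + 1

N = 64
Psi = dct_matrix(N)
smooth = np.cos(2 * np.pi * 3 * np.arange(N) / N)   # slowly varying signal
spike = np.zeros(N)
spike[10] = 1.0                                      # impulsive signal
print(effective_sparsity(smooth, Psi), effective_sparsity(spike, Psi))
```

The smooth signal needs only a few DCT atoms, while the spike spreads its energy over most of them, which motivates learning the dictionary from the data instead.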
To adapt to signals of enormous diversity and dynamics, dictionary learning from a set of training signals has received much attention. The goal is to train a dictionary that can decompose the signals using only a few atoms. The K-SVD method [8] is one of the best-known dictionary learning algorithms and leads to much more compact representations of signals. Duarte et al. [9] proposed to train the dictionary and optimize the sampling matrix simultaneously; the motivation is to minimize the mutual coherence between the dictionary and the projection matrix. Christian et al. [10] presented a dictionary learning algorithm called IDL, which trades off the coherence of the dictionary to the observations of the signal class against the self-coherence of the dictionary atoms. To accelerate the convergence of K-SVD, an overcomplete dictionary was proposed in [11]; the authors suggested updating the atoms sequentially, leading to much better learning accuracy than K-SVD. In [12], a new dictionary learning framework for distributed compressive sensing was presented that utilizes the data correlation between intra-nodes and inter-nodes, resulting in improved compressive sensing (CS) performance.
However, the above work does not consider the case where there is no access to the original data. Moreover, obtaining the full original data may be costly in wireless sensor networks. That is our motivation for learning the dictionary through a compressive sensing approach. Studer et al. [13] investigated dictionary learning from sparsely corrupted or compressed signals. In [14], the authors further extended the problem of compressive dictionary learning based on sparse random projections. The idea came from their previous paper [15], where the compressive K-SVD (CK-SVD) algorithm was proposed to learn a dictionary from compressive sensing measurements. Aghagolzadeh et al. [16] associated the spatial diversity of compressive sensing measurements without additional structural constraints on the learned dictionary, which guarantees convergence to a unique solution with high probability.
Nevertheless, none of the methods mentioned above considers environmental noise. As analyzed in Section 3, the reconstruction error caused by environmental noise is positively correlated with the self-coherence of the learned dictionary. Thus, the self-coherence of the learned dictionary is added as a penalty term during the dictionary updating step, and structural constraints are also imposed on the novel dictionary.
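For illustration, self-coherence can be measured as the largest off-diagonal entry of the Gram matrix of the normalized atoms; the dictionary below is a random stand-in, not a learned one.

```python
import numpy as np

def self_coherence(D):
    """Largest absolute inner product between two distinct, normalized atoms."""
    Dn = D / np.linalg.norm(D, axis=0, keepdims=True)
    G = np.abs(Dn.T @ Dn)
    np.fill_diagonal(G, 0.0)
    return G.max()

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 40))       # a random 20 x 40 redundant dictionary
print(self_coherence(D))                # fairly high for random redundant D
print(self_coherence(np.eye(20)))       # 0.0: orthonormal atoms are incoherent
```

Penalizing this quantity during dictionary updates pushes atoms apart, which is the mechanism by which the noise-induced reconstruction error is restrained.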
4. Necessary Guarantees for Signal Reconstruction
Cai et al. [23] proved that the Basis Pursuit algorithm can guarantee the reconstruction of Equation (4), as stated in the following theorem:
Theorem 1. Assume that the measurement matrix Φ satisfies the RIP for some sufficiently small restricted isometry constant. Let θ be an S-sparse vector. Then, the recovery error of Equation (4) is bounded by the noise level.
The above theorem guarantees the reconstruction of signals that are sparse in an orthonormal basis. However, we mainly address the case where the signals are sparse not in an orthonormal basis but in a redundant dictionary D ∈ R^{N×2N}, as described in Section 3.2 and Section 3.3. For convenience of exposition, the following two lemmas are given:
Lemma 1. Let the entries of Φ ∈ R^{M×N} be independent normal variables with mean zero and variance M^{−1}. Let D_Λ, with |Λ| = S, be the submatrix extracted from the columns of the redundant matrix D. Define the isometry constant δ_Λ = δ_Λ(D) and ν := δ_Λ + δ + δ_Λδ for 0 < δ < 1. Then:
with probability exceeding:
where c is a positive constant; in particular, c = 7/18 for the Gaussian matrix Φ.
Proof. The proof of Lemma 1 can be found in [24]. □
Lemma 2. The restricted isometry constant of a redundant dictionary D with coherence μ is bounded by δ_S(D) ≤ (S − 1)μ.
Proof. This follows from the proof in [25]. □
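The coherence bound of Lemma 2 can be verified numerically on a small dictionary. The sketch below uses the standard Gershgorin-type bound δ_S(D) ≤ (S − 1)μ and exhaustively computes the restricted isometry constant over all size-S supports; the dimensions are illustrative.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
N, K, S = 8, 12, 3
D = rng.standard_normal((N, K))
D /= np.linalg.norm(D, axis=0)            # unit-norm atoms

G = np.abs(D.T @ D)
np.fill_diagonal(G, 0.0)
mu = G.max()                               # dictionary coherence

# Exhaustive restricted isometry constant over all size-S supports
delta = 0.0
for idx in combinations(range(K), S):
    eig = np.linalg.eigvalsh(D[:, idx].T @ D[:, idx])
    delta = max(delta, 1.0 - eig[0], eig[-1] - 1.0)

print(delta, (S - 1) * mu)                 # delta never exceeds (S - 1) * mu
```

The exhaustive check is only feasible for tiny dictionaries, which is precisely why coherence-based bounds like Lemma 2 are useful in the analysis.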
Theorem 2. Assume that our redundant dictionary has the structure D = ΨΘ = Ψ[I_{N×N} Σ_{N×N}], where Ψ is an orthogonal base dictionary, e.g., the discrete cosine transform (DCT) basis of R^N, with N = 2^{2p+1}. The number of atoms is K = 2^{2p+2}. Suppose that the sparsity of the signal is smaller than 2^{p−4}. Then the number of samples necessary to guarantee signal reconstruction is obtained by:
with the constants C1 ≈ 524.33 and C2 ≈ 5.75.
Proof. For t > 0, assume that the local isometry constant of the matrix P = ΦD is δ_Λ(P), which is no larger than δ_Λ(D) + δ + δ_Λ(D)δ with probability at least 1 − e^{−t}.
By Lemma 1, we obtain that:
Thus the global isometry constant δ_S(P) is bounded over all supports Λ of size S. So:
Using the Stirling formula, and confining the above term to be less than e^{−t}, the following inequality is obtained:
The above derivation states that δ_S(P) is less than δ_S(D) + δ + δ_S(D)δ with probability at least 1 − e^{−t} whenever this inequality is satisfied.
Let μ be the coherence of the dictionary D, and assume:
Then, combining this with Lemma 2, we obtain:
Thus, defining δ = 7/33 yields:
As demonstrated by Theorem 1, the necessary number of samples to achieve this isometry constant is:
Replacing S in Equation (21) with 2S, and plugging K = 2^{2p+2} and δ = 7/33 into Equation (21), the necessary number of samples is finally obtained. That is:
6. Simulation
This section presents simulation results on synthetic data and real datasets. The performance of the proposed dictionary is compared with a pre-specified dictionary (the DCT dictionary) and with other dictionary learning approaches (K-SVD, IDL and CK-SVD).
6.1. Recovery Accuracy
The initial basis Ψ is a 50 × 50 DCT matrix. A set of training signals is generated by linear combinations of the original synthetic data. This is accomplished by applying a projection matrix Φ with independent and identically distributed Gaussian entries and normalized columns. The input parameters to Algorithm 1 are λA = 0.1, λΘ = 0.05, εstop = 0.001 and T = 100.
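The measurement setup described above can be sketched as follows; the "original data" here is a random stand-in for the paper's synthetic signals, and the 30% sampling ratio is one illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50                      # signal dimension (matches the 50 x 50 DCT basis)
M = 15                      # number of compressive measurements (30% sampling)

# i.i.d. Gaussian projection matrix with unit-norm columns
Phi = rng.standard_normal((M, N))
Phi /= np.linalg.norm(Phi, axis=0, keepdims=True)

X = rng.standard_normal((N, 100))   # stand-in for the original synthetic data
Y = Phi @ X                          # compressed training observations
print(Y.shape)
```

Each column of Y is a compressed view of one training signal; the learning algorithm only ever sees these projections, never X itself.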
The performance is evaluated by the relative reconstruction error ‖X − X̂‖₂/‖X‖₂, where X and X̂ denote the original and reconstructed signals, respectively. Each setting is averaged over 50 trials. The simulation results are presented in Figure 1 and Figure 2. In Figure 1, each subgraph corresponds to a certain sampling ratio. The signals are corrupted with white Gaussian noise, with the signal-to-noise ratio (SNR) ranging from 20 dB to 50 dB. As can be seen from Figure 1a, ODL performs poorly when the sampling ratio is low, but the ODL dictionary outperforms both the DCT dictionary and the K-SVD method in relative reconstruction error when the sampling ratio is high (higher than 20%). The fixed DCT dictionary is the worst case, because its fixed structure cannot sparsely represent synthetic data of such diversity. In comparison, K-SVD is better than the DCT dictionary, because K-SVD adapts to the synthetic data by training. Since the IDL algorithm trains the dictionary with a self-coherence constraint term, its relative reconstruction error is smaller than those of DCT and K-SVD. Similar results can be obtained from Figure 2, where the relative reconstruction errors of DCT, K-SVD, IDL and ODL are obtained for sampling ratios of 10%, 15%, 20%, 25%, 30% and 40%. As can be seen in our simulations, the results of ODL are much worse than those of DCT, K-SVD and IDL when the sampling ratio is quite low (less than 10%), but ODL outperforms these algorithms as the sampling ratio increases.
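The evaluation metric is straightforward to implement; a small helper, assuming ℓ2 norms and trial averaging as described above:

```python
import numpy as np

def relative_error(X, X_hat):
    """Relative reconstruction error ||X - X_hat||_2 / ||X||_2."""
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X)

# average the error over repeated trials, as in the experiments above
X = np.array([1.0, 2.0, 3.0, 4.0])
trials = [X + 0.01 * t for t in range(3)]      # three mock "reconstructions"
errors = [relative_error(X, X_hat) for X_hat in trials]
print(sum(errors) / len(errors))
```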
6.2. Impact of Regularization Parameters on Sparse Representation Error
The performance of the ODL-CDG algorithm may also be strongly influenced by the values of the regularization parameters λA and λΘ. In this experiment, we analyze how the selected regularization parameters affect the sparse representation error, and the optimal parameters for the ODL-CDG algorithm are determined. The datasets used in this section were collected from the Intel Lab [30]. We select temperature and humidity values of size 54 × 100 recorded between 28 February and 27 March 2004, with a time interval of 31 s. We first solve the following sparse representation problem on the training data:
where S denotes the sparsity of the coefficient θ.
Then, the sparse representation error of the learned dictionary is evaluated using the root mean square error (RMSE), defined as follows:
where L is the number of training data vectors. The experiment is averaged over 50 runs for every training data vector.
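One plausible reading of this RMSE (an assumption on our part, since the displayed formula is omitted here) averages the squared residual of the sparse approximation over the L training vectors:

```python
import numpy as np

def sparse_rmse(X, D, Theta):
    """RMSE of the sparse approximation D @ Theta over the L columns of X."""
    L = X.shape[1]
    return np.sqrt(np.sum((X - D @ Theta) ** 2) / L)

rng = np.random.default_rng(0)
D = rng.standard_normal((10, 20))      # illustrative 10 x 20 dictionary
Theta = np.zeros((20, 5))
Theta[3, :] = 1.0                      # each training vector uses one atom
X = D @ Theta                          # perfect representation -> RMSE is 0
print(sparse_rmse(X, D, Theta))
print(sparse_rmse(X + 0.1, D, Theta))  # residual makes the RMSE positive
```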
Figure 3 shows the simulation results. In general, the sparse representation error becomes larger as the parameter λA gradually increases. That is because the regularization parameter λA determines the sparsity of the sparse coefficients: to constrain the coefficients to be sparser, λA must be increased toward a specific threshold. However, as can be seen from Figure 3, the sparse representation error increases tremendously once λA exceeds 0.1. Figure 4 shows that the parameter λΘ follows the same trend as λA in its impact on the sparse representation error. Based on the above discussion, we set the regularization parameters to relatively small values, namely λA = 0.1 and λΘ = 0.05. These are also the optimal parameter values used in Section 6.1.
6.3. Energy Consumption Comparison
In this subsection, we simulate the energy consumption of the ODL-CDG algorithm. The simulation platform is MATLAB. Suppose 500 nodes are randomly deployed in a 1000 m × 1000 m area with the sink node at the center. The random topology of these sensor nodes is shown in Figure 5. The communication range is 50 m and the initial energy is 2 J. The original data used in this section are synthesized from multiple datasets, so they cannot be sparsely represented in a predefined dictionary. To evaluate the energy consumption of ODL-CDG, we employ the same energy model as in [31]:
where Etrans denotes the energy consumed in transmitting l bits of data to another node within distance d, Erec denotes the energy consumed in receiving l bits of data, Eelec denotes the energy consumption of the modular circuit, and Emp denotes the energy consumed by the power amplifying circuit. The parameters input to the ODL-CDG algorithm are the same as in Section 6.1, and the related parameters are listed in Table 2.
It is regarded as a successful reconstruction when the relative reconstruction error is smaller than 0.1.
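A minimal sketch of a first-order radio energy model of this kind follows; the distance exponent and the parameter values are illustrative assumptions, not the model or Table 2 values of [31].

```python
# Illustrative first-order radio model: circuitry cost per bit plus an
# amplifier cost per bit that grows with the square of the distance.
E_ELEC = 50e-9      # J/bit consumed by the transceiver circuitry (assumed)
E_MP = 100e-12      # J/bit/m^2 consumed by the power amplifier (assumed)

def e_trans(l, d):
    """Energy to transmit l bits over distance d metres."""
    return l * E_ELEC + l * E_MP * d ** 2

def e_rec(l):
    """Energy to receive l bits."""
    return l * E_ELEC

# e.g., relaying a 1000-bit packet over the 50 m communication range
print(e_trans(1000, 50.0) + e_rec(1000))
```

Because every multi-hop relay pays both a receive and a transmit cost, reducing the number of transmitted measurements directly reduces the total energy drawn from the 2 J budget of each node.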
Figure 6 shows the energy consumption of the ODL-CDG algorithm compared with other dictionary learning-based data gathering methods. Note that the K-SVD-based data gathering method requires access to the whole dataset, so the original training data must be transmitted to the sink node over multi-hop paths. Moreover, the K-SVD dictionary must be updated frequently, since the synthesized data contain large diversities. The initial dictionary learning step before data gathering may therefore consume a large amount of energy, so the energy consumption of the K-SVD-based data gathering method is significantly larger than that of CK-SVD and ODL-CDG.
Figure 6 also shows that the ODL-CDG algorithm achieves the best energy savings. That is because the dictionary in the ODL-CDG algorithm is learned during the compressive data gathering process, which greatly reduces the energy consumed in transmitting raw data through the entire network. Similarly, the total energy consumption of CK-SVD grows with the number of successful reconstructions, but its energy consumption is still higher than that of ODL-CDG, as can be seen from Figure 6. The reason is that the introduced self-coherence penalty term restrains the reconstruction error in the ODL-CDG algorithm, so ODL-CDG can collect far fewer CS measurements than the CK-SVD-based data gathering method for the same reconstruction accuracy.
In Figure 7, the impact of the different dictionary learning-based data gathering methods on the lifespan of the nodes is studied. A node is considered to survive while its energy remains above zero. As can be seen from Figure 7, the ODL-CDG algorithm outperforms the other methods, since the proposed dictionary adapts better to various signals. Thus, the ODL-CDG algorithm reduces energy consumption and prolongs the network lifespan.
7. Conclusions and Future Work
In this paper, we propose the ODL-CDG algorithm for energy-efficient data collection in WSNs. The training signals for dictionary learning are obtained through a compressive data gathering approach, which greatly reduces the energy consumption. Inspired by the periodicity of natural signals, the learned dictionary is constrained to have a sparse structure. To reduce the recovery error caused by environmental noise, the self-coherence of the learned dictionary is introduced as a penalty term in the dictionary optimization procedure. Simulation results show that the online dictionary learning algorithm outperforms both pre-specified dictionaries, such as the DCT dictionary, and other dictionary learning approaches, such as K-SVD, IDL and CK-SVD. The energy consumption of the ODL-CDG algorithm is significantly lower than that of the K-SVD-based and CK-SVD-based data gathering methods, which helps to extend the network lifetime. In the future, we intend to employ other measurement matrices, such as sparse measurement matrices, to further reduce the energy consumption. Moreover, applying the proposed algorithm to real large-scale WSNs is also a potential research direction.