1. Introduction
Recently, distributed data processing methods based on multi-agent networks have received much attention. Traditional methods gather all the data on one machine and perform the computation centrally. However, as the size of data continues to grow, this centralized strategy is limited by the computing power of the hardware. In contrast, distributed methods distribute computing tasks to agents over decentralized networks [1,2]. Each agent keeps an arithmetic unit and a memory unit. The agents interact with each other through communication links, and this communication occurs only among neighboring agents. Under these conditions, distributed methods can effectively solve the optimization problems common to sensor networks [3], economic dispatch [4,5,6], machine learning [7,8] and dynamic control [9].
The existing decentralized algorithms include some successful results [10,11,12,13,14,15]. Previous works considered problem models composed of a single function. With a fixed stepsize, Shi et al. designed EXTRA [10], which converges exactly to the optimal solution. Lei et al. studied problems with bound constraints and proposed a primal-dual algorithm [11]. In addition, recent works [16,17,18] investigated general distributed optimization problems by designing decentralized subgradient-based algorithms, but diminishing or non-summable stepsizes are utilized, which may cause slow convergence rates [19].
To make full use of such structural properties, some scholars have studied nonsmooth composite optimization problems, which possess both smooth and nonsmooth components. By extending EXTRA to nonsmooth composite optimization, Shi et al. proposed PG-EXTRA [20]. Li et al. introduced a network-independent stepsize into PG-EXTRA and then developed NIDS [21]. In addition, Aybat et al. proposed DPDA-D [22] for time-varying networks. Considering the situation where the nonsmooth term cannot be split, Xu et al. proposed the algorithm in [23]. PG-ADMM [24] was designed based on the distributed alternating direction method of multipliers. In particular, nonsmooth composite optimization also includes a class of problems consisting of three functions and a linear operator. This structure is mainly discussed in centralized optimization [25,26,27,28], and recently some distributed works have also appeared in [29,30]. In this paper, inspired by the constrained optimization problem [31], we study constrained nonsmooth composite optimization problems over networks.
The contributions of this paper can be summarized as follows:
This paper focuses on an optimization problem with partially smooth and nonsmooth objective functions, where the decision variable satisfies local equality and feasibility constraints, unlike the works [10,16,18,19,20,21] that do not consider constraints. To solve this problem, we propose a novel decentralized algorithm that combines a primal-dual framework with proximal operators, which avoids estimating subgradients of the nonsmooth terms.
Different from existing node-based methods [16,17,18,19,20,21], the proposed algorithm adopts an edge-based communication pattern that explicitly highlights the process of information exchange among neighboring agents and further removes the dependence on Laplacians [13]. This consideration also makes it possible to use uncoordinated stepsizes instead of the common global or dynamic ones [10,12,16,18,19,21].
By employing the first-order optimality conditions and the fixed-point theory of operators, the convergence is proved, and a sublinear rate is established (k is the number of iterations); i.e., at most iterations are needed in order to reach an accuracy of .
Organization: The rest of this paper is organized as follows. In
Section 2, the necessary notations and basic knowledge are first provided, and then we describe the optimization problem over the networks and necessary assumptions.
Section 3 supplies the development of the proposed decentralized algorithm. In
Section 4, the convergence analysis for the proposed algorithm is provided. In
Section 5, we use the simulation experiments to verify the theoretical analysis. Finally, conclusions are given in
Section 6.
2. Preliminaries
In this section, we introduce the notations involved in this paper. Meanwhile, the objective problem and its explanation are also supplied.
2.1. Graph Theory and Notations
Graph theory is used to construct the mathematical model of the communication network. Let the network be described as a graph, where is the set of vertices and is the set of edges. For an agent , denotes the set of its neighbors. Let the unordered pair represent the edge between agent i and agent j. However, the ordered pairs (i, j) and (j, i) are still distinguished, i.e., the variables associated with them are different.
Next, we explain the notations that appear in this paper. Let $\mathbb{R}$ represent the set of real numbers, $\mathbb{R}^n$ the $n$-dimensional vector space, and $\mathbb{R}^{n\times m}$ the set of all $n\times m$ real matrices. We define $I_n$ as the $n$-dimensional identity operator and $0_n$ as the $n$-dimensional null vector; the null matrix is defined analogously, and the subscripts are omitted when the dimensions are clear from the context. The block diagonal matrix grouped by the matrices $P$ and $Q$ is written $\operatorname{blkdiag}(P,Q)$. For a matrix $P$, $P^{\top}$ denotes its transpose, and $\|\cdot\|_P$ denotes the norm induced by $P$. The subdifferential of a function $f$ is $\partial f(x)=\{g: f(y)\ge f(x)+\langle g,\,y-x\rangle,\ \forall y\}$. The conjugate function is defined by $f^{*}(y)=\sup_{x}\{\langle y,x\rangle-f(x)\}$. For a positive constant $\gamma$, the resolvent of the proximal operator is $\operatorname{prox}_{\gamma f}=(I+\gamma\,\partial f)^{-1}$, while the resolvent with respect to a matrix $M$ is $(I+M^{-1}\partial f)^{-1}$. Moreover, let the optimal solution set of a solvable optimization problem over networks be denoted by $\mathcal{X}^{*}$.
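As a concrete illustration of the proximal notation above, the following sketch (not from the paper; the function and test values are illustrative) evaluates the proximal operator of the $\ell_1$-norm, whose resolvent form reduces to componentwise soft-thresholding.

```python
import numpy as np

def prox_l1(v, gamma):
    """Proximal operator (resolvent of the subdifferential) of f = ||.||_1:
    prox_{gamma f}(v) = argmin_x f(x) + (1/(2*gamma)) * ||x - v||^2,
    which for the l1-norm is the componentwise soft-thresholding map."""
    return np.sign(v) * np.maximum(np.abs(v) - gamma, 0.0)

v = np.array([2.0, -0.3, 0.7])
print(prox_l1(v, 0.5))  # shrinks each coordinate toward zero by 0.5
```

Coordinates whose magnitude falls below the threshold are mapped exactly to zero, which is why such resolvents are favored for nonsmooth terms.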
2.2. Decentralized Optimization Problem
The constrained composite optimization problem over networks studied in this paper is based on the network
with
m agents. Specifically, the formulation of the problem is established as follows:
In problem (1),
is the decision variable;
and
are two private cost functions to agent
i, where the former has the Lipschitz continuous gradient, but the latter may be nonsmooth;
is a vector and
is a linear operator. Convex set
gives the box constraints to the decision variable of agent
i.
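Since the box constraint recurs throughout the paper, it may help to note that the Euclidean projection onto a box is a componentwise clipping operation; the sketch below uses illustrative bounds, not the paper's data.

```python
import numpy as np

def proj_box(v, lo, hi):
    """Euclidean projection onto the box [lo, hi]^n; this is also the
    proximal operator of the box's indicator function (0 inside, +inf outside)."""
    return np.clip(v, lo, hi)

v = np.array([-2.0, 0.4, 3.1])
print(proj_box(v, -1.0, 1.0))  # clips each coordinate into [-1, 1]
```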
To clarify the properties of problem (1), the following necessary assumption is given.
Assumption 1. For any agent :
- (i)
The cost function is convex and has a Lipschitz continuous gradient; i.e., for the positive Lipschitz constant , the following inequality holds for the gradient :
- (ii)
The local cost function is a nonsmooth and convex function.
- (iii)
The optimal solution to objective problem (1) exists, which satisfies both the equality constraints and the box constraints.
- (iv)
The graph is undirected and connected.
Note that the cost functions
and
are separable. Hence, we introduce the consensus constraint to transform problem (1) into the structure that can be computed in a decentralized manner:
Define the set
and consider the indicator function
such that Problem (3) can be processed by the penalty function method. For
and
, let
if
and
otherwise. Thus, Problem (3) is equivalent to the following problem:
Then, let
be the global variable. For
and
, we introduce a linear operator
, which generates the edge-based variable from
x. With the set
, the constraint in the problem (4) can be transformed into another penalty function. Therefore, the problem (1) is finally equivalent to the following problem:
Based on the problem (
5), we design a novel decentralized algorithm to solve the constrained composite optimization problem over networks in the next section.
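The linear operator that generates the edge-based variables is not written out in this extracted text. A common construction, assumed here only for illustration, stacks the difference $x_i - x_j$ for every edge, so that the consensus constraint is equivalent to the stacked operator annihilating $x$:

```python
import numpy as np

# Hypothetical 4-agent ring network; edges as unordered pairs (i, j) with i < j.
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
m = 4

# Incidence-style edge-based operator: row e maps x to x_i - x_j for edge e = (i, j).
D = np.zeros((len(edges), m))
for e, (i, j) in enumerate(edges):
    D[e, i], D[e, j] = 1.0, -1.0

x_consensus = np.full(m, 2.5)            # all agents hold the same value
x_disagree = np.array([1.0, 2.0, 3.0, 4.0])
print(np.allclose(D @ x_consensus, 0))   # consensus vectors lie in the null space
print(np.allclose(D @ x_disagree, 0))    # disagreement is detected edge by edge
```

On a connected graph, the null space of such an operator is exactly the consensus subspace, which is what makes the penalty reformulation equivalent to the original problem.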
3. Algorithm Development
This section presents the design process of the proposed algorithm.
Notice that Problem (
5) is an unconstrained problem. According to [
32] (Proposition 19.20), we obtain the following Lagrangian function:
where
,
and
are dual variables, and
,
and
are the conjugate functions of
,
,
, respectively. Notice that
is an edge-based variable, where
is the local variable of agent
i and
is for agent
j. Then, the last term of the Lagrangian function (
6) satisfies:
Thus, the Lagrangian function (
6) can also be written as
Taking the partial derivatives of the Lagrangian function (
7) and combining the operator splitting method [
29], we propose a new update flow as follows:
where
,
,
are the auxiliary variables, and
,
, and
are positive stepsizes. Notice that the stepsizes are uncoordinated: they can be selected independently by different agents, each within its own acceptable range. Additionally, the edge-based parameters
can be seen as inherent parameters of the communication network, revealing the quality of the communication.
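The concrete updates in (8) are not reproduced in this extracted text. As a hedged stand-in, the sketch below implements a generic primal-dual proximal iteration of Chambolle-Pock type (a relative of the update flow above, not the paper's exact scheme) on a toy problem whose solution is known in closed form; here tau and sigma play the role of the stepsizes.

```python
import numpy as np

def soft(v, t):                     # prox of t * ||.||_1 (soft-thresholding)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# Toy problem: min_x 0.5*||x - b||^2 + lam*||x||_1, whose solution is soft(b, lam).
b, lam = np.array([3.0, -0.5, 1.0]), 1.0
tau = sigma = 0.9                   # stepsizes satisfying tau*sigma*||A||^2 <= 1 (A = I)

x, y = np.zeros(3), np.zeros(3)
for _ in range(20000):
    x_old = x
    # primal step: prox of tau * 0.5*||. - b||^2
    x = (x - tau * y + tau * b) / (1.0 + tau)
    # dual step: prox of sigma * h*, where h = lam*||.||_1, so h* is the
    # indicator of the infinity-ball of radius lam and its prox is a clip
    y = np.clip(y + sigma * (2 * x - x_old), -lam, lam)

print(np.allclose(x, soft(b, lam), atol=1e-3))  # primal iterate reaches the minimizer
```

The dual update only needs a projection, mirroring how the conjugate functions enter the Lagrangian (6); per-agent (uncoordinated) stepsizes generalize the scalar tau and sigma used here.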
The steps related to the edge-based variables in update flow (
8) cannot be conducted directly, so we next replace them with the agent-based variables. We apply the Moreau decomposition to the first step in update flow (
8) such that for the second term on the right side, we have
Define (
9) as the projection
. Then, according to the definition of the set
, the projection has the following explicit expression:
Thus, for
,
, the update step for
can be decomposed into
Moreover, the update step for
can be replaced by
Combining the update flow (
8), (
10) and (
11), we finally propose the decentralized algorithm for Problem (1) in Algorithm 1.
Here, we directly give the stepsize condition of Algorithm 1 in the following assumption. The specific theoretical origin of this condition can be found in the convergence
analysis section.
Assumption 2. (Stepsize conditions)
For any agent and , the stepsizes , , and are positive. Let the following condition hold, where is the Lipschitz constant for the gradient :

Algorithm 1 The Decentralized Algorithm
For each agent and all , let , , , , , and .
Each agent i repeats, for all ,
Agent i sends , , to all of its neighbors. The sequence is used to estimate the optimal solution.
4. Convergence Analysis
In this section, we first establish the compact form with operators of the proposed algorithm. Then, the results of the theoretical analysis are provided.
For
,
, we make the following definitions. Let
w and
represent the variables stacked by
and
, respectively. Define vectors
and
. Then, we let
,
,
and
be the stepsize matrices. Then
,
and
hold such that there exist
,
and
. The linear operator
is stacked by
. Considering the resolvent of the proximal operator, the update flow (
8) leads to the following equalities:
where
is the auxiliary variable.
Define two variables
and
. Based on the equalities in (
12), Algorithm 1 is equivalent to the following compact form described by the operators:
where the operators are given as follows:
Consider one iteration of the proposed algorithm as an operator
T. Then we let
be the fixed point of the operator
T such that
. Next, we conduct the convergence analysis.
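To make the fixed-point viewpoint concrete, the toy sketch below (not the paper's operator T) iterates a projected-gradient map, an averaged operator whose fixed point is exactly the constrained minimizer, illustrating the relationship that Lemma 1 establishes for the actual algorithm.

```python
import numpy as np

c = np.array([2.0, 0.3, -3.0])   # illustrative data for f(x) = 0.5*||x - c||^2
tau = 0.5                         # stepsize in (0, 1) keeps the map averaged

def T(x):
    """One iteration: gradient step on f, then projection onto [-1, 1]^n."""
    return np.clip(x - tau * (x - c), -1.0, 1.0)

x = np.zeros(3)
for _ in range(100):              # fixed-point iteration x <- T(x)
    x = T(x)

print(np.allclose(T(x), x))                 # x is (numerically) a fixed point of T
print(np.allclose(x, np.clip(c, -1, 1)))    # ...and the box-constrained minimizer
```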
Lemma 1. (Optimality analysis) Let Assumption 1 be satisfied. The fixed point related to the operator T meets the first-order optimality conditions of the objective problem, and is an optimal solution.
Proof. Substituting the fixed point into (
12), we have the following set of equalities:
which is also the KKT condition of the Lagrangian function (
6). Therefore,
is an optimal solution to problem (1). □
The relationship between the fixed point and the optimal solution is ensured by Lemma 1. Split the operator
as
, where we let
and further define another linear operator
With these definitions above, the following lemma provides the property of the operator T for convergence analysis.
Lemma 2. Under Assumption 1, the following inequality holds for : where for is the Lipschitz parameter matrix. Proof. With the definition of operator
, we have the equality
According to [
32] (Theorem 18.16), for
,
is cocoercive, i.e., it holds
Note that for any vectors
a and
b of the same dimension and any diagonal positive definite matrix
V, there exists the inequality
. Hence, we have
Combining (
14)–(
16), we can obtain the objective inequality and end the proof. □
Lemma 3. Under Assumption 1, the following inequality holds for : where is defined before Lemma 2. Proof. Considering the change of the optimal residual before and after one iteration, we have
From the second step of the update flow (
13), there exists
such that the equality (
18) leads to
From the first step of the update flow (
13), it holds that
Then, we discuss the right side of (
19). Note that Lemma 1 proves the equivalence between the fixed point and the optimal solution. Substituting the property of fixed points into the update flow (
13), we obtain
and
Hence, the third term on the right side of (
19) satisfies
where the inequality is based on Lemma 2. Notice that the operator
is monotone [
32] (Theorem 21.2 and Proposition 20.23), i.e., it holds
Since the linear operator
is a skew-symmetric matrix, it is monotone [
29]. Combining (
19)–(
21), we obtain
From the second step of the update flow (
13), it holds
where
,
and
are the linear operators. Considering that
is also a linear operator, the second term on the right side of (
22) has an equivalent form:
Substituting (
23) into (
22), we complete the proof. □
Summarizing the above lemmas, the following theorem supplies the convergence results.
Theorem 1. When Assumptions 1 and 2 are satisfied, for the sequence generated by the operator T, we have where . Then, the sequence has a sublinear rate , and the sequence converges to an optimal solution . Proof. With the definition of
, we have the following equality:
Substituting (
25) into (
17), we obtain the inequality (
24). Note that under Assumption 2, the matrix
is positive definite. Hence, the sequence
converges to the fixed point
. Meanwhile, utilizing [
10] (Proposition 1) results in the
rate, and based on Lemma 1, the convergence of
holds. □
In Theorem 1, positive definiteness of the induced matrices is needed, which leads to the stepsize conditions in Assumption 2.
5. Numerical Simulation
In this section, the theoretical analysis is verified through numerical simulation on a constrained optimization problem over networks.
The constrained quadratic programming problem [
33] is considered in the experiments, which has the formulation as follows:
where matrix
is diagonal and positive definite,
is a vector, and
is the penalty factor. Both
and
are vectors with constants, which give the bounds of the decision variable
. In light of (1), we can set
and
.
In this case, the dimension of the decision variable is set as
, and we let
. For
, the relevant data of Problem (26) are selected randomly. The elements of matrix
are in
, and the elements of the linear operator
are in
. Both vectors
and
take values in
. The box constraints are considered as
. Then, we set the uncoordinated stepsizes randomly as
, while
,
and
are in
. The numerical experiments are performed over the generated network with eight agents, which is displayed in
Figure 1. The simulations are carried out by running the distributed algorithms on a laptop with an Intel(R) Core i5-5500U CPU @ 2.40 GHz and 8.0 GB of RAM, using Matlab R2016a on the Windows 10 operating system.
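The randomized experiment data are not reproducible from the text alone. As a hedged illustration of problem (26) with a diagonal positive definite P, the sketch below solves a small box-constrained quadratic program by projected gradient and checks it against the per-coordinate closed form; all names, dimensions, and values are illustrative, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
p = rng.uniform(1.0, 2.0, n)     # diagonal of P (positive definite)
q = rng.uniform(-1.0, 1.0, n)
lo, hi = -0.5, 0.5               # hypothetical box bounds

def grad(x):
    return p * x + q             # gradient of 0.5*x^T P x + q^T x

x = np.zeros(n)
step = 1.0 / p.max()             # stepsize at most 1/L, L = largest eigenvalue of P
for _ in range(500):             # projected gradient iterations
    x = np.clip(x - step * grad(x), lo, hi)

# Because P is diagonal, the QP separates coordinatewise: x_i* = clip(-q_i/p_i, lo, hi).
print(np.allclose(x, np.clip(-q / p, lo, hi), atol=1e-6))
```

A decentralized solver such as Algorithm 1 should reach the same minimizer; the closed form is only available here because the illustrative P is diagonal.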
The simulation results are shown in
Figure 2 and
Figure 3. The transient behaviors of each component of
are displayed in
Figure 2, in which a node-based consensus algorithm [
34] is introduced as a comparative profile. Note that the obtained optimal solution from the proposed algorithm is in line with that of the node-based consensus one, i.e.,
, but the latter achieves a stable consensus after 15,000 iterations.
Figure 3 shows that our proposed algorithm outperforms the node-based and subgradient algorithms [
35] in terms of convergence performance by evaluating the relative errors
.
6. Conclusions
In this paper, a distributed algorithm based on proximal operators has been designed to deal with a class of distributed composite optimization problems, in which the local objective function has smooth and nonsmooth parts and the decision variable abides by both affine and feasibility constraints. Distinguishing attributes of the proposed algorithm include the use of uncoordinated stepsizes and edge-based communication, which avoids the dependency on Laplacian weight matrices. Meanwhile, the algorithm has been verified in theory and simulation. However, there are still aspects of this paper worth improving. For example, it is worth adopting efficient acceleration protocols (such as the Nesterov method and the heavy-ball method) to improve the convergence rate, and developing asynchronous distributed algorithms to deal with communication latency. In addition, more general optimization models and more efficient algorithms should be investigated in order to address potential applications, e.g., [
36,
37,
38] with nonconvex objectives, coupled and nonlinear constraints.
Author Contributions
L.F.: Conceptualization, Investigation, Project administration, Software, Writing—original draft. L.R.: Data curation, Software. G.M.: Software. J.T.: Project administration, Software. W.D.: Software. H.L.: Funding acquisition, Investigation, Methodology, Project administration, Resources. All authors have read and agreed to the published version of the manuscript.
Funding
The work described in this paper is supported in part by the Research Project Supported by Shanxi Scholarship Council of China (2020-139) and in part by the Xinzhou Teachers University Academic Leader Project.
Institutional Review Board Statement
Not applicable.
Data Availability Statement
Not applicable.
Acknowledgments
The authors gratefully acknowledge the technical and financial support received.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Li, H.; Su, E.; Wang, C.; Liu, J.; Xia, D. A primal-dual forward-backward splitting algorithm for distributed convex optimization. IEEE Trans. Emerg. Top. Comput. Intell. 2021, 1–7.
- Li, H.; Hu, J.; Ran, L.; Wang, Z.; Lü, Q.; Du, Z.; Huang, T. Decentralized Dual Proximal Gradient Algorithms for Non-Smooth Constrained Composite Optimization Problems. IEEE Trans. Parallel Distrib. Syst. 2021, 32, 2594–2605.
- Zhang, Y.; Lou, Y.; Hong, Y.; Xie, L. Distributed projection-based algorithms for source localization in wireless sensor networks. IEEE Trans. Wirel. Commun. 2015, 14, 3131–3142.
- Li, B.; Wang, Y.; Li, J.; Cao, S. A fully distributed approach for economic dispatch problem of smart grid. Energies 2018, 11, 1993.
- Cao, C.; Xie, J.; Yue, D.; Huang, C.; Wang, J.; Xu, S.; Chen, X. Distributed economic dispatch of virtual power plant under a non-ideal communication network. Energies 2017, 10, 235.
- Li, H.; Zheng, Z.; Lü, Q.; Wang, Z.; Gao, L.; Wu, G.; Ji, L.; Wang, H. Primal-dual fixed point algorithms based on adapted metric for distributed optimization. IEEE Trans. Neural Netw. Learn. Syst. 2021, 1–15.
- Ababei, C.; Moghaddam, M.G. A survey of prediction and classification techniques in multicore processor systems. IEEE Trans. Parallel Distrib. Syst. 2019, 30, 1184–1200.
- Liu, S.; Qiu, Z.; Xie, L. Convergence rate analysis of distributed optimization with projected subgradient algorithm. Automatica 2017, 83, 162–169.
- Li, Y.; Liao, X.; Li, C.; Huang, T.; Yang, D. Impulsive synchronization and parameter mismatch of the three-variable autocatalator model. Phys. Lett. A 2007, 366, 52–60.
- Shi, W.; Ling, Q.; Wu, G.; Yin, W. EXTRA: An exact first-order algorithm for decentralized consensus optimization. SIAM J. Optim. 2015, 25, 944–966.
- Lei, J.; Chen, H.F.; Fang, H.T. Primal-dual algorithm for distributed constrained optimization. Syst. Control Lett. 2016, 96, 110–117.
- Nedic, A.; Ozdaglar, A.; Parrilo, P.A. Constrained consensus and optimization in multi-agent networks. IEEE Trans. Autom. Control 2010, 55, 922–938.
- Gharesifard, B.; Cortes, J. Distributed continuous-time convex optimization on weight-balanced digraphs. IEEE Trans. Autom. Control 2014, 59, 781–786.
- Zhu, M.; Martinez, S. On distributed convex optimization under inequality and equality constraints. IEEE Trans. Autom. Control 2012, 57, 151–164.
- Wang, X.; Hong, Y.; Ji, H. Distributed optimization for a class of nonlinear multiagent systems with disturbance rejection. IEEE Trans. Cybern. 2016, 46, 1655–1666.
- Nedic, A.; Ozdaglar, A. Distributed subgradient methods for multiagent optimization. IEEE Trans. Autom. Control 2009, 54, 48–61.
- Zhang, L.; Liu, S. Projected subgradient based distributed convex optimization with transmission noises. Appl. Math. Comput. 2022, 418, 126794.
- Ren, X.; Li, D.; Xi, Y.; Shao, H. Distributed subgradient algorithm for multi-agent optimization with dynamic stepsize. IEEE/CAA J. Autom. Sin. 2021, 8, 1451–1464.
- Niu, Y.; Wang, H.; Wang, Z.; Xia, D.; Li, H. Primal-dual stochastic distributed algorithm for constrained convex optimization. J. Frankl. Inst. 2019, 356, 9763–9787.
- Shi, W.; Ling, Q.; Wu, G.; Yin, W. A proximal gradient algorithm for decentralized composite optimization. IEEE Trans. Signal Process. 2015, 63, 6013–6023.
- Li, Z.; Shi, W.; Yan, M. A decentralized proximal-gradient method with network independent step-sizes and separated convergence rates. IEEE Trans. Signal Process. 2019, 67, 4494–4506.
- Aybat, N.S.; Hamedani, E.Y. A distributed ADMM-like method for resource sharing over time-varying networks. SIAM J. Optim. 2019, 29, 3036–3068.
- Xu, J.; Tian, Y.; Sun, Y.; Scutari, G. Distributed algorithms for composite optimization: Unified framework and convergence analysis. IEEE Trans. Signal Process. 2021, 69, 3555–3570.
- Aybat, N.S.; Wang, Z.; Lin, T.; Ma, S. Distributed linearized alternating direction method of multipliers for composite convex consensus optimization. IEEE Trans. Autom. Control 2018, 63, 5–20.
- Latafat, P.; Patrinos, P. Asymmetric forward-backward-adjoint splitting for solving monotone inclusions involving three operators. Comput. Optim. Appl. 2017, 68, 57–93.
- Combettes, P.L.; Pesquet, J.C. Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators. Set-Valued Var. Anal. 2012, 20, 307–330.
- Condat, L. A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms. J. Optim. Theory Appl. 2013, 158, 460–479.
- Vu, B.C. A splitting algorithm for dual monotone inclusions involving cocoercive operators. Adv. Comput. Math. 2013, 38, 667–681.
- Latafat, P.; Freris, N.M.; Patrinos, P. A new randomized block-coordinate primal-dual proximal algorithm for distributed optimization. IEEE Trans. Autom. Control 2019, 64, 4050–4065.
- Wei, Y.; Fang, H.; Zeng, X.; Chen, J.; Pardalos, P. A smooth double proximal primal-dual algorithm for a class of distributed nonsmooth optimization problems. IEEE Trans. Autom. Control 2020, 65, 1800–1806.
- Liu, Q.; Yang, S.; Hong, Y. Constrained consensus algorithms with fixed step size for distributed convex optimization over multiagent networks. IEEE Trans. Autom. Control 2017, 62, 4259–4265.
- Bauschke, H.H.; Combettes, P.L. Convex Analysis and Monotone Operator Theory in Hilbert Spaces; Springer: Cham, Switzerland, 2011.
- Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
- Zhao, Y.; Liu, Q. A consensus algorithm based on collective neurodynamic system for distributed optimization with linear and bound constraints. Neural Netw. 2020, 122, 144–151.
- Yuan, D.; Xu, S.; Zhao, H. Distributed primal-dual subgradient method for multiagent optimization via consensus algorithms. IEEE Trans. Syst. Man Cybern. Part B Cybern. 2011, 41, 1715–1724.
- Ćalasan, M.; Micev, M.; Ali, Z.M.; Zobaa, A.F.; Aleem, S.H.E.A. Parameter estimation of induction machine single-cage and double-cage models using a hybrid simulated annealing–evaporation rate water cycle algorithm. Mathematics 2020, 8, 1024.
- Ali, Z.M.; Diaaeldin, I.M.; Aleem, S.H.E.A.; El-Rafei, A.; Abdelaziz, A.Y.; Jurado, F. Scenario-based network reconfiguration and renewable energy resources integration in large-scale distribution systems considering parameters uncertainty. Mathematics 2021, 9, 26.
- Rawa, M.; Abusorrah, A.; Bassi, H.; Mekhilef, S.; Ali, Z.M.; Aleem, S.; Hasanien, H.M.; Omar, A.I. Economical-technical-environmental operation of power networks with wind-solar-hydropower generation using analytic hierarchy process and improved grey wolf algorithm. Ain Shams Eng. J. 2021, 12, 2717–2734.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).