Article

Decentralized Primal-Dual Proximal Operator Algorithm for Constrained Nonsmooth Composite Optimization Problems over Networks

Liping Feng, Liang Ran, Guoyang Meng, Jialong Tang, Wentao Ding and Huaqing Li *

1 Department of Computer Science, Xinzhou Teachers University, Xinzhou 034000, China
2 Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China
3 Department of Mathematics, Xinzhou Teachers University, Xinzhou 034000, China
* Author to whom correspondence should be addressed.
Entropy 2022, 24(9), 1278; https://doi.org/10.3390/e24091278
Submission received: 22 July 2022 / Revised: 6 September 2022 / Accepted: 6 September 2022 / Published: 11 September 2022
(This article belongs to the Special Issue Signal and Information Processing in Networks)

Abstract

In this paper, we focus on nonsmooth composite optimization problems over networks, whose objective consists of a smooth term and a nonsmooth term. Equality constraints and box constraints on the decision variables are also considered. Over a multi-agent network, the objective problem is split across the agents so that it can be solved in a decentralized manner. By establishing the Lagrange function of the problem, the first-order optimality condition is obtained in the primal-dual domain. We then propose a decentralized algorithm based on proximal operators. The proposed algorithm admits uncoordinated stepsizes with respect to agents or edges, and no global parameters are involved. By constructing a compact operator form of the algorithm, we complete the convergence analysis using fixed-point theory. Simulations on a constrained quadratic programming problem verify the effectiveness of the proposed algorithm.

1. Introduction

Recently, distributed data processing methods based on multi-agent networks have received much attention. Traditional methods put all the data on one machine and perform the computation centrally. However, as the size of data continues to grow, this kind of centralized strategy is limited by the computing power of the hardware. In contrast, distributed methods spread computing tasks over agents in decentralized networks [1,2]. Each agent keeps an arithmetic unit and a memory unit. The agents interact with each other through communication links, and this communication occurs only among neighboring agents. Under these conditions, distributed methods can effectively solve the optimization problems common to sensor networks [3], economic dispatch [4,5,6], machine learning [7,8] and dynamic control [9].
Existing decentralized algorithms already include some successful results [10,11,12,13,14,15]. Early works considered problem models composed of a single function. With a fixed stepsize, Shi et al. designed EXTRA [10], which converges exactly to the optimal solution. Lei et al. studied problems with bound constraints and proposed a primal-dual algorithm [11]. In addition, recent works [16,17,18] investigated general distributed optimization by designing decentralized subgradient-based algorithms, but diminishing or non-summable stepsizes are utilized, which may cause slow convergence rates [19].
In order to make full use of these special structures, some scholars have studied nonsmooth composite optimization problems, which possess both smooth and nonsmooth terms. By extending EXTRA to nonsmooth composite optimization, Shi et al. proposed PG-EXTRA [20]. Li et al. introduced a network-independent stepsize into PG-EXTRA and then developed NIDS [21]. In addition, Aybat et al. proposed DPDA-D [22] for time-varying networks. Considering the situation in which the nonsmooth term cannot be split, Xu et al. proposed a unified algorithmic framework [23]. PG-ADMM [24] was designed based on the distributed linearized alternating direction method of multipliers. In particular, nonsmooth composite optimization also includes a class of problems consisting of three functions and a linear operator. This structure is mainly discussed in centralized optimization [25,26,27,28], and recently some distributed works have also appeared [29,30]. In this paper, inspired by the constrained optimization problem in [31], we study constrained nonsmooth composite optimization problems over networks.
The contributions of this paper can be summarized as follows:
  • This paper focuses on an optimization problem with partially smooth and nonsmooth objective functions, where the decision variable satisfies local equality and feasibility constraints, unlike the works [10,16,18,19,20,21], which do not consider any constraints. To solve this problem, we propose a novel decentralized algorithm by combining the primal-dual framework with proximal operators, which avoids the estimation of subgradients of the nonsmooth terms.
  • Different from existing node-based methods [16,17,18,19,20,21], the proposed algorithm adopts an edge-based communication pattern that explicitly highlights the process of information exchange among neighboring agents and further gets rid of the dependence on Laplacians [13]. This consideration also makes it possible to use uncoordinated stepsizes instead of the commonly used global or dynamic ones [10,12,16,18,19,21].
  • By employing the first-order optimality conditions and the fixed-point theory of operators, the convergence is proved, and its sublinear rate $O(1/k)$ ($k$ is the iteration number) is established; i.e., at most $O(1/\epsilon)$ iterations are needed to reach an accuracy of $\epsilon$.
Organization: The rest of this paper is organized as follows. Section 2 provides the necessary notations and basic knowledge, and then describes the optimization problem over networks together with the necessary assumptions. Section 3 develops the proposed decentralized algorithm. Section 4 provides the convergence analysis for the proposed algorithm. In Section 5, we use simulation experiments to verify the theoretical analysis. Finally, conclusions are given in Section 6.

2. Preliminaries

In this section, we introduce the notations involved in this paper. Meanwhile, the objective problem and its explanation are also supplied.

2.1. Graph Theory and Notations

The knowledge of graph theory is used to construct the mathematical model of the communication network. Let $\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$ describe the network as a graph, where $\mathcal{V}$ is the set of vertices and $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ is the set of edges. For an agent $i \in \mathcal{V}$, $\mathcal{N}_i$ denotes the set of its neighbors. Let the unordered pair $(i,j) \in \mathcal{E}$ represent the edge between agent $i$ and agent $j$. However, $(i,j)$ and $(j,i)$ are still ordered as index pairs; i.e., the variables associated with them are different.
Next, we explain the notations that appear in this paper. Let $\mathbb{R}$ represent the set of real numbers; accordingly, $\mathbb{R}^n$ denotes the $n$-dimensional vector space, and $\mathbb{R}^{n \times m}$ denotes the set of all $n$-row, $m$-column real matrices. We define $I_n$ as the $n$-dimensional identity operator, $0_n$ as the $n$-dimensional null vector, and $0_{n \times n}$ as the null matrix; if their dimensions are clear from the context, we omit the subscripts. Then, $\mathrm{blkdiag}\{P, Q\}$ is the block diagonal matrix grouped from the matrices $P$ and $Q$. For a matrix $P$, let $P^{\top}$ be its transpose. We denote $\|x\|_P = \sqrt{x^{\top} P x}$ as the norm induced by a positive definite matrix $P$. The subdifferential of a function $f$ is $\partial f$, where $\partial f(x) = \{ p \mid \forall q \in \mathbb{R}^n,\ f(x) + p^{\top}(q - x) \leq f(q) \}$. The conjugate function $f^*$ is defined by $f^*(p) = \sup_{q \in \mathbb{R}^n} \{ \langle p, q \rangle - f(q) \}$. For a positive constant $\mu$, the resolvent form of the proximal operator is $\mathrm{prox}_{\mu f}(y) = (I + \mu \partial f)^{-1}(y) = \arg\min_x f(x) + \frac{1}{2\mu}\|x - y\|^2$, while the resolvent with respect to a matrix $M$ is $\mathrm{prox}^M_f(y) = (I + M^{-1} \partial f)^{-1}(y) = \arg\min_x f(x) + \frac{1}{2}\|x - y\|_M^2$. Moreover, let $\mathcal{S}$ represent the optimal solution set of a solvable optimization problem over networks.
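As a concrete illustration of the proximal operator defined above (a minimal Python sketch of ours, not part of the paper), consider $f(x) = \|x\|_1$, whose proximal operator has the well-known closed form of soft-thresholding; the snippet checks the output against the optimality condition implied by the $\arg\min$ definition:

```python
import numpy as np

def prox_l1(y, mu):
    """prox_{mu*||.||_1}(y): the classical soft-thresholding operator."""
    return np.sign(y) * np.maximum(np.abs(y) - mu, 0.0)

# From prox_{mu f}(y) = argmin_x ||x||_1 + ||x - y||^2 / (2 mu), the minimizer
# must satisfy 0 in sign(x_j) + (x_j - y_j)/mu coordinate-wise.
rng = np.random.default_rng(0)
y, mu = rng.normal(size=5), 0.3
x = prox_l1(y, mu)
for xj, yj in zip(x, y):
    if xj != 0.0:
        assert abs(np.sign(xj) + (xj - yj) / mu) < 1e-12
    else:
        assert abs(yj) <= mu  # subgradient of |.| at 0 covers [-1, 1]
print(x)
```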

2.2. Decentralized Optimization Problem

The constrained composite optimization problem over networks studied in this paper is based on the network $\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$ with $m$ agents. Specifically, the formulation of the problem is established as follows:

$$\begin{aligned} \min_{\tilde{x} \in \mathbb{R}^n}\ & \sum_{i=1}^m f_i(\tilde{x}) + g_i(\tilde{x}), \\ \mathrm{s.t.}\ & A_i \tilde{x} = b_i, \quad i = 1, \dots, m, \\ & \tilde{x} \in \bigcap_{i=1}^m \Omega_i. \end{aligned} \tag{1}$$

In problem (1), $\tilde{x} \in \mathbb{R}^n$ is the decision variable; $f_i : \mathbb{R}^n \to \mathbb{R}$ and $g_i : \mathbb{R}^n \to \mathbb{R}$ are two cost functions private to agent $i$, where the former has a Lipschitz continuous gradient and the latter may be nonsmooth; $b_i \in \mathbb{R}^r$ is a vector and $A_i : \mathbb{R}^n \to \mathbb{R}^r$ is a linear operator. The convex set $\Omega_i$ gives the box constraints on the decision variable of agent $i$.
To clarify the properties of problem (1), the following necessary assumption is given.
Assumption 1.
For any agent $i \in \mathcal{V}$:
(i) 
The gradient of the cost function $f_i$ is Lipschitz continuous and $f_i$ is convex; i.e., for a positive Lipschitz constant $\beta_i$, the gradient $\nabla f_i$ satisfies the inequality
$$\| \nabla f_i(\tilde{x}) - \nabla f_i(\tilde{y}) \|^2 \leq \beta_i \big\langle \tilde{x} - \tilde{y},\ \nabla f_i(\tilde{x}) - \nabla f_i(\tilde{y}) \big\rangle. \tag{2}$$
(ii) 
The local cost function $g_i$ is a nonsmooth and convex function.
(iii) 
An optimal solution $\tilde{x}^*$ to problem (1) exists and satisfies both the equality constraints and the box constraints.
(iv) 
The graph $\mathcal{G}$ is undirected and connected.
Note that the cost functions $f_i$ and $g_i$ are separable. Hence, we introduce the consensus constraint to transform problem (1) into a structure that can be computed in a decentralized manner:

$$\begin{aligned} \min_{x_1, \dots, x_m}\ & \sum_{i=1}^m f_i(x_i) + g_i(x_i), \\ \mathrm{s.t.}\ & A_i x_i = b_i, \quad i = 1, \dots, m, \\ & x_i = x_j, \quad i = 1, \dots, m,\ j \in \mathcal{N}_i, \\ & x_i \in \Omega_i. \end{aligned} \tag{3}$$
Define the set $\mathcal{A}_i = \{ z \in \mathbb{R}^n \mid A_i z = b_i \}$ and consider the indicator function
$$\delta_{\mathcal{C}}(e) = \begin{cases} 0, & e \in \mathcal{C}, \\ +\infty, & e \notin \mathcal{C}, \end{cases}$$
such that Problem (3) can be processed by the penalty function method. For $i \in \mathcal{V}$ and $j \in \mathcal{N}_i$, let $C_{ij} = I$ if $i < j$ and $C_{ij} = -I$ otherwise. Thus, Problem (3) is equivalent to the following problem:
$$\begin{aligned} \min_{x_1, \dots, x_m}\ & \sum_{i=1}^m f_i(x_i) + g_i(x_i) + \delta_{\mathcal{A}_i}(x_i) + \delta_{\Omega_i}(x_i), \\ \mathrm{s.t.}\ & C_{ij} x_i + C_{ji} x_j = 0, \quad i = 1, \dots, m,\ j \in \mathcal{N}_i. \end{aligned} \tag{4}$$
Then, let $x = \mathrm{col}(x_1, \dots, x_m)$ be the global variable. For $i \in \mathcal{V}$ and $j \in \mathcal{N}_i$, we introduce the linear operator $N_{(i,j)} : x \mapsto (C_{ij} x_i, C_{ji} x_j)$, which generates the edge-based variable from $x$. With the set $\mathcal{C}_{(i,j)} = \{ (z_1, z_2) \mid z_1 + z_2 = 0 \}$, the constraint in problem (4) can be transformed into another penalty function. Therefore, problem (1) is finally equivalent to the following problem:
$$\min_{x_1, \dots, x_m} \sum_{i=1}^m \Big( f_i(x_i) + g_i(x_i) + \delta_{\mathcal{A}_i}(x_i) + \delta_{\Omega_i}(x_i) \Big) + \sum_{i=1}^m \sum_{(i,j) \in \mathcal{E}} \delta_{\mathcal{C}_{(i,j)}}\big( N_{(i,j)} x \big). \tag{5}$$
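As a small worked instance of this construction (our own illustration, not from the paper), consider the edge $(1,2)$ with $1 < 2$, so that $C_{12} = I$ and $C_{21} = -I$:
$$N_{(1,2)} x = (C_{12} x_1,\ C_{21} x_2) = (x_1,\ -x_2), \qquad \delta_{\mathcal{C}_{(1,2)}}\big(N_{(1,2)} x\big) = 0 \iff x_1 - x_2 = 0;$$
that is, each edge indicator penalizes exactly the disagreement between the two incident agents.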
Based on the problem (5), we design a novel decentralized algorithm to solve the constrained composite optimization problem over networks in the next section.

3. Algorithm Development

This section presents the design process of the proposed algorithm.
Notice that Problem (5) is an unconstrained problem. According to [32] (Proposition 19.20), we obtain the following Lagrangian function:
$$L = \sum_{i=1}^m \Big( f_i(x_i) + g_i(x_i) + v_i^{\top} x_i - \delta^*_{\mathcal{A}_i}(v_i) + u_i^{\top} x_i - \delta^*_{\Omega_i}(u_i) \Big) + \sum_{i=1}^m \sum_{(i,j) \in \mathcal{E}} \Big( w_{(i,j)}^{\top} N_{(i,j)} x - \delta^*_{\mathcal{C}_{(i,j)}}(w_{(i,j)}) \Big), \tag{6}$$
where $v_i \in \mathbb{R}^n$, $u_i \in \mathbb{R}^n$ and $w_{(i,j)} \in \mathbb{R}^{2n}$ are dual variables, and $\delta^*_{\mathcal{A}_i}$, $\delta^*_{\Omega_i}$ and $\delta^*_{\mathcal{C}_{(i,j)}}$ are the conjugate functions of $\delta_{\mathcal{A}_i}$, $\delta_{\Omega_i}$ and $\delta_{\mathcal{C}_{(i,j)}}$, respectively. Notice that $w_{(i,j)} = (w_{ij}, w_{ji}) \in \mathbb{R}^{2n}$ is an edge-based variable, where $w_{ij} \in \mathbb{R}^n$ is the local variable of agent $i$ and $w_{ji} \in \mathbb{R}^n$ is for agent $j$. Then, the last term of the Lagrangian function (6) satisfies
$$\sum_{i=1}^m \sum_{(i,j) \in \mathcal{E}} \Big( w_{(i,j)}^{\top} N_{(i,j)} x - \delta^*_{\mathcal{C}_{(i,j)}}(w_{(i,j)}) \Big) = \sum_{i=1}^m \sum_{j \in \mathcal{N}_i} \Big( w_{(i,j)}^{\top} N_{(i,j)} x - \delta^*_{\mathcal{C}_{(i,j)}}(w_{(i,j)}) \Big).$$
Thus, the Lagrangian function (6) can also be written as
$$L = \sum_{i=1}^m \Big( f_i(x_i) + g_i(x_i) + v_i^{\top} x_i - \delta^*_{\mathcal{A}_i}(v_i) + u_i^{\top} x_i - \delta^*_{\Omega_i}(u_i) \Big) + \sum_{i=1}^m \sum_{j \in \mathcal{N}_i} \Big( w_{(i,j)}^{\top} N_{(i,j)} x - \delta^*_{\mathcal{C}_{(i,j)}}(w_{(i,j)}) \Big). \tag{7}$$
Taking the partial derivatives of the Lagrangian function (7) and combining the operator splitting method [29], we propose a new update flow as follows:
$$\left\{ \begin{aligned} \bar{w}^k_{(i,j)} &= \mathrm{prox}_{\omega_{(i,j)} \delta^*_{\mathcal{C}_{(i,j)}}} \big( w^k_{(i,j)} + \omega_{(i,j)} N_{(i,j)} x^k \big), \\ \bar{u}^k_i &= \mathrm{prox}_{\mu_i \delta^*_{\Omega_i}} \big( u^k_i + \mu_i x^k_i \big), \\ \bar{v}^k_i &= \mathrm{prox}_{\sigma_i \delta^*_{\mathcal{A}_i}} \big( v^k_i + \sigma_i x^k_i \big), \\ x^{k+1}_i &= \mathrm{prox}_{\gamma_i g_i} \Big( x^k_i - \gamma_i \nabla f_i(x^k_i) - \gamma_i \bar{v}^k_i - \gamma_i \bar{u}^k_i - \gamma_i \sum_{j \in \mathcal{N}_i} C_{ij} \bar{w}^k_{ij} \Big), \\ w^{k+1}_{(i,j)} &= \bar{w}^k_{(i,j)} + \omega_{(i,j)} \big( N_{(i,j)} x^{k+1} - N_{(i,j)} x^k \big), \\ u^{k+1}_i &= \bar{u}^k_i + \mu_i \big( x^{k+1}_i - x^k_i \big), \\ v^{k+1}_i &= \bar{v}^k_i + \sigma_i \big( x^{k+1}_i - x^k_i \big), \end{aligned} \right. \tag{8}$$
where $\bar{w}_{(i,j)} = (\bar{w}_{ij}, \bar{w}_{ji}) \in \mathbb{R}^{2n}$, $\bar{u}_i \in \mathbb{R}^n$ and $\bar{v}_i \in \mathbb{R}^n$ are auxiliary variables, and $\gamma_i$, $\sigma_i$ and $\mu_i$ are positive stepsizes. Notice that the stepsizes are uncoordinated: each agent can select them independently within its own acceptable range. Additionally, the edge-based parameters $\omega_{(i,j)}$ can be seen as inherent parameters of the communication network, reflecting the quality of the communication.
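The derivation below relies on the Moreau decomposition, which relates the proximal operator of a conjugate function to that of the function itself (a standard identity; see, e.g., [32]): for a proper closed convex function $h$ and $\lambda > 0$,
$$\mathrm{prox}_{\lambda h^*}(z) = z - \lambda\, \mathrm{prox}_{\lambda^{-1} h}(z / \lambda).$$
For an indicator function $h = \delta_{\mathcal{C}}$, the proximal operator on the right reduces to the Euclidean projection $P_{\mathcal{C}}$, so that $\mathrm{prox}_{\lambda \delta^*_{\mathcal{C}}}(z) = z - \lambda P_{\mathcal{C}}(z / \lambda)$.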
The steps related to the edge-based variables in update flow (8) cannot be conducted directly, so we next replace them with agent-based variables. Applying the Moreau decomposition to the first step of (8), the second term on the right side satisfies
$$\mathrm{prox}_{\omega^{-1}_{(i,j)} \delta_{\mathcal{C}_{(i,j)}}} \big( \omega^{-1}_{(i,j)} w^k_{(i,j)} + N_{(i,j)} x^k \big) = \arg\min_{y \in \mathcal{C}_{(i,j)}} \big\| y - \big( \omega^{-1}_{(i,j)} w^k_{(i,j)} + N_{(i,j)} x^k \big) \big\|. \tag{9}$$
Define (9) as the projection $P_{\mathcal{C}_{(i,j)}} \big( \omega^{-1}_{(i,j)} w^k_{(i,j)} + N_{(i,j)} x^k \big)$. Then, according to the definition of the set $\mathcal{C}_{(i,j)}$, the projection has the following explicit expression:
$$P_{\mathcal{C}_{(i,j)}} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} a_1 - a_2 \\ a_2 - a_1 \end{pmatrix}.$$
Thus, for $i \in \mathcal{V}$, $j \in \mathcal{N}_i$, the update step for $\bar{w}_{(i,j)}$ can be decomposed into
$$\bar{w}^k_{ij} = \frac{1}{2} \big( w^k_{ij} + w^k_{ji} \big) + \frac{\omega_{(i,j)}}{2} \big( C_{ij} x^k_i + C_{ji} x^k_j \big). \tag{10}$$
Moreover, the update step for $w_{(i,j)}$ can be replaced by
$$w^{k+1}_{ij} = \bar{w}^k_{ij} + \omega_{(i,j)} C_{ij} \big( x^{k+1}_i - x^k_i \big). \tag{11}$$
Combining the update flow (8), (10) and (11), we finally propose the decentralized algorithm for Problem (1) in Algorithm 1.
Here, we directly give the stepsize condition of Algorithm 1 in the following assumption. The specific theoretical origin of this condition can be found in the convergence analysis section.
Assumption 2.
(Stepsize conditions)
For any agent $i \in \mathcal{V}$ and $j \in \mathcal{N}_i$, the stepsizes $\gamma_i$, $\mu_i$, $\sigma_i$ and $\omega_{(i,j)}$ are positive. Let the following condition hold:
$$\gamma_i < \frac{1}{\frac{\beta_i}{2} + \mu_i + \sigma_i + \sum_{j \in \mathcal{N}_i} \omega_{(i,j)}},$$
where $\beta_i$ is the Lipschitz constant for the gradient $\nabla f_i$.
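As an illustration of Assumption 2 (a small sketch with made-up local values, not the paper's experimental data), each agent can compute its admissible stepsize bound from purely local quantities:

```python
# Illustrative stepsize check for Assumption 2 (hypothetical local values).
beta_i = 4.0                    # Lipschitz constant of grad f_i
mu_i, sigma_i = 5.0, 5.0        # stepsizes of the u- and v-updates
omega_edges = [5.0, 6.0]        # omega_(i,j) for each neighbor j of agent i

gamma_max = 1.0 / (beta_i / 2 + mu_i + sigma_i + sum(omega_edges))
print(f"choose gamma_i < {gamma_max:.4f}")  # here, any gamma_i in (0, 0.0435)
```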
Algorithm 1 The Decentralized Algorithm
  • Initialization: For each agent $i \in \mathcal{V}$ and all $j \in \mathcal{N}_i$, let $\bar{w}^0_{ij} \in \mathbb{R}^n$, $\bar{u}^0_i \in \mathbb{R}^n$, $\bar{v}^0_i \in \mathbb{R}^n$, $x^0_i \in \mathbb{R}^n$, $w^0_{ij} \in \mathbb{R}^n$, $u^0_i \in \mathbb{R}^n$ and $v^0_i \in \mathbb{R}^n$.
  • For $k = 0, 1, 2, \dots$ do
  • Each agent $i$ repeats, for all $j \in \mathcal{N}_i$,
$$\left\{ \begin{aligned} \bar{w}^k_{ij} &= \frac{1}{2} \big( w^k_{ij} + w^k_{ji} \big) + \frac{\omega_{(i,j)}}{2} \big( C_{ij} x^k_i + C_{ji} x^k_j \big), \\ \bar{u}^k_i &= \mathrm{prox}_{\mu_i \delta^*_{\Omega_i}} \big( u^k_i + \mu_i x^k_i \big), \\ \bar{v}^k_i &= \mathrm{prox}_{\sigma_i \delta^*_{\mathcal{A}_i}} \big( v^k_i + \sigma_i x^k_i \big), \\ x^{k+1}_i &= \mathrm{prox}_{\gamma_i g_i} \Big( x^k_i - \gamma_i \nabla f_i(x^k_i) - \gamma_i \bar{u}^k_i - \gamma_i \bar{v}^k_i - \gamma_i \sum_{j \in \mathcal{N}_i} C_{ij} \bar{w}^k_{ij} \Big), \\ w^{k+1}_{ij} &= \bar{w}^k_{ij} + \omega_{(i,j)} C_{ij} \big( x^{k+1}_i - x^k_i \big), \\ u^{k+1}_i &= \bar{u}^k_i + \mu_i \big( x^{k+1}_i - x^k_i \big), \\ v^{k+1}_i &= \bar{v}^k_i + \sigma_i \big( x^{k+1}_i - x^k_i \big). \end{aligned} \right.$$
  • Agent $i$ sends $w^{k+1}_{ij}$, $\bar{w}^k_{ij}$ and $C_{ij} x^{k+1}_i$ to all of its neighbors.
  • End
  • Output: The sequence $\{x^k_i\}_{k \geq 1}$ to estimate the optimal solution.
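To make the data flow of Algorithm 1 concrete, the following Python sketch implements one possible realization of it (the data layout, dictionary keys and function names are ours; the paper itself reports a Matlab implementation). The conjugate proximal steps are evaluated through the Moreau decomposition recalled in Section 3, so only the projections onto $\Omega_i$ and $\mathcal{A}_i$ and the proximal operator of $g_i$ are needed as callables:

```python
import numpy as np

def prox_indicator_conjugate(proj, z, lam):
    """Moreau decomposition for an indicator delta_C:
    prox_{lam * delta_C^*}(z) = z - lam * P_C(z / lam)."""
    return z - lam * proj(z / lam)

def algorithm1(agents, omega, neighbors, n, iters):
    """A sketch of Algorithm 1 under our own data layout.

    agents[i]: dict with callables 'grad_f', 'prox_g'(y, gamma), 'proj_Omega',
    'proj_A', and positive stepsizes 'gamma', 'mu', 'sigma' (Assumption 2).
    omega[(i, j)]: edge parameter omega_(i,j), stored once per edge with i < j.
    neighbors[i]: list of neighbors of agent i (undirected graph).
    """
    m = len(agents)
    x = [np.zeros(n) for _ in range(m)]
    u = [np.zeros(n) for _ in range(m)]
    v = [np.zeros(n) for _ in range(m)]
    w = {(i, j): np.zeros(n) for i in range(m) for j in neighbors[i]}
    C = lambda i, j: 1.0 if i < j else -1.0            # C_ij = I if i < j, else -I
    om = lambda i, j: omega[(min(i, j), max(i, j))]

    for _ in range(iters):
        w_bar, u_bar, v_bar = {}, [], []
        for i, a in enumerate(agents):
            for j in neighbors[i]:
                # Edge step (10): needs only w_ij, w_ji and the incident states.
                w_bar[(i, j)] = 0.5 * (w[(i, j)] + w[(j, i)]) \
                    + 0.5 * om(i, j) * (C(i, j) * x[i] + C(j, i) * x[j])
            # Dual steps via the Moreau decomposition (projections onto Omega_i, A_i).
            u_bar.append(prox_indicator_conjugate(a['proj_Omega'], u[i] + a['mu'] * x[i], a['mu']))
            v_bar.append(prox_indicator_conjugate(a['proj_A'], v[i] + a['sigma'] * x[i], a['sigma']))
        x_new = []
        for i, a in enumerate(agents):
            # Primal step: gradient of f_i plus dual corrections, then prox of g_i.
            edge_term = sum(C(i, j) * w_bar[(i, j)] for j in neighbors[i])
            y = x[i] - a['gamma'] * (a['grad_f'](x[i]) + u_bar[i] + v_bar[i] + edge_term)
            x_new.append(a['prox_g'](y, a['gamma']))
        for i, a in enumerate(agents):
            for j in neighbors[i]:
                w[(i, j)] = w_bar[(i, j)] + om(i, j) * C(i, j) * (x_new[i] - x[i])
            u[i] = u_bar[i] + a['mu'] * (x_new[i] - x[i])
            v[i] = v_bar[i] + a['sigma'] * (x_new[i] - x[i])
        x = x_new
    return x
```

Note that, mirroring the communication step of Algorithm 1, each iteration only exchanges $w_{ij}$, $\bar{w}_{ij}$ and $C_{ij} x_i$ between neighboring agents; no global quantity enters the loop.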

4. Convergence Analysis

In this section, we first establish the compact operator form of the proposed algorithm. Then, the results of the theoretical analysis are provided.
For $i \in \mathcal{V}$, $j \in \mathcal{N}_i$, we make the following definitions. Let $w$ and $\bar{w}$ represent the variables stacked by $w_{(i,j)}$ and $\bar{w}_{(i,j)}$, respectively. Define the vectors $u = \mathrm{col}(u_1, \dots, u_m)$, $v = \mathrm{col}(v_1, \dots, v_m)$, $\bar{u} = \mathrm{col}(\bar{u}_1, \dots, \bar{u}_m)$ and $\bar{v} = \mathrm{col}(\bar{v}_1, \dots, \bar{v}_m)$. Then, let $W = \mathrm{blkdiag}\{\omega_{(i,j)} I_{2n}\}_{(i,j)\in\mathcal{E}}$, $\Gamma = \mathrm{blkdiag}\{\gamma_i I_n\}_{i\in\mathcal{V}}$, $\Sigma = \mathrm{blkdiag}\{\sigma_i I_n\}_{i\in\mathcal{V}}$ and $M = \mathrm{blkdiag}\{\mu_i I_n\}_{i\in\mathcal{V}}$ be the stepsize matrices. With the Cartesian products $\mathcal{C} = \prod_{(i,j)\in\mathcal{E}} \mathcal{C}_{(i,j)}$, $\Omega = \prod_{i\in\mathcal{V}} \Omega_i$ and $\mathcal{A} = \prod_{i\in\mathcal{V}} \mathcal{A}_i$, we have $\delta_{\mathcal{C}} = \sum_{(i,j)\in\mathcal{E}} \delta_{\mathcal{C}_{(i,j)}}$, $\delta_{\Omega} = \sum_{i\in\mathcal{V}} \delta_{\Omega_i}$ and $\delta_{\mathcal{A}} = \sum_{i\in\mathcal{V}} \delta_{\mathcal{A}_i}$. The linear operator $N : x \mapsto (N_{(i,j)} x)_{(i,j)\in\mathcal{E}}$ is stacked by the $N_{(i,j)}$. Considering the resolvent form of the proximal operator, the update flow (8) leads to the following relations:
$$\left\{ \begin{aligned} &\partial\delta^*_{\mathcal{C}}(\bar{w}^k) + W^{-1} \bar{w}^k \ni W^{-1} w^k + N x^k, \\ &\partial\delta^*_{\Omega}(\bar{u}^k) + M^{-1} \bar{u}^k \ni M^{-1} u^k + x^k, \\ &\partial\delta^*_{\mathcal{A}}(\bar{v}^k) + \Sigma^{-1} \bar{v}^k \ni \Sigma^{-1} v^k + x^k, \\ &\partial g(\bar{x}^k) + \Gamma^{-1} \bar{x}^k + \bar{u}^k + \bar{v}^k + N^{\top} \bar{w}^k \ni \Gamma^{-1} x^k - \nabla f(x^k), \\ &w^{k+1} = \bar{w}^k + W N \bar{x}^k - W N x^k, \\ &u^{k+1} = \bar{u}^k + M \bar{x}^k - M x^k, \\ &v^{k+1} = \bar{v}^k + \Sigma \bar{x}^k - \Sigma x^k, \end{aligned} \right. \tag{12}$$
where $\bar{x}^k = x^{k+1}$ is an auxiliary variable.
Define the two variables $U = \mathrm{col}(w, u, v, x)$ and $\bar{U} = \mathrm{col}(\bar{w}, \bar{u}, \bar{v}, \bar{x})$. Based on the relations in (12), Algorithm 1 is equivalent to the following compact form described by operators:
$$\begin{aligned} \bar{U}^k &= (T_H + T_A)^{-1} (T_H - T_Q - T_D)\, U^k, \\ U^{k+1} &= U^k + T_S^{-1} (T_H - T_Q) (\bar{U}^k - U^k), \end{aligned} \tag{13}$$
where the operators are given as follows:
$$\begin{aligned} T_H &: U \mapsto \big( W^{-1} w,\ M^{-1} u,\ \Sigma^{-1} v,\ N^{\top} w + u + v + \Gamma^{-1} x \big), \\ T_A &: U \mapsto \big( \partial\delta^*_{\mathcal{C}}(w),\ \partial\delta^*_{\Omega}(u),\ \partial\delta^*_{\mathcal{A}}(v),\ \partial g(x) \big), \\ T_Q &: U \mapsto \big( -N x,\ -x,\ -x,\ N^{\top} w + u + v \big), \\ T_D &: U \mapsto \big( 0,\ 0,\ 0,\ \nabla f(x) \big), \\ T_S &: U \mapsto \big( W^{-1} w,\ M^{-1} u,\ \Sigma^{-1} v,\ \Gamma^{-1} x \big). \end{aligned}$$
Consider one iteration of the proposed algorithm as an operator $T$. Then we let $U^* = \mathrm{col}(w^*, u^*, v^*, x^*)$ be the fixed point of the operator $T$ such that $U^* = T U^*$. Next, we conduct the convergence analysis.
Lemma 1.
(Optimality analysis) Let Assumption 1 be satisfied. The fixed point $U^*$ of the operator $T$ meets the first-order optimality conditions of the objective problem, and $x^* \in \mathcal{S}$ is an optimal solution.
Proof. 
Substituting the fixed point into (12), we have the following set of equalities:
$$\left\{ \begin{aligned} &\partial\delta^*_{\mathcal{C}}(w^*) - N x^* \ni 0, \\ &\partial\delta^*_{\Omega}(u^*) - x^* \ni 0, \\ &\partial\delta^*_{\mathcal{A}}(v^*) - x^* \ni 0, \\ &\partial g(x^*) + \nabla f(x^*) + u^* + v^* + N^{\top} w^* \ni 0, \end{aligned} \right.$$
which is also the KKT condition of the Lagrangian function (6). Therefore, $x^*$ is an optimal solution to problem (1). □
The relationship between the fixed point and the optimal solution is ensured by Lemma 1. Split the operator $T_H$ as $T_H = T_P + T_K$, where we let
$$T_P = \begin{pmatrix} W^{-1} & 0 & 0 & -\frac{1}{2} N \\ 0 & M^{-1} & 0 & -\frac{1}{2} I \\ 0 & 0 & \Sigma^{-1} & -\frac{1}{2} I \\ \frac{1}{2} N^{\top} & \frac{1}{2} I & \frac{1}{2} I & \Gamma^{-1} \end{pmatrix}, \qquad T_K = \begin{pmatrix} 0 & 0 & 0 & \frac{1}{2} N \\ 0 & 0 & 0 & \frac{1}{2} I \\ 0 & 0 & 0 & \frac{1}{2} I \\ \frac{1}{2} N^{\top} & \frac{1}{2} I & \frac{1}{2} I & 0 \end{pmatrix},$$
and further define another linear operator
$$T_{\tilde{P}} = \begin{pmatrix} W^{-1} & 0 & 0 & -\frac{1}{2} N \\ 0 & M^{-1} & 0 & -\frac{1}{2} I \\ 0 & 0 & \Sigma^{-1} & -\frac{1}{2} I \\ -\frac{1}{2} N^{\top} & -\frac{1}{2} I & -\frac{1}{2} I & \Gamma^{-1} \end{pmatrix}.$$
With these definitions above, the following lemma provides the property of the operator T for convergence analysis.
Lemma 2.
Under Assumption 1, the following inequality holds for $U^*$:
$$\big\langle \bar{U}^k - U^*,\ T_D (U^k - U^*) \big\rangle \geq -\frac{1}{4} \| x^k - \bar{x}^k \|_B^2,$$
where $B = \mathrm{blkdiag}\{\beta_i I_n\}_{i \in \mathcal{V}}$ is the Lipschitz parameter matrix.
Proof. 
With the definition of the operator $T_D$, we have the equality
$$\big\langle \bar{U}^k - U^*,\ T_D (U^k - U^*) \big\rangle = -\big\langle x^k - \bar{x}^k,\ \nabla f(x^k) - \nabla f(x^*) \big\rangle + \big\langle x^k - x^*,\ \nabla f(x^k) - \nabla f(x^*) \big\rangle. \tag{14}$$
According to [32] (Theorem 18.16), for $i \in \mathcal{V}$, the gradient $\nabla f_i$ is cocoercive; i.e., it holds that
$$\| \nabla f(x^k) - \nabla f(x^*) \|_{B^{-1}}^2 \leq \big\langle x^k - x^*,\ \nabla f(x^k) - \nabla f(x^*) \big\rangle. \tag{15}$$
Note that for any vectors $a$ and $b$ of the same dimension and any diagonal positive definite matrix $V$, the inequality $\langle a, b \rangle \leq \|b\|_V^2 + \frac{1}{4} \|a\|_{V^{-1}}^2$ holds, since $0 \leq \| V^{1/2} b - \frac{1}{2} V^{-1/2} a \|^2$. Hence, we have
$$\big\langle x^k - \bar{x}^k,\ \nabla f(x^k) - \nabla f(x^*) \big\rangle \leq \| \nabla f(x^k) - \nabla f(x^*) \|_{B^{-1}}^2 + \frac{1}{4} \| x^k - \bar{x}^k \|_B^2. \tag{16}$$
Combining (14)–(16), we can obtain the objective inequality and end the proof. □
Lemma 3.
Under Assumption 1, the following inequality holds for $U^*$:
$$\| U^{k+1} - U^* \|_{T_S}^2 \leq \| U^k - U^* \|_{T_S}^2 - \| U^{k+1} - U^k \|_{2 T_{\tilde{P}} - T_S}^2 + \frac{1}{2} \| x^k - \bar{x}^k \|_B^2, \tag{17}$$
where $T_{\tilde{P}}$ is defined before Lemma 2.
Proof. 
Considering the change of the optimal residual before and after one iteration, we have
$$\begin{aligned} \| U^{k+1} - U^* \|_{T_S}^2 - \| U^k - U^* \|_{T_S}^2 &= -\| U^{k+1} - U^k \|_{T_S}^2 + 2 \big\langle U^{k+1} - U^*,\ T_S (U^{k+1} - U^k) \big\rangle \\ &= \| U^{k+1} - U^k \|_{T_S}^2 + 2 \big\langle U^k - U^*,\ T_S (U^{k+1} - U^k) \big\rangle. \end{aligned} \tag{18}$$
From the second step of the update flow (13), there exists
$$T_S (U^{k+1} - U^k) = (T_H - T_Q)(\bar{U}^k - U^k),$$
such that the equality (18) leads to
$$\begin{aligned} \| U^{k+1} - U^* \|_{T_S}^2 - \| U^k - U^* \|_{T_S}^2 &= \| U^{k+1} - U^k \|_{T_S}^2 + 2 \big\langle \bar{U}^k - U^*,\ (T_H - T_Q)(\bar{U}^k - U^k) \big\rangle \\ &\quad - 2 \big\langle \bar{U}^k - U^k,\ (T_H - T_Q)(\bar{U}^k - U^k) \big\rangle \\ &\leq \| U^{k+1} - U^k \|_{T_S}^2 + 2 \big\langle \bar{U}^k - U^*,\ (T_H - T_Q)(\bar{U}^k - U^k) \big\rangle \\ &\quad - 2 \big\langle \bar{U}^k - U^k,\ T_P (\bar{U}^k - U^k) \big\rangle. \end{aligned}$$
From the first step of the update flow (13), it holds that
$$(T_H - T_Q)(\bar{U}^k - U^k) \in -\big( T_D U^k + T_Q \bar{U}^k + T_A \bar{U}^k \big).$$
Thus, we further have
$$\begin{aligned} \| U^{k+1} - U^* \|_{T_S}^2 - \| U^k - U^* \|_{T_S}^2 &\leq \| U^{k+1} - U^k \|_{T_S}^2 - 2 \big\langle \bar{U}^k - U^k,\ T_P (\bar{U}^k - U^k) \big\rangle \\ &\quad - 2 \big\langle \bar{U}^k - U^*,\ T_D U^k + T_Q \bar{U}^k + T_A U^* \big\rangle - 2 \big\langle \bar{U}^k - U^*,\ T_A \bar{U}^k - T_A U^* \big\rangle. \end{aligned} \tag{19}$$
Then, we discuss the right side of (19). Note that Lemma 1 proves the equivalence between the fixed point and the optimal solution. Substituting the property of fixed points into the update flow (13), we obtain $U^* = \bar{U}^*$ and
$$T_A U^* \ni -T_Q U^* - T_D U^*. \tag{20}$$
Hence, the third term on the right side of (19) satisfies
$$\begin{aligned} \big\langle \bar{U}^k - U^*,\ T_D U^k + T_Q \bar{U}^k + T_A U^* \big\rangle &= \big\langle \bar{U}^k - U^*,\ T_D (U^k - U^*) \big\rangle + \big\langle \bar{U}^k - U^*,\ T_Q (\bar{U}^k - U^*) \big\rangle \\ &\geq -\frac{1}{4} \| x^k - \bar{x}^k \|_B^2 + \big\langle \bar{U}^k - U^*,\ T_Q (\bar{U}^k - U^*) \big\rangle, \end{aligned} \tag{21}$$
where the inequality is based on Lemma 2. Notice that the operator $T_A$ is monotone [32] (Theorem 21.2 and Proposition 20.23); i.e., it holds that
$$\big\langle \bar{U}^k - U^*,\ T_A \bar{U}^k - T_A U^* \big\rangle \geq 0.$$
Since the linear operator $T_Q$ is a skew-symmetric matrix, it is monotone [29]. Combining (19)–(21), we obtain
$$\| U^{k+1} - U^* \|_{T_S}^2 - \| U^k - U^* \|_{T_S}^2 \leq \| U^{k+1} - U^k \|_{T_S}^2 - 2 \big\langle \bar{U}^k - U^k,\ T_P (\bar{U}^k - U^k) \big\rangle + \frac{1}{2} \| x^k - \bar{x}^k \|_B^2. \tag{22}$$
From the second step of the update flow (13), it holds
$$\bar{U}^k - U^k = (T_H - T_Q)^{-1} T_S (U^{k+1} - U^k),$$
where $T_H$, $T_Q$ and $T_S$ are linear operators. Considering that $T_P$ is also a linear operator, the second term on the right side of (22) has the equivalent form
$$\big\langle \bar{U}^k - U^k,\ T_P (\bar{U}^k - U^k) \big\rangle = \big\langle U^{k+1} - U^k,\ T_{\tilde{P}} (U^{k+1} - U^k) \big\rangle. \tag{23}$$
Substituting (23) into (22), we complete the proof. □
Summarizing the above lemmas, the following theorem supplies the convergence results.
Theorem 1.
When Assumptions 1 and 2 are satisfied, for the sequence $\{U^k\}_{k \geq 0}$ generated by the operator $T$, we have
$$\| U^{k+1} - U^* \|_{T_S}^2 \leq \| U^k - U^* \|_{T_S}^2 - \| U^{k+1} - U^k \|_{2 T_{\tilde{P}} - \tilde{B} - T_S}^2, \tag{24}$$
where $\tilde{B} = \mathrm{blkdiag}\{0, 0, 0, \frac{1}{4} B\}$. Then, the sequence $\{U^k\}_{k \geq 0}$ has the sublinear rate $O(1/k)$, and the sequence $\{x^k\}_{k \geq 0}$ converges to an optimal solution $x^* \in \mathcal{S}$.
Proof. 
With the definition of $\tilde{B}$, we have the following equality:
$$\frac{1}{4} \| x^k - \bar{x}^k \|_B^2 = \| U^{k+1} - U^k \|_{\tilde{B}}^2. \tag{25}$$
Substituting (25) into (17), we obtain the inequality (24). Note that under Assumption 2, the matrix $2 T_{\tilde{P}} - \tilde{B} - T_S$ is positive definite. Hence, the sequence $\{U^k\}_{k \geq 0}$ converges to the fixed point $U^*$. Meanwhile, utilizing [10] (Proposition 1) yields the $O(1/k)$ rate, and based on Lemma 1, the convergence of $\{x^k\}_{k \geq 0}$ holds. □
In Theorem 1, positive definiteness of the induced matrix is required, which leads to the stepsize conditions in Assumption 2.

5. Numerical Simulation

In this section, the correctness of the theoretical analysis is verified through numerical simulations on a constrained optimization problem over networks.
The constrained quadratic programming problem [33] is considered in the experiments, which has the following formulation:
$$\begin{aligned} \min_{\tilde{x}}\ & \sum_{i=1}^m \| \tilde{x} \|_{E_i}^2 + e_i^{\top} \tilde{x} + \rho_i \| \tilde{x} \|_1, \\ \mathrm{s.t.}\ & A_i \tilde{x} = b_i, \quad i \in \mathcal{V}, \\ & \tilde{x}^{\min} \leq \tilde{x} \leq \tilde{x}^{\max}, \end{aligned} \tag{26}$$
where the matrix $E_i \in \mathbb{R}^{n \times n}$ is diagonal and positive definite, $e_i \in \mathbb{R}^n$ is a vector, and $\rho_i$ is the penalty factor. Both $\tilde{x}^{\min}$ and $\tilde{x}^{\max}$ are constant vectors, which give the bounds of the decision variable $\tilde{x}$. In light of (1), we can set $f_i(\tilde{x}) = \| \tilde{x} \|_{E_i}^2 + e_i^{\top} \tilde{x}$ and $g_i(\tilde{x}) = \rho_i \| \tilde{x} \|_1$.
In this case, the dimension of the decision variable is set as $n = 4$, and we let $r = 1$. For $i \in \mathcal{V}$, the problem data of (26) are selected randomly. The elements of the matrix $E_i$ are in $[1, 2]$, and the elements of the linear operator $A_i$ are in $[1, 15]$. Both vectors $e_i$ and $b_i$ take values in $[-5, 5]$. The box constraints are set as $[-2.5, 2.5]$. Then, we set the uncoordinated stepsizes randomly as $\gamma_i \in [0.005, 0.006]$, while $\sigma_i$, $\mu_i$ and $\omega_{(i,j)}$ are in $[5, 6]$. The numerical experiments are performed over a generated network with eight agents, which is displayed in Figure 1. The simulations are carried out by running the distributed algorithms on a laptop with an Intel(R) Core i5-5500U CPU @ 2.40 GHz, 8.0 GB of RAM, and Matlab R2016a on the Windows 10 operating system.
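The following sketch shows one way the problem data described above could be generated and fed to the `algorithm1` sketch from Section 3 (illustrative only: the ring topology stands in for the paper's Figure 1 graph, the seed is arbitrary, and the value of $\rho_i$ is not given in the paper, so we pick 1.0):

```python
import numpy as np

rng = np.random.default_rng(1)
n, r, m = 4, 1, 8
# A stand-in ring topology; the paper's 8-agent graph (Figure 1) may differ.
neighbors = {i: [(i - 1) % m, (i + 1) % m] for i in range(m)}
omega = {(i, j): rng.uniform(5, 6) for i in range(m) for j in neighbors[i] if i < j}

agents = []
for i in range(m):
    E = np.diag(rng.uniform(1, 2, n))            # E_i diagonal, positive definite
    e = rng.uniform(-5, 5, n)
    A = rng.uniform(1, 15, (r, n))
    b = rng.uniform(-5, 5, r)
    P = np.linalg.inv(A @ A.T)                   # precomputed for the affine projection
    rho = 1.0                                    # penalty factor; value assumed here
    agents.append({
        'grad_f': lambda x, E=E, e=e: 2 * E @ x + e,          # f_i(x) = x'E_i x + e_i'x
        'prox_g': lambda y, g, rho=rho:                        # prox of g_i = rho*||.||_1
            np.sign(y) * np.maximum(np.abs(y) - g * rho, 0.0),
        'proj_Omega': lambda z: np.clip(z, -2.5, 2.5),         # box constraint
        'proj_A': lambda z, A=A, b=b, P=P: z - A.T @ (P @ (A @ z - b)),
        'gamma': rng.uniform(0.005, 0.006),
        'mu': rng.uniform(5, 6), 'sigma': rng.uniform(5, 6),
    })

x = algorithm1(agents, omega, neighbors, n, iters=20000)
print(np.round(np.asarray(x), 4))                # all agents should (nearly) agree
```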
The simulation results are shown in Figure 2 and Figure 3. The transient behavior of each component of $x_i^k$ is displayed in Figure 2, in which a node-based consensus algorithm [34] is introduced as a comparative profile. Note that the optimal solution obtained by the proposed algorithm is in line with that of the node-based consensus one, i.e., $x^* = (0.6900, 0.6270, 0.8046, 0.4400)^{\top}$, but the latter achieves a stable consensus only after 15,000 iterations. Figure 3 shows that our proposed algorithm outperforms the node-based and subgradient algorithms [35] in terms of convergence performance, as measured by the relative error $\sum_{i=1}^m \| x_i^k - \tilde{x}^* \| / (m \| \tilde{x}^* \|)$.

6. Conclusions

In this paper, a distributed algorithm based on proximal operators has been designed to deal with a class of distributed composite optimization problems, in which the local objective function has a smooth and a nonsmooth component and the decision variable abides by both affine and feasibility (box) constraints. Distinguishing attributes of the proposed algorithm include the use of uncoordinated stepsizes and an edge-based communication pattern that avoids dependency on Laplacian weight matrices. Meanwhile, the algorithm has been verified in theory and simulation. However, there are still some aspects worth improving. For example, it is worth adopting efficient acceleration protocols (such as the Nesterov method and the heavy-ball method) to improve the convergence rate, and developing asynchronous distributed algorithms to deal with communication latency. In addition, more general optimization models and more efficient algorithms should be investigated to address potential applications, e.g., [36,37,38], with nonconvex objectives and coupled, nonlinear constraints.

Author Contributions

L.F.: Conceptualization, Investigation, Project administration, Software, Writing—original draft. L.R.: Data curation, Software. G.M.: Software. J.T.: Project administration, Software. W.D.: Software. H.L.: Funding acquisition, Investigation, Methodology, Project administration, Resources. All authors have read and agreed to the published version of the manuscript.

Funding

The work described in this paper is supported in part by the Research Project Supported by Shanxi Scholarship Council of China (2020-139) and in part by the Xinzhou Teachers University Academic Leader Project.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors gratefully acknowledge the technical and financial support received.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Li, H.; Su, E.; Wang, C.; Liu, J.; Xia, D. A primal-dual forward-backward splitting algorithm for distributed convex optimization. IEEE Trans. Emerg. Top. Comput. Intell. 2021, 1–7.
  2. Li, H.; Hu, J.; Ran, L.; Wang, Z.; Lü, Q.; Du, Z.; Huang, T. Decentralized Dual Proximal Gradient Algorithms for Non-Smooth Constrained Composite Optimization Problems. IEEE Trans. Parallel Distrib. Syst. 2021, 32, 2594–2605.
  3. Zhang, Y.; Lou, Y.; Hong, Y.; Xie, L. Distributed projection-based algorithms for source localization in wireless sensor networks. IEEE Trans. Wirel. Commun. 2015, 14, 3131–3142.
  4. Li, B.; Wang, Y.; Li, J.; Cao, S. A fully distributed approach for economic dispatch problem of smart grid. Energies 2018, 11, 1993.
  5. Cao, C.; Xie, J.; Yue, D.; Huang, C.; Wang, J.; Xu, S.; Chen, X. Distributed economic dispatch of virtual power plant under a non-ideal communication network. Energies 2017, 10, 235.
  6. Li, H.; Zheng, Z.; Lü, Q.; Wang, Z.; Gao, L.; Wu, G.; Ji, L.; Wang, H. Primal-dual fixed point algorithms based on adapted metric for distributed optimization. IEEE Trans. Neural Netw. Learn. Syst. 2021, 1–15.
  7. Ababei, C.; Moghaddam, M.G. A survey of prediction and classification techniques in multicore processor systems. IEEE Trans. Parallel Distrib. Syst. 2019, 30, 1184–1200.
  8. Liu, S.; Qiu, Z.; Xie, L. Convergence rate analysis of distributed optimization with projected subgradient algorithm. Automatica 2017, 83, 162–169.
  9. Li, Y.; Liao, X.; Li, C.; Huang, T.; Yang, D. Impulsive synchronization and parameter mismatch of the three-variable autocatalator model. Phys. Lett. A 2007, 366, 52–60.
  10. Shi, W.; Ling, Q.; Wu, G.; Yin, W. EXTRA: An exact first-order algorithm for decentralized consensus optimization. SIAM J. Optim. 2015, 25, 944–966.
  11. Lei, J.; Chen, H.F.; Fang, H.T. Primal-dual algorithm for distributed constrained optimization. Syst. Control Lett. 2016, 96, 110–117.
  12. Nedic, A.; Ozdaglar, A.; Parrilo, P.A. Constrained consensus and optimization in multi-agent networks. IEEE Trans. Autom. Control 2010, 55, 922–938.
  13. Gharesifard, B.; Cortes, J. Distributed continuous-time convex optimization on weight-balanced digraphs. IEEE Trans. Autom. Control 2014, 59, 781–786.
  14. Zhu, M.; Martinez, S. On distributed convex optimization under inequality and equality constraints. IEEE Trans. Autom. Control 2012, 57, 151–164.
  15. Wang, X.; Hong, Y.; Ji, H. Distributed optimization for a class of nonlinear multiagent systems with disturbance rejection. IEEE Trans. Cybern. 2016, 46, 1655–1666.
  16. Nedic, A.; Ozdaglar, A. Distributed subgradient methods for multiagent optimization. IEEE Trans. Autom. Control 2009, 54, 48–61.
  17. Zhang, L.; Liu, S. Projected subgradient based distributed convex optimization with transmission noises. Appl. Math. Comput. 2022, 418, 126794.
  18. Ren, X.; Li, D.; Xi, Y.; Shao, H. Distributed subgradient algorithm for multi-agent optimization with dynamic stepsize. IEEE/CAA J. Autom. Sin. 2021, 8, 1451–1464.
  19. Niu, Y.; Wang, H.; Wang, Z.; Xia, D.; Li, H. Primal-dual stochastic distributed algorithm for constrained convex optimization. J. Frankl. Inst. 2019, 356, 9763–9787.
  20. Shi, W.; Ling, Q.; Wu, G.; Yin, W. A proximal gradient algorithm for decentralized composite optimization. IEEE Trans. Signal Process. 2015, 63, 6013–6023.
  21. Li, Z.; Shi, W.; Yan, M. A decentralized proximal-gradient method with network independent step-sizes and separated convergence rates. IEEE Trans. Signal Process. 2019, 67, 4494–4506.
  22. Aybat, N.S.; Hamedani, E.Y. A distributed ADMM-like method for resource sharing over time-varying networks. SIAM J. Optim. 2019, 29, 3036–3068.
  23. Xu, J.; Tian, Y.; Sun, Y.; Scutari, G. Distributed algorithms for composite optimization: Unified framework and convergence analysis. IEEE Trans. Signal Process. 2021, 69, 3555–3570.
  24. Aybat, N.S.; Wang, Z.; Lin, T.; Ma, S. Distributed linearized alternating direction method of multipliers for composite convex consensus optimization. IEEE Trans. Autom. Control 2018, 63, 5–20.
  25. Latafat, P.; Patrinos, P. Asymmetric forward-backward-adjoint splitting for solving monotone inclusions involving three operators. Comput. Optim. Appl. 2017, 68, 57–93.
  26. Combettes, P.L.; Pesquet, J.C. Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators. Set-Valued Var. Anal. 2012, 20, 307–330.
  27. Condat, L. A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms. J. Optim. Theory Appl. 2013, 158, 460–479.
  28. Vu, B.C. A splitting algorithm for dual monotone inclusions involving cocoercive operators. Adv. Comput. Math. 2013, 38, 667–681.
  29. Latafat, P.; Freris, N.M.; Patrinos, P. A new randomized block-coordinate primal-dual proximal algorithm for distributed optimization. IEEE Trans. Autom. Control 2019, 64, 4050–4065.
  30. Wei, Y.; Fang, H.; Zeng, X.; Chen, J.; Pardalos, P. A smooth double proximal primal-dual algorithm for a class of distributed nonsmooth optimization problems. IEEE Trans. Autom. Control 2020, 65, 1800–1806.
  31. Liu, Q.; Yang, S.; Hong, Y. Constrained consensus algorithms with fixed step size for distributed convex optimization over multiagent networks. IEEE Trans. Autom. Control 2017, 62, 4259–4265.
  32. Bauschke, H.H.; Combettes, P.L. Convex Analysis and Monotone Operator Theory in Hilbert Spaces; Springer: Cham, Switzerland, 2011.
  33. Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
  34. Zhao, Y.; Liu, Q. A consensus algorithm based on collective neurodynamic system for distributed optimization with linear and bound constraints. Neural Netw. 2020, 122, 144–151.
  35. Yuan, D.; Xu, S.; Zhao, H. Distributed primal-dual subgradient method for multiagent optimization via consensus algorithms. IEEE Trans. Syst. Man Cybern. Part B Cybern. 2011, 41, 1715–1724.
  36. Ćalasan, M.; Micev, M.; Ali, Z.M.; Zobaa, A.F.; Aleem, S.H.E.A. Parameter estimation of induction machine single-cage and double-cage models using a hybrid simulated annealing–evaporation rate water cycle algorithm. Mathematics 2020, 8, 1024.
  37. Ali, Z.M.; Diaaeldin, I.M.; Aleem, S.H.E.A.; El-Rafei, A.; Abdelaziz, A.Y.; Jurado, F. Scenario-based network reconfiguration and renewable energy resources integration in large-scale distribution systems considering parameters uncertainty. Mathematics 2021, 9, 26.
  38. Rawa, M.; Abusorrah, A.; Bassi, H.; Mekhilef, S.; Ali, Z.M.; Aleem, S.; Hasanien, H.M.; Omar, A.I. Economical-technical-environmental operation of power networks with wind-solar-hydropower generation using analytic hierarchy process and improved grey wolf algorithm. Ain Shams Eng. J. 2021, 12, 2717–2734.
Figure 1. The 8-agent communication network.
Figure 2. Trajectory of variable $x_i^k$. (a) Proposed algorithm. (b) Node-based consensus algorithm.
Figure 3. Convergence performance comparison.