1. Introduction
Recently, distributed data processing methods based on multi-agent networks have received much attention. Traditional methods gather all the data on one machine and perform the computation centrally. However, as the size of data continues to grow, this centralized strategy is limited by the computing power of the hardware. In contrast, distributed methods distribute computing tasks to agents over decentralized networks [1,2]. Each agent keeps an arithmetic unit and a memory unit. The agents interact with each other through communication links, and this communication occurs only among neighboring agents. Under these conditions, distributed methods can effectively solve the optimization problems common to sensor networks [3], economic dispatch [4,5,6], machine learning [7,8] and dynamic control [9].
The existing decentralized algorithms include some successful results [10,11,12,13,14,15]. Previous works considered problem models composed of a single function. With a fixed stepsize, Shi et al. designed EXTRA [10], which converges exactly to the optimal solution. Lei et al. studied problems with bound constraints and proposed a primal-dual algorithm [11]. In addition, recent works [16,17,18] investigated general distributed optimization problems by designing decentralized subgradient-based algorithms, but diminishing or non-summable stepsizes are utilized, which may cause slow convergence rates [19].
To make full use of such structural properties, some scholars have studied nonsmooth composite optimization problems, which possess both smooth and nonsmooth components. By extending EXTRA to nonsmooth composite optimization, Shi et al. proposed PG-EXTRA [20]. Li et al. introduced a network-independent stepsize into PG-EXTRA and then developed NIDS [21]. In addition, Aybat et al. proposed DPDA-D [22] for time-varying networks. Considering the situation where the nonsmooth term cannot be split, Xu et al. proposed the algorithm in [23]. PG-ADMM [24] was designed based on the distributed alternating direction method of multipliers. In particular, nonsmooth composite optimization also includes a class of problems consisting of three functions and a linear operator. This structure is mainly discussed in centralized optimization [25,26,27,28], and recently some distributed works have also appeared in [29,30]. In this paper, inspired by the constrained optimization problem [31], we study constrained nonsmooth composite optimization problems over networks.
The contributions of this paper can be summarized as follows:
This paper focuses on an optimization problem with partially smooth and nonsmooth objective functions, where the decision variable satisfies local equality and feasibility constraints, unlike the works [10,16,18,19,20,21] that do not consider constraints. To solve this problem, we propose a novel decentralized algorithm that combines a primal-dual framework with proximal operators, which avoids estimating subgradients of the nonsmooth terms.
Different from existing node-based methods [16,17,18,19,20,21], the proposed algorithm adopts an edge-based communication pattern that explicitly highlights the process of information exchange among neighboring agents and further removes the dependence on Laplacians [13]. This consideration also makes it possible to use uncoordinated stepsizes instead of the common global or dynamic ones [10,12,16,18,19,21].
By employing the first-order optimality conditions and the fixed-point theory of operators, the convergence is proved, and a sublinear rate is established (k is the number of iterations); i.e., at most iterations are needed in order to reach an accuracy of .
Organization: The rest of this paper is organized as follows. In
Section 2, the necessary notations and basic knowledge are first provided, and then we describe the optimization problem over the networks and necessary assumptions.
Section 3 supplies the development of the proposed decentralized algorithm. In
Section 4, the convergence analysis for the proposed algorithm is provided. In
Section 5, we use the simulation experiments to verify the theoretical analysis. Finally, conclusions are given in
Section 6.
2. Preliminaries
In this section, we introduce the notations involved in this paper. Meanwhile, the objective problem and its explanation are also supplied.
2.1. Graph Theory and Notations
Graph theory is used to construct the mathematical model of the communication network. Let the network be described as a graph, where is the set of vertices and is the set of edges. For an agent , denotes the set of its neighbors. Let the unordered pair represent the edge between agent i and agent j. However, the ordered pairs (i, j) and (j, i) are still distinguished, i.e., the variables associated with them are different.
Next, we explain the notations that appear in this paper. Let $\mathbb{R}$ represent the set of real numbers, $\mathbb{R}^n$ the $n$-dimensional vector space, and $\mathbb{R}^{n\times m}$ the set of all $n\times m$ real matrices. We define $I_n$ as the $n$-dimensional identity operator and $0_n$ as the $n$-dimensional null vector; the null matrix is defined analogously, and the subscripts are omitted when the dimensions are clear from the context. The block diagonal matrix grouped by the matrices $P$ and $Q$ is written $\operatorname{blkdiag}(P,Q)$. For a matrix $P$, $P^{\top}$ denotes its transpose, and $\|\cdot\|_P$ denotes the norm induced by $P$. The subdifferential of a function $f$ is $\partial f(x)=\{g: f(y)\ge f(x)+\langle g,\,y-x\rangle,\ \forall y\}$. The conjugate function is defined by $f^{*}(y)=\sup_{x}\{\langle y,x\rangle-f(x)\}$. For a positive constant $\gamma$, the resolvent of the proximal operator is $\operatorname{prox}_{\gamma f}=(I+\gamma\,\partial f)^{-1}$, while the resolvent with respect to a matrix $M$ is $(I+M^{-1}\partial f)^{-1}$. Moreover, let the optimal solution set of a solvable optimization problem over networks be denoted by $\mathcal{X}^{*}$.
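As a concrete illustration of the proximal notation above, the following sketch (not from the paper; the function and test values are illustrative) evaluates the proximal operator of the $\ell_1$-norm, whose resolvent form reduces to componentwise soft-thresholding.

```python
import numpy as np

def prox_l1(v, gamma):
    """Proximal operator (resolvent of the subdifferential) of f = ||.||_1:
    prox_{gamma f}(v) = argmin_x f(x) + (1/(2*gamma)) * ||x - v||^2,
    which for the l1-norm is the componentwise soft-thresholding map."""
    return np.sign(v) * np.maximum(np.abs(v) - gamma, 0.0)

v = np.array([2.0, -0.3, 0.7])
print(prox_l1(v, 0.5))  # shrinks each coordinate toward zero by 0.5
```

Coordinates whose magnitude falls below the threshold are mapped exactly to zero, which is why such resolvents are favored for nonsmooth terms.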
2.2. Decentralized Optimization Problem
The constrained composite optimization problem over networks studied in this paper is based on the network
with
m agents. Specifically, the formulation of the problem is established as follows:
In problem (1),
is the decision variable;
and
are two private cost functions to agent
i, where the former has the Lipschitz continuous gradient, but the latter may be nonsmooth;
is a vector and
is a linear operator. Convex set
gives the box constraints to the decision variable of agent
i.
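Since the box constraint recurs throughout the paper, it may help to note that the Euclidean projection onto a box is a componentwise clipping operation; the sketch below uses illustrative bounds, not the paper's data.

```python
import numpy as np

def proj_box(v, lo, hi):
    """Euclidean projection onto the box [lo, hi]^n; this is also the
    proximal operator of the box's indicator function (0 inside, +inf outside)."""
    return np.clip(v, lo, hi)

v = np.array([-2.0, 0.4, 3.1])
print(proj_box(v, -1.0, 1.0))  # clips each coordinate into [-1, 1]
```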
To clarify the properties of problem (1), the following necessary assumption is given.
Assumption 1. For any agent :
- (i)
The cost function is convex and has a Lipschitz continuous gradient; i.e., for the positive Lipschitz constant , the following inequality holds for the gradient :
- (ii)
The local cost function is a nonsmooth and convex function.
- (iii)
The optimal solution to objective problem (1) exists, which satisfies both the equality constraints and the box constraints.
- (iv)
The graph is undirected and connected.
Note that the cost functions
and
are separable. Hence, we introduce the consensus constraint to transform problem (1) into the structure that can be computed in a decentralized manner:
Define the set
and consider the indicator function
such that Problem (3) can be processed by the penalty function method. For
and
, let
if
and
otherwise. Thus, Problem (3) is equivalent to the following problem:
Then, let
be the global variable. For
and
, we introduce a linear operator
, which generates the edge-based variable from
x. With the set
, the constraint in the problem (4) can be transformed into another penalty function. Therefore, the problem (1) is finally equivalent to the following problem:
Based on the problem (
5), we design a novel decentralized algorithm to solve the constrained composite optimization problem over networks in the next section.
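The linear operator that generates the edge-based variables is not written out in this extracted text. A common construction, assumed here only for illustration, stacks the difference $x_i - x_j$ for every edge, so that the consensus constraint is equivalent to the stacked operator annihilating $x$:

```python
import numpy as np

# Hypothetical 4-agent ring network; edges as unordered pairs (i, j) with i < j.
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
m = 4

# Incidence-style edge-based operator: row e maps x to x_i - x_j for edge e = (i, j).
D = np.zeros((len(edges), m))
for e, (i, j) in enumerate(edges):
    D[e, i], D[e, j] = 1.0, -1.0

x_consensus = np.full(m, 2.5)            # all agents hold the same value
x_disagree = np.array([1.0, 2.0, 3.0, 4.0])
print(np.allclose(D @ x_consensus, 0))   # consensus vectors lie in the null space
print(np.allclose(D @ x_disagree, 0))    # disagreement is detected edge by edge
```

On a connected graph, the null space of such an operator is exactly the consensus subspace, which is what makes the penalty reformulation equivalent to the original problem.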
3. Algorithm Development
This section presents the design process of the proposed algorithm.
Notice that Problem (
5) is an unconstrained problem. According to [
32] (Proposition 19.20), we obtain the following Lagrangian function:
where
,
and
are dual variables, and
,
and
are the conjugate functions of
,
,
, respectively. Notice that
is an edge-based variable, where
is the local variable of agent
i and
is for agent
j. Then, the last term of the Lagrangian function (
6) satisfies:
Thus, the Lagrangian function (
6) can also be written as
Taking the partial derivatives of the Lagrangian function (
7) and combining the operator splitting method [
29], we propose a new update flow as follows:
where
,
,
are the auxiliary variables, and
,
, and
are positive stepsizes. Notice that the stepsizes are uncoordinated: they can be selected independently by different agents, each within its own acceptable range. Additionally, the edge-based parameters
can be seen as inherent parameters of the communication network, revealing the quality of the communication.
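The concrete updates in (8) are not reproduced in this extracted text. As a hedged stand-in, the sketch below implements a generic primal-dual proximal iteration of Chambolle-Pock type (a relative of the update flow above, not the paper's exact scheme) on a toy problem whose solution is known in closed form; here tau and sigma play the role of the stepsizes.

```python
import numpy as np

def soft(v, t):                     # prox of t * ||.||_1 (soft-thresholding)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# Toy problem: min_x 0.5*||x - b||^2 + lam*||x||_1, whose solution is soft(b, lam).
b, lam = np.array([3.0, -0.5, 1.0]), 1.0
tau = sigma = 0.9                   # stepsizes satisfying tau*sigma*||A||^2 <= 1 (A = I)

x, y = np.zeros(3), np.zeros(3)
for _ in range(20000):
    x_old = x
    # primal step: prox of tau * 0.5*||. - b||^2
    x = (x - tau * y + tau * b) / (1.0 + tau)
    # dual step: prox of sigma * h*, where h = lam*||.||_1, so h* is the
    # indicator of the infinity-ball of radius lam and its prox is a clip
    y = np.clip(y + sigma * (2 * x - x_old), -lam, lam)

print(np.allclose(x, soft(b, lam), atol=1e-3))  # primal iterate reaches the minimizer
```

The dual update only needs a projection, mirroring how the conjugate functions enter the Lagrangian (6); per-agent (uncoordinated) stepsizes generalize the scalar tau and sigma used here.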
The steps related to the edge-based variables in update flow (
8) cannot be conducted directly, so we next replace them with the agent-based variables. We apply the Moreau decomposition to the first step in update flow (
8) such that for the second term on the right side, we have
Define (
9) as the projection
. Then, according to the definition of the set
, the projection has the following explicit expression:
Thus, for
,
, the update step for
can be decomposed into
Moreover, the update step for
can be replaced by
Combining the update flow (
8), (
10) and (
11), we finally propose the decentralized algorithm for Problem (1) in Algorithm 1.
Here, we directly give the stepsize condition of Algorithm 1 in the following assumption. The specific theoretical origin of this condition can be found in the convergence
analysis section.
Assumption 2. (Stepsize conditions)
For any agent and , the stepsizes , , and are positive. Let the following condition hold, where is the Lipschitz constant for the gradient :

Algorithm 1 The Decentralized Algorithm
For each agent and all , let , , , , , and .
Each agent i repeats, for all ,
Agent i sends , , to all of its neighbors. The sequence is used to estimate the optimal solution.
4. Convergence Analysis
In this section, we first establish the compact form with operators of the proposed algorithm. Then, the results of the theoretical analysis are provided.
For
,
, we make the following definitions. Let
w and
represent the variables stacked by
and
, respectively. Define vectors
and
. Then, we let
,
,
and
be the stepsize matrices. Then
,
and
hold such that there exist
,
and
. The linear operator
is stacked by
. Considering the resolvent of the proximal operator, the update flow (
8) leads to the following equalities:
where
is the auxiliary variable.
Define two variables
and
. Based on the equalities in (
12), Algorithm 1 is equivalent to the following compact form described by the operators:
where the operators are given as follows:
Consider one iteration of the proposed algorithm as an operator
T. Then we let
be the fixed point of the operator
T such that
. Next, we conduct the convergence analysis.
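To make the fixed-point viewpoint concrete, the toy sketch below (not the paper's operator T) iterates a projected-gradient map, an averaged operator whose fixed point is exactly the constrained minimizer, illustrating the relationship that Lemma 1 establishes for the actual algorithm.

```python
import numpy as np

c = np.array([2.0, 0.3, -3.0])   # illustrative data for f(x) = 0.5*||x - c||^2
tau = 0.5                         # stepsize in (0, 1) keeps the map averaged

def T(x):
    """One iteration: gradient step on f, then projection onto [-1, 1]^n."""
    return np.clip(x - tau * (x - c), -1.0, 1.0)

x = np.zeros(3)
for _ in range(100):              # fixed-point iteration x <- T(x)
    x = T(x)

print(np.allclose(T(x), x))                 # x is (numerically) a fixed point of T
print(np.allclose(x, np.clip(c, -1, 1)))    # ...and the box-constrained minimizer
```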
Lemma 1. (Optimality analysis) Let Assumption 1 be satisfied. The fixed point related to the operator T meets the first-order optimality conditions of the objective problem, and is an optimal solution.
Proof. Substituting the fixed point into (
12), we have the following set of equalities:
which is also the KKT condition of the Lagrangian function (
6). Therefore,
is an optimal solution to problem (1). □
The relationship between the fixed point and the optimal solution is ensured by Lemma 1. Split the operator
as
, where we let
and further define another linear operator
With these definitions above, the following lemma provides the property of the operator T for convergence analysis.
Lemma 2. Under Assumption 1, the following inequality holds for : where for is the Lipschitz parameter matrix. Proof. With the definition of operator
, we have the equality
According to [
32] (Theorem 18.16), for
,
is cocoercive, i.e., it holds
Note that for any vectors
a and
b of the same dimension and any diagonal positive definite matrix
V, there exists the inequality
. Hence, we have
Combining (
14)–(
16), we can obtain the objective inequality and end the proof. □
Lemma 3. Under Assumption 1, the following inequality holds for : where is defined before Lemma 2. Proof. Considering the change of the optimal residual before and after one iteration, we have
From the second step of the update flow (
13), there exists
such that the equality (
18) leads to
From the first step of the update flow (
13), it holds that
Then, we discuss the right side of (
19). Note that Lemma 1 proves the equivalence between the fixed point and the optimal solution. Substituting the property of fixed points into the update flow (
13), we obtain
and
Hence, the third term on the right side of (
19) satisfies
where the inequality is based on Lemma 2. Notice that the operator
is monotone [
32] (Theorem 21.2 and Proposition 20.23), i.e., it holds
Since the linear operator
is a skew-symmetric matrix, it is monotone [
29]. Combining (
19)–(
21), we obtain
From the second step of the update flow (
13), it holds
where
,
and
are the linear operators. Considering that
is also a linear operator, the second term on the right side of (
22) has an equivalent form:
Substituting (
23) into (
22), we complete the proof. □
Summarizing the above lemmas, the following theorem supplies the convergence results.
Theorem 1. When Assumptions 1 and 2 are satisfied, for the sequence generated by the operator T, we have where . Then, the sequence has a sublinear rate , and the sequence converges to an optimal solution . Proof. With the definition of
, we have the following equality:
Substituting (
25) into (
17), we obtain the inequality (
24). Note that under Assumption 2, the matrix
is positive definite. Hence, the sequence
converges to the fixed point
. Meanwhile, utilizing [
10] (Proposition 1) results in the
rate, and based on Lemma 1, the convergence of
holds. □
In Theorem 1, positive definiteness of the induced matrices is needed, which leads to the stepsize conditions in Assumption 2.
5. Numerical Simulation
In this section, the theoretical analysis is verified through numerical simulation on a constrained optimization problem over networks.
The constrained quadratic programming problem [
33] is considered in the experiments, which has the formulation as follows:
where matrix
is diagonal and positive definite,
is a vector, and
is the penalty factor. Both
and
are vectors with constants, which give the bounds of the decision variable
. In light of (1), we can set
and
.
In this case, the dimension of the decision variable is set as
, and we let
. For
, the relevant data of Problem (26) are selected randomly. The elements of matrix
are in
, and the elements of the linear operator
are in
. Both vectors
and
take values in
. The box constraints are considered as
. Then, we set the uncoordinated stepsizes randomly as
, while
,
and
are in
. The numerical experiments are performed over the generated network with eight agents, which is displayed in
Figure 1. The simulations are carried out by running the distributed algorithms on a laptop with an Intel(R) Core i5-5500U CPU @ 2.40 GHz and 8.0 GB of RAM, using Matlab R2016a on the Windows 10 operating system.
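The randomized experiment data are not reproducible from the text alone. As a hedged illustration of problem (26) with a diagonal positive definite P, the sketch below solves a small box-constrained quadratic program by projected gradient and checks it against the per-coordinate closed form; all names, dimensions, and values are illustrative, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
p = rng.uniform(1.0, 2.0, n)     # diagonal of P (positive definite)
q = rng.uniform(-1.0, 1.0, n)
lo, hi = -0.5, 0.5               # hypothetical box bounds

def grad(x):
    return p * x + q             # gradient of 0.5*x^T P x + q^T x

x = np.zeros(n)
step = 1.0 / p.max()             # stepsize at most 1/L, L = largest eigenvalue of P
for _ in range(500):             # projected gradient iterations
    x = np.clip(x - step * grad(x), lo, hi)

# Because P is diagonal, the QP separates coordinatewise: x_i* = clip(-q_i/p_i, lo, hi).
print(np.allclose(x, np.clip(-q / p, lo, hi), atol=1e-6))
```

A decentralized solver such as Algorithm 1 should reach the same minimizer; the closed form is only available here because the illustrative P is diagonal.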
The simulation results are shown in
Figure 2 and
Figure 3. The transient behaviors of each component of
are displayed in
Figure 2, in which a node-based consensus algorithm [
34] is introduced as a comparative profile. Note that the obtained optimal solution from the proposed algorithm is in line with that of the node-based consensus one, i.e.,
, but the latter achieves a stable consensus after 15,000 iterations.
Figure 3 shows that our proposed algorithm outperforms the node-based and subgradient algorithms [
35] in terms of convergence performance by evaluating the relative errors
.
6. Conclusions
In this paper, a distributed algorithm based on proximal operators has been designed to deal with a class of distributed composite optimization problems, in which the local objective function has smooth and nonsmooth parts and the decision variable abides by both affine and feasibility constraints. Distinguishing attributes of the proposed algorithm include the use of uncoordinated stepsizes and edge-based communication, which avoids the dependency on Laplacian weight matrices. Meanwhile, the algorithm has been verified in theory and simulation. However, there are still aspects of this paper worth improving. For example, it is worth adopting efficient acceleration protocols (such as the Nesterov method and the heavy-ball method) to improve the convergence rate, and developing asynchronous distributed algorithms to deal with communication latency. In addition, more general optimization models and more efficient algorithms should be investigated in order to address potential applications, e.g., [
36,
37,
38] with nonconvex objectives, coupled and nonlinear constraints.
Author Contributions
L.F.: Conceptualization, Investigation, Project administration, Software, Writing—original draft. L.R.: Data curation, Software. G.M.: Software. J.T.: Project administration, Software. W.D.: Software. H.L.: Funding acquisition, Investigation, Methodology, Project administration, Resources. All authors have read and agreed to the published version of the manuscript.
Funding
The work described in this paper is supported in part by the Research Project Supported by Shanxi Scholarship Council of China (2020-139) and in part by the Xinzhou Teachers University Academic Leader Project.
Institutional Review Board Statement
Not applicable.
Data Availability Statement
Not applicable.
Acknowledgments
The authors gratefully acknowledge the technical and financial support received.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Li, H.; Su, E.; Wang, C.; Liu, J.; Xia, D. A primal-dual forward-backward splitting algorithm for distributed convex optimization. IEEE Trans. Emerg. Top. Comput. Intell. 2021, 1–7.
- Li, H.; Hu, J.; Ran, L.; Wang, Z.; Lü, Q.; Du, Z.; Huang, T. Decentralized Dual Proximal Gradient Algorithms for Non-Smooth Constrained Composite Optimization Problems. IEEE Trans. Parallel Distrib. Syst. 2021, 32, 2594–2605.
- Zhang, Y.; Lou, Y.; Hong, Y.; Xie, L. Distributed projection-based algorithms for source localization in wireless sensor networks. IEEE Trans. Wirel. Commun. 2015, 14, 3131–3142.
- Li, B.; Wang, Y.; Li, J.; Cao, S. A fully distributed approach for economic dispatch problem of smart grid. Energies 2018, 11, 1993.
- Cao, C.; Xie, J.; Yue, D.; Huang, C.; Wang, J.; Xu, S.; Chen, X. Distributed economic dispatch of virtual power plant under a non-ideal communication network. Energies 2017, 10, 235.
- Li, H.; Zheng, Z.; Lü, Q.; Wang, Z.; Gao, L.; Wu, G.; Ji, L.; Wang, H. Primal-dual fixed point algorithms based on adapted metric for distributed optimization. IEEE Trans. Neural Netw. Learn. Syst. 2021, 1–15.
- Ababei, C.; Moghaddam, M.G. A survey of prediction and classification techniques in multicore processor systems. IEEE Trans. Parallel Distrib. Syst. 2019, 30, 1184–1200.
- Liu, S.; Qiu, Z.; Xie, L. Convergence rate analysis of distributed optimization with projected subgradient algorithm. Automatica 2017, 83, 162–169.
- Li, Y.; Liao, X.; Li, C.; Huang, T.; Yang, D. Impulsive synchronization and parameter mismatch of the three-variable autocatalator model. Phys. Lett. A 2007, 366, 52–60.
- Shi, W.; Ling, Q.; Wu, G.; Yin, W. EXTRA: An exact first-order algorithm for decentralized consensus optimization. SIAM J. Optim. 2015, 25, 944–966.
- Lei, J.; Chen, H.F.; Fang, H.T. Primal-dual algorithm for distributed constrained optimization. Syst. Control Lett. 2016, 96, 110–117.
- Nedic, A.; Ozdaglar, A.; Parrilo, P.A. Constrained consensus and optimization in multi-agent networks. IEEE Trans. Autom. Control 2010, 55, 922–938.
- Gharesifard, B.; Cortes, J. Distributed continuous-time convex optimization on weight-balanced digraphs. IEEE Trans. Autom. Control 2014, 59, 781–786.
- Zhu, M.; Martinez, S. On distributed convex optimization under inequality and equality constraints. IEEE Trans. Autom. Control 2012, 57, 151–164.
- Wang, X.; Hong, Y.; Ji, H. Distributed optimization for a class of nonlinear multiagent systems with disturbance rejection. IEEE Trans. Cybern. 2016, 46, 1655–1666.
- Nedic, A.; Ozdaglar, A. Distributed subgradient methods for multiagent optimization. IEEE Trans. Autom. Control 2009, 54, 48–61.
- Zhang, L.; Liu, S. Projected subgradient based distributed convex optimization with transmission noises. Appl. Math. Comput. 2022, 418, 126794.
- Ren, X.; Li, D.; Xi, Y.; Shao, H. Distributed subgradient algorithm for multi-agent optimization with dynamic stepsize. IEEE/CAA J. Autom. Sin. 2021, 8, 1451–1464.
- Niu, Y.; Wang, H.; Wang, Z.; Xia, D.; Li, H. Primal-dual stochastic distributed algorithm for constrained convex optimization. J. Frankl. Inst. 2019, 356, 9763–9787.
- Shi, W.; Ling, Q.; Wu, G.; Yin, W. A proximal gradient algorithm for decentralized composite optimization. IEEE Trans. Signal Process. 2015, 63, 6013–6023.
- Li, Z.; Shi, W.; Yan, M. A decentralized proximal-gradient method with network independent step-sizes and separated convergence rates. IEEE Trans. Signal Process. 2019, 67, 4494–4506.
- Aybat, N.S.; Hamedani, E.Y. A distributed ADMM-like method for resource sharing over time-varying networks. SIAM J. Optim. 2019, 29, 3036–3068.
- Xu, J.; Tian, Y.; Sun, Y.; Scutari, G. Distributed algorithms for composite optimization: Unified framework and convergence analysis. IEEE Trans. Signal Process. 2021, 69, 3555–3570.
- Aybat, N.S.; Wang, Z.; Lin, T.; Ma, S. Distributed linearized alternating direction method of multipliers for composite convex consensus optimization. IEEE Trans. Autom. Control 2018, 63, 5–20.
- Latafat, P.; Patrinos, P. Asymmetric forward-backward-adjoint splitting for solving monotone inclusions involving three operators. Comput. Optim. Appl. 2017, 68, 57–93.
- Combettes, P.L.; Pesquet, J.C. Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators. Set-Valued Var. Anal. 2012, 20, 307–330.
- Condat, L. A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms. J. Optim. Theory Appl. 2013, 158, 460–479.
- Vu, B.C. A splitting algorithm for dual monotone inclusions involving cocoercive operators. Adv. Comput. Math. 2013, 38, 667–681.
- Latafat, P.; Freris, N.M.; Patrinos, P. A new randomized block-coordinate primal-dual proximal algorithm for distributed optimization. IEEE Trans. Autom. Control 2019, 64, 4050–4065.
- Wei, Y.; Fang, H.; Zeng, X.; Chen, J.; Pardalos, P. A smooth double proximal primal-dual algorithm for a class of distributed nonsmooth optimization problems. IEEE Trans. Autom. Control 2020, 65, 1800–1806.
- Liu, Q.; Yang, S.; Hong, Y. Constrained consensus algorithms with fixed step size for distributed convex optimization over multiagent networks. IEEE Trans. Autom. Control 2017, 62, 4259–4265.
- Bauschke, H.H.; Combettes, P.L. Convex Analysis and Monotone Operator Theory in Hilbert Spaces; Springer: Cham, Switzerland, 2011.
- Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
- Zhao, Y.; Liu, Q. A consensus algorithm based on collective neurodynamic system for distributed optimization with linear and bound constraints. Neural Netw. 2020, 122, 144–151.
- Yuan, D.; Xu, S.; Zhao, H. Distributed primal-dual subgradient method for multiagent optimization via consensus algorithms. IEEE Trans. Syst. Man Cybern. Part B Cybern. 2011, 41, 1715–1724.
- Ćalasan, M.; Micev, M.; Ali, Z.M.; Zobaa, A.F.; Aleem, S.H.E.A. Parameter estimation of induction machine single-cage and double-cage models using a hybrid simulated annealing–evaporation rate water cycle algorithm. Mathematics 2020, 8, 1024.
- Ali, Z.M.; Diaaeldin, I.M.; Aleem, S.H.E.A.; El-Rafei, A.; Abdelaziz, A.Y.; Jurado, F. Scenario-based network reconfiguration and renewable energy resources integration in large-scale distribution systems considering parameters uncertainty. Mathematics 2021, 9, 26.
- Rawa, M.; Abusorrah, A.; Bassi, H.; Mekhilef, S.; Ali, Z.M.; Aleem, S.; Hasanien, H.M.; Omar, A.I. Economical-technical-environmental operation of power networks with wind-solar-hydropower generation using analytic hierarchy process and improved grey wolf algorithm. Ain Shams Eng. J. 2021, 12, 2717–2734.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).