Article

Gauss–Newton–Secant Method for Solving Nonlinear Least Squares Problems under Generalized Lipschitz Conditions

Ioannis K. Argyros, Stepan Shakhno, Roman Iakymchuk, Halyna Yarmola and Michael I. Argyros
1 Department of Mathematical Sciences, Cameron University, Lawton, OK 73505, USA
2 Department of Theory of Optimal Processes, Ivan Franko National University of Lviv, Universytetska Str. 1, 79000 Lviv, Ukraine
3 PEQUAN, LIP6, Sorbonne Université, 4 Place Jussieu, 75252 Paris, France
4 Fraunhofer ITWM, Fraunhofer-Platz 1, 67663 Kaiserslautern, Germany
5 Department of Computational Mathematics, Ivan Franko National University of Lviv, Universytetska Str. 1, 79000 Lviv, Ukraine
6 Department of Computer Science, University of Oklahoma, Norman, OK 73071, USA
* Author to whom correspondence should be addressed.
Axioms 2021, 10(3), 158; https://doi.org/10.3390/axioms10030158
Submission received: 22 June 2021 / Revised: 16 July 2021 / Accepted: 19 July 2021 / Published: 21 July 2021
(This article belongs to the Special Issue Numerical Analysis and Computational Mathematics)

Abstract

We develop a local convergence analysis of an iterative method for solving nonlinear least squares problems with operator decomposition under classical and generalized Lipschitz conditions. We consider the cases of both zero and nonzero residuals and determine the corresponding convergence orders. We use two types of Lipschitz conditions (center and restricted region conditions) to study the convergence of the method. Moreover, we obtain a larger radius of convergence and tighter error estimates than in previous works. Hence, we extend the applicability of this method under the same computational effort.

1. Introduction

Nonlinear least squares problems often arise when solving overdetermined systems of nonlinear equations, estimating parameters of physical processes from measurement results, constructing nonlinear regression models for solving engineering problems, etc. The most widely used method for solving nonlinear least squares problems is the Gauss–Newton method [1]. In the case when the derivative cannot be calculated, difference methods are used [2,3].
Some nonlinear functions have a differentiable and a nondifferentiable part. In this case, a good idea is to use, instead of the Jacobian, the sum of the derivative of the differentiable part of the operator and a divided difference of the nondifferentiable part [4,5,6]. Numerical studies show that such methods converge faster than Gauss–Newton-type or difference methods.
In this paper, we study the local convergence of the Gauss–Newton–Secant method under classical and generalized Lipschitz conditions for the first-order Fréchet derivative and the divided differences.
Let us consider the nonlinear least squares problem:
$$\min_{x \in \mathbb{R}^p} \tfrac{1}{2}\,(F(x) + G(x))^T (F(x) + G(x)), \tag{1}$$
where the residual function $F + G : \mathbb{R}^p \to \mathbb{R}^m$ ($m \ge p$) is nonlinear in $x$, $F$ is a continuously differentiable function, and $G$ is a continuous function whose differentiability, in general, is not required.
We propose the following modification of the Gauss–Newton method combined with the Secant-type method [4,6] for finding the solution to problem (1):
$$x_{n+1} = x_n - (A_n^T A_n)^{-1} A_n^T (F(x_n) + G(x_n)), \quad n = 0, 1, \ldots, \tag{2}$$
where $A_n = F'(x_n) + G(x_n, x_{n-1})$, $F'(x_n)$ is the Fréchet derivative of $F(x)$, $G(x_n, x_{n-1})$ is a first-order divided difference of the function $G(x)$ [7] at the points $x_n$, $x_{n-1}$, and $x_0$, $x_{-1}$ are given.
Setting $A_n = F'(x_n)$ in (2), we obtain an iterative Gauss–Newton-type method for solving problem (1):
$$x_{n+1} = x_n - (F'(x_n)^T F'(x_n))^{-1} F'(x_n)^T (F(x_n) + G(x_n)), \quad n = 0, 1, \ldots. \tag{3}$$
For $m = p$, problem (1) turns into a system of nonlinear equations:
$$F(x) + G(x) = 0. \tag{4}$$
In this case, method (2) is transformed into the combined Newton–Secant method [8,9,10]:
$$x_{n+1} = x_n - (F'(x_n) + G(x_n, x_{n-1}))^{-1} (F(x_n) + G(x_n)), \quad n = 0, 1, \ldots, \tag{5}$$
and method (3) into the Newton-type method for solving nonlinear equations [11]:
$$x_{n+1} = x_n - (F'(x_n))^{-1} (F(x_n) + G(x_n)), \quad n = 0, 1, \ldots. \tag{6}$$
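For readers who wish to experiment with iteration (2), the following Python sketch implements it. The coordinate-wise construction of the first-order divided difference is one common choice in the spirit of generalized divided differences [7]; the function names, the handling of coinciding coordinates, and the stopping rule are illustrative assumptions rather than part of the method's analysis.

```python
import numpy as np

def divided_difference(G, x, y):
    """One coordinate-wise first-order divided difference G(x, y) (an m x p matrix)."""
    m, p = G(x).size, x.size
    D = np.zeros((m, p))
    for j in range(p):
        u = np.concatenate((x[:j + 1], y[j + 1:]))  # x_1, ..., x_j, y_{j+1}, ..., y_p
        v = np.concatenate((x[:j], y[j:]))          # x_1, ..., x_{j-1}, y_j, ..., y_p
        if x[j] != y[j]:
            D[:, j] = (G(u) - G(v)) / (x[j] - y[j])
    return D

def gauss_newton_secant(F, dF, G, x0, x_m1, tol=1e-8, max_iter=100):
    """Iteration (2): x_{n+1} = x_n - (A_n^T A_n)^{-1} A_n^T (F(x_n) + G(x_n))."""
    x_prev = np.asarray(x_m1, dtype=float)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        A = dF(x) + divided_difference(G, x, x_prev)           # A_n = F'(x_n) + G(x_n, x_{n-1})
        step = np.linalg.solve(A.T @ A, A.T @ (F(x) + G(x)))   # normal equations
        x_prev, x = x, x - step
        if np.linalg.norm(x - x_prev) <= tol:
            break
    return x
```

In practice, one would solve the linear least squares subproblem by a QR factorization of $A_n$ rather than by forming the normal equations; the explicit form above simply mirrors formula (2).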
In general, the convergence domain of these methods is small and the error estimates are pessimistic, which restricts their applicability. The novelty of our work is the claim that these problems can be addressed without adding hypotheses. In particular, our idea is to use center and restricted radius Lipschitz conditions. Such an approach to the study of the convergence of methods allows us to extend the convergence ball of the method and to improve the error estimates.
The remainder of the paper is organized as follows: Section 2 deals with the local convergence analysis. The numerical experiments appear in Section 3. Section 4 contains the concluding remarks and ideas about future works.

2. Local Convergence Analysis

Let us consider, at first, some auxiliary lemmas needed to obtain the main results. Let D be an open subset of R p .
Lemma 1
([4]). Let $e(t) = \int_0^t E(u)\,du$, where $E$ is an integrable and positive nondecreasing function on $[0, T]$. Then, $e(t)$ is monotonically increasing with respect to $t$ on $[0, T]$.
Lemma 2
([1,12]). Let $h(t) = \frac{1}{t}\int_0^t H(u)\,du$, where $H$ is an integrable and positive nondecreasing function on $[0, T]$. Then, $h(t)$ is nondecreasing with respect to $t$ on $(0, T]$.
Additionally, $h(t)$ at $t = 0$ is defined as $h(0) = \lim_{t \to 0} \frac{1}{t}\int_0^t H(u)\,du$.
Lemma 3
([13]). Let $s(t) = \frac{1}{t^2}\int_0^t S(u)\,u\,du$, where $S$ is an integrable and positive nondecreasing function on $[0, T]$. Then, $s(t)$ is nondecreasing with respect to $t$ on $(0, T]$.
Definition 1.
The Fréchet derivative $F'$ satisfies the center Lipschitz condition on $D$ with $L_0$ average if
$$\|F'(x) - F'(x^*)\| \le \int_0^{\rho(x)} L_0(u)\,du, \quad \text{for each } x \in D \subset \mathbb{R}^p, \tag{7}$$
where $\rho(x) = \|x - x^*\|$, $x^* \in D$ is a solution of problem (1), and $L_0$ is an integrable, positive, and nondecreasing function on $[0, T]$.
The functions $M_0$, $L$, $M$, $L_1$, and $M_1$ introduced next are, like the function $L_0$, integrable, positive, and nondecreasing functions defined on $[0, 2R]$.
Definition 2.
The first-order divided difference $G(x, y)$ satisfies the center Lipschitz condition on $D \times D$ with $M_0$ average if
$$\|G(x, y) - G(x^*, x^*)\| \le \int_0^{\rho(x) + \rho(y)} M_0(u)\,du, \quad \text{for each } x, y \in D. \tag{8}$$
Let $B > 0$ and $\alpha > 0$. We define the function $\varphi$ on $[0, +\infty)$ by
$$\varphi(t) = B\left[2\alpha + \int_0^t L_0(u)\,du + \int_0^{2t} M_0(u)\,du\right]\left[\int_0^t L_0(u)\,du + \int_0^{2t} M_0(u)\,du\right]. \tag{9}$$
Suppose that the equation
$$\varphi(t) = 1 \tag{10}$$
has at least one positive solution, and denote by $\gamma$ the minimal such solution. Then, we can define $\Omega_0 = D \cap \Omega(x^*, \gamma)$, where $\Omega(x^*, \gamma) = \{x : \|x - x^*\| < \gamma\}$.
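As a small illustration of how $\gamma$ can be computed: when $L_0$ and $M_0$ are constants, $\varphi(t) = B(2\alpha + (L_0 + 2M_0)t)(L_0 + 2M_0)t$, so $\varphi(t) = 1$ is a quadratic equation with a closed-form minimal positive root; for non-constant averages one can solve $\varphi(t) = 1$ numerically, since $\varphi$ is increasing. The sketch below shows both variants; the names, the sample constants, and the bisection interval are assumptions made only for illustration.

```python
import math

def gamma_constant(B, alpha, L0, M0):
    """Minimal positive root of phi(t) = B*(2*alpha + T0*t)*T0*t = 1 with T0 = L0 + 2*M0."""
    T0 = L0 + 2 * M0
    return (math.sqrt(B**2 * alpha**2 + B) - B * alpha) / (B * T0)

def gamma_bisection(phi, t_hi, tol=1e-12):
    """Minimal positive solution of phi(t) = 1 for a continuous, increasing phi with phi(0) < 1 <= phi(t_hi)."""
    lo, hi = 0.0, t_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) < 1.0 else (lo, mid)
    return 0.5 * (lo + hi)

# Sample constants for illustration:
B, alpha, L0, M0 = 0.5, math.sqrt(2), 0.6, 0.4
phi = lambda t: B * (2 * alpha + (L0 + 2 * M0) * t) * (L0 + 2 * M0) * t
print(gamma_constant(B, alpha, L0, M0), gamma_bisection(phi, t_hi=10.0))  # both approx 0.4184
```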
Definition 3.
The Fréchet derivative $F'$ satisfies the restricted radius Lipschitz condition on $\Omega_0$ with $L$ average if
$$\|F'(x) - F'(x^\tau)\| \le \int_{\tau\rho(x)}^{\rho(x)} L(u)\,du, \quad x^\tau = x^* + \tau(x - x^*), \; 0 \le \tau \le 1, \quad \text{for each } x \in \Omega_0. \tag{11}$$
Definition 4.
The first-order divided difference $G(x, y)$ satisfies the restricted radius Lipschitz condition on $\Omega_0$ with $M$ average if
$$\|G(x, y) - G(u, v)\| \le \int_0^{\|x - u\| + \|y - v\|} M(t)\,dt, \quad \text{for each } x, y, u, v \in \Omega_0. \tag{12}$$
Definition 5.
The Fréchet derivative $F'$ satisfies the radius Lipschitz condition on $D$ with $L_1$ average if
$$\|F'(x) - F'(x^\tau)\| \le \int_{\tau\rho(x)}^{\rho(x)} L_1(u)\,du, \quad 0 \le \tau \le 1, \quad \text{for each } x \in D. \tag{13}$$
Definition 6.
The first-order divided difference $G(x, y)$ satisfies the radius Lipschitz condition on $D$ with $M_1$ average if
$$\|G(x, y) - G(u, v)\| \le \int_0^{\|x - u\| + \|y - v\|} M_1(t)\,dt, \quad \text{for each } x, y, u, v \in D. \tag{14}$$
Remark 1.
It follows from the preceding definitions that $L = L(L_0, M_0)$, $M = M(L_0, M_0)$, and for each $t \in [0, \gamma]$
$$L_0(t) \le L_1(t), \tag{15}$$
$$L(t) \le L_1(t), \tag{16}$$
$$M(t) \le M_1(t), \tag{17}$$
since $\Omega_0 \subseteq D$. By $L(L_0, M_0)$, we mean that $L$ (or $M$) depends on $L_0$ and $M_0$ through the definition of $\Omega_0$. In case any of (15)–(17) are strict inequalities, the following benefits are obtained over the work in [4], which uses $L_1$, $M_1$ instead of the new functions:
(a1) 
An at least as large convergence region leading to at least as many initial choices;
(a2) 
At least as tight upper bounds on the distances $\|x_n - x^*\|$, so at most as many iterations are needed to obtain a desired error tolerance.
These benefits are obtained under the same computational effort as in [4], since the new functions $L_0$, $M_0$, $L$, and $M$ are special cases of the functions $L_1$ and $M_1$. This technique of using the center Lipschitz condition in combination with the restricted convergence region has been used by us for Newton's, Secant, and Newton-like methods [14,15], and it can be applied to other methods, too, with the same benefits.
The proof of the next result follows the corresponding one in [4], but with crucial differences: we use $(L_0, L)$ instead of $L_1$ and $(M_0, M)$ instead of the $M_1$ used in [4].
We use the Euclidean norm. Note that for the Euclidean norm the equality $\|A - B\| = \|A^T - B^T\|$ holds, where $A, B \in \mathbb{R}^{m \times p}$.
Theorem 1.
Let $F + G : \mathbb{R}^p \to \mathbb{R}^m$ be continuous on an open convex subset $D \subset \mathbb{R}^p$, let $F$ be a continuously differentiable function, and let $G$ be a continuous function. Suppose that problem (1) has a solution $x^* \in D$; the inverse operator
$$(A_*^T A_*)^{-1} = \big[(F'(x^*) + G(x^*, x^*))^T (F'(x^*) + G(x^*, x^*))\big]^{-1} \tag{18}$$
exists such that $\|(A_*^T A_*)^{-1}\| \le B$; conditions (7), (8), (11), and (12) hold; and $\gamma$ given in (10) exists.
Furthermore,
$$\|F(x^*) + G(x^*)\| \le \eta, \qquad \|F'(x^*) + G(x^*, x^*)\| \le \alpha; \tag{19}$$
$$\frac{B}{R}\left[\int_0^R L_0(u)\,du + \int_0^{2R} M_0(u)\,du\right]\eta < 1; \tag{20}$$
and $\Omega = \Omega(x^*, r^*) \subseteq D$, where $r^*$ is the unique positive zero of the function $q$ given by
$$q(r) = B\bigg\{\Big[\alpha + \int_0^r L_0(u)\,du + \int_0^{2r} M_0(u)\,du\Big]\Big[\frac{1}{r}\int_0^r L(u)\,u\,du + \int_0^r M(u)\,du\Big] + \Big[2\alpha + \int_0^r L_0(u)\,du + \int_0^{2r} M_0(u)\,du\Big]\Big[\int_0^r L_0(u)\,du + \int_0^{2r} M_0(u)\,du\Big] + \Big[\frac{1}{r}\int_0^r L_0(u)\,du + \frac{1}{r}\int_0^{2r} M_0(u)\,du\Big]\eta\bigg\} - 1. \tag{21}$$
Then, for $x_0, x_{-1} \in \Omega$, the iterative sequence $\{x_n\}$, $n = 0, 1, \ldots$, generated by (2) is well defined, remains in $\Omega$, and converges to $x^*$. Moreover, the following error estimate holds for each $n = 0, 1, 2, \ldots$:
$$\|x_{n+1} - x^*\| \le C_1\|x_{n-1} - x^*\| + C_2\|x_n - x^*\| + C_3\|x_{n-1} - x^*\|\,\|x_n - x^*\| + C_4\|x_n - x^*\|^2, \tag{22}$$
where
$$g(r^*) = \frac{B}{1 - \varphi(r^*)}; \qquad C_1 = g(r^*)\,\frac{1}{2r^*}\int_0^{2r^*} M_0(u)\,du\;\eta; \tag{23}$$
$$C_2 = g(r^*)\left[\frac{1}{r^*}\int_0^{r^*} L_0(u)\,du + \frac{1}{2r^*}\int_0^{2r^*} M_0(u)\,du\right]\eta; \tag{24}$$
$$C_3 = g(r^*)\left[\alpha + \int_0^{r^*} L_0(u)\,du + \int_0^{2r^*} M_0(u)\,du\right]\frac{1}{r^*}\int_0^{r^*} M(u)\,du; \tag{25}$$
$$C_4 = g(r^*)\left[\alpha + \int_0^{r^*} L_0(u)\,du + \int_0^{2r^*} M_0(u)\,du\right]\frac{1}{(r^*)^2}\int_0^{r^*} L(u)\,u\,du. \tag{26}$$
Proof. 
We obtain
$$\lim_{r \to 0+} \frac{1}{r}\int_0^r L_0(u)\,du \le \lim_{r \to 0+} \frac{L_0(r)\,r}{r} = L_0(0), \tag{27}$$
$$\lim_{r \to 0+} \frac{1}{r}\int_0^{2r} M_0(u)\,du \le \lim_{r \to 0+} \frac{M_0(2r)\,2r}{r} = 2M_0(0), \tag{28}$$
since $L_0$ and $M_0$ are positive and nondecreasing functions on $[0, R]$ and $[0, 2R]$, respectively. Taking into account Lemma 1, for a sufficiently small $\eta$ we have $q(0) = B(L_0(0) + 2M_0(0))\eta - 1 < 0$. For a sufficiently large $R$, the inequality $q(R) > 0$ holds. By the intermediate value theorem, the function $q$ has a positive zero on $(0, R)$, denoted by $r^*$. Moreover, this zero is the only one on $(0, R)$. Indeed, according to Lemma 2, the function $\left[\frac{1}{r}\int_0^r L_0(u)\,du + \frac{1}{r}\int_0^{2r} M_0(u)\,du\right]\eta$ is nondecreasing with respect to $r$ on $(0, R]$. By Lemma 1, the functions $\int_0^r L_0(u)\,du$, $\int_0^r M(u)\,du$, and $\int_0^{2r} M_0(u)\,du$ are monotonically increasing on $[0, R]$. Furthermore, by Lemma 3, the function $\int_0^r L(u)\,u\,du = r^2\cdot\frac{1}{r^2}\int_0^r L(u)\,u\,du$ is monotonically increasing with respect to $r$ on $(0, R]$. Therefore, $q(r)$ is monotonically increasing on $(0, R]$. Thus, the graph of the function $q$ crosses the positive $r$-axis only once on $(0, R)$. Finally, from the monotonicity of $q$ and since $q(\gamma) > 0$, we obtain $r^* < \gamma$, so $\Omega(x^*, r^*) \subseteq \Omega_0$.
We denote $A_n = F'(x_n) + G(x_n, x_{n-1})$. Let $n = 0$. By the assumption $x_0, x_{-1} \in \Omega$, we obtain the following estimate:
$$\begin{aligned} \|I - (A_*^T A_*)^{-1} A_0^T A_0\| &= \|(A_*^T A_*)^{-1}(A_*^T A_* - A_0^T A_0)\| \\ &= \big\|(A_*^T A_*)^{-1}\big[A_*^T(A_* - A_0) + (A_*^T - A_0^T)(A_0 - A_*) + (A_*^T - A_0^T)A_*\big]\big\| \\ &\le \big\|(A_*^T A_*)^{-1}\big\|\big[\|A_*^T\|\,\|A_* - A_0\| + \|A_*^T - A_0^T\|\,\|A_0 - A_*\| + \|A_*^T - A_0^T\|\,\|A_*\|\big] \\ &\le B\big[\alpha\|A_* - A_0\| + \|A_*^T - A_0^T\|\,\|A_0 - A_*\| + \alpha\|A_*^T - A_0^T\|\big]. \end{aligned} \tag{29}$$
Using conditions (7) and (8), we obtain
$$\begin{aligned} \|A_0 - A_*\| &= \|(F'(x_0) + G(x_0, x_{-1})) - (F'(x^*) + G(x^*, x^*))\| \\ &\le \|F'(x_0) - F'(x^*)\| + \|G(x_0, x_{-1}) - G(x^*, x^*)\| \le \int_0^{\rho_0} L_0(u)\,du + \int_0^{\rho_0 + \rho_{-1}} M_0(u)\,du, \end{aligned} \tag{30}$$
where $\rho_k = \rho(x_k)$. Then, from inequality (29) and the equation $q(r^*) = 0$, we obtain by (10)
$$\begin{aligned} \|I - (A_*^T A_*)^{-1} A_0^T A_0\| &\le B\Big[2\alpha + \int_0^{\rho_0} L_0(u)\,du + \int_0^{\rho_0 + \rho_{-1}} M_0(u)\,du\Big]\Big[\int_0^{\rho_0} L_0(u)\,du + \int_0^{\rho_0 + \rho_{-1}} M_0(u)\,du\Big] \\ &\le B\Big[2\alpha + \int_0^{r^*} L_0(u)\,du + \int_0^{2r^*} M_0(u)\,du\Big]\Big[\int_0^{r^*} L_0(u)\,du + \int_0^{2r^*} M_0(u)\,du\Big] < 1. \end{aligned} \tag{31}$$
Next, from (29)–(31) and the Banach lemma [16], it follows that $(A_0^T A_0)^{-1}$ exists, and
$$\begin{aligned} \|(A_0^T A_0)^{-1}\| \le g_0 &= B\Big\{1 - B\Big[2\alpha + \int_0^{\rho_0} L_0(u)\,du + \int_0^{\rho_0 + \rho_{-1}} M_0(u)\,du\Big]\Big[\int_0^{\rho_0} L_0(u)\,du + \int_0^{\rho_0 + \rho_{-1}} M_0(u)\,du\Big]\Big\}^{-1} \\ &\le g(r^*) = B\Big\{1 - B\Big[2\alpha + \int_0^{r^*} L_0(u)\,du + \int_0^{2r^*} M_0(u)\,du\Big]\Big[\int_0^{r^*} L_0(u)\,du + \int_0^{2r^*} M_0(u)\,du\Big]\Big\}^{-1}. \end{aligned} \tag{32}$$
Hence, $x_1$ is correctly defined. Next, we show that $x_1 \in \Omega(x^*, r^*)$.
Using the fact that
$$A_*^T(F(x^*) + G(x^*)) = (F'(x^*) + G(x^*, x^*))^T(F(x^*) + G(x^*)) = 0, \tag{33}$$
$x_0, x_{-1} \in \Omega(x^*, r^*)$, and the choice of $r^*$, we obtain the estimate
$$\begin{aligned} \|x_1 - x^*\| &= \big\|x_0 - x^* - (A_0^T A_0)^{-1}\big[A_0^T(F(x_0) + G(x_0)) - A_*^T(F(x^*) + G(x^*))\big]\big\| \\ &\le \big\|(A_0^T A_0)^{-1}\big\|\bigg\{\Big\|A_0^T\Big[A_0 - \int_0^1 F'(x^* + t(x_0 - x^*))\,dt - G(x_0, x^*)\Big](x_0 - x^*)\Big\| \\ &\qquad + \big\|(A_0^T - A_*^T)(F(x^*) + G(x^*))\big\|\bigg\}. \end{aligned} \tag{34}$$
So, considering the inequalities
$$\begin{aligned} \Big\|A_0 - \int_0^1 F'(x^* + t(x_0 - x^*))\,dt - G(x_0, x^*)\Big\| &= \Big\|F'(x_0) - \int_0^1 F'(x^* + t(x_0 - x^*))\,dt + G(x_0, x_{-1}) - G(x_0, x^*)\Big\| \\ &\le \int_0^1 \|F'(x_0) - F'(x_0^t)\|\,dt + \|G(x_0, x_{-1}) - G(x_0, x^*)\| \\ &\le \int_0^1\!\!\int_{t\rho_0}^{\rho_0} L(u)\,du\,dt + \int_0^{\rho_{-1}} M(u)\,du = \frac{1}{\rho_0}\int_0^{\rho_0} L(u)\,u\,du + \int_0^{\rho_{-1}} M(u)\,du \\ &\le \frac{1}{(r^*)^2}\int_0^{r^*} L(u)\,u\,du\;\rho_0 + \frac{1}{r^*}\int_0^{r^*} M(u)\,du\;\rho_{-1}, \end{aligned} \tag{35}$$
where $x_0^t = x^* + t(x_0 - x^*)$, and
$$\|A_0\| \le \|A_*\| + \|A_0 - A_*\| \le \alpha + \int_0^{\rho_0} L_0(u)\,du + \int_0^{\rho_0 + \rho_{-1}} M_0(u)\,du, \tag{36}$$
we obtain
$$\begin{aligned} \|x_1 - x^*\| &\le g_0\bigg\{\Big[\alpha + \int_0^{\rho_0} L_0(u)\,du + \int_0^{\rho_0 + \rho_{-1}} M_0(u)\,du\Big]\Big[\frac{1}{(r^*)^2}\int_0^{r^*} L(u)\,u\,du\;\rho_0 + \frac{1}{r^*}\int_0^{r^*} M(u)\,du\;\rho_{-1}\Big]\|x_0 - x^*\| \\ &\qquad + \eta\Big[\frac{1}{r^*}\int_0^{r^*} L_0(u)\,du\;\rho_0 + \frac{1}{2r^*}\int_0^{2r^*} M_0(u)\,du\;(\rho_0 + \rho_{-1})\Big]\bigg\} \\ &< g(r^*)\bigg\{\Big[\alpha + \int_0^{r^*} L_0(u)\,du + \int_0^{2r^*} M_0(u)\,du\Big]\Big[\frac{1}{r^*}\int_0^{r^*} L(u)\,u\,du + \int_0^{r^*} M(u)\,du\Big] \\ &\qquad + \frac{1}{r^*}\Big[\int_0^{r^*} L_0(u)\,du + \int_0^{2r^*} M_0(u)\,du\Big]\eta\bigg\}\,r^* = p(r^*)\,r^* = r^*, \end{aligned} \tag{37}$$
where
$$p(r) = g(r)\bigg\{\Big[\alpha + \int_0^{r} L_0(u)\,du + \int_0^{2r} M_0(u)\,du\Big]\Big[\frac{1}{r}\int_0^{r} L(u)\,u\,du + \int_0^{r} M(u)\,du\Big] + \frac{1}{r}\Big[\int_0^{r} L_0(u)\,du + \int_0^{2r} M_0(u)\,du\Big]\eta\bigg\}. \tag{38}$$
Therefore, $x_1 \in \Omega(x^*, r^*)$, and estimate (22) holds for $n = 0$.
Let us assume that $x_n \in \Omega(x^*, r^*)$ for $n = 0, 1, \ldots, k$ and that estimate (22) holds for $n = 0, 1, \ldots, k-1$, where $k \ge 1$ is an integer. We shall show that $x_{k+1} \in \Omega$ and that estimate (22) holds for $n = k$.
We can write
$$\begin{aligned} \|I - (A_*^T A_*)^{-1} A_k^T A_k\| &= \big\|(A_*^T A_*)^{-1}\big[A_*^T(A_* - A_k) + (A_*^T - A_k^T)(A_k - A_*) + (A_*^T - A_k^T)A_*\big]\big\| \\ &\le B\big[\alpha\|A_* - A_k\| + \|A_*^T - A_k^T\|\,\|A_k - A_*\| + \alpha\|A_*^T - A_k^T\|\big] \\ &\le B\Big[2\alpha + \int_0^{\rho_k} L_0(u)\,du + \int_0^{\rho_k + \rho_{k-1}} M_0(u)\,du\Big]\Big[\int_0^{\rho_k} L_0(u)\,du + \int_0^{\rho_k + \rho_{k-1}} M_0(u)\,du\Big] \\ &\le B\Big[2\alpha + \int_0^{r^*} L_0(u)\,du + \int_0^{2r^*} M_0(u)\,du\Big]\Big[\int_0^{r^*} L_0(u)\,du + \int_0^{2r^*} M_0(u)\,du\Big] < 1. \end{aligned} \tag{39}$$
Consequently, $(A_k^T A_k)^{-1}$ exists, and
$$\|(A_k^T A_k)^{-1}\| \le g_k = B\Big\{1 - B\Big[2\alpha + \int_0^{\rho_k} L_0(u)\,du + \int_0^{\rho_k + \rho_{k-1}} M_0(u)\,du\Big]\Big[\int_0^{\rho_k} L_0(u)\,du + \int_0^{\rho_k + \rho_{k-1}} M_0(u)\,du\Big]\Big\}^{-1} \le g(r^*). \tag{40}$$
Therefore, $x_{k+1}$ is correctly defined, and the following estimate holds:
$$\begin{aligned} \|x_{k+1} - x^*\| &= \big\|x_k - x^* - (A_k^T A_k)^{-1}\big[A_k^T(F(x_k) + G(x_k)) - A_*^T(F(x^*) + G(x^*))\big]\big\| \\ &\le \big\|(A_k^T A_k)^{-1}\big\|\bigg\{\Big\|A_k^T\Big[A_k - \int_0^1 F'(x^* + t(x_k - x^*))\,dt - G(x_k, x^*)\Big](x_k - x^*)\Big\| + \big\|(A_k^T - A_*^T)(F(x^*) + G(x^*))\big\|\bigg\} \\ &\le g_k\bigg\{\Big[\alpha + \int_0^{\rho_k} L_0(u)\,du + \int_0^{\rho_k + \rho_{k-1}} M_0(u)\,du\Big]\Big[\frac{1}{(r^*)^2}\int_0^{r^*} L(u)\,u\,du\;\rho_k + \frac{1}{r^*}\int_0^{r^*} M(u)\,du\;\rho_{k-1}\Big]\|x_k - x^*\| \\ &\qquad + \eta\Big[\frac{1}{r^*}\int_0^{r^*} L_0(u)\,du\;\rho_k + \frac{1}{2r^*}\int_0^{2r^*} M_0(u)\,du\;(\rho_k + \rho_{k-1})\Big]\bigg\} < p(r^*)\,r^* = r^*. \end{aligned} \tag{41}$$
This proves that $x_{k+1} \in \Omega(x^*, r^*)$ and that estimate (22) holds for $n = k$.
Thus, by induction, iteration (2) is correctly defined, $x_n \in \Omega(x^*, r^*)$, and estimate (22) holds for each $n = 0, 1, 2, \ldots$.
It remains to prove that $x_n \to x^*$ as $n \to \infty$.
Let us define the functions $a$ and $b$ on $[0, r^*]$ as
$$a(r) = g(r)\bigg\{\Big[\alpha + \int_0^{r} L_0(u)\,du + \int_0^{2r} M_0(u)\,du\Big]\Big[\frac{1}{r}\int_0^{r} L(u)\,u\,du + \int_0^{r} M(u)\,du\Big] + \Big[\frac{1}{r}\int_0^{r} L_0(u)\,du + \frac{1}{2r}\int_0^{2r} M_0(u)\,du\Big]\eta\bigg\}; \tag{42}$$
$$b(r) = g(r)\,\frac{1}{2r}\int_0^{2r} M_0(u)\,du\;\eta. \tag{43}$$
According to the choice of $r^*$, we obtain
$$a(r^*) \ge 0, \quad b(r^*) \ge 0, \quad a(r^*) + b(r^*) = 1. \tag{44}$$
Using estimate (22), the definitions of the functions $a$, $b$ and of the constants $C_i$ ($i = 1, 2, 3, 4$), we have
$$\|x_{n+1} - x^*\| \le C_1\|x_{n-1} - x^*\| + (C_2 + C_3 r^* + C_4 r^*)\|x_n - x^*\| = a(r^*)\|x_n - x^*\| + b(r^*)\|x_{n-1} - x^*\|. \tag{45}$$
According to the proof in [17], under conditions (42)–(45), the sequence $\{x_n\}$ converges to $x^*$ as $n \to \infty$. □
Corollary 1
([4]). The convergence order of method (2) for problem (1) with zero residual is equal to $\frac{1 + \sqrt{5}}{2}$.
If $\eta = 0$, we have a nonlinear least squares problem with zero residual. Then, the constants $C_1 = 0$ and $C_2 = 0$, and estimate (22) takes the form
$$\|x_{n+1} - x^*\| \le C_3\|x_{n-1} - x^*\|\,\|x_n - x^*\| + C_4\|x_n - x^*\|^2. \tag{46}$$
This inequality can be written as
$$\|x_{n+1} - x^*\| \le (C_3 + C_4)\|x_{n-1} - x^*\|\,\|x_n - x^*\|. \tag{47}$$
Then, we can write an equation for determining the convergence order as follows:
$$t^2 - t - 1 = 0. \tag{48}$$
Therefore, the positive root $t^* = \frac{1 + \sqrt{5}}{2}$ of the latter equation is the order of convergence of method (2).
In the case $G(x) \equiv 0$ in (1), we obtain the following consequences.
Corollary 2
([4]). The convergence order of method (2) for problem (1) with zero residual is quadratic.
Indeed, if $G(x) \equiv 0$, then $C_3 = 0$, and estimate (22) takes the form
$$\|x_{n+1} - x^*\| \le C_4\|x_n - x^*\|^2, \tag{49}$$
which indicates the quadratic convergence rate of method (2).
Remark 2.
If $L_0 = L = L_1$ and $M_0 = M = M_1$, our results specialize to the corresponding ones in [4]. Otherwise, they constitute an improvement, as already noted in Remark 1. As an example, let $q_1$, $g_1$, $C_1^1$, $C_2^1$, $C_3^1$, $C_4^1$, $r_1$ denote the functions and parameters obtained when $L_0$, $L$, $M_0$, $M$ are replaced by $L_1$, $L_1$, $M_1$, $M_1$, respectively. Then, in view of (15)–(17), we have
$$q(r) \le q_1(r), \tag{50}$$
$$g(r) \le g_1(r), \tag{51}$$
$$C_1 \le C_1^1, \tag{52}$$
$$C_2 \le C_2^1, \tag{53}$$
$$C_3 \le C_3^1, \tag{54}$$
and
$$C_4 \le C_4^1. \tag{55}$$
Hence, we have
$$r_1 \le r^*, \tag{56}$$
the new error bounds (22) being tighter than the corresponding bounds (6) in [4], and the rest of the advantages (already mentioned in Remark 1) holding true.
Next, we study the convergence of method (2) when $L_0$, $L$, $M_0$, and $M$ are constants, as a consequence of Theorem 1.
Corollary 3.
Let $F + G : \mathbb{R}^p \to \mathbb{R}^m$ be continuous on an open convex subset $D \subset \mathbb{R}^p$, let $F$ be continuously differentiable, and let $G$ be a continuous function on $D$. Suppose that problem (1) has a solution $x^* \in D$ and that the inverse operator
$$(A_*^T A_*)^{-1} = \big[(F'(x^*) + G(x^*, x^*))^T (F'(x^*) + G(x^*, x^*))\big]^{-1} \tag{57}$$
exists, such that $\|(A_*^T A_*)^{-1}\| \le B$.
Suppose that the Fréchet derivative $F'$ satisfies the classical Lipschitz conditions
$$\|F'(x) - F'(x^*)\| \le L_0\|x - x^*\|, \quad \text{for each } x \in D, \tag{58}$$
$$\|F'(x) - F'(y)\| \le L\|x - y\|, \quad \text{for each } x, y \in \Omega_0, \tag{59}$$
and that the function $G$ has a first-order divided difference $G(x, y)$ satisfying
$$\|G(x, y) - G(x^*, x^*)\| \le M_0(\|x - x^*\| + \|y - x^*\|), \quad \text{for each } x, y \in D, \tag{60}$$
$$\|G(x, y) - G(u, v)\| \le M(\|x - u\| + \|y - v\|), \quad \text{for each } x, y, u, v \in \Omega_0, \tag{61}$$
where $\Omega_0 = D \cap \Omega\!\left(x^*, \dfrac{\sqrt{B^2\alpha^2 + B} - B\alpha}{B(L_0 + 2M_0)}\right)$.
Furthermore,
$$\|F(x^*) + G(x^*)\| \le \eta, \quad \|F'(x^*) + G(x^*, x^*)\| \le \alpha, \quad B(L_0 + 2M_0)\eta < 1, \tag{62}$$
and $\Omega = \Omega(x^*, r^*) \subseteq D$, where
$$r^* = \frac{4(1 - BT_0\eta)}{B\alpha(4T_0 + T) + \sqrt{B^2\alpha^2(4T_0 + T)^2 + 8BT_0(2T_0 + T)(1 - BT_0\eta)}}, \tag{63}$$
$T_0 = L_0 + 2M_0$, $T = L + 2M$. Then, for each $x_0, x_{-1} \in \Omega$, the iterative sequence $\{x_n\}$, $n = 0, 1, \ldots$, generated by (2) is well defined, remains in $\Omega$, and converges to $x^*$, and the following error estimate holds for each $n = 0, 1, 2, \ldots$:
$$\|x_{n+1} - x^*\| \le C_1\|x_{n-1} - x^*\| + C_2\|x_n - x^*\| + C_3\|x_{n-1} - x^*\|\,\|x_n - x^*\| + C_4\|x_n - x^*\|^2, \tag{64}$$
where
$$g(r^*) = B\big[1 - B(2\alpha + (L_0 + 2M_0)r^*)(L_0 + 2M_0)r^*\big]^{-1}; \tag{65}$$
$$C_1 = g(r^*)M_0\eta; \quad C_2 = g(r^*)(L_0 + M_0)\eta; \tag{66}$$
$$C_3 = g(r^*)(\alpha + (L_0 + 2M_0)r^*)M; \tag{67}$$
$$C_4 = g(r^*)(\alpha + (L_0 + 2M_0)r^*)\frac{L}{2}. \tag{68}$$
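Assuming the constants $B$, $\alpha$, $\eta$, $L_0$, $L$, $M_0$, $M$ are already known for a given problem, the radius (63) and the constants (65)–(68) can be evaluated directly; the following sketch is one way to do this (the function name and return layout are illustrative). It is reused after Example 2 to check the radii reported in Table 3.

```python
import math

def corollary3_radius_and_constants(B, alpha, eta, L0, L, M0, M):
    """Convergence radius r* from (63) and the constants C1..C4 from (65)-(68)."""
    T0, T = L0 + 2 * M0, L + 2 * M
    disc = B**2 * alpha**2 * (4 * T0 + T)**2 + 8 * B * T0 * (2 * T0 + T) * (1 - B * T0 * eta)
    r = 4 * (1 - B * T0 * eta) / (B * alpha * (4 * T0 + T) + math.sqrt(disc))
    g = B / (1 - B * (2 * alpha + T0 * r) * T0 * r)   # g(r*) from (65)
    C1 = g * M0 * eta
    C2 = g * (L0 + M0) * eta
    C3 = g * (alpha + T0 * r) * M
    C4 = g * (alpha + T0 * r) * L / 2
    return r, (C1, C2, C3, C4)
```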
The proof of Corollary 3 is analogous to the proof of Theorem 1.

3. Numerical Examples

In this section, we give examples to show the applicability of method (2) and to confirm Remark 2. We use the norm $\|x\| = \sqrt{\sum_{i=1}^p x_i^2}$ for $x \in \mathbb{R}^p$.
Example 1.
Let the function $F + G : \mathbb{R}^2 \to \mathbb{R}^3$ be defined by
$$F(x) + G(x) = \begin{pmatrix} 3u^2 v + v^2 - 1 + |u^2 - 1| \\ u^4 + uv^3 - 1 + |v| \\ v - 0.3 + |u - 1| \end{pmatrix}, \tag{69}$$
$$F(x) = \begin{pmatrix} 3u^2 v + v^2 - 1 \\ u^4 + uv^3 - 1 \\ v - 0.3 \end{pmatrix}, \qquad G(x) = \begin{pmatrix} |u^2 - 1| \\ |v| \\ |u - 1| \end{pmatrix}, \tag{70}$$
where $x = (u, v)$. The solution of this problem is $x^* \approx (0.917889, 0.288314)$, and $\eta \approx 0.079411$.
Let us give the number of iterations needed to obtain an approximate solution of this problem. We test method (2) for different initial points $x_0 = \delta(1.1, 0.5)^T$, where $\delta \in \mathbb{R}$, and use the stopping criterion $\|x_{n+1} - x_n\| \le \varepsilon$. The additional point is $x_{-1} = x_0 + 10^{-4}$. The numerical results are shown in Table 1.
In Table 2, we give the values of $x_{n+1}$, $\|x_{n+1} - x_n\|$, and the norm of the residual at each iteration.
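A hypothetical driver for Example 1, built on the gauss_newton_secant sketch from the Introduction: $F$, its Jacobian, and $G$ follow (69) and (70), and the starting points follow the text; the variable names and the final print are assumptions.

```python
import numpy as np

F = lambda x: np.array([3 * x[0]**2 * x[1] + x[1]**2 - 1,
                        x[0]**4 + x[0] * x[1]**3 - 1,
                        x[1] - 0.3])
dF = lambda x: np.array([[6 * x[0] * x[1],       3 * x[0]**2 + 2 * x[1]],
                         [4 * x[0]**3 + x[1]**3, 3 * x[0] * x[1]**2],
                         [0.0,                   1.0]])
G = lambda x: np.array([abs(x[0]**2 - 1), abs(x[1]), abs(x[0] - 1)])

x0 = np.array([0.8, 0.2])      # starting point used in Table 2
x_m1 = x0 + 1e-4               # additional point x_{-1} = x_0 + 10^{-4}
print(gauss_newton_secant(F, dF, G, x0, x_m1, tol=1e-6))  # should approach (0.917889, 0.288314)
```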
Example 2.
Let the function $F + G : D \subset \mathbb{R} \to \mathbb{R}^3$ be defined by [5]:
$$F(x) + G(x) = \begin{pmatrix} x + \mu \\ \lambda x^3 + x - \mu \\ \lambda|x^2 - 1| - \lambda \end{pmatrix}, \tag{71}$$
$$F(x) = \begin{pmatrix} x + \mu \\ \lambda x^3 + x - \mu \\ 0 \end{pmatrix}, \qquad G(x) = \begin{pmatrix} 0 \\ 0 \\ \lambda|x^2 - 1| - \lambda \end{pmatrix}, \tag{72}$$
where $\lambda, \mu \in \mathbb{R}$ are two parameters. Here, $x^* = 0$ and $\eta = \sqrt{2}\,|\mu|$. Thus, if $\mu = 0$, then we have a problem with zero residual.
Let us consider Example 2 and show that $r_1 \le r^*$ and that the new error estimates (64) are tighter than the corresponding ones in [4]. We consider the case of the classical Lipschitz conditions (Corollary 3). The error estimates from [4] are as follows:
$$\|x_{n+1} - x^*\| \le C_1^1\|x_{n-1} - x^*\| + C_2^1\|x_n - x^*\| + C_3^1\|x_{n-1} - x^*\|\,\|x_n - x^*\| + C_4^1\|x_n - x^*\|^2, \tag{73}$$
where
$$g_1(r) = B\big[1 - B(2\alpha + (L_1 + 2M_1)r)(L_1 + 2M_1)r\big]^{-1}; \tag{74}$$
$$C_1^1 = g_1(r_1)M_1\eta; \quad C_2^1 = g_1(r_1)(L_1 + M_1)\eta; \tag{75}$$
$$C_3^1 = g_1(r_1)(\alpha + (L_1 + 2M_1)r_1)M_1; \tag{76}$$
$$C_4^1 = g_1(r_1)(\alpha + (L_1 + 2M_1)r_1)\frac{L_1}{2}. \tag{77}$$
They can be obtained from (64) by replacing $r^*$, $L_0$, $L$, $M_0$, $M$ in $g(r^*)$, $C_1$, $C_2$, $C_3$, $C_4$ with $r_1$, $L_1$, $L_1$, $M_1$, $M_1$, respectively. Similarly,
$$r_1 = \frac{4(1 - BT_1\eta)}{5B\alpha T_1 + \sqrt{25B^2\alpha^2 T_1^2 + 24BT_1^2(1 - BT_1\eta)}}, \quad T_1 = L_1 + 2M_1. \tag{78}$$
Let us choose $D = (-0.5, 0.5)$. Thus, we have $B = 0.5$, $\eta = \sqrt{2}\,|\mu|$, $\alpha = \sqrt{2}$, $L_0 = \max_{x \in D} 3|\lambda|\,|x|$, $L = \max_{x, y \in \Omega_0} 3|\lambda|\,|x + y|$, $L_1 = \max_{x, y \in D} 3|\lambda|\,|x + y|$, and $M_0 = M = M_1 = |\lambda|$. The radii are given in Table 3.
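The radii in Table 3 can be checked with the two helpers sketched earlier (gamma_constant and corollary3_radius_and_constants). The snippet below is an illustrative reproduction for both parameter pairs; the expression for $L$ uses the fact that $\Omega_0 = D \cap \Omega(x^*, \gamma)$ is an interval of radius $\min\{\gamma, 0.5\}$ centred at $x^* = 0$, and the call computing $r_1$ simply reuses the same formula with $L_0 = L = L_1$ and $M_0 = M = M_1$, which reduces (63) to (78).

```python
import math

for lam, mu in [(0.4, 0.0), (0.1, 0.2)]:
    B, alpha, eta = 0.5, math.sqrt(2), math.sqrt(2) * abs(mu)
    L0, M0 = 1.5 * abs(lam), abs(lam)            # L0 = max_{x in D} 3|lambda||x|, D = (-0.5, 0.5)
    L1, M1 = 3.0 * abs(lam), abs(lam)            # L1 = max_{x,y in D} 3|lambda||x + y|
    gamma = gamma_constant(B, alpha, L0, M0)     # radius of Omega_0
    L = 3.0 * abs(lam) * 2.0 * min(gamma, 0.5)   # L = max_{x,y in Omega_0} 3|lambda||x + y|
    r_star, _ = corollary3_radius_and_constants(B, alpha, eta, L0, L, M0, M1)
    r_1, _ = corollary3_radius_and_constants(B, alpha, eta, L1, L1, M1, M1)  # reduces to (78)
    print(lam, mu, round(r_star, 6), round(r_1, 6))   # compare with the r* and r1 columns of Table 3
```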
Table 4 and Table 5 report the left- and right-hand sides of the error estimates (64) and (73). We obtained these results for $\varepsilon = 10^{-8}$ and the starting approximations $x_{-1} = 0.2001$, $x_0 = 0.2$. We see that the new error bounds (64) are tighter than the corresponding bounds (73) from [4].

4. Conclusions

We developed an improved local convergence analysis of the Gauss–Newton–Secant method for solving nonlinear least squares problems with a nondifferentiable operator. We used center and restricted radius Lipschitz conditions to study the method. As a consequence, we obtained a larger radius of convergence and tighter error estimates under the same computational effort as in earlier papers. This idea can be used to extend the applicability of other methods involving inverses, such as Newton-type, Secant-type, single-step, or multi-step methods, to mention a few. This will be our future work. Finally, it is worth mentioning that, in addition to the methods studied in this paper, some of the most representative computational intelligence algorithms can be used to solve such problems, such as monarch butterfly optimization (MBO) [18], the earthworm optimization algorithm (EWA) [19], elephant herding optimization (EHO) [20], the moth search (MS) algorithm [21], the slime mould algorithm (SMA), and Harris hawks optimization (HHO) [22].

Author Contributions

Editing, I.K.A.; Conceptualization, S.S.; Investigation, I.K.A., S.S., R.I., H.Y. and M.I.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Li, C.; Zhang, W.; Jin, X. Convergence and uniqueness properties of Gauss–Newton's method. Comput. Math. Appl. 2004, 47, 1057–1067.
2. Argyros, I.K.; Ren, H. A derivative free iterative method for solving least squares problems. Numer. Algorithms 2011, 58, 555–571.
3. Shakhno, S.M.; Gnatyshyn, O.P. On an iterative algorithm of order 1.839... for solving the nonlinear least squares problems. Appl. Math. Comput. 2005, 161, 253–264.
4. Shakhno, S.M.; Iakymchuk, R.P.; Yarmola, H.P. An iterative method for solving nonlinear least squares problems with nondifferentiable operator. Mat. Stud. 2017, 48, 97–107.
5. Shakhno, S.M.; Iakymchuk, R.P.; Yarmola, H.P. Convergence analysis of a two-step method for the nonlinear least squares problem with decomposition of operator. J. Numer. Appl. Math. 2018, 128, 82–95.
6. Shakhno, S.; Shunkin, Y. One combined method for solving nonlinear least squares problems. Visnyk Lviv Univ. Ser. Appl. Math. Comp. Sci. 2017, 25, 38–48. (In Ukrainian)
7. Ulm, S. On generalized divided differences. Izv. ESSR Ser. Phys. Math. 1967, 16, 13–26. (In Russian)
8. Cătinaş, E. On some iterative methods for solving nonlinear equations. Rev. Anal. Numér. Théor. Approx. 1994, 23, 47–53.
9. Shakhno, S.M.; Mel'nyk, I.V.; Yarmola, H.P. Convergence analysis of combined method for solving nonlinear equations. J. Math. Sci. 2016, 212, 16–26.
10. Shakhno, S.M. Convergence of combined Newton–Secant method and uniqueness of the solution of nonlinear equations. Sci. J. TNTU 2013, 1, 243–252. (In Ukrainian)
11. Zabrejko, P.P.; Nguen, D.F. The majorant method in the theory of Newton–Kantorovich approximations and the Pták error estimates. Numer. Funct. Anal. Optim. 1987, 9, 671–686.
12. Wang, X.; Li, C. Convergence of Newton's method and uniqueness of the solution of equations in Banach space II. Acta Math. Sin. 2003, 19, 405–412.
13. Wang, X. Convergence of Newton's method and uniqueness of the solution of equations in Banach space. IMA J. Numer. Anal. 2000, 20, 123–134.
14. Argyros, I.K.; Hilout, S. On an improved convergence analysis of Newton's method. Appl. Math. Comput. 2013, 225, 372–386.
15. Argyros, I.K.; Magreñán, A.A. Iterative Methods and Their Dynamics with Applications: A Contemporary Study; CRC Press: Boca Raton, FL, USA, 2017.
16. Dennis, J.E.; Schnabel, R.B. Numerical Methods for Unconstrained Optimization and Nonlinear Equations; SIAM: Philadelphia, PA, USA, 1996.
17. Ren, H.; Argyros, I.K. Local convergence of a secant type method for solving least squares problems. Appl. Math. Comput. 2010, 217, 3816–3824.
18. Wang, G.G.; Deb, S.; Cui, Z. Monarch butterfly optimization. Neural Comput. Appl. 2019, 31, 1995–2014.
19. Wang, G.G.; Deb, S.; Coelho, L.D.S. Earthworm optimization algorithm: A bio-inspired metaheuristic algorithm for global optimization problems. Int. J. Bio-Inspired Comput. 2018, 12, 1–22.
20. Wang, G.G.; Deb, S.; Coelho, L.D.S. Elephant Herding Optimization. In Proceedings of the 3rd International Symposium on Computational and Business Intelligence (ISCBI 2015), Bali, Indonesia, 7–9 December 2015; pp. 1–5.
21. Mirjalili, S. Moth-flame optimization algorithm: A novel nature-inspired heuristic paradigm. Knowl.-Based Syst. 2015, 89, 228–249.
22. Zhao, J.; Gao, Z.-M. The hybridized Harris hawk optimization and slime mould algorithm. J. Phys. Conf. Ser. 2020, 1682, 012029.
Table 1. Results for Example 1, $\varepsilon = 10^{-8}$.

δ                      0.1    1    5    10    100
Number of iterations   12     8    15   17    25
Table 2. Iterative sequence, step norm, and residual norm for Example 1, $x_0 = (0.8, 0.2)^T$, $\varepsilon = 10^{-6}$.

n    x_{n+1}                 ||x_{n+1} − x_n||    ||F(x_{n+1}) + G(x_{n+1})||
0    (0.937901, 0.312602)    0.178033             0.143759
1    (0.918455, 0.290216)    2.965298 × 10⁻²      7.973496 × 10⁻²
2    (0.917850, 0.288333)    1.977741 × 10⁻³      7.941104 × 10⁻²
3    (0.917888, 0.288313)    4.346993 × 10⁻⁵      7.941092 × 10⁻²
4    (0.917889, 0.288314)    7.873833 × 10⁻⁷      7.941092 × 10⁻²
Table 3. Radii of convergence domains.

λ      μ      L₀      L           L₁     M      r*          r₁
0.4    0      0.6     1.004205    1.2    0.4    0.319259    0.235702
0.1    0.2    0.15    0.3         0.3    0.1    1.192633    0.885163
Table 4. Results for $\lambda = 0.4$, $\mu = 0$.

n    |x_{n+1} − x*|       Right side of (64)    Right side of (73)
0    4.364164 × 10⁻³      0.125318              0.169740
1    1.425535 × 10⁻⁵      1.245455 × 10⁻³       1.529729 × 10⁻³
2    2.179258 × 10⁻¹¹     8.675961 × 10⁻⁸       1.060957 × 10⁻⁷
3    3.542853 × 10⁻²²     4.314684 × 10⁻¹⁶      5.272102 × 10⁻¹⁶
Table 5. Results for $\lambda = 0.1$, $\mu = 0.2$.

n    |x_{n+1} − x*|       Right side of (64)    Right side of (73)
0    2.063103 × 10⁻³      5.909333 × 10⁻²       8.484100 × 10⁻²
1    5.453349 × 10⁻⁷      9.113893 × 10⁻³       1.080560 × 10⁻²
2    2.054057 × 10⁻¹⁴     9.051468 × 10⁻⁵       1.057648 × 10⁻⁴
3    1.447579 × 10⁻¹⁸     2.390964 × 10⁻⁸       2.792694 × 10⁻⁸
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
