1. Introduction
Nonlinear least squares problems often arise when solving overdetermined systems of nonlinear equations, estimating parameters of physical processes from measurement results, constructing nonlinear regression models for engineering problems, etc. The most widely used method for solving nonlinear least squares problems is the Gauss–Newton method [1]. When the derivative cannot be calculated, difference methods are used [2,3].
Some nonlinear functions have a differentiable and a nondifferentiable part. In this case, a good idea is to use, instead of the Jacobian, the sum of the derivative of the differentiable part of the operator and a divided difference of the nondifferentiable part [4,5,6]. Numerical studies show that such methods converge faster than Gauss–Newton-type methods or difference methods.
In this paper, we study the local convergence of the Gauss–Newton–Secant method under classical and generalized Lipschitz conditions for the first-order Fréchet derivative and the divided differences.
Let us consider the nonlinear least squares problem
$$\min_{x \in \mathbb{R}^n} \tfrac{1}{2}\,\|F(x) + G(x)\|^{2},\qquad(1)$$
where the residual function $F(x) + G(x)$ ($F, G : \mathbb{R}^n \to \mathbb{R}^m$, $m \ge n$) is nonlinear in $x$, $F$ is a continuously differentiable function, and $G$ is a continuous function whose differentiability is, in general, not required.
We propose the following modification of the Gauss–Newton method combined with a Secant-type method [4,6] for finding the solution to problem (1):
$$x_{k+1} = x_k - \big(A_k^{T} A_k\big)^{-1} A_k^{T}\big(F(x_k) + G(x_k)\big), \quad k = 0, 1, \ldots,\qquad(2)$$
where $A_k = F'(x_k) + G(x_{k-1}, x_k)$; $F'(x_k)$ is the Fréchet derivative of $F$ at $x_k$; $G(x_{k-1}, x_k)$ is a first-order divided difference of the function $G$ [7] at the points $x_{k-1}$, $x_k$; and $x_{-1}$, $x_0$ are given.
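For readers who want to experiment with the iteration, the following Python sketch implements one step of (2) under explicit assumptions: the componentwise divided-difference formula, the helper names (`divided_difference`, `gns_step`), and the use of a least-squares solve in place of explicitly formed normal equations are illustrative choices, not prescribed by the paper.

```python
import numpy as np

def divided_difference(g, x, y):
    """First-order divided difference of g at the points (x, y).

    One common componentwise construction (an admissible choice; the paper only
    requires the abstract divided-difference properties of [7]).  Whenever
    x_j != y_j for all j it satisfies the secant identity
        divided_difference(g, x, y) @ (x - y) == g(x) - g(y).
    """
    n = x.size
    m = np.atleast_1d(g(x)).size
    D = np.zeros((m, n))
    for j in range(n):
        v = np.concatenate([x[:j + 1], y[j + 1:]])  # (x_1,...,x_j, y_{j+1},...,y_n)
        u = np.concatenate([x[:j],     y[j:]])      # (x_1,...,x_{j-1}, y_j,...,y_n)
        h = x[j] - y[j]
        if h != 0.0:
            D[:, j] = (np.atleast_1d(g(v)) - np.atleast_1d(g(u))) / h
        # if x_j == y_j, the column is left zero; a practical code would perturb
    return D

def gns_step(F, G, dF, x_prev, x):
    """One Gauss-Newton-Secant iteration (2): A_k = F'(x_k) + G(x_{k-1}, x_k)."""
    A = dF(x) + divided_difference(G, x_prev, x)
    r = F(x) + G(x)
    # Minimize ||A s + r||_2; for full-column-rank A this is equivalent to the
    # normal equations (A^T A) s = -A^T r appearing in (2), but numerically safer.
    s, *_ = np.linalg.lstsq(A, -r, rcond=None)
    return x, x + s   # returns (x_k, x_{k+1})
```

Using a least-squares solve instead of explicitly forming $A_k^{T} A_k$ avoids squaring the condition number; the mathematical iteration is the same.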
Setting $A_k = F'(x_k)$ (that is, dropping the divided difference of $G$), for solving problem (1), from (2) we obtain an iterative Gauss–Newton-type method:
$$x_{k+1} = x_k - \big(F'(x_k)^{T} F'(x_k)\big)^{-1} F'(x_k)^{T}\big(F(x_k) + G(x_k)\big), \quad k = 0, 1, \ldots.\qquad(3)$$
For $m = n$, problem (1) turns into a system of nonlinear equations:
$$F(x) + G(x) = 0.\qquad(4)$$
In this case, method (2) is transformed into the combined Newton–Secant method [8,9,10]:
$$x_{k+1} = x_k - \big(F'(x_k) + G(x_{k-1}, x_k)\big)^{-1}\big(F(x_k) + G(x_k)\big), \quad k = 0, 1, \ldots,\qquad(5)$$
and method (3) into the Newton-type method for solving nonlinear equations [11]:
$$x_{k+1} = x_k - F'(x_k)^{-1}\big(F(x_k) + G(x_k)\big), \quad k = 0, 1, \ldots.\qquad(6)$$
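For $m = n$ the least-squares solve above reduces to a direct linear solve with the square matrix $A_k$; a minimal sketch of the Newton–Secant step (5), reusing the hypothetical `divided_difference` helper from the previous listing:

```python
import numpy as np

def newton_secant_step(F, G, dF, x_prev, x):
    """One combined Newton-Secant iteration (5) for square systems (m = n)."""
    A = dF(x) + divided_difference(G, x_prev, x)   # n x n matrix
    s = np.linalg.solve(A, -(F(x) + G(x)))         # direct solve replaces the normal equations
    return x, x + s
```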
The convergence domain of these methods is, in general, small, and the error estimates are pessimistic; these problems restrict their applicability. The novelty of our work is that these problems can be addressed without adding hypotheses. In particular, our idea is to use center and restricted-radius Lipschitz conditions. This approach to the convergence analysis extends the convergence ball of the method and improves the error estimates.
The remainder of the paper is organized as follows: Section 2 deals with the local convergence analysis; the numerical experiments appear in Section 3; Section 4 contains the concluding remarks and ideas about future work.
2. Local Convergence Analysis
Let us first consider some auxiliary lemmas needed to obtain the main results. Let D be an open subset of $\mathbb{R}^n$.
Lemma 1 ([4]). Let , where E is an integrable and positive nondecreasing function on . Then, is monotonically increasing with respect to t on .

Lemma 2 ([1,12]). Let , where H is an integrable and positive nondecreasing function on . Then, is nondecreasing with respect to t on . Additionally, at it is defined as .

Lemma 3 ([13]). Let , where S is an integrable and positive nondecreasing function on . Then, is nondecreasing with respect to t on .

Definition 1. The Fréchet derivative satisfies the center Lipschitz condition on D with average if , where , is a solution of problem (1), and is an integrable, positive, and nondecreasing function on . The functions and introduced next are, like the function , integrable, positive, and nondecreasing functions defined on .
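For orientation, a center Lipschitz condition "with average" of the Wang type is usually stated as an integral bound along the distance to the solution; one representative form (given here for illustration, and not necessarily identical in every detail to the condition of Definition 1) is

\[
\|F'(x) - F'(x^{*})\| \le \int_{0}^{\rho(x)} L_{0}(u)\,du, \qquad \rho(x) = \|x - x^{*}\|, \quad x \in D,
\]

with an integrable, positive, nondecreasing average $L_{0}$; the radius conditions of Definitions 3–6 bound the variation of the derivative along the segment joining $x^{*}$ and $x$ in an analogous integral form.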
Definition 2. The first order divided difference satisfies the center Lipschitz condition on with average if

Let and . We define the function on by . Suppose that the equation has at least one positive solution, and denote by the minimal such solution. Then, we can define , where .
Definition 3. The Fréchet derivative satisfies the restricted radius Lipschitz condition on with L average if

Definition 4. The first order divided difference satisfies the restricted radius Lipschitz condition on with M average if

Definition 5. The Fréchet derivative satisfies the radius Lipschitz condition on D with average if

Definition 6. The first order divided difference satisfies the radius Lipschitz condition on D with average if

Remark 1. It follows from the preceding definitions that , , and for each since . By , we mean that L (or M) depends on and by the definition of . In case any of (15)–(17) are strict inequalities, the following benefits are obtained over the work in [4] using instead of the new functions:
- (a1) An at least as large convergence region, leading to at least as many initial choices;
- (a2) At least as tight upper bounds on the distances , so at least as few iterations are needed to obtain a desired error tolerance.
These benefits are obtained under the same computational effort as in [4], since the new functions and M are special cases of the functions and . This technique of using the center Lipschitz condition in combination with the restricted convergence region has been used by us on Newton, Secant, and Newton-like methods [14,15], and can be used on other methods, too, with the same benefits.
The proof of the next result follows that of the corresponding one in [4], but there are crucial differences: we use instead of and instead of as used in [4].
We use the Euclidean norm. Note that the following equality is satisfied for the Euclidean norm where .
Theorem 1. Let be continuous on an open convex subset , F be a continuously differentiable function, and G be a continuous function. Suppose that problem (1) has a solution ; the inverse operation exists, such that ; (7), (8), (11), and (12) hold; and γ given in (10) exists. Furthermore, and , where is the unique positive zero of the function q given by . Then, for , the iterative sequence , generated by (2), is well defined, remains in Ω, and converges to . Moreover, the following error estimates hold for each : where .

Proof. We obtain since and are positive and nondecreasing functions on and , respectively. Taking into account Lemma 1, for a sufficiently small , . For a sufficiently large R, the inequality holds. By the intermediate value theorem, the function q has a positive zero on , denoted by . Moreover, this zero is the only one on . Indeed, according to Lemma 2, the function is non-decreasing with respect to r on . By Lemma 1, the functions , , and are monotonically increasing on . Furthermore, by Lemma 3, the function is monotonically increasing with respect to r on . Therefore, is monotonically increasing on . Thus, the graph of the function crosses the positive r-axis only once on . Finally, from the monotonicity of q and since , we obtain , so .
We denote . Let . By the assumption , , we obtain the following estimate:

Using conditions (11) and (12), we obtain

where . Then, from inequality (29) and the equation , we obtain by (10)

Next, from (29)–(31) and the Banach lemma [16], it follows that exists, and
Hence, is correctly defined. Next, we will show that .
Using the fact and the choice of , we obtain the estimate

So, considering the inequalities

we obtain

where

Therefore, , and estimate (22) holds for .
Let us assume that for and that estimate (22) holds for , where $k \ge 1$ is an integer. We shall show that and that estimate (22) holds for .
Consequently, exists, and

Therefore, is correctly defined, and the following estimate holds:

This proves that and estimate (22) holds for .

Thus, by induction, the sequence generated by (2) is correctly defined, , and estimate (22) holds for each .
It remains to be proven that for .
Let us define the functions a and b on as

According to the choice of , we obtain

Using estimate (22), the definitions of the functions a, b and the constants ( ), we have

According to the proof in [17], under conditions (42)–(45), the sequence converges to for . □
Corollary 1 ([4]). The convergence order of method (2) for problem (1) with zero residual is equal to .

If , we have a nonlinear least squares problem with zero residual. Then, the constants and , and estimate (22) takes the form

This inequality can be written as

Then, we can write an equation for determining the convergence order as follows:

Therefore, the positive root of the latter equation is the order of convergence of method (2).
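As a sanity check on the order, suppose that the zero-residual specialization of estimate (22) behaves like $e_{k+1} \le c\, e_k\,(e_k + e_{k-1})$ with $e_k = \|x_k - x^{*}\|$ and some constant $c > 0$; this is the typical structure produced by a Secant-type correction and is an assumption of this sketch rather than a restatement of (22). Substituting the ansatz $e_k \sim e_{k-1}^{\,p}$ and keeping the dominant term $e_k e_{k-1}$ gives

\[
e_{k-1}^{\,p^{2}} \sim e_{k-1}^{\,p+1} \quad\Longrightarrow\quad p^{2} - p - 1 = 0, \qquad p = \frac{1 + \sqrt{5}}{2} \approx 1.618,
\]

the superlinear order characteristic of Secant-type updates; when the Secant term is absent (Corollary 2), the quadratic term dominates and the order is 2.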
In the case $G \equiv 0$ in (1), we obtain the following consequences.

Corollary 2 ([4]). The convergence order of method (2) for problem (1) with zero residual is quadratic.

If $G \equiv 0$, then , and estimate (22) takes the form

which indicates the quadratic convergence rate of method (2).
Remark 2. If and , our results specialize to the corresponding ones in [4]. Otherwise, they constitute an improvement, as already noted in Remark 1. As an example, let denote the functions and parameters where are replaced by , respectively. Then, in view of (15)–(17), we have that and . Hence, we have , the new error bounds (22) being tighter than the corresponding estimates (6) in [4], and the rest of the advantages (already mentioned in Remark 1) holding true.

Next, we study the convergence of method (2) when the averages in the Lipschitz conditions are constants, as a consequence of Theorem 1.
Corollary 3. Let be continuous on an open convex subset , F be a continuously differentiable function, and G be a continuous function on D. Suppose that problem (1) has a solution and that the inverse operation exists, such that . Suppose that the Fréchet derivative satisfies the classic Lipschitz conditions and that the function G has a first order divided difference that satisfies , where . Furthermore, and , where . Then, for each , the iterative sequence , generated by (2), is well defined, remains in Ω, and converges to , such that the following error estimate holds for each : where .

The proof of Corollary 3 is analogous to the proof of Theorem 1.
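To see how Corollary 3 follows from Theorem 1, note that when the averages are constants the integral bounds of the generalized conditions reduce to the classical Lipschitz conditions. For a condition of the integral-average type sketched after Definition 1, for instance,

\[
\int_{0}^{\rho(x)} L_{0}\,du = L_{0}\,\rho(x) = L_{0}\,\|x - x^{*}\|,
\]

so the bound becomes $\|F'(x) - F'(x^{*})\| \le L_{0}\,\|x - x^{*}\|$, and the radius and error estimate of Corollary 3 are obtained from those of Theorem 1 by evaluating the corresponding integrals in closed form.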
3. Numerical Examples
In this section, we give examples to show the applicability of method (2) and to confirm Remark 2. We use the norm for .
Example 1. Let the function be defined by , where . The solution of this problem is and . Let us give the number of iterations needed to obtain an approximate solution of this problem. We test method (2) for different initial points , where , and use the stopping criterion . The additional point . The numerical results are shown in Table 1.
In Table 2, we give the values of , , and the norm of the residual at each iteration.
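A driver for such an experiment can be sketched as follows; the toy residual, the starting points, and the tolerance `1e-10` below are hypothetical placeholders (they are not the function and tolerance of Example 1), and the loop reuses the `gns_step` helper sketched after method (2), stopping when the step size falls below the tolerance.

```python
import numpy as np

# Hypothetical test problem (NOT Example 1 of the paper): smooth part F plus a
# nondifferentiable part G; by construction x* = (1, 1) gives a zero residual.
def F(x):  return np.array([x[0]**2 + x[1] - 2.0, x[0] + x[1]**2 - 2.0])
def G(x):  return 0.1 * np.abs(x - 1.0)
def dF(x): return np.array([[2.0 * x[0], 1.0], [1.0, 2.0 * x[1]]])

x_prev = np.array([1.5, 0.5])   # hypothetical x_{-1}
x      = np.array([1.4, 0.6])   # hypothetical x_0
tol    = 1e-10                  # hypothetical stopping tolerance

for k in range(100):
    x_prev, x_new = gns_step(F, G, dF, x_prev, x)
    if np.linalg.norm(x_new - x) <= tol:   # stop when ||x_{k+1} - x_k|| <= tol
        x = x_new
        break
    x = x_new

print("iterations:", k + 1, " x:", x, " ||F+G||:", np.linalg.norm(F(x) + G(x)))
```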
Example 2. Let the function be defined by [5]: , where are two parameters. Here and . Thus, if , then we have a problem with zero residual.

Let us consider Example 2 and show that and that the new error estimates (64) are tighter than the corresponding ones in [4]. We consider the case of the classical Lipschitz conditions (Corollary 3). The error estimates from [4] are as follows:
where . They can be obtained from (64) by replacing in , , , , by , respectively. Similarly,
Let us choose . Thus, we have , , , , , , . Radii are written in Table 3.
Table 4 and Table 5 report the left- and right-hand sides of the error estimates (64) and (73). We obtained these results for and starting approximations , . We see that the new error bounds (64) are tighter than the corresponding estimates (73) from [4].
4. Conclusions
We developed an improved local convergence analysis of the Gauss–Newton–Secant method for solving nonlinear least squares problems with a nondifferentiable operator. We use center and restricted-radius Lipschitz conditions to study the method. As a consequence, we obtain a larger radius of convergence and tighter error estimates under the same computational effort as in earlier papers. This idea can be used to extend the applicability of other methods involving inverses, such as Newton-type, Secant-type, single-step, or multi-step methods, to mention a few; this will be our future work. Finally, it is worth mentioning that, in addition to the methods used in this paper, some of the most representative computational intelligence algorithms can be applied to such problems, such as monarch butterfly optimization (MBO) [18], the earthworm optimization algorithm (EWA) [19], elephant herding optimization (EHO) [20], the moth search (MS) algorithm [21], the slime mould algorithm (SMA), and Harris hawks optimization (HHO) [22].